Thursday, December 28, 2006

Server Crash.....

Today, during the last week of the year, when things are at their most quiet and boring, a server decides to crash. The BSOD reports a hardware failure. This server handles our archive and is connected to both of our Apple Xserve RAIDs through a Fibre Channel switch. And to top it off, a user is requesting archive material.

No worries, though: this is where a storage area network (SAN) shines. All I did was rezone the WWNs (World Wide Names) of the two Xserve RAIDs from the dead server to another server and voilà, the drives are back online. It took mere minutes.

Tuesday, December 12, 2006

All users on the new SAN volume (event ID 55 cont.)

So my strategy worked like a charm: Robocopy all the data to our new volume (SAN), then use ViceVersa to mirror the data between the source (internal storage) and the destination (SAN). I removed the share from the old volume and created a share with the same name on the SAN volume. No one had to reboot or anything. Very transparent.
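For anyone wanting to script the bulk-copy step, here is a minimal sketch that only builds a Robocopy command line. The switches (`/E`, `/COPYALL`, `/R`, `/W`, `/XD`, `/LOG`) are real Robocopy options, but which ones I would pick here is an assumption, not a record of the exact command that was run. The function name and the paths in the example are hypothetical.

```python
def build_robocopy_command(source, destination, exclude_dirs=(), log_file=None):
    """Build (but do not run) a Robocopy argument list.

    The switch selection is an assumption for illustration only.
    """
    cmd = [
        "robocopy", source, destination,
        "/E",        # copy subdirectories, including empty ones
        "/COPYALL",  # copy data, attributes, timestamps, ACLs, owner, auditing
        "/R:1",      # retry a failed file once...
        "/W:1",      # ...after waiting one second
    ]
    if exclude_dirs:
        cmd.append("/XD")          # exclude the listed directories
        cmd.extend(exclude_dirs)
    if log_file:
        cmd.append("/LOG:" + log_file)
    return cmd


if __name__ == "__main__":
    # Hypothetical drive letters and folder names.
    print(build_robocopy_command(r"D:\Users", r"S:\Users",
                                 exclude_dirs=[r"D:\Users\Broken"],
                                 log_file=r"C:\robocopy.log"))
```

On a Windows box the returned list could be handed to `subprocess.run`; low retry and wait values keep the job from stalling on files that will never open.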

Now I'll wait a few days to make sure everyone has everything before I completely wipe out that corrupt partition.

Monday, December 11, 2006

So my Event ID: 55 errors are back

Actually they never left. They are just more frequent again, and last week the server froze up twice in the same day. What is my problem? Here it is.

Event ID: 55
Source: NTFS
Description: The file system structure on disk is corrupt and unusable. Please run the chkdsk utility on the volume "Drive_letter:"

CAUSE
This behavior can occur if the NTFS volumes' Master File Table (MFT) is corrupted. The short and long file name pairs that are stored in the directory index record and the file names that are stored in the associated File Record Segment (FRS) contain case-sensitive characters that do not match.

NTFS supports case-sensitive (POSIX) file names, but Chkdsk does not check file names in case-sensitive mode.

For example, assume that the directory index record has a BADFILe.TXT entry but the FRS has a BADFILE.TXT entry for the file name. NTFS views this as being invalid or corrupted, but Chkdsk compares only the names and ignores the case. It does not make repairs.
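The mismatch in that example can be illustrated in a few lines. This is only a sketch of the two comparison modes, not how NTFS or Chkdsk is actually implemented; the function names are made up for illustration.

```python
index_name = "BADFILe.TXT"  # name stored in the directory index record
frs_name = "BADFILE.TXT"    # name stored in the File Record Segment


def match_case_sensitive(a, b):
    """Strict comparison, standing in for the check that flags corruption."""
    return a == b


def match_ignoring_case(a, b):
    """Case-insensitive comparison, standing in for what Chkdsk is said to do."""
    return a.upper() == b.upper()


print(match_case_sensitive(index_name, frs_name))  # False: looks corrupt
print(match_ignoring_case(index_name, frs_name))   # True: nothing to repair
```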
RESOLUTION
To resolve this issue, back up the volume that contains the corrupted file(s) and exclude the corrupted file(s) from the backup job. Reformat the volume, and then restore from the backup.
ARE YOU SERIOUS? I MEAN ARE YOU FUCKING SERIOUS!

So what am I going to do now? I'm NOT going to run any kind of chkdsk /f; I have no time for that. We already changed 2 dead disks and ran the HP diagnostics, and everything checked out fine. I have 1.7TB of data and cannot afford any downtime whatsoever. These people around here just don't seem to understand. Here is what I am doing right now. I've installed ViceVersa and registered it. Remember that new 3.7TB DAE for the SAN? Well, over the weekend I ran a robocopy to that location. It took more than 64 hours to complete. I started on Friday Dec 8 at 7:46am and it ended on Mon Dec 11 00:24:38 2006 (whatever the hell that is).

------------------------------------------------------------------------------

                Total          Copied   Skipped  Mismatch    FAILED    Extras
    Dirs :      78477           78466         0         0        11         0
   Files :     935841          935791         0         0        50         0
   Bytes : 1888030907.1 m  1888063666.6 m     0         0     8.5 m         0
   Times :   64:38:09        60:58:02                       1:39:58   2:00:08

   Ended : Mon Dec 11 00:24:38 2006

C:\>
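As a sanity check on those timestamps, the start time can be recovered by subtracting the total elapsed time from the "Ended" line. A quick sketch, using only the two values from the summary above:

```python
from datetime import datetime, timedelta

# Both values are taken from the Robocopy summary.
ended = datetime(2006, 12, 11, 0, 24, 38)
elapsed = timedelta(hours=64, minutes=38, seconds=9)

started = ended - elapsed
print(started.strftime("%A %Y-%m-%d %H:%M:%S"))  # Friday 2006-12-08 07:46:29
```

So the job kicked off at about 7:46am on Friday, December 8, and 00:24:38 on Monday is simply 24 minutes past midnight going into Monday.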

I got a few errors, but at this point I couldn't care less. Anyway, back to what I am doing. Since all that data was copied over the weekend, there were bound to be changes. This place never sleeps. So I need ViceVersa to compare the source and destination and tell me what has changed. Then I will run the ViceVersa Sync to update from the source to the destination. But because of the nature of the error, there are corrupted files and folders that cannot be opened or deleted and that show as 0KB in Properties. This is a problem for ViceVersa, because when it tries to read these files to compare them it bombs out. It bombs out with like 10% left, at that. So in ViceVersa I had to create a profile to exclude the folder (in this case) that is causing all the trouble. If this works, I'll be able to sync the two locations with this exclusion rule, re-path everyone to the new location on the SAN, and destroy this corrupted NTFS volume (thanks, Microsoft).
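The compare-with-exclusion idea can be sketched in plain Python. This is not how ViceVersa works internally; it is a rough stand-in that assumes comparing file size and modification time is good enough, and the function names and the example paths are hypothetical. Excluded folders are given relative to the root and are pruned before the walk ever touches them, which is the point of the exclusion rule.

```python
import os


def snapshot(root, exclude_dirs=()):
    """Map each file path (relative to root) to (size, whole-second mtime)."""
    excluded = {os.path.normcase(d) for d in exclude_dirs}
    files = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded folders in place so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames
            if os.path.normcase(
                os.path.relpath(os.path.join(dirpath, d), root)
            ) not in excluded
        ]
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                info = os.stat(full)
            except OSError:
                continue  # unreadable entry: skip it rather than abort the scan
            files[os.path.relpath(full, root)] = (info.st_size, int(info.st_mtime))
    return files


def files_to_sync(source, destination, exclude_dirs=()):
    """List files that are missing from, or differ in, the destination."""
    src = snapshot(source, exclude_dirs)
    dst = snapshot(destination, exclude_dirs)
    return sorted(path for path, meta in src.items() if dst.get(path) != meta)


if __name__ == "__main__":
    # Hypothetical locations; "Broken" stands for the corrupted folder.
    for path in files_to_sync(r"D:\Users", r"S:\Users", exclude_dirs=["Broken"]):
        print(path)
```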

Friday, December 01, 2006

Another Apple Xserve RAID added

We've also added another Apple Xserve RAID. We have filled up the 5.5TB of usable storage on the first one. These devices are for archive purposes only.

The bottom one is the last added. Another 5.5TB for archive.
IMG_0001

EMC CLARiiON CX300 maxed out

We've added the last DAE to our EMC CX300. It is maxed out on disks at 60 for this particular model. This last DAE was all Fibre Channel drives at 300GB each. It's about 3.7TB of usable space. Time to plan the upgrade path to the CX500, which I'm sure we'll need by the end of 2007.

The top DAE was the last one added.
IMG_0002