Monday, October 26, 2009

It's been a while...

I'm still here. I just got back from London. Some restructuring is going on in that office and a few folks migrated over to a new company. Most of the IT guys went, so I had to go over there and have a look. It was a nice trip, my first time in that office and in London. The goal is to bring that office more in line with the New York HQ on all levels.

What else have I been up to since April? Well, LOTS! So much that I can't even remember. Our Hong Kong office moved to a bigger space in the same building and we had to move the MPLS line that I talked about in April. The line was finally brought up last week. It takes months to move international circuits. If you ever have to do anything like that, make sure you plan around it. Don't expect the line to be moved overnight from the time you request it.

I've had some SAN issues after my April post. One drive showed up as dead, which was fine, normal. They shipped us a new drive, no problem. The drive comes and I swap it and the SAN still thinks no drive is in the bay. WHAT! I call Dell/EMC support and they send a new drive; same deal. WOW! WTF! Then they want me to pull SP collects. Done, and they find nothing. Now they want to WebEx in to run a tool and dig deeper. Problems found. Another disk is bad but not showing up in Navi, and I had to replace that disk as well. That's two bad disks at the same time. I forget exactly what we did, but the process was long and took days b/c we had to put a new disk into the slot where the ghosted bad disk was (the disk that was bad on the back end) and wait for it to transition. That's a day gone. The next day, after the transition, we had to go back to the original bad disk, swap that out and wait for the transition; another day down. By the third day (really the fourth, counting the day the original call was placed), a Thursday, the transitioning was done and all was well again. Four days to address drive problems. But I've got to tell you this: NO user complained and there was NO downtime. Gotta love EMC!

Internet slowness. Our users are just consuming bandwidth at an alarming rate here. Even with Websense to block sites and protocols there still isn't enough. Then again we do only have 3mb to the internet at the NY HQ. But I did get a 12mb DS3 installed at the 42nd St office that we are supposed to be moving to but have not moved to yet. So in the meantime I will be routing internet traffic for the users over that 12mb line and keeping my servers and public IP's here at the main office. This is b/c all of our public IP's are tied to our circuits here in the HQ and we can't afford the downtime to move the block of IP's to the other office, even though Verizon says it takes 5 minutes. Yeah right! Something will break and everyone will be on my a$$ b/c there is downtime and something isn't working, like email.

We've also installed and rolled out Microsoft's Office Communications Server (OCS). Now all our users have IM. Microsoft has a really nice product. Their plan is to make OCS and Exchange into a VoIP solution. I saw a demo and it looked promising. So of course I came back to the office, upgraded to R2 and attempted to connect the OCS server to our Cisco CallManager. I was fine right up until I was supposed to activate the trunk line, and decided not to. The last thing I want is to bring down our phone system. So I put that project on hold until someone with more experience is available. I've contacted some of our vendors and they don't even know how to do it.

There is more stuff but I'll leave it at this for now.

Tuesday, April 14, 2009

Setting up our Hong Kong office network from NY

I'm in the process of setting up our Hong Kong (HK) office. We had an MPLS line installed there over 6 months ago. Our plan was to send someone over from NY to install everything and set them up. Then the economy tanked and personnel got laid off. As a result the HK office has been using our Juniper SSL VPN exclusively. They are able to grab files, access email and the intranet, and work like any remote office. But the MPLS line is just sitting there costing us every month.

I've finally been able to think about them and wanted to get the ball rolling on getting them set up to work exactly like our Shanghai office does, meaning getting connected to the MPLS. My counterpart in London and another admin didn't like the idea I presented of setting them up remotely. They'd rather send someone over and train them how to do things differently. My take is that we've been waiting on someone going over for more than 6 months and we are paying for a line we are not using. In that time we could have had them set up remotely, using local resources when needed. I've met with my boss and fleshed out a plan to get our local consultants involved. It's not a difficult thing to do at all. We've also had a server, ordered and delivered, sitting there unused for over 6 months (reasons explained later).

I drafted my plan and sent it to my boss, and we got on a conference call with the HK folks one night EST. We all agreed to attempt to make this happen. The HK folks wanted a direct connection to our NY office and I wanted to give it to them. I sent my plan over to the local consultants. The plan was to get them to install VMware on the server, set it up according to the instructions I sent over (IP addresses etc.), and connect the ESX server to a switch that connects to the MPLS router; I'd take it from there. At first they didn't know how to install VMware. So I sent them a YouTube video, "VMWare ESX 3 Training CBT - Installing ESX", and a link to the site "How to Install VMware ESX Server". I also stated in an email to them that if they know how to install any *NIX distro they should be able to install ESX with no problem. Once they got that email with the links they said yeah, they know how to do it.

So now the server is installed and connected to the MPLS. One morning I got an email from the office manager saying that it was done, followed by an email from the person in charge there (above the office manager), and you would not believe this. The person in charge said the server was too loud and decided to turn it off. Granted, the HK office is one room in a business center and the server is in the corner. So I asked if they could leave the server on when they leave and turn it off in the morning. Their night is our day and vice versa. So they did that for me the next day. I was able to access the server via the Infrastructure Client and SSL. I sent over our Windows image, which took about 30 mins. Once that was done I installed the first VM, a Windows DC for that office. The second VM installed was a file server. There were some pain points with the installation. I remote desktop to our Virtual Center server, which has all my VMware tools, then I VNC into the VM's once they are up, and that was a bit slow since I hadn't gotten VMware Tools installed yet. Once the tools were installed I was good to go. That took pretty much my entire day.

I come in the next day and the server is off. Yes, off. They couldn't even turn it back on when they left the office. Either they forgot or they don't care. So that's where that project is right now. They can't stand the noise, and even if I did put the finishing touches on it they would be turning it off anyway. Now imagine if we had sent someone over like everyone wanted to do. That would have been thousands of $ in travel expenses, and once they left the machine would be turned off O_o!

Monday, February 09, 2009

Vmotion with RDM problem solved!

I've finally figured it out, and in doing so I've discovered some other potential problems. First, the solution to not being able to move a VM from ESX host to ESX host even though the LUN's are shared to both hosts within Navisphere: it turns out that I needed to have both of my hosts in the same storage group on the Navi side. I had them in their own groups. I moved all the VM's over to one host, made sure all the LUN's were visible to the second host, and added the ESX host with no VM's to the storage group of the other host. Then I renamed the storage group in Navi to something that represents what's going on.

The reason for this is that although VMotion was working with all the other LUN's, RDM's are a bit different. When you add a regular LUN to the ESX servers they read the same LUN ID; this isn't the case with RDM's. An RDM gets a LUN ID of whatever the next number is within ESX. For VMotion to work, everything has to match up on both ESX hosts. In my case the RDM LUN ID's on the ESX hosts were never the same. Now, with the change on the Navi side, all my LUN's have the same ID within ESX.
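The "everything has to match up" rule is easy to sanity check once you have written down what each host sees. Here is a minimal sketch in Python; the LUN names and ID numbers are made up for illustration and are not pulled from Navisphere or ESX, you would type in your own from each host's storage view.

```python
def mismatched_lun_ids(host_a, host_b):
    """Return LUNs whose LUN ID differs between two hosts, or that only one host sees."""
    problems = {}
    for lun in sorted(set(host_a) | set(host_b)):
        id_a = host_a.get(lun)  # None means this host does not see the LUN at all
        id_b = host_b.get(lun)
        if id_a != id_b:
            problems[lun] = (id_a, id_b)
    return problems


# Hypothetical inventory: LUN name -> LUN ID as each ESX host reports it.
esx1 = {"vmdk_store": 0, "fileserver_rdm": 5}
esx2 = {"vmdk_store": 0, "fileserver_rdm": 7}

print(mismatched_lun_ids(esx1, esx2))  # {'fileserver_rdm': (5, 7)}
```

An empty result means the two hosts agree on every LUN ID, which is the state I ended up in after putting both hosts in one storage group.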

Now I can move all my VM's back and forth instead of just the ones that were VMDK's.

The other potential problem that I found was that four of my initiator names were pointing to the wrong server host name within Navi under connectivity status. All of my servers should have four initiators and one had eight. The initiators are the WWN names that come from the HBA's in the servers. One of my ESX hosts wasn't showing up properly. There was no real problem, but I caught it before it became one.

Wednesday, February 04, 2009

My first day back from vacation and problems...

After my vacation was over I was well rested and stress free. I get to work Monday morning, Jan 5th, and that's when it starts. My Exchange cluster starts to freak out for a bit. I investigate what's going on and PowerPath is telling me that a connection to the SAN is lost. WTF!!!! Now I'm frantic b/c I just got back, and I'm grilling everyone to see if anything went on while I was away (I stayed here BTW www.rayonhotels.com). Everything was quiet, I was told, and I already knew that b/c I was checking in via SSL VPN. I went into Navi and I saw errors, and the email home was going off. I checked the error and it was an SFP error. Now this could be real simple or real bad. Then guess what: the errors go away, and I check the SFP's and they are fine, nice and snug in there.

I come in Tuesday morning and it's happening all over again. This time the errors are more frequent, dropping Outlook connections to the server all day long. I put a call into Dell right away and to my surprise they didn't have my service tag updated on file, so they couldn't find my SAN in their system. WHAT, are you F***ing serious? WTF is going on? So I call my account rep and ask him what the deal is. He said he'd see what's going on. At this time I'm rebooting my Exchange cluster to reestablish the connection and trying to make sense of this thing.

After 2 hours they finally got their act together and updated their system. Sign of the times over at Dell, huh? Anyhow, they are dispatching parts that should come by 11am, and a tech should call me shortly after. I get the parts at 11am but no call from the tech. There are two boxes. Now, from my knowledge an SFP is the size of a thumb. I did get a small box, and I also got a much larger box. Now I'm thinking WTF, and I open it. It's a storage processor. Why the ef would I need this? This is what I mean when I say it could be real simple or real bad. I send emails to the people working on this case. I get replies saying I should hear from someone soon. I get a call from the technical account manager saying the tech should be here by 1pm EST. 1pm EST rolls by and no call and no one. I send more emails and they keep sending me ETA's that they can't keep. It's now going on 3pm and still no one. I get a call from my reps and we are on a conference call. Since the error says it's an SFP, I know I can change that myself, and I ask if this is a good idea. They say if I feel comfortable, do so. Great, so much for gold support, right? I end up changing the DAMN thing and immediately the errors in Navi are gone and everything is back to normal.

After all that, after all of that, I get a call from a tech saying he's at another site and can come over, and that he just got the call. Hmmmmm..... what happened to all those ETA's? I told him not to bother, that I had changed the SFP myself. Now what if that wasn't the problem, and what if the port itself was bad? I wasn't going to try to change a storage processor myself. I mean, if push came to shove they could walk me through it, but WTF is up with my gold support?

Vmotion with RDM problem

So I've run into a bit of a snag with Vmotion. To do Vmotion it's best to have a VMDK partition shared amongst both ESX hosts. In a perfect world this would be the case, but we're not in a perfect world and some of us can't convert/migrate to a VMDK partition. What I mean by this is based on my situation: I can convert servers into a VMDK partition without much of a problem, but what about the data the server serves up? Remember my 3.14 TB migration, which is now an 8TB partition and took 2 months? How long would VMware Converter take to convert that data and move it to a VMDK partition of equal size, if I had one? Or if I tried to convert the data partition from the ESX host by adding it as storage, it would format my 8TB and that would be BAD.

So with all that in mind there is another option, with a BUT, that allows me to add the existing partition as an RDM (raw device mapping). I make sure to share the LUN with both ESX hosts. I add it to my VM through the edit option and it shows up no problem. Remember that BUT? Well, you can't Vmotion the VM with the RDM b/c it thinks the RDM is not shared with the other host, even though it is.

So I have a call in with HP/Vmware to help resolve this issue. I'll post back with my findings.

More VMware, Virtual Center Server and Vmotion

After the LUN migration fiasco I finally have the space to move forward with VMware and continue converting more servers. It's even more important now b/c we are supposed to be moving our data center, and the less I have to transport the better. Also for high availability, flexibility and efficiency. A lot of buzz words, right!

I've created a 1TB LUN for all my VMDK files, mostly the guest OS's. This should be more than enough for what I need currently. I have plenty of room for swap space and backups. I've then shared this massive LUN to both ESX hosts inside of Navisphere (select the LUN and add it to a storage group; in this case you add the LUN to both ESX storage groups). That's the basics.

I'm going to take a step back and discuss the overall picture. The gist of the matter is to be able to have my servers up all the time, or with as little downtime as possible. VMware already allows you to reboot servers in about 1 1/2 minutes, so that's already faster than a physical box rebooting. But in the cases where you can't afford to have any dropped connections whatsoever, Vmotion is the way to go. This is what's meant by high availability. To accomplish this you'll need two ESX servers and one Virtual Center server with the appropriate license to unlock Vmotion.

So I went and installed Virtual Center server on a Windows 2003 server and pointed it to my two ESX hosts. You'll need to have a central lic server that manages all the lic's on the ESX hosts and the Virtual Center server. This is no biggie. Go to your account page on the VMware site and convert your lic's to a central file or something like that. I pretty much edited one of my lic's and added the rest to the last line. Then I installed the lic manager on the Virtual Center server and updated the ESX hosts to look there. I was good to go. Then you'll need to update your Infrastructure Client by pointing it to the Virtual Center server so that you can see both hosts. Create a new cluster and add your hosts. All your existing VM's will appear under their hosts if you have any.

The final step will be to create a Vmotion network, which is actually called a VMkernel Vmotion. You must have a few NIC's in both hosts that you can allocate to network redundancy and Vmotion. You assign a NIC for the Vmotion network and give it a private IP address x.x.x.1, and on the other host do the same thing with x.x.x.2. At this point you must physically get a crossover cable and connect the NIC's you just assigned. You'll have to go to the back of the host and start plugging until you find the NIC's LOL. Or you can just VLAN those NIC's and you should be set. I chose the first method this time around.

Sounds confusing but it really isn't. Once you get started the info just flows to your brain from out of nowhere....LOL!

Once all that was out of the way I created a test VM on the shared LUN I created above. I put it on the network and opened the console (or VNC) to it. I then started a continuous ping from inside the VM to our DNS server (since that's always on) and proceeded to do a manual Vmotion. Drag and drop the VM from its ESX host to the other ESX host and you get a pop-up window and a few questions. It's a done deal. You can see exactly when the VM moves by watching the ping hiccup, but you don't lose a single connection.

Amazing ain't it!

Happy New Year......yeah I'm late but here is an update

The last time I posted I had started my LUN migration to our new space. I thought it would take about a week, two tops, to migrate 3.14TB to a standard LUN twice the size. Well, it took 2 MONTHS. I started mid Nov and it ended mid Jan.

Those who have experienced this before will say it depends on what my priority rate was, and would probably bet my priority was on the lowest setting. NOPE! I had it on ASAP for a bit and it freaked the user base out. They all complained about slow connectivity to the file server. So I dropped it down to medium and that's where it sat for the better part of 2 months. I went on vacation and everything (had a blast by the way at www.rayonhotels.com).

So after that was done I expanded the partition in Windows 2003 using diskpart and voilà, instant space.
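To put that 2 months in perspective, here is a rough back-of-the-envelope calculation of the average migration rate. It is only a sketch: the exact start and end dates are my assumption for "mid Nov" and "mid Jan", and I'm treating the 3.14TB as binary terabytes.

```python
from datetime import date


def migration_rate_mib_per_s(size_tib, start, end):
    """Average throughput in MiB/s for moving size_tib between two dates."""
    seconds = (end - start).days * 86400
    return size_tib * 1024 * 1024 / seconds  # TiB -> MiB, then divide by elapsed time


# Assumed dates standing in for "mid Nov" and "mid Jan".
rate = migration_rate_mib_per_s(3.14, date(2008, 11, 15), date(2009, 1, 15))
print(f"average rate: {rate:.2f} MiB/s")
```

That works out to well under 1 MiB/s on average, which shows how much the medium priority setting throttles the copy in favor of production I/O.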

Tuesday, December 02, 2008

Back to VMWare finally

Now that I have space to play with on the CX3-20 I can return to moving forward with VMWare. My plan is to use all of our FC drives for VMWare and databases. Currently I'm using most of our FC disks for a file server. I've set a LUN migration job to migrate the meta-LUN we created on the FC disks over to an 8TB LUN I created on our new SATAII drives.

The LUN migration process is very easy and straightforward within Navisphere. It's going to take a couple of days to migrate the 3.5TB LUN over. I started this yesterday and set the priority to ASAP. BTW this LUN is in production and our user base isn't being affected at all.

So I've connected our second ESX server to our FC network and zoned it to the CX3-20. I've installed VMWare Virtual Center on a Windows 2003 server to manage both of our hosts. Just getting that done was a huge learning curve. Thanks to the forums and searching I was able to figure it out. Maybe I'll make a post about the steps for setting up Virtual Center and two hosts on a SAN.

Now I'm going to test VMotion to see how that works for myself.

OK so I've done a manual VMotion move from host to host and it worked like a charm.

Now it's time to clean up my LUNs, create a huge LUN for all my guest OS's and share that between both hosts.

emc CX300 to CX3-20 conversion done

The conversion is finally done. The hardware has been sitting behind my desk for months and this weekend it finally happened. It went off without a hitch and took about 7 hours, as expected. The Dell guy knew exactly what he was doing, as he should have. Prior to the conversion I prepped the rack by moving the DAE's up 1U to make space at the bottom for the new 1U storage processor unit. If I hadn't done that it would have taken an extra hour to move all that stuff up.

We've also added our 5th DAE with 1TB SATAII drives for a total of 15TB. After the hot spare and the LUN overhead I've got ~12.5TB to play with.

Tuesday, November 11, 2008

Data center move

Oh yes, I have to move my data center down to our new office. Our new room is finally of professional grade. We have rows of racks for servers and another row just for network gear. We have real dedicated AC units, three of them in an N+1 redundancy configuration. We have a real central UPS. No more of that rack-based, bottom-of-the-rack crap. God I hate those. Dedicated power and cooling for my room. I even have a big red button to shut down the entire room :D

Before the move I've got a few things that I'm trying to get done.

1. upgrade my SAN

2. move 5TB of data to the new DAE

3. consolidate as many servers as I can with VMware. I should be able to shave off 14 physical servers and make them VM's

4. install new PRI lines for our VoIP system (ordered)

5. install 10mb to the internet (ordered)

6. get with our data integrators and be the quarterback.

My plan for that weekend will be to have the movers pack up everything that we are taking and move them to the new office. Once there I'll get on the phone (cell) with Verizon to transfer our DID's over to the new PRI's. Then get the phone system up and tested. Move our IP's to the new routers so our DNS entries should not change (I'll have to confirm this). Get our SAN online and exchange cluster up and running. Then our file servers and VMware cluster. Should be easy right? :D

So when am I going to do all of this? I was planning on getting it done before Dec 22, but since I'm going on vacation I don't want anyone calling me while I'm in Jamaica. Plus I doubt that the 5TB of data will be copied by then. So I'm shooting for some time in January. I should be well rested.

SAN upgrade

We are smack in the middle of upgrading our emc CX-300 to a CX3-20. We went from flare code 19 to 26 last week and another guy will come out and convert it to a CX3-20. They said it will take 7 hours. This should be fun....NOT! I may have to come in the Sunday right after Thanksgiving. Such is life.

During that conversion we will also be adding a new DAE (disk array enclosure) that is about 14TB. We are getting 1TB disks. Yeah, we need all that space. We burn through disks here at an alarming rate. It seems like every few years I have to move our entire production data over to new disks and quite frankly I am tired of it :/ But the job must get done.

So it's been a while....Move update Phase I

So back in June I mentioned that my office is moving (Phase I). Well, that is complete. It wasn't easy but it wasn't hard either. A lot of coordinating with vendors to get services delivered. I was dealing mostly with Verizon. I want to thank Donna Moriarity at Verizon for keeping me up to date throughout that entire project. She wasn't even our project manager, go figure.

Anyhow, we are at two sites now, about 15 city blocks apart in NYC. We are connected via a 1 gig direct connection, so to the folks in the new office it seems like they are still working at the main office.

A crash course in single mode and multi mode connections. When the circuit was installed it was a single mode fiber hand off. Now, I was very new to the single mode and multi mode jargon. It didn't take me long to get my head around it though. I was pressed for time. I got my Cisco 3560's stacked and configured and all I needed was the line connected and away we go. Boy was I wrong. I had about 2 weeks to get that going, plus wireless with local and guest VLANs working.

I quickly learned that Cisco switches are multi mode and most of the fiber cables on the market are multi mode. All the cables we had were in fact multi mode. So I had to act fast and order single mode cables, but this was before I realized that the Cisco gear didn't take a direct single mode connection. So when the cables arrived (one for each end) the connection still didn't work. Then I figured that there must be a piece of this equation that I was missing. I then found that my gear on both sides needed a single mode to multi mode converter transceiver and GBIC. The order was placed and once I got them we were in business. My link was lit and data was flowing like a river. Everything worked perfectly. Phones across the link, data, emails, printing: perfect. My VLAN's were all configured without issues ;)

So now I had burned through a week with all the back and forth ordering, and I was left with less than a week before the move and the wireless wasn't installed yet. We mounted the AP's and installed the WLAN controller. We got a guy in to do it for us, but we were just as involved in it as he was and were actually telling him how certain things should be done. Nonetheless we got it going the day of the move, which was Friday Sept 19th. Everything went smoothly. My workstation guys and the movers did a midnight move and I came in in the morning just to make sure all the logins were OK.

Another project under my belt signed sealed and delivered.

Thursday, June 19, 2008

Extracting email address from Exchange....

I got a call from MessageLabs asking us to update the list of our email addresses that are allowed to send emails through their system. They are trying to decrease the possibility of a dictionary attack, which scans an entire domain with made-up words and sends emails to any and every address in that domain. So to help keep the company secure I had to stop what I was doing and attend to this matter ASAP.

So how does one get all the valid email addresses out of Exchange 2003? Well, you'd think it was as simple as right click and export, right? WRONG! Exchange 2003 does not allow you to export all of your email addresses in the manner that you'd think. There may be some 3rd party tool, but who wants to go through all of that.

Here is a simple solution that I found that works great. You'll need the Windows Support Tools (for ADSIEdit) installed on your DC (the one you'll run this command on).

First go to a global catalog server and run ADSIEdit (if that does not work you don't have the support tools and you'll need to download them; they are a part of 2003 SP1).

1. run ADSIEdit
2. expand the domain to see the OU's
3. right click on the OU you are trying to get the addresses from (we are talking about the OU that has all or most of the users and groups with email addresses)
4a. go to properties
4b. in the left field look for distinguishedName
5. click edit and copy the entry
6. paste this into notepad - it should read OU=YourOUname,DC=yourDomainName,DC=com

example
OU = abc
domain = xyz

OU=abc,DC=xyz,DC=com

7. go to the command prompt on the same server and type

csvde -f c:\addresseslist.csv -d "OU=abc,DC=xyz,DC=com" -r (mailnickname=*) -l mailnickname,proxyaddresses -p subtree

Edit the output path and the OU=...,DC=... string to accommodate your environment.

8. now go to the C:\ drive and import the .csv file into Excel.

All done. You'll need to make sure the Excel import wizard runs so you can set the parameters so that it's easy to extract the email addresses.
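If you'd rather skip the Excel wizard, the export can also be picked apart with a few lines of script. This is a minimal sketch in Python, assuming the export has a header row with a proxyaddresses column and that multi-valued attributes come out as one field separated by semicolons; the sample rows and names below are made up for the abc/xyz example above.

```python
import csv
import io


def smtp_addresses(csv_text):
    """Return the unique SMTP addresses found in a csvde export, in file order."""
    found, seen = [], set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Match the column name case-insensitively in case the header casing differs.
        proxy = ""
        for key, value in row.items():
            if isinstance(key, str) and key.lower() == "proxyaddresses":
                proxy = value or ""
        # Each entry looks like "SMTP:user@domain" (primary) or "smtp:alias@domain".
        for entry in proxy.split(";"):
            prefix, _, address = entry.partition(":")
            if prefix.strip().lower() == "smtp" and address and address.lower() not in seen:
                seen.add(address.lower())
                found.append(address)
    return found


# Hypothetical export for the OU=abc,DC=xyz,DC=com example.
sample = (
    "DN,mailnickname,proxyaddresses\n"
    '"CN=Jane Doe,OU=abc,DC=xyz,DC=com",jdoe,SMTP:jdoe@xyz.com;smtp:jane@xyz.com\n'
    '"CN=Staff,OU=abc,DC=xyz,DC=com",staff,SMTP:staff@xyz.com;smtp:jdoe@xyz.com\n'
)
print(smtp_addresses(sample))  # ['jdoe@xyz.com', 'jane@xyz.com', 'staff@xyz.com']
```

Entries with any other prefix are simply ignored, so only the SMTP addresses end up in the list you send over.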

Checking in......

I've been quite busy since I last posted. Here is what I've been up to:

My office is moving in phases, so I've been attending weekly meetings to make sure everything from the IT standpoint goes according to plan.

The phase one move will be about half the office moving to a new space fifteen blocks away. All the IT resources will stay here until phase two. In the meantime I am connecting both offices via Verizon Metro-fiber (1 gig).

We are also rolling out a firm-wide MPLS WAN solution to connect offices in New York, London, Shanghai, Hong Kong and Singapore. I've been doing a lot of coordinating with Verizon via phone, email and meetings to make sure this is a solid solution. We originally contracted Savvis for this project so they could set us up with their MPLS solution, but they could not deliver and we were forced to cancel and go with Verizon.

I'm also in the process of upgrading our EMC CX300 to a CX3-20. The CX3-20 will allow us to utilize 120 disks vs the CX300 maxing out at 60 disks. Plus all the other bells and whistles that the CX3-20 has to offer.

We are also upgrading our Cisco CallManager and Unity VoIP system to 6.1 (I think); it's the Linux version.

I think that's it for now. I'm sure I've missed some things. My brain is rattled these days with so much going on both at work and at home. I need a vacation BADLY!

Wednesday, January 02, 2008

Old VM's to New ESX server

So for all of us small/medium size guys who are facing problems with old VM's on older servers, or moving VM's from test into production, and who don't have Vmotion or any of the cool stuff: how do we do it?

I know of two ways. One way is command line and the other is GUI. The command line way is more fun LOL but it's long and prone to error. The GUI way, like everything GUI related, is short and to the point. I'll go over the command line way first b/c it's good to know these things.

Moving VM's from ESX host to ESX host (method 1)

First you'll have to unregister your VM from the target host (the host the VM currently lives on). To do this you have to use an SSH tool like PuTTY to console in. SSH into the host where the VM that you want to move is located and run the command to list your VM's.

vmware-cmd -l = lists your VM's

You will notice that VM's show up with long character names followed by the shortcut name which is what you named your datastore within the Infrastructure client.

Now you'll need to unregister your VM that you want to move. To do so you'll use the command

vmware-cmd -s unregister /path/to/datastore (this is the path to the VM's .vmx file and should start something like this: /vmfs/volumes/datastore name)

Now use FastSCP, b/c it allows you to SCP into the server with root access (great tool), to connect to both the target (source) ESX host and the destination ESX host. You should see both hosts in the left side pane.

On the destination host create a folder where your volumes are and call it the datastore name. On the target (source) host browse into the folder of the VM that you want to move to the new host. You will see about a dozen or so files. Copy all the files EXCEPT the .vmdk files over to the folder you just created on the new host.

Once that copy is done, create a temp folder on the destination host wherever you have available space (it must be enough to hold the .vmdk files). Now copy the .vmdk files from the target (source) server to this location. This copy takes a while depending on the size of the .vmdk files; 40GB is about 2 hours.

Once the files are copied run the vmkfstools command.

vmkfstools -i /path/of copied vmdk/name.vmdk /path/of where all the other files are/name.vmdk

So you run this tool against the .vmdk files you just copied so that it can reincorporate them with the first set of files you copied earlier. You must have both paths correct or it will fail.

This process takes a little while b/c it has to put everything back together and I think it defrags at the same time.
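Since method 1 is prone to typos, it can help to write the commands out before pasting them into the SSH session. Below is a small sketch in Python that only builds the command strings from the steps above; it does not run anything on the host. The paths are hypothetical examples, and the quoting is there b/c datastore names often contain spaces.

```python
import shlex


def move_commands(source_vmx, copied_vmdk, final_vmdk):
    """Build the command lines for method 1, quoting any path that needs it."""
    return [
        "vmware-cmd -l",  # list the registered VM's on the host you are moving from
        f"vmware-cmd -s unregister {shlex.quote(source_vmx)}",
        # run on the destination host once the .vmdk files have been copied to the temp folder
        f"vmkfstools -i {shlex.quote(copied_vmdk)} {shlex.quote(final_vmdk)}",
    ]


# Hypothetical paths, for illustration only.
commands = move_commands(
    "/vmfs/volumes/datastore1/web01/web01.vmx",
    "/vmfs/volumes/my store/tmp/web01.vmdk",
    "/vmfs/volumes/my store/web01/web01.vmdk",
)
for line in commands:
    print(line)
```

Reading the printed lines back before running them is a cheap way to catch a wrong path, which is the usual reason the vmkfstools step fails.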


Moving VM's from ESX host to ESX host (method 2)

Use VMware Converter. LOL that's it. I found out the hard way that this was the easiest way to move a VM from host to host. My last post explains how to do it.

VMware is sweeeeeet!

So I've moved forward with VMware in full swing. We purchased a new server for our production VM environment. The server specs are as follows:

HP DL580 G5
4 Quad core 3GHz Xeon CPUs
48GB memory
2 72GB 10K rpm HD
2 Qlogic HBA's
connected to our emc SAN

This server is a beast. In case you didn't realize, that is 16 cores in this one box. I can host about 5 VM's per core for a total of 80 VM's, give or take, depending on resource allocation.

This server could replace pretty much all the servers on my network. And in essence I could have one rack with just this box and my SAN.......but I'm not doing that. I'm just going to consolidate our file servers and our single function servers. No need for those 1U guys anymore that just run IIS.

So why is VMware so sweet? Well, b/c it's owned by EMC. Really, it's b/c the integration into your environment is almost transparent to the end users. You can literally convert a physical server that's dying into a VM in a matter of hours, provided you are ideally set up with a SAN and your ESX host is up and running.

With VMware Converter (free), you can install it on the physical machine you want to convert, or if that machine is out of space like most old servers are, you can install it on any other server and point the converter to the server you'd like to convert. Once VMware Converter is installed you run the app and a wizard opens up, pretty much asking you what server you want to convert, what volumes (c:, d: and any others), where you'd like to house these volumes (this is best in a SAN environment where you have already carved out a LUN for this server), name change, network config, and that's it. Once you hit start, the physical server stays online with users connected while the converter takes a snapshot of the entire server, files and all, and turns it into a VM. Not only does it do that, it also sends the VM over to the ESX host and powers it on (power on is an option if you want no downtime).

The amount of time it takes depends on how much data you will be converting. Remember the converter will be turning that physical server, with everything on it that you picked, into a VM. So if it's a server with a 40GB database, all that has to come over. That takes about 2 hours. The converter shows an ETA as well.
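If you want a rough number before you start, here is a tiny Python sketch. It assumes conversion time scales linearly with the amount of data, which is only a guess based on my one data point (40GB in about 2 hours, so roughly 20GB per hour). The function name is made up, and your network and disks will change the rate.

```python
# Rough conversion-time estimate, assuming time scales linearly with data size.
# The 20 GB/hour default comes from a single observation (40GB in ~2 hours).

def estimate_hours(data_gb, gb_per_hour=20.0):
    """Estimated hours to convert data_gb of data at the given rate."""
    if gb_per_hour <= 0:
        raise ValueError("gb_per_hour must be positive")
    return data_gb / gb_per_hour

print(estimate_hours(40))   # 2.0
print(estimate_hours(100))  # 5.0
```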

It's great. I am amazed that EMC and VMware have automated this entire process.

Happy New Year!!!

Wishing you all the best in 08.

Tuesday, November 13, 2007

Exchange Store Defrag

I defragged two of the four information stores on my Exchange server on Friday night. I started at 10pm and I went to bed at 4:30am Saturday morning.

My four mailbox stores are over 100GB. They had already outgrown the 100GB partition that I had them on. So months ago I had to move one to the same partition that our public store is on. The plan was to defrag all the stores, getting back all the white space, and move the store on the public store partition back with the rest of the mailbox stores. Well that didn't happen.

All the stores are about 25GB and their streaming databases are about 6GB each. So you can see how that is well over 100GB. I did them one at a time. I first did a database move to our recovery storage partition since it's a part of our cluster's mount points. This move took 45mins for both the .edb and the .stm databases. Once moved I ran

eseutil /d "drive:\location\database name.edb"   (in quotes b/c I have spaces in the name)

This took some 3 hours. About 9GB per hour is about right according to MS. When done I mounted the store from the defrag location and did a database move back to the original location. This took another 45mins to copy back. I started the second store at about 3:30am. After that one copied to the defrag location (45min copy) I ran eseutil and WENT TO BED. I woke up at 6:30ish am and found it was done. I copied it back and that was the end of that.

The scheduled downtime was from 10pm Friday night to 10am Saturday morning. So attempting to do the last two stores would have gone well over the time I allotted for maintenance.
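Here is the window math as a small Python sketch, using the rough timings I saw per store (45 min copy out, about 3 hours of defrag, 45 min copy back). The names are made up for illustration and the numbers are my observations, not Microsoft guidance.

```python
# Per-store timings observed above, in minutes.
COPY_OUT = 45
DEFRAG = 180
COPY_BACK = 45

def minutes_per_store():
    """Total minutes to move, defrag and move back one store."""
    return COPY_OUT + DEFRAG + COPY_BACK

def stores_that_fit(window_minutes):
    """How many stores can be done back to back inside the window."""
    return window_minutes // minutes_per_store()

window = 12 * 60  # 10pm Friday to 10am Saturday
print(minutes_per_store())           # 270
print(stores_that_fit(window))       # 2
print(4 * minutes_per_store() / 60)  # 18.0 hours for all four stores
```

Two stores fit in the 12 hour window. All four would have needed about 18 hours.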

What was I doing while I waited for the copy and defrag to finish? While the wife and baby were asleep I took the time to play some WOW ;)

Tuesday, October 30, 2007

Job is relocating

My job is moving a couple of blocks. From an old historic building to another old building in NYC. At least this new old building is a lot better.

My responsibility as the MIS is to move or build a new IT infrastructure. I think we are going to do a little bit of both. The problem is that we are not all moving in at once. We are moving about 100 people at a time over a 2 year period. So that means I can't just pull up the infrastructure in one location and move it over a weekend. There can't be any downtime (you'd think this was a financial firm). So I am tasked with setting up two networks that will talk to each other so when people move from the old office to the new office everything works exactly the same. Here is what is involved on the IT side of things to get this to work.

- A solid WAN connection
- WAN accelerators (Riverbed devices)
- Cisco IPT phone system
- SAN and VMware
- Switches for new space
- lots of cabling
- security
- A/V
- Wireless
- new workstations
- metro card

This is pretty much the basics. We have all of this stuff now but we may need to get a second of everything. The bottom line is that there can be ZERO downtime. I think this will be a piece of cake. My boss the Director of IT seems to be stressing a bit. Hey you can only play the cards you are dealt.

It's been a while..

Didn't I already use this title?

Anyway. I've done a lot since my last post. Let's see:

- I got married in June. 7/7/07 :D <--this will NEVER be any form of password LOL!
- Went on a 2 week honeymoon. Can't beat a Caribbean cruise
- Started playing WOW again
- Did a crazy setup in my house 5 boxing WOW
- Planning a physical relocation at my job for the IT infrastructure.
- Also consolidating Windows 2003 domains into a single domain.

That wasn't too much was it?

Oh and I got an iPhone too. How could I forget that. THEE BEST PHONE EVER! I can't believe I left it at home today too F%$@!

Wednesday, May 16, 2007

VMware and EMC news

I was back to focusing on some VMware last week and found that the VMs that I created were not showing up anymore. Hmm, wonder why that is all of a sudden. I checked the zoning and it was all right, I checked the fiber cables and they were lit. I even went as far as swapping the connections on the HBAs. That didn't work, as my paths to the storage under Storage Adapters in the VMware Infrastructure Client vanished. So I put it back. I then went into Navisphere to see if the host was showing up and it wasn't. Not even the IP was coming up. WTH! I then started browsing around the VMware forums and did a search for CLARiiON and ESX. I didn't find anything concrete. I'm not a member of the forum yet either so I didn't bother asking a question. I figured that this was basic and had probably been asked to some degree in the past.

Anyway the problem was that this server was originally a Windows 2003 server with all the EMC software (SANsurfer and PowerPath) installed, so when the server was on it registered with the CX300 automatically. So under the host tab in Navisphere the server would be right there and you could assign LUNs and away you go. Since the ESX server didn't have that software installed, it wasn't showing up in Navi. I was racking my brain trying to get this working again b/c it worked before. The VMware forums led me to a post where a guy mentioned just adding the WWN of the server to Navi and that's it. But it didn't mention exactly how. So this is where I figured out what was going on. I was right clicking on everything in Navi trying to find where to add a host or WWN. I finally came across the Connectivity Status window. Here is where all the host names and WWNs are associated. I noticed that the names (it was renamed two times) of what this server was as a Windows 2003 box were still in there and associated with the WWNs of the HBAs. Since the new ESX host did not have the software that updates the registration, the CX300 didn't know about this server even though zoning was still in place and the hardware was the same. So I had to deregister the WWNs from all the old host names and register them to the new host, which is the ESX server with its new IP. Then everything started to work.

That took about a day to figure out but it felt good figuring it out nonetheless. So if anyone who is not a VMware/EMC expert runs into this sort of thing, check the Connectivity Status by right clicking the Storage System in Navisphere to make sure your connections are all up to date and old connections aren't lingering around.

Friday, May 11, 2007

Exchange Cluster issue

About two weeks ago I was pretty much alone running the NY side of things. My boss the Director was out in our other office in London then Shanghai and my Admin was on vacation. So I was left to handle the back-end and make decisions on my own, AGAIN!

It was a dark and stormy Tuesday night...(it was just dark) the phone rang right as my wife tells me that my baby girl has a fever of 100+ degrees, yikes! It's my boss on the phone and he says he can't connect to the Exchange server from Shanghai. We are on an MPLS so everything should work. So I dig up my laptop and have a million things running through my head as I am most worried about why my daughter has such a high fever. I boot up and VPN into the office to check things out. At first glance everything looks fine. I am in my Outlook and I can OWA in as well. So what is he talking about? I VNC ALLLLLLLLLLLLLLLL the way to the Shanghai server to see if I can do anything from there and I can. So what is the deal here? He tells me he keeps getting an error when trying to open Outlook and OWA. So I try to login from there and I can OWA fine. I try to use his credentials from the same box and I get the error. I use his credentials on my box in the NY office (remote desktop + VPN is great) and I get the error too. So what the hell, I say.

I start snooping around the Exchange System Manager to see if I can see anything abnormal. Nothing. Nothing b/c the damn thing gives no errors and the app does not refresh, so I didn't know there was a problem until later. I start checking the event log; mind you it is going on 11pm and I am getting sleepy and worried about my daughter and this damn problem at the same time. The event log was saying that the mailbox store was having problems writing to the disk and was stopping, I think it said. But that didn't register b/c I wasn't focused on this problem; my daughter was boiling up and I was scared to shit. I'm still on the phone with my boss and he tells me he has to go to a meeting over there and will call me back.

I'm off the phone worried about two things. My work and my daughter. Well, my daughter had gone to bed and her fever came down, and my work was really starting to get to me. It was about 12am now and I am just realizing what is happening. I at first thought my transaction logs filled up so I checked the space and it was fine, then I reread the event error and was like hmm. Then it dawned on me: the store can't write b/c the drive is FULL. I check and sure enough the drive is 100% full. Then I really lost it b/c all that was going on had me not thinking logically. At that point I thought new information was overwriting existing information (why? like I said I was worried about my daughter all night and not thinking straight). So I look back in the Exchange System Manager and refresh the mailbox stores, and mailbox store #4 was down. I nearly had a heart attack. In that second I thought my boss's mailbox and others were completely gone, and I got up from the dining room table, walked into the living room and collapsed on the floor. It felt like all the blood drained from my head and extremities and pooled up in my stomach. Did I have an anxiety attack or panic attack or both? After a few minutes on the floor I got up and regained my composure. I was able to analyze what had happened and came up with a game plan to resolve the issue. I needed to move one of the mailbox stores to another partition to free up space in this one so that all the stores could come back online. I went to bed, got up at 3am, drove into work and moved the mailbox store. It took about 15 minutes to move the 16GB store. But that did the trick.

What I need to do next (still) is shrink the database with the eseutil tool to reclaim the white space. In all I should get back about 25GB. What caused all of this all of a sudden was the move to the cluster. The limits on the stores were not put back, allowing the users to fill up their mailboxes in a matter of a month. We are back on track now and all is good again. For now!
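A full partition like this is the sort of thing a simple scheduled check could have flagged ahead of time. Below is a minimal Python sketch of that idea, not something I actually had running. The 10% threshold and the function names are assumptions for illustration, and "." is just a stand-in for the store partition.

```python
import shutil

def percent_free(total_bytes, free_bytes):
    """Free space as a percentage of the volume size."""
    return 100.0 * free_bytes / total_bytes

def is_low(total_bytes, free_bytes, min_free_pct=10.0):
    """True when free space has dropped below the threshold."""
    return percent_free(total_bytes, free_bytes) < min_free_pct

def check_volume(path, min_free_pct=10.0):
    """Check the volume that holds `path` using the OS-reported usage."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return is_low(usage.total, usage.free, min_free_pct)

if __name__ == "__main__":
    # "." stands in for the mailbox store partition.
    print("low on space" if check_volume(".") else "ok")
```

Run from a scheduled task and hooked up to an email or page, something like this would give some warning before the store stops writing.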

Management duties

My management duties have consumed most of my time since my last post. I also had an Exchange information store shutdown. NOT a crash, a shutdown. I'll make a new post about that soon. I've been dealing with meetings, talking with vendors and making sure a lot of things get done. BORING stuff. Still a lot of thinking required. The most frustrating part is the disorganization of the office environment, from a business standpoint I am speaking of. Yeah, I know a few things about how a business should run, enough to hold a conversation ;)

Wednesday, April 18, 2007

Multi-tasking at its finest

Multi-tasking baby. I'm talking about me here, not computers. I've been swamped today trying to get VMware ESX 3 going, configuring the Juniper SSL VPN box and making sure our overseas users have the proper access to their resources. Basically I am locking them down and forcing them to use the Juniper SSL VPN as their entry point. Yes, I'm doing this all at the same time.

VMware ESX 3 has its own learning curve. I've been racking my brain just trying to install my first guest OS. I've got the VMs set up, that's the easy part. For some reason the VMs won't boot from the CD-ROM. I've tried the CD-ROM on the ESX host machine and I've tried the one on my workstation. Nada! I've been beating the boards all morning. I ended up creating an ISO of my Windows 2003 server CD using the dd command on the ESX host.

dd if=/dev/cdrom of=/vmimages/myISO.iso bs=32k

What it does exactly I have no clue just yet. I'm joking, it's copying the contents of the CD to that location; it's the bs=32k that's got me. But this is the learning process anyway. I will be reinstalling once I get a handle on what exactly is happening. Also b/c I am using the only 100GB I have on my SAN to install all these VMs. Each VM I am giving 10GB. Eventually I will figure out best practice on the installation and how to manage LUNs off the SAN. I did a typical install :p call me a noob, I don't care. Two VMs with guest OSes installed, more to go ;)
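On the bs=32k part: as far as I can tell, bs just sets the block size, so dd reads and writes 32 KB at a time instead of its default 512 bytes. It copies the raw contents of the device rather than individual files. Here is a small Python sketch of the same idea, copying a stream in fixed-size blocks. The function name and the in-memory test data are made up for illustration.

```python
import io

def copy_blocks(src, dst, block_size=32 * 1024):
    """Copy src to dst one block at a time; return the number of reads."""
    reads = 0
    while True:
        block = src.read(block_size)
        if not block:
            break  # end of input
        dst.write(block)
        reads += 1
    return reads

data = bytes(70000)  # 70,000 zero bytes standing in for the CD contents
out = io.BytesIO()
print(copy_blocks(io.BytesIO(data), out))  # 3
print(out.getvalue() == data)              # True
```

The output is the same whatever the block size. A bigger block just means fewer, larger reads and writes, which is usually faster than the tiny default.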

And as for the Juniper, you can say I had a crash course in configuring that too. Under pressure it's amazing what you can do. That's if you know what you're doing.

All in all the Juniper device is great. So great in fact I ordered it today.

Tuesday, April 17, 2007

VMware ESX 3

Finally I can get around to installing this thing and trying it out. I'll update more once it's setup.

Wednesday, April 04, 2007

Juniper SSL VPN appliance

We are testing a Juniper SSL VPN SA-2000 appliance for 30 days. It was installed on Monday and I am impressed with what it can do. We are looking for a better VPN solution than our current one. Right now we use Check Point's SecureClient and we have to install that on all our remote users' laptops. This limits who can VPN into the office to only those with company laptops. With the SA-2000 we can have anyone VPN into the office.

Based on the flexibility of the device we can set up policies to allow different levels of access. I can set the device to do a hardware check, user check or any combination of checks. Example: I have a hardware check on that scans the registry for company-named machines based on our naming convention. If that checks out fine, Network Connect will install. Network Connect is pretty much a VPN Java applet that creates an SSL tunnel over HTTP, giving you an IP from a pool and allowing you to have full network access. Now if the hardware check scans the registry and sees that you are not on a company machine, you will not get Network Connect and will only get browser access to resources. All authentication is done via Active Directory, which is nice.

If you are a certain user that is not on a company machine you have more resources published to the SSL VPN home page. Example, if I log in I will get my intranet, terminal service, all mapped drives, meeting (like webEx) and whatever other internal links that I want to add. If a regular user logs in I can have them only get Outlook Web Access and/or whatever resource they are working on internally.
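To make the policy logic concrete, here is a rough Python sketch of the decisions described above. This is NOT how the SA-2000 is configured (that is all done in its admin pages). The "CORP-" prefix, the function names and the resource names are hypothetical stand-ins.

```python
# Hypothetical naming convention; the real one is not given here.
COMPANY_PREFIX = "CORP-"

def access_level(machine_name):
    """Company-named machines get the full tunnel, everything else browser only."""
    if machine_name.upper().startswith(COMPANY_PREFIX):
        return "network_connect"
    return "browser_only"

def published_resources(is_admin):
    """Links shown on the SSL VPN home page for the two kinds of user."""
    if is_admin:
        return ["intranet", "terminal services", "mapped drives", "meeting"]
    return ["Outlook Web Access"]

print(access_level("corp-laptop-042"))  # network_connect
print(access_level("HOME-PC"))          # browser_only
print(published_resources(False))       # ['Outlook Web Access']
```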

I really like this solution as it is pretty much the one-stop shop for remote access. And the level of flexibility is great. It runs a hardened version of Linux, not sure which distro, but I can get into that some other time.

Exchange Cluster complete

So we finished our cluster install on Friday. All mailboxes moved except one 11GB culprit. We got Trend ScanMail installed and running. That 11GB culprit, we finally got him down to 3GB last night and moved his mailbox over. Now we can properly decommission that Exchange server and reclaim the box.

Thursday, March 29, 2007

Exchange Clustering Day 5.6.7.8.9 something

In the last few days we have been moving mailboxes and trying to iron out some issues that have come up.

One issue that came up was that the RUS (Recipient Update Service) was pointing to an old domain controller that was decommissioned a LONG time ago. Anyway, we reconfigured that. The symptom was that when a mailbox was moved it took forever and a day for the Outlook client to reconnect. It should reconnect in seconds. FIXED!

Another issue we were having: when moving mailboxes that had BlackBerrys associated with them, the BB would not be able to send emails. I think this was b/c RUS was all eF'ed up too. FIXED!

We were also having Symantec Enterprise Vault issues when we had to set the services up again to archive the public folders. The application had to associate the service with the system mailbox and it could not see ANY mailboxes. So we rebooted the EV server and we were able to see all the mailboxes and choose the system mailbox for the EV service. FIXED!

Repathing SMTP. All of our incoming and outgoing emails go through MessageLabs. You can pretty much say they have an MX record for us. They only send emails to SMTP.domain.com and only receive emails from the A.B.C.D IPs we give them. If my current single Exchange server is already working, can't I just swap IPs? Well, yes I can, and no, it will not work 100% of the time. Here is why. My firewall object points to my Exchange server, and switching the internal IP on the object will work for incoming emails. SMTP.domain.com points to a public IP that is NAT'ed to an internal IP. BUT... in a cluster, what we have come to find is that even though the virtual Exchange server that is now SMTP.domain.com will get all the incoming emails, going out is totally different. Whichever node in the cluster is the active one, that's the IP that will be attached to the email header. So MessageLabs is seeing this new IP from the active node trying to send emails and is rejecting it, even though the emails are coming from the virtual cluster which is registered with the correct external IP. It's the active node's IP that the emails hang on to. So I had to call MessageLabs and get them to add both external IPs that I created for the nodes. Well, it takes 4-6 hours to propagate. We send emails to a MessageLabs cluster so the changes I've added have to hit all the servers in their cluster. I was able to get some emails to go out b/c some towers in the cluster had the changes and some didn't. I'll just wait until they all have the changes and switch IPs later on tonight. So for now all my emails are still flowing through the single (non-clustered) Exchange server. Something to watch out for if you are clustering Exchange. FIXED! (in a few hours)
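Here is the gotcha as a little Python sketch. It just models a provider that only accepts mail from registered IPs. The addresses are made-up documentation-range IPs (203.0.113.x), not ours, and the names are hypothetical. I have no idea how MessageLabs actually implements their check.

```python
import ipaddress

def ips(*addresses):
    """Build a set of IP address objects from strings."""
    return {ipaddress.ip_address(a) for a in addresses}

def accepted(sender_ip, allowed):
    """True if the provider would accept mail from sender_ip."""
    return ipaddress.ip_address(sender_ip) in allowed

# Hypothetical external IPs.
VIRTUAL_SERVER_IP = "203.0.113.10"  # what SMTP.domain.com resolves to
NODE_A_IP = "203.0.113.11"          # outbound IP when node A is active
NODE_B_IP = "203.0.113.12"          # outbound IP when node B is active

before = ips(VIRTUAL_SERVER_IP)
after = ips(VIRTUAL_SERVER_IP, NODE_A_IP, NODE_B_IP)

print(accepted(NODE_A_IP, before))  # False: outbound mail gets rejected
print(accepted(NODE_A_IP, after))   # True
print(accepted(NODE_B_IP, after))   # True: still works after a failover
```

Both node IPs need to be registered, otherwise outbound mail breaks the first time the cluster fails over.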

Friday, March 23, 2007

Exchange Clustering Day 4

We've tested mailbox moves and all works well with a very small mailbox. I am in the process of moving my mailbox (700MB). I estimate it will take 40 minutes to move b/c I'm doing this in the middle of the day and the server is busy.

One thing I have to keep in mind is that the current production Exchange server is the only server that can send emails through the firewall. Also it's the only server that can send emails to MessageLabs. So what I will have to do is change the firewall object to point to the cluster's internal virtual IP. This should allow the cluster to send and receive emails without sending on behalf of the current Exchange server like it is doing now for testing.

I am also testing our blackberry functionality with the mailbox move to a new cluster as well.

Thursday, March 22, 2007

Exchange Clustering Day 3

Today we are tweaking the cluster and configuring replication on the public folders since that will take forever and a day. DAMN a day, it will take a few days. We have a 100GB public information store. Yeah, we really use our public folders. Hopefully today we can move a test mailbox over to the cluster and see the results.

Exchange Clustering Day 2

Day 2 was actually yesterday. It was a bit busy and frustrating at times. One or both servers were acting real funny right from the very beginning. On one, the OS Service Pack 2 wasn't showing up in Add/Remove Programs but it was installed. I even went ahead and installed it again three times. We went ahead and installed Exchange on both servers and then created the Exchange virtual server. The Exchange virtual server was created by creating an IP Address resource, a Network Name resource and a Physical Disk resource, then the System Attendant resource, which allowed the cluster to show up in the Organization.

After that the failover from server A to B and back was taking entirely too long. Then Exchange SP2 wouldn't install on server A. The MSDTC service would always fail to start. So we followed some of the articles to remove and re-add it but the service never added back for some reason. So at the very end of the day I made the decision to REFORMAT both servers and reinstall.

We decided to look for some newer hardware drivers and firmware updates, found some and installed those. Then we replaced the heartbeat cross-over cable. Then reinstalled. Everything works so much better and the service pack was installed before the cluster was brought up this time, LOL!

That was yesterday.

Tuesday, March 20, 2007

Exchange Clustering Day 1

After getting the RAM and HBAs into the servers, racking them and connecting the heartbeat and LAN, we installed the OS and patched them. We've named them A and B and got the carving work all set on the SAN. We've carved up 64GB for logs, 100GB for the private store and 150GB for the public store. We've also carved up 500MB for the Quorum drive and 5GB for the Exchange mounts. The Exchange LUN is there to minimize the number of drive letters that will show up on the server. There will only be C: E: and Q: NO D: F: G: H: Why? Here is why. In the drive labeled Exchange there will be mount points to the transaction logs, the private information store and the public information store. Normally these mounts would have been drive letters in My Computer.

Then we'll turn off one of the servers. In this case B. We'll assign the Quorum and Exchange LUNs to server A and run diskpart to offset the disk for performance.

diskpart
select disk #
create partition primary align=64

Do this for each LUN as per EMC's best practice.
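Why the offset matters, as a quick bit of Python arithmetic. diskpart's align value is in KB, so align=64 starts the partition on a 64 KB boundary. The sketch assumes the classic default of starting the first partition at sector 63 with 512-byte sectors, which is my understanding of what Windows 2003 does when you don't specify an alignment. The function name is made up.

```python
SECTOR_BYTES = 512

def is_aligned(offset_bytes, align_kb=64):
    """True if the partition start falls on an align_kb boundary."""
    return offset_bytes % (align_kb * 1024) == 0

default_offset = 63 * SECTOR_BYTES  # assumed default start: sector 63
print(default_offset)               # 32256
print(is_aligned(default_offset))   # False: I/O can straddle two stripe elements
print(is_aligned(64 * 1024))        # True: what align=64 gives you
```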

Shut down server A and bring up server B. On server B we'll just assign the LUNs that we've just assigned to A. (NOTE: you technically are not supposed to assign a single LUN to two servers. The exception is a cluster environment, which is what we are implementing. This is why one server is turned off.) Once assigned we can run Cluster Administrator. It does not matter what server we are on as long as one of them is turned off.

In Cluster Administrator click open and create new cluster. Add the current server to the cluster. This server will be in the cluster alone for now and, most importantly, will LOCK the shared LUNs so the other server cannot write to them when it's turned back on. Add the name of the other server in the wizard. Once done turn the other server back on. Open Cluster Administrator on that server (the one just turned on) and run the wizard but select add node to cluster.

The cluster should be all set up now. You should see your heartbeat and LAN connections under Networks. You will have to set up disks in the Cluster Group for all your LUNs, even the LUNs that are mount points in the Exchange LUN created earlier. You should also see which server owns the disks at that given time. There can only be one owner for the disks. You can change owners, which will shift the disks over to the other server, by right clicking on Cluster Group and choosing Move Group. This will move the disks manually over to the other node in the cluster. This will happen automatically in the event that something happens to the active server. I am setting up an active/passive cluster BTW.

Exchange install tomorrow.

VMware consolidation project, answer

To answer my own question of how a VMware server with 5 virtual servers and 4 HBAs (2 going to the SAN and 2 going to the XServeRAID) will share resources:

I'll need ESX Server, and the ESX server will find all the hardware. It will act as the sole server connected to the SAN and XServeRAID. The five virtual servers will know nothing of these storage devices. The ESX server will have two paths to the SAN and two paths to the XServeRAID, with two HBAs for each. Once set up, the ESX server will have the drives needed for each server before the actual virtual servers are installed. Then I'll install the OSes and assign the disks to each server.

I can't wait to tackle this project.

Friday, March 16, 2007

VMware consolidation project

As I am starting the installation for my Exchange cluster project, I am at the same time thinking about future projects. What came to mind was to consolidate five servers into one VMware box. Sounds easy enough but I have questions that I am uncertain about. We have:

Intranet server
OWA server
file server (Lib)
file server (home directories)
file server (img)

I want my IIS servers to pull their data from my SAN (separate LUNs). I want my file server (lib) to pull from the SAN also (separate LUN); that's the easy part. The other two servers, file server (home directories) and (img), I'd like to pull data from an XServeRAID. Being that the XServeRAID is not a SAN (in our environment) I cannot give them their own LUNs. They would be sharing the same volume if I set it up in its current state and that is not good practice. So I'd have to put one or the other on the SAN. Anyway, the real question is: if I have these five servers on one physical box with four HBAs (2 to the SAN and 2 to the XServeRAID), how would five servers share four HBAs using VMware? Are virtual HBAs set up? I'll have to find out the answer to that.

VMware vendors don't call me, I'll call you.

Thursday, March 15, 2007

Update 4 External file hosting

I've also been trying to identify an external file hosting solution. I won't get into names as I don't want to directly give away the industry I work in. But some who can relate can figure that out. Or you can just ask me via email. Anyway, for 200GB and 1000 users they want to charge us 90K a year to host some files for us. I can do that for a fraction of that, as Verizon FiOS is available in my neighborhood now :D They've made the same joke and said it's not the storage alone we are paying for but the server and the redundancy and X and Y and Z. I think it's the name associated with them that makes the price too high. After all, they do make the applications we use to generate our large files.

Update 3 Symantec FSA

We've got Symantec's FSA (file system archive) installed and running. We are using this to clean up old files off the production file server. We have 1.2TB of Adobe PSD files on the file server older than 60 days. Some of you have that much in total for all your storage. Well, that is a fraction of our storage and it's easily noticeable by running tools to identify files by extension and size. We are removing these files and leaving pointers so that we can get back space for our Exchange Cluster project that starts next week.
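If you don't have a tool handy for that, here is a rough Python sketch of the idea: walk a folder tree, pick out files by extension and age, and total up the size. This is not what FSA does internally. Using the last-modified time for "older than 60 days" is my assumption, and the function names are made up.

```python
import os
import time

DAY = 24 * 60 * 60  # seconds

def select_old(files, now, ext=".psd", max_age_days=60):
    """files: iterable of (path, size_bytes, mtime). Return the old matches."""
    cutoff = now - max_age_days * DAY
    return [f for f in files if f[0].lower().endswith(ext) and f[2] < cutoff]

def scan(root):
    """Yield (path, size_bytes, mtime) for every file under root."""
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            info = os.stat(path)
            yield path, info.st_size, info.st_mtime

def total_gb(files):
    """Total size of the selected files in GB."""
    return sum(size for _path, size, _mtime in files) / 1024 ** 3

if __name__ == "__main__":
    old = select_old(scan("."), time.time())  # "." stands in for the share
    print(len(old), "files,", round(total_gb(old), 2), "GB")
```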

Update 2 Cisco IPT

I've been dealing with our Cisco rep and our IPT implementer trying to figure out why the deployment of key features that we'd like to roll out to the user base is so painful.

The installation of our IPT system went perfectly. The migration to the new system went well also. It's been up and running for a few months now and we are satisfied with the phone system itself. What we are not satisfied with is that the features of the system require different passwords. Pretty much every feature of Cisco Unity Connection has a separate password with different requirements. What I mean is, when we originally set up the user template in Unity Connection the password was 8 characters. So we went with that, not knowing that when we set up other features they would require 6-character passwords. HUH! So change one set to match everything, easy fix right? WRONG! Once the original users are set up on the system, before rolling out these additional features, you cannot change the template without wiping out the users. The template is set in stone. So as a result the users had a voice mail password of 8 characters and a PCA (Personal Communications Assistant) password of 6 characters. Not to mention a Windows password that expires every 90 days. Also we wanted to roll out the IMAP feature that allows your voice mail to show up in your Outlook in a different mailbox, which also has a password. You see what I'm getting at. Too many DAMN passwords using Unity Connection and nobody told us this when we were buying it.

These little things are easily overlooked or seem too stupid to even ask about if you are buying a 250K phone system. You would think Cisco would streamline some of these features and that the dumbest thing an ordinary person can think of would be there. Sadly that is not the case.

Another gripe with Cisco's IPT is that we went with Unity Connection b/c we did NOT want our voice mails stored on our Exchange server, which is what Unified Messaging does. Unified Messaging has all these features with a single password (I think) but the VMs are stored inside the Exchange server. That's a NO NO on so many levels. Some companies have very strict email policies and emails are deleted every X days. If VMs are in there they are automatically treated like emails and wiped out. Financial firms come to mind. Law firms also. So this is why Unity Connection is there. VMs are not stored on the Exchange server. They are kept on the Unity server. But Unity Connection has all the crap I discussed up top. So why don't they have the best of both worlds? Who knows! The ideal product would be Unified Messaging with the ability to pick where you want to store your voice mails. If I wanted them in my Exchange server, that would be the default. If I wanted them on another server with links to Exchange, that should be an option. It can be done, this is America and we are in 2007, anything can be done. We use Symantec Enterprise Vault to archive emails; it pulls emails out of the Exchange server and stores them in a database on another server/storage device but leaves a pointer to those archives. One click and it's back in seconds. The same thing can happen with VM too, Cisco. Wake up!

Update 1 Exchange

So what has been going on since my last post that was on.......January 4th? Well, back in January one of our Exchange servers had a hardware failure. This server was in our remote office in London. The server was down for a day or two b/c there was no 4-hour turnaround warranty on parts. As a result, some very important people could not get email on that end. One of these important people wanted this to never happen again. (Oh, just to be clear I do not administer the London office, my counterpart does that. My servers have a 4-hour turnaround time on parts.) With that said, I've come up with a solution for both offices.

My solution is to cluster the Exchange servers in both offices, increasing the redundancy of the servers. Both offices currently have one Exchange 2003 server. All email flows into the NY office and goes to the LD office across our private line. Next week we will start this project in both offices. My solution requires two new Exchange servers running on MS Server 2003 Enterprise Edition. The information stores will be held on our SAN and on the SAN in the LD office. This is going to set us up to start cloning the databases for backups and do away with tape, sort of. Our current Exchange 2003 server will act as a restore server. It will be hanging off the cluster, so to speak, and in the event of a restore that server will be the one mounting the database (information store).

Thursday, January 04, 2007

Exchange GAL updating

I've been beating myself over the head for a while now trying to figure out why my GAL (Global Address List) wasn't updating in Outlook when I make changes to a user account or AD object. When you create users in AD, even if they are test users, they will show up in the GAL if you are not copying a preexisting configured template. We try to keep our GAL clean and free of all the crap that we test, but sometimes if we don't delete the accounts (or can't, for that matter) they show up in the GAL. So I would go in and hide the user from the GAL. Exchange/Outlook is funny in that you have to either close Outlook and reopen it for the changes to show or restart the Information Store to force the change. Well, recently I remember one of my colleagues saying it was best to have Outlook cached mode on. I sort of disagreed but I turned it on anyway and forgot about it. So I get a request to clean up the GAL. I proceed to do so and lo and behold the names in the GAL still show up. I remember this happening in the past and a restart of the Information Store always fixed it. Not this time. Then I go and check Outlook to see if it was caching and yes, as I mentioned before, I forgot that I turned caching on and it was keeping all of the information. Don't know why the cache insists on keeping the information even after restarting the Information Store and restarting Outlook. A safety net perhaps?

Thursday, December 28, 2006

Server Crash.....

Today, during the last week of the year when it is the most quiet and boring, a server decides to crash. Hardware failure is the error on the BSOD. This server is for our archive. It is connected to both of our Apple Xserve RAIDs via a fiber channel switch. And to top it off a user is requesting archive material.

No worries though, this is where a storage area network (SAN) shines. All I did was rezone the WWNs (World Wide Names) of the two Xserve RAIDs from the dead server to another server and voilà, the drives are back online. Took mere minutes.

Tuesday, December 12, 2006

All users on the new SAN volume (event ID 55 cont.)

So my strategy worked like a charm: robocopying all the data to our new volume (SAN) and using ViceVersa to mirror the data between the source (internal storage) and destination (SAN). I removed the share from the old volume and created a share with the same name on the SAN volume. No one had to reboot or anything. Very transparent.

Now I'll wait a few days to make sure everyone has everything before I completely wipe out that corrupt partition.

Monday, December 11, 2006

So my Event ID: 55 errors are back

Actually they never left. They are just more frequent again, and last week the server froze up twice in the same day. What is my problem? Here it is.

Event ID: 55
Source: NTFS
Description: The file system structure on disk is corrupt and unusable. Please run the chkdsk utility on the volume "Drive_letter:"

CAUSE
This behavior can occur if the NTFS volumes' Master File Table (MFT) is corrupted. The short and long file name pairs that are stored in the directory index record and the file names that are stored in the associated File Record Segment (FRS) contain case-sensitive characters that do not match.

NTFS supports case-sensitive (POSIX) file names, but Chkdsk does not check file names in case-sensitive mode.

For example, assume that the directory index record has a BADFILe.TXT entry but the FRS has a BADFILE.TXT entry for the file name. NTFS views this as being invalid or corrupted, but Chkdsk compares only the names and ignores the case. It does not make repairs.
RESOLUTION
To resolve this issue, back up the volume that contains the corrupted file(s) and exclude the corrupted file(s) from the backup job. Reformat the volume, and then restore from the backup.
ARE YOU SERIOUS? I MEAN ARE YOU FUCKING SERIOUS!

So what am I going to do now? I'm NOT going to run any kind of chkdsk /f; I have no time for that. We already changed 2 dead disks and ran the HP diagnostics and all checked out fine. I have 1.7TB of data and cannot afford any downtime whatsoever. These people around here just don't seem to understand. Here is what I am doing right now. I've installed ViceVersa and registered it. Remember that new DAE for the SAN that was 3.7TB? Well, over the weekend I ran a robocopy to that location. It took over 64 hours to complete. I started on Friday Dec 8 at 7:46am and it ended on Monday Dec 11 00:24:38 2006 (whatever the hell that is).

------------------------------------------------------------------------------

                   Total          Copied   Skipped  Mismatch    FAILED    Extras
    Dirs :         78477           78466         0         0        11         0
   Files :        935841          935791         0         0        50         0
   Bytes :  1888030907.1 m  1888063666.6 m         0         0     8.5 m         0
   Times :      64:38:09        60:58:02                       1:39:58   2:00:08

Ended : Mon Dec 11 00:24:38 2006

C:\>
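
The start time of the copy can be cross-checked from the summary above by subtracting the total elapsed time from the "Ended" timestamp. This is only a quick sanity check in Python; the two values are read off the robocopy output, and the variable names are my own.

```python
from datetime import datetime, timedelta

# Values read off the robocopy summary.
ended = datetime(2006, 12, 11, 0, 24, 38)             # "Ended : Mon Dec 11 00:24:38 2006"
elapsed = timedelta(hours=64, minutes=38, seconds=9)  # "Times : 64:38:09" (Total column)

# Work backwards from the end time to find when the copy began.
started = ended - elapsed

print("started:", started.isoformat(sep=" "))
print("weekday index (Monday=0):", started.weekday())
```

The result lands on the Friday morning before the weekend, which lines up with kicking the job off before leaving for the weekend.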

I got a few errors but at this point I couldn't care less. Anyway, back to what I am doing. Since all that data was copied over the weekend there were bound to be changes. This place never sleeps. So I need ViceVersa to compare the source and destination and tell me what has changed. Then I will run the ViceVersa sync to update from the source to the destination. But because of the nature of the error there are corrupted files/folders that cannot be opened or deleted and show as 0KB in properties. This is a problem for ViceVersa because when it tries to read these files to compare, it bombs out. It bombs out with like 10% left at that. So in ViceVersa I had to create a profile to exclude the folder (in this case) that is causing all the trouble. If this works I'll be able to sync the two locations with this exclusion rule, repath everyone to this new location on the SAN and destroy this corrupted NTFS volume (Thanks Microsoft).

Friday, December 01, 2006

Another Apple Xserve RAID added

We've also added another Apple Xserve RAID. We have filled up the 5.5TB of usable storage on the first one. These devices are for archive purposes only.

The bottom one is the last added. Another 5.5TB for archive.
IMG_0001

EMC CLARiiON CX300 maxed out

We've added the last DAE to our EMC CX300. It is maxed out on disks at 60 for this particular model. This last DAE was all fiber channel drives at 300GB each. It's about 3.7TB of usable space. Time to plan the upgrade path to the CX500, which I'm sure we'll need by the end of 2007.

The top DAE was the last one added.
IMG_0002

Thursday, November 16, 2006

Riverbed install done.

It didn't take long. About an hour on the phone. We are going to have to upgrade the OS on all 6 at some point though. The upgrade works pretty much the same way as on any home router or access point: download the OS, browse to where it was saved and install.

Here they are. This first one is the 1020 and it is connected to another like it in London.
IMG_0005

This next one is the 200 and it is connected to another like it in Shanghai.
IMG_0004

To learn more about Riverbed appliances visit www.riverbed.com

Tuesday, November 14, 2006

Second set of Riverbeds going in..

Our second set of Riverbed devices is going in tomorrow between NY and Shanghai. Nothing special, just plug them in between your network and your point-to-point router.

Tuesday, November 07, 2006

IPT rollout continued

We have changed from our original plan of just rolling out IP phones to the office. The phones we were rolling out, and have started to roll out, were the 7941G. We decided to move up to the 7941G-GE. What is the difference? Well, the power requirements for starters, and the bandwidth for another. What do I mean? The 7941G is a 10/100 phone that requires 6.3W of power. The 7941G-GE is a 10/100/1000 phone that requires 12.9W of power. Each WS-C3560-48PS switch (our access layer) provides a maximum of 370W of PoE power. So if I loaded up the switch with all 7941Gs the power consumption would be 48 x 6.3W = 302.4W. Now that we've decided on the higher bandwidth option we cannot power 48 7941G-GE phones from these switches alone. They need power bricks or adapters, b/c if I added these phones without them the power requirement would exceed what the switch can handle: 48 x 12.9W = 619.2W. That is about 1.7 times what the switch can put out.
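
The power math above can be sketched in a few lines of Python. The wattages, port count and 370W budget are the figures from this post; the function name is my own, and the sketch ignores how the switch actually negotiates and allocates power per port, so treat it as back-of-the-envelope only.

```python
import math

POE_BUDGET_W = 370.0   # PoE budget per switch, figure from the post
PORTS = 48             # ports on the switch
PHONE_DRAW_W = {"7941G": 6.3, "7941G-GE": 12.9}  # per-phone draw, from the post

def poe_summary(draw_w, ports=PORTS, budget_w=POE_BUDGET_W):
    """Return (total draw with every port populated, max phones the budget covers)."""
    total = draw_w * ports
    max_phones = min(ports, math.floor(budget_w / draw_w))
    return total, max_phones

for model, draw in PHONE_DRAW_W.items():
    total, max_phones = poe_summary(draw)
    verdict = "fits" if total <= POE_BUDGET_W else "exceeds budget"
    print(f"{model}: {total:.1f}W for {PORTS} phones ({verdict}), "
          f"budget covers {max_phones} phones")
```

By this rough count the budget would cover only 28 of the gigabit phones on switch power alone, which is why the rest need power bricks.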

So in order for me to make this work smoothly I am going to disable PoE on the ports so I don't have any mishaps in the future. I'd rather plug a device into the switch and have it not power up than plug a device into the switch and have it blow out said switch and everything that is plugged into it. So I will use the following:

(config-if)#power inline never

on the range of ports I want no PoE on. Better safe than sorry.

Monday, October 30, 2006

More on storage

Like I said earlier, storage is the single most challenging area for me. We are ordering another DAE (disk array enclosure) for our CLARiiON CX300 SAN. This is the last one we can add before we have to upgrade to the CX500. This new DAE will be about 3.8TB of usable storage. We are also getting another Apple Xserve RAID 7TB raw unit. Our archive has used up 5TB already. To keep up with the growth we need more archive space.

Thursday, October 26, 2006

Centralizing storage has allowed us to decommission 5 servers

Since we've had our SAN in place for over a year now we are finally able to get rid of a few servers. Consolidating file servers, mostly, is the biggest benefit. No more hanging on to these old servers, no more paying for care packs, no more worrying about hardware failures due to age, no more buying racks and UPSes like mad for space. I think I even saw the UPS meter lights drop a few bubbles. LOL.

Wednesday, October 25, 2006

Event ID: 55 continued: not what I expected

The server froze up again yesterday. That makes 3 times in the past week this has happened. I never ran the chkdsk /f last night. Instead one of my Admins called up HP to see if they had any say on the matter. It came down to our array controller card's firmware being outdated. So outdated that the HP array configuration utility wouldn't open up. And on top of that, so outdated or bugged that the array didn't show that we had 2 failed drives. Yes, these failed drives were causing the event ID 55 in the OS. The firmware was updated and newer diagnostic utilities were loaded on the server. Running these new tools we can see the failed drives. HP is sending up 2 replacements this morning.

I must say that of all the servers I've administered, old and new, I have never seen this problem before: drives failing and not blinking red inside the array. This server is not that old either, ~3 years old. Still pretty beefy. This is why care packs are so important. So for all you cheap companies out there that like to cut corners: budget for server warranties and service contracts and stop blaming your Admins for not being able to fix things fast enough.

Tuesday, October 24, 2006

Event ID: 55

So I'm getting this error in my system event log. No big deal, except this is on a 1.7TB volume that is in production. Not sure how long it will take to run chkdsk /f on this volume. This utility forces the volume to dismount, kicking everyone off. Around here these people don't like it when I have to take the server offline, and I'm talking about after-hours maintenance. I'll update when I run this util with how long it took.

Event Type: Error
Event Source: Ntfs
Event Category: Disk
Event ID: 55
Date: 10/24/2006
Time: 9:31:15 AM
User: N/A
Computer: **********
Description:
The file system structure on the disk is corrupt and unusable. Please run the chkdsk utility on the volume New Volume.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 00 00 04 00 02 00 52 00 ......R.
0008: 02 00 00 00 37 00 04 c0 ....7..À
0010: 00 00 00 00 02 01 00 c0 .......À
0018: 00 00 00 00 00 00 00 00 ........
0020: 00 00 00 00 00 00 00 00 ........
0028: c2 00 22 00 Â.".

Thursday, October 19, 2006

phone calls

I get all kinds of phone calls. These days I don't even answer my phone anymore. My wife says to me, "I tried calling you but you must have been busy." I said, "Yeah, I was busy screening my calls." LOL! Anyhow, I've had to blow off a few people for a long time until one day I just have to deal with them. One guy called me at the beginning of the year. I told him to call me back b/c I didn't feel like getting into whatever it was he was trying to sell me. He calls back a few weeks later asking if I remember him. I say no (of course I did), so he goes on to tell me who he is, what he has and how I can benefit from it. I say not interested right now and I have to go. He asks if he can try back at a later time and gives a specific date. Out of anger and frustration b/c I'm busy I say yeah, try back then, just to get him off my phone. Sure enough this same damn guy calls back on the date LOL. I tell him to call back again and I tell him I'm going on leave for a month so I won't be around. Anyway, when I got back from my leave (newborn baby girl :D) I think I had a change of heart. This guy calls back so I talk to him for about 40 mins and I explain to him how we do things here and how his product is just not right for us. I even voluntarily give him more information about what I am looking for in regards to the issues I am currently facing. He then recommended some expensive products to me. We end the call and I was sure that was the end of him. A few weeks later this same guy calls one of my Admins singing the same damn song O_o!

Old Story***Exchange 5.5 migration gone wrong

A couple of years ago when we were in the midst of migrating from Exchange 5.5 to Exchange 2003 we had planned the process for weeks. A consultant that I've used for a long time and I set a date for a Saturday morning. Early morning to get a head start. We had both servers ready. New and old (I mean real old).

We started the day by installing Server 2003 on the new box as well as Exchange 2003. We added the server to the Organization and both servers were talking. We ran tests and checks the whole way and everything was looking fine. We went back to the 5.5 server to load up the tools to do the mailbox move. With NT 4.0 and Exchange 5.5 everything needed a reboot. We proceed to reboot the server. It shuts down and comes up in a BSOD. Blue Screen of Death.....WTF. We do a hard restart and cross our fingers. Same thing when it restarted: BSOD. We put a call into Microsoft and after an hour on the call it boils down to reinstalling the OS. So we do that and Exchange is pretty much F@&%ed.

We were able to see the files that represent both stores. We copied the stores over to another server as a nearline backup just in case, but at that time both stores were about 130GB in total. That took a few hours to copy. Once that was done we reinstalled Exchange 5.5 and the service packs. Once this was done we had to copy the data back to the Exchange 5.5 server. So that took another few hours. It's going from evening to night now. The files are done copying but we have to run ISinteg and other tools to defrag the database stores. We locate another area on the network to defrag the stores to. Now this process takes a few more hours. Once the defrag process is done we had to remove the old database stores and copy the newly defragged ones back to the server. This was going to take a few more hours to copy.

During this nighttime copying we decide to attempt to sleep. There is no place comfy to sleep in the office so I grab some old boxes and lay them out on the floor. One of my coworkers (female) had a sweater on her chair that she leaves at work for when the office gets cold. I used that to keep warm that night. Shhh, don't tell her. So I get a nap on the floor and Sunday morning rolls around.

I have now been here a full 24 hours. The files are done copying. We attempt to start the Exchange 5.5 services and they come up. So we are now back to square one, where we were 24 hours before. We install the tools again and cross our fingers that the server does not BSOD again. The server reboots and all is fine. Now we are using the mailbox move tool to move hundreds of mailboxes over to the new server. This too takes time so we wait and watch them move over one by one (in groups of 10). It's midday on that Sunday and we are about halfway done with the mailboxes. We still have the 80GB worth of public folders to do after this. We ended up finishing all mailboxes sometime in mid afternoon. We run tests to see if emails are working on the new server and they are.

Half the battle done, now for the public folders. Getting these over is different from the mailbox move wizard. I can't remember what processes were available at the time but I ended up exporting my entire public folder tree to a file and importing it into my new Exchange 2003 server. This caused problems b/c not all of the data was exported/imported. My public folders run deep and have many subfolders. Some of the subfolders either didn't show up on the new server or their contents were missing. So I had to individually export and import specific folders. All of this importing and exporting took the rest of Sunday night into Monday morning. I was done for the most part by Monday 9am.

I still didn't go home. As users began to log in I stuck around to make sure everyone was OK with Outlook and accessing anything on the new Exchange 2003 server. I got a few calls about the data in the public folders not being there but that was an easy fix. Export, import. I ended up going home about 4pm Monday afternoon and still went in on Tuesday at the regular time. The missing data calls kept coming in months after the migration was complete, but I kept the old Exchange 5.5 server online and unplugged from the network just in case. To this day I still have that old server sitting there and I am now ready to throw it out for good. If calls about missing data come up we tell them that data has been removed for good.

I was at work for about 52 hours straight that time. No shower, no shave, no teeth brushing (ginger ale :D), no deodorant, no change of clothes and I hardly ate. I was a mess but I got the job done and that's all that mattered to me. Yes, I have been through the trenches, and that wasn't the last time I laid my head on the floor at work either, but I'll leave that story for another time.

Monday, October 16, 2006

8 hour tech support call

About a week and a half ago, on a Thursday, everything was well for the most part. One of my Admins says he was getting errors in Exchange System Manager when trying to click on objects. The first thing I thought was hmm... how long has that server been up and running? You know, the usual thought when something goes wrong in a Windows environment. So I check my Outlook to see if emails were flowing and it was fine. It was already after 9am so rebooting the server to clean up errors wasn't an option at that point. I make a note to myself to reboot the server in the morning. A few hours pass and I get a call from our overseas office asking if we are having issues with email. I log into Exchange and the server seems fine. No errors on the screen, task manager clear, no memory leaks and nothing frozen. As soon as I get off the phone I get calls from users saying they are getting email bounce backs. I then know something is wrong. I run to the server to see what is going on again. NOTHING! All looks fine from the typical admin perspective. So to not waste time I reboot the server. Once I did that NONE of the Exchange services started. I tried to start them manually and no joy. So I knew something was wrong. I had done nothing in Exchange for weeks, so how could something just happen all of a sudden? I call back the overseas office to tell them yes, we are having a problem, and ask how they noticed it. They say they have a consultant in installing Intellisync software on their end and that they did nothing to anything else. So I called Microsoft. Hours roll by after explaining the problem. These seemed to be textbook guys who went by the book. For little errors in AD these guys wanted to stray from the matter at hand and resolve them first. We did have a DC error but it was a rogue DC acting up, NOT affecting AD itself. These guys spent a few hours on that alone and I had to basically tell them we are going to demote this DC, get it offline and continue with the matter at hand. I had only been on the phone for 6 hours at that point.

To make a long story short: the permissions for the Exchange site name, and anywhere that inherited these permissions, were stripped somehow. We had to add them back one by one, and even when we finished doing this the services still would not start. The phone call was already at 7.5 hours and at this time the MS tech decided to say let's reinstall Exchange. I'm like WHAT! He says it again, but we will select reinstall from the drop-down box. Basically we did reinstall Exchange right on top of the current install, wiping out the service pack as well. After the install we went ahead and reinstalled the service pack too. After this we attempted to restart the services and they came up. I rebooted the server to make sure the services started by themselves and they did.

After 8 hours the problem was finally resolved. Still no one knows why these permissions were stripped out but I have my suspicions.

Tuesday, October 03, 2006

1.7 TB took 4 days to copy

So it took 4 days to copy 1.7TB from my file server to my EMC SAN via HBAs at a 2Gb data rate. 4 days DAYMN!!!!! It would have been less if the script didn't get stuck on files that didn't have Admin rights. I would have posted the copy stats but the script is set up to sort of loop itself and start over if anything changes. No, I didn't set it to write to a log. I'm guessing the log file would have been in the hundreds of MB range and I can't chance a log file crashing my server.
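
To put the 4 days in perspective, here is a rough average-throughput calculation in Python. It assumes decimal terabytes and reads the HBA data rate as a 2 Gb/s (gigabit) link with no allowance for protocol overhead; both are my assumptions, not measured figures.

```python
TB = 10**12  # decimal terabyte (assumption; the post does not say which unit)

def average_rate_mb_s(total_bytes, days):
    """Average copy rate in MB/s over the whole run."""
    seconds = days * 24 * 60 * 60
    return total_bytes / seconds / 10**6

rate = average_rate_mb_s(1.7 * TB, 4)

# Raw line rate of a 2 Gb/s link in MB/s, ignoring encoding and protocol overhead.
link_mb_s = 2e9 / 8 / 1e6

print(f"average copy rate: {rate:.1f} MB/s")
print(f"share of the raw link rate: {rate / link_mb_s:.1%}")
```

The average works out to only a small fraction of what the link could carry, so the link itself was probably not the bottleneck. The slow destination disks, the sheer number of files and the retry stalls are the more likely suspects.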

Friday, September 29, 2006

File copying takes too long

One of my production volumes has been running desperately low on disk space. This is happening far too often so I've decided to just go ahead and use some space on the SAN to hold this large amount of data. The space I'm using is space that I am not a real huge fan of b/c it consists of Parallel ATA drives running at 5400rpm. Yes, these are not even SATA drives. Anyway, it's 2.5TB of unused space that I am copying the data to right now.

The data is coming from internal SCSI disks on the server that are 10Krpm and total 1.7TB of space. There is only 30GB available. So as you can see I gotta do what I gotta do. I started the data copy on Tuesday at 1pm EST. It is now, as you can see from the time of this post, Friday 9am EST and it's copied 1TB so far. The server is connected to the SAN via HBAs set to a 2Gb data rate. Pretty fast, but the copy could have been far ahead if certain users didn't take ownership of a bunch of files, stripping domain Administrator from the permissions. I get in on Thursday morning to find the script hanging for 30 secs (the retry time) on every file in a specific folder. So I then have to manually change them and re-add domain Admin to the permissions. Who knows how long the script was stuck on these files. At that point 400GB had been copied. This morning it's 1TB, so I'd say that holdup cost me 200GB of data not being copied. Anyway I have 700GB to go. So I'd say some time Saturday it will be done.
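
The "some time Saturday" guess can be sanity-checked with a small Python sketch. It uses the figures above (400GB copied by Thursday morning, 1TB by Friday 9am, 700GB to go) and assumes the Thursday check was also around 9am, so about 24 hours passed between the two readings. That timing and the function name are my assumptions.

```python
from datetime import datetime, timedelta

def estimate_finish(now, remaining_gb, copied_gb, over_hours):
    """Project a finish time from the most recent observed copy rate."""
    rate_gb_per_hour = copied_gb / over_hours
    return now + timedelta(hours=remaining_gb / rate_gb_per_hour)

now = datetime(2006, 9, 29, 9, 0)  # Friday 9am, the time of this post

# 400GB on Thursday morning to 1000GB on Friday morning, assumed 24 hours apart.
eta = estimate_finish(now, remaining_gb=700, copied_gb=1000 - 400, over_hours=24)

print("estimated finish:", eta.strftime("%Y-%m-%d %H:%M"))
```

At that recent rate the copy would wrap up early Saturday afternoon, provided nothing else stalls the script.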

But wait! The script is set to check for changes at the source and run again. So it should update the destination. I'll have a test folder set at the very end of the source and modify the file inside to see if the destination gets the update. I am using Robocopy version XP010 BTW.

Well, couldn't I just restore from tape? Yeah, but I need to run my backups this weekend and don't want to tie up my tape library for a few days restoring data. Plus the restore would be from tapes that are a week old and I'd have to run some kind of sync software anyway. I am therefore potentially killing two birds with one stone doing it this way.

Tuesday, September 05, 2006

Exchange mailbox management

So this month we are going to start automatically deleting emails older than X days from our Exchange server using Exchange 2003's built-in Mailbox Manager. From my test it seems to work well in report mode. It generates a nice report that is sent to the Administrator's mailbox (or any mailbox you'd like), shown below. This is the general summary: the total of all the messages that would be deleted and the amount of space I would potentially get back. I ran the test on a portion of my userbase.

The email sent to the Admin mailbox;

The Microsoft Exchange Server Mailbox Manager has completed processing mailboxes
Started at: 2006-09-05 09:25:48
Stopped at: 2006-09-05 09:35:49
Mailboxes processed: 113
Messages that would be moved or deleted: 110803
Size of messages that would be moved or deleted: 12381.03 MB

Here is what is inside the report;

Cleaning Mailbox user@company.com
Recipient Policy: purge test
Folder Deleted Items Contents: 126 Items (8.37 MB)
Folder Deleted Items Contents: 126 Items (8.37 MB)
Folder Deleted Items Done: 0 Items Processed, 0 Items Would Be Moved or Deleted (0 (null))
Folder / Contents: 0 Items (1.00 KB)
Folder / Contents: 0 Items (1.00 KB)
Folder / Done: 0 Items Processed, 0 Items Would Be Moved or Deleted (0 (null))
Folder /Inbox Contents: 433 Items (328.14 MB)
Folder /Inbox Contents: 433 Items (328.14 MB)
Folder /Inbox Done: 0 Items Processed, 295 Items Would Be Moved or Deleted (77093 (null))
Folder /Outbox Contents: 0 Items (0.00 KB)
Folder /Outbox Contents: 0 Items (0.00 KB)
Folder /Outbox Done: 0 Items Processed, 0 Items Would Be Moved or Deleted (0 (null))
Folder /Sent Items Contents: 18 Items (1.37 MB)
Folder /Sent Items Contents: 18 Items (1.37 MB)
Folder /Sent Items Done: 0 Items Processed, 0 Items Would Be Moved or Deleted (0 (null))
Mailbox user@company.com Contents (before processing): 577 Items (337.88 MB)
Mailbox user@company.com Done: 5 Folders Processed, 295 Items Would Be Moved Or Deleted (77093 (null))
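
If you want to total up these reports instead of eyeballing them, the "Done" lines are regular enough to parse. The Python below is only a sketch based on the sample report above; the regular expression and function name are my own, and real reports may contain line shapes this does not handle.

```python
import re

# Matches per-folder summary lines such as:
# "Folder /Inbox Done: 0 Items Processed, 295 Items Would Be Moved or Deleted (77093 (null))"
DONE_LINE = re.compile(
    r"^Folder\s+(?P<folder>.+?)\s+Done:\s+(?P<processed>\d+) Items Processed,\s+"
    r"(?P<flagged>\d+) Items Would Be Moved or Deleted"
)

def flagged_per_folder(report_text):
    """Map each folder name to the number of items the policy would move or delete."""
    counts = {}
    for line in report_text.splitlines():
        match = DONE_LINE.match(line.strip())
        if match:
            counts[match.group("folder")] = int(match.group("flagged"))
    return counts

# A few lines copied from the report above.
sample = """\
Folder Deleted Items Done: 0 Items Processed, 0 Items Would Be Moved or Deleted (0 (null))
Folder /Inbox Contents: 433 Items (328.14 MB)
Folder /Inbox Done: 0 Items Processed, 295 Items Would Be Moved or Deleted (77093 (null))
Folder /Sent Items Done: 0 Items Processed, 0 Items Would Be Moved or Deleted (0 (null))
"""

counts = flagged_per_folder(sample)
print(counts)
print("total flagged:", sum(counts.values()))
```

The "Contents" lines are skipped on purpose; only the per-folder "Done" lines carry the would-be-deleted counts.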

We are doing this b/c for the life of me I don't know why people still won't clean up their mailboxes. Even after countless email notifications, how-to instructions on our intranet and even mailbox limits, the users still have 200+MB mailboxes. So we are going to automate this process.

When I was originally looking to set this up I went to Microsoft's website and didn't find a thing. I ended up at MSExchange.org; they have a very nice write-up on how to do this. What I managed to find on Microsoft's site is how to exclude users from this process.

Thursday, August 31, 2006

New Version of Blogger

Seems like there is a new version of blogger. I'm going to port over and see how it goes. First let me copy all my configs ;-)

Cisco port analyzer SPAN feature

If you are on a Cisco network and need to monitor or filter network traffic you'll need to set up SPAN on one of your switches. Here is Cisco's Configuring the Catalyst Switched Port Analyzer (SPAN) Feature page.

In a nutshell, what is happening is that you are copying traffic from one port (source) to another (destination) for monitoring. So let's say you have a firewall on port 0/1 and you want to capture and filter all the web traffic. What you would do is plug your monitor into another port, say port 0/24. Now you need to copy all traffic from port 0/1 to 0/24. To do so you have to set up a SPAN or monitor session.

#config t
(config)#monitor session 1 source interface fastethernet 0/1 both
(config)#monitor session 1 destination interface fastethernet 0/24
(config)#end

The both at the end of the source command means that traffic in both directions, rx and tx, is copied from the source port to the destination port.

Enjoy!

Web filtering

Another one of my tasks is to make sure web filtering is in place. Can't have users going to adult sites on the job or doing other non-work-related activities. The product we use is Surf Control Web Filter. When I started here over 5 years ago this is what was in place, but it was installed on the Windows NT4.0 firewall (yes, an NT4.0 firewall) at the time. It was supposed to go hand in hand with Checkpoint Firewall for NT. It worked OK. But as many people know, NT4.0 was a system hog in itself, and putting Checkpoint on it and Surf Control on top of that put a strain on the server. So after a while of dealing with that crappy box and all the problems I had with it, the firewall crashed and I had to get a non-Windows firewall. Anyway, I've installed Surf Control on its own server and got it filtering the web traffic.

The installation went OK. You need either a SQL server to talk to or MSDE on the box itself. I went the MSDE route. I want this box to depend on itself only. At the time, since our switches didn't allow rx/tx on the SPAN port (or maybe I just didn't spend enough time trying to figure that out), we used a hub since that allowed writeback. What am I talking about? In order for Surf Control to effectively block sites it has to capture packets, determine their nature and either let them go or block them, sending a message to the user's screen. This is what I am talking about when I say writeback or rx and tx. Since our network upgrade I was able to toss out that hub and properly configure my new Cisco 4506 to SPAN with rx and tx. So now Surf Control is blocking sites on a switch like it should.

The product itself has nice features: realtime logging, categorization of sites, reporting of user usage, most visited sites in a given time, etc. It also integrates with Active Directory (now). But the product is a little flaky and buggy. It takes some time to figure out and you'll find yourself on their knowledge base very often. I guess I've built up a tolerance for its bugginess and just cope with it. After all, I do know how to get it to work.

Monday, August 28, 2006

Copying LARGE amounts of data

In the industry I work in we have LARGE files and HUGE folders to copy from one location to the next. I'm talking about hundreds of gigs, whether it be from production to production space or production to archive. Being in a Windows environment, we want to be sure that the file copy doesn't bomb out (good old copy, paste). I use Robocopy to handle all my copying needs. Robocopy is an old tool, part of the Microsoft Resource Kit.

This site (http://www.ss64.com/nt/robocopy.html) has beginner information about robocopy. Once I started using robocopy years ago I never stopped. You can also use Xcopy to achieve the same results.
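
For anyone who has not used it, here is a rough idea of what a Robocopy job for this kind of copy might look like, written as a small Python helper that only assembles the command line. The switches are standard Robocopy options (/E, /Z, /R, /W, /MOT, /XD, /LOG); the paths, defaults and function name are made up for illustration, and this is not the exact script I run.

```python
def robocopy_command(source, dest, retries=3, wait_secs=30,
                     rerun_minutes=None, exclude_dirs=(), log_file=None):
    """Build a robocopy argument list (it is not executed here)."""
    cmd = [
        "robocopy", source, dest,
        "/E",                # copy subdirectories, including empty ones
        "/Z",                # restartable mode, so an interrupted file can resume
        f"/R:{retries}",     # number of retries on a failed copy
        f"/W:{wait_secs}",   # seconds to wait between retries
    ]
    if rerun_minutes is not None:
        # monitor the source and run again after this many minutes if it changed
        cmd.append(f"/MOT:{rerun_minutes}")
    if exclude_dirs:
        cmd.append("/XD")    # skip the listed directories
        cmd.extend(exclude_dirs)
    if log_file:
        cmd.append(f"/LOG:{log_file}")
    return cmd

# Hypothetical paths, for illustration only.
cmd = robocopy_command(r"D:\data", r"S:\data", rerun_minutes=5,
                       exclude_dirs=[r"D:\data\bad"], log_file=r"C:\copy.log")
print(" ".join(cmd))
```

On a Windows box the list could be handed to subprocess.run. Keeping the retry count low matters, because the default retry count is huge and one locked file can hold up the whole job.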

The power of the SAN

One of our departments needed space last week. Lots of it and out of nowhere too. So we thought about a strategy for about 15 mins b/c we really didn't want to do what took us only 5 mins to think of. Anyway, it came back to putting their data on the SAN. Luckily we have SATA drives to use, so we created a 600GB LUN on the SAN, put it in a storage group with its host and voilà!!! 600GB available to them just like that. It's easy for us to do all this storage shuffling now b/c we have our infrastructure in place: fiber channel switches, HBAs in each server, an EMC SAN and lots of drive space ;)

Need help building out your infrastructure for a SAN solution? Hit me up :D

Login scripts

To manage who gets what and how on the workstations we use login scripts. In our Windows 2003 AD environment you can use Group Policy, but we have been using KiXtart and login.bat ever since NT4.0. It works that damn good. In my example my users are in groups ABC & XYZ and will get mapped drives from certain file servers and printers from certain print servers.

Install the KiXtart files to the sysvol directory of your domain controller. By now they should have a template for you to follow (not sure, I haven't upgraded it in years). Anyway, in AD, under the user account's Profile tab, in the section for logon script put login.bat

In the sysvol folder of your domain controller create a text file and add this entry to it
@echo off
%0\..\Kix32.exe kick.scr

then save it as login.bat. When the user logs in they will be calling this file. This file will then execute KiXtart and call kick.scr

Kick.scr is the script that does all the mapping based on where the user lies in AD. Here is a simple version of the Kick.scr script that I use. I have most things commented out (; = comment at the beginning of each line)

;*****ABC Group*********************************************

If Ingroup ("ABC")

;Deploy intranet page to IE This will make their IE default to the company intranet page all the time.

writevalue("HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main","Start Page","http://intranet_address_here","REG_SZ")


;Map Network Drives
;Example this will give all users in group ABC a network map for ABCresources on 'fileserver01' as a (Z) drive
use z: "\\fileserver01\ABCresources"



; use i: "\\servername\share"
; use j: "\\servername\share"
; use k: "\\servername\share"
; use l: "\\servername\share"
; use m: "\\servername\share"
; use n: "\\servername\share"
; use o: "\\servername\share"
; use p: "\\servername\share"
; use q: "\\servername\share"
; use r: "\\servername\share"



;Map to Network Printers

;Example this will give all users in group ABC a network printer called ABCgroup_color_printer from printserver1

addprinterconnection ("\\printserver1\ABCgroup_color_printer")



; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")


;to delete a printer connection
;Example to delete the ABCgroup_color_printer from the group
delkey ("HKEY_CURRENT_USER\Printers\Connections\,,printserver1,ABCgroup_color_printer")


; delkey ("HKEY_CURRENT_USER\Printers\Connections\,,print_server_name,printer_name")

; delkey ("HKEY_CURRENT_USER\Printers\Connections\,,print_server_name,printer_name")



EndIf
;*******************************************************************

;*************XYZ Group*********************************************

If Ingroup ("XYZ")

;Deploy intranet page to IE

writevalue("HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main","Start Page","http://intranet_address_here","REG_SZ")


;Map Network Drives

; use i: "\\servername\share"
; use j: "\\servername\share"
; use k: "\\servername\share"
; use l: "\\servername\share"
; use m: "\\servername\share"
; use n: "\\servername\share"
; use o: "\\servername\share"
; use p: "\\servername\share"
; use q: "\\servername\share"
; use r: "\\servername\share"



;Map to Network Printers



; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")
; addprinterconnection ("\\print_server_name\printer_name")

;to delete a printer connection

; delkey ("HKEY_CURRENT_USER\Printers\Connections\,,print_server_name,printer_name")

; delkey ("HKEY_CURRENT_USER\Printers\Connections\,,print_server_name,printer_name")

; delkey ("HKEY_CURRENT_USER\Printers\Connections\,,print_server_name,printer_name")



EndIf
;*******************************************************************



So in this sample script I have two groups, ABC & XYZ, which represent the different groups in AD. This is one way of using the script. There are many ways to get the job done.

Monday, August 21, 2006

Server room over heat

Over the weekend the AC unit in our server room went out. It happened sometime late Friday night. The room is so small and full of other crap that it heated up in a matter of minutes. As a result the servers started shutting themselves down.

We have a dedicated AC unit like any other server room would have. The problem is that the unit is connected to our building's central fire command unit. Something went wrong with the fire command unit last week, as there were tests and the HVAC repair guys were in most of the week fixing other related issues. Over the weekend the fire command unit cut the power to the AC units in the building. This caused what could have been a nasty chain reaction, resulting in me and other coworkers racing in on Saturday morning.

How did we know something was wrong? The internet connection to the office went out. The firewall shut down, so no email notifications could go out. We found out b/c one of us was trying to work from home on Saturday morning and noticed the VPN down, OWA down, FTP down. So I had a pretty interesting weekend. Now to get to the bottom of WTF the building is up to with their fire command unit. In the past 2 weeks the dedicated AC unit for our server room went out 3 times, and the funny thing is it's not even all that hot here in NY. I could understand if this was the week we had a heatwave but it's not. It's been in the low 80's while all of this has been happening. So now I am looking into a Sensaphone 1400. Someone wants to charge us $4800 for it. I know nothing about this device but from the description it looks pretty good.

Here is what we are doing to ensure the room stays cool right now LOL!!!
cool _room
They all will be removed when we are sure the building has their act together.

Cisco IPT installation update

So what have I been up to?

Well, we rolled out the Cisco phone system. The system got configured and is up and running without any problems. We are still using our old PBX and only a handful of users are on the new system now. I am trying to make this as smooth as possible but have run into one snag.

On the old system we use 4 digit extensions and on the Cisco system we use 4 digit extensions too. So what's the issue? Well, the T1 connection between the old and new systems requires anyone on the old PBX to dial an access code to reach anyone on the Cisco system. So if I am on the old system I have to dial an access number, say (6), then the 4 digit extension. This is going to be a huge hassle. We are going to have to tell all the users to enter an extra digit to speak to the users on the new system. But if you are on the new system you don't have to dial that extra digit. All kinds of confusion can happen.

So why can't we set up the systems to hide a digit so that no matter what system a user is on they only dial 4 digits? Problem is that our PBX was never upgraded throughout the years and we do not have an option for coordinated dial plan (CDP). CDP would allow the PBX to hide or insert a digit right after the access number is pressed. So if the access number is (6) and the extension on the new system is 7560, the PBX would be smart enough to insert the (7) at the beginning of the extension right after you press the access code number of (6). This also brings up other issues down the road as well.

Why can't we do this from the Cisco end then? Well, we can, but the PBX will only issue 4 digits. It will only allow us to dial 4 digits once there is a dial tone. If the PBX allowed us to issue only 3 digits we could easily have the Cisco system add a digit. How do I know??? We already got this to work. WHAT? Yes, we got this to work, then the very next day we come in and it's not working. WTF!!!!!!!! We call in consultants and a PBX guy. The PBX guys are saying we have to spend 30K to upgrade the system just to allow coordinated dial plans (CDP). Screw them. We're not going to spend 30K when we will only keep the PBX for 6 months until we get everyone on the Cisco system.

Monday, August 14, 2006

What once was

Well since sloppy wiring seems to be flying around the net as of late I'll proudly display what my core rack USED to look like ;)

This is what happens when you have to scramble for space for a parallel network installation.
Can anyone spot the hanging switch???
How about the cliff jumping Linksys???
What about the DMZ that doubles as a perch???
wiring

Thursday, August 03, 2006

The light

Just gimmie the light and pass the...

To the naked eye the light from a connected fiber cable is red but to the camera's eye it's white.

Here is the light from the gig port once an LC connector SX transceiver is installed. (these little bastards can get expensive)
the fiber light1

the fiber light2

Here is the LC/SC fiber cable. This is the LC side.
the fiber light3

cable and port

You'll notice that only one side is lit. That's b/c one side sends and one side receives data.

Monday, July 31, 2006

A peek inside the DHCP server

Here is what the DHCP server looks like for a network with VLANs on every floor and a VOIP system.

DHCP

From the pic you can see all the scopes created. What is happening here, as I mentioned before, is that each floor is its own VLAN and each VLAN gets its IPs from the scope that it's associated with. Remember that IP helper address I told you about (10.100.1.22)? Well, this is the server.

You will also notice more than half my scopes are inactive. This is b/c the 12th floor isn't online as of yet. Also the other scopes are for the IP phones for each floor too. Those aren't rolled out yet either. Yes the phones need a scope too. They will have IP's as well.

For the ones that don't know, the IP phone plugs into a standard RJ45 jack (the network jack) and your PC plugs into the phone. The IP phone is also a switch.

Bottom of a Cisco IP phone
10/100 SW goes into your network jack
10/100 PC connects to your PC
The AC power port you do not need if you have power over ethernet (PoE). Your access switches will power your IP phones.
bottom of 7941

Friday, July 28, 2006

Fiber cables and IPT stuff

I'm using three types of connections in my network: SC/SC, LC/SC and LC/LC.

SC/SC
LC/LC
The SC/SC cables go from the fiber panel to the core switch. They connect to one of the ports on the gig module on the 4506. This would be your backbone cable in the data center (depending on setup).

LC/SC
LC/SC
The LC/SC cable goes from the fiber patch panel to an access switch, or from a core switch to an access switch, or from an old access switch to a newer model (3500XL to 3560, for example).

LC/LC
LC/LC
The LC/LC cables connect two access switches together via the gig ports (depending on setup).


Here is my fiber panel in my data center. These go up to all my floors.
central fiber patch in data center
Each floor has six pairs, or 12 strands. Whoever installed this before my time did an excellent job of leaving room for growth. If we had to pull new fiber it would have cost us tens of thousands of dollars.

Fiber patch panel 3
A fiber panel on my 15th floor. We are only using two pair. (one is unplugged)

Fiber patch panel 2
(someone forgot to dust)

PBX and Cisco
lower part of PBX

Cisco on the cart next to PBX
mobile VoIP on a rack
Call manager on top
3560 switch below
2811 gateway router
2851 gateway router

Red cable is the T1
The red cable represents the T1 link. This is the line that will connect the old PBX to the new Cisco system.

new rack for VoIP stuff
This rack will house the CM, the Unity Connection server and the gateway, along with a few UPS's. We will also get another CM in a few months when we get more phones.

Monday, July 24, 2006

This week's agenda

-Call manager install. I'll blog that for sure.
-VLANs configured for QoS (may or may not happen this week)
-Nortel option 61 T1 card being installed in our PBX (this is to connect to the Cisco system since we are doing the hybrid approach). It's a T1 between both systems to be exact. Also the card for the PBX will require downtime. Damn it's a really old PBX system too...
-My gig E modules need to be installed but I was told this may require downtime. So a 7am installation is definitely on the agenda.
-New floor data cabling and some power install. Two separate vendors I have to coordinate with for this too. They are union guys too so they tend to drag jobs out.
-Very busy week.

Friday, July 21, 2006

My current VLAN setup

With our dual core setup we are fully redundant. All access layer switches (the ones on the floors, usually in pairs) have a connection to each core switch. Physically this works by running one fiber cable from a gig port on the first access switch all the way down to one core switch, and another from a gig port on the second access switch to the other core in our data center. Then we connect one of the gig ports on the first access switch to a port on the second access switch. The two cores are also connected to each other. Picture all four switches in a circle holding hands.
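The "circle holding hands" idea can be checked with a few lines of Python. This is only a sketch of the topology as described: the switch names (access1, access2, coreA, coreB) are made up for the example, and it models plain connectivity, not spanning tree or HSRP behavior.

```python
# Hypothetical names for the four switches in the ring described above.
SWITCHES = {"access1", "access2", "coreA", "coreB"}
LINKS = {
    frozenset({"access1", "coreA"}),    # fiber run from first access switch to one core
    frozenset({"access2", "coreB"}),    # fiber run from second access switch to the other core
    frozenset({"access1", "access2"}),  # gig link between the access pair
    frozenset({"coreA", "coreB"}),      # link between the two cores
}


def is_connected(nodes, links):
    """Return True if every switch in `nodes` can reach every other one."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for link in links:
            if current in link:
                (other,) = link - {current}
                if other in nodes and other not in seen:
                    seen.add(other)
                    stack.append(other)
    return seen == nodes


def survives_any_single_link_failure(nodes, links):
    """Pull each cable in turn and check the rest still form one network."""
    return all(is_connected(nodes, links - {failed}) for failed in links)


def survives_any_single_switch_failure(nodes, links):
    """Power off each switch in turn and check the survivors can still talk."""
    for failed in nodes:
        remaining = set(nodes) - {failed}
        live_links = {link for link in links if failed not in link}
        if not is_connected(remaining, live_links):
            return False
    return True


if __name__ == "__main__":
    print(survives_any_single_link_failure(SWITCHES, LINKS))
    print(survives_any_single_switch_failure(SWITCHES, LINKS))
```

A ring survives any one cable or any one switch going away, but not two cables on opposite sides of the circle, which is why both uplinks matter.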

3560's in action

In the core we have our VLANs. As I explained in a past post each floor is a VLAN. The only way to make this physical redundancy work is to set it up virtually in the core. We have two core switches, A & B. One is the root (the active router, in HSRP terms) and the other is the standby. If the root core fails for whatever reason the standby is there. The setup on the core for redundant VLANs looks like this.

switch A

interface Vlan30
 description CLIENT_FLOOR_30 VLAN
 ip address 10.100.30.2 255.255.255.0
 ip helper-address 10.100.1.22
 ip helper-address 10.100.1.78
 no ip redirects
 no ip unreachables
 no ip proxy-arp
 standby 30 ip 10.100.30.1
 standby 30 priority 105
 standby 30 preempt



switch B

interface Vlan30
 description CLIENT_FLOOR_30 VLAN
 ip address 10.100.30.3 255.255.255.0
 ip helper-address 10.100.1.22
 ip helper-address 10.100.1.78
 no ip redirects
 no ip unreachables
 no ip proxy-arp
 standby 30 ip 10.100.30.1

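This will only work if you have VTP (VLAN Trunking Protocol) set up so that both cores know about the VLAN, which should be obvious if you have followed this far.

What is happening here is that switch A is the root and switch B is the standby. This is defined by the priority of 105 on switch A: switch B has no priority line, so it runs at the HSRP default of 100, and the higher number wins. The preempt line lets switch A take the role back once it returns from a failure.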

Anywhere you see 30 represents the floor, so this VLAN would belong to the 30th floor.

The gateway of the clients on this VLAN is 10.100.30.1. Now 10.100.30.1 is on both switches and it is the HSRP (Hot Standby Router Protocol) address. So A has a real address of 10.100.30.2 and B has a real address of 10.100.30.3; the virtual or HSRP address is 10.100.30.1 and is linked to both switches by the (standby 30 ip 10.100.30.1) command.

I'm just going over the main entries, so stuff like no ip (redirects, unreachables, proxy-arp) you can google.
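To make the priority logic concrete, here is a rough Python sketch of how the active router for an HSRP group gets picked: highest priority wins, and a tie goes to the higher interface IP. The function and the dictionary layout are made up for illustration, and it ignores timers and preemption delays entirely. The only values taken from the configs above are the addresses and the 105 priority; 100 is the IOS default when no priority line is present.

```python
from ipaddress import IPv4Address

DEFAULT_HSRP_PRIORITY = 100  # what a router uses when no "standby <group> priority" is set


def elect_active(routers):
    """Pick the active router: highest priority, then highest interface IP.

    `routers` is a list of dicts with "name", "ip" and an optional "priority".
    Only routers that are currently up should be passed in.
    """
    winner = max(
        routers,
        key=lambda r: (r.get("priority", DEFAULT_HSRP_PRIORITY), IPv4Address(r["ip"])),
    )
    return winner["name"]


# Values from the Vlan30 configs above.
SWITCH_A = {"name": "switch A", "ip": "10.100.30.2", "priority": 105}
SWITCH_B = {"name": "switch B", "ip": "10.100.30.3"}  # no priority line, so default 100

if __name__ == "__main__":
    print(elect_active([SWITCH_A, SWITCH_B]))  # both cores up
    print(elect_active([SWITCH_B]))            # switch A has failed
```

Either way the clients never notice, since they only ever point at the virtual address 10.100.30.1.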

For VLANs the ip helper-address is important b/c broadcasts do not cross VLANs (why would anyone want them to?). If you have a DHCP server that is in a SERVER VLAN, just setting up the client VLANs and leaving it at that will result in an entire network of workstations trying to find the DHCP server and not able to connect to anything. There are two simple ways to resolve this.
1. Set up a DHCP server on every VLAN. This would be the dumbest and most inefficient thing to do.
2. Add an ip helper-address statement (ip helper-address 10.100.1.22) to allow the client VLAN to find the server in the SERVER VLAN. 10.100.1.22 would be my DHCP server and this line would be in all my VLAN configs. Not only that, but a scope for every client VLAN will have to be created on your DHCP server. So the scope created for the network on the 30th floor would be set to give out IPs as such;
10.100.30.100-254 /24.
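Since every floor follows the same pattern, the scope values can be worked out from the floor number alone. A minimal sketch using Python's standard ipaddress module; the function name and the dictionary keys are made up for the example, and it assumes every client VLAN uses the same .100 to .254 range as the 30th floor with the .1 HSRP address as the gateway.

```python
from ipaddress import IPv4Address, IPv4Network


def floor_scope(floor, first_host=100, last_host=254):
    """Return the DHCP scope values for a floor VLAN in the 10.100.<floor>.0/24 scheme."""
    network = IPv4Network(f"10.100.{floor}.0/24")
    base = int(network.network_address)
    return {
        "network": str(network),
        "router": str(IPv4Address(base + 1)),  # the HSRP virtual address
        "range_start": str(IPv4Address(base + first_host)),
        "range_end": str(IPv4Address(base + last_host)),
    }


if __name__ == "__main__":
    for key, value in floor_scope(30).items():
        print(f"{key}: {value}")
```

The DHCP server picks the right scope because the relayed request arrives stamped with the address of the VLAN interface it came in on.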

You can also see that I have another ip helper-address there. That is for another server that uses broadcasts to communicate with the clients.

For the configs above if I had a network that spanned across a 30 floor building I would have a VLAN for every floor that would look the same way. I also have a management VLAN, Server VLAN, Voice VLAN, Video VLAN, Printer VLAN, Wireless VLAN etc...

You can see how complicated this can get and this is only VLANs we are dealing with here.

Wednesday, July 19, 2006

Cisco phones are in

As I am typing this up I don't even know the exact model number of them LOL. I'll update that when I get the pics of them up.
(update 7941)
cisco 7941

My desk area is in shambles. 100 phones in 20 boxes piled up by my desk.
VOIP hardware behind desk

Along with 2 GigE expansion blades for both 4506's a few trunk adapters (I think that's what the box said)
gig Eth module

2 routers and a server: voice gateway and call manager (Unity server not here yet). I didn't bother to check for details, been too busy with other things.
7800

Here are the pics.

Monday, July 10, 2006

What's next?

While I was out we got the go ahead to acquire another floor in the building. This floor will accommodate at least 65 users. We will have to core fiber from one of our other floors. This time we can't drop a patch cable out the window like we did a few years ago (true story).

This new floor will get all new cabling and a dual fiber run to our 4506's, PoE switches and Cisco IP phones. We are finally moving over into the IPT boat. It will be a Cisco hybrid and work with our existing Nortel PBX. We will be using a Cisco CallManager 4.0-PBX Interoperability. New toys and new headaches.

I've been doing a lot of Cisco stuff lately. Very fun stuff.

I am back from my month off.

And you thought 2 weeks or even a 3 day weekend went by fast. My month flew like the wind. Some would have tried to squeeze in a mini vacation and travel not me. I was home enjoying fatherhood.

Thursday, June 29, 2006

Taken much needed time off

I've taken a month off from work and all tech related things to raise my daughter. You can tell that I can't sleep by the time I am posting this. She has changed my sleeping pattern. I'm not sure how I am going to cope with that when I go back to work. We'll see!

Sunday, June 11, 2006

She's here.

At 10:52am my first born arrived. She is so beautiful. The life that I once knew has officially changed. I am a father.

Daddy loves you Maia.

Tuesday, June 06, 2006

Network Upgrade complete! Flawless execution!

On Saturday June 3rd we did our cut over to the new network. This was the true test and all went damn near perfectly. All in all it was still flawless. We had one issue with netbios and one of our CAD applications.

We used an IP helper address to route all those packets to the license server IP.
The application works like this: the client needs to find the server and check out a license. Once the client gets the license, the server seems to need to know whether the client still has it so it can give the next license to the next client. What was happening was that the server was giving out the same license to all clients. It seems the server was using NetBIOS to look up computer names. I was able to resolve this by adding DNS suffixes on the server, after which the license server kept its records updated and stopped thinking that clients were giving up licenses they were trying to keep. It's a bit confusing reading it here but if you see it in action it will make more sense.

Other than this legacy NetBIOS multi VLAN issue everything else was golden. VPN, email, web, all of our remote sites, databases etc. were all good. I think I mentioned before that we had to change subnets from /16 to /24, so this also meant that some of our servers needed an IP change as well. Those changes were made and everything was updated. I also had to create objects in my Check Point firewall for every VLAN. I was getting some IP spoofing in the logs and was like WTF! Come to find out the objects weren't created.

If you are attempting to do anything like this make sure you have a checklist and plan, plan, plan. So much for that project. My hat goes off to Dimension Data for their help and support. Thanks Joe!

cabled_rack

Wednesday, May 31, 2006

New Network diagram

It's all about documentation and diagrams so others can know what has been done. Here is my new network diagram. It looks so simple but it is very complex on the inside. The complexity can be seen in the documentation that you cannot and will not see ;)

New diagram
New_Network_diag

Old diagram
Network topology

Sunday, May 28, 2006

Apple Xserve RAID in a Windows 2003 server environment

We've faced a few storage crises in the past. The main one was the production data. We have also been faced with archive storage problems. What we used to do years ago, when a project (a huge root folder, gigs in size) was ready to be archived, was one of the following, in this order over time;

-make a tape backup and remove the files from the network
-burn a lot of CD's and remove the files from the network
-burn DVD's and remove the files from the network
-copy the project to a NAS device
-copy the project to cheap large IDE/SATA drives

That is a basic timeline of our archive process. This has become very inefficient. Even though everything was properly labeled and could be found it was still inefficient. Why? It could take a long time to search through the amount of single backup tapes, CD's, DVD's and archive locations on the network and NAS devices for a specific piece of data. Also the data was all over the place as you can see.

We resolved these problems by revisiting centralized storage again. An expensive SAN for dead storage is a waste of a lot of money (depending on your environment and your budget). A bunch of cheap storage is also a waste b/c you will be buying A BUNCH of devices, eliminating the centralization factor. Also when those devices get remodeled and decommissioned you are stuck with a bunch of devices of different storage sizes, shapes and styles. Our choice was Apple's Xserve RAID.

We decided on the Apple Xserve RAID after careful research. Our biggest question was, b/c it is an Apple product, will our Windows servers see the storage? Also, will our Windows servers see a volume over 2 TB? Windows used to have a 2 TB limit on its volumes. This was back in the early Windows 2000 server days. Then Microsoft came out with dynamic disks, where you can use 2 basic disks to create a dynamic disk larger than 2 TB. Read more about it here,

  • Reviewing Storage Limits

    After we were confident that our servers would be able to create a volume over 2 TB we placed the order for the 7 TB Xserve RAID. It cost us $13,000. Talk about cheap storage.

    The device is by no means a cheap POS though. It has 2 storage processors, 2 fiber connections, 2 power supplies and 14 hot swap Ultra ATA 500GB drives. This is a very nice piece of hardware; just like with all of Apple's hardware products, they did not slack on this one. As good as it looks, it was even easier to set up. It has 2 of everything, including network interfaces. These interfaces are set for DHCP so you just plug the device in and they are on your network. Apple's RAID Admin tool (it comes with OS X, but they also have a Java based one for Windows environments) finds the device on your network. You can then change the IP's of both interfaces to conform with your server or storage device IP addressing. Within RAID Admin you can manage the device like any other enterprise level storage device. We are not using the Xserve RAID as a true SAN so we are not managing drive space in that manner. We are using it as a NAS on steroids or a limited SAN. Why I say that is b/c we have connected the Xserve RAID to our fiber channel switch and zoned it out to its host server. Our Windows 2003 server detects the Xserve RAID as if it were an internal drive or a SAN drive.

    So far the Xserve RAID is working great. After creating the partition in Windows disk manager and waiting 28 hours for the drives to initialize (yes 28 hours) the device has been working flawlessly. Windows 2003 server sees 5.5 TB out of the 7 TB raw, so keep in mind that you will lose 1.5 TB to overhead. We have populated the device with 2.5 TB of archive data so far and we still have a lot of data left to add. Will we need another one soon? I hope so b/c these devices are very nice to have on your network. It just makes managing and centralizing storage so easy, cheap and simple.
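Where the missing 1.5 TB goes can be roughed out with some arithmetic. This is a sketch under assumptions the post does not state: that the 14 drives are split into two 7-drive RAID 5 sets (one per storage processor), that each 500GB drive holds 500 x 10^9 bytes, and that Windows reports sizes in binary units (2^40 bytes per "TB"). Under those assumptions the numbers land close to the 5.5 TB figure, but the actual RAID layout may differ.

```python
DRIVE_BYTES = 500 * 10**9  # a "500GB" drive in decimal bytes (assumption)
BINARY_TB = 2**40          # what Windows calls a TB


def raw_tb(drives_per_set, sets):
    """Raw capacity in binary TB, before any RAID parity is taken out."""
    return drives_per_set * sets * DRIVE_BYTES / BINARY_TB


def usable_tb(drives_per_set, sets, parity_drives_per_set=1):
    """Usable capacity in binary TB, assuming one parity drive's worth per RAID 5 set."""
    data_drives = (drives_per_set - parity_drives_per_set) * sets
    return data_drives * DRIVE_BYTES / BINARY_TB


if __name__ == "__main__":
    print(f"raw:    {raw_tb(7, 2):.2f} TB as Windows counts it")
    print(f"usable: {usable_tb(7, 2):.2f} TB after parity")
```

So part of the "overhead" is parity and part of it is just marketing terabytes versus the way Windows counts them.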

    Front of Apple Xserve RAID
    XserveRAID

    Back of Apple Xserve RAID
    XserveRAIDback

    SPAM

    Spam used to be a huge issue for me about 3 years ago. We used a software product called mailsweeper. It worked OK but, like an anti virus program, the definitions had to be updated by us and rules and exceptions had to be put in as well. It was a headache to manage and took focus away from other areas that needed attention. Every Monday or day after a long weekend we would have well over 65,000 emails caught by mailsweeper. It worked good and worked too good sometimes. It would also catch a lot of false positives. We had to sift through all the SPAM just to find blocked emails. It was very time consuming.

    These days I don't even think about spam. Spam is a word that I forget exists. We made our entire company very happy when we had MessageLabs (www.messagelabs.com) take over and filter all of our emails. They have a very robust multilayer filtering system: spam, viruses, porn and also images with excessive skin content. One of the best features about it is that the users administer their own blocked emails. If an email is a false positive the user will get an email from messagelabs saying they have blocked emails. In the notification email there is a login issued by messagelabs so they can go in and release or delete the email. They also have a retention period so the emails won't pile up in their system.

    How it works is you have to edit your MX records with your ISP to send all your SMTP traffic to the messagelabs cluster. Messagelabs will give you a virtual IP, which is a cluster of towers that will filter your emails. Once messagelabs has your emails they will process them through their filters and then forward them to your mail server IP. If you are watching your firewall log you will see entries for MLtower## sending emails to your email server. Best practice is to set your email server to only send emails to messagelabs and only receive emails from them. This can be done in your firewall and should be able to be done on your email server as well. I use Exchange so it is there.

    Spam for us is a thing of the past. If you are still (LOL) having spam issues in your company check out www.messagelabs.com or www.postini.com I can only speak for ML but I hear postini is nice also.

    Friday, May 26, 2006

    Day 3 of our network upgrade - VLANs

    Today I just went through and planned out our IP scheme. Like I mentioned before each floor will be its own VLAN network. Here is how they will be configured;

    16th floor VLAN IP
    IP scheme 10.100.16.x / 24
    Gateway 10.100.16.1

    15th floor VLAN IP
    IP scheme 10.100.15.x / 24
    Gateway 10.100.15.1

    14th floor VLAN IP
    IP scheme 10.100.14.x / 24
    Gateway 10.100.14.1

    4th floor VLAN IP
    IP scheme 10.100.4.x / 24
    Gateway 10.100.4.1

    3rd floor VLAN IP
    IP scheme 10.100.3.x / 24
    Gateway 10.100.3.1

    A lot to think about and thanks to EIGRP that takes care of most of it :D

    I ran some tests on printers and plotters as well (we do a ton of that here) just to be sure that the print servers in one VLAN can communicate with the printers/plotters on another VLAN. We all know it works but you don't want any surprises come cutover time. This is why we test and test and test some more.

    On cutover day I have a lot of sensitive work to do. I need to change my netmask from /16 to /24, which means I need to get to all my servers and firewall objects and SAN devices. Those are critical. These are some of the things an MIS has to worry about. Yes, worry and stress to the point where you almost crap your pants. You ever get those feelings? LOL!
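Why the netmask change touches everything can be shown with Python's standard ipaddress module. A small sketch: the two host addresses are made up for the example, the floor subnets are from the plan above. With the old /16 mask a box on the 16th floor and a box on the 3rd floor think they are neighbors; with /24 they are on different networks and have to go through their gateways, so any device left on the old mask will try to reach the other floors directly and fail.

```python
from ipaddress import IPv4Interface


def same_subnet(host_a, host_b, prefix_length):
    """Return True if both hosts fall in the same network at the given prefix length."""
    net_a = IPv4Interface(f"{host_a}/{prefix_length}").network
    net_b = IPv4Interface(f"{host_b}/{prefix_length}").network
    return net_a == net_b


# Hypothetical hosts on the 16th and 3rd floor VLANs.
HOST_16TH = "10.100.16.50"
HOST_3RD = "10.100.3.50"

if __name__ == "__main__":
    print(same_subnet(HOST_16TH, HOST_3RD, 16))       # old flat network
    print(same_subnet(HOST_16TH, HOST_3RD, 24))       # new per-floor VLANs
    print(same_subnet(HOST_16TH, "10.100.16.1", 24))  # host and its own gateway
```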

    Thursday, May 25, 2006

    Day 2 of our network upgrade - setting up the routes

    Today the consultant and I just went through the design and made sure we had all the routes in place. We are moving from a network that covers 5 floors but is a single flat network (VLAN 1) to a multi VLAN network. Each of the floors will be its own VLAN. Think of each floor as its own network. With this configuration I need to make sure all clients can connect to each other and the servers, and connect to our London and Shanghai networks. Thank God for EIGRP. Without EIGRP our routing table would be at arm's length, meaning that I would have had to tell the router that every VLAN exists, where it is and what path traffic needs to take to get to its destination. I would have to think like a router while entering all of these routes in the routing table. I don't have to do that with EIGRP.

    **What is EIGRP?
    Enhanced Interior Gateway Routing Protocol (EIGRP) is a Cisco proprietary routing protocol based on their original IGRP. EIGRP is a balanced hybrid IP routing protocol, with optimizations to minimize both the routing instability incurred after topology changes, as well as the use of bandwidth and processing power in the router.

    Some of the routing optimizations are based on the Diffusing Update Algorithm (DUAL) work from SRI, which guarantees loop-free operation. In particular, DUAL avoids the "count to infinity" behavior of RIP when a destination becomes completely unreachable. The maximum hop count of EIGRP-routed packets is 224.**

    Or just look at the pic to get a better idea of what it is and does.

    eigrp

    Wednesday, May 24, 2006

    Day 1 of our network upgrade - Cisco 4506's

    We unpacked and placed all the switches in their locations throughout the company. Once we connected the access layer switches to the extra pair of fiber we had to hope we would get a light at the end of the tunnel LOL. No really we had to hope for a light down in the datacenter at the end of the fiber run. When we were done installing the switches we got our lights. Everything went smooth. The new network is running in parallel.

    Here is our new core.
    4506_4
    2 Cisco 4506's. We will have 2 runs to each access layer switch from the core. This is in preparation for a future VoIP install later on. Your foundation MUST be up to par before you can even think about VoIP.

    We are finally moving to a more enterprise level network. Cisco would call it a three-layered hierarchical model or hierarchical internetworking model. It consists of;

    -Core layer
    -Distribution layer
    -Access layer

    Our setup will be core/distribution layer (all in the core) and access layer out on the floors.

    Tuesday, May 23, 2006

    My Coworker won a MacBook from the apple store in NYC

    What a lucky MOFO. If I had won I would have...... given it to my fiancée or my mother :D. It's funny how he won too. He originally went to the opening and spent 3 hours on line. He got in and IM'ed me at home and said he was standing next to celebs and such. Stayed for about an hour and left. He went to hang out on Friday night and went back to the Apple store at 5am. Filled out the form and left. He got a call on Sunday saying he won. Congrats!!!

    MacBook_winner_coworker
    Congrats!

    emc CLARiiON CX300 our storage solution

    When we decided to go with a SAN over a year ago we made that decision based on these criteria;

    -centralized storage
    -redundancy
    -easy to manage (if you know what you are doing)
    -scalability
    -Dell/emc maintenance and support (very good in this area)
    -performance
    -versatility

    We are a Microsoft Windows shop so it was a very easy integration. It took a day to set up and get all the servers configured. Each server needed 2 Host Bus Adapters (HBA), drivers, SANsurfer and emc PowerPath. These are for connectivity management and licensing.

    Once the SAN itself was unboxed, racked and the firmware was updated it was time to carve it up. Originally we only had the CX300 itself and 1 Disk Array Enclosure (DAE); we added a second one a few months later. I have a pic of how the carved up LUNs look on paper.

    SAN-design.xls

    **What are LUNs?
    In computer storage, a logical unit number or LUN is an address for an individual disk drive and by extension, the disk device itself. The term originated in the SCSI protocol as a way of differentiating individual disk drives within a common SCSI target device like a disk array.

    The term has become common in storage area networks (SAN) and other enterprise storage fields. Today, LUNs are normally not entire disk drives but rather virtual partitions (or volumes) of a RAID set.(wikipedia definition)**


    After the carving we were ready to connect the server to the SAN via McData Fiber Channel switches. Here we had to do zoning.

    **What is SAN zoning?
    SAN zoning is a method of arranging Fiber Channel devices into logical groups over the physical configuration of the fabric.
    (seems like this is the best used definition on the web so why reinvent the wheel ;) )**

    Zoning on the McData switches is really easy once you get the hang of it. Find the server's WW name and the storage device's WW name and add the two to the same zone. This will allow the Windows server to detect a new storage device in disk manager after you point the server to its LUN on the SAN using emc Navisphere. No rebooting required if all works right. Hit refresh a few times and your new volume is ready to be partitioned and formatted.
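The idea behind zoning fits in a few lines of Python. This is a toy model, not how the McData switches store anything: the zone names and WW names below are all made up for illustration. The point is simply that a server's HBA can only see a storage port if both WW names sit in at least one common zone.

```python
# Hypothetical WW names, invented for this example.
FILESERVER_HBA = "21:00:00:e0:8b:00:00:01"
ARCHIVE_HBA = "21:00:00:e0:8b:00:00:02"
CX300_SPA = "50:06:01:60:00:00:00:01"
XSERVE_RAID = "60:00:39:30:00:00:00:01"

# Hypothetical zones: each one is just a named set of WW names.
ZONES = {
    "fileserver_to_cx300": {FILESERVER_HBA, CX300_SPA},
    "archive_to_xserve": {ARCHIVE_HBA, XSERVE_RAID},
}


def can_see(zones, initiator_wwn, target_wwn):
    """Return True if the two WW names share at least one zone."""
    return any(
        initiator_wwn in members and target_wwn in members
        for members in zones.values()
    )


if __name__ == "__main__":
    print(can_see(ZONES, FILESERVER_HBA, CX300_SPA))    # zoned together
    print(can_see(ZONES, FILESERVER_HBA, XSERVE_RAID))  # no common zone
```

Zoning only decides who can talk to whom on the fabric; which LUN a server actually gets is still assigned separately on the storage side.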

    Here is a diagram of the server/SAN setup.
    SAN_diag_2_27_06.vsd
    (I'll talk about the apple XServeRAID in a bit)

    Seems easy enough but I didn't do it alone. We had the Dell/EMC guy with us the whole time. They won't let you perform these tasks without an engineer onsite. Too easy for something to go wrong.

    The benefit of all this is that if my server runs out of space I can easily grow the LUN and manage everything from a single location. Centralized storage is a must in our environment. As much as we want to centralize everything we still can't :/

    Storage, one of the biggest headaches to deal with in this environment.

    I work for an architecture firm in NYC where we generate lots of large files. Files from various 3D, CAD and imaging applications. In this environment it is hard to manage the storage. Why? Well, here is why.

    When a project starts a folder is created on the file server. This folder has a name or number associated with it to identify the project. For the life of the project everything related to it is stored here (except emails). These projects last years. I really mean years. Well, if you think about it, how long does it take to build, let's say, a dam or a skyscraper? This is how long these files have to be accessible on the network. If the project is active it has to be on the production server. If it's on hold or wrapped up it goes to the archive server. These project folders get to be well over 100GB and that is only 1 of many projects.

    Why can't I just delete old files?
    That's not my job, to be honest. Sounds like a don't-care attitude, right? Wrong! If I went to the lengths of deleting every old file, even though they are on tape backup, I would be restoring files EVERY DAY, all day long. It should not fall on me to be in charge of what gets deleted and what doesn't; that should be the job of the team in charge of the project. I only provide the means for them to store their files and work without problems. Nice cop out, right ;)

    So my team of admins and I have been faced with this problem of the servers filling up year after year. How we used to deal with it was to throw disks at the server. At one point we had about 5 file servers of various sizes filling up. One year I thought I was in the clear. We had a 400GB volume on our main file server and it filled up. We purchased an 800GB drive cage for an HP server. So I think to myself, and tell my boss, that we are in the clear for the next 2 years. Well, 6 months go by and the 800GB is down to 100GB free. We move inactive projects off to free up space and this goes on for a few more months. We then double that capacity. At the time 800GB was a hell of a lot of space for a company our size. When we filled it up I was as shocked as anyone else would be. So now we had 1.6TB. Again we filled that up in 18 months. So in 2 years we ran through over 2TB (if you include the projects we pulled off to make space). We decided to get an EMC CX300 with 2TB for production and 4TB for archive. This has been working out OK since we got it but we are still running out of space.
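    Looking back, a bit of back-of-the-napkin math would have saved me from that "2 years" promise. Here is a rough sketch in Python using the numbers above. It assumes the 800GB cage started out empty, that growth is a straight line, and that 2TB means 2000GB, none of which is exactly true, so treat the result as a ballpark only. The function names are just made up for this example.

```python
def monthly_growth_gb(consumed_gb: float, months: float) -> float:
    """Average space consumed per month over the observed period."""
    return consumed_gb / months


def months_until_full(free_gb: float, growth_gb_per_month: float) -> float:
    """How long the remaining free space lasts at a constant growth rate."""
    return free_gb / growth_gb_per_month


# 800GB cage down to 100GB free after 6 months -> 700GB consumed.
rate = monthly_growth_gb(consumed_gb=800 - 100, months=6)

# How long would a fresh 2TB production volume last at that rate?
# (2TB taken as 2000GB here, which is an assumption.)
runway = months_until_full(free_gb=2000, growth_gb_per_month=rate)

print(f"growth: {rate:.1f} GB/month")
print(f"2TB lasts about {runway:.1f} months")
```

    At roughly 117GB a month, even a brand new 2TB volume is good for well under 2 years unless inactive projects keep getting moved off.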

    We are currently looking for a hierarchical storage management (HSM) solution that integrates well in our environment. No, it's not easy to just go get IBM Tivoli, CommVault or Veritas Enterprise Vault <--- (we have this for email and what a pain in the ass it is to set up). We have to make sure that whatever pointer file is left behind can be read by our CAD software. The problem is that our CAD software uses what is called an Xref. An Xref is a bunch of files that are associated with the main file you are working on. If I open file building1234.dwg it can call dozens of other files and they will all open due to the Xref. This is the root of the storage problem and the reason why a solution isn't easy to find. Say we use an HSM solution to move files older than 30 days to an archive spot, leaving a pointer in place, and the file moved is part of an Xref. If my CAD software cannot read that pointer file we can potentially corrupt the main file and delay a project. I have been explaining this situation to these vendors, giving the same example and asking them to find out if these HSM products will work with CAD software. Of course they don't test anything and say yeah, it should work, so that I'll buy it. Well, if the software cost $30 I would buy it and try it, but it costs thousands of dollars. We all know this software works on Word files and all the common stuff that you find in financial, law and medical firms, but the architecture firms are always left out. It's like no one knows about us. Maybe we should stop designing buildings.
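    To show the vendors what I mean, here is a rough sketch in Python of the check I would want before anything gets moved: a file is only safe to archive if it is old AND no recently touched drawing pulls it in through an Xref, directly or through another Xref. This is not how any of the HSM products above work. It also does not read real .dwg files; the file names (other than building1234.dwg), ages and Xref map are made-up inputs for illustration.

```python
def referenced_by_active(xrefs: dict[str, list[str]],
                         active: set[str]) -> set[str]:
    """Every file reachable from an active drawing by following Xrefs."""
    seen: set[str] = set()
    stack = list(active)
    while stack:
        current = stack.pop()
        for dep in xrefs.get(current, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def safe_to_archive(age_days: dict[str, int],
                    xrefs: dict[str, list[str]],
                    cutoff_days: int = 30) -> list[str]:
    """Old files that no active drawing depends on, sorted by name."""
    active = {f for f, age in age_days.items() if age <= cutoff_days}
    pinned = active | referenced_by_active(xrefs, active)
    return sorted(f for f in age_days if f not in pinned)


# Hypothetical project folder: days since each file was last touched.
age_days = {
    "building1234.dwg": 2,
    "grid.dwg": 200,
    "core.dwg": 90,
    "old_study.dwg": 400,
}
# Hypothetical Xref map: drawing -> files it references.
xrefs = {
    "building1234.dwg": ["grid.dwg"],
    "grid.dwg": ["core.dwg"],
}

print(safe_to_archive(age_days, xrefs))
```

    In this example grid.dwg and core.dwg are both older than 30 days, but they stay put because building1234.dwg still calls them. Only old_study.dwg would be moved.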

    Thursday, May 11, 2006

    Best App

    Google Earth

    Currently using it to find my next home. Get the address of the property of interest (not always provided on certain real estate sites). Plug the address into Google Earth and you can get an idea of what the neighborhood looks like. You can also get an idea of the lot size without having to visit the property. And you can get an idea of how far your new home will be from certain landmarks, malls, supermarkets, theaters, schools, your job, hospitals, police, transportation etc. This is how I am currently using Google Earth. How do you use it?

    Less than a month. Take a guess..

    Pink packages all over the living room

    First entry

    I finally decided to create a blog. It took me so long b/c you can say I've been there, done that in the past. 5 years ago I had a website where I spoke about all my hobbies: computers, home theater, paintball and my ride ;) This blog will mostly be about tech stuff.

    I'll be posting a lot in the weeks to come.