Wednesday, October 14, 2009

Windows 2008 R2 problems with HP Drivers

I decided to share some issues that I experienced with HP servers when upgrading to Windows 2008 R2. I discovered that some settings that worked perfectly in Windows 2008 didn’t work any more after I introduced Windows 2008 R2, no matter whether I did an upgrade or a clean install. These problems were common to Full and Core installations. The most common errors were:

Hyper-V servers (when adding new switches):
Error Applying New Virtual Network Changes

Setup switch failed.
Information is no longer available about this task because the object that monitors the process no longer exists. This may occur when many tasks are being processed.
After the failure, when I retried creating the vswitch on the correct NIC, I got:
(Note: you can also find some solutions to clear this error in this POST)
Error Applying New Virtual Network Changes

Binding to the external Ethernet
Cannot bind to the external Ethernet < NIC description> because it is already bound to another virtual network.

Hyper-V servers (when removing switches):
You might not have permissions to perform this task

Upgrading Servers (When running the NCU update):

The NCU setup sometimes fails to install or upgrade, and on some servers I had to stop some services in order to get the NCU installed (an action that I would later regret).


Cluster Errors:
The operation has failed.

Error in Validation. (Note: Some possible issues are also described in this POST)

Cluster creation errors:
In clusters, although the validation process ended OK, the process to create the cluster failed with the error:
Validating installation of the Network FT Driver on node .
Unable to successfully cleanup.

*************************************************
When I started to debug these problems I noticed that they all had one thing in common: they were all network-related problems.

Hum…

OK, let’s start from the beginning. I checked that I had the latest firmware and software (the HP-supported versions for Windows 2008 R2) and everything was fine: I was using Smart Start 8.30 (18 Aug 2009) and I had also upgraded to the latest Firmware version, 8.60 (10 Aug 2009). Although everything seemed OK, the fact was that I still had those issues.

So, what to do next?

After several tests I came to several conclusions about how to make this work. I’m not going to explain all of them, nor am I going to go into the details of the processes. Instead I’m going to give you the easiest way to get things done until HP solves this problem and hopefully gives us an UPDATED version of some drivers and other software.
My main conclusion is that these network problems are ALL related to the drivers included in the Smart Start CD 8.30 (yep, the version that HP says is supported for Windows 2008 R2). Most of these drivers and software are dated August 2009, but it seems that they were not tested carefully enough to work correctly in Windows 2008 R2. Yes I know, it’s October 2009 and things haven’t changed.

That said, let’s start:
1- Before anything else, use the Firmware CD 8.60 or later and boot from the CD. In order to get HP support you need to update ALL existing hardware to the most recent firmware version.

2- After that, use the Smart Start CD 8.30 or later (hopefully by the time you read this HP already has an updated version and you probably don’t need to follow this procedure) to start the server and the installation process. After booting, the Smart Start CD starts a wizard that asks a number of questions about server name, organization, etc. When you get the option to choose the installation mode, choose “CUSTOM”; this will allow you to choose which components to install at the end of the Windows installation.

3- When the installation finishes, a wizard lets you choose exactly which of the components available in the PSP you wish to install. On that screen, exclude the NCU (Network Configuration Utility) from the driver/software installation process (you may also exclude the NICs if you want). If you don’t exclude the NCU at this step, you will probably have problems with network communications, depending on the roles that you plan to install on that server.

4- The next step is to get the most recent version of the NIC drivers available online. DO NOT USE THE SMARTSTART CD 8.30 drivers or the DRIVERS that ARE ONLINE AT THE HP WEBSITE for the same Smart Start CD 8.30 version; instead get the DRIVERS directly from the hardware vendor (in my servers, Broadcom and Intel).
Note: I couldn’t get a version from Broadcom that specifically said that the driver was for Windows 2008 R2; instead I got version 12.26.02, dated 08/28/09, which according to the Broadcom website is for Windows 2008 x64. Intel already has drivers for Windows 2008 R2 for some NICs; you’ll need to check whether your NIC is already on that list.

5- Be careful and take note of the driver date and version. On some servers the driver wasn’t properly updated after the driver setup, and I had to remove the driver, reboot the server and install it again. I also had scenarios (especially in Server Core) where I wasn’t able to perform the update because the NICs were lost after the driver removal. On Full installs the only way to get them back was through Device Manager: I had to manually remove the NICs and then re-add them using the option “Scan for hardware changes”. As you may know, Device Manager for Server Core installs can only be accessed remotely and is READ-ONLY, and Server Core has no process that allows you to remove hardware as you would in Full installs. So if that happens in Server Core you’re stuck, and you will probably need to re-format the drive, re-install the OS and the HP drivers, and then perform the upgrade once again.

6- If the server is going to be a Hyper-V server, remember that NCU should only be installed AFTER the Hyper-V role is activated.
(Note: NCU will also give you headaches and weird behaviors when used with the wrong drivers, including those that ship with Windows and those that come on the Smart Start CD. Multiple NICs, inability to change or remove them, communication issues, Hyper-V virtual switch errors, crashes, etc. are just some of the problems that you may experience with NCU and the wrong drivers.)

7- Now that you have the correct version of the drivers, and if you’re going to use that server with Hyper-V, it’s time to install the Hyper-V role, reboot and test. Assuming that you’re able to create and delete vswitches without errors and your VMs can communicate through the interfaces, you should be ready for the NCU installation.

8- Use the latest version of NCU (at the moment I have version 9.70). Install the NCU, reboot the server, and you should be ready to configure your first NIC Team. After creating the first Team, try to create a vswitch on top of the new Team vInterface and make sure that a new NIC was created for the vswitch. Then test communications and finally test the removal of that vswitch; after removal, recheck that the NIC that was created for the vswitch was also removed. If these tests were all OK, then you should be ready to rock.

9- At this point your problem should be corrected (at least temporarily, until we get the new versions of the HP drivers/software; they know what is happening and I was told that they’re working on it).

10- If your server is going to be part of a cluster, my advice is this: “DO NOT INSTALL NCU – WAIT FOR THE NEW VERSION of NCU and NIC drivers that work with that new NCU”. If you do install it, and you start to get weird messages or problems when adding new nodes like those explained at the beginning of this post, then you can say "Thank you" to the NCU.
Note: Sometimes even if you uninstall NCU the problems will NOT GO AWAY; the only way to get things working is to do a clean install or perform a manual removal of the tool, which can be very useful especially in Core installations. I also tried other techniques like disabling RSS, TCP Chimney, etc. In some scenarios I was able to solve the problem that way, in others not so much (again, this will depend on the role that the server performs).

11- To finish this blog entry: when you decide that it is time to move to Windows 2008 R2, do NOT go through the UPGRADE PATH (this will save you a lot of headaches). Instead, if possible, do a clean installation and then move the apps, roles, etc. to that server. I’m warning you, if you go down the upgrade path, you may be SORRY again and again like I was. If a clean install is not an option, you should first perform the upgrade in a lab using an IDENTICAL SERVER (forget virtual servers, especially if you need to perform hardware tests; use IDENTICAL HARDWARE), and then, when possible, plan the move to Windows 2008 R2.
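
For the check in step 5 (taking note of the driver date and version and confirming that the setup really updated the driver), here is a minimal Python sketch. It only compares values that you have written down yourself; it does not query Windows. The function names are made up for illustration, the expected values are the Broadcom version and date from the note in step 4, and the "installed" readings in the example are hypothetical.

```python
from datetime import date


def parse_version(text):
    """Turn a dotted version such as '12.26.02' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def parse_driver_date(text):
    """Parse MM/DD/YY (the format used in step 4); assumes years 2000-2099."""
    month, day, year = (int(part) for part in text.strip().split("/"))
    return date(2000 + year, month, day)


def driver_was_updated(installed_version, installed_date,
                       expected_version, expected_date):
    """True only if both version and date are at least the expected ones."""
    version_ok = parse_version(installed_version) >= parse_version(expected_version)
    date_ok = parse_driver_date(installed_date) >= parse_driver_date(expected_date)
    return version_ok and date_ok


# Expected values: the Broadcom driver mentioned in step 4.
EXPECTED = ("12.26.02", "08/28/09")

# Hypothetical readings noted by hand after running the driver setup.
print(driver_was_updated("12.26.02", "08/28/09", *EXPECTED))
print(driver_was_updated("4.6.110", "02/12/09", *EXPECTED))
```

The first call prints True and the second prints False. Comparing versions as tuples of numbers rather than as text avoids mistakes such as treating "4.6.110" as newer than "12.26.02".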

Hopefully this will help you to understand and to recover from these errors.
Enjoy.

7 comments:

  1. Thank you for this!

  2. I had a similar issue and it turned out to be Symantec Endpoint Protection 11. I wasn’t able to form a cluster between 2 Windows 2008 R2 HP servers. I uninstalled Symantec and it all worked. Later I upgraded to the latest version of Endpoint

  3. Thanks - having exact same issues

    Had similar issues with Win 2008 when first released as well. Good to see HP are doing extensive testing with drivers and Win 2008 :(

  4. Anyone able to comment on whether this issue has been fixed with the release of SmartStart 8.40?

  5. Hi, I uninstalled the HP Network Utility from the network card using netcfg -u cp_cpqteam.

  6. Same issue with SmartStart 8.40. Even with new drivers, no go with Hyper-V R2. Not Stable though. :(

  7. Running into the same issues. One thing I have found about the "network configuration is locked" problem is that the NCU adds cpqteam.exe to the hklm\software\microsoft\windows\currentversion\run key. This is probably what is locking the network configuration.
    Have re-built a couple of times because of these issues, almost giving up on hyper-v because of these issues.
    Microsoft should seriously consider having NIC teaming built into the kernel, similar to Linux.
