VMware: Rolling your own OEM-branded image to include missing VIBs

Posted by & filed under Virtualization, VMWare.

Problem: The HPE-branded ESXi 6.5 U1 installer cannot see the LUNs presented by my SAN. The server boots from SAN and has no local storage, so the installer must be able to reach the remote LUNs. The host currently runs 5.5 U3 without problems, and a quick boot into the 5.5 installer confirms it can see the LUNs, which rules out zoning issues, physical issues, etc.

The HPE ESXi 6.5 image seems to lack support for the QLogic BR-815, QLogic BR-825, Brocade-415, and Brocade-825 FC cards, which are all essentially the same card. After verifying compatibility of both the server and the BR-815 FC cards, I concluded that the driver simply is not included in the HPE image.

Here are the steps I took to roll my own installer with the VMware Image Builder toolset, using the HPE-branded image as a base:

Resources:

  • Customizing installations with Image Builder: https://docs.vmware.com/en/VMware-vSphere/6.5/com.vmware.vsphere.install.doc/GUID-48AC6D6A-B936-4585-8720-A1F344E366F9.html
  • Add VIBs to an image profile: pubs.vmware.com/vsphere-51/index.jsp#com…
  • Export image profile to an ISO: pubs.vmware.com/vsphere-51/index.jsp#com…
  • HPE vibs Depot: http://vibsdepot.hpe.com
  • Using vibsdepot with Image Builder: http://vibsdepot.hpe.com/getting_started.html
  • Applying VIBs to an image walkthrough: https://blogs.vmware.com/vsphere/2017/05/apply-latest-vmware-esxi-security-patches-oem-custom-images-visualize-differences.html
  • VMware Compatibility Guide: https://www.vmware.com/resources/compatibility/search.php
  • HPE VMware Support and Certification Matrices: http://h17007.www1.hpe.com/us/en/enterprise/servers/supportmatrix/vmware.aspx
  • Info on HPE Custom Images: https://www.hpe.com/us/en/servers/hpe-esxi.html
  • Supported driver firmware versions for I/O devices: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2030818

Basic steps:

  • Identify the OEM’s software depot URL, in this case for the HPE ESXi 6.5U1 image: http://vibsdepot.hpe.com/index-ecli-650.xml
  • Identify where the VIB for the driver is available. In my case, the Brocade BR-815 driver was downloaded from the VMware compatibility site (https://www.vmware.com/resources/compatibility/detail.php?deviceCategory=io&productid=5346). Note that the VIB is actually inside a zip file inside the zip you download. Image Builder looks for an index.xml file in the root of the zip, so point it at the inner zip.
  • Use esxi-image-creator.ps1 to generate a new image that includes the added software: https://github.com/vmware/PowerCLI-Example-Scripts/blob/master/Scripts/esxi-image-creator.ps1
  • Use Export-EsxImageProfile to generate an ISO for installation.
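Because the driver download nests one zip inside another, it can help to check which archive actually has index.xml at its root before handing it to Add-EsxSoftwareDepot. The following is a minimal Python sketch of that check, not part of the original procedure. The function name find_depot_zip and the max_depth limit are made up for illustration.

```python
import io
import zipfile


def find_depot_zip(zip_bytes, name="download.zip", depth=0, max_depth=3):
    """Return the archive paths (outer -> inner) that have index.xml at their root.

    zip_bytes is the raw content of a zip file. Nested .zip members are
    opened in memory and searched as well, up to max_depth levels deep.
    """
    hits = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        names = zf.namelist()
        # An archive with index.xml at its root is usable as a depot.
        if "index.xml" in names:
            hits.append(name)
        if depth < max_depth:
            for member in names:
                if member.lower().endswith(".zip"):
                    hits += find_depot_zip(
                        zf.read(member),
                        f"{name} -> {member}",
                        depth + 1,
                        max_depth,
                    )
    return hits
```

To use it, read the downloaded file in binary mode and pass the bytes in, for example find_depot_zip(open(path, "rb").read(), path). Any entry in the result containing "->" means the usable depot is an inner zip that has to be extracted first.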

 

PowerCLI C:\Users\user> Add-EsxSoftwareDepot http://vibsdepot.hpe.com/index-ecli-650.xml

Depot Url
---------
http://vibsdepot.hpe.com/index-ecli-650.xml


PowerCLI C:\Users\user> Add-EsxSoftwareDepot -DepotUrl C:\Users\user\Downloads\BCD-bfa-3.2.5.0-00000-offline_bundle-2352086.zip

Depot Url
---------
zip:C:\Users\user\Downloads\BCD-bfa-3.2.5.0-00000-offline_bundle-2352086.zip?index.xml


PowerCLI C:\Users\user> Get-EsxSoftwareDepot

Depot Url
---------
http://vibsdepot.hpe.com/index-ecli-650.xml
zip:C:\Users\user\Downloads\BCD-bfa-3.2.5.0-00000-offline_bundle-2352086.zip?index.xml


PowerCLI C:\Users\user> .\esxi-image-creator.ps1 -LeaveCurrentDepotsMounted -NewProfileName ESXi_6.5.0U1_with_HPE_and_Qlogic -Files C:\Users\user\Downloads\VMware-ESXi-6.5.0-Update1-5969303-HPE-650.U1.10.1.3.3-Oct2017-depot.zip -Acceptance PartnerSupported

Depot Url
---------
zip:C:\Users\user\Downloads\VMware-ESXi-6.5.0-Update1-5969303-HPE-650.U1.10.1.3.3-Oct2017-depot.zip?index.xml

The following VIBs will not be included in ESXi_6.5.0U1_with_HPE_and_Qlogic:
tools-light

Finished creating ESXi_6.5.0U1_with_HPE_and_Qlogic


PowerCLI C:\Users\user> Export-EsxImageProfile -ExportToIso -ImageProfile "ESXi_6.5.0U1_with_HPE_and_Qlogic" -FilePath C:\Users\user\Downloads\VMWare-ESXi-6.5.0-U1-HPE-Qlogic-Custom-Oct2017.iso

Booting the server from the newly built ISO lets the installer see the LUNs, so I can complete my boot-from-SAN installation.

Recovering from a failed platform services controller installation – vSphere 6.5

Posted by & filed under Virtualization, VMWare.

I have used the commands below to recover from a failed PSC deployment. When trying to redeploy after the failed deployment, I encountered the error:

“Failed to run vdcpromo”

Following the below steps on the current PSC resolved the error and I was then able to successfully restart the PSC deployment.

Also, a pro tip: to avoid having to keep redeploying the appliance, take a snapshot right after phase 1 completes. Then you can simply revert to the snapshot and access your VM via the web interface to try again.

login as: root

VMware vCenter Server Appliance 6.5.0.10000

Type: vCenter Server with an embedded Platform Services Controller

Using keyboard-interactive authentication.
Password:
Last login: Wed Sep 20 15:34:18 2017 from 10.110.0.181
Connected to service

    * List APIs: "help api list"
    * List Plugins: "help pi list"
    * Launch BASH: "shell"

Command> shell
Shell access is granted to root
root@vcenter [ ~ ]# cd /usr/lib/vmware-vmdir/bin
root@vcenter [ /usr/lib/vmware-vmdir/bin ]# ./vdcleavefed -h vcenter-psc.redacted.lan -u Administrator
password:
vdcleavefd offline for server vcenter-psc.redacted.lan
 vcenter-psc.redacted.lan server cleanup performed.
root@vcenter [ /usr/lib/vmware-vmdir/bin ]#

 

docs.vmware.com/en/VMware-vSphere/6.5/co…

Additional info: I also ran into this when trying to deploy an additional PSC after a failed installation, but got a completely different error (see below). Going to Administration -> System Configuration in the Flash-based vSphere Web Client also displays the failed PSC. Log in to the live PSC and use the above commands to clean up, then restart the new PSC deployment. Refreshing the System Configuration page after running vdcleavefed confirms that the cleanup is complete and the failed install is no longer listed.

The error I received when deploying this PSC was:

Could not connect to VMware Directory Service via LDAP. Verify VMware Directory Service is running on the appropriate system and is reachable from this host.

Removing the failed deployment via vdcleavefed did not resolve the issue.

I decided to test LDAP connectivity to the PSC from the failed PSC deployment. I SSH’d into the box and did the following:

root@localhost [ /usr/lib/vmware-vmdir/bin ]# ./vdcadmintool


==================
Please select:
0. exit
1. Test LDAP connectivity
2. Force start replication cycle
3. Reset account password
4. Set log level and mask
5. Set vmdir state
6. Get vmdir state
7. Get vmdir log level and mask
==================

1
Please enter LDAP server host: vcenter-psc.redacted.lan
Please enter LDAP server port: 389
Please enter LDAP server SSL port: 11712
Please enter LDAP Bind DN: cn=Administrator,cn=Users,dc=vsphere,dc=local
Please enter LDAP Bind UPN: Administrator@vsphere.local
Please enter LDAP Bind password:

ldap://vcenter-psc.redacted.lan:389 (ANONYMOUS) bind succeeded.

++++++++++++++++++++ ldaps://vcenter-psc.redacted.lan:11712 SSL bind failed. (-1)(Can't contact LDAP server)

ldap://vcenter-psc.redacted.lan:389 SRP bind succeeded.

++++++++++++++++++++ ldap://vcenter-psc.redacted.lan:389 GSSAPI bind failed. (9100)(Unknown (extension) error)
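The "Can't contact LDAP server" result on port 11712 reads like a reachability problem rather than a credentials problem, although that is only an inference from the output above. A plain TCP check from the same machine can help separate the two cases. This is a minimal Python sketch, with port_open and check_ports as made-up helper names. Port 636 in the usage note is the standard LDAPS port and is included as a guess, not something taken from the transcript.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to True/False depending on whether it accepts connections."""
    return {port: port_open(host, port, timeout) for port in ports}
```

For example, check_ports("vcenter-psc.redacted.lan", [389, 636, 11712]) returns a dict showing which ports accept connections. A False entry only shows that nothing answered on that port from this machine. It does not say whether a firewall or a stopped service is the cause.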

Edit: Additional semi-related data

Get machine’s guid

root@vcenter-psc [ /usr/lib/vmware-vmdir/bin ]# /usr/lib/vmware-vmafd/bin/vmafd-cli get-machine-id --server-name localhost

Get machine’s pnid (primary network identifier, i.e. the system name set at deployment)

root@vcenter-psc [ /usr/lib/vmware-vmdir/bin ]# /usr/lib/vmware-vmafd/bin/vmafd-cli get-pnid --server-name localhost

Get services in the directory

root@vcenter-psc [ ~ ]# /usr/lib/vmware-vmafd/bin/dir-cli service list

vCenter 5.5 to 6.5U1 Upgrade – SSL Errors

Posted by & filed under Server Admin, Virtualization, VMWare.

Ran into some issues with the SSL certs on the vCenter server when trying to run the Migration Assistant. Notes on those issues follow, but first, links to articles on the actual upgrade:

The Migration Assistant complained that the SSL certs did not match. Upon inspecting the certs, I found that all were issued to domain.lan except for one, which was issued to domain.net. I followed these articles to generate a new vCenter cert and install it:

  • Generate SSL cert using openssl: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2074942
  • Install and activate cert: https://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=2061973
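Spotting the odd cert out by eye is easy to get wrong when several services each have their own cert. As a rough illustration only, this Python sketch flags subject names that fall outside the expected domain. The function name and the example hostnames are hypothetical, and the names themselves still have to be collected from each cert by hand or with another tool.

```python
def names_outside_domain(cert_names, expected_domain):
    """Return the names that are neither the expected domain nor a host under it."""
    domain = expected_domain.lower().strip(".")
    suffix = "." + domain
    outliers = []
    for name in cert_names:
        normalized = name.lower().rstrip(".")
        # Match on a dot boundary so "notdomain.lan" does not pass for "domain.lan".
        if normalized != domain and not normalized.endswith(suffix):
            outliers.append(name)
    return outliers


# Hypothetical subject names, one per service cert.
print(names_outside_domain(
    ["vcenter.domain.lan", "sso.domain.lan", "vcenter.domain.net"],
    "domain.lan",
))
```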

As the Appliance Installer reached Stage 2 of the install, where it copies the data to the new VCSA, I received the following error (note the yellow warning in the background along with the details in the foreground):

To resolve this error, I followed these articles:

  • Upgrading to VMware vCenter 6.0 fails with the error: Error attempting Backup PBM Please check Insvc upgrade logs for details (2127574): https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2127574
  • Resetting the VMware vCenter Server 5.x Inventory Service database (2042200): https://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=2042200#3

These essentially had me reset the Inventory Service database due to corruption. I had noticed the vSphere client being slow in recent weeks, which could have been a side effect.

  • Additional, more generic docs for troubleshooting vCenter upgrades: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2106760