vCenter Server (VCSA) 6.5 – Add root CA cert to Windows Server 2016 AD to enable the default VCSA cert to be trusted

Filed under Active Directory, Server Admin, Virtualization, VMWare.

The VCSA has its own CA built in. It uses that CA to generate certs for all the various services. There are two options available to ensure that the certificate is trusted in the browser:

  1. Generate a CSR for the cert and submit it to a CA that can issue the cert.
  2. Use Microsoft Active Directory GPO to push out the VCSA’s root CA cert, thereby allowing the workstations to trust the cert already installed.

I went with the second option because the VCSA uses vcenter.mydomain.lan and is only accessible from inside my network, which also means only machines on the domain will be connecting to the web interface. This was very simple to make happen…

On the DC:

To distribute certificates to client computers by using Group Policy

  1. On a domain controller in the forest of the account partner organization, start the Group Policy Management snap-in.
  2. Find an existing Group Policy Object (GPO) or create a new GPO to contain the certificate settings. Ensure that the GPO is associated with the domain, site, or organizational unit (OU) where the appropriate user and computer accounts reside.
  3. Right-click the GPO, and then click Edit.
  4. In the console tree, open Computer Configuration\Policies\Windows Settings\Security Settings\Public Key Policies, right-click Trusted Root Certification Authorities, and then click Import.
  5. On the Welcome to the Certificate Import Wizard page, click Next.
  6. On the File to Import page, type the path to the appropriate certificate files (for example, \\fs1\c$\fs1.cer), and then click Next.
  7. On the Certificate Store page, click Place all certificates in the following store, and then click Next.
  8. On the Completing the Certificate Import Wizard page, verify that the information you provided is accurate, and then click Finish.
  9. Repeat steps 2 through 6 to add additional certificates for each of the federation servers in the farm.

Once the policy is set up, you will need to wait either for the machines to reboot or for Group Policy to refresh. Alternatively, run gpupdate /force on a client to apply the update immediately. Once complete, you can verify the cert was installed by running certmgr.msc and inspecting the Trusted Root Certification Authorities tree for the cert. In my experience the machine still required a reboot, because the browser did not recognize the new root CA and kept displaying the ugly SSL browser error. After a reboot it was good to go.

Reference: https://docs.microsoft.com/en-us/windows-server/identity/ad-fs/deployment/distribute-certificates-to-client-computers-by-using-group-policy

vCenter 5.5 to 6.5U1 Upgrade – SSL Errors

Filed under Server Admin, Virtualization, VMWare.

Ran into some issues with the SSL certs on the vCenter server when trying to run the Migration Assistant. Notes on those will follow, but first links to articles on the actual upgrade:

The Migration Assistant complained that the SSL certs did not match. Upon inspecting the certs I found all were issued for domain.lan except for one, which was issued to domain.net. I used the following articles to generate a new vCenter cert and install it:

  • Generate SSL cert using openssl: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2074942
  • Install and activate cert: https://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=2061973

As the Appliance Installer reached Stage 2 of the install, where it copies the data to the new VCSA, I received the following error (note the yellow warning in the background along with the details in the foreground):

To resolve this error, I worked through these articles:

  • Upgrading to VMware vCenter 6.0 fails with the error: Error attempting Backup PBM Please check Insvc upgrade logs for details (2127574): https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2127574
  • Resetting the VMware vCenter Server 5.x Inventory Service database (2042200): https://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=2042200#3

These essentially had me reset the Inventory Service’s database due to corruption. I had noticed the vSphere client being slow in recent weeks; this could be a side effect.

  • Additional, more general docs for troubleshooting vCenter upgrades: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2106760

 

VCSA – Joining to AD Domain fails – Error: Enabling Active Directory failed. ERROR_GEN_FAILURE 0x0000001f

Filed under Active Directory, Server Admin, Virtualization, VMWare.

Attempting to join a freshly deployed VCSA server to an AD domain can be problematic if SMB1 is disabled. In my case it was 5.5, but I believe this issue persists in 6.x. SMB1 was disabled on the DC, as it should be, since it is broken and insecure. The problem is that SMB2 is not enabled on a freshly deployed VCSA, so with SMB1 turned off on the DC the two have no protocol in common and the join fails. The VAMI (web interface) might report something like the following when attempting to join the domain:

Error: Enabling Active Directory failed.

Additionally, on the VCSA, /var/log/vmware/vpx/vpxd_cfg.log contains entries like the following:

2017-08-16 14:30:07 26987: ERROR: Enabling active directory failed: Joining to AD Domain:   domain.lan
With Computer DNS Name: vcenter-server.domain.lan


Error: ERROR_GEN_FAILURE [code 0x0000001f]
2017-08-16 14:30:07 26987: VC_CFG_RESULT=302
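
If the join has been retried a few times, pulling just the failures out of the log can save some scrolling. This is a minimal sketch, assuming the log keeps the layout shown above (an ERROR line carrying the timestamp, followed a few lines later by an Error: line carrying the code). The function name ad_join_errors is made up for illustration and is not part of the appliance.

```shell
# ad_join_errors (hypothetical helper): read vpxd_cfg.log text on stdin and
# print "date time error-name code" for each failed AD join.
ad_join_errors() {
    awk '
        # Remember the timestamp of the most recent join failure.
        /ERROR: Enabling active directory failed/ { ts = $1 " " $2 }
        # The error name and code arrive a few lines later.
        /^Error: / && ts != "" {
            code = $NF
            sub(/\]$/, "", code)      # "0x0000001f]" -> "0x0000001f"
            print ts, $2, code
            ts = ""
        }
    '
}

# Sample copied from the log excerpt above.
sample_log='2017-08-16 14:30:07 26987: ERROR: Enabling active directory failed: Joining to AD Domain:   domain.lan
With Computer DNS Name: vcenter-server.domain.lan


Error: ERROR_GEN_FAILURE [code 0x0000001f]
2017-08-16 14:30:07 26987: VC_CFG_RESULT=302'

printf '%s\n' "$sample_log" | ad_join_errors
```

On the appliance itself the function would be fed the real file instead of the sample, for example ad_join_errors < /var/log/vmware/vpx/vpxd_cfg.log.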

Of course DNS resolution of the VCSA’s hostname should be validated before continuing, but assuming everything else is in working order, the fix is to enable SMB2 on the VCSA.

Verify SMB2 is disabled (note the Smb2Enabled key is 0):

vc-01:~ # /opt/likewise/bin/lwregshell list_values '[HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr]'
   "EchoInterval"     REG_DWORD       0x0000012c (300)
   "EchoTimeout"      REG_DWORD       0x0000000a (10)
   "IdleTimeout"      REG_DWORD       0x0000000a (10)
   "MinCreditReserve" REG_DWORD       0x0000000a (10)
   "Path"             REG_SZ          "/opt/likewise/lib64/librdr.sys.so"
   "ResponseTimeout"  REG_DWORD       0x00000014 (20)
   "SigningEnabled"   REG_DWORD       0x00000001 (1)
   "SigningRequired"  REG_DWORD       0x00000000 (0)
   "Smb2Enabled"      REG_DWORD       0x00000000 (0)

Enable SMB2:

vc-01:~ # /opt/likewise/bin/lwregshell set_value '[HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr]' Smb2Enabled 1

Restart the lwio service:

vc-01:~ # /opt/likewise/bin/lwsm restart lwio

Log out of the VAMI web interface, log back in, and retry joining the domain.
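
Before retrying the join it can be worth confirming that the registry value actually changed. The sketch below parses list_values output of the shape shown earlier and reports the state of Smb2Enabled. The helper name smb2_state is my own, and the parsing assumes the value is always printed as hex followed by the decimal value in parentheses, as in the listing above.

```shell
# smb2_state (hypothetical helper): read `lwregshell list_values` output on
# stdin and print "enabled", "disabled", or "unknown".
smb2_state() {
    awk '
        $1 == "\"Smb2Enabled\"" {
            val = $NF                  # decimal value in parentheses, e.g. "(0)"
            gsub(/[()]/, "", val)
            if (val + 0 == 0) print "disabled"; else print "enabled"
            found = 1
        }
        END { if (!found) print "unknown" }
    '
}

# Two lines copied from the listing above (the state before the fix).
sample_reg='   "SigningRequired"  REG_DWORD       0x00000000 (0)
   "Smb2Enabled"      REG_DWORD       0x00000000 (0)'

printf '%s\n' "$sample_reg" | smb2_state    # prints: disabled
```

On the VCSA, the same lwregshell list_values command used above would be piped into smb2_state in place of the sample text.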

FreeBSD identifying failed disk

Filed under Hardware.

Using the sas2ircu utility from LSI, we can blink the drive LED to help ID the failed drive correctly. Of course this requires an LSI card. Some LSI cards may need to use the sas3ircu utility instead. There have been some reports from the interwebs that this utility failed to blink the correct drive, but I have not experienced this myself.

As always use the supercomputer between your ears to ensure the physical serial and the serial reported by the system match, etc etc.

[root@jetstore] ~# sas2ircu list
LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.


         Adapter      Vendor  Device                       SubSys  SubSys
 Index    Type          ID      ID    Pci Address          Ven ID  Dev ID
 -----  ------------  ------  ------  -----------------    ------  ------
   0     SAS2308_2     1000h    87h   00h:06h:00h:00h      1000h   3020h

         Adapter      Vendor  Device                       SubSys  SubSys
 Index    Type          ID      ID    Pci Address          Ven ID  Dev ID
 -----  ------------  ------  ------  -----------------    ------  ------
   1     SAS2308_2     1000h    87h   00h:81h:00h:00h      1000h   3020h
SAS2IRCU: Utility Completed Successfully.

Back to the sas2ircu utility in a moment. We need to first acquire the serial number of the failed disk. For a system that is multipath, we can find the actual dev names by running the following to locate a disk in the fail state:

[root@jetstore] ~# gmultipath list | grep -i -B 10 fail
Consumers:
1. Name: da43
   Mediasize: 3000592982016 (2.7T)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
2. Name: da16
   Mediasize: 3000592982016 (2.7T)
   Sectorsize: 512
   Mode: r1w1e1
   State: FAIL

Now we can see da16 is failed. Time to get the serial number of that disk. Querying da43 works too; they are the same disk, just different paths.
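
On a box with many disks, the grep with context lines gets noisy. As a rough alternative, the sketch below prints only the consumer names whose state is FAIL. It assumes the gmultipath list layout shown above, where each consumer starts with a numbered Name: line and is followed by its State: line. The function name failed_paths is hypothetical.

```shell
# failed_paths (hypothetical helper): read `gmultipath list` output on stdin
# and print the name of every consumer whose State is FAIL.
failed_paths() {
    awk '
        $2 == "Name:" { name = $3 }                    # e.g. "1. Name: da43"
        $1 == "State:" && $2 == "FAIL" { print name }  # state line of that entry
    '
}

# Sample copied from the output above.
sample_geom='Consumers:
1. Name: da43
   Mediasize: 3000592982016 (2.7T)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
2. Name: da16
   Mediasize: 3000592982016 (2.7T)
   Sectorsize: 512
   Mode: r1w1e1
   State: FAIL'

printf '%s\n' "$sample_geom" | failed_paths    # prints: da16
```

On the live system this would be run as gmultipath list | failed_paths.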

[root@jetstore] ~# smartctl -a /dev/da16 | grep Serial
Serial number:        WMC1F0D5T1DF

Save that serial number for the next step.

Smartctl also outputs other useful information about the drive, statistics, etc. Worth checking out, but not relevant here.

Next, we can display the disks attached to one of those controllers. Be sure to input the correct serial number in the grep command:

[root@jetstore] ~# sas2ircu 0 display | grep -C 10 WMC1F0D5T1DF

Device is a Hard disk
  Enclosure #                             : 3
  Slot #                                  : 20
  SAS Address                             : 50000c0-f-01f9-f6eb
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 2861588/5860533167
  Manufacturer                            : WD
  Model Number                            : WD3001FYYG-01SL3
  Firmware Revision                       : VR08
  Serial No                               : WDWMC1F0D5T1DF
  GUID                                    : N/A
  Protocol                                : SAS
  Drive Type                              : SAS_HDD

Get the enclosure and slot # of the failed drive (3 and 20 in the output above) and turn the LED on, passing them as enclosure:slot:

sas2ircu 0 locate 3:20 ON

Turn the LED off:

sas2ircu 0 locate 3:20 OFF
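
The enclosure and slot lookup can also be scripted, which removes one chance to mistype a number. This is only a sketch: it assumes the display output lists Enclosure # and Slot # before Serial No within each device block, as in the listing above, and it matches the serial as a substring because sas2ircu reports it with a vendor prefix (WDWMC1F0D5T1DF here, versus WMC1F0D5T1DF from smartctl). The helper name slot_for_serial is made up.

```shell
# slot_for_serial SERIAL (hypothetical helper): read `sas2ircu <n> display`
# output on stdin and print "enclosure:slot" for the matching device.
slot_for_serial() {
    awk -F ':' -v serial="$1" '
        /Enclosure #/ { v = $2; gsub(/[ \t]/, "", v); encl = v }
        /Slot #/      { v = $2; gsub(/[ \t]/, "", v); slot = v }
        # index() does a substring match, so the vendor prefix is harmless.
        /Serial No/ && index($2, serial) > 0 { print encl ":" slot }
    '
}

# Sample copied from the display output above (shortened).
sample_display='Device is a Hard disk
  Enclosure #                             : 3
  Slot #                                  : 20
  SAS Address                             : 50000c0-f-01f9-f6eb
  State                                   : Ready (RDY)
  Serial No                               : WDWMC1F0D5T1DF
  Protocol                                : SAS'

printf '%s\n' "$sample_display" | slot_for_serial WMC1F0D5T1DF    # prints: 3:20
```

Against a real controller the input would come from sas2ircu 0 display, and the printed value is what the locate command expects. It is still worth comparing the result against the physical label before pulling anything.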

NOTE: If you are replacing a disk that is multipath (e.g. you see something like the following when you offline and remove a disk), ensure that the LED above is OFF, or GEOM_MULTIPATH will not pick up the new disk as multipath. See the logs below for what happens when a disk is inserted with the LED blinking vs. not blinking:

----------start drive detach event (already offline)------------
Aug 14 14:05:31 jetstore mps1: mpssas_prepare_remove: Sending reset for target ID 27
Aug 14 14:05:31 jetstore da43 at mps1 bus 0 scbus10 target 27 lun 0
Aug 14 14:05:31 jetstore da43: <WD WD3001FYYG-01SL3 VR08> s/n         WMC1F0D5T1DF detached
Aug 14 14:05:31 jetstore GEOM_MULTIPATH: da43 in disk17 was disconnected
Aug 14 14:05:31 jetstore mps1: GEOM_MULTIPATH: all paths in disk17 were marked FAIL, restore da16
Aug 14 14:05:31 jetstore Unfreezing devq for target ID 27
Aug 14 14:05:31 jetstore GEOM_MULTIPATH: da16 is now active path in disk17
Aug 14 14:05:31 jetstore GEOM_MULTIPATH: da43 removed from disk17
Aug 14 14:05:31 jetstore (da43:mps1:0:27:0): Periph destroyed
Aug 14 14:05:31 jetstore mps0: mpssas_prepare_remove: Sending reset for target ID 38
Aug 14 14:05:31 jetstore da16 at mps0 bus 0 scbus2 target 38 lun 0
Aug 14 14:05:31 jetstore da16: <WD WD3001FYYG-01SL3 VR08> s/n         WMC1F0D5T1DF detached
Aug 14 14:05:31 jetstore GEOM_MULTIPATH: da16 in disk17 was disconnected
Aug 14 14:05:31 jetstore mps0: GEOM_MULTIPATH: out of providers for disk17
Aug 14 14:05:31 jetstore Unfreezing devq for target ID 38
Aug 14 14:05:31 jetstore GEOM_MULTIPATH: da16 removed from disk17
Aug 14 14:05:31 jetstore GEOM_MULTIPATH: destroying disk17
Aug 14 14:05:31 jetstore GEOM_MULTIPATH: disk17 destroyed
Aug 14 14:05:31 jetstore (da16:mps0:0:38:0): Periph destroyed
----------end detach event-------------

----------start insert with LED BLINKING - note no GEOM_MULTIPATH----------
Aug 14 14:10:27 jetstore da16 at mps0 bus 0 scbus2 target 50 lun 0
Aug 14 14:10:27 jetstore da16: da43 at mps1 bus 0 scbus10 target 39 lun 0
Aug 14 14:10:27 jetstore syslog-ng[1426]: Error processing log message: <WD WD3001FYYG-01SL3 VR08> Fixed Direct Access SPC-4 SCSI device
Aug 14 14:10:27 jetstore da43: da16: Serial Number         WMC1F0D9UX1U
Aug 14 14:10:27 jetstore syslog-ng[1426]: Error processing log message: <WD WD3001FYYG-01SL3 VR08> Fixed Direct Access SPC-4 SCSI device
Aug 14 14:10:27 jetstore da16: 600.000MB/s transfersda43: Serial Number         WMC1F0D9UX1U
Aug 14 14:10:27 jetstore da43: 600.000MB/s transfersda16: Command Queueing enabled
Aug 14 14:10:27 jetstore da16: 2861588MB (5860533168 512 byte sectors)
Aug 14 14:10:27 jetstore da43: Command Queueing enabled
Aug 14 14:10:27 jetstore da43: 2861588MB (5860533168 512 byte sectors)
Aug 14 14:10:27 jetstore ses3: da43,pass47: Element descriptor: 'Slot 21'
Aug 14 14:10:27 jetstore ses0: da16,pass18: Element descriptor: 'Slot 21'
Aug 14 14:10:27 jetstore ses3: da43,pass47: SAS Device Slot Element: 1 Phys at Slot 20
Aug 14 14:10:27 jetstore ses0: da16,pass18: SAS Device Slot Element: 1 Phys at Slot 20
Aug 14 14:10:27 jetstore ses3:  phy 0: SAS device type 1 id 0
Aug 14 14:10:27 jetstore ses0:  phy 0: SAS device type 1 id 1
Aug 14 14:10:27 jetstore ses3:  phy 0: protocols: Initiator( None ) Target( SSP )
Aug 14 14:10:27 jetstore ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
Aug 14 14:10:27 jetstore ses3:  phy 0: parent 50030480003c273f addr 50000c0f0137b686
Aug 14 14:10:27 jetstore ses0:  phy 0: parent 50030480003c27bf addr 50000c0f0137b687

-------end insert with LED BLINKING-------


------start insert with LED off----------------

Aug 14 14:28:53 jetstore da16 at mps0 bus 0 scbus2 target 50 lun 0
Aug 14 14:28:53 jetstore da43 at mps1 bus 0 scbus10 target 39 lun 0
Aug 14 14:28:53 jetstore da16: da43: <WD WD3001FYYG-01SL3 VR08> Fixed Direct Access SPC-4 SCSI device
Aug 14 14:28:53 jetstore syslog-ng[1426]: Error processing log message: <WD WD3001FYYG-01SL3 VR08> Fixed Direct Access SPC-4 SCSI device
Aug 14 14:28:53 jetstore da16: Serial Number         WMC1F0D9UX1U
Aug 14 14:28:53 jetstore da43: Serial Number         WMC1F0D9UX1U
Aug 14 14:28:53 jetstore da16: 600.000MB/s transfersda43: 600.000MB/s transfers
Aug 14 14:28:53 jetstore da16: Command Queueing enabled
Aug 14 14:28:53 jetstore da43: Command Queueing enabled
Aug 14 14:28:53 jetstore da16: 2861588MB (5860533168 512 byte sectors)
Aug 14 14:28:53 jetstore da43: 2861588MB (5860533168 512 byte sectors)
Aug 14 14:28:53 jetstore ses3: da43,pass47: Element descriptor: 'Slot 21'
Aug 14 14:28:53 jetstore ses0: da16,pass18: Element descriptor: 'Slot 21'
Aug 14 14:28:53 jetstore ses3: da43,pass47: SAS Device Slot Element: 1 Phys at Slot 20
Aug 14 14:28:53 jetstore ses0: da16,pass18: SAS Device Slot Element: 1 Phys at Slot 20
Aug 14 14:28:53 jetstore ses3:  phy 0: SAS device type 1 id 0
Aug 14 14:28:53 jetstore ses0:  phy 0: SAS device type 1 id 1
Aug 14 14:28:53 jetstore ses3:  phy 0: protocols: Initiator( None ) Target( SSP )
Aug 14 14:28:53 jetstore ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
Aug 14 14:28:53 jetstore ses3:  phy 0: parent 50030480003c273f addr 50000c0f0137b686
Aug 14 14:28:53 jetstore ses0:  phy 0: parent 50030480003c27bf addr 50000c0f0137b687
Aug 14 14:29:07 jetstore GEOM_MULTIPATH: disk17 created
Aug 14 14:29:07 jetstore GEOM_MULTIPATH: da16 added to disk17
Aug 14 14:29:07 jetstore GEOM_MULTIPATH: da16 is now active path in disk17
Aug 14 14:29:07 jetstore GEOM_MULTIPATH: da43 added to disk17

------end insert with LED off----------------
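
The difference between the two insert events comes down to whether the GEOM_MULTIPATH "added to" lines show up. A small sketch for checking that after a swap, assuming syslog lines of the form shown above. The function name paths_added is hypothetical, and the log file location mentioned afterwards is an assumption that depends on the syslog configuration.

```shell
# paths_added LABEL (hypothetical helper): read syslog text on stdin and print
# each device that GEOM_MULTIPATH reported as added to LABEL.
paths_added() {
    awk -v label="$1" '
        # e.g. "... GEOM_MULTIPATH: da16 added to disk17"
        /GEOM_MULTIPATH:/ && $(NF-2) == "added" && $NF == label { print $(NF-3) }
    '
}

# Lines copied from the "LED off" insert event above.
sample_syslog='Aug 14 14:28:53 jetstore ses0:  phy 0: parent 50030480003c27bf addr 50000c0f0137b687
Aug 14 14:29:07 jetstore GEOM_MULTIPATH: disk17 created
Aug 14 14:29:07 jetstore GEOM_MULTIPATH: da16 added to disk17
Aug 14 14:29:07 jetstore GEOM_MULTIPATH: da16 is now active path in disk17
Aug 14 14:29:07 jetstore GEOM_MULTIPATH: da43 added to disk17'

printf '%s\n' "$sample_syslog" | paths_added disk17
```

With the sample this lists da16 and da43, one per line. Run against the log from the blinking-LED insert it prints nothing, which is the symptom described in the note above. On the storage box the input would be the system log, for example /var/log/messages if that is where kernel messages land.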