Alternative ways to provide High Availability for VMware Platform Services Controller (PSC)?

Over the past few months I have been looking at whether there is any value in designing a highly available Platform Services Controller (PSC) setup for our new vSphere 6 platform. One of the functions of a PSC is to provide authentication for VMware's product suite. For example, without a working PSC I would not be able to access my vCenter until the PSC is fixed or restored. I would lose management of the environment, but the VMs that are already running would not be affected.

To set up PSC in a high availability configuration for your vCenter you need to use one of the supported third-party load balancers that VMware recommends. Below are the pros and cons as we saw them at the time of design:

Note: At the time of writing, the only supported load balancing policy is Active/Passive. An active/active configuration is not supported.

Pros:
  • Provides an "always on" PSC service
  • Provides automatic failover in the event of a failure or during PSC upgrades, so that the management of vCenter is not affected

Cons:
  • Could be costly if you do not already have a load balancer infrastructure in place
  • Our virtual infrastructure team would need a good understanding of how load balancers work so that they could troubleshoot in the event of a problem
  • Could complicate troubleshooting, as the load balancers are not managed within our team. We would rely on other teams' availability to help us
  • VMware would provide only limited support for the configuration of your load balancer should problems arise
  • The pair of PSCs in a high availability setup would need to be of the same type, i.e. you cannot have one appliance and one Windows-based PSC in a pair
  • Only LTM (Local Traffic Manager) is supported and not cross-site GTM (Global Traffic Manager), so if you had two sites you would need another pair of load balancers


Could we do a Standby PSC?

With the release of vSphere 6 Update 1 there is a command-line tool that lets us repoint an instance of vCenter to a working PSC (within a site or cross-site), which gives us the option of keeping a manual standby PSC instance. Because PSCs use multi-master replication (like Active Directory), the data and configuration are technically always replicated between them as soon as there are any changes.

So in the event of a failure we could manually repoint our vCenter to the standby PSC.

Two KB articles describe how to repoint your vCenter: KB2113917 covers repointing within the same site and KB2131191 covers repointing across sites.

So as you can see there is an official way to repoint your vCenter to another Platform Services Controller (PSC). If you have monitoring software in place then you could possibly monitor the PSC VM and if it is down then run a script to repoint your vCenter to a healthy PSC.
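As a rough illustration of the monitoring idea, the following Python sketch probes a PSC over TCP and only proposes a repoint after several consecutive failed probes. The TCP probe on port 443, the threshold of three and the function names are assumptions made for illustration. `cmsso-util repoint --repoint-psc` is the same-site command covered in KB2113917 and would be run on the vCenter Server itself; the sketch only builds the command and leaves how it gets executed (for example over SSH) to your own tooling. Cross-site repointing follows the separate steps in KB2131191 and is not covered here.

```python
import socket
from typing import Callable, List


def tcp_probe(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def should_failover(history: List[bool], threshold: int = 3) -> bool:
    """Return True only if the last `threshold` probes all failed."""
    if len(history) < threshold:
        return False
    return not any(history[-threshold:])


def build_repoint_command(standby_psc: str) -> List[str]:
    """Build the same-site repoint command (see KB2113917)."""
    return ["cmsso-util", "repoint", "--repoint-psc", standby_psc]


def evaluate(
    active_psc: str,
    standby_psc: str,
    history: List[bool],
    probe: Callable[[str], bool] = tcp_probe,
    threshold: int = 3,
) -> List[str]:
    """Probe the active PSC once, record the result, and return the
    repoint command if the failure threshold has been reached.
    Returns an empty list when no action is needed."""
    history.append(probe(active_psc))
    if should_failover(history, threshold):
        return build_repoint_command(standby_psc)
    return []
```

Calling `evaluate` once per monitoring interval with a shared `history` list means a single dropped probe does not trigger a repoint. A reachable port does not prove the PSC services are healthy, so in practice you may want a stronger check from your monitoring software.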

Below are two possible designs for my environment, which we will test in our lab to make sure they work:

Option 1: At each site I have two PSCs, one active and the other on standby
  • The local vCenter is connected to one of the local Platform Services Controllers. If the connected one fails then I have a standby one I can fail over to
  • If the links between the two sites fail then I can still run my vCenters independently and have local resiliency for my Platform Services Controllers
  • If both of my local Platform Services Controllers fail then I can repoint to one at the remote site and still get my vCenter back up and running. I wouldn't recommend it, though: if the links are slow or have high latency it could cause issues
  • The downside is that I would need to manage and maintain four PSCs and ensure the standby ones are working (maybe switch over once a month?)
  • We would need to apply anti-affinity rules to keep the two PSCs at each site on separate hosts

Option 2: At each site just one PSC, both active for their respective sites
  • If the links between the two sites fail then I can still run my vCenters independently
  • If my local Platform Services Controller fails then I can repoint to the remote site one and still get my vCenter back up and running. I wouldn't recommend it, though: if the links are slow or have high latency it could cause issues
  • Simple design (fewer PSCs), and each PSC is always in use so I don't need to do switchovers to test
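The failover order implied by both options (prefer a healthy PSC at the local site, fall back to the remote site only if nothing local is left) can be sketched as a small selection function. This is only an illustrative model: the `Psc` class, the site labels and the PSC names are hypothetical, and the health flag is assumed to come from whatever monitoring you already have.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Psc:
    name: str      # hypothetical PSC name, e.g. "psc-a1"
    site: str      # site label, e.g. "A" or "B"
    healthy: bool  # result of your own health monitoring


def pick_repoint_target(
    vcenter_site: str, current_psc: str, pscs: List[Psc]
) -> Optional[Psc]:
    """Pick a PSC to repoint to: same-site first, then any remote one.
    Returns None when no healthy alternative exists."""
    candidates = [p for p in pscs if p.healthy and p.name != current_psc]
    local = [p for p in candidates if p.site == vcenter_site]
    if local:
        return local[0]
    return candidates[0] if candidates else None


def is_cross_site(vcenter_site: str, target: Psc) -> bool:
    """True when the repoint would cross sites (KB2131191 territory)."""
    return target.site != vcenter_site
```

With Option 1 the function returns the local standby while it is healthy; with Option 2 there is never a local candidate, so any failover is cross-site by construction and carries the link latency caveat noted above.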

We decided to use appliances for the Platform Services Controllers instead of Windows installs because we expect supportability, security and upgrades to be easier when everything is under one vendor, VMware. When we have issues we won't need to involve other teams such as Windows and Security.

Hopefully this gives you some ideas on the possible solutions and workarounds for your very important Platform Services Controller. We are now going to test our possible designs in our lab and verify that they are workable. I will update this post once we have finished our testing and provide you with the scenarios that we tested against. Once again, we are looking at this from a point of view where we are only using vCenter with PSC. If you have other VMware products using PSC as well, such as vROps or SRM, then you will need to investigate further to make sure all the other products support this method.


