Friday, 24 January 2020

End of Life vSphere 6.0 - My upgrade plan


vSphere 6.0 reaches end of general support on 12th March 2020, which is just under two months away, and most people will have started or completed their upgrade to vSphere 6.5 or 6.7. Below are some of the high-level steps and considerations from my own planning.

As vSphere 6.5 and 6.7 both have an end of life date of 15th November 2021, I looked at both versions for my upgrade plan, with 6.7 as my first choice. I looked at both in case any hardware or software was not supported on a particular version. Throughout my planning I also considered whether a fresh install of vCenter/ESXi would be better than an upgrade. For a small environment with little integration a fresh install might be easier, but that is my opinion and you will need to weigh up what is better for you.

The first thing is to make sure your current server model, and more specifically the CPU, is supported on vSphere 6.5 and/or 6.7. If it isn’t supported on the version you want to move to, you either need to get funding for new kit quickly or move to the version that is supported for now. You can check this in the VMware Hardware Compatibility Guide.

Once you have the all clear that the server model and CPU are compatible with the version you are upgrading to, check with the hardware vendor that the local components such as array controllers and network cards are also compatible, as you may need to replace them or upgrade their firmware.

As the virtual infrastructure also relies on other hardware such as network switches and storage arrays, check with those vendors that you don’t need to upgrade or replace anything there as well. For example, Pure Storage publishes the best practice settings you should apply, and I usually raise a support case with them to double check. As vSphere 6.5/6.7 has been out for a while, most vendors will have published best practice guides on how to configure their products to work nicely with VMware.

That covers the hardware. The next step is to look at the third party software (backup, monitoring, plugins etc.) that you have integrated with VMware and see whether your current version is compatible with vSphere 6.5/6.7. Again, you might need to upgrade, or purchase a new version if you don't have a valid support agreement. You don’t want to upgrade and then realise afterwards that you can’t do any VM backups, or that the backup method has caveats which break your SLA.

The next step is to check whether any other VMware products you currently use will reach end of life at the same time as vSphere 6.0. The VMware Lifecycle matrix lists the dates.

The following products have the same end of life date, so start reading up on their upgrade process and make sure you include them in your plan.

  • vCenter Server 6.0
  • vCenter Update Manager 6.0
  • ESXi 6.0
  • Site Recovery Manager 6.0 and 6.1
  • vSAN 6.0, 6.1 and 6.2
  • vSphere Data Protection 6.0 and 6.1
  • vSphere Replication 6.0 and 6.1

At this stage you will have worked out which version of vSphere you are upgrading to. We now need to see whether the 6.0 version deployed in your environment can reach that version in a single step or needs a multi-step upgrade. The Upgrade Path page shows which versions you can upgrade to.


As you can see from the example, some versions are marked with a big red cross. For example, if you were on 6.0 U3 and wanted to go to 6.5.0 then it is a big no-no. If you hover over the box, an information pop-up explains the reason.

If you use any of the other products listed above, check them with the same tool.

On the same web page there is a "Solution/Database interoperability" tab, which you might need if you decide not to use the VMware Postgres database that comes with the appliance. On the tab, select "VMware vCenter Server" as the solution, then the version of vCenter you are looking to move to, and see which database versions are supported.


On the same page, the first tab, "Interoperability", lets you compare two solutions to make sure that the upgraded version will work with the other VMware solutions you have running.

As you can see I have selected vCenter 6.5 U3 and 6.7 U3 and my solution was SRM. You can see that with vSphere 6.5 U3 I can run SRM version 6 and 8 but for vSphere 6.7 U3 I can only run version 8.1.2 and 8.2.

If you have more than one VMware product to upgrade as part of this project, check the update sequence to ensure that you upgrade them in the right order. VMware has produced a nice chart showing the order to follow: for vSphere 6.5 see KB2147289 and for vSphere 6.7 see KB53710.

By now you will roughly know what you need to do to upgrade from vSphere 6.0 to 6.5 or 6.7. A step I sometimes forget is to check whether I actually have the capacity to deploy these products. For example, a side by side vCenter upgrade may require additional storage, network bandwidth and compute, so make sure you reserve some for the upgrade.

VMware has stated that the external Platform Services Controller architecture is deprecated and that they are moving to the embedded model to simplify things. So if you haven't fully moved to the VMware appliance yet, this is a good time to do so. The next major version of vSphere won’t have a Windows option, which was announced back in August 2017 (Bye Bye Windows).

The article KB60229 provides workarounds and supported methods for the Platform Services Controller. It also links to the upgrade guides for 6.5/6.7, which cover pretty much every possible Platform Services Controller architecture you might have now and the options you have when you upgrade to 6.5 or 6.7. It's a bit of a read but do spend time on it.

The next step is to plan what testing you want to carry out after your upgrades:
  • backup and restore process of vCenter
  • PowerShell scripts that you might normally run
  • vCenter alerts ensuring they do fire out
  • host failure to check that HA works
  • Move VMs between hosts to ensure vMotion works, especially between different ESXi versions
These are just some examples and your list could be much longer.
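
A couple of these checks can be scripted. The following is only a rough PowerCLI sketch, assuming the VMware PowerCLI module is installed; the vCenter, VM and host names are made-up placeholders and not values from my environment.

```powershell
# Assumes VMware PowerCLI is installed. All names below are hypothetical placeholders.
$vCenter = 'vcenter.example.local'
Connect-VIServer -Server $vCenter

# List every host with its ESXi version and build to spot mixed-version clusters.
Get-VMHost | Sort-Object Name | Select-Object Name, Version, Build, ConnectionState

# vMotion a test VM to a host running a different ESXi version.
$vm     = Get-VM -Name 'test-vm01'
$target = Get-VMHost -Name 'esxi02.example.local'
Move-VM -VM $vm -Destination $target

# Confirm which host the VM is running on afterwards.
Get-VM -Name 'test-vm01' | Select-Object Name, VMHost, PowerState

Disconnect-VIServer -Server $vCenter -Confirm:$false
```

Use a VM you can afford to disrupt, and repeat the move in both directions if you are testing between ESXi versions.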

So these are the steps I took to plan my upgrade, which in the end was to 6.7. I have listed some of the resources I used along the way, which I hope will be of some use to you. Good luck!
Articles

Monday, 20 January 2020

Azure Preview Portal

Microsoft Azure has a preview portal at https://preview.portal.azure.com which they use to showcase some of the changes they have coming up to improve your experience of the Azure portal. Things like improved search, navigation, themes and new resource icons all popped up there before hitting the main site.

I remember that some preview features and products, such as the Bastion host, only worked via the preview site; you were unable to see the feature on the main site. So do log in to the preview site to manage your resources from time to time for a sneak preview of what changes might be coming.

You will know you are on the preview site as the header will say Microsoft Azure (Preview).

Saturday, 18 January 2020

vSphere 6.7 supports backup over SMB

SMB was introduced as a backup protocol in vSphere 6.7 U2, allowing you to back up to a file share. This was excellent news for us, so I decided to test it and see how it works. Upon setting up the backup schedule I got this error
So I attempted the one-off "Backup" option and got this message instead

Neither message pointed me in any particular direction, so I did a quick search of VMware's knowledge base and found article KB70646, which states that only SMB1 is supported; this is still not fixed in U3 (tested on build number 15132721). We all know SMB1 is insecure and shouldn't be used, but I still wanted to see how it worked, so first I enabled SMB1 on my server. For the PowerShell cmdlets to run, see this Microsoft article, which covers SMB1, SMB2 and SMB3.
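
As a rough sketch of what that looks like on the Windows file server hosting the share (run from an elevated PowerShell session; depending on the Windows version the SMB1 feature may also need to be installed first, which the Microsoft article covers):

```powershell
# Show whether the SMB server currently accepts SMB1 and SMB2/3 connections.
Get-SmbServerConfiguration | Select-Object EnableSMB1Protocol, EnableSMB2Protocol

# Turn SMB1 on for the test. SMB1 is insecure, so only do this in a lab.
Set-SmbServerConfiguration -EnableSMB1Protocol $true

# Turn SMB1 back off once the test is finished.
Set-SmbServerConfiguration -EnableSMB1Protocol $false
```

Both Set commands prompt for confirmation before changing the server configuration.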

Once I had SMB1 enabled I attempted to set the backup schedule. When filling in the "Backup server credentials" section I expected to include my domain name, such as "myvmx\yungk" or "[email protected]", in the "user name" field, as the share I created used AD permissions to control access. Using either format gave me the same error messages as above, and it appears the only way to make it work was to use just the account name, i.e. "yungk", which worked immediately.

Using SMB for your VCSA backup is not ideal yet unless you are willing to use SMB1, in which case I am pretty sure your security team will be knocking on your door telling you not to use it.








Thursday, 16 January 2020

DHCP Error : Delete failover relationship failed. Error 1722


I came across an issue when trying to delete a failover relationship between two DHCP servers (Windows 2012 R2) because one of the servers had a hardware fault and we couldn’t recover it. We knew we had to build a new one, so the first step I tried was to mark the failed server as “partner down” in the relationship.

Upon clicking the button I got this message, which looked promising

And then the final message to tell me it's not going to happen

Depending on how you set your DHCP failover mode you could run into issues. My setting was “Load Balance Mode” with a 50/50 split, and I was unable to change the partner status to down.

This is Microsoft's explanation of the difference between “Lost contact” and “Partner down”:
In load balancing mode, when a DHCP server loses contact with its failover partner it will begin granting leases to all DHCP clients. If it receives a lease renewal request from a DHCP client that is assigned to its failover partner, it will temporarily renew the same IP address lease for the duration of the MCLT. If it receives a request from a client that was not previously assigned a lease, it will grant a new lease from its free IP address pool until this is exhausted, and then it will begin using the free IP address pool of its failover partner. If the DHCP server enters a partner down state, it will wait for the MCLT duration and then assume responsibility for 100% of the IP address pool.

As you can see, if you do not change the status of the failed server to “Partner down” then the working DHCP server will not take responsibility for the full 100% of the scope. You need to make sure that on average you don’t use more than 50% of the scope, so that if one of the DHCP servers has an issue you don’t need to panic to fix it quickly. This recommendation is my personal view and only applies to the settings I describe in this article.
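
One way to keep an eye on that 50% figure is to query the scope statistics. This is a minimal sketch, assuming the DhcpServer PowerShell module is available; the server name is a made-up placeholder and the threshold of 50 is just my rule of thumb from above.

```powershell
# Hypothetical server name. Replace it with the FQDN of your working DHCP server.
$server = 'dhcp01.example.local'

# List the IPv4 scopes that use more than half of their addresses, busiest first.
Get-DhcpServerv4ScopeStatistics -ComputerName $server |
    Where-Object { $_.PercentageInUse -gt 50 } |
    Sort-Object PercentageInUse -Descending |
    Select-Object ScopeId, InUse, Free, PercentageInUse
```

An empty result means every scope on that server is at or below the threshold.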

As I couldn’t change the status of the failed server, I decided to try deleting the relationship. We would need to create a new relationship anyway, as the new server might not use the same server name.

So back in the IPv4 properties of the DHCP server, under the failover tab, I hit the delete button for the relationship

I then followed the process to have all scopes removed from the partner server


You then get the message below, which states the obvious: the other DHCP server is not contactable or not running, with an error code of 1722


At this point I did a bit of googling on error code 1722 and looked at the event viewer but didn’t find anything useful. I then did a quick search to see if there was a command line way to delete the relationship and came across a forum where someone used this cmdlet:
remove-dhcpserverv4failover -computername %ComputerName% -Name %RelationshipName% -Force

Replace %ComputerName% with the FQDN of the working DHCP server and %RelationshipName% with the name of the relationship that you would like to delete.
Once you run the command you will receive an error message like the one below saying the relationship has not been deleted, but when you look back at the failover properties it is deleted.
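
If you would rather confirm the result from PowerShell than from the GUI, something like the sketch below should do it. The server and relationship names are made-up placeholders, and the comment about -Force reflects my understanding of the cmdlet rather than anything I have verified beyond this one incident.

```powershell
# Hypothetical values. Replace them with your working DHCP server and relationship name.
$server       = 'dhcp01.example.local'
$relationship = 'dhcp01-dhcp02'

# As I understand it, -Force lets the relationship be removed from the local
# server even though the partner cannot be reached. Expect an error to be shown.
Remove-DhcpServerv4Failover -ComputerName $server -Name $relationship -Force

# List the failover relationships that remain on the working server.
# No output for the relationship name means it has been deleted.
Get-DhcpServerv4Failover -ComputerName $server -ErrorAction SilentlyContinue |
    Where-Object { $_.Name -eq $relationship }
```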



Hopefully this helps anyone who has struggled to delete a failover relationship, whether because a server has failed or because a DHCP database was restored onto a different server. It all comes down to using PowerShell 😊 to save the day.

Azure Resource Support for Availability Zone

Over the years, an increasing number of services are consumed in the cloud and as architects one of the key considerations is designing the ...