Last Updated on January 30, 2024 by rudyooms
This blog is about why “things” can break on your fully TPM-protected, Autopilot-enrolled, Azure AD joined device when its system board gets replaced!
I will divide this blog into multiple parts
- The Situation before the System Board replacement
- The Situation after the System Board replacement (from the user's perspective)
- The Situation after the System Board replacement (from the IT admin's perspective)
- The wonderful story behind it
- Solving the broken Authentication?
- What happens when you have no TPM?
- Important stuff to know!
It’s all about the Device Key (DKPriv/DKPub), the Storage / Transport key (TKPub/TKPriv), and something with the thing called TPM. Blog done!
No, I am just kidding! But before we continue, I would recommend you read these next few blogs first… I know, I know, it could be a lot to digest… but I guess it’s worth the time (and maybe the brain freeze?)
To start with a small summary from the blogs above, we can assume that the MS-Organization-Access certificate, AKA the Device Certificate, is quite important. Without that certificate, there will be no communication whatsoever!
2. The Situation before
Let’s start with getting to know the situation before we decide to swap the system boards. To do so, we need to take a look at some device details. First with the use of DSRegcmd /status
In the picture above you will notice the device is of course TPM protected, and it shows the device ID and the thumbprint of the important Device Certificate. A small warning before you pull out the existing system board and replace it: please make sure you turn BitLocker off.
Or make 100% sure you have a working recovery key, because otherwise you will end up in the BitLocker recovery console, where you need to enter that recovery key. You can guess what you will need to tell the customer when you don’t have the recovery keys available.
Also, please make a note of the serial number by entering this command: WMIC BIOS GET SERIALNUMBER. That way you can be sure you remove the right old hardware hash from the allow list. I will tell you why in the important stuff-to-know part!
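If you want to keep a "before" snapshot that you can compare against later, a small script can pull the interesting fields out of the dsregcmd /status text. This is just a minimal sketch: the sample output below is shortened and made up (the IDs and thumbprint are not real), and the function names are my own, not part of any Microsoft tool.

```python
import re

# Hypothetical, shortened sample of `dsregcmd /status` output.
# The device ID and thumbprint are invented for illustration only.
SAMPLE_STATUS = """
+----------------------------------------------------------------------+
| Device State                                                         |
+----------------------------------------------------------------------+

             AzureAdJoined : YES
                  DeviceId : 11111111-2222-3333-4444-555555555555
                Thumbprint : 0123456789ABCDEF0123456789ABCDEF01234567
              TpmProtected : YES
"""


def parse_dsregcmd_status(text):
    """Return every 'Name : Value' pair found in dsregcmd-style output."""
    fields = {}
    for line in text.splitlines():
        # Header and border lines start with '+' or '|' and are skipped.
        match = re.match(r"^\s*([A-Za-z][\w ]*?)\s*:\s*(.*?)\s*$", line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def snapshot(text, wanted=("DeviceId", "Thumbprint", "TpmProtected")):
    """Keep only the fields worth writing down before the board swap."""
    fields = parse_dsregcmd_status(text)
    return {name: fields.get(name) for name in wanted}


print(snapshot(SAMPLE_STATUS))
```

Save the output of the real command to a text file, feed it to snapshot(), and you have something to compare against once the new board is in.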
3. The Situation after (user Perspective)
Okay, so we replaced the device’s system board, booted up the device, and logged in. Yes, you read that right! We can still log in with our Azure AD account, even while the whole device trust is blown.
Do you know what’s funny? We can even still log in with an Azure AD user who has never logged in to the device before. Why? Because the Device Certificate is still available and with it, the Azure AD join is still a little bit alive.
Beware, because after logging in you will be prompted with some errors, that’s for sure. The first warning shows up right after you log in: you will need to contact your IT admin… I guess it’s a good thing that you are the IT admin who is replacing the system board!
Luckily, everything you try to access while your device isn’t trusted will be blocked, and a nice error will be shown. I will guide you through a few.
3.1 Company Portal
Let’s start with the famous company portal. This App is just fantastic.. as always I have done a blog about it.
Just try to open the company portal yourself, good luck with that! A nice login error will occur.
3.2 Opening Office
Also opening the Office 365 Apps will just prompt you for your credentials/sign-in instead of signing you in with the nice PRT SSO functionality you got before the System board replacement.
3.3 Opening OneDrive
But I am not done yet, as I also need to show you what beautiful errors you will get when opening OneDrive. This will result in a nice error 80090016
Also, another warning will be shown, telling you there was a problem signing you in with the error 0x8004deb4
4. The Situation after (IT Admin Perspective)
Now that we have witnessed the nice errors the end user runs into after logging in, let’s take a better look at what a good IT admin could notice.
4.1 Dsregcmd /status /debug
I guess this command is well-known by every IT pro that is doing some Azure Ad troubleshooting. So let’s see what it is telling us as the IT admin!
You will notice multiple errors in the picture above, and every error is telling you something is really broken. Let’s start with these two:
Error PrivateKeySigntest 0x80090016 (isNGcTransportKeyerror: False)
DeviceAuthStatus : FAILED, Error:8007013d
Please note: The DeviceAuthStatus field was added in the Windows 10 May 2021 update (version 21H1).
Mmm, again that 80090016 error… And the PrivateKeySignTest error doesn’t sound too good either. You will also notice the failed sign key test when looking at the diagnostic data fields.
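If you are checking more than one device, it can help to split that DeviceAuthStatus value into a status and an error code. Below is a small sketch, assuming the value looks like the line quoted above (status, then an optional "Error:" followed by a hex code). The function name is hypothetical.

```python
import re


def parse_device_auth_status(value):
    """Split a DeviceAuthStatus value into (status, error_code or None).

    Assumes the shape quoted in this blog, e.g. 'FAILED, Error:8007013d'.
    A '.' instead of the ',' is accepted as well, just in case.
    """
    pattern = r"^\s*(\w+)\s*(?:[,.]\s*Error\s*:\s*(?:0x)?([0-9A-Fa-f]+))?"
    match = re.match(pattern, value)
    if not match:
        raise ValueError(f"unrecognized DeviceAuthStatus value: {value!r}")
    status, code = match.group(1), match.group(2)
    # Normalize the code to the 0x........ notation used for the other errors.
    return status, f"0x{code.upper()}" if code else None


print(parse_device_auth_status("FAILED, Error:8007013d"))
print(parse_device_auth_status("SUCCESS"))
```

Anything other than a plain SUCCESS is your cue to keep reading.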
4.2 The Famous Device Certificate
As every screenshot and almost every error is telling us there is something wrong with the private key or transport keys, shall we check whether the MS-Organization-Access certificate (aka the device certificate) still looks valid?
Because if you have been reading my older blog about Autopilot and the Lost Azure Ad Join and especially part 8, you will know that without this device certificate your Azure Ad Join is screwed! So, I guess it’s very important you don’t mess with it or delete it…
So let’s take a look at the MS-Organization-Access certificate. It’s still there where it should be and it does look like there are no missing private keys? The key icon is still there as shown below.
4.3 Event logs
Almost forgot to mention the event logs. Everyone knows that looking at the event logs can really help you out, so let’s do so. Open the event log and take a good look at what the nice AADTokenBroker event log tells us. A lot of nice errors can be discovered within all those red error messages.
0x80090016: Keyset does not exist. The error code itself certainly does ring a bell or two as we have seen that error popping up earlier! And again, the error mentions something about the Keyset missing?
And the next error will come!
Looking at the picture above, we see event 1098 and again the 0x80090016 (or 0xc0090016) error. It shows us there are issues getting a token silently.
4.4 MdmDiagnosticsTool
Dsregcmd and the mdmdiagnosticstool go hand in hand when you need to troubleshoot Azure AD join or Intune problems. You can also use this tool to get some more information about the TPM, as I showed you in the TPM attestation series.
So, let’s enter this nice command (mdmdiagnosticstool -area TPM -cab c:\temp\tpm.cab) to see if we can get some good results.
Beautiful, another error we can add to our collection: Element not found: 0x80070490
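By now the collection is getting big enough to lose track of, so here is a tiny lookup sketch. It only contains the codes and descriptions that have shown up in this blog so far. It is not a complete or official error table, and the helper names are made up.

```python
# Error codes exactly as they appeared earlier in this blog.
# The descriptions say where we saw them, nothing more.
SEEN_ERRORS = {
    0x80090016: "Keyset does not exist (OneDrive, dsregcmd, AADTokenBroker log)",
    0x8004DEB4: "OneDrive: there was a problem signing you in",
    0x8007013D: "dsregcmd: DeviceAuthStatus FAILED",
    0x80070490: "mdmdiagnosticstool: Element not found",
}


def normalize(code):
    """Turn '80090016' or '0x80090016' into the same integer."""
    text = code.strip().lower()
    if text.startswith("0x"):
        text = text[2:]
    return int(text, 16)


def where_seen(code):
    """Look up a code in the collection built from this blog."""
    return SEEN_ERRORS.get(normalize(code), "not seen in this blog yet")


print(where_seen("80090016"))
print(where_seen("0x80070490"))
```

The normalize() step matters because some tools print the code with the 0x prefix and some without.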
4.5 Windows Performance Analyzer
In my last blogs about troubleshooting some nice TPM attestation errors, I was using the Windows Performance Analyzer (WPA). So why shouldn’t we use it now?
I started the WPR trace, logged off, and logged in again. After logging in I waited a couple of minutes to be sure I had everything I needed and stopped the tracing. Let’s take a look at what the ETL file tells us.
Certificate Private Key test failed… Funny, that almost looks like this error we’ve seen earlier
This error is also known as -2146893802 (that’s 0x80090016 written as a signed decimal number), and it tells us the TPM operation failed or was invalid
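The decimal and the hexadecimal notation are the same 32-bit value. Tools that print it as a signed integer will show a negative number. A minimal sketch of the conversion in both directions (the function names are my own):

```python
def hresult_to_signed(code):
    """Convert a hex error code such as '0x80090016' to a signed 32-bit int."""
    value = int(code, 16) & 0xFFFFFFFF
    # If the highest bit is set, the signed interpretation is negative.
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def signed_to_hresult(number):
    """Convert a signed 32-bit int back to the usual 0x........ notation."""
    return f"0x{number & 0xFFFFFFFF:08X}"


print(hresult_to_signed("0x80090016"))  # -2146893802
print(signed_to_hresult(-2146893802))   # 0x80090016
```

So whenever a trace shows you a big negative number, convert it first and then search for the hex code.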
5. The Wonderful Story behind it
I already sort of explained the Azure AD join process in the Willys White Glove wonderland blog, but I will focus a little bit more on the Device Certificate part this time, because that certificate is of utmost importance for your Azure AD join.
That way we get a better understanding of who is responsible for, and what is needed for, the creation of the much-needed device certificate!
First, the end of the graphical flow when we are enrolling a nice new device into Azure Ad
And the details behind the flow, but I will explain it a little bit more:
1. The TPM-bound *Device Keys (DkPub/DkPriv) need to be generated to start the request to retrieve the device certificate. After the device keys are generated, a certificate request will be generated by using the DkPub and signed by the DkPriv
*Device Keys are used to identify the device itself
2. After the signing request is created successfully, an additional key pair will be created. This *Transport key (TkPub/TkPriv) will be used to make sure you get your SSO when authenticating to Azure AD and of course to validate the device state during PRT requests. This Transport key is derived from the Storage Root Key of the TPM.
*Transport keys are used to decrypt the session key.
*The Session key is an encrypted symmetric key, generated by the Azure AD authentication service, issued as part of the PRT and acts as proof of possession
3. A device registration request is sent to the Azure Device Registration Service (DRS). In the request, it will send the ID token, the Certificate Request (CSR), and the public part of the transport key (TkPub) along with its attestation data.
4. After Azure DRS receives the request with all the important data attached, the ID token will be validated and the corresponding device object will be created in Azure AD. (When using Autopilot white-glove it will be in a disabled state)
5. Azure DRS will send back the device Certificate to the client and the client will install the device certificate in the Personal Certificate Store. After this step the MDM enrollment will start, but as this is not part of this blog… I will skip it
So, to summarize the flow: the TPM is responsible for creating the keys behind the device certificate. Without the TPM that started the device certificate generation, the device loses access to the Device and Transport keys. Without these keys you will end up with a nice “Keyset does not exist” error 0x80090016, as I showed you in parts 3 and 4 of this blog.
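To make that a bit more tangible, here is a toy model of the idea. Please note this is NOT real cryptography and not how Windows implements it: the classes, names and strings are all invented. It only shows the one thing that matters here, namely that the certificate travels with the disk while the private keys stay behind in the old TPM.

```python
from dataclasses import dataclass, field

KEYSET_DOES_NOT_EXIST = 0x80090016  # the error code quoted in this blog


class KeysetDoesNotExist(Exception):
    """Raised when the current TPM holds no key with the requested name."""


@dataclass
class ToyTpm:
    """Stand-in for a TPM: private keys live here and never leave."""
    serial: str
    _keys: dict = field(default_factory=dict)

    def create_key(self, name):
        # A real TPM generates a key pair; a string is enough for this toy.
        self._keys[name] = f"private-key-of-{self.serial}"
        return f"public-{name}-{self.serial}"

    def sign(self, name, data):
        if name not in self._keys:
            raise KeysetDoesNotExist(hex(KEYSET_DOES_NOT_EXIST))
        return f"signed({data}) by {self._keys[name]}"


@dataclass
class ToyDevice:
    """The disk (and the certificate on it) survives a board swap."""
    tpm: ToyTpm
    certificate: dict = None

    def join(self):
        dk_pub = self.tpm.create_key("DK")  # device key
        tk_pub = self.tpm.create_key("TK")  # transport key
        # The certificate only remembers which key it belongs to.
        self.certificate = {"subject_key": dk_pub, "key_name": "DK", "tk_pub": tk_pub}

    def private_key_sign_test(self):
        try:
            self.tpm.sign(self.certificate["key_name"], "test")
            return "PASSED"
        except KeysetDoesNotExist as error:
            return f"FAILED {error}"


device = ToyDevice(tpm=ToyTpm("old-board"))
device.join()
print(device.private_key_sign_test())  # PASSED

device.tpm = ToyTpm("new-board")       # system board (and TPM) replaced
print(device.private_key_sign_test())  # FAILED 0x80090016
```

The certificate is still sitting there after the swap, which matches what we saw in the certificate store, but nothing on the new board can answer for it.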
6. Solving the Broken Device Authentication?
I will divide this part into multiple subparts
6.1. Trying to fix it!
First, let’s just try a funny thing and launch a dsregcmd /debug /join from an admin session to see if that will fix our issue with the missing keys
I guess not because a lot of errors will be thrown at us at once!
CryptAcquireCertificatePrivateKey failed 0x80090016
PrivateKeyAquireTest failed with error code 0x801c002c.
TestTransportKeyHealth failed with error code 0x80090011
That certainly doesn’t look good… But then again, after you have read the flow from part 5, we know why it’s telling us the Transport and Private Keys are failing us. It’s simply because the TPM has changed and with it, the TPM-bound Device Keys are gone!
Okay, okay… We know it’s broken now, Let’s fix it!
Let’s sign off from the broken Azure Ad account and log back in with a nice dedicated local admin account protected with LAPS… Once you found the proper credentials to log in, fire up a command prompt and run Dsregcmd /forcerecovery
If you don’t want to open a command prompt to force the recovery, we could also execute the AAD Recovery from the “Run” command. (ms-cxh://NTH/AADRECOVERY)
This AADRECOVERY command is part of the dsreg.dll that will trigger the recovery itself.
After entering the recovery command, you will be prompted to log in with your Microsoft 365 credentials to start the AAD Recovery
Of course, please make sure you don’t have Device Enrollment Restrictions enabled, otherwise, you could end up with the error: 80180014 | DeviceNotSupported error
After a while you will receive a nice prompt telling you, it’s almost done. Please read it.
So just do what they tell you to do!!! Sign out of your local admin account and immediately (without rebooting!!) log in with the user’s Azure AD account, and you will notice everything is fixed and your Device Authentication is working again.
Do you want to guess what happened when we forced a recovery? While forcing a recovery, the “OLD” Device ID will be removed from Intune. So when you click on the device while the recovery is running, you will get a nice error: “Not Found” for a few moments
But at the same time, the “NEW” Device ID, which you will also notice in the dsregcmd output, will be attached to that same Device(Name)
So everything assigned to “All Devices” will still work. For example, opening WU4B still tells us the policy is assigned to that specific device.
Clicking on the device brings us to the hardware overview and in it, you will find the new DeviceID and the serial number.
But beware even when it has the same device name… multiple device records will be created!
And we all know what that means… nothing good when you are targeting device configuration profiles based on the Autopilot Enrollment profile (which doesn’t target that device anymore)
7. What happens when you don’t have a TPM?
I guess that’s indeed a very good question… But??? Why don’t you have a TPM? That isn’t smart at all, maybe just plain stupid 🙂
Well enough said, let’s take a look at what will happen when you are replacing the system board on an Azure Ad Joined device and you weren’t using a TPM.
Before we replace the System Board, let’s take a look at the dsregcmd status first. As shown below, this device isn’t TPM protected.
To be sure we do see some difference when we are changing the hardware, let’s fire up PowerShell and type this command to get back our serial number: Get-WmiObject win32_bios | select Serialnumber
After we replaced that virtual “System board”, let’s fire up the VM and take a look at what happens… Pretty much no difference… except the serial number changed.
Everything just works like expected, the device is still Azure Ad Joined and the DeviceAuthStatus is still showing as a success, isn’t that great!
So without a TPM (stupid idea… as a TPM greatly enhances the security of your Azure AD Joined device!) nothing breaks…
8. Important Stuff to know!
8.1: Rebooting the Device after the Recovery
If you have chosen to fix it manually instead of throwing away the device, you need to make sure you don’t reboot the device. Because when you reboot instead of logging out of the local admin account and logging in with the user account… guess what: a nice ESP will be shown, in a state that’s pretty much stuck.
8.2: 4K HH Hardware Hash for Autopilot
Please, pretty please: when you are using Autopilot, make sure you upload the new 4K HH (Hardware Hash) to the list of trusted Autopilot devices (and their EkPubs), like I am explaining in this blog
Why? Because the hardware hash will be totally different from the old one. I will try to explain a little bit more.
The hardware hash contains a lot of information unique to the device itself, like the *SmbiosUuid and SmbiosSystemSerialNumber
*SmbiosUuid: It’s a unique number bound to the system/motherboard, and if you change that you can safely say you changed your computer.
Of course, the Hardware Hash also contains some really important stuff about the TPM itself, like the famous EkPub! (Which we need when we need to do some AIK Attestation)
So, without this new important information about the hardware of the new system board, how could we still perform device authentication? This device authentication is totally based on trust, and the new device (TPM) isn’t the one we imported earlier into the allowed/trusted list of EkPubs.
And without this trust, you will not get your device enrolled into Autopilot. You could check out for yourself what’s inside the Hardware Hash. To do so download the OEM Activation 3 Tool (OA3Tool) from this link
I already had the hardware hash exported with the wonderful Get-WindowsAutopilotInfo PowerShell tool, so I only needed to add the hardware hash.
Please note: Even the hardware hash can be different each time you create it, why?
As shown above, it also contains the OsSystemTime and OsLocalTime. Luckily these time details are NOT used for the Autopilot registration! Let’s move on to the oa3tool and start decoding it yourself!
oa3tool.exe /DecodeHwHash="hardwareHash from the csv" > c:\temp\hh.txt
When opening that text file, you will find a lot of useful stuff in it. As shown below… it contains a lot of information!
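If you decode a hash before and after the board swap, you probably want to see what really changed without the time fields getting in the way. Here is a minimal sketch, assuming you already turned both decoded reports into simple name/value pairs. The field names are the ones mentioned in this blog; the values are made up, and how you parse the text file is left out on purpose.

```python
# Fields that differ on every export and say nothing about the hardware.
VOLATILE_FIELDS = {"OsSystemTime", "OsLocalTime"}


def changed_fields(before, after, ignore=VOLATILE_FIELDS):
    """Return {name: (old, new)} for every non-ignored field that differs."""
    names = (set(before) | set(after)) - set(ignore)
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(names)
        if before.get(name) != after.get(name)
    }


# Hypothetical decoded values, for illustration only.
old_board = {
    "SmbiosUuid": "AAAAAAAA-0000-0000-0000-000000000001",
    "SmbiosSystemSerialNumber": "OLD-SERIAL",
    "OsSystemTime": "2022-01-01T10:00:00Z",
    "OsLocalTime": "2022-01-01T11:00:00+01:00",
}
same_board_later = dict(old_board, OsSystemTime="2022-01-02T10:00:00Z",
                        OsLocalTime="2022-01-02T11:00:00+01:00")
new_board = dict(same_board_later,
                 SmbiosUuid="BBBBBBBB-0000-0000-0000-000000000002",
                 SmbiosSystemSerialNumber="NEW-SERIAL")

print(changed_fields(old_board, same_board_later))  # {}
print(changed_fields(old_board, new_board))
```

An empty result means it is still the same hardware, no matter how different the two hash strings look.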
8.3 Testing it yourself
It’s easy to reproduce what happens when a system board is replaced, just use Hyper-V. First I created a Virtual Machine
I added a virtual TPM to the device, installed Windows 10 21H2 on the device itself, and uploaded the 4K HH
After the hardware hash was uploaded I reinstalled the device and enrolled it into Azure/Intune and made sure everything was working fine.
In the virtual machine picture I showed you, you may have noticed I also created a second virtual machine (also with a TPM) and mapped its hard drive to the same VHDX file as the first machine we created!
So what happens when you shut down the first virtual machine and boot that second virtual machine? Yes indeed! It looks like the system board and the TPM have been replaced!
8.4 Deleting the Hardware Hash
I also have to point out the fact that changing the system board on these devices is not my favorite option! I know that sometimes you need to deal with warranties, but then again… A way, way, way better option would be to remove the hash from the list of allowed Autopilot devices and just buy a new device and enroll that device!
Because when the Hardware Hash isn’t removed from the tenant (one that you don’t control!) and that system board ends up in your device you have some work to do… Because you will end up with the ZtdDeviceAssignedToOtherTenant (808) error message and/or ZtdProfileAssignedToOtherTenant.
You will end up creating a Microsoft Support ticket or asking the vendor to send a new one to get it fixed. When putting in a service ticket, please make sure you have the Device CSV, Proof of ownership and the Diagnostic logs ready for them.
Please read this MS Doc for more information:
I guess this conclusion can be short? When replacing your system board, the trust between your device and Azure will be broken because your TPM isn’t the one the device first married with!
Again… just buy a new device… but when that’s not possible, you will need to fix the issue with the Device/Transport keys and the TPM. You can do so by using dsregcmd /forcerecovery (but beware of all the group assignments that would break)
Another and better option would be to just wipe the device from Intune/Azure, upload the new keys and re-enroll the device! Just like Microsoft is telling us 🙂