Sorry for all the words. Skip to the end if you just want the punchline.
Have you ever had one of those weeks where you just bang your head against the same wall and eventually get some relief? Sometimes, your head breaks open, but other times, the wall breaks. Well, I just had one of those weeks, but I’m not sure how I feel about where this all ended up.
A couple of months ago, I used Johan Arwidmark’s Hydration Kit for ConfigMgr and built out the lab in my spare time. My company is trying to move toward Azure and AutoPilot for our Windows client management, so I’m trying to learn about it in my lab. I had just gotten AutoPilot and ConfigMgr Co-Management working when I realized that I had forgotten about setting up AD FS for client authentication (Azure and the Azure AD Connect tool support several authentication options), so I started down that path. After a few nights of poking around and figuring things out, I realized that I didn’t have a DMZ for my federation server to serve the external-facing AD FS authentication web service. That’s when the trouble started.
I’m using Hyper-V for this lab. It’s a simple setup with 1 Domain Controller and 1 ConfigMgr server plus Windows client VMs for testing builds. They are all connected to the same Hyper-V Internal Virtual Switch that I configured NAT on using Ami Casto’s cool PowerShell tip. Unfortunately, I don’t manage domain controllers, DHCP, DNS, etc. in my day job. I’m a Desktop guy. I let the Server guys handle all that stuff. That’s why I used the Hydration Kit instead of manually building my servers - I don’t want to deal with setting up Active Directory and all that stuff (I should do it, but I don’t want to and you can’t make me!!).
So there I was with a perfectly working setup, and I decided that I had to break it, because I learn best by starting with a working product and then tearing it apart to see how it works. Instead of having a private lab network, I thought it would be easier to change the lab to use the same subnet that my Host server/Router were on. During this process, things went south and I ended up deleting all of the records out of DHCP and DNS. I fought with it for a while and ended up reading that I could set up a Routing and Remote Access (RRAS) server that would bridge my internal and external networks without making my Host machine part of my Domain. Johan even had a link to a write-up for that too. I had just gone with Ami’s simple Virtual Switch NAT instead and never realized he had the process documented.
Then I rebuilt the original subnet but could never get things to talk after that. They would either talk between the DC and RRAS or the DC and CM, and sometimes not at all, but never all 3, and DHCP never would work. I even disabled all Windows firewalls and uninstalled Defender and still nothing. I used Wireshark to watch traffic and it just kept searching for the DHCP and RRAS servers. I spent several long nights deleting and rebuilding DHCP, DNS and RRAS, deleting NICs and Switches and all sorts of things in between. I watched EVERY tutorial and read every obscure blog and forum post about Hyper-V, DHCP, DNS, RRAS and AD communication issues. No matter how simple everyone made it seem, I was still missing something.
Tonight, I had decided that this was it. If it didn’t work soon, I was going to delete it all and start over. Then I remembered something that I had skimmed over. Someone said, ‘Don’t check the Enable virtual LAN identification box!’ in Hyper-V. They didn’t provide a reason, but the fact that they mentioned it made me wonder. On a whim, I went through and unchecked it on all of my NICs and THEY ALL STARTED TALKING!!!! Several times in the troubleshooting process, I had checked and unchecked that box, along with all of the other settings I kept toggling, thinking it would help when I was using Wireshark to monitor the traffic. It didn’t. Just 1 simple check box.
Once it was all working, I started Googling and found this KB for Server 2008 that seems to indicate this is/was a known issue (clearly I’m not ‘in the know’). The weird thing about this is that my virtual switches aren’t connected to an internal network adapter - the article specifically calls out Intel. I’m using a Hyper-V Private Virtual Switch on my new setup, since the Internal switch makes the Hyper-V Host server act as a Router and I figured that wasn’t helping things. So, it may not be the exact issue, and I’m honestly tired of dealing with it and want to get back to whatever it was I was doing before I broke everything. I’m just not going to check the box from now on. If you know the answer, please feel free to share and I’ll update this post.
TL;DR
Don’t check the Enable virtual LAN identification box in Hyper-V if you want your VMs to talk to each other.