Hello first time poster long time listener,
TL;DR: If user load < scaling logic, put the servers with the fewest connections into drain mode, to avoid users getting sent to a server and keeping it online longer than needed.
I've heard through our sales engineer that this feature may be in the enterprise version, but I wanted to highlight the reasoning behind why it's important.
An example setup:
40 users
Scale-in restriction set to Low to avoid kicking people off
4 AVD servers, with auto-scaling set to scale out to a total of 3 servers in the morning and leave one running at all times – this server is SessionHost-1
Host Pool set to Breadth-First
Inactivity timer: 2 hours idle before disconnect, then 2 hours disconnected before logoff (coming from RDS we can't really change that, as people want to disconnect and reconnect without losing work)
Now a brief description of Breadth First load balancing in AVD: The breadth-first algorithm first queries session hosts in a host pool that allow new connections. The algorithm then selects a session host randomly from half the set of available session hosts with the fewest sessions. For example, if there are nine session hosts with 11, 12, 13, 14, 15, 16, 17, 18, and 19 sessions, a new session doesn't automatically go to the session host with the fewest sessions. Instead, it can go to any of the first five session hosts with the fewest sessions at random. Due to the randomization, some sessions may not be evenly distributed across all session hosts.
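To make the randomness concrete, here's a minimal Python sketch of the selection rule described above (host names and session counts are just the example numbers from the docs, not anything real):

```python
import math
import random

def breadth_first_pick(session_counts):
    """Sketch of breadth-first selection: rank hosts by session count,
    then pick at random from the half of hosts with the fewest sessions."""
    # Sort (host, sessions) pairs ascending by session count.
    ranked = sorted(session_counts.items(), key=lambda kv: kv[1])
    # Candidate pool: half of the hosts, rounded up (9 hosts -> first 5,
    # matching the example in the docs).
    pool = ranked[: math.ceil(len(ranked) / 2)]
    return random.choice(pool)[0]

# The nine-host example from the description above.
hosts = {f"SessionHost-{i + 1}": n
         for i, n in enumerate([11, 12, 13, 14, 15, 16, 17, 18, 19])}
print(breadth_first_pick(hosts))  # one of SessionHost-1 through SessionHost-5
```

Note the new session never lands on SessionHost-6 through SessionHost-9, but within the bottom five it's pure chance, which is exactly what makes scale-in timing unpredictable.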
Found here: Host pool load balancing algorithms in Azure Virtual Desktop - Azure | Microsoft Learn
Problem: At the end of the day, Person 1 (connected to SessionHost-1) and Person 2 (connected to SessionHost-2) are working late. Person 2 leaves for the day but doesn't log off; they just close the Remote Desktop client. This starts the 2-hour timer before they're fully logged off, after which the scale-in restriction, when set to Low, will power off that server.
During that 2-hour window, Person 1 has logged off, but since they were on SessionHost-1 it will always remain online. Now you have 2 session hosts powered on: SessionHost-1 with 0 connections and SessionHost-2 with 1 disconnected user. Person 3 then decides to log in and gets load balanced to SessionHost-2. If Person 3 does the same thing and closes out at the end of the day, and Person 4, Person 5, and Person 6 do some late work, you have 2 servers up when you only need 1 server for that load. This problem only gets worse the smaller your host pool is, keeping session hosts online longer, because the breadth-first algorithm picks at random from the half of the available session hosts with the fewest sessions.
Solution: Barring any scale-in restrictions, if your user count < scaling logic but servers remain online due to disconnected or connected users, put those servers in drain mode so no new sessions can be sent to them (unless the other servers are at their maximum number of users). This would remove the problem we see where session hosts that should be scaled in stay online all night, because new users would never hit those servers while they're in drain mode. It would be nice not to have to rely on randomness for those users to be sent to the server that is set to always remain online.
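Roughly, the logic I'm asking for looks like this. This is just a sketch in Python with made-up names (the host dict, the `users_per_host` capacity threshold, the always-on host); the real feature would of course use the actual scaling plan and the existing drain mode setting:

```python
import math

def hosts_to_drain(hosts, users_per_host, always_on="SessionHost-1"):
    """Return which session hosts should be put into drain mode.

    hosts: dict of host name -> current session count (active + disconnected).
    users_per_host: how many users one host is expected to absorb
    (i.e. the scaling logic's capacity threshold).
    """
    total_users = sum(hosts.values())
    # How many hosts the current load actually needs (at least the always-on one).
    needed = max(1, math.ceil(total_users / users_per_host))
    surplus = len(hosts) - needed
    if surplus <= 0:
        return []
    # Drain the surplus hosts with the fewest sessions, never the always-on host.
    # Drain mode blocks new sessions but still lets disconnected users reconnect.
    candidates = sorted((h for h in hosts if h != always_on), key=hosts.get)
    return candidates[:surplus]

# The scenario above: SessionHost-1 empty, SessionHost-2 has one disconnected user.
print(hosts_to_drain({"SessionHost-1": 0, "SessionHost-2": 1}, users_per_host=20))
# -> ['SessionHost-2']: new logins go to SessionHost-1, and SessionHost-2
#    can scale in once its disconnected session times out.
```

With SessionHost-2 in drain mode, Persons 3 through 6 all land on SessionHost-1, and SessionHost-2 powers off as soon as Person 2's disconnected session hits the logoff timer, instead of collecting new users all night.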
I know there are other ways we could set up the scaling logic to avoid this, like using depth-first, but I'd argue that's a much more complicated solution and worse for the user experience, since it jams the maximum number of users onto a single server.
Thanks if you made it this far