Azure Capacity Extender options for E8as_v6

I'm looking for good alternatives to the E8as_v6 for Azure Capacity Extender. I don't really trust Intelligent ACE because it's suggesting E8ads_v6, L8s_v4, and L8as_v4. Those are bad because they all have local disks, and even if you can resize from a SKU without a local disk to a SKU that has one, you can't go back. I've come up with my own list: 

Alternatives to E8as_v6 Gen-2 VM:

  1. E8as_v5
  2. E8s_v6
  3. E8s_v5
  4. E8bs_v6
  5. EC8as_v6
  6. EC8as_v5
  7. F8ms_v6

I don't know if the v5 VMs will work. Idk if NMM handles the switch from NVMe to SCSI controllers, and I haven't had time to try resizing a v6 to a v5 yet. Any suggestions for other sizes? I think the constraints are Gen-2 VM and no temp disk. The options I've listed should force redeployment to a different cluster, where there will hopefully be more capacity available.


Comments (8 comments)

John Tokash

I'm curious in what context you view local disks as bad. I'm not sure how much I can help here, because the switch between NVMe and SCSI controllers is tricky (I have not mastered it for a session host image yet). However, we prefer to deploy VMs with a local disk in lots of cases.

I'll give you my views below, but I want to help answer your question, so I'd like to understand why you want to avoid the local disk. Is it to avoid the risk that necessary data is placed there inadvertently?

While the storage is not persistent, there are plenty of use cases. We prefer to deploy them by default because it's a place to offload the pagefile and its associated IOPS and throughput away from the OS disk. For cost-conscious customers where that added load isn't critical, we slim back to VMs without the local disk, but that is generally rare.

You are correct that resizing has its drawbacks, but for AVD session hosts, if you are using an image-based model, you can simply redeploy the session host. If you aren't basing hosts off an image, then that isn't very helpful, I'll admit.

Another valuable use case is SQL Servers. That disk, on the right SKUs, is going to have excellent performance characteristics, and a SQL VM built through the Azure Portal and properly integrated will handle reinstating tempdb on that drive naturally. If you install SQL manually, there is some fancy footwork to be done so that SQL restarts healthy after an event that resets the temp disk, so as a best practice we try to avoid that.

The biggest drawback I've found is that, despite the data loss warnings on the drive, 3rd party teams that are not as familiar with Azure in general (and working on servers, not session hosts) put stateful data there, and then things get messy when it suddenly disappears.

Peter Yasuda

Hi John Tokash

The local disk is only “bad” in the sense that once you resize to a SKU that has a local disk, you can't go back to the original SKU. I completely agree there are cases where the local disk is valuable, especially DB servers. I've deployed SQL servers as brand new Azure SQL servers (easy) and migrated servers with a startup script to relocate tempdb.

We're deploying AVD hosts without local disks because there's no reason to pay more for negligible performance improvement. 

My goal is to have a list of sane alternatives for Azure Capacity Extender. I consulted the GLEs, and they pointed out you can't go from v6 to v5, so my list is now: 

  1. E8s_v6
  2. E8bs_v6
  3. EC8as_v6
  4. F8ms_v6
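
The narrowing from the original seven sizes to this list can be sketched in Python. This is only a rough check on size names, not anything NMM or Azure Capacity Extender actually runs: `parse_sku`, `has_local_disk`, and `ace_candidates` are hypothetical helper names, and the rules are assumptions taken from this thread (a `d` feature letter or the L family means a local disk; the version must match the starting SKU). It does not query Azure, so it cannot tell whether a size really exists or has capacity in a region.

```python
import re

# Assumed name layout: family letters, vCPU count, feature letters, "_v" + version.
SKU_PATTERN = re.compile(r"^([A-Z]+)(\d+)([a-z]*)_v(\d+)$")


def parse_sku(name):
    """Split a size name such as 'E8as_v6' into its parts."""
    match = SKU_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized size name: {name}")
    family, vcpus, features, version = match.groups()
    return {
        "family": family,
        "vcpus": int(vcpus),
        "features": features,
        "version": int(version),
    }


def has_local_disk(name):
    """Assumption: a 'd' feature letter or the L family implies a local disk."""
    sku = parse_sku(name)
    return "d" in sku["features"] or sku["family"] == "L"


def ace_candidates(origin, names):
    """Keep sizes with the origin's vCPU count and version and no local disk."""
    base = parse_sku(origin)
    keep = []
    for name in names:
        if name == origin or has_local_disk(name):
            continue
        sku = parse_sku(name)
        if sku["vcpus"] == base["vcpus"] and sku["version"] == base["version"]:
            keep.append(name)
    return keep


# The original seven sizes plus the three Intelligent ACE suggestions.
proposed = [
    "E8as_v5", "E8s_v6", "E8s_v5", "E8bs_v6", "EC8as_v6", "EC8as_v5",
    "F8ms_v6", "E8ads_v6", "L8s_v4", "L8as_v4",
]
print(ace_candidates("E8as_v6", proposed))
# ['E8s_v6', 'E8bs_v6', 'EC8as_v6', 'F8ms_v6']
```

Because the check only looks at names, each result still needs to be confirmed against what Azure and NMM actually offer.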

 

John Tokash

Understood - so the use case here is resizing (in the event of a capacity issue). We set ours up to resize as needed (with or without a temp disk), and then should a capacity event occur, we come back and redeploy from images later when the dust has settled. At that point the fact that we used a size with a temp disk won't matter; we just want the machine started ASAP. It does surface that we'd face the same issue in reverse: in the same scenario, we can't use SKUs that don't have the temp disk. For session hosts, I can deploy onto a VM with a temp disk or not (at least I have not had an issue where it didn't work), but that isn't where Capacity Extender is really bringing its value. I see the case you are outlining, which is the inability to resize back out of the temp disk VM without redeploying.

 

That said, on the available SKUs: assuming you are looking for alternative 8 vCPU and 64 GB RAM machines that are Gen-2, these couple are missing (but you have the bulk of the options), largely bringing the rest of the AMD family into scope.

E8as_v6

F8ams_v6

If the criteria are incorrect, do share - but those are the only other ones I could come up with for you.

 

Peter Yasuda

E8as_v6 was the starting point :-), but F8ams_v6 looks good; thanks. And it's available in NMM. I think I meant to list that because F8ms_v6 does not exist - the only available Intel SKU is Fsv2. 

We're still on NMM 6.4.3, and two of the SKUs I listed are not available: E8bs_v6 and EC8as_v6. I looked at the 6.5 release notes, and didn't see anything about adding VM SKUs. I'm going to try to go to 6.5.2 today, and I'll check afterwards. Also 6.5 added notifications for Capacity Extender!

NMM does have the E8bs_v5, and this article says it can work with NVMe or SCSI. It's not clear to me whether the VM will automatically allocate with NVMe enabled or not. https://learn.microsoft.com/en-us/azure/virtual-machines/ebdsv5-ebsv5-series

I agree with you that adding a temp disk is better than not being able to allocate a host. 

Thanks! 

 

 

Dave Stephenson

For what it's worth, there are a few things to keep in mind with the v6+ VMs that we've seen on our end.
NOTE: v6+ isn't some new secret SKU. I'm just using it to reference any newer SKUs (e.g., v6, v7, v8, etc.)

  • Azure doesn't handle switching to/from a v5/v6+ SKU for an existing VM
    You can always delete/recreate a VM to change generations, but a resize/reimage task will always fail (whether natively in Azure or in Nerdio Manager).
    You see similar behavior when trying to go to/from a VM that includes a temp disk.
    When setting your Azure Capacity Extender preferences, you'll want to keep these things in mind.
     
  • With a v6+ SKU that doesn't include a temp disk, you're not getting much of an NVMe performance benefit.
    It includes an NVMe disk controller, but the OS disk is still the type (e.g., Standard HDD, Standard SSD, Premium SSD, etc.) you select. To get the full NVMe benefit, you'll want to use a SKU with a temporary disk and/or utilize Ephemeral OS Disks for your session hosts.
     
  • Even with a v6+ SKU with a Temp Disk, the paging file doesn't move to the D drive. (see this Blog Post for more information)
    We have this scripted action (NMM-SE/ScriptedActions/v6VMSKU-AutomaticPagingFile.ps1 at main · Get-Nerdio/NMM-SE), from that blog post, that helps automate that paging file setting for you as part of the VM Power-on step
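
The first bullet's resize rules can be written as a quick pairwise check. This is a minimal sketch based only on the behavior described above, not on anything Azure or Nerdio Manager exposes: `describe` and `can_resize_in_place` are hypothetical names, sizes are grouped as "v5 and older" versus "v6+", and a local disk is inferred from the size name (a `d` feature letter or the L family), which is an assumption.

```python
import re


def describe(name):
    """Reduce a size name to the two properties that matter for a resize."""
    match = re.match(r"^([A-Z]+)\d+([a-z]*)_v(\d+)$", name)
    if match is None:
        raise ValueError(f"unrecognized size name: {name}")
    family, features, version = match.groups()
    return {
        # Assumption: v6 and newer are grouped together as "v6+".
        "v6_plus": int(version) >= 6,
        # Assumption: a 'd' feature letter or the L family means a temp disk.
        "temp_disk": "d" in features or family == "L",
    }


def can_resize_in_place(source, target):
    """False when the move crosses the v5/v6+ line or changes temp disk presence."""
    src, dst = describe(source), describe(target)
    return src["v6_plus"] == dst["v6_plus"] and src["temp_disk"] == dst["temp_disk"]


print(can_resize_in_place("E8as_v6", "E8s_v6"))    # True
print(can_resize_in_place("E8as_v6", "E8as_v5"))   # False
print(can_resize_in_place("E8as_v6", "E8ads_v6"))  # False
```

A `False` here only means the pair breaks one of the two rules in that bullet; deleting and recreating the VM remains an option in either case.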
     

 

 

John Tokash

Dave Stephenson - Thank you for the reference to the blog post and SE scripted action. I'd asked support a couple of months ago and eventually just wrote my own. Anxious to review and compare to improve, as I've had mixed results with my DIY script. Happy to see you've got a version in the SE repo now too!

Dave Stephenson

You're very welcome, John.

Depending on what you find with your testing, maybe we can combine the two scripts into an improved version? 🤔

Peter Yasuda

A while back we had a v5 host pool with page files on the temp disk. We were getting memory errors, and it was because Chrome would only use a C: drive page file. Same with Teams, which was at the time based on Chromium (Electron). Idk about now. Idk about Edge either.

Once there's a page file on C:, Idk how to prevent things besides Chrome from using it, or how Chrome knows where the page file is for that matter. That was the reason we gave up on temp disks for AVD hosts, and only use them on our DB servers. 
