Create process for obtaining/applying for multi-card GPU flavors
Problem / Opportunity Statement
We need to develop a process for gaining access to multi-card flavors.
These are snippets from Slack:
Is priority NAIRR fellows > NAIRR pilot > ACCESS?
Yes. To some degree - within reason
Several projects are already on multi-GPU instances, so prioritizing those folks for the H100s would probably be good.
Concur, though we need to make sure they understand that these are limited and they don't get to take A100 multi and H100 multi just because. We can let them migrate, but we also need to hold them to a timeline for releasing resources. Additionally, we should consider whether users on 4xA100 can move to 2xH100 (same VRAM, more system RAM, newer generation, etc.).
Jenn noted NAIRR240266 as a NAIRR allocation that requested multi-card. That said, they may already be using a g3.2xl and would get picked up in the review of existing users noted above.
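As a rough illustration of the priority order discussed in the snippets above, a review queue could be sorted like this. It is only a sketch: the `Request` fields, the `review_order` helper, the example allocation names, and the tie-break in favor of projects already on multi-GPU instances are all assumptions, not an agreed policy.

```python
from dataclasses import dataclass

# Lower rank is reviewed first. The order follows the Slack thread:
# NAIRR fellows > NAIRR pilot > ACCESS.
PROGRAM_RANK = {"NAIRR fellows": 0, "NAIRR pilot": 1, "ACCESS": 2}


@dataclass
class Request:
    allocation: str                  # hypothetical identifier, not a real allocation
    program: str                     # one of the keys in PROGRAM_RANK
    already_multi_gpu: bool = False  # project is already on a multi-GPU flavor


def review_order(requests):
    """Sort by program rank; existing multi-GPU users break ties (assumption)."""
    return sorted(
        requests,
        key=lambda r: (PROGRAM_RANK[r.program], not r.already_multi_gpu),
    )


if __name__ == "__main__":
    queue = [
        Request("EXAMPLE-1", "ACCESS"),
        Request("EXAMPLE-2", "NAIRR pilot"),
        Request("EXAMPLE-3", "NAIRR fellows"),
        Request("EXAMPLE-4", "NAIRR pilot", already_multi_gpu=True),
    ]
    for r in review_order(queue):
        print(r.allocation, r.program, r.already_multi_gpu)
```

Every request still needs a justification under this sketch; the sort only decides which requests are reviewed first.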
Current list from Support Side (needs updating)
Resolution
Potentially a form-based process for applying for and justifying access.
From Le Mai:
I like the idea of a form for tracking requests and justifications. We'd need to agree on a review, approval, and notification process. I wonder, though, given the slim usage, what people think of opening it up to NAIRR awardees and having everyone else be subject to a review process.
I'd say everyone needs to at least justify it, even NAIRR users, though I'd give them priority.
Let's discuss this in the Thursday meeting; we can add any needed comments here before and after.
To-Do Tasks
- Create and finalize multi-GPU request form
  - Create draft and share in this issue
  - Settle on final draft
- Update documentation to link to this form
- Evaluate other methods of communication regarding this form (email, newsletter, Slack, etc.)