From: Chuck Lever <cel@kernel.org>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: Daniel Gomez <da.gomez@kruces.com>,
kdevops@lists.linux.dev,
Devasena Inupakutika <devasena.i@samsung.com>,
Dongjoo Seo <dongjoo.seo1@samsung.com>,
Joel Fernandes <Joelagnelf@nvidia.com>
Subject: Re: [PATCH v2 0/4] vLLM and the vLLM production stack
Date: Wed, 8 Oct 2025 13:46:15 -0400 [thread overview]
Message-ID: <fe25a471-e97d-4b46-87af-7990277ef21a@kernel.org> (raw)
In-Reply-To: <e49254e9-620e-4cce-aad6-410b9281d1ae@kernel.org>
On 10/4/25 1:14 PM, Chuck Lever wrote:
>> But the rest of the changes are legit, please feel
>> free to cherry pick what you see useful from here:
>>
>> https://github.com/linux-kdevops/kdevops/tree/ci-testing/mcgrof/20251004-cloud-bill
> Great! I will have a look at those.
Have some comments/requests on these, not sure where to post them.
Oldest to newest:
- workflows: Add vLLM workflow for LLM inference and production deployment
- vllm: Add DECLARE_HOSTS support for bare metal and existing infrastructure
- vllm: Add GPU-enabled defconfig with compatibility documentation
- defconfigs: Add composable fragments for Lambda Labs vLLM deployment
No comments on these.
- aws: prevent SSH key conflicts across multiple kdevops directories
This one was posted before, and my comment still stands: this is badly
needed IMO, but it should work for all cloud providers, not just AWS.
The new Kconfig options should probably go in the existing terraform
SSH Kconfig menu.
Do you want me to work on adapting this one, or do you want to give
Claude another crack at it?
- Add static GPU Kconfig support for AWS
Wondering if my dynamic instance type menu already brings in these new
GPU-enabled instance types.
- Add make cloud-bill target for AWS cost tracking
Nit: I'd like to see provider-specific scripts go into
terraform/<provider>/scripts/
I'm sorry that I had to drop the pricing information from my dynamic
menu patches. I just pushed that out of the MVP "just get it working"
patches, and I do plan to come back to it. I do follow running costs,
but not as closely as these patches suggest that you do.
- terraform/aws: use default VPC to avoid VPC limit issues
I think we can make this work, and IIRC some of the other providers
also provision default VPCs. Making it switchable (use the default,
or create one for me) makes sense. We might consider following the
precedent that OCI has set here (use an existing VPC).
There are some other resources that have similar limits.
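For illustration, the switchable behavior I have in mind might look
roughly like this in the AWS terraform templates; the variable and
resource names here are made up, this is just a sketch:

```hcl
# Hypothetical toggle: reuse the account's default VPC, or create one.
variable "use_default_vpc" {
  type    = bool
  default = true
}

# Each AWS region has at most one default VPC; look it up.
data "aws_vpc" "default" {
  default = true
}

# Only create a dedicated VPC when the toggle is off, so we don't
# burn against the per-region VPC limit.
resource "aws_vpc" "kdevops" {
  count      = var.use_default_vpc ? 0 : 1
  cidr_block = "10.0.0.0/16"
}

locals {
  vpc_id = var.use_default_vpc ? data.aws_vpc.default.id : aws_vpc.kdevops[0].id
}
```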
- terraform/aws: fix EBS volume availability zone mismatch
Fair catch, but why not use the AZ that the instance is in rather
than the AZ that the subnet is in?
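Sketching what I mean (resource names hypothetical): taking the AZ
from the instance itself means the volume can never land in a
different zone, whatever subnet the instance ended up in:

```hcl
resource "aws_ebs_volume" "extra" {
  # Follow the instance's AZ rather than the subnet's, so the two
  # can never disagree.
  availability_zone = aws_instance.kdevops.availability_zone
  size              = 100
}

resource "aws_volume_attachment" "extra" {
  device_name = "/dev/sdf"
  volume_id   = aws_ebs_volume.extra.id
  instance_id = aws_instance.kdevops.id
}
```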
- terraform/aws: enable public IP assignment for instances
- terraform/aws: prefer subnets with public IP auto-assignment
As above, these two might need some work, but they look doable. They
should probably be squashed into "terraform/aws: use default VPC to
avoid VPC limit issues".
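If it helps, the subnet preference could be expressed with a filtered
data source; the filter name comes from the EC2 DescribeSubnets API,
and the variables and resource name below are placeholders:

```hcl
data "aws_vpc" "default" {
  default = true
}

# Prefer subnets that already auto-assign public IPs.
data "aws_subnets" "public" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.default.id]
  }
  filter {
    name   = "map-public-ip-on-launch"
    values = ["true"]
  }
}

resource "aws_instance" "kdevops" {
  ami           = var.ami_id
  instance_type = var.instance_type
  subnet_id     = data.aws_subnets.public.ids[0]

  # Belt and braces: request a public IP even if the subnet default
  # were to change.
  associate_public_ip_address = true
}
```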
- terraform/aws: fix GPU AMI selection in terraform templates
No comment on this one. I need to first go and merge in your original
GPU AMI patches. I'd like to see that integrated into the existing AWS
Kconfig compute menu.
- ansible: map GPU instance configurations to terraform instance types
- defconfigs: fix GPU instance choice configuration
Wondering if these two are still necessary with my dynamic menu patches.
- slack-billing: add AWS cost notifications to Slack
Clever, but isn't this something that should be configured via the
cloud console? Not really sure.
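Not Slack, but if the console route feels too out-of-band, AWS Budgets
alerts can be driven from terraform too, so the alerting could live
next to the rest of the kdevops config. A rough sketch, with the
amounts and address obviously made up:

```hcl
resource "aws_budgets_budget" "kdevops_monthly" {
  name         = "kdevops-monthly"
  budget_type  = "COST"
  limit_amount = "100"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  # Mail out once 80% of the monthly limit has actually been spent.
  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 80
    threshold_type             = "PERCENTAGE"
    notification_type          = "ACTUAL"
    subscriber_email_addresses = ["you@example.com"]
  }
}
```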
- kconfig: fix Slack notification configuration syntax errors
Squash-me.
- defconfigs: add AWS P5.4xlarge GPU instance support
No comment.
HTH.
--
Chuck Lever
Thread overview: 13+ messages
2025-10-04 16:38 [PATCH v2 0/4] vLLM and the vLLM production stack Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 1/4] workflows: Add vLLM workflow for LLM inference and production deployment Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 2/4] vllm: Add DECLARE_HOSTS support for bare metal and existing infrastructure Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 3/4] vllm: Add GPU-enabled defconfig with compatibility documentation Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 4/4] defconfigs: Add composable fragments for Lambda Labs vLLM deployment Luis Chamberlain
2025-10-04 16:39 ` [PATCH v2 0/4] vLLM and the vLLM production stack Luis Chamberlain
2025-10-04 16:55 ` Chuck Lever
2025-10-04 17:03 ` Luis Chamberlain
2025-10-04 17:14 ` Chuck Lever
2025-10-08 17:46 ` Chuck Lever [this message]
2025-10-10 0:55 ` Luis Chamberlain
2025-10-10 12:38 ` Chuck Lever
2025-10-10 16:20 ` Chuck Lever