From: Chuck Lever <cel@kernel.org>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: Daniel Gomez <da.gomez@kruces.com>,
kdevops@lists.linux.dev,
Devasena Inupakutika <devasena.i@samsung.com>,
Dongjoo Seo <dongjoo.seo1@samsung.com>,
Joel Fernandes <Joelagnelf@nvidia.com>
Subject: Re: [PATCH v2 0/4] vLLM and the vLLM production stack
Date: Wed, 8 Oct 2025 13:46:15 -0400 [thread overview]
Message-ID: <fe25a471-e97d-4b46-87af-7990277ef21a@kernel.org> (raw)
In-Reply-To: <e49254e9-620e-4cce-aad6-410b9281d1ae@kernel.org>
On 10/4/25 1:14 PM, Chuck Lever wrote:
>> But the rest of the changes are legit, please feel
>> free to cherry pick what you see useful from here:
>>
>> https://github.com/linux-kdevops/kdevops/tree/ci-testing/mcgrof/20251004-cloud-bill
> Great! I will have a look at those.
Have some comments/requests on these, not sure where to post them.
Oldest to newest:
- workflows: Add vLLM workflow for LLM inference and production deployment
- vllm: Add DECLARE_HOSTS support for bare metal and existing infrastructure
- vllm: Add GPU-enabled defconfig with compatibility documentation
- defconfigs: Add composable fragments for Lambda Labs vLLM deployment
No comments on these.
- aws: prevent SSH key conflicts across multiple kdevops directories
This one was posted before, and my comment still stands: this is badly
needed IMO, but it should work for all cloud providers, not just AWS.
The new Kconfig options should probably go in the existing terraform
SSH Kconfig menu.
Do you want me to work on adapting this one, or do you want to give
Claude another crack at it?
- Add static GPU Kconfig support for AWS
Wondering if my dynamic instance type menu already brings in these new
GPU-enabled instance types.
- Add make cloud-bill target for AWS cost tracking
Nit: I'd like to see provider-specific scripts go into
terraform/<provider>/scripts/
I'm sorry that I had to drop the pricing information from my dynamic
menu patches. I just pushed that out of the MVP "just get it working"
patches, and I do plan to come back to it. I do follow running costs,
but not as closely as these patches suggest that you do.
- terraform/aws: use default VPC to avoid VPC limit issues
I think we can make this work, and IIRC some of the other providers
also provision default VPCs. Making it switchable (use the default,
or create one for me) makes sense. We might consider following the
precedent that OCI has set here (use an existing VPC).
There are some other resources that have similar limits.
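For illustration, the switchable behavior I have in mind might look
roughly like this in the AWS terraform templates; the variable and
resource names here are made up, this is just a sketch:

```hcl
# Hypothetical toggle: reuse the account's default VPC, or create one.
variable "use_default_vpc" {
  type    = bool
  default = true
}

# Each AWS region has at most one default VPC; look it up.
data "aws_vpc" "default" {
  default = true
}

# Only create a dedicated VPC when the toggle is off, so we don't
# burn against the per-region VPC limit.
resource "aws_vpc" "kdevops" {
  count      = var.use_default_vpc ? 0 : 1
  cidr_block = "10.0.0.0/16"
}

locals {
  vpc_id = var.use_default_vpc ? data.aws_vpc.default.id : aws_vpc.kdevops[0].id
}
```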
- terraform/aws: fix EBS volume availability zone mismatch
Fair catch, but why not use the AZ that the instance is in rather
than the AZ that the subnet is in?
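Sketching what I mean (resource names hypothetical): taking the AZ
from the instance itself means the volume can never land in a
different zone, whatever subnet the instance ended up in:

```hcl
resource "aws_ebs_volume" "extra" {
  # Follow the instance's AZ rather than the subnet's, so the two
  # can never disagree.
  availability_zone = aws_instance.kdevops.availability_zone
  size              = 100
}

resource "aws_volume_attachment" "extra" {
  device_name = "/dev/sdf"
  volume_id   = aws_ebs_volume.extra.id
  instance_id = aws_instance.kdevops.id
}
```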
- terraform/aws: enable public IP assignment for instances
- terraform/aws: prefer subnets with public IP auto-assignment
As above, these two might need some work, but they look doable. They
should probably be squashed into "terraform/aws: use default VPC to
avoid VPC limit issues".
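If it helps, the subnet preference could be expressed with a filtered
data source; the filter name comes from the EC2 DescribeSubnets API,
and the variables and resource name below are placeholders:

```hcl
data "aws_vpc" "default" {
  default = true
}

# Prefer subnets that already auto-assign public IPs.
data "aws_subnets" "public" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.default.id]
  }
  filter {
    name   = "map-public-ip-on-launch"
    values = ["true"]
  }
}

resource "aws_instance" "kdevops" {
  ami           = var.ami_id
  instance_type = var.instance_type
  subnet_id     = data.aws_subnets.public.ids[0]

  # Belt and braces: request a public IP even if the subnet default
  # were to change.
  associate_public_ip_address = true
}
```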
- terraform/aws: fix GPU AMI selection in terraform templates
No comment on this one. I need to first go and merge in your original
GPU AMI patches. I'd like to see that integrated into the existing AWS
Kconfig compute menu.
- ansible: map GPU instance configurations to terraform instance types
- defconfigs: fix GPU instance choice configuration
Wondering if these two are still necessary with my dynamic menu patches.
- slack-billing: add AWS cost notifications to Slack
Clever, but isn't this something that should be configured via the
cloud console? Not really sure.
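Not Slack, but if the console route feels too out-of-band, AWS Budgets
alerts can be driven from terraform too, so the alerting could live
next to the rest of the kdevops config. A rough sketch, with the
amounts and address obviously made up:

```hcl
resource "aws_budgets_budget" "kdevops_monthly" {
  name         = "kdevops-monthly"
  budget_type  = "COST"
  limit_amount = "100"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  # Mail out once 80% of the monthly limit has actually been spent.
  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 80
    threshold_type             = "PERCENTAGE"
    notification_type          = "ACTUAL"
    subscriber_email_addresses = ["you@example.com"]
  }
}
```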
- kconfig: fix Slack notification configuration syntax errors
Squash-me.
- defconfigs: add AWS P5.4xlarge GPU instance support
No comment.
HTH.
--
Chuck Lever
Thread overview: 13+ messages
2025-10-04 16:38 [PATCH v2 0/4] vLLM and the vLLM production stack Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 1/4] workflows: Add vLLM workflow for LLM inference and production deployment Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 2/4] vllm: Add DECLARE_HOSTS support for bare metal and existing infrastructure Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 3/4] vllm: Add GPU-enabled defconfig with compatibility documentation Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 4/4] defconfigs: Add composable fragments for Lambda Labs vLLM deployment Luis Chamberlain
2025-10-04 16:39 ` [PATCH v2 0/4] vLLM and the vLLM production stack Luis Chamberlain
2025-10-04 16:55 ` Chuck Lever
2025-10-04 17:03 ` Luis Chamberlain
2025-10-04 17:14 ` Chuck Lever
2025-10-08 17:46 ` Chuck Lever [this message]
2025-10-10 0:55 ` Luis Chamberlain
2025-10-10 12:38 ` Chuck Lever
2025-10-10 16:20 ` Chuck Lever