public inbox for kdevops@lists.linux.dev
From: Luis Chamberlain <mcgrof@kernel.org>
To: Chuck Lever <cel@kernel.org>
Cc: Daniel Gomez <da.gomez@kruces.com>,
	kdevops@lists.linux.dev,
	Devasena Inupakutika <devasena.i@samsung.com>,
	Dongjoo Seo <dongjoo.seo1@samsung.com>,
	Joel Fernandes <Joelagnelf@nvidia.com>
Subject: Re: [PATCH v2 0/4] vLLM and the vLLM production stack
Date: Sat, 4 Oct 2025 10:03:08 -0700	[thread overview]
Message-ID: <aOFTTOG3YV_jOGxB@bombadil.infradead.org> (raw)
In-Reply-To: <9821a951-e5a1-4e24-868f-f1a874509d5b@kernel.org>

On Sat, Oct 04, 2025 at 12:55:36PM -0400, Chuck Lever wrote:
> On 10/4/25 12:38 PM, Luis Chamberlain wrote:
> > This adds initial vLLM and vLLM production stack support on kdevops.
> > 
> > This v2 series augments vLLM support for real CPUs on bare metal using
> > the DECLARE_HOSTS and also goes tested against a real GPU on the cloud,
> > showing that essentially now anyone can use the vLLM production stack on
> > any cloud provider we support in a flash. All we need are the instances
> > which have GPUs added, and for that we expect growth soon using dynamic
> > kconfig support.
> 
> As an update/road-map on that:
> 
> I think Lambda has GPU support already,

Yes, this was tested with that.

> and AWS has enough dynamic menu
> support now that GPU-enabled instance types are available there with the
> default menus in the git tree. Please let me know if that's missing
> something.

Oh! I hadn't seen that and had been waiting for this! I'll test in a
couple of days!

Exciting times!

> I haven't done the follow-up work yet to integrate GPU-enabled AMIs into
> the AWS Compute menu. That seems like it should be the top priority. I
> need to go back and look at what you did to generate those in your
> prototype, to close those gaps.

Oh yes, that's needed. I also need a patch to disable VPCs and enable
public IPs. While at it, so that AWS won't eat my corporate
expenditures, I added Slack cloud-bill support too. The whole "static"
stuff can be ignored; it doesn't work. I was just trying to add static
instances to see if I could get some larger GPU instances to work, but
it didn't pan out and I gave up. The rest of the changes are legit,
though, so please feel free to cherry-pick whatever you find useful
from here:

https://github.com/linux-kdevops/kdevops/tree/ci-testing/mcgrof/20251004-cloud-bill
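
For context, the VPC/public-IP change I have in mind is roughly shaped
like this (a sketch only, with hypothetical variable names, not the
actual patch; `associate_public_ip_address` is a stock Terraform AWS
provider attribute):

```hcl
# Sketch: let instances land in the default VPC with a public IP
# instead of provisioning a dedicated VPC, subnet, and gateway.
resource "aws_instance" "kdevops" {
  ami                         = var.ami_id        # hypothetical variable
  instance_type               = var.instance_type # hypothetical variable
  associate_public_ip_address = true              # SSH without a bastion
}
```

With that, the nodes are directly reachable over SSH, at the cost of
exposing them on a public address.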

> When that is complete, my next steps are to ask Claude to "copy" the
> scripts from terraform/aws/scripts to the other three major cloud
> providers that kdevops supports... an NFS bake-a-thon is this coming
> week, so there will be some delay.

Nice!!

> In the medium term, adding support for enabling RDMA fabrics in these
> environments is on my to-do list. I believe that will allow testing
> things like GPU direct with NVMe-o-F devices.

Oh my, that would be dreamy!

 Luis


Thread overview: 13+ messages
2025-10-04 16:38 [PATCH v2 0/4] vLLM and the vLLM production stack Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 1/4] workflows: Add vLLM workflow for LLM inference and production deployment Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 2/4] vllm: Add DECLARE_HOSTS support for bare metal and existing infrastructure Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 3/4] vllm: Add GPU-enabled defconfig with compatibility documentation Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 4/4] defconfigs: Add composable fragments for Lambda Labs vLLM deployment Luis Chamberlain
2025-10-04 16:39 ` [PATCH v2 0/4] vLLM and the vLLM production stack Luis Chamberlain
2025-10-04 16:55 ` Chuck Lever
2025-10-04 17:03   ` Luis Chamberlain [this message]
2025-10-04 17:14     ` Chuck Lever
2025-10-08 17:46       ` Chuck Lever
2025-10-10  0:55         ` Luis Chamberlain
2025-10-10 12:38           ` Chuck Lever
2025-10-10 16:20             ` Chuck Lever
