From: Luis Chamberlain <mcgrof@kernel.org>
To: Chuck Lever <cel@kernel.org>
Cc: Daniel Gomez <da.gomez@kruces.com>,
kdevops@lists.linux.dev,
Devasena Inupakutika <devasena.i@samsung.com>,
DongjooSeo <dongjoo.seo1@samsung.com>,
Joel Fernandes <Joelagnelf@nvidia.com>
Subject: Re: [PATCH v2 0/4] vLLM and the vLLM production stack
Date: Sat, 4 Oct 2025 10:03:08 -0700
Message-ID: <aOFTTOG3YV_jOGxB@bombadil.infradead.org>
In-Reply-To: <9821a951-e5a1-4e24-868f-f1a874509d5b@kernel.org>
On Sat, Oct 04, 2025 at 12:55:36PM -0400, Chuck Lever wrote:
> On 10/4/25 12:38 PM, Luis Chamberlain wrote:
> > This adds initial vLLM and vLLM production stack support on kdevops.
> >
> > This v2 series augments vLLM support for real CPUs on bare metal using
> > the DECLARE_HOSTS and also goes tested against a real GPU on the cloud,
> > showing that essentially now anyone can use the vLLM production stack on
> > any cloud provider we support in a flash. All we need are the instances
> > which have GPUs added, and for that we expect growth soon using dynamic
> > kconfig support.
>
> As an update/road-map on that:
>
> I think Lambda has GPU support already,
Yes, this was tested with that.
> and AWS has enough dynamic menu
> support now that GPU-enabled instance types are available there with the
> default menus in the git tree. Please let me know if that's missing
> something.
Oh! I hadn't seen that and had been waiting for this! I'll test in a
couple of days!
Exciting times!
> I haven't done the follow-up work yet to integrate GPU-enabled AMIs into
> the AWS Compute menu. That seems like it should be the top priority. I
> need to go back and look at what you did to generate those in your
> prototype, to close those gaps.
Oh yes, that's needed. I also need a patch to disable VPCs and enable
public IPs. While at it, so that AWS won't eat my corporate expenditures,
I added slack cloud-bill support too. The whole "static" stuff can be
ignored; it doesn't work. I was just trying to add static instances
to see if I could get some larger GPU instances to work, but it didn't
work and I gave up. The rest of the changes are legit, so please feel
free to cherry-pick whatever you find useful from here:
https://github.com/linux-kdevops/kdevops/tree/ci-testing/mcgrof/20251004-cloud-bill
> When that is complete, my next steps are to ask Claude to "copy" the
> scripts from terraform/aws/scripts to the other three major cloud
> providers that kdevops supports... an NFS bake-a-thon is this coming
> week, so there will be some delay.
Nice!!
> In the medium term, adding support for enabling RDMA fabrics in these
> environments is on my to-do list. I believe that will allow testing
> things like GPU direct with NVMe-o-F devices.
Oh my, that would be dreamy!
Luis
Thread overview: 13+ messages
2025-10-04 16:38 [PATCH v2 0/4] vLLM and the vLLM production stack Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 1/4] workflows: Add vLLM workflow for LLM inference and production deployment Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 2/4] vllm: Add DECLARE_HOSTS support for bare metal and existing infrastructure Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 3/4] vllm: Add GPU-enabled defconfig with compatibility documentation Luis Chamberlain
2025-10-04 16:38 ` [PATCH v2 4/4] defconfigs: Add composable fragments for Lambda Labs vLLM deployment Luis Chamberlain
2025-10-04 16:39 ` [PATCH v2 0/4] vLLM and the vLLM production stack Luis Chamberlain
2025-10-04 16:55 ` Chuck Lever
2025-10-04 17:03 ` Luis Chamberlain [this message]
2025-10-04 17:14 ` Chuck Lever
2025-10-08 17:46 ` Chuck Lever
2025-10-10 0:55 ` Luis Chamberlain
2025-10-10 12:38 ` Chuck Lever
2025-10-10 16:20 ` Chuck Lever