public inbox for linux-mm@kvack.org
* Re: [PATCH] mm: limit THP alignment – performance gain observed in AI inference workloads
@ 2025-08-11 22:14 siddhartha
       [not found] ` <595a57cd68463194fb2d6f34e9366e38@vger.kernel.org>
  0 siblings, 1 reply; 4+ messages in thread
From: siddhartha @ 2025-08-11 22:14 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: Dev Jain, Lorenzo Stoakes, linux-mm, LKML

On 2025-07-28 16:30, Vlastimil Babka wrote:

> On 7/28/25 07:41, siddhartha@kenip.in wrote:
> 
>> On 2025-07-07 14:26, Vlastimil Babka wrote:
>> Hi Lorenzo, Dev, Mel,
>> 
>> I'm following up on this patch submission from earlier this month:
>> "[PATCH] mm: limit THP alignment - performance gain observed in AI
>> inference workloads."
> 
> I'm confused. That wasn't a patch submission, but reporting performance
> results for my patch from late 2024? (and thanks for those!)
> 
> The patch was also already merged in late 2024:
> 
> commit d4148aeab412432bf928f311eca8a2ba52bb05df
> Author: Vlastimil Babka <vbabka@suse.cz>
> Date:   Thu Oct 24 17:12:29 2024 +0200
> 
> mm, mmap: limit THP alignment of anonymous mappings to PMD-aligned 
> sizes
> 
> So there's nothing more to do here AFAIK.

> Hello Vlastimil,
> 
> Hope you are doing great!
> 
> Sorry about the late reply, my inbox made your email invisible somehow.
> 
> Thank you for the clarification -- yes, I am aware that the mm, mmap: 
> limit THP alignment of anonymous mappings to PMD-aligned sizes patch 
> was merged in late 2024 (commit 
> d4148aeab412432bf928f311eca8a2ba52bb05df).
> 
> The performance results I shared were generated much later because of 
> my working setup:
> 
> * The tests were conducted on Intel Developer Cloud workloads as part of
>   a broader benchmarking exercise involving OpenVINO-based inference
>   pipelines.
>
> * The specific environment, dataset, and configuration scripts were
>   stored on an SSD that unfortunately suffered corruption. I am currently
>   working to recover them so I can share the exact test harness and
>   commit-specific diffs. If and when I regain that access from Intel
>   Developer Cloud, I will provide all of the relevant files.
> 
> Although this is not a new patch submission, I thought the numbers 
> might still be valuable -- they show notable throughput and latency 
> changes when aligning the current behavior with OpenVINO's large 
> contiguous allocation preferences in certain inference scenarios.
> 
> Summary of observed improvements:
> 
> * Throughput: +7.3% average increase in model inference throughput on
>   ResNet-50 with mixed batch sizes (64/128)
>
> * Latency: -5.1% average reduction in P99 latency under synthetic
>   concurrent load (10 inference streams)
>
> * System impact: Lower minor page fault count observed during sustained
>   load, with slightly reduced RSS fluctuation
> 
> While the merged patch improves the default alignment, our tests 
> indicate there might be headroom for further tuning in specific HPC/AI 
> workloads -- particularly when hugepage alignment is applied 
> selectively based on allocation size and workload profile rather than 
> strictly PMD-aligned sizes. I have also been working on specifics and
> pseudo-diffs against the current Linux code, which I can generate and
> send via git send-email.
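
For reference, a minimal sketch of the size rule named in the merged
patch's title ("PMD-aligned sizes"), assuming a 2 MiB PMD size as on
x86-64 with 4 KiB base pages. The helper name and the constant are
illustrative only and are not kernel identifiers; the sketch shows which
anonymous mapping lengths would qualify under a strict PMD-multiple rule,
not how the kernel implements the check.

```python
# Assumption: 2 MiB PMD size, as on x86-64 with 4 KiB base pages.
# Other architectures and page-size configurations differ.
ASSUMED_PMD_SIZE = 2 * 1024 * 1024


def qualifies_for_thp_alignment(length: int, pmd_size: int = ASSUMED_PMD_SIZE) -> bool:
    """Return True if a mapping length is a non-zero multiple of pmd_size.

    Hypothetical helper for illustration; it mirrors the rule in the patch
    title, not the kernel's actual code path.
    """
    return length > 0 and length % pmd_size == 0


if __name__ == "__main__":
    # A few sample mapping lengths: one page, exactly one PMD,
    # 1.5 PMDs, and a large multiple of the PMD size.
    for length in (4096, 2 * 1024 * 1024, 3 * 1024 * 1024, 64 * 1024 * 1024):
        print(f"{length:>10} bytes -> {qualifies_for_thp_alignment(length)}")
```

A policy that depends on allocation size and workload profile would
replace the single modulo test with a different predicate; what that
predicate should be is the open question here.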
> 
> I'd be happy to collaborate on a deeper investigation once I recover 
> the original scripts -- or I can try to replicate the environment on a 
> fresh setup and collect new diffs for comparison.
> 
> Best regards,
> Siddhartha Sharma


end of thread, other threads:[~2025-09-25 23:12 UTC | newest]

Thread overview: 4+ messages
2025-08-11 22:14 [PATCH] mm: limit THP alignment – performance gain observed in AI inference workloads siddhartha
     [not found] ` <595a57cd68463194fb2d6f34e9366e38@vger.kernel.org>
     [not found]   ` <0197c80c5bc7989b858b79317a4fbc45@kenip.in>
2025-09-25 13:54     ` [PATCH follow-up] mm/thp: Requesting status update on alignment performance configuration siddhartha
2025-09-25 18:46       ` Vlastimil Babka
2025-09-25 23:12         ` siddhartha
