From: David Hildenbrand <david@redhat.com>
To: "Christian König" <christian.koenig@amd.com>,
intel-xe@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
x86@kernel.org
Cc: airlied@gmail.com, thomas.hellstrom@linux.intel.com,
matthew.brost@intel.com, dave.hansen@linux.intel.com,
luto@kernel.org, peterz@infradead.org,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Subject: Re:
Date: Tue, 26 Aug 2025 11:17:06 +0200
Message-ID: <75aca34d-3557-49e9-a523-bd3244c28190@redhat.com>
In-Reply-To: <fc3e013c-e7f7-441d-a638-2ee3dd372775@amd.com>
On 26.08.25 11:00, Christian König wrote:
> On 26.08.25 10:46, David Hildenbrand wrote:
>>>> So my assumption would be that that is missing for the drivers here?
>>>
>>> Well yes and no.
>>>
>>> See the PAT is optimized for applying specific caching attributes to ranges [A..B] (e.g. it uses an R/B tree). But what drivers do here is that they have single pages (usually for get_free_page or similar) and want to apply a certain caching attribute to it.
>>>
>>> So what would happen is that we completely clutter the R/B tree used by the PAT with thousands if not millions of entries.
>>>
>>
>> Hm, above you're saying that there is no direct map, but now you are saying that the pages were obtained through get_free_page()?
>
> The problem only happens with highmem pages on 32bit kernels. Those pages are not in the linear mapping.
Right, in the common case there is a direct map.
>
>> I agree that what you describe here sounds suboptimal. But if the pages were obtained from the buddy, there surely is a direct map -- unless we explicitly remove it :(
>>
>> If we're talking about individual pages without a directmap, I would wonder if they are actually part of a bigger memory region that can just be reserved in one go (similar to how remap_pfn_range() would handle it).
>>
>> Can you briefly describe how your use case obtains these PFNs, and how scattered they + their caching attributes might be?
>
> What drivers do is to call get_free_page() or alloc_pages_node() with the GFP_HIGHUSER flag set.
>
> For non-highmem pages, drivers then call set_pages_wc/uc(), which changes the caching of the linear mapping. For highmem pages there is no linear mapping, so set_pages_wc() or set_pages_uc() doesn't work and drivers avoid calling it.
>
> Those are basically just random system memory pages. So they are potentially scattered over the whole memory address space.
Thanks, that's valuable information.

So essentially these drivers maintain their own consistency and PAT is
not aware of that.

And the real problem is ordinary system RAM.

There are various ways forward.

1) We use another interface that consumes pages instead of PFNs, like a
   vm_insert_pages_pgprot() we would be adding.

   Is there any strong requirement for inserting non-refcounted PFNs?

2) We add another interface that consumes PFNs, but explicitly states
   that it is only for ordinary system RAM, and that the caller is
   responsible for updating the direct map.

   We could sanity-check the direct map in debug kernels.

3) We teach the PAT code in pfnmap_setup_cachemode_pfn() to treat this
   system RAM differently.

There is also the option of a mixture of 1 and 2, where we get
pages, but we map them non-refcounted in a VM_PFNMAP.

In general, having pages makes it easier to assert that they are likely
ordinary system RAM pages, and that the interface is not getting abused
for something else.

We could also perform the set_pages_wc/uc() from inside that function,
but maybe it depends on the use case whether we want to do that whenever
we map them into a process?
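
To make the debug sanity check from option 2 a bit more concrete, here is
a rough userspace model in plain C. It is not kernel code and not a
proposed patch: struct toy_page, enum cache_mode and
toy_check_direct_map() are all made up for illustration, and a real
version would have to inspect the actual direct-map PTEs instead of a
flag stored next to the PFN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy cache modes; illustrative names, not the kernel's enum. */
enum cache_mode { CM_WB, CM_WC, CM_UC };

/* Hypothetical stand-in for per-page state; this is NOT struct page. */
struct toy_page {
	unsigned long pfn;
	bool is_system_ram;               /* ordinary RAM from the page allocator */
	bool has_direct_map;              /* false for highmem pages on 32-bit */
	enum cache_mode direct_map_mode;  /* attribute currently set in the direct map */
};

/*
 * Hypothetical check for option 2: reject anything that is not ordinary
 * system RAM, and verify that the caller already switched the direct map
 * to the cache mode it now wants for the user mapping. Pages without a
 * direct map have no alias to conflict with, so they pass.
 */
static bool toy_check_direct_map(const struct toy_page *pages, size_t n,
				 enum cache_mode want)
{
	assert(pages != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].is_system_ram)
			return false;
		if (pages[i].has_direct_map &&
		    pages[i].direct_map_mode != want)
			return false;
	}
	return true;
}
```

The idea would be that the caller does its set_pages_wc/uc() first, and a
debug kernel only complains when a lowmem page still has a mismatching
attribute in the direct map; highmem pages are skipped because there is
nothing to check.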
--
Cheers
David / dhildenb