From: David Hildenbrand <david@redhat.com>
To: Kiryl Shutsemau <kirill@shutemov.name>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Matthew Wilcox <willy@infradead.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/filemap: Implement fast short reads
Date: Thu, 23 Oct 2025 13:49:22 +0200
Message-ID: <f57d73e3-fb6c-4c01-9897-c9686889fec2@redhat.com>
In-Reply-To: <7fmiqrcyiccff5okrs7sdz3i63mp376f2r76e4r5c2miluwk76@567sm46qop5h>
On 23.10.25 13:40, Kiryl Shutsemau wrote:
> On Thu, Oct 23, 2025 at 01:11:43PM +0200, David Hildenbrand wrote:
>> On 23.10.25 13:10, David Hildenbrand wrote:
>>> On 23.10.25 12:54, David Hildenbrand wrote:
>>>> On 23.10.25 12:31, Kiryl Shutsemau wrote:
>>>>> On Wed, Oct 22, 2025 at 07:28:27PM +0200, David Hildenbrand wrote:
>>>>>> "garbage" as in pointing at something without a direct map, something that's
>>>>>> protected differently (MTE? weird CoCo protection?) or even worse MMIO with
>>>>>> undesired read-effects.
>>>>>
>>>>> Pedro already points to the problem with missing direct mapping.
>>>>> _nofault() copy should help with this.
>>>>
>>>> Yeah, we do something similar when reading the kcore for that reason.
>>>>
>>>>>
>>>>> Can direct mapping ever be converted to MMIO? It can be converted to DMA
>>>>> buffer (which is fine), but MMIO? I have not seen it even in virtualized
>>>>> environments.
>>>>
>>>> I recall discussions in the context of PAT and the adjustment of caching
>>>> attributes of the direct map for MMIO purposes: so I suspect there are
>>>> ways that can happen, but I am not 100% sure.
>>>>
>>>>
>>>> Thinking about it, in VMs we have the direct map set on balloon inflated
>>>> pages that should not be touched, not even read, otherwise your
>>>> hypervisor might get very angry. That case we could likely handle by
>>>> checking whether the source page actually exists and doesn't have
>>>> PageOffline() set, before accessing it. A bit nasty.
>>>>
>>>> A more obscure case would probably be reading a page that was poisoned
>>>> by hardware and is not expected to be used anymore. That could also be
>>>> caught by checking the page.
>>>>
>>>> Essentially all the cases where we already try to avoid reading ordinary
>>>> memory that might have a direct map when creating memory dumps.
>>>>
>>>>
>>>> Regarding MTE and load_unaligned_zeropad(): I don't know unfortunately.
>>>
>>> Looking into this, I'd assume the exception handler will take care of it.
>>>
>>> load_unaligned_zeropad() is interesting if there is a direct map but the
>>> memory should not be touched (especially regarding PageOffline and
>>> memory errors).
>>>
>>> I read drivers/firmware/efi/unaccepted_memory.c, where there is a
>>> lengthy discussion about guard pages and how that works for unaccepted
>>> memory.
>>>
>>> While it works for unaccepted memory, it wouldn't work for other random
>>
>> Sorry I meant here "while that works for load_unaligned_zeropad()".
>
> Do we have other random reads?
>
> For unaccepted memory, we care about touching memory that was never
> allocated because accepting memory is a one-way road.
Right, but I suspect if you get a random read (as the unaccepted memory
doc states) you'd be in trouble as well.
The "nice" thing about unaccepted memory is that it is indeed a one-way
road, and at some point the system will not have any unaccepted memory
left.
>
> I only know about load_unaligned_zeropad() that does reads like this. Do
> you know others?
No, I am not aware of others. Most code that could read random memory
(kcore, vmcore) was fixed to exclude pages we know are unsafe to touch.
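To make the "exclude pages we know are unsafe to touch" idea concrete, here is a rough userspace sketch, not kernel code. Everything named model_* is hypothetical and only stands in for the kind of checks mentioned in this thread (PageOffline, hardware poison, missing direct map). Zero-filling the destination when the page is skipped is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model; these are NOT the kernel's struct page or page flags. */
enum model_page_state {
	MODEL_PAGE_OK       = 0,
	MODEL_PAGE_OFFLINE  = 1u << 0,	/* e.g. balloon-inflated, must not be read */
	MODEL_PAGE_HWPOISON = 1u << 1,	/* hardware reported a memory error */
};

struct model_page {
	unsigned int state;		/* bitmask of enum model_page_state */
	bool has_direct_map;		/* false if the page is not mapped for us */
	const unsigned char *data;	/* stands in for the page content */
};

/* Decide from metadata alone whether the content may be read at all. */
static bool model_page_safe_to_read(const struct model_page *page)
{
	if (!page || !page->data)
		return false;
	if (!page->has_direct_map)
		return false;
	if (page->state & (MODEL_PAGE_OFFLINE | MODEL_PAGE_HWPOISON))
		return false;
	return true;
}

/*
 * Copy len bytes of page content into dst. If the page is unsafe, the
 * content is never touched and dst is zero-filled instead (assumption).
 */
static bool model_read_page(const struct model_page *page, void *dst, size_t len)
{
	assert(dst != NULL);
	if (!model_page_safe_to_read(page)) {
		memset(dst, 0, len);
		return false;
	}
	memcpy(dst, page->data, len);
	return true;
}
```

The point of the shape is only that the decision is made before any access to the content, so an unsafe page is skipped rather than read and recovered from.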
Code that might speculatively access the "struct page" after it has
possibly been freed (speculative pagecache lookups, GUP-fast) will
just back off and never read the page content.
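The back-off pattern can be sketched roughly as follows, again as a userspace model with C11 atomics and not the actual pagecache or GUP-fast code. The model_* names, the owner/index identity check, and the plain-int refcount are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a refcounted object that may be freed concurrently. */
struct model_folio {
	atomic_int refcount;	/* 0 means "already freed" in this model */
	const void *owner;	/* which mapping the object belongs to */
	unsigned long index;	/* offset within that mapping */
};

/* Take a reference only if the count is not already zero. */
static bool model_try_get(struct model_folio *f)
{
	int old = atomic_load(&f->refcount);

	while (old != 0) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
			return true;
	}
	return false;
}

static void model_put(struct model_folio *f)
{
	int prev = atomic_fetch_sub(&f->refcount, 1);

	assert(prev > 0);
	(void)prev;
}

/*
 * Speculative lookup: only metadata is inspected. If the object was freed,
 * or was reused for something else in the meantime, back off without ever
 * touching its content.
 */
static struct model_folio *model_lookup(struct model_folio *f,
					const void *owner, unsigned long index)
{
	if (!model_try_get(f))
		return NULL;		/* freed: back off */
	if (f->owner != owner || f->index != index) {
		model_put(f);		/* reused: drop the reference, back off */
		return NULL;
	}
	return f;			/* stable: caller holds a reference */
}
```

Content is only read by a caller that got a non-NULL result, i.e. one that holds a reference and has re-validated the identity.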
We avoid such random memory reads as best we can, as they are just a pain
to deal with (like load_unaligned_zeropad(), which I just wish we could
get rid of, now that it is fresh in my memory again. :( ).
--
Cheers
David / dhildenb