From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>,
Yury Norov <yury.norov@gmail.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Nicholas Piggin <npiggin@gmail.com>,
Ding Tianhong <dingtianhong@huawei.com>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Alexey Klimov <aklimov@redhat.com>,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] vmap(): don't allow invalid pages
Date: Thu, 20 Jan 2022 16:54:03 +0000
Message-ID: <YemTq/yGkHQ+grd1@shell.armlinux.org.uk>
In-Reply-To: <e6fde086-16b9-430f-5510-5296ef74a4e7@arm.com>

On Thu, Jan 20, 2022 at 04:37:01PM +0000, Robin Murphy wrote:
> On 2022-01-20 13:03, Russell King (Oracle) wrote:
> > On Thu, Jan 20, 2022 at 12:22:35PM +0000, Robin Murphy wrote:
> > > On 2022-01-19 19:12, Russell King (Oracle) wrote:
> > > > On Wed, Jan 19, 2022 at 06:43:10PM +0000, Robin Murphy wrote:
> > > > > Indeed, my impression is that the only legitimate way to get hold of a page
> > > > > pointer without assumed provenance is via pfn_to_page(), which is where
> > > > > pfn_valid() comes in. Thus pfn_valid(page_to_pfn()) really *should* be a
> > > > > tautology.
> > > >
> > > > That can only be true if pfn == page_to_pfn(pfn_to_page(pfn)) for all
> > > > values of pfn.
> > > >
> > > > Given how pfn_to_page() is defined in the sparsemem case:
> > > >
> > > > #define __pfn_to_page(pfn) \
> > > > ({	unsigned long __pfn = (pfn); \
> > > > 	struct mem_section *__sec = __pfn_to_section(__pfn); \
> > > > 	__section_mem_map_addr(__sec) + __pfn; \
> > > > })
> > > > #define page_to_pfn __page_to_pfn
> > > >
> > > > that isn't the case, especially when looking at page_to_pfn():
> > > >
> > > > #define __page_to_pfn(pg) \
> > > > ({	const struct page *__pg = (pg); \
> > > > 	int __sec = page_to_section(__pg); \
> > > > 	(unsigned long)(__pg - __section_mem_map_addr(__nr_to_section(__sec))); \
> > > > })
> > > >
> > > > Where:
> > > >
> > > > static inline unsigned long page_to_section(const struct page *page)
> > > > {
> > > > 	return (page->flags >> SECTIONS_PGSHIFT) & SECTIONS_MASK;
> > > > }
> > > >
> > > > So if page_to_section() returns something that is, e.g. zero for an
> > > > invalid page in a non-zero section, you're not going to end up with
> > > > the right pfn from page_to_pfn().
> > >
> > > Right, I emphasised "should" in an attempt to imply "in the absence of
> > > serious bugs that have further-reaching consequences anyway".
> > >
> > > > As I've said now a couple of times, trying to determine if a struct
> > > > page pointer is valid is the wrong question to be asking.
> > >
> > > And doing so in one single place, on the justification of avoiding an
> > > incredibly niche symptom, is even more so. Not to mention that an address
> > > size fault is one of the best possible outcomes anyway, vs. the untold
> > > damage that may stem from accesses actually going through to random parts of
> > > the physical memory map.
> >
> > I don't see it as a "niche" symptom.
>
> The commit message specifically cites a Data Abort "at address translation
> later". Broadly speaking, a Data Abort due to an address size fault only
> occurs if you've been lucky enough that the bogus PA which got mapped is so
> spectacularly wrong that it's beyond the range configured in TCR.IPS. How
> many other architectures even have a mechanism like that?

I think we're misinterpreting each other.

> > If we start off with the struct page being invalid, then the result of
> > page_to_pfn() can not be relied upon to produce something that is
> > meaningful - which is exactly why the vmap() issue arises.
> >
> > With a pfn_valid() check, we at least know that the PFN points at
> > memory.
>
> No, we know it points to some PA space which has a struct page to represent
> it. pfn_valid() only says that pfn_to_page() will yield a valid result. That
> also includes things like reserved pages covering non-RAM areas, where a
> kernel VA mapping existing at all could potentially be fatal to the system
> even if it's never explicitly accessed - for all we know it might be a
> carveout belonging to overly-aggressive Secure software such that even a
> speculative prefetch might trigger an instant system reset.

So are you saying that the "address size fault" can happen because we've
mapped something for which pfn_valid() returns true?

> > However, that memory could be _anything_ in the system - it
> > could be the kernel image, and it could give userspace access to
> > change kernel code.
> >
> > So, while it is useful to do a pfn_valid() check in vmap(), as I said
> > to willy, this must _not_ be the primary check. It should IMHO use
> > WARN_ON() to make it blatantly obvious that it should be something we
> > expect _not_ to trigger under normal circumstances, but is there to
> > catch programming errors elsewhere.
>
> Rather, "to partially catch unrelated programming errors elsewhere, provided
> the buggy code happens to call vmap() rather than any of the many other
> functions with a struct page * argument." That's where it stretches my
> definition of "useful" just a bit too far. It's not about perfect being the
> enemy of good, it's about why vmap() should be special, and death by a
> thousand "useful" cuts - if we don't trust the pointer, why not check its
> alignment for basic plausibility first? If it seems valid, why not check if
> the page flags look sensible to make sure? How many useful little checks is
> too many? Every bit of code footprint and execution overhead imposed
> unconditionally on all end users to theoretically save developers' debugging
> time still adds up. Although on that note, it looks like arch/arm's
> pfn_valid() is still a linear scan of the memblock array, so the overhead of
> adding that for every page in every vmap() might not even be so small...

Well, I think I've adequately explained why I believe that using:

	pfn_valid(page_to_pfn(page))

as the primary check is substandard, and will likely lead to a future
CVE. When generating an array of struct page pointers, I believe it is
the responsibility of the generator to ensure that the array contains
only valid pages.
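
To make the round-trip argument quoted earlier concrete, here is a
minimal userspace sketch of classic sparsemem-style conversion. It is
not kernel code: the section size, the shuffled layout in slot[],
keeping the section number in the low bits of flags (the kernel shifts
by SECTIONS_PGSHIFT), the trivial pfn_valid(), and the helpers
model_init() and stale_flags_demo() are all assumptions made purely for
illustration.

```c
/* Userspace model only; sizes, layout and helper names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGES_PER_SECTION 16UL
#define NR_SECTIONS       4UL
#define SECTIONS_MASK     (NR_SECTIONS - 1)

struct page { unsigned long flags; };      /* low bits: section number (assumed) */
struct mem_section { struct page *map; };  /* first struct page of the section */

static struct page pool[NR_SECTIONS * PAGES_PER_SECTION];
static struct mem_section sections[NR_SECTIONS];

/* Sections sit in the pool in shuffled order, so pool index != pfn. */
static const unsigned long slot[NR_SECTIONS] = { 1, 0, 3, 2 };

static void model_init(void)
{
	for (unsigned long s = 0; s < NR_SECTIONS; s++) {
		sections[s].map = &pool[slot[s] * PAGES_PER_SECTION];
		for (unsigned long i = 0; i < PAGES_PER_SECTION; i++)
			sections[s].map[i].flags = s;
	}
}

/* In this model every pfn inside the pool has a struct page. */
static bool pfn_valid(unsigned long pfn)
{
	return pfn < NR_SECTIONS * PAGES_PER_SECTION;
}

static struct page *pfn_to_page(unsigned long pfn)
{
	return sections[pfn / PAGES_PER_SECTION].map + pfn % PAGES_PER_SECTION;
}

/* Trusts the section number stored in the page itself. */
static unsigned long page_to_pfn(const struct page *page)
{
	unsigned long sec = page->flags & SECTIONS_MASK;
	ptrdiff_t off = page - sections[sec].map;

	return (unsigned long)(off + (ptrdiff_t)(sec * PAGES_PER_SECTION));
}

/* Returns the pfn computed for pfn 50's page after its flags are zeroed. */
static unsigned long stale_flags_demo(void)
{
	struct page *page;

	model_init();
	page = pfn_to_page(50);
	assert(page_to_pfn(page) == 50);	/* intact flags: round trip holds */

	page->flags = 0;			/* section bits lost */
	return page_to_pfn(page);
}
```

In this model stale_flags_demo() yields 18: a pfn that pfn_valid()
accepts, but which names a different struct page from the one that was
passed in. A pfn_valid(page_to_pfn(page)) test would therefore pass
while the caller ends up mapping unrelated memory.
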
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!