From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Hugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Andrea Arcangeli <aarcange@redhat.com>,
David Rientjes <rientjes@google.com>,
Dave Hansen <dave.hansen@intel.com>, Mel Gorman <mgorman@suse.de>,
Rik van Riel <riel@redhat.com>, Vlastimil Babka <vbabka@suse.cz>,
Christoph Lameter <cl@gentwo.org>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Steve Capper <steve.capper@linaro.org>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@suse.cz>,
Jerome Marchand <jmarchan@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: page-flags behavior on compound pages: a worry
Date: Thu, 6 Aug 2015 18:33:00 +0300
Message-ID: <20150806153259.GA2834@node.dhcp.inet.fi>
In-Reply-To: <alpine.LSU.2.11.1508052001350.6404@eggly.anvils>
On Wed, Aug 05, 2015 at 09:15:57PM -0700, Hugh Dickins wrote:
> Hi Kirill,
>
> I had a nasty thought this morning.
Tough day.
I'm trying to wrap my head around this mail and I'm not sure I've
succeeded. :-|
> Andrew had prodded me gently to re-examine my concerns with your
> page-flags rework in mmotm. I still dislike the bloat (my mm/built-in.o
> text goes up from 478513 to 490183 bytes on a non-DEBUG_VM build); but I
> was hoping to set that aside, to let us move forward.
>
> But looking into the bloat led me to what seems a more serious issue
> with it. I'd tacked a little function on to the end of mm/filemap.c:
>
> bool page_is_locked(struct page *page)
> {
> 	return !!PageLocked(page);
> }
>
> which came out as:
>
> 0000000000003a60 <page_is_locked>:
>     3a60:  48 8b 07        mov    (%rdi),%rax
>     3a63:  55              push   %rbp
>     3a64:  48 89 e5        mov    %rsp,%rbp
>
> [instructions above same as without your patches; those below added by them]
>
>     3a67:  f6 c4 80        test   $0x80,%ah
>     3a6a:  74 10           je     3a7c <page_is_locked+0x1c>
>     3a6c:  48 8b 47 30     mov    0x30(%rdi),%rax
>     3a70:  48 8b 17        mov    (%rdi),%rdx
>     3a73:  80 e6 80        and    $0x80,%dh
>     3a76:  48 0f 44 c7     cmove  %rdi,%rax
>     3a7a:  eb 03           jmp    3a7f <page_is_locked+0x1f>
>     3a7c:  48 89 f8        mov    %rdi,%rax
>     3a7f:  48 8b 00        mov    (%rax),%rax
>
> [instructions above added by your patches; those below same as before]
>
>     3a82:  5d              pop    %rbp
>     3a83:  83 e0 01        and    $0x1,%eax
>     3a86:  c3              retq
>
> The "and $0x80,%dh" looked superfluous at first, but of course it isn't:
> it's from the smp_rmb() in David's 668f9abbd433 "mm: close PageTail race"
> (a later commit refactors compound_head() but doesn't change the story).
>
> And it's that race, or a worse race of that kind, that now worries me.
> Relying on smp_wmb() and smp_rmb() may be all that was needed in the
> case that David was fixing; and (I dare not look at them to audit!)
> all uses of compound_head() in our current v4.2-rc tree may well be
> safe, for this or that contingent reason in each place that it's used.
>
> But there is no locking within compound_head(page) to make it safe
> everywhere, yet your page-flags rework is changing a large number
> of PageWhatever()s and SetPageWhatever()s and ClearPageWhatever()s
> now to do a hidden compound_head(page) beneath the covers.
>
> To be more specific: if preemption, or an interrupt, or entry to SMM
> mode, or whatever, delays this thread somewhere in that compound_head()
> sequence of instructions, how can we be sure that the "head" returned
> by compound_head() is good? We know the page was PageTail just before
> looking up page->first_page, and we know it was PageTail just after,
> but we don't know that it was PageTail throughout, and we don't know
> whether page->first_page is even a good page pointer, or something
> else from the private/ptl/slab_cache union.
That looks like a very valid worry to me, for the current -mm tree.
But let's take my refcounting rework into the picture.
One thing it simplifies is protection against splitting. Once you've got a
reference to a page, it cannot be split under you. That makes PageTail() and
->first_page stable for most callsites.
We can access the page's flags under ptl without holding a reference to the
page. And that's fine: ptl protects against splitting too.
Fast GUP also has a way to protect against splitting.
IIUC, the only potentially problematic callsites left are physical memory
scanners. That code requires an audit. I'll do that.
Am I missing something else?
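To make the sequence under discussion concrete, here is a minimal userspace
sketch of the check / load / barrier / re-check pattern that the disassembly
above corresponds to. It is a model under stated assumptions, not the kernel
source: struct model_page, MODEL_PG_TAIL, MODEL_PG_LOCKED and the model_*
functions are hypothetical names, the bit positions are only read off the
listing ($0x80 in %ah is bit 15, "and $0x1" is bit 0), and a C11 acquire
fence stands in for smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit positions, inferred from the quoted disassembly only. */
#define MODEL_PG_LOCKED (1UL << 0)
#define MODEL_PG_TAIL   (1UL << 15)

/* Hypothetical stand-in for struct page; in the kernel, first_page shares
 * storage with other fields (the private/ptl/slab_cache union). */
struct model_page {
	_Atomic unsigned long flags;
	_Atomic(struct model_page *) first_page;
};

/* Models the tail check, pointer load, read barrier, and tail re-check. */
static struct model_page *model_compound_head(struct model_page *page)
{
	assert(page != NULL);

	if (atomic_load_explicit(&page->flags, memory_order_relaxed) & MODEL_PG_TAIL) {
		struct model_page *head =
			atomic_load_explicit(&page->first_page, memory_order_relaxed);

		/* Stand-in for smp_rmb(): order the pointer load before the re-check. */
		atomic_thread_fence(memory_order_acquire);

		/* The page was a tail before and after the load; nothing here
		 * proves it stayed a tail in between. */
		if (atomic_load_explicit(&page->flags, memory_order_relaxed) & MODEL_PG_TAIL)
			return head;
	}
	return page;
}

/* Models a page-flag test that silently redirects to the head page. */
static bool model_page_locked(struct model_page *page)
{
	struct model_page *head = model_compound_head(page);

	return (atomic_load_explicit(&head->flags, memory_order_relaxed) &
		MODEL_PG_LOCKED) != 0;
}
```

With a tail page whose first_page points at a locked head,
model_page_locked(&tail) reports the head's lock bit. Nothing inside the
function keeps the page from changing between the two flag reads, so the
stability has to come from the caller: a reference, ptl, or whatever else
blocks splitting at that callsite.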
> Of course it would be very rare for it to go wrong; and most callsites
> will obviously be safe for this or that reason; though, sadly, none of
> them safe from holding a reference to the tail page in question, since
> its count is frozen at 0 and cannot be grabbed by get_page_unless_zero.
Do you mean that grabbing the head page's ->_count is not enough to protect
against the tail page being split off and freed under you?
I know a patchset which solves this! ;)
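As an illustration of why a tail page with its count frozen at 0 cannot be
pinned this way, here is a minimal userspace sketch of "take a reference
unless the count is zero" semantics. It is only a model, assuming C11
atomics: struct counted_page and model_get_page_unless_zero() are
hypothetical names, not the kernel's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the reference count held in struct page. */
struct counted_page {
	atomic_int count;
};

/* Increment the count only if it is currently non-zero.
 * Returns true when a reference was taken, false when the count was 0. */
static bool model_get_page_unless_zero(struct counted_page *page)
{
	int old;

	assert(page != NULL);

	old = atomic_load(&page->count);
	while (old != 0) {
		/* On failure the CAS reloads 'old' with the current value,
		 * so the loop re-tests it against zero. */
		if (atomic_compare_exchange_weak(&page->count, &old, old + 1))
			return true;
	}
	return false;
}
```

A page whose count is held at 0 always takes the "return false" path, so a
caller that only has a pointer to such a page gets no pin from this helper
and has to rely on some other guarantee.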
> But I don't see how it can be safe to rely on compound_head() inside
> a general purpose page-flag function, that we're all accustomed to
> think of as a simple bitop, that can be applied without great care.
>
> Hugh
--
Kirill A. Shutemov