From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Hugh Dickins <hughd@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.cz>,
	David Rientjes <rientjes@google.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Christoph Lameter <cl@linux.com>
Subject: Re: [PATCHv3 4/5] mm: make compound_head() robust
Date: Wed, 26 Aug 2015 09:38:51 -0700
Message-ID: <20150826163851.GF11078@linux.vnet.ibm.com>
In-Reply-To: <20150826150412.GA16412@node.dhcp.inet.fi>

On Wed, Aug 26, 2015 at 06:04:12PM +0300, Kirill A. Shutemov wrote:
> On Tue, Aug 25, 2015 at 02:19:54PM -0700, Paul E. McKenney wrote:
> > On Tue, Aug 25, 2015 at 10:46:44PM +0200, Vlastimil Babka wrote:
> > > On 25.8.2015 22:11, Paul E. McKenney wrote:
> > > > On Tue, Aug 25, 2015 at 09:33:54PM +0300, Kirill A. Shutemov wrote:
> > > >> On Tue, Aug 25, 2015 at 01:44:13PM +0200, Vlastimil Babka wrote:
> > > >>> On 08/21/2015 02:10 PM, Kirill A. Shutemov wrote:
> > > >>>> On Thu, Aug 20, 2015 at 04:36:43PM -0700, Andrew Morton wrote:
> > > >>>>> On Wed, 19 Aug 2015 12:21:45 +0300 "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:
> > > >>>>>
> > > >>>>>> The patch introduces page->compound_head into third double word block in
> > > >>>>>> front of compound_dtor and compound_order. That means it shares storage
> > > >>>>>> space with:
> > > >>>>>>
> > > >>>>>>  - page->lru.next;
> > > >>>>>>  - page->next;
> > > >>>>>>  - page->rcu_head.next;
> > > >>>>>>  - page->pmd_huge_pte;
> > > >>>>>>
> > > >>>
> > > >>> We should probably ask Paul about the chances that rcu_head.next would like
> > > >>> to use the bit too one day?
> > > >>
> > > >> +Paul.
> > > > 
> > > > The call_rcu() function does stomp that bit, but if you stop using that
> > > > bit before you invoke call_rcu(), no problem.
> > > 
> > > You mean that it sets the bit 0 of rcu_head.next during its processing?
> > 
> > Not at the moment, though RCU will splat if given a misaligned rcu_head
> > structure because of the possibility to use that bit to flag callbacks
> > that do nothing but free memory.  If RCU needs to do that (e.g., to
> > promote energy efficiency), then that bit might well be set during
> > RCU grace-period processing.
> 
> Ugh.. :-/
> 
> > > bad news then. It's not that we would trigger that bit when the rcu_head part of
> > > the union is "active". It's that pfn scanners could inspect such page at
> > > arbitrary time, see the bit 0 set (due to RCU processing) and think that it's a
> > > tail page of a compound page, and interpret the rest of the pointer as a pointer
> > > to the head page (to test it for flags etc).
> > 
> > On the other hand, if you avoid scanning rcu_head structures for pages
> > that are currently waiting for a grace period, no problem.  RCU does
> > not use the rcu_head structure at all except for during the time between
> > when call_rcu() is invoked on that rcu_head structure and the time that
> > the callback is invoked.
> > 
> > Is there some other page state that indicates that the page is waiting
> > for a grace period?  If so, you could simply avoid testing that bit in
> > that case.
> 
> No, I don't think so.

OK, I'll bite...  How do you know that it is safe to invoke call_rcu(),
given that you are not allowed to invoke call_rcu() until the previous
callback has been invoked?
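
The rule behind that question can be illustrated with a toy model. The sketch below is a hypothetical userspace analogue, not RCU's implementation: the toy_ names and the simple tail-pointer queue are assumptions made for illustration. It only shows why re-queueing a callback structure that is still pending damages a singly linked callback list.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct rcu_head. */
struct toy_cb {
	struct toy_cb *next;
	void (*func)(struct toy_cb *cb);
};

/* Hypothetical singly linked callback queue with a tail pointer. */
struct toy_cb_list {
	struct toy_cb *head;
	struct toy_cb **tail;
};

static void toy_list_init(struct toy_cb_list *list)
{
	list->head = NULL;
	list->tail = &list->head;
}

/* Toy analogue of call_rcu(): append cb to the end of the queue. */
static void toy_call_rcu(struct toy_cb_list *list, struct toy_cb *cb,
			 void (*func)(struct toy_cb *cb))
{
	cb->func = func;
	cb->next = NULL;	/* clobbers the link if cb is already queued */
	*list->tail = cb;
	list->tail = &cb->next;
}

/* Toy analogue of grace-period end: run the queued callbacks, return how many ran. */
static int toy_invoke_callbacks(struct toy_cb_list *list)
{
	int invoked = 0;
	struct toy_cb *cb = list->head;

	toy_list_init(list);
	while (cb) {
		struct toy_cb *next = cb->next;	/* read first: func may reuse cb */

		cb->func(cb);
		invoked++;
		cb = next;
	}
	return invoked;
}

/* Callback that does nothing, for demonstration only. */
static void toy_noop(struct toy_cb *cb)
{
	(void)cb;
}
```

In this model, queueing a, then b, then a again before any callback has run resets a's next pointer, so b drops off the list and only one callback is ever invoked.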

> For compound pages, most of the state information is stored in the head
> page (e.g. page_count(), flags, etc.). So if we examine a random page (the
> pfn scanner case), the very first thing we want to know is whether we
> stepped on a tail page.
> PageTail() is what I wanted to encode in the bit...

Ah, so that would require the page scanner to do reverse mapping or some
such, then.  Which is perhaps what you are trying to avoid.
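
To make the concern concrete, here is a minimal userspace sketch of the encoding under discussion. The toy_ names and struct layouts are hypothetical simplifications, not the kernel's definitions, and uintptr_t stands in for the kernel's unsigned long. The only properties carried over from the thread are that compound_head shares storage with rcu_head.next and that bit 0 marks a tail page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rcu_head. */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
};

/* Hypothetical, simplified stand-in for struct page. */
struct toy_page {
	unsigned long flags;
	union {
		uintptr_t compound_head;	/* bit 0 set: this is a tail page */
		struct toy_rcu_head rcu_head;	/* ->next overlays compound_head */
	};
};

/* Mark tail as a tail page of head: store head's address with bit 0 set. */
static inline void toy_set_compound_head(struct toy_page *tail,
					 struct toy_page *head)
{
	tail->compound_head = (uintptr_t)head | 1;
}

/* A page is treated as a tail page purely on the basis of bit 0. */
static inline bool toy_page_tail(const struct toy_page *page)
{
	return page->compound_head & 1;
}

/* Return the head page for a tail page, or the page itself otherwise. */
static inline struct toy_page *toy_compound_head(struct toy_page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1)
		return (struct toy_page *)(head - 1);
	return page;
}
```

If anything were to set bit 0 of rcu_head.next while the page is waiting for a grace period, toy_page_tail() would report a tail page and toy_compound_head() would return a bogus head pointer, which is the misinterpretation described above for pfn scanners.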

> What if we change order of fields within rcu_head and put ->func first?
> Can we expect this pointer to have bit 0 always clear?

I asked that question some time back, and the answer was "no".  You
can apparently have functions that start at odd addresses on some
architectures.

That said, there are likely to be reserved bits somewhere in the function
address, perhaps varying depending on architecture and/or boot, in the
case of address-space randomization.  Perhaps there is some way of
identifying those bits, along with architecture-independent ways of
querying and setting them?

							Thanx, Paul
