public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Yuanhan Liu <yuanhan.liu@linux.intel.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Michel Lespinasse <walken@google.com>
Subject: Re: [PATCH 4/4] mm/rmap.c: move anon_vma initialization code into anon_vma_ctor
Date: Mon, 4 Nov 2013 11:37:49 +0800	[thread overview]
Message-ID: <20131104033749.GG30123@yliu-dev.sh.intel.com> (raw)
In-Reply-To: <CA+55aFzAxnP0R8LrO9d1MzjCs8LxKXGwxadnUtoKeXhAh3Mzqw@mail.gmail.com>

On Fri, Nov 01, 2013 at 11:04:40AM -0700, Linus Torvalds wrote:
> On Fri, Nov 1, 2013 at 12:54 AM, Yuanhan Liu
> <yuanhan.liu@linux.intel.com> wrote:
> > @@ -67,19 +67,7 @@ static struct kmem_cache *anon_vma_chain_cachep;
> >
> >  static inline struct anon_vma *anon_vma_alloc(void)
> >  {
> > -       struct anon_vma *anon_vma;
> > -
> > -       anon_vma = kmem_cache_alloc(anon_vma_cachep, GFP_KERNEL);
> > -       if (anon_vma) {
> > -               atomic_set(&anon_vma->refcount, 1);
> > -               /*
> > -                * Initialise the anon_vma root to point to itself. If called
> > -                * from fork, the root will be reset to the parents anon_vma.
> > -                */
> > -               anon_vma->root = anon_vma;
> > -       }
> > -
> > -       return anon_vma;
> > +       return kmem_cache_alloc(anon_vma_cachep, GFP_KERNEL);
> >  }
> >
> >  static inline void anon_vma_free(struct anon_vma *anon_vma)
> > @@ -293,8 +281,15 @@ static void anon_vma_ctor(void *data)
> >         struct anon_vma *anon_vma = data;
> >
> >         rwlock_init(&anon_vma->rwlock);
> > -       atomic_set(&anon_vma->refcount, 0);
> >         anon_vma->rb_root = RB_ROOT;
> > +
> > +       atomic_set(&anon_vma->refcount, 1);
> > +       /*
> > +        * Initialise the anon_vma root to point to itself. If called
> > +        * from fork, the root will be reset to the parents anon_vma.
> > +        */
> > +       anon_vma->root = anon_vma;
> > +
> >  }
> 
> This looks totally invalid.
> 
> The slab constructor is *not* called on every allocation.

Sorry, I didn't know that :(

And thanks for the detailed info very much!

	--yliu

> Quite the
> reverse. Constructors are called when the underlying allocation is
> initially done, and then *not* done again, even if that particular
> object may be allocated and free'd many times.
> 
> So the reason we can do
> 
>     atomic_set(&anon_vma->refcount, 0);
> 
> in a constructor is that anybody who frees that allocation will do so
> only when the refcount goes back down to zero, so zero is "valid
> state" while the slab entry stays on some percpu freelist.
> 
> But the same is ABSOLUTELY NOT TRUE of the value "1", nor is it true
> of the anon_vma->root. When the anonvma gets free'd, those values will
> *not* be the same (the refcount has been decremented to zero, and the
> root will have been set to whatever the root was.
> 
> So the rule about constructors is that the values they construct
> absolutely *have* to be the ones they get free'd with. With one
> special case.
> 
> Using slab constructors is almost always a mistake. The original
> Sun/Solaris argument for them was to avoid initialization costs in
> allocators, and that was pure and utter bullshit (initializing a whole
> cacheline is generally cheaper than not initializing it and having to
> fetch it from L3 caches, but it does hide the cost so that it is now
> spread out in the users rather than in the allocator).
> 
> So the _original_ reason for slab is pure and utter BS, and we've
> removed pretty much all uses of the constructors.
> 
> In fact, the only valid reason for using them any more is the special
> case: locks and RCU.
> 
> The reason we still have constructors is that sometimes we want to
> keep certain data structures "alive" across allocations together with
> SLAB_DESTROY_BY_RCU (which delays the actual *page* destroying by RCU,
> but the allocation can go to the free-list and get re-allocated
> without a RCU grace-period).
> 
> But because allocations can now "stay active" over a
> alloc/free/alloc-again sequence, that means that the allocation
> sequence MUST NOT re-initialize the lock, because some RCU user may
> still be looking at those fields (and in particular, unlocking an
> allocation that in the meantime got free'd and re-allocated).
> 
> So these days, the *only* valid pattern for slab constructors is
> together with SLAB_DESTROY_BY_RCU, and making sure that the fields
> that RCU readers look at (and in particular, change) are "stable" over
> such re-allocations.
> 
> Your patch is horribly wrong.
> 
>                   Linus

  reply	other threads:[~2013-11-04  3:37 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-01  7:54 [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t Yuanhan Liu
2013-11-01  7:54 ` [PATCH 1/4] mm/rmap: per anon_vma lock Yuanhan Liu
2013-11-01  8:43   ` Peter Zijlstra
2013-11-01  9:22     ` Michel Lespinasse
2013-11-01  9:29       ` Ingo Molnar
2013-11-01 10:07       ` Yuanhan Liu
2013-11-01 10:15         ` Peter Zijlstra
2013-11-01 11:44           ` Yuanhan Liu
2013-11-01 12:07             ` Peter Zijlstra
2013-11-01 14:02               ` Yuanhan Liu
2013-11-01  9:38     ` Yuanhan Liu
2013-11-01 10:22       ` Peter Zijlstra
2013-11-01 14:09         ` Yuanhan Liu
2013-11-01 17:47           ` Linus Torvalds
2013-11-01  7:54 ` [PATCH 2/4] mm/rmap: convert anon_vma rwsem to rwlock_t Yuanhan Liu
2013-11-01  8:46   ` Peter Zijlstra
2013-11-01  7:54 ` [PATCH 3/4] mm/rmap: cleanup unnecessary code Yuanhan Liu
2013-11-01  7:54 ` [PATCH 4/4] mm/rmap.c: move anon_vma initialization code into anon_vma_ctor Yuanhan Liu
2013-11-01 18:04   ` Linus Torvalds
2013-11-04  3:37     ` Yuanhan Liu [this message]
2013-11-01  8:01 ` [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t Ingo Molnar
2013-11-01  8:11   ` Yuanhan Liu
2013-11-01  8:21     ` Ingo Molnar
2013-11-01 10:16       ` Yuanhan Liu
2013-11-02  3:15         ` Davidlohr Bueso
2013-11-04  3:59           ` Yuanhan Liu
2013-11-05  1:44             ` Tim Chen
2013-11-05  2:03               ` Tim Chen
2013-11-05  3:41                 ` Yuanhan Liu
2013-11-05  3:10               ` Yuanhan Liu
2013-11-05 14:43                 ` Yuanhan Liu
2013-11-01 17:49   ` Davidlohr Bueso
2013-11-01 18:09     ` Linus Torvalds
2013-11-01 18:47       ` Michel Lespinasse
2013-11-01 18:55         ` Linus Torvalds
2013-11-02  3:18           ` Davidlohr Bueso
2013-11-01 19:01 ` KOSAKI Motohiro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131104033749.GG30123@yliu-dev.sh.intel.com \
    --to=yuanhan.liu@linux.intel.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=riel@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=walken@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox