From: Yuanhan Liu <yuanhan.liu@linux.intel.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Ingo Molnar <mingo@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Rik van Riel <riel@redhat.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Michel Lespinasse <walken@google.com>
Subject: Re: [PATCH 4/4] mm/rmap.c: move anon_vma initialization code into anon_vma_ctor
Date: Mon, 4 Nov 2013 11:37:49 +0800 [thread overview]
Message-ID: <20131104033749.GG30123@yliu-dev.sh.intel.com> (raw)
In-Reply-To: <CA+55aFzAxnP0R8LrO9d1MzjCs8LxKXGwxadnUtoKeXhAh3Mzqw@mail.gmail.com>
On Fri, Nov 01, 2013 at 11:04:40AM -0700, Linus Torvalds wrote:
> On Fri, Nov 1, 2013 at 12:54 AM, Yuanhan Liu
> <yuanhan.liu@linux.intel.com> wrote:
> > @@ -67,19 +67,7 @@ static struct kmem_cache *anon_vma_chain_cachep;
> >
> > static inline struct anon_vma *anon_vma_alloc(void)
> > {
> > - struct anon_vma *anon_vma;
> > -
> > - anon_vma = kmem_cache_alloc(anon_vma_cachep, GFP_KERNEL);
> > - if (anon_vma) {
> > - atomic_set(&anon_vma->refcount, 1);
> > - /*
> > - * Initialise the anon_vma root to point to itself. If called
> > - * from fork, the root will be reset to the parents anon_vma.
> > - */
> > - anon_vma->root = anon_vma;
> > - }
> > -
> > - return anon_vma;
> > + return kmem_cache_alloc(anon_vma_cachep, GFP_KERNEL);
> > }
> >
> > static inline void anon_vma_free(struct anon_vma *anon_vma)
> > @@ -293,8 +281,15 @@ static void anon_vma_ctor(void *data)
> > struct anon_vma *anon_vma = data;
> >
> > rwlock_init(&anon_vma->rwlock);
> > - atomic_set(&anon_vma->refcount, 0);
> > anon_vma->rb_root = RB_ROOT;
> > +
> > + atomic_set(&anon_vma->refcount, 1);
> > + /*
> > + * Initialise the anon_vma root to point to itself. If called
> > + * from fork, the root will be reset to the parents anon_vma.
> > + */
> > + anon_vma->root = anon_vma;
> > +
> > }
>
> This looks totally invalid.
>
> The slab constructor is *not* called on every allocation.
Sorry, I didn't know that :(
And thanks for the detailed info very much!
--yliu
> Quite the
> reverse. Constructors are called when the underlying allocation is
> initially done, and then *not* done again, even if that particular
> object may be allocated and free'd many times.
>
> So the reason we can do
>
> atomic_set(&anon_vma->refcount, 0);
>
> in a constructor is that anybody who frees that allocation will do so
> only when the refcount goes back down to zero, so zero is "valid
> state" while the slab entry stays on some percpu freelist.
>
> But the same is ABSOLUTELY NOT TRUE of the value "1", nor is it true
> of the anon_vma->root. When the anonvma gets free'd, those values will
> *not* be the same (the refcount has been decremented to zero, and the
> root will have been set to whatever the root was.
>
> So the rule about constructors is that the values they construct
> absolutely *have* to be the ones they get free'd with. With one
> special case.
>
> Using slab constructors is almost always a mistake. The original
> Sun/Solaris argument for them was to avoid initialization costs in
> allocators, and that was pure and utter bullshit (initializing a whole
> cacheline is generally cheaper than not initializing it and having to
> fetch it from L3 caches, but it does hide the cost so that it is now
> spread out in the users rather than in the allocator).
>
> So the _original_ reason for slab is pure and utter BS, and we've
> removed pretty much all uses of the constructors.
>
> In fact, the only valid reason for using them any more is the special
> case: locks and RCU.
>
> The reason we still have constructors is that sometimes we want to
> keep certain data structures "alive" across allocations together with
> SLAB_DESTROY_BY_RCU (which delays the actual *page* destroying by RCU,
> but the allocation can go to the free-list and get re-allocated
> without a RCU grace-period).
>
> But because allocations can now "stay active" over a
> alloc/free/alloc-again sequence, that means that the allocation
> sequence MUST NOT re-initialize the lock, because some RCU user may
> still be looking at those fields (and in particular, unlocking an
> allocation that in the meantime got free'd and re-allocated).
>
> So these days, the *only* valid pattern for slab constructors is
> together with SLAB_DESTROY_BY_RCU, and making sure that the fields
> that RCU readers look at (and in particular, change) are "stable" over
> such re-allocations.
>
> Your patch is horribly wrong.
>
> Linus
next prev parent reply other threads:[~2013-11-04 3:37 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-11-01 7:54 [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t Yuanhan Liu
2013-11-01 7:54 ` [PATCH 1/4] mm/rmap: per anon_vma lock Yuanhan Liu
2013-11-01 8:43 ` Peter Zijlstra
2013-11-01 9:22 ` Michel Lespinasse
2013-11-01 9:29 ` Ingo Molnar
2013-11-01 10:07 ` Yuanhan Liu
2013-11-01 10:15 ` Peter Zijlstra
2013-11-01 11:44 ` Yuanhan Liu
2013-11-01 12:07 ` Peter Zijlstra
2013-11-01 14:02 ` Yuanhan Liu
2013-11-01 9:38 ` Yuanhan Liu
2013-11-01 10:22 ` Peter Zijlstra
2013-11-01 14:09 ` Yuanhan Liu
2013-11-01 17:47 ` Linus Torvalds
2013-11-01 7:54 ` [PATCH 2/4] mm/rmap: convert anon_vma rwsem to rwlock_t Yuanhan Liu
2013-11-01 8:46 ` Peter Zijlstra
2013-11-01 7:54 ` [PATCH 3/4] mm/rmap: cleanup unnecessary code Yuanhan Liu
2013-11-01 7:54 ` [PATCH 4/4] mm/rmap.c: move anon_vma initialization code into anon_vma_ctor Yuanhan Liu
2013-11-01 18:04 ` Linus Torvalds
2013-11-04 3:37 ` Yuanhan Liu [this message]
2013-11-01 8:01 ` [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t Ingo Molnar
2013-11-01 8:11 ` Yuanhan Liu
2013-11-01 8:21 ` Ingo Molnar
2013-11-01 10:16 ` Yuanhan Liu
2013-11-02 3:15 ` Davidlohr Bueso
2013-11-04 3:59 ` Yuanhan Liu
2013-11-05 1:44 ` Tim Chen
2013-11-05 2:03 ` Tim Chen
2013-11-05 3:41 ` Yuanhan Liu
2013-11-05 3:10 ` Yuanhan Liu
2013-11-05 14:43 ` Yuanhan Liu
2013-11-01 17:49 ` Davidlohr Bueso
2013-11-01 18:09 ` Linus Torvalds
2013-11-01 18:47 ` Michel Lespinasse
2013-11-01 18:55 ` Linus Torvalds
2013-11-02 3:18 ` Davidlohr Bueso
2013-11-01 19:01 ` KOSAKI Motohiro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131104033749.GG30123@yliu-dev.sh.intel.com \
--to=yuanhan.liu@linux.intel.com \
--cc=a.p.zijlstra@chello.nl \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=riel@redhat.com \
--cc=torvalds@linux-foundation.org \
--cc=walken@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.