From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	stable@kernel.org, Andrew Morton <akpm@linux-foundation.org>,
	Nick Piggin <npiggin@suse.de>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: fix lazy vmap purging (use-after-free error)
Date: Fri, 20 Feb 2009 17:40:56 -0800
Message-ID: <20090221014056.GU6960@linux.vnet.ibm.com>
In-Reply-To: <19f34abd0902201551o65a3650egf29d81e8b6823d67@mail.gmail.com>

On Sat, Feb 21, 2009 at 12:51:23AM +0100, Vegard Nossum wrote:
> 2009/2/20 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > On Fri, Feb 20, 2009 at 03:51:28PM +0100, Vegard Nossum wrote:
> >>
> >> I added some printks to __free_vmap_area() and rcu_free_va(), and it
> >> shows that the kfree() is being called immediately (inside the list
> >> traversal). So the call_rcu() is happening immediately (or almost
> >> immediately).
> >>
> >> If I've understood correctly, the RCU processing can happen inside a
> >> spinlock, as long as interrupts are enabled. (Won't the timer IRQ
> >> trigger softirq processing, which triggers RCU callback processing,
> >> for example?)
> >>
> >> And interrupts are enabled when this happens: EFLAGS: 00000292
> >>
> >> Please correct me if I am wrong!
> >
> > If you are using preemptable RCU, and if the read side accesses are not
> > protected by rcu_read_lock(), this can happen.  At least for values of
> > "immediately" in the millisecond range.
> >
> > If you were using classic or hierarchical RCU, the fact that the
> > call_rcu() is within a spinlock (as opposed to mutex) critical section
> > should prevent the grace period from ending.
> >
> > So, what flavor of RCU were you using?
> 
> $ grep RCU .config
> # RCU Subsystem
> # CONFIG_CLASSIC_RCU is not set
> CONFIG_TREE_RCU=y

OK, for this RCU implementation, disabling preemption should prevent
grace periods from completing.

Hmmm...

> # CONFIG_PREEMPT_RCU is not set
> # CONFIG_RCU_TRACE is not set
> CONFIG_RCU_FANOUT=32
> # CONFIG_RCU_FANOUT_EXACT is not set
> # CONFIG_TREE_RCU_TRACE is not set
> # CONFIG_PREEMPT_RCU_TRACE is not set
> # CONFIG_RCU_TORTURE_TEST is not set
> # CONFIG_RCU_CPU_STALL_DETECTOR is not set
> 
> And at boot:
> 
> [    0.000000] Initializing CPU#0
> [    0.000000] Experimental hierarchical RCU implementation.
> [    0.000000] Experimental hierarchical RCU init done.
> 
> What I did for this list traversal was to put one print-out in front
> of the traversal, one after the traversal, one inside (so it would be
> called on each iteration), and one in the RCU callback. It looks
> something like this:
> 
> [  449.670460] __purge_vmap_area_lazy() list:
> [  449.671332] __free_vmap_area(c7806a40)
> [  449.674736] __free_vmap_area(c7806a80)
> [  449.675441] rcu_free_va(c7806a40)

This is 4.1 milliseconds, so is quite plausible.  Is the code -really-
disabling preemption for 4.1 milliseconds?

> [  449.677407] __free_vmap_area(c7806ac0)
> [  449.680113] rcu_free_va(c7806a80)

5.4 milliseconds...

> [  449.682821] __free_vmap_area(c7806b00)
> [  449.684264] rcu_free_va(c7806ac0)

6.9 milliseconds...

> [  449.686525] __free_vmap_area(c7806b40)
> [  449.688205] rcu_free_va(c7806b00)

5.4 milliseconds...

> ...and goes on for a long time, until something triggers this:
> 
> [  449.902253] rcu_free_va(c7839d00)
> [  449.903247] WARNING: kmemcheck: Caught 32-bit read from freed memory (c7839d20)
> 
> ...and finally:
> 
> [  457.580253] __purge_vmap_area_lazy() end
> [  457.581201] rcu_free_va(c78974c0)

And I don't see the corresponding __free_vmap_area() for either of the
above rcu_free_va() calls.  Would you be willing to forward the
timestamp for the __free_vmap_area() for c7839d00 (the object that
contains the faulting address c7839d20)?

> So this is also what I meant by "immediately": The RCU callbacks are
> getting called inside the loop, and they're almost always paired with
> the list removal, or lagging one object behind.
> 
> My guess is that this code posts "too many callbacks", which would
> "force the grace period" according to __call_rcu() in
> kernel/rcutree.c. What do you think about this?

If the code really suppresses preemption across the whole loop, then
any attempt to force the grace period should fail.  Is it possible that
preemption is momentarily enabled somewhere within the loop?  Or that
we are seeing multiple passes through the loop rather than one big long
pass through the loop?

							Thanx, Paul