Re: call_rcu from trace_preempt

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Alexei Starovoitov <ast@plumgrid.com>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: call_rcu from trace_preempt
Date: Mon, 15 Jun 2015 19:14:58 -0700
Message-ID: <20150616021458.GE3913@linux.vnet.ibm.com>
In-Reply-To: <557F7764.5060707@plumgrid.com>

On Mon, Jun 15, 2015 at 06:09:56PM -0700, Alexei Starovoitov wrote:
> On 6/15/15 4:07 PM, Paul E. McKenney wrote:
> >
> >Oh...  One important thing is that both call_rcu() and kfree_rcu()
> >use per-CPU variables, managing a per-CPU linked list.  This is why
> >they disable interrupts.  If you do another call_rcu() in the middle
> >of the first one in just the wrong place, you will have two entities
> >concurrently manipulating the same linked list, which will not go well.
> 
> yes. I'm trying to find that 'wrong place'.
> The trace.patch is doing kmalloc/kfree_rcu for every preempt_enable.
> So any spin_unlock called by the first call_rcu will trigger
> a 2nd, recursive call to call_rcu.
> But as far as I could understand rcu code that looks ok everywhere.
> call_rcu
>   debug_rcu_head_[un]queue
>     debug_object_activate
>       spin_unlock
> 
> and debug_rcu_head* seems to be called from safe places
> where local_irq is enabled.

I do sympathize, but your own testing does demonstrate that it is
very much not OK.  ;-)
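
To illustrate the hazard described in the quoted text, here is a minimal
user-space sketch, not the kernel's code.  It assumes a callback list
with a head pointer and a tail pointer-to-pointer, and models a recursive
insertion that lands between the two steps of an enqueue.  The names
cb, cb_list, cb_enqueue, and cb_count are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
struct cb {
	struct cb *next;
};

struct cb_list {
	struct cb *head;
	struct cb **tail;	/* slot where the next callback will be stored */
};

static void cb_list_init(struct cb_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
}

/*
 * Two-step tail-pointer enqueue.  If `nested` is non-NULL, it is enqueued
 * between the two steps, modelling a re-entrant call in the wrong place.
 */
static void cb_enqueue(struct cb_list *l, struct cb *c, struct cb *nested)
{
	assert(c != NULL);
	c->next = NULL;
	*l->tail = c;			/* step 1: store into the tail slot */
	if (nested)
		cb_enqueue(l, nested, NULL);	/* overwrites the same slot */
	l->tail = &c->next;		/* step 2: advance the tail */
}

/* Count the callbacks still reachable from the head. */
static size_t cb_count(const struct cb_list *l)
{
	size_t n = 0;

	for (const struct cb *c = l->head; c; c = c->next)
		n++;
	return n;
}
```

In this model, enqueuing a with b nested inside it, and then enqueuing c,
leaves only b reachable from the head.  The tail ends up pointing into a,
so a and every later callback are linked to each other but lost to the
list.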

> >Maybe mark call_rcu() and the things it calls as notrace?  Or you
> >could maintain a separate per-CPU linked list that gathered up the
> >stuff to be kfree()ed after a grace period, and some time later
> >feed them to kfree_rcu()?
> 
> yeah, I can think of this or 10 other ways to fix it within
> kprobe+bpf area, but I think something like call_rcu_notrace()
> may be a better solution.
> Or maybe a single generic 'fix' for call_rcu will be enough if
> it doesn't affect all other users.

Why do you believe that it is better to fix it within call_rcu()?
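
For reference, the separate staging list suggested in the quoted text
might look roughly like the following user-space sketch.  It makes
several assumptions: it is a single list rather than a per-CPU one, it
uses C11 atomics so that a nested push cannot lose entries, and it takes
the deferred-free step as a callback standing in for kfree_rcu().  All
names (staged, stage_list, stage_free, stage_drain, count_only) are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not kernel APIs. */
struct staged {
	struct staged *next;
};

struct stage_list {
	_Atomic(struct staged *) first;
};

static void stage_init(struct stage_list *s)
{
	atomic_init(&s->first, (struct staged *)NULL);
}

/*
 * Called from the tracing path.  Only links the object into the staging
 * list; it never calls into the deferred-free machinery.  The
 * compare-and-swap loop retries if a nested push changed the list head.
 */
static void stage_free(struct stage_list *s, struct staged *obj)
{
	struct staged *old;

	assert(obj != NULL);
	old = atomic_load(&s->first);
	do {
		obj->next = old;
	} while (!atomic_compare_exchange_weak(&s->first, &old, obj));
}

/*
 * Called later from a safe context.  Detaches the whole list in one step,
 * then hands each object to defer_free (the stand-in for kfree_rcu()).
 * Returns the number of objects handed off.
 */
static size_t stage_drain(struct stage_list *s,
			  void (*defer_free)(struct staged *))
{
	struct staged *obj = atomic_exchange(&s->first, (struct staged *)NULL);
	size_t n = 0;

	while (obj) {
		struct staged *next = obj->next;

		defer_free(obj);
		obj = next;
		n++;
	}
	return n;
}

/* Example callback that only counts what it is given. */
static size_t deferred_count;

static void count_only(struct staged *obj)
{
	(void)obj;
	deferred_count++;
}
```

The point of the split is that the tracing path does nothing but a push,
while the call that touches RCU's own lists happens only from a context
chosen to be safe.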

> >The usual consequence of racing a pair of callback insertions on the
> >same CPU would be that one of them gets leaked, and possibly all
> >subsequent callbacks.  So the lockup is no surprise.  And there are a
> >lot of other assumptions in nearby code paths about only one execution
> >at a time from a given CPU.
> 
> yes, I don't think calling 2nd call_rcu from preempt_enable violates
> these assumptions. local_irq does its job. No extra stuff is called when
> interrupts are disabled.

Perhaps you are self-deadlocking within __call_rcu_core().  If you have
not already done so, please try running with CONFIG_PROVE_LOCKING=y.
(Yes, I see your point about not calling extra stuff when interrupts
are disabled, and I remember that __call_rcu() avoids that path when
interrupts are disabled, but -something- is clearly going wrong!
Or maybe something momentarily enables interrupts somewhere, and RCU
has just been getting lucky.  Or...)

> >>Any advice on where to look is greatly appreciated.
> >
> >What I don't understand is exactly what you are trying to do.  Have more
> >complex tracers that dynamically allocate memory?  If so, having a per-CPU
> >list that stages memory to be freed so that it can be passed to call_rcu()
> >in a safe environment might make sense.  Of course, that list would need
> >to be managed carefully!
> 
> yes. We tried to compute the time the kernel spends between
> preempt_disable->preempt_enable and plot a histogram of latencies.
> 
> >Or am I missing the point of the code below?
> 
> this trace.patch is a reproducer of the call_rcu crashes; it does:
> preempt_enable
>   trace_preempt_on
>     kfree_call_rcu
> 
> The real call stack is:
> preempt_enable
>   trace_preempt_on
>     kprobe_int3_handler
>       trace_call_bpf
>         bpf_map_update_elem
>           htab_map_update_elem
>             kfree_call_rcu

I suspect that your problem may range quite a bit further than just
call_rcu().  For example, in your stack trace, you have a recursive
call to debug_object_activate(), which might not be such a good thing.

							Thanx, Paul
