From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Alexei Starovoitov <ast@plumgrid.com>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: call_rcu from trace_preempt
Date: Mon, 15 Jun 2015 19:14:58 -0700
Message-ID: <20150616021458.GE3913@linux.vnet.ibm.com>
In-Reply-To: <557F7764.5060707@plumgrid.com>

On Mon, Jun 15, 2015 at 06:09:56PM -0700, Alexei Starovoitov wrote:
> On 6/15/15 4:07 PM, Paul E. McKenney wrote:
> >
> >Oh...  One important thing is that both call_rcu() and kfree_rcu()
> >use per-CPU variables, managing a per-CPU linked list.  This is why
> >they disable interrupts.  If you do another call_rcu() in the middle
> >of the first one in just the wrong place, you will have two entities
> >concurrently manipulating the same linked list, which will not go well.
> 
> Yes. I'm trying to find that 'wrong place'.
> The trace.patch does kmalloc()/kfree_rcu() for every preempt_enable,
> so any spin_unlock() called inside the first call_rcu() will trigger
> a second, recursive call_rcu().
> But as far as I can tell from the RCU code, that looks OK everywhere:
> call_rcu
>   debug_rcu_head_[un]queue
>     debug_object_activate
>       spin_unlock
> 
> and the debug_rcu_head_*() helpers seem to be called from safe places
> where local interrupts are enabled.

I do sympathize, but your own testing does demonstrate that it is
very much not OK.  ;-)
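
To make the hazard concrete, here is a grossly simplified sketch of
the pattern -- not the real RCU code, and every name below is made up.
Disabling interrupts keeps irq handlers off the list, but it cannot
stop a *synchronous* re-entry from a tracing hook in anything called
inside that window, and a second enqueue between the two tail updates
leaves a half-linked list:

#include <linux/percpu.h>
#include <linux/rcupdate.h>

struct cb_list {
	struct rcu_head *head;
	struct rcu_head **tail;	/* set to &head per CPU at init (omitted) */
};
static DEFINE_PER_CPU(struct cb_list, sketch_cbs);

static void sketch_call_rcu(struct rcu_head *rhp,
			    void (*func)(struct rcu_head *))
{
	unsigned long flags;
	struct cb_list *cl;

	rhp->func = func;
	rhp->next = NULL;
	local_irq_save(flags);
	cl = this_cpu_ptr(&sketch_cbs);
	*cl->tail = rhp;	/* a re-entrant enqueue between these two */
	cl->tail = &rhp->next;	/* stores leaks callbacks, hence the lockup */
	local_irq_restore(flags);
}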

> >Maybe mark call_rcu() and the things it calls as notrace?  Or you
> >could maintain a separate per-CPU linked list that gathered up the
> >stuff to be kfree()ed after a grace period, and some time later
> >feed them to kfree_rcu()?
> 
> Yeah, I can think of this or ten other ways to fix it within the
> kprobe+bpf area, but I think something like call_rcu_notrace() may
> be a better solution. Or maybe a single generic fix for call_rcu()
> will be enough, if it doesn't affect all other users.

Why do you believe that it is better to fix it within call_rcu()?
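
For what it is worth, the staging approach can be quite small: a
lockless per-CPU llist that the tracing hook pushes onto, drained
into kfree_rcu() later from a safe context.  Untested sketch,
hypothetical names throughout:

#include <linux/llist.h>
#include <linux/percpu.h>
#include <linux/slab.h>

struct my_elem {			/* hypothetical tracer object */
	struct llist_node llnode;
	struct rcu_head rcu;
	/* ... payload ... */
};

static DEFINE_PER_CPU(struct llist_head, defer_free);

/* From the tracing hook: one lockless push, no call_rcu() here. */
static void defer_kfree(struct my_elem *e)
{
	llist_add(&e->llnode, this_cpu_ptr(&defer_free));
}

/* Later, from a safe context such as a workqueue. */
static void drain_deferred(void)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		struct llist_node *n =
			llist_del_all(per_cpu_ptr(&defer_free, cpu));
		struct my_elem *e, *tmp;

		llist_for_each_entry_safe(e, tmp, n, llnode)
			kfree_rcu(e, rcu);
	}
}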

> >The usual consequence of racing a pair of callback insertions on the
> >same CPU would be that one of them gets leaked, and possibly all
> >subsequent callbacks.  So the lockup is no surprise.  And there are a
> >lot of other assumptions in nearby code paths about only one execution
> >at a time from a given CPU.
> 
> Yes, but I don't think that calling a second call_rcu() from
> preempt_enable violates these assumptions. The local_irq_save() does
> its job: no extra stuff is called while interrupts are disabled.

Perhaps you are self-deadlocking within __call_rcu_core().  If you have
not already done so, please try running with CONFIG_PROVE_LOCKING=y.
(Yes, I see your point about not calling extra stuff when interrupts
are disabled, and I remember that __call_rcu() avoids that path when
interrupts are disabled, but -something- is clearly going wrong!
Or maybe something momentarily enables interrupts somewhere, and RCU
has just been getting lucky.  Or...)
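
For reference, CONFIG_PROVE_LOCKING lives under "Kernel hacking" ->
"Lock debugging" in the config menus; in the .config itself:

CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_LOCKDEP=y	# optional extra lockdep self-checks, more overhead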

> >>Any advise on where to look is greatly appreciated.
> >
> >What I don't understand is exactly what you are trying to do.  Have more
> >complex tracers that dynamically allocate memory?  If so, having a per-CPU
> >list that stages memory to be freed so that it can be passed to call_rcu()
> >in a safe environment might make sense.  Of course, that list would need
> >to be managed carefully!
> 
> Yes. We tried to compute the time the kernel spends between
> preempt_disable() and preempt_enable(), and to plot a histogram of
> the latencies.
> 
> >Or am I missing the point of the code below?
> 
> This trace.patch is a reproducer of the call_rcu() crashes. It does:
> preempt_enable
>   trace_preempt_on
>     kfree_call_rcu
> 
> The real call stack is:
> preempt_enable
>   trace_preempt_on
>     kprobe_int3_handler
>       trace_call_bpf
>         bpf_map_update_elem
>           htab_map_update_elem
>             kfree_call_rcu

I suspect that your problem may range quite a bit further than just
call_rcu().  For example, in your stack trace, you have a recursive
call to debug_object_activate(), which might not be such a good thing.
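
In case it helps, the notrace annotation mentioned earlier looks like
the sketch below (hypothetical function).  Note that it only keeps the
annotated function itself out of the function tracer; it does nothing
about the preempt/irq tracepoints fired by any spin_unlock() inside,
which is part of why the problem ranges further:

#include <linux/compiler.h>	/* notrace */
#include <linux/rcupdate.h>

static notrace void sketch_queue_cb(struct rcu_head *rhp,
				    void (*func)(struct rcu_head *))
{
	rhp->func = func;	/* stand-in for the real queuing work */
}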

							Thanx, Paul

