From: Marco Elver <elver@google.com>
To: Eric Dumazet <edumazet@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	David Miller <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, netdev <netdev@vger.kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH] random32: Use rcuidle variant for tracepoint
Date: Fri, 21 Aug 2020 17:35:32 +0200
Message-ID: <20200821153532.GA3205540@elver.google.com>
In-Reply-To: <CANn89i+1MQRCSRVg-af758en5e9nwQBes3aBSjQ6BY1pV5+HdQ@mail.gmail.com>

On Fri, Aug 21, 2020 at 08:06AM -0700, Eric Dumazet wrote:
> On Fri, Aug 21, 2020 at 1:59 AM <peterz@infradead.org> wrote:
> >
> > On Fri, Aug 21, 2020 at 08:30:43AM +0200, Marco Elver wrote:
> > > With KCSAN enabled, prandom_u32() may be called from any context,
> > > including idle CPUs.
> > >
> > > Therefore, switch to using trace_prandom_u32_rcuidle(), to avoid various
> > > issues due to recursion and lockdep warnings when KCSAN and tracing is
> > > enabled.
> >
> > At some point we're going to have to introduce noinstr to idle as well.
> > But until that time this should indeed cure things.
> 
> I do not understand what the issue is.  This _rcuidle() is kind of opaque ;)
>
> Would this alternative patch work, or is it something more fundamental ?

There are 2 problems:

1. Recursion due to ending up in lockdep from the tracepoint. I need to
solve this either way. One way is to use the _rcuidle() variant, which
doesn't call into lockdep.

2. Somehow running into trouble because we use tracing from an idle CPU.
At least that's what I gathered from the documentation -- but you'd have
to wait for Peter or Steven to get a better explanation.
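
To make problem 1 concrete, here is a toy userspace model, not kernel
code. It assumes the loop has the shape: instrumented access -> PRNG ->
tracepoint -> debug check -> instrumented access again. Every name below
(instrumented_access, model_prandom, model_debug_check, MAX_MODEL_DEPTH)
is made up for illustration, and the real call path in the kernel may
well differ. Passing true models a tracepoint that runs a debug check;
passing false models a variant that skips it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy userspace model. None of these names are kernel APIs. */

#define MAX_MODEL_DEPTH 8	/* artificial cap so the model terminates */

static int depth;		/* current nesting of instrumented_access() */
static int max_depth;		/* deepest nesting observed so far */
static uint32_t rng_state = 1;	/* xorshift32 state, must be non-zero */

static uint32_t model_prandom(bool checked_tracepoint);

/* Stands in for a compiler-instrumented memory access. */
static void instrumented_access(bool checked_tracepoint)
{
	depth++;
	if (depth > max_depth)
		max_depth = depth;
	assert(depth <= MAX_MODEL_DEPTH);

	/* The instrumentation draws a random number on each access. */
	if (depth < MAX_MODEL_DEPTH)
		(void)model_prandom(checked_tracepoint);
	depth--;
}

/* Assumption: the debug check itself performs an instrumented access. */
static void model_debug_check(void)
{
	instrumented_access(true);
}

/* PRNG with a modelled tracepoint at the end. */
static uint32_t model_prandom(bool checked_tracepoint)
{
	rng_state ^= rng_state << 13;
	rng_state ^= rng_state >> 17;
	rng_state ^= rng_state << 5;

	/* Only the checked tracepoint re-enters the instrumentation. */
	if (checked_tracepoint)
		model_debug_check();
	return rng_state;
}

/* Run one top-level access and report how deep the nesting went. */
int deepest_nesting(bool checked_tracepoint)
{
	depth = 0;
	max_depth = 0;
	instrumented_access(checked_tracepoint);
	return max_depth;
}
```

Under these assumptions, deepest_nesting(true) runs into the artificial
cap, while deepest_nesting(false) never nests beyond the first access.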

> Thanks !
> 
> diff --git a/lib/random32.c b/lib/random32.c
> index 932345323af092a93fc2690b0ebbf4f7485ae4f3..17af2d1631e5ab6e02ad1e9288af7e007bed6d5f 100644
> --- a/lib/random32.c
> +++ b/lib/random32.c
> @@ -83,9 +83,10 @@ u32 prandom_u32(void)
>         u32 res;
> 
>         res = prandom_u32_state(state);
> -       trace_prandom_u32(res);
>         put_cpu_var(net_rand_state);
> 
> +       trace_prandom_u32(res);
> +
>         return res;
>  }
>  EXPORT_SYMBOL(prandom_u32);

That unfortunately still gets me the same warning:

| ------------[ cut here ]------------
| DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
| WARNING: CPU: 4 PID: 1861 at kernel/locking/lockdep.c:4875 check_flags.part.0+0x157/0x160 kernel/locking/lockdep.c:4875
| Modules linked in:
| CPU: 4 PID: 1861 Comm: kworker/u16:4 Not tainted 5.9.0-rc1+ #24
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
| RIP: 0010:check_flags.part.0+0x157/0x160 kernel/locking/lockdep.c:4875
| Code: c0 0f 84 70 5d 00 00 44 8b 0d fd 11 5f 06 45 85 c9 0f 85 60 5d 00 00 48 c7 c6 3e d0 f4 86 48 c7 c7 b2 49 f3 86 e8 8d 49 f6 ff <0f> 0b e9 46 5d 00 00 66 90 41 57 41 56 49 89 fe 41 55 41 89 d5 41
| RSP: 0000:ffffc900034bfcb0 EFLAGS: 00010082
| RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8136161c
| RDX: ffff88881a9dcb00 RSI: ffffffff81363835 RDI: 0000000000000006
| RBP: ffffc900034bfd00 R08: 0000000000000000 R09: 0000ffffffffffff
| R10: 0000000000000104 R11: 0000ffff874efd6b R12: ffffffff874f26c0
| R13: 0000000000000244 R14: 0000000000000000 R15: 0000000000000046
| FS:  0000000000000000(0000) GS:ffff88881fc00000(0000) knlGS:0000000000000000
| CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
| CR2: 0000000000000000 CR3: 0000000007489001 CR4: 0000000000770ee0
| DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
| DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
| PKRU: 55555554
| Call Trace:
|  check_flags kernel/locking/lockdep.c:4871 [inline]
|  lock_is_held_type+0x42/0x100 kernel/locking/lockdep.c:5042
|  lock_is_held include/linux/lockdep.h:267 [inline]
|  rcu_read_lock_sched_held+0x41/0x80 kernel/rcu/update.c:136
|  trace_prandom_u32 include/trace/events/random.h:310 [inline]
|  prandom_u32+0x1bb/0x200 lib/random32.c:86
|  prandom_u32_max include/linux/prandom.h:46 [inline]
|  reset_kcsan_skip kernel/kcsan/core.c:277 [inline]
|  kcsan_setup_watchpoint+0x9b/0x600 kernel/kcsan/core.c:424
|  perf_lock_task_context+0x5e3/0x6e0 kernel/events/core.c:1491
|  perf_pin_task_context kernel/events/core.c:1506 [inline]
|  perf_event_exit_task_context kernel/events/core.c:12284 [inline]
|  perf_event_exit_task+0x1e2/0x910 kernel/events/core.c:12364
|  do_exit+0x70e/0x18b0 kernel/exit.c:815
|  call_usermodehelper_exec_async+0x2e2/0x2f0 kernel/umh.c:114
|  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
| irq event stamp: 107
| hardirqs last  enabled at (107): [<ffffffff815532ab>] perf_lock_task_context+0x5db/0x6e0 kernel/events/core.c:1491
| hardirqs last disabled at (106): [<ffffffff81552f12>] perf_lock_task_context+0x242/0x6e0 kernel/events/core.c:1459
| softirqs last  enabled at (0): [<ffffffff8129b95e>] copy_process+0xe9e/0x3970 kernel/fork.c:2004
| softirqs last disabled at (0): [<0000000000000000>] 0x0
| ---[ end trace a3058d9b157af5c4 ]---
| possible reason: unannotated irqs-off.
| irq event stamp: 107
| hardirqs last  enabled at (107): [<ffffffff815532ab>] perf_lock_task_context+0x5db/0x6e0 kernel/events/core.c:1491
| hardirqs last disabled at (106): [<ffffffff81552f12>] perf_lock_task_context+0x242/0x6e0 kernel/events/core.c:1459
| softirqs last  enabled at (0): [<ffffffff8129b95e>] copy_process+0xe9e/0x3970 kernel/fork.c:2004
| softirqs last disabled at (0): [<0000000000000000>] 0x0

I also have a patch which avoids the problem entirely by not using
prandom_u32(): https://lkml.kernel.org/r/20200821123126.3121494-1-elver@google.com
But that patch will likely only make it into the next merge window
(because of other conflicts).
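
As a rough illustration of that idea (this is not the code from the
linked patch), a caller can keep its own generator state and never go
through the traced function. The sketch below is plain userspace C and
all names in it (struct private_rng, traced_random, private_random_max,
model_trace_hook) are hypothetical. The xorshift32 step is just a
stand-in generator; the bounded helper uses the multiply-and-shift
reduction to map a 32-bit value into [0, bound).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; a userspace sketch, not kernel code. */

struct private_rng {
	uint32_t s;		/* xorshift32 state, must be non-zero */
};

static int trace_hits;		/* how often the modelled trace hook ran */

/* Stands in for a tracepoint attached to the shared generator. */
static void model_trace_hook(uint32_t value)
{
	(void)value;
	trace_hits++;
}

/* One xorshift32 step; no hooks, no shared state. */
static uint32_t xorshift32(struct private_rng *r)
{
	uint32_t x = r->s;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	r->s = x;
	return x;
}

/* Shared-style generator: every call reaches the trace hook. */
uint32_t traced_random(struct private_rng *r)
{
	uint32_t v = xorshift32(r);

	model_trace_hook(v);
	return v;
}

/* Private generator: returns a value in [0, bound), never traces. */
uint32_t private_random_max(struct private_rng *r, uint32_t bound)
{
	assert(r->s != 0);	/* xorshift32 stays stuck at zero */
	return (uint32_t)(((uint64_t)xorshift32(r) * bound) >> 32);
}

int trace_hit_count(void)
{
	return trace_hits;
}
```

The point of the design is only that the private path has no hook to
fall into, so it stays usable from contexts where tracing is not safe.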

So, if the _rcuidle() variant here doesn't break your use case, there
should be no harm in using it. This also restores prandom_u32() to being
usable from wherever it was usable before, which should be any context.

Steven, Peter: What's the downside of _rcuidle()?

Thanks,
-- Marco

Thread overview: 10+ messages
2020-08-21  6:30 [PATCH] random32: Use rcuidle variant for tracepoint Marco Elver
2020-08-21  8:58 ` Marco Elver
2020-08-21  8:59 ` peterz
2020-08-21 15:06   ` Eric Dumazet
2020-08-21 15:35     ` Marco Elver [this message]
2020-09-18  1:21       ` Steven Rostedt
2020-08-21 15:38     ` Steven Rostedt
2020-08-21 15:41       ` Steven Rostedt
2020-08-21 18:38         ` Peter Zijlstra
2020-10-09  7:58 ` [tip: locking/core] kcsan: Use tracing-safe version of prandom tip-bot2 for Marco Elver
