From mboxrd@z Thu Jan  1 00:00:00 1970
From: paulmck@linux.ibm.com (Paul E. McKenney)
Date: Mon, 11 Feb 2019 13:08:08 -0800
Subject: v5.0-rc2 and NVMeOF
In-Reply-To: <1549905891.19311.5.camel@acm.org>
References: <1547579226.83374.114.camel@acm.org>
 <6c18d8f8-949f-9502-566a-643d384e9113@grimberg.me>
 <1549905891.19311.5.camel@acm.org>
Message-ID: <20190211210808.GS4240@linux.ibm.com>

On Mon, Feb 11, 2019 at 09:24:51AM -0800, Bart Van Assche wrote:
> On Wed, 2019-01-16 at 17:16 -0800, Sagi Grimberg wrote:
> > On 1/15/19 11:07 AM, Bart Van Assche wrote:
> > > Hello,
> > > 
> > > With Linus' kernel v5.0-rc2 the blktests nvmeof-mp tests trigger the
> > > complaint shown below. Is this a known issue?
> > 
> > Seems like ns remove is racing with ns revalidate again..
> > 
> > Wasn't this related to: eb4c2382272a ("srcu: Lock srcu_data structure in 
> > srcu_gp_start()") ?
> 
> (+Paul)
> 
> I'm not sure. Paul, are you perhaps aware of any open issues in the RCU
> infrastructure? If I run the following test:

The last one I knew of was fixed by eb4c2382272a ("srcu: Lock srcu_data
structure in srcu_gp_start()"), which is present in v5.0-rc1.

> git clone https://github.com/osandov/blktests.git
> cd blktests
> ./check -q nvmeof-mp
> 
> then the following appears on the console:
> 
> BUG: KASAN: use-after-free in srcu_invoke_callbacks+0x209/0x290
> Read of size 8 at addr ffff8881126b6df0 by task kworker/2:94/26747
> CPU: 2 PID: 26747 Comm: kworker/2:94 Not tainted 5.0.0-rc5-dbg+ #5
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> Workqueue: rcu_gp srcu_invoke_callbacks
> Call Trace:
>  dump_stack+0x86/0xca
>  print_address_description+0x71/0x239
>  kasan_report.cold.3+0x1b/0x3e
>  __asan_load8+0x54/0x90
>  srcu_invoke_callbacks+0x209/0x290
>  process_one_work+0x4f1/0xa40
>  worker_thread+0x67/0x5b0
>  kthread+0x1cf/0x1f0
>  ret_from_fork+0x24/0x30

The usual way that something like this happens is by invoking call_srcu()
twice in a row on the same object, similar to double-kfree() but with
call_srcu() instead of kfree().  One way to check for this sort of thing
is to reproduce in a kernel built with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y.
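
To illustrate why a double call_srcu() can end in a use-after-free, here
is a rough userspace sketch.  It is not kernel code: struct cb_head,
struct cb_list, and every function below are hypothetical stand-ins for
a tail-pointer callback list, and the "queued" flag only models the kind
of state tracking that debug-objects provides.  The point is that
queueing the same head twice makes its ->next pointer refer to itself,
so a callback that frees the object leaves the list walker reading
freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct rcu_head (not the kernel type). */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
	bool queued;	/* models the state a debug-objects check would track */
};

/* Simplified singly linked callback list with a tail pointer. */
struct cb_list {
	struct cb_head *head;
	struct cb_head **tail;
};

static void cb_list_init(struct cb_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
}

/* Unchecked enqueue: trusts the caller not to queue the same head twice. */
static void cb_enqueue(struct cb_list *l, struct cb_head *h,
		       void (*func)(struct cb_head *head))
{
	h->func = func;
	h->next = NULL;
	*l->tail = h;		/* on a repeat enqueue this writes h->next = h */
	l->tail = &h->next;
}

/* Checked enqueue: refuses a head that is already queued. */
static bool cb_enqueue_checked(struct cb_list *l, struct cb_head *h,
			       void (*func)(struct cb_head *head))
{
	if (h->queued)
		return false;	/* the double enqueue is caught here */
	h->queued = true;
	cb_enqueue(l, h, func);
	return true;
}

static void noop_cb(struct cb_head *head)
{
	(void)head;
}

/* Returns true if an unchecked double enqueue leaves a self-loop. */
static bool double_enqueue_creates_cycle(void)
{
	struct cb_list l;
	struct cb_head h = { 0 };

	cb_list_init(&l);
	cb_enqueue(&l, &h, noop_cb);
	cb_enqueue(&l, &h, noop_cb);
	return h.next == &h;
}

/* Returns true if the checked variant rejects the second enqueue. */
static bool checked_enqueue_rejects_second(void)
{
	struct cb_list l;
	struct cb_head h = { 0 };
	bool first, second;

	cb_list_init(&l);
	first = cb_enqueue_checked(&l, &h, noop_cb);
	second = cb_enqueue_checked(&l, &h, noop_cb);
	assert(l.head == &h);	/* the first enqueue still took effect */
	return first && !second && h.next == NULL;
}
```

Both helpers can be called from a small main() to see the two outcomes.
In this model the checked variant flags the second enqueue at the point
where it happens, instead of leaving the damage to be found later when
the callbacks run.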

							Thanx, Paul

> Allocated by task 955:
>  save_stack+0x43/0xd0
>  __kasan_kmalloc.constprop.9+0xcb/0xd0
>  kasan_kmalloc+0x9/0x10
>  kmem_cache_alloc_trace+0x14c/0x340
>  nvme_validate_ns+0xada/0x1170
>  nvme_scan_work+0x299/0x4c8
>  process_one_work+0x4f1/0xa40
>  worker_thread+0x67/0x5b0
>  kthread+0x1cf/0x1f0
>  ret_from_fork+0x24/0x30
> 
> Freed by task 55:
>  save_stack+0x43/0xd0
>  __kasan_slab_free+0x139/0x190
>  kasan_slab_free+0xe/0x10
>  kfree+0x103/0x320
>  nvme_free_ns+0x198/0x1a0
>  nvme_ns_remove+0x1c5/0x240
>  nvme_remove_namespaces+0x1b3/0x210
>  nvme_delete_ctrl_work+0x7d/0xe0
>  process_one_work+0x4f1/0xa40
>  worker_thread+0x367/0x5b0
>  kthread+0x1cf/0x1f0
>  ret_from_fork+0x24/0x30
> 
> The buggy address belongs to the object at ffff8881126b6c00
>  which belongs to the cache kmalloc-1k of size 1024
> The buggy address is located 496 bytes inside of
>  1024-byte region [ffff8881126b6c00, ffff8881126b7000)
> The buggy address belongs to the page:
> page:ffffea000449ac00 count:1 mapcount:0 mapping:ffff88811b002a00 index:0xffff8881126b1f80 compound_mapcount: 0
> flags: 0x2fff000000010200(slab|head)
> raw: 2fff000000010200 ffffea00042bcc08 ffffea000457b808 ffff88811b002a00
> raw: ffff8881126b1f80 00000000001c0011 00000001ffffffff 0000000000000000
> page dumped because: kasan: bad access detected
> 
> Memory state around the buggy address:
>  ffff8881126b6c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff8881126b6d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >ffff8881126b6d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>                                                              ^
>  ffff8881126b6e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff8881126b6e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>