Distributed Replicated Block Device (DRBD) development
 help / color / mirror / Atom feed
* [Drbd-dev] BUG: NULL pointer dereference triggered by drbdsetup events2
@ 2017-03-01  0:28 Eric Wheeler
  2017-03-01 13:06 ` Lars Ellenberg
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Wheeler @ 2017-03-01  0:28 UTC (permalink / raw)
  To: drbd-dev

Hello all,

We found a relatively easy to reproduce bug that crashes the kernel. Start 
with all of the resources down and only the DRBD module loaded and run the 
following:

drbdadm up foo
drbdsetup events2 foo | true
drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
drbdadm up foo

The backtrace is below. The ls above simply illustrates the problem. After 
a down, the sysfs entry should not exist. The bug manifests because 
add_disk attempts to create the same sysfs entry but it cannot and fails 
up the stack. I have a feeling that the interface used by events2 is 
holding open a reference count after down so the sysfs entry is never 
removed.

We are piping into true because it fails quickly without reading any 
stdio, so perhaps the kernel is blocked trying to flush a buffer into 
userspace and never releases a resource count (speculation).

This was tested using the 4.10.1 kernel with userspace tools 
drbd-utils-8.9.4. I suspect this could be worked around in userspace, but 
it would be ideal if the kernel module could be fixed up to prevent a 
crash.

Please let me know if you need additional information or if I can provide 
any other testing.

Thank you for your help!

--
Eric Wheeler


[  147.386028] ------------[ cut here ]------------
[  147.386495] WARNING: CPU: 0 PID: 1956 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x64/0x80
[  147.387106] sysfs: cannot create duplicate filename '/devices/virtual/bdi/147:7740'
[  147.387735] Modules linked in: md4 xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio kvm_intel kvm irqbypass nfsd sg ppdev virtio_balloon parport_pc parport i2c_piix4 acpi_cpufreq pcspkr auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 jbd2 mbcache sr_mod cdrom ata_generic pata_acpi cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm serio_raw virtio_blk floppy virtio_net drm ata_piix i2c_core libata dm_mirror dm_region_hash dm_log dm_mod drbd libcrc32c crc32c_intel lru_cache
[  147.392997] CPU: 0 PID: 1956 Comm: drbdsetup-84 Not tainted 4.10.1 #9
[  147.393508] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[  147.394029] Call Trace:
[  147.394230]  dump_stack+0x63/0x87
[  147.394495]  __warn+0xd1/0xf0
[  147.394730]  warn_slowpath_fmt+0x5f/0x80
[  147.395039]  ? kernfs_path_from_node+0x50/0x60
[  147.395391]  sysfs_warn_dup+0x64/0x80
[  147.396799]  sysfs_create_dir_ns+0x7e/0x90
[  147.398192]  kobject_add_internal+0xc1/0x340
[  147.399548]  ? vsnprintf+0x34e/0x4d0
[  147.400841]  kobject_add+0x75/0xd0
[  147.402104]  ? mutex_lock+0x12/0x40
[  147.403351]  device_add+0x119/0x610
[  147.404588]  device_create_groups_vargs+0xd8/0x100
[  147.405923]  device_create_vargs+0x1c/0x20
[  147.407193]  bdi_register+0x8c/0x1a0
[  147.408417]  bdi_register_owner+0x38/0x60
[  147.409677]  device_add_disk+0x165/0x4a0
[  147.410914]  ? kmem_cache_alloc_trace+0x19b/0x1b0
[  147.412207]  drbd_create_device+0x64f/0x780 [drbd]
[  147.413554]  drbd_adm_new_minor+0xec/0x2c0 [drbd]
[  147.414848]  ? selinux_capable+0x20/0x30
[  147.416052]  genl_family_rcv_msg+0x1f6/0x3e0
[  147.417247]  ? copy_to_iter+0x97/0x430
[  147.418433]  genl_rcv_msg+0x5c/0xb0
[  147.419564]  ? __netlink_lookup+0xc0/0x110
[  147.420733]  ? genl_family_rcv_msg+0x3e0/0x3e0
[  147.421949]  netlink_rcv_skb+0xa7/0xc0
[  147.423092]  genl_rcv+0x28/0x40
[  147.424187]  netlink_unicast+0x181/0x240
[  147.425336]  netlink_sendmsg+0x32e/0x3b0
[  147.426477]  sock_sendmsg+0x38/0x50
[  147.427606]  sock_write_iter+0x85/0xf0
[  147.428750]  __vfs_write+0xe2/0x140
[  147.429843]  vfs_write+0xb2/0x1b0
[  147.430916]  ? syscall_trace_enter+0x1d0/0x2b0
[  147.432109]  SyS_write+0x55/0xc0
[  147.433183]  do_syscall_64+0x67/0x180
[  147.434268]  entry_SYSCALL64_slow_path+0x25/0x25
[  147.435406] RIP: 0033:0x7f2615c4cc60
[  147.436439] RSP: 002b:00007fffa5bcb488 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  147.437753] RAX: ffffffffffffffda RBX: 0000000000a04070 RCX: 00007f2615c4cc60
[  147.439020] RDX: 0000000000000030 RSI: 0000000000a04080 RDI: 0000000000000004
[  147.440264] RBP: 0000000000000030 R08: 0000000000001fd8 R09: 0000000000000065
[  147.441469] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000a04080
[  147.442694] R13: 0000000000000030 R14: 0000000000000004 R15: 0000000000414438
[  147.443874] ---[ end trace 81090c3270658cb6 ]---
[  147.444975] ------------[ cut here ]------------
[  147.446066] WARNING: CPU: 0 PID: 1956 at lib/kobject.c:240 kobject_add_internal+0x2d4/0x340
[  147.447436] kobject_add_internal failed for 147:7740 with -EEXIST, don't try to register things with the same name in the same directory.
[  147.449675] Modules linked in: md4 xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio kvm_intel kvm irqbypass nfsd sg ppdev virtio_balloon parport_pc parport i2c_piix4 acpi_cpufreq pcspkr auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 jbd2 mbcache sr_mod cdrom ata_generic pata_acpi cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm serio_raw virtio_blk floppy virtio_net drm ata_piix i2c_core libata dm_mirror dm_region_hash dm_log dm_mod drbd libcrc32c crc32c_intel lru_cache
[  147.460329] CPU: 0 PID: 1956 Comm: drbdsetup-84 Tainted: G        W       4.10.1 #9
[  147.461922] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[  147.463263] Call Trace:
[  147.464357]  dump_stack+0x63/0x87
[  147.465660]  __warn+0xd1/0xf0
[  147.466801]  warn_slowpath_fmt+0x5f/0x80
[  147.468016]  ? sysfs_warn_dup+0x6c/0x80
[  147.469222]  kobject_add_internal+0x2d4/0x340
[  147.470461]  ? vsnprintf+0x34e/0x4d0
[  147.471650]  kobject_add+0x75/0xd0
[  147.472810]  ? mutex_lock+0x12/0x40
[  147.473970]  device_add+0x119/0x610
[  147.475130]  device_create_groups_vargs+0xd8/0x100
[  147.476415]  device_create_vargs+0x1c/0x20
[  147.477746]  bdi_register+0x8c/0x1a0
[  147.478935]  bdi_register_owner+0x38/0x60
[  147.480160]  device_add_disk+0x165/0x4a0
[  147.481416]  ? kmem_cache_alloc_trace+0x19b/0x1b0
[  147.482698]  drbd_create_device+0x64f/0x780 [drbd]
[  147.483964]  drbd_adm_new_minor+0xec/0x2c0 [drbd]
[  147.485229]  ? selinux_capable+0x20/0x30
[  147.486457]  genl_family_rcv_msg+0x1f6/0x3e0
[  147.487825]  ? copy_to_iter+0x97/0x430
[  147.489023]  genl_rcv_msg+0x5c/0xb0
[  147.490211]  ? __netlink_lookup+0xc0/0x110
[  147.491438]  ? genl_family_rcv_msg+0x3e0/0x3e0
[  147.492689]  netlink_rcv_skb+0xa7/0xc0
[  147.493928]  genl_rcv+0x28/0x40
[  147.495077]  netlink_unicast+0x181/0x240
[  147.496374]  netlink_sendmsg+0x32e/0x3b0
[  147.497608]  sock_sendmsg+0x38/0x50
[  147.498843]  sock_write_iter+0x85/0xf0
[  147.500020]  __vfs_write+0xe2/0x140
[  147.501199]  vfs_write+0xb2/0x1b0
[  147.502329]  ? syscall_trace_enter+0x1d0/0x2b0
[  147.503592]  SyS_write+0x55/0xc0
[  147.504694]  do_syscall_64+0x67/0x180
[  147.505837]  entry_SYSCALL64_slow_path+0x25/0x25
[  147.507038] RIP: 0033:0x7f2615c4cc60
[  147.508155] RSP: 002b:00007fffa5bcb488 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  147.509537] RAX: ffffffffffffffda RBX: 0000000000a04070 RCX: 00007f2615c4cc60
[  147.510864] RDX: 0000000000000030 RSI: 0000000000a04080 RDI: 0000000000000004
[  147.512276] RBP: 0000000000000030 R08: 0000000000001fd8 R09: 0000000000000065
[  147.513545] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000a04080
[  147.514890] R13: 0000000000000030 R14: 0000000000000004 R15: 0000000000414438
[  147.516183] ---[ end trace 81090c3270658cb7 ]---
[  147.517686] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[  147.518959] IP: sysfs_do_create_link_sd.isra.2+0x34/0xb0
[  147.520001] PGD 234009067 
[  147.520002] PUD 234b30067 
[  147.520877] PMD 0 
[  147.521734] 
[  147.523326] Oops: 0000 [#1] SMP
[  147.524164] Modules linked in: md4 xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio kvm_intel kvm irqbypass nfsd sg ppdev virtio_balloon parport_pc parport i2c_piix4 acpi_cpufreq pcspkr auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 jbd2 mbcache sr_mod cdrom ata_generic pata_acpi cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm serio_raw virtio_blk floppy virtio_net drm ata_piix i2c_core libata dm_mirror dm_region_hash dm_log dm_mod drbd libcrc32c crc32c_intel lru_cache
[  147.534222] CPU: 3 PID: 1956 Comm: drbdsetup-84 Tainted: G        W       4.10.1 #9
[  147.535672] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[  147.536950] task: ffff8802346fbd80 task.stack: ffffc90001ba4000
[  147.538251] RIP: 0010:sysfs_do_create_link_sd.isra.2+0x34/0xb0
[  147.539547] RSP: 0018:ffffc90001ba79a8 EFLAGS: 00010246
[  147.540797] RAX: 0000000000000000 RBX: 0000000000000040 RCX: 0000000000000001
[  147.542216] RDX: 0000000000000001 RSI: 0000000000000040 RDI: ffffffff8224bdb0
[  147.543622] RBP: ffffc90001ba79d0 R08: 000000000001cb40 R09: ffffffff813886c1
[  147.545113] R10: ffff88023fd9cb40 R11: ffffea0008d44700 R12: ffffffff81a26c0a
[  147.546525] R13: ffff880234d41a50 R14: 0000000000000001 R15: ffff8802326009b0
[  147.547944] FS:  00007f2616134740(0000) GS:ffff88023fd80000(0000) knlGS:0000000000000000
[  147.549432] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  147.550785] CR2: 0000000000000040 CR3: 0000000233b46000 CR4: 00000000000006e0
[  147.552220] Call Trace:
[  147.553401]  sysfs_create_link+0x25/0x40
[  147.554610]  device_add_disk+0x1fb/0x4a0
[  147.555838]  drbd_create_device+0x64f/0x780 [drbd]
[  147.557130]  drbd_adm_new_minor+0xec/0x2c0 [drbd]
[  147.558405]  ? selinux_capable+0x20/0x30
[  147.559622]  genl_family_rcv_msg+0x1f6/0x3e0
[  147.560866]  ? copy_to_iter+0x97/0x430
[  147.562091]  genl_rcv_msg+0x5c/0xb0
[  147.563335]  ? __netlink_lookup+0xc0/0x110
[  147.564612]  ? genl_family_rcv_msg+0x3e0/0x3e0
[  147.565877]  netlink_rcv_skb+0xa7/0xc0
[  147.567187]  genl_rcv+0x28/0x40
[  147.568343]  netlink_unicast+0x181/0x240
[  147.569555]  netlink_sendmsg+0x32e/0x3b0
[  147.570763]  sock_sendmsg+0x38/0x50
[  147.571914]  sock_write_iter+0x85/0xf0
[  147.573090]  __vfs_write+0xe2/0x140
[  147.574275]  vfs_write+0xb2/0x1b0
[  147.575447]  ? syscall_trace_enter+0x1d0/0x2b0
[  147.576687]  SyS_write+0x55/0xc0
[  147.577813]  do_syscall_64+0x67/0x180
[  147.579039]  entry_SYSCALL64_slow_path+0x25/0x25
[  147.580262] RIP: 0033:0x7f2615c4cc60
[  147.581395] RSP: 002b:00007fffa5bcb488 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  147.582843] RAX: ffffffffffffffda RBX: 0000000000a04070 RCX: 00007f2615c4cc60
[  147.584315] RDX: 0000000000000030 RSI: 0000000000a04080 RDI: 0000000000000004
[  147.585762] RBP: 0000000000000030 R08: 0000000000001fd8 R09: 0000000000000065
[  147.587194] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000a04080
[  147.588590] R13: 0000000000000030 R14: 0000000000000004 R15: 0000000000414438
[  147.589959] Code: 48 89 e5 41 57 41 56 41 55 41 54 49 89 d4 53 74 73 48 85 ff 49 89 fd 74 6b 48 89 f3 48 c7 c7 b0 bd 24 82 41 89 ce e8 0c 00 47 00 <48> 8b 1b 48 85 db 74 08 48 89 df e8 dc c4 ff ff 48 c7 c7 b0 bd 
[  147.592976] RIP: sysfs_do_create_link_sd.isra.2+0x34/0xb0 RSP: ffffc90001ba79a8
[  147.594307] CR2: 0000000000000040
[  147.595350] ---[ end trace 81090c3270658cb8 ]---
[  147.596650] Kernel panic - not syncing: Fatal exception
[  147.597957] Kernel Offset: disabled
[  147.598975] ---[ end Kernel panic - not syncing: Fatal exception

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Drbd-dev] BUG: NULL pointer dereference triggered by drbdsetup events2
  2017-03-01  0:28 [Drbd-dev] BUG: NULL pointer dereference triggered by drbdsetup events2 Eric Wheeler
@ 2017-03-01 13:06 ` Lars Ellenberg
  2017-03-01 21:36   ` Eric Wheeler
  0 siblings, 1 reply; 6+ messages in thread
From: Lars Ellenberg @ 2017-03-01 13:06 UTC (permalink / raw)
  To: drbd-dev

On Tue, Feb 28, 2017 at 04:28:46PM -0800, Eric Wheeler wrote:
> Hello all,
> 
> We found a relatively easy to reproduce bug that crashes the kernel. Start 
> with all of the resources down and only the DRBD module loaded and run the 
> following:
> 
> drbdadm up foo
> drbdsetup events2 foo | true
> drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
> drbdadm up foo
> 
> The backtrace is below. The ls above simply illustrates the problem. After 
> a down, the sysfs entry should not exist. The bug manifests because 
> add_disk attempts to create the same sysfs entry but it cannot and fails 
> up the stack. I have a feeling that the interface used by events2 is 
> holding open a reference count after down so the sysfs entry is never 
> removed.
> 
> We are piping into true because it fails quickly without reading any 
> stdio, so perhaps the kernel is blocked trying to flush a buffer into 
> userspace and never releases a resource count (speculation).
> 
> This was tested using the 4.10.1 kernel with userspace tools 
> drbd-utils-8.9.4. I suspect this could be worked around in userspace, but 
> it would be ideal if the kernel module could be fixed up to prevent a 
> crash.
> 
> Please let me know if you need additional information or if I can provide 
> any other testing.
> 
> Thank you for your help!

The "events2" part of drbd 8.4 was a "backport" from drbd 9.
I found this apparently relevant commit in our DRBD 9 repo.

In our out-of-tree 8.4 code, we later dropped the .done parts again,
for compat with RHEL 5, which is likely why this fix fell through.

I basically only re-diffed and compile-tested for current upstream kernel.
Does this help already, or do we need to dig deeper?

Thanks,
	Lars


commit 74635bbfbd03546f19ed015cec6e470062af44af
Author: Lars Ellenberg <lars.ellenberg@linbit.com>
Date:   Wed Jul 27 14:53:00 2016 +0200

    drbd: fix potential refcount leak, add missing drbd_adm_get_initial_state_done
    
    If some drbdsetup events2 / wait-* command aborts early,
    or the netlink socket currently in .dumpit = drbd_adm_get_initial_state
    is destroyed early for some other reason,
    the cleanup code inside that .dumpit function would never run,
    and we end up with a reference count leak.
    
    Move cleanup to .done = drbd_adm_get_initial_state_done.

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index f35db29..115b468 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -84,6 +84,7 @@ int drbd_adm_dump_connections_done(struct netlink_callback *cb);
 int drbd_adm_dump_peer_devices(struct sk_buff *skb, struct netlink_callback *cb);
 int drbd_adm_dump_peer_devices_done(struct netlink_callback *cb);
 int drbd_adm_get_initial_state(struct sk_buff *skb, struct netlink_callback *cb);
+int drbd_adm_get_initial_state_done(struct netlink_callback *cb);
 
 #include <linux/drbd_genl_api.h>
 #include "drbd_nla.h"
@@ -4932,6 +4933,21 @@ static int get_initial_state(struct sk_buff *skb, struct netlink_callback *cb)
 	return skb->len;
 }
 
+int drbd_adm_get_initial_state_done(struct netlink_callback *cb)
+{
+	LIST_HEAD(head);
+	if (cb->args[0]) {
+		struct drbd_state_change *state_change =
+			(struct drbd_state_change *)cb->args[0];
+		cb->args[0] = 0;
+
+		/* connect list to head */
+		list_add(&head, &state_change->list);
+		free_state_changes(&head);
+	}
+	return 0;
+}
+
 int drbd_adm_get_initial_state(struct sk_buff *skb, struct netlink_callback *cb)
 {
 	struct drbd_resource *resource;
@@ -4940,14 +4956,6 @@ int drbd_adm_get_initial_state(struct sk_buff *skb, struct netlink_callback *cb)
 	if (cb->args[5] >= 1) {
 		if (cb->args[5] > 1)
 			return get_initial_state(skb, cb);
-		if (cb->args[0]) {
-			struct drbd_state_change *state_change =
-				(struct drbd_state_change *)cb->args[0];
-
-			/* connect list to head */
-			list_add(&head, &state_change->list);
-			free_state_changes(&head);
-		}
 		return 0;
 	}
 
diff --git a/include/linux/drbd_genl.h b/include/linux/drbd_genl.h
index c934d3a..aa9bf5d 100644
--- a/include/linux/drbd_genl.h
+++ b/include/linux/drbd_genl.h
@@ -521,6 +521,7 @@ GENL_op(
 	DRBD_ADM_GET_INITIAL_STATE, 38,
 	GENL_op_init(
 	        .dumpit = drbd_adm_get_initial_state,
+	        .done = drbd_adm_get_initial_state_done,
 	),
 	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, DRBD_GENLA_F_MANDATORY))
 

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [Drbd-dev] BUG: NULL pointer dereference triggered by drbdsetup events2
  2017-03-01 13:06 ` Lars Ellenberg
@ 2017-03-01 21:36   ` Eric Wheeler
  2017-03-02 21:33     ` Lars Ellenberg
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Wheeler @ 2017-03-01 21:36 UTC (permalink / raw)
  To: Lars Ellenberg; +Cc: drbd-dev

On Wed, 1 Mar 2017, Lars Ellenberg wrote:

> On Tue, Feb 28, 2017 at 04:28:46PM -0800, Eric Wheeler wrote:
> > Hello all,
> > 
> > We found a relatively easy to reproduce bug that crashes the kernel. Start 
> > with all of the resources down and only the DRBD module loaded and run the 
> > following:
> > 
> > drbdadm up foo
> > drbdsetup events2 foo | true
> > drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
> > drbdadm up foo
> > 
> > The backtrace is below. The ls above simply illustrates the problem. After 
> > a down, the sysfs entry should not exist. The bug manifests because 
> > add_disk attempts to create the same sysfs entry but it cannot and fails 
> > up the stack. I have a feeling that the interface used by events2 is 
> > holding open a reference count after down so the sysfs entry is never 
> > removed.
> > 
> > We are piping into true because it fails quickly without reading any 
> > stdio, so perhaps the kernel is blocked trying to flush a buffer into 
> > userspace and never releases a resource count (speculation).
> > 
> > This was tested using the 4.10.1 kernel with userspace tools 
> > drbd-utils-8.9.4. I suspect this could be worked around in userspace, but 
> > it would be ideal if the kernel module could be fixed up to prevent a 
> > crash.
> > 
> > Please let me know if you need additional information or if I can provide 
> > any other testing.
> > 
> > Thank you for your help!
> 
> The "events2" part of drbd 8.4 was a "backport" from drbd 9.
> I found this apparently relevant commit in our DRBD 9 repo.
> 
> In our out-of-tree 8.4 code, we later dropped the .done parts again,
> for compat with RHEL 5, which is likely why this fix fell through.
> 
> I basically only re-diffed and compile-tested for current upstream kernel.
> Does this help already, or do we need to dig deeper?

Hi Lars,

Thank you for responding so quickly!

The events2 feature is neat, it allows us to prevent polling and provide 
better responsiveness in our infrastructure. We applied the patch and this 
is what we found:

It no longer crashes the kernel with a null pointer dereference, but now 
the drbdsetup calls never leave the kernel. The first time we ran the test 
case it worked fine, but the second time 'drbdadm down foo' hung. I 
checked the stack traces inside the kernel for each process that is hung 
and have also included the output of this test:


# This test succeeded
~]# drbdadm up foo
~]# drbdsetup events2 foo | true
~]# drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
ls: cannot access /sys/devices/virtual/bdi/147:7740: No such file or directory
~]# drbdadm up foo


# This test failed
~]# drbdsetup events2 foo | true
~]# drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
Command 'drbdsetup-84 down foo' did not terminate within 600 seconds
/sys/devices/virtual/bdi/147:7740
~]# drbdadm up foo
Command 'drbdsetup-84 new-resource foo --on-no-data-accessible=suspend-io' did not terminate within 5 seconds


The second test that failed left hung drbdsetup processes:

~]# ps ax |grep drbdsetup-84
 1579 pts/0    D      0:00 drbdsetup-84 down foo
 1585 ?        D      0:00 drbdsetup-84 role 7740
 1737 pts/0    D      0:00 drbdsetup-84 new-resource foo --on-no-data-accessible=suspend-io

These are the in-kernel stack traces for each pid:

##### 1579 pts/0    D+     0:00 drbdsetup-84 down foo
[root@hvtest2 ~]# cat /proc/1579/stack 
[<ffffffff81102a8c>] wait_rcu_gp+0x5c/0x80
[<ffffffff81106685>] synchronize_sched+0x45/0x60
[<ffffffffa02a8ca2>] adm_del_resource+0xf2/0x110 [drbd]
[<ffffffffa02ab5f3>] drbd_adm_down+0x233/0x240 [drbd]
[<ffffffff8161235d>] genl_family_rcv_msg+0x1cd/0x400
[<ffffffff81612621>] genl_rcv_msg+0x91/0xd0
[<ffffffff816113f1>] netlink_rcv_skb+0xc1/0xe0
[<ffffffff81611adc>] genl_rcv+0x2c/0x40
[<ffffffff81610a76>] netlink_unicast+0x106/0x210
[<ffffffff81610fb4>] netlink_sendmsg+0x434/0x690
[<ffffffff815c036d>] sock_sendmsg+0x3d/0x50
[<ffffffff815c0405>] sock_write_iter+0x85/0xf0
[<ffffffff812257b1>] __vfs_write+0xd1/0x110
[<ffffffff81225e49>] vfs_write+0xa9/0x1b0
[<ffffffff81226cf5>] SyS_write+0x55/0xd0
[<ffffffff816fdb6e>] system_call_fastpath+0x12/0x71
[<ffffffffffffffff>] 0xffffffffffffffff

##### 1585 ?        D      0:00 drbdsetup-84 role 7740
~]# cat /proc/1585/stack
[<ffffffff81612658>] genl_rcv_msg+0xc8/0xd0
[<ffffffff816113f1>] netlink_rcv_skb+0xc1/0xe0
[<ffffffff81611adc>] genl_rcv+0x2c/0x40
[<ffffffff81610a76>] netlink_unicast+0x106/0x210
[<ffffffff81610fb4>] netlink_sendmsg+0x434/0x690
[<ffffffff815c036d>] sock_sendmsg+0x3d/0x50
[<ffffffff815c0405>] sock_write_iter+0x85/0xf0
[<ffffffff812257b1>] __vfs_write+0xd1/0x110
[<ffffffff81225e49>] vfs_write+0xa9/0x1b0
[<ffffffff81226cf5>] SyS_write+0x55/0xd0
[<ffffffff816fdb6e>] system_call_fastpath+0x12/0x71
[<ffffffffffffffff>] 0xffffffffffffffff

##### 1737 pts/0    D      0:00 drbdsetup-84 new-resource foo --on-no-data-accessible=suspend-io
~]# cat /proc/1737/stack 
[<ffffffff81612658>] genl_rcv_msg+0xc8/0xd0
[<ffffffff816113f1>] netlink_rcv_skb+0xc1/0xe0
[<ffffffff81611adc>] genl_rcv+0x2c/0x40
[<ffffffff81610a76>] netlink_unicast+0x106/0x210
[<ffffffff81610fb4>] netlink_sendmsg+0x434/0x690
[<ffffffff815c036d>] sock_sendmsg+0x3d/0x50
[<ffffffff815c0405>] sock_write_iter+0x85/0xf0
[<ffffffff812257b1>] __vfs_write+0xd1/0x110
[<ffffffff81225e49>] vfs_write+0xa9/0x1b0
[<ffffffff81226cf5>] SyS_write+0x55/0xd0
[<ffffffff816fdb6e>] system_call_fastpath+0x12/0x71
[<ffffffffffffffff>] 0xffffffffffffffff


Please let me know if you would like any additional information.

--
Eric Wheeler

> 
> Thanks,
> 	Lars
> 
> 
> commit 74635bbfbd03546f19ed015cec6e470062af44af
> Author: Lars Ellenberg <lars.ellenberg@linbit.com>
> Date:   Wed Jul 27 14:53:00 2016 +0200
> 
>     drbd: fix potential refcount leak, add missing drbd_adm_get_initial_state_done
>     
>     If some drbdsetup events2 / wait-* command aborts early,
>     or the netlink socket currently in .dumpit = drbd_adm_get_initial_state
>     is destroyed early for some other reason,
>     the cleanup code inside that .dumpit function would never run,
>     and we end up with a reference count leak.
>     
>     Move cleanup to .done = drbd_adm_get_initial_state_done.
> 
> diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
> index f35db29..115b468 100644
> --- a/drivers/block/drbd/drbd_nl.c
> +++ b/drivers/block/drbd/drbd_nl.c
> @@ -84,6 +84,7 @@ int drbd_adm_dump_connections_done(struct netlink_callback *cb);
>  int drbd_adm_dump_peer_devices(struct sk_buff *skb, struct netlink_callback *cb);
>  int drbd_adm_dump_peer_devices_done(struct netlink_callback *cb);
>  int drbd_adm_get_initial_state(struct sk_buff *skb, struct netlink_callback *cb);
> +int drbd_adm_get_initial_state_done(struct netlink_callback *cb);
>  
>  #include <linux/drbd_genl_api.h>
>  #include "drbd_nla.h"
> @@ -4932,6 +4933,21 @@ static int get_initial_state(struct sk_buff *skb, struct netlink_callback *cb)
>  	return skb->len;
>  }
>  
> +int drbd_adm_get_initial_state_done(struct netlink_callback *cb)
> +{
> +	LIST_HEAD(head);
> +	if (cb->args[0]) {
> +		struct drbd_state_change *state_change =
> +			(struct drbd_state_change *)cb->args[0];
> +		cb->args[0] = 0;
> +
> +		/* connect list to head */
> +		list_add(&head, &state_change->list);
> +		free_state_changes(&head);
> +	}
> +	return 0;
> +}
> +
>  int drbd_adm_get_initial_state(struct sk_buff *skb, struct netlink_callback *cb)
>  {
>  	struct drbd_resource *resource;
> @@ -4940,14 +4956,6 @@ int drbd_adm_get_initial_state(struct sk_buff *skb, struct netlink_callback *cb)
>  	if (cb->args[5] >= 1) {
>  		if (cb->args[5] > 1)
>  			return get_initial_state(skb, cb);
> -		if (cb->args[0]) {
> -			struct drbd_state_change *state_change =
> -				(struct drbd_state_change *)cb->args[0];
> -
> -			/* connect list to head */
> -			list_add(&head, &state_change->list);
> -			free_state_changes(&head);
> -		}
>  		return 0;
>  	}
>  
> diff --git a/include/linux/drbd_genl.h b/include/linux/drbd_genl.h
> index c934d3a..aa9bf5d 100644
> --- a/include/linux/drbd_genl.h
> +++ b/include/linux/drbd_genl.h
> @@ -521,6 +521,7 @@ GENL_op(
>  	DRBD_ADM_GET_INITIAL_STATE, 38,
>  	GENL_op_init(
>  	        .dumpit = drbd_adm_get_initial_state,
> +	        .done = drbd_adm_get_initial_state_done,
>  	),
>  	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, DRBD_GENLA_F_MANDATORY))
>  
> _______________________________________________
> drbd-dev mailing list
> drbd-dev@lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-dev
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Drbd-dev] BUG: NULL pointer dereference triggered by drbdsetup events2
  2017-03-01 21:36   ` Eric Wheeler
@ 2017-03-02 21:33     ` Lars Ellenberg
  2017-03-11  0:29       ` Eric Wheeler
  0 siblings, 1 reply; 6+ messages in thread
From: Lars Ellenberg @ 2017-03-02 21:33 UTC (permalink / raw)
  To: drbd-dev

On Wed, Mar 01, 2017 at 01:36:55PM -0800, Eric Wheeler wrote:
> On Wed, 1 Mar 2017, Lars Ellenberg wrote:
> 
> > On Tue, Feb 28, 2017 at 04:28:46PM -0800, Eric Wheeler wrote:
> > > Hello all,
> > > 
> > > We found a relatively easy to reproduce bug that crashes the kernel. Start 
> > > with all of the resources down and only the DRBD module loaded and run the 
> > > following:
> > > 
> > > drbdadm up foo
> > > drbdsetup events2 foo | true
> > > drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
> > > drbdadm up foo
> > > 
> > > The backtrace is below. The ls above simply illustrates the problem. After 
> > > a down, the sysfs entry should not exist. The bug manifests because 
> > > add_disk attempts to create the same sysfs entry but it cannot and fails 
> > > up the stack. I have a feeling that the interface used by events2 is 
> > > holding open a reference count after down so the sysfs entry is never 
> > > removed.
> > > 
> > > We are piping into true because it fails quickly without reading any 
> > > stdio, so perhaps the kernel is blocked trying to flush a buffer into 
> > > userspace and never releases a resource count (speculation).
> > > 
> > > This was tested using the 4.10.1 kernel with userspace tools 
> > > drbd-utils-8.9.4. I suspect this could be worked around in userspace, but 
> > > it would be ideal if the kernel module could be fixed up to prevent a 
> > > crash.
> > > 
> > > Please let me know if you need additional information or if I can provide 
> > > any other testing.
> > > 
> > > Thank you for your help!
> > 
> > The "events2" part of drbd 8.4 was a "backport" from drbd 9.
> > I found this apparently relevant commit in our DRBD 9 repo.
> > 
> > In our out-of-tree 8.4 code, we later dropped the .done parts again,
> > for compat with RHEL 5, which is likely why this fix fell through.
> > 
> > I basically only re-diffed and compile-tested for current upstream kernel.
> > Does this help already, or do we need to dig deeper?
> 
> Hi Lars,
> 
> Thank you for responding so quickly!
> 
> The events2 feature is neat, it allows us to prevent polling and provide 
> better responsiveness in our infrastructure. We applied the patch and this 
> is what we found:
> 
> It no longer crashes the kernel with a null pointer dereference, but now 
> the drbdsetup calls never leave the kernel. The first time we ran the test 
> case it worked fine, but the second time 'drbdadm down foo' hung. I 
> checked the stack traces inside the kernel for each process that is hung 
> and have also included the output of this test:
> 
> 
> # This test succeeded
> ~]# drbdadm up foo
> ~]# drbdsetup events2 foo | true
> ~]# drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
> ls: cannot access /sys/devices/virtual/bdi/147:7740: No such file or directory
> ~]# drbdadm up foo

right.

> # This test failed
> ~]# drbdsetup events2 foo | true
> ~]# drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
> Command 'drbdsetup-84 down foo' did not terminate within 600 seconds
> /sys/devices/virtual/bdi/147:7740
> ~]# drbdadm up foo
> Command 'drbdsetup-84 new-resource foo --on-no-data-accessible=suspend-io' did not terminate within 5 seconds

There is something else going on here.

> The second test that failed left hung drbdsetup processes:
> 
> ~]# ps ax |grep drbdsetup-84
>  1579 pts/0    D      0:00 drbdsetup-84 down foo
>  1585 ?        D      0:00 drbdsetup-84 role 7740

Where is that "role" coming from?

>  1737 pts/0    D      0:00 drbdsetup-84 new-resource foo --on-no-data-accessible=suspend-io
> 
> These are the in-kernel stack traces for each pid:
> 
> ##### 1579 pts/0    D+     0:00 drbdsetup-84 down foo
> [root@hvtest2 ~]# cat /proc/1579/stack 
> [<ffffffff81102a8c>] wait_rcu_gp+0x5c/0x80
> [<ffffffff81106685>] synchronize_sched+0x45/0x60
> [<ffffffffa02a8ca2>] adm_del_resource+0xf2/0x110 [drbd]

Hm.
That's just doing a synchronize_rcu().

> [<ffffffffa02ab5f3>] drbd_adm_down+0x233/0x240 [drbd]
> [<ffffffff8161235d>] genl_family_rcv_msg+0x1cd/0x400
> [<ffffffff81612621>] genl_rcv_msg+0x91/0xd0
> [<ffffffff816113f1>] netlink_rcv_skb+0xc1/0xe0
> [<ffffffff81611adc>] genl_rcv+0x2c/0x40
> [<ffffffff81610a76>] netlink_unicast+0x106/0x210
> [<ffffffff81610fb4>] netlink_sendmsg+0x434/0x690

> ##### 1585 ?        D      0:00 drbdsetup-84 role 7740
> ~]# cat /proc/1585/stack
> [<ffffffff81612658>] genl_rcv_msg+0xc8/0xd0

This is likely waiting for the "genl_lock()".

> [<ffffffff816113f1>] netlink_rcv_skb+0xc1/0xe0
> [<ffffffff81611adc>] genl_rcv+0x2c/0x40
> [<ffffffff81610a76>] netlink_unicast+0x106/0x210
> [<ffffffff81610fb4>] netlink_sendmsg+0x434/0x690

> ##### 1737 pts/0    D      0:00 drbdsetup-84 new-resource foo --on-no-data-accessible=suspend-io
> ~]# cat /proc/1737/stack 
> [<ffffffff81612658>] genl_rcv_msg+0xc8/0xd0

The same.

> [<ffffffff816113f1>] netlink_rcv_skb+0xc1/0xe0
> [<ffffffff81611adc>] genl_rcv+0x2c/0x40
> [<ffffffff81610a76>] netlink_unicast+0x106/0x210
> [<ffffffff81610fb4>] netlink_sendmsg+0x434/0x690

The two drbdsetup waiting for genl_lock()
are maybe ugly, but expected in this,
as the drbdsetup down still holds it.

Question is: why does that down not make progress,
Why does the rcu grace period not make progress?

Anything suspicious in the logs before that?

> Please let me know if you would like any additional information.

Just as an additional data point,
can you try to reproduce with our out-of-tree DRBD 8.4 git?
(plus that .done() fix... I did not yet push it)


-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R&D, Integration, Ops, Consulting, Support

DRBD® and LINBIT® are registered trademarks of LINBIT

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Drbd-dev] BUG: NULL pointer dereference triggered by drbdsetup events2
  2017-03-02 21:33     ` Lars Ellenberg
@ 2017-03-11  0:29       ` Eric Wheeler
  2017-03-23 11:52         ` Lars Ellenberg
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Wheeler @ 2017-03-11  0:29 UTC (permalink / raw)
  To: Lars Ellenberg; +Cc: drbd-dev

On Thu, 2 Mar 2017, Lars Ellenberg wrote:

> On Wed, Mar 01, 2017 at 01:36:55PM -0800, Eric Wheeler wrote:
> > On Wed, 1 Mar 2017, Lars Ellenberg wrote:
> > 
> > > On Tue, Feb 28, 2017 at 04:28:46PM -0800, Eric Wheeler wrote:
> > > > Hello all,
> > > > 
> > > > We found a relatively easy to reproduce bug that crashes the kernel. Start 
> > > > with all of the resources down and only the DRBD module loaded and run the 
> > > > following:
> > > > 
> > > > drbdadm up foo
> > > > drbdsetup events2 foo | true
> > > > drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
> > > > drbdadm up foo
> > > > 
> > > > The backtrace is below. The ls above simply illustrates the problem. After 
> > > > a down, the sysfs entry should not exist. The bug manifests because 
> > > > add_disk attempts to create the same sysfs entry but it cannot and fails 
> > > > up the stack. I have a feeling that the interface used by events2 is 
> > > > holding open a reference count after down so the sysfs entry is never 
> > > > removed.
> > > > 
> > > > We are piping into true because it fails quickly without reading any 
> > > > stdio, so perhaps the kernel is blocked trying to flush a buffer into 
> > > > userspace and never releases a resource count (speculation).
> > > > 
> > > > This was tested using the 4.10.1 kernel with userspace tools 
> > > > drbd-utils-8.9.4. I suspect this could be worked around in userspace, but 
> > > > it would be ideal if the kernel module could be fixed up to prevent a 
> > > > crash.
> > > > 
> > > > Please let me know if you need additional information or if I can provide 
> > > > any other testing.
> > > > 
> > > > Thank you for your help!
> > > 
> > > The "events2" part of drbd 8.4 was a "backport" from drbd 9.
> > > I found this apparently relevant commit in our DRBD 9 repo.
> > > 
> > > In our out-of-tree 8.4 code, we later dropped the .done parts again,
> > > for compat with RHEL 5, which is likely why this fix fell through.
> > > 
> > > I basically only re-diffed and compile-tested for current upstream kernel.
> > > Does this help already, or do we need to dig deeper?
> > 
> > Hi Lars,
> > 
> > Thank you for responding so quickly!
> > 
> > The events2 feature is neat, it allows us to prevent polling and provide 
> > better responsiveness in our infrastructure. We applied the patch and this 
> > is what we found:
> > 
> > It no longer crashes the kernel with a null pointer dereference, but now 
> > the drbdsetup calls never leave the kernel. The first time we ran the test 
> > case it worked fine, but the second time 'drbdadm down foo' hung. I 
> > checked the stack traces inside the kernel for each process that is hung 
> > and have also included the output of this test:
> > 
> > 
> > # This test succeeded
> > ~]# drbdadm up foo
> > ~]# drbdsetup events2 foo | true
> > ~]# drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
> > ls: cannot access /sys/devices/virtual/bdi/147:7740: No such file or directory
> > ~]# drbdadm up foo
> 
> right.
> 
> > # This test failed
> > ~]# drbdsetup events2 foo | true
> > ~]# drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
> > Command 'drbdsetup-84 down foo' did not terminate within 600 seconds
> > /sys/devices/virtual/bdi/147:7740
> > ~]# drbdadm up foo
> > Command 'drbdsetup-84 new-resource foo --on-no-data-accessible=suspend-io' did not terminate within 5 seconds
> 
> There is something else going on here.
> 
> > The second test that failed left hung drbdsetup processes:
> > 
> > ~]# ps ax |grep drbdsetup-84
> >  1579 pts/0    D      0:00 drbdsetup-84 down foo
> >  1585 ?        D      0:00 drbdsetup-84 role 7740
> 
> Where is that "role" coming from?
> 
> >  1737 pts/0    D      0:00 drbdsetup-84 new-resource foo --on-no-data-accessible=suspend-io
> > 
> > These are the in-kernel stack traces for each pid:
> > 
> > ##### 1579 pts/0    D+     0:00 drbdsetup-84 down foo
> > [root@hvtest2 ~]# cat /proc/1579/stack 
> > [<ffffffff81102a8c>] wait_rcu_gp+0x5c/0x80
> > [<ffffffff81106685>] synchronize_sched+0x45/0x60
> > [<ffffffffa02a8ca2>] adm_del_resource+0xf2/0x110 [drbd]
> 
> Hm.
> That's just doing a synchronize_rcu().
> 
> > [<ffffffffa02ab5f3>] drbd_adm_down+0x233/0x240 [drbd]
> > [<ffffffff8161235d>] genl_family_rcv_msg+0x1cd/0x400
> > [<ffffffff81612621>] genl_rcv_msg+0x91/0xd0
> > [<ffffffff816113f1>] netlink_rcv_skb+0xc1/0xe0
> > [<ffffffff81611adc>] genl_rcv+0x2c/0x40
> > [<ffffffff81610a76>] netlink_unicast+0x106/0x210
> > [<ffffffff81610fb4>] netlink_sendmsg+0x434/0x690
> 
> > ##### 1585 ?        D      0:00 drbdsetup-84 role 7740
> > ~]# cat /proc/1585/stack
> > [<ffffffff81612658>] genl_rcv_msg+0xc8/0xd0
> 
> This is likely waiting for the "genl_lock()".
> 
> > [<ffffffff816113f1>] netlink_rcv_skb+0xc1/0xe0
> > [<ffffffff81611adc>] genl_rcv+0x2c/0x40
> > [<ffffffff81610a76>] netlink_unicast+0x106/0x210
> > [<ffffffff81610fb4>] netlink_sendmsg+0x434/0x690
> 
> > ##### 1737 pts/0    D      0:00 drbdsetup-84 new-resource foo --on-no-data-accessible=suspend-io
> > ~]# cat /proc/1737/stack 
> > [<ffffffff81612658>] genl_rcv_msg+0xc8/0xd0
> 
> The same.
> 
> > [<ffffffff816113f1>] netlink_rcv_skb+0xc1/0xe0
> > [<ffffffff81611adc>] genl_rcv+0x2c/0x40
> > [<ffffffff81610a76>] netlink_unicast+0x106/0x210
> > [<ffffffff81610fb4>] netlink_sendmsg+0x434/0x690
> 
> The two drbdsetup waiting for genl_lock()
> are maybe ugly, but expected in this,
> as the drbdsetup down still holds it.
> 
> Question is: why does that down not make progress,
> Why does the rcu grace period not make progress?
> 
> Anything suspicious in the logs before that?
> 
> > Please let me know if you would like any additional information.
> 
> Just as an additional data point,
> can you try to reproduce with our out-of-tree DRBD 8.4 git?
> (plus that .done() fix... I did not yet push it)

Sure, this is the latest out of tree DRBD from git with the .done patch:

This is our test:
~]# drbdadm up foo
~]# drbdsetup events2 foo | true
~]# drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
/sys/devices/virtual/bdi/147:7740

The first few times we were running the commands slowly, one at a time and 
did not hit the bug. However, when they are run in short succession in a 
script or by pasting to a terminal, down shows that the directory still 
exists as shown above. Thus, something is holding the sysfs entry in 
place.

There is still a `drbdsetup-84 role 7740` stuck after down completes and 
its stack trace are as follows (Note that we never ran `drbdsetup role 
7740`):

~]# ps ax |grep drbdsetup
14787 ?        D      0:00 drbdsetup-84 role 7740
~]# cat /proc/14787/stack 
[<ffffffff81612658>] genl_rcv_msg+0xc8/0xd0
[<ffffffff816113f1>] netlink_rcv_skb+0xc1/0xe0
[<ffffffff81611adc>] genl_rcv+0x2c/0x40
[<ffffffff81610a76>] netlink_unicast+0x106/0x210
[<ffffffff81610fb4>] netlink_sendmsg+0x434/0x690
[<ffffffff815c036d>] sock_sendmsg+0x3d/0x50
[<ffffffff815c0405>] sock_write_iter+0x85/0xf0
[<ffffffff812257b1>] __vfs_write+0xd1/0x110
[<ffffffff81225e49>] vfs_write+0xa9/0x1b0
[<ffffffff81226cf5>] SyS_write+0x55/0xd0
[<ffffffff816fdb6e>] system_call_fastpath+0x12/0x71
[<ffffffffffffffff>] 0xffffffffffffffff


When we were able to trigger the bug, this is the full kernel output. We 
are using the latest 4.1.38 and your latest 8.4 kernel module from 
git://git.linbit.com/drbd-8.4.git .

I notice that there is a bug about rcu scheduling while atomic, but I am 
not sure if that's related or a side effect of the actual issue:

kernel: drbd foo: Starting worker thread (from drbdsetup-84 [14768])
kernel: block drbd7740: disk( Diskless -> Attaching ) 
kernel: drbd foo: Method to ensure write ordering: flush
kernel: block drbd7740: max BIO size = 1048576
kernel: block drbd7740: Adjusting my ra_pages to backing device's (32 -> 1024)
kernel: block drbd7740: drbd_bm_resize called with capacity == 4194040
kernel: block drbd7740: resync bitmap: bits=524255 words=8192 pages=16
kernel: block drbd7740: size = 2048 MB (2097020 KB)
kernel: block drbd7740: recounting of set bits took additional 0 jiffies
kernel: block drbd7740: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
kernel: block drbd7740: disk( Attaching -> UpToDate ) 
kernel: block drbd7740: attached to UUIDs B8CB842A92C72018:0000000000000000:054C0372E3E65F70:054B0372E3E65F70
kernel: drbd foo: conn( StandAlone -> Unconnected ) 
kernel: drbd foo: Starting receiver thread (from drbd_w_foo [14771])
kernel: drbd foo: receiver (re)started
kernel: drbd foo: conn( Unconnected -> WFConnection ) 
kernel: drbd foo: conn( WFConnection -> Disconnecting ) 
kernel: drbd foo: Discarding network configuration.
kernel: drbd foo: Connection closed
kernel: drbd foo: conn( Disconnecting -> StandAlone ) 
kernel: drbd foo: receiver terminated
kernel: drbd foo: Terminating drbd_r_foo
kernel: block drbd7740: disk( UpToDate -> Failed ) 
kernel: BUG: scheduling while atomic: rcuos/2/25/0x00000200
kernel: Modules linked in: md4 xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio kvm_intel kvm ppdev i2c_piix4 pcspkr virtio_balloon sg parport_pc parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc drbd(OE) libcrc32c ip_tables ext4 mbcache jbd2 sr_mod cdrom ata_generic pata_acpi cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm drm crc32c_intel ata_piix i2c_core serio_raw virtio_blk libata floppy virtio_net dm_mirror dm_region_hash dm_log dm_mod
kernel: CPU: 2 PID: 25 Comm: rcuos/2 Tainted: G           OE   4.1.38-4.el7.x86_64 #1
kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
kernel: 0000000000000286 00000000f24899fd ffff88023626fc48 ffffffff816f611c
kernel: ffff88023fd177c0 ffffffff81b4fb64 ffff88023626fc58 ffffffff816f0b25
kernel: ffff88023626fca8 ffffffff816f97ad ffff880236276040 ffff880236270000
kernel: Call Trace:
kernel: [<ffffffff816f611c>] dump_stack+0x63/0x81
kernel: [<ffffffff816f0b25>] __schedule_bug+0x4b/0x59
kernel: [<ffffffff816f97ad>] __schedule+0x7cd/0x950
kernel: [<ffffffff816f9967>] schedule+0x37/0x90
kernel: [<ffffffff816f9c9e>] schedule_preempt_disabled+0xe/0x10
kernel: [<ffffffff816fba95>] __mutex_lock_slowpath+0xb5/0x130
kernel: [<ffffffff816fbb33>] mutex_lock+0x23/0x37
kernel: [<ffffffff8161170f>] genl_lock_done+0x2f/0x60
kernel: [<ffffffff8160cfae>] netlink_sock_destruct+0x3e/0x150
kernel: [<ffffffff815c4663>] __sk_free+0x23/0x140
kernel: [<ffffffff815c4799>] sk_free+0x19/0x20
kernel: [<ffffffff8160da70>] deferred_put_nlk_sk+0x20/0x30
kernel: [<ffffffff81104dff>] rcu_nocb_kthread+0x18f/0x550
kernel: [<ffffffff810e6a80>] ? prepare_to_wait_event+0xf0/0xf0
kernel: [<ffffffff81104c70>] ? rcu_cpu_notify+0x3c0/0x3c0
kernel: [<ffffffff810c0c98>] kthread+0xd8/0xf0
kernel: [<ffffffff810c0bc0>] ? kthread_create_on_node+0x1b0/0x1b0
kernel: [<ffffffff816fdf62>] ret_from_fork+0x42/0x70
kernel: [<ffffffff810c0bc0>] ? kthread_create_on_node+0x1b0/0x1b0
kernel: NOHZ: local_softirq_pending 282
kernel: block drbd7740: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
kernel: block drbd7740: disk( Failed -> Diskless ) 
kernel: drbd foo: Terminating drbd_w_foo
kernel: BUG: scheduling while atomic: rcuos/2/25/0x00000200
kernel: Modules linked in: md4 xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio kvm_intel kvm ppdev i2c_piix4 pcspkr virtio_balloon sg parport_pc parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc drbd(OE) libcrc32c ip_tables ext4 mbcache jbd2 sr_mod cdrom ata_generic pata_acpi cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm drm crc32c_intel ata_piix i2c_core serio_raw virtio_blk libata floppy virtio_net dm_mirror dm_region_hash dm_log dm_mod
kernel: CPU: 2 PID: 25 Comm: rcuos/2 Tainted: G        W  OE   4.1.38-4.el7.x86_64 #1
kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
kernel: 0000000000000286 00000000f24899fd ffff88023626f9c8 ffffffff816f611c
kernel: ffff88023fd177c0 ffff88023626fb80 ffff88023626f9d8 ffffffff816f0b25
kernel: ffff88023626fa28 ffffffff816f97ad ffff88023626f9f8 ffff880236270000
kernel: Call Trace:
kernel: [<ffffffff816f611c>] dump_stack+0x63/0x81
kernel: [<ffffffff816f0b25>] __schedule_bug+0x4b/0x59
kernel: [<ffffffff816f97ad>] __schedule+0x7cd/0x950
kernel: [<ffffffff816f9967>] schedule+0x37/0x90
kernel: [<ffffffff816fc9e4>] schedule_timeout+0x244/0x2c0
kernel: [<ffffffff810cf7c4>] ? try_to_wake_up+0x1d4/0x350
kernel: [<ffffffff81103610>] ? get_state_synchronize_rcu+0x20/0x20
kernel: [<ffffffff816fa4d4>] wait_for_completion+0x134/0x190
kernel: [<ffffffff810cf9b0>] ? wake_up_state+0x20/0x20
kernel: [<ffffffff81103610>] ? get_state_synchronize_rcu+0x20/0x20
kernel: [<ffffffff81141ebf>] __stop_cpus+0x5f/0x90
kernel: [<ffffffff81103610>] ? get_state_synchronize_rcu+0x20/0x20
kernel: [<ffffffff811421fe>] try_stop_cpus+0x3e/0x60
kernel: [<ffffffff81106454>] synchronize_sched_expedited+0x134/0x320
kernel: [<ffffffff8110672e>] synchronize_rcu_expedited+0xe/0x10
kernel: [<ffffffff811cb64a>] bdi_destroy+0x11a/0x170
kernel: [<ffffffff8131dee7>] blk_cleanup_queue+0x117/0x1e0
kernel: [<ffffffffa02934de>] drbd_destroy_device+0x15e/0x370 [drbd]
kernel: [<ffffffffa02ad81d>] forget_state_change+0xbd/0xf0 [drbd]
kernel: [<ffffffffa029e210>] drbd_adm_get_initial_state_done+0x80/0x90 [drbd]
kernel: [<ffffffff81611717>] genl_lock_done+0x37/0x60
kernel: [<ffffffff8160cfae>] netlink_sock_destruct+0x3e/0x150
kernel: [<ffffffff815c4663>] __sk_free+0x23/0x140
kernel: [<ffffffff815c4799>] sk_free+0x19/0x20
kernel: [<ffffffff8160da70>] deferred_put_nlk_sk+0x20/0x30
kernel: [<ffffffff81104dff>] rcu_nocb_kthread+0x18f/0x550
kernel: [<ffffffff810e6a80>] ? prepare_to_wait_event+0xf0/0xf0
kernel: [<ffffffff81104c70>] ? rcu_cpu_notify+0x3c0/0x3c0
kernel: [<ffffffff810c0c98>] kthread+0xd8/0xf0
kernel: [<ffffffff810c0bc0>] ? kthread_create_on_node+0x1b0/0x1b0
kernel: [<ffffffff816fdf62>] ret_from_fork+0x42/0x70
kernel: [<ffffffff810c0bc0>] ? kthread_create_on_node+0x1b0/0x1b0
kernel: BUG: scheduling while atomic: rcuos/2/25/0x00000200
kernel: Modules linked in: md4 xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio kvm_intel kvm ppdev i2c_piix4 pcspkr virtio_balloon sg parport_pc parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc drbd(OE) libcrc32c ip_tables ext4 mbcache jbd2 sr_mod cdrom ata_generic pata_acpi cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm drm crc32c_intel ata_piix i2c_core serio_raw virtio_blk libata floppy virtio_net dm_mirror dm_region_hash dm_log dm_mod
kernel: CPU: 2 PID: 25 Comm: rcuos/2 Tainted: G        W  OE   4.1.38-4.el7.x86_64 #1
kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
kernel: 0000000000000286 00000000f24899fd ffff88023626fa28 ffffffff816f611c
kernel: ffff88023fd177c0 ffff88023626fbe8 ffff88023626fa38 ffffffff816f0b25
kernel: ffff88023626fa88 ffffffff816f97ad ffff88023626fa68 ffff880236270000
kernel: Call Trace:
kernel: [<ffffffff816f611c>] dump_stack+0x63/0x81
kernel: [<ffffffff816f0b25>] __schedule_bug+0x4b/0x59
kernel: [<ffffffff816f97ad>] __schedule+0x7cd/0x950
kernel: [<ffffffff816f9967>] schedule+0x37/0x90
kernel: [<ffffffff816fc9e4>] schedule_timeout+0x244/0x2c0
kernel: [<ffffffff810cf7c4>] ? try_to_wake_up+0x1d4/0x350
kernel: [<ffffffff810cf9b0>] ? wake_up_state+0x20/0x20
kernel: [<ffffffff816fa4d4>] wait_for_completion+0x134/0x190
kernel: [<ffffffff810cf9b0>] ? wake_up_state+0x20/0x20
kernel: [<ffffffff810b9c09>] flush_work+0x109/0x1f0
kernel: [<ffffffff810b7610>] ? destroy_worker+0x90/0x90
kernel: [<ffffffff810b9fc3>] flush_delayed_work+0x43/0x50
kernel: [<ffffffff811cb668>] bdi_destroy+0x138/0x170
kernel: [<ffffffff8131dee7>] blk_cleanup_queue+0x117/0x1e0
kernel: [<ffffffffa02934de>] drbd_destroy_device+0x15e/0x370 [drbd]
kernel: [<ffffffffa02ad81d>] forget_state_change+0xbd/0xf0 [drbd]
kernel: [<ffffffffa029e210>] drbd_adm_get_initial_state_done+0x80/0x90 [drbd]
kernel: [<ffffffff81611717>] genl_lock_done+0x37/0x60
kernel: [<ffffffff8160cfae>] netlink_sock_destruct+0x3e/0x150
kernel: [<ffffffff815c4663>] __sk_free+0x23/0x140
kernel: [<ffffffff815c4799>] sk_free+0x19/0x20
kernel: [<ffffffff8160da70>] deferred_put_nlk_sk+0x20/0x30
kernel: [<ffffffff81104dff>] rcu_nocb_kthread+0x18f/0x550
kernel: [<ffffffff810e6a80>] ? prepare_to_wait_event+0xf0/0xf0
kernel: [<ffffffff81104c70>] ? rcu_cpu_notify+0x3c0/0x3c0
kernel: [<ffffffff810c0c98>] kthread+0xd8/0xf0
kernel: [<ffffffff810c0bc0>] ? kthread_create_on_node+0x1b0/0x1b0
kernel: [<ffffffff816fdf62>] ret_from_fork+0x42/0x70
kernel: [<ffffffff810c0bc0>] ? kthread_create_on_node+0x1b0/0x1b0
kernel: BUG: scheduling while atomic: rcuos/2/25/0x00000200
kernel: Modules linked in: md4 xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio kvm_intel kvm ppdev i2c_piix4 pcspkr virtio_balloon sg parport_pc parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc drbd(OE) libcrc32c ip_tables ext4 mbcache jbd2 sr_mod cdrom ata_generic pata_acpi cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm drm crc32c_intel ata_piix i2c_core serio_raw virtio_blk libata floppy virtio_net dm_mirror dm_region_hash dm_log dm_mod
kernel: CPU: 2 PID: 25 Comm: rcuos/2 Tainted: G        W  OE   4.1.38-4.el7.x86_64 #1
kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
kernel: 0000000000000286 00000000f24899fd ffff88023626f978 ffffffff816f611c
kernel: ffff88023fd177c0 ffff88023626fb28 ffff88023626f988 ffffffff816f0b25
kernel: ffff88023626f9d8 ffffffff816f97ad ffff88023626f9d8 ffff880236270000
kernel: Call Trace:
kernel: [<ffffffff816f611c>] dump_stack+0x63/0x81
kernel: [<ffffffff816f0b25>] __schedule_bug+0x4b/0x59
kernel: [<ffffffff816f97ad>] __schedule+0x7cd/0x950
kernel: [<ffffffff816f9967>] schedule+0x37/0x90
kernel: [<ffffffff816fc9e4>] schedule_timeout+0x244/0x2c0
kernel: [<ffffffff810e6784>] ? __wake_up_sync_key+0x54/0x70
kernel: [<ffffffff816fa4d4>] wait_for_completion+0x134/0x190
kernel: [<ffffffff810cf9b0>] ? wake_up_state+0x20/0x20
kernel: [<ffffffff81105690>] ? __call_rcu+0x340/0x340
kernel: [<ffffffff81102a8c>] wait_rcu_gp+0x5c/0x80
kernel: [<ffffffff81102a10>] ? ftrace_raw_output_rcu_utilization+0x70/0x70
kernel: [<ffffffff81106685>] synchronize_sched+0x45/0x60
kernel: [<ffffffff8131dcc2>] blk_queue_bypass_start+0xb2/0xe0
kernel: [<ffffffff81339968>] blkcg_deactivate_policy+0x38/0x140
kernel: [<ffffffff8133dd28>] blk_throtl_exit+0x38/0x50
kernel: [<ffffffff8133a894>] blkcg_exit_queue+0x44/0x70
kernel: [<ffffffff81321c5e>] blk_release_queue+0x1e/0x100
kernel: [<ffffffff8135effb>] kobject_cleanup+0x7b/0x1a0
kernel: [<ffffffff8135eea0>] kobject_put+0x30/0x70
kernel: [<ffffffff8131def3>] blk_cleanup_queue+0x123/0x1e0
kernel: [<ffffffffa02934de>] drbd_destroy_device+0x15e/0x370 [drbd]
kernel: [<ffffffffa02ad81d>] forget_state_change+0xbd/0xf0 [drbd]
kernel: [<ffffffffa029e210>] drbd_adm_get_initial_state_done+0x80/0x90 [drbd]
kernel: [<ffffffff81611717>] genl_lock_done+0x37/0x60
kernel: [<ffffffff8160cfae>] netlink_sock_destruct+0x3e/0x150
kernel: [<ffffffff815c4663>] __sk_free+0x23/0x140
kernel: [<ffffffff815c4799>] sk_free+0x19/0x20
kernel: [<ffffffff8160da70>] deferred_put_nlk_sk+0x20/0x30
kernel: [<ffffffff81104dff>] rcu_nocb_kthread+0x18f/0x550
kernel: [<ffffffff810e6a80>] ? prepare_to_wait_event+0xf0/0xf0
kernel: [<ffffffff81104c70>] ? rcu_cpu_notify+0x3c0/0x3c0
kernel: [<ffffffff810c0c98>] kthread+0xd8/0xf0
kernel: [<ffffffff810c0bc0>] ? kthread_create_on_node+0x1b0/0x1b0
kernel: [<ffffffff816fdf62>] ret_from_fork+0x42/0x70
kernel: [<ffffffff810c0bc0>] ? kthread_create_on_node+0x1b0/0x1b0
kernel: NOHZ: local_softirq_pending 282

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Drbd-dev] BUG: NULL pointer dereference triggered by drbdsetup events2
  2017-03-11  0:29       ` Eric Wheeler
@ 2017-03-23 11:52         ` Lars Ellenberg
  0 siblings, 0 replies; 6+ messages in thread
From: Lars Ellenberg @ 2017-03-23 11:52 UTC (permalink / raw)
  To: drbd-dev

On Fri, Mar 10, 2017 at 04:29:30PM -0800, Eric Wheeler wrote:
> > > # This test failed
> > > ~]# drbdsetup events2 foo | true
> > > ~]# drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
> > > Command 'drbdsetup-84 down foo' did not terminate within 600 seconds
> > > /sys/devices/virtual/bdi/147:7740
> > > ~]# drbdadm up foo
> > > Command 'drbdsetup-84 new-resource foo --on-no-data-accessible=suspend-io' did not terminate within 5 seconds
> > 
> > There is something else going on here.

Eric,

I was able to reproduce the
sysfs: cannot create duplicate filename '/devices/virtual/bdi/147:0'
and resulting NULL deref in sysfs_do_create_link_sd
easily before, and with the fix in place, I can no longer reproduce,
so I consider that one fixed.

I am not able to reproduce this "down does not terminate", though.

> > > The second test that failed left hung drbdsetup processes:
> > > 
> > > ~]# ps ax |grep drbdsetup-84
> > >  1579 pts/0    D      0:00 drbdsetup-84 down foo

And as long as the "down" is still pending,
of course the links in sysfs will still be there as well.


> > The two drbdsetup waiting for genl_lock()
> > are maybe ugly, but expected in this,
> > as the drbdsetup down still holds it.
> > 
> > Question is: why does that down not make progress,
> > Why does the rcu grace period not make progress?
> > 
> > Anything suspicious in the logs before that?
> > 
> > > Please let me know if you would like any additional information.
> > 
> > Just as an additional data point,
> > can you try to reproduce with our out-of-tree DRBD 8.4 git?
> > (plus that .done() fix... I did not yet push it)
> 
> Sure, this is the latest out of tree DRBD from git with the .done patch:
> 
> This is our test:
> ~]# drbdadm up foo
> ~]# drbdsetup events2 foo | true
> ~]# drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
> /sys/devices/virtual/bdi/147:7740
> 
> The first few times we were running the commands slowly, one at a time and 
> did not hit the bug. However, when they are run in short succession in a 
> script or by pasting to a terminal, down shows that the directory still 
> exists as shown above. Thus, something is holding the sysfs entry in 
> place.
> 
> There is still a `drbdsetup-84 role 7740` stuck after down completes and 
> its stack trace are as follows (Note that we never ran `drbdsetup role 
> 7740`):

Someone did.
Who?

> ~]# ps ax |grep drbdsetup
> 14787 ?        D      0:00 drbdsetup-84 role 7740
> ~]# cat /proc/14787/stack 
> [<ffffffff81612658>] genl_rcv_msg+0xc8/0xd0
> [<ffffffff816113f1>] netlink_rcv_skb+0xc1/0xe0
> [<ffffffff81611adc>] genl_rcv+0x2c/0x40
> [<ffffffff81610a76>] netlink_unicast+0x106/0x210
> [<ffffffff81610fb4>] netlink_sendmsg+0x434/0x690
> [<ffffffff815c036d>] sock_sendmsg+0x3d/0x50
> [<ffffffff815c0405>] sock_write_iter+0x85/0xf0
> [<ffffffff812257b1>] __vfs_write+0xd1/0x110
> [<ffffffff81225e49>] vfs_write+0xa9/0x1b0
> [<ffffffff81226cf5>] SyS_write+0x55/0xd0
> [<ffffffff816fdb6e>] system_call_fastpath+0x12/0x71
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> 
> When we were able to trigger the bug, this is the full kernel output. We 
> are using the latest 4.1.38 and your latest 8.4 kernel module from 
> git://git.linbit.com/drbd-8.4.git .
> 
> I notice that there is a bug about rcu scheduling while atomic, but I am 
> not sure if that's related or a side effect of the actual issue:
> 
> kernel: drbd foo: Starting worker thread (from drbdsetup-84 [14768])
> kernel: block drbd7740: disk( Diskless -> Attaching ) 
> kernel: drbd foo: Method to ensure write ordering: flush
> kernel: block drbd7740: max BIO size = 1048576
> kernel: block drbd7740: Adjusting my ra_pages to backing device's (32 -> 1024)
> kernel: block drbd7740: drbd_bm_resize called with capacity == 4194040
> kernel: block drbd7740: resync bitmap: bits=524255 words=8192 pages=16
> kernel: block drbd7740: size = 2048 MB (2097020 KB)
> kernel: block drbd7740: recounting of set bits took additional 0 jiffies
> kernel: block drbd7740: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
> kernel: block drbd7740: disk( Attaching -> UpToDate ) 
> kernel: block drbd7740: attached to UUIDs B8CB842A92C72018:0000000000000000:054C0372E3E65F70:054B0372E3E65F70
> kernel: drbd foo: conn( StandAlone -> Unconnected ) 
> kernel: drbd foo: Starting receiver thread (from drbd_w_foo [14771])
> kernel: drbd foo: receiver (re)started
> kernel: drbd foo: conn( Unconnected -> WFConnection ) 
> kernel: drbd foo: conn( WFConnection -> Disconnecting ) 
> kernel: drbd foo: Discarding network configuration.
> kernel: drbd foo: Connection closed
> kernel: drbd foo: conn( Disconnecting -> StandAlone ) 
> kernel: drbd foo: receiver terminated
> kernel: drbd foo: Terminating drbd_r_foo
> kernel: block drbd7740: disk( UpToDate -> Failed ) 
> kernel: BUG: scheduling while atomic: rcuos/2/25/0x00000200

That's strange.

> kernel: Modules linked in: md4 xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio kvm_intel kvm ppdev i2c_piix4 pcspkr virtio_balloon sg parport_pc parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc drbd(OE) libcrc32c ip_tables ext4 mbcache jbd2 sr_mod cdrom ata_generic pata_acpi cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm drm crc32c_intel ata_piix i2c_core serio_raw virtio_blk libata floppy virtio_net dm_mirror dm_region_hash dm_log dm_mod
> kernel: CPU: 2 PID: 25 Comm: rcuos/2 Tainted: G           OE   4.1.38-4.el7.x86_64 #1
> kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> kernel: 0000000000000286 00000000f24899fd ffff88023626fc48 ffffffff816f611c
> kernel: ffff88023fd177c0 ffffffff81b4fb64 ffff88023626fc58 ffffffff816f0b25
> kernel: ffff88023626fca8 ffffffff816f97ad ffff880236276040 ffff880236270000
> kernel: Call Trace:
> kernel: [<ffffffff816f611c>] dump_stack+0x63/0x81
> kernel: [<ffffffff816f0b25>] __schedule_bug+0x4b/0x59
> kernel: [<ffffffff816f97ad>] __schedule+0x7cd/0x950
> kernel: [<ffffffff816f9967>] schedule+0x37/0x90
> kernel: [<ffffffff816f9c9e>] schedule_preempt_disabled+0xe/0x10
> kernel: [<ffffffff816fba95>] __mutex_lock_slowpath+0xb5/0x130

> kernel: [<ffffffff816fbb33>] mutex_lock+0x23/0x37
> kernel: [<ffffffff8161170f>] genl_lock_done+0x2f/0x60
> kernel: [<ffffffff8160cfae>] netlink_sock_destruct+0x3e/0x150
> kernel: [<ffffffff815c4663>] __sk_free+0x23/0x140
> kernel: [<ffffffff815c4799>] sk_free+0x19/0x20
> kernel: [<ffffffff8160da70>] deferred_put_nlk_sk+0x20/0x30
> kernel: [<ffffffff81104dff>] rcu_nocb_kthread+0x18f/0x550

Something wrong below deferred_put_nlk_sk.

Can I bother you to check with different kernel versions?

Maybe this one vvv helps? (>= 4.9)
ed5d778 netlink: Do not schedule work from sk_destruct

Or a kernel <= 3.18 (before that vvv one)?
21e4902 netlink: Lockless lookup with RCU grace period in socket release

Thanks,
	Lars


> kernel: [<ffffffff810e6a80>] ? prepare_to_wait_event+0xf0/0xf0
> kernel: [<ffffffff81104c70>] ? rcu_cpu_notify+0x3c0/0x3c0
> kernel: [<ffffffff810c0c98>] kthread+0xd8/0xf0
> kernel: [<ffffffff810c0bc0>] ? kthread_create_on_node+0x1b0/0x1b0
> kernel: [<ffffffff816fdf62>] ret_from_fork+0x42/0x70
> kernel: [<ffffffff810c0bc0>] ? kthread_create_on_node+0x1b0/0x1b0

> kernel: NOHZ: local_softirq_pending 282
> kernel: block drbd7740: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
> kernel: block drbd7740: disk( Failed -> Diskless ) 
> kernel: drbd foo: Terminating drbd_w_foo
> kernel: BUG: scheduling while atomic: rcuos/2/25/0x00000200
> kernel: Modules linked in: md4 xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio kvm_intel kvm ppdev i2c_piix4 pcspkr virtio_balloon sg parport_pc parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc drbd(OE) libcrc32c ip_tables ext4 mbcache jbd2 sr_mod cdrom ata_generic pata_acpi cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm drm crc32c_intel ata_piix i2c_core serio_raw virtio_blk libata floppy virtio_net dm_mirror dm_region_hash dm_log dm_mod
> kernel: CPU: 2 PID: 25 Comm: rcuos/2 Tainted: G        W  OE   4.1.38-4.el7.x86_64 #1
> kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> kernel: 0000000000000286 00000000f24899fd ffff88023626f9c8 ffffffff816f611c
> kernel: ffff88023fd177c0 ffff88023626fb80 ffff88023626f9d8 ffffffff816f0b25
> kernel: ffff88023626fa28 ffffffff816f97ad ffff88023626f9f8 ffff880236270000
> kernel: Call Trace:
> kernel: [<ffffffff816f611c>] dump_stack+0x63/0x81
> kernel: [<ffffffff816f0b25>] __schedule_bug+0x4b/0x59
> kernel: [<ffffffff816f97ad>] __schedule+0x7cd/0x950
> kernel: [<ffffffff816f9967>] schedule+0x37/0x90
> kernel: [<ffffffff816fc9e4>] schedule_timeout+0x244/0x2c0
> kernel: [<ffffffff810cf7c4>] ? try_to_wake_up+0x1d4/0x350
> kernel: [<ffffffff81103610>] ? get_state_synchronize_rcu+0x20/0x20
> kernel: [<ffffffff816fa4d4>] wait_for_completion+0x134/0x190
> kernel: [<ffffffff810cf9b0>] ? wake_up_state+0x20/0x20
> kernel: [<ffffffff81103610>] ? get_state_synchronize_rcu+0x20/0x20
> kernel: [<ffffffff81141ebf>] __stop_cpus+0x5f/0x90
> kernel: [<ffffffff81103610>] ? get_state_synchronize_rcu+0x20/0x20
> kernel: [<ffffffff811421fe>] try_stop_cpus+0x3e/0x60
> kernel: [<ffffffff81106454>] synchronize_sched_expedited+0x134/0x320
> kernel: [<ffffffff8110672e>] synchronize_rcu_expedited+0xe/0x10
> kernel: [<ffffffff811cb64a>] bdi_destroy+0x11a/0x170
> kernel: [<ffffffff8131dee7>] blk_cleanup_queue+0x117/0x1e0
> kernel: [<ffffffffa02934de>] drbd_destroy_device+0x15e/0x370 [drbd]
> kernel: [<ffffffffa02ad81d>] forget_state_change+0xbd/0xf0 [drbd]
> kernel: [<ffffffffa029e210>] drbd_adm_get_initial_state_done+0x80/0x90 [drbd]
> kernel: [<ffffffff81611717>] genl_lock_done+0x37/0x60
> kernel: [<ffffffff8160cfae>] netlink_sock_destruct+0x3e/0x150
> kernel: [<ffffffff815c4663>] __sk_free+0x23/0x140
> kernel: [<ffffffff815c4799>] sk_free+0x19/0x20
> kernel: [<ffffffff8160da70>] deferred_put_nlk_sk+0x20/0x30
> kernel: [<ffffffff81104dff>] rcu_nocb_kthread+0x18f/0x550

[... trimmed the other stack traces ... ]


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-03-23 11:52 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-03-01  0:28 [Drbd-dev] BUG: NULL pointer dereference triggered by drbdsetup events2 Eric Wheeler
2017-03-01 13:06 ` Lars Ellenberg
2017-03-01 21:36   ` Eric Wheeler
2017-03-02 21:33     ` Lars Ellenberg
2017-03-11  0:29       ` Eric Wheeler
2017-03-23 11:52         ` Lars Ellenberg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox