Distributed Replicated Block Device (DRBD) development
 help / color / mirror / Atom feed
From: Lars Ellenberg <lars.ellenberg@linbit.com>
To: drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] BUG: NULL pointer dereference triggered by drbdsetup events2
Date: Wed, 1 Mar 2017 14:06:14 +0100	[thread overview]
Message-ID: <20170301130614.GH10667@soda.linbit> (raw)
In-Reply-To: <alpine.LRH.2.11.1702281620050.16946@mail.ewheeler.net>

On Tue, Feb 28, 2017 at 04:28:46PM -0800, Eric Wheeler wrote:
> Hello all,
> 
> We found a relatively easy to reproduce bug that crashes the kernel. Start 
> with all of the resources down and only the DRBD module loaded and run the 
> following:
> 
> drbdadm up foo
> drbdsetup events2 foo | true
> drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
> drbdadm up foo
> 
> The backtrace is below. The ls above simply illustrates the problem. After 
> a down, the sysfs entry should not exist. The bug manifests because 
> add_disk attempts to create the same sysfs entry but it cannot and fails 
> up the stack. I have a feeling that the interface used by events2 is 
> holding open a reference count after down so the sysfs entry is never 
> removed.
> 
> We are piping into true because it fails quickly without reading any 
> stdio, so perhaps the kernel is blocked trying to flush a buffer into 
> userspace and never releases a resource count (speculation).
> 
> This was tested using the 4.10.1 kernel with userspace tools 
> drbd-utils-8.9.4. I suspect this could be worked around in userspace, but 
> it would be ideal if the kernel module could be fixed up to prevent a 
> crash.
> 
> Please let me know if you need additional information or if I can provide 
> any other testing.
> 
> Thank you for your help!

The "events2" part of drbd 8.4 was a "backport" from drbd 9.
I found this apparently relevant commit in our DRBD 9 repo.

In our out-of-tree 8.4 code, we later dropped the .done parts again,
for compat with RHEL 5, which is likely why this fix fell through.

I basically only re-diffed and compile-tested for current upstream kernel.
Does this help already, or do we need to dig deeper?

Thanks,
	Lars


commit 74635bbfbd03546f19ed015cec6e470062af44af
Author: Lars Ellenberg <lars.ellenberg@linbit.com>
Date:   Wed Jul 27 14:53:00 2016 +0200

    drbd: fix potential refcount leak, add missing drbd_adm_get_initial_state_done
    
    If some drbdsetup events2 / wait-* command aborts early,
    or the netlink socket currently in .dumpit = drbd_adm_get_initial_state
    is destroyed early for some other reason,
    the cleanup code inside that .dumpit function would never run,
    and we end up with a reference count leak.
    
    Move cleanup to .done = drbd_adm_get_initial_state_done.

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index f35db29..115b468 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -84,6 +84,7 @@ int drbd_adm_dump_connections_done(struct netlink_callback *cb);
 int drbd_adm_dump_peer_devices(struct sk_buff *skb, struct netlink_callback *cb);
 int drbd_adm_dump_peer_devices_done(struct netlink_callback *cb);
 int drbd_adm_get_initial_state(struct sk_buff *skb, struct netlink_callback *cb);
+int drbd_adm_get_initial_state_done(struct netlink_callback *cb);
 
 #include <linux/drbd_genl_api.h>
 #include "drbd_nla.h"
@@ -4932,6 +4933,21 @@ static int get_initial_state(struct sk_buff *skb, struct netlink_callback *cb)
 	return skb->len;
 }
 
+int drbd_adm_get_initial_state_done(struct netlink_callback *cb)
+{
+	LIST_HEAD(head);
+	if (cb->args[0]) {
+		struct drbd_state_change *state_change =
+			(struct drbd_state_change *)cb->args[0];
+		cb->args[0] = 0;
+
+		/* connect list to head */
+		list_add(&head, &state_change->list);
+		free_state_changes(&head);
+	}
+	return 0;
+}
+
 int drbd_adm_get_initial_state(struct sk_buff *skb, struct netlink_callback *cb)
 {
 	struct drbd_resource *resource;
@@ -4940,14 +4956,6 @@ int drbd_adm_get_initial_state(struct sk_buff *skb, struct netlink_callback *cb)
 	if (cb->args[5] >= 1) {
 		if (cb->args[5] > 1)
 			return get_initial_state(skb, cb);
-		if (cb->args[0]) {
-			struct drbd_state_change *state_change =
-				(struct drbd_state_change *)cb->args[0];
-
-			/* connect list to head */
-			list_add(&head, &state_change->list);
-			free_state_changes(&head);
-		}
 		return 0;
 	}
 
diff --git a/include/linux/drbd_genl.h b/include/linux/drbd_genl.h
index c934d3a..aa9bf5d 100644
--- a/include/linux/drbd_genl.h
+++ b/include/linux/drbd_genl.h
@@ -521,6 +521,7 @@ GENL_op(
 	DRBD_ADM_GET_INITIAL_STATE, 38,
 	GENL_op_init(
 	        .dumpit = drbd_adm_get_initial_state,
+	        .done = drbd_adm_get_initial_state_done,
 	),
 	GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, DRBD_GENLA_F_MANDATORY))
 

  reply	other threads:[~2017-03-01 13:06 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-01  0:28 [Drbd-dev] BUG: NULL pointer dereference triggered by drbdsetup events2 Eric Wheeler
2017-03-01 13:06 ` Lars Ellenberg [this message]
2017-03-01 21:36   ` Eric Wheeler
2017-03-02 21:33     ` Lars Ellenberg
2017-03-11  0:29       ` Eric Wheeler
2017-03-23 11:52         ` Lars Ellenberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170301130614.GH10667@soda.linbit \
    --to=lars.ellenberg@linbit.com \
    --cc=drbd-dev@lists.linbit.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox