From: Lars Ellenberg <lars.ellenberg@linbit.com>
To: drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] BUG: NULL pointer dereference triggered by drbdsetup events2
Date: Wed, 1 Mar 2017 14:06:14 +0100 [thread overview]
Message-ID: <20170301130614.GH10667@soda.linbit> (raw)
In-Reply-To: <alpine.LRH.2.11.1702281620050.16946@mail.ewheeler.net>
On Tue, Feb 28, 2017 at 04:28:46PM -0800, Eric Wheeler wrote:
> Hello all,
>
> We found a relatively easy to reproduce bug that crashes the kernel. Start
> with all of the resources down and only the DRBD module loaded and run the
> following:
>
> drbdadm up foo
> drbdsetup events2 foo | true
> drbdadm down foo; ls -d /sys/devices/virtual/bdi/147:7740
> drbdadm up foo
>
> The backtrace is below. The ls above simply illustrates the problem. After
> a down, the sysfs entry should not exist. The bug manifests because
> add_disk attempts to create the same sysfs entry but it cannot and fails
> up the stack. I have a feeling that the interface used by events2 is
> holding open a reference count after down so the sysfs entry is never
> removed.
>
> We are piping into true because it fails quickly without reading any
> stdio, so perhaps the kernel is blocked trying to flush a buffer into
> userspace and never releases a resource count (speculation).
>
> This was tested using the 4.10.1 kernel with userspace tools
> drbd-utils-8.9.4. I suspect this could be worked around in userspace, but
> it would be ideal if the kernel module could be fixed up to prevent a
> crash.
>
> Please let me know if you need additional information or if I can provide
> any other testing.
>
> Thank you for your help!
The "events2" part of drbd 8.4 was a "backport" from drbd 9.
I found this apparently relevant commit in our DRBD 9 repo.
In our out-of-tree 8.4 code, we later dropped the .done parts again,
for compat with RHEL 5, which is likely why this fix fell through.
I basically only re-diffed and compile-tested for current upstream kernel.
Does this help already, or do we need to dig deeper?
Thanks,
Lars
commit 74635bbfbd03546f19ed015cec6e470062af44af
Author: Lars Ellenberg <lars.ellenberg@linbit.com>
Date: Wed Jul 27 14:53:00 2016 +0200
drbd: fix potential refcount leak, add missing drbd_adm_get_initial_state_done
If some drbdsetup events2 / wait-* command aborts early,
or the netlink socket currently in .dumpit = drbd_adm_get_initial_state
is destroyed early for some other reason,
the cleanup code inside that .dumpit function would never run,
and we end up with a reference count leak.
Move cleanup to .done = drbd_adm_get_initial_state_done.
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index f35db29..115b468 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -84,6 +84,7 @@ int drbd_adm_dump_connections_done(struct netlink_callback *cb);
int drbd_adm_dump_peer_devices(struct sk_buff *skb, struct netlink_callback *cb);
int drbd_adm_dump_peer_devices_done(struct netlink_callback *cb);
int drbd_adm_get_initial_state(struct sk_buff *skb, struct netlink_callback *cb);
+int drbd_adm_get_initial_state_done(struct netlink_callback *cb);
#include <linux/drbd_genl_api.h>
#include "drbd_nla.h"
@@ -4932,6 +4933,21 @@ static int get_initial_state(struct sk_buff *skb, struct netlink_callback *cb)
return skb->len;
}
+int drbd_adm_get_initial_state_done(struct netlink_callback *cb)
+{
+ LIST_HEAD(head);
+ if (cb->args[0]) {
+ struct drbd_state_change *state_change =
+ (struct drbd_state_change *)cb->args[0];
+ cb->args[0] = 0;
+
+ /* connect list to head */
+ list_add(&head, &state_change->list);
+ free_state_changes(&head);
+ }
+ return 0;
+}
+
int drbd_adm_get_initial_state(struct sk_buff *skb, struct netlink_callback *cb)
{
struct drbd_resource *resource;
@@ -4940,14 +4956,6 @@ int drbd_adm_get_initial_state(struct sk_buff *skb, struct netlink_callback *cb)
if (cb->args[5] >= 1) {
if (cb->args[5] > 1)
return get_initial_state(skb, cb);
- if (cb->args[0]) {
- struct drbd_state_change *state_change =
- (struct drbd_state_change *)cb->args[0];
-
- /* connect list to head */
- list_add(&head, &state_change->list);
- free_state_changes(&head);
- }
return 0;
}
diff --git a/include/linux/drbd_genl.h b/include/linux/drbd_genl.h
index c934d3a..aa9bf5d 100644
--- a/include/linux/drbd_genl.h
+++ b/include/linux/drbd_genl.h
@@ -521,6 +521,7 @@ GENL_op(
DRBD_ADM_GET_INITIAL_STATE, 38,
GENL_op_init(
.dumpit = drbd_adm_get_initial_state,
+ .done = drbd_adm_get_initial_state_done,
),
GENL_tla_expected(DRBD_NLA_CFG_CONTEXT, DRBD_GENLA_F_MANDATORY))
next prev parent reply other threads:[~2017-03-01 13:06 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-03-01 0:28 [Drbd-dev] BUG: NULL pointer dereference triggered by drbdsetup events2 Eric Wheeler
2017-03-01 13:06 ` Lars Ellenberg [this message]
2017-03-01 21:36 ` Eric Wheeler
2017-03-02 21:33 ` Lars Ellenberg
2017-03-11 0:29 ` Eric Wheeler
2017-03-23 11:52 ` Lars Ellenberg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170301130614.GH10667@soda.linbit \
--to=lars.ellenberg@linbit.com \
--cc=drbd-dev@lists.linbit.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox