* [PATCH 1/2] drivers/base: add bus_device_iter_init, bus_device_iter_next, bus_device_iter_exit
2016-01-11 17:28 [PATCH 0/2] avoid crashing when reading /proc/scsi/scsi and simultaneously removing devices Ewan D. Milne
@ 2016-01-11 17:28 ` Ewan D. Milne
2016-01-11 17:28 ` [PATCH 2/2] scsi_proc: Change /proc/scsi/scsi to use bus device iterator Ewan D. Milne
2016-01-11 19:15 ` [PATCH 0/2] avoid crashing when reading /proc/scsi/scsi and simultaneously removing devices James Bottomley
2 siblings, 0 replies; 6+ messages in thread
From: Ewan D. Milne @ 2016-01-11 17:28 UTC (permalink / raw)
To: linux-kernel, linux-scsi; +Cc: gregkh, martin.petersen, hare
From: "Ewan D. Milne" <emilne@redhat.com>
These functions are needed to expose an iterator for SCSI usage.
>From a patch originally developed by David Jeffery <djeffery@redhat.com>
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
drivers/base/bus.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/device.h | 5 +++++
2 files changed, 64 insertions(+)
diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 5005924..a472e46 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -318,6 +318,65 @@ int bus_for_each_dev(struct bus_type *bus, struct device *start,
EXPORT_SYMBOL_GPL(bus_for_each_dev);
/**
+ * bus_device_iter_init - Initialize an iterator for walking a bus's devices.
+ * @iter: iterator structure to initialize
+ * @bus: bus type
+ *
+ * Initializes an iterator for safely walking a bus's list of devices. The
+ * iterator can be used by bus_device_iter_next() to safely walk the list, even
+ * if a device is removed from the list while being examined. Needs to be
+ * matched with a call to bus_device_iter_exit() to clean up the iterator when
+ * finished.
+ *
+ * Return 0 on success, non-zero on failure. Iterator cannot be used
+ * for a non-zero result
+ */
+int bus_device_iter_init(struct klist_iter *iter,
+ struct bus_type *bus)
+{
+ if (!bus || !bus->p)
+ return -EINVAL;
+
+ klist_iter_init_node(&bus->p->klist_devices, iter, NULL);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(bus_device_iter_init);
+
+/**
+ * bus_device_iter_next - Get a bus's next device from the iterator.
+ * @iter: iterator structure from bus_device_iter_init()
+ *
+ * Finds the next valid device entry on a bus's device list. Allows the list
+ * to be safely traversed by the caller even when other tasks remove devices
+ * from the list.
+ *
+ * Returns a reference to the next device, or NULL if no more devices.
+ */
+struct device *bus_device_iter_next(struct klist_iter *iter)
+{
+ struct device *dev;
+
+ while ((dev = next_device(iter)))
+ if (get_device(dev))
+ break;
+
+ return dev;
+}
+EXPORT_SYMBOL_GPL(bus_device_iter_next);
+
+/**
+ * bus_device_iter_exit - Clean up an iterator from walking a bus's device list.
+ * @iter: iterator structure from bus_device_iter_init() to clean up
+ *
+ * Clean up any remaining state after finishing walking a bus's device list.
+ */
+void bus_device_iter_exit(struct klist_iter *iter)
+{
+ klist_iter_exit(iter);
+}
+EXPORT_SYMBOL_GPL(bus_device_iter_exit);
+
+/**
* bus_find_device - device iterator for locating a particular device.
* @bus: bus type
* @start: Device to begin with
diff --git a/include/linux/device.h b/include/linux/device.h
index b8f411b..a44d912 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -151,6 +151,11 @@ void subsys_dev_iter_exit(struct subsys_dev_iter *iter);
int bus_for_each_dev(struct bus_type *bus, struct device *start, void *data,
int (*fn)(struct device *dev, void *data));
+
+int bus_device_iter_init(struct klist_iter *iter, struct bus_type *bus);
+struct device *bus_device_iter_next(struct klist_iter *iter);
+void bus_device_iter_exit(struct klist_iter *iter);
+
struct device *bus_find_device(struct bus_type *bus, struct device *start,
void *data,
int (*match)(struct device *dev, void *data));
--
2.1.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH 2/2] scsi_proc: Change /proc/scsi/scsi to use bus device iterator
2016-01-11 17:28 [PATCH 0/2] avoid crashing when reading /proc/scsi/scsi and simultaneously removing devices Ewan D. Milne
2016-01-11 17:28 ` [PATCH 1/2] drivers/base: add bus_device_iter_init, bus_device_iter_next, bus_device_iter_exit Ewan D. Milne
@ 2016-01-11 17:28 ` Ewan D. Milne
2016-01-11 19:15 ` [PATCH 0/2] avoid crashing when reading /proc/scsi/scsi and simultaneously removing devices James Bottomley
2 siblings, 0 replies; 6+ messages in thread
From: Ewan D. Milne @ 2016-01-11 17:28 UTC (permalink / raw)
To: linux-kernel, linux-scsi; +Cc: gregkh, martin.petersen, hare
From: "Ewan D. Milne" <emilne@redhat.com>
This prevents crashing due to accessing a removed element on the list,
the iterator will now hold the correct reference. It was not sufficient
to rely on the klist's reference on the containing device object.
>From a patch originally developed by David Jeffery <djeffery@redhat.com>
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
---
drivers/scsi/scsi_proc.c | 49 ++++++++++++++++++++++++++++--------------------
1 file changed, 29 insertions(+), 20 deletions(-)
diff --git a/drivers/scsi/scsi_proc.c b/drivers/scsi/scsi_proc.c
index 251598e..6c1b79d 100644
--- a/drivers/scsi/scsi_proc.c
+++ b/drivers/scsi/scsi_proc.c
@@ -40,6 +40,11 @@
/* 4K page size, but our output routines, use some slack for overruns */
#define PROC_BLOCK_SIZE (3*1024)
+struct scsi_proc_state {
+ struct klist_iter iter;
+ int pos;
+};
+
static struct proc_dir_entry *proc_scsi;
/* Protect sht->present and sht->proc_dir */
@@ -370,47 +375,50 @@ static ssize_t proc_scsi_write(struct file *file, const char __user *buf,
return err;
}
-static int always_match(struct device *dev, void *data)
-{
- return 1;
-}
-
-static inline struct device *next_scsi_device(struct device *start)
-{
- struct device *next = bus_find_device(&scsi_bus_type, start, NULL,
- always_match);
- put_device(start);
- return next;
-}
-
static void *scsi_seq_start(struct seq_file *sfile, loff_t *pos)
{
+ struct scsi_proc_state *state = sfile->private;
struct device *dev = NULL;
loff_t n = *pos;
+ int err;
+
+ err = bus_device_iter_init(&state->iter, &scsi_bus_type);
+ if (err < 0)
+ return ERR_PTR(err);
- while ((dev = next_scsi_device(dev))) {
+ while ((dev = bus_device_iter_next(&state->iter))) {
if (!n--)
break;
- sfile->private++;
+ put_device(dev);
+ state->pos++;
}
return dev;
}
static void *scsi_seq_next(struct seq_file *sfile, void *v, loff_t *pos)
{
+ struct scsi_proc_state *state = sfile->private;
+
(*pos)++;
- sfile->private++;
- return next_scsi_device(v);
+ put_device(v);
+ state->pos++;
+
+ return bus_device_iter_next(&state->iter);
}
static void scsi_seq_stop(struct seq_file *sfile, void *v)
{
+ struct scsi_proc_state *state = sfile->private;
+
put_device(v);
+ bus_device_iter_exit(&state->iter);
}
static int scsi_seq_show(struct seq_file *sfile, void *dev)
{
- if (!sfile->private)
+ struct scsi_proc_state *state = sfile->private;
+
+ if (!state->pos)
seq_puts(sfile, "Attached devices:\n");
return proc_print_scsidevice(dev, sfile);
@@ -436,7 +444,8 @@ static int proc_scsi_open(struct inode *inode, struct file *file)
* We don't really need this for the write case but it doesn't
* harm either.
*/
- return seq_open(file, &scsi_seq_ops);
+ return seq_open_private(file, &scsi_seq_ops,
+ sizeof(struct scsi_proc_state));
}
static const struct file_operations proc_scsi_operations = {
@@ -445,7 +454,7 @@ static const struct file_operations proc_scsi_operations = {
.read = seq_read,
.write = proc_scsi_write,
.llseek = seq_lseek,
- .release = seq_release,
+ .release = seq_release_private,
};
/**
--
2.1.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH 0/2] avoid crashing when reading /proc/scsi/scsi and simultaneously removing devices
2016-01-11 17:28 [PATCH 0/2] avoid crashing when reading /proc/scsi/scsi and simultaneously removing devices Ewan D. Milne
2016-01-11 17:28 ` [PATCH 1/2] drivers/base: add bus_device_iter_init, bus_device_iter_next, bus_device_iter_exit Ewan D. Milne
2016-01-11 17:28 ` [PATCH 2/2] scsi_proc: Change /proc/scsi/scsi to use bus device iterator Ewan D. Milne
@ 2016-01-11 19:15 ` James Bottomley
2016-01-11 21:32 ` Ewan Milne
2 siblings, 1 reply; 6+ messages in thread
From: James Bottomley @ 2016-01-11 19:15 UTC (permalink / raw)
To: Ewan D. Milne, linux-kernel, linux-scsi; +Cc: gregkh, martin.petersen, hare
On Mon, 2016-01-11 at 12:28 -0500, Ewan D. Milne wrote:
> From: "Ewan D. Milne" <emilne@redhat.com>
>
> The klist traversal used by the reading of /proc/scsi/scsi is not
> interlocked
> against device removal. It takes a reference on the containing
> object, but
> this does not prevent the device from being removed from the list.
> Thus, we
> get errors and eventually panic, as shown in the traces below. Fix
> this by
> keeping a klist iterator in the seq_file private data.
>
> The problem can be easily reproduced by repeatedly increasing
> scsi_debug's
> max_luns to 30 and then deleting the devices via sysfs, while
> simulatenously
> accessing /proc/scsi/scsi.
>
> From a patch originally developed by David Jeffery <
> djeffery@redhat.com>
OK, so it looks like this is a bug in the klist system. When a
starting point is used, there should be a check to see if it's still
active otherwise the whole thing is racy. If it's fixed in klist, the
fix works for everyone, not just SCSI.
How about this? It causes the iterator to start at the beginning if
the node has been deleted. That will produce double output during some
of your test, but I think that's OK given that this is a rare race.
James
---
diff --git a/lib/klist.c b/lib/klist.c
index d74cf7a..0507fa5 100644
--- a/lib/klist.c
+++ b/lib/klist.c
@@ -282,9 +282,9 @@ void klist_iter_init_node(struct klist *k, struct klist_iter *i,
struct klist_node *n)
{
i->i_klist = k;
- i->i_cur = n;
- if (n)
- kref_get(&n->n_ref);
+ i->i_cur = NULL;
+ if (n && kref_get_unless_zero(&n->n_ref))
+ i->i_cur = n;
}
EXPORT_SYMBOL_GPL(klist_iter_init_node);
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH 0/2] avoid crashing when reading /proc/scsi/scsi and simultaneously removing devices
2016-01-11 19:15 ` [PATCH 0/2] avoid crashing when reading /proc/scsi/scsi and simultaneously removing devices James Bottomley
@ 2016-01-11 21:32 ` Ewan Milne
2016-01-12 2:35 ` James Bottomley
0 siblings, 1 reply; 6+ messages in thread
From: Ewan Milne @ 2016-01-11 21:32 UTC (permalink / raw)
To: James Bottomley; +Cc: linux-kernel, linux-scsi, gregkh, martin.petersen, hare
On Mon, 2016-01-11 at 11:15 -0800, James Bottomley wrote:
> On Mon, 2016-01-11 at 12:28 -0500, Ewan D. Milne wrote:
> > From: "Ewan D. Milne" <emilne@redhat.com>
> >
> > The klist traversal used by the reading of /proc/scsi/scsi is not
> > interlocked
> > against device removal. It takes a reference on the containing
> > object, but
> > this does not prevent the device from being removed from the list.
> > Thus, we
> > get errors and eventually panic, as shown in the traces below. Fix
> > this by
> > keeping a klist iterator in the seq_file private data.
> >
> > The problem can be easily reproduced by repeatedly increasing
> > scsi_debug's
> > max_luns to 30 and then deleting the devices via sysfs, while
> > simulatenously
> > accessing /proc/scsi/scsi.
> >
> > From a patch originally developed by David Jeffery <
> > djeffery@redhat.com>
>
> OK, so it looks like this is a bug in the klist system. When a
> starting point is used, there should be a check to see if it's still
> active otherwise the whole thing is racy. If it's fixed in klist, the
> fix works for everyone, not just SCSI.
>
> How about this? It causes the iterator to start at the beginning if
> the node has been deleted. That will produce double output during some
> of your test, but I think that's OK given that this is a rare race.
>
> James
I'm running with your change now, it does appear to fix the problem.
I guess the question is whether this behavior would trip up any other
klist users, for /proc/scsi/scsi it is probably not a problem. The
worst that might happen is that userspace tools that parse the output
would get duplicate entries.
-Ewan
> ---
>
> diff --git a/lib/klist.c b/lib/klist.c
> index d74cf7a..0507fa5 100644
> --- a/lib/klist.c
> +++ b/lib/klist.c
> @@ -282,9 +282,9 @@ void klist_iter_init_node(struct klist *k, struct klist_iter *i,
> struct klist_node *n)
> {
> i->i_klist = k;
> - i->i_cur = n;
> - if (n)
> - kref_get(&n->n_ref);
> + i->i_cur = NULL;
> + if (n && kref_get_unless_zero(&n->n_ref))
> + i->i_cur = n;
> }
> EXPORT_SYMBOL_GPL(klist_iter_init_node);
>
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH 0/2] avoid crashing when reading /proc/scsi/scsi and simultaneously removing devices
2016-01-11 21:32 ` Ewan Milne
@ 2016-01-12 2:35 ` James Bottomley
0 siblings, 0 replies; 6+ messages in thread
From: James Bottomley @ 2016-01-12 2:35 UTC (permalink / raw)
To: emilne; +Cc: linux-kernel, linux-scsi, gregkh, martin.petersen, hare
On Mon, 2016-01-11 at 16:32 -0500, Ewan Milne wrote:
> On Mon, 2016-01-11 at 11:15 -0800, James Bottomley wrote:
> > On Mon, 2016-01-11 at 12:28 -0500, Ewan D. Milne wrote:
> > > From: "Ewan D. Milne" <emilne@redhat.com>
> > >
> > > The klist traversal used by the reading of /proc/scsi/scsi is not
> > > interlocked
> > > against device removal. It takes a reference on the containing
> > > object, but
> > > this does not prevent the device from being removed from the
> > > list.
> > > Thus, we
> > > get errors and eventually panic, as shown in the traces below.
> > > Fix
> > > this by
> > > keeping a klist iterator in the seq_file private data.
> > >
> > > The problem can be easily reproduced by repeatedly increasing
> > > scsi_debug's
> > > max_luns to 30 and then deleting the devices via sysfs, while
> > > simulatenously
> > > accessing /proc/scsi/scsi.
> > >
> > > From a patch originally developed by David Jeffery <
> > > djeffery@redhat.com>
> >
> > OK, so it looks like this is a bug in the klist system. When a
> > starting point is used, there should be a check to see if it's
> > still
> > active otherwise the whole thing is racy. If it's fixed in klist,
> > the
> > fix works for everyone, not just SCSI.
> >
> > How about this? It causes the iterator to start at the beginning
> > if
> > the node has been deleted. That will produce double output during
> > some
> > of your test, but I think that's OK given that this is a rare race.
> >
> > James
>
> I'm running with your change now, it does appear to fix the problem.
> I guess the question is whether this behavior would trip up any other
> klist users,
I don't see how it can. We simply can't use a removed node as the
starting point without triggering the kref warn on you see in your log.
Things just go downhill from there. This will happen to any klist
user, not just us. I'd contend that starting at the beginning is
better than an eventual panic for anyone.
What you should see currently is actually the list is truncated when
the problem is hit because the next pointer is set to the poison value.
That causes another warn on and traversal truncation (or straight
dereference if you're unlucky).
> for /proc/scsi/scsi it is probably not a problem. The
> worst that might happen is that userspace tools that parse the output
> would get duplicate entries.
Well, it's not really possible to keep the current behaviour and
truncate the list (before oopsing). There's no non-racy way of
positioning the iterator at the last element so it exits on
klist_next() because the list can be added to during the iteration.
James
^ permalink raw reply [flat|nested] 6+ messages in thread