All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Fix Null pointer Exception
@ 2008-09-08 15:40 Stefan Raspl
  2008-09-09  0:21 ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Raspl @ 2008-09-08 15:40 UTC (permalink / raw)
  To: device-mapper development, agk; +Cc: akpm, h.carstens, martin.schwidefsky

Here's a trivial patch for the kernel panics that we reported last week
when testing various ways to forcefully disconnect or temporarily disable
DASD disks from an IBM System z machine. We ran into NULL pointer exceptions
at the respective places.


Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>


---
 drivers/md/dm-table.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -943,7 +943,8 @@ int dm_table_any_congested(struct dm_tab
 
 	list_for_each_entry(dd, devices, list) {
 		struct request_queue *q = bdev_get_queue(dd->bdev);
-		r |= bdi_congested(&q->backing_dev_info, bdi_bits);
+		if (q)
+			r |= bdi_congested(&q->backing_dev_info, bdi_bits);
 	}
 
 	return r;
@@ -957,7 +958,8 @@ void dm_table_unplug_all(struct dm_table
 	list_for_each_entry(dd, devices, list) {
 		struct request_queue *q = bdev_get_queue(dd->bdev);
 
-		blk_unplug(q);
+		if (q)
+			blk_unplug(q);
 	}
 }
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix Null pointer Exception
  2008-09-08 15:40 [PATCH] Fix Null pointer Exception Stefan Raspl
@ 2008-09-09  0:21 ` Andrew Morton
  2008-09-23 16:56   ` Alasdair G Kergon
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2008-09-09  0:21 UTC (permalink / raw)
  To: raspl; +Cc: dm-devel, h.carstens, martin.schwidefsky, agk

On Mon, 08 Sep 2008 17:40:59 +0200
Stefan Raspl <raspl@linux.vnet.ibm.com> wrote:

> Here's a trivial patch for the kernel panics that we reported last week
> when testing various ways to forcefully disconnect or temporarily disable
> DASD disks from an IBM System z machine. We ran into NULL pointer exceptions
> at the respective places.
> 

Please look at the above text and consider how it will look to people
who read it in the git repository in 2011.

And consider how it looks today, to people who don't know anything
about "the kernel panics that we reported last week".

> 
> Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
> 
> 
> ---
>  drivers/md/dm-table.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -943,7 +943,8 @@ int dm_table_any_congested(struct dm_tab
>  
>  	list_for_each_entry(dd, devices, list) {
>  		struct request_queue *q = bdev_get_queue(dd->bdev);
> -		r |= bdi_congested(&q->backing_dev_info, bdi_bits);
> +		if (q)
> +			r |= bdi_congested(&q->backing_dev_info, bdi_bits);
>  	}
>  
>  	return r;
> @@ -957,7 +958,8 @@ void dm_table_unplug_all(struct dm_table
>  	list_for_each_entry(dd, devices, list) {
>  		struct request_queue *q = bdev_get_queue(dd->bdev);
>  
> -		blk_unplug(q);
> +		if (q)
> +			blk_unplug(q);
>  	}
>  }
>  

And it's not just a trivial matter of getting the paperwork right. 
This could be the wrong fix - how did these null pointers come about? 
What was the workload?  It seems strange to have a blockdev which has
no queue associated with it.

It takes no more than five minutes to fully describe a patch of this
kind.  Please expend that time.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Re: [PATCH] Fix Null pointer Exception
  2008-09-09  0:21 ` Andrew Morton
@ 2008-09-23 16:56   ` Alasdair G Kergon
  2008-09-24 13:17     ` Stefan Raspl
  0 siblings, 1 reply; 7+ messages in thread
From: Alasdair G Kergon @ 2008-09-23 16:56 UTC (permalink / raw)
  To: raspl, Jens Axboe; +Cc: Andrew Morton, h.carstens, dm-devel, martin.schwidefsky

On Mon, Sep 08, 2008 at 05:21:53PM -0700, Andrew Morton wrote:
> On Mon, 08 Sep 2008 17:40:59 +0200
> Stefan Raspl <raspl@linux.vnet.ibm.com> wrote:
> > Here's a trivial patch for the kernel panics that we reported last week
> > when testing various ways to forcefully disconnect or temporarily disable
> > DASD disks from an IBM System z machine. We ran into NULL pointer exceptions
> > at the respective places.
> Please look at the above text and consider how it will look to people
> who read it in the git repository in 2011.
> 
> And consider how it looks today, to people who don't know anything
> about "the kernel panics that we reported last week".
 
I've found this message:

+ Date: Mon, 01 Sep 2008 15:48:18 +0200
+ From: Stefan Raspl <raspl@linux.vnet.ibm.com>
+ Subject: [dm-devel] bdev lost its queue
+ To: device-mapper development <dm-devel@redhat.com>
+ 
+ We conducted a bunch tests where we used various ways to forcefully 
+ disconnect or temporarily disable DASD disks from an IBM System z 
+ machine. In the course we ran into a kernel panic in 
+ dm_table_unplug_all() (dm-table.c), specifically at
+ 
+    struct request_queue *q = bdev_get_queue(dd->bdev);
+     blk_unplug(q);
+ 
+ since the queue of the bdev was NULL.
+ Did anyone see a crash in this place before? And are there other disk 
+ drivers to can set their queue to NULL due to unplugging/outages or similar?

> > --- a/drivers/md/dm-table.c
> > +++ b/drivers/md/dm-table.c
> > @@ -943,7 +943,8 @@ int dm_table_any_congested(struct dm_tab
> >  
> >  	list_for_each_entry(dd, devices, list) {
> >  		struct request_queue *q = bdev_get_queue(dd->bdev);
> > -		r |= bdi_congested(&q->backing_dev_info, bdi_bits);
> > +		if (q)
> > +			r |= bdi_congested(&q->backing_dev_info, bdi_bits);
> >  	}
> >  
> >  	return r;
> > @@ -957,7 +958,8 @@ void dm_table_unplug_all(struct dm_table
> >  	list_for_each_entry(dd, devices, list) {
> >  		struct request_queue *q = bdev_get_queue(dd->bdev);
> >  
> > -		blk_unplug(q);
> > +		if (q)
> > +			blk_unplug(q);
> >  	}
> >  }
> >  
> 
> And it's not just a trivial matter of getting the paperwork right. 
> This could be the wrong fix - how did these null pointers come about? 
> What was the workload?  It seems strange to have a blockdev which has
> no queue associated with it.
 
Indeed, there is also code in other files that assumes struct
request_queue is not NULL.  And no, I've not heard of other drivers
causing this problem.  (Our dm multipath code doesn't do this when all
the paths disappear, for example.)

Have you had chance to look at old kernel versions to see in which
commit this was introduced?

Jens - What's your opinion: s390 driver fix or more messages like this one?
                if (!q) {
                        printk(KERN_ERR
                               "generic_make_request: Trying to access "
                                "nonexistent block-device %s (%Lu)\n",
                                bdevname(bio->bi_bdev, b),
                                (long long) bio->bi_sector);

Alasdair
-- 
agk@redhat.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Re: [PATCH] Fix Null pointer Exception
  2008-09-23 16:56   ` Alasdair G Kergon
@ 2008-09-24 13:17     ` Stefan Raspl
  2008-10-07 21:18       ` Alasdair G Kergon
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Raspl @ 2008-09-24 13:17 UTC (permalink / raw)
  To: raspl, Jens Axboe, Andrew Morton, dm-devel, heiko.carstens,
	schwidefsky, Neil

Alasdair G Kergon wrote:

> Have you had chance to look at old kernel versions to see in which
> commit this was introduced?

The next best thing I can come up with is that it has been around in
2.6.9 already. But I believe the DASD driver did the NULL thing to
its internal queue also back then, still.

Ciao,
Stefan


-- 
Linux on System z
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführer: Erich Baier
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Re: [PATCH] Fix Null pointer Exception
  2008-09-24 13:17     ` Stefan Raspl
@ 2008-10-07 21:18       ` Alasdair G Kergon
  2008-10-09 15:17         ` Alasdair G Kergon
  0 siblings, 1 reply; 7+ messages in thread
From: Alasdair G Kergon @ 2008-10-07 21:18 UTC (permalink / raw)
  To: raspl
  Cc: wein, heiko.carstens, device-mapper development, Jens Axboe,
	schwidefsky, Andrew Morton

On Wed, Sep 24, 2008 at 03:17:37PM +0200, Stefan Raspl wrote:
> Alasdair G Kergon wrote:
> > Have you had chance to look at old kernel versions to see in which
> > commit this was introduced?
> The next best thing I can come up with is that it has been around in
> 2.6.9 already. But I believe the DASD driver did the NULL thing to
> its internal queue also back then, still.
 
Well, in the absence of further information and being unaware of any
other driver doing this, rather than changing several places in the
block layer to cope with one out-of-the-ordinary driver, I wish to see
the driver fixed to work within the existing block layer design.  If
that is difficult or impossible (which I doubt), then we need more
information to justify that position.

Meanwhile I will not be applying that patch, though I may take a patch
to provide clean diagnostics should a situation like this occur again.

Alasdair
-- 
agk@redhat.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Re: [PATCH] Fix Null pointer Exception
  2008-10-07 21:18       ` Alasdair G Kergon
@ 2008-10-09 15:17         ` Alasdair G Kergon
  2008-10-10  7:23           ` Stefan Raspl
  0 siblings, 1 reply; 7+ messages in thread
From: Alasdair G Kergon @ 2008-10-09 15:17 UTC (permalink / raw)
  To: raspl, device-mapper development, Jens Axboe, Andrew Morton,
	heiko.carstens

On Tue, Oct 07, 2008 at 10:18:40PM +0100, Alasdair G Kergon wrote:
> Meanwhile I will not be applying that patch, though I may take a patch
> to provide clean diagnostics should a situation like this occur again.
 
I'm thinking of something like this:


From: Alasdair G Kergon <agk@redhat.com>

Detect and report buggy drivers that destroy their request_queue.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: Stefan Raspl <raspl@linux.vnet.ibm.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

---
 drivers/md/dm-table.c |   24 ++++++++++++++++++++++--
 1 files changed, 22 insertions(+), 2 deletions(-)

Index: linux-2.6.27-rc8/drivers/md/dm-table.c
===================================================================
--- linux-2.6.27-rc8.orig/drivers/md/dm-table.c	2008-10-09 14:53:35.000000000 +0100
+++ linux-2.6.27-rc8/drivers/md/dm-table.c	2008-10-09 14:56:26.000000000 +0100
@@ -478,6 +478,13 @@ void dm_set_device_limits(struct dm_targ
 {
 	struct request_queue *q = bdev_get_queue(bdev);
 	struct io_restrictions *rs = &ti->limits;
+	char b[BDEVNAME_SIZE];
+
+	if (unlikely(!q)) {
+		DMWARN("%s: Cannot set limits for nonexistent device %s",
+		       dm_device_name(ti->table->md), bdevname(bdev, b));
+		return;
+	}
 
 	/*
 	 * Combine the device limits low.
@@ -943,7 +950,14 @@ int dm_table_any_congested(struct dm_tab
 
 	list_for_each_entry(dd, devices, list) {
 		struct request_queue *q = bdev_get_queue(dd->bdev);
-		r |= bdi_congested(&q->backing_dev_info, bdi_bits);
+		char b[BDEVNAME_SIZE];
+
+		if (likely(q))
+			r |= bdi_congested(&q->backing_dev_info, bdi_bits);
+		else
+			DMWARN_LIMIT("%s: any_congested: nonexistent device %s",
+				     dm_device_name(t->md),
+				     bdevname(dd->bdev, b));
 	}
 
 	return r;
@@ -956,8 +970,14 @@ void dm_table_unplug_all(struct dm_table
 
 	list_for_each_entry(dd, devices, list) {
 		struct request_queue *q = bdev_get_queue(dd->bdev);
+		char b[BDEVNAME_SIZE];
 
-		blk_unplug(q);
+		if (likely(q))
+			blk_unplug(q);
+		else
+			DMWARN_LIMIT("%s: Cannot unplug nonexistent device %s",
+				     dm_device_name(t->md),
+				     bdevname(dd->bdev, b));
 	}
 }
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Re: [PATCH] Fix Null pointer Exception
  2008-10-09 15:17         ` Alasdair G Kergon
@ 2008-10-10  7:23           ` Stefan Raspl
  0 siblings, 0 replies; 7+ messages in thread
From: Stefan Raspl @ 2008-10-10  7:23 UTC (permalink / raw)
  To: raspl, device-mapper development, Jens Axboe, Andrew Morton,
	heiko.carstens

Alasdair G Kergon wrote:
> On Tue, Oct 07, 2008 at 10:18:40PM +0100, Alasdair G Kergon wrote:
>> Meanwhile I will not be applying that patch, though I may take a patch
>> to provide clean diagnostics should a situation like this occur again.
> 
> I'm thinking of something like this:
> 
> 
> From: Alasdair G Kergon <agk@redhat.com>
> 
> Detect and report buggy drivers that destroy their request_queue.
> 
> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
> Cc: Stefan Raspl <raspl@linux.vnet.ibm.com>
> Cc: Jens Axboe <jens.axboe@oracle.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> 
> ---
>  drivers/md/dm-table.c |   24 ++++++++++++++++++++++--
>  1 files changed, 22 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6.27-rc8/drivers/md/dm-table.c
> ===================================================================
> --- linux-2.6.27-rc8.orig/drivers/md/dm-table.c	2008-10-09 14:53:35.000000000 +0100
> +++ linux-2.6.27-rc8/drivers/md/dm-table.c	2008-10-09 14:56:26.000000000 +0100
> @@ -478,6 +478,13 @@ void dm_set_device_limits(struct dm_targ
>  {
>  	struct request_queue *q = bdev_get_queue(bdev);
>  	struct io_restrictions *rs = &ti->limits;
> +	char b[BDEVNAME_SIZE];
> +
> +	if (unlikely(!q)) {
> +		DMWARN("%s: Cannot set limits for nonexistent device %s",
> +		       dm_device_name(ti->table->md), bdevname(bdev, b));
> +		return;
> +	}
> 
>  	/*
>  	 * Combine the device limits low.
> @@ -943,7 +950,14 @@ int dm_table_any_congested(struct dm_tab
> 
>  	list_for_each_entry(dd, devices, list) {
>  		struct request_queue *q = bdev_get_queue(dd->bdev);
> -		r |= bdi_congested(&q->backing_dev_info, bdi_bits);
> +		char b[BDEVNAME_SIZE];
> +
> +		if (likely(q))
> +			r |= bdi_congested(&q->backing_dev_info, bdi_bits);
> +		else
> +			DMWARN_LIMIT("%s: any_congested: nonexistent device %s",
> +				     dm_device_name(t->md),
> +				     bdevname(dd->bdev, b));
>  	}
> 
>  	return r;
> @@ -956,8 +970,14 @@ void dm_table_unplug_all(struct dm_table
> 
>  	list_for_each_entry(dd, devices, list) {
>  		struct request_queue *q = bdev_get_queue(dd->bdev);
> +		char b[BDEVNAME_SIZE];
> 
> -		blk_unplug(q);
> +		if (likely(q))
> +			blk_unplug(q);
> +		else
> +			DMWARN_LIMIT("%s: Cannot unplug nonexistent device %s",
> +				     dm_device_name(t->md),
> +				     bdevname(dd->bdev, b));
>  	}
>  }
> 

Looks good to me!

Ciao,
Stefan


-- 
Linux on System z
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter
Geschäftsführer: Erich Baier
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2008-10-10  7:23 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-08 15:40 [PATCH] Fix Null pointer Exception Stefan Raspl
2008-09-09  0:21 ` Andrew Morton
2008-09-23 16:56   ` Alasdair G Kergon
2008-09-24 13:17     ` Stefan Raspl
2008-10-07 21:18       ` Alasdair G Kergon
2008-10-09 15:17         ` Alasdair G Kergon
2008-10-10  7:23           ` Stefan Raspl

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.