* [Bug 12020] New: scsi_times_out NULL pointer dereference
@ 2008-11-13 18:30 bugme-daemon
  2008-11-13 18:40 ` [Bug 12020] " bugme-daemon
                   ` (12 more replies)
  0 siblings, 13 replies; 27+ messages in thread
From: bugme-daemon @ 2008-11-13 18:30 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12020

           Summary: scsi_times_out NULL pointer dereference
           Product: SCSI Drivers
           Version: 2.5
     KernelVersion: 2.6.28-git20081113
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
        AssignedTo: scsi_drivers-other@kernel-bugs.osdl.org
        ReportedBy: bs@q-leap.de


Latest working kernel version: 2.6.27
Earliest failing kernel version: 2.6.28-rc4
Hardware Environment: Infortrend G2430 connected to LSI22320R
Problem Description:

Hello,

First, in 2.6.28-rc{1,2,3} the error handler was entirely broken: it
deadlocked. In rc4 this is fixed, but I have now twice hit a NULL pointer
dereference while running error handler tests. Both problems look like they
are due to the SCSI timeout commits.

Steps to reproduce: for example, reset devices connected to LSI 53C1030 devices
using lsiutil. The oops can be reproduced on about 20% of eh (error handler)
activations.

(gdb) l *(scsi_times_out+0x15)
0xffffffff80460f1e is in scsi_times_out (drivers/scsi/scsi_error.c:176).
171             enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
172             enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
173
174             scsi_log_completion(scmd, TIMEOUT_ERROR);
175
176             if (scmd->device->host->transportt->eh_timed_out)
177                     eh_timed_out = scmd->device->host->transportt->eh_timed_out;
178             else if (scmd->device->host->hostt->eh_timed_out)
179                     eh_timed_out = scmd->device->host->hostt->eh_timed_out;
180             else

[  143.804672] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[  143.808507] IP: [<ffffffff80460f1e>] scsi_times_out+0x15/0x71
[  143.816020] PGD f9381067 PUD f9360067 PMD 0
[  143.824018] Oops: 0000 [#1] SMP
[  143.824018] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
[  143.832016] Dumping ftrace buffer:
[  143.832016]    (ftrace buffer empty)
[  143.832016] CPU 1
[  143.832016] Modules linked in: mptctl ib_ipoib inet_lro ib_umad rdma_ucm rdma_cm ib_cm iw_cm ib_sa ib_addr ib_uvee
[  143.832016] Pid: 246, comm: pdflush Not tainted 2.6.28-rc4-bs1 #10
[  143.832016] RIP: 0010:[<ffffffff80460f1e>]  [<ffffffff80460f1e>] scsi_times_out+0x15/0x71
[  143.832016] RSP: 0018:ffff88007f6a3df0  EFLAGS: 00010086
[  143.832016] RAX: ffff88007ebf5330 RBX: 0000000000000000 RCX: ffff8800f93804b8
[  143.832016] RDX: ffff88007ebf5948 RSI: 0000000000000246 RDI: ffff8800f9380378
[  143.832016] RBP: ffff88007f6a3e00 R08: 0000000000000000 R09: 0000000000000000
[  143.832016] R10: ffff8800f9144680 R11: ffff88007eeac240 R12: ffff88007ebf5330
[  143.832016] R13: ffff88007ebf5808 R14: ffffffff80380461 R15: 0000000000000000
[  143.832016] FS:  0000000000733860(0000) GS:ffff8800fb29ab40(0000) knlGS:0000000000000000
[  143.832016] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[  143.832016] CR2: 0000000000000000 CR3: 00000000e80ec000 CR4: 00000000000006e0
[  143.832016] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  143.832016] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  143.832016] Process pdflush (pid: 246, threadinfo ffff88007ed12000, task ffff88007ed11890)
[  143.832016] Stack:
[  143.832016]  ffff88007f6a3e00 ffff8800f9380378 ffff88007f6a3e20 ffffffff80380426
[  143.832016]  ffff88007ebf5330 ffff8800f9380378 ffff88007f6a3e70 ffffffff803804f9
[  143.832016]  ffff88007eea0000 ffff88007ebf5668 0000000000000246 ffff88007ebf5330
[  143.832016] Call Trace:
[  143.832016]  <IRQ> <0> [<ffffffff80380426>] blk_rq_timed_out+0x1b/0x56
[  143.832016]  [<ffffffff803804f9>] blk_rq_timed_out_timer+0x98/0x118
[  143.832016]  [<ffffffff80380461>] ? blk_rq_timed_out_timer+0x0/0x118
[  143.832016]  [<ffffffff802464e2>] run_timer_softirq+0x14c/0x1cc
[  143.832016]  [<ffffffff80242392>] __do_softirq+0x83/0x128
[  143.832016]  [<ffffffff8020d03c>] call_softirq+0x1c/0x28
[  143.832016]  [<ffffffff8020ea39>] do_softirq+0x49/0x90
[  143.832016]  [<ffffffff802422aa>] irq_exit+0x44/0x46
[  143.832016]  [<ffffffff8020e88b>] do_IRQ+0xba/0xcf


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Bug 12020] scsi_times_out NULL pointer dereference
  2008-11-13 18:30 [Bug 12020] New: scsi_times_out NULL pointer dereference bugme-daemon
@ 2008-11-13 18:40 ` bugme-daemon
  2008-11-13 19:03 ` [Bug 12020] New: " James Bottomley
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: bugme-daemon @ 2008-11-13 18:40 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12020


akpm@osdl.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Regression|0                           |1





^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Bug 12020] New: scsi_times_out NULL pointer dereference
  2008-11-13 18:30 [Bug 12020] New: scsi_times_out NULL pointer dereference bugme-daemon
  2008-11-13 18:40 ` [Bug 12020] " bugme-daemon
@ 2008-11-13 19:03 ` James Bottomley
  2008-11-13 22:46   ` James Bottomley
  2008-11-13 19:03 ` [Bug 12020] " bugme-daemon
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: James Bottomley @ 2008-11-13 19:03 UTC (permalink / raw)
  To: bugme-daemon; +Cc: linux-scsi, Jens Axboe

On Thu, 2008-11-13 at 10:30 -0800, bugme-daemon@bugzilla.kernel.org
wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=12020
> 
>            Summary: scsi_times_out NULL pointer dereference
>            Product: SCSI Drivers
>            Version: 2.5
>      KernelVersion: 2.6.28-git20081113
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: scsi_drivers-other@kernel-bugs.osdl.org
>         ReportedBy: bs@q-leap.de
> 
> 
> Latest working kernel version: 2.6.27
> Earliest failing kernel version: 2.6.28-rc4
> Hardware Environment: Infortrend G2430 connected to LSI22320R
> Problem Description:
> 
> Hello,
> 
> First, in 2.6.28-rc{1,2,3} the error handler was entirely broken: it
> deadlocked. In rc4 this is fixed, but I have now twice hit a NULL pointer
> dereference while running error handler tests. Both problems look like they
> are due to the SCSI timeout commits.
> 
> Steps to reproduce: for example, reset devices connected to LSI 53C1030 devices
> using lsiutil. The oops can be reproduced on about 20% of eh (error handler)
> activations.
> 
> (gdb) l *(scsi_times_out+0x15)
> 0xffffffff80460f1e is in scsi_times_out (drivers/scsi/scsi_error.c:176).
> 171             enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
> 172             enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
> 173
> 174             scsi_log_completion(scmd, TIMEOUT_ERROR);
> 175
> 176             if (scmd->device->host->transportt->eh_timed_out)
> 177                     eh_timed_out = scmd->device->host->transportt->eh_timed_out;
> 178             else if (scmd->device->host->hostt->eh_timed_out)
> 179                     eh_timed_out = scmd->device->host->hostt->eh_timed_out;
> 180             else

Actually, I think the trace is slightly off.  I suspect this is the
problem:

	struct scsi_cmnd *scmd = req->special;

I bet req->special is NULL because the command timed out even before it
was prepared by the subsystem.

Does this fix it?

The fix is more of a bandaid than anything ... we can't really have
commands timing out in the mid-layer because we expect we have full
control of them.  With this patch, if we run out of resets, block will
complete a command we're still processing.

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 94ed262..5612c42 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -127,6 +127,13 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 	enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
 	enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
 
+	if (!scmd)
+		/*
+		 * nasty: command timed out before the mid layer
+		 * even prepared it
+		 */
+		return BLK_EH_RESET_TIMER;
+
 	scsi_log_completion(scmd, TIMEOUT_ERROR);
 
 	if (scmd->device->host->transportt->eh_timed_out)
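
The guard added by the patch above can be exercised outside the kernel. The following is a minimal user-space sketch, assuming simplified stand-in types: `times_out_sketch`, the `tag` field, and both struct layouts are hypothetical and are not the kernel's definitions. It only shows that an unprepared request (NULL `special`) is turned into a timer reset instead of being dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel types; only what the sketch needs is modelled. */
enum blk_eh_timer_return {
	BLK_EH_NOT_HANDLED,
	BLK_EH_HANDLED,
	BLK_EH_RESET_TIMER,
};

struct scsi_cmnd {
	int tag;	/* hypothetical payload, not the real field layout */
};

struct request {
	void *special;	/* set to the scsi_cmnd once the request is prepared */
};

/* Simplified timeout handler: bail out before touching a missing command. */
enum blk_eh_timer_return times_out_sketch(struct request *req)
{
	struct scsi_cmnd *scmd = req->special;

	if (!scmd)
		return BLK_EH_RESET_TIMER;	/* nothing to dereference yet */

	/* The real function logs and consults transport/host hooks here. */
	return BLK_EH_NOT_HANDLED;
}
```

Calling `times_out_sketch()` on a request whose `special` is NULL returns `BLK_EH_RESET_TIMER`; a prepared request takes the normal path. As noted in the message, this only hides the symptom: the timer keeps being re-armed for a command the mid-layer still owns.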



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Bug 12020] scsi_times_out NULL pointer dereference
  2008-11-13 18:30 [Bug 12020] New: scsi_times_out NULL pointer dereference bugme-daemon
  2008-11-13 18:40 ` [Bug 12020] " bugme-daemon
  2008-11-13 19:03 ` [Bug 12020] New: " James Bottomley
@ 2008-11-13 19:03 ` bugme-daemon
  2008-11-13 20:12 ` bugme-daemon
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: bugme-daemon @ 2008-11-13 19:03 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12020





------- Comment #1 from anonymous@kernel-bugs.osdl.org  2008-11-13 11:03 -------
Reply-To: James.Bottomley@HansenPartnership.com

On Thu, 2008-11-13 at 10:30 -0800, bugme-daemon@bugzilla.kernel.org
wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=12020
> 
>            Summary: scsi_times_out NULL pointer dereference
>            Product: SCSI Drivers
>            Version: 2.5
>      KernelVersion: 2.6.28-git20081113
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: scsi_drivers-other@kernel-bugs.osdl.org
>         ReportedBy: bs@q-leap.de
> 
> 
> Latest working kernel version: 2.6.27
> Earliest failing kernel version: 2.6.28-rc4
> Hardware Environment: Infortrend G2430 connected to LSI22320R
> Problem Description:
> 
> Hello,
> 
> First, in 2.6.28-rc{1,2,3} the error handler was entirely broken: it
> deadlocked. In rc4 this is fixed, but I have now twice hit a NULL pointer
> dereference while running error handler tests. Both problems look like they
> are due to the SCSI timeout commits.
> 
> Steps to reproduce: for example, reset devices connected to LSI 53C1030 devices
> using lsiutil. The oops can be reproduced on about 20% of eh (error handler)
> activations.
> 
> (gdb) l *(scsi_times_out+0x15)
> 0xffffffff80460f1e is in scsi_times_out (drivers/scsi/scsi_error.c:176).
> 171             enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
> 172             enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
> 173
> 174             scsi_log_completion(scmd, TIMEOUT_ERROR);
> 175
> 176             if (scmd->device->host->transportt->eh_timed_out)
> 177                     eh_timed_out = scmd->device->host->transportt->eh_timed_out;
> 178             else if (scmd->device->host->hostt->eh_timed_out)
> 179                     eh_timed_out = scmd->device->host->hostt->eh_timed_out;
> 180             else

Actually, I think the trace is slightly off.  I suspect this is the
problem:

        struct scsi_cmnd *scmd = req->special;

I bet req->special is NULL because the command timed out even before it
was prepared by the subsystem.

Does this fix it?

The fix is more of a bandaid than anything ... we can't really have
commands timing out in the mid-layer because we expect we have full
control of them.  With this patch, if we run out of resets, block will
complete a command we're still processing.

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 94ed262..5612c42 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -127,6 +127,13 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
        enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
        enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;

+       if (!scmd)
+               /*
+                * nasty: command timed out before the mid layer
+                * even prepared it
+                */
+               return BLK_EH_RESET_TIMER;
+
        scsi_log_completion(scmd, TIMEOUT_ERROR);

        if (scmd->device->host->transportt->eh_timed_out)



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Bug 12020] scsi_times_out NULL pointer dereference
  2008-11-13 18:30 [Bug 12020] New: scsi_times_out NULL pointer dereference bugme-daemon
                   ` (2 preceding siblings ...)
  2008-11-13 19:03 ` [Bug 12020] " bugme-daemon
@ 2008-11-13 20:12 ` bugme-daemon
  2008-11-13 20:22   ` James Bottomley
  2008-11-13 20:23 ` bugme-daemon
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: bugme-daemon @ 2008-11-13 20:12 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12020





------- Comment #2 from bs@q-leap.de  2008-11-13 12:12 -------
Thanks, I am going to test it now.

While we are at this function, could you please check 

        if (eh_timed_out)
                rtn = eh_timed_out(scmd);
                switch (rtn) {
                case BLK_EH_NOT_HANDLED:
                        break;
                default:
                        return rtn;
                }


Is the indentation wrong or are there missing if-braces?


Thanks,
Bernd



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Bug 12020] scsi_times_out NULL pointer dereference
  2008-11-13 20:12 ` bugme-daemon
@ 2008-11-13 20:22   ` James Bottomley
  0 siblings, 0 replies; 27+ messages in thread
From: James Bottomley @ 2008-11-13 20:22 UTC (permalink / raw)
  To: bugme-daemon; +Cc: linux-scsi

On Thu, 2008-11-13 at 12:12 -0800, bugme-daemon@bugzilla.kernel.org
wrote:
> While we are at this function, could you please check 
> 
>         if (eh_timed_out)
>                 rtn = eh_timed_out(scmd);
>                 switch (rtn) {
>                 case BLK_EH_NOT_HANDLED:
>                         break;
>                 default:
>                         return rtn;
>                 }
> 
> 
> Is the indentation wrong or are there missing if-braces?

It's not as intended, but harmless:

http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff;h=6ec39f02cf48df89c3cbab4aeef521569fec00e4
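
Why the missing braces are harmless can be checked with a small user-space sketch. It assumes, as the gdb listing earlier in the thread shows, that `rtn` starts out as `BLK_EH_NOT_HANDLED`. The hook type `eh_hook_t`, the three hook functions, and the `FALL_THROUGH` sentinel are hypothetical names introduced for illustration; the real hook takes a `struct scsi_cmnd *`.

```c
#include <assert.h>
#include <stddef.h>

enum blk_eh_timer_return {
	BLK_EH_NOT_HANDLED,
	BLK_EH_HANDLED,
	BLK_EH_RESET_TIMER,
};

/* Hypothetical hook type; the argument is dropped to keep the sketch small. */
typedef enum blk_eh_timer_return (*eh_hook_t)(void);

#define FALL_THROUGH (-1)	/* stands for "continue to the error handler" */

enum blk_eh_timer_return hook_not_handled(void) { return BLK_EH_NOT_HANDLED; }
enum blk_eh_timer_return hook_handled(void) { return BLK_EH_HANDLED; }
enum blk_eh_timer_return hook_reset(void) { return BLK_EH_RESET_TIMER; }

/* As compiled: the if governs only the assignment, the switch always runs. */
int unbraced(eh_hook_t eh_timed_out)
{
	enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;

	if (eh_timed_out)
		rtn = eh_timed_out();
	switch (rtn) {
	case BLK_EH_NOT_HANDLED:
		break;
	default:
		return rtn;
	}
	return FALL_THROUGH;
}

/* As the indentation suggests: the switch runs only when a hook exists. */
int braced(eh_hook_t eh_timed_out)
{
	enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;

	if (eh_timed_out) {
		rtn = eh_timed_out();
		switch (rtn) {
		case BLK_EH_NOT_HANDLED:
			break;
		default:
			return rtn;
		}
	}
	return FALL_THROUGH;
}
```

With no hook, the unbraced version still enters the switch, but `rtn` is still `BLK_EH_NOT_HANDLED`, so it breaks out and reaches the same fall-through as the braced version. The two forms agree for every hook result, which is why the layout is misleading rather than a functional bug.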

James



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Bug 12020] scsi_times_out NULL pointer dereference
  2008-11-13 18:30 [Bug 12020] New: scsi_times_out NULL pointer dereference bugme-daemon
                   ` (3 preceding siblings ...)
  2008-11-13 20:12 ` bugme-daemon
@ 2008-11-13 20:23 ` bugme-daemon
  2008-11-13 21:36 ` bugme-daemon
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: bugme-daemon @ 2008-11-13 20:23 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12020





------- Comment #3 from anonymous@kernel-bugs.osdl.org  2008-11-13 12:23 -------
Reply-To: James.Bottomley@HansenPartnership.com

On Thu, 2008-11-13 at 12:12 -0800, bugme-daemon@bugzilla.kernel.org
wrote:
> While we are at this function, could you please check 
> 
>         if (eh_timed_out)
>                 rtn = eh_timed_out(scmd);
>                 switch (rtn) {
>                 case BLK_EH_NOT_HANDLED:
>                         break;
>                 default:
>                         return rtn;
>                 }
> 
> 
> Is the indentation wrong or are there missing if-braces?

It's not as intended, but harmless:

http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff;h=6ec39f02cf48df89c3cbab4aeef521569fec00e4

James



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Bug 12020] scsi_times_out NULL pointer dereference
  2008-11-13 18:30 [Bug 12020] New: scsi_times_out NULL pointer dereference bugme-daemon
                   ` (4 preceding siblings ...)
  2008-11-13 20:23 ` bugme-daemon
@ 2008-11-13 21:36 ` bugme-daemon
  2008-11-13 22:47 ` bugme-daemon
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: bugme-daemon @ 2008-11-13 21:36 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12020


rjw@sisk.pl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rjw@sisk.pl
OtherBugsDependingO|                            |11808
              nThis|                            |





^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Bug 12020] New: scsi_times_out NULL pointer dereference
  2008-11-13 19:03 ` [Bug 12020] New: " James Bottomley
@ 2008-11-13 22:46   ` James Bottomley
  0 siblings, 0 replies; 27+ messages in thread
From: James Bottomley @ 2008-11-13 22:46 UTC (permalink / raw)
  To: bugme-daemon; +Cc: linux-scsi, Jens Axboe

On Thu, 2008-11-13 at 13:03 -0600, James Bottomley wrote:
> Actually, I think the trace is slightly off.  I suspect this is the
> problem:
> 
> 	struct scsi_cmnd *scmd = req->special;
> 
> I bet req->special is NULL because the command timed out even before it
> was prepared by the subsystem.
> 
> Does this fix it?
> 
> The fix is more of a bandaid than anything ... we can't really have
> commands timing out in the mid-layer because we expect we have full
> control of them.  With this patch, if we run out of resets, block will
> complete a command we're still processing.
> 
> James
> 
> ---
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 94ed262..5612c42 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -127,6 +127,13 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
>  	enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
>  	enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
>  
> +	if (!scmd)
> +		/*
> +		 * nasty: command timed out before the mid layer
> +		 * even prepared it
> +		 */
> +		return BLK_EH_RESET_TIMER;
> +
>  	scsi_log_completion(scmd, TIMEOUT_ERROR);
>  
>  	if (scmd->device->host->transportt->eh_timed_out)

Mike Anderson pointed out that we have a potential window where the
timer can fire after we've unprepped the request in SCSI (so making
req->special NULL) but before we call blk_requeue_request() which stops
the timer.  We can rejig the locking to prevent this from happening, so
could you (separately) try this patch?

James

---

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f5d3b96..3475b74 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -649,8 +643,8 @@ static void scsi_requeue_command(struct request_queue *q, struct scsi_cmnd *cmd)
 	struct request *req = cmd->request;
 	unsigned long flags;
 
-	scsi_unprep_request(req);
 	spin_lock_irqsave(q->queue_lock, flags);
+	scsi_unprep_request(req);
 	blk_requeue_request(q, req);
 	spin_unlock_irqrestore(q->queue_lock, flags);
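
The window described above can be walked through with a deterministic user-space model. This is a sketch under stated assumptions, not kernel code: it assumes the block timeout handler must take the queue lock before it inspects a request, and it reduces the request to three fields. `struct request_model`, `timer_would_oops`, `requeue_old_order`, and `requeue_new_order` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct request_model {
	void *special;		/* non-NULL while the request is prepared */
	bool timer_armed;	/* true while the request is on the timeout list */
	bool queue_locked;	/* true while the requeue path holds the queue lock */
};

/* Would a timer firing right now hand a NULL command to the SCSI handler? */
static bool timer_would_oops(const struct request_model *rq)
{
	if (rq->queue_locked)
		return false;	/* assumed: the handler waits for the lock */
	return rq->timer_armed && rq->special == NULL;
}

/* Original ordering: unprep first, then take the lock and requeue. */
bool requeue_old_order(struct request_model *rq)
{
	bool oops = false;

	rq->special = NULL;		/* scsi_unprep_request(), lock not held */
	oops |= timer_would_oops(rq);	/* the window: armed and unprepared */
	rq->queue_locked = true;	/* spin_lock_irqsave() */
	rq->timer_armed = false;	/* blk_requeue_request() stops the timer */
	oops |= timer_would_oops(rq);
	rq->queue_locked = false;	/* spin_unlock_irqrestore() */
	oops |= timer_would_oops(rq);
	return oops;
}

/* Patched ordering: unprep only after the lock is held. */
bool requeue_new_order(struct request_model *rq)
{
	bool oops = false;

	rq->queue_locked = true;	/* spin_lock_irqsave() */
	rq->special = NULL;		/* scsi_unprep_request(), lock held */
	oops |= timer_would_oops(rq);
	rq->timer_armed = false;	/* blk_requeue_request() stops the timer */
	oops |= timer_would_oops(rq);
	rq->queue_locked = false;	/* spin_unlock_irqrestore() */
	oops |= timer_would_oops(rq);
	return oops;
}
```

Starting from a prepared request with an armed timer, the old ordering exposes one point where the timer can see `special == NULL`; the patched ordering exposes none, because the request is already off the timeout list by the time the lock is dropped.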
 



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Bug 12020] scsi_times_out NULL pointer dereference
  2008-11-13 18:30 [Bug 12020] New: scsi_times_out NULL pointer dereference bugme-daemon
                   ` (5 preceding siblings ...)
  2008-11-13 21:36 ` bugme-daemon
@ 2008-11-13 22:47 ` bugme-daemon
  2008-11-16 17:50 ` bugme-daemon
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: bugme-daemon @ 2008-11-13 22:47 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12020





------- Comment #4 from anonymous@kernel-bugs.osdl.org  2008-11-13 14:47 -------
Reply-To: James.Bottomley@HansenPartnership.com

On Thu, 2008-11-13 at 13:03 -0600, James Bottomley wrote:
> Actually, I think the trace is slightly off.  I suspect this is the
> problem:
> 
> 	struct scsi_cmnd *scmd = req->special;
> 
> I bet req->special is NULL because the command timed out even before it
> was prepared by the subsystem.
> 
> Does this fix it?
> 
> The fix is more of a bandaid than anything ... we can't really have
> commands timing out in the mid-layer because we expect we have full
> control of them.  With this patch, if we run out of resets, block will
> complete a command we're still processing.
> 
> James
> 
> ---
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 94ed262..5612c42 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -127,6 +127,13 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
>  	enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
>  	enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
>  
> +	if (!scmd)
> +		/*
> +		 * nasty: command timed out before the mid layer
> +		 * even prepared it
> +		 */
> +		return BLK_EH_RESET_TIMER;
> +
>  	scsi_log_completion(scmd, TIMEOUT_ERROR);
>  
>  	if (scmd->device->host->transportt->eh_timed_out)

Mike Anderson pointed out that we have a potential window where the
timer can fire after we've unprepped the request in SCSI (so making
req->special NULL) but before we call blk_requeue_request() which stops
the timer.  We can rejig the locking to prevent this from happening, so
could you (separately) try this patch?

James

---

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f5d3b96..3475b74 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -649,8 +643,8 @@ static void scsi_requeue_command(struct request_queue *q, struct scsi_cmnd *cmd)
        struct request *req = cmd->request;
        unsigned long flags;

-       scsi_unprep_request(req);
        spin_lock_irqsave(q->queue_lock, flags);
+       scsi_unprep_request(req);
        blk_requeue_request(q, req);
        spin_unlock_irqrestore(q->queue_lock, flags);




^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Bug #12020] scsi_times_out NULL pointer dereference
  2008-11-16 16:24 2.6.28-rc5: Reported regressions from 2.6.27 Rafael J. Wysocki
@ 2008-11-16 16:35   ` Rafael J. Wysocki
  0 siblings, 0 replies; 27+ messages in thread
From: Rafael J. Wysocki @ 2008-11-16 16:35 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Bernd Schubert, James Bottomley

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=12020
Subject		: scsi_times_out NULL pointer dereference
Submitter	: Bernd Schubert <bs-PKu+Ek1N2UGzQB+pC5nmwQ@public.gmane.org>
Date		: 2008-11-13 10:30 (4 days old)
Handled-By	: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
Patch		: http://bugzilla.kernel.org/show_bug.cgi?id=12020#c4


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Bug 12020] scsi_times_out NULL pointer dereference
  2008-11-13 18:30 [Bug 12020] New: scsi_times_out NULL pointer dereference bugme-daemon
                   ` (6 preceding siblings ...)
  2008-11-13 22:47 ` bugme-daemon
@ 2008-11-16 17:50 ` bugme-daemon
  2008-11-20 15:12 ` bugme-daemon
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: bugme-daemon @ 2008-11-16 17:50 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12020





------- Comment #5 from rjw@sisk.pl  2008-11-16 09:50 -------
Patch : http://bugzilla.kernel.org/show_bug.cgi?id=12020#c4
Handled-By : James Bottomley <James.Bottomley@HansenPartnership.com>



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Bug 12020] scsi_times_out NULL pointer dereference
  2008-11-13 18:30 [Bug 12020] New: scsi_times_out NULL pointer dereference bugme-daemon
                   ` (7 preceding siblings ...)
  2008-11-16 17:50 ` bugme-daemon
@ 2008-11-20 15:12 ` bugme-daemon
  2008-11-20 19:36   ` Mike Anderson
  2008-11-20 19:36 ` bugme-daemon
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: bugme-daemon @ 2008-11-20 15:12 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12020





------- Comment #6 from git.user@gmail.com  2008-11-20 07:12 -------
looks very similar

[  316.336654] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[  316.339972] IP: [<ffffffff803f84d3>] scsi_times_out+0x10/0x72
[  316.339972] PGD 3e627067 PUD 3de0b067 PMD 0
[  316.339972] Oops: 0000 [#1] PREEMPT SMP
[  316.339972] last sysfs file: /sys/devices/virtual/block/md0/md/metadata_version
[  316.339972] Dumping ftrace buffer:
[  316.339972]    (ftrace buffer empty)
[  316.339972] CPU 1
[  316.339972] Modules linked in: floppy sg
[  316.339972] Pid: 0, comm: swapper Not tainted 2.6.28-rc5 #1
[  316.339972] RIP: 0010:[<ffffffff803f84d3>]  [<ffffffff803f84d3>] scsi_times_out+0x10/0x72
[  316.339972] RSP: 0018:ffff88003fb53e20  EFLAGS: 00010082
[  316.339972] RAX: ffff88003ef60000 RBX: 0000000000000000 RCX: ffff88003ef60308
[  316.339972] RDX: ffff88003ef60308 RSI: 0000000000006cb2 RDI: ffff880033dae5c0
[  316.339972] RBP: ffff88003fb53e30 R08: ffff880001019180 R09: 0000000000000010
[  316.339972] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88003ef601c8
[  316.339972] R13: ffff88003ef60308 R14: 0000000000000102 R15: 0000000000000000
[  316.339972] FS:  0000000000000000(0000) GS:ffff88003fb23b00(0000) knlGS:0000000000000000
[  316.339972] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[  316.339972] CR2: 0000000000000000 CR3: 000000003e7e8000 CR4: 00000000000006e0
[  316.339972] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  316.339972] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  316.339972] Process swapper (pid: 0, threadinfo ffff88003fb4e000, task ffff88003f863500)
[  316.339972] Stack:
[  316.339972]  ffff88000101f900 ffff880033dae5c0 ffff88003fb53e50 ffffffff803544ca
[  316.339972]  ffff88003ef60000 ffff880033dae5c0 ffff88003fb53ea0 ffffffff803545d8
[  316.339972]  ffff88003fb53ea0 ffff88003ef60000 0000000000000286 ffff88003ef60000
[  316.339972] Call Trace:
[  316.339972]  <IRQ> <0> [<ffffffff803544ca>] blk_rq_timed_out+0x16/0x5c
[  316.339972]  [<ffffffff803545d8>] blk_rq_timed_out_timer+0xc8/0x138
[  316.339972]  [<ffffffff80354510>] ? blk_rq_timed_out_timer+0x0/0x138
[  316.339972]  [<ffffffff8023f8b7>] run_timer_softirq+0x183/0x1ec
[  316.339972]  [<ffffffff80254c0c>] ? tick_dev_program_event+0x6c/0xa4
[  316.339972]  [<ffffffff8023b326>] __do_softirq+0x72/0x128
[  316.339972]  [<ffffffff8020c8cc>] call_softirq+0x1c/0x30
[  316.339972]  [<ffffffff8020df2d>] do_softirq+0x3d/0x78
[  316.339972]  [<ffffffff8023b249>] irq_exit+0x8f/0x98
[  316.339972]  [<ffffffff8021d8e4>] smp_apic_timer_interrupt+0x8a/0xd6
[  316.339972]  [<ffffffff8020c31b>] apic_timer_interrupt+0x6b/0x70
[  316.339972]  <EOI> <0> [<ffffffff80212f22>] ? mwait_idle+0x45/0x4a
[  316.339972]  [<ffffffff80209deb>] ? enter_idle+0x22/0x24
[  316.339972]  [<ffffffff8020a386>] ? cpu_idle+0x41/0x80
[  316.339972] Code: cb ff ff 85 c0 74 a0 45 31 e4 eb d2 45 31 e4 44 89 e0 5b 41 5c 41 5d 41 5e 5d c3 55 48 89 e5 53 48 83 ec 08 48 8b 9f e0 00 00 00 <48> 8b 03 48 8b 10 48 8b 82 b8 00 00 00 48 8b 80 60 01 00 00 48
[  316.339972] RIP  [<ffffffff803f84d3>] scsi_times_out+0x10/0x72
[  316.339972]  RSP <ffff88003fb53e20>
[  316.339972] CR2: 0000000000000000
[  316.339972] Kernel panic - not syncing: Fatal exception in interrupt

In my case it is easily triggered by disk activity (e.g. rsync)
while a RAID array is rebuilding.

On Thu, 2008-11-13 at 13:03 -0600, James Bottomley wrote:
> Actually, I think the trace is slightly off.  I suspect this is the
> problem:
> 
>       struct scsi_cmnd *scmd = req->special;
> 
> I bet req->special is NULL because the command timed out even before it
> was prepared by the subsystem.
> 
> Does this fix it?

In my case it doesn't fix the problem, but it works as a proof of concept.
With your patch (I only added a printk of the comment) the
system remains locked up, printing from time to time:
"nasty: command timed out before the mid layer
even prepared it"

> The fix is more of a bandaid than anything ... we can't really have
> commands timing out in the mid-layer because we expect we have full
> control of them.  With this patch, if we run out of resets, block will
> complete a command we're still processing.

here is a dmesg:
http://sysadminday.org.ru/2.6.28-rc5-git3/scsi_times_out-NULL_pointer_dereference_dmesg



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Bug 12020] scsi_times_out NULL pointer dereference
  2008-11-20 15:12 ` bugme-daemon
@ 2008-11-20 19:36   ` Mike Anderson
  0 siblings, 0 replies; 27+ messages in thread
From: Mike Anderson @ 2008-11-20 19:36 UTC (permalink / raw)
  To: bugme-daemon; +Cc: linux-scsi, Jens Axboe, James Bottomley, Tejun Heo

I have two systems that are hitting similar signatures in scsi_times_out.

Note that my testing is using a distro kernel, but in this area the code
is very similar. I will work to get a reproduction on mainline.

..but..

I added some debug to scsi_times_out and noticed that the request with no
scmd set in req->special also did not have REQ_STARTED set.

I added a WARN_ON check to blk_add_timer for any request that we were
starting a timer for that did not have REQ_STARTED set. The resulting
trace is shown below. This does not look good, as elv_dequeue_request is
being called from elv_next_request in some cases.

Call Trace:
[c00000007b747580] [c00000000027808c] .blk_add_timer+0x74/0x134 (unreliable)
[c00000007b747610] [c00000000026f9b8] .elv_dequeue_request+0x78/0x8c
[c00000007b747680] [c000000000275830] .blk_do_ordered+0x8c/0x31c
[c00000007b747720] [c00000000026fc18] .elv_next_request+0x24c/0x2d4
[c00000007b7477c0] [d000000000368004] .scsi_request_fn+0xc8/0x628 [scsi_mod]
[c00000007b7478a0] [c00000000026fdf4] .elv_insert+0x154/0x38c
[c00000007b747940] [c000000000273ad0] .__make_request+0x4e4/0x568
[c00000007b7479f0] [c000000000271a68] .generic_make_request+0x3f4/0x468
[c00000007b747af0] [c000000000271bd8] .submit_bio+0xfc/0x124
[c00000007b747bb0] [c000000000160a00] .submit_bh+0x14c/0x198
[c00000007b747c40] [c0000000001630a0] .sync_dirty_buffer+0xbc/0x15c
[c00000007b747cd0] [c0000000001fcac0] .journal_commit_transaction+0x1014/0x158c
[c00000007b747e10] [c00000000020111c] .kjournald+0x104/0x2f4
[c00000007b747f00] [c0000000000a909c] .kthread+0x78/0xc4
[c00000007b747f90] [c00000000002ae2c] .kernel_thread+0x4c/0x68

I changed the previously mentioned WARN_ON to simply return if the request
does not have REQ_STARTED set. This stopped the oops in scsi_times_out,
but it is just a hack.
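
The hack described above can be sketched as a small userspace model in
C. This is only an illustration, not the change that was tested: the
names request_model, REQ_STARTED_MODEL and add_timer_model are
hypothetical, and the real blk_add_timer does considerably more than
set a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's REQ_STARTED request flag. */
#define REQ_STARTED_MODEL (1u << 0)

/* Hypothetical, heavily reduced request structure. */
struct request_model {
	unsigned int flags;
	bool timer_armed;
};

/*
 * Model of a timer-arming helper that bails out early for requests
 * that were dequeued without ever being started. Returns true if the
 * timer was armed.
 */
static bool add_timer_model(struct request_model *req)
{
	assert(req != NULL);

	/* Not started: there is no command to time out, so do nothing. */
	if (!(req->flags & REQ_STARTED_MODEL))
		return false;

	req->timer_armed = true;
	return true;
}
```

A request that reaches the helper through a dequeue path without being
started then never lands on the timeout list, so the timeout handler
never sees it.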

I hope this analysis is not flawed because of kernel deltas. It may not
address the specific issue seen in this bug, but it does appear to
indicate a possible path for a request to get onto the timeout list
without req->special set.

I think we may need to look at some of the paths that are calling
blkdev_dequeue_request and understand how to prevent blk_add_timer from
being called if we are not really starting a SCSI cmd.

-andmike
--
Michael Anderson
andmike@linux.vnet.ibm.com



* [Bug #12020] scsi_times_out NULL pointer dereference
  2008-11-22 20:24 2.6.28-rc6-git1: Reported regressions from 2.6.27 Rafael J. Wysocki
@ 2008-11-22 20:28   ` Rafael J. Wysocki
  0 siblings, 0 replies; 27+ messages in thread
From: Rafael J. Wysocki @ 2008-11-22 20:28 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Bernd Schubert, James Bottomley

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=12020
Subject		: scsi_times_out NULL pointer dereference
Submitter	: Bernd Schubert <bs@q-leap.de>
Date		: 2008-11-13 10:30 (10 days old)
Handled-By	: James Bottomley <James.Bottomley@HansenPartnership.com>
Patch		: http://bugzilla.kernel.org/show_bug.cgi?id=12020#c4




* [Bug 12020] scsi_times_out NULL pointer dereference
  2008-11-13 18:30 [Bug 12020] New: scsi_times_out NULL pointer dereference bugme-daemon
                   ` (9 preceding siblings ...)
  2008-11-20 19:36 ` bugme-daemon
@ 2008-12-03 10:19 ` bugme-daemon
  2008-12-07 20:21 ` bugme-daemon
  2008-12-07 20:21 ` bugme-daemon
  12 siblings, 0 replies; 27+ messages in thread
From: bugme-daemon @ 2008-12-03 10:19 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12020





------- Comment #8 from bs@q-leap.de  2008-12-03 02:19 -------
> Mike Anderson pointed out that we have a potential window where the
> timer can fire after we've unprepped the request in SCSI (so making
> req->special NULL) but before we call blk_requeue_request() which stops
> the timer.  We can rejig the locking to prevent this from happening, so
> could you (separately) try this patch?
> 
> James
> 
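
The window described in the quoted text can be modelled in userspace.
The sketch below is an assumption about the general shape of such a
fix, not the patch referred to above: request_model, requeue_model and
timer_fire_model are hypothetical names, and a pthread mutex stands in
for the queue lock. The point is that clearing the command pointer and
stopping the timer happen in one critical section, so the timer
callback can never see an armed timer with no command behind it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced request structure. */
struct request_model {
	pthread_mutex_t lock;   /* stand-in for the queue lock */
	void *special;          /* prepared command, or NULL */
	bool timer_armed;
};

/* Unprepare the request and stop its timer in one critical section. */
static void requeue_model(struct request_model *req)
{
	pthread_mutex_lock(&req->lock);
	req->special = NULL;        /* "unprep" the request */
	req->timer_armed = false;   /* stop the timer before unlocking */
	pthread_mutex_unlock(&req->lock);
}

/* Timer callback: returns true only if it handled a prepared command. */
static bool timer_fire_model(struct request_model *req)
{
	bool handled = false;

	pthread_mutex_lock(&req->lock);
	if (req->timer_armed) {
		/* Invariant kept by requeue_model(): armed implies prepared. */
		assert(req->special != NULL);
		req->timer_armed = false;
		handled = true;
	}
	pthread_mutex_unlock(&req->lock);
	return handled;
}
```

If the two stores in requeue_model() were made under separate lock
acquisitions, a callback running between them would find the timer
armed and the command pointer already NULL, which is the window
described in the quote.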

Hello James,

sorry for the huge delay. Unfortunately it turned out I was just 'lucky' to run
into this bug the first few times. When I later tried to reproduce this
specific issue, I tried 20 times and couldn't, even without any patch :( So
testing, especially of the 2nd patch, turns out to be a bit difficult (it can't
be verified by printk).
I now also mostly have only remote access to the test hardware and can't
reset this system remotely, so testing has become a bit difficult :(
Tomorrow I will be in the lab again and will try to test again.

Thanks for your help and patience,
Bernd



* [Bug #12020] scsi_times_out NULL pointer dereference
  2008-12-03 21:49 2.6.28-rc7-git2: Reported regressions from 2.6.27 Rafael J. Wysocki
@ 2008-12-03 21:57   ` Rafael J. Wysocki
  0 siblings, 0 replies; 27+ messages in thread
From: Rafael J. Wysocki @ 2008-12-03 21:57 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Bernd Schubert, James Bottomley

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=12020
Subject		: scsi_times_out NULL pointer dereference
Submitter	: Bernd Schubert <bs@q-leap.de>
Date		: 2008-11-13 10:30 (21 days old)
Handled-By	: James Bottomley <James.Bottomley@HansenPartnership.com>
Patch		: http://bugzilla.kernel.org/show_bug.cgi?id=12020#c4




* Re: [Bug #12020] scsi_times_out NULL pointer dereference
  2008-12-03 21:57   ` Rafael J. Wysocki
@ 2008-12-04  0:14     ` James Bottomley
  -1 siblings, 0 replies; 27+ messages in thread
From: James Bottomley @ 2008-12-04  0:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Bernd Schubert

On Wed, 2008-12-03 at 22:57 +0100, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.27.  Please verify if it still should be listed and let me know
> (either way).

That's a hard call.  We think this might be fixed by Tejun's block timer
patch, but the reporter has been unable to reproduce the problem (with
or without the timer patch).

Perhaps list as closed for now and reopen if we get another problem
report with kernels containing the block timer patch?

James




* [Bug 12020] scsi_times_out NULL pointer dereference
  2008-11-13 18:30 [Bug 12020] New: scsi_times_out NULL pointer dereference bugme-daemon
                   ` (10 preceding siblings ...)
  2008-12-03 10:19 ` bugme-daemon
@ 2008-12-07 20:21 ` bugme-daemon
  2008-12-07 20:21 ` bugme-daemon
  12 siblings, 0 replies; 27+ messages in thread
From: bugme-daemon @ 2008-12-07 20:21 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12020


rjw@sisk.pl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |REJECTED
         Resolution|                            |UNREPRODUCIBLE





* [Bug 12020] scsi_times_out NULL pointer dereference
  2008-11-13 18:30 [Bug 12020] New: scsi_times_out NULL pointer dereference bugme-daemon
                   ` (11 preceding siblings ...)
  2008-12-07 20:21 ` bugme-daemon
@ 2008-12-07 20:21 ` bugme-daemon
  12 siblings, 0 replies; 27+ messages in thread
From: bugme-daemon @ 2008-12-07 20:21 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=12020


rjw@sisk.pl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REJECTED                    |CLOSED





* Re: [Bug #12020] scsi_times_out NULL pointer dereference
  2008-12-04  0:14     ` James Bottomley
@ 2008-12-07 20:22         ` Rafael J. Wysocki
  -1 siblings, 0 replies; 27+ messages in thread
From: Rafael J. Wysocki @ 2008-12-07 20:22 UTC (permalink / raw)
  To: James Bottomley
  Cc: Linux Kernel Mailing List, Kernel Testers List, Bernd Schubert

On Thursday, 4 of December 2008, James Bottomley wrote:
> On Wed, 2008-12-03 at 22:57 +0100, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.27.  Please verify if it still should be listed and let me know
> > (either way).
> 
> That's a hard call.  We think this might be fixed by Tejun's block timer
> patch, but the reporter has been unable to reproduce the problem (with
> or without the timer patch).
> 
> Perhaps list as closed for now and reopen if we get another problem
> report with kernels containing the block timer patch?

I closed it as unreproducible on the basis of the last Bugzilla comment.

Thanks,
Rafael


