* [Bug 19642] New: 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
@ 2010-10-03 18:37 bugzilla-daemon
2010-10-03 18:37 ` [Bug 19642] " bugzilla-daemon
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: bugzilla-daemon @ 2010-10-03 18:37 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=19642
Summary: 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
Product: IO/Storage
Version: 2.5
Kernel Version: 2.6.36-rc6
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: SCSI
AssignedTo: linux-scsi@vger.kernel.org
ReportedBy: maciej.rutecki@gmail.com
CC: rjw@sisk.pl, maciej.rutecki@gmail.com,
linux@horizon.com
Regression: Yes
Subject : 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
Submitter : "George Spelvin" <linux@horizon.com>
Date : 2010-09-30 21:10
Message-ID : 20100930211006.27449.qmail@science.horizon.com
References : http://marc.info/?l=linux-kernel&m=128588102620299&w=2
This entry is being used for tracking a regression from 2.6.35. Please don't
close it until the problem is fixed in the mainline.
--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug 19642] 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
2010-10-03 18:37 [Bug 19642] New: 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113 bugzilla-daemon
@ 2010-10-03 18:37 ` bugzilla-daemon
2010-10-03 18:53 ` [Bug 19642] New: " James Bottomley
` (2 subsequent siblings)
3 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2010-10-03 18:37 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=19642
Maciej Rutecki <maciej.rutecki@gmail.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Blocks| |16444
--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Bug 19642] New: 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
2010-10-03 18:37 [Bug 19642] New: 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113 bugzilla-daemon
2010-10-03 18:37 ` [Bug 19642] " bugzilla-daemon
@ 2010-10-03 18:53 ` James Bottomley
2010-10-03 21:00 ` Mike Snitzer
2010-10-03 18:54 ` [Bug 19642] " bugzilla-daemon
2010-10-03 21:32 ` bugzilla-daemon
3 siblings, 1 reply; 10+ messages in thread
From: James Bottomley @ 2010-10-03 18:53 UTC (permalink / raw)
To: bugzilla-daemon; +Cc: linux-scsi, dm-devel
On Sun, 2010-10-03 at 18:37 +0000, bugzilla-daemon@bugzilla.kernel.org
wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=19642
>
> Summary: 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
> Product: IO/Storage
> Version: 2.5
> Kernel Version: 2.6.36-rc6
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: SCSI
> AssignedTo: linux-scsi@vger.kernel.org
I think this is wrong.
> ReportedBy: maciej.rutecki@gmail.com
> CC: rjw@sisk.pl, maciej.rutecki@gmail.com,
> linux@horizon.com
> Regression: Yes
>
>
> Subject : 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
> Submitter : "George Spelvin" <linux@horizon.com>
> Date : 2010-09-30 21:10
> Message-ID : 20100930211006.27449.qmail@science.horizon.com
> References : http://marc.info/?l=linux-kernel&m=128588102620299&w=2
That looks like a zero length barrier send down to SCSI with the RW flag
set (hence we try to map a buffer that has nothing to map). I cc'd
dm-devel to see what their opinion is.
James
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug 19642] 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
2010-10-03 18:37 [Bug 19642] New: 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113 bugzilla-daemon
2010-10-03 18:37 ` [Bug 19642] " bugzilla-daemon
2010-10-03 18:53 ` [Bug 19642] New: " James Bottomley
@ 2010-10-03 18:54 ` bugzilla-daemon
2010-10-03 21:32 ` bugzilla-daemon
3 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2010-10-03 18:54 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=19642
--- Comment #1 from Anonymous Emailer <anonymous@kernel-bugs.osdl.org> 2010-10-03 18:53:57 ---
Reply-To: James.Bottomley@suse.de
On Sun, 2010-10-03 at 18:37 +0000, bugzilla-daemon@bugzilla.kernel.org
wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=19642
>
> Summary: 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
> Product: IO/Storage
> Version: 2.5
> Kernel Version: 2.6.36-rc6
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: SCSI
> AssignedTo: linux-scsi@vger.kernel.org
I think this is wrong.
> ReportedBy: maciej.rutecki@gmail.com
> CC: rjw@sisk.pl, maciej.rutecki@gmail.com,
> linux@horizon.com
> Regression: Yes
>
>
> Subject : 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
> Submitter : "George Spelvin" <linux@horizon.com>
> Date : 2010-09-30 21:10
> Message-ID : 20100930211006.27449.qmail@science.horizon.com
> References : http://marc.info/?l=linux-kernel&m=128588102620299&w=2
That looks like a zero length barrier send down to SCSI with the RW flag
set (hence we try to map a buffer that has nothing to map). I cc'd
dm-devel to see what their opinion is.
James
--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Bug 19642] New: 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
2010-10-03 18:53 ` [Bug 19642] New: " James Bottomley
@ 2010-10-03 21:00 ` Mike Snitzer
0 siblings, 0 replies; 10+ messages in thread
From: Mike Snitzer @ 2010-10-03 21:00 UTC (permalink / raw)
To: device-mapper development; +Cc: linux-raid, bugzilla-daemon, linux-scsi
On Sun, Oct 03 2010 at 2:53pm -0400,
James Bottomley <James.Bottomley@suse.de> wrote:
> On Sun, 2010-10-03 at 18:37 +0000, bugzilla-daemon@bugzilla.kernel.org
> wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=19642
> >
> > Summary: 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
> > Product: IO/Storage
> > Version: 2.5
> > Kernel Version: 2.6.36-rc6
> > Platform: All
> > OS/Version: Linux
> > Tree: Mainline
> > Status: NEW
> > Severity: normal
> > Priority: P1
> > Component: SCSI
> > AssignedTo: linux-scsi@vger.kernel.org
>
> I think this is wrong.
>
> > ReportedBy: maciej.rutecki@gmail.com
> > CC: rjw@sisk.pl, maciej.rutecki@gmail.com,
> > linux@horizon.com
> > Regression: Yes
> >
> >
> > Subject : 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
> > Submitter : "George Spelvin" <linux@horizon.com>
> > Date : 2010-09-30 21:10
> > Message-ID : 20100930211006.27449.qmail@science.horizon.com
> > References : http://marc.info/?l=linux-kernel&m=128588102620299&w=2
>
> That looks like a zero length barrier send down to SCSI with the RW flag
> set (hence we try to map a buffer that has nothing to map). I cc'd
> dm-devel to see what their opinion is.
Given that the referenced LKML report is MD focused it seems linux-raid
should be cc'd instead (doing so now).
Mike
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug #19642] 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
2010-10-03 21:15 2.6.36-rc6-git2: Reported regressions from 2.6.35 Rafael J. Wysocki
@ 2010-10-03 21:19 ` Rafael J. Wysocki
0 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2010-10-03 21:19 UTC (permalink / raw)
To: Linux Kernel Mailing List
Cc: Kernel Testers List, Maciej Rutecki, Florian Mickler,
George Spelvin
This message has been generated automatically as a part of a summary report
of recent regressions.
The following bug entry is on the current list of known regressions
from 2.6.35. Please verify if it still should be listed and let the tracking team
know (either way).
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=19642
Subject : 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
Submitter : George Spelvin <linux@horizon.com>
Date : 2010-09-30 21:10 (4 days old)
Message-ID : <20100930211006.27449.qmail@science.horizon.com>
References : http://marc.info/?l=linux-kernel&m=128588102620299&w=2
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug 19642] 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
2010-10-03 18:37 [Bug 19642] New: 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113 bugzilla-daemon
` (2 preceding siblings ...)
2010-10-03 18:54 ` [Bug 19642] " bugzilla-daemon
@ 2010-10-03 21:32 ` bugzilla-daemon
3 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2010-10-03 21:32 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=19642
Alasdair G Kergon <agk@redhat.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |agk@redhat.com
Component|SCSI |MD
AssignedTo|linux-scsi@vger.kernel.org |io_md@kernel-bugs.osdl.org
--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Bug 19642] 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
[not found] <20101005000309.8466.qmail@science.horizon.com>
@ 2010-10-05 0:10 ` Nicholas A. Bellinger
2010-10-05 0:18 ` Nicholas A. Bellinger
2010-10-05 1:24 ` George Spelvin
0 siblings, 2 replies; 10+ messages in thread
From: Nicholas A. Bellinger @ 2010-10-05 0:10 UTC (permalink / raw)
To: George Spelvin; +Cc: bugzilla-daemon, linux-scsi
On Mon, 2010-10-04 at 20:03 -0400, George Spelvin wrote:
> > Would you mind enabling CONFIG_SCSI_LOGGING=y in your kernel build and calling
> >
> > echo 0xffffffff > /proc/sys/dev/scsi/logging_level
> >
> > so we can double check to see which CDBs are actually generating the issue..?
> >
> > Thanks!
>
> Here's a syslog starting at boot time for that (I also added verbose
> SCSI errors, FWIW.) I did
> echo 0xffffffff > /proc/sys/dev/scsi/logging_level; echo check > /sys/block/md6/md/sync_action
> to minimize time delay and resultany unrelated log spam.
>
> Note that md6 is mirrored across *every* drive on the machine. (It's the
> emergency recovery /boot partition.)
<SNIP>
> 23:56:29: md: data-check of RAID array md6
> 23:56:29: md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
> 23:56:29: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
> 23:56:29: md: using 128k window, over a total of 1166912 blocks.
> 23:56:29: sd 0:0:0:0: [sda] sd_init_command: block=0, count=0
> 23:56:29: sd 0:0:0:0: [sda] block=0
> 23:56:29: sd 0:0:0:0: [sda] reading 0/0 512 byte blocks.
> 23:56:29: sd 0:0:0:0: [sda] Send: 0xffff88021984ed00
> 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> 23:56:29: buffer = 0xffff88021984ee00, bufflen = 0, queuecommand 0xffffffff812be49d
> 23:56:29: ata1.00: WARNING: zero len r/w req
> 23:56:29: sd 0:0:0:0: [sda] Done: 0xffff88021984ed00 RETRY
> 23:56:29: sd 0:0:0:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> 23:56:29: sd 0:0:0:0: [sda] scsi host busy 1 failed 0
> 23:56:29: Inserting command ffff88021984ed00 into mlqueue
> 23:56:29: sd 0:0:0:0: [sda] Send: 0xffff88021984ed00
> 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> 23:56:29: buffer = 0xffff88021984ee00, bufflen = 0, queuecommand 0xffffffff812be49d
> 23:56:29: ata1.00: WARNING: zero len r/w req
> 23:56:29: leaving scsi_dispatch_cmnd()
> 23:56:29: sd 0:0:0:0: [sda] Done: 0xffff88021984ed00 RETRY
> 23:56:29: sd 0:0:0:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> 23:56:29: sd 0:0:0:0: [sda] scsi host busy 1 failed 0
> 23:56:29: Inserting command ffff88021984ed00 into mlqueue
> 23:56:29: sd 0:0:0:0: [sda] Send: 0xffff88021984ed00
> 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> 23:56:29: buffer = 0xffff88021984ee00, bufflen = 0, queuecommand 0xffffffff812be49d
> 23:56:29: ata1.00: WARNING: zero len r/w req
> 23:56:29: leaving scsi_dispatch_cmnd()
> 23:56:29: sd 0:0:0:0: [sda] Done: 0xffff88021984ed00 RETRY
> 23:56:29: sd 0:0:0:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> 23:56:29: sd 0:0:0:0: [sda] scsi host busy 1 failed 0
> 23:56:29: Inserting command ffff88021984ed00 into mlqueue
> 23:56:29: sd 0:0:0:0: [sda] Send: 0xffff88021984ed00
> 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> 23:56:29: buffer = 0xffff88021984ee00, bufflen = 0, queuecommand 0xffffffff812be49d
> 23:56:29: ata1.00: WARNING: zero len r/w req
> 23:56:29: leaving scsi_dispatch_cmnd()
> 23:56:29: sd 0:0:0:0: [sda] Done: 0xffff88021984ed00 RETRY
> 23:56:29: sd 0:0:0:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> 23:56:29: sd 0:0:0:0: [sda] scsi host busy 1 failed 0
> 23:56:29: Inserting command ffff88021984ed00 into mlqueue
> 23:56:29: sd 0:0:0:0: [sda] Send: 0xffff88021984ed00
> 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> 23:56:29: buffer = 0xffff88021984ee00, bufflen = 0, queuecommand 0xffffffff812be49d
> 23:56:29: ata1.00: WARNING: zero len r/w req
> 23:56:29: leaving scsi_dispatch_cmnd()
> 23:56:29: sd 0:0:0:0: [sda] Done: 0xffff88021984ed00 RETRY
> 23:56:29: sd 0:0:0:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> 23:56:29: sd 0:0:0:0: [sda] scsi host busy 1 failed 0
> 23:56:29: Inserting command ffff88021984ed00 into mlqueue
> 23:56:29: sd 0:0:0:0: [sda] Send: 0xffff88021984ed00
> 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> 23:56:29: buffer = 0xffff88021984ee00, bufflen = 0, queuecommand 0xffffffff812be49d
> 23:56:29: ata1.00: WARNING: zero len r/w req
> 23:56:29: leaving scsi_dispatch_cmnd()
> 23:56:29: sd 0:0:0:0: [sda] Done: 0xffff88021984ed00 SUCCESS
> 23:56:29: sd 0:0:0:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> 23:56:29: sd 0:0:0:0: [sda] scsi host busy 1 failed 0
> 23:56:29: sd 0:0:0:0: Notifying upper driver of completion (result 70000)
> 23:56:29: sd 0:0:0:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> 23:56:29: 0 sectors total, 0 bytes done.
> 23:56:29: leaving scsi_dispatch_cmnd()
Ok, it looks like my initial guess about FUA / BARRIER breakage was
completely wrong here. So the issue is in fact being caused by a
READ_10 with a transfer length of zero generated somewhere else UPL MD
code (perhaps something to do with MD metadata..?)
Would you be able to 'git bisect' on your system with Linus'es
linux-2.6.git tree from commit 9fe6206f400646a (v2.6.35) to HEAD at
v2.6.36-rc6 to help us pinpoint which commit the breakage was
introduced..?
Thanks!
--nab
> 23:56:29: ------------[ cut here ]------------
> 23:56:29: kernel BUG at drivers/scsi/scsi_lib.c:1113!
> 23:56:29: invalid opcode: 0000 [#1] SMP
> 23:56:29: last sysfs file: /sys/devices/virtual/block/md6/md/sync_action
> 23:56:29: CPU 3
> 23:56:29: Modules linked in: ctr twofish_generic twofish_x86_64 twofish_common serpent xcbc sha512_generic sha256_generic crypto_null ipg 8250_pci 8250_pnp 8250 serial_core
> 23:56:29:
> 23:56:29: Pid: 5446, comm: md6_resync Not tainted 2.6.36-rc6-00070-g44c064a #3 MS-7376/MS-7376
> 23:56:29: RIP: 0010:[<ffffffff812ac592>] [<ffffffff812ac592>] scsi_setup_fs_cmnd+0x4e/0xbb
> 23:56:29: RSP: 0018:ffff88021af05ab0 EFLAGS: 00010046
> 23:56:29: RAX: 0000000000000000 RBX: ffff88021f39cb88 RCX: ffff88021f2ee960
> 23:56:29: RDX: 0000000000000000 RSI: ffff88021f39cb88 RDI: ffff88021f65bc00
> 23:56:29: RBP: ffff88021af05ac0 R08: 00000000000003e8 R09: ffff88021f677848
> 23:56:29: R10: 0000000000000000 R11: ffff88021f677848 R12: ffff88021f65bc00
> 23:56:29: R13: 0000000000000000 R14: ffff88021f65bc48 R15: ffff88021f2f7000
> 23:56:29: FS: 00007f096e6a27c0(0000) GS:ffff880028380000(0000) knlGS:0000000000000000
> 23:56:29: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> 23:56:29: CR2: 0000003a10c1e8f0 CR3: 000000021c612000 CR4: 00000000000006e0
> 23:56:29: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> 23:56:29: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> 23:56:29: Process md6_resync (pid: 5446, threadinfo ffff88021af04000, task ffff88021988ac00)
> 23:56:29: Stack:
> 23:56:29: 0000000000000000 ffff88021f39cb88 ffff88021af05b40 ffffffff812b2af8
> 23:56:29: <0> ffff88021af05b40 ffffffff8116b8e5 0000000000000200 0000000000000000
> 23:56:29: <0> ffff88021f511868 ffff88021f2ee960 ffff88021f511958 ffff88021f65bc00
> 23:56:29: Call Trace:
> 23:56:29: [<ffffffff812b2af8>] sd_prep_fn+0x252/0xb00
> 23:56:29: [<ffffffff8116b8e5>] ? cfq_dispatch_requests+0x6c3/0x7ec
> 23:56:29: [<ffffffff81160487>] blk_peek_request+0xb3/0x177
> 23:56:29: [<ffffffff812abc42>] scsi_request_fn+0x8b/0x4cb
> 23:56:29: [<ffffffff81160d9c>] __generic_unplug_device+0x32/0x37
> 23:56:29: [<ffffffff81160dcc>] generic_unplug_device+0x2b/0x3a
> 23:56:29: [<ffffffff8115f0de>] blk_unplug+0x12/0x14
> 23:56:29: [<ffffffff81308275>] unplug_slaves+0x69/0x9f
> 23:56:29: [<ffffffff813082c3>] raid1_unplug+0x18/0x28
> 23:56:29: [<ffffffff8115f0de>] blk_unplug+0x12/0x14
> 23:56:29: [<ffffffff8131c1b6>] md_unplug+0x1d/0x33
> 23:56:29: [<ffffffff8131cb18>] md_do_sync+0x94c/0xba6
> 23:56:29: [<ffffffff8102f88f>] ? finish_task_switch+0x34/0x74
> 23:56:29: [<ffffffff8131d17e>] md_thread+0xf6/0x114
> 23:56:29: [<ffffffff8131d088>] ? md_thread+0x0/0x114
> 23:56:29: [<ffffffff8131d088>] ? md_thread+0x0/0x114
> 23:56:29: [<ffffffff81049e4f>] kthread+0x7d/0x85
> 23:56:29: [<ffffffff81002c94>] kernel_thread_helper+0x4/0x10
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Bug 19642] 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
2010-10-05 0:10 ` [Bug 19642] " Nicholas A. Bellinger
@ 2010-10-05 0:18 ` Nicholas A. Bellinger
2010-10-05 1:24 ` George Spelvin
1 sibling, 0 replies; 10+ messages in thread
From: Nicholas A. Bellinger @ 2010-10-05 0:18 UTC (permalink / raw)
To: Neil Brown; +Cc: bugzilla-daemon, linux-scsi, George Spelvin
Hi Neil,
Do you know if any of the recent changes into drivers/md/ code could
starting causing zero-length bio READ to be generated to MD RAID1
backstores along the lines of what George is currently seeing w/ 10x
RAID1 arrays during assembly..?
Thanks,
--nab
On Mon, 2010-10-04 at 17:10 -0700, Nicholas A. Bellinger wrote:
> On Mon, 2010-10-04 at 20:03 -0400, George Spelvin wrote:
> > > Would you mind enabling CONFIG_SCSI_LOGGING=y in your kernel build and calling
> > >
> > > echo 0xffffffff > /proc/sys/dev/scsi/logging_level
> > >
> > > so we can double check to see which CDBs are actually generating the issue..?
> > >
> > > Thanks!
> >
> > Here's a syslog starting at boot time for that (I also added verbose
> > SCSI errors, FWIW.) I did
> > echo 0xffffffff > /proc/sys/dev/scsi/logging_level; echo check > /sys/block/md6/md/sync_action
> > to minimize time delay and resultany unrelated log spam.
> >
> > Note that md6 is mirrored across *every* drive on the machine. (It's the
> > emergency recovery /boot partition.)
>
> <SNIP>
>
> > 23:56:29: md: data-check of RAID array md6
> > 23:56:29: md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
> > 23:56:29: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
> > 23:56:29: md: using 128k window, over a total of 1166912 blocks.
> > 23:56:29: sd 0:0:0:0: [sda] sd_init_command: block=0, count=0
> > 23:56:29: sd 0:0:0:0: [sda] block=0
> > 23:56:29: sd 0:0:0:0: [sda] reading 0/0 512 byte blocks.
> > 23:56:29: sd 0:0:0:0: [sda] Send: 0xffff88021984ed00
> > 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> > 23:56:29: buffer = 0xffff88021984ee00, bufflen = 0, queuecommand 0xffffffff812be49d
> > 23:56:29: ata1.00: WARNING: zero len r/w req
> > 23:56:29: sd 0:0:0:0: [sda] Done: 0xffff88021984ed00 RETRY
> > 23:56:29: sd 0:0:0:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> > 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> > 23:56:29: sd 0:0:0:0: [sda] scsi host busy 1 failed 0
> > 23:56:29: Inserting command ffff88021984ed00 into mlqueue
> > 23:56:29: sd 0:0:0:0: [sda] Send: 0xffff88021984ed00
> > 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> > 23:56:29: buffer = 0xffff88021984ee00, bufflen = 0, queuecommand 0xffffffff812be49d
> > 23:56:29: ata1.00: WARNING: zero len r/w req
> > 23:56:29: leaving scsi_dispatch_cmnd()
> > 23:56:29: sd 0:0:0:0: [sda] Done: 0xffff88021984ed00 RETRY
> > 23:56:29: sd 0:0:0:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> > 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> > 23:56:29: sd 0:0:0:0: [sda] scsi host busy 1 failed 0
> > 23:56:29: Inserting command ffff88021984ed00 into mlqueue
> > 23:56:29: sd 0:0:0:0: [sda] Send: 0xffff88021984ed00
> > 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> > 23:56:29: buffer = 0xffff88021984ee00, bufflen = 0, queuecommand 0xffffffff812be49d
> > 23:56:29: ata1.00: WARNING: zero len r/w req
> > 23:56:29: leaving scsi_dispatch_cmnd()
> > 23:56:29: sd 0:0:0:0: [sda] Done: 0xffff88021984ed00 RETRY
> > 23:56:29: sd 0:0:0:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> > 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> > 23:56:29: sd 0:0:0:0: [sda] scsi host busy 1 failed 0
> > 23:56:29: Inserting command ffff88021984ed00 into mlqueue
> > 23:56:29: sd 0:0:0:0: [sda] Send: 0xffff88021984ed00
> > 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> > 23:56:29: buffer = 0xffff88021984ee00, bufflen = 0, queuecommand 0xffffffff812be49d
> > 23:56:29: ata1.00: WARNING: zero len r/w req
> > 23:56:29: leaving scsi_dispatch_cmnd()
> > 23:56:29: sd 0:0:0:0: [sda] Done: 0xffff88021984ed00 RETRY
> > 23:56:29: sd 0:0:0:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> > 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> > 23:56:29: sd 0:0:0:0: [sda] scsi host busy 1 failed 0
> > 23:56:29: Inserting command ffff88021984ed00 into mlqueue
> > 23:56:29: sd 0:0:0:0: [sda] Send: 0xffff88021984ed00
> > 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> > 23:56:29: buffer = 0xffff88021984ee00, bufflen = 0, queuecommand 0xffffffff812be49d
> > 23:56:29: ata1.00: WARNING: zero len r/w req
> > 23:56:29: leaving scsi_dispatch_cmnd()
> > 23:56:29: sd 0:0:0:0: [sda] Done: 0xffff88021984ed00 RETRY
> > 23:56:29: sd 0:0:0:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> > 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> > 23:56:29: sd 0:0:0:0: [sda] scsi host busy 1 failed 0
> > 23:56:29: Inserting command ffff88021984ed00 into mlqueue
> > 23:56:29: sd 0:0:0:0: [sda] Send: 0xffff88021984ed00
> > 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> > 23:56:29: buffer = 0xffff88021984ee00, bufflen = 0, queuecommand 0xffffffff812be49d
> > 23:56:29: ata1.00: WARNING: zero len r/w req
> > 23:56:29: leaving scsi_dispatch_cmnd()
> > 23:56:29: sd 0:0:0:0: [sda] Done: 0xffff88021984ed00 SUCCESS
> > 23:56:29: sd 0:0:0:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> > 23:56:29: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 00 00
> > 23:56:29: sd 0:0:0:0: [sda] scsi host busy 1 failed 0
> > 23:56:29: sd 0:0:0:0: Notifying upper driver of completion (result 70000)
> > 23:56:29: sd 0:0:0:0: [sda] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
> > 23:56:29: 0 sectors total, 0 bytes done.
> > 23:56:29: leaving scsi_dispatch_cmnd()
>
> Ok, it looks like my initial guess about FUA / BARRIER breakage was
> completely wrong here. So the issue is in fact being caused by a
> READ_10 with a transfer length of zero generated somewhere else UPL MD
> code (perhaps something to do with MD metadata..?)
>
> Would you be able to 'git bisect' on your system with Linus'es
> linux-2.6.git tree from commit 9fe6206f400646a (v2.6.35) to HEAD at
> v2.6.36-rc6 to help us pinpoint which commit the breakage was
> introduced..?
>
> Thanks!
>
> --nab
>
>
> > 23:56:29: ------------[ cut here ]------------
> > 23:56:29: kernel BUG at drivers/scsi/scsi_lib.c:1113!
> > 23:56:29: invalid opcode: 0000 [#1] SMP
> > 23:56:29: last sysfs file: /sys/devices/virtual/block/md6/md/sync_action
> > 23:56:29: CPU 3
> > 23:56:29: Modules linked in: ctr twofish_generic twofish_x86_64 twofish_common serpent xcbc sha512_generic sha256_generic crypto_null ipg 8250_pci 8250_pnp 8250 serial_core
> > 23:56:29:
> > 23:56:29: Pid: 5446, comm: md6_resync Not tainted 2.6.36-rc6-00070-g44c064a #3 MS-7376/MS-7376
> > 23:56:29: RIP: 0010:[<ffffffff812ac592>] [<ffffffff812ac592>] scsi_setup_fs_cmnd+0x4e/0xbb
> > 23:56:29: RSP: 0018:ffff88021af05ab0 EFLAGS: 00010046
> > 23:56:29: RAX: 0000000000000000 RBX: ffff88021f39cb88 RCX: ffff88021f2ee960
> > 23:56:29: RDX: 0000000000000000 RSI: ffff88021f39cb88 RDI: ffff88021f65bc00
> > 23:56:29: RBP: ffff88021af05ac0 R08: 00000000000003e8 R09: ffff88021f677848
> > 23:56:29: R10: 0000000000000000 R11: ffff88021f677848 R12: ffff88021f65bc00
> > 23:56:29: R13: 0000000000000000 R14: ffff88021f65bc48 R15: ffff88021f2f7000
> > 23:56:29: FS: 00007f096e6a27c0(0000) GS:ffff880028380000(0000) knlGS:0000000000000000
> > 23:56:29: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > 23:56:29: CR2: 0000003a10c1e8f0 CR3: 000000021c612000 CR4: 00000000000006e0
> > 23:56:29: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > 23:56:29: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > 23:56:29: Process md6_resync (pid: 5446, threadinfo ffff88021af04000, task ffff88021988ac00)
> > 23:56:29: Stack:
> > 23:56:29: 0000000000000000 ffff88021f39cb88 ffff88021af05b40 ffffffff812b2af8
> > 23:56:29: <0> ffff88021af05b40 ffffffff8116b8e5 0000000000000200 0000000000000000
> > 23:56:29: <0> ffff88021f511868 ffff88021f2ee960 ffff88021f511958 ffff88021f65bc00
> > 23:56:29: Call Trace:
> > 23:56:29: [<ffffffff812b2af8>] sd_prep_fn+0x252/0xb00
> > 23:56:29: [<ffffffff8116b8e5>] ? cfq_dispatch_requests+0x6c3/0x7ec
> > 23:56:29: [<ffffffff81160487>] blk_peek_request+0xb3/0x177
> > 23:56:29: [<ffffffff812abc42>] scsi_request_fn+0x8b/0x4cb
> > 23:56:29: [<ffffffff81160d9c>] __generic_unplug_device+0x32/0x37
> > 23:56:29: [<ffffffff81160dcc>] generic_unplug_device+0x2b/0x3a
> > 23:56:29: [<ffffffff8115f0de>] blk_unplug+0x12/0x14
> > 23:56:29: [<ffffffff81308275>] unplug_slaves+0x69/0x9f
> > 23:56:29: [<ffffffff813082c3>] raid1_unplug+0x18/0x28
> > 23:56:29: [<ffffffff8115f0de>] blk_unplug+0x12/0x14
> > 23:56:29: [<ffffffff8131c1b6>] md_unplug+0x1d/0x33
> > 23:56:29: [<ffffffff8131cb18>] md_do_sync+0x94c/0xba6
> > 23:56:29: [<ffffffff8102f88f>] ? finish_task_switch+0x34/0x74
> > 23:56:29: [<ffffffff8131d17e>] md_thread+0xf6/0x114
> > 23:56:29: [<ffffffff8131d088>] ? md_thread+0x0/0x114
> > 23:56:29: [<ffffffff8131d088>] ? md_thread+0x0/0x114
> > 23:56:29: [<ffffffff81049e4f>] kthread+0x7d/0x85
> > 23:56:29: [<ffffffff81002c94>] kernel_thread_helper+0x4/0x10
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Bug 19642] 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113
2010-10-05 0:10 ` [Bug 19642] " Nicholas A. Bellinger
2010-10-05 0:18 ` Nicholas A. Bellinger
@ 2010-10-05 1:24 ` George Spelvin
1 sibling, 0 replies; 10+ messages in thread
From: George Spelvin @ 2010-10-05 1:24 UTC (permalink / raw)
To: nab; +Cc: bugzilla-daemon, linux, linux-scsi
> Would you be able to 'git bisect' on your system with Linus'es
> linux-2.6.git tree from commit 9fe6206f400646a (v2.6.35) to HEAD at
> v2.6.36-rc6 to help us pinpoint which commit the breakage was
> introduced..?
It's tedious, but I can, HOWEVER...
any advice avoiding false positives from
https://bugzilla.kernel.org/show_bug.cgi?id=16275
introduced in 74450be123b6f3cb480c358a056be398cce6aa6e
and fixed in 2c7d46ec192e4f2b350f67a0e185b9bce646cd6b?
I suppose I could cherry-pick the latter when in the
vulnerable range...
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2010-10-05 1:24 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-10-03 18:37 [Bug 19642] New: 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113 bugzilla-daemon
2010-10-03 18:37 ` [Bug 19642] " bugzilla-daemon
2010-10-03 18:53 ` [Bug 19642] New: " James Bottomley
2010-10-03 21:00 ` Mike Snitzer
2010-10-03 18:54 ` [Bug 19642] " bugzilla-daemon
2010-10-03 21:32 ` bugzilla-daemon
-- strict thread matches above, loose matches on Subject: below --
2010-10-03 21:15 2.6.36-rc6-git2: Reported regressions from 2.6.35 Rafael J. Wysocki
2010-10-03 21:19 ` [Bug #19642] 2.6.36-rc6 BUG at drivers/scsi/scsi_lib.c:1113 Rafael J. Wysocki
[not found] <20101005000309.8466.qmail@science.horizon.com>
2010-10-05 0:10 ` [Bug 19642] " Nicholas A. Bellinger
2010-10-05 0:18 ` Nicholas A. Bellinger
2010-10-05 1:24 ` George Spelvin
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.