* SCSI bug
@ 2016-01-23 18:00 John David Anglin
2016-02-20 20:13 ` John David Anglin
0 siblings, 1 reply; 25+ messages in thread
From: John David Anglin @ 2016-01-23 18:00 UTC (permalink / raw)
To: linux-parisc List
I gave 4.4.0+ a try this morning and we still have the SCSI issue previously reported:
sym0: <1010-66> rev 0x1 at pci 0000:20:01.0 irq 70
sym0: No NVRAM, ID 7, Fast-80, LVD, parity checking
sym0: SCSI BUS has been reset.
scsi host0: sym-2.2.3
usb 1-3: new high-speed USB device number 2 using ehci-pci
usb usb2: New USB device found, idVendor=1d6b, idProduct=0001
usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: Product: OHCI PCI host controller
usb usb2: Manufacturer: Linux 4.4.0+ ohci_hcd
usb usb2: SerialNumber: 0000:00:01.0
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
ohci-pci 0000:00:01.1: OHCI PCI host controller
ohci-pci 0000:00:01.1: new USB bus registered, assigned bus number 3
ohci-pci 0000:00:01.1: irq 67, io mem 0xffffffff80001000
usb 1-3: New USB device found, idVendor=1058, idProduct=0748
usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=5
usb 1-3: Product: My Passport 0748
usb 1-3: Manufacturer: Western Digital
usb 1-3: SerialNumber: 575836314542325A33383231
usb-storage 1-3:1.0: USB Mass Storage device detected
scsi host1: usb-storage 1-3:1.0
usbcore: registered new interface driver usb-storage
usbcore: registered new interface driver uas
usb usb3: New USB device found, idVendor=1d6b, idProduct=0001
usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb3: Product: OHCI PCI host controller
usb usb3: Manufacturer: Linux 4.4.0+ ohci_hcd
usb usb3: SerialNumber: 0000:00:01.1
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
pata_cmd64x 0000:00:02.0: Secondary port is disabled
scsi host2: pata_cmd64x
scsi host3: pata_cmd64x
ata1: PATA max UDMA/100 cmd 0xd18 ctl 0xd24 bmdma 0xd00 irq 69
ata2: DUMMY
ata1.00: ATAPI: DW-224E, C.0B, max UDMA/33
ata1.00: configured for UDMA/33
scsi 2:0:0:0: CD-ROM TEAC DW-224E C.0B PQ: 0 ANSI: 5
sr 2:0:0:0: [sr0] scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray
cdrom: Uniform CD-ROM driver Revision: 3.20
scsi 1:0:0:0: Direct-Access WD My Passport 0748 1022 PQ: 0 ANSI: 6
scsi 1:0:0:1: Enclosure WD SES Device 1022 PQ: 0 ANSI: 6
sd 1:0:0:0: [sda] Spinning up disk...
.
scsi 0:0:0:0: Direct-Access SEAGATE ST3300007LC D705 PQ: 0 ANSI: 3
scsi target0:0:0: tagged command queuing enabled, command queue depth 16.
scsi target0:0:0: Beginning Domain Validation
scsi target0:0:0: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31)
scsi target0:0:0: Ending Domain Validation
sd 0:0:0:0: [sdb] 585937500 512-byte logical blocks: (300 GB/279 GiB)
sd 0:0:0:0: [sdb] Write Protect is off
sd 0:0:0:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
sd 0:0:0:0: [sdb] Attached SCSI disk
ready
sd 1:0:0:0: [sda] 3906963456 512-byte logical blocks: (2.00 TB/1.82 TiB)
sd 1:0:0:0: [sda] Write Protect is off
sd 1:0:0:0: [sda] No Caching mode page found
sd 1:0:0:0: [sda] Assuming drive cache: write through
ses 1:0:0:1: Attached Enclosure device
sda: sda1 sda2
sd 1:0:0:0: [sda] Attached SCSI disk
random: nonblocking pool is initialized
------------[ cut here ]------------
WARNING: at block/blk-merge.c:454
Modules linked in: ses enclosure scsi_transport_sas sd_mod sr_mod cdrom uas usb_storage pata_cmd64x ohci_pci tg3(+) ptp sym53c8xx(+) libata pps_core ehci_pci scsi_transport_spi ohci_hcd ehci_hcd scsi_mod usbcore usb_common
CPU: 1 PID: 930 Comm: systemd-udevd Not tainted 4.4.0+ #1
task: 000000007f038c68 ti: 000000007e198000 task.ti: 000000007e198000
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001001111111100001110 Not tainted
r00-03 000000ff0804ff0e 0000000040756dc0 0000000000000000 000000007e199210
r04-07 000000004072cdc0 0000000000000000 0000000000000000 000000000000001e
r08-11 0000000000000001 0000000000000000 000000007c1fe1b8 0000000000000000
r12-15 0000000000000002 00000000000001e0 0000000000001000 000000007e1d6800
r16-19 000000007c1fa6f8 000000004270e3c0 000000007c1fe1b8 0000000000000000
r20-23 0000000000000000 000000007e3f3c00 00000000407ac690 0000000000000000
r24-27 0000000000000000 000000007c1fa6f8 000000004270e3c0 000000004072cdc0
r28-31 0000000000000001 0000000000001000 000000007e199340 0000000000000001
sr00-03 0000000000012000 0000000000000000 0000000000012000 0000000000012000
sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 0000000000000000 0000000000000000 IAOQ: 000000004038d5f4 000000004038d5f8
IIR: 03ffe01f ISR: 0000000010340000 IOR: 000000f07f9fe1b8
CPU: 1 CR30: 000000007e198000 CR31: ff7f1bfb00e9e4ff
ORIG_R28: 000000007e1994f0
IAOQ[0]: blk_rq_map_sg+0x564/0x598
IAOQ[1]: blk_rq_map_sg+0x568/0x598
RP(r2): (null)
Backtrace:
[<0000000018317690>] scsi_init_sgtable+0x70/0x168 [scsi_mod]
[<00000000183177f4>] scsi_init_io+0x6c/0x250 [scsi_mod]
[<000000001c7f16b0>] sd_setup_read_write_cmnd+0x58/0x948 [sd_mod]
[<000000001c7f1fe4>] sd_init_command+0x44/0x130 [sd_mod]
[<0000000018317adc>] scsi_setup_cmnd+0x104/0x1c0 [scsi_mod]
[<0000000018317e28>] scsi_prep_fn+0x100/0x340 [scsi_mod]
[<000000004038663c>] blk_peek_request+0x1b4/0x290
[<0000000018319a34>] scsi_request_fn+0xf4/0xab0 [scsi_mod]
[<00000000403819e4>] __blk_run_queue+0x4c/0x70
[<00000000403ac6a8>] cfq_insert_request+0x2e0/0x588
[<0000000040380ba0>] __elv_add_request+0x190/0x2d8
---[ end trace 19da17b0d547c92b ]---
------------[ cut here ]------------
kernel BUG at drivers/scsi/scsi_lib.c:1097!
CPU: 1 PID: 930 Comm: systemd-udevd Tainted: G W 4.4.0+ #1
task: 000000007f038c68 ti: 000000007e198000 task.ti: 000000007e198000
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001001111111100001110 Tainted: G W
r00-03 000000ff0804ff0e 0000000040756dc0 0000000018317690 000000007e199170
r04-07 0000000018301000 000000007c1fa6f8 000000007e21ab60 0000000000000000
r08-11 000000007c1fa6f8 000000007e1fe800 00000000408444d0 0000000000000110
r12-15 0000000000000000 000000007e198798 0000000018301000 0000000040756dc0
r16-19 00000000408444dc 000000007e1fe800 000000007c216800 0000000000000000
r20-23 0000000000000000 000000007e3f3c00 00000000407ac690 0000000000000000
r24-27 0000000000000000 000000007c1fa6f8 000000004270e3c0 000000004072cdc0
r28-31 0000000000000002 0000000000001000 000000007e199210 0000000000000001
sr00-03 0000000000012000 0000000000000000 0000000000012000 0000000000012000
sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000183176d4 00000000183176d8
IIR: 03ffe01f ISR: 0000000000000000 IOR: 0000000000000000
CPU: 1 CR30: 000000007e198000 CR31: ff7f1bfb00e9e4ff
ORIG_R28: 0000000000000000
IAOQ[0]: scsi_init_sgtable+0xb4/0x168 [scsi_mod]
IAOQ[1]: scsi_init_sgtable+0xb8/0x168 [scsi_mod]
RP(r2): scsi_init_sgtable+0x70/0x168 [scsi_mod]
Backtrace:
[<00000000183177f4>] scsi_init_io+0x6c/0x250 [scsi_mod]
[<000000001c7f16b0>] sd_setup_read_write_cmnd+0x58/0x948 [sd_mod]
[<000000001c7f1fe4>] sd_init_command+0x44/0x130 [sd_mod]
[<0000000018317adc>] scsi_setup_cmnd+0x104/0x1c0 [scsi_mod]
[<0000000018317e28>] scsi_prep_fn+0x100/0x340 [scsi_mod]
[<000000004038663c>] blk_peek_request+0x1b4/0x290
[<0000000018319a34>] scsi_request_fn+0xf4/0xab0 [scsi_mod]
[<00000000403819e4>] __blk_run_queue+0x4c/0x70
[<00000000403ac6a8>] cfq_insert_request+0x2e0/0x588
[<0000000040380ba0>] __elv_add_request+0x190/0x2d8
CPU: 1 PID: 930 Comm: systemd-udevd Tainted: G W 4.4.0+ #1
Backtrace:
[<000000004015d560>] show_stack+0x20/0x38
[<00000000403b1f94>] dump_stack+0x9c/0x110
[<000000004015d734>] die_if_kernel+0x19c/0x2e0
[<000000004015e610>] handle_interruption+0x9a8/0x9d0
---[ end trace 19da17b0d547c92c ]---
NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [systemd-udevd:930]
Modules linked in: ses enclosure scsi_transport_sas sd_mod sr_mod cdrom uas usb_storage pata_cmd64x ohci_pci tg3(+) ptp sym53c8xx(+) libata pps_core ehci_pci scsi_transport_spi ohci_hcd ehci_hcd scsi_mod usbcore usb_common
CPU: 3 PID: 930 Comm: systemd-udevd Tainted: G D W 4.4.0+ #1
task: 000000007f038c68 ti: 000000007e198000 task.ti: 000000007e198000
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001101111111100001111 Tainted: G D W
r00-03 000000ff0806ff0f 000000007e199d00 00000000402039c0 000000007e199c30
r04-07 000000004072cdc0 000000007e199cd0 00000000407bc588 00000000407bc4a0
r08-11 00000000428443a0 00000000428443a8 0000000000000001 0000000040705660
r12-15 0000000000000000 000000007e199cd0 0000000018301000 0000000040756dc0
r16-19 000000007e199210 000000007e1fe800 000000007c216800 000000004282bfa0
r20-23 0000000000000001 00000000428443a8 000000000800000f 0000000000000000
r24-27 0000000000000000 0000000000000020 00000000428443a8 000000004072cdc0
r28-31 0000000000000001 000000007e199d50 000000007e199d00 0000000000000003
sr00-03 000000000001c000 0000000000000000 0000000000000000 000000000001c000
sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000402039e8 00000000402039ec
IIR: 4a7f0030 ISR: 000000004072cdc0 IOR: 000000007e36e620
CPU: 3 CR30: 000000007e198000 CR31: ffffffffffffffff
ORIG_R28: 0000000000000001
IAOQ[0]: smp_call_function_many+0x338/0x3b0
IAOQ[1]: smp_call_function_many+0x33c/0x3b0
RP(r2): smp_call_function_many+0x310/0x3b0
Backtrace:
[<0000000040203b18>] on_each_cpu+0x58/0xa0
[<000000004015aed8>] flush_tlb_all+0x108/0x1e8
[<0000000040258270>] tlb_flush_mmu_tlbonly+0x48/0xa8
[<0000000040259048>] tlb_finish_mmu+0x30/0x98
[<0000000040263e5c>] exit_mmap+0x134/0x1b8
[<0000000040182668>] mmput+0xc0/0x1a8
[<0000000040188eec>] do_exit+0x334/0xce0
[<000000004015d790>] die_if_kernel+0x1f8/0x2e0
[<000000004015e610>] handle_interruption+0x9a8/0x9d0
NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [systemd-udevd:930]
Modules linked in: ses enclosure scsi_transport_sas sd_mod sr_mod cdrom uas usb_storage pata_cmd64x ohci_pci tg3(+) ptp sym53c8xx(+) libata pps_core ehci_pci scsi_transport_spi ohci_hcd ehci_hcd scsi_mod usbcore usb_common
CPU: 3 PID: 930 Comm: systemd-udevd Tainted: G D W L 4.4.0+ #1
task: 000000007f038c68 ti: 000000007e198000 task.ti: 000000007e198000
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001101111111100001111 Tainted: G D W L
r00-03 000000ff0806ff0f 000000007e199d00 00000000402039c0 000000007e199c30
r04-07 000000004072cdc0 000000007e199cd0 00000000407bc588 00000000407bc4a0
r08-11 00000000428443a0 00000000428443a8 0000000000000001 0000000040705660
r12-15 0000000000000000 000000007e199cd0 0000000018301000 0000000040756dc0
r16-19 000000007e199210 000000007e1fe800 000000007c216800 000000004282bfa0
r20-23 0000000000000001 00000000428443a8 000000000800000f 0000000000000000
r24-27 0000000000000000 0000000000000020 00000000428443a8 000000004072cdc0
r28-31 0000000000000001 000000007e199d50 000000007e199d00 0000000000000003
sr00-03 000000000001c000 0000000000000000 0000000000000000 000000000001c000
sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000402039e8 00000000402039ec
IIR: 4a7f0030 ISR: 000000004072cdc0 IOR: 000000007e36e620
CPU: 3 CR30: 000000007e198000 CR31: ffffffffffffffff
ORIG_R28: 0000000000000001
IAOQ[0]: smp_call_function_many+0x338/0x3b0
IAOQ[1]: smp_call_function_many+0x33c/0x3b0
RP(r2): smp_call_function_many+0x310/0x3b0
Backtrace:
[<0000000040203b18>] on_each_cpu+0x58/0xa0
[<000000004015aed8>] flush_tlb_all+0x108/0x1e8
[<0000000040258270>] tlb_flush_mmu_tlbonly+0x48/0xa8
[<0000000040259048>] tlb_finish_mmu+0x30/0x98
[<0000000040263e5c>] exit_mmap+0x134/0x1b8
[<0000000040182668>] mmput+0xc0/0x1a8
[<0000000040188eec>] do_exit+0x334/0xce0
[<000000004015d790>] die_if_kernel+0x1f8/0x2e0
[<000000004015e610>] handle_interruption+0x9a8/0x9d0
INFO: rcu_sched self-detected stall on CPU
3-...: (5978 ticks this GP) idle=ecb/140000000000001/0 softirq=709/709 fqs=6000
(t=6000 jiffies g=-123 c=-124 q=4)
Task dump for CPU 1:
kworker/1:1H R running task 0 948 2 0x00000004
Workqueue: kblockd cfq_kick_queue
Backtrace:
[<000000004015361c>] __schedule+0x264/0x5b8
[<00000000401539bc>] schedule+0x4c/0xc8
[<00000000401a29d0>] worker_thread+0x338/0x688
[<00000000401aae84>] kthread+0x144/0x178
[<0000000040149020>] end_fault_vector+0x20/0x28
Task dump for CPU 3:
systemd-udevd R running task 0 930 925 0x00000014
Backtrace:
[<000000004015d560>] show_stack+0x20/0x38
[<00000000401b95b4>] sched_show_task+0x134/0x1d0
[<00000000401bc054>] dump_cpu_task+0x64/0x80
[<00000000401e7584>] rcu_dump_cpu_stacks+0xf4/0x180
[<00000000401ec644>] rcu_check_callbacks+0x5ac/0x9b8
[<00000000401ef264>] update_process_times+0x74/0xd8
[<000000004015e878>] timer_interrupt+0x1b0/0x210
[<00000000401dc510>] handle_irq_event_percpu+0xa8/0x248
[<00000000401e1e7c>] handle_percpu_irq+0xac/0xe8
[<00000000401db794>] generic_handle_irq+0x4c/0x68
[<000000004014b2cc>] call_on_stack+0x18/0x24
...
Dave
--
John David Anglin dave.anglin@bell.net
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: SCSI bug
2016-01-23 18:00 SCSI bug John David Anglin
@ 2016-02-20 20:13 ` John David Anglin
2016-02-20 20:43 ` John David Anglin
0 siblings, 1 reply; 25+ messages in thread
From: John David Anglin @ 2016-02-20 20:13 UTC (permalink / raw)
To: John David Anglin; +Cc: linux-parisc List
On 2016-01-23, at 1:00 PM, John David Anglin wrote:
> WARNING: at block/blk-merge.c:454
With linux-image-4.4.0-1-parisc64-smp on c3740, the above warning is the last message I see.
Kernel seems to hang at that point. This is warning code:
	/*
	 * Something must have been wrong if the figured number of
	 * segment is bigger than number of req's physical segments
	 */
	WARN_ON(nsegs > rq->nr_phys_segments);
Dave
--
John David Anglin dave.anglin@bell.net
^ permalink raw reply [flat|nested] 25+ messages in thread
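The check quoted above compares two independently computed numbers: the segment count produced while mapping the request, and the count stored in the request beforehand. The following is only a user-space sketch of that comparison, not kernel code. The names `model_request`, `nsegs_exceeds_request` and `sg_table_overflows` are hypothetical. The second helper rests on the assumption that the SCSI layer sizes its scatterlist table from nr_phys_segments, which would explain why the BUG in scsi_lib.c follows the warning.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the one field of struct request used by the check. */
struct model_request {
	unsigned short nr_phys_segments; /* count computed when the request was built */
};

/* True when WARN_ON(nsegs > rq->nr_phys_segments) would fire. */
static bool nsegs_exceeds_request(const struct model_request *rq, int nsegs)
{
	return nsegs > rq->nr_phys_segments;
}

/*
 * True when the mapped count would not fit a table that was sized from
 * nr_phys_segments (assumption: this mirrors the count/nents comparison
 * behind the BUG reported at drivers/scsi/scsi_lib.c).
 */
static bool sg_table_overflows(int mapped_count, int table_nents)
{
	return mapped_count > table_nents;
}
```

With a request that claims 2 segments and a mapping pass that produces 3, both helpers return true, which matches the order of events in the first report: the warning in blk-merge.c first, then the BUG in scsi_lib.c.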
* Re: SCSI bug
2016-02-20 20:13 ` John David Anglin
@ 2016-02-20 20:43 ` John David Anglin
2016-02-20 21:59 ` Helge Deller
0 siblings, 1 reply; 25+ messages in thread
From: John David Anglin @ 2016-02-20 20:43 UTC (permalink / raw)
To: John David Anglin; +Cc: linux-parisc List
On 2016-02-20, at 3:13 PM, John David Anglin wrote:
> On 2016-01-23, at 1:00 PM, John David Anglin wrote:
>
>> WARNING: at block/blk-merge.c:454
>
> With linux-image-4.4.0-1-parisc64-smp on c3740, the above warning is the last message I see.
> Kernel seems to hang at that point. This is warning code:
>
> 	/*
> 	 * Something must have been wrong if the figured number of
> 	 * segment is bigger than number of req's physical segments
> 	 */
> 	WARN_ON(nsegs > rq->nr_phys_segments);
On Sep. 12, 2015, I reported the following problem:
http://www.spinics.net/lists/linux-parisc/msg06327.html
Dave
--
John David Anglin dave.anglin@bell.net
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: SCSI bug
2016-02-20 20:43 ` John David Anglin
@ 2016-02-20 21:59 ` Helge Deller
2016-02-20 22:52 ` John David Anglin
0 siblings, 1 reply; 25+ messages in thread
From: Helge Deller @ 2016-02-20 21:59 UTC (permalink / raw)
To: John David Anglin; +Cc: linux-parisc List, James Bottomley
On 20.02.2016 21:43, John David Anglin wrote:
> On 2016-02-20, at 3:13 PM, John David Anglin wrote:
>
>> On 2016-01-23, at 1:00 PM, John David Anglin wrote:
>>
>>> WARNING: at block/blk-merge.c:454
>>
>> With linux-image-4.4.0-1-parisc64-smp on c3740, the above warning is the last message I see.
>> Kernel seems to hang at that point. This is warning code:
>>
>> 	/*
>> 	 * Something must have been wrong if the figured number of
>> 	 * segment is bigger than number of req's physical segments
>> 	 */
>> 	WARN_ON(nsegs > rq->nr_phys_segments);
>
> On Sep. 12, 2015, I reported the following problem:
>
> http://www.spinics.net/lists/linux-parisc/msg06327.html
The problem is still that this bug is reproducible (at every boot) only when the SCSI drivers
are built as modules and loaded from an initrd. I could never reproduce it when
I booted a kernel with built-in SCSI drivers.
The bug seems to be triggered by the (*nsegs)++ statement in __blk_segment_map_sg() in block/blk-merge.c.
I'm testing with the 4.4.2 kernel from Debian.
I modified __blk_segment_map_sg() like this:
static inline void
__blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
		     struct scatterlist *sglist, struct bio_vec *bvprv,
		     struct scatterlist **sg, int *nsegs, int *cluster)
{

	int nbytes = bvec->bv_len;

	if (*sg && *cluster) {
		if ((*sg)->length + nbytes > queue_max_segment_size(q))
			goto new_segment;

		if (!BIOVEC_PHYS_MERGEABLE(bvprv, bvec))
			goto new_segment;
		if (!BIOVEC_SEG_BOUNDARY(q, bvprv, bvec))
			goto new_segment;

		(*sg)->length += nbytes;
	} else {
new_segment:
		if (*sg && *cluster) {
			printk("NEW SEGMENT sg = %p!!!\n", sg);
			printk("__blk_segment_map_sg: length = %d, nbytes = %d, sum = %d > %d\n", (*sg)->length, nbytes, (*sg)->length + nbytes, queue_max_segment_size(q));
			printk("__blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = %d, BIOVEC_SEG_BOUNDARY = %d\n", BIOVEC_PHYS_MERGEABLE(bvprv, bvec), BIOVEC_SEG_BOUNDARY(q, bvprv, bvec) );
		}
		if (!*sg)
			*sg = sglist;
		else {
			/*
			 * If the driver previously mapped a shorter
			 * list, we could see a termination bit
			 * prematurely unless it fully inits the sg
			 * table on each mapping. We KNOW that there
			 * must be more entries here or the driver
			 * would be buggy, so force clear the
			 * termination bit to avoid doing a full
			 * sg_init_table() in drivers for each command.
			 */
			sg_unmark_end(*sg);
			*sg = sg_next(*sg);
		}

		sg_set_page(*sg, bvec->bv_page, nbytes, bvec->bv_offset);
		(*nsegs)++;
	}
	*bvprv = *bvec;
}
The boot log then looks like this:
[ 43.044000] scsi_init_sgtable: count = 1, nents = 1
(there are lots of those before it!)
[ 43.164000] scsi_init_sgtable: nr_phys_segments = 1
[ 43.164000] scsi_init_sgtable: count = 1, nents = 1
[ 43.280000] scsi_init_sgtable: nr_phys_segments = 1
[ 43.280000] scsi_init_sgtable: count = 1, nents = 1
[ 43.396000] scsi_init_sgtable: nr_phys_segments = 1
[ 43.396000] scsi_init_sgtable: count = 1, nents = 1
[ 43.512000] scsi_init_sgtable: nr_phys_segments = 1
[ 43.512000] scsi_init_sgtable: count = 1, nents = 1
[ 43.628000] scsi_init_sgtable: nr_phys_segments = 3
[ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!!
[ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!!
[ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 43.628000] scsi_init_sgtable: count = 3, nents = 3
[ 44.224000] scsi_init_sgtable: nr_phys_segments = 1
[ 44.224000] scsi_init_sgtable: count = 1, nents = 1
[ 44.340000] scsi_init_sgtable: nr_phys_segments = 1
[ 44.340000] scsi_init_sgtable: count = 1, nents = 1
[ 44.456000] scsi_init_sgtable: nr_phys_segments = 7
[ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!!
[ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!!
[ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!!
[ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!!
[ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!!
[ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!!
[ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 44.456000] scsi_init_sgtable: count = 7, nents = 7
[ 44.456000] timer_interrupt(CPU 0): delayed! cycles 4527081F rem C6C21 next/now 14E153306E/14E146C44D
[ 46.116000] scsi_init_sgtable: nr_phys_segments = 7
[ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!!
[ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!!
[ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!!
[ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!!
[ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!!
[ 46.116000] __blk_segment_map_sg: length = 8192, nbytes = 4096, sum = 12288 > 65536
[ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!!
[ 46.116000] __blk_segment_map_sg: length = 16384, nbytes = 4096, sum = 20480 > 65536
[ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 46.116000] scsi_init_sgtable: count = 7, nents = 7
[ 46.116000] timer_interrupt(CPU 0): delayed! cycles 453F0A77 rem 223089 next/now 152BB6286E/152B93F7E5
[ 47.780000] scsi_init_sgtable: nr_phys_segments = 1
[ 47.780000] scsi_init_sgtable: count = 1, nents = 1
[ 47.896000] scsi_init_sgtable: nr_phys_segments = 6
[ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!!
[ 47.896000] __blk_segment_map_sg: length = 61440, nbytes = 4096, sum = 65536 > 65536
[ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!!
[ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!!
[ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!!
[ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = 4096, sum = 12288 > 65536
[ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!!
[ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = 4096, sum = 12288 > 65536
[ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 47.896000] scsi_init_sgtable: count = 6, nents = 6
[ 47.896000] timer_interrupt(CPU 0): delayed! cycles 3AB087E2 rem 23E4DE next/now 1570BBD5EE/157097F110
[ 49.324000] scsi_init_sgtable: nr_phys_segments = 1
[ 49.324000] scsi_init_sgtable: count = 1, nents = 1
[ 49.440000] scsi_init_sgtable: nr_phys_segments = 2
[ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!!
[ 49.440000] __blk_segment_map_sg: length = 65536, nbytes = 4096, sum = 69632 > 65536
(this is interesting! Here we reach a sum of > 65536 the first time)
[ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 1, BIOVEC_SEG_BOUNDARY = 1
[ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!!
[ 49.440000] __blk_segment_map_sg: length = 16384, nbytes = 4096, sum = 20480 > 65536
[ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 49.440000] *** FIXIT *** HELGE: nsegs > rq->nr_phys_segments = 3 > 2
[ 49.440000] scsi_init_sgtable: count = 3, nents = 2
[ 50.116000] ------------[ cut here ]------------
[ 50.172000] WARNING: at /build/linux-4.4/linux-4.4.2/drivers/scsi/scsi_lib.c:1104
(this is usually a BUG(). I changed it to WARN() in the hope it would work anyway. It didn't.)
[ 50.260000] Modules linked in: sd_mod sr_mod cdrom ata_generic ohci_pci ehci_pci ohci_hcd ehci_hcd pata_ns87415 sym53c8xx libata scsi_transport_spi scsi_mod usbcorep
[ 50.456000] CPU: 0 PID: 70 Comm: systemd-udevd Not tainted 4.4.0-1-parisc64-smp #5 Debian 4.4.2-2
[ 50.564000] task: 000000007f948b28 ti: 000000007fa90000 task.ti: 000000007fa90000
[ 50.652000]
[ 50.672000] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
[ 50.728000] PSW: 00001000000001001111100100001110 Not tainted
[ 50.796000] r00-03 000000ff0804f90e 00000000409ea2e0 00000000003e2ee0 000000007fa91140
[ 50.892000] r04-07 00000000003cd000 000000007f914300 000000007f914b10 0000000000000003
[ 50.988000] r08-11 0000000000000000 000000007f918000 0000000040bdd6b0 00000000003cd800
[ 51.084000] r12-15 0000000000000000 000000007fa90778 00000000003cd000 000000007f918000
[ 51.180000] r16-19 0000000000001300 0000000040bdd6b8 0000000040bdd6bc 0000000040ba2420
[ 51.276000] r20-23 0000000099116e92 0000000000000000 00000000000002a0 00000000000002ee
[ 51.372000] r24-27 0000000000000000 000000000800000e 0000000040b60750 00000000409b3ae0
[ 51.468000] r28-31 0000000000000002 000000007fa914f0 000000007fa911e0 0000000040ba2408
[ 51.564000] sr00-03 0000000000015000 0000000000000000 0000000000000000 0000000000015000
[ 51.660000] sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 51.756000]
[ 51.772000] IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000003e2f24 00000000003e2f28
[ 51.872000] IIR: 03ffe01f ISR: 0000000010340000 IOR: 000000fea4691528
[ 51.956000] CPU: 0 CR30: 000000007fa90000 CR31: 00000000ffff7dff
[ 52.040000] ORIG_R28: 0000000040b60718
[ 52.084000] IAOQ[0]: scsi_init_sgtable+0xfc/0x1b8 [scsi_mod]
[ 52.152000] IAOQ[1]: scsi_init_sgtable+0x100/0x1b8 [scsi_mod]
[ 52.224000] RP(r2): scsi_init_sgtable+0xb8/0x1b8 [scsi_mod]
[ 52.292000] Backtrace:
[ 52.320000] [<00000000003e304c>] scsi_init_io+0x6c/0x258 [scsi_mod]
[ 52.396000] [<000000000087d078>] sd_init_command+0x70/0xec8 [sd_mod]
In general I think the bug is somehow in blk-merge.c.
But I'm not an expert in that code.
Helge
^ permalink raw reply [flat|nested] 25+ messages in thread
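The three tests at the top of the modified __blk_segment_map_sg() decide whether a page is merged into the current scatterlist entry or starts a new one. The sketch below is a user-space model of only that decision, so the logged values can be replayed by hand. It is not kernel code: `merge_inputs`, `merge_result` and `classify_merge` are hypothetical names, and the model assumes the `*sg && *cluster` precondition already holds.

```c
#include <stdbool.h>

/* One merge decision, using the values printed by the debug printk()s. */
struct merge_inputs {
	unsigned int cur_len;      /* (*sg)->length of the current segment */
	unsigned int nbytes;       /* bvec->bv_len of the page being added */
	unsigned int max_seg_size; /* queue_max_segment_size(q) */
	bool phys_mergeable;       /* BIOVEC_PHYS_MERGEABLE(bvprv, bvec) */
	bool within_boundary;      /* BIOVEC_SEG_BOUNDARY(q, bvprv, bvec) */
};

enum merge_result {
	MERGED,                 /* page appended to the current segment */
	NEW_SEG_SIZE,           /* would exceed the maximum segment size */
	NEW_SEG_NOT_CONTIGUOUS, /* pages are not physically mergeable */
	NEW_SEG_BOUNDARY        /* merge would cross the segment boundary */
};

/* Same order of tests as in the quoted function. */
static enum merge_result classify_merge(const struct merge_inputs *in)
{
	if (in->cur_len + in->nbytes > in->max_seg_size)
		return NEW_SEG_SIZE;
	if (!in->phys_mergeable)
		return NEW_SEG_NOT_CONTIGUOUS;
	if (!in->within_boundary)
		return NEW_SEG_BOUNDARY;
	return MERGED;
}
```

Replaying the log with this model: every "NEW SEGMENT" entry before timestamp 49.440000 has BIOVEC_PHYS_MERGEABLE = 0 and classifies as NEW_SEG_NOT_CONTIGUOUS. The entry with length = 65536, nbytes = 4096 and BIOVEC_PHYS_MERGEABLE = 1 is the only one that classifies as NEW_SEG_SIZE, and it belongs to the request where nsegs (3) exceeds nr_phys_segments (2).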
* Re: SCSI bug
2016-02-20 21:59 ` Helge Deller
@ 2016-02-20 22:52 ` John David Anglin
2016-02-21 2:52 ` John David Anglin
0 siblings, 1 reply; 25+ messages in thread
From: John David Anglin @ 2016-02-20 22:52 UTC (permalink / raw)
To: Helge Deller; +Cc: linux-parisc List, James Bottomley
On 2016-02-20, at 4:59 PM, Helge Deller wrote:
> On 20.02.2016 21:43, John David Anglin wrote:
>> On 2016-02-20, at 3:13 PM, John David Anglin wrote:
>>
>>> On 2016-01-23, at 1:00 PM, John David Anglin wrote:
>>>
>>>> WARNING: at block/blk-merge.c:454
>>>
>>> With linux-image-4.4.0-1-parisc64-smp on c3740, the above warning is the last message I see.
>>> Kernel seems to hang at that point. This is warning code:
>>>
>>> 	/*
>>> 	 * Something must have been wrong if the figured number of
>>> 	 * segment is bigger than number of req's physical segments
>>> 	 */
>>> 	WARN_ON(nsegs > rq->nr_phys_segments);
>>
>> On Sep. 12, 2015, I reported the following problem:
>>
>> http://www.spinics.net/lists/linux-parisc/msg06327.html
>
> The problem is still that this bug is reproducible (at every boot) only when the SCSI drivers
> are built as modules and loaded from an initrd. I could never reproduce it when
> I booted a kernel with built-in SCSI drivers.
>
> The bug seems to be triggered by the (*nsegs)++ statement in __blk_segment_map_sg() in block/blk-merge.c.
> I'm testing with the 4.4.2 kernel from Debian.
> I modified __blk_segment_map_sg() like this:
> static inline void
> __blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
> 		     struct scatterlist *sglist, struct bio_vec *bvprv,
> 		     struct scatterlist **sg, int *nsegs, int *cluster)
> {
>
> 	int nbytes = bvec->bv_len;
>
> 	if (*sg && *cluster) {
> 		if ((*sg)->length + nbytes > queue_max_segment_size(q))
> 			goto new_segment;
>
> 		if (!BIOVEC_PHYS_MERGEABLE(bvprv, bvec))
> 			goto new_segment;
> 		if (!BIOVEC_SEG_BOUNDARY(q, bvprv, bvec))
> 			goto new_segment;
>
> 		(*sg)->length += nbytes;
> 	} else {
> new_segment:
> 		if (*sg && *cluster) {
> 			printk("NEW SEGMENT sg = %p!!!\n", sg);
> 			printk("__blk_segment_map_sg: length = %d, nbytes = %d, sum = %d > %d\n", (*sg)->length, nbytes, (*sg)->length + nbytes, queue_max_segment_size(q));
> 			printk("__blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = %d, BIOVEC_SEG_BOUNDARY = %d\n", BIOVEC_PHYS_MERGEABLE(bvprv, bvec), BIOVEC_SEG_BOUNDARY(q, bvprv, bvec) );
> 		}
> 		if (!*sg)
> 			*sg = sglist;
> 		else {
> 			/*
> 			 * If the driver previously mapped a shorter
> 			 * list, we could see a termination bit
> 			 * prematurely unless it fully inits the sg
> 			 * table on each mapping. We KNOW that there
> 			 * must be more entries here or the driver
> 			 * would be buggy, so force clear the
> 			 * termination bit to avoid doing a full
> 			 * sg_init_table() in drivers for each command.
> 			 */
> 			sg_unmark_end(*sg);
> 			*sg = sg_next(*sg);
> 		}
>
> 		sg_set_page(*sg, bvec->bv_page, nbytes, bvec->bv_offset);
> 		(*nsegs)++;
> 	}
> 	*bvprv = *bvec;
> }
>
> The boot log then looks like this:
> [ 43.044000] scsi_init_sgtable: count = 1, nents = 1
> (there are lots of those before it!)
> [ 43.164000] scsi_init_sgtable: nr_phys_segments = 1
> [ 43.164000] scsi_init_sgtable: count = 1, nents = 1
> [ 43.280000] scsi_init_sgtable: nr_phys_segments = 1
> [ 43.280000] scsi_init_sgtable: count = 1, nents = 1
> [ 43.396000] scsi_init_sgtable: nr_phys_segments = 1
> [ 43.396000] scsi_init_sgtable: count = 1, nents = 1
> [ 43.512000] scsi_init_sgtable: nr_phys_segments = 1
> [ 43.512000] scsi_init_sgtable: count = 1, nents = 1
> [ 43.628000] scsi_init_sgtable: nr_phys_segments = 3
> [ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!!
> [ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
> [ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
> [ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!!
> [ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
> [ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
> [ 43.628000] scsi_init_sgtable: count = 3, nents = 3
> [ 44.224000] scsi_init_sgtable: nr_phys_segments = 1
> [ 44.224000] scsi_init_sgtable: count = 1, nents = 1
> [ 44.340000] scsi_init_sgtable: nr_phys_segments = 1
> [ 44.340000] scsi_init_sgtable: count = 1, nents = 1
> [ 44.456000] scsi_init_sgtable: nr_phys_segments = 7
> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!!
> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!!
> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!!
> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!!
> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!!
> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!!
> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
> [ 44.456000] scsi_init_sgtable: count = 7, nents = 7
> [ 44.456000] timer_interrupt(CPU 0): delayed! cycles 4527081F rem C6C21 next/now 14E153306E/14E146C44D
> [ 46.116000] scsi_init_sgtable: nr_phys_segments = 7
> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!!
> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!!
> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!!
> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!!
> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 > [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! > [ 46.116000] __blk_segment_map_sg: length = 8192, nbytes = 4096, sum = 12288 > 65536 > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 > [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! > [ 46.116000] __blk_segment_map_sg: length = 16384, nbytes = 4096, sum = 20480 > 65536 > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 > [ 46.116000] scsi_init_sgtable: count = 7, nents = 7 > [ 46.116000] timer_interrupt(CPU 0): delayed! cycles 453F0A77 rem 223089 next/now 152BB6286E/152B93F7E5 > [ 47.780000] scsi_init_sgtable: nr_phys_segments = 1 > [ 47.780000] scsi_init_sgtable: count = 1, nents = 1 > [ 47.896000] scsi_init_sgtable: nr_phys_segments = 6 > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! > [ 47.896000] __blk_segment_map_sg: length = 61440, nbytes = 4096, sum = 65536 > 65536 > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! > [ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! > [ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! > [ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = 4096, sum = 12288 > 65536 > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! 
> [ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = 4096, sum = 12288 > 65536 > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 > [ 47.896000] scsi_init_sgtable: count = 6, nents = 6 > [ 47.896000] timer_interrupt(CPU 0): delayed! cycles 3AB087E2 rem 23E4DE next/now 1570BBD5EE/157097F110 > [ 49.324000] scsi_init_sgtable: nr_phys_segments = 1 > [ 49.324000] scsi_init_sgtable: count = 1, nents = 1 > [ 49.440000] scsi_init_sgtable: nr_phys_segments = 2 > [ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!! > [ 49.440000] __blk_segment_map_sg: length = 65536, nbytes = 4096, sum = 69632 > 65536 > > (this is interesting! Here we reach a sum of > 65536 the first time) > > [ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 1, BIOVEC_SEG_BOUNDARY = 1 > [ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!! > [ 49.440000] __blk_segment_map_sg: length = 16384, nbytes = 4096, sum = 20480 > 65536 > [ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 > [ 49.440000] *** FIXIT *** HELGE: nsegs > rq->nr_phys_segments = 3 > 2 > [ 49.440000] scsi_init_sgtable: count = 3, nents = 2 > [ 50.116000] ------------[ cut here ]------------ > [ 50.172000] WARNING: at /build/linux-4.4/linux-4.4.2/drivers/scsi/scsi_lib.c:1104 > > (this is usually a BUG(). I changed it to WARN() in the hope it would work anyway. It didn't.) 
> > [ 50.260000] Modules linked in: sd_mod sr_mod cdrom ata_generic ohci_pci ehci_pci ohci_hcd ehci_hcd pata_ns87415 sym53c8xx libata scsi_transport_spi scsi_mod usbcorep > [ 50.456000] CPU: 0 PID: 70 Comm: systemd-udevd Not tainted 4.4.0-1-parisc64-smp #5 Debian 4.4.2-2 > [ 50.564000] task: 000000007f948b28 ti: 000000007fa90000 task.ti: 000000007fa90000 > [ 50.652000] > [ 50.672000] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI > [ 50.728000] PSW: 00001000000001001111100100001110 Not tainted > [ 50.796000] r00-03 000000ff0804f90e 00000000409ea2e0 00000000003e2ee0 000000007fa91140 > [ 50.892000] r04-07 00000000003cd000 000000007f914300 000000007f914b10 0000000000000003 > [ 50.988000] r08-11 0000000000000000 000000007f918000 0000000040bdd6b0 00000000003cd800 > [ 51.084000] r12-15 0000000000000000 000000007fa90778 00000000003cd000 000000007f918000 > [ 51.180000] r16-19 0000000000001300 0000000040bdd6b8 0000000040bdd6bc 0000000040ba2420 > [ 51.276000] r20-23 0000000099116e92 0000000000000000 00000000000002a0 00000000000002ee > [ 51.372000] r24-27 0000000000000000 000000000800000e 0000000040b60750 00000000409b3ae0 > [ 51.468000] r28-31 0000000000000002 000000007fa914f0 000000007fa911e0 0000000040ba2408 > [ 51.564000] sr00-03 0000000000015000 0000000000000000 0000000000000000 0000000000015000 > [ 51.660000] sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > [ 51.756000] > [ 51.772000] IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000003e2f24 00000000003e2f28 > [ 51.872000] IIR: 03ffe01f ISR: 0000000010340000 IOR: 000000fea4691528 > [ 51.956000] CPU: 0 CR30: 000000007fa90000 CR31: 00000000ffff7dff > [ 52.040000] ORIG_R28: 0000000040b60718 > [ 52.084000] IAOQ[0]: scsi_init_sgtable+0xfc/0x1b8 [scsi_mod] > [ 52.152000] IAOQ[1]: scsi_init_sgtable+0x100/0x1b8 [scsi_mod] > [ 52.224000] RP(r2): scsi_init_sgtable+0xb8/0x1b8 [scsi_mod] > [ 52.292000] Backtrace: > [ 52.320000] [<00000000003e304c>] scsi_init_io+0x6c/0x258 [scsi_mod] > [ 52.396000] 
[<000000000087d078>] sd_init_command+0x70/0xec8 [sd_mod]
>
> In general I think the bug is somehow in blk-merge.c.
> But I'm not an expert in that code.

The warning was added in this patch sequence:
https://lkml.org/lkml/2015/11/23/996

Possibly, but the above seems to indicate that it could be a driver issue as well.

Dave
--
John David Anglin dave.anglin@bell.net
* Re: SCSI bug
2016-02-20 22:52 ` John David Anglin
@ 2016-02-21 2:52 ` John David Anglin
2016-02-21 3:47 ` James Bottomley
0 siblings, 1 reply; 25+ messages in thread
From: John David Anglin @ 2016-02-21 2:52 UTC (permalink / raw)
To: John David Anglin; +Cc: Helge Deller, linux-parisc List, James Bottomley

On 2016-02-20, at 5:52 PM, John David Anglin wrote:

> On 2016-02-20, at 4:59 PM, Helge Deller wrote:
>
>> On 20.02.2016 21:43, John David Anglin wrote:
>>> On 2016-02-20, at 3:13 PM, John David Anglin wrote:
>>>
>>>> On 2016-01-23, at 1:00 PM, John David Anglin wrote:
>>>>
>>>>> WARNING: at block/blk-merge.c:454
>>>>
>>>> With linux-image-4.4.0-1-parisc64-smp on c3740, the above warning is the last message I see.
>>>> Kernel seems to hang at that point. This is the warning code:
>>>>
>>>> /*
>>>>  * Something must have been wrong if the figured number of
>>>>  * segment is bigger than number of req's physical segments
>>>>  */
>>>> WARN_ON(nsegs > rq->nr_phys_segments);
>>>
>>> On Sep. 12, 2015, I reported the following problem:
>>>
>>> http://www.spinics.net/lists/linux-parisc/msg06327.html
>>
>> The problem is still that this bug can only be reproduced at every boot when the
>> scsi drivers are built as modules (and in an initrd). I could never reproduce it when
>> I booted a kernel with built-in scsi drivers.
>>
>> The bug seems to be triggered by the (*nsegs)++ statement in __blk_segment_map_sg() in block/blk-merge.c.
>> I'm testing with the 4.4.2 kernel from debian.
>> I modified __blk_segment_map_sg() like that:
>> static inline void
>> __blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
>>                      struct scatterlist *sglist, struct bio_vec *bvprv,
>>                      struct scatterlist **sg, int *nsegs, int *cluster)
>> {
>>
>>         int nbytes = bvec->bv_len;
>>
>>         if (*sg && *cluster) {
>>                 if ((*sg)->length + nbytes > queue_max_segment_size(q))
>>                         goto new_segment;
>>
>>                 if (!BIOVEC_PHYS_MERGEABLE(bvprv, bvec))
>>                         goto new_segment;
>>                 if (!BIOVEC_SEG_BOUNDARY(q, bvprv, bvec))
>>                         goto new_segment;
>>
>>                 (*sg)->length += nbytes;
>>         } else {
>> new_segment:
>>                 if (*sg && *cluster) {
>>                         printk("NEW SEGMENT sg = %p!!!\n", sg);
>>                         printk("__blk_segment_map_sg: length = %d, nbytes = %d, sum = %d > %d\n", (*sg)->length, nbytes, (*sg)->length + nbytes, queue_max_segment_size(q));
>>                         printk("__blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = %d, BIOVEC_SEG_BOUNDARY = %d\n", BIOVEC_PHYS_MERGEABLE(bvprv, bvec), BIOVEC_SEG_BOUNDARY(q, bvprv, bvec) );
>>                 }
>>                 if (!*sg)
>>                         *sg = sglist;
>>                 else {
>>                         /*
>>                          * If the driver previously mapped a shorter
>>                          * list, we could see a termination bit
>>                          * prematurely unless it fully inits the sg
>>                          * table on each mapping. We KNOW that there
>>                          * must be more entries here or the driver
>>                          * would be buggy, so force clear the
>>                          * termination bit to avoid doing a full
>>                          * sg_init_table() in drivers for each command.
>>                          */
>>                         sg_unmark_end(*sg);
>>                         *sg = sg_next(*sg);
>>                 }
>>
>>                 sg_set_page(*sg, bvec->bv_page, nbytes, bvec->bv_offset);
>>                 (*nsegs)++;
>>         }
>>         *bvprv = *bvec;
>> }
>>
>> The boot log looks then like this:
>> [ 43.044000] scsi_init_sgtable: count = 1, nents = 1
>> (there are lots of those before it!)
>> [ 43.164000] scsi_init_sgtable: nr_phys_segments = 1 >> [ 43.164000] scsi_init_sgtable: count = 1, nents = 1 >> [ 43.280000] scsi_init_sgtable: nr_phys_segments = 1 >> [ 43.280000] scsi_init_sgtable: count = 1, nents = 1 >> [ 43.396000] scsi_init_sgtable: nr_phys_segments = 1 >> [ 43.396000] scsi_init_sgtable: count = 1, nents = 1 >> [ 43.512000] scsi_init_sgtable: nr_phys_segments = 1 >> [ 43.512000] scsi_init_sgtable: count = 1, nents = 1 >> [ 43.628000] scsi_init_sgtable: nr_phys_segments = 3 >> [ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!! >> [ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 >> [ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!! >> [ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 >> [ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 43.628000] scsi_init_sgtable: count = 3, nents = 3 >> [ 44.224000] scsi_init_sgtable: nr_phys_segments = 1 >> [ 44.224000] scsi_init_sgtable: count = 1, nents = 1 >> [ 44.340000] scsi_init_sgtable: nr_phys_segments = 1 >> [ 44.340000] scsi_init_sgtable: count = 1, nents = 1 >> [ 44.456000] scsi_init_sgtable: nr_phys_segments = 7 >> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 >> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 >> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! 
>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 >> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 >> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 >> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 >> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 44.456000] scsi_init_sgtable: count = 7, nents = 7 >> [ 44.456000] timer_interrupt(CPU 0): delayed! cycles 4527081F rem C6C21 next/now 14E153306E/14E146C44D >> [ 46.116000] scsi_init_sgtable: nr_phys_segments = 7 >> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 >> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 >> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 >> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! 
>> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 >> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >> [ 46.116000] __blk_segment_map_sg: length = 8192, nbytes = 4096, sum = 12288 > 65536 >> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >> [ 46.116000] __blk_segment_map_sg: length = 16384, nbytes = 4096, sum = 20480 > 65536 >> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 46.116000] scsi_init_sgtable: count = 7, nents = 7 >> [ 46.116000] timer_interrupt(CPU 0): delayed! cycles 453F0A77 rem 223089 next/now 152BB6286E/152B93F7E5 >> [ 47.780000] scsi_init_sgtable: nr_phys_segments = 1 >> [ 47.780000] scsi_init_sgtable: count = 1, nents = 1 >> [ 47.896000] scsi_init_sgtable: nr_phys_segments = 6 >> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >> [ 47.896000] __blk_segment_map_sg: length = 61440, nbytes = 4096, sum = 65536 > 65536 >> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >> [ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 >> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >> [ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 >> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >> [ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = 4096, sum = 12288 > 65536 >> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! 
>> [ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = 4096, sum = 12288 > 65536 >> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 47.896000] scsi_init_sgtable: count = 6, nents = 6 >> [ 47.896000] timer_interrupt(CPU 0): delayed! cycles 3AB087E2 rem 23E4DE next/now 1570BBD5EE/157097F110 >> [ 49.324000] scsi_init_sgtable: nr_phys_segments = 1 >> [ 49.324000] scsi_init_sgtable: count = 1, nents = 1 >> [ 49.440000] scsi_init_sgtable: nr_phys_segments = 2 >> [ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!! >> [ 49.440000] __blk_segment_map_sg: length = 65536, nbytes = 4096, sum = 69632 > 65536 >> >> (this is interesting! Here we reach a sum of > 65536 the first time) >> >> [ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 1, BIOVEC_SEG_BOUNDARY = 1 >> [ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!! >> [ 49.440000] __blk_segment_map_sg: length = 16384, nbytes = 4096, sum = 20480 > 65536 >> [ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 >> [ 49.440000] *** FIXIT *** HELGE: nsegs > rq->nr_phys_segments = 3 > 2 >> [ 49.440000] scsi_init_sgtable: count = 3, nents = 2 >> [ 50.116000] ------------[ cut here ]------------ >> [ 50.172000] WARNING: at /build/linux-4.4/linux-4.4.2/drivers/scsi/scsi_lib.c:1104 >> >> (this is usually a BUG(). I changed it to WARN() in the hope it would work anyway. It didn't.) 
>> >> [ 50.260000] Modules linked in: sd_mod sr_mod cdrom ata_generic ohci_pci ehci_pci ohci_hcd ehci_hcd pata_ns87415 sym53c8xx libata scsi_transport_spi scsi_mod usbcorep >> [ 50.456000] CPU: 0 PID: 70 Comm: systemd-udevd Not tainted 4.4.0-1-parisc64-smp #5 Debian 4.4.2-2 >> [ 50.564000] task: 000000007f948b28 ti: 000000007fa90000 task.ti: 000000007fa90000 >> [ 50.652000] >> [ 50.672000] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI >> [ 50.728000] PSW: 00001000000001001111100100001110 Not tainted >> [ 50.796000] r00-03 000000ff0804f90e 00000000409ea2e0 00000000003e2ee0 000000007fa91140 >> [ 50.892000] r04-07 00000000003cd000 000000007f914300 000000007f914b10 0000000000000003 >> [ 50.988000] r08-11 0000000000000000 000000007f918000 0000000040bdd6b0 00000000003cd800 >> [ 51.084000] r12-15 0000000000000000 000000007fa90778 00000000003cd000 000000007f918000 >> [ 51.180000] r16-19 0000000000001300 0000000040bdd6b8 0000000040bdd6bc 0000000040ba2420 >> [ 51.276000] r20-23 0000000099116e92 0000000000000000 00000000000002a0 00000000000002ee >> [ 51.372000] r24-27 0000000000000000 000000000800000e 0000000040b60750 00000000409b3ae0 >> [ 51.468000] r28-31 0000000000000002 000000007fa914f0 000000007fa911e0 0000000040ba2408 >> [ 51.564000] sr00-03 0000000000015000 0000000000000000 0000000000000000 0000000000015000 >> [ 51.660000] sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >> [ 51.756000] >> [ 51.772000] IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000003e2f24 00000000003e2f28 >> [ 51.872000] IIR: 03ffe01f ISR: 0000000010340000 IOR: 000000fea4691528 >> [ 51.956000] CPU: 0 CR30: 000000007fa90000 CR31: 00000000ffff7dff >> [ 52.040000] ORIG_R28: 0000000040b60718 >> [ 52.084000] IAOQ[0]: scsi_init_sgtable+0xfc/0x1b8 [scsi_mod] >> [ 52.152000] IAOQ[1]: scsi_init_sgtable+0x100/0x1b8 [scsi_mod] >> [ 52.224000] RP(r2): scsi_init_sgtable+0xb8/0x1b8 [scsi_mod] >> [ 52.292000] Backtrace: >> [ 52.320000] [<00000000003e304c>] scsi_init_io+0x6c/0x258 [scsi_mod] >> 
[ 52.396000] [<000000000087d078>] sd_init_command+0x70/0xec8 [sd_mod]
>>
>> In general I think the bug is somehow in blk-merge.c.
>> But I'm not an expert in that code.
>
> The warning was added in this patch sequence:
> https://lkml.org/lkml/2015/11/23/996
>
> Possibly, but the above seems to indicate that it could be a driver issue as well.

I believe this bug was introduced by the following merge:

commit 1081230b748de8f03f37f80c53dfa89feda9b8de
Merge: df91039 2ca495a
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed Sep 2 13:10:25 2015 -0700

    Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block

    Pull core block updates from Jens Axboe:
     "This first core part of the block IO changes contains:

       - Cleanup of the bio IO error signaling from Christoph. We used to
         rely on the uptodate bit and passing around of an error, now we
         store the error in the bio itself.

       - Improvement of the above from myself, by shrinking the bio size
         down again to fit in two cachelines on x86-64.

       - Revert of the max_hw_sectors cap removal from a revision again,
         from Jeff Moyer. This caused performance regressions in various
         tests. Reinstate the limit, bump it to a more reasonable size
         instead.

       - Make /sys/block/<dev>/queue/discard_max_bytes writeable, by me.
         Most devices have huge trim limits, which can cause nasty latencies
         when deleting files. Enable the admin to configure the size down.
         We will look into having a more sane default instead of UINT_MAX
         sectors.

       - Improvement of the SGP gaps logic from Keith Busch.

       - Enable the block core to handle arbitrarily sized bios, which
         enables a nice simplification of bio_add_page() (which is an IO hot
         path). From Kent.

       - Improvements to the partition io stats accounting, making it
         faster. From Ming Lei.

       - Also from Ming Lei, a basic fixup for overflow of the sysfs pending
         file in blk-mq, as well as a fix for a blk-mq timeout race
         condition.

       - Ming Lin has been carrying Kents above mentioned patches forward
         for a while, and testing them. Ming also did a few fixes around
         that.

       - Sasha Levin found and fixed a use-after-free problem introduced by
         the bio->bi_error changes from Christoph.

       - Small blk cgroup cleanup from Viresh Kumar"

    * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits)
      blk: Fix bio_io_vec index when checking bvec gaps
      block: Replace SG_GAPS with new queue limits mask
      block: bump BLK_DEF_MAX_SECTORS to 2560
      Revert "block: remove artifical max_hw_sectors cap"
      blk-mq: fix race between timeout and freeing request
      blk-mq: fix buffer overflow when reading sysfs file of 'pending'
      Documentation: update notes in biovecs about arbitrarily sized bios
      block: remove bio_get_nr_vecs()
      fs: use helper bio_add_page() instead of open coding on bi_io_vec
      block: kill merge_bvec_fn() completely
      md/raid5: get rid of bio_fits_rdev()
      md/raid5: split bio for chunk_aligned_read
      block: remove split code in blkdev_issue_{discard,write_same}
      btrfs: remove bio splitting and merge_bvec_fn() calls
      bcache: remove driver private bio splitting code
      block: simplify bio_add_page()
      block: make generic_make_request handle arbitrarily sized bios
      blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL)
      block: don't access bio->bi_error after bio_put()
      block: shrink struct bio down to 2 cache lines again
      ...

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1081230b748de8f03f37f80c53dfa89feda9b8de

Dave
--
John David Anglin dave.anglin@bell.net
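For readers following the numbers in the boot log, the three merge tests in the quoted __blk_segment_map_sg() can be modeled in userspace. What follows is only a minimal sketch, not kernel code: struct chunk, count_mapped_segments(), segment_count_overflows(), example_failing_request() and all addresses are hypothetical names and values made up for illustration, the two macro tests are modeled on my reading of the generic BIOVEC_PHYS_MERGEABLE and BIOVEC_SEG_BOUNDARY definitions, and clustering is assumed to be enabled. It reproduces only the arithmetic visible in the log (a full 65536-byte segment, a contiguous page that no longer fits, then a non-contiguous page) and says nothing about why nr_phys_segments was 2 for that request.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one bio_vec: a physical address and a length. */
struct chunk {
    unsigned long phys;  /* physical start address of the chunk */
    unsigned int len;    /* length in bytes */
};

/*
 * Count the scatterlist entries a mapping pass would produce, applying the
 * same three tests as the quoted function (clustering assumed enabled):
 *   1. the merged length must not exceed max_seg
 *   2. the chunk must start where the previous chunk ended
 *   3. previous start and new end must fall inside one boundary window
 */
static int count_mapped_segments(const struct chunk *c, size_t n,
                                 unsigned int max_seg,
                                 unsigned long boundary_mask)
{
    int nsegs = 0;
    unsigned int seg_len = 0;  /* length of the segment being built */

    assert(max_seg > 0);

    for (size_t i = 0; i < n; i++) {
        int merge = 0;

        if (nsegs > 0) {
            const struct chunk *prev = &c[i - 1];
            unsigned long end = c[i].phys + c[i].len;
            int fits = seg_len + c[i].len <= max_seg;
            int contiguous = prev->phys + prev->len == c[i].phys;
            int in_window = (prev->phys | boundary_mask) ==
                            ((end - 1) | boundary_mask);

            merge = fits && contiguous && in_window;
        }

        if (merge) {
            seg_len += c[i].len;
        } else {
            seg_len = c[i].len;  /* start a new segment */
            nsegs++;
        }
    }
    return nsegs;
}

/* Same comparison as the quoted WARN_ON(nsegs > rq->nr_phys_segments). */
static int segment_count_overflows(int nsegs, int nr_phys_segments)
{
    return nsegs > nr_phys_segments;
}

/*
 * Hypothetical layout modeled on the failing request in the log:
 * 20 physically contiguous 4 KiB pages followed by one page elsewhere.
 * The addresses are invented for illustration.
 */
static int example_failing_request(void)
{
    struct chunk c[21];

    for (size_t i = 0; i < 20; i++) {
        c[i].phys = 0x100000UL + i * 4096UL;
        c[i].len = 4096;
    }
    c[20].phys = 0x800000UL;
    c[20].len = 4096;

    return count_mapped_segments(c, 21, 65536, 0xffffffffUL);
}
```

With these assumed inputs the first 16 pages fill one 65536-byte segment, the next 4 contiguous pages form a 16384-byte segment, and the last page starts a third one, which matches the "3 > 2" line in the log when compared against nr_phys_segments = 2.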
* Re: SCSI bug
2016-02-21 2:52 ` John David Anglin
@ 2016-02-21 3:47 ` James Bottomley
2016-02-21 14:45 ` John David Anglin
2016-02-21 18:09 ` John David Anglin
0 siblings, 2 replies; 25+ messages in thread
From: James Bottomley @ 2016-02-21 3:47 UTC (permalink / raw)
To: John David Anglin; +Cc: Helge Deller, linux-parisc List

On Sat, 2016-02-20 at 21:52 -0500, John David Anglin wrote:
> On 2016-02-20, at 5:52 PM, John David Anglin wrote:
>
> > On 2016-02-20, at 4:59 PM, Helge Deller wrote:
> >
> > > On 20.02.2016 21:43, John David Anglin wrote:
> > > > On 2016-02-20, at 3:13 PM, John David Anglin wrote:
> > > >
> > > > > On 2016-01-23, at 1:00 PM, John David Anglin wrote:
> > > > >
> > > > > > WARNING: at block/blk-merge.c:454
> > > > >
> > > > > With linux-image-4.4.0-1-parisc64-smp on c3740, the above warning is the last message I see.
> > > > > Kernel seems to hang at that point. This is the warning code:
> > > > >
> > > > > /*
> > > > >  * Something must have been wrong if the figured number of
> > > > >  * segment is bigger than number of req's physical segments
> > > > >  */
> > > > > WARN_ON(nsegs > rq->nr_phys_segments);
> > > >
> > > > On Sep. 12, 2015, I reported the following problem:
> > > >
> > > > http://www.spinics.net/lists/linux-parisc/msg06327.html
> > >
> > > The problem is still that this bug can only be reproduced at every boot when the
> > > scsi drivers are built as modules (and in an initrd). I could never reproduce it when
> > > I booted a kernel with built-in scsi drivers.
> > >
> > > The bug seems to be triggered by the (*nsegs)++ statement in __blk_segment_map_sg() in block/blk-merge.c.
> > > I'm testing with the 4.4.2 kernel from debian.
> > > I modified __blk_segment_map_sg() like that: > > > static inline void > > > __blk_segment_map_sg(struct request_queue *q, struct bio_vec > > > *bvec, > > > struct scatterlist *sglist, struct bio_vec > > > *bvprv, > > > struct scatterlist **sg, int *nsegs, int > > > *cluster) > > > { > > > > > > int nbytes = bvec->bv_len; > > > > > > if (*sg && *cluster) { > > > if ((*sg)->length + nbytes > > > > queue_max_segment_size(q)) > > > goto new_segment; > > > > > > if (!BIOVEC_PHYS_MERGEABLE(bvprv, bvec)) > > > goto new_segment; > > > if (!BIOVEC_SEG_BOUNDARY(q, bvprv, bvec)) > > > goto new_segment; > > > > > > (*sg)->length += nbytes; > > > } else { > > > new_segment: > > > if (*sg && *cluster) { > > > printk("NEW SEGMENT sg = %p!!!\n", sg); > > > printk("__blk_segment_map_sg: length = %d, > > > nbytes = %d, sum = %d > %d\n", (*sg)->length, nbytes, (*sg) > > > ->length + nbytes, queue_max_segment_size(q)); > > > printk("__blk_segment_map_sg: > > > BIOVEC_PHYS_MERGEABLE = %d, BIOVEC_SEG_BOUNDARY = %d\n", > > > BIOVEC_PHYS_MERGEABLE(bvprv, bvec), BIOVEC_SEG_BOUNDARY(q, bvprv, > > > bvec) ); > > > } > > > if (!*sg) > > > *sg = sglist; > > > else { > > > /* > > > * If the driver previously mapped a > > > shorter > > > * list, we could see a termination bit > > > * prematurely unless it fully inits the sg > > > * table on each mapping. We KNOW that > > > there > > > * must be more entries here or the driver > > > * would be buggy, so force clear the > > > * termination bit to avoid doing a full > > > * sg_init_table() in drivers for each > > > command. > > > */ > > > sg_unmark_end(*sg); > > > *sg = sg_next(*sg); > > > } > > > > > > sg_set_page(*sg, bvec->bv_page, nbytes, bvec > > > ->bv_offset); > > > (*nsegs)++; > > > } > > > *bvprv = *bvec; > > > } > > > > > > The boot log looks then like this: > > > [ 43.044000] scsi_init_sgtable: count = 1, nents = 1 > > > (there are lots of those before it!) 
> > > [ 43.164000] scsi_init_sgtable: nr_phys_segments = 1 > > > [ 43.164000] scsi_init_sgtable: count = 1, nents = 1 > > > [ 43.280000] scsi_init_sgtable: nr_phys_segments = 1 > > > [ 43.280000] scsi_init_sgtable: count = 1, nents = 1 > > > [ 43.396000] scsi_init_sgtable: nr_phys_segments = 1 > > > [ 43.396000] scsi_init_sgtable: count = 1, nents = 1 > > > [ 43.512000] scsi_init_sgtable: nr_phys_segments = 1 > > > [ 43.512000] scsi_init_sgtable: count = 1, nents = 1 > > > [ 43.628000] scsi_init_sgtable: nr_phys_segments = 3 > > > [ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!! > > > [ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = > > > 4096, sum = 8192 > 65536 > > > [ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!! > > > [ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = > > > 4096, sum = 8192 > 65536 > > > [ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 43.628000] scsi_init_sgtable: count = 3, nents = 3 > > > [ 44.224000] scsi_init_sgtable: nr_phys_segments = 1 > > > [ 44.224000] scsi_init_sgtable: count = 1, nents = 1 > > > [ 44.340000] scsi_init_sgtable: nr_phys_segments = 1 > > > [ 44.340000] scsi_init_sgtable: count = 1, nents = 1 > > > [ 44.456000] scsi_init_sgtable: nr_phys_segments = 7 > > > [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = > > > 4096, sum = 8192 > 65536 > > > [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = > > > 4096, sum = 8192 > 65536 > > > [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! 
> > > [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = > > > 4096, sum = 8192 > 65536 > > > [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = > > > 4096, sum = 8192 > 65536 > > > [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = > > > 4096, sum = 8192 > 65536 > > > [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = > > > 4096, sum = 8192 > 65536 > > > [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 44.456000] scsi_init_sgtable: count = 7, nents = 7 > > > [ 44.456000] timer_interrupt(CPU 0): delayed! cycles 4527081F > > > rem C6C21 next/now 14E153306E/14E146C44D > > > [ 46.116000] scsi_init_sgtable: nr_phys_segments = 7 > > > [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = > > > 4096, sum = 8192 > 65536 > > > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = > > > 4096, sum = 8192 > 65536 > > > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = > > > 4096, sum = 8192 > 65536 > > > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! 
> > > [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = > > > 4096, sum = 8192 > 65536 > > > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > [ 46.116000] __blk_segment_map_sg: length = 8192, nbytes = > > > 4096, sum = 12288 > 65536 > > > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > [ 46.116000] __blk_segment_map_sg: length = 16384, nbytes = > > > 4096, sum = 20480 > 65536 > > > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 46.116000] scsi_init_sgtable: count = 7, nents = 7 > > > [ 46.116000] timer_interrupt(CPU 0): delayed! cycles 453F0A77 > > > rem 223089 next/now 152BB6286E/152B93F7E5 > > > [ 47.780000] scsi_init_sgtable: nr_phys_segments = 1 > > > [ 47.780000] scsi_init_sgtable: count = 1, nents = 1 > > > [ 47.896000] scsi_init_sgtable: nr_phys_segments = 6 > > > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! > > > [ 47.896000] __blk_segment_map_sg: length = 61440, nbytes = > > > 4096, sum = 65536 > 65536 > > > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! > > > [ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = > > > 4096, sum = 8192 > 65536 > > > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! > > > [ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = > > > 4096, sum = 8192 > 65536 > > > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! 
> > > [ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = > > > 4096, sum = 12288 > 65536 > > > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! > > > [ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = > > > 4096, sum = 12288 > 65536 > > > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 47.896000] scsi_init_sgtable: count = 6, nents = 6 > > > [ 47.896000] timer_interrupt(CPU 0): delayed! cycles 3AB087E2 > > > rem 23E4DE next/now 1570BBD5EE/157097F110 > > > [ 49.324000] scsi_init_sgtable: nr_phys_segments = 1 > > > [ 49.324000] scsi_init_sgtable: count = 1, nents = 1 > > > [ 49.440000] scsi_init_sgtable: nr_phys_segments = 2 > > > [ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!! > > > [ 49.440000] __blk_segment_map_sg: length = 65536, nbytes = > > > 4096, sum = 69632 > 65536 > > > > > > (this is interesting! Here we reach a sum of > 65536 the first > > > time) > > > > > > [ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 1, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!! > > > [ 49.440000] __blk_segment_map_sg: length = 16384, nbytes = > > > 4096, sum = 20480 > 65536 > > > [ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > > > BIOVEC_SEG_BOUNDARY = 1 > > > [ 49.440000] *** FIXIT *** HELGE: nsegs > rq->nr_phys_segments > > > = 3 > 2 > > > [ 49.440000] scsi_init_sgtable: count = 3, nents = 2 > > > [ 50.116000] ------------[ cut here ]------------ > > > [ 50.172000] WARNING: at /build/linux-4.4/linux > > > -4.4.2/drivers/scsi/scsi_lib.c:1104 > > > > > > (this is usually a BUG(). I changed it to WARN() in the hope it > > > would work anyway. It didn't.) 
> > > > > > [ 50.260000] Modules linked in: sd_mod sr_mod cdrom ata_generic > > > ohci_pci ehci_pci ohci_hcd ehci_hcd pata_ns87415 sym53c8xx libata > > > scsi_transport_spi scsi_mod usbcorep > > > [ 50.456000] CPU: 0 PID: 70 Comm: systemd-udevd Not tainted > > > 4.4.0-1-parisc64-smp #5 Debian 4.4.2-2 > > > [ 50.564000] task: 000000007f948b28 ti: 000000007fa90000 > > > task.ti: 000000007fa90000 > > > [ 50.652000] > > > [ 50.672000] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI > > > [ 50.728000] PSW: 00001000000001001111100100001110 Not tainted > > > [ 50.796000] r00-03 000000ff0804f90e 00000000409ea2e0 > > > 00000000003e2ee0 000000007fa91140 > > > [ 50.892000] r04-07 00000000003cd000 000000007f914300 > > > 000000007f914b10 0000000000000003 > > > [ 50.988000] r08-11 0000000000000000 000000007f918000 > > > 0000000040bdd6b0 00000000003cd800 > > > [ 51.084000] r12-15 0000000000000000 000000007fa90778 > > > 00000000003cd000 000000007f918000 > > > [ 51.180000] r16-19 0000000000001300 0000000040bdd6b8 > > > 0000000040bdd6bc 0000000040ba2420 > > > [ 51.276000] r20-23 0000000099116e92 0000000000000000 > > > 00000000000002a0 00000000000002ee > > > [ 51.372000] r24-27 0000000000000000 000000000800000e > > > 0000000040b60750 00000000409b3ae0 > > > [ 51.468000] r28-31 0000000000000002 000000007fa914f0 > > > 000000007fa911e0 0000000040ba2408 > > > [ 51.564000] sr00-03 0000000000015000 0000000000000000 > > > 0000000000000000 0000000000015000 > > > [ 51.660000] sr04-07 0000000000000000 0000000000000000 > > > 0000000000000000 0000000000000000 > > > [ 51.756000] > > > [ 51.772000] IASQ: 0000000000000000 0000000000000000 IAOQ: > > > 00000000003e2f24 00000000003e2f28 > > > [ 51.872000] IIR: 03ffe01f ISR: 0000000010340000 IOR: > > > 000000fea4691528 > > > [ 51.956000] CPU: 0 CR30: 000000007fa90000 CR31: > > > 00000000ffff7dff > > > [ 52.040000] ORIG_R28: 0000000040b60718 > > > [ 52.084000] IAOQ[0]: scsi_init_sgtable+0xfc/0x1b8 [scsi_mod] > > > [ 52.152000] IAOQ[1]: scsi_init_sgtable+0x100/0x1b8 
[scsi_mod] > > > [ 52.224000] RP(r2): scsi_init_sgtable+0xb8/0x1b8 [scsi_mod] > > > [ 52.292000] Backtrace: > > > [ 52.320000] [<00000000003e304c>] scsi_init_io+0x6c/0x258 > > > [scsi_mod] > > > [ 52.396000] [<000000000087d078>] sd_init_command+0x70/0xec8 > > > [sd_mod] > > > > > > In general I think the bug is somehow in blk-merge.c. > > > But I'm not an expert in that code. > > > > The warning was added in this patch sequence: > > https://lkml.org/lkml/2015/11/23/996 > > > > Possibly, but above seems to indicate that it could be driver issue > > as well. > > > I believe this bug was introduced by the following merge: > > commit 1081230b748de8f03f37f80c53dfa89feda9b8de > Merge: df91039 2ca495a > Author: Linus Torvalds <torvalds@linux-foundation.org> > Date: Wed Sep 2 13:10:25 2015 -0700 > > Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block > > Pull core block updates from Jens Axboe: > "This first core part of the block IO changes contains: > > - Cleanup of the bio IO error signaling from Christoph. We > used to > rely on the uptodate bit and passing around of an error, now > we > store the error in the bio itself. > > - Improvement of the above from myself, by shrinking the bio > size > down again to fit in two cachelines on x86-64. > > - Revert of the max_hw_sectors cap removal from a revision > again, > from Jeff Moyer. This caused performance regressions in > various > tests. Reinstate the limit, bump it to a more reasonable > size > instead. > > - Make /sys/block/<dev>/queue/discard_max_bytes writeable, by > me. > Most devices have huge trim limits, which can cause nasty > latencies > when deleting files. Enable the admin to configure the size > down. > We will look into having a more sane default instead of > UINT_MAX > sectors. > > - Improvement of the SGP gaps logic from Keith Busch. > > - Enable the block core to handle arbitrarily sized bios, > which > enables a nice simplification of bio_add_page() (which is an > IO hot > path). From Kent. 
> > - Improvements to the partition io stats accounting, making it > faster. From Ming Lei. > > - Also from Ming Lei, a basic fixup for overflow of the sysfs > pending > file in blk-mq, as well as a fix for a blk-mq timeout race > condition. > > - Ming Lin has been carrying Kents above mentioned patches > forward > for a while, and testing them. Ming also did a few fixes > around > that. > > - Sasha Levin found and fixed a use-after-free problem > introduced by > the bio->bi_error changes from Christoph. > > - Small blk cgroup cleanup from Viresh Kumar" > > * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits) > blk: Fix bio_io_vec index when checking bvec gaps > block: Replace SG_GAPS with new queue limits mask > block: bump BLK_DEF_MAX_SECTORS to 2560 > Revert "block: remove artifical max_hw_sectors cap" > blk-mq: fix race between timeout and freeing request > blk-mq: fix buffer overflow when reading sysfs file of > 'pending' > Documentation: update notes in biovecs about arbitrarily sized > bios > block: remove bio_get_nr_vecs() > fs: use helper bio_add_page() instead of open coding on > bi_io_vec > block: kill merge_bvec_fn() completely > md/raid5: get rid of bio_fits_rdev() > md/raid5: split bio for chunk_aligned_read > block: remove split code in blkdev_issue_{discard,write_same} > btrfs: remove bio splitting and merge_bvec_fn() calls > bcache: remove driver private bio splitting code > block: simplify bio_add_page() > block: make generic_make_request handle arbitrarily sized bios > blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL) > block: don't access bio->bi_error after bio_put() > block: shrink struct bio down to 2 cache lines again > ... If you can bisect it down to the exact commit, I might be able to work out what's the problem. Otherwise, even in an all modular config, I can't reproduce this on 4.5-rc4, so it may be fixed upstream (just not backported). James
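For readers following the debug trace quoted in this thread, here is a minimal, self-contained C sketch of how a segment-mapping pass and a segment-counting pass can disagree. It is not kernel code and not a diagnosis of the bug. The names struct chunk, phys_mergeable(), map_segments(), count_segments_no_limit() and fill_trace_like_layout() are hypothetical, the BIOVEC_SEG_BOUNDARY check is left out, clustering is assumed to be enabled, and the counting pass that ignores the maximum segment size is purely an assumption chosen so that the numbers line up. The layout (20 physically adjacent 4096-byte pages followed by one non-adjacent page, with a 65536-byte limit) was picked to be consistent with the last request in the log, where the mapping pass ends up with 3 segments against nr_phys_segments = 2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bio_vec: physical start address plus length. */
struct chunk {
    unsigned long phys;
    unsigned int len;
};

/* In this sketch two chunks may merge only if they are physically adjacent. */
int phys_mergeable(const struct chunk *prev, const struct chunk *cur)
{
    return prev->phys + prev->len == cur->phys;
}

/*
 * Mapping pass, following the decision order of the quoted
 * __blk_segment_map_sg(): open a new segment when the size limit would be
 * exceeded or when the chunks are not adjacent, otherwise grow the current
 * segment. The segment boundary check is omitted (assumption).
 */
int map_segments(const struct chunk *v, size_t n, unsigned int max_seg)
{
    int nsegs = 0;
    unsigned int seg_len = 0;
    size_t i;

    assert(max_seg > 0);
    for (i = 0; i < n; i++) {
        if (i > 0 && seg_len + v[i].len <= max_seg &&
            phys_mergeable(&v[i - 1], &v[i])) {
            seg_len += v[i].len;      /* merge into the current segment */
        } else {
            seg_len = v[i].len;       /* start a new segment */
            nsegs++;
        }
    }
    return nsegs;
}

/*
 * Hypothetical counting pass that only looks at adjacency and ignores the
 * size limit. This is an assumption for illustration, not the kernel's
 * actual counting code.
 */
int count_segments_no_limit(const struct chunk *v, size_t n)
{
    int nsegs = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (i == 0 || !phys_mergeable(&v[i - 1], &v[i]))
            nsegs++;
    return nsegs;
}

/* Fill 21 entries: 20 adjacent 4 KiB pages, then one page somewhere else. */
void fill_trace_like_layout(struct chunk *v)
{
    size_t i;

    for (i = 0; i < 20; i++) {
        v[i].phys = 0x100000UL + i * 4096UL;
        v[i].len = 4096;
    }
    v[20].phys = 0x900000UL;          /* not adjacent to the run above */
    v[20].len = 4096;
}
```

With that layout, map_segments(v, 21, 65536) yields 3 (a full 65536-byte segment, a 16384-byte segment, and the stray page), while count_segments_no_limit(v, 21) yields 2, which is the same 3 > 2 mismatch the log reports. Whether the real counting path behaves this way is exactly what a bisect would need to establish.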
* Re: SCSI bug 2016-02-21 3:47 ` James Bottomley @ 2016-02-21 14:45 ` John David Anglin 2016-02-21 18:10 ` James Bottomley 2016-02-21 18:09 ` John David Anglin 1 sibling, 1 reply; 25+ messages in thread From: John David Anglin @ 2016-02-21 14:45 UTC (permalink / raw) To: James Bottomley; +Cc: Helge Deller, linux-parisc List On 2016-02-20, at 10:47 PM, James Bottomley wrote: > On Sat, 2016-02-20 at 21:52 -0500, John David Anglin wrote: >> On 2016-02-20, at 5:52 PM, John David Anglin wrote: >> >>> On 2016-02-20, at 4:59 PM, Helge Deller wrote: >>> >>>> On 20.02.2016 21:43, John David Anglin wrote: >>>>> On 2016-02-20, at 3:13 PM, John David Anglin wrote: >>>>> >>>>>> On 2016-01-23, at 1:00 PM, John David Anglin wrote: >>>>>> >>>>>>> WARNING: at block/blk-merge.c:454 >>>>>> >>>>>> With linux-image-4.4.0-1-parisc64-smp on c3740, the above >>>>>> warning is the last message I see. >>>>>> Kernel seems to hang at that point. This is warning code: >>>>>> >>>>>> /* >>>>>> * Something must have been wrong if the figured number >>>>>> of >>>>>> * segment is bigger than number of req's physical >>>>>> segments >>>>>> */ >>>>>> WARN_ON(nsegs > rq->nr_phys_segments); >>>>> >>>>> On Sep. 12, 2015, I reported the following problem: >>>>> >>>>> http://www.spinics.net/lists/linux-parisc/msg06327.html >>>> >>>> The problem is still, that this bug can only be reproduced at >>>> every boot when then >>>> scsi drivers are built as modules (and in an initrd). I could >>>> never reproduce it when >>>> I booted a kernel with built-in scsi drivers. >>>> >>>> The bug seems to be triggered by(*nsegs)++ command in >>>> __blk_segment_map_sg() in block/blk-merge.c. >>>> I'm testing with the 4.4.2 kernel from debian. 
>>>> I modified __blk_segment_map_sg() like that: >>>> static inline void >>>> __blk_segment_map_sg(struct request_queue *q, struct bio_vec >>>> *bvec, >>>> struct scatterlist *sglist, struct bio_vec >>>> *bvprv, >>>> struct scatterlist **sg, int *nsegs, int >>>> *cluster) >>>> { >>>> >>>> int nbytes = bvec->bv_len; >>>> >>>> if (*sg && *cluster) { >>>> if ((*sg)->length + nbytes > >>>> queue_max_segment_size(q)) >>>> goto new_segment; >>>> >>>> if (!BIOVEC_PHYS_MERGEABLE(bvprv, bvec)) >>>> goto new_segment; >>>> if (!BIOVEC_SEG_BOUNDARY(q, bvprv, bvec)) >>>> goto new_segment; >>>> >>>> (*sg)->length += nbytes; >>>> } else { >>>> new_segment: >>>> if (*sg && *cluster) { >>>> printk("NEW SEGMENT sg = %p!!!\n", sg); >>>> printk("__blk_segment_map_sg: length = %d, >>>> nbytes = %d, sum = %d > %d\n", (*sg)->length, nbytes, (*sg) >>>> ->length + nbytes, queue_max_segment_size(q)); >>>> printk("__blk_segment_map_sg: >>>> BIOVEC_PHYS_MERGEABLE = %d, BIOVEC_SEG_BOUNDARY = %d\n", >>>> BIOVEC_PHYS_MERGEABLE(bvprv, bvec), BIOVEC_SEG_BOUNDARY(q, bvprv, >>>> bvec) ); >>>> } >>>> if (!*sg) >>>> *sg = sglist; >>>> else { >>>> /* >>>> * If the driver previously mapped a >>>> shorter >>>> * list, we could see a termination bit >>>> * prematurely unless it fully inits the sg >>>> * table on each mapping. We KNOW that >>>> there >>>> * must be more entries here or the driver >>>> * would be buggy, so force clear the >>>> * termination bit to avoid doing a full >>>> * sg_init_table() in drivers for each >>>> command. >>>> */ >>>> sg_unmark_end(*sg); >>>> *sg = sg_next(*sg); >>>> } >>>> >>>> sg_set_page(*sg, bvec->bv_page, nbytes, bvec >>>> ->bv_offset); >>>> (*nsegs)++; >>>> } >>>> *bvprv = *bvec; >>>> } >>>> >>>> The boot log looks then like this: >>>> [ 43.044000] scsi_init_sgtable: count = 1, nents = 1 >>>> (there are lots of those before it!) 
>>>> [ 43.164000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 43.164000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 43.280000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 43.280000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 43.396000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 43.396000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 43.512000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 43.512000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 43.628000] scsi_init_sgtable: nr_phys_segments = 3 >>>> [ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 43.628000] scsi_init_sgtable: count = 3, nents = 3 >>>> [ 44.224000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 44.224000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 44.340000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 44.340000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 44.456000] scsi_init_sgtable: nr_phys_segments = 7 >>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! 
>>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 44.456000] scsi_init_sgtable: count = 7, nents = 7 >>>> [ 44.456000] timer_interrupt(CPU 0): delayed! cycles 4527081F >>>> rem C6C21 next/now 14E153306E/14E146C44D >>>> [ 46.116000] scsi_init_sgtable: nr_phys_segments = 7 >>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! 
>>>> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 46.116000] __blk_segment_map_sg: length = 8192, nbytes = >>>> 4096, sum = 12288 > 65536 >>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 46.116000] __blk_segment_map_sg: length = 16384, nbytes = >>>> 4096, sum = 20480 > 65536 >>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 46.116000] scsi_init_sgtable: count = 7, nents = 7 >>>> [ 46.116000] timer_interrupt(CPU 0): delayed! cycles 453F0A77 >>>> rem 223089 next/now 152BB6286E/152B93F7E5 >>>> [ 47.780000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 47.780000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 47.896000] scsi_init_sgtable: nr_phys_segments = 6 >>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 47.896000] __blk_segment_map_sg: length = 61440, nbytes = >>>> 4096, sum = 65536 > 65536 >>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! 
>>>> [ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = >>>> 4096, sum = 12288 > 65536 >>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = >>>> 4096, sum = 12288 > 65536 >>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 47.896000] scsi_init_sgtable: count = 6, nents = 6 >>>> [ 47.896000] timer_interrupt(CPU 0): delayed! cycles 3AB087E2 >>>> rem 23E4DE next/now 1570BBD5EE/157097F110 >>>> [ 49.324000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 49.324000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 49.440000] scsi_init_sgtable: nr_phys_segments = 2 >>>> [ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 49.440000] __blk_segment_map_sg: length = 65536, nbytes = >>>> 4096, sum = 69632 > 65536 >>>> >>>> (this is interesting! Here we reach a sum of > 65536 the first >>>> time) >>>> >>>> [ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 1, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 49.440000] __blk_segment_map_sg: length = 16384, nbytes = >>>> 4096, sum = 20480 > 65536 >>>> [ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 49.440000] *** FIXIT *** HELGE: nsegs > rq->nr_phys_segments >>>> = 3 > 2 >>>> [ 49.440000] scsi_init_sgtable: count = 3, nents = 2 >>>> [ 50.116000] ------------[ cut here ]------------ >>>> [ 50.172000] WARNING: at /build/linux-4.4/linux >>>> -4.4.2/drivers/scsi/scsi_lib.c:1104 >>>> >>>> (this is usually a BUG(). I changed it to WARN() in the hope it >>>> would work anyway. It didn't.) 
>>>> >>>> [ 50.260000] Modules linked in: sd_mod sr_mod cdrom ata_generic >>>> ohci_pci ehci_pci ohci_hcd ehci_hcd pata_ns87415 sym53c8xx libata >>>> scsi_transport_spi scsi_mod usbcorep >>>> [ 50.456000] CPU: 0 PID: 70 Comm: systemd-udevd Not tainted >>>> 4.4.0-1-parisc64-smp #5 Debian 4.4.2-2 >>>> [ 50.564000] task: 000000007f948b28 ti: 000000007fa90000 >>>> task.ti: 000000007fa90000 >>>> [ 50.652000] >>>> [ 50.672000] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI >>>> [ 50.728000] PSW: 00001000000001001111100100001110 Not tainted >>>> [ 50.796000] r00-03 000000ff0804f90e 00000000409ea2e0 >>>> 00000000003e2ee0 000000007fa91140 >>>> [ 50.892000] r04-07 00000000003cd000 000000007f914300 >>>> 000000007f914b10 0000000000000003 >>>> [ 50.988000] r08-11 0000000000000000 000000007f918000 >>>> 0000000040bdd6b0 00000000003cd800 >>>> [ 51.084000] r12-15 0000000000000000 000000007fa90778 >>>> 00000000003cd000 000000007f918000 >>>> [ 51.180000] r16-19 0000000000001300 0000000040bdd6b8 >>>> 0000000040bdd6bc 0000000040ba2420 >>>> [ 51.276000] r20-23 0000000099116e92 0000000000000000 >>>> 00000000000002a0 00000000000002ee >>>> [ 51.372000] r24-27 0000000000000000 000000000800000e >>>> 0000000040b60750 00000000409b3ae0 >>>> [ 51.468000] r28-31 0000000000000002 000000007fa914f0 >>>> 000000007fa911e0 0000000040ba2408 >>>> [ 51.564000] sr00-03 0000000000015000 0000000000000000 >>>> 0000000000000000 0000000000015000 >>>> [ 51.660000] sr04-07 0000000000000000 0000000000000000 >>>> 0000000000000000 0000000000000000 >>>> [ 51.756000] >>>> [ 51.772000] IASQ: 0000000000000000 0000000000000000 IAOQ: >>>> 00000000003e2f24 00000000003e2f28 >>>> [ 51.872000] IIR: 03ffe01f ISR: 0000000010340000 IOR: >>>> 000000fea4691528 >>>> [ 51.956000] CPU: 0 CR30: 000000007fa90000 CR31: >>>> 00000000ffff7dff >>>> [ 52.040000] ORIG_R28: 0000000040b60718 >>>> [ 52.084000] IAOQ[0]: scsi_init_sgtable+0xfc/0x1b8 [scsi_mod] >>>> [ 52.152000] IAOQ[1]: scsi_init_sgtable+0x100/0x1b8 [scsi_mod] >>>> [ 52.224000] RP(r2): 
scsi_init_sgtable+0xb8/0x1b8 [scsi_mod] >>>> [ 52.292000] Backtrace: >>>> [ 52.320000] [<00000000003e304c>] scsi_init_io+0x6c/0x258 >>>> [scsi_mod] >>>> [ 52.396000] [<000000000087d078>] sd_init_command+0x70/0xec8 >>>> [sd_mod] >>>> >>>> In general I think the bug is somehow in blk-merge.c. >>>> But I'm not an expert in that code. >>> >>> The warning was added in this patch sequence: >>> https://lkml.org/lkml/2015/11/23/996 >>> >>> Possibly, but above seems to indicate that it could be driver issue >>> as well. >> >> >> I believe this bug was introduced by the following merge: >> >> commit 1081230b748de8f03f37f80c53dfa89feda9b8de >> Merge: df91039 2ca495a >> Author: Linus Torvalds <torvalds@linux-foundation.org> >> Date: Wed Sep 2 13:10:25 2015 -0700 >> >> Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block >> >> Pull core block updates from Jens Axboe: >> "This first core part of the block IO changes contains: >> >> - Cleanup of the bio IO error signaling from Christoph. We >> used to >> rely on the uptodate bit and passing around of an error, now >> we >> store the error in the bio itself. >> >> - Improvement of the above from myself, by shrinking the bio >> size >> down again to fit in two cachelines on x86-64. >> >> - Revert of the max_hw_sectors cap removal from a revision >> again, >> from Jeff Moyer. This caused performance regressions in >> various >> tests. Reinstate the limit, bump it to a more reasonable >> size >> instead. >> >> - Make /sys/block/<dev>/queue/discard_max_bytes writeable, by >> me. >> Most devices have huge trim limits, which can cause nasty >> latencies >> when deleting files. Enable the admin to configure the size >> down. >> We will look into having a more sane default instead of >> UINT_MAX >> sectors. >> >> - Improvement of the SGP gaps logic from Keith Busch. >> >> - Enable the block core to handle arbitrarily sized bios, >> which >> enables a nice simplification of bio_add_page() (which is an >> IO hot >> path). 
From Kent. >> >> - Improvements to the partition io stats accounting, making it >> faster. From Ming Lei. >> >> - Also from Ming Lei, a basic fixup for overflow of the sysfs >> pending >> file in blk-mq, as well as a fix for a blk-mq timeout race >> condition. >> >> - Ming Lin has been carrying Kents above mentioned patches >> forward >> for a while, and testing them. Ming also did a few fixes >> around >> that. >> >> - Sasha Levin found and fixed a use-after-free problem >> introduced by >> the bio->bi_error changes from Christoph. >> >> - Small blk cgroup cleanup from Viresh Kumar" >> >> * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits) >> blk: Fix bio_io_vec index when checking bvec gaps >> block: Replace SG_GAPS with new queue limits mask >> block: bump BLK_DEF_MAX_SECTORS to 2560 >> Revert "block: remove artifical max_hw_sectors cap" >> blk-mq: fix race between timeout and freeing request >> blk-mq: fix buffer overflow when reading sysfs file of >> 'pending' >> Documentation: update notes in biovecs about arbitrarily sized >> bios >> block: remove bio_get_nr_vecs() >> fs: use helper bio_add_page() instead of open coding on >> bi_io_vec >> block: kill merge_bvec_fn() completely >> md/raid5: get rid of bio_fits_rdev() >> md/raid5: split bio for chunk_aligned_read >> block: remove split code in blkdev_issue_{discard,write_same} >> btrfs: remove bio splitting and merge_bvec_fn() calls >> bcache: remove driver private bio splitting code >> block: simplify bio_add_page() >> block: make generic_make_request handle arbitrarily sized bios >> blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL) >> block: don't access bio->bi_error after bio_put() >> block: shrink struct bio down to 2 cache lines again >> ... > > If you can bisect it down to the exact commit, I might be able to work > out what's the problem. Otherwise, even in an all modular config, I > can't reproduce this on 4.5-rc4, so it may be fixed upstream (just not > backported). 
I tried HEAD this morning and problem is still present. The warning in blk-merge.c occurs first followed by BUG: kernel BUG at drivers/scsi/scsi_lib.c:1097! Entire console output is shown below: reboot: Restarting system Firmware Version 2.12 Duplex Console IO Dependent Code (IODC) revision 1 ------------------------------------------------------------------------------ (c) Copyright 1995-2004, Hewlett-Packard Company, All rights reserved ------------------------------------------------------------------------------ Processor Speed State CoProcessor Cache Size Number State Inst Data --------- -------- --------------------- ----------------- ------------ 0 1000 MHz Active Functional 32 MB/32 MB 1 1000 MHz Idle Functional 32 MB/32 MB 2 1000 MHz Idle Functional 32 MB/32 MB 3 1000 MHz Idle Functional 32 MB/32 MB Central Bus Speed (in MHz) : 200 Available Memory : 12582912 KB Good Memory Required : Not initialized. Defaults to 32 MB. Primary boot path: scsiA.0 0/2/1/0.0 Alternate boot path: lan.0.0.0.0 0/3/3/0 Console path: serial_A.643 17.643 Keyboard path: usb0 0/3/1/0.0 Keyboard path ignored for serial consoles. ---- Main Menu --------------------------------------------------------------- Command Description ------- ----------- BOot [PRI|ALT|<path>] Boot from specified path PAth [PRI|ALT|CON|KEY [<path>]] Display or change a path SEArch [DIsplay|[[IPL] [<path>]]] Search for boot devices COnfiguration menu Displays or sets boot values INformation menu Displays hardware information SERvice menu Displays service commands DIsplay Redisplay the current menu HElp [<menu>|<command>] Display help for menu or command RESET Restart the system ---- Main Menu: Enter command or menu > bo Interact with IPL (Y, N, or Cancel)?> n Booting... Boot IO Dependent Code (IODC) revision 2 HARD Booted. 
palo ipl 1.92 root@c3000 Wed Oct 9 21:48:57 CEST 2013 Partition Start(MB) End(MB) Id Type 1 1 62 f0 Palo 2 63 305 83 ext2 3 306 2259 82 swap 4 2260 70007 83 ext2 PALO(F0) partition contains: 0/vmlinux64 3063747(8270074) bytes @ 0x48000 Command line for kernel: 'root=/dev/sda4 console=ttyS0 HOME=/ rootfstype=ext3 c' Selected kernel: /vmlinuz from partition 2 Selected ramdisk: /initrd.img from partition 2 uncompressing Linux kernel.....................................................2 . ELF64 executable Entry 00100000 first 00100000 n 2 Segment 0 load 00100000 size 215488 mediaptr 0x1000 Segment 1 load 00135000 size 7706288 mediaptr 0x36000 Loading ramdisk 10819353 bytes @ 3e19d000... Branching to kernel entry point 0x00100000. If this is the last message you see, you may need to switch your console. This is a common symptom -- search the FAQ and mailing list at parisc-linux.org Linux version 4.5.0-rc5 (dave@atlas) (gcc version 4.9.3 (GCC) ) #1 SMP Sun Feb 6 unwind_init: start = 0x407398b0, end = 0x40776820, entries = 15607 FP[0] enabled: Rev 1 Model 20 The 64-bit Kernel has started... Kernel default page size is 4 KB. Huge pages disabled. bootconsole [ttyB0] enabled Initialized PDC Console for debugging. Determining PDC firmware type: 64 bit PAT. model 000088b0 00000491 00000000 00000002 56bb5389fb0d1ec0 100000f0 00000008 002 vers 00000302 CPUID vers 20 rev 5 (0x00000285) capabilities 0x35 model 9000/785/C8000 parisc_cache_init: Only equivalent aliasing supported! Memory Ranges: 0) Start 0x0000000000000000 End 0x000000003fffffff Size 1024 MB 1) Start 0x0000000100000000 End 0x00000002ffdfffff Size 8190 MB 2) Start 0x0000004040000000 End 0x00000040ffffffff Size 3072 MB Total Memory: 12286 MB initrd: 7e19d000-7ebee719 initrd: reserving 3e19d000-3ebee719 (mem_max 2ffe00000) PERCPU: Embedded 17 pages/cpu @0000000042e11000 s29936 r8192 d31504 u69632 SMP: bootstrap CPU ID is 0 Built 3 zonelists in Zone order, mobility grouping on. 
Total pages: 3102215 Kernel command line: root=/dev/sda4 console=ttyS0 HOME=/ rootfstype=ext3 clocksz log_buf_len individual max cpu contribution: 4096 bytes log_buf_len total cpu_extra contributions: 126976 bytes log_buf_len min size: 131072 bytes log_buf_len: 262144 bytes early log buf free: 127768(97%) PID hash table entries: 4096 (order: 3, 32768 bytes) Dentry cache hash table entries: 2097152 (order: 12, 16777216 bytes) Inode-cache hash table entries: 1048576 (order: 11, 8388608 bytes) Sorting __ex_table... Memory: 12338568K/12580864K available (4668K kernel code, 1515K rwdata, 842K ro) virtual kernel memory layout: vmalloc : 0x0000000000008000 - 0x000000003f000000 (1007 MB) memory : 0x0000000040000000 - 0x0000004140000000 (266240 MB) .init : 0x0000000040100000 - 0x0000000040145000 ( 276 kB) .data : 0x00000000405d4000 - 0x0000000040821530 (2357 kB) .text : 0x0000000040145000 - 0x00000000405d4000 (4668 kB) Hierarchical RCU implementation. Build-time adjustment of leaf fanout to 64. NR_IRQS:128 clocksource: cr16: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idles Console: colour dummy device 160x64 ------------------------ | Locking API testsuite: ---------------------------------------------------------------------------- | spin |wlock |rlock |mutex | wsem | rsem | -------------------------------------------------------------------------- A-A deadlock:failed|failed| ok |failed|failed|failed| A-B-B-A deadlock:failed|failed| ok |failed|failed|failed| A-B-B-C-C-A deadlock:failed|failed| ok |failed|failed|failed| A-B-C-A-B-C deadlock:failed|failed| ok |failed|failed|failed| A-B-B-C-C-D-D-A deadlock:failed|failed| ok |failed|failed|failed| A-B-C-D-B-D-D-A deadlock:failed|failed| ok |failed|failed|failed| A-B-C-D-B-C-D-A deadlock:failed|failed| ok |failed|failed|failed| double unlock:failed|failed|failed| ok |failed|failed| initialize held:failed|failed|failed|failed|failed|failed| bad unlock order: ok | ok | ok | ok | ok | ok | 
-------------------------------------------------------------------------- recursive read-lock: | ok | |failed| recursive read-lock #2: | ok | |failed| mixed read-write-lock: |failed| |failed| mixed write-read-lock: |failed| |failed| -------------------------------------------------------------------------- hard-irqs-on + irq-safe-A/12:failed|failed| ok | soft-irqs-on + irq-safe-A/12:failed|failed| ok | hard-irqs-on + irq-safe-A/21:failed|failed| ok | soft-irqs-on + irq-safe-A/21:failed|failed| ok | sirq-safe-A => hirqs-on/12:failed|failed| ok | sirq-safe-A => hirqs-on/21:failed|failed| ok | hard-safe-A + irqs-on/12:failed|failed| ok | soft-safe-A + irqs-on/12:failed|failed| ok | hard-safe-A + irqs-on/21:failed|failed| ok | soft-safe-A + irqs-on/21:failed|failed| ok | hard-safe-A + unsafe-B #1/123:failed|failed| ok | soft-safe-A + unsafe-B #1/123:failed|failed| ok | hard-safe-A + unsafe-B #1/132:failed|failed| ok | soft-safe-A + unsafe-B #1/132:failed|failed| ok | hard-safe-A + unsafe-B #1/213:failed|failed| ok | soft-safe-A + unsafe-B #1/213:failed|failed| ok | hard-safe-A + unsafe-B #1/231:failed|failed| ok | soft-safe-A + unsafe-B #1/231:failed|failed| ok | hard-safe-A + unsafe-B #1/312:failed|failed| ok | soft-safe-A + unsafe-B #1/312:failed|failed| ok | hard-safe-A + unsafe-B #1/321:failed|failed| ok | soft-safe-A + unsafe-B #1/321:failed|failed| ok | hard-safe-A + unsafe-B #2/123:failed|failed| ok | soft-safe-A + unsafe-B #2/123:failed|failed| ok | hard-safe-A + unsafe-B #2/132:failed|failed| ok | soft-safe-A + unsafe-B #2/132:failed|failed| ok | hard-safe-A + unsafe-B #2/213:failed|failed| ok | soft-safe-A + unsafe-B #2/213:failed|failed| ok | hard-safe-A + unsafe-B #2/231:failed|failed| ok | soft-safe-A + unsafe-B #2/231:failed|failed| ok | hard-safe-A + unsafe-B #2/312:failed|failed| ok | soft-safe-A + unsafe-B #2/312:failed|failed| ok | hard-safe-A + unsafe-B #2/321:failed|failed| ok | soft-safe-A + unsafe-B #2/321:failed|failed| ok | hard-irq 
lock-inversion/123:failed|failed| ok | soft-irq lock-inversion/123:failed|failed| ok | hard-irq lock-inversion/132:failed|failed| ok | soft-irq lock-inversion/132:failed|failed| ok | hard-irq lock-inversion/213:failed|failed| ok | soft-irq lock-inversion/213:failed|failed| ok | hard-irq lock-inversion/231:failed|failed| ok | soft-irq lock-inversion/231:failed|failed| ok | hard-irq lock-inversion/312:failed|failed| ok | soft-irq lock-inversion/312:failed|failed| ok | hard-irq lock-inversion/321:failed|failed| ok | soft-irq lock-inversion/321:failed|failed| ok | hard-irq read-recursion/123: ok | soft-irq read-recursion/123: ok | hard-irq read-recursion/132: ok | soft-irq read-recursion/132: ok | hard-irq read-recursion/213: ok | soft-irq read-recursion/213: ok | hard-irq read-recursion/231: ok | soft-irq read-recursion/231: ok | hard-irq read-recursion/312: ok | soft-irq read-recursion/312: ok | hard-irq read-recursion/321: ok | soft-irq read-recursion/321: ok | -------------------------------------------------------------------------- | Wound/wait tests | --------------------- ww api failures: ok | ok | ok | ww contexts mixing:failed| ok | finishing ww context: ok | ok | ok | ok | locking mismatches: ok | ok | ok | EDEADLK handling: ok | ok | ok | ok | ok | ok | o| spinlock nest unlocked:failed| ----------------------------------------------------- |block | try |context| ----------------------------------------------------- context:failed| ok | ok | try:failed| ok |failed| block:failed| ok |failed| spinlock:failed| ok |failed| -------------------------------------------------------- 153 out of 253 testcases failed, as expected. | ---------------------------------------------------- Calibrating delay loop... 
1993.93 BogoMIPS (lpj=9969664) pid_max: default: 32768 minimum: 301 Mount-cache hash table entries: 32768 (order: 6, 262144 bytes) Mountpoint-cache hash table entries: 32768 (order: 6, 262144 bytes) Brought up 1 CPUs devtmpfs: initialized clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 191s atomic64_test: passed NET: Registered protocol family 16 Searching for devices... Found devices: 1. Crestone Peak Fast? at 0xfffffffffe780000 [128] { 0, 0x0, 0x88b, 0x00004 } 2. Crestone Peak Fast? at 0xfffffffffe781000 [129] { 0, 0x0, 0x88b, 0x00004 } 3. Crestone Peak Fast? at 0xfffffffffe798000 [152] { 0, 0x0, 0x88b, 0x00004 } 4. Crestone Peak Fast? at 0xfffffffffe799000 [153] { 0, 0x0, 0x88b, 0x00004 } 5. Memory at 0xfffffffffed08000 [8] { 1, 0x0, 0x0b6, 0x00009 } 6. Pluto BC McKinley Port at 0xfffffffffed00000 [0] { 12, 0x0, 0x880, 0x0000c } 7. Mercury PCI Bridge at 0xfffffffffed20000 [0/0] { 13, 0x0, 0x783, 0x0000a } 8. Mercury PCI Bridge at 0xfffffffffed24000 [0/2] { 13, 0x0, 0x783, 0x0000a } 9. Mercury PCI Bridge at 0xfffffffffed26000 [0/3] { 13, 0x0, 0x783, 0x0000a } 10. Quicksilver AGP Bridge at 0xfffffffffed28000 [0/4] { 13, 0x0, 0x784, 0x0000} 11. BMC IPMI Mgmt Ctlr at 0xfffffff0f05b0000 [16] { 15, 0x0, 0x004, 0x000c0 } 12. Crestone Peak Fast? Core RS-232 at 0xfffffff0f05e0000 [17] { 10, 0x0, 0x077} 13. Crestone Peak Fast? 
Core RS-232 at 0xfffffff0f05e2000 [18] { 10, 0x0, 0x077} Enabling PDC_PAT chassis codes support v0.05 Releasing cpu 1 now, hpa=fffffffffe781000 FP[1] enabled: Rev 1 Model 20 Releasing cpu 2 now, hpa=fffffffffe798000 FP[2] enabled: Rev 1 Model 20 Releasing cpu 3 now, hpa=fffffffffe799000 FP[3] enabled: Rev 1 Model 20 CPU(s): 4 out of 4 PA8800 (Mako) at 1000.000000 MHz online Setting cache flush threshold to 32768 kB Setting TLB flush threshold to 1236 kB SBA found Pluto 2.3 at 0xfffffffffed00000 Mercury version TR3.2 (0x32) found at 0xfffffffffed20000 LBA: lmmio_space [0xffffffff80000000-0xffffffff9fffffff] - new LBA 0:0: PCI host bridge to bus 0000:00 pci_bus 0000:00: root bus resource [io 0x0000-0xffff] pci_bus 0000:00: root bus resource [mem 0xffffffff80000000-0xffffffff9fffffff] ) pci_bus 0000:00: root bus resource [bus 00-07] Mercury version TR3.2 (0x32) found at 0xfffffffffed24000 LBA 0:2: PCI host bridge to bus 0000:40 pci_bus 0000:40: root bus resource [io 0x10000-0x1ffff] (bus address [0x0000-0) pci_bus 0000:40: root bus resource [mem 0xffffffffa0000000-0xffffffffafffffff] ) pci_bus 0000:40: root bus resource [bus 40-47] Mercury version TR3.2 (0x32) found at 0xfffffffffed26000 LBA 0:3: PCI host bridge to bus 0000:60 pci_bus 0000:60: root bus resource [io 0x20000-0x2ffff] (bus address [0x0000-0) pci_bus 0000:60: root bus resource [mem 0xffffffffb0000000-0xffffffffbfffffff] ) pci_bus 0000:60: root bus resource [bus 60-67] Quicksilver version TR1.0 (0x10) found at 0xfffffffffed28000 LBA: lmmio_space [0xffffffffc0000000-0xffffffffdfffffff] - new LBA 0:4: PCI host bridge to bus 0000:80 pci_bus 0000:80: root bus resource [io 0x30000-0x3ffff] (bus address [0x0000-0) pci_bus 0000:80: root bus resource [mem 0xffffffffc0000000-0xffffffffdfffffff] ) pci_bus 0000:80: root bus resource [bus 80-87] powersw: Soft power switch at 0xfffffff0f042e278 enabled. 
vgaarb: setting as boot device: PCI:0000:80:00.0 vgaarb: device added: PCI:0000:80:00.0,decodes=io+mem,owns=io+mem,locks=none vgaarb: loaded vgaarb: bridge control possible 0000:80:00.0 NET: Registered protocol family 2 TCP established hash table entries: 131072 (order: 8, 1048576 bytes) TCP bind hash table entries: 65536 (order: 8, 1048576 bytes) TCP: Hash tables configured (established 131072 bind 65536) UDP hash table entries: 8192 (order: 6, 262144 bytes) UDP-Lite hash table entries: 8192 (order: 6, 262144 bytes) NET: Registered protocol family 1 RPC: Registered named UNIX socket transport module. RPC: Registered udp transport module. RPC: Registered tcp transport module. RPC: Registered tcp NFSv4.1 backchannel transport module. Trying to unpack rootfs image as initramfs... Freeing initrd memory: 10564K (000000007e19d000 - 000000007ebee000) Performance monitoring counters enabled for Crestone Peak Fast? futex hash table entries: 8192 (order: 6, 262144 bytes) Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253) io scheduler noop registered io scheduler deadline registered io scheduler cfq registered (default) Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled console [ttyS0] disabled 17: ttyS0 at MMIO 0xfffffff0f05e0800 (irq = 76, base_baud = 115200) is a 16550A console [ttyS0] enabled console [ttyS0] enabled bootconsole [ttyB0] disabled bootconsole [ttyB0] disabled 18: ttyS1 at MMIO 0xfffffff0f05e2800 (irq = 77, base_baud = 115200) is a 16550A Linux agpgart interface v0.103 brd: module loaded HP SDC: No SDC found. HP SDC MLC: Registering the System Domain Controller's HIL MLC. 
HP SDC MLC: Request for raw HIL ISR hook denied mousedev: PS/2 mouse device common for all mice rtc-generic rtc-generic: rtc core: registered rtc-generic as rtc0 hidraw: raw HID events driver (C) Jiri Kosina rtc-generic rtc-generic: setting system clock to 2016-02-21 14:32:46 UTC (14560) Freeing unused kernel memory: 276K (0000000040100000 - 0000000040145000) Loading, please wait... starting versionrandom: systemd-udevd urandom read with 59 bits of entropy avaie 228 Fusion MPT base driver 3.04.20 Copyright (c) 1999-2008 LSI Corporation SCSI subsystem initialized usbcore: registered new interface driver usbfs usbcore: registered new interface driver hub Uniform Multi-Platform E-IDE driver e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI e1000: Copyright (c) 1999-2006 Intel Corporation. usbcore: registered new device driver usb ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver sata_sil24 0000:00:01.0: Applying completion IRQ loss on PCI-X errata fix ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver ehci-pci: EHCI PCI platform driver Fusion MPT SPI Host driver 3.04.20 mptbase: ioc0: Initiating bringup scsi host0: sata_sil24 scsi host1: sata_sil24 scsi host2: sata_sil24 scsi host3: sata_sil24 e1000 0000:60:03.0 eth0: (PCI:33MHz:32-bit) 00:11:0a:31:8a:77 e1000 0000:60:03.0 eth0: Intel(R) PRO/1000 Network Connection siimage 0000:60:02.0: IDE controller (0x1095:0x0680 rev 0x02) siimage 0000:60:02.0: BASE CLOCK == 133 siimage 0000:60:02.0: 100% native mode on irq 72 ide0: MMIO-DMA ide1: MMIO-DMA ata1: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80080000 ir6 ata2: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80082000 ir6 ata3: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80084000 ir6 ata4: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80086000 ir6 hdc: HL-DT-STDVD+-RW GSA-H21L, ATAPI CD/DVD-ROM drive ioc0: LSI53C1030 B2: Capabilities={Initiator,Target} hdc: UDMA/44 mode 
selected ide0 at 0x107c4080-0x107c4087,0x107c408a on irq 72 ide1 at 0x107c40c0-0x107c40c7,0x107c40ca on irq 72 ehci-pci 0000:60:01.2: EHCI Host Controller ehci-pci 0000:60:01.2: new USB bus registered, assigned bus number 1 ehci-pci 0000:60:01.2: irq 71, io mem 0xffffffffb00a1000 ehci-pci 0000:60:01.2: USB 2.0 started, EHCI 0.95 usb usb1: New USB device found, idVendor=1d6b, idProduct=0002 usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 usb usb1: Product: EHCI Host Controller usb usb1: Manufacturer: Linux 4.5.0-rc5 ehci_hcd usb usb1: SerialNumber: 0000:60:01.2 hub 1-0:1.0: USB hub found hub 1-0:1.0: 5 ports detected ohci-pci: OHCI PCI platform driver ohci-pci 0000:60:01.0: OHCI PCI host controller ohci-pci 0000:60:01.0: new USB bus registered, assigned bus number 2 ohci-pci 0000:60:01.0: irq 69, io mem 0xffffffffb00a3000 ata1: SATA link down (SStatus 0 SControl 0) scsi host4: ioc0: LSI53C1030 B2, FwRev=01032341h, Ports=1, MaxQ=255, IRQ=67 usb usb2: New USB device found, idVendor=1d6b, idProduct=0001 usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1 usb usb2: Product: OHCI PCI host controller usb usb2: Manufacturer: Linux 4.5.0-rc5 ohci_hcd usb usb2: SerialNumber: 0000:60:01.0 hub 2-0:1.0: USB hub found hub 2-0:1.0: 3 ports detected ohci-pci 0000:60:01.1: OHCI PCI host controller ohci-pci 0000:60:01.1: new USB bus registered, assigned bus number 3 ohci-pci 0000:60:01.1: irq 70, io mem 0xffffffffb00a2000 scsi 4:0:0:0: Direct-Access HP 73.4G ST373207LW HPC1 PQ: 0 ANSI: 3 scsi target4:0:0: Beginning Domain Validation scsi target4:0:0: Ending Domain Validation scsi target4:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.2) usb usb3: New USB device found, idVendor=1d6b, idProduct=0001 usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1 usb usb3: Product: OHCI PCI host controller usb usb3: Manufacturer: Linux 4.5.0-rc5 ohci_hcd usb usb3: SerialNumber: 0000:60:01.1 hub 3-0:1.0: USB hub found ata2: SATA 
link down (SStatus 0 SControl 0) hub 3-0:1.0: 2 ports detected scsi 4:0:2:0: Direct-Access HP 73.4G ST373207LW HPC1 PQ: 0 ANSI: 3 scsi target4:0:2: Beginning Domain Validation scsi target4:0:2: Ending Domain Validation scsi target4:0:2: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.2) random: nonblocking pool is initialized ata3: SATA link down (SStatus 0 SControl 0) mptbase: ioc1: Initiating bringup ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 0) ata4.00: ATA-9: ST3000DM001-1ER166, CC25, max UDMA/133 ata4.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32) ata4.00: configured for UDMA/100 scsi 3:0:0:0: Direct-Access ATA ST3000DM001-1ER1 CC25 PQ: 0 ANSI: 5 sd 3:0:0:0: [sdc] 5860533168 512-byte logical blocks: (3.00 TB/2.73 TiB) sd 3:0:0:0: [sdc] 4096-byte physical blocks sd 3:0:0:0: [sdc] Write Protect is off sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPA sdc: sdc1 sd 3:0:0:0: [sdc] Attached SCSI disk ioc1: LSI53C1030 B2: Capabilities={Initiator,Target} sd 4:0:0:0: [sda] 143374738 512-byte logical blocks: (73.4 GB/68.4 GiB) sd 4:0:2:0: [sdb] 143374738 512-byte logical blocks: (73.4 GB/68.4 GiB) sd 4:0:0:0: [sda] Write Protect is off sd 4:0:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FA scsi host5: ioc1: LSI53C1030 B2, FwRev=01032341h, Ports=1, MaxQ=255, IRQ=68 sd 4:0:2:0: [sdb] Write Protect is off sda: sda1 sda2 sda3 sda4 sd 4:0:0:0: [sda] Attached SCSI disk sd 4:0:2:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FA sd 4:0:2:0: [sdb] Attached SCSI disk ------------[ cut here ]------------ WARNING: at block/blk-merge.c:466 Modules linked in: sd_mod ohci_pci pata_sil680 mptspi ehci_pci ohci_hcd scsi_trd CPU: 2 PID: 927 Comm: systemd-udevd Not tainted 4.5.0-rc5 #1 task: 000000007f622b18 ti: 000000007e550000 task.ti: 000000007e550000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001001111111100001110 Not tainted r00-03 000000ff0804ff0e 000000007e551220 
0000000000001000 000000007e551220 r04-07 000000004070f340 0000000000000000 00000000000001e0 0000000000000000 r08-11 0000000000000000 0000000000000001 000000007e231918 0000000000000000 r12-15 0000000000001000 0000000000000008 000000000000001e 0000000042da3cc8 r16-19 000000007e231918 000000007e2459b8 000000007f01e800 0000000000000000 r20-23 0000000000000000 000000007feb4400 0000000000002000 000000004078c690 r24-27 0002ea873ff455e3 cffd1578c0000000 000000007e231918 000000004070f340 r28-31 0000000000000001 000000007e551320 000000007e551350 0000000000000007 sr00-03 0000000000013000 0000000000000000 0000000000013000 0000000000013000 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040372d70 0000000040372d74 IIR: 03ffe01f ISR: 0000000010340000 IOR: 000000f88c631918 CPU: 2 CR30: 000000007e550000 CR31: ffffffffffffffff ORIG_R28: 0000000000000000 IAOQ[0]: blk_rq_map_sg+0x5d8/0x610 IAOQ[1]: blk_rq_map_sg+0x5dc/0x610 RP(r2): 0x1000 Backtrace: [<0000000010217708>] scsi_init_sgtable+0x70/0x168 [scsi_mod] [<000000001021786c>] scsi_init_io+0x6c/0x250 [scsi_mod] [<00000000107f16e0>] sd_setup_read_write_cmnd+0x58/0x948 [sd_mod] [<00000000107f2014>] sd_init_command+0x44/0x130 [sd_mod] [<0000000010217b54>] scsi_setup_cmnd+0x104/0x1c0 [scsi_mod] [<0000000010217ea0>] scsi_prep_fn+0x100/0x340 [scsi_mod] [<000000004036bc14>] blk_peek_request+0x1bc/0x2a8 [<0000000010219aac>] scsi_request_fn+0xf4/0xab0 [scsi_mod] [<0000000040366fb4>] __blk_run_queue+0x4c/0x70 [<00000000403919e0>] cfq_insert_request+0x2e0/0x588 [<0000000040366170>] __elv_add_request+0x190/0x2d8 ---[ end trace c4c138874107ee9a ]--- ------------[ cut here ]------------ kernel BUG at drivers/scsi/scsi_lib.c:1097! 
CPU: 2 PID: 927 Comm: systemd-udevd Tainted: G W 4.5.0-rc5 #1 task: 000000007f622b18 ti: 000000007e550000 task.ti: 000000007e550000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001001111111100001110 Tainted: G W r00-03 000000ff0804ff0e 000000007e551220 0000000010217708 000000007e551180 r04-07 0000000010201000 000000007e2459b8 000000007e0d04c0 0000000000000000 r08-11 000000007e2459b8 000000007fe8e800 0000000040821520 0000000000000110 r12-15 0000000000000000 000000007e550798 0000000010201000 0000000040738340 r16-19 000000004082152c 000000007fe8e800 000000007e71d800 0000000000000000 r20-23 0000000000000000 000000007feb4400 0000000000002000 000000004078c690 r24-27 0002ea873ff455e3 cffd1578c0000000 000000007e231918 000000004070f340 r28-31 0000000000000008 000000007e551320 000000007e551220 0000000000000007 sr00-03 0000000000013000 0000000000000000 0000000000013000 0000000000013000 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 000000001021774c 0000000010217750 IIR: 03ffe01f ISR: 0000000000000000 IOR: 0000000000000000 CPU: 2 CR30: 000000007e550000 CR31: ffffffffffffffff ORIG_R28: 0000000000000000 IAOQ[0]: scsi_init_sgtable+0xb4/0x168 [scsi_mod] IAOQ[1]: scsi_init_sgtable+0xb8/0x168 [scsi_mod] RP(r2): scsi_init_sgtable+0x70/0x168 [scsi_mod] Backtrace: [<000000001021786c>] scsi_init_io+0x6c/0x250 [scsi_mod] [<00000000107f16e0>] sd_setup_read_write_cmnd+0x58/0x948 [sd_mod] [<00000000107f2014>] sd_init_command+0x44/0x130 [sd_mod] [<0000000010217b54>] scsi_setup_cmnd+0x104/0x1c0 [scsi_mod] [<0000000010217ea0>] scsi_prep_fn+0x100/0x340 [scsi_mod] [<000000004036bc14>] blk_peek_request+0x1bc/0x2a8 [<0000000010219aac>] scsi_request_fn+0xf4/0xab0 [scsi_mod] [<0000000040366fb4>] __blk_run_queue+0x4c/0x70 [<00000000403919e0>] cfq_insert_request+0x2e0/0x588 [<0000000040366170>] __elv_add_request+0x190/0x2d8 CPU: 2 PID: 927 Comm: systemd-udevd Tainted: G W 4.5.0-rc5 #1 Backtrace: [<000000004015c0a8>] 
show_stack+0x20/0x38 [<0000000040397458>] dump_stack+0xa8/0x120 [<000000004015c27c>] die_if_kernel+0x19c/0x2e0 [<000000004015d158>] handle_interruption+0x9a8/0x9d0 ---[ end trace c4c138874107ee9b ]--- NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [systemd-udevd:927] Modules linked in: sd_mod ohci_pci pata_sil680 mptspi ehci_pci ohci_hcd scsi_trd CPU: 3 PID: 927 Comm: systemd-udevd Tainted: G D W 4.5.0-rc5 #1 task: 000000007f622b18 ti: 000000007e550000 task.ti: 000000007e550000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001101111111100001111 Tainted: G D W r00-03 000000ff0806ff0f 000000007e551d10 0000000040202348 000000007e551c40 r04-07 000000004070f340 000000007e551ce0 000000004079c560 000000004079c4a0 r08-11 0000000042e4b360 0000000042e4b368 0000000000000001 00000000406e8360 r12-15 0000000000000000 000000007e551ce0 0000000010201000 0000000040738340 r16-19 000000007e551220 000000007fe8e800 000000007e71d800 0000000042e421d0 r20-23 0000000000000001 0000000042e4b368 000000000800000f 0000000000000000 r24-27 0000000000000000 0000000000000020 0000000042e4b368 000000004070f340 r28-31 0000000000000002 000000007e551d60 000000007e551d10 0000000000000003 sr00-03 000000000001b800 0000000000000000 0000000000000000 000000000001b800 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040202370 0000000040202374 IIR: 4a7f0030 ISR: 0000000000000000 IOR: 0000000002000200 CPU: 3 CR30: 000000007e550000 CR31: fffffffffffeffff ORIG_R28: 0000000000000000 IAOQ[0]: smp_call_function_many+0x338/0x3b0 IAOQ[1]: smp_call_function_many+0x33c/0x3b0 RP(r2): smp_call_function_many+0x310/0x3b0 Backtrace: [<00000000402024a0>] on_each_cpu+0x58/0xa0 [<0000000040159998>] flush_tlb_all+0x108/0x1e8 [<0000000040249600>] tlb_flush_mmu_tlbonly+0x48/0xa8 [<000000004024a480>] tlb_finish_mmu+0x30/0x98 [<00000000402551dc>] exit_mmap+0x134/0x1b8 [<00000000401811d8>] mmput+0xc0/0x1a8 [<0000000040186438>] do_exit+0x320/0xcb8 
[<000000004015c2d8>] die_if_kernel+0x1f8/0x2e0 [<000000004015d158>] handle_interruption+0x9a8/0x9d0 timer_interrupt(CPU 3): delayed! cycles 8291193C rem 903CC4 next/now 259158BA28 NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [systemd-udevd:927] Modules linked in: sd_mod ohci_pci pata_sil680 mptspi ehci_pci ohci_hcd scsi_trd CPU: 3 PID: 927 Comm: systemd-udevd Tainted: G D W L 4.5.0-rc5 #1 task: 000000007f622b18 ti: 000000007e550000 task.ti: 000000007e550000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001101111111100001111 Tainted: G D W L r00-03 000000ff0806ff0f 000000007e551d10 0000000040202348 000000007e551c40 r04-07 000000004070f340 000000007e551ce0 000000004079c560 000000004079c4a0 r08-11 0000000042e4b360 0000000042e4b368 0000000000000001 00000000406e8360 r12-15 0000000000000000 000000007e551ce0 0000000010201000 0000000040738340 r16-19 000000007e551220 000000007fe8e800 000000007e71d800 0000000042e421d0 r20-23 0000000000000001 0000000042e4b368 000000000800000f 0000000000000000 r24-27 0000000000000000 0000000000000020 0000000042e4b368 000000004070f340 r28-31 0000000000000002 000000007e551d60 000000007e551d10 0000000000000003 sr00-03 000000000001b800 0000000000000000 0000000000000000 000000000001b800 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040202370 0000000040202374 IIR: 4a7f0030 ISR: 0000000000000000 IOR: 0000000002000200 CPU: 3 CR30: 000000007e550000 CR31: fffffffffffeffff ORIG_R28: 0000000000000000 IAOQ[0]: smp_call_function_many+0x338/0x3b0 IAOQ[1]: smp_call_function_many+0x33c/0x3b0 RP(r2): smp_call_function_many+0x310/0x3b0 Backtrace: [<00000000402024a0>] on_each_cpu+0x58/0xa0 [<0000000040159998>] flush_tlb_all+0x108/0x1e8 [<0000000040249600>] tlb_flush_mmu_tlbonly+0x48/0xa8 [<000000004024a480>] tlb_finish_mmu+0x30/0x98 [<00000000402551dc>] exit_mmap+0x134/0x1b8 [<00000000401811d8>] mmput+0xc0/0x1a8 [<0000000040186438>] do_exit+0x320/0xcb8 
[<000000004015c2d8>] die_if_kernel+0x1f8/0x2e0 [<000000004015d158>] handle_interruption+0x9a8/0x9d0 timer_interrupt(CPU 3): delayed! cycles 82907C30 rem 90D9D0 next/now 2C1646D22C INFO: rcu_sched self-detected stall on CPUINFO: rcu_sched detected stalls on CP: 2-...: (1 GPs behind) idle=c8f/140000000000000/0 softirq=1886/1886 fqs= 3-...: (5545 ticks this GP) idle=a85/140000000000001/0 softirq=1132/113 (detected by 1, t=6002 jiffies, g=39, c=38, q=66) Task dump for CPU 2: kworker/2:1H R running task 0 998 2 0x00000004 Workqueue: kblockd cfq_kick_queue Backtrace: [<0000000040152044>] __schedule+0x264/0x5b8 [<00000000401523e4>] schedule+0x4c/0xc8 [<00000000401a12b8>] worker_thread+0x338/0x688 [<00000000401a976c>] kthread+0x144/0x178 [<0000000040148020>] end_fault_vector+0x20/0x28 [<0000000010726808>] ohci_hub_control+0x0/0x650 [ohci_hcd] [<000000001072b0a0>] ohci_dump+0x0/0xf0 [ohci_hcd] Task dump for CPU 3: systemd-udevd R running task 0 927 921 0x00000014 Backtrace: [<00000000402024a0>] on_each_cpu+0x58/0xa0 [<0000000040159998>] flush_tlb_all+0x108/0x1e8 [<0000000040249600>] tlb_flush_mmu_tlbonly+0x48/0xa8 [<000000004024a480>] tlb_finish_mmu+0x30/0x98 [<00000000402551dc>] exit_mmap+0x134/0x1b8 [<00000000401811d8>] mmput+0xc0/0x1a8 [<0000000040186438>] do_exit+0x320/0xcb8 [<000000004015c2d8>] die_if_kernel+0x1f8/0x2e0 [<000000004015d158>] handle_interruption+0x9a8/0x9d0 3-...: (5545 ticks this GP) idle=a85/140000000000001/0 softirq=1132/113 (t=6147 jiffies g=39 c=38 q=67) Task dump for CPU 2: kworker/2:1H R running task 0 998 2 0x00000004 Workqueue: kblockd cfq_kick_queue Backtrace: [<0000000040152044>] __schedule+0x264/0x5b8 [<00000000401523e4>] schedule+0x4c/0xc8 [<00000000401a12b8>] worker_thread+0x338/0x688 [<00000000401a976c>] kthread+0x144/0x178 [<0000000040148020>] end_fault_vector+0x20/0x28 Task dump for CPU 3: systemd-udevd R running task 0 927 921 0x00000014 Backtrace: [<000000004015c0a8>] show_stack+0x20/0x38 [<00000000401b7e9c>] 
sched_show_task+0x134/0x1d0 [<00000000401ba93c>] dump_cpu_task+0x64/0x80 [<00000000401e5dec>] rcu_dump_cpu_stacks+0xf4/0x180 [<00000000401eaeac>] rcu_check_callbacks+0x5ac/0x9b8 [<00000000401edacc>] update_process_times+0x74/0xd8 [<000000004015d3c0>] timer_interrupt+0x1b0/0x210 [<00000000401dad4c>] handle_irq_event_percpu+0xb4/0x250 [<00000000401e06b4>] handle_percpu_irq+0xac/0xe8 [<00000000401d9fc4>] generic_handle_irq+0x4c/0x68 [<000000004014a2cc>] call_on_stack+0x18/0x24 timer_interrupt(CPU 3): delayed! cycles 998573C3 rem 42393D next/now 2EDA613C2F NMI watchdog: BUG: soft lockup - CPU#3 stuck for 24s! [systemd-udevd:927] Modules linked in: sd_mod ohci_pci pata_sil680 mptspi ehci_pci ohci_hcd scsi_trd CPU: 3 PID: 927 Comm: systemd-udevd Tainted: G D W L 4.5.0-rc5 #1 task: 000000007f622b18 ti: 000000007e550000 task.ti: 000000007e550000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001101111111100001111 Tainted: G D W L r00-03 000000ff0806ff0f 000000007e551d10 0000000040202348 000000007e551c40 r04-07 000000004070f340 000000007e551ce0 000000004079c560 000000004079c4a0 r08-11 0000000042e4b360 0000000042e4b368 0000000000000001 00000000406e8360 r12-15 0000000000000000 000000007e551ce0 0000000010201000 0000000040738340 r16-19 000000007e551220 000000007fe8e800 000000007e71d800 0000000042e421d0 r20-23 0000000000000001 0000000042e4b368 000000000800000f 0000000000000000 r24-27 0000000000000000 0000000000000020 0000000042e4b368 000000004070f340 r28-31 0000000000000002 000000007e551d60 000000007e551d10 0000000000000003 sr00-03 000000000001b800 0000000000000000 0000000000000000 000000000001b800 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040202370 0000000040202374 IIR: 4a7f0030 ISR: 0000000000000000 IOR: 0000000002000200 CPU: 3 CR30: 000000007e550000 CR31: fffffffffffeffff ORIG_R28: 0000000000000000 IAOQ[0]: smp_call_function_many+0x338/0x3b0 IAOQ[1]: smp_call_function_many+0x33c/0x3b0 
RP(r2): smp_call_function_many+0x310/0x3b0 Backtrace: [<00000000402024a0>] on_each_cpu+0x58/0xa0 [<0000000040159998>] flush_tlb_all+0x108/0x1e8 [<0000000040249600>] tlb_flush_mmu_tlbonly+0x48/0xa8 [<000000004024a480>] tlb_finish_mmu+0x30/0x98 [<00000000402551dc>] exit_mmap+0x134/0x1b8 [<00000000401811d8>] mmput+0xc0/0x1a8 [<0000000040186438>] do_exit+0x320/0xcb8 [<000000004015c2d8>] die_if_kernel+0x1f8/0x2e0 [<000000004015d158>] handle_interruption+0x9a8/0x9d0 timer_interrupt(CPU 3): delayed! cycles 82907FF3 rem 90D60D next/now 356676622F NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [systemd-udevd:927] Modules linked in: sd_mod ohci_pci pata_sil680 mptspi ehci_pci ohci_hcd scsi_trd CPU: 3 PID: 927 Comm: systemd-udevd Tainted: G D W L 4.5.0-rc5 #1 task: 000000007f622b18 ti: 000000007e550000 task.ti: 000000007e550000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001101111111100001111 Tainted: G D W L r00-03 000000ff0806ff0f 000000007e551d10 0000000040202348 000000007e551c40 r04-07 000000004070f340 000000007e551ce0 000000004079c560 000000004079c4a0 r08-11 0000000042e4b360 0000000042e4b368 0000000000000001 00000000406e8360 r12-15 0000000000000000 000000007e551ce0 0000000010201000 0000000040738340 r16-19 000000007e551220 000000007fe8e800 000000007e71d800 0000000042e421d0 r20-23 0000000000000001 0000000042e4b368 000000000800000f 0000000000000000 r24-27 0000000000000000 0000000000000020 0000000042e4b368 000000004070f340 r28-31 0000000000000002 000000007e551d60 000000007e551d10 0000000000000003 sr00-03 000000000001b800 0000000000000000 0000000000000000 000000000001b800 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040202370 0000000040202374 IIR: 4a7f0030 ISR: 0000000000000000 IOR: 0000000002000200 CPU: 3 CR30: 000000007e550000 CR31: fffffffffffeffff ORIG_R28: 0000000000000000 IAOQ[0]: smp_call_function_many+0x338/0x3b0 IAOQ[1]: smp_call_function_many+0x33c/0x3b0 RP(r2): 
smp_call_function_many+0x310/0x3b0 Backtrace: [<00000000402024a0>] on_each_cpu+0x58/0xa0 [<0000000040159998>] flush_tlb_all+0x108/0x1e8 [<0000000040249600>] tlb_flush_mmu_tlbonly+0x48/0xa8 [<000000004024a480>] tlb_finish_mmu+0x30/0x98 [<00000000402551dc>] exit_mmap+0x134/0x1b8 [<00000000401811d8>] mmput+0xc0/0x1a8 [<0000000040186438>] do_exit+0x320/0xcb8 [<000000004015c2d8>] die_if_kernel+0x1f8/0x2e0 [<000000004015d158>] handle_interruption+0x9a8/0x9d0 timer_interrupt(CPU 3): delayed! cycles 82919115 rem 8FC4EB next/now 3BEB647A21 NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [udevadm:929] Modules linked in: sd_mod ohci_pci pata_sil680 mptspi ehci_pci ohci_hcd scsi_trd CPU: 1 PID: 929 Comm: udevadm Tainted: G D W L 4.5.0-rc5 #1 task: 000000007f074d18 ti: 000000007c648000 task.ti: 000000007c648000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001001111111100001111 Tainted: G D W L r00-03 000000ff0804ff0f 000000007c648850 0000000040202348 000000007c648780 r04-07 000000004070f340 000000007c648820 000000004079c560 000000004079c4a0 r08-11 0000000042e29360 0000000042e29368 0000000000000001 00000000406e8360 r12-15 0000000000000000 000000007c648820 00000000fa88bb88 0000000000000002 r16-19 00000000fa889f20 0000000000000000 00000000000ce9f4 0000000042e42190 r20-23 0000000000000001 0000000042e29368 000000000800000f 0000000000000000 r24-27 0000000000000000 0000000000000020 0000000042e29368 000000004070f340 r28-31 0000000000000002 000000007c6488a0 000000007c648850 0000000000000003 sr00-03 0000000000015800 0000000000015800 0000000000000000 0000000000015800 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040202370 0000000040202374 IIR: 4a7f0030 ISR: 0000000000000000 IOR: 0000000002000200 CPU: 1 CR30: 000000007c648000 CR31: ffffffffffffffff ORIG_R28: 0000000000000000 IAOQ[0]: smp_call_function_many+0x338/0x3b0 IAOQ[1]: smp_call_function_many+0x33c/0x3b0 RP(r2): 
smp_call_function_many+0x310/0x3b0 Backtrace: [<00000000402024a0>] on_each_cpu+0x58/0xa0 [<0000000040159998>] flush_tlb_all+0x108/0x1e8 [<0000000040249600>] tlb_flush_mmu_tlbonly+0x48/0xa8 [<000000004024a480>] tlb_finish_mmu+0x30/0x98 [<00000000402551dc>] exit_mmap+0x134/0x1b8 [<00000000401811d8>] mmput+0xc0/0x1a8 [<0000000040186438>] do_exit+0x320/0xcb8 [<00000000401880a0>] do_group_exit+0x50/0xf0 [<0000000040188160>] SyS_exit_group+0x20/0x28 [<0000000040149fe8>] syscall_exit+0x0/0x14 timer_interrupt(CPU 1): delayed! cycles 84101F2A rem 4263D6 next/now 3CCB28AEFD NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [systemd-udevd:927] Modules linked in: sd_mod ohci_pci pata_sil680 mptspi ehci_pci ohci_hcd scsi_trd CPU: 3 PID: 927 Comm: systemd-udevd Tainted: G D W L 4.5.0-rc5 #1 task: 000000007f622b18 ti: 000000007e550000 task.ti: 000000007e550000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001101111111100001111 Tainted: G D W L r00-03 000000ff0806ff0f 000000007e551d10 0000000040202348 000000007e551c40 r04-07 000000004070f340 000000007e551ce0 000000004079c560 000000004079c4a0 r08-11 0000000042e4b360 0000000042e4b368 0000000000000001 00000000406e8360 r12-15 0000000000000000 000000007e551ce0 0000000010201000 0000000040738340 r16-19 000000007e551220 000000007fe8e800 000000007e71d800 0000000042e421d0 r20-23 0000000000000001 0000000042e4b368 000000000800000f 0000000000000000 r24-27 0000000000000000 0000000000000020 0000000042e4b368 000000004070f340 r28-31 0000000000000002 000000007e551d60 000000007e551d10 0000000000000003 sr00-03 000000000001b800 0000000000000000 0000000000000000 000000000001b800 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040202370 0000000040202374 IIR: 4a7f0030 ISR: 0000000000000000 IOR: 0000000002000200 CPU: 3 CR30: 000000007e550000 CR31: fffffffffffeffff ORIG_R28: 0000000000000000 IAOQ[0]: smp_call_function_many+0x338/0x3b0 IAOQ[1]: 
smp_call_function_many+0x33c/0x3b0 RP(r2): smp_call_function_many+0x310/0x3b0 Backtrace: [<00000000402024a0>] on_each_cpu+0x58/0xa0 [<0000000040159998>] flush_tlb_all+0x108/0x1e8 [<0000000040249600>] tlb_flush_mmu_tlbonly+0x48/0xa8 [<000000004024a480>] tlb_finish_mmu+0x30/0x98 [<00000000402551dc>] exit_mmap+0x134/0x1b8 [<00000000401811d8>] mmput+0xc0/0x1a8 [<0000000040186438>] do_exit+0x320/0xcb8 [<000000004015c2d8>] die_if_kernel+0x1f8/0x2e0 [<000000004015d158>] handle_interruption+0x9a8/0x9d0 timer_interrupt(CPU 3): delayed! cycles 829113E3 rem 90421D next/now 427052922F NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [udevadm:929] Modules linked in: sd_mod ohci_pci pata_sil680 mptspi ehci_pci ohci_hcd scsi_trd CPU: 1 PID: 929 Comm: udevadm Tainted: G D W L 4.5.0-rc5 #1 task: 000000007f074d18 ti: 000000007c648000 task.ti: 000000007c648000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001001111111100001111 Tainted: G D W L r00-03 000000ff0804ff0f 000000007c648850 0000000040202348 000000007c648780 r04-07 000000004070f340 000000007c648820 000000004079c560 000000004079c4a0 r08-11 0000000042e29360 0000000042e29368 0000000000000001 00000000406e8360 r12-15 0000000000000000 000000007c648820 00000000fa88bb88 0000000000000002 r16-19 00000000fa889f20 0000000000000000 00000000000ce9f4 0000000042e42190 r20-23 0000000000000001 0000000042e29368 000000000800000f 0000000000000000 r24-27 0000000000000000 0000000000000020 0000000042e29368 000000004070f340 r28-31 0000000000000002 000000007c6488a0 000000007c648850 0000000000000003 sr00-03 0000000000015800 0000000000015800 0000000000000000 0000000000015800 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040202370 0000000040202374 IIR: 4a7f0030 ISR: 0000000000000000 IOR: 0000000002000200 CPU: 1 CR30: 000000007c648000 CR31: ffffffffffffffff ORIG_R28: 0000000000000000 IAOQ[0]: smp_call_function_many+0x338/0x3b0 IAOQ[1]: 
smp_call_function_many+0x33c/0x3b0
RP(r2): smp_call_function_many+0x310/0x3b0
Backtrace:
[<00000000402024a0>] on_each_cpu+0x58/0xa0
[<0000000040159998>] flush_tlb_all+0x108/0x1e8
[<0000000040249600>] tlb_flush_mmu_tlbonly+0x48/0xa8
[<000000004024a480>] tlb_finish_mmu+0x30/0x98
[<00000000402551dc>] exit_mmap+0x134/0x1b8
[<00000000401811d8>] mmput+0xc0/0x1a8
[<0000000040186438>] do_exit+0x320/0xcb8
[<00000000401880a0>] do_group_exit+0x50/0xf0
[<0000000040188160>] SyS_exit_group+0x20/0x28
[<0000000040149fe8>] syscall_exit+0x0/0x14
timer_interrupt(CPU 1): delayed! cycles 840FA62E rem 42DCD2 next/now 435016C6F1

I've downloaded linux-block and I'll see if I can pinpoint the change that causes the problem.

Dave
--
John David Anglin dave.anglin@bell.net
* Re: SCSI bug
2016-02-21 14:45 ` John David Anglin
@ 2016-02-21 18:10 ` James Bottomley
0 siblings, 0 replies; 25+ messages in thread
From: James Bottomley @ 2016-02-21 18:10 UTC (permalink / raw)
To: John David Anglin; +Cc: Helge Deller, linux-parisc List
On Sun, 2016-02-21 at 09:45 -0500, John David Anglin wrote:
> I tried HEAD this morning and problem is still present. The warning
> in blk-merge.c occurs first followed by BUG:
> kernel BUG at drivers/scsi/scsi_lib.c:1097!
>
> Entire console output is shown below:

OK, this might actually be a clue. I've attached my full boot below for good measure. There are two significant differences: first, the gcc version (you're on 4.9 and I'm on 4.2), and second, the ata drivers. I have no ata drivers for the CD-ROM in my initrd (they caused problems a while ago, so I have them blacklisted). Can you take all the ata drivers out of your initrd and see if you still get the panic? If not, we know where the fault is.

James
---
Command line for kernel: ' root=/dev/sda3 panic=5 console=ttyS1 palo_kernel=1/vmlinux-test'
Selected kernel: /vmlinux-test from partition 1
Selected ramdisk: /initrd.img-test from partition 1
ELF64 executable
Entry 00100000 first 00100000 n 4
Segment 0 load 00100000 size 229872 mediaptr 0x1000
Segment 1 load 00139000 size 148016 mediaptr 0x3a000
Segment 2 load 00200000 size 8622768 mediaptr 0x5f000
Segment 3 load 00b00000 size 1583856 mediaptr 0x899000
Loading ramdisk 3119670 bytes @ 3fcf4000...
Branching to kernel entry point 0x00100000. If this is the last
message you see, you may need to switch your console. This is
a common symptom -- search the FAQ and mailing list at
parisc-linux.org
[ 0.000000] Linux version 4.5.0-rc2 (jejb@ion) (gcc version 4.2.4 (Debian 4.2.4-6)) #1 SMP Fri Feb 5 17:20:38 PST 2016
[ 0.000000] unwind_init: start = 0x409e92d0, end = 0x40a392b0, entries = 20478
[ 0.000000] WARNING: Out of order unwind entry!
00000000409ed440 and 00000000409ed450 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ed450 and 00000000409ed460 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee160 and 00000000409ee170 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee170 and 00000000409ee180 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee1f0 and 00000000409ee200 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee200 and 00000000409ee210 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee210 and 00000000409ee220 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee220 and 00000000409ee230 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee230 and 00000000409ee240 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee240 and 00000000409ee250 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee250 and 00000000409ee260 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee260 and 00000000409ee270 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee270 and 00000000409ee280 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee280 and 00000000409ee290 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee2b0 and 00000000409ee2c0 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee2c0 and 00000000409ee2d0 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee2d0 and 00000000409ee2e0 [ 0.000000] WARNING: Out of order unwind entry! 00000000409ee2e0 and 00000000409ee2f0 [ 0.000000] FP[0] enabled: Rev 1 Model 20 [ 0.000000] The 64-bit Kernel has started... [ 0.000000] Kernel default page size is 4 KB. Huge pages enabled with 1 MB physical and 2 MB virtual size. [ 0.000000] bootconsole [ttyB0] enabled [ 0.000000] Initialized PDC Console for debugging. [ 0.000000] Determining PDC firmware type: 64 bit PAT. 
[ 0.000000] model 00008870 00000491 00000000 00000002 3e0505e7352af710 100000f0 00000008 000000b2 000000b2 [ 0.000000] vers 00000301 [ 0.000000] CPUID vers 20 rev 4 (0x00000284) [ 0.000000] capabilities 0x35 [ 0.000000] model 9000/800/rp3440 [ 0.000000] parisc_cache_init: Only equivalent aliasing supported! [ 0.000000] Memory Ranges: [ 0.000000] 0) Start 0x0000000000000000 End 0x000000003fffffff Size 1024 MB [ 0.000000] 1) Start 0x0000004040000000 End 0x000000407fdfffff Size 1022 MB [ 0.000000] Total Memory: 2046 MB [ 0.000000] initrd: 7fcf4000-7ffeda36 [ 0.000000] initrd: reserving 3fcf4000-3ffeda36 (mem_max 7fe00000) [ 0.000000] PERCPU: Embedded 18 pages/cpu @0000000042020000 s33328 r8192 d32208 u73728 [ 0.000000] SMP: bootstrap CPU ID is 0 [ 0.000000] Built 2 zonelists in Zone order, mobility grouping on. Total pages: 515592 [ 0.000000] Kernel command line: root=/dev/sda3 panic=5 console=ttyS1 palo_kernel=1/vmlinux-test [ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes) [ 0.000000] Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes) [ 0.000000] Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes) [ 0.000000] Sorting __ex_table... [ 0.000000] Memory: 2039128K/2095104K available (6156K kernel code, 2727K rwdata, 1214K rodata, 1024K init, 664K bss, 55976K reserved, 0K cma-reserved) [ 0.000000] virtual kernel memory layout: 0.000000] vmalloc : 0x0000000000008000 - 0x000000003f000000 (1007 MB) 0.000000] memory : 0x0000000040000000 - 0x00000040bfe00000 (264190 MB) 0.000000] .init : 0x0000000040100000 - 0x0000000040200000 (1024 kB) 0.000000] .data : 0x0000000040803000 - 0x0000000040bdc8d0 (3942 kB) 0.000000] .text : 0x0000000040200000 - 0x0000000040803000 (6156 kB) [ 0.000000] Hierarchical RCU implementation. [ 0.000000] Build-time adjustment of leaf fanout to 64. 
[ 0.000000] NR_IRQS:128 [ 0.000000] clocksource: cr16: mask: 0xffffffffffffffff max_cycles: 0xb881be7834, max_idle_ns: 440795218296 ns [ 0.000000] Console: colour dummy device 160x64 [ 0.004000] Calibrating delay loop... 1594.36 BogoMIPS (lpj=3188736) [ 0.028000] pid_max: default: 32768 minimum: 301 [ 0.028000] Security Framework initialized [ 0.028000] Yama: becoming mindful. [ 0.028000] AppArmor: AppArmor disabled by boot time parameter [ 0.036000] Mount-cache hash table entries: 4096 (order: 3, 32768 bytes) [ 0.036000] Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes) [ 0.044000] Brought up 1 CPUs [ 0.044000] devtmpfs: initialized [ 0.052000] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns [ 0.076000] NET: Registered protocol family 16 [ 0.076000] EISA bus registered [ 0.080000] Searching for devices... [ 0.212000] Found devices: [ 0.212000] 1. Storm Peak Slow at 0xfffffffffe780000 [128] { 0, 0x0, 0x887, 0x00004 } [ 0.212000] 2. Storm Peak Slow at 0xfffffffffe781000 [129] { 0, 0x0, 0x887, 0x00004 } [ 0.212000] 3. Storm Peak Slow at 0xfffffffffe798000 [152] { 0, 0x0, 0x887, 0x00004 } [ 0.220000] 4. Storm Peak Slow at 0xfffffffffe799000 [153] { 0, 0x0, 0x887, 0x00004 } [ 0.224000] 5. Everest Mako Memory at 0xfffffffffed08000 [8] { 1, 0x0, 0x0af, 0x00009 } [ 0.232000] 6. Pluto BC McKinley Port at 0xfffffffffed00000 [0] { 12, 0x0, 0x880, 0x0000c } [ 0.240000] 7. Mercury PCI Bridge at 0xfffffffffed20000 [0/0] { 13, 0x0, 0x783, 0x0000a } [ 0.240000] 8. Mercury PCI Bridge at 0xfffffffffed22000 [0/1] { 13, 0x0, 0x783, 0x0000a } [ 0.248000] 9. Mercury PCI Bridge at 0xfffffffffed24000 [0/2] { 13, 0x0, 0x783, 0x0000a } [ 0.256000] 10. Mercury PCI Bridge at 0xfffffffffed26000 [0/3] { 13, 0x0, 0x783, 0x0000a } [ 0.264000] 11. Mercury PCI Bridge at 0xfffffffffed28000 [0/4] { 13, 0x0, 0x783, 0x0000a } [ 0.272000] 12. Mercury PCI Bridge at 0xfffffffffed2c000 [0/6] { 13, 0x0, 0x783, 0x0000a } [ 0.280000] 13. 
Mercury PCI Bridge at 0xfffffffffed2e000 [0/7] { 13, 0x0, 0x783, 0x0000a } [ 0.280000] 14. BMC IPMI Mgmt Ctlr at 0xfffffff0f05b0000 [16] { 15, 0x0, 0x004, 0x000c0 } [ 0.288000] Enabling PDC_PAT chassis codes support v0.05 [ 0.912000] Releasing cpu 1 now, hpa=fffffffffe781000 [ 0.960000] FP[1] enabled: Rev 1 Model 20 [ 0.968000] Releasing cpu 2 now, hpa=fffffffffe798000 [ 1.016000] FP[2] enabled: Rev 1 Model 20 [ 1.024000] Releasing cpu 3 now, hpa=fffffffffe799000 [ 1.084000] FP[3] enabled: Rev 1 Model 20 [ 1.088000] CPU(s): 4 out of 4 PA8800 (Mako) at 800.010600 MHz online [ 1.100000] Setting cache flush threshold to 32768 kB [ 1.100000] Setting TLB flush threshold to 1436 kB [ 1.140000] SBA found Pluto 2.3 at 0xfffffffffed00000 [ 1.340000] Mercury version TR3.2 (0x32) found at 0xfffffffffed20000 [ 1.396000] LBA 0:0: PCI host bridge to bus 0000:00 [ 1.396000] pci_bus 0000:00: root bus resource [io 0x0000-0xffff] [ 1.396000] pci_bus 0000:00: root bus resource [mem 0xffffffff80000000-0xffffffff8fffffff] (bus address [0x80000000-0x8fffffff]) [ 1.404000] pci_bus 0000:00: root bus resource [bus 00-07] [ 1.428000] Mercury version TR3.2 (0x32) found at 0xfffffffffed22000 [ 1.484000] LBA 0:1: PCI host bridge to bus 0000:20 [ 1.484000] pci_bus 0000:20: root bus resource [io 0x10000-0x1ffff] (bus address [0x0000-0xffff]) [ 1.484000] pci_bus 0000:20: root bus resource [mem 0xffffffff90000000-0xffffffff9fffffff] (bus address [0x90000000-0x9fffffff]) [ 1.492000] pci_bus 0000:20: root bus resource [bus 20-27] [ 1.524000] Mercury version TR3.2 (0x32) found at 0xfffffffffed24000 [ 1.580000] LBA 0:2: PCI host bridge to bus 0000:40 [ 1.580000] pci_bus 0000:40: root bus resource [io 0x20000-0x2ffff] (bus address [0x0000-0xffff]) [ 1.580000] pci_bus 0000:40: root bus resource [mem 0xffffffffa0000000-0xffffffffafffffff] (bus address [0xa0000000-0xafffffff]) [ 1.588000] pci_bus 0000:40: root bus resource [bus 40-47] [ 1.620000] Mercury version TR3.2 (0x32) found at 0xfffffffffed26000 [ 
1.676000] LBA 0:3: PCI host bridge to bus 0000:60 [ 1.676000] pci_bus 0000:60: root bus resource [io 0x30000-0x3ffff] (bus address [0x0000-0xffff]) [ 1.676000] pci_bus 0000:60: root bus resource [mem 0xffffffffb0000000-0xffffffffbfffffff] (bus address [0xb0000000-0xbfffffff]) [ 1.684000] pci_bus 0000:60: root bus resource [bus 60-67] [ 1.716000] Mercury version TR3.2 (0x32) found at 0xfffffffffed28000 [ 1.748000] LBA: lmmio_space [0xffffffffc0000000-0xffffffffdfffffff] - new [ 1.772000] LBA 0:4: PCI host bridge to bus 0000:80 [ 1.772000] pci_bus 0000:80: root bus resource [io 0x40000-0x4ffff] (bus address [0x0000-0xffff]) [ 1.772000] pci_bus 0000:80: root bus resource [mem 0xffffffffc0000000-0xffffffffdfffffff] (bus address [0xc0000000-0xdfffffff]) [ 1.784000] pci_bus 0000:80: root bus resource [bus 80-87] [ 1.812000] Mercury version TR3.2 (0x32) found at 0xfffffffffed2c000 [ 1.868000] LBA 0:6: PCI host bridge to bus 0000:c0 [ 1.868000] pci_bus 0000:c0: root bus resource [io 0x50000-0x5ffff] (bus address [0x0000-0xffff]) [ 1.872000] pci_bus 0000:c0: root bus resource [mem 0xffffffffe0000000-0xffffffffefffffff] (bus address [0xe0000000-0xefffffff]) [ 1.880000] pci_bus 0000:c0: root bus resource [bus c0-c7] [ 1.912000] Mercury version TR3.2 (0x32) found at 0xfffffffffed2e000 [ 1.944000] LBA: lmmio_space [0xfffffffff0000000-0xfffffffffe77ffff] - new [ 1.968000] LBA 0:7: PCI host bridge to bus 0000:e0 [ 1.968000] pci_bus 0000:e0: root bus resource [io 0x60000-0x6ffff] (bus address [0x0000-0xffff]) [ 1.968000] pci_bus 0000:e0: root bus resource [mem 0xfffffffff0000000-0xfffffffffe77ffff] (bus address [0xf0000000-0xfe77ffff]) [ 1.976000] pci_bus 0000:e0: root bus resource [bus e0-e7] [ 1.988000] powersw: Soft power switch support not available. 
[ 2.024000] HugeTLB registered 2 MB page size, pre-allocated 0 pages [ 2.048000] vgaarb: setting as boot device: PCI:0000:e0:02.0 [ 2.048000] vgaarb: device added: PCI:0000:e0:02.0,decodes=io+mem,owns=io+mem,locks=none [ 2.048000] vgaarb: loaded [ 2.048000] vgaarb: bridge control possible 0000:e0:02.0 [ 2.056000] VFS: Disk quotas dquot_6.6.0 [ 2.056000] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes) [ 2.068000] NET: Registered protocol family 2 [ 2.092000] TCP established hash table entries: 16384 (order: 5, 131072 bytes) [ 2.092000] TCP bind hash table entries: 16384 (order: 6, 262144 bytes) [ 2.092000] TCP: Hash tables configured (established 16384 bind 16384) [ 2.100000] UDP hash table entries: 1024 (order: 3, 32768 bytes) [ 2.100000] UDP-Lite hash table entries: 1024 (order: 3, 32768 bytes) [ 2.108000] NET: Registered protocol family 1 [ 2.176000] Unpacking initramfs... [ 2.348000] Freeing initrd memory: 3044K (000000007fcf4000 - 000000007ffed000) [ 2.348000] Chassis warnings not supported. 
[ 2.348000] Performance monitoring counters enabled for Storm Peak Slow [ 2.352000] futex hash table entries: 2048 (order: 4, 65536 bytes) [ 2.356000] audit: initializing netlink subsys (disabled) [ 2.364000] audit: type=2000 audit(1456077963.364:1): initialized [ 2.364000] zbud: loaded [ 2.372000] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252) [ 2.372000] io scheduler noop registered [ 2.376000] io scheduler deadline registered [ 2.376000] io scheduler cfq registered (default) [ 2.384000] PDC Stable Storage facility v0.30 [ 2.432000] STI GSC/PCI core graphics driver Version 0.9b [ 2.432000] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled [ 2.476000] 0000:e0:01.0: ttyS0 at MMIO 0xfffffffff4051000 (irq = 75, base_baud = 115200) is a 16550A [ 2.520000] 0000:e0:01.1: ttyS1 at MMIO 0xfffffffff4050000 (irq = 75, base_baud = 115200) is a 16550A [ 2.520000] console [ttyS1] enabled [ 2.520000] console [ttyS1] enabled [ 2.524000] bootconsole [ttyB0] disabled [ 2.524000] bootconsole [ttyB0] disabled [ 2.552000] 0000:e0:01.1: ttyS2 at MMIO 0xfffffffff4050010 (irq = 75, base_baud = 115200) is a 16550A [ 2.572000] 0000:e0:01.1: ttyS3 at MMIO 0xfffffffff4050038 (irq = 75, base_baud = 115200) is a 16550A [ 2.572000] Linux agpgart interface v0.103 [ 2.572000] quicksilver: No AGP devices found. 
[ 2.572000] [drm] Initialized drm 1.1.0 20060810 [ 2.580000] mousedev: PS/2 mouse device common for all mice [ 2.588000] rtc-generic rtc-generic: rtc core: registered rtc-generic as rtc0 [ 2.588000] ledtrig-cpu: registered to indicate activity on CPUs [ 2.620000] NET: Registered protocol family 10 [ 2.620000] mip6: Mobile IPv6 [ 2.620000] NET: Registered protocol family 17 [ 2.620000] mpls_gso: MPLS GSO support [ 2.620000] registered taskstats version 1 [ 2.628000] zswap: loaded using pool lzo/zbud [ 2.636000] rtc-generic rtc-generic: setting system clock to 2016-02-21 18:06:03 UTC (1456077963) [ 2.660000] Freeing unused kernel memory: 1024K (0000000040100000 - 0000000040200000) Loading, please wait... /scripts/init-top/udev: line 14: can't create /sys/kernel/uevent_helper: Permission denied Begin: Loading essential drivers ... [ 3.312000] SCSI subsystem initialized [ 3.880000] sym0: <1010-66> rev 0x1 at pci 0000:20:01.0 irq 70 [ 3.884000] sym0: No NVRAM, ID 7, Fast-80, LVD, parity checking [ 3.924000] sym0: SCSI BUS has been reset. [ 3.924000] scsi host0: sym-2.2.3 [ 4.124000] sym1: <1010-66> rev 0x1 at pci 0000:20:01.1 irq 71 [ 4.128000] sym1: No NVRAM, ID 7, Fast-80, LVD, parity checking [ 4.168000] sym1: SCSI BUS has been reset. [ 4.168000] scsi host1: sym-2.2.3 [ 6.936000] scsi 0:0:0:0: Direct-Access HP 73.4G ST373405LC HP03 PQ: 0 ANSI: 2 [ 6.936000] scsi target0:0:0: tagged command queuing enabled, command queue depth 16. [ 6.936000] scsi target0:0:0: Beginning Domain Validation [ 6.956000] scsi target0:0:0: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31) [ 6.972000] scsi target0:0:0: Ending Domain Validation [ 6.980000] scsi 0:0:1:0: Direct-Access HP 73.4G MAX3073NC HPC1 PQ: 0 ANSI: 3 [ 6.980000] scsi target0:0:1: tagged command queuing enabled, command queue depth 16. 
[ 6.980000] scsi target0:0:1: Beginning Domain Validation [ 6.996000] scsi target0:0:1: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31) [ 7.004000] scsi target0:0:1: Ending Domain Validation [ 7.816000] scsi 1:0:2:0: Direct-Access HP 73.4G ST373405LC HP03 PQ: 0 ANSI: 2 [ 7.820000] scsi target1:0:2: tagged command queuing enabled, command queue depth 16. [ 7.820000] scsi target1:0:2: Beginning Domain Validation [ 7.840000] scsi target1:0:2: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31) [ 7.856000] scsi target1:0:2: Ending Domain Validation [ 8.936000] random: nonblocking pool is initialized [ 10.888000] sd 0:0:0:0: [sda] 143374738 512-byte logical blocks: (73.4 GB/68.4 GiB) [ 10.888000] sd 0:0:1:0: [sdb] 143374738 512-byte logical blocks: (73.4 GB/68.4 GiB) [ 10.888000] sd 0:0:0:0: [sda] Write Protect is off [ 10.892000] sd 0:0:1:0: [sdb] Write Protect is off [ 10.892000] sd 0:0:1:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA [ 10.916000] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA [ 10.928000] sdb: sdb1 sdb2 sdb3 [ 10.968000] sda: sda1 sda2 sda3 [ 10.988000] sd 0:0:1:0: [sdb] Attached SCSI disk [ 10.996000] sd 0:0:0:0: [sda] Attached SCSI disk [ 11.428000] sd 1:0:2:0: [sdc] 143374738 512-byte logical blocks: (73.4 GB/68.4 GiB) [ 11.428000] sd 1:0:2:0: [sdc] Write Protect is off [ 11.432000] sd 1:0:2:0: [sdc] Write cache: disabled, read cache: enabled, supports DPO and FUA [ 11.472000] sdc: sdc1 [ 11.500000] sd 1:0:2:0: [sdc] Attached SCSI disk done. Begin: Running /scripts/init-premount ... done. Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done. error sending message: Connection refused udevadm[129]: error sending message: Connection refused Begin: Running /scripts/local-premount ... done. [ 12.300000] EXT4-fs (sda3): mounting ext3 file system using the ext4 subsystem [ 12.320000] EXT4-fs (sda3): mounted filesystem with ordered data mode. 
Opts: (null) Begin: Running /scripts/local-bottom ... done. done. Begin: Running /scripts/init-bottom ... done. INIT: version 2.88 booting Setting hostname to 'ion'...done. udev requires hotplug support, not started. ... failed! failed! Setting the system clock. System Clock set to: Sun Feb 21 18:06:17 UTC 2016. Activating swap:swapon on /dev/sda2 swapon: /dev/sda2: found swap signature: version 1, page-size 4, same byte order swapon: /dev/sda2: pagesize=4096, swapsize=1028157440, devsize=1028160000 [ 16.128000] Adding 1004056k swap on /dev/sda2. Priority:-1 extents:1 across:1004056k FS . [ 16.296000] EXT4-fs (sda3): re-mounted. Opts: (null) Will now check root file system:fsck from util-linux-ng 2.17.2 [/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a -C0 /dev/sda3 /dev/sda3: clean, 1959576/8814592 files, 10926496/17605231 blocks . [ 16.616000] EXT4-fs (sda3): re-mounted. Opts: errors=remount-ro Setting the system clock. System Clock set to: Sun Feb 21 18:06:19 UTC 2016. Cleaning up ifupdown.... Loading kernel module tg3. WARNING: All config files need .conf: /etc/modprobe.d/arch, it will be ignored in a future release. [ 18.980000] pps_core: LinuxPPS API ver. 1 registered [ 18.980000] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it> [ 19.092000] PTP clock support registered [ 19.384000] tg3.c:v3.137 (May 11, 2014) [ 20.904000] tg3 0000:20:02.0 eth0: Tigon3 [partno(BCM95700A6) rev 0105] (PCI:66MHz:64-bit) MAC address 00:30:6e:4b:15:59 [ 20.904000] tg3 0000:20:02.0 eth0: attached PHY is 5701 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0]) [ 20.912000] tg3 0000:20:02.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[0] [ 20.912000] tg3 0000:20:02.0 eth0: dma_rwctrl[76ff2d0f] dma_mask[32-bit] Will now activate lvm and md swap:done. Will now check all file systems. fsck from util-linux-ng 2.17.2 Checking all file systems. 
[/sbin/fsck.ext2 (1) -- /boot] fsck.ext2 -a -C0 /dev/sda1 /dev/sda1: clean, 55/64256 files, 153810/257008 blocks Done checking file systems. A log is being saved in /var/log/fsck/checkfs if that location is writable.. Setting kernel variables ... /etc/sysctl.conf... /etc/sysctl.d/bindv6only.conf...done. Will now mount local filesystems:[ 21.568000] EXT4-fs (sda1): mounting ext2 file system using the ext4 subsystem [ 21.580000] EXT4-fs (sda1): mounted filesystem without journal. Opts: (null) mount: /sys already mounted or /sys busy mount: according to mtab, sysfs is already mounted on /sys failed! Will now activate swapfile swap:done. Cleaning up temporary files...Cleaning /tmp...done. Cleaning /var/run...done. Cleaning /var/lock...done. . Checking minimum space in /tmp...done. Running 0dns-down to make sure resolv.conf is ok...done. Setting up networking.... /etc/network/options still exists and it will be IGNORED! Read README.Debian of netbase. ... (warning). Configuring network interfaces...Internet Systems Consortium DHCP Client 4.1.1-P1 Copyright 2004-2010 Internet Systems Consortium. All rights reserved. For info, please visit https://www.isc.org/software/dhcp/ [ 24.416000] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready Listening on LPF/eth0/00:30:6e:4b:15:59 Sending on LPF/eth0/00:30:6e:4b:15:59 Sending on Socket/fallback ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: SCSI bug
2016-02-21 3:47 ` James Bottomley
2016-02-21 14:45 ` John David Anglin
@ 2016-02-21 18:09 ` John David Anglin
2016-02-21 18:13 ` James Bottomley
1 sibling, 1 reply; 25+ messages in thread
From: John David Anglin @ 2016-02-21 18:09 UTC (permalink / raw)
To: James Bottomley; +Cc: Helge Deller, linux-parisc List

[-- Attachment #1: Type: text/plain, Size: 21408 bytes --]

On 2016-02-20, at 10:47 PM, James Bottomley wrote:

> On Sat, 2016-02-20 at 21:52 -0500, John David Anglin wrote:
>> On 2016-02-20, at 5:52 PM, John David Anglin wrote:
>>
>>> On 2016-02-20, at 4:59 PM, Helge Deller wrote:
>>>
>>>> On 20.02.2016 21:43, John David Anglin wrote:
>>>>> On 2016-02-20, at 3:13 PM, John David Anglin wrote:
>>>>>
>>>>>> On 2016-01-23, at 1:00 PM, John David Anglin wrote:
>>>>>>
>>>>>>> WARNING: at block/blk-merge.c:454
>>>>>>
>>>>>> With linux-image-4.4.0-1-parisc64-smp on c3740, the above
>>>>>> warning is the last message I see.
>>>>>> Kernel seems to hang at that point. This is warning code:
>>>>>>
>>>>>>         /*
>>>>>>          * Something must have been wrong if the figured number of
>>>>>>          * segment is bigger than number of req's physical segments
>>>>>>          */
>>>>>>         WARN_ON(nsegs > rq->nr_phys_segments);
>>>>>
>>>>> On Sep. 12, 2015, I reported the following problem:
>>>>>
>>>>> http://www.spinics.net/lists/linux-parisc/msg06327.html
>>>>
>>>> The problem is still that this bug can only be reproduced at
>>>> every boot when the scsi drivers are built as modules (and in an
>>>> initrd). I could never reproduce it when I booted a kernel with
>>>> built-in scsi drivers.
>>>>
>>>> The bug seems to be triggered by the (*nsegs)++ command in
>>>> __blk_segment_map_sg() in block/blk-merge.c.
>>>> I'm testing with the 4.4.2 kernel from debian.
>>>> I modified __blk_segment_map_sg() like that:
>>>> static inline void
>>>> __blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
>>>>                      struct scatterlist *sglist, struct bio_vec *bvprv,
>>>>                      struct scatterlist **sg, int *nsegs, int *cluster)
>>>> {
>>>>
>>>>         int nbytes = bvec->bv_len;
>>>>
>>>>         if (*sg && *cluster) {
>>>>                 if ((*sg)->length + nbytes > queue_max_segment_size(q))
>>>>                         goto new_segment;
>>>>
>>>>                 if (!BIOVEC_PHYS_MERGEABLE(bvprv, bvec))
>>>>                         goto new_segment;
>>>>                 if (!BIOVEC_SEG_BOUNDARY(q, bvprv, bvec))
>>>>                         goto new_segment;
>>>>
>>>>                 (*sg)->length += nbytes;
>>>>         } else {
>>>> new_segment:
>>>>                 if (*sg && *cluster) {
>>>>                         printk("NEW SEGMENT sg = %p!!!\n", sg);
>>>>                         printk("__blk_segment_map_sg: length = %d, nbytes = %d, sum = %d > %d\n",
>>>>                                (*sg)->length, nbytes, (*sg)->length + nbytes,
>>>>                                queue_max_segment_size(q));
>>>>                         printk("__blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = %d, BIOVEC_SEG_BOUNDARY = %d\n",
>>>>                                BIOVEC_PHYS_MERGEABLE(bvprv, bvec),
>>>>                                BIOVEC_SEG_BOUNDARY(q, bvprv, bvec) );
>>>>                 }
>>>>                 if (!*sg)
>>>>                         *sg = sglist;
>>>>                 else {
>>>>                         /*
>>>>                          * If the driver previously mapped a shorter
>>>>                          * list, we could see a termination bit
>>>>                          * prematurely unless it fully inits the sg
>>>>                          * table on each mapping. We KNOW that there
>>>>                          * must be more entries here or the driver
>>>>                          * would be buggy, so force clear the
>>>>                          * termination bit to avoid doing a full
>>>>                          * sg_init_table() in drivers for each command.
>>>>                          */
>>>>                         sg_unmark_end(*sg);
>>>>                         *sg = sg_next(*sg);
>>>>                 }
>>>>
>>>>                 sg_set_page(*sg, bvec->bv_page, nbytes, bvec->bv_offset);
>>>>                 (*nsegs)++;
>>>>         }
>>>>         *bvprv = *bvec;
>>>> }
>>>>
>>>> The boot log looks then like this:
>>>> [ 43.044000] scsi_init_sgtable: count = 1, nents = 1
>>>> (there are lots of those before it!)
>>>> [ 43.164000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 43.164000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 43.280000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 43.280000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 43.396000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 43.396000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 43.512000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 43.512000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 43.628000] scsi_init_sgtable: nr_phys_segments = 3 >>>> [ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 43.628000] scsi_init_sgtable: count = 3, nents = 3 >>>> [ 44.224000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 44.224000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 44.340000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 44.340000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 44.456000] scsi_init_sgtable: nr_phys_segments = 7 >>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! 
>>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 44.456000] scsi_init_sgtable: count = 7, nents = 7 >>>> [ 44.456000] timer_interrupt(CPU 0): delayed! cycles 4527081F >>>> rem C6C21 next/now 14E153306E/14E146C44D >>>> [ 46.116000] scsi_init_sgtable: nr_phys_segments = 7 >>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! 
>>>> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 46.116000] __blk_segment_map_sg: length = 8192, nbytes = >>>> 4096, sum = 12288 > 65536 >>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>> [ 46.116000] __blk_segment_map_sg: length = 16384, nbytes = >>>> 4096, sum = 20480 > 65536 >>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 46.116000] scsi_init_sgtable: count = 7, nents = 7 >>>> [ 46.116000] timer_interrupt(CPU 0): delayed! cycles 453F0A77 >>>> rem 223089 next/now 152BB6286E/152B93F7E5 >>>> [ 47.780000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 47.780000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 47.896000] scsi_init_sgtable: nr_phys_segments = 6 >>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 47.896000] __blk_segment_map_sg: length = 61440, nbytes = >>>> 4096, sum = 65536 > 65536 >>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = >>>> 4096, sum = 8192 > 65536 >>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! 
>>>> [ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = >>>> 4096, sum = 12288 > 65536 >>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = >>>> 4096, sum = 12288 > 65536 >>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 47.896000] scsi_init_sgtable: count = 6, nents = 6 >>>> [ 47.896000] timer_interrupt(CPU 0): delayed! cycles 3AB087E2 >>>> rem 23E4DE next/now 1570BBD5EE/157097F110 >>>> [ 49.324000] scsi_init_sgtable: nr_phys_segments = 1 >>>> [ 49.324000] scsi_init_sgtable: count = 1, nents = 1 >>>> [ 49.440000] scsi_init_sgtable: nr_phys_segments = 2 >>>> [ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 49.440000] __blk_segment_map_sg: length = 65536, nbytes = >>>> 4096, sum = 69632 > 65536 >>>> >>>> (this is interesting! Here we reach a sum of > 65536 the first >>>> time) >>>> >>>> [ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 1, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!! >>>> [ 49.440000] __blk_segment_map_sg: length = 16384, nbytes = >>>> 4096, sum = 20480 > 65536 >>>> [ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, >>>> BIOVEC_SEG_BOUNDARY = 1 >>>> [ 49.440000] *** FIXIT *** HELGE: nsegs > rq->nr_phys_segments >>>> = 3 > 2 >>>> [ 49.440000] scsi_init_sgtable: count = 3, nents = 2 >>>> [ 50.116000] ------------[ cut here ]------------ >>>> [ 50.172000] WARNING: at /build/linux-4.4/linux >>>> -4.4.2/drivers/scsi/scsi_lib.c:1104 >>>> >>>> (this is usually a BUG(). I changed it to WARN() in the hope it >>>> would work anyway. It didn't.) 
>>>> >>>> [ 50.260000] Modules linked in: sd_mod sr_mod cdrom ata_generic >>>> ohci_pci ehci_pci ohci_hcd ehci_hcd pata_ns87415 sym53c8xx libata >>>> scsi_transport_spi scsi_mod usbcorep >>>> [ 50.456000] CPU: 0 PID: 70 Comm: systemd-udevd Not tainted >>>> 4.4.0-1-parisc64-smp #5 Debian 4.4.2-2 >>>> [ 50.564000] task: 000000007f948b28 ti: 000000007fa90000 >>>> task.ti: 000000007fa90000 >>>> [ 50.652000] >>>> [ 50.672000] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI >>>> [ 50.728000] PSW: 00001000000001001111100100001110 Not tainted >>>> [ 50.796000] r00-03 000000ff0804f90e 00000000409ea2e0 >>>> 00000000003e2ee0 000000007fa91140 >>>> [ 50.892000] r04-07 00000000003cd000 000000007f914300 >>>> 000000007f914b10 0000000000000003 >>>> [ 50.988000] r08-11 0000000000000000 000000007f918000 >>>> 0000000040bdd6b0 00000000003cd800 >>>> [ 51.084000] r12-15 0000000000000000 000000007fa90778 >>>> 00000000003cd000 000000007f918000 >>>> [ 51.180000] r16-19 0000000000001300 0000000040bdd6b8 >>>> 0000000040bdd6bc 0000000040ba2420 >>>> [ 51.276000] r20-23 0000000099116e92 0000000000000000 >>>> 00000000000002a0 00000000000002ee >>>> [ 51.372000] r24-27 0000000000000000 000000000800000e >>>> 0000000040b60750 00000000409b3ae0 >>>> [ 51.468000] r28-31 0000000000000002 000000007fa914f0 >>>> 000000007fa911e0 0000000040ba2408 >>>> [ 51.564000] sr00-03 0000000000015000 0000000000000000 >>>> 0000000000000000 0000000000015000 >>>> [ 51.660000] sr04-07 0000000000000000 0000000000000000 >>>> 0000000000000000 0000000000000000 >>>> [ 51.756000] >>>> [ 51.772000] IASQ: 0000000000000000 0000000000000000 IAOQ: >>>> 00000000003e2f24 00000000003e2f28 >>>> [ 51.872000] IIR: 03ffe01f ISR: 0000000010340000 IOR: >>>> 000000fea4691528 >>>> [ 51.956000] CPU: 0 CR30: 000000007fa90000 CR31: >>>> 00000000ffff7dff >>>> [ 52.040000] ORIG_R28: 0000000040b60718 >>>> [ 52.084000] IAOQ[0]: scsi_init_sgtable+0xfc/0x1b8 [scsi_mod] >>>> [ 52.152000] IAOQ[1]: scsi_init_sgtable+0x100/0x1b8 [scsi_mod] >>>> [ 52.224000] RP(r2): 
scsi_init_sgtable+0xb8/0x1b8 [scsi_mod]
>>>> [ 52.292000] Backtrace:
>>>> [ 52.320000] [<00000000003e304c>] scsi_init_io+0x6c/0x258 [scsi_mod]
>>>> [ 52.396000] [<000000000087d078>] sd_init_command+0x70/0xec8 [sd_mod]
>>>>
>>>> In general I think the bug is somehow in blk-merge.c.
>>>> But I'm not an expert in that code.
>>>
>>> The warning was added in this patch sequence:
>>> https://lkml.org/lkml/2015/11/23/996
>>>
>>> Possibly, but above seems to indicate that it could be driver issue
>>> as well.
>>
>>
>> I believe this bug was introduced by the following merge:
>>
>> commit 1081230b748de8f03f37f80c53dfa89feda9b8de
>> Merge: df91039 2ca495a
>> Author: Linus Torvalds <torvalds@linux-foundation.org>
>> Date: Wed Sep 2 13:10:25 2015 -0700
>>
>>     Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block
>>
>>     Pull core block updates from Jens Axboe:
>>      "This first core part of the block IO changes contains:
>>
>>       - Cleanup of the bio IO error signaling from Christoph. We used to
>>         rely on the uptodate bit and passing around of an error, now we
>>         store the error in the bio itself.
>>
>>       - Improvement of the above from myself, by shrinking the bio size
>>         down again to fit in two cachelines on x86-64.
>>
>>       - Revert of the max_hw_sectors cap removal from a revision again,
>>         from Jeff Moyer. This caused performance regressions in various
>>         tests. Reinstate the limit, bump it to a more reasonable size
>>         instead.
>>
>>       - Make /sys/block/<dev>/queue/discard_max_bytes writeable, by me.
>>         Most devices have huge trim limits, which can cause nasty latencies
>>         when deleting files. Enable the admin to configure the size down.
>>         We will look into having a more sane default instead of UINT_MAX
>>         sectors.
>>
>>       - Improvement of the SG gaps logic from Keith Busch.
>>
>>       - Enable the block core to handle arbitrarily sized bios, which
>>         enables a nice simplification of bio_add_page() (which is an IO hot
>>         path).
From Kent. >> >> - Improvements to the partition io stats accounting, making it >> faster. From Ming Lei. >> >> - Also from Ming Lei, a basic fixup for overflow of the sysfs >> pending >> file in blk-mq, as well as a fix for a blk-mq timeout race >> condition. >> >> - Ming Lin has been carrying Kents above mentioned patches >> forward >> for a while, and testing them. Ming also did a few fixes >> around >> that. >> >> - Sasha Levin found and fixed a use-after-free problem >> introduced by >> the bio->bi_error changes from Christoph. >> >> - Small blk cgroup cleanup from Viresh Kumar" >> >> * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits) >> blk: Fix bio_io_vec index when checking bvec gaps >> block: Replace SG_GAPS with new queue limits mask >> block: bump BLK_DEF_MAX_SECTORS to 2560 >> Revert "block: remove artifical max_hw_sectors cap" >> blk-mq: fix race between timeout and freeing request >> blk-mq: fix buffer overflow when reading sysfs file of >> 'pending' >> Documentation: update notes in biovecs about arbitrarily sized >> bios >> block: remove bio_get_nr_vecs() >> fs: use helper bio_add_page() instead of open coding on >> bi_io_vec >> block: kill merge_bvec_fn() completely >> md/raid5: get rid of bio_fits_rdev() >> md/raid5: split bio for chunk_aligned_read >> block: remove split code in blkdev_issue_{discard,write_same} >> btrfs: remove bio splitting and merge_bvec_fn() calls >> bcache: remove driver private bio splitting code >> block: simplify bio_add_page() >> block: make generic_make_request handle arbitrarily sized bios >> blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL) >> block: don't access bio->bi_error after bio_put() >> block: shrink struct bio down to 2 cache lines again >> ... > > If you can bisect it down to the exact commit, I might be able to work > out what's the problem. Otherwise, even in an all modular config, I > can't reproduce this on 4.5-rc4, so it may be fixed upstream (just not > backported). 

Okay, the bug was introduced by the following change:

commit 54efd50bfd873e2dbf784e0b21a8027ba4299a3e
Author: Kent Overstreet <kent.overstreet@gmail.com>
Date: Thu Apr 23 22:37:18 2015 -0700

    block: make generic_make_request handle arbitrarily sized bios

    The way the block layer is currently written, it goes to great lengths
    to avoid having to split bios; upper layer code (such as bio_add_page())
    checks what the underlying device can handle and tries to always create
    bios that don't need to be split.

    But this approach becomes unwieldy and eventually breaks down with
    stacked devices and devices with dynamic limits, and it adds a lot of
    complexity. If the block layer could split bios as needed, we could
    eliminate a lot of complexity elsewhere - particularly in stacked
    drivers. Code that creates bios can then create whatever size bios are
    convenient, and more importantly stacked drivers don't have to deal with
    both their own bio size limitations and the limitations of the
    (potentially multiple) devices underneath them. In the future this will
    let us delete merge_bvec_fn and a bunch of other code.

    We do this by adding calls to blk_queue_split() to the various
    make_request functions that need it - a few can already handle arbitrary
    size bios. Note that we add the call _after_ any call to
    blk_queue_bounce(); this means that blk_queue_split() and
    blk_recalc_rq_segments() don't need to be concerned with bouncing
    affecting segment merging.

    Some make_request_fn() callbacks were simple enough to audit and verify
    they don't need blk_queue_split() calls. The skipped ones are:

     * nfhd_make_request (arch/m68k/emu/nfblock.c)
     * axon_ram_make_request (arch/powerpc/sysdev/axonram.c)
     * simdisk_make_request (arch/xtensa/platforms/iss/simdisk.c)
     * brd_make_request (ramdisk - drivers/block/brd.c)
     * mtip_submit_request (drivers/block/mtip32xx/mtip32xx.c)
     * loop_make_request
     * null_queue_bio
     * bcache's make_request fns

    Some others are almost certainly safe to remove now, but will be left
    for future patches.

    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Ming Lei <ming.lei@canonical.com>
    Cc: Neil Brown <neilb@suse.de>
    Cc: Alasdair Kergon <agk@redhat.com>
    Cc: Mike Snitzer <snitzer@redhat.com>
    Cc: dm-devel@redhat.com
    Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
    Cc: drbd-user@lists.linbit.com
    Cc: Jiri Kosina <jkosina@suse.cz>
    Cc: Geoff Levand <geoff@infradead.org>
    Cc: Jim Paris <jim@jtan.com>
    Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Nitin Gupta <ngupta@vflare.org>
    Cc: Oleg Drokin <oleg.drokin@intel.com>
    Cc: Andreas Dilger <andreas.dilger@intel.com>
    Acked-by: NeilBrown <neilb@suse.de> (for the 'md/md.c' bits)
    Acked-by: Mike Snitzer <snitzer@redhat.com>
    Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
    [dpark: skip more mq-based drivers, resolve merge conflicts, etc.]
    Signed-off-by: Dongsu Park <dpark@posteo.net>
    Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>

Attached diff.

Dave
--
John David Anglin  dave.anglin@bell.net

[-- Attachment #2: generic_make_request.d.txt --]
[-- Type: text/plain, Size: 14370 bytes --]

diff --git a/block/blk-core.c b/block/blk-core.c
index d1796b5..60912e9 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -643,6 +643,10 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 	if (q->id < 0)
 		goto fail_q;
 
+	q->bio_split = bioset_create(BIO_POOL_SIZE, 0);
+	if (!q->bio_split)
+		goto fail_id;
+
 	q->backing_dev_info.ra_pages =
 			(VM_MAX_READAHEAD * 1024) / PAGE_CACHE_SIZE;
 	q->backing_dev_info.capabilities = BDI_CAP_CGROUP_WRITEBACK;
@@ -651,7 +655,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 
 	err = bdi_init(&q->backing_dev_info);
 	if (err)
-		goto fail_id;
+		goto fail_split;
 
 	setup_timer(&q->backing_dev_info.laptop_mode_wb_timer,
 		    laptop_mode_timer_fn, (unsigned long) q);
@@ -693,6 +697,8 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 
 fail_bdi:
 	bdi_destroy(&q->backing_dev_info);
+fail_split:
+	bioset_free(q->bio_split);
 fail_id:
 	ida_simple_remove(&blk_queue_ida, q->id);
 fail_q:
@@ -1610,6 +1616,8 @@ static void blk_queue_bio(struct request_queue *q, struct bio *bio)
 	struct request *req;
 	unsigned int request_count = 0;
 
+	blk_queue_split(q, &bio, q->bio_split);
+
 	/*
 	 * low level driver can indicate that it wants pages above a
 	 * certain limit bounced to low memory (ie for highmem, or even
@@ -1832,15 +1840,6 @@ generic_make_request_checks(struct bio *bio)
 		goto end_io;
 	}
 
-	if (likely(bio_is_rw(bio) &&
-		   nr_sectors > queue_max_hw_sectors(q))) {
-		printk(KERN_ERR "bio too big device %s (%u > %u)\n",
-		       bdevname(bio->bi_bdev, b),
-		       bio_sectors(bio),
-		       queue_max_hw_sectors(q));
-		goto end_io;
-	}
-
 	part = bio->bi_bdev->bd_part;
 	if (should_fail_request(part, bio->bi_iter.bi_size) ||
 	    should_fail_request(&part_to_disk(part)->part0,
diff --git a/block/blk-merge.c b/block/blk-merge.c
index a455b98..d9c3a75 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -9,12 +9,158 @@
 
 #include "blk.h"
 
+static struct bio *blk_bio_discard_split(struct request_queue *q,
+					 struct bio *bio,
+					 struct bio_set *bs)
+{
+	unsigned int max_discard_sectors, granularity;
+	int alignment;
+	sector_t tmp;
+	unsigned split_sectors;
+
+	/* Zero-sector (unknown) and one-sector granularities are the same. */
+	granularity = max(q->limits.discard_granularity >> 9, 1U);
+
+	max_discard_sectors = min(q->limits.max_discard_sectors, UINT_MAX >> 9);
+	max_discard_sectors -= max_discard_sectors % granularity;
+
+	if (unlikely(!max_discard_sectors)) {
+		/* XXX: warn */
+		return NULL;
+	}
+
+	if (bio_sectors(bio) <= max_discard_sectors)
+		return NULL;
+
+	split_sectors = max_discard_sectors;
+
+	/*
+	 * If the next starting sector would be misaligned, stop the discard at
+	 * the previous aligned sector.
+	 */
+	alignment = (q->limits.discard_alignment >> 9) % granularity;
+
+	tmp = bio->bi_iter.bi_sector + split_sectors - alignment;
+	tmp = sector_div(tmp, granularity);
+
+	if (split_sectors > tmp)
+		split_sectors -= tmp;
+
+	return bio_split(bio, split_sectors, GFP_NOIO, bs);
+}
+
+static struct bio *blk_bio_write_same_split(struct request_queue *q,
+					    struct bio *bio,
+					    struct bio_set *bs)
+{
+	if (!q->limits.max_write_same_sectors)
+		return NULL;
+
+	if (bio_sectors(bio) <= q->limits.max_write_same_sectors)
+		return NULL;
+
+	return bio_split(bio, q->limits.max_write_same_sectors, GFP_NOIO, bs);
+}
+
+static struct bio *blk_bio_segment_split(struct request_queue *q,
+					 struct bio *bio,
+					 struct bio_set *bs)
+{
+	struct bio *split;
+	struct bio_vec bv, bvprv;
+	struct bvec_iter iter;
+	unsigned seg_size = 0, nsegs = 0;
+	int prev = 0;
+
+	struct bvec_merge_data bvm = {
+		.bi_bdev = bio->bi_bdev,
+		.bi_sector = bio->bi_iter.bi_sector,
+		.bi_size = 0,
+		.bi_rw = bio->bi_rw,
+	};
+
+	bio_for_each_segment(bv, bio, iter) {
+		if (q->merge_bvec_fn &&
+		    q->merge_bvec_fn(q, &bvm, &bv) < (int) bv.bv_len)
+			goto split;
+
+		bvm.bi_size += bv.bv_len;
+
+		if (bvm.bi_size >> 9 > queue_max_sectors(q))
+			goto split;
+
+		/*
+		 * If the queue doesn't support SG gaps and adding this
+		 * offset would create a gap, disallow it.
+		 */
+		if (q->queue_flags & (1 << QUEUE_FLAG_SG_GAPS) &&
+		    prev && bvec_gap_to_prev(&bvprv, bv.bv_offset))
+			goto split;
+
+		if (prev && blk_queue_cluster(q)) {
+			if (seg_size + bv.bv_len > queue_max_segment_size(q))
+				goto new_segment;
+			if (!BIOVEC_PHYS_MERGEABLE(&bvprv, &bv))
+				goto new_segment;
+			if (!BIOVEC_SEG_BOUNDARY(q, &bvprv, &bv))
+				goto new_segment;
+
+			seg_size += bv.bv_len;
+			bvprv = bv;
+			prev = 1;
+			continue;
+		}
+new_segment:
+		if (nsegs == queue_max_segments(q))
+			goto split;
+
+		nsegs++;
+		bvprv = bv;
+		prev = 1;
+		seg_size = bv.bv_len;
+	}
+
+	return NULL;
+split:
+	split = bio_clone_bioset(bio, GFP_NOIO, bs);
+
+	split->bi_iter.bi_size -= iter.bi_size;
+	bio->bi_iter = iter;
+
+	if (bio_integrity(bio)) {
+		bio_integrity_advance(bio, split->bi_iter.bi_size);
+		bio_integrity_trim(split, 0, bio_sectors(split));
+	}
+
+	return split;
+}
+
+void blk_queue_split(struct request_queue *q, struct bio **bio,
+		     struct bio_set *bs)
+{
+	struct bio *split;
+
+	if ((*bio)->bi_rw & REQ_DISCARD)
+		split = blk_bio_discard_split(q, *bio, bs);
+	else if ((*bio)->bi_rw & REQ_WRITE_SAME)
+		split = blk_bio_write_same_split(q, *bio, bs);
+	else
+		split = blk_bio_segment_split(q, *bio, q->bio_split);
+
+	if (split) {
+		bio_chain(split, *bio);
+		generic_make_request(*bio);
+		*bio = split;
+	}
+}
+EXPORT_SYMBOL(blk_queue_split);
+
 static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 					     struct bio *bio,
 					     bool no_sg_merge)
 {
 	struct bio_vec bv, bvprv = { NULL };
-	int cluster, high, highprv = 1;
+	int cluster, prev = 0;
 	unsigned int seg_size, nr_phys_segs;
 	struct bio *fbio, *bbio;
 	struct bvec_iter iter;
@@ -36,7 +182,6 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 	cluster = blk_queue_cluster(q);
 	seg_size = 0;
 	nr_phys_segs = 0;
-	high = 0;
 	for_each_bio(bio) {
 		bio_for_each_segment(bv, bio, iter) {
 			/*
@@ -46,13 +191,7 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
 			if (no_sg_merge)
 				goto new_segment;
 
-			/*
-			 * the trick here is making sure that a high page is
-			 * never considered part of another segment, since
-			 * that might change with the bounce page.
-			 */
-			high = page_to_pfn(bv.bv_page) > queue_bounce_pfn(q);
-			if (!high && !highprv && cluster) {
+			if (prev && cluster) {
 				if (seg_size + bv.bv_len
 				    > queue_max_segment_size(q))
 					goto new_segment;
@@ -72,8 +211,8 @@ new_segment:
 
 			nr_phys_segs++;
 			bvprv = bv;
+			prev = 1;
 			seg_size = bv.bv_len;
-			highprv = high;
 		}
 		bbio = bio;
 	}
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9455902..81edbd9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1287,6 +1287,8 @@ static void blk_mq_make_request(struct request_queue *q, struct bio *bio)
 		return;
 	}
 
+	blk_queue_split(q, &bio, q->bio_split);
+
 	if (!is_flush_fua && !blk_queue_nomerges(q) &&
 	    blk_attempt_plug_merge(q, bio, &request_count, &same_queue_rq))
 		return;
@@ -1372,6 +1374,8 @@ static void blk_sq_make_request(struct request_queue *q, struct bio *bio)
 		return;
 	}
 
+	blk_queue_split(q, &bio, q->bio_split);
+
 	if (!is_flush_fua && !blk_queue_nomerges(q) &&
 	    blk_attempt_plug_merge(q, bio, &request_count, NULL))
 		return;
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index b1f34e4..3e44a9d 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -561,6 +561,9 @@ static void blk_release_queue(struct kobject *kobj)
 
 	blk_trace_shutdown(q);
 
+	if (q->bio_split)
+		bioset_free(q->bio_split);
+
 	ida_simple_remove(&blk_queue_ida, q->id);
 	call_rcu(&q->rcu_head, blk_free_queue_rcu);
 }
diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index 9cb4116..923c857 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -1499,6 +1499,8 @@ void drbd_make_request(struct request_queue *q, struct bio *bio)
 	struct drbd_device *device = (struct drbd_device *) q->queuedata;
 	unsigned long start_jif;
 
+	blk_queue_split(q, &bio, q->bio_split);
+
 	start_jif = jiffies;
 
 	/*
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index a7a259e..ee7ad5e 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -2447,6 +2447,10 @@ static void pkt_make_request(struct request_queue *q, struct bio *bio)
 	char b[BDEVNAME_SIZE];
 	struct bio *split;
 
+	blk_queue_bounce(q, &bio);
+
+	blk_queue_split(q, &bio, q->bio_split);
+
 	pd = q->queuedata;
 	if (!pd) {
 		pr_err("%s incorrect request queue\n",
@@ -2477,8 +2481,6 @@ static void pkt_make_request(struct request_queue *q, struct bio *bio)
 		goto end_io;
 	}
 
-	blk_queue_bounce(q, &bio);
-
 	do {
 		sector_t zone = get_zone(bio->bi_iter.bi_sector, pd);
 		sector_t last_zone = get_zone(bio_end_sector(bio) - 1, pd);
diff --git a/drivers/block/ps3vram.c b/drivers/block/ps3vram.c
index 49b4706..d89fcac 100644
--- a/drivers/block/ps3vram.c
+++ b/drivers/block/ps3vram.c
@@ -606,6 +606,8 @@ static void ps3vram_make_request(struct request_queue *q, struct bio *bio)
 
 	dev_dbg(&dev->core, "%s\n", __func__);
 
+	blk_queue_split(q, &bio, q->bio_split);
+
 	spin_lock_irq(&priv->lock);
 	busy = !bio_list_empty(&priv->list);
 	bio_list_add(&priv->list, bio);
diff --git a/drivers/block/rsxx/dev.c b/drivers/block/rsxx/dev.c
index 63b9d2f..3163e4cdc 100644
--- a/drivers/block/rsxx/dev.c
+++ b/drivers/block/rsxx/dev.c
@@ -151,6 +151,8 @@ static void rsxx_make_request(struct request_queue *q, struct bio *bio)
 	struct rsxx_bio_meta *bio_meta;
 	int st = -EINVAL;
 
+	blk_queue_split(q, &bio, q->bio_split);
+
 	might_sleep();
 
 	if (!card)
diff --git a/drivers/block/umem.c b/drivers/block/umem.c
index 3b3afd2..04d6579 100644
--- a/drivers/block/umem.c
+++ b/drivers/block/umem.c
@@ -531,6 +531,8 @@ static void mm_make_request(struct request_queue *q, struct bio *bio)
 		 (unsigned long long)bio->bi_iter.bi_sector,
 		 bio->bi_iter.bi_size);
 
+	blk_queue_split(q, &bio, q->bio_split);
+
 	spin_lock_irq(&card->lock);
 	*card->biotail = bio;
 	bio->bi_next = NULL;
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 68c3d48..aec781a 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -900,6 +900,8 @@ static void zram_make_request(struct request_queue *queue, struct bio *bio)
 	if (unlikely(!zram_meta_get(zram)))
 		goto error;
 
+	blk_queue_split(queue, &bio, queue->bio_split);
+
 	if (!valid_io_request(zram, bio->bi_iter.bi_sector,
 					bio->bi_iter.bi_size)) {
 		atomic64_inc(&zram->stats.invalid_io);
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 7f367fc..069f8d7 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1799,6 +1799,8 @@ static void dm_make_request(struct request_queue *q, struct bio *bio)
 
 	map = dm_get_live_table(md, &srcu_idx);
 
+	blk_queue_split(q, &bio, q->bio_split);
+
 	generic_start_io_acct(rw, bio_sectors(bio), &dm_disk(md)->part0);
 
 	/* if we're suspended, we have to queue this io for later */
diff --git a/drivers/md/md.c b/drivers/md/md.c
index ac4381a..e1d8723 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -257,6 +257,8 @@ static void md_make_request(struct request_queue *q, struct bio *bio)
 	unsigned int sectors;
 	int cpu;
 
+	blk_queue_split(q, &bio, q->bio_split);
+
 	if (mddev == NULL || mddev->pers == NULL
 	    || !mddev->ready) {
 		bio_io_error(bio);
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index 8bcb822..29ea239 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -826,6 +826,8 @@ dcssblk_make_request(struct request_queue *q, struct bio *bio)
 	unsigned long source_addr;
 	unsigned long bytes_done;
 
+	blk_queue_split(q, &bio, q->bio_split);
+
 	bytes_done = 0;
 	dev_info = bio->bi_bdev->bd_disk->private_data;
 	if (dev_info == NULL)
diff --git a/drivers/s390/block/xpram.c b/drivers/s390/block/xpram.c
index 93856b9..02871f1 100644
--- a/drivers/s390/block/xpram.c
+++ b/drivers/s390/block/xpram.c
@@ -190,6 +190,8 @@ static void xpram_make_request(struct request_queue *q, struct bio *bio)
 	unsigned long page_addr;
 	unsigned long bytes;
 
+	blk_queue_split(q, &bio, q->bio_split);
+
 	if ((bio->bi_iter.bi_sector & 7) != 0 ||
 	    (bio->bi_iter.bi_size & 4095) != 0)
 		/* Request is not page-aligned. */
diff --git a/drivers/staging/lustre/lustre/llite/lloop.c b/drivers/staging/lustre/lustre/llite/lloop.c
index cc00fd1..1e33d54 100644
--- a/drivers/staging/lustre/lustre/llite/lloop.c
+++ b/drivers/staging/lustre/lustre/llite/lloop.c
@@ -340,6 +340,8 @@ static void loop_make_request(struct request_queue *q, struct bio *old_bio)
 	int rw = bio_rw(old_bio);
 	int inactive;
 
+	blk_queue_split(q, &old_bio, q->bio_split);
+
 	if (!lo)
 		goto err;
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 243f29e..ca778d9 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -463,6 +463,7 @@ struct request_queue {
 
 	struct blk_mq_tag_set *tag_set;
 	struct list_head tag_set_list;
+	struct bio_set *bio_split;
 };
 
 #define QUEUE_FLAG_QUEUED	1	/* uses generic tag queueing */
@@ -783,6 +784,8 @@ extern void blk_rq_unprep_clone(struct request *rq);
 extern int blk_insert_cloned_request(struct request_queue *q,
 				     struct request *rq);
 extern void blk_delay_queue(struct request_queue *, unsigned long);
+extern void blk_queue_split(struct request_queue *, struct bio **,
+			    struct bio_set *);
 extern void blk_recount_segments(struct request_queue *, struct bio *);
 extern int scsi_verify_blk_ioctl(struct block_device *, unsigned int);
 extern int scsi_cmd_blk_ioctl(struct block_device *, fmode_t,

^ permalink raw reply related [flat|nested] 25+ messages in thread
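
A minimal userspace sketch of the clustering rule that blk_bio_segment_split() in the diff applies to each bvec, assuming a much simplified model. This is not kernel code: struct model_vec, model_phys_mergeable() and model_count_segments() are hypothetical names made up for illustration, and the BIOVEC_SEG_BOUNDARY, SG-gap and max_sectors checks are left out. It only shows that the segment count depends on the inputs (segment size limit, clustering, physical adjacency), so two passes that see different inputs can disagree, which is the condition WARN_ON(nsegs > rq->nr_phys_segments) tests. It makes no claim about which input actually differs on parisc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a bio_vec: physical start address and length. */
struct model_vec {
	unsigned long phys;
	unsigned int len;
};

/* In this model two vectors are mergeable only if physically adjacent. */
static bool model_phys_mergeable(const struct model_vec *prev,
				 const struct model_vec *cur)
{
	return prev->phys + prev->len == cur->phys;
}

/*
 * Count segments the way the clustering branch in the diff does:
 * a vector is folded into the running segment only when clustering is
 * enabled, the combined size stays within max_seg_size, and it is
 * mergeable with the previous vector. Otherwise a new segment starts.
 */
static unsigned int model_count_segments(const struct model_vec *v, size_t n,
					 unsigned int max_seg_size,
					 bool cluster)
{
	unsigned int nsegs = 0, seg_size = 0;
	size_t i;

	assert(max_seg_size > 0);

	for (i = 0; i < n; i++) {
		if (i > 0 && cluster &&
		    seg_size + v[i].len <= max_seg_size &&
		    model_phys_mergeable(&v[i - 1], &v[i])) {
			seg_size += v[i].len;
			continue;
		}
		/* new segment */
		nsegs++;
		seg_size = v[i].len;
	}
	return nsegs;
}
```

With 17 physically contiguous 4096-byte vectors and a 65536-byte limit, the first 16 fill one segment and the 17th starts a second one, matching the "length = 65536, nbytes = 4096, sum = 69632 > 65536" line in the debug log quoted in this thread. The same input with a larger limit counts as a single segment.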
* Re: SCSI bug 2016-02-21 18:09 ` John David Anglin @ 2016-02-21 18:13 ` James Bottomley 2016-02-21 18:43 ` John David Anglin 0 siblings, 1 reply; 25+ messages in thread From: James Bottomley @ 2016-02-21 18:13 UTC (permalink / raw) To: John David Anglin; +Cc: Helge Deller, linux-parisc List On Sun, 2016-02-21 at 13:09 -0500, John David Anglin wrote: > On 2016-02-20, at 10:47 PM, James Bottomley wrote: > > > On Sat, 2016-02-20 at 21:52 -0500, John David Anglin wrote: > > > On 2016-02-20, at 5:52 PM, John David Anglin wrote: > > > > > > > On 2016-02-20, at 4:59 PM, Helge Deller wrote: > > > > > > > > > On 20.02.2016 21:43, John David Anglin wrote: > > > > > > On 2016-02-20, at 3:13 PM, John David Anglin wrote: > > > > > > > > > > > > > On 2016-01-23, at 1:00 PM, John David Anglin wrote: > > > > > > > > > > > > > > > WARNING: at block/blk-merge.c:454 > > > > > > > > > > > > > > With linux-image-4.4.0-1-parisc64-smp on c3740, the above > > > > > > > warning is the last message I see. > > > > > > > Kernel seems to hang at that point. This is warning > > > > > > > code: > > > > > > > > > > > > > > /* > > > > > > > * Something must have been wrong if the figured > > > > > > > number > > > > > > > of > > > > > > > * segment is bigger than number of req's physical > > > > > > > segments > > > > > > > */ > > > > > > > WARN_ON(nsegs > rq->nr_phys_segments); > > > > > > > > > > > > On Sep. 12, 2015, I reported the following problem: > > > > > > > > > > > > http://www.spinics.net/lists/linux-parisc/msg06327.html > > > > > > > > > > The problem is still, that this bug can only be reproduced at > > > > > every boot when then > > > > > scsi drivers are built as modules (and in an initrd). I could > > > > > never reproduce it when > > > > > I booted a kernel with built-in scsi drivers. > > > > > > > > > > The bug seems to be triggered by(*nsegs)++ command in > > > > > __blk_segment_map_sg() in block/blk-merge.c. > > > > > I'm testing with the 4.4.2 kernel from debian. 
> > > > > I modified __blk_segment_map_sg() like that: > > > > > static inline void > > > > > __blk_segment_map_sg(struct request_queue *q, struct bio_vec > > > > > *bvec, > > > > > struct scatterlist *sglist, struct bio_vec > > > > > *bvprv, > > > > > struct scatterlist **sg, int *nsegs, int > > > > > *cluster) > > > > > { > > > > > > > > > > int nbytes = bvec->bv_len; > > > > > > > > > > if (*sg && *cluster) { > > > > > if ((*sg)->length + nbytes > > > > > > queue_max_segment_size(q)) > > > > > goto new_segment; > > > > > > > > > > if (!BIOVEC_PHYS_MERGEABLE(bvprv, bvec)) > > > > > goto new_segment; > > > > > if (!BIOVEC_SEG_BOUNDARY(q, bvprv, bvec)) > > > > > goto new_segment; > > > > > > > > > > (*sg)->length += nbytes; > > > > > } else { > > > > > new_segment: > > > > > if (*sg && *cluster) { > > > > > printk("NEW SEGMENT sg = %p!!!\n", sg); > > > > > printk("__blk_segment_map_sg: length = > > > > > %d, > > > > > nbytes = %d, sum = %d > %d\n", (*sg)->length, nbytes, (*sg) > > > > > ->length + nbytes, queue_max_segment_size(q)); > > > > > printk("__blk_segment_map_sg: > > > > > BIOVEC_PHYS_MERGEABLE = %d, BIOVEC_SEG_BOUNDARY = %d\n", > > > > > BIOVEC_PHYS_MERGEABLE(bvprv, bvec), BIOVEC_SEG_BOUNDARY(q, > > > > > bvprv, > > > > > bvec) ); > > > > > } > > > > > if (!*sg) > > > > > *sg = sglist; > > > > > else { > > > > > /* > > > > > * If the driver previously mapped a > > > > > shorter > > > > > * list, we could see a termination bit > > > > > * prematurely unless it fully inits the > > > > > sg > > > > > * table on each mapping. We KNOW that > > > > > there > > > > > * must be more entries here or the > > > > > driver > > > > > * would be buggy, so force clear the > > > > > * termination bit to avoid doing a full > > > > > * sg_init_table() in drivers for each > > > > > command. 
> > > > > */ > > > > > sg_unmark_end(*sg); > > > > > *sg = sg_next(*sg); > > > > > } > > > > > > > > > > sg_set_page(*sg, bvec->bv_page, nbytes, bvec > > > > > ->bv_offset); > > > > > (*nsegs)++; > > > > > } > > > > > *bvprv = *bvec; > > > > > } > > > > > > > > > > The boot log looks then like this: > > > > > [ 43.044000] scsi_init_sgtable: count = 1, nents = 1 > > > > > (there are lots of those before it!) > > > > > [ 43.164000] scsi_init_sgtable: nr_phys_segments = 1 > > > > > [ 43.164000] scsi_init_sgtable: count = 1, nents = 1 > > > > > [ 43.280000] scsi_init_sgtable: nr_phys_segments = 1 > > > > > [ 43.280000] scsi_init_sgtable: count = 1, nents = 1 > > > > > [ 43.396000] scsi_init_sgtable: nr_phys_segments = 1 > > > > > [ 43.396000] scsi_init_sgtable: count = 1, nents = 1 > > > > > [ 43.512000] scsi_init_sgtable: nr_phys_segments = 1 > > > > > [ 43.512000] scsi_init_sgtable: count = 1, nents = 1 > > > > > [ 43.628000] scsi_init_sgtable: nr_phys_segments = 3 > > > > > [ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!! > > > > > [ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = > > > > > 4096, sum = 8192 > 65536 > > > > > [ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!! > > > > > [ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = > > > > > 4096, sum = 8192 > 65536 > > > > > [ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 43.628000] scsi_init_sgtable: count = 3, nents = 3 > > > > > [ 44.224000] scsi_init_sgtable: nr_phys_segments = 1 > > > > > [ 44.224000] scsi_init_sgtable: count = 1, nents = 1 > > > > > [ 44.340000] scsi_init_sgtable: nr_phys_segments = 1 > > > > > [ 44.340000] scsi_init_sgtable: count = 1, nents = 1 > > > > > [ 44.456000] scsi_init_sgtable: nr_phys_segments = 7 > > > > > [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! 
> > > > > [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = > > > > > 4096, sum = 8192 > 65536 > > > > > [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > > > [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = > > > > > 4096, sum = 8192 > 65536 > > > > > [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > > > [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = > > > > > 4096, sum = 8192 > 65536 > > > > > [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > > > [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = > > > > > 4096, sum = 8192 > 65536 > > > > > [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > > > [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = > > > > > 4096, sum = 8192 > 65536 > > > > > [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > > > [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = > > > > > 4096, sum = 8192 > 65536 > > > > > [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 44.456000] scsi_init_sgtable: count = 7, nents = 7 > > > > > [ 44.456000] timer_interrupt(CPU 0): delayed! cycles > > > > > 4527081F > > > > > rem C6C21 next/now 14E153306E/14E146C44D > > > > > [ 46.116000] scsi_init_sgtable: nr_phys_segments = 7 > > > > > [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! 
> > > > > [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = > > > > > 4096, sum = 8192 > 65536 > > > > > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > > > [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = > > > > > 4096, sum = 8192 > 65536 > > > > > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > > > [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = > > > > > 4096, sum = 8192 > 65536 > > > > > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > > > [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = > > > > > 4096, sum = 8192 > 65536 > > > > > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > > > [ 46.116000] __blk_segment_map_sg: length = 8192, nbytes = > > > > > 4096, sum = 12288 > 65536 > > > > > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! > > > > > [ 46.116000] __blk_segment_map_sg: length = 16384, nbytes = > > > > > 4096, sum = 20480 > 65536 > > > > > [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 46.116000] scsi_init_sgtable: count = 7, nents = 7 > > > > > [ 46.116000] timer_interrupt(CPU 0): delayed! 
cycles > > > > > 453F0A77 > > > > > rem 223089 next/now 152BB6286E/152B93F7E5 > > > > > [ 47.780000] scsi_init_sgtable: nr_phys_segments = 1 > > > > > [ 47.780000] scsi_init_sgtable: count = 1, nents = 1 > > > > > [ 47.896000] scsi_init_sgtable: nr_phys_segments = 6 > > > > > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! > > > > > [ 47.896000] __blk_segment_map_sg: length = 61440, nbytes = > > > > > 4096, sum = 65536 > 65536 > > > > > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! > > > > > [ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = > > > > > 4096, sum = 8192 > 65536 > > > > > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! > > > > > [ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = > > > > > 4096, sum = 8192 > 65536 > > > > > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! > > > > > [ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = > > > > > 4096, sum = 12288 > 65536 > > > > > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! > > > > > [ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = > > > > > 4096, sum = 12288 > 65536 > > > > > [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 47.896000] scsi_init_sgtable: count = 6, nents = 6 > > > > > [ 47.896000] timer_interrupt(CPU 0): delayed! 
cycles > > > > > 3AB087E2 > > > > > rem 23E4DE next/now 1570BBD5EE/157097F110 > > > > > [ 49.324000] scsi_init_sgtable: nr_phys_segments = 1 > > > > > [ 49.324000] scsi_init_sgtable: count = 1, nents = 1 > > > > > [ 49.440000] scsi_init_sgtable: nr_phys_segments = 2 > > > > > [ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!! > > > > > [ 49.440000] __blk_segment_map_sg: length = 65536, nbytes = > > > > > 4096, sum = 69632 > 65536 > > > > > > > > > > (this is interesting! Here we reach a sum of > 65536 the > > > > > first > > > > > time) > > > > > > > > > > [ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 1, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!! > > > > > [ 49.440000] __blk_segment_map_sg: length = 16384, nbytes = > > > > > 4096, sum = 20480 > 65536 > > > > > [ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = > > > > > 0, > > > > > BIOVEC_SEG_BOUNDARY = 1 > > > > > [ 49.440000] *** FIXIT *** HELGE: nsegs > rq > > > > > ->nr_phys_segments > > > > > = 3 > 2 > > > > > [ 49.440000] scsi_init_sgtable: count = 3, nents = 2 > > > > > [ 50.116000] ------------[ cut here ]------------ > > > > > [ 50.172000] WARNING: at /build/linux-4.4/linux > > > > > -4.4.2/drivers/scsi/scsi_lib.c:1104 > > > > > > > > > > (this is usually a BUG(). I changed it to WARN() in the hope > > > > > it > > > > > would work anyway. It didn't.) 
> > > > > > > > > > [ 50.260000] Modules linked in: sd_mod sr_mod cdrom > > > > > ata_generic > > > > > ohci_pci ehci_pci ohci_hcd ehci_hcd pata_ns87415 sym53c8xx > > > > > libata > > > > > scsi_transport_spi scsi_mod usbcorep > > > > > [ 50.456000] CPU: 0 PID: 70 Comm: systemd-udevd Not tainted > > > > > 4.4.0-1-parisc64-smp #5 Debian 4.4.2-2 > > > > > [ 50.564000] task: 000000007f948b28 ti: 000000007fa90000 > > > > > task.ti: 000000007fa90000 > > > > > [ 50.652000] > > > > > [ 50.672000] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI > > > > > [ 50.728000] PSW: 00001000000001001111100100001110 Not > > > > > tainted > > > > > [ 50.796000] r00-03 000000ff0804f90e 00000000409ea2e0 > > > > > 00000000003e2ee0 000000007fa91140 > > > > > [ 50.892000] r04-07 00000000003cd000 000000007f914300 > > > > > 000000007f914b10 0000000000000003 > > > > > [ 50.988000] r08-11 0000000000000000 000000007f918000 > > > > > 0000000040bdd6b0 00000000003cd800 > > > > > [ 51.084000] r12-15 0000000000000000 000000007fa90778 > > > > > 00000000003cd000 000000007f918000 > > > > > [ 51.180000] r16-19 0000000000001300 0000000040bdd6b8 > > > > > 0000000040bdd6bc 0000000040ba2420 > > > > > [ 51.276000] r20-23 0000000099116e92 0000000000000000 > > > > > 00000000000002a0 00000000000002ee > > > > > [ 51.372000] r24-27 0000000000000000 000000000800000e > > > > > 0000000040b60750 00000000409b3ae0 > > > > > [ 51.468000] r28-31 0000000000000002 000000007fa914f0 > > > > > 000000007fa911e0 0000000040ba2408 > > > > > [ 51.564000] sr00-03 0000000000015000 0000000000000000 > > > > > 0000000000000000 0000000000015000 > > > > > [ 51.660000] sr04-07 0000000000000000 0000000000000000 > > > > > 0000000000000000 0000000000000000 > > > > > [ 51.756000] > > > > > [ 51.772000] IASQ: 0000000000000000 0000000000000000 IAOQ: > > > > > 00000000003e2f24 00000000003e2f28 > > > > > [ 51.872000] IIR: 03ffe01f ISR: 0000000010340000 IOR: > > > > > 000000fea4691528 > > > > > [ 51.956000] CPU: 0 CR30: 000000007fa90000 CR31: > > > > > 
00000000ffff7dff > > > > > [ 52.040000] ORIG_R28: 0000000040b60718 > > > > > [ 52.084000] IAOQ[0]: scsi_init_sgtable+0xfc/0x1b8 > > > > > [scsi_mod] > > > > > [ 52.152000] IAOQ[1]: scsi_init_sgtable+0x100/0x1b8 > > > > > [scsi_mod] > > > > > [ 52.224000] RP(r2): scsi_init_sgtable+0xb8/0x1b8 > > > > > [scsi_mod] > > > > > [ 52.292000] Backtrace: > > > > > [ 52.320000] [<00000000003e304c>] scsi_init_io+0x6c/0x258 > > > > > [scsi_mod] > > > > > [ 52.396000] [<000000000087d078>] > > > > > sd_init_command+0x70/0xec8 > > > > > [sd_mod] > > > > > > > > > > In general I think the bug is somehow in blk-merge.c. > > > > > But I'm not an expert in that code. > > > > > > > > The warning was added in this patch sequence: > > > > https://lkml.org/lkml/2015/11/23/996 > > > > > > > > Possibly, but above seems to indicate that it could be driver > > > > issue > > > > as well. > > > > > > > > > I believe this bug was introduced by the following merge: > > > > > > commit 1081230b748de8f03f37f80c53dfa89feda9b8de > > > Merge: df91039 2ca495a > > > Author: Linus Torvalds <torvalds@linux-foundation.org> > > > Date: Wed Sep 2 13:10:25 2015 -0700 > > > > > > Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block > > > > > > Pull core block updates from Jens Axboe: > > > "This first core part of the block IO changes contains: > > > > > > - Cleanup of the bio IO error signaling from Christoph. We > > > used to > > > rely on the uptodate bit and passing around of an error, > > > now > > > we > > > store the error in the bio itself. > > > > > > - Improvement of the above from myself, by shrinking the > > > bio > > > size > > > down again to fit in two cachelines on x86-64. > > > > > > - Revert of the max_hw_sectors cap removal from a revision > > > again, > > > from Jeff Moyer. This caused performance regressions in > > > various > > > tests. Reinstate the limit, bump it to a more reasonable > > > size > > > instead. 
> > > > > > - Make /sys/block/<dev>/queue/discard_max_bytes writeable, > > > by > > > me. > > > Most devices have huge trim limits, which can cause nasty > > > latencies > > > when deleting files. Enable the admin to configure the > > > size > > > down. > > > We will look into having a more sane default instead of > > > UINT_MAX > > > sectors. > > > > > > - Improvement of the SGP gaps logic from Keith Busch. > > > > > > - Enable the block core to handle arbitrarily sized bios, > > > which > > > enables a nice simplification of bio_add_page() (which is > > > an > > > IO hot > > > path). From Kent. > > > > > > - Improvements to the partition io stats accounting, making > > > it > > > faster. From Ming Lei. > > > > > > - Also from Ming Lei, a basic fixup for overflow of the > > > sysfs > > > pending > > > file in blk-mq, as well as a fix for a blk-mq timeout > > > race > > > condition. > > > > > > - Ming Lin has been carrying Kents above mentioned patches > > > forward > > > for a while, and testing them. Ming also did a few fixes > > > around > > > that. > > > > > > - Sasha Levin found and fixed a use-after-free problem > > > introduced by > > > the bio->bi_error changes from Christoph. 
> > > > > > - Small blk cgroup cleanup from Viresh Kumar" > > > > > > * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 > > > commits) > > > blk: Fix bio_io_vec index when checking bvec gaps > > > block: Replace SG_GAPS with new queue limits mask > > > block: bump BLK_DEF_MAX_SECTORS to 2560 > > > Revert "block: remove artifical max_hw_sectors cap" > > > blk-mq: fix race between timeout and freeing request > > > blk-mq: fix buffer overflow when reading sysfs file of > > > 'pending' > > > Documentation: update notes in biovecs about arbitrarily > > > sized > > > bios > > > block: remove bio_get_nr_vecs() > > > fs: use helper bio_add_page() instead of open coding on > > > bi_io_vec > > > block: kill merge_bvec_fn() completely > > > md/raid5: get rid of bio_fits_rdev() > > > md/raid5: split bio for chunk_aligned_read > > > block: remove split code in > > > blkdev_issue_{discard,write_same} > > > btrfs: remove bio splitting and merge_bvec_fn() calls > > > bcache: remove driver private bio splitting code > > > block: simplify bio_add_page() > > > block: make generic_make_request handle arbitrarily sized > > > bios > > > blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL) > > > block: don't access bio->bi_error after bio_put() > > > block: shrink struct bio down to 2 cache lines again > > > ... > > > > If you can bisect it down to the exact commit, I might be able to > > work > > out what's the problem. Otherwise, even in an all modular config, > > I > > can't reproduce this on 4.5-rc4, so it may be fixed upstream (just > > not > > backported). > > > Okay, the bug was introduced by the following change: > > commit 54efd50bfd873e2dbf784e0b21a8027ba4299a3e > Author: Kent Overstreet <kent.overstreet@gmail.com> If you've verified that reverting this alone gets you a bootable kernel, it's time to report it to the appropriate lists, which would be linux-block and linux-scsi. James ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: SCSI bug 2016-02-21 18:13 ` James Bottomley @ 2016-02-21 18:43 ` John David Anglin 2016-02-21 19:07 ` James Bottomley 0 siblings, 1 reply; 25+ messages in thread From: John David Anglin @ 2016-02-21 18:43 UTC (permalink / raw) To: James Bottomley; +Cc: Helge Deller, linux-parisc List On 2016-02-21, at 1:13 PM, James Bottomley wrote: > On Sun, 2016-02-21 at 13:09 -0500, John David Anglin wrote: >> On 2016-02-20, at 10:47 PM, James Bottomley wrote: >> >>> On Sat, 2016-02-20 at 21:52 -0500, John David Anglin wrote: >>>> On 2016-02-20, at 5:52 PM, John David Anglin wrote: >>>> >>>>> On 2016-02-20, at 4:59 PM, Helge Deller wrote: >>>>> >>>>>> On 20.02.2016 21:43, John David Anglin wrote: >>>>>>> On 2016-02-20, at 3:13 PM, John David Anglin wrote: >>>>>>> >>>>>>>> On 2016-01-23, at 1:00 PM, John David Anglin wrote: >>>>>>>> >>>>>>>>> WARNING: at block/blk-merge.c:454 >>>>>>>> >>>>>>>> With linux-image-4.4.0-1-parisc64-smp on c3740, the above >>>>>>>> warning is the last message I see. >>>>>>>> Kernel seems to hang at that point. This is warning >>>>>>>> code: >>>>>>>> >>>>>>>> /* >>>>>>>> * Something must have been wrong if the figured >>>>>>>> number >>>>>>>> of >>>>>>>> * segment is bigger than number of req's physical >>>>>>>> segments >>>>>>>> */ >>>>>>>> WARN_ON(nsegs > rq->nr_phys_segments); >>>>>>> >>>>>>> On Sep. 12, 2015, I reported the following problem: >>>>>>> >>>>>>> http://www.spinics.net/lists/linux-parisc/msg06327.html >>>>>> >>>>>> The problem is still, that this bug can only be reproduced at >>>>>> every boot when then >>>>>> scsi drivers are built as modules (and in an initrd). I could >>>>>> never reproduce it when >>>>>> I booted a kernel with built-in scsi drivers. >>>>>> >>>>>> The bug seems to be triggered by(*nsegs)++ command in >>>>>> __blk_segment_map_sg() in block/blk-merge.c. >>>>>> I'm testing with the 4.4.2 kernel from debian. 
>>>>>> I modified __blk_segment_map_sg() like that: >>>>>> static inline void >>>>>> __blk_segment_map_sg(struct request_queue *q, struct bio_vec >>>>>> *bvec, >>>>>> struct scatterlist *sglist, struct bio_vec >>>>>> *bvprv, >>>>>> struct scatterlist **sg, int *nsegs, int >>>>>> *cluster) >>>>>> { >>>>>> >>>>>> int nbytes = bvec->bv_len; >>>>>> >>>>>> if (*sg && *cluster) { >>>>>> if ((*sg)->length + nbytes > >>>>>> queue_max_segment_size(q)) >>>>>> goto new_segment; >>>>>> >>>>>> if (!BIOVEC_PHYS_MERGEABLE(bvprv, bvec)) >>>>>> goto new_segment; >>>>>> if (!BIOVEC_SEG_BOUNDARY(q, bvprv, bvec)) >>>>>> goto new_segment; >>>>>> >>>>>> (*sg)->length += nbytes; >>>>>> } else { >>>>>> new_segment: >>>>>> if (*sg && *cluster) { >>>>>> printk("NEW SEGMENT sg = %p!!!\n", sg); >>>>>> printk("__blk_segment_map_sg: length = >>>>>> %d, >>>>>> nbytes = %d, sum = %d > %d\n", (*sg)->length, nbytes, (*sg) >>>>>> ->length + nbytes, queue_max_segment_size(q)); >>>>>> printk("__blk_segment_map_sg: >>>>>> BIOVEC_PHYS_MERGEABLE = %d, BIOVEC_SEG_BOUNDARY = %d\n", >>>>>> BIOVEC_PHYS_MERGEABLE(bvprv, bvec), BIOVEC_SEG_BOUNDARY(q, >>>>>> bvprv, >>>>>> bvec) ); >>>>>> } >>>>>> if (!*sg) >>>>>> *sg = sglist; >>>>>> else { >>>>>> /* >>>>>> * If the driver previously mapped a >>>>>> shorter >>>>>> * list, we could see a termination bit >>>>>> * prematurely unless it fully inits the >>>>>> sg >>>>>> * table on each mapping. We KNOW that >>>>>> there >>>>>> * must be more entries here or the >>>>>> driver >>>>>> * would be buggy, so force clear the >>>>>> * termination bit to avoid doing a full >>>>>> * sg_init_table() in drivers for each >>>>>> command. 
>>>>>> */ >>>>>> sg_unmark_end(*sg); >>>>>> *sg = sg_next(*sg); >>>>>> } >>>>>> >>>>>> sg_set_page(*sg, bvec->bv_page, nbytes, bvec >>>>>> ->bv_offset); >>>>>> (*nsegs)++; >>>>>> } >>>>>> *bvprv = *bvec; >>>>>> } >>>>>> >>>>>> The boot log looks then like this: >>>>>> [ 43.044000] scsi_init_sgtable: count = 1, nents = 1 >>>>>> (there are lots of those before it!) >>>>>> [ 43.164000] scsi_init_sgtable: nr_phys_segments = 1 >>>>>> [ 43.164000] scsi_init_sgtable: count = 1, nents = 1 >>>>>> [ 43.280000] scsi_init_sgtable: nr_phys_segments = 1 >>>>>> [ 43.280000] scsi_init_sgtable: count = 1, nents = 1 >>>>>> [ 43.396000] scsi_init_sgtable: nr_phys_segments = 1 >>>>>> [ 43.396000] scsi_init_sgtable: count = 1, nents = 1 >>>>>> [ 43.512000] scsi_init_sgtable: nr_phys_segments = 1 >>>>>> [ 43.512000] scsi_init_sgtable: count = 1, nents = 1 >>>>>> [ 43.628000] scsi_init_sgtable: nr_phys_segments = 3 >>>>>> [ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!! >>>>>> [ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = >>>>>> 4096, sum = 8192 > 65536 >>>>>> [ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 43.628000] NEW SEGMENT sg = 000000007fa911e8!!! >>>>>> [ 43.628000] __blk_segment_map_sg: length = 4096, nbytes = >>>>>> 4096, sum = 8192 > 65536 >>>>>> [ 43.628000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 43.628000] scsi_init_sgtable: count = 3, nents = 3 >>>>>> [ 44.224000] scsi_init_sgtable: nr_phys_segments = 1 >>>>>> [ 44.224000] scsi_init_sgtable: count = 1, nents = 1 >>>>>> [ 44.340000] scsi_init_sgtable: nr_phys_segments = 1 >>>>>> [ 44.340000] scsi_init_sgtable: count = 1, nents = 1 >>>>>> [ 44.456000] scsi_init_sgtable: nr_phys_segments = 7 >>>>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! 
>>>>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>>>> 4096, sum = 8192 > 65536 >>>>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>>>> 4096, sum = 8192 > 65536 >>>>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>>>> 4096, sum = 8192 > 65536 >>>>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>>>> 4096, sum = 8192 > 65536 >>>>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>>>> 4096, sum = 8192 > 65536 >>>>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 44.456000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>>>> [ 44.456000] __blk_segment_map_sg: length = 4096, nbytes = >>>>>> 4096, sum = 8192 > 65536 >>>>>> [ 44.456000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 44.456000] scsi_init_sgtable: count = 7, nents = 7 >>>>>> [ 44.456000] timer_interrupt(CPU 0): delayed! cycles >>>>>> 4527081F >>>>>> rem C6C21 next/now 14E153306E/14E146C44D >>>>>> [ 46.116000] scsi_init_sgtable: nr_phys_segments = 7 >>>>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! 
>>>>>> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = >>>>>> 4096, sum = 8192 > 65536 >>>>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>>>> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = >>>>>> 4096, sum = 8192 > 65536 >>>>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>>>> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = >>>>>> 4096, sum = 8192 > 65536 >>>>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>>>> [ 46.116000] __blk_segment_map_sg: length = 4096, nbytes = >>>>>> 4096, sum = 8192 > 65536 >>>>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>>>> [ 46.116000] __blk_segment_map_sg: length = 8192, nbytes = >>>>>> 4096, sum = 12288 > 65536 >>>>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 46.116000] NEW SEGMENT sg = 00000000bfca0f98!!! >>>>>> [ 46.116000] __blk_segment_map_sg: length = 16384, nbytes = >>>>>> 4096, sum = 20480 > 65536 >>>>>> [ 46.116000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 46.116000] scsi_init_sgtable: count = 7, nents = 7 >>>>>> [ 46.116000] timer_interrupt(CPU 0): delayed! cycles >>>>>> 453F0A77 >>>>>> rem 223089 next/now 152BB6286E/152B93F7E5 >>>>>> [ 47.780000] scsi_init_sgtable: nr_phys_segments = 1 >>>>>> [ 47.780000] scsi_init_sgtable: count = 1, nents = 1 >>>>>> [ 47.896000] scsi_init_sgtable: nr_phys_segments = 6 >>>>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! 
>>>>>> [ 47.896000] __blk_segment_map_sg: length = 61440, nbytes = >>>>>> 4096, sum = 65536 > 65536 >>>>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >>>>>> [ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = >>>>>> 4096, sum = 8192 > 65536 >>>>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >>>>>> [ 47.896000] __blk_segment_map_sg: length = 4096, nbytes = >>>>>> 4096, sum = 8192 > 65536 >>>>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >>>>>> [ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = >>>>>> 4096, sum = 12288 > 65536 >>>>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 47.896000] NEW SEGMENT sg = 000000007fa911e8!!! >>>>>> [ 47.896000] __blk_segment_map_sg: length = 8192, nbytes = >>>>>> 4096, sum = 12288 > 65536 >>>>>> [ 47.896000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 47.896000] scsi_init_sgtable: count = 6, nents = 6 >>>>>> [ 47.896000] timer_interrupt(CPU 0): delayed! cycles >>>>>> 3AB087E2 >>>>>> rem 23E4DE next/now 1570BBD5EE/157097F110 >>>>>> [ 49.324000] scsi_init_sgtable: nr_phys_segments = 1 >>>>>> [ 49.324000] scsi_init_sgtable: count = 1, nents = 1 >>>>>> [ 49.440000] scsi_init_sgtable: nr_phys_segments = 2 >>>>>> [ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!! >>>>>> [ 49.440000] __blk_segment_map_sg: length = 65536, nbytes = >>>>>> 4096, sum = 69632 > 65536 >>>>>> >>>>>> (this is interesting! 
Here we reach a sum of > 65536 the >>>>>> first >>>>>> time) >>>>>> >>>>>> [ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 1, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 49.440000] NEW SEGMENT sg = 000000007fa911e8!!! >>>>>> [ 49.440000] __blk_segment_map_sg: length = 16384, nbytes = >>>>>> 4096, sum = 20480 > 65536 >>>>>> [ 49.440000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = >>>>>> 0, >>>>>> BIOVEC_SEG_BOUNDARY = 1 >>>>>> [ 49.440000] *** FIXIT *** HELGE: nsegs > rq >>>>>> ->nr_phys_segments >>>>>> = 3 > 2 >>>>>> [ 49.440000] scsi_init_sgtable: count = 3, nents = 2 >>>>>> [ 50.116000] ------------[ cut here ]------------ >>>>>> [ 50.172000] WARNING: at /build/linux-4.4/linux >>>>>> -4.4.2/drivers/scsi/scsi_lib.c:1104 >>>>>> >>>>>> (this is usually a BUG(). I changed it to WARN() in the hope >>>>>> it >>>>>> would work anyway. It didn't.) >>>>>> >>>>>> [ 50.260000] Modules linked in: sd_mod sr_mod cdrom >>>>>> ata_generic >>>>>> ohci_pci ehci_pci ohci_hcd ehci_hcd pata_ns87415 sym53c8xx >>>>>> libata >>>>>> scsi_transport_spi scsi_mod usbcorep >>>>>> [ 50.456000] CPU: 0 PID: 70 Comm: systemd-udevd Not tainted >>>>>> 4.4.0-1-parisc64-smp #5 Debian 4.4.2-2 >>>>>> [ 50.564000] task: 000000007f948b28 ti: 000000007fa90000 >>>>>> task.ti: 000000007fa90000 >>>>>> [ 50.652000] >>>>>> [ 50.672000] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI >>>>>> [ 50.728000] PSW: 00001000000001001111100100001110 Not >>>>>> tainted >>>>>> [ 50.796000] r00-03 000000ff0804f90e 00000000409ea2e0 >>>>>> 00000000003e2ee0 000000007fa91140 >>>>>> [ 50.892000] r04-07 00000000003cd000 000000007f914300 >>>>>> 000000007f914b10 0000000000000003 >>>>>> [ 50.988000] r08-11 0000000000000000 000000007f918000 >>>>>> 0000000040bdd6b0 00000000003cd800 >>>>>> [ 51.084000] r12-15 0000000000000000 000000007fa90778 >>>>>> 00000000003cd000 000000007f918000 >>>>>> [ 51.180000] r16-19 0000000000001300 0000000040bdd6b8 >>>>>> 0000000040bdd6bc 0000000040ba2420 >>>>>> [ 51.276000] r20-23 0000000099116e92 
0000000000000000 >>>>>> 00000000000002a0 00000000000002ee >>>>>> [ 51.372000] r24-27 0000000000000000 000000000800000e >>>>>> 0000000040b60750 00000000409b3ae0 >>>>>> [ 51.468000] r28-31 0000000000000002 000000007fa914f0 >>>>>> 000000007fa911e0 0000000040ba2408 >>>>>> [ 51.564000] sr00-03 0000000000015000 0000000000000000 >>>>>> 0000000000000000 0000000000015000 >>>>>> [ 51.660000] sr04-07 0000000000000000 0000000000000000 >>>>>> 0000000000000000 0000000000000000 >>>>>> [ 51.756000] >>>>>> [ 51.772000] IASQ: 0000000000000000 0000000000000000 IAOQ: >>>>>> 00000000003e2f24 00000000003e2f28 >>>>>> [ 51.872000] IIR: 03ffe01f ISR: 0000000010340000 IOR: >>>>>> 000000fea4691528 >>>>>> [ 51.956000] CPU: 0 CR30: 000000007fa90000 CR31: >>>>>> 00000000ffff7dff >>>>>> [ 52.040000] ORIG_R28: 0000000040b60718 >>>>>> [ 52.084000] IAOQ[0]: scsi_init_sgtable+0xfc/0x1b8 >>>>>> [scsi_mod] >>>>>> [ 52.152000] IAOQ[1]: scsi_init_sgtable+0x100/0x1b8 >>>>>> [scsi_mod] >>>>>> [ 52.224000] RP(r2): scsi_init_sgtable+0xb8/0x1b8 >>>>>> [scsi_mod] >>>>>> [ 52.292000] Backtrace: >>>>>> [ 52.320000] [<00000000003e304c>] scsi_init_io+0x6c/0x258 >>>>>> [scsi_mod] >>>>>> [ 52.396000] [<000000000087d078>] >>>>>> sd_init_command+0x70/0xec8 >>>>>> [sd_mod] >>>>>> >>>>>> In general I think the bug is somehow in blk-merge.c. >>>>>> But I'm not an expert in that code. >>>>> >>>>> The warning was added in this patch sequence: >>>>> https://lkml.org/lkml/2015/11/23/996 >>>>> >>>>> Possibly, but above seems to indicate that it could be driver >>>>> issue >>>>> as well. 
>>>> >>>> >>>> I believe this bug was introduced by the following merge: >>>> >>>> commit 1081230b748de8f03f37f80c53dfa89feda9b8de >>>> Merge: df91039 2ca495a >>>> Author: Linus Torvalds <torvalds@linux-foundation.org> >>>> Date: Wed Sep 2 13:10:25 2015 -0700 >>>> >>>> Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block >>>> >>>> Pull core block updates from Jens Axboe: >>>> "This first core part of the block IO changes contains: >>>> >>>> - Cleanup of the bio IO error signaling from Christoph. We >>>> used to >>>> rely on the uptodate bit and passing around of an error, >>>> now >>>> we >>>> store the error in the bio itself. >>>> >>>> - Improvement of the above from myself, by shrinking the >>>> bio >>>> size >>>> down again to fit in two cachelines on x86-64. >>>> >>>> - Revert of the max_hw_sectors cap removal from a revision >>>> again, >>>> from Jeff Moyer. This caused performance regressions in >>>> various >>>> tests. Reinstate the limit, bump it to a more reasonable >>>> size >>>> instead. >>>> >>>> - Make /sys/block/<dev>/queue/discard_max_bytes writeable, >>>> by >>>> me. >>>> Most devices have huge trim limits, which can cause nasty >>>> latencies >>>> when deleting files. Enable the admin to configure the >>>> size >>>> down. >>>> We will look into having a more sane default instead of >>>> UINT_MAX >>>> sectors. >>>> >>>> - Improvement of the SGP gaps logic from Keith Busch. >>>> >>>> - Enable the block core to handle arbitrarily sized bios, >>>> which >>>> enables a nice simplification of bio_add_page() (which is >>>> an >>>> IO hot >>>> path). From Kent. >>>> >>>> - Improvements to the partition io stats accounting, making >>>> it >>>> faster. From Ming Lei. >>>> >>>> - Also from Ming Lei, a basic fixup for overflow of the >>>> sysfs >>>> pending >>>> file in blk-mq, as well as a fix for a blk-mq timeout >>>> race >>>> condition. 
>>>> >>>> - Ming Lin has been carrying Kents above mentioned patches >>>> forward >>>> for a while, and testing them. Ming also did a few fixes >>>> around >>>> that. >>>> >>>> - Sasha Levin found and fixed a use-after-free problem >>>> introduced by >>>> the bio->bi_error changes from Christoph. >>>> >>>> - Small blk cgroup cleanup from Viresh Kumar" >>>> >>>> * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 >>>> commits) >>>> blk: Fix bio_io_vec index when checking bvec gaps >>>> block: Replace SG_GAPS with new queue limits mask >>>> block: bump BLK_DEF_MAX_SECTORS to 2560 >>>> Revert "block: remove artifical max_hw_sectors cap" >>>> blk-mq: fix race between timeout and freeing request >>>> blk-mq: fix buffer overflow when reading sysfs file of >>>> 'pending' >>>> Documentation: update notes in biovecs about arbitrarily >>>> sized >>>> bios >>>> block: remove bio_get_nr_vecs() >>>> fs: use helper bio_add_page() instead of open coding on >>>> bi_io_vec >>>> block: kill merge_bvec_fn() completely >>>> md/raid5: get rid of bio_fits_rdev() >>>> md/raid5: split bio for chunk_aligned_read >>>> block: remove split code in >>>> blkdev_issue_{discard,write_same} >>>> btrfs: remove bio splitting and merge_bvec_fn() calls >>>> bcache: remove driver private bio splitting code >>>> block: simplify bio_add_page() >>>> block: make generic_make_request handle arbitrarily sized >>>> bios >>>> blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL) >>>> block: don't access bio->bi_error after bio_put() >>>> block: shrink struct bio down to 2 cache lines again >>>> ... >>> >>> If you can bisect it down to the exact commit, I might be able to >>> work >>> out what's the problem. Otherwise, even in an all modular config, >>> I >>> can't reproduce this on 4.5-rc4, so it may be fixed upstream (just >>> not >>> backported). 
>>
>>
>> Okay, the bug was introduced by the following change:
>>
>> commit 54efd50bfd873e2dbf784e0b21a8027ba4299a3e
>> Author: Kent Overstreet <kent.overstreet@gmail.com>
>
> If you've verified that reverting this alone gets you a bootable
> kernel, it's time to report it to the appropriate lists, which would be
> linux-block and linux-scsi.

I verified that commit 54efd50bfd873e2dbf784e0b21a8027ba4299a3e in linux-block fails to
boot and commit 41609892701e26724b8617201f43254cadf2e7ae (blk-cgroup: Drop unlikely
before IS_ERR(_OR_NULL)) does boot successfully. Commit 41609892701e26724b8617201f43254cadf2e7ae
is previous commit in tree.

I don't believe that the change can be reverted from Linus' tree as this commit allowed other
stuff to be removed (see second paragraph of commit description).

Dave
--
John David Anglin dave.anglin@bell.net

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: SCSI bug
2016-02-21 18:43 ` John David Anglin
@ 2016-02-21 19:07 ` James Bottomley
2016-02-21 19:36 ` Helge Deller
0 siblings, 1 reply; 25+ messages in thread
From: James Bottomley @ 2016-02-21 19:07 UTC (permalink / raw)
To: John David Anglin; +Cc: Helge Deller, linux-parisc List

On Sun, 2016-02-21 at 13:43 -0500, John David Anglin wrote:
> I verified that commit 54efd50bfd873e2dbf784e0b21a8027ba4299a3e in linux-block fails to
> boot and commit 41609892701e26724b8617201f43254cadf2e7ae (blk-cgroup: Drop unlikely
> before IS_ERR(_OR_NULL)) does boot successfully. Commit 41609892701e26724b8617201f43254cadf2e7ae
> is previous commit in tree.
>
> I don't believe that the change can be reverted from Linus' tree as this commit allowed other
> stuff to be removed (see second paragraph of commit description).

OK, can you just verify you can boot 4.5-rc5 without the sata_sil24
driver?

My theory, based on what Helge produced is that this commit is building
a large transfer list > 65535 and then splitting it wrongly. I think
the problem is that it's not respecting the DMA boundary, so Helge sees
a transfer size of 69632 which I think slops over on both sides,
requiring 3 segments to describe. However, the merge code thinks we
only need two (because the length is < 2*65536). The reason we only
see this with ATA drivers is because virtually no SCSI drivers set the
DMA boundary. Most ATA drivers require a dma boundary of 65535.

James

^ permalink raw reply [flat|nested] 25+ messages in thread
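To make the arithmetic in James's theory concrete, here is a minimal user-space sketch. It is not kernel code: count_segments() and estimate_by_length() are hypothetical helpers, and the start address in the usage note is invented so that the region straddles two 64 KiB boundaries; it is not taken from the logs. The sketch assumes a segment size limit of 65536 bytes and a boundary mask of 0xFFFF (the 65535 DMA boundary mentioned above).

```c
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: count the segments needed to
 * describe one physically contiguous buffer when no segment may exceed
 * max_seg bytes or cross a (boundary_mask + 1)-aligned address.
 */
static unsigned int count_segments(uint64_t start, uint64_t len,
                                   uint64_t max_seg, uint64_t boundary_mask)
{
    unsigned int nsegs = 0;

    while (len > 0) {
        /* Bytes left before the next boundary-aligned address. */
        uint64_t to_boundary = (boundary_mask + 1) - (start & boundary_mask);
        uint64_t chunk = len;

        if (chunk > max_seg)
            chunk = max_seg;
        if (chunk > to_boundary)
            chunk = to_boundary;

        start += chunk;
        len -= chunk;
        nsegs++;
    }
    return nsegs;
}

/*
 * Length-only estimate: the "length is < 2*65536, so two segments"
 * reasoning, which ignores where the buffer sits relative to boundaries.
 */
static unsigned int estimate_by_length(uint64_t len, uint64_t max_seg)
{
    return (unsigned int)((len + max_seg - 1) / max_seg);
}
```

Under these assumptions, count_segments(0x1F800, 69632, 65536, 0xFFFF) comes out as 3 (2048 + 65536 + 2048 bytes), while estimate_by_length(69632, 65536) is 2. That is the kind of disagreement the theory describes; whether the real requests are laid out this way is exactly what the thread is still trying to establish.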
* Re: SCSI bug
2016-02-21 19:07 ` James Bottomley
@ 2016-02-21 19:36 ` Helge Deller
2016-02-21 20:28 ` James Bottomley
2016-02-21 20:42 ` John David Anglin
0 siblings, 2 replies; 25+ messages in thread
From: Helge Deller @ 2016-02-21 19:36 UTC (permalink / raw)
To: James Bottomley, John David Anglin; +Cc: linux-parisc List

On 21.02.2016 20:07, James Bottomley wrote:
> On Sun, 2016-02-21 at 13:43 -0500, John David Anglin wrote:
>> I verified that commit 54efd50bfd873e2dbf784e0b21a8027ba4299a3e in linux-block fails to
>> boot and commit 41609892701e26724b8617201f43254cadf2e7ae (blk-cgroup: Drop unlikely
>> before IS_ERR(_OR_NULL)) does boot successfully. Commit 41609892701e26724b8617201f43254cadf2e7ae
>> is previous commit in tree.
>>
>> I don't believe that the change can be reverted from Linus' tree as this commit allowed other
>> stuff to be removed (see second paragraph of commit description).
>
> OK, can you just verify you can boot 4.5-rc5 without the sata_sil24
> driver?

I tried it on my c3000, debian kernel 4.4.2, in this case without the pata_ns87415.
I used "modprobe.blacklist=libata,pata_ns87415,ata_generic" as additional
kernel command line option.

[ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!!
[ 45.980000] __blk_segment_map_sg: length = 12288, nbytes = 4096, sum = 16384 > 65536
[ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!!
[ 45.980000] __blk_segment_map_sg: length = 65536, nbytes = 4096, sum = 69632 > 65536

^^ here.

[ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 1, BIOVEC_SEG_BOUNDARY = 1
[ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!!
[ 45.980000] __blk_segment_map_sg: length = 12288, nbytes = 4096, sum = 16384 > 65536
[ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!!
[ 45.980000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!!
[ 45.980000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!!
[ 45.980000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!!
[ 45.980000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!!
[ 45.980000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!!
[ 45.980000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!!
[ 45.980000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536
[ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1
[ 45.980000] *** FIXIT *** HELGE: nsegs > rq->nr_phys_segments = 11 > 10
[ 45.980000] scsi_init_sgtable: count = 11, nents = 10

^^ here

[ 45.980000] timer_interrupt(CPU 0): delayed! cycles 73C0F821 rem 1E1DDF next/now 1D78B3071C/1D7894E93D
[ 48.684000] ------------[ cut here ]------------
[ 48.740000] WARNING: at /build/linux-4.4/linux-4.4.2/drivers/scsi/scsi_lib.c:1104
[ 48.828000] Modules linked in: sd_mod ohci_pci ehci_pci ohci_hcd ehci_hcd sym53c8xx scsi_transport_spi usbcore scsi_mod usb_common tulip
[ 48.976000] CPU: 0 PID: 66 Comm: systemd-udevd Not tainted 4.4.0-1-parisc64-smp #5 Debian 4.4.2-2
[ 49.084000] task: 00000000bbe97538 ti: 00000000bbe98000 task.ti: 00000000bbe98000

> My theory, based on what Helge produced is that this commit is building
> a large transfer list > 65535 and then splitting it wrongly.

Yes, this theory sounds right.

> I think
> the problem is that it's not respecting the DMA boundary, so Helge sees
> a transfer size of 69632 which I think slops over on both sides,
> requiring 3 segments to describe. However, the merge code thinks we
> only need two (because the length is < 2*65536). The reason we only
> see this with ATA drivers is because virtually no SCSI drivers set the
> DMA boundary. Most ATA drivers require a dma boundary of 65535.

In my test above there is no ATA driver included, only SCSI discs.

Helge

^ permalink raw reply [flat|nested] 25+ messages in thread
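The merge decision behind the "sum = 69632 > 65536" line can be modelled in user space. The following is a rough sketch only: struct vec and map_segments() are hypothetical names, and the physical addresses in the usage note are invented. It mirrors the three tests visible in the instrumented __blk_segment_map_sg() quoted earlier in the thread (size limit, physical contiguity, segment boundary), but it is a simplification: it ignores the cluster flag and checks the boundary against the start of the current segment rather than the previous bio_vec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a bio_vec: physical address plus length. */
struct vec {
    uint64_t phys;
    uint32_t len;   /* assumed to be non-zero */
};

/*
 * Count the scatterlist segments a run of vecs would need. A vec is
 * merged into the current segment only if all three checks pass;
 * otherwise it starts a new segment.
 */
static unsigned int map_segments(const struct vec *v, unsigned int n,
                                 uint32_t max_seg, uint64_t boundary_mask)
{
    unsigned int nsegs = 0;
    uint64_t seg_start = 0;
    uint64_t seg_len = 0;

    for (unsigned int i = 0; i < n; i++) {
        bool merge = false;

        if (nsegs > 0) {
            uint64_t end = v[i].phys + v[i].len;
            /* Size limit: the grown segment must not exceed max_seg. */
            bool fits = seg_len + v[i].len <= max_seg;
            /* Rough analogue of BIOVEC_PHYS_MERGEABLE. */
            bool contiguous = seg_start + seg_len == v[i].phys;
            /* Rough analogue of BIOVEC_SEG_BOUNDARY. */
            bool same_window = (seg_start | boundary_mask) ==
                               ((end - 1) | boundary_mask);

            merge = fits && contiguous && same_window;
        }

        if (merge) {
            seg_len += v[i].len;
        } else {
            seg_start = v[i].phys;
            seg_len = v[i].len;
            nsegs++;
        }
    }
    return nsegs;
}
```

Feeding it 17 physically contiguous 4096-byte vecs starting at a 64 KiB-aligned address, with max_seg = 65536 and a boundary mask of 0xFFFFFFFF, yields 2 segments: the first 16 vecs fill one segment to exactly 65536 bytes and the 17th is refused by the size check alone, even though it is contiguous. That matches the shape of the logged case where BIOVEC_PHYS_MERGEABLE = 1 and BIOVEC_SEG_BOUNDARY = 1 but a new segment is still started.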
* Re: SCSI bug 2016-02-21 19:36 ` Helge Deller @ 2016-02-21 20:28 ` James Bottomley 2016-02-21 21:09 ` John David Anglin 2016-02-21 20:42 ` John David Anglin 1 sibling, 1 reply; 25+ messages in thread From: James Bottomley @ 2016-02-21 20:28 UTC (permalink / raw) To: Helge Deller, John David Anglin; +Cc: linux-parisc List On Sun, 2016-02-21 at 20:36 +0100, Helge Deller wrote: > On 21.02.2016 20:07, James Bottomley wrote: > > On Sun, 2016-02-21 at 13:43 -0500, John David Anglin wrote: > > > I verified that commit 54efd50bfd873e2dbf784e0b21a8027ba4299a3e > > > in > > > linux-block fails to > > > boot and commit 41609892701e26724b8617201f43254cadf2e7ae (blk > > > -cgroup: > > > Drop unlikely > > > before IS_ERR(_OR_NULL)) does boot successfully. Commit > > > 41609892701e26724b8617201f43254cadf2e7ae > > > is previous commit in tree. > > > > > > I don't believe that the change can be reverted from Linus' tree > > > as > > > this commit allowed other > > > stuff to be removed (see second paragraph of commit description). > > > > OK, can you just verify you can boot 4.5-rc5 without the sata_sil24 > > driver? > > I tried it on my c3000, debian kernel 4.4.2, in this case without the > pata_ns87415. > I used "modprobe.blacklist=libata,pata_ns87415,ata_generic" as > additional > kernel command line option. > > [ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!! > [ 45.980000] __blk_segment_map_sg: length = 12288, nbytes = 4096, > sum = 16384 > 65536 > [ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > BIOVEC_SEG_BOUNDARY = 1 > [ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!! > [ 45.980000] __blk_segment_map_sg: length = 65536, nbytes = 4096, > sum = 69632 > 65536 > > ^^ here. > > [ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 1, > BIOVEC_SEG_BOUNDARY = 1 > [ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!! 
> [ 45.980000] __blk_segment_map_sg: length = 12288, nbytes = 4096, > sum = 16384 > 65536 > [ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > BIOVEC_SEG_BOUNDARY = 1 > [ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!! > [ 45.980000] __blk_segment_map_sg: length = 4096, nbytes = 4096, > sum = 8192 > 65536 > [ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > BIOVEC_SEG_BOUNDARY = 1 > [ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!! > [ 45.980000] __blk_segment_map_sg: length = 4096, nbytes = 4096, > sum = 8192 > 65536 > [ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > BIOVEC_SEG_BOUNDARY = 1 > [ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!! > [ 45.980000] __blk_segment_map_sg: length = 4096, nbytes = 4096, > sum = 8192 > 65536 > [ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > BIOVEC_SEG_BOUNDARY = 1 > [ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!! > [ 45.980000] __blk_segment_map_sg: length = 4096, nbytes = 4096, > sum = 8192 > 65536 > [ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > BIOVEC_SEG_BOUNDARY = 1 > [ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!! > [ 45.980000] __blk_segment_map_sg: length = 4096, nbytes = 4096, > sum = 8192 > 65536 > [ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > BIOVEC_SEG_BOUNDARY = 1 > [ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!! > [ 45.980000] __blk_segment_map_sg: length = 4096, nbytes = 4096, > sum = 8192 > 65536 > [ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > BIOVEC_SEG_BOUNDARY = 1 > [ 45.980000] NEW SEGMENT sg = 00000000bbe991e8!!! > [ 45.980000] __blk_segment_map_sg: length = 4096, nbytes = 4096, > sum = 8192 > 65536 > [ 45.980000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, > BIOVEC_SEG_BOUNDARY = 1 > [ 45.980000] *** FIXIT *** HELGE: nsegs > rq->nr_phys_segments = 11 > > 10 > [ 45.980000] scsi_init_sgtable: count = 11, nents = 10 > > ^^ here > > [ 45.980000] timer_interrupt(CPU 0): delayed! 
cycles 73C0F821 rem > 1E1DDF next/now 1D78B3071C/1D7894E93D > [ 48.684000] ------------[ cut here ]------------ > [ 48.740000] WARNING: at /build/linux-4.4/linux > -4.4.2/drivers/scsi/scsi_lib.c:1104 > [ 48.828000] Modules linked in: sd_mod ohci_pci ehci_pci ohci_hcd > ehci_hcd sym53c8xx scsi_transport_spi usbcore scsi_mod usb_common > tulip > [ 48.976000] CPU: 0 PID: 66 Comm: systemd-udevd Not tainted 4.4.0-1 > -parisc64-smp #5 Debian 4.4.2-2 > [ 49.084000] task: 00000000bbe97538 ti: 00000000bbe98000 task.ti: > 00000000bbe98000 > > > > My theory, based on what Helge produced is that this commit is > > building > > a large transfer list > 65535 and then splitting it wrongly. > > Yes, this theory sounds right. > > > I think > > the problem is that it's not respecting the DMA boundary, so Helge > > sees > > a transfer size of 69632 which I think slops over on both sides, > > requiring 3 segments to describe. However, the merge code thinks > > we > > only need two (because the length is < 2*65536). The reason we > > only > > see this with ATA drivers is because virtually no SCSI drivers set > > the > > DMA boundary. Most ATA drivers require a dma boundary of 65535. > > In my test above there is no ATA driver included, only SCSI discs. Heh, well, it's a combination of problems. Apparently in parisc we don't set the max segment size, so we inherit 64k even in SCSI drivers. That said, I still can't reproduce this, so you're going to have to help me find it. Current theory is ll_merge_request_fn() it looks like there's scope for miscalculation in there. Can you instrument this line /* Merge is OK... */ req->nr_phys_segments = total_phys_segments; To add just before the return if (req->nr_phys_segments != __blk_recalc_rq_segments(rq->q, rq->bio, false) printk("MISMATCH IN MERGE: got %d, should get %d\n", req->nr_phys_segments, __blk_recalc_rq_segments(rq->q, rq->bio, false)); Thanks, James ^ permalink raw reply [flat|nested] 25+ messages in thread
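For reference, the segment arithmetic behind the theory quoted above can be sketched in plain userspace C. This is only an illustration under stated assumptions: count_segments() and naive_segments() are hypothetical helpers, not kernel functions, and the start address used in the note below is made up so that a 69632-byte transfer straddles two 64k boundaries. It does not reproduce the logic in blk-merge.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): count how many segments a
 * physically contiguous range [addr, addr + len) needs when no segment may
 * cross a DMA boundary (boundary_mask, e.g. 0xffff) or exceed max_seg bytes.
 */
static unsigned int count_segments(uint64_t addr, uint64_t len,
				   uint64_t boundary_mask, uint64_t max_seg)
{
	unsigned int nsegs = 0;

	while (len > 0) {
		/* Bytes left before the next boundary would be crossed. */
		uint64_t to_boundary = boundary_mask + 1 - (addr & boundary_mask);
		uint64_t chunk = len;

		if (chunk > to_boundary)
			chunk = to_boundary;
		if (chunk > max_seg)
			chunk = max_seg;

		addr += chunk;
		len -= chunk;
		nsegs++;
	}
	return nsegs;
}

/*
 * Hypothetical "length only" estimate: ignores where the buffer starts,
 * which is the kind of shortcut the quoted theory suspects.
 */
static unsigned int naive_segments(uint64_t len, uint64_t max_seg)
{
	return (unsigned int)((len + max_seg - 1) / max_seg);
}
```

With a boundary mask of 0xffff and a 65536-byte maximum segment, count_segments(0x1f800, 69632, 0xffff, 65536) walks 2048 + 65536 + 2048 bytes and gives 3, while naive_segments(69632, 65536) gives 2. The same transfer starting exactly on a boundary (0x20000) needs only 2, so the mismatch depends on alignment.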
* Re: SCSI bug
2016-02-21 20:28 ` James Bottomley
@ 2016-02-21 21:09 ` John David Anglin
2016-02-21 21:17 ` Helge Deller
0 siblings, 1 reply; 25+ messages in thread
From: John David Anglin @ 2016-02-21 21:09 UTC (permalink / raw)
To: James Bottomley; +Cc: Helge Deller, linux-parisc List
On 2016-02-21, at 3:28 PM, James Bottomley wrote:
> That said, I still can't reproduce this, so you're going to have to
> help me find it. Current theory is ll_merge_request_fn() it looks like
> there's scope for miscalculation in there. Can you instrument this
> line
>
> 	/* Merge is OK... */
> 	req->nr_phys_segments = total_phys_segments;
>
> To add just before the return
>
> if (req->nr_phys_segments != __blk_recalc_rq_segments(rq->q, rq->bio, false)
> 	printk("MISMATCH IN MERGE: got %d, should get %d\n",
> 		req->nr_phys_segments,
> 		__blk_recalc_rq_segments(rq->q, rq->bio, false));

This didn't trigger. There were some typos:

diff --git a/block/blk-merge.c b/block/blk-merge.c
index d9c3a75..e8969ef 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -545,6 +545,12 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
 
 	/* Merge is OK... */
 	req->nr_phys_segments = total_phys_segments;
+
+if (req->nr_phys_segments != __blk_recalc_rq_segments(req->q, req->bio, false))
+	printk("MISMATCH IN MERGE: got %d, should get %d\n",
+		req->nr_phys_segments,
+		__blk_recalc_rq_segments(req->q, req->bio, false));
+
 	return 1;
 }

Any more ideas?

Dave
--
John David Anglin dave.anglin@bell.net
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: SCSI bug 2016-02-21 21:09 ` John David Anglin @ 2016-02-21 21:17 ` Helge Deller 2016-02-21 21:49 ` James Bottomley 2016-02-22 0:53 ` John David Anglin 0 siblings, 2 replies; 25+ messages in thread From: Helge Deller @ 2016-02-21 21:17 UTC (permalink / raw) To: John David Anglin, James Bottomley; +Cc: linux-parisc List On 21.02.2016 22:09, John David Anglin wrote: > On 2016-02-21, at 3:28 PM, James Bottomley wrote: > >> That said, I still can't reproduce this, so you're going to have to >> help me find it. Current theory is ll_merge_request_fn() it looks like >> there's scope for miscalculation in there. Can you instrument this >> line >> >> /* Merge is OK... */ >> req->nr_phys_segments = total_phys_segments; >> >> To add just before the return >> >> if (req->nr_phys_segments != __blk_recalc_rq_segments(rq->q, rq->bio, false) >> printk("MISMATCH IN MERGE: got %d, should get %d\n", >> req->nr_phys_segments, >> __blk_recalc_rq_segments(rq->q, rq->bio, false)); > > This didn't trigger. There were some typos: It didn't trigger for me either. > diff --git a/block/blk-merge.c b/block/blk-merge.c > index d9c3a75..e8969ef 100644 > --- a/block/blk-merge.c > +++ b/block/blk-merge.c > @@ -545,6 +545,12 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req, > > /* Merge is OK... */ > req->nr_phys_segments = total_phys_segments; > + > +if (req->nr_phys_segments != __blk_recalc_rq_segments(req->q, req->bio, false)) > + printk("MISMATCH IN MERGE: got %d, should get %d\n", > + req->nr_phys_segments, > + __blk_recalc_rq_segments(req->q, req->bio, false)); > + > return 1; > } Interestingly it now triggered somewhere else... I enabled CONFIG_DEBUG_SG, which I had enabled the last few times as well, but it now happened for the first time: [ 49.848000] scsi_init_sgtable: nr_phys_segments = 4 [ 49.848000] NEW SEGMENT sg = 00000000bbdd5008!!! 
[ 49.848000] __blk_segment_map_sg: length = 65536, nbytes = 4096, sum = 69632 > 65536 [ 49.848000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 1, BIOVEC_SEG_BOUNDARY = 1 [ 49.848000] NEW SEGMENT sg = 00000000bbdd5008!!! [ 49.848000] __blk_segment_map_sg: length = 8192, nbytes = 4096, sum = 12288 > 65536 [ 49.848000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 [ 49.848000] NEW SEGMENT sg = 00000000bbdd5008!!! [ 49.848000] __blk_segment_map_sg: length = 20480, nbytes = 4096, sum = 24576 > 65536 [ 49.848000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 [ 49.848000] NEW SEGMENT sg = 00000000bbdd5008!!! [ 49.848000] __blk_segment_map_sg: length = 4096, nbytes = 4096, sum = 8192 > 65536 [ 49.848000] __blk_segment_map_sg: BIOVEC_PHYS_MERGEABLE = 0, BIOVEC_SEG_BOUNDARY = 1 [ 49.848000] timer_interrupt(CPU 0): delayed! cycles 2D6DC985 rem 2B2FBB next/now 19218A269E/19215EF6E3 [ 50.980000] ------------[ cut here ]------------ [ 51.036000] kernel BUG at /build/linux-4.4/linux-4.4.2/include/linux/scatterlist.h:92! 
[ 51.128000] CPU: 0 PID: 62 Comm: systemd-udevd Not tainted 4.4.0-1-parisc64-smp #6 Debian 4.4.2-2 [ 51.236000] task: 00000000bbd49508 ti: 00000000bbdd4000 task.ti: 00000000bbdd4000 [ 51.324000] [ 51.344000] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI [ 51.400000] PSW: 00001000000001001111110100001110 Not tainted [ 51.468000] r00-03 000000ff0804fd0e 00000000bbdd5000 00000000404f23f8 00000000bbdd5000 [ 51.564000] r04-07 00000000409b3b10 0000000000006000 0000000000001000 0000000000000180 [ 51.660000] r08-11 0000000000000000 00000000bfdab0f0 0000000000000018 0000000000000000 [ 51.756000] r12-15 0000000000000004 00000000bbe22e00 000000007faf3188 0000000000000000 [ 51.852000] r16-19 0000000042004140 0000000000001000 0000000000000001 0000000087654321 [ 51.948000] r20-23 000000020000c18c ffffffff87654000 00000000000002a9 00000000000002ee [ 52.044000] r24-27 0000000000000000 ffffffff87654000 00000000bbe22e80 00000000409b3b10 [ 52.140000] r28-31 00000000bbe22e80 00000000bbdd5140 00000000bbdd5170 0000c18c0000c18c [ 52.236000] sr00-03 0000000000013800 0000000000000000 0000000000000000 0000000000011800 [ 52.332000] sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 52.428000] [ 52.448000] IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000404f2664 00000000404f2668 [ 52.548000] IIR: 03ffe01f ISR: 0000000000000000 IOR: 000000000000005c [ 52.628000] CPU: 0 CR30: 00000000bbdd4000 CR31: 00000000d20345e0 [ 52.712000] ORIG_R28: 0000000040b60718 [ 52.756000] IAOQ[0]: blk_rq_map_sg+0x6bc/0x7d0 [ 52.812000] IAOQ[1]: blk_rq_map_sg+0x6c0/0x7d0 [ 52.864000] RP(r2): blk_rq_map_sg+0x450/0x7d0 [ 52.920000] Backtrace: [ 52.948000] [<00000000003f5ebc>] scsi_init_sgtable+0x94/0x1b0 [scsi_mod] [ 53.028000] [<00000000003f6044>] scsi_init_io+0x6c/0x258 [scsi_mod] [ 53.104000] [<000000000080e078>] sd_init_command+0x70/0xec8 [sd_mod] [ 53.180000] scsi_init_sgtable: nr_phys_segments = 1 [ 53.180000] scsi_init_sgtable: count = 1, nents = 1 [ 53.300000] [ 53.316000] CPU: 0 PID: 62 
Comm: systemd-udevd Not tainted 4.4.0-1-parisc64-smp #6 Debian 4.4.2-2 [ 53.424000] Backtrace: [ 53.452000] [<0000000040215bd8>] show_stack+0x20/0x38 [ 53.512000] [<0000000040520a24>] dump_stack+0x9c/0x110 [ 53.576000] [<0000000040215dac>] die_if_kernel+0x19c/0x2e0 [ 53.640000] [<0000000040216c88>] handle_interruption+0x9a8/0x9d0 [ 53.716000] [ 53.732000] ---[ end trace 596bfe57ff9ccda5 ]--- [ 53.788000] scsi_init_sgtable: nr_phys_segments = 1 [ 53.788000] scsi_init_sgtable: count = 1, nents = 1 Helge ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: SCSI bug 2016-02-21 21:17 ` Helge Deller @ 2016-02-21 21:49 ` James Bottomley 2016-02-21 22:08 ` John David Anglin 2016-02-22 0:53 ` John David Anglin 1 sibling, 1 reply; 25+ messages in thread From: James Bottomley @ 2016-02-21 21:49 UTC (permalink / raw) To: Helge Deller, John David Anglin; +Cc: linux-parisc List On Sun, 2016-02-21 at 22:17 +0100, Helge Deller wrote: > On 21.02.2016 22:09, John David Anglin wrote: > > On 2016-02-21, at 3:28 PM, James Bottomley wrote: > > > > > That said, I still can't reproduce this, so you're going to have > > > to > > > help me find it. Current theory is ll_merge_request_fn() it > > > looks like > > > there's scope for miscalculation in there. Can you instrument > > > this > > > line > > > > > > /* Merge is OK... */ > > > req->nr_phys_segments = total_phys_segments; > > > > > > To add just before the return > > > > > > if (req->nr_phys_segments != __blk_recalc_rq_segments(rq->q, rq > > > ->bio, false) > > > printk("MISMATCH IN MERGE: got %d, should get %d\n", > > > req->nr_phys_segments, > > > __blk_recalc_rq_segments(rq->q, rq->bio, false)); > > > > This didn't trigger. There were some typos: > > It didn't trigger for me either. OK, can you instrument the same thing in ll_new_hw_segment() after req->nr_phys_segments += nr_phys_segs; > > diff --git a/block/blk-merge.c b/block/blk-merge.c > > index d9c3a75..e8969ef 100644 > > --- a/block/blk-merge.c > > +++ b/block/blk-merge.c > > @@ -545,6 +545,12 @@ static int ll_merge_requests_fn(struct > > request_queue *q, struct request *req, > > > > /* Merge is OK... */ > > req->nr_phys_segments = total_phys_segments; > > + > > +if (req->nr_phys_segments != __blk_recalc_rq_segments(req->q, req > > ->bio, false)) > > + printk("MISMATCH IN MERGE: got %d, should get %d\n", > > + req->nr_phys_segments, > > + __blk_recalc_rq_segments(req->q, req->bio, false)); > > + > > return 1; > > } > > Interestingly it now triggered somewhere else... 
> I enabled CONFIG_DEBUG_SG, which I had enabled the last few times as
> well, but it now happened
> for the first time:

No, that just means the sg table you initialised was too short: the
last element didn't get a sg_magic set; it's effectively the same
error, just showing differently.

James
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: SCSI bug
2016-02-21 21:49 ` James Bottomley
@ 2016-02-21 22:08 ` John David Anglin
0 siblings, 0 replies; 25+ messages in thread
From: John David Anglin @ 2016-02-21 22:08 UTC (permalink / raw)
To: James Bottomley; +Cc: Helge Deller, linux-parisc List
On 2016-02-21, at 4:49 PM, James Bottomley wrote:
> OK, can you instrument the same thing in ll_new_hw_segment() after
>
> req->nr_phys_segments += nr_phys_segs;

This didn't trigger either.

In this boot, we had a bad address:

Bad Address (null pointer deref?): Code=15 regs=000000007c5c5260 (Addr=0000010a)

here:

IAOQ[0]: blk_rq_map_sg+0x364/0x670
IAOQ[1]: blk_rq_map_sg+0x368/0x670
RP(r2): blk_rq_map_sg+0x358/0x670

Dave
--
John David Anglin dave.anglin@bell.net
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: SCSI bug
2016-02-21 21:17 ` Helge Deller
2016-02-21 21:49 ` James Bottomley
@ 2016-02-22 0:53 ` John David Anglin
2016-02-22 3:24 ` John David Anglin
1 sibling, 1 reply; 25+ messages in thread
From: John David Anglin @ 2016-02-22 0:53 UTC (permalink / raw)
To: Helge Deller; +Cc: James Bottomley, linux-parisc List
[-- Attachment #1: Type: text/plain, Size: 5836 bytes --]
On 2016-02-21, at 4:17 PM, Helge Deller wrote:
> I enabled CONFIG_DEBUG_SG, which I had enabled the last few times as well, but it now happened
> for the first time:

I also enabled CONFIG_DEBUG_SG. I added a bunch of printks and eventually narrowed the bug
occurrence to __blk_segment_map_sg. It seems like clearing the termination bit is somewhat
dangerous. I added the attached check. With it, I get the following on boot:

scsi target6:0:2: Beginning Domain Validation
sdb: sdb1 sdb2 sdb3 sdb4
scsi target6:0:2: Ending Domain Validation
scsi target6:0:2: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RTI WRFLOW PCOMP (6.2)
sd 6:0:2:0: [sdc] 143374738 512-byte logical blocks: (73.4 GB/68.3 GiB)
sd 6:0:0:0: [sdb] Attached SCSI disk
sd 6:0:2:0: [sdc] Write Protect is off
sd 6:0:2:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FA
mptbase: ioc1: Initiating bringup
sd 6:0:2:0: [sdc] Attached SCSI disk
ioc1: LSI53C1030 B2: Capabilities={Initiator,Target}
scsi host7: ioc1: LSI53C1030 B2, FwRev=01032341h, Ports=1, MaxQ=255, IRQ=68
__blk_segment_map_sg: clearing termination bit
------------[ cut here ]------------
kernel BUG at include/linux/scatterlist.h:92!
__blk_segment_map_sg: clearing termination bit
------------[ cut here ]------------
kernel BUG at include/linux/scatterlist.h:92!
CPU: 3 PID: 1026 Comm: systemd-udevd Not tainted 4.2.0-rc2+ #16
task: 000000007f77d548 ti: 000000007e4f0000 task.ti: 000000007e4f0000

     YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001000011001000001110 Not tainted
r00-03 000000ff0804320e 000000007e4f1130 00000000403623e8 000000007e4f1130
r04-07 000000004074f010 000000000001c000 0000000000001000 00000000000001a0
r08-11 000000007e7b3040 0000000000000000 0000000000000000 000000007e6990c0
r12-15 000000000000001a 0000000000000000 000000007c784790 0000000000000006
r16-19 0000000042d79220 0000000000001000 0000000000000001 0000000000000000
r20-23 000000000000002e 00000000705f7367 000000000000001f 0000000000000000
r24-27 ffffffff87654000 000000000800000e 000000007e6990c0 000000004074f010
r28-31 000000007e7b3a00 000000007e4f1230 000000007e4f1260 0000000087654321
sr00-03 0000000000016800 0000000000000000 0000000000000000 0000000000016800
sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000

IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040362608 000000004036260c
IIR: 03ffe01f ISR: 0000000010340000 IOR: 000000f93c4f1260
CPU: 3 CR30: 000000007e4f0000 CR31: ffffffffffffffff
ORIG_R28: 000000000000002e
IAOQ[0]: blk_rq_map_sg+0x580/0x6a0
IAOQ[1]: blk_rq_map_sg+0x584/0x6a0
RP(r2): blk_rq_map_sg+0x360/0x6a0
Backtrace:
[<000000004046f4b0>] scsi_init_sgtable+0x70/0xb8
[<000000004046f564>] scsi_init_io+0x6c/0x220
[<000000000811c5c0>] sd_setup_read_write_cmnd+0x58/0x968 [sd_mod]
[<000000000811cf14>] sd_init_command+0x44/0x130 [sd_mod]
[<000000004046f81c>] scsi_setup_cmnd+0x104/0x1b0
[<000000004046fab8>] scsi_prep_fn+0x100/0x1a0
[<000000004035b9b0>] blk_peek_request+0x1b8/0x298
[<0000000040471028>] scsi_request_fn+0xf8/0xa90
[<0000000040357244>] __blk_run_queue+0x4c/0x70
[<00000000403802c4>] cfq_insert_request+0x2dc/0x580
[<0000000040356404>] __elv_add_request+0x1b4/0x300

We have in blk-merge.c:

		else {
			/*
			 * If the driver previously mapped a shorter
			 * list, we could see a termination bit
			 * prematurely unless it fully inits the sg
			 * table on each mapping. We KNOW that there
			 * must be more entries here or the driver
			 * would be buggy, so force clear the
			 * termination bit to avoid doing a full
			 * sg_init_table() in drivers for each command.
			 */
			if (sg_is_last (*sg))
				printk ("__blk_segment_map_sg: clearing termination bit\n");
			sg_unmark_end(*sg);
			*sg = sg_next(*sg);
			BUG_ON (!*sg);
		}

The comment suggests there must be more entries...

Dave
--
John David Anglin dave.anglin@bell.net

[-- Attachment #2: blk-merge.c.d.txt --]
[-- Type: text/plain, Size: 1930 bytes --]

diff --git a/block/blk-merge.c b/block/blk-merge.c
index d9c3a75..f893ecf 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -327,8 +327,11 @@ new_segment:
 			 * termination bit to avoid doing a full
 			 * sg_init_table() in drivers for each command.
 			 */
+			if (sg_is_last (*sg))
+				printk ("__blk_segment_map_sg: clearing termination bit\n");
 			sg_unmark_end(*sg);
 			*sg = sg_next(*sg);
+			BUG_ON (!*sg);
 		}
 
 		sg_set_page(*sg, bvec->bv_page, nbytes, bvec->bv_offset);
@@ -392,6 +395,9 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
 	if (rq->bio)
 		nsegs = __blk_bios_map_sg(q, rq->bio, sglist, &sg);
 
+	if (!sg)
+		return nsegs;
+
 	if (unlikely(rq->cmd_flags & REQ_COPY_USER) &&
 	    (blk_rq_bytes(rq) & q->dma_pad_mask)) {
 		unsigned int pad_len =
@@ -415,8 +421,7 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
 		rq->extra_len += q->dma_drain_size;
 	}
 
-	if (sg)
-		sg_mark_end(sg);
+	sg_mark_end(sg);
 
 	return nsegs;
 }
@@ -439,6 +444,14 @@ static inline int ll_new_hw_segment(struct request_queue *q,
 	 * counters.
 	 */
 	req->nr_phys_segments += nr_phys_segs;
+
+#if 0
+if (req->nr_phys_segments != __blk_recalc_rq_segments(req->q, req->bio, false))
+	printk("ll_new_hw_segment: MISMATCH IN MERGE: got %d, should get %d\n",
+		req->nr_phys_segments,
+		__blk_recalc_rq_segments(req->q, req->bio, false));
+#endif
+
 	return 1;
 
 no_merge:
@@ -545,6 +558,14 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
 
 	/* Merge is OK... */
 	req->nr_phys_segments = total_phys_segments;
+
+#if 0
+if (req->nr_phys_segments != __blk_recalc_rq_segments(req->q, req->bio, false))
+	printk("MISMATCH IN MERGE: got %d, should get %d\n",
+		req->nr_phys_segments,
+		__blk_recalc_rq_segments(req->q, req->bio, false));
+#endif
+
 	return 1;
 }
^ permalink raw reply related [flat|nested] 25+ messages in thread
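The overrun described in this message can be modelled in a few lines of userspace C. This is a toy sketch, not kernel code: struct toy_sg and toy_map_sg() are hypothetical names, the struct is not the real scatterlist layout, and the bounds check shown here is an assumption about what a defensive mapper might do rather than what blk_rq_map_sg() does. The point is only that clearing a stale end marker is safe when the table really has another entry, and unsafe when the mapper needs more entries than were allocated (for instance count = 11 against nents = 10, as in Helge's log).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct scatterlist; not the kernel layout. */
struct toy_sg {
	unsigned int length;
	bool is_last;	/* models the termination bit */
};

/*
 * Fill `table` (capacity `nents`) with `nsegs` segment lengths.
 * Returns the number of entries used, or -1 if the mapping needs more
 * entries than the table was allocated with.
 */
static int toy_map_sg(struct toy_sg *table, size_t nents,
		      const unsigned int *seg_len, size_t nsegs)
{
	size_t i;

	if (nsegs == 0)
		return 0;
	if (nsegs > nents)
		return -1;	/* would walk past the end of the table */

	for (i = 0; i < nsegs; i++) {
		table[i].length = seg_len[i];
		table[i].is_last = false;	/* clear any stale end marker */
	}
	table[nsegs - 1].is_last = true;	/* mark the real end */
	return (int)nsegs;
}
```

In this model a request that maps to 11 segments against a 10-entry table is rejected up front instead of stepping past the terminator, which is the condition the added BUG_ON and the later WARN_ON are meant to expose.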
* Re: SCSI bug
2016-02-22 0:53 ` John David Anglin
@ 2016-02-22 3:24 ` John David Anglin
2016-02-23 3:04 ` John David Anglin
0 siblings, 1 reply; 25+ messages in thread
From: John David Anglin @ 2016-02-22 3:24 UTC (permalink / raw)
To: John David Anglin; +Cc: Helge Deller, James Bottomley, linux-parisc List
On 2016-02-21, at 7:53 PM, John David Anglin wrote:
> Backtrace:
> [<000000004046f4b0>] scsi_init_sgtable+0x70/0xb8
> [<000000004046f564>] scsi_init_io+0x6c/0x220
> [<000000000811c5c0>] sd_setup_read_write_cmnd+0x58/0x968 [sd_mod]
> [<000000000811cf14>] sd_init_command+0x44/0x130 [sd_mod]
> [<000000004046f81c>] scsi_setup_cmnd+0x104/0x1b0
> [<000000004046fab8>] scsi_prep_fn+0x100/0x1a0
> [<000000004035b9b0>] blk_peek_request+0x1b8/0x298
> [<0000000040471028>] scsi_request_fn+0xf8/0xa90
> [<0000000040357244>] __blk_run_queue+0x4c/0x70
> [<00000000403802c4>] cfq_insert_request+0x2dc/0x580
> [<0000000040356404>] __elv_add_request+0x1b4/0x300
>
> We have in blk-merge.c:
>
> 		else {
> 			/*
> 			 * If the driver previously mapped a shorter
> 			 * list, we could see a termination bit
> 			 * prematurely unless it fully inits the sg
> 			 * table on each mapping. We KNOW that there
> 			 * must be more entries here or the driver
> 			 * would be buggy, so force clear the
> 			 * termination bit to avoid doing a full
> 			 * sg_init_table() in drivers for each command.
> 			 */
> 			if (sg_is_last (*sg))
> 				printk ("__blk_segment_map_sg: clearing termination bit\n");
> 			sg_unmark_end(*sg);
> 			*sg = sg_next(*sg);
> 			BUG_ON (!*sg);
> 		}
>
> The comment suggests there must be more entries...

I'm thinking with the split the scsi driver needs to provide one or two extra entries in the sg list.

Dave
--
John David Anglin dave.anglin@bell.net
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: SCSI bug
2016-02-22 3:24 ` John David Anglin
@ 2016-02-23 3:04 ` John David Anglin
2016-02-23 18:06 ` Helge Deller
0 siblings, 1 reply; 25+ messages in thread
From: John David Anglin @ 2016-02-23 3:04 UTC (permalink / raw)
To: John David Anglin; +Cc: Helge Deller, James Bottomley, linux-parisc List
[-- Attachment #1: Type: text/plain, Size: 2836 bytes --]
On 2016-02-21, at 10:24 PM, John David Anglin wrote:
> On 2016-02-21, at 7:53 PM, John David Anglin wrote:
>
>> Backtrace:
>> [<000000004046f4b0>] scsi_init_sgtable+0x70/0xb8
>> [<000000004046f564>] scsi_init_io+0x6c/0x220
>> [<000000000811c5c0>] sd_setup_read_write_cmnd+0x58/0x968 [sd_mod]
>> [<000000000811cf14>] sd_init_command+0x44/0x130 [sd_mod]
>> [<000000004046f81c>] scsi_setup_cmnd+0x104/0x1b0
>> [<000000004046fab8>] scsi_prep_fn+0x100/0x1a0
>> [<000000004035b9b0>] blk_peek_request+0x1b8/0x298
>> [<0000000040471028>] scsi_request_fn+0xf8/0xa90
>> [<0000000040357244>] __blk_run_queue+0x4c/0x70
>> [<00000000403802c4>] cfq_insert_request+0x2dc/0x580
>> [<0000000040356404>] __elv_add_request+0x1b4/0x300
>>
>> We have in blk-merge.c:
>>
>> 		else {
>> 			/*
>> 			 * If the driver previously mapped a shorter
>> 			 * list, we could see a termination bit
>> 			 * prematurely unless it fully inits the sg
>> 			 * table on each mapping. We KNOW that there
>> 			 * must be more entries here or the driver
>> 			 * would be buggy, so force clear the
>> 			 * termination bit to avoid doing a full
>> 			 * sg_init_table() in drivers for each command.
>> 			 */
>> 			if (sg_is_last (*sg))
>> 				printk ("__blk_segment_map_sg: clearing termination bit\n");
>> 			sg_unmark_end(*sg);
>> 			*sg = sg_next(*sg);
>> 			BUG_ON (!*sg);
>> 		}
>>
>> The comment suggests there must be more entries...
>
> I'm thinking with the split the scsi driver needs to provide one or two extra entries in the sg list.

With the attached patch, I'm able to boot 4.2.0-rc2+ on linux-block at commit
54efd50bfd873e2dbf784e0b21a8027ba4299a3e.

I didn't try to optimize the number of extra entries but I know one is not enough.

I guess the puzzle is why the number of entries isn't calculated correctly in the first place.
Further, why does blk-merge believe that it's okay to go beyond the terminator? Clearly,
the magic number isn't always set, etc.

I added the WARN_ON so I'd know when we run off the end of the list.

Dave
--
John David Anglin dave.anglin@bell.net

[-- Attachment #2: scsi-nents.d.txt --]
[-- Type: text/plain, Size: 1395 bytes --]

diff --git a/block/blk-merge.c b/block/blk-merge.c
index d9c3a75..8e2566b 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -327,6 +327,7 @@ new_segment:
 			 * termination bit to avoid doing a full
 			 * sg_init_table() in drivers for each command.
 			 */
+			WARN_ON(sg_is_last (*sg));
 			sg_unmark_end(*sg);
 			*sg = sg_next(*sg);
 		}
@@ -392,6 +393,9 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
 	if (rq->bio)
 		nsegs = __blk_bios_map_sg(q, rq->bio, sglist, &sg);
 
+	if (!sg)
+		return nsegs;
+
 	if (unlikely(rq->cmd_flags & REQ_COPY_USER) &&
 	    (blk_rq_bytes(rq) & q->dma_pad_mask)) {
 		unsigned int pad_len =
@@ -415,8 +419,7 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
 		rq->extra_len += q->dma_drain_size;
 	}
 
-	if (sg)
-		sg_mark_end(sg);
+	sg_mark_end(sg);
 
 	return nsegs;
 }
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index b1a2631..b421f03 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -595,6 +595,11 @@ static int scsi_alloc_sgtable(struct scsi_data_buffer *sdb, int nents, bool mq)
 
 	BUG_ON(!nents);
 
+	/* Provide extra entries in case of split. */
+	nents += 8;
+	if (nents > SCSI_MAX_SG_SEGMENTS)
+		nents = SCSI_MAX_SG_SEGMENTS;
+
 	if (mq) {
 		if (nents <= SCSI_MAX_SG_SEGMENTS) {
 			sdb->table.nents = nents;
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: SCSI bug 2016-02-23 3:04 ` John David Anglin @ 2016-02-23 18:06 ` Helge Deller 2016-02-23 19:10 ` John David Anglin 0 siblings, 1 reply; 25+ messages in thread From: Helge Deller @ 2016-02-23 18:06 UTC (permalink / raw) To: John David Anglin; +Cc: James Bottomley, linux-parisc List On 23.02.2016 04:04, John David Anglin wrote: > On 2016-02-21, at 10:24 PM, John David Anglin wrote: > >> On 2016-02-21, at 7:53 PM, John David Anglin wrote: >> >>> Backtrace: >>> [<000000004046f4b0>] scsi_init_sgtable+0x70/0xb8 >>> [<000000004046f564>] scsi_init_io+0x6c/0x220 >>> [<000000000811c5c0>] sd_setup_read_write_cmnd+0x58/0x968 [sd_mod] >>> [<000000000811cf14>] sd_init_command+0x44/0x130 [sd_mod] >>> [<000000004046f81c>] scsi_setup_cmnd+0x104/0x1b0 >>> [<000000004046fab8>] scsi_prep_fn+0x100/0x1a0 >>> [<000000004035b9b0>] blk_peek_request+0x1b8/0x298 >>> [<0000000040471028>] scsi_request_fn+0xf8/0xa90 >>> [<0000000040357244>] __blk_run_queue+0x4c/0x70 >>> [<00000000403802c4>] cfq_insert_request+0x2dc/0x580 >>> [<0000000040356404>] __elv_add_request+0x1b4/0x300 >>> >>> We have in blk-merge.c: >>> >>> else { >>> /* >>> * If the driver previously mapped a shorter >>> * list, we could see a termination bit >>> * prematurely unless it fully inits the sg >>> * table on each mapping. We KNOW that there >>> * must be more entries here or the driver >>> * would be buggy, so force clear the >>> * termination bit to avoid doing a full >>> * sg_init_table() in drivers for each command. >>> */ >>> if (sg_is_last (*sg)) >>> printk ("__blk_segment_map_sg: clearing termination bi >>> t\n"); >>> sg_unmark_end(*sg); >>> *sg = sg_next(*sg); >>> BUG_ON (!*sg); >>> } >>> >>> The comment suggests there must be more entries... >> >> I'm thinking with the split the scsi driver needs to provide one or two extra entires in the sg list. > > > With the attached patch, I'm able to boot 4.2.0-rc2+ on linux-block at commit > 54efd50bfd873e2dbf784e0b21a8027ba4299a3e. 
> > I didn't try to optimize the number of extra entries but I know one is not enough. > > I guess the puzzle is why the number of entries isn't calculated correctly in the first place. > Further, why does blk-merge believe that it's okay to go beyond the terminator? Clearly, > the magic number isn't always set, etc. > > I added the WARN_ON so I'd know when we run off the end of the the list. Still fails to boot for me on c3000 (although I think the patch is going into the right direction!): [ 25.140000] cdrom: Uniform CD-ROM driver Revision: 3.20 [ 25.200000] sd 3:0:6:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA [ 25.304000] sd 3:0:5:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA [ 25.436000] sdb: sdb1 sdb2 sdb3 < sdb5 sdb6 > [ 25.488000] sda: sda1 sda2 sda3 < sda5 sda6 > [ 25.560000] sd 3:0:6:0: [sdb] Attached SCSI disk [ 25.636000] scsi_id(112): unaligned access to 0x00000000faad5009 at ip=0x000000004100390b [ 25.752000] sd 3:0:5:0: [sda] Attached SCSI disk [ 25.832000] scsi_id(113): unaligned access to 0x00000000fa90a009 at ip=0x000000004100390b [ 25.972000] ------------[ cut here ]------------ [ 26.028000] WARNING: at /build/linux-4.4-neu/linux-4.4.2/block/blk-merge.c:466 [ 26.116000] random: nonblocking pool is initialized [ 26.172000] Modules linked in: sr_mod cdrom sd_mod ata_generic ohci_pci ehci_pci ohci_hcd ehci_hcd pata_ns87415 libata sym53c8xx scsi_transport_spi usbcore scsi_modp [ 26.368000] CPU: 0 PID: 65 Comm: systemd-udevd Not tainted 4.4.0-1-parisc64-smp #1 Debian 4.4.2-3 [ 26.476000] task: 00000000bbd70b08 ti: 00000000bbea8000 task.ti: 00000000bbea8000 [ 26.564000] [ 26.584000] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI [ 26.640000] PSW: 00001000000001001111111100001110 Not tainted [ 26.708000] r00-03 000000ff0804ff0e 00000000409e8380 00000000404f18f4 00000000bbea91e0 [ 26.804000] r04-07 00000000409b2b80 0000000000000000 0000000000000000 000000000000001e Helge ^ permalink raw reply [flat|nested] 25+ 
messages in thread
* Re: SCSI bug 2016-02-23 18:06 ` Helge Deller @ 2016-02-23 19:10 ` John David Anglin 0 siblings, 0 replies; 25+ messages in thread From: John David Anglin @ 2016-02-23 19:10 UTC (permalink / raw) To: Helge Deller; +Cc: James Bottomley, linux-parisc List On 2016-02-23 1:06 PM, Helge Deller wrote: > On 23.02.2016 04:04, John David Anglin wrote: >> On 2016-02-21, at 10:24 PM, John David Anglin wrote: >> >>> On 2016-02-21, at 7:53 PM, John David Anglin wrote: >>> >>>> Backtrace: >>>> [<000000004046f4b0>] scsi_init_sgtable+0x70/0xb8 >>>> [<000000004046f564>] scsi_init_io+0x6c/0x220 >>>> [<000000000811c5c0>] sd_setup_read_write_cmnd+0x58/0x968 [sd_mod] >>>> [<000000000811cf14>] sd_init_command+0x44/0x130 [sd_mod] >>>> [<000000004046f81c>] scsi_setup_cmnd+0x104/0x1b0 >>>> [<000000004046fab8>] scsi_prep_fn+0x100/0x1a0 >>>> [<000000004035b9b0>] blk_peek_request+0x1b8/0x298 >>>> [<0000000040471028>] scsi_request_fn+0xf8/0xa90 >>>> [<0000000040357244>] __blk_run_queue+0x4c/0x70 >>>> [<00000000403802c4>] cfq_insert_request+0x2dc/0x580 >>>> [<0000000040356404>] __elv_add_request+0x1b4/0x300 >>>> >>>> We have in blk-merge.c: >>>> >>>> else { >>>> /* >>>> * If the driver previously mapped a shorter >>>> * list, we could see a termination bit >>>> * prematurely unless it fully inits the sg >>>> * table on each mapping. We KNOW that there >>>> * must be more entries here or the driver >>>> * would be buggy, so force clear the >>>> * termination bit to avoid doing a full >>>> * sg_init_table() in drivers for each command. >>>> */ >>>> if (sg_is_last (*sg)) >>>> printk ("__blk_segment_map_sg: clearing termination bi >>>> t\n"); >>>> sg_unmark_end(*sg); >>>> *sg = sg_next(*sg); >>>> BUG_ON (!*sg); >>>> } >>>> >>>> The comment suggests there must be more entries... >>> I'm thinking with the split the scsi driver needs to provide one or two extra entires in the sg list. 
>> >> With the attached patch, I'm able to boot 4.2.0-rc2+ on linux-block at commit >> 54efd50bfd873e2dbf784e0b21a8027ba4299a3e. >> >> I didn't try to optimize the number of extra entries but I know one is not enough. >> >> I guess the puzzle is why the number of entries isn't calculated correctly in the first place. >> Further, why does blk-merge believe that it's okay to go beyond the terminator? Clearly, >> the magic number isn't always set, etc. >> >> I added the WARN_ON so I'd know when we run off the end of the the list. > Still fails to boot for me on c3000 > (although I think the patch is going into the right direction!): > > [ 25.140000] cdrom: Uniform CD-ROM driver Revision: 3.20 > [ 25.200000] sd 3:0:6:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA > [ 25.304000] sd 3:0:5:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA > [ 25.436000] sdb: sdb1 sdb2 sdb3 < sdb5 sdb6 > > [ 25.488000] sda: sda1 sda2 sda3 < sda5 sda6 > > [ 25.560000] sd 3:0:6:0: [sdb] Attached SCSI disk > [ 25.636000] scsi_id(112): unaligned access to 0x00000000faad5009 at ip=0x000000004100390b > [ 25.752000] sd 3:0:5:0: [sda] Attached SCSI disk > [ 25.832000] scsi_id(113): unaligned access to 0x00000000fa90a009 at ip=0x000000004100390b > [ 25.972000] ------------[ cut here ]------------ > [ 26.028000] WARNING: at /build/linux-4.4-neu/linux-4.4.2/block/blk-merge.c:466 This is this warning (not the one I added): /* * Something must have been wrong if the figured number of * segment is bigger than number of req's physical segments */ WARN_ON(nsegs > rq->nr_phys_segments); We need backtrace to see who called blk_rq_map_sg. Think driver didn't provide enough segments. Maybe my "+8" addition should be moved. I don't think this warning was actual failure point. James mentioned "Apparently in parisc we don't set the max segment size, so we inherit 64k even in SCSI drivers." Can this be changed? 
> [ 26.116000] random: nonblocking pool is initialized
> [ 26.172000] Modules linked in: sr_mod cdrom sd_mod ata_generic ohci_pci ehci_pci ohci_hcd ehci_hcd pata_ns87415 libata sym53c8xx scsi_transport_spi usbcore scsi_modp
> [ 26.368000] CPU: 0 PID: 65 Comm: systemd-udevd Not tainted 4.4.0-1-parisc64-smp #1 Debian 4.4.2-3
> [ 26.476000] task: 00000000bbd70b08 ti: 00000000bbea8000 task.ti: 00000000bbea8000
> [ 26.564000]
> [ 26.584000] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
> [ 26.640000] PSW: 00001000000001001111111100001110 Not tainted
> [ 26.708000] r00-03 000000ff0804ff0e 00000000409e8380 00000000404f18f4 00000000bbea91e0
> [ 26.804000] r04-07 00000000409b2b80 0000000000000000 0000000000000000 000000000000001e

Dave

--
John David Anglin dave.anglin@bell.net

^ permalink raw reply [flat|nested] 25+ messages in thread
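
The blk-merge.c comment quoted in the message above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: `struct model_sg`, `model_sg_next()`, `model_init_table()` and `model_map()` are hypothetical names invented here, and the bound check inside `model_map()` is an addition for the illustration (the quoted kernel snippet simply clears the termination bit and advances). It shows two things: a stale end marker left by an earlier, shorter mapping is cleared harmlessly when the table is large enough, and a mapping that needs more entries than the table holds would walk off the end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct scatterlist: only the end marker matters. */
struct model_sg {
    unsigned int length;
    bool is_last;
};

/* Same idea as sg_next(): there is no successor once the end marker is set. */
struct model_sg *model_sg_next(struct model_sg *sg)
{
    return sg->is_last ? NULL : sg + 1;
}

/* Same idea as sg_init_table(): clear every entry, mark the final one. */
void model_init_table(struct model_sg *table, size_t nents)
{
    assert(table != NULL);
    for (size_t i = 0; i < nents; i++) {
        table[i].length = 0;
        table[i].is_last = false;
    }
    if (nents > 0)
        table[nents - 1].is_last = true;
}

/*
 * Map nsegs segments into a table of nents entries.
 * Returns the number of entries used, or -1 where an unchecked walk
 * would have stepped past the end of the table.
 */
int model_map(struct model_sg *table, size_t nents, size_t nsegs)
{
    struct model_sg *sg = NULL;
    size_t used = 0;

    for (size_t i = 0; i < nsegs; i++) {
        if (sg == NULL) {
            if (nents == 0)
                return -1;
            sg = table;
        } else {
            /* Force-clear a possibly stale end marker, then advance. */
            sg->is_last = false;
            /* Assumption: this guard exists only in the model. */
            if ((size_t)(sg - table) + 1 >= nents)
                return -1;
            sg = model_sg_next(sg);
        }
        sg->length = 4096; /* arbitrary segment length for the model */
        used++;
    }
    if (sg != NULL)
        sg->is_last = true; /* terminate the freshly mapped list */
    return (int)used;
}
```

In this model, mapping 2 segments and then 4 segments into the same 4-entry table succeeds even though the first mapping left an end marker on entry 1; asking for 5 segments returns -1, which corresponds to the "driver didn't provide enough segments" case discussed in the thread.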
* Re: SCSI bug
2016-02-21 19:36 ` Helge Deller
2016-02-21 20:28 ` James Bottomley
@ 2016-02-21 20:42 ` John David Anglin
1 sibling, 0 replies; 25+ messages in thread
From: John David Anglin @ 2016-02-21 20:42 UTC (permalink / raw)
To: Helge Deller; +Cc: James Bottomley, linux-parisc List

On 2016-02-21, at 2:36 PM, Helge Deller wrote:

> On 21.02.2016 20:07, James Bottomley wrote:
>> On Sun, 2016-02-21 at 13:43 -0500, John David Anglin wrote:
>>> I verified that commit 54efd50bfd873e2dbf784e0b21a8027ba4299a3e in
>>> linux-block fails to boot and commit
>>> 41609892701e26724b8617201f43254cadf2e7ae (blk-cgroup: Drop unlikely
>>> before IS_ERR(_OR_NULL)) does boot successfully. Commit
>>> 41609892701e26724b8617201f43254cadf2e7ae is previous commit in tree.
>>>
>>> I don't believe that the change can be reverted from Linus' tree as
>>> this commit allowed other stuff to be removed (see second paragraph
>>> of commit description).
>>
>> OK, can you just verify you can boot 4.5-rc5 without the sata_sil24
>> driver?
>
> I tried it on my c3000, debian kernel 4.4.2, in this case without the pata_ns87415.

Removing the sata_sil24 driver didn't help. Problem still occurs with just scsi drivers.
All remaining ata drivers were builtin.

Dave

--
John David Anglin dave.anglin@bell.net

^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2016-02-23 19:10 UTC | newest]

Thread overview: 25+ messages
2016-01-23 18:00 SCSI bug John David Anglin
2016-02-20 20:13 ` John David Anglin
2016-02-20 20:43 ` John David Anglin
2016-02-20 21:59 ` Helge Deller
2016-02-20 22:52 ` John David Anglin
2016-02-21 2:52 ` John David Anglin
2016-02-21 3:47 ` James Bottomley
2016-02-21 14:45 ` John David Anglin
2016-02-21 18:10 ` James Bottomley
2016-02-21 18:09 ` John David Anglin
2016-02-21 18:13 ` James Bottomley
2016-02-21 18:43 ` John David Anglin
2016-02-21 19:07 ` James Bottomley
2016-02-21 19:36 ` Helge Deller
2016-02-21 20:28 ` James Bottomley
2016-02-21 21:09 ` John David Anglin
2016-02-21 21:17 ` Helge Deller
2016-02-21 21:49 ` James Bottomley
2016-02-21 22:08 ` John David Anglin
2016-02-22 0:53 ` John David Anglin
2016-02-22 3:24 ` John David Anglin
2016-02-23 3:04 ` John David Anglin
2016-02-23 18:06 ` Helge Deller
2016-02-23 19:10 ` John David Anglin
2016-02-21 20:42 ` John David Anglin