* [PATCH] instrument ide-scsi in 2.5.68
@ 2003-05-01 13:02 Douglas Gilbert
2003-05-01 22:52 ` Randy.Dunlap
2003-05-02 8:40 ` Douglas Gilbert
0 siblings, 2 replies; 18+ messages in thread
From: Douglas Gilbert @ 2003-05-01 13:02 UTC (permalink / raw)
To: linux-scsi; +Cc: alan
[-- Attachment #1: Type: text/plain, Size: 963 bytes --]
Alan Cox said last week that the ide-scsi driver was
500th on his list to fix for 2.6 . To help him, if
he hasn't already fixed it, this patch includes:
- instrument driver with 2 parameters (as examples)
- ide-scsi.debug + ide-scsi.suppress_reset
[kernel load time]
- debug + suppress_reset [module load time]
- debug + suppress_reset [sysfs: read/write
in /sys/bus/ide/drivers/ide-scsi directory]
- downgrade (simple) tag queuing to no tag queuing
[set some other things to more conservative values]
- add scsi_host::release() method
- make provision for 16 byte SCSI commands
- cleanup printk()s
Many people (on the linux kernel list) have reported problems
with this driver and ATAPI burners. I only have an ATAPI
reader and can't get it to fail. If someone has a repeatable
failure perhaps they could set "debug=3" and send the relevant
part of /var/log/messages to this list or me.
Doug Gilbert
[-- Attachment #2: ide-scsi_2568.diff.gz --]
[-- Type: application/x-gzip, Size: 7061 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-01 13:02 [PATCH] instrument ide-scsi in 2.5.68 Douglas Gilbert @ 2003-05-01 22:52 ` Randy.Dunlap 2003-05-01 22:57 ` Randy.Dunlap 2003-05-01 23:29 ` Douglas Gilbert 2003-05-02 8:40 ` Douglas Gilbert 1 sibling, 2 replies; 18+ messages in thread From: Randy.Dunlap @ 2003-05-01 22:52 UTC (permalink / raw) To: dougg; +Cc: linux-scsi, alan [-- Attachment #1: Type: text/plain, Size: 1809 bytes --] On Thu, 01 May 2003 23:02:11 +1000 Douglas Gilbert <dougg@torque.net> wrote: | Alan Cox said last week that the ide-scsi driver was | 500th on his list to fix for 2.6 . To help him, if | he hasn't already fixed it, this patch includes: | | - instrument driver with 2 parameters (as examples) | - ide-scsi.debug + ide-scsi.suppress_reset | [kernel load time] | - debug + suppress_reset [module load time] | - debug + suppress_reset [sysfs: read/write | in /sys/bus/ide/drivers/ide-scsi directory] | - downgrade (simple) tag queuing to no tag queuing | [set some other things to more conservative values] | - add scsi_host::release() method | - make provision for 16 byte SCSI commands | - cleanup printk()s | | Many people (on the linux kernel list) have reported problems | with this driver and ATAPI burners. I only have an ATAPI | reader and can't get it to fail. If someone has a repeatable | failure perhaps they could set "debug=3" and send the relevant | part of /var/log/messages to this list or me. Hi Doug, I have ATAPI CD-ROM as hdc and CD-RW as hdd. I boot with "hdd=icd-scsi ide-scsi.debug=3". Is that right/OK so far? If not, what should it be? Anyway, it never gets thru the boot sequence (with or without this patch). It hangs with these last messages: scsi2 : SCSI host adapter emulation for IDE ATAPI devices ide-scsi: abort called for cmd serial #32 hdd: lost interrupt ide-scsi: CoD != 0 in idescsi_pc_intr hdd: DMA disabled hdd: ATAPI reset complete hdd: irq timeout: status=0x80 { Busy } hdd: ATAPI reset complete hdd: irq timeout: status=0x80 { Busy } hdd: ATAPI reset complete hdd: irq timeout: status=0x80 { Busy } I let it sit like this for > 5 minutes before rebooting. Complete boot log is attached. -- ~Randy [-- Attachment #2: capture.txt --] [-- Type: text/plain, Size: 10974 bytes --] Restarting system. Linux version 2.5.68 (rddunlap@gargoyle.pdx.osdl.net) (gcc version 3.2 (Mandrake Linux 9.0 3.2-1mdk)) #6 SMP Thu May 1 15:00:41 PDT 2003 Video mode to be used for restore is f05 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 00000000000a0000 (usable) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000003fff0000 (usable) BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS) BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data) BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved) 127MB HIGHMEM available. 896MB LOWMEM available. found SMP MP-table at 000f52a0 hm, page 000f5000 reserved twice. hm, page 000f6000 reserved twice. hm, page 000f1000 reserved twice. hm, page 000f2000 reserved twice. On node 0 totalpages: 262128 DMA zone: 4096 pages, LIFO batch:1 Normal zone: 225280 pages, LIFO batch:16 HighMem zone: 32752 pages, LIFO batch:7 Intel MultiProcessor Specification v1.4 Virtual Wire compatibility mode. OEM ID: OEM00000 Product ID: PROD00000000 APIC at: 0xFEE00000 Processor #0 15:1 APIC version 17 Processor #1 15:0 APIC version 17 I/O APIC #2 Version 17 at 0xFEC00000. Enabling APIC mode: Flat. Using 1 I/O APICs Processors: 2 IBM machine detected. Enabling interrupts during APM calls. IBM machine detected. Disabling SMBus accesses. Building zonelist for node : 0 Kernel command line: auto BOOT_IMAGE=lin2568isc ro root=304 devfs=nomount hdd=ide-scsi ide-scsi.debug=3 console=tty0 console=ttyS0,38400n8 debug ide_setup: hdd=ide-scsi ide_setup: ide-scsi.debug=3 -- BAD OPTION Initializing CPU#0 PID hash table entries: 4096 (order 12: 32768 bytes) Detected 1685.990 MHz processor. Console: colour VGA+ 80x30 Calibrating delay loop... 3317.76 BogoMIPS Memory: 1032804k/1048512k available (2828k kernel code, 14804k reserved, 1024k data, 144k init, 131008k highmem) Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) Mount-cache hash table entries: 512 (order: 0, 4096 bytes) -> /dev -> /dev/console -> /root CPU: Trace cache: 12K uops, L1 D cache: 8K CPU: L2 cache: 256K CPU: Hyper-Threading is disabled CPU: After generic, caps: 3febfbff 00000000 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. CPU#0: Intel P4/Xeon Extended MCE MSRs (12) available CPU#0: Thermal monitoring enabled Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking 'hlt' instruction... OK. POSIX conformance testing by UNIFIX CPU0: Intel(R) Xeon(TM) CPU 1.70GHz stepping 02 per-CPU timeslice cutoff: 731.64 usecs. task migration cache decay timeout: 1 msecs. enabled ExtINT on CPU#0 ESR value before enabling vector: 00000000 ESR value after enabling vector: 00000000 Booting processor 1/1 eip 2000 Initializing CPU#1 masked ExtINT on CPU#1 ESR value before enabling vector: 00000000 ESR value after enabling vector: 00000000 Calibrating delay loop... 3358.72 BogoMIPS CPU: Trace cache: 12K uops, L1 D cache: 8K CPU: L2 cache: 256K CPU: Hyper-Threading is disabled CPU: After generic, caps: 3febfbff 00000000 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#1. CPU#1: Intel P4/Xeon Extended MCE MSRs (12) available CPU#1: Thermal monitoring enabled CPU1: Intel(R) Xeon(TM) CPU 1700MHz stepping 0a Total of 2 processors activated (6676.48 BogoMIPS). ENABLING IO-APIC IRQs Setting 2 in the phys_id_present_map ...changing IO-APIC physical APIC ID to 2 ... ok. init IO_APIC IRQs IO-APIC (apicid-pin) 2-0, 2-5, 2-10, 2-11, 2-12, 2-18, 2-21, 2-23 not connected. ..TIMER: vector=0x31 pin1=2 pin2=0 number of MP IRQ sources: 20. number of IO-APIC #2 registers: 24. testing the IO APIC....................... IO APIC #2...... .... register #00: 02000000 ....... : physical APIC id: 02 ....... : Delivery Type: 0 ....... : LTS : 0 .... register #01: 00178020 ....... : max redirection entries: 0017 ....... : PRQ implemented: 1 ....... : IO APIC version: 0020 .... register #02: 00000000 ....... : arbitration: 00 .... IRQ redirection table: NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect: 00 000 00 1 0 0 0 0 0 0 00 01 001 01 0 0 0 0 0 1 1 39 02 001 01 0 0 0 0 0 1 1 31 03 001 01 0 0 0 0 0 1 1 41 04 001 01 0 0 0 0 0 1 1 49 05 000 00 1 0 0 0 0 0 0 00 06 001 01 0 0 0 0 0 1 1 51 07 001 01 0 0 0 0 0 1 1 59 08 001 01 0 0 0 0 0 1 1 61 09 001 01 0 0 0 0 0 1 1 69 0a 000 00 1 0 0 0 0 0 0 00 0b 000 00 1 0 0 0 0 0 0 00 0c 000 00 1 0 0 0 0 0 0 00 0d 001 01 0 0 0 0 0 1 1 71 0e 001 01 0 0 0 0 0 1 1 79 0f 001 01 0 0 0 0 0 1 1 81 10 001 01 1 1 0 1 0 1 1 89 11 001 01 1 1 0 1 0 1 1 91 12 000 00 1 0 0 0 0 0 0 00 13 001 01 1 1 0 1 0 1 1 99 14 001 01 1 1 0 1 0 1 1 A1 15 000 00 1 0 0 0 0 0 0 00 16 001 01 1 1 0 1 0 1 1 A9 17 000 00 1 0 0 0 0 0 0 00 IRQ to pin mappings: IRQ0 -> 0:2 IRQ1 -> 0:1 IRQ3 -> 0:3 IRQ4 -> 0:4 IRQ6 -> 0:6 IRQ7 -> 0:7 IRQ8 -> 0:8 IRQ9 -> 0:9 IRQ13 -> 0:13 IRQ14 -> 0:14 IRQ15 -> 0:15 IRQ16 -> 0:16 IRQ17 -> 0:17 IRQ19 -> 0:19 IRQ20 -> 0:20 IRQ22 -> 0:22 .................................... done. Using local APIC timer interrupts. calibrating APIC timer ... ..... CPU clock speed is 1684.0875 MHz. ..... host bus clock speed is 99.0110 MHz. checking TSC synchronization across 2 CPUs: passed. Starting migration thread for cpu 0 Bringing up 1 CPU 1 IS NOW UP! Starting migration thread for cpu 1 CPUS done 4 mtrr: v2.0 (20020519) mtrr: your CPUs had inconsistent fixed MTRR settings mtrr: probably your BIOS does not setup all CPUs. mtrr: corrected configuration. PCI: PCI BIOS revision 2.10 entry at 0xfb110, last bus=4 PCI: Using configuration type 1 BIO: pool of 256 setup, 14Kb (56 bytes/bio) biovec pool[0]: 1 bvecs: 256 entries (12 bytes) biovec pool[1]: 4 bvecs: 256 entries (48 bytes) biovec pool[2]: 16 bvecs: 256 entries (192 bytes) biovec pool[3]: 64 bvecs: 256 entries (768 bytes) biovec pool[4]: 128 bvecs: 256 entries (1536 bytes) biovec pool[5]: 256 bvecs: 256 entries (3072 bytes) Linux Plug and Play Support v0.96 (c) Adam Belay block request queues: 128 requests per read queue 128 requests per write queue 8 requests per batch enter congestion at 15 exit congestion at 17 SCSI subsystem initialized drivers/usb/core/usb.c: registered new driver usbfs drivers/usb/core/usb.c: registered new driver hub PCI: Probing PCI hardware PCI: Probing PCI hardware (bus 00) Transparent bridge - Intel Corp. 82801BA/CA/DB PCI Br PCI: Using IRQ router PIIX [8086/2440] at 00:1f.0 PCI->APIC IRQ transform: (B0,I31,P3) -> 19 PCI->APIC IRQ transform: (B0,I31,P1) -> 17 PCI->APIC IRQ transform: (B0,I31,P1) -> 17 PCI->APIC IRQ transform: (B1,I0,P0) -> 22 PCI->APIC IRQ transform: (B3,I1,P0) -> 20 PCI->APIC IRQ transform: (B3,I1,P1) -> 20 PCI->APIC IRQ transform: (B4,I4,P0) -> 16 Initializing RT netlink socket Starting balanced_irq Enabling SEP on CPU 1 Enabling SEP on CPU 0 highmem bounce pool size: 64 pages Journalled Block Device driver loaded udf: registering filesystem pty: 256 Unix98 ptys configured request_module: failed /sbin/modprobe -- parport_lowlevel. error = -16 lp: driver loaded but no devices found Non-volatile memory driver v1.2 i810 TCO timer init: failed to reset NO_REBOOT flag, reboot disabled by hardware Software Watchdog Timer: 0.06, soft_margin: 60 sec, nowayout: 0 Linux agpgart interface v0.100 (c) Dave Jones agpgart: Detected Intel i860 chipset agpgart: Maximum main memory to use for agp memory: 941M agpgart: AGP aperture is 64M @ 0xe8000000 Serial: 8250/16550 driver $Revision: 1.90 $ IRQ sharing disabled ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A parport0: PC-style at 0x378 (0x778) [PCSPP(,...)] parport0: irq 7 detected lp0: using parport0 (polling). lp0: console ready Floppy drive(s): fd0 is 1.44M FDC 0 is a post-1991 82077 loop: loaded (max 8 devices) eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw@saw.sw.com.sg> and others eth0: OEM i82557/i82558 10/100 Ethernet, 00:02:55:1A:35:D4, IRQ 16. Board assembly a32040-002, Physical connectors present: RJ45 Primary interface chip i82555 PHY #1. General self-test: passed. Serial sub-system self-test: passed. Internal registers self-test: passed. ROM checksum self-test: passed (0x3258698e). Linux video capture interface: v1.00 Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx ICH2: IDE controller at PCI slot 00:1f.1 ICH2: chipset revision 4 ICH2: not 100% native mode: will probe irqs later ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:DMA ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:DMA hda: IC35L040AVER07-0, ATA DISK drive hdb: IC35L020AVER07-0, ATA DISK drive ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 hdc: LTN486S, ATAPI CD/DVD-ROM drive hdd: SONY CD-RW CRX140E, ATAPI CD/DVD-ROM drive ide1 at 0x170-0x177,0x376 on irq 15 hda: host protected area => 1 hda: 78156288 sectors (40016 MB) w/1916KiB Cache, CHS=77536/16/63, UDMA(100) hda: hda1 hda2 hda3 hda4 hdb: host protected area => 1 hdb: 39876480 sectors (20417 MB) w/1916KiB Cache, CHS=39560/16/63, UDMA(100) hdb: hdb1 hdc: ATAPI 48X CD-ROM drive, 120kB Cache, UDMA(33) Uniform CD-ROM driver Revision: 3.12 end_request: I/O error, dev hdc, sector 0 scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.31 <Adaptec aic7899 Ultra160 SCSI adapter> aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs (scsi0:A:0): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit) Vendor: IBM Model: IC35L036UWD210-0 Rev: S5BS Type: Direct-Access ANSI SCSI revision: 03 scsi0:A:0:0: Tagged Queuing enabled. Depth 253 scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.31 <Adaptec aic7899 Ultra160 SCSI adapter> aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs scsi2 : SCSI host adapter emulation for IDE ATAPI devices ide-scsi: abort called for cmd serial #32 hdd: lost interrupt ide-scsi: CoD != 0 in idescsi_pc_intr hdd: DMA disabled hdd: ATAPI reset complete hdd: irq timeout: status=0x80 { Busy } hdd: ATAPI reset complete hdd: irq timeout: status=0x80 { Busy } hdd: ATAPI reset complete hdd: irq timeout: status=0x80 { Busy } ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-01 22:52 ` Randy.Dunlap @ 2003-05-01 22:57 ` Randy.Dunlap 2003-05-01 23:29 ` Douglas Gilbert 1 sibling, 0 replies; 18+ messages in thread From: Randy.Dunlap @ 2003-05-01 22:57 UTC (permalink / raw) To: dougg; +Cc: linux-scsi, alan On Thu, 1 May 2003 15:52:25 -0700 "Randy.Dunlap" <rddunlap@osdl.org> wrote: | | Hi Doug, | | I have ATAPI CD-ROM as hdc and CD-RW as hdd. I boot with | "hdd=icd-scsi ide-scsi.debug=3". Is that right/OK so far? + ide-scsi (typo) -- ~Randy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-01 22:52 ` Randy.Dunlap 2003-05-01 22:57 ` Randy.Dunlap @ 2003-05-01 23:29 ` Douglas Gilbert 2003-05-01 23:44 ` Willem Riede 2003-05-02 0:38 ` Randy.Dunlap 1 sibling, 2 replies; 18+ messages in thread From: Douglas Gilbert @ 2003-05-01 23:29 UTC (permalink / raw) To: Randy.Dunlap; +Cc: linux-scsi, alan Randy.Dunlap wrote: <snip/> > Hi Doug, > > I have ATAPI CD-ROM as hdc and CD-RW as hdd. I boot with > "hdd=icd-scsi ide-scsi.debug=3". Is that right/OK so far? > If not, what should it be? Randy, I thought so but it's not picking "ide-scsi.debug=3" up: > Kernel command line: auto BOOT_IMAGE=lin2568isc ro root=304 > devfs=nomount hdd=ide-scsi ide-scsi.debug=3 console=tty0 > console=ttyS0,38400n8 debug > ide_setup: hdd=ide-scsi > ide_setup: ide-scsi.debug=3 -- BAD OPTION The code uses Rusty's moduleparam.h and it worked for me with scsi_debug. [It is some time since I checked scsi_debug built-in.] Could you try "idescsi.debug=3" (i.e. drop the hyphen). Failing that perhaps you could build ide-scsi as a module. > Anyway, it never gets thru the boot sequence (with or without this > patch). It hangs with these last messages: > > scsi2 : SCSI host adapter emulation for IDE ATAPI devices > ide-scsi: abort called for cmd serial #32 > hdd: lost interrupt > ide-scsi: CoD != 0 in idescsi_pc_intr > hdd: DMA disabled > hdd: ATAPI reset complete > hdd: irq timeout: status=0x80 { Busy } > hdd: ATAPI reset complete > hdd: irq timeout: status=0x80 { Busy } > hdd: ATAPI reset complete > hdd: irq timeout: status=0x80 { Busy } It would be interesting to see what effect "suppress_reset=1" has on this failure. Doug Gilbert ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-01 23:29 ` Douglas Gilbert @ 2003-05-01 23:44 ` Willem Riede 2003-05-02 0:38 ` Randy.Dunlap 1 sibling, 0 replies; 18+ messages in thread From: Willem Riede @ 2003-05-01 23:44 UTC (permalink / raw) To: linux-scsi On 2003.05.01 19:29 Douglas Gilbert wrote: > Randy.Dunlap wrote: > <snip/> > >> Hi Doug, >> >> I have ATAPI CD-ROM as hdc and CD-RW as hdd. I boot with >> "hdd=icd-scsi ide-scsi.debug=3". Is that right/OK so far? >> If not, what should it be? > > Randy, > I thought so but it's not picking "ide-scsi.debug=3" up: > > > Kernel command line: auto BOOT_IMAGE=lin2568isc ro root=304 > > devfs=nomount hdd=ide-scsi ide-scsi.debug=3 console=tty0 > > console=ttyS0,38400n8 debug > > ide_setup: hdd=ide-scsi > > ide_setup: ide-scsi.debug=3 -- BAD OPTION > > The code uses Rusty's moduleparam.h and it worked > for me with scsi_debug. [It is some time since I checked > scsi_debug built-in.] Could you try "idescsi.debug=3" > (i.e. drop the hyphen). I have on occasion noticed that the kernel thinks ide-scsi is called ide_scsi (i.e. with an underscore). You could try that too. Success, Willem Riede. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-01 23:29 ` Douglas Gilbert 2003-05-01 23:44 ` Willem Riede @ 2003-05-02 0:38 ` Randy.Dunlap 2003-05-02 1:10 ` Douglas Gilbert 1 sibling, 1 reply; 18+ messages in thread From: Randy.Dunlap @ 2003-05-02 0:38 UTC (permalink / raw) To: dougg; +Cc: linux-scsi, alan On Fri, 02 May 2003 09:29:55 +1000 Douglas Gilbert <dougg@torque.net> wrote: | Randy.Dunlap wrote: | <snip/> | | > Hi Doug, | > | > I have ATAPI CD-ROM as hdc and CD-RW as hdd. I boot with | > "hdd=icd-scsi ide-scsi.debug=3". Is that right/OK so far? | > If not, what should it be? | | Randy, | I thought so but it's not picking "ide-scsi.debug=3" up: | | > Kernel command line: auto BOOT_IMAGE=lin2568isc ro root=304 | > devfs=nomount hdd=ide-scsi ide-scsi.debug=3 console=tty0 | > console=ttyS0,38400n8 debug | > ide_setup: hdd=ide-scsi | > ide_setup: ide-scsi.debug=3 -- BAD OPTION Yes, I noticed that. So I changed it to "ide_scsi.debug=3" and got the same message. So I changed it to "idescsi.debug=3" and still get error messages...?? Anyway, I set the sysfs file values to 3 and 1 (debug and suppress_reset) [since it boots now for some unknown reason] and tried to burn another CD. Now I have just another coaster. :) The system is hung-up again, no keyboard, no messages on serial console. In the future I should probably use ide-scsi as a module to see if the parameters work OK. And I'll look into the parameters for built-in drivers too (on my long todo list). | The code uses Rusty's moduleparam.h and it worked | for me with scsi_debug. [It is some time since I checked | scsi_debug built-in.] Could you try "idescsi.debug=3" | (i.e. drop the hyphen). Failing that perhaps you could | build ide-scsi as a module. | | > Anyway, it never gets thru the boot sequence (with or without this | > patch). It hangs with these last messages: | > | > scsi2 : SCSI host adapter emulation for IDE ATAPI devices | > ide-scsi: abort called for cmd serial #32 | > hdd: lost interrupt | > ide-scsi: CoD != 0 in idescsi_pc_intr | > hdd: DMA disabled | > hdd: ATAPI reset complete | > hdd: irq timeout: status=0x80 { Busy } | > hdd: ATAPI reset complete | > hdd: irq timeout: status=0x80 { Busy } | > hdd: ATAPI reset complete | > hdd: irq timeout: status=0x80 { Busy } | | It would be interesting to see what effect "suppress_reset=1" | has on this failure. | | Doug Gilbert -- ~Randy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-02 0:38 ` Randy.Dunlap @ 2003-05-02 1:10 ` Douglas Gilbert 0 siblings, 0 replies; 18+ messages in thread From: Douglas Gilbert @ 2003-05-02 1:10 UTC (permalink / raw) To: Randy.Dunlap; +Cc: linux-scsi Randy.Dunlap wrote: > On Fri, 02 May 2003 09:29:55 +1000 Douglas Gilbert <dougg@torque.net> wrote: > > | Randy.Dunlap wrote: > | <snip/> > | > | > Hi Doug, > | > > | > I have ATAPI CD-ROM as hdc and CD-RW as hdd. I boot with > | > "hdd=icd-scsi ide-scsi.debug=3". Is that right/OK so far? > | > If not, what should it be? > | > | Randy, > | I thought so but it's not picking "ide-scsi.debug=3" up: > | > | > Kernel command line: auto BOOT_IMAGE=lin2568isc ro root=304 > | > devfs=nomount hdd=ide-scsi ide-scsi.debug=3 console=tty0 > | > console=ttyS0,38400n8 debug > | > ide_setup: hdd=ide-scsi > | > ide_setup: ide-scsi.debug=3 -- BAD OPTION > > Yes, I noticed that. So I changed it to "ide_scsi.debug=3" and got > the same message. So I changed it to "idescsi.debug=3" and still > get error messages...?? > > Anyway, I set the sysfs file values to 3 and 1 (debug and suppress_reset) > [since it boots now for some unknown reason] and tried to burn another > CD. Now I have just another coaster. :) > > The system is hung-up again, no keyboard, no messages on serial console. > > In the future I should probably use ide-scsi as a module to see if the > parameters work OK. And I'll look into the parameters for built-in > drivers too (on my long todo list). Randy, Thanks for trying this. It would be great to see which command it is locking up on. Since you see the problem at driver load time, perhaps this is related to what Matthew Dharm has been seeing with some USB devices. Namely when these devices get a SCSI command that don't like (e.g. INQUIRY with the EVPD bit set **) they lock up. If this is the case then the ide-scsi driver may be needing that command filter as well. ** Patrick Mansfield has submitted a patch (now in Linus's bk snapshots) that removes the VPD=0x83 code from device scans. Not sure if this is a forward step. Some user app is going to come along and send that INQUIRY (vpd=0x83) to all SCSI devices and lock up the "bad" ones ;-( That user space app needs information to discriminate. Doug Gilbert ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-01 13:02 [PATCH] instrument ide-scsi in 2.5.68 Douglas Gilbert 2003-05-01 22:52 ` Randy.Dunlap @ 2003-05-02 8:40 ` Douglas Gilbert 2003-05-02 16:55 ` Randy.Dunlap 1 sibling, 1 reply; 18+ messages in thread From: Douglas Gilbert @ 2003-05-02 8:40 UTC (permalink / raw) To: linux-scsi; +Cc: alan, rddunlap [-- Attachment #1: Type: text/plain, Size: 1059 bytes --] Douglas Gilbert wrote: > <snip/> > - instrument driver with 2 parameters (as examples) > - ide-scsi.debug + ide-scsi.suppress_reset > [kernel load time] > - debug + suppress_reset [module load time] > - debug + suppress_reset [sysfs: read/write > in /sys/bus/ide/drivers/ide-scsi directory] > - downgrade (simple) tag queuing to no tag queuing > [set some other things to more conservative values] > - add scsi_host::release() method > - make provision for 16 byte SCSI commands > - cleanup printk()s > <snip/> As Randy has found the setting of kernel boot time parameters doesn't work. The problem seems to be that the ide subsystem asserts ownership over all parameters that start with "ide". I was unable to bend moduleparam.h to accept a leading underscore so I went back to the old "__setup" method: the kernel boot time parameters are now: - _ide_scsi_debug + _ide_scsi_suppress_reset Module load time and sysfs parameters remain the same. A new patch against lk 2.5.68 is attached. Doug Gilbert [-- Attachment #2: ide-scsi_2568_2.diff.gz --] [-- Type: application/x-gzip, Size: 7485 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-02 8:40 ` Douglas Gilbert @ 2003-05-02 16:55 ` Randy.Dunlap 2003-05-03 9:03 ` Douglas Gilbert 0 siblings, 1 reply; 18+ messages in thread From: Randy.Dunlap @ 2003-05-02 16:55 UTC (permalink / raw) To: dougg; +Cc: linux-scsi, alan [-- Attachment #1: Type: text/plain, Size: 1365 bytes --] On Fri, 02 May 2003 18:40:41 +1000 Douglas Gilbert <dougg@torque.net> wrote: | Douglas Gilbert wrote: | > <snip/> | > - instrument driver with 2 parameters (as examples) | > - ide-scsi.debug + ide-scsi.suppress_reset | > [kernel load time] | > - debug + suppress_reset [module load time] | > - debug + suppress_reset [sysfs: read/write | > in /sys/bus/ide/drivers/ide-scsi directory] | > - downgrade (simple) tag queuing to no tag queuing | > [set some other things to more conservative values] | > - add scsi_host::release() method | > - make provision for 16 byte SCSI commands | > - cleanup printk()s | > <snip/> | | As Randy has found the setting of kernel boot time parameters | doesn't work. The problem seems to be that the ide subsystem | asserts ownership over all parameters that start with "ide". | I was unable to bend moduleparam.h to accept a leading underscore | so I went back to the old "__setup" method: the kernel boot time | parameters are now: | - _ide_scsi_debug + _ide_scsi_suppress_reset I'll confirm that those work. I.e., I saw values 3 and 1 in /sys/bus/ide/drivers/ide-scsi/*. | Module load time and sysfs parameters remain the same. | | A new patch against lk 2.5.68 is attached. Used that patch instead of v1 of it. Got an oops in ide-scsi. Message log attached. -- ~Randy [-- Attachment #2: capture-isc-oops.txt --] [-- Type: text/plain, Size: 25131 bytes --] Restarting system. Linux version 2.5.68 (rddunlap@gargoyle.pdx.osdl.net) (gcc version 3.2 (Mandrake Linux 9.0 3.2-1mdk)) #8 SMP Fri May 2 09:15:05 PDT 2003 Video mode to be used for restore is f05 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 00000000000a0000 (usable) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000003fff0000 (usable) BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS) BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data) BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved) 127MB HIGHMEM available. 896MB LOWMEM available. found SMP MP-table at 000f52a0 hm, page 000f5000 reserved twice. hm, page 000f6000 reserved twice. hm, page 000f1000 reserved twice. hm, page 000f2000 reserved twice. On node 0 totalpages: 262128 DMA zone: 4096 pages, LIFO batch:1 Normal zone: 225280 pages, LIFO batch:16 HighMem zone: 32752 pages, LIFO batch:7 Intel MultiProcessor Specification v1.4 Virtual Wire compatibility mode. OEM ID: OEM00000 Product ID: PROD00000000 APIC at: 0xFEE00000 Processor #0 15:1 APIC version 17 Processor #1 15:0 APIC version 17 I/O APIC #2 Version 17 at 0xFEC00000. Enabling APIC mode: Flat. Using 1 I/O APICs Processors: 2 IBM machine detected. Enabling interrupts during APM calls. IBM machine detected. Disabling SMBus accesses. Building zonelist for node : 0 Kernel command line: auto BOOT_IMAGE=lin2568isc ro root=304 devfs=nomount hdd=ide-scsi _ide_scsi_debug=3 _ide_scsi_suppress_reset=1 console=tty0 console=ttyS0,38400n8 debug ide_setup: hdd=ide-scsi Initializing CPU#0 PID hash table entries: 4096 (order 12: 32768 bytes) Detected 1685.214 MHz processor. Console: colour VGA+ 80x30 Calibrating delay loop... 3317.76 BogoMIPS Memory: 1032804k/1048512k available (2828k kernel code, 14804k reserved, 1024k data, 144k init, 131008k highmem) Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) Mount-cache hash table entries: 512 (order: 0, 4096 bytes) -> /dev -> /dev/console -> /root CPU: Trace cache: 12K uops, L1 D cache: 8K CPU: L2 cache: 256K CPU: Hyper-Threading is disabled CPU: After generic, caps: 3febfbff 00000000 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. CPU#0: Intel P4/Xeon Extended MCE MSRs (12) available CPU#0: Thermal monitoring enabled Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking 'hlt' instruction... OK. POSIX conformance testing by UNIFIX CPU0: Intel(R) Xeon(TM) CPU 1.70GHz stepping 02 per-CPU timeslice cutoff: 731.20 usecs. task migration cache decay timeout: 1 msecs. enabled ExtINT on CPU#0 ESR value before enabling vector: 00000000 ESR value after enabling vector: 00000000 Booting processor 1/1 eip 2000 Initializing CPU#1 masked ExtINT on CPU#1 ESR value before enabling vector: 00000000 ESR value after enabling vector: 00000000 Calibrating delay loop... 3358.72 BogoMIPS CPU: Trace cache: 12K uops, L1 D cache: 8K CPU: L2 cache: 256K CPU: Hyper-Threading is disabled CPU: After generic, caps: 3febfbff 00000000 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#1. CPU#1: Intel P4/Xeon Extended MCE MSRs (12) available CPU#1: Thermal monitoring enabled CPU1: Intel(R) Xeon(TM) CPU 1700MHz stepping 0a Total of 2 processors activated (6676.48 BogoMIPS). ENABLING IO-APIC IRQs Setting 2 in the phys_id_present_map ...changing IO-APIC physical APIC ID to 2 ... ok. init IO_APIC IRQs IO-APIC (apicid-pin) 2-0, 2-5, 2-10, 2-11, 2-12, 2-18, 2-21, 2-23 not connected. ..TIMER: vector=0x31 pin1=2 pin2=0 number of MP IRQ sources: 20. number of IO-APIC #2 registers: 24. testing the IO APIC....................... IO APIC #2...... .... register #00: 02000000 ....... : physical APIC id: 02 ....... : Delivery Type: 0 ....... : LTS : 0 .... register #01: 00178020 ....... : max redirection entries: 0017 ....... : PRQ implemented: 1 ....... : IO APIC version: 0020 .... register #02: 00000000 ....... : arbitration: 00 .... IRQ redirection table: IRQ to pin mappings: IRQ0 -> 0:2 IRQ1 -> 0:1 IRQ3 -> 0:3 IRQ4 -> 0:4 IRQ6 -> 0:6 IRQ7 -> 0:7 IRQ8 -> 0:8 IRQ9 -> 0:9 IRQ13 -> 0:13 IRQ14 -> 0:14 IRQ15 -> 0:15 IRQ16 -> 0:16 IRQ17 -> 0:17 IRQ19 -> 0:19 IRQ20 -> 0:20 IRQ22 -> 0:22 .................................... done. Using local APIC timer interrupts. calibrating APIC timer ... ..... CPU clock speed is 1684.0989 MHz. ..... host bus clock speed is 99.0116 MHz. checking TSC synchronization across 2 CPUs: passed. Starting migration thread for cpu 0 Bringing up 1 CPU 1 IS NOW UP! Starting migration thread for cpu 1 CPUS done 4 mtrr: v2.0 (20020519) mtrr: your CPUs had inconsistent fixed MTRR settings mtrr: probably your BIOS does not setup all CPUs. mtrr: corrected configuration. PCI: PCI BIOS revision 2.10 entry at 0xfb110, last bus=4 PCI: Using configuration type 1 BIO: pool of 256 setup, 14Kb (56 bytes/bio) Linux Plug and Play Support v0.96 (c) Adam Belay block request queues: 128 requests per read queue 128 requests per write queue 8 requests per batch enter congestion at 15 exit congestion at 17 SCSI subsystem initialized PCI: Probing PCI hardware PCI: Probing PCI hardware (bus 00) Transparent bridge - Intel Corp. 82801BA/CA/DB PCI Br PCI: Using IRQ router PIIX [8086/2440] at 00:1f.0 PCI->APIC IRQ transform: (B0,I31,P3) -> 19 PCI->APIC IRQ transform: (B0,I31,P1) -> 17 PCI->APIC IRQ transform: (B0,I31,P1) -> 17 PCI->APIC IRQ transform: (B1,I0,P0) -> 22 PCI->APIC IRQ transform: (B3,I1,P0) -> 20 PCI->APIC IRQ transform: (B3,I1,P1) -> 20 PCI->APIC IRQ transform: (B4,I4,P0) -> 16 Initializing RT netlink socket Starting balanced_irq Enabling SEP on CPU 1 Enabling SEP on CPU 0 highmem bounce pool size: 64 pages Journalled Block Device driver loaded udf: registering filesystem pty: 256 Unix98 ptys configured Non-volatile memory driver v1.2 i810 TCO timer init: failed to reset NO_REBOOT flag, reboot disabled by hardware Software Watchdog Timer: 0.06, soft_margin: 60 sec, nowayout: 0 Serial: 8250/16550 driver $Revision: 1.90 $ IRQ sharing disabled ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A Linux video capture interface: v1.00 Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx ICH2: IDE controller at PCI slot 00:1f.1 ICH2: chipset revision 4 ICH2: not 100% native mode: will probe irqs later ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:DMA ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:DMA hda: IC35L040AVER07-0, ATA DISK drive hdb: IC35L020AVER07-0, ATA DISK drive ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 hdc: LTN486S, ATAPI CD/DVD-ROM drive hdd: SONY CD-RW CRX140E, ATAPI CD/DVD-ROM drive ide1 at 0x170-0x177,0x376 on irq 15 hda: host protected area => 1 hda: 78156288 sectors (40016 MB) w/1916KiB Cache, CHS=77536/16/63, UDMA(100) hda: hda1 hda2 hda3 hda4 hdb: host protected area => 1 hdb: 39876480 sectors (20417 MB) w/1916KiB Cache, CHS=39560/16/63, UDMA(100) hdb: hdb1 hdc: ATAPI 48X CD-ROM drive, 120kB Cache, UDMA(33) Uniform CD-ROM driver Revision: 3.12 end_request: I/O error, dev hdc, sector 0 scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.31 <Adaptec aic7899 Ultra160 SCSI adapter> aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs (scsi0:A:0): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit) Vendor: IBM Model: IC35L036UWD210-0 Rev: S5BS Type: Direct-Access ANSI SCSI revision: 03 scsi0:A:0:0: Tagged Queuing enabled. Depth 253 scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.31 <Adaptec aic7899 Ultra160 SCSI adapter> aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs scsi2 : SCSI host adapter emulation for IDE ATAPI devices ide-scsi: hdd: que 32, cmd = [ 12 0 0 0 24 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 36 bytes transferred ide-scsi: hdd: suc 32, rst = [ 5 80 2 32 5b 0 0 0 53 4f 4e 59 20 20 20 20 ] ide-scsi: hdd: que 33, cmd = [ 12 0 0 0 60 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 96 bytes transferred ide-scsi: hdd: suc 33, rst = [ 5 80 2 32 5b 0 0 0 53 4f 4e 59 20 20 20 20 ] Vendor: SONY Model: CD-RW CRX140E Rev: 1.0n Type: CD-ROM ANSI SCSI revision: 02 SCSI device sda: 71687340 512-byte hdwr sectors (36704 MB) SCSI device sda: drive cache: write back sda: sda1 sda2 sda3 sda4 Attached scsi disk sda at scsi0, channel 0, id 0, lun 0 ide-scsi: hdd: que 39, cmd = [ 1a 0 2a 0 80 0 ] ide-scsi: hdd: que 39, tsl = [ 5a 0 2a 0 0 0 0 0 84 0 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 34 bytes transferred ide-scsi: hdd: suc 39, rst = [ 20 70 0 0 2a 18 7 7 71 63 29 23 16 d 1 0 ] sr0: scsi3-mmc drive: 32x/32x writer cd/rw xa/form2 cdda tray Attached scsi CD-ROM sr0 at scsi2, channel 0, id 0, lun 0 mice: PS/2 mouse device common for all mice input: PC Speaker serio: i8042 AUX port at 0x60,0x64 irq 12 input: AT Set 2 keyboard on isa0060/serio0 serio: i8042 KBD port at 0x60,0x64 irq 1 i2c /dev entries driver module version 2.7.0 (20021208) i2c-i801 version 2.7.0 (20021208) i2c-piix4 version 2.7.0 (20021208) kjournald starting. Commit interval 5 seconds EXT3-fs: mounted filesystem with ordered data mode. VFS: Mounted root (ext3 filesystem) readonly. Freeing unused kernel memory: 144k freed Setting clock (utc): Fri May 2 09:26:52 PDT 2003 [ OK ] Setting hostname gargoyle.pdx.osdl.net: [ OK ] Checking root filesystem /dev/hda4: clean, 225983/3637248 files, 946621/7263388 blocks [ OK ] Checking filesystems /dev/hda1: clean, 226/126976 files, 245301/506016 blocks /dev/hda3: clean, 15623/652800 files, 834432/1305281 blocks [ OK ] INIT: Entering runlevel: 3 Entering non-interactive startup SysRq : Changing Loglevel Loglevel set to 7 SysRq : Changing Loglevel Loglevel set to 7 SysRq : Changing Loglevel Loglevel set to 9 ide-scsi: hdd: que 43, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: check condition for 43 ide-scsi: hdd: queue cmd = [ 3 0 0 0 40 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 26 bytes transferred ide-scsi: hdd: wrap up check 43, rst = [ 70 0 2 0 0 0 0 12 0 0 0 0 3a 0 0 0 ] ide-scsi: hdd: que 44, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: check condition for 44 ide-scsi: hdd: queue cmd = [ 3 0 0 0 40 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 26 bytes transferred ide-scsi: hdd: wrap up check 44, rst = [ 70 0 2 0 0 0 0 12 0 0 0 0 3a 0 0 0 ] ide-scsi: hdd: que 45, cmd = [ 12 0 0 0 24 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 36 bytes transferred ide-scsi: hdd: suc 45, rst = [ 5 80 0 31 5b 0 0 0 53 4f 4e 59 20 20 20 20 ] ide-scsi: hdd: que 46, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: check condition for 46 ide-scsi: hdd: queue cmd = [ 3 0 0 0 40 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 26 bytes transferred ide-scsi: hdd: wrap up check 46, rst = [ 70 0 2 0 0 0 0 12 0 0 0 0 3a 0 0 0 ] ide-scsi: hdd: que 47, cmd = [ 5a 0 3f 0 0 0 0 0 8 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 8 bytes transferred ide-scsi: hdd: suc 47, rst = [ 0 96 70 0 0 0 0 0 ] ide-scsi: hdd: que 48, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: check condition for 48 ide-scsi: hdd: queue cmd = [ 3 0 0 0 40 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 26 bytes transferred ide-scsi: hdd: wrap up check 48, rst = [ 70 0 2 0 0 0 0 12 0 0 0 0 3a 0 0 0 ] ide-scsi: hdd: que 49, cmd = [ 5a 0 2a 0 0 0 0 0 2 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 2 bytes transferred ide-scsi: hdd: suc 49, rst = [ 0 20 ] ide-scsi: hdd: que 50, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: check condition for 50 ide-scsi: hdd: queue cmd = [ 3 0 0 0 40 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 26 bytes transferred ide-scsi: hdd: wrap up check 50, rst = [ 70 0 2 0 0 0 0 12 0 0 0 0 3a 0 0 0 ] ide-scsi: hdd: que 51, cmd = [ 5a 0 2a 0 0 0 0 0 22 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 34 bytes transferred ide-scsi: hdd: suc 51, rst = [ 0 20 70 0 0 0 0 0 2a 18 7 7 71 63 29 23 ] ide-scsi: hdd: que 52, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: check condition for 52 ide-scsi: hdd: queue cmd = [ 3 0 0 0 40 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 26 bytes transferred ide-scsi: hdd: wrap up check 52, rst = [ 70 0 2 0 0 0 0 12 0 0 0 0 3a 0 0 0 ] ide-scsi: hdd: que 53, cmd = [ 5a 0 2a 0 0 0 0 0 22 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 34 bytes transferred ide-scsi: hdd: suc 53, rst = [ 0 20 70 0 0 0 0 0 2a 18 7 7 71 63 29 23 ] ide-scsi: hdd: que 54, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: check condition for 54 ide-scsi: hdd: queue cmd = [ 3 0 0 0 40 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 26 bytes transferred ide-scsi: hdd: wrap up check 54, rst = [ 70 0 2 0 0 0 0 12 0 0 0 0 3a 0 0 0 ] Mandrake Linux release 9.0 (dolphin) for i586 Kernel 2.5.68 on a Dual-processor i686 / l gargoyle.pdx.osdl.net login: ide-scsi: hdd: que 55, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: check condition for 55 ide-scsi: hdd: queue cmd = [ 3 0 0 0 40 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 26 bytes transferred ide-scsi: hdd: wrap up check 55, rst = [ 70 0 6 0 0 0 0 12 0 0 0 0 28 0 0 0 ] ide-scsi: hdd: que 56, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 56 ide-scsi: hdd: que 57, cmd = [ 12 0 0 0 24 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 36 bytes transferred ide-scsi: hdd: suc 57, rst = [ 5 80 0 31 5b 0 0 0 53 4f 4e 59 20 20 20 20 ] ide-scsi: hdd: que 58, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 58 ide-scsi: hdd: que 59, cmd = [ 5a 0 3f 0 0 0 0 0 8 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 8 bytes transferred ide-scsi: hdd: suc 59, rst = [ 0 96 11 0 0 0 0 0 ] ide-scsi: hdd: que 60, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 60 ide-scsi: hdd: que 61, cmd = [ 5a 0 2a 0 0 0 0 0 2 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 2 bytes transferred ide-scsi: hdd: suc 61, rst = [ 0 20 ] ide-scsi: hdd: que 62, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 62 ide-scsi: hdd: que 63, cmd = [ 5a 0 2a 0 0 0 0 0 22 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 34 bytes transferred ide-scsi: hdd: suc 63, rst = [ 0 20 11 0 0 0 0 0 2a 18 7 7 71 63 29 23 ] ide-scsi: hdd: que 64, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 64 ide-scsi: hdd: que 65, cmd = [ 5a 0 2a 0 0 0 0 0 22 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 34 bytes transferred ide-scsi: hdd: suc 65, rst = [ 0 20 11 0 0 0 0 0 2a 18 7 7 71 63 29 23 ] ide-scsi: hdd: que 66, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 66 ide-scsi: hdd: que 67, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 67 ide-scsi: hdd: que 68, cmd = [ 5a 0 3f 0 0 0 0 0 8 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 8 bytes transferred ide-scsi: hdd: suc 68, rst = [ 0 96 11 0 0 0 0 0 ] ide-scsi: hdd: que 69, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 69 ide-scsi: hdd: que 70, cmd = [ 5a 0 2a 0 0 0 0 0 2 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 2 bytes transferred ide-scsi: hdd: suc 70, rst = [ 0 20 ] ide-scsi: hdd: que 71, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 71 ide-scsi: hdd: que 72, cmd = [ 5a 0 2a 0 0 0 0 0 22 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 34 bytes transferred ide-scsi: hdd: suc 72, rst = [ 0 20 11 0 0 0 0 0 2a 18 7 7 71 63 29 23 ] ide-scsi: hdd: que 73, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 73 ide-scsi: hdd: que 74, cmd = [ 5a 0 2a 0 0 0 0 0 22 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 34 bytes transferred ide-scsi: hdd: suc 74, rst = [ 0 20 11 0 0 0 0 0 2a 18 7 7 71 63 29 23 ] ide-scsi: hdd: que 75, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 75 ide-scsi: hdd: que 76, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 76 ide-scsi: hdd: que 77, cmd = [ 5a 0 3f 0 0 0 0 0 8 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 8 bytes transferred ide-scsi: hdd: suc 77, rst = [ 0 96 11 0 0 0 0 0 ] ide-scsi: hdd: que 78, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 78 ide-scsi: hdd: que 79, cmd = [ 5a 0 2a 0 0 0 0 0 2 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 2 bytes transferred ide-scsi: hdd: suc 79, rst = [ 0 20 ] ide-scsi: hdd: que 80, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 80 ide-scsi: hdd: que 81, cmd = [ 5a 0 2a 0 0 0 0 0 22 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 34 bytes transferred ide-scsi: hdd: suc 81, rst = [ 0 20 11 0 0 0 0 0 2a 18 7 7 71 63 29 23 ] ide-scsi: hdd: que 82, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 82 ide-scsi: hdd: que 83, cmd = [ 5a 0 2a 0 0 0 0 0 22 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 34 bytes transferred ide-scsi: hdd: suc 83, rst = [ 0 20 11 0 0 0 0 0 2a 18 7 7 71 63 29 23 ] ide-scsi: hdd: que 84, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 84 ide-scsi: hdd: que 85, cmd = [ 5a 0 2a 0 0 0 0 0 2 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 2 bytes transferred ide-scsi: hdd: suc 85, rst = [ 0 20 ] ide-scsi: hdd: que 86, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 86 ide-scsi: hdd: que 87, cmd = [ 5a 0 2a 0 0 0 0 0 22 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 34 bytes transferred ide-scsi: hdd: suc 87, rst = [ 0 20 11 0 0 0 0 0 2a 18 7 7 71 63 29 23 ] ide-scsi: hdd: que 88, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 88 ide-scsi: hdd: que 89, cmd = [ 5a 0 2a 0 0 0 0 0 22 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 34 bytes transferred ide-scsi: hdd: suc 89, rst = [ 0 20 11 0 0 0 0 0 2a 18 7 7 71 63 29 23 ] ide-scsi: hdd: que 90, cmd = [ 0 0 0 0 0 0 ] ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 90 ide-scsi: hdd: que 91, cmd = [ 5a 0 2a 0 0 0 0 0 2 0 ] Mandrake Linux release 9.0 (dolphin) for i586 Kernel 2.5.68 on a Dual-processor i686 / l gargoyle.pdx.osdl.net login: hdd: lost interrupt ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: hdd: DMA complete ide-scsi: CoD != 0 in idescsi_pc_intr hdd: DMA disabled ide-scsi: abort ignored ide-scsi: device reset ignored ide-scsi: hdd: que 91, cmd = [ 0 0 0 0 0 0 ] hdd: ATAPI reset complete hdd: irq timeout: status=0x80 { Busy } hdd: ATAPI reset complete hdd: irq timeout: status=0x80 { Busy } hdd: ATAPI reset complete hdd: irq timeout: status=0x80 { Busy } ide-scsi: hdd: I/O error for 91 ide-scsi: Reached idescsi_pc_intr interrupt handler Packet command completed, 0 bytes transferred ide-scsi: hdd: suc 1802201963 Unable to handle kernel paging request at virtual address 6b6b6b8b printing eip: c02d8c52 *pde = 00000000 Oops: 0000 [#1] CPU: 0 EIP: 0060:[<c02d8c52>] Not tainted EFLAGS: 00010286 EIP is at idescsi_end_request+0xee/0x2dc eax: 6b6b6b6b ebx: 00000001 ecx: 00000001 edx: c04332c0 esi: f72bf3c0 edi: c0521aa8 ebp: c04c5eb0 esp: c04c5e80 ds: 007b es: 007b ss: 0068 Process swapper (pid: 0, threadinfo=c04c4000 task=c0430580) Stack: c03d92ab c0521aa8 6b6b6b6b c04332c0 f72bf3c0 f7d5729c f7dd3cc0 00000031 f70b007b 00000000 f72bf3c0 c0521aa8 c04c5eec c02d8ef2 c0521aa8 00000001 00000000 f70b6044 0000001d 00000086 c0293df4 00000000 f7d5729c f7dd3cc0 Call Trace: [<c02d8ef2>] idescsi_pc_intr+0xb2/0x320 [<c0293df4>] ide_timer_expiry+0x1bc/0x2b8 [<c029410a>] ide_intr+0x16e/0x27a [<c02d8e40>] idescsi_pc_intr+0x0/0x320 [<c0293c38>] ide_timer_expiry+0x0/0x2b8 [<c010b508>] handle_IRQ_event+0x3c/0xfe [<c010b889>] do_IRQ+0x101/0x1d6 [<c011a8a2>] schedule+0x21a/0x560 [<c010710e>] default_idle+0x0/0x2e [<c0109f88>] common_interrupt+0x18/0x20 [<c010710e>] default_idle+0x0/0x2e [<c011007b>] get_cmos_time+0x215/0x23c [<c0107138>] default_idle+0x2a/0x2e [<c01071b1>] cpu_idle+0x3b/0x44 [<c0105000>] rest_init+0x0/0x7c [<c04c6843>] start_kernel+0x193/0x19a [<c04c6422>] unknown_bootoption+0x0/0xf6 Code: 8b 78 20 9c 8f 45 e0 fa 8b 5f 34 81 7b 04 ad 4e ad de 74 1c <0>Kernel panic: Fatal exception in interrupt In interrupt handler - not syncing ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-02 16:55 ` Randy.Dunlap @ 2003-05-03 9:03 ` Douglas Gilbert 2003-05-05 8:46 ` Mike Anderson 2003-05-05 16:57 ` Randy.Dunlap 0 siblings, 2 replies; 18+ messages in thread From: Douglas Gilbert @ 2003-05-03 9:03 UTC (permalink / raw) To: Randy.Dunlap; +Cc: linux-scsi, alan Randy.Dunlap wrote: > On Fri, 02 May 2003 18:40:41 +1000 Douglas Gilbert <dougg@torque.net> wrote: > > | Douglas Gilbert wrote: > | > <snip/> > | > - instrument driver with 2 parameters (as examples) > | > - ide-scsi.debug + ide-scsi.suppress_reset > | > [kernel load time] > | > - debug + suppress_reset [module load time] > | > - debug + suppress_reset [sysfs: read/write > | > in /sys/bus/ide/drivers/ide-scsi directory] > | > - downgrade (simple) tag queuing to no tag queuing > | > [set some other things to more conservative values] > | > - add scsi_host::release() method > | > - make provision for 16 byte SCSI commands > | > - cleanup printk()s > | > <snip/> > | > | As Randy has found the setting of kernel boot time parameters > | doesn't work. The problem seems to be that the ide subsystem > | asserts ownership over all parameters that start with "ide". > | I was unable to bend moduleparam.h to accept a leading underscore > | so I went back to the old "__setup" method: the kernel boot time > | parameters are now: > | - _ide_scsi_debug + _ide_scsi_suppress_reset > > I'll confirm that those work. I.e., I saw values 3 and 1 in > /sys/bus/ide/drivers/ide-scsi/*. > > | Module load time and sysfs parameters remain the same. > | > | A new patch against lk 2.5.68 is attached. > > Used that patch instead of v1 of it. Got an oops in ide-scsi. Randy, Thanks for testing this. At this stage I have no solution but can offer some analysis. Firstly, it seems like Mandrake 9.0 has some program like RedHat's "magicdev" that polls cd/dvds every second or so. Nuke it please! It won't fix the problem but will make it easier to see the wood for the trees. Over 50 commands were issued successfully to /dev/hdd until the latter part of your capture file: > ide-scsi: hdd: que 91, cmd = [ 5a 0 2a 0 0 0 0 0 2 0 ] > > Mandrake Linux release 9.0 (dolphin) for i586 > Kernel 2.5.68 on a Dual-processor i686 / l > > gargoyle.pdx.osdl.net login: hdd: lost interrupt > ide-scsi: Reached idescsi_pc_intr interrupt handler > ide-scsi: hdd: DMA complete > ide-scsi: CoD != 0 in idescsi_pc_intr > hdd: DMA disabled > ide-scsi: abort ignored > ide-scsi: device reset ignored > ide-scsi: hdd: que 91, cmd = [ 0 0 0 0 0 0 ] > hdd: ATAPI reset complete > hdd: irq timeout: status=0x80 { Busy } > hdd: ATAPI reset complete > hdd: irq timeout: status=0x80 { Busy } > hdd: ATAPI reset complete > hdd: irq timeout: status=0x80 { Busy } > ide-scsi: hdd: I/O error for 91 > ide-scsi: Reached idescsi_pc_intr interrupt handler > Packet command completed, 0 bytes transferred > ide-scsi: hdd: suc 1802201963 > Unable to handle kernel paging request at virtual address 6b6b6b8b <oooops> The issued command is a MODE SENSE (10 byte) for page 0x2a and just the first 2 bytes (i.e. tell me the length you will return). There must have been 10 of the exact same commands executed successfully prior to this point. "hdd: lost interrupt" seems to indicate a timeout (ide-scsi sets its timeouts to max(10, whatever_app_wants) seconds). Then it re-enters the ide-scsi driver in the idescsi_pc_intr() routine and things look corrupted. It tries to finish off a DMA transfer but that 2 byte data_in request was not set up as DMA. It is all downhill from there. Naturally I don't see any errors on my box. However I can shorten the timeout so it will go off on a READ command. The timeout log looks quite different from yours and my driver continues in a healthy state. ide-scsi: hdd: que 11417, cmd = [ 28 0 0 0 0 0 0 0 1 0 ] hdd: irq timeout: status=0xd0 { Busy } hdd: ATAPI reset complete hdd: irq timeout: status=0xc0 { Busy } hdd: ATAPI reset complete hdd: irq timeout: status=0xc0 { Busy } ide-scsi: hdd: I/O error for 11417 Does anyone have any suggestions for further testing? I have cleaned up the logging messages a little but that does not justify releasing a new version yet. An additional observation concerning Randy's capture: the failed command (MODE SENSE) has serial number 91. After ide-scsi thinks it has cleared the timeout and the DMA transfer it issues a TEST UNIT READY which also get 91 as a serial number. This looks like a bug in the mid level. Doug Gilbert ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-03 9:03 ` Douglas Gilbert @ 2003-05-05 8:46 ` Mike Anderson 2003-05-05 16:57 ` Randy.Dunlap 1 sibling, 0 replies; 18+ messages in thread From: Mike Anderson @ 2003-05-05 8:46 UTC (permalink / raw) To: Douglas Gilbert; +Cc: Randy.Dunlap, linux-scsi, alan Douglas Gilbert [dougg@torque.net] wrote: > The issued command is a MODE SENSE (10 byte) for page 0x2a > and just the first 2 bytes (i.e. tell me the length you > will return). There must have been 10 of the exact same > commands executed successfully prior to this point. > > "hdd: lost interrupt" seems to indicate a timeout (ide-scsi > sets its timeouts to max(10, whatever_app_wants) seconds). > Then it re-enters the ide-scsi driver in the > idescsi_pc_intr() routine and things look corrupted. It tries > to finish off a DMA transfer but that 2 byte data_in request > was not set up as DMA. It is all downhill from there. > > Naturally I don't see any errors on my box. However I can > shorten the timeout so it will go off on a READ command. > The timeout log looks quite different from yours and > my driver continues in a healthy state. > > ide-scsi: hdd: que 11417, cmd = [ 28 0 0 0 0 0 0 0 1 0 ] > hdd: irq timeout: status=0xd0 { Busy } > hdd: ATAPI reset complete > hdd: irq timeout: status=0xc0 { Busy } > hdd: ATAPI reset complete > hdd: irq timeout: status=0xc0 { Busy } > ide-scsi: hdd: I/O error for 11417 > > > Does anyone have any suggestions for further testing? > It might be interesting if timeouts are happening to see what the error handler is doing. If SCSI_LOGGING is config'd. Then echo scsi log error 4 > /proc/scsi/scsi should not be too verbose. > I have cleaned up the logging messages a little but > that does not justify releasing a new version yet. > > > An additional observation concerning Randy's capture: > the failed command (MODE SENSE) has serial number 91. > After ide-scsi thinks it has cleared the timeout and > the DMA transfer it issues a TEST UNIT READY which > also get 91 as a serial number. This looks like a > bug in the mid level. > Well ++serial_number and scsi_pid++ are not very safe, but is the TUR coming from the error handler as it will have the same serial number that the timed out command had. Alan Stern was asking about this in another thread a few days ago. -andmike -- Michael Anderson andmike@us.ibm.com ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-03 9:03 ` Douglas Gilbert 2003-05-05 8:46 ` Mike Anderson @ 2003-05-05 16:57 ` Randy.Dunlap 2003-05-06 8:38 ` Douglas Gilbert 1 sibling, 1 reply; 18+ messages in thread From: Randy.Dunlap @ 2003-05-05 16:57 UTC (permalink / raw) To: dougg; +Cc: linux-scsi, alan On Sat, 03 May 2003 19:03:47 +1000 Douglas Gilbert <dougg@torque.net> wrote: | > | As Randy has found the setting of kernel boot time parameters | > | doesn't work. The problem seems to be that the ide subsystem | > | asserts ownership over all parameters that start with "ide". | > | I was unable to bend moduleparam.h to accept a leading underscore | > | so I went back to the old "__setup" method: the kernel boot time | > | parameters are now: | > | - _ide_scsi_debug + _ide_scsi_suppress_reset | > | > I'll confirm that those work. I.e., I saw values 3 and 1 in | > /sys/bus/ide/drivers/ide-scsi/*. | > | > | Module load time and sysfs parameters remain the same. | > | | > | A new patch against lk 2.5.68 is attached. | > | > Used that patch instead of v1 of it. Got an oops in ide-scsi. | | Randy, | Thanks for testing this. At this stage I have no | solution but can offer some analysis. | | Firstly, it seems like Mandrake 9.0 has some | program like RedHat's "magicdev" that polls | cd/dvds every second or so. Nuke it please! | It won't fix the problem but will make it easier | to see the wood for the trees. I'll happily do that if I can find it, since I don't use it anyway. However, I don't see it. I also don't see any periodic activity after enabling scsi logging as Mike A. mentioned. I did 'cdrecord -scanbus' and then cdrecord -v speed=4 dev=2,0,0 -data lca2003.iso and got the same oops as before (invalid ptr = 6b6b6bxx, into bad memory area). New kernel log file is now at http://www.xenotime.net/linux/capture2.txt Are there any docs on SCSI logging? -- ~Randy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-05 16:57 ` Randy.Dunlap @ 2003-05-06 8:38 ` Douglas Gilbert 2003-05-06 14:39 ` Randy.Dunlap 0 siblings, 1 reply; 18+ messages in thread From: Douglas Gilbert @ 2003-05-06 8:38 UTC (permalink / raw) To: Randy.Dunlap; +Cc: linux-scsi, alan Randy.Dunlap wrote: > On Sat, 03 May 2003 19:03:47 +1000 Douglas Gilbert <dougg@torque.net> wrote: > > | > | As Randy has found the setting of kernel boot time parameters > | > | doesn't work. The problem seems to be that the ide subsystem > | > | asserts ownership over all parameters that start with "ide". > | > | I was unable to bend moduleparam.h to accept a leading underscore > | > | so I went back to the old "__setup" method: the kernel boot time > | > | parameters are now: > | > | - _ide_scsi_debug + _ide_scsi_suppress_reset > | > > | > I'll confirm that those work. I.e., I saw values 3 and 1 in > | > /sys/bus/ide/drivers/ide-scsi/*. > | > > | > | Module load time and sysfs parameters remain the same. > | > | > | > | A new patch against lk 2.5.68 is attached. > | > > | > Used that patch instead of v1 of it. Got an oops in ide-scsi. > | > | Randy, > | Thanks for testing this. At this stage I have no > | solution but can offer some analysis. > | > | Firstly, it seems like Mandrake 9.0 has some > | program like RedHat's "magicdev" that polls > | cd/dvds every second or so. Nuke it please! > | It won't fix the problem but will make it easier > | to see the wood for the trees. > > I'll happily do that if I can find it, since I don't use it anyway. > However, I don't see it. I also don't see any periodic activity > after enabling scsi logging as Mike A. mentioned. > > I did 'cdrecord -scanbus' and then > cdrecord -v speed=4 dev=2,0,0 -data lca2003.iso > and got the same oops as before (invalid ptr = 6b6b6bxx, into > bad memory area). > > > New kernel log file is now at > http://www.xenotime.net/linux/capture2.txt > > Are there any docs on SCSI logging? Randy, The failure looks very similar. This time it timed out and corrupted on a Mode Sense (10) for page 5 (alloc length=2) while last time its failed on a Mode Sense (10) for page 0x2a (alloc length=2). The Mode Sense is not translated (i.e. the app sent a 10 byte Mode Sense). Perhaps there is a weakness when the allocation length is that short. This time 141 commands where sent (last time 91) with lots of Test Unit Readys. Without timestamps it is hard to say whether a "magicdev" type program is at work or libscg (cdrecord transport layer) is solely responsible for that sequence of SCSI commands. [cdrecord may be accessing the cdwriter via the cdrom driver and the sg driver.] Here is the point of failure from Randy's log: ide-scsi: hdd: que 141, cmd = [ 5a 0 5 0 0 0 0 0 2 0 ] hdd: lost interrupt ide-scsi: Reached idescsi_pc_intr interrupt handler ide-scsi: hdd: DMA complete ide-scsi: CoD != 0 in idescsi_pc_intr hdd: DMA disabled Error handler scsi_eh_2 waking up scsi_eh_<pr4>t_hfdadi: lA_TstAaPIts r: e2se:t0: 0co:m0 plcmetdse failiedde:- sc0s,i :c anRecaeclh: ed1 idTeosctsali_ pofc_ 1in ctro mminantdersr oupnt 1 hdanedvliceres rePaqcuikrete ceho mmwoanrdk .... Ouch, after the error handler starts the dump looks like it is in Klingon :-) Lots of data but no information. It could be a result of the corruption or another problem. BTW I ran Kurt Garloff's rescan-scsi-bus.sh script and got this: # lsscsi [0:0:1:0] disk FUJITSU MAM3184MP 0106 /dev/sda [2:0:0:0] cd CREATIVE CD5233E 1.00 /dev/sr0 [2:0:1:0] cd CREATIVE CD5233E 1.00 /dev/sr1 [2:0:2:0] cd CREATIVE CD5233E 1.00 /dev/sr2 [2:0:3:0] cd CREATIVE CD5233E 1.00 /dev/sr3 [2:0:4:0] cd CREATIVE CD5233E 1.00 /dev/sr4 [2:0:5:0] cd CREATIVE CD5233E 1.00 /dev/sr5 [2:0:6:0] cd CREATIVE CD5233E 1.00 /dev/sr6 [2:0:7:0] cd CREATIVE CD5233E 1.00 /dev/sr7 [3:0:6:0] tape SONY SDT-7000 0192 /dev/st0 Another problem ... Doug Gilbert ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-06 8:38 ` Douglas Gilbert @ 2003-05-06 14:39 ` Randy.Dunlap 2003-05-07 13:21 ` Douglas Gilbert 0 siblings, 1 reply; 18+ messages in thread From: Randy.Dunlap @ 2003-05-06 14:39 UTC (permalink / raw) To: dougg; +Cc: linux-scsi, alan On Tue, 06 May 2003 18:38:39 +1000 Douglas Gilbert <dougg@torque.net> wrote: | > New kernel log file is now at | > http://www.xenotime.net/linux/capture2.txt | > | > Are there any docs on SCSI logging? | | Randy, | The failure looks very similar. This time it timed | out and corrupted on a Mode Sense (10) for page 5 | (alloc length=2) while last time its failed on a | Mode Sense (10) for page 0x2a (alloc length=2). | The Mode Sense is not translated (i.e. the app sent a | 10 byte Mode Sense). Perhaps there is a weakness | when the allocation length is that short. | | This time 141 commands where sent (last time 91) | with lots of Test Unit Readys. Without timestamps | it is hard to say whether a "magicdev" type program | is at work or libscg (cdrecord transport layer) is | solely responsible for that sequence of SCSI commands. | [cdrecord may be accessing the cdwriter via the cdrom | driver and the sg driver.] Well, I saw no scsi logging messages except when I ran cdrecord. Does that help any? | Here is the point of failure from Randy's log: | | ide-scsi: hdd: que 141, cmd = [ 5a 0 5 0 0 0 0 0 2 0 ] | hdd: lost interrupt | ide-scsi: Reached idescsi_pc_intr interrupt handler | ide-scsi: hdd: DMA complete | ide-scsi: CoD != 0 in idescsi_pc_intr | hdd: DMA disabled | Error handler scsi_eh_2 waking up | scsi_eh_<pr4>t_hfdadi: lA_TstAaPIts r: e2se:t0: 0co:m0 plcmetdse | failiedde:- sc0s,i :c anRecaeclh: ed1 | idTeosctsali_ pofc_ 1in ctro mminantdersr oupnt 1 hdanedvliceres | rePaqcuikrete ceho mmwoanrdk | .... | | Ouch, after the error handler starts the dump looks like it | is in Klingon :-) Lots of data but no information. It could | be a result of the corruption or another problem. I think (but I'm not sure) that this is because it's a dual-proc system and both procs are garbling the message log. Later on during a panic(), all procs except the current one are disabled, but for early messages, all procs can lead to this. I've seen this on 2 other occasions (non-scsi). -- ~Randy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-06 14:39 ` Randy.Dunlap @ 2003-05-07 13:21 ` Douglas Gilbert 2003-05-07 18:16 ` Randy.Dunlap 0 siblings, 1 reply; 18+ messages in thread From: Douglas Gilbert @ 2003-05-07 13:21 UTC (permalink / raw) To: Randy.Dunlap; +Cc: linux-scsi, alan [-- Attachment #1: Type: text/plain, Size: 3995 bytes --] Randy.Dunlap wrote: > On Tue, 06 May 2003 18:38:39 +1000 Douglas Gilbert <dougg@torque.net> wrote: > > | > New kernel log file is now at > | > http://www.xenotime.net/linux/capture2.txt > | > > | > Are there any docs on SCSI logging? > | > | Randy, > | The failure looks very similar. This time it timed > | out and corrupted on a Mode Sense (10) for page 5 > | (alloc length=2) while last time its failed on a > | Mode Sense (10) for page 0x2a (alloc length=2). > | The Mode Sense is not translated (i.e. the app sent a > | 10 byte Mode Sense). Perhaps there is a weakness > | when the allocation length is that short. > | > | This time 141 commands where sent (last time 91) > | with lots of Test Unit Readys. Without timestamps > | it is hard to say whether a "magicdev" type program > | is at work or libscg (cdrecord transport layer) is > | solely responsible for that sequence of SCSI commands. > | [cdrecord may be accessing the cdwriter via the cdrom > | driver and the sg driver.] > > Well, I saw no scsi logging messages except when I ran > cdrecord. Does that help any? Randy, Now I have a generic atapi cdwriter. With it cdrecord (version in RH9) burnt several small dummy iso images. Some with pio (the default) and others with DMA (after 'hdparm -d 1 /dev/hdb'). Everything worked, here is a link to a tarball of the cdrecord output and the log from "ide-scsi debug=2": http://www.torque.net/sg/p/ide_scsi_capture.tgz The initial sequence from cdrecord looks close to your previous two captures. So maybe cdrecord was producing those sequences. > | Here is the point of failure from Randy's log: > | > | ide-scsi: hdd: que 141, cmd = [ 5a 0 5 0 0 0 0 0 2 0 ] > | hdd: lost interrupt > | ide-scsi: Reached idescsi_pc_intr interrupt handler > | ide-scsi: hdd: DMA complete > | ide-scsi: CoD != 0 in idescsi_pc_intr > | hdd: DMA disabled > | Error handler scsi_eh_2 waking up > | scsi_eh_<pr4>t_hfdadi: lA_TstAaPIts r: e2se:t0: 0co:m0 plcmetdse > | failiedde:- sc0s,i :c anRecaeclh: ed1 > | idTeosctsali_ pofc_ 1in ctro mminantdersr oupnt 1 hdanedvliceres > | rePaqcuikrete ceho mmwoanrdk > | .... > | > | Ouch, after the error handler starts the dump looks like it > | is in Klingon :-) Lots of data but no information. It could > | be a result of the corruption or another problem. > > I think (but I'm not sure) that this is because it's a dual-proc > system and both procs are garbling the message log. Later on during > a panic(), all procs except the current one are disabled, but for > early messages, all procs can lead to this. I've seen this on 2 > other occasions (non-scsi). Do your failures only occur in an SMP environment? Attached is a patch for ide-scsi.c against lk 2.5.69 - fixes multiple answering problem I reported yesterday (and should quieten ide-scsi when CONFIG_SCSI_MULTI_LUN is set) - sets the DMA direction for all data transfers in idescsi_set_direction() [not just read + write] - more debug message cleanups - changes reported earlier in this thread So bugs are being fixed and the driver cleaned up but still no insight into Randy's problem. Now for the strange observation of the day: # modprobe ide-scsi # lsscsi [0:0:1:0] disk FUJITSU MAM3184MP 0106 /dev/sda [2:0:0:0] cd ATAPI CD-RW 48X16 A.RZ /dev/sr0 [3:0:0:0] cd CREATIVE CD5233E 1.00 /dev/sr1 # rmmod ide-scsi # modprobe ide-scsi # lsscsi [0:0:1:0] disk FUJITSU MAM3184MP 0106 /dev/sda [2:0:0:0] cd CREATIVE CD5233E 1.00 /dev/sr0 [3:0:0:0] cd ATAPI CD-RW 48X16 A.RZ /dev/sr1 # rmmod ide-scsi # modprobe ide-scsi # lsscsi [0:0:1:0] disk FUJITSU MAM3184MP 0106 /dev/sda [2:0:0:0] cd ATAPI CD-RW 48X16 A.RZ /dev/sr0 [3:0:0:0] cd CREATIVE CD5233E 1.00 /dev/sr1 and the pattern continues with the two atapi devices changing positions. So much for device naming stability. Doug Gilbert [-- Attachment #2: ide-scsi_2569dpg1.diff.gz --] [-- Type: application/x-gzip, Size: 8758 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-07 13:21 ` Douglas Gilbert @ 2003-05-07 18:16 ` Randy.Dunlap 2003-05-07 21:20 ` Randy.Dunlap 0 siblings, 1 reply; 18+ messages in thread From: Randy.Dunlap @ 2003-05-07 18:16 UTC (permalink / raw) To: dougg; +Cc: linux-scsi, alan Hi Doug- On Wed, 07 May 2003 23:21:43 +1000 Douglas Gilbert <dougg@torque.net> wrote: | Randy.Dunlap wrote: | > On Tue, 06 May 2003 18:38:39 +1000 Douglas Gilbert <dougg@torque.net> wrote: | > | > | > Are there any docs on SCSI logging? | | Do your failures only occur in an SMP environment? Yes, I've been testing only SMP. I just rebooted 2.5.68 + your first ide-scsi patch with "nosmp" and it faults during the boot sequence at EIP is at scsi_eh_offline_sdevs+0x21/0x78. Full log available if wanted. I'll try 2.5.69 plus your patch ver. 2. | and the pattern continues with the two atapi devices changing | positions. So much for device naming stability. 8:( -- ~Randy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-07 18:16 ` Randy.Dunlap @ 2003-05-07 21:20 ` Randy.Dunlap 2003-05-08 0:00 ` Douglas Gilbert 0 siblings, 1 reply; 18+ messages in thread From: Randy.Dunlap @ 2003-05-07 21:20 UTC (permalink / raw) To: dougg; +Cc: linux-scsi, alan | Hi Doug- | On Wed, 07 May 2003 23:21:43 +1000 Douglas Gilbert <dougg@torque.net> wrote: | | | Randy.Dunlap wrote: | | > On Tue, 06 May 2003 18:38:39 +1000 Douglas Gilbert <dougg@torque.net> wrote: | | > | | > | > Are there any docs on SCSI logging? | | | | Do your failures only occur in an SMP environment? | | I'll try 2.5.69 plus your patch ver. 2. 2.5.69 + your ide-scsi ver. 2 patch, booted with "nosmp": hangs/locks up during boot sequence, immediately after this: end_request: I/O error, dev hdc, sector 0 scsi0 : SCSI host adapter emulation for IDE ATAPI devices ide-scsi: hdd: que 1, cmd = [ 12 0 0 0 24 0 ] ide-scsi: abort called for cmd serial #1 ide-scsi: abort ignored ide-scsi: reset called for 1 ide-scsi: device reset partially ignored ide-scsi: hdd: que 1, cmd = [ 0 0 0 0 0 0 ] ide-scsi: abort called for cmd serial #1 ide-scsi: abort ignored scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0 same kernel without "nosmp", so it's dual-proc P4: I ran cdrecord, got to an error, cdrecord had a segfault. Not much help I'm sorry to say. Complete logs are here: http://www.xenotime.net/linux/capt-2569-nosmp-locked.txt and http://www.xenotime.net/linux/capt-2569-isc-dual-segv.txt -- ~Randy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] instrument ide-scsi in 2.5.68 2003-05-07 21:20 ` Randy.Dunlap @ 2003-05-08 0:00 ` Douglas Gilbert 0 siblings, 0 replies; 18+ messages in thread From: Douglas Gilbert @ 2003-05-08 0:00 UTC (permalink / raw) To: Randy.Dunlap; +Cc: linux-scsi Randy.Dunlap wrote: > | Hi Doug- > > | On Wed, 07 May 2003 23:21:43 +1000 Douglas Gilbert <dougg@torque.net> wrote: > | > | | Randy.Dunlap wrote: > | | > On Tue, 06 May 2003 18:38:39 +1000 Douglas Gilbert <dougg@torque.net> wrote: > | | > > | | > | > Are there any docs on SCSI logging? > | | > | | Do your failures only occur in an SMP environment? > | > | I'll try 2.5.69 plus your patch ver. 2. > > 2.5.69 + your ide-scsi ver. 2 patch, booted with "nosmp": > > hangs/locks up during boot sequence, immediately after this: > > end_request: I/O error, dev hdc, sector 0 > scsi0 : SCSI host adapter emulation for IDE ATAPI devices > ide-scsi: hdd: que 1, cmd = [ 12 0 0 0 24 0 ] > ide-scsi: abort called for cmd serial #1 > ide-scsi: abort ignored > ide-scsi: reset called for 1 > ide-scsi: device reset partially ignored > ide-scsi: hdd: que 1, cmd = [ 0 0 0 0 0 0 ] > ide-scsi: abort called for cmd serial #1 > ide-scsi: abort ignored > scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0 > > > same kernel without "nosmp", so it's dual-proc P4: > > I ran cdrecord, got to an error, cdrecord had a segfault. > > Not much help I'm sorry to say. > > Complete logs are here: > http://www.xenotime.net/linux/capt-2569-nosmp-locked.txt Randy, The "nosmp" failed on the very first SCSI command sent to the device (36 byte INQUIRY during scan). Looks like the 'rq->flags' were cleared (corruption) and the end_request was routed through ide_end_request(). Good to see the scsi error handling took the device offline. Did the device lock up or the whole machine? > and > http://www.xenotime.net/linux/capt-2569-isc-dual-segv.txt Got a bit further than last time but failed on the same command, again with corrupted fields. Perhaps you could send me your .config file so I can set up my IDE subsystem the same way as yours. Doug Gilbert ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2003-05-07 23:45 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2003-05-01 13:02 [PATCH] instrument ide-scsi in 2.5.68 Douglas Gilbert 2003-05-01 22:52 ` Randy.Dunlap 2003-05-01 22:57 ` Randy.Dunlap 2003-05-01 23:29 ` Douglas Gilbert 2003-05-01 23:44 ` Willem Riede 2003-05-02 0:38 ` Randy.Dunlap 2003-05-02 1:10 ` Douglas Gilbert 2003-05-02 8:40 ` Douglas Gilbert 2003-05-02 16:55 ` Randy.Dunlap 2003-05-03 9:03 ` Douglas Gilbert 2003-05-05 8:46 ` Mike Anderson 2003-05-05 16:57 ` Randy.Dunlap 2003-05-06 8:38 ` Douglas Gilbert 2003-05-06 14:39 ` Randy.Dunlap 2003-05-07 13:21 ` Douglas Gilbert 2003-05-07 18:16 ` Randy.Dunlap 2003-05-07 21:20 ` Randy.Dunlap 2003-05-08 0:00 ` Douglas Gilbert
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox