linux-ide.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [2.6.30-rc2] CD-R: wodim intermittent failures: [sr0] Add. Sense: Logical block address out of range, sector 0
       [not found] <20090421015234.GA7014@hexapodia.org>
@ 2009-04-22 21:49 ` Andrew Morton
  2009-04-23  6:46   ` Robert Hancock
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2009-04-22 21:49 UTC (permalink / raw)
  To: Andy Isaacson; +Cc: linux-kernel, linux-ide

On Mon, 20 Apr 2009 18:52:34 -0700
Andy Isaacson <adi@hexapodia.org> wrote:

> I'm running 2.6.30-rc2-00446-ga939b96 on a Dell E4300 with a
> upgraded-to-Jaunty-mostly Ubuntu install.  Using 2.6.28 and 2.6.29.1,
> I've successfully burned a dozen CDRs, but since updating to .30-rc1
> I've noticed intermittent failures to burn CDRs using wodim at the Gnome
> desktop.  (Switching back to .29.1 makes wodim reliable again.)  There
> have been a few different failure modes, but all of them seem to be
> associated with these messages in dmesg, which I haven't seen before:
> 
> [  360.740810] sr 1:0:0:0: [sr0] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> [  360.740816] sr 1:0:0:0: [sr0] Sense Key : Illegal Request [current] 
> [  360.740820] sr 1:0:0:0: [sr0] Add. Sense: Logical block address out of range
> [  360.740826] end_request: I/O error, dev sr0, sector 0
> [  360.740830] Buffer I/O error on device sr0, logical block 0
> 
> (repeated a few dozen times).
> 
> I haven't been able to reproduce these messages running wodim at the
> console (after "/etc/init.d/gdm stop") so I suspect there's some
> interference between wodim and Gnome's desktop device detection.
> 
> I've seen the following behavior running wodim under gnome:
> 
> 1. wodim blocked in D state for >10 minutes, system showing interactive
> lagginess.  Unfortunately I've not been able to reproduce this to get
> wchan info, but I *think* it was showing blk_execute_rq.
> 
> 2. running wodim triggers DID_OK messages, but wodim seems to be able to
> complete writing the ISO and the resulting CDR reads OK.
> 
> 3. running wodim triggers DID_OK messages, and wodim fails:  (but I
> think this is just due to the media having been partially written by a
> previous iteration of wodim.)
> 
> % wodim ubuntu-9.04-rc-desktop-amd64.iso
> wodim: No write mode specified.
> wodim: Asuming -tao mode.
> wodim: Future versions of wodim may have different drive dependent defaults.
> wodim: Operation not permitted. Warning: Cannot raise RLIMIT_MEMLOCK limits.Device was not specified. Trying to find an appropriate drive...
> Looking for a CD-R drive to store 698.17 MiB...
> Detected CD-R drive: /dev/cdrw
> Using /dev/cdrom of unknown capabilities
> Device type    : Removable CD-ROM
> Version        : 5
> Response Format: 2
> Capabilities   : 
> Vendor_info    : 'TSSTcorp'
> Identification : 'DVD+-RW TS-U633A'
> Revision       : 'D200'
> Device seems to be: Generic mmc2 DVD-R/DVD-RW.
> Using generic SCSI-3/mmc   CD-R/CD-RW driver (mmc_cdr).
> Driver flags   : MMC-3 SWABAUDIO BURNFREE 
> Supported modes: TAO PACKET SAO SAO/R96P SAO/R96R RAW/R16 RAW/R96P RAW/R96R
> Speed set to 4234 KB/s
> Starting to write CD/DVD at speed  24.0 in real TAO mode for single session.
> Last chance to quit, starting real write in    0 seconds. Operation starts.
> Errno: 5 (Input/output error), write_g1 scsi sendcmd: no error
> CDB:  2A 00 FF FF FF FF 00 00 1F 00
> status: 0x2 (CHECK CONDITION)
> Sense Bytes: 70 00 05 00 00 00 00 0A 00 00 00 00 21 00 00 00
> Sense Key: 0x5 Illegal Request, Segment 0
> Sense Code: 0x21 Qual 0x00 (logical block address out of range) Fru 0x0
> Sense flags: Blk 0 (not valid) 
> cmd finished after 0.003s timeout 40s
> write track data: error after 0 bytes
> wodim: The current problem looks like a buffer underrun.
> wodim: It looks like 'driveropts=burnfree' does not work for this drive.
> wodim: Please report.
> wodim: Make sure that you are root, enable DMA and check your HW/OS set up.
> Errno: 5 (Input/output error), close track/session scsi sendcmd: no error
> CDB:  5B 00 02 00 00 00 00 00 00 00
> status: 0x2 (CHECK CONDITION)
> Sense Bytes: 70 00 05 00 00 00 00 0A 00 00 00 00 72 03 00 00
> Sense Key: 0x5 Illegal Request, Segment 0
> Sense Code: 0x72 Qual 0x03 (session fixation error - incomplete track in session) Fru 0x0
> Sense flags: Blk 0 (not valid) 
> cmd finished after 0.340s timeout 480s
> cmd finished after 0.340s timeout 480s
> wodim: Cannot fixate disk.
> 
> I'm running "wodim ubuntu-9.04-rc-desktop-amd64.iso" as a regular user
> with group write permissions to /dev/sr0.
> 
> I've seen /lib/udev/vol_id running *after* wodim has opened the device,
> so I wonder if there's some race condition there.
> 
> This is with udev 140-2 and wodim 9:1.1.9-1ubuntu1.  .config and
> additional information available at:
> 
> http://web.hexapodia.org/~adi/sysinfo/1240274989_cvpe4300_2.6.30-rc2-00446-ga939b96/
> 

Is seems unreasonably hard (to me) to work out what low-level driver is
in use when we see bug reports like this.

Trolling your dmesg
(http://web.hexapodia.org/~adi/sysinfo/1240274989_cvpe4300_2.6.30-rc2-00446-ga939b96/dmesg.out)
I see 

[    2.184016] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    2.198709] ata2.00: ATAPI: TSSTcorp DVD+/-RW TS-U633A, D200, max UDMA/100, ATAPI AN
[    2.198774] ata2.00: applying bridge limits
[    2.214107] ata2.00: configured for UDMA/100
[    2.230305] scsi 1:0:0:0: CD-ROM            TSSTcorp DVD+-RW TS-U633A D200 PQ: 0 ANSI: 5
[    2.548015] ata5: SATA link down (SStatus 0 SControl 300)
[    2.884014] ata6: SATA link down (SStatus 0 SControl 300)

so I guess that it's serial ATA, at least.

But which lower-level ata driver is being used under that?

[    0.717960] ahci 0000:00:1f.2: version 3.0
[    0.717968] ahci 0000:00:1f.2: PCI INT D -> GSI 19 (level, low) -> IRQ 19
[    0.718057] ahci 0000:00:1f.2: irq 27 for MSI/MSI-X
[    0.718097] ahci: SSS flag set, parallel bus scan disabled
[    0.718176] ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 4 ports 3 Gbps 0x33 impl RAID mode
[    0.718236] ahci 0000:00:1f.2: flags: 64bit ncq sntf stag pm led clo pmp pio slum part ems 
[    0.718298] ahci 0000:00:1f.2: setting latency timer to 64

ahci, I guess.


Probably this is all perfectly obvious to the initiated, but for those
whose life revolves around telling the initiated about their bugs, this
is all unreasonably hard :(

Oh well, let's tentatively assume that we have a post-2.6.29 regression
in the libata ahci driver.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [2.6.30-rc2] CD-R: wodim intermittent failures: [sr0] Add.   Sense: Logical block address out of range, sector 0
  2009-04-22 21:49 ` [2.6.30-rc2] CD-R: wodim intermittent failures: [sr0] Add. Sense: Logical block address out of range, sector 0 Andrew Morton
@ 2009-04-23  6:46   ` Robert Hancock
  2009-04-23  7:07     ` Andy Isaacson
  0 siblings, 1 reply; 8+ messages in thread
From: Robert Hancock @ 2009-04-23  6:46 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Andy Isaacson, linux-kernel, linux-ide

Andrew Morton wrote:
> On Mon, 20 Apr 2009 18:52:34 -0700
> Andy Isaacson <adi@hexapodia.org> wrote:
> 
>> I'm running 2.6.30-rc2-00446-ga939b96 on a Dell E4300 with a
>> upgraded-to-Jaunty-mostly Ubuntu install.  Using 2.6.28 and 2.6.29.1,
>> I've successfully burned a dozen CDRs, but since updating to .30-rc1
>> I've noticed intermittent failures to burn CDRs using wodim at the Gnome
>> desktop.  (Switching back to .29.1 makes wodim reliable again.)  There
>> have been a few different failure modes, but all of them seem to be
>> associated with these messages in dmesg, which I haven't seen before:
>>
>> [  360.740810] sr 1:0:0:0: [sr0] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
>> [  360.740816] sr 1:0:0:0: [sr0] Sense Key : Illegal Request [current] 
>> [  360.740820] sr 1:0:0:0: [sr0] Add. Sense: Logical block address out of range
>> [  360.740826] end_request: I/O error, dev sr0, sector 0
>> [  360.740830] Buffer I/O error on device sr0, logical block 0

...

> Oh well, let's tentatively assume that we have a post-2.6.29 regression
> in the libata ahci driver.

I tend to doubt it's a problem in ahci itself, I would guess it's higher 
up the stack. Or could be that it's not a kernel bug, just some change 
in timing that triggers the problem.

What's supposed to stop udev/hal from poking at the drive while wodim is 
writing to the disc, anyway?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [2.6.30-rc2] CD-R: wodim intermittent failures: [sr0] Add. Sense: Logical block address out of range, sector 0
  2009-04-23  6:46   ` Robert Hancock
@ 2009-04-23  7:07     ` Andy Isaacson
  2009-05-12 21:52       ` Andy Isaacson
  0 siblings, 1 reply; 8+ messages in thread
From: Andy Isaacson @ 2009-04-23  7:07 UTC (permalink / raw)
  To: Robert Hancock; +Cc: Andrew Morton, linux-kernel, linux-ide

On Thu, Apr 23, 2009 at 12:46:19AM -0600, Robert Hancock wrote:
> >Oh well, let's tentatively assume that we have a post-2.6.29 regression
> >in the libata ahci driver.
> 
> I tend to doubt it's a problem in ahci itself, I would guess it's higher 
> up the stack. Or could be that it's not a kernel bug, just some change 
> in timing that triggers the problem.
> 
> What's supposed to stop udev/hal from poking at the drive while wodim is 
> writing to the disc, anyway?

wodim opens O_EXCL, and it does occasionally complain about not being
able to open, then sleeping and retrying.

-andy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [2.6.30-rc2] CD-R: wodim intermittent failures: [sr0] Add. Sense: Logical block address out of range, sector 0
  2009-04-23  7:07     ` Andy Isaacson
@ 2009-05-12 21:52       ` Andy Isaacson
  2009-05-12 23:23         ` Robert Hancock
  0 siblings, 1 reply; 8+ messages in thread
From: Andy Isaacson @ 2009-05-12 21:52 UTC (permalink / raw)
  To: Robert Hancock; +Cc: Andrew Morton, linux-kernel, linux-ide

On Thu, Apr 23, 2009 at 12:07:13AM -0700, Andy Isaacson wrote:
> On Thu, Apr 23, 2009 at 12:46:19AM -0600, Robert Hancock wrote:
> > >Oh well, let's tentatively assume that we have a post-2.6.29 regression
> > >in the libata ahci driver.
> > 
> > I tend to doubt it's a problem in ahci itself, I would guess it's higher 
> > up the stack. Or could be that it's not a kernel bug, just some change 
> > in timing that triggers the problem.
> > 
> > What's supposed to stop udev/hal from poking at the drive while wodim is 
> > writing to the disc, anyway?
> 
> wodim opens O_EXCL, and it does occasionally complain about not being
> able to open, then sleeping and retrying.

I just saw this again with 2.6.30-rc5-00096-ga4d7749 and got some nice
stack traces, for both hald-addon-storage and wodim:

[ 2041.201084] INFO: task hald-addon-stor:3804 blocked for more than 120 seconds.
[ 2041.201108]  ffff8801145056f8 0000000000000086 ffff880118c66120 0000000000000086
[ 2041.201118]  00000000000110c0 000000000000c848 ffff8801144b2cc0 ffff8800d8008000
[ 2041.201127]  ffff8801144b3060 0000000118ddd460 ffff8801145056b8 ffffffff81296405
[ 2041.201135] Call Trace:
[ 2041.201153]  [<ffffffff81296405>] ? scsi_setup_blk_pc_cmnd+0x145/0x180
[ 2041.201162]  [<ffffffff8128f32c>] ? scsi_pool_alloc_command+0x2c/0x80
[ 2041.201171]  [<ffffffff812a70ad>] ? sr_prep_fn+0x4d/0x5b0
[ 2041.201180]  [<ffffffff81413ae9>] schedule+0x9/0x20
[ 2041.201187]  [<ffffffff81413fa5>] schedule_timeout+0x145/0x180
[ 2041.201196]  [<ffffffff811d32a7>] ? kobject_put+0x27/0x60
[ 2041.201203]  [<ffffffff81413d4d>] wait_for_common+0x14d/0x180
[ 2041.201214]  [<ffffffff81038e30>] ? default_wake_function+0x0/0x10
[ 2041.201224]  [<ffffffff811c02b3>] ? __generic_unplug_device+0x33/0x40
[ 2041.201231]  [<ffffffff81413e18>] wait_for_completion+0x18/0x20
[ 2041.201239]  [<ffffffff811c4cd5>] blk_execute_rq+0x65/0xb0
[ 2041.201246]  [<ffffffff811c0fd0>] ? freed_request+0x30/0x60
[ 2041.201255]  [<ffffffff81296f22>] scsi_execute+0xf2/0x150
[ 2041.201262]  [<ffffffff812971a8>] scsi_execute_req+0xd8/0x150
[ 2041.201271]  [<ffffffff812a6796>] sr_test_unit_ready+0x66/0x100
[ 2041.201280]  [<ffffffff810d6e49>] ? kmem_cache_alloc+0x79/0x110
[ 2041.201288]  [<ffffffff812a76ca>] sr_media_change+0xba/0x2f0
[ 2041.201298]  [<ffffffff8105d6dc>] ? lock_hrtimer_base+0x2c/0x60
[ 2041.201308]  [<ffffffff8130f3b8>] media_changed+0x68/0xb0
[ 2041.201316]  [<ffffffff8130f437>] cdrom_media_changed+0x37/0x40
[ 2041.201324]  [<ffffffff812a68a4>] sr_block_media_changed+0x14/0x20
[ 2041.201333]  [<ffffffff811090fe>] check_disk_change+0x2e/0x70
[ 2041.201341]  [<ffffffff81313ddb>] cdrom_open+0x1ab/0xb60
[ 2041.201349]  [<ffffffff8105a566>] ? remove_wait_queue+0x46/0x60
[ 2041.201357]  [<ffffffff810ee496>] ? poll_freewait+0x46/0xb0
[ 2041.201365]  [<ffffffff810ee8e9>] ? do_sys_poll+0x3e9/0x4c0
[ 2041.201372]  [<ffffffff810ef340>] ? __pollwait+0x0/0x100
[ 2041.201380]  [<ffffffff810ef440>] ? pollwake+0x0/0x50
[ 2041.201387]  [<ffffffff811d361a>] ? kobject_get+0x1a/0x30
[ 2041.201394]  [<ffffffff811c70bf>] ? get_disk+0x2f/0x90
[ 2041.201401]  [<ffffffff811d361a>] ? kobject_get+0x1a/0x30
[ 2041.201409]  [<ffffffff812a6a6a>] sr_block_open+0x7a/0xd0
[ 2041.201417]  [<ffffffff8110a24e>] __blkdev_get+0x21e/0x3a0
[ 2041.201426]  [<ffffffff810f761a>] ? mntput_no_expire+0x2a/0x140
[ 2041.201433]  [<ffffffff8110a3db>] blkdev_get+0xb/0x10
[ 2041.201440]  [<ffffffff8110a44c>] blkdev_open+0x6c/0xc0
[ 2041.201449]  [<ffffffff810dc5d6>] __dentry_open+0xe6/0x2e0
[ 2041.201456]  [<ffffffff8110a3e0>] ? blkdev_open+0x0/0xc0
[ 2041.201463]  [<ffffffff810dc8d7>] nameidata_to_filp+0x57/0x70
[ 2041.201471]  [<ffffffff810eb6f2>] do_filp_open+0x2a2/0x9d0
[ 2041.201479]  [<ffffffff810cb5ad>] ? free_pages_and_swap_cache+0x8d/0xb0
[ 2041.201489]  [<ffffffff810c2640>] ? unmap_region+0x140/0x160
[ 2041.201496]  [<ffffffff810f5c8a>] ? alloc_fd+0x4a/0x140
[ 2041.201503]  [<ffffffff810dc41b>] do_sys_open+0x7b/0x100
[ 2041.201510]  [<ffffffff810dc4cb>] sys_open+0x1b/0x20
[ 2041.201519]  [<ffffffff8100bf2b>] system_call_fastpath+0x16/0x1b
[ 2041.201542] INFO: task wodim:6398 blocked for more than 120 seconds.
[ 2041.201562] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2041.201568] wodim         D 0000000000000000     0  6398   5313
[ 2041.201575]  ffff8800d805d868 0000000000000082 ffff880118ed40d0 0000000000000000
[ 2041.201584]  00000000000110c0 000000000000c848 ffff8800d4164320 ffff88011b884320
[ 2041.201593]  ffff8800d41646c0 0000000118ed5e48 000000010005d5b2 ffff880118ed4000
[ 2041.201602] Call Trace:
[ 2041.201610]  [<ffffffff81413ae9>] schedule+0x9/0x20
[ 2041.201617]  [<ffffffff81413fa5>] schedule_timeout+0x145/0x180
[ 2041.201625]  [<ffffffff81295590>] ? scsi_request_fn+0xd0/0x570
[ 2041.201632]  [<ffffffff81413d4d>] wait_for_common+0x14d/0x180
[ 2041.201641]  [<ffffffff81038e30>] ? default_wake_function+0x0/0x10
[ 2041.201649]  [<ffffffff811c029a>] ? __generic_unplug_device+0x1a/0x40
[ 2041.201656]  [<ffffffff81413e18>] wait_for_completion+0x18/0x20
[ 2041.201663]  [<ffffffff811c4cd5>] blk_execute_rq+0x65/0xb0
[ 2041.201669]  [<ffffffff811c4896>] ? blk_rq_append_bio+0x26/0x70
[ 2041.201676]  [<ffffffff811c4a8a>] ? blk_rq_map_user+0x1aa/0x2a0
[ 2041.201684]  [<ffffffff8104bc4e>] ? capable+0xe/0x50
[ 2041.201691]  [<ffffffff811c8734>] sg_io+0x1d4/0x400
[ 2041.201699]  [<ffffffff810a8cd2>] ? mempool_free_slab+0x12/0x20
[ 2041.201707]  [<ffffffff811c9015>] scsi_cmd_ioctl+0x2e5/0x4d0
[ 2041.201715]  [<ffffffff81038c48>] ? try_to_wake_up+0x148/0x330
[ 2041.201723]  [<ffffffff81038e3d>] ? default_wake_function+0xd/0x10
[ 2041.201732]  [<ffffffff81312add>] cdrom_ioctl+0x3d/0x1050
[ 2041.201741]  [<ffffffff8124d1f2>] ? n_tty_receive_buf+0x772/0x11e0
[ 2041.201750]  [<ffffffff812a690b>] sr_block_ioctl+0x5b/0xb0
[ 2041.201757]  [<ffffffff811c606e>] __blkdev_driver_ioctl+0x8e/0xa0
[ 2041.201765]  [<ffffffff811c616d>] blkdev_ioctl+0xad/0x9e0
[ 2041.201772]  [<ffffffff810a6df4>] ? filemap_fault+0x144/0x450
[ 2041.201778]  [<ffffffff810a68f2>] ? unlock_page+0x22/0x30
[ 2041.201786]  [<ffffffff810bc2bd>] ? __do_fault+0x42d/0x4e0
[ 2041.201795]  [<ffffffff81108eb5>] block_ioctl+0x35/0x40
[ 2041.201802]  [<ffffffff810ed0c1>] vfs_ioctl+0x31/0xa0
[ 2041.201809]  [<ffffffff810ed1ba>] do_vfs_ioctl+0x8a/0x570
[ 2041.201816]  [<ffffffff811d77b1>] ? __up_read+0x91/0xb0
[ 2041.201824]  [<ffffffff8105dff9>] ? up_read+0x9/0x10
[ 2041.201831]  [<ffffffff810ed739>] sys_ioctl+0x99/0xa0
[ 2041.201840]  [<ffffffff8100bf2b>] system_call_fastpath+0x16/0x1b

Looking at hald/linux/addons/addon-storage.c it does appear to
open(O_EXCL) as well, so perhaps the O_EXCL multiple-open logic broke
recently?

Complete dmesg, kconfig, and so on at

http://web.hexapodia.org/~adi/sysinfo/1242164411_cvpe4300_2.6.30-rc5-dmar-00096-ga4d7749/

-andy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [2.6.30-rc2] CD-R: wodim intermittent failures: [sr0] Add. Sense: Logical block address out of range, sector 0
  2009-05-12 21:52       ` Andy Isaacson
@ 2009-05-12 23:23         ` Robert Hancock
  2009-05-13  1:00           ` Andy Isaacson
  0 siblings, 1 reply; 8+ messages in thread
From: Robert Hancock @ 2009-05-12 23:23 UTC (permalink / raw)
  To: Andy Isaacson; +Cc: Andrew Morton, linux-kernel, linux-ide, linux-scsi

Andy Isaacson wrote:
> On Thu, Apr 23, 2009 at 12:07:13AM -0700, Andy Isaacson wrote:
>> On Thu, Apr 23, 2009 at 12:46:19AM -0600, Robert Hancock wrote:
>>>> Oh well, let's tentatively assume that we have a post-2.6.29 regression
>>>> in the libata ahci driver.
>>> I tend to doubt it's a problem in ahci itself, I would guess it's higher 
>>> up the stack. Or could be that it's not a kernel bug, just some change 
>>> in timing that triggers the problem.
>>>
>>> What's supposed to stop udev/hal from poking at the drive while wodim is 
>>> writing to the disc, anyway?
>> wodim opens O_EXCL, and it does occasionally complain about not being
>> able to open, then sleeping and retrying.
> 
> I just saw this again with 2.6.30-rc5-00096-ga4d7749 and got some nice
> stack traces, for both hald-addon-storage and wodim:
> 
> [ 2041.201084] INFO: task hald-addon-stor:3804 blocked for more than 120 seconds.
> [ 2041.201108]  ffff8801145056f8 0000000000000086 ffff880118c66120 0000000000000086
> [ 2041.201118]  00000000000110c0 000000000000c848 ffff8801144b2cc0 ffff8800d8008000
> [ 2041.201127]  ffff8801144b3060 0000000118ddd460 ffff8801145056b8 ffffffff81296405
> [ 2041.201135] Call Trace:
> [ 2041.201153]  [<ffffffff81296405>] ? scsi_setup_blk_pc_cmnd+0x145/0x180
> [ 2041.201162]  [<ffffffff8128f32c>] ? scsi_pool_alloc_command+0x2c/0x80
> [ 2041.201171]  [<ffffffff812a70ad>] ? sr_prep_fn+0x4d/0x5b0
> [ 2041.201180]  [<ffffffff81413ae9>] schedule+0x9/0x20
> [ 2041.201187]  [<ffffffff81413fa5>] schedule_timeout+0x145/0x180
> [ 2041.201196]  [<ffffffff811d32a7>] ? kobject_put+0x27/0x60
> [ 2041.201203]  [<ffffffff81413d4d>] wait_for_common+0x14d/0x180
> [ 2041.201214]  [<ffffffff81038e30>] ? default_wake_function+0x0/0x10
> [ 2041.201224]  [<ffffffff811c02b3>] ? __generic_unplug_device+0x33/0x40
> [ 2041.201231]  [<ffffffff81413e18>] wait_for_completion+0x18/0x20
> [ 2041.201239]  [<ffffffff811c4cd5>] blk_execute_rq+0x65/0xb0
> [ 2041.201246]  [<ffffffff811c0fd0>] ? freed_request+0x30/0x60
> [ 2041.201255]  [<ffffffff81296f22>] scsi_execute+0xf2/0x150
> [ 2041.201262]  [<ffffffff812971a8>] scsi_execute_req+0xd8/0x150
> [ 2041.201271]  [<ffffffff812a6796>] sr_test_unit_ready+0x66/0x100
> [ 2041.201280]  [<ffffffff810d6e49>] ? kmem_cache_alloc+0x79/0x110
> [ 2041.201288]  [<ffffffff812a76ca>] sr_media_change+0xba/0x2f0
> [ 2041.201298]  [<ffffffff8105d6dc>] ? lock_hrtimer_base+0x2c/0x60
> [ 2041.201308]  [<ffffffff8130f3b8>] media_changed+0x68/0xb0
> [ 2041.201316]  [<ffffffff8130f437>] cdrom_media_changed+0x37/0x40
> [ 2041.201324]  [<ffffffff812a68a4>] sr_block_media_changed+0x14/0x20
> [ 2041.201333]  [<ffffffff811090fe>] check_disk_change+0x2e/0x70
> [ 2041.201341]  [<ffffffff81313ddb>] cdrom_open+0x1ab/0xb60
> [ 2041.201349]  [<ffffffff8105a566>] ? remove_wait_queue+0x46/0x60
> [ 2041.201357]  [<ffffffff810ee496>] ? poll_freewait+0x46/0xb0
> [ 2041.201365]  [<ffffffff810ee8e9>] ? do_sys_poll+0x3e9/0x4c0
> [ 2041.201372]  [<ffffffff810ef340>] ? __pollwait+0x0/0x100
> [ 2041.201380]  [<ffffffff810ef440>] ? pollwake+0x0/0x50
> [ 2041.201387]  [<ffffffff811d361a>] ? kobject_get+0x1a/0x30
> [ 2041.201394]  [<ffffffff811c70bf>] ? get_disk+0x2f/0x90
> [ 2041.201401]  [<ffffffff811d361a>] ? kobject_get+0x1a/0x30
> [ 2041.201409]  [<ffffffff812a6a6a>] sr_block_open+0x7a/0xd0
> [ 2041.201417]  [<ffffffff8110a24e>] __blkdev_get+0x21e/0x3a0
> [ 2041.201426]  [<ffffffff810f761a>] ? mntput_no_expire+0x2a/0x140
> [ 2041.201433]  [<ffffffff8110a3db>] blkdev_get+0xb/0x10
> [ 2041.201440]  [<ffffffff8110a44c>] blkdev_open+0x6c/0xc0
> [ 2041.201449]  [<ffffffff810dc5d6>] __dentry_open+0xe6/0x2e0
> [ 2041.201456]  [<ffffffff8110a3e0>] ? blkdev_open+0x0/0xc0
> [ 2041.201463]  [<ffffffff810dc8d7>] nameidata_to_filp+0x57/0x70
> [ 2041.201471]  [<ffffffff810eb6f2>] do_filp_open+0x2a2/0x9d0
> [ 2041.201479]  [<ffffffff810cb5ad>] ? free_pages_and_swap_cache+0x8d/0xb0
> [ 2041.201489]  [<ffffffff810c2640>] ? unmap_region+0x140/0x160
> [ 2041.201496]  [<ffffffff810f5c8a>] ? alloc_fd+0x4a/0x140
> [ 2041.201503]  [<ffffffff810dc41b>] do_sys_open+0x7b/0x100
> [ 2041.201510]  [<ffffffff810dc4cb>] sys_open+0x1b/0x20
> [ 2041.201519]  [<ffffffff8100bf2b>] system_call_fastpath+0x16/0x1b
> [ 2041.201542] INFO: task wodim:6398 blocked for more than 120 seconds.
> [ 2041.201562] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 2041.201568] wodim         D 0000000000000000     0  6398   5313
> [ 2041.201575]  ffff8800d805d868 0000000000000082 ffff880118ed40d0 0000000000000000
> [ 2041.201584]  00000000000110c0 000000000000c848 ffff8800d4164320 ffff88011b884320
> [ 2041.201593]  ffff8800d41646c0 0000000118ed5e48 000000010005d5b2 ffff880118ed4000
> [ 2041.201602] Call Trace:
> [ 2041.201610]  [<ffffffff81413ae9>] schedule+0x9/0x20
> [ 2041.201617]  [<ffffffff81413fa5>] schedule_timeout+0x145/0x180
> [ 2041.201625]  [<ffffffff81295590>] ? scsi_request_fn+0xd0/0x570
> [ 2041.201632]  [<ffffffff81413d4d>] wait_for_common+0x14d/0x180
> [ 2041.201641]  [<ffffffff81038e30>] ? default_wake_function+0x0/0x10
> [ 2041.201649]  [<ffffffff811c029a>] ? __generic_unplug_device+0x1a/0x40
> [ 2041.201656]  [<ffffffff81413e18>] wait_for_completion+0x18/0x20
> [ 2041.201663]  [<ffffffff811c4cd5>] blk_execute_rq+0x65/0xb0
> [ 2041.201669]  [<ffffffff811c4896>] ? blk_rq_append_bio+0x26/0x70
> [ 2041.201676]  [<ffffffff811c4a8a>] ? blk_rq_map_user+0x1aa/0x2a0
> [ 2041.201684]  [<ffffffff8104bc4e>] ? capable+0xe/0x50
> [ 2041.201691]  [<ffffffff811c8734>] sg_io+0x1d4/0x400
> [ 2041.201699]  [<ffffffff810a8cd2>] ? mempool_free_slab+0x12/0x20
> [ 2041.201707]  [<ffffffff811c9015>] scsi_cmd_ioctl+0x2e5/0x4d0
> [ 2041.201715]  [<ffffffff81038c48>] ? try_to_wake_up+0x148/0x330
> [ 2041.201723]  [<ffffffff81038e3d>] ? default_wake_function+0xd/0x10
> [ 2041.201732]  [<ffffffff81312add>] cdrom_ioctl+0x3d/0x1050
> [ 2041.201741]  [<ffffffff8124d1f2>] ? n_tty_receive_buf+0x772/0x11e0
> [ 2041.201750]  [<ffffffff812a690b>] sr_block_ioctl+0x5b/0xb0
> [ 2041.201757]  [<ffffffff811c606e>] __blkdev_driver_ioctl+0x8e/0xa0
> [ 2041.201765]  [<ffffffff811c616d>] blkdev_ioctl+0xad/0x9e0
> [ 2041.201772]  [<ffffffff810a6df4>] ? filemap_fault+0x144/0x450
> [ 2041.201778]  [<ffffffff810a68f2>] ? unlock_page+0x22/0x30
> [ 2041.201786]  [<ffffffff810bc2bd>] ? __do_fault+0x42d/0x4e0
> [ 2041.201795]  [<ffffffff81108eb5>] block_ioctl+0x35/0x40
> [ 2041.201802]  [<ffffffff810ed0c1>] vfs_ioctl+0x31/0xa0
> [ 2041.201809]  [<ffffffff810ed1ba>] do_vfs_ioctl+0x8a/0x570
> [ 2041.201816]  [<ffffffff811d77b1>] ? __up_read+0x91/0xb0
> [ 2041.201824]  [<ffffffff8105dff9>] ? up_read+0x9/0x10
> [ 2041.201831]  [<ffffffff810ed739>] sys_ioctl+0x99/0xa0
> [ 2041.201840]  [<ffffffff8100bf2b>] system_call_fastpath+0x16/0x1b
> 
> Looking at hald/linux/addons/addon-storage.c it does appear to
> open(O_EXCL) as well, so perhaps the O_EXCL multiple-open logic broke
> recently?
> 
> Complete dmesg, kconfig, and so on at
> 
> http://web.hexapodia.org/~adi/sysinfo/1242164411_cvpe4300_2.6.30-rc5-dmar-00096-ga4d7749/
> 
> -andy

I'm not sure how this is supposed to work. The only place I can see any 
O_EXCL handling is in the sg driver. But if you access the sr device 
directly, it doesn't seem like that would have any effect.

CCing linux-scsi. Could be there's a bad user-space assumption here, or 
something.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [2.6.30-rc2] CD-R: wodim intermittent failures: [sr0] Add. Sense: Logical block address out of range, sector 0
  2009-05-12 23:23         ` Robert Hancock
@ 2009-05-13  1:00           ` Andy Isaacson
  2009-05-13  1:08             ` Robert Hancock
  0 siblings, 1 reply; 8+ messages in thread
From: Andy Isaacson @ 2009-05-13  1:00 UTC (permalink / raw)
  To: Robert Hancock; +Cc: Andrew Morton, linux-kernel, linux-ide, linux-scsi

On Tue, May 12, 2009 at 05:23:57PM -0600, Robert Hancock wrote:
> >Looking at hald/linux/addons/addon-storage.c it does appear to
> >open(O_EXCL) as well, so perhaps the O_EXCL multiple-open logic broke
> >recently?
> >
> >Complete dmesg, kconfig, and so on at
> >
> >http://web.hexapodia.org/~adi/sysinfo/1242164411_cvpe4300_2.6.30-rc5-dmar-00096-ga4d7749/
> 
> I'm not sure how this is supposed to work. The only place I can see any 
> O_EXCL handling is in the sg driver. But if you access the sr device 
> directly, it doesn't seem like that would have any effect.
> 
> CCing linux-scsi. Could be there's a bad user-space assumption here, or 
> something.

Perhaps, I haven't tracked down exactly what the difference in behavior
between working 2.6.29 and broken 2.6.30-rc2 is.  It seems like the disk
spins up more often (in response to hal background probes presumably)
with 30-rc2 than it did before, and running "udevadm monitor" I do see
the following repeated "change" events happening, even on an idle
system:

KERNEL[1242175547.793029] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
KERNEL[1242175547.794669] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
UDEV  [1242175547.794912] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
UDEV  [1242175547.888539] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
KERNEL[1242175673.309129] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
KERNEL[1242175673.310775] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
UDEV  [1242175673.311013] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
UDEV  [1242175674.658718] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
KERNEL[1242175690.227604] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
KERNEL[1242175690.229730] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
UDEV  [1242175690.229960] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
UDEV  [1242175714.060440] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
KERNEL[1242175727.909819] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
KERNEL[1242175727.911356] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
UDEV  [1242175727.911690] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
UDEV  [1242175727.987747] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
KERNEL[1242175731.591853] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
KERNEL[1242175731.593541] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)

I've verified that O_EXCL works at least with a trivial test program, so
it's not simply that EXCL broke.

Also I've verified with strace that hald-addon-storage and wodim are
both using O_EXCL, and that hald-addon-storage gets EBUSY while wodim is
writing the disk.  (The straced run didn't fail, unfortunately.)

-andy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [2.6.30-rc2] CD-R: wodim intermittent failures: [sr0] Add. Sense: Logical block address out of range, sector 0
  2009-05-13  1:00           ` Andy Isaacson
@ 2009-05-13  1:08             ` Robert Hancock
  2009-05-13  2:27               ` Andy Isaacson
  0 siblings, 1 reply; 8+ messages in thread
From: Robert Hancock @ 2009-05-13  1:08 UTC (permalink / raw)
  To: Andy Isaacson; +Cc: Andrew Morton, linux-kernel, linux-ide, linux-scsi

On Tue, May 12, 2009 at 7:00 PM, Andy Isaacson <adi@hexapodia.org> wrote:
> On Tue, May 12, 2009 at 05:23:57PM -0600, Robert Hancock wrote:
>> >Looking at hald/linux/addons/addon-storage.c it does appear to
>> >open(O_EXCL) as well, so perhaps the O_EXCL multiple-open logic broke
>> >recently?
>> >
>> >Complete dmesg, kconfig, and so on at
>> >
>> >http://web.hexapodia.org/~adi/sysinfo/1242164411_cvpe4300_2.6.30-rc5-dmar-00096-ga4d7749/
>>
>> I'm not sure how this is supposed to work. The only place I can see any
>> O_EXCL handling is in the sg driver. But if you access the sr device
>> directly, it doesn't seem like that would have any effect.
>>
>> CCing linux-scsi. Could be there's a bad user-space assumption here, or
>> something.
>
> Perhaps, I haven't tracked down exactly what the difference in behavior
> between working 2.6.29 and broken 2.6.30-rc2 is.  It seems like the disk
> spins up more often (in response to hal background probes presumably)
> with 30-rc2 than it did before, and running "udevadm monitor" I do see
> the following repeated "change" events happening, even on an idle
> system:
>
> KERNEL[1242175547.793029] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
> KERNEL[1242175547.794669] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
> UDEV  [1242175547.794912] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
> UDEV  [1242175547.888539] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
> KERNEL[1242175673.309129] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
> KERNEL[1242175673.310775] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
> UDEV  [1242175673.311013] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
> UDEV  [1242175674.658718] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
> KERNEL[1242175690.227604] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
> KERNEL[1242175690.229730] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
> UDEV  [1242175690.229960] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
> UDEV  [1242175714.060440] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
> KERNEL[1242175727.909819] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
> KERNEL[1242175727.911356] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
> UDEV  [1242175727.911690] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
> UDEV  [1242175727.987747] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
> KERNEL[1242175731.591853] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
> KERNEL[1242175731.593541] change   /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
>
> I've verified that O_EXCL works at least with a trivial test program, so
> it's not simply that EXCL broke.
>
> Also I've verified with strace that hald-addon-storage and wodim are
> both using O_EXCL, and that hald-addon-storage gets EBUSY while wodim is
> writing the disk.  (The straced run didn't fail, unfortunately.)

Those events seem rather suspicious. Hmm:

[    2.770959] ata2.00: ATAPI: TSSTcorp DVD+/-RW TS-U633A, D200, max
UDMA/100, ATAPI AN

Did libata report it was using ATAPI AN in 2.6.29?

You could try bisecting to see if you can find which patch started
triggering those spurious events..

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [2.6.30-rc2] CD-R: wodim intermittent failures: [sr0] Add. Sense: Logical block address out of range, sector 0
  2009-05-13  1:08             ` Robert Hancock
@ 2009-05-13  2:27               ` Andy Isaacson
  0 siblings, 0 replies; 8+ messages in thread
From: Andy Isaacson @ 2009-05-13  2:27 UTC (permalink / raw)
  To: Robert Hancock; +Cc: Andrew Morton, linux-kernel, linux-ide, linux-scsi

On Tue, May 12, 2009 at 07:08:54PM -0600, Robert Hancock wrote:
> On Tue, May 12, 2009 at 7:00 PM, Andy Isaacson <adi@hexapodia.org> wrote:
> > Perhaps, I haven't tracked down exactly what the difference in behavior
> > between working 2.6.29 and broken 2.6.30-rc2 is. ?It seems like the disk
> > spins up more often (in response to hal background probes presumably)
> > with 30-rc2 than it did before, and running "udevadm monitor" I do see
> > the following repeated "change" events happening, even on an idle
> > system:
> >
> > KERNEL[1242175547.793029] change ? /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0 (scsi)
> > KERNEL[1242175547.794669] change ? /devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 (block)
[snip]
> Those events seem rather suspicious. Hmm:
> 
> [    2.770959] ata2.00: ATAPI: TSSTcorp DVD+/-RW TS-U633A, D200, max
> UDMA/100, ATAPI AN
> 
> Did libata report it was using ATAPI AN in 2.6.29?

Yes, AN is present in every log I have for that device (back to
2.6.28.1).

> You could try bisecting to see if you can find which patch started
> triggering those spurious events..

Heh, I was dreading a bisect where each step burned a coaster, but just
looking for events is easy.  I'll try that tomorrow.

-andy

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2009-05-13  2:27 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20090421015234.GA7014@hexapodia.org>
2009-04-22 21:49 ` [2.6.30-rc2] CD-R: wodim intermittent failures: [sr0] Add. Sense: Logical block address out of range, sector 0 Andrew Morton
2009-04-23  6:46   ` Robert Hancock
2009-04-23  7:07     ` Andy Isaacson
2009-05-12 21:52       ` Andy Isaacson
2009-05-12 23:23         ` Robert Hancock
2009-05-13  1:00           ` Andy Isaacson
2009-05-13  1:08             ` Robert Hancock
2009-05-13  2:27               ` Andy Isaacson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).