* Bug in linux kernel when playing DVDs.
@ 2003-04-26 16:28 James Courtier-Dutton
  0 siblings, 0 replies; 14+ messages in thread
From: James Courtier-Dutton @ 2003-04-26 16:28 UTC (permalink / raw)
  To: linux-ide

Hello,

I have found a bug in the linux kernel when it plays DVDs. I use xine
(xine.sf.net) for playing DVDs.
At some point during playback there is a read error on the DVD, but
the linux kernel currently does not handle this error correctly.
This puts the kernel into an uncertain state, causing it to take
100% CPU and fail all future read requests.
One way to exit this "uncertain state" is to push a pin into the small
hole on the front of the DVD drive. This causes the kernel to sense
"tray open", which it knows about and handles correctly. After this,
the kernel releases its grab on the CPU and linux runs normally again.
The strange thing is that during the "uncertain state", the kernel tries
to read sectors that are way outside the DVD.
Please see the kern.log extract for more details.
What is error 0x34? Does anyone know how we should handle it? The
current method for handling it is obviously wrong.
I am 100% sure that the application does not ask for these out of range
sectors, because I have debugged that far. I have now compiled
ide-cd as a kernel module, so I could add printk's to the kernel source
if that helps give more information.
Currently, I cannot find a document that explains what error 0x34 is.
Can anybody help?

Cheers
James

Apr 26 17:15:55 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:00 games kernel: hdd: cdrom_decode_status: status=0x51 { 
DriveReady SeekComplete Error }
Apr 26 17:16:00 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:05 games kernel: hdd: cdrom_decode_status: status=0x51 { 
DriveReady SeekComplete Error }
Apr 26 17:16:05 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:10 games kernel: hdd: cdrom_decode_status: status=0x51 { 
DriveReady SeekComplete Error }
Apr 26 17:16:10 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:10 games kernel: hdd: ATAPI reset complete
Apr 26 17:16:15 games kernel: hdd: cdrom_decode_status: status=0x51 { 
DriveReady SeekComplete Error }
Apr 26 17:16:15 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:20 games kernel: hdd: cdrom_decode_status: status=0x51 { 
DriveReady SeekComplete Error }
Apr 26 17:16:20 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:24 games kernel: hdd: cdrom_decode_status: status=0x51 { 
DriveReady SeekComplete Error }
Apr 26 17:16:24 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:24 games kernel: hdd: ATAPI reset complete
Apr 26 17:16:25 games kernel: end_request: I/O error, dev 16:40 (hdd), 
sector 7750464
Apr 26 17:16:29 games kernel: hdd: cdrom_decode_status: status=0x51 { 
DriveReady SeekComplete Error }
Apr 26 17:16:29 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:34 games kernel: hdd: cdrom_decode_status: status=0x51 { 
DriveReady SeekComplete Error }
Apr 26 17:16:34 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:39 games kernel: hdd: cdrom_decode_status: status=0x51 { 
DriveReady SeekComplete Error }
Apr 26 17:16:39 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:44 games kernel: hdd: cdrom_decode_status: status=0x51 { 
DriveReady SeekComplete Error }
Apr 26 17:16:44 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:44 games kernel: hdd: ATAPI reset complete
Apr 26 17:16:49 games kernel: hdd: cdrom_decode_status: status=0x51 { 
DriveReady SeekComplete Error }
Apr 26 17:16:49 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:54 games kernel: hdd: cdrom_decode_status: status=0x51 { 
DriveReady SeekComplete Error }
Apr 26 17:16:54 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:59 games kernel: hdd: cdrom_decode_status: status=0x51 { 
DriveReady SeekComplete Error }
Apr 26 17:16:59 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:59 games kernel: hdd: ATAPI reset complete
Apr 26 17:16:59 games kernel: end_request: I/O error, dev 16:40 (hdd), 
sector 7750468
Apr 26 17:16:59 games kernel: hdd: cdrom_decode_status: status=0x51 { 
DriveReady SeekComplete Error }

"I use the PIN here"

Apr 26 17:16:59 games kernel: hdd: cdrom_decode_status: error=0xb4
Apr 26 17:16:59 games kernel: hdd: tray open
Apr 26 17:16:59 games kernel: end_request: I/O error, dev 16:40 (hdd), 
sector 7750472
Apr 26 17:16:59 games kernel: hdd: tray open
Apr 26 17:16:59 games kernel: end_request: I/O error, dev 16:40 (hdd), 
sector 7750476
Apr 26 17:16:59 games kernel: hdd: tray open
Apr 26 17:16:59 games kernel: end_request: I/O error, dev 16:40 (hdd), 
sector 7750480
Apr 26 17:16:59 games kernel: hdd: tray open
Apr 26 17:16:59 games kernel: end_request: I/O error, dev 16:40 (hdd), 
sector 7750484
Apr 26 17:16:59 games kernel: hdd: tray open
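
On the question of what error=0x34 means, here is a rough sketch of how
the two bytes in these log lines can be decoded. It assumes the usual
ATAPI convention (my assumption, not something stated in the thread):
the upper four bits of the error register carry the SCSI sense key and
the lower four bits are flags. The function names are made up for this
illustration, and the status labels are chosen to match the log text.

```python
# Status register bits, highest bit first.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# SCSI sense keys, assumed to sit in the upper nibble of the ATAPI error register.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0x7: "DATA PROTECT",
    0x8: "BLANK CHECK",
    0xB: "ABORTED COMMAND",
    0xE: "MISCOMPARE",
}

# Flag bits in the lower nibble of the error register.
ERROR_FLAGS = [(0x08, "MCR"), (0x04, "ABRT"), (0x02, "EOM"), (0x01, "ILI")]


def decode_status(status):
    """Return the names of all bits set in the status byte."""
    return [name for bit, name in STATUS_BITS if status & bit]


def decode_error(err):
    """Split the error byte into (sense key name, list of flag names)."""
    key = SENSE_KEYS.get(err >> 4, "RESERVED")
    flags = [name for bit, name in ERROR_FLAGS if err & bit]
    return key, flags
```

Under that assumption, 0x34 reads as sense key 3 (MEDIUM ERROR) with the
ABRT flag set, which would fit a damaged or unreadable spot on the disc,
and the 0xb4 seen after the pin is used reads as ABORTED COMMAND.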



* Bug in linux kernel when playing DVDs.
@ 2003-04-27 10:47 James Courtier-Dutton
  2003-04-29  5:46 ` Denis Vlasenko
  0 siblings, 1 reply; 14+ messages in thread
From: James Courtier-Dutton @ 2003-04-27 10:47 UTC (permalink / raw)
  To: linux-kernel

Hello,

I have found a bug in the linux kernel when it plays DVDs. I use xine
(xine.sf.net) for playing DVDs.
At some point during playback there is a read error on the DVD, but
the linux kernel currently does not handle this error correctly.
This puts the kernel into an uncertain state, causing it to take
100% CPU and fail all future read requests.

One way to exit this "uncertain state" is to push a pin into the small
hole on the front of the DVD drive. This causes the kernel to sense
"tray open", which it knows about and handles correctly. After this,
the kernel releases its grab on the CPU and linux runs normally again.
Please see the kern.log extract for more details.

What is error 0x34? Does anyone know how we should handle it? The
current method for handling it is obviously wrong.
I am 100% sure that the application does not ask for out of range
sectors, because I have debugged that far. I have now compiled
ide-cd as a kernel module, so I could add printk's to the kernel source
if that helps give more information.

Currently, I cannot find a document that explains what error 0x34 is.
Can anybody help?

Cheers
James

Apr 26 17:15:55 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:00 games kernel: hdd: cdrom_decode_status: status=0x51 {
DriveReady SeekComplete Error }
Apr 26 17:16:00 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:05 games kernel: hdd: cdrom_decode_status: status=0x51 {
DriveReady SeekComplete Error }
Apr 26 17:16:05 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:10 games kernel: hdd: cdrom_decode_status: status=0x51 {
DriveReady SeekComplete Error }
Apr 26 17:16:10 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:10 games kernel: hdd: ATAPI reset complete
Apr 26 17:16:15 games kernel: hdd: cdrom_decode_status: status=0x51 {
DriveReady SeekComplete Error }
Apr 26 17:16:15 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:20 games kernel: hdd: cdrom_decode_status: status=0x51 {
DriveReady SeekComplete Error }
Apr 26 17:16:20 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:24 games kernel: hdd: cdrom_decode_status: status=0x51 {
DriveReady SeekComplete Error }
Apr 26 17:16:24 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:24 games kernel: hdd: ATAPI reset complete
Apr 26 17:16:25 games kernel: end_request: I/O error, dev 16:40 (hdd),
sector 7750464
Apr 26 17:16:29 games kernel: hdd: cdrom_decode_status: status=0x51 {
DriveReady SeekComplete Error }
Apr 26 17:16:29 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:34 games kernel: hdd: cdrom_decode_status: status=0x51 {
DriveReady SeekComplete Error }
Apr 26 17:16:34 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:39 games kernel: hdd: cdrom_decode_status: status=0x51 {
DriveReady SeekComplete Error }
Apr 26 17:16:39 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:44 games kernel: hdd: cdrom_decode_status: status=0x51 {
DriveReady SeekComplete Error }
Apr 26 17:16:44 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:44 games kernel: hdd: ATAPI reset complete
Apr 26 17:16:49 games kernel: hdd: cdrom_decode_status: status=0x51 {
DriveReady SeekComplete Error }
Apr 26 17:16:49 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:54 games kernel: hdd: cdrom_decode_status: status=0x51 {
DriveReady SeekComplete Error }
Apr 26 17:16:54 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:59 games kernel: hdd: cdrom_decode_status: status=0x51 {
DriveReady SeekComplete Error }
Apr 26 17:16:59 games kernel: hdd: cdrom_decode_status: error=0x34
Apr 26 17:16:59 games kernel: hdd: ATAPI reset complete
Apr 26 17:16:59 games kernel: end_request: I/O error, dev 16:40 (hdd),
sector 7750468
Apr 26 17:16:59 games kernel: hdd: cdrom_decode_status: status=0x51 {
DriveReady SeekComplete Error }

"I use the PIN here"

Apr 26 17:16:59 games kernel: hdd: cdrom_decode_status: error=0xb4
Apr 26 17:16:59 games kernel: hdd: tray open
Apr 26 17:16:59 games kernel: end_request: I/O error, dev 16:40 (hdd),
sector 7750472
Apr 26 17:16:59 games kernel: hdd: tray open
Apr 26 17:16:59 games kernel: end_request: I/O error, dev 16:40 (hdd),
sector 7750476
Apr 26 17:16:59 games kernel: hdd: tray open
Apr 26 17:16:59 games kernel: end_request: I/O error, dev 16:40 (hdd),
sector 7750480
Apr 26 17:16:59 games kernel: hdd: tray open
Apr 26 17:16:59 games kernel: end_request: I/O error, dev 16:40 (hdd),
sector 7750484
Apr 26 17:16:59 games kernel: hdd: tray open


* Re: Bug in linux kernel when playing DVDs.
  2003-04-27 10:47 James Courtier-Dutton
@ 2003-04-29  5:46 ` Denis Vlasenko
  2003-04-29  6:56   ` Nick Piggin
  2003-04-29 11:11   ` James Courtier-Dutton
  0 siblings, 2 replies; 14+ messages in thread
From: Denis Vlasenko @ 2003-04-29  5:46 UTC (permalink / raw)
  To: James, linux-kernel

On 27 April 2003 13:47, James Courtier-Dutton wrote:
> Hello,
>
> I have found a bug in the linux kernel when it plays DVDs. I use xine
> (xine.sf.net) for playing DVDs.
> At some point during the playing there is an error on the DVD. But
> currently this error is not handled correctly by the linux kernel.
> This puts the kernel into an uncertain state, causing the kernel to
> take 100% CPU and fail all future read requests.
...
> Apr 26 17:16:24 games kernel: hdd: cdrom_decode_status: error=0x34
> Apr 26 17:16:24 games kernel: hdd: ATAPI reset complete
> Apr 26 17:16:25 games kernel: end_request: I/O error, dev 16:40
> (hdd), sector 7750464
...
> DriveReady SeekComplete Error }
> Apr 26 17:16:59 games kernel: hdd: cdrom_decode_status: error=0x34
> Apr 26 17:16:59 games kernel: hdd: ATAPI reset complete
> Apr 26 17:16:59 games kernel: end_request: I/O error, dev 16:40
> (hdd), sector 7750468

See? Sector # is increasing... Linux retries the read several times,
then reports EIO to userspace and goes on to the next sectors.
Unfortunately, they are bad too, so the loop repeats. Eventually it
will get past all the bad sectors (if not, it's a bug) but it can take
a longish time.

Apart from making the max retry count settable by the user, I don't see
how this can be made better. Pity. This is a common problem on CDs...
--
vda
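
To put numbers on "sector # is increasing", here is a small worked
calculation. It assumes that end_request reports sectors in 512-byte
units while the DVD itself uses 2048-byte blocks, and it takes the 34
seconds between the two I/O errors logged at 17:16:25 and 17:16:59 as
the cost of giving up on one block. The helper names are made up for
this illustration.

```python
KERNEL_SECTOR = 512   # bytes per sector as counted in the log (assumption)
DVD_BLOCK = 2048      # bytes per DVD block

# Sector numbers copied from the end_request lines in the log.
FAILED_SECTORS = [7750464, 7750468, 7750472, 7750476, 7750480, 7750484]

# Time between the I/O errors for sectors 7750464 and 7750468 in the log.
SECONDS_PER_FAILED_BLOCK = 34


def byte_offset(kernel_sector):
    """Byte position on the disc for a sector number from the log."""
    return kernel_sector * KERNEL_SECTOR


def to_dvd_block(kernel_sector):
    """Convert a 512-byte sector number to a 2048-byte DVD block number."""
    return byte_offset(kernel_sector) // DVD_BLOCK


def stall_seconds(bad_blocks, per_block=SECONDS_PER_FAILED_BLOCK):
    """Rough time spent if every bad block costs one full retry cycle."""
    return bad_blocks * per_block
```

With these assumptions the failing sectors map to consecutive DVD blocks
1937616 through 1937621, at a byte offset of about 3.97 GB, which would
be inside a single-layer disc's roughly 4.7 GB rather than beyond it.
At 34 seconds per block, a damaged region of 512 blocks (1 MiB) would
take 17408 seconds, close to five hours, to get past.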


* Re: Bug in linux kernel when playing DVDs.
  2003-04-29  5:46 ` Denis Vlasenko
@ 2003-04-29  6:56   ` Nick Piggin
  2003-04-30 12:10     ` Denis Vlasenko
  2003-04-29 11:11   ` James Courtier-Dutton
  1 sibling, 1 reply; 14+ messages in thread
From: Nick Piggin @ 2003-04-29  6:56 UTC (permalink / raw)
  To: vda; +Cc: James, linux-kernel

Denis Vlasenko wrote:

>On 27 April 2003 13:47, James Courtier-Dutton wrote:
>
>>Hello,
>>
>>I have found a bug in the linux kernel when it plays DVDs. I use xine
>>(xine.sf.net) for playing DVDs.
>>At some point during the playing there is an error on the DVD. But
>>currently this error is not handled correctly by the linux kernel.
>>This puts the kernel into an uncertain state, causing the kernel to
>>take 100% CPU and fail all future read requests.
>>
>
[snip]

>
>Apart of making max retry # settable by the user, I don't see how
>this can be made better.
>
Having the kernel not use 100% CPU?

> Pity. This is common problem on CDs...
>  
>



* Re: Bug in linux kernel when playing DVDs.
  2003-04-29  5:46 ` Denis Vlasenko
  2003-04-29  6:56   ` Nick Piggin
@ 2003-04-29 11:11   ` James Courtier-Dutton
  2003-04-30 12:08     ` Denis Vlasenko
  1 sibling, 1 reply; 14+ messages in thread
From: James Courtier-Dutton @ 2003-04-29 11:11 UTC (permalink / raw)
  To: vda; +Cc: linux-kernel

Denis Vlasenko wrote:

>On 27 April 2003 13:47, James Courtier-Dutton wrote:
>  
>
>>Hello,
>>
>>I have found a bug in the linux kernel when it plays DVDs. I use xine
>>(xine.sf.net) for playing DVDs.
>>At some point during the playing there is an error on the DVD. But
>>currently this error is not handled correctly by the linux kernel.
>>This puts the kernel into an uncertain state, causing the kernel to
>>take 100% CPU and fail all future read requests.
>>    
>>
>...
>  
>
>>Apr 26 17:16:24 games kernel: hdd: cdrom_decode_status: error=0x34
>>Apr 26 17:16:24 games kernel: hdd: ATAPI reset complete
>>Apr 26 17:16:25 games kernel: end_request: I/O error, dev 16:40
>>(hdd), sector 7750464
>>    
>>
>...
>  
>
>>DriveReady SeekComplete Error }
>>Apr 26 17:16:59 games kernel: hdd: cdrom_decode_status: error=0x34
>>Apr 26 17:16:59 games kernel: hdd: ATAPI reset complete
>>Apr 26 17:16:59 games kernel: end_request: I/O error, dev 16:40
>>(hdd), sector 7750468
>>    
>>
>
>See? Sector # is increasing... Linux retries the read several times,
>then reports EIO to userspace and goes to next sectors. Unfortunately,
>they are bad too, so the loop repeats. Eventually it will pass
>by all bad sectors (if not, it's a bug) but it can take longish
>time.
>
>Apart of making max retry # settable by the user, I don't see how
>this can be made better. Pity. This is common problem on CDs...
>--
>vda
>  
>
What is this EIO report? The CPU is never returned to user space apps,
so the app never sees any error.
As for retries, for DVD playing we do not want the Linux kernel to do
any retries, because during DVD playback we just want a very quick
response saying there was an error. The DVD playing application can then
skip forward 0.5 seconds and continue. If one sector fails on a DVD,
there is little or no point in reading the next sector. One has to
start reading from the next VOBU (i.e. about a 0.5 second skip).

Cheers
James
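
The player-side recovery described above can be sketched roughly as
follows. This is only an illustration under stated assumptions: the
function name and parameters are hypothetical, the fixed skip distance
stands in for "jump to the next VOBU" (a real player would locate the
next VOBU from the disc's navigation data), and it only helps once
read() actually returns EIO to the application.

```python
import errno

DVD_BLOCK = 2048
SKIP_BLOCKS = 256  # hypothetical stand-in for the distance to the next VOBU


def read_with_skip(read_at, start, end, chunk_blocks=16, skip_blocks=SKIP_BLOCKS):
    """Read [start, end) in chunks; on EIO, skip ahead instead of retrying.

    read_at(offset, size) must return bytes or raise OSError.
    Returns (list of (offset, length) read, list of offsets that failed).
    """
    offset = start
    good = []
    skipped = []
    while offset < end:
        size = min(chunk_blocks * DVD_BLOCK, end - offset)
        try:
            data = read_at(offset, size)
        except OSError as exc:
            if exc.errno != errno.EIO:
                raise  # only media errors are handled here
            skipped.append(offset)
            offset += skip_blocks * DVD_BLOCK  # no retry, jump forward
            continue
        if not data:
            break  # end of device
        good.append((offset, len(data)))
        offset += len(data)
    return good, skipped
```

With a real device, read_at could be something like
lambda off, n: os.pread(fd, n, off) on a file descriptor opened on
/dev/dvd.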




* Re: Bug in linux kernel when playing DVDs.
  2003-04-30 12:10     ` Denis Vlasenko
@ 2003-04-30 12:07       ` Alan Cox
  2003-04-30 15:23         ` James Courtier-Dutton
  0 siblings, 1 reply; 14+ messages in thread
From: Alan Cox @ 2003-04-30 12:07 UTC (permalink / raw)
  To: vda; +Cc: Nick Piggin, James, Linux Kernel Mailing List

On Mer, 2003-04-30 at 13:10, Denis Vlasenko wrote:
> > Having the kernel not use 100% CPU?
> 
> I suspect IDE error recovery path was never audited for that

NOTABUG

User space keeps asking it to read so it keeps using CPU, fix the user
space



* Re: Bug in linux kernel when playing DVDs.
  2003-04-29 11:11   ` James Courtier-Dutton
@ 2003-04-30 12:08     ` Denis Vlasenko
  2003-04-30 12:29       ` Richard B. Johnson
  0 siblings, 1 reply; 14+ messages in thread
From: Denis Vlasenko @ 2003-04-30 12:08 UTC (permalink / raw)
  To: James Courtier-Dutton; +Cc: linux-kernel

On 29 April 2003 14:11, James Courtier-Dutton wrote:
> >See? Sector # is increasing... Linux retries the read several times,
> >then reports EIO to userspace and goes to next sectors.
> > Unfortunately, they are bad too, so the loop repeats. Eventually it
> > will pass by all bad sectors (if not, it's a bug) but it can take
> > longish time.
> >
> >Apart of making max retry # settable by the user, I don't see how
> >this can be made better. Pity. This is common problem on CDs...
>
> What is this EIO report. The CPU is never returned to user space
> apps, so the app never sees any error.

Are you sure that CPU never returned to the app?
(strace is your friend...)

> As for retries, for DVD playing we do not want the Linux kernel to do
> any retries, because during DVD playback, we just want a very quick
> response saying there was an error.

Kernel is not yet telepathic.

> The DVD playing application can
> then skip forward 0.5 seconds and continue. If one sector fails on a
> DVD, there is little or not point in reading the next sector. One has
> to start reading from the next VOBU. (i.e. about 0.5 seconds skip.)

You need a way to tell kernel that you want such behavior.
"skip 0.5 sec on error" requirement is rather hard
to describe to the kernel.
--
vda


* Re: Bug in linux kernel when playing DVDs.
  2003-04-29  6:56   ` Nick Piggin
@ 2003-04-30 12:10     ` Denis Vlasenko
  2003-04-30 12:07       ` Alan Cox
  0 siblings, 1 reply; 14+ messages in thread
From: Denis Vlasenko @ 2003-04-30 12:10 UTC (permalink / raw)
  To: Nick Piggin; +Cc: James, linux-kernel

On 29 April 2003 09:56, Nick Piggin wrote:
> >>At some point during the playing there is an error on the DVD. But
> >>currently this error is not handled correctly by the linux kernel.
> >>This puts the kernel into an uncertain state, causing the kernel to
> >>take 100% CPU and fail all future read requests.
>
> [snip]
>
> >Apart of making max retry # settable by the user, I don't see how
> >this can be made better.
>
> Having the kernel not use 100% CPU?

I suspect IDE error recovery path was never audited for that
--
vda


* Re: Bug in linux kernel when playing DVDs.
  2003-04-30 12:08     ` Denis Vlasenko
@ 2003-04-30 12:29       ` Richard B. Johnson
  0 siblings, 0 replies; 14+ messages in thread
From: Richard B. Johnson @ 2003-04-30 12:29 UTC (permalink / raw)
  To: Denis Vlasenko; +Cc: James Courtier-Dutton, linux-kernel

On Wed, 30 Apr 2003, Denis Vlasenko wrote:

> On 29 April 2003 14:11, James Courtier-Dutton wrote:
> > >See? Sector # is increasing... Linux retries the read several times,
> > >then reports EIO to userspace and goes to next sectors.
> > > Unfortunately, they are bad too, so the loop repeats. Eventually it
> > > will pass by all bad sectors (if not, it's a bug) but it can take
> > > longish time.
> > >
> > >Apart of making max retry # settable by the user, I don't see how
> > >this can be made better. Pity. This is common problem on CDs...
> >
> > What is this EIO report. The CPU is never returned to user space
> > apps, so the app never sees any error.
>
> Are you sure that CPU never returned to the app?
> (strace is your friend...)
>
> > As for retries, for DVD playing we do not want the Linux kernel to do
> > any retries, because during DVD playback, we just want a very quick
> > response saying there was an error.
>
> Kernel is not yet telepathic.
>
> > The DVD playing application can
> > then skip forward 0.5 seconds and continue. If one sector fails on a
> > DVD, there is little or not point in reading the next sector. One has
> > to start reading from the next VOBU. (i.e. about 0.5 seconds skip.)
>
> You need a way to tell kernel that you want such behavior.
> "skip 0.5 sec on error" requirement is rather hard
> to describe to the kernel.
> --
> vda

The usual way of reading DVDs is to ignore all errors! You need
to handle DVD errors differently than CD-ROM errors. With CDs,
it is expected that all data that is read is perfect. With
DVDs, this is not the case. The implementation problem becomes
one of how to tell the kernel whether the combined DVD/CD-ROM drive
holds one or the other. I don't know what W$ does about this, but
on my Compaq laptop, DVDs just stream right along, even though
there are whole corrupted frames, while the same drive containing
a defective CD will retry practically forever.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.



* Re: Bug in linux kernel when playing DVDs.
  2003-04-30 15:23         ` James Courtier-Dutton
@ 2003-04-30 14:32           ` Alan Cox
  2003-04-30 16:46           ` Elladan
  1 sibling, 0 replies; 14+ messages in thread
From: Alan Cox @ 2003-04-30 14:32 UTC (permalink / raw)
  To: James Courtier-Dutton; +Cc: vda, Nick Piggin, Linux Kernel Mailing List

On Mer, 2003-04-30 at 16:23, James Courtier-Dutton wrote:
> When an error occurs on the DVD, "read done" message is never printed on 
> the console and all applications fail to respond to user input. This is 
> why I thought that the kernel hogs CPU 100% and the application never 
> receives the error message.

Can you provide me with an strace and the log of the same set of events?
In my case I saw the app I used continually going read/read/read, and the
kernel working hard to clean up the mess, getting an error, and repeating.



* Re: Bug in linux kernel when playing DVDs.
  2003-04-30 12:07       ` Alan Cox
@ 2003-04-30 15:23         ` James Courtier-Dutton
  2003-04-30 14:32           ` Alan Cox
  2003-04-30 16:46           ` Elladan
  0 siblings, 2 replies; 14+ messages in thread
From: James Courtier-Dutton @ 2003-04-30 15:23 UTC (permalink / raw)
  To: Alan Cox; +Cc: vda, Nick Piggin, Linux Kernel Mailing List

Alan Cox wrote:

>On Mer, 2003-04-30 at 13:10, Denis Vlasenko wrote:
>  
>
>>>Having the kernel not use 100% CPU?
>>>      
>>>
>>I suspect IDE error recovery path was never audited for that
>>    
>>
>
>NOTABUG
>
>User space keeps asking it to read so it keeps using CPU, fix the user
>space
>
>  
>
The application does an initial lseek() call, which succeeds.
It then just does read() calls from then on.
For bug tracking, I have put printf statements in my application.
I.e.
printf("About to seek\n");
result = lseek(fd, start_offset, SEEK_SET);
printf("Seek done.\n");
for (;;) {
	/* BigLoop: keep reading fixed-size chunks */
	printf("About to read\n");
	result = read(fd, buffer, x_bytes);
	printf("read done.\n");
	/* a short read or an error stops the program here */
	if (result != x_bytes)
		assert(0);
}

When an error occurs on the DVD, "read done" message is never printed on 
the console and all applications fail to respond to user input. This is 
why I thought that the kernel hogs CPU 100% and the application never 
receives the error message.
If I force a different error "tray open", by using a pin in the manual 
eject hole on the front of the dvd rom device, I then see the "read 
done" message and everything comes back to life.

To me, this is somewhat strange behaviour, a bug even.

Are there any other user space parts between the kernel and the "read
done" message that I don't know about?

So, from the logs, it looks like the kernel keeps trying to read the
DVD, but it is not the user application requesting that!
Is there some sort of caching code between read() and the kernel, and if
so, how do I turn it off?

Cheers
James




* Re: Bug in linux kernel when playing DVDs.
  2003-04-30 15:23         ` James Courtier-Dutton
  2003-04-30 14:32           ` Alan Cox
@ 2003-04-30 16:46           ` Elladan
  2003-04-30 17:16             ` Jan Knutar
       [not found]             ` <20030524204548.GA1552@eskimo.com>
  1 sibling, 2 replies; 14+ messages in thread
From: Elladan @ 2003-04-30 16:46 UTC (permalink / raw)
  To: James Courtier-Dutton
  Cc: Alan Cox, vda, Nick Piggin, Linux Kernel Mailing List

On Wed, Apr 30, 2003 at 04:23:47PM +0100, James Courtier-Dutton wrote:
> Alan Cox wrote:
> >
> >NOTABUG
> >
> >User space keeps asking it to read so it keeps using CPU, fix the user
> >space
> 
> The application does an initial seek() command, which succeeds.
> It then just does read() commands for then on.
> For bug tracking, I have put printf statements in my application.
> I.e.
>
> [...]
> 
> When an error occurs on the DVD, "read done" message is never printed on 
> the console and all applications fail to respond to user input. This is 
> why I thought that the kernel hogs CPU 100% and the application never 
> receives the error message.
> If I force a different error "tray open", by using a pin in the manual 
> eject hole on the front of the dvd rom device, I then see the "read 
> done" message and everything comes back to life.

Are you sure it never returns, ever?

The behavior most people seem to see here 90% of the time seems to be
that the IDE layer retries the request a few dozen times before
returning an error result.  This usually takes 1-5 minutes.

So, does it return if you, say, go to lunch and then come back?

Of course, the other 10% of the time, things do seem to become
completely broken.  I've certainly observed this sort of behavior
before.

Not to mention, blocking for 1-5 minutes even on a CD-ROM read is
broken, and is certainly very unwanted for the task of playing a DVD.  I
think there needs to be a documented call to tell the kernel that the
application prefers to get I/O errors immediately instead of retries,
and it should always use a lot fewer retries on removable devices where
damaged media is common.

The other bug here is that the IDE layer seems uninterruptible in
software while it's doing this.  The tasks go into uninterruptible sleep
for up to 5 minutes at a time (sometimes forever), and can't be stopped
except by forcing a hardware exception eg. with eject.  You really need
to be able to kill a task and interrupt the file operation somehow when
it's in some sort of long-term CD error recovery situation.

-J
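
Until the kernel offers a way to ask for fast failure, one user-space
mitigation is to issue the read from a worker thread and give up
waiting after a deadline. The sketch below is illustrative only; the
function name is hypothetical. It does not cancel the request inside
the kernel: the worker stays blocked until the driver finishes its
retries, and only the caller gets control back. Whether this would
have helped on the kernels discussed here is unclear, since the reports
above say that all of user space stalled.

```python
import concurrent.futures

# A single worker: the drive can only service one stuck request at a time anyway.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)


def read_with_deadline(read_fn, timeout_s):
    """Run read_fn() in the worker thread; return None if it takes too long.

    The read itself is NOT aborted on timeout. It keeps running in the
    worker, and later calls queue up behind it.
    """
    future = _pool.submit(read_fn)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return None
```

A player could treat a None result like an I/O error and show a
"disc error" message while the worker is still blocked.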


* Re: Bug in linux kernel when playing DVDs.
  2003-04-30 16:46           ` Elladan
@ 2003-04-30 17:16             ` Jan Knutar
       [not found]             ` <20030524204548.GA1552@eskimo.com>
  1 sibling, 0 replies; 14+ messages in thread
From: Jan Knutar @ 2003-04-30 17:16 UTC (permalink / raw)
  To: Elladan, James Courtier-Dutton
  Cc: Alan Cox, vda, Nick Piggin, Linux Kernel Mailing List

> The behavior most people seem to see here 90% of the time seems to be
> that the IDE layer retries the request a few dozen times before
> returning an error result.  This usually takes 1-5 minutes.
>
> So, does it return if you, say, go to lunch and then come back?

It's not just IDE. I have a machine with SCSI and an old 2X CDROM,
which has trouble reading 80 min CD-Rs, which I discovered while
copying a large file from it. Userspace locked up for a few days, and I
don't just mean the cp process, I mean everything in userspace. The
machine responded to pings and forwarded packets (it's my NAT machine),
but not much else. A forced eject with a pin clears it up in that case
as well.

> Not to mention, blocking for 1-5 minutes even on a CD-ROM read is
> broken, and is certainly very unwanted for the task of playing a DVD.

It's unwanted for the task of anything, especially the 
everything-else-hangs-too behaviour that I observed, that might just be 
due to sim710 though ;-). (Kernel 2.4.18)


* Re: Bug in linux kernel when playing DVDs.
       [not found]             ` <20030524204548.GA1552@eskimo.com>
@ 2004-01-30 20:46               ` James Courtier-Dutton
  0 siblings, 0 replies; 14+ messages in thread
From: James Courtier-Dutton @ 2004-01-30 20:46 UTC (permalink / raw)
  To: Elladan, linux-ide; +Cc: Alan Cox, vda, Nick Piggin

[-- Attachment #1: Type: text/plain, Size: 3289 bytes --]

Elladan wrote:
> On Wed, Apr 30, 2003 at 09:46:41AM -0700, Elladan wrote:
> 
>>On Wed, Apr 30, 2003 at 04:23:47PM +0100, James Courtier-Dutton wrote:
>>
>>>Alan Cox wrote:
>>>
>>>>NOTABUG
>>>>
>>>>User space keeps asking it to read so it keeps using CPU, fix the user
>>>>space
>>>
>>>The application does an initial seek() command, which succeeds.
>>>It then just does read() commands for then on.
>>>For bug tracking, I have put printf statements in my application.
>>>I.e.
>>>
>>>[...]
>>>
>>>When an error occurs on the DVD, "read done" message is never printed on 
>>>the console and all applications fail to respond to user input. This is 
>>>why I thought that the kernel hogs CPU 100% and the application never 
>>>receives the error message.
>>>If I force a different error "tray open", by using a pin in the manual 
>>>eject hole on the front of the dvd rom device, I then see the "read 
>>>done" message and everything comes back to life.
>>
>>Are you sure it never returns, ever?
>>
>>The behavior most people seem to see here 90% of the time seems to be
>>that the IDE layer retries the request a few dozen times before
>>returning an error result.  This usually takes 1-5 minutes.
> 
> 
> Say, does this lame hack help at all?  I got so annoyed with the ide-cd
> driver the other night that I hacked this garbage in during the middle
> of a movie.  After patching, you load the module, possibly with a value,
> and it changes the max number of retries ide-cd will attempt on an
> error.
> 
> Numbers like 0 or 1 may be good.
> 
> Or maybe I'm just dreaming and this patch does nothing, because my
> hardware suddenly started working better.  It does that sometimes.  :-)
> 
> -J
> 
> 

I have not yet tried the patch, because I run kernel 2.6.2 at the moment.
I have concluded that the kernel (not user space) is doing some kind of 
read-ahead caching.
Example of a read happening correctly: -
bash-2.05b# dd if=/dev/dvd of=error.img skip=1598700 count=1
1+0 records in
1+0 records out

Example of a read failing, both on the same DVD, just different sectors.
bash-2.05b# dd if=/dev/dvd of=error.img skip=1598740 count=1
dd: reading `/dev/dvd': Input/output error
0+0 records in
0+0 records out
See attachment for output from dmesg for this single command.

As you can see, the read() call is at least returning now with kernel 
2.6.2, which did not happen with whichever previous kernel I tested 
with. The first error message arrives in the kernel log about 1 second 
after the request which is fine. We could work around that in the user 
space DVD player, but at the moment, the return takes about 20 seconds. 
20 seconds is an unacceptable break in the playing of a DVD.

As the kernel is at least returning control to the application, I can at 
least program some recovery logic into the linux dvd player now.

I think that this problem would be fixed if the kernel would stop the 
read-ahead function as soon as an error is seen, and thus pass control 
back to the application as soon as possible. read-ahead caching should 
only be enabled again on the next disc read() or lseek() request from 
user space.

Summary: -
If we could get the read() call to return as soon as the first error
message arrives in dmesg, linux would play DVDs much better.

Cheers
James
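
One thing an application can try from user space is to ask the kernel
not to read ahead on the descriptor at all. The sketch below uses
posix_fadvise() with POSIX_FADV_RANDOM, which on Linux is documented to
disable readahead for that file. Whether this also shortens the error
path on a block device such as /dev/dvd is an assumption that would
need testing, and the function name is made up for this illustration.

```python
import os


def open_without_readahead(path):
    """Open path read-only and hint that access will be random.

    The hint is skipped on platforms where Python lacks posix_fadvise.
    """
    fd = os.open(path, os.O_RDONLY)
    if hasattr(os, "posix_fadvise"):
        # offset 0 with length 0 means "the whole file"
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
    return fd
```

The returned descriptor is used with os.read() or os.pread() as usual
and closed with os.close().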

[-- Attachment #2: dmesg.txt --]
[-- Type: text/plain, Size: 2772 bytes --]

hda: command error: status=0x51 { DriveReady SeekComplete Error }
hda: command error: error=0x54
end_request: I/O error, dev hda, sector 1598736
Buffer I/O error on device hda, logical block 199842
hda: command error: status=0x51 { DriveReady SeekComplete Error }
hda: command error: error=0x54
end_request: I/O error, dev hda, sector 1598744
Buffer I/O error on device hda, logical block 199843
hda: command error: status=0x51 { DriveReady SeekComplete Error }
hda: command error: error=0x54
end_request: I/O error, dev hda, sector 1598752
Buffer I/O error on device hda, logical block 199844
hda: command error: status=0x51 { DriveReady SeekComplete Error }
hda: command error: error=0x54
end_request: I/O error, dev hda, sector 1598760
Buffer I/O error on device hda, logical block 199845
hda: command error: status=0x51 { DriveReady SeekComplete Error }
hda: command error: error=0x54
end_request: I/O error, dev hda, sector 1598768
Buffer I/O error on device hda, logical block 199846
hda: command error: status=0x51 { DriveReady SeekComplete Error }
hda: command error: error=0x54
end_request: I/O error, dev hda, sector 1598776
Buffer I/O error on device hda, logical block 199847
hda: command error: status=0x51 { DriveReady SeekComplete Error }
hda: command error: error=0x54
end_request: I/O error, dev hda, sector 1598784
Buffer I/O error on device hda, logical block 199848
hda: command error: status=0x51 { DriveReady SeekComplete Error }
hda: command error: error=0x54
end_request: I/O error, dev hda, sector 1598792
Buffer I/O error on device hda, logical block 199849
hda: command error: status=0x51 { DriveReady SeekComplete Error }
hda: command error: error=0x54
end_request: I/O error, dev hda, sector 1598800
Buffer I/O error on device hda, logical block 199850
hda: command error: status=0x51 { DriveReady SeekComplete Error }
hda: command error: error=0x54
end_request: I/O error, dev hda, sector 1598808
Buffer I/O error on device hda, logical block 199851
hda: command error: status=0x51 { DriveReady SeekComplete Error }
hda: command error: error=0x54
end_request: I/O error, dev hda, sector 1598816
Buffer I/O error on device hda, logical block 199852
hda: command error: status=0x51 { DriveReady SeekComplete Error }
hda: command error: error=0x54
end_request: I/O error, dev hda, sector 1598824
Buffer I/O error on device hda, logical block 199853
hda: command error: status=0x51 { DriveReady SeekComplete Error }
hda: command error: error=0x54
end_request: I/O error, dev hda, sector 1598832
Buffer I/O error on device hda, logical block 199854
hda: command error: status=0x51 { DriveReady SeekComplete Error }
hda: command error: error=0x54
end_request: I/O error, dev hda, sector 1598840
Buffer I/O error on device hda, logical block 199855
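
The numbers in this attachment can be tied back to the dd command with
a little arithmetic. The sketch assumes dd's default 512-byte block
size and a 4096-byte logical block, the latter inferred from the log
itself (sector 1598736 divided by logical block 199842 is 8 sectors
per block). The helper names are made up for this illustration.

```python
import re


def dd_skip_to_logical_block(skip, dd_bs=512, logical_bs=4096):
    """Logical block that contains the first byte dd asks for."""
    return skip * dd_bs // logical_bs


def failed_blocks(dmesg_text):
    """Collect the logical block numbers from 'Buffer I/O error' lines."""
    return [int(n) for n in re.findall(r"logical block (\d+)", dmesg_text)]
```

With those assumptions, skip=1598740 lands in logical block 199842,
the first block reported above, while the successful skip=1598700 read
falls in block 199837. The log then continues through block 199855, so
a single 512-byte request led to 14 failed 4096-byte block reads, which
is consistent with the read-ahead explanation.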
