* Re: amd64 cdrom access locks system
2005-06-09 23:00 ` Andrew Morton
@ 2005-06-09 19:38 ` Jeff Wiegley
2005-06-09 21:58 ` Jeff Wiegley
` (2 subsequent siblings)
3 siblings, 0 replies; 20+ messages in thread
From: Jeff Wiegley @ 2005-06-09 19:38 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, john stultz, Andi Kleen, Pallipadi, Venkatesh
Your workaround does indeed seem to work around the problem.
I can rip tracks from a cd now and I don't get a lock up
anymore.
But the first time I do something with the CD I get this...
warning: many lost ticks.
Your time source seems to be instable or some driver is hogging interupts.
rip __do_softirq+0x48/0xb0
Falling back to HPET
From then on I'm guessing I'm using the HPET and I don't
get any more of these warnings.
I did check whether DMA is on for the device. I can't get it
to support DMA...
root@mail:/root# hdparm -d 1 /dev/hda
/dev/hda:
setting using_dma to 1 (on)
HDIO_SET_DMA failed: Operation not permitted
using_dma = 0 (off)
I don't know what else to "fiddle" with to get it working. My guess is
that DMA is not currently supported at all for the chipset/motherboard
I have. (As I said before, lspci seems to indicate that a lot of stuff
on this motherboard is "unknown" hardware; it would be nice to get it
"known" but I don't know how. I can only be somebody's guinea pig for
patches ;-) ) Or maybe I am missing some trick to enabling DMA? I have
it enabled by default in my kernel .config.
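For reference, the using_dma flag that hdparm prints can also be read directly. The following is only a minimal sketch, assuming a Linux system that ships <linux/hdreg.h> and a device handled by the legacy IDE driver discussed in this thread; the helper name query_using_dma is hypothetical, and other drivers may simply reject the ioctl.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/hdreg.h>   /* HDIO_GET_DMA */

/*
 * Hypothetical helper: ask the IDE driver whether DMA is enabled.
 * Returns 1 if DMA is on, 0 if it is off, and -1 if the device
 * cannot be opened or does not support the HDIO_GET_DMA ioctl.
 */
static int query_using_dma(const char *dev)
{
    long flag = 0;
    int rc;
    /* O_NONBLOCK lets the open succeed on a CD drive with no disc. */
    int fd = open(dev, O_RDONLY | O_NONBLOCK);

    if (fd < 0)
        return -1;
    rc = ioctl(fd, HDIO_GET_DMA, &flag);
    close(fd);
    if (rc != 0)
        return -1;
    return flag ? 1 : 0;
}
```

Calling query_using_dma("/dev/hda") on the machine above would be expected to report 0, matching the "using_dma = 0 (off)" line, though that has not been verified here.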
Anyhow, thanks for the work around. I can at least use my burner now.
Though I suspect you will want a "real" fix at some point for why
hpet_tick ended up as 0. If you want me to test another patch
towards this goal just let me know.
- Jeff
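The arithmetic that the workaround patch in this thread protects can be tried outside the kernel. This is only a userspace sketch under stated assumptions: HZ is taken to be 1000, and the names safe_tick, lost_ticks and elapsed_ns are hypothetical stand-ins for the expressions in timer_interrupt(), not kernel functions.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define HZ 1000ULL   /* assumed tick rate for this sketch */

/* Same idea as fixed_hpet_tick() in the patch: never return a zero divisor. */
static inline uint64_t safe_tick(uint64_t hpet_tick)
{
    return hpet_tick ? hpet_tick : 1;
}

/* Ticks missed between two counter reads; 0 if at most one tick elapsed. */
static uint64_t lost_ticks(uint64_t offset, uint64_t last, uint64_t hpet_tick)
{
    uint64_t delta = offset - last;

    if (delta <= hpet_tick)
        return 0;
    return delta / safe_tick(hpet_tick) - 1;
}

/* Nanoseconds to add to the monotonic base for the same interval. */
static uint64_t elapsed_ns(uint64_t offset, uint64_t last, uint64_t hpet_tick)
{
    return (offset - last) * (NSEC_PER_SEC / HZ) / safe_tick(hpet_tick);
}
```

With hpet_tick equal to 0 the guard avoids the divide error, but the resulting values are not meaningful time, which is consistent with the patch being described as a workaround rather than a fix.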
Andrew Morton wrote:
> Jeff Wiegley <jeffw@cyte.com> wrote:
>
>>warning: many lost ticks.
>>Your time source seems to be instable or some driver is hogging interupts
>>rip default_idle+0x24/0x30
>>Falling back to HPET
>>divide error: 0000 [1] PREEMPT
>>...
>>RIP: 0010:[<ffffffff80112704>] <ffffffff80112704>{timer_interrupt+244}
>
>
> The timer code got confused, fell back to the HPET timer and then got a
> divide-by-zero in timer_interrupt(). Probably because variable hpet_tick
> is zero.
>
> - It's probably a bug that the cdrom code is holding interrupts off for
> too long.
>
> Use hdparm and dmesg to see whether the driver is using DMA. If it
> isn't, fiddle with it until it is.
>
> - It's possibly a bug that we're falling back to HPET mode just because
> the cdrom driver is being transiently silly.
>
> - It's surely a bug that hpet_tick is zero after we've switched to HPET mode.
>
>
>
>
> Please test this workaround:
>
>
> arch/x86_64/kernel/time.c | 13 +++++++++----
> 1 files changed, 9 insertions(+), 4 deletions(-)
>
> diff -puN arch/x86_64/kernel/time.c~x86_64-div-by-zero-fix arch/x86_64/kernel/time.c
> --- 25/arch/x86_64/kernel/time.c~x86_64-div-by-zero-fix Thu Jun 9 15:51:50 2005
> +++ 25-akpm/arch/x86_64/kernel/time.c Thu Jun 9 15:53:08 2005
> @@ -75,6 +75,11 @@ unsigned long __wall_jiffies __section_w
> struct timespec __xtime __section_xtime;
> struct timezone __sys_tz __section_sys_tz;
>
> +static inline unsigned long fixed_hpet_tick(void)
> +{
> + return hpet_tick ? hpet_tick : 1;
> +}
> +
> static inline void rdtscll_sync(unsigned long *tsc)
> {
> #ifdef CONFIG_SMP
> @@ -305,7 +310,7 @@ unsigned long long monotonic_clock(void)
>
> } while (read_seqretry(&xtime_lock, seq));
> offset = (this_offset - last_offset);
> - offset *=(NSEC_PER_SEC/HZ)/hpet_tick;
> + offset *=(NSEC_PER_SEC/HZ)/fixed_hpet_tick();
> return base + offset;
> }else{
> do {
> @@ -393,11 +398,11 @@ static irqreturn_t timer_interrupt(int i
>
> if (vxtime.mode == VXTIME_HPET) {
> if (offset - vxtime.last > hpet_tick) {
> - lost = (offset - vxtime.last) / hpet_tick - 1;
> + lost = (offset - vxtime.last) / fixed_hpet_tick() - 1;
> }
>
> - monotonic_base +=
> - (offset - vxtime.last)*(NSEC_PER_SEC/HZ) / hpet_tick;
> + monotonic_base += (offset - vxtime.last)*(NSEC_PER_SEC/HZ) /
> + fixed_hpet_tick();
>
> vxtime.last = offset;
> #ifdef CONFIG_X86_PM_TIMER
> _
--
Jeff Wiegley, PhD
Cyte.Com, LLC
(ignore:cea2d3a38843531c7def1deff59114de)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: amd64 cdrom access locks system
2005-06-09 23:00 ` Andrew Morton
2005-06-09 19:38 ` Jeff Wiegley
@ 2005-06-09 21:58 ` Jeff Wiegley
2005-06-09 23:32 ` Venkatesh Pallipadi
2005-06-13 16:35 ` Jeff Wiegley
3 siblings, 0 replies; 20+ messages in thread
From: Jeff Wiegley @ 2005-06-09 21:58 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, john stultz, Andi Kleen, Pallipadi, Venkatesh
Doh!! Apparently I'm not as sharp as I think I am...
In looking into the DMA enable issue I came across
a thread on Google indicating that somebody else
couldn't get DMA to work on the ALI M5229 IDE controller,
and the suggestion was to make sure Generic IDE
controller support was NOT enabled in the kernel.
I took that advice and recompiled another kernel
with generic IDE support disabled. (I did have it enabled
because I didn't know exactly what IDE controller
this had since ALI M5229 wasn't an option, though
I also enabled the alim15x3 driver just in case.)
Well, having them both there seems to be what caused
the error.
When I compiled out the generic IDE stuff I also
left out the recent workaround you provided, just to
see if it was needed when using the alim15x3 driver
or whether the presence of generic IDE was the root
of all of my problems.
Now, when I access the cdrom drive it does not lock
up. It doesn't even print anything about "many lost
ticks" anymore either.
But! I can only read from it (cdparanoia); I can't
write to it, and this seems to be kernel related.
When I do:
cdrecord -v -eject -dao dev=ATAPI:/dev/hda something.iso
cdrecord comes up and spits out:
...
Warning: Using ATA Packet interface.
Warning: The related Linux kernel interface code seems to be
unmaintained.
Warning: There is absolutely NO DMA, operations thus are slow.
Using libscg version 'ubuntu-0.8ubuntu1'.
cdrecord: Warning: using inofficial version of libscg
(ubuntu-0.8ubuntu1 '@(#)scsitransp.c 1.91 04/06/17 Copyright
1988,1995,2000-2004 J. Schilling').
SCSI buffer size: 64512
atapi: 1
Device type : Removable CD-ROM
Version : 0
Response Format: 2
Capabilities :
Vendor_info : 'SONY '
Identifikation : 'DVD RW DRU-500A '
Revision : '2.0h'
Device seems to be: Generic mmc2 DVD-R/DVD-RW.
Current: 0x0009
Profile: 0x001B
Profile: 0x001A
Profile: 0x0014
Profile: 0x0013
Profile: 0x0011
Profile: 0x0010
Profile: 0x000A
Profile: 0x0009 (current)
Profile: 0x0008
And then the cdrom device hangs. Not the whole machine, just the
cdrom drive. (I'm typing this while the cdrom drive is locked up
for instance) It never even starts to write anything.
What I do get after quite a wait in /var/log/kern.log is:
Jun 9 14:43:00 localhost kernel: ide-cd: cmd 0x3 timed out
Jun 9 14:43:00 localhost kernel: hda: lost interrupt
Jun 9 14:44:00 localhost kernel: ide-cd: cmd 0x3 timed out
Jun 9 14:44:00 localhost kernel: hda: lost interrupt
Jun 9 14:45:00 localhost kernel: hda: lost interrupt
So it looks like something is wrong with interrupt handling
still.
After a *very* long time the process seems to die and exit.
(Until I recompiled with some different options...)
I recompiled another kernel but this time:
- I turned off the PM timer since I seem to have a HPET timer.
- I turned off packet writing for CD writers.
- I added back in the workaround patch you recently gave me.
None of that helped. (And now it looks like no matter how
long I wait, the stuck cd drive process never exits.)
So in summary:
Reading works without the workaround patch but not if the
generic IDE driver is in charge.
Writing doesn't work under any circumstance.
That's all the compiling and rebooting I can handle for today.
Tomorrow I will try to turn on SCSI emulation and see if that
allows writing to work.
At least I can read CDs now though. Any thoughts on writing?
- Jeff
Andrew Morton wrote:
> Jeff Wiegley <jeffw@cyte.com> wrote:
>
>>warning: many lost ticks.
>>Your time source seems to be instable or some driver is hogging interupts
>>rip default_idle+0x24/0x30
>>Falling back to HPET
>>divide error: 0000 [1] PREEMPT
>>...
>>RIP: 0010:[<ffffffff80112704>] <ffffffff80112704>{timer_interrupt+244}
>
>
> The timer code got confused, fell back to the HPET timer and then got a
> divide-by-zero in timer_interrupt(). Probably because variable hpet_tick
> is zero.
>
> - It's probably a bug that the cdrom code is holding interrupts off for
> too long.
>
> Use hdparm and dmesg to see whether the driver is using DMA. If it
> isn't, fiddle with it until it is.
>
> - It's possibly a bug that we're falling back to HPET mode just because
> the cdrom driver is being transiently silly.
>
> - It's surely a bug that hpet_tick is zero after we've switched to HPET mode.
>
>
>
>
> Please test this workaround:
>
>
> arch/x86_64/kernel/time.c | 13 +++++++++----
> 1 files changed, 9 insertions(+), 4 deletions(-)
>
> diff -puN arch/x86_64/kernel/time.c~x86_64-div-by-zero-fix arch/x86_64/kernel/time.c
> --- 25/arch/x86_64/kernel/time.c~x86_64-div-by-zero-fix Thu Jun 9 15:51:50 2005
> +++ 25-akpm/arch/x86_64/kernel/time.c Thu Jun 9 15:53:08 2005
> @@ -75,6 +75,11 @@ unsigned long __wall_jiffies __section_w
> struct timespec __xtime __section_xtime;
> struct timezone __sys_tz __section_sys_tz;
>
> +static inline unsigned long fixed_hpet_tick(void)
> +{
> + return hpet_tick ? hpet_tick : 1;
> +}
> +
> static inline void rdtscll_sync(unsigned long *tsc)
> {
> #ifdef CONFIG_SMP
> @@ -305,7 +310,7 @@ unsigned long long monotonic_clock(void)
>
> } while (read_seqretry(&xtime_lock, seq));
> offset = (this_offset - last_offset);
> - offset *=(NSEC_PER_SEC/HZ)/hpet_tick;
> + offset *=(NSEC_PER_SEC/HZ)/fixed_hpet_tick();
> return base + offset;
> }else{
> do {
> @@ -393,11 +398,11 @@ static irqreturn_t timer_interrupt(int i
>
> if (vxtime.mode == VXTIME_HPET) {
> if (offset - vxtime.last > hpet_tick) {
> - lost = (offset - vxtime.last) / hpet_tick - 1;
> + lost = (offset - vxtime.last) / fixed_hpet_tick() - 1;
> }
>
> - monotonic_base +=
> - (offset - vxtime.last)*(NSEC_PER_SEC/HZ) / hpet_tick;
> + monotonic_base += (offset - vxtime.last)*(NSEC_PER_SEC/HZ) /
> + fixed_hpet_tick();
>
> vxtime.last = offset;
> #ifdef CONFIG_X86_PM_TIMER
> _
--
Jeff Wiegley, PhD
Cyte.Com, LLC
(ignore:cea2d3a38843531c7def1deff59114de)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: amd64 cdrom access locks system
2005-06-09 23:00 ` Andrew Morton
2005-06-09 19:38 ` Jeff Wiegley
2005-06-09 21:58 ` Jeff Wiegley
@ 2005-06-09 23:32 ` Venkatesh Pallipadi
2005-06-09 18:23 ` Jeff Wiegley
2005-06-13 16:35 ` Jeff Wiegley
3 siblings, 1 reply; 20+ messages in thread
From: Venkatesh Pallipadi @ 2005-06-09 23:32 UTC (permalink / raw)
To: Andrew Morton
Cc: Jeff Wiegley, linux-kernel, john stultz, Andi Kleen,
Pallipadi, Venkatesh
On Thu, Jun 09, 2005 at 04:00:45PM -0700, Andrew Morton wrote:
> Jeff Wiegley <jeffw@cyte.com> wrote:
> >
> > warning: many lost ticks.
> > Your time source seems to be instable or some driver is hogging interupts
> > rip default_idle+0x24/0x30
> > Falling back to HPET
> > divide error: 0000 [1] PREEMPT
> > ...
> > RIP: 0010:[<ffffffff80112704>] <ffffffff80112704>{timer_interrupt+244}
>
> The timer code got confused, fell back to the HPET timer and then got a
> divide-by-zero in timer_interrupt(). Probably because variable hpet_tick
> is zero.
>
> - It's probably a bug that the cdrom code is holding interrupts off for
> too long.
>
> Use hdparm and dmesg to see whether the driver is using DMA. If it
> isn't, fiddle with it until it is.
>
> - It's possibly a bug that we're falling back to HPET mode just because
> the cdrom driver is being transiently silly.
>
> - It's surely a bug that hpet_tick is zero after we've switched to HPET mode.
>
>
>
>
> Please test this workaround:
>
The only reason I can see for hpet_tick to be zero is that there was some error
in hpet_init() and we started using the PIT, but later we try to fall back to an
uninitialized HPET.
Can you look at your dmesg before the hang and check what timer is getting used?
The dmesg line will look something like this...
time.c: Using ______ MHz ___ timer.
Thanks,
Venki
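The scenario Venki describes suggests checking that the HPET was actually set up before falling back to it. The following is only an illustrative sketch of that idea, not the kernel's code: the enum, the struct and fall_back_from_tsc are hypothetical names, and choosing the PIT as the alternative is an assumption based on the message above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timekeeping modes for this sketch. */
enum time_mode { MODE_TSC, MODE_HPET, MODE_PIT };

struct timer_state {
    enum time_mode mode;
    uint64_t hpet_tick;   /* HPET counts per tick; 0 if HPET init failed */
};

/*
 * Called when the TSC is judged unreliable ("many lost ticks").
 * Falls back to the HPET only if it was initialized; otherwise the
 * PIT is used. Returns true if the HPET was selected.
 */
static bool fall_back_from_tsc(struct timer_state *ts)
{
    if (ts->hpet_tick != 0) {
        ts->mode = MODE_HPET;
        return true;
    }
    ts->mode = MODE_PIT;
    return false;
}
```

A check of this kind would avoid entering HPET mode with a zero hpet_tick in the first place, instead of guarding every division afterwards.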
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: amd64 cdrom access locks system
2005-06-09 23:32 ` Venkatesh Pallipadi
@ 2005-06-09 18:23 ` Jeff Wiegley
0 siblings, 0 replies; 20+ messages in thread
From: Jeff Wiegley @ 2005-06-09 18:23 UTC (permalink / raw)
To: Venkatesh Pallipadi; +Cc: Andrew Morton, linux-kernel, john stultz, Andi Kleen
The answer to what timer is getting used appears to be:
time.c: Using 3.579545 MHz PM timer.
time.c: Detected 2612.615 MHz processor.
time.c: Using HPET/TSC based timekeeping.
I'm still waiting for the compile to complete to test
Mr. Morton's workaround. Should have results posted
in about 15 minutes.
Thanks,
- Jeff
Venkatesh Pallipadi wrote:
> On Thu, Jun 09, 2005 at 04:00:45PM -0700, Andrew Morton wrote:
>
>>Jeff Wiegley <jeffw@cyte.com> wrote:
>>
>>>warning: many lost ticks.
>>>Your time source seems to be instable or some driver is hogging interupts
>>>rip default_idle+0x24/0x30
>>>Falling back to HPET
>>>divide error: 0000 [1] PREEMPT
>>>...
>>>RIP: 0010:[<ffffffff80112704>] <ffffffff80112704>{timer_interrupt+244}
>>
>>The timer code got confused, fell back to the HPET timer and then got a
>>divide-by-zero in timer_interrupt(). Probably because variable hpet_tick
>>is zero.
>>
>>- It's probably a bug that the cdrom code is holding interrupts off for
>> too long.
>>
>> Use hdparm and dmesg to see whether the driver is using DMA. If it
>> isn't, fiddle with it until it is.
>>
>>- It's possibly a bug that we're falling back to HPET mode just because
>> the cdrom driver is being transiently silly.
>>
>>- It's surely a bug that hpet_tick is zero after we've switched to HPET mode.
>>
>>
>>
>>
>>Please test this workaround:
>>
>
>
> The only reason I can see for hpet_tick to be zero is that there was some error
> in hpet_init() and we started using the PIT, but later we try to fall back to an
> uninitialized HPET.
>
> Can you look at your dmesg before the hang and check what timer is getting used?
> The dmesg line will look something like this...
>
> time.c: Using ______ MHz ___ timer.
>
> Thanks,
> Venki
--
Jeff Wiegley, PhD
Cyte.Com, LLC
(ignore:cea2d3a38843531c7def1deff59114de)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: amd64 cdrom access locks system
2005-06-09 23:00 ` Andrew Morton
` (2 preceding siblings ...)
2005-06-09 23:32 ` Venkatesh Pallipadi
@ 2005-06-13 16:35 ` Jeff Wiegley
2005-06-14 7:55 ` Bartlomiej Zolnierkiewicz
3 siblings, 1 reply; 20+ messages in thread
From: Jeff Wiegley @ 2005-06-13 16:35 UTC (permalink / raw)
To: B.Zolnierkiewicz; +Cc: linux-kernel, akpm
Andrew Morton said I should carbon copy the IDE developer on this
issue, so I have, in the hope of re-opening it and making
some progress, since I'm still unable to write anything with my
cd-burner.
Here's what I know to date:
I have the alim15x3 IDE driver installed and running.
I do NOT have any of the generic IDE drivers installed or
even compiled as they grossly interfere with the alim15x3
and cause a kernel panic.
My hardware is an AMD64 FX55 in a Shuttle ST20G5 case with a
serial ATA harddrive.
I'm using a stock 2.6.12-rc6 kernel.
Debian unstable distribution.
At first I can read from the drive fine.
For instance I did two "cdparanoia -B -d /dev/hda" without
a hitch. Nothing was reported in /var/log/kernel as a result.
The problem is that I can't write to the drive (burn cds with
cdrecord) without causing a lost interrupt, after which nothing works;
even reads don't respond.
When I do:
cdrecord -v -tao dev=ATAPI:/dev/hda something.iso
I get this output:
Cdrecord-Clone 2.01.01a01 (x86_64-unknown-linux-gnu) Copyright (C)
1995-2004 Joerg Schilling
NOTE: this version of cdrecord is an inofficial (modified) release of
cdrecord
and thus may have bugs that are not present in the original
version.
Please send bug reports and support requests to
<cdrtools@packages.debian.org>.
The original author should not be bothered with problems of
this version.
cdrecord: Warning: Running on Linux-2.6.12-rc6-jw14
cdrecord: There are unsettled issues with Linux-2.5 and newer.
cdrecord: If you have unexpected problems, please try Linux-2.4 or
Solaris.
TOC Type: 1 = CD-ROM
scsidev: 'ATAPI:/dev/hda'
devname: 'ATAPI:/dev/hda'
scsibus: -2 target: -2 lun: -2
Warning: Using ATA Packet interface.
Warning: The related Linux kernel interface code seems to be
unmaintained.
Warning: There is absolutely NO DMA, operations thus are slow.
Using libscg version 'ubuntu-0.8ubuntu1'.
cdrecord: Warning: using inofficial version of libscg
(ubuntu-0.8ubuntu1 '@(#)scsitransp.c 1.91 04/06/17 Copyright
1988,1995,2000-2004 J. Schilling').
SCSI buffer size: 64512
atapi: 1
Device type : Removable CD-ROM
Version : 0
Response Format: 2
Capabilities :
Vendor_info : 'SONY '
Identifikation : 'DVD RW DRU-500A '
Revision : '2.0h'
Device seems to be: Generic mmc2 DVD-R/DVD-RW.
Current: 0x0009
Profile: 0x001B
Profile: 0x001A
Profile: 0x0014
Profile: 0x0013
Profile: 0x0011
Profile: 0x0010
Profile: 0x000A
Profile: 0x0009 (current)
Profile: 0x0008
And nothing else happens. (The drive light isn't even lit.)
The machine isn't locked up. (I'm typing this as it happened.)
After a minute or so, /var/log/kern.log reports this:
Jun 13 08:57:25 localhost kernel: ide-cd: cmd 0x3 timed out
Jun 13 08:57:25 localhost kernel: hda: lost interrupt
A bit later (exactly one minute) kern.log again reports:
Jun 13 08:58:25 localhost kernel: ide-cd: cmd 0x3 timed out
Jun 13 08:58:25 localhost kernel: hda: lost interrupt
Then nothing else seems to happen, though I've waited several minutes
more.
When I try to Ctrl-C the cdrecord process, it seems to be ignored.
But many minutes later the process dies after kern.log logs:
Jun 13 09:05:05 localhost kernel: hda: lost interrupt
Jun 13 09:06:05 localhost kernel: ide-cd: cmd 0x1e timed out
Jun 13 09:06:05 localhost kernel: hda: lost interrupt
After this point all access to the cd drive takes a *very* long
time to complete (or doesn't seem to complete at all).
The first time I ran "eject -v /dev/hda" it took several minutes to
complete, during which time kern.log again reported:
Jun 13 09:18:20 localhost kernel: hda: lost interrupt
Jun 13 09:19:20 localhost kernel: hda: lost interrupt
Jun 13 09:20:20 localhost kernel: ide-cd: cmd 0x1e timed out
Jun 13 09:20:20 localhost kernel: hda: lost interrupt
The second time I did eject it didn't seem to complete at all and
kern.log reported:
Jun 13 09:18:20 localhost kernel: hda: lost interrupt
Jun 13 09:19:20 localhost kernel: hda: lost interrupt
Jun 13 09:20:20 localhost kernel: ide-cd: cmd 0x1e timed out
Jun 13 09:20:20 localhost kernel: hda: lost interrupt
Jun 13 09:21:43 localhost kernel: hda: lost interrupt
Jun 13 09:22:43 localhost kernel: ide-cd: cmd 0x3 timed out
Jun 13 09:22:43 localhost kernel: hda: lost interrupt
Jun 13 09:23:42 localhost kernel: ide-cd: cmd 0x3 timed out
Jun 13 09:23:42 localhost kernel: hda: lost interrupt
Jun 13 09:24:44 localhost kernel: hda: lost interrupt
Jun 13 09:25:44 localhost kernel: hda: lost interrupt
Jun 13 09:26:44 localhost kernel: ide-cd: cmd 0x25 timed out
Jun 13 09:26:44 localhost kernel: hda: lost interrupt
Jun 13 09:27:44 localhost kernel: ide-cd: cmd 0x25 timed out
Jun 13 09:27:44 localhost kernel: hda: lost interrupt
Jun 13 09:28:44 localhost kernel: hda: lost interrupt
Jun 13 09:29:44 localhost kernel: hda: lost interrupt
Jun 13 09:30:44 localhost kernel: hda: lost interrupt
Jun 13 09:31:44 localhost kernel: hda: lost interrupt
Jun 13 09:32:44 localhost kernel: hda: lost interrupt
Jun 13 09:33:44 localhost kernel: hda: lost interrupt
Any ideas on how I can fix this?
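The opcodes in the "ide-cd: cmd 0x.. timed out" lines are packet command opcodes, so the log can be made easier to read by decoding them. This is a minimal sketch: the helper names are hypothetical, the table covers only the three opcodes that appear in this thread, and the names follow the standard SCSI/MMC command set.

```c
#include <stdio.h>
#include <string.h>

/* Names for the opcodes seen in this thread (SCSI/MMC command set). */
static const char *packet_cmd_name(unsigned int opcode)
{
    switch (opcode) {
    case 0x03: return "REQUEST SENSE";
    case 0x1e: return "PREVENT/ALLOW MEDIUM REMOVAL";
    case 0x25: return "READ CAPACITY";
    default:   return "unknown";
    }
}

/*
 * Extract the opcode from a log line such as
 * "... kernel: ide-cd: cmd 0x1e timed out".
 * Returns the opcode, or -1 if the line is not a timeout report.
 */
static int parse_timed_out_cmd(const char *line)
{
    const char *p = strstr(line, "ide-cd: cmd 0x");
    unsigned int opcode;

    if (p == NULL)
        return -1;
    if (sscanf(p, "ide-cd: cmd 0x%x", &opcode) != 1)
        return -1;
    return (int)opcode;
}
```

Read this way, the log shows REQUEST SENSE timing out first, followed later by PREVENT/ALLOW MEDIUM REMOVAL and READ CAPACITY, so even basic status commands stop completing once the first interrupt is lost.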
--
Jeff Wiegley, PhD
Cyte.Com, LLC
(ignore:cea2d3a38843531c7def1deff59114de)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: amd64 cdrom access locks system
2005-06-13 16:35 ` Jeff Wiegley
@ 2005-06-14 7:55 ` Bartlomiej Zolnierkiewicz
2005-06-14 10:35 ` Jeff Wiegley
0 siblings, 1 reply; 20+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2005-06-14 7:55 UTC (permalink / raw)
To: Jeff Wiegley; +Cc: B.Zolnierkiewicz, linux-kernel, akpm, Jens Axboe
[ Jens added to cc: ]
On 6/13/05, Jeff Wiegley <jeffw@cyte.com> wrote:
> Andrew Morton said I should carbon copy the IDE developer on this
> issue so I have in the hopes of re-opening this issue and making
> some progress since I'm still unable to write anything with my
> cd-burner.
>
> Here's what I know to date:
>
> I have the alim15x3 IDE driver installed and running.
> I do NOT have any of the generic IDE drivers installed or
> even compiled as they grossly interfere with the alim15x3
> and cause a kernel panic.
> My hardware is an AMD64 FX55 in a Shuttle ST20G5 case with a
> serial ATA harddrive.
> I'm using a stock 2.6.12-rc6 kernel.
> Debian unstable distribution.
>
> At first I can read from the drive fine.
> For instance I did two "cdparanoia -B -d /dev/hda" without
> a hitch. Nothing was reported in /var/log/kernel as a result.
>
> The problem is that I can't write to the drive (burn cds with
> cdrecord) without causing a lost interrupt, after which nothing works;
> even reads don't respond.
>
> When I do:
> cdrecord -v -tao dev=ATAPI:/dev/hda something.iso
>
> I get this output:
> Cdrecord-Clone 2.01.01a01 (x86_64-unknown-linux-gnu) Copyright (C)
> 1995-2004 Joerg Schilling
> NOTE: this version of cdrecord is an inofficial (modified) release of
> cdrecord
> and thus may have bugs that are not present in the original
> version.
> Please send bug reports and support requests to
> <cdrtools@packages.debian.org>.
> The original author should not be bothered with problems of
> this version.
>
> cdrecord: Warning: Running on Linux-2.6.12-rc6-jw14
> cdrecord: There are unsettled issues with Linux-2.5 and newer.
> cdrecord: If you have unexpected problems, please try Linux-2.4 or
> Solaris.
> TOC Type: 1 = CD-ROM
> scsidev: 'ATAPI:/dev/hda'
> devname: 'ATAPI:/dev/hda'
> scsibus: -2 target: -2 lun: -2
> Warning: Using ATA Packet interface.
> Warning: The related Linux kernel interface code seems to be
> unmaintained.
^^^
> Warning: There is absolutely NO DMA, operations thus are slow.
^^^
What is the result of using "dev=/dev/hda" interface instead
(as suggested by Robert)?
Bartlomiej
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: amd64 cdrom access locks system
2005-06-14 7:55 ` Bartlomiej Zolnierkiewicz
@ 2005-06-14 10:35 ` Jeff Wiegley
2005-06-14 18:16 ` Bartlomiej Zolnierkiewicz
0 siblings, 1 reply; 20+ messages in thread
From: Jeff Wiegley @ 2005-06-14 10:35 UTC (permalink / raw)
To: Bartlomiej Zolnierkiewicz
Cc: B.Zolnierkiewicz, linux-kernel, akpm, Jens Axboe
Using "dev=/dev/hda" yields the exact same behavior...
Jun 14 03:21:50 localhost kernel: ide-cd: cmd 0x3 timed out
Jun 14 03:21:50 localhost kernel: hda: lost interrupt
Jun 14 03:22:50 localhost kernel: ide-cd: cmd 0x3 timed out
Jun 14 03:22:50 localhost kernel: hda: lost interrupt
Jun 14 03:23:30 localhost kernel: hda: lost interrupt
And I'm a little confused by Robert's suggestion... Should it
ever be possible for a user-space application to cause lost
interrupts and other kernel state problems regardless of what
"interface" is used?? Sure, if the application uses the wrong
interface it should get spanked somehow but should it be able to
mess up the kernel for other applications as well? (Like now
I can't read or eject.)
The output from the cdrecord command was:
root@mail:~# cdrecord -v -eject -tao dev=/dev/hda stupid.iso
Cdrecord-Clone 2.01.01a01 (x86_64-unknown-linux-gnu) Copyright (C)
1995-2004 Joerg Schilling
NOTE: this version of cdrecord is an inofficial (modified) release of
cdrecord
and thus may have bugs that are not present in the original
version.
Please send bug reports and support requests to
<cdrtools@packages.debian.org>.
The original author should not be bothered with problems of
this version.
cdrecord: Warning: Running on Linux-2.6.12-rc6-jw14
cdrecord: There are unsettled issues with Linux-2.5 and newer.
cdrecord: If you have unexpected problems, please try Linux-2.4 or
Solaris.
TOC Type: 1 = CD-ROM
scsidev: '/dev/hda'
devname: '/dev/hda'
scsibus: -2 target: -2 lun: -2
Warning: Open by 'devname' is unintentional and not supported.
Linux sg driver version: 3.5.27
Using libscg version 'ubuntu-0.8ubuntu1'.
cdrecord: Warning: using inofficial version of libscg
(ubuntu-0.8ubuntu1 '@(#)scsitransp.c 1.91 04/06/17 Copyright
1988,1995,2000-2004 J. Schilling').
SCSI buffer size: 64512
atapi: 1
Device type : Removable CD-ROM
Version : 0
Response Format: 2
Capabilities :
Vendor_info : 'SONY '
Identifikation : 'DVD RW DRU-500A '
Revision : '2.0h'
Device seems to be: Generic mmc2 DVD-R/DVD-RW.
Current: 0x0009
Profile: 0x001B
Profile: 0x001A
Profile: 0x0014
Profile: 0x0013
Profile: 0x0011
Profile: 0x0010
Profile: 0x000A
Profile: 0x0009 (current)
Profile: 0x0008
Since the kernel gets messed up and reports lost interrupts, I'm
inclined to believe that this is a kernel/driver issue and not my
misuse of an application/interface. Though I realize cdrecord is
being run as the superuser and therefore might be overriding some
kernel security checks and messing with the kernel, so I might be
wrong about that.
One question comes to mind... Would Robert's suggestion and my
results be affected by the fact that I don't have Packet Writing
for CD drives turned on in the current kernel?
Any other ideas?
Bartlomiej Zolnierkiewicz wrote:
> [ Jens added to cc: ]
>
> On 6/13/05, Jeff Wiegley <jeffw@cyte.com> wrote:
>
>>Andrew Morton said I should carbon copy the IDE developer on this
>>issue so I have in the hopes of re-opening this issue and making
>>some progress since I'm still unable to write anything with my
>>cd-burner.
>>
>>Here's what I know to date:
>>
>> I have the alim15x3 IDE driver installed and running.
>> I do NOT have any of the generic IDE drivers installed or
>> even compiled as they grossly interfere with the alim15x3
>> and cause a kernel panic.
>> My hardware is an AMD64 FX55 in a Shuttle ST20G5 case with a
>> serial ATA harddrive.
>> I'm using a stock 2.6.12-rc6 kernel.
>> Debian unstable distribution.
>>
>>At first I can read from the drive fine.
>> For instance I did two "cdparanoia -B -d /dev/hda" without
>> a hitch. Nothing was reported in /var/log/kernel as a result.
>>
>>The problem is that I can't write to the drive (burn cds with
>>cdrecord) without causing a lost interrupt, after which nothing works;
>>even reads don't respond.
>>
>>When I do:
>> cdrecord -v -tao dev=ATAPI:/dev/hda something.iso
>>
>>I get this output:
>> Cdrecord-Clone 2.01.01a01 (x86_64-unknown-linux-gnu) Copyright (C)
>>1995-2004 Joerg Schilling
>> NOTE: this version of cdrecord is an inofficial (modified) release of
>>cdrecord
>> and thus may have bugs that are not present in the original
>>version.
>> Please send bug reports and support requests to
>><cdrtools@packages.debian.org>.
>> The original author should not be bothered with problems of
>>this version.
>>
>> cdrecord: Warning: Running on Linux-2.6.12-rc6-jw14
>> cdrecord: There are unsettled issues with Linux-2.5 and newer.
>> cdrecord: If you have unexpected problems, please try Linux-2.4 or
>>Solaris.
>> TOC Type: 1 = CD-ROM
>> scsidev: 'ATAPI:/dev/hda'
>> devname: 'ATAPI:/dev/hda'
>> scsibus: -2 target: -2 lun: -2
>> Warning: Using ATA Packet interface.
>> Warning: The related Linux kernel interface code seems to be
>>unmaintained.
>
>
> ^^^
>
>
>> Warning: There is absolutely NO DMA, operations thus are slow.
>
>
> ^^^
>
> What is the result of using "dev=/dev/hda" interface instead
> (as suggested by Robert)?
>
> Bartlomiej
--
Jeff Wiegley, PhD
Cyte.Com, LLC
(ignore:cea2d3a38843531c7def1deff59114de)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: amd64 cdrom access locks system
2005-06-14 10:35 ` Jeff Wiegley
@ 2005-06-14 18:16 ` Bartlomiej Zolnierkiewicz
2005-12-15 9:15 ` Aric Cyr
0 siblings, 1 reply; 20+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2005-06-14 18:16 UTC (permalink / raw)
To: Jeff Wiegley; +Cc: B.Zolnierkiewicz, linux-kernel, akpm, Jens Axboe
On 6/14/05, Jeff Wiegley <jeffw@cyte.com> wrote:
> Using "dev=/dev/hda" yields the exact same behavior...
>
> Jun 14 03:21:50 localhost kernel: ide-cd: cmd 0x3 timed out
> Jun 14 03:21:50 localhost kernel: hda: lost interrupt
> Jun 14 03:22:50 localhost kernel: ide-cd: cmd 0x3 timed out
> Jun 14 03:22:50 localhost kernel: hda: lost interrupt
> Jun 14 03:23:30 localhost kernel: hda: lost interrupt
Jens, any idea?
> And I'm a little confused by Robert's suggestion... Should it
> ever be possible for a user-space application to cause lost
> interrupts and other kernel state problems regardless of what
> "interface" is used?? Sure, if the application uses the wrong
> interface it should get spanked somehow but should it be able to
> mess up the kernel for other applications as well? (Like now
> I can't read or eject.)
It shouldn't be possible unless it is a "raw" interface
(requires CAP_SYS_RAWIO) that doesn't check all possible
parameters (which is not always possible) or the device is buggy.
Also it is quite unlikely that somebody will fix an obsolete
interface (hey, it got obsoleted for some reason ;-).
> The output from the cdrecord command was:
> root@mail:~# cdrecord -v -eject -tao dev=/dev/hda stupid.iso
> Cdrecord-Clone 2.01.01a01 (x86_64-unknown-linux-gnu) Copyright (C)
> 1995-2004 Joerg Schilling
> NOTE: this version of cdrecord is an inofficial (modified) release of
> cdrecord
> and thus may have bugs that are not present in the original
> version.
> Please send bug reports and support requests to
> <cdrtools@packages.debian.org>.
> The original author should not be bothered with problems of
> this version.
>
> cdrecord: Warning: Running on Linux-2.6.12-rc6-jw14
> cdrecord: There are unsettled issues with Linux-2.5 and newer.
> cdrecord: If you have unexpected problems, please try Linux-2.4 or
> Solaris.
> TOC Type: 1 = CD-ROM
> scsidev: '/dev/hda'
> devname: '/dev/hda'
> scsibus: -2 target: -2 lun: -2
> Warning: Open by 'devname' is unintentional and not supported.
> Linux sg driver version: 3.5.27
> Using libscg version 'ubuntu-0.8ubuntu1'.
> cdrecord: Warning: using inofficial version of libscg
> (ubuntu-0.8ubuntu1 '@(#)scsitransp.c 1.91 04/06/17 Copyright
> 1988,1995,2000-2004 J. Schilling').
> SCSI buffer size: 64512
> atapi: 1
> Device type : Removable CD-ROM
> Version : 0
> Response Format: 2
> Capabilities :
> Vendor_info : 'SONY '
> Identifikation : 'DVD RW DRU-500A '
> Revision : '2.0h'
> Device seems to be: Generic mmc2 DVD-R/DVD-RW.
> Current: 0x0009
> Profile: 0x001B
> Profile: 0x001A
> Profile: 0x0014
> Profile: 0x0013
> Profile: 0x0011
> Profile: 0x0010
> Profile: 0x000A
> Profile: 0x0009 (current)
> Profile: 0x0008
>
> Since the kernel gets messed up and reports lost interrupts, I'm
> inclined to believe that this is a kernel/driver issue and not my
> misuse of an application/interface. Though I realize cdrecord is
> being run as the superuser and therefore might be overriding some
> kernel security checks and messing with the kernel, so I might be
> wrong about that.
>
> One question comes to mind... Would Robert's suggestion and my
> results be affected by the fact that I don't have Packet Writing
> for CD drives turned on in the current kernel?
No.
Bartlomiej
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: amd64 cdrom access locks system
2005-06-14 18:16 ` Bartlomiej Zolnierkiewicz
@ 2005-12-15 9:15 ` Aric Cyr
0 siblings, 0 replies; 20+ messages in thread
From: Aric Cyr @ 2005-12-15 9:15 UTC (permalink / raw)
To: linux-kernel
Bartlomiej Zolnierkiewicz <bzolnier <at> gmail.com> writes:
>
> On 6/14/05, Jeff Wiegley <jeffw <at> cyte.com> wrote:
> > Using "dev=/dev/hda" yields the exact same behavior...
> >
> > Jun 14 03:21:50 localhost kernel: ide-cd: cmd 0x3 timed out
> > Jun 14 03:21:50 localhost kernel: hda: lost interrupt
> > Jun 14 03:22:50 localhost kernel: ide-cd: cmd 0x3 timed out
> > Jun 14 03:22:50 localhost kernel: hda: lost interrupt
> > Jun 14 03:23:30 localhost kernel: hda: lost interrupt
>
> Jens, any idea?
>
> > And I'm a little confused by Robert's suggestion... Should it
> > ever be possible for a user-space application to cause lost
> > interrupts and other kernel state problems regardless of what
> > "interface" is used?? Sure, if the application uses the wrong
> > interface it should get spanked somehow but should it be able to
> > mess up the kernel for other applications as well? (Like now
> > I can't read or eject.)
>
> It shouldn't be possible unless it is a "raw" interface
> (requires CAP_SYS_RAWIO) that doesn't check all possible
> parameters (which is not always possible) or the device is buggy.
>
> Also it is quite unlikely that somebody will fix an obsolete
> interface (hey, it got obsoleted for some reason ;-).
>
> Bartlomiej
>
Has this problem been fixed, or are any workarounds known? I am having the
exact same issue with similar hardware and the alim15x3 driver. In my case it
does not matter which method I use for cdrecord (ATA:, ATAPI: or dev=/dev/hda);
I always get the lost interrupts from the command "cdrecord -atip". I have
tried other drives without success, so I don't believe the drive is the problem.
Interestingly, cdrdao does not have any problems at all and burns perfectly, so I
suspect that cdrecord might be sending some command that ide-cd or the IDE
drive doesn't like and fails to recover from. However, disabling DMA on the
drive via hdparm makes cdrecord work perfectly, so I suspect the alim15x3 driver
more than anything else. I can play DVDs for hours with DMA enabled just fine
though... go figure. My current kernel is 2.6.14-gentoo-r6, but I have had this
problem since I first got the system (around 2.6.12).
I'm really anxious to track this down, so if anyone has any information, or needs
something from me (logs or debugging), please don't hesitate to ask.
Regards,
Aric
^ permalink raw reply [flat|nested] 20+ messages in thread