* 2.6.17-rc1-mm2: badness in 3w_xxxx driver
@ 2006-04-09 18:23 Nick Orlov
2006-04-09 18:32 ` Andrew Morton
0 siblings, 1 reply; 7+ messages in thread
From: Nick Orlov @ 2006-04-09 18:23 UTC
To: linux-kernel; +Cc: Andrew Morton, Jens Axboe, James Bottomley
The following patch, x86-kmap_atomic-debugging.patch, exposed badness
in the 3w_xxxx driver. I'm getting a lot of:
Apr 9 13:00:04 nickolas kernel: kmap_atomic: local irqs are enabled while using KM_IRQn
Apr 9 13:00:04 nickolas kernel: <c0104103> show_trace+0x13/0x20 <c010412e> dump_stack+0x1e/0x20
Apr 9 13:00:04 nickolas kernel: <c01159c9> kmap_atomic+0x79/0xe0 <c028b885> tw_transfer_internal+0x85/0xa0
Apr 9 13:00:04 nickolas kernel: <c028ca7e> tw_interrupt+0x3fe/0x820 <c0143b9e> handle_IRQ_event+0x3e/0x80
Apr 9 13:00:04 nickolas kernel: <c0143c70> __do_IRQ+0x90/0x100 <c01057a6> do_IRQ+0x26/0x40
Apr 9 13:00:04 nickolas kernel: <c010396e> common_interrupt+0x1a/0x20 <c0101cdd> cpu_idle+0x4d/0xb0
Apr 9 13:00:04 nickolas kernel: <c010f2cc> start_secondary+0x24c/0x4b0 <00000000> 0x0
Apr 9 13:00:04 nickolas kernel: <c214ffb4> 0xc214ffb4
I'm running a 32-bit kernel on an AMD64 X2 with HIGHMEM enabled.
I think this is an old bug, since 3w-xxxx.c has not been changed for
a long time (at least since 2.6.16-rc1-mm4).
Please let me know if you want me to try some patches.
--
With best wishes,
Nick Orlov.
* Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
2006-04-09 18:23 2.6.17-rc1-mm2: badness in 3w_xxxx driver Nick Orlov
@ 2006-04-09 18:32 ` Andrew Morton
2006-04-09 19:12 ` Jeff Garzik
2006-04-09 19:12 ` Nick Orlov
0 siblings, 2 replies; 7+ messages in thread
From: Andrew Morton @ 2006-04-09 18:32 UTC
To: Nick Orlov; +Cc: linux-kernel, axboe, James.Bottomley
Nick Orlov <bugfixer@list.ru> wrote:
>
> The following patch: x86-kmap_atomic-debugging.patch exposed a badness
> in 3w_xxx driver.
Sweet, thanks.
> I'm getting a lot of:
>
> Apr 9 13:00:04 nickolas kernel: kmap_atomic: local irqs are enabled while using KM_IRQn
> Apr 9 13:00:04 nickolas kernel: <c0104103> show_trace+0x13/0x20 <c010412e> dump_stack+0x1e/0x20
> Apr 9 13:00:04 nickolas kernel: <c01159c9> kmap_atomic+0x79/0xe0 <c028b885> tw_transfer_internal+0x85/0xa0
> Apr 9 13:00:04 nickolas kernel: <c028ca7e> tw_interrupt+0x3fe/0x820 <c0143b9e> handle_IRQ_event+0x3e/0x80
> Apr 9 13:00:04 nickolas kernel: <c0143c70> __do_IRQ+0x90/0x100 <c01057a6> do_IRQ+0x26/0x40
> Apr 9 13:00:04 nickolas kernel: <c010396e> common_interrupt+0x1a/0x20 <c0101cdd> cpu_idle+0x4d/0xb0
> Apr 9 13:00:04 nickolas kernel: <c010f2cc> start_secondary+0x24c/0x4b0 <00000000> 0x0
> Apr 9 13:00:04 nickolas kernel: <c214ffb4> 0xc214ffb4
>
> I'm running 32 bit kernel on AMD64x2 w/ HIGHMEM enabled.
> I think this is an old bug since the 3w_xxxx.c has not been changed for
> a long time (at least since 2.6.16-rc1-mm4).
>
> Please let me know if you want me to try some patches.
>
From: Andrew Morton <akpm@osdl.org>
We must disable local IRQs while holding KM_IRQ0 or KM_IRQ1. Otherwise, an
IRQ handler could use those kmap slots while this code is using them,
resulting in memory corruption.
Thanks to Nick Orlov <bugfixer@list.ru> for reporting.
Cc: <linuxraid@amcc.com>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---
drivers/scsi/3w-xxxx.c | 3 +++
1 files changed, 3 insertions(+)
diff -puN drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix drivers/scsi/3w-xxxx.c
--- devel/drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix 2006-04-09 11:28:08.000000000 -0700
+++ devel-akpm/drivers/scsi/3w-xxxx.c 2006-04-09 11:29:21.000000000 -0700
@@ -1508,10 +1508,12 @@ static void tw_transfer_internal(TW_Devi
 	struct scsi_cmnd *cmd = tw_dev->srb[request_id];
 	void *buf;
 	unsigned int transfer_len;
+	unsigned long flags = 0;
 
 	if (cmd->use_sg) {
 		struct scatterlist *sg =
 			(struct scatterlist *)cmd->request_buffer;
+		local_irq_save(flags);
 		buf = kmap_atomic(sg->page, KM_IRQ0) + sg->offset;
 		transfer_len = min(sg->length, len);
 	} else {
@@ -1526,6 +1528,7 @@ static void tw_transfer_internal(TW_Devi
 
 		sg = (struct scatterlist *)cmd->request_buffer;
 		kunmap_atomic(buf - sg->offset, KM_IRQ0);
+		local_irq_restore(flags);
 	}
 }
 
_
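
An illustrative aside on the changelog above, not part of the posted patch: the following is a minimal userspace sketch of why a shared kmap slot needs interrupts disabled around it. It assumes a single slot and a single simulated CPU. Every name in it (slot_km_irq0, irqs_enabled, other_irq_handler, maybe_interrupt, transfer) is hypothetical and is not a kernel API; the real fix uses local_irq_save() and local_irq_restore() as shown in the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only. One slot shared by everything that runs on this
 * simulated CPU, standing in for the KM_IRQ0 mapping slot. */
static const char *slot_km_irq0;
static bool irqs_enabled = true;   /* stand-in for the CPU interrupt flag */

static const char page_a[] = "driver page";
static const char page_b[] = "other handler page";

/* A second interrupt handler that also maps something into the slot. */
static void other_irq_handler(void)
{
    slot_km_irq0 = page_b;   /* remap the shared slot */
    slot_km_irq0 = NULL;     /* and unmap it again when done */
}

/* Marks a point at which a hardware interrupt could be delivered. */
static void maybe_interrupt(void)
{
    if (irqs_enabled)
        other_irq_handler();
}

/* Returns true if the slot still points at the driver's page after the
 * window in which an interrupt might have arrived. */
static bool transfer(bool disable_irqs)
{
    bool saved = irqs_enabled;

    if (disable_irqs)
        irqs_enabled = false;            /* models local_irq_save(flags) */

    slot_km_irq0 = page_a;               /* models kmap_atomic(page, KM_IRQ0) */
    maybe_interrupt();                   /* an IRQ may nest here */
    bool ok = (slot_km_irq0 == page_a);  /* is our mapping still intact? */

    slot_km_irq0 = NULL;                 /* models kunmap_atomic() */
    irqs_enabled = saved;                /* models local_irq_restore(flags) */
    return ok;
}
```

In this model, transfer(false) loses its mapping because the nested handler reuses the slot, while transfer(true) keeps it. The real kernel case is per-CPU and involves page-table entries rather than a pointer, so treat this only as a picture of the race.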
* Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
2006-04-09 18:32 ` Andrew Morton
@ 2006-04-09 19:12 ` Jeff Garzik
2006-04-09 19:21 ` Arjan van de Ven
2006-04-09 19:12 ` Nick Orlov
1 sibling, 1 reply; 7+ messages in thread
From: Jeff Garzik @ 2006-04-09 19:12 UTC
To: Andrew Morton; +Cc: Nick Orlov, linux-kernel, axboe, James.Bottomley
Andrew Morton wrote:
> Nick Orlov <bugfixer@list.ru> wrote:
>> The following patch: x86-kmap_atomic-debugging.patch exposed a badness
>> in 3w_xxx driver.
>
> Sweet, thanks.
>
>> I'm getting a lot of:
>>
>> Apr 9 13:00:04 nickolas kernel: kmap_atomic: local irqs are enabled while using KM_IRQn
>> Apr 9 13:00:04 nickolas kernel: <c0104103> show_trace+0x13/0x20 <c010412e> dump_stack+0x1e/0x20
>> Apr 9 13:00:04 nickolas kernel: <c01159c9> kmap_atomic+0x79/0xe0 <c028b885> tw_transfer_internal+0x85/0xa0
>> Apr 9 13:00:04 nickolas kernel: <c028ca7e> tw_interrupt+0x3fe/0x820 <c0143b9e> handle_IRQ_event+0x3e/0x80
>> Apr 9 13:00:04 nickolas kernel: <c0143c70> __do_IRQ+0x90/0x100 <c01057a6> do_IRQ+0x26/0x40
>> Apr 9 13:00:04 nickolas kernel: <c010396e> common_interrupt+0x1a/0x20 <c0101cdd> cpu_idle+0x4d/0xb0
>> Apr 9 13:00:04 nickolas kernel: <c010f2cc> start_secondary+0x24c/0x4b0 <00000000> 0x0
>> Apr 9 13:00:04 nickolas kernel: <c214ffb4> 0xc214ffb4
>>
>> I'm running 32 bit kernel on AMD64x2 w/ HIGHMEM enabled.
>> I think this is an old bug since the 3w_xxxx.c has not been changed for
>> a long time (at least since 2.6.16-rc1-mm4).
>>
>> Please let me know if you want me to try some patches.
>>
>
>
> From: Andrew Morton <akpm@osdl.org>
>
> We must disable local IRQs while holding KM_IRQ0 or KM_IRQ1. Otherwise, an
> IRQ handler could use those kmap slots while this code is using them,
> resulting in memory corruption.
>
> Thanks to Nick Orlov <bugfixer@list.ru> for reporting.
>
> Cc: <linuxraid@amcc.com>
> Cc: James Bottomley <James.Bottomley@SteelEye.com>
> Signed-off-by: Andrew Morton <akpm@osdl.org>
> ---
>
> drivers/scsi/3w-xxxx.c | 3 +++
> 1 files changed, 3 insertions(+)
>
> diff -puN drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix drivers/scsi/3w-xxxx.c
> --- devel/drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix 2006-04-09 11:28:08.000000000 -0700
> +++ devel-akpm/drivers/scsi/3w-xxxx.c 2006-04-09 11:29:21.000000000 -0700
> @@ -1508,10 +1508,12 @@ static void tw_transfer_internal(TW_Devi
> struct scsi_cmnd *cmd = tw_dev->srb[request_id];
> void *buf;
> unsigned int transfer_len;
> + unsigned long flags = 0;
>
> if (cmd->use_sg) {
> struct scatterlist *sg =
> (struct scatterlist *)cmd->request_buffer;
> + local_irq_save(flags);
> buf = kmap_atomic(sg->page, KM_IRQ0) + sg->offset;
> transfer_len = min(sg->length, len);
> } else {
> @@ -1526,6 +1528,7 @@ static void tw_transfer_internal(TW_Devi
>
> sg = (struct scatterlist *)cmd->request_buffer;
> kunmap_atomic(buf - sg->offset, KM_IRQ0);
> + local_irq_restore(flags);
ACK.
Though please make sure the active maintainer is CC'd on this... There
is even a helpful MAINTAINERS entry for this driver.
Jeff
* Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
2006-04-09 18:32 ` Andrew Morton
2006-04-09 19:12 ` Jeff Garzik
@ 2006-04-09 19:12 ` Nick Orlov
2006-04-09 19:43 ` Andrew Morton
1 sibling, 1 reply; 7+ messages in thread
From: Nick Orlov @ 2006-04-09 19:12 UTC
To: Andrew Morton; +Cc: linux-kernel
On Sun, Apr 09, 2006 at 11:32:40AM -0700, Andrew Morton wrote:
> Nick Orlov <bugfixer@list.ru> wrote:
> >
> > The following patch: x86-kmap_atomic-debugging.patch exposed a badness
> > in 3w_xxx driver.
>
> Sweet, thanks.
>
[[ skipped ]]
>
> From: Andrew Morton <akpm@osdl.org>
>
> We must disable local IRQs while holding KM_IRQ0 or KM_IRQ1. Otherwise, an
> IRQ handler could use those kmap slots while this code is using them,
> resulting in memory corruption.
>
> Thanks to Nick Orlov <bugfixer@list.ru> for reporting.
>
> Cc: <linuxraid@amcc.com>
> Cc: James Bottomley <James.Bottomley@SteelEye.com>
> Signed-off-by: Andrew Morton <akpm@osdl.org>
> ---
>
> drivers/scsi/3w-xxxx.c | 3 +++
> 1 files changed, 3 insertions(+)
>
> diff -puN drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix drivers/scsi/3w-xxxx.c
> --- devel/drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix 2006-04-09 11:28:08.000000000 -0700
> +++ devel-akpm/drivers/scsi/3w-xxxx.c 2006-04-09 11:29:21.000000000 -0700
> @@ -1508,10 +1508,12 @@ static void tw_transfer_internal(TW_Devi
> struct scsi_cmnd *cmd = tw_dev->srb[request_id];
> void *buf;
> unsigned int transfer_len;
> + unsigned long flags = 0;
>
> if (cmd->use_sg) {
> struct scatterlist *sg =
> (struct scatterlist *)cmd->request_buffer;
> + local_irq_save(flags);
> buf = kmap_atomic(sg->page, KM_IRQ0) + sg->offset;
> transfer_len = min(sg->length, len);
> } else {
> @@ -1526,6 +1528,7 @@ static void tw_transfer_internal(TW_Devi
>
> sg = (struct scatterlist *)cmd->request_buffer;
> kunmap_atomic(buf - sg->offset, KM_IRQ0);
> + local_irq_restore(flags);
> }
> }
>
> _
Confirmed, this patch solves the "badness" problem for me.
I'm still experiencing weird hangs, though (the box just hangs, no
messages on console/syslog, nothing). I'll try to nail it down.
2.6.16-mm2 works like a charm with the same config.
Do you know which patches I should try reverting first?
--
With best wishes,
Nick Orlov.
* Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
2006-04-09 19:12 ` Jeff Garzik
@ 2006-04-09 19:21 ` Arjan van de Ven
0 siblings, 0 replies; 7+ messages in thread
From: Arjan van de Ven @ 2006-04-09 19:21 UTC
To: Jeff Garzik
Cc: Andrew Morton, Nick Orlov, linux-kernel, axboe, James.Bottomley
> > Cc: <linuxraid@amcc.com>
> > ---
> >
> ACK.
>
> Though please make sure the active maintainer is CC'd on this... There
> is even a helpful MAINTAINERS entry for this driver.
I'd say it is ;-)
* Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
2006-04-09 19:12 ` Nick Orlov
@ 2006-04-09 19:43 ` Andrew Morton
2006-04-09 21:23 ` Nick Orlov
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2006-04-09 19:43 UTC
To: Nick Orlov; +Cc: linux-kernel
Nick Orlov <bugfixer@list.ru> wrote:
>
> Confirmed, this patch solves the "badness" problem for me.
yup, thanks.
> I still experiencing a weird hangs though (the box just hangs, no
> messages on console/syslog, nothing). I'll try to nail it down.
>
> 2.6.16-mm2 works like a charm with the same config.
> Do you know which patches should I try to revert first?
Gee, 2.6.16-mm2 was a long time ago.
Tried sysrq?
echo 1 > /proc/sys/kernel/sysrq
<wait for hang>
ALT-SYSRQ-P or ALT-SYSRQ-T
Is the NMI watchdog enabled? Boot with `nmi_watchdog=1' and make sure that
the NMI counts are incrementing in /proc/interrupts.
Failing all that, testing 2.6.17-rc1 would be interesting.
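
An illustrative aside on the NMI check above, not part of the original message: one way to tell whether the NMI counts are incrementing is to read the "NMI:" line from /proc/interrupts twice and compare the per-CPU totals. The helper below is a minimal sketch that only parses one such line from a string. The name nmi_total is hypothetical, and the exact line layout varies between kernel versions, so the format it accepts is an assumption.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Sum the per-CPU counters on an "NMI:" line as it appears in
 * /proc/interrupts. Returns -1 if the line is not the NMI line.
 * Hypothetical helper for illustration only. */
static long nmi_total(const char *line)
{
    /* Skip leading whitespace before the label. */
    while (isspace((unsigned char)*line))
        line++;

    if (strncmp(line, "NMI:", 4) != 0)
        return -1;

    const char *p = line + 4;
    long total = 0;

    for (;;) {
        char *end;
        unsigned long v = strtoul(p, &end, 10);

        if (end == p)        /* no further number on the line */
            break;
        total += (long)v;
        p = end;
    }
    return total;
}
```

Calling nmi_total() on the NMI line a few seconds apart and checking that the second result is larger than the first would indicate that the watchdog is ticking on at least one CPU.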
* Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
2006-04-09 19:43 ` Andrew Morton
@ 2006-04-09 21:23 ` Nick Orlov
0 siblings, 0 replies; 7+ messages in thread
From: Nick Orlov @ 2006-04-09 21:23 UTC
To: Andrew Morton; +Cc: linux-kernel
On Sun, Apr 09, 2006 at 12:43:01PM -0700, Andrew Morton wrote:
> Nick Orlov <bugfixer@list.ru> wrote:
> >
> > Confirmed, this patch solves the "badness" problem for me.
>
> yup, thanks.
>
> > I still experiencing a weird hangs though (the box just hangs, no
> > messages on console/syslog, nothing). I'll try to nail it down.
> >
> > 2.6.16-mm2 works like a charm with the same config.
> > Do you know which patches should I try to revert first?
>
> Gee, 2.6.16-mm2 was a long time ago.
>
> Tried sysrq?
>
> echo 1 > /proc/sys/kernel/sysrq
> <wait for hang>
> ALT-SYSRQ-P or ALT-SYSRQ-T
>
> Is the NMI watchdog enabled? Boot with `nmi_watchdog=1', make sure that
> the NMI counts are incrementing in /proc/interrupts.
>
> Failing all that, testing 2.6.17-rc1 would be interesting.
2.6.17-rc1 fails in the same fashion - it hangs "randomly".
The good news is that I've found the pattern and a solution:
it always happens when two applications open /dev/dsp simultaneously.
Applying the following patches published by Takashi Iwai solves the
problem:
http://marc.theaimsgroup.com/?l=linux-kernel&m=114423578508165&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=114424198614019&w=2
Not sure if the first one is enough.
I would recommend putting them into the hot-fixes,
since many people could be frustrated by this.
--
With best wishes,
Nick Orlov.
Thread overview: 7+ messages
2006-04-09 18:23 2.6.17-rc1-mm2: badness in 3w_xxxx driver Nick Orlov
2006-04-09 18:32 ` Andrew Morton
2006-04-09 19:12 ` Jeff Garzik
2006-04-09 19:21 ` Arjan van de Ven
2006-04-09 19:12 ` Nick Orlov
2006-04-09 19:43 ` Andrew Morton
2006-04-09 21:23 ` Nick Orlov