* 2.6.17-rc1-mm2: badness in 3w_xxxx driver
@ 2006-04-09 18:23 Nick Orlov
  2006-04-09 18:32 ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Nick Orlov @ 2006-04-09 18:23 UTC
  To: linux-kernel; +Cc: Andrew Morton, Jens Axboe, James Bottomley

The following patch: x86-kmap_atomic-debugging.patch exposed a badness
in 3w_xxx driver.

I'm getting a lot of:

Apr 9 13:00:04 nickolas kernel: kmap_atomic: local irqs are enabled while using KM_IRQn
Apr 9 13:00:04 nickolas kernel: <c0104103> show_trace+0x13/0x20 <c010412e> dump_stack+0x1e/0x20
Apr 9 13:00:04 nickolas kernel: <c01159c9> kmap_atomic+0x79/0xe0 <c028b885> tw_transfer_internal+0x85/0xa0
Apr 9 13:00:04 nickolas kernel: <c028ca7e> tw_interrupt+0x3fe/0x820 <c0143b9e> handle_IRQ_event+0x3e/0x80
Apr 9 13:00:04 nickolas kernel: <c0143c70> __do_IRQ+0x90/0x100 <c01057a6> do_IRQ+0x26/0x40
Apr 9 13:00:04 nickolas kernel: <c010396e> common_interrupt+0x1a/0x20 <c0101cdd> cpu_idle+0x4d/0xb0
Apr 9 13:00:04 nickolas kernel: <c010f2cc> start_secondary+0x24c/0x4b0 <00000000> 0x0
Apr 9 13:00:04 nickolas kernel: <c214ffb4> 0xc214ffb4

I'm running 32 bit kernel on AMD64x2 w/ HIGHMEM enabled.
I think this is an old bug since the 3w_xxxx.c has not been changed for
a long time (at least since 2.6.16-rc1-mm4).

Please let me know if you want me to try some patches.

-- 
With best wishes,
	Nick Orlov.

* Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
  2006-04-09 18:23 2.6.17-rc1-mm2: badness in 3w_xxxx driver Nick Orlov
@ 2006-04-09 18:32 ` Andrew Morton
  2006-04-09 19:12   ` Jeff Garzik
  2006-04-09 19:12   ` Nick Orlov
  0 siblings, 2 replies; 7+ messages in thread
From: Andrew Morton @ 2006-04-09 18:32 UTC
  To: Nick Orlov; +Cc: linux-kernel, axboe, James.Bottomley

Nick Orlov <bugfixer@list.ru> wrote:
>
> The following patch: x86-kmap_atomic-debugging.patch exposed a badness
> in 3w_xxx driver.

Sweet, thanks.

> I'm getting a lot of:
>
> Apr 9 13:00:04 nickolas kernel: kmap_atomic: local irqs are enabled while using KM_IRQn
> Apr 9 13:00:04 nickolas kernel: <c0104103> show_trace+0x13/0x20 <c010412e> dump_stack+0x1e/0x20
> Apr 9 13:00:04 nickolas kernel: <c01159c9> kmap_atomic+0x79/0xe0 <c028b885> tw_transfer_internal+0x85/0xa0
> Apr 9 13:00:04 nickolas kernel: <c028ca7e> tw_interrupt+0x3fe/0x820 <c0143b9e> handle_IRQ_event+0x3e/0x80
> Apr 9 13:00:04 nickolas kernel: <c0143c70> __do_IRQ+0x90/0x100 <c01057a6> do_IRQ+0x26/0x40
> Apr 9 13:00:04 nickolas kernel: <c010396e> common_interrupt+0x1a/0x20 <c0101cdd> cpu_idle+0x4d/0xb0
> Apr 9 13:00:04 nickolas kernel: <c010f2cc> start_secondary+0x24c/0x4b0 <00000000> 0x0
> Apr 9 13:00:04 nickolas kernel: <c214ffb4> 0xc214ffb4
>
> I'm running 32 bit kernel on AMD64x2 w/ HIGHMEM enabled.
> I think this is an old bug since the 3w_xxxx.c has not been changed for
> a long time (at least since 2.6.16-rc1-mm4).
>
> Please let me know if you want me to try some patches.
>

From: Andrew Morton <akpm@osdl.org>

We must disable local IRQs while holding KM_IRQ0 or KM_IRQ1.  Otherwise, an
IRQ handler could use those kmap slots while this code is using them,
resulting in memory corruption.

Thanks to Nick Orlov <bugfixer@list.ru> for reporting.

Cc: <linuxraid@amcc.com>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 drivers/scsi/3w-xxxx.c |    3 +++
 1 files changed, 3 insertions(+)

diff -puN drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix drivers/scsi/3w-xxxx.c
--- devel/drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix	2006-04-09 11:28:08.000000000 -0700
+++ devel-akpm/drivers/scsi/3w-xxxx.c	2006-04-09 11:29:21.000000000 -0700
@@ -1508,10 +1508,12 @@ static void tw_transfer_internal(TW_Devi
 	struct scsi_cmnd *cmd = tw_dev->srb[request_id];
 	void *buf;
 	unsigned int transfer_len;
+	unsigned long flags = 0;
 
 	if (cmd->use_sg) {
 		struct scatterlist *sg =
 			(struct scatterlist *)cmd->request_buffer;
+		local_irq_save(flags);
 		buf = kmap_atomic(sg->page, KM_IRQ0) + sg->offset;
 		transfer_len = min(sg->length, len);
 	} else {
@@ -1526,6 +1528,7 @@ static void tw_transfer_internal(TW_Devi
 
 		sg = (struct scatterlist *)cmd->request_buffer;
 		kunmap_atomic(buf - sg->offset, KM_IRQ0);
+		local_irq_restore(flags);
 	}
 }
 
_

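To illustrate why the patch above wraps the KM_IRQ0 mapping in
local_irq_save()/local_irq_restore(), here is a minimal user-space model in C.
It is a sketch under stated assumptions, not kernel code: every model_* name is
hypothetical, the "slot" is a single pointer standing in for the per-CPU
KM_IRQ0 mapping, and the "interrupt" is an ordinary function call that does
nothing while the modelled IRQ flag is off. The warning counter only imitates
the kind of check the debugging patch reports.

```c
/* User-space model only; all model_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool model_irqs_enabled = true;  /* stands in for the CPU IRQ flag */
static const void *model_slot_km_irq0;  /* stands in for the KM_IRQ0 slot */
static int model_warnings;              /* "irqs enabled" reports so far */

/* Disable modelled IRQs and return the previous state. */
static unsigned long model_irq_save(void)
{
	unsigned long flags = model_irqs_enabled ? 1UL : 0UL;

	model_irqs_enabled = false;
	return flags;
}

/* Put the modelled IRQ flag back to the saved state. */
static void model_irq_restore(unsigned long flags)
{
	model_irqs_enabled = (flags != 0);
}

/* Map a page into the slot; count a warning if IRQs are still enabled. */
static void model_kmap_irq0(const void *page)
{
	if (model_irqs_enabled)
		model_warnings++;
	model_slot_km_irq0 = page;
}

static void model_kunmap_irq0(void)
{
	model_slot_km_irq0 = NULL;
}

/* A handler that reuses the same slot; it only runs if IRQs are enabled. */
static void model_deliver_irq(const void *other_page)
{
	if (!model_irqs_enabled)
		return;                         /* interrupt stays pending */
	model_slot_km_irq0 = other_page;    /* handler maps its own page */
	model_slot_km_irq0 = NULL;          /* and unmaps it again */
}

/* Returns true if our mapping survived an interrupt arriving mid-transfer. */
static bool model_transfer(const void *page, const void *irq_page, bool protect)
{
	unsigned long flags = 0;
	bool intact;

	assert(page != NULL);
	if (protect)
		flags = model_irq_save();
	model_kmap_irq0(page);
	model_deliver_irq(irq_page);        /* interrupt arrives here */
	intact = (model_slot_km_irq0 == page);
	model_kunmap_irq0();
	if (protect)
		model_irq_restore(flags);
	return intact;
}
```

With protect set to false, the modelled handler clobbers the slot and
model_transfer() returns false after counting one warning; with protect set to
true, the handler is held off and the mapping stays intact.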
* Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
  2006-04-09 18:32 ` Andrew Morton
@ 2006-04-09 19:12   ` Jeff Garzik
  2006-04-09 19:21     ` Arjan van de Ven
  2006-04-09 19:12   ` Nick Orlov
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff Garzik @ 2006-04-09 19:12 UTC
  To: Andrew Morton; +Cc: Nick Orlov, linux-kernel, axboe, James.Bottomley

Andrew Morton wrote:
> Nick Orlov <bugfixer@list.ru> wrote:
>> The following patch: x86-kmap_atomic-debugging.patch exposed a badness
>> in 3w_xxx driver.
>
> Sweet, thanks.
>
>> I'm getting a lot of:
>>
>> Apr 9 13:00:04 nickolas kernel: kmap_atomic: local irqs are enabled while using KM_IRQn
>> Apr 9 13:00:04 nickolas kernel: <c0104103> show_trace+0x13/0x20 <c010412e> dump_stack+0x1e/0x20
>> Apr 9 13:00:04 nickolas kernel: <c01159c9> kmap_atomic+0x79/0xe0 <c028b885> tw_transfer_internal+0x85/0xa0
>> Apr 9 13:00:04 nickolas kernel: <c028ca7e> tw_interrupt+0x3fe/0x820 <c0143b9e> handle_IRQ_event+0x3e/0x80
>> Apr 9 13:00:04 nickolas kernel: <c0143c70> __do_IRQ+0x90/0x100 <c01057a6> do_IRQ+0x26/0x40
>> Apr 9 13:00:04 nickolas kernel: <c010396e> common_interrupt+0x1a/0x20 <c0101cdd> cpu_idle+0x4d/0xb0
>> Apr 9 13:00:04 nickolas kernel: <c010f2cc> start_secondary+0x24c/0x4b0 <00000000> 0x0
>> Apr 9 13:00:04 nickolas kernel: <c214ffb4> 0xc214ffb4
>>
>> I'm running 32 bit kernel on AMD64x2 w/ HIGHMEM enabled.
>> I think this is an old bug since the 3w_xxxx.c has not been changed for
>> a long time (at least since 2.6.16-rc1-mm4).
>>
>> Please let me know if you want me to try some patches.
>>
>
>
> From: Andrew Morton <akpm@osdl.org>
>
> We must disable local IRQs while holding KM_IRQ0 or KM_IRQ1.  Otherwise, an
> IRQ handler could use those kmap slots while this code is using them,
> resulting in memory corruption.
>
> Thanks to Nick Orlov <bugfixer@list.ru> for reporting.
>
> Cc: <linuxraid@amcc.com>
> Cc: James Bottomley <James.Bottomley@SteelEye.com>
> Signed-off-by: Andrew Morton <akpm@osdl.org>
> ---
>
>  drivers/scsi/3w-xxxx.c |    3 +++
>  1 files changed, 3 insertions(+)
>
> diff -puN drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix drivers/scsi/3w-xxxx.c
> --- devel/drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix	2006-04-09 11:28:08.000000000 -0700
> +++ devel-akpm/drivers/scsi/3w-xxxx.c	2006-04-09 11:29:21.000000000 -0700
> @@ -1508,10 +1508,12 @@ static void tw_transfer_internal(TW_Devi
>  	struct scsi_cmnd *cmd = tw_dev->srb[request_id];
>  	void *buf;
>  	unsigned int transfer_len;
> +	unsigned long flags = 0;
>
>  	if (cmd->use_sg) {
>  		struct scatterlist *sg =
>  			(struct scatterlist *)cmd->request_buffer;
> +		local_irq_save(flags);
>  		buf = kmap_atomic(sg->page, KM_IRQ0) + sg->offset;
>  		transfer_len = min(sg->length, len);
>  	} else {
> @@ -1526,6 +1528,7 @@ static void tw_transfer_internal(TW_Devi
>
>  		sg = (struct scatterlist *)cmd->request_buffer;
>  		kunmap_atomic(buf - sg->offset, KM_IRQ0);
> +		local_irq_restore(flags);

ACK.

Though please make sure the active maintainer is CC'd on this...  There
is even a helpful MAINTAINERS entry for this driver.

	Jeff

* Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
  2006-04-09 19:12   ` Jeff Garzik
@ 2006-04-09 19:21     ` Arjan van de Ven
  0 siblings, 0 replies; 7+ messages in thread
From: Arjan van de Ven @ 2006-04-09 19:21 UTC
  To: Jeff Garzik
  Cc: Andrew Morton, Nick Orlov, linux-kernel, axboe, James.Bottomley

> > Cc: <linuxraid@amcc.com>
> > ---
> >
> ACK.
>
> Though please make sure the active maintainer is CC'd on this...  There
> is even a helpful MAINTAINERS entry for this driver.

I'd say it is ;-)

* Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
  2006-04-09 18:32 ` Andrew Morton
  2006-04-09 19:12   ` Jeff Garzik
@ 2006-04-09 19:12   ` Nick Orlov
  2006-04-09 19:43     ` Andrew Morton
  1 sibling, 1 reply; 7+ messages in thread
From: Nick Orlov @ 2006-04-09 19:12 UTC
  To: Andrew Morton; +Cc: linux-kernel

On Sun, Apr 09, 2006 at 11:32:40AM -0700, Andrew Morton wrote:
> Nick Orlov <bugfixer@list.ru> wrote:
> >
> > The following patch: x86-kmap_atomic-debugging.patch exposed a badness
> > in 3w_xxx driver.
>
> Sweet, thanks.
>
[[ skipped ]]
>
> From: Andrew Morton <akpm@osdl.org>
>
> We must disable local IRQs while holding KM_IRQ0 or KM_IRQ1.  Otherwise, an
> IRQ handler could use those kmap slots while this code is using them,
> resulting in memory corruption.
>
> Thanks to Nick Orlov <bugfixer@list.ru> for reporting.
>
> Cc: <linuxraid@amcc.com>
> Cc: James Bottomley <James.Bottomley@SteelEye.com>
> Signed-off-by: Andrew Morton <akpm@osdl.org>
> ---
>
>  drivers/scsi/3w-xxxx.c |    3 +++
>  1 files changed, 3 insertions(+)
>
> diff -puN drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix drivers/scsi/3w-xxxx.c
> --- devel/drivers/scsi/3w-xxxx.c~3ware-kmap_atomic-fix	2006-04-09 11:28:08.000000000 -0700
> +++ devel-akpm/drivers/scsi/3w-xxxx.c	2006-04-09 11:29:21.000000000 -0700
> @@ -1508,10 +1508,12 @@ static void tw_transfer_internal(TW_Devi
>  	struct scsi_cmnd *cmd = tw_dev->srb[request_id];
>  	void *buf;
>  	unsigned int transfer_len;
> +	unsigned long flags = 0;
>
>  	if (cmd->use_sg) {
>  		struct scatterlist *sg =
>  			(struct scatterlist *)cmd->request_buffer;
> +		local_irq_save(flags);
>  		buf = kmap_atomic(sg->page, KM_IRQ0) + sg->offset;
>  		transfer_len = min(sg->length, len);
>  	} else {
> @@ -1526,6 +1528,7 @@ static void tw_transfer_internal(TW_Devi
>
>  		sg = (struct scatterlist *)cmd->request_buffer;
>  		kunmap_atomic(buf - sg->offset, KM_IRQ0);
> +		local_irq_restore(flags);
>  	}
>  }
>
> _

Confirmed, this patch solves the "badness" problem for me.

I'm still experiencing weird hangs though (the box just hangs, no
messages on console/syslog, nothing). I'll try to nail it down.

2.6.16-mm2 works like a charm with the same config.
Do you know which patches I should try to revert first?

-- 
With best wishes,
	Nick Orlov.

* Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
  2006-04-09 19:12   ` Nick Orlov
@ 2006-04-09 19:43     ` Andrew Morton
  2006-04-09 21:23       ` Nick Orlov
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2006-04-09 19:43 UTC
  To: Nick Orlov; +Cc: linux-kernel

Nick Orlov <bugfixer@list.ru> wrote:
>
> Confirmed, this patch solves the "badness" problem for me.

yup, thanks.

> I'm still experiencing weird hangs though (the box just hangs, no
> messages on console/syslog, nothing). I'll try to nail it down.
>
> 2.6.16-mm2 works like a charm with the same config.
> Do you know which patches I should try to revert first?

Gee, 2.6.16-mm2 was a long time ago.

Tried sysrq?

	echo 1 > /proc/sys/kernel/sysrq
	<wait for hang>
	ALT-SYSRQ-P or ALT-SYSRQ-T

Is the NMI watchdog enabled?  Boot with `nmi_watchdog=1', make sure that
the NMI counts are incrementing in /proc/interrupts.

Failing all that, testing 2.6.17-rc1 would be interesting.

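One way to check that the NMI counts are incrementing is to read
/proc/interrupts twice, a few seconds apart, and compare the totals on the
"NMI:" line. The helper below is a minimal sketch in C and not something
posted in this thread: the name sum_nmi_counts() is hypothetical, and it
assumes the usual layout of a row label followed by one decimal counter per
CPU.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sum the per-CPU counters on the "NMI:" line of /proc/interrupts-style
 * text.  Returns -1 if the text contains no such line.
 */
static long long sum_nmi_counts(const char *text)
{
	const char *line = text;

	while (line && *line) {
		const char *p = line;

		/* Skip leading indentation before the row label. */
		while (*p == ' ' || *p == '\t')
			p++;

		if (strncmp(p, "NMI:", 4) == 0) {
			long long total = 0;

			p += 4;
			for (;;) {
				char *end;

				while (*p == ' ' || *p == '\t')
					p++;
				/* Stop at end of line or a trailing description. */
				if (!isdigit((unsigned char)*p))
					break;
				total += strtoll(p, &end, 10);
				p = end;
			}
			return total;
		}

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

If the total from the second read is not larger than the first, the watchdog
is probably not ticking on this machine.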
* Re: 2.6.17-rc1-mm2: badness in 3w_xxxx driver
  2006-04-09 19:43     ` Andrew Morton
@ 2006-04-09 21:23       ` Nick Orlov
  0 siblings, 0 replies; 7+ messages in thread
From: Nick Orlov @ 2006-04-09 21:23 UTC
  To: Andrew Morton; +Cc: linux-kernel

On Sun, Apr 09, 2006 at 12:43:01PM -0700, Andrew Morton wrote:
> Nick Orlov <bugfixer@list.ru> wrote:
> >
> > Confirmed, this patch solves the "badness" problem for me.
>
> yup, thanks.
>
> > I'm still experiencing weird hangs though (the box just hangs, no
> > messages on console/syslog, nothing). I'll try to nail it down.
> >
> > 2.6.16-mm2 works like a charm with the same config.
> > Do you know which patches I should try to revert first?
>
> Gee, 2.6.16-mm2 was a long time ago.
>
> Tried sysrq?
>
> 	echo 1 > /proc/sys/kernel/sysrq
> 	<wait for hang>
> 	ALT-SYSRQ-P or ALT-SYSRQ-T
>
> Is the NMI watchdog enabled?  Boot with `nmi_watchdog=1', make sure that
> the NMI counts are incrementing in /proc/interrupts.
>
> Failing all that, testing 2.6.17-rc1 would be interesting.

2.6.17-rc1 fails in the same fashion - it hangs "randomly".

The good news is that I've found the pattern and a solution:
it always happens when 2 applications open /dev/dsp simultaneously.

Applying the following patches published by Takashi Iwai solves the problem:

http://marc.theaimsgroup.com/?l=linux-kernel&m=114423578508165&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=114424198614019&w=2

Not sure if the first one is enough.

I would probably recommend putting them into the hot-fixes, since many
people could be frustrated by this.

-- 
With best wishes,
	Nick Orlov.