* [Linux-ia64] Software IO-TLB Kernel panic
@ 2001-05-17 11:44 Martin Wilck
2001-05-17 15:00 ` David Mosberger
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Martin Wilck @ 2001-05-17 11:44 UTC (permalink / raw)
To: linux-ia64
Hi,
I am reproducibly getting kernel panics when accessing discs on an
Adaptec 39160 adapter (SCSI host 1 after the built-in QLA1280).
I am using the "new" aic7xxx driver on a 2.4.4 IA64 kernel.
I configured the kernel for DIG-compliant C0-stepping hardware.
The kernel panic occurs in map_single (arch/ia64/lib/swiotlb.c:171).
I have a 2-CPU Lion with C0-stepping CPUs. The requested
IO TLB size is 8192 when the panic occurs.
After the crash, I always have severe corruption on the filesystem that
was being accessed during the crash. e2fsck reports ~10 illegal blocks in
one inode, and many follow-up errors.
The only TLB-related boot message I see is:
kernel: Placing software IO TLB between 0xe000000000100000 - 0xe000000000300000
I am looking into this myself right now, but I would be grateful for hints
on where to start. Any help appreciated,
Martin
--
Martin Wilck <Martin.Wilck@fujitsu-siemens.com>
FSC EP PS DS1, Paderborn Tel. +49 5251 8 15113
* Re: [Linux-ia64] Software IO-TLB Kernel panic
From: David Mosberger @ 2001-05-17 15:00 UTC (permalink / raw)
To: linux-ia64
>>>>> On Thu, 17 May 2001 13:44:50 +0200 (CEST), Martin Wilck <Martin.Wilck@fujitsu-siemens.com> said:
Martin> I am reproducibly getting kernel panics when accessing
Martin> discs on an Adaptec 39160 adapter (SCSI host 1 after the
Martin> built-in QLA1280). I am using the "new" aic7xxx driver on a
Martin> 2.4.4 IA64 kernel. I configured the kernel for
Martin> DIG-compliant C0-stepping hardware.
Martin> The kernel panic occurs in map_single
Martin> (arch/ia64/lib/swiotlb.c:171).
Sounds like the driver is trying to map buffers that are bigger than
(1 << IO_TLB_SHIFT) (currently 2KB).
--david
* Re: [Linux-ia64] Software IO-TLB Kernel panic
From: Martin Wilck @ 2001-05-17 17:40 UTC (permalink / raw)
To: linux-ia64
>
> Martin> The kernel panic occurs in map_single
> Martin> (arch/ia64/lib/swiotlb.c:171).
>
> Sounds like the driver is trying to map buffers that are bigger than
> (1 << IO_TLB_SHIFT) (currently 2KB).
I looked at the code and thought that buffers bigger than 2 KB can (in
principle) be created by joining consecutive slots.
Of course, it could happen that no such contiguous space is available.
Am I wrong?
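
The joining idea above can be illustrated with a small user-space model. This is only a sketch in Python, not the code in arch/ia64/lib/swiotlb.c: the slot size (1 << IO_TLB_SHIFT = 2 KB) is taken from the thread, while the names slots_needed and find_contiguous and the list-of-booleans free map are assumptions made up for the illustration.

```python
IO_TLB_SHIFT = 11                 # from the thread: 1 << 11 = 2 KB per slot
SLOT_SIZE = 1 << IO_TLB_SHIFT


def slots_needed(size):
    """Number of 2 KB slots required to cover `size` bytes (rounded up)."""
    return (size + SLOT_SIZE - 1) >> IO_TLB_SHIFT


def find_contiguous(free, count):
    """Return the index of the first run of `count` free slots, or -1."""
    run = 0
    for i, is_free in enumerate(free):
        run = run + 1 if is_free else 0   # a used slot resets the run
        if run == count:
            return i - count + 1
    return -1


if __name__ == "__main__":
    free = [True, False, True, True, False, True, True, True, True]
    n = slots_needed(8192)
    print(n, find_contiguous(free, n))    # prints "4 5"
```

An 8192-byte request needs four adjacent slots in this model, so a fragmented pool can fail the request even when more than four slots are free in total.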
Martin
--
Martin Wilck <Martin.Wilck@fujitsu-siemens.com>
FSC EP PS DS1, Paderborn Tel. +49 5251 8 15113
* Re: [Linux-ia64] Software IO-TLB Kernel panic
From: David Mosberger @ 2001-05-17 18:49 UTC (permalink / raw)
To: linux-ia64
>>>>> On Thu, 17 May 2001 19:40:17 +0200 (CEST), Martin Wilck <Martin.Wilck@fujitsu-siemens.com> said:
>>
Martin> The kernel panic occurs in map_single
Martin> (arch/ia64/lib/swiotlb.c:171).
>> Sounds like the driver is trying to map buffers that are bigger
>> than (1 << IO_TLB_SHIFT) (currently 2KB).
Martin> I looked at the code and thought that bigger buffers than
Martin> 2kB can (in principle) be created by joining subsequent
Martin> buffers. Of course, it could happen that no such contiguous
Martin> space is available. Am I wrong ?
Yes, you're right. I forgot about the joining.
--david
* [Linux-ia64] Software IO-TLB Kernel panic - preliminary analysis
From: Martin Wilck @ 2001-05-18 19:00 UTC (permalink / raw)
To: linux-ia64
This problem is really hard to hunt down, as even kdb will not
respond anymore after the crash happens. Also, my system logs are
truncated.
What I have seen, though, is that IO-TLBs are allocated very quickly
immediately before the crash. By using some printk's, I saw
133 allocations of 8192-byte chunks in a row without a single
deallocation immediately before the machine came down. This alone accounts
for about half of the bounce buffer space, without any space that
was allocated before and without any further allocations that
I may have lost due to the lost lines in the log.
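
The "about half" estimate can be checked from numbers that appear in this thread: the pool boundaries from the boot message and the 133 allocations of 8192 bytes. The following Python lines are a plain arithmetic sketch; the function name pool_usage is an assumption made up for the illustration.

```python
POOL_START = 0xe000000000100000   # from the boot message
POOL_END = 0xe000000000300000
CHUNK = 8192                      # bytes per observed allocation
ALLOCATIONS = 133                 # allocations seen without a deallocation


def pool_usage(start, end, chunk, count):
    """Return (pool size, bytes held by `count` chunks, fraction of pool)."""
    pool_bytes = end - start
    in_flight = chunk * count
    return pool_bytes, in_flight, in_flight / pool_bytes


if __name__ == "__main__":
    pool_bytes, in_flight, fraction = pool_usage(
        POOL_START, POOL_END, CHUNK, ALLOCATIONS)
    print(pool_bytes, in_flight, f"{fraction:.2f}")   # 2097152 1089536 0.52
```

So the logged allocations alone hold roughly 1.04 MB of the 2 MB pool, which matches the estimate above.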
By inspecting elements of the pci_dev structure passed to the routine,
I am now 99% convinced that the Adaptec 7899a controller
is the "guilty" device. This fits well with the finding that the
crashes always occur after (!) a file system on that card has come
under somewhat heavier load.
It seems that the problem does not occur with the "old" aic7xxx
driver. On the contrary, that driver seems to deallocate every buffer
immediately after allocation.
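
The difference between the two allocation patterns can be modelled in user space. This is a toy Python sketch under stated assumptions, not either driver's code and not the kernel's allocator: the class BouncePool, its map and unmap methods, and the cap of 1000 iterations are all made up for the illustration. Only the 2 MB pool size, the 2 KB slot size and the 8192-byte request size come from the thread. Whether the new driver never releases its buffers or merely keeps many in flight is not established here.

```python
class BouncePool:
    """Toy model of a fixed pool of 2 KB slots (not the kernel's structure)."""

    def __init__(self, pool_bytes=2 * 1024 * 1024, slot_size=2048):
        self.slot_size = slot_size
        self.free = [True] * (pool_bytes // slot_size)

    def map(self, size):
        """Reserve a contiguous run of slots; return its start index or None."""
        count = -(-size // self.slot_size)        # ceiling division
        run = 0
        for i, is_free in enumerate(self.free):
            run = run + 1 if is_free else 0
            if run == count:
                start = i - count + 1
                self.free[start:i + 1] = [False] * count
                return start
        return None

    def unmap(self, start, size):
        """Release the slots reserved by an earlier map() of the same size."""
        count = -(-size // self.slot_size)
        self.free[start:start + count] = [True] * count


def mappings_until_exhausted(pool, size, release_immediately, cap=1000):
    """Count successful map() calls, stopping at `cap` to stay finite."""
    done = 0
    while done < cap:
        start = pool.map(size)
        if start is None:
            break
        if release_immediately:
            pool.unmap(start, size)
        done += 1
    return done
```

In this model, holding every 8192-byte mapping exhausts the pool after 256 requests, while releasing each mapping right away never exhausts it and the loop runs until the cap.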
Thus, for the time being I'd recommend using the aic7xxx_old driver.
But this looks like a problem that ought to be solved.
Should I perhaps approach the aic7xxx maintainers?
Regards,
Martin
--
Martin Wilck <Martin.Wilck@fujitsu-siemens.com>
FSC EP PS DS1, Paderborn Tel. +49 5251 8 15113