* Serial driver 8250 hangs the kernel with the VIA Nehemiah...
@ 2006-08-11 8:52 Chris Pringle
2006-08-11 10:36 ` Alan Cox
0 siblings, 1 reply; 5+ messages in thread
From: Chris Pringle @ 2006-08-11 8:52 UTC (permalink / raw)
To: linux-kernel
Hello,
I'm having problems with the 8250 serial driver. We have some custom
hardware that sits on the ISA bus, and has a number of 16C950/954 UARTs
on it. However, when the port is receiving a lot of data, it has the
tendency to either get corrupted data, or to crash the kernel.
I'm using the 2.6.13.1 kernel with low latency patches. I've also tried
the 2.6.17 kernel with low latency patches, and that also seems to fail.
The offending bit of code seems to be here:
static _INLINE_ unsigned int serial_in(struct uart_8250_port *up, int offset)
{
	offset <<= up->port.regshift;
	switch (up->port.iotype) {
	case UPIO_HUB6:
		outb(up->port.hub6 - 1 + offset, up->port.iobase);
		return inb(up->port.iobase + 1);
	case UPIO_MEM:
		return readb(up->port.membase + offset);
	case UPIO_MEM32:
		return readl(up->port.membase + offset);
	default:
		return inb(up->port.iobase + offset);
	}
}
The "inb" as it is will sometimes return bad data - I'm guessing this
is due to ISA bus instability... Anyway I changed it to "inb_p" which
cured the corruption problem, but has introduced another issue - it
hangs the kernel.
Interestingly, it only hangs on systems with a VIA Nehemiah CPU, the
Intel Celerons seem to work fine. Could this be a problem with writing
to that dreaded port 0x080 within inb_p?
I've added in some of the kernel hacking options, including spinlock
detection etc. and it's not told me anything. And as expected, there's
nothing in the log files either. SysRq doesn't work, however the num
lock key does still flash if you hit NumLock.
I tried getting rid of the paused versions of inb, and adding in a
udelay of 2 immediately before - this helped a lot: instead of
crashing after 15 minutes of continuous activity, it now takes around
12-36 hours. I've played with moving the udelay after the inb call as
well - but the results were the same.
Does anyone have any ideas what may be causing these issues? Why does
it only occur on the VIA chip and not the Celeron? I don't think the
problem is there when the low latency patches are not applied - so I'm
thinking it's probably a timing problem of some sort.
I wasn't sure whether the write to port 0x080 was a problem for the
inb_p (in io.h) - so I tried changing it to the jumping approach, and
also tried the very long delays in io.h as well - however, neither had
any effect - if anything, they made it worse.
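For reference, the paused read discussed above can be modelled as an
ordinary port read followed by a dummy write to port 0x80. The sketch
below is a userspace model only, not the kernel's io.h: the names
model_inb, model_slow_down_io, model_inb_p and bus_log are hypothetical,
the port accesses are stubbed with a log so it runs without I/O
privileges, and it assumes the default port-0x80 delay rather than the
jumping variant.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: real port I/O is replaced by a log of bus accesses. */
enum access_kind { ACCESS_READ, ACCESS_WRITE };

struct bus_access {
	enum access_kind kind;
	unsigned short port;
};

#define BUS_LOG_MAX 16
#define DELAY_PORT 0x80 /* port written to produce the I/O delay */

static struct bus_access bus_log[BUS_LOG_MAX];
static size_t bus_log_len;

/* Append one access to the log (hypothetical helper). */
static void log_access(enum access_kind kind, unsigned short port)
{
	assert(bus_log_len < BUS_LOG_MAX);
	bus_log[bus_log_len].kind = kind;
	bus_log[bus_log_len].port = port;
	bus_log_len++;
}

/* Stub for inb(): records the read and returns a fixed dummy value. */
static unsigned char model_inb(unsigned short port)
{
	log_access(ACCESS_READ, port);
	return 0x5a;
}

/* Stub for the delay: a dummy write to port 0x80. */
static void model_slow_down_io(void)
{
	log_access(ACCESS_WRITE, DELAY_PORT);
}

/* Model of a paused read: the port read first, then the delay. */
static unsigned char model_inb_p(unsigned short port)
{
	unsigned char value = model_inb(port);

	model_slow_down_io();
	return value;
}
```

Calling model_inb_p(0x3f8) on an empty log leaves two entries: the read
of 0x3f8 followed by the write to 0x80. The point of the model is only
that the delay is a second, separate bus access after the read.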
It should be noted that the Small Board Computer and the ISA cards are
exactly the same on all systems - it's only the CPU that's been
changed.
If anyone has any ideas how I can try and track down exactly where it's
crashing, or what the possible causes might be, I would be very
grateful if you could give me some hints!
Cheers,
Chris.
--
______________________________
Chris Pringle
Software Engineer
Miranda Technologies Ltd.
Hithercroft Road
Wallingford
Oxfordshire OX10 9DG
UK
Tel. +44 1491 820206
Fax. +44 1491 820001
www.miranda.com
* Re: Serial driver 8250 hangs the kernel with the VIA Nehemiah...
2006-08-11 8:52 Chris Pringle
@ 2006-08-11 10:36 ` Alan Cox
0 siblings, 0 replies; 5+ messages in thread
From: Alan Cox @ 2006-08-11 10:36 UTC (permalink / raw)
To: Chris Pringle; +Cc: linux-kernel
Ar Gwe, 2006-08-11 am 09:52 +0100, ysgrifennodd Chris Pringle:
> on it. However, when the port is receiving a lot of data, it has the
> tendency to either get corrupted data, or to crash the kernel.
What do the crash traces look like?
> The "inb" as it is will sometimes return bad data - I'm guessing this
> is due to ISA bus instability... Anyway I changed it to "inb_p" which
> cured the corruption problem, but has introduced another issue - it
> hangs the kernel.
Maybe you need to have a chat with your hardware guys. Certainly if
inb_p makes a difference you've got hardware not software side problems.
> Interestingly, it only hangs on systems with a VIA Nehemiah CPU, the
> Intel Celerons seem to work fine. Could this be a problem with writing
> to that dreaded port 0x080 within inb_p?
Unlikely as it would affect both. More likely would be that the ISA bus
clock is generated off the PCI bus clock and you have one of the
multipliers wrong or too high for the board.
> it only occur on the VIA chip and not the Celeron? I don't think the
> problem is there when the low latency patches are not applied - so I'm
> thinking it's probably a timing problem of some sort.
That bit is interesting. Something really off the wall to try - disable
interrupts around the inb_p(), especially if you are using pre-emption
and let me know what happens.
Alan
* Re: Serial driver 8250 hangs the kernel with the VIA Nehemiah...
@ 2006-08-11 11:08 Chris Pringle
2006-08-11 12:02 ` Alan Cox
0 siblings, 1 reply; 5+ messages in thread
From: Chris Pringle @ 2006-08-11 11:08 UTC (permalink / raw)
To: linux-kernel
Alan Cox wrote:
> Ar Gwe, 2006-08-11 am 09:52 +0100, ysgrifennodd Chris Pringle:
>
>> on it. However, when the port is receiving a lot of data, it has the
>> tendency to either get corrupted data, or to crash the kernel.
>>
>
> What do the crash traces look like?
>
Sorry - the kernel hangs, not crashes. There is no output whatsoever
after the hang - SysRq doesn't work, but Numlock does.
>
>> The "inb" as it is will sometimes return bad data - I'm guessing this
>> is due to ISA bus instability... Anyway I changed it to "inb_p" which
>> cured the corruption problem, but has introduced another issue - it
>> hangs the kernel.
>>
>
> Maybe you need to have a chat with your hardware guys. Certainly if
> inb_p makes a difference you've got hardware not software side problems.
>
Perhaps - I'm in discussion with them at the moment, but neither our
department nor theirs has managed to come up with an answer yet.
>
>> Interestingly, it only hangs on systems with a VIA Nehemiah CPU, the
>> Intel Celerons seem to work fine. Could this be a problem with writing
>> to that dreaded port 0x080 within inb_p?
>>
>
> Unlikely as it would affect both. More likely would be that the ISA bus
> clock is generated off the PCI bus clock and you have one of the
> multipliers wrong or too high for the board.
>
That's interesting, but wouldn't this produce strange side effects for
the 2.4 kernel as well? 2.4 works fine on both VIAs and Celerons.
>
>> it only occur on the VIA chip and not the Celeron? I don't think the
>> problem is there when the low latency patches are not applied - so I'm
>> thinking it's probably a timing problem of some sort.
>>
>
> That bit is interesting. Something really off the wall to try - disable
> interrupts around the inb_p(), especially if you are using pre-emption
> and let me know what happens.
>
> Alan
>
I'll give the interrupt disabling a go...
Thanks Alan.
Chris.
* Re: Serial driver 8250 hangs the kernel with the VIA Nehemiah...
2006-08-11 11:08 Serial driver 8250 hangs the kernel with the VIA Nehemiah Chris Pringle
@ 2006-08-11 12:02 ` Alan Cox
2006-08-11 15:45 ` Chris Pringle
0 siblings, 1 reply; 5+ messages in thread
From: Alan Cox @ 2006-08-11 12:02 UTC (permalink / raw)
To: Chris Pringle; +Cc: linux-kernel
Ar Gwe, 2006-08-11 am 12:08 +0100, ysgrifennodd Chris Pringle:
> > Unlikely as it would affect both. More likely would be that the ISA bus
> > clock is generated off the PCI bus clock and you have one of the
> > multipliers wrong or too high for the board.
> >
> That's interesting, but wouldn't this produce strange side effects for
> the 2.4 kernel as well? 2.4 works fine on both VIAs and Celerons.
That I wonder about. The power management stuff and some other things
that matter for timing are different however.
> I'll give the interrupt disabling a go...
It's just a guess, but if you have the low latency stuff, you have
pre-empt enabled, and you actually depend upon the semantics of
inb_p/outb_p giving delays reliably, then I'm not convinced our
guarantees are strong enough.
Specifically we don't have any pre-empt protection between the I/O delay
and the I/O so we could violate it as we don't have pre-empt disables in
inb_p/outb_p and if your CPU context switch is quick enough it could
trigger a problem.
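As a rough illustration of the experiment suggested here, the sketch
below models masking interrupts across both the port read and the delay
that follows it. It is a userspace model under stated assumptions, not
kernel code: model_irq_save and model_irq_restore merely stand in for
the kernel's local_irq_save()/local_irq_restore(), and model_step,
model_inb_p, model_inb_p_guarded and unguarded_steps are hypothetical
names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of these are kernel functions. */
static bool irqs_enabled = true; /* stands in for the CPU interrupt flag */
static int unguarded_steps;      /* steps that ran with interrupts enabled */

/* Stand-in for local_irq_save(): mask interrupts, return the old state. */
static bool model_irq_save(void)
{
	bool was_enabled = irqs_enabled;

	irqs_enabled = false;
	return was_enabled;
}

/* Stand-in for local_irq_restore(): put the saved state back. */
static void model_irq_restore(bool was_enabled)
{
	irqs_enabled = was_enabled;
}

/* One step of the paused read: the port read, or the delay after it. */
static void model_step(void)
{
	if (irqs_enabled)
		unguarded_steps++; /* an interrupt or pre-emption could land here */
}

/* Paused read with no protection: both steps are exposed. */
static unsigned char model_inb_p(void)
{
	model_step(); /* the port read */
	model_step(); /* the I/O delay */
	return 0x5a;  /* fixed dummy value */
}

/* Paused read with interrupts masked across the read and the delay. */
static unsigned char model_inb_p_guarded(void)
{
	bool was_enabled = model_irq_save();
	unsigned char value;

	assert(!irqs_enabled);
	value = model_inb_p();
	model_irq_restore(was_enabled);
	return value;
}
```

In this model the unguarded call counts two exposed steps and the
guarded call counts none, with the interrupt state restored afterwards.
Whether the real hardware problem responds to such a guard is exactly
what the experiment is meant to find out.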
Alan
* Re: Serial driver 8250 hangs the kernel with the VIA Nehemiah...
2006-08-11 12:02 ` Alan Cox
@ 2006-08-11 15:45 ` Chris Pringle
0 siblings, 0 replies; 5+ messages in thread
From: Chris Pringle @ 2006-08-11 15:45 UTC (permalink / raw)
To: Alan Cox; +Cc: linux-kernel
Alan Cox wrote:
> Ar Gwe, 2006-08-11 am 12:08 +0100, ysgrifennodd Chris Pringle:
>
>>> Unlikely as it would affect both. More likely would be that the ISA bus
>>> clock is generated off the PCI bus clock and you have one of the
>>> multipliers wrong or too high for the board.
>>>
>>>
>> That's interesting, but wouldn't this produce strange side effects for
>> the 2.4 kernel as well? 2.4 works fine on both VIAs and Celerons.
>>
>
> That I wonder about. The power management stuff and some other things
> that matter for timing are different however.
>
We don't use any kind of power management (not compiled in) as our
systems are always on... Are there any timing-related options in the
kernel config you'd recommend I look at?
>
>> I'll give the interrupt disabling a go...
>>
>
> It's just a guess, but if you have the low latency stuff, you have
> pre-empt enabled, and you actually depend upon the semantics of
> inb_p/outb_p giving delays reliably, then I'm not convinced our
> guarantees are strong enough.
>
> Specifically we don't have any pre-empt protection between the I/O delay
> and the I/O so we could violate it as we don't have pre-empt disables in
> inb_p/outb_p and if your CPU context switch is quick enough it could
> trigger a problem.
>
> Alan
>
Okay, I've tried disabling both preemption and interrupts (separately,
and together) and it's still hanging...
I've had the Celeron systems being thrashed for well over 4 days now,
and they are working fine... Why would the VIA system be any different?
They have a slightly different CPU speed (the VIAs are 1000MHz, whereas
the Celerons are 850MHz), but I would expect them to be fully compatible
otherwise... unless it's a microcode bug?
Any more ideas? Do you think writing to port 0x80 could be causing issues?
Thanks,
Chris