* [RFC] How drivers notice a MCA on I/O read? [2/3]
2003-11-18 10:11 [RFC] How drivers notice a MCA on I/O read? [1/3] Hidetoshi Seto
@ 2003-11-18 10:12 ` Hidetoshi Seto
2003-11-18 10:14 ` [RFC] How drivers notice a MCA on I/O read? [3/3] Hidetoshi Seto
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Hidetoshi Seto @ 2003-11-18 10:12 UTC (permalink / raw)
To: linux-ia64
This is a sample of readb_check on driver (with Notifier).
I certainly understand that some of this code should be written in assembler,
but I'm not confident in my skill with IA64 code. I would appreciate any
comments and feedback.
-----
H.Seto <seto.hidetoshi@jp.fujitsu.com>
/******************************************************************************/
/************************************* MCA ************************************/
/******************************************************************************/
struct notifier_block *driver_notifier_list = NULL;
EXPORT_SYMBOL(driver_notifier_list);
void MCA_Handler(struct pt_regs *ptregs)
{
	/* 'ip' is index of register instruction pointer */
	/* 'A' is index of register A */
	int check = 0;
	…
	/* statements for read&mf */
	if (ptregs['A'] == current->pid) {
		/* ask the registered drivers whether the faulting ip is theirs */
		int ret = notifier_call_chain(&driver_notifier_list,
					      0, ptregs['ip']);
		if ((ret & ~NOTIFY_STOP_MASK) == NOTIFY_OK) {
			ptregs['A'] = 0;	/* tell readb_check the read failed */
			check = 1;
		}
	}
	if (!check)
		reboot(); /* system down */
	…
}
…
/******************************************************************************/
/************************************ DRIVER **********************************/
/******************************************************************************/
…
static int driver_notifier_hook(struct notifier_block *this,
				unsigned long event, void *data);
static struct notifier_block driver_notifier = {
	.notifier_call = driver_notifier_hook,
};
typedef struct {
void *start;
void *end;
} address_range;
static __attribute__((noinline))
int readb_check(unsigned char *data, void *adrs)
{
	int volatile B = current->pid;
	register int A asm("A"); /* register A */
	unsigned char ret;

	A = B;
	ret = read(adrs);
	asm("mf.a"::);
	if (A != B) {
		return 0; /* false: the MCA handler cleared A */
	} else {
		*data = ret;
		return 1; /* true */
	}
}
static __attribute__((noinline))
int readw_check(unsigned short *data, void *adrs)
…
/* fixme: specify address range from 'read(adrs)' to 'asm("mf.a"::)' */
static address_range check_range[] = {
	{(void *)readb_check, (void *)readw_check},
	…
};
static int driver_notifier_hook(struct notifier_block *this,
				unsigned long event, void *data)
{
	void *adr = data;
	int res = NOTIFY_DONE;
	int i;

	for (i = 0; i < sizeof(check_range)/sizeof(address_range); i++) {
		if (check_range[i].start < adr && adr < check_range[i].end) {
			res = NOTIFY_STOP_MASK|NOTIFY_OK;
			break;
		}
	}
	return res;
}
static int __init driver_init(void)
{
	…
	notifier_chain_register(&driver_notifier_list, &driver_notifier);
	…
	return 0;
}
static void __exit driver_exit(void)
{
	notifier_chain_unregister(&driver_notifier_list, &driver_notifier);
	…
}
module_init(driver_init);
module_exit(driver_exit);
DRIVER_MAIN()
{
	unsigned char data;
	int retry_count, i;
	…
	retry_count = N1;
	for (i = 0; i < retry_count; i++) {
		if (readb_check(&data, address1))
			break;
	}
	if (i == retry_count) {
		/* error */
	}
	…
	retry_count = N2;
	for (i = 0; i < retry_count; i++) {
		if (readb_check(&data, address2))
			break;
	}
	if (i == retry_count) {
		/* error */
	}
	…
}
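
Editorial sketch (not part of the original post): a user-space model of the
notifier decision above, assuming a singly linked chain and a half-open
address window per driver. All sk_* names and the SK_* constants are
hypothetical stand-ins for the kernel notifier API, and the fault address is
passed in directly rather than taken from pt_regs.

```c
#include <stddef.h>
#include <stdint.h>

/* Return codes modelled on the kernel notifier API (illustrative values). */
#define SK_NOTIFY_DONE      0x0000
#define SK_NOTIFY_OK        0x0001
#define SK_NOTIFY_STOP_MASK 0x8000

struct sk_notifier {
    int (*call)(struct sk_notifier *self, uintptr_t fault_ip);
    struct sk_notifier *next;
    uintptr_t start;   /* first address of the guarded code window */
    uintptr_t end;     /* one past the last guarded address */
};

/* Hook: claim the fault only if the faulting ip lies inside our window. */
int sk_range_hook(struct sk_notifier *self, uintptr_t fault_ip)
{
    if (fault_ip >= self->start && fault_ip < self->end)
        return SK_NOTIFY_STOP_MASK | SK_NOTIFY_OK;
    return SK_NOTIFY_DONE;
}

/* Push a notifier on the front of the chain. */
void sk_register(struct sk_notifier **head, struct sk_notifier *n)
{
    n->next = *head;
    *head = n;
}

/* Walk the chain; stop at the first hook that sets the stop bit. */
int sk_call_chain(struct sk_notifier *head, uintptr_t fault_ip)
{
    int ret = SK_NOTIFY_DONE;

    for (struct sk_notifier *n = head; n != NULL; n = n->next) {
        ret = n->call(n, fault_ip);
        if (ret & SK_NOTIFY_STOP_MASK)
            break;
    }
    return ret;
}

/* Returns 1 when some driver claimed the fault, so the handler may resume. */
int sk_mca_recoverable(struct sk_notifier *head, uintptr_t fault_ip)
{
    int ret = sk_call_chain(head, fault_ip);

    return (ret & ~SK_NOTIFY_STOP_MASK) == SK_NOTIFY_OK;
}
```

With two registered windows, a fault inside either window is reported as
recoverable and anything else falls through to the "system down" path.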
* [RFC] How drivers notice a MCA on I/O read? [3/3]
2003-11-18 10:11 [RFC] How drivers notice a MCA on I/O read? [1/3] Hidetoshi Seto
2003-11-18 10:12 ` [RFC] How drivers notice a MCA on I/O read? [2/3] Hidetoshi Seto
@ 2003-11-18 10:14 ` Hidetoshi Seto
2003-11-18 15:06 ` [RFC] How drivers notice a MCA on I/O read? [1/3] Zoltan Menyhart
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Hidetoshi Seto @ 2003-11-18 10:14 UTC (permalink / raw)
To: linux-ia64
This is a sample of readb_check on kernel.
I certainly understand that some of this code should be written in assembler,
but I'm not confident in my skill with IA64 code. I would appreciate any
comments and feedback.
-----
H.Seto <seto.hidetoshi@jp.fujitsu.com>
/******************************************************************************/
/************************************* MCA ************************************/
/******************************************************************************/
...
typedef struct {
void *start;
void *end;
} address_range;
int readb_check(unsigned char*, void*);
int readw_check(unsigned short*, void*);
int readl_check(unsigned int*, void*);
static address_range check_range[] = {
{(void*)readb_check, (void*)readw_check},
…
};
void MCA_Handler(struct pt_regs *ptregs)
{
	/* 'ip' is index of register instruction pointer */
	/* 'A' is index of register A */
	…
	/* statements for read_check */
	for (int i = 0; i < sizeof(check_range)/sizeof(address_range); i++) {
		if (check_range[i].start <= ptregs['ip']
		    && ptregs['ip'] <= check_range[i].end) {
			if (ptregs['A'] == current->pid) { /* register A */
				ptregs['A'] = 0;
				break;
			}
		}
	}
	…
}
…
/******************************************************************************/
/********************************** read_check ********************************/
/******************************************************************************/
__attribute__((noinline)) int readb_check(unsigned char *data, void *adrs)
{
	int volatile B = current->pid;
	register int volatile A asm("A"); /* register A */
	unsigned char ret;

	A = B;
	ret = read(adrs);
	asm("mf.a"::);
	if (A != B) {
		return 0; /* false: the MCA handler cleared A */
	} else {
		*data = ret;
		return 1; /* true */
	}
}
__attribute__((noinline)) int readw_check(unsigned short *data, void *adrs)
{
…
/******************************************************************************/
/************************************ DRIVER **********************************/
/******************************************************************************/
…
DRIVER_MAIN()
{
	unsigned char data;
	int retry_count, i;
	…
	retry_count = N1;
	for (i = 0; i < retry_count; i++) {
		if (readb_check(&data, address1))
			break;
	}
	if (i == retry_count) {
		/* error */
	}
	…
	retry_count = N2;
	for (i = 0; i < retry_count; i++) {
		if (readb_check(&data, address2))
			break;
	}
	if (i == retry_count) {
		/* error */
	}
	…
}
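
Editorial sketch (not part of the original post): a user-space simulation of
the marker protocol used by readb_check, assuming the "register A" of the
sample can be modelled by a global variable and that a failing read invokes
the handler synchronously. The sk_* names, the simulated device and the
owner token (standing in for current->pid, assumed non-zero) are all
hypothetical.

```c
#include <stdint.h>

/* Stand-in for the reserved register 'A' of the sample (hypothetical). */
static volatile int sk_marker;

/* Simulated device: reads fail while fail_count is non-zero. */
struct sk_device {
    uint8_t value;
    int fail_count;
};

/* What the MCA handler would do: clear the marker if it matches the owner. */
void sk_mca_handler(int owner_token)
{
    if (sk_marker == owner_token)
        sk_marker = 0;
}

/* Simulated I/O read; a failing read runs the handler and returns junk. */
uint8_t sk_bus_read(struct sk_device *dev, int owner_token)
{
    if (dev->fail_count > 0) {
        dev->fail_count--;
        sk_mca_handler(owner_token);
        return 0xff;
    }
    return dev->value;
}

/* Returns 1 and stores the byte on success, 0 if the marker was cleared. */
int sk_readb_check(uint8_t *data, struct sk_device *dev, int owner_token)
{
    uint8_t ret;

    sk_marker = owner_token;              /* A = B */
    ret = sk_bus_read(dev, owner_token);
    if (sk_marker != owner_token)         /* handler ran in between */
        return 0;
    *data = ret;
    return 1;
}

/* Retry loop as in DRIVER_MAIN: returns attempts used, or -1 on giving up. */
int sk_read_with_retry(uint8_t *data, struct sk_device *dev,
                       int owner_token, int retry_count)
{
    for (int i = 0; i < retry_count; i++) {
        if (sk_readb_check(data, dev, owner_token))
            return i + 1;
    }
    return -1;
}
```

The caller's buffer is only written on a successful read, so a driver that
gives up after retry_count attempts never sees the junk value.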
* Re: [RFC] How drivers notice a MCA on I/O read? [1/3]
2003-11-18 10:11 [RFC] How drivers notice a MCA on I/O read? [1/3] Hidetoshi Seto
2003-11-18 10:12 ` [RFC] How drivers notice a MCA on I/O read? [2/3] Hidetoshi Seto
2003-11-18 10:14 ` [RFC] How drivers notice a MCA on I/O read? [3/3] Hidetoshi Seto
@ 2003-11-18 15:06 ` Zoltan Menyhart
2003-11-18 17:10 ` Jesse Barnes
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Zoltan Menyhart @ 2003-11-18 15:06 UTC (permalink / raw)
To: linux-ia64
Let me make some remarks on I/O triggered MCAs.
Basically, there are 4 kinds of I/Os:
- I/O read / write by CPUs
- I/O read / write by DMAs
1. I/O write by CPUs:
As the machine is pipelined, the writes are executed *much*
later than they leave the CPUs. As soon as the data reaches
an I/O bridge, the I/O is considered to be done for the
coherency domain. O.K. you can wait and make sure that the
written data has reached the I/O device, but you will slow
down the I/O access by a factor of 1000.
An I/O bridge usually does not remember who the originator is,
should an error happen, e.g. PCI PERR / SERR, the bridge does
not know whom to report the error to. It simply issues a
BERR. This is a global MCA, the interrupted context is
not precisely saved. You do not even know, e.g., whether a
"register++" issued just before the MCA arrives
has actually been done or not.
You cannot resume the execution, you have to create a
"minimal state" that will be resumed.
And hard luck: the innocent CPUs, which are not executing
carefully prepared code to survive an MCA, are also affected.
2. I/O read by CPUs:
Some I/O bridges may poison the data read, instead of
signaling a BERR.
(Otherwise see above.)
The consumption of poisoned data triggers a local, imprecise
MCA (as above).
Before issuing the critical read (ld.* rx=[ry]) instruction,
make sure no operation is in any of the pipelines (e.g. our
"register++").
Note that the read operation by itself does not consume
the bad data, you have to do something with it, e.g.:
ld8 r9=[r10];; // r10 = I/O address
add r8=r9,r9;; // fake operation
An "mf.a" does not help, it is useless: this is an MCA
internal to the CPU.
3. Memory -> DMA -> I/O
Mostly the same as the case 1.
The HW could abort the DMA and the DMA status could indicate
the failure without disturbing the CPUs...
A usual HW simply sends a BERR to everyone :-(
4. I/O -> DMA -> Memory
The HW could abort the DMA, the memory could be poisoned to
indicate to the final consumer the error (CPU local MCA as in
the case 2), and the DMA status could indicate the failure
without disturbing the CPUs...
A usual HW simply sends a BERR to everyone :-(
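
Editorial sketch (not part of the original post): for case 2, the point that
a load alone does not consume the data can be modelled in portable C by
forcing a dependent use of the loaded value. This assumes the compiler keeps
the store to a volatile sink; the names sk_consume_sink and
sk_read_and_consume8 are hypothetical. It does not control instruction
scheduling or the pipeline state, so a real IA64 version would presumably
still be written in assembler.

```c
#include <stdint.h>

/* Volatile sink: the store to it may not be optimised away. */
static volatile uint64_t sk_consume_sink;

/* Load one byte and immediately use it as an operand. */
uint8_t sk_read_and_consume8(const volatile uint8_t *addr)
{
    uint8_t v = *addr;                   /* the load: no consumption yet */

    sk_consume_sink = (uint64_t)v + v;   /* dependent use of the value */
    return v;
}
```

The addition plays the role of the "fake operation" in the two-instruction
example of case 2: it exists only so that something actually reads the
register that received the I/O data.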
--------------------------------------------------------------
To cheer you up: a usual machine has got ~ 50,000 hours of
MTBF (including all other errors).
Assuming you have got a sophisticated HW that does not send
unnecessary BERRs, how many errors will be recovered during
the whole life of the machine ?
(You cannot do anything to the imprecise MCA models of the
ia64 architecture).
What is the MTBF of Linux? A well known commercial
unix is estimated to have 6,000 hours. Linux can have ...
Will not-so-reliable SW save the life of quite
good HW?
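
Editorial sketch (not part of the original mail): the comparison can be made
concrete by treating hardware and software as two components in series. This
assumes independent failures with constant rates, which the mail does not
state. Under that assumption the failure rates add. The function name is
hypothetical.

```c
/* Combined MTBF of two components in series, assuming independent,
 * constant failure rates: 1/MTBF = 1/a + 1/b, i.e. MTBF = a*b / (a+b). */
double sk_series_mtbf(double mtbf_a_hours, double mtbf_b_hours)
{
    return (mtbf_a_hours * mtbf_b_hours) / (mtbf_a_hours + mtbf_b_hours);
}
```

With the two figures quoted above, 50,000 hours for the hardware and 6,000
hours for the commercial unix, the combined value is 300,000,000 / 56,000,
about 5,357 hours, so the software term dominates.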
Zoltan Menyhart
* Re: [RFC] How drivers notice a MCA on I/O read? [1/3]
2003-11-18 10:11 [RFC] How drivers notice a MCA on I/O read? [1/3] Hidetoshi Seto
` (2 preceding siblings ...)
2003-11-18 15:06 ` [RFC] How drivers notice a MCA on I/O read? [1/3] Zoltan Menyhart
@ 2003-11-18 17:10 ` Jesse Barnes
2003-11-18 17:47 ` Luck, Tony
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Jesse Barnes @ 2003-11-18 17:10 UTC (permalink / raw)
To: linux-ia64
On Tue, Nov 18, 2003 at 07:11:20PM +0900, Hidetoshi Seto wrote:
> Assuming that Linux uses privilege level to determine action on MCA between
> kill the thread and down the system, if driver encounters MCA caused by I/O
> read, Linux should be down since privilege level of driver is kernel, not user.
> I want to convey the error to the offending driver, and want to enable the
> driver to retry failed read.
This would not only be useful for the occasional device failure, but
also for accessing memory spaces which by definition may or may not
respond to PIO requests, like legacy I/O bus and memory regions. Upon
entering readb_check(), you could set a global telling the MCA handler
to potentially expect a failure from the address or range that was
passed in. This would allow the MCA handler to describe in simple terms
what went wrong in case of failure and/or take appropriate action.
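
Editorial sketch (not part of the original mail): one way to picture the
"global telling the MCA handler to expect a failure" is an armed address
window that the handler consults. This is a single-CPU user-space model. The
sk_* names are hypothetical, and it assumes the handler can learn the failing
bus address, which the thread does not establish.

```c
#include <stdint.h>

struct sk_pio_expect {
    uintptr_t start;  /* first bus address that may legitimately fail */
    uintptr_t end;    /* one past the last such address */
    int armed;        /* non-zero while a checked read is in flight */
    int hit;          /* set by the handler when the failure was expected */
};

static struct sk_pio_expect sk_expect;

/* Called on entry to the checked read. */
void sk_expect_arm(uintptr_t start, uintptr_t len)
{
    sk_expect.start = start;
    sk_expect.end = start + len;
    sk_expect.hit = 0;
    sk_expect.armed = 1;
}

/* Called on exit; returns 1 if the handler saw an expected failure. */
int sk_expect_disarm(void)
{
    sk_expect.armed = 0;
    return sk_expect.hit;
}

/* Handler side: 1 means "expected, may resume", 0 means "unexpected". */
int sk_handler_classify(uintptr_t fault_addr)
{
    if (sk_expect.armed &&
        fault_addr >= sk_expect.start && fault_addr < sk_expect.end) {
        sk_expect.hit = 1;
        return 1;
    }
    return 0;
}
```

A real implementation would need one such record per CPU and would have to
be safe against the handler running at any instruction boundary.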
> So, I think about a readb_check function that has checking ability
> enable it return error value if MCA occur on read. Drivers could use
> readb_check instead of usual readb, and could diagnosis whether a
> retry be required or not, by the return value of readb_check.
Since this proposal would affect the driver API, it should probably be
discussed on linux-kernel@vger.kernel.org. It may be beneficial to
other systems that hard fail under similar circumstances.
Jesse
* RE: [RFC] How drivers notice a MCA on I/O read? [1/3]
2003-11-18 10:11 [RFC] How drivers notice a MCA on I/O read? [1/3] Hidetoshi Seto
` (3 preceding siblings ...)
2003-11-18 17:10 ` Jesse Barnes
@ 2003-11-18 17:47 ` Luck, Tony
2003-11-19 16:45 ` Grant Grundler
2003-11-25 9:27 ` Hidetoshi Seto
6 siblings, 0 replies; 8+ messages in thread
From: Luck, Tony @ 2003-11-18 17:47 UTC (permalink / raw)
To: linux-ia64
> This would not only be useful for the occasional device failure, but
> also for accessing memory spaces which by definition may or may not
> respond to PIO requests, like legacy I/O bus and memory regions. Upon
> entering readb_check(), you could set a global telling the MCA handler
> to potentially expect a failure from the address or range that was
> passed in. This would allow the MCA handler to describe in simple terms
> what went wrong in case of failure and/or take appropriate action.
As Zoltan mentioned in his mail you'd have to do some heavy fencing
around the internals of readb_check() to make this safe ... which
might make readb_check() too expensive to use for the 99.999999% of
the cases where the I/O board isn't broken. But I don't actually know
how much overhead would be involved ... I/O reads are already horrendously
slow, so you may be able to add some sizeable overhead without affecting
macro benchmarks more than a few percent.
-Tony
* Re: [RFC] How drivers notice a MCA on I/O read? [1/3]
2003-11-18 10:11 [RFC] How drivers notice a MCA on I/O read? [1/3] Hidetoshi Seto
` (4 preceding siblings ...)
2003-11-18 17:47 ` Luck, Tony
@ 2003-11-19 16:45 ` Grant Grundler
2003-11-25 9:27 ` Hidetoshi Seto
6 siblings, 0 replies; 8+ messages in thread
From: Grant Grundler @ 2003-11-19 16:45 UTC (permalink / raw)
To: linux-ia64
On Tue, Nov 18, 2003 at 07:11:20PM +0900, Hidetoshi Seto wrote:
...
> I want to convey the error to the offending driver, and want to enable the
> driver to retry failed read.
Hidetoshi,
Did you mean the driver literally "retry failed read" or did you mean
the driver could "recover" (ie return errors for pending IO requests)?
> So, I think about a readb_check function that has checking ability enable
> it return error value if MCA occur on read.
> Drivers could use readb_check instead of usual readb, and could diagnosis
> whether a retry be required or not, by the return value of readb_check.
I see little value in a simple retry. If the board is failing
(even transient failures) badly enough to cause MCA, it's probably
better to clean up driver state and stop accepting IO requests.
> To realize this, I consider following two plans:
>
> - readb_check on driver (with Notifier)
> Outline:
> - Platform specific MCA handler has a Notifier as hook point.
> - Driver may register a hook function to the Notifier.
> - Notifier calls over registered functions when MCA is signaled.
> - Called hook function checks address of error, and if the error seems
> to be concerned with the parent driver, ups internal error flag and
> stops Notifier by returning OK.
> - MCA handler regards state of Notifier, and decides the system to
> resume or not.
> - Restarted driver may refer the error flag after read, and may retry
> the read if flag is up.
This sounds flexible enough to do something other than retry the read.
I've been wondering if registering a callback at module_init() would be
sufficient. The callback could clean up driver state so the driver
instance can be shut down. Something like a Hotplug operation to remove
the card.
This way the driver wouldn't need a new read/write interface to
access MMIO space.
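
Editorial sketch (not part of the original mail): the callback idea can be
modelled as a per-driver error hook that marks the instance failed and
completes pending requests with an error. This is a user-space model with
hypothetical sk_* names. How the MCA path would actually find and invoke the
hook is left open, as in the mail.

```c
#include <stddef.h>

enum sk_dev_state { SK_DEV_ONLINE, SK_DEV_FAILED };

struct sk_driver {
    enum sk_dev_state state;
    int pending;     /* requests queued but not completed */
    int failed;      /* requests completed with an error */
    void (*on_fatal_error)(struct sk_driver *drv);
};

/* Callback: stop accepting I/O and fail everything still pending. */
void sk_shutdown_cb(struct sk_driver *drv)
{
    drv->state = SK_DEV_FAILED;
    drv->failed += drv->pending;
    drv->pending = 0;
}

/* Registration happens once, at init time. */
void sk_driver_init(struct sk_driver *drv)
{
    drv->state = SK_DEV_ONLINE;
    drv->pending = 0;
    drv->failed = 0;
    drv->on_fatal_error = sk_shutdown_cb;
}

/* Returns 0 if the request was queued, -1 if the device is shut down. */
int sk_submit(struct sk_driver *drv)
{
    if (drv->state != SK_DEV_ONLINE)
        return -1;
    drv->pending++;
    return 0;
}

/* What the error path would call for the affected driver instance. */
void sk_report_fatal(struct sk_driver *drv)
{
    if (drv->on_fatal_error != NULL)
        drv->on_fatal_error(drv);
}
```

In this model the normal read path is unchanged: only the error path touches
the callback, which matches the point that no new MMIO accessor is needed.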
> Feature:
> - Generic kernel is not changed.
> - Require a platform specific MCA handler.
> - Service is available for platform specific drivers.
>
> -readb_check on kernel
> Outline:
> - Kernel has readb_check function.
> - Drivers may use readb_check instead of usual readb.
> - MCA handler checks address of error, and if it occurs in readb_check,
> changes return value of readb_check and resumes interrupted context.
> - Driver may refer the return value to notice MCA in last read procedure.
> Feature:
> - Generic kernel requires new codes.
> - Require some codes in generic MCA procedure.
> - Service is available for all drivers.
>
> Which one is better?
I'm really not sure. Need to think about it more.
thanks,
grant
* Re: [RFC] How drivers notice a MCA on I/O read? [1/3]
2003-11-18 10:11 [RFC] How drivers notice a MCA on I/O read? [1/3] Hidetoshi Seto
` (5 preceding siblings ...)
2003-11-19 16:45 ` Grant Grundler
@ 2003-11-25 9:27 ` Hidetoshi Seto
6 siblings, 0 replies; 8+ messages in thread
From: Hidetoshi Seto @ 2003-11-25 9:27 UTC (permalink / raw)
To: linux-ia64
Thanks, all.
For the moment, what I mentioned was only about I/O read interface between
drivers and MCA handler, and as Zoltan mentioned, something like recovering
from each
I/O read
I/O write
DMA from memory to device
DMA from device to memory
is still left as a problem to be solved. I'd be glad to hear comments about this.
Well, I'll try to write code that really consumes the data without
relying on mf.a. (Since I'm not familiar with IA64 machine instructions yet,
it may take a long time...)
Also, I'll try to measure the difference between readb_check and the usual
readb. I should reconsider my approach if there is considerable overhead.
Since readb_check concerns the interface for drivers, I think it is better
to take this plan to linux-kernel@vger.kernel.org for discussion.
------
H.Seto <seto.hidetoshi@jp.fujitsu.com>