* [patch 11/13] s390: instruction processing damage handling.
@ 2006-04-24 15:05 Martin Schwidefsky
2006-04-24 23:58 ` Andrew Morton
2006-04-28 7:33 ` Josef Sipek
0 siblings, 2 replies; 9+ messages in thread
From: Martin Schwidefsky @ 2006-04-24 15:05 UTC (permalink / raw)
To: linux-kernel, akpm, heiko.carstens
From: Heiko Carstens <heiko.carstens@de.ibm.com>
[patch 11/13] s390: instruction processing damage handling.
In case of an instruction processing damage (IPD) machine check in
kernel mode the resulting action is always to stop the kernel.
This is not necessarily the best solution since a retry of the
failing instruction might succeed. Add logic to retry the instruction
if no more than 30 instruction processing damage checks occured in
the last 5 minutes.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
drivers/s390/s390mach.c | 33 ++++++++++++++++++++++++++++-----
1 files changed, 28 insertions(+), 5 deletions(-)
diff -urpN linux-2.6/drivers/s390/s390mach.c linux-2.6-patched/drivers/s390/s390mach.c
--- linux-2.6/drivers/s390/s390mach.c 2006-03-20 06:53:29.000000000 +0100
+++ linux-2.6-patched/drivers/s390/s390mach.c 2006-04-24 16:47:28.000000000 +0200
@@ -362,12 +362,19 @@ s390_revalidate_registers(struct mci *mc
return kill_task;
}
+#define MAX_IPD_COUNT 29
+#define MAX_IPD_TIME (5 * 60 * 100 * 1000) /* 5 minutes */
+
/*
* machine check handler.
*/
void
s390_do_machine_check(struct pt_regs *regs)
{
+ static DEFINE_SPINLOCK(ipd_lock);
+ static unsigned long long last_ipd;
+ static int ipd_count;
+ unsigned long long tmp;
struct mci *mci;
struct mcck_struct *mcck;
int umode;
@@ -404,11 +411,27 @@ s390_do_machine_check(struct pt_regs *re
s390_handle_damage("processing backup machine "
"check with damage.");
}
- if (!umode)
- s390_handle_damage("processing backup machine "
- "check in kernel mode.");
- mcck->kill_task = 1;
- mcck->mcck_code = *(unsigned long long *) mci;
+
+ /*
+ * Nullifying exigent condition, therefore we might
+ * retry this instruction.
+ */
+
+ spin_lock(&ipd_lock);
+
+ tmp = get_clock();
+
+ if (((tmp - last_ipd) >> 12) < MAX_IPD_TIME)
+ ipd_count++;
+ else
+ ipd_count = 1;
+
+ last_ipd = tmp;
+
+ if (ipd_count == MAX_IPD_COUNT)
+ s390_handle_damage("too many ipd retries.");
+
+ spin_unlock(&ipd_lock);
}
else {
/* Processing damage -> stopping machine */
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [patch 11/13] s390: instruction processing damage handling.
2006-04-24 15:05 [patch 11/13] s390: instruction processing damage handling Martin Schwidefsky
@ 2006-04-24 23:58 ` Andrew Morton
2006-04-25 20:14 ` Arnd Bergmann
2006-04-28 7:33 ` Josef Sipek
1 sibling, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2006-04-24 23:58 UTC (permalink / raw)
To: Martin Schwidefsky; +Cc: linux-kernel, heiko.carstens
Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
>
> + tmp = get_clock();
s390 has a get_clock()? I guess you don't use i2c much.
We shouldn't use such generic-looking identifiers for arch-specific things.
But I guess you can defer the great renaming to s390_get_clock() until
something actually breaks.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch 11/13] s390: instruction processing damage handling.
2006-04-24 23:58 ` Andrew Morton
@ 2006-04-25 20:14 ` Arnd Bergmann
0 siblings, 0 replies; 9+ messages in thread
From: Arnd Bergmann @ 2006-04-25 20:14 UTC (permalink / raw)
To: Andrew Morton; +Cc: Martin Schwidefsky, linux-kernel, heiko.carstens
Am Tuesday 25 April 2006 01:58 schrieb Andrew Morton:
> Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> > + tmp = get_clock();
>
> s390 has a get_clock()? I guess you don't use i2c much.
>
> We shouldn't use such generic-looking identifiers for arch-specific things.
> But I guess you can defer the great renaming to s390_get_clock() until
> something actually breaks.
Many places could probably just use get_cycles() if they don't need
sub-cycle resolution. Also such an update could be combined with
a move to the new store-clock-fast instruction.
Arnd <><
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch 11/13] s390: instruction processing damage handling.
2006-04-24 15:05 [patch 11/13] s390: instruction processing damage handling Martin Schwidefsky
2006-04-24 23:58 ` Andrew Morton
@ 2006-04-28 7:33 ` Josef Sipek
2006-04-28 8:39 ` Heiko Carstens
1 sibling, 1 reply; 9+ messages in thread
From: Josef Sipek @ 2006-04-28 7:33 UTC (permalink / raw)
To: Martin Schwidefsky; +Cc: linux-kernel, akpm, heiko.carstens
On Mon, Apr 24, 2006 at 05:05:44PM +0200, Martin Schwidefsky wrote:
> +++ linux-2.6-patched/drivers/s390/s390mach.c 2006-04-24 16:47:28.000000000 +0200
...
> +#define MAX_IPD_TIME (5 * 60 * 100 * 1000) /* 5 minutes */
I'm no s390 expert, but shouldn't the above use something like HZ?
Josef 'Jeff' Sipek.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch 11/13] s390: instruction processing damage handling.
2006-04-28 7:33 ` Josef Sipek
@ 2006-04-28 8:39 ` Heiko Carstens
2006-04-28 9:24 ` Martin Schwidefsky
0 siblings, 1 reply; 9+ messages in thread
From: Heiko Carstens @ 2006-04-28 8:39 UTC (permalink / raw)
To: Josef Sipek; +Cc: Martin Schwidefsky, linux-kernel, akpm
On Fri, Apr 28, 2006 at 04:33:58AM -0400, Josef Sipek wrote:
> On Mon, Apr 24, 2006 at 05:05:44PM +0200, Martin Schwidefsky wrote:
> > +++ linux-2.6-patched/drivers/s390/s390mach.c 2006-04-24 16:47:28.000000000 +0200
> ...
> > +#define MAX_IPD_TIME (5 * 60 * 100 * 1000) /* 5 minutes */
>
> I'm no s390 expert, but shouldn't the above use something like HZ?
Using HZ here feels just wrong to me. MAX_IPD_TIME has nothing to do with the
timer frequency. In this case it's used to tell if there were 30 machine
checks within the last 5 minutes (in a usec granularity). It's just by
accident that this could be expressed using HZ.
(5 * 60 * USEC_PER_SEC) would probably look better...
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch 11/13] s390: instruction processing damage handling.
2006-04-28 8:39 ` Heiko Carstens
@ 2006-04-28 9:24 ` Martin Schwidefsky
2006-04-28 13:43 ` Paulo Marques
0 siblings, 1 reply; 9+ messages in thread
From: Martin Schwidefsky @ 2006-04-28 9:24 UTC (permalink / raw)
To: Heiko Carstens; +Cc: Josef Sipek, linux-kernel, akpm
On Fri, 2006-04-28 at 10:39 +0200, Heiko Carstens wrote:
> > > +++ linux-2.6-patched/drivers/s390/s390mach.c 2006-04-24 16:47:28.000000000 +0200
> > ...
> > > +#define MAX_IPD_TIME (5 * 60 * 100 * 1000) /* 5 minutes */
> >
> > I'm no s390 expert, but shouldn't the above use something like HZ?
>
> Using HZ here feels just wrong to me. MAX_IPD_TIME has nothing to do with the
> timer frequency. In this case it's used to tell if there were 30 machine
> checks within the last 5 minutes (in a usec granularity). It's just by
> accident that this could be expressed using HZ.
> (5 * 60 * USEC_PER_SEC) would probably look better...
Using HZ would be wrong. The check that uses MAX_IPD_TIME compares it
against the result of a get_clock() call. That uses the TOD Clock
directly, there is no dependency on HZ.
--
blue skies,
Martin.
Martin Schwidefsky
Linux for zSeries Development & Services
IBM Deutschland Entwicklung GmbH
"Reality continues to ruin my life." - Calvin.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch 11/13] s390: instruction processing damage handling.
2006-04-28 9:24 ` Martin Schwidefsky
@ 2006-04-28 13:43 ` Paulo Marques
2006-04-28 14:53 ` Martin Schwidefsky
0 siblings, 1 reply; 9+ messages in thread
From: Paulo Marques @ 2006-04-28 13:43 UTC (permalink / raw)
To: schwidefsky; +Cc: Heiko Carstens, Josef Sipek, linux-kernel, akpm
Martin Schwidefsky wrote:
> On Fri, 2006-04-28 at 10:39 +0200, Heiko Carstens wrote:
>
>>>>+++ linux-2.6-patched/drivers/s390/s390mach.c 2006-04-24 16:47:28.000000000 +0200
>>>...
>>>>+#define MAX_IPD_TIME (5 * 60 * 100 * 1000) /* 5 minutes */
^^^^^^^^^^^^^^^^^^^^
Expression A
>>>I'm no s390 expert, but shouldn't the above use something like HZ?
>>
>>Using HZ here feels just wrong to me. MAX_IPD_TIME has nothing to do with the
>>timer frequency. In this case it's used to tell if there were 30 machine
>>checks within the last 5 minutes (in a usec granularity). It's just by
>>accident that this could be expressed using HZ.
>>(5 * 60 * USEC_PER_SEC) would probably look better...
^^^^^^^^^^^^^^^^^^^^^^^
Expression B
I'm no s390 expert either, but just wanted to point out that expression
B is 10 times larger than expression A, so something's fishy here.
> Using HZ would be wrong. The check that uses MAX_IPD_TIME compares it
> against the result of a get_clock() call. That uses the TOD Clock
> directly, there is no dependency on HZ.
Looking at include/asm-s390/timex.h:
#define CLOCK_TICK_RATE 1193180 /* Underlying HZ */
makes me wonder if this should be:
#define MAX_IPD_TIME (5 * 60 * CLOCK_TICK_RATE) /* 5 minutes */
--
Paulo Marques - www.grupopie.com
Pointy-Haired Boss: I don't see anything that could stand in our way.
Dilbert: Sanity? Reality? The laws of physics?
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [patch 11/13] s390: instruction processing damage handling.
2006-04-28 13:43 ` Paulo Marques
@ 2006-04-28 14:53 ` Martin Schwidefsky
2006-04-28 16:46 ` Heiko Carstens
0 siblings, 1 reply; 9+ messages in thread
From: Martin Schwidefsky @ 2006-04-28 14:53 UTC (permalink / raw)
To: Paulo Marques; +Cc: Heiko Carstens, Josef Sipek, linux-kernel, akpm
On Fri, 2006-04-28 at 14:43 +0100, Paulo Marques wrote:
> >>>>+++ linux-2.6-patched/drivers/s390/s390mach.c 2006-04-24 16:47:28.000000000 +0200
> >>>...
> >>>>+#define MAX_IPD_TIME (5 * 60 * 100 * 1000) /* 5 minutes */
> ^^^^^^^^^^^^^^^^^^^^
> Expression A
>
> >>>I'm no s390 expert, but shouldn't the above use something like HZ?
> >>
> >>Using HZ here feels just wrong to me. MAX_IPD_TIME has nothing to do with the
> >>timer frequency. In this case it's used to tell if there were 30 machine
> >>checks within the last 5 minutes (in a usec granularity). It's just by
> >>accident that this could be expressed using HZ.
> >>(5 * 60 * USEC_PER_SEC) would probably look better...
> ^^^^^^^^^^^^^^^^^^^^^^^
> Expression B
>
> I'm no s390 expert either, but just wanted to point out that expression
> B is 10 times larger than expression A, so something's fishy here.
Indeed, 5*60*100*1000 is wrong. That should be 5*60*1000*1000. This must
have been the week of stupid bugs.. thanks for spotting this.
> > Using HZ would be wrong. The check that uses MAX_IPD_TIME compares it
> > against the result of a get_clock() call. That uses the TOD Clock
> > directly, there is no dependency on HZ.
>
> Looking at include/asm-s390/timex.h:
>
> #define CLOCK_TICK_RATE 1193180 /* Underlying HZ */
>
> makes me wonder if this should be:
>
> #define MAX_IPD_TIME (5 * 60 * CLOCK_TICK_RATE) /* 5 minutes */
No. The CLOCK_TICK_RATE is a really strange beast. It sole purpose is to
satisfy the calculations in include/linux/jiffies.h. The value itself
has no meaning in regard to the TOD clock. 5*60*CLOCK_TICK_RATE
evaluates to about 358 seconds. Not what we want.
--
blue skies,
Martin.
Martin Schwidefsky
Linux for zSeries Development & Services
IBM Deutschland Entwicklung GmbH
"Reality continues to ruin my life." - Calvin.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch 11/13] s390: instruction processing damage handling.
2006-04-28 14:53 ` Martin Schwidefsky
@ 2006-04-28 16:46 ` Heiko Carstens
0 siblings, 0 replies; 9+ messages in thread
From: Heiko Carstens @ 2006-04-28 16:46 UTC (permalink / raw)
To: Martin Schwidefsky; +Cc: Paulo Marques, Josef Sipek, linux-kernel, akpm
> > >>>>+#define MAX_IPD_TIME (5 * 60 * 100 * 1000) /* 5 minutes */
> > ^^^^^^^^^^^^^^^^^^^^
> > Expression A
> > >>(5 * 60 * USEC_PER_SEC) would probably look better...
> > ^^^^^^^^^^^^^^^^^^^^^^^
> > Expression B
> >
> > I'm no s390 expert either, but just wanted to point out that expression
> > B is 10 times larger than expression A, so something's fishy here.
>
> Indeed, 5*60*100*1000 is wrong. That should be 5*60*1000*1000. This must
> have been the week of stupid bugs.. thanks for spotting this.
*sigh*... stupid me. At least this doesn't break anything and with this
patch it's still better than before.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2006-04-28 16:46 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-04-24 15:05 [patch 11/13] s390: instruction processing damage handling Martin Schwidefsky
2006-04-24 23:58 ` Andrew Morton
2006-04-25 20:14 ` Arnd Bergmann
2006-04-28 7:33 ` Josef Sipek
2006-04-28 8:39 ` Heiko Carstens
2006-04-28 9:24 ` Martin Schwidefsky
2006-04-28 13:43 ` Paulo Marques
2006-04-28 14:53 ` Martin Schwidefsky
2006-04-28 16:46 ` Heiko Carstens
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox