xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] Xen: Fix Xen hypervisor panic during dumping timer info on huge machine.
@ 2016-10-12  7:58 Lan Tianyu
  2016-10-12  7:58 ` [PATCH v2 1/2] Xen/Keyhandler: Rework process of nonirq keyhandler Lan Tianyu
  2016-10-12  7:58 ` [PATCH v2 2/2] Xen/timer: Process softirq during dumping timer info Lan Tianyu
  0 siblings, 2 replies; 14+ messages in thread
From: Lan Tianyu @ 2016-10-12  7:58 UTC (permalink / raw)
  To: xen-devel, xen-devel
  Cc: Lan Tianyu, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, jbeulich

This patchset is to fix triggering NMI watchdog during dump timer info
on the huge machine with a mount of physical cpus. Detail please see
change log of Patch 1.

Previous discussion:
https://patchwork.kernel.org/patch/9328449/

Change since V1:
        Add "async" param for handle_keypress() to identify
whether run nonirq keyhandler in tasklet or not. This is to
avoid processing debugkey sysctl asynchronously.


Lan Tianyu (2):
  Xen/Keyhandler: Rework process of nonirq keyhandler
  Xen/timer: Process softirq during dumping timer info

 xen/common/keyhandler.c      |    8 +++++---
 xen/common/sysctl.c          |    2 +-
 xen/common/timer.c           |    1 +
 xen/drivers/char/console.c   |    2 +-
 xen/include/xen/keyhandler.h |    4 +++-
 5 files changed, 11 insertions(+), 6 deletions(-)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 1/2] Xen/Keyhandler: Rework process of nonirq keyhandler
  2016-10-12  7:58 [PATCH v2 0/2] Xen: Fix Xen hypervisor panic during dumping timer info on huge machine Lan Tianyu
@ 2016-10-12  7:58 ` Lan Tianyu
  2016-10-12 13:19   ` Jan Beulich
  2016-10-12  7:58 ` [PATCH v2 2/2] Xen/timer: Process softirq during dumping timer info Lan Tianyu
  1 sibling, 1 reply; 14+ messages in thread
From: Lan Tianyu @ 2016-10-12  7:58 UTC (permalink / raw)
  To: xen-devel, xen-devel
  Cc: Lan Tianyu, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, jbeulich


Keyhandler may run for a long time in serial port driver's
timer handler on the large machine with a lot of physical
cpus(e,g dump_timerq()) when serial port driver works in
the poll mode(via the exception mechanism).

If a timer handler runs a long time, it will block nmi_timer_fn()
to feed NMI watchdog and cause Xen hypervisor panic. Inserting
process_pending_softirqs() in timer handler will not help. when timer
interrupt arrives, timer subsystem calls all expired timer handlers
before programming next timer interrupt. There is no timer interrupt
arriving to trigger timer softirq during run a timer handler.

This patch is to fix the issue to make nonirq keyhandler run in
tasklet when receive debug key from serial port.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/common/keyhandler.c      |    8 +++++---
 xen/common/sysctl.c          |    2 +-
 xen/drivers/char/console.c   |    2 +-
 xen/include/xen/keyhandler.h |    4 +++-
 4 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
index 16de6e8..3d50041 100644
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -75,19 +75,21 @@ static struct keyhandler {
 
 static void keypress_action(unsigned long unused)
 {
-    handle_keypress(keypress_key, NULL);
+    console_start_log_everything();
+    key_table[keypress_key].fn(keypress_key);
+    console_end_log_everything();
 }
 
 static DECLARE_TASKLET(keypress_tasklet, keypress_action, 0);
 
-void handle_keypress(unsigned char key, struct cpu_user_regs *regs)
+void handle_keypress(unsigned char key, struct cpu_user_regs *regs, bool async)
 {
     struct keyhandler *h;
 
     if ( key >= ARRAY_SIZE(key_table) || !(h = &key_table[key])->fn )
         return;
 
-    if ( !in_irq() || h->irq_callback )
+    if ( h->irq_callback || !async )
     {
         console_start_log_everything();
         h->irq_callback ? h->irq_fn(key, regs) : h->fn(key);
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index 8aea6ef..1eb7bad 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -136,7 +136,7 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
         {
             if ( copy_from_guest_offset(&c, op->u.debug_keys.keys, i, 1) )
                 goto out;
-            handle_keypress(c, guest_cpu_user_regs());
+            handle_keypress(c, guest_cpu_user_regs(), false);
         }
         ret = 0;
         copyback = 0;
diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index 55ae31a..184b523 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -347,7 +347,7 @@ static void switch_serial_input(void)
 static void __serial_rx(char c, struct cpu_user_regs *regs)
 {
     if ( xen_rx )
-        return handle_keypress(c, regs);
+        return handle_keypress(c, regs, true);
 
     /* Deliver input to guest buffer, unless it is already full. */
     if ( (serial_rx_prod-serial_rx_cons) != SERIAL_RX_SIZE )
diff --git a/xen/include/xen/keyhandler.h b/xen/include/xen/keyhandler.h
index 06c05c8..e9595bd 100644
--- a/xen/include/xen/keyhandler.h
+++ b/xen/include/xen/keyhandler.h
@@ -46,7 +46,9 @@ void register_irq_keyhandler(unsigned char key,
                              bool_t diagnostic);
 
 /* Inject a keypress into the key-handling subsystem. */
-extern void handle_keypress(unsigned char key, struct cpu_user_regs *regs);
+extern void handle_keypress(unsigned char key,
+			    struct cpu_user_regs *regs,
+			    bool async);
 
 /* Scratch space is available for use of any keyhandler. */
 extern char keyhandler_scratch[1024];
-- 
1.7.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 2/2] Xen/timer: Process softirq during dumping timer info
  2016-10-12  7:58 [PATCH v2 0/2] Xen: Fix Xen hypervisor panic during dumping timer info on huge machine Lan Tianyu
  2016-10-12  7:58 ` [PATCH v2 1/2] Xen/Keyhandler: Rework process of nonirq keyhandler Lan Tianyu
@ 2016-10-12  7:58 ` Lan Tianyu
  2016-10-21 17:27   ` Wei Liu
  1 sibling, 1 reply; 14+ messages in thread
From: Lan Tianyu @ 2016-10-12  7:58 UTC (permalink / raw)
  To: xen-devel, xen-devel
  Cc: Lan Tianyu, sstabellini, wei.liu2, George.Dunlap, andrew.cooper3,
	ian.jackson, jbeulich

Dumping timer info may run for a long time on the huge machine with
a lot of physical cpus. To avoid triggering NMI watchdog, add
process_pending_softirqs() in the loop of dumping timer info.

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
 xen/common/timer.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/xen/common/timer.c b/xen/common/timer.c
index 29a60a9..ab6bca0 100644
--- a/xen/common/timer.c
+++ b/xen/common/timer.c
@@ -530,6 +530,7 @@ static void dump_timerq(unsigned char key)
     {
         ts = &per_cpu(timers, i);
 
+        process_pending_softirqs();
         printk("CPU%02d:\n", i);
         spin_lock_irqsave(&ts->lock, flags);
         for ( j = 1; j <= GET_HEAP_SIZE(ts->heap); j++ )
-- 
1.7.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/2] Xen: Fix Xen hypervisor panic during dumping timer info on huge machine.
       [not found] <1476287051-17629-1-git-send-email-tianyu.lan@intel.com>
@ 2016-10-12  8:09 ` Jan Beulich
  2016-10-12  8:20   ` Lan Tianyu
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2016-10-12  8:09 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel

>>> On 12.10.16 at 17:44, <tianyu.lan@intel.com> wrote:
> This patchset is to fix triggering NMI watchdog during dump timer info
> on the huge machine with a mount of physical cpus. Detail please see
> change log of Patch 1.
> 
> Previous discussion:
> https://patchwork.kernel.org/patch/9328449/ 
> 
> Change since V1:
>         Add "async" param for handle_keypress() to identify
> whether run nonirq keyhandler in tasklet or not. This is to
> avoid processing debugkey sysctl asynchronously.
> 
> 
> Lan Tianyu (2):
>   Xen/Keyhandler: Rework process of nonirq keyhandler
>   Xen/timer: Process softirq during dumping timer info

This second patch went in already a few days ago.

Also, any reason you send to the list twice (once @lists.xen.org,
and another time to @lists.xenproject.org)?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/2] Xen: Fix Xen hypervisor panic during dumping timer info on huge machine.
  2016-10-12  8:09 ` [PATCH v2 0/2] Xen: Fix Xen hypervisor panic during dumping timer info on huge machine Jan Beulich
@ 2016-10-12  8:20   ` Lan Tianyu
  2016-10-12  9:48     ` Wei Liu
  0 siblings, 1 reply; 14+ messages in thread
From: Lan Tianyu @ 2016-10-12  8:20 UTC (permalink / raw)
  To: Jan Beulich
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel

On 2016年10月12日 16:09, Jan Beulich wrote:
>>>> On 12.10.16 at 17:44, <tianyu.lan@intel.com> wrote:
>> This patchset is to fix triggering NMI watchdog during dump timer info
>> on the huge machine with a mount of physical cpus. Detail please see
>> change log of Patch 1.
>>
>> Previous discussion:
>> https://patchwork.kernel.org/patch/9328449/ 
>>
>> Change since V1:
>>         Add "async" param for handle_keypress() to identify
>> whether run nonirq keyhandler in tasklet or not. This is to
>> avoid processing debugkey sysctl asynchronously.
>>
>>
>> Lan Tianyu (2):
>>   Xen/Keyhandler: Rework process of nonirq keyhandler
>>   Xen/timer: Process softirq during dumping timer info
> 
> This second patch went in already a few days ago.
> 

Oh. Sorry for noise. I didn't notice that.

> Also, any reason you send to the list twice (once @lists.xen.org,
> and another time to @lists.xenproject.org)?

Sometime I found my patches wasn't able to arrive xen-devel and so
send to both xen.org and xenproject.org maillist. I will double check.

> 
> Jan
> 


-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/2] Xen: Fix Xen hypervisor panic during dumping timer info on huge machine.
  2016-10-12  8:20   ` Lan Tianyu
@ 2016-10-12  9:48     ` Wei Liu
  2016-10-12 11:08       ` Ian Jackson
  0 siblings, 1 reply; 14+ messages in thread
From: Wei Liu @ 2016-10-12  9:48 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3, ian.jackson,
	Jan Beulich, xen-devel

On Wed, Oct 12, 2016 at 04:20:02PM +0800, Lan Tianyu wrote:
> On 2016年10月12日 16:09, Jan Beulich wrote:
> >>>> On 12.10.16 at 17:44, <tianyu.lan@intel.com> wrote:
> >> This patchset is to fix triggering NMI watchdog during dump timer info
> >> on the huge machine with a mount of physical cpus. Detail please see
> >> change log of Patch 1.
> >>
> >> Previous discussion:
> >> https://patchwork.kernel.org/patch/9328449/ 
> >>
> >> Change since V1:
> >>         Add "async" param for handle_keypress() to identify
> >> whether run nonirq keyhandler in tasklet or not. This is to
> >> avoid processing debugkey sysctl asynchronously.
> >>
> >>
> >> Lan Tianyu (2):
> >>   Xen/Keyhandler: Rework process of nonirq keyhandler
> >>   Xen/timer: Process softirq during dumping timer info
> > 
> > This second patch went in already a few days ago.
> > 
> 
> Oh. Sorry for noise. I didn't notice that.
> 

It is always good practice to rebase your patch series on top of
staging, FWIW.

> > Also, any reason you send to the list twice (once @lists.xen.org,
> > and another time to @lists.xenproject.org)?
> 
> Sometime I found my patches wasn't able to arrive xen-devel and so
> send to both xen.org and xenproject.org maillist. I will double check.
> 

Both addresses should work. There are glitches from time to time though.
So do report to us if this happens again.

Wei.

> > 
> > Jan
> > 
> 
> 
> -- 
> Best regards
> Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/2] Xen: Fix Xen hypervisor panic during dumping timer info on huge machine.
  2016-10-12  9:48     ` Wei Liu
@ 2016-10-12 11:08       ` Ian Jackson
  2016-10-12 14:32         ` Lan, Tianyu
  0 siblings, 1 reply; 14+ messages in thread
From: Ian Jackson @ 2016-10-12 11:08 UTC (permalink / raw)
  To: Wei Liu
  Cc: Lan Tianyu, sstabellini, George.Dunlap, andrew.cooper3,
	Jan Beulich, xen-devel

Wei Liu writes ("Re: [PATCH v2 0/2] Xen: Fix Xen hypervisor panic during dumping timer info on huge machine."):
> On Wed, Oct 12, 2016 at 04:20:02PM +0800, Lan Tianyu wrote:
> > On 2016年10月12日 16:09, Jan Beulich wrote:
> > > Also, any reason you send to the list twice (once @lists.xen.org,
> > > and another time to @lists.xenproject.org)?
> > 
> > Sometime I found my patches wasn't able to arrive xen-devel and so
> > send to both xen.org and xenproject.org maillist. I will double check.
> 
> Both addresses should work. There are glitches from time to time though.
> So do report to us if this happens again.

I don't think either address is likely to work differently or
separately to the other.  So please just send to one, and if it
doesn't work, please report it and we will try to fix it.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 1/2] Xen/Keyhandler: Rework process of nonirq keyhandler
  2016-10-12  7:58 ` [PATCH v2 1/2] Xen/Keyhandler: Rework process of nonirq keyhandler Lan Tianyu
@ 2016-10-12 13:19   ` Jan Beulich
  2016-10-12 14:30     ` Lan, Tianyu
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2016-10-12 13:19 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel

>>> On 12.10.16 at 09:58, <tianyu.lan@intel.com> wrote:
> --- a/xen/drivers/char/console.c
> +++ b/xen/drivers/char/console.c
> @@ -347,7 +347,7 @@ static void switch_serial_input(void)
>  static void __serial_rx(char c, struct cpu_user_regs *regs)
>  {
>      if ( xen_rx )
> -        return handle_keypress(c, regs);
> +        return handle_keypress(c, regs, true);

I think it would be nice to pass true here only when in polling mode,
unless you know or can deduce that the a similar problem also exists
in IRQ mode. Perhaps you could simply move the !in_irq() here? (Of
course the new function parameter would then want to be renamed.)

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 1/2] Xen/Keyhandler: Rework process of nonirq keyhandler
  2016-10-12 13:19   ` Jan Beulich
@ 2016-10-12 14:30     ` Lan, Tianyu
  2016-10-12 16:03       ` Jan Beulich
  0 siblings, 1 reply; 14+ messages in thread
From: Lan, Tianyu @ 2016-10-12 14:30 UTC (permalink / raw)
  To: Jan Beulich
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel



On 10/12/2016 9:19 PM, Jan Beulich wrote:
>>>> On 12.10.16 at 09:58, <tianyu.lan@intel.com> wrote:
>> --- a/xen/drivers/char/console.c
>> +++ b/xen/drivers/char/console.c
>> @@ -347,7 +347,7 @@ static void switch_serial_input(void)
>>  static void __serial_rx(char c, struct cpu_user_regs *regs)
>>  {
>>      if ( xen_rx )
>> -        return handle_keypress(c, regs);
>> +        return handle_keypress(c, regs, true);
>
> I think it would be nice to pass true here only when in polling mode,
> unless you know or can deduce that the a similar problem also exists
> in IRQ mode. Perhaps you could simply move the !in_irq() here?

That's a good idea. Thanks.

>(Of course the new function parameter would then want to be renamed.)

Since the issue happens when handle_keypress() runs in a timer handler,
how about to name new parameter "intimer"? __serial_rx() is called in a 
timer handler or interrupt handler. Or do you have other suggestion?

>
> Jan
>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/2] Xen: Fix Xen hypervisor panic during dumping timer info on huge machine.
  2016-10-12 11:08       ` Ian Jackson
@ 2016-10-12 14:32         ` Lan, Tianyu
  0 siblings, 0 replies; 14+ messages in thread
From: Lan, Tianyu @ 2016-10-12 14:32 UTC (permalink / raw)
  To: Ian Jackson, Wei Liu
  Cc: sstabellini, George.Dunlap, andrew.cooper3, Jan Beulich,
	xen-devel



On 10/12/2016 7:08 PM, Ian Jackson wrote:
> Wei Liu writes ("Re: [PATCH v2 0/2] Xen: Fix Xen hypervisor panic during dumping timer info on huge machine."):
>> On Wed, Oct 12, 2016 at 04:20:02PM +0800, Lan Tianyu wrote:
>>> On 2016年10月12日 16:09, Jan Beulich wrote:
>>>> Also, any reason you send to the list twice (once @lists.xen.org,
>>>> and another time to @lists.xenproject.org)?
>>>
>>> Sometime I found my patches wasn't able to arrive xen-devel and so
>>> send to both xen.org and xenproject.org maillist. I will double check.
>>
>> Both addresses should work. There are glitches from time to time though.
>> So do report to us if this happens again.
>
> I don't think either address is likely to work differently or
> separately to the other.  So please just send to one, and if it
> doesn't work, please report it and we will try to fix it.

Ok. I get it.

>
> Thanks,
> Ian.
>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 1/2] Xen/Keyhandler: Rework process of nonirq keyhandler
  2016-10-12 14:30     ` Lan, Tianyu
@ 2016-10-12 16:03       ` Jan Beulich
  2016-10-13  1:15         ` Lan Tianyu
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2016-10-12 16:03 UTC (permalink / raw)
  To: Tianyu Lan
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel

>>> On 12.10.16 at 16:30, <tianyu.lan@intel.com> wrote:

> 
> On 10/12/2016 9:19 PM, Jan Beulich wrote:
>>>>> On 12.10.16 at 09:58, <tianyu.lan@intel.com> wrote:
>>> --- a/xen/drivers/char/console.c
>>> +++ b/xen/drivers/char/console.c
>>> @@ -347,7 +347,7 @@ static void switch_serial_input(void)
>>>  static void __serial_rx(char c, struct cpu_user_regs *regs)
>>>  {
>>>      if ( xen_rx )
>>> -        return handle_keypress(c, regs);
>>> +        return handle_keypress(c, regs, true);
>>
>> I think it would be nice to pass true here only when in polling mode,
>> unless you know or can deduce that the a similar problem also exists
>> in IRQ mode. Perhaps you could simply move the !in_irq() here?
> 
> That's a good idea. Thanks.
> 
>>(Of course the new function parameter would then want to be renamed.)
> 
> Since the issue happens when handle_keypress() runs in a timer handler,
> how about to name new parameter "intimer"? __serial_rx() is called in a 
> timer handler or interrupt handler. Or do you have other suggestion?

I think "intimer" can be confusing (to be mixed up with timer interrupt).
How about "force_tasklet"?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 1/2] Xen/Keyhandler: Rework process of nonirq keyhandler
  2016-10-12 16:03       ` Jan Beulich
@ 2016-10-13  1:15         ` Lan Tianyu
  0 siblings, 0 replies; 14+ messages in thread
From: Lan Tianyu @ 2016-10-13  1:15 UTC (permalink / raw)
  To: Jan Beulich
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel

On 2016年10月13日 00:03, Jan Beulich wrote:
>>>> On 12.10.16 at 16:30, <tianyu.lan@intel.com> wrote:
>>
>> Since the issue happens when handle_keypress() runs in a timer handler,
>> how about to name new parameter "intimer"? __serial_rx() is called in a 
>> timer handler or interrupt handler. Or do you have other suggestion?
> 
> I think "intimer" can be confusing (to be mixed up with timer interrupt).
> How about "force_tasklet"?

OK. I will update.
-- 
Best regards
Tianyu Lan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/2] Xen/timer: Process softirq during dumping timer info
  2016-10-12  7:58 ` [PATCH v2 2/2] Xen/timer: Process softirq during dumping timer info Lan Tianyu
@ 2016-10-21 17:27   ` Wei Liu
  2016-10-22  3:52     ` Lan, Tianyu
  0 siblings, 1 reply; 14+ messages in thread
From: Wei Liu @ 2016-10-21 17:27 UTC (permalink / raw)
  To: Lan Tianyu
  Cc: sstabellini, wei.liu2, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel, jbeulich, xen-devel

On Wed, Oct 12, 2016 at 03:58:24PM +0800, Lan Tianyu wrote:
> Dumping timer info may run for a long time on the huge machine with
> a lot of physical cpus. To avoid triggering NMI watchdog, add
> process_pending_softirqs() in the loop of dumping timer info.
> 
> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
>  xen/common/timer.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/xen/common/timer.c b/xen/common/timer.c
> index 29a60a9..ab6bca0 100644
> --- a/xen/common/timer.c
> +++ b/xen/common/timer.c
> @@ -530,6 +530,7 @@ static void dump_timerq(unsigned char key)
>      {
>          ts = &per_cpu(timers, i);
>  
> +        process_pending_softirqs();

This is causing issues in ARM (x86 has a similar issue):

Oct 20 01:43:31.410010 (XEN) Xen call trace:
Oct 20 01:43:31.410048 (XEN)    [<00233920>] process_pending_softirqs+0x34/0x5c (PC)
Oct 20 01:43:31.417990 (XEN)    [<00237c6c>] timer.c#dump_timerq+0x9c/0x1fc (LR)
Oct 20 01:43:31.418030 (XEN)    [<00218658>] handle_keypress+0xc0/0xf4
Oct 20 01:43:31.426001 (XEN)    [<002490c8>] console.c#__serial_rx+0x4c/0x9c
Oct 20 01:43:31.433970 (XEN)    [<00249b74>] console.c#serial_rx+0xcc/0xe4
Oct 20 01:43:31.434007 (XEN)    [<0024b6ec>] serial_rx_interrupt+0xcc/0xf8
Oct 20 01:43:31.441964 (XEN)    [<0024ae54>] exynos4210-uart.c#exynos4210_uart_interrupt+0xf8/0x160
Oct 20 01:43:31.450001 (XEN)    [<00256338>] do_IRQ+0x1a0/0x228
Oct 20 01:43:31.450040 (XEN)    [<00254074>] gic_interrupt+0x58/0xfc
Oct 20 01:43:31.457985 (XEN)    [<00260f98>] do_trap_irq+0x24/0x38
Oct 20 01:43:31.458022 (XEN)    [<00264970>] entry.o#return_from_trap+0/0x4
Oct 20 01:43:31.466010 (XEN)    [<0030a240>] 0030a240
Oct 20 01:43:31.466044 (XEN) 
Oct 20 01:43:31.466066 (XEN) 
Oct 20 01:43:31.466099 (XEN) ****************************************
Oct 20 01:43:31.473998 (XEN) Panic on CPU 0:
Oct 20 01:43:31.474029 (XEN) Assertion '!in_irq() && local_irq_is_enabled()' failed at softirq.c:57
Oct 20 01:43:31.481982 (XEN) ****************************************

See
http://logs.test-lab.xenproject.org/osstest/logs/101571/test-armhf-armhf-libvirt/serial-arndale-bluewater.log

I've reverted this patch in staging.

Wei.

>          printk("CPU%02d:\n", i);
>          spin_lock_irqsave(&ts->lock, flags);
>          for ( j = 1; j <= GET_HEAP_SIZE(ts->heap); j++ )
> -- 
> 1.7.1
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/2] Xen/timer: Process softirq during dumping timer info
  2016-10-21 17:27   ` Wei Liu
@ 2016-10-22  3:52     ` Lan, Tianyu
  0 siblings, 0 replies; 14+ messages in thread
From: Lan, Tianyu @ 2016-10-22  3:52 UTC (permalink / raw)
  To: Wei Liu
  Cc: sstabellini, George.Dunlap, andrew.cooper3, ian.jackson,
	xen-devel, jbeulich



On 10/22/2016 1:27 AM, Wei Liu wrote:
> On Wed, Oct 12, 2016 at 03:58:24PM +0800, Lan Tianyu wrote:
>> Dumping timer info may run for a long time on the huge machine with
>> a lot of physical cpus. To avoid triggering NMI watchdog, add
>> process_pending_softirqs() in the loop of dumping timer info.
>>
>> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>>  xen/common/timer.c |    1 +
>>  1 files changed, 1 insertions(+), 0 deletions(-)
>>
>> diff --git a/xen/common/timer.c b/xen/common/timer.c
>> index 29a60a9..ab6bca0 100644
>> --- a/xen/common/timer.c
>> +++ b/xen/common/timer.c
>> @@ -530,6 +530,7 @@ static void dump_timerq(unsigned char key)
>>      {
>>          ts = &per_cpu(timers, i);
>>
>> +        process_pending_softirqs();
>
> This is causing issues in ARM (x86 has a similar issue):
>
> Oct 20 01:43:31.410010 (XEN) Xen call trace:
> Oct 20 01:43:31.410048 (XEN)    [<00233920>] process_pending_softirqs+0x34/0x5c (PC)
> Oct 20 01:43:31.417990 (XEN)    [<00237c6c>] timer.c#dump_timerq+0x9c/0x1fc (LR)
> Oct 20 01:43:31.418030 (XEN)    [<00218658>] handle_keypress+0xc0/0xf4
> Oct 20 01:43:31.426001 (XEN)    [<002490c8>] console.c#__serial_rx+0x4c/0x9c
> Oct 20 01:43:31.433970 (XEN)    [<00249b74>] console.c#serial_rx+0xcc/0xe4
> Oct 20 01:43:31.434007 (XEN)    [<0024b6ec>] serial_rx_interrupt+0xcc/0xf8
> Oct 20 01:43:31.441964 (XEN)    [<0024ae54>] exynos4210-uart.c#exynos4210_uart_interrupt+0xf8/0x160
> Oct 20 01:43:31.450001 (XEN)    [<00256338>] do_IRQ+0x1a0/0x228
> Oct 20 01:43:31.450040 (XEN)    [<00254074>] gic_interrupt+0x58/0xfc
> Oct 20 01:43:31.457985 (XEN)    [<00260f98>] do_trap_irq+0x24/0x38
> Oct 20 01:43:31.458022 (XEN)    [<00264970>] entry.o#return_from_trap+0/0x4
> Oct 20 01:43:31.466010 (XEN)    [<0030a240>] 0030a240
> Oct 20 01:43:31.466044 (XEN)
> Oct 20 01:43:31.466066 (XEN)
> Oct 20 01:43:31.466099 (XEN) ****************************************
> Oct 20 01:43:31.473998 (XEN) Panic on CPU 0:
> Oct 20 01:43:31.474029 (XEN) Assertion '!in_irq() && local_irq_is_enabled()' failed at softirq.c:57
> Oct 20 01:43:31.481982 (XEN) ****************************************
>
> See
> http://logs.test-lab.xenproject.org/osstest/logs/101571/test-armhf-armhf-libvirt/serial-arndale-bluewater.log
>
> I've reverted this patch in staging.
>
> Wei.

dump_timerq() or other non-irq keyhandlers should not run in irq context 
and has sent out a fix patch.

https://lists.xen.org/archives/html/xen-devel/2016-10/msg01391.html



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-10-22  3:52 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-10-12  7:58 [PATCH v2 0/2] Xen: Fix Xen hypervisor panic during dumping timer info on huge machine Lan Tianyu
2016-10-12  7:58 ` [PATCH v2 1/2] Xen/Keyhandler: Rework process of nonirq keyhandler Lan Tianyu
2016-10-12 13:19   ` Jan Beulich
2016-10-12 14:30     ` Lan, Tianyu
2016-10-12 16:03       ` Jan Beulich
2016-10-13  1:15         ` Lan Tianyu
2016-10-12  7:58 ` [PATCH v2 2/2] Xen/timer: Process softirq during dumping timer info Lan Tianyu
2016-10-21 17:27   ` Wei Liu
2016-10-22  3:52     ` Lan, Tianyu
     [not found] <1476287051-17629-1-git-send-email-tianyu.lan@intel.com>
2016-10-12  8:09 ` [PATCH v2 0/2] Xen: Fix Xen hypervisor panic during dumping timer info on huge machine Jan Beulich
2016-10-12  8:20   ` Lan Tianyu
2016-10-12  9:48     ` Wei Liu
2016-10-12 11:08       ` Ian Jackson
2016-10-12 14:32         ` Lan, Tianyu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).