* [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
@ 2014-01-27 17:33 Oleksandr Tyshchenko
From: Oleksandr Tyshchenko @ 2014-01-27 17:33 UTC
To: xen-devel
Hi all,

We are trying to bring up Xen on the DRA7XX (OMAP5) platform.

We sometimes see hangs in the hypervisor, and these hangs are related to SMP.
We found out that a deadlock takes place in the on_selected_cpus function
when cross-interrupts occur simultaneously.
The issue:

1. We receive IRQs on the first CPU (say CPU0) and on the second CPU (say CPU1) in parallel.
2. In our case the maintenance_interrupt handler for the maintenance IRQ from CPU0 is executed on CPU1, and
   the handler for the IRQ from CPU1 is executed on CPU0.
3. According to the existing logic, gic_irq_eoi has to run on the CPU on which the IRQ was scheduled.
4. Because of this, in both cases we need to call on_selected_cpus to EOI the IRQs.
5. On CPU0, on_selected_cpus is called; it takes the lock at the beginning of the function
   and continues executing.
6. In parallel, the same function is called on CPU1, which blocks trying to take the lock because it is already held.
7. CPU0 sends the IPI and busy-waits until CPU1 has executed the function and cleared the cpumask.
8. But the mask will never be cleared, because CPU1 is waiting too.

Now we have the following situation: CPU0 cannot exit its busy loop because it is waiting for CPU1 to execute
the function and clear the mask, while CPU1 is waiting for CPU0 to release the lock.
This results in a deadlock; the sketch below shows where each CPU gets stuck.
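For reference, here is the relevant logic of on_selected_cpus, condensed from
xen/common/smp.c (the nr_cpus == 0 early exit and the call_data declarations
are omitted; the annotations map to the steps above):

    /* Condensed from xen/common/smp.c (Xen 4.4-unstable); call_data is
     * the global structure holding selected/func/info/wait. */
    static DEFINE_SPINLOCK(call_lock);

    void on_selected_cpus(
        const cpumask_t *selected,
        void (*func) (void *info),
        void *info,
        int wait)
    {
        ASSERT(local_irq_is_enabled());

        spin_lock(&call_lock);      /* step 6: CPU1 spins here forever */

        cpumask_copy(&call_data.selected, selected);
        call_data.func = func;
        call_data.info = info;
        call_data.wait = wait;

        smp_send_call_function_mask(&call_data.selected);

        /* step 7: CPU0 spins here; the mask is only cleared by
         * smp_call_function_interrupt() running on CPU1, which is
         * stuck on the spin_lock above -- hence the deadlock. */
        while ( !cpumask_empty(&call_data.selected) )
            cpu_relax();

        spin_unlock(&call_lock);
    }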
Since we needed a solution to avoid these hangs, the attached patches were created. The idea is simply to
call smp_call_function_interrupt if the lock is already held. This lets the waiting CPU service the pending
call, exit its busy loop and release the lock.
But I am afraid this solution is incomplete and maybe not enough for stable operation. I would appreciate it
if you could explain how to solve the issue in the right way, or give some advice.
P.S. We use the following software:
1. Hypervisor - XEN 4.4 unstable
2. Dom0 - Kernel 3.8
3. DomU - Kernel 3.8
Oleksandr Tyshchenko (2):
xen/arm: Add return value to smp_call_function_interrupt function
xen/arm: Fix deadlock in on_selected_cpus function
 xen/common/smp.c      |   13 ++++++++++---
 xen/include/xen/smp.h |    2 +-
 2 files changed, 11 insertions(+), 4 deletions(-)
--
1.7.9.5

* [PATCH v1 1/2] xen/arm: Add return value to smp_call_function_interrupt function
From: Oleksandr Tyshchenko @ 2014-01-27 17:33 UTC
To: xen-devel

Let the function return an error if the action cannot be executed.

Change-Id: Iace691850024656239326bf0e3c87b57cb1b8ab3
Signed-off-by: Oleksandr Tyshchenko <oleksandr.tyshchenko@globallogic.com>
Signed-off-by: Andrii Tseglytskyi <andrii.tseglytskyi@globallogic.com>
---
 xen/common/smp.c      |    7 +++++--
 xen/include/xen/smp.h |    2 +-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/xen/common/smp.c b/xen/common/smp.c
index 482a203..2700bd7 100644
--- a/xen/common/smp.c
+++ b/xen/common/smp.c
@@ -20,6 +20,7 @@
 #include <asm/processor.h>
 #include <xen/spinlock.h>
 #include <xen/smp.h>
+#include <xen/errno.h>
 
 /*
  * Structure and data for smp_call_function()/on_selected_cpus().
@@ -75,14 +76,14 @@ out:
     spin_unlock(&call_lock);
 }
 
-void smp_call_function_interrupt(void)
+int smp_call_function_interrupt(void)
 {
     void (*func)(void *info) = call_data.func;
     void *info = call_data.info;
     unsigned int cpu = smp_processor_id();
 
     if ( !cpumask_test_cpu(cpu, &call_data.selected) )
-        return;
+        return -EPERM;
 
     irq_enter();
 
@@ -100,6 +101,8 @@ void smp_call_function_interrupt(void)
     }
 
     irq_exit();
+
+    return 0;
 }
 
 /*
diff --git a/xen/include/xen/smp.h b/xen/include/xen/smp.h
index 6febb56..6d05910 100644
--- a/xen/include/xen/smp.h
+++ b/xen/include/xen/smp.h
@@ -61,7 +61,7 @@ static inline void on_each_cpu(
 /*
  * Call a function on the current CPU
  */
-void smp_call_function_interrupt(void);
+int smp_call_function_interrupt(void);
 
 void smp_send_call_function_mask(const cpumask_t *mask);
 
-- 
1.7.9.5

* Re: [PATCH v1 1/2] xen/arm: Add return value to smp_call_function_interrupt function
From: Stefano Stabellini @ 2014-01-27 18:28 UTC
To: Oleksandr Tyshchenko; +Cc: xen-devel

On Mon, 27 Jan 2014, Oleksandr Tyshchenko wrote:
> Let the function return an error if the action cannot be executed.

Unless you make the calling function (do_sgi) check for the return
value, this patch won't change anything.

[...]

* [PATCH v1 2/2] xen/arm: Fix deadlock in on_selected_cpus function
From: Oleksandr Tyshchenko @ 2014-01-27 17:33 UTC
To: xen-devel

This patch is needed to avoid possible deadlocks when cross-interrupts
occur simultaneously.

Change-Id: I574b496442253a7b67a27e2edd793526c8131284
Signed-off-by: Oleksandr Tyshchenko <oleksandr.tyshchenko@globallogic.com>
Signed-off-by: Andrii Tseglytskyi <andrii.tseglytskyi@globallogic.com>
---
 xen/common/smp.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/xen/common/smp.c b/xen/common/smp.c
index 2700bd7..46d2fc6 100644
--- a/xen/common/smp.c
+++ b/xen/common/smp.c
@@ -55,7 +55,11 @@ void on_selected_cpus(
 
     ASSERT(local_irq_is_enabled());
 
-    spin_lock(&call_lock);
+    if (!spin_trylock(&call_lock)) {
+        if (smp_call_function_interrupt())
+            return;
+        spin_lock(&call_lock);
+    }
 
     cpumask_copy(&call_data.selected, selected);
 
-- 
1.7.9.5

* Re: [PATCH v1 2/2] xen/arm: Fix deadlock in on_selected_cpus function
From: Stefano Stabellini @ 2014-01-27 19:00 UTC
To: Oleksandr Tyshchenko; +Cc: xen-devel

On Mon, 27 Jan 2014, Oleksandr Tyshchenko wrote:
> This patch is needed to avoid possible deadlocks when cross-interrupts
> occur simultaneously.
[...]
> -    spin_lock(&call_lock);
> +    if (!spin_trylock(&call_lock)) {
> +        if (smp_call_function_interrupt())
> +            return;
> +        spin_lock(&call_lock);
> +    }
> 
>     cpumask_copy(&call_data.selected, selected);

So this is where you check for the return value of
smp_call_function_interrupt.

I think it would be better to move the on_selected_cpus call out of
maintenance_interrupt, after the write to EOIR (caused by
desc->handler->end(desc) at the end of xen/arch/arm/irq.c:do_IRQ).
Maybe to a tasklet.

Alternatively, as Ian suggested, we could increase the priority of SGIs,
but I am a bit wary of making that change at RC2.

* Re: [PATCH v1 2/2] xen/arm: Fix deadlock in on_selected_cpus function
From: Ian Campbell @ 2014-01-28 10:03 UTC
To: Stefano Stabellini; +Cc: Oleksandr Tyshchenko, xen-devel

On Mon, 2014-01-27 at 19:00 +0000, Stefano Stabellini wrote:
> Alternatively, as Ian suggested, we could increase the priority of SGIs,
> but I am a bit wary of making that change at RC2.

I'm leaning the other way -- I'm wary of open coding magic locking
primitives to work around this issue on a case by case basis. It's just
too subtle IMHO.

The IPI and cross CPU calling primitives are basically predicated on
those IPIs interrupting normal interrupt handlers.

Ian.

* Re: [PATCH v1 2/2] xen/arm: Fix deadlock in on_selected_cpus function
From: Stefano Stabellini @ 2014-01-28 14:00 UTC
To: Ian Campbell; +Cc: Oleksandr Tyshchenko, xen-devel

On Tue, 28 Jan 2014, Ian Campbell wrote:
> I'm leaning the other way -- I'm wary of open coding magic locking
> primitives to work around this issue on a case by case basis. It's just
> too subtle IMHO.
>
> The IPI and cross CPU calling primitives are basically predicated on
> those IPIs interrupting normal interrupt handlers.

The problem is that we don't know if we can properly context switch
nested interrupts. Also I would need to think harder about whether
everything would work correctly, without hitches, with multiple SGIs
happening simultaneously (with more than 2 cpus involved).

On the other hand we know that both Oleksandr's and my solution should
work OK with no surprises if implemented correctly.

* Re: [PATCH v1 2/2] xen/arm: Fix deadlock in on_selected_cpus function
From: Ian Campbell @ 2014-01-28 15:05 UTC
To: Stefano Stabellini; +Cc: Oleksandr Tyshchenko, xen-devel

On Tue, 2014-01-28 at 14:00 +0000, Stefano Stabellini wrote:
> The problem is that we don't know if we can properly context switch
> nested interrupts.

What do you mean? We don't have to context switch an IPI.

> Also I would need to think harder about whether everything
> would work correctly, without hitches, with multiple SGIs happening
> simultaneously (with more than 2 cpus involved).

Since all IPIs would be at the same higher priority, only one will be
active on each CPU at a time. If you are worried about multiple CPUs
then that is already an issue today, just at a lower priority.

I have hacked the IPI priority to be higher in the past and it worked
fine, I just never got round to cleaning it up for submission (I hadn't
thought of the locking thing and my use case was low priority).

The interrupt entry and exit paths were written with nested interrupts
in mind, and they have to be in order to handle interrupts which occur
from both guest and hypervisor context.

> On the other hand we know that both Oleksandr's and my solution should
> work OK with no surprises if implemented correctly.

That's a big if in my mind; any use of trylock is very subtle IMHO.

AIUI this issue only occurs with "proto device assignment" patches added
to 4.4, in which case I think the solution can wait until 4.5 and can be
done properly via the IPI priority fix.

Ian.

* Re: [PATCH v1 2/2] xen/arm: Fix deadlock in on_selected_cpus function
From: Stefano Stabellini @ 2014-01-28 16:02 UTC
To: Ian Campbell; +Cc: Oleksandr Tyshchenko, xen-devel

On Tue, 28 Jan 2014, Ian Campbell wrote:
> What do you mean? We don't have to context switch an IPI.

Sorry, I meant save/restore registers, stack pointer, processor mode,
etc, for nested interrupts.

> Since all IPIs would be at the same higher priority, only one will be
> active on each CPU at a time. If you are worried about multiple CPUs
> then that is already an issue today, just at a lower priority.

That is correct, but if we moved the on_selected_cpus call out of the
interrupt handler I don't think the problem could occur.

> AIUI this issue only occurs with "proto device assignment" patches added
> to 4.4, in which case I think the solution can wait until 4.5 and can be
> done properly via the IPI priority fix.

I think this is a pretty significant problem; even if we don't commit a
fix, we should post a proper patch that we deem acceptable and link it to
the wiki so that anybody that needs it can find it and be sure that it
works correctly.

In my opinion if we go to this length then we might as well commit it
(if it doesn't touch common code of course), but I'll leave the decision
up to you and George.

Given the constraints, the solution I would feel more comfortable with at
this time is something like the following patch (lightly tested):

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index e6257a7..b00ca73 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -58,6 +58,25 @@ static DEFINE_PER_CPU(uint64_t, lr_mask);
 
 static unsigned nr_lrs;
 
+static void gic_irq_eoi(void *info)
+{
+    int virq = (uintptr_t) info;
+    GICC[GICC_DIR] = virq;
+}
+
+static DEFINE_PER_CPU(struct pending_irq*, eoi_irq);
+static void eoi_action(unsigned long unused)
+{
+    struct pending_irq *p = this_cpu(eoi_irq);
+    ASSERT(p != NULL);
+
+    on_selected_cpus(cpumask_of(p->desc->arch.eoi_cpu),
+                     gic_irq_eoi, (void*)(uintptr_t)p->desc->irq, 0);
+
+    this_cpu(eoi_irq) = NULL;
+}
+static DECLARE_TASKLET(eoi_tasklet, eoi_action, 0);
+
 /* The GIC mapping of CPU interfaces does not necessarily match the
  * logical CPU numbering. Let's use mapping as returned by the GIC
  * itself
@@ -897,12 +916,6 @@ int gicv_setup(struct domain *d)
 
 }
 
-static void gic_irq_eoi(void *info)
-{
-    int virq = (uintptr_t) info;
-    GICC[GICC_DIR] = virq;
-}
-
 static void maintenance_interrupt(int irq, void *dev_id, struct cpu_user_regs *regs)
 {
     int i = 0, virq, pirq = -1;
@@ -962,8 +975,11 @@ static void maintenance_interrupt(int irq, void *dev_id, struct cpu_user_regs *r
         if ( cpu == smp_processor_id() )
             gic_irq_eoi((void*)(uintptr_t)pirq);
         else
-            on_selected_cpus(cpumask_of(cpu),
-                             gic_irq_eoi, (void*)(uintptr_t)pirq, 0);
+        {
+            ASSERT(this_cpu(eoi_irq) == NULL);
+            this_cpu(eoi_irq) = p;
+            tasklet_schedule(&eoi_tasklet);
+        }
 
         i++;

* Re: [PATCH v1 2/2] xen/arm: Fix deadlock in on_selected_cpus function
From: Ian Campbell @ 2014-01-28 16:12 UTC
To: Stefano Stabellini; +Cc: Oleksandr Tyshchenko, xen-devel

On Tue, 2014-01-28 at 16:02 +0000, Stefano Stabellini wrote:
> Sorry, I meant save/restore registers, stack pointer, processor mode,
> etc, for nested interrupts.

Right, we do handle that actually, since it is the same code as handles
an IRQ during a hypercall (since a hypercall is just another type of
trap).

> That is correct, but if we moved the on_selected_cpus call out of the
> interrupt handler I don't think the problem could occur.

Sure, I'm not opposed to fixing this issue by making an architectural
improvement to the code which happens to not use on_selected_cpus.

> I think this is a pretty significant problem; even if we don't commit a
> fix, we should post a proper patch that we deem acceptable and link it to
> the wiki so that anybody that needs it can find it and be sure that it
> works correctly.

FWIW I have an almost-fix to the IPI priority thing which I would intend
to fix in 4.5 regardless of whether this issue needs it or not.

> Given the constraints, the solution I would feel more comfortable with at
> this time is something like the following patch (lightly tested):

I don't think this buys you anything over just fixing on_selected_cpus
to work as it should, unless there is some reason why deferring this
work to a tasklet is the logically correct thing to do and/or a better
design?

(IOW if you are only doing this to avoid calling on_selected_cpus in
interrupt context then I think this is the wrong fix.)

Ian.

[...]

* Re: [PATCH v1 2/2] xen/arm: Fix deadlock in on_selected_cpus function
From: Stefano Stabellini @ 2014-01-28 16:23 UTC
To: Ian Campbell; +Cc: Oleksandr Tyshchenko, xen-devel

On Tue, 28 Jan 2014, Ian Campbell wrote:
> Right, we do handle that actually, since it is the same code as handles
> an IRQ during a hypercall (since a hypercall is just another type of
> trap).

Ah right, good.

> FWIW I have an almost-fix to the IPI priority thing which I would intend
> to fix in 4.5 regardless of whether this issue needs it or not.

Well, in that case, if you can test the patch in the specific case where
an SGI interrupts a lower priority interrupt, it might be worth sending
out the patch now.

> I don't think this buys you anything over just fixing on_selected_cpus
> to work as it should, unless there is some reason why deferring this
> work to a tasklet is the logically correct thing to do and/or a better
> design?
>
> (IOW if you are only doing this to avoid calling on_selected_cpus in
> interrupt context then I think this is the wrong fix.)

I feel that on_selected_cpus is the kind of work that belongs to the
"bottom half". I was never very happy to have the call where it is now
in the first place.

This approach would have the benefit of allowing us to receive regular
interrupts while waiting for other cpus to handle the SGI. Also I am
sure that it would work even in cases where you have more than one SGI
targeting the same cpu simultaneously.

[...]

* Re: [PATCH v1 2/2] xen/arm: Fix deadlock in on_selected_cpus function
From: Stefano Stabellini @ 2014-01-28 13:58 UTC
To: Oleksandr Tyshchenko; +Cc: xen-devel

On Mon, 27 Jan 2014, Oleksandr Tyshchenko wrote:
> This patch is needed to avoid possible deadlocks when cross-interrupts
> occur simultaneously.
[...]
> -    spin_lock(&call_lock);
> +    if (!spin_trylock(&call_lock)) {
> +        if (smp_call_function_interrupt())
> +            return;
> +        spin_lock(&call_lock);
> +    }

If smp_call_function_interrupt returns -EPERM, shouldn't we go back to
spinning on call_lock?

Also there is a race condition between spin_lock, cpumask_copy and
smp_call_function_interrupt: smp_call_function_interrupt could be called
on cpu1 after cpu0 acquired the lock, but before cpu0 set
call_data.selected.

I think the correct implementation would be:

    while ( unlikely(!spin_trylock(&call_lock)) )
        smp_call_function_interrupt();
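Spelled out, the suggested entry path for on_selected_cpus would look roughly
like this (a sketch assuming patch 1/2 is applied, so that
smp_call_function_interrupt returns -EPERM when the current CPU is not in
call_data.selected):

    void on_selected_cpus(
        const cpumask_t *selected,
        void (*func) (void *info),
        void *info,
        int wait)
    {
        ASSERT(local_irq_is_enabled());

        /* Keep servicing pending cross-CPU calls until we own the lock.
         * If this CPU is not (or not yet) in call_data.selected, the
         * handler returns -EPERM and we simply try the lock again, so we
         * never return without having performed our own cross-call. */
        while ( unlikely(!spin_trylock(&call_lock)) )
            smp_call_function_interrupt();

        cpumask_copy(&call_data.selected, selected);

        /* ... rest of the function unchanged ... */
    }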

* Re: [PATCH v1 2/2] xen/arm: Fix deadlock in on_selected_cpus function
From: Oleksandr Tyshchenko @ 2014-01-30 11:58 UTC
To: Stefano Stabellini; +Cc: xen-devel

On Tue, Jan 28, 2014 at 3:58 PM, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
> If smp_call_function_interrupt returns -EPERM, shouldn't we go back to
> spinning on call_lock?
>
> Also there is a race condition between spin_lock, cpumask_copy and
> smp_call_function_interrupt: smp_call_function_interrupt could be called
> on cpu1 after cpu0 acquired the lock, but before cpu0 set
> call_data.selected.
>
> I think the correct implementation would be:
>
>     while ( unlikely(!spin_trylock(&call_lock)) )
>         smp_call_function_interrupt();

I completely agree.

-- 
Oleksandr Tyshchenko | Embedded Developer
GlobalLogic

* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
From: Ian Campbell @ 2014-01-27 17:40 UTC
To: Oleksandr Tyshchenko; +Cc: xen-devel

On Mon, 2014-01-27 at 19:33 +0200, Oleksandr Tyshchenko wrote:
> But I am afraid this solution is incomplete and maybe not enough for
> stable operation. I would appreciate it if you could explain how to
> solve the issue in the right way, or give some advice.

I'll look into this properly tomorrow, but my initial reaction is that
this should be fixed by making those SGIs which are used as IPIs have
higher priority in the GIC than normal interrupts.

Ian.
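For reference, "higher priority in the GIC" amounts to programming a
numerically lower priority value for SGIs 0-15 in the distributor's
GICD_IPRIORITYR registers. A minimal sketch, assuming GICv2 and the GICD[]
register-array accessor style used in xen/arch/arm/gic.c; the constant names
and values here are illustrative, not Xen's:

    /* Illustrative sketch only: make SGIs (interrupts 0-15) more urgent
     * than ordinary IRQs.  In GICv2, GICD_IPRIORITYR holds one 8-bit
     * priority field per interrupt (four per 32-bit word), and lower
     * values mean higher priority. */
    #define SGI_PRI   0x80U   /* assumed value for IPIs (more urgent) */
    #define IRQ_PRI   0xa0U   /* assumed value for normal IRQs        */

    static void gic_raise_sgi_priority(void)
    {
        unsigned int i;

        /* SGIs 0-15 occupy the first four GICD_IPRIORITYR words. */
        for ( i = 0; i < 16 / 4; i++ )
            GICD[GICD_IPRIORITYR + i] = SGI_PRI | (SGI_PRI << 8) |
                                        (SGI_PRI << 16) | (SGI_PRI << 24);
    }

With something like this in place an SGI can preempt a handler running at
IRQ_PRI, which is exactly the property on_selected_cpus relies on when it
busy-waits with the lock held.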

* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
From: Julien Grall @ 2014-01-27 17:51 UTC
To: Oleksandr Tyshchenko; +Cc: Stefano Stabellini, Ian Campbell, xen-devel

On 01/27/2014 05:33 PM, Oleksandr Tyshchenko wrote:
> Hi all,

Hello Oleksandr,

> We are trying to bring up Xen on the DRA7XX (OMAP5) platform.
>
> We sometimes see hangs in the hypervisor, and these hangs are related to SMP.
> We found out that a deadlock takes place in the on_selected_cpus function
> when cross-interrupts occur simultaneously.
>
> The issue:
>
> 1. We receive IRQs on the first CPU (say CPU0) and on the second CPU (say CPU1) in parallel.
> 2. In our case the maintenance_interrupt handler for the maintenance IRQ from CPU0 is executed on CPU1, and
>    the handler for the IRQ from CPU1 is executed on CPU0.
> 3. According to the existing logic, gic_irq_eoi has to run on the CPU on which the IRQ was scheduled.

gic_irq_eoi is only called for physical IRQs routed to the guest (eg:
hard drive, network, ...). As far as I remember, these IRQs are only
routed to CPU0.

Do you pass-through PPIs to dom0?

-- 
Julien Grall

* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
From: Oleksandr Tyshchenko @ 2014-01-28 19:25 UTC
To: Julien Grall; +Cc: Stefano Stabellini, Ian Campbell, xen-devel

Hello Julien,

Please see inline.

> gic_irq_eoi is only called for physical IRQs routed to the guest (eg:
> hard drive, network, ...). As far as I remember, these IRQs are only
> routed to CPU0.

I understand.

But I have created a debug patch to show the issue:

diff --git a/xen/common/smp.c b/xen/common/smp.c
index 46d2fc6..6123561 100644
--- a/xen/common/smp.c
+++ b/xen/common/smp.c
@@ -22,6 +22,8 @@
 #include <xen/smp.h>
 #include <xen/errno.h>
 
+int locked = 0;
+
 /*
  * Structure and data for smp_call_function()/on_selected_cpus().
  */
@@ -53,11 +55,19 @@ void on_selected_cpus(
 {
     unsigned int nr_cpus;
 
+    locked = 0;
+
     ASSERT(local_irq_is_enabled());
 
     if (!spin_trylock(&call_lock)) {
+
+        locked = 1;
+        printk("\n>>>>> %s: line: %d, cpu_mask_curr: %08lx, cpu_mask_sel: %08lx\n", __func__, __LINE__,
+               cpumask_of(smp_processor_id())->bits[0], selected->bits[0]);
+
         if (smp_call_function_interrupt())
             return;
+
         spin_lock(&call_lock);
     }
 
@@ -78,6 +88,10 @@ void on_selected_cpus(
 
 out:
     spin_unlock(&call_lock);
+
+    if (locked)
+        printk("\n>>>>> %s: line: %d, cpu_mask_curr: %08lx, cpu_mask_sel: %08lx\n", __func__, __LINE__,
+               cpumask_of(smp_processor_id())->bits[0], selected->bits[0]);
 }
 
 int smp_call_function_interrupt(void)
@@ -86,6 +100,10 @@ int smp_call_function_interrupt(void)
     void *info = call_data.info;
     unsigned int cpu = smp_processor_id();
 
+    if (locked)
+        printk("\n>>>>> %s: line: %d, cpu_mask_curr: %08lx, cpu_mask_sel: %08lx\n", __func__, __LINE__,
+               cpumask_of(smp_processor_id())->bits[0], call_data.selected.bits[0]);
+
     if ( !cpumask_test_cpu(cpu, &call_data.selected) )
         return -EPERM;

Our issue (simultaneous cross-interrupts) occurred while booting domU:

[    7.507812] oom_adj 2 => oom_score_adj 117
[    7.507812] oom_adj 4 => oom_score_adj 235
[    7.507812] oom_adj 9 => oom_score_adj 529
[    7.507812] oom_adj 15 => oom_score_adj 1000
[    8.835937] PVR_K:(Error): PVRSRVOpenDCDeviceKM: no devnode matching index 0 [0, ]
(XEN) >>>>> on_selected_cpus: line: 65, cpu_mask_curr: 00000002, cpu_mask_sel: 00000001
(XEN) >>>>> smp_call_function_interrupt: line: 104, cpu_mask_curr: 00000002, cpu_mask_sel: 00000002
(XEN) >>>>> on_selected_cpus: line: 93, cpu_mask_curr: 00000001, cpu_mask_sel: 00000002
(XEN) >>>>> smp_call_function_interrupt: line: 104, cpu_mask_curr: 00000001, cpu_mask_sel: 00000001
(XEN) >>>>> on_selected_cpus: line: 93, cpu_mask_curr: 00000002, cpu_mask_sel: 00000001
(XEN) >>>>> smp_call_function_interrupt: line: 104, cpu_mask_curr: 00000002, cpu_mask_sel: 00000000
[   11.023437] usbcore: registered new interface driver usbfs
[   11.023437] usbcore: registered new interface driver hub
[   11.023437] usbcore: registered new device driver usb
[   11.039062] usbcore: registered new interface driver usbhid
[   11.039062] usbhid: USB HID core driver

> Do you pass-through PPIs to dom0?

If I understand correctly, PPIs are IRQs 16 to 31.
So yes, I do. I see the timer IRQs and the maintenance IRQ routed to both
CPUs.

And I have printed all IRQs that reach the gic_route_irq_to_guest and
gic_route_irq functions:
...
(XEN) GIC initialization:
(XEN)         gic_dist_addr=0000000048211000
(XEN)         gic_cpu_addr=0000000048212000
(XEN)         gic_hyp_addr=0000000048214000
(XEN)         gic_vcpu_addr=0000000048216000
(XEN)         gic_maintenance_irq=25
(XEN) GIC: 192 lines, 2 cpus, secure (IID 0000043b).
(XEN) >>>>> gic_route_irq: irq: 25, cpu_mask: 00000001
(XEN) >>>>> gic_route_irq: irq: 30, cpu_mask: 00000001
(XEN) >>>>> gic_route_irq: irq: 26, cpu_mask: 00000001
(XEN) >>>>> gic_route_irq: irq: 27, cpu_mask: 00000001
(XEN) >>>>> gic_route_irq: irq: 104, cpu_mask: 00000001
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Allocated console ring of 16 KiB.
(XEN) VFP implementer 0x41 architecture 4 part 0x30 variant 0xf rev 0x0
(XEN) Bringing up CPU1
(XEN) >>>>> gic_route_irq: irq: 25, cpu_mask: 00000002
(XEN) >>>>> gic_route_irq: irq: 30, cpu_mask: 00000002
(XEN) >>>>> gic_route_irq: irq: 26, cpu_mask: 00000002
(XEN) >>>>> gic_route_irq: irq: 27, cpu_mask: 00000002
(XEN) CPU 1 booted.
(XEN) Brought up 2 CPUs
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Populate P2M 0xc8000000->0xd0000000 (1:1 mapping for dom0)
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 61, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 62, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 63, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 64, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 66, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 67, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 153, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 105, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 106, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 102, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 137, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 138, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 113, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 69, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 70, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 71, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 72, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 73, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 74, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 75, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 76, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 77, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 78, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 79, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 112, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 145, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 158, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 86, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 82, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 83, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 84, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 85, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 187, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 186, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 188, cpu: 0
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 189, cpu: 0
(XEN) Loading kernel from boot module 2
(XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 57, cpu: 0
(XEN) Loading zImage from 00000000c0000040 to 00000000c8008000-00000000c8304eb0
(XEN) Loading dom0 DTB to 0x00000000cfe00000-0x00000000cfe03978
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 252kB init memory.
[    0.000000] /cpus/cpu@0 missing clock-frequency property
[    0.000000] /cpus/cpu@1 missing clock-frequency property
[    0.093750] omap_l3_noc ocp.2: couldn't find resource 2
[    0.265625] ahci ahci.0.auto: can't get clock
[    0.867187] Freeing init memory: 224K
Parsing config from /xen/images/DomUAndroid.cfg
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 105, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 61, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 62, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 63, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 64, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 65, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 66, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 67, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 153, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 69, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 70, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 71, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 72, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 73, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 74, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 75, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 76, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 77, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 78, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 79, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 102, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 137, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 138, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 88, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 89, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 93, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 94, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 92, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 152, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 97, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 98, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 123, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 80, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 115, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 118, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 126, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 128, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 91, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 41, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 42, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 48, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 131, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 44, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 45, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 46, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 47, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 40, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 158, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 146, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 60, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 85, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 87, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 133, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 142, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 143, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 53, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 164, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 51, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 134, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 50, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 108, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 109, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 124, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 125, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 110, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 112, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 68, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 101, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 99, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 100, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 103, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 132, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 56, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 135, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 136, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 139, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 58, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 140, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 141, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 49, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 54, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 55, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 144, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 32, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 33, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 34, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 35, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 36, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 39, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 43, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 52, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 59, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 120, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 90, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 107, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 119, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 121, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 122, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 129, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 130, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 151, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 154, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 155, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 156, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 160, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 162, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 163, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 157, cpu: 1
(XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 173, cpu: 1
Daemon running with PID 569
...

> --
> Julien Grall

* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
From: Oleksandr Tyshchenko @ 2014-01-29 10:56 UTC
To: Julien Grall; +Cc: Stefano Stabellini, Ian Campbell, xen-devel

Hello all,

I just recalled one hack which we created because we needed to route
hardware IRQs to domU:

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 9d793ba..d0227b9 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -989,8 +989,6 @@ static void domcreate_launch_dm(libxl__egc *egc, libxl__multidev *multidev,
 
         LOG(DEBUG, "dom%d irq %d", domid, irq);
 
-        ret = irq >= 0 ? xc_physdev_map_pirq(CTX->xch, domid, irq, &irq)
-                       : -EOVERFLOW;
         if (!ret)
             ret = xc_domain_irq_permission(CTX->xch, domid, irq, 1);
         if (ret < 0) {
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 2e4b11f..b54c08e 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -85,7 +85,7 @@ int domain_vgic_init(struct domain *d)
     if ( d->domain_id == 0 )
         d->arch.vgic.nr_lines = gic_number_lines() - 32;
     else
-        d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */
+        d->arch.vgic.nr_lines = gic_number_lines() - 32; /* We do need SPIs for the guest */
 
     d->arch.vgic.shared_irqs =
         xzalloc_array(struct vgic_irq_rank, DOMAIN_NR_RANKS(d));
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 75e2df3..ba88901 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -29,6 +29,7 @@
 #include <asm/page.h>
 #include <public/domctl.h>
 #include <xsm/xsm.h>
+#include <asm/gic.h>
 
 static DEFINE_SPINLOCK(domctl_lock);
 DEFINE_SPINLOCK(vcpu_alloc_lock);
@@ -782,8 +783,11 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
             ret = -EINVAL;
         else if ( xsm_irq_permission(XSM_HOOK, d, pirq, allow) )
             ret = -EPERM;
-        else if ( allow )
-            ret = pirq_permit_access(d, pirq);
+        else if ( allow ) {
+            struct dt_irq irq = {pirq + NR_LOCAL_IRQS, 0};
+            ret = pirq_permit_access(d, irq.irq);
+            gic_route_irq_to_guest(d, &irq, "");
+        }
         else
             ret = pirq_deny_access(d, pirq);
     }

It seems this patch can violate the logic of routing physical IRQs only
to CPU0. In gic_route_irq_to_guest() we need to call
gic_set_irq_properties(), where one of the parameters is
cpumask_of(smp_processor_id()). But in this code path the function can
be executed on CPU1, and as a result the wrong value can be set in the
target CPU mask.

Please confirm my assumption. If I am right, we have to add basic
hardware IRQ routing to domU in the right way.
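For reference, the call being referred to looks roughly like this (a
paraphrased fragment of gic_route_irq_to_guest() in xen/arch/arm/gic.c;
the level argument and the 0xa0 priority value are assumptions from the
surrounding code, not verified exactly):

    /* Inside gic_route_irq_to_guest(): the target CPU mask is derived
     * from whichever physical CPU happens to execute the routing call. */
    gic_set_irq_properties(irq->irq, level,
                           cpumask_of(smp_processor_id()), /* affinity */
                           0xa0);                          /* priority */
    /* When the domctl for domU creation runs on CPU1 (as in the log
     * above), the IRQ is therefore routed to CPU1, breaking the
     * "physical IRQs only on CPU0" assumption. */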
On Tue, Jan 28, 2014 at 9:25 PM, Oleksandr Tyshchenko
<oleksandr.tyshchenko@globallogic.com> wrote:
[...]
46, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 47, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 40, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 158, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 146, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 60, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 85, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 87, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 133, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 142, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 143, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 53, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 164, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 51, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 134, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 50, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 108, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 109, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 124, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 125, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 110, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 112, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 68, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 101, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 99, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 100, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 103, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 132, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 56, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 135, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 136, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 139, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 58, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 140, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 141, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 49, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 54, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 55, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 144, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 32, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 33, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 34, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 35, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 36, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 39, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 43, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 52, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 59, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 120, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 
90, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 107, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 119, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 121, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 122, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 129, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 130, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 151, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 154, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 155, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 156, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 160, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 162, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 163, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 157, cpu: 1 > (XEN) > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 173, cpu: 1 > Daemon running with PID 569 > ... >> >> >> -- >> Julien Grall > > > > > -- > > Name | Title > GlobalLogic > P +x.xxx.xxx.xxxx M +x.xxx.xxx.xxxx S skype > www.globallogic.com > > http://www.globallogic.com/email_disclaimer.txt -- Name | Title GlobalLogic P +x.xxx.xxx.xxxx M +x.xxx.xxx.xxxx S skype www.globallogic.com http://www.globallogic.com/email_disclaimer.txt ^ permalink raw reply related [flat|nested] 48+ messages in thread
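The concern raised in this mail can be made concrete with a small standalone program.
Everything in it -- the types, the mock target-setting helper, the 192-entry table --
is a simplified stand-in for the real Xen/GICv2 code, not an excerpt of it, but it
shows how the programmed target silently follows whichever CPU happens to execute the
routing call:

/* Standalone sketch, not Xen code: all types and helpers are mock-ups. */
#include <stdio.h>

typedef unsigned long cpumask_t;            /* one bit per CPU (mock) */
#define cpumask_of(cpu) ((cpumask_t)1 << (cpu))

static unsigned int current_cpu = 1;        /* CPU running the domctl */
#define smp_processor_id() (current_cpu)

static unsigned char itargets[192];         /* "GIC: 192 lines" in the log above */

/* Mock of the part of gic_set_irq_properties() that picks the target. */
static void set_irq_target(int irq, cpumask_t mask)
{
    itargets[irq] = (unsigned char)mask;    /* GICv2 has at most 8 targets */
}

int main(void)
{
    /* Current behaviour: target follows whichever CPU runs the hypercall. */
    set_irq_target(61, cpumask_of(smp_processor_id()));
    printf("irq 61 -> target %#x (calling CPU)\n", itargets[61]);

    /* Behaviour discussed below: always CPU0, matching the EOI path. */
    set_irq_target(61, cpumask_of(0));
    printf("irq 61 -> target %#x (hardcoded CPU0)\n", itargets[61]);
    return 0;
}

With the domctl hack above, nothing guarantees the routing call runs on CPU0, which
is exactly the mismatch Oleksandr describes.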
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
  2014-01-29 10:56 ` Oleksandr Tyshchenko
@ 2014-01-29 11:42 ` Stefano Stabellini
  2014-01-29 11:46 ` Stefano Stabellini
  2014-02-04 16:20 ` [PATCH] xen/arm: route irqs to cpu0 Stefano Stabellini
  2014-01-29 13:07 ` [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix Julien Grall
  1 sibling, 2 replies; 48+ messages in thread
From: Stefano Stabellini @ 2014-01-29 11:42 UTC (permalink / raw)
To: Oleksandr Tyshchenko
Cc: Julien Grall, xen-devel, Ian Campbell, Stefano Stabellini

On Wed, 29 Jan 2014, Oleksandr Tyshchenko wrote:
> Hello all,
>
> I just recollected about one hack which we created
> as we needed to route HW IRQ in domU.
>
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 9d793ba..d0227b9 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -989,8 +989,6 @@ static void domcreate_launch_dm(libxl__egc *egc,
> libxl__multidev *multidev,
>
>     LOG(DEBUG, "dom%d irq %d", domid, irq);
>
> -    ret = irq >= 0 ? xc_physdev_map_pirq(CTX->xch, domid, irq, &irq)
> -                   : -EOVERFLOW;
>     if (!ret)
>         ret = xc_domain_irq_permission(CTX->xch, domid, irq, 1);
>     if (ret < 0) {
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 2e4b11f..b54c08e 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -85,7 +85,7 @@ int domain_vgic_init(struct domain *d)
>     if ( d->domain_id == 0 )
>         d->arch.vgic.nr_lines = gic_number_lines() - 32;
>     else
> -        d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */
> +        d->arch.vgic.nr_lines = gic_number_lines() - 32; /* We do
> need SPIs for the guest */
>
>     d->arch.vgic.shared_irqs =
>         xzalloc_array(struct vgic_irq_rank, DOMAIN_NR_RANKS(d));
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index 75e2df3..ba88901 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -29,6 +29,7 @@
>  #include <asm/page.h>
>  #include <public/domctl.h>
>  #include <xsm/xsm.h>
> +#include <asm/gic.h>
>
>  static DEFINE_SPINLOCK(domctl_lock);
>  DEFINE_SPINLOCK(vcpu_alloc_lock);
> @@ -782,8 +783,11 @@ long
> do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>             ret = -EINVAL;
>         else if ( xsm_irq_permission(XSM_HOOK, d, pirq, allow) )
>             ret = -EPERM;
> -        else if ( allow )
> -            ret = pirq_permit_access(d, pirq);
> +        else if ( allow ) {
> +            struct dt_irq irq = {pirq + NR_LOCAL_IRQS,0};
> +            ret = pirq_permit_access(d, irq.irq);
> +            gic_route_irq_to_guest(d, &irq, "");
> +        }
>         else
>             ret = pirq_deny_access(d, pirq);
>     }
>
> It seems, the following patch can violate the logic about routing
> physical IRQs only to CPU0.
> In gic_route_irq_to_guest() we need to call gic_set_irq_properties()
> where the one of the parameters is cpumask_of(smp_processor_id()).
> But in this part of code this function can be executed on CPU1. And as
> result this can cause to the fact that the wrong value would set to
> target CPU mask.
>
> Please, confirm my assumption.

That is correct.

> If I am right we have to add a basic HW IRQ routing to DomU in a right way.

We could add the cpumask parameter to gic_route_irq_to_guest. Or maybe
for now we could just hardcode the cpumask of cpu0 in
gic_route_irq_to_guest.

However keep in mind that if you plan on routing SPIs to guests other
than dom0, receiving all the interrupts on cpu0 might not be great for
performance.

It is impressive how small this patch is, if this is all that is needed
to get irq routing to guests working.
> On Tue, Jan 28, 2014 at 9:25 PM, Oleksandr Tyshchenko > <oleksandr.tyshchenko@globallogic.com> wrote: > > Hello Julien, > > > > Please see inline > > > >> gic_irq_eoi is only called for physical IRQ routed to the guest (eg: > >> hard drive, network, ...). As far as I remember, these IRQs are only > >> routed to CPU0. > > > > > > I understand. > > > > But, I have created debug patch to show the issue: > > > > diff --git a/xen/common/smp.c b/xen/common/smp.c > > index 46d2fc6..6123561 100644 > > --- a/xen/common/smp.c > > +++ b/xen/common/smp.c > > @@ -22,6 +22,8 @@ > > #include <xen/smp.h> > > #include <xen/errno.h> > > > > +int locked = 0; > > + > > /* > > * Structure and data for smp_call_function()/on_selected_cpus(). > > */ > > @@ -53,11 +55,19 @@ void on_selected_cpus( > > { > > unsigned int nr_cpus; > > > > + locked = 0; > > + > > ASSERT(local_irq_is_enabled()); > > > > if (!spin_trylock(&call_lock)) { > > + > > + locked = 1; > > + printk("\n>>>>> %s: line: %d, cpu_mask_curr: %08lx, cpu_mask_sel: > > %08lx\n", __func__, __LINE__, > > + cpumask_of(smp_processor_id())->bits[0], > > selected->bits[0]); > > + > > if (smp_call_function_interrupt()) > > return; > > + > > spin_lock(&call_lock); > > } > > > > @@ -78,6 +88,10 @@ void on_selected_cpus( > > > > out: > > spin_unlock(&call_lock); > > + > > + if (locked) > > + printk("\n>>>>> %s: line: %d, cpu_mask_curr: %08lx, cpu_mask_sel: > > %08lx\n", __func__, __LINE__, > > + cpumask_of(smp_processor_id())->bits[0], selected->bits[0]); > > } > > > > int smp_call_function_interrupt(void) > > @@ -86,6 +100,10 @@ int smp_call_function_interrupt(void) > > void *info = call_data.info; > > unsigned int cpu = smp_processor_id(); > > > > + if (locked) > > + printk("\n>>>>> %s: line: %d, cpu_mask_curr: %08lx, cpu_mask_sel: > > %08lx\n", __func__, __LINE__, > > + cpumask_of(smp_processor_id())->bits[0], > > call_data.selected.bits[0]); > > + > > if ( !cpumask_test_cpu(cpu, &call_data.selected) ) > > return -EPERM; > > > > Our issue (simultaneous cross-interrupts) has occurred during boot domU: > > > > [ 7.507812] oom_adj 2 => oom_score_adj 117 > > [ 7.507812] oom_adj 4 => oom_score_adj 235 > > [ 7.507812] oom_adj 9 => oom_score_adj 529 > > [ 7.507812] oom_adj 15 => oom_score_adj 1000 > > [ 8.835937] PVR_K:(Error): PVRSRVOpenDCDeviceKM: no devnode matching > > index 0 [0, ] > > (XEN) > > (XEN) >>>>> on_selected_cpus: line: 65, cpu_mask_curr: 00000002, > > cpu_mask_sel: 00000001 > > (XEN) > > (XEN) >>>>> smp_call_function_interrupt: line: 104, cpu_mask_curr: 00000002, > > cpu_mask_sel: 00000002 > > (XEN) > > (XEN) >>>>> on_selected_cpus: line: 93, cpu_mask_curr: 00000001, > > cpu_mask_sel: 00000002 > > (XEN) > > (XEN) >>>>> smp_call_function_interrupt: line: 104, cpu_mask_curr: 00000001, > > cpu_mask_sel: 00000001 > > (XEN) > > (XEN) >>>>> on_selected_cpus: line: 93, cpu_mask_curr: 00000002, > > cpu_mask_sel: 00000001 > > (XEN) > > (XEN) >>>>> smp_call_function_interrupt: line: 104, cpu_mask_curr: 00000002, > > cpu_mask_sel: 00000000 > > [ 11.023437] usbcore: registered new interface driver usbfs > > [ 11.023437] usbcore: registered new interface driver hub > > [ 11.023437] usbcore: registered new device driver usb > > [ 11.039062] usbcore: registered new interface driver usbhid > > [ 11.039062] usbhid: USB HID core driver > > > >> > >> Do you pass-through PPIs to dom0? > > > > > > If I understand correctly that PPIs is irqs from 16 to 31. > > So yes, I do. I see timer's irqs and maintenance irq which routed to both > > CPUs. 
> > > > And I have printed all irqs which fall to gic_route_irq_to_guest and > > gic_route_irq functions. > > ... > > (XEN) GIC initialization: > > (XEN) gic_dist_addr=0000000048211000 > > (XEN) gic_cpu_addr=0000000048212000 > > (XEN) gic_hyp_addr=0000000048214000 > > (XEN) gic_vcpu_addr=0000000048216000 > > (XEN) gic_maintenance_irq=25 > > (XEN) GIC: 192 lines, 2 cpus, secure (IID 0000043b). > > (XEN) > > (XEN) >>>>> gic_route_irq: irq: 25, cpu_mask: 00000001 > > (XEN) > > (XEN) >>>>> gic_route_irq: irq: 30, cpu_mask: 00000001 > > (XEN) > > (XEN) >>>>> gic_route_irq: irq: 26, cpu_mask: 00000001 > > (XEN) > > (XEN) >>>>> gic_route_irq: irq: 27, cpu_mask: 00000001 > > (XEN) > > (XEN) >>>>> gic_route_irq: irq: 104, cpu_mask: 00000001 > > (XEN) Using scheduler: SMP Credit Scheduler (credit) > > (XEN) Allocated console ring of 16 KiB. > > (XEN) VFP implementer 0x41 architecture 4 part 0x30 variant 0xf rev 0x0 > > (XEN) Bringing up CPU1 > > (XEN) > > (XEN) >>>>> gic_route_irq: irq: 25, cpu_mask: 00000002 > > (XEN) > > (XEN) >>>>> gic_route_irq: irq: 30, cpu_mask: 00000002 > > (XEN) > > (XEN) >>>>> gic_route_irq: irq: 26, cpu_mask: 00000002 > > (XEN) > > (XEN) >>>>> gic_route_irq: irq: 27, cpu_mask: 00000002 > > (XEN) CPU 1 booted. > > (XEN) Brought up 2 CPUs > > (XEN) *** LOADING DOMAIN 0 *** > > (XEN) Populate P2M 0xc8000000->0xd0000000 (1:1 mapping for dom0) > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 61, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 62, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 63, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 64, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 66, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 67, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 153, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 105, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 106, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 102, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 137, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 138, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 113, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 69, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 70, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 71, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 72, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 73, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 74, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 75, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 76, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 77, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 78, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 79, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 112, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 145, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 158, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 86, cpu: 0 > > (XEN) > > (XEN) >>>>> 
gic_route_irq_to_guest: domid: 0, irq: 82, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 83, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 84, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 85, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 187, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 186, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 188, cpu: 0 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 189, cpu: 0 > > (XEN) Loading kernel from boot module 2 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 57, cpu: 0 > > (XEN) Loading zImage from 00000000c0000040 to > > 00000000c8008000-00000000c8304eb0 > > (XEN) Loading dom0 DTB to 0x00000000cfe00000-0x00000000cfe03978 > > (XEN) Std. Loglevel: All > > (XEN) Guest Loglevel: All > > (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to > > Xen) > > (XEN) Freed 252kB init memory. > > [ 0.000000] /cpus/cpu@0 missing clock-frequency property > > [ 0.000000] /cpus/cpu@1 missing clock-frequency property > > [ 0.093750] omap_l3_noc ocp.2: couldn't find resource 2 > > [ 0.265625] ahci ahci.0.auto: can't get clock > > [ 0.867187] Freeing init memory: 224K > > Parsing config from /xen/images/DomUAndroid.cfg > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 105, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 61, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 62, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 63, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 64, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 65, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 66, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 67, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 153, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 69, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 70, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 71, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 72, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 73, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 74, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 75, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 76, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 77, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 78, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 79, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 102, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 137, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 138, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 88, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 89, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 93, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 94, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 92, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: 
domid: 1, irq: 152, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 97, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 98, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 123, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 80, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 115, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 118, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 126, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 128, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 91, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 41, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 42, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 48, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 131, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 44, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 45, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 46, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 47, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 40, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 158, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 146, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 60, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 85, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 87, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 133, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 142, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 143, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 53, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 164, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 51, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 134, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 50, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 108, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 109, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 124, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 125, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 110, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 112, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 68, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 101, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 99, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 100, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 103, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 132, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 56, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 135, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 136, cpu: 1 > > 
(XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 139, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 58, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 140, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 141, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 49, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 54, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 55, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 144, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 32, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 33, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 34, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 35, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 36, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 39, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 43, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 52, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 59, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 120, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 90, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 107, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 119, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 121, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 122, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 129, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 130, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 151, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 154, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 155, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 156, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 160, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 162, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 163, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 157, cpu: 1 > > (XEN) > > (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 173, cpu: 1 > > Daemon running with PID 569 > > ... > >> > >> > >> -- > >> Julien Grall > > > > > > > > > > -- > > > > Name | Title > > GlobalLogic > > P +x.xxx.xxx.xxxx M +x.xxx.xxx.xxxx S skype > > www.globallogic.com > > > > http://www.globallogic.com/email_disclaimer.txt > > > > -- > > Name | Title > GlobalLogic > P +x.xxx.xxx.xxxx M +x.xxx.xxx.xxxx S skype > www.globallogic.com > > http://www.globallogic.com/email_disclaimer.txt > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel > ^ permalink raw reply [flat|nested] 48+ messages in thread
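Stefano's first suggestion above -- making the target CPU an explicit parameter of
gic_route_irq_to_guest() instead of deriving it from smp_processor_id() -- might look
like the interface sketch below. The exact signature is hypothetical; nothing of this
form was posted in the thread:

/* Interface-only sketch of the "add a cpumask parameter" idea.
 * The struct names mirror Xen 4.4-era code; the new parameter is
 * an assumption, not a committed interface. */
struct domain;
struct dt_irq;
struct cpumask;

int gic_route_irq_to_guest(struct domain *d, const struct dt_irq *irq,
                           const char *devname,
                           const struct cpumask *target /* new */);

/* A caller such as the domain builder would then pass cpumask_of(0)
 * (or the mask of the domain's pinned VCPU) explicitly, so the
 * routing decision no longer depends on which pCPU runs the
 * hypercall. */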
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix 2014-01-29 11:42 ` Stefano Stabellini @ 2014-01-29 11:46 ` Stefano Stabellini 2014-01-29 13:15 ` Julien Grall 2014-02-04 15:32 ` Julien Grall 2014-02-04 16:20 ` [PATCH] xen/arm: route irqs to cpu0 Stefano Stabellini 1 sibling, 2 replies; 48+ messages in thread From: Stefano Stabellini @ 2014-01-29 11:46 UTC (permalink / raw) To: Stefano Stabellini Cc: Oleksandr Tyshchenko, Julien Grall, Ian Campbell, xen-devel On Wed, 29 Jan 2014, Stefano Stabellini wrote: > On Wed, 29 Jan 2014, Oleksandr Tyshchenko wrote: > > Hello all, > > > > I just recollected about one hack which we created > > as we needed to route HW IRQ in domU. > > > > diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c > > index 9d793ba..d0227b9 100644 > > --- a/tools/libxl/libxl_create.c > > +++ b/tools/libxl/libxl_create.c > > @@ -989,8 +989,6 @@ static void domcreate_launch_dm(libxl__egc *egc, > > libxl__multidev *multidev, > > > > LOG(DEBUG, "dom%d irq %d", domid, irq); > > > > - ret = irq >= 0 ? xc_physdev_map_pirq(CTX->xch, domid, irq, &irq) > > - : -EOVERFLOW; > > if (!ret) > > ret = xc_domain_irq_permission(CTX->xch, domid, irq, 1); > > if (ret < 0) { > > diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c > > index 2e4b11f..b54c08e 100644 > > --- a/xen/arch/arm/vgic.c > > +++ b/xen/arch/arm/vgic.c > > @@ -85,7 +85,7 @@ int domain_vgic_init(struct domain *d) > > if ( d->domain_id == 0 ) > > d->arch.vgic.nr_lines = gic_number_lines() - 32; > > else > > - d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */ > > + d->arch.vgic.nr_lines = gic_number_lines() - 32; /* We do > > need SPIs for the guest */ > > > > d->arch.vgic.shared_irqs = > > xzalloc_array(struct vgic_irq_rank, DOMAIN_NR_RANKS(d)); > > diff --git a/xen/common/domctl.c b/xen/common/domctl.c > > index 75e2df3..ba88901 100644 > > --- a/xen/common/domctl.c > > +++ b/xen/common/domctl.c > > @@ -29,6 +29,7 @@ > > #include <asm/page.h> > > #include <public/domctl.h> > > #include <xsm/xsm.h> > > +#include <asm/gic.h> > > > > static DEFINE_SPINLOCK(domctl_lock); > > DEFINE_SPINLOCK(vcpu_alloc_lock); > > @@ -782,8 +783,11 @@ long > > do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) > > ret = -EINVAL; > > else if ( xsm_irq_permission(XSM_HOOK, d, pirq, allow) ) > > ret = -EPERM; > > - else if ( allow ) > > - ret = pirq_permit_access(d, pirq); > > + else if ( allow ) { > > + struct dt_irq irq = {pirq + NR_LOCAL_IRQS,0}; > > + ret = pirq_permit_access(d, irq.irq); > > + gic_route_irq_to_guest(d, &irq, ""); > > + } > > else > > ret = pirq_deny_access(d, pirq); > > } > > (END) > > > > It seems, the following patch can violate the logic about routing > > physical IRQs only to CPU0. > > In gic_route_irq_to_guest() we need to call gic_set_irq_properties() > > where the one of the parameters is cpumask_of(smp_processor_id()). > > But in this part of code this function can be executed on CPU1. And as > > result this can cause to the fact that the wrong value would set to > > target CPU mask. > > > > Please, confirm my assumption. > > That is correct. > > > > If I am right we have to add a basic HW IRQ routing to DomU in a right way. > > We could add the cpumask parameter to gic_route_irq_to_guest. Or maybe > for now we could just hardcode the cpumask of cpu0 > gic_route_irq_to_guest. > > However keep in mind that if you plan on routing SPIs to guests other > than dom0, receiving all the interrupts on cpu0 might not be great for > performances. 
Thinking twice about it, it might be the only acceptable change for 4.4. diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c index e6257a7..af96a31 100644 --- a/xen/arch/arm/gic.c +++ b/xen/arch/arm/gic.c @@ -776,8 +795,7 @@ int gic_route_irq_to_guest(struct domain *d, const struct dt_irq *irq, level = dt_irq_is_level_triggered(irq); - gic_set_irq_properties(irq->irq, level, cpumask_of(smp_processor_id()), - 0xa0); + gic_set_irq_properties(irq->irq, level, cpumask_of(0), 0xa0); retval = __setup_irq(desc, irq->irq, action); if (retval) { ^ permalink raw reply related [flat|nested] 48+ messages in thread
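Why a one-line change is enough at the hardware level: on a GICv2 distributor each
SPI has a one-byte target field in GICD_ITARGETSR (offset 0x800, one bit per CPU
interface), which is presumably what gic_set_irq_properties() ultimately programs, so
cpumask_of(0) becomes target byte 0x01. A runnable toy version, with an in-memory
array standing in for the real MMIO region:

#include <stdint.h>
#include <stdio.h>

#define GICD_ITARGETSR  0x800   /* GICv2: one target byte per INTID */

/* Fake 4KiB "distributor" so the sketch actually runs; on hardware
 * this would be the MMIO region at gic_dist_addr (0x48211000 in the
 * boot log earlier in this thread). */
static uint8_t dist[0x1000];

static void set_spi_target(uint8_t *d, int irq, uint8_t cpumask)
{
    /* bit 0 = CPU interface 0, bit 1 = CPU interface 1, ... */
    d[GICD_ITARGETSR + irq] = cpumask;
}

int main(void)
{
    set_spi_target(dist, 61, 1u << 0);  /* cpumask_of(0) -> 0x01 */
    printf("irq 61 target byte: %#x\n", dist[GICD_ITARGETSR + 61]);
    return 0;
}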
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix 2014-01-29 11:46 ` Stefano Stabellini @ 2014-01-29 13:15 ` Julien Grall 2014-02-04 15:32 ` Julien Grall 1 sibling, 0 replies; 48+ messages in thread From: Julien Grall @ 2014-01-29 13:15 UTC (permalink / raw) To: Stefano Stabellini; +Cc: Oleksandr Tyshchenko, Ian Campbell, xen-devel On 29/01/14 11:46, Stefano Stabellini wrote: > On Wed, 29 Jan 2014, Stefano Stabellini wrote: >> On Wed, 29 Jan 2014, Oleksandr Tyshchenko wrote: >>> Hello all, >>> >>> I just recollected about one hack which we created >>> as we needed to route HW IRQ in domU. >>> >>> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c >>> index 9d793ba..d0227b9 100644 >>> --- a/tools/libxl/libxl_create.c >>> +++ b/tools/libxl/libxl_create.c >>> @@ -989,8 +989,6 @@ static void domcreate_launch_dm(libxl__egc *egc, >>> libxl__multidev *multidev, >>> >>> LOG(DEBUG, "dom%d irq %d", domid, irq); >>> >>> - ret = irq >= 0 ? xc_physdev_map_pirq(CTX->xch, domid, irq, &irq) >>> - : -EOVERFLOW; >>> if (!ret) >>> ret = xc_domain_irq_permission(CTX->xch, domid, irq, 1); >>> if (ret < 0) { >>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c >>> index 2e4b11f..b54c08e 100644 >>> --- a/xen/arch/arm/vgic.c >>> +++ b/xen/arch/arm/vgic.c >>> @@ -85,7 +85,7 @@ int domain_vgic_init(struct domain *d) >>> if ( d->domain_id == 0 ) >>> d->arch.vgic.nr_lines = gic_number_lines() - 32; >>> else >>> - d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */ >>> + d->arch.vgic.nr_lines = gic_number_lines() - 32; /* We do >>> need SPIs for the guest */ >>> >>> d->arch.vgic.shared_irqs = >>> xzalloc_array(struct vgic_irq_rank, DOMAIN_NR_RANKS(d)); >>> diff --git a/xen/common/domctl.c b/xen/common/domctl.c >>> index 75e2df3..ba88901 100644 >>> --- a/xen/common/domctl.c >>> +++ b/xen/common/domctl.c >>> @@ -29,6 +29,7 @@ >>> #include <asm/page.h> >>> #include <public/domctl.h> >>> #include <xsm/xsm.h> >>> +#include <asm/gic.h> >>> >>> static DEFINE_SPINLOCK(domctl_lock); >>> DEFINE_SPINLOCK(vcpu_alloc_lock); >>> @@ -782,8 +783,11 @@ long >>> do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) >>> ret = -EINVAL; >>> else if ( xsm_irq_permission(XSM_HOOK, d, pirq, allow) ) >>> ret = -EPERM; >>> - else if ( allow ) >>> - ret = pirq_permit_access(d, pirq); >>> + else if ( allow ) { >>> + struct dt_irq irq = {pirq + NR_LOCAL_IRQS,0}; >>> + ret = pirq_permit_access(d, irq.irq); >>> + gic_route_irq_to_guest(d, &irq, ""); >>> + } >>> else >>> ret = pirq_deny_access(d, pirq); >>> } >>> (END) >>> >>> It seems, the following patch can violate the logic about routing >>> physical IRQs only to CPU0. >>> In gic_route_irq_to_guest() we need to call gic_set_irq_properties() >>> where the one of the parameters is cpumask_of(smp_processor_id()). >>> But in this part of code this function can be executed on CPU1. And as >>> result this can cause to the fact that the wrong value would set to >>> target CPU mask. >>> >>> Please, confirm my assumption. >> >> That is correct. >> >> >>> If I am right we have to add a basic HW IRQ routing to DomU in a right way. >> >> We could add the cpumask parameter to gic_route_irq_to_guest. Or maybe >> for now we could just hardcode the cpumask of cpu0 >> gic_route_irq_to_guest. >> >> However keep in mind that if you plan on routing SPIs to guests other >> than dom0, receiving all the interrupts on cpu0 might not be great for >> performances. > > Thinking twice about it, it might be the only acceptable change for 4.4. 
In Xen upstream, gic_route_irq_to_guest is only called while dom0 is
being built (so on CPU0). I don't think we need this patch for Xen 4.4.

--
Julien Grall

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
  2014-01-29 11:46 ` Stefano Stabellini
  2014-01-29 13:15 ` Julien Grall
@ 2014-02-04 15:32 ` Julien Grall
  1 sibling, 0 replies; 48+ messages in thread
From: Julien Grall @ 2014-02-04 15:32 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Oleksandr Tyshchenko, Ian Campbell, xen-devel

Hi Stefano,

On 01/29/2014 11:46 AM, Stefano Stabellini wrote:
> Thinking twice about it, it might be the only acceptable change for 4.4.

Thinking about it further, if Ian's patch to prioritize the IPI
(http://www.gossamer-threads.com/lists/xen/devel/315342?do=post_view_threaded)
is not pushed for Xen 4.4, this patch might be useful for Oleksandr.

Can you send it with a commit message and a Signed-off-by?

> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index e6257a7..af96a31 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -776,8 +795,7 @@ int gic_route_irq_to_guest(struct domain *d, const struct dt_irq *irq,
>
>      level = dt_irq_is_level_triggered(irq);
>
> -    gic_set_irq_properties(irq->irq, level, cpumask_of(smp_processor_id()),
> -                           0xa0);
> +    gic_set_irq_properties(irq->irq, level, cpumask_of(0), 0xa0);

I would add a TODO before the function and perhaps explain why...

Cheers,

--
Julien Grall

^ permalink raw reply [flat|nested] 48+ messages in thread
* [PATCH] xen/arm: route irqs to cpu0
  2014-01-29 11:42 ` Stefano Stabellini
  2014-01-29 11:46 ` Stefano Stabellini
@ 2014-02-04 16:20 ` Stefano Stabellini
  2014-02-04 16:32 ` Julien Grall
  2014-02-19 13:43 ` Julien Grall
  1 sibling, 2 replies; 48+ messages in thread
From: Stefano Stabellini @ 2014-02-04 16:20 UTC (permalink / raw)
To: Stefano Stabellini
Cc: Oleksandr Tyshchenko, George Dunlap, Julien Grall, Ian Campbell, xen-devel

gic_route_irq_to_guest routes all IRQs to
cpumask_of(smp_processor_id()), but actually it is always called on cpu0.
To avoid confusion and possible issues in case someone modified the code
and reassigned a particular irq to a cpu other than cpu0, hardcode
cpumask_of(0).

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index e6257a7..8854800 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -776,8 +776,8 @@ int gic_route_irq_to_guest(struct domain *d, const struct dt_irq *irq,

     level = dt_irq_is_level_triggered(irq);

-    gic_set_irq_properties(irq->irq, level, cpumask_of(smp_processor_id()),
-                           0xa0);
+    /* TODO: handle routing irqs to cpus != cpu0 */
+    gic_set_irq_properties(irq->irq, level, cpumask_of(0), 0xa0);

     retval = __setup_irq(desc, irq->irq, action);
     if (retval) {

^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH] xen/arm: route irqs to cpu0 2014-02-04 16:20 ` [PATCH] xen/arm: route irqs to cpu0 Stefano Stabellini @ 2014-02-04 16:32 ` Julien Grall 2014-02-04 16:56 ` Oleksandr Tyshchenko 2014-02-19 13:43 ` Julien Grall 1 sibling, 1 reply; 48+ messages in thread From: Julien Grall @ 2014-02-04 16:32 UTC (permalink / raw) To: Stefano Stabellini Cc: Oleksandr Tyshchenko, George Dunlap, Ian Campbell, xen-devel On 02/04/2014 04:20 PM, Stefano Stabellini wrote: > gic_route_irq_to_guest routes all IRQs to > cpumask_of(smp_processor_id()), but actually it always called on cpu0. > To avoid confusion and possible issues in case someone modified the code > and reassigned a particular irq to a cpu other than cpu0, hardcode > cpumask_of(0). > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org> > > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c > index e6257a7..8854800 100644 > --- a/xen/arch/arm/gic.c > +++ b/xen/arch/arm/gic.c > @@ -776,8 +776,8 @@ int gic_route_irq_to_guest(struct domain *d, const struct dt_irq *irq, > > level = dt_irq_is_level_triggered(irq); > > - gic_set_irq_properties(irq->irq, level, cpumask_of(smp_processor_id()), > - 0xa0); > + /* TODO: handle routing irqs to cpus != cpu0 */ > + gic_set_irq_properties(irq->irq, level, cpumask_of(0), 0xa0); > > retval = __setup_irq(desc, irq->irq, action); > if (retval) { > -- Julien Grall ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] xen/arm: route irqs to cpu0 2014-02-04 16:32 ` Julien Grall @ 2014-02-04 16:56 ` Oleksandr Tyshchenko 0 siblings, 0 replies; 48+ messages in thread From: Oleksandr Tyshchenko @ 2014-02-04 16:56 UTC (permalink / raw) To: Julien Grall; +Cc: George Dunlap, xen-devel, Ian Campbell, Stefano Stabellini Thanks. I need this patch. On Tue, Feb 4, 2014 at 6:32 PM, Julien Grall <julien.grall@linaro.org> wrote: > On 02/04/2014 04:20 PM, Stefano Stabellini wrote: >> gic_route_irq_to_guest routes all IRQs to >> cpumask_of(smp_processor_id()), but actually it always called on cpu0. >> To avoid confusion and possible issues in case someone modified the code >> and reassigned a particular irq to a cpu other than cpu0, hardcode >> cpumask_of(0). >> >> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > Acked-by: Julien Grall <julien.grall@linaro.org> > >> >> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c >> index e6257a7..8854800 100644 >> --- a/xen/arch/arm/gic.c >> +++ b/xen/arch/arm/gic.c >> @@ -776,8 +776,8 @@ int gic_route_irq_to_guest(struct domain *d, const struct dt_irq *irq, >> >> level = dt_irq_is_level_triggered(irq); >> >> - gic_set_irq_properties(irq->irq, level, cpumask_of(smp_processor_id()), >> - 0xa0); >> + /* TODO: handle routing irqs to cpus != cpu0 */ >> + gic_set_irq_properties(irq->irq, level, cpumask_of(0), 0xa0); >> >> retval = __setup_irq(desc, irq->irq, action); >> if (retval) { >> > > > -- > Julien Grall -- Name | Title GlobalLogic P +x.xxx.xxx.xxxx M +x.xxx.xxx.xxxx S skype www.globallogic.com http://www.globallogic.com/email_disclaimer.txt ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] xen/arm: route irqs to cpu0
  2014-02-04 16:20 ` [PATCH] xen/arm: route irqs to cpu0 Stefano Stabellini
  2014-02-04 16:32 ` Julien Grall
@ 2014-02-19 13:43 ` Julien Grall
  2014-02-19 13:53 ` Ian Campbell
  1 sibling, 1 reply; 48+ messages in thread
From: Julien Grall @ 2014-02-19 13:43 UTC (permalink / raw)
To: Stefano Stabellini
Cc: Oleksandr Tyshchenko, George Dunlap, Ian Campbell, xen-devel

Hi all,

Ping? It would be nice to have this patch for Xen 4.4, as the IPI priority
patch won't be pushed before the release.

The patch is a minor change and won't impact normal use. When dom0 is
built, Xen always does it on CPU 0.

Regards,

On 02/04/2014 04:20 PM, Stefano Stabellini wrote:
> gic_route_irq_to_guest routes all IRQs to
> cpumask_of(smp_processor_id()), but actually it always called on cpu0.
> To avoid confusion and possible issues in case someone modified the code
> and reassigned a particular irq to a cpu other than cpu0, hardcode
> cpumask_of(0).
>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index e6257a7..8854800 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -776,8 +776,8 @@ int gic_route_irq_to_guest(struct domain *d, const struct dt_irq *irq,
>
>      level = dt_irq_is_level_triggered(irq);
>
> -    gic_set_irq_properties(irq->irq, level, cpumask_of(smp_processor_id()),
> -                           0xa0);
> +    /* TODO: handle routing irqs to cpus != cpu0 */
> +    gic_set_irq_properties(irq->irq, level, cpumask_of(0), 0xa0);
>
>      retval = __setup_irq(desc, irq->irq, action);
>      if (retval) {
>

--
Julien Grall

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] xen/arm: route irqs to cpu0 2014-02-19 13:43 ` Julien Grall @ 2014-02-19 13:53 ` Ian Campbell 2014-02-19 14:15 ` George Dunlap 0 siblings, 1 reply; 48+ messages in thread From: Ian Campbell @ 2014-02-19 13:53 UTC (permalink / raw) To: Julien Grall Cc: Oleksandr Tyshchenko, xen-devel, George Dunlap, Stefano Stabellini On Wed, 2014-02-19 at 13:43 +0000, Julien Grall wrote: > Hi all, > > Ping? No one made a case for a release exception so I put it in my 4.5 pile. > It would be nice to have this patch for Xen 4.4 as IPI priority > patch won't be pushed before the release. > > The patch is a minor change and won't impact normal use. When dom0 is > built, Xen always do it on CPU 0. Right, so whoever is doing otherwise already has a big pile of patches I presume? It's rather late to be making such changes IMHO, but I'll defer to George. > > Regards, > > On 02/04/2014 04:20 PM, Stefano Stabellini wrote: > > gic_route_irq_to_guest routes all IRQs to > > cpumask_of(smp_processor_id()), but actually it always called on cpu0. > > To avoid confusion and possible issues in case someone modified the code > > and reassigned a particular irq to a cpu other than cpu0, hardcode > > cpumask_of(0). > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c > > index e6257a7..8854800 100644 > > --- a/xen/arch/arm/gic.c > > +++ b/xen/arch/arm/gic.c > > @@ -776,8 +776,8 @@ int gic_route_irq_to_guest(struct domain *d, const struct dt_irq *irq, > > > > level = dt_irq_is_level_triggered(irq); > > > > - gic_set_irq_properties(irq->irq, level, cpumask_of(smp_processor_id()), > > - 0xa0); > > + /* TODO: handle routing irqs to cpus != cpu0 */ > > + gic_set_irq_properties(irq->irq, level, cpumask_of(0), 0xa0); > > > > retval = __setup_irq(desc, irq->irq, action); > > if (retval) { > > > > > > ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] xen/arm: route irqs to cpu0 2014-02-19 13:53 ` Ian Campbell @ 2014-02-19 14:15 ` George Dunlap 2014-02-20 14:52 ` Stefano Stabellini 0 siblings, 1 reply; 48+ messages in thread From: George Dunlap @ 2014-02-19 14:15 UTC (permalink / raw) To: Ian Campbell, Julien Grall Cc: Oleksandr Tyshchenko, xen-devel, George Dunlap, Stefano Stabellini On 02/19/2014 01:53 PM, Ian Campbell wrote: > On Wed, 2014-02-19 at 13:43 +0000, Julien Grall wrote: >> Hi all, >> >> Ping? > No one made a case for a release exception so I put it in my 4.5 pile. > >> It would be nice to have this patch for Xen 4.4 as IPI priority >> patch won't be pushed before the release. >> >> The patch is a minor change and won't impact normal use. When dom0 is >> built, Xen always do it on CPU 0. > Right, so whoever is doing otherwise already has a big pile of patches I > presume? > > It's rather late to be making such changes IMHO, but I'll defer to > George. I can't figure out from the description what's the advantage of having it in 4.4. -George ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] xen/arm: route irqs to cpu0 2014-02-19 14:15 ` George Dunlap @ 2014-02-20 14:52 ` Stefano Stabellini 2014-02-21 11:12 ` George Dunlap 0 siblings, 1 reply; 48+ messages in thread From: Stefano Stabellini @ 2014-02-20 14:52 UTC (permalink / raw) To: George Dunlap Cc: Ian Campbell, Stefano Stabellini, Julien Grall, George Dunlap, xen-devel, Oleksandr Tyshchenko On Wed, 19 Feb 2014, George Dunlap wrote: > On 02/19/2014 01:53 PM, Ian Campbell wrote: > > On Wed, 2014-02-19 at 13:43 +0000, Julien Grall wrote: > > > Hi all, > > > > > > Ping? > > No one made a case for a release exception so I put it in my 4.5 pile. > > > > > It would be nice to have this patch for Xen 4.4 as IPI priority > > > patch won't be pushed before the release. > > > > > > The patch is a minor change and won't impact normal use. When dom0 is > > > built, Xen always do it on CPU 0. > > Right, so whoever is doing otherwise already has a big pile of patches I > > presume? > > > > It's rather late to be making such changes IMHO, but I'll defer to > > George. > > I can't figure out from the description what's the advantage of having it in > 4.4. People that use the default configuration won't see any differences but people that manually modify Xen to start a second domain and assign a device to it would. To give you a concrete example, it fixes a deadlock reported by Oleksandr Tyshchenko: http://marc.info/?l=xen-devel&m=139099606402232 ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] xen/arm: route irqs to cpu0 2014-02-20 14:52 ` Stefano Stabellini @ 2014-02-21 11:12 ` George Dunlap 2014-02-21 11:59 ` Julien Grall 0 siblings, 1 reply; 48+ messages in thread From: George Dunlap @ 2014-02-21 11:12 UTC (permalink / raw) To: Stefano Stabellini Cc: Oleksandr Tyshchenko, Julien Grall, xen-devel@lists.xen.org, Ian Campbell, George Dunlap On Thu, Feb 20, 2014 at 2:52 PM, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote: > On Wed, 19 Feb 2014, George Dunlap wrote: >> On 02/19/2014 01:53 PM, Ian Campbell wrote: >> > On Wed, 2014-02-19 at 13:43 +0000, Julien Grall wrote: >> > > Hi all, >> > > >> > > Ping? >> > No one made a case for a release exception so I put it in my 4.5 pile. >> > >> > > It would be nice to have this patch for Xen 4.4 as IPI priority >> > > patch won't be pushed before the release. >> > > >> > > The patch is a minor change and won't impact normal use. When dom0 is >> > > built, Xen always do it on CPU 0. >> > Right, so whoever is doing otherwise already has a big pile of patches I >> > presume? >> > >> > It's rather late to be making such changes IMHO, but I'll defer to >> > George. >> >> I can't figure out from the description what's the advantage of having it in >> 4.4. > > People that use the default configuration won't see any differences but > people that manually modify Xen to start a second domain and assign a > device to it would. > To give you a concrete example, it fixes a deadlock reported by > Oleksandr Tyshchenko: > > http://marc.info/?l=xen-devel&m=139099606402232 Right -- I think if I had been cc'd when the patch was submitted, I would have said yes for sure; but at this point I think we just want to get 4.4.0 out without any more delays if possible. -George ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] xen/arm: route irqs to cpu0 2014-02-21 11:12 ` George Dunlap @ 2014-02-21 11:59 ` Julien Grall 2014-02-21 12:07 ` George Dunlap 0 siblings, 1 reply; 48+ messages in thread From: Julien Grall @ 2014-02-21 11:59 UTC (permalink / raw) To: George Dunlap Cc: Oleksandr Tyshchenko, George Dunlap, xen-devel@lists.xen.org, Ian Campbell, Stefano Stabellini On 02/21/2014 11:12 AM, George Dunlap wrote: > On Thu, Feb 20, 2014 at 2:52 PM, Stefano Stabellini > <stefano.stabellini@eu.citrix.com> wrote: >> On Wed, 19 Feb 2014, George Dunlap wrote: >>> On 02/19/2014 01:53 PM, Ian Campbell wrote: >>>> On Wed, 2014-02-19 at 13:43 +0000, Julien Grall wrote: >>>>> Hi all, >>>>> >>>>> Ping? >>>> No one made a case for a release exception so I put it in my 4.5 pile. >>>> >>>>> It would be nice to have this patch for Xen 4.4 as IPI priority >>>>> patch won't be pushed before the release. >>>>> >>>>> The patch is a minor change and won't impact normal use. When dom0 is >>>>> built, Xen always do it on CPU 0. >>>> Right, so whoever is doing otherwise already has a big pile of patches I >>>> presume? >>>> >>>> It's rather late to be making such changes IMHO, but I'll defer to >>>> George. >>> >>> I can't figure out from the description what's the advantage of having it in >>> 4.4. >> >> People that use the default configuration won't see any differences but >> people that manually modify Xen to start a second domain and assign a >> device to it would. >> To give you a concrete example, it fixes a deadlock reported by >> Oleksandr Tyshchenko: >> >> http://marc.info/?l=xen-devel&m=139099606402232 > > Right -- I think if I had been cc'd when the patch was submitted, I > would have said yes for sure; but at this point I think we just want > to get 4.4.0 out without any more delays if possible. You were already CCed from the beginning :). -- Julien Grall ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] xen/arm: route irqs to cpu0 2014-02-21 11:59 ` Julien Grall @ 2014-02-21 12:07 ` George Dunlap 0 siblings, 0 replies; 48+ messages in thread From: George Dunlap @ 2014-02-21 12:07 UTC (permalink / raw) To: Julien Grall Cc: Oleksandr Tyshchenko, George Dunlap, xen-devel@lists.xen.org, Ian Campbell, Stefano Stabellini On 02/21/2014 11:59 AM, Julien Grall wrote: > On 02/21/2014 11:12 AM, George Dunlap wrote: >> On Thu, Feb 20, 2014 at 2:52 PM, Stefano Stabellini >> <stefano.stabellini@eu.citrix.com> wrote: >>> On Wed, 19 Feb 2014, George Dunlap wrote: >>>> On 02/19/2014 01:53 PM, Ian Campbell wrote: >>>>> On Wed, 2014-02-19 at 13:43 +0000, Julien Grall wrote: >>>>>> Hi all, >>>>>> >>>>>> Ping? >>>>> No one made a case for a release exception so I put it in my 4.5 pile. >>>>> >>>>>> It would be nice to have this patch for Xen 4.4 as IPI priority >>>>>> patch won't be pushed before the release. >>>>>> >>>>>> The patch is a minor change and won't impact normal use. When dom0 is >>>>>> built, Xen always do it on CPU 0. >>>>> Right, so whoever is doing otherwise already has a big pile of patches I >>>>> presume? >>>>> >>>>> It's rather late to be making such changes IMHO, but I'll defer to >>>>> George. >>>> I can't figure out from the description what's the advantage of having it in >>>> 4.4. >>> People that use the default configuration won't see any differences but >>> people that manually modify Xen to start a second domain and assign a >>> device to it would. >>> To give you a concrete example, it fixes a deadlock reported by >>> Oleksandr Tyshchenko: >>> >>> http://marc.info/?l=xen-devel&m=139099606402232 >> Right -- I think if I had been cc'd when the patch was submitted, I >> would have said yes for sure; but at this point I think we just want >> to get 4.4.0 out without any more delays if possible. > You were already CCed from the beginning :). Right. :-) But as Ian said, no one made a case for a release exception, and I got a bit tired of having to ask every time. :-) -George ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
  2014-01-29 10:56 ` Oleksandr Tyshchenko
  2014-01-29 11:42 ` Stefano Stabellini
@ 2014-01-29 13:07 ` Julien Grall
  2014-01-29 13:22 ` Stefano Stabellini
  1 sibling, 1 reply; 48+ messages in thread
From: Julien Grall @ 2014-01-29 13:07 UTC (permalink / raw)
To: Oleksandr Tyshchenko; +Cc: Stefano Stabellini, Ian Campbell, xen-devel

On 29/01/14 10:56, Oleksandr Tyshchenko wrote:
> Hello all,
>
> I just remembered a hack which we created
> when we needed to route a HW IRQ to domU.
>
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 9d793ba..d0227b9 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -989,8 +989,6 @@ static void domcreate_launch_dm(libxl__egc *egc,
> libxl__multidev *multidev,
>
>          LOG(DEBUG, "dom%d irq %d", domid, irq);
>
> -        ret = irq >= 0 ? xc_physdev_map_pirq(CTX->xch, domid, irq, &irq)
> -                       : -EOVERFLOW;
>          if (!ret)
>              ret = xc_domain_irq_permission(CTX->xch, domid, irq, 1);
>          if (ret < 0) {
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 2e4b11f..b54c08e 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -85,7 +85,7 @@ int domain_vgic_init(struct domain *d)
>      if ( d->domain_id == 0 )
>          d->arch.vgic.nr_lines = gic_number_lines() - 32;
>      else
> -        d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */
> +        d->arch.vgic.nr_lines = gic_number_lines() - 32; /* We do
> need SPIs for the guest */
>
>      d->arch.vgic.shared_irqs =
>          xzalloc_array(struct vgic_irq_rank, DOMAIN_NR_RANKS(d));
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index 75e2df3..ba88901 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -29,6 +29,7 @@
>  #include <asm/page.h>
>  #include <public/domctl.h>
>  #include <xsm/xsm.h>
> +#include <asm/gic.h>
>
>  static DEFINE_SPINLOCK(domctl_lock);
>  DEFINE_SPINLOCK(vcpu_alloc_lock);
> @@ -782,8 +783,11 @@ long
> do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>              ret = -EINVAL;
>          else if ( xsm_irq_permission(XSM_HOOK, d, pirq, allow) )
>              ret = -EPERM;
> -        else if ( allow )
> -            ret = pirq_permit_access(d, pirq);
> +        else if ( allow ) {
> +            struct dt_irq irq = {pirq + NR_LOCAL_IRQS,0};
> +            ret = pirq_permit_access(d, irq.irq);
> +            gic_route_irq_to_guest(d, &irq, "");
> +        }
>          else
>              ret = pirq_deny_access(d, pirq);
>      }
> (END)
>
> It seems the patch above can violate the logic of routing
> physical IRQs only to CPU0.

I forgot the smp_processor_id() in gic_route_irq_to_guest(). As this
function is only called (for upstream) while dom0 is being built, only
CPU0 is used.

> In gic_route_irq_to_guest() we need to call gic_set_irq_properties(),
> where one of the parameters is cpumask_of(smp_processor_id()).
> But in this code path the function can be executed on CPU1, and as a
> result the wrong value can be set in the target CPU mask.
> Please confirm my assumption.
> If I am right, we have to add basic HW IRQ routing to domU in the right way.

With the current implementation, the IRQ will be routed to the CPU that
made the hypercall. This CPU could be different from the CPU where the
domU (assuming we have only 1 VCPU) is running. I think that, for both
dom0 and domU, if the VCPU is pinned we should use the cpumask of that
VCPU.

--
Julien Grall

^ permalink raw reply	[flat|nested] 48+ messages in thread
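To make the routing policy Julien describes concrete, the target selection could look roughly like the sketch below. This is a minimal illustration, not the actual Xen code; the helper name, the use of VCPU0, and the field name v->cpu_affinity are assumptions for a 4.4-era tree.

/* Pick the physical CPU mask for routing a guest SPI: if the domain's
 * VCPU0 is pinned to exactly one pcpu, target that pcpu; otherwise
 * fall back to CPU0, which always exists and is where dom0 is built. */
static const cpumask_t *guest_irq_target(struct domain *d)
{
    struct vcpu *v = d->vcpu[0];                /* assume VCPU0 takes the IRQ */

    if ( cpumask_weight(v->cpu_affinity) == 1 ) /* pinned to a single pcpu? */
        return v->cpu_affinity;

    return cpumask_of(0);                       /* safe default */
}

gic_route_irq_to_guest() would then pass guest_irq_target(d) to gic_set_irq_properties() instead of cpumask_of(smp_processor_id()).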
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix 2014-01-29 13:07 ` [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix Julien Grall @ 2014-01-29 13:22 ` Stefano Stabellini 2014-01-29 18:40 ` Oleksandr Tyshchenko 0 siblings, 1 reply; 48+ messages in thread From: Stefano Stabellini @ 2014-01-29 13:22 UTC (permalink / raw) To: Julien Grall Cc: Oleksandr Tyshchenko, Stefano Stabellini, Ian Campbell, xen-devel On Wed, 29 Jan 2014, Julien Grall wrote: > On 29/01/14 10:56, Oleksandr Tyshchenko wrote: > > Hello all, > > > > I just recollected about one hack which we created > > as we needed to route HW IRQ in domU. > > > > diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c > > index 9d793ba..d0227b9 100644 > > --- a/tools/libxl/libxl_create.c > > +++ b/tools/libxl/libxl_create.c > > @@ -989,8 +989,6 @@ static void domcreate_launch_dm(libxl__egc *egc, > > libxl__multidev *multidev, > > > > LOG(DEBUG, "dom%d irq %d", domid, irq); > > > > - ret = irq >= 0 ? xc_physdev_map_pirq(CTX->xch, domid, irq, &irq) > > - : -EOVERFLOW; > > if (!ret) > > ret = xc_domain_irq_permission(CTX->xch, domid, irq, 1); > > if (ret < 0) { > > diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c > > index 2e4b11f..b54c08e 100644 > > --- a/xen/arch/arm/vgic.c > > +++ b/xen/arch/arm/vgic.c > > @@ -85,7 +85,7 @@ int domain_vgic_init(struct domain *d) > > if ( d->domain_id == 0 ) > > d->arch.vgic.nr_lines = gic_number_lines() - 32; > > else > > - d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */ > > + d->arch.vgic.nr_lines = gic_number_lines() - 32; /* We do > > need SPIs for the guest */ > > > > d->arch.vgic.shared_irqs = > > xzalloc_array(struct vgic_irq_rank, DOMAIN_NR_RANKS(d)); > > diff --git a/xen/common/domctl.c b/xen/common/domctl.c > > index 75e2df3..ba88901 100644 > > --- a/xen/common/domctl.c > > +++ b/xen/common/domctl.c > > @@ -29,6 +29,7 @@ > > #include <asm/page.h> > > #include <public/domctl.h> > > #include <xsm/xsm.h> > > +#include <asm/gic.h> > > > > static DEFINE_SPINLOCK(domctl_lock); > > DEFINE_SPINLOCK(vcpu_alloc_lock); > > @@ -782,8 +783,11 @@ long > > do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) > > ret = -EINVAL; > > else if ( xsm_irq_permission(XSM_HOOK, d, pirq, allow) ) > > ret = -EPERM; > > - else if ( allow ) > > - ret = pirq_permit_access(d, pirq); > > + else if ( allow ) { > > + struct dt_irq irq = {pirq + NR_LOCAL_IRQS,0}; > > + ret = pirq_permit_access(d, irq.irq); > > + gic_route_irq_to_guest(d, &irq, ""); > > + } > > else > > ret = pirq_deny_access(d, pirq); > > } > > (END) > > > > It seems, the following patch can violate the logic about routing > > physical IRQs only to CPU0. > > I forgot the smp_processor_id() in gic_route_irq_to_guest(). As this function > is only called (for upstream) when dom0 is building, only CPU0 is used. Right, that's why changing it to cpumask_of(0) shouldn't make any difference for xen-unstable (it should make things clearer, if nothing else) but it should fix things for Oleksandr. ^ permalink raw reply [flat|nested] 48+ messages in thread
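To spell out why routing every SPI to CPU0 should remove the deadlock, here is the EOI dispatch from the end of maintenance_interrupt() (quoted in full later in this thread), annotated under the assumption that desc->arch.eoi_cpu records the pcpu that received the physical IRQ:

/* With every SPI targeted at CPU0, desc->arch.eoi_cpu is always 0. */
if ( cpu == smp_processor_id() )
    gic_irq_eoi((void*)(uintptr_t)pirq);  /* CPU0 always takes this branch */
else
    on_selected_cpus(cpumask_of(cpu),     /* other CPUs only ever IPI CPU0 */
                     gic_irq_eoi, (void*)(uintptr_t)pirq, 0);

The deadlock needs two CPUs waiting on each other's cross-call. With all hardware IRQs on CPU0, every cross-CPU EOI points towards CPU0, and CPU0 itself never busy-waits on another CPU, so the circular wait cannot form.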
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
  2014-01-29 13:22 ` Stefano Stabellini
@ 2014-01-29 18:40 ` Oleksandr Tyshchenko
  2014-01-29 18:43 ` Oleksandr Tyshchenko
  2014-01-29 18:49 ` Julien Grall
  0 siblings, 2 replies; 48+ messages in thread
From: Oleksandr Tyshchenko @ 2014-01-29 18:40 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Julien Grall, Ian Campbell, xen-devel

> Right, that's why changing it to cpumask_of(0) shouldn't make any
> difference for xen-unstable (it should make things clearer, if nothing
> else) but it should fix things for Oleksandr.

Unfortunately, it is not enough for stable operation.

I tried to use cpumask_of(smp_processor_id()) instead of cpumask_of(0) in
gic_route_irq_to_guest(). As a result, I no longer see the situation that
caused the deadlock in the on_selected_cpus function (as expected).
But the hypervisor sometimes hangs somewhere else (I have not yet
identified where), or I sometimes see traps like the one below (the
"WARN_ON(p->desc != NULL)" in maintenance_interrupt() leads to them):

(XEN) CPU1: Unexpected Trap: Undefined Instruction
(XEN) ----[ Xen-4.4-unstable arm32 debug=y Not tainted ]----
(XEN) CPU: 1
(XEN) PC: 00242c1c __warn+0x20/0x28
(XEN) CPSR: 200001da MODE:Hypervisor
(XEN) R0: 0026770c R1: 00000001 R2: 3fd2fd00 R3: 00000fff
(XEN) R4: 00406100 R5: 40020ee0 R6: 00000000 R7: 4bfdf000
(XEN) R8: 00000001 R9: 4bfd7ed0 R10:00000001 R11:4bfd7ebc R12:00000002
(XEN) HYP: SP: 4bfd7eb4 LR: 00242c1c
(XEN)
(XEN) VTCR_EL2: 80002558
(XEN) VTTBR_EL2: 00020000dec6a000
(XEN)
(XEN) SCTLR_EL2: 30cd187f
(XEN) HCR_EL2: 00000000000028b5
(XEN) TTBR0_EL2: 00000000d2014000
(XEN)
(XEN) ESR_EL2: 00000000
(XEN) HPFAR_EL2: 0000000000482110
(XEN) HDFAR: fa211190
(XEN) HIFAR: 00000000
(XEN)
(XEN) Xen stack trace from sp=4bfd7eb4:
(XEN) 0026431c 4bfd7efc 00247a54 00000024 002e6608 002e6608 00000097 00000001
(XEN) 00000000 4bfd7f54 40017000 40005f60 40017014 4bfd7f58 00000019 00000000
(XEN) 40005f60 4bfd7f24 00248e60 00000009 00000019 00404000 4bfd7f58 00000000
(XEN) 00405000 000045f0 002e7694 4bfd7f4c 00248978 c0079a90 00000097 00000097
(XEN) 00000000 fa212000 ea80c900 00000001 c05b8a60 4bfd7f54 0024f4b8 4bfd7f58
(XEN) 00251830 ea80c950 00000000 00000001 c0079a90 00000097 00000097 00000000
(XEN) fa212000 ea80c900 00000001 c05b8a60 00000000 e9879e3c ffffffff b6efbca3
(XEN) c03b29fc 60000193 9fffffe7 b6c0bbf0 c0607500 c03b3140 e9879eb8 c007680c
(XEN) c060750c c03b32c0 c0607518 c03b3360 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 3ff6bebf a0000113 800b0193 800b0093 40000193 00000000
(XEN) ffeffbfe fedeefff fffd5ffe
(XEN) Xen call trace:
(XEN) [<00242c1c>] __warn+0x20/0x28 (PC)
(XEN) [<00242c1c>] __warn+0x20/0x28 (LR)
(XEN) [<00247a54>] maintenance_interrupt+0xfc/0x2f4
(XEN) [<00248e60>] do_IRQ+0x138/0x198
(XEN) [<00248978>] gic_interrupt+0x58/0xc0
(XEN) [<0024f4b8>] do_trap_irq+0x10/0x14
(XEN) [<00251830>] return_from_trap+0/0x4
(XEN)

Also I am posting
the irq */ cpu = p->desc->arch.eoi_cpu; eoi = 1; pirq = p->desc->irq; } if ( !atomic_dec_and_test(&p->inflight_cnt) ) { /* Physical IRQ can't be reinject */ WARN_ON(p->desc != NULL); gic_set_lr(i, p->irq, GICH_LR_PENDING, p->priority); spin_unlock_irq(&gic.lock); i++; continue; } GICH[GICH_LR + i] = 0; clear_bit(i, &this_cpu(lr_mask)); if ( !list_empty(&v->arch.vgic.lr_pending) ) { n = list_entry(v->arch.vgic.lr_pending.next, typeof(*n), lr_queue); gic_set_lr(i, n->irq, GICH_LR_PENDING, n->priority); list_del_init(&n->lr_queue); set_bit(i, &this_cpu(lr_mask)); } else { gic_inject_irq_stop(); } spin_unlock_irq(&gic.lock); spin_lock_irq(&v->arch.vgic.lock); list_del_init(&p->inflight); spin_unlock_irq(&v->arch.vgic.lock); if ( eoi ) { /* this is not racy because we can't receive another irq of the * same type until we EOI it. */ if ( cpu == smp_processor_id() ) gic_irq_eoi((void*)(uintptr_t)pirq); else on_selected_cpus(cpumask_of(cpu), gic_irq_eoi, (void*)(uintptr_t)pirq, 0); } i++; } } Oleksandr Tyshchenko | Embedded Developer GlobalLogic ^ permalink raw reply [flat|nested] 48+ messages in thread
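A note on why the EOI has to be bounced to eoi_cpu at all: GICC is the per-processor CPU interface of the GIC, and the deactivation must be performed on the same interface that acknowledged the interrupt. The callback itself is tiny; a sketch of its 4.4-era shape (the exact body is an assumption, not a quote from the tree above):

static void gic_irq_eoi(void *info)
{
    int pirq = (uintptr_t) info;

    /* Deactivate the physical interrupt on this CPU's interface,
     * allowing the line to be signalled again. */
    GICC[GICC_DIR] = pirq;
}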
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
  2014-01-29 18:40 ` Oleksandr Tyshchenko
@ 2014-01-29 18:43 ` Oleksandr Tyshchenko
  0 siblings, 0 replies; 48+ messages in thread
From: Oleksandr Tyshchenko @ 2014-01-29 18:43 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Julien Grall, Ian Campbell, xen-devel

Sorry, just a small correction. Where I wrote:

> I tried to use cpumask_of(smp_processor_id()) instead of cpumask_of(0) in

it should read: cpumask_of(0) instead of cpumask_of(smp_processor_id()).

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
  2014-01-29 18:40 ` Oleksandr Tyshchenko
  2014-01-29 18:43 ` Oleksandr Tyshchenko
@ 2014-01-29 18:49 ` Julien Grall
  2014-01-29 19:54 ` Oleksandr Tyshchenko
  2014-01-30 13:24 ` Stefano Stabellini
  1 sibling, 2 replies; 48+ messages in thread
From: Julien Grall @ 2014-01-29 18:49 UTC (permalink / raw)
To: Oleksandr Tyshchenko; +Cc: xen-devel, Ian Campbell, Stefano Stabellini

Hi,

It's weird, a physical IRQ should not be injected twice ...
Were you able to print the IRQ number?

In any case, you are using an old version of the interrupt patch series.
Your new error may come from a race condition in this code.

Can you try to use a newer version?

On 29 Jan 2014 18:40, "Oleksandr Tyshchenko"
<oleksandr.tyshchenko@globallogic.com> wrote:
> > Right, that's why changing it to cpumask_of(0) shouldn't make any
> > difference for xen-unstable (it should make things clearer, if nothing
> > else) but it should fix things for Oleksandr.
>
> Unfortunately, it is not enough for stable operation.
>
> I tried to use cpumask_of(smp_processor_id()) instead of cpumask_of(0)
> in
> gic_route_irq_to_guest(). As a result, I no longer see the situation
> that caused the deadlock in the on_selected_cpus function (as expected).
> But the hypervisor sometimes hangs somewhere else (I have not yet
> identified where), or I sometimes see traps like the one below (the
> "WARN_ON(p->desc != NULL)" in maintenance_interrupt() leads to them):
>
> (XEN) CPU1: Unexpected Trap: Undefined Instruction
> (XEN) ----[ Xen-4.4-unstable arm32 debug=y Not tainted ]----
> (XEN) CPU: 1
> (XEN) PC: 00242c1c __warn+0x20/0x28
> (XEN) CPSR: 200001da MODE:Hypervisor
> (XEN) R0: 0026770c R1: 00000001 R2: 3fd2fd00 R3: 00000fff
> (XEN) R4: 00406100 R5: 40020ee0 R6: 00000000 R7: 4bfdf000
> (XEN) R8: 00000001 R9: 4bfd7ed0 R10:00000001 R11:4bfd7ebc R12:00000002
> (XEN) HYP: SP: 4bfd7eb4 LR: 00242c1c
> (XEN)
> (XEN) VTCR_EL2: 80002558
> (XEN) VTTBR_EL2: 00020000dec6a000
> (XEN)
> (XEN) SCTLR_EL2: 30cd187f
> (XEN) HCR_EL2: 00000000000028b5
> (XEN) TTBR0_EL2: 00000000d2014000
> (XEN)
> (XEN) ESR_EL2: 00000000
> (XEN) HPFAR_EL2: 0000000000482110
> (XEN) HDFAR: fa211190
> (XEN) HIFAR: 00000000
> (XEN)
> (XEN) Xen stack trace from sp=4bfd7eb4:
> (XEN) 0026431c 4bfd7efc 00247a54 00000024 002e6608 002e6608 00000097 00000001
> (XEN) 00000000 4bfd7f54 40017000 40005f60 40017014 4bfd7f58 00000019 00000000
> (XEN) 40005f60 4bfd7f24 00248e60 00000009 00000019 00404000 4bfd7f58 00000000
> (XEN) 00405000 000045f0 002e7694 4bfd7f4c 00248978 c0079a90 00000097 00000097
> (XEN) 00000000 fa212000 ea80c900 00000001 c05b8a60 4bfd7f54 0024f4b8 4bfd7f58
> (XEN) 00251830 ea80c950 00000000 00000001 c0079a90 00000097 00000097 00000000
> (XEN) fa212000 ea80c900 00000001 c05b8a60 00000000 e9879e3c ffffffff b6efbca3
> (XEN) c03b29fc 60000193 9fffffe7 b6c0bbf0 c0607500 c03b3140 e9879eb8 c007680c
> (XEN) c060750c c03b32c0 c0607518 c03b3360 00000000 00000000 00000000 00000000
> (XEN) 00000000 00000000 3ff6bebf a0000113 800b0193 800b0093 40000193 00000000
> (XEN) ffeffbfe fedeefff fffd5ffe
> (XEN) Xen call trace:
> (XEN) [<00242c1c>] __warn+0x20/0x28 (PC)
> (XEN) [<00242c1c>] __warn+0x20/0x28 (LR)
> (XEN) [<00247a54>] maintenance_interrupt+0xfc/0x2f4
> (XEN) [<00248e60>] do_IRQ+0x138/0x198
> (XEN) [<00248978>] gic_interrupt+0x58/0xc0
> (XEN) [<0024f4b8>] do_trap_irq+0x10/0x14
> (XEN) [<00251830>] return_from_trap+0/0x4
> (XEN)
>
> Also I am posting maintenance_interrupt() from my tree:
>
> static void maintenance_interrupt(int irq, void *dev_id, struct
> cpu_user_regs *regs)
> {
>     int i = 0, virq, pirq;
>     uint32_t lr;
>     struct vcpu *v = current;
>     uint64_t eisr = GICH[GICH_EISR0] | (((uint64_t) GICH[GICH_EISR1]) << 32);
>
>     while ((i = find_next_bit((const long unsigned int *) &eisr,
>                               64, i)) < 64) {
>         struct pending_irq *p, *n;
>         int cpu, eoi;
>
>         cpu = -1;
>         eoi = 0;
>
>         spin_lock_irq(&gic.lock);
>         lr = GICH[GICH_LR + i];
>         virq = lr & GICH_LR_VIRTUAL_MASK;
>
>         p = irq_to_pending(v, virq);
>         if ( p->desc != NULL ) {
>             p->desc->status &= ~IRQ_INPROGRESS;
>             /* Assume only one pcpu needs to EOI the irq */
>             cpu = p->desc->arch.eoi_cpu;
>             eoi = 1;
>             pirq = p->desc->irq;
>         }
>         if ( !atomic_dec_and_test(&p->inflight_cnt) )
>         {
>             /* Physical IRQ can't be reinject */
>             WARN_ON(p->desc != NULL);
>             gic_set_lr(i, p->irq, GICH_LR_PENDING, p->priority);
>             spin_unlock_irq(&gic.lock);
>             i++;
>             continue;
>         }
>
>         GICH[GICH_LR + i] = 0;
>         clear_bit(i, &this_cpu(lr_mask));
>
>         if ( !list_empty(&v->arch.vgic.lr_pending) ) {
>             n = list_entry(v->arch.vgic.lr_pending.next, typeof(*n), lr_queue);
>             gic_set_lr(i, n->irq, GICH_LR_PENDING, n->priority);
>             list_del_init(&n->lr_queue);
>             set_bit(i, &this_cpu(lr_mask));
>         } else {
>             gic_inject_irq_stop();
>         }
>         spin_unlock_irq(&gic.lock);
>
>         spin_lock_irq(&v->arch.vgic.lock);
>         list_del_init(&p->inflight);
>         spin_unlock_irq(&v->arch.vgic.lock);
>
>         if ( eoi ) {
>             /* this is not racy because we can't receive another irq of the
>              * same type until we EOI it. */
>             if ( cpu == smp_processor_id() )
>                 gic_irq_eoi((void*)(uintptr_t)pirq);
>             else
>                 on_selected_cpus(cpumask_of(cpu),
>                                  gic_irq_eoi, (void*)(uintptr_t)pirq, 0);
>         }
>
>         i++;
>     }
> }
>
> Oleksandr Tyshchenko | Embedded Developer
> GlobalLogic

^ permalink raw reply	[flat|nested] 48+ messages in thread
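Julien's question about the IRQ number could be answered with a one-off debug change along these lines (illustrative only, against the maintenance_interrupt() quoted above; the printk format is an assumption):

if ( !atomic_dec_and_test(&p->inflight_cnt) )
{
    /* Physical IRQ can't be reinject */
    if ( p->desc != NULL )
        printk("HW irq injected twice: virq=%d pirq=%d\n",
               p->irq, p->desc->irq);
    WARN_ON(p->desc != NULL);
    /* ... rest of the branch unchanged ... */
}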
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
  2014-01-29 18:49 ` Julien Grall
@ 2014-01-29 19:54 ` Oleksandr Tyshchenko
  2014-01-30  0:42 ` Julien Grall
  1 sibling, 1 reply; 48+ messages in thread
From: Oleksandr Tyshchenko @ 2014-01-29 19:54 UTC (permalink / raw)
To: Julien Grall; +Cc: xen-devel, Ian Campbell, Stefano Stabellini

On Wed, Jan 29, 2014 at 8:49 PM, Julien Grall <julien.grall@linaro.org> wrote:
> Hi,
>
> It's weird, a physical IRQ should not be injected twice ...
> Were you able to print the IRQ number?

p->irq: 151, p->desc->irq: 151 (it is the touchscreen IRQ)

>
> In any case, you are using an old version of the interrupt patch series.
> Your new error may come from a race condition in this code.
>
> Can you try to use a newer version?

Yes, I can, but not today or tomorrow: I need time to apply, on top of
the newest version, the local changes our system needs to work.
And do you mean trying the newest version of Xen as a whole, or only
the parts of the code related to interrupt handling?

>
> On 29 Jan 2014 18:40, "Oleksandr Tyshchenko"
> <oleksandr.tyshchenko@globallogic.com> wrote:
>>
>> > Right, that's why changing it to cpumask_of(0) shouldn't make any
>> > difference for xen-unstable (it should make things clearer, if nothing
>> > else) but it should fix things for Oleksandr.
>>
>> Unfortunately, it is not enough for stable operation.
>>
>> I tried to use cpumask_of(smp_processor_id()) instead of cpumask_of(0)
>> in
>> gic_route_irq_to_guest(). As a result, I no longer see the situation
>> that caused the deadlock in the on_selected_cpus function (as expected).
>> But the hypervisor sometimes hangs somewhere else (I have not yet
>> identified where), or I sometimes see traps like the one below (the
>> "WARN_ON(p->desc != NULL)" in maintenance_interrupt() leads to them):
>>
>> (XEN) CPU1: Unexpected Trap: Undefined Instruction
>> (XEN) ----[ Xen-4.4-unstable arm32 debug=y Not tainted ]----
>> (XEN) CPU: 1
>> (XEN) PC: 00242c1c __warn+0x20/0x28
>> (XEN) CPSR: 200001da MODE:Hypervisor
>> (XEN) R0: 0026770c R1: 00000001 R2: 3fd2fd00 R3: 00000fff
>> (XEN) R4: 00406100 R5: 40020ee0 R6: 00000000 R7: 4bfdf000
>> (XEN) R8: 00000001 R9: 4bfd7ed0 R10:00000001 R11:4bfd7ebc R12:00000002
>> (XEN) HYP: SP: 4bfd7eb4 LR: 00242c1c
>> (XEN)
>> (XEN) VTCR_EL2: 80002558
>> (XEN) VTTBR_EL2: 00020000dec6a000
>> (XEN)
>> (XEN) SCTLR_EL2: 30cd187f
>> (XEN) HCR_EL2: 00000000000028b5
>> (XEN) TTBR0_EL2: 00000000d2014000
>> (XEN)
>> (XEN) ESR_EL2: 00000000
>> (XEN) HPFAR_EL2: 0000000000482110
>> (XEN) HDFAR: fa211190
>> (XEN) HIFAR: 00000000
>> (XEN)
>> (XEN) Xen stack trace from sp=4bfd7eb4:
>> (XEN) 0026431c 4bfd7efc 00247a54 00000024 002e6608 002e6608 00000097 00000001
>> (XEN) 00000000 4bfd7f54 40017000 40005f60 40017014 4bfd7f58 00000019 00000000
>> (XEN) 40005f60 4bfd7f24 00248e60 00000009 00000019 00404000 4bfd7f58 00000000
>> (XEN) 00405000 000045f0 002e7694 4bfd7f4c 00248978 c0079a90 00000097 00000097
>> (XEN) 00000000 fa212000 ea80c900 00000001 c05b8a60 4bfd7f54 0024f4b8 4bfd7f58
>> (XEN) 00251830 ea80c950 00000000 00000001 c0079a90 00000097 00000097 00000000
>> (XEN) fa212000 ea80c900 00000001 c05b8a60 00000000 e9879e3c ffffffff b6efbca3
>> (XEN) c03b29fc 60000193 9fffffe7 b6c0bbf0 c0607500 c03b3140 e9879eb8 c007680c
>> (XEN) c060750c c03b32c0 c0607518 c03b3360 00000000 00000000 00000000 00000000
>> (XEN) 00000000 00000000 3ff6bebf a0000113 800b0193 800b0093 40000193 00000000
>> (XEN) ffeffbfe fedeefff fffd5ffe
>> (XEN) Xen call trace:
>> (XEN)
[<00242c1c>] __warn+0x20/0x28 (PC) >> (XEN) [<00242c1c>] __warn+0x20/0x28 (LR) >> (XEN) [<00247a54>] maintenance_interrupt+0xfc/0x2f4 >> (XEN) [<00248e60>] do_IRQ+0x138/0x198 >> (XEN) [<00248978>] gic_interrupt+0x58/0xc0 >> (XEN) [<0024f4b8>] do_trap_irq+0x10/0x14 >> (XEN) [<00251830>] return_from_trap+0/0x4 >> (XEN) >> >> Also I am posting maintenance_interrupt() from my tree: >> >> static void maintenance_interrupt(int irq, void *dev_id, struct >> cpu_user_regs *regs) >> { >> int i = 0, virq, pirq; >> uint32_t lr; >> struct vcpu *v = current; >> uint64_t eisr = GICH[GICH_EISR0] | (((uint64_t) GICH[GICH_EISR1]) << >> 32); >> >> while ((i = find_next_bit((const long unsigned int *) &eisr, >> 64, i)) < 64) { >> struct pending_irq *p, *n; >> int cpu, eoi; >> >> cpu = -1; >> eoi = 0; >> >> spin_lock_irq(&gic.lock); >> lr = GICH[GICH_LR + i]; >> virq = lr & GICH_LR_VIRTUAL_MASK; >> >> p = irq_to_pending(v, virq); >> if ( p->desc != NULL ) { >> p->desc->status &= ~IRQ_INPROGRESS; >> /* Assume only one pcpu needs to EOI the irq */ >> cpu = p->desc->arch.eoi_cpu; >> eoi = 1; >> pirq = p->desc->irq; >> } >> if ( !atomic_dec_and_test(&p->inflight_cnt) ) >> { >> /* Physical IRQ can't be reinject */ >> WARN_ON(p->desc != NULL); >> gic_set_lr(i, p->irq, GICH_LR_PENDING, p->priority); >> spin_unlock_irq(&gic.lock); >> i++; >> continue; >> } >> >> GICH[GICH_LR + i] = 0; >> clear_bit(i, &this_cpu(lr_mask)); >> >> if ( !list_empty(&v->arch.vgic.lr_pending) ) { >> n = list_entry(v->arch.vgic.lr_pending.next, typeof(*n), >> lr_queue); >> gic_set_lr(i, n->irq, GICH_LR_PENDING, n->priority); >> list_del_init(&n->lr_queue); >> set_bit(i, &this_cpu(lr_mask)); >> } else { >> gic_inject_irq_stop(); >> } >> spin_unlock_irq(&gic.lock); >> >> spin_lock_irq(&v->arch.vgic.lock); >> list_del_init(&p->inflight); >> spin_unlock_irq(&v->arch.vgic.lock); >> >> if ( eoi ) { >> /* this is not racy because we can't receive another irq of >> the >> * same type until we EOI it. */ >> if ( cpu == smp_processor_id() ) >> gic_irq_eoi((void*)(uintptr_t)pirq); >> else >> on_selected_cpus(cpumask_of(cpu), >> gic_irq_eoi, (void*)(uintptr_t)pirq, 0); >> } >> >> i++; >> } >> } >> >> >> Oleksandr Tyshchenko | Embedded Developer >> GlobalLogic -- Oleksandr Tyshchenko | Embedded Developer GlobalLogic ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
  2014-01-29 19:54 ` Oleksandr Tyshchenko
@ 2014-01-30  0:42 ` Julien Grall
  0 siblings, 0 replies; 48+ messages in thread
From: Julien Grall @ 2014-01-30 0:42 UTC (permalink / raw)
To: Oleksandr Tyshchenko; +Cc: xen-devel, Ian Campbell, Stefano Stabellini

On 29/01/14 19:54, Oleksandr Tyshchenko wrote:
> On Wed, Jan 29, 2014 at 8:49 PM, Julien Grall <julien.grall@linaro.org> wrote:
>> Hi,
>>
>> It's weird, a physical IRQ should not be injected twice ...
>> Were you able to print the IRQ number?
>
> p->irq: 151, p->desc->irq: 151 (it is the touchscreen IRQ)
>
>> In any case, you are using an old version of the interrupt patch series.
>> Your new error may come from a race condition in this code.
>>
>> Can you try to use a newer version?
>
> Yes, I can, but not today or tomorrow: I need time to apply, on top of
> the newest version, the local changes our system needs to work.
> And do you mean trying the newest version of Xen as a whole, or only
> the parts of the code related to interrupt handling?

I don't think you need the newest version. Which Xen commit are you
using? This range of commits should be enough: 82064f0 - 88eb95e.

--
Julien Grall

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
  2014-01-29 18:49 ` Julien Grall
  2014-01-29 19:54 ` Oleksandr Tyshchenko
@ 2014-01-30 13:24 ` Stefano Stabellini
  2014-01-30 15:06 ` Oleksandr Tyshchenko
  1 sibling, 1 reply; 48+ messages in thread
From: Stefano Stabellini @ 2014-01-30 13:24 UTC (permalink / raw)
To: Julien Grall
Cc: Oleksandr Tyshchenko, xen-devel, Ian Campbell, Stefano Stabellini

Is it a level or an edge irq?

On Wed, 29 Jan 2014, Julien Grall wrote:
> Hi,
>
> It's weird, physical IRQ should not be injected twice ...
> Were you able to print the IRQ number?
>
> In any case, you are using the old version of the interrupt patch series.
> Your new error may come of race condition in this code.
>
> Can you try to use a newest version?
>
> On 29 Jan 2014 18:40, "Oleksandr Tyshchenko" <oleksandr.tyshchenko@globallogic.com> wrote:
> > Right, that's why changing it to cpumask_of(0) shouldn't make any
> > difference for xen-unstable (it should make things clearer, if nothing
> > else) but it should fix things for Oleksandr.
>
> Unfortunately, it is not enough for stable work.
>
> I was tried to use cpumask_of(smp_processor_id()) instead of cpumask_of(0) in
> gic_route_irq_to_guest(). And as result, I don't see our situation
> which cause to deadlock in on_selected_cpus function (expected).
> But, hypervisor sometimes hangs somewhere else (I have not identified
> yet where this is happening) or I sometimes see traps, like that:
> ("WARN_ON(p->desc != NULL)" in maintenance_interrupt() leads to them)
>
> (XEN) CPU1: Unexpected Trap: Undefined Instruction
> (XEN) ----[ Xen-4.4-unstable arm32 debug=y Not tainted ]----
> (XEN) CPU: 1
> (XEN) PC: 00242c1c __warn+0x20/0x28
> (XEN) CPSR: 200001da MODE:Hypervisor
> (XEN) R0: 0026770c R1: 00000001 R2: 3fd2fd00 R3: 00000fff
> (XEN) R4: 00406100 R5: 40020ee0 R6: 00000000 R7: 4bfdf000
> (XEN) R8: 00000001 R9: 4bfd7ed0 R10:00000001 R11:4bfd7ebc R12:00000002
> (XEN) HYP: SP: 4bfd7eb4 LR: 00242c1c
> (XEN)
> (XEN) VTCR_EL2: 80002558
> (XEN) VTTBR_EL2: 00020000dec6a000
> (XEN)
> (XEN) SCTLR_EL2: 30cd187f
> (XEN) HCR_EL2: 00000000000028b5
> (XEN) TTBR0_EL2: 00000000d2014000
> (XEN)
> (XEN) ESR_EL2: 00000000
> (XEN) HPFAR_EL2: 0000000000482110
> (XEN) HDFAR: fa211190
> (XEN) HIFAR: 00000000
> (XEN)
> (XEN) Xen stack trace from sp=4bfd7eb4:
> (XEN) 0026431c 4bfd7efc 00247a54 00000024 002e6608 002e6608 00000097 00000001
> (XEN) 00000000 4bfd7f54 40017000 40005f60 40017014 4bfd7f58 00000019 00000000
> (XEN) 40005f60 4bfd7f24 00248e60 00000009 00000019 00404000 4bfd7f58 00000000
> (XEN) 00405000 000045f0 002e7694 4bfd7f4c 00248978 c0079a90 00000097 00000097
> (XEN) 00000000 fa212000 ea80c900 00000001 c05b8a60 4bfd7f54 0024f4b8 4bfd7f58
> (XEN) 00251830 ea80c950 00000000 00000001 c0079a90 00000097 00000097 00000000
> (XEN) fa212000 ea80c900 00000001 c05b8a60 00000000 e9879e3c ffffffff b6efbca3
> (XEN) c03b29fc 60000193 9fffffe7 b6c0bbf0 c0607500 c03b3140 e9879eb8 c007680c
> (XEN) c060750c c03b32c0 c0607518 c03b3360 00000000 00000000 00000000 00000000
> (XEN) 00000000 00000000 3ff6bebf a0000113 800b0193 800b0093 40000193 00000000
> (XEN) ffeffbfe fedeefff fffd5ffe
> (XEN) Xen call trace:
> (XEN) [<00242c1c>] __warn+0x20/0x28 (PC)
> (XEN) [<00242c1c>] __warn+0x20/0x28 (LR)
> (XEN) [<00247a54>] maintenance_interrupt+0xfc/0x2f4
> (XEN) [<00248e60>] do_IRQ+0x138/0x198
> (XEN) [<00248978>] gic_interrupt+0x58/0xc0
> (XEN) [<0024f4b8>] do_trap_irq+0x10/0x14
> (XEN) [<00251830>] return_from_trap+0/0x4
> (XEN)
>
> Also I am posting maintenance_interrupt() from my tree:
>
> static void maintenance_interrupt(int irq, void *dev_id, struct
> cpu_user_regs *regs)
> {
>     int i = 0, virq, pirq;
>     uint32_t lr;
>     struct vcpu *v = current;
>     uint64_t eisr = GICH[GICH_EISR0] | (((uint64_t) GICH[GICH_EISR1]) << 32);
>
>     while ((i = find_next_bit((const long unsigned int *) &eisr,
>                               64, i)) < 64) {
>         struct pending_irq *p, *n;
>         int cpu, eoi;
>
>         cpu = -1;
>         eoi = 0;
>
>         spin_lock_irq(&gic.lock);
>         lr = GICH[GICH_LR + i];
>         virq = lr & GICH_LR_VIRTUAL_MASK;
>
>         p = irq_to_pending(v, virq);
>         if ( p->desc != NULL ) {
>             p->desc->status &= ~IRQ_INPROGRESS;
>             /* Assume only one pcpu needs to EOI the irq */
>             cpu = p->desc->arch.eoi_cpu;
>             eoi = 1;
>             pirq = p->desc->irq;
>         }
>         if ( !atomic_dec_and_test(&p->inflight_cnt) )
>         {
>             /* Physical IRQ can't be reinject */
>             WARN_ON(p->desc != NULL);
>             gic_set_lr(i, p->irq, GICH_LR_PENDING, p->priority);
>             spin_unlock_irq(&gic.lock);
>             i++;
>             continue;
>         }
>
>         GICH[GICH_LR + i] = 0;
>         clear_bit(i, &this_cpu(lr_mask));
>
>         if ( !list_empty(&v->arch.vgic.lr_pending) ) {
>             n = list_entry(v->arch.vgic.lr_pending.next, typeof(*n), lr_queue);
>             gic_set_lr(i, n->irq, GICH_LR_PENDING, n->priority);
>             list_del_init(&n->lr_queue);
>             set_bit(i, &this_cpu(lr_mask));
>         } else {
>             gic_inject_irq_stop();
>         }
>         spin_unlock_irq(&gic.lock);
>
>         spin_lock_irq(&v->arch.vgic.lock);
>         list_del_init(&p->inflight);
>         spin_unlock_irq(&v->arch.vgic.lock);
>
>         if ( eoi ) {
>             /* this is not racy because we can't receive another irq of the
>              * same type until we EOI it. */
>             if ( cpu == smp_processor_id() )
>                 gic_irq_eoi((void*)(uintptr_t)pirq);
>             else
>                 on_selected_cpus(cpumask_of(cpu),
>                                  gic_irq_eoi, (void*)(uintptr_t)pirq, 0);
>         }
>
>         i++;
>     }
> }
>
> Oleksandr Tyshchenko | Embedded Developer
> GlobalLogic

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix 2014-01-30 13:24 ` Stefano Stabellini @ 2014-01-30 15:06 ` Oleksandr Tyshchenko 2014-01-30 15:35 ` Stefano Stabellini 0 siblings, 1 reply; 48+ messages in thread From: Oleksandr Tyshchenko @ 2014-01-30 15:06 UTC (permalink / raw) To: Stefano Stabellini; +Cc: Julien Grall, Ian Campbell, xen-devel According to DT it is a level irq (DT_IRQ_TYPE_LEVEL_HIGH) On Thu, Jan 30, 2014 at 3:24 PM, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote: > Is it a level or an edge irq? > > On Wed, 29 Jan 2014, Julien Grall wrote: >> Hi, >> >> It's weird, physical IRQ should not be injected twice ... >> Were you able to print the IRQ number? >> >> In any case, you are using the old version of the interrupt patch series. >> Your new error may come of race condition in this code. >> >> Can you try to use a newest version? >> >> On 29 Jan 2014 18:40, "Oleksandr Tyshchenko" <oleksandr.tyshchenko@globallogic.com> wrote: >> > Right, that's why changing it to cpumask_of(0) shouldn't make any >> > difference for xen-unstable (it should make things clearer, if nothing >> > else) but it should fix things for Oleksandr. >> >> Unfortunately, it is not enough for stable work. >> >> I was tried to use cpumask_of(smp_processor_id()) instead of cpumask_of(0) in >> gic_route_irq_to_guest(). And as result, I don't see our situation >> which cause to deadlock in on_selected_cpus function (expected). >> But, hypervisor sometimes hangs somewhere else (I have not identified >> yet where this is happening) or I sometimes see traps, like that: >> ("WARN_ON(p->desc != NULL)" in maintenance_interrupt() leads to them) >> >> (XEN) CPU1: Unexpected Trap: Undefined Instruction >> (XEN) ----[ Xen-4.4-unstable arm32 debug=y Not tainted ]---- >> (XEN) CPU: 1 >> (XEN) PC: 00242c1c __warn+0x20/0x28 >> (XEN) CPSR: 200001da MODE:Hypervisor >> (XEN) R0: 0026770c R1: 00000001 R2: 3fd2fd00 R3: 00000fff >> (XEN) R4: 00406100 R5: 40020ee0 R6: 00000000 R7: 4bfdf000 >> (XEN) R8: 00000001 R9: 4bfd7ed0 R10:00000001 R11:4bfd7ebc R12:00000002 >> (XEN) HYP: SP: 4bfd7eb4 LR: 00242c1c >> (XEN) >> (XEN) VTCR_EL2: 80002558 >> (XEN) VTTBR_EL2: 00020000dec6a000 >> (XEN) >> (XEN) SCTLR_EL2: 30cd187f >> (XEN) HCR_EL2: 00000000000028b5 >> (XEN) TTBR0_EL2: 00000000d2014000 >> (XEN) >> (XEN) ESR_EL2: 00000000 >> (XEN) HPFAR_EL2: 0000000000482110 >> (XEN) HDFAR: fa211190 >> (XEN) HIFAR: 00000000 >> (XEN) >> (XEN) Xen stack trace from sp=4bfd7eb4: >> (XEN) 0026431c 4bfd7efc 00247a54 00000024 002e6608 002e6608 00000097 00000001 >> (XEN) 00000000 4bfd7f54 40017000 40005f60 40017014 4bfd7f58 00000019 00000000 >> (XEN) 40005f60 4bfd7f24 00248e60 00000009 00000019 00404000 4bfd7f58 00000000 >> (XEN) 00405000 000045f0 002e7694 4bfd7f4c 00248978 c0079a90 00000097 00000097 >> (XEN) 00000000 fa212000 ea80c900 00000001 c05b8a60 4bfd7f54 0024f4b8 4bfd7f58 >> (XEN) 00251830 ea80c950 00000000 00000001 c0079a90 00000097 00000097 00000000 >> (XEN) fa212000 ea80c900 00000001 c05b8a60 00000000 e9879e3c ffffffff b6efbca3 >> (XEN) c03b29fc 60000193 9fffffe7 b6c0bbf0 c0607500 c03b3140 e9879eb8 c007680c >> (XEN) c060750c c03b32c0 c0607518 c03b3360 00000000 00000000 00000000 00000000 >> (XEN) 00000000 00000000 3ff6bebf a0000113 800b0193 800b0093 40000193 00000000 >> (XEN) ffeffbfe fedeefff fffd5ffe >> (XEN) Xen call trace: >> (XEN) [<00242c1c>] __warn+0x20/0x28 (PC) >> (XEN) [<00242c1c>] __warn+0x20/0x28 (LR) >> (XEN) [<00247a54>] maintenance_interrupt+0xfc/0x2f4 >> (XEN) [<00248e60>] do_IRQ+0x138/0x198 >> (XEN) 
[<00248978>] gic_interrupt+0x58/0xc0 >> (XEN) [<0024f4b8>] do_trap_irq+0x10/0x14 >> (XEN) [<00251830>] return_from_trap+0/0x4 >> (XEN) >> >> Also I am posting maintenance_interrupt() from my tree: >> >> static void maintenance_interrupt(int irq, void *dev_id, struct >> cpu_user_regs *regs) >> { >> int i = 0, virq, pirq; >> uint32_t lr; >> struct vcpu *v = current; >> uint64_t eisr = GICH[GICH_EISR0] | (((uint64_t) GICH[GICH_EISR1]) << 32); >> >> while ((i = find_next_bit((const long unsigned int *) &eisr, >> 64, i)) < 64) { >> struct pending_irq *p, *n; >> int cpu, eoi; >> >> cpu = -1; >> eoi = 0; >> >> spin_lock_irq(&gic.lock); >> lr = GICH[GICH_LR + i]; >> virq = lr & GICH_LR_VIRTUAL_MASK; >> >> p = irq_to_pending(v, virq); >> if ( p->desc != NULL ) { >> p->desc->status &= ~IRQ_INPROGRESS; >> /* Assume only one pcpu needs to EOI the irq */ >> cpu = p->desc->arch.eoi_cpu; >> eoi = 1; >> pirq = p->desc->irq; >> } >> if ( !atomic_dec_and_test(&p->inflight_cnt) ) >> { >> /* Physical IRQ can't be reinject */ >> WARN_ON(p->desc != NULL); >> gic_set_lr(i, p->irq, GICH_LR_PENDING, p->priority); >> spin_unlock_irq(&gic.lock); >> i++; >> continue; >> } >> >> GICH[GICH_LR + i] = 0; >> clear_bit(i, &this_cpu(lr_mask)); >> >> if ( !list_empty(&v->arch.vgic.lr_pending) ) { >> n = list_entry(v->arch.vgic.lr_pending.next, typeof(*n), lr_queue); >> gic_set_lr(i, n->irq, GICH_LR_PENDING, n->priority); >> list_del_init(&n->lr_queue); >> set_bit(i, &this_cpu(lr_mask)); >> } else { >> gic_inject_irq_stop(); >> } >> spin_unlock_irq(&gic.lock); >> >> spin_lock_irq(&v->arch.vgic.lock); >> list_del_init(&p->inflight); >> spin_unlock_irq(&v->arch.vgic.lock); >> >> if ( eoi ) { >> /* this is not racy because we can't receive another irq of the >> * same type until we EOI it. */ >> if ( cpu == smp_processor_id() ) >> gic_irq_eoi((void*)(uintptr_t)pirq); >> else >> on_selected_cpus(cpumask_of(cpu), >> gic_irq_eoi, (void*)(uintptr_t)pirq, 0); >> } >> >> i++; >> } >> } >> >> >> Oleksandr Tyshchenko | Embedded Developer >> GlobalLogic >> >> >> -- Name | Title GlobalLogic P +x.xxx.xxx.xxxx M +x.xxx.xxx.xxxx S skype www.globallogic.com http://www.globallogic.com/email_disclaimer.txt ^ permalink raw reply [flat|nested] 48+ messages in thread
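For reference, the trigger type from the device tree ends up in the distributor's configuration registers. A sketch of that step, assuming the GICv2 layout (two configuration bits per interrupt, sixteen interrupts per GICD_ICFGR word) and Xen's convention of indexing GICD as an array of 32-bit words:

/* Program level (0b00) versus edge (0b10) triggering for one interrupt. */
static void set_irq_trigger(unsigned int irq, bool level)
{
    uint32_t cfg = GICD[GICD_ICFGR + irq / 16];

    cfg &= ~(0x3u << ((irq % 16) * 2));   /* clear both config bits */
    if ( !level )
        cfg |= 0x2u << ((irq % 16) * 2);  /* mark as edge-triggered */

    GICD[GICD_ICFGR + irq / 16] = cfg;
}

For a level-high line such as IRQ 151 here, the GIC keeps signalling the interrupt as long as the device asserts it, which is why the deactivation point discussed next matters.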
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
  2014-01-30 15:06 ` Oleksandr Tyshchenko
@ 2014-01-30 15:35 ` Stefano Stabellini
  2014-01-30 16:10 ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 48+ messages in thread
From: Stefano Stabellini @ 2014-01-30 15:35 UTC (permalink / raw)
To: Oleksandr Tyshchenko
Cc: Julien Grall, xen-devel, Ian Campbell, Stefano Stabellini

Given that we don't deactivate the interrupt (writing to GICC_DIR) until
the guest EOIs it, I can't understand how you manage to get a second
interrupt notification before the guest EOIs the first one.

Do you set GICC_CTL_EOI in GICC_CTLR?

On Thu, 30 Jan 2014, Oleksandr Tyshchenko wrote:
> According to DT it is a level irq (DT_IRQ_TYPE_LEVEL_HIGH)
>
> On Thu, Jan 30, 2014 at 3:24 PM, Stefano Stabellini
> <stefano.stabellini@eu.citrix.com> wrote:
> > Is it a level or an edge irq?
> >
> > On Wed, 29 Jan 2014, Julien Grall wrote:
> >> Hi,
> >>
> >> It's weird, physical IRQ should not be injected twice ...
> >> Were you able to print the IRQ number?
> >>
> >> In any case, you are using the old version of the interrupt patch series.
> >> Your new error may come of race condition in this code.
> >>
> >> Can you try to use a newest version?
> >>
> >> On 29 Jan 2014 18:40, "Oleksandr Tyshchenko" <oleksandr.tyshchenko@globallogic.com> wrote:
> >> > Right, that's why changing it to cpumask_of(0) shouldn't make any
> >> > difference for xen-unstable (it should make things clearer, if nothing
> >> > else) but it should fix things for Oleksandr.
> >>
> >> Unfortunately, it is not enough for stable work.
> >>
> >> I was tried to use cpumask_of(smp_processor_id()) instead of cpumask_of(0) in
> >> gic_route_irq_to_guest(). And as result, I don't see our situation
> >> which cause to deadlock in on_selected_cpus function (expected).
> >> But, hypervisor sometimes hangs somewhere else (I have not identified > >> yet where this is happening) or I sometimes see traps, like that: > >> ("WARN_ON(p->desc != NULL)" in maintenance_interrupt() leads to them) > >> > >> (XEN) CPU1: Unexpected Trap: Undefined Instruction > >> (XEN) ----[ Xen-4.4-unstable arm32 debug=y Not tainted ]---- > >> (XEN) CPU: 1 > >> (XEN) PC: 00242c1c __warn+0x20/0x28 > >> (XEN) CPSR: 200001da MODE:Hypervisor > >> (XEN) R0: 0026770c R1: 00000001 R2: 3fd2fd00 R3: 00000fff > >> (XEN) R4: 00406100 R5: 40020ee0 R6: 00000000 R7: 4bfdf000 > >> (XEN) R8: 00000001 R9: 4bfd7ed0 R10:00000001 R11:4bfd7ebc R12:00000002 > >> (XEN) HYP: SP: 4bfd7eb4 LR: 00242c1c > >> (XEN) > >> (XEN) VTCR_EL2: 80002558 > >> (XEN) VTTBR_EL2: 00020000dec6a000 > >> (XEN) > >> (XEN) SCTLR_EL2: 30cd187f > >> (XEN) HCR_EL2: 00000000000028b5 > >> (XEN) TTBR0_EL2: 00000000d2014000 > >> (XEN) > >> (XEN) ESR_EL2: 00000000 > >> (XEN) HPFAR_EL2: 0000000000482110 > >> (XEN) HDFAR: fa211190 > >> (XEN) HIFAR: 00000000 > >> (XEN) > >> (XEN) Xen stack trace from sp=4bfd7eb4: > >> (XEN) 0026431c 4bfd7efc 00247a54 00000024 002e6608 002e6608 00000097 00000001 > >> (XEN) 00000000 4bfd7f54 40017000 40005f60 40017014 4bfd7f58 00000019 00000000 > >> (XEN) 40005f60 4bfd7f24 00248e60 00000009 00000019 00404000 4bfd7f58 00000000 > >> (XEN) 00405000 000045f0 002e7694 4bfd7f4c 00248978 c0079a90 00000097 00000097 > >> (XEN) 00000000 fa212000 ea80c900 00000001 c05b8a60 4bfd7f54 0024f4b8 4bfd7f58 > >> (XEN) 00251830 ea80c950 00000000 00000001 c0079a90 00000097 00000097 00000000 > >> (XEN) fa212000 ea80c900 00000001 c05b8a60 00000000 e9879e3c ffffffff b6efbca3 > >> (XEN) c03b29fc 60000193 9fffffe7 b6c0bbf0 c0607500 c03b3140 e9879eb8 c007680c > >> (XEN) c060750c c03b32c0 c0607518 c03b3360 00000000 00000000 00000000 00000000 > >> (XEN) 00000000 00000000 3ff6bebf a0000113 800b0193 800b0093 40000193 00000000 > >> (XEN) ffeffbfe fedeefff fffd5ffe > >> (XEN) Xen call trace: > >> (XEN) [<00242c1c>] __warn+0x20/0x28 (PC) > >> (XEN) [<00242c1c>] __warn+0x20/0x28 (LR) > >> (XEN) [<00247a54>] maintenance_interrupt+0xfc/0x2f4 > >> (XEN) [<00248e60>] do_IRQ+0x138/0x198 > >> (XEN) [<00248978>] gic_interrupt+0x58/0xc0 > >> (XEN) [<0024f4b8>] do_trap_irq+0x10/0x14 > >> (XEN) [<00251830>] return_from_trap+0/0x4 > >> (XEN) > >> > >> Also I am posting maintenance_interrupt() from my tree: > >> > >> static void maintenance_interrupt(int irq, void *dev_id, struct > >> cpu_user_regs *regs) > >> { > >> int i = 0, virq, pirq; > >> uint32_t lr; > >> struct vcpu *v = current; > >> uint64_t eisr = GICH[GICH_EISR0] | (((uint64_t) GICH[GICH_EISR1]) << 32); > >> > >> while ((i = find_next_bit((const long unsigned int *) &eisr, > >> 64, i)) < 64) { > >> struct pending_irq *p, *n; > >> int cpu, eoi; > >> > >> cpu = -1; > >> eoi = 0; > >> > >> spin_lock_irq(&gic.lock); > >> lr = GICH[GICH_LR + i]; > >> virq = lr & GICH_LR_VIRTUAL_MASK; > >> > >> p = irq_to_pending(v, virq); > >> if ( p->desc != NULL ) { > >> p->desc->status &= ~IRQ_INPROGRESS; > >> /* Assume only one pcpu needs to EOI the irq */ > >> cpu = p->desc->arch.eoi_cpu; > >> eoi = 1; > >> pirq = p->desc->irq; > >> } > >> if ( !atomic_dec_and_test(&p->inflight_cnt) ) > >> { > >> /* Physical IRQ can't be reinject */ > >> WARN_ON(p->desc != NULL); > >> gic_set_lr(i, p->irq, GICH_LR_PENDING, p->priority); > >> spin_unlock_irq(&gic.lock); > >> i++; > >> continue; > >> } > >> > >> GICH[GICH_LR + i] = 0; > >> clear_bit(i, &this_cpu(lr_mask)); > >> > >> if ( !list_empty(&v->arch.vgic.lr_pending) ) 
{ > >> n = list_entry(v->arch.vgic.lr_pending.next, typeof(*n), lr_queue); > >> gic_set_lr(i, n->irq, GICH_LR_PENDING, n->priority); > >> list_del_init(&n->lr_queue); > >> set_bit(i, &this_cpu(lr_mask)); > >> } else { > >> gic_inject_irq_stop(); > >> } > >> spin_unlock_irq(&gic.lock); > >> > >> spin_lock_irq(&v->arch.vgic.lock); > >> list_del_init(&p->inflight); > >> spin_unlock_irq(&v->arch.vgic.lock); > >> > >> if ( eoi ) { > >> /* this is not racy because we can't receive another irq of the > >> * same type until we EOI it. */ > >> if ( cpu == smp_processor_id() ) > >> gic_irq_eoi((void*)(uintptr_t)pirq); > >> else > >> on_selected_cpus(cpumask_of(cpu), > >> gic_irq_eoi, (void*)(uintptr_t)pirq, 0); > >> } > >> > >> i++; > >> } > >> } > >> > >> > >> Oleksandr Tyshchenko | Embedded Developer > >> GlobalLogic > >> > >> > >> > > > > -- > > Name | Title > GlobalLogic > P +x.xxx.xxx.xxxx M +x.xxx.xxx.xxxx S skype > www.globallogic.com > > http://www.globallogic.com/email_disclaimer.txt > ^ permalink raw reply [flat|nested] 48+ messages in thread
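Stefano's two questions are connected. Here is the flow he is describing, sketched with GICv2 register names; the macro GICC_CTL_EOI is assumed to be Xen's name for the EOImode bit in GICC_CTLR:

/* Boot: enable the CPU interface with split priority-drop/deactivation. */
GICC[GICC_CTLR] = GICC_CTL_ENABLE | GICC_CTL_EOI;

/* Taking a physical IRQ destined for a guest: */
irq = GICC[GICC_IAR] & 0x3ff;   /* acknowledge: the IRQ becomes active  */
GICC[GICC_EOIR] = irq;          /* priority drop only: it STAYS active  */
/* ... inject the corresponding virtual IRQ into the guest ... */

/* Only once the guest EOIs the virtual IRQ (maintenance path): */
GICC[GICC_DIR] = irq;           /* deactivate: the line may fire again  */

While the interrupt is still active, a level-triggered source cannot raise a second notification, which is why a WARN_ON firing for a hardware IRQ is surprising here.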
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
  2014-01-30 15:35 ` Stefano Stabellini
@ 2014-01-30 16:10 ` Oleksandr Tyshchenko
  2014-01-30 17:18 ` Stefano Stabellini
  0 siblings, 1 reply; 48+ messages in thread
From: Oleksandr Tyshchenko @ 2014-01-30 16:10 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Julien Grall, Ian Campbell, xen-devel

Hello, all.

1. The "simultaneous cross-interrupts" issue only occurs if
gic_route_irq_to_guest() was called on CPU1 during boot. This is possible
in our case, since gic_route_irq_to_guest() is called while both domains
(dom0 and domU) are being built, unlike in Xen upstream. So the
"impropriety" is on our side. The following change fixes the side effect
(the deadlock):

while ( unlikely(!spin_trylock(&call_lock)) )
    smp_call_function_interrupt();

1.1 I have checked the patch "xen: arm: increase priority of SGIs used as
IPIs". In general it works (I mean that it does not cause new issues), but
it does not fix the issue: I still see the "simultaneous cross-interrupts".

1.2 I have also checked the solution where the on_selected_cpus call was
moved out of the interrupt handler. Unfortunately, it does not work; I
almost immediately see the following error:

(XEN) Assertion 'this_cpu(eoi_irq) == NULL' failed, line 981, file gic.c
(XEN) Xen BUG at gic.c:981
(XEN) CPU1: Unexpected Trap: Undefined Instruction
(XEN) ----[ Xen-4.4-unstable arm32 debug=y Not tainted ]----
(XEN) CPU: 1
(XEN) PC: 00241ee0 __bug+0x2c/0x44
(XEN) CPSR: 2000015a MODE:Hypervisor
(XEN) R0: 0026770c R1: 00000000 R2: 3fd2fd00 R3: 00000fff
(XEN) R4: 00263248 R5: 00264384 R6: 000003d5 R7: 4003d000
(XEN) R8: 00000001 R9: 00000091 R10:00000000 R11:40037ebc R12:00000001
(XEN) HYP: SP: 40037eb4 LR: 00241ee0
(XEN)
(XEN) VTCR_EL2: 80002558
(XEN) VTTBR_EL2: 00010000deffc000
(XEN)
(XEN) SCTLR_EL2: 30cd187f
(XEN) HCR_EL2: 0000000000002835
(XEN) TTBR0_EL2: 00000000d2014000
(XEN)
(XEN) ESR_EL2: 00000000
(XEN) HPFAR_EL2: 0000000000482110
(XEN) HDFAR: fa211f00
(XEN) HIFAR: 00000000
(XEN)
(XEN) Xen stack trace from sp=40037eb4:
(XEN) 00000000 40037efc 00247e1c 002e6610 002e6610 002e6608 002e6608 00000001
(XEN) 00000000 40015000 40017000 40005f60 40017014 40037f58 00000019 00000000
(XEN) 40005f60 40037f24 00249068 00000009 00000019 00404000 40037f58 00000000
(XEN) 00405000 00004680 002e7694 40037f4c 00248b80 00000000 c5b72000 00000091
(XEN) 00000000 c700d4e0 c008477c 000000f1 00000001 40037f54 0024f6c0 40037f58
(XEN) 00251a30 c700d4e0 00000001 c008477c 00000000 c5b72000 00000091 00000000
(XEN) c700d4e0 c008477c 000000f1 00000001 00000001 c5b72000 ffffffff 0000a923
(XEN) c0077ac4 60000193 00000000 b6eadaa0 c0578f40 c00138c0 c5b73f58 c036ab90
(XEN) c0578f4c c00136a0 c0578f58 c0013920 00000000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 80000010 60000193 a0000093 80000193 00000000
(XEN) 00000000 0c41e00c 450c2880
(XEN) Xen call trace:
(XEN) [<00241ee0>] __bug+0x2c/0x44 (PC)
(XEN) [<00241ee0>] __bug+0x2c/0x44 (LR)
(XEN) [<00247e1c>] maintenance_interrupt+0x2e8/0x328
(XEN) [<00249068>] do_IRQ+0x138/0x198
(XEN) [<00248b80>] gic_interrupt+0x58/0xc0
(XEN) [<0024f6c0>] do_trap_irq+0x10/0x14
(XEN) [<00251a30>] return_from_trap+0/0x4
(XEN)

2. The "simultaneous cross-interrupts" issue does not occur if I use the
following change, so I no longer hit the deadlock in on_selected_cpus():

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index e6257a7..af96a31 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -776,8 +795,7 @@ int gic_route_irq_to_guest(struct domain *d, const
struct dt_irq *irq,

     level = dt_irq_is_level_triggered(irq);

-    gic_set_irq_properties(irq->irq, level, cpumask_of(smp_processor_id()),
-                           0xa0);
+    gic_set_irq_properties(irq->irq, level, cpumask_of(0), 0xa0);

     retval = __setup_irq(desc, irq->irq, action);
     if (retval) {

But I rarely see deadlocks in other parts related to interrupt handling.
As noted by Julien, I am using an old version of the interrupt patch
series; I completely agree.

We are based on the following Xen commit:
48249a1 libxl: Avoid realloc(,0) when libxl__xs_directory returns empty list

We also have some patches that we cherry-picked when we urgently needed them:
6bba1a3 xen/arm: Keep count of inflight interrupts
33a8aa9 xen/arm: Only enable physical IRQs when the guest asks
b6a4e65 xen/arm: Rename gic_irq_{startup, shutdown} to gic_irq_{mask, unmask}
5dbe455 xen/arm: Don't reinject the IRQ if it's already in LRs
1438f03 xen/arm: Physical IRQ is not always equal to virtual IRQ

I still have to apply the following patches and check with them:
88eb95e xen/arm: disable a physical IRQ when the guest disables the
corresponding IRQ
a660ee3 xen/arm: implement gic_irq_enable and gic_irq_disable
1dc9556 xen/arm: do not add a second irq to the LRs if one is already present
d16d511 xen/arm: track the state of guest IRQs

I will report the results; I hope to do it today.

Many thanks to all.

On Thu, Jan 30, 2014 at 5:35 PM, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
> Given that we don't deactivate the interrupt (writing to GICC_DIR) until
> the guest EOIs it, I can't understand how you manage to get a second
> interrupt notifications before the guest EOIs the first one.
>
> Do you set GICC_CTL_EOI in GICC_CTLR?
>
> On Thu, 30 Jan 2014, Oleksandr Tyshchenko wrote:
> >> According to DT it is a level irq (DT_IRQ_TYPE_LEVEL_HIGH)
> >>
> >> On Thu, Jan 30, 2014 at 3:24 PM, Stefano Stabellini
> >> <stefano.stabellini@eu.citrix.com> wrote:
> >> > Is it a level or an edge irq?
> >> >
> >> > On Wed, 29 Jan 2014, Julien Grall wrote:
> >> >> Hi,
> >> >>
> >> >> It's weird, physical IRQ should not be injected twice ...
> >> >> Were you able to print the IRQ number?
> >> >>
> >> >> In any case, you are using the old version of the interrupt patch series.
> >> >> Your new error may come of race condition in this code.
> >> >>
> >> >> Can you try to use a newest version?
> >> >>
> >> >> On 29 Jan 2014 18:40, "Oleksandr Tyshchenko" <oleksandr.tyshchenko@globallogic.com> wrote:
> >> >> > Right, that's why changing it to cpumask_of(0) shouldn't make any
> >> >> > difference for xen-unstable (it should make things clearer, if nothing
> >> >> > else) but it should fix things for Oleksandr.
> >> >>
> >> >> Unfortunately, it is not enough for stable work.
> >> >>
> >> >> I was tried to use cpumask_of(smp_processor_id()) instead of cpumask_of(0) in
> >> >> gic_route_irq_to_guest(). And as result, I don't see our situation
> >> >> which cause to deadlock in on_selected_cpus function (expected).
>> >> But, hypervisor sometimes hangs somewhere else (I have not identified >> >> yet where this is happening) or I sometimes see traps, like that: >> >> ("WARN_ON(p->desc != NULL)" in maintenance_interrupt() leads to them) >> >> >> >> (XEN) CPU1: Unexpected Trap: Undefined Instruction >> >> (XEN) ----[ Xen-4.4-unstable arm32 debug=y Not tainted ]---- >> >> (XEN) CPU: 1 >> >> (XEN) PC: 00242c1c __warn+0x20/0x28 >> >> (XEN) CPSR: 200001da MODE:Hypervisor >> >> (XEN) R0: 0026770c R1: 00000001 R2: 3fd2fd00 R3: 00000fff >> >> (XEN) R4: 00406100 R5: 40020ee0 R6: 00000000 R7: 4bfdf000 >> >> (XEN) R8: 00000001 R9: 4bfd7ed0 R10:00000001 R11:4bfd7ebc R12:00000002 >> >> (XEN) HYP: SP: 4bfd7eb4 LR: 00242c1c >> >> (XEN) >> >> (XEN) VTCR_EL2: 80002558 >> >> (XEN) VTTBR_EL2: 00020000dec6a000 >> >> (XEN) >> >> (XEN) SCTLR_EL2: 30cd187f >> >> (XEN) HCR_EL2: 00000000000028b5 >> >> (XEN) TTBR0_EL2: 00000000d2014000 >> >> (XEN) >> >> (XEN) ESR_EL2: 00000000 >> >> (XEN) HPFAR_EL2: 0000000000482110 >> >> (XEN) HDFAR: fa211190 >> >> (XEN) HIFAR: 00000000 >> >> (XEN) >> >> (XEN) Xen stack trace from sp=4bfd7eb4: >> >> (XEN) 0026431c 4bfd7efc 00247a54 00000024 002e6608 002e6608 00000097 00000001 >> >> (XEN) 00000000 4bfd7f54 40017000 40005f60 40017014 4bfd7f58 00000019 00000000 >> >> (XEN) 40005f60 4bfd7f24 00248e60 00000009 00000019 00404000 4bfd7f58 00000000 >> >> (XEN) 00405000 000045f0 002e7694 4bfd7f4c 00248978 c0079a90 00000097 00000097 >> >> (XEN) 00000000 fa212000 ea80c900 00000001 c05b8a60 4bfd7f54 0024f4b8 4bfd7f58 >> >> (XEN) 00251830 ea80c950 00000000 00000001 c0079a90 00000097 00000097 00000000 >> >> (XEN) fa212000 ea80c900 00000001 c05b8a60 00000000 e9879e3c ffffffff b6efbca3 >> >> (XEN) c03b29fc 60000193 9fffffe7 b6c0bbf0 c0607500 c03b3140 e9879eb8 c007680c >> >> (XEN) c060750c c03b32c0 c0607518 c03b3360 00000000 00000000 00000000 00000000 >> >> (XEN) 00000000 00000000 3ff6bebf a0000113 800b0193 800b0093 40000193 00000000 >> >> (XEN) ffeffbfe fedeefff fffd5ffe >> >> (XEN) Xen call trace: >> >> (XEN) [<00242c1c>] __warn+0x20/0x28 (PC) >> >> (XEN) [<00242c1c>] __warn+0x20/0x28 (LR) >> >> (XEN) [<00247a54>] maintenance_interrupt+0xfc/0x2f4 >> >> (XEN) [<00248e60>] do_IRQ+0x138/0x198 >> >> (XEN) [<00248978>] gic_interrupt+0x58/0xc0 >> >> (XEN) [<0024f4b8>] do_trap_irq+0x10/0x14 >> >> (XEN) [<00251830>] return_from_trap+0/0x4 >> >> (XEN) >> >> >> >> Also I am posting maintenance_interrupt() from my tree: >> >> >> >> static void maintenance_interrupt(int irq, void *dev_id, struct >> >> cpu_user_regs *regs) >> >> { >> >> int i = 0, virq, pirq; >> >> uint32_t lr; >> >> struct vcpu *v = current; >> >> uint64_t eisr = GICH[GICH_EISR0] | (((uint64_t) GICH[GICH_EISR1]) << 32); >> >> >> >> while ((i = find_next_bit((const long unsigned int *) &eisr, >> >> 64, i)) < 64) { >> >> struct pending_irq *p, *n; >> >> int cpu, eoi; >> >> >> >> cpu = -1; >> >> eoi = 0; >> >> >> >> spin_lock_irq(&gic.lock); >> >> lr = GICH[GICH_LR + i]; >> >> virq = lr & GICH_LR_VIRTUAL_MASK; >> >> >> >> p = irq_to_pending(v, virq); >> >> if ( p->desc != NULL ) { >> >> p->desc->status &= ~IRQ_INPROGRESS; >> >> /* Assume only one pcpu needs to EOI the irq */ >> >> cpu = p->desc->arch.eoi_cpu; >> >> eoi = 1; >> >> pirq = p->desc->irq; >> >> } >> >> if ( !atomic_dec_and_test(&p->inflight_cnt) ) >> >> { >> >> /* Physical IRQ can't be reinject */ >> >> WARN_ON(p->desc != NULL); >> >> gic_set_lr(i, p->irq, GICH_LR_PENDING, p->priority); >> >> spin_unlock_irq(&gic.lock); >> >> i++; >> >> continue; >> >> } >> >> >> >> GICH[GICH_LR + i] = 0; >> >> 
clear_bit(i, &this_cpu(lr_mask)); >> >> >> >> if ( !list_empty(&v->arch.vgic.lr_pending) ) { >> >> n = list_entry(v->arch.vgic.lr_pending.next, typeof(*n), lr_queue); >> >> gic_set_lr(i, n->irq, GICH_LR_PENDING, n->priority); >> >> list_del_init(&n->lr_queue); >> >> set_bit(i, &this_cpu(lr_mask)); >> >> } else { >> >> gic_inject_irq_stop(); >> >> } >> >> spin_unlock_irq(&gic.lock); >> >> >> >> spin_lock_irq(&v->arch.vgic.lock); >> >> list_del_init(&p->inflight); >> >> spin_unlock_irq(&v->arch.vgic.lock); >> >> >> >> if ( eoi ) { >> >> /* this is not racy because we can't receive another irq of the >> >> * same type until we EOI it. */ >> >> if ( cpu == smp_processor_id() ) >> >> gic_irq_eoi((void*)(uintptr_t)pirq); >> >> else >> >> on_selected_cpus(cpumask_of(cpu), >> >> gic_irq_eoi, (void*)(uintptr_t)pirq, 0); >> >> } >> >> >> >> i++; >> >> } >> >> } >> >> >> >> >> >> Oleksandr Tyshchenko | Embedded Developer >> >> GlobalLogic >> >> >> >> >> >> >> >> >> >> -- >> >> Name | Title >> GlobalLogic >> P +x.xxx.xxx.xxxx M +x.xxx.xxx.xxxx S skype >> www.globallogic.com >> >> http://www.globallogic.com/email_disclaimer.txt >> -- Name | Title GlobalLogic P +x.xxx.xxx.xxxx M +x.xxx.xxx.xxxx S skype www.globallogic.com http://www.globallogic.com/email_disclaimer.txt ^ permalink raw reply related [flat|nested] 48+ messages in thread
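For context on the trylock workaround quoted in point 1 above, here is where it sits, sketched against the shape of on_selected_cpus() in xen/common/smp.c (the body shown is an approximation, not the exact upstream code):

void on_selected_cpus(const cpumask_t *selected, void (*func)(void *),
                      void *data, int wait)
{
    /* Instead of blocking on call_lock, service any cross-call already
     * in flight: if this CPU is one of its targets, running the handler
     * clears us from call_data.selected, which lets the current lock
     * holder finish its busy-wait below and release the lock. */
    while ( unlikely(!spin_trylock(&call_lock)) )
        smp_call_function_interrupt();

    cpumask_copy(&call_data.selected, selected);
    call_data.func = func;
    call_data.info = data;
    call_data.wait = wait;

    smp_send_call_function_mask(&call_data.selected);

    /* Wait for every selected CPU to run func and clear itself. */
    while ( !cpumask_empty(&call_data.selected) )
        cpu_relax();

    spin_unlock(&call_lock);
}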
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
2014-01-30 16:10 ` Oleksandr Tyshchenko
@ 2014-01-30 17:18 ` Stefano Stabellini
2014-01-30 19:54 ` Oleksandr Tyshchenko
0 siblings, 1 reply; 48+ messages in thread
From: Stefano Stabellini @ 2014-01-30 17:18 UTC (permalink / raw)
To: Oleksandr Tyshchenko
Cc: Julien Grall, xen-devel, Ian Campbell, Stefano Stabellini

On Thu, 30 Jan 2014, Oleksandr Tyshchenko wrote:
> 1.2 I have also checked a solution where the on_selected_cpus call was
> moved out of the interrupt handler. Unfortunately, it doesn't work.
>
> I almost immediately see the following error:
> (XEN) Assertion 'this_cpu(eoi_irq) == NULL' failed, line 981, file gic.c
> (XEN) Xen BUG at gic.c:981
> (XEN) CPU1: Unexpected Trap: Undefined Instruction
> (XEN) ----[ Xen-4.4-unstable arm32 debug=y Not tainted ]----
> (XEN) CPU: 1
> (XEN) PC: 00241ee0 __bug+0x2c/0x44
> (XEN) CPSR: 2000015a MODE:Hypervisor
> (XEN) R0: 0026770c R1: 00000000 R2: 3fd2fd00 R3: 00000fff
> (XEN) R4: 00263248 R5: 00264384 R6: 000003d5 R7: 4003d000
> (XEN) R8: 00000001 R9: 00000091 R10:00000000 R11:40037ebc R12:00000001
> (XEN) HYP: SP: 40037eb4 LR: 00241ee0
> (XEN)
> (XEN) VTCR_EL2: 80002558
> (XEN) VTTBR_EL2: 00010000deffc000
> (XEN)
> (XEN) SCTLR_EL2: 30cd187f
> (XEN) HCR_EL2: 0000000000002835
> (XEN) TTBR0_EL2: 00000000d2014000
> (XEN)
> (XEN) ESR_EL2: 00000000
> (XEN) HPFAR_EL2: 0000000000482110
> (XEN) HDFAR: fa211f00
> (XEN) HIFAR: 00000000
> (XEN)
> (XEN) Xen stack trace from sp=40037eb4:
> (XEN) 00000000 40037efc 00247e1c 002e6610 002e6610 002e6608 002e6608 00000001
> (XEN) 00000000 40015000 40017000 40005f60 40017014 40037f58 00000019 00000000
> (XEN) 40005f60 40037f24 00249068 00000009 00000019 00404000 40037f58 00000000
> (XEN) 00405000 00004680 002e7694 40037f4c 00248b80 00000000 c5b72000 00000091
> (XEN) 00000000 c700d4e0 c008477c 000000f1 00000001 40037f54 0024f6c0 40037f58
> (XEN) 00251a30 c700d4e0 00000001 c008477c 00000000 c5b72000 00000091 00000000
> (XEN) c700d4e0 c008477c 000000f1 00000001 00000001 c5b72000 ffffffff 0000a923
> (XEN) c0077ac4 60000193 00000000 b6eadaa0 c0578f40 c00138c0 c5b73f58 c036ab90
> (XEN) c0578f4c c00136a0 c0578f58 c0013920 00000000 00000000 00000000 00000000
> (XEN) 00000000 00000000 00000000 80000010 60000193 a0000093 80000193 00000000
> (XEN) 00000000 0c41e00c 450c2880
> (XEN) Xen call trace:
> (XEN) [<00241ee0>] __bug+0x2c/0x44 (PC)
> (XEN) [<00241ee0>] __bug+0x2c/0x44 (LR)
> (XEN) [<00247e1c>] maintenance_interrupt+0x2e8/0x328
> (XEN) [<00249068>] do_IRQ+0x138/0x198
> (XEN) [<00248b80>] gic_interrupt+0x58/0xc0
> (XEN) [<0024f6c0>] do_trap_irq+0x10/0x14
> (XEN) [<00251a30>] return_from_trap+0/0x4
> (XEN)

Are you seeing more than one interrupt being EOI'ed with a single
maintenance interrupt? I didn't think that was possible in practice.
If so, we might have to turn eoi_irq into a list or an array.

> 2. The "simultaneous cross-interrupts" issue doesn't occur if I use
> the following solution:
>
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index e6257a7..af96a31 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -776,8 +795,7 @@ int gic_route_irq_to_guest(struct domain *d, const
> struct dt_irq *irq,
>
>      level = dt_irq_is_level_triggered(irq);
>
> -    gic_set_irq_properties(irq->irq, level, cpumask_of(smp_processor_id()),
> -                           0xa0);
> +    gic_set_irq_properties(irq->irq, level, cpumask_of(0), 0xa0);
>
>      retval = __setup_irq(desc, irq->irq, action);
>      if (retval) {
>
> So, as a result, I don't see the deadlock in on_selected_cpus().

As I stated before, I think this is a good change to have in 4.4.

> But, rarely, I see deadlocks in other parts related to interrupt handling.
> As noted by Julien, I am using the old version of the interrupt patch
> series. I completely agree.
>
> We are based on the following Xen commit:
> 48249a1 libxl: Avoid realloc(,0) when libxl__xs_directory returns empty list
>
> Also we have some patches, which we cherry-picked when we urgently needed them:
> 6bba1a3 xen/arm: Keep count of inflight interrupts
> 33a8aa9 xen/arm: Only enable physical IRQs when the guest asks
> b6a4e65 xen/arm: Rename gic_irq_{startup, shutdown} to gic_irq_{mask, unmask}
> 5dbe455 xen/arm: Don't reinject the IRQ if it's already in LRs
> 1438f03 xen/arm: Physical IRQ is not always equal to virtual IRQ
>
> I have to apply the following patches and check with them:
> 88eb95e xen/arm: disable a physical IRQ when the guest disables the
> corresponding IRQ
> a660ee3 xen/arm: implement gic_irq_enable and gic_irq_disable
> 1dc9556 xen/arm: do not add a second irq to the LRs if one is already present
> d16d511 xen/arm: track the state of guest IRQs
>
> I'll report the results. I hope to do so today.

I am looking forward to reading your report.

Cheers,

Stefano

> Many thanks to all.
>
> On Thu, Jan 30, 2014 at 5:35 PM, Stefano Stabellini
> <stefano.stabellini@eu.citrix.com> wrote:
> > Given that we don't deactivate the interrupt (writing to GICC_DIR) until
> > the guest EOIs it, I can't understand how you manage to get a second
> > interrupt notification before the guest EOIs the first one.
> >
> > Do you set GICC_CTL_EOI in GICC_CTLR?
> >
> > On Thu, 30 Jan 2014, Oleksandr Tyshchenko wrote:
> >> According to DT it is a level irq (DT_IRQ_TYPE_LEVEL_HIGH)
> >>
> >> On Thu, Jan 30, 2014 at 3:24 PM, Stefano Stabellini
> >> <stefano.stabellini@eu.citrix.com> wrote:
> >> > Is it a level or an edge irq?
> >> >
> >> > On Wed, 29 Jan 2014, Julien Grall wrote:
> >> >> Hi,
> >> >>
> >> >> It's weird, a physical IRQ should not be injected twice ...
> >> >> Were you able to print the IRQ number?
> >> >>
> >> >> In any case, you are using the old version of the interrupt patch series.
> >> >> Your new error may come from a race condition in this code.
> >> >>
> >> >> Can you try to use the newest version?
> >> >>
> >> >> On 29 Jan 2014 18:40, "Oleksandr Tyshchenko" <oleksandr.tyshchenko@globallogic.com> wrote:
> >> >> > Right, that's why changing it to cpumask_of(0) shouldn't make any
> >> >> > difference for xen-unstable (it should make things clearer, if nothing
> >> >> > else) but it should fix things for Oleksandr.
> >> >>
> >> >> Unfortunately, it is not enough for stable work.
> >> >>
> >> >> I tried to use cpumask_of(smp_processor_id()) instead of cpumask_of(0) in
> >> >> gic_route_irq_to_guest(). And as a result, I don't see the situation
> >> >> which causes the deadlock in the on_selected_cpus function (as expected).
> >> >> But the hypervisor sometimes hangs somewhere else (I have not yet
> >> >> identified where this is happening) or I sometimes see traps like this
> >> >> ("WARN_ON(p->desc != NULL)" in maintenance_interrupt() leads to them):
> >> >>
> >> >> (XEN) CPU1: Unexpected Trap: Undefined Instruction
> >> >> (XEN) ----[ Xen-4.4-unstable arm32 debug=y Not tainted ]----
> >> >> (XEN) CPU: 1
> >> >> (XEN) PC: 00242c1c __warn+0x20/0x28
> >> >> (XEN) CPSR: 200001da MODE:Hypervisor
> >> >> (XEN) R0: 0026770c R1: 00000001 R2: 3fd2fd00 R3: 00000fff
> >> >> (XEN) R4: 00406100 R5: 40020ee0 R6: 00000000 R7: 4bfdf000
> >> >> (XEN) R8: 00000001 R9: 4bfd7ed0 R10:00000001 R11:4bfd7ebc R12:00000002
> >> >> (XEN) HYP: SP: 4bfd7eb4 LR: 00242c1c
> >> >> (XEN)
> >> >> (XEN) VTCR_EL2: 80002558
> >> >> (XEN) VTTBR_EL2: 00020000dec6a000
> >> >> (XEN)
> >> >> (XEN) SCTLR_EL2: 30cd187f
> >> >> (XEN) HCR_EL2: 00000000000028b5
> >> >> (XEN) TTBR0_EL2: 00000000d2014000
> >> >> (XEN)
> >> >> (XEN) ESR_EL2: 00000000
> >> >> (XEN) HPFAR_EL2: 0000000000482110
> >> >> (XEN) HDFAR: fa211190
> >> >> (XEN) HIFAR: 00000000
> >> >> (XEN)
> >> >> (XEN) Xen stack trace from sp=4bfd7eb4:
> >> >> (XEN) 0026431c 4bfd7efc 00247a54 00000024 002e6608 002e6608 00000097 00000001
> >> >> (XEN) 00000000 4bfd7f54 40017000 40005f60 40017014 4bfd7f58 00000019 00000000
> >> >> (XEN) 40005f60 4bfd7f24 00248e60 00000009 00000019 00404000 4bfd7f58 00000000
> >> >> (XEN) 00405000 000045f0 002e7694 4bfd7f4c 00248978 c0079a90 00000097 00000097
> >> >> (XEN) 00000000 fa212000 ea80c900 00000001 c05b8a60 4bfd7f54 0024f4b8 4bfd7f58
> >> >> (XEN) 00251830 ea80c950 00000000 00000001 c0079a90 00000097 00000097 00000000
> >> >> (XEN) fa212000 ea80c900 00000001 c05b8a60 00000000 e9879e3c ffffffff b6efbca3
> >> >> (XEN) c03b29fc 60000193 9fffffe7 b6c0bbf0 c0607500 c03b3140 e9879eb8 c007680c
> >> >> (XEN) c060750c c03b32c0 c0607518 c03b3360 00000000 00000000 00000000 00000000
> >> >> (XEN) 00000000 00000000 3ff6bebf a0000113 800b0193 800b0093 40000193 00000000
> >> >> (XEN) ffeffbfe fedeefff fffd5ffe
> >> >> (XEN) Xen call trace:
> >> >> (XEN) [<00242c1c>] __warn+0x20/0x28 (PC)
> >> >> (XEN) [<00242c1c>] __warn+0x20/0x28 (LR)
> >> >> (XEN) [<00247a54>] maintenance_interrupt+0xfc/0x2f4
> >> >> (XEN) [<00248e60>] do_IRQ+0x138/0x198
> >> >> (XEN) [<00248978>] gic_interrupt+0x58/0xc0
> >> >> (XEN) [<0024f4b8>] do_trap_irq+0x10/0x14
> >> >> (XEN) [<00251830>] return_from_trap+0/0x4
> >> >> (XEN)
> >> >>
> >> >> Also, I am posting maintenance_interrupt() from my tree:
> >> >>
> >> >> static void maintenance_interrupt(int irq, void *dev_id, struct
> >> >> cpu_user_regs *regs)
> >> >> {
> >> >>     int i = 0, virq, pirq;
> >> >>     uint32_t lr;
> >> >>     struct vcpu *v = current;
> >> >>     uint64_t eisr = GICH[GICH_EISR0] | (((uint64_t) GICH[GICH_EISR1]) << 32);
> >> >>
> >> >>     while ((i = find_next_bit((const long unsigned int *) &eisr,
> >> >>                               64, i)) < 64) {
> >> >>         struct pending_irq *p, *n;
> >> >>         int cpu, eoi;
> >> >>
> >> >>         cpu = -1;
> >> >>         eoi = 0;
> >> >>
> >> >>         spin_lock_irq(&gic.lock);
> >> >>         lr = GICH[GICH_LR + i];
> >> >>         virq = lr & GICH_LR_VIRTUAL_MASK;
> >> >>
> >> >>         p = irq_to_pending(v, virq);
> >> >>         if ( p->desc != NULL ) {
> >> >>             p->desc->status &= ~IRQ_INPROGRESS;
> >> >>             /* Assume only one pcpu needs to EOI the irq */
> >> >>             cpu = p->desc->arch.eoi_cpu;
> >> >>             eoi = 1;
> >> >>             pirq = p->desc->irq;
> >> >>         }
> >> >>         if ( !atomic_dec_and_test(&p->inflight_cnt) )
> >> >>         {
> >> >>             /* Physical IRQ can't be reinjected */
> >> >>             WARN_ON(p->desc != NULL);
> >> >>             gic_set_lr(i, p->irq, GICH_LR_PENDING, p->priority);
> >> >>             spin_unlock_irq(&gic.lock);
> >> >>             i++;
> >> >>             continue;
> >> >>         }
> >> >>
> >> >>         GICH[GICH_LR + i] = 0;
> >> >>         clear_bit(i, &this_cpu(lr_mask));
> >> >>
> >> >>         if ( !list_empty(&v->arch.vgic.lr_pending) ) {
> >> >>             n = list_entry(v->arch.vgic.lr_pending.next, typeof(*n), lr_queue);
> >> >>             gic_set_lr(i, n->irq, GICH_LR_PENDING, n->priority);
> >> >>             list_del_init(&n->lr_queue);
> >> >>             set_bit(i, &this_cpu(lr_mask));
> >> >>         } else {
> >> >>             gic_inject_irq_stop();
> >> >>         }
> >> >>         spin_unlock_irq(&gic.lock);
> >> >>
> >> >>         spin_lock_irq(&v->arch.vgic.lock);
> >> >>         list_del_init(&p->inflight);
> >> >>         spin_unlock_irq(&v->arch.vgic.lock);
> >> >>
> >> >>         if ( eoi ) {
> >> >>             /* this is not racy because we can't receive another irq of the
> >> >>              * same type until we EOI it. */
> >> >>             if ( cpu == smp_processor_id() )
> >> >>                 gic_irq_eoi((void*)(uintptr_t)pirq);
> >> >>             else
> >> >>                 on_selected_cpus(cpumask_of(cpu),
> >> >>                                  gic_irq_eoi, (void*)(uintptr_t)pirq, 0);
> >> >>         }
> >> >>
> >> >>         i++;
> >> >>     }
> >> >> }
> >> >>
> >> >> Oleksandr Tyshchenko | Embedded Developer
> >> >> GlobalLogic

^ permalink raw reply [flat|nested] 48+ messages in thread
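Stefano's suggestion above to turn eoi_irq into a list or an array could look roughly like the following. This is only a hedged sketch in C, not actual Xen 4.4 code: the names pending_eoi, nr_pending_eoi, defer_eoi and flush_pending_eois, the MAX_PENDING_EOI bound, and the per-CPU array declaration style are all assumptions made for illustration.

/*
 * Hypothetical sketch: replace the single per-CPU eoi_irq slot with a
 * small per-CPU array, so one maintenance interrupt can defer the EOI
 * of several IRQs instead of asserting that at most one is pending.
 */
#define MAX_PENDING_EOI 4 /* assumption: bounded by the number of LRs */

static DEFINE_PER_CPU(unsigned int [MAX_PENDING_EOI], pending_eoi);
static DEFINE_PER_CPU(unsigned int, nr_pending_eoi);

static void defer_eoi(unsigned int pirq)
{
    unsigned int *nr = &this_cpu(nr_pending_eoi);

    /* Replaces the old ASSERT(this_cpu(eoi_irq) == NULL). */
    ASSERT(*nr < MAX_PENDING_EOI);
    this_cpu(pending_eoi)[(*nr)++] = pirq;
}

static void flush_pending_eois(void)
{
    unsigned int i;

    for ( i = 0; i < this_cpu(nr_pending_eoi); i++ )
        gic_irq_eoi((void *)(uintptr_t)this_cpu(pending_eoi)[i]);
    this_cpu(nr_pending_eoi) = 0;
}

Whether a fixed-size array is enough depends on exactly the question asked above: how many LR entries can need an EOI from a single maintenance interrupt.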
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
2014-01-30 17:18 ` Stefano Stabellini
@ 2014-01-30 19:54 ` Oleksandr Tyshchenko
2014-01-30 21:47 ` Julien Grall
0 siblings, 1 reply; 48+ messages in thread
From: Oleksandr Tyshchenko @ 2014-01-30 19:54 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Julien Grall, Ian Campbell, xen-devel

I moved to 4.4.0-rc1, which already has the necessary irq patches,
and applied only one patch ("cpumask_of(0) in gic_route_irq_to_guest").
I see that the hypervisor hangs very often. Unfortunately, I don't
currently have a debugger to localize the part of the code involved,
so I have to use prints, and it may take some time(

On Thu, Jan 30, 2014 at 7:18 PM, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
[..]

^ permalink raw reply [flat|nested] 48+ messages in thread
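As background to the deadlock being chased in this thread: one generic way to break a cross-CPU EOI deadlock on a call-function lock is for the waiting CPU to keep servicing incoming cross-calls while it spins. The following is only a hedged sketch of that idea with the body of on_selected_cpus elided; it is not the actual implementation.

/*
 * Sketch: a CPU that wants call_lock services any pending cross-call
 * while spinning, so the CPU currently holding the lock (and waiting
 * for us to run its function) can make progress.
 */
static void on_selected_cpus_sketch(const cpumask_t *selected,
                                    void (*func)(void *info), void *info,
                                    int wait)
{
    while ( !spin_trylock(&call_lock) )
        smp_call_function_interrupt(); /* no-op unless this CPU is selected */

    /* ... fill in call_data, send the IPIs, wait for the mask to clear ... */

    spin_unlock(&call_lock);
}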
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
2014-01-30 19:54 ` Oleksandr Tyshchenko
@ 2014-01-30 21:47 ` Julien Grall
2014-01-31 1:57 ` Oleksandr Tyshchenko
0 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2014-01-30 21:47 UTC (permalink / raw)
To: Oleksandr Tyshchenko, Stefano Stabellini; +Cc: Ian Campbell, xen-devel

Hello,

On 30/01/14 19:54, Oleksandr Tyshchenko wrote:
> I moved to 4.4.0-rc1, which already has the necessary irq patches,

Any specific reason to use 4.4.0-rc1 instead of 4.4.0-rc2? There are a
bunch of fixes (which should not be related to your current bug), such
as a TLB issue, foreign mapping, and a first attempt to fix the guest
cache issue.

> and applied only one patch ("cpumask_of(0) in gic_route_irq_to_guest").
> I see that the hypervisor hangs very often. Unfortunately, I don't
> currently have a debugger to localize the part of the code involved,
> so I have to use prints, and it may take some time(

What do you mean by hang? Do you have any output from Xen? What do you
run? Dom0 and a DomU?

Sincerely yours,

--
Julien Grall

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
2014-01-30 21:47 ` Julien Grall
@ 2014-01-31 1:57 ` Oleksandr Tyshchenko
0 siblings, 0 replies; 48+ messages in thread
From: Oleksandr Tyshchenko @ 2014-01-31 1:57 UTC (permalink / raw)
To: Julien Grall; +Cc: xen-devel, Ian Campbell, Stefano Stabellini

There is no specific reason) As I found out, we already have a local
tree based on 4.4.0-rc1. All the work of porting our local patches on
top of 4.4.0-rc1 (resolving conflicts, etc.) was done by my colleagues.
I saw that the range of commits you pointed to is present there, so I
just moved.

By "hang" I mean that the hypervisor gets stuck somewhere in interrupt
handling (it gets into an infinite loop trying to acquire a lock, or
waiting for an event), and as a result nothing works, as happened, for
example, in on_selected_cpus(). Of course we don't have any output from
it (the console is not working).

I run a domU; we have an operating system with a UI in domU. After
moving to 4.4.0-rc1, the hypervisor began to hang very often. I have
not yet identified where this is happening. These hangs occur when I
use the touchscreen while domU is running, so it somehow depends on the
touchscreen irq; I would even say on the "touchscreen interrupt rate".
I tried changing the interrupt priority and a few other things, but I
don't have any positive results yet. First I need to localize where the
deadlock happens.

On Thu, Jan 30, 2014 at 11:47 PM, Julien Grall <julien.grall@linaro.org> wrote:
[..]

^ permalink raw reply [flat|nested] 48+ messages in thread
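Since the hang has to be localized with prints, one common trick is a crude lock watchdog. The helper below is hypothetical (not in the Xen tree), its name is made up, and the threshold is arbitrary; it only illustrates the approach.

/*
 * Debugging aid sketch: a bounded spin that reports on the console when
 * a lock appears stuck, to help localize deadlocks without a debugger.
 */
static inline void spin_lock_shout(spinlock_t *lock, const char *what)
{
    unsigned long tries = 0;

    while ( !spin_trylock(lock) )
    {
        if ( ++tries == 100000000UL )
        {
            printk("CPU%d: still waiting for %s\n", smp_processor_id(), what);
            tries = 0;
        }
    }
}

Temporarily substituting suspect spin_lock() calls, e.g. spin_lock_shout(&gic.lock, "gic.lock"), narrows down which lock the stuck CPU is spinning on.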
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
2014-01-28 19:25 ` Oleksandr Tyshchenko
2014-01-29 10:56 ` Oleksandr Tyshchenko
@ 2014-01-29 13:12 ` Julien Grall
2014-01-29 18:55 ` Oleksandr Tyshchenko
1 sibling, 1 reply; 48+ messages in thread
From: Julien Grall @ 2014-01-29 13:12 UTC (permalink / raw)
To: Oleksandr Tyshchenko; +Cc: Stefano Stabellini, Ian Campbell, xen-devel

Hello Oleksandr,

On 28/01/14 19:25, Oleksandr Tyshchenko wrote:

[..]

> > Do you pass-through PPIs to dom0?
>
> If I understand correctly, PPIs are irqs 16 to 31.
> So yes, I do. I see the timer irqs and the maintenance irq, which are
> routed to both CPUs.

These IRQs are used by Xen, therefore they are emulated for dom0 and
domU. Xen won't EOI these IRQs in maintenance_interrupt.

> And I have printed all irqs which go through the gic_route_irq_to_guest
> and gic_route_irq functions:
> ...
> (XEN) GIC initialization:
> (XEN) gic_dist_addr=0000000048211000
> (XEN) gic_cpu_addr=0000000048212000
> (XEN) gic_hyp_addr=0000000048214000
> (XEN) gic_vcpu_addr=0000000048216000
> (XEN) gic_maintenance_irq=25
> (XEN) GIC: 192 lines, 2 cpus, secure (IID 0000043b).
> (XEN)
> (XEN) >>>>> gic_route_irq: irq: 25, cpu_mask: 00000001
> (XEN)
> (XEN) >>>>> gic_route_irq: irq: 30, cpu_mask: 00000001
> (XEN)
> (XEN) >>>>> gic_route_irq: irq: 26, cpu_mask: 00000001
> (XEN)
> (XEN) >>>>> gic_route_irq: irq: 27, cpu_mask: 00000001
> (XEN)
> (XEN) >>>>> gic_route_irq: irq: 104, cpu_mask: 00000001
> (XEN) Using scheduler: SMP Credit Scheduler (credit)
> (XEN) Allocated console ring of 16 KiB.
> (XEN) VFP implementer 0x41 architecture 4 part 0x30 variant 0xf rev 0x0
> (XEN) Bringing up CPU1
> (XEN)
> (XEN) >>>>> gic_route_irq: irq: 25, cpu_mask: 00000002
> (XEN)
> (XEN) >>>>> gic_route_irq: irq: 30, cpu_mask: 00000002
> (XEN)
> (XEN) >>>>> gic_route_irq: irq: 26, cpu_mask: 00000002
> (XEN)
> (XEN) >>>>> gic_route_irq: irq: 27, cpu_mask: 00000002
> (XEN) CPU 1 booted.
> (XEN) Brought up 2 CPUs
> (XEN) *** LOADING DOMAIN 0 ***
> (XEN) Populate P2M 0xc8000000->0xd0000000 (1:1 mapping for dom0)
> (XEN)
> (XEN) >>>>> gic_route_irq_to_guest: domid: 0, irq: 61, cpu: 0

[..]

> (XEN) >>>>> gic_route_irq_to_guest: domid: 1, irq: 61, cpu: 1

Not related to this patch series, but is it normal that you pass through
the same interrupt both to dom0 and domU?

There are a few other cases like that.

--
Julien Grall

^ permalink raw reply [flat|nested] 48+ messages in thread
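A cheap way to catch the double routing pointed out above would be a guard that refuses to route a physical IRQ which is already in use. This is a hedged sketch only; the exact field names and locking in the 4.4-era code may differ.

/*
 * Hypothetical guard for gic_route_irq_to_guest(): reject a physical
 * IRQ that already has an action installed, i.e. one that was already
 * routed to Xen or to another domain.
 */
static int check_irq_is_free(const struct irq_desc *desc)
{
    if ( desc->action != NULL )
        return -EBUSY;
    return 0;
}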
* Re: [PATCH v1 0/2] xen/arm: maintenance_interrupt SMP fix
2014-01-29 13:12 ` Julien Grall
@ 2014-01-29 18:55 ` Oleksandr Tyshchenko
0 siblings, 0 replies; 48+ messages in thread
From: Oleksandr Tyshchenko @ 2014-01-29 18:55 UTC (permalink / raw)
To: Julien Grall; +Cc: Stefano Stabellini, Ian Campbell, xen-devel

On Wed, Jan 29, 2014 at 3:12 PM, Julien Grall <julien.grall@linaro.org> wrote:
[..]
> Not related to this patch series, but is it normal that you pass through
> the same interrupt both to dom0 and domU?

No, it isn't. These interrupts are not actually used in both domains,
but we need to clean this up. Thank you for your attention.

> There are a few other cases like that.
>
> --
> Julien Grall

--
Oleksandr Tyshchenko | Embedded Developer
GlobalLogic

^ permalink raw reply [flat|nested] 48+ messages in thread
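For reference, the IRQ ranges discussed in this exchange follow the GIC architecture: SGIs are interrupt IDs 0-15, PPIs are 16-31, and SPIs start at 32. Small helpers make such checks explicit; the helpers themselves are illustrative, not taken from the Xen source.

/* GICv2 interrupt ID ranges: SGIs 0-15, PPIs 16-31, SPIs 32 and up. */
static inline int irq_is_ppi(unsigned int irq)
{
    return irq >= 16 && irq < 32;
}

static inline int irq_is_spi(unsigned int irq)
{
    return irq >= 32;
}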