linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] powerpc/xive: fix the fuzz value in xive_pick_irq_target()
@ 2017-08-02 16:43 Cédric Le Goater
  2017-08-02 21:57 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 9+ messages in thread
From: Cédric Le Goater @ 2017-08-02 16:43 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Nicholas Piggin, Cédric Le Goater

If xive_find_target_in_mask() fails to find a cpu, the fuzz value used
in xive_pick_irq_target() is decremented and reused in the last
returning call to xive_find_target_in_mask(). This can result in such
WARNINGs if the initial fuzz value is zero :

   [    0.094480] WARNING: CPU: 10 PID: 1 at ../arch/powerpc/sysdev/xive/common.c:476 xive_find_target_in_mask+0x110/0x2f0
   [    0.094486] Modules linked in:
   [    0.094491] CPU: 10 PID: 1 Comm: swapper/0 Not tainted 4.12.0+ #3
   [    0.094496] task: c0000003fae4f200 task.stack: c0000003fe108000
   [    0.094501] NIP: c00000000008a310 LR: c00000000008a2e4 CTR: 000000000072ca34
   [    0.094506] REGS: c0000003fe10b360 TRAP: 0700   Not tainted  (4.12.0+)
   [    0.094510] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
   [    0.094515]   CR: 88000222  XER: 20040008
   [    0.094521] CFAR: c00000000008a2cc SOFTE: 0
   [    0.094521] GPR00: c00000000008a274 c0000003fe10b5e0 c000000001428f00 0000000000000010
   [    0.094521] GPR04: 0000000000000010 0000000000000010 0000000000000010 0000000000000099
   [    0.094521] GPR08: 0000000000000010 0000000000000001 ffffffffffff0000 0000000000000000
   [    0.094521] GPR12: 0000000000000000 c00000000fff2d00 c00000000000d4d8 0000000000000000
   [    0.094521] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
   [    0.094521] GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000000b451e8
   [    0.094521] GPR24: 00000000ffffffff c000000001462354 0000000000000800 00000000000007ff
   [    0.094521] GPR28: c000000001462354 0000000000000010 c0000003f857e418 0000000000000010
   [    0.094580] NIP [c00000000008a310] xive_find_target_in_mask+0x110/0x2f0
   [    0.094585] LR [c00000000008a2e4] xive_find_target_in_mask+0xe4/0x2f0
   [    0.094589] Call Trace:
   [    0.094593] [c0000003fe10b5e0] [c00000000008a274] xive_find_target_in_mask+0x74/0x2f0 (unreliable)
   [    0.094601] [c0000003fe10b690] [c00000000008abf0] xive_pick_irq_target.isra.1+0x200/0x230
   [    0.094608] [c0000003fe10b830] [c00000000008b250] xive_irq_startup+0x60/0x180
   [    0.094614] [c0000003fe10b8b0] [c0000000001608f0] irq_startup+0x70/0xd0
   [    0.094620] [c0000003fe10b8f0] [c00000000015df7c] __setup_irq+0x7bc/0x880
   [    0.094626] [c0000003fe10ba90] [c00000000015e30c] request_threaded_irq+0x14c/0x2c0
   [    0.094632] [c0000003fe10baf0] [c0000000000aeb00] request_event_sources_irqs+0x100/0x180
   [    0.094639] [c0000003fe10bc10] [c000000000e7d2f8] __machine_initcall_pseries_init_ras_IRQ+0x104/0x134
   [    0.094646] [c0000003fe10bc40] [c00000000000cc88] do_one_initcall+0x68/0x1d0
   [    0.094652] [c0000003fe10bd00] [c000000000e643c8] kernel_init_freeable+0x290/0x374
   [    0.094658] [c0000003fe10bdc0] [c00000000000d4f4] kernel_init+0x24/0x170
   [    0.094664] [c0000003fe10be30] [c00000000000b268] ret_from_kernel_thread+0x5c/0x74
   [    0.094669] Instruction dump:
   [    0.094673] 48586529 60000000 e8dc0002 393f0001 7f9b4800 7c7d07b4 7d3f07b4 409effcc
   [    0.094682] 7f9d3000 7d26e850 79290fe0 69290001 <0b090000> 409c0194 3f620004 3b7b8ec8

Fix this problem by checking the fuzz value before decrementing it.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/sysdev/xive/common.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 6595462b1fc8..50ce48983340 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -516,7 +516,8 @@ static int xive_pick_irq_target(struct irq_data *d,
 		free_cpumask_var(mask);
 		if (cpu >= 0)
 			return cpu;
-		fuzz--;
+		if (fuzz)
+			fuzz--;
 	}
 
 	/* No chip IDs, fallback to using the affinity mask */
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/xive: fix the fuzz value in xive_pick_irq_target()
  2017-08-02 16:43 [PATCH] powerpc/xive: fix the fuzz value in xive_pick_irq_target() Cédric Le Goater
@ 2017-08-02 21:57 ` Benjamin Herrenschmidt
  2017-08-03  0:01   ` Michael Ellerman
  2017-08-03  7:45   ` Cédric Le Goater
  0 siblings, 2 replies; 9+ messages in thread
From: Benjamin Herrenschmidt @ 2017-08-02 21:57 UTC (permalink / raw)
  To: Cédric Le Goater, linuxppc-dev
  Cc: Michael Ellerman, Paul Mackerras, Nicholas Piggin

On Wed, 2017-08-02 at 18:43 +0200, Cédric Le Goater wrote:
> If xive_find_target_in_mask() fails to find a cpu, the fuzz value used
> in xive_pick_irq_target() is decremented and reused in the last
> returning call to xive_find_target_in_mask(). This can result in such
> WARNINGs if the initial fuzz value is zero :

Ah indeed ... would have worked better if "fuzz" had been unsigned.

>    [    0.094480] WARNING: CPU: 10 PID: 1 at ../arch/powerpc/sysdev/xive/common.c:476 xive_find_target_in_mask+0x110/0x2f0
>    [    0.094486] Modules linked in:
>    [    0.094491] CPU: 10 PID: 1 Comm: swapper/0 Not tainted 4.12.0+ #3
>    [    0.094496] task: c0000003fae4f200 task.stack: c0000003fe108000
>    [    0.094501] NIP: c00000000008a310 LR: c00000000008a2e4 CTR: 000000000072ca34
>    [    0.094506] REGS: c0000003fe10b360 TRAP: 0700   Not tainted  (4.12.0+)
>    [    0.094510] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
>    [    0.094515]   CR: 88000222  XER: 20040008
>    [    0.094521] CFAR: c00000000008a2cc SOFTE: 0
>    [    0.094521] GPR00: c00000000008a274 c0000003fe10b5e0 c000000001428f00 0000000000000010
>    [    0.094521] GPR04: 0000000000000010 0000000000000010 0000000000000010 0000000000000099
>    [    0.094521] GPR08: 0000000000000010 0000000000000001 ffffffffffff0000 0000000000000000
>    [    0.094521] GPR12: 0000000000000000 c00000000fff2d00 c00000000000d4d8 0000000000000000
>    [    0.094521] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>    [    0.094521] GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000000b451e8
>    [    0.094521] GPR24: 00000000ffffffff c000000001462354 0000000000000800 00000000000007ff
>    [    0.094521] GPR28: c000000001462354 0000000000000010 c0000003f857e418 0000000000000010
>    [    0.094580] NIP [c00000000008a310] xive_find_target_in_mask+0x110/0x2f0
>    [    0.094585] LR [c00000000008a2e4] xive_find_target_in_mask+0xe4/0x2f0
>    [    0.094589] Call Trace:
>    [    0.094593] [c0000003fe10b5e0] [c00000000008a274] xive_find_target_in_mask+0x74/0x2f0 (unreliable)
>    [    0.094601] [c0000003fe10b690] [c00000000008abf0] xive_pick_irq_target.isra.1+0x200/0x230
>    [    0.094608] [c0000003fe10b830] [c00000000008b250] xive_irq_startup+0x60/0x180
>    [    0.094614] [c0000003fe10b8b0] [c0000000001608f0] irq_startup+0x70/0xd0
>    [    0.094620] [c0000003fe10b8f0] [c00000000015df7c] __setup_irq+0x7bc/0x880
>    [    0.094626] [c0000003fe10ba90] [c00000000015e30c] request_threaded_irq+0x14c/0x2c0
>    [    0.094632] [c0000003fe10baf0] [c0000000000aeb00] request_event_sources_irqs+0x100/0x180
>    [    0.094639] [c0000003fe10bc10] [c000000000e7d2f8] __machine_initcall_pseries_init_ras_IRQ+0x104/0x134
>    [    0.094646] [c0000003fe10bc40] [c00000000000cc88] do_one_initcall+0x68/0x1d0
>    [    0.094652] [c0000003fe10bd00] [c000000000e643c8] kernel_init_freeable+0x290/0x374
>    [    0.094658] [c0000003fe10bdc0] [c00000000000d4f4] kernel_init+0x24/0x170
>    [    0.094664] [c0000003fe10be30] [c00000000000b268] ret_from_kernel_thread+0x5c/0x74
>    [    0.094669] Instruction dump:
>    [    0.094673] 48586529 60000000 e8dc0002 393f0001 7f9b4800 7c7d07b4 7d3f07b4 409effcc
>    [    0.094682] 7f9d3000 7d26e850 79290fe0 69290001 <0b090000> 409c0194 3f620004 3b7b8ec8
> 
> Fix this problem by checking the fuzz value before decrementing it.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  arch/powerpc/sysdev/xive/common.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index 6595462b1fc8..50ce48983340 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -516,7 +516,8 @@ static int xive_pick_irq_target(struct irq_data *d,
>  		free_cpumask_var(mask);
>  		if (cpu >= 0)
>  			return cpu;
> -		fuzz--;
> +		if (fuzz)
> +			fuzz--;
>  	}
>  
>  	/* No chip IDs, fallback to using the affinity mask */

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/xive: fix the fuzz value in xive_pick_irq_target()
  2017-08-02 21:57 ` Benjamin Herrenschmidt
@ 2017-08-03  0:01   ` Michael Ellerman
  2017-08-03  0:58     ` Benjamin Herrenschmidt
  2017-08-03  7:45   ` Cédric Le Goater
  1 sibling, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2017-08-03  0:01 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Cédric Le Goater, linuxppc-dev
  Cc: Paul Mackerras, Nicholas Piggin

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Wed, 2017-08-02 at 18:43 +0200, C=C3=A9dric Le Goater wrote:
>> If xive_find_target_in_mask() fails to find a cpu, the fuzz value used
>> in xive_pick_irq_target() is decremented and reused in the last
>> returning call to xive_find_target_in_mask(). This can result in such
>> WARNINGs if the initial fuzz value is zero :
>
> Ah indeed ... would have worked better if "fuzz" had been unsigned.

Is that an ack or a changes requested?

chers

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/xive: fix the fuzz value in xive_pick_irq_target()
  2017-08-03  0:01   ` Michael Ellerman
@ 2017-08-03  0:58     ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 9+ messages in thread
From: Benjamin Herrenschmidt @ 2017-08-03  0:58 UTC (permalink / raw)
  To: Michael Ellerman, Cédric Le Goater, linuxppc-dev
  Cc: Paul Mackerras, Nicholas Piggin

On Thu, 2017-08-03 at 10:01 +1000, Michael Ellerman wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> 
> > On Wed, 2017-08-02 at 18:43 +0200, Cédric Le Goater wrote:
> > > If xive_find_target_in_mask() fails to find a cpu, the fuzz value used
> > > in xive_pick_irq_target() is decremented and reused in the last
> > > returning call to xive_find_target_in_mask(). This can result in such
> > > WARNINGs if the initial fuzz value is zero :
> > 
> > Ah indeed ... would have worked better if "fuzz" had been unsigned.
> 
> Is that an ack or a changes requested?

Either ;-) The original code would have been fine with an unsigned, I
didn't realize it was signed and going negative. That said, cedric
patch is fine.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/xive: fix the fuzz value in xive_pick_irq_target()
  2017-08-02 21:57 ` Benjamin Herrenschmidt
  2017-08-03  0:01   ` Michael Ellerman
@ 2017-08-03  7:45   ` Cédric Le Goater
  2017-08-03  8:02     ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 9+ messages in thread
From: Cédric Le Goater @ 2017-08-03  7:45 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, linuxppc-dev
  Cc: Michael Ellerman, Paul Mackerras, Nicholas Piggin

On 08/02/2017 11:57 PM, Benjamin Herrenschmidt wrote:
> On Wed, 2017-08-02 at 18:43 +0200, Cédric Le Goater wrote:
>> If xive_find_target_in_mask() fails to find a cpu, the fuzz value used
>> in xive_pick_irq_target() is decremented and reused in the last
>> returning call to xive_find_target_in_mask(). This can result in such
>> WARNINGs if the initial fuzz value is zero :
> 
> Ah indeed ... would have worked better if "fuzz" had been unsigned.

but 'fuzz' is unsigned ! 

With a -1, unsigned or not, the 'first' cpu  becomes out of range for
the calculation below :

	/* Pick up a starting point CPU in the mask based on  fuzz */
	num = cpumask_weight(mask);
	first = fuzz % num;

	/* Locate it */
	cpu = cpumask_first(mask);
	for (i = 0; i < first && cpu < nr_cpu_ids; i++)
		cpu = cpumask_next(cpu, mask);

May be there is a better fix ? 


Also, I am not sure of :

	num = cpumask_weight(mask);

shouldn't we be using : 

	num = nr_cpu_ids;

In that case, 'first' would have been in the cpu range.

Cheers,

C.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/xive: fix the fuzz value in xive_pick_irq_target()
  2017-08-03  7:45   ` Cédric Le Goater
@ 2017-08-03  8:02     ` Benjamin Herrenschmidt
  2017-08-03  9:52       ` Michael Ellerman
  0 siblings, 1 reply; 9+ messages in thread
From: Benjamin Herrenschmidt @ 2017-08-03  8:02 UTC (permalink / raw)
  To: Cédric Le Goater, linuxppc-dev
  Cc: Michael Ellerman, Paul Mackerras, Nicholas Piggin

On Thu, 2017-08-03 at 09:45 +0200, Cédric Le Goater wrote:
> On 08/02/2017 11:57 PM, Benjamin Herrenschmidt wrote:
> > On Wed, 2017-08-02 at 18:43 +0200, Cédric Le Goater wrote:
> > > If xive_find_target_in_mask() fails to find a cpu, the fuzz value used
> > > in xive_pick_irq_target() is decremented and reused in the last
> > > returning call to xive_find_target_in_mask(). This can result in such
> > > WARNINGs if the initial fuzz value is zero :
> > 
> > Ah indeed ... would have worked better if "fuzz" had been unsigned.
> 
> but 'fuzz' is unsigned ! 

Haha right.

> With a -1, unsigned or not, the 'first' cpu  becomes out of range for
> the calculation below :
> 
> 	/* Pick up a starting point CPU in the mask based on  fuzz */
> 	num = cpumask_weight(mask);
> 	first = fuzz % num;

How can it ? fuzz % num should then return something that's

	0 <= first < num

Regardless of the value of fuzz.

Which means we should be able to locate it. The only case I could think
of that would fail would be if num is 0

> 	/* Locate it */
> 	cpu = cpumask_first(mask);
> 	for (i = 0; i < first && cpu < nr_cpu_ids; i++)
> 		cpu = cpumask_next(cpu, mask);
> 
> May be there is a better fix ? 
> 

> Also, I am not sure of :
> 
> 	num = cpumask_weight(mask);
> 
> shouldn't we be using : 
> 
> 	num = nr_cpu_ids;
> 
> In that case, 'first' would have been in the cpu range.

No that's the whole point. If we did that, then we would go out of the
mask.

The basic idea is that the mask contains "num" bits set, and we want to
pick one of them. But those bits can be bit 0, 5, 12 ... while
nr_cpu_ids can be 96 for example.

So for example, if the bits set as above, and fuzz is 5, we have

num is 3 (3 bits set in the mask)

first will then be 5 % 3 which is 2. That means that we want to pick
the "2th 0-based" ie the 3rd bit in the mask as our tentative target.

So the loop will iterate all the bits in the mask until i reaches
first, which is 2. So it will start with cpu = 0 i = 0, then cpu = 5 i
= 1, then cpu = 12 i = 2 and will exit then.

Ben.

> Cheers,
> 
> C.
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/xive: fix the fuzz value in xive_pick_irq_target()
  2017-08-03  8:02     ` Benjamin Herrenschmidt
@ 2017-08-03  9:52       ` Michael Ellerman
  2017-08-03 11:52         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2017-08-03  9:52 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Cédric Le Goater, linuxppc-dev
  Cc: Paul Mackerras, Nicholas Piggin

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Thu, 2017-08-03 at 09:45 +0200, C=C3=A9dric Le Goater wrote:
>> On 08/02/2017 11:57 PM, Benjamin Herrenschmidt wrote:
>> > On Wed, 2017-08-02 at 18:43 +0200, C=C3=A9dric Le Goater wrote:
>> > > If xive_find_target_in_mask() fails to find a cpu, the fuzz value us=
ed
>> > > in xive_pick_irq_target() is decremented and reused in the last
>> > > returning call to xive_find_target_in_mask(). This can result in such
>> > > WARNINGs if the initial fuzz value is zero :
>> >=20
>> > Ah indeed ... would have worked better if "fuzz" had been unsigned.
>>=20
>> but 'fuzz' is unsigned !=20
>
> Haha right.
>
>> With a -1, unsigned or not, the 'first' cpu  becomes out of range for
>> the calculation below :
>>=20
>> 	/* Pick up a starting point CPU in the mask based on  fuzz */
>> 	num =3D cpumask_weight(mask);
>> 	first =3D fuzz % num;
>
> How can it ? fuzz % num should then return something that's
>
> 	0 <=3D first < num
>
> Regardless of the value of fuzz.

What if num is 0?

Which it would be in the fallback case, if the affinity mask is empty,
AFAICS.

cheers

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/xive: fix the fuzz value in xive_pick_irq_target()
  2017-08-03  9:52       ` Michael Ellerman
@ 2017-08-03 11:52         ` Benjamin Herrenschmidt
  2017-08-03 12:55           ` Cédric Le Goater
  0 siblings, 1 reply; 9+ messages in thread
From: Benjamin Herrenschmidt @ 2017-08-03 11:52 UTC (permalink / raw)
  To: Michael Ellerman, Cédric Le Goater, linuxppc-dev
  Cc: Paul Mackerras, Nicholas Piggin

On Thu, 2017-08-03 at 19:52 +1000, Michael Ellerman wrote:
> What if num is 0?
> 
> Which it would be in the fallback case, if the affinity mask is empty,
> AFAICS.

How can the mask be empty though ? But yes, as I noted there is a
problem if the value is 0. Not sure what to do if we are given an empty
mask though.

Ben.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/xive: fix the fuzz value in xive_pick_irq_target()
  2017-08-03 11:52         ` Benjamin Herrenschmidt
@ 2017-08-03 12:55           ` Cédric Le Goater
  0 siblings, 0 replies; 9+ messages in thread
From: Cédric Le Goater @ 2017-08-03 12:55 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Michael Ellerman, linuxppc-dev
  Cc: Paul Mackerras, Nicholas Piggin

On 08/03/2017 01:52 PM, Benjamin Herrenschmidt wrote:
> On Thu, 2017-08-03 at 19:52 +1000, Michael Ellerman wrote:
>> What if num is 0?
>>
>> Which it would be in the fallback case, if the affinity mask is empty,
>> AFAICS.
> 
> How can the mask be empty though ? But yes, as I noted there is a
> problem if the value is 0. Not sure what to do if we are given an empty
> mask though.

I am seeing different problems. 

The fuzz value was one but that was due to the fact that the spapr 
backend did not set the chip_id of the xive_irq_data. I should have 
fix that now. But in some other situations, the weight of the 
'affinity' mask is 2048 (and 'nr_cpu_ids') and that also breaks 
the 'first' calculation. 

I am currently tracking this with the xive support for spapr. I will 
send a fix with the patchset.

Thanks,

C.   

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-08-03 17:42 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-08-02 16:43 [PATCH] powerpc/xive: fix the fuzz value in xive_pick_irq_target() Cédric Le Goater
2017-08-02 21:57 ` Benjamin Herrenschmidt
2017-08-03  0:01   ` Michael Ellerman
2017-08-03  0:58     ` Benjamin Herrenschmidt
2017-08-03  7:45   ` Cédric Le Goater
2017-08-03  8:02     ` Benjamin Herrenschmidt
2017-08-03  9:52       ` Michael Ellerman
2017-08-03 11:52         ` Benjamin Herrenschmidt
2017-08-03 12:55           ` Cédric Le Goater

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).