linux-media.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* IRQ behaivour has been changed from v4.14 to v4.15-rc1
@ 2017-12-28 17:17 Shevchenko, Andriy
  2017-12-28 17:21 ` Thomas Gleixner
  2017-12-28 17:23 ` Andy Shevchenko
  0 siblings, 2 replies; 14+ messages in thread
From: Shevchenko, Andriy @ 2017-12-28 17:17 UTC (permalink / raw)
  To: tglx@linutronix.de, alan@linux.intel.com, Ailus, Sakari
  Cc: linux-kernel@vger.kernel.org, linux-media@vger.kernel.org

Hi!

Experimenting with AtomISP (yes, code is ugly and MSI handling rather
hackish, though...).

So, with v4.14 base:

[   33.639224] atomisp-isp2 0000:00:03.0: Start stream on pad 1 for asd0
[   33.652355] atomisp-isp2 0000:00:03.0: irq:0x20
[   33.662456] atomisp-isp2 0000:00:03.0: irq:0x20
[   33.698064] atomisp-isp2 0000:00:03.0: stream[0] started.

Ctrl+C

[   48.185643] atomisp-isp2 0000:00:03.0: <atomisp_dqbuf: -512
[   48.204641] atomisp-isp2 0000:00:03.0: release device ATOMISP ISP
CAPTURE output
...

and machine still alive.


With v4.15-rc1 base (basically your branch + some my hack patches) the
IR
Q behaviour changed, i.e. I have got:


[   85.167061] spurious APIC interrupt through vector ff on CPU#0,
should never happen.
[   85.199886] atomisp-isp2 0000:00:03.0: stream[0] started.

and Ctrl+C does NOT work. Machine just hangs.

It might be related to this:
https://lkml.org/lkml/2017/12/22/697

Any comments, Thomas?

-- 
Andy Shevchenko <andriy.shevchenko@intel.com>
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: IRQ behaivour has been changed from v4.14 to v4.15-rc1
  2017-12-28 17:17 IRQ behaivour has been changed from v4.14 to v4.15-rc1 Shevchenko, Andriy
@ 2017-12-28 17:21 ` Thomas Gleixner
  2017-12-28 17:34   ` Andy Shevchenko
  2017-12-28 17:23 ` Andy Shevchenko
  1 sibling, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2017-12-28 17:21 UTC (permalink / raw)
  To: Shevchenko, Andriy
  Cc: alan@linux.intel.com, Ailus, Sakari, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org

On Thu, 28 Dec 2017, Shevchenko, Andriy wrote:

> Hi!
> 
> Experimenting with AtomISP (yes, code is ugly and MSI handling rather
> hackish, though...).
> 
> So, with v4.14 base:
> 
> [   33.639224] atomisp-isp2 0000:00:03.0: Start stream on pad 1 for asd0
> [   33.652355] atomisp-isp2 0000:00:03.0: irq:0x20
> [   33.662456] atomisp-isp2 0000:00:03.0: irq:0x20
> [   33.698064] atomisp-isp2 0000:00:03.0: stream[0] started.
> 
> Ctrl+C
> 
> [   48.185643] atomisp-isp2 0000:00:03.0: <atomisp_dqbuf: -512
> [   48.204641] atomisp-isp2 0000:00:03.0: release device ATOMISP ISP
> CAPTURE output
> ...
> 
> and machine still alive.
> 
> 
> With v4.15-rc1 base (basically your branch + some my hack patches) the
> IR
> Q behaviour changed, i.e. I have got:
> 
> 
> [   85.167061] spurious APIC interrupt through vector ff on CPU#0,
> should never happen.
> [   85.199886] atomisp-isp2 0000:00:03.0: stream[0] started.
> 
> and Ctrl+C does NOT work. Machine just hangs.
> 
> It might be related to this:
> https://lkml.org/lkml/2017/12/22/697

I don't think so.

Does the patch below cure it?

Thanks,

	tglx
8<-----------------
 arch/x86/kernel/apic/apic_flat_64.c   |    2 +-
 arch/x86/kernel/apic/probe_32.c       |    2 +-
 arch/x86/kernel/apic/x2apic_cluster.c |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -151,7 +151,7 @@ static struct apic apic_flat __ro_after_
 	.apic_id_valid			= default_apic_id_valid,
 	.apic_id_registered		= flat_apic_id_registered,
 
-	.irq_delivery_mode		= dest_LowestPrio,
+	.irq_delivery_mode		= dest_Fixed,
 	.irq_dest_mode			= 1, /* logical */
 
 	.disable_esr			= 0,
--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -105,7 +105,7 @@ static struct apic apic_default __ro_aft
 	.apic_id_valid			= default_apic_id_valid,
 	.apic_id_registered		= default_apic_id_registered,
 
-	.irq_delivery_mode		= dest_LowestPrio,
+	.irq_delivery_mode		= dest_Fixed,
 	/* logical delivery broadcast to all CPUs: */
 	.irq_dest_mode			= 1,
 
--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -184,7 +184,7 @@ static struct apic apic_x2apic_cluster _
 	.apic_id_valid			= x2apic_apic_id_valid,
 	.apic_id_registered		= x2apic_apic_id_registered,
 
-	.irq_delivery_mode		= dest_LowestPrio,
+	.irq_delivery_mode		= dest_Fixed,
 	.irq_dest_mode			= 1, /* logical */
 
 	.disable_esr			= 0,

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: IRQ behaivour has been changed from v4.14 to v4.15-rc1
  2017-12-28 17:17 IRQ behaivour has been changed from v4.14 to v4.15-rc1 Shevchenko, Andriy
  2017-12-28 17:21 ` Thomas Gleixner
@ 2017-12-28 17:23 ` Andy Shevchenko
  1 sibling, 0 replies; 14+ messages in thread
From: Andy Shevchenko @ 2017-12-28 17:23 UTC (permalink / raw)
  To: Ailus, Sakari, Thomas Gleixner, alan
  Cc: linux-media, linux-kernel@vger.kernel.org

On Thu, 2017-12-28 at 19:17 +0200, Andy Shevchenko wrote:
> Hi!
> 
> Experimenting with AtomISP (yes, code is ugly and MSI handling rather
> hackish, though...).
> 
> So, with v4.14 base:

See additional note below.

> 
> [   33.639224] atomisp-isp2 0000:00:03.0: Start stream on pad 1 for
> asd0
> [   33.652355] atomisp-isp2 0000:00:03.0: irq:0x20
> [   33.662456] atomisp-isp2 0000:00:03.0: irq:0x20
> [   33.698064] atomisp-isp2 0000:00:03.0: stream[0] started.
> 
> Ctrl+C
> 
> [   48.185643] atomisp-isp2 0000:00:03.0: <atomisp_dqbuf: -512
> [   48.204641] atomisp-isp2 0000:00:03.0: release device ATOMISP ISP
> CAPTURE output
> ...
> 
> and machine still alive.
> 
> 
> With v4.15-rc1 base (basically your branch + some my hack patches)

Needs a bit of elaboration:
a) nothing had been changed WRT AtomISP driver or media stuff, under
"your branch" one reads Sakari's media_tree.git/atomisp branch;
b) my hack patches has nothing to do with anything except AtomISP
itself;
c) v4.14 base required media/v4.15-1 tag to be merged as well.

>  the IRQ behaviour changed, i.e. I have got:
> 
> 
> [   85.167061] spurious APIC interrupt through vector ff on CPU#0,
> should never happen.
> [   85.199886] atomisp-isp2 0000:00:03.0: stream[0] started.
> 
> and Ctrl+C does NOT work. Machine just hangs.
> 
> It might be related to this:
> https://lkml.org/lkml/2017/12/22/697
> 
> Any comments, Thomas?
> 

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: IRQ behaivour has been changed from v4.14 to v4.15-rc1
  2017-12-28 17:21 ` Thomas Gleixner
@ 2017-12-28 17:34   ` Andy Shevchenko
  2017-12-28 17:44     ` Thomas Gleixner
  0 siblings, 1 reply; 14+ messages in thread
From: Andy Shevchenko @ 2017-12-28 17:34 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: alan@linux.intel.com, Ailus, Sakari, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org

On Thu, 2017-12-28 at 18:21 +0100, Thomas Gleixner wrote:
> On Thu, 28 Dec 2017, Shevchenko, Andriy wrote:
> 
> > Hi!
> > 
> > Experimenting with AtomISP (yes, code is ugly and MSI handling
> > rather
> > hackish, though...).
> > 
> > So, with v4.14 base:
> > 
> > [   33.639224] atomisp-isp2 0000:00:03.0: Start stream on pad 1 for
> > asd0
> > [   33.652355] atomisp-isp2 0000:00:03.0: irq:0x20
> > [   33.662456] atomisp-isp2 0000:00:03.0: irq:0x20
> > [   33.698064] atomisp-isp2 0000:00:03.0: stream[0] started.
> > 
> > Ctrl+C
> > 
> > [   48.185643] atomisp-isp2 0000:00:03.0: <atomisp_dqbuf: -512
> > [   48.204641] atomisp-isp2 0000:00:03.0: release device ATOMISP ISP
> > CAPTURE output
> > ...
> > 
> > and machine still alive.
> > 
> > 
> > With v4.15-rc1 base (basically your branch + some my hack patches)
> > the
> > IR
> > Q behaviour changed, i.e. I have got:
> > 
> > 
> > [   85.167061] spurious APIC interrupt through vector ff on CPU#0,
> > should never happen.
> > [   85.199886] atomisp-isp2 0000:00:03.0: stream[0] started.
> > 
> > and Ctrl+C does NOT work. Machine just hangs.
> > 
> > It might be related to this:
> > https://lkml.org/lkml/2017/12/22/697
> 
> I don't think so.
> 
> Does the patch below cure it?

Unfortunately, no.

Same behaviour.

Tell me if I need to provide some debug before it hangs. For now I have
apic=debug (AFAIR it doesn't affect this).

> 
> Thanks,
> 
> 	tglx
> 8<-----------------
>  arch/x86/kernel/apic/apic_flat_64.c   |    2 +-
>  arch/x86/kernel/apic/probe_32.c       |    2 +-
>  arch/x86/kernel/apic/x2apic_cluster.c |    2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> --- a/arch/x86/kernel/apic/apic_flat_64.c
> +++ b/arch/x86/kernel/apic/apic_flat_64.c
> @@ -151,7 +151,7 @@ static struct apic apic_flat __ro_after_
>  	.apic_id_valid			=
> default_apic_id_valid,
>  	.apic_id_registered		= flat_apic_id_registered,
>  
> -	.irq_delivery_mode		= dest_LowestPrio,
> +	.irq_delivery_mode		= dest_Fixed,
>  	.irq_dest_mode			= 1, /* logical */
>  
>  	.disable_esr			= 0,
> --- a/arch/x86/kernel/apic/probe_32.c
> +++ b/arch/x86/kernel/apic/probe_32.c
> @@ -105,7 +105,7 @@ static struct apic apic_default __ro_aft
>  	.apic_id_valid			=
> default_apic_id_valid,
>  	.apic_id_registered		=
> default_apic_id_registered,
>  
> -	.irq_delivery_mode		= dest_LowestPrio,
> +	.irq_delivery_mode		= dest_Fixed,
>  	/* logical delivery broadcast to all CPUs: */
>  	.irq_dest_mode			= 1,
>  
> --- a/arch/x86/kernel/apic/x2apic_cluster.c
> +++ b/arch/x86/kernel/apic/x2apic_cluster.c
> @@ -184,7 +184,7 @@ static struct apic apic_x2apic_cluster _
>  	.apic_id_valid			= x2apic_apic_id_valid,
>  	.apic_id_registered		=
> x2apic_apic_id_registered,
>  
> -	.irq_delivery_mode		= dest_LowestPrio,
> +	.irq_delivery_mode		= dest_Fixed,
>  	.irq_dest_mode			= 1, /* logical */
>  
>  	.disable_esr			= 0,
> 
> 

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: IRQ behaivour has been changed from v4.14 to v4.15-rc1
  2017-12-28 17:34   ` Andy Shevchenko
@ 2017-12-28 17:44     ` Thomas Gleixner
  2017-12-28 19:31       ` Andy Shevchenko
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2017-12-28 17:44 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: alan@linux.intel.com, Ailus, Sakari, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org

On Thu, 28 Dec 2017, Andy Shevchenko wrote:
> On Thu, 2017-12-28 at 18:21 +0100, Thomas Gleixner wrote:
> > > [   85.167061] spurious APIC interrupt through vector ff on CPU#0,
> > > should never happen.
> > > [   85.199886] atomisp-isp2 0000:00:03.0: stream[0] started.
> > > 
> > > and Ctrl+C does NOT work. Machine just hangs.
> > > 
> > > It might be related to this:
> > > https://lkml.org/lkml/2017/12/22/697
> > 
> > I don't think so.
> > 
> > Does the patch below cure it?
> 
> Unfortunately, no.
> 
> Same behaviour.
> 
> Tell me if I need to provide some debug before it hangs. For now I have
> apic=debug (AFAIR it doesn't affect this).

The interesting question is why that spurious APIC interrupt through vector
ff happens. Can you please add the following to the kernel command line:

 'trace_events=irq_vectors:* ftrace_dump_on_oops'

and apply the patch below. That should spill out the trace over serial (I
hope you have that ...)

Thanks,

	tglx

8<---------------
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1978,6 +1978,8 @@ void __init register_lapic_address(unsig
 	/* see sw-dev-man vol 3, chapter 7.4.13.5 */
 	pr_info("spurious APIC interrupt through vector %02x on CPU#%d, "
 		"should never happen.\n", vector, smp_processor_id());
+	tracing_off();
+	BUG();
 
 	trace_spurious_apic_exit(vector);
 	exiting_irq();

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: IRQ behaivour has been changed from v4.14 to v4.15-rc1
  2017-12-28 17:44     ` Thomas Gleixner
@ 2017-12-28 19:31       ` Andy Shevchenko
  2017-12-28 19:36         ` Andy Shevchenko
  2017-12-28 20:18         ` Thomas Gleixner
  0 siblings, 2 replies; 14+ messages in thread
From: Andy Shevchenko @ 2017-12-28 19:31 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: alan@linux.intel.com, Ailus, Sakari, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org

On Thu, 2017-12-28 at 18:44 +0100, Thomas Gleixner wrote:
> On Thu, 28 Dec 2017, Andy Shevchenko wrote:
> > On Thu, 2017-12-28 at 18:21 +0100, Thomas Gleixner wrote:
> > > > [   85.167061] spurious APIC interrupt through vector ff on
> > > > CPU#0,
> > > > should never happen.
> > > > [   85.199886] atomisp-isp2 0000:00:03.0: stream[0] started.
> > > > 
> > > > and Ctrl+C does NOT work. Machine just hangs.
> > > > 
> > > > It might be related to this:
> > > > https://lkml.org/lkml/2017/12/22/697
> > > 
> > > I don't think so.
> > > 
> > > Does the patch below cure it?
> > 
> > Unfortunately, no.
> > 
> > Same behaviour.
> > 
> > Tell me if I need to provide some debug before it hangs. For now I
> > have
> > apic=debug (AFAIR it doesn't affect this).
> 
> The interesting question is why that spurious APIC interrupt through
> vector
> ff happens. Can you please add the following to the kernel command
> line:
> 
>  'trace_events=irq_vectors:* ftrace_dump_on_oops'
> 
> and apply the patch below. That should spill out the trace over serial
> (I
> hope you have that ...)

With or without CONFIG_FUNCTION_TRACER=y I have got


[   28.977918] Dumping ftrace buffer:
[   28.988115]    (ftrace buffer empty)

followed by BUG() output.

...
[   28.930154] RIP: 0010:smp_spurious_interrupt+0x67/0xb0
...


Anything I missed?

P.S. I didn't apply your previous patch.

> Thanks,
> 
> 	tglx
> 
> 8<---------------
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -1978,6 +1978,8 @@ void __init register_lapic_address(unsig
>  	/* see sw-dev-man vol 3, chapter 7.4.13.5 */
>  	pr_info("spurious APIC interrupt through vector %02x on
> CPU#%d, "
>  		"should never happen.\n", vector,
> smp_processor_id());
> +	tracing_off();
> +	BUG();
>  
>  	trace_spurious_apic_exit(vector);
>  	exiting_irq();
> 

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: IRQ behaivour has been changed from v4.14 to v4.15-rc1
  2017-12-28 19:31       ` Andy Shevchenko
@ 2017-12-28 19:36         ` Andy Shevchenko
  2017-12-28 20:18         ` Thomas Gleixner
  1 sibling, 0 replies; 14+ messages in thread
From: Andy Shevchenko @ 2017-12-28 19:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: alan@linux.intel.com, Ailus, Sakari, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org

On Thu, 2017-12-28 at 21:31 +0200, Andy Shevchenko wrote:

> Anything I missed?

Perhaps I could bisect, though it's not so trivial in this case, when I
have a little more time. I guess it might take up to ~16 steps. If you
can point to more narrow range, it would be great.

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: IRQ behaivour has been changed from v4.14 to v4.15-rc1
  2017-12-28 19:31       ` Andy Shevchenko
  2017-12-28 19:36         ` Andy Shevchenko
@ 2017-12-28 20:18         ` Thomas Gleixner
  2017-12-28 21:03           ` Andy Shevchenko
  1 sibling, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2017-12-28 20:18 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: alan@linux.intel.com, Ailus, Sakari, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org

On Thu, 28 Dec 2017, Andy Shevchenko wrote:
> On Thu, 2017-12-28 at 18:44 +0100, Thomas Gleixner wrote:
> > On Thu, 28 Dec 2017, Andy Shevchenko wrote:
> > > On Thu, 2017-12-28 at 18:21 +0100, Thomas Gleixner wrote:
> > > > > [   85.167061] spurious APIC interrupt through vector ff on
> > > > > CPU#0,
> > > > > should never happen.
> > > > > [   85.199886] atomisp-isp2 0000:00:03.0: stream[0] started.
> > > > > 
> > > > > and Ctrl+C does NOT work. Machine just hangs.
> > > > > 
> > > > > It might be related to this:
> > > > > https://lkml.org/lkml/2017/12/22/697
> > > > 
> > > > I don't think so.
> > > > 
> > > > Does the patch below cure it?
> > > 
> > > Unfortunately, no.
> > > 
> > > Same behaviour.
> > > 
> > > Tell me if I need to provide some debug before it hangs. For now I
> > > have
> > > apic=debug (AFAIR it doesn't affect this).
> > 
> > The interesting question is why that spurious APIC interrupt through
> > vector
> > ff happens. Can you please add the following to the kernel command
> > line:
> > 
> >  'trace_events=irq_vectors:* ftrace_dump_on_oops'
> > 
> > and apply the patch below. That should spill out the trace over serial
> > (I
> > hope you have that ...)
> 
> With or without CONFIG_FUNCTION_TRACER=y I have got
> 
> 
> [   28.977918] Dumping ftrace buffer:
> [   28.988115]    (ftrace buffer empty)
> 
> followed by BUG() output.
> 
> ...
> [   28.930154] RIP: 0010:smp_spurious_interrupt+0x67/0xb0
> ...
> 
> 
> Anything I missed?

Yes, you missed the typo in the command line. It should be:

 'trace_event=irq_vectors:* ftrace_dump_on_oops'

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: IRQ behaivour has been changed from v4.14 to v4.15-rc1
  2017-12-28 20:18         ` Thomas Gleixner
@ 2017-12-28 21:03           ` Andy Shevchenko
  2017-12-28 21:31             ` Thomas Gleixner
  0 siblings, 1 reply; 14+ messages in thread
From: Andy Shevchenko @ 2017-12-28 21:03 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: alan@linux.intel.com, Ailus, Sakari, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org

On Thu, 2017-12-28 at 21:18 +0100, Thomas Gleixner wrote:
> On Thu, 28 Dec 2017, Andy Shevchenko wrote:
> > On Thu, 2017-12-28 at 18:44 +0100, Thomas Gleixner wrote:
> > > On Thu, 28 Dec 2017, Andy Shevchenko wrote:
> > > > On Thu, 2017-12-28 at 18:21 +0100, Thomas Gleixner wrote:
> > > > > > [   85.167061] spurious APIC interrupt through vector ff on
> > > > > > CPU#0,
> > > > > > should never happen.
> > > > > > [   85.199886] atomisp-isp2 0000:00:03.0: stream[0] started.
> > > > > > 
> > > > > > and Ctrl+C does NOT work. Machine just hangs.
> > > > > > 
> > > > > > It might be related to this:
> > > > > > https://lkml.org/lkml/2017/12/22/697
> > > > > 
> > > > > I don't think so.
> > > > > 
> > > > > Does the patch below cure it?
> > > > 
> > > > Unfortunately, no.
> > > > 
> > > > Same behaviour.
> > > > 
> > > > Tell me if I need to provide some debug before it hangs. For now
> > > > I
> > > > have
> > > > apic=debug (AFAIR it doesn't affect this).
> > > 
> > > The interesting question is why that spurious APIC interrupt
> > > through
> > > vector
> > > ff happens. Can you please add the following to the kernel command
> > > line:
> > > 
> > >  'trace_events=irq_vectors:* ftrace_dump_on_oops'
> > > 
> > > and apply the patch below. That should spill out the trace over
> > > serial
> > > (I
> > > hope you have that ...)
> > 
> > With or without CONFIG_FUNCTION_TRACER=y I have got
> > 
> > 
> > [   28.977918] Dumping ftrace buffer:
> > [   28.988115]    (ftrace buffer empty)
> > 
> > followed by BUG() output.
> > 
> > ...
> > [   28.930154] RIP: 0010:smp_spurious_interrupt+0x67/0xb0
> > ...
> > 
> > 
> > Anything I missed?
> 
> Yes, you missed the typo in the command line. It should be:
> 
>  'trace_event=irq_vectors:* ftrace_dump_on_oops'

Indeed.

So, I had to disable LOCAL_TIMER_VECTOR, CALL_FUNCTION_VECTOR and
RESCHDULE_VECTOR tracing, otherwise I got a lot of spam and RCU stalls.

The result w/o above is (full log is available here https://pastebin.com
/J5yaTbM9):

[   38.570130]   <idle>-0       0d...    0us : vector_clear: irq=1
vector=49 cpu=0 prev_vector=0 prev_cpu=0

[   38.606969]   <idle>-0       0d...    0us : vector_reserve: irq=1
ret=0

[   38.627799]   <idle>-0       0d...    0us : vector_config: irq=1
vector=239 cpu=0 apicdest=0x00000001

[   38.665139]   <idle>-0       0....    0us : vector_setup: irq=1
is_legacy=0 ret=0

[   38.687383]   <idle>-0       0d...    0us : vector_setup: irq=0
is_legacy=1 ret=0

[   38.709354]   <idle>-0       0d...    0us : vector_config: irq=0
vector=48 cpu=0 apicdest=0x00000001

[   38.746192]   <idle>-0       0d...    0us : vector_clear: irq=3
vector=51 cpu=0 prev_vector=0 prev_cpu=0

[   38.783535]   <idle>-0       0d...    0us : vector_reserve: irq=3
ret=0

[   38.804574]   <idle>-0       0d...    0us : vector_config: irq=3
vector=239 cpu=0 apicdest=0x00000001

[   38.842397]   <idle>-0       0....    0us : vector_setup: irq=3
is_legacy=0 ret=0

...

so on up to 
[   40.329523]   <idle>-0       0d...    0us : vector_clear: irq=15
vector=63 cpu=0 prev_vector=0 prev_cpu=0

[   40.372425]   <idle>-0       0d...    0us : vector_reserve: irq=15
ret=0

[   40.396045]   <idle>-0       0d...    0us : vector_config: irq=15
vector=239 cpu=0 apicdest=0x00000001

[   40.438357]   <idle>-0       0....    0us : vector_setup: irq=15
is_legacy=0 ret=0

[   40.463002]   <idle>-0       0d...    0us : vector_deactivate: irq=0
is_managed=0 can_reserve=0 early=0

[   40.505473]   <idle>-0       0d...    0us : vector_activate: irq=0
is_managed=0 can_reserve=0 early=0


Then several times (for different tasks)

[   40.548017]  kauditd-32      0d.h. 307277us :
call_function_single_entry: vector=251

[   40.572984]  kauditd-32      0dNh. 307281us :
call_function_single_exit: vector=251

...


[   40.792078] swapper/-1       0d... 390605us : vector_activate: irq=9
is_managed=0 can_reserve=1 early=0

[   40.832953] swapper/-1       0d... 390611us : vector_update: irq=9
vector=33 cpu=0 prev_vector=0 prev_cpu=0

[   40.874548] swapper/-1       0d... 390613us : vector_alloc: irq=9
vector=33 reserved=1 ret=0

[   40.899551] swapper/-1       0d... 390614us : vector_config: irq=9
vector=33 cpu=0 apicdest=0x00000001

[   40.940771] swapper/-1       0d... 390620us : vector_config: irq=9
vector=33 cpu=0 apicdest=0x00000001

...

For irq=24-29, 47, 49:


[   41.071806] swapper/-1       1d... 989092us : vector_reserve: irq=24
ret=0

[   41.088297] swapper/-1       1d... 989096us : vector_config: irq=24
vector=239 cpu=0 apicdest=0x00000001

[   41.125772] swapper/-1       1.... 989097us : vector_setup: irq=24
is_legacy=0 ret=0

...

[   48.242842]     mdev-1450    3d.h. 20327793us :
call_function_single_entry: vector=251

[   48.269120]     mdev-1450    3d.h. 20327800us :
call_function_single_exit: vector=251

[   48.295319] modprobe-1348    0d... 20444228us : vector_activate:
irq=42 is_managed=0 can_reserve=1 early=0

[   48.341248] modprobe-1348    0d... 20444235us : vector_update: irq=42
vector=52 cpu=0 prev_vector=0 prev_cpu=0

[   48.387914] modprobe-1348    0d... 20444237us : vector_alloc: irq=42
vector=52 reserved=1 ret=0

[   48.415547] modprobe-1348    0d... 20444238us : vector_config: irq=42
vector=52 cpu=0 apicdest=0x00000001

[   48.461465] modprobe-1348    0d... 20444243us : vector_config: irq=42
vector=52 cpu=0 apicdest=0x00000001

[   48.507688] modprobe-1348    0d... 20484235us : vector_activate:
irq=43 is_managed=0 can_reserve=1 early=0

[   48.554044] modprobe-1348    0d... 20484241us : vector_update: irq=43
vector=53 cpu=0 prev_vector=0 prev_cpu=0

[   48.600812] modprobe-1348    0d... 20484243us : vector_alloc: irq=43
vector=53 reserved=1 ret=0

[   48.628494] modprobe-1348    0d... 20484244us : vector_config: irq=43
vector=53 cpu=0 apicdest=0x00000001

[   48.674508] modprobe-1348    0d... 20484248us : vector_config: irq=43
vector=53 cpu=0 apicdest=0x00000001

[   48.720830]   <idle>-0       1d.h. 20591763us :
call_function_single_entry: vector=251

[   48.747573]   <idle>-0       1d.h. 20591768us :
call_function_single_exit: vector=251

[   48.774025] modprobe-1348    1d... 20831415us : vector_reserve:
irq=150 ret=0

[   48.799631] modprobe-1348    1d... 20831419us : vector_config:
irq=150 vector=239 cpu=0 apicdest=0x00000001

[   48.845791] modprobe-1348    1.... 20831420us : vector_setup: irq=150
is_legacy=0 ret=0

[   48.872515] modprobe-1348    1.... 20831427us : vector_activate:
irq=150 is_managed=0 can_reserve=1 early=1

[   48.918432] modprobe-1348    1d... 20831428us : vector_config:
irq=150 vector=239 cpu=0 apicdest=0x00000001

[   48.964534] modprobe-1348    1d... 21152409us : vector_activate:
irq=150 is_managed=0 can_reserve=1 early=0

[   49.010805] modprobe-1348    1d... 21152415us : vector_update:
irq=150 vector=54 cpu=0 prev_vector=0 prev_cpu=0

[   49.057553] modprobe-1348    1d... 21152417us : vector_alloc: irq=150
vector=54 reserved=1 ret=0

[   49.085503] modprobe-1348    1d... 21152418us : vector_config:
irq=150 vector=54 cpu=0 apicdest=0x00000001

[   49.132545] modprobe-1348    1d... 21152428us : vector_config:
irq=150 vector=54 cpu=0 apicdest=0x00000001

[   49.180426]   <idle>-0       2d.h. 21227397us :
call_function_single_entry: vector=251

[   49.208066]   <idle>-0       2d.h. 21227403us :
call_function_single_exit: vector=251

[   49.235397]   <idle>-0       3d.h. 21229548us :
call_function_single_entry: vector=251

[   49.262625]   <idle>-0       3d.h. 21229551us :
call_function_single_exit: vector=251

[   49.289544]   <idle>-0       0d.h. 21231514us :
call_function_single_entry: vector=251

[   49.316365]   <idle>-0       0d.h. 21231519us :
call_function_single_exit: vector=251

[   49.343015]   <idle>-0       1d.h. 21238805us :
call_function_single_entry: vector=251

[   49.369550]   <idle>-0       1d.h. 21238809us :
call_function_single_exit: vector=251

[   49.395774]   <idle>-0       3d.h. 24735137us :
call_function_single_entry: vector=251

[   49.421902]   <idle>-0       3d.h. 24735143us :
call_function_single_exit: vector=251

[   49.447720]   <idle>-0       0d.h. 38913766us : spurious_apic_entry:
vector=255





-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: IRQ behaivour has been changed from v4.14 to v4.15-rc1
  2017-12-28 21:03           ` Andy Shevchenko
@ 2017-12-28 21:31             ` Thomas Gleixner
  2017-12-28 21:59               ` Thomas Gleixner
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2017-12-28 21:31 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: alan@linux.intel.com, Ailus, Sakari, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org

On Thu, 28 Dec 2017, Andy Shevchenko wrote:
> On Thu, 2017-12-28 at 21:18 +0100, Thomas Gleixner wrote:
> > Yes, you missed the typo in the command line. It should be:
> > 
> >  'trace_event=irq_vectors:* ftrace_dump_on_oops'
> 
> Indeed.
> 
> So, I had to disable LOCAL_TIMER_VECTOR, CALL_FUNCTION_VECTOR and
> RESCHDULE_VECTOR tracing, otherwise I got a lot of spam and RCU stalls.

Fair enough.

> The result w/o above is (full log is available here https://pastebin.com
> /J5yaTbM9):

Ok. Which irqs are related to that ISP thingy?

Are these interrupts MSI?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: IRQ behaivour has been changed from v4.14 to v4.15-rc1
  2017-12-28 21:31             ` Thomas Gleixner
@ 2017-12-28 21:59               ` Thomas Gleixner
  2017-12-29 12:06                 ` Andy Shevchenko
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2017-12-28 21:59 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: alan@linux.intel.com, Ailus, Sakari, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org

On Thu, 28 Dec 2017, Thomas Gleixner wrote:
> On Thu, 28 Dec 2017, Andy Shevchenko wrote:
> > On Thu, 2017-12-28 at 21:18 +0100, Thomas Gleixner wrote:
> > > Yes, you missed the typo in the command line. It should be:
> > > 
> > >  'trace_event=irq_vectors:* ftrace_dump_on_oops'
> > 
> > Indeed.
> > 
> > So, I had to disable LOCAL_TIMER_VECTOR, CALL_FUNCTION_VECTOR and
> > RESCHDULE_VECTOR tracing, otherwise I got a lot of spam and RCU stalls.
> 
> Fair enough.
> 
> > The result w/o above is (full log is available here https://pastebin.com
> > /J5yaTbM9):
> 
> Ok. Which irqs are related to that ISP thingy?
> 
> Are these interrupts MSI?

And looking at the log, I see that you can load the driver successfully and
the trouble starts afterwards when you actually use it.

Can you please enable CONFIG_GENERIC_IRQ_DEBUGFS and after login, check
which interrupt is assigned to that atomisp thingy and then provide the
output of

cat /sys/kernel/debug/irq/irqs/$ATOMISPIRQ

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: IRQ behaivour has been changed from v4.14 to v4.15-rc1
  2017-12-28 21:59               ` Thomas Gleixner
@ 2017-12-29 12:06                 ` Andy Shevchenko
  2017-12-29 13:10                   ` Thomas Gleixner
  0 siblings, 1 reply; 14+ messages in thread
From: Andy Shevchenko @ 2017-12-29 12:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: alan@linux.intel.com, Ailus, Sakari, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org

On Thu, 2017-12-28 at 22:59 +0100, Thomas Gleixner wrote:
> On Thu, 28 Dec 2017, Thomas Gleixner wrote:
> > On Thu, 28 Dec 2017, Andy Shevchenko wrote:

> > > The result w/o above is (full log is available here
> > > https://pastebin.com
> > > /J5yaTbM9):
> > 
> > Ok. Which irqs are related to that ISP thingy?
> > 
> > Are these interrupts MSI?

Yes, they are MSI.

> And looking at the log, I see that you can load the driver
> successfully and
> the trouble starts afterwards when you actually use it.

Correct.

> Can you please enable CONFIG_GENERIC_IRQ_DEBUGFS and after login,
> check
> which interrupt is assigned to that atomisp thingy and then provide
> the
> output of
> 
> cat /sys/kernel/debug/irq/irqs/$ATOMISPIRQ

Full log, including output of the above.

https://pastebin.com/4jammqi5

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: IRQ behaivour has been changed from v4.14 to v4.15-rc1
  2017-12-29 12:06                 ` Andy Shevchenko
@ 2017-12-29 13:10                   ` Thomas Gleixner
  2017-12-29 14:27                     ` Andy Shevchenko
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2017-12-29 13:10 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: alan@linux.intel.com, Ailus, Sakari, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org

On Fri, 29 Dec 2017, Andy Shevchenko wrote:

> On Thu, 2017-12-28 at 22:59 +0100, Thomas Gleixner wrote:
> > On Thu, 28 Dec 2017, Thomas Gleixner wrote:
> > > On Thu, 28 Dec 2017, Andy Shevchenko wrote:
> 
> > > > The result w/o above is (full log is available here
> > > > https://pastebin.com
> > > > /J5yaTbM9):
> > > 
> > > Ok. Which irqs are related to that ISP thingy?
> > > 
> > > Are these interrupts MSI?
> 
> Yes, they are MSI.
> 
> > And looking at the log, I see that you can load the driver
> > successfully and
> > the trouble starts afterwards when you actually use it.
> 
> Correct.
> 
> > Can you please enable CONFIG_GENERIC_IRQ_DEBUGFS and after login,
> > check
> > which interrupt is assigned to that atomisp thingy and then provide
> > the
> > output of
> > 
> > cat /sys/kernel/debug/irq/irqs/$ATOMISPIRQ
> 
> Full log, including output of the above.
> 
> https://pastebin.com/4jammqi5

Thanks for the info. Can you please test the patch below?

Thanks,

	tglx

8<--------------------
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -339,6 +339,40 @@ int msi_domain_populate_irqs(struct irq_
 	return ret;
 }
 
+/*
+ * Carefully check whether the device can use reservation mode. If
+ * reservation mode is enabled then the early activation will assign a
+ * dummy vector to the device. If the PCI/MSI device does not support
+ * masking of the entry then this can result in spurious interrupts when
+ * the device driver is not absolutely careful. But even then a malfunction
+ * of the hardware could result in a spurious interrupt on the dummy vector
+ * and render the device unusable. If the entry can be masked then the core
+ * logic will prevent the spurious interrupt and reservation mode can be
+ * used. For now reservation mode is restricted to PCI/MSI.
+ */
+static bool msi_check_reservation_mode(struct irq_domain *domain,
+				       struct msi_domain_info *info,
+				       struct device *dev)
+{
+	struct msi_desc *desc;
+
+	if (domain->bus_token != DOMAIN_BUS_PCI_MSI)
+		return false;
+
+	if (!(info->flags & MSI_FLAG_MUST_REACTIVATE))
+		return false;
+
+	if (IS_ENABLED(CONFIG_PCI_MSI) && pci_msi_ignore_mask)
+		return false;
+
+	/*
+	 * Checking the first MSI descriptor is sufficient. MSIX supports
+	 * masking and MSI does so when the maskbit is set.
+	 */
+	desc = first_msi_entry(dev);
+	return desc->msi_attrib.is_msix || desc->msi_attrib.maskbit;
+}
+
 /**
  * msi_domain_alloc_irqs - Allocate interrupts from a MSI interrupt domain
  * @domain:	The domain to allocate from
@@ -353,9 +387,11 @@ int msi_domain_alloc_irqs(struct irq_dom
 {
 	struct msi_domain_info *info = domain->host_data;
 	struct msi_domain_ops *ops = info->ops;
-	msi_alloc_info_t arg;
+	struct irq_data *irq_data;
 	struct msi_desc *desc;
+	msi_alloc_info_t arg;
 	int i, ret, virq;
+	bool can_reserve;
 
 	ret = msi_domain_prepare_irqs(domain, dev, nvec, &arg);
 	if (ret)
@@ -385,6 +421,8 @@ int msi_domain_alloc_irqs(struct irq_dom
 	if (ops->msi_finish)
 		ops->msi_finish(&arg, 0);
 
+	can_reserve = msi_check_reservation_mode(domain, info, dev);
+
 	for_each_msi_entry(desc, dev) {
 		virq = desc->irq;
 		if (desc->nvec_used == 1)
@@ -397,17 +435,28 @@ int msi_domain_alloc_irqs(struct irq_dom
 		 * the MSI entries before the PCI layer enables MSI in the
 		 * card. Otherwise the card latches a random msi message.
 		 */
-		if (info->flags & MSI_FLAG_ACTIVATE_EARLY) {
-			struct irq_data *irq_data;
+		if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY))
+			continue;
 
+		irq_data = irq_domain_get_irq_data(domain, desc->irq);
+		if (!can_reserve)
+			irqd_clr_can_reserve(irq_data);
+		ret = irq_domain_activate_irq(irq_data, can_reserve);
+		if (ret)
+			goto cleanup;
+	}
+
+	/*
+	 * If these interrupts use reservation mode clear the activated bit
+	 * so request_irq() will assign the final vector.
+	 */
+	if (can_reserve) {
+		for_each_msi_entry(desc, dev) {
 			irq_data = irq_domain_get_irq_data(domain, desc->irq);
-			ret = irq_domain_activate_irq(irq_data, true);
-			if (ret)
-				goto cleanup;
-			if (info->flags & MSI_FLAG_MUST_REACTIVATE)
-				irqd_clr_activated(irq_data);
+			irqd_clr_activated(irq_data);
 		}
 	}
+
 	return 0;
 
 cleanup:
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -184,6 +184,7 @@ static void reserve_irq_vector_locked(st
 	irq_matrix_reserve(vector_matrix);
 	apicd->can_reserve = true;
 	apicd->has_reserved = true;
+	irqd_set_can_reserve(irqd);
 	trace_vector_reserve(irqd->irq, 0);
 	vector_assign_managed_shutdown(irqd);
 }
@@ -368,8 +369,18 @@ static int activate_reserved(struct irq_
 	int ret;
 
 	ret = assign_irq_vector_any_locked(irqd);
-	if (!ret)
+	if (!ret) {
 		apicd->has_reserved = false;
+		/*
+		 * Core might have disabled reservation mode after
+		 * allocating the irq descriptor. Ideally this should
+		 * happen before allocation time, but that would require
+		 * completely convoluted ways of transporting that
+		 * information.
+		 */
+		if (!irqd_can_reserve(irqd))
+			apicd->can_reserve = false;
+	}
 	return ret;
 }
 
@@ -478,6 +489,7 @@ static bool vector_configure_legacy(unsi
 	} else {
 		/* Release the vector */
 		apicd->can_reserve = true;
+		irqd_set_can_reserve(irqd);
 		clear_irq_vector(irqd);
 		realloc = true;
 	}
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -212,6 +212,7 @@ struct irq_data {
  *				  mask. Applies only to affinity managed irqs.
  * IRQD_SINGLE_TARGET		- IRQ allows only a single affinity target
  * IRQD_DEFAULT_TRIGGER_SET	- Expected trigger already been set
+ * IRQD_CAN_RESERVE		- Can use reservation mode
  */
 enum {
 	IRQD_TRIGGER_MASK		= 0xf,
@@ -233,6 +234,7 @@ enum {
 	IRQD_MANAGED_SHUTDOWN		= (1 << 23),
 	IRQD_SINGLE_TARGET		= (1 << 24),
 	IRQD_DEFAULT_TRIGGER_SET	= (1 << 25),
+	IRQD_CAN_RESERVE		= (1 << 26),
 };
 
 #define __irqd_to_state(d) ACCESS_PRIVATE((d)->common, state_use_accessors)
@@ -377,6 +379,21 @@ static inline bool irqd_is_managed_and_s
 	return __irqd_to_state(d) & IRQD_MANAGED_SHUTDOWN;
 }
 
+static inline void irqd_set_can_reserve(struct irq_data *d)
+{
+	__irqd_to_state(d) |= IRQD_CAN_RESERVE;
+}
+
+static inline void irqd_clr_can_reserve(struct irq_data *d)
+{
+	__irqd_to_state(d) &= ~IRQD_CAN_RESERVE;
+}
+
+static inline bool irqd_can_reserve(struct irq_data *d)
+{
+	return __irqd_to_state(d) & IRQD_CAN_RESERVE;
+}
+
 #undef __irqd_to_state
 
 static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
--- a/kernel/irq/debugfs.c
+++ b/kernel/irq/debugfs.c
@@ -113,6 +113,7 @@ static const struct irq_bit_descr irqdat
 	BIT_MASK_DESCR(IRQD_SETAFFINITY_PENDING),
 	BIT_MASK_DESCR(IRQD_AFFINITY_MANAGED),
 	BIT_MASK_DESCR(IRQD_MANAGED_SHUTDOWN),
+	BIT_MASK_DESCR(IRQD_CAN_RESERVE),
 
 	BIT_MASK_DESCR(IRQD_FORWARDED_TO_VCPU),
 
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -151,7 +151,7 @@ static struct apic apic_flat __ro_after_
 	.apic_id_valid			= default_apic_id_valid,
 	.apic_id_registered		= flat_apic_id_registered,
 
-	.irq_delivery_mode		= dest_LowestPrio,
+	.irq_delivery_mode		= dest_Fixed,
 	.irq_dest_mode			= 1, /* logical */
 
 	.disable_esr			= 0,
--- a/arch/x86/kernel/apic/apic_noop.c
+++ b/arch/x86/kernel/apic/apic_noop.c
@@ -110,7 +110,7 @@ struct apic apic_noop __ro_after_init =
 	.apic_id_valid			= default_apic_id_valid,
 	.apic_id_registered		= noop_apic_id_registered,
 
-	.irq_delivery_mode		= dest_LowestPrio,
+	.irq_delivery_mode		= dest_Fixed,
 	/* logical delivery broadcast to all CPUs: */
 	.irq_dest_mode			= 1,
 
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -39,17 +39,13 @@ static void irq_msi_compose_msg(struct i
 		((apic->irq_dest_mode == 0) ?
 			MSI_ADDR_DEST_MODE_PHYSICAL :
 			MSI_ADDR_DEST_MODE_LOGICAL) |
-		((apic->irq_delivery_mode != dest_LowestPrio) ?
-			MSI_ADDR_REDIRECTION_CPU :
-			MSI_ADDR_REDIRECTION_LOWPRI) |
+		MSI_ADDR_REDIRECTION_CPU |
 		MSI_ADDR_DEST_ID(cfg->dest_apicid);
 
 	msg->data =
 		MSI_DATA_TRIGGER_EDGE |
 		MSI_DATA_LEVEL_ASSERT |
-		((apic->irq_delivery_mode != dest_LowestPrio) ?
-			MSI_DATA_DELIVERY_FIXED :
-			MSI_DATA_DELIVERY_LOWPRI) |
+		MSI_DATA_DELIVERY_FIXED |
 		MSI_DATA_VECTOR(cfg->vector);
 }
 
--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -105,7 +105,7 @@ static struct apic apic_default __ro_aft
 	.apic_id_valid			= default_apic_id_valid,
 	.apic_id_registered		= default_apic_id_registered,
 
-	.irq_delivery_mode		= dest_LowestPrio,
+	.irq_delivery_mode		= dest_Fixed,
 	/* logical delivery broadcast to all CPUs: */
 	.irq_dest_mode			= 1,
 
--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -184,7 +184,7 @@ static struct apic apic_x2apic_cluster _
 	.apic_id_valid			= x2apic_apic_id_valid,
 	.apic_id_registered		= x2apic_apic_id_registered,
 
-	.irq_delivery_mode		= dest_LowestPrio,
+	.irq_delivery_mode		= dest_Fixed,
 	.irq_dest_mode			= 1, /* logical */
 
 	.disable_esr			= 0,
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -985,9 +985,7 @@ static u32 hv_compose_msi_req_v1(
 	int_pkt->wslot.slot = slot;
 	int_pkt->int_desc.vector = vector;
 	int_pkt->int_desc.vector_count = 1;
-	int_pkt->int_desc.delivery_mode =
-		(apic->irq_delivery_mode == dest_LowestPrio) ?
-			dest_LowestPrio : dest_Fixed;
+	int_pkt->int_desc.delivery_mode = dest_Fixed;
 
 	/*
 	 * Create MSI w/ dummy vCPU set, overwritten by subsequent retarget in
@@ -1008,9 +1006,7 @@ static u32 hv_compose_msi_req_v2(
 	int_pkt->wslot.slot = slot;
 	int_pkt->int_desc.vector = vector;
 	int_pkt->int_desc.vector_count = 1;
-	int_pkt->int_desc.delivery_mode =
-		(apic->irq_delivery_mode == dest_LowestPrio) ?
-			dest_LowestPrio : dest_Fixed;
+	int_pkt->int_desc.delivery_mode = dest_Fixed;
 
 	/*
 	 * Create MSI w/ dummy vCPU set targeting just one vCPU, overwritten

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: IRQ behaivour has been changed from v4.14 to v4.15-rc1
  2017-12-29 13:10                   ` Thomas Gleixner
@ 2017-12-29 14:27                     ` Andy Shevchenko
  0 siblings, 0 replies; 14+ messages in thread
From: Andy Shevchenko @ 2017-12-29 14:27 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: alan@linux.intel.com, Ailus, Sakari, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org

On Fri, 2017-12-29 at 14:10 +0100, Thomas Gleixner wrote:
> On Fri, 29 Dec 2017, Andy Shevchenko wrote:
> 
> > On Thu, 2017-12-28 at 22:59 +0100, Thomas Gleixner wrote:
> > > On Thu, 28 Dec 2017, Thomas Gleixner wrote:
> > > > On Thu, 28 Dec 2017, Andy Shevchenko wrote:
> > > > > The result w/o above is (full log is available here
> > > > > https://pastebin.com
> > > > > /J5yaTbM9):
> > > > 
> > > > Ok. Which irqs are related to that ISP thingy?
> > > > 
> > > > Are these interrupts MSI?
> > 
> > Yes, they are MSI.
> > 
> > > And looking at the log, I see that you can load the driver
> > > successfully and
> > > the trouble starts afterwards when you actually use it.
> > 
> > Correct.
> > 
> > > Can you please enable CONFIG_GENERIC_IRQ_DEBUGFS and after login,
> > > check
> > > which interrupt is assigned to that atomisp thingy and then
> > > provide
> > > the
> > > output of
> > > 
> > > cat /sys/kernel/debug/irq/irqs/$ATOMISPIRQ
> > 
> > Full log, including output of the above.
> > 
> > https://pastebin.com/4jammqi5
> 
> Thanks for the info. Can you please test the patch below?

Thanks for the patch. It does a trick!

P.S. I noticed that it's combined from at least one that is already in
x86/urgent, so, feel free to add my

Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

to the rest

> 
> Thanks,
> 
> 	tglx
> 
> 8<--------------------
> --- a/kernel/irq/msi.c
> +++ b/kernel/irq/msi.c
> @@ -339,6 +339,40 @@ int msi_domain_populate_irqs(struct irq_
>  	return ret;
>  }
>  
> +/*
> + * Carefully check whether the device can use reservation mode. If
> + * reservation mode is enabled then the early activation will assign
> a
> + * dummy vector to the device. If the PCI/MSI device does not support
> + * masking of the entry then this can result in spurious interrupts
> when
> + * the device driver is not absolutely careful. But even then a
> malfunction
> + * of the hardware could result in a spurious interrupt on the dummy
> vector
> + * and render the device unusable. If the entry can be masked then
> the core
> + * logic will prevent the spurious interrupt and reservation mode can
> be
> + * used. For now reservation mode is restricted to PCI/MSI.
> + */
> +static bool msi_check_reservation_mode(struct irq_domain *domain,
> +				       struct msi_domain_info *info,
> +				       struct device *dev)
> +{
> +	struct msi_desc *desc;
> +
> +	if (domain->bus_token != DOMAIN_BUS_PCI_MSI)
> +		return false;
> +
> +	if (!(info->flags & MSI_FLAG_MUST_REACTIVATE))
> +		return false;
> +
> +	if (IS_ENABLED(CONFIG_PCI_MSI) && pci_msi_ignore_mask)
> +		return false;
> +
> +	/*
> +	 * Checking the first MSI descriptor is sufficient. MSIX
> supports
> +	 * masking and MSI does so when the maskbit is set.
> +	 */
> +	desc = first_msi_entry(dev);
> +	return desc->msi_attrib.is_msix || desc->msi_attrib.maskbit;
> +}
> +
>  /**
>   * msi_domain_alloc_irqs - Allocate interrupts from a MSI interrupt
> domain
>   * @domain:	The domain to allocate from
> @@ -353,9 +387,11 @@ int msi_domain_alloc_irqs(struct irq_dom
>  {
>  	struct msi_domain_info *info = domain->host_data;
>  	struct msi_domain_ops *ops = info->ops;
> -	msi_alloc_info_t arg;
> +	struct irq_data *irq_data;
>  	struct msi_desc *desc;
> +	msi_alloc_info_t arg;
>  	int i, ret, virq;
> +	bool can_reserve;
>  
>  	ret = msi_domain_prepare_irqs(domain, dev, nvec, &arg);
>  	if (ret)
> @@ -385,6 +421,8 @@ int msi_domain_alloc_irqs(struct irq_dom
>  	if (ops->msi_finish)
>  		ops->msi_finish(&arg, 0);
>  
> +	can_reserve = msi_check_reservation_mode(domain, info, dev);
> +
>  	for_each_msi_entry(desc, dev) {
>  		virq = desc->irq;
>  		if (desc->nvec_used == 1)
> @@ -397,17 +435,28 @@ int msi_domain_alloc_irqs(struct irq_dom
>  		 * the MSI entries before the PCI layer enables MSI
> in the
>  		 * card. Otherwise the card latches a random msi
> message.
>  		 */
> -		if (info->flags & MSI_FLAG_ACTIVATE_EARLY) {
> -			struct irq_data *irq_data;
> +		if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY))
> +			continue;
>  
> +		irq_data = irq_domain_get_irq_data(domain, desc-
> >irq);
> +		if (!can_reserve)
> +			irqd_clr_can_reserve(irq_data);
> +		ret = irq_domain_activate_irq(irq_data, can_reserve);
> +		if (ret)
> +			goto cleanup;
> +	}
> +
> +	/*
> +	 * If these interrupts use reservation mode clear the
> activated bit
> +	 * so request_irq() will assign the final vector.
> +	 */
> +	if (can_reserve) {
> +		for_each_msi_entry(desc, dev) {
>  			irq_data = irq_domain_get_irq_data(domain,
> desc->irq);
> -			ret = irq_domain_activate_irq(irq_data,
> true);
> -			if (ret)
> -				goto cleanup;
> -			if (info->flags & MSI_FLAG_MUST_REACTIVATE)
> -				irqd_clr_activated(irq_data);
> +			irqd_clr_activated(irq_data);
>  		}
>  	}
> +
>  	return 0;
>  
>  cleanup:
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -184,6 +184,7 @@ static void reserve_irq_vector_locked(st
>  	irq_matrix_reserve(vector_matrix);
>  	apicd->can_reserve = true;
>  	apicd->has_reserved = true;
> +	irqd_set_can_reserve(irqd);
>  	trace_vector_reserve(irqd->irq, 0);
>  	vector_assign_managed_shutdown(irqd);
>  }
> @@ -368,8 +369,18 @@ static int activate_reserved(struct irq_
>  	int ret;
>  
>  	ret = assign_irq_vector_any_locked(irqd);
> -	if (!ret)
> +	if (!ret) {
>  		apicd->has_reserved = false;
> +		/*
> +		 * Core might have disabled reservation mode after
> +		 * allocating the irq descriptor. Ideally this should
> +		 * happen before allocation time, but that would
> require
> +		 * completely convoluted ways of transporting that
> +		 * information.
> +		 */
> +		if (!irqd_can_reserve(irqd))
> +			apicd->can_reserve = false;
> +	}
>  	return ret;
>  }
>  
> @@ -478,6 +489,7 @@ static bool vector_configure_legacy(unsi
>  	} else {
>  		/* Release the vector */
>  		apicd->can_reserve = true;
> +		irqd_set_can_reserve(irqd);
>  		clear_irq_vector(irqd);
>  		realloc = true;
>  	}
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -212,6 +212,7 @@ struct irq_data {
>   *				  mask. Applies only to affinity
> managed irqs.
>   * IRQD_SINGLE_TARGET		- IRQ allows only a single
> affinity target
>   * IRQD_DEFAULT_TRIGGER_SET	- Expected trigger already been
> set
> + * IRQD_CAN_RESERVE		- Can use reservation mode
>   */
>  enum {
>  	IRQD_TRIGGER_MASK		= 0xf,
> @@ -233,6 +234,7 @@ enum {
>  	IRQD_MANAGED_SHUTDOWN		= (1 << 23),
>  	IRQD_SINGLE_TARGET		= (1 << 24),
>  	IRQD_DEFAULT_TRIGGER_SET	= (1 << 25),
> +	IRQD_CAN_RESERVE		= (1 << 26),
>  };
>  
>  #define __irqd_to_state(d) ACCESS_PRIVATE((d)->common,
> state_use_accessors)
> @@ -377,6 +379,21 @@ static inline bool irqd_is_managed_and_s
>  	return __irqd_to_state(d) & IRQD_MANAGED_SHUTDOWN;
>  }
>  
> +static inline void irqd_set_can_reserve(struct irq_data *d)
> +{
> +	__irqd_to_state(d) |= IRQD_CAN_RESERVE;
> +}
> +
> +static inline void irqd_clr_can_reserve(struct irq_data *d)
> +{
> +	__irqd_to_state(d) &= ~IRQD_CAN_RESERVE;
> +}
> +
> +static inline bool irqd_can_reserve(struct irq_data *d)
> +{
> +	return __irqd_to_state(d) & IRQD_CAN_RESERVE;
> +}
> +
>  #undef __irqd_to_state
>  
>  static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
> --- a/kernel/irq/debugfs.c
> +++ b/kernel/irq/debugfs.c
> @@ -113,6 +113,7 @@ static const struct irq_bit_descr irqdat
>  	BIT_MASK_DESCR(IRQD_SETAFFINITY_PENDING),
>  	BIT_MASK_DESCR(IRQD_AFFINITY_MANAGED),
>  	BIT_MASK_DESCR(IRQD_MANAGED_SHUTDOWN),
> +	BIT_MASK_DESCR(IRQD_CAN_RESERVE),
>  
>  	BIT_MASK_DESCR(IRQD_FORWARDED_TO_VCPU),
>  
> --- a/arch/x86/kernel/apic/apic_flat_64.c
> +++ b/arch/x86/kernel/apic/apic_flat_64.c
> @@ -151,7 +151,7 @@ static struct apic apic_flat __ro_after_
>  	.apic_id_valid			=
> default_apic_id_valid,
>  	.apic_id_registered		= flat_apic_id_registered,
>  
> -	.irq_delivery_mode		= dest_LowestPrio,
> +	.irq_delivery_mode		= dest_Fixed,
>  	.irq_dest_mode			= 1, /* logical */
>  
>  	.disable_esr			= 0,
> --- a/arch/x86/kernel/apic/apic_noop.c
> +++ b/arch/x86/kernel/apic/apic_noop.c
> @@ -110,7 +110,7 @@ struct apic apic_noop __ro_after_init =
>  	.apic_id_valid			=
> default_apic_id_valid,
>  	.apic_id_registered		= noop_apic_id_registered,
>  
> -	.irq_delivery_mode		= dest_LowestPrio,
> +	.irq_delivery_mode		= dest_Fixed,
>  	/* logical delivery broadcast to all CPUs: */
>  	.irq_dest_mode			= 1,
>  
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -39,17 +39,13 @@ static void irq_msi_compose_msg(struct i
>  		((apic->irq_dest_mode == 0) ?
>  			MSI_ADDR_DEST_MODE_PHYSICAL :
>  			MSI_ADDR_DEST_MODE_LOGICAL) |
> -		((apic->irq_delivery_mode != dest_LowestPrio) ?
> -			MSI_ADDR_REDIRECTION_CPU :
> -			MSI_ADDR_REDIRECTION_LOWPRI) |
> +		MSI_ADDR_REDIRECTION_CPU |
>  		MSI_ADDR_DEST_ID(cfg->dest_apicid);
>  
>  	msg->data =
>  		MSI_DATA_TRIGGER_EDGE |
>  		MSI_DATA_LEVEL_ASSERT |
> -		((apic->irq_delivery_mode != dest_LowestPrio) ?
> -			MSI_DATA_DELIVERY_FIXED :
> -			MSI_DATA_DELIVERY_LOWPRI) |
> +		MSI_DATA_DELIVERY_FIXED |
>  		MSI_DATA_VECTOR(cfg->vector);
>  }
>  
> --- a/arch/x86/kernel/apic/probe_32.c
> +++ b/arch/x86/kernel/apic/probe_32.c
> @@ -105,7 +105,7 @@ static struct apic apic_default __ro_aft
>  	.apic_id_valid			=
> default_apic_id_valid,
>  	.apic_id_registered		=
> default_apic_id_registered,
>  
> -	.irq_delivery_mode		= dest_LowestPrio,
> +	.irq_delivery_mode		= dest_Fixed,
>  	/* logical delivery broadcast to all CPUs: */
>  	.irq_dest_mode			= 1,
>  
> --- a/arch/x86/kernel/apic/x2apic_cluster.c
> +++ b/arch/x86/kernel/apic/x2apic_cluster.c
> @@ -184,7 +184,7 @@ static struct apic apic_x2apic_cluster _
>  	.apic_id_valid			= x2apic_apic_id_valid,
>  	.apic_id_registered		=
> x2apic_apic_id_registered,
>  
> -	.irq_delivery_mode		= dest_LowestPrio,
> +	.irq_delivery_mode		= dest_Fixed,
>  	.irq_dest_mode			= 1, /* logical */
>  
>  	.disable_esr			= 0,
> --- a/drivers/pci/host/pci-hyperv.c
> +++ b/drivers/pci/host/pci-hyperv.c
> @@ -985,9 +985,7 @@ static u32 hv_compose_msi_req_v1(
>  	int_pkt->wslot.slot = slot;
>  	int_pkt->int_desc.vector = vector;
>  	int_pkt->int_desc.vector_count = 1;
> -	int_pkt->int_desc.delivery_mode =
> -		(apic->irq_delivery_mode == dest_LowestPrio) ?
> -			dest_LowestPrio : dest_Fixed;
> +	int_pkt->int_desc.delivery_mode = dest_Fixed;
>  
>  	/*
>  	 * Create MSI w/ dummy vCPU set, overwritten by subsequent
> retarget in
> @@ -1008,9 +1006,7 @@ static u32 hv_compose_msi_req_v2(
>  	int_pkt->wslot.slot = slot;
>  	int_pkt->int_desc.vector = vector;
>  	int_pkt->int_desc.vector_count = 1;
> -	int_pkt->int_desc.delivery_mode =
> -		(apic->irq_delivery_mode == dest_LowestPrio) ?
> -			dest_LowestPrio : dest_Fixed;
> +	int_pkt->int_desc.delivery_mode = dest_Fixed;
>  
>  	/*
>  	 * Create MSI w/ dummy vCPU set targeting just one vCPU,
> overwritten

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-12-29 14:27 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-12-28 17:17 IRQ behaivour has been changed from v4.14 to v4.15-rc1 Shevchenko, Andriy
2017-12-28 17:21 ` Thomas Gleixner
2017-12-28 17:34   ` Andy Shevchenko
2017-12-28 17:44     ` Thomas Gleixner
2017-12-28 19:31       ` Andy Shevchenko
2017-12-28 19:36         ` Andy Shevchenko
2017-12-28 20:18         ` Thomas Gleixner
2017-12-28 21:03           ` Andy Shevchenko
2017-12-28 21:31             ` Thomas Gleixner
2017-12-28 21:59               ` Thomas Gleixner
2017-12-29 12:06                 ` Andy Shevchenko
2017-12-29 13:10                   ` Thomas Gleixner
2017-12-29 14:27                     ` Andy Shevchenko
2017-12-28 17:23 ` Andy Shevchenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).