public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3)
@ 2007-03-30 18:54 Mitch Williams
  2007-03-30 19:04 ` Eric W. Biederman
  0 siblings, 1 reply; 17+ messages in thread
From: Mitch Williams @ 2007-03-30 18:54 UTC (permalink / raw)
  To: linux-pci, akpm; +Cc: gregkh, ebiederm, linux-kernel, auke-jan.h.kok

This patch fixes a kernel bug which is triggered when using the
irqbalance daemon with MSI-X hardware.

Because both MSI-X interrupt messages and MSI-X table writes are posted,
it's possible for them to cross while in-flight.  This results in
interrupts being received long after the kernel thinks they're disabled,
and in interrupts being sent to stale vectors after rebalancing.

This patch performs a read flush after writes to the MSI-X table for
mask and unmask operations.  Since the SMP affinity is set while
the interrupt is masked, and since it's unmasked immediately after,
no additional flushes are required in the various affinity setting
routines.

This patch has been validated with (unreleased) network hardware which
uses MSI-X.

Revised with input from Eric Biederman.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>

diff -urpN -X dontdiff linux-2.6.21-rc5-clean/drivers/pci/msi.c linux-2.6.21-rc5/drivers/pci/msi.c
--- linux-2.6.21-rc5-clean/drivers/pci/msi.c	2007-03-28 10:05:24.000000000 -0700
+++ linux-2.6.21-rc5/drivers/pci/msi.c	2007-03-28 09:21:34.000000000 -0700
@@ -68,6 +68,29 @@ static void msix_set_enable(struct pci_d
 	}
 }
 
+static void msix_flush_writes(unsigned int irq)
+{
+	struct msi_desc *entry;
+
+	entry = get_irq_msi(irq);
+	BUG_ON(!entry || !entry->dev);
+	switch (entry->msi_attrib.type) {
+	case PCI_CAP_ID_MSI:
+		/* nothing to do */
+		break;
+	case PCI_CAP_ID_MSIX:
+	{
+		int offset = entry->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE +
+			PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET;
+		readl(entry->mask_base + offset);
+		break;
+	}
+	default:
+		BUG();
+		break;
+	}
+}
+
 static void msi_set_mask_bit(unsigned int irq, int flag)
 {
 	struct msi_desc *entry;
@@ -186,11 +209,13 @@ void write_msi_msg(unsigned int irq, str
 void mask_msi_irq(unsigned int irq)
 {
 	msi_set_mask_bit(irq, 1);
+	msix_flush_writes(irq);
 }
 
 void unmask_msi_irq(unsigned int irq)
 {
 	msi_set_mask_bit(irq, 0);
+	msix_flush_writes(irq);
 }
 
 static int msi_free_irq(struct pci_dev* dev, int irq);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3)
  2007-03-30 18:54 [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3) Mitch Williams
@ 2007-03-30 19:04 ` Eric W. Biederman
  2007-03-30 19:47   ` Andrew Morton
  0 siblings, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2007-03-30 19:04 UTC (permalink / raw)
  To: Mitch Williams; +Cc: linux-pci, akpm, gregkh, linux-kernel, auke-jan.h.kok

Mitch Williams <mitch.a.williams@intel.com> writes:

> This patch fixes a kernel bug which is triggered when using the
> irqbalance daemon with MSI-X hardware.
>
> Because both MSI-X interrupt messages and MSI-X table writes are posted,
> it's possible for them to cross while in-flight.  This results in
> interrupts being received long after the kernel thinks they're disabled,
> and in interrupts being sent to stale vectors after rebalancing.
>
> This patch performs a read flush after writes to the MSI-X table for
> mask and unmask operations.  Since the SMP affinity is set while
> the interrupt is masked, and since it's unmasked immediately after,
> no additional flushes are required in the various affinity setting
> routines.
>
> This patch has been validated with (unreleased) network hardware which
> uses MSI-X.
>
> Revised with input from Eric Biederman.

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>

>
> Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
>
> diff -urpN -X dontdiff linux-2.6.21-rc5-clean/drivers/pci/msi.c
> linux-2.6.21-rc5/drivers/pci/msi.c
> --- linux-2.6.21-rc5-clean/drivers/pci/msi.c 2007-03-28 10:05:24.000000000 -0700
> +++ linux-2.6.21-rc5/drivers/pci/msi.c	2007-03-28 09:21:34.000000000 -0700
> @@ -68,6 +68,29 @@ static void msix_set_enable(struct pci_d
>  	}
>  }
>  
> +static void msix_flush_writes(unsigned int irq)
> +{
> +	struct msi_desc *entry;
> +
> +	entry = get_irq_msi(irq);
> +	BUG_ON(!entry || !entry->dev);
> +	switch (entry->msi_attrib.type) {
> +	case PCI_CAP_ID_MSI:
> +		/* nothing to do */
> +		break;
> +	case PCI_CAP_ID_MSIX:
> +	{
> +		int offset = entry->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE +
> +			PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET;
> +		readl(entry->mask_base + offset);
> +		break;
> +	}
> +	default:
> +		BUG();
> +		break;
> +	}
> +}
> +
>  static void msi_set_mask_bit(unsigned int irq, int flag)
>  {
>  	struct msi_desc *entry;
> @@ -186,11 +209,13 @@ void write_msi_msg(unsigned int irq, str
>  void mask_msi_irq(unsigned int irq)
>  {
>  	msi_set_mask_bit(irq, 1);
> +	msix_flush_writes(irq);
>  }
>  
>  void unmask_msi_irq(unsigned int irq)
>  {
>  	msi_set_mask_bit(irq, 0);
> +	msix_flush_writes(irq);
>  }
>  
>  static int msi_free_irq(struct pci_dev* dev, int irq);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3)
  2007-03-30 19:04 ` Eric W. Biederman
@ 2007-03-30 19:47   ` Andrew Morton
  2007-03-30 19:49     ` Greg KH
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2007-03-30 19:47 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Mitch Williams, linux-pci, gregkh, linux-kernel, auke-jan.h.kok

On Fri, 30 Mar 2007 13:04:02 -0600
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Mitch Williams <mitch.a.williams@intel.com> writes:
> 
> > This patch fixes a kernel bug which is triggered when using the
> > irqbalance daemon with MSI-X hardware.
> >
> > Because both MSI-X interrupt messages and MSI-X table writes are posted,
> > it's possible for them to cross while in-flight.  This results in
> > interrupts being received long after the kernel thinks they're disabled,
> > and in interrupts being sent to stale vectors after rebalancing.
> >
> > This patch performs a read flush after writes to the MSI-X table for
> > mask and unmask operations.  Since the SMP affinity is set while
> > the interrupt is masked, and since it's unmasked immediately after,
> > no additional flushes are required in the various affinity setting
> > routines.
> >
> > This patch has been validated with (unreleased) network hardware which
> > uses MSI-X.
> >
> > Revised with input from Eric Biederman.
> 
> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>

Did we end up deciding whether this is (needed*safe) enough for 2.6.21?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3)
  2007-03-30 19:47   ` Andrew Morton
@ 2007-03-30 19:49     ` Greg KH
  2007-03-30 20:00       ` Andrew Morton
  2007-03-30 20:26       ` Eric W. Biederman
  0 siblings, 2 replies; 17+ messages in thread
From: Greg KH @ 2007-03-30 19:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric W. Biederman, Mitch Williams, linux-pci, linux-kernel,
	auke-jan.h.kok

On Fri, Mar 30, 2007 at 12:47:47PM -0700, Andrew Morton wrote:
> On Fri, 30 Mar 2007 13:04:02 -0600
> ebiederm@xmission.com (Eric W. Biederman) wrote:
> 
> > Mitch Williams <mitch.a.williams@intel.com> writes:
> > 
> > > This patch fixes a kernel bug which is triggered when using the
> > > irqbalance daemon with MSI-X hardware.
> > >
> > > Because both MSI-X interrupt messages and MSI-X table writes are posted,
> > > it's possible for them to cross while in-flight.  This results in
> > > interrupts being received long after the kernel thinks they're disabled,
> > > and in interrupts being sent to stale vectors after rebalancing.
> > >
> > > This patch performs a read flush after writes to the MSI-X table for
> > > mask and unmask operations.  Since the SMP affinity is set while
> > > the interrupt is masked, and since it's unmasked immediately after,
> > > no additional flushes are required in the various affinity setting
> > > routines.
> > >
> > > This patch has been validated with (unreleased) network hardware which
> > > uses MSI-X.
> > >
> > > Revised with input from Eric Biederman.
> > 
> > Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
> 
> Did we end up deciding whether this is (needed*safe) enough for 2.6.21?

I say no for now, I have seen no bug reports for any hardware that is
not in a lab for this.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3)
  2007-03-30 19:49     ` Greg KH
@ 2007-03-30 20:00       ` Andrew Morton
  2007-03-30 20:10         ` Greg KH
  2007-03-30 20:26       ` Eric W. Biederman
  1 sibling, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2007-03-30 20:00 UTC (permalink / raw)
  To: Greg KH
  Cc: Eric W. Biederman, Mitch Williams, linux-pci, linux-kernel,
	auke-jan.h.kok

On Fri, 30 Mar 2007 12:49:56 -0700
Greg KH <gregkh@suse.de> wrote:

> > > Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
> > 
> > Did we end up deciding whether this is (needed*safe) enough for 2.6.21?
> 
> I say no for now, I have seen no bug reports for any hardware that is
> not in a lab for this.

Well.  Presumably that hardware will be in the field during the lifetime of
2.6.21-based kernels.

Perhaps we should put this into 2.6.22 then backport it to 2.6.21.x once it
seems safe to do so.  If we decide to go this way, we'll need to ask Mitch
to remind us to do the backport at the appropriate time, else we'll surely
forget.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3)
  2007-03-30 20:00       ` Andrew Morton
@ 2007-03-30 20:10         ` Greg KH
  2007-03-30 20:21           ` Williams, Mitch A
  0 siblings, 1 reply; 17+ messages in thread
From: Greg KH @ 2007-03-30 20:10 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric W. Biederman, Mitch Williams, linux-pci, linux-kernel,
	auke-jan.h.kok

On Fri, Mar 30, 2007 at 01:00:22PM -0700, Andrew Morton wrote:
> On Fri, 30 Mar 2007 12:49:56 -0700
> Greg KH <gregkh@suse.de> wrote:
> 
> > > > Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
> > > 
> > > Did we end up deciding whether this is (needed*safe) enough for 2.6.21?
> > 
> > I say no for now, I have seen no bug reports for any hardware that is
> > not in a lab for this.
> 
> Well.  Presumably that hardware will be in the field during the lifetime of
> 2.6.21-based kernels.
> 
> Perhaps we should put this into 2.6.22 then backport it to 2.6.21.x once it
> seems safe to do so.  If we decide to go this way, we'll need to ask Mitch
> to remind us to do the backport at the appropriate time, else we'll surely
> forget.

Yes, that's what I just asked him to do a message ago :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3)
  2007-03-30 20:10         ` Greg KH
@ 2007-03-30 20:21           ` Williams, Mitch A
  2007-03-30 20:24             ` Greg KH
  0 siblings, 1 reply; 17+ messages in thread
From: Williams, Mitch A @ 2007-03-30 20:21 UTC (permalink / raw)
  To: Greg KH, Andrew Morton
  Cc: Eric W. Biederman, linux-pci, linux-kernel, Kok, Auke-jan H

Greg KH wrote: 
> 
>> Perhaps we should put this into 2.6.22 then backport it to 
>2.6.21.x once it
>> seems safe to do so.  If we decide to go this way, we'll 
>need to ask Mitch
>> to remind us to do the backport at the appropriate time, 
>else we'll surely
>> forget.
>
>Yes, that's what I just asked him to do a message ago :)
>
Be happy to -- just give me a good definition of "appropriate
time".

-Mitch

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3)
  2007-03-30 20:21           ` Williams, Mitch A
@ 2007-03-30 20:24             ` Greg KH
  0 siblings, 0 replies; 17+ messages in thread
From: Greg KH @ 2007-03-30 20:24 UTC (permalink / raw)
  To: Williams, Mitch A
  Cc: Greg KH, Andrew Morton, Eric W. Biederman, linux-pci,
	linux-kernel, Kok, Auke-jan H

On Fri, Mar 30, 2007 at 01:21:03PM -0700, Williams, Mitch A wrote:
> Greg KH wrote: 
> > 
> >> Perhaps we should put this into 2.6.22 then backport it to 
> >2.6.21.x once it
> >> seems safe to do so.  If we decide to go this way, we'll 
> >need to ask Mitch
> >> to remind us to do the backport at the appropriate time, 
> >else we'll surely
> >> forget.
> >
> >Yes, that's what I just asked him to do a message ago :)
> >
> Be happy to -- just give me a good definition of "appropriate
> time".

After it goes into Linus's tree, which will be usually the week after
2.6.21 is out.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3)
  2007-03-30 19:49     ` Greg KH
  2007-03-30 20:00       ` Andrew Morton
@ 2007-03-30 20:26       ` Eric W. Biederman
  2007-03-30 20:49         ` Williams, Mitch A
  1 sibling, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2007-03-30 20:26 UTC (permalink / raw)
  To: Greg KH
  Cc: Andrew Morton, Mitch Williams, linux-pci, linux-kernel,
	auke-jan.h.kok

Greg KH <gregkh@suse.de> writes:

> On Fri, Mar 30, 2007 at 12:47:47PM -0700, Andrew Morton wrote:
>> Did we end up deciding whether this is (needed*safe) enough for 2.6.21?
>
> I say no for now, I have seen no bug reports for any hardware that is
> not in a lab for this.

The bug report would be phrased as someone seeing "No irq for vector"
on x86_64.  Unless they are a skilled developer they are unlikely to
trace it down to not flushing posted writes to a MSI bar during irq
migration.  It part it is a subtle hardware/software race.

I have seen some unresolved bug reports described this way on machines
that have MSI-X capable hardware.  The are against fedora core and I
can't haven't been able to get the reporters to try 2.6.21-rcX and
the other irq back ports don't work so well.

I currently count 7 drivers in 2.6.21 that are susceptible to the race
this bug closes.

This version of the patch seems to meet all of the criteria for being
obviously correct, and all it does is insert a stupid readl.  Not
very dangerous.

This fix removes a race between hardware and software that you likely
need to migrate MSI-X irqs a lot to trigger.  Which can be the kind of
thing that bytes you after you have a month of uptime.  This is a very
nasty condition to track down from bug reports.  I'd rather not have a
known culprit out there.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3)
  2007-03-30 20:26       ` Eric W. Biederman
@ 2007-03-30 20:49         ` Williams, Mitch A
  2007-03-30 20:56           ` Chuck Ebbert
                             ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Williams, Mitch A @ 2007-03-30 20:49 UTC (permalink / raw)
  To: Eric W. Biederman, Greg KH
  Cc: Andrew Morton, linux-pci, linux-kernel, Kok, Auke-jan H

Eric W. Biederman wrote:

>The bug report would be phrased as someone seeing "No irq for vector"
>on x86_64.  Unless they are a skilled developer they are unlikely to
>trace it down to not flushing posted writes to a MSI bar during irq
>migration.  It part it is a subtle hardware/software race.
>
>I have seen some unresolved bug reports described this way on machines
>that have MSI-X capable hardware.  The are against fedora core and I
>can't haven't been able to get the reporters to try 2.6.21-rcX and
>the other irq back ports don't work so well.
>
>I currently count 7 drivers in 2.6.21 that are susceptible to the race
>this bug closes.
>
>This version of the patch seems to meet all of the criteria for being
>obviously correct, and all it does is insert a stupid readl.  Not
>very dangerous.
>
>This fix removes a race between hardware and software that you likely
>need to migrate MSI-X irqs a lot to trigger.  Which can be the kind of
>thing that bytes you after you have a month of uptime.  This is a very
>nasty condition to track down from bug reports.  I'd rather not have a
>known culprit out there.

Agreed, this is a subtle bug, and was a real hairball to track down.
Even so, I'm surprised that nobody else has dug into this, since it
should affect anybody running MSI-X.  I originally thought I was seeing
a hardware bug, which is why I dug more deeply into the issue.

If Eric is seeing bug reports related to "no vector for IRQ" in the
wild, then I have to change my stance and agree that this should be
pushed to -stable.  Every one of those messages indicates that we
hit the race condition.

-Mitch

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3)
  2007-03-30 20:49         ` Williams, Mitch A
@ 2007-03-30 20:56           ` Chuck Ebbert
  2007-03-30 21:05           ` Eric W. Biederman
  2007-04-03  7:41           ` [PATCH] msi: Immediately mask and unmask msi-x irqs Eric W. Biederman
  2 siblings, 0 replies; 17+ messages in thread
From: Chuck Ebbert @ 2007-03-30 20:56 UTC (permalink / raw)
  To: Williams, Mitch A
  Cc: Eric W. Biederman, Greg KH, Andrew Morton, linux-pci,
	linux-kernel, Kok, Auke-jan H

Williams, Mitch A wrote:
> 
> If Eric is seeing bug reports related to "no vector for IRQ" in the
> wild, then I have to change my stance and agree that this should be
> pushed to -stable.  Every one of those messages indicates that we
> hit the race condition.
> 

Your previous patch is in a Fedora test kernel. Hopefully I will
have some feedback soon.

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=225399

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3)
  2007-03-30 20:49         ` Williams, Mitch A
  2007-03-30 20:56           ` Chuck Ebbert
@ 2007-03-30 21:05           ` Eric W. Biederman
  2007-04-03  7:41           ` [PATCH] msi: Immediately mask and unmask msi-x irqs Eric W. Biederman
  2 siblings, 0 replies; 17+ messages in thread
From: Eric W. Biederman @ 2007-03-30 21:05 UTC (permalink / raw)
  To: Williams, Mitch A
  Cc: Greg KH, Andrew Morton, linux-pci, linux-kernel, Kok, Auke-jan H

"Williams, Mitch A" <mitch.a.williams@intel.com> writes:

> Agreed, this is a subtle bug, and was a real hairball to track down.
> Even so, I'm surprised that nobody else has dug into this, since it
> should affect anybody running MSI-X.  I originally thought I was seeing
> a hardware bug, which is why I dug more deeply into the issue.
>
> If Eric is seeing bug reports related to "no vector for IRQ" in the
> wild, then I have to change my stance and agree that this should be
> pushed to -stable.  Every one of those messages indicates that we
> hit the race condition.

There is a non MSI-X cause as well with the ioapics that I believe I have
now worked around.  It was a lot harder because ioapics don't obey the
pci ordering rules.  Although the primary culprit was receiving a second
instance of the an irq before we had acknowledged the first instance.

I had just about concluded that there was likely a second cause for
the "no vector for IRQ" message but I had not been able to confirm
that until I saw the first version of this patch.

In general the rule for -stable is that a fix has to go to -linus
first.  It makes it simple to see if the normal QA has been done
on the patch before we push it towards stable.  So this patch needs to
hit 2.6.21 first.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] msi: Immediately mask and unmask msi-x irqs.
  2007-03-30 20:49         ` Williams, Mitch A
  2007-03-30 20:56           ` Chuck Ebbert
  2007-03-30 21:05           ` Eric W. Biederman
@ 2007-04-03  7:41           ` Eric W. Biederman
  2007-04-03 17:24             ` Williams, Mitch A
  2007-04-03 18:52             ` Siddha, Suresh B
  2 siblings, 2 replies; 17+ messages in thread
From: Eric W. Biederman @ 2007-04-03  7:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Greg KH, Andrew Morton, linux-pci, linux-kernel, Kok, Auke-jan H,
	Williams, Mitch A


This is a simplified and actually more comprehensive form of a bug
fix from Mitch Williams <mitch.a.williams@intel.com>.

When we mask or unmask a msi-x irqs the writes may be posted because
we are writing to memory mapped region.  This means the mask and
unmask don't happen immediately but at some unspecified time in the
future.  Which is out of sync with how the mask/unmask logic work
for ioapic irqs.

The practical result is that we get very subtle and hard to track down
irq migration bugs.

This patch performs a read flush after writes to the MSI-X table for mask
and unmask operations.  Since the SMP affinity is set while the interrupt
is masked, and since it's unmasked immediately after, no additional flushes
are required in the various affinity setting routines.

The testing by Mitch Williams on his especially problematic system should
still be valid as I have only simplified the code, not changed the
functionality.

We currently have 7 drivers: cciss, mthca, cxgb3, forceth, s2io,
pcie/portdrv_core, and qla2xxx in 2.6.21 that are affected by this
problem when the hardware they driver is plugged into the right slot.

Given the difficulty of reproducing this bug and tracing it down to
anything that even remotely resembles a cause, even if people are
being affected we aren't likely to see many meaningful bug reports, and
the people who see this bug aren't likely to be able to reproduce this
bug in a timely fashion.  So it is best to get this problem fixed 
as soon as we can so people don't have problems.

Then if people do have a kernel message stating "No irq for vector" we
will know it is yet another novel cause that needs a complete new
investigation.

So here is a one liner that will hopefully be a part of 2.6.21.

Cc: Mitch Williams <mitch.a.williams@intel.com>
Cc: Greg KH <greg@kroah.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 drivers/pci/msi.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index ad33e01..435c195 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -94,6 +94,7 @@ static void msi_set_mask_bit(unsigned int irq, int flag)
 		int offset = entry->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE +
 			PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET;
 		writel(flag, entry->mask_base + offset);
+		readl(entry->mask_base + offset);
 		break;
 	}
 	default:
-- 
1.5.0.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* RE: [PATCH] msi: Immediately mask and unmask msi-x irqs.
  2007-04-03  7:41           ` [PATCH] msi: Immediately mask and unmask msi-x irqs Eric W. Biederman
@ 2007-04-03 17:24             ` Williams, Mitch A
  2007-04-03 18:52             ` Siddha, Suresh B
  1 sibling, 0 replies; 17+ messages in thread
From: Williams, Mitch A @ 2007-04-03 17:24 UTC (permalink / raw)
  To: Eric W. Biederman, Linus Torvalds
  Cc: Greg KH, Andrew Morton, linux-pci, linux-kernel, Kok, Auke-jan H

Acked-by: Mitch Williams <mitch.a.williams@intel.com>

>This is a simplified and actually more comprehensive form of a bug
>fix from Mitch Williams <mitch.a.williams@intel.com>.
[snip]
>Then if people do have a kernel message stating "No irq for vector" we
>will know it is yet another novel cause that needs a complete new
>investigation.
>
>So here is a one liner that will hopefully be a part of 2.6.21.
>
>Cc: Mitch Williams <mitch.a.williams@intel.com>
>Cc: Greg KH <greg@kroah.com>
>Cc: Andrew Morton <akpm@linux-foundation.org>
>Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>---
> drivers/pci/msi.c |    1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
>diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
>index ad33e01..435c195 100644
>--- a/drivers/pci/msi.c
>+++ b/drivers/pci/msi.c
>@@ -94,6 +94,7 @@ static void msi_set_mask_bit(unsigned int 
>irq, int flag)
> 		int offset = entry->msi_attrib.entry_nr * 
>PCI_MSIX_ENTRY_SIZE +
> 			PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET;
> 		writel(flag, entry->mask_base + offset);
>+		readl(entry->mask_base + offset);
> 		break;
> 	}
> 	default:
>-- 
>1.5.0.4
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] msi: Immediately mask and unmask msi-x irqs.
  2007-04-03  7:41           ` [PATCH] msi: Immediately mask and unmask msi-x irqs Eric W. Biederman
  2007-04-03 17:24             ` Williams, Mitch A
@ 2007-04-03 18:52             ` Siddha, Suresh B
  2007-04-03 19:39               ` Eric W. Biederman
  1 sibling, 1 reply; 17+ messages in thread
From: Siddha, Suresh B @ 2007-04-03 18:52 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linus Torvalds, Greg KH, Andrew Morton, linux-pci, linux-kernel,
	Kok, Auke-jan H, Williams, Mitch A

On Tue, Apr 03, 2007 at 01:41:49AM -0600, Eric W. Biederman wrote:
> When we mask or unmask a msi-x irqs the writes may be posted because
> we are writing to memory mapped region.  This means the mask and
> unmask don't happen immediately but at some unspecified time in the
> future.  Which is out of sync with how the mask/unmask logic work
> for ioapic irqs.
> 
> The practical result is that we get very subtle and hard to track down
> irq migration bugs.
> 
> This patch performs a read flush after writes to the MSI-X table for mask
> and unmask operations.  Since the SMP affinity is set while the interrupt
> is masked, and since it's unmasked immediately after, no additional flushes
> are required in the various affinity setting routines.

set_msi_irq_affinity() is already doing read_msi_msg(). So the mask operation
before this should atleast get flushed before we modify the irq destination
information.

With this patch however, unmask happens immediately.

> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index ad33e01..435c195 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -94,6 +94,7 @@ static void msi_set_mask_bit(unsigned int irq, int flag)
>  		int offset = entry->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE +
>  			PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET;
>  		writel(flag, entry->mask_base + offset);
> +		readl(entry->mask_base + offset);

Don't we need the flush for the PCI_CAP_ID_MSI case aswell.

thanks,
suresh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] msi: Immediately mask and unmask msi-x irqs.
  2007-04-03 18:52             ` Siddha, Suresh B
@ 2007-04-03 19:39               ` Eric W. Biederman
  2007-04-03 20:57                 ` Siddha, Suresh B
  0 siblings, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2007-04-03 19:39 UTC (permalink / raw)
  To: Siddha, Suresh B
  Cc: Linus Torvalds, Greg KH, Andrew Morton, linux-pci, linux-kernel,
	Kok, Auke-jan H, Williams, Mitch A

"Siddha, Suresh B" <suresh.b.siddha@intel.com> writes:

> set_msi_irq_affinity() is already doing read_msi_msg(). So the mask operation
> before this should atleast get flushed before we modify the irq destination
> information.

We modify irq reception information in assign_irq_vector, before that.
With my latest delayed free of irq reception information this is less
of an issue.

Further now that we cache the msi message that read_msi_msg should go
away and we should use the cached version from the msi_desc.

> With this patch however, unmask happens immediately.

Yes unmask happens immediately as well.  Just as with Mitch Williams
last patch, and just like we expect.  Further mask/unmask should not be
called that often so there is no point in micro optimizing them.  At
least not until some screams they are a performance problem.

>> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
>> index ad33e01..435c195 100644
>> --- a/drivers/pci/msi.c
>> +++ b/drivers/pci/msi.c
>> @@ -94,6 +94,7 @@ static void msi_set_mask_bit(unsigned int irq, int flag)
>>  		int offset = entry->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE +
>>  			PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET;
>>  		writel(flag, entry->mask_base + offset);
>> +		readl(entry->mask_base + offset);
>
> Don't we need the flush for the PCI_CAP_ID_MSI case aswell.

At least the 3.0 pci spec does not allow pci configuration access to
be posted.  So unless we find some hardware that actually does post
pci configuration writes we should be ok.

If we do find that hardware that posts pci config writes I expect we
will make pci config writes non posted in the generic linux pci config
functions, and only an optimized set of pci accessors functions for
drivers that really know what they are doing will export the posted
behavior.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] msi: Immediately mask and unmask msi-x irqs.
  2007-04-03 19:39               ` Eric W. Biederman
@ 2007-04-03 20:57                 ` Siddha, Suresh B
  0 siblings, 0 replies; 17+ messages in thread
From: Siddha, Suresh B @ 2007-04-03 20:57 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Siddha, Suresh B, Linus Torvalds, Greg KH, Andrew Morton,
	linux-pci, linux-kernel, Kok, Auke-jan H, Williams, Mitch A

On Tue, Apr 03, 2007 at 01:39:43PM -0600, Eric W. Biederman wrote:
> "Siddha, Suresh B" <suresh.b.siddha@intel.com> writes:
> 
> Further now that we cache the msi message that read_msi_msg should go
> away and we should use the cached version from the msi_desc.

Ok. Then this patch is a must.

> > Don't we need the flush for the PCI_CAP_ID_MSI case aswell.
> 
> At least the 3.0 pci spec does not allow pci configuration access to
> be posted.  So unless we find some hardware that actually does post
> pci configuration writes we should be ok.

ok. Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2007-04-03 20:58 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-03-30 18:54 [PATCH 2.6.21-rc5] Flush MSI-X table writes (rev 3) Mitch Williams
2007-03-30 19:04 ` Eric W. Biederman
2007-03-30 19:47   ` Andrew Morton
2007-03-30 19:49     ` Greg KH
2007-03-30 20:00       ` Andrew Morton
2007-03-30 20:10         ` Greg KH
2007-03-30 20:21           ` Williams, Mitch A
2007-03-30 20:24             ` Greg KH
2007-03-30 20:26       ` Eric W. Biederman
2007-03-30 20:49         ` Williams, Mitch A
2007-03-30 20:56           ` Chuck Ebbert
2007-03-30 21:05           ` Eric W. Biederman
2007-04-03  7:41           ` [PATCH] msi: Immediately mask and unmask msi-x irqs Eric W. Biederman
2007-04-03 17:24             ` Williams, Mitch A
2007-04-03 18:52             ` Siddha, Suresh B
2007-04-03 19:39               ` Eric W. Biederman
2007-04-03 20:57                 ` Siddha, Suresh B

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox