netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] [PCI] Check that MWI bit really did get set
@ 2006-10-06 19:05 Matthew Wilcox
  2006-10-06 19:05 ` [PATCH 2/2] [TULIP] Check the return value from pci_set_mwi() Matthew Wilcox
                   ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Matthew Wilcox @ 2006-10-06 19:05 UTC (permalink / raw)
  To: Val Henson, Greg Kroah-Hartman
  Cc: netdev, linux-pci, linux-kernel, Matthew Wilcox

Since some devices may not implement the MWI bit, we should check that
the write did set it and return an error if it didn't.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a544997..3d041f4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -900,13 +900,17 @@ #endif
 		return rc;
 
 	pci_read_config_word(dev, PCI_COMMAND, &cmd);
-	if (! (cmd & PCI_COMMAND_INVALIDATE)) {
-		pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev));
-		cmd |= PCI_COMMAND_INVALIDATE;
-		pci_write_config_word(dev, PCI_COMMAND, cmd);
-	}
-	
-	return 0;
+	if (cmd & PCI_COMMAND_INVALIDATE)
+		return 0;
+
+	pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev));
+	cmd |= PCI_COMMAND_INVALIDATE;
+	pci_write_config_word(dev, PCI_COMMAND, cmd);
+
+	/* read result from hardware (in case bit refused to enable) */
+	pci_read_config_word(dev, PCI_COMMAND, &cmd);
+
+	return (cmd & PCI_COMMAND_INVALIDATE) ? 0 : -EINVAL;
 }
 
 /**

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 2/2] [TULIP] Check the return value from pci_set_mwi()
  2006-10-06 19:05 [PATCH 1/2] [PCI] Check that MWI bit really did get set Matthew Wilcox
@ 2006-10-06 19:05 ` Matthew Wilcox
  2006-10-06 19:15   ` Jeff Garzik
  2006-10-06 19:16 ` [PATCH 1/2] [PCI] Check that MWI bit really did get set Jeff Garzik
  2006-10-14  4:41 ` Andrew Morton
  2 siblings, 1 reply; 39+ messages in thread
From: Matthew Wilcox @ 2006-10-06 19:05 UTC (permalink / raw)
  To: Val Henson, Greg Kroah-Hartman
  Cc: netdev, linux-pci, linux-kernel, Matthew Wilcox

We used to check whether pci_set_mwi() had succeeded by testing the
hardware MWI bit.  Now we need only check the return value (and failing
to do so is a warning).  Also, pci_set_mwi() will fail if the cache line
size is 0, so we don't need to check that ourselves any more.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>

diff --git a/drivers/net/tulip/tulip_core.c b/drivers/net/tulip/tulip_core.c
index d11d28c..64d999b 100644
--- a/drivers/net/tulip/tulip_core.c
+++ b/drivers/net/tulip/tulip_core.c
@@ -1135,7 +1135,6 @@ static void __devinit tulip_mwi_config (
 {
 	struct tulip_private *tp = netdev_priv(dev);
 	u8 cache;
-	u16 pci_command;
 	u32 csr0;
 
 	if (tulip_debug > 3)
@@ -1153,21 +1152,15 @@ static void __devinit tulip_mwi_config (
 	/* set or disable MWI in the standard PCI command bit.
 	 * Check for the case where  mwi is desired but not available
 	 */
-	if (csr0 & MWI)	pci_set_mwi(pdev);
-	else		pci_clear_mwi(pdev);
-
-	/* read result from hardware (in case bit refused to enable) */
-	pci_read_config_word(pdev, PCI_COMMAND, &pci_command);
-	if ((csr0 & MWI) && (!(pci_command & PCI_COMMAND_INVALIDATE)))
-		csr0 &= ~MWI;
-
-	/* if cache line size hardwired to zero, no MWI */
-	pci_read_config_byte(pdev, PCI_CACHE_LINE_SIZE, &cache);
-	if ((csr0 & MWI) && (cache == 0)) {
-		csr0 &= ~MWI;
+	if (csr0 & MWI)	{
+		if (pci_set_mwi(pdev))
+			csr0 &= ~MWI;
+	} else {
 		pci_clear_mwi(pdev);
 	}
 
+	pci_read_config_byte(pdev, PCI_CACHE_LINE_SIZE, &cache);
+
 	/* assign per-cacheline-size cache alignment and
 	 * burst length values
 	 */

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/2] [TULIP] Check the return value from pci_set_mwi()
  2006-10-06 19:05 ` [PATCH 2/2] [TULIP] Check the return value from pci_set_mwi() Matthew Wilcox
@ 2006-10-06 19:15   ` Jeff Garzik
  2006-10-06 19:28     ` Matthew Wilcox
  0 siblings, 1 reply; 39+ messages in thread
From: Jeff Garzik @ 2006-10-06 19:15 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Val Henson, Greg Kroah-Hartman, netdev, linux-pci, linux-kernel

Matthew Wilcox wrote:
> Also, pci_set_mwi() will fail if the cache line
> size is 0, so we don't need to check that ourselves any more.

NAK, not true on all arches.  sparc64 at least presumes that the 
firmware DTRT with cacheline size, which hurts us now given this tulip patch

	Jeff



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-06 19:05 [PATCH 1/2] [PCI] Check that MWI bit really did get set Matthew Wilcox
  2006-10-06 19:05 ` [PATCH 2/2] [TULIP] Check the return value from pci_set_mwi() Matthew Wilcox
@ 2006-10-06 19:16 ` Jeff Garzik
  2006-10-14  4:41 ` Andrew Morton
  2 siblings, 0 replies; 39+ messages in thread
From: Jeff Garzik @ 2006-10-06 19:16 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Val Henson, Greg Kroah-Hartman, netdev, linux-pci, linux-kernel

Matthew Wilcox wrote:
> Since some devices may not implement the MWI bit, we should check that
> the write did set it and return an error if it didn't.
> 
> Signed-off-by: Matthew Wilcox <matthew@wil.cx>

ACK, though (as it seems you are doing) you should audit pci_set_mwi() 
users to make sure behavior matches up with this implementation

	Jeff




^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/2] [TULIP] Check the return value from pci_set_mwi()
  2006-10-06 19:15   ` Jeff Garzik
@ 2006-10-06 19:28     ` Matthew Wilcox
  2006-10-06 19:59       ` Jeff Garzik
  0 siblings, 1 reply; 39+ messages in thread
From: Matthew Wilcox @ 2006-10-06 19:28 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Val Henson, Greg Kroah-Hartman, netdev, linux-pci, linux-kernel,
	David Miller

On Fri, Oct 06, 2006 at 03:15:15PM -0400, Jeff Garzik wrote:
> Matthew Wilcox wrote:
> >Also, pci_set_mwi() will fail if the cache line
> >size is 0, so we don't need to check that ourselves any more.
> 
> NAK, not true on all arches.  sparc64 at least presumes that the 
> firmware DTRT with cacheline size, which hurts us now given this tulip patch

How does it hurt us?

int pcibios_prep_mwi(struct pci_dev *dev)
{
        /* We set correct PCI_CACHE_LINE_SIZE register values for every
         * device probed on this platform.  So there is nothing to check
         * and this always succeeds.
         */
        return 0;
}

If Dave's wrong about that, it hurts him, not us ;-)

It's still not necessary for the Tulip driver to check.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/2] [TULIP] Check the return value from pci_set_mwi()
  2006-10-06 19:28     ` Matthew Wilcox
@ 2006-10-06 19:59       ` Jeff Garzik
  2006-10-07  5:34         ` Grant Grundler
  0 siblings, 1 reply; 39+ messages in thread
From: Jeff Garzik @ 2006-10-06 19:59 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Val Henson, Greg Kroah-Hartman, netdev, linux-pci, linux-kernel,
	David Miller

Matthew Wilcox wrote:
> On Fri, Oct 06, 2006 at 03:15:15PM -0400, Jeff Garzik wrote:
>> Matthew Wilcox wrote:
>>> Also, pci_set_mwi() will fail if the cache line
>>> size is 0, so we don't need to check that ourselves any more.
>> NAK, not true on all arches.  sparc64 at least presumes that the 
>> firmware DTRT with cacheline size, which hurts us now given this tulip patch
> 
> How does it hurt us?
> 
> int pcibios_prep_mwi(struct pci_dev *dev)
> {
>         /* We set correct PCI_CACHE_LINE_SIZE register values for every
>          * device probed on this platform.  So there is nothing to check
>          * and this always succeeds.
>          */
>         return 0;
> }
> 
> If Dave's wrong about that, it hurts him, not us ;-)
> 
> It's still not necessary for the Tulip driver to check.

The unmodified tulip driver checks both MWI and cacheline-size because 
one of the clones (PNIC or PNIC2) will let you set the MWI bit, but 
hardwires cacheline size to zero.

If the arches do not behave consistently, we need to keep the check in 
the tulip driver, to avoid incorrectly programming the csr0 MWI bit.

	Jeff




^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/2] [TULIP] Check the return value from pci_set_mwi()
  2006-10-06 19:59       ` Jeff Garzik
@ 2006-10-07  5:34         ` Grant Grundler
  2006-10-07 14:44           ` Jeff Garzik
  0 siblings, 1 reply; 39+ messages in thread
From: Grant Grundler @ 2006-10-07  5:34 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Matthew Wilcox, Val Henson, Greg Kroah-Hartman, netdev, linux-pci,
	linux-kernel, David Miller

On Fri, Oct 06, 2006 at 03:59:57PM -0400, Jeff Garzik wrote:
> The unmodified tulip driver checks both MWI and cacheline-size because 
> one of the clones (PNIC or PNIC2) will let you set the MWI bit, but 
> hardwires cacheline size to zero.

Maybe the generic pci_set_mwi() can verify cacheline size is non-zero?
I don't think each driver should need to enforce this.

> If the arches do not behave consistently, we need to keep the check in 
> the tulip driver, to avoid incorrectly programming the csr0 MWI bit.

Why not fix the arches to be consistent?
There's alot more drivers than arches...and we have control
of the arch specific PCI support.

thanks,
grant

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/2] [TULIP] Check the return value from pci_set_mwi()
  2006-10-07  5:34         ` Grant Grundler
@ 2006-10-07 14:44           ` Jeff Garzik
  0 siblings, 0 replies; 39+ messages in thread
From: Jeff Garzik @ 2006-10-07 14:44 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Matthew Wilcox, Val Henson, Greg Kroah-Hartman, netdev, linux-pci,
	linux-kernel, David Miller

Grant Grundler wrote:
> On Fri, Oct 06, 2006 at 03:59:57PM -0400, Jeff Garzik wrote:
>> The unmodified tulip driver checks both MWI and cacheline-size because 
>> one of the clones (PNIC or PNIC2) will let you set the MWI bit, but 
>> hardwires cacheline size to zero.
> 
> Maybe the generic pci_set_mwi() can verify cacheline size is non-zero?
> I don't think each driver should need to enforce this.

Agreed.


>> If the arches do not behave consistently, we need to keep the check in 
>> the tulip driver, to avoid incorrectly programming the csr0 MWI bit.
> 
> Why not fix the arches to be consistent?
> There's alot more drivers than arches...and we have control
> of the arch specific PCI support.

Agreed.

	Jeff




^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-06 19:05 [PATCH 1/2] [PCI] Check that MWI bit really did get set Matthew Wilcox
  2006-10-06 19:05 ` [PATCH 2/2] [TULIP] Check the return value from pci_set_mwi() Matthew Wilcox
  2006-10-06 19:16 ` [PATCH 1/2] [PCI] Check that MWI bit really did get set Jeff Garzik
@ 2006-10-14  4:41 ` Andrew Morton
  2006-10-14  5:21   ` Greg KH
  2006-10-14 14:02   ` Matthew Wilcox
  2 siblings, 2 replies; 39+ messages in thread
From: Andrew Morton @ 2006-10-14  4:41 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Val Henson, Greg Kroah-Hartman, netdev, linux-pci, linux-kernel

On Fri, 06 Oct 2006 13:05:18 -0600
Matthew Wilcox <matthew@wil.cx> wrote:

> Since some devices may not implement the MWI bit, we should check that
> the write did set it and return an error if it didn't.
> 
> Signed-off-by: Matthew Wilcox <matthew@wil.cx>
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index a544997..3d041f4 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -900,13 +900,17 @@ #endif
>  		return rc;
>  
>  	pci_read_config_word(dev, PCI_COMMAND, &cmd);
> -	if (! (cmd & PCI_COMMAND_INVALIDATE)) {
> -		pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev));
> -		cmd |= PCI_COMMAND_INVALIDATE;
> -		pci_write_config_word(dev, PCI_COMMAND, cmd);
> -	}
> -	
> -	return 0;
> +	if (cmd & PCI_COMMAND_INVALIDATE)
> +		return 0;
> +
> +	pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev));
> +	cmd |= PCI_COMMAND_INVALIDATE;
> +	pci_write_config_word(dev, PCI_COMMAND, cmd);
> +
> +	/* read result from hardware (in case bit refused to enable) */
> +	pci_read_config_word(dev, PCI_COMMAND, &cmd);
> +
> +	return (cmd & PCI_COMMAND_INVALIDATE) ? 0 : -EINVAL;
>  }
>  
>  /**

Bisection shows that this patch
(pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks
suspend-to-disk on my Vaio.  It writes the suspend image and gets to the
point where it's supposed to power down, but doesn't.

After a manual power-cycle it successfully resumes from disk, but
networking (at least) is dead.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-14  4:41 ` Andrew Morton
@ 2006-10-14  5:21   ` Greg KH
  2006-10-14 14:02   ` Matthew Wilcox
  1 sibling, 0 replies; 39+ messages in thread
From: Greg KH @ 2006-10-14  5:21 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Matthew Wilcox, Val Henson, netdev, linux-pci, linux-kernel

On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote:
> On Fri, 06 Oct 2006 13:05:18 -0600
> Matthew Wilcox <matthew@wil.cx> wrote:
> 
> > Since some devices may not implement the MWI bit, we should check that
> > the write did set it and return an error if it didn't.
> > 
> > Signed-off-by: Matthew Wilcox <matthew@wil.cx>
> > 
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index a544997..3d041f4 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -900,13 +900,17 @@ #endif
> >  		return rc;
> >  
> >  	pci_read_config_word(dev, PCI_COMMAND, &cmd);
> > -	if (! (cmd & PCI_COMMAND_INVALIDATE)) {
> > -		pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev));
> > -		cmd |= PCI_COMMAND_INVALIDATE;
> > -		pci_write_config_word(dev, PCI_COMMAND, cmd);
> > -	}
> > -	
> > -	return 0;
> > +	if (cmd & PCI_COMMAND_INVALIDATE)
> > +		return 0;
> > +
> > +	pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev));
> > +	cmd |= PCI_COMMAND_INVALIDATE;
> > +	pci_write_config_word(dev, PCI_COMMAND, cmd);
> > +
> > +	/* read result from hardware (in case bit refused to enable) */
> > +	pci_read_config_word(dev, PCI_COMMAND, &cmd);
> > +
> > +	return (cmd & PCI_COMMAND_INVALIDATE) ? 0 : -EINVAL;
> >  }
> >  
> >  /**
> 
> Bisection shows that this patch
> (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks
> suspend-to-disk on my Vaio.  It writes the suspend image and gets to the
> point where it's supposed to power down, but doesn't.
> 
> After a manual power-cycle it successfully resumes from disk, but
> networking (at least) is dead.

Ok, I'll drop this from my tree too.

Matthew, let me know whn you have a revised patch you wish to have me
include.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-14  4:41 ` Andrew Morton
  2006-10-14  5:21   ` Greg KH
@ 2006-10-14 14:02   ` Matthew Wilcox
  2006-10-14 20:48     ` Andrew Morton
  1 sibling, 1 reply; 39+ messages in thread
From: Matthew Wilcox @ 2006-10-14 14:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Val Henson, Greg Kroah-Hartman, netdev, linux-pci, linux-kernel

On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote:
> Bisection shows that this patch
> (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks
> suspend-to-disk on my Vaio.  It writes the suspend image and gets to the
> point where it's supposed to power down, but doesn't.

How odd.  What driver is calling pci_set_mwi() on the suspend path?
What drivers do you have loaded on the Vaio?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-14 14:02   ` Matthew Wilcox
@ 2006-10-14 20:48     ` Andrew Morton
  2006-10-15  3:20       ` Matthew Wilcox
  0 siblings, 1 reply; 39+ messages in thread
From: Andrew Morton @ 2006-10-14 20:48 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Val Henson, Greg Kroah-Hartman, netdev, linux-pci, linux-kernel

On Sat, 14 Oct 2006 08:02:49 -0600
Matthew Wilcox <matthew@wil.cx> wrote:

> On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote:
> > Bisection shows that this patch
> > (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks
> > suspend-to-disk on my Vaio.  It writes the suspend image and gets to the
> > point where it's supposed to power down, but doesn't.
> 
> How odd.  What driver is calling pci_set_mwi() on the suspend path?

ehci_pci_reinit().  I stuck a dump_stack() in there.  See
http://userweb.kernel.org/~akpm/s5000342.jpg

> What drivers do you have loaded on the Vaio?


sony:/home/akpm> lsmod
Module                  Size  Used by
ipw2200               163184  0 
sonypi                 21484  0 
autofs4                19908  1 
hidp                   16192  2 
l2cap                  21764  5 hidp
bluetooth              49060  2 hidp,l2cap
sunrpc                154172  1 
ip_conntrack_netbios_ns     3328  0 
ipt_REJECT              4736  1 
xt_state                2496  2 
ip_conntrack           51020  2 ip_conntrack_netbios_ns,xt_state
nfnetlink               7128  1 ip_conntrack
xt_tcpudp               3392  4 
iptable_filter          3264  1 
ip_tables              12616  1 iptable_filter
x_tables               15428  4 ipt_REJECT,xt_state,xt_tcpudp,ip_tables
video                  16836  0 
sony_acpi               7312  0 
sbs                    15968  0 
i2c_ec                  5504  1 sbs
button                  7184  0 
battery                10692  0 
asus_acpi              17564  0 
backlight               6464  2 sony_acpi,asus_acpi
ac                      5636  0 
nvram                   8072  0 
ohci1394               33264  0 
ehci_hcd               30088  0 
ieee1394              291896  1 ohci1394
uhci_hcd               22092  0 
sg                     33628  0 
joydev                  9920  0 
evbug                   3392  0 
snd_hda_intel          18968  0 
snd_hda_codec         161536  1 snd_hda_intel
snd_seq_dummy           4228  0 
snd_seq_oss            31744  0 
snd_seq_midi_event      7360  1 snd_seq_oss
snd_seq                48208  5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event
snd_seq_device          8524  3 snd_seq_dummy,snd_seq_oss,snd_seq
ieee80211              30920  1 ipw2200
snd_pcm_oss            41504  0 
snd_mixer_oss          16640  1 snd_pcm_oss
ieee80211_crypt         6016  1 ieee80211
snd_pcm                74632  3 snd_hda_intel,snd_hda_codec,snd_pcm_oss
snd_timer              21316  2 snd_seq,snd_pcm
snd                    50980  9 snd_hda_intel,snd_hda_codec,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer
soundcore               7968  1 snd
snd_page_alloc         10376  2 snd_hda_intel,snd_pcm
piix                    9604  0 [permanent]
i2c_i801                7820  0 
pcspkr                  3136  0 
i2c_core               21840  2 i2c_ec,i2c_i801
generic                 5252  0 [permanent]
ext3                  127688  1 
jbd                    52712  1 ext3
ide_disk               16000  0 
ide_core              114780  3 piix,generic,ide_disk

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-14 20:48     ` Andrew Morton
@ 2006-10-15  3:20       ` Matthew Wilcox
  2006-10-15  6:53         ` Andrew Morton
  2006-10-15  7:08         ` [Bulk] " David Brownell
  0 siblings, 2 replies; 39+ messages in thread
From: Matthew Wilcox @ 2006-10-15  3:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Val Henson, Greg Kroah-Hartman, netdev, linux-pci, linux-kernel

On Sat, Oct 14, 2006 at 01:48:55PM -0700, Andrew Morton wrote:
> On Sat, 14 Oct 2006 08:02:49 -0600
> Matthew Wilcox <matthew@wil.cx> wrote:
> 
> > On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote:
> > > Bisection shows that this patch
> > > (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks
> > > suspend-to-disk on my Vaio.  It writes the suspend image and gets to the
> > > point where it's supposed to power down, but doesn't.
> > 
> > How odd.  What driver is calling pci_set_mwi() on the suspend path?
> 
> ehci_pci_reinit().  I stuck a dump_stack() in there.  See
> http://userweb.kernel.org/~akpm/s5000342.jpg

Thanks for the picture; that's really helpful.

I see.  We hibernate all the devices then wake them all back up again.
No doubt there's a good reason for this.

Still doesn't make much sense, though.  As far as I can see, the only
consequence of this particular patch is that 1) we do an additional read
from PCI_COMMAND and 2) we can return -EINVAL in one additional case.
But the only effect of returning EINVAL is a printk (for this particular
driver):

        /* PCI Memory-Write-Invalidate cycle support is optional (uncommon) */
        retval = pci_set_mwi(pdev);
        if (!retval)
                ehci_dbg(ehci, "MWI active\n");

        ehci_port_power(ehci, 0);

        return 0;

So even if we return EINVAL ... big deal.

Is it possible reading PCI_COMMAND too quickly after writing it causes
a foul-up?  That would be weird ...

so I suppose there's a few things I can ask you to try:

1. Stop reading the register back altogether.  This should revert the behaviour to the prepatch state:

-       pci_read_config_word(dev, PCI_COMMAND, &cmd);
+//     pci_read_config_word(dev, PCI_COMMAND, &cmd);

2. Put an mdelay(1); before that line

3. Change the last line to just return 0.

-       return (cmd & PCI_COMMAND_INVALIDATE) ? 0 : -EINVAL;
+	return 0;

> > What drivers do you have loaded on the Vaio?
> 
> sony:/home/akpm> lsmod

I don't see any of the other drivers calling pci_set_mwi, so i guess we're
looking at the right suspect.


I feel rather guilty about the amount of time you're spending on this;
any bugs you want me to look at as penance?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15  3:20       ` Matthew Wilcox
@ 2006-10-15  6:53         ` Andrew Morton
  2006-10-15 13:54           ` Matthew Wilcox
  2006-10-15  7:08         ` [Bulk] " David Brownell
  1 sibling, 1 reply; 39+ messages in thread
From: Andrew Morton @ 2006-10-15  6:53 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Val Henson, Greg Kroah-Hartman, netdev, linux-pci, linux-kernel

On Sat, 14 Oct 2006 21:20:01 -0600
Matthew Wilcox <matthew@wil.cx> wrote:

> On Sat, Oct 14, 2006 at 01:48:55PM -0700, Andrew Morton wrote:
> > On Sat, 14 Oct 2006 08:02:49 -0600
> > Matthew Wilcox <matthew@wil.cx> wrote:
> > 
> > > On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote:
> > > > Bisection shows that this patch
> > > > (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks
> > > > suspend-to-disk on my Vaio.  It writes the suspend image and gets to the
> > > > point where it's supposed to power down, but doesn't.
> > > 
> > > How odd.  What driver is calling pci_set_mwi() on the suspend path?
> > 
> > ehci_pci_reinit().  I stuck a dump_stack() in there.  See
> > http://userweb.kernel.org/~akpm/s5000342.jpg
> 
> Thanks for the picture; that's really helpful.
> 
> I see.  We hibernate all the devices then wake them all back up again.
> No doubt there's a good reason for this.
> 
> Still doesn't make much sense, though.  As far as I can see, the only
> consequence of this particular patch is that 1) we do an additional read
> from PCI_COMMAND and 2) we can return -EINVAL in one additional case.
> But the only effect of returning EINVAL is a printk (for this particular
> driver):
> 
>         /* PCI Memory-Write-Invalidate cycle support is optional (uncommon) */
>         retval = pci_set_mwi(pdev);
>         if (!retval)
>                 ehci_dbg(ehci, "MWI active\n");
> 
>         ehci_port_power(ehci, 0);
> 
>         return 0;
> 
> So even if we return EINVAL ... big deal.
> 
> Is it possible reading PCI_COMMAND too quickly after writing it causes
> a foul-up?  That would be weird ...
> 
> so I suppose there's a few things I can ask you to try:

It seems to have stopped happening.  I don't know why.

gregkh-pci-pci-prevent-user-config-space-access-during-power-state-transitions.patch
still breaks suspend though: http://userweb.kernel.org/~akpm/s5000349.jpg

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15  3:20       ` Matthew Wilcox
  2006-10-15  6:53         ` Andrew Morton
@ 2006-10-15  7:08         ` David Brownell
  2006-10-15 13:52           ` Matthew Wilcox
  2006-10-15 14:21           ` Alan Cox
  1 sibling, 2 replies; 39+ messages in thread
From: David Brownell @ 2006-10-15  7:08 UTC (permalink / raw)
  To: matthew, akpm; +Cc: val_henson, netdev, linux-pci, linux-kernel, gregkh

> But the only effect of returning EINVAL is a printk (for this particular
> driver):
>
>         /* PCI Memory-Write-Invalidate cycle support is optional (uncommon) */
>         retval = pci_set_mwi(pdev);
>         if (!retval)
>                 ehci_dbg(ehci, "MWI active\n");

Erm, I've lost context here but it's completely legit for hardware
to NOT support MWI, so it is in no way an error if it's not set.
(Memory-Write-Invalidate is just a more efficient way to write data
that may be cached; if the device can't issue those cycles, there's
no loss of correctness.)

Since it's not an error, there should be no such printk ... which
is exactly how it's coded above.

Who is issuing the printk on a non-error code path??

- Dave



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15  7:08         ` [Bulk] " David Brownell
@ 2006-10-15 13:52           ` Matthew Wilcox
  2006-10-15 14:21           ` Alan Cox
  1 sibling, 0 replies; 39+ messages in thread
From: Matthew Wilcox @ 2006-10-15 13:52 UTC (permalink / raw)
  To: David Brownell; +Cc: akpm, val_henson, netdev, linux-pci, linux-kernel, gregkh

On Sun, Oct 15, 2006 at 12:08:09AM -0700, David Brownell wrote:
> > But the only effect of returning EINVAL is a printk (for this particular
> > driver):
> >
> >         /* PCI Memory-Write-Invalidate cycle support is optional (uncommon) */
> >         retval = pci_set_mwi(pdev);
> >         if (!retval)
> >                 ehci_dbg(ehci, "MWI active\n");
> 
> Erm, I've lost context here but it's completely legit for hardware
> to NOT support MWI, so it is in no way an error if it's not set.
> (Memory-Write-Invalidate is just a more efficient way to write data
> that may be cached; if the device can't issue those cycles, there's
> no loss of correctness.)
> 
> Since it's not an error, there should be no such printk ... which
> is exactly how it's coded above.
> 
> Who is issuing the printk on a non-error code path??

Er, that would be the EHCI driver, which you wrote ...

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15  6:53         ` Andrew Morton
@ 2006-10-15 13:54           ` Matthew Wilcox
  2006-10-15 17:47             ` Andrew Morton
  0 siblings, 1 reply; 39+ messages in thread
From: Matthew Wilcox @ 2006-10-15 13:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Val Henson, Greg Kroah-Hartman, netdev, linux-pci, linux-kernel

On Sat, Oct 14, 2006 at 11:53:59PM -0700, Andrew Morton wrote:
> It seems to have stopped happening.  I don't know why.

Argh.  Possibly a mistake during the bisect procedure?

> gregkh-pci-pci-prevent-user-config-space-access-during-power-state-transitions.patch
> still breaks suspend though: http://userweb.kernel.org/~akpm/s5000349.jpg

I believe that; I was going to generate a new patch for that yesterday,
but got distracted trying to debug your other problem.  I'll put out a
new version of that patch later today.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15 14:21           ` Alan Cox
@ 2006-10-15 13:57             ` Matthew Wilcox
  2006-10-15 17:45               ` Andrew Morton
  0 siblings, 1 reply; 39+ messages in thread
From: Matthew Wilcox @ 2006-10-15 13:57 UTC (permalink / raw)
  To: Alan Cox
  Cc: David Brownell, akpm, val_henson, netdev, linux-pci, linux-kernel,
	gregkh

On Sun, Oct 15, 2006 at 03:21:22PM +0100, Alan Cox wrote:
> Ar Sul, 2006-10-15 am 00:08 -0700, ysgrifennodd David Brownell:
> > Since it's not an error, there should be no such printk ... which
> > is exactly how it's coded above.
> 
> The underlying bug is that someone marked pci_set_mwi must-check, that's
> wrong for most of the drivers that use it. If you remove the must check
> annotation from it then the problem and a thousand other spurious
> warnings go away.

There's only about 20 users of pci_set_mwi ... about 12 of them seem to
check it, one of them uses a variable called
compiler_warning_pointless_fix which leaves about 7 warnings to be
removed by removing the __must_check.

However, I do believe the __must_check should be removed.  For example,
the LSI 53c1030 has *nothing* to be done if setting MWI fails.  It just
doesn't work, and the device copes.  It's not like Tulip or sym53c8xx
where there are additional bits to be set or cleared in control registers.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15  7:08         ` [Bulk] " David Brownell
  2006-10-15 13:52           ` Matthew Wilcox
@ 2006-10-15 14:21           ` Alan Cox
  2006-10-15 13:57             ` Matthew Wilcox
  1 sibling, 1 reply; 39+ messages in thread
From: Alan Cox @ 2006-10-15 14:21 UTC (permalink / raw)
  To: David Brownell
  Cc: matthew, akpm, val_henson, netdev, linux-pci, linux-kernel,
	gregkh

Ar Sul, 2006-10-15 am 00:08 -0700, ysgrifennodd David Brownell:
> Since it's not an error, there should be no such printk ... which
> is exactly how it's coded above.

The underlying bug is that someone marked pci_set_mwi must-check, that's
wrong for most of the drivers that use it. If you remove the must check
annotation from it then the problem and a thousand other spurious
warnings go away.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15 13:57             ` Matthew Wilcox
@ 2006-10-15 17:45               ` Andrew Morton
  2006-10-15 19:16                 ` David Brownell
                                   ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Andrew Morton @ 2006-10-15 17:45 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Alan Cox, David Brownell, val_henson, netdev, linux-pci,
	linux-kernel, gregkh

On Sun, 15 Oct 2006 07:57:56 -0600
Matthew Wilcox <matthew@wil.cx> wrote:

> On Sun, Oct 15, 2006 at 03:21:22PM +0100, Alan Cox wrote:
> > Ar Sul, 2006-10-15 am 00:08 -0700, ysgrifennodd David Brownell:
> > > Since it's not an error, there should be no such printk ... which
> > > is exactly how it's coded above.
> > 
> > The underlying bug is that someone marked pci_set_mwi must-check, that's
> > wrong for most of the drivers that use it. If you remove the must check
> > annotation from it then the problem and a thousand other spurious
> > warnings go away.
> 
> There's only about 20 users of pci_set_mwi ... about 12 of them seem to
> check it, one of them uses a variable called
> compiler_warning_pointless_fix which leaves about 7 warnings to be
> removed by removing the __must_check.
> 
> However, I do believe the __must_check should be removed.  For example,
> the LSI 53c1030 has *nothing* to be done if setting MWI fails.  It just
> doesn't work, and the device copes.

If the drivers doesn't care and if it makes no difference to performance
then just delete the call to pci_set_mwi().

But if MWI _does_ make a difference to performance then we should tell
someone that it isn't working rather than silently misbehaving?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15 13:54           ` Matthew Wilcox
@ 2006-10-15 17:47             ` Andrew Morton
  0 siblings, 0 replies; 39+ messages in thread
From: Andrew Morton @ 2006-10-15 17:47 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Val Henson, Greg Kroah-Hartman, netdev, linux-pci, linux-kernel

On Sun, 15 Oct 2006 07:54:41 -0600
Matthew Wilcox <matthew@wil.cx> wrote:

> On Sat, Oct 14, 2006 at 11:53:59PM -0700, Andrew Morton wrote:
> > It seems to have stopped happening.  I don't know why.
> 
> Argh.  Possibly a mistake during the bisect procedure?

I don't think so - I spent a lot of time fiddling with that because it was
so bizarre to have two patches each of which caused the same failure.

Something has changed, perhaps config: the failure is a bit different now
(happens earlier).

> > gregkh-pci-pci-prevent-user-config-space-access-during-power-state-transitions.patch
> > still breaks suspend though: http://userweb.kernel.org/~akpm/s5000349.jpg
> 
> I believe that; I was going to generate a new patch for that yesterday,
> but got distracted trying to debug your other problem.  I'll put out a
> new version of that patch later today.

ok..  Add plenty of debug printks to it.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15 17:45               ` Andrew Morton
@ 2006-10-15 19:16                 ` David Brownell
  2006-10-15 19:34                   ` Andrew Morton
  2006-10-15 21:52                 ` [Bulk] " Alan Cox
  2006-10-16  0:00                 ` Paul Mackerras
  2 siblings, 1 reply; 39+ messages in thread
From: David Brownell @ 2006-10-15 19:16 UTC (permalink / raw)
  To: alan, matthew, akpm; +Cc: val_henson, netdev, linux-pci, linux-kernel, gregkh

(From Alan Cox:)
> The underlying bug is that someone marked pci_set_mwi must-check, that's
> wrong for most of the drivers that use it. If you remove the must check
> annotation from it then the problem and a thousand other spurious
> warnings go away.

Yes, there seems to be abuse of this new "must_check" feature.


(From Andrew Morton:)
> But if MWI _does_ make a difference to performance then we should tell
> someone that it isn't working rather than silently misbehaving?

Thing is, a "difference to performance (alone)" != "misbehavior".

If it affected correctness, then a warning would be appropriate.

Most drivers should be able to say "enable MWI if possible, but
don't worry if it's not possible".  Only a few controllers need
additional setup to make MWI actually work ... if they couldn't
do that setup, that'd be worth a warning before they backed off
to run in a non-MWI mode.

- Dave


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15 19:16                 ` David Brownell
@ 2006-10-15 19:34                   ` Andrew Morton
  2006-10-15 22:45                     ` David Brownell
  0 siblings, 1 reply; 39+ messages in thread
From: Andrew Morton @ 2006-10-15 19:34 UTC (permalink / raw)
  To: David Brownell
  Cc: alan, matthew, val_henson, netdev, linux-pci, linux-kernel,
	gregkh

On Sun, 15 Oct 2006 12:16:31 -0700
David Brownell <david-b@pacbell.net> wrote:

> (From Alan Cox:)
> > The underlying bug is that someone marked pci_set_mwi must-check, that's
> > wrong for most of the drivers that use it. If you remove the must check
> > annotation from it then the problem and a thousand other spurious
> > warnings go away.
> 
> Yes, there seems to be abuse of this new "must_check" feature.
> 
> 
> (From Andrew Morton:)
> > But if MWI _does_ make a difference to performance then we should tell
> > someone that it isn't working rather than silently misbehaving?
> 
> Thing is, a "difference to performance (alone)" != "misbehavior".
> 
> If it affected correctness, then a warning would be appropriate.
> 
> Most drivers should be able to say "enable MWI if possible, but
> don't worry if it's not possible".  Only a few controllers need
> additional setup to make MWI actually work ... if they couldn't
> do that setup, that'd be worth a warning before they backed off
> to run in a non-MWI mode.
> 

So the semantics of pci_set_mwi() are "try to set MWI if this
platform/device supports it".

In that case its interface is misdesigned, because it doesn't discriminate
between "yes-it-does/no-it-doesn't" (which we don't want to report, because
either is expected and legitimate) and "something screwed up", which we do
want to report, because it is always unexpected.

So an appropriate return value protocol would be

0:	No error, unable to set MWI
1:	No error, able to set MWI
-EFOO:	Error

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15 17:45               ` Andrew Morton
  2006-10-15 19:16                 ` David Brownell
@ 2006-10-15 21:52                 ` Alan Cox
  2006-10-16  0:00                 ` Paul Mackerras
  2 siblings, 0 replies; 39+ messages in thread
From: Alan Cox @ 2006-10-15 21:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox, David Brownell, val_henson, netdev, linux-pci,
	linux-kernel, gregkh

Ar Sul, 2006-10-15 am 10:45 -0700, ysgrifennodd Andrew Morton:
> If the drivers doesn't care and if it makes no difference to performance
> then just delete the call to pci_set_mwi().
> 
> But if MWI _does_ make a difference to performance then we should tell
> someone that it isn't working rather than silently misbehaving?

It isn't misbehaving it just isn't available. MWI is rather different to
say pci_set_master() in that it makes a lot of sense for many drivers to
ask for it but not care about the results. Something like pci_set_master
failing is a big problem and has to be handled (although as we often
don't use BIOS PCI services we see fake success in some cases).

MWI is an "extra cheese" option not a "no pizza" case

Alan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15 19:34                   ` Andrew Morton
@ 2006-10-15 22:45                     ` David Brownell
  2006-10-15 23:18                       ` Andrew Morton
  0 siblings, 1 reply; 39+ messages in thread
From: David Brownell @ 2006-10-15 22:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: alan, matthew, val_henson, netdev, linux-pci, linux-kernel,
	gregkh

> > Most drivers should be able to say "enable MWI if possible, but
> > don't worry if it's not possible".  Only a few controllers need
> > additional setup to make MWI actually work ... if they couldn't
> > do that setup, that'd be worth a warning before they backed off
> > to run in a non-MWI mode.
> > 
> 
> So the semantics of pci_set_mwi() are "try to set MWI if this
> platform/device supports it".

Not what I said ... that's what the _driver_ usually wants to do,
which is different from the step implemented by set_mwi(). 


What Alan Cox said is a better paraphrase:

> MWI is an "extra cheese" option not a "no pizza" case

Or "sorry, that car is not available in olive, just burgundy."


Not:

> In that case its interface is misdesigned, because it doesn't discriminate
> between "yes-it-does/no-it-doesn't" (which we don't want to report, because
> either is expected and legitimate) and "something screwed up", which we do
> want to report, because it is always unexpected.

You mis-understand.  It's completely legit for the driver not to care.

I agree that set_mwo() should set MWI if possible, and fail cleanly
if it couldn't (for whatever reason).  Thing is, choosing to treat
that as an error must be the _driver's_ choice ... it'd be wrong to force
that policy into the _interface_ by forcing must_check etc.

- Dave



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15 22:45                     ` David Brownell
@ 2006-10-15 23:18                       ` Andrew Morton
  2006-10-16  0:02                         ` Alan Cox
  2006-10-16  0:16                         ` David Brownell
  0 siblings, 2 replies; 39+ messages in thread
From: Andrew Morton @ 2006-10-15 23:18 UTC (permalink / raw)
  To: David Brownell
  Cc: alan, matthew, val_henson, netdev, linux-pci, linux-kernel,
	gregkh

On Sun, 15 Oct 2006 15:45:58 -0700
David Brownell <david-b@pacbell.net> wrote:

> > In that case its interface is misdesigned, because it doesn't discriminate
> > between "yes-it-does/no-it-doesn't" (which we don't want to report, because
> > either is expected and legitimate) and "something screwed up", which we do
> > want to report, because it is always unexpected.
> 
> You mis-understand.  It's completely legit for the driver not to care.
> 
> I agree that set_mwo() should set MWI if possible, and fail cleanly
> if it couldn't (for whatever reason).  Thing is, choosing to treat
> that as an error must be the _driver's_ choice ... it'd be wrong to force
> that policy into the _interface_ by forcing must_check etc.

No.  If pci_set_mwi() detects an unexpected error then the driver should
take some action: report it, recover from it, fail to load, etc.  If the
driver fails to do any of this then it's a buggy driver.

You, the driver author _do not know_ what pci_set_mwi() does at present, on
all platforms, nor do you know what it does in the future.  For you the
driver author to make assumptions about what's happening inside
pci_set_mwi() is a layering violation.  Maybe the bridge got hot-unplugged.
 Maybe the attempt to set MWI caused some synchronous PCI error.  For
example, take a look at the various implementations of pci_ops.read()
around the place - various of them can fail for various reasons.  

Now it could be that an appropriate solution is to make pci_set_mwi()
return only 0 or 1, and to generate a warning from within pci_set_mwi()
if some unexpected error happens.  In which case it is legitimate for
callers to not check for errors.

This is not a terribly important issue, and it is far from the worst case
of missed error-checking which we have in there. 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-16  0:02                         ` Alan Cox
@ 2006-10-15 23:44                           ` Andrew Morton
  2006-10-16  0:44                             ` Paul Mackerras
  2006-10-16 11:02                             ` Alan Cox
  0 siblings, 2 replies; 39+ messages in thread
From: Andrew Morton @ 2006-10-15 23:44 UTC (permalink / raw)
  To: Alan Cox
  Cc: David Brownell, matthew, val_henson, netdev, linux-pci,
	linux-kernel, gregkh

On Mon, 16 Oct 2006 01:02:40 +0100
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> Ar Sul, 2006-10-15 am 16:18 -0700, ysgrifennodd Andrew Morton:
> > No.  If pci_set_mwi() detects an unexpected error then the driver should
> > take some action: report it, recover from it, fail to load, etc.  If the
> > driver fails to do any of this then it's a buggy driver.
> 
> Wrong and there are several drivers in the kernel that are proof of
> this.

Let me restore the words from my earlier email which you removed so that
you could say that:

  For you the driver author to make assumptions about what's happening
  inside pci_set_mwi() is a layering violation.  Maybe the bridge got
  hot-unplugged.  Maybe the attempt to set MWI caused some synchronous PCI
  error.  For example, take a look at the various implementations of
  pci_ops.read() around the place - various of them can fail for various
  reasons.  


> > You, the driver author _do not know_ what pci_set_mwi() does at present, on
> > all platforms, nor do you know what it does in the future.  For you the
> 
> You don't care. It isn't an error for set_mwi to fail. In fact the only
> reason set_mwi even needs to bother with a return code is that some
> chips want you to set other config private to the device if it is
> available and active.
> 

Let me restore the words from my earlier email which you removed which
address that:

  Now it could be that an appropriate solution is to make pci_set_mwi()
  return only 0 or 1, and to generate a warning from within pci_set_mwi()
  if some unexpected error happens.  In which case it is legitimate for
  callers to not check for errors.



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15 17:45               ` Andrew Morton
  2006-10-15 19:16                 ` David Brownell
  2006-10-15 21:52                 ` [Bulk] " Alan Cox
@ 2006-10-16  0:00                 ` Paul Mackerras
  2006-10-16  0:15                   ` Andrew Morton
  2006-10-16  0:21                   ` David Brownell
  2 siblings, 2 replies; 39+ messages in thread
From: Paul Mackerras @ 2006-10-16  0:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox, Alan Cox, David Brownell, val_henson, netdev,
	linux-pci, linux-kernel, gregkh

Andrew Morton writes:

> If the drivers doesn't care and if it makes no difference to performance
> then just delete the call to pci_set_mwi().
> 
> But if MWI _does_ make a difference to performance then we should tell
> someone that it isn't working rather than silently misbehaving?

That sounds like we need a printk inside pci_set_mwi then, rather than
adding a printk to every single caller.

Paul.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15 23:18                       ` Andrew Morton
@ 2006-10-16  0:02                         ` Alan Cox
  2006-10-15 23:44                           ` Andrew Morton
  2006-10-16  0:16                         ` David Brownell
  1 sibling, 1 reply; 39+ messages in thread
From: Alan Cox @ 2006-10-16  0:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Brownell, matthew, val_henson, netdev, linux-pci,
	linux-kernel, gregkh

Ar Sul, 2006-10-15 am 16:18 -0700, ysgrifennodd Andrew Morton:
> No.  If pci_set_mwi() detects an unexpected error then the driver should
> take some action: report it, recover from it, fail to load, etc.  If the
> driver fails to do any of this then it's a buggy driver.

Wrong and there are several drivers in the kernel that are proof of
this.

> You, the driver author _do not know_ what pci_set_mwi() does at present, on
> all platforms, nor do you know what it does in the future.  For you the

You don't care. It isn't an error for set_mwi to fail. In fact the only
reason set_mwi even needs to bother with a return code is that some
chips want you to set other config private to the device if it is
available and active.

Alan


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-16  0:00                 ` Paul Mackerras
@ 2006-10-16  0:15                   ` Andrew Morton
  2006-10-16  0:21                   ` David Brownell
  1 sibling, 0 replies; 39+ messages in thread
From: Andrew Morton @ 2006-10-16  0:15 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Matthew Wilcox, Alan Cox, David Brownell, val_henson, netdev,
	linux-pci, linux-kernel, gregkh

On Mon, 16 Oct 2006 10:00:25 +1000 Paul Mackerras <paulus@samba.org> wrote:

> Andrew Morton writes:
> 
> > If the drivers doesn't care and if it makes no difference to performance
> > then just delete the call to pci_set_mwi().
> > 
> > But if MWI _does_ make a difference to performance then we should tell
> > someone that it isn't working rather than silently misbehaving?
> 
> That sounds like we need a printk inside pci_set_mwi then, rather than
> adding a printk to every single caller.
> 

I think so, yes.  That's a good solution to a lot of these things.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15 23:18                       ` Andrew Morton
  2006-10-16  0:02                         ` Alan Cox
@ 2006-10-16  0:16                         ` David Brownell
  2006-10-16  0:31                           ` Andrew Morton
  2006-10-16 10:59                           ` Alan Cox
  1 sibling, 2 replies; 39+ messages in thread
From: David Brownell @ 2006-10-16  0:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: alan, matthew, val_henson, netdev, linux-pci, linux-kernel,
	gregkh


> > I agree that set_mwo() should set MWI if possible, and fail cleanly
> > if it couldn't (for whatever reason).  Thing is, choosing to treat
> > that as an error must be the _driver's_ choice ... it'd be wrong to force
> > that policy into the _interface_ by forcing must_check etc.
> 
> No.  If pci_set_mwi() detects an unexpected error then the driver should
> take some action: report it, recover from it, fail to load, etc.  If the
> driver fails to do any of this then it's a buggy driver.

But what is an "unexpected" "error"?  Not every fault that's unexpected
is an error; consider a page fault.

Consider also kfree(NULL).  The same motivation for drivers not needing
to validate that parameter is behind arguing that device drivers should
not need to poke around in PCI config space to verify that the device
supports MWI; and should have the flexibility to ignore the return code,
just like kfree() returns no value.


> You, the driver author _do not know_ what pci_set_mwi() does at present, on
> all platforms, nor do you know what it does in the future. 

I know that it enables MWI accesses ... or fails.  Beyond that, there
should be no reason to care.  If the hardware can use a lower-overhead
type of PCI bus cycle, I want it to do so.  If not, no sweat.


> This is not a terribly important issue, and it is far from the worst case
> of missed error-checking which we have in there. 

The reason I think it's important enough to continue this discussion is
that as it currently stands, it's a good example of a **BAD** interface
design ... since it's pointlessly marked as must_check.  (See appended
patch to fix that issue.)

If you wouldn't want all callers of kfree() to say "if (ptr) kfree(ptr);";
why then would anyone want to demand

	... read the pci config space byte (and cleanly handle errors) ...
	... check for the MWI bit ...
	... if it's set (so we "expect" this next call to succeed) then:
		... call pci_set_mwi() ...
		... test set_mwi() value ...
		... ignore that value ...

where the first three lines duplicate code _inside_ pci_set_mwi(), and the
last two lines are obviously a pure waste of code and effort.  I'd want to
know the criteria by which "if(ptr)" is judged a waste of effort in all
callers, but that more extensive PCI configspace logic was not...

- Dave

-------------------- CUT HERE

Remove bogus annotation of pci_set_mwi() as __must_check.  It's completely
reasonable for drivers to not care about the return code, when all they're
doing is enabling an optional performance-improving mechanism that's often
not even available.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>

--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -499,7 +499,7 @@ int __must_check pci_enable_device_bars(
 void pci_disable_device(struct pci_dev *dev);
 void pci_set_master(struct pci_dev *dev);
 #define HAVE_PCI_SET_MWI
-int __must_check pci_set_mwi(struct pci_dev *dev);
+int pci_set_mwi(struct pci_dev *dev);
 void pci_clear_mwi(struct pci_dev *dev);
 void pci_intx(struct pci_dev *dev, int enable);
 int pci_set_dma_mask(struct pci_dev *dev, u64 mask);


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Bulk] Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-16  0:00                 ` Paul Mackerras
  2006-10-16  0:15                   ` Andrew Morton
@ 2006-10-16  0:21                   ` David Brownell
  1 sibling, 0 replies; 39+ messages in thread
From: David Brownell @ 2006-10-16  0:21 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Andrew Morton, Matthew Wilcox, Alan Cox, val_henson, netdev,
	linux-pci, linux-kernel, gregkh

> > If the drivers doesn't care and if it makes no difference to performance
> > then just delete the call to pci_set_mwi().
> > 
> > But if MWI _does_ make a difference to performance then we should tell
> > someone that it isn't working rather than silently misbehaving?

To repeat:  it's not "misbehaving" since support for MWI cycles is
completely optional.  It'd be more interesting to get messages in
the cases that it can be enabled, since typically it can't be.


> That sounds like we need a printk inside pci_set_mwi then, rather than
> adding a printk to every single caller.

Maybe wrapped in #ifdef CONFIG_SPAM_MY_KERNEL_LOG_MESSAGES ... :)


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-16  0:16                         ` David Brownell
@ 2006-10-16  0:31                           ` Andrew Morton
  2006-10-16 10:59                           ` Alan Cox
  1 sibling, 0 replies; 39+ messages in thread
From: Andrew Morton @ 2006-10-16  0:31 UTC (permalink / raw)
  To: David Brownell
  Cc: alan, matthew, val_henson, netdev, linux-pci, linux-kernel,
	gregkh

On Sun, 15 Oct 2006 17:16:35 -0700
David Brownell <david-b@pacbell.net> wrote:

> 
> > You, the driver author _do not know_ what pci_set_mwi() does at present, on
> > all platforms, nor do you know what it does in the future. 
> 
> I know that it enables MWI accesses ... or fails.  Beyond that, there
> should be no reason to care.  If the hardware can use a lower-overhead
> type of PCI bus cycle, I want it to do so.  If not, no sweat.
> 

There are two reasons why it can fail:

1: The bus doesn't support MWI.  Here, the caller doesn't care.

2: The bus _does_ support MWI, but the attempt to enable it failed. 
   Here we very much do care, because we're losing performance.

> 
> > This is not a terribly important issue, and it is far from the worst case
> > of missed error-checking which we have in there. 
> 
> The reason I think it's important enough to continue this discussion is
> that as it currently stands, it's a good example of a **BAD** interface
> design ... since it's pointlessly marked as must_check.  (See appended
> patch to fix that issue.)

It's important to continue this discussion so that certain principles can
be set and agreed to.  Because we have a *lot* of unchecked errors in
there.  We would benefit from setting guidelines establishing

- Which sorts of errors should be handled in callers

- Which sorts of errors should be handled (ie: just reported) in callees

- Which sorts of errors should be handled in neither callers nor callees
  (are there any of these?)

- Whether is it ever legitimate for a caller to not check the return code
  from a callee which can return -EFOO.  (I suspect not - it probably
  indicates a misdesign in the callee, as in this case).

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15 23:44                           ` Andrew Morton
@ 2006-10-16  0:44                             ` Paul Mackerras
  2006-10-16  1:10                               ` Andrew Morton
  2006-10-16 11:02                             ` Alan Cox
  1 sibling, 1 reply; 39+ messages in thread
From: Paul Mackerras @ 2006-10-16  0:44 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alan Cox, David Brownell, matthew, val_henson, netdev, linux-pci,
	linux-kernel, gregkh

Andrew Morton writes:

> Let me restore the words from my earlier email which you removed so that
> you could say that:
> 
>   For you the driver author to make assumptions about what's happening
>   inside pci_set_mwi() is a layering violation.  Maybe the bridge got
>   hot-unplugged.  Maybe the attempt to set MWI caused some synchronous PCI
>   error.  For example, take a look at the various implementations of
>   pci_ops.read() around the place - various of them can fail for various
>   reasons.  

Maybe aliens are firing a ray-gun at the card.  I think it's
fundamentally wrong for the driver to deny service completely because
of a maybe.

Either there was a transient error that only affected the attempt to
set MWI, in which case a printk (inside pci_set_mwi!) is appropriate,
and we carry on.  Or there is a persistent error condition, in which
case the driver will see something else fail soon enough - something
that the driver actually needs to have working in order to operate -
and fail at that point.

For the driver to stop and refuse to go any further because of an
error in pci_set_mwi has far more disadvantages than advantages.

Paul.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-16  0:44                             ` Paul Mackerras
@ 2006-10-16  1:10                               ` Andrew Morton
  2006-10-16  2:07                                 ` David Brownell
  2006-10-16 10:58                                 ` Alan Cox
  0 siblings, 2 replies; 39+ messages in thread
From: Andrew Morton @ 2006-10-16  1:10 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Alan Cox, David Brownell, matthew, val_henson, netdev, linux-pci,
	linux-kernel, gregkh

On Mon, 16 Oct 2006 10:44:30 +1000
Paul Mackerras <paulus@samba.org> wrote:

> Andrew Morton writes:
> 
> > Let me restore the words from my earlier email which you removed so that
> > you could say that:
> > 
> >   For you the driver author to make assumptions about what's happening
> >   inside pci_set_mwi() is a layering violation.  Maybe the bridge got
> >   hot-unplugged.  Maybe the attempt to set MWI caused some synchronous PCI
> >   error.  For example, take a look at the various implementations of
> >   pci_ops.read() around the place - various of them can fail for various
> >   reasons.  
> 
> Maybe aliens are firing a ray-gun at the card.  I think it's
> fundamentally wrong for the driver to deny service completely because
> of a maybe.
> 
> Either there was a transient error that only affected the attempt to
> set MWI, in which case a printk (inside pci_set_mwi!) is appropriate,
> and we carry on.  Or there is a persistent error condition, in which
> case the driver will see something else fail soon enough - something
> that the driver actually needs to have working in order to operate -
> and fail at that point.
> 
> For the driver to stop and refuse to go any further because of an
> error in pci_set_mwi has far more disadvantages than advantages.
> 

Sure.

So I think what we're needing in this case is:

- A modified version of Willy's patch which returns 0 if MWI was enabled,
  1 if MWI isn't available.

- A printk if something went bad

  It appears that the various functions which try to match the line sizes
  already have printks if something went wrong, but they're using
  KERN_DEBUG facility level and that warning would dupliate the new one in
  pci_set_mwi().

  And although the various implementations of pci_read_config_foo() and
  pci_write_config_foo() can return error codes, we have so many instances
  where we're not checking it, I don't think it's practical to try to start
  doing that everywhere.

- drop the __must_check.

Question is, should pci_set_mwi() ever return -EFOO?  I guess it should, in
the case where setting the line size didn't work out.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-16  1:10                               ` Andrew Morton
@ 2006-10-16  2:07                                 ` David Brownell
  2006-10-16 10:58                                 ` Alan Cox
  1 sibling, 0 replies; 39+ messages in thread
From: David Brownell @ 2006-10-16  2:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Paul Mackerras, Alan Cox, matthew, val_henson, netdev, linux-pci,
	linux-kernel, gregkh

On Sunday 15 October 2006 6:10 pm, Andrew Morton wrote:

> - A printk if something went bad

Where "bad" would I hope exclude cases where the device
doesn't support MWI ... that is, the "go faster if you can"
advice from the driver, where it isn't an error to run into
the common case of _this_ implementation not happening to
be able to issue MWI cycles.

The other cases, where something actually went wrong, would
be reasonable for emitting KERN_ERR or KERN_WARNING messages.

- Dave


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-16  1:10                               ` Andrew Morton
  2006-10-16  2:07                                 ` David Brownell
@ 2006-10-16 10:58                                 ` Alan Cox
  1 sibling, 0 replies; 39+ messages in thread
From: Alan Cox @ 2006-10-16 10:58 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Paul Mackerras, David Brownell, matthew, val_henson, netdev,
	linux-pci, linux-kernel, gregkh

Ar Sul, 2006-10-15 am 18:10 -0700, ysgrifennodd Andrew Morton:
> Question is, should pci_set_mwi() ever return -EFOO?  I guess it should, in
> the case where setting the line size didn't work out.

It does no harm, no driver will ever check anything but 0 v !0 because
the handling is no different in either case.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-16  0:16                         ` David Brownell
  2006-10-16  0:31                           ` Andrew Morton
@ 2006-10-16 10:59                           ` Alan Cox
  1 sibling, 0 replies; 39+ messages in thread
From: Alan Cox @ 2006-10-16 10:59 UTC (permalink / raw)
  To: David Brownell
  Cc: Andrew Morton, matthew, val_henson, netdev, linux-pci,
	linux-kernel, gregkh

Ar Sul, 2006-10-15 am 17:16 -0700, ysgrifennodd David Brownell:
> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>

Acked-by: Alan Cox <alan@redhat.com>
> 
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -499,7 +499,7 @@ int __must_check pci_enable_device_bars(
>  void pci_disable_device(struct pci_dev *dev);
>  void pci_set_master(struct pci_dev *dev);
>  #define HAVE_PCI_SET_MWI
> -int __must_check pci_set_mwi(struct pci_dev *dev);
> +int pci_set_mwi(struct pci_dev *dev);
>  void pci_clear_mwi(struct pci_dev *dev);
>  void pci_intx(struct pci_dev *dev, int enable);
>  int pci_set_dma_mask(struct pci_dev *dev, u64 mask);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] [PCI] Check that MWI bit really did get set
  2006-10-15 23:44                           ` Andrew Morton
  2006-10-16  0:44                             ` Paul Mackerras
@ 2006-10-16 11:02                             ` Alan Cox
  1 sibling, 0 replies; 39+ messages in thread
From: Alan Cox @ 2006-10-16 11:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Brownell, matthew, val_henson, netdev, linux-pci,
	linux-kernel, gregkh

Ar Sul, 2006-10-15 am 16:44 -0700, ysgrifennodd Andrew Morton:
> Let me restore the words from my earlier email which you removed so that
> you could say that:
> 
>   For you the driver author to make assumptions about what's happening
>   inside pci_set_mwi() is a layering violation.  Maybe the bridge got
>   hot-unplugged.  Maybe the attempt to set MWI caused some synchronous PCI
>   error.  For example, take a look at the various implementations of
>   pci_ops.read() around the place - various of them can fail for various
>   reasons.  

Let me repeat what I said before. As a driver author I do not care. It
doesn't matter if it failed because it is not supported or because a
pink elephant went for a dance on the PCI bus.

>   Now it could be that an appropriate solution is to make pci_set_mwi()
>   return only 0 or 1, and to generate a warning from within pci_set_mwi()
>   if some unexpected error happens.  In which case it is legitimate for
>   callers to not check for errors.

That would be my belief, and ditto for a lot of these other functions -
even the correctly __must_check ones like pci_set_master should do the
error reporting in the set_master() function etc not in every driver.
That gives us a single consistent printk and avoids missing them out or
bloat.

Alan


^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2006-10-16 10:36 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-10-06 19:05 [PATCH 1/2] [PCI] Check that MWI bit really did get set Matthew Wilcox
2006-10-06 19:05 ` [PATCH 2/2] [TULIP] Check the return value from pci_set_mwi() Matthew Wilcox
2006-10-06 19:15   ` Jeff Garzik
2006-10-06 19:28     ` Matthew Wilcox
2006-10-06 19:59       ` Jeff Garzik
2006-10-07  5:34         ` Grant Grundler
2006-10-07 14:44           ` Jeff Garzik
2006-10-06 19:16 ` [PATCH 1/2] [PCI] Check that MWI bit really did get set Jeff Garzik
2006-10-14  4:41 ` Andrew Morton
2006-10-14  5:21   ` Greg KH
2006-10-14 14:02   ` Matthew Wilcox
2006-10-14 20:48     ` Andrew Morton
2006-10-15  3:20       ` Matthew Wilcox
2006-10-15  6:53         ` Andrew Morton
2006-10-15 13:54           ` Matthew Wilcox
2006-10-15 17:47             ` Andrew Morton
2006-10-15  7:08         ` [Bulk] " David Brownell
2006-10-15 13:52           ` Matthew Wilcox
2006-10-15 14:21           ` Alan Cox
2006-10-15 13:57             ` Matthew Wilcox
2006-10-15 17:45               ` Andrew Morton
2006-10-15 19:16                 ` David Brownell
2006-10-15 19:34                   ` Andrew Morton
2006-10-15 22:45                     ` David Brownell
2006-10-15 23:18                       ` Andrew Morton
2006-10-16  0:02                         ` Alan Cox
2006-10-15 23:44                           ` Andrew Morton
2006-10-16  0:44                             ` Paul Mackerras
2006-10-16  1:10                               ` Andrew Morton
2006-10-16  2:07                                 ` David Brownell
2006-10-16 10:58                                 ` Alan Cox
2006-10-16 11:02                             ` Alan Cox
2006-10-16  0:16                         ` David Brownell
2006-10-16  0:31                           ` Andrew Morton
2006-10-16 10:59                           ` Alan Cox
2006-10-15 21:52                 ` [Bulk] " Alan Cox
2006-10-16  0:00                 ` Paul Mackerras
2006-10-16  0:15                   ` Andrew Morton
2006-10-16  0:21                   ` David Brownell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).