Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: b44: Reset due to FIFO overflow.
From: Mitchell Erblich @ 2010-06-28 21:21 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: James Courtier-Dutton, netdev
In-Reply-To: <1277723370.4235.388.camel@edumazet-laptop>

On Jun 28, 2010, at 4:09 AM, Eric Dumazet wrote:

> Le lundi 28 juin 2010 à 11:17 +0100, James Courtier-Dutton a écrit :
>> On 28 June 2010 11:00, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>> 
>>> Problem is we receive a spike of RX network frames (possibly UDP or some
>>> other RX only trafic), and chip raises an RX fifo overflow _error_
>>> indication.
>>> 

IMO, spikes are a normal behaviour.

>> 
>> The cause of the RX overflow is in my case is TCP.
>> It is reproducible in mythtv.
>> While watching LiveTV, press "s" for the program guide.
>> The program guide is implemented into mythtv by a SQL query that
>> results in a large response.
>> The kernel is probably not servicing the RX FIFO quickly enough due to
>> it being busy doing something else. In this case, probably a video
>> mode switch.
>> 
> 
> Thats strange, b44 has a big RX ring... and tcp sender should wait for
> ACK...
> 

Slow start, etc SHOULD/CAN  double the number of in-flight segments in each
next round-trip, placing them back to back.

IMO,  a stress test, would be a large number/wirespeed set of pings?

>>> Some hardware are buggy enough that such error indication is fatal and
>>> _require_ hardware reset. Thats life. I suspect b44 driver doing a full
>>> reset is not a random guess from driver author, but to avoid a complete
>>> NIC lockup.
>>> 
>> 
>> Interesting, which hardware, apart from the b44, is it that "requires"
>> a hardware reset after a RX FIFO overflow.
> 
> Just take a look at some net drivers and you'll see some of them have
> this requirement.
> 
> rtl8169_rx_interrupt()
> ...
> 	if (status & RxFOVF) {
> 		rtl8169_schedule_work(dev, rtl8169_reset_task);
> 		dev->stats.rx_fifo_errors++;
> 	}
> 
> 
> 
> 

If they can reset in say X frame loss units, then why not reset if
X is an acceptable number?

And a hammer may fix the dent, while I may be more
interested in preventing the dent in the first place.

Mitchell Erblich

^ permalink raw reply

* Re: [PATCH net-next-2.6] ipv4: sysctl to block responding on down interface
From: Joakim Tjernlund @ 2010-06-28 21:09 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, Stephen Hemminger
In-Reply-To: <1277754121.4235.673.camel@edumazet-laptop>

Eric Dumazet <eric.dumazet@gmail.com> wrote on 2010/06/28 21:42:01:
>
> Le lundi 28 juin 2010 à 21:03 +0200, Joakim Tjernlund a écrit :
> > Stephen Hemminger <shemminger@vyatta.com> wrote on 2010/06/11 17:48:54:
> > >
> > > When Linux is used as a router, it is undesirable for the kernel to process
> > > incoming packets when the address assigned to the interface is down.
> > > The initial problem report was for a management application that used ICMP
> > > to check link availability.
> > >
> > > The default is disabled to maintain compatibility with previous behavior.
> > > This is not recommended for server systems because it makes fail over more
> > > difficult, and does not account for configurations where multiple interfaces
> > > have the same IP address.
> > >
> > > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> >
> > Ping David et. all?
> > I too want this.
>
> You probably missed David reply
>
> http://permalink.gmane.org/gmane.linux.network/164494

Sure did, don't know how that happened, sorry.

Reading David's reply I do wonder about the current behaviour. Why
is it so important to keep responding to an IP address when the
admin has put the interface holding that IP address into administratively
down state? I don't think the weak host model stipulates that it must be so, does it?

To me it "ifconfig eth0 down" means not only to stop using the I/F but
also any IP address associated with the I/F. I was rather surprised that
it didn't work that way. I don't see any way to make Linux stop responding to
that IP other that removing it completely from the system, which is rather
awkward.

Note, I don't mean that the same should be applied for the No Carrier case, just
ifconfig down.

 Jocke

^ permalink raw reply

* Re: [RFC][BUG-FIX] the problem of checksum checking in UDP protocol
From: Eric Dumazet @ 2010-06-28 20:38 UTC (permalink / raw)
  To: Shan Wei; +Cc: David Miller, Ronciak, John, netdev
In-Reply-To: <4C287E40.40906@cn.fujitsu.com>

Le lundi 28 juin 2010 à 18:49 +0800, Shan Wei a écrit :
> Eric Dumazet wrote, at 06/26/2010 01:28 PM:
> >> (This patch is not complete, it's just for my idea.)
> >> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> >> index 1dd1aff..47f7e86 100644
> >> --- a/net/ipv6/udp.c
> >> +++ b/net/ipv6/udp.c
> >> @@ -723,6 +723,10 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
> >>                 if (ulen < skb->len) {
> >>                         if (pskb_trim_rcsum(skb, ulen))
> >>                                 goto short_packet;
> >> +
> >> +                       if (skb_csum_unnecessary(skb))
> >> +                               skb->ip_summed = CHECKSUM_NONE;
> >> +
> >>                         saddr = &ipv6_hdr(skb)->saddr;
> >>                         daddr = &ipv6_hdr(skb)->daddr;
> >>                         uh = udp_hdr(skb);
> >>
> > 
> > I really dont know if this fix is the right one.
> > 
> > pskb_trim_rcsum() already contains a check. Should this check be changed
> > to include yours ?
> 
> Oh..... I don't think so.
> pskb_trim_rcsum() is also used when IPv4/IPv6 protocol receiving packets
> and reassembling fragments. IP protocol does the right check and should
> trust CHECKSUM_UNNECESSARY flag that drivers set, So we no need to 
> change IP protocol.
> If we add the skb_csum_unnecessary(skb) check into pskb_trim_rcsum() and
> reset ip_summed with CHECKSUM_NONE, the checksum check that NIC hardward 
> has done is wasted.
> 
> Only for UDP protocol over IPv4/IPv6, and length parameter is lower than
> skb->len, We reset ip_summed with CHECKSUM_NONE.
> 
> 

I read RFC 768 again, and cannot tell if your patch is ever needed.

NIC validated the UDP checksum including the whole IP data length, not
the udp length.

Linux kernel computes checksum using udp.length, not ip.length, because
only udp.length is delivered to application anyway. Extra padding is
meaningless.

RFC 768 author didnt asserted ip.length = udp.length + 8

Both implementations are RFC compliant, but may have different results
with special hand crafted packets.

You add a test for a non issue, my analysis is we should not care at
all, unless you have another valid point (security ???)

If a change is needed, I would vote for a change in NIC firmware,
because RFC 768 gives following pseudo header :

0      7 8     15 16    23 24    31 
+--------+--------+--------+--------+
|          source address           |
+--------+--------+--------+--------+
|        destination address        |
+--------+--------+--------+--------+
|  zero  |protocol|   UDP length    |
+--------+--------+--------+--------+

Note it mentions UDP length, not IP length.

If NIC reports UDP check sum was verified, it should have used UDP length as well.




^ permalink raw reply

* Re: [PATCH net-next-2.6] ipv4: sysctl to block responding on down interface
From: Eric Dumazet @ 2010-06-28 19:42 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Stephen Hemminger, David Miller, netdev
In-Reply-To: <OF62B30BCC.F8523B86-ONC1257750.00687496-C1257750.0068AA27@transmode.se>

Le lundi 28 juin 2010 à 21:03 +0200, Joakim Tjernlund a écrit :
> Stephen Hemminger <shemminger@vyatta.com> wrote on 2010/06/11 17:48:54:
> >
> > When Linux is used as a router, it is undesirable for the kernel to process
> > incoming packets when the address assigned to the interface is down.
> > The initial problem report was for a management application that used ICMP
> > to check link availability.
> >
> > The default is disabled to maintain compatibility with previous behavior.
> > This is not recommended for server systems because it makes fail over more
> > difficult, and does not account for configurations where multiple interfaces
> > have the same IP address.
> >
> > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> Ping David et. all?
> I too want this.

You probably missed David reply

http://permalink.gmane.org/gmane.linux.network/164494




^ permalink raw reply

* Re: [PATCH linux-2.6.35-rc3] micrel phy driver
From: Ben Hutchings @ 2010-06-28 19:23 UTC (permalink / raw)
  To: Choi, David; +Cc: netdev
In-Reply-To: <C43529A246480145B0A6D0234BDB0F0D0212A2@MELANITE.micrel.com>

On Mon, 2010-06-28 at 11:51 -0700, Choi, David wrote:
[...]
> @@ -106,8 +233,12 @@ MODULE_LICENSE("GPL");
>  
>  static struct mdio_device_id micrel_tbl[] = {
>  	{ PHY_ID_KSZ9021, 0x000fff10 },
> -	{ PHY_ID_VSC8201, 0x00fffff0 },
>  	{ PHY_ID_KS8001, 0x00fffff0 },
> +	{ PHY_ID_KSZ9021, 0x000fff10 },
> +	{ PHY_ID_KS8001, 0x00fffff0 },
> +	{ PHY_ID_KS8737, 0x00fffff0 },
> +	{ PHY_ID_KS8041, 0x00fffff0 },
> +	{ PHY_ID_KS8051, 0x00fffff0 },
>  	{ }
>  };

You're duplicating the device IDs for KSZ9021 and KS8001.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH net-next-2.6] ipv4: sysctl to block responding on down interface
From: Joakim Tjernlund @ 2010-06-28 19:03 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, netdev
In-Reply-To: <20100611084854.0680c014@nehalam>

Stephen Hemminger <shemminger@vyatta.com> wrote on 2010/06/11 17:48:54:
>
> When Linux is used as a router, it is undesirable for the kernel to process
> incoming packets when the address assigned to the interface is down.
> The initial problem report was for a management application that used ICMP
> to check link availability.
>
> The default is disabled to maintain compatibility with previous behavior.
> This is not recommended for server systems because it makes fail over more
> difficult, and does not account for configurations where multiple interfaces
> have the same IP address.
>
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Ping David et. all?
I too want this.

 Jocke


^ permalink raw reply

* [PATCH linux-2.6.35-rc3] micrel phy driver
From: Choi, David @ 2010-06-28 18:51 UTC (permalink / raw)
  To: netdev

Hello David Miller

From: David J. Choi <david.choi@micrel.com>

Body of the explanation: This patch has changes as followings;
 -support the interrupt from phy devices from Micrel Inc.
 -support more phy devices, ks8737, ks8721, ks8041, ks8051 from Micrel.
 -remove vsc8201 because this device was used only internal test at Micrel.

Signed-off-by: David J. Choi <david.choi@micrel.com>

---
--- linux-2.6.35-rc3/drivers/net/phy/micrel.c.orig	2010-06-11 19:14:04.000000000 -0700
+++ linux-2.6.35-rc3/drivers/net/phy/micrel.c	2010-06-28 09:53:37.000000000 -0700
@@ -12,7 +12,8 @@
  * Free Software Foundation;  either version 2 of the  License, or (at your
  * option) any later version.
  *
- * Support : ksz9021 , vsc8201, ks8001
+ * Support : ksz9021 1000/100/10 phy from Micrel
+ *		ks8001, ks8737, ks8721, ks8041, ks8051 100/10 phy
  */
 
 #include <linux/kernel.h>
@@ -20,37 +21,146 @@
 #include <linux/phy.h>
 
 #define	PHY_ID_KSZ9021			0x00221611
-#define	PHY_ID_VSC8201			0x000FC413
+#define	PHY_ID_KS8737			0x00221720
+#define	PHY_ID_KS8041			0x00221510
+#define	PHY_ID_KS8051			0x00221550
+/* both for ks8001 Rev. A/B, and for ks8721 Rev 3. */
 #define	PHY_ID_KS8001			0x0022161A
 
+/* general Interrupt control/status reg in vendor specific block. */
+#define MII_KSZPHY_INTCS			0x1B
+#define	KSZPHY_INTCS_JABBER			(1 << 15)
+#define	KSZPHY_INTCS_RECEIVE_ERR		(1 << 14)
+#define	KSZPHY_INTCS_PAGE_RECEIVE		(1 << 13)
+#define	KSZPHY_INTCS_PARELLEL			(1 << 12)
+#define	KSZPHY_INTCS_LINK_PARTNER_ACK		(1 << 11)
+#define	KSZPHY_INTCS_LINK_DOWN			(1 << 10)
+#define	KSZPHY_INTCS_REMOTE_FAULT		(1 << 9)
+#define	KSZPHY_INTCS_LINK_UP			(1 << 8)
+#define	KSZPHY_INTCS_ALL			(KSZPHY_INTCS_LINK_UP |\
+						KSZPHY_INTCS_LINK_DOWN)
+
+/* general PHY control reg in vendor specific block. */
+#define	MII_KSZPHY_CTRL			0x1F
+/* bitmap of PHY register to set interrupt mode */
+#define KSZPHY_CTRL_INT_ACTIVE_HIGH		(1 << 9)
+#define KSZ9021_CTRL_INT_ACTIVE_HIGH		(1 << 14)
+#define KS8737_CTRL_INT_ACTIVE_HIGH		(1 << 14)
+
+static int kszphy_ack_interrupt(struct phy_device *phydev)
+{
+	/* bit[7..0] int status, which is a read and clear register. */
+	int rc;
+
+	rc = phy_read(phydev, MII_KSZPHY_INTCS);
+
+	return (rc < 0) ? rc : 0;
+}
+
+static int kszphy_set_interrupt(struct phy_device *phydev)
+{
+	int temp;
+	temp = (PHY_INTERRUPT_ENABLED == phydev->interrupts) ?
+		KSZPHY_INTCS_ALL : 0;
+	return phy_write(phydev, MII_KSZPHY_INTCS, temp);
+}
+
+static int kszphy_config_intr(struct phy_device *phydev)
+{
+	int temp, rc;
+
+	/* set the interrupt pin active low */
+	temp = phy_read(phydev, MII_KSZPHY_CTRL);
+	temp &= ~KSZPHY_CTRL_INT_ACTIVE_HIGH;
+	phy_write(phydev, MII_KSZPHY_CTRL, temp);
+	rc = kszphy_set_interrupt(phydev);
+	return rc < 0 ? rc : 0;
+}
+
+static int ksz9021_config_intr(struct phy_device *phydev)
+{
+	int temp, rc;
+
+	/* set the interrupt pin active low */
+	temp = phy_read(phydev, MII_KSZPHY_CTRL);
+	temp &= ~KSZ9021_CTRL_INT_ACTIVE_HIGH;
+	phy_write(phydev, MII_KSZPHY_CTRL, temp);
+	rc = kszphy_set_interrupt(phydev);
+	return rc < 0 ? rc : 0;
+}
+
+static int ks8737_config_intr(struct phy_device *phydev)
+{
+	int temp, rc;
+
+	/* set the interrupt pin active low */
+	temp = phy_read(phydev, MII_KSZPHY_CTRL);
+	temp &= ~KS8737_CTRL_INT_ACTIVE_HIGH;
+	phy_write(phydev, MII_KSZPHY_CTRL, temp);
+	rc = kszphy_set_interrupt(phydev);
+	return rc < 0 ? rc : 0;
+}
 
 static int kszphy_config_init(struct phy_device *phydev)
 {
 	return 0;
 }
 
+static struct phy_driver ks8737_driver = {
+	.phy_id		= PHY_ID_KS8737,
+	.phy_id_mask	= 0x00fffff0,
+	.name		= "Micrel KS8737",
+	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause),
+	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
+	.config_init	= kszphy_config_init,
+	.config_aneg	= genphy_config_aneg,
+	.read_status	= genphy_read_status,
+	.ack_interrupt	= kszphy_ack_interrupt,
+	.config_intr	= ks8737_config_intr,
+	.driver		= { .owner = THIS_MODULE,},
+};
+
+static struct phy_driver ks8041_driver = {
+	.phy_id		= PHY_ID_KS8041,
+	.phy_id_mask	= 0x00fffff0,
+	.name		= "Micrel KS8041",
+	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause
+				| SUPPORTED_Asym_Pause),
+	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
+	.config_init	= kszphy_config_init,
+	.config_aneg	= genphy_config_aneg,
+	.read_status	= genphy_read_status,
+	.ack_interrupt	= kszphy_ack_interrupt,
+	.config_intr	= kszphy_config_intr,
+	.driver		= { .owner = THIS_MODULE,},
+};
 
-static struct phy_driver ks8001_driver = {
-	.phy_id		= PHY_ID_KS8001,
-	.name		= "Micrel KS8001",
+static struct phy_driver ks8051_driver = {
+	.phy_id		= PHY_ID_KS8051,
 	.phy_id_mask	= 0x00fffff0,
-	.features	= PHY_BASIC_FEATURES,
-	.flags		= PHY_POLL,
+	.name		= "Micrel KS8051",
+	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause
+				| SUPPORTED_Asym_Pause),
+	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= kszphy_config_init,
 	.config_aneg	= genphy_config_aneg,
 	.read_status	= genphy_read_status,
+	.ack_interrupt	= kszphy_ack_interrupt,
+	.config_intr	= kszphy_config_intr,
 	.driver		= { .owner = THIS_MODULE,},
 };
 
-static struct phy_driver vsc8201_driver = {
-	.phy_id		= PHY_ID_VSC8201,
-	.name		= "Micrel VSC8201",
+static struct phy_driver ks8001_driver = {
+	.phy_id		= PHY_ID_KS8001,
+	.name		= "Micrel KS8001 or KS8721",
 	.phy_id_mask	= 0x00fffff0,
-	.features	= PHY_BASIC_FEATURES,
-	.flags		= PHY_POLL,
+	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause),
+	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= kszphy_config_init,
 	.config_aneg	= genphy_config_aneg,
 	.read_status	= genphy_read_status,
+	.ack_interrupt	= kszphy_ack_interrupt,
+	.config_intr	= kszphy_config_intr,
 	.driver		= { .owner = THIS_MODULE,},
 };
 
@@ -58,11 +168,14 @@ static struct phy_driver ksz9021_driver 
 	.phy_id		= PHY_ID_KSZ9021,
 	.phy_id_mask	= 0x000fff10,
 	.name		= "Micrel KSZ9021 Gigabit PHY",
-	.features	= PHY_GBIT_FEATURES | SUPPORTED_Pause,
-	.flags		= PHY_POLL,
+	.features	= (PHY_GBIT_FEATURES | SUPPORTED_Pause
+				| SUPPORTED_Asym_Pause),
+	.flags		= PHY_HAS_MAGICANEG | PHY_HAS_INTERRUPT,
 	.config_init	= kszphy_config_init,
 	.config_aneg	= genphy_config_aneg,
 	.read_status	= genphy_read_status,
+	.ack_interrupt	= kszphy_ack_interrupt,
+	.config_intr	= ksz9021_config_intr,
 	.driver		= { .owner = THIS_MODULE, },
 };
 
@@ -73,17 +186,29 @@ static int __init ksphy_init(void)
 	ret = phy_driver_register(&ks8001_driver);
 	if (ret)
 		goto err1;
-	ret = phy_driver_register(&vsc8201_driver);
+
+	ret = phy_driver_register(&ksz9021_driver);
 	if (ret)
 		goto err2;
 
-	ret = phy_driver_register(&ksz9021_driver);
+	ret = phy_driver_register(&ks8737_driver);
 	if (ret)
 		goto err3;
+	ret = phy_driver_register(&ks8041_driver);
+	if (ret)
+		goto err4;
+	ret = phy_driver_register(&ks8051_driver);
+	if (ret)
+		goto err5;
+
 	return 0;
 
+err5:
+	phy_driver_unregister(&ks8041_driver);
+err4:
+	phy_driver_unregister(&ks8737_driver);
 err3:
-	phy_driver_unregister(&vsc8201_driver);
+	phy_driver_unregister(&ksz9021_driver);
 err2:
 	phy_driver_unregister(&ks8001_driver);
 err1:
@@ -93,8 +218,10 @@ err1:
 static void __exit ksphy_exit(void)
 {
 	phy_driver_unregister(&ks8001_driver);
-	phy_driver_unregister(&vsc8201_driver);
+	phy_driver_unregister(&ks8737_driver);
 	phy_driver_unregister(&ksz9021_driver);
+	phy_driver_unregister(&ks8041_driver);
+	phy_driver_unregister(&ks8051_driver);
 }
 
 module_init(ksphy_init);
@@ -106,8 +233,12 @@ MODULE_LICENSE("GPL");
 
 static struct mdio_device_id micrel_tbl[] = {
 	{ PHY_ID_KSZ9021, 0x000fff10 },
-	{ PHY_ID_VSC8201, 0x00fffff0 },
 	{ PHY_ID_KS8001, 0x00fffff0 },
+	{ PHY_ID_KSZ9021, 0x000fff10 },
+	{ PHY_ID_KS8001, 0x00fffff0 },
+	{ PHY_ID_KS8737, 0x00fffff0 },
+	{ PHY_ID_KS8041, 0x00fffff0 },
+	{ PHY_ID_KS8051, 0x00fffff0 },
 	{ }
 };
 

---

^ permalink raw reply

* static inline int xfrm_mark_get() broken
From: Andreas Steffen @ 2010-06-28 18:46 UTC (permalink / raw)
  To: netdev

[-- Attachment #1: Type: text/plain, Size: 761 bytes --]

Hi,

experimenting with the new XFRM_MARK feature of the 2.6.34 kernel
I found out that the extraction of the mark mask might accidentally
work on 64 bit platforms but on 32 bit platforms the function is
awfully broken. The rather trivial patch attached to this mail fixes
the problem. Otherwise the XFRM_MARK feature seems quite promising!

Best regards

Andreas

======================================================================
Andreas Steffen                         e-mail: andreas.steffen@hsr.ch
Institute for Internet Technologies and Applications
Hochschule fuer Technik Rapperswil      phone:  +41 55 222 42 68
CH-8640 Rapperswil (Switzerland)        mobile: +41 76 340 25 56
===========================================================[ITA-HSR]==

[-- Attachment #2: xfrm.h.diff --]
[-- Type: text/x-patch, Size: 413 bytes --]

--- linux/include/net/xfrm.h.ori	2010-06-28 18:53:28.229489876 +0200
+++ linux/include/net/xfrm.h	2010-06-28 18:53:50.745487383 +0200
@@ -1587,7 +1587,7 @@
 static inline int xfrm_mark_get(struct nlattr **attrs, struct xfrm_mark *m)
 {
 	if (attrs[XFRMA_MARK])
-		memcpy(m, nla_data(attrs[XFRMA_MARK]), sizeof(m));
+		memcpy(m, nla_data(attrs[XFRMA_MARK]), sizeof(struct xfrm_mark));
 	else
 		m->v = m->m = 0;
 

^ permalink raw reply

* [PATCH net-2.6 2/2] ethtool: Fix potential user buffer overflow for ETHTOOL_{G,S}RXFH
From: Ben Hutchings @ 2010-06-28 18:45 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Santwona Behera
In-Reply-To: <1277750647.2089.19.camel@achroite.uk.solarflarecom.com>

struct ethtool_rxnfc was originally defined in 2.6.27 for the
ETHTOOL_{G,S}RXFH command with only the cmd, flow_type and data
fields.  It was then extended in 2.6.30 to support various additional
commands.  These commands should have been defined to use a new
structure, but it is too late to change that now.

Since user-space may still be using the old structure definition
for the ETHTOOL_{G,S}RXFH commands, and since they do not need the
additional fields, only copy the originally defined fields to and
from user-space.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: stable@kernel.org
---
 include/linux/ethtool.h |    2 ++
 net/core/ethtool.c      |   36 +++++++++++++++++++++++++++---------
 2 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 2c8af09..07f9808 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -379,6 +379,8 @@ struct ethtool_rxnfc {
 	__u32				flow_type;
 	/* The rx flow hash value or the rule DB size */
 	__u64				data;
+	/* The following fields are not valid and must not be used for
+	 * the ETHTOOL_{G,X}RXFH commands. */
 	struct ethtool_rx_flow_spec	fs;
 	__u32				rule_cnt;
 	__u32				rule_locs[0];
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index a3a7e9a..75e4ffe 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -318,23 +318,33 @@ out:
 }
 
 static noinline_for_stack int ethtool_set_rxnfc(struct net_device *dev,
-						void __user *useraddr)
+						u32 cmd, void __user *useraddr)
 {
-	struct ethtool_rxnfc cmd;
+	struct ethtool_rxnfc info;
+	size_t info_size = sizeof(info);
 
 	if (!dev->ethtool_ops->set_rxnfc)
 		return -EOPNOTSUPP;
 
-	if (copy_from_user(&cmd, useraddr, sizeof(cmd)))
+	/* struct ethtool_rxnfc was originally defined for
+	 * ETHTOOL_{G,S}RXFH with only the cmd, flow_type and data
+	 * members.  User-space might still be using that
+	 * definition. */
+	if (cmd == ETHTOOL_SRXFH)
+		info_size = (offsetof(struct ethtool_rxnfc, data) +
+			     sizeof(info.data));
+
+	if (copy_from_user(&info, useraddr, info_size))
 		return -EFAULT;
 
-	return dev->ethtool_ops->set_rxnfc(dev, &cmd);
+	return dev->ethtool_ops->set_rxnfc(dev, &info);
 }
 
 static noinline_for_stack int ethtool_get_rxnfc(struct net_device *dev,
-						void __user *useraddr)
+						u32 cmd, void __user *useraddr)
 {
 	struct ethtool_rxnfc info;
+	size_t info_size = sizeof(info);
 	const struct ethtool_ops *ops = dev->ethtool_ops;
 	int ret;
 	void *rule_buf = NULL;
@@ -342,7 +352,15 @@ static noinline_for_stack int ethtool_get_rxnfc(struct net_device *dev,
 	if (!ops->get_rxnfc)
 		return -EOPNOTSUPP;
 
-	if (copy_from_user(&info, useraddr, sizeof(info)))
+	/* struct ethtool_rxnfc was originally defined for
+	 * ETHTOOL_{G,S}RXFH with only the cmd, flow_type and data
+	 * members.  User-space might still be using that
+	 * definition. */
+	if (cmd == ETHTOOL_GRXFH)
+		info_size = (offsetof(struct ethtool_rxnfc, data) +
+			     sizeof(info.data));
+
+	if (copy_from_user(&info, useraddr, info_size))
 		return -EFAULT;
 
 	if (info.cmd == ETHTOOL_GRXCLSRLALL) {
@@ -360,7 +378,7 @@ static noinline_for_stack int ethtool_get_rxnfc(struct net_device *dev,
 		goto err_out;
 
 	ret = -EFAULT;
-	if (copy_to_user(useraddr, &info, sizeof(info)))
+	if (copy_to_user(useraddr, &info, info_size))
 		goto err_out;
 
 	if (rule_buf) {
@@ -1517,12 +1535,12 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
 	case ETHTOOL_GRXCLSRLCNT:
 	case ETHTOOL_GRXCLSRULE:
 	case ETHTOOL_GRXCLSRLALL:
-		rc = ethtool_get_rxnfc(dev, useraddr);
+		rc = ethtool_get_rxnfc(dev, ethcmd, useraddr);
 		break;
 	case ETHTOOL_SRXFH:
 	case ETHTOOL_SRXCLSRLDEL:
 	case ETHTOOL_SRXCLSRLINS:
-		rc = ethtool_set_rxnfc(dev, useraddr);
+		rc = ethtool_set_rxnfc(dev, ethcmd, useraddr);
 		break;
 	case ETHTOOL_GGRO:
 		rc = ethtool_get_gro(dev, useraddr);
-- 
1.6.2.5

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH net-2.6 1/2] ethtool: Fix potential kernel buffer overflow in ETHTOOL_GRXCLSRLALL
From: Ben Hutchings @ 2010-06-28 18:44 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Santwona Behera

On a 32-bit machine, info.rule_cnt >= 0x40000000 leads to integer
overflow and the buffer may be smaller than needed.  Since
ETHTOOL_GRXCLSRLALL is unprivileged, this can presumably be used for at
least denial of service.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: stable@kernel.org
---
 net/core/ethtool.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index a0f4964..a3a7e9a 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -347,8 +347,9 @@ static noinline_for_stack int ethtool_get_rxnfc(struct net_device *dev,
 
 	if (info.cmd == ETHTOOL_GRXCLSRLALL) {
 		if (info.rule_cnt > 0) {
-			rule_buf = kmalloc(info.rule_cnt * sizeof(u32),
-					   GFP_USER);
+			if (info.rule_cnt <= KMALLOC_MAX_SIZE / sizeof(u32))
+				rule_buf = kmalloc(info.rule_cnt * sizeof(u32),
+						   GFP_USER);
 			if (!rule_buf)
 				return -ENOMEM;
 		}
-- 
1.6.2.5


-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* RE: [PATCH -next] vmxnet3: fail when try to setup unsupported features
From: Shreyas Bhatewara @ 2010-06-28 17:45 UTC (permalink / raw)
  To: Stanislaw Gruszka, netdev@vger.kernel.org; +Cc: Amerigo Wang
In-Reply-To: <20100628112942.0c919746@dhcp-lab-109.englab.brq.redhat.com>

> -----Original Message-----
> From: Stanislaw Gruszka [mailto:sgruszka@redhat.com]
> Sent: Monday, June 28, 2010 2:30 AM
> To: netdev@vger.kernel.org
> Cc: Amerigo Wang; Shreyas Bhatewara
> Subject: [PATCH -next] vmxnet3: fail when try to setup unsupported
> features
> 
> Return EOPNOTSUPP in ethtool_ops->set_flags.
> 
> Fix coding style while at it.
> 
> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
> ---
>  drivers/net/vmxnet3/vmxnet3_ethtool.c |    9 +++++++--
>  1 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/vmxnet3/vmxnet3_ethtool.c
> b/drivers/net/vmxnet3/vmxnet3_ethtool.c
> index 3935c44..8a71a21 100644
> --- a/drivers/net/vmxnet3/vmxnet3_ethtool.c
> +++ b/drivers/net/vmxnet3/vmxnet3_ethtool.c
> @@ -276,16 +276,21 @@ vmxnet3_get_strings(struct net_device *netdev,
> u32 stringset, u8 *buf)
>  }
> 
>  static u32
> -vmxnet3_get_flags(struct net_device *netdev) {
> +vmxnet3_get_flags(struct net_device *netdev)
> +{
>  	return netdev->features;
>  }
> 
>  static int
> -vmxnet3_set_flags(struct net_device *netdev, u32 data) {
> +vmxnet3_set_flags(struct net_device *netdev, u32 data)
> +{
>  	struct vmxnet3_adapter *adapter = netdev_priv(netdev);
>  	u8 lro_requested = (data & ETH_FLAG_LRO) == 0 ? 0 : 1;
>  	u8 lro_present = (netdev->features & NETIF_F_LRO) == 0 ? 0 : 1;
> 
> +	if (data & ~ETH_FLAG_LRO)
> +		return -EOPNOTSUPP;
> +
>  	if (lro_requested ^ lro_present) {
>  		/* toggle the LRO feature*/
>  		netdev->features ^= NETIF_F_LRO;
> --
> 1.5.5.6


Does not make sense to me. Switching LRO on/off is supported from the driver, why should the function return -EOPNOTSUPP ?

->Shreyas

^ permalink raw reply

* [PATCH 3/3] pm_qos: get rid of the allocation in pm_qos_add_request()
From: James Bottomley @ 2010-06-28 17:44 UTC (permalink / raw)
  To: Linux PM; +Cc: markgross, netdev, Takashi Iwai
In-Reply-To: <1277746434.10879.191.camel@mulgrave.site>

Since every caller has to squirrel away the returned pointer anyway,
they might as well supply the memory area.  This fixes a bug in a few of
the call sites where the returned pointer was dereferenced without
checking it for NULL (which gets returned if the kzalloc failed).

I'd like to hear how sound and netdev feels about this: it will add
about two more pointers worth of data to struct netdev and struct
snd_pcm_substream .. but I think it's worth it.  If you're OK, I'll add
your acks and send through the pm tree.

This also looks to me like an android independent clean up (even though
it renders the request_add atomically callable).  I also added include
guards to include/linux/pm_qos_params.h

cc: netdev@vger.kernel.org
cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
---
 drivers/net/e1000e/netdev.c            |   17 +++-----
 drivers/net/igbvf/netdev.c             |    9 ++--
 drivers/net/wireless/ipw2x00/ipw2100.c |   12 +++---
 include/linux/netdevice.h              |    2 +-
 include/linux/pm_qos_params.h          |   13 +++++-
 include/sound/pcm.h                    |    2 +-
 kernel/pm_qos_params.c                 |   67 +++++++++++++++++++-------------
 sound/core/pcm_native.c                |   13 ++----
 8 files changed, 74 insertions(+), 61 deletions(-)

diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index 24507f3..47ea62f 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -2901,10 +2901,10 @@ static void e1000_configure_rx(struct e1000_adapter *adapter)
 			 * dropped transactions.
 			 */
 			pm_qos_update_request(
-				adapter->netdev->pm_qos_req, 55);
+				&adapter->netdev->pm_qos_req, 55);
 		} else {
 			pm_qos_update_request(
-				adapter->netdev->pm_qos_req,
+				&adapter->netdev->pm_qos_req,
 				PM_QOS_DEFAULT_VALUE);
 		}
 	}
@@ -3196,9 +3196,9 @@ int e1000e_up(struct e1000_adapter *adapter)
 
 	/* DMA latency requirement to workaround early-receive/jumbo issue */
 	if (adapter->flags & FLAG_HAS_ERT)
-		adapter->netdev->pm_qos_req =
-			pm_qos_add_request(PM_QOS_CPU_DMA_LATENCY,
-				       PM_QOS_DEFAULT_VALUE);
+		pm_qos_add_request(&adapter->netdev->pm_qos_req,
+				   PM_QOS_CPU_DMA_LATENCY,
+				   PM_QOS_DEFAULT_VALUE);
 
 	/* hardware has been reset, we need to reload some things */
 	e1000_configure(adapter);
@@ -3263,11 +3263,8 @@ void e1000e_down(struct e1000_adapter *adapter)
 	e1000_clean_tx_ring(adapter);
 	e1000_clean_rx_ring(adapter);
 
-	if (adapter->flags & FLAG_HAS_ERT) {
-		pm_qos_remove_request(
-			      adapter->netdev->pm_qos_req);
-		adapter->netdev->pm_qos_req = NULL;
-	}
+	if (adapter->flags & FLAG_HAS_ERT)
+		pm_qos_remove_request(&adapter->netdev->pm_qos_req);
 
 	/*
 	 * TODO: for power management, we could drop the link and
diff --git a/drivers/net/igbvf/netdev.c b/drivers/net/igbvf/netdev.c
index 5e2b2a8..add6197 100644
--- a/drivers/net/igbvf/netdev.c
+++ b/drivers/net/igbvf/netdev.c
@@ -48,7 +48,7 @@
 #define DRV_VERSION "1.0.0-k0"
 char igbvf_driver_name[] = "igbvf";
 const char igbvf_driver_version[] = DRV_VERSION;
-struct pm_qos_request_list *igbvf_driver_pm_qos_req;
+static struct pm_qos_request_list igbvf_driver_pm_qos_req;
 static const char igbvf_driver_string[] =
 				"Intel(R) Virtual Function Network Driver";
 static const char igbvf_copyright[] = "Copyright (c) 2009 Intel Corporation.";
@@ -2902,8 +2902,8 @@ static int __init igbvf_init_module(void)
 	printk(KERN_INFO "%s\n", igbvf_copyright);
 
 	ret = pci_register_driver(&igbvf_driver);
-	igbvf_driver_pm_qos_req = pm_qos_add_request(PM_QOS_CPU_DMA_LATENCY,
-	                       PM_QOS_DEFAULT_VALUE);
+	pm_qos_add_request(&igbvf_driver_pm_qos_req, PM_QOS_CPU_DMA_LATENCY,
+			   PM_QOS_DEFAULT_VALUE);
 
 	return ret;
 }
@@ -2918,8 +2918,7 @@ module_init(igbvf_init_module);
 static void __exit igbvf_exit_module(void)
 {
 	pci_unregister_driver(&igbvf_driver);
-	pm_qos_remove_request(igbvf_driver_pm_qos_req);
-	igbvf_driver_pm_qos_req = NULL;
+	pm_qos_remove_request(&igbvf_driver_pm_qos_req);
 }
 module_exit(igbvf_exit_module);
 
diff --git a/drivers/net/wireless/ipw2x00/ipw2100.c b/drivers/net/wireless/ipw2x00/ipw2100.c
index 0bd4dfa..7f0d98b 100644
--- a/drivers/net/wireless/ipw2x00/ipw2100.c
+++ b/drivers/net/wireless/ipw2x00/ipw2100.c
@@ -174,7 +174,7 @@ that only one external action is invoked at a time.
 #define DRV_DESCRIPTION	"Intel(R) PRO/Wireless 2100 Network Driver"
 #define DRV_COPYRIGHT	"Copyright(c) 2003-2006 Intel Corporation"
 
-struct pm_qos_request_list *ipw2100_pm_qos_req;
+struct pm_qos_request_list ipw2100_pm_qos_req;
 
 /* Debugging stuff */
 #ifdef CONFIG_IPW2100_DEBUG
@@ -1741,7 +1741,7 @@ static int ipw2100_up(struct ipw2100_priv *priv, int deferred)
 	/* the ipw2100 hardware really doesn't want power management delays
 	 * longer than 175usec
 	 */
-	pm_qos_update_request(ipw2100_pm_qos_req, 175);
+	pm_qos_update_request(&ipw2100_pm_qos_req, 175);
 
 	/* If the interrupt is enabled, turn it off... */
 	spin_lock_irqsave(&priv->low_lock, flags);
@@ -1889,7 +1889,7 @@ static void ipw2100_down(struct ipw2100_priv *priv)
 	ipw2100_disable_interrupts(priv);
 	spin_unlock_irqrestore(&priv->low_lock, flags);
 
-	pm_qos_update_request(ipw2100_pm_qos_req, PM_QOS_DEFAULT_VALUE);
+	pm_qos_update_request(&ipw2100_pm_qos_req, PM_QOS_DEFAULT_VALUE);
 
 	/* We have to signal any supplicant if we are disassociating */
 	if (associated)
@@ -6669,8 +6669,8 @@ static int __init ipw2100_init(void)
 	if (ret)
 		goto out;
 
-	ipw2100_pm_qos_req = pm_qos_add_request(PM_QOS_CPU_DMA_LATENCY,
-			PM_QOS_DEFAULT_VALUE);
+	pm_qos_add_request(&ipw2100_pm_qos_req, PM_QOS_CPU_DMA_LATENCY,
+			   PM_QOS_DEFAULT_VALUE);
 #ifdef CONFIG_IPW2100_DEBUG
 	ipw2100_debug_level = debug;
 	ret = driver_create_file(&ipw2100_pci_driver.driver,
@@ -6692,7 +6692,7 @@ static void __exit ipw2100_exit(void)
 			   &driver_attr_debug_level);
 #endif
 	pci_unregister_driver(&ipw2100_pci_driver);
-	pm_qos_remove_request(ipw2100_pm_qos_req);
+	pm_qos_remove_request(&ipw2100_pm_qos_req);
 }
 
 module_init(ipw2100_init);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 40291f3..393555a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -779,7 +779,7 @@ struct net_device {
 	 */
 	char			name[IFNAMSIZ];
 
-	struct pm_qos_request_list *pm_qos_req;
+	struct pm_qos_request_list pm_qos_req;
 
 	/* device name hash chain */
 	struct hlist_node	name_hlist;
diff --git a/include/linux/pm_qos_params.h b/include/linux/pm_qos_params.h
index 8ba440e..77cbddb 100644
--- a/include/linux/pm_qos_params.h
+++ b/include/linux/pm_qos_params.h
@@ -1,8 +1,10 @@
+#ifndef _LINUX_PM_QOS_PARAMS_H
+#define _LINUX_PM_QOS_PARAMS_H
 /* interface for the pm_qos_power infrastructure of the linux kernel.
  *
  * Mark Gross <mgross@linux.intel.com>
  */
-#include <linux/list.h>
+#include <linux/plist.h>
 #include <linux/notifier.h>
 #include <linux/miscdevice.h>
 
@@ -14,9 +16,12 @@
 #define PM_QOS_NUM_CLASSES 4
 #define PM_QOS_DEFAULT_VALUE -1
 
-struct pm_qos_request_list;
+struct pm_qos_request_list {
+	struct plist_node list;
+	int pm_qos_class;
+};
 
-struct pm_qos_request_list *pm_qos_add_request(int pm_qos_class, s32 value);
+void pm_qos_add_request(struct pm_qos_request_list *l, int pm_qos_class, s32 value);
 void pm_qos_update_request(struct pm_qos_request_list *pm_qos_req,
 		s32 new_value);
 void pm_qos_remove_request(struct pm_qos_request_list *pm_qos_req);
@@ -24,4 +29,6 @@ void pm_qos_remove_request(struct pm_qos_request_list *pm_qos_req);
 int pm_qos_request(int pm_qos_class);
 int pm_qos_add_notifier(int pm_qos_class, struct notifier_block *notifier);
 int pm_qos_remove_notifier(int pm_qos_class, struct notifier_block *notifier);
+int pm_qos_request_active(struct pm_qos_request_list *req);
 
+#endif
diff --git a/include/sound/pcm.h b/include/sound/pcm.h
index dd76cde..6e3a297 100644
--- a/include/sound/pcm.h
+++ b/include/sound/pcm.h
@@ -366,7 +366,7 @@ struct snd_pcm_substream {
 	int number;
 	char name[32];			/* substream name */
 	int stream;			/* stream (direction) */
-	struct pm_qos_request_list *latency_pm_qos_req; /* pm_qos request */
+	struct pm_qos_request_list latency_pm_qos_req; /* pm_qos request */
 	size_t buffer_bytes_max;	/* limit ring buffer size */
 	struct snd_dma_buffer dma_buffer;
 	unsigned int dma_buf_id;
diff --git a/kernel/pm_qos_params.c b/kernel/pm_qos_params.c
index b130b97..bff4053 100644
--- a/kernel/pm_qos_params.c
+++ b/kernel/pm_qos_params.c
@@ -30,7 +30,6 @@
 /*#define DEBUG*/
 
 #include <linux/pm_qos_params.h>
-#include <linux/plist.h>
 #include <linux/sched.h>
 #include <linux/spinlock.h>
 #include <linux/slab.h>
@@ -49,11 +48,6 @@
  * or pm_qos_object list and pm_qos_objects need to happen with pm_qos_lock
  * held, taken with _irqsave.  One lock to rule them all
  */
-struct pm_qos_request_list {
-	struct plist_node list;
-	int pm_qos_class;
-};
-
 enum pm_qos_type {
 	PM_QOS_MAX,		/* return the largest value */
 	PM_QOS_MIN		/* return the smallest value */
@@ -210,6 +204,12 @@ int pm_qos_request(int pm_qos_class)
 }
 EXPORT_SYMBOL_GPL(pm_qos_request);
 
+int pm_qos_request_active(struct pm_qos_request_list *req)
+{
+	return req->pm_qos_class != 0;
+}
+EXPORT_SYMBOL_GPL(pm_qos_request_active);
+
 /**
  * pm_qos_add_request - inserts new qos request into the list
  * @pm_qos_class: identifies which list of qos request to us
@@ -221,25 +221,23 @@ EXPORT_SYMBOL_GPL(pm_qos_request);
  * element as a handle for use in updating and removal.  Call needs to save
  * this handle for later use.
  */
-struct pm_qos_request_list *pm_qos_add_request(int pm_qos_class, s32 value)
+void pm_qos_add_request(struct pm_qos_request_list *dep,
+			int pm_qos_class, s32 value)
 {
-	struct pm_qos_request_list *dep;
-
-	dep = kzalloc(sizeof(struct pm_qos_request_list), GFP_KERNEL);
-	if (dep) {
-		struct pm_qos_object *o =  pm_qos_array[pm_qos_class];
-		int new_value;
-
-		if (value == PM_QOS_DEFAULT_VALUE)
-			new_value = o->default_value;
-		else
-			new_value = value;
-		plist_node_init(&dep->list, new_value);
-		dep->pm_qos_class = pm_qos_class;
-		update_target(o, &dep->list, 0, PM_QOS_DEFAULT_VALUE);
-	}
+	struct pm_qos_object *o =  pm_qos_array[pm_qos_class];
+	int new_value;
 
-	return dep;
+	if (pm_qos_request_active(dep)) {
+		WARN(1, KERN_ERR "pm_qos_add_request() called for already added request\n");
+		return;
+	}
+	if (value == PM_QOS_DEFAULT_VALUE)
+		new_value = o->default_value;
+	else
+		new_value = value;
+	plist_node_init(&dep->list, new_value);
+	dep->pm_qos_class = pm_qos_class;
+	update_target(o, &dep->list, 0, PM_QOS_DEFAULT_VALUE);
 }
 EXPORT_SYMBOL_GPL(pm_qos_add_request);
 
@@ -262,6 +260,11 @@ void pm_qos_update_request(struct pm_qos_request_list *pm_qos_req,
 	if (!pm_qos_req) /*guard against callers passing in null */
 		return;
 
+	if (pm_qos_request_active(pm_qos_req)) {
+		WARN(1, KERN_ERR "pm_qos_update_request() called for unknown object\n");
+		return;
+	}
+
 	o = pm_qos_array[pm_qos_req->pm_qos_class];
 
 	if (new_value == PM_QOS_DEFAULT_VALUE)
@@ -290,9 +293,14 @@ void pm_qos_remove_request(struct pm_qos_request_list *pm_qos_req)
 		return;
 		/* silent return to keep pcm code cleaner */
 
+	if (!pm_qos_request_active(pm_qos_req)) {
+		WARN(1, KERN_ERR "pm_qos_remove_request() called for unknown object\n");
+		return;
+	}
+
 	o = pm_qos_array[pm_qos_req->pm_qos_class];
 	update_target(o, &pm_qos_req->list, 1, PM_QOS_DEFAULT_VALUE);
-	kfree(pm_qos_req);
+	memset(pm_qos_req, 0, sizeof(*pm_qos_req));
 }
 EXPORT_SYMBOL_GPL(pm_qos_remove_request);
 
@@ -340,8 +348,12 @@ static int pm_qos_power_open(struct inode *inode, struct file *filp)
 
 	pm_qos_class = find_pm_qos_object_by_minor(iminor(inode));
 	if (pm_qos_class >= 0) {
-		filp->private_data = (void *) pm_qos_add_request(pm_qos_class,
-				PM_QOS_DEFAULT_VALUE);
+		struct pm_qos_request_list *req = kzalloc(GFP_KERNEL, sizeof(*req));
+		if (!req)
+			return -ENOMEM;
+
+		pm_qos_add_request(req, pm_qos_class, PM_QOS_DEFAULT_VALUE);
+		filp->private_data = req;
 
 		if (filp->private_data)
 			return 0;
@@ -353,8 +365,9 @@ static int pm_qos_power_release(struct inode *inode, struct file *filp)
 {
 	struct pm_qos_request_list *req;
 
-	req = (struct pm_qos_request_list *)filp->private_data;
+	req = filp->private_data;
 	pm_qos_remove_request(req);
+	kfree(req);
 
 	return 0;
 }
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 303ac04..a3b2a64 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -451,13 +451,11 @@ static int snd_pcm_hw_params(struct snd_pcm_substream *substream,
 	snd_pcm_timer_resolution_change(substream);
 	runtime->status->state = SNDRV_PCM_STATE_SETUP;
 
-	if (substream->latency_pm_qos_req) {
-		pm_qos_remove_request(substream->latency_pm_qos_req);
-		substream->latency_pm_qos_req = NULL;
-	}
+	if (pm_qos_request_active(&substream->latency_pm_qos_req))
+		pm_qos_remove_request(&substream->latency_pm_qos_req);
 	if ((usecs = period_to_usecs(runtime)) >= 0)
-		substream->latency_pm_qos_req = pm_qos_add_request(
-					PM_QOS_CPU_DMA_LATENCY, usecs);
+		pm_qos_add_request(&substream->latency_pm_qos_req,
+				   PM_QOS_CPU_DMA_LATENCY, usecs);
 	return 0;
  _error:
 	/* hardware might be unuseable from this time,
@@ -512,8 +510,7 @@ static int snd_pcm_hw_free(struct snd_pcm_substream *substream)
 	if (substream->ops->hw_free)
 		result = substream->ops->hw_free(substream);
 	runtime->status->state = SNDRV_PCM_STATE_OPEN;
-	pm_qos_remove_request(substream->latency_pm_qos_req);
-	substream->latency_pm_qos_req = NULL;
+	pm_qos_remove_request(&substream->latency_pm_qos_req);
 	return result;
 }
 
-- 
1.6.4.2




^ permalink raw reply related

* Re: [PATCH] vhost: break out of polling loop on error
From: Michael S. Tsirkin @ 2010-06-28 17:25 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: Sridhar Samudrala, Arnd Bergmann, Paul E. McKenney, Juan Quintela,
	Rusty Russell, Jes.Sorensen, kraxel, Takuya Yoshikawa, kvm,
	virtualization, netdev, linux-kernel
In-Reply-To: <1277745103.23755.2.camel@w-sridhar.beaverton.ibm.com>

On Mon, Jun 28, 2010 at 10:11:43AM -0700, Sridhar Samudrala wrote:
> On Sun, 2010-06-27 at 11:59 +0300, Michael S. Tsirkin wrote:
> > When ring parsing fails, we currently handle this
> > as ring empty condition. This means that we enable
> > kicks and recheck ring empty: if this not empty,
> > we re-start polling which of course will fail again.
> > 
> > Instead, let's return a negative error code and stop polling.
> 
> One minor comment on error return below. With that change,
> 
> Acked-by: Sridhar Samudrala <sri@us.ibm.com>

Right. In fact, we don't really use the return code,
and the generated binary is smaller if we return
-EINVAL always.

So that's what I'll do.

> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > 
> > Dave, I'm sending this out so it can get reviewed.
> > I'll put this on my vhost tree
> > so no need for you to pick this patch directly.
> > 
> >  drivers/vhost/net.c   |   12 ++++++++++--
> >  drivers/vhost/vhost.c |   33 +++++++++++++++++----------------
> >  drivers/vhost/vhost.h |    8 ++++----
> >  3 files changed, 31 insertions(+), 22 deletions(-)
> > 
> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> > index 0f41c91..54096ee 100644
> > --- a/drivers/vhost/net.c
> > +++ b/drivers/vhost/net.c
> > @@ -98,7 +98,8 @@ static void tx_poll_start(struct vhost_net *net, struct socket *sock)
> >  static void handle_tx(struct vhost_net *net)
> >  {
> >  	struct vhost_virtqueue *vq = &net->dev.vqs[VHOST_NET_VQ_TX];
> > -	unsigned head, out, in, s;
> > +	unsigned out, in, s;
> > +	int head;
> >  	struct msghdr msg = {
> >  		.msg_name = NULL,
> >  		.msg_namelen = 0,
> > @@ -135,6 +136,9 @@ static void handle_tx(struct vhost_net *net)
> >  					 ARRAY_SIZE(vq->iov),
> >  					 &out, &in,
> >  					 NULL, NULL);
> > +		/* On error, stop handling until the next kick. */
> > +		if (head < 0)
> > +			break;
> >  		/* Nothing new?  Wait for eventfd to tell us they refilled. */
> >  		if (head == vq->num) {
> >  			wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> > @@ -192,7 +196,8 @@ static void handle_tx(struct vhost_net *net)
> >  static void handle_rx(struct vhost_net *net)
> >  {
> >  	struct vhost_virtqueue *vq = &net->dev.vqs[VHOST_NET_VQ_RX];
> > -	unsigned head, out, in, log, s;
> > +	unsigned out, in, log, s;
> > +	int head;
> >  	struct vhost_log *vq_log;
> >  	struct msghdr msg = {
> >  		.msg_name = NULL,
> > @@ -228,6 +233,9 @@ static void handle_rx(struct vhost_net *net)
> >  					 ARRAY_SIZE(vq->iov),
> >  					 &out, &in,
> >  					 vq_log, &log);
> > +		/* On error, stop handling until the next kick. */
> > +		if (head < 0)
> > +			break;
> >  		/* OK, now we need to know about added descriptors. */
> >  		if (head == vq->num) {
> >  			if (unlikely(vhost_enable_notify(vq))) {
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index 3b83382..5ccd384 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -873,12 +873,13 @@ static unsigned get_indirect(struct vhost_dev *dev, struct vhost_virtqueue *vq,
> >   * number of output then some number of input descriptors, it's actually two
> >   * iovecs, but we pack them into one and note how many of each there were.
> >   *
> > - * This function returns the descriptor number found, or vq->num (which
> > - * is never a valid descriptor number) if none was found. */
> > -unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
> > -			   struct iovec iov[], unsigned int iov_size,
> > -			   unsigned int *out_num, unsigned int *in_num,
> > -			   struct vhost_log *log, unsigned int *log_num)
> > + * This function returns the descriptor number found, or vq->num (which is
> > + * never a valid descriptor number) if none was found.  A negative code is
> > + * returned on error. */
> > +int vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
> > +		      struct iovec iov[], unsigned int iov_size,
> > +		      unsigned int *out_num, unsigned int *in_num,
> > +		      struct vhost_log *log, unsigned int *log_num)
> >  {
> >  	struct vring_desc desc;
> >  	unsigned int i, head, found = 0;
> > @@ -890,13 +891,13 @@ unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
> >  	if (get_user(vq->avail_idx, &vq->avail->idx)) {
> >  		vq_err(vq, "Failed to access avail idx at %p\n",
> >  		       &vq->avail->idx);
> > -		return vq->num;
> > +		return -EFAULT;
> >  	}
> > 
> >  	if ((u16)(vq->avail_idx - last_avail_idx) > vq->num) {
> >  		vq_err(vq, "Guest moved used index from %u to %u",
> >  		       last_avail_idx, vq->avail_idx);
> > -		return vq->num;
> > +		return -EFAULT;
> 
> This should be -EINVAL
> >  	}
> > 
> >  	/* If there's nothing new since last we looked, return invalid. */
> > @@ -912,14 +913,14 @@ unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
> >  		vq_err(vq, "Failed to read head: idx %d address %p\n",
> >  		       last_avail_idx,
> >  		       &vq->avail->ring[last_avail_idx % vq->num]);
> > -		return vq->num;
> > +		return -EFAULT;
> >  	}
> > 
> >  	/* If their number is silly, that's an error. */
> >  	if (head >= vq->num) {
> >  		vq_err(vq, "Guest says index %u > %u is available",
> >  		       head, vq->num);
> > -		return vq->num;
> > +		return -EINVAL;
> >  	}
> > 
> >  	/* When we start there are none of either input nor output. */
> > @@ -933,19 +934,19 @@ unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
> >  		if (i >= vq->num) {
> >  			vq_err(vq, "Desc index is %u > %u, head = %u",
> >  			       i, vq->num, head);
> > -			return vq->num;
> > +			return -EINVAL;
> >  		}
> >  		if (++found > vq->num) {
> >  			vq_err(vq, "Loop detected: last one at %u "
> >  			       "vq size %u head %u\n",
> >  			       i, vq->num, head);
> > -			return vq->num;
> > +			return -EINVAL;
> >  		}
> >  		ret = copy_from_user(&desc, vq->desc + i, sizeof desc);
> >  		if (ret) {
> >  			vq_err(vq, "Failed to get descriptor: idx %d addr %p\n",
> >  			       i, vq->desc + i);
> > -			return vq->num;
> > +			return -EFAULT;
> >  		}
> >  		if (desc.flags & VRING_DESC_F_INDIRECT) {
> >  			ret = get_indirect(dev, vq, iov, iov_size,
> > @@ -954,7 +955,7 @@ unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
> >  			if (ret < 0) {
> >  				vq_err(vq, "Failure detected "
> >  				       "in indirect descriptor at idx %d\n", i);
> > -				return vq->num;
> > +				return ret;
> >  			}
> >  			continue;
> >  		}
> > @@ -964,7 +965,7 @@ unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
> >  		if (ret < 0) {
> >  			vq_err(vq, "Translation failure %d descriptor idx %d\n",
> >  			       ret, i);
> > -			return vq->num;
> > +			return ret;
> >  		}
> >  		if (desc.flags & VRING_DESC_F_WRITE) {
> >  			/* If this is an input descriptor,
> > @@ -981,7 +982,7 @@ unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
> >  			if (*in_num) {
> >  				vq_err(vq, "Descriptor has out after in: "
> >  				       "idx %d\n", i);
> > -				return vq->num;
> > +				return -EINVAL;
> >  			}
> >  			*out_num += ret;
> >  		}
> > diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> > index 44591ba..11ee13d 100644
> > --- a/drivers/vhost/vhost.h
> > +++ b/drivers/vhost/vhost.h
> > @@ -120,10 +120,10 @@ long vhost_dev_ioctl(struct vhost_dev *, unsigned int ioctl, unsigned long arg);
> >  int vhost_vq_access_ok(struct vhost_virtqueue *vq);
> >  int vhost_log_access_ok(struct vhost_dev *);
> > 
> > -unsigned vhost_get_vq_desc(struct vhost_dev *, struct vhost_virtqueue *,
> > -			   struct iovec iov[], unsigned int iov_count,
> > -			   unsigned int *out_num, unsigned int *in_num,
> > -			   struct vhost_log *log, unsigned int *log_num);
> > +int vhost_get_vq_desc(struct vhost_dev *, struct vhost_virtqueue *,
> > +		      struct iovec iov[], unsigned int iov_count,
> > +		      unsigned int *out_num, unsigned int *in_num,
> > +		      struct vhost_log *log, unsigned int *log_num);
> >  void vhost_discard_vq_desc(struct vhost_virtqueue *);
> > 
> >  int vhost_add_used(struct vhost_virtqueue *, unsigned int head, int len);

^ permalink raw reply

* RE: [REGRESSION] e1000e stopped working
From: Maxim Levitsky @ 2010-06-28 17:14 UTC (permalink / raw)
  To: Allan, Bruce W; +Cc: netdev@vger.kernel.org
In-Reply-To: <8DD2590731AB5D4C9DBF71A877482A90015918F40A@orsmsx509.amr.corp.intel.com>

On Mon, 2010-06-28 at 10:04 -0700, Allan, Bruce W wrote:
> On Sunday, June 27, 2010 10:47 AM, Maxim Levitsky wrote:
> > On Sun, 2010-06-27 at 20:43 +0300, Maxim Levitsky wrote:
> >> On Sun, 2010-06-27 at 20:29 +0300, Maxim Levitsky wrote:
> >>> On Sun, 2010-06-27 at 20:27 +0300, Maxim Levitsky wrote:
> >>>> Just that,
> >>>> 
> >>>> It doesn't receive anything from my internet router during DHCP.
> >>>> 
> >>>> 
> >>>> 00:19.0 Ethernet controller [0200]: Intel Corporation 82566DC
> >>>> 	Gigabit Network Connection [8086:104b] (rev 02) Subsystem: Intel
> >>>> 	Corporation Device [8086:0001] Control: I/O+ Mem+ BusMaster+
> >>>> 	SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> >>>> 	DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast
> >>>> 	>TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- 	Latency: 0
> >>>> 	Interrupt: pin A routed to IRQ 47 Region 0: Memory at 50300000
> >>>> 	(32-bit, non-prefetchable) [size=128K] Region 1: Memory at
> >>>> 	50324000 (32-bit, non-prefetchable) [size=4K] Region 2: I/O ports
> >>>> 		at 30e0 [size=32] Capabilities: [c8] Power Management version 2
> >>>> 		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
> >>>> 	PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0
> >>>> 		DScale=1 PME- Capabilities: [d0] Message Signalled Interrupts:
> >>>> 	Mask- 64bit+ Queue=0/0 Enable+ Address: 00000000fee0100c  Data:
> >>>> 	41c9 Kernel driver in use: e1000e Kernel modules: e1000e
> >>>> 
> >>>> I use vanilla tree, commit bf2937695fe2330bfd8933a2310e7bdd2581dc2e
> >>>> 
> >>>> 
> >>>> Best regards,
> >>>> 	Maxim Levitsky
> >>>> 
> >>> 
> >>> It appears to work now after reboot.
> >>> Will keep a look for this.
> >>> 
> >>> Disregard for now.
> >> 
> >> 
> >> Just s2ram cycle, problem is back.
> >> Did full reboot (power off then on), same thing card doesn't work...
> >> 
> > 
> > Yep, s2ram sometimes 'fixes', sometimes breaks the card.
> > Something got broken in device initialization path.
> > 
> > Best regards,
> >  	Maxim Levitsky
> 
> What distro are you using?  If RedHat, since you are using DHCP will you please try putting a "LINKDELAY=10" in the /etc/sysconfig/network-scripts/ifcfg-ethX config file.
> 
I use ubuntu 9.10

> Is there anything in the system log that might help narrow down the issue?

Nothing, really nothing.
It seems to detect link, dhcp client sends requests, but doesn't recieve
a thing (even tried promisc mode - doesn't help)



Best regards,
	Maxim Levitsky


^ permalink raw reply

* Re: [PATCH] vhost: break out of polling loop on error
From: Sridhar Samudrala @ 2010-06-28 17:11 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Sridhar Samudrala, Arnd Bergmann, Paul E. McKenney, Juan Quintela,
	Rusty Russell, Jes.Sorensen, kraxel, Takuya Yoshikawa, kvm,
	virtualization, netdev, linux-kernel
In-Reply-To: <20100627085907.GA8588@redhat.com>

On Sun, 2010-06-27 at 11:59 +0300, Michael S. Tsirkin wrote:
> When ring parsing fails, we currently handle this
> as ring empty condition. This means that we enable
> kicks and recheck ring empty: if this not empty,
> we re-start polling which of course will fail again.
> 
> Instead, let's return a negative error code and stop polling.

One minor comment on error return below. With that change,

Acked-by: Sridhar Samudrala <sri@us.ibm.com>

> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> 
> Dave, I'm sending this out so it can get reviewed.
> I'll put this on my vhost tree
> so no need for you to pick this patch directly.
> 
>  drivers/vhost/net.c   |   12 ++++++++++--
>  drivers/vhost/vhost.c |   33 +++++++++++++++++----------------
>  drivers/vhost/vhost.h |    8 ++++----
>  3 files changed, 31 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 0f41c91..54096ee 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -98,7 +98,8 @@ static void tx_poll_start(struct vhost_net *net, struct socket *sock)
>  static void handle_tx(struct vhost_net *net)
>  {
>  	struct vhost_virtqueue *vq = &net->dev.vqs[VHOST_NET_VQ_TX];
> -	unsigned head, out, in, s;
> +	unsigned out, in, s;
> +	int head;
>  	struct msghdr msg = {
>  		.msg_name = NULL,
>  		.msg_namelen = 0,
> @@ -135,6 +136,9 @@ static void handle_tx(struct vhost_net *net)
>  					 ARRAY_SIZE(vq->iov),
>  					 &out, &in,
>  					 NULL, NULL);
> +		/* On error, stop handling until the next kick. */
> +		if (head < 0)
> +			break;
>  		/* Nothing new?  Wait for eventfd to tell us they refilled. */
>  		if (head == vq->num) {
>  			wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> @@ -192,7 +196,8 @@ static void handle_tx(struct vhost_net *net)
>  static void handle_rx(struct vhost_net *net)
>  {
>  	struct vhost_virtqueue *vq = &net->dev.vqs[VHOST_NET_VQ_RX];
> -	unsigned head, out, in, log, s;
> +	unsigned out, in, log, s;
> +	int head;
>  	struct vhost_log *vq_log;
>  	struct msghdr msg = {
>  		.msg_name = NULL,
> @@ -228,6 +233,9 @@ static void handle_rx(struct vhost_net *net)
>  					 ARRAY_SIZE(vq->iov),
>  					 &out, &in,
>  					 vq_log, &log);
> +		/* On error, stop handling until the next kick. */
> +		if (head < 0)
> +			break;
>  		/* OK, now we need to know about added descriptors. */
>  		if (head == vq->num) {
>  			if (unlikely(vhost_enable_notify(vq))) {
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 3b83382..5ccd384 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -873,12 +873,13 @@ static unsigned get_indirect(struct vhost_dev *dev, struct vhost_virtqueue *vq,
>   * number of output then some number of input descriptors, it's actually two
>   * iovecs, but we pack them into one and note how many of each there were.
>   *
> - * This function returns the descriptor number found, or vq->num (which
> - * is never a valid descriptor number) if none was found. */
> -unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
> -			   struct iovec iov[], unsigned int iov_size,
> -			   unsigned int *out_num, unsigned int *in_num,
> -			   struct vhost_log *log, unsigned int *log_num)
> + * This function returns the descriptor number found, or vq->num (which is
> + * never a valid descriptor number) if none was found.  A negative code is
> + * returned on error. */
> +int vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
> +		      struct iovec iov[], unsigned int iov_size,
> +		      unsigned int *out_num, unsigned int *in_num,
> +		      struct vhost_log *log, unsigned int *log_num)
>  {
>  	struct vring_desc desc;
>  	unsigned int i, head, found = 0;
> @@ -890,13 +891,13 @@ unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
>  	if (get_user(vq->avail_idx, &vq->avail->idx)) {
>  		vq_err(vq, "Failed to access avail idx at %p\n",
>  		       &vq->avail->idx);
> -		return vq->num;
> +		return -EFAULT;
>  	}
> 
>  	if ((u16)(vq->avail_idx - last_avail_idx) > vq->num) {
>  		vq_err(vq, "Guest moved used index from %u to %u",
>  		       last_avail_idx, vq->avail_idx);
> -		return vq->num;
> +		return -EFAULT;

This should be -EINVAL
>  	}
> 
>  	/* If there's nothing new since last we looked, return invalid. */
> @@ -912,14 +913,14 @@ unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
>  		vq_err(vq, "Failed to read head: idx %d address %p\n",
>  		       last_avail_idx,
>  		       &vq->avail->ring[last_avail_idx % vq->num]);
> -		return vq->num;
> +		return -EFAULT;
>  	}
> 
>  	/* If their number is silly, that's an error. */
>  	if (head >= vq->num) {
>  		vq_err(vq, "Guest says index %u > %u is available",
>  		       head, vq->num);
> -		return vq->num;
> +		return -EINVAL;
>  	}
> 
>  	/* When we start there are none of either input nor output. */
> @@ -933,19 +934,19 @@ unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
>  		if (i >= vq->num) {
>  			vq_err(vq, "Desc index is %u > %u, head = %u",
>  			       i, vq->num, head);
> -			return vq->num;
> +			return -EINVAL;
>  		}
>  		if (++found > vq->num) {
>  			vq_err(vq, "Loop detected: last one at %u "
>  			       "vq size %u head %u\n",
>  			       i, vq->num, head);
> -			return vq->num;
> +			return -EINVAL;
>  		}
>  		ret = copy_from_user(&desc, vq->desc + i, sizeof desc);
>  		if (ret) {
>  			vq_err(vq, "Failed to get descriptor: idx %d addr %p\n",
>  			       i, vq->desc + i);
> -			return vq->num;
> +			return -EFAULT;
>  		}
>  		if (desc.flags & VRING_DESC_F_INDIRECT) {
>  			ret = get_indirect(dev, vq, iov, iov_size,
> @@ -954,7 +955,7 @@ unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
>  			if (ret < 0) {
>  				vq_err(vq, "Failure detected "
>  				       "in indirect descriptor at idx %d\n", i);
> -				return vq->num;
> +				return ret;
>  			}
>  			continue;
>  		}
> @@ -964,7 +965,7 @@ unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
>  		if (ret < 0) {
>  			vq_err(vq, "Translation failure %d descriptor idx %d\n",
>  			       ret, i);
> -			return vq->num;
> +			return ret;
>  		}
>  		if (desc.flags & VRING_DESC_F_WRITE) {
>  			/* If this is an input descriptor,
> @@ -981,7 +982,7 @@ unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
>  			if (*in_num) {
>  				vq_err(vq, "Descriptor has out after in: "
>  				       "idx %d\n", i);
> -				return vq->num;
> +				return -EINVAL;
>  			}
>  			*out_num += ret;
>  		}
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index 44591ba..11ee13d 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -120,10 +120,10 @@ long vhost_dev_ioctl(struct vhost_dev *, unsigned int ioctl, unsigned long arg);
>  int vhost_vq_access_ok(struct vhost_virtqueue *vq);
>  int vhost_log_access_ok(struct vhost_dev *);
> 
> -unsigned vhost_get_vq_desc(struct vhost_dev *, struct vhost_virtqueue *,
> -			   struct iovec iov[], unsigned int iov_count,
> -			   unsigned int *out_num, unsigned int *in_num,
> -			   struct vhost_log *log, unsigned int *log_num);
> +int vhost_get_vq_desc(struct vhost_dev *, struct vhost_virtqueue *,
> +		      struct iovec iov[], unsigned int iov_count,
> +		      unsigned int *out_num, unsigned int *in_num,
> +		      struct vhost_log *log, unsigned int *log_num);
>  void vhost_discard_vq_desc(struct vhost_virtqueue *);
> 
>  int vhost_add_used(struct vhost_virtqueue *, unsigned int head, int len);


^ permalink raw reply

* RE: [REGRESSION] e1000e stopped working
From: Allan, Bruce W @ 2010-06-28 17:04 UTC (permalink / raw)
  To: Maxim Levitsky, netdev@vger.kernel.org
In-Reply-To: <1277660831.3321.3.camel@localhost.localdomain>

On Sunday, June 27, 2010 10:47 AM, Maxim Levitsky wrote:
> On Sun, 2010-06-27 at 20:43 +0300, Maxim Levitsky wrote:
>> On Sun, 2010-06-27 at 20:29 +0300, Maxim Levitsky wrote:
>>> On Sun, 2010-06-27 at 20:27 +0300, Maxim Levitsky wrote:
>>>> Just that,
>>>> 
>>>> It doesn't receive anything from my internet router during DHCP.
>>>> 
>>>> 
>>>> 00:19.0 Ethernet controller [0200]: Intel Corporation 82566DC
>>>> 	Gigabit Network Connection [8086:104b] (rev 02) Subsystem: Intel
>>>> 	Corporation Device [8086:0001] Control: I/O+ Mem+ BusMaster+
>>>> 	SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
>>>> 	DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast
>>>> 	>TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- 	Latency: 0
>>>> 	Interrupt: pin A routed to IRQ 47 Region 0: Memory at 50300000
>>>> 	(32-bit, non-prefetchable) [size=128K] Region 1: Memory at
>>>> 	50324000 (32-bit, non-prefetchable) [size=4K] Region 2: I/O ports
>>>> 		at 30e0 [size=32] Capabilities: [c8] Power Management version 2
>>>> 		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
>>>> 	PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0
>>>> 		DScale=1 PME- Capabilities: [d0] Message Signalled Interrupts:
>>>> 	Mask- 64bit+ Queue=0/0 Enable+ Address: 00000000fee0100c  Data:
>>>> 	41c9 Kernel driver in use: e1000e Kernel modules: e1000e
>>>> 
>>>> I use vanilla tree, commit bf2937695fe2330bfd8933a2310e7bdd2581dc2e
>>>> 
>>>> 
>>>> Best regards,
>>>> 	Maxim Levitsky
>>>> 
>>> 
>>> It appears to work now after reboot.
>>> Will keep a look for this.
>>> 
>>> Disregard for now.
>> 
>> 
>> Just s2ram cycle, problem is back.
>> Did full reboot (power off then on), same thing card doesn't work...
>> 
> 
> Yep, s2ram sometimes 'fixes', sometimes breaks the card.
> Something got broken in device initialization path.
> 
> Best regards,
>  	Maxim Levitsky

What distro are you using?  If RedHat, since you are using DHCP will you please try putting a "LINKDELAY=10" in the /etc/sysconfig/network-scripts/ifcfg-ethX config file.

Is there anything in the system log that might help narrow down the issue?

Thanks,
Bruce.

^ permalink raw reply

* Re: [PATCH 4/9] cxgb4vf: Add code to provision T4 PCI-E SR-IOV Virtual Functions with hardware resources
From: Casey Leedom @ 2010-06-28 16:55 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20100626133746.GB30133@verge.net.au>

| From: Simon Horman <horms@verge.net.au>
| Date: Saturday, June 26, 2010 06:37 am
| 
| I wonder if it would be cleaner to move the guts of the last hunk
| into a function (e.g. adap_init_sriov()) and have that be a dummy
| function in the case that CONFIG_PCI_IOV in the first hunk is not set.

  I have no objection to this.  I'd like to do minor tuning work like that as a 
follow on rather than respin the patch.  But, as I said in my submission, I'm 
not familiar with this process so if making changes like the above is required 
I'll definitely jump on it.  I don't know what issues are "critical show 
stoppers" and which ones are "nice to have" once things are in place.

  Thanks for your comments!

Casey

^ permalink raw reply

* [PATCH 4/4] phylib: Allow reading and writing a mii bus from atomic context.
From: Richard Cochran @ 2010-06-28 15:35 UTC (permalink / raw)
  To: netdev
In-Reply-To: <cover.1277737222.git.richard.cochran@omicron.at>

In order to support hardware time stamping from a PHY, it is necessary to
read from the PHY while running in_interrupt(). This patch allows a mii
bus to operate in an atomic context. An mii_bus driver may declare itself
capable for this mode. Drivers which do not do this will remain with the
default that bus operations may sleep.

Before commit 35b5f6b1a82b5c586e0b24c711dc6ba944e88ef1 mii bus
operations were protected with spin locks. That commit replaced the
locks with mutexs in order to accommodate i2c buses that need to
sleep. Thus, this patch restores the original behavior as a run time
option.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
---
 drivers/net/fsl_pq_mdio.c  |    4 +-
 drivers/net/phy/mdio_bus.c |   45 +++++++++++++++++++++++++++++++++++++------
 include/linux/phy.h        |   15 ++++++++++++-
 3 files changed, 53 insertions(+), 11 deletions(-)

diff --git a/drivers/net/fsl_pq_mdio.c b/drivers/net/fsl_pq_mdio.c
index b4c41d7..0b34f50 100644
--- a/drivers/net/fsl_pq_mdio.c
+++ b/drivers/net/fsl_pq_mdio.c
@@ -145,7 +145,7 @@ static int fsl_pq_mdio_reset(struct mii_bus *bus)
 	struct fsl_pq_mdio __iomem *regs = fsl_pq_mdio_get_regs(bus);
 	int timeout = PHY_INIT_TIMEOUT;
 
-	mutex_lock(&bus->mdio_lock);
+	mdiobus_lock(bus);
 
 	/* Reset the management interface */
 	out_be32(&regs->miimcfg, MIIMCFG_RESET);
@@ -157,7 +157,7 @@ static int fsl_pq_mdio_reset(struct mii_bus *bus)
 	while ((in_be32(&regs->miimind) & MIIMIND_BUSY) && timeout--)
 		cpu_relax();
 
-	mutex_unlock(&bus->mdio_lock);
+	mdiobus_unlock(bus);
 
 	if (timeout < 0) {
 		printk(KERN_ERR "%s: The MII Bus is stuck!\n",
diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index 6a6b819..ad6bed8 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -37,6 +37,32 @@
 #include <asm/uaccess.h>
 
 /**
+ * mdiobus_lock - locks a given bus for read or write operations.
+ * @bus: target mii_bus
+ */
+void mdiobus_lock(struct mii_bus *bus)
+{
+	if (MDIOBUS_SLEEPS_RW == bus->locktype)
+		mutex_lock(&bus->lock.m);
+	else
+		spin_lock(&bus->lock.s);
+}
+EXPORT_SYMBOL(mdiobus_lock);
+
+/**
+ * mdiobus_unlock - unlocks a given bus for read or write operations.
+ * @bus: target mii_bus
+ */
+void mdiobus_unlock(struct mii_bus *bus)
+{
+	if (MDIOBUS_SLEEPS_RW == bus->locktype)
+		mutex_unlock(&bus->lock.m);
+	else
+		spin_unlock(&bus->lock.s);
+}
+EXPORT_SYMBOL(mdiobus_unlock);
+
+/**
  * mdiobus_alloc - allocate a mii_bus structure
  *
  * Description: called by a bus driver to allocate an mii_bus
@@ -107,7 +133,10 @@ int mdiobus_register(struct mii_bus *bus)
 		return -EINVAL;
 	}
 
-	mutex_init(&bus->mdio_lock);
+	if (MDIOBUS_SLEEPS_RW == bus->locktype)
+		mutex_init(&bus->lock.m);
+	else
+		spin_lock_init(&bus->lock.s);
 
 	if (bus->reset)
 		bus->reset(bus);
@@ -212,11 +241,12 @@ int mdiobus_read(struct mii_bus *bus, int addr, u32 regnum)
 {
 	int retval;
 
-	BUG_ON(in_interrupt());
+	if (MDIOBUS_SLEEPS_RW == bus->locktype)
+		BUG_ON(in_interrupt());
 
-	mutex_lock(&bus->mdio_lock);
+	mdiobus_lock(bus);
 	retval = bus->read(bus, addr, regnum);
-	mutex_unlock(&bus->mdio_lock);
+	mdiobus_unlock(bus);
 
 	return retval;
 }
@@ -237,11 +267,12 @@ int mdiobus_write(struct mii_bus *bus, int addr, u32 regnum, u16 val)
 {
 	int err;
 
-	BUG_ON(in_interrupt());
+	if (MDIOBUS_SLEEPS_RW == bus->locktype)
+		BUG_ON(in_interrupt());
 
-	mutex_lock(&bus->mdio_lock);
+	mdiobus_lock(bus);
 	err = bus->write(bus, addr, regnum, val);
-	mutex_unlock(&bus->mdio_lock);
+	mdiobus_unlock(bus);
 
 	return err;
 }
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 7a8caac..93ea55f 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -98,11 +98,20 @@ struct mii_bus {
 	int (*write)(struct mii_bus *bus, int phy_id, int regnum, u16 val);
 	int (*reset)(struct mii_bus *bus);
 
+	/* Indicates whether bus may be used from an atomic context. */
+	enum {
+		MDIOBUS_SLEEPS_RW,
+		MDIOBUS_ATOMIC_RW
+	} locktype;
+
 	/*
-	 * A lock to ensure that only one thing can read/write
+	 * A lock or mutex to ensure that only one thing can read/write
 	 * the MDIO bus at a time
 	 */
-	struct mutex mdio_lock;
+	union {
+		struct mutex m;
+		spinlock_t s;
+	} lock;
 
 	struct device *parent;
 	enum {
@@ -127,6 +136,8 @@ struct mii_bus {
 };
 #define to_mii_bus(d) container_of(d, struct mii_bus, dev)
 
+void mdiobus_lock(struct mii_bus *bus);
+void mdiobus_unlock(struct mii_bus *bus);
 struct mii_bus *mdiobus_alloc(void);
 int mdiobus_register(struct mii_bus *bus);
 void mdiobus_unregister(struct mii_bus *bus);
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH 3/4] phylib: preserve ifreq parameter when calling generic phy_mii_ioctl()
From: Richard Cochran @ 2010-06-28 15:34 UTC (permalink / raw)
  To: netdev
In-Reply-To: <cover.1277737222.git.richard.cochran@omicron.at>

The phy_mii_ioctl() function unnecessarily throws away the original ifreq.
We need access to the ifreq in order to support PHYs that can perform
hardware time stamping.

Two maverick drivers filter the ioctl commands passed to phy_mii_ioctl().
This is unnecessary since phylib will check the command in any case.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
---
 drivers/net/arm/ixp4xx_eth.c           |    3 ++-
 drivers/net/au1000_eth.c               |    2 +-
 drivers/net/bcm63xx_enet.c             |    2 +-
 drivers/net/cpmac.c                    |    5 +----
 drivers/net/dnet.c                     |    2 +-
 drivers/net/ethoc.c                    |    2 +-
 drivers/net/fec.c                      |    2 +-
 drivers/net/fec_mpc52xx.c              |    2 +-
 drivers/net/fs_enet/fs_enet-main.c     |    3 +--
 drivers/net/gianfar.c                  |    2 +-
 drivers/net/macb.c                     |    2 +-
 drivers/net/mv643xx_eth.c              |    2 +-
 drivers/net/octeon/octeon_mgmt.c       |    2 +-
 drivers/net/phy/phy.c                  |    8 +++++++-
 drivers/net/sb1250-mac.c               |    2 +-
 drivers/net/sh_eth.c                   |    2 +-
 drivers/net/smsc911x.c                 |    2 +-
 drivers/net/smsc9420.c                 |    2 +-
 drivers/net/stmmac/stmmac_main.c       |   22 ++++++++--------------
 drivers/net/tc35815.c                  |    2 +-
 drivers/net/tg3.c                      |    2 +-
 drivers/net/ucc_geth.c                 |    2 +-
 drivers/staging/octeon/ethernet-mdio.c |    2 +-
 include/linux/phy.h                    |    2 +-
 net/dsa/slave.c                        |    3 +--
 25 files changed, 39 insertions(+), 43 deletions(-)

diff --git a/drivers/net/arm/ixp4xx_eth.c b/drivers/net/arm/ixp4xx_eth.c
index ee2f842..4f1cc71 100644
--- a/drivers/net/arm/ixp4xx_eth.c
+++ b/drivers/net/arm/ixp4xx_eth.c
@@ -782,7 +782,8 @@ static int eth_ioctl(struct net_device *dev, struct ifreq *req, int cmd)
 
 	if (!netif_running(dev))
 		return -EINVAL;
-	return phy_mii_ioctl(port->phydev, if_mii(req), cmd);
+
+	return phy_mii_ioctl(port->phydev, req, cmd);
 }
 
 /* ethtool support */
diff --git a/drivers/net/au1000_eth.c b/drivers/net/au1000_eth.c
index ece6128..386d4fe 100644
--- a/drivers/net/au1000_eth.c
+++ b/drivers/net/au1000_eth.c
@@ -978,7 +978,7 @@ static int au1000_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 	if (!aup->phy_dev)
 		return -EINVAL; /* PHY not controllable */
 
-	return phy_mii_ioctl(aup->phy_dev, if_mii(rq), cmd);
+	return phy_mii_ioctl(aup->phy_dev, rq, cmd);
 }
 
 static const struct net_device_ops au1000_netdev_ops = {
diff --git a/drivers/net/bcm63xx_enet.c b/drivers/net/bcm63xx_enet.c
index faf5add..0d2c5da 100644
--- a/drivers/net/bcm63xx_enet.c
+++ b/drivers/net/bcm63xx_enet.c
@@ -1496,7 +1496,7 @@ static int bcm_enet_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 	if (priv->has_phy) {
 		if (!priv->phydev)
 			return -ENODEV;
-		return phy_mii_ioctl(priv->phydev, if_mii(rq), cmd);
+		return phy_mii_ioctl(priv->phydev, rq, cmd);
 	} else {
 		struct mii_if_info mii;
 
diff --git a/drivers/net/cpmac.c b/drivers/net/cpmac.c
index 3c58db5..0e47ca1 100644
--- a/drivers/net/cpmac.c
+++ b/drivers/net/cpmac.c
@@ -846,11 +846,8 @@ static int cpmac_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 		return -EINVAL;
 	if (!priv->phy)
 		return -EINVAL;
-	if ((cmd == SIOCGMIIPHY) || (cmd == SIOCGMIIREG) ||
-	    (cmd == SIOCSMIIREG))
-		return phy_mii_ioctl(priv->phy, if_mii(ifr), cmd);
 
-	return -EOPNOTSUPP;
+	return phy_mii_ioctl(priv->phy, ifr, cmd);
 }
 
 static int cpmac_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
diff --git a/drivers/net/dnet.c b/drivers/net/dnet.c
index 8b0f50b..4ea7141 100644
--- a/drivers/net/dnet.c
+++ b/drivers/net/dnet.c
@@ -797,7 +797,7 @@ static int dnet_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 	if (!phydev)
 		return -ENODEV;
 
-	return phy_mii_ioctl(phydev, if_mii(rq), cmd);
+	return phy_mii_ioctl(phydev, rq, cmd);
 }
 
 static void dnet_get_drvinfo(struct net_device *dev,
diff --git a/drivers/net/ethoc.c b/drivers/net/ethoc.c
index 37ce8ac..d9f3106 100644
--- a/drivers/net/ethoc.c
+++ b/drivers/net/ethoc.c
@@ -732,7 +732,7 @@ static int ethoc_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 		phy = priv->phy;
 	}
 
-	return phy_mii_ioctl(phy, mdio, cmd);
+	return phy_mii_ioctl(phy, ifr, cmd);
 }
 
 static int ethoc_config(struct net_device *dev, struct ifmap *map)
diff --git a/drivers/net/fec.c b/drivers/net/fec.c
index a3cae4e..f24f49e 100644
--- a/drivers/net/fec.c
+++ b/drivers/net/fec.c
@@ -828,7 +828,7 @@ static int fec_enet_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 	if (!phydev)
 		return -ENODEV;
 
-	return phy_mii_ioctl(phydev, if_mii(rq), cmd);
+	return phy_mii_ioctl(phydev, rq, cmd);
 }
 
 static void fec_enet_free_buffers(struct net_device *dev)
diff --git a/drivers/net/fec_mpc52xx.c b/drivers/net/fec_mpc52xx.c
index 25e6cc6..fdbf148 100644
--- a/drivers/net/fec_mpc52xx.c
+++ b/drivers/net/fec_mpc52xx.c
@@ -826,7 +826,7 @@ static int mpc52xx_fec_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 	if (!priv->phydev)
 		return -ENOTSUPP;
 
-	return phy_mii_ioctl(priv->phydev, if_mii(rq), cmd);
+	return phy_mii_ioctl(priv->phydev, rq, cmd);
 }
 
 static const struct net_device_ops mpc52xx_fec_netdev_ops = {
diff --git a/drivers/net/fs_enet/fs_enet-main.c b/drivers/net/fs_enet/fs_enet-main.c
index 309a0ea..f08cff9 100644
--- a/drivers/net/fs_enet/fs_enet-main.c
+++ b/drivers/net/fs_enet/fs_enet-main.c
@@ -963,12 +963,11 @@ static const struct ethtool_ops fs_ethtool_ops = {
 static int fs_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 {
 	struct fs_enet_private *fep = netdev_priv(dev);
-	struct mii_ioctl_data *mii = (struct mii_ioctl_data *)&rq->ifr_data;
 
 	if (!netif_running(dev))
 		return -EINVAL;
 
-	return phy_mii_ioctl(fep->phydev, mii, cmd);
+	return phy_mii_ioctl(fep->phydev, rq, cmd);
 }
 
 extern int fs_mii_connect(struct net_device *dev);
diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
index 8a17bf0..252cb75 100644
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -833,7 +833,7 @@ static int gfar_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 	if (!priv->phydev)
 		return -ENODEV;
 
-	return phy_mii_ioctl(priv->phydev, if_mii(rq), cmd);
+	return phy_mii_ioctl(priv->phydev, rq, cmd);
 }
 
 static unsigned int reverse_bitmap(unsigned int bit_map, unsigned int max_qs)
diff --git a/drivers/net/macb.c b/drivers/net/macb.c
index 40797fb..ff2f158 100644
--- a/drivers/net/macb.c
+++ b/drivers/net/macb.c
@@ -1082,7 +1082,7 @@ static int macb_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 	if (!phydev)
 		return -ENODEV;
 
-	return phy_mii_ioctl(phydev, if_mii(rq), cmd);
+	return phy_mii_ioctl(phydev, rq, cmd);
 }
 
 static const struct net_device_ops macb_netdev_ops = {
diff --git a/drivers/net/mv643xx_eth.c b/drivers/net/mv643xx_eth.c
index e345ec8..e3ab182 100644
--- a/drivers/net/mv643xx_eth.c
+++ b/drivers/net/mv643xx_eth.c
@@ -2452,7 +2452,7 @@ static int mv643xx_eth_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 	struct mv643xx_eth_private *mp = netdev_priv(dev);
 
 	if (mp->phy != NULL)
-		return phy_mii_ioctl(mp->phy, if_mii(ifr), cmd);
+		return phy_mii_ioctl(mp->phy, ifr, cmd);
 
 	return -EOPNOTSUPP;
 }
diff --git a/drivers/net/octeon/octeon_mgmt.c b/drivers/net/octeon/octeon_mgmt.c
index 000e792..73a2671 100644
--- a/drivers/net/octeon/octeon_mgmt.c
+++ b/drivers/net/octeon/octeon_mgmt.c
@@ -620,7 +620,7 @@ static int octeon_mgmt_ioctl(struct net_device *netdev,
 	if (!p->phydev)
 		return -EINVAL;
 
-	return phy_mii_ioctl(p->phydev, if_mii(rq), cmd);
+	return phy_mii_ioctl(p->phydev, rq, cmd);
 }
 
 static void octeon_mgmt_adjust_link(struct net_device *netdev)
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 64be466..5130db8 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -309,8 +309,9 @@ EXPORT_SYMBOL(phy_ethtool_gset);
  * current state.  Use at own risk.
  */
 int phy_mii_ioctl(struct phy_device *phydev,
-		struct mii_ioctl_data *mii_data, int cmd)
+		struct ifreq *ifr, int cmd)
 {
+	struct mii_ioctl_data *mii_data = if_mii(ifr);
 	u16 val = mii_data->val_in;
 
 	switch (cmd) {
@@ -360,6 +361,11 @@ int phy_mii_ioctl(struct phy_device *phydev,
 		}
 		break;
 
+	case SIOCSHWTSTAMP:
+		if (phydev->drv->hwtstamp)
+			return phydev->drv->hwtstamp(phydev, ifr);
+		/* fall through */
+
 	default:
 		return -EOPNOTSUPP;
 	}
diff --git a/drivers/net/sb1250-mac.c b/drivers/net/sb1250-mac.c
index 1f3acc3..e585c3f 100644
--- a/drivers/net/sb1250-mac.c
+++ b/drivers/net/sb1250-mac.c
@@ -2532,7 +2532,7 @@ static int sbmac_mii_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 	if (!netif_running(dev) || !sc->phy_dev)
 		return -EINVAL;
 
-	return phy_mii_ioctl(sc->phy_dev, if_mii(rq), cmd);
+	return phy_mii_ioctl(sc->phy_dev, rq, cmd);
 }
 
 static int sbmac_close(struct net_device *dev)
diff --git a/drivers/net/sh_eth.c b/drivers/net/sh_eth.c
index 501a55f..8279f8e 100644
--- a/drivers/net/sh_eth.c
+++ b/drivers/net/sh_eth.c
@@ -1233,7 +1233,7 @@ static int sh_eth_do_ioctl(struct net_device *ndev, struct ifreq *rq,
 	if (!phydev)
 		return -ENODEV;
 
-	return phy_mii_ioctl(phydev, if_mii(rq), cmd);
+	return phy_mii_ioctl(phydev, rq, cmd);
 }
 
 #if defined(SH_ETH_HAS_TSU)
diff --git a/drivers/net/smsc911x.c b/drivers/net/smsc911x.c
index cc55974..56dc2ff 100644
--- a/drivers/net/smsc911x.c
+++ b/drivers/net/smsc911x.c
@@ -1538,7 +1538,7 @@ static int smsc911x_do_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 	if (!netif_running(dev) || !pdata->phy_dev)
 		return -EINVAL;
 
-	return phy_mii_ioctl(pdata->phy_dev, if_mii(ifr), cmd);
+	return phy_mii_ioctl(pdata->phy_dev, ifr, cmd);
 }
 
 static int
diff --git a/drivers/net/smsc9420.c b/drivers/net/smsc9420.c
index 6cdee6a..b09ee1c 100644
--- a/drivers/net/smsc9420.c
+++ b/drivers/net/smsc9420.c
@@ -245,7 +245,7 @@ static int smsc9420_do_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 	if (!netif_running(dev) || !pd->phy_dev)
 		return -EINVAL;
 
-	return phy_mii_ioctl(pd->phy_dev, if_mii(ifr), cmd);
+	return phy_mii_ioctl(pd->phy_dev, ifr, cmd);
 }
 
 static int smsc9420_ethtool_get_settings(struct net_device *dev,
diff --git a/drivers/net/stmmac/stmmac_main.c b/drivers/net/stmmac/stmmac_main.c
index a31d580..acf0616 100644
--- a/drivers/net/stmmac/stmmac_main.c
+++ b/drivers/net/stmmac/stmmac_main.c
@@ -1437,24 +1437,18 @@ static void stmmac_poll_controller(struct net_device *dev)
 static int stmmac_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 {
 	struct stmmac_priv *priv = netdev_priv(dev);
-	int ret = -EOPNOTSUPP;
+	int ret;
 
 	if (!netif_running(dev))
 		return -EINVAL;
 
-	switch (cmd) {
-	case SIOCGMIIPHY:
-	case SIOCGMIIREG:
-	case SIOCSMIIREG:
-		if (!priv->phydev)
-			return -EINVAL;
-
-		spin_lock(&priv->lock);
-		ret = phy_mii_ioctl(priv->phydev, if_mii(rq), cmd);
-		spin_unlock(&priv->lock);
-	default:
-		break;
-	}
+	if (!priv->phydev)
+		return -EINVAL;
+
+	spin_lock(&priv->lock);
+	ret = phy_mii_ioctl(priv->phydev, rq, cmd);
+	spin_unlock(&priv->lock);
+
 	return ret;
 }
 
diff --git a/drivers/net/tc35815.c b/drivers/net/tc35815.c
index be08b75..99e423a 100644
--- a/drivers/net/tc35815.c
+++ b/drivers/net/tc35815.c
@@ -2066,7 +2066,7 @@ static int tc35815_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 		return -EINVAL;
 	if (!lp->phy_dev)
 		return -ENODEV;
-	return phy_mii_ioctl(lp->phy_dev, if_mii(rq), cmd);
+	return phy_mii_ioctl(lp->phy_dev, rq, cmd);
 }
 
 static void tc35815_chip_reset(struct net_device *dev)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 289cdc5..d4163f2 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -10954,7 +10954,7 @@ static int tg3_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 		if (!(tp->tg3_flags3 & TG3_FLG3_PHY_CONNECTED))
 			return -EAGAIN;
 		phydev = tp->mdio_bus->phy_map[TG3_PHY_MII_ADDR];
-		return phy_mii_ioctl(phydev, data, cmd);
+		return phy_mii_ioctl(phydev, ifr, cmd);
 	}
 
 	switch (cmd) {
diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 538148a..639aae5 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -3712,7 +3712,7 @@ static int ucc_geth_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 	if (!ugeth->phydev)
 		return -ENODEV;
 
-	return phy_mii_ioctl(ugeth->phydev, if_mii(rq), cmd);
+	return phy_mii_ioctl(ugeth->phydev, rq, cmd);
 }
 
 static const struct net_device_ops ucc_geth_netdev_ops = {
diff --git a/drivers/staging/octeon/ethernet-mdio.c b/drivers/staging/octeon/ethernet-mdio.c
index 7e0be8d..10a82ef 100644
--- a/drivers/staging/octeon/ethernet-mdio.c
+++ b/drivers/staging/octeon/ethernet-mdio.c
@@ -113,7 +113,7 @@ int cvm_oct_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 	if (!priv->phydev)
 		return -EINVAL;
 
-	return phy_mii_ioctl(priv->phydev, if_mii(rq), cmd);
+	return phy_mii_ioctl(priv->phydev, rq, cmd);
 }
 
 static void cvm_oct_adjust_link(struct net_device *dev)
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 3566bde..7a8caac 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -505,7 +505,7 @@ void phy_stop_machine(struct phy_device *phydev);
 int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd);
 int phy_ethtool_gset(struct phy_device *phydev, struct ethtool_cmd *cmd);
 int phy_mii_ioctl(struct phy_device *phydev,
-		struct mii_ioctl_data *mii_data, int cmd);
+		struct ifreq *ifr, int cmd);
 int phy_start_interrupts(struct phy_device *phydev);
 void phy_print_status(struct phy_device *phydev);
 struct phy_device* phy_device_create(struct mii_bus *bus, int addr, int phy_id);
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 8fdca56..64ca2a6 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -164,10 +164,9 @@ out:
 static int dsa_slave_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 {
 	struct dsa_slave_priv *p = netdev_priv(dev);
-	struct mii_ioctl_data *mii_data = if_mii(ifr);
 
 	if (p->phy != NULL)
-		return phy_mii_ioctl(p->phy, mii_data, cmd);
+		return phy_mii_ioctl(p->phy, ifr, cmd);
 
 	return -EOPNOTSUPP;
 }
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH 2/4] phylib: add a way to make PHY time stamps possible.
From: Richard Cochran @ 2010-06-28 15:34 UTC (permalink / raw)
  To: netdev
In-Reply-To: <cover.1277737222.git.richard.cochran@omicron.at>

This patch adds a new networking option to allow hardware time stamps
from PHY devices. Using PHY time stamps will still require adding two
inline function calls to each MAC driver. The CONFIG option makes
these calls safe to add, since the calls become NOOPs when the option
is disabled.

The patch also adds phylib driver methods for the SIOCSHWTSTAMP ioctl
and callbacks for transmit and receive time stamping. Drivers may
optionally implement these functions.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
---
 include/linux/phy.h    |    7 +++++++
 include/linux/skbuff.h |   29 +++++++++++++++++++++++++++++
 net/Kconfig            |   11 +++++++++++
 3 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/include/linux/phy.h b/include/linux/phy.h
index 987e111..3566bde 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -234,6 +234,8 @@ enum phy_state {
 	PHY_RESUMING
 };
 
+struct sk_buff;
+
 /* phy_device: An instance of a PHY
  *
  * drv: Pointer to the driver for this PHY instance
@@ -402,6 +404,11 @@ struct phy_driver {
 	/* Clears up any memory if needed */
 	void (*remove)(struct phy_device *phydev);
 
+	/* Handles SIOCSHWTSTAMP ioctl for hardware time stamping. */
+	int (*hwtstamp)(struct phy_device *phydev, struct ifreq *ifr);
+	int (*rxtstamp)(struct phy_device *phydev, struct sk_buff *skb);
+	int (*txtstamp)(struct phy_device *phydev, struct sk_buff *skb);
+
 	struct device_driver driver;
 };
 #define to_phy_driver(d) container_of(d, struct phy_driver, driver)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index fe70e66..6163022 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1961,6 +1961,33 @@ static inline void sw_tx_timestamp(struct sk_buff *skb)
 }
 #endif
 
+#ifdef CONFIG_NETWORK_PHY_TIMESTAMPING
+
+static inline void phy_tx_timestamp(struct phy_device *phy, struct sk_buff *skb)
+{
+	union skb_shared_tx *shtx = skb_tx(skb);
+	if (shtx->hardware && phy && phy->drv->txtstamp)
+		phy->drv->txtstamp(phy, skb);
+}
+
+static inline void phy_rx_timestamp(struct phy_device *phy, struct sk_buff *skb)
+{
+	if (phy && phy->drv->rxtstamp)
+		phy->drv->rxtstamp(phy, skb);
+}
+
+#else /* CONFIG_NETWORK_PHY_TIMESTAMPING */
+
+static inline void phy_tx_timestamp(struct phy_device *phy, struct sk_buff *skb)
+{
+}
+
+static inline void phy_rx_timestamp(struct phy_device *phy, struct sk_buff *skb)
+{
+}
+
+#endif /* !CONFIG_NETWORK_PHY_TIMESTAMPING */
+
 /**
  * skb_tx_timestamp() - Driver hook for software timestamping
  *
@@ -1973,6 +2000,7 @@ static inline void sw_tx_timestamp(struct sk_buff *skb)
  */
 static inline void skb_tx_timestamp(struct phy_device *phy, struct sk_buff *skb)
 {
+	phy_tx_timestamp(phy, skb);
 	sw_tx_timestamp(skb);
 }
 
@@ -1987,6 +2015,7 @@ static inline void skb_tx_timestamp(struct phy_device *phy, struct sk_buff *skb)
  */
 static inline void skb_rx_timestamp(struct phy_device *phy, struct sk_buff *skb)
 {
+	phy_rx_timestamp(phy, skb);
 }
 
 extern __sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len);
diff --git a/net/Kconfig b/net/Kconfig
index 73e8d97..09a1d26 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -97,6 +97,17 @@ config NETWORK_TXTS_FALLBACK
 
 	  If you are unsure how to answer this question, answer N.
 
+config NETWORK_PHY_TIMESTAMPING
+	bool "Timestamping in PHY devices"
+	depends on EXPERIMENTAL
+	help
+	  This allows timestamping of network packets by PHYs with
+	  hardware timestamping capabilities. This option adds some
+	  overhead in the transmit and receive paths. Note that this
+	  option also requires support in the MAC driver.
+
+	  If you are unsure how to answer this question, answer N.
+
 menuconfig NETFILTER
 	bool "Network packet filtering framework (Netfilter)"
 	---help---
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH 1/4] net: add driver hooks for time stamping.
From: Richard Cochran @ 2010-06-28 15:34 UTC (permalink / raw)
  To: netdev
In-Reply-To: <cover.1277737222.git.richard.cochran@omicron.at>

This patch adds hooks for transmit and receive time stamps and a new
networking option to allow a software fallback for transmit time stamps,
for MACs lacking time stamping hardware. The receive hook does not yet
have any effect, but it prepares the way for hardware time stamping
in PHY devices.

Using the hooks will still require adding two inline function calls to
each MAC driver. The CONFIG option makes these calls safe to add, since
the calls become NOOPs when the option is disabled.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
---
 include/linux/skbuff.h |   42 ++++++++++++++++++++++++++++++++++++++++++
 net/Kconfig            |   11 +++++++++++
 2 files changed, 53 insertions(+), 0 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index ac74ee0..fe70e66 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -29,6 +29,7 @@
 #include <linux/rcupdate.h>
 #include <linux/dmaengine.h>
 #include <linux/hrtimer.h>
+#include <linux/phy.h>
 
 /* Don't change this without changing skb_csum_unnecessary! */
 #define CHECKSUM_NONE 0
@@ -1947,6 +1948,47 @@ static inline ktime_t net_invalid_timestamp(void)
 extern void skb_tstamp_tx(struct sk_buff *orig_skb,
 			struct skb_shared_hwtstamps *hwtstamps);
 
+#ifdef CONFIG_NETWORK_TXTS_FALLBACK
+static inline void sw_tx_timestamp(struct sk_buff *skb)
+{
+	union skb_shared_tx *shtx = skb_tx(skb);
+	if (shtx->software && !shtx->in_progress)
+		skb_tstamp_tx(skb, NULL);
+}
+#else
+static inline void sw_tx_timestamp(struct sk_buff *skb)
+{
+}
+#endif
+
+/**
+ * skb_tx_timestamp() - Driver hook for software timestamping
+ *
+ * Ethernet MAC Drivers should call this function in their hard_xmit()
+ * function as soon as possible after giving the sk_buff to the MAC
+ * hardware, but before freeing the sk_buff.
+ *
+ * @phy: The port's phy_device. Pass NULL if this is not available.
+ * @skb: A socket buffer.
+ */
+static inline void skb_tx_timestamp(struct phy_device *phy, struct sk_buff *skb)
+{
+	sw_tx_timestamp(skb);
+}
+
+/**
+ * skb_rx_timestamp() - Driver hook for receive timestamping
+ *
+ * Ethernet MAC Drivers should call this function in their NAPI poll()
+ * function immediately before calling eth_type_trans().
+ *
+ * @phy: The port's phy_device. Pass NULL if this is not available.
+ * @skb: A socket buffer.
+ */
+static inline void skb_rx_timestamp(struct phy_device *phy, struct sk_buff *skb)
+{
+}
+
 extern __sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len);
 extern __sum16 __skb_checksum_complete(struct sk_buff *skb);
 
diff --git a/net/Kconfig b/net/Kconfig
index 0d68b40..73e8d97 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -86,6 +86,17 @@ config NETWORK_SECMARK
 	  to nfmark, but designated for security purposes.
 	  If you are unsure how to answer this question, answer N.
 
+config NETWORK_TXTS_FALLBACK
+	bool "Transmit Timestamping in Software"
+	depends on EXPERIMENTAL
+	help
+	  This option allows timestamping of transmitted packets for
+	  MACs which lack hardware timestamping capabilities. This
+	  option adds some overhead in the transmit path. Note that
+	  this option also requires support in the MAC driver.
+
+	  If you are unsure how to answer this question, answer N.
+
 menuconfig NETFILTER
 	bool "Network packet filtering framework (Netfilter)"
 	---help---
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH 0/4] Extend Time Stamping
From: Richard Cochran @ 2010-06-28 15:33 UTC (permalink / raw)
  To: netdev

This patch set extends the packet time stamping capabilites of the
network stack in two ways.

1. The first patch presents a work-around for the TX software time
   stamping fallback problem cited in cd4d8fdad1f1. The idea is to add
   two inline functions into each MAC driver. The functions act as
   hooks which are only present if certain CONFIG options are enabled.

2. The other patches prepare the way for PHY drivers to offer time
   stamping.

I am preparing a new round of patches for PTP support, but it will
require the changes in this patch set in order to function. Thus I
would like to have this patch set reviewed (and hopefully merged) in
order to go forward.

Thanks,
Richard

Richard Cochran (4):
  net: add driver hooks for time stamping.
  phylib: add a way to make PHY time stamps possible.
  phylib: preserve ifreq parameter when calling generic phy_mii_ioctl()
  phylib: Allow reading and writing a mii bus from atomic context.

 drivers/net/arm/ixp4xx_eth.c           |    3 +-
 drivers/net/au1000_eth.c               |    2 +-
 drivers/net/bcm63xx_enet.c             |    2 +-
 drivers/net/cpmac.c                    |    5 +--
 drivers/net/dnet.c                     |    2 +-
 drivers/net/ethoc.c                    |    2 +-
 drivers/net/fec.c                      |    2 +-
 drivers/net/fec_mpc52xx.c              |    2 +-
 drivers/net/fs_enet/fs_enet-main.c     |    3 +-
 drivers/net/fsl_pq_mdio.c              |    4 +-
 drivers/net/gianfar.c                  |    2 +-
 drivers/net/macb.c                     |    2 +-
 drivers/net/mv643xx_eth.c              |    2 +-
 drivers/net/octeon/octeon_mgmt.c       |    2 +-
 drivers/net/phy/mdio_bus.c             |   45 +++++++++++++++++---
 drivers/net/phy/phy.c                  |    8 +++-
 drivers/net/sb1250-mac.c               |    2 +-
 drivers/net/sh_eth.c                   |    2 +-
 drivers/net/smsc911x.c                 |    2 +-
 drivers/net/smsc9420.c                 |    2 +-
 drivers/net/stmmac/stmmac_main.c       |   22 ++++------
 drivers/net/tc35815.c                  |    2 +-
 drivers/net/tg3.c                      |    2 +-
 drivers/net/ucc_geth.c                 |    2 +-
 drivers/staging/octeon/ethernet-mdio.c |    2 +-
 include/linux/phy.h                    |   24 +++++++++-
 include/linux/skbuff.h                 |   71 ++++++++++++++++++++++++++++++++
 net/Kconfig                            |   22 ++++++++++
 net/dsa/slave.c                        |    3 +-
 29 files changed, 192 insertions(+), 54 deletions(-)

^ permalink raw reply

* Re: [PATCHv2] vhost-net: add dhclient work-around from userspace
From: Michael S. Tsirkin @ 2010-06-28 15:30 UTC (permalink / raw)
  To: kvm, virtualization, netdev, linux-kernel, ykaul, markmc
In-Reply-To: <20100628100807.GA30685@redhat.com>

On Mon, Jun 28, 2010 at 01:08:07PM +0300, Michael S. Tsirkin wrote:
> Userspace virtio server has the following hack
> so guests rely on it, and we have to replicate it, too:
> 
> Use port number to detect incoming IPv4 DHCP response packets,
> and fill in the checksum for these.
> 
> The issue we are solving is that on linux guests, some apps
> that use recvmsg with AF_PACKET sockets, don't know how to
> handle CHECKSUM_PARTIAL;
> The interface to return the relevant information was added
> in 8dc4194474159660d7f37c495e3fc3f10d0db8cc,
> and older userspace does not use it.
> One important user of recvmsg with AF_PACKET is dhclient,
> so we add a work-around just for DHCP.
> 
> Don't bother applying the hack to IPv6 as userspace virtio does not
> have a work-around for that - let's hope guests will do the right
> thing wrt IPv6.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Tested-by: Laine Stump <laine@redhat.com>

^ permalink raw reply

* Re: [PATCH -next] qlcnic: fail when try to setup unsupported features
From: Ben Hutchings @ 2010-06-28 14:18 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: Amit Salecha, netdev@vger.kernel.org, Amerigo Wang,
	Anirban Chakraborty
In-Reply-To: <20100628161412.7d9d0e4f@dhcp-lab-109.englab.brq.redhat.com>

On Mon, 2010-06-28 at 16:14 +0200, Stanislaw Gruszka wrote:
[...]
> My plan is something like that:
> 
> static const struct ethtool_ops my_ethtool_ops = {
>         .get_flags              = ethtool_op_get_flags,
>         .set_flags              = ethtool_op_set_flags,
> 	.supported_flags	= ETH_FLAG_LRO
> }
> 
> Plus op->supported_flags check in ethtool_op_set_flags. That will allow
> to define flags per driver. There is also possible to add supported_flags
> to netdev, but I would like to avoid that - in such case drivers can use
> custom .set_flags function.

Sounds good to me.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH -next] qlcnic: fail when try to setup unsupported features
From: Stanislaw Gruszka @ 2010-06-28 14:18 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Amit Salecha, netdev@vger.kernel.org, Amerigo Wang,
	Anirban Chakraborty
In-Reply-To: <1277731840.2089.8.camel@achroite.uk.solarflarecom.com>

On Mon, 28 Jun 2010 14:30:40 +0100
Ben Hutchings <bhutchings@solarflare.com> wrote:

> > It would be more useful to add a supported_flags parameter to
> > ethtool_op_set_flags() so it can check the requested flags against the
> > driver/hardware capabilities.
> 
> I also just noticed that ethtool.h says set_flags() will return -EINVAL
> for unsupported values.  The current implementations variously return
> -EINVAL or -EOPNOTSUPP.

We need to unify ... , I prefer EOPNOTSUPP since this is returned now
if .set_flags == NULL. That will require to change a comment in 
ethtool.h and some drivers, I will take a look at it.

Stanislaw

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox