Netdev List
 help / color / mirror / Atom feed
* [PATCH] fec: add support for PHY interface platform data
From: Baruch Siach @ 2010-05-24  7:31 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Baruch Siach, Sascha Hauer
In-Reply-To: <20100523.233224.186495044.davem@davemloft.net>

The i.MX25 PDK uses RMII to communicate with its PHY. This patch adds the
ability to configure RMII, based on platform data.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Acked-by:  Greg Ungerer <gerg@uclinux.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
---
 drivers/net/fec.c   |   22 ++++++++++++++++++++++
 drivers/net/fec.h   |    2 ++
 include/linux/fec.h |   21 +++++++++++++++++++++
 3 files changed, 45 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/fec.h

diff --git a/drivers/net/fec.c b/drivers/net/fec.c
index 42d9ac9..326465f 100644
--- a/drivers/net/fec.c
+++ b/drivers/net/fec.c
@@ -41,6 +41,7 @@
 #include <linux/clk.h>
 #include <linux/platform_device.h>
 #include <linux/phy.h>
+#include <linux/fec.h>
 
 #include <asm/cacheflush.h>
 
@@ -182,6 +183,7 @@ struct fec_enet_private {
 	struct  phy_device *phy_dev;
 	int     mii_timeout;
 	uint    phy_speed;
+	phy_interface_t	phy_interface;
 	int	index;
 	int	link;
 	int	full_duplex;
@@ -1191,6 +1193,21 @@ fec_restart(struct net_device *dev, int duplex)
 	/* Set MII speed */
 	writel(fep->phy_speed, fep->hwp + FEC_MII_SPEED);
 
+#ifdef FEC_MIIGSK_ENR
+	if (fep->phy_interface == PHY_INTERFACE_MODE_RMII) {
+		/* disable the gasket and wait */
+		writel(0, fep->hwp + FEC_MIIGSK_ENR);
+		while (readl(fep->hwp + FEC_MIIGSK_ENR) & 4)
+			udelay(1);
+
+		/* configure the gasket: RMII, 50 MHz, no loopback, no echo */
+		writel(1, fep->hwp + FEC_MIIGSK_CFGR);
+
+		/* re-enable the gasket */
+		writel(2, fep->hwp + FEC_MIIGSK_ENR);
+	}
+#endif
+
 	/* And last, enable the transmit and receive processing */
 	writel(2, fep->hwp + FEC_ECNTRL);
 	writel(0, fep->hwp + FEC_R_DES_ACTIVE);
@@ -1226,6 +1243,7 @@ static int __devinit
 fec_probe(struct platform_device *pdev)
 {
 	struct fec_enet_private *fep;
+	struct fec_platform_data *pdata;
 	struct net_device *ndev;
 	int i, irq, ret = 0;
 	struct resource *r;
@@ -1259,6 +1277,10 @@ fec_probe(struct platform_device *pdev)
 
 	platform_set_drvdata(pdev, ndev);
 
+	pdata = pdev->dev.platform_data;
+	if (pdata)
+		fep->phy_interface = pdata->phy;
+
 	/* This device has up to three irqs on some platforms */
 	for (i = 0; i < 3; i++) {
 		irq = platform_get_irq(pdev, i);
diff --git a/drivers/net/fec.h b/drivers/net/fec.h
index cc47f3f..2c48b25 100644
--- a/drivers/net/fec.h
+++ b/drivers/net/fec.h
@@ -43,6 +43,8 @@
 #define FEC_R_DES_START		0x180 /* Receive descriptor ring */
 #define FEC_X_DES_START		0x184 /* Transmit descriptor ring */
 #define FEC_R_BUFF_SIZE		0x188 /* Maximum receive buff size */
+#define FEC_MIIGSK_CFGR		0x300 /* MIIGSK Configuration reg */
+#define FEC_MIIGSK_ENR		0x308 /* MIIGSK Enable reg */
 
 #else
 
diff --git a/include/linux/fec.h b/include/linux/fec.h
new file mode 100644
index 0000000..5d3523d
--- /dev/null
+++ b/include/linux/fec.h
@@ -0,0 +1,21 @@
+/* include/linux/fec.h
+ *
+ * Copyright (c) 2009 Orex Computed Radiography
+ *   Baruch Siach <baruch@tkos.co.il>
+ *
+ * Header file for the FEC platform data
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef __LINUX_FEC_H__
+#define __LINUX_FEC_H__
+
+#include <linux/phy.h>
+
+struct fec_platform_data {
+	phy_interface_t phy;
+};
+
+#endif
-- 
1.7.1


^ permalink raw reply related

* Re: [PATCH] fec: add support for PHY interface platform data
From: David Miller @ 2010-05-24  7:36 UTC (permalink / raw)
  To: baruch; +Cc: netdev, s.hauer
In-Reply-To: <3fdc6da7f1b53fdea0e056d47d828e106c4e76c8.1274686230.git.baruch@tkos.co.il>

From: Baruch Siach <baruch@tkos.co.il>
Date: Mon, 24 May 2010 10:31:05 +0300

> The i.MX25 PDK uses RMII to communicate with its PHY. This patch adds the
> ability to configure RMII, based on platform data.
> 
> Signed-off-by: Baruch Siach <baruch@tkos.co.il>
> Acked-by:  Greg Ungerer <gerg@uclinux.org>

Applied, thanks.

^ permalink raw reply

* RE: Receiving of priority tagged packets
From: Vladislav Zolotarov @ 2010-05-24  7:44 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev@vger.kernel.org
In-Reply-To: <1274646751.5020.78.camel@edumazet-laptop>

Great! Thanks. I'll check it shortly.

However this patch handles the problem only in the HW accelerated path, which is used only when the NIC supports the HW acceleration of RX VLAN parsing and (the most important) at least one VLAN is configured for the current device (this will initialize the dev->vlgrp, otherwise it's NULL and u r not allowed to call vlan_hwaccel_rx()). So if I haven’t configured any VLAN this patch won't help. Namely we need to fix netif_receive_skb(skb) as well.

I would also like to ask u another question: formally VID=0 is not a valid VID (according to the same spec).
However there is this ethX.0 semantics which is a bit confusing. So, don't u think that together with the patch below and the addition described above we should also forbid the configuration of VID=0 (vconfig add ethX 0)?

Thanks a lot again.
Vlad



> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: Sunday, May 23, 2010 11:33 PM
> To: Vladislav Zolotarov
> Cc: netdev@vger.kernel.org
> Subject: RE: Receiving of priority tagged packets
> 
> Le dimanche 23 mai 2010 à 03:51 -0700, Vladislav Zolotarov a écrit :
> >
> > > -----Original Message-----
> > > From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On
> > > Behalf Of Vladislav Zolotarov
> > > Sent: Sunday, May 23, 2010 1:23 PM
> > > To: Eric Dumazet
> > > Cc: netdev@vger.kernel.org
> > > Subject: RE: Receiving of priority tagged packets
> > >
> > >
> > > > -----Original Message-----
> > > > From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> > > > Sent: Sunday, May 23, 2010 1:07 PM
> > > > To: Vladislav Zolotarov
> > > > Cc: netdev@vger.kernel.org
> > > > Subject: Re: Receiving of priority tagged packets
> > > >
> > > > Le dimanche 23 mai 2010 à 02:36 -0700, Vladislav Zolotarov a écrit :
> > > > > Hello,
> > > > > We were playing with FCoE in our labs and saw the strange behavior of
> > > Linux
> > > > networking stack in regard to priority-tagged frames (the ones that
> have a
> > > > zero VID in a VLAN tag). We saw that until we explicitly added a zero
> vlan
> > > > interface (vconfig add ethX 0) the stack refused to accept such packets
> > > both
> > > > in HW VLAN acceleration mode (skb is indicated using
> > > > vlan_hwaccel_receive_skb()) and in a regular mode (skb is indicated
> with
> > > > netif_receive_skb()).
> > > > >
> > > > > However "IEEE Std 802.1Q, 2006 Edition DRAFT D0.1" in section 6.7
> states
> > > > the following:
> > > > >
> > > > > Each Bridge Port shall support the following parameters for use by
> these
> > > > (EISS tagging and detagging) functions:
> > > > > 	c) an Acceptable Frame Types parameter with at least one of the
> > > > following values:
> > > > > 		1) Admit Only VLAN-tagged frames;
> > > > > 		2) Admit Only Untagged and Priority-tagged frames;
> > > > > 		3) Admit All frames
> > > > >
> > > > > So I guess this means that priority tagged frames should be accepted
> > > > together with the untagged frames on the default interface ethX.
> > > > >
> > > > > Could anyone explain, pls., what's the expected behavior of the Linux
> > > > Networking Stack in regard to the priority-tagged frames and what's
> > > expected
> > > > to be configured in order to start accepting them?
> > > > >
> > > > > Thanks in advance,
> > > > > vlad
> > > >
> > > > So if eth0.0 is setup, incoming vlanid=0 frames are delivered to
> eth0.0,
> > > > OK ? (This works in and out since commit 05423b241311c93)
> > > >
> > > > Now, if eth0.0 is not setup, you believe these frames should be
> directed
> > > > to eth0, as if they were not tagged ?
> > > >
> > > > That seems a bit strange.
> > >
> > > Well, as far as I understood this is what IEEE 802.1Q spec states...
> >
> > From the same spec:
> >
> > ***************
> > 3.38 Priority-tagged frame: A tagged frame whose tag header carries
> priority information but carries no
> > VLAN identification information.
> > ***************
> >
> > Namely priority-tagged frames are frames that carry no VLAN identification
> information and namely should be treated as non-VLAN packets.
> >
> > However, as u mentioned above, current Networking Stack implementation
> handles them as if they have VID 0.
> >
> > These two are quite different ways to handle the frame, don’t u think?
> 
> Following patch should fix it ?
> 
> Fallback to ethX is ethX.0 is not used.
> 
> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
> index bd537fc..909f6e7 100644
> --- a/net/8021q/vlan_core.c
> +++ b/net/8021q/vlan_core.c
> @@ -8,6 +8,9 @@
>  int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
>  		      u16 vlan_tci, int polling)
>  {
> +	struct net_device *vlan_dev;
> +	u16 vlan_id;
> +
>  	if (netpoll_rx(skb))
>  		return NET_RX_DROP;
> 
> @@ -16,10 +19,14 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct
> vlan_group *grp,
> 
>  	skb->skb_iif = skb->dev->ifindex;
>  	__vlan_hwaccel_put_tag(skb, vlan_tci);
> -	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
> +	vlan_id = vlan_tci & VLAN_VID_MASK;
> +	vlan_dev = vlan_group_get_device(grp, vlan_id);
> 
> -	if (!skb->dev)
> -		goto drop;
> +	if (vlan_dev)
> +		skb->dev = vlan_dev;
> +	else
> +		if (vlan_id)
> +			goto drop;
> 
>  	return (polling ? netif_receive_skb(skb) : netif_rx(skb));
> 
> @@ -82,16 +89,22 @@ vlan_gro_common(struct napi_struct *napi, struct
> vlan_group *grp,
>  		unsigned int vlan_tci, struct sk_buff *skb)
>  {
>  	struct sk_buff *p;
> +	struct net_device *vlan_dev;
> +	u16 vlan_id;
> 
>  	if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
>  		goto drop;
> 
>  	skb->skb_iif = skb->dev->ifindex;
>  	__vlan_hwaccel_put_tag(skb, vlan_tci);
> -	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
> -
> -	if (!skb->dev)
> -		goto drop;
> +	vlan_id = vlan_tci & VLAN_VID_MASK;
> +	vlan_dev = vlan_group_get_device(grp, vlan_id);
> +
> +	if (vlan_dev)
> +		skb->dev = vlan_dev;
> +	else
> +		if (vlan_id)
> +			goto drop;
> 
>  	for (p = napi->gro_list; p; p = p->next) {
>  		NAPI_GRO_CB(p)->same_flow =
> 
> 
> 
> 


^ permalink raw reply

* net/dccp: expansion of error code size
From: Yoichi Yuasa @ 2010-05-24  6:45 UTC (permalink / raw)
  To: David S. Miller; +Cc: yuasa, linux-mips, netdev

Because MIPS's EDQUOT value is 1133(0x46d).
It's larger than u8.

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
---
 net/dccp/input.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/dccp/input.c b/net/dccp/input.c
index 58f7bc1..6beb6a7 100644
--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -124,9 +124,9 @@ static int dccp_rcv_closereq(struct sock *sk, struct sk_buff *skb)
 	return queued;
 }
 
-static u8 dccp_reset_code_convert(const u8 code)
+static u16 dccp_reset_code_convert(const u8 code)
 {
-	const u8 error_code[] = {
+	const u16 error_code[] = {
 	[DCCP_RESET_CODE_CLOSED]	     = 0,	/* normal termination */
 	[DCCP_RESET_CODE_UNSPECIFIED]	     = 0,	/* nothing known */
 	[DCCP_RESET_CODE_ABORTED]	     = ECONNRESET,
@@ -148,7 +148,7 @@ static u8 dccp_reset_code_convert(const u8 code)
 
 static void dccp_rcv_reset(struct sock *sk, struct sk_buff *skb)
 {
-	u8 err = dccp_reset_code_convert(dccp_hdr_reset(skb)->dccph_reset_code);
+	u16 err = dccp_reset_code_convert(dccp_hdr_reset(skb)->dccph_reset_code);
 
 	sk->sk_err = err;
 
-- 
1.7.1


^ permalink raw reply related

* RE: Receiving of priority tagged packets
From: Vladislav Zolotarov @ 2010-05-24  9:05 UTC (permalink / raw)
  To: Vladislav Zolotarov, Eric Dumazet; +Cc: netdev@vger.kernel.org
In-Reply-To: <8628FE4E7912BF47A96AE7DD7BAC0AADDDC6675C68@SJEXCHCCR02.corp.ad.broadcom.com>

HW accelerated flow worked:
1) I've configured ethX.0 interface on one side (RH5.4 standard kernel) and a regular interface ethY above the patched upstream kernel on the other side. ethX.0 and ethY IPs were configured to reside in the same subnet.
2) Then I've configured VID=7 interface above ethY to enable HW acceleration flow.
3) Configured a static  ARP entry for ethX.0 for ethY IP and MAC. (ARP replies from ethY were sent without any VLAN tag and apparently haven't arrived to ethX.0. That's an example, why ethX.0 semantics is somewhat problematic).
4) Then I successfully pinged both ethX.0 from ethY and vice versa.

So, considering my previous post this patch looks like working.

Thanks,
vlad

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
> Behalf Of Vladislav Zolotarov
> Sent: Monday, May 24, 2010 10:44 AM
> To: Eric Dumazet
> Cc: netdev@vger.kernel.org
> Subject: RE: Receiving of priority tagged packets
> 
> Great! Thanks. I'll check it shortly.
> 
> However this patch handles the problem only in the HW accelerated path, which
> is used only when the NIC supports the HW acceleration of RX VLAN parsing and
> (the most important) at least one VLAN is configured for the current device
> (this will initialize the dev->vlgrp, otherwise it's NULL and u r not allowed
> to call vlan_hwaccel_rx()). So if I haven’t configured any VLAN this patch
> won't help. Namely we need to fix netif_receive_skb(skb) as well.
> 
> I would also like to ask u another question: formally VID=0 is not a valid
> VID (according to the same spec).
> However there is this ethX.0 semantics which is a bit confusing. So, don't u
> think that together with the patch below and the addition described above we
> should also forbid the configuration of VID=0 (vconfig add ethX 0)?
> 
> Thanks a lot again.
> Vlad
> 
> 
> 
> > -----Original Message-----
> > From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> > Sent: Sunday, May 23, 2010 11:33 PM
> > To: Vladislav Zolotarov
> > Cc: netdev@vger.kernel.org
> > Subject: RE: Receiving of priority tagged packets
> >
> > Le dimanche 23 mai 2010 à 03:51 -0700, Vladislav Zolotarov a écrit :
> > >
> > > > -----Original Message-----
> > > > From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org]
> > On
> > > > Behalf Of Vladislav Zolotarov
> > > > Sent: Sunday, May 23, 2010 1:23 PM
> > > > To: Eric Dumazet
> > > > Cc: netdev@vger.kernel.org
> > > > Subject: RE: Receiving of priority tagged packets
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> > > > > Sent: Sunday, May 23, 2010 1:07 PM
> > > > > To: Vladislav Zolotarov
> > > > > Cc: netdev@vger.kernel.org
> > > > > Subject: Re: Receiving of priority tagged packets
> > > > >
> > > > > Le dimanche 23 mai 2010 à 02:36 -0700, Vladislav Zolotarov a écrit :
> > > > > > Hello,
> > > > > > We were playing with FCoE in our labs and saw the strange behavior
> of
> > > > Linux
> > > > > networking stack in regard to priority-tagged frames (the ones that
> > have a
> > > > > zero VID in a VLAN tag). We saw that until we explicitly added a zero
> > vlan
> > > > > interface (vconfig add ethX 0) the stack refused to accept such
> packets
> > > > both
> > > > > in HW VLAN acceleration mode (skb is indicated using
> > > > > vlan_hwaccel_receive_skb()) and in a regular mode (skb is indicated
> > with
> > > > > netif_receive_skb()).
> > > > > >
> > > > > > However "IEEE Std 802.1Q, 2006 Edition DRAFT D0.1" in section 6.7
> > states
> > > > > the following:
> > > > > >
> > > > > > Each Bridge Port shall support the following parameters for use by
> > these
> > > > > (EISS tagging and detagging) functions:
> > > > > > 	c) an Acceptable Frame Types parameter with at least one of the
> > > > > following values:
> > > > > > 		1) Admit Only VLAN-tagged frames;
> > > > > > 		2) Admit Only Untagged and Priority-tagged frames;
> > > > > > 		3) Admit All frames
> > > > > >
> > > > > > So I guess this means that priority tagged frames should be
> accepted
> > > > > together with the untagged frames on the default interface ethX.
> > > > > >
> > > > > > Could anyone explain, pls., what's the expected behavior of the
> Linux
> > > > > Networking Stack in regard to the priority-tagged frames and what's
> > > > expected
> > > > > to be configured in order to start accepting them?
> > > > > >
> > > > > > Thanks in advance,
> > > > > > vlad
> > > > >
> > > > > So if eth0.0 is setup, incoming vlanid=0 frames are delivered to
> > eth0.0,
> > > > > OK ? (This works in and out since commit 05423b241311c93)
> > > > >
> > > > > Now, if eth0.0 is not setup, you believe these frames should be
> > directed
> > > > > to eth0, as if they were not tagged ?
> > > > >
> > > > > That seems a bit strange.
> > > >
> > > > Well, as far as I understood this is what IEEE 802.1Q spec states...
> > >
> > > From the same spec:
> > >
> > > ***************
> > > 3.38 Priority-tagged frame: A tagged frame whose tag header carries
> > priority information but carries no
> > > VLAN identification information.
> > > ***************
> > >
> > > Namely priority-tagged frames are frames that carry no VLAN
> identification
> > information and namely should be treated as non-VLAN packets.
> > >
> > > However, as u mentioned above, current Networking Stack implementation
> > handles them as if they have VID 0.
> > >
> > > These two are quite different ways to handle the frame, don’t u think?
> >
> > Following patch should fix it ?
> >
> > Fallback to ethX is ethX.0 is not used.
> >
> > diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
> > index bd537fc..909f6e7 100644
> > --- a/net/8021q/vlan_core.c
> > +++ b/net/8021q/vlan_core.c
> > @@ -8,6 +8,9 @@
> >  int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
> >  		      u16 vlan_tci, int polling)
> >  {
> > +	struct net_device *vlan_dev;
> > +	u16 vlan_id;
> > +
> >  	if (netpoll_rx(skb))
> >  		return NET_RX_DROP;
> >
> > @@ -16,10 +19,14 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct
> > vlan_group *grp,
> >
> >  	skb->skb_iif = skb->dev->ifindex;
> >  	__vlan_hwaccel_put_tag(skb, vlan_tci);
> > -	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
> > +	vlan_id = vlan_tci & VLAN_VID_MASK;
> > +	vlan_dev = vlan_group_get_device(grp, vlan_id);
> >
> > -	if (!skb->dev)
> > -		goto drop;
> > +	if (vlan_dev)
> > +		skb->dev = vlan_dev;
> > +	else
> > +		if (vlan_id)
> > +			goto drop;
> >
> >  	return (polling ? netif_receive_skb(skb) : netif_rx(skb));
> >
> > @@ -82,16 +89,22 @@ vlan_gro_common(struct napi_struct *napi, struct
> > vlan_group *grp,
> >  		unsigned int vlan_tci, struct sk_buff *skb)
> >  {
> >  	struct sk_buff *p;
> > +	struct net_device *vlan_dev;
> > +	u16 vlan_id;
> >
> >  	if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
> >  		goto drop;
> >
> >  	skb->skb_iif = skb->dev->ifindex;
> >  	__vlan_hwaccel_put_tag(skb, vlan_tci);
> > -	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
> > -
> > -	if (!skb->dev)
> > -		goto drop;
> > +	vlan_id = vlan_tci & VLAN_VID_MASK;
> > +	vlan_dev = vlan_group_get_device(grp, vlan_id);
> > +
> > +	if (vlan_dev)
> > +		skb->dev = vlan_dev;
> > +	else
> > +		if (vlan_id)
> > +			goto drop;
> >
> >  	for (p = napi->gro_list; p; p = p->next) {
> >  		NAPI_GRO_CB(p)->same_flow =
> >
> >
> >
> >
> 
> N�����r��y���b�X��ǧv�^�)޺{.n�+���z�^�)���w*
> jg���\x1e�����ݢj/���z�ޖ��2�ޙ���&�)ߡ�a��\x7f��\x1e�G���h�\x0f�j:+v���w�٥

^ permalink raw reply

* RE: ixgbe and SRIOV failure in driver?
From: Fischer, Anna @ 2010-05-24  9:49 UTC (permalink / raw)
  To: Rose, Gregory V; +Cc: e1000-devel@lists.sourceforge.net, netdev@vger.kernel.org
In-Reply-To: <43F901BD926A4E43B106BF17856F0755A79C061C@orsmsx508.amr.corp.intel.com>

[-- Attachment #1: Type: text/plain, Size: 2045 bytes --]

> Subject: RE: ixgbe and SRIOV failure in driver?
> 
> >-----Original Message-----
> >From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> >On Behalf Of Fischer, Anna
> >Sent: Friday, May 21, 2010 9:33 AM
> >To: e1000-devel@lists.sourceforge.net; netdev@vger.kernel.org
> >Subject: ixgbe and SRIOV failure in driver?
> >
> >I am running a system with 3 Intel NICs. Two of them are 82598 devices,
> >and one is a SRIOV capable 82599.
> >
> >All devices use the ixgbe driver. What happens (I believe) now is that
> >when the driver loads at first, it sees the 82598 first (because of its
> >position in the PCI tree) and then it says "Device not IOV capable -
> >switching off IOV."
> >
> >So then it switches into non-IOV mode, and I can never enable SRIOV on
> >my 82599, because the driver does not enable it any more for further
> >devices.
> >
> >So to get around this issue, I tried to use pciback.hide to hide the
> >82598 devices from the OS. That way I was hoping that the driver would
> >switch on SRIOV on my 82599. However, then I got a kernel panic on boot
> >(see below).
> >
> >I am running Xen 4 and the Dom0 kernel is a 2.6.31 kernel.
> 
> The ixgbe driver included with the 2.6.31 kernel does not support SR-IOV.
> Where did you get the driver that does?  Please run ethtool -i <ethx>
> and
> post the results.

I have installed the driver separately on my pv-ops kernel.

[root@211052610-f-cpu ~]# ethtool -i eth
driver: ixgbe
version: 2.0.75.7-NAPI
firmware-version: 0.9-3
bus-info: 0000:05:00.0

I have attached a log on when the drivers loads. There is a bit more debugging information enabled. You can see that the driver loads at first on devices 0000:02:00.0 and 0000:02:00.1 which are 82598 devices and don't support SRIOV. So the driver switches off SRIOV. When it loads it on the 82599 which sits on 0000:05:00.0, it does not even have the SRIOV parameter enabled anymore, even though I load the driver with 'modprobe ixgbe max_vfs=4'.

Thanks,
Anna


[-- Attachment #2: ixgbe-error.log --]
[-- Type: application/octet-stream, Size: 4755 bytes --]

[  719.100280] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 2.0.75.7-NAPI
[  719.100289] Copyright (c) 1999-2010 Intel Corporation.
[  719.100383] ixgbe 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[  719.100443] ixgbe 0000:02:00.0: setting latency timer to 64
[  719.159312] ixgbe: I/O Virtualization (IOV) set to 2
[  719.159321] ixgbe: 0000:02:00.0: ixgbe_check_options: Checking IOV capabilities.
[  719.159328] ixgbe: 0000:02:00.0: ixgbe_check_options: IOV is not supported on this hardware.  Disabling IOV.
[  719.159336] ixgbe: 0000:02:00.0: ixgbe_probe: SR-IOV disabled! Not probing VF.
[  719.164704] ixgbe 0000:02:00.0: irq 37 for MSI/MSI-X
[  719.164714] ixgbe 0000:02:00.0: irq 38 for MSI/MSI-X
[  719.164722] ixgbe 0000:02:00.0: irq 39 for MSI/MSI-X
[  719.164789] ixgbe: 0000:02:00.0: ixgbe_init_interrupt_scheme: Multiqueue Enabled: Rx Queue count = 2, Tx Queue count = 2
[  719.165779] ixgbe: eth0: ixgbe_probe: (PCI Express:2.5Gb/s:Width x8) 00:24:a8:c5:08:09
[  719.165857] ixgbe: eth0: ixgbe_probe: MAC: 1, PHY: 0, PBA No: e15728-001
[  719.165859] ixgbe: eth0: ixgbe_probe: GRO is enabled
[  719.165860] ixgbe: eth0: ixgbe_probe: Intel(R) 10 Gigabit Network Connection
[  719.165883] ixgbe 0000:02:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
[  719.165908] ixgbe 0000:02:00.1: setting latency timer to 64
[  719.223063] ixgbe: 0000:02:00.1: ixgbe_check_options: Checking IOV capabilities.
[  719.223071] ixgbe: 0000:02:00.1: ixgbe_check_options: IOV disabled.
[  719.223077] ixgbe: 0000:02:00.1: ixgbe_probe: SR-IOV disabled! Not probing VF.
[  719.228517] ixgbe 0000:02:00.1: irq 40 for MSI/MSI-X
[  719.228526] ixgbe 0000:02:00.1: irq 41 for MSI/MSI-X
[  719.228535] ixgbe 0000:02:00.1: irq 42 for MSI/MSI-X
[  719.228589] ixgbe: 0000:02:00.1: ixgbe_init_interrupt_scheme: Multiqueue Enabled: Rx Queue count = 2, Tx Queue count = 2
[  719.229595] ixgbe: eth1: ixgbe_probe: (PCI Express:2.5Gb/s:Width x8) 00:24:a8:c5:08:08
[  719.229675] ixgbe: eth1: ixgbe_probe: MAC: 1, PHY: 0, PBA No: e15728-001
[  719.229677] ixgbe: eth1: ixgbe_probe: GRO is enabled
[  719.229679] ixgbe: eth1: ixgbe_probe: Intel(R) 10 Gigabit Network Connection
[  719.229699] ixgbe 0000:05:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[  719.229732] ixgbe 0000:05:00.0: setting latency timer to 64
[  719.305494] ixgbe: 0000:05:00.0: ixgbe_check_options: Checking IOV capabilities.
[  719.305503] ixgbe: 0000:05:00.0: ixgbe_check_options: IOV disabled.
[  719.305510] ixgbe: 0000:05:00.0: ixgbe_check_options: Flow Director hash filtering enabled
[  719.305516] ixgbe: 0000:05:00.0: ixgbe_check_options: Flow Director allocated 64kB of packet buffer
[  719.305524] ixgbe: 0000:05:00.0: ixgbe_check_options: ATR Tx Packet sample rate set to default of 20
[  719.305530] ixgbe: 0000:05:00.0: ixgbe_probe: SR-IOV disabled! Not probing VF.
[  719.331724] ixgbe 0000:05:00.0: irq 43 for MSI/MSI-X
[  719.331734] ixgbe 0000:05:00.0: irq 44 for MSI/MSI-X
[  719.331741] ixgbe 0000:05:00.0: irq 45 for MSI/MSI-X
[  719.331810] ixgbe: 0000:05:00.0: ixgbe_init_interrupt_scheme: Multiqueue Enabled: Rx Queue count = 2, Tx Queue count = 2
[  719.334841] ixgbe: eth1: ixgbe_probe: (PCI Express:5.0Gb/s:Width x4) 00:1b:21:5a:29:5a
[  719.334932] ixgbe: eth1: ixgbe_probe: MAC: 2, PHY: 15, SFP+: 5, PBA No: e68787-002
[  719.334934] ixgbe: eth1: ixgbe_probe: GRO is enabled
[  719.334936] ixgbe: eth1: ixgbe_probe: HW RSC is enabled
[  719.335750] ixgbe: eth1: ixgbe_probe: Intel(R) 10 Gigabit Network Connection
[  719.675998] ADDRCONF(NETDEV_UP): eth0: link is not ready
[  719.688653] ixgbe: eth0: ixgbe_watchdog_task: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[  719.739952] ADDRCONF(NETDEV_UP): eth2: link is not ready
[  719.744768] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[  719.750402] ixgbe: eth2: ixgbe_watchdog_task: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[  719.753023] ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[  719.993939] ADDRCONF(NETDEV_UP): eth0: link is not ready
[  720.007018] ixgbe: eth0: ixgbe_watchdog_task: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[  720.050958] ADDRCONF(NETDEV_UP): eth2: link is not ready
[  720.053396] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[  720.062087] ixgbe: eth2: ixgbe_watchdog_task: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[  720.062925] ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[  730.684244] eth0: no IPv6 routers present
[  730.788245] eth2: no IPv6 routers present
[  781.234515] ADDRCONF(NETDEV_UP): eth: link is not ready
[  781.374284] ixgbe: eth: ixgbe_watchdog_task: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[  781.376958] ADDRCONF(NETDEV_CHANGE): eth: link becomes ready
[  792.102249] eth: no IPv6 routers present

^ permalink raw reply

* Re: [regression] fib6_del() bug from 2.6.34-rc1 still present in 2.6.34
From: Julien BLACHE @ 2010-05-24  9:54 UTC (permalink / raw)
  To: Shan Wei; +Cc: netdev, linux-kernel
In-Reply-To: <4BF9DCB7.8040308@cn.fujitsu.com>

Shan Wei <shanwei@cn.fujitsu.com> wrote:

Hi,

> I saw the warning on 2.6.34-rc3. But on net-next tree the warning disappeared.
> So I think the bug is fixed in net-next tree. My NIC driver is r8169.
>
> The net-next tree is newer than 2.6.34, maybe the fix is not queued to 2.6.34.
> So the warning also be present in 2.6.34.
> Can you try net-next tree firstly?

Unfortunately, this tree has a number of issues that prevent a
successful boot into userland on my machine :/

Any idea what the fix(es) is so I can test it out?

JB.

-- 
Julien BLACHE                                   <http://www.jblache.org> 
<jb@jblache.org>                                  GPG KeyID 0xF5D65169

^ permalink raw reply

* Re: [PATCH][VHOST] fix race with guest on multi-buffer used buffer updates
From: Michael S. Tsirkin @ 2010-05-24 10:17 UTC (permalink / raw)
  To: David L Stevens; +Cc: netdev, kvm
In-Reply-To: <1274374686.8492.12.camel@w-dls.beaverton.ibm.com>

On Thu, May 20, 2010 at 09:58:06AM -0700, David L Stevens wrote:
> [for Michael Tsirkin's vhost development git tree]
> 
> This patch fixes a race between guest and host when
> adding used buffers wraps the ring. Without it, guests
> can see partial packets before num_buffers is set in
> the vnet header.
> 
> Signed-off-by: David L Stevens <dlstevens@us.ibm.com>

Could you please explain what the race is?

> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 7f2568d..74790ab 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -1065,14 +1065,6 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,
>  		vq_err(vq, "Failed to write used");
>  		return -EFAULT;
>  	}
> -	/* Make sure buffer is written before we update index. */
> -	smp_wmb();
> -	if (put_user(vq->last_used_idx + count, &vq->used->idx)) {
> -		vq_err(vq, "Failed to increment used idx");
> -		return -EFAULT;
> -	}
> -	if (unlikely(vq->log_used))
> -		vhost_log_used(vq, used);
>  	vq->last_used_idx += count;
>  	return 0;
>  }
> @@ -1093,7 +1085,17 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,
>  		heads += n;
>  		count -= n;
>  	}
> -	return __vhost_add_used_n(vq, heads, count);
> +	r = __vhost_add_used_n(vq, heads, count);
> +
> +	/* Make sure buffer is written before we update index. */
> +	smp_wmb();
> +	if (put_user(vq->last_used_idx, &vq->used->idx)) {
> +		vq_err(vq, "Failed to increment used idx");
> +		return -EFAULT;
> +	}
> +	if (unlikely(vq->log_used))
> +		vhost_log_used(vq, vq->used->ring + start);
> +	return r;
>  }
>  
>  /* This actually signals the guest, using eventfd. */
> 

^ permalink raw reply

* Re: cls_cgroup: Store classid in struct sock
From: Neil Horman @ 2010-05-24 11:09 UTC (permalink / raw)
  To: David Miller; +Cc: herbert, eric.dumazet, bmb, tgraf, nhorman, netdev
In-Reply-To: <20100524.001429.259990428.davem@davemloft.net>

On Mon, May 24, 2010 at 12:14:29AM -0700, David Miller wrote:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Mon, 24 May 2010 17:07:43 +1000
> 
> > On Sun, May 23, 2010 at 11:55:57PM -0700, David Miller wrote:
> >>
> >> Probably you only tested the build with cgroups enabled?
> > 
> > Indeed, that does seem to be the problem here.  It turns out
> > that their struct is only declared when CONFIG_CGROUPS is defined,
> > how annoying.
> > 
> > Oh well I guess I'll follow their example :)
> > 
> > cls_cgroup: Store classid in struct sock
> 
> Thanks for fixing this up, applied.
> 


Excuse all my noise from before, Herberts patch works fine.  Apparently the
problem was mine.  QEMU creates several threads during startup, but adding the
processid to a cgroup doesn't cause its child threads to get added
automatically.  Doing this by hand causes this to work.

Thanks
Neil


^ permalink raw reply

* Question about an assignment in handle_ing()
From: Jiri Pirko @ 2010-05-24 11:22 UTC (permalink / raw)
  To: hadi; +Cc: netdev, davem, herbert, kaber

Hi Jamal.

I want to ask you about the following chunk of code of net/core/dev.c in function handle_ing():

2699        if (*pt_prev) {
2700                *ret = deliver_skb(skb, *pt_prev, orig_dev);
2701                *pt_prev = NULL;
2702        } else {
2703                /* Huh? Why does turning on AF_PACKET affect this? */
2704                skb->tc_verd = SET_TC_OK2MUNGE(skb->tc_verd);
2705        }

The assignment (in "else" statement) was added by you here:
http://linux.bkbits.net:8080/linux-2.6.12-stable/?PAGE=cset&REV=40cfc085xjyQLyB1UiTbgaRZSlCN9w

The comment was added after move of this code by Herbert:
http://git.kernel.org/?p=linux/kernel/git/davem/net-next-2.6.git;a=commitdiff;h=f697c3e8b35c18b2698d64137c0fa84b0cdb3d10

Question is if this code is correct here. Maybe I'm missing something but
why is this dependent on a ptype was found previously?

Thanks,

	Jirka

^ permalink raw reply

* Re: cls_cgroup: Store classid in struct sock
From: Herbert Xu @ 2010-05-24 11:24 UTC (permalink / raw)
  To: Neil Horman; +Cc: David Miller, eric.dumazet, bmb, tgraf, nhorman, netdev
In-Reply-To: <20100524110950.GA22216@hmsreliant.think-freely.org>

On Mon, May 24, 2010 at 07:09:50AM -0400, Neil Horman wrote:
>
> Excuse all my noise from before, Herberts patch works fine.  Apparently the
> problem was mine.  QEMU creates several threads during startup, but adding the
> processid to a cgroup doesn't cause its child threads to get added
> automatically.  Doing this by hand causes this to work.

Thanks a lot for testing Neil!
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* [PATCH net-2.6] be2net: Bug fix in init code in probe
From: Sarveshwar Bandi @ 2010-05-24 11:20 UTC (permalink / raw)
  To: netdev; +Cc: davem

PCI function reset needs to invoked after fw init ioctl is issued.

Signed-off-by: Sarveshwar Bandi <sarveshwarb@serverengines.com>
---
 drivers/net/benet/be_main.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index 058d7f9..1c79c20 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -2487,10 +2487,6 @@ static int __devinit be_probe(struct pci
 		status = be_cmd_POST(adapter);
 		if (status)
 			goto ctrl_clean;
-
-		status = be_cmd_reset_function(adapter);
-		if (status)
-			goto ctrl_clean;
 	}
 
 	/* tell fw we're ready to fire cmds */
@@ -2498,6 +2494,12 @@ static int __devinit be_probe(struct pci
 	if (status)
 		goto ctrl_clean;
 
+	if (be_physfn(adapter)) {
+		status = be_cmd_reset_function(adapter);
+		if (status)
+			goto ctrl_clean;
+	}
+
 	status = be_stats_init(adapter);
 	if (status)
 		goto ctrl_clean;
-- 
1.4.0


^ permalink raw reply related

* Re: [PATCH net-next-2.6 2/2] bonding: allow user-controlled output slave selection
From: Andy Gospodarek @ 2010-05-24 15:31 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Neil Horman, Andy Gospodarek, netdev
In-Reply-To: <12958.1274145714@death.nxdomain.ibm.com>

On Mon, May 17, 2010 at 06:21:54PM -0700, Jay Vosburgh wrote:
> Jay Vosburgh <fubar@us.ibm.com> wrote:
> [...]
> >	For your patch, I'm exploring the idea of not setting
> >IFF_SLAVE_INACTIVE on "inactive" slaves for an "all_slaves_active"
> >option (I think that's a more descriptive name than "keep_all") instead
> >of adding a new KEEP_ALL flag bit to priv_flags.  Did you consider this
> >methodology and exclude it for some reason?
> 
> 	Following up to myself, I coded up approximately what I was
> talking about.  This doesn't require the extra priv_flag, and the sysfs
> _store is a little more complicated, but this appears to work (testing
> with ping -f after clearing the switch's MAC table to induce traffic
> flooding).  I didn't change the option name from "keep_all" here, but as
> far as the functionality goes, this seems to do what I think you want it
> to.
> 
> 	-J

This looks good to me, Jay.  I tested the patch here along with the
patch that started this thread and it works as expected.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>

It seems like you were willing to take the patch that started this
thread rather than a change that adds a new mode.  If we agree that this
is the correct direction, can you take a look at the patch and decide if
you are willing to ACK it as-is or if more changes are needed?

Thanks,

-andy


^ permalink raw reply

* Re: [PATCH][VHOST] fix race with guest on multi-buffer used buffer updates
From: David Stevens @ 2010-05-24 15:52 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, netdev
In-Reply-To: <20100524101709.GA28349@redhat.com>

"Michael S. Tsirkin" <mst@redhat.com> wrote on 05/24/2010 03:17:10 AM:

> On Thu, May 20, 2010 at 09:58:06AM -0700, David L Stevens wrote:
> > [for Michael Tsirkin's vhost development git tree]
> > 
> > This patch fixes a race between guest and host when
> > adding used buffers wraps the ring. Without it, guests
> > can see partial packets before num_buffers is set in
> > the vnet header.
> > 
> > Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
> 
> Could you please explain what the race is?

        Sure. The pre-patch code in the ring-wrap case
does this:

        add part1 bufs
        update used index
        add part2 bufs
        update used index

        After we update the used index for part1, the part1
buffers are available to the guest. If the guest is
consuming at that point, it can process the partial
packet before the rest of the packet is there. In that
case, num_buffers will be greater than the number of
buffers available to the guest and it'll drop the
packet with a framing error. I was seeing 2 or 3 framing
errors every 100 million packets or so pre-patch, none
post-patch.
        Actually, the second sentence is incorrect in the
original description-- num_buffers is up to date when
the guest sees it, but the used index is not.

                                                +-DLS


> 
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index 7f2568d..74790ab 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -1065,14 +1065,6 @@ static int __vhost_add_used_n(struct 
vhost_virtqueue *vq,
> >        vq_err(vq, "Failed to write used");
> >        return -EFAULT;
> >     }
> > -   /* Make sure buffer is written before we update index. */
> > -   smp_wmb();
> > -   if (put_user(vq->last_used_idx + count, &vq->used->idx)) {
> > -      vq_err(vq, "Failed to increment used idx");
> > -      return -EFAULT;
> > -   }
> > -   if (unlikely(vq->log_used))
> > -      vhost_log_used(vq, used);
> >     vq->last_used_idx += count;
> >     return 0;
> >  }
> > @@ -1093,7 +1085,17 @@ int vhost_add_used_n(struct vhost_virtqueue 
*vq, 
> struct vring_used_elem *heads,
> >        heads += n;
> >        count -= n;
> >     }
> > -   return __vhost_add_used_n(vq, heads, count);
> > +   r = __vhost_add_used_n(vq, heads, count);
> > +
> > +   /* Make sure buffer is written before we update index. */
> > +   smp_wmb();
> > +   if (put_user(vq->last_used_idx, &vq->used->idx)) {
> > +      vq_err(vq, "Failed to increment used idx");
> > +      return -EFAULT;
> > +   }
> > +   if (unlikely(vq->log_used))
> > +      vhost_log_used(vq, vq->used->ring + start);
> > +   return r;
> >  }
> > 
> >  /* This actually signals the guest, using eventfd. */
> > 


^ permalink raw reply

* [PATCH net-2.6] macvlan: do a proper cleanup in macvlan_common_newlink()
From: Jiri Pirko @ 2010-05-24 15:58 UTC (permalink / raw)
  To: netdev; +Cc: davem, kaber

Fixes possible memory leak.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
---
 drivers/net/macvlan.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 4e238af..5f4694b 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -634,11 +634,17 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev,
 
 	err = register_netdevice(dev);
 	if (err < 0)
-		return err;
+		goto destroy_port;
 
 	list_add_tail(&vlan->list, &port->vlans);
 	netif_stacked_transfer_operstate(lowerdev, dev);
+
 	return 0;
+
+destroy_port:
+	macvlan_port_destroy(lowerdev);
+
+	return err;
 }
 EXPORT_SYMBOL_GPL(macvlan_common_newlink);
 
-- 
1.6.6.1


^ permalink raw reply related

* Re: [PATCH][VHOST] fix race with guest on multi-buffer used buffer updates
From: Michael S. Tsirkin @ 2010-05-24 16:13 UTC (permalink / raw)
  To: David Stevens; +Cc: kvm, netdev
In-Reply-To: <OF343AB88F.017B31FE-ON8825772D.00564524-8825772D.00573773@us.ibm.com>

On Mon, May 24, 2010 at 08:52:40AM -0700, David Stevens wrote:
> "Michael S. Tsirkin" <mst@redhat.com> wrote on 05/24/2010 03:17:10 AM:
> 
> > On Thu, May 20, 2010 at 09:58:06AM -0700, David L Stevens wrote:
> > > [for Michael Tsirkin's vhost development git tree]
> > > 
> > > This patch fixes a race between guest and host when
> > > adding used buffers wraps the ring. Without it, guests
> > > can see partial packets before num_buffers is set in
> > > the vnet header.
> > > 
> > > Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
> > 
> > Could you please explain what the race is?
> 
>         Sure. The pre-patch code in the ring-wrap case
> does this:
> 
>         add part1 bufs
>         update used index
>         add part2 bufs
>         update used index
> 
>         After we update the used index for part1, the part1
> buffers are available to the guest. If the guest is
> consuming at that point, it can process the partial
> packet before the rest of the packet is there. In that
> case, num_buffers will be greater than the number of
> buffers available to the guest and it'll drop the
> packet with a framing error. I was seeing 2 or 3 framing
> errors every 100 million packets or so pre-patch, none
> post-patch.
>         Actually, the second sentence is incorrect in the
> original description-- num_buffers is up to date when
> the guest sees it, but the used index is not.
> 
>                                                 +-DLS

so this happens always - what does wrap-around refer to?

> 
> > 
> > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > > index 7f2568d..74790ab 100644
> > > --- a/drivers/vhost/vhost.c
> > > +++ b/drivers/vhost/vhost.c
> > > @@ -1065,14 +1065,6 @@ static int __vhost_add_used_n(struct 
> vhost_virtqueue *vq,
> > >        vq_err(vq, "Failed to write used");
> > >        return -EFAULT;
> > >     }
> > > -   /* Make sure buffer is written before we update index. */
> > > -   smp_wmb();
> > > -   if (put_user(vq->last_used_idx + count, &vq->used->idx)) {
> > > -      vq_err(vq, "Failed to increment used idx");
> > > -      return -EFAULT;
> > > -   }
> > > -   if (unlikely(vq->log_used))
> > > -      vhost_log_used(vq, used);
> > >     vq->last_used_idx += count;
> > >     return 0;
> > >  }
> > > @@ -1093,7 +1085,17 @@ int vhost_add_used_n(struct vhost_virtqueue 
> *vq, 
> > struct vring_used_elem *heads,
> > >        heads += n;
> > >        count -= n;
> > >     }
> > > -   return __vhost_add_used_n(vq, heads, count);
> > > +   r = __vhost_add_used_n(vq, heads, count);
> > > +
> > > +   /* Make sure buffer is written before we update index. */
> > > +   smp_wmb();
> > > +   if (put_user(vq->last_used_idx, &vq->used->idx)) {
> > > +      vq_err(vq, "Failed to increment used idx");
> > > +      return -EFAULT;
> > > +   }
> > > +   if (unlikely(vq->log_used))
> > > +      vhost_log_used(vq, vq->used->ring + start);
> > > +   return r;
> > >  }


I think a single vhost_log_used will not DTRT here:
it only updates log for a single entry.
So we'll need to split this to functions that
1. log used entries writes: called from __vhost_add_used_n
2. log used index write: called from vhost_add_used_n

> > > 
> > >  /* This actually signals the guest, using eventfd. */
> > > 

^ permalink raw reply

* Re: [PATCH][VHOST] fix race with guest on multi-buffer used buffer updates
From: David Stevens @ 2010-05-24 16:27 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, netdev, netdev-owner
In-Reply-To: <20100524161351.GA6580@redhat.com>

netdev-owner@vger.kernel.org wrote on 05/24/2010 09:13:51 AM:

> On Mon, May 24, 2010 at 08:52:40AM -0700, David Stevens wrote:
> > "Michael S. Tsirkin" <mst@redhat.com> wrote on 05/24/2010 03:17:10 AM:
> > 
> > > On Thu, May 20, 2010 at 09:58:06AM -0700, David L Stevens wrote:
> > > > [for Michael Tsirkin's vhost development git tree]
> > > > 
> > > > This patch fixes a race between guest and host when
> > > > adding used buffers wraps the ring. Without it, guests
> > > > can see partial packets before num_buffers is set in
> > > > the vnet header.
> > > > 
> > > > Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
> > > 
> > > Could you please explain what the race is?
> > 
> >         Sure. The pre-patch code in the ring-wrap case
> > does this:
> > 
> >         add part1 bufs
> >         update used index
> >         add part2 bufs
> >         update used index
> > 
> >         After we update the used index for part1, the part1
> > buffers are available to the guest. If the guest is
> > consuming at that point, it can process the partial
> > packet before the rest of the packet is there. In that
> > case, num_buffers will be greater than the number of
> > buffers available to the guest and it'll drop the
> > packet with a framing error. I was seeing 2 or 3 framing
> > errors every 100 million packets or so pre-patch, none
> > post-patch.
> >         Actually, the second sentence is incorrect in the
> > original description-- num_buffers is up to date when
> > the guest sees it, but the used index is not.
> > 
> >                                                 +-DLS
> 
> so this happens always - what does wrap-around refer to?

        The 2-part update only happens when a packet spans
the end/beginning of the vring (the wrap). The framing error only
happens if the guest sees the vring-wrapping packets
before the second used-index write (the race).
        So, the framing error doesn't happen always--it's
pretty rare. But with the patch, it never happens.

                                                        +-DLS


^ permalink raw reply

* [patch v2] caif: cleanup: remove duplicate checks
From: Dan Carpenter @ 2010-05-24 16:29 UTC (permalink / raw)
  To: walter harms
  Cc: netdev, Sjur Braendeland, Stephen Rothwell, kernel-janitors,
	David S. Miller
In-Reply-To: <4BF8E6FE.9000303@bfs.de>

"phyinfo" can never be null here because we assigned it an address, so I
removed both the assert and the second check inside the if statement.  I
removed the "phyinfo->phy_layer != NULL" check as well because that was
asserted earlier.

Walter Harms suggested I move the "phyinfo->phy_ref_count++;" outside
the if condition for readability, so I have done that.

Signed-off-by: Dan Carpenter <error27@gmail.com>
---
The original code increments phyinfo->phy_ref_count++ even if
phyinfo->phy_layer->modemcmd is NULL.  I kept it the same in my code,
but maybe I should change that?

v2: Removed an assert.  Moved the increment outside the if condition.

diff --git a/net/caif/cfcnfg.c b/net/caif/cfcnfg.c
index df43f26..7c81974 100644
--- a/net/caif/cfcnfg.c
+++ b/net/caif/cfcnfg.c
@@ -308,19 +308,15 @@ cfcnfg_linkup_rsp(struct cflayer *layer, u8 channel_id, enum cfctrl_srv serv,
 	caif_assert(cnfg != NULL);
 	caif_assert(phyid != 0);
 	phyinfo = &cnfg->phy_layers[phyid];
-	caif_assert(phyinfo != NULL);
 	caif_assert(phyinfo->id == phyid);
 	caif_assert(phyinfo->phy_layer != NULL);
 	caif_assert(phyinfo->phy_layer->id == phyid);
 
-	if (phyinfo != NULL &&
-	    phyinfo->phy_ref_count++ == 0 &&
-	    phyinfo->phy_layer != NULL &&
+	phyinfo->phy_ref_count++;
+	if (phyinfo->phy_ref_count == 1 &&
 	    phyinfo->phy_layer->modemcmd != NULL) {
-		caif_assert(phyinfo->phy_layer->id == phyid);
 		phyinfo->phy_layer->modemcmd(phyinfo->phy_layer,
 					     _CAIF_MODEMCMD_PHYIF_USEFULL);
-
 	}
 	adapt_layer->id = channel_id;
 

^ permalink raw reply related

* Re: [PATCH net-2.6] macvlan: do a proper cleanup in macvlan_common_newlink()
From: Jiri Pirko @ 2010-05-24 17:02 UTC (permalink / raw)
  To: netdev; +Cc: davem, kaber
In-Reply-To: <20100524155857.GO2810@psychotron.lab.eng.brq.redhat.com>

Mon, May 24, 2010 at 05:58:58PM CEST, jpirko@redhat.com wrote:
>Fixes possible memory leak.
>
>Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>---
> drivers/net/macvlan.c |    8 +++++++-
> 1 files changed, 7 insertions(+), 1 deletions(-)
>
>diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
>index 4e238af..5f4694b 100644
>--- a/drivers/net/macvlan.c
>+++ b/drivers/net/macvlan.c
>@@ -634,11 +634,17 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev,
> 
> 	err = register_netdevice(dev);
> 	if (err < 0)
>-		return err;
>+		goto destroy_port;
> 
> 	list_add_tail(&vlan->list, &port->vlans);
> 	netif_stacked_transfer_operstate(lowerdev, dev);
>+
> 	return 0;
>+
>+destroy_port:
>+	macvlan_port_destroy(lowerdev);
>+
>+	return err;
> }
> EXPORT_SYMBOL_GPL(macvlan_common_newlink);
> 
>-- 
>1.6.6.1

Actually it should be rather like this:

Subject: [PATCH net-2.6] macvlan: do proper cleanup in macvlan_common_newlink() V2

Fixes possible memory leak.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
---
 drivers/net/macvlan.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 4e238af..87e8d4c 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -634,11 +634,18 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev,
 
 	err = register_netdevice(dev);
 	if (err < 0)
-		return err;
+		goto destroy_port;
 
 	list_add_tail(&vlan->list, &port->vlans);
 	netif_stacked_transfer_operstate(lowerdev, dev);
+
 	return 0;
+
+destroy_port:
+	if (list_empty(&port->vlans))
+		macvlan_port_destroy(lowerdev);
+
+	return err;
 }
 EXPORT_SYMBOL_GPL(macvlan_common_newlink);
 
-- 
1.6.6.1


^ permalink raw reply related

* RE: ixgbe: macvlan on PF/VF when SRIOV is enabled
From: Shirley Ma @ 2010-05-24 17:08 UTC (permalink / raw)
  To: Rose, Gregory V
  Cc: Kirsher, Jeffrey T, davem@davemloft.net, kvm@vger.kernel.org,
	netdev@vger.kernel.org, e1000-devel@lists.sourceforge.net
In-Reply-To: <43F901BD926A4E43B106BF17856F0755A79C099C@orsmsx508.amr.corp.intel.com>

Hello Greg,

Thanks for your prompt response.

On Sat, 2010-05-22 at 10:53 -0700, Rose, Gregory V wrote:
> As of 2.6.34 the ixgbe driver does not support multiple queues for
> macvlan.
> Support for multiple queues for macvlan will come in a subsequent
> release.

When it might happen? I will double check my test to see whether macvlan
was multiple queue or not. Then submitting my experimental patch for
review.

> The VF driver does not support macvlan.  Future releases may but there
> are no immediate plans to support it.

When it might be support in future. For performance reason, we are
interested in macvlan + VF for multiples VMs.

One more question here: Does VF support promiscuous mode? I don't see
the flag in ixgbevf driver.

Thanks
Shirley


^ permalink raw reply

* Re: [PATCH net-next-2.6 2/2] bonding: allow user-controlled output slave selection
From: Neil Horman @ 2010-05-24 17:08 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: Jay Vosburgh, netdev
In-Reply-To: <20100524153108.GJ7497@gospo.rdu.redhat.com>

On Mon, May 24, 2010 at 11:31:08AM -0400, Andy Gospodarek wrote:
> On Mon, May 17, 2010 at 06:21:54PM -0700, Jay Vosburgh wrote:
> > Jay Vosburgh <fubar@us.ibm.com> wrote:
> > [...]
> > >	For your patch, I'm exploring the idea of not setting
> > >IFF_SLAVE_INACTIVE on "inactive" slaves for an "all_slaves_active"
> > >option (I think that's a more descriptive name than "keep_all") instead
> > >of adding a new KEEP_ALL flag bit to priv_flags.  Did you consider this
> > >methodology and exclude it for some reason?
> > 
> > 	Following up to myself, I coded up approximately what I was
> > talking about.  This doesn't require the extra priv_flag, and the sysfs
> > _store is a little more complicated, but this appears to work (testing
> > with ping -f after clearing the switch's MAC table to induce traffic
> > flooding).  I didn't change the option name from "keep_all" here, but as
> > far as the functionality goes, this seems to do what I think you want it
> > to.
> > 
> > 	-J
> 
> This looks good to me, Jay.  I tested the patch here along with the
> patch that started this thread and it works as expected.
> 
> Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
> 
> It seems like you were willing to take the patch that started this
> thread rather than a change that adds a new mode.  If we agree that this
> is the correct direction, can you take a look at the patch and decide if
> you are willing to ACK it as-is or if more changes are needed?
> 
> Thanks,
> 
> -andy
> 
Awesome, thanks Andy!
Neil

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply

* RE: [PATCH 2.6.34-rc6] net: Improve ks8851 snl transmit performance
From: Ha, Tristram @ 2010-05-24 17:28 UTC (permalink / raw)
  To: David Miller; +Cc: ben, netdev, linux-kernel, x0066660, s-jan
In-Reply-To: <20100516.002607.84408834.davem@davemloft.net>

David Miller wrote:
> From: "Ha, Tristram" <Tristram.Ha@Micrel.Com>
> Date: Thu, 6 May 2010 15:50:27 -0700
> 
>> From: Tristram Ha <Tristram.Ha@micrel.com>
>> 
>> Under heavy transmission the driver will put 4 1514-byte packets in
>> queue and stop the device transmit queue.  Only the last packet
>> triggers the transmit done interrupt and wakes up the device transmit
>> queue.  That means a bit of time is wasted when the CPU cannot send
>> any more packet.
>> 
>> The new implementation triggers the transmit interrupt when the
>> transmit buffer left is less than 3 packets.  The maximum transmit
>> buffer size is 6144 bytes.  This allows the device transmit queue to
>> be restarted sooner so that CPU can send more packets.
>> 
>> For TCP receiving it also has the benefit of not triggering any
transmit interrupt at all.
>> 
>> There is a driver option no_tx_opt so that the driver can revert to
>> original implementation.  This allows user to verify if the transmit
>> performance actually improves.
>> 
>> Signed-off-by: Tristram Ha <Tristram.Ha@micrel.com>
> 
> First, if you want to post patches you have to format them properly as
ascii text with no longer
> than 80 column lines in your commit message. 
> I really don't want to hear about your email client being a reason you
can't do this properly :-)
> 
> Second, I don't think you can use the skb->ip_summed for this hacked
state tracking you are
> using.  The packet might be shared with other entities, and therefore
if you change the field it
> won't be correct for them any more.  

Sorry about the patch description.  I must have forgotten the rules.

As the socket buffer is accepted by the lowest level network driver and
reaches the end of its life, I think using one of its variables
temporarily does not cause any harm.

^ permalink raw reply

* Re: [PATCH][VHOST] fix race with guest on multi-buffer used buffer updates
From: Michael S. Tsirkin @ 2010-05-24 16:42 UTC (permalink / raw)
  To: David Stevens; +Cc: kvm, netdev, netdev-owner
In-Reply-To: <OF1C9B2FC2.B6CFE18F-ON8825772D.0059C934-8825772D.005A61BA@us.ibm.com>

On Mon, May 24, 2010 at 09:27:15AM -0700, David Stevens wrote:
> netdev-owner@vger.kernel.org wrote on 05/24/2010 09:13:51 AM:
> 
> > On Mon, May 24, 2010 at 08:52:40AM -0700, David Stevens wrote:
> > > "Michael S. Tsirkin" <mst@redhat.com> wrote on 05/24/2010 03:17:10 AM:
> > > 
> > > > On Thu, May 20, 2010 at 09:58:06AM -0700, David L Stevens wrote:
> > > > > [for Michael Tsirkin's vhost development git tree]
> > > > > 
> > > > > This patch fixes a race between guest and host when
> > > > > adding used buffers wraps the ring. Without it, guests
> > > > > can see partial packets before num_buffers is set in
> > > > > the vnet header.
> > > > > 
> > > > > Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
> > > > 
> > > > Could you please explain what the race is?
> > > 
> > >         Sure. The pre-patch code in the ring-wrap case
> > > does this:
> > > 
> > >         add part1 bufs
> > >         update used index
> > >         add part2 bufs
> > >         update used index
> > > 
> > >         After we update the used index for part1, the part1
> > > buffers are available to the guest. If the guest is
> > > consuming at that point, it can process the partial
> > > packet before the rest of the packet is there. In that
> > > case, num_buffers will be greater than the number of
> > > buffers available to the guest and it'll drop the
> > > packet with a framing error. I was seeing 2 or 3 framing
> > > errors every 100 million packets or so pre-patch, none
> > > post-patch.
> > >         Actually, the second sentence is incorrect in the
> > > original description-- num_buffers is up to date when
> > > the guest sees it, but the used index is not.
> > > 
> > >                                                 +-DLS
> > 
> > so this happens always - what does wrap-around refer to?
> 
>         The 2-part update only happens when a packet spans
> the end/beginning of the vring (the wrap). The framing error only
> happens if the guest sees the vring-wrapping packets
> before the second used-index write (the race).
>         So, the framing error doesn't happen always--it's
> pretty rare. But with the patch, it never happens.
> 
>                                                         +-DLS

I see. The logging is still bugg though I think.

^ permalink raw reply

* Re: [PATCH 2.6.34-rc6] net: Improve ks8851 snl transmit performance
From: David Miller @ 2010-05-24 17:44 UTC (permalink / raw)
  To: Tristram.Ha; +Cc: ben, netdev, linux-kernel, x0066660, s-jan
In-Reply-To: <14385191E87B904DBD836449AA30269D6DF528@MORGANITE.micrel.com>

From: "Ha, Tristram" <Tristram.Ha@Micrel.Com>
Date: Mon, 24 May 2010 10:28:40 -0700

> As the socket buffer is accepted by the lowest level network driver and
> reaches the end of its life, I think using one of its variables
> temporarily does not cause any harm.

The packet can be referenced by packet sniffers and other entities in
the kernel.

^ permalink raw reply

* Re: [PATCH][VHOST] fix race with guest on multi-buffer used buffer updates
From: David Stevens @ 2010-05-24 17:50 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, netdev, netdev-owner
In-Reply-To: <20100524164205.GA6664@redhat.com>

Michael S. Tsirkin <mst@redhat.com> wrote on 05/24/2010 09:42:05 AM:
> 
> I see. The logging is still bugg though I think.

Possibly; migration isn't working for me under load even
without mergeable buffers (investigating), so I haven't
yet been able to test wrap w/ logging, but did you see
something specific that's wrong?

                                                +-DLS


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox