Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH v2 net-next] net: use jump_label to shortcut RPS if not setup
From: Eric Dumazet @ 2011-11-17 22:19 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, therbert
In-Reply-To: <20111117.170442.1731100546322591134.davem@davemloft.net>

Le jeudi 17 novembre 2011 à 17:04 -0500, David Miller a écrit :
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Thu, 17 Nov 2011 22:34:49 +0100
> 
> > I wonder if we could use this jump_label infrastructure to shortcut
> > expensive netfilter nf_hook_slow() / nf_iterate on empty
> > tables/chains...
> 
> That would be nice, I can't see why it wouldn't be possible.

Almost done :)

Only problem is arch_static_branch() has a "i" (key), so some calls wont
be optimized (hooknum in a variable, not a constant)

for example in nf_ct_frag6_output()

Not a big deal, most calls can be optimized away.

^ permalink raw reply

* Re: [PATCH] bonding: Don't allow mode change via sysfs with slaves present
From: Nicolas de Pesloüan @ 2011-11-17 22:36 UTC (permalink / raw)
  To: David Miller; +Cc: vfalico, netdev, andy, fubar
In-Reply-To: <20111117.160405.368943784304624880.davem@davemloft.net>

Le 17/11/2011 22:04, David Miller a écrit :
> From: Veaceslav Falico<vfalico@redhat.com>
> Date: Tue, 15 Nov 2011 17:44:42 +0100
>
>> When changing mode via bonding's sysfs, the slaves are not initialized
>> correctly. Forbid to change modes with slaves present to ensure that every
>> slave is initialized correctly via bond_enslave().
>>
>> Signed-off-by: Veaceslav Falico<vfalico@redhat.com>
>
> Lots of discussion... what is the final verdict on this patch bonding folks?

Now looks good to me, and as all others participants agreed at the beginning, I think it's good for 
everyone.

	Nicolas.

^ permalink raw reply

* Re: [PATCH] net, bridge: print log message after state changed
From: Stephen Hemminger @ 2011-11-17 22:40 UTC (permalink / raw)
  To: David Lamparter
  Cc: Wolfgang.Fritz, netdev, bridge, holger.brunck, shemminger,
	David Miller
In-Reply-To: <20111117192353.GD2051622@jupiter.n2.diac24.net>

On Thu, 17 Nov 2011 20:23:53 +0100
David Lamparter <equinox@diac24.net> wrote:

> There seems to be a misunderstanding here. The patch effectively does:
> -       br_log_state(p);
>         p->state = BR_STATE_DISABLED;
> +       br_log_state(p);
> 
> and the issue it is trying to fix is not the timing but rather the code
> printing the wrong (old, now-left) state.
> 
> I do agree that the log message should be printed before anything
> happens, so, Holger, can you brew a patch that does that?

Other than being horribly strict, why bother? I hope you aren't building
some application that depends on the ordering of the log message.
The state change doesn't get undone, why does the order of the message
matter at all?

^ permalink raw reply

* Re: [PATCH 4/6] sky2: reduce default Tx ring size
From: Stephen Hemminger @ 2011-11-17 22:41 UTC (permalink / raw)
  To: Sven Joachim; +Cc: davem, netdev
In-Reply-To: <874ny2bgqa.fsf@turtle.gmx.de>

On Thu, 17 Nov 2011 22:07:25 +0100
Sven Joachim <svenjoac@gmx.de> wrote:

> On 2011-11-17 00:42 +0100, Stephen Hemminger wrote:
> 
> > The default Tx ring size for the sky2 driver is quite large and could
> > cause excess buffer bloat for many users. The minimum ring size 
> > possible and still allow handling  the worst case packet on 64bit platforms
> > is 38 which gets rounded up to a power of 2. But most packets only require
> > a couple of ring elements.
> >
> > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> >
> > --- a/drivers/net/ethernet/marvell/sky2.c	2011-11-16 15:19:39.518659262 -0800
> > +++ b/drivers/net/ethernet/marvell/sky2.c	2011-11-16 15:19:40.990692544 -0800
> > @@ -68,7 +68,7 @@
> >  #define MAX_SKB_TX_LE	(2 + (sizeof(dma_addr_t)/sizeof(u32))*(MAX_SKB_FRAGS+1))
> >  #define TX_MIN_PENDING		(MAX_SKB_TX_LE+1)
> >  #define TX_MAX_PENDING		1024
> > -#define TX_DEF_PENDING		127
> > +#define TX_DEF_PENDING		63
> >  
> >  #define TX_WATCHDOG		(5 * HZ)
> >  #define NAPI_WEIGHT		64
> 
> It's hard to believe, but this innocuous-looking patch caused my system
> to crash and burn as soon as I actually used the network.
> Unfortunately, I have neither a digital camera nor a serial console at
> hand, but from my recollections and having read sky2.c I conclude that
> 
> 	BUG_ON(done >= sky2->tx_ring_size);
> 
> in sky2_tx_complete() seems to have been triggered.  Does this make any
> sense?
> 

If that is triggering then something is really odd because that means
the ring index went past the end of the ring and wasn't wrapped correctly.

^ permalink raw reply

* [PATCH] net: Use kmemdup rather than duplicating its implementation
From: Thomas Meyer @ 2011-11-17 22:43 UTC (permalink / raw)
  To: samuel, davem, netdev, linux-kernel

The semantic patch that makes this change is available
in scripts/coccinelle/api/memdup.cocci.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
---

diff -u -p a/net/irda/irttp.c b/net/irda/irttp.c
--- a/net/irda/irttp.c 2011-11-07 19:39:06.071138486 +0100
+++ b/net/irda/irttp.c 2011-11-08 10:59:07.152748948 +0100
@@ -1461,14 +1461,13 @@ struct tsap_cb *irttp_dup(struct tsap_cb
 	}
 
 	/* Allocate a new instance */
-	new = kmalloc(sizeof(struct tsap_cb), GFP_ATOMIC);
+	new = kmemdup(orig, sizeof(struct tsap_cb), GFP_ATOMIC);
 	if (!new) {
 		IRDA_DEBUG(0, "%s(), unable to kmalloc\n", __func__);
 		spin_unlock_irqrestore(&irttp->tsaps->hb_spinlock, flags);
 		return NULL;
 	}
 	/* Dup */
-	memcpy(new, orig, sizeof(struct tsap_cb));
 	spin_lock_init(&new->lock);
 
 	/* We don't need the old instance any more */

^ permalink raw reply

* [PATCH] ks8*/ksz8*: Casting (void *) value returned by kmalloc is useless
From: Thomas Meyer @ 2011-11-17 22:43 UTC (permalink / raw)
  To: netdev, linux-kernel

The semantic patch that makes this change is available
in scripts/coccinelle/api/alloc/drop_kmalloc_cast.cocci.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
---

diff -u -p a/drivers/net/ethernet/micrel/ks8851_mll.c b/drivers/net/ethernet/micrel/ks8851_mll.c
--- a/drivers/net/ethernet/micrel/ks8851_mll.c 2011-11-07 19:37:59.170122280 +0100
+++ b/drivers/net/ethernet/micrel/ks8851_mll.c 2011-11-08 09:00:03.544702580 +0100
@@ -1500,8 +1500,7 @@ static int ks_hw_init(struct ks_net *ks)
 	ks->all_mcast = 0;
 	ks->mcast_lst_size = 0;
 
-	ks->frame_head_info = (struct type_frame_head *) \
-		kmalloc(MHEADER_SIZE, GFP_KERNEL);
+	ks->frame_head_info = kmalloc(MHEADER_SIZE, GFP_KERNEL);
 	if (!ks->frame_head_info) {
 		pr_err("Error: Fail to allocate frame memory\n");
 		return false;

^ permalink raw reply

* [PATCH] RxRPC: Use kmemdup rather than duplicating its implementation
From: Thomas Meyer @ 2011-11-17 22:43 UTC (permalink / raw)
  To: davem, netdev, linux-kernel

The semantic patch that makes this change is available
in scripts/coccinelle/api/memdup.cocci.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
---

diff -u -p a/net/rxrpc/ar-key.c b/net/rxrpc/ar-key.c
--- a/net/rxrpc/ar-key.c 2011-11-07 19:38:32.050621708 +0100
+++ b/net/rxrpc/ar-key.c 2011-11-08 10:59:18.926240106 +0100
@@ -306,10 +306,9 @@ static int rxrpc_krb5_decode_tagged_data
 	td->data_len = len;
 
 	if (len > 0) {
-		td->data = kmalloc(len, GFP_KERNEL);
+		td->data = kmemdup(xdr, len, GFP_KERNEL);
 		if (!td->data)
 			return -ENOMEM;
-		memcpy(td->data, xdr, len);
 		len = (len + 3) & ~3;
 		toklen -= len;
 		xdr += len >> 2;
@@ -401,10 +400,9 @@ static int rxrpc_krb5_decode_ticket(u8 *
 	_debug("ticket len %u", len);
 
 	if (len > 0) {
-		*_ticket = kmalloc(len, GFP_KERNEL);
+		*_ticket = kmemdup(xdr, len, GFP_KERNEL);
 		if (!*_ticket)
 			return -ENOMEM;
-		memcpy(*_ticket, xdr, len);
 		len = (len + 3) & ~3;
 		toklen -= len;
 		xdr += len >> 2;

^ permalink raw reply

* [PATCH] brcm80211: smac: Use kmemdup rather than duplicating its implementation
From: Thomas Meyer @ 2011-11-17 22:43 UTC (permalink / raw)
  To: linville, linux-wireless, netdev, linux-kernel

The semantic patch that makes this change is available
in scripts/coccinelle/api/memdup.cocci.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
---

diff -u -p a/drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c b/drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c
--- a/drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c 2011-11-07 19:39:05.641131954 +0100
+++ b/drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c 2011-11-08 10:44:56.761349456 +0100
@@ -1561,11 +1561,10 @@ int brcms_ucode_init_buf(struct brcms_in
 			if (le32_to_cpu(hdr->idx) == idx) {
 				pdata = wl->fw.fw_bin[i]->data +
 					le32_to_cpu(hdr->offset);
-				*pbuf = kmalloc(len, GFP_ATOMIC);
+				*pbuf = kmemdup(pdata, len, GFP_ATOMIC);
 				if (*pbuf == NULL)
 					goto fail;
 
-				memcpy(*pbuf, pdata, len);
 				return 0;
 			}
 		}

^ permalink raw reply

* [PATCH] CDC NCM: Use kzalloc rather than kmalloc followed by memset with 0
From: Thomas Meyer @ 2011-11-17 22:43 UTC (permalink / raw)
  To: oliver, linux-usb, netdev, linux-kernel

This considers some simple cases that are common and easy to validate
Note in particular that there are no ...s in the rule, so all of the
matched code has to be contiguous

The semantic patch that makes this change is available
in scripts/coccinelle/api/alloc/kzalloc-simple.cocci.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
---

diff -u -p a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
--- a/drivers/net/usb/cdc_ncm.c 2011-11-07 19:38:00.160137318 +0100
+++ b/drivers/net/usb/cdc_ncm.c 2011-11-08 09:35:34.632578608 +0100
@@ -465,12 +465,10 @@ static int cdc_ncm_bind(struct usbnet *d
 	int temp;
 	u8 iface_no;
 
-	ctx = kmalloc(sizeof(*ctx), GFP_KERNEL);
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
 	if (ctx == NULL)
 		return -ENODEV;
 
-	memset(ctx, 0, sizeof(*ctx));
-
 	init_timer(&ctx->tx_timer);
 	spin_lock_init(&ctx->mtx);
 	ctx->netdev = dev->net;

^ permalink raw reply

* [PATCH] ethtool: Use kmemdup rather than duplicating its implementation
From: Thomas Meyer @ 2011-11-17 23:05 UTC (permalink / raw)
  To: netdev, linux-kernel

The semantic patch that makes this change is available
in scripts/coccinelle/api/memdup.cocci.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
---

diff -u -p a/drivers/net/ethernet/freescale/gianfar_ethtool.c b/drivers/net/ethernet/freescale/gianfar_ethtool.c
--- a/drivers/net/ethernet/freescale/gianfar_ethtool.c 2011-11-07 19:37:57.036756543 +0100
+++ b/drivers/net/ethernet/freescale/gianfar_ethtool.c 2011-11-08 10:42:14.842512269 +0100
@@ -1410,10 +1410,9 @@ static int gfar_optimize_filer_masks(str
 
 	/* We need a copy of the filer table because
 	 * we want to change its order */
-	temp_table = kmalloc(sizeof(*temp_table), GFP_KERNEL);
+	temp_table = kmemdup(tab, sizeof(*temp_table), GFP_KERNEL);
 	if (temp_table == NULL)
 		return -ENOMEM;
-	memcpy(temp_table, tab, sizeof(*temp_table));
 
 	mask_table = kcalloc(MAX_FILER_CACHE_IDX / 2 + 1,
 			sizeof(struct gfar_mask_entry), GFP_KERNEL);

^ permalink raw reply

* Should "N/A" dust bunnies be swept from fw_version?
From: Rick Jones @ 2011-11-17 23:27 UTC (permalink / raw)
  To: netdev

In the discussion on "enable virtio_net to return bus_info in ethtool -i 
consistent with emulated NICs" Ben Hutchings had the following feedback 
on what might go into bus_info:

> Please use the existing 'not implemented' value, which is the empty
> string.   If you think ethtool should print some helpful message instead
> of an empty string, please submit a patch for ethtool.

When I was sweeping in the .get_drvinfo routines, I noticed many drivers 
would return "N/A" for fw_version - presumably they were drivers for 
cards without firmware.  Should those be removed to have the fw_version 
be the empty string, or should those sleeping dust bunnies be allowed to 
lie?

rick jones

^ permalink raw reply

* RE: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address filtering support for passthru mode
From: Ben Hutchings @ 2011-11-17 23:37 UTC (permalink / raw)
  To: Rose, Gregory V
  Cc: Roopa Prabhu, netdev@vger.kernel.org, sri@us.ibm.com,
	dragos.tatulea@gmail.com, arnd@arndb.de, kvm@vger.kernel.org,
	mst@redhat.com, davem@davemloft.net, mchan@broadcom.com,
	dwang2@cisco.com, shemminger@vyatta.com, eric.dumazet@gmail.com,
	kaber@trash.net, benve@cisco.com
In-Reply-To: <43F901BD926A4E43B106BF17856F075501A19FF1D8@orsmsx508.amr.corp.intel.com>

On Thu, 2011-10-20 at 13:43 -0700, Rose, Gregory V wrote:
> > -----Original Message-----
> > From: Roopa Prabhu [mailto:roprabhu@cisco.com]
[...]
> > Moving the ops to netdev should be trivial. You probably want the ops to
> > work on the VF via the PF, like the existing ndo_set_vf_mac etc.
> 
> That is correct, so we would need to add some way to pass the VF number to the op.
> In addition, there are use cases for multiple MAC address filters for the Physical
> Function (PF) so we would like to be able to identify to the netdev op that it is
> supposed to perform the action on the PF filters instead of a VF.
> 
> An example of this would be when an administrator has created some number of VFs
> for a given PF but is also running the PF in bridged (i.e. promiscuous) mode so that it
> can support purely SW emulated network connections in some VMs that have low network
> latency and bandwidth requirements while reserving the VFs for VMs that require the low latency, high throughput that directly assigned VFs can provide.  In this case an
> emulated SW interface in a VM is unable to properly communicate with VFs on the same
> PF because the emulated SW interface's MAC address isn't programmed into the HW filters
> on the PF.  If we could use this op to program the MAC address and VLAN filters of
> the emulated SW interfaces into the PF HW a VF could then properly communicate across
> the NIC's internal VEB to the emulated SW interfaces.
[...]

This would also be good for Solarflare's VF plugin architecture.  The VF
driver works as a plugin for virtio or xen_netfront and can refuse
packets that need to be bridged to another (physically) local address.
The PF driver has to tell VFs what the local addresses are and currently
relies on some custom scripting to know about those extra addresses.

(No, none of that is upstream - I'm preparing for that now.)

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [net-next] iproute2: Add new command to IP link to enable/disable VF spoof check
From: Greg Rose @ 2011-11-17 23:39 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: Stephen Hemminger, davem, netdev, gospo
In-Reply-To: <1316938998-29855-1-git-send-email-jeffrey.t.kirsher@intel.com>



On 9/25/2011 1:23 AM, Jeff Kirsher wrote:
> From: Greg Rose <gregory.v.rose@intel.com>
>
> Add IP link command parsing for VF spoof checking enable/disable
>
> Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Stephen,

I don't see that this patch was ever applied to your tree.  Was there 
some problem with it that I missed?  Maybe I'm looking at the wrong tree?

Was just wondering...

Regards,

- Greg

> ---
> include/linux/if_link.h | 7 +++++++
> ip/ipaddress.c | 6 ++++++
> ip/iplink.c | 15 +++++++++++++++
> man/man8/ip.8 | 4 +++-
> 4 files changed, 31 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/if_link.h b/include/linux/if_link.h
> index 304c44f..e421f60 100644
> --- a/include/linux/if_link.h
> +++ b/include/linux/if_link.h
> @@ -277,6 +277,7 @@ enum {
> IFLA_VF_MAC, /* Hardware queue specific attributes */
> IFLA_VF_VLAN,
> IFLA_VF_TX_RATE, /* TX Bandwidth Allocation */
> + IFLA_VF_SPOOFCHK, /* Spoof Checking on/off switch */
> __IFLA_VF_MAX,
> };
>
> @@ -298,12 +299,18 @@ struct ifla_vf_tx_rate {
> __u32 rate; /* Max TX bandwidth in Mbps, 0 disables throttling */
> };
>
> +struct ifla_vf_spoofchk {
> + __u32 vf;
> + __u32 setting;
> +};
> +
> struct ifla_vf_info {
> __u32 vf;
> __u8 mac[32];
> __u32 vlan;
> __u32 qos;
> __u32 tx_rate;
> + __u32 spoofchk;
> };
>
> /* VF ports management section
> diff --git a/ip/ipaddress.c b/ip/ipaddress.c
> index 85f05a2..7e6ac50 100644
> --- a/ip/ipaddress.c
> +++ b/ip/ipaddress.c
> @@ -197,6 +197,7 @@ static void print_vfinfo(FILE *fp, struct rtattr
> *vfinfo)
> struct ifla_vf_mac *vf_mac;
> struct ifla_vf_vlan *vf_vlan;
> struct ifla_vf_tx_rate *vf_tx_rate;
> + struct ifla_vf_spoofchk *vf_spoofchk;
> struct rtattr *vf[IFLA_VF_MAX+1];
> SPRINT_BUF(b1);
>
> @@ -210,6 +211,7 @@ static void print_vfinfo(FILE *fp, struct rtattr
> *vfinfo)
> vf_mac = RTA_DATA(vf[IFLA_VF_MAC]);
> vf_vlan = RTA_DATA(vf[IFLA_VF_VLAN]);
> vf_tx_rate = RTA_DATA(vf[IFLA_VF_TX_RATE]);
> + vf_spoofchk = RTA_DATA(vf[IFLA_VF_SPOOFCHK]);
>
> fprintf(fp, "\n vf %d MAC %s", vf_mac->vf,
> ll_addr_n2a((unsigned char *)&vf_mac->mac,
> @@ -220,6 +222,10 @@ static void print_vfinfo(FILE *fp, struct rtattr
> *vfinfo)
> fprintf(fp, ", qos %d", vf_vlan->qos);
> if (vf_tx_rate->rate)
> fprintf(fp, ", tx rate %d (Mbps)", vf_tx_rate->rate);
> + if (vf_spoofchk->setting)
> + fprintf(fp, ", spoof checking on");
> + else
> + fprintf(fp, ", spoof checking off");
> }
>
> int print_linkinfo(const struct sockaddr_nl *who,
> diff --git a/ip/iplink.c b/ip/iplink.c
> index e5325a6..f4a5243 100644
> --- a/ip/iplink.c
> +++ b/ip/iplink.c
> @@ -71,7 +71,10 @@ void iplink_usage(void)
> fprintf(stderr, " [ alias NAME ]\n");
> fprintf(stderr, " [ vf NUM [ mac LLADDR ]\n");
> fprintf(stderr, " [ vlan VLANID [ qos VLAN-QOS ] ]\n");
> +
> fprintf(stderr, " [ rate TXRATE ] ] \n");
> +
> + fprintf(stderr, " [ spoofchk { on | off} ] ] \n");
> fprintf(stderr, " [ master DEVICE ]\n");
> fprintf(stderr, " [ nomaster ]\n");
> fprintf(stderr, " ip link show [ DEVICE | group GROUP ]\n");
> @@ -228,6 +231,18 @@ int iplink_parse_vf(int vf, int *argcp, char ***argvp,
> ivt.vf = vf;
> addattr_l(&req->n, sizeof(*req), IFLA_VF_TX_RATE, &ivt, sizeof(ivt));
>
> + } else if (matches(*argv, "spoofchk") == 0) {
> + struct ifla_vf_spoofchk ivs;
> + NEXT_ARG();
> + if (matches(*argv, "on") == 0)
> + ivs.setting = 1;
> + else if (matches(*argv, "off") == 0)
> + ivs.setting = 0;
> + else
> + invarg("Invalid \"spoofchk\" value\n", *argv);
> + ivs.vf = vf;
> + addattr_l(&req->n, sizeof(*req), IFLA_VF_SPOOFCHK, &ivs, sizeof(ivs));
> +
> } else {
> /* rewind arg */
> PREV_ARG();
> diff --git a/man/man8/ip.8 b/man/man8/ip.8
> index 27993a4..cd7b97a 100644
> --- a/man/man8/ip.8
> +++ b/man/man8/ip.8
> @@ -100,7 +100,9 @@ ip \- show / manipulate routing, devices, policy
> routing and tunnels
> .B qos
> .IR VLAN-QOS " ] ] ["
> .B rate
> -.IR TXRATE " ] |"
> +.IR TXRATE " ] ["
> +.B spoofchk { on | off }
> +] |
> .br
> .B master
> .IR DEVICE

^ permalink raw reply

* RE: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address filtering support for passthru mode
From: Ben Hutchings @ 2011-11-17 23:43 UTC (permalink / raw)
  To: Michael S. Tsirkin, Rose, Gregory V
  Cc: Roopa Prabhu, netdev@vger.kernel.org, sri@us.ibm.com,
	dragos.tatulea@gmail.com, arnd@arndb.de, kvm@vger.kernel.org,
	davem@davemloft.net, mchan@broadcom.com, dwang2@cisco.com,
	shemminger@vyatta.com, eric.dumazet@gmail.com, kaber@trash.net,
	benve@cisco.com
In-Reply-To: <43F901BD926A4E43B106BF17856F075501A1AE542A@orsmsx508.amr.corp.intel.com>

On Tue, 2011-10-25 at 08:59 -0700, Rose, Gregory V wrote:
> > -----Original Message-----
> > From: Michael S. Tsirkin [mailto:mst@redhat.com]
> > Sent: Tuesday, October 25, 2011 8:46 AM
> > To: Roopa Prabhu
> > Cc: netdev@vger.kernel.org; sri@us.ibm.com; dragos.tatulea@gmail.com;
> > arnd@arndb.de; kvm@vger.kernel.org; davem@davemloft.net;
> > mchan@broadcom.com; dwang2@cisco.com; shemminger@vyatta.com;
> > eric.dumazet@gmail.com; kaber@trash.net; benve@cisco.com; Rose, Gregory V
> > Subject: Re: [net-next-2.6 PATCH 0/8 RFC v2] macvlan: MAC Address
> > filtering support for passthru mode
> > 
> > On Mon, Oct 24, 2011 at 11:15:05AM -0700, Roopa Prabhu wrote:
> > > >> ... And also I dont know of any hw
> > > >> that provides an interface to set hw filters on a per queue basis.
> > > >
> > > > VMDq hardware would support this, no?
> > > >
> > > Am not really sure. This patch uses netdev to pass filters to hw. And I
> > > don't see any netdev infrastructure that would support per queue
> > filters.
> > 
> > Sure. I was only saying that as far as I understand,
> > VMDq hardware does support this functionality.
> > Right, Greg?
> 
> In the case of Intel HW yes.  I'll refrain from speaking for other HW
> vendors although I'm guessing it would be true in their cases also.
> YMMV, caveat emptor, etc. etc.

Current Solarflare hardware supports:
- RX MAC address filters for queue selection (steering), which can be
  combined with RSS (flow hashing)
- TX MAC address filters to prevent spoofing
Multiple filters can be associated with a single queue.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH] ipv4: avoid to double release dst in tcp_v4_connect
From: RongQing Li @ 2011-11-17 23:54 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20111117.165601.715150304040818603.davem@davemloft.net>

2011/11/18 David Miller <davem@davemloft.net>:
> We set 'rt' to NULL at these statements specifically to handle this
> situation, ip_rt_put(NULL) will do nothing.
>
> I don't think this change is necessary.
>

Yes, I am wrong, thank you.

^ permalink raw reply

* Re: [net-next-2.6 PATCH 0/6 v4] macvlan: MAC Address filtering support for passthru mode
From: Ben Hutchings @ 2011-11-18  0:15 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: netdev, davem, chrisw, sri, dragos.tatulea, kvm, arnd, mst,
	gregory.v.rose, mchan, dwang2, shemminger, eric.dumazet, kaber,
	benve
In-Reply-To: <20111109075449.13549.58135.stgit@rhel6.1>

Sorry to come to this rather late.

On Tue, 2011-11-08 at 23:55 -0800, Roopa Prabhu wrote:
[...]
> v2 -> v3
> - Moved set and get filter ops from rtnl_link_ops to netdev_ops
> - Support for SRIOV VFs.
>         [Note: The get filters msg (in the way current get rtnetlink handles
>         it) might get too big for SRIOV vfs. This patch follows existing sriov 
>         vf get code and tries to accomodate filters for all VF's in a PF. 
>         And for the SRIOV case I have only tested the fact that the VF 
>         arguments are getting delivered to rtnetlink correctly. The code
>         follows existing sriov vf handling code so rest of it should work fine]
[...]

This is already broken for large numbers of VFs, and increasing the
amount of information per VF is going to make the situation worse.  I am
no netlink expert but I think that the current approach of bundling all
information about an interface in a single message may not be
sustainable.

Also, I'm unclear on why this interface is to be used to set filtering
for the (PF) net device as well as for related VFs.  Doesn't that
duplicate the functionality of ndo_set_rx_mode and
ndo_vlan_rx_{add,kill}_vid?

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [net-next-2.6 PATCH 1/6 v4] rtnetlink: Netlink interface for setting MAC and VLAN filters
From: Ben Hutchings @ 2011-11-18  0:17 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: netdev, davem, chrisw, sri, dragos.tatulea, kvm, arnd, mst,
	gregory.v.rose, mchan, dwang2, shemminger, eric.dumazet, kaber,
	benve
In-Reply-To: <20111109075540.13549.82297.stgit@rhel6.1>

On Tue, 2011-11-08 at 23:55 -0800, Roopa Prabhu wrote:
> From: Roopa Prabhu <roprabhu@cisco.com>
> 
> This patch introduces the following netlink interface to set
> MAC and VLAN filters on an network interface. It can be used to
> set RX filter on any network interface (if supported by the driver) and
> also on a SRIOV VF via its PF
>
> Interface to set RX filter on a SRIOV VF
> [IFLA_VF_RX_FILTERS] = {
> 	[IFLA_VF_RX_FILTER] = {
> 		[IFLA_RX_FILTER_VF]
> 		[IFLA_RX_FILTER_ADDR] = {
> 			[IFLA_RX_FILTER_ADDR_FLAGS]
> 			[IFLA_RX_FILTER_ADDR_UC_LIST] = {
> 				[IFLA_ADDR_LIST_ENTRY]
> 			}
> 			[IFLA_RX_FILTER_ADDR_MC_LIST] = {
> 				[IFLA_ADDR_LIST_ENTRY]
> 			}
> 		}
> 		[IFLA_RX_FILTER_VLAN] = {
> 			[IFLA_RX_FILTER_VLAN_BITMAP]
> 		}
> 	}
> 	...
> }
[...]

Please put the details of both syntax *and semantics* in if_link.h.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: Should "N/A" dust bunnies be swept from fw_version?
From: Ben Hutchings @ 2011-11-18  0:19 UTC (permalink / raw)
  To: Rick Jones; +Cc: netdev
In-Reply-To: <4EC5984D.5070501@hp.com>

On Thu, 2011-11-17 at 15:27 -0800, Rick Jones wrote:
> In the discussion on "enable virtio_net to return bus_info in ethtool -i 
> consistent with emulated NICs" Ben Hutchings had the following feedback 
> on what might go into bus_info:
> 
> > Please use the existing 'not implemented' value, which is the empty
> > string.   If you think ethtool should print some helpful message instead
> > of an empty string, please submit a patch for ethtool.
> 
> When I was sweeping in the .get_drvinfo routines, I noticed many drivers 
> would return "N/A" for fw_version - presumably they were drivers for 
> cards without firmware.  Should those be removed to have the fw_version 
> be the empty string, or should those sleeping dust bunnies be allowed to 
> lie?

I much prefer the empty string; the ethtool utility can turn that into a
user-friendly placeholder if it's considered confusing.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* [PATCH] Change mii to ethtool advertisement function names
From: Matt Carlson @ 2011-11-18  0:30 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson, Michael Chan

This patch implements advice by Ben Hutchings to change the mii side of
the function names to look more like the register whose values they
convert.  New LPA translation functions have been added as well.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2.c |   20 +++++-----
 drivers/net/ethernet/broadcom/tg3.c  |   10 +++---
 drivers/net/ethernet/sun/niu.c       |    4 +-
 drivers/net/mii.c                    |   15 ++++----
 drivers/net/phy/phy_device.c         |    4 +-
 include/linux/mii.h                  |   64 ++++++++++++++++++++++++++--------
 6 files changed, 75 insertions(+), 42 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index 6b7cd1e..0bc39e6 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -2054,8 +2054,8 @@ __acquires(&bp->phy_lock)
 
 	if (bp->autoneg & AUTONEG_SPEED) {
 		u32 adv_reg, adv1000_reg;
-		u32 new_adv_reg = 0;
-		u32 new_adv1000_reg = 0;
+		u32 new_adv = 0;
+		u32 new_adv1000 = 0;
 
 		bnx2_read_phy(bp, bp->mii_adv, &adv_reg);
 		adv_reg &= (PHY_ALL_10_100_SPEED | ADVERTISE_PAUSE_CAP |
@@ -2064,18 +2064,18 @@ __acquires(&bp->phy_lock)
 		bnx2_read_phy(bp, MII_CTRL1000, &adv1000_reg);
 		adv1000_reg &= PHY_ALL_1000_SPEED;
 
-		new_adv_reg = ethtool_adv_to_mii_100bt(bp->advertising);
-		new_adv_reg |= ADVERTISE_CSMA;
-		new_adv_reg |= bnx2_phy_get_pause_adv(bp);
+		new_adv = ethtool_adv_to_mii_adv_t(bp->advertising);
+		new_adv |= ADVERTISE_CSMA;
+		new_adv |= bnx2_phy_get_pause_adv(bp);
 
-		new_adv1000_reg |= ethtool_adv_to_mii_1000T(bp->advertising);
+		new_adv1000 |= ethtool_adv_to_mii_ctrl1000_t(bp->advertising);
 
-		if ((adv1000_reg != new_adv1000_reg) ||
-			(adv_reg != new_adv_reg) ||
+		if ((adv1000_reg != new_adv1000) ||
+			(adv_reg != new_adv) ||
 			((bmcr & BMCR_ANENABLE) == 0)) {
 
-			bnx2_write_phy(bp, bp->mii_adv, new_adv_reg);
-			bnx2_write_phy(bp, MII_CTRL1000, new_adv1000_reg);
+			bnx2_write_phy(bp, bp->mii_adv, new_adv);
+			bnx2_write_phy(bp, MII_CTRL1000, new_adv1000);
 			bnx2_write_phy(bp, bp->mii_bmcr, BMCR_ANRESTART |
 				BMCR_ANENABLE);
 		}
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 024ca1d..47c0e3a 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -3594,7 +3594,7 @@ static int tg3_phy_autoneg_cfg(struct tg3 *tp, u32 advertise, u32 flowctrl)
 	u32 val, new_adv;
 
 	new_adv = ADVERTISE_CSMA;
-	new_adv |= ethtool_adv_to_mii_100bt(advertise);
+	new_adv |= ethtool_adv_to_mii_adv_t(advertise);
 	new_adv |= tg3_advert_flowctrl_1000T(flowctrl);
 
 	err = tg3_writephy(tp, MII_ADVERTISE, new_adv);
@@ -3604,7 +3604,7 @@ static int tg3_phy_autoneg_cfg(struct tg3 *tp, u32 advertise, u32 flowctrl)
 	if (tp->phy_flags & TG3_PHYFLG_10_100_ONLY)
 		goto done;
 
-	new_adv = ethtool_adv_to_mii_1000T(advertise);
+	new_adv = ethtool_adv_to_mii_ctrl1000_t(advertise);
 
 	if (tp->pci_chip_rev_id == CHIPREV_ID_5701_A0 ||
 	    tp->pci_chip_rev_id == CHIPREV_ID_5701_B0)
@@ -3778,7 +3778,7 @@ static int tg3_copper_is_advertising_all(struct tg3 *tp, u32 mask)
 {
 	u32 adv_reg, all_mask = 0;
 
-	all_mask = ethtool_adv_to_mii_100bt(mask);
+	all_mask = ethtool_adv_to_mii_adv_t(mask);
 
 	if (tg3_readphy(tp, MII_ADVERTISE, &adv_reg))
 		return 0;
@@ -3789,7 +3789,7 @@ static int tg3_copper_is_advertising_all(struct tg3 *tp, u32 mask)
 	if (!(tp->phy_flags & TG3_PHYFLG_10_100_ONLY)) {
 		u32 tg3_ctrl;
 
-		all_mask = ethtool_adv_to_mii_1000T(mask);
+		all_mask = ethtool_adv_to_mii_ctrl1000_t(mask);
 
 		if (tg3_readphy(tp, MII_CTRL1000, &tg3_ctrl))
 			return 0;
@@ -4889,7 +4889,7 @@ static int tg3_setup_fiber_mii_phy(struct tg3 *tp, int force_reset)
 				 ADVERTISE_SLCT);
 
 		newadv |= tg3_advert_flowctrl_1000X(tp->link_config.flowctrl);
-		newadv |= ethtool_adv_to_mii_1000X(tp->link_config.advertising);
+		newadv |= ethtool_adv_to_mii_adv_x(tp->link_config.advertising);
 
 		if ((newadv != adv) || !(bmcr & BMCR_ANENABLE)) {
 			tg3_writephy(tp, MII_ADVERTISE, newadv);
diff --git a/drivers/net/ethernet/sun/niu.c b/drivers/net/ethernet/sun/niu.c
index 9997be5..680b107 100644
--- a/drivers/net/ethernet/sun/niu.c
+++ b/drivers/net/ethernet/sun/niu.c
@@ -1151,8 +1151,8 @@ static int link_status_mii(struct niu *np, int *link_up_p)
 		supported |= SUPPORTED_1000baseT_Full;
 	lp->supported = supported;
 
-	advertising = mii_adv_to_ethtool_100bt(advert);
-	advertising |= mii_adv_to_ethtool_1000T(ctrl1000);
+	advertising = mii_adv_to_ethtool_adv_t(advert);
+	advertising |= mii_ctrl1000_to_ethtool_adv_t(ctrl1000);
 
 	if (bmcr & BMCR_ANENABLE) {
 		int neg, neg1000;
diff --git a/drivers/net/mii.c b/drivers/net/mii.c
index d0a2962..c70c233 100644
--- a/drivers/net/mii.c
+++ b/drivers/net/mii.c
@@ -35,14 +35,11 @@
 
 static u32 mii_get_an(struct mii_if_info *mii, u16 addr)
 {
-	u32 result = 0;
 	int advert;
 
 	advert = mii->mdio_read(mii->dev, mii->phy_id, addr);
-	if (advert & LPA_LPACK)
-		result |= ADVERTISED_Autoneg;
 
-	return result | mii_adv_to_ethtool_100bt(advert);
+	return mii_lpa_to_ethtool_lpa_t(advert);
 }
 
 /**
@@ -93,12 +90,13 @@ int mii_ethtool_gset(struct mii_if_info *mii, struct ethtool_cmd *ecmd)
 
 		ecmd->advertising |= mii_get_an(mii, MII_ADVERTISE);
 		if (mii->supports_gmii)
-			ecmd->advertising |= mii_adv_to_ethtool_1000T(ctrl1000);
+			ecmd->advertising |=
+					mii_ctrl1000_to_ethtool_adv_t(ctrl1000);
 
 		if (bmsr & BMSR_ANEGCOMPLETE) {
 			ecmd->lp_advertising = mii_get_an(mii, MII_LPA);
 			ecmd->lp_advertising |=
-					     mii_lpa_to_ethtool_1000T(stat1000);
+					mii_stat1000_to_ethtool_lpa_t(stat1000);
 		} else {
 			ecmd->lp_advertising = 0;
 		}
@@ -186,10 +184,11 @@ int mii_ethtool_sset(struct mii_if_info *mii, struct ethtool_cmd *ecmd)
 			advert2 = mii->mdio_read(dev, mii->phy_id, MII_CTRL1000);
 			tmp2 = advert2 & ~(ADVERTISE_1000HALF | ADVERTISE_1000FULL);
 		}
-		tmp |= ethtool_adv_to_mii_100bt(ecmd->advertising);
+		tmp |= ethtool_adv_to_mii_adv_t(ecmd->advertising);
 
 		if (mii->supports_gmii)
-			tmp2 |= ethtool_adv_to_mii_1000T(ecmd->advertising);
+			tmp2 |=
+			      ethtool_adv_to_mii_ctrl1000_t(ecmd->advertising);
 		if (advert != tmp) {
 			mii->mdio_write(dev, mii->phy_id, MII_ADVERTISE, tmp);
 			mii->advertising = tmp;
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index edb905f..f320f46 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -565,7 +565,7 @@ static int genphy_config_advert(struct phy_device *phydev)
 
 	adv &= ~(ADVERTISE_ALL | ADVERTISE_100BASE4 | ADVERTISE_PAUSE_CAP |
 		 ADVERTISE_PAUSE_ASYM);
-	adv |= ethtool_adv_to_mii_100bt(advertise);
+	adv |= ethtool_adv_to_mii_adv_t(advertise);
 
 	if (adv != oldadv) {
 		err = phy_write(phydev, MII_ADVERTISE, adv);
@@ -584,7 +584,7 @@ static int genphy_config_advert(struct phy_device *phydev)
 			return adv;
 
 		adv &= ~(ADVERTISE_1000FULL | ADVERTISE_1000HALF);
-		adv |= ethtool_adv_to_mii_1000T(advertise);
+		adv |= ethtool_adv_to_mii_ctrl1000_t(advertise);
 
 		if (adv != oldadv) {
 			err = phy_write(phydev, MII_CTRL1000, adv);
diff --git a/include/linux/mii.h b/include/linux/mii.h
index 6697b91..2783eca 100644
--- a/include/linux/mii.h
+++ b/include/linux/mii.h
@@ -241,14 +241,14 @@ static inline unsigned int mii_duplex (unsigned int duplex_lock,
 }
 
 /**
- * ethtool_adv_to_mii_100bt
+ * ethtool_adv_to_mii_adv_t
  * @ethadv: the ethtool advertisement settings
  *
  * A small helper function that translates ethtool advertisement
  * settings to phy autonegotiation advertisements for the
  * MII_ADVERTISE register.
  */
-static inline u32 ethtool_adv_to_mii_100bt(u32 ethadv)
+static inline u32 ethtool_adv_to_mii_adv_t(u32 ethadv)
 {
 	u32 result = 0;
 
@@ -269,13 +269,13 @@ static inline u32 ethtool_adv_to_mii_100bt(u32 ethadv)
 }
 
 /**
- * mii_adv_to_ethtool_100bt
+ * mii_adv_to_ethtool_adv_t
  * @adv: value of the MII_ADVERTISE register
  *
  * A small helper function that translates MII_ADVERTISE bits
  * to ethtool advertisement settings.
  */
-static inline u32 mii_adv_to_ethtool_100bt(u32 adv)
+static inline u32 mii_adv_to_ethtool_adv_t(u32 adv)
 {
 	u32 result = 0;
 
@@ -296,14 +296,14 @@ static inline u32 mii_adv_to_ethtool_100bt(u32 adv)
 }
 
 /**
- * ethtool_adv_to_mii_1000T
+ * ethtool_adv_to_mii_ctrl1000_t
  * @ethadv: the ethtool advertisement settings
  *
  * A small helper function that translates ethtool advertisement
  * settings to phy autonegotiation advertisements for the
  * MII_CTRL1000 register when in 1000T mode.
  */
-static inline u32 ethtool_adv_to_mii_1000T(u32 ethadv)
+static inline u32 ethtool_adv_to_mii_ctrl1000_t(u32 ethadv)
 {
 	u32 result = 0;
 
@@ -316,14 +316,14 @@ static inline u32 ethtool_adv_to_mii_1000T(u32 ethadv)
 }
 
 /**
- * mii_adv_to_ethtool_1000T
+ * mii_ctrl1000_to_ethtool_adv_t
  * @adv: value of the MII_CTRL1000 register
  *
  * A small helper function that translates MII_CTRL1000
  * bits, when in 1000Base-T mode, to ethtool
  * advertisement settings.
  */
-static inline u32 mii_adv_to_ethtool_1000T(u32 adv)
+static inline u32 mii_ctrl1000_to_ethtool_adv_t(u32 adv)
 {
 	u32 result = 0;
 
@@ -335,17 +335,33 @@ static inline u32 mii_adv_to_ethtool_1000T(u32 adv)
 	return result;
 }
 
-#define mii_lpa_to_ethtool_100bt(lpa)	mii_adv_to_ethtool_100bt(lpa)
+/**
+ * mii_lpa_to_ethtool_lpa_t
+ * @adv: value of the MII_LPA register
+ *
+ * A small helper function that translates MII_LPA
+ * bits, when in 1000Base-T mode, to ethtool
+ * LP advertisement settings.
+ */
+static inline u32 mii_lpa_to_ethtool_lpa_t(u32 lpa)
+{
+	u32 result = 0;
+
+	if (lpa & LPA_LPACK)
+		result |= ADVERTISED_Autoneg;
+
+	return result | mii_adv_to_ethtool_adv_t(lpa);
+}
 
 /**
- * mii_lpa_to_ethtool_1000T
+ * mii_stat1000_to_ethtool_lpa_t
  * @adv: value of the MII_STAT1000 register
  *
  * A small helper function that translates MII_STAT1000
  * bits, when in 1000Base-T mode, to ethtool
  * advertisement settings.
  */
-static inline u32 mii_lpa_to_ethtool_1000T(u32 lpa)
+static inline u32 mii_stat1000_to_ethtool_lpa_t(u32 lpa)
 {
 	u32 result = 0;
 
@@ -358,14 +374,14 @@ static inline u32 mii_lpa_to_ethtool_1000T(u32 lpa)
 }
 
 /**
- * ethtool_adv_to_mii_1000X
+ * ethtool_adv_to_mii_adv_x
  * @ethadv: the ethtool advertisement settings
  *
  * A small helper function that translates ethtool advertisement
  * settings to phy autonegotiation advertisements for the
  * MII_CTRL1000 register when in 1000Base-X mode.
  */
-static inline u32 ethtool_adv_to_mii_1000X(u32 ethadv)
+static inline u32 ethtool_adv_to_mii_adv_x(u32 ethadv)
 {
 	u32 result = 0;
 
@@ -382,14 +398,14 @@ static inline u32 ethtool_adv_to_mii_1000X(u32 ethadv)
 }
 
 /**
- * mii_adv_to_ethtool_1000X
+ * mii_adv_to_ethtool_adv_x
  * @adv: value of the MII_CTRL1000 register
  *
  * A small helper function that translates MII_CTRL1000
  * bits, when in 1000Base-X mode, to ethtool
  * advertisement settings.
  */
-static inline u32 mii_adv_to_ethtool_1000X(u32 adv)
+static inline u32 mii_adv_to_ethtool_adv_x(u32 adv)
 {
 	u32 result = 0;
 
@@ -406,6 +422,24 @@ static inline u32 mii_adv_to_ethtool_1000X(u32 adv)
 }
 
 /**
+ * mii_lpa_to_ethtool_lpa_x
+ * @adv: value of the MII_LPA register
+ *
+ * A small helper function that translates MII_LPA
+ * bits, when in 1000Base-X mode, to ethtool
+ * LP advertisement settings.
+ */
+static inline u32 mii_lpa_to_ethtool_lpa_x(u32 lpa)
+{
+	u32 result = 0;
+
+	if (lpa & LPA_LPACK)
+		result |= ADVERTISED_Autoneg;
+
+	return result | mii_adv_to_ethtool_adv_x(lpa);
+}
+
+/**
  * mii_advertise_flowctrl - get flow control advertisement flags
  * @cap: Flow control capabilities (FLOW_CTRL_RX, FLOW_CTRL_TX or both)
  */
-- 
1.7.3.4

^ permalink raw reply related

* Re: [net-next-2.6 PATCH 0/6 v4] macvlan: MAC Address filtering support for passthru mode
From: Greg Rose @ 2011-11-18  0:32 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Roopa Prabhu, netdev@vger.kernel.org, davem@davemloft.net,
	chrisw@redhat.com, sri@us.ibm.com, dragos.tatulea@gmail.com,
	kvm@vger.kernel.org, arnd@arndb.de, mst@redhat.com,
	mchan@broadcom.com, dwang2@cisco.com, shemminger@vyatta.com,
	eric.dumazet@gmail.com, kaber@trash.net, benve@cisco.com
In-Reply-To: <1321575301.2749.51.camel@bwh-desktop>


On 11/17/2011 4:15 PM, Ben Hutchings wrote:
> Sorry to come to this rather late.
>
> On Tue, 2011-11-08 at 23:55 -0800, Roopa Prabhu wrote:
> [...]
>> v2 ->  v3
>> - Moved set and get filter ops from rtnl_link_ops to netdev_ops
>> - Support for SRIOV VFs.
>>          [Note: The get filters msg (in the way current get rtnetlink handles
>>          it) might get too big for SRIOV vfs. This patch follows existing sriov
>>          vf get code and tries to accomodate filters for all VF's in a PF.
>>          And for the SRIOV case I have only tested the fact that the VF
>>          arguments are getting delivered to rtnetlink correctly. The code
>>          follows existing sriov vf handling code so rest of it should work fine]
> [...]
>
> This is already broken for large numbers of VFs, and increasing the
> amount of information per VF is going to make the situation worse.  I am
> no netlink expert but I think that the current approach of bundling all
> information about an interface in a single message may not be
> sustainable.
>
> Also, I'm unclear on why this interface is to be used to set filtering
> for the (PF) net device as well as for related VFs.  Doesn't that
> duplicate the functionality of ndo_set_rx_mode and
> ndo_vlan_rx_{add,kill}_vid?

Functionally yes but contextually no.  This allows the PF driver to know 
that it is setting these filters in the context of the existence of VFs, 
allowing it to take appropriate action.  The other two functions may be 
called without the presence of SR-IOV enablement and the existence of VFs.

Anyway, that's why I asked Roopa to add that capability.

- Greg

>
> Ben.
>

^ permalink raw reply

* Re: [PATCH] bonding: Don't allow mode change via sysfs with slaves present
From: David Miller @ 2011-11-18  0:32 UTC (permalink / raw)
  To: nicolas.2p.debian; +Cc: vfalico, netdev, andy, fubar
In-Reply-To: <4EC58C89.7020906@gmail.com>

From: Nicolas de Pesloüan <nicolas.2p.debian@gmail.com>
Date: Thu, 17 Nov 2011 23:36:57 +0100

> Le 17/11/2011 22:04, David Miller a écrit :
>> From: Veaceslav Falico<vfalico@redhat.com>
>> Date: Tue, 15 Nov 2011 17:44:42 +0100
>>
>>> When changing mode via bonding's sysfs, the slaves are not initialized
>>> correctly. Forbid to change modes with slaves present to ensure that
>>> every
>>> slave is initialized correctly via bond_enslave().
>>>
>>> Signed-off-by: Veaceslav Falico<vfalico@redhat.com>
>>
>> Lots of discussion... what is the final verdict on this patch bonding
>> folks?
> 
> Now looks good to me, and as all others participants agreed at the
> beginning, I think it's good for everyone.

Great, applied, thanks everyone.

^ permalink raw reply

* Re: root_lock vs. device's TX lock
From: Tom Herbert @ 2011-11-18  0:35 UTC (permalink / raw)
  To: Andy Fleming; +Cc: Dave Taht, Eric Dumazet, Linux Netdev List
In-Reply-To: <CAKWjMd7xvepSb2sMf4X4U4cjRYizq=tP2TTHhYaozuyBeSW4AQ@mail.gmail.com>

> Actually, I'm interested in circumventing *both* locks. Our SoC has
> some quite-versatile queueing infrastructure, such that (for many
> queueing setups) we can do all of the queueing in hardware, using
> per-cpu access portals. By hacking around the qdisc lock, and using a
> tx queue per core, we were able to achieve a significant speedup.
>
This was actually one of the motivations for my question.  If we have
a one TX queue per core, and and use a trivial mq aware qdisc for
instance, the locking becomes mostly overhead.  I don't mind taking a
lock once per TX, but right now were taking three! (root lock twice,
and device lock once).

Even without one queue per TX, I think the overhead savings may still
be present.  Eric, I realize that a point of dropping the root lock in
sch_direct_xmit is to possibly allow queuing to to qdisc and device
xmit in parallel, but if you're using a trivial qdisc then the time in
qdisc may be << time in device xmit, so the overhead of locking could
mitigate the gains in parallelism.  At the very least, this benefit is
hugely variable depending on the qdisc used.

Tom

^ permalink raw reply

* [PATCH net-next] sky2: fix hang in napi_disable
From: Stephen Hemminger @ 2011-11-18  0:37 UTC (permalink / raw)
  To: Sven Joachim; +Cc: davem, netdev
In-Reply-To: <87r517w2z0.fsf@turtle.gmx.de>

If IRQ was never initialized, then calling napi_disable() would hang.
Add more bookkeeping to track whether IRQ was ever initialized.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

--- a/drivers/net/ethernet/marvell/sky2.c	2011-11-17 11:10:35.938076778 -0800
+++ b/drivers/net/ethernet/marvell/sky2.c	2011-11-17 11:15:49.527882932 -0800
@@ -1723,6 +1723,8 @@ static int sky2_setup_irq(struct sky2_hw
 	if (err)
 		dev_err(&pdev->dev, "cannot assign irq %d\n", pdev->irq);
 	else {
+		hw->flags |= SKY2_HW_IRQ_SETUP;
+
 		napi_enable(&hw->napi);
 		sky2_write32(hw, B0_IMSK, Y2_IS_BASE);
 		sky2_read32(hw, B0_IMSK);
@@ -2120,6 +2122,7 @@ static int sky2_close(struct net_device
 
 		napi_disable(&hw->napi);
 		free_irq(hw->pdev->irq, hw);
+		hw->flags &= ~SKY2_HW_IRQ_SETUP;
 	} else {
 		u32 imask;
 
@@ -3423,12 +3426,13 @@ static void sky2_all_down(struct sky2_hw
 {
 	int i;
 
-	sky2_read32(hw, B0_IMSK);
-	sky2_write32(hw, B0_IMSK, 0);
+	if (hw->flags & SKY2_HW_IRQ_SETUP) {
+		sky2_read32(hw, B0_IMSK);
+		sky2_write32(hw, B0_IMSK, 0);
 
-	if (hw->ports > 1 || netif_running(hw->dev[0]))
 		synchronize_irq(hw->pdev->irq);
-	napi_disable(&hw->napi);
+		napi_disable(&hw->napi);
+	}
 
 	for (i = 0; i < hw->ports; i++) {
 		struct net_device *dev = hw->dev[i];
@@ -3445,7 +3449,7 @@ static void sky2_all_down(struct sky2_hw
 
 static void sky2_all_up(struct sky2_hw *hw)
 {
-	u32 imask = 0;
+	u32 imask = Y2_IS_BASE;
 	int i;
 
 	for (i = 0; i < hw->ports; i++) {
@@ -3461,8 +3465,7 @@ static void sky2_all_up(struct sky2_hw *
 		netif_wake_queue(dev);
 	}
 
-	if (imask || hw->ports > 1) {
-		imask |= Y2_IS_BASE;
+	if (hw->flags & SKY2_HW_IRQ_SETUP) {
 		sky2_write32(hw, B0_IMSK, imask);
 		sky2_read32(hw, B0_IMSK);
 		sky2_read32(hw, B0_Y2_SP_LISR);
--- a/drivers/net/ethernet/marvell/sky2.h	2011-11-16 15:15:40.964280527 -0800
+++ b/drivers/net/ethernet/marvell/sky2.h	2011-11-17 11:13:00.154858718 -0800
@@ -2287,6 +2287,7 @@ struct sky2_hw {
 #define SKY2_HW_RSS_BROKEN	0x00000100
 #define SKY2_HW_VLAN_BROKEN     0x00000200
 #define SKY2_HW_RSS_CHKSUM	0x00000400	/* RSS requires chksum */
+#define SKY2_HW_IRQ_SETUP	0x00000800
 
 	u8	     	     chip_id;
 	u8		     chip_rev;

^ permalink raw reply

* [PATCH net-next] sky2: enforce minimum ring size
From: Stephen Hemminger @ 2011-11-18  0:37 UTC (permalink / raw)
  To: Sven Joachim; +Cc: davem, netdev
In-Reply-To: <874ny2bgqa.fsf@turtle.gmx.de>

The hardware has a restriction that the minimum ring size possible
is 128. The number of elements used is controlled by tx_pending and
the overall number of elements in the ring tx_ring_size, therefore it
is okay to limit the number of elements in use to a small value (63)
but still provide a bigger ring.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

--- a/drivers/net/ethernet/marvell/sky2.c	2011-11-17 16:25:40.079898621 -0800
+++ b/drivers/net/ethernet/marvell/sky2.c	2011-11-17 16:33:19.088438709 -0800
@@ -4092,6 +4092,16 @@ static int sky2_set_coalesce(struct net_
 	return 0;
 }
 
+/*
+ * Hardware is limited to min of 128 and max of 2048 for ring size
+ * and  rounded up to next power of two
+ * to avoid division in modulus calclation
+ */
+static unsigned long roundup_ring_size(unsigned long pending)
+{
+	return max(128ul, roundup_pow_of_two(pending+1));
+}
+
 static void sky2_get_ringparam(struct net_device *dev,
 			       struct ethtool_ringparam *ering)
 {
@@ -4119,7 +4129,7 @@ static int sky2_set_ringparam(struct net
 
 	sky2->rx_pending = ering->rx_pending;
 	sky2->tx_pending = ering->tx_pending;
-	sky2->tx_ring_size = roundup_pow_of_two(sky2->tx_pending+1);
+	sky2->tx_ring_size = roundup_ring_size(sky2->tx_pending);
 
 	return sky2_reattach(dev);
 }
@@ -4714,7 +4724,7 @@ static __devinit struct net_device *sky2
 	spin_lock_init(&sky2->phy_lock);
 
 	sky2->tx_pending = TX_DEF_PENDING;
-	sky2->tx_ring_size = roundup_pow_of_two(TX_DEF_PENDING+1);
+	sky2->tx_ring_size = roundup_ring_size(TX_DEF_PENDING);
 	sky2->rx_pending = RX_DEF_PENDING;
 
 	hw->dev[port] = dev;

^ permalink raw reply

* Re: [net-next] iproute2: Add new command to IP link to enable/disable VF spoof check
From: Stephen Hemminger @ 2011-11-18  0:40 UTC (permalink / raw)
  To: Greg Rose; +Cc: Jeff Kirsher, davem, netdev, gospo
In-Reply-To: <4EC59B34.4080501@intel.com>

On Thu, 17 Nov 2011 15:39:32 -0800
Greg Rose <gregory.v.rose@intel.com> wrote:

> 
> 
> On 9/25/2011 1:23 AM, Jeff Kirsher wrote:
> > From: Greg Rose <gregory.v.rose@intel.com>
> >
> > Add IP link command parsing for VF spoof checking enable/disable
> >
> > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> > Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

I am limiting patches at this point to those that will build against 3.0.
The 3.1 version of iproute2 has not been released because kernel.org tool
to do the release is still in beta (kup), and they are working on it.

After 3.1 version of iproute2 is released, any fixes that require stuff
from 3.2 will go in.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox