* Re: [PATCH net-next] net: struct sock cleanups
From: David Miller @ 2012-06-25 23:09 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev
In-Reply-To: <1340605369.10893.3.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 25 Jun 2012 08:22:49 +0200
> From: Eric Dumazet <edumazet@google.com>
>
> Add missing kernel doc for sk_rx_dst
>
> Move sk_rx_dst to avoid two 32bit holes on 64bit arches
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Applied, thanks Eric.
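For readers unfamiliar with the "32bit holes" mentioned in the changelog, here is a minimal sketch of the effect. The two structs below are hypothetical and are not the real struct sock layout; they only show how placing a pointer between two 32-bit members leaves padding on typical 64-bit ABIs, and how grouping the 32-bit members removes it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with holes (NOT the real struct sock). */
struct holes_layout {
	uint32_t a;	/* 4 bytes, followed by a 4-byte hole on LP64 */
	void	*ptr;	/* pointer wants 8-byte alignment on LP64 */
	uint32_t b;	/* 4 bytes, followed by 4 bytes of tail padding on LP64 */
};

/* Same members, reordered so the two 32-bit fields share one 8-byte slot. */
struct packed_layout {
	uint32_t a;
	uint32_t b;
	void	*ptr;
};

/* Bytes saved by the reordering; typically 8 on LP64 and 0 on ILP32. */
static size_t bytes_saved(void)
{
	return sizeof(struct holes_layout) - sizeof(struct packed_layout);
}
```

On the kernel side, a tool such as pahole is the usual way to find holes like these in real structures.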
^ permalink raw reply
* Re: [PATCH net-next] net: Remove 'unlikely' qualifier in skb_steal_sock()
From: David Miller @ 2012-06-25 23:08 UTC (permalink / raw)
To: eric.dumazet; +Cc: subramanian.vijay, netdev, alexander.h.duyck, shemminger
In-Reply-To: <1340604577.23933.23.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 25 Jun 2012 08:09:37 +0200
> On Sun, 2012-06-24 at 16:03 -0700, Vijay Subramanian wrote:
>> With early demux enabled by default for TCP flows, there is high chance that
>> skb->sk will be non-null. 'unlikely()' was removed from __inet_lookup_skb() but
>> maybe it can be removed from skb_steal_sock() as well.
>>
>> Note: skb_steal_sock() is also called by __inet6_lookup_skb() and
>> __udp4_lib_lookup_skb() but they are protected by their own 'unlikely' calls.
>>
>> Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
>> ---
>
> Acked-by: Eric Dumazet <edumazet@google.com>
Applied, thanks.
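As background for the change discussed above, a simplified sketch of what the 'unlikely' qualifier does. The macro has the same shape as the kernel's branch hint (a GCC/Clang builtin); the struct and function names are stand-ins for illustration and are not the kernel's definitions. The hint only affects code layout, so both variants return the same result.

```c
#include <stddef.h>

/* Branch hint in the same shape as the kernel macro (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-ins, much smaller than the real kernel structures. */
struct sock { int id; };
struct sk_buff { struct sock *sk; };

/* Before: the compiler is told that skb->sk is rarely set. */
static struct sock *steal_sock_with_hint(struct sk_buff *skb)
{
	if (unlikely(skb->sk)) {
		struct sock *sk = skb->sk;

		skb->sk = NULL;
		return sk;
	}
	return NULL;
}

/* After: no hint, which suits the case where early demux often sets skb->sk. */
static struct sock *steal_sock_no_hint(struct sk_buff *skb)
{
	if (skb->sk) {
		struct sock *sk = skb->sk;

		skb->sk = NULL;
		return sk;
	}
	return NULL;
}
```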
^ permalink raw reply
* Re: [net 3/3] caif-hsi: Add missing return in error path
From: David Miller @ 2012-06-25 23:07 UTC (permalink / raw)
To: sjur.brandeland; +Cc: netdev, sjurbren
In-Reply-To: <1340571698-17892-3-git-send-email-sjur.brandeland@stericsson.com>
From: sjur.brandeland@stericsson.com
Date: Sun, 24 Jun 2012 23:01:38 +0200
> From: Sjur Brændeland <sjur.brandeland@stericsson.com>
>
> Fix a missing return, causing access to freed memory.
>
> Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Applied.
^ permalink raw reply
* Re: [net 2/3] caif-hsi: Bugfix - Piggyback'ed embedded CAIF frame lost
From: David Miller @ 2012-06-25 23:07 UTC (permalink / raw)
To: sjur.brandeland; +Cc: netdev, sjurbren, Per.Ellefsen
In-Reply-To: <1340571698-17892-2-git-send-email-sjur.brandeland@stericsson.com>
From: sjur.brandeland@stericsson.com
Date: Sun, 24 Jun 2012 23:01:37 +0200
> From: Per Ellefsen <Per.Ellefsen@stericsson.com>
>
> When receiving a piggyback'ed descriptor containing an
> embedded frame, but no payload, the embedded frame was
> lost.
>
> Signed-off-by: Per Ellefsen <per.ellefsen@stericsson.com>
> Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Applied.
^ permalink raw reply
* Re: [net 1/3] caif: Clear shutdown mask to zero at reconnect.
From: David Miller @ 2012-06-25 23:07 UTC (permalink / raw)
To: sjur.brandeland; +Cc: netdev, sjurbren
In-Reply-To: <1340571698-17892-1-git-send-email-sjur.brandeland@stericsson.com>
From: sjur.brandeland@stericsson.com
Date: Sun, 24 Jun 2012 23:01:36 +0200
> From: Sjur Brændeland <sjur.brandeland@stericsson.com>
>
> Clear caif sockets's shutdown mask at (re)connect.
>
> Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Applied.
^ permalink raw reply
* Re: [PATCH 5/5] tcp: plug dst leak in tcp_v6_conn_request()
From: David Miller @ 2012-06-25 23:06 UTC (permalink / raw)
To: ncardwell; +Cc: eric.dumazet, netdev, edumazet, therbert
In-Reply-To: <CADVnQykuAQcj_i5cZQphKnnoHaCPMd6xLBmq1ZSAeZs83z3tfw@mail.gmail.com>
From: Neal Cardwell <ncardwell@google.com>
Date: Sun, 24 Jun 2012 13:12:33 -0400
> Yes, the patches in this series were generated as patches against the
> "net" tree (sorry for not indicating that).
>
> The dst leak on the v6 sysctl_tw_recycle code path (patches 2-5) seems
> like a pretty low priority, so I think we could simplify your plan
> even a little further... How about this as a plan: we could apply the
> first patch in the series (tcp: heed result of
> security_inet_conn_request() in tcp_v6_conn_request()) to the net tree
> now, and skip patches 2-5 for now. Once your pending synack work is in
> net-next, I can respin patches 2-5 for net-next. How does that sound?
I've applied the first patch to 'net' and you can simply respin
your patches against net-next right now since I rejected Eric's
SYN-ACK patches.
^ permalink raw reply
* Re: [PATCH 1/5] tcp: heed result of security_inet_conn_request() in tcp_v6_conn_request()
From: David Miller @ 2012-06-25 23:05 UTC (permalink / raw)
To: eric.dumazet; +Cc: ncardwell, netdev, edumazet, therbert
In-Reply-To: <1340523417.23933.4.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sun, 24 Jun 2012 09:36:57 +0200
> On Sun, 2012-06-24 at 01:22 -0400, Neal Cardwell wrote:
>> If security_inet_conn_request() returns non-zero then TCP/IPv6 should
>> drop the request, just as in TCP/IPv4 and DCCP in both IPv4 and IPv6.
>>
>> Signed-off-by: Neal Cardwell <ncardwell@google.com>
>> ---
>> net/ipv6/tcp_ipv6.c | 3 ++-
>> 1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
>> index 3a9aec2..9df64a5 100644
>> --- a/net/ipv6/tcp_ipv6.c
>> +++ b/net/ipv6/tcp_ipv6.c
>> @@ -1212,7 +1212,8 @@ have_isn:
>> tcp_rsk(req)->snt_isn = isn;
>> tcp_rsk(req)->snt_synack = tcp_time_stamp;
>>
>> - security_inet_conn_request(sk, skb, req);
>> + if (security_inet_conn_request(sk, skb, req))
>> + goto drop_and_release;
>>
>> if (tcp_v6_send_synack(sk, req,
>> (struct request_values *)&tmp_ext,
>
> Acked-by: Eric Dumazet <edumazet@google.com>
Applied to 'net'.
^ permalink raw reply
* Re: linux-next: manual merge of the net-next tree with the net tree
From: David Miller @ 2012-06-25 23:04 UTC (permalink / raw)
To: sfr; +Cc: netdev, linux-next, linux-kernel, ordex, sven
In-Reply-To: <20120625133812.aad5804dca97e504650b7dd1@canb.auug.org.au>
From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Mon, 25 Jun 2012 13:38:12 +1000
> Today's linux-next merge of the net-next tree got a conflict in
> net/batman-adv/translation-table.c between commit 8b8e4bc0391f
> ("batman-adv: fix race condition in TT full-table replacement") from the
> net tree and commit 7d211efc5087 ("batman-adv: Prefix originator
> non-static functions with batadv_") from the net-next tree.
>
> Just context changes. I fixed it up (see below) and can carry the fix as
> necessary.
I've also resolved this during today's net --> net-next merge.
^ permalink raw reply
* Re: linux-next: manual merge of the net-next tree with the net tree
From: David Miller @ 2012-06-25 23:04 UTC (permalink / raw)
To: sfr; +Cc: netdev, linux-next, linux-kernel, bjorn
In-Reply-To: <20120625133318.a57079c86bad4360828efac0@canb.auug.org.au>
From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Mon, 25 Jun 2012 13:33:18 +1000
> Today's linux-next merge of the net-next tree got a conflict in
> drivers/net/usb/qmi_wwan.c between commit b9f90eb27402 ("net: qmi_wwan:
> fix Gobi device probing") from the net tree and various commits from the
> net-next tree.
>
> I am not sure how to fix this, but the comments in the net tree commit
> implied that it would be placed in the 3.6 code, so I just used the
> version of this file from the net-next tree.
I've resolved this during today's net --> net-next merge.
^ permalink raw reply
* Re: [PATCH v2 net-next] tcp: avoid tx starvation by SYNACK packets
From: David Miller @ 2012-06-25 22:43 UTC (permalink / raw)
To: eric.dumazet
Cc: subramanian.vijay, dave.taht, hans.schillstrom, netdev, ncardwell,
therbert, brouer
In-Reply-To: <1340440962.17495.39.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 23 Jun 2012 10:42:42 +0200
> From: Eric Dumazet <edumazet@google.com>
>
> On Sat, 2012-06-23 at 00:34 -0700, Vijay Subramanian wrote:
>
>> This patch ([PATCH net-next] tcp: avoid tx starvation by SYNACK
>> packets) is neither in net/net-next trees nor on patchwork. Maybe it
>> was missed since it was sent during the merge window. Is this not
>> needed anymore or is it being tested currently?
>
> You're right, thanks for the reminder !
>
> [PATCH v2 net-next] tcp: avoid tx starvation by SYNACK packets
I don't agree with this change.
What is the point in having real classification configuration if
arbitrary places in the network stack are going to override SKB
priority with a fixed priority setting?
I bet the person who set listening socket priority really meant it and
does not expect you to override it.
^ permalink raw reply
* Re: Bug in net/ipv6/ip6_fib.c:fib6_dump_table()
From: David Miller @ 2012-06-25 22:40 UTC (permalink / raw)
To: eric.dumazet
Cc: johunt, kaber, dbavatar, netdev, yoshfuji, jmorris, pekkas,
kuznet, linux-kernel, greearb
In-Reply-To: <1340429851.4604.11942.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 23 Jun 2012 07:37:31 +0200
> [PATCH] ipv6: fib: fix fib dump restart
>
> Commit 2bec5a369ee79576a3 (ipv6: fib: fix crash when changing large fib
> while dumping it) introduced ability to restart the dump at tree root,
> but failed to skip correctly a count of already dumped entries. Code
> didn't match Patrick intent.
>
> We must skip exactly the number of already dumped entries.
>
> Note that like other /proc/net files or netlink producers, we could
> still dump some duplicates entries.
>
> Reported-by: Debabrata Banerjee <dbavatar@gmail.com>
> Reported-by: Josh Hunt <johunt@akamai.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
I've applied this.
But I wonder if it does the right thing, to be honest.
When tree change is detected, w->skip is set to w->count.
But with your change, w->count won't be the number of entries to
skip from the root after the first time we handle a tree change.
So on the second tree change, we'll skip the wrong number of
entries, since the w->count we save into w->skip will be biased
by the previous w->skip value. So we'll skip too few entries.
^ permalink raw reply
* Re: [PATCH 1/4] netdev/phy: Handle IEEE802.3 clause 45 Ethernet PHYs
From: David Miller @ 2012-06-25 22:34 UTC (permalink / raw)
To: ddaney.cavm
Cc: grant.likely, rob.herring, devicetree-discuss, netdev,
linux-kernel, linux-mips, afleming, david.daney
In-Reply-To: <1340411056-18988-2-git-send-email-ddaney.cavm@gmail.com>
From: David Daney <ddaney.cavm@gmail.com>
Date: Fri, 22 Jun 2012 17:24:13 -0700
> From: David Daney <david.daney@cavium.com>
>
> The IEEE802.3 clause 45 MDIO bus protocol allows for directly
> addressing PHY registers using a 21 bit address, and is used by many
> 10G Ethernet PHYS. Already existing is the ability of MDIO bus
> drivers to use clause 45, with the MII_ADDR_C45 flag. Here we add
> struct phy_c45_device_ids to hold the device identifier registers
> present in clause 45. struct phy_device gets a couple of new fields:
> c45_ids to hold the identifiers and is_c45 to signal that it is clause
> 45.
>
> Normally the MII_ADDR_C45 flag is ORed with the register address to
> indicate a clause 45 transaction. Here we also use this flag in the
> *device* address passed to get_phy_device() to indicate that probing
> should be done with clause 45 transactions.
>
> EXPORT phy_device_create() so that the follow-on patch to of_mdio.c
> can use it to create phy devices for PHYs, that have non-standard
> device identifier registers, based on the device tree bindings.
>
> Signed-off-by: David Daney <david.daney@cavium.com>
I see no value in having two ways to say that clause-45 transactions
should be used.
Either make it a PHY device attribute, or specify it in the address
in the register accesses, but not both.
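For reference, a small sketch of the "specify it in the address" form, following the expression quoted from the patch (MII_ADDR_C45 | i << 16 | MII_PHYSID2). The value used for MII_ADDR_C45 here is an assumption for illustration, and c45_regaddr() is a hypothetical helper, not a kernel function. The device address occupies 5 bits and the register number 16 bits, which gives the 21-bit address mentioned in the changelog.

```c
#include <stdint.h>

#define MII_ADDR_C45	(1u << 30)	/* assumed flag value, for illustration */
#define MII_PHYSID2	0x03		/* standard MII PHY identifier register 2 */

/* Hypothetical helper: build a clause 45 register address.
 * devad: 5-bit device (MMD) address, reg: 16-bit register number.
 */
static uint32_t c45_regaddr(unsigned int devad, unsigned int reg)
{
	return MII_ADDR_C45 | ((devad & 0x1f) << 16) | (reg & 0xffff);
}
```

With this form, a single flag bit in the register address selects clause 45, so no separate per-device marker is needed to decide how a given access is performed.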
Also your patch is full of coding style errors, I simply couldn't
stomach applying this even if I agreed with the substance of the
changes:
> + i < ARRAY_SIZE(c45_ids->device_ids) &&
> + c45_ids->devices_in_package == 0;
c45_ids on the second line should line up with the initial 'i'
on the first line.
> + c45_ids->devices_in_package = (phy_reg & 0xffff) << 16;
> +
> +
> + reg_addr = MII_ADDR_C45 | i << 16 | 5;
There is no reason in the world to have two empty lines there, it
looks awful.
> + /*
> + * If mostly Fs, there is no device there,
> + * let's get out of here.
> + */
Format comments:
/* Like
* this.
*/
Not.
/*
* Like
* this.
*/
> + c45_ids->device_ids[i] = (phy_reg & 0xffff) << 16;
> +
> +
> + reg_addr = MII_ADDR_C45 | i << 16 | MII_PHYSID2;
Two empty lines. This is extremely irritating, it looks like you
had some kind of debugging code here and then were very lazy about
removing it.
> +/*
> + * Or MII_ADDR_C45 into regnum for read/write on mii_bus to enable the
> + * 21 bit IEEE 802.3ae clause 45 addressing mode used by 10GIGE phy
> + * chips. Also may be ORed into the device address in
> + * get_phy_device().
> + */
Comment formatting.
> +/*
> + * phy_c45_device_ids: 802.3-c45 Device Identifiers
> + *
> + * devices_in_package: Bit vector of devices present.
> + * device_ids: The device identifer for each present device.
> + */
If you're going to list the struct members use the correct kerneldoc
format to do so.
^ permalink raw reply
* Re: [PATCH] [XFRM] Fix unexpected SA hard expiration after changing date
From: David Miller @ 2012-06-25 22:29 UTC (permalink / raw)
To: fdu; +Cc: herbert, netdev
In-Reply-To: <1340259961-30354-2-git-send-email-fdu@windriver.com>
From: Fan Du <fdu@windriver.com>
Date: Thu, 21 Jun 2012 14:26:01 +0800
> After SA is setup, one timer is armed to detect soft/hard expiration,
> however the timer handler uses xtime to do the math. This makes hard
> expiration occurs first before soft expiration after setting new date
> with big interval. As a result new child SA is deleted before rekeying
> the new one.
>
> Signed-off-by: Fan Du <fdu@windriver.com>
> ---
> include/net/xfrm.h | 3 +++
> net/xfrm/xfrm_state.c | 21 +++++++++++++++++----
> 2 files changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> index 2933d74..8d16777 100644
> --- a/include/net/xfrm.h
> +++ b/include/net/xfrm.h
> @@ -197,6 +197,8 @@ struct xfrm_state
>
> struct xfrm_lifetime_cur curlft;
> struct timer_list timer;
> + /* used to fix curlft->add_time when changing date */
> + long saved_tmo;
'saved_tmo' is not properly indented to line up with the
struct member names above it.
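The failure mode described in the quoted changelog can be sketched with plain arithmetic. This is a simplified model, not the xfrm timer handler: the struct, the enum, and check_expiry() are hypothetical names, and the lifetimes are arbitrary. It shows that when the SA age is computed from wall-clock time, a large forward date change makes the hard limit trip before a soft expiration has ever been reported.

```c
/* Hypothetical, simplified SA lifetime record (times in seconds). */
struct sa_lifetime {
	long add_time;	/* wall-clock time when the SA was added */
	long soft_secs;	/* soft lifetime: triggers rekeying */
	long hard_secs;	/* hard lifetime: SA is deleted */
};

enum sa_event { SA_NONE, SA_SOFT_EXPIRE, SA_HARD_EXPIRE };

/* Decide what to report for an SA, given the current wall-clock time. */
static enum sa_event check_expiry(const struct sa_lifetime *lt, long now)
{
	long age = now - lt->add_time;

	if (age >= lt->hard_secs)
		return SA_HARD_EXPIRE;
	if (age >= lt->soft_secs)
		return SA_SOFT_EXPIRE;
	return SA_NONE;
}
```

For example, with add_time = 1000, a soft lifetime of 100 and a hard lifetime of 200, a check at time 1010 reports nothing, but the same check after the date is moved forward by an hour reports a hard expiration directly.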
^ permalink raw reply
* Re: [PATCH] netxen : Error return off by one for XG port.
From: David Miller @ 2012-06-25 22:27 UTC (permalink / raw)
To: rajesh.borundia; +Cc: santoshprasadnayak, sony.chacko, netdev, kernel-janitors
In-Reply-To: <13A253B3F9BEFE43B93C09CF75F63CAA81A886EF1A@MNEXMB1.qlogic.org>
From: Rajesh Borundia <rajesh.borundia@qlogic.com>
Date: Wed, 20 Jun 2012 08:06:04 -0500
> ______________________________________
> From: santosh nayak [santoshprasadnayak@gmail.com]
> Sent: Wednesday, June 20, 2012 4:22 PM
> To: Sony Chacko; Rajesh Borundia
> Cc: netdev; kernel-janitors@vger.kernel.org; Santosh Nayak
> Subject: [PATCH] netxen : Error return off by one for XG port.
This is not the correct way to submit patches written by other
people.
^ permalink raw reply
* Re: [PATCH] netxen : Error return off by one for XG port.
From: David Miller @ 2012-06-25 22:27 UTC (permalink / raw)
To: santoshprasadnayak; +Cc: sony.chacko, rajesh.borundia, netdev, kernel-janitors
In-Reply-To: <1340189578-18308-1-git-send-email-santoshprasadnayak@gmail.com>
From: santosh nayak <santoshprasadnayak@gmail.com>
Date: Wed, 20 Jun 2012 16:22:58 +0530
> From: Santosh Nayak <santoshprasadnayak@gmail.com>
>
> There are NETXEN_NIU_MAX_XG_PORTS ports.
> Port indexing starts from zero.
> Hence we should also return error for 'port == NETXEN_NIU_MAX_XG_PORTS'.
>
> Signed-off-by: Santosh Nayak <santoshprasadnayak@gmail.com>
Applied.
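The off-by-one described in the changelog can be sketched as follows. The constant and the function names are hypothetical stand-ins (the real limit is NETXEN_NIU_MAX_XG_PORTS and its value is not shown in this thread); the point is only that with zero-based indexing the valid ports are 0 .. MAX-1, so the rejection test must use '>=' rather than '>'.

```c
#include <errno.h>

#define MAX_XG_PORTS 2	/* hypothetical stand-in for NETXEN_NIU_MAX_XG_PORTS */

/* Buggy form: port == MAX_XG_PORTS slips through. */
static int check_port_buggy(int port)
{
	if (port < 0 || port > MAX_XG_PORTS)
		return -EINVAL;
	return 0;
}

/* Fixed form: valid ports are 0 .. MAX_XG_PORTS - 1. */
static int check_port_fixed(int port)
{
	if (port < 0 || port >= MAX_XG_PORTS)
		return -EINVAL;
	return 0;
}
```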
^ permalink raw reply
* Re: [PATCH] netxen: Error return off by one in 'netxen_nic_set_pauseparam()'.
From: David Miller @ 2012-06-25 22:27 UTC (permalink / raw)
To: santoshprasadnayak; +Cc: sony.chacko, rajesh.borundia, netdev, kernel-janitors
In-Reply-To: <1340177259-14083-1-git-send-email-santoshprasadnayak@gmail.com>
From: santosh nayak <santoshprasadnayak@gmail.com>
Date: Wed, 20 Jun 2012 12:57:39 +0530
> From: Santosh Nayak <santoshprasadnayak@gmail.com>
>
> There are 'NETXEN_NIU_MAX_GBE_PORTS' GBE ports. Port indexing starts
> from zero.
> Hence we should also return error for "port == NETXEN_NIU_MAX_GBE_PORTS"
>
> Signed-off-by: Santosh Nayak <santoshprasadnayak@gmail.com>
Applied.
^ permalink raw reply
* Re: [PATCH 3/3 net-next] tg3: Add sysfs file to export sensor data
From: David Miller @ 2012-06-25 22:24 UTC (permalink / raw)
To: bhutchings; +Cc: mchan, netdev, nsujir, mcarlson
In-Reply-To: <1340659554.2572.52.camel@bwh-desktop.uk.solarflarecom.com>
From: Ben Hutchings <bhutchings@solarflare.com>
Date: Mon, 25 Jun 2012 22:25:54 +0100
> On Mon, 2012-06-25 at 14:04 -0700, Michael Chan wrote:
>> And please elaborate on the private ioctl.
>
> Every driver gets to handle SIOCDEVPRIVATE .. SIOCDEVPRIVATE+15. But
> avoid the first 3, as userland may blindly try to use them for MDIO.
> David may of course tell you that you should under no circumstances
> actually do this.
Indeed. :-)
^ permalink raw reply
* Re: [PATCH 3/3 net-next] tg3: Add sysfs file to export sensor data
From: Michael Chan @ 2012-06-25 21:50 UTC (permalink / raw)
To: Ben Hutchings; +Cc: David Miller, netdev, nsujir, mcarlson
In-Reply-To: <1340659554.2572.52.camel@bwh-desktop.uk.solarflarecom.com>
On Mon, 2012-06-25 at 22:25 +0100, Ben Hutchings wrote:
> > The rest of the bulk data requires too much parsing in the kernel and
> > will have to be exposed as binary data. What do you mean by binary
> > attribute? A new binary sysfs attribute under hwmon? Or outside of
> > hwmon?
>
> A binary sysfs attribute under your PCI device. (In fact, for wider
> userland compatibility, hwmon sysfs attributes should also be under the
> PCI device rather than the hwmon device. Yes, this *is* a weird
> convention.)
>
> > And please elaborate on the private ioctl.
>
> Every driver gets to handle SIOCDEVPRIVATE .. SIOCDEVPRIVATE+15. But
> avoid the first 3, as userland may blindly try to use them for MDIO.
> David may of course tell you that you should under no circumstances
> actually do this.
Yeah, we won't touch SIOCDEVPRIVATE with a 10-foot pole.
^ permalink raw reply
* Re: [PATCH 3/3 net-next] tg3: Add sysfs file to export sensor data
From: Ben Hutchings @ 2012-06-25 21:25 UTC (permalink / raw)
To: Michael Chan; +Cc: David Miller, netdev, nsujir, mcarlson
In-Reply-To: <1340658258.4344.43.camel@LTIRV-MCHAN1.corp.ad.broadcom.com>
On Mon, 2012-06-25 at 14:04 -0700, Michael Chan wrote:
> On Sat, 2012-06-23 at 16:02 +0100, Ben Hutchings wrote:
> > Temperature and voltage can be exposed through an hwmon device (which
> > practically means you use multiple attributes with conventional names).
> > Other diagnostics might possibly be suitable for ethtool stats,
> > depending on what they are.
>
> I think we can extract some common and more useful attributes such as
> temperature and voltage, and use the standard hwmon attributes to expose
> them.
>
> >
> > If the driver can't easily parse the information (e.g. it varies greatly
> > between the different chips and firmware versions) then a binary
> > attribute or private ioctl might be appropriate. But generic interfaces
> > really should be considered first.
> >
>
> The rest of the bulk data requires too much parsing in the kernel and
> will have to be exposed as binary data. What do you mean by binary
> attribute? A new binary sysfs attribute under hwmon? Or outside of
> hwmon?
A binary sysfs attribute under your PCI device. (In fact, for wider
userland compatibility, hwmon sysfs attributes should also be under the
PCI device rather than the hwmon device. Yes, this *is* a weird
convention.)
> And please elaborate on the private ioctl.
Every driver gets to handle SIOCDEVPRIVATE .. SIOCDEVPRIVATE+15. But
avoid the first 3, as userland may blindly try to use them for MDIO.
David may of course tell you that you should under no circumstances
actually do this.
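To make the ranges above concrete, here is a small sketch. The fallback value matches the definition of SIOCDEVPRIVATE in <linux/sockios.h>; the two helper functions and the count macros are hypothetical names used only for illustration. Note that the thread itself advises against using private ioctls at all.

```c
#include <stdbool.h>

#ifndef SIOCDEVPRIVATE
#define SIOCDEVPRIVATE 0x89F0	/* same value as in <linux/sockios.h> */
#endif

#define PRIV_IOCTL_COUNT	16	/* SIOCDEVPRIVATE .. SIOCDEVPRIVATE+15 */
#define LEGACY_MDIO_COUNT	3	/* "the first 3" that userland may use for MDIO */

/* True if cmd falls in the per-driver private ioctl range. */
static bool is_dev_private(unsigned int cmd)
{
	return cmd >= SIOCDEVPRIVATE &&
	       cmd < SIOCDEVPRIVATE + PRIV_IOCTL_COUNT;
}

/* True if cmd is one of the private numbers that may be probed for MDIO. */
static bool may_collide_with_mdio(unsigned int cmd)
{
	return cmd >= SIOCDEVPRIVATE &&
	       cmd < SIOCDEVPRIVATE + LEGACY_MDIO_COUNT;
}
```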
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [PATCH 3/3 net-next] tg3: Add sysfs file to export sensor data
From: Michael Chan @ 2012-06-25 21:04 UTC (permalink / raw)
To: Ben Hutchings; +Cc: David Miller, netdev, nsujir, mcarlson
In-Reply-To: <1340463729.6345.25.camel@deadeye.wl.decadent.org.uk>
On Sat, 2012-06-23 at 16:02 +0100, Ben Hutchings wrote:
> Temperature and voltage can be exposed through an hwmon device (which
> practically means you use multiple attributes with conventional names).
> Other diagnostics might possibly be suitable for ethtool stats,
> depending on what they are.
I think we can extract some common and more useful attributes such as
temperature and voltage, and use the standard hwmon attributes to expose
them.
>
> If the driver can't easily parse the information (e.g. it varies greatly
> between the different chips and firmware versions) then a binary
> attribute or private ioctl might be appropriate. But generic interfaces
> really should be considered first.
>
The rest of the bulk data requires too much parsing in the kernel and
will have to be exposed as binary data. What do you mean by binary
attribute? A new binary sysfs attribute under hwmon? Or outside of
hwmon? And please elaborate on the private ioctl.
Thanks.
^ permalink raw reply
* Re: [RFC net-next (v2) 12/14] ixgbe: set maximal number of default RSS queues
From: Alexander Duyck @ 2012-06-25 21:03 UTC (permalink / raw)
To: Ben Hutchings
Cc: eilong, Yuval Mintz, davem, netdev, Jeff Kirsher, John Fastabend
In-Reply-To: <1340649847.2572.15.camel@bwh-desktop.uk.solarflarecom.com>
On 06/25/2012 11:44 AM, Ben Hutchings wrote:
> On Mon, 2012-06-25 at 11:38 -0700, Alexander Duyck wrote:
>> On 06/25/2012 10:53 AM, Eilon Greenstein wrote:
>>> On Mon, 2012-06-25 at 08:44 -0700, Alexander Duyck wrote:
>>>> This doesn't limit queues, only interrupt vectors. As I told you before
>>>> you should look at the ixgbe_set_rss_queues function if you actually
>>>> want to limit the number of RSS queues.
>>> How about this:
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>> index af1a531..23a8609 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
>>> @@ -277,6 +277,8 @@ static inline bool ixgbe_set_rss_queues(struct ixgbe_adapter *adapter)
>>> bool ret = false;
>>> struct ixgbe_ring_feature *f = &adapter->ring_feature[RING_F_RSS];
>>>
>>> + f->indices = min_t(int, netif_get_num_default_rss_queues(), f->indices);
>>> +
>>> if (adapter->flags & IXGBE_FLAG_RSS_ENABLED) {
>>> f->mask = 0xF;
>>> adapter->num_rx_queues = f->indices;
>>> @@ -302,7 +304,9 @@ static inline bool ixgbe_set_fdir_queues(struct ixgbe_adapter *adapter)
>>> bool ret = false;
>>> struct ixgbe_ring_feature *f_fdir = &adapter->ring_feature[RING_F_FDIR];
>>>
>>> - f_fdir->indices = min_t(int, num_online_cpus(), f_fdir->indices);
>>> + f_fdir->indices = min_t(int, netif_get_num_default_rss_queues(),
>>> + f_fdir->indices);
>>> +
>>> f_fdir->mask = 0;
>>>
>>> /*
>>> @@ -339,8 +343,7 @@ static inline bool ixgbe_set_fcoe_queues(struct ixgbe_adapter *adapter)
>>> if (!(adapter->flags & IXGBE_FLAG_FCOE_ENABLED))
>>> return false;
>>>
>>> - f->indices = min_t(int, num_online_cpus(), f->indices);
>>> -
>>> + f->indices = min_t(int, f->indices, netif_get_num_default_rss_queues());
>>> adapter->num_rx_queues = 1;
>>> adapter->num_tx_queues = 1;
>>>
>> This makes much more sense, but still needs a few minor changes. The
>> first change I would recommend is to not alter ixgbe_set_fdir_queues
>> since that controls flow director queues, not RSS queues. The second
>> would be to update ixgbe_set_dcb_queues since that does RSS per DCB
>> traffic class and will also need to be updated.
> There is a difference between the stated aim (reduce memory allocated
> for RX buffers) and this implementation (limit number of RSS queues
> only). If we agree on that aim then we should be limiting the total
> number of RX queues.
>
> Ben
The problem is there are certain features that require a certain number
of Tx/Rx queues. In addition, certain features will behave differently
from RSS in terms of how well they scale based on the number of queues.
For example if we enable DCB we require at least one queue per traffic
class. In the case of ATR we should have 1 queue per CPU since the Tx
queue selection is based on the number of CPUs. If we don't have that
1:1 mapping we should be disabling ATR since the feature will not
function correctly in that case.
Thanks,
Alex
^ permalink raw reply
* Re: 3.4-rc: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
From: Francois Romieu @ 2012-06-25 20:25 UTC (permalink / raw)
To: Tomas Papan; +Cc: netdev
In-Reply-To: <CAMGsXDSSCnOF+CykudBoEjTacSYL4LYG_ZnRTcS+Xzt54LW=Og@mail.gmail.com>
Tomas Papan <tomas.papan@gmail.com> :
[...]
> Is there anything I can provide or try? I do not want to be stuck
> on 3.2.x kernel.
You can try to revert 036dafa28da1e2565a8529de2ae663c37b7a0060
--
Ueimor
^ permalink raw reply
* Re: [PATCH RFC 0/8] LLDP implementation for Linux
From: Eldad Zack @ 2012-06-25 20:21 UTC (permalink / raw)
To: John Fastabend; +Cc: netdev
In-Reply-To: <4FE8B535.70900@intel.com>
On Mon, 25 Jun 2012, John Fastabend wrote:
> On 6/25/2012 11:28 AM, Eldad Zack wrote:
> > Hi all,
> >
> > This series of patches provides a partial LLDP (IEEE Std 802.1ab)
> > implementation.
>
> Why do you want to implement this in the kernel? We've been doing this
> in user space for some time without any issues, and it has some noted
> advantages: it reduces kernel bloat, is easier to extend, etc.
You have a point there. I don't have any real reason, I just wrote it for fun.
> What are we missing? Can we address any deficiencies here rather than
> embedding this into the kernel?
I will take a look, though it seems too feature complete :)
> No, netlink is much nicer for these types of things. User space
> can register for events and stay in sync, and all the other
> relevant events (UP/DOWN/DORMANT, for example) already use
> netlink.
I had a hunch that it might be the case. Thanks!
> >
> > Thanks in advance if you're taking the time to review it or test it!
> >
>
> I'll scan the patches shortly.
Thanks for that (if you'd still like to) and all your comments, much
appreciated!
Eldad
^ permalink raw reply
* Re: [PATCH RFC 0/8] LLDP implementation for Linux
From: Eldad Zack @ 2012-06-25 20:05 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20120625115416.53655264@nehalam.linuxnetplumber.net>
On Mon, 25 Jun 2012, Stephen Hemminger wrote:
> Since LLDP is a control protocol what is wrong with the existing
> implementation in userspace?
> https://github.com/vincentbernat/lldpd/wiki
Well, I don't have a business case here, since I started it in my free
time out of a mix of interest and curiosity.
In any event, thanks for the feedback!
Eldad
^ permalink raw reply
* Re: [PATCH RFC 0/8] LLDP implementation for Linux
From: John Fastabend @ 2012-06-25 19:00 UTC (permalink / raw)
To: Eldad Zack; +Cc: netdev
In-Reply-To: <1340648900-6547-1-git-send-email-eldad@fogrefinery.com>
On 6/25/2012 11:28 AM, Eldad Zack wrote:
> Hi all,
>
> This series of patches provides a partial LLDP (IEEE Std 802.1ab)
> implementation.
>
> I'd really appreciate a review of it.
>
> LLDP is a simple discovery protocol which advertises the identification and
> other info (such as MTU or capabilities) of device on the link. It can also
> help debug misconfigurations on the link layer (wrong MTU, wrong VLAN).
>
Why do you want to implement this in the kernel? We've been doing this
in user space for some time without any issues, and it has some noted
advantages: it reduces kernel bloat, is easier to extend, etc.
Take a look at
http://www.open-lldp.org/open-lldp
What are we missing? Can we address any deficiencies here rather than
embedding this into the kernel?
> Some notes/questions:
>
> * Applies against net-next and mainline.
>
> * Included in this series is only LLDP output code.
> This is not an issue since input and output are decoupled in LLDP anyway.
> I'm working on the input code as well and will post it at some point in the
> future.
>
> * Sysctl is used to do (some) configuration. This is done globally right now.
> Before I add per-device sysctls: is it at all appropriate to use sysctl
> here?
No, netlink is much nicer for these types of things. User space
can register for events and stay in sync, and all the other
relevant events (UP/DOWN/DORMANT, for example) already use
netlink.
>
> * By default, transmission is suppressed. To arm it, set
> /proc/sys/net/lldp/lldp_op_mode
> to 1.
>
> * I've tested it on x86_64 and (qemu'd) x86.
>
> * I've tested it on my machine and it works with Ethernet and WLAN.
>
> * The last patch ("8021q/vlan: process NETDEV_GOING_DOWN") is needed
> to be able to send shutdown PDU on VLAN interfaces, but has otherwise
> no effect.
>
Do this in user space and you can handle all these events without
changing existing infrastructure.
> * Is there a better way to deal with 16-bit endianness other than masking and
> shifting, when one field is more than a byte long (in this case 7/9)?
>
> * I usually only work on it on weekends, so I might be slow in responding back.
Please consider user space implementations (you don't have to use
mine) or let us know why we _need_ this in the kernel.
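On the 16-bit endianness question quoted above: the LLDP TLV header is a 7-bit type followed by a 9-bit length, sent in network byte order. One common approach, sketched here with hypothetical helper names, is to compose the 16-bit value in host order and then emit it byte by byte, which works unchanged on both little-endian and big-endian hosts. C bitfields are usually avoided for this because their layout is implementation-defined. This is an illustration under those assumptions, not code from the patch series.

```c
#include <stdint.h>

/* Pack a TLV header: 7-bit type in the top bits, 9-bit length below it. */
static void lldp_tlv_hdr_pack(uint8_t out[2], unsigned int type,
			      unsigned int len)
{
	uint16_t hdr = (uint16_t)(((type & 0x7f) << 9) | (len & 0x1ff));

	out[0] = (uint8_t)(hdr >> 8);	/* most significant byte goes first */
	out[1] = (uint8_t)(hdr & 0xff);
}

/* Unpack a TLV header from two bytes in network byte order. */
static void lldp_tlv_hdr_unpack(const uint8_t in[2], unsigned int *type,
				unsigned int *len)
{
	uint16_t hdr = (uint16_t)((in[0] << 8) | in[1]);

	*type = hdr >> 9;
	*len = hdr & 0x1ff;
}
```

Keeping the shift and mask in one pair of helpers means the rest of the code never has to handle the 7/9 split directly.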
>
> Thanks in advance if you're taking the time to review it or test it!
>
I'll scan the patches shortly.
Thanks,
John
^ permalink raw reply