Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 0/2] net/macb: fix multiqueue support patch up
From: David Miller @ 2014-12-15 16:51 UTC (permalink / raw)
  To: cyrille.pitchen
  Cc: nicolas.ferre, linux-arm-kernel, netdev, soren.brinkmann,
	linux-kernel
In-Reply-To: <cover.1418651689.git.cyrille.pitchen@atmel.com>

From: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Date: Mon, 15 Dec 2014 15:13:30 +0100

> This series of patches is a fixup for the multiqueue support patch.
> 
> The first patch fixes a bug introduced by the multiqueue support patch.
> The second one doesn't fix a bug but simplify the source code by removing
> useless calls to devm_free_irq() since we use managed device resources for
> IRQs.
> 
> They were applied on the net-next tree and tested with a sama5d36ek board.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH] rds: Fix min() warning in rds_message_inc_copy_to_user()
From: David Miller @ 2014-12-15 16:49 UTC (permalink / raw)
  To: geert; +Cc: viro, netdev, linux-kernel
In-Reply-To: <1418646102-32415-1-git-send-email-geert@linux-m68k.org>

From: Geert Uytterhoeven <geert@linux-m68k.org>
Date: Mon, 15 Dec 2014 13:21:42 +0100

> net/rds/message.c: In function ‘rds_message_inc_copy_to_user’:
> net/rds/message.c:328: warning: comparison of distinct pointer types lacks a cast
> 
> Use min_t(unsigned long, ...) like is done in
> rds_message_copy_from_user().
> 
> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>

Also applied, thanks Geert.

^ permalink raw reply

* Re: [PATCH] net: stmmac: sti: Fix uninitialized pointer dereference if !OF
From: David Miller @ 2014-12-15 16:48 UTC (permalink / raw)
  To: geert; +Cc: peppe.cavallaro, netdev, linux-kernel
In-Reply-To: <1418642751-23308-1-git-send-email-geert@linux-m68k.org>

From: Geert Uytterhoeven <geert@linux-m68k.org>
Date: Mon, 15 Dec 2014 12:25:51 +0100

> If CONFIG_OF is not set:
> 
> drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c: In function ‘sti_dwmac_parse_data’:
> drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c:318: warning: ‘rs’ is used uninitialized in this function
> 
> of_property_read_string() will return -ENOSYS in this case, and rs will
> be an uninitialized pointer.
> 
> While the fallback clock selection is already selected correctly in this
> case, the string comparisons should be skipped too, else the system will
> crash while dereferencing the uninitialized pointer.
> 
> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>

Applied.

^ permalink raw reply

* Re: [PATCH net v3] net: smc91x: Fix build without gpiolib
From: David Miller @ 2014-12-15 16:47 UTC (permalink / raw)
  To: tklauser; +Cc: nico, tony, netdev
In-Reply-To: <1418634147-13442-1-git-send-email-tklauser@distanz.ch>

From: Tobias Klauser <tklauser@distanz.ch>
Date: Mon, 15 Dec 2014 10:02:27 +0100

> If GPIOLIB=n the following build errors occur:
> 
> drivers/net/ethernet/smsc/smc91x.c: In function 'try_toggle_control_gpio':
> drivers/net/ethernet/smsc/smc91x.c:2204:2: error: implicit declaration of function 'devm_gpiod_get_index' [-Werror=implicit-function-declaration]
> drivers/net/ethernet/smsc/smc91x.c:2204:7: warning: assignment makes pointer from integer without a cast [enabled by default]
> drivers/net/ethernet/smsc/smc91x.c:2213:2: error: implicit declaration of function 'gpiod_direction_output' [-Werror=implicit-function-declaration]
> drivers/net/ethernet/smsc/smc91x.c:2216:3: error: implicit declaration of function 'devm_gpiod_put' [-Werror=implicit-function-declaration]
> drivers/net/ethernet/smsc/smc91x.c:2222:2: error: implicit declaration of function 'gpiod_set_value_cansleep' [-Werror=implicit-function-declaration]
> 
> Fix this by letting the driver depend on GPIOLIB if OF is selected.
> 
> Fixes: 7d2911c4381 ("net: smc91x: Fix gpios for device tree based booting")
> Cc: Tony Lindgren <tony@atomide.com>
> Signed-off-by: Tobias Klauser <tklauser@distanz.ch>

Applied.

^ permalink raw reply

* Re: [PATCH net, v2] gre: fix the inner mac header in nbma tunnel xmit path
From: David Miller @ 2014-12-15 16:46 UTC (permalink / raw)
  To: timo.teras; +Cc: netdev, therbert, alexander.h.duyck
In-Reply-To: <1418628253-4950-1-git-send-email-timo.teras@iki.fi>

From: Timo Teräs <timo.teras@iki.fi>
Date: Mon, 15 Dec 2014 09:24:13 +0200

> The NBMA GRE tunnels temporarily push GRE header that contain the
> per-packet NBMA destination on the skb via header ops early in xmit
> path. It is the later pulled before the real GRE header is constructed.
> 
> The inner mac was thus set differently in nbma case: the GRE header
> has been pushed by neighbor layer, and mac header points to beginning
> of the temporary gre header (set by dev_queue_xmit).
> 
> Now that the offloads expect mac header to point to the gre payload,
> fix the xmit patch to:
>  - pull first the temporary gre header away
>  - and reset mac header to point to gre payload
> 
> This fixes tso to work again with nbma tunnels.
> 
> Fixes: 14051f0452a2 ("gre: Use inner mac length when computing tunnel length")
> Signed-off-by: Timo Teräs <timo.teras@iki.fi>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH net-next] fib_trie.txt: fix typo
From: David Miller @ 2014-12-15 16:45 UTC (permalink / raw)
  To: duanj.fnst; +Cc: netdev
In-Reply-To: <548E86AD.3090702@cn.fujitsu.com>

From: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Date: Mon, 15 Dec 2014 14:58:53 +0800

> Fix the typo, there should be "It".
> On the other hand, fix whitespace errors detected by checkpatch.pl
> 
> Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>

Applied.

^ permalink raw reply

* Re: [PATCH] cirrus: cs89x0: fix time comparison
From: David Miller @ 2014-12-15 16:44 UTC (permalink / raw)
  To: asaf.vertz; +Cc: ebiederm, julia.lawall, himangi774, netdev, linux-kernel
In-Reply-To: <20141214083418.GA25910@ubuntu>

From: Asaf Vertz <asaf.vertz@tandemg.com>
Date: Sun, 14 Dec 2014 10:34:18 +0200

> To be future-proof and for better readability the time comparisons are
> modified to use time_before, time_after, and time_after_eq instead of
> plain, error-prone math.
> 
> Signed-off-by: Asaf Vertz <asaf.vertz@tandemg.com>

Applied.

^ permalink raw reply

* Re: [RFC PATCH net-next 1/1] net: Support for switch port configuration
From: Roopa Prabhu @ 2014-12-15 16:44 UTC (permalink / raw)
  To: Thomas Graf
  Cc: Varlese, Marco, John Fastabend, Jiri Pirko,
	netdev@vger.kernel.org, stephen@networkplumber.org,
	Fastabend, John R, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <20141215144027.GA21262@casper.infradead.org>

On 12/15/14, 6:40 AM, Thomas Graf wrote:
> On 12/15/14 at 02:29pm, Varlese, Marco wrote:
>>> All of these are highly generic and should *not* be passed through from user
>>> space to the driver directly but rather be properly abstracted as Roopa
>>> proposed. The value of this API is abstraction.
>> How would you let the user enable/disable features then? For instance, how would the user enable/disable flooding for broadcast packets (BFLOODING) on a given port? What I was proposing is to have a list of attributes (to be added in if_link.h) which can be tuned by the user using a tool like iproute2. What do you propose?
> Excellent, I agree with what you are saying. What set me off is that
> the patch does not reflect that yet. Instead, the patch introduces
> a pure Netlink pass-through API to the driver.
>
> I would expect the patch to:
>   1. Parse the Netlink messages and be aware of individual attributes
>   2. Validate them
>   3. Pass the configuration to the driver using an API that can also
>      be consumed from in-kernel users.
yes, exactly.
>
>> I think I have seen Roopa posting his updated ndo patch and getting some feedback by few people already and as long as I will be able to accomplish the use case described here I am happy with his way.
> I think Roopa's patches are supplementary. Not all switchdev users
> will be backed with a Linux Bridge. I therefore welcome your patches
> very much.
>
> The overlap is in the ndo. I think both the API you propose and
> Roopa's bridge code should use the same NDO.
>> I do not have an example right now of a vendor specific attribute but I was just saying that might happen (i.e. someone will have a feature not implemented by others?).
> That's fine. Once we have them we can consider adding vendor specific
> extensions.

^ permalink raw reply

* Re: [RFC PATCH net-next 0/5] tcp: TCP tracer
From: Josef Bacik @ 2014-12-15 16:42 UTC (permalink / raw)
  To: Eric Dumazet, Alexei Starovoitov, Laurent Chavey, Yuchung Cheng
  Cc: Martin KaFai Lau, netdev@vger.kernel.org, David S. Miller,
	Hannes Frederic Sowa, Steven Rostedt, Lawrence Brakmo,
	Kernel Team
In-Reply-To: <1418659395.9773.13.camel@edumazet-glaptop2.roam.corp.google.com>

On 12/15/2014 11:03 AM, Eric Dumazet wrote:
> On Sun, 2014-12-14 at 22:55 -0800, Alexei Starovoitov wrote:
>
>> I think patches 1 and 3 are good additions, since they establish
>> few permanent points of instrumentation in tcp stack.
>> Patches 4-5 look more like use cases of tracepoints established
>> before. They may feel like simple additions and, no doubt,
>> they are useful, but since they expose things via tracing
>> infra they become part of api and cannot be changed later,
>> when more stats would be needed.
>> I think systemtap like scripting on top of patches 1 and 3
>> should solve your use case ?
>> Also, have you looked at recent eBPF work?
>> Though it's not completely ready yet, soon it should
>> be able to do the same stats collection as you have
>> in 4/5 without adding permanent pieces to the kernel.
>
> So it looks like web10g like interfaces are very often requested by
> various teams.
>
> And we have many different views on how to hack this. I am astonished by
> number of hacks I saw about this stuff going on.
>
> What about a clean way, extending current TCP_INFO, which is both
> available as a getsockopt() for socket owners and ss/iproute2
> information for 'external entities'
>
> If we consider web10g info needed, then adding a ftrace/eBPF like
> interface is simply yet another piece of code we need to maintain,
> and the argument of 'this should cost nothing if not activated' is
> nonsense since major players need to constantly monitor TCP metrics and
> behavior.
>
> It seems both FaceBook and Google are working on a subset of web10g.
>
> I suggest we meet together and establish a common ground, preferably
> after Christmas holidays.
>

We've set up something for exactly this case at the end of January but 
have yet to get a response from Google.  If any of the Google people 
cc'ed (or really anybody, its not a strictly FB/Google thing) is 
interested please email me directly and I'll send you the details, we 
will be meeting face to face in the bay area at the end of January.  Thanks,

Josef

^ permalink raw reply

* Re: [PATCH net 0/2] mlx4 driver fixes for 3.19-rc1
From: David Miller @ 2014-12-15 16:36 UTC (permalink / raw)
  To: ogerlitz; +Cc: netdev, matanb, amirv, talal
In-Reply-To: <1418566685-9855-1-git-send-email-ogerlitz@mellanox.com>

From: Or Gerlitz <ogerlitz@mellanox.com>
Date: Sun, 14 Dec 2014 16:18:03 +0200

> Just fixes for two small issues introduced in the 3.19 merge window

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH 0/9 net-next] rhashtable: Per bucket locks & deferred table resizing
From: Thomas Graf @ 2014-12-15 16:27 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, herbert, paulmck, edumazet, john.r.fastabend, josh
In-Reply-To: <20141215.111838.992055066540772910.davem@davemloft.net>

On 12/15/14 at 11:18am, David Miller wrote:
> From: Thomas Graf <tgraf@suug.ch>
> Date: Mon, 15 Dec 2014 13:00:50 +0000
> 
> > Meant for "net-next". Apologies for the missing prefix.
> 
> Which is not open for submission right now, please resubmit when it is
> re-openned.

Will do. I wanted to expose it as soon as it's done as Herbert is
working on rehashing and I wanted him and everyone else to be aware
of this.

^ permalink raw reply

* Re: [RFC PATCH net-next 1/1] net: Support for switch port configuration
From: Roopa Prabhu @ 2014-12-15 16:18 UTC (permalink / raw)
  To: Varlese, Marco
  Cc: Jiri Pirko, John Fastabend, netdev@vger.kernel.org,
	stephen@networkplumber.org, Fastabend, John R, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <C4896FB061E7DE4AAC93031BDCA044B104AC533A@IRSMSX108.ger.corp.intel.com>

On 12/15/14, 1:39 AM, Varlese, Marco wrote:
>> -----Original Message-----
>> From: Roopa Prabhu [mailto:roopa@cumulusnetworks.com]
>> Sent: Saturday, December 13, 2014 7:06 AM
>> To: Varlese, Marco
>> Cc: Jiri Pirko; John Fastabend; netdev@vger.kernel.org;
>> stephen@networkplumber.org; Fastabend, John R; sfeldma@gmail.com;
>> linux-kernel@vger.kernel.org
>> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
>> configuration
>>
>> On 12/12/14, 1:19 AM, Varlese, Marco wrote:
>>>> -----Original Message-----
>>>> From: Roopa Prabhu [mailto:roopa@cumulusnetworks.com]
>>>> Sent: Thursday, December 11, 2014 5:41 PM
>>>> To: Jiri Pirko
>>>> Cc: Varlese, Marco; John Fastabend; netdev@vger.kernel.org;
>>>> stephen@networkplumber.org; Fastabend, John R; sfeldma@gmail.com;
>>>> linux-kernel@vger.kernel.org
>>>> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
>>>> configuration
>>>>
>>>> On 12/11/14, 8:56 AM, Jiri Pirko wrote:
>>>>> Thu, Dec 11, 2014 at 05:37:46PM CET, roopa@cumulusnetworks.com
>> wrote:
>>>>>> On 12/11/14, 3:01 AM, Jiri Pirko wrote:
>>>>>>> Thu, Dec 11, 2014 at 10:59:42AM CET, marco.varlese@intel.com wrote:
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: John Fastabend [mailto:john.fastabend@gmail.com]
>>>>>>>>> Sent: Wednesday, December 10, 2014 5:04 PM
>>>>>>>>> To: Jiri Pirko
>>>>>>>>> Cc: Varlese, Marco; netdev@vger.kernel.org;
>>>>>>>>> stephen@networkplumber.org; Fastabend, John R;
>>>>>>>>> roopa@cumulusnetworks.com; sfeldma@gmail.com; linux-
>>>>>>>>> kernel@vger.kernel.org
>>>>>>>>> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch
>>>>>>>>> port configuration
>>>>>>>>>
>>>>>>>>> On 12/10/2014 08:50 AM, Jiri Pirko wrote:
>>>>>>>>>> Wed, Dec 10, 2014 at 05:23:40PM CET, marco.varlese@intel.com
>>>> wrote:
>>>>>>>>>>> From: Marco Varlese <marco.varlese@intel.com>
>>>>>>>>>>>
>>>>>>>>>>> Switch hardware offers a list of attributes that are
>>>>>>>>>>> configurable on a per port basis.
>>>>>>>>>>> This patch provides a mechanism to configure switch ports by
>>>>>>>>>>> adding an NDO for setting specific values to specific attributes.
>>>>>>>>>>> There will be a separate patch that extends iproute2 to call
>>>>>>>>>>> the new NDO.
>>>>>>>>>> What are these attributes? Can you give some examples. I'm
>>>>>>>>>> asking because there is a plan to pass generic attributes to
>>>>>>>>>> switch ports replacing current specific
>>>>>>>>>> ndo_switch_port_stp_update. In this case, bridge is setting that
>> attribute.
>>>>>>>>>> Is there need to set something directly from userspace or does
>>>>>>>>>> it make rather sense to use involved bridge/ovs/bond ? I think
>>>>>>>>>> that both will be needed.
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>> I think for many attributes it would be best to have both. The
>>>>>>>>> in kernel callers and netlink userspace can use the same driver
>> ndo_ops.
>>>>>>>>> But then we don't _require_ any specific bridge/ovs/etc module.
>>>>>>>>> And we may have some attributes that are not specific to any
>>>>>>>>> existing software module. I'm guessing Marco has some examples
>>>>>>>>> of
>>>> these.
>>>>>>>>> [...]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> John Fastabend         Intel Corporation
>>>>>>>> We do have a need to configure the attributes directly from
>>>>>>>> user-space
>>>> and I have identified the tool to do that in iproute2.
>>>>>>>> An example of attributes are:
>>>>>>>> * enabling/disabling of learning of source addresses on a given
>>>>>>>> port (you can imagine the attribute called LEARNING for example);
>>>>>>>> * internal loopback control (i.e. LOOPBACK) which will control
>>>>>>>> how the flow of traffic behaves from the switch fabric towards an
>>>>>>>> egress port;
>>>>>>>> * flooding for broadcast/multicast/unicast type of packets (i.e.
>>>>>>>> BFLOODING, MFLOODING, UFLOODING);
>>>>>>>>
>>>>>>>> Some attributes would be of the type enabled/disabled while other
>>>>>>>> will
>>>> allow specific values to allow the user to configure different
>>>> behaviours of that feature on that particular port on that platform.
>>>>>>>> One thing to mention - as John stated as well - there might be
>>>>>>>> some
>>>> attributes that are not specific to any software module but rather
>>>> have to do with the actual hardware/platform to configure.
>>>>>>>> I hope this clarifies some points.
>>>>>>> It does. Makes sense. We need to expose this attr set/get for both
>>>>>>> in-kernel and userspace use cases.
>>>>>>>
>>>>>>> Please adjust you patch for this. Also, as a second patch, it
>>>>>>> would be great if you can convert ndo_switch_port_stp_update to
>>>>>>> this new
>>>> ndo.
>>>>>> Why are we exposing generic switch attribute get/set from userspace
>>>>>> ?. We already have specific attributes for learning/flooding which
>>>>>> can be extended further.
>>>>> Yes, but that is for PF_BRIDGE and bridge specific attributes. There
>>>>> might be another generic attrs, no?
>>>> I cant think of any. And plus, the whole point of switchdev l2
>>>> offloads was to map these to existing bridge attributes. And we
>>>> already have a match for some of the attributes that marco wants.
>>>>
>>>> If there is a need for such attributes, i don't see why it is needed
>>>> for switch devices only.
>>>> It is needed for any hw (nics etc). And, a precedence to this is to
>>>> do it via ethtool.
>>>>
>>>> Having said that, am sure we will find a need for this in the future.
>>>> And having a netlink attribute always helps.
>>>>
>>>> Today, it seems like these can be mapped to existing attributes that
>>>> are settable via ndo_bridge_setlink/getlink.
>>>>
>>>>>> And for in kernel api....i had a sample patch in my RFC series
>>>>>> (Which i was going to resubmit, until it was decided that we will
>>>>>> use existing api around
>>>>>> ndo_bridge_setlink/ndo_bridge_getlink):
>>>>>> http://www.spinics.net/lists/netdev/msg305473.html
>>>>> Yes, this might become handy for other generic non-bridge attrs.
>>>>>
>>>>>> Thanks,
>>>>>> Roopa
>>>>>>
>>>>>>
>>>>>>
>>> The list I provided is only a subset of the attributes we will need to be
>> exposed. I do have more and I'm sure that more will come in the future. As I
>> mentioned in few posts earlier, some attributes are generic and some are
>> not.
>>> I did not consider ethtool for few reasons but the main one is that I was
>> under the impression that netlink was preferred in many circumstances over
>> the ethotool_ops.
>> That is correct. I don't think anybody hinted that you should extend ethtool.
>>>    Plus, all the cases I have identified so far are going to nicely fit into the
>> setlink set of operations.
>> Would be better if you submitted your iproute2 patch with this patch.
>>
>> I do plan to resubmit my generic ndo patch soon.
>>
>> Thanks,
>> Roopa
> I honestly do not understand what extra "help" the iproute2 would have brought to this RFC: that patch simply adds a new section for the iproute2 help and a new args parser for the input. From an infrastructure perspective is leveraging what netlink messages are using RTM_SETLINK hence hooking up eventually in the do_setlink(). Sure, obviously contains all the attributes I have in mind but from an infrastructure patch perspective I don't think that you would have gained much in seeing it.
correct. But you mentioned iproute2 changes in your patch comment. And 
since i was not getting a clear understanding of what these attributes 
were...from your current patch..., i thought your iproute2 patch might 
shed some light on how you plan to handle these attributes.

>
> Anyway, good to know you're reworking you generic patch. I'll keep an eye out for your new NDO.
>
>
> Thanks,
> Marco
>

^ permalink raw reply

* Re: [PATCH 0/9 net-next] rhashtable: Per bucket locks & deferred table resizing
From: David Miller @ 2014-12-15 16:18 UTC (permalink / raw)
  To: tgraf; +Cc: netdev, kernel, herbert, paulmck, edumazet, john.r.fastabend,
	josh
In-Reply-To: <20141215130050.GA9628@casper.infradead.org>

From: Thomas Graf <tgraf@suug.ch>
Date: Mon, 15 Dec 2014 13:00:50 +0000

> Meant for "net-next". Apologies for the missing prefix.

Which is not open for submission right now, please resubmit when it is
re-openned.

^ permalink raw reply

* Re: [RFC PATCH net-next 0/5] tcp: TCP tracer
From: Blake Matheny @ 2014-12-15 16:08 UTC (permalink / raw)
  To: Eric Dumazet, Alexei Starovoitov, Laurent Chavey, Yuchung Cheng
  Cc: Martin Lau, netdev@vger.kernel.org, David S. Miller,
	Hannes Frederic Sowa, Steven Rostedt, Lawrence Brakmo,
	Josef Bacik, Kernel Team
In-Reply-To: <1418659395.9773.13.camel@edumazet-glaptop2.roam.corp.google.com>

We have an additional set of patches for web10g that builds on these
tracepoints. It can be made to work either way, but I agree the idea of
something like a sockopt would be really nice.

-Blake

On 12/15/14, 8:03 AM, "Eric Dumazet" <eric.dumazet@gmail.com> wrote:

>On Sun, 2014-12-14 at 22:55 -0800, Alexei Starovoitov wrote:
>
>> I think patches 1 and 3 are good additions, since they establish
>> few permanent points of instrumentation in tcp stack.
>> Patches 4-5 look more like use cases of tracepoints established
>> before. They may feel like simple additions and, no doubt,
>> they are useful, but since they expose things via tracing
>> infra they become part of api and cannot be changed later,
>> when more stats would be needed.
>> I think systemtap like scripting on top of patches 1 and 3
>> should solve your use case ?
>> Also, have you looked at recent eBPF work?
>> Though it's not completely ready yet, soon it should
>> be able to do the same stats collection as you have
>> in 4/5 without adding permanent pieces to the kernel.
>
>So it looks like web10g like interfaces are very often requested by
>various teams.
>
>And we have many different views on how to hack this. I am astonished by
>number of hacks I saw about this stuff going on.
>
>What about a clean way, extending current TCP_INFO, which is both
>available as a getsockopt() for socket owners and ss/iproute2
>information for 'external entities'
>
>If we consider web10g info needed, then adding a ftrace/eBPF like
>interface is simply yet another piece of code we need to maintain,
>and the argument of 'this should cost nothing if not activated' is
>nonsense since major players need to constantly monitor TCP metrics and
>behavior.
>
>It seems both FaceBook and Google are working on a subset of web10g.
>
>I suggest we meet together and establish a common ground, preferably
>after Christmas holidays.
>
>Thanks
>
>


^ permalink raw reply

* Re: [RFC PATCH net-next 0/5] tcp: TCP tracer
From: Eric Dumazet @ 2014-12-15 16:03 UTC (permalink / raw)
  To: Alexei Starovoitov, Laurent Chavey, Yuchung Cheng
  Cc: Martin KaFai Lau, netdev@vger.kernel.org, David S. Miller,
	Hannes Frederic Sowa, Steven Rostedt, Lawrence Brakmo,
	Josef Bacik, Kernel Team
In-Reply-To: <CAADnVQJ+8mtB8LD=U7XbxOC2hxhDChxOELhZ3NEYeoTk1G3LYg@mail.gmail.com>

On Sun, 2014-12-14 at 22:55 -0800, Alexei Starovoitov wrote:

> I think patches 1 and 3 are good additions, since they establish
> few permanent points of instrumentation in tcp stack.
> Patches 4-5 look more like use cases of tracepoints established
> before. They may feel like simple additions and, no doubt,
> they are useful, but since they expose things via tracing
> infra they become part of api and cannot be changed later,
> when more stats would be needed.
> I think systemtap like scripting on top of patches 1 and 3
> should solve your use case ?
> Also, have you looked at recent eBPF work?
> Though it's not completely ready yet, soon it should
> be able to do the same stats collection as you have
> in 4/5 without adding permanent pieces to the kernel.

So it looks like web10g like interfaces are very often requested by
various teams.

And we have many different views on how to hack this. I am astonished by
number of hacks I saw about this stuff going on.

What about a clean way, extending current TCP_INFO, which is both
available as a getsockopt() for socket owners and ss/iproute2
information for 'external entities'

If we consider web10g info needed, then adding a ftrace/eBPF like
interface is simply yet another piece of code we need to maintain,
and the argument of 'this should cost nothing if not activated' is
nonsense since major players need to constantly monitor TCP metrics and
behavior.

It seems both FaceBook and Google are working on a subset of web10g.

I suggest we meet together and establish a common ground, preferably
after Christmas holidays.

Thanks

^ permalink raw reply

* Re: [PATCH net] net: Disallow providing non zero VLAN ID for NIC drivers FDB add flow
From: Jamal Hadi Salim @ 2014-12-15 15:31 UTC (permalink / raw)
  To: Jiri Pirko, Or Gerlitz
  Cc: Or Gerlitz, David S. Miller, Linux Netdev List, Andy Gospodarek,
	John Fastabend
In-Reply-To: <20141214223354.GA2035@nanopsycho.orion>

On 12/14/14 17:33, Jiri Pirko wrote:
> Sun, Dec 14, 2014 at 09:14:27PM CET, gerlitz.or@gmail.com wrote:
>> On Sun, Dec 14, 2014 at 9:23 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>>> Sun, Dec 14, 2014 at 05:19:05PM CET, ogerlitz@mellanox.com wrote:
>>>> The current implementations all use dev_uc_add_excl() and such whose API
>>>> doesn't support vlans, so we can't make it with NICs HW for now.
>>>>
>>>> Fixes: f6f6424ba773 ('net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del')
>>>
>>> Maybe I'm missing something, but this commit did not introduce the
>>> problem.
>>
>> This commit introduced the vid param to ndo_fdb_add and ndo_dflt_fdb_add
>> which further call the dev_uc_add APIs... so it did introduced the
>> ability to provide VID into these APIs, right? and we want to protect
>> against anyone using this ability @ this point.
>
> That is in-kernel change. Vs. usespace the patch is a no-op. If userspace
> fills up NDA_VLAN, it is ignored before the patch as well as after. No
> behaviour change, just +- cosmetics.
>

Indeed doesnt break anything, but:
I think this brings to the forefront what these dev_uc/mc addresses
are supposed to be here. I am sure there is a good reason to tie them
to the fdb API - I am just not grokking it at the moment.
The concept of a vlanid makes sense for an fdb entry. It
starts to break when you start tying in vlans - i.e i am not aware
of multicast/unicast device entries which are tied to vlans.
Maybe this is some SRIOV thing?

cheers,
jamal

^ permalink raw reply

* Re: [PATCH net-next v2 2/4] swdevice: add new api to set and del bridge port attributes
From: Jamal Hadi Salim @ 2014-12-15 15:26 UTC (permalink / raw)
  To: Roopa Prabhu, Jiri Pirko
  Cc: sfeldma, bcrl, tgraf, john.fastabend, stephen, linville, vyasevic,
	netdev, davem, shm, gospo
In-Reply-To: <548DE7E2.6010705@cumulusnetworks.com>


Sorry - i didnt quiet follow the discussion, but i can see the value
of propagating things from parent to children netdevs as part of the
generic approach. And in that spirit:

Ben's patches (and I am sure the cumulus folk do this) expose ports.
i.e you boot up the hardware and you see ports. You can then put these
ports in a bridge and you can offload fdbs and do other parametrization
to the ASIC. IOW, this only becomes a bridge because you created one
in the kernel and attached bridge ports to it.

Lets say i didnt want a bridge. I want instead to take these exposed
ports and create a bond (and maybe play with LACP). How does this
propagation from parent->child->child work then? I think the idea
of just bonding and not exposing it as a switch is a reasonable use
case.
Also how does it work when i start doing L3 and the bond's port doesnt
support L3? Is it time to revive the thing we called TheThing in Du?

cheers,
jamal

On 12/14/14 14:41, Roopa Prabhu wrote:
> On 12/14/14, 7:35 AM, Jiri Pirko wrote:

[..chopped off for brevity and saving electrons..]

cheers,
jamal

^ permalink raw reply

* Re: [bisected] tg3 broken in 3.18.0?
From: Marcelo Ricardo Leitner @ 2014-12-15 15:06 UTC (permalink / raw)
  To: Nils Holland, David Miller; +Cc: netdev, linux-pci, rajatxjain
In-Reply-To: <20141213210251.GA12812@teela.fritz.box>

On 13-12-2014 19:02, Nils Holland wrote:
> rajatxjain@gmail.com
> Bcc:
> Subject: Re: [bisected] tg3 broken in 3.18.0?
> Reply-To:
> In-Reply-To: <20141212.201831.186234837340644301.davem@davemloft.net>
>
> On Fri, Dec 12, 2014 at 08:18:31PM -0500, David Miller wrote:
>> From: Nils Holland <nholland@tisys.org>
>> Date: Sat, 13 Dec 2014 02:14:08 +0100
>>
>>>
>>> My bisect exercise suggests that the following commit is the culprit:
>>>
>>> 89665a6a71408796565bfd29cfa6a7877b17a667 (PCI: Check only the Vendor
>>> ID to identify Configuration Request Retry)
>>
>> You definitely need to bring this up with the author of that change
>> and the relevent list for the PCI subsystem and/or linux-kernel.
>
> I've now already sent an inquiry to Rajat Jain, the author of the
> patch in question, and this message here is now also CC'd to
> linux-pci@.
>
> With this message, I'd like to add one last result of investigation
> I've done today, in the hope that it will aid the folks with more
> knowledge to go after the issue.

FWIW, reverting this change fixes tg3 in here too.

Thanks Nils for doing the bisect!

With these debugs (note the re-revert):

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c 

index 2306268..4474502 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1436,14 +1436,22 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, 
int devfn, u32 *l,
                 return false;

         /* Configuration request Retry Status */
-       while (*l == 0xffff0001) {
-               if (!crs_timeout)
+       printk ("pci %04x:%02x:%02x.%d: 1st %x %x\n", pci_domain_nr(bus), 
bus->number,
+               PCI_SLOT(devfn), PCI_FUNC(devfn), *l, *l & 0xffff);
+       while ((*l & 0xffff) == 0x0001) {
+               if (!crs_timeout) {
+                       printk ("pci %04x:%02x:%02x.%d: crs_timeout: %d\n", 
pci_domain_nr(bus),
+                               bus->number, PCI_SLOT(devfn), PCI_FUNC(devfn), 
crs_timeout);
                         return false;
+               }

                 msleep(delay);
                 delay *= 2;
-               if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
+               if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l)) {
+                       printk ("pci %04x:%02x:%02x.%d: 
pci_bus_read_config_dword failed\n", pci_domain_nr(bus),
+                               bus->number, PCI_SLOT(devfn), PCI_FUNC(devfn));
                         return false;
+               }
                 /* Card hasn't responded in 60 seconds?  Must be stuck. */
                 if (delay > crs_timeout) {
                         printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not 
responding\n",
@@ -1451,6 +1459,7 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int 
devfn, u32 *l,
                                PCI_FUNC(devfn));
                         return false;
                 }
+               printk ("pci %04x:%02x:%02x.%d: %x %x\n", pci_domain_nr(bus), 
bus->number, PCI_SLOT(devfn), PCI_FUNC(devfn), *l, *l & 0xffff);
         }

         return true;

I'm getting, with commit 89665a6a71408796565bfd29cfa6a7877b17a667:

$ grep 'pci 0000:02' tg3.bad
[    0.190733] pci 0000:02:00.0: 1st 165a14e4 14e4
[    0.190736] pci 0000:02:00.0: 1st 165a14e4 14e4
[    0.190810] pci 0000:02:00.0: [14e4:165a] type 00 class 0x020000
[    0.190885] pci 0000:02:00.0: reg 0x10: [mem 0xf7c40000-0xf7c4ffff 64bit]
[    0.191048] pci 0000:02:00.0: reg 0x30: [mem 0xf7c00000-0xf7c3ffff pref]
[    0.191382] pci 0000:02:00.0: PME# supported from D3hot D3cold
[    0.191438] pci 0000:02:00.0: System wakeup disabled by ACPI
[    1.561555] pci 0000:02:00.0: 1st 1 1
[    1.561558] pci 0000:02:00.0: crs_timeout: 0
[   20.412021] pci 0000:02:00.0: 1st 1 1
[   20.412022] pci 0000:02:00.0: crs_timeout: 0
[   20.413596] pci 0000:02:00.0: 1st 1 1
[   20.413598] pci 0000:02:00.0: crs_timeout: 0

And without it:

$ grep 'pci 0000:02' tg3.good
[    0.190734] pci 0000:02:00.0: 1st 165a14e4 14e4
[    0.190738] pci 0000:02:00.0: 1st 165a14e4 14e4
[    0.190811] pci 0000:02:00.0: [14e4:165a] type 00 class 0x020000
[    0.190884] pci 0000:02:00.0: reg 0x10: [mem 0xf7c40000-0xf7c4ffff 64bit]
[    0.191047] pci 0000:02:00.0: reg 0x30: [mem 0xf7c00000-0xf7c3ffff pref]
[    0.191380] pci 0000:02:00.0: PME# supported from D3hot D3cold
[    0.191439] pci 0000:02:00.0: System wakeup disabled by ACPI
[    1.576778] pci 0000:02:00.0: 1st 1 1
[   19.068517] pci 0000:02:00.0: 1st 165a14e4 14e4

Hope that helps!

   Marcelo

^ permalink raw reply related

* Re: [RFC PATCH net-next 1/1] net: Support for switch port configuration
From: Thomas Graf @ 2014-12-15 14:40 UTC (permalink / raw)
  To: Varlese, Marco
  Cc: John Fastabend, Jiri Pirko, netdev@vger.kernel.org,
	stephen@networkplumber.org, Fastabend, John R,
	roopa@cumulusnetworks.com, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <C4896FB061E7DE4AAC93031BDCA044B104AC5DCF@IRSMSX108.ger.corp.intel.com>

On 12/15/14 at 02:29pm, Varlese, Marco wrote:
> > All of these are highly generic and should *not* be passed through from user
> > space to the driver directly but rather be properly abstracted as Roopa
> > proposed. The value of this API is abstraction.
> How would you let the user enable/disable features then? For instance, how would the user enable/disable flooding for broadcast packets (BFLOODING) on a given port? What I was proposing is to have a list of attributes (to be added in if_link.h) which can be tuned by the user using a tool like iproute2. What do you propose? 

Excellent, I agree with what you are saying. What set me off is that
the patch does not reflect that yet. Instead, the patch introduces
a pure Netlink pass-through API to the driver.

I would expect the patch to:
 1. Parse the Netlink messages and be aware of individual attributes
 2. Validate them
 3. Pass the configuration to the driver using an API that can also
    be consumed from in-kernel users.

> I think I have seen Roopa posting his updated ndo patch and getting some feedback by few people already and as long as I will be able to accomplish the use case described here I am happy with his way.

I think Roopa's patches are supplementary. Not all switchdev users
will be backed with a Linux Bridge. I therefore welcome your patches
very much.

The overlap is in the ndo. I think both the API you propose and
Roopa's bridge code should use the same NDO.

> I do not have an example right now of a vendor specific attribute but I was just saying that might happen (i.e. someone will have a feature not implemented by others?).

That's fine. Once we have them we can consider adding vendor specific
extensions.

^ permalink raw reply

* Re: [PATCH net-next RESEND] net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined.
From: Jamal Hadi Salim @ 2014-12-15 14:29 UTC (permalink / raw)
  To: John Fastabend
  Cc: Hubert Sokolowski, Roopa Prabhu, netdev@vger.kernel.org,
	Vlad Yasevich
In-Reply-To: <548B4AA4.1020804@gmail.com>

On 12/12/14 15:05, John Fastabend wrote:
> On 12/12/2014 06:35 AM, Jamal Hadi Salim wrote:


> I'll wake up ;)


Vlad made me go over those patches in a few iterations to make
sure that the use cases covered in the test case work. It is
holiday season, so he may be offline.

> First quick grep of code finds some strange uses of ndo_fdb_dump like
> this in macvlan,
>
>    ./drivers/net/macvlan.c
>          .ndo_fdb_dump           = ndo_dflt_fdb_dump,
>
> I'll be sending a patch once net-next opens up again to resolve it. Its
> harmless though so not really a fix for net.
>
> There seem to be a few places that have the potential to return
> different values then the uc/mc lists.
>
>      ./drivers/net/vxlan.c
>      ./drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
>      ./drivers/net/ethernet/rocker/rocker.c
>
>      ./net/bridge/br_device.c
>

Yes, thats my observation as well.
The question is: Are multi/unicast address unconditionally dumped?
Some of these drivers may be just doing the LinuxWay(aka cutnpaste what
the other driver did).
If you go over the original thread exchange with Vlad, you'll notice
i was kind of unsure why dumping of unicast/multicast had anything to
do with fdb dumping.
It is still my view that we shouldnt be treating these addresses as if
they were fdb entries. But: The problem is once you allow an API to
user space you cant take it back even if people are depending on bugs.


> So I guess we can walk through the list and analyse them a bit.
>
> vxlan:
>
> Try stacking devices on top of the vxlan device this will call a uc_add
> routine if you then change the mac addr on the vlan. This would get
> reported by the dflt fdb dump handlers but not the drivers fdb dump
> handlers. So removing the dflt dump handler from this patch at least
> changes things. We should either explain why this is OK or accept that
> the driver needs to be fixed. Or I guess that the patch is just wrong.
> My guess is one of the latter options.
>
> Also Jamal, your original patch seems like it might of changed this
> and Hubert's patch is reverting back to its original case. Was this
> specific part of your patch intentional?
>

Yes.
This is based on the view that unicast/multicast must be dumped
*unconditionally*. If the view is that uni/mcast addresses are
dumped conditionally based on what the driver thinks, then Hubert's
one liner is good. But i really would like Vlad to comment. 80%
of the effort on my part if you look at the thread was the refactoring
of the code to meet the use case.

I thought the abstraction which requires that your own MAC addresses
are treated as fdb entries was broken - but it is too late to change
that.

cheers,
jamal

^ permalink raw reply

* RE: [RFC PATCH net-next 1/1] net: Support for switch port configuration
From: Varlese, Marco @ 2014-12-15 14:29 UTC (permalink / raw)
  To: Thomas Graf
  Cc: John Fastabend, Jiri Pirko, netdev@vger.kernel.org,
	stephen@networkplumber.org, Fastabend, John R,
	roopa@cumulusnetworks.com, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <20141215140749.GB21952@casper.infradead.org>

> -----Original Message-----
> From: Thomas Graf [mailto:tgr@infradead.org] On Behalf Of Thomas Graf
> Sent: Monday, December 15, 2014 2:08 PM
> To: Varlese, Marco
> Cc: John Fastabend; Jiri Pirko; netdev@vger.kernel.org;
> stephen@networkplumber.org; Fastabend, John R;
> roopa@cumulusnetworks.com; sfeldma@gmail.com; linux-
> kernel@vger.kernel.org
> Subject: Re: [RFC PATCH net-next 1/1] net: Support for switch port
> configuration
> 
> On 12/11/14 at 09:59am, Varlese, Marco wrote:
> > An example of attributes are:
> > * enabling/disabling of learning of source addresses on a given port
> > (you can imagine the attribute called LEARNING for example);
> > * internal loopback control (i.e. LOOPBACK) which will control how the
> > flow of traffic behaves from the switch fabric towards an egress port;
> > * flooding for broadcast/multicast/unicast type of packets (i.e.
> > BFLOODING, MFLOODING, UFLOODING);
> 
> All of these are highly generic and should *not* be passed through from user
> space to the driver directly but rather be properly abstracted as Roopa
> proposed. The value of this API is abstraction.
How would you let the user enable/disable features then? For instance, how would the user enable/disable flooding for broadcast packets (BFLOODING) on a given port? What I was proposing is to have a list of attributes (to be added in if_link.h) which can be tuned by the user using a tool like iproute2. What do you propose? 
I think I have seen Roopa posting his updated ndo patch and getting some feedback by few people already and as long as I will be able to accomplish the use case described here I am happy with his way.

> If we introduce per device attributes for generic functions we lose large
> portions of the value gained.
> You mentioned you have additional attributes in mind, maybe you can give a
> few examples which are not generic, i.e. do not apply to multiple vendors.
I do not have an example right now of a vendor specific attribute but I was just saying that might happen (i.e. someone will have a feature not implemented by others?).

^ permalink raw reply

* [PATCH 3.16.y-ckt 111/168] net/ping: handle protocol mismatching scenario
From: Luis Henriques @ 2014-12-15 14:26 UTC (permalink / raw)
  To: linux-kernel, stable, kernel-team
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, Jane Zhou, Yiwei Zhao,
	Luis Henriques
In-Reply-To: <1418653622-21105-1-git-send-email-luis.henriques@canonical.com>

3.16.7-ckt3 -stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jane Zhou <a17711@motorola.com>

commit 91a0b603469069cdcce4d572b7525ffc9fd352a6 upstream.

ping_lookup() may return a wrong sock if sk_buff's and sock's protocols
dont' match. For example, sk_buff's protocol is ETH_P_IPV6, but sock's
sk_family is AF_INET, in that case, if sk->sk_bound_dev_if is zero, a wrong
sock will be returned.
the fix is to "continue" the searching, if no matching, return NULL.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jane Zhou <a17711@motorola.com>
Signed-off-by: Yiwei Zhao <gbjc64@motorola.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
---
 net/ipv4/ping.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 044a0ddf6a79..620e8ffa62e8 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -217,6 +217,8 @@ static struct sock *ping_lookup(struct net *net, struct sk_buff *skb, u16 ident)
 					     &ipv6_hdr(skb)->daddr))
 				continue;
 #endif
+		} else {
+			continue;
 		}
 
 		if (sk->sk_bound_dev_if && sk->sk_bound_dev_if != dif)
-- 
2.1.3

^ permalink raw reply related

* Re: [LKP] [genirq] c291ee62216:
From: Thomas Gleixner @ 2014-12-15 14:25 UTC (permalink / raw)
  To: Huang Ying; +Cc: LKML, LKP ML, Rick Jones, Peter Zijlstra, netdev
In-Reply-To: <1418627248.5745.186.camel@intel.com>

On Mon, 15 Dec 2014, Huang Ying wrote:
> FYI, we noticed the below changes on
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/urgent
> commit c291ee622165cb2c8d4e7af63fffd499354a23be ("genirq: Prevent proc race against freeing of irq descriptors")
> 
> testbox/testcase/testparams: lkp-nex04/netperf/performance-300s-200%-SCTP_STREAM

>                            time.voluntary_context_switches
> 
>   100500 ++-----------------------------------------------------------------+
>          O       O                    O          O                          |
>          |  O                            O          O                       |
>   100000 ++           O  O  O O     O          O                            |
>          |    O                  O          O          O                    |
>          |          O                                                       |
>    99500 ++                                                                 |
>          |                                                                  |
>    99000 ++                                                                 |
>          |                                                                  |
>          |                                                                  |
>    98500 ++        .*.*..                          .*.. .*..    .*..*..     |
>          *..*.*..*.      *..*.*..*..*.*..*..*..*.*.    *    *..*       *.*..*
>          |                                                                  |
>    98000 ++-----------------------------------------------------------------+
> 
> 
> 	[*] bisect-good sample
> 	[O] bisect-bad  sample

Cute. Looking at netperf source it seems to do a high frequency
readout of /proc/stat from all involved threads. Which of course
explains that the number of context switches is going up as the stuff
is going to content on the sparse_irq_mutex.

While its possible to fix^W band aid that case, I'm really not too
happy to do so just to please a wreckaged use case. High frequency
polling of /proc/stat is just asking for trouble and on larger
machines it's a complete scalability fail. Especially the interrupt
part is amazingly horrible

      for_each_irq_nr()
	for_each_possible_cpu()

Is it really required for netperf to do that stat poll in a loop or
can it be made smarter?

Btw, in that test scenario runs netserver and the test threads on the
same machine. So the utilization data is pretty useless anyway because
all threads will read more or less the same data which cannot be
correlated to a particular instance.

Thanks,

	tglx

^ permalink raw reply

* [PATCH 2/2] net/macb: remove useless calls of devm_free_irq()
From: Cyrille Pitchen @ 2014-12-15 14:13 UTC (permalink / raw)
  To: nicolas.ferre, davem, linux-arm-kernel, netdev, soren.brinkmann
  Cc: linux-kernel, Cyrille Pitchen
In-Reply-To: <cover.1418651689.git.cyrille.pitchen@atmel.com>

Inside macb_probe(), when devm_request_irq() fails on queue q, there is no need
to call devm_free_irq() on queues 0..q-1 because the managed device resources
are released later when calling free_netdev().

Also removing devm_free_irq() call from macb_remove() for the same reason.

Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
---
 drivers/net/ethernet/cadence/macb.c | 23 ++++++++---------------
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 81f317f..414f796 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -2160,7 +2160,7 @@ static int __init macb_probe(struct platform_device *pdev)
 	int err = -ENXIO;
 	const char *mac;
 	void __iomem *mem;
-	unsigned int hw_q, queue_mask, q, num_queues, q_irq = 0;
+	unsigned int hw_q, queue_mask, q, num_queues;
 	struct clk *pclk, *hclk, *tx_clk;
 
 	regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
@@ -2235,11 +2235,11 @@ static int __init macb_probe(struct platform_device *pdev)
 	 * register mapping but we don't want to test the queue index then
 	 * compute the corresponding register offset at run time.
 	 */
-	for (hw_q = 0; hw_q < MACB_MAX_QUEUES; ++hw_q) {
+	for (hw_q = 0, q = 0; hw_q < MACB_MAX_QUEUES; ++hw_q) {
 		if (!(queue_mask & (1 << hw_q)))
 			continue;
 
-		queue = &bp->queues[q_irq];
+		queue = &bp->queues[q];
 		queue->bp = bp;
 		if (hw_q) {
 			queue->ISR  = GEM_ISR(hw_q - 1);
@@ -2261,18 +2261,18 @@ static int __init macb_probe(struct platform_device *pdev)
 		 * must remove the optional gaps that could exist in the
 		 * hardware queue mask.
 		 */
-		queue->irq = platform_get_irq(pdev, q_irq);
+		queue->irq = platform_get_irq(pdev, q);
 		err = devm_request_irq(&pdev->dev, queue->irq, macb_interrupt,
 				       0, dev->name, queue);
 		if (err) {
 			dev_err(&pdev->dev,
 				"Unable to request IRQ %d (error %d)\n",
 				queue->irq, err);
-			goto err_out_free_irq;
+			goto err_out_free_netdev;
 		}
 
 		INIT_WORK(&queue->tx_error_task, macb_tx_error_task);
-		q_irq++;
+		q++;
 	}
 	dev->irq = bp->queues[0].irq;
 
@@ -2350,7 +2350,7 @@ static int __init macb_probe(struct platform_device *pdev)
 	err = register_netdev(dev);
 	if (err) {
 		dev_err(&pdev->dev, "Cannot register net device, aborting.\n");
-		goto err_out_free_irq;
+		goto err_out_free_netdev;
 	}
 
 	err = macb_mii_init(bp);
@@ -2373,9 +2373,7 @@ static int __init macb_probe(struct platform_device *pdev)
 
 err_out_unregister_netdev:
 	unregister_netdev(dev);
-err_out_free_irq:
-	for (q = 0, queue = bp->queues; q < q_irq; ++q, ++queue)
-		devm_free_irq(&pdev->dev, queue->irq, queue);
+err_out_free_netdev:
 	free_netdev(dev);
 err_out_disable_clocks:
 	if (!IS_ERR(tx_clk))
@@ -2392,8 +2390,6 @@ static int __exit macb_remove(struct platform_device *pdev)
 {
 	struct net_device *dev;
 	struct macb *bp;
-	struct macb_queue *queue;
-	unsigned int q;
 
 	dev = platform_get_drvdata(pdev);
 
@@ -2405,9 +2401,6 @@ static int __exit macb_remove(struct platform_device *pdev)
 		kfree(bp->mii_bus->irq);
 		mdiobus_free(bp->mii_bus);
 		unregister_netdev(dev);
-		queue = bp->queues;
-		for (q = 0; q < bp->num_queues; ++q, ++queue)
-			devm_free_irq(&pdev->dev, queue->irq, queue);
 		if (!IS_ERR(bp->tx_clk))
 			clk_disable_unprepare(bp->tx_clk);
 		clk_disable_unprepare(bp->hclk);
-- 
1.8.2.2

^ permalink raw reply related

* [PATCH 1/2] net/macb: fix misplaced call of free_netdev() in macb_remove()
From: Cyrille Pitchen @ 2014-12-15 14:13 UTC (permalink / raw)
  To: nicolas.ferre, davem, linux-arm-kernel, netdev, soren.brinkmann
  Cc: linux-kernel, Cyrille Pitchen
In-Reply-To: <cover.1418651689.git.cyrille.pitchen@atmel.com>

fix a bug introduced by the multiqueue support patch:
"net/macb: add TX multiqueue support for gem"

the "bp" pointer to the netdev private data was dereferenced and used after the
associated memory had been freed by calling free_netdev().

Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
---
 drivers/net/ethernet/cadence/macb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 0987d2a..81f317f 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -2408,11 +2408,11 @@ static int __exit macb_remove(struct platform_device *pdev)
 		queue = bp->queues;
 		for (q = 0; q < bp->num_queues; ++q, ++queue)
 			devm_free_irq(&pdev->dev, queue->irq, queue);
-		free_netdev(dev);
 		if (!IS_ERR(bp->tx_clk))
 			clk_disable_unprepare(bp->tx_clk);
 		clk_disable_unprepare(bp->hclk);
 		clk_disable_unprepare(bp->pclk);
+		free_netdev(dev);
 	}
 
 	return 0;
-- 
1.8.2.2

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox