Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH] net: Add ndo_gso_check
From: Or Gerlitz @ 2014-10-06 20:22 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Alexander Duyck, John Fastabend, Jeff Kirsher, David Miller,
	Linux Netdev List, Thomas Graf, Pravin Shelar, Andy Zhou
In-Reply-To: <CA+mtBx-AW65w7grfb7zTaDKzKQ2n_7JQuLquj1Ayopa4gdiF6Q@mail.gmail.com>

On Mon, Oct 6, 2014 at 8:59 PM, Tom Herbert <therbert@google.com> wrote:
> On Sun, Oct 5, 2014 at 12:13 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>> RX wise, Linux tells the driver that UDP port X would be used for
>> VXLAN, right? and indeed, it's possible for some HW implementations
>> not to support RX offloading (checksum) for both VXLAN and NVGRE @ the
>> same time over the same port. But TX/GRO wise, you're probably
>> correct. The thing is that from the user POV they need solution that
>> works for both RX and TX offloading.

> I think from a user POV we want a solution that supports RX and TX
> offloading across the widest range of protocols. This is accomplished
> by implementing protocol agnostic mechanisms like CHECKSUM_COMPLETE
> and protocol agnostic UDP tunnel TSO like we've described. IMO, the
> fact that we have devices that implement protocol specific mechanisms
> for NVGRE and VXLAN should be considered legacy support in the stack,
> for new UDP encapsulation protocols we should not expose specifics in
> the stack in either by adding a GSO type for each protocol, nor
> ndo_add_foo_port for each protocol-- these things will not scale and
> unnecessarily complicate the core stack.

I tend to generally agree to the wind that blows from your writeup, namely:

UDP encapsulation offloads wise, we should pose few general
requirements to NICs to be implemented by vendors in their tomorrow's
HW and treat the current generation (these 4-5 drivers with their
limitations as legacy which should be supported but not state the
stack overall design).

Still we should seek more ways to reduce the pain/amount of
not-well-defined-configurations when these drivers are there and the
stack goes through this upside-down turnaround changes. OTOH you
didn't accept my SKB coloring suggestion for GSO inspection, and OTOH
I guess we can live with some sort of generic helper in the form of
what you suggested, but like it or not, getting rid of
ndo_add_vxlan_port will simply break things out.

Are we going to have a session on the encapsulation/offloads design @ LPC?

I think a replay of your LKS presentation along with open discussion
on how to get there with the legacy requirements could be very
helpful.

Or.

^ permalink raw reply

* Re: [PATCH v2 7/7] driver-core: add preferred async probe option for built-in and modules
From: Tejun Heo @ 2014-10-06 20:19 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: gregkh, dmitry.torokhov, tiwai, arjan, teg, rmilasan, werner,
	oleg, hare, bpoirier, santosh, pmladek, dbueso, mcgrof,
	linux-kernel, Tetsuo Handa, Joseph Salisbury, Kay Sievers,
	One Thousand Gnomes, Tim Gardner, Pierre Fersing, Andrew Morton,
	Nagalakshmi Nandigama, Praveen Krishnamoorthy, Sreekanth Reddy,
	Abhijit Mahajan, Casey Leedom, Hariprasad S <harip
In-Reply-To: <1412372683-2003-8-git-send-email-mcgrof@do-not-panic.com>

Hello, Luis.

The patchset generally looks good to me.  Please feel free to add

 Reviewed-by: Tejun Heo <tj@kernel.org>

A question below.

On Fri, Oct 03, 2014 at 02:44:43PM -0700, Luis R. Rodriguez wrote:
> +	bus.enable_kern_async=1 [KNL]
> +			This tells the kernel userspace has been vetted for
> +			asynchronous probe support and can listen to the device
> +			driver prefer_async_probe flag for both built-in device
> +			drivers and modules.

Do we intend to keep this param permanently?  Isn't this more of a
temp tool to be used during development?  If so, maybe we should make
that clear with __DEVEL__ too?

Thanks.

-- 
tejun

^ permalink raw reply

* Re: [PATCH net-next 0/2] sunvnet: Packet processing in non-interrupt context.
From: David Miller @ 2014-10-06 19:37 UTC (permalink / raw)
  To: sowmini.varadhan; +Cc: raghuram.kothakota, netdev
In-Reply-To: <20141006193111.GE24721@oracle.com>

From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Date: Mon, 6 Oct 2014 15:31:11 -0400

> On (10/06/14 15:25), David Miller wrote:
>> 
>> > But we still need to hold the vio lock around the ldc_write 
>> > (and also around dring write) in vnet_start_xmit, right?
>> 
>> You might be able to avoid it, you're fully serialized by the TX queue
>> lock.
> 
> yes, I was just noticing that. The only place where I believe I need
> to hold the vio spin-lock is to sync with the dr->cons checks
> (the "should I send a start_cons LDC message?" check in vnet_start_xmit()
> vs the vnet_ack() updates).

I don't see how that is any different from the netif_queue_wake() checks,
it should be just as easy to cover it with the generic xmit lock.

> But isn't it better in general to declare NETIF_F_LLTX and have finer lock
> granularity in the driver?

No, NETIF_F_LLTX drivers are heavily discouraged.  And on the
contrary, it's easier to optimize the locking when we can consolidate
what is covered at both the mid-level of the networking send paths and
what the drivers need in their ->ndo_start_xmit() routines.

^ permalink raw reply

* Re: [PATCH net-next] net: validate_xmit_vlan() is static
From: Joe Perches @ 2014-10-06 19:36 UTC (permalink / raw)
  To: Josh Triplett
  Cc: Eric Dumazet, David Miller, netdev, Julia Lawall, Dan Carpenter
In-Reply-To: <20141006190728.GA2137@thin>

On Mon, 2014-10-06 at 12:07 -0700, Josh Triplett wrote:
> On Mon, Oct 06, 2014 at 11:44:02AM -0700, Joe Perches wrote:
> > On Mon, 2014-10-06 at 11:26 -0700, Eric Dumazet wrote:
> > > From: Eric Dumazet <edumazet@google.com>
> > > 
> > > Marking this as static allows compiler to inline it.
> > 
> > Found by inspection or another tool?
> > 
> > Wasn't there some tool to look for non-static functions
> > that are not called externally that could/should be
> > converted to static?
> 
> Several: sparse,

Doesn't find anything for this use case

> gcc -Wmissing-prototypes, and findstatic.pl.

Right, thanks, using "make W=1" adds -Wmissing-prototypes.

^ permalink raw reply

* Re: [PATCH net-next] net: validate_xmit_vlan() is static
From: Eric Dumazet @ 2014-10-06 19:35 UTC (permalink / raw)
  To: Joe Perches
  Cc: David Miller, netdev, Julia Lawall, Dan Carpenter, Josh Triplett
In-Reply-To: <1412621042.2916.29.camel@joe-AO725>

On Mon, 2014-10-06 at 11:44 -0700, Joe Perches wrote:
> On Mon, 2014-10-06 at 11:26 -0700, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> > 
> > Marking this as static allows compiler to inline it.
> 
> Found by inspection or another tool?
> 
> Wasn't there some tool to look for non-static functions
> that are not called externally that could/should be
> converted to static?

Found by inspecting and analyzing performance on real workload.

Note prior commits :

bec3cfdca36bf43cfa3751ad7b56db1a307e0760 net: skb_segment() provides list head and tail
01291202ed4ad548f9a7147d20425cb1d24f49a7 net: do not export skb_gro_receive()
55a93b3ea780908b7d1b3a8cf1976223a9268d78 qdisc: validate skb without holding lock

^ permalink raw reply

* Re: [PATCH net-next 0/2] sunvnet: Packet processing in non-interrupt context.
From: Sowmini Varadhan @ 2014-10-06 19:31 UTC (permalink / raw)
  To: David Miller; +Cc: raghuram.kothakota, netdev
In-Reply-To: <20141006.152526.965519223260573233.davem@davemloft.net>

On (10/06/14 15:25), David Miller wrote:
> 
> > But we still need to hold the vio lock around the ldc_write 
> > (and also around dring write) in vnet_start_xmit, right?
> 
> You might be able to avoid it, you're fully serialized by the TX queue
> lock.

yes, I was just noticing that. The only place where I believe I need
to hold the vio spin-lock is to sync with the dr->cons checks
(the "should I send a start_cons LDC message?" check in vnet_start_xmit()
vs the vnet_ack() updates).

But isn't it better in general to declare NETIF_F_LLTX and have finer lock
granularity in the driver?

--Sowmini

^ permalink raw reply

* Re: [PATCH net-next 0/2] sunvnet: Packet processing in non-interrupt context.
From: David Miller @ 2014-10-06 19:25 UTC (permalink / raw)
  To: sowmini.varadhan; +Cc: raghuram.kothakota, netdev
In-Reply-To: <20141006160418.GA3604@oracle.com>

From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Date: Mon, 6 Oct 2014 12:04:18 -0400

>> I think you should be able to get rid of all of the in-driver
>> locking in the fast paths.
>> 
>> NAPI ->poll() is non-reentrant, so all RX processing occurs
>> strictly in a serialized environment.
>> 
>> Once you do TX reclaim in NAPI context, then all you have to do is
>> take the generic netdev TX queue lock during the evaluation of whether
>> to wakeup the TX queue or not.  Worst case you need to hold the
>> TX netdev queue lock across the whole TX reclaim operation.
>> 
>> The VIO lock really ought to be entirely superfluous in the data
>> paths.
> 
> A few clarifications, since there are more driver-examples using NAPI for
> Rx than for Tx reclaim

Those drivers fully go against our recommendations, we always say that
TX reclaim should also run from NAPI because it liberates SKBs that
therefore become available for RX processing.

> But we still need to hold the vio lock around the ldc_write 
> (and also around dring write) in vnet_start_xmit, right?

You might be able to avoid it, you're fully serialized by the TX queue
lock.

^ permalink raw reply

* Re: [PATCH v2 net-next 15/15] tipc: remove old ASCII netlink API
From: David Miller @ 2014-10-06 19:20 UTC (permalink / raw)
  To: jon.maloy; +Cc: richard.alpe, netdev, tipc-discussion
In-Reply-To: <A2BAEFC30C8FD34388F02C9B3121859D1C2EB615@eusaamb103.ericsson.se>

From: Jon Maloy <jon.maloy@ericsson.com>
Date: Mon, 6 Oct 2014 13:37:14 +0000

> I disagree.  We did of course consider this before we posted the patch.  
> There is absolutely no reason to believe there are binaries "out there" 
> using this API. It is intended for and used only by our own tool "tipc-config", 
> and the half-baked documentation that exists about it is not even
> correct. This is an TIPC-internal API/ABI, which has changed over the years,
> and keeps changing.  To our best judgement there is no risk in removing the
> old API.
> 
> However, we took great care to not break any scripts that might be using 
> tipc-config. Therefore, we made this wrapper that provides the old 
> tipc-config command API to existing users.

A user should be able to upgrade the kernel and keep using tipc-config
(or _whatever_) that's on their machine.

You cannot break old stuff like this, and just because I didn't notice
you slipping in incompatible changes in the past doesn't mean I'm going
to let you keep doing it.

Backwards compatability must be maintained for every single interface
exposed to the user, and this is non-negotiable, sorry.

^ permalink raw reply

* Re: [PATCH] sctp: handle association restarts when the socket is closed.
From: David Miller @ 2014-10-06 19:18 UTC (permalink / raw)
  To: vyasevich; +Cc: netdev
In-Reply-To: <5432943E.7000505@gmail.com>

From: Vlad Yasevich <vyasevich@gmail.com>
Date: Mon, 06 Oct 2014 09:08:14 -0400

> On 10/06/2014 12:22 AM, David Miller wrote:
>> Candidate for -stable?
> 
> I say yes.  If this problem is triggered it is total pain to get the memory
> clean-up.

Ok, queued up, thanks.

^ permalink raw reply

* Re: [PATCH net-next] ipvs: Avoid null-pointer deref in debug code
From: Julian Anastasov @ 2014-10-06 19:13 UTC (permalink / raw)
  To: Alex Gartrell; +Cc: horms, dan.carpenter, lvs-devel, netdev, kernel-team
In-Reply-To: <5432BBB8.9070206@fb.com>


	Hello,

On Mon, 6 Oct 2014, Alex Gartrell wrote:

> Hey Julian,
> 
> I've sent an updated patch that does this but I have some questions about
> other stuff that I find mildly confusing.  Specifically I didn't realize until
> looking at the call sites that !dest || daddr = dest->addr.ip (but maybe I'm
> wrong?)
> 
> If that's the case, why do we have the following line in __ip_vs_get_out_rt?
> 
>                 daddr = dest->addr.ip;

	Extra line that is not needed...

> If that's /always/ true then we should add the following line or a comment to
> the same effect to clarify
> 
> 	BUG_ON(dest && dest->addr.ip != daddr);

	IMHO, BUG_ON is not needed.

> If that's not intended to always be true, then should the patch be the
> following?
> 
> 	...%pI4", dest ? &dest->addr.ip : &daddr);

	Using daddr is fine.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: [PATCH net-next] net: validate_xmit_vlan() is static
From: Josh Triplett @ 2014-10-06 19:07 UTC (permalink / raw)
  To: Joe Perches
  Cc: Eric Dumazet, David Miller, netdev, Julia Lawall, Dan Carpenter
In-Reply-To: <1412621042.2916.29.camel@joe-AO725>

On Mon, Oct 06, 2014 at 11:44:02AM -0700, Joe Perches wrote:
> On Mon, 2014-10-06 at 11:26 -0700, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> > 
> > Marking this as static allows compiler to inline it.
> 
> Found by inspection or another tool?
> 
> Wasn't there some tool to look for non-static functions
> that are not called externally that could/should be
> converted to static?

Several: sparse, gcc -Wmissing-prototypes, and findstatic.pl.

^ permalink raw reply

* Re: [net 0/8] gianfar: ARM port driver updates (1/2)
From: David Miller @ 2014-10-06 19:07 UTC (permalink / raw)
  To: claudiu.manoil; +Cc: netdev, Li.Xiubo, Shruti
In-Reply-To: <54324AFE.1040803@freescale.com>

From: Claudiu Manoil <claudiu.manoil@freescale.com>
Date: Mon, 6 Oct 2014 10:55:42 +0300

> On 10/6/2014 4:27 AM, David Miller wrote:
>> From: Claudiu Manoil <claudiu.manoil@freescale.com>
>> Date: Fri, 3 Oct 2014 19:02:41 +0300
>>
>>> This is the first round of driver protability fixes and clean-up
>>> with the main purpose to make gianfar portable on ARM, for the ARM
>>> based SoC that integrates the eTSEC ethernet controller - "ls1021a".
>>> The patches primarily address compile time errors, when compiling
>>> gianfar on ARM.  They replace PPC specific functions and macros
>>> with architecture independent ones, solve arch specific header
>>> inclusions, guard code that relates to PPC only, and even address
>>> some simple endianess issues (see MAC address setup patch).
>>> The patches addressing the bulk of remaining endianess issues,
>>> like handling DMA fields (BD and FCB), will follow with the sencond
>>> round.
>>> These patches were verified on the ls1021a SoC.
>>
>> If more endianness fixes are necessary and "will follow with the
>> second round", I do not see how you could have verified specifically
>> these changes on the ls1021a.
>>
> 
> Hi David,
> 
> What I did is to split the initial patchset in 2, to ease up the
> review
> process.
> This first part is fairly straightforward, these patches make
> localized
> code changes and can be more easily ported among different kernel
> versions.  The second part has fewer patches but touches more code,
> because it handles endianess conversions for all the reads/writes to
> the buffer descriptors.  Please let me now if you have objections to
> this approach.
> As for testing, we have our internal kernel tree for ARM supporting
> ls1021a, and these gianfar patches have been there for a while and
> tested.  Now it's time to upstream (a cleaned-up version of) them.
> (see git.freescale.com/git/cgit.cgi/layerscape/ls1021a/linux.git/)
> Please note that the current (upstream) net tree does not include the
> support for ls1021a (which is to be propagated via the arm tree).

I'm merely saying that it's inaccurate to say that you "verified" this
specific patch series on that chip, when in fact the second upcoming
series is necessary as well for the driver to work on that chip properly.

^ permalink raw reply

* Re: [PATCH 08/16] virtio_net: drop config_enable
From: David Miller @ 2014-10-06 19:02 UTC (permalink / raw)
  To: mst; +Cc: netdev, linux-kernel, virtualization
In-Reply-To: <1412525038-15871-9-git-send-email-mst@redhat.com>

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Sun, 5 Oct 2014 19:07:13 +0300

> Now that virtio core ensures config changes don't arrive during probing,
> drop config_enable flag in virtio net.
> On removal, flush is now sufficient to guarantee that no change work is
> queued.
> 
> This help simplify the driver, and will allow setting DRIVER_OK earlier
> without losing config change notifications.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

It's hard for people on the networking side to review these changes
since you haven't CC:'d them on any of the postings necessary to
understand the context of the net/ and drivers/net/ changes.

Please at a minimum CC: everyone on your header [PATCH 0/N] posting
so we know at least at a high level what is going on, and why.

Thanks.

^ permalink raw reply

* Re: [PATCH net-next] ipvs: Avoid null-pointer deref in debug code
From: Julian Anastasov @ 2014-10-06 19:01 UTC (permalink / raw)
  To: Alex Gartrell; +Cc: horms, dan.carpenter, lvs-devel, netdev, kernel-team
In-Reply-To: <1412610379-6295-1-git-send-email-agartrell@fb.com>


	Hello,

On Mon, 6 Oct 2014, Alex Gartrell wrote:

> Use daddr instead of reaching into dest.
> 
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Alex Gartrell <agartrell@fb.com>

	Thanks!

Acked-by: Julian Anastasov <ja@ssi.bg>

> ---
>  net/netfilter/ipvs/ip_vs_xmit.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
> index 91f17c1..437a366 100644
> --- a/net/netfilter/ipvs/ip_vs_xmit.c
> +++ b/net/netfilter/ipvs/ip_vs_xmit.c
> @@ -316,7 +316,7 @@ __ip_vs_get_out_rt(int skb_af, struct sk_buff *skb, struct ip_vs_dest *dest,
>  	if (unlikely(crosses_local_route_boundary(skb_af, skb, rt_mode,
>  						  local))) {
>  		IP_VS_DBG_RL("We are crossing local and non-local addresses"
> -			     " daddr=%pI4\n", &dest->addr.ip);
> +			     " daddr=%pI4\n", &daddr);
>  		goto err_put;
>  	}
>  
> @@ -458,7 +458,7 @@ __ip_vs_get_out_rt_v6(int skb_af, struct sk_buff *skb, struct ip_vs_dest *dest,
>  	if (unlikely(crosses_local_route_boundary(skb_af, skb, rt_mode,
>  						  local))) {
>  		IP_VS_DBG_RL("We are crossing local and non-local addresses"
> -			     " daddr=%pI6\n", &dest->addr.in6);
> +			     " daddr=%pI6\n", daddr);
>  		goto err_put;
>  	}
>  
> -- 
> Alex Gartrell <agartrell@fb.com>

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: [PATCH net-next] net: validate_xmit_vlan() is static
From: Joe Perches @ 2014-10-06 18:44 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, netdev, Julia Lawall, Dan Carpenter, Josh Triplett
In-Reply-To: <1412619987.11091.76.camel@edumazet-glaptop2.roam.corp.google.com>

On Mon, 2014-10-06 at 11:26 -0700, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Marking this as static allows compiler to inline it.

Found by inspection or another tool?

Wasn't there some tool to look for non-static functions
that are not called externally that could/should be
converted to static?

^ permalink raw reply

* Re: [Xen-devel] [PATCHv1] xen-netfront: always keep the Rx ring full of requests
From: annie li @ 2014-10-06 18:41 UTC (permalink / raw)
  To: David Vrabel; +Cc: netdev, Boris Ostrovsky, xen-devel
In-Reply-To: <5432BC96.6060709@citrix.com>


On 2014/10/6 12:00, David Vrabel wrote:
>>>    +    queue->rx.req_prod_pvt = req_prod;
>>> +
>>> +    /* Not enough requests? Try again later. */
>>> +    if (req_prod - queue->rx.rsp_cons < NET_RX_SLOTS_MIN) {
>>> +        mod_timer(&queue->rx_refill_timer, jiffies + (HZ/10));
>>> +        return;
>> If the previous for loop breaks because of failure of
>> xennet_alloc_one_rx_buffer, then notify_remote_via_irq is missed here if
>> the code returns directly.
> This is deliberate -- there's no point notifying the backend if there
> aren't enough requests for the next packet.  Since we don't know what
> the next packet might be we assume it's the largest possible.
That makes sense.
However, the largest packet case does not happen so frequently. 
Moreover, netback checks the slots every incoming skb requires in 
xenvif_rx_ring_slots_available, not only concerning the largest case.

Thanks
Annie

^ permalink raw reply

* [PATCH v2 net-next] net: phy: adjust fixed_phy_register() return value
From: Petri Gynther @ 2014-10-06 18:38 UTC (permalink / raw)
  To: netdev; +Cc: davem, f.fainelli, thomas.petazzoni

Adjust fixed_phy_register() to return struct phy_device *, so that
it becomes easy to use fixed PHYs without device tree support:

  phydev = fixed_phy_register(PHY_POLL, &fixed_phy_status, NULL);
  fixed_phy_set_link_update(phydev, fixed_phy_link_update);
  phy_connect_direct(netdev, phydev, handler_fn, phy_interface);

This change is a prerequisite for modifying bcmgenet driver to work
without a device tree on Broadcom's MIPS-based 7xxx platforms.

Signed-off-by: Petri Gynther <pgynther@google.com>
---
 drivers/net/phy/fixed.c   | 16 ++++++++--------
 drivers/of/of_mdio.c      |  7 +++++--
 include/linux/phy_fixed.h | 14 +++++++-------
 3 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/drivers/net/phy/fixed.c b/drivers/net/phy/fixed.c
index 5b19fbb..47872caa 100644
--- a/drivers/net/phy/fixed.c
+++ b/drivers/net/phy/fixed.c
@@ -233,9 +233,9 @@ EXPORT_SYMBOL_GPL(fixed_phy_del);
 static int phy_fixed_addr;
 static DEFINE_SPINLOCK(phy_fixed_addr_lock);
 
-int fixed_phy_register(unsigned int irq,
-		       struct fixed_phy_status *status,
-		       struct device_node *np)
+struct phy_device *fixed_phy_register(unsigned int irq,
+				      struct fixed_phy_status *status,
+				      struct device_node *np)
 {
 	struct fixed_mdio_bus *fmb = &platform_fmb;
 	struct phy_device *phy;
@@ -246,19 +246,19 @@ int fixed_phy_register(unsigned int irq,
 	spin_lock(&phy_fixed_addr_lock);
 	if (phy_fixed_addr == PHY_MAX_ADDR) {
 		spin_unlock(&phy_fixed_addr_lock);
-		return -ENOSPC;
+		return ERR_PTR(-ENOSPC);
 	}
 	phy_addr = phy_fixed_addr++;
 	spin_unlock(&phy_fixed_addr_lock);
 
 	ret = fixed_phy_add(PHY_POLL, phy_addr, status);
 	if (ret < 0)
-		return ret;
+		return ERR_PTR(ret);
 
 	phy = get_phy_device(fmb->mii_bus, phy_addr, false);
 	if (!phy || IS_ERR(phy)) {
 		fixed_phy_del(phy_addr);
-		return -EINVAL;
+		return ERR_PTR(-EINVAL);
 	}
 
 	of_node_get(np);
@@ -269,10 +269,10 @@ int fixed_phy_register(unsigned int irq,
 		phy_device_free(phy);
 		of_node_put(np);
 		fixed_phy_del(phy_addr);
-		return ret;
+		return ERR_PTR(ret);
 	}
 
-	return 0;
+	return phy;
 }
 
 static int __init fixed_mdio_bus_init(void)
diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c
index a85d800..1bd4305 100644
--- a/drivers/of/of_mdio.c
+++ b/drivers/of/of_mdio.c
@@ -286,6 +286,7 @@ int of_phy_register_fixed_link(struct device_node *np)
 	struct device_node *fixed_link_node;
 	const __be32 *fixed_link_prop;
 	int len;
+	struct phy_device *phy;
 
 	/* New binding */
 	fixed_link_node = of_get_child_by_name(np, "fixed-link");
@@ -299,7 +300,8 @@ int of_phy_register_fixed_link(struct device_node *np)
 		status.asym_pause = of_property_read_bool(fixed_link_node,
 							  "asym-pause");
 		of_node_put(fixed_link_node);
-		return fixed_phy_register(PHY_POLL, &status, np);
+		phy = fixed_phy_register(PHY_POLL, &status, np);
+		return IS_ERR(phy) ? PTR_ERR(phy) : 0;
 	}
 
 	/* Old binding */
@@ -310,7 +312,8 @@ int of_phy_register_fixed_link(struct device_node *np)
 		status.speed = be32_to_cpu(fixed_link_prop[2]);
 		status.pause = be32_to_cpu(fixed_link_prop[3]);
 		status.asym_pause = be32_to_cpu(fixed_link_prop[4]);
-		return fixed_phy_register(PHY_POLL, &status, np);
+		phy = fixed_phy_register(PHY_POLL, &status, np);
+		return IS_ERR(phy) ? PTR_ERR(phy) : 0;
 	}
 
 	return -ENODEV;
diff --git a/include/linux/phy_fixed.h b/include/linux/phy_fixed.h
index 9411386..f2ca1b4 100644
--- a/include/linux/phy_fixed.h
+++ b/include/linux/phy_fixed.h
@@ -14,9 +14,9 @@ struct device_node;
 #ifdef CONFIG_FIXED_PHY
 extern int fixed_phy_add(unsigned int irq, int phy_id,
 			 struct fixed_phy_status *status);
-extern int fixed_phy_register(unsigned int irq,
-			      struct fixed_phy_status *status,
-			      struct device_node *np);
+extern struct phy_device *fixed_phy_register(unsigned int irq,
+					     struct fixed_phy_status *status,
+					     struct device_node *np);
 extern void fixed_phy_del(int phy_addr);
 extern int fixed_phy_set_link_update(struct phy_device *phydev,
 			int (*link_update)(struct net_device *,
@@ -27,11 +27,11 @@ static inline int fixed_phy_add(unsigned int irq, int phy_id,
 {
 	return -ENODEV;
 }
-static inline int fixed_phy_register(unsigned int irq,
-				     struct fixed_phy_status *status,
-				     struct device_node *np)
+static inline struct phy_device *fixed_phy_register(unsigned int irq,
+						struct fixed_phy_status *status,
+						struct device_node *np)
 {
-	return -ENODEV;
+	return ERR_PTR(-ENODEV);
 }
 static inline int fixed_phy_del(int phy_addr)
 {
-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply related

* Re: [PATCH net-next 5/5] ipv6: don't walk node's leaf during serial number update
From: Cong Wang @ 2014-10-06 18:27 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, hideaki, kafai
In-Reply-To: <1412618574.3403.29.camel@localhost>

On Mon, Oct 6, 2014 at 11:02 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On Mo, 2014-10-06 at 10:58 -0700, Cong Wang wrote:
>> On Mon, Oct 6, 2014 at 1:52 AM, Hannes Frederic Sowa
>> <hannes@stressinduktion.org> wrote:
>> > @@ -105,6 +106,10 @@ static int fib6_new_sernum(struct net *net)
>> >         return new;
>> >  }
>> >
>> > +enum {
>> > +       FIB6_NO_SERNUM_CHANGE = 0,
>> > +};
>> > +
>>
>> Not sure if it worth an enum definition... seems overkill for me.
>
> Yeah, maybe, I was used to do it like that in user space because of
> debuggers.
>
> I think it is ok, also I just send v2 with your proposed changes.
> If you have a strong opinion about that, let me know. ;)

No, I don't. :)

^ permalink raw reply

* [PATCH net-next] net: validate_xmit_vlan() is static
From: Eric Dumazet @ 2014-10-06 18:26 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

Marking this as static allows compiler to inline it.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/dev.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 7d5691cc1f479ee4f1bb46b06d35fe8c..2702724ce2de5712e13c37321597a314 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2643,7 +2643,8 @@ out:
 	return skb;
 }
 
-struct sk_buff *validate_xmit_vlan(struct sk_buff *skb, netdev_features_t features)
+static struct sk_buff *validate_xmit_vlan(struct sk_buff *skb,
+					  netdev_features_t features)
 {
 	if (vlan_tx_tag_present(skb) &&
 	    !vlan_hw_offload_capable(features, skb->vlan_proto)) {

^ permalink raw reply related

* Re: [PATCH V2 linux-next] net: fix rcu access on phonet_routes
From: Eric Dumazet @ 2014-10-06 18:23 UTC (permalink / raw)
  To: Fabian Frederick
  Cc: linux-kernel, Eric Dumazet, Josh Triplett, Remi Denis-Courmont,
	David S. Miller, netdev
In-Reply-To: <1412619321-2212-1-git-send-email-fabf@skynet.be>

On Mon, 2014-10-06 at 20:15 +0200, Fabian Frederick wrote:
> -Add __rcu annotation on table to fix sparse warnings:
> net/phonet/pn_dev.c:279:25: warning: incorrect type in assignment (different address spaces)
> net/phonet/pn_dev.c:279:25:    expected struct net_device *<noident>
> net/phonet/pn_dev.c:279:25:    got void [noderef] <asn:4>*<noident>
> net/phonet/pn_dev.c:376:17: warning: incorrect type in assignment (different address spaces)
> net/phonet/pn_dev.c:376:17:    expected struct net_device *volatile <noident>
> net/phonet/pn_dev.c:376:17:    got struct net_device [noderef] <asn:4>*<noident>
> net/phonet/pn_dev.c:392:17: warning: incorrect type in assignment (different address spaces)
> net/phonet/pn_dev.c:392:17:    expected struct net_device *<noident>
> net/phonet/pn_dev.c:392:17:    got void [noderef] <asn:4>*<noident>
> 
> -Access table with rcu_access_pointer (fixes the following sparse errors):
> net/phonet/pn_dev.c:278:25: error: incompatible types in comparison expression (different address spaces)
> net/phonet/pn_dev.c:391:17: error: incompatible types in comparison expression (different address spaces)
> 
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Fabian Frederick <fabf@skynet.be>
> ---
> V2: use rcu_access_pointer instead of rcu_dereference out of rcu_read_lock context
>     (suggested by Eric Dumazet).

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply

* [PATCH V2 linux-next] net: fix rcu access on phonet_routes
From: Fabian Frederick @ 2014-10-06 18:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: Eric Dumazet, Josh Triplett, Fabian Frederick,
	Remi Denis-Courmont, David S. Miller, netdev

-Add __rcu annotation on table to fix sparse warnings:
net/phonet/pn_dev.c:279:25: warning: incorrect type in assignment (different address spaces)
net/phonet/pn_dev.c:279:25:    expected struct net_device *<noident>
net/phonet/pn_dev.c:279:25:    got void [noderef] <asn:4>*<noident>
net/phonet/pn_dev.c:376:17: warning: incorrect type in assignment (different address spaces)
net/phonet/pn_dev.c:376:17:    expected struct net_device *volatile <noident>
net/phonet/pn_dev.c:376:17:    got struct net_device [noderef] <asn:4>*<noident>
net/phonet/pn_dev.c:392:17: warning: incorrect type in assignment (different address spaces)
net/phonet/pn_dev.c:392:17:    expected struct net_device *<noident>
net/phonet/pn_dev.c:392:17:    got void [noderef] <asn:4>*<noident>

-Access table with rcu_access_pointer (fixes the following sparse errors):
net/phonet/pn_dev.c:278:25: error: incompatible types in comparison expression (different address spaces)
net/phonet/pn_dev.c:391:17: error: incompatible types in comparison expression (different address spaces)

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
---
V2: use rcu_access_pointer instead of rcu_dereference out of rcu_read_lock context
    (suggested by Eric Dumazet).

 net/phonet/pn_dev.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/phonet/pn_dev.c b/net/phonet/pn_dev.c
index 56a6146..a586800 100644
--- a/net/phonet/pn_dev.c
+++ b/net/phonet/pn_dev.c
@@ -36,7 +36,7 @@
 
 struct phonet_routes {
 	struct mutex		lock;
-	struct net_device	*table[64];
+	struct net_device __rcu	*table[64];
 };
 
 struct phonet_net {
@@ -275,7 +275,7 @@ static void phonet_route_autodel(struct net_device *dev)
 	bitmap_zero(deleted, 64);
 	mutex_lock(&pnn->routes.lock);
 	for (i = 0; i < 64; i++)
-		if (dev == pnn->routes.table[i]) {
+		if (rcu_access_pointer(pnn->routes.table[i]) == dev) {
 			RCU_INIT_POINTER(pnn->routes.table[i], NULL);
 			set_bit(i, deleted);
 		}
@@ -388,7 +388,7 @@ int phonet_route_del(struct net_device *dev, u8 daddr)
 
 	daddr = daddr >> 2;
 	mutex_lock(&routes->lock);
-	if (dev == routes->table[daddr])
+	if (rcu_access_pointer(routes->table[daddr]) == dev)
 		RCU_INIT_POINTER(routes->table[daddr], NULL);
 	else
 		dev = NULL;
-- 
1.9.3

^ permalink raw reply related

* Re: randconfig build error with next-20141001, in drivers/i2c/algos/i2c-algo-bit.c
From: Randy Dunlap @ 2014-10-06 18:09 UTC (permalink / raw)
  To: Oliver Hartkopp, Jim Davis, Stephen Rothwell
  Cc: linux-next, Stephane Grosjean, linux-i2c, netdev@vger.kernel.org,
	linux-can
In-Reply-To: <5432D3E6.9020805@hartkopp.net>

On 10/06/14 10:39, Oliver Hartkopp wrote:
> 
> 
> On 10/06/2014 06:52 PM, Randy Dunlap wrote:
>> On 10/06/14 01:06, Oliver Hartkopp wrote:
>>> Hello all,
>>>
>>> just to get it right:
>>>
>>> So far it looks like this in linux/drivers/net/can/sja1000/Kconfig
>>>
>>> config CAN_PEAK_PCIEC
>>>         bool "PEAK PCAN-ExpressCard Cards"
>>>         depends on CAN_PEAK_PCI
>>>         select I2C
>>>         select I2C_ALGOBIT
>>>
>>> If one would change the
>>>
>>>         select I2C
>>>
>>> into
>>>
>>>         depends on I2C
>>>
>>> IMHO the CAN_PEAK_PCIEC hardware would *only* be visible and selectable when
>>> I2C was selected before (from anyone else?).
>>
>> That is correct.
>>
>>> So what it wrong on the current Kconfig entry?
>>> Is 'select' deprecated?
>>
>> No, it's not deprecated.  It's just dangerous.  and driver configs should not
>> enable entire subsystems via 'select'.
>>
>>> Or did randconfig generate a configuration that would not be possible by
>>> properly generating the config file with 'make menuconfig' ??
>>
>> randconfig generated a config for another driver which causes a build error,
>> not for a CAN driver.  The CAN driver does not have a build error AFAIK.
>> Its Kconfig is just doing something with a very big & ugly stick.
> 
> But when it is not done like this, we might have an invisible config option in
> the corner case that I2C is not enabled by anyone else.
> 
> So what would you propose then?
> 
> AFAICS there is 'just' a style problem as 'configs should not enable entire
> subsystems'. But it finally is a correct and valid Kconfig, right?

Yes, right.

> When I2C is already enabled - fine. If (unlikely) I2C is not enabled, we need
> to pull the ugly stick. So what is dangerous on this? Was there any misuse of
> select statements before?

No syntactic misuse, more of a style thing, like you say.

The danger is in select being a big stick that does not check for symbol
dependencies.

In the unlikely case that I2C is not enabled, the user should have to enable
it instead of a solitary driver enabling it.  IOW, if a subsystem is disabled,
the user probably wanted it that way and a single driver should not override
that setting.


-- 
~Randy

^ permalink raw reply

* Re: [PATCH net-next 5/5] ipv6: don't walk node's leaf during serial number update
From: Hannes Frederic Sowa @ 2014-10-06 18:02 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, hideaki, kafai
In-Reply-To: <CAHA+R7O2GYo6vTpm9uYPVE+jamdYRPEZo5301=NnL+Zw7vmxpg@mail.gmail.com>

On Mo, 2014-10-06 at 10:58 -0700, Cong Wang wrote:
> On Mon, Oct 6, 2014 at 1:52 AM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
> > @@ -105,6 +106,10 @@ static int fib6_new_sernum(struct net *net)
> >         return new;
> >  }
> >
> > +enum {
> > +       FIB6_NO_SERNUM_CHANGE = 0,
> > +};
> > +
> 
> Not sure if it worth an enum definition... seems overkill for me.

Yeah, maybe, I was used to do it like that in user space because of
debuggers.

I think it is ok, also I just send v2 with your proposed changes.
If you have a strong opinion about that, let me know. ;)

Thanks a lot for the review,
Hannes

^ permalink raw reply

* Re: [PATCH] net: Add ndo_gso_check
From: Tom Herbert @ 2014-10-06 17:59 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Alexander Duyck, John Fastabend, Jeff Kirsher, David Miller,
	Linux Netdev List, Thomas Graf, Pravin Shelar, Andy Zhou
In-Reply-To: <CAJ3xEMgPb9BhvtifmTzy6bS9WDf3jUx6dnx4418KNTjGNPreHQ@mail.gmail.com>

On Sun, Oct 5, 2014 at 12:13 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Sun, Oct 5, 2014 at 9:49 PM, Tom Herbert <therbert@google.com> wrote:
>> On Sun, Oct 5, 2014 at 7:04 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>> On Thu, Oct 2, 2014 at 2:06 AM, Tom Herbert <therbert@google.com> wrote:
>>>> On Wed, Oct 1, 2014 at 1:58 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>>>> On Tue, Sep 30, 2014 at 6:34 PM, Tom Herbert <therbert@google.com> wrote:
>>>>>> On Tue, Sep 30, 2014 at 7:30 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>> [...]
>>>> Solution #4: apply this patch and implement the check functions as
>>>> needed in those 4 or 5 drivers. If a device can only do VXLAN/NVGRE
>>>> then I believe the check function is something like:
>>>>
>>>> bool mydev_gso_check(struct sk_buff *skb, struct net_device *dev)
>>>> {
>>>>         if ((skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL) &&
>>>>             ((skb->inner_protocol_type != ENCAP_TYPE_ETHER ||
>>>>               skb->protocol != htons(ETH_P_TEB) ||
>>>>               skb_inner_mac_header(skb) - skb_transport_header(skb) != 12)
>>>>                 return false;
>>>>
>>>>         return true;
>>>> }
>>>
>>> Yep, such helper can can be basically made to work and let the 4-5
>>> drivers that can
>>> do GSO offloading for vxlan but not for any FOU/GUE packets signal
>>> that to the stack.
>>>
>>> Re the 12 constant, you were referring to the udp+vxlan headers? it's 8+8
>>>
>>> Also, we need a way for drivers that can support VXLAN or NVGRE but
>>> not concurrently
>>> on the same port @ the same time to only let vxlan packet to pass
>>> successfully through the helper.
>
>> Or, there should be no difference in GSO processing between VXLAN and
>> NVGRE. Can you explain why you feel you need to differentiate them for GSO?
>
>
> RX wise, Linux tells the driver that UDP port X would be used for
> VXLAN, right? and indeed, it's possible for some HW implementations
> not to support RX offloading (checksum) for both VXLAN and NVGRE @ the
> same time over the same port. But TX/GRO wise, you're probably
> correct. The thing is that from the user POV they need solution that
> works for both RX and TX offloading.

I think from a user POV we want a solution that supports RX and TX
offloading across the widest range of protocols. This is accomplished
by implementing protocol agnostic mechanisms like CHECKSUM_COMPLETE
and protocol agnostic UDP tunnel TSO like we've described. IMO, the
fact that we have devices that implement protocol specific mechanisms
for NVGRE and VXLAN should be considered legacy support in the stack,
for new UDP encapsulation protocols we should not expose specifics in
the stack in either by adding a GSO type for each protocol, nor
ndo_add_foo_port for each protocol-- these things will not scale and
unnecessarily complicate the core stack.

^ permalink raw reply

* [PATCH v2 net-next 5/5] ipv6: don't walk node's leaf during serial number update
From: Hannes Frederic Sowa @ 2014-10-06 17:58 UTC (permalink / raw)
  To: netdev; +Cc: hideaki, kafai, cwang
In-Reply-To: <cover.1412618014.git.hannes@stressinduktion.org>

Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv6/ip6_fib.c | 47 ++++++++++++++++++++++++++++++-----------------
 1 file changed, 30 insertions(+), 17 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 6f9beb1..b2d1838 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -50,6 +50,7 @@ struct fib6_cleaner {
 	struct fib6_walker w;
 	struct net *net;
 	int (*func)(struct rt6_info *, void *arg);
+	int sernum;
 	void *arg;
 };
 
@@ -105,6 +106,10 @@ static int fib6_new_sernum(struct net *net)
 	return new;
 }
 
+enum {
+	FIB6_NO_SERNUM_CHANGE = 0,
+};
+
 /*
  *	Auxiliary address test functions for the radix tree.
  *
@@ -1514,6 +1519,16 @@ static int fib6_clean_node(struct fib6_walker *w)
 		.nl_net = c->net,
 	};
 
+	if (c->sernum != FIB6_NO_SERNUM_CHANGE &&
+	    w->node->fn_sernum != c->sernum)
+		w->node->fn_sernum = c->sernum;
+
+	if (!c->func) {
+		WARN_ON_ONCE(c->sernum == FIB6_NO_SERNUM_CHANGE);
+		w->leaf = NULL;
+		return 0;
+	}
+
 	for (rt = w->leaf; rt; rt = rt->dst.rt6_next) {
 		res = c->func(rt, c->arg);
 		if (res < 0) {
@@ -1547,7 +1562,7 @@ static int fib6_clean_node(struct fib6_walker *w)
 
 static void fib6_clean_tree(struct net *net, struct fib6_node *root,
 			    int (*func)(struct rt6_info *, void *arg),
-			    bool prune, void *arg)
+			    bool prune, int sernum, void *arg)
 {
 	struct fib6_cleaner c;
 
@@ -1557,14 +1572,16 @@ static void fib6_clean_tree(struct net *net, struct fib6_node *root,
 	c.w.count = 0;
 	c.w.skip = 0;
 	c.func = func;
+	c.sernum = sernum;
 	c.arg = arg;
 	c.net = net;
 
 	fib6_walk(&c.w);
 }
 
-void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
-		    void *arg)
+static void __fib6_clean_all(struct net *net,
+			     int (*func)(struct rt6_info *, void *),
+			     int sernum, void *arg)
 {
 	struct fib6_table *table;
 	struct hlist_head *head;
@@ -1576,13 +1593,19 @@ void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
 		hlist_for_each_entry_rcu(table, head, tb6_hlist) {
 			write_lock_bh(&table->tb6_lock);
 			fib6_clean_tree(net, &table->tb6_root,
-					func, false, arg);
+					func, false, sernum, arg);
 			write_unlock_bh(&table->tb6_lock);
 		}
 	}
 	rcu_read_unlock();
 }
 
+void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *),
+		    void *arg)
+{
+	__fib6_clean_all(net, func, FIB6_NO_SERNUM_CHANGE, arg);
+}
+
 static int fib6_prune_clone(struct rt6_info *rt, void *arg)
 {
 	if (rt->rt6i_flags & RTF_CACHE) {
@@ -1595,25 +1618,15 @@ static int fib6_prune_clone(struct rt6_info *rt, void *arg)
 
 static void fib6_prune_clones(struct net *net, struct fib6_node *fn)
 {
-	fib6_clean_tree(net, fn, fib6_prune_clone, true, NULL);
-}
-
-static int fib6_update_sernum(struct rt6_info *rt, void *arg)
-{
-	int sernum = *(int *)arg;
-
-	if (rt->rt6i_node &&
-	    rt->rt6i_node->fn_sernum != sernum)
-		rt->rt6i_node->fn_sernum = sernum;
-
-	return 0;
+	fib6_clean_tree(net, fn, fib6_prune_clone, true,
+			FIB6_NO_SERNUM_CHANGE, NULL);
 }
 
 static void fib6_flush_trees(struct net *net)
 {
 	int new_sernum = fib6_new_sernum(net);
 
-	fib6_clean_all(net, fib6_update_sernum, &new_sernum);
+	__fib6_clean_all(net, NULL, new_sernum, NULL);
 }
 
 /*
-- 
1.9.3

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox