Netdev List
* [PATCH 2/5] net: xdp: add invalid buffer warning
From: John Fastabend @ 2016-11-18 18:59 UTC (permalink / raw)
  To: tgraf, shm, alexei.starovoitov, daniel, davem
  Cc: john.r.fastabend, netdev, bblanco, john.fastabend, brouer
In-Reply-To: <20161118185517.16137.92123.stgit@john-Precision-Tower-5810>

This adds a warning for drivers to use when they encounter an invalid
buffer for XDP. In normal cases this should not happen, but a standard
warning is useful for catching unexpected behavior from the emulation
layer in virtual/qemu setups.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 include/linux/filter.h |    1 +
 net/core/filter.c      |    6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 1f09c52..0c79004 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -595,6 +595,7 @@ int sk_get_filter(struct sock *sk, struct sock_filter __user *filter,
 struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
 				       const struct bpf_insn *patch, u32 len);
 void bpf_warn_invalid_xdp_action(u32 act);
+void bpf_warn_invalid_xdp_buffer(void);
 
 #ifdef CONFIG_BPF_JIT
 extern int bpf_jit_enable;
diff --git a/net/core/filter.c b/net/core/filter.c
index cd9e2ba..b8fb57c 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2722,6 +2722,12 @@ void bpf_warn_invalid_xdp_action(u32 act)
 }
 EXPORT_SYMBOL_GPL(bpf_warn_invalid_xdp_action);
 
+void bpf_warn_invalid_xdp_buffer(void)
+{
+	WARN_ONCE(1, "Illegal XDP buffer encountered, expect packet loss\n");
+}
+EXPORT_SYMBOL_GPL(bpf_warn_invalid_xdp_buffer);
+
 static u32 sk_filter_convert_ctx_access(enum bpf_access_type type, int dst_reg,
 					int src_reg, int ctx_off,
 					struct bpf_insn *insn_buf,
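
For readers outside the kernel tree, here is a minimal userspace sketch of
the "warn only once" behaviour the helper relies on. It is not the kernel
implementation: the real WARN_ONCE() also dumps a stack trace, and the
function name below is made up for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Userspace stand-in for a WARN_ONCE()-style helper (hypothetical name).
 * Prints the message on the first call only.
 * Returns true if this call printed the warning.
 */
static bool warn_invalid_xdp_buffer_once(void)
{
	static bool warned;	/* zero-initialised, persists across calls */

	if (warned)
		return false;
	warned = true;
	fprintf(stderr, "Illegal XDP buffer encountered, expect packet loss\n");
	return true;
}
```

A driver receive path would call this each time it sees a bad buffer and
still log a single line, however many packets are affected.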

^ permalink raw reply related

* [PATCH 1/5] net: virtio dynamically disable/enable LRO
From: John Fastabend @ 2016-11-18 18:59 UTC (permalink / raw)
  To: tgraf, shm, alexei.starovoitov, daniel, davem
  Cc: john.r.fastabend, netdev, bblanco, john.fastabend, brouer
In-Reply-To: <20161118185517.16137.92123.stgit@john-Precision-Tower-5810>

This adds support for dynamically setting the LRO feature flag. The
message to control guest features in the backend uses the
CTRL_GUEST_OFFLOADS msg type.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/virtio_net.c |   43 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 2cafd12..0758cae 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1419,6 +1419,41 @@ static void virtnet_init_settings(struct net_device *dev)
 	.set_settings = virtnet_set_settings,
 };
 
+static int virtnet_set_features(struct net_device *netdev,
+				netdev_features_t features)
+{
+	struct virtnet_info *vi = netdev_priv(netdev);
+	struct virtio_device *vdev = vi->vdev;
+	struct scatterlist sg;
+	u64 offloads = 0;
+
+	if (features & NETIF_F_LRO)
+		offloads |= (1 << VIRTIO_NET_F_GUEST_TSO4) |
+			    (1 << VIRTIO_NET_F_GUEST_TSO6);
+
+	if (features & NETIF_F_RXCSUM)
+		offloads |= (1 << VIRTIO_NET_F_GUEST_CSUM);
+
+	if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS)) {
+		sg_init_one(&sg, &offloads, sizeof(uint64_t));
+		if (!virtnet_send_command(vi,
+					  VIRTIO_NET_CTRL_GUEST_OFFLOADS,
+					  VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET,
+					  &sg)) {
+			dev_warn(&netdev->dev,
+				 "Failed to set guest offloads by virtnet command.\n");
+			return -EINVAL;
+		}
+	} else if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS) &&
+		   !virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) {
+		dev_warn(&netdev->dev,
+			 "No support for setting offloads pre version_1.\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static const struct net_device_ops virtnet_netdev = {
 	.ndo_open            = virtnet_open,
 	.ndo_stop   	     = virtnet_close,
@@ -1435,6 +1470,7 @@ static void virtnet_init_settings(struct net_device *dev)
 #ifdef CONFIG_NET_RX_BUSY_POLL
 	.ndo_busy_poll		= virtnet_busy_poll,
 #endif
+	.ndo_set_features	= virtnet_set_features,
 };
 
 static void virtnet_config_changed_work(struct work_struct *work)
@@ -1810,6 +1846,12 @@ static int virtnet_probe(struct virtio_device *vdev)
 	if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_CSUM))
 		dev->features |= NETIF_F_RXCSUM;
 
+	if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) &&
+	    virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO6)) {
+		dev->features |= NETIF_F_LRO;
+		dev->hw_features |= NETIF_F_LRO;
+	}
+
 	dev->vlan_features = dev->features;
 
 	/* MTU range: 68 - 65535 */
@@ -2049,6 +2091,7 @@ static int virtnet_restore(struct virtio_device *vdev)
 	VIRTIO_NET_F_CTRL_MAC_ADDR,
 	VIRTIO_F_ANY_LAYOUT,
 	VIRTIO_NET_F_MTU,
+	VIRTIO_NET_F_CTRL_GUEST_OFFLOADS,
 };
 
 static struct virtio_driver virtio_net_driver = {
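
The mapping from netdev feature flags to the guest offloads bitmap can be
sketched in userspace as below. The VIRTIO_NET_F_* bit numbers follow the
virtio-net specification; the SKETCH_F_* flags are hypothetical stand-ins
for the kernel's NETIF_F_* values, and nothing here talks to a device.

```c
#include <stdint.h>

/* Feature bit numbers as defined by the virtio-net specification. */
#define VIRTIO_NET_F_GUEST_CSUM	1
#define VIRTIO_NET_F_GUEST_TSO4	7
#define VIRTIO_NET_F_GUEST_TSO6	8

/* Hypothetical stand-ins for the kernel's NETIF_F_* flags. */
#define SKETCH_F_RXCSUM	(1u << 0)
#define SKETCH_F_LRO	(1u << 1)

/* Build the 64-bit offloads word that would go in the control message. */
static uint64_t guest_offloads_from_features(unsigned int features)
{
	uint64_t offloads = 0;

	if (features & SKETCH_F_LRO)
		offloads |= (UINT64_C(1) << VIRTIO_NET_F_GUEST_TSO4) |
			    (UINT64_C(1) << VIRTIO_NET_F_GUEST_TSO6);
	if (features & SKETCH_F_RXCSUM)
		offloads |= UINT64_C(1) << VIRTIO_NET_F_GUEST_CSUM;
	return offloads;
}
```

With LRO cleared and RXCSUM set, only the GUEST_CSUM bit remains, which is
the state XDP needs according to the cover letter.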

^ permalink raw reply related

* [PATCH 0/5] XDP for virtio_net
From: John Fastabend @ 2016-11-18 18:59 UTC (permalink / raw)
  To: tgraf, shm, alexei.starovoitov, daniel, davem
  Cc: john.r.fastabend, netdev, bblanco, john.fastabend, brouer

This implements XDP for virtio_net for the mergeable buffers and big_packet
modes. I tested this with vhost_net running on qemu and did not see
any issues.

There are some restrictions for XDP to be enabled (see patch 3 for
more details).

  1. LRO must be off
  2. MTU must be less than PAGE_SIZE
  3. queues must be available to dedicate to XDP
  4. num_bufs received in mergeable buffers must be 1
  5. big_packet mode must have all data on single page

Please review; any comments/feedback are welcome, as always.

Thanks,
John
---

John Fastabend (4):
      net: virtio dynamically disable/enable LRO
      net: xdp: add invalid buffer warning
      virtio_net: add dedicated XDP transmit queues
      virtio_net: add XDP_TX support

Shrijeet Mukherjee (1):
      virtio_net: Add XDP support


 drivers/net/virtio_net.c |  264 +++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/filter.h   |    1 
 net/core/filter.c        |    6 +
 3 files changed, 267 insertions(+), 4 deletions(-)

^ permalink raw reply

* Re: [Patch net v2] af_unix: conditionally use freezable blocking calls in read
From: David Miller @ 2016-11-18 18:59 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, dvyukov, tj, ccross, rafael.j.wysocki, hannes
In-Reply-To: <1479426926-28197-1-git-send-email-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Thu, 17 Nov 2016 15:55:26 -0800

> Commit 2b15af6f95 ("af_unix: use freezable blocking calls in read")
> converts schedule_timeout() to its freezable version, it was probably
> correct at that time, but later, commit 2b514574f7e8
> ("net: af_unix: implement splice for stream af_unix sockets") breaks
> the strong requirement for a freezable sleep, according to
> commit 0f9548ca1091:
> 
>     We shouldn't try_to_freeze if locks are held.  Holding a lock can cause a
>     deadlock if the lock is later acquired in the suspend or hibernate path
>     (e.g.  by dpm).  Holding a lock can also cause a deadlock in the case of
>     cgroup_freezer if a lock is held inside a frozen cgroup that is later
>     acquired by a process outside that group.
> 
> The pipe_lock is still held at that point.
> 
> So use freezable version only for the recvmsg call path, avoid impact for
> Android.
> 
> Fixes: 2b514574f7e8 ("net: af_unix: implement splice for stream af_unix sockets")
> Reported-by: Dmitry Vyukov <dvyukov@google.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Colin Cross <ccross@android.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH v2 net-next] lan78xx: relocate mdix setting to phy driver
From: David Miller @ 2016-11-18 18:57 UTC (permalink / raw)
  To: Woojung.Huh; +Cc: f.fainelli, netdev, andrew, UNGLinuxDriver
In-Reply-To: <9235D6609DB808459E95D78E17F2E43D40966AA4@CHN-SV-EXMX02.mchp-main.com>

From: <Woojung.Huh@microchip.com>
Date: Thu, 17 Nov 2016 22:10:02 +0000

> From: Woojung Huh <woojung.huh@microchip.com>
> 
> Relocate mdix code to phy driver to be called at config_init().
> 
> Signed-off-by: Woojung Huh <woojung.huh@microchip.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH -next] tcp: make undo_cwnd mandatory for congestion modules
From: Florian Westphal @ 2016-11-18 18:54 UTC (permalink / raw)
  To: David Miller; +Cc: fw, netdev, edumazet, ycheng, ncardwell
In-Reply-To: <20161118.134304.1574614508508170507.davem@davemloft.net>

David Miller <davem@davemloft.net> wrote:
> From: Florian Westphal <fw@strlen.de>
> Date: Thu, 17 Nov 2016 13:56:51 +0100
> 
> > The undo_cwnd fallback in the stack doubles cwnd based on ssthresh,
> > which un-does reno halving behaviour.
> > 
> > It seems more appropriate to let congctl algorithms pair .ssthresh
> > and .undo_cwnd properly. Add a 'tcp_reno_undo_cwnd' function and wire it
> > up for all congestion algorithms that used to rely on the fallback.
> > 
> > highspeed, illinois, scalable, veno and yeah use 'reno undo' while their
> > .ssthresh implementation doesn't halve the slowstart threshold, this
> > might point to similar issue as the one fixed for dctcp in
> > ce6dd23329b1e ("dctcp: avoid bogus doubling of cwnd after loss").
> > 
> > Cc: Eric Dumazet <edumazet@google.com>
> > Cc: Yuchung Cheng <ycheng@google.com>
> > Cc: Neal Cardwell <ncardwell@google.com>
> > Signed-off-by: Florian Westphal <fw@strlen.de>
> 
> If you really suspect that highspeed et al. need to implement their own
> undo_cwnd instead of using the default reno fallback, I would really
> rather that this gets either fixed or explicitly marked as likely wrong
> (in an "XXX" comment or similar).

Ok, fair enough.  I am not familiar with these algorithms, I will check
what they're doing in more detail and if absolutely needed resubmit this
patch with XXX/FIXME/TODO comments added.

> Otherwise nobody is going to remember this down the road.

Agreed.

^ permalink raw reply

* Re: [PATCH net-next v4 0/5] net: Enable COMPILE_TEST for Marvell & Freescale drivers
From: David Miller @ 2016-11-18 18:54 UTC (permalink / raw)
  To: f.fainelli; +Cc: netdev, mw, arnd, gregory.clement, Shaohui.Xie, andrew
In-Reply-To: <20161117191914.11077-1-f.fainelli@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Thu, 17 Nov 2016 11:19:09 -0800

> This patch series allows building the Freescale and Marvell Ethernet
> network drivers with COMPILE_TEST.

Thanks for doing this work, this kind of thing helps me a lot.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH net v2 0/7] net: cpsw: fix leaks and probe deferral
From: David Miller @ 2016-11-18 18:49 UTC (permalink / raw)
  To: johan; +Cc: mugunthanvnm, grygorii.strashko, linux-omap, netdev, linux-kernel
In-Reply-To: <1479400804-9847-1-git-send-email-johan@kernel.org>

From: Johan Hovold <johan@kernel.org>
Date: Thu, 17 Nov 2016 17:39:57 +0100

> This series fixes a number of leaks and issues in the cpsw probe-error
> and driver-unbind paths, some of which specifically prevented deferred
> probing.
 ...
> v2
>  - Keep platform device runtime-resumed throughout probe instead of
>    resuming in the probe error path as suggested by Grygorii (patch
>    1/7).
> 
>  - Runtime-resume platform device before registering any children in
>    order to make sure it is synchronously suspended after deregistering
>    children in the error path (patch 3/7).

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH net v2 7/7] net: ethernet: ti: cpsw: fix fixed-link phy probe deferral
From: David Miller @ 2016-11-18 18:48 UTC (permalink / raw)
  To: johan; +Cc: mugunthanvnm, grygorii.strashko, linux-omap, netdev, linux-kernel
In-Reply-To: <20161117171920.GD10490@localhost>

From: Johan Hovold <johan@kernel.org>
Date: Thu, 17 Nov 2016 18:19:20 +0100

> On Thu, Nov 17, 2016 at 12:04:16PM -0500, David Miller wrote:
>> From: Johan Hovold <johan@kernel.org>
>> Date: Thu, 17 Nov 2016 17:40:04 +0100
>> 
>> > Make sure to propagate errors from of_phy_register_fixed_link() which
>> > can fail with -EPROBE_DEFER.
>> > 
>> > Fixes: 1f71e8c96fc6 ("drivers: net: cpsw: Add support for fixed-link
>> > PHY")
>> > Signed-off-by: Johan Hovold <johan@kernel.org>
>> 
>> Johan, when you update a patch within a series you must post the
>> entire series freshly to the lists, cover posting and all.
> 
> I'm quite sure that is exactly what I did. Did you only get this last
> patch out of the seven?

I ended up getting it delayed, thanks.

^ permalink raw reply

* Re: [PATCH net 0/3] mlx4 fix for shutdown flow
From: David Miller @ 2016-11-18 18:47 UTC (permalink / raw)
  To: tariqt; +Cc: netdev, eranbe, saeedm, eugenia
In-Reply-To: <1479397251-6932-1-git-send-email-tariqt@mellanox.com>

From: Tariq Toukan <tariqt@mellanox.com>
Date: Thu, 17 Nov 2016 17:40:48 +0200

> This patchset fixes an invalid reference to mdev in mlx4 shutdown flow.
> 
> In patch 1, we make sure netif_device_detach() is called from shutdown flow only,
> since we want to keep it present during a simple configuration change.
> 
> In patches 2 and 3, we add checks that were missing in:
> * dev_get_phys_port_id
> * dev_get_phys_port_name
> We check the presence of the network device before calling the driver's
> callbacks. This already exists for all other ndo's.
> 
> Series generated against net commit:
> e5f6f564fd19 bnxt: add a missing rcu synchronization

I don't like where this is going nor the precedent it is setting.

If you are taking the device into a state where it cannot be safely
accessed by ndo operations, then you _MUST_ do whatever is necessary
to make sure the device is unregistered and cannot be found in the
various global lists and tables of network devices.

This is mandatory.

And this is how we must fix these kinds of problems instead of
peppering device presence tests all over the place.  That will be
error prone and in the long term a huge maintenance burden.

I'm not applying this series, sorry.  You have to fix this properly.

^ permalink raw reply

* [PATCH] liquidio CN23XX: check if PENDING bit is clear using bitwise and
From: Colin King @ 2016-11-18 18:45 UTC (permalink / raw)
  To: Derek Chickles, Satanand Burla, Felix Manlunas, Raghu Vatsavayi,
	netdev
  Cc: linux-kernel

From: Colin Ian King <colin.king@canonical.com>

The mbox state should be bitwise anded rather than logically anded
with OCTEON_MBOX_STATE_RESPONSE_PENDING. Fix this by using the
correct & operator instead of &&.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c b/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c
index 5309384..73696b42 100644
--- a/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c
+++ b/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c
@@ -301,7 +301,7 @@ int octeon_mbox_process_message(struct octeon_mbox *mbox)
 		       sizeof(struct octeon_mbox_cmd));
 		if (!mbox_cmd.msg.s.resp_needed) {
 			mbox->state &= ~OCTEON_MBOX_STATE_REQUEST_RECEIVED;
-			if (!(mbox->state &&
+			if (!(mbox->state &
 			      OCTEON_MBOX_STATE_RESPONSE_PENDING))
 				mbox->state = OCTEON_MBOX_STATE_IDLE;
 			writeq(OCTEON_PFVFSIG, mbox->mbox_read_reg);
-- 
2.10.2
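
To see why the operator matters, here is a small standalone sketch. The
flag values are picked for illustration and are assumed, not copied from
the driver headers; the mask is passed as a parameter only so both
variants share a signature.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values (assumed, not taken from the driver). */
#define MBOX_STATE_REQUEST_RECEIVING	0x2u
#define MBOX_STATE_RESPONSE_PENDING	0x8u

/* Buggy form: "&&" only asks whether both operands are non-zero. */
static bool pending_clear_logical(uint32_t state, uint32_t mask)
{
	return !(state && mask);
}

/* Fixed form: "&" isolates the bits selected by the mask. */
static bool pending_clear_bitwise(uint32_t state, uint32_t mask)
{
	return !(state & mask);
}
```

With state == MBOX_STATE_REQUEST_RECEIVING the bitwise test reports the
PENDING bit as clear, while the logical test reports it as set because
any non-zero state makes the && expression true.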

^ permalink raw reply related

* Re: [PATCH -next] tcp: make undo_cwnd mandatory for congestion modules
From: David Miller @ 2016-11-18 18:43 UTC (permalink / raw)
  To: fw; +Cc: netdev, edumazet, ycheng, ncardwell
In-Reply-To: <1479387411-9830-1-git-send-email-fw@strlen.de>

From: Florian Westphal <fw@strlen.de>
Date: Thu, 17 Nov 2016 13:56:51 +0100

> The undo_cwnd fallback in the stack doubles cwnd based on ssthresh,
> which un-does reno halving behaviour.
> 
> It seems more appropriate to let congctl algorithms pair .ssthresh
> and .undo_cwnd properly. Add a 'tcp_reno_undo_cwnd' function and wire it
> up for all congestion algorithms that used to rely on the fallback.
> 
> highspeed, illinois, scalable, veno and yeah use 'reno undo' while their
> .ssthresh implementation doesn't halve the slowstart threshold, this
> might point to similar issue as the one fixed for dctcp in
> ce6dd23329b1e ("dctcp: avoid bogus doubling of cwnd after loss").
> 
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Signed-off-by: Florian Westphal <fw@strlen.de>

If you really suspect that highspeed et al. need to implement their own
undo_cwnd instead of using the default reno fallback, I would really
rather that this gets either fixed or explicitly marked as likely wrong
(in an "XXX" comment or similar).

Otherwise nobody is going to remember this down the road.
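
The pairing concern can be illustrated with plain arithmetic. This is a
sketch under assumptions: the reno helpers follow the description quoted
above (halve on loss, undo by doubling ssthresh), and the one-eighth
backoff is an illustrative gentler policy, not claimed to match any of
the modules named in the patch.

```c
#include <stdint.h>

static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/* Reno: halve the window on loss, never going below 2 segments. */
static uint32_t reno_ssthresh(uint32_t cwnd)
{
	return max_u32(cwnd >> 1, 2);
}

/* Illustrative gentler backoff: give up one eighth of the window. */
static uint32_t gentle_ssthresh(uint32_t cwnd)
{
	return max_u32(cwnd - (cwnd >> 3), 2);
}

/* Reno-style undo as described above: double ssthresh. */
static uint32_t reno_undo_cwnd(uint32_t cwnd, uint32_t ssthresh)
{
	return max_u32(cwnd, ssthresh << 1);
}
```

Starting from cwnd = 80, the reno pair goes 80 -> 40 -> 80 and restores
the window. Pairing the gentler backoff with the same undo goes
80 -> 70 -> 140, which overshoots the original window.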

^ permalink raw reply

* Re: [PATCH] net: sky2: Fix shutdown crash
From: David Miller @ 2016-11-18 18:41 UTC (permalink / raw)
  To: jeremy.linton; +Cc: netdev, mlindner, stephen, Sudeep.Holla
In-Reply-To: <1479395665-27784-1-git-send-email-jeremy.linton@arm.com>

From: Jeremy Linton <jeremy.linton@arm.com>
Date: Thu, 17 Nov 2016 09:14:25 -0600

> The sky2 frequently crashes during machine shutdown with:
> 
> sky2_get_stats+0x60/0x3d8 [sky2]
> dev_get_stats+0x68/0xd8
> rtnl_fill_stats+0x54/0x140
> rtnl_fill_ifinfo+0x46c/0xc68
> rtmsg_ifinfo_build_skb+0x7c/0xf0
> rtmsg_ifinfo.part.22+0x3c/0x70
> rtmsg_ifinfo+0x50/0x5c
> netdev_state_change+0x4c/0x58
> linkwatch_do_dev+0x50/0x88
> __linkwatch_run_queue+0x104/0x1a4
> linkwatch_event+0x30/0x3c
> process_one_work+0x140/0x3e0
> worker_thread+0x60/0x44c
> kthread+0xdc/0xf0
> ret_from_fork+0x10/0x50
> 
> This is caused by the sky2 being called after it has been shutdown.
> A previous thread about this can be found here:
> 
> https://lkml.org/lkml/2016/4/12/410
> 
> An alternative fix is to assure that IFF_UP gets cleared by
> calling dev_close() during shutdown. This is similar to what the
> bnx2/tg3/xgene and maybe others are doing to assure that the driver
> isn't being called following _shutdown().
> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [v5,1/5] soc: qcom: smem_state: Fix include for ERR_PTR()
From: Bjorn Andersson @ 2016-11-18 18:35 UTC (permalink / raw)
  To: Kalle Valo
  Cc: Eugene Krasnikov, Kalle Valo, Andy Gross, wcn36xx, linux-wireless,
	netdev, linux-kernel, linux-arm-msm
In-Reply-To: <a4095c49fe5a42c4a405b1faf6c0f3b7@euamsexm01a.eu.qualcomm.com>

On Wed 16 Nov 10:49 PST 2016, Kalle Valo wrote:

> Bjorn Andersson <bjorn.andersson@linaro.org> wrote:
> > The correct include file for getting errno constants and ERR_PTR() is
> > linux/err.h, rather than linux/errno.h, so fix the include.
> > 
> > Fixes: e8b123e60084 ("soc: qcom: smem_state: Add stubs for disabled smem_state")
> > Acked-by: Andy Gross <andy.gross@linaro.org>
> > Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
> 
> For some reason this fails to compile now. Can you take a look, please?
> 
> ERROR: "qcom_wcnss_open_channel" [drivers/net/wireless/ath/wcn36xx/wcn36xx.ko] undefined!
> make[1]: *** [__modpost] Error 1
> make: *** [modules] Error 2
> 
> 5 patches set to Changes Requested.
> 
> 9429045 [v5,1/5] soc: qcom: smem_state: Fix include for ERR_PTR()
> 9429047 [v5,2/5] wcn36xx: Transition driver to SMD client

This patch was updated with the necessary depends in Kconfig to catch
this exact issue and when I pull in your .config (which has QCOM_SMD=n,
QCOM_WCNSS_CTRL=n and WCN36XX=y) I can build this just fine.

I've tested the various combinations and it seems to work fine. Do you
have any other patches in your tree? Any stale objects?

Would you mind retesting this, before I invest more time in trying to
reproduce the issue you're seeing?

Regards,
Bjorn
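
For context on what the header provides, here is a userspace sketch of
the ERR_PTR() convention. The three helpers mirror the kernel idiom of
encoding a negative errno in the top 4095 pointer values; the stub
function is hypothetical and only shows why a disabled-feature stub
needs these helpers at all.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the last MAX_ERRNO addresses. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stub in the style of a "feature disabled" fallback. */
static inline void *smem_state_get_stub(void)
{
	return ERR_PTR(-EINVAL);
}
```

A caller checks IS_ERR() on the returned pointer and uses PTR_ERR() to
get the error code back.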

^ permalink raw reply

* Re: Synopsys Ethernet QoS Driver
From: Eric Dumazet @ 2016-11-18 18:29 UTC (permalink / raw)
  To: Joao Pinto
  Cc: Florian Fainelli, davem, jeffrey.t.kirsher, jiri, saeedm, idosch,
	netdev, linux-kernel, CARLOS.PALMINHA, andreas.irestal
In-Reply-To: <034e8607-b6d1-02e9-6ec3-fe50f1bd51c8@synopsys.com>

On Fri, 2016-11-18 at 16:40 +0000, Joao Pinto wrote:

> help a lot, thank you!
> lets start working then :)

Please read this very useful document first, so that you can avoid
common mistakes ;)


https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt

Thanks

^ permalink raw reply

* Re: [PATCH v8 5/6] net: ipv4, ipv6: run cgroup eBPF egress programs
From: Pablo Neira Ayuso @ 2016-11-18 17:44 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Mack, htejun-b10kYP2dOMg, daniel-FeC+5ew28dpmcu3hnIyYJQ,
	ast-b10kYP2dOMg, davem-fT/PcQaiUtIeIZ0/mPfg9Q, kafai-b10kYP2dOMg,
	fw-HFFVJYpyMKqzQB+pC5nmwQ, harald-H+wXaHxf7aLQT0dZR+AlfA,
	netdev-u79uwXL29TY76Z2rM5mHXA, sargun-GaZTRHToo+CzQB+pC5nmwQ,
	cgroups-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20161118171715.GA56632-+o4/htvd0TDFYCXBM6kdu7fOX0fSgVTm@public.gmane.org>

On Fri, Nov 18, 2016 at 09:17:18AM -0800, Alexei Starovoitov wrote:
> On Fri, Nov 18, 2016 at 01:37:32PM +0100, Pablo Neira Ayuso wrote:
> > On Thu, Nov 17, 2016 at 07:27:08PM +0100, Daniel Mack wrote:
> > [...]
> > > @@ -312,6 +314,12 @@ int ip_mc_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> > >  	skb->dev = dev;
> > >  	skb->protocol = htons(ETH_P_IP);
> > >  
> > > +	ret = BPF_CGROUP_RUN_PROG_INET_EGRESS(sk, skb);
> > > +	if (ret) {
> > > +		kfree_skb(skb);
> > > +		return ret;
> > > +	}
> > > +
> > >  	/*
> > >  	 *	Multicasts are looped back for other local users
> > >  	 */
> > > @@ -364,12 +372,19 @@ int ip_mc_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> > >  int ip_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> > >  {
> > >  	struct net_device *dev = skb_dst(skb)->dev;
> > > +	int ret;
> > >  
> > >  	IP_UPD_PO_STATS(net, IPSTATS_MIB_OUT, skb->len);
> > >  
> > >  	skb->dev = dev;
> > >  	skb->protocol = htons(ETH_P_IP);
> > >  
> > > +	ret = BPF_CGROUP_RUN_PROG_INET_EGRESS(sk, skb);
> > > +	if (ret) {
> > > +		kfree_skb(skb);
> > > +		return ret;
> > > +	}
> > > +
> > >  	return NF_HOOK_COND(NFPROTO_IPV4, NF_INET_POST_ROUTING,
> > >  			    net, sk, skb, NULL, dev,
> > >  			    ip_finish_output,
> > 
> > Please, place this after the netfilter hook.
> > 
> > Since this new hook may mangle output packets, any mangling
> > potentially interferes with and breaks conntrack.
> 
> actually this hook cannot mangle the packets, so no conntrack
> concerns.  Also this was brought up by Lorenzo earlier and consensus
> was that it's cleaner to leave it in this order.

Probably not yet, but this could be used to implement SNAT at some
point; you potentially have the infrastructure to do so in place
already.

> My reply:
> http://www.spinics.net/lists/cgroups/msg16675.html
> and Daniel's:
> http://www.spinics.net/lists/cgroups/msg16677.html
> and the rest of that thread.

Please place this afterwards since I don't want to update Netfilter
documentation to indicate that there is a new spot to debug before
POSTROUTING that may drop packets. People are used to debugging things
in a certain way, if packets are dropped after POSTROUTING, then
netfilter tracing will indicate the packet has successfully left our
framework and people will notice that packets are dropped somewhere
else, which gives them a clue that the culprit is probably this new layer.

Actually I remember you mentioned in a previous email that this hook
can be placed anywhere, and that they don't really need a fixed
location, if so, then it should not be much of a problem to change
this.
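
The debugging argument can be modelled with a toy pipeline. This is only
a sketch of the two orderings under discussion; the types and the
function are hypothetical and do not correspond to kernel interfaces.

```c
#include <stdbool.h>

enum verdict { VERDICT_ACCEPT, VERDICT_DROP };

struct path_result {
	bool delivered;		/* packet reached the final output step */
	bool nf_saw_packet;	/* POSTROUTING hook ran on this packet */
	bool dropped_by_bpf;
};

/* bpf_first = ordering in the patch; false = ordering requested here. */
static struct path_result run_output(bool bpf_first, enum verdict bpf,
				     enum verdict nf)
{
	struct path_result r = { false, false, false };

	if (bpf_first && bpf == VERDICT_DROP) {
		r.dropped_by_bpf = true;
		return r;	/* netfilter never sees the packet */
	}
	r.nf_saw_packet = true;
	if (nf == VERDICT_DROP)
		return r;
	if (!bpf_first && bpf == VERDICT_DROP) {
		r.dropped_by_bpf = true;
		return r;	/* dropped after a visible POSTROUTING pass */
	}
	r.delivered = true;
	return r;
}
```

With the BPF program placed second, a dropped packet still shows up in
netfilter tracing as having passed POSTROUTING, which is the behaviour
argued for above.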

I can live with this new scenario where the kernel becomes a place
where everyone can push bpf blobs everywhere and your "code decides"
submission policy if others do as well, even if I frankly don't like
it. No problem. But please don't use the word "consensus" to justify
this, because this was not exactly what was shown during Netconf.

So just send a v9 with this change I'm requesting and you have my word
I will not interfere with this submission anymore.

Thank you.

^ permalink raw reply

* Re: [PATCH] netns: fix get_net_ns_by_fd(int pid) typo
From: Rami Rosen @ 2016-11-18 17:40 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Netdev, David S. Miller
In-Reply-To: <1479462106-28529-1-git-send-email-stefanha@redhat.com>

On 18 November 2016 at 11:41, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> The argument to get_net_ns_by_fd() is a /proc/$PID/ns/net file
> descriptor not a pid.  Fix the typo.
>

Acked-by: Rami Rosen <roszenrami@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next] amd-xgbe: Update connection validation for backplane mode
From: David Miller @ 2016-11-18 17:28 UTC (permalink / raw)
  To: thomas.lendacky; +Cc: netdev
In-Reply-To: <20161117144337.5714.57761.stgit@tlendack-t1.amdoffice.net>

From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Thu, 17 Nov 2016 08:43:37 -0600

> Update the connection type enumeration for backplane mode and return
> an error when there is a mismatch between the mode and the connection
> type.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH v8 5/6] net: ipv4, ipv6: run cgroup eBPF egress programs
From: Alexei Starovoitov @ 2016-11-18 17:17 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Daniel Mack, htejun-b10kYP2dOMg, daniel-FeC+5ew28dpmcu3hnIyYJQ,
	ast-b10kYP2dOMg, davem-fT/PcQaiUtIeIZ0/mPfg9Q, kafai-b10kYP2dOMg,
	fw-HFFVJYpyMKqzQB+pC5nmwQ, harald-H+wXaHxf7aLQT0dZR+AlfA,
	netdev-u79uwXL29TY76Z2rM5mHXA, sargun-GaZTRHToo+CzQB+pC5nmwQ,
	cgroups-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20161118123732.GA10400@salvia>

On Fri, Nov 18, 2016 at 01:37:32PM +0100, Pablo Neira Ayuso wrote:
> On Thu, Nov 17, 2016 at 07:27:08PM +0100, Daniel Mack wrote:
> [...]
> > @@ -312,6 +314,12 @@ int ip_mc_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> >  	skb->dev = dev;
> >  	skb->protocol = htons(ETH_P_IP);
> >  
> > +	ret = BPF_CGROUP_RUN_PROG_INET_EGRESS(sk, skb);
> > +	if (ret) {
> > +		kfree_skb(skb);
> > +		return ret;
> > +	}
> > +
> >  	/*
> >  	 *	Multicasts are looped back for other local users
> >  	 */
> > @@ -364,12 +372,19 @@ int ip_mc_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> >  int ip_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> >  {
> >  	struct net_device *dev = skb_dst(skb)->dev;
> > +	int ret;
> >  
> >  	IP_UPD_PO_STATS(net, IPSTATS_MIB_OUT, skb->len);
> >  
> >  	skb->dev = dev;
> >  	skb->protocol = htons(ETH_P_IP);
> >  
> > +	ret = BPF_CGROUP_RUN_PROG_INET_EGRESS(sk, skb);
> > +	if (ret) {
> > +		kfree_skb(skb);
> > +		return ret;
> > +	}
> > +
> >  	return NF_HOOK_COND(NFPROTO_IPV4, NF_INET_POST_ROUTING,
> >  			    net, sk, skb, NULL, dev,
> >  			    ip_finish_output,
> 
> Please, place this after the netfilter hook.
> 
> Since this new hook may mangle output packets, any mangling
> potentially interferes with and breaks conntrack.

actually this hook cannot mangle the packets, so no conntrack concerns.
Also this was brought up by Lorenzo earlier
and consensus was that it's cleaner to leave it in this order.
My reply:
http://www.spinics.net/lists/cgroups/msg16675.html
and Daniel's:
http://www.spinics.net/lists/cgroups/msg16677.html
and the rest of that thread.

Thanks

^ permalink raw reply

* Re: [PATCH net-next v3 0/5] Adding PHY-Tunables and downshift support
From: David Miller @ 2016-11-18 17:14 UTC (permalink / raw)
  To: allan.nielsen; +Cc: netdev, andrew, f.fainelli, raju.lakkaraju
In-Reply-To: <1479384444-31122-1-git-send-email-allan.nielsen@microsemi.com>

From: "Allan W. Nielsen" <allan.nielsen@microsemi.com>
Date: Thu, 17 Nov 2016 13:07:19 +0100

> This series adds support for PHY tunables, and uses this facility to
> configure downshifting. The downshifting mechanism is implemented for MSCC
> phys.

Series applied, thanks.

^ permalink raw reply

* Re: Netperf UDP issue with connected sockets
From: Jesper Dangaard Brouer @ 2016-11-18 17:12 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Rick Jones, netdev, Saeed Mahameed, Tariq Toukan, brouer
In-Reply-To: <1479419042.8455.280.camel@edumazet-glaptop3.roam.corp.google.com>

On Thu, 17 Nov 2016 13:44:02 -0800
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Thu, 2016-11-17 at 22:19 +0100, Jesper Dangaard Brouer wrote:
> 
> > 
> > Maybe you can share your udp flood "udpsnd" program source?  
> 
> Very ugly. This is based on what I wrote when tracking the UDP v6
> checksum bug (4f2e4ad56a65f3b7d64c258e373cb71e8d2499f4 net: mangle zero
> checksum in skb_checksum_help()), because netperf sends the same message
> over and over...

Thanks a lot, hope you don't mind; I added the code to my github repo:
 https://github.com/netoptimizer/network-testing/blob/master/src/udp_snd.c

So I identified the difference, and reason behind the route lookups.
Your program is using send() and I was using sendmsg().  Given
udp_flood is designed to test different calls, I simply added --send as
a new possibility.
 https://github.com/netoptimizer/network-testing/commit/16166c2cd1fa8

If I use --write instead, then I can also avoid the fib_table_lookup
and __ip_route_output_key_hash calls.


> Use -d 2   to remove the ip_idents_reserve() overhead.

#define IP_PMTUDISC_DO	2 /* Always DF	*/

Added a --pmtu option to my udp_flood program
 https://github.com/netoptimizer/network-testing/commit/23a78caf4bb5b
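
For anyone wanting to compare the call variants side by side, here is a
small Linux-only sketch. It is not the udp_flood code: the function name
is made up, a local receiver on an ephemeral loopback port stands in for
the remote host, and sendmsg() is called without a msg_name so the
destination comes from connect(). It does not measure route lookups; it
only shows the three calls on one connected socket with IP_PMTUDISC_DO
set.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Returns how many of send()/sendmsg()/write() sent all 'len' bytes
 * (3 when everything works), or -1 if the sockets could not be set up.
 */
static int connected_udp_send_demo(size_t len)
{
	static char buf[1400];
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	struct iovec iov;
	struct msghdr msg;
	int pmtu = IP_PMTUDISC_DO;	/* value 2, "always DF" */
	int rx = -1, tx = -1, ok = -1;

	if (len > sizeof(buf))
		return -1;
	memset(buf, 1, sizeof(buf));
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);	/* port 0: kernel picks */

	rx = socket(AF_INET, SOCK_DGRAM, 0);
	tx = socket(AF_INET, SOCK_DGRAM, 0);
	if (rx < 0 || tx < 0)
		goto out;
	if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(rx, (struct sockaddr *)&addr, &alen) < 0)
		goto out;
	if (setsockopt(tx, IPPROTO_IP, IP_MTU_DISCOVER, &pmtu, sizeof(pmtu)) < 0 ||
	    connect(tx, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;

	ok = 0;
	if (send(tx, buf, len, 0) == (ssize_t)len)
		ok++;

	/* sendmsg() with msg_name left NULL: uses the connected peer */
	memset(&msg, 0, sizeof(msg));
	iov.iov_base = buf;
	iov.iov_len = len;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	if (sendmsg(tx, &msg, 0) == (ssize_t)len)
		ok++;

	if (write(tx, buf, len) == (ssize_t)len)
		ok++;
out:
	if (rx >= 0)
		close(rx);
	if (tx >= 0)
		close(tx);
	return ok;
}
```

Running each variant in a loop under perf would be one way to compare
the lookup overhead discussed above.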

 
> #define _GNU_SOURCE
> 
> #include <arpa/inet.h>
> #include <errno.h>
> #include <error.h>
> #include <linux/errqueue.h>
> #include <netinet/in.h>
> #include <sched.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <sys/time.h>
> #include <sys/types.h>
> #include <sys/socket.h>
> #include <unistd.h>
> 
> char buffer[1400];
> 
> int main(int argc, char** argv) {
>   int fd, i;
>   struct sockaddr_in6 addr;
>   char *host = "2002:af6:798::1";
>   int family = AF_INET6;
>   int discover = -1;
> 
>   while ((i = getopt(argc, argv, "4H:d:")) != -1) {
>     switch (i) {
>     case 'H': host = optarg; break;
>     case '4': family = AF_INET; break;
>     case 'd': discover = atoi(optarg); break;
>     }
>   }
>   fd = socket(family, SOCK_DGRAM, 0);
>   if (fd < 0)
>     error(1, errno, "failed to create socket");
>   if (discover != -1)
>     setsockopt(fd, SOL_IP, IP_MTU_DISCOVER,
>                &discover, sizeof(discover));
> 
>   memset(&addr, 0, sizeof(addr));
>   if (family == AF_INET6) {
>     addr.sin6_family = AF_INET6;
>     addr.sin6_port = htons(9);
>     inet_pton(family, host, (void *)&addr.sin6_addr.s6_addr);
>   } else {
>     struct sockaddr_in *in = (struct sockaddr_in *)&addr;
>     in->sin_family = family;
>     in->sin_port = htons(9);
>     inet_pton(family, host, &in->sin_addr);
>   }
>   connect(fd, (struct sockaddr *)&addr,
>           (family == AF_INET6) ? sizeof(addr) :
>                                  sizeof(struct sockaddr_in));
>   memset(buffer, 1, 1400);
>   for (i = 0; i < 655360000; i++) {
>     memcpy(buffer, &i, sizeof(i));
>     send(fd, buffer, 100 + rand() % 200, 0);

Using send() avoids the fib_table_lookup, on a connected UDP socket.

>   }
>   return 0;
> }


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH net-next V2 0/8] Mellanox 100G mlx5 update 2016-11-15
From: David Miller @ 2016-11-18 17:09 UTC (permalink / raw)
  To: saeedm; +Cc: netdev
In-Reply-To: <1479383162-3432-1-git-send-email-saeedm@mellanox.com>

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Thu, 17 Nov 2016 13:45:54 +0200

> This series contains four humble mlx5 features.
> 
> From Gal, 
>  - Add the support for PCIe statistics and expose them in ethtool
> 
> From Huy,
>  - Add the support for port module events reporting and statistics
>  - Add the support for driver version setting into FW (for display purposes only)
> 
> From Mohamad,
>  - Extended the command interface cache flexibility
> 
> This series was generated against commit
> 6a02f5eb6a8a ("Merge branch 'mlxsw-i2c")
> 
> V2:
>  - Changed plain "unsigned" to "unsigned int"

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next 0/5] sfc: Firmware-Assisted TSO version 2
From: David Miller @ 2016-11-18 17:03 UTC (permalink / raw)
  To: ecree; +Cc: linux-net-drivers, bkenward, netdev
In-Reply-To: <ae6b02ca-9a0b-7072-6914-294fc9adb684@solarflare.com>

From: Edward Cree <ecree@solarflare.com>
Date: Thu, 17 Nov 2016 10:49:42 +0000

> The firmware on 8000 series SFC NICs supports a new TSO API ("FATSOv2"), and
>  7000 series NICs will also support this in an imminent release.  This series
>  adds driver support for this TSO implementation.
> The series also removes SWTSO, as it's now equivalent to GSO.  This does not
>  actually remove very much code, because SWTSO was grotesquely intertwingled
>  with FATSOv1, which will also be removed once 7000 series supports FATSOv2.

Series applied, thanks.

^ permalink raw reply

* Re: Synopsys Ethernet QoS Driver
From: Joao Pinto @ 2016-11-18 16:40 UTC (permalink / raw)
  To: Florian Fainelli, Joao Pinto, davem, jeffrey.t.kirsher, jiri,
	saeedm, idosch
  Cc: netdev, linux-kernel, CARLOS.PALMINHA, andreas.irestal
In-Reply-To: <a4bb3828-3aa5-1e80-f57e-ad9a94a681b3@gmail.com>

On 18-11-2016 16:35, Florian Fainelli wrote:
> 
> 
> On 11/18/2016 08:31 AM, Joao Pinto wrote:
>>  Hi Florian,
>>
>> On 18-11-2016 14:53, Florian Fainelli wrote:
>>> On November 18, 2016 4:28:30 AM PST, Joao Pinto <Joao.Pinto@synopsys.com> wrote:
>>>>

snip (...)

>>>> I would also gladly be available to be its maintainer if you agree with
>>>> it.
>>>
>>> Since you have both the hardware and a clear todo list for this driver, start submitting patches, get them included in David's tree and over time chances are that you will become the maintainer, either explicitly by adding an entry in the MAINTAINERS file or just by consistently contributing to this area.
>>
>> Thanks for the feedback.
>>
>> So I found 2 suitable git trees:
>>  a) kernel/git/davem/net.git
>>  b) kernel/git/davem/net-next.git
>>
>> We should submit to net.git correct? The net-next.git is a tree with selected
>> patches for upstream only?
> 
> net-next.git is the git tree where new features/enhancements can be
> submitted, while net.git is for bug fixes. Unless you absolutely need
> to, it is common practice to avoid having changes in net-next.git depend
> on net.git and vice versa.
> 
> Hope this helps.
> 

That helps a lot, thank you!
Let's start working then :)

Thanks,
Joao

^ permalink raw reply

* Re: Synopsys Ethernet QoS Driver
From: Florian Fainelli @ 2016-11-18 16:35 UTC (permalink / raw)
  To: Joao Pinto, davem, jeffrey.t.kirsher, jiri, saeedm, idosch
  Cc: netdev, linux-kernel, CARLOS.PALMINHA, andreas.irestal
In-Reply-To: <fbf4263b-2b08-611d-f9cb-1f09c6f91117@synopsys.com>



On 11/18/2016 08:31 AM, Joao Pinto wrote:
>  Hi Florian,
> 
> On 18-11-2016 14:53, Florian Fainelli wrote:
>> On November 18, 2016 4:28:30 AM PST, Joao Pinto <Joao.Pinto@synopsys.com> wrote:
>>>
>>> Dear all,
>>>
>>> My name is Joao Pinto and I work at Synopsys.
>>> I am a kernel developer with special focus in mainline collaboration,
>>> both Linux
>>> and Buildroot. I was recently named one of the maintainers of the PCIe
>>> Designware core driver and I was the author of the Designware UFS
>>> driver stack.
>>>
>>> I am sending you this e-mail because you were the suggested contacts
>>> from the
>>> get_maintainers script concerning Ethernet drivers :).
>>>
>>> Currently I have the task to work on the mainline Ethernet QoS driver
>>> in which
>>> you are the author. The work would consist of the following:
>>>
>>> a) Separate the current driver in a Core driver (common ops) + platform
>>> glue
>>> driver + pci glue driver
>>> b) Add features that are currently only available internally
>>> c) Add specific phy support using the PHY framework
>>>
>>> I would also gladly be available to be its maintainer if you agree with
>>> it.
>>
>> Since you have both the hardware and a clear todo list for this driver, start submitting patches, get them included in David's tree and over time chances are that you will become the maintainer, either explicitly by adding an entry in the MAINTAINERS file or just by consistently contributing to this area.
> 
> Thanks for the feedback.
> 
> So I found 2 suitable git trees:
>  a) kernel/git/davem/net.git
>  b) kernel/git/davem/net-next.git
> 
> We should submit to net.git correct? The net-next.git is a tree with selected
> patches for upstream only?

net-next.git is the git tree where new features/enhancements can be
submitted, while net.git is for bug fixes. Unless you absolutely need
to, it is common practice to avoid having changes in net-next.git depend
on net.git and vice versa.

Hope this helps.
-- 
Florian

^ permalink raw reply

