Netdev List


* Re: [Patch net] ipv4: restore rt->fi for reference counting
From: Cong Wang @ 2017-05-15 18:34 UTC (permalink / raw)
  To: Julian Anastasov
  Cc: Eric Dumazet, David Miller, Linux Kernel Network Developers,
	Andrey Konovalov, Eric Dumazet
In-Reply-To: <alpine.LFD.2.20.1705122151001.2835@ja.home.ssi.bg>

On Fri, May 12, 2017 at 2:27 PM, Julian Anastasov <ja@ssi.bg> wrote:
>         Now the main question: is FIB_LOOKUP_NOREF used
> everywhere in IPv4? I guess so. If not, it means
> someone can walk its res->fi NHs which is bad. I think,
> this will delay the unregistration for a long time and we
> cannot solve the problem.
>
>         If yes, free_fib_info() should not use call_rcu.
> Instead, fib_release_info() will start RCU callback to
> drop everything via a common function for fib_release_info
> and free_fib_info. As a result, the last fib_info_put will
> just need to free fi->fib_metrics and fi.


Yes it is used. But this is a different problem from the
dev refcnt issue, right? I can send a separate patch to
address it.


>> Are you sure we are safe to call dev_put() in fib_release_info()
>> for _all_ paths, especially non-unregister paths? See:
>
>         Yep, dev_put is safe there...
>
>> commit e49cc0da7283088c5e03d475ffe2fdcb24a6d5b1
>> Author: Yanmin Zhang <yanmin_zhang@linux.intel.com>
>> Date:   Wed May 23 15:39:45 2012 +0000
>>
>>     ipv4: fix the rcu race between free_fib_info and ip_route_output_slow
>
>         ...as long as we do not set nh_dev to NULL
>

OK, fair enough. Then I think the best solution here is to move
the dev_put() from free_fib_info_rcu() to fib_release_info();
fib_nh is already removed from the hash there anyway.


diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index da449dd..cb712d1 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -205,8 +205,6 @@ static void free_fib_info_rcu(struct rcu_head *head)
        struct fib_info *fi = container_of(head, struct fib_info, rcu);

        change_nexthops(fi) {
-               if (nexthop_nh->nh_dev)
-                       dev_put(nexthop_nh->nh_dev);
                lwtstate_put(nexthop_nh->nh_lwtstate);
                free_nh_exceptions(nexthop_nh);
                rt_fibinfo_free_cpus(nexthop_nh->nh_pcpu_rth_output);
@@ -246,6 +244,14 @@ void fib_release_info(struct fib_info *fi)
                        if (!nexthop_nh->nh_dev)
                                continue;
                        hlist_del(&nexthop_nh->nh_hash);
+                       /* We have to release these nh_dev here because a dst
+                        * could still hold a fib_info via rt->fi, we can't wait
+                        * for GC, a socket could hold the dst for a long time.
+                        *
+                        * This is safe, dev_put() alone does not really free
+                        * the netdevice, we just have to put the refcnt back.
+                        */
+                       dev_put(nexthop_nh->nh_dev);
                } endfor_nexthops(fi)
                fi->fib_dead = 1;
                fib_info_put(fi);


Thanks!


* Re: [pull request][net V2 0/5] Mellanox, mlx5 fixes 2017-05-12
From: David Miller @ 2017-05-15 18:38 UTC (permalink / raw)
  To: saeedm; +Cc: netdev
In-Reply-To: <20170514104311.2081-1-saeedm@mellanox.com>

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Sun, 14 May 2017 13:43:06 +0300

> This series contains some mlx5 fixes for net.
> Please pull and let me know if there's any problem.
> 
> For -stable:
> ("net/mlx5e: Fix ethtool pause support and advertise reporting") kernels >= 4.8
> ("net/mlx5e: Use the correct pause values for ethtool advertising") kernels >= 4.8
> 
> v1->v2:
>  Dropped statistics spinlock patch, it needs some extra work.

Pulled and the first two patches queued up for -stable, thanks.


* Re: [PATCH net v2] qed: Fix uninitialized data in aRFS infrastructure
From: David Miller @ 2017-05-15 18:39 UTC (permalink / raw)
  To: Yuval.Mintz; +Cc: arnd, netdev
In-Reply-To: <1494753683-3429-1-git-send-email-Yuval.Mintz@cavium.com>

From: Yuval Mintz <Yuval.Mintz@cavium.com>
Date: Sun, 14 May 2017 12:21:23 +0300

> The current memset uses an incorrect variable type, causing the
> upper half of the structure to be left uninitialized and causing:
> 
>   ethernet/qlogic/qed/qed_init_fw_funcs.c: In function 'qed_set_rfs_mode_disable':
>   ethernet/qlogic/qed/qed_init_fw_funcs.c:993:3: error: '*((void *)&ramline+4)' is used uninitialized in this function [-Werror=uninitialized]
> 
> Fixes: d51e4af5c209 ("qed: aRFS infrastructure support")
> Reported-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>

Applied, thank you.
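
The patch body is not quoted in this reply, so the following is only a
minimal sketch of the general bug class the commit message describes: a
memset() whose size comes from the wrong type, so that only the lower
half of a structure is cleared. The names ram_line, clear_wrong_size()
and clear_whole_object() are hypothetical and are not taken from the
qed driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a two-word register line; not the qed type. */
struct ram_line {
	uint32_t lo;
	uint32_t hi;
};

/* Buggy pattern: the size is taken from the wrong type, so only the
 * first 4 bytes (the lo field) are cleared and hi keeps its old value.
 */
static void clear_wrong_size(struct ram_line *line)
{
	assert(line != NULL);
	memset(line, 0, sizeof(uint32_t));
}

/* Safer idiom: derive the size from the object being cleared, so the
 * call stays correct even if the structure layout changes later.
 */
static void clear_whole_object(struct ram_line *line)
{
	assert(line != NULL);
	memset(line, 0, sizeof(*line));
}
```

Writing sizeof(*line) rather than naming a type keeps the size tied to
the destination object, which is probably the simplest way to avoid
this kind of mismatch.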


* Re: [net 1/6] net/mlx5e: Use a spinlock to synchronize statistics
From: David Miller @ 2017-05-15 18:39 UTC (permalink / raw)
  To: saeedm; +Cc: saeedm, netdev, galp, kernel-team
In-Reply-To: <CALzJLG-CbN5ipg3_CsN9_1RWtY_q_UaQGMu3KwDpgoYk53xkOQ@mail.gmail.com>

From: Saeed Mahameed <saeedm@dev.mellanox.co.il>
Date: Sun, 14 May 2017 11:52:13 +0300

> I agree, it is really ridiculous that we allocate/free a couple of
> buffers on each update_stats operation, regardless of this patch.
> Is it OK if we use a temp buffer under netdev_priv for such usages or
> even use a kmem_cache?

If you can safely use a pre-allocated tmp buffer, yes that would be
preferred.
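
A minimal userspace sketch of the pre-allocated buffer idea, assuming
that updates are already serialized by the caller (which is the "if you
can safely" condition above). The names dev_priv, STATS_WORDS,
query_hw() and update_stats() are hypothetical and do not come from the
mlx5 driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STATS_WORDS 8	/* hypothetical number of counters */

/* Hypothetical private area: the scratch buffer lives next to the
 * published counters and is allocated once with the device.
 */
struct dev_priv {
	uint64_t stats[STATS_WORDS];	/* counters visible to readers */
	uint64_t scratch[STATS_WORDS];	/* reused on every update */
};

/* Stand-in for a hardware query: fills out[i] with base + i. */
static void query_hw(uint64_t *out, size_t n, uint64_t base)
{
	size_t i;

	for (i = 0; i < n; i++)
		out[i] = base + i;
}

/* No allocation or free on this path. The caller must guarantee that
 * two updates never run concurrently, since they share the scratch
 * buffer.
 */
static void update_stats(struct dev_priv *priv, uint64_t base)
{
	assert(priv != NULL);
	query_hw(priv->scratch, STATS_WORDS, base);
	memcpy(priv->stats, priv->scratch, sizeof(priv->stats));
}
```

The trade-off is a slightly larger private area in exchange for an
update path that cannot fail on allocation.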



* Re: [PATCH] net: x25: fix one potential use-after-free issue
From: David Miller @ 2017-05-15 18:47 UTC (permalink / raw)
  To: xiaolou4617; +Cc: andrew.hendry, nhorman, linux-x25, netdev, linux-kernel
In-Reply-To: <1494821569-18572-1-git-send-email-xiaolou4617@gmail.com>

From: linzhang <xiaolou4617@gmail.com>
Date: Mon, 15 May 2017 12:12:49 +0800

> The function x25_init does not properly unregister related resources
> in its error handler. This will result in a kernel oops if x25_init
> fails, so add the right unregister call to the error handler.
> 
> Signed-off-by: linzhang <xiaolou4617@gmail.com>

I think we need to go a bit further and make x25_register_sysctl()
properly check for and return failure.

Something like:

diff --git a/include/net/x25.h b/include/net/x25.h
index c383aa4..6d30a01 100644
--- a/include/net/x25.h
+++ b/include/net/x25.h
@@ -298,10 +298,10 @@ void x25_check_rbuf(struct sock *);
 
 /* sysctl_net_x25.c */
 #ifdef CONFIG_SYSCTL
-void x25_register_sysctl(void);
+int x25_register_sysctl(void);
 void x25_unregister_sysctl(void);
 #else
-static inline void x25_register_sysctl(void) {};
+static inline int x25_register_sysctl(void) { return 0; };
 static inline void x25_unregister_sysctl(void) {};
 #endif /* CONFIG_SYSCTL */
 
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 8b911c2..b7d6614 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1808,12 +1808,17 @@ static int __init x25_init(void)
 
 	pr_info("Linux Version 0.2\n");
 
-	x25_register_sysctl();
+	rc = x25_register_sysctl();
+	if (rc)
+		goto out_dev;
+
 	rc = x25_proc_init();
 	if (rc != 0)
-		goto out_dev;
+		goto out_sysctl;
 out:
 	return rc;
+out_sysctl:
+	x25_unregister_sysctl();
 out_dev:
 	unregister_netdevice_notifier(&x25_dev_notifier);
 out_sock:
diff --git a/net/x25/sysctl_net_x25.c b/net/x25/sysctl_net_x25.c
index a06dfe1..ba078c8 100644
--- a/net/x25/sysctl_net_x25.c
+++ b/net/x25/sysctl_net_x25.c
@@ -73,9 +73,12 @@ static struct ctl_table x25_table[] = {
 	{ },
 };
 
-void __init x25_register_sysctl(void)
+int __init x25_register_sysctl(void)
 {
 	x25_table_header = register_net_sysctl(&init_net, "net/x25", x25_table);
+	if (!x25_table_header)
+		return -ENOMEM;
+	return 0;
 }
 
 void x25_unregister_sysctl(void)
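
The unwinding order used in the patch above can be sketched in plain C
as follows. This is only an illustration of the goto-based pattern, not
the x25 code: init_state, step_register() and demo_init() are
hypothetical names, and each "registration" just sets a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of which init steps are currently active. */
struct init_state {
	bool notifier;
	bool sysctl;
	bool proc;
};

/* Stand-in for a registration call that can be made to fail. */
static int step_register(bool *flag, bool fail)
{
	if (fail)
		return -ENOMEM;
	*flag = true;
	return 0;
}

/* Each failure jumps to the label that undoes everything registered
 * so far, in reverse order of registration.
 */
static int demo_init(struct init_state *st, bool fail_sysctl, bool fail_proc)
{
	int rc;

	assert(st != NULL);

	rc = step_register(&st->notifier, false);
	if (rc)
		goto out;

	rc = step_register(&st->sysctl, fail_sysctl);
	if (rc)
		goto out_notifier;

	rc = step_register(&st->proc, fail_proc);
	if (rc)
		goto out_sysctl;
out:
	return rc;
out_sysctl:
	st->sysctl = false;
out_notifier:
	st->notifier = false;
	goto out;
}
```

The pattern only works if every step reports failure, which is why the
patch first changes the registration function to return an int.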


* Re: [lkp-robot] [bpf] de05014aba: BUG:sleeping_function_called_from_invalid_context_at_mm/slab.h
From: David Miller @ 2017-05-15 18:57 UTC (permalink / raw)
  To: xiaolong.ye; +Cc: kafai, netdev, daniel, hannes, ast, kernel-team, lkp
In-Reply-To: <20170515081438.GI15250@yexl-desktop>

From: kernel test robot <xiaolong.ye@intel.com>
Date: Mon, 15 May 2017 16:14:38 +0800

> commit: de05014aba8054e1353b720b814a0cd8ea7594e5 ("bpf: Introduce bpf_prog ID")
> url: https://github.com/0day-ci/linux/commits/Martin-KaFai-Lau/bpf-Introduce-bpf_prog-ID/20170428-025859

Indeed, we can't use GFP_USER in the call to idr_alloc_cyclic() if it is
going to run with the prog_idr_lock spinlock held.


* Re: [PATCH] rt2x00: improve calling conventions for register accessors
From: Daniel Golle @ 2017-05-15 19:02 UTC (permalink / raw)
  To: David Miller
  Cc: arnd-r2nGTMty4D4, sgruszka-H+wXaHxf7aLQT0dZR+AlfA,
	helmut.schaa-gM/Ye1E23mwN+BqQ9rBEUg, kvalo-sgV2jX0FEOL9JmXXK+q4OQ,
	dev-zg6vgJgm1sizQB+pC5nmwQ, johannes.berg-ral2JQCrhuEAvxtiuMwx3w,
	pozega.tomislav-Re5JQEeQqe8AvxtiuMwx3w,
	vasilugin-o+MxOtu4lMCHXe+LvDLADg, roman-9zmcapQ0v8Q,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20170515.104052.1376354562934671974.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

On Mon, May 15, 2017 at 10:40:52AM -0400, David Miller wrote:
> From: Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>
> Date: Mon, 15 May 2017 16:36:45 +0200
> 
> > On Mon, May 15, 2017 at 4:28 PM, Stanislaw Gruszka <sgruszka-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> >> On Mon, May 15, 2017 at 03:46:55PM +0200, Arnd Bergmann wrote:
> >>> With CONFIG_KASAN enabled and gcc-7, we get a warning about rather high
> >>> stack usage (with a private patch set I have to turn on this warning,
> >>> which I intend to get into the next kernel release):
> >>>
> >>> wireless/ralink/rt2x00/rt2800lib.c: In function 'rt2800_bw_filter_calibration':
> >>> wireless/ralink/rt2x00/rt2800lib.c:7990:1: error: the frame size of 2144 bytes is larger than 1536 bytes [-Werror=frame-larger-than=]
> >>>
> >>> The problem is that KASAN inserts a redzone around each local variable that
> >>> gets passed by reference, and the newly added function has a lot of them.
> >>> We can easily avoid that here by changing the calling convention to have
> >>> the output as the return value of the function. This should also result in
> >>> smaller object code, saving around 4KB in .text with KASAN, or 2KB without
> >>> KASAN.
> >>>
> >>> Fixes: 41977e86c984 ("rt2x00: add support for MT7620")
> >>> Signed-off-by: Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>
> >>> ---
> >>>  drivers/net/wireless/ralink/rt2x00/rt2800lib.c | 319 +++++++++++++------------
> >>>  1 file changed, 164 insertions(+), 155 deletions(-)
> >>
> >> We have had the read(, &val) calling convention since forever in rt2x00 and that
> >> was never a problem. I dislike changing that now to make some tools
> >> happy; I think the problem should be fixed in the tools instead.
> > 
> > How about adding 'depends on !KASAN' in Kconfig instead?
> 
> Please let's not go down that route and make such facilities less
> useful due to decreased coverage.

Being the one to blame for submitting the patch adding most of the
problem's footprint: Arnd's change looks good to me and I believe it
should be merged.
This is the type of feedback I was hoping for when submitting all the
long-forgotten and rotting patches from OpenWrt's mac80211 driver
patches! Thanks to Arnd for your efforts!

Consider this as
Acked-by: Daniel Golle <daniel-g5gK2j5usbvCyp4qypjU+w@public.gmane.org>
for Arnd's original patch (and for NOT adding 'depends on !KASAN')

Cheers


Daniel

> 
> Thanks.
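
For readers who have not seen the patch, the two calling conventions
under discussion look roughly like this. It is a minimal sketch with
hypothetical names (fake_dev, reg_read_ref(), reg_read()); the real
rt2x00 accessors take different arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 4	/* hypothetical register count */

/* Hypothetical device with a small register file. */
struct fake_dev {
	uint32_t regs[NUM_REGS];
};

/* Old convention: the result is written through a pointer, so every
 * caller needs a local variable whose address is taken. Per the commit
 * message, KASAN puts a redzone around each such local.
 */
static void reg_read_ref(const struct fake_dev *dev, unsigned int idx,
			 uint32_t *val)
{
	assert(dev != NULL && val != NULL && idx < NUM_REGS);
	*val = dev->regs[idx];
}

/* New convention: the result is the return value, so the caller's
 * local never has its address taken.
 */
static uint32_t reg_read(const struct fake_dev *dev, unsigned int idx)
{
	assert(dev != NULL && idx < NUM_REGS);
	return dev->regs[idx];
}
```

Both forms read the same value; the difference is only in how the
caller's locals are laid out on the stack.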


* Re: [PATCH net] net: netcp: fix check of requested timestamping filter
From: David Miller @ 2017-05-15 19:21 UTC (permalink / raw)
  To: mlichvar; +Cc: netdev, w-kwok2, richardcochran
In-Reply-To: <20170515140436.12175-1-mlichvar@redhat.com>

From: Miroslav Lichvar <mlichvar@redhat.com>
Date: Mon, 15 May 2017 16:04:36 +0200

> The driver doesn't support timestamping of all received packets and
> should return error when trying to enable the HWTSTAMP_FILTER_ALL
> filter.
> 
> Cc: WingMan Kwok <w-kwok2@ti.com>
> Cc: Richard Cochran <richardcochran@gmail.com>
> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>

Applied, thank you.


* Re: [PATCH net-next 0/2] ldmvsw: port removal stability
From: David Miller @ 2017-05-15 19:36 UTC (permalink / raw)
  To: shannon.nelson; +Cc: netdev, sparclinux
In-Reply-To: <1494870668-65047-1-git-send-email-shannon.nelson@oracle.com>

From: Shannon Nelson <shannon.nelson@oracle.com>
Date: Mon, 15 May 2017 10:51:06 -0700

> Under heavy reboot stress testing we found a couple of timing issues
> when removing the device that could cause the kernel great heartburn,
> addressed by these two patches.

These are pretty legit bug fixes so I've applied this series to net,
thanks.


* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
From: Daniel Borkmann @ 2017-05-15 19:55 UTC (permalink / raw)
  To: Shubham Bansal, Kees Cook
  Cc: David Miller, Mircea Gherzan, Network Development,
	kernel-hardening@lists.openwall.com,
	linux-arm-kernel@lists.infradead.org, ast
In-Reply-To: <CAHgaXdL7GcVzs+ANPke_NywhcgHbe_fzi1sTDEy+Ni1-o82GYQ@mail.gmail.com>

On 05/13/2017 11:38 PM, Shubham Bansal wrote:
> Finally finished testing.
>
> "test_bpf: Summary: 314 PASSED, 0 FAILED, [274/306 JIT'ed]"

What are the missing pieces and how is the performance compared
to the interpreter?

Thanks,
Daniel


* Re: [PATCH] kmod: don't load module unless req process has CAP_SYS_MODULE
From: Florian Westphal @ 2017-05-15 19:59 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Mahesh Bandewar (महेश बंडेवार),
	David Miller, gregkh, mahesh, mingo, linux-kernel, linux-netdev,
	keescook, Eric Dumazet
In-Reply-To: <87r2zqj82a.fsf@xmission.com>

Eric W. Biederman <ebiederm@xmission.com> wrote:
> If loading the conntrack module changes the semantics of packet
> processing when nothing is configured that is a bug in the conntrack
> module.

That's been the default behaviour since forever.

modprobe nf_conntrack_ipv4 -- module_init registers netfilter hooks
and starts doing connection tracking.

You might say 'it's wrong' but that's how it's been for over a decade.

If you have a suggestion on how to transition to a 'sane' behaviour,
then I'm all ears.

Note however, that conntrack doesn't need any configuration currently.

It's just there once the module is loaded.
We could try hooking into nftables/iptables modules that use conntrack
info to make a decision, and that's what we do now in namespaces other
than init_net.

We still do it by default in init_net because someone could be
doing conntrack just for the purpose of ctnetlink events (conntrack -E and
friends, or flow accounting and the like).


* [GIT] Networking
From: David Miller @ 2017-05-15 20:01 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


1) Track alignment in BPF verifier so that legitimate programs won't
   be rejected on !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
   architectures.

2) Make tail calls work properly in arm64 BPF JIT, from Daniel
   Borkmann.

3) Make the configuration and semantics of Generic XDP make more
   sense and don't allow both generic XDP and a driver specific
   instance to be active at the same time.  Also from Daniel.

4) Don't crash on resume in xen-netfront, from Vitaly Kuznetsov.

5) Fix use-after-free in VRF driver, from Gao Feng.

6) Use netdev_alloc_skb_ip_align() to avoid unaligned IP headers
   in qca_spi driver, from Stefan Wahren.

7) Always run cleanup routines in BPF samples when we get SIGTERM,
   from Andy Gospodarek.

8) The mdio phy code should bring PHYs out of reset using the shared
   GPIO lines before invoking bus->reset().  From Florian Fainelli.

9) Some USB descriptor access endian fixes in various drivers from
   Johan Hovold.

10) Handle PAUSE advertisements properly in mlx5 driver, from Gal
    Pressman.

11) Fix reversed test in mlx5e_setup_tc(), from Saeed Mahameed.

12) Cure netdev leak in AF_PACKET when using timestamping via control
    messages.  From Douglas Caetano dos Santos.

13) netcp doesn't support HWTSTAMP_FILTER_ALL, reject it.  From
    Miroslav Lichvar.

Please pull, thanks a lot!

The following changes since commit 56868a460b83c0f93d339256a81064d89aadae8e:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide (2017-05-09 15:56:58 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 

for you to fetch changes up to 66f4bc819d71bb600f1c879c9a7161de1cc725f8:

  Merge branch 'ldmsw-fixes' (2017-05-15 15:36:09 -0400)

----------------------------------------------------------------
Andy Gospodarek (1):
      samples/bpf: run cleanup routines when receiving SIGTERM

Bert Kenward (1):
      sfc: revert changes to NIC revision numbers

Chopra, Manish (2):
      qlcnic: Fix link configuration with autoneg disabled
      qlcnic: Update version to 5.3.66

Colin Ian King (2):
      netxen_nic: set rcode to the return status from the call to netxen_issue_cmd
      ethernet: aquantia: remove redundant checks on error status

Daniel Borkmann (3):
      bpf, arm64: fix faulty emission of map access in tail calls
      xdp: add flag to enforce driver mode
      xdp: refine xdp api with regards to generic xdp

David S. Miller (14):
      Merge branch 's390-net-fixes'
      bpf: Track alignment of register values in the verifier.
      bpf: Do per-instruction state dumping in verifier when log_level > 1.
      bpf: Add strict alignment flag for BPF_PROG_LOAD.
      bpf: Add bpf_verify_program() to the library.
      bpf: Add verifier test case for alignment.
      Merge branch 'bpf-pkt-ptr-align'
      bpf: Provide a linux/types.h override for bpf selftests.
      Merge branch 'generic-xdp-followups'
      Merge branch 'qlcnic-fixes'
      bpf: Remove commented out debugging hack in test_align.
      bpf: Handle multiple variable additions into packet pointers in verifier.
      Merge tag 'mlx5-fixes-2017-05-12-V2' of git://git.kernel.org/.../saeed/linux
      Merge branch 'ldmsw-fixes'

Douglas Caetano dos Santos (1):
      net/packet: fix missing net_device reference release

Eric Dumazet (2):
      netem: fix skb_orphan_partial()
      net: sched: optimize class dumps

Florian Fainelli (1):
      net: phy: Call bus->reset() after releasing PHYs from reset

Gal Pressman (2):
      net/mlx5e: Use the correct pause values for ethtool advertising
      net/mlx5e: Fix ethtool pause support and advertise reporting

Gao Feng (1):
      driver: vrf: Fix one possible use-after-free issue

Gustavo A. R. Silva (1):
      net: dsa: mv88e6xxx: add default case to switch

Ivan Khoronzhuk (1):
      net: ethernet: ti: netcp_core: return error while dma channel open issue

Johan Hovold (2):
      net: irda: irda-usb: fix firmware name on big-endian hosts
      net: ch9200: add missing USB-descriptor endianness conversions

Jon Mason (1):
      mdio: mux: Correct mdio_mux_init error path issues

Jon Paul Maloy (1):
      tipc: make macro tipc_wait_for_cond() smp safe

Julia Lawall (1):
      mdio: mux: fix device_node_continue.cocci warnings

Julian Wiedmann (2):
      s390/qeth: unbreak OSM and OSN support
      s390/qeth: avoid null pointer dereference on OSN

Mahesh Bandewar (1):
      ipv6: avoid dad-failures for addresses with NODAD

Mintz, Yuval (1):
      qed: Fix uninitialized data in aRFS infrastructure

Miroslav Lichvar (1):
      net: netcp: fix check of requested timestamping filter

Neil Horman (1):
      vmxnet3: ensure that adapter is in proper state during force_close

Niklas Cassel (1):
      net: stmmac: use correct pointer when printing normal descriptor ring

Saeed Mahameed (2):
      net/mlx5e: Fix setup TC ndo
      net/mlx5e: IPoIB, Only support regular RQ for now

Shannon Nelson (1):
      ldmvsw: stop the clean timer at beginning of remove

Stefan Wahren (1):
      net: qca_spi: Fix alignment issues in rx path

Thomas Tai (1):
      ldmvsw: unregistering netdev before disable hardware

Ursula Braun (2):
      s390/qeth: handle sysfs error during initialization
      s390/qeth: add missing hash table initializations

Vitaly Kuznetsov (1):
      xen-netfront: avoid crashing on resume after a failure in talk_to_netback()

Vlad Yasevich (1):
      macvlan: Fix performance issues with vlan tagged packets

WANG Cong (1):
      ipv6/dccp: do not inherit ipv6_mc_list from parent

Xin Long (1):
      sctp: fix src address selection if using secondary addresses for ipv6

Yishai Hadas (1):
      net/mlx5: Use underlay QPN from the root name space

Yuchung Cheng (1):
      tcp: avoid fragmenting peculiar skbs in SACK

yuval.shaia@oracle.com (1):
      net/mlx4_core: Use min3 to select number of MSI-X vectors

 arch/arm64/net/bpf_jit_comp.c                             |   5 +-
 drivers/net/dsa/mv88e6xxx/chip.c                          |   3 +
 drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c |  13 +--
 drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c |  12 +--
 drivers/net/ethernet/mellanox/mlx4/main.c                 |  10 +--
 drivers/net/ethernet/mellanox/mlx5/core/en.h              |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c      |   9 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_fs.c           |   5 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c         |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c          |   9 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h          |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c         |  25 +++++-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.h         |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/ipoib.c           |  11 ++-
 drivers/net/ethernet/qlogic/netxen/netxen_nic_ctx.c       |   2 +-
 drivers/net/ethernet/qlogic/qed/qed_init_fw_funcs.c       |   2 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic.h               |   4 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c       |  34 ++++++++
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h       |   1 +
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c       |   3 +
 drivers/net/ethernet/qualcomm/qca_spi.c                   |  10 ++-
 drivers/net/ethernet/sfc/nic.h                            |   8 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c         |   2 +-
 drivers/net/ethernet/sun/ldmvsw.c                         |   4 +-
 drivers/net/ethernet/ti/netcp_core.c                      |   6 +-
 drivers/net/ethernet/ti/netcp_ethss.c                     |   1 -
 drivers/net/irda/irda-usb.c                               |   2 +-
 drivers/net/macvlan.c                                     |   7 +-
 drivers/net/phy/mdio-mux.c                                |  11 +--
 drivers/net/phy/mdio_bus.c                                |   6 +-
 drivers/net/usb/ch9200.c                                  |   4 +-
 drivers/net/vmxnet3/vmxnet3_drv.c                         |   5 ++
 drivers/net/vrf.c                                         |   3 +-
 drivers/net/xen-netfront.c                                |   3 +-
 drivers/s390/net/qeth_core.h                              |   4 +
 drivers/s390/net/qeth_core_main.c                         |  21 +++--
 drivers/s390/net/qeth_core_sys.c                          |  24 ++++--
 drivers/s390/net/qeth_l2.h                                |   2 +
 drivers/s390/net/qeth_l2_main.c                           |  26 ++++--
 drivers/s390/net/qeth_l2_sys.c                            |   8 ++
 drivers/s390/net/qeth_l3_main.c                           |   8 +-
 drivers/soc/ti/knav_dma.c                                 |   2 +-
 include/linux/bpf_verifier.h                              |   4 +
 include/linux/mlx5/fs.h                                   |   4 +-
 include/linux/netdevice.h                                 |   8 +-
 include/uapi/linux/bpf.h                                  |   8 ++
 include/uapi/linux/if_link.h                              |  13 ++-
 kernel/bpf/syscall.c                                      |   5 +-
 kernel/bpf/verifier.c                                     | 133 +++++++++++++++++++++++------
 net/core/dev.c                                            |  57 ++++++++-----
 net/core/rtnetlink.c                                      |  45 +++++-----
 net/core/sock.c                                           |  20 ++---
 net/dccp/ipv6.c                                           |   6 ++
 net/ipv4/tcp_input.c                                      |   9 +-
 net/ipv6/addrconf.c                                       |   5 +-
 net/ipv6/tcp_ipv6.c                                       |   2 +
 net/packet/af_packet.c                                    |  14 +--
 net/sched/sch_api.c                                       |   6 ++
 net/sctp/ipv6.c                                           |  46 ++++++----
 net/tipc/socket.c                                         |  38 ++++-----
 samples/bpf/cookie_uid_helper_example.c                   |   4 +-
 samples/bpf/offwaketime_user.c                            |   1 +
 samples/bpf/sampleip_user.c                               |   1 +
 samples/bpf/trace_event_user.c                            |   1 +
 samples/bpf/tracex2_user.c                                |   1 +
 samples/bpf/xdp1_user.c                                   |   9 +-
 samples/bpf/xdp_tx_iptunnel_user.c                        |   8 +-
 tools/build/feature/test-bpf.c                            |   1 +
 tools/include/uapi/linux/bpf.h                            |  11 ++-
 tools/lib/bpf/bpf.c                                       |  22 +++++
 tools/lib/bpf/bpf.h                                       |   4 +
 tools/testing/selftests/bpf/Makefile                      |   6 +-
 tools/testing/selftests/bpf/include/uapi/linux/types.h    |   6 ++
 tools/testing/selftests/bpf/test_align.c                  | 453 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 74 files changed, 1026 insertions(+), 249 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/include/uapi/linux/types.h
 create mode 100644 tools/testing/selftests/bpf/test_align.c


* Re: [patch net-next v2 10/10] net: sched: add termination action to allow goto chain
From: Daniel Borkmann @ 2017-05-15 20:02 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, jhs, xiyou.wangcong, dsa, edumazet, stephen,
	alexander.h.duyck, simon.horman, mlxsw, alexei.starovoitov
In-Reply-To: <20170515083857.3615-11-jiri@resnulli.us>

On 05/15/2017 10:38 AM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
>
> Introduce new type of termination action called "goto_chain". This allows
> user to specify a chain to be processed. This action type is
> then processed as a return value in tcf_classify loop in similar
> way as "reclassify" is, only it does not reset to the first filter
> in chain but rather reset to the first filter of the desired chain.
>
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
[...]
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 1112a2b..98cc689 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -304,10 +304,14 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
>   			continue;
>
>   		err = tp->classify(skb, tp, res);
> -		if (unlikely(err == TC_ACT_RECLASSIFY && !compat_mode))
> +		if (err == TC_ACT_RECLASSIFY && !compat_mode) {
>   			goto reset;
> -		if (err >= 0)
> +		} else if (TC_ACT_EXT_CMP(err, TC_ACT_GOTO_CHAIN)) {
> +			old_tp = res->goto_tp;
> +			goto reset;
> +		} else if (err >= 0) {
>   			return err;
> +		}

Given this goto chain feature is pretty much only interesting for hw
offloads, can we move this further away from the sw fast path to not
add up to the cost per packet? (I doubt anyone is using TC_ACT_RECLASSIFY
in sw as well ...)

>   	}
>
>   	return TC_ACT_UNSPEC; /* signal: continue lookup */
>


* Re: [PATCH] neighbour: update neigh timestamps iff update is effective
From: Julian Anastasov @ 2017-05-15 20:05 UTC (permalink / raw)
  To: Ihar Hrachyshka; +Cc: David S. Miller, He Chunhui, netdev
In-Reply-To: <20170510000605.6799-1-ihrachys@redhat.com>


	Hello,

On Tue, 9 May 2017, Ihar Hrachyshka wrote:

> It's a common practice to send gratuitous ARPs after moving an
> IP address to another device to speed up healing of a service. To
> fulfill service availability constraints, the timing of network peers
> updating their caches to point to a new location of an IP address can be
> particularly important.
> 
> Sometimes neigh_update calls will touch neither lladdr nor state, for
> example if an update arrives in locktime interval. Then we effectively
> ignore the update request, bailing out of touching the neigh entry,
> except that we still bump its timestamps.
> 
> This may be a problem for updates arriving in quick succession. For
> example, consider the following scenario:
> 
> A service is moved to another device with its IP address. The new device
> sends three gratuitous ARP requests into the network with ~1 second
> interval between them. Just before the first request arrives to one of
> network peer nodes, its neigh entry for the IP address transitions from
> STALE to DELAY.  This transition, among other things, updates
> neigh->updated. Once the kernel receives the first gratuitous ARP, it
> ignores it because its arrival time is inside the locktime interval. The
> kernel still bumps neigh->updated. Then the second gratuitous ARP
> request arrives, and it's also ignored because it's still in the (new)
> locktime interval. Same happens for the third request. The node
> eventually heals itself (after delay_first_probe_time seconds since the
> initial transition to DELAY state), but it has just wasted some time and
> requires a new ARP request/reply round trip. This unfortunate behaviour
> both puts more load on the network and reduces service
> availability.
> 
> This patch changes neigh_update so that it bumps neigh->updated (as well
> as neigh->confirmed) only once we are sure that either lladdr or entry
> state will change. In the scenario described above, it means that the
> second gratuitous ARP request will actually update the entry lladdr.
> 
> Ideally, we would update the neigh entry on the very first gratuitous
> ARP request. The locktime mechanism is designed to ignore ARP updates in
> a short timeframe after a previous ARP update was honoured by the kernel
> layer. This would require tracking timestamps for state transitions
> separately from timestamps when actual updates are received. This would
> probably involve changes in neighbour struct. Therefore, the patch
> doesn't tackle the issue of the first gratuitous ARP ignored, leaving
> it for a follow-up.
> 
> Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>

	Looks ok to me,

Acked-by: Julian Anastasov <ja@ssi.bg>

	It seems the arp_accept value currently has an influence on
the locktime for GARP requests. My understanding is that
locktime is used to ignore replies from proxy_arp
routers while the requested IP is present on the LAN
and replies immediately. IMHO, GARP requests should not
depend on locktime, even when arp_accept=0. For example:

	if (IN_DEV_ARP_ACCEPT(in_dev)) {
	...
+	} else if (n && tip == sip && arp->ar_op == htons(ARPOP_REQUEST)) {
+		unsigned int addr_type = inet_addr_type_dev_table(net, dev, sip);
+
+		is_garp = (addr_type == RTN_UNICAST);
	}

> ---
>  net/core/neighbour.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 58b0bcc..d274f81 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -1132,10 +1132,6 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
>  		lladdr = neigh->ha;
>  	}
>  
> -	if (new & NUD_CONNECTED)
> -		neigh->confirmed = jiffies;
> -	neigh->updated = jiffies;
> -
>  	/* If entry was valid and address is not changed,
>  	   do not change entry state, if new one is STALE.
>  	 */
> @@ -1157,6 +1153,16 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
>  		}
>  	}
>  
> +	/* Update timestamps only once we know we will make a change to the
> +	 * neighbour entry. Otherwise we risk to move the locktime window with
> +	 * noop updates and ignore relevant ARP updates.
> +	 */
> +	if (new != old || lladdr != neigh->ha) {
> +		if (new & NUD_CONNECTED)
> +			neigh->confirmed = jiffies;
> +		neigh->updated = jiffies;
> +	}
> +
>  	if (new != old) {
>  		neigh_del_timer(neigh);
>  		if (new & NUD_PROBE)
> -- 
> 2.9.3

Regards
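
For readers following the locktime discussion, here is a minimal user-space
sketch of the scenario in the commit message. It is not kernel code: the
struct, the toy_* names and the tick values are all hypothetical, and the
model only captures "ignore updates inside the locktime window" plus the
question of whether an ignored update refreshes the timestamp.

```c
#include <stdbool.h>

/* Hypothetical, simplified neighbour entry; not struct neighbour. */
struct toy_neigh {
	unsigned long updated;	/* time of the last recorded update, in ticks */
	int lladdr;		/* stand-in for the link-layer address */
};

/*
 * Process one ARP update at time 'now'. Returns true if the address was
 * accepted. 'bump_on_noop' selects the old behaviour (timestamp refreshed
 * even when the update is ignored) or the patched one (timestamp untouched).
 */
bool toy_arp_update(struct toy_neigh *n, unsigned long now,
		    unsigned long locktime, int lladdr, bool bump_on_noop)
{
	if (now - n->updated < locktime) {
		if (bump_on_noop)
			n->updated = now;	/* the window slides forward */
		return false;
	}
	n->lladdr = lladdr;
	n->updated = now;
	return true;
}

/*
 * Feed 'count' updates, 'gap' ticks apart, the first one at time 'first',
 * to an entry whose last state transition happened at time 0.
 * Returns how many updates were accepted.
 */
int toy_count_accepted(unsigned long first, unsigned long gap, int count,
		       unsigned long locktime, bool bump_on_noop)
{
	struct toy_neigh n = { .updated = 0, .lladdr = 1 };
	int accepted = 0;

	for (int i = 0; i < count; i++)
		if (toy_arp_update(&n, first + i * gap, locktime, 2,
				   bump_on_noop))
			accepted++;
	return accepted;
}
```

With a locktime of 100 ticks and three updates at t=50, 120 and 190, the
old behaviour accepts none of them, while the patched behaviour accepts the
second one, which matches the description above.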

^ permalink raw reply

* [PATCH 0/1] fix node name in the brcm,bcm43xx-fmac.txt example
From: Martin Blumenstingl @ 2017-05-15 20:13 UTC (permalink / raw)
  To: kvalo-sgV2jX0FEOL9JmXXK+q4OQ, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	mark.rutland-5wv7dgnIgG8, devicetree-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Martin Blumenstingl

Recently there were some negative comments about the quality of
code-reviews for new .dts additions. One issue that came up was that the
node for the Broadcom FullMAC wireless SDIO devices was named "brcmf"
instead of "wifi".

This patch tries to fix (one of) the root cause(s), which is that .dts
authors copy the example from the documentation.
Unfortunately there are still many .dts files out there which use
"brcmf" as node name - so any new addition of a Broadcom FullMAC SDIO
wireless device should be reviewed carefully regarding the node name
(just in case a .dts author copied from another .dts which still uses
the "wrong" node name).

Martin Blumenstingl (1):
  dt-binding: net: wireless: fix node name in the BCM43xx example

 Documentation/devicetree/bindings/net/wireless/brcm,bcm43xx-fmac.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.13.0

^ permalink raw reply

* [PATCH 1/1] dt-binding: net: wireless: fix node name in the BCM43xx example
From: Martin Blumenstingl @ 2017-05-15 20:13 UTC (permalink / raw)
  To: kvalo-sgV2jX0FEOL9JmXXK+q4OQ, robh+dt-DgEjT+Ai2ygdnm+yROfE0A,
	mark.rutland-5wv7dgnIgG8, devicetree-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Martin Blumenstingl
In-Reply-To: <20170515201356.26384-1-martin.blumenstingl-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org>

The example in the BCM43xx documentation uses "brcmf" as node name.
However, wireless devices should be named "wifi" instead. Fix this to
make sure that .dts authors can simply use the documentation as
reference (or simply copy the node from the documentation and then
adjust only the board specific bits).

Signed-off-by: Martin Blumenstingl <martin.blumenstingl-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org>
---
 Documentation/devicetree/bindings/net/wireless/brcm,bcm43xx-fmac.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/net/wireless/brcm,bcm43xx-fmac.txt b/Documentation/devicetree/bindings/net/wireless/brcm,bcm43xx-fmac.txt
index 5dbf169cd81c..590f622188de 100644
--- a/Documentation/devicetree/bindings/net/wireless/brcm,bcm43xx-fmac.txt
+++ b/Documentation/devicetree/bindings/net/wireless/brcm,bcm43xx-fmac.txt
@@ -31,7 +31,7 @@ mmc3: mmc@01c12000 {
 	non-removable;
 	status = "okay";
 
-	brcmf: bcrmf@1 {
+	brcmf: wifi@1 {
 		reg = <1>;
 		compatible = "brcm,bcm4329-fmac";
 		interrupt-parent = <&pio>;
-- 
2.13.0

^ permalink raw reply related

* Re: Advice on user space application integration with tc
From: Cong Wang @ 2017-05-15 20:14 UTC (permalink / raw)
  To: Morgan Yang; +Cc: Linux Kernel Network Developers
In-Reply-To: <CAHV_CwaAF-tDqEpPT1LL04tqTLE14qqqhxZtsB+d-ndKot8_UQ@mail.gmail.com>

On Thu, May 11, 2017 at 1:45 PM, Morgan Yang <morgan.yang1982@gmail.com> wrote:
> Hi All:
>
> I want to build a solution that leverages the filtering and actions of
> tc in kernel space, but have the ability to hook to a userspace
> application that can do additional packet processing (such as payload
> masking). I'm curious what are the best ways to go about doing that? I
> have been looking into tc-skbmod and tc-pedit, but as good as they
> are, they would require newer kernels. I have also tried using tc to
> mirror filtered packets to a dummy or tap interface, and have the
> userspace application pick up there, but the performance has been
> subpar. I'm hoping to have a solution that avoids the extra mirroring.


act pedit has existed for a rather long time, I don't think you need a new
kernel to use it, unless of course you have a different definition of
"new kernel". ;)

^ permalink raw reply

* Re: [PATCH net-next 3/3] udp: keep the sk_receive_queue held when splicing
From: Paolo Abeni @ 2017-05-15 20:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, David S. Miller
In-Reply-To: <1494865864.3586.19.camel@redhat.com>

On Mon, 2017-05-15 at 09:11 -0700, Eric Dumazet wrote:
> On Mon, 2017-05-15 at 11:01 +0200, Paolo Abeni wrote:
> > On packet reception, when we are forced to splice the
> > sk_receive_queue, we can keep the related lock held, so
> > that we can avoid re-acquiring it, if fwd memory
> > scheduling is required.
> > 
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> >  net/ipv4/udp.c | 36 ++++++++++++++++++++++++++----------
> >  1 file changed, 26 insertions(+), 10 deletions(-)
> > 
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index 492c76b..d698973 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -1164,7 +1164,8 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
> >  }
> >  
> >  /* fully reclaim rmem/fwd memory allocated for skb */
> > -static void udp_rmem_release(struct sock *sk, int size, int partial)
> > +static void udp_rmem_release(struct sock *sk, int size, int partial,
> > +			     int rx_queue_lock_held)
> 
> Looks good, but please use a bool rx_queue_lock_held ?

Oops, I wrongly omitted most recipients in my initial reply, so I'm
resending. Eric, I'm sorry for the duplicates.

OK, that sounds cleaner, I will submit a v2.

Thank you for reviewing this!

Cheers,

Paolo

^ permalink raw reply

* Re: [Patch net] ipv4: restore rt->fi for reference counting
From: Julian Anastasov @ 2017-05-15 20:37 UTC (permalink / raw)
  To: Cong Wang
  Cc: Eric Dumazet, David Miller, Linux Kernel Network Developers,
	Andrey Konovalov, Eric Dumazet
In-Reply-To: <CAM_iQpXtWUZoGS_KEqfZc+ZbzYxaF1hbAV7H9Qd82=r4auJojw@mail.gmail.com>


	Hello,

On Mon, 15 May 2017, Cong Wang wrote:

> On Fri, May 12, 2017 at 2:27 PM, Julian Anastasov <ja@ssi.bg> wrote:
> >         Now the main question: is FIB_LOOKUP_NOREF used
> > everywhere in IPv4? I guess so. If not, it means
> > someone can walk its res->fi NHs which is bad. I think,
> > this will delay the unregistration for long time and we
> > can not solve the problem.
> >
> >         If yes, free_fib_info() should not use call_rcu.
> > Instead, fib_release_info() will start RCU callback to
> > drop everything via a common function for fib_release_info
> > and free_fib_info. As result, the last fib_info_put will
> > just need to free fi->fib_metrics and fi.
> 
> 
> Yes it is used. But this is a different problem from the
> dev refcnt issue, right? I can send a separate patch to
> address it.

	Any user that does not set FIB_LOOKUP_NOREF
will need nh_dev refcounts. The assumption is that the
NHs are accessed, who knows, maybe even after the RCU grace
period. As a result, we cannot use dev_put on NETDEV_UNREGISTER.
So, we should check if there are users that do not
set FIB_LOOKUP_NOREF; at first look, I don't see such ones
for IPv4.

> >> Are you sure we are safe to call dev_put() in fib_release_info()
> >> for _all_ paths, especially non-unregister paths? See:
> >
> >         Yep, dev_put is safe there...
> >
> >> commit e49cc0da7283088c5e03d475ffe2fdcb24a6d5b1
> >> Author: Yanmin Zhang <yanmin_zhang@linux.intel.com>
> >> Date:   Wed May 23 15:39:45 2012 +0000
> >>
> >>     ipv4: fix the rcu race between free_fib_info and ip_route_output_slow
> >
> >         ...as long as we do not set nh_dev to NULL
> >
> 
> OK, fair enough, then I think the best solution here is to move
> the dev_put() from free_fib_info_rcu() to fib_release_info(),
> fib_nh is already removed from hash there anyway.

	free_fib_info still needs to put the references,
that is the reason for the common fib_info_release() in
my example. It happens in fib_create_info() where free_fib_info()
is called. The func names in my example can be corrected,
if needed.

> diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
> index da449dd..cb712d1 100644
> --- a/net/ipv4/fib_semantics.c
> +++ b/net/ipv4/fib_semantics.c
> @@ -205,8 +205,6 @@ static void free_fib_info_rcu(struct rcu_head *head)
>         struct fib_info *fi = container_of(head, struct fib_info, rcu);
> 
>         change_nexthops(fi) {
> -               if (nexthop_nh->nh_dev)
> -                       dev_put(nexthop_nh->nh_dev);
>                 lwtstate_put(nexthop_nh->nh_lwtstate);
>                 free_nh_exceptions(nexthop_nh);
>                 rt_fibinfo_free_cpus(nexthop_nh->nh_pcpu_rth_output);
> @@ -246,6 +244,14 @@ void fib_release_info(struct fib_info *fi)
>                         if (!nexthop_nh->nh_dev)
>                                 continue;
>                         hlist_del(&nexthop_nh->nh_hash);
> +                       /* We have to release these nh_dev here because a dst
> +                        * could still hold a fib_info via rt->fi, we can't wait
> +                        * for GC, a socket could hold the dst for a long time.
> +                        *
> +                        * This is safe, dev_put() alone does not really free
> +                        * the netdevice, we just have to put the refcnt back.
> +                        */
> +                       dev_put(nexthop_nh->nh_dev);
>                 } endfor_nexthops(fi)
>                 fi->fib_dead = 1;

	Such a solution needs the fib_dead = 1|2 game to
know who dropped the nh_dev reference, fib_release_info (2) or
fib_create_info (1). You cannot remove the dev_put calls
from free_fib_info_rcu.

>                 fib_info_put(fi);

Regards
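
A rough user-space sketch of the "who dropped the reference" bookkeeping
discussed above, under heavy simplification: one nexthop, a plain integer
instead of the netdevice refcount, and a boolean in place of the
fib_dead = 1|2 states. All toy_* names are hypothetical and this is not the
proposed patch, only an illustration of why both paths must be able to put
the reference, exactly once, without clearing nh_dev.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a net_device with a reference count. */
struct toy_dev {
	int refcnt;
};

/* Stand-in for a fib_info with a single nexthop. */
struct toy_fib_info {
	struct toy_dev *nh_dev;
	bool nh_released;	/* hypothetical: reference already dropped */
};

/* Common helper: drop the nexthop device reference at most once. */
void toy_release_nh(struct toy_fib_info *fi)
{
	if (fi->nh_released || fi->nh_dev == NULL)
		return;
	fi->nh_dev->refcnt--;		/* stands in for dev_put() */
	fi->nh_released = true;		/* nh_dev itself is not set to NULL */
}

/* Normal teardown, analogous to fib_release_info(). */
void toy_fib_release_info(struct toy_fib_info *fi)
{
	toy_release_nh(fi);
}

/*
 * Final free, analogous to free_fib_info(). It is also reached from the
 * error path of creation, where the release function above never ran.
 */
void toy_free_fib_info(struct toy_fib_info *fi)
{
	toy_release_nh(fi);
}
```

Calling the release function followed by the free function drops the
reference once; calling only the free function, as the creation error path
would, also drops it once.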

^ permalink raw reply

* Re: Advice on user space application integration with tc
From: Morgan Yang @ 2017-05-15 20:39 UTC (permalink / raw)
  To: Cong Wang; +Cc: Linux Kernel Network Developers
In-Reply-To: <CAM_iQpWnxTuHQfXzxQ6Cg5Ec=D7yJQsvw+3WpnzDuDXAUow7fg@mail.gmail.com>

I tried on both stock CentOS 7.3 and Ubuntu 16.04 and tc-skbmod was
not supported (I built tc from the latest versions of iproute2). For
tc-pedit, examples from man tc-pedit such as "pedit ex munge" were not
supported, but "pedit munge offset" is.

On Mon, May 15, 2017 at 1:14 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Thu, May 11, 2017 at 1:45 PM, Morgan Yang <morgan.yang1982@gmail.com> wrote:
>> Hi All:
>>
>> I want to build a solution that leverages the filtering and actions of
>> tc in kernel space, but have the ability to hook to a userspace
>> application that can do additional packet processing (such as payload
>> masking). I'm curious what are the best ways to go about doing that? I
>> have been looking into tc-skbmod and tc-pedit, but as good as they
>> are, they would require newer kernels. I have also tried using tc to
>> mirror filtered packets to a dummy or tap interface, and have the
>> userspace application pick up there, but the performance has been
>> subpar. I'm hoping to have a solution that avoids the extra mirroring.
>
>
> act pedit has existed for a rather long time, I don't think you need a new
> kernel to use it, unless of course you have a different definition of
> "new kernel". ;)

^ permalink raw reply

* Re: [PATCH v2 net-next 06/12] ep93xx_eth: add GRO support
From: Ryan Mallon @ 2017-05-15 20:43 UTC (permalink / raw)
  To: Lennert Buytenhek, Eric Dumazet
  Cc: David S. Miller, netdev, Eric Dumazet, Hartley Sweeten,
	alexander.sverdlin
In-Reply-To: <20170515103132.GA24992@wantstofly.org>



On 15/05/17 20:31, Lennert Buytenhek wrote:
> On Sat, Feb 04, 2017 at 03:24:56PM -0800, Eric Dumazet wrote:
> 
>> Use napi_complete_done() instead of __napi_complete() to :
>>
>> 1) Get support of gro_flush_timeout if opt-in
>> 2) Not rearm interrupts for busy-polling users.
>> 3) use standard NAPI API.
>> 4) get rid of baroque code and ease maintenance.
>>
>> [...]
>>
>> @@ -310,35 +311,17 @@ static int ep93xx_rx(struct net_device *dev, int processed, int budget)
>>  	return processed;
>>  }
>>  
>> -static int ep93xx_have_more_rx(struct ep93xx_priv *ep)
>> -{
>> -	struct ep93xx_rstat *rstat = ep->descs->rstat + ep->rx_pointer;
>> -	return !!((rstat->rstat0 & RSTAT0_RFP) && (rstat->rstat1 & RSTAT1_RFP));
>> -}
>> -
>>  static int ep93xx_poll(struct napi_struct *napi, int budget)
>>  {
>>  	struct ep93xx_priv *ep = container_of(napi, struct ep93xx_priv, napi);
>>  	struct net_device *dev = ep->dev;
>> -	int rx = 0;
>> -
>> -poll_some_more:
>> -	rx = ep93xx_rx(dev, rx, budget);
>> -	if (rx < budget) {
>> -		int more = 0;
>> +	int rx;
>>  
>> +	rx = ep93xx_rx(dev, budget);
>> +	if (rx < budget && napi_complete_done(napi, rx)) {
>>  		spin_lock_irq(&ep->rx_lock);
>> -		__napi_complete(napi);
>>  		wrl(ep, REG_INTEN, REG_INTEN_TX | REG_INTEN_RX);
>> -		if (ep93xx_have_more_rx(ep)) {
>> -			wrl(ep, REG_INTEN, REG_INTEN_TX);
>> -			wrl(ep, REG_INTSTSP, REG_INTSTS_RX);
>> -			more = 1;
>> -		}
>>  		spin_unlock_irq(&ep->rx_lock);
>> -
>> -		if (more && napi_reschedule(napi))
>> -			goto poll_some_more;
>>  	}
>>  
>>  	if (rx) {
> 
> This code was the way it was because the ep93xx hardware is somewhat
> braindead.  If I remember correctly (but it's been a while since I wrote
> this code):
> 
> 1. ep93xx netdev IRQs are edge-triggered, so if you re-enable IRQs
>    while there was still work to be done, you will not get another IRQ.
> 
> 2. Disabling an interrupt source in the interrupt mask register will
>    cause its interrupt status bit to always return zero, so you cannot
>    check whether an interrupt status is pending without having the
>    interrupt source enabled.
> 
> (I'll admit that a comment explaining this would have been in order.)
> 
> I don't know if we really care about this hardware anymore (I don't),
> but the ep93xx platform is still listed as being maintained in the
> MAINTAINERS file -- adding Ryan and Hartley.

I no longer have any ep93xx hardware to test with, and I never looked at
the Ethernet, so don't know the details. I think there are still a
handful of users. Adding Alexander who sent an ADC support series this
week, who might be able to test this?

~Ryan
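
To make the concern in Lennert's description concrete, here is a toy
user-space model of an edge-triggered RX interrupt. It assumes only what is
stated above (no interrupt is delivered for work that showed up while the
source was masked); the struct and the toy_* functions are hypothetical and
do not reflect the ep93xx register layout or the NAPI API.

```c
#include <stdbool.h>

/* Toy model of an edge-triggered RX interrupt source. */
struct toy_nic {
	int pending;		/* packets waiting in the RX ring */
	bool irq_enabled;	/* RX interrupt source unmasked */
	bool irq_raised;	/* an interrupt was delivered to the CPU */
};

/* A packet arrives: an edge is only seen if the source is unmasked now. */
void toy_packet_arrives(struct toy_nic *nic)
{
	nic->pending++;
	if (nic->irq_enabled)
		nic->irq_raised = true;
}

/*
 * End of a poll round: unmask the interrupt. With 'recheck' set, look at
 * the ring once more and report whether polling must be rescheduled.
 */
bool toy_poll_done(struct toy_nic *nic, bool recheck)
{
	nic->irq_enabled = true;
	return recheck && nic->pending > 0;
}

/*
 * One packet arrives after the ring was drained but before the interrupt
 * is unmasked. Returns true if nothing will ever notice that packet.
 */
bool toy_packet_stranded(bool recheck)
{
	struct toy_nic nic = {
		.pending = 0, .irq_enabled = false, .irq_raised = false,
	};
	bool rescheduled;

	toy_packet_arrives(&nic);
	rescheduled = toy_poll_done(&nic, recheck);
	return nic.pending > 0 && !nic.irq_raised && !rescheduled;
}
```

In this model the packet is stranded without the re-check and picked up
with it, which is the role the removed ep93xx_have_more_rx() test appears
to have played.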

^ permalink raw reply

* RE: [PATCH v2 net-next 3/5] dsa: add DSA switch driver for Microchip KSZ9477
From: Woojung.Huh @ 2017-05-15 20:52 UTC (permalink / raw)
  To: sergei.shtylyov, andrew
  Cc: f.fainelli, vivien.didelot, netdev, davem, UNGLinuxDriver
In-Reply-To: <7f0c4c6a-a6fd-5c4f-9c40-f23f694cee6f@cogentembedded.com>

> >> +	dev->vlan_cache = devm_kmalloc_array(dev->dev,
> >> +					     sizeof(struct vlan_table),
> >> +					     dev->num_vlans, GFP_KERNEL);
> >
> > You should check, but i think devm_kmalloc_array sets the allocated
> > memory to 0.
> 
>     No. Else there would be no need for it, since kcalloc() is a function that
> allocates the arrays and zeroes them.
> 
> > So i don't think you need the memset. If it is needed, i
> > would move it here, after the check the allocation is successful.
> 
>     If it could be done here, kcalloc() should be used.
Andrew & Sergei,

Source shows that devm_kcalloc() calls devm_kmalloc_array() with __GFP_ZERO.
Will use it instead.

Thanks.
Woojung
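
A user-space analogue of the distinction discussed above, as a sketch only:
an array allocator that checks the n * size multiplication for overflow and
zeroes the memory only on request, plus a calloc-style wrapper that always
asks for zeroing. The toy_* names are hypothetical; the only kernel fact
relied on is the one stated above, that devm_kcalloc() is
devm_kmalloc_array() with __GFP_ZERO.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an array allocator without implicit zeroing. */
void *toy_malloc_array(size_t n, size_t size, int zero)
{
	void *p;

	if (size != 0 && n > SIZE_MAX / size)
		return NULL;		/* n * size would overflow */
	p = malloc(n * size);
	if (p != NULL && zero)
		memset(p, 0, n * size);
	return p;
}

/* calloc-style wrapper: same allocation, zeroing always requested. */
void *toy_calloc(size_t n, size_t size)
{
	return toy_malloc_array(n, size, 1);
}
```

With the zeroing variant, a separate memset() after the allocation check is
not needed.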

^ permalink raw reply

* RE: [PATCH v2 net-next 3/5] dsa: add DSA switch driver for Microchip KSZ9477
From: Woojung.Huh @ 2017-05-15 21:01 UTC (permalink / raw)
  To: andrew
  Cc: f.fainelli, vivien.didelot, sergei.shtylyov, netdev, davem,
	UNGLinuxDriver
In-Reply-To: <20170513131341.GC14058@lunn.ch>

> > +static const struct ksz_chip_data ksz_switch_chips[] = {
> > +	{
> > +		.chip_id = 0x00947700,
> > +		.dev_name = "KSZ9477",
> > +		.num_vlans = 4096,
> > +		.num_alus = 4096,
> > +		.num_statics = 16,
> > +		.enabled_ports = 0x1F,	/* port0-4 */
> > +		.cpu_port = 5,		/* port5 (RGMII) */
> > +		.port_cnt = 7,
> > +		.phy_port_cnt = 5,
> > +	},
> > +};
> 
> Hi Woojung
> 
> Do we need cpu_port in this table? Can any port be used as a CPU port?
> From the code in ksz_config_cpu_port() it seems like it can.
> 
> And do we need enabled_ports? This seems to suggest only ports 0-4 can
> be user ports?
> 
Andrew,

The intention of cpu_port is to provide a default CPU port when the devicetree doesn't specify one.
However, it won't get back to dst, so it won't be needed.
Will remove it.

Enabled_ports was meant to configure the physically connected ports. (For instance, a 7-port switch on a board that only uses 4 ports.)
This code path is not working as expected. Will update it in the next version of the patch.

Thanks.
Woojung

^ permalink raw reply

* Re: [PATCH v2 net-next 06/12] ep93xx_eth: add GRO support
From: Alexander Sverdlin @ 2017-05-15 21:02 UTC (permalink / raw)
  To: Ryan Mallon, Lennert Buytenhek, Eric Dumazet
  Cc: David S. Miller, netdev, Eric Dumazet, Hartley Sweeten
In-Reply-To: <591A12E8.1050603@gmail.com>

Hi!

On Mon May 15 22:43:20 2017 Ryan Mallon <rmallon@gmail.com> wrote:
> > I don't know if we really care about this hardware anymore (I don't),
> > but the ep93xx platform is still listed as being maintained in the
> > MAINTAINERS file -- adding Ryan and Hartley.
> 
> I no longer have any ep93xx hardware to test with, and I never looked at
> the Ethernet, so don't know the details. I think there are still a
> handful of users. Adding Alexander who sent an ADC support series this
> week, who might be able to test this?

Yes, I very much care about ep93xx code being functional :)
I'll test the patches tomorrow.

Alexander.

^ permalink raw reply
