Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net] tcp: ulp: fix possible crash in tcp_diag_get_aux_size()
From: David Miller @ 2019-09-07 15:34 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet, lukehsiao, ncardwell, dcaratti
In-Reply-To: <20190905202041.138085-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Thu,  5 Sep 2019 13:20:41 -0700

> tcp_diag_get_aux_size() can be called with sockets in any state.
> 
> icsk_ulp_ops is only present for full sockets.
> 
> For SYN_RECV or TIME_WAIT ones we would access garbage.
> 
> Fixes: 61723b393292 ("tcp: ulp: add functions to dump ulp-specific information")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Luke Hsiao <lukehsiao@google.com>
> Reported-by: Neal Cardwell <ncardwell@google.com>

Applied to net-next, thanks Eric.

^ permalink raw reply

* Re: [PATCH] net/ibmvnic: free reset work of removed device from queue
From: David Miller @ 2019-09-07 15:37 UTC (permalink / raw)
  To: julietk; +Cc: netdev, linuxppc-dev
In-Reply-To: <20190905213001.19818-1-julietk@linux.vnet.ibm.com>

From: Juliet Kim <julietk@linux.vnet.ibm.com>
Date: Thu,  5 Sep 2019 17:30:01 -0400

> Commit 36f1031c51a2 ("ibmvnic: Do not process reset during or after
>  device removal") made the change to exit reset if the driver has been
> removed, but does not free reset work items of the adapter from queue.
> 
> Ensure all reset work items are freed when breaking out of the loop early.
> 
> Fixes: 36f1031c51a2 ("ibmnvic: Do not process reset during or after
> device removal”)

Please do not break up Fixes: tags into mutliple lines, also please do
not put an empty line between the Fixes: tag and other tags like the
Signed-off-by:

> Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH 1/2] net: phy: dp83867: Add documentation for SGMII mode type
From: Andrew Lunn @ 2019-09-07 15:39 UTC (permalink / raw)
  To: Vitaly Gaiduk
  Cc: davem, robh+dt, f.fainelli, Mark Rutland, Trent Piepho, netdev,
	devicetree, linux-kernel
In-Reply-To: <1567700761-14195-2-git-send-email-vitaly.gaiduk@cloudbear.ru>

On Thu, Sep 05, 2019 at 07:26:00PM +0300, Vitaly Gaiduk wrote:
> Add documentation of ti,sgmii-type which can be used to select
> SGMII mode type (4 or 6-wire).
> 
> Signed-off-by: Vitaly Gaiduk <vitaly.gaiduk@cloudbear.ru>
> ---
>  Documentation/devicetree/bindings/net/ti,dp83867.txt | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/Documentation/devicetree/bindings/net/ti,dp83867.txt b/Documentation/devicetree/bindings/net/ti,dp83867.txt
> index db6aa3f2215b..18e7fd52897f 100644
> --- a/Documentation/devicetree/bindings/net/ti,dp83867.txt
> +++ b/Documentation/devicetree/bindings/net/ti,dp83867.txt
> @@ -37,6 +37,7 @@ Optional property:
>  			      for applicable values.  The CLK_OUT pin can also
>  			      be disabled by this property.  When omitted, the
>  			      PHY's default will be left as is.
> +	- ti,sgmii-type - This denotes the fact which SGMII mode is used (4 or 6-wire).

Hi Vitaly

You probably want to make this a Boolean. I don't think SGMII type is
a good idea. This is about enabling the receive clock to be passed to
the MAC. So how about ti,sgmii-ref-clock-output-enable.

    Andrew

^ permalink raw reply

* Re: [pull request][net-next 00/14] Mellanox, mlx5 cleanups & port congestion stats
From: David Miller @ 2019-09-07 15:40 UTC (permalink / raw)
  To: saeedm; +Cc: netdev
In-Reply-To: <20190905215034.22713-1-saeedm@mellanox.com>

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Thu, 5 Sep 2019 21:50:52 +0000

> This series provides 12 mlx5 cleanup patches and last 2 patches provide
> port congestion stats to ethtool.
> 
> For more information please see tag log below.
> 
> Please pull and let me know if there is any problem.

Pulled, thanks Saeed.

^ permalink raw reply

* Re: [PATCH 1/2] net: phy: dp83867: Add documentation for SGMII mode type
From: Florian Fainelli @ 2019-09-07 15:41 UTC (permalink / raw)
  To: Vitaly Gaiduk, Andrew Lunn
  Cc: davem, robh+dt, Mark Rutland, Trent Piepho, netdev, devicetree,
	linux-kernel
In-Reply-To: <23dc47ea-209f-9f51-d4a5-161e62e2a69e@cloudbear.ru>



On 9/6/2019 1:45 PM, Vitaly Gaiduk wrote:
> Hi, Andrew.
> 
> I'm not familiar with generic PHY HW archs but suppose that it is
> proprietary to TI.
> 
> I'v never seen such feature so moved it in TI dts field.

My search engine results seem to indicate that this is indeed TI
specific only and this is not something that exists with other vendors
apparently. The "ti," prefix is therefore appropriate.
-- 
Florian

^ permalink raw reply

* Re: [PATCH net-next,v2, 0/2] Enable sg as tunable, sync offload settings to VF NIC
From: David Miller @ 2019-09-07 15:43 UTC (permalink / raw)
  To: haiyangz
  Cc: sashal, linux-hyperv, netdev, kys, sthemmin, olaf, vkuznets,
	linux-kernel
In-Reply-To: <1567725722-33552-1-git-send-email-haiyangz@microsoft.com>

From: Haiyang Zhang <haiyangz@microsoft.com>
Date: Thu, 5 Sep 2019 23:22:58 +0000

> This patch set fixes an issue in SG tuning, and sync
> offload settings from synthetic NIC to VF NIC.

Series applied to net-next.

^ permalink raw reply

* Re: [PATCH net v2] isdn/capi: check message length in capi_write()
From: David Miller @ 2019-09-07 15:47 UTC (permalink / raw)
  To: ebiggers; +Cc: netdev, isdn, syzkaller-bugs
In-Reply-To: <20190906023637.768-1-ebiggers@kernel.org>

From: Eric Biggers <ebiggers@kernel.org>
Date: Thu,  5 Sep 2019 19:36:37 -0700

> From: Eric Biggers <ebiggers@google.com>
> 
> syzbot reported:
> 
>     BUG: KMSAN: uninit-value in capi_write+0x791/0xa90 drivers/isdn/capi/capi.c:700
>     CPU: 0 PID: 10025 Comm: syz-executor379 Not tainted 4.20.0-rc7+ #2
>     Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>     Call Trace:
>       __dump_stack lib/dump_stack.c:77 [inline]
>       dump_stack+0x173/0x1d0 lib/dump_stack.c:113
>       kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
>       __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313
>       capi_write+0x791/0xa90 drivers/isdn/capi/capi.c:700
>       do_loop_readv_writev fs/read_write.c:703 [inline]
>       do_iter_write+0x83e/0xd80 fs/read_write.c:961
>       vfs_writev fs/read_write.c:1004 [inline]
>       do_writev+0x397/0x840 fs/read_write.c:1039
>       __do_sys_writev fs/read_write.c:1112 [inline]
>       __se_sys_writev+0x9b/0xb0 fs/read_write.c:1109
>       __x64_sys_writev+0x4a/0x70 fs/read_write.c:1109
>       do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
>       entry_SYSCALL_64_after_hwframe+0x63/0xe7
>     [...]
> 
> The problem is that capi_write() is reading past the end of the message.
> Fix it by checking the message's length in the needed places.
> 
> Reported-and-tested-by: syzbot+0849c524d9c634f5ae66@syzkaller.appspotmail.com
> Signed-off-by: Eric Biggers <ebiggers@google.com>

Applied and queued up for -stable.

^ permalink raw reply

* Re: [PATCH] net-ipv6: addrconf_f6i_alloc - fix non-null pointer check to !IS_ERR()
From: David Miller @ 2019-09-07 15:48 UTC (permalink / raw)
  To: zenczykowski; +Cc: maze, netdev, dsahern, lorenzo, edumazet
In-Reply-To: <20190906035637.47097-1-zenczykowski@gmail.com>

From: Maciej Żenczykowski <zenczykowski@gmail.com>
Date: Thu,  5 Sep 2019 20:56:37 -0700

> From: Maciej Żenczykowski <maze@google.com>
> 
> Fixes a stupid bug I recently introduced...
> ip6_route_info_create() returns an ERR_PTR(err) and not a NULL on error.
> 
> Fixes: d55a2e374a94 ("net-ipv6: fix excessive RTF_ADDRCONF flag on ::1/128 local route (and others)'")
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Lorenzo Colitti <lorenzo@google.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Maciej Żenczykowski <maze@google.com>

Applied and queued up for -stable since I queued up the patch this fixes.

^ permalink raw reply

* Re: [PATCHv3 net-next] ipmr: remove hard code cache_resolve_queue_len limit
From: David Miller @ 2019-09-07 15:49 UTC (permalink / raw)
  To: liuhangbin; +Cc: netdev, karn, sukumarg1973, kuznet, yoshfuji, eric.dumazet
In-Reply-To: <20190906073601.10525-1-liuhangbin@gmail.com>

From: Hangbin Liu <liuhangbin@gmail.com>
Date: Fri,  6 Sep 2019 15:36:01 +0800

> This is a re-post of previous patch wrote by David Miller[1].
> 
> Phil Karn reported[2] that on busy networks with lots of unresolved
> multicast routing entries, the creation of new multicast group routes
> can be extremely slow and unreliable.
> 
> The reason is we hard-coded multicast route entries with unresolved source
> addresses(cache_resolve_queue_len) to 10. If some multicast route never
> resolves and the unresolved source addresses increased, there will
> be no ability to create new multicast route cache.
> 
> To resolve this issue, we need either add a sysctl entry to make the
> cache_resolve_queue_len configurable, or just remove cache_resolve_queue_len
> limit directly, as we already have the socket receive queue limits of mrouted
> socket, pointed by David.
> 
> From my side, I'd perfer to remove the cache_resolve_queue_len limit instead
> of creating two more(IPv4 and IPv6 version) sysctl entry.
> 
> [1] https://lkml.org/lkml/2018/7/22/11
> [2] https://lkml.org/lkml/2018/7/21/343
> 
> v3: instead of remove cache_resolve_queue_len totally, let's only remove
> the hard code limit when allocate the unresolved cache, as Eric Dumazet
> suggested, so we don't need to re-count it in other places.
> 
> v2: hold the mfc_unres_lock while walking the unresolved list in
> queue_count(), as Nikolay Aleksandrov remind.
> 
> Reported-by: Phil Karn <karn@ka9q.net>
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>

Applied to net-next.

^ permalink raw reply

* Re: [PATCH net-next 0/5] net: stmmac: Improvements and fixes for -next
From: David Miller @ 2019-09-07 15:57 UTC (permalink / raw)
  To: Jose.Abreu
  Cc: netdev, Joao.Pinto, peppe.cavallaro, alexandre.torgue,
	mcoquelin.stm32, linux-arm-kernel, linux-kernel
In-Reply-To: <cover.1567755423.git.joabreu@synopsys.com>

From: Jose Abreu <Jose.Abreu@synopsys.com>
Date: Fri,  6 Sep 2019 09:41:12 +0200

> Improvements and fixes for recently introduced features. All for -next tree.
> More info in commit logs.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH] ethernet: micrel: Use DIV_ROUND_CLOSEST directly to make it readable
From: Andrew Lunn @ 2019-09-07 15:58 UTC (permalink / raw)
  To: zhong jiang; +Cc: davem, kstewart, gregkh, netdev, linux-kernel
In-Reply-To: <5D732078.6080902@huawei.com>

On Sat, Sep 07, 2019 at 11:14:00AM +0800, zhong jiang wrote:
> On 2019/9/7 3:40, Andrew Lunn wrote:
> > On Thu, Sep 05, 2019 at 11:53:48PM +0800, zhong jiang wrote:
> >> The kernel.h macro DIV_ROUND_CLOSEST performs the computation (x + d/2)/d
> >> but is perhaps more readable.
> > Hi Zhong
> >
> > Did you find this by hand, or did you use a tool. If a tool is used,
> > it is normal to give some credit to the tool.
> With the following help of Coccinelle. 

It is good to mention Coccinelle or other such tools. They often exist
because of university research work, and funding for such tools does
depend on publicity of the tools, getting the credit they deserve.

       Andrew

^ permalink raw reply

* Re: [PATCH v2 net] net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list
From: David Miller @ 2019-09-07 16:00 UTC (permalink / raw)
  To: shmulik
  Cc: alexander.duyck, daniel, eric.dumazet, willemdebruijn.kernel,
	netdev, eyal, shmulik.ladkani
In-Reply-To: <20190906092350.13929-1-shmulik.ladkani@gmail.com>

From: Shmulik Ladkani <shmulik@metanetworks.com>
Date: Fri,  6 Sep 2019 12:23:50 +0300

> Historically, support for frag_list packets entering skb_segment() was
> limited to frag_list members terminating on exact same gso_size
> boundaries. This is verified with a BUG_ON since commit 89319d3801d1
> ("net: Add frag_list support to skb_segment"), quote:
> 
>     As such we require all frag_list members terminate on exact MSS
>     boundaries.  This is checked using BUG_ON.
>     As there should only be one producer in the kernel of such packets,
>     namely GRO, this requirement should not be difficult to maintain.
> 
> However, since commit 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper"),
> the "exact MSS boundaries" assumption no longer holds:
> An eBPF program using bpf_skb_change_proto() DOES modify 'gso_size', but
> leaves the frag_list members as originally merged by GRO with the
> original 'gso_size'. Example of such programs are bpf-based NAT46 or
> NAT64.
> 
> This lead to a kernel BUG_ON for flows involving:
>  - GRO generating a frag_list skb
>  - bpf program performing bpf_skb_change_proto() or bpf_skb_adjust_room()
>  - skb_segment() of the skb
> 
> See example BUG_ON reports in [0].
> 
> In commit 13acc94eff12 ("net: permit skb_segment on head_frag frag_list skb"),
> skb_segment() was modified to support the "gso_size mangling" case of
> a frag_list GRO'ed skb, but *only* for frag_list members having
> head_frag==true (having a page-fragment head).
> 
> Alas, GRO packets having frag_list members with a linear kmalloced head
> (head_frag==false) still hit the BUG_ON.
> 
> This commit adds support to skb_segment() for a 'head_skb' packet having
> a frag_list whose members are *non* head_frag, with gso_size mangled, by
> disabling SG and thus falling-back to copying the data from the given
> 'head_skb' into the generated segmented skbs - as suggested by Willem de
> Bruijn [1].
> 
> Since this approach involves the penalty of skb_copy_and_csum_bits()
> when building the segments, care was taken in order to enable this
> solution only when required:
>  - untrusted gso_size, by testing SKB_GSO_DODGY is set
>    (SKB_GSO_DODGY is set by any gso_size mangling functions in
>     net/core/filter.c)
>  - the frag_list is non empty, its item is a non head_frag, *and* the
>    headlen of the given 'head_skb' does not match the gso_size.
> 
> [0]
> https://lore.kernel.org/netdev/20190826170724.25ff616f@pixies/
> https://lore.kernel.org/netdev/9265b93f-253d-6b8c-f2b8-4b54eff1835c@fb.com/
> 
> [1]
> https://lore.kernel.org/netdev/CA+FuTSfVsgNDi7c=GUU8nMg2hWxF2SjCNLXetHeVPdnxAW5K-w@mail.gmail.com/
> 
> Fixes: 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper")
> Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Alexander Duyck <alexander.duyck@gmail.com>
> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> ---
> v2: reorder the test conditions, as suggested by Alexander Duyck

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH net-next] ionic: Remove unused including <linux/version.h>
From: David Miller @ 2019-09-07 16:01 UTC (permalink / raw)
  To: yuehaibing; +Cc: snelson, drivers, netdev, kernel-janitors
In-Reply-To: <20190906095410.107596-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Fri, 6 Sep 2019 09:54:09 +0000

> Remove including <linux/version.h> that don't need it.
> 
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH] be2net: make two arrays static const, makes object smaller
From: David Miller @ 2019-09-07 16:02 UTC (permalink / raw)
  To: colin.king
  Cc: sathya.perla, ajit.khaparde, sriharsha.basavapatna, somnath.kotur,
	netdev, kernel-janitors, linux-kernel
In-Reply-To: <20190906111943.5285-1-colin.king@canonical.com>

From: Colin King <colin.king@canonical.com>
Date: Fri,  6 Sep 2019 12:19:43 +0100

> From: Colin Ian King <colin.king@canonical.com>
> 
> Don't populate the arrays on the stack but instead make them
> static const. Makes the object code smaller by 281 bytes.
> 
> Before:
>    text	   data	    bss	    dec	    hex	filename
>   87553	   5672	      0	  93225	  16c29	benet/be_cmds.o
> 
> After:
>    text	   data	    bss	    dec	    hex	filename
>   87112	   5832	      0	  92944	  16b10	benet/be_cmds.o
> 
> (gcc version 9.2.1, amd64)
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

Applied.

^ permalink raw reply

* Re: [PATCH] net: hns3: make array spec_opcode static const, makes object smaller
From: David Miller @ 2019-09-07 16:03 UTC (permalink / raw)
  To: colin.king
  Cc: yisen.zhuang, salil.mehta, lipeng321, netdev, kernel-janitors,
	linux-kernel
In-Reply-To: <20190906112804.7812-1-colin.king@canonical.com>

From: Colin King <colin.king@canonical.com>
Date: Fri,  6 Sep 2019 12:28:04 +0100

> From: Colin Ian King <colin.king@canonical.com>
> 
> Don't populate the array spec_opcode on the stack but instead make it
> static const. Makes the object code smaller by 48 bytes.
> 
> Before:
>    text	   data	    bss	    dec	    hex	filename
>    6914	   1040	    128	   8082	   1f92	hns3/hns3vf/hclgevf_cmd.o
> 
> After:
>    text	   data	    bss	    dec	    hex	filename
>    6866	   1040	    128	   8034	   1f62	hns3/hns3vf/hclgevf_cmd.o
> 
> (gcc version 9.2.1, amd64)
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

Applied.

^ permalink raw reply

* Re: [PATCH net] nfp: flower: cmsg rtnl locks can timeout reify messages
From: David Miller @ 2019-09-07 16:06 UTC (permalink / raw)
  To: simon.horman; +Cc: jakub.kicinski, netdev, oss-drivers, frederik.lotter
In-Reply-To: <20190906172941.25136-1-simon.horman@netronome.com>

From: Simon Horman <simon.horman@netronome.com>
Date: Fri,  6 Sep 2019 19:29:41 +0200

> From: Fred Lotter <frederik.lotter@netronome.com>
> 
> Flower control message replies are handled in different locations. The truly
> high priority replies are handled in the BH (tasklet) context, while the
> remaining replies are handled in a predefined Linux work queue. The work
> queue handler orders replies into high and low priority groups, and always
> start servicing the high priority replies within the received batch first.
> 
> Reply Type:			Rtnl Lock:	Handler:
 ...
> A subset of control messages can block waiting for an rtnl lock (from both
> work queue priority groups). The rtnl lock is heavily contended for by
> external processes such as systemd-udevd, systemd-network and libvirtd,
> especially during netdev creation, such as when flower VFs and representors
> are instantiated.
> 
> Kernel netlink instrumentation shows that external processes (such as
> systemd-udevd) often use successive rtnl_trylock() sequences, which can result
> in an rtnl_lock() blocked control message to starve for longer periods of time
> during rtnl lock contention, i.e. netdev creation.
> 
> In the current design a single blocked control message will block the entire
> work queue (both priorities), and introduce a latency which is
> nondeterministic and dependent on system wide rtnl lock usage.
> 
> In some extreme cases, one blocked control message at exactly the wrong time,
> just before the maximum number of VFs are instantiated, can block the work
> queue for long enough to prevent VF representor REIFY replies from getting
> handled in time for the 40ms timeout.
> 
> The firmware will deliver the total maximum number of REIFY message replies in
> around 300us.
> 
> Only REIFY and MTU update messages require replies within a timeout period (of
> 40ms). The MTU-only updates are already done directly in the BH (tasklet)
> handler.
> 
> Move the REIFY handler down into the BH (tasklet) in order to resolve timeouts
> caused by a blocked work queue waiting on rtnl locks.
> 
> Signed-off-by: Fred Lotter <frederik.lotter@netronome.com>
> Signed-off-by: Simon Horman <simon.horman@netronome.com>

Applied.

^ permalink raw reply

* Re: pull request: bluetooth-next 2019-09-06
From: David Miller @ 2019-09-07 16:08 UTC (permalink / raw)
  To: johan.hedberg; +Cc: netdev, linux-bluetooth
In-Reply-To: <20190906172339.GA74057@jmoran1-mobl1.ger.corp.intel.com>

From: Johan Hedberg <johan.hedberg@gmail.com>
Date: Fri, 6 Sep 2019 20:23:39 +0300

> Here's the main bluetooth-next pull request for the 5.4 kernel.
> 
>  - Cleanups & fixes to btrtl driver
>  - Fixes for Realtek devices in btusb, e.g. for suspend handling
>  - Firmware loading support for BCM4345C5
>  - hidp_send_message() return value handling fixes
>  - Added support for utilizing Fast Advertising Interval
>  - Various other minor cleanups & fixes
> 
> Please let me know if there are any issues pulling. Thanks.

Pulled, thanks.

^ permalink raw reply

* Re: [PATCH net-next 0/4] net/tls: small TX offload optimizations
From: David Miller @ 2019-09-07 16:11 UTC (permalink / raw)
  To: jakub.kicinski
  Cc: netdev, oss-drivers, davejwatson, borisp, aviadye, john.fastabend,
	daniel
In-Reply-To: <20190907053000.23869-1-jakub.kicinski@netronome.com>

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Fri,  6 Sep 2019 22:29:56 -0700

> Hi!
> 
> This set brings small TLS TX device optimizations. The biggest
> gain comes from fixing a misuse of non temporal copy instructions.
> On a synthetic workload modelled after customer's RFC application
> I see 3-5% percent gain.

Series applied.

But if history is any indication I'd watch for how much this actually
helps or hurts universally.  We once tried to use non-temporal stores
for sendmsg/recvmsg copies and had to turn that off because it only
helped in certain situations on certain cpus and hurt in others.

^ permalink raw reply

* Re: [PATCH net-next] netfilter: nf_tables: avoid excessive stack usage
From: Pablo Neira Ayuso @ 2019-09-07 18:07 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Jozsef Kadlecsik, Florian Westphal, David S. Miller,
	Jakub Kicinski, wenxu, netfilter-devel, coreteam, netdev,
	linux-kernel
In-Reply-To: <20190906151242.1115282-1-arnd@arndb.de>

[-- Attachment #1: Type: text/plain, Size: 998 bytes --]

Hi Arnd,

On Fri, Sep 06, 2019 at 05:12:30PM +0200, Arnd Bergmann wrote:
> The nft_offload_ctx structure is much too large to put on the
> stack:
> 
> net/netfilter/nf_tables_offload.c:31:23: error: stack frame size of 1200 bytes in function 'nft_flow_rule_create' [-Werror,-Wframe-larger-than=]
> 
> Use dynamic allocation here, as we do elsewhere in the same
> function.
>
> Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> Since we only really care about two members of the structure, an
> alternative would be a larger rewrite, but that is probably too
> late for v5.4.

Thanks for this patch.

I'm attaching a patch to reduce this structure size a bit. Do you
think this alternative patch is ok until this alternative rewrite
happens? Anyway I agree we should to get this structure away from the
stack, even after this is still large, so your patch (or a variant of
it) will be useful sooner than later I think.

[-- Attachment #2: x.patch --]
[-- Type: text/x-diff, Size: 485 bytes --]

diff --git a/include/net/netfilter/nf_tables_offload.h b/include/net/netfilter/nf_tables_offload.h
index db104665a9e4..cc44d29e9fd7 100644
--- a/include/net/netfilter/nf_tables_offload.h
+++ b/include/net/netfilter/nf_tables_offload.h
@@ -5,10 +5,10 @@
 #include <net/netfilter/nf_tables.h>
 
 struct nft_offload_reg {
-	u32		key;
-	u32		len;
-	u32		base_offset;
-	u32		offset;
+	u8		key;
+	u8		len;
+	u8		base_offset;
+	u8		offset;
 	struct nft_data data;
 	struct nft_data	mask;
 };

^ permalink raw reply related

* Re: [PATCH net-next] netfilter: nf_tables: avoid excessive stack usage
From: Arnd Bergmann @ 2019-09-07 18:41 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Jozsef Kadlecsik, Florian Westphal, David S. Miller,
	Jakub Kicinski, wenxu, netfilter-devel, coreteam, Networking,
	linux-kernel@vger.kernel.org
In-Reply-To: <20190907180754.dz7gstqfj7djlbrs@salvia>

On Sat, Sep 7, 2019 at 8:07 PM Pablo Neira Ayuso <pablo@netfilter.org> wrote:
>
> Hi Arnd,
>
> On Fri, Sep 06, 2019 at 05:12:30PM +0200, Arnd Bergmann wrote:
> > The nft_offload_ctx structure is much too large to put on the
> > stack:
> >
> > net/netfilter/nf_tables_offload.c:31:23: error: stack frame size of 1200 bytes in function 'nft_flow_rule_create' [-Werror,-Wframe-larger-than=]
> >
> > Use dynamic allocation here, as we do elsewhere in the same
> > function.
> >
> > Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support")
> > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> > ---
> > Since we only really care about two members of the structure, an
> > alternative would be a larger rewrite, but that is probably too
> > late for v5.4.
>
> Thanks for this patch.
>
> I'm attaching a patch to reduce this structure size a bit. Do you
> think this alternative patch is ok until this alternative rewrite
> happens?

I haven't tried it yet, but it looks like that would save 8 of the
48 bytes in each for each of the 24 registers (12 bytes on m68k
or i386, which only use 4 byte alignment for nft_data), so
this wouldn't make too much difference.

> Anyway I agree we should to get this structure away from the
> stack, even after this is still large, so your patch (or a variant of
> it) will be useful sooner than later I think.

What I was thinking for a possible smaller fix would be to not
pass the ctx into the expr->ops->offload callback but
only pass the 'dep' member. Since I've never seen this code
before, I have no idea if that would be an improvement
in the end.

       Arnd

^ permalink raw reply

* Re: [PATCH net-next] netfilter: nf_tables: avoid excessive stack usage
From: Pablo Neira Ayuso @ 2019-09-07 18:52 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Jozsef Kadlecsik, Florian Westphal, David S. Miller,
	Jakub Kicinski, wenxu, netfilter-devel, coreteam, Networking,
	linux-kernel@vger.kernel.org
In-Reply-To: <CAK8P3a04ic_VP6L_=N5P7vfQG1VDV25g3KvUpuCVdX483hx_cA@mail.gmail.com>

On Sat, Sep 07, 2019 at 08:41:22PM +0200, Arnd Bergmann wrote:
> On Sat, Sep 7, 2019 at 8:07 PM Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> >
> > Hi Arnd,
> >
> > On Fri, Sep 06, 2019 at 05:12:30PM +0200, Arnd Bergmann wrote:
> > > The nft_offload_ctx structure is much too large to put on the
> > > stack:
> > >
> > > net/netfilter/nf_tables_offload.c:31:23: error: stack frame size of 1200 bytes in function 'nft_flow_rule_create' [-Werror,-Wframe-larger-than=]
> > >
> > > Use dynamic allocation here, as we do elsewhere in the same
> > > function.
> > >
> > > Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support")
> > > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> > > ---
> > > Since we only really care about two members of the structure, an
> > > alternative would be a larger rewrite, but that is probably too
> > > late for v5.4.
> >
> > Thanks for this patch.
> >
> > I'm attaching a patch to reduce this structure size a bit. Do you
> > think this alternative patch is ok until this alternative rewrite
> > happens?
> 
> I haven't tried it yet, but it looks like that would save 8 of the
> 48 bytes in each for each of the 24 registers (12 bytes on m68k
> or i386, which only use 4 byte alignment for nft_data), so
> this wouldn't make too much difference.

I'll take your patch as is.

> > Anyway I agree we should to get this structure away from the
> > stack, even after this is still large, so your patch (or a variant of
> > it) will be useful sooner than later I think.
> 
> What I was thinking for a possible smaller fix would be to not
> pass the ctx into the expr->ops->offload callback but
> only pass the 'dep' member. Since I've never seen this code
> before, I have no idea if that would be an improvement
> in the end.

We might need this more fields of this context structure, this code is
very new, still under development, let's revisit this later.

Thanks.

^ permalink raw reply

* [PATCH net-next 0/3] net: dsa: mv88e6xxx: add PCL support
From: Vivien Didelot @ 2019-09-07 20:00 UTC (permalink / raw)
  To: netdev; +Cc: davem, f.fainelli, andrew, Vivien Didelot

This small series implements the ethtool RXNFC operations in the
mv88e6xxx DSA driver to configure a port's Layer 2 Policy Control List
(PCL) supported by models such as 88E6352 and 88E6390 and equivalent.

This allows to configure a port to discard frames based on a configured
destination or source MAC address and an optional VLAN, with e.g.:

    # ethtool --config-nfc lan1 flow-type ether src 00:11:22:33:44:55 action -1

Vivien Didelot (3):
  net: dsa: mv88e6xxx: complete ATU state definitions
  net: dsa: mv88e6xxx: introduce .port_set_policy
  net: dsa: mv88e6xxx: add RXNFC support

 drivers/net/dsa/mv88e6xxx/chip.c        | 241 ++++++++++++++++++++++--
 drivers/net/dsa/mv88e6xxx/chip.h        |  35 ++++
 drivers/net/dsa/mv88e6xxx/global1.h     |  43 +++--
 drivers/net/dsa/mv88e6xxx/global1_atu.c |   6 +-
 drivers/net/dsa/mv88e6xxx/port.c        |  74 ++++++++
 drivers/net/dsa/mv88e6xxx/port.h        |  17 +-
 6 files changed, 388 insertions(+), 28 deletions(-)

-- 
2.23.0


^ permalink raw reply

* [PATCH net-next 1/3] net: dsa: mv88e6xxx: complete ATU state definitions
From: Vivien Didelot @ 2019-09-07 20:00 UTC (permalink / raw)
  To: netdev; +Cc: davem, f.fainelli, andrew, Vivien Didelot
In-Reply-To: <20190907200049.25273-1-vivien.didelot@gmail.com>

Marvell has different values for the state of a MAC address,
depending on its multicast bit. This patch completes the definitions
for these states.

At the same time, use 0 which is intuitive enough and simplifies the
code a bit, instead of the UC or MC unused value.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c        | 19 +++++------
 drivers/net/dsa/mv88e6xxx/global1.h     | 43 +++++++++++++++++--------
 drivers/net/dsa/mv88e6xxx/global1_atu.c |  6 ++--
 3 files changed, 41 insertions(+), 27 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 30365a54c31b..0d54a69f3622 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -1497,7 +1497,7 @@ static int mv88e6xxx_port_db_load_purge(struct mv88e6xxx_chip *chip, int port,
 		fid = vlan.fid;
 	}
 
-	entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UNUSED;
+	entry.state = 0;
 	ether_addr_copy(entry.mac, addr);
 	eth_addr_dec(entry.mac);
 
@@ -1506,17 +1506,16 @@ static int mv88e6xxx_port_db_load_purge(struct mv88e6xxx_chip *chip, int port,
 		return err;
 
 	/* Initialize a fresh ATU entry if it isn't found */
-	if (entry.state == MV88E6XXX_G1_ATU_DATA_STATE_UNUSED ||
-	    !ether_addr_equal(entry.mac, addr)) {
+	if (!entry.state || !ether_addr_equal(entry.mac, addr)) {
 		memset(&entry, 0, sizeof(entry));
 		ether_addr_copy(entry.mac, addr);
 	}
 
 	/* Purge the ATU entry only if no port is using it anymore */
-	if (state == MV88E6XXX_G1_ATU_DATA_STATE_UNUSED) {
+	if (!state) {
 		entry.portvec &= ~BIT(port);
 		if (!entry.portvec)
-			entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UNUSED;
+			entry.state = 0;
 	} else {
 		entry.portvec |= BIT(port);
 		entry.state = state;
@@ -1732,8 +1731,7 @@ static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
 	int err;
 
 	mv88e6xxx_reg_lock(chip);
-	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid,
-					   MV88E6XXX_G1_ATU_DATA_STATE_UNUSED);
+	err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0);
 	mv88e6xxx_reg_unlock(chip);
 
 	return err;
@@ -1747,7 +1745,7 @@ static int mv88e6xxx_port_db_dump_fid(struct mv88e6xxx_chip *chip,
 	bool is_static;
 	int err;
 
-	addr.state = MV88E6XXX_G1_ATU_DATA_STATE_UNUSED;
+	addr.state = 0;
 	eth_broadcast_addr(addr.mac);
 
 	do {
@@ -1755,7 +1753,7 @@ static int mv88e6xxx_port_db_dump_fid(struct mv88e6xxx_chip *chip,
 		if (err)
 			return err;
 
-		if (addr.state == MV88E6XXX_G1_ATU_DATA_STATE_UNUSED)
+		if (!addr.state)
 			break;
 
 		if (addr.trunk || (addr.portvec & BIT(port)) == 0)
@@ -4690,8 +4688,7 @@ static int mv88e6xxx_port_mdb_del(struct dsa_switch *ds, int port,
 	int err;
 
 	mv88e6xxx_reg_lock(chip);
-	err = mv88e6xxx_port_db_load_purge(chip, port, mdb->addr, mdb->vid,
-					   MV88E6XXX_G1_ATU_DATA_STATE_UNUSED);
+	err = mv88e6xxx_port_db_load_purge(chip, port, mdb->addr, mdb->vid, 0);
 	mv88e6xxx_reg_unlock(chip);
 
 	return err;
diff --git a/drivers/net/dsa/mv88e6xxx/global1.h b/drivers/net/dsa/mv88e6xxx/global1.h
index 78b9ae22d18c..0870fcc8bfc8 100644
--- a/drivers/net/dsa/mv88e6xxx/global1.h
+++ b/drivers/net/dsa/mv88e6xxx/global1.h
@@ -128,19 +128,36 @@
 #define MV88E6XXX_G1_ATU_OP_FULL_VIOLATION		BIT(4)
 
 /* Offset 0x0C: ATU Data Register */
-#define MV88E6XXX_G1_ATU_DATA				0x0c
-#define MV88E6XXX_G1_ATU_DATA_TRUNK			0x8000
-#define MV88E6XXX_G1_ATU_DATA_TRUNK_ID_MASK		0x00f0
-#define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_MASK		0x3ff0
-#define MV88E6XXX_G1_ATU_DATA_STATE_MASK		0x000f
-#define MV88E6XXX_G1_ATU_DATA_STATE_UNUSED		0x0000
-#define MV88E6XXX_G1_ATU_DATA_STATE_UC_MGMT		0x000d
-#define MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC		0x000e
-#define MV88E6XXX_G1_ATU_DATA_STATE_UC_PRIO_OVER	0x000f
-#define MV88E6XXX_G1_ATU_DATA_STATE_MC_NONE_RATE	0x0005
-#define MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC		0x0007
-#define MV88E6XXX_G1_ATU_DATA_STATE_MC_MGMT		0x000e
-#define MV88E6XXX_G1_ATU_DATA_STATE_MC_PRIO_OVER	0x000f
+#define MV88E6XXX_G1_ATU_DATA					0x0c
+#define MV88E6XXX_G1_ATU_DATA_TRUNK				0x8000
+#define MV88E6XXX_G1_ATU_DATA_TRUNK_ID_MASK			0x00f0
+#define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_MASK			0x3ff0
+#define MV88E6XXX_G1_ATU_DATA_STATE_MASK			0x000f
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED			0x0000
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_1_OLDEST		0x0001
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_2			0x0002
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_3			0x0003
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_4			0x0004
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_5			0x0005
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_6			0x0006
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_7_NEWEST		0x0007
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC_POLICY		0x0008
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC_POLICY_PO		0x0009
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC_AVB_NRL		0x000a
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC_AVB_NRL_PO	0x000b
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC_DA_MGMT		0x000c
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC_DA_MGMT_PO	0x000d
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC			0x000e
+#define MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC_PO		0x000f
+#define MV88E6XXX_G1_ATU_DATA_STATE_MC_UNUSED			0x0000
+#define MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC_POLICY		0x0004
+#define MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC_AVB_NRL		0x0005
+#define MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC_DA_MGMT		0x0006
+#define MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC			0x0007
+#define MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC_POLICY_PO		0x000c
+#define MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC_AVB_NRL_PO	0x000d
+#define MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC_DA_MGMT_PO	0x000e
+#define MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC_PO		0x000f
 
 /* Offset 0x0D: ATU MAC Address Register Bytes 0 & 1
  * Offset 0x0E: ATU MAC Address Register Bytes 2 & 3
diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c b/drivers/net/dsa/mv88e6xxx/global1_atu.c
index 18b86515b6bc..792a96ef418f 100644
--- a/drivers/net/dsa/mv88e6xxx/global1_atu.c
+++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c
@@ -135,7 +135,7 @@ static int mv88e6xxx_g1_atu_data_read(struct mv88e6xxx_chip *chip,
 		return err;
 
 	entry->state = val & 0xf;
-	if (entry->state != MV88E6XXX_G1_ATU_DATA_STATE_UNUSED) {
+	if (entry->state) {
 		entry->trunk = !!(val & MV88E6XXX_G1_ATU_DATA_TRUNK);
 		entry->portvec = (val >> 4) & mv88e6xxx_port_mask(chip);
 	}
@@ -148,7 +148,7 @@ static int mv88e6xxx_g1_atu_data_write(struct mv88e6xxx_chip *chip,
 {
 	u16 data = entry->state & 0xf;
 
-	if (entry->state != MV88E6XXX_G1_ATU_DATA_STATE_UNUSED) {
+	if (entry->state) {
 		if (entry->trunk)
 			data |= MV88E6XXX_G1_ATU_DATA_TRUNK;
 
@@ -209,7 +209,7 @@ int mv88e6xxx_g1_atu_getnext(struct mv88e6xxx_chip *chip, u16 fid,
 		return err;
 
 	/* Write the MAC address to iterate from only once */
-	if (entry->state == MV88E6XXX_G1_ATU_DATA_STATE_UNUSED) {
+	if (!entry->state) {
 		err = mv88e6xxx_g1_atu_mac_write(chip, entry);
 		if (err)
 			return err;
-- 
2.23.0


^ permalink raw reply related

* [PATCH net-next 2/3] net: dsa: mv88e6xxx: introduce .port_set_policy
From: Vivien Didelot @ 2019-09-07 20:00 UTC (permalink / raw)
  To: netdev; +Cc: davem, f.fainelli, andrew, Vivien Didelot
In-Reply-To: <20190907200049.25273-1-vivien.didelot@gmail.com>

Introduce a new .port_set_policy operation to configure a port's
Policy Control List, based on mapping such as DA, SA, Etype and so on.

Models similar to 88E6352 and 88E6390 are supported at the moment.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c |  9 ++++
 drivers/net/dsa/mv88e6xxx/chip.h | 22 ++++++++++
 drivers/net/dsa/mv88e6xxx/port.c | 74 ++++++++++++++++++++++++++++++++
 drivers/net/dsa/mv88e6xxx/port.h | 17 +++++++-
 4 files changed, 121 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 0d54a69f3622..6f4d5303a1f3 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3132,6 +3132,7 @@ static const struct mv88e6xxx_ops mv88e6172_ops = {
 	.port_set_rgmii_delay = mv88e6352_port_set_rgmii_delay,
 	.port_set_speed = mv88e6352_port_set_speed,
 	.port_tag_remap = mv88e6095_port_tag_remap,
+	.port_set_policy = mv88e6352_port_set_policy,
 	.port_set_frame_mode = mv88e6351_port_set_frame_mode,
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
@@ -3218,6 +3219,7 @@ static const struct mv88e6xxx_ops mv88e6176_ops = {
 	.port_set_rgmii_delay = mv88e6352_port_set_rgmii_delay,
 	.port_set_speed = mv88e6352_port_set_speed,
 	.port_tag_remap = mv88e6095_port_tag_remap,
+	.port_set_policy = mv88e6352_port_set_policy,
 	.port_set_frame_mode = mv88e6351_port_set_frame_mode,
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
@@ -3303,6 +3305,7 @@ static const struct mv88e6xxx_ops mv88e6190_ops = {
 	.port_set_speed = mv88e6390_port_set_speed,
 	.port_max_speed_mode = mv88e6390_port_max_speed_mode,
 	.port_tag_remap = mv88e6390_port_tag_remap,
+	.port_set_policy = mv88e6352_port_set_policy,
 	.port_set_frame_mode = mv88e6351_port_set_frame_mode,
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
@@ -3351,6 +3354,7 @@ static const struct mv88e6xxx_ops mv88e6190x_ops = {
 	.port_set_speed = mv88e6390x_port_set_speed,
 	.port_max_speed_mode = mv88e6390x_port_max_speed_mode,
 	.port_tag_remap = mv88e6390_port_tag_remap,
+	.port_set_policy = mv88e6352_port_set_policy,
 	.port_set_frame_mode = mv88e6351_port_set_frame_mode,
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
@@ -3448,6 +3452,7 @@ static const struct mv88e6xxx_ops mv88e6240_ops = {
 	.port_set_rgmii_delay = mv88e6352_port_set_rgmii_delay,
 	.port_set_speed = mv88e6352_port_set_speed,
 	.port_tag_remap = mv88e6095_port_tag_remap,
+	.port_set_policy = mv88e6352_port_set_policy,
 	.port_set_frame_mode = mv88e6351_port_set_frame_mode,
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
@@ -3539,6 +3544,7 @@ static const struct mv88e6xxx_ops mv88e6290_ops = {
 	.port_set_speed = mv88e6390_port_set_speed,
 	.port_max_speed_mode = mv88e6390_port_max_speed_mode,
 	.port_tag_remap = mv88e6390_port_tag_remap,
+	.port_set_policy = mv88e6352_port_set_policy,
 	.port_set_frame_mode = mv88e6351_port_set_frame_mode,
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
@@ -3809,6 +3815,7 @@ static const struct mv88e6xxx_ops mv88e6352_ops = {
 	.port_set_rgmii_delay = mv88e6352_port_set_rgmii_delay,
 	.port_set_speed = mv88e6352_port_set_speed,
 	.port_tag_remap = mv88e6095_port_tag_remap,
+	.port_set_policy = mv88e6352_port_set_policy,
 	.port_set_frame_mode = mv88e6351_port_set_frame_mode,
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
@@ -3863,6 +3870,7 @@ static const struct mv88e6xxx_ops mv88e6390_ops = {
 	.port_set_speed = mv88e6390_port_set_speed,
 	.port_max_speed_mode = mv88e6390_port_max_speed_mode,
 	.port_tag_remap = mv88e6390_port_tag_remap,
+	.port_set_policy = mv88e6352_port_set_policy,
 	.port_set_frame_mode = mv88e6351_port_set_frame_mode,
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
@@ -3915,6 +3923,7 @@ static const struct mv88e6xxx_ops mv88e6390x_ops = {
 	.port_set_speed = mv88e6390x_port_set_speed,
 	.port_max_speed_mode = mv88e6390x_port_max_speed_mode,
 	.port_tag_remap = mv88e6390_port_tag_remap,
+	.port_set_policy = mv88e6352_port_set_policy,
 	.port_set_frame_mode = mv88e6351_port_set_frame_mode,
 	.port_set_egress_floods = mv88e6352_port_set_egress_floods,
 	.port_set_ether_type = mv88e6351_port_set_ether_type,
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index 6bc0a4e4fe7b..04a329a98158 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -189,6 +189,24 @@ struct mv88e6xxx_port_hwtstamp {
 	struct hwtstamp_config tstamp_config;
 };
 
+enum mv88e6xxx_policy_mapping {
+	MV88E6XXX_POLICY_MAPPING_DA,
+	MV88E6XXX_POLICY_MAPPING_SA,
+	MV88E6XXX_POLICY_MAPPING_VTU,
+	MV88E6XXX_POLICY_MAPPING_ETYPE,
+	MV88E6XXX_POLICY_MAPPING_PPPOE,
+	MV88E6XXX_POLICY_MAPPING_VBAS,
+	MV88E6XXX_POLICY_MAPPING_OPT82,
+	MV88E6XXX_POLICY_MAPPING_UDP,
+};
+
+enum mv88e6xxx_policy_action {
+	MV88E6XXX_POLICY_ACTION_NORMAL,
+	MV88E6XXX_POLICY_ACTION_MIRROR,
+	MV88E6XXX_POLICY_ACTION_TRAP,
+	MV88E6XXX_POLICY_ACTION_DISCARD,
+};
+
 struct mv88e6xxx_port {
 	struct mv88e6xxx_chip *chip;
 	int port;
@@ -381,6 +399,10 @@ struct mv88e6xxx_ops {
 
 	int (*port_tag_remap)(struct mv88e6xxx_chip *chip, int port);
 
+	int (*port_set_policy)(struct mv88e6xxx_chip *chip, int port,
+			       enum mv88e6xxx_policy_mapping mapping,
+			       enum mv88e6xxx_policy_action action);
+
 	int (*port_set_frame_mode)(struct mv88e6xxx_chip *chip, int port,
 				   enum mv88e6xxx_frame_mode mode);
 	int (*port_set_egress_floods)(struct mv88e6xxx_chip *chip, int port,
diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
index 04006344adb2..15ef81654b67 100644
--- a/drivers/net/dsa/mv88e6xxx/port.c
+++ b/drivers/net/dsa/mv88e6xxx/port.c
@@ -1341,3 +1341,77 @@ int mv88e6390_port_tag_remap(struct mv88e6xxx_chip *chip, int port)
 
 	return 0;
 }
+
+/* Offset 0x0E: Policy Control Register */
+
+int mv88e6352_port_set_policy(struct mv88e6xxx_chip *chip, int port,
+			      enum mv88e6xxx_policy_mapping mapping,
+			      enum mv88e6xxx_policy_action action)
+{
+	u16 reg, mask, val;
+	int shift;
+	int err;
+
+	switch (mapping) {
+	case MV88E6XXX_POLICY_MAPPING_DA:
+		shift = __bf_shf(MV88E6XXX_PORT_POLICY_CTL_DA_MASK);
+		mask = MV88E6XXX_PORT_POLICY_CTL_DA_MASK;
+		break;
+	case MV88E6XXX_POLICY_MAPPING_SA:
+		shift = __bf_shf(MV88E6XXX_PORT_POLICY_CTL_SA_MASK);
+		mask = MV88E6XXX_PORT_POLICY_CTL_SA_MASK;
+		break;
+	case MV88E6XXX_POLICY_MAPPING_VTU:
+		shift = __bf_shf(MV88E6XXX_PORT_POLICY_CTL_VTU_MASK);
+		mask = MV88E6XXX_PORT_POLICY_CTL_VTU_MASK;
+		break;
+	case MV88E6XXX_POLICY_MAPPING_ETYPE:
+		shift = __bf_shf(MV88E6XXX_PORT_POLICY_CTL_ETYPE_MASK);
+		mask = MV88E6XXX_PORT_POLICY_CTL_ETYPE_MASK;
+		break;
+	case MV88E6XXX_POLICY_MAPPING_PPPOE:
+		shift = __bf_shf(MV88E6XXX_PORT_POLICY_CTL_PPPOE_MASK);
+		mask = MV88E6XXX_PORT_POLICY_CTL_PPPOE_MASK;
+		break;
+	case MV88E6XXX_POLICY_MAPPING_VBAS:
+		shift = __bf_shf(MV88E6XXX_PORT_POLICY_CTL_VBAS_MASK);
+		mask = MV88E6XXX_PORT_POLICY_CTL_VBAS_MASK;
+		break;
+	case MV88E6XXX_POLICY_MAPPING_OPT82:
+		shift = __bf_shf(MV88E6XXX_PORT_POLICY_CTL_OPT82_MASK);
+		mask = MV88E6XXX_PORT_POLICY_CTL_OPT82_MASK;
+		break;
+	case MV88E6XXX_POLICY_MAPPING_UDP:
+		shift = __bf_shf(MV88E6XXX_PORT_POLICY_CTL_UDP_MASK);
+		mask = MV88E6XXX_PORT_POLICY_CTL_UDP_MASK;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	switch (action) {
+	case MV88E6XXX_POLICY_ACTION_NORMAL:
+		val = MV88E6XXX_PORT_POLICY_CTL_NORMAL;
+		break;
+	case MV88E6XXX_POLICY_ACTION_MIRROR:
+		val = MV88E6XXX_PORT_POLICY_CTL_MIRROR;
+		break;
+	case MV88E6XXX_POLICY_ACTION_TRAP:
+		val = MV88E6XXX_PORT_POLICY_CTL_TRAP;
+		break;
+	case MV88E6XXX_POLICY_ACTION_DISCARD:
+		val = MV88E6XXX_PORT_POLICY_CTL_DISCARD;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_POLICY_CTL, &reg);
+	if (err)
+		return err;
+
+	reg &= ~mask;
+	reg |= (val << shift) & mask;
+
+	return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_POLICY_CTL, reg);
+}
diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
index d4e9bea6e82f..03a480cd71b9 100644
--- a/drivers/net/dsa/mv88e6xxx/port.h
+++ b/drivers/net/dsa/mv88e6xxx/port.h
@@ -222,7 +222,19 @@
 #define MV88E6XXX_PORT_PRI_OVERRIDE	0x0d
 
 /* Offset 0x0E: Policy Control Register */
-#define MV88E6XXX_PORT_POLICY_CTL	0x0e
+#define MV88E6XXX_PORT_POLICY_CTL		0x0e
+#define MV88E6XXX_PORT_POLICY_CTL_DA_MASK	0xc000
+#define MV88E6XXX_PORT_POLICY_CTL_SA_MASK	0x3000
+#define MV88E6XXX_PORT_POLICY_CTL_VTU_MASK	0x0c00
+#define MV88E6XXX_PORT_POLICY_CTL_ETYPE_MASK	0x0300
+#define MV88E6XXX_PORT_POLICY_CTL_PPPOE_MASK	0x00c0
+#define MV88E6XXX_PORT_POLICY_CTL_VBAS_MASK	0x0030
+#define MV88E6XXX_PORT_POLICY_CTL_OPT82_MASK	0x000c
+#define MV88E6XXX_PORT_POLICY_CTL_UDP_MASK	0x0003
+#define MV88E6XXX_PORT_POLICY_CTL_NORMAL	0x0000
+#define MV88E6XXX_PORT_POLICY_CTL_MIRROR	0x0001
+#define MV88E6XXX_PORT_POLICY_CTL_TRAP		0x0002
+#define MV88E6XXX_PORT_POLICY_CTL_DISCARD	0x0003
 
 /* Offset 0x0F: Port Special Ether Type */
 #define MV88E6XXX_PORT_ETH_TYPE		0x0f
@@ -324,6 +336,9 @@ int mv88e6185_port_set_egress_floods(struct mv88e6xxx_chip *chip, int port,
 				     bool unicast, bool multicast);
 int mv88e6352_port_set_egress_floods(struct mv88e6xxx_chip *chip, int port,
 				     bool unicast, bool multicast);
+int mv88e6352_port_set_policy(struct mv88e6xxx_chip *chip, int port,
+			      enum mv88e6xxx_policy_mapping mapping,
+			      enum mv88e6xxx_policy_action action);
 int mv88e6351_port_set_ether_type(struct mv88e6xxx_chip *chip, int port,
 				  u16 etype);
 int mv88e6xxx_port_set_message_port(struct mv88e6xxx_chip *chip, int port,
-- 
2.23.0


^ permalink raw reply related

* [PATCH net-next 3/3] net: dsa: mv88e6xxx: add RXNFC support
From: Vivien Didelot @ 2019-09-07 20:00 UTC (permalink / raw)
  To: netdev; +Cc: davem, f.fainelli, andrew, Vivien Didelot
In-Reply-To: <20190907200049.25273-1-vivien.didelot@gmail.com>

Implement the .get_rxnfc and .set_rxnfc DSA operations to configure
a port's Layer 2 Policy Control List (PCL) via ethtool.

Currently only dropping frames based on MAC Destination or Source
Address (including the option VLAN parameter) is supported.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c | 213 +++++++++++++++++++++++++++++++
 drivers/net/dsa/mv88e6xxx/chip.h |  13 ++
 2 files changed, 226 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 6f4d5303a1f3..6787d560e9e3 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -1524,6 +1524,216 @@ static int mv88e6xxx_port_db_load_purge(struct mv88e6xxx_chip *chip, int port,
 	return mv88e6xxx_g1_atu_loadpurge(chip, fid, &entry);
 }
 
+static int mv88e6xxx_policy_apply(struct mv88e6xxx_chip *chip, int port,
+				  const struct mv88e6xxx_policy *policy)
+{
+	enum mv88e6xxx_policy_mapping mapping = policy->mapping;
+	enum mv88e6xxx_policy_action action = policy->action;
+	const u8 *addr = policy->addr;
+	u16 vid = policy->vid;
+	u8 state;
+	int err;
+	int id;
+
+	if (!chip->info->ops->port_set_policy)
+		return -EOPNOTSUPP;
+
+	switch (mapping) {
+	case MV88E6XXX_POLICY_MAPPING_DA:
+	case MV88E6XXX_POLICY_MAPPING_SA:
+		if (action == MV88E6XXX_POLICY_ACTION_NORMAL)
+			state = 0; /* Dissociate the port and address */
+		else if (action == MV88E6XXX_POLICY_ACTION_DISCARD &&
+			 is_multicast_ether_addr(addr))
+			state = MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC_POLICY;
+		else if (action == MV88E6XXX_POLICY_ACTION_DISCARD &&
+			 is_unicast_ether_addr(addr))
+			state = MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC_POLICY;
+		else
+			return -EOPNOTSUPP;
+
+		err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid,
+						   state);
+		if (err)
+			return err;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	/* Skip the port's policy clearing if the mapping is still in use */
+	if (action == MV88E6XXX_POLICY_ACTION_NORMAL)
+		idr_for_each_entry(&chip->policies, policy, id)
+			if (policy->port == port &&
+			    policy->mapping == mapping &&
+			    policy->action != action)
+				return 0;
+
+	return chip->info->ops->port_set_policy(chip, port, mapping, action);
+}
+
+static int mv88e6xxx_policy_insert(struct mv88e6xxx_chip *chip, int port,
+				   struct ethtool_rx_flow_spec *fs)
+{
+	struct ethhdr *mac_entry = &fs->h_u.ether_spec;
+	struct ethhdr *mac_mask = &fs->m_u.ether_spec;
+	enum mv88e6xxx_policy_mapping mapping;
+	enum mv88e6xxx_policy_action action;
+	struct mv88e6xxx_policy *policy;
+	u16 vid = 0;
+	u8 *addr;
+	int err;
+	int id;
+
+	if (fs->location != RX_CLS_LOC_ANY)
+		return -EINVAL;
+
+	if (fs->ring_cookie == RX_CLS_FLOW_DISC)
+		action = MV88E6XXX_POLICY_ACTION_DISCARD;
+	else
+		return -EOPNOTSUPP;
+
+	switch (fs->flow_type & ~FLOW_EXT) {
+	case ETHER_FLOW:
+		if (!is_zero_ether_addr(mac_mask->h_dest) &&
+		    is_zero_ether_addr(mac_mask->h_source)) {
+			mapping = MV88E6XXX_POLICY_MAPPING_DA;
+			addr = mac_entry->h_dest;
+		} else if (is_zero_ether_addr(mac_mask->h_dest) &&
+		    !is_zero_ether_addr(mac_mask->h_source)) {
+			mapping = MV88E6XXX_POLICY_MAPPING_SA;
+			addr = mac_entry->h_source;
+		} else {
+			/* Cannot support DA and SA mapping in the same rule */
+			return -EOPNOTSUPP;
+		}
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	if ((fs->flow_type & FLOW_EXT) && fs->m_ext.vlan_tci) {
+		if (fs->m_ext.vlan_tci != 0xffff)
+			return -EOPNOTSUPP;
+		vid = be16_to_cpu(fs->h_ext.vlan_tci) & VLAN_VID_MASK;
+	}
+
+	idr_for_each_entry(&chip->policies, policy, id) {
+		if (policy->port == port && policy->mapping == mapping &&
+		    policy->action == action && policy->vid == vid &&
+		    ether_addr_equal(policy->addr, addr))
+			return -EEXIST;
+	}
+
+	policy = devm_kzalloc(chip->dev, sizeof(*policy), GFP_KERNEL);
+	if (!policy)
+		return -ENOMEM;
+
+	fs->location = 0;
+	err = idr_alloc_u32(&chip->policies, policy, &fs->location, 0xffffffff,
+			    GFP_KERNEL);
+	if (err) {
+		devm_kfree(chip->dev, policy);
+		return err;
+	}
+
+	memcpy(&policy->fs, fs, sizeof(*fs));
+	ether_addr_copy(policy->addr, addr);
+	policy->mapping = mapping;
+	policy->action = action;
+	policy->port = port;
+	policy->vid = vid;
+
+	err = mv88e6xxx_policy_apply(chip, port, policy);
+	if (err) {
+		idr_remove(&chip->policies, fs->location);
+		devm_kfree(chip->dev, policy);
+		return err;
+	}
+
+	return 0;
+}
+
+static int mv88e6xxx_get_rxnfc(struct dsa_switch *ds, int port,
+			       struct ethtool_rxnfc *rxnfc, u32 *rule_locs)
+{
+	struct ethtool_rx_flow_spec *fs = &rxnfc->fs;
+	struct mv88e6xxx_chip *chip = ds->priv;
+	struct mv88e6xxx_policy *policy;
+	int err;
+	int id;
+
+	mv88e6xxx_reg_lock(chip);
+
+	switch (rxnfc->cmd) {
+	case ETHTOOL_GRXCLSRLCNT:
+		rxnfc->data = 0;
+		rxnfc->data |= RX_CLS_LOC_SPECIAL;
+		rxnfc->rule_cnt = 0;
+		idr_for_each_entry(&chip->policies, policy, id)
+			if (policy->port == port)
+				rxnfc->rule_cnt++;
+		err = 0;
+		break;
+	case ETHTOOL_GRXCLSRULE:
+		err = -ENOENT;
+		policy = idr_find(&chip->policies, fs->location);
+		if (policy) {
+			memcpy(fs, &policy->fs, sizeof(*fs));
+			err = 0;
+		}
+		break;
+	case ETHTOOL_GRXCLSRLALL:
+		rxnfc->data = 0;
+		rxnfc->rule_cnt = 0;
+		idr_for_each_entry(&chip->policies, policy, id)
+			if (policy->port == port)
+				rule_locs[rxnfc->rule_cnt++] = id;
+		err = 0;
+		break;
+	default:
+		err = -EOPNOTSUPP;
+		break;
+	}
+
+	mv88e6xxx_reg_unlock(chip);
+
+	return err;
+}
+
+static int mv88e6xxx_set_rxnfc(struct dsa_switch *ds, int port,
+			       struct ethtool_rxnfc *rxnfc)
+{
+	struct ethtool_rx_flow_spec *fs = &rxnfc->fs;
+	struct mv88e6xxx_chip *chip = ds->priv;
+	struct mv88e6xxx_policy *policy;
+	int err;
+
+	mv88e6xxx_reg_lock(chip);
+
+	switch (rxnfc->cmd) {
+	case ETHTOOL_SRXCLSRLINS:
+		err = mv88e6xxx_policy_insert(chip, port, fs);
+		break;
+	case ETHTOOL_SRXCLSRLDEL:
+		err = -ENOENT;
+		policy = idr_remove(&chip->policies, fs->location);
+		if (policy) {
+			policy->action = MV88E6XXX_POLICY_ACTION_NORMAL;
+			err = mv88e6xxx_policy_apply(chip, port, policy);
+			devm_kfree(chip->dev, policy);
+		}
+		break;
+	default:
+		err = -EOPNOTSUPP;
+		break;
+	}
+
+	mv88e6xxx_reg_unlock(chip);
+
+	return err;
+}
+
 static int mv88e6xxx_port_add_broadcast(struct mv88e6xxx_chip *chip, int port,
 					u16 vid)
 {
@@ -4655,6 +4865,7 @@ static struct mv88e6xxx_chip *mv88e6xxx_alloc_chip(struct device *dev)
 
 	mutex_init(&chip->reg_lock);
 	INIT_LIST_HEAD(&chip->mdios);
+	idr_init(&chip->policies);
 
 	return chip;
 }
@@ -4739,6 +4950,8 @@ static const struct dsa_switch_ops mv88e6xxx_switch_ops = {
 	.set_eeprom		= mv88e6xxx_set_eeprom,
 	.get_regs_len		= mv88e6xxx_get_regs_len,
 	.get_regs		= mv88e6xxx_get_regs,
+	.get_rxnfc		= mv88e6xxx_get_rxnfc,
+	.set_rxnfc		= mv88e6xxx_set_rxnfc,
 	.set_ageing_time	= mv88e6xxx_set_ageing_time,
 	.port_bridge_join	= mv88e6xxx_port_bridge_join,
 	.port_bridge_leave	= mv88e6xxx_port_bridge_leave,
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index 04a329a98158..e9b1a1ac9a8e 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -8,6 +8,7 @@
 #ifndef _MV88E6XXX_CHIP_H
 #define _MV88E6XXX_CHIP_H
 
+#include <linux/idr.h>
 #include <linux/if_vlan.h>
 #include <linux/irq.h>
 #include <linux/gpio/consumer.h>
@@ -207,6 +208,15 @@ enum mv88e6xxx_policy_action {
 	MV88E6XXX_POLICY_ACTION_DISCARD,
 };
 
+struct mv88e6xxx_policy {
+	enum mv88e6xxx_policy_mapping mapping;
+	enum mv88e6xxx_policy_action action;
+	struct ethtool_rx_flow_spec fs;
+	u8 addr[ETH_ALEN];
+	int port;
+	u16 vid;
+};
+
 struct mv88e6xxx_port {
 	struct mv88e6xxx_chip *chip;
 	int port;
@@ -265,6 +275,9 @@ struct mv88e6xxx_chip {
 	/* List of mdio busses */
 	struct list_head mdios;
 
+	/* Policy Control List IDs and rules */
+	struct idr policies;
+
 	/* There can be two interrupt controllers, which are chained
 	 * off a GPIO as interrupt source
 	 */
-- 
2.23.0


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox