* [PATCH] qed: fix spelling mistake "fullill" -> "fulfill"
From: Colin King @ 2019-09-13 9:07 UTC (permalink / raw)
To: Ariel Elior, GR-everest-linux-l2, David S . Miller, netdev
Cc: kernel-janitors, linux-kernel
From: Colin Ian King <colin.king@canonical.com>
There is a spelling mistake in a DP_VERBOSE debug message. Fix it.
(Using American English spelling as this is the most common way
to spell this in the kernel).
Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
drivers/net/ethernet/qlogic/qed/qed_vf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_vf.c b/drivers/net/ethernet/qlogic/qed/qed_vf.c
index 5dda547772c1..856051f50eb7 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_vf.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_vf.c
@@ -231,7 +231,7 @@ static void qed_vf_pf_acquire_reduce_resc(struct qed_hwfn *p_hwfn,
{
DP_VERBOSE(p_hwfn,
QED_MSG_IOV,
- "PF unwilling to fullill resource request: rxq [%02x/%02x] txq [%02x/%02x] sbs [%02x/%02x] mac [%02x/%02x] vlan [%02x/%02x] mc [%02x/%02x] cids [%02x/%02x]. Try PF recommended amount\n",
+ "PF unwilling to fulfill resource request: rxq [%02x/%02x] txq [%02x/%02x] sbs [%02x/%02x] mac [%02x/%02x] vlan [%02x/%02x] mc [%02x/%02x] cids [%02x/%02x]. Try PF recommended amount\n",
p_req->num_rxqs,
p_resp->num_rxqs,
p_req->num_rxqs,
--
2.20.1
^ permalink raw reply related
* big ICMP requests get disrupted on IPSec tunnel activation
From: Bartschies, Thomas @ 2019-09-13 8:59 UTC (permalink / raw)
To: 'netdev@vger.kernel.org'
Hello all,
Since kernel 4.20 we have been observing strange behaviour when sending big ICMP packets, for example with a packet size of 3000 bytes.
The packets should be forwarded by a Linux gateway (firewall) that has multiple interfaces and also acts as a VPN gateway.
Test steps:
1. Disable all iptables rules.
2. Enable the VPN IPSec policies.
3. Start a ping with a large packet size (e.g. 3000 bytes) from a client in the DMZ, through the gateway, to a machine on the LAN.
4. The ping works.
5. Activate a VPN policy by sending pings from the gateway to a tunnel target. The system tries to create the tunnel.
6. The ping from step 3 immediately stalls. No error messages; it just stops.
7. Stop the ping from step 3 and start another without the packet size parameter. It stalls as well.
Result:
Connections from the client to other services on the LAN machine still work. Tracepath works. Only ICMP requests no longer pass
the gateway. tcpdump sees them on the incoming interface, but not on the outgoing LAN interface. ICMP requests to any
other target IP address in the LAN still work until one uses a bigger packet size; then these alternative connections stall as well.
Flushing the policy table has no effect. Flushing the conntrack table has no effect. Setting rp_filter to loose (2) has no effect.
Flushing the route cache has no effect.
Only a reboot of the gateway restores normal behavior.
What can be the cause? Is this a networking bug?
Best regards,
--
i. A. Thomas Bartschies
IT Systeme
Cornelsen Verlagskontor GmbH
Kammerratsheide 66
33609 Bielefeld
Telefon: +49 (0)521 9719-310
Telefax: +49 (0)521 9719-93310
thomas.bartschies@cvk.de
http://www.cvk.de
AG Bielefeld HRB 39324
Geschäftsführer: Thomas Fuchs, Patrick Neiss
^ permalink raw reply
* Re: "[RFC PATCH net-next 2/2] Reduce localhost to 127.0.0.0/16"
From: Dave Taht @ 2019-09-13 9:14 UTC (permalink / raw)
To: Mark Smith; +Cc: Linux Kernel Network Developers
In-Reply-To: <CAO42Z2xH_R1YQBhpyFVziPnHzWwzNV61VqrVT0yMcdEoTd6ZNQ@mail.gmail.com>
On Fri, Sep 13, 2019 at 9:54 AM Mark Smith <markzzzsmith@gmail.com> wrote:
>
> (Not subscribed to the ML)
>
> Hi,
>
> I've noticed this patch. I don't think it should be applied, as it
> contradicts RFC 1122, "Requirements for Internet Hosts --
> Communication Layers":
Yea! I kicked off a discussion!
> "(g) { 127, <any> }
>
> Internal host loopback address. Addresses of this form
> MUST NOT appear outside a host."
That 1984 (89) definition of a "host" has been stretched considerably
in the past few decades. We now have
a hypervisor, multiple cores, multiple vms, vms stacked within vms,
and containers with virtual interfaces on them, and a confusing
plethora of rfc1918 and nat between them and the wire.
This RFC-to-netdev's proposed reduction to a /16 was sufficient to
cover the two main use cases for loopback in Linux,
127.0.0.1 - loopback
127.0.1.1 - dns
We'd also seen some usages of things like 127.0.0.53 and so on, and in
the discussion at linuxconf last week,
it came out that cumulus and a few others were possibly using high
values of 127.x for switch chassis addressing, but we haven't got any
documentation on how that works yet.
The 1995 IPv6 standard and later has only one loopback address.
127.0.0.0/8 is 16M addresses wasted on host-internal use.
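As a rough illustration of the arithmetic behind that figure, the sketch below counts the addresses covered by an IPv4 prefix and checks whether a host-order address falls inside one. The helper names prefix_size() and in_prefix() are hypothetical and are not taken from the RFC patch.

```c
#include <stdint.h>

/* Number of addresses covered by an IPv4 prefix of the given length (0..32). */
static uint64_t prefix_size(unsigned int prefix_len)
{
	return (uint64_t)1 << (32 - prefix_len);
}

/* Non-zero if the host-order address addr lies inside net/prefix_len. */
static int in_prefix(uint32_t addr, uint32_t net, unsigned int prefix_len)
{
	/* A /0 prefix matches everything; avoid shifting by 32 bits. */
	uint32_t mask = prefix_len ? ~(uint32_t)0 << (32 - prefix_len) : 0;

	return (addr & mask) == (net & mask);
}
```

With these helpers, a /8 covers 16777216 addresses and a /16 covers 65536; 127.0.1.1 (0x7F000101) is inside 127.0.0.0/16, while 127.1.0.1 (0x7F010001) is only inside 127.0.0.0/8.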
> RFC 1122 is one of the relatively few Internet Standards, specifically
> Standard Number 3:
>
> https://www.rfc-editor.org/standards
We have been exploring the solution space here:
https://github.com/dtaht/unicast-extensions/blob/master/rfcs/draft-gilmore-taht-v4uniext.txt
If you would like to file more comments and bugs - or discuss here!
that would be great.
>
> Regards,
> Mark.
--
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740
^ permalink raw reply
* "[RFC PATCH net-next 1/2] Allow 225/8-231/8 as unicast"
From: Mark Smith @ 2019-09-13 9:16 UTC (permalink / raw)
To: dave.taht, netdev
Hi,
(Not subscribed to the mailing list)
I've just noticed this patch.
I don't think it should be applied, as 225/8 through 231/8 falls
within the IANA designated Class D multicast address range.
(https://www.iana.org/assignments/ipv4-address-space/ipv4-address-space.xhtml)
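For reference, a minimal sketch of the Class D test, assuming host-order addresses; the function name is_class_d() is hypothetical. Class D is 224.0.0.0/4, so every address from 225/8 through 231/8 matches it.

```c
#include <stdint.h>

/* Class D (multicast) is 224.0.0.0/4: the top four bits are 1110. */
static int is_class_d(uint32_t addr_host_order)
{
	return (addr_host_order & 0xF0000000u) == 0xE0000000u;
}
```

Both 225.0.0.1 (0xE1000001) and 231.255.255.255 (0xE7FFFFFF) pass this check, while 240.0.0.1 and 192.168.0.1 do not.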
Using this address range as unicast addresses would mean that ICMP
messages would need to be able to use them as a source address.
However, Internet Standard number 3, RFC 1122, "Requirements for
Internet Hosts -- Communication Layers", prohibits addresses from
within the multicast address range from being used as source
addresses:
" An ICMP error message MUST NOT be sent as the result of
receiving:
* an ICMP error message, or
* a datagram destined to an IP broadcast or IP multicast
address, or
* a datagram sent as a link-layer broadcast, or
* a non-initial fragment, or
* a datagram whose source address does not define a single
host -- e.g., a zero address, a loopback address, a
broadcast address, a multicast address, or a Class E
address."
Please note, IPv6 has been and is being widely adopted. Trying to extend
use of IPv4 should be considered unnecessary, in particular when it
violates Internet Standards.
There are more than 75 000 IPv6 routes in the Internet route table.
Nearly 18 000 BGP Autonomous Systems are announcing at least one IPv6
prefix.
http://www.cidr-report.org/v6/as2.0/#General_Status
A number of countries have exceeded 50% IPv6 capability and preference
according to APNIC.
https://stats.labs.apnic.net/ipv6
Globally, Google are receiving more than 25% of their traffic via IPv6:
https://www.google.com/intl/en/ipv6/statistics.html
Regards,
Mark.
^ permalink raw reply
* Re: [PATCH] can: flexcan: free error skb if enqueueing failed
From: Sean Nyekjaer @ 2019-09-13 9:44 UTC (permalink / raw)
To: Marc Kleine-Budde, linux-can
Cc: Martin Hundebøll, Wolfgang Grandegger, David S . Miller,
netdev
In-Reply-To: <d5f8811e-4b85-776a-668f-33f64ec6ef16@geanix.com>
On 01/08/2019 09.59, Martin Hundebøll wrote:
> On 15/07/2019 20.53, Martin Hundebøll wrote:
>> If the call to can_rx_offload_queue_sorted() fails, the passed skb isn't
>> consumed, so the caller must do so.
>>
>> Fixes: 30164759db1b ("can: flexcan: make use of rx-offload's
>> irq_offload_fifo")
>> Signed-off-by: Martin Hundebøll <martin@geanix.com>
>
> Ping.
Hi Marc
Any problems with this? Besides time ;-)
We really need this to be backported to 4.19, soon...
/Sean
^ permalink raw reply
* [PATCH net] ip6_gre: fix a dst leak in ip6erspan_tunnel_xmit
From: Xin Long @ 2019-09-13 9:45 UTC (permalink / raw)
To: network dev; +Cc: davem, William Tu
In ip6erspan_tunnel_xmit(), if the skb will not be sent out, it has to
be freed on the tx_err path. Otherwise when deleting a netns, it would
cause dst/dev to leak, and dmesg shows:
unregister_netdevice: waiting for lo to become free. Usage count = 1
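The ownership rule behind this fix can be sketched in userspace as follows. This is only a model under stated assumptions: struct pkt, pkt_free() and xmit() are hypothetical stand-ins for struct sk_buff, kfree_skb() and the transmit handler, and are not the kernel code.

```c
#include <stdlib.h>

struct pkt { int valid; };   /* hypothetical stand-in for struct sk_buff */

static int freed_count;      /* counts released buffers, for illustration only */

static void pkt_free(struct pkt *p)
{
	free(p);
	freed_count++;
}

/* The transmit function owns p: every exit path must either send or free it. */
static int xmit(struct pkt *p)
{
	if (!p->valid)
		goto tx_err;   /* a bare "return -EINVAL" here would leak p */

	/* Transmit would happen here; the model just consumes the packet. */
	pkt_free(p);
	return 0;

tx_err:
	pkt_free(p);           /* the drop path releases the buffer and what it holds */
	return 0;              /* the packet is reported as handled, not as an errno */
}
```

In the model, an invalid packet still ends up freed exactly once, which is what the tx_err path guarantees in the patch below.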
Fixes: ef7baf5e083c ("ip6_gre: add ip6 erspan collect_md mode")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
net/ipv6/ip6_gre.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index dd2d0b96..d5779d6 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -968,7 +968,7 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
if (unlikely(!tun_info ||
!(tun_info->mode & IP_TUNNEL_INFO_TX) ||
ip_tunnel_info_af(tun_info) != AF_INET6))
- return -EINVAL;
+ goto tx_err;
key = &tun_info->key;
memset(&fl6, 0, sizeof(fl6));
--
2.1.0
^ permalink raw reply related
* [PATCH net] net: stmmac: Hold rtnl lock in suspend/resume callbacks
From: Jose Abreu @ 2019-09-13 9:50 UTC (permalink / raw)
To: netdev
Cc: Joao Pinto, Jose Abreu, Giuseppe Cavallaro, Alexandre Torgue,
David S. Miller, Maxime Coquelin, linux-stm32, linux-arm-kernel,
linux-kernel, Christophe ROULLIER
We need to hold rtnl lock in suspend and resume callbacks because phylink
requires it. Otherwise we will get a WARN() in suspend and resume.
Also, move phylink start and stop callbacks to inside device's internal
lock so that we prevent concurrent HW accesses.
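A small userspace model of why the WARN() fires, assuming the phylink entry points check for RTNL in the way ASSERT_RTNL() does. All names here are illustrative stand-ins; the boolean only models whether the lock is held and provides no real locking.

```c
#include <stdbool.h>

static bool rtnl_held;       /* models whether the RTNL mutex is held */
static int warnings;         /* models how often WARN() would fire */

static void rtnl_lock(void)   { rtnl_held = true; }
static void rtnl_unlock(void) { rtnl_held = false; }

/* Stand-in for a phylink call that expects RTNL to be held by the caller. */
static void phylink_stop_model(void)
{
	if (!rtnl_held)
		warnings++;
}

/* Before the patch: the call is made without RTNL. */
static void suspend_before(void)
{
	phylink_stop_model();
}

/* After the patch: the call is wrapped in rtnl_lock()/rtnl_unlock(). */
static void suspend_after(void)
{
	rtnl_lock();
	phylink_stop_model();
	rtnl_unlock();
}
```

Calling suspend_before() bumps the warning counter, while suspend_after() leaves it unchanged.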
Fixes: 74371272f97f ("net: stmmac: Convert to phylink and remove phylib logic")
Reported-by: Christophe ROULLIER <christophe.roullier@st.com>
Tested-by: Christophe ROULLIER <christophe.roullier@st.com>
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
---
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Jose Abreu <joabreu@synopsys.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: netdev@vger.kernel.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: Christophe ROULLIER <christophe.roullier@st.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index fd54c7c87485..b19ab09cb18f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4451,10 +4451,12 @@ int stmmac_suspend(struct device *dev)
if (!ndev || !netif_running(ndev))
return 0;
- phylink_stop(priv->phylink);
-
mutex_lock(&priv->lock);
+ rtnl_lock();
+ phylink_stop(priv->phylink);
+ rtnl_unlock();
+
netif_device_detach(ndev);
stmmac_stop_all_queues(priv);
@@ -4558,9 +4560,11 @@ int stmmac_resume(struct device *dev)
stmmac_start_all_queues(priv);
- mutex_unlock(&priv->lock);
-
+ rtnl_lock();
phylink_start(priv->phylink);
+ rtnl_unlock();
+
+ mutex_unlock(&priv->lock);
return 0;
}
--
2.7.4
^ permalink raw reply related
* Re: [patch net-next rfc 2/7] net: introduce name_node struct to be used in hashlist
From: Jiri Pirko @ 2019-09-13 9:52 UTC (permalink / raw)
To: Stephen Hemminger
Cc: netdev, davem, jakub.kicinski, sthemmin, dsahern, dcbw, mkubecek,
andrew, parav, saeedm, mlxsw
In-Reply-To: <20190719132649.700e6a5c@hermes.lan>
Fri, Jul 19, 2019 at 10:26:49PM CEST, stephen@networkplumber.org wrote:
>On Fri, 19 Jul 2019 21:17:40 +0200
>Jiri Pirko <jiri@resnulli.us> wrote:
>
>> Fri, Jul 19, 2019 at 06:29:36PM CEST, stephen@networkplumber.org wrote:
>> >On Fri, 19 Jul 2019 13:00:24 +0200
>> >Jiri Pirko <jiri@resnulli.us> wrote:
>> >
>> >> From: Jiri Pirko <jiri@mellanox.com>
>> >>
>> >> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>> >> ---
>> >> include/linux/netdevice.h | 10 +++-
>> >> net/core/dev.c | 96 +++++++++++++++++++++++++++++++--------
>> >> 2 files changed, 86 insertions(+), 20 deletions(-)
>> >>
>> >> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> >> index 88292953aa6f..74f99f127b0e 100644
>> >> --- a/include/linux/netdevice.h
>> >> +++ b/include/linux/netdevice.h
>> >> @@ -918,6 +918,12 @@ struct dev_ifalias {
>> >> struct devlink;
>> >> struct tlsdev_ops;
>> >>
>> >> +struct netdev_name_node {
>> >> + struct hlist_node hlist;
>> >> + struct net_device *dev;
>> >> + char *name
>> >
>> >You probably can make this const char *
>
>Don't bother, it looks ok as is. the problem is you would have
>to cast it when calling free.
Actually, it does not. So I'll make it const.
>
>> >Do you want to add __rcu to this list?
>>
>> Which list?
>>
>
> struct netdev_name_node __rcu *name_node;
This is stored in struct net_device for ifname. The pointer is never
accessed from rcu read.
>
>You might also want to explicitly init the hlist node rather
>than relying on the fact that zero is an empty node ptr.
>
>
> static struct netdev_name_node *netdev_name_node_alloc(struct net_device *dev,
>- char *name)
>+ const char *name)
> {
> struct netdev_name_node *name_node;
>
>- name_node = kzalloc(sizeof(*name_node), GFP_KERNEL);
>+ name_node = kmalloc(sizeof(*name_node), GFP_KERNEL);
> if (!name_node)
> return NULL;
>+
>+ INIT_HLIST_NODE(&name_node->hlist);
Will do. Thanks!
> name_node->dev = dev;
> name_node->name = name;
> return name_node;
>
^ permalink raw reply
* Re: [PATCH] net/mlx5: Remove unneeded variable in mlx5_unload_one
From: Saeed Mahameed @ 2019-09-13 10:06 UTC (permalink / raw)
To: zhongjiang@huawei.com, davem@davemloft.net
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <1568307542-43797-1-git-send-email-zhongjiang@huawei.com>
On Fri, 2019-09-13 at 00:59 +0800, zhong jiang wrote:
> mlx5_unload_one do not need local variable to store different value,
> Hence just remove it.
>
> Signed-off-by: zhong jiang <zhongjiang@huawei.com>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/main.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c
> b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> index 9648c22..c39bb37 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> @@ -1228,8 +1228,6 @@ static int mlx5_load_one(struct mlx5_core_dev
> *dev, bool boot)
>
> static int mlx5_unload_one(struct mlx5_core_dev *dev, bool cleanup)
> {
> - int err = 0;
> -
> if (cleanup) {
> mlx5_unregister_device(dev);
> mlx5_drain_health_wq(dev);
> @@ -1257,7 +1255,7 @@ static int mlx5_unload_one(struct mlx5_core_dev
> *dev, bool cleanup)
> mlx5_function_teardown(dev, cleanup);
> out:
> mutex_unlock(&dev->intf_state_mutex);
> - return err;
> + return 0;
> }
>
> static int mlx5_mdev_init(struct mlx5_core_dev *dev, int
> profile_idx)
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
^ permalink raw reply
* Re: [PATCH bpf-next] libbpf: add xsk_umem__adjust_offset
From: Laatz, Kevin @ 2019-09-13 10:21 UTC (permalink / raw)
To: Björn Töpel
Cc: Netdev, Alexei Starovoitov, Daniel Borkmann,
Björn Töpel, Karlsson, Magnus, Jonathan Lemon,
Bruce Richardson, ciara.loftus, bpf
In-Reply-To: <CAJ+HfNgQY6muwzGgBW6xLFzKeiCMQUwrz_yrywB3F_VSKbaadQ@mail.gmail.com>
On 13/09/2019 05:59, Björn Töpel wrote:
> On Thu, 12 Sep 2019 at 17:47, Kevin Laatz <kevin.laatz@intel.com> wrote:
>> Currently, xsk_umem_adjust_offset exists as a kernel internal function.
>> This patch adds xsk_umem__adjust_offset to libbpf so that it can be used
>> from userspace. This will take the responsibility of properly storing the
>> offset away from the application, making it less error prone.
>>
>> Since xsk_umem__adjust_offset is called on a per-packet basis, we need to
>> inline the function to avoid any performance regressions. In order to
>> inline xsk_umem__adjust_offset, we need to add it to xsk.h. Unfortunately
>> this means that we can't dereference the xsk_umem_config struct directly
>> since it is defined only in xsk.c. We therefore add an extra API to return
>> the flags field to the user from the structure, and have the inline
>> function use this flags field directly.
>>
> Can you expand this to a series, with an additional patch where these
> functions are used in XDP socket sample application, so it's more
> clear for users?
These functions are currently not required in the xdpsock application and I think forcing them in would be messy :-). However, an example of the use of these functions could be seen in the DPDK AF_XDP PMD. There is a patch here: http://patches.dpdk.org/patch/58624/ where we currently do the offset adjustment to the handle manually, but with this patch we could replace it with xsk_umem__adjust_offset and have a real example of the functions being used.
Would this be enough for an example?
Thanks,
Kevin
^ permalink raw reply
* [PATCH bpf-next v2 0/3] AF_XDP fixes for i40e, ixgbe and xdpsock
From: Ciara Loftus @ 2019-09-13 10:39 UTC (permalink / raw)
To: netdev, ast, daniel, bjorn.topel, magnus.karlsson, jonathan.lemon
Cc: bruce.richardson, bpf, intel-wired-lan, kevin.laatz, Ciara Loftus
This patch set contains some fixes for AF_XDP zero copy in the i40e and
ixgbe drivers as well as a fix for the 'xdpsock' sample application when
running in unaligned mode.
Patches 1 and 2 fix a regression for the i40e and ixgbe drivers which
caused the umem headroom to be added to the xdp handle twice, resulting in
an incorrect value being received by the user for the case where the umem
headroom is non-zero.
Patch 3 fixes an issue with the xdpsock sample application whereby the
start of the tx packet data (offset) was not being set correctly when the
application was being run in unaligned mode.
This patch set has been applied against commit a2c11b034142 ("kcm: use
BPF_PROG_RUN")
---
v2:
- Rearranged local variable order in i40e_run_xdp_zc and ixgbe_run_xdp_zc
to comply with coding standards.
Ciara Loftus (3):
i40e: fix xdp handle calculations
ixgbe: fix xdp handle calculations
samples/bpf: fix xdpsock l2fwd tx for unaligned mode
drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 ++--
drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 4 ++--
samples/bpf/xdpsock_user.c | 2 +-
3 files changed, 5 insertions(+), 5 deletions(-)
--
2.17.1
^ permalink raw reply
* [PATCH bpf-next v2 1/3] i40e: fix xdp handle calculations
From: Ciara Loftus @ 2019-09-13 10:39 UTC (permalink / raw)
To: netdev, ast, daniel, bjorn.topel, magnus.karlsson, jonathan.lemon
Cc: bruce.richardson, bpf, intel-wired-lan, kevin.laatz, Ciara Loftus
In-Reply-To: <20190913103948.32053-1-ciara.loftus@intel.com>
Commit 4c5d9a7fa149 ("i40e: fix xdp handle calculations") reintroduced
the addition of the umem headroom to the xdp handle in the i40e_zca_free,
i40e_alloc_buffer_slow_zc and i40e_alloc_buffer_zc functions. However,
the headroom is already added to the handle in the function i40e_run_xdp_zc.
This commit removes the latter addition and fixes the case where the
headroom is non-zero.
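The double addition can be illustrated with simplified arithmetic. The sketch assumes aligned mode, where adjusting a handle is a plain addition, and uses hypothetical helper names; it is not the driver code.

```c
#include <stdint.h>

/* Aligned-mode model: adjusting a handle is a plain addition. */
static uint64_t adjust_offset(uint64_t handle, uint64_t offset)
{
	return handle + offset;
}

/* Handle as stored at buffer allocation: headroom is already included. */
static uint64_t alloc_handle(uint64_t chunk, uint64_t headroom)
{
	return chunk + headroom;
}

/* Before the fix: headroom is added a second time on the receive path. */
static uint64_t rx_handle_buggy(uint64_t handle, uint64_t headroom,
				uint64_t delta)
{
	return adjust_offset(handle, headroom + delta);
}

/* After the fix: only the data offset (data - data_hard_start) is added. */
static uint64_t rx_handle_fixed(uint64_t handle, uint64_t delta)
{
	return adjust_offset(handle, delta);
}
```

With a chunk at 4096, a headroom of 64 and a data offset of 256, the fixed path yields 4416 while the buggy path yields 4480, i.e. it is off by exactly the headroom. With zero headroom both paths agree, which is why only the non-zero case was affected.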
Fixes: 4c5d9a7fa149 ("i40e: fix xdp handle calculations")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index 0373bc6c7e61..a05dfecdd9b4 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -192,9 +192,9 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp)
{
struct xdp_umem *umem = rx_ring->xsk_umem;
int err, result = I40E_XDP_PASS;
- u64 offset = umem->headroom;
struct i40e_ring *xdp_ring;
struct bpf_prog *xdp_prog;
+ u64 offset;
u32 act;
rcu_read_lock();
@@ -203,7 +203,7 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp)
*/
xdp_prog = READ_ONCE(rx_ring->xdp_prog);
act = bpf_prog_run_xdp(xdp_prog, xdp);
- offset += xdp->data - xdp->data_hard_start;
+ offset = xdp->data - xdp->data_hard_start;
xdp->handle = xsk_umem_adjust_offset(umem, xdp->handle, offset);
--
2.17.1
^ permalink raw reply related
* [PATCH bpf-next v2 2/3] ixgbe: fix xdp handle calculations
From: Ciara Loftus @ 2019-09-13 10:39 UTC (permalink / raw)
To: netdev, ast, daniel, bjorn.topel, magnus.karlsson, jonathan.lemon
Cc: bruce.richardson, bpf, intel-wired-lan, kevin.laatz, Ciara Loftus
In-Reply-To: <20190913103948.32053-1-ciara.loftus@intel.com>
Commit 7cbbf9f1fa23 ("ixgbe: fix xdp handle calculations") reintroduced
the addition of the umem headroom to the xdp handle in the ixgbe_zca_free,
ixgbe_alloc_buffer_slow_zc and ixgbe_alloc_buffer_zc functions. However,
the headroom is already added to the handle in the function
ixgbe_run_xdp_zc. This commit removes the latter addition and fixes the
case where the headroom is non-zero.
Fixes: 7cbbf9f1fa23 ("ixgbe: fix xdp handle calculations")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index ad802a8909e0..fd45d12b5a98 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -145,15 +145,15 @@ static int ixgbe_run_xdp_zc(struct ixgbe_adapter *adapter,
{
struct xdp_umem *umem = rx_ring->xsk_umem;
int err, result = IXGBE_XDP_PASS;
- u64 offset = umem->headroom;
struct bpf_prog *xdp_prog;
struct xdp_frame *xdpf;
+ u64 offset;
u32 act;
rcu_read_lock();
xdp_prog = READ_ONCE(rx_ring->xdp_prog);
act = bpf_prog_run_xdp(xdp_prog, xdp);
- offset += xdp->data - xdp->data_hard_start;
+ offset = xdp->data - xdp->data_hard_start;
xdp->handle = xsk_umem_adjust_offset(umem, xdp->handle, offset);
--
2.17.1
^ permalink raw reply related
* [PATCH bpf-next v2 3/3] samples/bpf: fix xdpsock l2fwd tx for unaligned mode
From: Ciara Loftus @ 2019-09-13 10:39 UTC (permalink / raw)
To: netdev, ast, daniel, bjorn.topel, magnus.karlsson, jonathan.lemon
Cc: bruce.richardson, bpf, intel-wired-lan, kevin.laatz, Ciara Loftus
In-Reply-To: <20190913103948.32053-1-ciara.loftus@intel.com>
Preserve the offset of the address of the received descriptor, and include
it in the address set for the tx descriptor, so the kernel can correctly
locate the start of the packet data.
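A minimal sketch of the unaligned-mode address encoding, assuming the layout used by the libbpf helpers: the offset lives in the upper 16 bits and the base address in the lower 48 bits. The functions below are local re-implementations for illustration, with hypothetical names, not the libbpf API itself.

```c
#include <stdint.h>

#define OFFSET_SHIFT 48
#define ADDR_MASK ((UINT64_C(1) << OFFSET_SHIFT) - 1)

/* Base address of the buffer: lower 48 bits of the descriptor address. */
static uint64_t extract_addr(uint64_t addr)
{
	return addr & ADDR_MASK;
}

/* Offset of the packet data within the buffer: upper 16 bits. */
static uint64_t extract_offset(uint64_t addr)
{
	return addr >> OFFSET_SHIFT;
}

/* Absolute position of the packet data: base address plus offset. */
static uint64_t add_offset_to_addr(uint64_t addr)
{
	return extract_addr(addr) + extract_offset(addr);
}
```

For a received descriptor address with base 8192 and offset 256, extract_addr() returns 8192 and the offset is gone. Writing that value into the tx descriptor tells the kernel the data starts at the beginning of the buffer, whereas keeping the full address preserves the offset.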
Fixes: 03895e63ff97 ("samples/bpf: add buffer recycling for unaligned chunks to xdpsock")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
samples/bpf/xdpsock_user.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index 102eace22956..df011ac33402 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -685,7 +685,7 @@ static void l2fwd(struct xsk_socket_info *xsk, struct pollfd *fds)
for (i = 0; i < rcvd; i++) {
u64 addr = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx)->addr;
u32 len = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx++)->len;
- u64 orig = xsk_umem__extract_addr(addr);
+ u64 orig = addr;
addr = xsk_umem__add_offset_to_addr(addr);
char *pkt = xsk_umem__get_data(xsk->umem->buffer, addr);
--
2.17.1
^ permalink raw reply related
* Re: [PATCH] net: openvswitch: free vport unless register_netdevice() succeeds
From: Stefano Brivio @ 2019-09-13 11:04 UTC (permalink / raw)
To: Pravin Shelar
Cc: Hillf Danton, ovs dev, David S. Miller, linux-kernel,
Linux Kernel Network Developers, syzkaller-bugs, syzbot,
Taehee Yoo, Greg Rose, Eric Dumazet, Marcelo Ricardo Leitner,
Ying Xue, Andrey Konovalov, Jiri Benc, Eelco Chaudron,
Paolo Abeni
In-Reply-To: <CAOrHB_AoBJ37+gFNysZr+v1ySXWZP1CHTw0SDR826fWGgFRZ+g@mail.gmail.com>
On Sat, 10 Aug 2019 00:34:55 -0700
Pravin Shelar <pshelar@ovn.org> wrote:
> On Thu, Aug 8, 2019 at 8:55 PM Hillf Danton <hdanton@sina.com> wrote:
> >
> >
> > syzbot found the following crash on:
> >
> > HEAD commit: 1e78030e Merge tag 'mmc-v5.3-rc1' of git://git.kernel.org/..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=148d3d1a600000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=30cef20daf3e9977
> > dashboard link: https://syzkaller.appspot.com/bug?extid=13210896153522fe1ee5
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=136aa8c4600000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=109ba792600000
> >
> > =====================================================================
> > BUG: memory leak
> > unreferenced object 0xffff8881207e4100 (size 128):
> > comm "syz-executor032", pid 7014, jiffies 4294944027 (age 13.830s)
> > hex dump (first 32 bytes):
> > 00 70 16 18 81 88 ff ff 80 af 8c 22 81 88 ff ff .p........."....
> > 00 b6 23 17 81 88 ff ff 00 00 00 00 00 00 00 00 ..#.............
> > backtrace:
> > [<000000000eb78212>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
> > [<000000000eb78212>] slab_post_alloc_hook mm/slab.h:522 [inline]
> > [<000000000eb78212>] slab_alloc mm/slab.c:3319 [inline]
> > [<000000000eb78212>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
> > [<00000000006ea6c6>] kmalloc include/linux/slab.h:552 [inline]
> > [<00000000006ea6c6>] kzalloc include/linux/slab.h:748 [inline]
> > [<00000000006ea6c6>] ovs_vport_alloc+0x37/0xf0 net/openvswitch/vport.c:130
> > [<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0 net/openvswitch/vport-internal_dev.c:164
> > [<0000000056ee7c13>] ovs_vport_add+0x81/0x190 net/openvswitch/vport.c:199
> > [<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
> > [<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410 net/openvswitch/datapath.c:1614
> > [<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0 net/netlink/genetlink.c:629
> > [<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
> > [<000000006694b647>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477
> > [<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
> > [<00000000dad42a47>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
> > [<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0 net/netlink/af_netlink.c:1328
> > [<0000000067e6b079>] netlink_sendmsg+0x270/0x480 net/netlink/af_netlink.c:1917
> > [<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
> > [<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
> > [<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
> > [<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
> > [<00000000c10abb2d>] __do_sys_sendmsg net/socket.c:2365 [inline]
> > [<00000000c10abb2d>] __se_sys_sendmsg net/socket.c:2363 [inline]
> > [<00000000c10abb2d>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2363
> >
> > BUG: memory leak
> > unreferenced object 0xffff88811723b600 (size 64):
> > comm "syz-executor032", pid 7014, jiffies 4294944027 (age 13.830s)
> > hex dump (first 32 bytes):
> > 01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
> > 00 00 00 00 00 00 00 00 02 00 00 00 05 35 82 c1 .............5..
> > backtrace:
> > [<00000000352f46d8>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
> > [<00000000352f46d8>] slab_post_alloc_hook mm/slab.h:522 [inline]
> > [<00000000352f46d8>] slab_alloc mm/slab.c:3319 [inline]
> > [<00000000352f46d8>] __do_kmalloc mm/slab.c:3653 [inline]
> > [<00000000352f46d8>] __kmalloc+0x169/0x300 mm/slab.c:3664
> > [<000000008e48f3d1>] kmalloc include/linux/slab.h:557 [inline]
> > [<000000008e48f3d1>] ovs_vport_set_upcall_portids+0x54/0xd0 net/openvswitch/vport.c:343
> > [<00000000541e4f4a>] ovs_vport_alloc+0x7f/0xf0 net/openvswitch/vport.c:139
> > [<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0 net/openvswitch/vport-internal_dev.c:164
> > [<0000000056ee7c13>] ovs_vport_add+0x81/0x190 net/openvswitch/vport.c:199
> > [<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
> > [<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410 net/openvswitch/datapath.c:1614
> > [<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0 net/netlink/genetlink.c:629
> > [<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
> > [<000000006694b647>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477
> > [<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
> > [<00000000dad42a47>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
> > [<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0 net/netlink/af_netlink.c:1328
> > [<0000000067e6b079>] netlink_sendmsg+0x270/0x480 net/netlink/af_netlink.c:1917
> > [<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
> > [<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
> > [<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
> > [<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
> >
> > BUG: memory leak
> > unreferenced object 0xffff8881228ca500 (size 128):
> > comm "syz-executor032", pid 7015, jiffies 4294944622 (age 7.880s)
> > hex dump (first 32 bytes):
> > 00 f0 27 18 81 88 ff ff 80 ac 8c 22 81 88 ff ff ..'........"....
> > 40 b7 23 17 81 88 ff ff 00 00 00 00 00 00 00 00 @.#.............
> > backtrace:
> > [<000000000eb78212>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
> > [<000000000eb78212>] slab_post_alloc_hook mm/slab.h:522 [inline]
> > [<000000000eb78212>] slab_alloc mm/slab.c:3319 [inline]
> > [<000000000eb78212>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
> > [<00000000006ea6c6>] kmalloc include/linux/slab.h:552 [inline]
> > [<00000000006ea6c6>] kzalloc include/linux/slab.h:748 [inline]
> > [<00000000006ea6c6>] ovs_vport_alloc+0x37/0xf0 net/openvswitch/vport.c:130
> > [<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0 net/openvswitch/vport-internal_dev.c:164
> > [<0000000056ee7c13>] ovs_vport_add+0x81/0x190 net/openvswitch/vport.c:199
> > [<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
> > [<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410 net/openvswitch/datapath.c:1614
> > [<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0 net/netlink/genetlink.c:629
> > [<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
> > [<000000006694b647>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477
> > [<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
> > [<00000000dad42a47>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
> > [<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0 net/netlink/af_netlink.c:1328
> > [<0000000067e6b079>] netlink_sendmsg+0x270/0x480 net/netlink/af_netlink.c:1917
> > [<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
> > [<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
> > [<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
> > [<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
> > [<00000000c10abb2d>] __do_sys_sendmsg net/socket.c:2365 [inline]
> > [<00000000c10abb2d>] __se_sys_sendmsg net/socket.c:2363 [inline]
> > [<00000000c10abb2d>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2363
> > =====================================================================
> >
> > The function in net core, register_netdevice(), may fail with vport's
> > destruction callback either invoked or not. After commit 309b66970ee2,
> > the duty to destroy vport is offloaded from the driver OTOH, which ends
> > up in the memory leak reported.
> >
> > It is fixed by releasing vport unless device is registered successfully.
> > To do that, the callback assignment is deferred until device is registered.
> >
> > Reported-by: syzbot+13210896153522fe1ee5@syzkaller.appspotmail.com
> > Fixes: 309b66970ee2 ("net: openvswitch: do not free vport if register_netdevice() is failed.")
> > Cc: Taehee Yoo <ap420073@gmail.com>
> > Cc: Greg Rose <gvrose8192@gmail.com>
> > Cc: Eric Dumazet <eric.dumazet@gmail.com>
> > Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> > Cc: Ying Xue <ying.xue@windriver.com>
> > Cc: Andrey Konovalov <andreyknvl@google.com>
> > Signed-off-by: Hillf Danton <hdanton@sina.com>
> > ---
> >
> > --- a/net/openvswitch/vport-internal_dev.c
> > +++ b/net/openvswitch/vport-internal_dev.c
> > @@ -137,7 +137,7 @@ static void do_setup(struct net_device *
> > netdev->priv_flags |= IFF_LIVE_ADDR_CHANGE | IFF_OPENVSWITCH |
> > IFF_NO_QUEUE;
> > netdev->needs_free_netdev = true;
> > - netdev->priv_destructor = internal_dev_destructor;
> > + netdev->priv_destructor = NULL;
> > netdev->ethtool_ops = &internal_dev_ethtool_ops;
> > netdev->rtnl_link_ops = &internal_dev_link_ops;
> >
> > @@ -159,7 +159,6 @@ static struct vport *internal_dev_create
> > struct internal_dev *internal_dev;
> > struct net_device *dev;
> > int err;
> > - bool free_vport = true;
> >
> > vport = ovs_vport_alloc(0, &ovs_internal_vport_ops, parms);
> > if (IS_ERR(vport)) {
> > @@ -190,10 +189,9 @@ static struct vport *internal_dev_create
> >
> > rtnl_lock();
> > err = register_netdevice(vport->dev);
> > - if (err) {
> > - free_vport = false;
> > + if (err)
> > goto error_unlock;
> > - }
> > + vport->dev->priv_destructor = internal_dev_destructor;
> >
>
> Looks good.
> Acked-by: Pravin B Shelar <pshelar@ovn.org>
Pravin, Hillf,
It looks like this patch was never posted to netdev, so it wasn't
picked up by Patchwork -- am I missing something? If not, Hillf, can
you please re-send this patch to the list? Thanks.
--
Stefano
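For readers following the thread, the ordering argument in the quoted patch can be sketched in plain userspace C. This is only an illustration under stated assumptions: demo_netdev, demo_register() and the DEMO_REG_* outcomes are hypothetical stand-ins, not kernel APIs. The sketch models a registration step that may fail with or without running the private destructor, and shows that installing the destructor only after success leaves exactly one owner of the cleanup on every failure path.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts vport allocations that have not been freed yet. */
static int demo_live_vports;

/* Hypothetical stand-in for a net_device with a private destructor. */
struct demo_netdev {
	void (*priv_destructor)(struct demo_netdev *dev);
	void *vport;
};

static void demo_vport_destructor(struct demo_netdev *dev)
{
	free(dev->vport);
	dev->vport = NULL;
	demo_live_vports--;
}

enum demo_reg_result {
	DEMO_REG_OK,		/* registration succeeds */
	DEMO_REG_FAIL_EARLY,	/* fails without touching the destructor */
	DEMO_REG_FAIL_LATE,	/* fails after running the destructor, if set */
};

/* Models a registration call whose failure paths differ in cleanup. */
static int demo_register(struct demo_netdev *dev, enum demo_reg_result outcome)
{
	switch (outcome) {
	case DEMO_REG_OK:
		return 0;
	case DEMO_REG_FAIL_LATE:
		if (dev->priv_destructor)
			dev->priv_destructor(dev);
		return -1;
	default:
		return -1;
	}
}

static int demo_vport_create(struct demo_netdev *dev,
			     enum demo_reg_result outcome)
{
	/* Keep the destructor unset until registration has succeeded. */
	dev->priv_destructor = NULL;
	dev->vport = malloc(64);
	if (!dev->vport)
		return -1;
	demo_live_vports++;

	if (demo_register(dev, outcome)) {
		/* No destructor was installed, so cleanup is always ours. */
		free(dev->vport);
		dev->vport = NULL;
		demo_live_vports--;
		return -1;
	}

	dev->priv_destructor = demo_vport_destructor;
	return 0;
}
```

Calling demo_vport_create() with either failure outcome leaves demo_live_vports at zero, so the caller neither leaks the vport nor frees it twice.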
^ permalink raw reply
* Re: [PATCH bpf-next v2 0/3] AF_XDP fixes for i40e, ixgbe and xdpsock
From: Björn Töpel @ 2019-09-13 11:16 UTC (permalink / raw)
To: Ciara Loftus, netdev, ast, daniel, magnus.karlsson,
jonathan.lemon
Cc: bruce.richardson, bpf, intel-wired-lan, kevin.laatz
In-Reply-To: <20190913103948.32053-1-ciara.loftus@intel.com>
On 2019-09-13 12:39, Ciara Loftus wrote:
> This patch set contains some fixes for AF_XDP zero copy in the i40e and
> ixgbe drivers as well as a fix for the 'xdpsock' sample application when
> running in unaligned mode.
>
> Patches 1 and 2 fix a regression for the i40e and ixgbe drivers which
> caused the umem headroom to be added to the xdp handle twice, resulting in
> an incorrect value being received by the user for the case where the umem
> headroom is non-zero.
>
> Patch 3 fixes an issue with the xdpsock sample application whereby the
> start of the tx packet data (offset) was not being set correctly when the
> application was being run in unaligned mode.
>
> This patch set has been applied against commit a2c11b034142 ("kcm: use
> BPF_PROG_RUN")
>
> ---
> v2:
> - Rearranged local variable order in i40e_run_xdp_zc and ixgbe_run_xdp_zc
> to comply with coding standards.
>
Thanks Ciara!
Acked-by: Björn Töpel <bjorn.topel@intel.com>
> Ciara Loftus (3):
> i40e: fix xdp handle calculations
> ixgbe: fix xdp handle calculations
> samples/bpf: fix xdpsock l2fwd tx for unaligned mode
>
> drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 ++--
> drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 4 ++--
> samples/bpf/xdpsock_user.c | 2 +-
> 3 files changed, 5 insertions(+), 5 deletions(-)
>
^ permalink raw reply
* Re: [PATCH bpf-next] libbpf: add xsk_umem__adjust_offset
From: Björn Töpel @ 2019-09-13 11:17 UTC (permalink / raw)
To: Laatz, Kevin, Björn Töpel
Cc: Netdev, Alexei Starovoitov, Daniel Borkmann, Karlsson, Magnus,
Jonathan Lemon, Bruce Richardson, ciara.loftus, bpf
In-Reply-To: <847dcd1e-81ba-4364-7242-d280a17f9244@intel.com>
On 2019-09-13 12:21, Laatz, Kevin wrote:
> On 13/09/2019 05:59, Björn Töpel wrote:
>> On Thu, 12 Sep 2019 at 17:47, Kevin Laatz <kevin.laatz@intel.com> wrote:
>>> Currently, xsk_umem_adjust_offset exists as a kernel internal function.
>>> This patch adds xsk_umem__adjust_offset to libbpf so that it can be used
>>> from userspace. This will take the responsibility of properly storing
>>> the
>>> offset away from the application, making it less error prone.
>>>
>>> Since xsk_umem__adjust_offset is called on a per-packet basis, we
>>> need to
>>> inline the function to avoid any performance regressions. In order to
>>> inline xsk_umem__adjust_offset, we need to add it to xsk.h.
>>> Unfortunately
>>> this means that we can't dereference the xsk_umem_config struct directly
>>> since it is defined only in xsk.c. We therefore add an extra API to
>>> return
>>> the flags field to the user from the structure, and have the inline
>>> function use this flags field directly.
>>>
>> Can you expand this to a series, with an additional patch where these
>> functions are used in XDP socket sample application, so it's more
>> clear for users?
>
> These functions are currently not required in the xdpsock application
> and I think forcing them in would be messy :-). However, an example of
> the use of these functions could be seen in the DPDK AF_XDP PMD. There
> is a patch here http://patches.dpdk.org/patch/58624/ where we currently
> do the offset adjustment to the handle manually, but with this patch we
> could replace it with xsk_umem__adjust_offset and have a real use
> example of the functions being used.
>
> Would this be enough for an example?
>
Fair enough! :-)
Acked-by: Björn Töpel <bjorn.topel@intel.com>
> Thanks,
> Kevin
>
^ permalink raw reply
* [PATCH 00/27] Netfilter updates for net-next
From: Pablo Neira Ayuso @ 2019-09-13 11:30 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
Hi,
The following patchset contains Netfilter updates for net-next:
1) Fix error path of nf_tables_updobj(), from Dan Carpenter.
2) Move large structure away from stack in the nf_tables offload
infrastructure, from Arnd Bergmann.
3) Move indirect flow_block logic to nf_tables_offload.
4) Support for synproxy objects, from Fernando Fernandez Mancera.
5) Support for fwd and dup offload.
6) Add __nft_offload_get_chain() helper, this implicitly fixes missing
mutex and check for offload flags in the indirect block support,
patch from wenxu.
7) Remove rules on device unregistration, from wenxu. This includes
two preparation patches to reuse nft_flow_offload_chain() and
nft_flow_offload_rule().
Large batch from Jeremy Sowden to make a second pass to the
CONFIG_HEADER_TEST support and a bit of housekeeping:
8) Missing include guard in conntrack label header, from Jeremy Sowden.
9) A few coding style errors: trailing whitespace, incorrect indent in
Kconfig, and semicolons at the end of function definitions.
10) Remove unused ipt_init() and ip6t_init() declarations.
11) Inline xt_hashlimit, ebt_802_3 and xt_physdev headers. They are
only used once.
12) Update include directive in several netfilter files.
13) Remove unused include/net/netfilter/ipv6/nf_conntrack_icmpv6.h.
14) Move nf_ip6_ext_hdr() to include/linux/netfilter_ipv6.h
15) Move several synproxy structure definitions to nf_synproxy.h
16) Move nf_bridge_frag_data structure to include/linux/netfilter_bridge.h
17) Clean up static inline definitions in nf_conntrack_ecache.h.
18) Replace defined(CONFIG...) || defined(CONFIG...MODULE) with IS_ENABLED(CONFIG...).
19) Missing inline function conditional definitions based on Kconfig
preferences in synproxy and nf_conntrack_timeout.
20) Update br_nf_pre_routing_ipv6() definition.
21) Move conntrack code in linux/skbuff.h to nf_conntrack headers.
22) Several patches to remove superfluous CONFIG_NETFILTER and
CONFIG_NF_CONNTRACK checks in headers, coming from the initial batch
support for CONFIG_HEADER_TEST for netfilter.
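As an illustration of item 18, here is a small userspace re-creation of the idea behind IS_ENABLED(). It is a sketch, not the kernel header: the DEMO_/demo_ macro names and the CONFIG_DEMO_* options are made up for this example, and an extra dummy argument is passed so that the variadic helper always receives at least one variadic argument. The point is that a single macro yields 1 when an option is built in or modular and 0 otherwise, so it can replace the two-part defined() test.

```c
#include <assert.h>

/* Hypothetical Kconfig outcome: FOO is built in, BAR is a module, BAZ is off. */
#define CONFIG_DEMO_FOO 1
#define CONFIG_DEMO_BAR_MODULE 1

/*
 * If the option expands to 1, the pasted placeholder expands to "0," and
 * shifts a literal 1 into the second argument slot. Otherwise the pasted
 * token is junk that stays in the first slot and the second slot holds 0.
 */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define demo_take_second_arg(ignored, val, ...) val
#define demo_is_defined(x) demo_is_defined_(x)
#define demo_is_defined_(val) demo_is_defined__(DEMO_ARG_PLACEHOLDER_##val)
#define demo_is_defined__(arg1_or_junk) \
	demo_take_second_arg(arg1_or_junk 1, 0, 0)

/* 1 if the option is built in or modular, 0 otherwise. */
#define DEMO_IS_ENABLED(option) \
	(demo_is_defined(option) || demo_is_defined(option##_MODULE))

/* The older spelling that the series replaces, shown for comparison. */
#if defined(CONFIG_DEMO_BAR) || defined(CONFIG_DEMO_BAR_MODULE)
#define DEMO_BAR_OLD_STYLE 1
#else
#define DEMO_BAR_OLD_STYLE 0
#endif

static int demo_foo_enabled(void)
{
	return DEMO_IS_ENABLED(CONFIG_DEMO_FOO);
}

static int demo_bar_enabled(void)
{
	return DEMO_IS_ENABLED(CONFIG_DEMO_BAR);
}

static int demo_baz_enabled(void)
{
	return DEMO_IS_ENABLED(CONFIG_DEMO_BAZ);
}
```

Because the macro expands to an ordinary constant expression, it can be used both in #if directives and in regular C conditions.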
You can pull these changes from:
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git
Thanks.
----------------------------------------------------------------
The following changes since commit 6703a605b5ab33502d7a327de880188013d7c377:
Merge branch 'net-tls-small-TX-offload-optimizations' (2019-09-07 18:10:34 +0200)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git HEAD
for you to fetch changes up to 0d32e7048d927418300b9f5415ca546e44621ef1:
netfilter: conntrack: remove two unused functions from nf_conntrack_timestamp.h. (2019-09-13 12:48:09 +0200)
----------------------------------------------------------------
Arnd Bergmann (1):
netfilter: nf_tables_offload: avoid excessive stack usage
Dan Carpenter (1):
netfilter: nf_tables: Fix an Oops in nf_tables_updobj() error handling
Fernando Fernandez Mancera (1):
netfilter: nft_synproxy: add synproxy stateful object support
Jeremy Sowden (18):
netfilter: fix include guards.
netfilter: fix coding-style errors.
netfilter: ip_tables: remove unused function declarations.
netfilter: inline xt_hashlimit, ebt_802_3 and xt_physdev headers
netfilter: update include directives.
netfilter: remove nf_conntrack_icmpv6.h header.
netfilter: move inline nf_ip6_ext_hdr() function to a more appropriate header.
netfilter: synproxy: move code between headers.
netfilter: move nf_bridge_frag_data struct definition to a more appropriate header.
netfilter: conntrack: use consistent style when defining inline functions
netfilter: replace defined(CONFIG...) || defined(CONFIG...MODULE) with IS_ENABLED(CONFIG...).
netfilter: conntrack: wrap two inline functions in config checks.
netfilter: br_netfilter: update stub br_nf_pre_routing_ipv6 parameter to `void *priv`.
netfilter: conntrack: move code to linux/nf_conntrack_common.h.
netfilter: conntrack: remove CONFIG_NF_CONNTRACK check from nf_conntrack_acct.h.
netfilter: remove CONFIG_NETFILTER checks from headers.
netfilter: conntrack: remove CONFIG_NF_CONNTRACK checks from nf_conntrack_zones.h.
netfilter: conntrack: remove two unused functions from nf_conntrack_timestamp.h.
Pablo Neira Ayuso (2):
netfilter: nf_tables_offload: move indirect flow_block callback logic to core
netfilter: nft_{fwd,dup}_netdev: add offload support
wenxu (4):
netfilter: nf_tables_offload: add __nft_offload_get_chain function
netfilter: nf_tables_offload: refactor the nft_flow_offload_chain function
netfilter: nf_tables_offload: refactor the nft_flow_offload_rule function
netfilter: nf_tables_offload: remove rules when the device unregisters
include/linux/netfilter.h | 4 +-
include/linux/netfilter/ipset/ip_set_getport.h | 2 +-
include/linux/netfilter/nf_conntrack_common.h | 20 +++
include/linux/netfilter/x_tables.h | 8 +-
include/linux/netfilter/xt_hashlimit.h | 11 --
include/linux/netfilter/xt_physdev.h | 8 -
include/linux/netfilter_arp/arp_tables.h | 2 -
include/linux/netfilter_bridge.h | 7 +
include/linux/netfilter_bridge/ebt_802_3.h | 12 --
include/linux/netfilter_bridge/ebtables.h | 3 +-
include/linux/netfilter_ipv4/ip_tables.h | 9 +-
include/linux/netfilter_ipv6.h | 28 +++-
include/linux/netfilter_ipv6/ip6_tables.h | 20 +--
include/linux/skbuff.h | 32 ++--
include/net/netfilter/br_netfilter.h | 4 +-
include/net/netfilter/ipv6/nf_conntrack_icmpv6.h | 21 ---
include/net/netfilter/nf_conntrack.h | 25 +--
include/net/netfilter/nf_conntrack_acct.h | 4 +-
include/net/netfilter/nf_conntrack_bridge.h | 11 +-
include/net/netfilter/nf_conntrack_core.h | 8 +-
include/net/netfilter/nf_conntrack_ecache.h | 84 ++++++----
include/net/netfilter/nf_conntrack_expect.h | 2 +-
include/net/netfilter/nf_conntrack_extend.h | 2 +-
include/net/netfilter/nf_conntrack_l4proto.h | 16 +-
include/net/netfilter/nf_conntrack_labels.h | 11 +-
include/net/netfilter/nf_conntrack_synproxy.h | 41 +----
include/net/netfilter/nf_conntrack_timeout.h | 4 +
include/net/netfilter/nf_conntrack_timestamp.h | 16 --
include/net/netfilter/nf_conntrack_tuple.h | 4 +-
include/net/netfilter/nf_conntrack_zones.h | 6 +-
include/net/netfilter/nf_dup_netdev.h | 6 +
include/net/netfilter/nf_flow_table.h | 6 +-
include/net/netfilter/nf_nat.h | 21 +--
include/net/netfilter/nf_nat_masquerade.h | 1 +
include/net/netfilter/nf_queue.h | 4 -
include/net/netfilter/nf_synproxy.h | 44 +++++-
include/net/netfilter/nf_tables.h | 8 -
include/net/netfilter/nf_tables_offload.h | 10 +-
include/uapi/linux/netfilter/nf_tables.h | 3 +-
net/bridge/netfilter/ebt_802_3.c | 8 +-
net/bridge/netfilter/nf_conntrack_bridge.c | 15 +-
net/ipv4/netfilter/Kconfig | 8 +-
net/ipv4/netfilter/Makefile | 2 +-
net/ipv6/netfilter.c | 4 +-
net/ipv6/netfilter/ip6t_ipv6header.c | 4 +-
net/ipv6/netfilter/nf_log_ipv6.c | 4 +-
net/ipv6/netfilter/nf_socket_ipv6.c | 1 -
net/netfilter/Kconfig | 8 +-
net/netfilter/Makefile | 2 +-
net/netfilter/nf_conntrack_ecache.c | 1 +
net/netfilter/nf_conntrack_expect.c | 2 +
net/netfilter/nf_conntrack_helper.c | 5 +-
net/netfilter/nf_conntrack_proto_icmpv6.c | 1 -
net/netfilter/nf_conntrack_standalone.c | 1 -
net/netfilter/nf_conntrack_timeout.c | 1 +
net/netfilter/nf_dup_netdev.c | 21 +++
net/netfilter/nf_flow_table_core.c | 1 +
net/netfilter/nf_nat_core.c | 6 +-
net/netfilter/nf_tables_api.c | 25 +--
net/netfilter/nf_tables_offload.c | 186 ++++++++++++++++++-----
net/netfilter/nft_dup_netdev.c | 12 ++
net/netfilter/nft_flow_offload.c | 3 +-
net/netfilter/nft_fwd_netdev.c | 12 ++
net/netfilter/nft_synproxy.c | 143 ++++++++++++++---
net/netfilter/xt_connlimit.c | 2 +
net/netfilter/xt_hashlimit.c | 7 +-
net/netfilter/xt_physdev.c | 5 +-
net/sched/act_ct.c | 2 +-
68 files changed, 603 insertions(+), 417 deletions(-)
delete mode 100644 include/linux/netfilter/xt_hashlimit.h
delete mode 100644 include/linux/netfilter/xt_physdev.h
delete mode 100644 include/linux/netfilter_bridge/ebt_802_3.h
delete mode 100644 include/net/netfilter/ipv6/nf_conntrack_icmpv6.h
^ permalink raw reply
* [PATCH 02/27] netfilter: nf_tables_offload: avoid excessive stack usage
From: Pablo Neira Ayuso @ 2019-09-13 11:30 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20190913113102.15776-1-pablo@netfilter.org>
From: Arnd Bergmann <arnd@arndb.de>
The nft_offload_ctx structure is much too large to put on the
stack:
net/netfilter/nf_tables_offload.c:31:23: error: stack frame size of 1200 bytes in function 'nft_flow_rule_create' [-Werror,-Wframe-larger-than=]
Use dynamic allocation here, as we do elsewhere in the same
function.
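The pattern described above can be sketched in userspace C, with calloc()/free() standing in for kzalloc()/kfree(). The demo_ctx and demo_flow types and the fail_step parameter are hypothetical and exist only for illustration. The sketch shows a large context moved off the stack onto the heap, with every failure path sharing one cleanup label.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical context that is too large to keep on the stack. */
struct demo_ctx {
	int dep_type;
	int l3num;
	unsigned char regs[4096];
};

struct demo_flow {
	int proto;
};

/*
 * Build a flow object. On failure, return NULL and store a negative
 * errno value in *err. A non-zero fail_step simulates an expression
 * that cannot be offloaded.
 */
static struct demo_flow *demo_flow_create(int l3num, int fail_step, int *err)
{
	struct demo_ctx *ctx = NULL;
	struct demo_flow *flow;

	*err = 0;
	flow = calloc(1, sizeof(*flow));
	if (!flow) {
		*err = -ENOMEM;
		return NULL;
	}

	/* Heap allocation keeps the stack frame small. */
	ctx = calloc(1, sizeof(*ctx));
	if (!ctx) {
		*err = -ENOMEM;
		goto err_out;
	}
	ctx->l3num = l3num;

	if (fail_step) {
		*err = -EOPNOTSUPP;
		goto err_out;
	}

	flow->proto = ctx->l3num;
	free(ctx);
	return flow;

err_out:
	free(ctx);	/* free(NULL) is a no-op, so this is safe on every path */
	free(flow);
	return NULL;
}
```

The context must be released on the success path as well as the error path, which is why the patch adds a kfree() in both places.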
Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_tables_offload.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 3c2725ade61b..fabe2997188b 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -30,11 +30,7 @@ static struct nft_flow_rule *nft_flow_rule_alloc(int num_actions)
struct nft_flow_rule *nft_flow_rule_create(const struct nft_rule *rule)
{
- struct nft_offload_ctx ctx = {
- .dep = {
- .type = NFT_OFFLOAD_DEP_UNSPEC,
- },
- };
+ struct nft_offload_ctx *ctx;
struct nft_flow_rule *flow;
int num_actions = 0, err;
struct nft_expr *expr;
@@ -52,21 +48,31 @@ struct nft_flow_rule *nft_flow_rule_create(const struct nft_rule *rule)
return ERR_PTR(-ENOMEM);
expr = nft_expr_first(rule);
+
+ ctx = kzalloc(sizeof(struct nft_offload_ctx), GFP_KERNEL);
+ if (!ctx) {
+ err = -ENOMEM;
+ goto err_out;
+ }
+ ctx->dep.type = NFT_OFFLOAD_DEP_UNSPEC;
+
while (expr->ops && expr != nft_expr_last(rule)) {
if (!expr->ops->offload) {
err = -EOPNOTSUPP;
goto err_out;
}
- err = expr->ops->offload(&ctx, flow, expr);
+ err = expr->ops->offload(ctx, flow, expr);
if (err < 0)
goto err_out;
expr = nft_expr_next(expr);
}
- flow->proto = ctx.dep.l3num;
+ flow->proto = ctx->dep.l3num;
+ kfree(ctx);
return flow;
err_out:
+ kfree(ctx);
nft_flow_rule_destroy(flow);
return ERR_PTR(err);
--
2.11.0
^ permalink raw reply related
* [PATCH 03/27] netfilter: nf_tables_offload: move indirect flow_block callback logic to core
From: Pablo Neira Ayuso @ 2019-09-13 11:30 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20190913113102.15776-1-pablo@netfilter.org>
Add nft_offload_init() and nft_offload_exit() functions to deal with the
init and the exit path of the offload infrastructure.
Rename nft_indr_block_get_and_ing_cmd() to nft_indr_block_cb().
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/net/netfilter/nf_tables_offload.h | 7 +++----
net/netfilter/nf_tables_api.c | 10 +++-------
net/netfilter/nf_tables_offload.c | 22 ++++++++++++++++++----
3 files changed, 24 insertions(+), 15 deletions(-)
diff --git a/include/net/netfilter/nf_tables_offload.h b/include/net/netfilter/nf_tables_offload.h
index db104665a9e4..6de896ebcf30 100644
--- a/include/net/netfilter/nf_tables_offload.h
+++ b/include/net/netfilter/nf_tables_offload.h
@@ -64,10 +64,6 @@ struct nft_rule;
struct nft_flow_rule *nft_flow_rule_create(const struct nft_rule *rule);
void nft_flow_rule_destroy(struct nft_flow_rule *flow);
int nft_flow_rule_offload_commit(struct net *net);
-void nft_indr_block_get_and_ing_cmd(struct net_device *dev,
- flow_indr_block_bind_cb_t *cb,
- void *cb_priv,
- enum flow_block_command command);
#define NFT_OFFLOAD_MATCH(__key, __base, __field, __len, __reg) \
(__reg)->base_offset = \
@@ -80,4 +76,7 @@ void nft_indr_block_get_and_ing_cmd(struct net_device *dev,
int nft_chain_offload_priority(struct nft_base_chain *basechain);
+void nft_offload_init(void);
+void nft_offload_exit(void);
+
#endif
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 7def31ae3022..efd0c97cc2a3 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -7669,11 +7669,6 @@ static struct pernet_operations nf_tables_net_ops = {
.exit = nf_tables_exit_net,
};
-static struct flow_indr_block_ing_entry block_ing_entry = {
- .cb = nft_indr_block_get_and_ing_cmd,
- .list = LIST_HEAD_INIT(block_ing_entry.list),
-};
-
static int __init nf_tables_module_init(void)
{
int err;
@@ -7705,7 +7700,8 @@ static int __init nf_tables_module_init(void)
goto err5;
nft_chain_route_init();
- flow_indr_add_block_ing_cb(&block_ing_entry);
+ nft_offload_init();
+
return err;
err5:
rhltable_destroy(&nft_objname_ht);
@@ -7722,7 +7718,7 @@ static int __init nf_tables_module_init(void)
static void __exit nf_tables_module_exit(void)
{
- flow_indr_del_block_ing_cb(&block_ing_entry);
+ nft_offload_exit();
nfnetlink_subsys_unregister(&nf_tables_subsys);
unregister_netdevice_notifier(&nf_tables_flowtable_notifier);
nft_chain_filter_fini();
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index fabe2997188b..8abf193f8012 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -354,10 +354,9 @@ int nft_flow_rule_offload_commit(struct net *net)
return err;
}
-void nft_indr_block_get_and_ing_cmd(struct net_device *dev,
- flow_indr_block_bind_cb_t *cb,
- void *cb_priv,
- enum flow_block_command command)
+static void nft_indr_block_cb(struct net_device *dev,
+ flow_indr_block_bind_cb_t *cb, void *cb_priv,
+ enum flow_block_command command)
{
struct net *net = dev_net(dev);
const struct nft_table *table;
@@ -383,3 +382,18 @@ void nft_indr_block_get_and_ing_cmd(struct net_device *dev,
}
}
}
+
+static struct flow_indr_block_ing_entry block_ing_entry = {
+ .cb = nft_indr_block_cb,
+ .list = LIST_HEAD_INIT(block_ing_entry.list),
+};
+
+void nft_offload_init(void)
+{
+ flow_indr_add_block_ing_cb(&block_ing_entry);
+}
+
+void nft_offload_exit(void)
+{
+ flow_indr_del_block_ing_cb(&block_ing_entry);
+}
--
2.11.0
^ permalink raw reply related
* [PATCH 06/27] netfilter: nf_tables_offload: add __nft_offload_get_chain function
From: Pablo Neira Ayuso @ 2019-09-13 11:30 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20190913113102.15776-1-pablo@netfilter.org>
From: wenxu <wenxu@ucloud.cn>
Add __nft_offload_get_chain function to get basechain from device. This
function requires that caller holds the per-netns nftables mutex. This
patch implicitly fixes missing offload flags check and proper mutex from
nft_indr_block_cb().
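A rough userspace sketch of the split described above: a lookup helper that expects its caller to hold the lock, and a callback wrapper that takes the lock, looks up the chain and acts on it. All names here (demo_table, demo_chain, the DEMO_* flags) are hypothetical, and a plain integer stands in for the real mutex so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_IFNAMSIZ 16
#define DEMO_CHAIN_BASE		(1u << 0)
#define DEMO_CHAIN_HW_OFFLOAD	(1u << 1)

struct demo_chain {
	unsigned int flags;
	char dev_name[DEMO_IFNAMSIZ];
};

struct demo_table {
	int lock_held;		/* stand-in for a real mutex */
	struct demo_chain *chains;
	size_t num_chains;
};

/* Caller must hold the table lock. Returns the first match or NULL. */
static struct demo_chain *demo_get_chain_locked(struct demo_table *t,
						const char *dev_name)
{
	size_t i;

	assert(t->lock_held);

	for (i = 0; i < t->num_chains; i++) {
		struct demo_chain *chain = &t->chains[i];

		/* Skip chains that are not offloaded base chains. */
		if (!(chain->flags & DEMO_CHAIN_BASE) ||
		    !(chain->flags & DEMO_CHAIN_HW_OFFLOAD))
			continue;

		if (strncmp(chain->dev_name, dev_name, DEMO_IFNAMSIZ))
			continue;

		return chain;
	}

	return NULL;
}

/* Returns 1 if a command would be issued for dev_name, 0 otherwise. */
static int demo_block_cb(struct demo_table *t, const char *dev_name)
{
	struct demo_chain *chain;
	int issued = 0;

	t->lock_held = 1;	/* lock */
	chain = demo_get_chain_locked(t, dev_name);
	if (chain)
		issued = 1;	/* the real code issues the block command here */
	t->lock_held = 0;	/* unlock */

	return issued;
}
```

Keeping the lock inside the wrapper means the chain pointer is only used while the lock is held.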
Fixes: 9a32669fecfb ("netfilter: nf_tables_offload: support indr block call")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_tables_offload.c | 52 +++++++++++++++++++++++++--------------
1 file changed, 34 insertions(+), 18 deletions(-)
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 239cb781ad13..e200491ec672 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -369,33 +369,49 @@ int nft_flow_rule_offload_commit(struct net *net)
return err;
}
-static void nft_indr_block_cb(struct net_device *dev,
- flow_indr_block_bind_cb_t *cb, void *cb_priv,
- enum flow_block_command command)
+static struct nft_chain *__nft_offload_get_chain(struct net_device *dev)
{
+ struct nft_base_chain *basechain;
struct net *net = dev_net(dev);
const struct nft_table *table;
- const struct nft_chain *chain;
+ struct nft_chain *chain;
- list_for_each_entry_rcu(table, &net->nft.tables, list) {
+ list_for_each_entry(table, &net->nft.tables, list) {
if (table->family != NFPROTO_NETDEV)
continue;
- list_for_each_entry_rcu(chain, &table->chains, list) {
- if (nft_is_base_chain(chain)) {
- struct nft_base_chain *basechain;
-
- basechain = nft_base_chain(chain);
- if (!strncmp(basechain->dev_name, dev->name,
- IFNAMSIZ)) {
- nft_indr_block_ing_cmd(dev, basechain,
- cb, cb_priv,
- command);
- return;
- }
- }
+ list_for_each_entry(chain, &table->chains, list) {
+ if (!nft_is_base_chain(chain) ||
+ !(chain->flags & NFT_CHAIN_HW_OFFLOAD))
+ continue;
+
+ basechain = nft_base_chain(chain);
+ if (strncmp(basechain->dev_name, dev->name, IFNAMSIZ))
+ continue;
+
+ return chain;
}
}
+
+ return NULL;
+}
+
+static void nft_indr_block_cb(struct net_device *dev,
+ flow_indr_block_bind_cb_t *cb, void *cb_priv,
+ enum flow_block_command cmd)
+{
+ struct net *net = dev_net(dev);
+ struct nft_chain *chain;
+
+ mutex_lock(&net->nft.commit_mutex);
+ chain = __nft_offload_get_chain(dev);
+ if (chain) {
+ struct nft_base_chain *basechain;
+
+ basechain = nft_base_chain(chain);
+ nft_indr_block_ing_cmd(dev, basechain, cb, cb_priv, cmd);
+ }
+ mutex_unlock(&net->nft.commit_mutex);
}
static struct flow_indr_block_ing_entry block_ing_entry = {
--
2.11.0
^ permalink raw reply related
* [PATCH 08/27] netfilter: nf_tables_offload: refactor the nft_flow_offload_rule function
From: Pablo Neira Ayuso @ 2019-09-13 11:30 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20190913113102.15776-1-pablo@netfilter.org>
From: wenxu <wenxu@ucloud.cn>
Pass rule, chain and flow_rule object parameters to nft_flow_offload_rule
to reuse it.
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_tables_offload.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 367a7fa5c9dd..739a79cdb741 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -155,20 +155,20 @@ int nft_chain_offload_priority(struct nft_base_chain *basechain)
return 0;
}
-static int nft_flow_offload_rule(struct nft_trans *trans,
+static int nft_flow_offload_rule(struct nft_chain *chain,
+ struct nft_rule *rule,
+ struct nft_flow_rule *flow,
enum flow_cls_command command)
{
- struct nft_flow_rule *flow = nft_trans_flow_rule(trans);
- struct nft_rule *rule = nft_trans_rule(trans);
struct flow_cls_offload cls_flow = {};
struct nft_base_chain *basechain;
struct netlink_ext_ack extack;
__be16 proto = ETH_P_ALL;
- if (!nft_is_base_chain(trans->ctx.chain))
+ if (!nft_is_base_chain(chain))
return -EOPNOTSUPP;
- basechain = nft_base_chain(trans->ctx.chain);
+ basechain = nft_base_chain(chain);
if (flow)
proto = flow->proto;
@@ -357,14 +357,20 @@ int nft_flow_rule_offload_commit(struct net *net)
!(trans->ctx.flags & NLM_F_APPEND))
return -EOPNOTSUPP;
- err = nft_flow_offload_rule(trans, FLOW_CLS_REPLACE);
+ err = nft_flow_offload_rule(trans->ctx.chain,
+ nft_trans_rule(trans),
+ nft_trans_flow_rule(trans),
+ FLOW_CLS_REPLACE);
nft_flow_rule_destroy(nft_trans_flow_rule(trans));
break;
case NFT_MSG_DELRULE:
if (!(trans->ctx.chain->flags & NFT_CHAIN_HW_OFFLOAD))
continue;
- err = nft_flow_offload_rule(trans, FLOW_CLS_DESTROY);
+ err = nft_flow_offload_rule(trans->ctx.chain,
+ nft_trans_rule(trans),
+ nft_trans_flow_rule(trans),
+ FLOW_CLS_DESTROY);
break;
}
--
2.11.0
^ permalink raw reply related
* [PATCH 05/27] netfilter: nft_{fwd,dup}_netdev: add offload support
From: Pablo Neira Ayuso @ 2019-09-13 11:30 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20190913113102.15776-1-pablo@netfilter.org>
This patch adds support for packet mirroring and redirection. The
nft_fwd_dup_netdev_offload() function configures the flow_action object
for the fwd and the dup actions.
Extend nft_flow_rule_destroy() to release the net_device object when the
flow_rule object is released, since nft_fwd_dup_netdev_offload() bumps
the net_device reference counter.
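The reference pairing described above can be illustrated with a simplified userspace model. The demo_dev, demo_flow and DEMO_ACTION_* names are hypothetical, and a plain integer replaces the real net_device reference counter. Adding a redirect or mirror action takes a reference on the device, and destroying the flow drops one reference per such action.

```c
#include <assert.h>
#include <stdlib.h>

enum demo_action_id {
	DEMO_ACTION_ACCEPT,
	DEMO_ACTION_REDIRECT,
	DEMO_ACTION_MIRRED,
};

/* Hypothetical device with a plain integer reference count. */
struct demo_dev {
	int refcnt;
};

struct demo_action {
	enum demo_action_id id;
	struct demo_dev *dev;
};

struct demo_flow {
	size_t num_actions;
	size_t max_actions;
	struct demo_action *entries;
};

static struct demo_flow *demo_flow_alloc(size_t max_actions)
{
	struct demo_flow *flow = calloc(1, sizeof(*flow));

	if (!flow)
		return NULL;

	flow->entries = calloc(max_actions, sizeof(*flow->entries));
	if (!flow->entries) {
		free(flow);
		return NULL;
	}
	flow->max_actions = max_actions;
	return flow;
}

/* Takes a device reference that demo_flow_destroy() releases later. */
static int demo_flow_add_dev_action(struct demo_flow *flow,
				    enum demo_action_id id,
				    struct demo_dev *dev)
{
	struct demo_action *entry;

	if (!dev || flow->num_actions >= flow->max_actions)
		return -1;

	dev->refcnt++;
	entry = &flow->entries[flow->num_actions++];
	entry->id = id;
	entry->dev = dev;
	return 0;
}

static void demo_flow_destroy(struct demo_flow *flow)
{
	size_t i;

	for (i = 0; i < flow->num_actions; i++) {
		switch (flow->entries[i].id) {
		case DEMO_ACTION_REDIRECT:
		case DEMO_ACTION_MIRRED:
			/* Only these actions hold a device reference. */
			flow->entries[i].dev->refcnt--;
			break;
		default:
			break;
		}
	}
	free(flow->entries);
	free(flow);
}
```

Tying the release to the destroy function means every path that frees the flow object also drops the device references it took.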
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: wenxu <wenxu@ucloud.cn>
---
include/net/netfilter/nf_dup_netdev.h | 6 ++++++
include/net/netfilter/nf_tables_offload.h | 3 ++-
net/netfilter/nf_dup_netdev.c | 21 +++++++++++++++++++++
net/netfilter/nf_tables_api.c | 2 +-
net/netfilter/nf_tables_offload.c | 17 ++++++++++++++++-
net/netfilter/nft_dup_netdev.c | 12 ++++++++++++
net/netfilter/nft_fwd_netdev.c | 12 ++++++++++++
7 files changed, 70 insertions(+), 3 deletions(-)
diff --git a/include/net/netfilter/nf_dup_netdev.h b/include/net/netfilter/nf_dup_netdev.h
index 181672672160..b175d271aec9 100644
--- a/include/net/netfilter/nf_dup_netdev.h
+++ b/include/net/netfilter/nf_dup_netdev.h
@@ -7,4 +7,10 @@
void nf_dup_netdev_egress(const struct nft_pktinfo *pkt, int oif);
void nf_fwd_netdev_egress(const struct nft_pktinfo *pkt, int oif);
+struct nft_offload_ctx;
+struct nft_flow_rule;
+
+int nft_fwd_dup_netdev_offload(struct nft_offload_ctx *ctx,
+ struct nft_flow_rule *flow,
+ enum flow_action_id id, int oif);
#endif
diff --git a/include/net/netfilter/nf_tables_offload.h b/include/net/netfilter/nf_tables_offload.h
index 6de896ebcf30..ddd048be4330 100644
--- a/include/net/netfilter/nf_tables_offload.h
+++ b/include/net/netfilter/nf_tables_offload.h
@@ -26,6 +26,7 @@ struct nft_offload_ctx {
u8 protonum;
} dep;
unsigned int num_actions;
+ struct net *net;
struct nft_offload_reg regs[NFT_REG32_15 + 1];
};
@@ -61,7 +62,7 @@ struct nft_flow_rule {
#define NFT_OFFLOAD_F_ACTION (1 << 0)
struct nft_rule;
-struct nft_flow_rule *nft_flow_rule_create(const struct nft_rule *rule);
+struct nft_flow_rule *nft_flow_rule_create(struct net *net, const struct nft_rule *rule);
void nft_flow_rule_destroy(struct nft_flow_rule *flow);
int nft_flow_rule_offload_commit(struct net *net);
diff --git a/net/netfilter/nf_dup_netdev.c b/net/netfilter/nf_dup_netdev.c
index 5a35ef08c3cb..f108a76925dd 100644
--- a/net/netfilter/nf_dup_netdev.c
+++ b/net/netfilter/nf_dup_netdev.c
@@ -10,6 +10,7 @@
#include <linux/netfilter.h>
#include <linux/netfilter/nf_tables.h>
#include <net/netfilter/nf_tables.h>
+#include <net/netfilter/nf_tables_offload.h>
#include <net/netfilter/nf_dup_netdev.h>
static void nf_do_netdev_egress(struct sk_buff *skb, struct net_device *dev)
@@ -50,5 +51,25 @@ void nf_dup_netdev_egress(const struct nft_pktinfo *pkt, int oif)
}
EXPORT_SYMBOL_GPL(nf_dup_netdev_egress);
+int nft_fwd_dup_netdev_offload(struct nft_offload_ctx *ctx,
+ struct nft_flow_rule *flow,
+ enum flow_action_id id, int oif)
+{
+ struct flow_action_entry *entry;
+ struct net_device *dev;
+
+ /* nft_flow_rule_destroy() releases the reference on this device. */
+ dev = dev_get_by_index(ctx->net, oif);
+ if (!dev)
+ return -EOPNOTSUPP;
+
+ entry = &flow->rule->action.entries[ctx->num_actions++];
+ entry->id = id;
+ entry->dev = dev;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(nft_fwd_dup_netdev_offload);
+
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Pablo Neira Ayuso <pablo@netfilter.org>");
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index efd0c97cc2a3..c6f59ef96017 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2853,7 +2853,7 @@ static int nf_tables_newrule(struct net *net, struct sock *nlsk,
return nft_table_validate(net, table);
if (chain->flags & NFT_CHAIN_HW_OFFLOAD) {
- flow = nft_flow_rule_create(rule);
+ flow = nft_flow_rule_create(net, rule);
if (IS_ERR(flow))
return PTR_ERR(flow);
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 8abf193f8012..239cb781ad13 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -28,7 +28,8 @@ static struct nft_flow_rule *nft_flow_rule_alloc(int num_actions)
return flow;
}
-struct nft_flow_rule *nft_flow_rule_create(const struct nft_rule *rule)
+struct nft_flow_rule *nft_flow_rule_create(struct net *net,
+ const struct nft_rule *rule)
{
struct nft_offload_ctx *ctx;
struct nft_flow_rule *flow;
@@ -54,6 +55,7 @@ struct nft_flow_rule *nft_flow_rule_create(const struct nft_rule *rule)
err = -ENOMEM;
goto err_out;
}
+ ctx->net = net;
ctx->dep.type = NFT_OFFLOAD_DEP_UNSPEC;
while (expr->ops && expr != nft_expr_last(rule)) {
@@ -80,6 +82,19 @@ struct nft_flow_rule *nft_flow_rule_create(const struct nft_rule *rule)
void nft_flow_rule_destroy(struct nft_flow_rule *flow)
{
+ struct flow_action_entry *entry;
+ int i;
+
+ flow_action_for_each(i, entry, &flow->rule->action) {
+ switch (entry->id) {
+ case FLOW_ACTION_REDIRECT:
+ case FLOW_ACTION_MIRRED:
+ dev_put(entry->dev);
+ break;
+ default:
+ break;
+ }
+ }
kfree(flow->rule);
kfree(flow);
}
diff --git a/net/netfilter/nft_dup_netdev.c b/net/netfilter/nft_dup_netdev.c
index c6052fdd2c40..c2e78c160fd7 100644
--- a/net/netfilter/nft_dup_netdev.c
+++ b/net/netfilter/nft_dup_netdev.c
@@ -10,6 +10,7 @@
#include <linux/netfilter.h>
#include <linux/netfilter/nf_tables.h>
#include <net/netfilter/nf_tables.h>
+#include <net/netfilter/nf_tables_offload.h>
#include <net/netfilter/nf_dup_netdev.h>
struct nft_dup_netdev {
@@ -56,6 +57,16 @@ static int nft_dup_netdev_dump(struct sk_buff *skb, const struct nft_expr *expr)
return -1;
}
+static int nft_dup_netdev_offload(struct nft_offload_ctx *ctx,
+ struct nft_flow_rule *flow,
+ const struct nft_expr *expr)
+{
+ const struct nft_dup_netdev *priv = nft_expr_priv(expr);
+ int oif = ctx->regs[priv->sreg_dev].data.data[0];
+
+ return nft_fwd_dup_netdev_offload(ctx, flow, FLOW_ACTION_MIRRED, oif);
+}
+
static struct nft_expr_type nft_dup_netdev_type;
static const struct nft_expr_ops nft_dup_netdev_ops = {
.type = &nft_dup_netdev_type,
@@ -63,6 +74,7 @@ static const struct nft_expr_ops nft_dup_netdev_ops = {
.eval = nft_dup_netdev_eval,
.init = nft_dup_netdev_init,
.dump = nft_dup_netdev_dump,
+ .offload = nft_dup_netdev_offload,
};
static struct nft_expr_type nft_dup_netdev_type __read_mostly = {
diff --git a/net/netfilter/nft_fwd_netdev.c b/net/netfilter/nft_fwd_netdev.c
index 61b7f93ac681..aba11c2333f3 100644
--- a/net/netfilter/nft_fwd_netdev.c
+++ b/net/netfilter/nft_fwd_netdev.c
@@ -12,6 +12,7 @@
#include <linux/ip.h>
#include <linux/ipv6.h>
#include <net/netfilter/nf_tables.h>
+#include <net/netfilter/nf_tables_offload.h>
#include <net/netfilter/nf_dup_netdev.h>
#include <net/neighbour.h>
#include <net/ip.h>
@@ -63,6 +64,16 @@ static int nft_fwd_netdev_dump(struct sk_buff *skb, const struct nft_expr *expr)
return -1;
}
+static int nft_fwd_netdev_offload(struct nft_offload_ctx *ctx,
+ struct nft_flow_rule *flow,
+ const struct nft_expr *expr)
+{
+ const struct nft_fwd_netdev *priv = nft_expr_priv(expr);
+ int oif = ctx->regs[priv->sreg_dev].data.data[0];
+
+ return nft_fwd_dup_netdev_offload(ctx, flow, FLOW_ACTION_REDIRECT, oif);
+}
+
struct nft_fwd_neigh {
enum nft_registers sreg_dev:8;
enum nft_registers sreg_addr:8;
@@ -194,6 +205,7 @@ static const struct nft_expr_ops nft_fwd_netdev_ops = {
.eval = nft_fwd_netdev_eval,
.init = nft_fwd_netdev_init,
.dump = nft_fwd_netdev_dump,
+ .offload = nft_fwd_netdev_offload,
};
static const struct nft_expr_ops *
--
2.11.0
^ permalink raw reply related
* [PATCH 13/27] netfilter: inline xt_hashlimit, ebt_802_3 and xt_physdev headers
From: Pablo Neira Ayuso @ 2019-09-13 11:30 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20190913113102.15776-1-pablo@netfilter.org>
From: Jeremy Sowden <jeremy@azazel.net>
Three netfilter headers are only included once. Inline their contents
at those sites and remove them.
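As a rough illustration of the inlining pattern (not the kernel code itself), the sketch below shows an accessor that used to be "static inline" in a shared header becoming a file-local static function next to its only caller. The struct layouts, the skb_mac_header() stand-in, and the frame_len_is() caller are simplified, hypothetical userspace substitutes so that the example compiles on its own.

```c
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel types; the layouts are
 * simplified and are NOT the real kernel definitions. */
struct sk_buff {
	unsigned char *mac_header;
};

struct ebt_802_3_hdr {
	uint8_t  daddr[6];
	uint8_t  saddr[6];
	uint16_t len;
};

/* Stand-in for the kernel helper of the same name. */
static unsigned char *skb_mac_header(const struct sk_buff *skb)
{
	return skb->mac_header;
}

/* Formerly "static inline" in a header with one includer. Once it lives in
 * the single .c file that uses it, plain "static" is enough. */
static struct ebt_802_3_hdr *ebt_802_3_hdr(const struct sk_buff *skb)
{
	return (struct ebt_802_3_hdr *)skb_mac_header(skb);
}

/* Hypothetical caller standing in for the match function in the same file. */
static int frame_len_is(const struct sk_buff *skb, uint16_t len)
{
	return ebt_802_3_hdr(skb)->len == len;
}
```

Keeping the helper private to its one user means nothing else can come to depend on it, which is the point of removing the single-use header.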
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/linux/netfilter/xt_hashlimit.h | 11 -----------
include/linux/netfilter/xt_physdev.h | 8 --------
include/linux/netfilter_bridge/ebt_802_3.h | 12 ------------
net/bridge/netfilter/ebt_802_3.c | 8 +++++++-
net/netfilter/xt_hashlimit.c | 7 ++++++-
net/netfilter/xt_physdev.c | 5 +++--
6 files changed, 16 insertions(+), 35 deletions(-)
delete mode 100644 include/linux/netfilter/xt_hashlimit.h
delete mode 100644 include/linux/netfilter/xt_physdev.h
delete mode 100644 include/linux/netfilter_bridge/ebt_802_3.h
diff --git a/include/linux/netfilter/xt_hashlimit.h b/include/linux/netfilter/xt_hashlimit.h
deleted file mode 100644
index 169d03983589..000000000000
--- a/include/linux/netfilter/xt_hashlimit.h
+++ /dev/null
@@ -1,11 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _XT_HASHLIMIT_H
-#define _XT_HASHLIMIT_H
-
-#include <uapi/linux/netfilter/xt_hashlimit.h>
-
-#define XT_HASHLIMIT_ALL (XT_HASHLIMIT_HASH_DIP | XT_HASHLIMIT_HASH_DPT | \
- XT_HASHLIMIT_HASH_SIP | XT_HASHLIMIT_HASH_SPT | \
- XT_HASHLIMIT_INVERT | XT_HASHLIMIT_BYTES |\
- XT_HASHLIMIT_RATE_MATCH)
-#endif /*_XT_HASHLIMIT_H*/
diff --git a/include/linux/netfilter/xt_physdev.h b/include/linux/netfilter/xt_physdev.h
deleted file mode 100644
index 4ca0593949cd..000000000000
--- a/include/linux/netfilter/xt_physdev.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _XT_PHYSDEV_H
-#define _XT_PHYSDEV_H
-
-#include <linux/if.h>
-#include <uapi/linux/netfilter/xt_physdev.h>
-
-#endif /*_XT_PHYSDEV_H*/
diff --git a/include/linux/netfilter_bridge/ebt_802_3.h b/include/linux/netfilter_bridge/ebt_802_3.h
deleted file mode 100644
index c6147f9c0d80..000000000000
--- a/include/linux/netfilter_bridge/ebt_802_3.h
+++ /dev/null
@@ -1,12 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __LINUX_BRIDGE_EBT_802_3_H
-#define __LINUX_BRIDGE_EBT_802_3_H
-
-#include <linux/skbuff.h>
-#include <uapi/linux/netfilter_bridge/ebt_802_3.h>
-
-static inline struct ebt_802_3_hdr *ebt_802_3_hdr(const struct sk_buff *skb)
-{
- return (struct ebt_802_3_hdr *)skb_mac_header(skb);
-}
-#endif
diff --git a/net/bridge/netfilter/ebt_802_3.c b/net/bridge/netfilter/ebt_802_3.c
index 2c8fe24400e5..68c2519bdc52 100644
--- a/net/bridge/netfilter/ebt_802_3.c
+++ b/net/bridge/netfilter/ebt_802_3.c
@@ -11,7 +11,13 @@
#include <linux/module.h>
#include <linux/netfilter/x_tables.h>
#include <linux/netfilter_bridge/ebtables.h>
-#include <linux/netfilter_bridge/ebt_802_3.h>
+#include <linux/skbuff.h>
+#include <uapi/linux/netfilter_bridge/ebt_802_3.h>
+
+static struct ebt_802_3_hdr *ebt_802_3_hdr(const struct sk_buff *skb)
+{
+ return (struct ebt_802_3_hdr *)skb_mac_header(skb);
+}
static bool
ebt_802_3_mt(const struct sk_buff *skb, struct xt_action_param *par)
diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 2d2691dd51e0..ced3fc8fad7c 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -34,9 +34,14 @@
#include <linux/netfilter/x_tables.h>
#include <linux/netfilter_ipv4/ip_tables.h>
#include <linux/netfilter_ipv6/ip6_tables.h>
-#include <linux/netfilter/xt_hashlimit.h>
#include <linux/mutex.h>
#include <linux/kernel.h>
+#include <uapi/linux/netfilter/xt_hashlimit.h>
+
+#define XT_HASHLIMIT_ALL (XT_HASHLIMIT_HASH_DIP | XT_HASHLIMIT_HASH_DPT | \
+ XT_HASHLIMIT_HASH_SIP | XT_HASHLIMIT_HASH_SPT | \
+ XT_HASHLIMIT_INVERT | XT_HASHLIMIT_BYTES |\
+ XT_HASHLIMIT_RATE_MATCH)
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Harald Welte <laforge@netfilter.org>");
diff --git a/net/netfilter/xt_physdev.c b/net/netfilter/xt_physdev.c
index b92b22ce8abd..ec6ed6fda96c 100644
--- a/net/netfilter/xt_physdev.c
+++ b/net/netfilter/xt_physdev.c
@@ -5,12 +5,13 @@
/* (C) 2001-2003 Bart De Schuymer <bdschuym@pandora.be>
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/if.h>
#include <linux/module.h>
#include <linux/skbuff.h>
#include <linux/netfilter_bridge.h>
-#include <linux/netfilter/xt_physdev.h>
#include <linux/netfilter/x_tables.h>
-#include <net/netfilter/br_netfilter.h>
+#include <uapi/linux/netfilter/xt_physdev.h>
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Bart De Schuymer <bdschuym@pandora.be>");
--
2.11.0
* [PATCH 17/27] netfilter: synproxy: move code between headers.
From: Pablo Neira Ayuso @ 2019-09-13 11:30 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20190913113102.15776-1-pablo@netfilter.org>
From: Jeremy Sowden <jeremy@azazel.net>
There is some non-conntrack code in the nf_conntrack_synproxy.h header.
Move it to the nf_synproxy.h header.
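Note that the moved block keeps the "struct nf_synproxy_info;" forward declaration but drops "struct tcphdr;", presumably because the destination header already uses struct tcphdr in its existing prototypes. The sketch below is a generic C illustration of why a forward declaration is sufficient for prototypes that only take pointers. All demo_* names are hypothetical and do not exist in the kernel.

```c
/* Forward declaration: an incomplete type is enough for a prototype that
 * only passes a pointer to it. (demo_info is a hypothetical name.) */
struct demo_info;

struct demo_options {
	unsigned char options;
	unsigned char wscale;
};

/* Legal here even though struct demo_info has no definition yet. */
void demo_init_options(const struct demo_info *info,
		       struct demo_options *opts);

/* The full definition would normally live in another header; it is only
 * required where the members are actually accessed. */
struct demo_info {
	unsigned char options;
	unsigned char wscale;
};

void demo_init_options(const struct demo_info *info,
		       struct demo_options *opts)
{
	opts->options = info->options;
	opts->wscale = info->wscale;
}
```

This is why declarations like these can move between headers without dragging the full type definitions along, as long as each type is at least declared before the prototypes that mention it.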
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/net/netfilter/nf_conntrack_synproxy.h | 39 ---------------------------
include/net/netfilter/nf_synproxy.h | 38 ++++++++++++++++++++++++++
2 files changed, 38 insertions(+), 39 deletions(-)
diff --git a/include/net/netfilter/nf_conntrack_synproxy.h b/include/net/netfilter/nf_conntrack_synproxy.h
index 2f0171d24997..c22f0c11cc82 100644
--- a/include/net/netfilter/nf_conntrack_synproxy.h
+++ b/include/net/netfilter/nf_conntrack_synproxy.h
@@ -43,43 +43,4 @@ static inline bool nf_ct_add_synproxy(struct nf_conn *ct,
return true;
}
-struct synproxy_stats {
- unsigned int syn_received;
- unsigned int cookie_invalid;
- unsigned int cookie_valid;
- unsigned int cookie_retrans;
- unsigned int conn_reopened;
-};
-
-struct synproxy_net {
- struct nf_conn *tmpl;
- struct synproxy_stats __percpu *stats;
- unsigned int hook_ref4;
- unsigned int hook_ref6;
-};
-
-extern unsigned int synproxy_net_id;
-static inline struct synproxy_net *synproxy_pernet(struct net *net)
-{
- return net_generic(net, synproxy_net_id);
-}
-
-struct synproxy_options {
- u8 options;
- u8 wscale;
- u16 mss_option;
- u16 mss_encode;
- u32 tsval;
- u32 tsecr;
-};
-
-struct tcphdr;
-struct nf_synproxy_info;
-bool synproxy_parse_options(const struct sk_buff *skb, unsigned int doff,
- const struct tcphdr *th,
- struct synproxy_options *opts);
-
-void synproxy_init_timestamp_cookie(const struct nf_synproxy_info *info,
- struct synproxy_options *opts);
-
#endif /* _NF_CONNTRACK_SYNPROXY_H */
diff --git a/include/net/netfilter/nf_synproxy.h b/include/net/netfilter/nf_synproxy.h
index dc420b47e3aa..19d1af7a0348 100644
--- a/include/net/netfilter/nf_synproxy.h
+++ b/include/net/netfilter/nf_synproxy.h
@@ -11,6 +11,44 @@
#include <net/netfilter/nf_conntrack_seqadj.h>
#include <net/netfilter/nf_conntrack_synproxy.h>
+struct synproxy_stats {
+ unsigned int syn_received;
+ unsigned int cookie_invalid;
+ unsigned int cookie_valid;
+ unsigned int cookie_retrans;
+ unsigned int conn_reopened;
+};
+
+struct synproxy_net {
+ struct nf_conn *tmpl;
+ struct synproxy_stats __percpu *stats;
+ unsigned int hook_ref4;
+ unsigned int hook_ref6;
+};
+
+extern unsigned int synproxy_net_id;
+static inline struct synproxy_net *synproxy_pernet(struct net *net)
+{
+ return net_generic(net, synproxy_net_id);
+}
+
+struct synproxy_options {
+ u8 options;
+ u8 wscale;
+ u16 mss_option;
+ u16 mss_encode;
+ u32 tsval;
+ u32 tsecr;
+};
+
+struct nf_synproxy_info;
+bool synproxy_parse_options(const struct sk_buff *skb, unsigned int doff,
+ const struct tcphdr *th,
+ struct synproxy_options *opts);
+
+void synproxy_init_timestamp_cookie(const struct nf_synproxy_info *info,
+ struct synproxy_options *opts);
+
void synproxy_send_client_synack(struct net *net, const struct sk_buff *skb,
const struct tcphdr *th,
const struct synproxy_options *opts);
--
2.11.0