* [PATCH net v3] l2tp: fix race condition in l2tp_tunnel_delete
From: Sabrina Dubroca @ 2017-09-26 14:16 UTC (permalink / raw)
To: netdev; +Cc: Sabrina Dubroca, Guillaume Nault, Xin Long, Tom Parkin
If we try to delete the same tunnel twice, the first delete operation
does a lookup (l2tp_tunnel_get), finds the tunnel, calls
l2tp_tunnel_delete, which queues it for deletion by
l2tp_tunnel_del_work.
The second delete operation also finds the tunnel and calls
l2tp_tunnel_delete. If the workqueue has already fired and started
running l2tp_tunnel_del_work, then l2tp_tunnel_delete will queue the
same tunnel a second time, and try to free the socket again.
Add a dead flag to prevent firing the workqueue twice. Then we can
remove the check of queue_work's result that was meant to prevent that
race but doesn't.
Reproducer:
ip l2tp add tunnel tunnel_id 3000 peer_tunnel_id 4000 local 192.168.0.2 remote 192.168.0.1 encap udp udp_sport 5000 udp_dport 6000
ip l2tp add session name l2tp1 tunnel_id 3000 session_id 1000 peer_session_id 2000
ip link set l2tp1 up
ip l2tp del tunnel tunnel_id 3000
ip l2tp del tunnel tunnel_id 3000
Fixes: f8ccac0e4493 ("l2tp: put tunnel socket release on a workqueue")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
v3: only prevent double deletion, drop checks during lookups
v2: as Tom Parkin explained, we can't remove the tunnel from the
per-net list from netlink. v2 uses only a dead flag, and adds
corresponding checks during lookups
net/l2tp/l2tp_core.c | 10 ++++------
net/l2tp/l2tp_core.h | 5 ++++-
2 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index d8c2a89a76e1..02d61101b108 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1688,14 +1688,12 @@ EXPORT_SYMBOL_GPL(l2tp_tunnel_create);
/* This function is used by the netlink TUNNEL_DELETE command.
*/
-int l2tp_tunnel_delete(struct l2tp_tunnel *tunnel)
+void l2tp_tunnel_delete(struct l2tp_tunnel *tunnel)
{
- l2tp_tunnel_inc_refcount(tunnel);
- if (false == queue_work(l2tp_wq, &tunnel->del_work)) {
- l2tp_tunnel_dec_refcount(tunnel);
- return 1;
+ if (!test_and_set_bit(0, &tunnel->dead)) {
+ l2tp_tunnel_inc_refcount(tunnel);
+ queue_work(l2tp_wq, &tunnel->del_work);
}
- return 0;
}
EXPORT_SYMBOL_GPL(l2tp_tunnel_delete);
diff --git a/net/l2tp/l2tp_core.h b/net/l2tp/l2tp_core.h
index 70a12df40a5f..67c79d9b5c6c 100644
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -161,6 +161,9 @@ struct l2tp_tunnel_cfg {
struct l2tp_tunnel {
int magic; /* Should be L2TP_TUNNEL_MAGIC */
+
+ unsigned long dead;
+
struct rcu_head rcu;
rwlock_t hlist_lock; /* protect session_hlist */
bool acpt_newsess; /* Indicates whether this
@@ -255,7 +258,7 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id,
u32 peer_tunnel_id, struct l2tp_tunnel_cfg *cfg,
struct l2tp_tunnel **tunnelp);
void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel);
-int l2tp_tunnel_delete(struct l2tp_tunnel *tunnel);
+void l2tp_tunnel_delete(struct l2tp_tunnel *tunnel);
struct l2tp_session *l2tp_session_create(int priv_size,
struct l2tp_tunnel *tunnel,
u32 session_id, u32 peer_session_id,
--
2.14.1
^ permalink raw reply related
* Re: [PATCH net-next 0/7] nfp: flower vxlan tunnel offload
From: Or Gerlitz @ 2017-09-26 14:17 UTC (permalink / raw)
To: Jiri Benc
Cc: Simon Horman, David Miller, Jakub Kicinski, Linux Netdev List,
oss-drivers, John Hurley, Paolo Abeni, Paul Blakey, Jiri Pirko,
Roi Dayan
In-Reply-To: <20170926145143.28bf52bd@griffin>
On Tue, Sep 26, 2017 at 3:51 PM, Jiri Benc <jbenc@redhat.com> wrote:
> On Tue, 26 Sep 2017 15:41:37 +0300, Or Gerlitz wrote:
>> Please note that the way the rule is being set to the HW driver is by delegation
>> done in flower, see these commits (specifically "Add offload support
>> using egress Hardware device")
>
> It's very well possible the bug is somewhere in net/sched.
maybe before/instead you call it a bug, take a look on the design
there and maybe
tell us how to possibly do that otherwise?
^ permalink raw reply
* Re: [PATCH 2/6] ath9k: add a quirk to set use_msi automatically
From: Kalle Valo @ 2017-09-26 14:14 UTC (permalink / raw)
To: Christoph Hellwig
Cc: AceLan Kao, QCA ath9k Development, linux-wireless, netdev,
Linux-Kernel@Vger. Kernel. Org, Daniel Drake
In-Reply-To: <20170926140413.GA19977@infradead.org>
Christoph Hellwig <hch@infradead.org> writes:
> On Tue, Sep 26, 2017 at 03:26:51PM +0800, AceLan Kao wrote:
>> Ath9k is an old driver for old chips, and they work fine with legacy INTx.
>> But some new platforms are using it, so I think we should list those
>> new platforms which blocks INTx in the driver.
>
> And unless they are broken and need a quirk they should work even better
> with MSI if the device advertises MSI support in the PCI capabilities.
Daniel Drake (CCed) already found problems with the MSI implementation:
https://lkml.kernel.org/r/20170925041153.26574-1-drake@endlessm.com
--
Kalle Valo
^ permalink raw reply
* Re: [PATCH net-next 2/7] nfp: compile flower vxlan tunnel metadata match fields
From: Or Gerlitz @ 2017-09-26 14:12 UTC (permalink / raw)
To: John Hurley
Cc: Simon Horman, David Miller, Jakub Kicinski, Linux Netdev List,
oss-drivers
In-Reply-To: <CAK+XE=nSHn6FGTi0kcqi+mcaV3Lo8QANpqQ4z6uNcWE+M3Gngw@mail.gmail.com>
On Tue, Sep 26, 2017 at 4:58 PM, John Hurley <john.hurley@netronome.com> wrote:
> On Mon, Sep 25, 2017 at 7:35 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>> On Mon, Sep 25, 2017 at 1:23 PM, Simon Horman
>> <simon.horman@netronome.com> wrote:
>>> From: John Hurley <john.hurley@netronome.com>
>>>
>>> Compile ovs-tc flower vxlan metadata match fields for offloading. Only
>>
>> anything in the npf kernel bits has direct relation to ovs? what?
>>
>
> Sorry, this is a typo and should refer to TC.
>
>>> +++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
>>> @@ -52,8 +52,25 @@
>>> BIT(FLOW_DISSECTOR_KEY_PORTS) | \
>>> BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS) | \
>>> BIT(FLOW_DISSECTOR_KEY_VLAN) | \
>>> + BIT(FLOW_DISSECTOR_KEY_ENC_KEYID) | \
>>> + BIT(FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS) | \
>>> + BIT(FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS) | \
>>
>> this series takes care of IPv6 tunnels too?
>
> IPv6 is not included in this set.
> The reason the IPv6 bit is included here is to account for behavior we
> have noticed in TC flower.
> If, for example, I add a filter with the following match fields:
> 'protocol ip flower enc_src_ip 10.0.0.1 enc_dst_ip 10.0.0.2
> enc_dst_port 4789 enc_key_id 123'
> The 'used_keys' value in the dissector marks both IPv4 and IPv6 encap
> addresses as 'used'.
> I am not sure if this is a bug in TC or that we are expected to check
> the enc_control fields to determine if IPv4 or v6 addresses are used.
you should have your code to check enc_control->addr_type to be
FLOW_DISSECTOR_KEY_IPV4_ADDRS or IPV6_ADDRS
> Including the IPv6 used_keys bit in our whitelist approach allows us
> to accept legitimate IPv4 tunnel rules in these situations.
mmm can please take a look on fl_init_dissector() and tell me if you
see why FLOW_DISSECTOR_KEY_IPV6_ADDRS is set for ipv4 tunnels,
I am not sure.
> If it is found to be IPv6 when the rule is parsed, it will be rejected here.
^ permalink raw reply
* Re: [PATCH 2/6] ath9k: add a quirk to set use_msi automatically
From: Christoph Hellwig @ 2017-09-26 14:04 UTC (permalink / raw)
To: AceLan Kao
Cc: Christoph Hellwig, QCA ath9k Development, Kalle Valo,
linux-wireless, netdev, Linux-Kernel@Vger. Kernel. Org
In-Reply-To: <CAFv23Qm9kZLSqJ_BP1D7mWCxekq0nt3S8onT1ckH-AoHDBUDzw@mail.gmail.com>
On Tue, Sep 26, 2017 at 03:26:51PM +0800, AceLan Kao wrote:
> Ath9k is an old driver for old chips, and they work fine with legacy INTx.
> But some new platforms are using it, so I think we should list those
> new platforms which blocks INTx in the driver.
And unless they are broken and need a quirk they should work even better
with MSI if the device advertises MSI support in the PCI capabilities.
^ permalink raw reply
* Re: [PATCH net v2] l2tp: fix race condition in l2tp_tunnel_delete
From: Sabrina Dubroca @ 2017-09-26 13:59 UTC (permalink / raw)
To: Guillaume Nault; +Cc: netdev, Xin Long, Tom Parkin
In-Reply-To: <20170925123355.a6qof4ovgr6ksx3s@alphalink.fr>
2017-09-25, 14:33:55 +0200, Guillaume Nault wrote:
[...]
> So what about using your v2 patch, but without the ->dead flag test in
> l2tp_tunnel_get() and l2tp_tunnel_find*()?
Ok, I think that's reasonable. I'll post a v3, thanks.
--
Sabrina
^ permalink raw reply
* Re: [PATCH net-next 2/7] nfp: compile flower vxlan tunnel metadata match fields
From: John Hurley @ 2017-09-26 13:58 UTC (permalink / raw)
To: Or Gerlitz
Cc: Simon Horman, David Miller, Jakub Kicinski, Linux Netdev List,
oss-drivers
In-Reply-To: <CAJ3xEMgs2d7dcktHfvx_iry8tBy8XK8n6E5W7J0pYPer9Xz_Kg@mail.gmail.com>
On Mon, Sep 25, 2017 at 7:35 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Mon, Sep 25, 2017 at 1:23 PM, Simon Horman
> <simon.horman@netronome.com> wrote:
>> From: John Hurley <john.hurley@netronome.com>
>>
>> Compile ovs-tc flower vxlan metadata match fields for offloading. Only
>
> anything in the npf kernel bits has direct relation to ovs? what?
>
Sorry, this is a typo and should refer to TC.
>> +++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
>> @@ -52,8 +52,25 @@
>> BIT(FLOW_DISSECTOR_KEY_PORTS) | \
>> BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS) | \
>> BIT(FLOW_DISSECTOR_KEY_VLAN) | \
>> + BIT(FLOW_DISSECTOR_KEY_ENC_KEYID) | \
>> + BIT(FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS) | \
>> + BIT(FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS) | \
>
> this series takes care of IPv6 tunnels too?
IPv6 is not included in this set.
The reason the IPv6 bit is included here is to account for behavior we
have noticed in TC flower.
If, for example, I add a filter with the following match fields:
'protocol ip flower enc_src_ip 10.0.0.1 enc_dst_ip 10.0.0.2
enc_dst_port 4789 enc_key_id 123'
The 'used_keys' value in the dissector marks both IPv4 and IPv6 encap
addresses as 'used'.
I am not sure if this is a bug in TC or that we are expected to check
the enc_control fields to determine if IPv4 or v6 addresses are used.
Including the IPv6 used_keys bit in our whitelist approach allows us
to accept legitimate IPv4 tunnel rules in these situations.
If it is found to be IPv6 when the rule is parsed, it will be rejected here.
^ permalink raw reply
* Re: [PATCH] net: stmmac: Meet alignment requirements for DMA
From: Matt Redfearn @ 2017-09-26 13:57 UTC (permalink / raw)
To: David Miller
Cc: netdev, alexandre.torgue, peppe.cavallaro, linux-kernel,
Linux MIPS Mailing List, Paul Burton, James Hogan
In-Reply-To: <20170922.182639.272534775457081015.davem@davemloft.net>
Hi David,
Thanks for your feedback.
On 23/09/17 02:26, David Miller wrote:
> From: Matt Redfearn <matt.redfearn@imgtec.com>
> Date: Fri, 22 Sep 2017 12:13:53 +0100
>
>> According to Documentation/DMA-API.txt:
>> Warnings: Memory coherency operates at a granularity called the cache
>> line width. In order for memory mapped by this API to operate
>> correctly, the mapped region must begin exactly on a cache line
>> boundary and end exactly on one (to prevent two separately mapped
>> regions from sharing a single cache line). Since the cache line size
>> may not be known at compile time, the API will not enforce this
>> requirement. Therefore, it is recommended that driver writers who
>> don't take special care to determine the cache line size at run time
>> only map virtual regions that begin and end on page boundaries (which
>> are guaranteed also to be cache line boundaries).
> This is rediculious. You're misreading what this document is trying
> to explain.
To be clear, is your interpretation of the documentation that drivers do
not have to ensure cacheline alignment?
As I read that documentation of the DMA API, it puts the onus on device
drivers to ensure that they operate on cacheline aligned & sized blocks
of memory. It states "the mapped region must begin exactly on a cache
line boundary". We know that skb->data is not cacheline aligned, since
it is skb_headroom() bytes into the skb buffer, and the value of
skb_headroom is not a multiple of cachelines. To give an example, on the
Creator Ci40 platform (a 32bit MIPS platform), I have the following values:
skb_headroom(skb) = 130 bytes
sbb->head = 0x8ed9b800 (32byte cacheline aligned)
skb->data = 0x8ed9b882 (not cacheline aligned)
Since the MIPS architecture requires software cache management,
performing a dma_map_single(TO_DEVICE) will writeback and invalidate the
cachelines for the required region. To comply with (our interpretation
of) the DMA API documentation, the MIPS implementation expects a
cacheline aligned & sized buffer, so that it can writeback a set of
complete cachelines. Indeed a recent patch
(https://patchwork.linux-mips.org/patch/14624/) causes the MIPS dma
mapping code to emit warnings if the buffer it is requested to sync is
not cachline aligned. This driver, as is, fails this test and causes
thousands of warnings to be emitted.
The situation for dma_map_single(FROM_DEVICE) is potentially worse,
since it will perform a cache invalidate of all lines for the buffer. If
the buffer is not cacheline aligned, this will throw away any adjacent
data in the same cacheline. The dma_map implementation has no way to
know if data adjacent to the buffer it is requested to sync is valuable
or not, and it will simply be thrown away by the cache invalidate.
If I am somehow misinterpreting the DMA API documentation, and drivers
are NOT required to ensure that buffers to be synced are cacheline
aligned, then there is only one way that the MIPS implementation can
ensure that it writeback / invalidates cachelines containing only the
requested data buffer, and that would be to use bounce buffers. Clearly
that will be devastating for performance as every outgoing packet will
need to be copied from it's unaligned location to a guaranteed aligned
location, and written back from there. Similarly incoming packets would
have to somehow be initially located in a cacheline aligned location
before being copied into the non-cacheline aligned location supplied by
the driver.
> As long as you use the dma_{map,unamp}_single() and sync to/from
> deivce interfaces properly, the cacheline issues will be handled properly
> and the cpu and the device will see proper uptodate memory contents.
I interpret the DMA API document (and the MIPS arch dma code operates
this way) as stating that the driver must ensure that buffers passed to
it are cacheline aligned, and cacheline sized, to prevent cache
management throwing away adjacent data in the same cacheline.
That is what this patch is for, to change the buffer address passed to
the DMA API to skb->head, which is (as far as I know) guaranteed to be
cacheline aligned.
>
> It is completely rediculious to require every driver to stash away two
> sets of pointer for every packet, and to DMA map the headroom of the SKB
> which is wasteful.
>
> I'm not applying this, fix this problem properly, thanks.
An alternative, slightly less invasive change, may be to subtract
NET_IP_ALIGN from the dma buffer address, so that the buffer for which a
sync is requested is cacheline aligned (is that guaranteed?). Would that
change be more acceptable?
Thanks,
Matt
^ permalink raw reply
* Re: [PATCH net-next v10] openvswitch: enable NSH support
From: Yang, Yi @ 2017-09-26 13:52 UTC (permalink / raw)
To: Jiri Benc
Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org,
netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, e@erig.me,
davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org
In-Reply-To: <20170926130534.170270e3@griffin>
On Tue, Sep 26, 2017 at 07:05:34PM +0800, Jiri Benc wrote:
> On Tue, 26 Sep 2017 12:47:16 +0800, Yi Yang wrote:
> > + return ((ret != 0) ? false : true);
Jiri, I appriciate your review very carefully and professionally from my
heart for 10 versions, that is really very very big effort.
I carefully thought this comment:
"""
> + return ((ret != 0) ? false : true);
Too little coffee or too late in the night, right? ;-)
"""
But I don't think this is a problematic line from my understanding,
because validate_nsh is declared to return bool. I had thought you were
saying "it was late, it was time for you to leave office, so no more
comments", this is my readout :-)
So the best comment you can give out here is one line you prefer :-)
I'm not a kernel developer full time, I'm adapting to several coding
style at the time in kernel, OVS and Opendaylight.
>
> I'm not going to review this version but this caught my eye - I pointed
> out this silly construct in my review of v9. I can understand that
> working late in the night and rewriting the code back and forth, one
> could end up with such construct and overlook it at the final
> self-review before submission. Happens to everyone.
>
> But leaving this after a review pointed it out means you're not paying
> enough attention to your work. Even the fact that you sent v10 so
> shortly after v9 means you did not spend the needed time on reflecting
> on the review and that you did not properly test the new version. And
> I told you exactly this before.
>
> Honestly, I'm starting to be tired with reviewing your patch again and
> again and pointing out silly mistakes like this one and repeating basic
> things to you again and again.
I did test it in my sfc test environment, I don't see any warning, error
during build and runtime, it can work well.
Sorry if missing your comments, that is understanding issue in most
cases but not I don't take your comments seriously. You know English
isn't my native language, it is a big challenge for me to understand
your comments very well. Neverthless, code is our common language, I can
understand code better than your descriptive words :-)
>
> Jiri
^ permalink raw reply
* Re: [PATCH net-next RFC 0/5] batched tx processing in vhost_net
From: Michael S. Tsirkin @ 2017-09-26 13:45 UTC (permalink / raw)
To: Jason Wang; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <1506067355-5771-1-git-send-email-jasowang@redhat.com>
On Fri, Sep 22, 2017 at 04:02:30PM +0800, Jason Wang wrote:
> Hi:
>
> This series tries to implement basic tx batched processing. This is
> done by prefetching descriptor indices and update used ring in a
> batch. This intends to speed up used ring updating and improve the
> cache utilization.
Interesting, thanks for the patches. So IIUC most of the gain is really
overcoming some of the shortcomings of virtio 1.0 wrt cache utilization?
Which is fair enough (1.0 is already deployed) but I would like to avoid
making 1.1 support harder, and this patchset does this unfortunately,
see comments on individual patches. I'm sure it can be addressed though.
> Test shows about ~22% improvement in tx pss.
Is this with or without tx napi in guest?
> Please review.
>
> Jason Wang (5):
> vhost: split out ring head fetching logic
> vhost: introduce helper to prefetch desc index
> vhost: introduce vhost_add_used_idx()
> vhost_net: rename VHOST_RX_BATCH to VHOST_NET_BATCH
> vhost_net: basic tx virtqueue batched processing
>
> drivers/vhost/net.c | 221 ++++++++++++++++++++++++++++----------------------
> drivers/vhost/vhost.c | 165 +++++++++++++++++++++++++++++++------
> drivers/vhost/vhost.h | 9 ++
> 3 files changed, 270 insertions(+), 125 deletions(-)
>
> --
> 2.7.4
^ permalink raw reply
* Re: [PATCH][next] ieee802154: atusb: make two structures static, fixes warnings
From: Stefan Schmidt @ 2017-09-26 13:45 UTC (permalink / raw)
To: Colin King, Alexander Aring, linux-wpan, netdev
Cc: kernel-janitors, linux-kernel
In-Reply-To: <20170926131836.15171-1-colin.king@canonical.com>
Hello.
On 09/26/2017 03:18 PM, Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
>
> The arrays atusb_chip_data and hulusb_chip_data are local to the source
> and do not need to be in global scope, so make them static. Also
> remove unnecessary forward declaration of atusb_chip_data.
>
> Cleans up sparse warnings:
> symbol 'atusb_chip_data' was not declared. Should it be static?
> symbol 'hulusb_chip_data' was not declared. Should it be static?
>
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ---
> drivers/net/ieee802154/atusb.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ieee802154/atusb.c b/drivers/net/ieee802154/atusb.c
> index 115fa3f37a86..8b0718de568f 100644
> --- a/drivers/net/ieee802154/atusb.c
> +++ b/drivers/net/ieee802154/atusb.c
> @@ -45,8 +45,6 @@
> #define ATUSB_ALLOC_DELAY_MS 100 /* delay after failed allocation */
> #define ATUSB_TX_TIMEOUT_MS 200 /* on the air timeout */
>
> -struct atusb_chip_data;
> -
> struct atusb {
> struct ieee802154_hw *hw;
> struct usb_device *usb_dev;
> @@ -767,14 +765,14 @@ atusb_set_promiscuous_mode(struct ieee802154_hw *hw, const bool on)
> return 0;
> }
>
> -struct atusb_chip_data atusb_chip_data = {
> +static struct atusb_chip_data atusb_chip_data = {
> .t_channel_switch = 1,
> .rssi_base_val = -91,
> .set_txpower = atusb_set_txpower,
> .set_channel = atusb_set_channel,
> };
>
> -struct atusb_chip_data hulusb_chip_data = {
> +static struct atusb_chip_data hulusb_chip_data = {
> .t_channel_switch = 11,
> .rssi_base_val = -100,
> .set_txpower = hulusb_set_txpower,
>
This patch has been applied to the wpan-next tree and will be
part of the next pull request to net-next. Thanks!
regards
Stefan Schmidt
^ permalink raw reply
* Re: [PATCH net] ipv6: remove incorrect WARN_ON() in fib6_del()
From: Eric Dumazet @ 2017-09-26 13:20 UTC (permalink / raw)
To: Wei Wang; +Cc: Martin KaFai Lau, David Miller, Linux Kernel Network Developers
In-Reply-To: <CAEA6p_Du641cbcvr5JVXbLMh0T9fAauWDcXqNmrNoJkddMVCBA@mail.gmail.com>
On Mon, Sep 25, 2017 at 10:52 PM, Wei Wang <weiwan@google.com> wrote:
> On Mon, Sep 25, 2017 at 7:23 PM, Eric Dumazet <edumazet@google.com> wrote:
>> On Mon, Sep 25, 2017 at 7:07 PM, Martin KaFai Lau <kafai@fb.com> wrote:
>>
>>> I am probably still missing something.
>>>
>>> Considering the del operation should be under the writer lock,
>>> if rt->rt6i_node should be NULL (for rt that has already been
>>> removed from fib6), why this WARN_ON() is triggered?
>>>
>>> An example may help.
>>>
>>
>> Look at the stack trace, you'll find the answers...
>>
>> ip6_link_failure() -> ip6_del_rt()
>>
>> Note that rt might have been deleted from the _tree_ already.
>
> Had a brief talk with Martin.
> He has a valid point.
> The current WARN_ON() code is as follows:
> #if RT6_DEBUG >= 2
> if (rt->dst.obsolete > 0) {
> WARN_ON(fn);
> return -ENOENT;
> }
> #endif
>
> The WARN_ON() only triggers when fn is not NULL. (I missed it before.)
> In theory, fib6_del() calls fib6_del_route() which should set
> rt->rt6i_node to NULL and rt->dst.obsolete to DST_OBSOLETE_DEAD within
> the same write_lock session.
> If those 2 values are inconsistent, it indicates something is wrong.
> Will need more time to root cause the issue.
>
> Please ignore this patch. Sorry about the confusion.
Oh well, for some reason I was seeing WARN_ON(1) here, since this is
a construct I often add in my tests ...
^ permalink raw reply
* [PATCH] mac80211: aead api to reduce redundancy
From: Xiang Gao @ 2017-09-26 13:19 UTC (permalink / raw)
To: davem, johannes, linux-kernel, linux-wireless, netdev; +Cc: qasdfgtyuiop
Currently, the aes_ccm.c and aes_gcm.c are almost line by line copy of
each other. This patch reduce code redundancy by moving the code in these
two files to crypto/aead_api.c to make it a higher level aead api. The
file aes_ccm.c and aes_gcm.c are removed and all the functions there are
now implemented in their headers using the newly added aead api.
Signed-off-by: Xiang Gao <qasdfgtyuiop@gmail.com>
---
net/mac80211/Makefile | 3 +-
net/mac80211/{aes_gcm.c => aead_api.c} | 49 +++++++-------
net/mac80211/aead_api.h | 26 ++++++++
net/mac80211/aes_ccm.c | 115 ---------------------------------
net/mac80211/aes_ccm.h | 42 ++++++++----
net/mac80211/aes_gcm.h | 38 ++++++++---
net/mac80211/wpa.c | 12 ++--
7 files changed, 118 insertions(+), 167 deletions(-)
rename net/mac80211/{aes_gcm.c => aead_api.c} (55%)
create mode 100644 net/mac80211/aead_api.h
delete mode 100644 net/mac80211/aes_ccm.c
diff --git a/net/mac80211/Makefile b/net/mac80211/Makefile
index 282912245938..80f25ff2f24b 100644
--- a/net/mac80211/Makefile
+++ b/net/mac80211/Makefile
@@ -6,6 +6,7 @@ mac80211-y := \
driver-ops.o \
sta_info.o \
wep.o \
+ aead_api.o \
wpa.o \
scan.o offchannel.o \
ht.o agg-tx.o agg-rx.o \
@@ -15,8 +16,6 @@ mac80211-y := \
rate.o \
michael.o \
tkip.o \
- aes_ccm.o \
- aes_gcm.o \
aes_cmac.o \
aes_gmac.o \
fils_aead.o \
diff --git a/net/mac80211/aes_gcm.c b/net/mac80211/aead_api.c
similarity index 55%
rename from net/mac80211/aes_gcm.c
rename to net/mac80211/aead_api.c
index 8a4397cc1b08..6ac53d0c6977 100644
--- a/net/mac80211/aes_gcm.c
+++ b/net/mac80211/aead_api.c
@@ -1,7 +1,4 @@
-/*
- * Copyright 2014-2015, Qualcomm Atheros, Inc.
- *
- * This program is free software; you can redistribute it and/or modify
+/* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
@@ -9,43 +6,43 @@
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/err.h>
+#include <linux/scatterlist.h>
#include <crypto/aead.h>
-#include <net/mac80211.h>
-#include "key.h"
-#include "aes_gcm.h"
+#include "aead_api.h"
-int ieee80211_aes_gcm_encrypt(struct crypto_aead *tfm, u8 *j_0, u8 *aad,
- u8 *data, size_t data_len, u8 *mic)
+int aead_encrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad, size_t aad_len,
+ u8 *data, size_t data_len, u8 *auth)
{
struct scatterlist sg[3];
struct aead_request *aead_req;
int reqsize = sizeof(*aead_req) + crypto_aead_reqsize(tfm);
u8 *__aad;
- aead_req = kzalloc(reqsize + GCM_AAD_LEN, GFP_ATOMIC);
+ aead_req = kzalloc(reqsize + aad_len, GFP_ATOMIC);
if (!aead_req)
return -ENOMEM;
__aad = (u8 *)aead_req + reqsize;
- memcpy(__aad, aad, GCM_AAD_LEN);
+ memcpy(__aad, aad, aad_len);
sg_init_table(sg, 3);
- sg_set_buf(&sg[0], &__aad[2], be16_to_cpup((__be16 *)__aad));
+ sg_set_buf(&sg[0], __aad, aad_len);
sg_set_buf(&sg[1], data, data_len);
- sg_set_buf(&sg[2], mic, IEEE80211_GCMP_MIC_LEN);
+ sg_set_buf(&sg[2], auth, tfm->authsize);
aead_request_set_tfm(aead_req, tfm);
- aead_request_set_crypt(aead_req, sg, sg, data_len, j_0);
+ aead_request_set_crypt(aead_req, sg, sg, data_len, b_0);
aead_request_set_ad(aead_req, sg[0].length);
crypto_aead_encrypt(aead_req);
kzfree(aead_req);
+
return 0;
}
-int ieee80211_aes_gcm_decrypt(struct crypto_aead *tfm, u8 *j_0, u8 *aad,
- u8 *data, size_t data_len, u8 *mic)
+int aead_decrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad, size_t aad_len,
+ u8 *data, size_t data_len, u8 *auth)
{
struct scatterlist sg[3];
struct aead_request *aead_req;
@@ -56,21 +53,20 @@ int ieee80211_aes_gcm_decrypt(struct crypto_aead *tfm, u8 *j_0, u8 *aad,
if (data_len == 0)
return -EINVAL;
- aead_req = kzalloc(reqsize + GCM_AAD_LEN, GFP_ATOMIC);
+ aead_req = kzalloc(reqsize + aad_len, GFP_ATOMIC);
if (!aead_req)
return -ENOMEM;
__aad = (u8 *)aead_req + reqsize;
- memcpy(__aad, aad, GCM_AAD_LEN);
+ memcpy(__aad, aad, aad_len);
sg_init_table(sg, 3);
sg_set_buf(&sg[0], &__aad[2], be16_to_cpup((__be16 *)__aad));
sg_set_buf(&sg[1], data, data_len);
- sg_set_buf(&sg[2], mic, IEEE80211_GCMP_MIC_LEN);
+ sg_set_buf(&sg[2], auth, tfm->authsize);
aead_request_set_tfm(aead_req, tfm);
- aead_request_set_crypt(aead_req, sg, sg,
- data_len + IEEE80211_GCMP_MIC_LEN, j_0);
+ aead_request_set_crypt(aead_req, sg, sg, data_len + tfm->authsize, b_0);
aead_request_set_ad(aead_req, sg[0].length);
err = crypto_aead_decrypt(aead_req);
@@ -79,20 +75,21 @@ int ieee80211_aes_gcm_decrypt(struct crypto_aead *tfm, u8 *j_0, u8 *aad,
return err;
}
-struct crypto_aead *ieee80211_aes_gcm_key_setup_encrypt(const u8 key[],
- size_t key_len)
+struct crypto_aead *
+aead_key_setup_encrypt(const char *alg, const u8 key[],
+ size_t key_len, size_t authsize)
{
struct crypto_aead *tfm;
int err;
- tfm = crypto_alloc_aead("gcm(aes)", 0, CRYPTO_ALG_ASYNC);
+ tfm = crypto_alloc_aead(alg, 0, CRYPTO_ALG_ASYNC);
if (IS_ERR(tfm))
return tfm;
err = crypto_aead_setkey(tfm, key, key_len);
if (err)
goto free_aead;
- err = crypto_aead_setauthsize(tfm, IEEE80211_GCMP_MIC_LEN);
+ err = crypto_aead_setauthsize(tfm, authsize);
if (err)
goto free_aead;
@@ -103,7 +100,7 @@ struct crypto_aead *ieee80211_aes_gcm_key_setup_encrypt(const u8 key[],
return ERR_PTR(err);
}
-void ieee80211_aes_gcm_key_free(struct crypto_aead *tfm)
+void aead_key_free(struct crypto_aead *tfm)
{
crypto_free_aead(tfm);
}
diff --git a/net/mac80211/aead_api.h b/net/mac80211/aead_api.h
new file mode 100644
index 000000000000..141a813d410c
--- /dev/null
+++ b/net/mac80211/aead_api.h
@@ -0,0 +1,26 @@
+/* This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _AEAD_API_H
+#define _AEAD_API_H
+
+#include <crypto/aead.h>
+#include <linux/crypto.h>
+
+struct crypto_aead *
+aead_key_setup_encrypt(const char *alg, const u8 key[],
+ size_t key_len, size_t authsize);
+
+int aead_encrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
+ size_t aad_len, u8 *data,
+ size_t data_len, u8 *auth);
+
+int aead_decrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
+ size_t aad_len, u8 *data,
+ size_t data_len, u8 *auth);
+
+void aead_key_free(struct crypto_aead *tfm);
+
+#endif /* _AEAD_API_H */
diff --git a/net/mac80211/aes_ccm.c b/net/mac80211/aes_ccm.c
deleted file mode 100644
index a4e0d59a40dd..000000000000
--- a/net/mac80211/aes_ccm.c
+++ /dev/null
@@ -1,115 +0,0 @@
-/*
- * Copyright 2003-2004, Instant802 Networks, Inc.
- * Copyright 2005-2006, Devicescape Software, Inc.
- *
- * Rewrite: Copyright (C) 2013 Linaro Ltd <ard.biesheuvel@linaro.org>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#include <linux/kernel.h>
-#include <linux/types.h>
-#include <linux/err.h>
-#include <crypto/aead.h>
-
-#include <net/mac80211.h>
-#include "key.h"
-#include "aes_ccm.h"
-
-int ieee80211_aes_ccm_encrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
- u8 *data, size_t data_len, u8 *mic,
- size_t mic_len)
-{
- struct scatterlist sg[3];
- struct aead_request *aead_req;
- int reqsize = sizeof(*aead_req) + crypto_aead_reqsize(tfm);
- u8 *__aad;
-
- aead_req = kzalloc(reqsize + CCM_AAD_LEN, GFP_ATOMIC);
- if (!aead_req)
- return -ENOMEM;
-
- __aad = (u8 *)aead_req + reqsize;
- memcpy(__aad, aad, CCM_AAD_LEN);
-
- sg_init_table(sg, 3);
- sg_set_buf(&sg[0], &__aad[2], be16_to_cpup((__be16 *)__aad));
- sg_set_buf(&sg[1], data, data_len);
- sg_set_buf(&sg[2], mic, mic_len);
-
- aead_request_set_tfm(aead_req, tfm);
- aead_request_set_crypt(aead_req, sg, sg, data_len, b_0);
- aead_request_set_ad(aead_req, sg[0].length);
-
- crypto_aead_encrypt(aead_req);
- kzfree(aead_req);
-
- return 0;
-}
-
-int ieee80211_aes_ccm_decrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
- u8 *data, size_t data_len, u8 *mic,
- size_t mic_len)
-{
- struct scatterlist sg[3];
- struct aead_request *aead_req;
- int reqsize = sizeof(*aead_req) + crypto_aead_reqsize(tfm);
- u8 *__aad;
- int err;
-
- if (data_len == 0)
- return -EINVAL;
-
- aead_req = kzalloc(reqsize + CCM_AAD_LEN, GFP_ATOMIC);
- if (!aead_req)
- return -ENOMEM;
-
- __aad = (u8 *)aead_req + reqsize;
- memcpy(__aad, aad, CCM_AAD_LEN);
-
- sg_init_table(sg, 3);
- sg_set_buf(&sg[0], &__aad[2], be16_to_cpup((__be16 *)__aad));
- sg_set_buf(&sg[1], data, data_len);
- sg_set_buf(&sg[2], mic, mic_len);
-
- aead_request_set_tfm(aead_req, tfm);
- aead_request_set_crypt(aead_req, sg, sg, data_len + mic_len, b_0);
- aead_request_set_ad(aead_req, sg[0].length);
-
- err = crypto_aead_decrypt(aead_req);
- kzfree(aead_req);
-
- return err;
-}
-
-struct crypto_aead *ieee80211_aes_key_setup_encrypt(const u8 key[],
- size_t key_len,
- size_t mic_len)
-{
- struct crypto_aead *tfm;
- int err;
-
- tfm = crypto_alloc_aead("ccm(aes)", 0, CRYPTO_ALG_ASYNC);
- if (IS_ERR(tfm))
- return tfm;
-
- err = crypto_aead_setkey(tfm, key, key_len);
- if (err)
- goto free_aead;
- err = crypto_aead_setauthsize(tfm, mic_len);
- if (err)
- goto free_aead;
-
- return tfm;
-
-free_aead:
- crypto_free_aead(tfm);
- return ERR_PTR(err);
-}
-
-void ieee80211_aes_key_free(struct crypto_aead *tfm)
-{
- crypto_free_aead(tfm);
-}
diff --git a/net/mac80211/aes_ccm.h b/net/mac80211/aes_ccm.h
index fcd3254c5cf0..e9b7ca0bde5b 100644
--- a/net/mac80211/aes_ccm.h
+++ b/net/mac80211/aes_ccm.h
@@ -10,19 +10,39 @@
#ifndef AES_CCM_H
#define AES_CCM_H
-#include <linux/crypto.h>
+#include "aead_api.h"
#define CCM_AAD_LEN 32
-struct crypto_aead *ieee80211_aes_key_setup_encrypt(const u8 key[],
- size_t key_len,
- size_t mic_len);
-int ieee80211_aes_ccm_encrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
- u8 *data, size_t data_len, u8 *mic,
- size_t mic_len);
-int ieee80211_aes_ccm_decrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
- u8 *data, size_t data_len, u8 *mic,
- size_t mic_len);
-void ieee80211_aes_key_free(struct crypto_aead *tfm);
+static inline struct crypto_aead *
+ieee80211_aes_key_setup_encrypt(const u8 key[], size_t key_len, size_t mic_len)
+{
+ return aead_key_setup_encrypt("ccm(aes)", key, key_len, mic_len);
+}
+
+static inline int
+ieee80211_aes_ccm_encrypt(struct crypto_aead *tfm,
+ u8 *b_0, u8 *aad, u8 *data,
+ size_t data_len, u8 *mic)
+{
+ return aead_encrypt(tfm, b_0, aad + 2,
+ be16_to_cpup((__be16 *)aad),
+ data, data_len, mic);
+}
+
+static inline int
+ieee80211_aes_ccm_decrypt(struct crypto_aead *tfm,
+ u8 *b_0, u8 *aad, u8 *data,
+ size_t data_len, u8 *mic)
+{
+ return aead_decrypt(tfm, b_0, aad + 2,
+ be16_to_cpup((__be16 *)aad),
+ data, data_len, mic);
+}
+
+static inline void ieee80211_aes_key_free(struct crypto_aead *tfm)
+{
+ return aead_key_free(tfm);
+}
#endif /* AES_CCM_H */
diff --git a/net/mac80211/aes_gcm.h b/net/mac80211/aes_gcm.h
index 55aed5352494..d2b096033009 100644
--- a/net/mac80211/aes_gcm.h
+++ b/net/mac80211/aes_gcm.h
@@ -9,16 +9,38 @@
#ifndef AES_GCM_H
#define AES_GCM_H
-#include <linux/crypto.h>
+#include "aead_api.h"
#define GCM_AAD_LEN 32
-int ieee80211_aes_gcm_encrypt(struct crypto_aead *tfm, u8 *j_0, u8 *aad,
- u8 *data, size_t data_len, u8 *mic);
-int ieee80211_aes_gcm_decrypt(struct crypto_aead *tfm, u8 *j_0, u8 *aad,
- u8 *data, size_t data_len, u8 *mic);
-struct crypto_aead *ieee80211_aes_gcm_key_setup_encrypt(const u8 key[],
- size_t key_len);
-void ieee80211_aes_gcm_key_free(struct crypto_aead *tfm);
+static inline int ieee80211_aes_gcm_encrypt(struct crypto_aead *tfm,
+ u8 *j_0, u8 *aad, u8 *data,
+ size_t data_len, u8 *mic)
+{
+ return aead_encrypt(tfm, j_0, aad + 2,
+ be16_to_cpup((__be16 *)aad),
+ data, data_len, mic);
+}
+
+static inline int ieee80211_aes_gcm_decrypt(struct crypto_aead *tfm,
+ u8 *j_0, u8 *aad, u8 *data,
+ size_t data_len, u8 *mic)
+{
+ return aead_decrypt(tfm, j_0, aad + 2,
+ be16_to_cpup((__be16 *)aad),
+ data, data_len, mic);
+}
+
+static inline struct crypto_aead *
+ieee80211_aes_gcm_key_setup_encrypt(const u8 key[], size_t key_len)
+{
+ return aead_key_setup_encrypt("gcm(aes)", key,
+ key_len, IEEE80211_GCMP_MIC_LEN);
+}
+
+static inline void ieee80211_aes_gcm_key_free(struct crypto_aead *tfm)
+{
+ return aead_key_free(tfm);
+}
#endif /* AES_GCM_H */
diff --git a/net/mac80211/wpa.c b/net/mac80211/wpa.c
index 0d722ea98a1b..390501bf9e31 100644
--- a/net/mac80211/wpa.c
+++ b/net/mac80211/wpa.c
@@ -464,7 +464,8 @@ static int ccmp_encrypt_skb(struct ieee80211_tx_data *tx, struct sk_buff *skb,
pos += IEEE80211_CCMP_HDR_LEN;
ccmp_special_blocks(skb, pn, b_0, aad);
return ieee80211_aes_ccm_encrypt(key->u.ccmp.tfm, b_0, aad, pos, len,
- skb_put(skb, mic_len), mic_len);
+ skb_put(skb,
+ key->u.ccmp.tfm->authsize));
}
@@ -540,10 +541,11 @@ ieee80211_crypto_ccmp_decrypt(struct ieee80211_rx_data *rx,
ccmp_special_blocks(skb, pn, b_0, aad);
if (ieee80211_aes_ccm_decrypt(
- key->u.ccmp.tfm, b_0, aad,
- skb->data + hdrlen + IEEE80211_CCMP_HDR_LEN,
- data_len,
- skb->data + skb->len - mic_len, mic_len))
+ key->u.ccmp.tfm, b_0, aad,
+ skb->data + hdrlen + IEEE80211_CCMP_HDR_LEN,
+ data_len,
+ skb->data + skb->len - key->u.ccmp.tfm->authsize
+ ))
return RX_DROP_UNUSABLE;
}
--
2.14.1
^ permalink raw reply related
* [PATCH][next] ieee802154: atusb: make two structures static, fixes warnings
From: Colin King @ 2017-09-26 13:18 UTC (permalink / raw)
To: Stefan Schmidt, Alexander Aring, linux-wpan, netdev
Cc: kernel-janitors, linux-kernel
From: Colin Ian King <colin.king@canonical.com>
The arrays atusb_chip_data and hulusb_chip_data are local to the source
and do not need to be in global scope, so make them static. Also
remove unnecessary forward declaration of atusb_chip_data.
Cleans up sparse warnings:
symbol 'atusb_chip_data' was not declared. Should it be static?
symbol 'hulusb_chip_data' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
drivers/net/ieee802154/atusb.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ieee802154/atusb.c b/drivers/net/ieee802154/atusb.c
index 115fa3f37a86..8b0718de568f 100644
--- a/drivers/net/ieee802154/atusb.c
+++ b/drivers/net/ieee802154/atusb.c
@@ -45,8 +45,6 @@
#define ATUSB_ALLOC_DELAY_MS 100 /* delay after failed allocation */
#define ATUSB_TX_TIMEOUT_MS 200 /* on the air timeout */
-struct atusb_chip_data;
-
struct atusb {
struct ieee802154_hw *hw;
struct usb_device *usb_dev;
@@ -767,14 +765,14 @@ atusb_set_promiscuous_mode(struct ieee802154_hw *hw, const bool on)
return 0;
}
-struct atusb_chip_data atusb_chip_data = {
+static struct atusb_chip_data atusb_chip_data = {
.t_channel_switch = 1,
.rssi_base_val = -91,
.set_txpower = atusb_set_txpower,
.set_channel = atusb_set_channel,
};
-struct atusb_chip_data hulusb_chip_data = {
+static struct atusb_chip_data hulusb_chip_data = {
.t_channel_switch = 11,
.rssi_base_val = -100,
.set_txpower = hulusb_set_txpower,
--
2.14.1
^ permalink raw reply related
* [PATCH net-next] ldmvsw: Remove redundant unlikely()
From: Tobias Klauser @ 2017-09-26 13:14 UTC (permalink / raw)
To: netdev; +Cc: David S. Miller, Shannon Nelson
IS_ERR() already implies unlikely(), so it can be omitted.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
---
drivers/net/ethernet/sun/ldmvsw.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/sun/ldmvsw.c b/drivers/net/ethernet/sun/ldmvsw.c
index 5b56c24b6ed2..5feeaa9f0a9e 100644
--- a/drivers/net/ethernet/sun/ldmvsw.c
+++ b/drivers/net/ethernet/sun/ldmvsw.c
@@ -307,7 +307,7 @@ static int vsw_port_probe(struct vio_dev *vdev, const struct vio_device_id *id)
/* Get (or create) the vnet associated with this port */
vp = vsw_get_vnet(hp, vdev->mp, &handle);
- if (unlikely(IS_ERR(vp))) {
+ if (IS_ERR(vp)) {
err = PTR_ERR(vp);
pr_err("Failed to get vnet for vsw-port\n");
mdesc_release(hp);
--
2.13.0
^ permalink raw reply related
* [PATCH net-next] net/mlx5: Remove redundant unlikely()
From: Tobias Klauser @ 2017-09-26 13:13 UTC (permalink / raw)
To: Saeed Mahameed, Matan Barak, Leon Romanovsky
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
IS_ERR() already implies unlikely(), so it can be omitted.
Signed-off-by: Tobias Klauser <tklauser-93Khv+1bN0NyDzI6CaY1VQ@public.gmane.org>
---
drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
index 4614ddfa91eb..6a7c8b04447e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
@@ -256,7 +256,7 @@ struct sk_buff *mlx5e_ipsec_handle_tx_skb(struct net_device *netdev,
goto drop;
}
mdata = mlx5e_ipsec_add_metadata(skb);
- if (unlikely(IS_ERR(mdata))) {
+ if (IS_ERR(mdata)) {
atomic64_inc(&priv->ipsec->sw_stats.ipsec_tx_drop_metadata);
goto drop;
}
--
2.13.0
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* [PATCH net-next] bnxt_en: Remove redundant unlikely()
From: Tobias Klauser @ 2017-09-26 13:12 UTC (permalink / raw)
To: Michael Chan; +Cc: netdev
IS_ERR() already implies unlikely(), so it can be omitted.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index c25f5b555adf..5ba49938ba55 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1491,7 +1491,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_napi *bnapi, u32 *raw_cons,
(struct rx_tpa_end_cmp *)rxcmp,
(struct rx_tpa_end_cmp_ext *)rxcmp1, event);
- if (unlikely(IS_ERR(skb)))
+ if (IS_ERR(skb))
return -EBUSY;
rc = -ENOMEM;
--
2.13.0
^ permalink raw reply related
* Re: [PATCH 0/2] 9p: Fixes for hard-to-hit bugs
From: Tuomas Tynkkynen @ 2017-09-26 13:10 UTC (permalink / raw)
To: Al Viro
Cc: v9fs-developer, Eric Van Hensbergen, Ron Minnich,
Latchesar Ionkov, David S. Miller, linux-kernel, netdev,
linux-fsdevel
In-Reply-To: <20170906145908.8082-1-tuomas@tuxera.com>
Hi Al,
On Wed, 2017-09-06 at 17:59 +0300, Tuomas Tynkkynen wrote:
> These two patches fix two hard-to-hit (but really annoying) bugs in
> 9p.
> The first one was posted earlier in February (with one R-b), the
> second
> is a new one.
>
> Both of these have had soaking in NixOS distribution kernels for
> a couple of months with no ill effects.
>
> Tuomas Tynkkynen (2):
> fs/9p: Compare qid.path in v9fs_test_inode
> net/9p: Switch to wait_event_killable()
>
> fs/9p/vfs_inode.c | 3 +++
> fs/9p/vfs_inode_dotl.c | 3 +++
> net/9p/client.c | 3 +--
> net/9p/trans_virtio.c | 13 ++++++-------
> net/9p/trans_xen.c | 4 ++--
> 5 files changed, 15 insertions(+), 11 deletions(-)
>
Could you apply these? Thanks!
- Tuomas
^ permalink raw reply
* Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c
From: Roman Gushchin @ 2017-09-26 13:10 UTC (permalink / raw)
To: Yuchung Cheng
Cc: Oleksandr Natalenko, Hideaki YOSHIFUJI, Alexey Kuznetsov, netdev,
linux-kernel@vger.kernel.org
In-Reply-To: <CAK6E8=cGF+xKiixRVvA=3PVPA7OQta9hVLTgCbKgvYf3e9Eu-A@mail.gmail.com>
> On Wed, Sep 20, 2017 at 6:46 PM, Roman Gushchin <guro@fb.com> wrote:
> >
> > > Hello.
> > >
> > > Since, IIRC, v4.11, there is some regression in TCP stack resulting in the
> > > warning shown below. Most of the time it is harmless, but rarely it just
> > > causes either freeze or (I believe, this is related too) panic in
> > > tcp_sacktag_walk() (because sk_buff passed to this function is NULL).
> > > Unfortunately, I still do not have proper stacktrace from panic, but will try
> > > to capture it if possible.
> > >
> > > Also, I have custom settings regarding TCP stack, shown below as well. ifb is
> > > used to shape traffic with tc.
> > >
> > > Please note this regression was already reported as BZ [1] and as a letter to
> > > ML [2], but got neither attention nor resolution. It is reproducible for (not
> > > only) me on my home router since v4.11 till v4.13.1 incl.
> > >
> > > Please advise on how to deal with it. I'll provide any additional info if
> > > necessary, also ready to test patches if any.
> > >
> > > Thanks.
> > >
> > > [1] https://bugzilla.kernel.org/show_bug.cgi?id=195835
> > > [2] https://urldefense.proofpoint.com/v2/url?u=https-3A__www.spinics.net_lists_netdev_msg436158.html&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=jJYgtDM7QT-W-Fz_d29HYQ&m=MDDRfLG5DvdOeniMpaZDJI8ulKQ6PQ6OX_1YtRsiTMA&s=-n3dGZw-pQ95kMBUfq5G9nYZFcuWtbTDlYFkcvQPoKc&e=
> >
> > We're experiencing the same problems on some machines in our fleet.
> > Exactly the same symptoms: tcp_fastretrans_alert() warnings and
> > sometimes panics in tcp_sacktag_walk().
> >
> > Here is an example of a backtrace with the panic log:
Hi Yuchung!
> do you still see the panics if you disable RACK?
> sysctl net.ipv4.tcp_recovery=0?
No, we haven't seen any crash since that.
>
> also have you experience any sack reneg? could you post the output of
> ' nstat |grep -i TCP' thanks
hostname TcpActiveOpens 2289680 0.0
hostname TcpPassiveOpens 3592758 0.0
hostname TcpAttemptFails 746910 0.0
hostname TcpEstabResets 154988 0.0
hostname TcpInSegs 16258678255 0.0
hostname TcpOutSegs 46967011611 0.0
hostname TcpRetransSegs 13724310 0.0
hostname TcpInErrs 2 0.0
hostname TcpOutRsts 9418798 0.0
hostname TcpExtEmbryonicRsts 2303 0.0
hostname TcpExtPruneCalled 90192 0.0
hostname TcpExtOfoPruned 57274 0.0
hostname TcpExtOutOfWindowIcmps 3 0.0
hostname TcpExtTW 1164705 0.0
hostname TcpExtTWRecycled 2 0.0
hostname TcpExtPAWSEstab 159 0.0
hostname TcpExtDelayedACKs 209207209 0.0
hostname TcpExtDelayedACKLocked 508571 0.0
hostname TcpExtDelayedACKLost 1713248 0.0
hostname TcpExtListenOverflows 625 0.0
hostname TcpExtListenDrops 625 0.0
hostname TcpExtTCPHPHits 9341188489 0.0
hostname TcpExtTCPPureAcks 1434646465 0.0
hostname TcpExtTCPHPAcks 5733614672 0.0
hostname TcpExtTCPSackRecovery 3261698 0.0
hostname TcpExtTCPSACKReneging 12203 0.0
hostname TcpExtTCPSACKReorder 433189 0.0
hostname TcpExtTCPTSReorder 22694 0.0
hostname TcpExtTCPFullUndo 45092 0.0
hostname TcpExtTCPPartialUndo 22016 0.0
hostname TcpExtTCPLossUndo 2150040 0.0
hostname TcpExtTCPLostRetransmit 60119 0.0
hostname TcpExtTCPSackFailures 2626782 0.0
hostname TcpExtTCPLossFailures 182999 0.0
hostname TcpExtTCPFastRetrans 4334275 0.0
hostname TcpExtTCPSlowStartRetrans 3453348 0.0
hostname TcpExtTCPTimeouts 1070997 0.0
hostname TcpExtTCPLossProbes 2633545 0.0
hostname TcpExtTCPLossProbeRecovery 941647 0.0
hostname TcpExtTCPSackRecoveryFail 336302 0.0
hostname TcpExtTCPRcvCollapsed 461354 0.0
hostname TcpExtTCPAbortOnData 349196 0.0
hostname TcpExtTCPAbortOnClose 3395 0.0
hostname TcpExtTCPAbortOnTimeout 51201 0.0
hostname TcpExtTCPMemoryPressures 2 0.0
hostname TcpExtTCPSpuriousRTOs 2120503 0.0
hostname TcpExtTCPSackShifted 2613736 0.0
hostname TcpExtTCPSackMerged 21358743 0.0
hostname TcpExtTCPSackShiftFallback 8769387 0.0
hostname TcpExtTCPBacklogDrop 5 0.0
hostname TcpExtTCPRetransFail 843 0.0
hostname TcpExtTCPRcvCoalesce 949068035 0.0
hostname TcpExtTCPOFOQueue 470118 0.0
hostname TcpExtTCPOFODrop 9915 0.0
hostname TcpExtTCPOFOMerge 9 0.0
hostname TcpExtTCPChallengeACK 90 0.0
hostname TcpExtTCPSYNChallenge 3 0.0
hostname TcpExtTCPFastOpenActive 2089 0.0
hostname TcpExtTCPSpuriousRtxHostQueues 896596 0.0
hostname TcpExtTCPAutoCorking 547386735 0.0
hostname TcpExtTCPFromZeroWindowAdv 28757 0.0
hostname TcpExtTCPToZeroWindowAdv 28761 0.0
hostname TcpExtTCPWantZeroWindowAdv 322431 0.0
hostname TcpExtTCPSynRetrans 3026 0.0
hostname TcpExtTCPOrigDataSent 40976870977 0.0
hostname TcpExtTCPHystartTrainDetect 453920 0.0
hostname TcpExtTCPHystartTrainCwnd 11586273 0.0
hostname TcpExtTCPHystartDelayDetect 10943 0.0
hostname TcpExtTCPHystartDelayCwnd 763554 0.0
hostname TcpExtTCPACKSkippedPAWS 30 0.0
hostname TcpExtTCPACKSkippedSeq 218 0.0
hostname TcpExtTCPWinProbe 2408 0.0
hostname TcpExtTCPKeepAlive 213768 0.0
hostname TcpExtTCPMTUPFail 69 0.0
hostname TcpExtTCPMTUPSuccess 8811 0.0
Thanks!
^ permalink raw reply
* Re: [PATCH net-next 0/7] nfp: flower vxlan tunnel offload
From: Jiri Benc @ 2017-09-26 12:51 UTC (permalink / raw)
To: Or Gerlitz
Cc: Simon Horman, David Miller, Jakub Kicinski, Linux Netdev List,
oss-drivers, John Hurley, Paolo Abeni, Paul Blakey, Jiri Pirko,
Roi Dayan
In-Reply-To: <CAJ3xEMjEAhzC5=S7-dx8RiS=E7gYBBrtWY=9Cvqj-K3kyT9o3A@mail.gmail.com>
On Tue, 26 Sep 2017 15:41:37 +0300, Or Gerlitz wrote:
> Please note that the way the rule is being set to the HW driver is by delegation
> done in flower, see these commits (specifically "Add offload support
> using egress Hardware device")
It's very well possible the bug is somewhere in net/sched.
> What is the bug in your view?
If you replace skip_sw with skip_hw, the rules have to work
identically. In software, decapsulated packets appear on the vxlan0
interface, not on the eth0 interface. As the consequence, the second
example must not match on such packets. Those packets do not appear on
eth0 with software only path. eth0 sees encapsulated packets only. It's
vxlan0 that sees decapsulated packets with attached dst_metadata and
that's the only interface where the flower filter in the example can
match.
Hardware offloaded path must behave identically to the software path.
Jiri
^ permalink raw reply
* Re: [PATCH v2 net-next 0/7] net: speedup netns create/delete time
From: Eric Dumazet @ 2017-09-26 12:51 UTC (permalink / raw)
To: Tariq Toukan
Cc: David S . Miller, netdev, Eric W . Biederman, Eric Dumazet,
Majd Dibbiny, Yonatan Cohen, Eran Ben Elisha
In-Reply-To: <8edb310a-562d-bff1-0482-64314833e722@mellanox.com>
On Tue, Sep 26, 2017 at 4:21 AM, Tariq Toukan <tariqt@mellanox.com> wrote:
>
>
> Hi Eric,
>
> We see a regression introduced in this series, specifically in the patches
> touching lib/kobject_uevent.c.
> We tried to figure out what is wrong there, but couldn't point it out.
>
> Bug is that mlx4 driver restart fails, because mlx4_core is still in use.
> According to module dependencies, both mlx4_en and mlx4_ib should have been
> unloaded at this point
> Please see log below.
>
> This looks to be some kind of a race, as the repro is not deterministic.
> Probably the en/ib modules are now mistakenly reloaded.
>
> Any idea what could this be?
>
> Regards,
> Tariq
>
>
> [root@reg-l-vrt-41016-009 ~]# /etc/init.d/openibd stop
> Unloading HCA driver: [ OK ]
> [root@reg-l-vrt-41016-009 ~]# /etc/init.d/openibd start
> Loading HCA driver and Access Layer: [ OK ]
> [root@reg-l-vrt-41016-009 ~]# /etc/init.d/openibd stop
> Unloading mlx4_core [FAILED]
> rmmod: ERROR: Module mlx4_core is in use
I have absolutely no idea. Please bisect.
Are you really using netns in the first place ?
^ permalink raw reply
* Re: [PATCH net-next 7/7] nfp: flower vxlan neighbour keep-alive
From: Or Gerlitz @ 2017-09-26 12:44 UTC (permalink / raw)
To: John Hurley
Cc: Simon Horman, David Miller, Jakub Kicinski, Linux Netdev List,
oss-drivers
In-Reply-To: <CAK+XE=nfndmKbgz174CEKzfBHbf7nTBX9CUb47MPBnDssJJ0fA@mail.gmail.com>
On Tue, Sep 26, 2017 at 12:37 PM, John Hurley <john.hurley@netronome.com> wrote:
> [ Reposting in plantext only]
> On Mon, Sep 25, 2017 at 7:32 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>
>> On Mon, Sep 25, 2017 at 1:23 PM, Simon Horman
>> <simon.horman@netronome.com> wrote:
>> > From: John Hurley <john.hurley@netronome.com>
>> >
>> > Periodically receive messages containing the destination IPs of tunnels
>> > that have recently forwarded traffic. Update the neighbour entries 'used'
>> > value for these IPs next hop.
>>
>> Are you proactively sending keep alive messages from the driver or the
>> fw? what's wrong with the probes sent by the kernel NUD subsystem?
> The messages are sent from the FW to the driver. They indicate which
> offloaded tunnels are currently active.
Do you support flow counters for offloaded TC rules? do you support last-use?
If Y && Y and you cache someone the prev counter value, you can use
this for the "used" feedback. I don't see why add keep-alive and not piggy back
on the flow counters logic.
>> In our driver we also update the used value for neighs of offloaded
>> tunnels, we do it based on flow counters for the offloaded tunnels
>> which is an evidence for activity. Any reason for you not to apply a
>> similar practice?
> Yes, this would provide the same outcome. Because our firmware already
> offered these messages, we chose to support this approach.
^ permalink raw reply
* Re: [PATCH] ebtables: fix race condition in frame_filter_net_init()
From: Florian Westphal @ 2017-09-26 12:42 UTC (permalink / raw)
To: Artem Savkov
Cc: Florian Westphal, Pablo Neira Ayuso, netdev, linux-kernel,
netfilter-devel
In-Reply-To: <20170926122938.11603-1-asavkov@redhat.com>
Artem Savkov <asavkov@redhat.com> wrote:
> It is possible for ebt_in_hook to be triggered before ebt_table is assigned
> resulting in a NULL-pointer dereference. Make sure hooks are
> registered as the last step.
Right, thanks for the patch.
> --- a/net/bridge/netfilter/ebtable_broute.c
> +++ b/net/bridge/netfilter/ebtable_broute.c
> @@ -65,7 +65,7 @@ static int ebt_broute(struct sk_buff *skb)
>
> static int __net_init broute_net_init(struct net *net)
> {
> - net->xt.broute_table = ebt_register_table(net, &broute_table, NULL);
> + net->xt.broute_table = ebt_register_table(net, &broute_table);
I wonder if it makes more sense to model this like the iptables version,
i.e. pass net->xt.table_name as last arg to ebt_register_table ...
> +int ebt_register_hooks(struct net *net, struct ebt_table *table,
> + const struct nf_hook_ops *ops)
> +{
> + int ret = nf_register_net_hooks(net, ops, hweight32(table->valid_hooks));
> +
> + if (ret)
> + __ebt_unregister_table(net, table);
> +
> + return ret;
> +}
... because this looks strange (unregister of table/not-so-obvious error
unwinding ...)
> @@ -1252,15 +1262,6 @@ ebt_register_table(struct net *net, const struct ebt_table *input_table,
> list_add(&table->list, &net->xt.tables[NFPROTO_BRIDGE]);
> mutex_unlock(&ebt_mutex);
... here one could then assign the net->xt.table_X pointer, and then do
the hook registration right after.
However i have no strong opinion here.
^ permalink raw reply
* Re: [PATCH net-next 0/7] nfp: flower vxlan tunnel offload
From: Or Gerlitz @ 2017-09-26 12:41 UTC (permalink / raw)
To: Jiri Benc
Cc: Simon Horman, David Miller, Jakub Kicinski, Linux Netdev List,
oss-drivers, John Hurley, Paolo Abeni, Paul Blakey, Jiri Pirko,
Roi Dayan
In-Reply-To: <20170926121509.50a32571@griffin>
On Tue, Sep 26, 2017 at 1:15 PM, Jiri Benc <jbenc@redhat.com> wrote:
> On Mon, 25 Sep 2017 19:04:53 +0200, Simon Horman wrote:
>> The MAC addresses are extracted from the netdevs already loaded in the
>> kernel and are monitored for any changes. The IP addresses are slightly
>> different in that they are extracted from the rules themselves. We make the
>> assumption that, if a packet is decapsulated at the end point and a match
>> is attempted on the IP address,
>
> You lost me here, I'm afraid. What do you mean by "match"?
>
>> that this IP address should be recognised
>> in the kernel. That being the case, the same traffic pattern should be
>> witnessed if the skip_hw flag is applied.
>
> Just to be really sure that this works correctly, can you confirm that
> this will match the packet:
>
> ip link add vxlan0 type vxlan dstport 4789 dev eth0 external
> ip link set dev vxlan0 up
> tc qdisc add dev vxlan0 ingress
> ethtool -K eth0 hw-tc-offload on
> tc filter add dev vxlan0 protocol ip parent ffff: flower enc_key_id 102 \
> enc_dst_port 4789 src_ip 3.4.5.6 skip_sw action [...]
>
> while this one will NOT match:
what do you exactly mean by "will not match"
> ip link add vxlan0 type vxlan dstport 4789 dev eth0 external
> ip link set dev vxlan0 up
> tc qdisc add dev eth0 ingress
> ethtool -K eth0 hw-tc-offload on
> tc filter add dev eth0 protocol ip parent ffff: flower enc_key_id 102 \
> enc_dst_port 4789 src_ip 3.4.5.6 skip_sw action [...]
Please note that the way the rule is being set to the HW driver is by delegation
done in flower, see these commits (specifically "Add offload support
using egress Hardware device")
a6e1693 net/sched: cls_flower: Set the filter Hardware device for all use-cases
7091d8c net/sched: cls_flower: Add offload support using egress Hardware device
255cb30 net/sched: act_mirred: Add new tc_action_ops get_dev()
Since the egress port is not HW port netdev but rather SW virtual tunnel netdev
we have some logic in the kernel to delegate the rule programming to
HW via the HW netdev
OKay? if not, please elaborate
> We found that with mlx5, the second one actually matches, too. Which is
> a very serious bug. (Adding Paolo who found this. And adding a few more
> Mellanox guys to be aware of the bug.)
What is the bug in your view?
Or.
^ permalink raw reply
* [PATCH] ebtables: fix race condition in frame_filter_net_init()
From: Artem Savkov @ 2017-09-26 12:29 UTC (permalink / raw)
To: Florian Westphal
Cc: Pablo Neira Ayuso, netdev, linux-kernel, netfilter-devel,
Artem Savkov
It is possible for ebt_in_hook to be triggered before ebt_table is assigned
resulting in a NULL-pointer dereference. Make sure hooks are
registered as the last step.
Fixes: aee12a0a3727 ebtables: remove nf_hook_register usage
Signed-off-by: Artem Savkov <asavkov@redhat.com>
---
include/linux/netfilter_bridge/ebtables.h | 5 +++--
net/bridge/netfilter/ebtable_broute.c | 2 +-
net/bridge/netfilter/ebtable_filter.c | 8 ++++++--
net/bridge/netfilter/ebtable_nat.c | 8 ++++++--
net/bridge/netfilter/ebtables.c | 24 +++++++++++++-----------
5 files changed, 29 insertions(+), 18 deletions(-)
diff --git a/include/linux/netfilter_bridge/ebtables.h b/include/linux/netfilter_bridge/ebtables.h
index 2c2a5514b0df..7d68f5ba6ded 100644
--- a/include/linux/netfilter_bridge/ebtables.h
+++ b/include/linux/netfilter_bridge/ebtables.h
@@ -109,8 +109,9 @@ struct ebt_table {
#define EBT_ALIGN(s) (((s) + (__alignof__(struct _xt_align)-1)) & \
~(__alignof__(struct _xt_align)-1))
extern struct ebt_table *ebt_register_table(struct net *net,
- const struct ebt_table *table,
- const struct nf_hook_ops *);
+ const struct ebt_table *table);
+extern int ebt_register_hooks(struct net *net, struct ebt_table *table,
+ const struct nf_hook_ops *ops);
extern void ebt_unregister_table(struct net *net, struct ebt_table *table,
const struct nf_hook_ops *);
extern unsigned int ebt_do_table(struct sk_buff *skb,
diff --git a/net/bridge/netfilter/ebtable_broute.c b/net/bridge/netfilter/ebtable_broute.c
index 2585b100ebbb..b41017409aa5 100644
--- a/net/bridge/netfilter/ebtable_broute.c
+++ b/net/bridge/netfilter/ebtable_broute.c
@@ -65,7 +65,7 @@ static int ebt_broute(struct sk_buff *skb)
static int __net_init broute_net_init(struct net *net)
{
- net->xt.broute_table = ebt_register_table(net, &broute_table, NULL);
+ net->xt.broute_table = ebt_register_table(net, &broute_table);
return PTR_ERR_OR_ZERO(net->xt.broute_table);
}
diff --git a/net/bridge/netfilter/ebtable_filter.c b/net/bridge/netfilter/ebtable_filter.c
index 45a00dbdbcad..ca04582b374e 100644
--- a/net/bridge/netfilter/ebtable_filter.c
+++ b/net/bridge/netfilter/ebtable_filter.c
@@ -93,8 +93,12 @@ static const struct nf_hook_ops ebt_ops_filter[] = {
static int __net_init frame_filter_net_init(struct net *net)
{
- net->xt.frame_filter = ebt_register_table(net, &frame_filter, ebt_ops_filter);
- return PTR_ERR_OR_ZERO(net->xt.frame_filter);
+ net->xt.frame_filter = ebt_register_table(net, &frame_filter);
+
+ if (IS_ERR(net->xt.frame_filter))
+ return PTR_ERR(net->xt.frame_filter);
+
+ return ebt_register_hooks(net, net->xt.frame_filter, ebt_ops_filter);
}
static void __net_exit frame_filter_net_exit(struct net *net)
diff --git a/net/bridge/netfilter/ebtable_nat.c b/net/bridge/netfilter/ebtable_nat.c
index 57cd5bb154e7..f4a2ff93be34 100644
--- a/net/bridge/netfilter/ebtable_nat.c
+++ b/net/bridge/netfilter/ebtable_nat.c
@@ -93,8 +93,12 @@ static const struct nf_hook_ops ebt_ops_nat[] = {
static int __net_init frame_nat_net_init(struct net *net)
{
- net->xt.frame_nat = ebt_register_table(net, &frame_nat, ebt_ops_nat);
- return PTR_ERR_OR_ZERO(net->xt.frame_nat);
+ net->xt.frame_nat = ebt_register_table(net, &frame_nat);
+
+ if (IS_ERR(net->xt.frame_nat))
+ return PTR_ERR(net->xt.frame_nat);
+
+ return ebt_register_hooks(net, net->xt.frame_nat, ebt_ops_nat);
}
static void __net_exit frame_nat_net_exit(struct net *net)
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 83951f978445..e72120ac426e 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -1169,9 +1169,19 @@ static void __ebt_unregister_table(struct net *net, struct ebt_table *table)
kfree(table);
}
+int ebt_register_hooks(struct net *net, struct ebt_table *table,
+ const struct nf_hook_ops *ops)
+{
+ int ret = nf_register_net_hooks(net, ops, hweight32(table->valid_hooks));
+
+ if (ret)
+ __ebt_unregister_table(net, table);
+
+ return ret;
+}
+
struct ebt_table *
-ebt_register_table(struct net *net, const struct ebt_table *input_table,
- const struct nf_hook_ops *ops)
+ebt_register_table(struct net *net, const struct ebt_table *input_table)
{
struct ebt_table_info *newinfo;
struct ebt_table *t, *table;
@@ -1252,15 +1262,6 @@ ebt_register_table(struct net *net, const struct ebt_table *input_table,
list_add(&table->list, &net->xt.tables[NFPROTO_BRIDGE]);
mutex_unlock(&ebt_mutex);
- if (!ops)
- return table;
-
- ret = nf_register_net_hooks(net, ops, hweight32(table->valid_hooks));
- if (ret) {
- __ebt_unregister_table(net, table);
- return ERR_PTR(ret);
- }
-
return table;
free_unlock:
mutex_unlock(&ebt_mutex);
@@ -2456,6 +2457,7 @@ static void __exit ebtables_fini(void)
printk(KERN_INFO "Ebtables v2.0 unregistered\n");
}
+EXPORT_SYMBOL(ebt_register_hooks);
EXPORT_SYMBOL(ebt_register_table);
EXPORT_SYMBOL(ebt_unregister_table);
EXPORT_SYMBOL(ebt_do_table);
--
2.13.5
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox