* Re: [RFC v2 1/2] netlink: add NLA_REJECT policy type
From: David Miller @ 2018-09-12 18:15 UTC (permalink / raw)
To: johannes-cdvu00un1VgdHxzADdlk8Q
Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
netdev-u79uwXL29TY76Z2rM5mHXA, mkubecek-AlSwsSmVLrQ,
johannes.berg-ral2JQCrhuEAvxtiuMwx3w
In-Reply-To: <20180912083610.20857-1-johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org>
From: Johannes Berg <johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org>
Date: Wed, 12 Sep 2018 10:36:09 +0200
> From: Johannes Berg <johannes.berg-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
>
> In some situations some netlink attributes may be used for output
> only (kernel->userspace) or may be reserved for future use. It's
> then helpful to be able to prevent userspace from using them in
> messages sent to the kernel, since they'd otherwise be ignored and
> any future use will become impossible if this happens.
>
> Add NLA_REJECT to the policy which does nothing but reject (with
> EINVAL) validation of any messages containing this attribute.
> Allow for returning a specific extended ACK error message in the
> validation_data pointer.
>
> While at it fix the indentation of NLA_BITFIELD32 and describe the
> validation_data pointer for it.
>
> The specific case I have in mind now is a shared nested attribute
> containing request/response data, and it would be pointless and
> potentially confusing to have userspace include response data in
> the messages that actually contain a request.
>
> Signed-off-by: Johannes Berg <johannes.berg-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
This looks great, no objections to this idea or the facility.
It does, however, remind me about the classic problem of how bad
we are at feature support detection because unrecognized attributes are
ignored.
I do really hope we can fully solve that problem some day.
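
To make the contrast concrete, here is a minimal userspace sketch of the two behaviours. It is a model only: `pol_type`, `pol_entry`, `validate_attr` and the `DEMO_ATTR_*` numbers are made-up names for illustration, not the kernel's netlink API. An attribute with an explicit reject policy fails with -EINVAL, while an attribute the policy table does not know about is silently accepted.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical policy types, loosely modelled on the NLA_* idea. */
enum pol_type { POL_UNSPEC, POL_U32, POL_REJECT };

struct pol_entry {
	enum pol_type type;
	const char *reject_msg;	/* optional extended-ACK style message */
};

/* Hypothetical attribute numbering for the demo. */
enum {
	DEMO_ATTR_UNSPEC,
	DEMO_ATTR_REQUEST,
	DEMO_ATTR_RESPONSE,
	DEMO_ATTR_MAX = DEMO_ATTR_RESPONSE
};

static const struct pol_entry demo_policy[DEMO_ATTR_MAX + 1] = {
	[DEMO_ATTR_REQUEST]  = { .type = POL_U32 },
	[DEMO_ATTR_RESPONSE] = { .type = POL_REJECT,
				 .reject_msg = "response data not allowed in request" },
};

/*
 * Validate one attribute type against the policy.
 * Returns 0 if accepted (or ignored), -EINVAL if rejected; on rejection
 * *msg points at the policy's message, if it has one.
 */
static int validate_attr(const struct pol_entry *policy, int maxtype,
			 int type, const char **msg)
{
	*msg = NULL;
	if (type <= 0 || type > maxtype)
		return 0;	/* unknown attribute: silently ignored */
	if (policy[type].type == POL_REJECT) {
		*msg = policy[type].reject_msg;
		return -EINVAL;
	}
	return 0;
}
```

The last branch is the feature-detection problem in miniature: a sender cannot tell "accepted" from "never looked at" for attribute types above `maxtype`.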
* Re: [PATCH net-next v3 02/17] zinc: introduce minimal cryptography library
From: Jason A. Donenfeld @ 2018-09-12 18:16 UTC (permalink / raw)
To: Eric Biggers
Cc: Ard Biesheuvel, LKML, Netdev, David Miller, Greg Kroah-Hartman,
Andrew Lutomirski, Samuel Neves, Jean-Philippe Aumasson,
Linux Crypto Mailing List
In-Reply-To: <20180911220849.GC81235@gmail.com>
Hi Eric,
On Wed, Sep 12, 2018 at 12:08 AM Eric Biggers <ebiggers@kernel.org> wrote:
> I'd strongly prefer the assembly to be readable too. Jason, I'm not sure if
> you've actually read through the asm from the OpenSSL implementations, but the
> generated .S files actually do lose a lot of semantic information that was in
> the original .pl scripts.
The thing to keep in mind is that the .S was not directly and blindly
generated from the .pl. We started with the output of the .pl, and
then, particularly in the case of x86_64, worked with it a lot, and
now it's something a bit different. We've definitely spent a lot of
time reading that assembly.
I'll see if I can improve the readability with some register name
remapping on ARM. No guarantees, but I'll play a bit and see if I can
make it a bit better.
Jason
* Re: [PATCH net-next v3 02/17] zinc: introduce minimal cryptography library
From: Ard Biesheuvel @ 2018-09-12 18:19 UTC (permalink / raw)
To: Jason A. Donenfeld
Cc: Eric Biggers, LKML, Netdev, David Miller, Greg Kroah-Hartman,
Andrew Lutomirski, Samuel Neves, Jean-Philippe Aumasson,
Linux Crypto Mailing List
In-Reply-To: <CAHmME9rxrC+CAR-xoXd-bZO1HYZ+TjvT_K4xgQwWXE53zi61xQ@mail.gmail.com>
On 12 September 2018 at 20:16, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> Hi Eric,
>
> On Wed, Sep 12, 2018 at 12:08 AM Eric Biggers <ebiggers@kernel.org> wrote:
>> I'd strongly prefer the assembly to be readable too. Jason, I'm not sure if
>> you've actually read through the asm from the OpenSSL implementations, but the
>> generated .S files actually do lose a lot of semantic information that was in
>> the original .pl scripts.
>
> The thing to keep in mind is that the .S was not directly and blindly
> generated from the .pl. We started with the output of the .pl, and
> then, particularly in the case of x86_64, worked with it a lot, and
> now it's something a bit different. We've definitely spent a lot of
> time reading that assembly.
>
Can we please have those changes as a separate patch? Preferably to
the .pl file rather than the .S file, so we can easily distinguish the
code from upstream from the code that you modified.
> I'll see if I can improve the readability with some register name
> remapping on ARM. No guarantees, but I'll play a bit and see if I can
> make it a bit better.
>
> Jason
* Re: [PATCH net-next v3 02/17] zinc: introduce minimal cryptography library
From: Eric Biggers @ 2018-09-12 18:34 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Jason A. Donenfeld, LKML, Netdev, David Miller,
Greg Kroah-Hartman, Andrew Lutomirski, Samuel Neves,
Jean-Philippe Aumasson, Linux Crypto Mailing List
In-Reply-To: <CAKv+Gu9ZjX3A4bHHgRAkJtr+SKwti+d3dwE0fhGtmeQB_refqw@mail.gmail.com>
On Wed, Sep 12, 2018 at 08:19:21PM +0200, Ard Biesheuvel wrote:
> On 12 September 2018 at 20:16, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> > Hi Eric,
> >
> > On Wed, Sep 12, 2018 at 12:08 AM Eric Biggers <ebiggers@kernel.org> wrote:
> >> I'd strongly prefer the assembly to be readable too. Jason, I'm not sure if
> >> you've actually read through the asm from the OpenSSL implementations, but the
> >> generated .S files actually do lose a lot of semantic information that was in
> >> the original .pl scripts.
> >
> > The thing to keep in mind is that the .S was not directly and blindly
> > generated from the .pl. We started with the output of the .pl, and
> > then, particularly in the case of x86_64, worked with it a lot, and
> > now it's something a bit different. We've definitely spent a lot of
> > time reading that assembly.
> >
>
> Can we please have those changes as a separate patch? Preferably to
> the .pl file rather than the .S file, so we can easily distinguish the
> code from upstream from the code that you modified.
>
> > I'll see if I can improve the readability with some register name
> > remapping on ARM. No guarantees, but I'll play a bit and see if I can
> > make it a bit better.
> >
> > Jason
FWIW, yesterday I made a modified version of poly1305-armv4.pl that generates an
asm file that works in kernel mode. The changes are actually pretty small, and
I think we can get them upstream into OpenSSL like they were for sha256-armv4.pl
and sha512-armv4.pl. I'll start a thread with Andy Polyakov and you two.
But I don't have time to help with all the many OpenSSL asm files Jason is
proposing, just maybe poly1305-armv4 and chacha-armv4 for now.
- Eric
* [PATCH net 0/4] s390/qeth: fixes 2018-09-12
From: Julian Wiedmann @ 2018-09-12 13:31 UTC (permalink / raw)
To: David Miller
Cc: netdev, linux-s390, Martin Schwidefsky, Heiko Carstens,
Stefan Raspl, Ursula Braun, Julian Wiedmann
Hi Dave,
please apply the following qeth fixes for -net.
Patch 1 resolves a regression in an error path, while patch 2 enables
the SG support by default that was newly introduced with 4.19.
Patch 3 takes care of a longstanding problem with large-order
allocations, and patch 4 fixes a potential out-of-bounds access.
Thanks,
Julian
Julian Wiedmann (3):
s390/qeth: indicate error when netdev allocation fails
s390/qeth: switch on SG by default for IQD devices
s390/qeth: don't dump past end of unknown HW header
Wenjia Zhang (1):
s390/qeth: use vzalloc for QUERY OAT buffer
drivers/s390/net/qeth_core_main.c | 11 ++++++++---
drivers/s390/net/qeth_l2_main.c | 2 +-
drivers/s390/net/qeth_l3_main.c | 2 +-
3 files changed, 10 insertions(+), 5 deletions(-)
--
2.16.4
* [PATCH net 1/4] s390/qeth: indicate error when netdev allocation fails
From: Julian Wiedmann @ 2018-09-12 13:31 UTC (permalink / raw)
To: David Miller
Cc: netdev, linux-s390, Martin Schwidefsky, Heiko Carstens,
Stefan Raspl, Ursula Braun, Julian Wiedmann
In-Reply-To: <20180912133135.12335-1-jwi@linux.ibm.com>
Bailing out on allocation error is nice, but we also need to tell the
ccwgroup core that creating the qeth groupdev failed.
Fixes: d3d1b205e89f ("s390/qeth: allocate netdevice early")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
---
drivers/s390/net/qeth_core_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index 49f64eb3eab0..6b24face21d5 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -5768,8 +5768,10 @@ static int qeth_core_probe_device(struct ccwgroup_device *gdev)
 	qeth_update_from_chp_desc(card);
 	card->dev = qeth_alloc_netdev(card);
-	if (!card->dev)
+	if (!card->dev) {
+		rc = -ENOMEM;
 		goto err_card;
+	}
 	qeth_determine_capabilities(card);
 	enforced_disc = qeth_enforce_discipline(card);
--
2.16.4
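
For readers less familiar with the shared-error-label pattern, the bug class fixed here can be sketched in userspace as follows. The names `demo_card`, `demo_alloc_netdev` and `demo_probe` are hypothetical stand-ins, not qeth code. The point is that `rc` still holds 0 from the earlier successful steps, so a branch that jumps to the error label without setting it reports success to the caller.

```c
#include <errno.h>
#include <stdlib.h>

struct demo_card {
	void *dev;
};

/* Hypothetical allocator; fails on request to exercise the error path. */
static void *demo_alloc_netdev(int fail)
{
	return fail ? NULL : malloc(16);
}

/*
 * Probe-style function with a shared error label. rc starts at 0, so each
 * failing branch must set it explicitly before jumping; otherwise the
 * caller would see success even though setup was aborted.
 */
static int demo_probe(struct demo_card *card, int fail_alloc)
{
	int rc = 0;

	card->dev = demo_alloc_netdev(fail_alloc);
	if (!card->dev) {
		rc = -ENOMEM;
		goto err_card;
	}

	/* ... further setup would go here ... */
	free(card->dev);	/* demo only: release right away */
	card->dev = NULL;
	return 0;

err_card:
	return rc;
}
```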
* [PATCH net 2/4] s390/qeth: switch on SG by default for IQD devices
From: Julian Wiedmann @ 2018-09-12 13:31 UTC (permalink / raw)
To: David Miller
Cc: netdev, linux-s390, Martin Schwidefsky, Heiko Carstens,
Stefan Raspl, Ursula Braun, Julian Wiedmann
In-Reply-To: <20180912133135.12335-1-jwi@linux.ibm.com>
Scatter-gather transmit brings a nice performance boost. Considering the
rather large MTU sizes at play, it's also totally the Right Thing To Do.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
---
drivers/s390/net/qeth_core_main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index 6b24face21d5..b60055e9cb1a 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -5706,6 +5706,8 @@ static struct net_device *qeth_alloc_netdev(struct qeth_card *card)
 		dev->priv_flags &= ~IFF_TX_SKB_SHARING;
 		dev->hw_features |= NETIF_F_SG;
 		dev->vlan_features |= NETIF_F_SG;
+		if (IS_IQD(card))
+			dev->features |= NETIF_F_SG;
 	}
 	return dev;
--
2.16.4
* [PATCH net 3/4] s390/qeth: use vzalloc for QUERY OAT buffer
From: Julian Wiedmann @ 2018-09-12 13:31 UTC (permalink / raw)
To: David Miller
Cc: netdev, linux-s390, Martin Schwidefsky, Heiko Carstens,
Stefan Raspl, Ursula Braun, Wenjia Zhang, Julian Wiedmann
In-Reply-To: <20180912133135.12335-1-jwi@linux.ibm.com>
From: Wenjia Zhang <wenjia@linux.ibm.com>
qeth_query_oat_command() currently allocates the kernel buffer for
the SIOC_QETH_QUERY_OAT ioctl with kzalloc. So on systems with
fragmented memory, large allocations may fail (e.g. the qethqoat tool by
default uses 132KB).
Solve this issue by using vzalloc, backing the allocation with
non-contiguous memory.
Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
---
drivers/s390/net/qeth_core_main.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index b60055e9cb1a..de8282420f96 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -25,6 +25,7 @@
#include <linux/netdevice.h>
#include <linux/netdev_features.h>
#include <linux/skbuff.h>
+#include <linux/vmalloc.h>
#include <net/iucv/af_iucv.h>
#include <net/dsfield.h>
@@ -4699,7 +4700,7 @@ static int qeth_query_oat_command(struct qeth_card *card, char __user *udata)
 	priv.buffer_len = oat_data.buffer_len;
 	priv.response_len = 0;
-	priv.buffer = kzalloc(oat_data.buffer_len, GFP_KERNEL);
+	priv.buffer = vzalloc(oat_data.buffer_len);
 	if (!priv.buffer) {
 		rc = -ENOMEM;
 		goto out;
@@ -4740,7 +4741,7 @@ static int qeth_query_oat_command(struct qeth_card *card, char __user *udata)
 			rc = -EFAULT;
 out_free:
-	kfree(priv.buffer);
+	vfree(priv.buffer);
 out:
 	return rc;
 }
--
2.16.4
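
As a rough worked example of why the 132KB default is a problem for a physically contiguous allocation: assuming 4 KiB pages and that the request is rounded up to a power-of-two number of pages (both are assumptions of this sketch, and `demo_alloc_order` is a made-up helper, not a kernel function), the order can be computed like this.

```c
#include <stddef.h>

/*
 * Smallest order n such that (page_size << n) >= size, i.e. how many
 * physically contiguous pages (as a power of two) a buddy-style
 * allocator would have to find for the request.
 */
static unsigned int demo_alloc_order(size_t size, size_t page_size)
{
	unsigned int order = 0;
	size_t span = page_size;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}
```

Under those assumptions 132KB is 33 pages, which rounds up to 64 contiguous pages (order 6). On a fragmented system such a block may simply not exist, whereas a vmalloc-style allocation only needs 33 individual pages.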
* [PATCH net 4/4] s390/qeth: don't dump past end of unknown HW header
From: Julian Wiedmann @ 2018-09-12 13:31 UTC (permalink / raw)
To: David Miller
Cc: netdev, linux-s390, Martin Schwidefsky, Heiko Carstens,
Stefan Raspl, Ursula Braun, Julian Wiedmann
In-Reply-To: <20180912133135.12335-1-jwi@linux.ibm.com>
For inbound data with an unsupported HW header format, only dump the
actual HW header. We have no idea how much payload follows it, and what
it contains. Worst case, we dump past the end of the Inbound Buffer and
access whatever is located next in memory.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
---
drivers/s390/net/qeth_l2_main.c | 2 +-
drivers/s390/net/qeth_l3_main.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index 710fa74892ae..b5e38531733f 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -423,7 +423,7 @@ static int qeth_l2_process_inbound_buffer(struct qeth_card *card,
 		default:
 			dev_kfree_skb_any(skb);
 			QETH_CARD_TEXT(card, 3, "inbunkno");
-			QETH_DBF_HEX(CTRL, 3, hdr, QETH_DBF_CTRL_LEN);
+			QETH_DBF_HEX(CTRL, 3, hdr, sizeof(*hdr));
 			continue;
 		}
 		work_done++;
diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c
index 7175086677fb..ada258c01a08 100644
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -1390,7 +1390,7 @@ static int qeth_l3_process_inbound_buffer(struct qeth_card *card,
 		default:
 			dev_kfree_skb_any(skb);
 			QETH_CARD_TEXT(card, 3, "inbunkno");
-			QETH_DBF_HEX(CTRL, 3, hdr, QETH_DBF_CTRL_LEN);
+			QETH_DBF_HEX(CTRL, 3, hdr, sizeof(*hdr));
 			continue;
 		}
 		work_done++;
--
2.16.4
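
The idea of bounding the dump by the header's own size can be sketched in userspace as below. `demo_hdr` is a hypothetical 32-byte header and `demo_hex_dump` a made-up helper; neither reflects the real qeth header layout or the QETH_DBF_HEX macro.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical 32-byte hardware header; the real layout is not shown here. */
struct demo_hdr {
	unsigned char bytes[32];
};

/*
 * Hex-dump exactly len bytes from src into out, which must hold
 * 2 * len + 1 chars. Returns the number of bytes dumped.
 */
static size_t demo_hex_dump(char *out, const void *src, size_t len)
{
	const unsigned char *p = src;
	size_t i;

	for (i = 0; i < len; i++)
		sprintf(out + 2 * i, "%02x", p[i]);
	out[2 * len] = '\0';
	return len;
}

/* Dump only the header itself, never the unknown payload behind it. */
static size_t demo_dump_unknown_hdr(char *out, const struct demo_hdr *hdr)
{
	return demo_hex_dump(out, hdr, sizeof(*hdr));
}
```

Tying the length to `sizeof(*hdr)` means the read can never run past the object the pointer refers to, whatever a fixed debug-area length constant happens to be.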
* Re: [PATCH] net: dsa: mv88e6xxx: Add MV88E6352 DT compatible
From: Andrew Lunn @ 2018-09-12 13:32 UTC (permalink / raw)
To: Marek Vasut; +Cc: netdev
In-Reply-To: <2efc9a68-fc8c-2a85-14f8-bc2c72d9957f@denx.de>
> But the DT should correctly describe the hardware, if it doesn't, it's
> just broken.
It is more subtle than that. It can be broken, yet work, because it
contains information which we don't use. I really expect there will be
cut/paste errors, meaning the more specific compatible is sometimes
wrong. But since at the moment we don't use it, such a broken DT blob
will work. Until the day we need to make use of the more specific
compatible because there really is broken silicon. At that point, we
introduce a regression. All the devices with broken, yet up until now
working DT blobs, stop working. Are you really going to argue they
were always broken, so we don't care we introduced a regression?
Anyway, this is just rehashing an old discussion. Please go read the
archive. See if you have anything new to add which was not discussed
before.
Andrew
* Re: [PATCH] net: dsa: mv88e6xxx: Add MV88E6352 DT compatible
From: Marek Vasut @ 2018-09-12 13:35 UTC (permalink / raw)
To: Andrew Lunn; +Cc: netdev
In-Reply-To: <20180912133222.GC24595@lunn.ch>
On 09/12/2018 03:32 PM, Andrew Lunn wrote:
>> But the DT should correctly describe the hardware, if it doesn't, it's
>> just broken.
>
> It is more subtle than that. It can be broken, yet work, because it
> contains information which we don't use. I really expect there will be
> cut/paste errors, meaning the more specific compatible is sometimes
> wrong.
If your DT is bogus, nothing can be done about that.
> But since at the moment we don't use it, such a broken DT blob
> will work. Until the day we need to make use of the more specific
> compatible because there really is broken silicon. At that point, we
> introduce a regression. All the devices with broken, yet up until now
> working DT blobs, stop working. Are you really going to argue they
> were always broken, so we don't care we introduced a regression?
If the DT is broken, it's already a bug and we cannot do anything about
that but maybe fix the DT somehow.
> Anyway, this is just rehashing an old discussion. Please go read the
> archive. See if you have anything new to add which was not discussed
> before.
Hehe, I've seen those discussions before too.
--
Best regards,
Marek Vasut
* Re: [PATCH net-next] tcp: rate limit synflood warnings further
From: Willem de Bruijn @ 2018-09-12 13:39 UTC (permalink / raw)
To: David Miller; +Cc: Network Development, Eric Dumazet, Willem de Bruijn
In-Reply-To: <20180911.233503.2281697703604725089.davem@davemloft.net>
On Wed, Sep 12, 2018 at 2:35 AM David Miller <davem@davemloft.net> wrote:
>
> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Date: Sun, 9 Sep 2018 19:12:12 -0400
>
> > From: Willem de Bruijn <willemb@google.com>
> >
> > Convert pr_info to net_info_ratelimited to limit the total number of
> > synflood warnings.
> >
> > Commit 946cedccbd73 ("tcp: Change possible SYN flooding messages")
> > rate limits synflood warnings to one per listener.
> >
> > Workloads that open many listener sockets can still see a high rate of
> > log messages. Syzkaller is one frequent example.
> >
> > Signed-off-by: Willem de Bruijn <willemb@google.com>
>
> Applied, thanks Willem.
>
> Is this stable material?
Thanks. Probably not. A process has to keep opening new listeners
at high rate while under load. I've only seen syzkaller do that.
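
To illustrate why a per-listener limit does not bound the total: with one warning per listener, N listeners still produce N messages, while a single shared limiter caps the total per time window. Below is a minimal userspace model of such a shared interval/burst limiter. `demo_ratelimit` and `demo_ratelimit_ok` are hypothetical names and the structure is only assumed to resemble what a global ratelimit does; it is not the kernel implementation.

```c
#include <stdbool.h>

/* Hypothetical shared limiter: at most `burst` events per `interval`. */
struct demo_ratelimit {
	unsigned long interval;	/* window length, in caller-defined ticks */
	unsigned int burst;	/* events allowed per window */
	unsigned long begin;	/* start of the current window */
	unsigned int printed;	/* events allowed so far in this window */
	unsigned int missed;	/* events suppressed in this window */
};

/* Returns true if the caller may log at time `now`, false to suppress. */
static bool demo_ratelimit_ok(struct demo_ratelimit *rs, unsigned long now)
{
	if (now - rs->begin >= rs->interval) {
		/* window expired: start a new one */
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->missed++;
	return false;
}
```

Because the state is shared by all callers, a process that keeps opening new listeners hits the same budget as everyone else.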
* Re: [PATCH net-next] virtio_net: ethtool tx napi configuration
From: Willem de Bruijn @ 2018-09-12 13:43 UTC (permalink / raw)
To: Jason Wang
Cc: Network Development, David Miller, caleb.raitto,
Michael S. Tsirkin, Jon Olson (Google Drive), Willem de Bruijn
In-Reply-To: <ab603c53-f7f8-5e89-a7c6-0050a97abe7b@redhat.com>
On Tue, Sep 11, 2018 at 11:35 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
>
> On 2018年09月11日 09:14, Willem de Bruijn wrote:
> >>>> I cooked up a fixup, and it looks like it works in my setup:
> >>>>
> >>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >>>> index b320b6b14749..9181c3f2f832 100644
> >>>> --- a/drivers/net/virtio_net.c
> >>>> +++ b/drivers/net/virtio_net.c
> >>>> @@ -2204,10 +2204,17 @@ static int virtnet_set_coalesce(struct
> >>>> net_device *dev,
> >>>> return -EINVAL;
> >>>>
> >>>> if (napi_weight ^ vi->sq[0].napi.weight) {
> >>>> - if (dev->flags & IFF_UP)
> >>>> - return -EBUSY;
> >>>> - for (i = 0; i < vi->max_queue_pairs; i++)
> >>>> + for (i = 0; i < vi->max_queue_pairs; i++) {
> >>>> + struct netdev_queue *txq =
> >>>> + netdev_get_tx_queue(vi->dev, i);
> >>>> +
> >>>> + virtnet_napi_tx_disable(&vi->sq[i].napi);
> >>>> + __netif_tx_lock_bh(txq);
> >>>> vi->sq[i].napi.weight = napi_weight;
> >>>> + __netif_tx_unlock_bh(txq);
> >>>> + virtnet_napi_tx_enable(vi, vi->sq[i].vq,
> >>>> + &vi->sq[i].napi);
> >>>> + }
> >>>> }
> >>>>
> >>>> return 0;
> >>> Thanks! It passes my simple stress test, too. Which consists of two
> >>> concurrent loops, one toggling the ethtool option, another running
> >>> TCP_RR.
> >>>
> >>>> The only left case is the speculative tx polling in RX NAPI. I think we
> >>>> don't need to care in this case since it was not a must for correctness.
> >>> As long as the txq lock is held that will be a noop, anyway. The other
> >>> concurrent action is skb_xmit_done. It looks correct to me, but need
> >>> to think about it a bit. The tricky transition is coming out of napi without
> >>> having >= 2 + MAX_SKB_FRAGS clean descriptors. If the queue is
> >>> stopped it may deadlock transmission in no-napi mode.
> >> Yes, maybe we can enable tx queue when napi weight is zero in
> >> virtnet_poll_tx().
> > Yes, that precaution should resolve that edge case.
> >
>
> I've done a stress test and it passes. The test contains:
>
> - vm with 2 queues
> - a bash script to enable and disable tx napi
> - two netperf UDP_STREAM sessions to send small packets
Great. That matches my results. Do you want to send the v2?
* Re: [PATCH 4.4 18/31] r8152: napi hangup fix after disconnect
From: Ben Hutchings @ 2018-09-12 18:54 UTC (permalink / raw)
To: Jiri Slaby, linux-usb, netdev, David S. Miller
Cc: stable, Greg Kroah-Hartman, LKML
In-Reply-To: <3b118091-d600-0be8-3204-c0f794ef6288@suse.cz>
On Sat, 2018-08-25 at 09:43 +0200, Jiri Slaby wrote:
> On 08/24/2018, 06:38 PM, Ben Hutchings wrote:
> > On Fri, 2018-07-20 at 14:13 +0200, Greg Kroah-Hartman wrote:
> > > 4.4-stable review patch. If anyone has any objections, please let me know.
> > >
> > > ------------------
> > >
> > > From: Jiri Slaby <jslaby@suse.cz>
> > >
> > > [ Upstream commit 0ee1f4734967af8321ecebaf9c74221ace34f2d5 ]
> >
> > [...]
> > > --- a/drivers/net/usb/r8152.c
> > > +++ b/drivers/net/usb/r8152.c
> > > @@ -3139,7 +3139,8 @@ static int rtl8152_close(struct net_devi
> > > #ifdef CONFIG_PM_SLEEP
> > > unregister_pm_notifier(&tp->pm_notifier);
> > > #endif
> > > - napi_disable(&tp->napi);
> > > + if (!test_bit(RTL8152_UNPLUG, &tp->flags))
> > > + napi_disable(&tp->napi);
> > > clear_bit(WORK_ENABLE, &tp->flags);
> > > usb_kill_urb(tp->intr_urb);
> > > cancel_delayed_work_sync(&tp->schedule);
> >
> > This flag appears to be set only if the USB device is actually
> > disconnected. In case the driver is unbound for some other reason
> > (like the module is removed), the same problem will occur.
>
> Could you elaborate? I thought this would happen:
> module_exit -> usb_deregister -> usb_unbind_device -> rtl8152_disconnect
> -> unregister_netdev -> rtl8152_close
>
> Am I missing something?
What I mean is that if the USB device has not been *physically*
disconnected then its usb_device::state will not be
USB_STATE_NOTATTACHED. So rtl8152_disconnect() will not set the
RTL8152_UNPLUG flag and rtl8152_close() will still call napi_disable()
which will hang.
Some options to fix this:
- Add a separate flag which rtl8152_close() checks and
rtl8152_disconnect() always sets
- Call dev_close() before netif_napi_del()
Ben.
--
Ben Hutchings, Software Developer Codethink Ltd
https://www.codethink.co.uk/ Dale House, 35 Dale Street
Manchester, M1 2HF, United Kingdom
* Re: [net-next, v2, 1/2] net: stmmac: Rework coalesce timer and fix multi-queue races
From: Neil Armstrong @ 2018-09-12 13:50 UTC (permalink / raw)
To: Jose Abreu, netdev
Cc: Jerome Brunet, Martin Blumenstingl, David S. Miller, Joao Pinto,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <49853fcd-dac6-147b-736e-1cd2bd5924e5@synopsys.com>
Hi Jose,
On 11/09/2018 10:17, Jose Abreu wrote:
> On 10-09-2018 19:15, Neil Armstrong wrote:
>>
>> RX is still ok but now TX fails almost immediately...
>>
>> With 100ms report :
>>
>> $ iperf3 -c 192.168.1.47 -t 0 -p 5202 -R -i 0.1
>> Connecting to host 192.168.1.47, port 5202
>> Reverse mode, remote host 192.168.1.47 is sending
>> [ 4] local 192.168.1.45 port 45900 connected to 192.168.1.47 port 5202
>> [ ID] Interval Transfer Bandwidth
>> [ 4] 0.00-0.10 sec 10.9 MBytes 913 Mbits/sec
>> [ 4] 0.10-0.20 sec 11.0 MBytes 923 Mbits/sec
>> [ 4] 0.20-0.30 sec 6.34 MBytes 532 Mbits/sec
>> [ 4] 0.30-0.40 sec 0.00 Bytes 0.00 bits/sec
>> [ 4] 0.40-0.50 sec 0.00 Bytes 0.00 bits/sec
>> [ 4] 0.50-0.60 sec 0.00 Bytes 0.00 bits/sec
>> [ 4] 0.60-0.70 sec 0.00 Bytes 0.00 bits/sec
>> [ 4] 0.70-0.80 sec 0.00 Bytes 0.00 bits/sec
>> [ 4] 0.80-0.90 sec 0.00 Bytes 0.00 bits/sec
>> [ 4] 0.90-1.00 sec 0.00 Bytes 0.00 bits/sec
>> [ 4] 1.00-1.10 sec 0.00 Bytes 0.00 bits/sec
>> ^C[ 4] 1.10-1.10 sec 0.00 Bytes 0.00 bits/sec
>> - - - - - - - - - - - - - - - - - - - - - - - - -
>> [ ID] Interval Transfer Bandwidth
>> [ 4] 0.00-1.10 sec 0.00 Bytes 0.00 bits/sec sender
>> [ 4] 0.00-1.10 sec 28.2 MBytes 214 Mbits/sec receiver
>> iperf3: interrupt - the client has terminated
>>
>> Neil
>
> Ok, here goes another incremental patch. If this doesn't work can
> you please send me a link to the spec of the board you are using ?
Sorry for the delay...
Not better, sorry.
$ iperf3 -c 10.1.3.201 -p 5202 -R -t 0
Connecting to host 10.1.3.201, port 5202
Reverse mode, remote host 10.1.3.201 is sending
[ 4] local 10.1.2.12 port 60612 connected to 10.1.3.201 port 5202
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 110 MBytes 920 Mbits/sec
[ 4] 1.00-2.00 sec 110 MBytes 926 Mbits/sec
[ 4] 2.00-3.00 sec 1.94 MBytes 16.3 Mbits/sec
[ 4] 3.00-4.00 sec 0.00 Bytes 0.00 bits/sec
^C[ 4] 4.00-4.76 sec 0.00 Bytes 0.00 bits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-4.76 sec 0.00 Bytes 0.00 bits/sec sender
[ 4] 0.00-4.76 sec 222 MBytes 391 Mbits/sec receiver
iperf3: interrupt - the client has terminated
The board is the Amlogic S400 with the A113D SoC, sorry there is no public spec for this board and for this SoC.
Neil
>
> Thanks and Best Regards,
> Jose Miguel Abreu
>
* Re: [PATCH net-next v2 2/2] net: stmmac: Fixup the tail addr setting in xmit path
From: Jose Abreu @ 2018-09-12 14:17 UTC (permalink / raw)
To: Florian Fainelli, Jose Abreu, netdev
Cc: David S. Miller, Joao Pinto, Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <ea85407f-c1b3-4070-fd8b-bca55cb1962c@gmail.com>
Hi Florian,
On 10-09-2018 19:46, Florian Fainelli wrote:
>
> Can you include the appropriate Fixes tag here so this can easily be
> backported to relevant stable branches?
Well, I guess the problem has been there forever, but it can only have
a major impact on xgmac2 operation; the rest should be okay.
I didn't add a Fixes tag because xgmac2 was merged quite recently
... Will add in next version.
Thanks and Best Regards,
Jose Miguel Abreu
* Re: [PATCH net-next v2 1/2] net: stmmac: Rework coalesce timer and fix multi-queue races
From: Jose Abreu @ 2018-09-12 14:23 UTC (permalink / raw)
To: Florian Fainelli, Jose Abreu, netdev, Tal Gilboa
Cc: Jerome Brunet, Martin Blumenstingl, David S. Miller, Joao Pinto,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <f0288def-ecf3-851b-fe81-12f0f79a061d@gmail.com>
Hi Florian,
Thanks for your input.
On 10-09-2018 20:22, Florian Fainelli wrote:
> On 09/10/2018 02:14 AM, Jose Abreu wrote:
>> This follows David Miller's advice and tries to fix coalesce timer in
>> multi-queue scenarios.
>>
>> We are now using per-queue coalesce values and per-queue TX timer.
>>
>> The coalesce timer default value was changed to 1ms and the coalesce frames
>> to 25.
>>
>> Tested in B2B setup between XGMAC2 and GMAC5.
> Why not revert the entire feature for this merge window and work on
> getting it to work over the next weeks/merge windows?
It was already reverted but the performance drops a little bit
(not that much but I'm trying to fix it).
>
> The idea of using a timer to coalesce TX path when there is not a HW
> timer is a good idea and if this is made robust enough, you could even
> promote that as being a network stack library/feature that could be used
> by other drivers. In fact, this could be a great addition to the net DIM
> library (Tal, what do you think?)
>
> Here's a quick drive by review of things that appear wrong in the
> current driver (without your patches):
>
> - in stmmac_xmit(), in case we hit the !is_jumbo branch and we fail the
> DMA mapping, there is no timer cancellation, don't we want to abort the
> whole transmission?
I don't think this is needed because then tx pointer will not
advance and in stmmac_tx_clean we just won't perform any work.
Besides, we can have a pending timer from previous packets
running so canceling it can cause some problems.
>
> - stmmac_tx_clean() should probably use netif_lock_bh() to guard against
> the timer (soft IRQ context) and the NAPI context (also soft IRQ)
> running in parallel on two different CPUs. This may not explain all
> problems, but these two things are fundamentally exclusive, because the
> timer is meant to emulate the interrupt after N packets, while NAPI
> executes when such a thing did actually occur
Ok, and now I'm also using __netif_tx_lock_bh(queue) to just lock
per queue instead of the whole TX.
>
> - stmmac_poll() should cancel pending timer(s) if it was able to reclaim
> packets, likewise stmmac_tx_timer() should re-enable TX interrupts if it
> reclaimed packets, since TX interrupts could have been left disabled
> from a prior NAPI run. These could be considered optimizations, since
> you could leave the TX timer running all the time, just adjust the
> deadline (based on line rate, MTU, IPG, number of fragments and their
> respective length), worst case, both NAPI and the timer clean up your TX
> ring, so you should always have room to push more packets
In next version I'm dropping the direct call to stmmac_tx_clean()
in the timer function and just scheduling NAPI instead.
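
The shape of that change, as a single-threaded userspace model: the timer callback no longer reclaims anything itself, it only requests a poll, so cleanup has exactly one owner. `demo_txq`, `demo_tx_timer` and `demo_poll` are hypothetical names for illustration; real locking and the NAPI API are deliberately left out.

```c
#include <stdbool.h>

struct demo_txq {
	bool napi_scheduled;	/* a poll is pending for this queue */
	unsigned int dirty;	/* completed descriptors awaiting cleanup */
	unsigned int cleaned;	/* total descriptors reclaimed so far */
};

/*
 * Timer callback: do no cleanup here, only request a poll.
 * Returns true if the poll was newly scheduled.
 */
static bool demo_tx_timer(struct demo_txq *q)
{
	if (q->napi_scheduled)
		return false;
	q->napi_scheduled = true;
	return true;
}

/* Poll: the only place that reclaims descriptors, up to budget. */
static unsigned int demo_poll(struct demo_txq *q, unsigned int budget)
{
	unsigned int n = q->dirty < budget ? q->dirty : budget;

	q->dirty -= n;
	q->cleaned += n;
	if (q->dirty == 0)
		q->napi_scheduled = false;	/* work done, allow rescheduling */
	return n;
}
```

With only the poll touching the ring, the timer and the interrupt path cannot race each other inside the cleanup code; they can only both ask for the same poll.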
Thanks and Best Regards,
Jose Miguel Abreu
* Re: [RFC v2 1/2] netlink: add NLA_REJECT policy type
From: Michal Kubecek @ 2018-09-12 19:29 UTC (permalink / raw)
To: Johannes Berg
Cc: David Miller, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1536777285.3678.28.camel-cdvu00un1VgdHxzADdlk8Q@public.gmane.org>
On Wed, Sep 12, 2018 at 08:34:45PM +0200, Johannes Berg wrote:
> It wouldn't be hard to reject attributes that are higher than maxtype -
> we already pass that to nla_parse() wherever we call it, but we'd have
> to find a way to make it optional I guess, for compatibility reasons.
> Perhaps with a warning, like attribute validation. For genetlink, a flag
> in the family (something like "strict attribute validation") would be
> easy, but for "netlink proper" we have a lot of nlmsg_parse() calls to
> patch, and/or replace by nlmsg_parse_strict().
>
> I guess we should
>
> 1) implement nlmsg_parse_strict() for those new things that want it
> strictly - greenfield type stuff that doesn't need to work with
> existing applications
>
> 2) add a warning to nlmsg_parse() when a too high attribute is
> encountered
>
> 3) eventually replace nlmsg_parse() calls by nlmsg_parse_strict() and
> see what breaks? :-) We won't be able to rely on that any time soon
> though (unless userspace first checks with a guaranteed rejected
> attribute, e.g. one that has NLA_REJECT, perhaps the u64 pad
> attributes could be marked such since the kernel can't assume
> alignment anyway)
I'm not so sure we (eventually) want to reject unknown attributes
everywhere. I don't have any concrete example in mind but I think there
will be use cases where we want to ignore unrecognized attributes
(probably per parse call). But it makes sense to require caller to
explicitly declare this is the case.
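
Roughly what I have in mind, as a userspace sketch only: `demo_parse_types` and its `allow_unknown` argument are made up for illustration and do not correspond to any existing or proposed netlink function. Strict rejection is the default and the caller has to opt in, per call, to skipping attributes above maxtype.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Check a list of attribute types against maxtype. With allow_unknown the
 * caller explicitly opts in to skipping attributes above maxtype; without
 * it, the first such attribute fails the whole parse with -EINVAL.
 * *accepted counts the attributes that were within range.
 */
static int demo_parse_types(const int *types, size_t n, int maxtype,
			    bool allow_unknown, size_t *accepted)
{
	size_t i;

	*accepted = 0;
	for (i = 0; i < n; i++) {
		if (types[i] > maxtype) {
			if (!allow_unknown)
				return -EINVAL;
			continue;	/* caller asked to ignore it */
		}
		(*accepted)++;
	}
	return 0;
}
```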
> While we're talking about wishlist, I'm also toying with the idea of
> having some sort of generic mechanism to convert netlink attributes
> to/from structs, for internal kernel representation; so far though I
> haven't been able to come up with anything useful.
I was also thinking about something like this. One motivation was to
design extensible version of ethtool_ops, the other was allowing complex
data types (structures, arrays) for ethtool tunables. But I have nothing
more than a vague idea so far.
Michal Kubecek
* Re: [net-next, v2, 1/2] net: stmmac: Rework coalesce timer and fix multi-queue races
From: Jose Abreu @ 2018-09-12 14:25 UTC (permalink / raw)
To: Neil Armstrong, Jose Abreu, netdev
Cc: Jerome Brunet, Martin Blumenstingl, David S. Miller, Joao Pinto,
Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <949c5c50-eb73-c02f-14a7-c3956d4847f2@baylibre.com>
Hi Neil,
On 12-09-2018 14:50, Neil Armstrong wrote:
> Hi Jose,
>
> On 11/09/2018 10:17, Jose Abreu wrote:
>> On 10-09-2018 19:15, Neil Armstrong wrote:
>>> RX is still ok but now TX fails almost immediately...
>>>
>>> With 100ms report :
>>>
>>> $ iperf3 -c 192.168.1.47 -t 0 -p 5202 -R -i 0.1
>>> Connecting to host 192.168.1.47, port 5202
>>> Reverse mode, remote host 192.168.1.47 is sending
>>> [ 4] local 192.168.1.45 port 45900 connected to 192.168.1.47 port 5202
>>> [ ID] Interval Transfer Bandwidth
>>> [ 4] 0.00-0.10 sec 10.9 MBytes 913 Mbits/sec
>>> [ 4] 0.10-0.20 sec 11.0 MBytes 923 Mbits/sec
>>> [ 4] 0.20-0.30 sec 6.34 MBytes 532 Mbits/sec
>>> [ 4] 0.30-0.40 sec 0.00 Bytes 0.00 bits/sec
>>> [ 4] 0.40-0.50 sec 0.00 Bytes 0.00 bits/sec
>>> [ 4] 0.50-0.60 sec 0.00 Bytes 0.00 bits/sec
>>> [ 4] 0.60-0.70 sec 0.00 Bytes 0.00 bits/sec
>>> [ 4] 0.70-0.80 sec 0.00 Bytes 0.00 bits/sec
>>> [ 4] 0.80-0.90 sec 0.00 Bytes 0.00 bits/sec
>>> [ 4] 0.90-1.00 sec 0.00 Bytes 0.00 bits/sec
>>> [ 4] 1.00-1.10 sec 0.00 Bytes 0.00 bits/sec
>>> ^C[ 4] 1.10-1.10 sec 0.00 Bytes 0.00 bits/sec
>>> - - - - - - - - - - - - - - - - - - - - - - - - -
>>> [ ID] Interval Transfer Bandwidth
>>> [ 4] 0.00-1.10 sec 0.00 Bytes 0.00 bits/sec sender
>>> [ 4] 0.00-1.10 sec 28.2 MBytes 214 Mbits/sec receiver
>>> iperf3: interrupt - the client has terminated
>>>
>>> Neil
>> Ok, here goes another incremental patch. If this doesn't work can
>> you please send me a link to the spec of the board you are using ?
> Sorry for the delay...
>
> Not better, sorry.
>
> $ iperf3 -c 10.1.3.201 -p 5202 -R -t 0
> Connecting to host 10.1.3.201, port 5202
> Reverse mode, remote host 10.1.3.201 is sending
> [ 4] local 10.1.2.12 port 60612 connected to 10.1.3.201 port 5202
> [ ID] Interval Transfer Bandwidth
> [ 4] 0.00-1.00 sec 110 MBytes 920 Mbits/sec
> [ 4] 1.00-2.00 sec 110 MBytes 926 Mbits/sec
> [ 4] 2.00-3.00 sec 1.94 MBytes 16.3 Mbits/sec
> [ 4] 3.00-4.00 sec 0.00 Bytes 0.00 bits/sec
> ^C[ 4] 4.00-4.76 sec 0.00 Bytes 0.00 bits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bandwidth
> [ 4] 0.00-4.76 sec 0.00 Bytes 0.00 bits/sec sender
> [ 4] 0.00-4.76 sec 222 MBytes 391 Mbits/sec receiver
> iperf3: interrupt - the client has terminated
>
> The board is the Amlogic S400 with the A113D SoC, sorry there is no public spec for this board and for this SoC.
Thanks for testing. I will send a new version with major
differences; if you could validate it, that would be great.
Thanks and Best Regards,
Jose Miguel Abreu
>
> Neil
>
>> Thanks and Best Regards,
>> Jose Miguel Abreu
>>
^ permalink raw reply
* Re: kernels > v4.12 oops/crash with ipsec-traffic: bisected to b838d5e1c5b6e57b10ec8af2268824041e3ea911: ipv4: mark DST_NOGC and remove the operation of dst_free()
From: Tobias Hommel @ 2018-09-12 15:18 UTC (permalink / raw)
To: Steffen Klassert
Cc: Wolfgang Walter, Kristian Evensen, Network Development, weiwan,
edumazet
In-Reply-To: <20180912085046.GZ23674@gauss3.secunet.de>
[-- Attachment #1: Type: text/plain, Size: 2514 bytes --]
On Wed, Sep 12, 2018 at 10:50:46AM +0200, Steffen Klassert wrote:
> On Tue, Sep 11, 2018 at 09:02:48PM +0200, Tobias Hommel wrote:
> > > > Subject: [PATCH RFC] xfrm: Fix NULL pointer dereference when skb_dst_force
> > > > clears the dst_entry.
> > > >
> > > > Since commit 222d7dbd258d ("net: prevent dst uses after free")
> > > > skb_dst_force() might clear the dst_entry attached to the skb.
> > > > The xfrm code doesn't expect this to happen, so we crash with
> > > > a NULL pointer dereference in this case. Fix it by checking
> > > > skb_dst(skb) for NULL after skb_dst_force() and drop the packet
> > > > in case the dst_entry was cleared.
> > > >
> > > > Fixes: 222d7dbd258d ("net: prevent dst uses after free")
> > > > Reported-by: Tobias Hommel <netdev-list@genoetigt.de>
> > > > Reported-by: Kristian Evensen <kristian.evensen@gmail.com>
> > > > Reported-by: Wolfgang Walter <linux@stwm.de>
> > > > Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> > > > ---
> > >
> > > This patch fixes the problem here.
> > >
> > > XfrmFwdHdrError gets around 80 at the very beginning and remains so. Probably
> > > this happens when some routes are changed/set then.
> > >
> > > Regards and thanks,
> >
> > Same here, we're now running stable for ~6 hours, XfrmFwdHdrError is at 220.
> > This is less than 1 lost packet per minute, which seems to be okay for now.
>
> Thanks a lot for testing! This is now applied to the ipsec tree.
After running for about 24 hours, I now encountered another panic. This time it
is caused by an out of memory situation. Although the trace shows action in the
filesystem code I'm posting it here because I cannot isolate the error and
maybe it is caused by our NULL pointer bug or by the new fix.
I do not have a serial console attached, so I could only attach a screenshot of
the panic to this mail.
I am running v4.19-rc3 from git with the above mentioned patch applied.
After 19 hours everything still looked fine, XfrmFwdHdrError value was at ~950.
Overall memory usage shown by htop was at 1.2G/15.6G.
I had htop running via ssh so I was able to see at least some status post
mortem. Uptime: 23:50:57
Overall memory usage was at 10.2G/15.6G and user processes were just
using the usual amount of memory, so it looks like the kernel was eating up at
least 9G of RAM.
Maybe this information is not very helpful for debugging, but it is at least a
warning that something might still be wrong.
I'll try to gather some more information and keep you updated.
[-- Attachment #2: oom_panic.png --]
[-- Type: image/png, Size: 56627 bytes --]
^ permalink raw reply
* [GIT] Networking
From: David Miller @ 2018-09-12 20:29 UTC (permalink / raw)
To: torvalds; +Cc: akpm, netdev, linux-kernel
1) Fix up several Kconfig dependencies in netfilter, from Martin Willi and
Florian Westphal.
2) Memory leak in be2net driver, from Petr Oros.
3) Memory leak in E-Switch handling of mlx5 driver, from Raed Salem.
4) mlx5_attach_interface needs to check for errors, from Huy Nguyen.
5) tipc_release() needs to orphan the sock, from Cong Wang.
6) Need to program TxConfig register after TX/RX is enabled in r8169
driver, not beforehand, from Maciej S. Szmigiero.
7) Handle 64K PAGE_SIZE properly in ena driver, from Netanel Belgazal.
8) Fix crash regression in ip_do_fragment(), from Taehee Yoo.
9) syzbot can create conditions where kernel log is flooded with
synflood warnings due to creation of many listening sockets,
fix that. From Willem de Bruijn.
10) Fix RCU issues in rds socket layer, from Cong Wang.
11) Fix vlan matching in nfp driver, from Pieter Jansen van Vuuren.
Please pull, thanks a lot!
The following changes since commit 28619527b8a712590c93d0a9e24b4425b9376a8c:
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2018-09-04 12:45:11 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
for you to fetch changes up to 4851bfd64d42d9fb6d2d30a41c8523614b412a7a:
Merge branch 'nfp-flower-fixes' (2018-09-12 13:18:30 -0700)
----------------------------------------------------------------
Cong Wang (6):
tipc: orphan sock in tipc_release()
tipc: call start and done ops directly in __tipc_nl_compat_dumpit()
net_sched: properly cancel netlink dump on failure
netfilter: xt_hashlimit: use s->file instead of s->private
rds: fix two RCU related problems
tipc: check return value of __tipc_dump_start()
Daniel Jurgens (1):
net/mlx5: Consider PCI domain in search for next dev
David S. Miller (6):
Merge branch 'iucv-fixes'
Merge tag 'mlx5e-fixes-2018-09-05' of git://git.kernel.org/.../saeed/linux
Merge branch 'ena-fixes'
Merge git://git.kernel.org/.../pablo/nf
Merge branch 'qeth-fixes'
Merge branch 'nfp-flower-fixes'
Davide Caratti (1):
net/sched: fix memory leak in act_tunnel_key_init()
Florian Westphal (5):
netfilter: xt_checksum: ignore gso skbs
netfilter: conntrack: place 'new' timeout in first location too
netfilter: nf_tables: rework ct timeout set support
netfilter: kconfig: nat related expression depend on nftables core
netfilter: conntrack: reset tcp maxwin on re-register
Haishuang Yan (2):
erspan: return PACKET_REJECT when the appropriate tunnel is not found
erspan: fix error handling for erspan tunnel
Hauke Mehrtens (1):
MIPS: lantiq: dma: add dev pointer
Huy Nguyen (1):
net/mlx5: Check for error in mlx5_attach_interface
Jack Morgenstein (2):
net/mlx5: Fix use-after-free in self-healing flow
net/mlx5: Fix debugfs cleanup in the device init/remove flow
Juergen Gross (1):
xen/netfront: fix waiting for xenbus state change
Julian Wiedmann (6):
net/af_iucv: drop inbound packets with invalid flags
net/af_iucv: fix skb handling on HiperTransport xmit error
net/iucv: declare iucv_path_table_empty() as static
s390/qeth: indicate error when netdev allocation fails
s390/qeth: switch on SG by default for IQD devices
s390/qeth: don't dump past end of unknown HW header
Kai-Heng Feng (1):
r8169: Clear RTL_FLAG_TASK_*_PENDING when clearing RTL_FLAG_TASK_ENABLED
Kristian Evensen (1):
qmi_wwan: Support dynamic config on Quectel EP06
Kuninori Morimoto (1):
ethernet: renesas: convert to SPDX identifiers
Louis Peens (1):
nfp: flower: reject tunnel encap with ipv6 outer headers for offloading
Maciej S. Szmigiero (1):
r8169: set TxConfig register after TX / RX is enabled, just like RxConfig
Martin Willi (1):
netfilter: xt_cluster: add dependency on conntrack module
Michal 'vorner' Vaner (1):
netfilter: nfnetlink_queue: Solve the NFQUEUE/conntrack clash for NF_REPEAT
Netanel Belgazal (7):
net: ena: fix surprise unplug NULL dereference kernel crash
net: ena: fix driver when PAGE_SIZE == 64kB
net: ena: fix device destruction to gracefully free resources
net: ena: fix potential double ena_destroy_device()
net: ena: fix missing lock during device destruction
net: ena: fix missing calls to READ_ONCE
net: ena: fix incorrect usage of memory barriers
Pablo Neira Ayuso (2):
netfilter: conntrack: timeout interface depend on CONFIG_NF_CONNTRACK_TIMEOUT
netfilter: cttimeout: ctnl_timeout_find_get() returns incorrect pointer to type
Petr Machata (1):
mlxsw: spectrum_buffers: Set up a dedicated pool for BUM traffic
Petr Oros (1):
be2net: Fix memory leak in be_cmd_get_profile_config()
Pieter Jansen van Vuuren (1):
nfp: flower: fix vlan match by checking both vlan id and vlan pcp
Raed Salem (1):
net/mlx5: E-Switch, Fix memory leak when creating switchdev mode FDB tables
Roi Dayan (2):
net/mlx5: Fix not releasing read lock when adding flow rules
net/mlx5: Fix possible deadlock from lockdep when adding fte to fg
Saeed Mahameed (1):
net/mlx5e: Ethtool steering, fix udp source port value
Stefan Wahren (1):
net: qca_spi: Fix race condition in spi transfers
Taehee Yoo (2):
netfilter: nf_tables: release chain in flushing set
ip: frags: fix crash in ip_do_fragment()
Tariq Toukan (2):
net/mlx5: Use u16 for Work Queue buffer fragment size
net/mlx5: Use u16 for Work Queue buffer strides offset
Vakul Garg (1):
net/tls: Set count of SG entries if sk_alloc_sg returns -ENOSPC
Vincent Whitchurch (1):
tcp: really ignore MSG_ZEROCOPY if no SO_ZEROCOPY
Wenjia Zhang (1):
s390/qeth: use vzalloc for QUERY OAT buffer
Willem de Bruijn (1):
tcp: rate limit synflood warnings further
Yue Haibing (1):
netfilter: conntrack: remove duplicated include from nf_conntrack_proto_udp.c
arch/mips/include/asm/mach-lantiq/xway/xway_dma.h | 1 +
arch/mips/lantiq/xway/dma.c | 4 +--
drivers/net/ethernet/amazon/ena/ena_com.c | 24 ++++++++---------
drivers/net/ethernet/amazon/ena/ena_eth_com.c | 6 +++++
drivers/net/ethernet/amazon/ena/ena_eth_com.h | 8 ++----
drivers/net/ethernet/amazon/ena/ena_netdev.c | 82 ++++++++++++++++++++++++++------------------------------
drivers/net/ethernet/amazon/ena/ena_netdev.h | 11 ++++++++
drivers/net/ethernet/emulex/benet/be_cmds.c | 2 +-
drivers/net/ethernet/lantiq_etop.c | 1 +
drivers/net/ethernet/mellanox/mlx5/core/dev.c | 22 ++++++++++------
drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 1 +
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 76 ++++++++++++++++++++++++++--------------------------
drivers/net/ethernet/mellanox/mlx5/core/health.c | 10 ++++++-
drivers/net/ethernet/mellanox/mlx5/core/main.c | 12 +++++----
drivers/net/ethernet/mellanox/mlx5/core/wq.c | 6 ++---
drivers/net/ethernet/mellanox/mlx5/core/wq.h | 2 +-
drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c | 16 +++++------
drivers/net/ethernet/netronome/nfp/flower/action.c | 6 +++++
drivers/net/ethernet/netronome/nfp/flower/main.h | 1 +
drivers/net/ethernet/netronome/nfp/flower/match.c | 2 +-
drivers/net/ethernet/netronome/nfp/flower/offload.c | 11 ++++++++
drivers/net/ethernet/qualcomm/qca_7k.c | 76 +++++++++++++++++++++++++---------------------------
drivers/net/ethernet/qualcomm/qca_spi.c | 110 +++++++++++++++++++++++++++++++++++++++-------------------------------------
drivers/net/ethernet/qualcomm/qca_spi.h | 5 ----
drivers/net/ethernet/realtek/r8169.c | 11 +++++---
drivers/net/ethernet/renesas/Kconfig | 1 +
drivers/net/ethernet/renesas/Makefile | 1 +
drivers/net/ethernet/renesas/ravb_ptp.c | 6 +----
drivers/net/usb/qmi_wwan.c | 30 ++++++++++++++++++++-
drivers/net/xen-netfront.c | 24 +++++++----------
drivers/s390/net/qeth_core_main.c | 11 +++++---
drivers/s390/net/qeth_l2_main.c | 2 +-
drivers/s390/net/qeth_l3_main.c | 2 +-
include/linux/mlx5/driver.h | 8 +++---
include/net/netfilter/nf_conntrack_timeout.h | 2 +-
net/core/skbuff.c | 3 ---
net/ipv4/ip_fragment.c | 1 +
net/ipv4/ip_gre.c | 5 ++++
net/ipv4/netfilter/Kconfig | 8 +++---
net/ipv4/tcp.c | 2 +-
net/ipv4/tcp_input.c | 4 +--
net/ipv6/netfilter/nf_conntrack_reasm.c | 1 +
net/iucv/af_iucv.c | 38 +++++++++++++++++---------
net/iucv/iucv.c | 2 +-
net/netfilter/Kconfig | 12 ++++-----
net/netfilter/nf_conntrack_proto.c | 26 ++++++++++++++++++
net/netfilter/nf_conntrack_proto_dccp.c | 19 ++++++++-----
net/netfilter/nf_conntrack_proto_generic.c | 8 +++---
net/netfilter/nf_conntrack_proto_gre.c | 8 +++---
net/netfilter/nf_conntrack_proto_icmp.c | 8 +++---
net/netfilter/nf_conntrack_proto_icmpv6.c | 8 +++---
net/netfilter/nf_conntrack_proto_sctp.c | 21 ++++++++++-----
net/netfilter/nf_conntrack_proto_tcp.c | 19 ++++++++-----
net/netfilter/nf_conntrack_proto_udp.c | 21 +++++++--------
net/netfilter/nf_tables_api.c | 1 +
net/netfilter/nfnetlink_cttimeout.c | 6 ++---
net/netfilter/nfnetlink_queue.c | 1 +
net/netfilter/nft_ct.c | 59 ++++++++++++++++++++---------------------
net/netfilter/xt_CHECKSUM.c | 22 +++++++++++++++-
net/netfilter/xt_cluster.c | 14 +++++++++-
net/netfilter/xt_hashlimit.c | 18 ++++++-------
net/rds/bind.c | 5 +++-
net/sched/act_tunnel_key.c | 28 +++++++++++++-------
net/tipc/netlink_compat.c | 5 ++++
net/tipc/socket.c | 18 ++++++++-----
net/tipc/socket.h | 1 +
net/tls/tls_sw.c | 6 +++++
68 files changed, 593 insertions(+), 400 deletions(-)
^ permalink raw reply
* [PATCH net v2 0/3] tls: don't leave keys in kernel memory
From: Sabrina Dubroca @ 2018-09-12 15:44 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Aviad Yehezkel, Boris Pismenny, Dave Watson,
Vakul Garg
There are a few places where the RX/TX key for a TLS socket is copied
to kernel memory. This series clears those memory areas when they're no
longer needed.
v2: add union tls_crypto_context, following Vakul Garg's comment
swap patch 2 and 3, using new union in patch 3
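
For readers who want to see the idea outside the kernel, the following is
a minimal userspace sketch of what the series does: keep the generic
header and the cipher-specific struct in one union, and wipe the whole
union (not just the header) before the memory is released. The struct
layouts and the names crypto_context, zero_explicit and ctx_free are
simplified stand-ins invented for this example, and the volatile
function pointer is only an approximation of what memzero_explicit()
guarantees in the kernel.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the common header: no key material here. */
struct crypto_info {
	unsigned short version;
	unsigned short cipher_type;
};

/* Hypothetical stand-in for the cipher-specific struct: header + secrets. */
struct crypto_info_aes_gcm_128 {
	struct crypto_info info;
	unsigned char iv[8];
	unsigned char key[16];
	unsigned char salt[4];
	unsigned char rec_seq[8];
};

union crypto_context {
	struct crypto_info info;
	struct crypto_info_aes_gcm_128 aes_gcm_128;
};

struct context {
	union crypto_context crypto_send;
	union crypto_context crypto_recv;
};

/*
 * Calling memset() through a volatile function pointer makes it hard for
 * the compiler to prove the store is dead and drop it.
 */
static void *(*const volatile memset_ptr)(void *, int, size_t) = memset;

static void zero_explicit(void *p, size_t n)
{
	memset_ptr(p, 0, n);
}

/* Wipe both directions' crypto state, then free the context. */
static void ctx_free(struct context *ctx)
{
	if (!ctx)
		return;

	zero_explicit(&ctx->crypto_send, sizeof(ctx->crypto_send));
	zero_explicit(&ctx->crypto_recv, sizeof(ctx->crypto_recv));
	free(ctx);
}
```

Note that sizing the wipe by the header type alone would leave the key
bytes untouched, which is why the size of the union is what matters.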
Sabrina Dubroca (3):
tls: don't copy the key out of tls12_crypto_info_aes_gcm_128
tls: zero the crypto information from tls_context before freeing
tls: clear key material from kernel memory when do_tls_setsockopt_conf
fails
include/net/tls.h | 19 +++++++++----------
net/tls/tls_device.c | 6 +++---
net/tls/tls_device_fallback.c | 2 +-
net/tls/tls_main.c | 22 ++++++++++++++++------
net/tls/tls_sw.c | 13 +++++--------
5 files changed, 34 insertions(+), 28 deletions(-)
--
2.18.0
^ permalink raw reply
* [PATCH net v2 1/3] tls: don't copy the key out of tls12_crypto_info_aes_gcm_128
From: Sabrina Dubroca @ 2018-09-12 15:44 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Aviad Yehezkel, Boris Pismenny, Dave Watson,
Vakul Garg
In-Reply-To: <cover.1536766755.git.sd@queasysnail.net>
There's no need to copy the key to an on-stack buffer before calling
crypto_aead_setkey().
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
net/tls/tls_sw.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index e28a6ff25d96..f29b7c49cbf2 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1136,7 +1136,6 @@ void tls_sw_free_resources_rx(struct sock *sk)
int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx)
{
- char keyval[TLS_CIPHER_AES_GCM_128_KEY_SIZE];
struct tls_crypto_info *crypto_info;
struct tls12_crypto_info_aes_gcm_128 *gcm_128_info;
struct tls_sw_context_tx *sw_ctx_tx = NULL;
@@ -1265,9 +1264,7 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx)
ctx->push_pending_record = tls_sw_push_pending_record;
- memcpy(keyval, gcm_128_info->key, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
-
- rc = crypto_aead_setkey(*aead, keyval,
+ rc = crypto_aead_setkey(*aead, gcm_128_info->key,
TLS_CIPHER_AES_GCM_128_KEY_SIZE);
if (rc)
goto free_aead;
--
2.18.0
^ permalink raw reply related
* [PATCH net v2 2/3] tls: zero the crypto information from tls_context before freeing
From: Sabrina Dubroca @ 2018-09-12 15:44 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Aviad Yehezkel, Boris Pismenny, Dave Watson,
Vakul Garg
In-Reply-To: <cover.1536766755.git.sd@queasysnail.net>
This contains key material in crypto_send_aes_gcm_128 and
crypto_recv_aes_gcm_128.
Introduce union tls_crypto_context, and replace the two identical
unions directly embedded in struct tls_context with it. We can then
use this union to clean up the memory in the new tls_ctx_free()
function.
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
v2: introduce union tls_crypto_context
include/net/tls.h | 19 +++++++++----------
net/tls/tls_device.c | 6 +++---
net/tls/tls_device_fallback.c | 2 +-
net/tls/tls_main.c | 20 +++++++++++++++-----
net/tls/tls_sw.c | 8 ++++----
5 files changed, 32 insertions(+), 23 deletions(-)
diff --git a/include/net/tls.h b/include/net/tls.h
index d5c683e8bb22..0a769cf2f5f3 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -171,15 +171,14 @@ struct cipher_context {
char *rec_seq;
};
+union tls_crypto_context {
+ struct tls_crypto_info info;
+ struct tls12_crypto_info_aes_gcm_128 aes_gcm_128;
+};
+
struct tls_context {
- union {
- struct tls_crypto_info crypto_send;
- struct tls12_crypto_info_aes_gcm_128 crypto_send_aes_gcm_128;
- };
- union {
- struct tls_crypto_info crypto_recv;
- struct tls12_crypto_info_aes_gcm_128 crypto_recv_aes_gcm_128;
- };
+ union tls_crypto_context crypto_send;
+ union tls_crypto_context crypto_recv;
struct list_head list;
struct net_device *netdev;
@@ -367,8 +366,8 @@ static inline void tls_fill_prepend(struct tls_context *ctx,
* size KTLS_DTLS_HEADER_SIZE + KTLS_DTLS_NONCE_EXPLICIT_SIZE
*/
buf[0] = record_type;
- buf[1] = TLS_VERSION_MINOR(ctx->crypto_send.version);
- buf[2] = TLS_VERSION_MAJOR(ctx->crypto_send.version);
+ buf[1] = TLS_VERSION_MINOR(ctx->crypto_send.info.version);
+ buf[2] = TLS_VERSION_MAJOR(ctx->crypto_send.info.version);
/* we can use IV for nonce explicit according to spec */
buf[3] = pkt_len >> 8;
buf[4] = pkt_len & 0xFF;
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index 292742e50bfa..961b07d4d41c 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -686,7 +686,7 @@ int tls_set_device_offload(struct sock *sk, struct tls_context *ctx)
goto free_marker_record;
}
- crypto_info = &ctx->crypto_send;
+ crypto_info = &ctx->crypto_send.info;
switch (crypto_info->cipher_type) {
case TLS_CIPHER_AES_GCM_128:
nonce_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
@@ -780,7 +780,7 @@ int tls_set_device_offload(struct sock *sk, struct tls_context *ctx)
ctx->priv_ctx_tx = offload_ctx;
rc = netdev->tlsdev_ops->tls_dev_add(netdev, sk, TLS_OFFLOAD_CTX_DIR_TX,
- &ctx->crypto_send,
+ &ctx->crypto_send.info,
tcp_sk(sk)->write_seq);
if (rc)
goto release_netdev;
@@ -862,7 +862,7 @@ int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx)
goto release_ctx;
rc = netdev->tlsdev_ops->tls_dev_add(netdev, sk, TLS_OFFLOAD_CTX_DIR_RX,
- &ctx->crypto_recv,
+ &ctx->crypto_recv.info,
tcp_sk(sk)->copied_seq);
if (rc) {
pr_err_ratelimited("%s: The netdev has refused to offload this socket\n",
diff --git a/net/tls/tls_device_fallback.c b/net/tls/tls_device_fallback.c
index 6102169239d1..450a6dbc5a88 100644
--- a/net/tls/tls_device_fallback.c
+++ b/net/tls/tls_device_fallback.c
@@ -320,7 +320,7 @@ static struct sk_buff *tls_enc_skb(struct tls_context *tls_ctx,
goto free_req;
iv = buf;
- memcpy(iv, tls_ctx->crypto_send_aes_gcm_128.salt,
+ memcpy(iv, tls_ctx->crypto_send.aes_gcm_128.salt,
TLS_CIPHER_AES_GCM_128_SALT_SIZE);
aad = buf + TLS_CIPHER_AES_GCM_128_SALT_SIZE +
TLS_CIPHER_AES_GCM_128_IV_SIZE;
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 180b6640e531..737b3865be1b 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -241,6 +241,16 @@ static void tls_write_space(struct sock *sk)
ctx->sk_write_space(sk);
}
+static void tls_ctx_free(struct tls_context *ctx)
+{
+ if (!ctx)
+ return;
+
+ memzero_explicit(&ctx->crypto_send, sizeof(ctx->crypto_send));
+ memzero_explicit(&ctx->crypto_recv, sizeof(ctx->crypto_recv));
+ kfree(ctx);
+}
+
static void tls_sk_proto_close(struct sock *sk, long timeout)
{
struct tls_context *ctx = tls_get_ctx(sk);
@@ -294,7 +304,7 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
#else
{
#endif
- kfree(ctx);
+ tls_ctx_free(ctx);
ctx = NULL;
}
@@ -305,7 +315,7 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
* for sk->sk_prot->unhash [tls_hw_unhash]
*/
if (free_ctx)
- kfree(ctx);
+ tls_ctx_free(ctx);
}
static int do_tls_getsockopt_tx(struct sock *sk, char __user *optval,
@@ -330,7 +340,7 @@ static int do_tls_getsockopt_tx(struct sock *sk, char __user *optval,
}
/* get user crypto info */
- crypto_info = &ctx->crypto_send;
+ crypto_info = &ctx->crypto_send.info;
if (!TLS_CRYPTO_INFO_READY(crypto_info)) {
rc = -EBUSY;
@@ -417,9 +427,9 @@ static int do_tls_setsockopt_conf(struct sock *sk, char __user *optval,
}
if (tx)
- crypto_info = &ctx->crypto_send;
+ crypto_info = &ctx->crypto_send.info;
else
- crypto_info = &ctx->crypto_recv;
+ crypto_info = &ctx->crypto_recv.info;
/* Currently we don't support set crypto info more than one time */
if (TLS_CRYPTO_INFO_READY(crypto_info)) {
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index f29b7c49cbf2..9e918489f4fb 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1055,8 +1055,8 @@ static int tls_read_size(struct strparser *strp, struct sk_buff *skb)
goto read_failure;
}
- if (header[1] != TLS_VERSION_MINOR(tls_ctx->crypto_recv.version) ||
- header[2] != TLS_VERSION_MAJOR(tls_ctx->crypto_recv.version)) {
+ if (header[1] != TLS_VERSION_MINOR(tls_ctx->crypto_recv.info.version) ||
+ header[2] != TLS_VERSION_MAJOR(tls_ctx->crypto_recv.info.version)) {
ret = -EINVAL;
goto read_failure;
}
@@ -1180,12 +1180,12 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx)
if (tx) {
crypto_init_wait(&sw_ctx_tx->async_wait);
- crypto_info = &ctx->crypto_send;
+ crypto_info = &ctx->crypto_send.info;
cctx = &ctx->tx;
aead = &sw_ctx_tx->aead_send;
} else {
crypto_init_wait(&sw_ctx_rx->async_wait);
- crypto_info = &ctx->crypto_recv;
+ crypto_info = &ctx->crypto_recv.info;
cctx = &ctx->rx;
aead = &sw_ctx_rx->aead_recv;
}
--
2.18.0
^ permalink raw reply related
* [PATCH net v2 3/3] tls: clear key material from kernel memory when do_tls_setsockopt_conf fails
From: Sabrina Dubroca @ 2018-09-12 15:44 UTC (permalink / raw)
To: netdev
Cc: Sabrina Dubroca, Aviad Yehezkel, Boris Pismenny, Dave Watson,
Vakul Garg
In-Reply-To: <cover.1536766755.git.sd@queasysnail.net>
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
---
v2: use the new union tls_crypto_context
net/tls/tls_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 737b3865be1b..523622dc74f8 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -509,7 +509,7 @@ static int do_tls_setsockopt_conf(struct sock *sk, char __user *optval,
goto out;
err_crypto_info:
- memset(crypto_info, 0, sizeof(*crypto_info));
+ memzero_explicit(crypto_info, sizeof(union tls_crypto_context));
out:
return rc;
}
--
2.18.0
^ permalink raw reply related