* Re: [RFC, PATCH] net: page_pool: Don't use page->private to store dma_addr_t
From: Alexander Duyck @ 2019-02-12 18:13 UTC (permalink / raw)
To: Eric Dumazet
Cc: Tariq Toukan, Ilias Apalodimas, Matthew Wilcox, brouer@redhat.com,
David Miller, toke@redhat.com, netdev@vger.kernel.org,
mgorman@techsingularity.net, linux-mm@kvack.org
In-Reply-To: <d8fa6786-c252-6bb0-409f-42ce18127cb3@gmail.com>
On Tue, Feb 12, 2019 at 7:16 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 02/12/2019 04:39 AM, Tariq Toukan wrote:
> >
> >
> > On 2/11/2019 7:14 PM, Eric Dumazet wrote:
> >>
> >>
> >> On 02/11/2019 12:53 AM, Tariq Toukan wrote:
> >>>
> >>
> >>> Hi,
> >>>
> >>> It's great to use the struct page to store its dma mapping, but I am
> >>> worried about extensibility.
> >>> page_pool is evolving, and it would need several more per-page fields.
> >>> One of them would be pageref_bias, a planned optimization to reduce the
> >>> number of the costly atomic pageref operations (and replace existing
> >>> code in several drivers).
> >>>
> >>
> >> But the point about pageref_bias is to place it in a different cache line than "struct page"
> >>
> >> The major cost is having a cache line bouncing between producer and consumer.
> >>
> >
> > pageref_bias is meant to be dirtied only by the page requester, i.e. the
> > NIC driver / page_pool.
> > All other components (basically, SKB release flow / put_page) should
> > continue working with the atomic page_refcnt, and not dirty the
> > pageref_bias.
>
> This is exactly my point.
>
> You suggested to put pageref_bias in struct page, which breaks this completely.
>
> pageref_bias is better kept in a driver structure, with appropriate prefetching
> since most NIC use a ring buffer for their queues.
>
> The dma address _can_ be put in the struct page, since the driver does not dirty it
> and does not even read it when page can be recycled.
Instead of maintaining the pageref_bias in the page itself it could be
maintained in some sort of separate structure. You could just maintain
a pointer to a slot in an array somewhere. Then you can still access
it if needed, the pointer would be static for as long as it is in the
page pool, and you could invalidate the pointer prior to removing the
bias from the page.
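
A minimal userspace sketch of the idea above, assuming a fixed-size slot array owned by the pool. Every name here (fake_page, fake_pool, pool_attach and so on) is hypothetical and is not part of page_pool or of any driver; the plain int refcount only stands in for the atomic page refcount.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SLOTS 4	/* hypothetical pool size */

struct fake_page {
	int refcount;		/* stands in for the atomic page refcount */
	uint32_t *bias_slot;	/* points into pool->bias[]; NULL when not pooled */
};

struct fake_pool {
	/* Kept apart from the page itself, dirtied only by the driver. */
	uint32_t bias[POOL_SLOTS];
};

/* Charge the shared refcount once and keep the bias in the pool's slot. */
static void pool_attach(struct fake_pool *pool, struct fake_page *pg,
			size_t idx, uint32_t bias)
{
	pg->refcount += (int)bias;
	pool->bias[idx] = bias;
	pg->bias_slot = &pool->bias[idx];
}

/* Hand out one reference by consuming bias; the refcount is not touched. */
static int pool_take_ref(struct fake_page *pg)
{
	if (!pg->bias_slot || *pg->bias_slot == 0)
		return -1;
	(*pg->bias_slot)--;
	return 0;
}

/* Invalidate the pointer first, then remove the unused bias from the page. */
static void pool_detach(struct fake_page *pg)
{
	uint32_t unused;

	if (!pg->bias_slot)
		return;
	unused = *pg->bias_slot;
	pg->bias_slot = NULL;
	pg->refcount -= (int)unused;
}
```

The slot pointer stays stable for as long as the page sits in the pool, so the hot path only writes to the pool's own memory.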
^ permalink raw reply
* Re: [PATCH net-next 00/10] s390/qeth: updates 2019-02-12
From: David Miller @ 2019-02-12 18:14 UTC (permalink / raw)
To: jwi; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun
In-Reply-To: <20190212173325.36555-1-jwi@linux.ibm.com>
From: Julian Wiedmann <jwi@linux.ibm.com>
Date: Tue, 12 Feb 2019 18:33:15 +0100
> please apply one more round of qeth patches to net-next. This
> series targets the driver's control paths. It primarily brings
> improvements to the error handling for sent cmds and received
> responses, along with the usual cleanup and consolidation efforts.
Series applied, thanks.
^ permalink raw reply
* Re: [net-next PATCH V2 1/3] mm: add dma_addr_t to struct page
From: Jesper Dangaard Brouer @ 2019-02-12 18:19 UTC (permalink / raw)
To: Florian Fainelli
Cc: netdev, linux-mm, Toke Høiland-Jørgensen,
Ilias Apalodimas, willy, Saeed Mahameed, Alexander Duyck,
Andrew Morton, mgorman, David S. Miller, Tariq Toukan, brouer
In-Reply-To: <dc34bb0b-1efd-4200-2ee7-bf8adef8a0b5@gmail.com>
On Tue, 12 Feb 2019 10:05:39 -0800
Florian Fainelli <f.fainelli@gmail.com> wrote:
> On 2/12/19 6:49 AM, Jesper Dangaard Brouer wrote:
> > The page_pool API is using page->private to store DMA addresses.
> > As pointed out by David Miller we can't use that on 32-bit architectures
> > with 64-bit DMA
> >
> > This patch adds a new dma_addr_t struct to allow storing DMA addresses
> >
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> > Acked-by: Andrew Morton <akpm@linux-foundation.org>
> > ---
> > include/linux/mm_types.h | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > index 2c471a2c43fa..581737bd0878 100644
> > --- a/include/linux/mm_types.h
> > +++ b/include/linux/mm_types.h
> > @@ -95,6 +95,13 @@ struct page {
> > */
> > unsigned long private;
> > };
> > + struct { /* page_pool used by netstack */
> > + /**
> > + * @dma_addr: page_pool requires a 64-bit value even on
> > + * 32-bit architectures.
> > + */
>
> Nit: might require? dma_addr_t, as you mention in the commit may have a
> different size based on CONFIG_ARCH_DMA_ADDR_T_64BIT.
So you want me to change the comment to be:
/**
* @dma_addr: might require a 64-bit value even on
* 32-bit architectures.
*/
Correctly understood?
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
^ permalink raw reply
* Re: [RFC, PATCH] net: page_pool: Don't use page->private to store dma_addr_t
From: Ilias Apalodimas @ 2019-02-12 18:20 UTC (permalink / raw)
To: Alexander Duyck
Cc: Eric Dumazet, Tariq Toukan, Matthew Wilcox, brouer@redhat.com,
David Miller, toke@redhat.com, netdev@vger.kernel.org,
mgorman@techsingularity.net, linux-mm@kvack.org
In-Reply-To: <CAKgT0UfG08aYoN=zO_aVyx+OgNPmN9pVkBNeZMPTF2KL7XqoBQ@mail.gmail.com>
Hi Alexander,
On Tue, Feb 12, 2019 at 10:13:30AM -0800, Alexander Duyck wrote:
> On Tue, Feb 12, 2019 at 7:16 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> >
> >
> > On 02/12/2019 04:39 AM, Tariq Toukan wrote:
> > >
> > >
> > > On 2/11/2019 7:14 PM, Eric Dumazet wrote:
> > >>
> > >>
> > >> On 02/11/2019 12:53 AM, Tariq Toukan wrote:
> > >>>
> > >>
> > >>> Hi,
> > >>>
> > >>> It's great to use the struct page to store its dma mapping, but I am
> > >>> worried about extensibility.
> > >>> page_pool is evolving, and it would need several more per-page fields.
> > >>> One of them would be pageref_bias, a planned optimization to reduce the
> > >>> number of the costly atomic pageref operations (and replace existing
> > >>> code in several drivers).
> > >>>
> > >>
> > >> But the point about pageref_bias is to place it in a different cache line than "struct page"
> > >>
> > >> The major cost is having a cache line bouncing between producer and consumer.
> > >>
> > >
> > > pageref_bias is meant to be dirtied only by the page requester, i.e. the
> > > NIC driver / page_pool.
> > > All other components (basically, SKB release flow / put_page) should
> > > continue working with the atomic page_refcnt, and not dirty the
> > > pageref_bias.
> >
> > This is exactly my point.
> >
> > You suggested to put pageref_bias in struct page, which breaks this completely.
> >
> > pageref_bias is better kept in a driver structure, with appropriate prefetching
> > since most NIC use a ring buffer for their queues.
> >
> > The dma address _can_ be put in the struct page, since the driver does not dirty it
> > and does not even read it when page can be recycled.
>
> Instead of maintaining the pageref_bias in the page itself it could be
> maintained in some sort of separate structure. You could just maintain
> a pointer to a slot in an array somewhere. Then you can still access
> it if needed, the pointer would be static for as long as it is in the
> page pool, and you could invalidate the pointer prior to removing the
> bias from the page.
I think that's what Tariq was suggesting in the first place.
/Ilias
^ permalink raw reply
* Re: [net-next PATCH V2 1/3] mm: add dma_addr_t to struct page
From: Florian Fainelli @ 2019-02-12 18:23 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: netdev, linux-mm, Toke Høiland-Jørgensen,
Ilias Apalodimas, willy, Saeed Mahameed, Alexander Duyck,
Andrew Morton, mgorman, David S. Miller, Tariq Toukan
In-Reply-To: <20190212191917.2ef91a88@carbon>
On 2/12/19 10:19 AM, Jesper Dangaard Brouer wrote:
> On Tue, 12 Feb 2019 10:05:39 -0800
> Florian Fainelli <f.fainelli@gmail.com> wrote:
>
>> On 2/12/19 6:49 AM, Jesper Dangaard Brouer wrote:
>>> The page_pool API is using page->private to store DMA addresses.
>>> As pointed out by David Miller we can't use that on 32-bit architectures
>>> with 64-bit DMA
>>>
>>> This patch adds a new dma_addr_t struct to allow storing DMA addresses
>>>
>>> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>>> Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
>>> Acked-by: Andrew Morton <akpm@linux-foundation.org>
>>> ---
>>> include/linux/mm_types.h | 7 +++++++
>>> 1 file changed, 7 insertions(+)
>>>
>>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>>> index 2c471a2c43fa..581737bd0878 100644
>>> --- a/include/linux/mm_types.h
>>> +++ b/include/linux/mm_types.h
>>> @@ -95,6 +95,13 @@ struct page {
>>> */
>>> unsigned long private;
>>> };
>>> + struct { /* page_pool used by netstack */
>>> + /**
>>> + * @dma_addr: page_pool requires a 64-bit value even on
>>> + * 32-bit architectures.
>>> + */
>>
>> Nit: might require? dma_addr_t, as you mention in the commit may have a
>> different size based on CONFIG_ARCH_DMA_ADDR_T_64BIT.
>
> So you want me to change the comment to be:
>
> /**
> * @dma_addr: might require a 64-bit value even on
> * 32-bit architectures.
> */
>
> Correctly understood?
Correct, that is what I would change. The commit message is correct, but
the comment makes it sound like dma_addr_t is guaranteed to be 64-bit,
while it is actually platform dependent. Does that make it clearer?
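
To illustrate the size issue, here is a small userspace sketch, assuming a 32-bit architecture where unsigned long is 32 bits wide and CONFIG_ARCH_DMA_ADDR_T_64BIT makes dma_addr_t 64 bits wide. The typedefs and function names are hypothetical stand-ins, not kernel definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the two field types on such a platform. */
typedef uint32_t ulong32_t;	/* "unsigned long" on a 32-bit arch */
typedef uint64_t dma_addr64_t;	/* dma_addr_t with 64-bit DMA enabled */

/* Storing the address in a page->private-like field drops the upper bits. */
static dma_addr64_t roundtrip_via_private(dma_addr64_t addr)
{
	ulong32_t private_field = (ulong32_t)addr;

	return (dma_addr64_t)private_field;
}

/* A dedicated field of the DMA address type keeps the full value. */
static dma_addr64_t roundtrip_via_dma_addr(dma_addr64_t addr)
{
	dma_addr64_t field = addr;

	return field;
}
```

On a platform where dma_addr_t is only 32 bits wide the two variants behave the same, which is why "might require" is the more accurate wording.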
--
Florian
^ permalink raw reply
* [PATCH] rpc: properly check debugfs dentry before using it
From: Greg Kroah-Hartman @ 2019-02-12 18:27 UTC (permalink / raw)
To: J. Bruce Fields, Jeff Layton, Trond Myklebust, Anna Schumaker
Cc: linux-nfs, netdev, David Howells
debugfs can now report an error code if something went wrong instead of
just NULL. So if the return value is to be used as a "real" dentry, it
needs to be checked if it is an error before dereferencing it.
This is now happening because of ff9fb72bc077 ("debugfs: return error
values, not NULL"), but why debugfs files are not being created properly
is an older issue, probably one that has always been there and should
probably be looked at...
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna.schumaker@netapp.com>
Cc: linux-nfs@vger.kernel.org
Cc: netdev@vger.kernel.org
Reported-by: David Howells <dhowells@redhat.com>
Tested-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/sunrpc/debugfs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
I can take this through my tree if people don't object, or it can go
through the NFS tree. It does need to get merged before 5.0-final
though.
I also have a "larger" debugfs cleanup patch for this file, but that's
not really 5.0-final material and I will send it out later.
thanks,
greg k-h
diff --git a/net/sunrpc/debugfs.c b/net/sunrpc/debugfs.c
index 45a033329cd4..19bb356230ed 100644
--- a/net/sunrpc/debugfs.c
+++ b/net/sunrpc/debugfs.c
@@ -146,7 +146,7 @@ rpc_clnt_debugfs_register(struct rpc_clnt *clnt)
rcu_read_lock();
xprt = rcu_dereference(clnt->cl_xprt);
/* no "debugfs" dentry? Don't bother with the symlink. */
- if (!xprt->debugfs) {
+ if (IS_ERR_OR_NULL(xprt->debugfs)) {
rcu_read_unlock();
return;
}
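
For readers unfamiliar with the error-pointer convention, a userspace sketch follows. The lowercase helpers are local re-implementations written for this example, assuming the usual kernel convention that the top 4095 address values encode negative errno codes; they are not the kernel's own macros.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* assumed to match the kernel's limit */

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer falls in the range reserved for error codes. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* True for both NULL and encoded errors, i.e. anything unsafe to dereference. */
static inline int is_err_or_null(const void *ptr)
{
	return !ptr || is_err(ptr);
}
```

A plain `!ptr` test lets an encoded error through as if it were a valid dentry, which is the case the one-line change above closes.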
^ permalink raw reply related
* Re: [PATCH] ser_gigaset: mark expected switch fall-through
From: David Miller @ 2019-02-12 18:29 UTC (permalink / raw)
To: gustavo; +Cc: pebolle, isdn, gigaset307x-common, netdev, linux-kernel, keescook
In-Reply-To: <20190211223444.GA29517@embeddedor>
From: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Date: Mon, 11 Feb 2019 16:34:44 -0600
> In preparation to enabling -Wimplicit-fallthrough, mark switch
> cases where we are expecting to fall through.
>
> This patch fixes the following warning:
>
> drivers/isdn/gigaset/ser-gigaset.c: In function ‘gigaset_tty_ioctl’:
> drivers/isdn/gigaset/ser-gigaset.c:627:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
> switch (arg) {
> ^~~~~~
> drivers/isdn/gigaset/ser-gigaset.c:638:2: note: here
> default:
> ^~~~~~~
>
> Warning level 3 was used: -Wimplicit-fallthrough=3
>
> Notice that, in this particular case, the code comment is modified
> in accordance with what GCC is expecting to find.
>
> This patch is part of the ongoing efforts to enable
> -Wimplicit-fallthrough.
>
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Applied.
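
As an illustration of the annotation this kind of patch adds, here is a minimal sketch with hypothetical command codes; it is not the ser-gigaset code. At -Wimplicit-fallthrough=3, GCC accepts a comment such as the one below placed immediately before the next case label.

```c
/* Hypothetical command codes, for illustration only. */
enum { CMD_RESET = 0, CMD_FLUSH = 1, CMD_OTHER = 2 };

static int handle(int cmd)
{
	int steps = 0;

	switch (cmd) {
	case CMD_RESET:
		steps++;	/* reset-only work */
		/* fall through */
	case CMD_FLUSH:
		steps++;	/* work shared by reset and flush */
		break;
	default:
		steps = -1;
		break;
	}
	return steps;
}
```

Without the comment the behaviour is identical; only the warning differs.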
^ permalink raw reply
* Re: [PATCH] isdn_v110: mark expected switch fall-through
From: David Miller @ 2019-02-12 18:29 UTC (permalink / raw)
To: gustavo; +Cc: isdn, netdev, linux-kernel, keescook
In-Reply-To: <20190211224237.GA29262@embeddedor>
From: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Date: Mon, 11 Feb 2019 16:42:37 -0600
> In preparation to enabling -Wimplicit-fallthrough, mark switch
> cases where we are expecting to fall through.
>
> This patch fixes the following warnings:
>
> drivers/isdn/i4l/isdn_v110.c: In function ‘EncodeMatrix’:
> drivers/isdn/i4l/isdn_v110.c:353:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
> if (line >= mlen) {
> ^
> drivers/isdn/i4l/isdn_v110.c:358:3: note: here
> case 128:
> ^~~~
>
> Warning level 3 was used: -Wimplicit-fallthrough=3
>
> Notice that, in this particular case, the code comment is modified
> in accordance with what GCC is expecting to find.
>
> This patch is part of the ongoing efforts to enable
> -Wimplicit-fallthrough.
>
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Applied.
^ permalink raw reply
* Re: [PATCH] isdn: i4l: isdn_tty: Mark expected switch fall-through
From: David Miller @ 2019-02-12 18:29 UTC (permalink / raw)
To: gustavo; +Cc: isdn, netdev, linux-kernel, keescook
In-Reply-To: <20190211223821.GA13158@embeddedor>
From: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Date: Mon, 11 Feb 2019 16:38:21 -0600
> In preparation to enabling -Wimplicit-fallthrough, mark switch
> cases where we are expecting to fall through.
>
> This patch fixes the following warnings:
>
> drivers/isdn/i4l/isdn_tty.c: In function ‘isdn_tty_edit_at’:
> drivers/isdn/i4l/isdn_tty.c:3644:18: warning: this statement may fall through [-Wimplicit-fallthrough=]
> m->mdmcmdl = 0;
> ~~~~~~~~~~~^~~
> drivers/isdn/i4l/isdn_tty.c:3646:5: note: here
> case 0:
> ^~~~
>
> Warning level 3 was used: -Wimplicit-fallthrough=3
>
> Notice that, in this particular case, the code comment is modified
> in accordance with what GCC is expecting to find.
>
> This patch is part of the ongoing efforts to enable
> -Wimplicit-fallthrough.
>
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Applied.
^ permalink raw reply
* Re: [PATCH net] batman-adv: fix uninit-value in batadv_interface_tx()
From: David Miller @ 2019-02-12 18:31 UTC (permalink / raw)
To: edumazet; +Cc: netdev, eric.dumazet, syzkaller
In-Reply-To: <20190211224122.122242-1-edumazet@google.com>
From: Eric Dumazet <edumazet@google.com>
Date: Mon, 11 Feb 2019 14:41:22 -0800
> KMSAN reported batadv_interface_tx() was possibly using a
> garbage value [1]
>
> batadv_get_vid() does have a pskb_may_pull() call
> but batadv_interface_tx() does not actually make sure
> this did not fail.
...
> Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: syzbot <syzkaller@googlegroups.com>
Applied and queued up for -stable.
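
The class of bug described above can be sketched in userspace as follows. This is not the batman-adv fix itself: fake_buf, may_pull() and read_ethertype() are hypothetical stand-ins, with may_pull() playing the role of pskb_may_pull().

```c
#include <stddef.h>
#include <stdint.h>

#define HDR_LEN 14	/* Ethernet header length */

struct fake_buf {
	const uint8_t *data;
	size_t len;	/* bytes that are actually readable */
};

/* Stand-in for pskb_may_pull(): are at least n bytes readable? */
static int may_pull(const struct fake_buf *b, size_t n)
{
	return b->len >= n;
}

/* Return the ethertype, or -1 so the caller can drop a short frame. */
static int read_ethertype(const struct fake_buf *b)
{
	if (!may_pull(b, HDR_LEN))
		return -1;	/* never read header bytes that may be garbage */
	return (b->data[12] << 8) | b->data[13];
}
```

The point is that the caller has to act on the failure; a helper that checks the length internally does not protect later reads in the caller.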
^ permalink raw reply
* Re: [PATCH] inet_diag: fix reporting cgroup classid and fallback to priority
From: David Miller @ 2019-02-12 18:37 UTC (permalink / raw)
To: khlebnikov; +Cc: netdev, linux-kernel, sashal, linux-sctp
In-Reply-To: <154970855279.305165.13649851988934332761.stgit@buzz>
From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Date: Sat, 09 Feb 2019 13:35:52 +0300
> Field idiag_ext in struct inet_diag_req_v2 used as bitmap of requested
> extensions has only 8 bits. Thus extensions starting from DCTCPINFO
> cannot be requested directly. Some of them are included in the response
> unconditionally or hook into some of the lower 8 bits.
>
> Extension INET_DIAG_CLASS_ID has had no way to be requested from the beginning.
>
> This patch bundles it with INET_DIAG_TCLASS (ipv6 tos), fixes space
> reservation, and documents behavior for other extensions.
>
> Also this patch adds fallback to reporting socket priority. This field
> is more widely used for traffic classification because ipv4 sockets
> automatically map TOS to priority and default qdisc pfifo_fast knows
> about that. But priority could be changed via setsockopt SO_PRIORITY so
> INET_DIAG_TOS isn't enough for predicting class.
>
> Also cgroup2 obsoletes net_cls classid (it is always zero), but we cannot
> reuse this field for reporting cgroup2 id because it is 64-bit (ino+gen).
>
> So, after this patch INET_DIAG_CLASS_ID will report socket priority
> for most common setup when net_cls isn't set and/or cgroup2 in use.
>
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> Fixes: 0888e372c37f ("net: inet: diag: expose sockets cgroup classid")
Applied, and queued up for -stable.
Please always put the Fixes: tag first in the list of tags. I fixed
it up for you this time.
Thanks.
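
A small sketch of why the 8-bit request bitmap cannot name the higher extensions. The attribute numbers below are written from memory of include/uapi/linux/inet_diag.h and should be treated as assumptions, as should the convention that extension N is requested with bit N - 1; ext_request_bit() is a hypothetical helper.

```c
#include <stdint.h>

/* Assumed attribute numbers; check them against the uapi header. */
enum {
	EXT_TOS = 5,
	EXT_TCLASS = 6,
	EXT_SHUTDOWN = 8,
	EXT_DCTCPINFO = 9,
	EXT_CLASS_ID = 17,
};

/* Request bit for an extension, or 0 if it does not fit in 8 bits. */
static uint8_t ext_request_bit(int ext)
{
	if (ext < 1 || ext > 8)
		return 0;
	return (uint8_t)(1u << (ext - 1));
}
```

Under these assumptions the class ID has no bit of its own, so it has to ride on one of the lower eight, such as the TCLASS bit.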
^ permalink raw reply
* Re: [PATCH v2] net/packet: fix 4gb buffer limit due to overflow check
From: David Miller @ 2019-02-12 18:38 UTC (permalink / raw)
To: kal.conley
Cc: willemb, edumazet, alexander.h.duyck, jeffrey.t.kirsher, ktkhai,
vincent.whitchurch, lirongqing, magnus.karlsson, netdev,
linux-kernel
In-Reply-To: <20190210085712.31622-1-kal.conley@dectris.com>
From: Kal Conley <kal.conley@dectris.com>
Date: Sun, 10 Feb 2019 09:57:11 +0100
> When calculating rb->frames_per_block * req->tp_block_nr the result
> can overflow. Check it for overflow without limiting the total buffer
> size to UINT_MAX.
>
> This change fixes support for packet ring buffers >= UINT_MAX.
>
> Fixes: 8f8d28e4d6d8 ("net/packet: fix overflow in check for tp_frame_nr")
> Signed-off-by: Kal Conley <kal.conley@dectris.com>
> ---
> Changes in v2:
> - Add Signed-off-by and Fixes tag
Applied and queued up for -stable, thanks.
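
The kind of check described above can be sketched like this. It is an illustration under assumptions, not the code from the patch: frames_total() is a hypothetical helper, and only the frame count is tested for overflow, so the total buffer size in bytes is not capped at UINT_MAX.

```c
#include <limits.h>

/*
 * Store frames_per_block * block_nr in *total and return 0,
 * or return -1 if the product does not fit in an unsigned int.
 */
static int frames_total(unsigned int frames_per_block, unsigned int block_nr,
			unsigned int *total)
{
	/* Divide instead of multiplying so the check itself cannot overflow. */
	if (block_nr != 0 && frames_per_block > UINT_MAX / block_nr)
		return -1;
	*total = frames_per_block * block_nr;
	return 0;
}
```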
^ permalink raw reply
* Re: [PATCH net-next v4 00/17] Refactor classifier API to work with chain/classifiers without rtnl lock
From: David Miller @ 2019-02-12 18:42 UTC (permalink / raw)
To: vladbu; +Cc: netdev, jhs, xiyou.wangcong, jiri, ast, daniel
In-Reply-To: <20190211085548.7190-1-vladbu@mellanox.com>
From: Vlad Buslov <vladbu@mellanox.com>
Date: Mon, 11 Feb 2019 10:55:31 +0200
> Currently, all netlink protocol handlers for updating rules, actions and
> qdiscs are protected with single global rtnl lock which removes any
> possibility for parallelism. This patch set is a third step to remove
> rtnl lock dependency from TC rules update path.
...
I have to say, this stuff is very ambitious. Thanks for working on this.
Series applied, thanks Vlad.
^ permalink raw reply
* Re: [PATCH net] dsa: mv88e6xxx: Ensure all pending interrupts are handled prior to exit
From: Heiner Kallweit @ 2019-02-12 18:42 UTC (permalink / raw)
To: Andrew Lunn
Cc: John David Anglin, Russell King, Vivien Didelot, Florian Fainelli,
netdev
In-Reply-To: <20190212125644.GA7527@lunn.ch>
On 12.02.2019 13:56, Andrew Lunn wrote:
> On Tue, Feb 12, 2019 at 07:51:05AM +0100, Heiner Kallweit wrote:
>> On 12.02.2019 04:58, Andrew Lunn wrote:
>>>>> Hi David
>>>>>
>>>>> I just tested this on one of my boards. It loops endlessly:
>>>>>
>>>>> [ 47.173396] mv88e6xxx_g1_irq_thread_work: c881 a8 80
>>>>> [ 47.182108] mv88e6xxx_g1_irq_thread_work: c881 a8 80
>>>>> [ 47.190820] mv88e6xxx_g1_irq_thread_work: c881 a8 80
>>>>> [ 47.199535] mv88e6xxx_g1_irq_thread_work: c881 a8 80
>>>>> [ 47.208254] mv88e6xxx_g1_irq_thread_work: c881 a8 80
>>>>>
>>>>> These are reg, ctl1, reg & ctl1.
>>>>>
>>>>> So there is an unhandled device interrupt.
>>>
>>> Hi Heiner
>>>
>>> Your patch Fixes: 2b3e88ea6528 ("net: phy: improve phy state
>>> checking") is causing me problems with interrupts for the Marvell
>>> switches.
>>>
>> Hi Andrew,
>>
>> what kernel version is it?
>
> It is a little bit old, 5.0-rc1 net-next. I should rebase and
> retest. I'm testing on a ZII board which is not fully in mainline, so I
> need some patches.
>
Thanks, Andrew. Indeed 5.0 needs a fix, as also pointed out by Russell.
I think I will simply remove the following:
if (!phy_is_started(phydev))
return IRQ_NONE;
Then we basically do the same as phy_mac_interrupt(): we always run
the state machine. If it has nothing to do, then it does nothing.
Therefore state HALTED doesn't need special handling either.
This way we handle interrupts (incl. spurious ones) gracefully.
>> And the PHY driver in use is "Marvell 88E6390" ?
>
> Yes, the marvell 1G driver.
>
> Andrew
> .
>
Heiner
^ permalink raw reply
* Re: [PATCH v2] rhashtable: make walk safe from softirq context
From: David Miller @ 2019-02-12 18:43 UTC (permalink / raw)
To: johannes; +Cc: linux-wireless, netdev, j, tgraf, herbert, johannes.berg
In-Reply-To: <20190206090721.8001-1-johannes@sipsolutions.net>
From: Johannes Berg <johannes@sipsolutions.net>
Date: Wed, 6 Feb 2019 10:07:21 +0100
> From: Johannes Berg <johannes.berg@intel.com>
>
> When an rhashtable walk is done from softirq context, we rightfully
> get a lockdep complaint saying that we could get a softirq in the
> middle of a rehash, and thus deadlock on &ht->lock. This happened
> e.g. in mac80211 as it does a walk in softirq context.
>
> Fix this by using spin_lock_bh() wherever we use the &ht->lock.
>
> Initially, I thought it would be sufficient to do this only in the
> rehash (rhashtable_rehash_table), but I changed my mind:
> * the caller doesn't really need to disable softirqs across all
> of the rhashtable_walk_* functions, only those parts that they
> actually do within the lock need it
> * maybe more importantly, it would still lead to massive lockdep
> complaints - false positives, but hard to fix - because lockdep
> wouldn't know about different ht->lock instances, and thus one
> user of the code doing a walk w/o any locking (when it only ever
> uses process context this is fine) vs. another user like in wifi
> where we noticed this problem would still cause it to complain.
>
> Cc: stable@vger.kernel.org
> Reported-by: Jouni Malinen <j@w1.fi>
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Herbert and Johannes, I need some guidance.
It seems Herbert wants the softirq usage of rhashtables removed. But
things have been like this for so long that this is not the most
reasonable requirement if we can fix it more simply with Johannes's
patch, especially for -stable.
Thanks.
^ permalink raw reply
* Re: [PATCH net-next 3/4] mlxsw: spectrum_flower: Fix VLAN modify action support
From: Pablo Neira Ayuso @ 2019-02-12 18:49 UTC (permalink / raw)
To: Ido Schimmel
Cc: netdev@vger.kernel.org, davem@davemloft.net, Jiri Pirko,
Nir Dotan, mlxsw
In-Reply-To: <20190212162924.29777-4-idosch@mellanox.com>
On Tue, Feb 12, 2019 at 04:29:53PM +0000, Ido Schimmel wrote:
> The driver does not support VLAN push and pop, but only VLAN modify.
>
> Fixes: 738678817573 ("drivers: net: use flow action infrastructure")
> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> Acked-by: Jiri Pirko <jiri@mellanox.com>
> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
^ permalink raw reply
* KASAN: use-after-free Read in rt_cache_valid
From: syzbot @ 2019-02-12 18:53 UTC (permalink / raw)
To: davem, kuznet, linux-kernel, netdev, syzkaller-bugs, yoshfuji
Hello,
syzbot found the following crash on:
HEAD commit: aa0c38cf39de Merge branch 'fixes' of git://git.kernel.org/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1460eca7400000
kernel config: https://syzkaller.appspot.com/x/.config?x=ee434566c893c7b1
dashboard link: https://syzkaller.appspot.com/bug?extid=c4c4b2bb358bb936ad7e
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1197cb88c00000
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+c4c4b2bb358bb936ad7e@syzkaller.appspotmail.com
Enabling of bearer <udp:syz1> rejected, already enabled
Enabling of bearer <udp:syz1> rejected, already enabled
Enabling of bearer <udp:syz1> rejected, already enabled
Enabling of bearer <udp:syz1> rejected, already enabled
==================================================================
BUG: KASAN: use-after-free in rt_cache_valid+0x158/0x190
net/ipv4/route.c:1510
Read of size 2 at addr ffff88809bfce836 by task udevd/7435
CPU: 1 PID: 7435 Comm: udevd Not tainted 5.0.0-rc6+ #69
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
__asan_report_load2_noabort+0x14/0x20 mm/kasan/generic_report.c:133
rt_cache_valid+0x158/0x190 net/ipv4/route.c:1510
__mkroute_output net/ipv4/route.c:2260 [inline]
ip_route_output_key_hash_rcu+0x89d/0x30e0 net/ipv4/route.c:2492
Enabling of bearer <udp:syz1> rejected, already enabled
ip_route_output_key_hash+0x212/0x380 net/ipv4/route.c:2321
Enabling of bearer <udp:syz1> rejected, already enabled
__ip_route_output_key include/net/route.h:124 [inline]
ip_route_output_flow+0x28/0xc0 net/ipv4/route.c:2576
ip_route_output_key include/net/route.h:134 [inline]
tipc_udp_xmit.isra.0+0x55d/0xcc0 net/tipc/udp_media.c:173
Enabling of bearer <udp:syz1> rejected, already enabled
tipc_udp_send_msg+0x295/0x4a0 net/tipc/udp_media.c:247
tipc_bearer_xmit_skb+0x172/0x360 net/tipc/bearer.c:503
Enabling of bearer <udp:syz1> rejected, already enabled
tipc_disc_timeout+0x933/0xd60 net/tipc/discover.c:332
call_timer_fn+0x190/0x720 kernel/time/timer.c:1325
Enabling of bearer <udp:syz1> rejected, already enabled
expire_timers kernel/time/timer.c:1362 [inline]
__run_timers kernel/time/timer.c:1681 [inline]
__run_timers kernel/time/timer.c:1649 [inline]
run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694
Enabling of bearer <udp:syz1> rejected, already enabled
__do_softirq+0x266/0x95a kernel/softirq.c:292
Enabling of bearer <udp:syz1> rejected, already enabled
invoke_softirq kernel/softirq.c:373 [inline]
irq_exit+0x180/0x1d0 kernel/softirq.c:413
exiting_irq arch/x86/include/asm/apic.h:536 [inline]
smp_apic_timer_interrupt+0x14a/0x570 arch/x86/kernel/apic/apic.c:1062
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807
</IRQ>
Enabling of bearer <udp:syz1> rejected, already enabled
RIP: 0010:generic_fillattr+0x54c/0x6e0 fs/stat.c:51
Code: 7c 24 0c 48 89 fa 48 c1 ea 03 0f b6 14 02 48 89 f8 83 e0 07 83 c0 03
38 d0 7c 08 84 d2 0f 85 ee 00 00 00 45 8b 64 24 0c 31 ff <41> 81 e4 00 08
00 00 44 89 e6 e8 15 21 bf ff 45 85 e4 74 2c e8 8b
RSP: 0018:ffff88809894fc58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000007 RBX: ffff88809894fde8 RCX: ffffffff81b0c113
RDX: 0000000000000000 RSI: ffffffff81b0c146 RDI: 0000000000000000
RBP: ffff88809894fc78 R08: ffff8880925de340 R09: ffff88809894fde8
R10: ffffed1013129fcd R11: ffff88809894fe6f R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000800 R15: ffff888098d4d100
vfs_getattr_nosec+0x140/0x180 fs/stat.c:82
vfs_getattr+0x4b/0x70 fs/stat.c:116
Enabling of bearer <udp:syz1> rejected, already enabled
vfs_statx+0x157/0x200 fs/stat.c:189
Enabling of bearer <udp:syz1> rejected, already enabled
vfs_stat include/linux/fs.h:3171 [inline]
__do_sys_newstat+0xa4/0x130 fs/stat.c:339
Enabling of bearer <udp:syz1> rejected, already enabled
Enabling of bearer <udp:syz1> rejected, already enabled
kobject: 'loop0' (000000003275aa21): kobject_uevent_env
__se_sys_newstat fs/stat.c:335 [inline]
__x64_sys_newstat+0x54/0x80 fs/stat.c:335
kobject: 'loop0' (000000003275aa21): fill_kobj_path: path
= '/devices/virtual/block/loop0'
do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fda7f482c65
Code: 00 00 00 e8 5d 01 00 00 48 83 c4 18 c3 90 90 90 90 90 90 90 90 83 ff
01 48 89 f0 77 18 48 89 c7 48 89 d6 b8 04 00 00 00 0f 05 <48> 3d 00 f0 ff
ff 77 17 f3 c3 90 48 8b 05 a1 51 2b 00 64 c7 00 16
RSP: 002b:00007ffefd39a338 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
Enabling of bearer <udp:syz1> rejected, already enabled
RAX: ffffffffffffffda RBX: 00007ffefd39a3d0 RCX: 00007fda7f482c65
RDX: 00007ffefd39a340 RSI: 00007ffefd39a340 RDI: 00007ffefd39a3d0
RBP: 0000000001125980 R08: 00007ffefd39a3e0 R09: 00007fda7f4d9790
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000001114250
R13: 0000000000000000 R14: 00007ffefd39a840 R15: 0000000000000001
Enabling of bearer <udp:syz1> rejected, already enabled
Allocated by task 7724:
save_stack+0x45/0xd0 mm/kasan/common.c:73
kobject: 'loop0' (000000003275aa21): kobject_uevent_env
set_track mm/kasan/common.c:85 [inline]
__kasan_kmalloc mm/kasan/common.c:496 [inline]
__kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:469
kobject: 'loop0' (000000003275aa21): fill_kobj_path: path
= '/devices/virtual/block/loop0'
kasan_kmalloc mm/kasan/common.c:504 [inline]
kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:411
Enabling of bearer <udp:syz1> rejected, already enabled
kmem_cache_alloc+0x12d/0x710 mm/slab.c:3543
dst_alloc+0x10e/0x1d0 net/core/dst.c:105
rt_dst_alloc+0x83/0x3f0 net/ipv4/route.c:1566
__mkroute_output net/ipv4/route.c:2265 [inline]
ip_route_output_key_hash_rcu+0x97d/0x30e0 net/ipv4/route.c:2492
Enabling of bearer <udp:syz1> rejected, already enabled
ip_route_output_key_hash+0x212/0x380 net/ipv4/route.c:2321
__ip_route_output_key include/net/route.h:124 [inline]
ip_route_connect include/net/route.h:302 [inline]
__ip4_datagram_connect+0x6fb/0x1330 net/ipv4/datagram.c:51
__ip6_datagram_connect+0xa6a/0x1390 net/ipv6/datagram.c:152
kobject: 'loop0' (000000003275aa21): kobject_uevent_env
ip6_datagram_connect+0x30/0x50 net/ipv6/datagram.c:273
inet_dgram_connect+0x150/0x2e0 net/ipv4/af_inet.c:571
__sys_connect+0x266/0x330 net/socket.c:1662
__do_sys_connect net/socket.c:1673 [inline]
__se_sys_connect net/socket.c:1670 [inline]
__x64_sys_connect+0x73/0xb0 net/socket.c:1670
do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
kobject: 'loop0' (000000003275aa21): fill_kobj_path: path
= '/devices/virtual/block/loop0'
Freed by task 0:
save_stack+0x45/0xd0 mm/kasan/common.c:73
set_track mm/kasan/common.c:85 [inline]
__kasan_slab_free+0x102/0x150 mm/kasan/common.c:458
kasan_slab_free+0xe/0x10 mm/kasan/common.c:466
__cache_free mm/slab.c:3487 [inline]
kmem_cache_free+0x86/0x260 mm/slab.c:3749
dst_destroy+0x2a0/0x3c0 net/core/dst.c:141
dst_destroy_rcu+0x16/0x19 net/core/dst.c:154
__rcu_reclaim kernel/rcu/rcu.h:240 [inline]
rcu_do_batch kernel/rcu/tree.c:2452 [inline]
invoke_rcu_callbacks kernel/rcu/tree.c:2773 [inline]
rcu_process_callbacks+0x928/0x1390 kernel/rcu/tree.c:2754
cgroup: fork rejected by pids controller in /syz0
__do_softirq+0x266/0x95a kernel/softirq.c:292
The buggy address belongs to the object at ffff88809bfce800
which belongs to the cache ip_dst_cache of size 160
The buggy address is located 54 bytes inside of
160-byte region [ffff88809bfce800, ffff88809bfce8a0)
The buggy address belongs to the page:
page:ffffea00026ff380 count:1 mapcount:0 mapping:ffff88821af8d1c0 index:0x0
flags: 0x1fffc0000000200(slab)
raw: 01fffc0000000200 ffffea000297e848 ffff8880a68abd48 ffff88821af8d1c0
raw: 0000000000000000 ffff88809bfce000 0000000100000010 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88809bfce700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88809bfce780: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc
> ffff88809bfce800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88809bfce880: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
ffff88809bfce900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches
^ permalink raw reply
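A note on reading the "Memory state around the buggy address" dump above: in generic KASAN, each shadow byte describes one 8-byte granule of real memory. For slab objects, 00 means the whole granule is accessible, fc marks a slab redzone and fb marks freed slab memory. The sketch below is a minimal illustration of that decoding, not kernel code. The helper name and the array holding the two rows at ffff88809bfce800 and ffff88809bfce880 are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shadow byte values used by generic KASAN for slab objects. */
#define SHADOW_ACCESSIBLE   0x00
#define SHADOW_SLAB_REDZONE 0xfc
#define SHADOW_SLAB_FREED   0xfb
#define GRANULE_BYTES       8   /* one shadow byte covers 8 bytes of memory */

/* The two dump rows starting at ffff88809bfce800, copied from the report. */
static const uint8_t report_rows[32] = {
    /* ffff88809bfce800 */
    0xfb, 0xfb, 0xfb, 0xfb, 0xfb, 0xfb, 0xfb, 0xfb,
    0xfb, 0xfb, 0xfb, 0xfb, 0xfb, 0xfb, 0xfb, 0xfb,
    /* ffff88809bfce880 */
    0xfb, 0xfb, 0xfb, 0xfb, 0xfc, 0xfc, 0xfc, 0xfc,
    0xfc, 0xfc, 0xfc, 0xfc, 0xfc, 0xfc, 0xfc, 0xfc,
};

/*
 * Hypothetical helper: count the bytes of memory marked "freed", starting
 * at shadow index `start` and stopping at the first non-freed shadow byte.
 */
static size_t freed_bytes(const uint8_t *shadow, size_t len, size_t start)
{
    size_t granules = 0;

    assert(shadow != NULL && start <= len);
    for (size_t i = start; i < len && shadow[i] == SHADOW_SLAB_FREED; i++)
        granules++;
    return granules * GRANULE_BYTES;
}
```

Calling freed_bytes(report_rows, 32, 0) counts 20 freed granules, i.e. 160 bytes, which matches the "160-byte region" of the ip_dst_cache object named in the report.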
* [PATCH net] net: phy: fix interrupt handling in non-started states
From: Heiner Kallweit @ 2019-02-12 18:56 UTC (permalink / raw)
To: Andrew Lunn, Florian Fainelli, David Miller
Cc: netdev@vger.kernel.org, Russell King - ARM Linux
phylib enables interrupts before phy_start() has been called, and if
we receive an interrupt in a non-started state, the interrupt handler
returns IRQ_NONE. This causes problems with at least one Marvell chip
as reported by Andrew.
Fix this by handling interrupts the same as in phy_mac_interrupt(),
basically always running the phylib state machine. It knows when it
has to do something and when not.
This change allows interrupts to be handled gracefully even if they
occur in a non-started state.
Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking")
Reported-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
drivers/net/phy/phy.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 189cd2048c3a..ca5e0c0f018c 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -762,9 +762,6 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
{
struct phy_device *phydev = phy_dat;
- if (!phy_is_started(phydev))
- return IRQ_NONE; /* It can't be ours. */
-
if (phydev->drv->did_interrupt && !phydev->drv->did_interrupt(phydev))
return IRQ_NONE;
--
2.20.1
^ permalink raw reply related
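The behavioural difference described in the patch above can be pictured with a small userspace model. This is only a sketch under stated assumptions: the types and function names are invented, and the idea that an unserviced interrupt simply stays pending is an assumption of the toy model, not a statement about the Marvell chip or about what phylib does internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED return values. */
enum toy_irqreturn { TOY_IRQ_NONE, TOY_IRQ_HANDLED };

struct toy_phy {
    bool started;      /* models phy_is_started() */
    bool irq_pending;  /* toy assumption: stays set until serviced */
    int  sm_runs;      /* how often the state machine was kicked */
};

/* Before the patch: an interrupt in a non-started state is disowned. */
static enum toy_irqreturn toy_phy_interrupt_old(struct toy_phy *phy)
{
    assert(phy != NULL);
    if (!phy->started)
        return TOY_IRQ_NONE;    /* "It can't be ours." */
    phy->sm_runs++;
    phy->irq_pending = false;
    return TOY_IRQ_HANDLED;
}

/* After the patch: always run the state machine and let it decide. */
static enum toy_irqreturn toy_phy_interrupt_new(struct toy_phy *phy)
{
    assert(phy != NULL);
    phy->sm_runs++;
    phy->irq_pending = false;
    return TOY_IRQ_HANDLED;
}
```

With a non-started toy_phy whose irq_pending is set, the old handler returns TOY_IRQ_NONE and leaves the interrupt pending, while the new handler services it and returns TOY_IRQ_HANDLED.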
* Re: [PATCH net-next 0/4] net: phy: Add 2.5G/5GBASET PHYs support
From: David Miller @ 2019-02-12 19:03 UTC (permalink / raw)
To: maxime.chevallier
Cc: netdev, linux-kernel, andrew, f.fainelli, hkallweit1, linux,
linux-arm-kernel, antoine.tenart, thomas.petazzoni,
gregory.clement, miquel.raynal, nadavh, stefanc, mw
In-Reply-To: <20190211142529.22885-1-maxime.chevallier@bootlin.com>
From: Maxime Chevallier <maxime.chevallier@bootlin.com>
Date: Mon, 11 Feb 2019 15:25:25 +0100
> The 802.3bz standard defines 2 modes based on the NBASET alliance work
> that allow the use of 2.5Gbps and 5Gbps speeds on Cat 5e, 6 and 7 cables.
>
> This series adds the necessary infrastructure to handle these modes with
> C45 PHYs. This series was originally part of a bigger one, that has
> seen 2 iterations [1] [2] that added support for these modes on Marvell
> Alaska PHYs.
...
I'll give Andrew, Florian, and Heiner a chance to review this series.
^ permalink raw reply
* Re: [PATCH v2] rhashtable: make walk safe from softirq context
From: Johannes Berg @ 2019-02-12 19:03 UTC (permalink / raw)
To: David Miller; +Cc: linux-wireless, netdev, j, tgraf, herbert, Bob Copeland
In-Reply-To: <20190212.104339.1794719792249723582.davem@davemloft.net>
On Tue, 2019-02-12 at 10:43 -0800, David Miller wrote:
> Herbert and Johannes, I need some guidance.
>
> It seems Herbert wants the softirq usage of rhashtables removed,
Well, specifically of rhashtable walkers. I can only concede that he's
right in that a hashtable walk during softirq (or even with softirqs
disabled) was maybe a bad idea.
At the same time, it's likely going to be pretty deep surgery in this
code, and I'm not sure I can do that right now. Maybe Bob has some
thoughts if it can be achieved more easily, but I think it'd require
adding a new list to each station that tracks which mesh paths it is the
next_hop for, and making sure that's maintained correctly, which feels
tricky but maybe it's not (I could be more familiar with mesh ...)
Evidently this goes back to
commit 60854fd94573f0d3b80b55b40cf0140a0430f3ab
Author: Bob Copeland <me@bobcopeland.com>
Date: Wed Mar 2 10:09:20 2016 -0500
mac80211: mesh: convert path table to rhashtable
which is kinda old. Not sure why this didn't surface before, because the
spinlock was introduced *before*, otherwise certainly the mutex would've
caused us to not be able to do this code to start with (commit
c6ff5268293 - rhashtable: Fix walker list corruption).
That commit also just converted an existing hashtable walk to
rhashtable, so not sure that counts as having introduced the problem :-)
I guess that's not really guidance. If it were my call I'd apply the
patch and issue a stern warning to myself to remove this ASAP ;-) But
sadly, mesh isn't exactly a priority to most, so not sure when that "P"
would be.
But I guess we should also ask Bob first:
1) do you think it'd be easy to maintain a separate list or avoid the
iteration in some other way, and make that a small enough patch to be
applicable for stable?
2) or do you think maybe the mesh_plink_broken() call could just be
lifted into a workqueue instead?
johannes
^ permalink raw reply
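The "separate list" idea floated in the message above (each station tracking the mesh paths it is the next_hop for) could look roughly like the following. This is a minimal userspace sketch, not mac80211 code: all names are hypothetical, and the locking/RCU that a real implementation would need is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

struct toy_path;

struct toy_station {
    struct toy_path *hop_paths;   /* paths that use this station as next hop */
};

struct toy_path {
    struct toy_station *next_hop;
    struct toy_path *hop_next;    /* link in next_hop->hop_paths */
    int active;
};

/* Move a path to a new next hop, keeping both per-station lists correct. */
static void toy_path_set_next_hop(struct toy_path *p, struct toy_station *sta)
{
    assert(p != NULL);

    if (p->next_hop) {
        /* Unlink from the old station's list. */
        struct toy_path **pp = &p->next_hop->hop_paths;

        while (*pp && *pp != p)
            pp = &(*pp)->hop_next;
        if (*pp)
            *pp = p->hop_next;
    }

    p->next_hop = sta;
    p->hop_next = NULL;
    if (sta) {
        /* Link at the head of the new station's list. */
        p->hop_next = sta->hop_paths;
        sta->hop_paths = p;
    }
}

/* On link failure, visit only the affected paths; no table walk needed. */
static int toy_link_broken(struct toy_station *sta)
{
    int n = 0;

    assert(sta != NULL);
    for (struct toy_path *p = sta->hop_paths; p; p = p->hop_next) {
        p->active = 0;
        n++;
    }
    return n;
}
```

The cost is the bookkeeping in toy_path_set_next_hop(), which is the "making sure that's maintained correctly" part the message calls tricky. In exchange, the link-failure handler no longer needs to iterate the whole path table.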
* KASAN: use-after-free Read in sctp_outq_tail
From: syzbot @ 2019-02-12 19:04 UTC (permalink / raw)
To: davem, linux-kernel, linux-sctp, marcelo.leitner, netdev, nhorman,
syzkaller-bugs, vyasevich
Hello,
syzbot found the following crash on:
HEAD commit: d4104460aec1 Add linux-next specific files for 20190211
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=14140124c00000
kernel config: https://syzkaller.appspot.com/x/.config?x=c8a112d3b0d6719b
dashboard link: https://syzkaller.appspot.com/bug?extid=7823fa3f3e2d69341ea8
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
Unfortunately, I don't have any reproducer for this crash yet.
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+7823fa3f3e2d69341ea8@syzkaller.appspotmail.com
==================================================================
BUG: KASAN: use-after-free in list_add_tail include/linux/list.h:93 [inline]
BUG: KASAN: use-after-free in sctp_outq_tail_data net/sctp/outqueue.c:105
[inline]
BUG: KASAN: use-after-free in sctp_outq_tail+0x816/0x930
net/sctp/outqueue.c:313
Read of size 8 at addr ffff88807b19a7b8 by task syz-executor.0/30745
CPU: 1 PID: 30745 Comm: syz-executor.0 Not tainted 5.0.0-rc5-next-20190211
#32
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
__asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
list_add_tail include/linux/list.h:93 [inline]
sctp_outq_tail_data net/sctp/outqueue.c:105 [inline]
sctp_outq_tail+0x816/0x930 net/sctp/outqueue.c:313
sctp_cmd_send_msg net/sctp/sm_sideeffect.c:1109 [inline]
sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1784 [inline]
sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
sctp_do_sm+0x68e/0x5380 net/sctp/sm_sideeffect.c:1191
sctp_primitive_SEND+0xa0/0xd0 net/sctp/primitive.c:178
sctp_sendmsg_to_asoc+0xa63/0x17b0 net/sctp/socket.c:1955
sctp_sendmsg+0x10a9/0x17e0 net/sctp/socket.c:2113
inet_sendmsg+0x147/0x5d0 net/ipv4/af_inet.c:798
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xdd/0x130 net/socket.c:631
___sys_sendmsg+0x806/0x930 net/socket.c:2136
__sys_sendmsg+0x105/0x1d0 net/socket.c:2174
__do_sys_sendmsg net/socket.c:2183 [inline]
__se_sys_sendmsg net/socket.c:2181 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2181
do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457e39
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fa9b8630c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e39
RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fa9b86316d4
R13: 00000000004c4e2b R14: 00000000004d8ab8 R15: 00000000ffffffff
Allocated by task 30745:
save_stack+0x45/0xd0 mm/kasan/common.c:75
set_track mm/kasan/common.c:87 [inline]
__kasan_kmalloc mm/kasan/common.c:498 [inline]
__kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:471
kasan_kmalloc+0x9/0x10 mm/kasan/common.c:506
kmem_cache_alloc_trace+0x151/0x760 mm/slab.c:3615
kmalloc include/linux/slab.h:548 [inline]
kzalloc include/linux/slab.h:743 [inline]
sctp_stream_init_ext+0x51/0x110 net/sctp/stream.c:172
sctp_sendmsg_to_asoc+0x1273/0x17b0 net/sctp/socket.c:1896
sctp_sendmsg+0x10a9/0x17e0 net/sctp/socket.c:2113
inet_sendmsg+0x147/0x5d0 net/ipv4/af_inet.c:798
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xdd/0x130 net/socket.c:631
___sys_sendmsg+0x806/0x930 net/socket.c:2136
__sys_sendmsg+0x105/0x1d0 net/socket.c:2174
__do_sys_sendmsg net/socket.c:2183 [inline]
__se_sys_sendmsg net/socket.c:2181 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2181
do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 30745:
save_stack+0x45/0xd0 mm/kasan/common.c:75
set_track mm/kasan/common.c:87 [inline]
__kasan_slab_free+0x102/0x150 mm/kasan/common.c:460
kasan_slab_free+0xe/0x10 mm/kasan/common.c:468
__cache_free mm/slab.c:3491 [inline]
kfree+0xcf/0x230 mm/slab.c:3816
sctp_stream_outq_migrate+0x3e6/0x540 net/sctp/stream.c:88
sctp_stream_init+0xbc/0x410 net/sctp/stream.c:139
sctp_process_init+0x21c3/0x2b20 net/sctp/sm_make_chunk.c:2466
sctp_cmd_process_init net/sctp/sm_sideeffect.c:682 [inline]
sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1410 [inline]
sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
sctp_do_sm+0x3145/0x5380 net/sctp/sm_sideeffect.c:1191
sctp_assoc_bh_rcv+0x343/0x660 net/sctp/associola.c:1074
sctp_inq_push+0x1ea/0x290 net/sctp/inqueue.c:95
sctp_backlog_rcv+0x196/0xbe0 net/sctp/input.c:354
sk_backlog_rcv include/net/sock.h:937 [inline]
__release_sock+0x12e/0x3a0 net/core/sock.c:2379
release_sock+0x59/0x1c0 net/core/sock.c:2895
sctp_wait_for_connect+0x316/0x540 net/sctp/socket.c:8998
sctp_sendmsg_to_asoc+0x13e3/0x17b0 net/sctp/socket.c:1967
sctp_sendmsg+0x10a9/0x17e0 net/sctp/socket.c:2113
inet_sendmsg+0x147/0x5d0 net/ipv4/af_inet.c:798
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xdd/0x130 net/socket.c:631
___sys_sendmsg+0x806/0x930 net/socket.c:2136
__sys_sendmsg+0x105/0x1d0 net/socket.c:2174
__do_sys_sendmsg net/socket.c:2183 [inline]
__se_sys_sendmsg net/socket.c:2181 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2181
do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The buggy address belongs to the object at ffff88807b19a780
which belongs to the cache kmalloc-96 of size 96
The buggy address is located 56 bytes inside of
96-byte region [ffff88807b19a780, ffff88807b19a7e0)
The buggy address belongs to the page:
page:ffffea0001ec6680 count:1 mapcount:0 mapping:ffff88812c3f04c0
index:0xffff88807b19a800
flags: 0x1fffc0000000200(slab)
raw: 01fffc0000000200 ffffea000262acc8 ffffea0001448348 ffff88812c3f04c0
raw: ffff88807b19a800 ffff88807b19a000 000000010000001d 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88807b19a680: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
ffff88807b19a700: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
> ffff88807b19a780: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
^
ffff88807b19a800: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
ffff88807b19a880: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
==================================================================
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.
^ permalink raw reply
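The address arithmetic in the report above can be checked by hand. The sketch below assumes the usual generic-KASAN layout (one shadow byte per 8 bytes of memory, 16 shadow bytes per dump row) and uses made-up helper names. It computes the offset of the faulting address inside the kmalloc-96 object and the column of the dump row that describes it.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE_BYTES 8     /* memory covered by one shadow byte */
#define ROW_BYTES     128   /* 16 shadow bytes per dump row => 128 bytes */

/* Offset of `addr` inside the object [obj_start, obj_start + obj_size). */
static uint64_t offset_in_object(uint64_t addr, uint64_t obj_start,
                                 uint64_t obj_size)
{
    assert(addr >= obj_start && addr < obj_start + obj_size);
    return addr - obj_start;
}

/* Zero-based column of the shadow byte for `addr` within its dump row. */
static unsigned int shadow_column(uint64_t addr)
{
    return (unsigned int)((addr % ROW_BYTES) / GRANULE_BYTES);
}
```

For the faulting address ffff88807b19a7b8 and the object starting at ffff88807b19a780, the offset is 0x38 = 56 bytes, as the report states, and the shadow byte is column 7 of the row marked with ">", which holds fb (freed).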
* Re: [PATCH V1 net 0/2] net: ena: race condition bug fix and version update
From: David Miller @ 2019-02-12 19:06 UTC (permalink / raw)
To: akiyano
Cc: netdev, dwmw, zorik, matua, saeedb, msw, aliguori, nafea, gtzalik,
netanel, alisaidi
In-Reply-To: <1549905464-13758-1-git-send-email-akiyano@amazon.com>
From: <akiyano@amazon.com>
Date: Mon, 11 Feb 2019 19:17:42 +0200
> From: Arthur Kiyanovski <akiyano@amazon.com>
>
> This patchset includes a fix to a race condition that can cause
> kernel panic, as well as a driver version update because of this
> fix.
Series applied and patch #1 queued up for -stable.
But I want to reiterate what Andrew said, the version is so incredibly
useless and stupid.
I'm going to submit the fix to -stable, and then people will
doubly and triply have no relationship between driver version number
and what fixes exist.
^ permalink raw reply
* Re: [PATCH net-next v5 09/12] socket: Add SO_TIMESTAMPING_NEW
From: Deepa Dinamani @ 2019-02-12 19:08 UTC (permalink / raw)
To: Ran Rozenstein
Cc: davem@davemloft.net, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org, arnd@arndb.de, y2038@lists.linaro.org,
chris@zankel.net, fenghua.yu@intel.com, rth@twiddle.net,
tglx@linutronix.de, ubraun@linux.ibm.com,
linux-alpha@vger.kernel.org, linux-arch@vger.kernel.org,
linux-ia64@vger.kernel.org, linux-mips@linux-mips.org,
linux-s390@vger.kernel.org, linux-xtensa@linux-xtensa.org,
sparclinux@vger.kernel.org
In-Reply-To: <CABeXuvqMR7T=3ORvXPihkz-WbTN+oFeHkCu9uvebEq9wTLpJuQ@mail.gmail.com>
On Sun, Feb 10, 2019 at 7:21 PM Deepa Dinamani <deepa.kernel@gmail.com> wrote:
>
> On Feb 10, 2019, at 7:43 AM, Ran Rozenstein <ranro@mellanox.com> wrote:
>
> >> Subject: [PATCH net-next v5 09/12] socket: Add SO_TIMESTAMPING_NEW
> >>
> >> Add SO_TIMESTAMPING_NEW variant of socket timestamp options.
> >> This is the y2038 safe versions of the SO_TIMESTAMPING_OLD for all
> >> architectures.
> >>
> >> Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
> >> Acked-by: Willem de Bruijn <willemb@google.com>
> >
> >
> > Hi,
> >
> > I have an app that includes:
> > #include <linux/errqueue.h>
> >
> > It now fails with this error:
> > In file included from timestamping.c:6:0:
> > /usr/include/linux/errqueue.h:46:27: error: array type has incomplete element type 'struct __kernel_timespec'
> > struct __kernel_timespec ts[3];
> > ^~
> > I tried to do the trivial fix, to include time.h:
> > In include/uapi/linux/errqueue.h
> > #include <linux/time.h>
> > #include <linux/types.h>
> >
> > But that just adds more errors:
> > In file included from /usr/include/linux/errqueue.h:5:0,
> > from timestamping.c:6:
> > /usr/include/linux/time.h:10:8: error: redefinition of 'struct timespec'
> > struct timespec {
> > ^~~~~~~~
> > In file included from /usr/include/sys/select.h:39:0,
> > from /usr/include/sys/types.h:197,
> > from /usr/include/stdlib.h:279,
> > from timestamping.c:2:
> > /usr/include/bits/types/struct_timespec.h:8:8: note: originally defined here
> > struct timespec
> > ^~~~~~~~
> > In file included from /usr/include/linux/errqueue.h:5:0,
> > from timestamping.c:6:
> > /usr/include/linux/time.h:16:8: error: redefinition of 'struct timeval'
> > struct timeval {
> > ^~~~~~~
> > In file included from /usr/include/sys/select.h:37:0,
> > from /usr/include/sys/types.h:197,
> > from /usr/include/stdlib.h:279,
> > from timestamping.c:2:
> > /usr/include/bits/types/struct_timeval.h:8:8: note: originally defined here
> > struct timeval
> > ^~~~~~~
> >
> >
> > Can you please advise how to solve it?
> >
> > Thanks,
> > Ran
>
> The errqueue.h already had the same issue reported previously:
> https://lore.kernel.org/netdev/CAF=yD-L2ntuH54J_SwN9WcpBMgkV_v0e-Q2Pu2mrQ3+1RozGFQ@mail.gmail.com/
>
> Earlier when I tested this with kernel selftests such as
> tools/testing/selftests/networking/timestamping/rxtimestamp (the test
> was broken to begin with because of missing include of unistd.h), I
> was using make.cross to build.
> This does not put the headers in the right place
> (obj-$ARCH/usr/include instead of usr/include). Hence, I did not
> realize that this breaks the inclusion of errqueue.h due to the
> missing __kernel_timespec definition.
> I forgot that nobody seems to be using linux/time.h.
>
> But if I add guards (#ifdef __KERNEL__) around struct timespec,
> struct timeval etc. in linux/time.h, then we can include it from the
> userspace errqueue.h for __kernel_timespec:
>
> --- a/include/uapi/linux/errqueue.h
> +++ b/include/uapi/linux/errqueue.h
> @@ -2,7 +2,7 @@
> #ifndef _UAPI_LINUX_ERRQUEUE_H
> #define _UAPI_LINUX_ERRQUEUE_H
>
> -#include <linux/types.h>
> +#include <linux/time.h>
>
> struct sock_extended_err {
> __u32 ee_errno;
> diff --git a/include/uapi/linux/time.h b/include/uapi/linux/time.h
> index a6aca9aaab80..40913d9a5bc8 100644
> --- a/include/uapi/linux/time.h
> +++ b/include/uapi/linux/time.h
> @@ -5,6 +5,8 @@
> #include <linux/types.h>
>
>
> +#ifdef __KERNEL__
> +
> #ifndef _STRUCT_TIMESPEC
> #define _STRUCT_TIMESPEC
> struct timespec {
> @@ -42,6 +44,8 @@ struct itimerval {
> struct timeval it_value; /* current value */
> };
>
> +#endif /* __KERNEL__ */
>
> Arnd,
>
> I forgot how we plan to include the definition for __kernel_timespec
> for libc or userspace. Does this seem right to you?
> Also, these changes to errqueue.h probably need to be reverted, as this
> breaks userspace.
Arnd and I talked about this this morning.
We agreed that we could introduce a new time_types.h along the lines
of posix_types.h. We will move all the time definitions that we plan
to keep in the kernel uapi headers to this header. This header will
also not have any overlap with sys/time.h and can be included
along with it from userspace.
I will post this patch shortly.
This should fix Ran's issue.
-Deepa
^ permalink raw reply
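The plan described above, a header with kernel time types that does not overlap with sys/time.h, amounts to giving the 64-bit timespec its own name so it can coexist with libc's struct timespec. The following is a rough sketch of that idea only. The type and function names are hypothetical stand-ins, not the contents of the proposed time_types.h.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>   /* libc's struct timespec; must not clash with ours */

/* Hypothetical stand-in for a y2038-safe kernel timespec: 64-bit fields. */
struct toy_kernel_timespec {
    int64_t tv_sec;
    int64_t tv_nsec;
};

/*
 * An array of three timestamps, like the ts[3] member quoted in the
 * compiler error above. The element type is now complete.
 */
struct toy_timestamping64 {
    struct toy_kernel_timespec ts[3];
};

/* Convert from the libc type; both definitions are visible without conflict. */
static struct toy_kernel_timespec toy_from_libc(struct timespec t)
{
    struct toy_kernel_timespec k;

    assert(t.tv_nsec >= 0 && t.tv_nsec < 1000000000L);
    k.tv_sec = (int64_t)t.tv_sec;
    k.tv_nsec = (int64_t)t.tv_nsec;
    return k;
}
```

Because the new type has a distinct tag, including it next to <time.h> triggers neither the "incomplete element type" error nor the "redefinition of struct timespec" errors reported earlier in the thread.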
* Re: [Patch net v2 0/3] net_sched: some fixes for cls_tcindex
From: David Miller @ 2019-02-12 19:14 UTC (permalink / raw)
To: xiyou.wangcong; +Cc: netdev
In-Reply-To: <20190211210616.9592-1-xiyou.wangcong@gmail.com>
From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Mon, 11 Feb 2019 13:06:13 -0800
> This patchset contains 3 bug fixes for tcindex filter. Please check
> each patch for details.
>
> v2: fix a compile error in patch 2
> drop netns refcnt in patch 1
>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Series applied and queued up for -stable, thanks Cong.
^ permalink raw reply
* Re: [PATCH net-next 4/4] net: phy: Add generic support for 2.5GBaseT and 5GBaseT
From: Heiner Kallweit @ 2019-02-12 19:14 UTC (permalink / raw)
To: Maxime Chevallier, davem
Cc: netdev, linux-kernel, Andrew Lunn, Florian Fainelli, Russell King,
linux-arm-kernel, Antoine Tenart, thomas.petazzoni,
gregory.clement, miquel.raynal, nadavh, stefanc, mw
In-Reply-To: <20190211142529.22885-5-maxime.chevallier@bootlin.com>
On 11.02.2019 15:25, Maxime Chevallier wrote:
> The 802.3bz specification, based on previous work by the NBASET alliance,
> defines the 2.5GBaseT and 5GBaseT link modes for ethernet traffic on
> cat5e, cat6 and cat7 cables.
>
> These modes integrate with the already defined C45 MDIO PMA/PMD registers
> set that added 10G support, by defining some previously reserved bits,
> and adding a new register (2.5G/5G Extended abilities).
>
> This commit adds the required definitions in include/uapi/linux/mdio.h
> to support these modes, and detect when a link-partner advertises them.
>
> It also adds support for these modes in the generic C45 PHY
> infrastructure.
>
> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
> ---
> drivers/net/phy/phy-c45.c | 37 +++++++++++++++++++++++++++++++++++++
> include/uapi/linux/mdio.h | 16 ++++++++++++++++
> 2 files changed, 53 insertions(+)
>
> diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c
> index 6f028de4dae1..7af5fa81daf6 100644
> --- a/drivers/net/phy/phy-c45.c
> +++ b/drivers/net/phy/phy-c45.c
> @@ -47,6 +47,16 @@ int genphy_c45_pma_setup_forced(struct phy_device *phydev)
> /* Assume 1000base-T */
> ctrl2 |= MDIO_PMA_CTRL2_1000BT;
> break;
> + case SPEED_2500:
> + ctrl1 |= MDIO_CTRL1_SPEED2_5G;
> + /* Assume 2.5Gbase-T */
> + ctrl2 |= MDIO_PMA_CTRL2_2_5GBT;
> + break;
> + case SPEED_5000:
> + ctrl1 |= MDIO_CTRL1_SPEED5G;
> + /* Assume 5Gbase-T */
> + ctrl2 |= MDIO_PMA_CTRL2_5GBT;
> + break;
> case SPEED_10000:
> ctrl1 |= MDIO_CTRL1_SPEED10G;
> /* Assume 10Gbase-T */
> @@ -194,6 +204,12 @@ int genphy_c45_read_lpa(struct phy_device *phydev)
> if (val < 0)
> return val;
>
> + if (val & MDIO_AN_10GBT_STAT_LP2_5G)
> + linkmode_set_bit(ETHTOOL_LINK_MODE_2500baseT_Full_BIT,
> + phydev->lp_advertising);
> + if (val & MDIO_AN_10GBT_STAT_LP5G)
> + linkmode_set_bit(ETHTOOL_LINK_MODE_5000baseT_Full_BIT,
> + phydev->lp_advertising);
> if (val & MDIO_AN_10GBT_STAT_LP10G)
> linkmode_set_bit(ETHTOOL_LINK_MODE_10000baseT_Full_BIT,
> phydev->lp_advertising);
> @@ -224,6 +240,12 @@ int genphy_c45_read_pma(struct phy_device *phydev)
> case MDIO_PMA_CTRL1_SPEED1000:
> phydev->speed = SPEED_1000;
> break;
> + case MDIO_CTRL1_SPEED2_5G:
> + phydev->speed = SPEED_2500;
> + break;
> + case MDIO_CTRL1_SPEED5G:
> + phydev->speed = SPEED_5000;
> + break;
> case MDIO_CTRL1_SPEED10G:
> phydev->speed = SPEED_10000;
> break;
> @@ -339,6 +361,21 @@ int genphy_c45_pma_read_abilities(struct phy_device *phydev)
> linkmode_mod_bit(ETHTOOL_LINK_MODE_10baseT_Half_BIT,
> phydev->supported,
> val & MDIO_PMA_EXTABLE_10BT);
> +
> + if (val & MDIO_PMA_EXTABLE_NBT) {
> + val = phy_read_mmd(phydev, MDIO_MMD_PMAPMD,
> + MDIO_PMA_NG_EXTABLE);
> + if (val < 0)
> + return val;
> +
> + linkmode_mod_bit(ETHTOOL_LINK_MODE_2500baseT_Full_BIT,
> + phydev->supported,
> + val & MDIO_PMA_NG_EXTABLE_2_5GBT);
> +
> + linkmode_mod_bit(ETHTOOL_LINK_MODE_5000baseT_Full_BIT,
> + phydev->supported,
> + val & MDIO_PMA_NG_EXTABLE_5GBT);
> + }
> }
>
> return 0;
> diff --git a/include/uapi/linux/mdio.h b/include/uapi/linux/mdio.h
> index 0e012b168e4d..0a552061ff1c 100644
> --- a/include/uapi/linux/mdio.h
> +++ b/include/uapi/linux/mdio.h
> @@ -45,6 +45,7 @@
> #define MDIO_AN_ADVERTISE 16 /* AN advertising (base page) */
> #define MDIO_AN_LPA 19 /* AN LP abilities (base page) */
> #define MDIO_PCS_EEE_ABLE 20 /* EEE Capability register */
> +#define MDIO_PMA_NG_EXTABLE 21 /* 2.5G/5G PMA/PMD extended ability */
> #define MDIO_PCS_EEE_WK_ERR 22 /* EEE wake error counter */
> #define MDIO_PHYXS_LNSTAT 24 /* PHY XGXS lane state */
> #define MDIO_AN_EEE_ADV 60 /* EEE advertisement */
> @@ -92,6 +93,10 @@
> #define MDIO_CTRL1_SPEED10G (MDIO_CTRL1_SPEEDSELEXT | 0x00)
> /* 10PASS-TS/2BASE-TL */
> #define MDIO_CTRL1_SPEED10P2B (MDIO_CTRL1_SPEEDSELEXT | 0x04)
> +/* 2.5 Gb/s */
> +#define MDIO_CTRL1_SPEED2_5G (MDIO_CTRL1_SPEEDSELEXT | 0x18)
> +/* 5 Gb/s */
> +#define MDIO_CTRL1_SPEED5G (MDIO_CTRL1_SPEEDSELEXT | 0x1c)
>
> /* Status register 1. */
> #define MDIO_STAT1_LPOWERABLE 0x0002 /* Low-power ability */
> @@ -145,6 +150,8 @@
> #define MDIO_PMA_CTRL2_1000BKX 0x000d /* 1000BASE-KX type */
> #define MDIO_PMA_CTRL2_100BTX 0x000e /* 100BASE-TX type */
> #define MDIO_PMA_CTRL2_10BT 0x000f /* 10BASE-T type */
> +#define MDIO_PMA_CTRL2_2_5GBT 0x0030 /* 2.5GBaseT type */
> +#define MDIO_PMA_CTRL2_5GBT 0x0031 /* 5GBaseT type */
> #define MDIO_PCS_CTRL2_TYPE 0x0003 /* PCS type selection */
> #define MDIO_PCS_CTRL2_10GBR 0x0000 /* 10GBASE-R type */
> #define MDIO_PCS_CTRL2_10GBX 0x0001 /* 10GBASE-X type */
> @@ -198,6 +205,7 @@
> #define MDIO_PMA_EXTABLE_1000BKX 0x0040 /* 1000BASE-KX ability */
> #define MDIO_PMA_EXTABLE_100BTX 0x0080 /* 100BASE-TX ability */
> #define MDIO_PMA_EXTABLE_10BT 0x0100 /* 10BASE-T ability */
> +#define MDIO_PMA_EXTABLE_NBT 0x4000 /* 2.5/5GBASE-T ability */
>
> /* PHY XGXS lane state register. */
> #define MDIO_PHYXS_LNSTAT_SYNC0 0x0001
> @@ -234,9 +242,13 @@
> #define MDIO_PCS_10GBRT_STAT2_BER 0x3f00
>
> /* AN 10GBASE-T control register. */
> +#define MDIO_AN_10GBT_CTRL_ADV2_5G 0x0080 /* Advertise 2.5GBASE-T */
> +#define MDIO_AN_10GBT_CTRL_ADV5G 0x0100 /* Advertise 5GBASE-T */
> #define MDIO_AN_10GBT_CTRL_ADV10G 0x1000 /* Advertise 10GBASE-T */
>
> /* AN 10GBASE-T status register. */
> +#define MDIO_AN_10GBT_STAT_LP2_5G 0x0020 /* LP is 2.5GBT capable */
> +#define MDIO_AN_10GBT_STAT_LP5G 0x0040 /* LP is 5GBT capable */
> #define MDIO_AN_10GBT_STAT_LPTRR 0x0200 /* LP training reset req. */
> #define MDIO_AN_10GBT_STAT_LPLTABLE 0x0400 /* LP loop timing ability */
> #define MDIO_AN_10GBT_STAT_LP10G 0x0800 /* LP is 10GBT capable */
> @@ -265,6 +277,10 @@
> #define MDIO_EEE_10GKX4 0x0020 /* 10G KX4 EEE cap */
> #define MDIO_EEE_10GKR 0x0040 /* 10G KR EEE cap */
>
> +/* 2.5G/5G Extended abilities register. */
> +#define MDIO_PMA_NG_EXTABLE_2_5GBT 0x0001 /* 2.5GBASET ability */
> +#define MDIO_PMA_NG_EXTABLE_5GBT 0x0002 /* 5GBASET ability */
> +
> /* LASI RX_ALARM control/status registers. */
> #define MDIO_PMA_LASI_RX_PHYXSLFLT 0x0001 /* PHY XS RX local fault */
> #define MDIO_PMA_LASI_RX_PCSLFLT 0x0008 /* PCS RX local fault */
>
Looks good to me.
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Heiner
^ permalink raw reply
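To see how the new speed codes from the quoted diff fit into the control register, here is a small standalone decoder. The three MDIO_CTRL1_SPEED* values are taken from the diff. The BMCR_* values and the two mask definitions are not part of the quoted hunks and are reproduced here as an assumption about the existing mii.h/mdio.h headers. The function name is invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed from the existing uapi headers (not shown in the quoted diff). */
#define BMCR_SPEED1000          0x0040
#define BMCR_SPEED100           0x2000
#define MDIO_CTRL1_SPEEDSELEXT  (BMCR_SPEED1000 | BMCR_SPEED100)
#define MDIO_CTRL1_SPEEDSEL     (MDIO_CTRL1_SPEEDSELEXT | 0x3c)

/* As they appear in the quoted diff. */
#define MDIO_CTRL1_SPEED10G     (MDIO_CTRL1_SPEEDSELEXT | 0x00)
#define MDIO_CTRL1_SPEED2_5G    (MDIO_CTRL1_SPEEDSELEXT | 0x18)
#define MDIO_CTRL1_SPEED5G      (MDIO_CTRL1_SPEEDSELEXT | 0x1c)

/* Hypothetical helper: speed in Mb/s, or -1 if the code is not handled. */
static int toy_decode_speed_mbps(uint16_t ctrl1)
{
    /* The new speed codes must fit inside the speed-select mask. */
    assert((MDIO_CTRL1_SPEED5G & MDIO_CTRL1_SPEEDSEL) == MDIO_CTRL1_SPEED5G);

    switch (ctrl1 & MDIO_CTRL1_SPEEDSEL) {
    case MDIO_CTRL1_SPEED2_5G:
        return 2500;
    case MDIO_CTRL1_SPEED5G:
        return 5000;
    case MDIO_CTRL1_SPEED10G:
        return 10000;
    default:
        return -1;
    }
}
```

Under these assumptions the speed field values work out to 0x2058 for 2.5 Gb/s and 0x205c for 5 Gb/s. Bits outside the mask do not affect the result.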