* Re: [RFC] iproute: Add support for extended ack to rtnl_talk
From: Stephen Hemminger @ 2017-05-04 16:42 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: netdev
In-Reply-To: <590AF624.6090808@iogearbox.net>
On Thu, 04 May 2017 11:36:36 +0200
Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 05/04/2017 01:56 AM, Stephen Hemminger wrote:
> > Add support for extended ack error reporting via libmnl. This
> > is a better alternative to use existing library and not copy/paste
> > code from the kernel. Also make arguments const where possible.
> >
> > Add a new function rtnl_talk_extack that takes a callback as an input
> > arg. If a netlink response contains extack attributes, the callback is
> > is invoked with the the err string, offset in the message and a pointer
> > to the message returned by the kernel.
> >
> > Adding a new function allows commands to be moved over to the
> > extended error reporting over time.
> >
> > For feedback, compile tested only.
>
> Just out of curiosity, what is the plan regarding converting iproute2
> over to libmnl (ip, tc, ss, ...)? In 2015, tipc tool was the first
> user merged that requires libmnl, the only other user today in the
> tree is devlink, which even seems to define its own libmnl library
> helpers. What is the clear benefit/rationale of outsourcing this to
> libmnl? I always was the impression we should strive for as little
> dependencies as possible?
>
> I don't really like that we make extended ack reporting now dependent
> on libmnl, which further diverts from iproute's native nl library vs
> requiring to install another nl library, making the current status
> quo even worse ... :/
>
> Thanks,
> Daniel
No rush for migration. just slow migration as time permits.
This would be good kernel janitor type project.
^ permalink raw reply
* Re: [PATCH net-next 9/9] ipvlan: introduce individual MAC addresses
From: Jiri Benc @ 2017-05-04 16:43 UTC (permalink / raw)
To: Chiappero, Marco
Cc: Dan Williams, netdev@vger.kernel.org, David S . Miller,
Kirsher, Jeffrey T, Duyck, Alexander H, Grandhi, Sainath,
Mahesh Bandewar
In-Reply-To: <695CDAEB5A6CE2498DE413F2A9AA82960915EF08@IRSMSX101.ger.corp.intel.com>
On Thu, 4 May 2017 09:37:00 +0000, Chiappero, Marco wrote:
> This looks conceptually wrong. Yes, ipvlan works at L3 (which is an
> implementation detail anyway), but slaves are Ethernet interfaces and
> should behave as much as possible as such regardless, with an
> individual MAC address assigned.
Isn't the proper fix then converting ipvlan interfaces to be L3 only
interfaces? I.e., ARPHRD_NONE? There's not much ipvlan can do with
arbitrary Ethernet frames anyway. Of course, a flag to switch to the
new behavior would be needed in order to preserve backwards
compatibility.
This patchset looks very wrong. For proper support of multiple MAC
addresses, we have macvlan and it's pointless to add that to ipvlan.
And doing some kind of weird MAC NAT in ipvlan just to satisfy broken
tools that can't cope with multiple interfaces with the same MAC address
is wrong, too. Those tools are already broken anyway, there's nothing
preventing anyone to set the same MAC address to multiple interfaces.
I suppose those tools don't work with bonding and bridge, either?
> So, either we fix this by forcing slaves to stay in sync with master,
Yes, that's the correct behavior. Well, at least as correct as one can
get with the ipvlan broken design of pretending that an interface is L2
when in fact, it is not.
Jiri
^ permalink raw reply
* Re: [RFC] iproute: Add support for extended ack to rtnl_talk
From: Stephen Hemminger @ 2017-05-04 16:43 UTC (permalink / raw)
To: David Miller; +Cc: dsahern, daniel, netdev
In-Reply-To: <20170504.104103.1628291573330660235.davem@davemloft.net>
On Thu, 04 May 2017 10:41:03 -0400 (EDT)
David Miller <davem@davemloft.net> wrote:
> From: David Ahern <dsahern@gmail.com>
> Date: Thu, 4 May 2017 08:27:35 -0600
>
> > On 5/4/17 3:36 AM, Daniel Borkmann wrote:
> >> What is the clear benefit/rationale of outsourcing this to
> >> libmnl? I always was the impression we should strive for as little
> >> dependencies as possible?
> >
> > +1
>
> Agreed, all else being equal iproute2 should be as self contained
> as possible since it is such a fundamental tool.
Sorry, the old netlink code is more difficult to understand than libmnl.
Having dependency on a library is not a problem. There already is
an alternative implementation of ip commands in busybox for those
people trying to work in small environments.
^ permalink raw reply
* Re: [RFC] iproute: Add support for extended ack to rtnl_talk
From: Stephen Hemminger @ 2017-05-04 16:45 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: Daniel Borkmann, netdev
In-Reply-To: <20170504143738.GY22833@mtr-leonro.local>
[-- Attachment #1: Type: text/plain, Size: 2452 bytes --]
On Thu, 4 May 2017 17:37:38 +0300
Leon Romanovsky <leon@kernel.org> wrote:
> On Thu, May 04, 2017 at 11:36:36AM +0200, Daniel Borkmann wrote:
> > On 05/04/2017 01:56 AM, Stephen Hemminger wrote:
> > > Add support for extended ack error reporting via libmnl. This
> > > is a better alternative to use existing library and not copy/paste
> > > code from the kernel. Also make arguments const where possible.
> > >
> > > Add a new function rtnl_talk_extack that takes a callback as an input
> > > arg. If a netlink response contains extack attributes, the callback is
> > > is invoked with the the err string, offset in the message and a pointer
> > > to the message returned by the kernel.
> > >
> > > Adding a new function allows commands to be moved over to the
> > > extended error reporting over time.
> > >
> > > For feedback, compile tested only.
> >
> > Just out of curiosity, what is the plan regarding converting iproute2
> > over to libmnl (ip, tc, ss, ...)? In 2015, tipc tool was the first
> > user merged that requires libmnl, the only other user today in the
> > tree is devlink, which even seems to define its own libmnl library
> > helpers. What is the clear benefit/rationale of outsourcing this to
> > libmnl? I always was the impression we should strive for as little
> > dependencies as possible?
>
> And I would like to get direction for the RDMA tool [1] which I'm
> working on it now.
>
> The overall decision was to use netlink and put it under iproute2
> umbrella. Currently, I have working RFC which is based on
> legacy sysfs interface to ensure that we are converging on
> user-experience even before moving to actual netlink defines.
>
> An I would like to continue to work on netlink interface, but which lib interface
> should I need to base rdmatool's netlink code?
>
> [1] https://www.mail-archive.com/netdev@vger.kernel.org/msg148523.html
>
> >
> > I don't really like that we make extended ack reporting now dependent
> > on libmnl, which further diverts from iproute's native nl library vs
> > requiring to install another nl library, making the current status
> > quo even worse ... :/
> >
> > Thanks,
> > Daniel
I would prefer new code use libmnl, but using libnetlink would also be ok.
Any later conversion to libmnl would be mostly automated anyway.
The real objection was copy/pasting in the kernel netlink parser.
That was unnecessary bloat.
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: [net-ipv4] question about arguments position
From: David Miller @ 2017-05-04 16:46 UTC (permalink / raw)
To: garsilva; +Cc: kuznet, jmorris, yoshfuji, kaber, netdev, linux-kernel
In-Reply-To: <20170504110755.Horde.9md-X4baf8TwOEKjGrXkelZ@gator4166.hostgator.com>
From: "Gustavo A. R. Silva" <garsilva@embeddedor.com>
Date: Thu, 04 May 2017 11:07:54 -0500
> While looking into Coverity ID 1357474 I ran into the following piece
> of code at net/ipv4/inet_diag.c:392:
Because it's been this way since at least 2005, it doesn't matter if
the order is correct or not. What's there is the locked in behavior
exposed to userspace and changing it will break things for people.
^ permalink raw reply
* Fw: [Bug 195217] siocsifflags - irda doesn't work (MCS7780)
From: Stephen Hemminger @ 2017-05-04 16:48 UTC (permalink / raw)
To: Samuel Ortiz; +Cc: netdev
Apparently IRDA is broken by VMAP_STACK
Begin forwarded message:
Date: Thu, 04 May 2017 12:16:15 +0000
From: bugzilla-daemon@bugzilla.kernel.org
To: stephen@networkplumber.org
Subject: [Bug 195217] siocsifflags - irda doesn't work (MCS7780)
https://bugzilla.kernel.org/show_bug.cgi?id=195217
Jerome (jb_69100@yahoo.com) changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|Other |USB
Kernel Version|4.9.X, 4.10.X, 4.11-rcX |4.9.0 --> 4.11.0
Product|Networking |Drivers
Severity|blocking |high
--- Comment #2 from Jerome (jb_69100@yahoo.com) ---
Hi,
If I compile the kernel by setting CONFIG_VMAP_STACK = N, "ifconfig irda0 up"
is working correctly. (Tested with kernel 4.11.0)
I suppose the problem comes either from the MCS7780 driver (use the stack not
"in the rules"?) Or from VMAP_STACK.
Can you help me better understand the problem?
Regards,
Jerome.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply
* Re: [PATCH] aquantia: Fix "ethtool -S" crash when adapter down.
From: Lino Sanfilippo @ 2017-05-04 16:48 UTC (permalink / raw)
To: Pavel Belous, David S . Miller; +Cc: netdev, David Arcari
In-Reply-To: <90666578f043b366313ddd90ffad86de42d890f2.1493914743.git.pavel.belous@aquantia.com>
Hi Pavel,
On 04.05.2017 18:33, Pavel Belous wrote:
> From: Pavel Belous <pavel.belous@aquantia.com>
>
> This patch fixes the crash that happens when driver tries to collect statistics
> from already released "aq_vec" object.
>
> Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code")
> Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
> ---
> drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> index cdb0299..3a32573 100644
> --- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> @@ -755,7 +755,7 @@ void aq_nic_get_stats(struct aq_nic_s *self, u64 *data)
> count = 0U;
>
> for (i = 0U, aq_vec = self->aq_vec[0];
> - self->aq_vecs > i; ++i, aq_vec = self->aq_vec[i]) {
> + aq_vec && self->aq_vecs > i; ++i, aq_vec = self->aq_vec[i]) {
> data += count;
> aq_vec_get_sw_stats(aq_vec, data, &count);
> }
> @@ -961,6 +961,7 @@ void aq_nic_free_hot_resources(struct aq_nic_s *self)
> for (i = AQ_DIMOF(self->aq_vec); i--;) {
> if (self->aq_vec[i])
> aq_vec_free(self->aq_vec[i]);
> + self->aq_vec[i] = NULL;
> }
>
> err_exit:;
>
if the driver does not support statistics when the interface is down, would not it be clearer
to check if netif_running() in get_stats() instead?
Regards,
Lino
^ permalink raw reply
* struct ip vs struct iphdr
From: Oleg @ 2017-05-04 16:42 UTC (permalink / raw)
To: netdev
Hi, all.
It seems struct ip and struct iphdr are similar: struct ip, despite of
it name, doesn't contain anything but ip header.
So, my noob question, what is the difference between them?
Thanks.
--
Олег Неманов (Oleg Nemanov)
^ permalink raw reply
* Re: [PATCH] aquantia: Fix "ethtool -S" crash when adapter down.
From: David Miller @ 2017-05-04 16:51 UTC (permalink / raw)
To: LinoSanfilippo; +Cc: Pavel.Belous, netdev, darcari
In-Reply-To: <0babf800-080c-96e0-4dbb-8d3fca3fb784@gmx.de>
From: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Date: Thu, 4 May 2017 18:48:12 +0200
> Hi Pavel,
>
> On 04.05.2017 18:33, Pavel Belous wrote:
>> From: Pavel Belous <pavel.belous@aquantia.com>
>>
>> This patch fixes the crash that happens when driver tries to collect statistics
>> from already released "aq_vec" object.
>>
>> Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code")
>> Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
>> ---
>> drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
>> index cdb0299..3a32573 100644
>> --- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
>> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
>> @@ -755,7 +755,7 @@ void aq_nic_get_stats(struct aq_nic_s *self, u64 *data)
>> count = 0U;
>>
>> for (i = 0U, aq_vec = self->aq_vec[0];
>> - self->aq_vecs > i; ++i, aq_vec = self->aq_vec[i]) {
>> + aq_vec && self->aq_vecs > i; ++i, aq_vec = self->aq_vec[i]) {
>> data += count;
>> aq_vec_get_sw_stats(aq_vec, data, &count);
>> }
>> @@ -961,6 +961,7 @@ void aq_nic_free_hot_resources(struct aq_nic_s *self)
>> for (i = AQ_DIMOF(self->aq_vec); i--;) {
>> if (self->aq_vec[i])
>> aq_vec_free(self->aq_vec[i]);
>> + self->aq_vec[i] = NULL;
>> }
>>
>> err_exit:;
>>
>
> if the driver does not support statistics when the interface is down, would not it be clearer
> to check if netif_running() in get_stats() instead?
Yes, much cleaner.
Much better would be to have a cached software copy so that statistics
can be reported regardless of whether the device is down or not.
^ permalink raw reply
* Re: [Patch net] ipv6: initialize route null entry in addrconf_init()
From: David Miller @ 2017-05-04 16:51 UTC (permalink / raw)
To: andreyknvl; +Cc: xiyou.wangcong, netdev, dsahern
In-Reply-To: <CAAeHK+zmMUvt4Lz28k3Oyxi1QFmrTMBe34=6xJDBPxkeivnPeA@mail.gmail.com>
From: Andrey Konovalov <andreyknvl@google.com>
Date: Thu, 4 May 2017 14:28:37 +0200
> On Thu, May 4, 2017 at 7:07 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> Andrey reported a crash on init_net.ipv6.ip6_null_entry->rt6i_idev
>> since it is always NULL.
>>
>> This is clearly wrong, we have code to initialize it to loopback_dev,
>> unfortunately the order is still not correct.
>>
>> loopback_dev is registered very early during boot, we lose a chance
>> to re-initialize it in notifier. addrconf_init() is called after
>> ip6_route_init(), which means we have no chance to correct it.
>>
>> Fix it by moving this initialization explicitly after
>> ipv6_add_dev(init_net.loopback_dev) in addrconf_init().
>>
>> Reported-by: Andrey Konovalov <andreyknvl@google.com>
>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>
> Hi Cong,
>
> This fixes the bug triggered by my reproducer.
>
> Thanks!
>
> Tested-by: Andrey Konovalov <andreyknvl@google.com>
Applied and queued up for -stable, thanks.
^ permalink raw reply
* Re: struct ip vs struct iphdr
From: Sowmini Varadhan @ 2017-05-04 16:52 UTC (permalink / raw)
To: Oleg; +Cc: netdev
In-Reply-To: <20170504164219.GA25287@legohost>
On (05/04/17 19:42), Oleg wrote:
>
> Hi, all.
>
> It seems struct ip and struct iphdr are similar: struct ip, despite of
> it name, doesn't contain anything but ip header.
>
> So, my noob question, what is the difference between them?
>
> Thanks.
BSD vs linux?
struct ip is a BSD-ism, intended to be used if you were porting
some BSD app.
--Sowmini
^ permalink raw reply
* Re: [PATCH] aquantia: Fix "ethtool -S" crash when adapter down.
From: David Arcari @ 2017-05-04 17:00 UTC (permalink / raw)
To: Pavel Belous, David S . Miller; +Cc: netdev
In-Reply-To: <90666578f043b366313ddd90ffad86de42d890f2.1493914743.git.pavel.belous@aquantia.com>
Hi Pavel,
On 05/04/2017 12:33 PM, Pavel Belous wrote:
> From: Pavel Belous <pavel.belous@aquantia.com>
>
> This patch fixes the crash that happens when driver tries to collect statistics
> from already released "aq_vec" object.
>
> Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code")
> Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
> ---
> drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> index cdb0299..3a32573 100644
> --- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> @@ -755,7 +755,7 @@ void aq_nic_get_stats(struct aq_nic_s *self, u64 *data)
> count = 0U;
>
> for (i = 0U, aq_vec = self->aq_vec[0];
> - self->aq_vecs > i; ++i, aq_vec = self->aq_vec[i]) {
> + aq_vec && self->aq_vecs > i; ++i, aq_vec = self->aq_vec[i]) {
> data += count;
> aq_vec_get_sw_stats(aq_vec, data, &count);
> }
> @@ -961,6 +961,7 @@ void aq_nic_free_hot_resources(struct aq_nic_s *self)
> for (i = AQ_DIMOF(self->aq_vec); i--;) {
> if (self->aq_vec[i])
> aq_vec_free(self->aq_vec[i]);
> + self->aq_vec[i] = NULL;
I think you intended to to add { } to the if statement. The code compiles as
is, but the indentation is not correct.
-DA
> }
>
> err_exit:;
>
^ permalink raw reply
* Why do we need MSG_SENDPAGE_NOTLAST?
From: Ilya Lesokhin @ 2017-05-04 17:03 UTC (permalink / raw)
To: netdev@vger.kernel.org; +Cc: tls-fpga-sw-dev, Dave Watson
I don't understand the need for MSG_SENDPAGE_NOTLAST and I'm hoping someone can enlighten me.
According to commit 35f9c09 ('tcp: tcp_sendpages() should call tcp_push() once'):
"We need to call tcp_flush() at the end of the last page processed in
tcp_sendpages(), or else transmits can be deferred and future sends
stall."
I don't understand why we need to differentiate between the user setting MSG_MORE
and splice indicating that more data is going to be sent.
if the user passed MSG_MORE and didn't push any extra data, isn't it the users fault?
Do we need it because poorly written applications were broken when
MSG_MORE was added to tcp_sendpage? Or is there a deeper reason?
The reason I'm asking is that we are working on a kernel TLS implementation
and I would like to know if we can coalesce multiple tls_sendpage calls with MSG_MORE into a single
tls record or whether we must push out the record as soon as MSG_SENDPAGE_NOTLAST is cleared?
Thanks,
Ilya
^ permalink raw reply
* Re: [PATCH] aquantia: Fix "ethtool -S" crash when adapter down.
From: Pavel Belous @ 2017-05-04 17:08 UTC (permalink / raw)
To: David Arcari, David S . Miller; +Cc: netdev
In-Reply-To: <50047294-4f40-e8ae-0982-5720b87019a9@redhat.com>
On 04.05.2017 20:00, David Arcari wrote:
> Hi Pavel,
>
> On 05/04/2017 12:33 PM, Pavel Belous wrote:
>> From: Pavel Belous <pavel.belous@aquantia.com>
>>
>> This patch fixes the crash that happens when driver tries to collect statistics
>> from already released "aq_vec" object.
>>
>> Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code")
>> Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
>> ---
>> drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
>> index cdb0299..3a32573 100644
>> --- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
>> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
>> @@ -755,7 +755,7 @@ void aq_nic_get_stats(struct aq_nic_s *self, u64 *data)
>> count = 0U;
>>
>> for (i = 0U, aq_vec = self->aq_vec[0];
>> - self->aq_vecs > i; ++i, aq_vec = self->aq_vec[i]) {
>> + aq_vec && self->aq_vecs > i; ++i, aq_vec = self->aq_vec[i]) {
>> data += count;
>> aq_vec_get_sw_stats(aq_vec, data, &count);
>> }
>> @@ -961,6 +961,7 @@ void aq_nic_free_hot_resources(struct aq_nic_s *self)
>> for (i = AQ_DIMOF(self->aq_vec); i--;) {
>> if (self->aq_vec[i])
>> aq_vec_free(self->aq_vec[i]);
>> + self->aq_vec[i] = NULL;
>
> I think you intended to to add { } to the if statement. The code compiles as
> is, but the indentation is not correct.
>
> -DA
Oh. Sorry about that. I did not see the loss of braces during merge.
I will prepare another patch with Lino and David M. comments.
Regards,
Pavel.
>
>> }
>>
>> err_exit:;
>>
>
^ permalink raw reply
* Re: struct ip vs struct iphdr
From: Girish Moodalbail @ 2017-05-04 17:08 UTC (permalink / raw)
To: Oleg, netdev
In-Reply-To: <20170504164219.GA25287@legohost>
On 5/4/17 9:42 AM, Oleg wrote:
> Hi, all.
>
> It seems struct ip and struct iphdr are similar: struct ip, despite of
> it name, doesn't contain anything but ip header.
>
> So, my noob question, what is the difference between them?
Also, see this:
http://stackoverflow.com/questions/42840636/difference-between-struct-ip-and-struct-iphdr
>
> Thanks.
>
^ permalink raw reply
* Re: [PATCH] aquantia: Fix "ethtool -S" crash when adapter down.
From: Pavel Belous @ 2017-05-04 17:09 UTC (permalink / raw)
To: David Miller, LinoSanfilippo; +Cc: netdev, darcari
In-Reply-To: <20170504.125113.569969253752966231.davem@davemloft.net>
On 04.05.2017 19:51, David Miller wrote:
> From: Lino Sanfilippo <LinoSanfilippo@gmx.de>
> Date: Thu, 4 May 2017 18:48:12 +0200
>
>> Hi Pavel,
>>
>> On 04.05.2017 18:33, Pavel Belous wrote:
>>> From: Pavel Belous <pavel.belous@aquantia.com>
>>>
>>> This patch fixes the crash that happens when driver tries to collect statistics
>>> from already released "aq_vec" object.
>>>
>>> Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code")
>>> Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
>>> ---
>>> drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
>>> index cdb0299..3a32573 100644
>>> --- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
>>> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
>>> @@ -755,7 +755,7 @@ void aq_nic_get_stats(struct aq_nic_s *self, u64 *data)
>>> count = 0U;
>>>
>>> for (i = 0U, aq_vec = self->aq_vec[0];
>>> - self->aq_vecs > i; ++i, aq_vec = self->aq_vec[i]) {
>>> + aq_vec && self->aq_vecs > i; ++i, aq_vec = self->aq_vec[i]) {
>>> data += count;
>>> aq_vec_get_sw_stats(aq_vec, data, &count);
>>> }
>>> @@ -961,6 +961,7 @@ void aq_nic_free_hot_resources(struct aq_nic_s *self)
>>> for (i = AQ_DIMOF(self->aq_vec); i--;) {
>>> if (self->aq_vec[i])
>>> aq_vec_free(self->aq_vec[i]);
>>> + self->aq_vec[i] = NULL;
>>> }
>>>
>>> err_exit:;
>>>
>>
>> if the driver does not support statistics when the interface is down, would not it be clearer
>> to check if netif_running() in get_stats() instead?
>
> Yes, much cleaner.
>
> Much better would be to have a cached software copy so that statistics
> can be reported regardless of whether the device is down or not.
>
Thank you.
I will think about how to do it better.
Regards,
Pavel
^ permalink raw reply
* Re: [Patch net] ipv6: initialize route null entry in addrconf_init()
From: David Ahern @ 2017-05-04 17:12 UTC (permalink / raw)
To: David Miller, andreyknvl, xiyou.wangcong; +Cc: netdev
In-Reply-To: <20170504.125144.78973292337920988.davem@davemloft.net>
On 5/4/17 10:51 AM, David Miller wrote:
> From: Andrey Konovalov <andreyknvl@google.com>
> Date: Thu, 4 May 2017 14:28:37 +0200
>
>> On Thu, May 4, 2017 at 7:07 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>> Andrey reported a crash on init_net.ipv6.ip6_null_entry->rt6i_idev
>>> since it is always NULL.
>>>
>>> This is clearly wrong, we have code to initialize it to loopback_dev,
>>> unfortunately the order is still not correct.
>>>
>>> loopback_dev is registered very early during boot, we lose a chance
>>> to re-initialize it in notifier. addrconf_init() is called after
>>> ip6_route_init(), which means we have no chance to correct it.
>>>
>>> Fix it by moving this initialization explicitly after
>>> ipv6_add_dev(init_net.loopback_dev) in addrconf_init().
>>>
>>> Reported-by: Andrey Konovalov <andreyknvl@google.com>
>>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>>
>> Hi Cong,
>>
>> This fixes the bug triggered by my reproducer.
>>
>> Thanks!
>>
>> Tested-by: Andrey Konovalov <andreyknvl@google.com>
>
> Applied and queued up for -stable, thanks.
>
This is not the complete solution; it only fixes init_net. It still
blows up when you do:
unshare -n
./rt6_device_match
same exact stack trace
^ permalink raw reply
* Re: [PATCH] cfg80211: make RATE_INFO_BW_20 the default
From: David Miller @ 2017-05-04 17:15 UTC (permalink / raw)
To: johannes; +Cc: linux-wireless, torvalds, netdev, johannes.berg
In-Reply-To: <20170504064230.22058-1-johannes@sipsolutions.net>
From: Johannes Berg <johannes@sipsolutions.net>
Date: Thu, 4 May 2017 08:42:30 +0200
> From: Johannes Berg <johannes.berg@intel.com>
>
> Due to the way I did the RX bitrate conversions in mac80211 with
> spatch, going setting flags to setting the value, many drivers now
> don't set the bandwidth value for 20 MHz, since with the flags it
> wasn't necessary to (there was no 20 MHz flag, only the others.)
>
> Rather than go through and try to fix up all the drivers, instead
> renumber the enum so that 20 MHz, which is the typical bandwidth,
> actually has the value 0, making those drivers all work again.
>
> If VHT was hit used with a driver not reporting it, e.g. iwlmvm,
> this manifested in hitting the bandwidth warning in
> cfg80211_calculate_bitrate_vht().
>
> Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Since Jens Axboe had the same problem and tested this patch, I'm
tossing it into my tree.
Just FYI...
^ permalink raw reply
* Re: [net-ipv4] question about arguments position
From: Gustavo A. R. Silva @ 2017-05-04 16:56 UTC (permalink / raw)
To: David Miller; +Cc: kuznet, jmorris, yoshfuji, kaber, netdev, linux-kernel
In-Reply-To: <20170504.124600.627647229351638856.davem@davemloft.net>
Hi David,
Quoting David Miller <davem@davemloft.net>:
> From: "Gustavo A. R. Silva" <garsilva@embeddedor.com>
> Date: Thu, 04 May 2017 11:07:54 -0500
>
>> While looking into Coverity ID 1357474 I ran into the following piece
>> of code at net/ipv4/inet_diag.c:392:
>
> Because it's been this way since at least 2005, it doesn't matter if
> the order is correct or not. What's there is the locked in behavior
> exposed to userspace and changing it will break things for people.
Oh, I see.
Thanks for clarifying
^ permalink raw reply
* Re: [Patch net] ipv6: initialize route null entry in addrconf_init()
From: Cong Wang @ 2017-05-04 17:19 UTC (permalink / raw)
To: David Ahern
Cc: David Miller, Andrey Konovalov, Linux Kernel Network Developers
In-Reply-To: <896e63a7-d0ad-a4db-21fc-9abdb02dd9df@gmail.com>
On Thu, May 4, 2017 at 10:12 AM, David Ahern <dsahern@gmail.com> wrote:
> On 5/4/17 10:51 AM, David Miller wrote:
>> From: Andrey Konovalov <andreyknvl@google.com>
>> Date: Thu, 4 May 2017 14:28:37 +0200
>>
>>> On Thu, May 4, 2017 at 7:07 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>> Andrey reported a crash on init_net.ipv6.ip6_null_entry->rt6i_idev
>>>> since it is always NULL.
>>>>
>>>> This is clearly wrong, we have code to initialize it to loopback_dev,
>>>> unfortunately the order is still not correct.
>>>>
>>>> loopback_dev is registered very early during boot, we lose a chance
>>>> to re-initialize it in notifier. addrconf_init() is called after
>>>> ip6_route_init(), which means we have no chance to correct it.
>>>>
>>>> Fix it by moving this initialization explicitly after
>>>> ipv6_add_dev(init_net.loopback_dev) in addrconf_init().
>>>>
>>>> Reported-by: Andrey Konovalov <andreyknvl@google.com>
>>>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>>>
>>> Hi Cong,
>>>
>>> This fixes the bug triggered by my reproducer.
>>>
>>> Thanks!
>>>
>>> Tested-by: Andrey Konovalov <andreyknvl@google.com>
>>
>> Applied and queued up for -stable, thanks.
>>
>
> This is not the complete solution; it only fixes init_net. It still
> blows up when you do:
>
> unshare -n
> ./rt6_device_match
>
>
> same exact stack trace
This is why I sent
[Patch net] ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf
^ permalink raw reply
* Re: [Patch net] ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf
From: Cong Wang @ 2017-05-04 17:21 UTC (permalink / raw)
To: David Ahern; +Cc: Linux Kernel Network Developers, Andrey Konovalov
In-Reply-To: <a127b7fe-82a7-53e9-8df4-704f1b295ada@gmail.com>
On Thu, May 4, 2017 at 7:04 AM, David Ahern <dsahern@gmail.com> wrote:
> On 5/3/17 11:07 PM, Cong Wang wrote:
>> For each netns (except init_net), we initialize its null entry
>> in 3 places:
>>
>> 1) The template itself, as we use kmemdup()
>> 2) Code around dst_init_metrics() in ip6_route_net_init()
>> 3) ip6_route_dev_notify(), which is supposed to initialize it after
>> loopback registers
>>
>> Unfortunately the last one still happens in a wrong order because
>> we expect to initialize net->ipv6.ip6_null_entry->rt6i_idev to
>> net->loopback_dev's idev, so we have to do that after we add
>> idev to it. However, this notifier has priority == 0 same as
>> ipv6_dev_notf, and ipv6_dev_notf is registered after
>> ip6_route_dev_notifier so it is called actually after
>> ip6_route_dev_notifier.
>>
>> Fix it by specifying a smaller priority for ip6_route_dev_notifier.
>>
>> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>> ---
>> net/ipv6/route.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>> index 2f11366..4dbf7e2 100644
>> --- a/net/ipv6/route.c
>> +++ b/net/ipv6/route.c
>> @@ -4024,7 +4024,7 @@ static struct pernet_operations ip6_route_net_late_ops = {
>>
>> static struct notifier_block ip6_route_dev_notifier = {
>> .notifier_call = ip6_route_dev_notify,
>> - .priority = 0,
>> + .priority = -10, /* Must be called after addrconf_notify!! */
>> };
>>
>> void __init ip6_route_init_special_entries(void)
>>
>
> And I see a refcnt problem with this change:
>
> root@kenny-jessie2:~# unshare -n
> root@kenny-jessie2:~# logout
> root@kenny-jessie2:~# unshare -n
>
> Message from syslogd@kenny-jessie2 at May 4 07:04:38 ...
> kernel:[ 62.581552] unregister_netdevice: waiting for lo to become
> free. Usage count = 1
Ah, looks like we need to put the refcnt for UNREGISTER too.
Will send v2 to include your ADDRCONF_NOTIFY_PRIORITY suggestion.
^ permalink raw reply
* [PATCH] mac80211: Create ieee80211_if_process_skb from ieee80211_iface_work
From: Joe Perches @ 2017-05-04 17:30 UTC (permalink / raw)
To: Johannes Berg; +Cc: David S. Miller, linux-wireless, netdev, linux-kernel
This function is pretty long and the skb handling is a bit long too.
Create a new function just for the skb processing.
This isolates the code and reduces indentation a bit too.
No change in object size.
$ size net/mac80211/iface.o*
text data bss dec hex filename
15736 24 0 15760 3d90 net/mac80211/iface.o.new
15736 24 0 15760 3d90 net/mac80211/iface.o.old
Miscellanea:
o Use explicit casts to proper types instead of casts to (void *)
and have the compiler do the implicit cast
o Rewrap comments
Signed-off-by: Joe Perches <joe@perches.com>
---
net/mac80211/iface.c | 253 ++++++++++++++++++++++++++-------------------------
1 file changed, 127 insertions(+), 126 deletions(-)
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 3bd5b81f5d81..b51d3956feaa 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -1230,145 +1230,125 @@ static void ieee80211_if_setup_no_queue(struct net_device *dev)
dev->priv_flags |= IFF_NO_QUEUE;
}
-static void ieee80211_iface_work(struct work_struct *work)
+static void ieee80211_if_process_skb(struct sk_buff *skb,
+ struct ieee80211_sub_if_data *sdata,
+ struct ieee80211_local *local)
{
- struct ieee80211_sub_if_data *sdata =
- container_of(work, struct ieee80211_sub_if_data, work);
- struct ieee80211_local *local = sdata->local;
- struct sk_buff *skb;
- struct sta_info *sta;
struct ieee80211_ra_tid *ra_tid;
struct ieee80211_rx_agg *rx_agg;
-
- if (!ieee80211_sdata_running(sdata))
- return;
-
- if (test_bit(SCAN_SW_SCANNING, &local->scanning))
- return;
-
- if (!ieee80211_can_run_worker(local))
- return;
-
- /* first process frames */
- while ((skb = skb_dequeue(&sdata->skb_queue))) {
- struct ieee80211_mgmt *mgmt = (void *)skb->data;
-
- if (skb->pkt_type == IEEE80211_SDATA_QUEUE_AGG_START) {
- ra_tid = (void *)&skb->cb;
- ieee80211_start_tx_ba_cb(&sdata->vif, ra_tid->ra,
- ra_tid->tid);
- } else if (skb->pkt_type == IEEE80211_SDATA_QUEUE_AGG_STOP) {
- ra_tid = (void *)&skb->cb;
- ieee80211_stop_tx_ba_cb(&sdata->vif, ra_tid->ra,
- ra_tid->tid);
- } else if (skb->pkt_type == IEEE80211_SDATA_QUEUE_RX_AGG_START) {
- rx_agg = (void *)&skb->cb;
- mutex_lock(&local->sta_mtx);
- sta = sta_info_get_bss(sdata, rx_agg->addr);
- if (sta)
- __ieee80211_start_rx_ba_session(sta,
- 0, 0, 0, 1, rx_agg->tid,
- IEEE80211_MAX_AMPDU_BUF,
- false, true);
- mutex_unlock(&local->sta_mtx);
- } else if (skb->pkt_type == IEEE80211_SDATA_QUEUE_RX_AGG_STOP) {
- rx_agg = (void *)&skb->cb;
- mutex_lock(&local->sta_mtx);
- sta = sta_info_get_bss(sdata, rx_agg->addr);
- if (sta)
- __ieee80211_stop_rx_ba_session(sta,
- rx_agg->tid,
- WLAN_BACK_RECIPIENT, 0,
- false);
- mutex_unlock(&local->sta_mtx);
- } else if (ieee80211_is_action(mgmt->frame_control) &&
- mgmt->u.action.category == WLAN_CATEGORY_BACK) {
- int len = skb->len;
-
- mutex_lock(&local->sta_mtx);
- sta = sta_info_get_bss(sdata, mgmt->sa);
- if (sta) {
- switch (mgmt->u.action.u.addba_req.action_code) {
- case WLAN_ACTION_ADDBA_REQ:
- ieee80211_process_addba_request(
- local, sta, mgmt, len);
- break;
- case WLAN_ACTION_ADDBA_RESP:
- ieee80211_process_addba_resp(local, sta,
- mgmt, len);
- break;
- case WLAN_ACTION_DELBA:
- ieee80211_process_delba(sdata, sta,
+ struct sta_info *sta;
+ struct ieee80211_mgmt *mgmt = (void *)skb->data;
+
+ if (skb->pkt_type == IEEE80211_SDATA_QUEUE_AGG_START) {
+ ra_tid = (struct ieee80211_ra_tid *)&skb->cb;
+ ieee80211_start_tx_ba_cb(&sdata->vif, ra_tid->ra, ra_tid->tid);
+ } else if (skb->pkt_type == IEEE80211_SDATA_QUEUE_AGG_STOP) {
+ ra_tid = (struct ieee80211_ra_tid *)&skb->cb;
+ ieee80211_stop_tx_ba_cb(&sdata->vif, ra_tid->ra, ra_tid->tid);
+ } else if (skb->pkt_type == IEEE80211_SDATA_QUEUE_RX_AGG_START) {
+ rx_agg = (struct ieee80211_rx_agg *)&skb->cb;
+ mutex_lock(&local->sta_mtx);
+ sta = sta_info_get_bss(sdata, rx_agg->addr);
+ if (sta)
+ __ieee80211_start_rx_ba_session(sta,
+ 0, 0, 0, 1, rx_agg->tid,
+ IEEE80211_MAX_AMPDU_BUF,
+ false, true);
+ mutex_unlock(&local->sta_mtx);
+ } else if (skb->pkt_type == IEEE80211_SDATA_QUEUE_RX_AGG_STOP) {
+ rx_agg = (struct ieee80211_rx_agg *)&skb->cb;
+ mutex_lock(&local->sta_mtx);
+ sta = sta_info_get_bss(sdata, rx_agg->addr);
+ if (sta)
+ __ieee80211_stop_rx_ba_session(sta, rx_agg->tid,
+ WLAN_BACK_RECIPIENT, 0,
+ false);
+ mutex_unlock(&local->sta_mtx);
+ } else if (ieee80211_is_action(mgmt->frame_control) &&
+ mgmt->u.action.category == WLAN_CATEGORY_BACK) {
+ int len = skb->len;
+
+ mutex_lock(&local->sta_mtx);
+ sta = sta_info_get_bss(sdata, mgmt->sa);
+ if (sta) {
+ switch (mgmt->u.action.u.addba_req.action_code) {
+ case WLAN_ACTION_ADDBA_REQ:
+ ieee80211_process_addba_request(local, sta,
mgmt, len);
- break;
- default:
- WARN_ON(1);
- break;
- }
- }
- mutex_unlock(&local->sta_mtx);
- } else if (ieee80211_is_action(mgmt->frame_control) &&
- mgmt->u.action.category == WLAN_CATEGORY_VHT) {
- switch (mgmt->u.action.u.vht_group_notif.action_code) {
- case WLAN_VHT_ACTION_OPMODE_NOTIF: {
- struct ieee80211_rx_status *status;
- enum nl80211_band band;
- u8 opmode;
-
- status = IEEE80211_SKB_RXCB(skb);
- band = status->band;
- opmode = mgmt->u.action.u.vht_opmode_notif.operating_mode;
-
- mutex_lock(&local->sta_mtx);
- sta = sta_info_get_bss(sdata, mgmt->sa);
-
- if (sta)
- ieee80211_vht_handle_opmode(sdata, sta,
- opmode,
- band);
-
- mutex_unlock(&local->sta_mtx);
break;
- }
- case WLAN_VHT_ACTION_GROUPID_MGMT:
- ieee80211_process_mu_groups(sdata, mgmt);
+ case WLAN_ACTION_ADDBA_RESP:
+ ieee80211_process_addba_resp(local, sta,
+ mgmt, len);
+ break;
+ case WLAN_ACTION_DELBA:
+ ieee80211_process_delba(sdata, sta, mgmt, len);
break;
default:
WARN_ON(1);
break;
}
- } else if (ieee80211_is_data_qos(mgmt->frame_control)) {
- struct ieee80211_hdr *hdr = (void *)mgmt;
- /*
- * So the frame isn't mgmt, but frame_control
- * is at the right place anyway, of course, so
- * the if statement is correct.
- *
- * Warn if we have other data frame types here,
- * they must not get here.
- */
- WARN_ON(hdr->frame_control &
- cpu_to_le16(IEEE80211_STYPE_NULLFUNC));
- WARN_ON(!(hdr->seq_ctrl &
- cpu_to_le16(IEEE80211_SCTL_FRAG)));
- /*
- * This was a fragment of a frame, received while
- * a block-ack session was active. That cannot be
- * right, so terminate the session.
- */
+ }
+ mutex_unlock(&local->sta_mtx);
+ } else if (ieee80211_is_action(mgmt->frame_control) &&
+ mgmt->u.action.category == WLAN_CATEGORY_VHT) {
+ switch (mgmt->u.action.u.vht_group_notif.action_code) {
+ case WLAN_VHT_ACTION_OPMODE_NOTIF: {
+ struct ieee80211_rx_status *status;
+ enum nl80211_band band;
+ u8 opmode;
+
+ status = IEEE80211_SKB_RXCB(skb);
+ band = status->band;
+ opmode = mgmt->u.action.u.vht_opmode_notif.operating_mode;
+
mutex_lock(&local->sta_mtx);
sta = sta_info_get_bss(sdata, mgmt->sa);
- if (sta) {
- u16 tid = *ieee80211_get_qos_ctl(hdr) &
- IEEE80211_QOS_CTL_TID_MASK;
-
- __ieee80211_stop_rx_ba_session(
- sta, tid, WLAN_BACK_RECIPIENT,
- WLAN_REASON_QSTA_REQUIRE_SETUP,
- true);
- }
+
+ if (sta)
+ ieee80211_vht_handle_opmode(sdata, sta,
+ opmode, band);
+
mutex_unlock(&local->sta_mtx);
- } else switch (sdata->vif.type) {
+ break;
+ }
+ case WLAN_VHT_ACTION_GROUPID_MGMT:
+ ieee80211_process_mu_groups(sdata, mgmt);
+ break;
+ default:
+ WARN_ON(1);
+ break;
+ }
+ } else if (ieee80211_is_data_qos(mgmt->frame_control)) {
+ struct ieee80211_hdr *hdr = (void *)mgmt;
+ /*
+ * So the frame isn't mgmt, but frame_control is at the right
+ * place anyway, of course, so the if statement is correct.
+ *
+ * Warn if we have other data frame types here,
+ * they must not get here.
+ */
+ WARN_ON(hdr->frame_control &
+ cpu_to_le16(IEEE80211_STYPE_NULLFUNC));
+ WARN_ON(!(hdr->seq_ctrl & cpu_to_le16(IEEE80211_SCTL_FRAG)));
+ /*
+ * This was a fragment of a frame,
+ * received while a block-ack session was active.
+ * That cannot be right, so terminate the session.
+ */
+ mutex_lock(&local->sta_mtx);
+ sta = sta_info_get_bss(sdata, mgmt->sa);
+ if (sta) {
+ u16 tid = *ieee80211_get_qos_ctl(hdr) &
+ IEEE80211_QOS_CTL_TID_MASK;
+
+ __ieee80211_stop_rx_ba_session(sta, tid,
+ WLAN_BACK_RECIPIENT,
+ WLAN_REASON_QSTA_REQUIRE_SETUP,
+ true);
+ }
+ mutex_unlock(&local->sta_mtx);
+ } else {
+ switch (sdata->vif.type) {
case NL80211_IFTYPE_STATION:
ieee80211_sta_rx_queued_mgmt(sdata, skb);
break;
@@ -1384,7 +1364,28 @@ static void ieee80211_iface_work(struct work_struct *work)
WARN(1, "frame for unexpected interface type");
break;
}
+ }
+}
+
+static void ieee80211_iface_work(struct work_struct *work)
+{
+ struct ieee80211_sub_if_data *sdata =
+ container_of(work, struct ieee80211_sub_if_data, work);
+ struct ieee80211_local *local = sdata->local;
+ struct sk_buff *skb;
+
+ if (!ieee80211_sdata_running(sdata))
+ return;
+
+ if (test_bit(SCAN_SW_SCANNING, &local->scanning))
+ return;
+ if (!ieee80211_can_run_worker(local))
+ return;
+
+ /* first process frames */
+ while ((skb = skb_dequeue(&sdata->skb_queue))) {
+ ieee80211_if_process_skb(skb, sdata, local);
kfree_skb(skb);
}
--
2.10.0.rc2.1.g053435c
^ permalink raw reply related
* Re: [PATCH] cfg80211: make RATE_INFO_BW_20 the default
From: Linus Torvalds @ 2017-05-04 17:33 UTC (permalink / raw)
To: David Miller; +Cc: Johannes Berg, Linux Wireless List, Network Development
In-Reply-To: <20170504.112257.1601480484804301664.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
On Thu, May 4, 2017 at 8:22 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
> From: Johannes Berg <johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org>
>>
>> I figured I'd give Linus to a chance to try or even apply it, but I
>> have no objection to you applying it either. I don't have anything else
>> yet right now, and sending a pull request for just a single patch
>> would be quite pointless.
>
> Ok, let's give Linus a chance to test the patch.
I'm having trouble recreating the warning. I have no idea why. It only
happened during ten minutes yesterday, and nothing in my wireless
setup has changed.
I wonder if *normally* my setup ends up connecting with a 40MHz band
or something, and I just happened to see the default uninitialized
case once.
I see that Jens reported that the patch works, although I'm wondering
how repeatable it was for him. The patch obviously looks simple and
seems like an obviously GoodThing(tm) regardless.
Linus
^ permalink raw reply
* [Patch net v2] ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf
From: Cong Wang @ 2017-05-04 17:36 UTC (permalink / raw)
To: netdev; +Cc: andreyknvl, dsahern, Cong Wang
For each netns (except init_net), we initialize its null entry
in 3 places:
1) The template itself, as we use kmemdup()
2) Code around dst_init_metrics() in ip6_route_net_init()
3) ip6_route_dev_notify(), which is supposed to initialize it after
loopback registers
Unfortunately the last one still happens in a wrong order because
we expect to initialize net->ipv6.ip6_null_entry->rt6i_idev to
net->loopback_dev's idev, so we have to do that after we add
idev to it. However, this notifier has priority == 0 same as
ipv6_dev_notf, and ipv6_dev_notf is registered after
ip6_route_dev_notifier so it is called actually after
ip6_route_dev_notifier.
Fix it by picking a smaller priority for ip6_route_dev_notifier.
Also, we have to release the refcnt accordingly when unregistering
loopback_dev because device exit functions are called before subsys
exit functions.
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
include/net/addrconf.h | 2 ++
net/ipv6/addrconf.c | 1 +
net/ipv6/route.c | 13 +++++++++++--
3 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 1aeb25d..6c0ee3c 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -20,6 +20,8 @@
#define ADDRCONF_TIMER_FUZZ (HZ / 4)
#define ADDRCONF_TIMER_FUZZ_MAX (HZ)
+#define ADDRCONF_NOTIFY_PRIORITY 0
+
#include <linux/in.h>
#include <linux/in6.h>
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 77a4bd5..8d297a7 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -3548,6 +3548,7 @@ static int addrconf_notify(struct notifier_block *this, unsigned long event,
*/
static struct notifier_block ipv6_dev_notf = {
.notifier_call = addrconf_notify,
+ .priority = ADDRCONF_NOTIFY_PRIORITY,
};
static void addrconf_type_change(struct net_device *dev, unsigned long event)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 2f11366..dc61b0b 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3709,7 +3709,10 @@ static int ip6_route_dev_notify(struct notifier_block *this,
struct net_device *dev = netdev_notifier_info_to_dev(ptr);
struct net *net = dev_net(dev);
- if (event == NETDEV_REGISTER && (dev->flags & IFF_LOOPBACK)) {
+ if (!(dev->flags & IFF_LOOPBACK))
+ return NOTIFY_OK;
+
+ if (event == NETDEV_REGISTER) {
net->ipv6.ip6_null_entry->dst.dev = dev;
net->ipv6.ip6_null_entry->rt6i_idev = in6_dev_get(dev);
#ifdef CONFIG_IPV6_MULTIPLE_TABLES
@@ -3718,6 +3721,12 @@ static int ip6_route_dev_notify(struct notifier_block *this,
net->ipv6.ip6_blk_hole_entry->dst.dev = dev;
net->ipv6.ip6_blk_hole_entry->rt6i_idev = in6_dev_get(dev);
#endif
+ } else if (event == NETDEV_UNREGISTER) {
+ in6_dev_put(net->ipv6.ip6_null_entry->rt6i_idev);
+#ifdef CONFIG_IPV6_MULTIPLE_TABLES
+ in6_dev_put(net->ipv6.ip6_prohibit_entry->rt6i_idev);
+ in6_dev_put(net->ipv6.ip6_blk_hole_entry->rt6i_idev);
+#endif
}
return NOTIFY_OK;
@@ -4024,7 +4033,7 @@ static struct pernet_operations ip6_route_net_late_ops = {
static struct notifier_block ip6_route_dev_notifier = {
.notifier_call = ip6_route_dev_notify,
- .priority = 0,
+ .priority = ADDRCONF_NOTIFY_PRIORITY - 10,
};
void __init ip6_route_init_special_entries(void)
--
2.5.5
^ permalink raw reply related
* Re: [net-ipv4] question about arguments position
From: Joe Perches @ 2017-05-04 17:47 UTC (permalink / raw)
To: David Miller, garsilva
Cc: kuznet, jmorris, yoshfuji, kaber, netdev, linux-kernel
In-Reply-To: <20170504.124600.627647229351638856.davem@davemloft.net>
On Thu, 2017-05-04 at 12:46 -0400, David Miller wrote:
> From: "Gustavo A. R. Silva" <garsilva@embeddedor.com>
> Date: Thu, 04 May 2017 11:07:54 -0500
>
> > While looking into Coverity ID 1357474 I ran into the following piece
> > of code at net/ipv4/inet_diag.c:392:
>
> Because it's been this way since at least 2005, it doesn't matter if
> the order is correct or not. What's there is the locked in behavior
> exposed to userspace and changing it will break things for people.
Adding a few comments around the code about why
it is this way will help avoid future questions.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox