netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: BUG warnings in 4.14.9
       [not found] <CAK2bqV+NqMz0e-uVGREMvmmi_NgKRYyqZ8kd+yEfxT0pcCWE+g@mail.gmail.com>
@ 2017-12-26 18:59 ` Willy Tarreau
  2017-12-26 20:54   ` Ido Schimmel
  2017-12-26 23:08   ` alexander.levin
  0 siblings, 2 replies; 5+ messages in thread
From: Willy Tarreau @ 2017-12-26 18:59 UTC (permalink / raw)
  To: Wei Wang
  Cc: Martin KaFai Lau, Eric Dumazet, David S. Miller, Sasha Levin,
	Greg Kroah-Hartman, Chris Rankin, linux-kernel, stable, netdev

Guys,

Chris reported the bug below and confirmed that reverting commit
9704f81 (ipv6: grab rt->rt6i_ref before allocating pcpu rt) seems to
have fixed the issue for him. This patch is a94b9367 in mainline.

I personally have no opinion on the patch, just found it because it
was the only one touching this area between 4.14.8 and 4.14.9 :-)

Should this be reverted or maybe fixed differently ?

thanks,
Willy

On Tue, Dec 26, 2017 at 01:49:59AM +0000, Chris Rankin wrote:
> Hi,
> 
> I've just raised https://bugzilla.kernel.org/show_bug.cgi?id=198271
> because the new 4.14.9 kernel is generating lots of BUG warnings, e.g.
> 
> [   35.069924] BUG: using smp_processor_id() in preemptible [00000000]
> code: avahi-daemon/761
> [   35.078187] caller is ip6_pol_route+0x46b/0x509 [ipv6]
> [   35.083340] CPU: 5 PID: 761 Comm: avahi-daemon Tainted: G
> I     4.14.9 #1
> [   35.091176] Hardware name: Gigabyte Technology Co., Ltd.
> EX58-UD3R/EX58-UD3R, BIOS FB  05/04/2009
> [   35.100181] Call Trace:
> [   35.102709]  dump_stack+0x46/0x59
> [   35.106095]  check_preemption_disabled+0xca/0xda
> [   35.110786]  ip6_pol_route+0x46b/0x509 [ipv6]
> [   35.115224]  fib6_rule_lookup+0x15/0x4b [ipv6]
> [   35.119745]  ip6_dst_lookup_tail+0x11d/0x18f [ipv6]
> [   35.124702]  ip6_dst_lookup_flow+0x30/0x6f [ipv6]
> [   35.129481]  udpv6_sendmsg+0x5c5/0xa10 [ipv6]
> [   35.133914]  ? ip_reply_glue_bits+0x48/0x48
> [   35.138187]  ? rw_copy_check_uvector+0x6d/0xf9
> [   35.142701]  ? sock_sendmsg+0x28/0x34
> [   35.146393]  sock_sendmsg+0x28/0x34
> [   35.149912]  ___sys_sendmsg+0x1b4/0x246
> [   35.153821]  ? poll_select_copy_remaining+0x104/0x104
> [   35.158995]  ? current_time+0x11/0x52
> [   35.162738]  ? pipe_write+0x353/0x365
> [   35.166483]  ? __sys_sendmsg+0x3c/0x5d
> [   35.170304]  __sys_sendmsg+0x3c/0x5d
> [   35.173909]  do_syscall_64+0x4a/0xe1
> [   35.177482]  entry_SYSCALL64_slow_path+0x25/0x25
> [   35.177484] RIP: 0033:0x7f963d7097b7
> [   35.177485] RSP: 002b:00007ffff7ffbd18 EFLAGS: 00000246 ORIG_RAX:
> 000000000000002e
> [   35.177486] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f963d7097b7
> [   35.177487] RDX: 0000000000000000 RSI: 00007ffff7ffbde0 RDI: 000000000000000d
> [   35.177487] RBP: 0000000000000000 R08: 0000000000000001 R09: 00007ffff7ffbd42
> [   35.177488] R10: 0000000000000002 R11: 0000000000000246 R12: 00007ffff7ffbde0
> [   35.177489] R13: 000055e4e1af29cc R14: 000000000000000d R15: 0000000000000002
> [   35.177565] IN=eth0 OUT= MAC=
> SRC=fe80:0000:0000:0000:0224:1dff:fecd:0f3a
> DST=ff02:0000:0000:0000:0000:0000:0000:00fb LEN=245 TC=0 HOPLIMIT=255
> FLOWLBL=147873 PROTO=UDP SPT=5353 DPT=5353 LEN=205
> 
> Cheers,
> Chris

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: BUG warnings in 4.14.9
  2017-12-26 18:59 ` BUG warnings in 4.14.9 Willy Tarreau
@ 2017-12-26 20:54   ` Ido Schimmel
  2017-12-27  4:25     ` alexander.levin
  2017-12-26 23:08   ` alexander.levin
  1 sibling, 1 reply; 5+ messages in thread
From: Ido Schimmel @ 2017-12-26 20:54 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Wei Wang, Martin KaFai Lau, Eric Dumazet, David S. Miller,
	Sasha Levin, Greg Kroah-Hartman, Chris Rankin, linux-kernel,
	stable, netdev

On Tue, Dec 26, 2017 at 07:59:55PM +0100, Willy Tarreau wrote:
> Guys,
> 
> Chris reported the bug below and confirmed that reverting commit
> 9704f81 (ipv6: grab rt->rt6i_ref before allocating pcpu rt) seems to
> have fixed the issue for him. This patch is a94b9367 in mainline.
> 
> I personally have no opinion on the patch, just found it because it
> was the only one touching this area between 4.14.8 and 4.14.9 :-)
> 
> Should this be reverted or maybe fixed differently ?

Maybe I'm missing something, but how come this patch even made its way
into 4.14.y? It's part of a series to RCU-ify IPv6 FIB lookup that went
into 4.15.

Anyway, the mentioned bug was already fixed by commit 951f788a80ff
("ipv6: fix a BUG in rt6_get_pcpu_route()") when the code was still in
net-next.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: BUG warnings in 4.14.9
  2017-12-26 18:59 ` BUG warnings in 4.14.9 Willy Tarreau
  2017-12-26 20:54   ` Ido Schimmel
@ 2017-12-26 23:08   ` alexander.levin
  1 sibling, 0 replies; 5+ messages in thread
From: alexander.levin @ 2017-12-26 23:08 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Wei Wang, Martin KaFai Lau, Eric Dumazet, David S. Miller,
	Greg Kroah-Hartman, Chris Rankin, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org, netdev@vger.kernel.org

On Tue, Dec 26, 2017 at 07:59:55PM +0100, Willy Tarreau wrote:
>Guys,
>
>Chris reported the bug below and confirmed that reverting commit
>9704f81 (ipv6: grab rt->rt6i_ref before allocating pcpu rt) seems to
>have fixed the issue for him. This patch is a94b9367 in mainline.
>
>I personally have no opinion on the patch, just found it because it
>was the only one touching this area between 4.14.8 and 4.14.9 :-)
>
>Should this be reverted or maybe fixed differently ?

Hi Willy,

This seems to be fixed upstream:

commit 951f788a80ff8b6339c5c1ab888b0d4b4352efd8
Author: Eric Dumazet <edumazet@google.com>
Date:   Sun Oct 8 21:07:18 2017 -0700

    ipv6: fix a BUG in rt6_get_pcpu_route()
    
    Ido reported following splat and provided a patch.
    
    [  122.221814] BUG: using smp_processor_id() in preemptible [00000000] code: sshd/2672
    [  122.221845] caller is debug_smp_processor_id+0x17/0x20
    [  122.221866] CPU: 0 PID: 2672 Comm: sshd Not tainted 4.14.0-rc3-idosch-next-custom #639
    [  122.221880] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
    [  122.221893] Call Trace:
    [  122.221919]  dump_stack+0xb1/0x10c
    [  122.221946]  ? _atomic_dec_and_lock+0x124/0x124
    [  122.221974]  ? ___ratelimit+0xfe/0x240
    [  122.222020]  check_preemption_disabled+0x173/0x1b0
    [  122.222060]  debug_smp_processor_id+0x17/0x20
    [  122.222083]  ip6_pol_route+0x1482/0x24a0
    ...
    
    I believe we can simplify this code path a bit, since we no longer
    hold a read_lock and need to release it to avoid a dead lock.
    
    By disabling BH, we make sure we'll prevent code re-entry and
    rt6_get_pcpu_route()/rt6_make_pcpu_route() run on the same cpu.
    
    Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
    Reported-by: Ido Schimmel <idosch@mellanox.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Tested-by: Ido Schimmel <idosch@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Which itself would depend on:

commit d3843fe5fd45be0e04a251a2cc68893c859a31bd
Author: Wei Wang <weiwan@google.com>
Date:   Fri Oct 6 12:06:06 2017 -0700

    ipv6: replace dst_hold() with dst_hold_safe() in routing code

Which applies with a small conflict.

-- 

Thanks,
Sasha

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: BUG warnings in 4.14.9
  2017-12-26 20:54   ` Ido Schimmel
@ 2017-12-27  4:25     ` alexander.levin
  2017-12-27 12:57       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 5+ messages in thread
From: alexander.levin @ 2017-12-27  4:25 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Willy Tarreau, Wei Wang, Martin KaFai Lau, Eric Dumazet,
	David S. Miller, Greg Kroah-Hartman, Chris Rankin,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	netdev@vger.kernel.org

On Tue, Dec 26, 2017 at 10:54:37PM +0200, Ido Schimmel wrote:
>On Tue, Dec 26, 2017 at 07:59:55PM +0100, Willy Tarreau wrote:
>> Guys,
>>
>> Chris reported the bug below and confirmed that reverting commit
>> 9704f81 (ipv6: grab rt->rt6i_ref before allocating pcpu rt) seems to
>> have fixed the issue for him. This patch is a94b9367 in mainline.
>>
>> I personally have no opinion on the patch, just found it because it
>> was the only one touching this area between 4.14.8 and 4.14.9 :-)
>>
>> Should this be reverted or maybe fixed differently ?
>
>Maybe I'm missing something, but how come this patch even made its way
>into 4.14.y? It's part of a series to RCU-ify IPv6 FIB lookup that went
>into 4.15.
>
>Anyway, the mentioned bug was already fixed by commit 951f788a80ff
>("ipv6: fix a BUG in rt6_get_pcpu_route()") when the code was still in
>net-next.

Uh, you're right. Greg, please just revert 9704f81. Thanks!

-- 

Thanks,
Sasha

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: BUG warnings in 4.14.9
  2017-12-27  4:25     ` alexander.levin
@ 2017-12-27 12:57       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 5+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-27 12:57 UTC (permalink / raw)
  To: alexander.levin
  Cc: Ido Schimmel, Willy Tarreau, Wei Wang, Martin KaFai Lau,
	Eric Dumazet, David S. Miller, Chris Rankin,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	netdev@vger.kernel.org

On Wed, Dec 27, 2017 at 04:25:00AM +0000, alexander.levin@verizon.com wrote:
> On Tue, Dec 26, 2017 at 10:54:37PM +0200, Ido Schimmel wrote:
> >On Tue, Dec 26, 2017 at 07:59:55PM +0100, Willy Tarreau wrote:
> >> Guys,
> >>
> >> Chris reported the bug below and confirmed that reverting commit
> >> 9704f81 (ipv6: grab rt->rt6i_ref before allocating pcpu rt) seems to
> >> have fixed the issue for him. This patch is a94b9367 in mainline.
> >>
> >> I personally have no opinion on the patch, just found it because it
> >> was the only one touching this area between 4.14.8 and 4.14.9 :-)
> >>
> >> Should this be reverted or maybe fixed differently ?
> >
> >Maybe I'm missing something, but how come this patch even made its way
> >into 4.14.y? It's part of a series to RCU-ify IPv6 FIB lookup that went
> >into 4.15.
> >
> >Anyway, the mentioned bug was already fixed by commit 951f788a80ff
> >("ipv6: fix a BUG in rt6_get_pcpu_route()") when the code was still in
> >net-next.
> 
> Uh, you're right. Greg, please just revert 9704f81. Thanks!

Now reverted, sorry about this.

greg k-h

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-12-27 12:57 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <CAK2bqV+NqMz0e-uVGREMvmmi_NgKRYyqZ8kd+yEfxT0pcCWE+g@mail.gmail.com>
2017-12-26 18:59 ` BUG warnings in 4.14.9 Willy Tarreau
2017-12-26 20:54   ` Ido Schimmel
2017-12-27  4:25     ` alexander.levin
2017-12-27 12:57       ` Greg Kroah-Hartman
2017-12-26 23:08   ` alexander.levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).