netdev.vger.kernel.org archive mirror
* Fwd: [BUG] v4.20 - bridge not getting DHCP responses? (works in 4.19.13)
       [not found] <CAA85sZsiifWTKq=Mw30f-UVyrzywnYYX7GvyqG+O+ndSYBML9A@mail.gmail.com>
@ 2019-01-06 22:21 ` Ian Kumlien
  2019-01-08 22:10   ` Ian Kumlien
  0 siblings, 1 reply; 11+ messages in thread
From: Ian Kumlien @ 2019-01-06 22:21 UTC (permalink / raw)
  To: Linux Kernel Network Developers
  Cc: jeffrey.t.kirsher, roopa, nikolay, linux-kernel@vger.kernel.org

[Sorry for the repost, screwed up the netdev address...]

Hi,

Switching from 4.19.x -> 4.20 resulted in DHCP not working for my VM:s.

My firewall (which also runs the dhcpd) runs VM:s and it does this by
having physical
interfaces attached to bridges - which the VM:s in turn attach to.

Since 4.20 the VM:s can't use DHCP, it's odd since the requests are
seen - a response is sent but
it never enters the interface attached to the bridge.

Basically:
VM vnet2: -> br0 -> eno2 -> switch -> eno1 (dhcpd)
dhcpd eno1 -> switch and... gone.

Any clues?

All the nics are handled by ixgbe

* Re: [BUG] v4.20 - bridge not getting DHCP responses? (works in 4.19.13)
  2019-01-06 22:21 ` Fwd: [BUG] v4.20 - bridge not getting DHCP responses? (works in 4.19.13) Ian Kumlien
@ 2019-01-08 22:10   ` Ian Kumlien
  2019-01-08 22:51     ` Stephen Hemminger
  0 siblings, 1 reply; 11+ messages in thread
From: Ian Kumlien @ 2019-01-08 22:10 UTC (permalink / raw)
  To: Linux Kernel Network Developers
  Cc: jeffrey.t.kirsher, Roopa Prabhu, nikolay,
	linux-kernel@vger.kernel.org

On Sun, Jan 6, 2019 at 11:21 PM Ian Kumlien <ian.kumlien@gmail.com> wrote:
>
> [Sorry for the repost, screwed up the netdev address...]
>
> Hi,
>
> Switching from 4.19.x -> 4.20 resulted in DHCP not working for my VM:s.
>
> My firewall (which also runs the dhcpd) runs VM:s and it does this by
> having physical
> interfaces attached to bridges - which the VM:s in turn attach to.
>
> Since 4.20 the VM:s can't use DHCP, it's odd since the requests are
> seen - a response is sent but
> it never enters the interface attached to the bridge.
>
> Basically:
> VM vnet2: -> br0 -> eno2 -> switch -> eno1 (dhcpd)
> dhcpd eno1 -> switch and... gone.
>
> Any clues?
>
> All the nics are handled by ixgbe

So, doing similar tests at work with other drivers works - could it be
related to the mac address filter that was added?
I don't *really* use VF:s though... (can't really find anything else atm)

Will try to test, but the VM:s on this machine is in use.

* Re: [BUG] v4.20 - bridge not getting DHCP responses? (works in 4.19.13)
  2019-01-08 22:10   ` Ian Kumlien
@ 2019-01-08 22:51     ` Stephen Hemminger
  2019-01-08 22:51       ` Stephen Hemminger
  2019-01-08 23:00       ` Ian Kumlien
  0 siblings, 2 replies; 11+ messages in thread
From: Stephen Hemminger @ 2019-01-08 22:51 UTC (permalink / raw)
  To: Ian Kumlien
  Cc: Linux Kernel Network Developers, jeffrey.t.kirsher, Roopa Prabhu,
	nikolay, linux-kernel@vger.kernel.org

On Tue, 8 Jan 2019 23:10:04 +0100
Ian Kumlien <ian.kumlien@gmail.com> wrote:

> On Sun, Jan 6, 2019 at 11:21 PM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> >
> > [Sorry for the repost, screwed up the netdev address...]
> >
> > Hi,
> >
> > Switching from 4.19.x -> 4.20 resulted in DHCP not working for my VM:s.
> >
> > My firewall (which also runs the dhcpd) runs VM:s and it does this by
> > having physical
> > interfaces attached to bridges - which the VM:s in turn attach to.
> >
> > Since 4.20 the VM:s can't use DHCP, it's odd since the requests are
> > seen - a response is sent but
> > it never enters the interface attached to the bridge.
> >
> > Basically:
> > VM vnet2: -> br0 -> eno2 -> switch -> eno1 (dhcpd)
> > dhcpd eno1 -> switch and... gone.
> >
> > Any clues?
> >
> > All the nics are handled by ixgbe  
> 
> So, doing similar tests at work with other drivers works - could it be
> related to the mac address filter that was added?
> I don't *really* use VF:s though... (can't really find anything else atm)
> 
> Will try to test, but the VM:s on this machine is in use.

The default MAC address of the bridge device is the first device assigned
to the bridge.  Remember most VF interfaces will only allow single MAC address
and no promiscuous mode.

* Re: [BUG] v4.20 - bridge not getting DHCP responses? (works in 4.19.13)
  2019-01-08 22:51     ` Stephen Hemminger
  2019-01-08 22:51       ` Stephen Hemminger
@ 2019-01-08 23:00       ` Ian Kumlien
  2019-01-08 23:09         ` Florian Fainelli
  1 sibling, 1 reply; 11+ messages in thread
From: Ian Kumlien @ 2019-01-08 23:00 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Linux Kernel Network Developers, jeffrey.t.kirsher, Roopa Prabhu,
	nikolay, linux-kernel@vger.kernel.org

On Tue, Jan 8, 2019 at 11:51 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Tue, 8 Jan 2019 23:10:04 +0100
> Ian Kumlien <ian.kumlien@gmail.com> wrote:
> > On Sun, Jan 6, 2019 at 11:21 PM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> > >
> > > [Sorry for the repost, screwed up the netdev address...]
> > >
> > > Hi,
> > >
> > > Switching from 4.19.x -> 4.20 resulted in DHCP not working for my VM:s.
> > >
> > > My firewall (which also runs the dhcpd) runs VM:s and it does this by
> > > having physical
> > > interfaces attached to bridges - which the VM:s in turn attach to.
> > >
> > > Since 4.20 the VM:s can't use DHCP, it's odd since the requests are
> > > seen - a response is sent but
> > > it never enters the interface attached to the bridge.
> > >
> > > Basically:
> > > VM vnet2: -> br0 -> eno2 -> switch -> eno1 (dhcpd)
> > > dhcpd eno1 -> switch and... gone.
> > >
> > > Any clues?
> > >
> > > All the nics are handled by ixgbe
> >
> > So, doing similar tests at work with other drivers works - could it be
> > related to the mac address filter that was added?
> > I don't *really* use VF:s though... (can't really find anything else atm)
> >
> > Will try to test, but the VM:s on this machine is in use.
>
> The default MAC address of the bridge device is the first device assigned
> to the bridge.  Remember most VF interfaces will only allow single MAC address
> and no promiscuous mode.

Yeah, I'm not running any VF:s and it just seems like the responses
are dropped somewhere

when looking at "git log v4.19...v4.20
drivers/net/ethernet/intel/ixgbe/" nothing else really stands out...
The machine is also running NAT for my home network and all of that
works just fine...

I started with tcpdump, proving that packets reached all the way
outside but replies never made it, rebooting
with 4.19.13 resulted in replies appearing in the tcpdump.

I don't quite know where to look - and what can I do to test - I tried
disabling all offloading (due to the UDP
offloading changes) but nothing helped...

Ideas? Patches? ;)
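
[Editorial note] The capture described above can be reproduced roughly as
follows. This is a minimal sketch and not the exact commands used in the
report: the interface names eno1, eno2 and br0 come from the topology in the
first message, while the helper name capture_dhcp and the filter expression
are assumptions made for illustration.

```shell
#!/bin/sh
# BPF filter matching DHCP traffic: servers use UDP port 67, clients port 68.
dhcp_filter='udp and (port 67 or port 68)'

# capture_dhcp is a hypothetical helper, not something from the thread.
# -n disables name resolution, -e prints the Ethernet header so the
# source and destination MAC addresses are visible.
capture_dhcp() {
    tcpdump -n -e -i "$1" "$dhcp_filter"
}

# Typical use (needs root), one capture per hop in separate terminals:
#   capture_dhcp eno1    # dhcpd side: is the reply sent?
#   capture_dhcp eno2    # bridge port: does the reply arrive?
#   capture_dhcp br0     # bridge device: is it forwarded towards the VM?
```

Comparing the three captures shows at which hop the reply disappears.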

* Re: [BUG] v4.20 - bridge not getting DHCP responses? (works in 4.19.13)
  2019-01-08 23:00       ` Ian Kumlien
@ 2019-01-08 23:09         ` Florian Fainelli
  2019-01-08 23:09           ` Florian Fainelli
       [not found]           ` <CAA85sZvVHCYFRXp5cYzSDNpesNMtCZOB+Gnb+N+SmraD6eke1A@mail.gmail.com>
  0 siblings, 2 replies; 11+ messages in thread
From: Florian Fainelli @ 2019-01-08 23:09 UTC (permalink / raw)
  To: Ian Kumlien, Stephen Hemminger
  Cc: Linux Kernel Network Developers, jeffrey.t.kirsher, Roopa Prabhu,
	nikolay, linux-kernel@vger.kernel.org

On 1/8/19 3:00 PM, Ian Kumlien wrote:
> On Tue, Jan 8, 2019 at 11:51 PM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
>> On Tue, 8 Jan 2019 23:10:04 +0100
>> Ian Kumlien <ian.kumlien@gmail.com> wrote:
>>> On Sun, Jan 6, 2019 at 11:21 PM Ian Kumlien <ian.kumlien@gmail.com> wrote:
>>>>
>>>> [Sorry for the repost, screwed up the netdev address...]
>>>>
>>>> Hi,
>>>>
>>>> Switching from 4.19.x -> 4.20 resulted in DHCP not working for my VM:s.
>>>>
>>>> My firewall (which also runs the dhcpd) runs VM:s and it does this by
>>>> having physical
>>>> interfaces attached to bridges - which the VM:s in turn attach to.
>>>>
>>>> Since 4.20 the VM:s can't use DHCP, it's odd since the requests are
>>>> seen - a response is sent but
>>>> it never enters the interface attached to the bridge.
>>>>
>>>> Basically:
>>>> VM vnet2: -> br0 -> eno2 -> switch -> eno1 (dhcpd)
>>>> dhcpd eno1 -> switch and... gone.
>>>>
>>>> Any clues?
>>>>
>>>> All the nics are handled by ixgbe
>>>
>>> So, doing similar tests at work with other drivers works - could it be
>>> related to the mac address filter that was added?
>>> I don't *really* use VF:s though... (can't really find anything else atm)
>>>
>>> Will try to test, but the VM:s on this machine is in use.
>>
>> The default MAC address of the bridge device is the first device assigned
>> to the bridge.  Remember most VF interfaces will only allow single MAC address
>> and no promiscuous mode.
> 
> Yeah, I'm not running any VF:s and it just seems like the responses
> are dropped somewhere
> 
> when looking at "git log v4.19...v4.20
> drivers/net/ethernet/intel/ixgbe/" nothing else really stands out...
> The machine is also running NAT for my home network and all of that
> works just fine...
> 
> I started with tcpdump, proving that packets reached all the way
> outside but replies never made it, rebooting
> with 4.19.13 resulted in replies appearing in the tcpdump.
> 
> I don't quite know where to look - and what can I do to test - I tried
> disabling all offloading (due to the UDP
> offloading changes) but nothing helped...
> 
> Ideas? Patches? ;)

Running a bisection would certainly help find the offending commit if
that is something that you can do?
-- 
Florian

* Re: [BUG] v4.20 - bridge not getting DHCP responses? (works in 4.19.13)
       [not found]           ` <CAA85sZvVHCYFRXp5cYzSDNpesNMtCZOB+Gnb+N+SmraD6eke1A@mail.gmail.com>
@ 2019-01-10  0:16             ` Ian Kumlien
  2019-01-10  0:16               ` Ian Kumlien
  2019-01-10  0:38               ` Ian Kumlien
  0 siblings, 2 replies; 11+ messages in thread
From: Ian Kumlien @ 2019-01-10  0:16 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Stephen Hemminger, jeffrey.t.kirsher, Roopa Prabhu, nikolay,
	linux-kernel@vger.kernel.org, Linux Kernel Network Developers

On Wed, Jan 9, 2019 at 12:17 AM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> On Wed, Jan 9, 2019, 00:09 Florian Fainelli <f.fainelli@gmail.com wrote:

 [--8<---]

>> > when looking at "git log v4.19...v4.20
>> > drivers/net/ethernet/intel/ixgbe/" nothing else really stands out...
>> > The machine is also running NAT for my home network and all of that
>> > works just fine...
>> >
>> > I started with tcpdump, proving that packets reached all the way
>> > outside but replies never made it, rebooting
>> > with 4.19.13 resulted in replies appearing in the tcpdump.
>> >
>> > I don't quite know where to look - and what can I do to test - I tried
>> > disabling all offloading (due to the UDP
>> > offloading changes) but nothing helped...
>> >
>> > Ideas? Patches? ;)
>>
>> Running a bisection would certainly help find the offending commit if
>> that is something that you can do?
>
> I was hoping for a likely suspect but this was on my "Todo" for Friday night anyway... (And I already started testing with some patches reversed)

So after lengthy git bisect sessions, both from the latest stable I
was using (not the best of ideas)
and from 4.19.

The latest stable yielded 72b0094f918294e6cb8cf5c3b4520d928fbb1a57 -
which is incorrect...

However, the proper bisect gave me this:
fb420d5d91c1274d5966917725e71f27ed092a85 is the first bad commit
commit fb420d5d91c1274d5966917725e71f27ed092a85
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Sep 28 10:28:44 2018 -0700

    tcp/fq: move back to CLOCK_MONOTONIC

    In the recent TCP/EDT patch series, I switched TCP and sch_fq
    clocks from MONOTONIC to TAI, in order to meet the choice done
    earlier for sch_etf packet scheduler.

    But sure enough, this broke some setups were the TAI clock
    jumps forward (by almost 50 year...), as reported
    by Leonard Crestez.

    If we want to converge later, we'll probably need to add
    an skb field to differentiate the clock bases, or a socket option.

    In the meantime, an UDP application will need to use CLOCK_MONOTONIC
    base for its SCM_TXTIME timestamps if using fq packet scheduler.

    Fixes: 72b0094f9182 ("tcp: switch tcp_clock_ns() to CLOCK_TAI base")
    Fixes: 142537e41923 ("net_sched: sch_fq: switch to CLOCK_TAI")
    Fixes: fd2bca2aa789 ("tcp: switch internal pacing timer to CLOCK_TAI")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Leonard Crestez <leonard.crestez@nxp.com>
    Tested-by: Leonard Crestez <leonard.crestez@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

:040000 040000 06615f5ed4486fd0af77a8fb59775a9f2346aebc
7f883c7753cb3d5d881e0edbef2989f4e6db6a1f M include
:040000 040000 767c5e93fe5cfd609f90834d93978511c284ea01
cc47bd361516622c0b21602e188181fdfc6b2995 M net
----

Which could actually be the culprit - I'm having problems *with* UDP
traffic (DHCP) and I am using fq

Let's hope it's so, since this was kinda boring:
ls /lib/modules |grep 4.19.0 |wc -l
27

Testing 4.20.1 and then 4.20.1 with the suspected patch reverted, will
report shortly!
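
[Editorial note] The two checks mentioned above (confirming that fq is in use,
and testing a tree with the suspect commit reverted) could look roughly like
this. It is a sketch under assumptions: the helper names show_qdiscs and
revert_suspect are hypothetical, the kernel checkout path is whatever the
reader uses, and the v4.20.1 tag is assumed to be available in that checkout
(it is a stable-tree tag). The revert may need manual conflict resolution.

```shell
#!/bin/sh
# Commit that the bisect in this message identified as the first bad one.
suspect=fb420d5d91c1274d5966917725e71f27ed092a85

# show_qdiscs (hypothetical helper): print the system default qdisc and
# the qdisc attached to each interface given as an argument.
show_qdiscs() {
    sysctl -n net.core.default_qdisc
    for dev in "$@"; do
        tc qdisc show dev "$dev"
    done
}

# revert_suspect (hypothetical helper): check out v4.20.1 in the kernel
# tree given as $1 and revert the suspect commit on top of it.
revert_suspect() {
    git -C "$1" checkout -q v4.20.1 &&
        git -C "$1" revert --no-edit "$suspect"
}

# Typical use:
#   show_qdiscs eno1 eno2
#   revert_suspect /path/to/linux
```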

* Re: [BUG] v4.20 - bridge not getting DHCP responses? (works in 4.19.13)
  2019-01-10  0:16             ` Ian Kumlien
  2019-01-10  0:16               ` Ian Kumlien
@ 2019-01-10  0:38               ` Ian Kumlien
  2019-01-10  8:08                 ` Paolo Abeni
  1 sibling, 1 reply; 11+ messages in thread
From: Ian Kumlien @ 2019-01-10  0:38 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Stephen Hemminger, jeffrey.t.kirsher, Roopa Prabhu, nikolay,
	linux-kernel@vger.kernel.org, Linux Kernel Network Developers

Confirmed, sending a new mail with summary etc

On Thu, Jan 10, 2019 at 1:16 AM Ian Kumlien <ian.kumlien@gmail.com> wrote:
>
> On Wed, Jan 9, 2019 at 12:17 AM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> > On Wed, Jan 9, 2019, 00:09 Florian Fainelli <f.fainelli@gmail.com wrote:
>
>  [--8<---]
>
> >> > when looking at "git log v4.19...v4.20
> >> > drivers/net/ethernet/intel/ixgbe/" nothing else really stands out...
> >> > The machine is also running NAT for my home network and all of that
> >> > works just fine...
> >> >
> >> > I started with tcpdump, proving that packets reached all the way
> >> > outside but replies never made it, rebooting
> >> > with 4.19.13 resulted in replies appearing in the tcpdump.
> >> >
> >> > I don't quite know where to look - and what can I do to test - I tried
> >> > disabling all offloading (due to the UDP
> >> > offloading changes) but nothing helped...
> >> >
> >> > Ideas? Patches? ;)
> >>
> >> Running a bisection would certainly help find the offending commit if
> >> that is something that you can do?
> >
> > I was hoping for a likely suspect but this was on my "Todo" for Friday night anyway... (And I already started testing with some patches reversed)
>
> So after lengthy git bisect sessions, both from the latest stable I
> was using (not the best of ideas)
> and from 4.19.
>
> The latest stable yielded 72b0094f918294e6cb8cf5c3b4520d928fbb1a57 -
> which is incorrect...
>
> However, the proper bisect gave me this:
> fb420d5d91c1274d5966917725e71f27ed092a85 is the first bad commit
> commit fb420d5d91c1274d5966917725e71f27ed092a85
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Fri Sep 28 10:28:44 2018 -0700
>
>     tcp/fq: move back to CLOCK_MONOTONIC
>
>     In the recent TCP/EDT patch series, I switched TCP and sch_fq
>     clocks from MONOTONIC to TAI, in order to meet the choice done
>     earlier for sch_etf packet scheduler.
>
>     But sure enough, this broke some setups were the TAI clock
>     jumps forward (by almost 50 year...), as reported
>     by Leonard Crestez.
>
>     If we want to converge later, we'll probably need to add
>     an skb field to differentiate the clock bases, or a socket option.
>
>     In the meantime, an UDP application will need to use CLOCK_MONOTONIC
>     base for its SCM_TXTIME timestamps if using fq packet scheduler.
>
>     Fixes: 72b0094f9182 ("tcp: switch tcp_clock_ns() to CLOCK_TAI base")
>     Fixes: 142537e41923 ("net_sched: sch_fq: switch to CLOCK_TAI")
>     Fixes: fd2bca2aa789 ("tcp: switch internal pacing timer to CLOCK_TAI")
>     Signed-off-by: Eric Dumazet <edumazet@google.com>
>     Reported-by: Leonard Crestez <leonard.crestez@nxp.com>
>     Tested-by: Leonard Crestez <leonard.crestez@nxp.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
>
> :040000 040000 06615f5ed4486fd0af77a8fb59775a9f2346aebc
> 7f883c7753cb3d5d881e0edbef2989f4e6db6a1f M include
> :040000 040000 767c5e93fe5cfd609f90834d93978511c284ea01
> cc47bd361516622c0b21602e188181fdfc6b2995 M net
> ----
>
> Which could actually be the culprit - I'm having problems *with* UDP
> traffic (DHCP) and I am using fq
>
> Let's hope it's so, since this was kinda boring:
> ls /lib/modules |grep 4.19.0 |wc -l
> 27
>
> Testing 4.20.1 and then 4.20.1 with the suspected patch reverted, will
> report shortly!

* Re: [BUG] v4.20 - bridge not getting DHCP responses? (works in 4.19.13)
  2019-01-10  0:38               ` Ian Kumlien
@ 2019-01-10  8:08                 ` Paolo Abeni
  0 siblings, 0 replies; 11+ messages in thread
From: Paolo Abeni @ 2019-01-10  8:08 UTC (permalink / raw)
  To: Ian Kumlien, Florian Fainelli
  Cc: Stephen Hemminger, jeffrey.t.kirsher, Roopa Prabhu, nikolay,
	linux-kernel@vger.kernel.org, Linux Kernel Network Developers

On Thu, 2019-01-10 at 01:38 +0100, Ian Kumlien wrote:
> Confirmed, sending a new mail with summary etc
> 
> On Thu, Jan 10, 2019 at 1:16 AM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> > On Wed, Jan 9, 2019 at 12:17 AM Ian Kumlien <ian.kumlien@gmail.com> wrote:
> > > On Wed, Jan 9, 2019, 00:09 Florian Fainelli <f.fainelli@gmail.com wrote:
> > 
> >  [--8<---]
> > 
> > > > > when looking at "git log v4.19...v4.20
> > > > > drivers/net/ethernet/intel/ixgbe/" nothing else really stands out...
> > > > > The machine is also running NAT for my home network and all of that
> > > > > works just fine...
> > > > > 
> > > > > > I started with tcpdump, proving that packets reached all the way
> > > > > > outside but replies never made it, rebooting
> > > > > > with 4.19.13 resulted in replies appearing in the tcpdump.
> > > > > > 
> > > > > > I don't quite know where to look - and what can I do to test - I tried
> > > > > > disabling all offloading (due to the UDP
> > > > > > offloading changes) but nothing helped...
> > > > > 
> > > > > Ideas? Patches? ;)
> > > > 
> > > > Running a bisection would certainly help find the offending commit if
> > > > that is something that you can do?
> > > 
> > > I was hoping for a likely suspect but this was on my "Todo" for Friday night anyway... (And I already started testing with some patches reversed)
> > 
> > So after lengthy git bisect sessions, both from the latest stable I
> > was using (not the best of ideas)
> > and from 4.19.
> > 
> > The latest stable yielded 72b0094f918294e6cb8cf5c3b4520d928fbb1a57 -
> > which is incorrect...
> > 
> > However, the proper bisect gave me this:
> > fb420d5d91c1274d5966917725e71f27ed092a85 is the first bad commit
> > commit fb420d5d91c1274d5966917725e71f27ed092a85
> > Author: Eric Dumazet <edumazet@google.com>
> > Date:   Fri Sep 28 10:28:44 2018 -0700
> > 
> >     tcp/fq: move back to CLOCK_MONOTONIC

Thank you for bisecting. 

Should be solved by:

https://marc.info/?l=linux-netdev&m=154696956604748&w=2

Can you test with the above applied?

Thanks,

Paolo
