* Re: [PATCH net] net: phy: fix interrupt handling in non-started states
From: Florian Fainelli @ 2019-02-14 4:10 UTC (permalink / raw)
To: Heiner Kallweit, Andrew Lunn, David Miller
Cc: netdev@vger.kernel.org, Russell King - ARM Linux
In-Reply-To: <25e86edc-0b88-8c03-b692-776e971331f2@gmail.com>
On 2/12/2019 10:56 AM, Heiner Kallweit wrote:
> phylib enables interrupts before phy_start() has been called, and if
> we receive an interrupt in a non-started state, the interrupt handler
> returns IRQ_NONE. This causes problems with at least one Marvell chip
> as reported by Andrew.
> Fix this by handling interrupts the same as in phy_mac_interrupt(),
> basically always running the phylib state machine. It knows when it
> has to do something and when not.
> This change allows to handle interrupts gracefully even if they
> occur in a non-started state.
>
> Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking")
> Reported-by: Andrew Lunn <andrew@lunn.ch>
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
--
Florian
^ permalink raw reply
* Re: [PATCH net 1/2] net: phy: don't use locking in phy_is_started
From: Florian Fainelli @ 2019-02-14 4:13 UTC (permalink / raw)
To: Heiner Kallweit, Andrew Lunn, David Miller
Cc: Russell King - ARM Linux, netdev@vger.kernel.org
In-Reply-To: <2e6abca8-6a60-a7f0-b3e3-0d55fbebd4fc@gmail.com>
On 2/13/2019 11:11 AM, Heiner Kallweit wrote:
> Russell suggested to remove the locking from phy_is_started() because
> the read is atomic anyway and actually the locking may be more
> misleading.
>
> Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking")
> Suggested-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
--
Florian
^ permalink raw reply
* Re: [PATCH net 2/2] net: phy: fix potential race in the phylib state machine
From: Florian Fainelli @ 2019-02-14 4:13 UTC (permalink / raw)
To: Heiner Kallweit, Andrew Lunn, David Miller
Cc: Russell King - ARM Linux, netdev@vger.kernel.org
In-Reply-To: <1094ff3a-0d7a-dc96-8a19-a5102e08fa79@gmail.com>
On 2/13/2019 11:12 AM, Heiner Kallweit wrote:
> Russell reported the following race in the phylib state machine
> (quoting from his mail):
>
> if (phy_polling_mode(phydev) && phy_is_started(phydev))
> phy_queue_state_machine(phydev, PHY_STATE_TIME);
>
> state = PHY_UP
> thread 0 thread 1
> phy_disconnect()
> +-phy_is_started()
> phy_is_started() |
> `-phy_stop()
> +-phydev->state = PHY_HALTED
> `-phy_stop_machine()
> `-cancel_delayed_work_sync()
> phy_queue_state_machine()
> `-mod_delayed_work()
>
> At this point, the phydev->state_queue() has been added back onto the
> system workqueue despite phy_stop_machine() having been called and
> cancel_delayed_work_sync() called on it.
>
> Fix this by protecting the complete operation in thread 0.
>
> Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking")
> Reported-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
--
Florian
^ permalink raw reply
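The race described above comes down to a check ("is the PHY started?") and an action ("requeue the work") that are not atomic with respect to the stop path. A minimal userspace sketch of the idea follows, assuming the protection is a lock shared with the stop path. The names demo_phy, demo_requeue and demo_stop are hypothetical, and a pthread mutex stands in for whatever lock the real patch uses; this is not the phylib code.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct phy_device; not the kernel type. */
struct demo_phy {
	pthread_mutex_t lock;
	bool started;   /* models what phy_is_started() reports */
	int queued;     /* models pending state machine work */
};

void demo_phy_init(struct demo_phy *phy)
{
	pthread_mutex_init(&phy->lock, NULL);
	phy->started = true;
	phy->queued = 0;
}

/*
 * "Thread 0": the started check and the requeue happen under one lock,
 * so a concurrent stop cannot slip in between them.
 * Returns true if work was queued.
 */
bool demo_requeue(struct demo_phy *phy)
{
	bool did_queue = false;

	pthread_mutex_lock(&phy->lock);
	if (phy->started) {
		phy->queued++;
		did_queue = true;
	}
	pthread_mutex_unlock(&phy->lock);
	return did_queue;
}

/* "Thread 1": mark the PHY halted, then cancel any pending work. */
void demo_stop(struct demo_phy *phy)
{
	pthread_mutex_lock(&phy->lock);
	phy->started = false;   /* models phydev->state = PHY_HALTED */
	pthread_mutex_unlock(&phy->lock);

	/* Models cancel_delayed_work_sync(); the lock here only guards the counter. */
	pthread_mutex_lock(&phy->lock);
	phy->queued = 0;
	pthread_mutex_unlock(&phy->lock);
}
```

Once demo_stop() has cleared the flag, every later demo_requeue() sees started == false and queues nothing, so the cancel cannot be undone by a late requeue.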
* Re: [PATCH bpf-next v11 0/7] bpf: add BPF_LWT_ENCAP_IP option to bpf_lwt_push_encap
From: Alexei Starovoitov @ 2019-02-14 4:21 UTC (permalink / raw)
To: David Ahern
Cc: Peter Oskolkov, Alexei Starovoitov, Daniel Borkmann, netdev,
Peter Oskolkov, Willem de Bruijn
In-Reply-To: <3772c82a-6959-9f8a-9273-0adcbdbcf631@gmail.com>
On Wed, Feb 13, 2019 at 08:44:51PM -0700, David Ahern wrote:
> On 2/13/19 7:39 PM, Alexei Starovoitov wrote:
> > On Wed, Feb 13, 2019 at 05:46:26PM -0700, David Ahern wrote:
> >> On 2/13/19 12:53 PM, Peter Oskolkov wrote:
> >>> This patchset implements BPF_LWT_ENCAP_IP mode in bpf_lwt_push_encap
> >>> BPF helper. It enables BPF programs (specifically, BPF_PROG_TYPE_LWT_IN
> >>> and BPF_PROG_TYPE_LWT_XMIT prog types) to add IP encapsulation headers
> >>> to packets (e.g. IP/GRE, GUE, IPIP).
> >>>
> >>> This is useful when thousands of different short-lived flows should be
> >>> encapped, each with different and dynamically determined destination.
> >>> Although lwtunnels can be used in some of these scenarios, the ability
> >>> to dynamically generate encap headers adds more flexibility, e.g.
> >>> when routing depends on the state of the host (reflected in global bpf
> >>> maps).
> >>>
> >>
> >>
> >> For the set:
> >> Reviewed-by: David Ahern <dsahern@gmail.com>
> >
> > Applied. Thanks everyone!
> >
>
> Looks like a cleanup round is needed.
>
> I changed the routes to fail with unreachable:
>
> @@ -179,16 +175,16 @@
> ip -netns ${NS3} tunnel add gre_dev mode gre remote ${IPv4_1} local
> ${IPv4_GRE} ttl 255
> ip -netns ${NS3} link set gre_dev up
> ip -netns ${NS3} addr add ${IPv4_GRE} dev gre_dev
> - ip -netns ${NS1} route add ${IPv4_GRE}/32 dev veth5 via ${IPv4_6}
> - ip -netns ${NS2} route add ${IPv4_GRE}/32 dev veth7 via ${IPv4_8}
> + ip -netns ${NS1} route add unreachable ${IPv4_GRE}/32
> + ip -netns ${NS2} route add unreachable ${IPv4_GRE}/32
>
>
> # configure IPv6 GRE device in NS3, and a route to it via the "bottom"
> route
> ip -netns ${NS3} -6 tunnel add name gre6_dev mode ip6gre remote
> ${IPv6_1} local ${IPv6_GRE} ttl 255
> ip -netns ${NS3} link set gre6_dev up
> ip -netns ${NS3} -6 addr add ${IPv6_GRE} nodad dev gre6_dev
> - ip -netns ${NS1} -6 route add ${IPv6_GRE}/128 dev veth5 via ${IPv6_6}
> - ip -netns ${NS2} -6 route add ${IPv6_GRE}/128 dev veth7 via ${IPv6_8}
> + ip -netns ${NS1} -6 route add unreachable ${IPv6_GRE}/128
> + ip -netns ${NS2} -6 route add unreachable ${IPv6_GRE}/128
>
> # rp_filter gets confused by what these tests are doing, so disable it
> ip netns exec ${NS1} sysctl -wq net.ipv4.conf.all.rp_filter=0
> @@ -220,7 +216,6 @@
>
>
> and then removed all of the set -e and exit 1's in the script (really
> should let all of the tests run versus bailing on the first failure).
>
> With kmemleak enabled I see a lot of suspected memory leaks - some may
> not be related to this change but it is triggering the suspected leak:
argh. Thanks a lot for catching it.
Let's figure out the fix quickly.
If it's too intrusive we can revert and reapply.
I'm not going to send a pull-req to Dave with a known issue like this.
^ permalink raw reply
* Re: [PATCH 2/2] doc: add phylink documentation to the networking book
From: Andrew Lunn @ 2019-02-14 4:32 UTC (permalink / raw)
To: Randy Dunlap
Cc: Russell King, linux-doc, netdev, David S. Miller, Jonathan Corbet
In-Reply-To: <f002402d-fb27-f697-f07d-de3cdff41f40@infradead.org>
> > +For information describing the SFP cage in DT, please see the binding
> > +documentation in the kernel source tree
> > +``Documentation/devicetree/bindings/net/sff,sfp.txt``
> oh, so SFP means "Small Form-factor Pluggable".
>
> I see that this source file:
> ./drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:1902:
>
> seems to imply that SFP means "single function per port (SFP) mode":
Hi Randy
rfc5513 might be relevant.
Andrew
^ permalink raw reply
* Re: [RFC bpf-next 0/7] net: flow_dissector: trigger BPF hook when called from eth_get_headlen
From: Alexei Starovoitov @ 2019-02-14 4:39 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: Willem de Bruijn, Stanislav Fomichev, Network Development,
David Miller, Alexei Starovoitov, Daniel Borkmann, simon.horman,
Willem de Bruijn
In-Reply-To: <20190212170232.GB10595@mini-arch>
On Tue, Feb 12, 2019 at 09:02:32AM -0800, Stanislav Fomichev wrote:
> On 02/05, Stanislav Fomichev wrote:
> > On 02/05, Alexei Starovoitov wrote:
> > > On Tue, Feb 05, 2019 at 07:56:19PM -0800, Stanislav Fomichev wrote:
> > > > On 02/05, Alexei Starovoitov wrote:
> > > > > On Tue, Feb 05, 2019 at 04:59:31PM -0800, Stanislav Fomichev wrote:
> > > > > > On 02/05, Alexei Starovoitov wrote:
> > > > > > > On Tue, Feb 05, 2019 at 12:40:03PM -0800, Stanislav Fomichev wrote:
> > > > > > > > On 02/05, Willem de Bruijn wrote:
> > > > > > > > > On Tue, Feb 5, 2019 at 12:57 PM Stanislav Fomichev <sdf@google.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Currently, when eth_get_headlen calls flow dissector, it doesn't pass any
> > > > > > > > > > skb. Because we use passed skb to lookup associated networking namespace
> > > > > > > > > > to find whether we have a BPF program attached or not, we always use
> > > > > > > > > > C-based flow dissector in this case.
> > > > > > > > > >
> > > > > > > > > > The goal of this patch series is to add new networking namespace argument
> > > > > > > > > > to the eth_get_headlen and make BPF flow dissector programs be able to
> > > > > > > > > > work in the skb-less case.
> > > > > > > > > >
> > > > > > > > > > The series goes like this:
> > > > > > > > > > 1. introduce __init_skb and __init_skb_shinfo; those will be used to
> > > > > > > > > > initialize temporary skb
> > > > > > > > > > 2. introduce skb_net which can be used to get networking namespace
> > > > > > > > > > associated with an skb
> > > > > > > > > > 3. add new optional network namespace argument to __skb_flow_dissect and
> > > > > > > > > > plumb through the callers
> > > > > > > > > > 4. add new __flow_bpf_dissect which constructs temporary on-stack skb
> > > > > > > > > > (using __init_skb) and calls BPF flow dissector program
> > > > > > > > >
> > > > > > > > > The main concern I see with this series is this cost of skb zeroing
> > > > > > > > > for every packet in the device driver receive routine, *independent*
> > > > > > > > > from the real skb allocation and zeroing which will likely happen
> > > > > > > > > later.
> > > > > > > > Yes, plus ~200 bytes on the stack for the callers.
> > > > > > > >
> > > > > > > > Not sure how visible this zeroing though, I can probably try to get some
> > > > > > > > numbers from BPF_PROG_TEST_RUN (running current version vs running with
> > > > > > > > on-stack skb).
> > > > > > >
> > > > > > > imo extra 256 byte memset for every packet is non starter.
> > > > > > We can put pre-allocated/initialized skbs without data into percpu or even
> > > > > > use pcpu_freelist_pop/pcpu_freelist_push to make sure we don't have to think
> > > > > > about having multiple percpu for irq/softirq/process contexts.
> > > > > > Any concerns with that approach?
> > > > > > Any other possible concerns with the overall series?
> > > > >
> > > > > I'm missing why the whole thing is needed.
> > > > > You're saying:
> > > > > " make BPF flow dissector programs be able to work in the skb-less case".
> > > > > What does it mean specifically?
> > > > > The only non-skb case is XDP.
> > > > > Are you saying you want flow_dissector prog to be run in XDP?
> > > > eth_get_headlen that drivers call on RX path on a chunk of data to
> > > > guesstimate the length of the headers calls flow dissector without an skb
> > > > (__skb_flow_dissect was a weird interface where it accepts skb or
> > > > data+len). Right now, there is no way to trigger BPF flow dissector
> > > > for this case (we don't have an skb to get associated namespace/etc/etc).
> > > > The patch series tries to fix that to make sure that we always trigger
> > > > BPF program if it's attached to a device's namespace.
> > >
> > > then why not to create flow_dissector prog type that works without skb?
> > > Why do you need to fake an skb?
> > > XDP progs work just fine without it.
> > What's the advantage of having another prog type? In this case we would have
> > to write the same flow dissector program twice: first time against __sk_buff
> > interface, second time against xdp_md.
> > By using fake skb, we make the same flow dissector __sk_buff BPF program
> > work in both contexts without a rewrite to an xdp interface (I don't
> > think users should care whether flow dissector was called from "xdp" vs skb
> > context; and we're sort of stuck with __sk_buff interface already).
> Should I follow up with v2 where I address memset(,,256) for each packet?
> Or you still have some questions/doubts/suggestions regarding the problem
> I'm trying to solve?
sorry for delay. I'm still thinking what is the path forward here.
That 'stuck with __sk_buff' is what bothers me.
It's an indication that api wasn't thought through if first thing
it needs is this fake skb hack.
If bpf_flow.c is a realistic example of such flow dissector prog
it means that real skb fields are accessed.
In particular skb->vlan_proto, skb->protocol.
These fields in case of 'fake skb' will not be set, since eth_type_trans()
isn't called yet.
So either flow_dissector needs a real __sk_buff and all of its fields
should be real or it's a different flow_dissector prog type that
needs ctx->data, ctx->data_end, ctx->flow_keys only.
Either way going with fake skb is incorrect, since bpf_flow.c example
will be broken and for program writers it will be hard to figure why
it's broken.
^ permalink raw reply
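The second option mentioned above, a program type that sees only ctx->data, ctx->data_end and ctx->flow_keys, can be illustrated with a small sketch. Everything here is hypothetical: demo_flow_ctx and demo_flow_keys are not the kernel's context or bpf_flow_keys layout, and the EtherType is stored in host byte order for simplicity. The point is only that the protocol is read from the packet bytes with a bounds check, rather than from skb fields that are not yet set before eth_type_trans().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified output of the dissector. */
struct demo_flow_keys {
	uint16_t n_proto;   /* EtherType, host byte order in this sketch */
	uint16_t nhoff;     /* offset of the network header */
};

/* Hypothetical skb-less context: packet bounds plus the keys to fill in. */
struct demo_flow_ctx {
	const uint8_t *data;
	const uint8_t *data_end;
	struct demo_flow_keys *flow_keys;
};

/* Returns 0 on success, -1 if the buffer is too short for an Ethernet header. */
int demo_dissect_eth(struct demo_flow_ctx *ctx)
{
	const size_t eth_hlen = 14;

	if ((size_t)(ctx->data_end - ctx->data) < eth_hlen)
		return -1;

	/* EtherType is bytes 12..13 of the Ethernet header, big-endian on the wire. */
	ctx->flow_keys->n_proto = (uint16_t)((ctx->data[12] << 8) | ctx->data[13]);
	ctx->flow_keys->nhoff = (uint16_t)eth_hlen;
	return 0;
}
```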
* Re: [PATCH 2/2] doc: add phylink documentation to the networking book
From: Randy Dunlap @ 2019-02-14 4:39 UTC (permalink / raw)
To: Andrew Lunn
Cc: Russell King, linux-doc, netdev, David S. Miller, Jonathan Corbet
In-Reply-To: <20190214043217.GB20024@lunn.ch>
On 2/13/19 8:32 PM, Andrew Lunn wrote:
>>> +For information describing the SFP cage in DT, please see the binding
>>> +documentation in the kernel source tree
>>> +``Documentation/devicetree/bindings/net/sff,sfp.txt``
>> oh, so SFP means "Small Form-factor Pluggable".
>>
>> I see that this source file:
>> ./drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:1902:
>>
>> seems to imply that SFP means "single function per port (SFP) mode":
>
> Hi Randy
>
> rfc5513 might be relevant.
>
> Andrew
>
Definitely. like WAD. :)
thanks.
--
~Randy
^ permalink raw reply
* Re: [PATCH net] net: phy: fix interrupt handling in non-started states
From: David Miller @ 2019-02-14 4:44 UTC (permalink / raw)
To: hkallweit1; +Cc: andrew, f.fainelli, netdev, linux
In-Reply-To: <25e86edc-0b88-8c03-b692-776e971331f2@gmail.com>
From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Tue, 12 Feb 2019 19:56:15 +0100
> phylib enables interrupts before phy_start() has been called, and if
> we receive an interrupt in a non-started state, the interrupt handler
> returns IRQ_NONE. This causes problems with at least one Marvell chip
> as reported by Andrew.
> Fix this by handling interrupts the same as in phy_mac_interrupt(),
> basically always running the phylib state machine. It knows when it
> has to do something and when not.
> This change allows to handle interrupts gracefully even if they
> occur in a non-started state.
>
> Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking")
> Reported-by: Andrew Lunn <andrew@lunn.ch>
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Applied, thanks Heiner.
^ permalink raw reply
* Re: [PATCH net] dsa: mv88e6xxx: Ensure all pending interrupts are handled prior to exit
From: David Miller @ 2019-02-14 4:47 UTC (permalink / raw)
To: andrew; +Cc: dave.anglin, linux, vivien.didelot, f.fainelli, netdev
In-Reply-To: <20190214020723.GE24589@lunn.ch>
From: Andrew Lunn <andrew@lunn.ch>
Date: Thu, 14 Feb 2019 03:07:23 +0100
> On Mon, Feb 11, 2019 at 01:40:21PM -0500, John David Anglin wrote:
>> The GPIO interrupt controller on the espressobin board only supports edge interrupts.
>> If one enables the use of hardware interrupts in the device tree for the 88E6341, it is
>> possible to miss an edge. When this happens, the INTn pin on the Marvell switch is
>> stuck low and no further interrupts occur.
>>
>> I found after adding debug statements to mv88e6xxx_g1_irq_thread_work() that there is
>> a race in handling device interrupts (e.g. PHY link interrupts). Some interrupts are
>> directly cleared by reading the Global 1 status register. However, the device interrupt
>> flag, for example, is not cleared until all the unmasked SERDES and PHY ports are serviced.
>> This is done by reading the relevant SERDES and PHY status register.
>>
>> The code only services interrupts whose status bit is set at the time of reading its status
>> register. If an interrupt event occurs after its status is read and before all interrupts
>> are serviced, then this event will not be serviced and the INTn output pin will remain low.
>>
>> This is not a problem with polling or level interrupts since the handler will be called
>> again to process the event. However, it's a big problem when using edge interrupts.
>>
>> The fix presented here is to add a loop around the code servicing switch interrupts. If
>> any pending interrupts remain after the current set has been handled, we loop and process
>> the new set. If there are no pending interrupts after servicing, we are sure that INTn has
>> gone high and we will get an edge when a new event occurs.
>>
>> Tested on espressobin board.
>>
>> Signed-off-by: John David Anglin <dave.anglin@bell.net>
>
> Fixes: dc30c35be720 ("net: dsa: mv88e6xxx: Implement interrupt support.")
>
> Tested-by: Andrew Lunn <andrew@lunn.ch>
>
> David, please ensure that Heiner's patch:
>
> net: phy: fix interrupt handling in non-started states
>
> is applied first. Otherwise we can get into an interrupt storm.
Ok, all done.
Should I queue just this one for -stable? I didn't queue up Heiner's change for
-stable because it fixes a 5.0-rcX regression.
^ permalink raw reply
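The looping fix described in the quoted patch can be modelled in a few lines. This is a sketch under stated assumptions, not the mv88e6xxx driver: demo_switch, its status word and the late_events field are hypothetical, and the "late event" simply simulates an interrupt source firing after the status snapshot has been read.

```c
#include <stdint.h>

/* Hypothetical model: the INTn line stays low while any status bit is set. */
struct demo_switch {
	uint16_t status;        /* pending interrupt sources */
	uint16_t late_events;   /* sources that fire while the handler runs */
};

static uint16_t demo_read_status(const struct demo_switch *sw)
{
	return sw->status;
}

static void demo_service(struct demo_switch *sw, unsigned int bit)
{
	sw->status &= (uint16_t)~(1u << bit);
	/* Simulate an event arriving after the snapshot was taken. */
	sw->status |= sw->late_events;
	sw->late_events = 0;
}

/* One pass: services only the bits that were set in the snapshot. */
void demo_handle_once(struct demo_switch *sw)
{
	uint16_t snap = demo_read_status(sw);
	unsigned int bit;

	for (bit = 0; bit < 16; bit++)
		if (snap & (1u << bit))
			demo_service(sw, bit);
}

/*
 * Looping variant: repeat until the status reads zero, so the line is
 * known to be high again and the next event produces a fresh edge.
 * Returns the number of passes taken.
 */
unsigned int demo_handle_all(struct demo_switch *sw)
{
	unsigned int passes = 0;

	while (demo_read_status(sw) != 0) {
		demo_handle_once(sw);
		passes++;
	}
	return passes;
}
```

With a single pass, an event that lands after the snapshot leaves a status bit set and, on an edge-only controller, no further interrupt arrives. The looping variant picks it up on the second pass.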
* Re: [RFC PATCH net-next 2/5] net: 8021q: vlan_dev: add vid tag for uc and mc address lists
From: Florian Fainelli @ 2019-02-14 4:49 UTC (permalink / raw)
To: Ivan Khoronzhuk, davem, linux-omap, netdev, linux-kernel, jiri,
andrew
In-Reply-To: <20190213161715.GA32249@khorivan>
On February 13, 2019 8:17:16 AM PST, Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
>On Tue, Jan 22, 2019 at 03:12:41PM +0200, Ivan Khoronzhuk wrote:
>>On Mon, Jan 21, 2019 at 03:37:41PM -0800, Florian Fainelli wrote:
>>>On 12/4/18 3:42 PM, Ivan Khoronzhuk wrote:
>>>>On Tue, Dec 04, 2018 at 11:49:27AM -0800, Florian Fainelli wrote:
>>
>>[...]
>>
>>>
>>>Ivan, based on the recent submission I copied you on [1], it sounds like
>>>we want to move ahead with your proposal to extend netdev_hw_addr with a
>>>vid member.
>>>
>>>On second thought, your approach is good and if we enclose the vid
>>>member within an #if IS_ENABLED(CONFIG_VLAN_8021Q) we should be good for
>>>most foreseeable use cases, if not, we can always introduce a variable
>>>size/defined context in the future.
>>>
>>>Can you resubmit this patch series as non-RFC in the next few days so I
>>>can also repost mine [1] and take advantage of these changes for
>>>multicast over VLAN when VLAN filtering is globally enabled on the device?
>>>
>>>[1]: https://www.spinics.net/lists/netdev/msg544722.html
>>>
>>>Thanks!
>>
>>Yes, sure. I can start to do that in several days.
>>Just a little busy right now.
>>
>>Just before doing this, maybe some comments could be added as it has more
>>attention now. Meanwhile I can send alternative variant but based on
>>real dev splitting addresses between vlans. In this approach it leaves address
>>space w/o vid extension but requires more changes to vlan core. Drawback here
>>that to change one address alg traverses all related vlan addresses, it can be
>>cpu/time wasteful, if it's done regularly, but saves memory....
>>
>>Basically it's implemented locally in cpsw and requires more changes to move
>>it as some vlan core auxiliary functions to be reused. But it can work only
>>with vlans directly on top of real dev, which is fixable.
>>
>>Core function here:
>>__hw_addr_ref_sync_dev
>>it is called only for address the link of which was increased/decreased, thus
>>update made only on one address, comparing it for every vlan dev.
>>
>>It was added with this patch:
>>[1] net: core: dev_addr_lists: add auxiliary func to handle reference
>>address update e7946760de5852f32
>>
>>And used by this patch:
>>[2] net: ethernet: ti: cpsw: fix vlan mcast 15180eca569bfe1d4d
>>
>>So, idea is to move [2] to be vlan core auxiliary function to be reused
>>by NIC drivers.
>>
>>But potentially it can bring a little more changes I assume:
>>
>>1) add priv_flag |= IFF_IV_FLT (independent vlan filtering). It allows to reuse
>>this flag for further changes, probably for per vlan allmulti or so.
>>
>>2) real dev has to have complete list for vlans, not only their vids, but also
>>all vlandevs in device chain above it. So changes in add_vid can be required.
>>Vlan core can assign vlan dev pointer to real device only after it's completely
>>initialized. And for propagation reasons it requires every device in
>>infrastructure to be aware. That seems doable, but depends not only on me.
>>
>>3) Move code from [2] to be auxiliary vlan core API for setting mc and uc.
>>From this patch only one function is cpsw specific: cpsw_set_mc(). The rest can
>>be applicable on every NIC supporting IFF_IV_FLT.
>>
>>4) Move code from link below to do the same but for uc addresses:
>>https://git.linaro.org/people/ivan.khoronzhuk/tsn_kernel.git/commit/?h=ucast_vlan_fix&id=ebc88a7d8758759322d9ff88f25f8bac51ce7219
>>here only one func cpsw specific: cpsw_set_uc()
>>the rest can be generic.
>>
>>As third alternative, we can think about how to reduce memory for addresses by
>>reusing them or else, but this is as continuation of addr+vid approach, and API
>>probably would be the same.
>>
>>Then all this can be compared for proper decision.
>
>
>Hi Florian,
>
>After several more investigations and tries probably better left this
>idea as is.
Thank you for keeping the thread alive, does that mean you are going to resubmit this patch series as-is (rebased) or are you saying that you are abandoning the idea and leaving the situation the way it is in cpsw?
>
>Here actually several explanations for this:
>1) If even assume that we can get access to vlan devices in the above ndev
>tree (we can) that doesn't guarantee that receive vlan filters are set
>replicating this structure. For example bond device can have one active slave
>but both of them in the tree having vid set, in this case addresses are
>synced only with active slave, no filters should be applied to not active slave.
>This can be achieved only if each address has vid context.
>
>2) According to 1) rx filters device structure can be created while mc_sync()
>in each rx_mode(), and then used as orthogonal info. I've tried and it looks
>not cool and consumes anyway memory and even if it's less it's still not very
>scalable. (+ no normal signal "in complex structure case" when address should
>be updated to avoid redundant cpu cycles). Not sure it can have practical
>results and be universal enough.
>
>3) Assuming that every device in the tree (bond, team or else) is legal to
>modify its own address space, the real end device cannot be sure the vlan device
>address spaces reflects vid addresses that device tree wants from him.
>According to this each address in address space must hold its own context at
>every device and this context is comparable with address size.
>
>>-- Regards,
>>Ivan Khoronzhuk
--
Florian
^ permalink raw reply
* Re: [PATCH net] dsa: mv88e6xxx: Ensure all pending interrupts are handled prior to exit
From: Andrew Lunn @ 2019-02-14 4:50 UTC (permalink / raw)
To: David Miller; +Cc: dave.anglin, linux, vivien.didelot, f.fainelli, netdev
In-Reply-To: <20190213.204731.2262809689964875254.davem@davemloft.net>
> Ok, all done.
Thanks
> Should I queue just this one for -stable? I didn't queue up Heiner's change for
> -stable because it fixes a 5.0-rcX regression.
Yes please.
Andrew
^ permalink raw reply
* Re: [PATCH net] net: neterion: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 5:00 UTC (permalink / raw)
To: albin_yang; +Cc: netdev, jdmason, yang.wei9
In-Reply-To: <1549986451-4780-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Tue, 12 Feb 2019 23:47:31 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called when skb xmit done. It makes
> drop profiles(dropwatch, perf) more friendly.
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied to net-next.
^ permalink raw reply
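Why the choice of free helper matters to drop profilers can be shown with a toy model. The sketch below assumes, as the patch description implies, that a buffer freed as "dropped" is reported to drop-monitoring tools while one freed as "consumed" is not. The enum, struct and function names are hypothetical and are not the kernel's.

```c
/* Hypothetical model of how a drop profiler sees freed transmit buffers. */
enum demo_free_reason {
	DEMO_REASON_CONSUMED,   /* transmit completed normally */
	DEMO_REASON_DROPPED     /* packet was discarded */
};

struct demo_drop_profile {
	unsigned long drops_reported;
	unsigned long buffers_freed;
};

/* Every buffer is freed, but only "dropped" frees show up as drops. */
void demo_free_skb(struct demo_drop_profile *p, enum demo_free_reason reason)
{
	p->buffers_freed++;
	if (reason == DEMO_REASON_DROPPED)
		p->drops_reported++;
}
```

In this model, freeing a successfully transmitted buffer with the "dropped" reason inflates the drop count, which is the noise the series of patches in this thread aims to remove.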
* Re: [PATCH net] net: atheros: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 5:01 UTC (permalink / raw)
To: albin_yang; +Cc: netdev, jcliburn, chris.snook, yang.wei9
In-Reply-To: <1549986705-4915-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Tue, 12 Feb 2019 23:51:45 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called when skb xmit done. It makes
> drop profiles(dropwatch, perf) more friendly.
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied.
^ permalink raw reply
* Re: [PATCH net] net: apple: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 5:01 UTC (permalink / raw)
To: albin_yang; +Cc: netdev, yang.wei9
In-Reply-To: <1549986773-4974-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Tue, 12 Feb 2019 23:52:53 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called in mace_interrupt() when skb
> xmit done. It makes drop profiles(dropwatch, perf) more friendly.
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied to net-next.
^ permalink raw reply
* Re: [PATCH net] net: qualcomm: emac: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 5:01 UTC (permalink / raw)
To: albin_yang; +Cc: netdev, timur, yang.wei9
In-Reply-To: <1549986597-4837-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Tue, 12 Feb 2019 23:49:57 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called in emac_mac_tx_process() when
> skb xmit done. It makes drop profiles(dropwatch, perf) more friendly.
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied to net-next.
^ permalink raw reply
* Re: [PATCH net] net: moxa: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 5:01 UTC (permalink / raw)
To: albin_yang; +Cc: netdev, keescook, yang.wei9
In-Reply-To: <1549986960-5031-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Tue, 12 Feb 2019 23:56:00 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called in moxart_tx_finished() when
> skb xmit done. It makes drop profiles(dropwatch, perf) more friendly.
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied, thanks.
^ permalink raw reply
* Re: [PATCH net] net: sis: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 5:02 UTC (permalink / raw)
To: albin_yang; +Cc: netdev, romieu, venza, yang.wei9
In-Reply-To: <1549987144-5333-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Tue, 12 Feb 2019 23:59:04 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called when skb xmit done. It makes
> drop profiles(dropwatch, perf) more friendly.
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied to net-next.
^ permalink raw reply
* Re: [PATCH net] net: macb: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 5:02 UTC (permalink / raw)
To: albin_yang; +Cc: netdev, nicolas.ferre, yang.wei9
In-Reply-To: <1549987202-5393-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Wed, 13 Feb 2019 00:00:02 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called in at91ether_interrupt() when
> skb xmit done. It makes drop profiles(dropwatch, perf) more friendly.
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied, thanks.
^ permalink raw reply
* Re: [PATCH net] net: ixp4xx_eth: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 5:02 UTC (permalink / raw)
To: albin_yang; +Cc: netdev, khalasa, yang.wei9
In-Reply-To: <1549987271-5449-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Wed, 13 Feb 2019 00:01:11 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called in eth_txdone_irq() when skb
> xmit done. It makes drop profiles(dropwatch, perf) more friendly.
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied.
^ permalink raw reply
* Re: [PATCH] qed: fix indentation issue with statements in an if-block
From: David Miller @ 2019-02-14 5:04 UTC (permalink / raw)
To: colin.king
Cc: aelior, GR-everest-linux-l2, netdev, kernel-janitors,
linux-kernel
In-Reply-To: <20190212160153.12432-1-colin.king@canonical.com>
From: Colin King <colin.king@canonical.com>
Date: Tue, 12 Feb 2019 16:01:53 +0000
> From: Colin Ian King <colin.king@canonical.com>
>
> There are some statements in an if-block that are not correctly
> indented. Fix these.
>
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
Applied to net-next.
^ permalink raw reply
* Re: [PATCH] qlge: fix some indentation issues
From: David Miller @ 2019-02-14 5:04 UTC (permalink / raw)
To: colin.king
Cc: manishc, GR-Linux-NIC-Dev, netdev, kernel-janitors, linux-kernel
In-Reply-To: <20190212160807.12807-1-colin.king@canonical.com>
From: Colin King <colin.king@canonical.com>
Date: Tue, 12 Feb 2019 16:08:07 +0000
> From: Colin Ian King <colin.king@canonical.com>
>
> There are some statements that are indented incorrectly. Fix these.
>
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
Applied to net-next.
^ permalink raw reply
* Re: [PATCH net] net: fix possible overflow in __sk_mem_raise_allocated()
From: David Miller @ 2019-02-14 5:05 UTC (permalink / raw)
To: edumazet; +Cc: netdev, eric.dumazet
In-Reply-To: <20190212202627.184863-1-edumazet@google.com>
From: Eric Dumazet <edumazet@google.com>
Date: Tue, 12 Feb 2019 12:26:27 -0800
> With many active TCP sockets, fat TCP sockets could fool
> __sk_mem_raise_allocated() thanks to an overflow.
>
> They would increase their share of the memory, instead
> of decreasing it.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Applied, thanks Eric.
^ permalink raw reply
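The kind of overflow described above can be reproduced in isolation. The sketch below is not the code from __sk_mem_raise_allocated(); the function names and the "limit > sockets * pages" comparison are assumptions chosen to show how a product computed in 32 bits can wrap and make a large consumer look small. Unsigned arithmetic is used so the wraparound is well defined in standard C.

```c
#include <stdbool.h>
#include <stdint.h>

/* 32-bit product: wraps modulo 2^32, so a huge product can look tiny. */
bool demo_under_limit_32(uint32_t limit, uint32_t sockets, uint32_t pages)
{
	uint32_t product = sockets * pages;

	return limit > product;
}

/* 64-bit product: widening one operand first keeps the full result. */
bool demo_under_limit_64(uint32_t limit, uint32_t sockets, uint32_t pages)
{
	uint64_t product = (uint64_t)sockets * pages;

	return limit > product;
}
```

With sockets = 65536 and pages = 65536 the 32-bit product wraps to 0, so the first function reports the socket as under any nonzero limit, while the second does not.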
* Re: [PATCH net-next] net: sched: flower: only return error from hw offload if skip_sw
From: David Miller @ 2019-02-14 5:07 UTC (permalink / raw)
To: vladbu; +Cc: netdev, jhs, xiyou.wangcong, jiri, pablo
In-Reply-To: <20190212213906.9368-1-vladbu@mellanox.com>
From: Vlad Buslov <vladbu@mellanox.com>
Date: Tue, 12 Feb 2019 23:39:06 +0200
> Recently introduced tc_setup_flow_action() can fail when parsing tcf_exts
> on some unsupported action commands. However, this should not affect the
> case when user did not explicitly request hw offload by setting skip_sw
> flag. Modify tc_setup_flow_action() callers to only propagate the error if
> skip_sw flag is set for filter that is being offloaded, and set extack
> error message in that case.
>
> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
> Fixes: 3a7b68617de7 ("cls_api: add translator to flow_action representation")
Applied, thanks Vlad.
^ permalink raw reply
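The policy described in the patch, where a hardware offload failure is reported only when the user asked for hardware-only operation, reduces to a small decision. The helper below is a hypothetical illustration of that rule, not the cls_flower code, and the error value used in the usage note is only an example.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: decide what to return to the caller after an
 * offload attempt. hw_err is 0 on success or a negative error code.
 */
int demo_offload_result(int hw_err, bool skip_sw)
{
	/* Without skip_sw the software path still handles the filter. */
	if (hw_err != 0 && skip_sw)
		return hw_err;
	return 0;
}
```

For example, demo_offload_result(-95, false) yields 0 because the filter can still run in software, while demo_offload_result(-95, true) passes the error through.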
* Re: [PATCH net-next 1/1] flow_offload: fix block stats
From: David Miller @ 2019-02-14 5:08 UTC (permalink / raw)
To: john.hurley; +Cc: jiri, netdev, pablo, oss-drivers
In-Reply-To: <1550017432-26306-1-git-send-email-john.hurley@netronome.com>
From: John Hurley <john.hurley@netronome.com>
Date: Wed, 13 Feb 2019 00:23:52 +0000
> With the introduction of flow_stats_update(), drivers now update the stats
> fields of the passed tc_cls_flower_offload struct, rather than call
> tcf_exts_stats_update() directly to update the stats of offloaded TC
> flower rules. However, if multiple qdiscs are registered to a TC shared
> block and a flower rule is applied, then, when getting stats for the rule,
> multiple callbacks may be made.
>
> Take this into consideration by modifying flow_stats_update to gather the
> stats from all callbacks. Currently, the values in tc_cls_flower_offload
> only account for the last stats callback in the list.
>
> Fixes: 3b1903ef97c0 ("flow_offload: add statistics retrieval infrastructure and use it")
> Signed-off-by: John Hurley <john.hurley@netronome.com>
> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Applied, thanks.
^ permalink raw reply
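The difference between "only the last callback counts" and "gather the stats from all callbacks" can be sketched as follows. The struct and function names are hypothetical, and keeping the most recent lastused value is an assumption about how a timestamp would reasonably be merged; the real flow_stats_update() may differ in detail.

```c
#include <stdint.h>

/* Hypothetical, simplified per-rule statistics. */
struct demo_flow_stats {
	uint64_t pkts;
	uint64_t bytes;
	uint64_t lastused;
};

/* Overwriting: after several callbacks only the last one is visible. */
void demo_stats_overwrite(struct demo_flow_stats *s, uint64_t bytes,
			  uint64_t pkts, uint64_t lastused)
{
	s->pkts = pkts;
	s->bytes = bytes;
	s->lastused = lastused;
}

/* Accumulating: counters add up, the timestamp keeps the most recent use. */
void demo_stats_accumulate(struct demo_flow_stats *s, uint64_t bytes,
			   uint64_t pkts, uint64_t lastused)
{
	s->pkts += pkts;
	s->bytes += bytes;
	if (lastused > s->lastused)
		s->lastused = lastused;
}
```

With two callbacks reporting (100 bytes, 1 packet) and (200 bytes, 2 packets), the overwriting version ends at 200 bytes and 2 packets, while the accumulating version ends at 300 bytes and 3 packets.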
* Re: [PATCH bpf-next v11 0/7] bpf: add BPF_LWT_ENCAP_IP option to bpf_lwt_push_encap
From: Peter Oskolkov @ 2019-02-14 5:36 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: David Ahern, Alexei Starovoitov, Daniel Borkmann, netdev,
Peter Oskolkov, Willem de Bruijn
In-Reply-To: <20190214042127.azcsxbrpzhgumiwa@ast-mbp>
On Wed, Feb 13, 2019 at 8:21 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Feb 13, 2019 at 08:44:51PM -0700, David Ahern wrote:
> > On 2/13/19 7:39 PM, Alexei Starovoitov wrote:
> > > On Wed, Feb 13, 2019 at 05:46:26PM -0700, David Ahern wrote:
> > >> On 2/13/19 12:53 PM, Peter Oskolkov wrote:
> > >>> This patchset implements BPF_LWT_ENCAP_IP mode in bpf_lwt_push_encap
> > >>> BPF helper. It enables BPF programs (specifically, BPF_PROG_TYPE_LWT_IN
> > >>> and BPF_PROG_TYPE_LWT_XMIT prog types) to add IP encapsulation headers
> > >>> to packets (e.g. IP/GRE, GUE, IPIP).
> > >>>
> > >>> This is useful when thousands of different short-lived flows should be
> > >>> encapped, each with different and dynamically determined destination.
> > >>> Although lwtunnels can be used in some of these scenarios, the ability
> > >>> to dynamically generate encap headers adds more flexibility, e.g.
> > >>> when routing depends on the state of the host (reflected in global bpf
> > >>> maps).
> > >>>
> > >>
> > >>
> > >> For the set:
> > >> Reviewed-by: David Ahern <dsahern@gmail.com>
> > >
> > > Applied. Thanks everyone!
> > >
> >
> > Looks like a cleanup round is needed.
> >
> > I changed the routes to fail with unreachable:
> >
> > @@ -179,16 +175,16 @@
> > ip -netns ${NS3} tunnel add gre_dev mode gre remote ${IPv4_1} local
> > ${IPv4_GRE} ttl 255
> > ip -netns ${NS3} link set gre_dev up
> > ip -netns ${NS3} addr add ${IPv4_GRE} dev gre_dev
> > - ip -netns ${NS1} route add ${IPv4_GRE}/32 dev veth5 via ${IPv4_6}
> > - ip -netns ${NS2} route add ${IPv4_GRE}/32 dev veth7 via ${IPv4_8}
> > + ip -netns ${NS1} route add unreachable ${IPv4_GRE}/32
> > + ip -netns ${NS2} route add unreachable ${IPv4_GRE}/32
> >
> >
> > # configure IPv6 GRE device in NS3, and a route to it via the "bottom"
> > route
> > ip -netns ${NS3} -6 tunnel add name gre6_dev mode ip6gre remote
> > ${IPv6_1} local ${IPv6_GRE} ttl 255
> > ip -netns ${NS3} link set gre6_dev up
> > ip -netns ${NS3} -6 addr add ${IPv6_GRE} nodad dev gre6_dev
> > - ip -netns ${NS1} -6 route add ${IPv6_GRE}/128 dev veth5 via ${IPv6_6}
> > - ip -netns ${NS2} -6 route add ${IPv6_GRE}/128 dev veth7 via ${IPv6_8}
> > + ip -netns ${NS1} -6 route add unreachable ${IPv6_GRE}/128
> > + ip -netns ${NS2} -6 route add unreachable ${IPv6_GRE}/128
> >
> > # rp_filter gets confused by what these tests are doing, so disable it
> > ip netns exec ${NS1} sysctl -wq net.ipv4.conf.all.rp_filter=0
> > @@ -220,7 +216,6 @@
> >
> >
> > and then removed all of the set -e and exit 1's in the script (really
> > should let all of the tests run versus bailing on the first failure).
> >
> > With kmemleak enabled I see a lot of suspected memory leaks - some may
> > not be related to this change but it is triggering the suspected leak:
>
> argh. Thanks a lot for catching it.
> Let's figure out the fix quickly.
Reproduced. Looking.
> If it's too intrusive we can revert and reapply.
> I'm not going to send a pull-req to Dave with a known issue like this.
>
^ permalink raw reply