* Re: [PATCH v6 06/11] ARM: dts: sunxi: h3/h5: represent the mdio switch used by sun8i-h3-emac
From: Corentin Labbe @ 2017-09-27 13:47 UTC (permalink / raw)
To: Maxime Ripard
Cc: robh+dt, mark.rutland, wens, linux, catalin.marinas, will.deacon,
peppe.cavallaro, alexandre.torgue, andrew, f.fainelli,
frowand.list, netdev, devicetree, linux-arm-kernel, linux-kernel,
linux-sunxi
In-Reply-To: <20170927101622.v7odmreccwh7ldg2@flea>
On Wed, Sep 27, 2017 at 12:16:22PM +0200, Maxime Ripard wrote:
> On Wed, Sep 27, 2017 at 07:34:09AM +0000, Corentin Labbe wrote:
> > Since dwmac-sun8i could use either an integrated PHY or an external PHY
> > (which could be at same MDIO address), we need to represent this selection
> > by a MDIO switch.
> >
> > Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
> > ---
> > arch/arm/boot/dts/sunxi-h3-h5.dtsi | 31 +++++++++++++++++++++++++------
> > 1 file changed, 25 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/arm/boot/dts/sunxi-h3-h5.dtsi b/arch/arm/boot/dts/sunxi-h3-h5.dtsi
> > index 3b7d953429a6..a8e9b8f378ba 100644
> > --- a/arch/arm/boot/dts/sunxi-h3-h5.dtsi
> > +++ b/arch/arm/boot/dts/sunxi-h3-h5.dtsi
> > @@ -422,14 +422,33 @@
> > #size-cells = <0>;
> > status = "disabled";
> >
> > - mdio: mdio {
> > + mdio0: mdio {
> > #address-cells = <1>;
> > #size-cells = <0>;
> > - int_mii_phy: ethernet-phy@1 {
> > - compatible = "ethernet-phy-ieee802.3-c22";
> > - reg = <1>;
> > - clocks = <&ccu CLK_BUS_EPHY>;
> > - resets = <&ccu RST_BUS_EPHY>;
> > + compatible = "snps,dwmac-mdio";
> > +
> > + mdio-mux {
> > + compatible = "mdio-mux";
> > + #address-cells = <1>;
> > + #size-cells = <0>;
>
> Newline
>
> > + /* Only one MDIO is usable at the time */
> > + internal_mdio: mdio@1 {
> > + reg = <1>;
> > + #address-cells = <1>;
> > + #size-cells = <0>;
>
> Newline
>
> > + int_mii_phy: ethernet-phy@1 {
> > + compatible = "ethernet-phy-ieee802.3-c22";
> > + reg = <1>;
> > + clocks = <&ccu CLK_BUS_EPHY>;
> > + resets = <&ccu RST_BUS_EPHY>;
> > + phy-is-integrated;
> > + };
> > + };
>
> Newline
>
> > + mdio: mdio@2 {
>
> This is quite confusing. Why not call the label external_mdio?
>
I will do it. (at origin I was not changing it for limiting changes on board with external PHY, but now all DT are reverted, it will be easy)
Regards
^ permalink raw reply
* Re: [PATCH v2 net-next 2/2] net/sched: allow flower to match tunnel options
From: Jiri Pirko @ 2017-09-27 13:47 UTC (permalink / raw)
To: Simon Horman
Cc: David Miller, Jiri Pirko, Jamal Hadi Salim, Cong Wang, netdev,
oss-drivers
In-Reply-To: <20170927133731.GA14183@vergenet.net>
Wed, Sep 27, 2017 at 03:37:33PM CEST, simon.horman@netronome.com wrote:
>On Wed, Sep 27, 2017 at 02:56:03PM +0200, Jiri Pirko wrote:
>> Wed, Sep 27, 2017 at 02:52:06PM CEST, simon.horman@netronome.com wrote:
>> >On Wed, Sep 27, 2017 at 01:08:22PM +0200, Jiri Pirko wrote:
>> >> Wed, Sep 27, 2017 at 11:27:33AM CEST, simon.horman@netronome.com wrote:
>> >> >On Wed, Sep 27, 2017 at 11:10:05AM +0200, Jiri Pirko wrote:
>> >> >> Wed, Sep 27, 2017 at 10:16:34AM CEST, simon.horman@netronome.com wrote:
>> >
>> >...
>> >
>> >> >> > enum flow_dissector_key_id {
>> >> >> > FLOW_DISSECTOR_KEY_CONTROL, /* struct flow_dissector_key_control */
>> >> >> > FLOW_DISSECTOR_KEY_BASIC, /* struct flow_dissector_key_basic */
>> >> >> >@@ -205,6 +217,7 @@ enum flow_dissector_key_id {
>> >> >> > FLOW_DISSECTOR_KEY_MPLS, /* struct flow_dissector_key_mpls */
>> >> >> > FLOW_DISSECTOR_KEY_TCP, /* struct flow_dissector_key_tcp */
>> >> >> > FLOW_DISSECTOR_KEY_IP, /* struct flow_dissector_key_ip */
>> >> >> >+ FLOW_DISSECTOR_KEY_ENC_OPTS, /* struct flow_dissector_key_enc_opts */
>> >> >>
>> >> >> I don't see the actual dissection implementation. Where is it?
>> >> >> Did you test the patchset?
>> >> >
>> >> >Yes, I did test it. But it is also possible something went astray along the
>> >> >way and I will retest.
>> >> >
>> >> >I think that the code you are looking for is in
>> >> >fl_classify() in this patch.
>> >>
>> >> The dissection should be done in the flow_dissector. That's the whole
>> >> point in having it generic. You should move it there.
>> >
>> >Coming back to this after lunch, I believe what I have done in this patch
>> >is consistent with handling of other enc fields, which are set in
>> >fl_classify() rather than the dissector. In particular the ip_tunnel_info,
>> >which is used by this patch, is already used in fl_classify().
>>
>> That means the current code is wrong. The dissection should be done in
>> flow_dissector, not in fl_classify.
>
>Would an better approach be to move the fl_classify() below into, say,
>skb_flow_dissect_tunnel_info() and call that from fl_classify().
No. There is one flow dissection function and you just set it up in a
way you need it. Makes no sense to me to split it up in any way.
>
>The reason I suggest this rather than moving the code into
>__skb_flow_dissect() is that currently flower assumes that tunnel_info
>is used if present. While I assume other users of () assume tunnel_info
>is not used even if present.
__skb_flow_dissect should look at what caller wants, then check skb_tunnel_info
only in case it is needed.
>
>> >Without this patch I see:
>> >
>> >
>> >static int fl_classify(struct sk_buff *skb, const struct tcf_proto *tp,
>> > struct tcf_result *res)
>> >{
>> > ...
>> > struct ip_tunnel_info *info;
>> >
>> > ...
>> >
>> > info = skb_tunnel_info(skb);
>> > if (info) {
>> > struct ip_tunnel_key *key = &info->key;
>> >
>> > switch (ip_tunnel_info_af(info)) {
>> > case AF_INET:
>> > skb_key.enc_control.addr_type =
>> > FLOW_DISSECTOR_KEY_IPV4_ADDRS;
>> > skb_key.enc_ipv4.src = key->u.ipv4.src;
>> > skb_key.enc_ipv4.dst = key->u.ipv4.dst;
>> > break;
>> > case AF_INET6:
>> > skb_key.enc_control.addr_type =
>> > FLOW_DISSECTOR_KEY_IPV6_ADDRS;
>> > skb_key.enc_ipv6.src = key->u.ipv6.src;
>> > skb_key.enc_ipv6.dst = key->u.ipv6.dst;
>> > break;
>> > }
>> >
>> > skb_key.enc_key_id.keyid = tunnel_id_to_key32(key->tun_id);
>> > skb_key.enc_tp.src = key->tp_src;
>> > skb_key.enc_tp.dst = key->tp_dst;
>> > }
>> >
>> > ...
>> >}
>> >
>> >This patch adds the following inside the if() clause above:
>> >
>> > if (info->options_len) {
>> > skb_key.enc_opts.len = info->options_len;
>> > ip_tunnel_info_opts_get(skb_key.enc_opts.data, info);
>> > }
>> >
>> >
^ permalink raw reply
* Re: [PATCH v2 net-next 2/2] net/sched: allow flower to match tunnel options
From: Simon Horman @ 2017-09-27 13:50 UTC (permalink / raw)
To: Jiri Pirko
Cc: David Miller, Jiri Pirko, Jamal Hadi Salim, Cong Wang, netdev,
oss-drivers
In-Reply-To: <20170927134750.GI1944@nanopsycho.orion>
On Wed, Sep 27, 2017 at 03:47:50PM +0200, Jiri Pirko wrote:
> Wed, Sep 27, 2017 at 03:37:33PM CEST, simon.horman@netronome.com wrote:
> >On Wed, Sep 27, 2017 at 02:56:03PM +0200, Jiri Pirko wrote:
> >> Wed, Sep 27, 2017 at 02:52:06PM CEST, simon.horman@netronome.com wrote:
> >> >On Wed, Sep 27, 2017 at 01:08:22PM +0200, Jiri Pirko wrote:
> >> >> Wed, Sep 27, 2017 at 11:27:33AM CEST, simon.horman@netronome.com wrote:
> >> >> >On Wed, Sep 27, 2017 at 11:10:05AM +0200, Jiri Pirko wrote:
> >> >> >> Wed, Sep 27, 2017 at 10:16:34AM CEST, simon.horman@netronome.com wrote:
> >> >
> >> >...
> >> >
> >> >> >> > enum flow_dissector_key_id {
> >> >> >> > FLOW_DISSECTOR_KEY_CONTROL, /* struct flow_dissector_key_control */
> >> >> >> > FLOW_DISSECTOR_KEY_BASIC, /* struct flow_dissector_key_basic */
> >> >> >> >@@ -205,6 +217,7 @@ enum flow_dissector_key_id {
> >> >> >> > FLOW_DISSECTOR_KEY_MPLS, /* struct flow_dissector_key_mpls */
> >> >> >> > FLOW_DISSECTOR_KEY_TCP, /* struct flow_dissector_key_tcp */
> >> >> >> > FLOW_DISSECTOR_KEY_IP, /* struct flow_dissector_key_ip */
> >> >> >> >+ FLOW_DISSECTOR_KEY_ENC_OPTS, /* struct flow_dissector_key_enc_opts */
> >> >> >>
> >> >> >> I don't see the actual dissection implementation. Where is it?
> >> >> >> Did you test the patchset?
> >> >> >
> >> >> >Yes, I did test it. But it is also possible something went astray along the
> >> >> >way and I will retest.
> >> >> >
> >> >> >I think that the code you are looking for is in
> >> >> >fl_classify() in this patch.
> >> >>
> >> >> The dissection should be done in the flow_dissector. That's the whole
> >> >> point in having it generic. You should move it there.
> >> >
> >> >Coming back to this after lunch, I believe what I have done in this patch
> >> >is consistent with handling of other enc fields, which are set in
> >> >fl_classify() rather than the dissector. In particular the ip_tunnel_info,
> >> >which is used by this patch, is already used in fl_classify().
> >>
> >> That means the current code is wrong. The dissection should be done in
> >> flow_dissector, not in fl_classify.
> >
> >Would an better approach be to move the fl_classify() below into, say,
> >skb_flow_dissect_tunnel_info() and call that from fl_classify().
>
> No. There is one flow dissection function and you just set it up in a
> way you need it. Makes no sense to me to split it up in any way.
>
>
> >
> >The reason I suggest this rather than moving the code into
> >__skb_flow_dissect() is that currently flower assumes that tunnel_info
> >is used if present. While I assume other users of () assume tunnel_info
> >is not used even if present.
>
> __skb_flow_dissect should look at what caller wants, then check skb_tunnel_info
> only in case it is needed.
Ok, do you think it is sufficient for __skb_flow_dissect to look at the
tunnel keys, say FLOW_DISSECTOR_KEY_ENC_*? I am a bit concerned this may
break flower as it look at the tunnel info unconditionally.
^ permalink raw reply
* Re: [PATCH v2 net-next 2/2] net/sched: allow flower to match tunnel options
From: Jiri Pirko @ 2017-09-27 14:00 UTC (permalink / raw)
To: Simon Horman
Cc: David Miller, Jiri Pirko, Jamal Hadi Salim, Cong Wang, netdev,
oss-drivers, amir
In-Reply-To: <20170927135042.GB14183@vergenet.net>
Wed, Sep 27, 2017 at 03:50:44PM CEST, simon.horman@netronome.com wrote:
>On Wed, Sep 27, 2017 at 03:47:50PM +0200, Jiri Pirko wrote:
>> Wed, Sep 27, 2017 at 03:37:33PM CEST, simon.horman@netronome.com wrote:
>> >On Wed, Sep 27, 2017 at 02:56:03PM +0200, Jiri Pirko wrote:
>> >> Wed, Sep 27, 2017 at 02:52:06PM CEST, simon.horman@netronome.com wrote:
>> >> >On Wed, Sep 27, 2017 at 01:08:22PM +0200, Jiri Pirko wrote:
>> >> >> Wed, Sep 27, 2017 at 11:27:33AM CEST, simon.horman@netronome.com wrote:
>> >> >> >On Wed, Sep 27, 2017 at 11:10:05AM +0200, Jiri Pirko wrote:
>> >> >> >> Wed, Sep 27, 2017 at 10:16:34AM CEST, simon.horman@netronome.com wrote:
>> >> >
>> >> >...
>> >> >
>> >> >> >> > enum flow_dissector_key_id {
>> >> >> >> > FLOW_DISSECTOR_KEY_CONTROL, /* struct flow_dissector_key_control */
>> >> >> >> > FLOW_DISSECTOR_KEY_BASIC, /* struct flow_dissector_key_basic */
>> >> >> >> >@@ -205,6 +217,7 @@ enum flow_dissector_key_id {
>> >> >> >> > FLOW_DISSECTOR_KEY_MPLS, /* struct flow_dissector_key_mpls */
>> >> >> >> > FLOW_DISSECTOR_KEY_TCP, /* struct flow_dissector_key_tcp */
>> >> >> >> > FLOW_DISSECTOR_KEY_IP, /* struct flow_dissector_key_ip */
>> >> >> >> >+ FLOW_DISSECTOR_KEY_ENC_OPTS, /* struct flow_dissector_key_enc_opts */
>> >> >> >>
>> >> >> >> I don't see the actual dissection implementation. Where is it?
>> >> >> >> Did you test the patchset?
>> >> >> >
>> >> >> >Yes, I did test it. But it is also possible something went astray along the
>> >> >> >way and I will retest.
>> >> >> >
>> >> >> >I think that the code you are looking for is in
>> >> >> >fl_classify() in this patch.
>> >> >>
>> >> >> The dissection should be done in the flow_dissector. That's the whole
>> >> >> point in having it generic. You should move it there.
>> >> >
>> >> >Coming back to this after lunch, I believe what I have done in this patch
>> >> >is consistent with handling of other enc fields, which are set in
>> >> >fl_classify() rather than the dissector. In particular the ip_tunnel_info,
>> >> >which is used by this patch, is already used in fl_classify().
>> >>
>> >> That means the current code is wrong. The dissection should be done in
>> >> flow_dissector, not in fl_classify.
>> >
>> >Would an better approach be to move the fl_classify() below into, say,
>> >skb_flow_dissect_tunnel_info() and call that from fl_classify().
>>
>> No. There is one flow dissection function and you just set it up in a
>> way you need it. Makes no sense to me to split it up in any way.
>>
>>
>> >
>> >The reason I suggest this rather than moving the code into
>> >__skb_flow_dissect() is that currently flower assumes that tunnel_info
>> >is used if present. While I assume other users of () assume tunnel_info
>> >is not used even if present.
>>
>> __skb_flow_dissect should look at what caller wants, then check skb_tunnel_info
>> only in case it is needed.
>
>Ok, do you think it is sufficient for __skb_flow_dissect to look at the
>tunnel keys, say FLOW_DISSECTOR_KEY_ENC_*? I am a bit concerned this may
>break flower as it look at the tunnel info unconditionally.
yeah. When flower needs that, it will get that from the flow dissector.
I don't see why it would break anything. Again, existing code is wrong:
commit bc3103f1ed405de587fa43d8b0671e615505a700
Author: Amir Vadai <amir@vadai.me>
Date: Thu Sep 8 16:23:47 2016 +0300
net/sched: cls_flower: Classify packet in ip tunnels
The dissection has to be moved to flow dissector.
^ permalink raw reply
* Re: [PATCH v6 05/11] dt-bindings: net: dwmac-sun8i: update documentation about integrated PHY
From: Andrew Lunn @ 2017-09-27 14:02 UTC (permalink / raw)
To: Corentin Labbe
Cc: robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
maxime.ripard-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8, wens-jdAy2FN1RRM,
linux-I+IVW8TIWO2tmTQ+vhA3Yw, catalin.marinas-5wv7dgnIgG8,
will.deacon-5wv7dgnIgG8, peppe.cavallaro-qxv4g6HH51o,
alexandre.torgue-qxv4g6HH51o, f.fainelli-Re5JQEeQqe8AvxtiuMwx3w,
frowand.list-Re5JQEeQqe8AvxtiuMwx3w,
netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-sunxi-/JYPxA39Uh5TLH3MbocFFw
In-Reply-To: <20170927073414.17361-6-clabbe.montjoie-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Hi Corentin
> +Required properties for the mdio-mux node:
> + - compatible = "mdio-mux"
This is too generic. Please add a more specific compatible for this
particular mux. You can keep "mdio-mux", since that is what the MDIO
subsystem will look for.
> +Required properties of the integrated phy node:
> - clocks: a phandle to the reference clock for the EPHY
> - resets: a phandle to the reset control for the EPHY
> +- phy-is-integrated
So the last thing you said is that the mux is not the problem
here. Something else is locking up. Did you discover what?
I really would like phy-is-integrated to go away.
Andrew
^ permalink raw reply
* [PATCH net-next] libbpf: use map_flags when creating maps
From: Craig Gallek @ 2017-09-27 14:04 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, David S . Miller
Cc: Chonggang Li, netdev
From: Craig Gallek <kraig@google.com>
This extends struct bpf_map_def to include a flags field. Note that
this has the potential to break the validation logic in
bpf_object__validate_maps and bpf_object__init_maps as they use
sizeof(struct bpf_map_def) as a minimal allowable size of a map section.
Any bpf program compiled with a smaller struct bpf_map_def will fail this
check.
I don't believe this will be an issue in practice as both compile-time
definitions of struct bpf_map_def (in samples/bpf/bpf_load.h and
tools/testing/selftests/bpf/bpf_helpers.h) have always been larger
than this newly updated version in libbpf.h.
Signed-off-by: Craig Gallek <kraig@google.com>
---
tools/lib/bpf/libbpf.c | 2 +-
tools/lib/bpf/libbpf.h | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 35f6dfcdc565..6bea85f260a3 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -874,7 +874,7 @@ bpf_object__create_maps(struct bpf_object *obj)
def->key_size,
def->value_size,
def->max_entries,
- 0);
+ def->map_flags);
if (*pfd < 0) {
size_t j;
int err = *pfd;
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 7959086eb9c9..6e20003109e0 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -207,6 +207,7 @@ struct bpf_map_def {
unsigned int key_size;
unsigned int value_size;
unsigned int max_entries;
+ unsigned int map_flags;
};
/*
--
2.14.2.822.g60be5d43e6-goog
^ permalink raw reply related
* Re: [PATCH v2 net-next 2/2] net/sched: allow flower to match tunnel options
From: Simon Horman @ 2017-09-27 14:09 UTC (permalink / raw)
To: Jiri Pirko
Cc: David Miller, Jiri Pirko, Jamal Hadi Salim, Cong Wang, netdev,
oss-drivers, amir
In-Reply-To: <20170927140011.GJ1944@nanopsycho.orion>
On Wed, Sep 27, 2017 at 04:00:11PM +0200, Jiri Pirko wrote:
> Wed, Sep 27, 2017 at 03:50:44PM CEST, simon.horman@netronome.com wrote:
> >On Wed, Sep 27, 2017 at 03:47:50PM +0200, Jiri Pirko wrote:
> >> Wed, Sep 27, 2017 at 03:37:33PM CEST, simon.horman@netronome.com wrote:
> >> >On Wed, Sep 27, 2017 at 02:56:03PM +0200, Jiri Pirko wrote:
> >> >> Wed, Sep 27, 2017 at 02:52:06PM CEST, simon.horman@netronome.com wrote:
> >> >> >On Wed, Sep 27, 2017 at 01:08:22PM +0200, Jiri Pirko wrote:
> >> >> >> Wed, Sep 27, 2017 at 11:27:33AM CEST, simon.horman@netronome.com wrote:
> >> >> >> >On Wed, Sep 27, 2017 at 11:10:05AM +0200, Jiri Pirko wrote:
> >> >> >> >> Wed, Sep 27, 2017 at 10:16:34AM CEST, simon.horman@netronome.com wrote:
> >> >> >
> >> >> >...
> >> >> >
> >> >> >> >> > enum flow_dissector_key_id {
> >> >> >> >> > FLOW_DISSECTOR_KEY_CONTROL, /* struct flow_dissector_key_control */
> >> >> >> >> > FLOW_DISSECTOR_KEY_BASIC, /* struct flow_dissector_key_basic */
> >> >> >> >> >@@ -205,6 +217,7 @@ enum flow_dissector_key_id {
> >> >> >> >> > FLOW_DISSECTOR_KEY_MPLS, /* struct flow_dissector_key_mpls */
> >> >> >> >> > FLOW_DISSECTOR_KEY_TCP, /* struct flow_dissector_key_tcp */
> >> >> >> >> > FLOW_DISSECTOR_KEY_IP, /* struct flow_dissector_key_ip */
> >> >> >> >> >+ FLOW_DISSECTOR_KEY_ENC_OPTS, /* struct flow_dissector_key_enc_opts */
> >> >> >> >>
> >> >> >> >> I don't see the actual dissection implementation. Where is it?
> >> >> >> >> Did you test the patchset?
> >> >> >> >
> >> >> >> >Yes, I did test it. But it is also possible something went astray along the
> >> >> >> >way and I will retest.
> >> >> >> >
> >> >> >> >I think that the code you are looking for is in
> >> >> >> >fl_classify() in this patch.
> >> >> >>
> >> >> >> The dissection should be done in the flow_dissector. That's the whole
> >> >> >> point in having it generic. You should move it there.
> >> >> >
> >> >> >Coming back to this after lunch, I believe what I have done in this patch
> >> >> >is consistent with handling of other enc fields, which are set in
> >> >> >fl_classify() rather than the dissector. In particular the ip_tunnel_info,
> >> >> >which is used by this patch, is already used in fl_classify().
> >> >>
> >> >> That means the current code is wrong. The dissection should be done in
> >> >> flow_dissector, not in fl_classify.
> >> >
> >> >Would an better approach be to move the fl_classify() below into, say,
> >> >skb_flow_dissect_tunnel_info() and call that from fl_classify().
> >>
> >> No. There is one flow dissection function and you just set it up in a
> >> way you need it. Makes no sense to me to split it up in any way.
> >>
> >>
> >> >
> >> >The reason I suggest this rather than moving the code into
> >> >__skb_flow_dissect() is that currently flower assumes that tunnel_info
> >> >is used if present. While I assume other users of () assume tunnel_info
> >> >is not used even if present.
> >>
> >> __skb_flow_dissect should look at what caller wants, then check skb_tunnel_info
> >> only in case it is needed.
> >
> >Ok, do you think it is sufficient for __skb_flow_dissect to look at the
> >tunnel keys, say FLOW_DISSECTOR_KEY_ENC_*? I am a bit concerned this may
> >break flower as it look at the tunnel info unconditionally.
>
> yeah. When flower needs that, it will get that from the flow dissector.
> I don't see why it would break anything. Again, existing code is wrong:
I understand that you think the existing code is wrong.
But I also want to try not to add new bugs.
I am concerned about the case where none of FLOW_DISSECTOR_KEY_ENC_* are
set but flower currently dissects the tunnel info anyway. If I make
dissection of tunnel info dependent on FLOW_DISSECTOR_KEY_ENC_*
that may change things.
^ permalink raw reply
* Re: [PATCH v6 11/11] of: mdio: Prevent of_mdiobus_register from scanning mdio-mux nodes
From: Andrew Lunn @ 2017-09-27 14:12 UTC (permalink / raw)
To: Corentin Labbe
Cc: robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
maxime.ripard-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8, wens-jdAy2FN1RRM,
linux-I+IVW8TIWO2tmTQ+vhA3Yw, catalin.marinas-5wv7dgnIgG8,
will.deacon-5wv7dgnIgG8, peppe.cavallaro-qxv4g6HH51o,
alexandre.torgue-qxv4g6HH51o, f.fainelli-Re5JQEeQqe8AvxtiuMwx3w,
frowand.list-Re5JQEeQqe8AvxtiuMwx3w,
netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-sunxi-/JYPxA39Uh5TLH3MbocFFw
In-Reply-To: <20170927073414.17361-12-clabbe.montjoie-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
On Wed, Sep 27, 2017 at 09:34:14AM +0200, Corentin Labbe wrote:
> Each child node of an MDIO node is scanned as a PHY when calling
> of_mdiobus_register() givint the following result:
> [ 18.175379] mdio_bus stmmac-0: /soc/ethernet@1c30000/mdio/mdio-mux has invalid PHY address
> [ 18.175408] mdio_bus stmmac-0: scan phy mdio-mux at address 0
> [ 18.175450] mdio_bus stmmac-0: scan phy mdio-mux at address 1
> [...]
> [ 18.176420] mdio_bus stmmac-0: scan phy mdio-mux at address 30
> [ 18.176452] mdio_bus stmmac-0: scan phy mdio-mux at address 31
>
> Since mdio-mux nodes are not PHY, this patch a way to to not scan
> them.
Hi Corentin
I still don't like this, but ...
>
> Signed-off-by: Corentin Labbe <clabbe.montjoie-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
> drivers/of/of_mdio.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c
> index d94dd8b77abd..d90ddb0d90f2 100644
> --- a/drivers/of/of_mdio.c
> +++ b/drivers/of/of_mdio.c
> @@ -190,6 +190,10 @@ int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np)
> struct device_node *child;
> bool scanphys = false;
> int addr, rc;
> + static const struct of_device_id do_not_scan[] = {
> + { .compatible = "mdio-mux" },
> + {}
> + };
Please rename this to some less generic. What i don't want is other
compatible strings added here. We want to make the exception for
muxes, but nothing else. So something like compatible_muxes?
Andrew
^ permalink raw reply
* Re: WARNING: kernel stack frame pointer at ffff880156a5fea0 in bash:2103 has bad value 00007ffec7d87e50
From: Josh Poimboeuf @ 2017-09-27 14:14 UTC (permalink / raw)
To: Richard Weinberger
Cc: Alexei Starovoitov, ast, daniel, netdev, linux-kernel, mingo
In-Reply-To: <41856500.WAzQIuoSLi@blindfold>
On Wed, Sep 27, 2017 at 08:51:22AM +0200, Richard Weinberger wrote:
> Am Mittwoch, 27. September 2017, 00:42:46 CEST schrieb Josh Poimboeuf:
> > > Here is another variant of the warning, it matches the attached .config:
> > I can take a look at it. Unfortunately, for these types of issues I
> > often need the vmlinux file to be able to make sense of the unwinder
> > dump. So if you happen to have somewhere to copy the vmlinux to, that
> > would be helpful. Or if you give me your GCC version I can try to
> > rebuild it locally.
>
> There you go:
> http://git.infradead.org/~rw/bpf_splat/vmlinux.xz
Thanks. Can you test this fix?
diff --git a/arch/x86/kernel/kprobes/common.h b/arch/x86/kernel/kprobes/common.h
index db2182d63ed0..3fc0f9a794cb 100644
--- a/arch/x86/kernel/kprobes/common.h
+++ b/arch/x86/kernel/kprobes/common.h
@@ -3,6 +3,15 @@
/* Kprobes and Optprobes common header */
+#include <asm/asm.h>
+
+#ifdef CONFIG_FRAME_POINTER
+# define SAVE_RBP_STRING " push %" _ASM_BP "\n" \
+ " mov %" _ASM_SP ", %" _ASM_BP "\n"
+#else
+# define SAVE_RBP_STRING " push %" _ASM_BP "\n"
+#endif
+
#ifdef CONFIG_X86_64
#define SAVE_REGS_STRING \
/* Skip cs, ip, orig_ax. */ \
@@ -17,7 +26,7 @@
" pushq %r10\n" \
" pushq %r11\n" \
" pushq %rbx\n" \
- " pushq %rbp\n" \
+ SAVE_RBP_STRING \
" pushq %r12\n" \
" pushq %r13\n" \
" pushq %r14\n" \
@@ -48,7 +57,7 @@
" pushl %es\n" \
" pushl %ds\n" \
" pushl %eax\n" \
- " pushl %ebp\n" \
+ SAVE_RBP_STRING \
" pushl %edi\n" \
" pushl %esi\n" \
" pushl %edx\n" \
^ permalink raw reply related
* Re: [PATCH v2 net-next 2/2] net/sched: allow flower to match tunnel options
From: Jiri Pirko @ 2017-09-27 14:19 UTC (permalink / raw)
To: Simon Horman
Cc: David Miller, Jiri Pirko, Jamal Hadi Salim, Cong Wang, netdev,
oss-drivers, amir
In-Reply-To: <20170927140952.GC14183@vergenet.net>
Wed, Sep 27, 2017 at 04:09:54PM CEST, simon.horman@netronome.com wrote:
>On Wed, Sep 27, 2017 at 04:00:11PM +0200, Jiri Pirko wrote:
>> Wed, Sep 27, 2017 at 03:50:44PM CEST, simon.horman@netronome.com wrote:
>> >On Wed, Sep 27, 2017 at 03:47:50PM +0200, Jiri Pirko wrote:
>> >> Wed, Sep 27, 2017 at 03:37:33PM CEST, simon.horman@netronome.com wrote:
>> >> >On Wed, Sep 27, 2017 at 02:56:03PM +0200, Jiri Pirko wrote:
>> >> >> Wed, Sep 27, 2017 at 02:52:06PM CEST, simon.horman@netronome.com wrote:
>> >> >> >On Wed, Sep 27, 2017 at 01:08:22PM +0200, Jiri Pirko wrote:
>> >> >> >> Wed, Sep 27, 2017 at 11:27:33AM CEST, simon.horman@netronome.com wrote:
>> >> >> >> >On Wed, Sep 27, 2017 at 11:10:05AM +0200, Jiri Pirko wrote:
>> >> >> >> >> Wed, Sep 27, 2017 at 10:16:34AM CEST, simon.horman@netronome.com wrote:
>> >> >> >
>> >> >> >...
>> >> >> >
>> >> >> >> >> > enum flow_dissector_key_id {
>> >> >> >> >> > FLOW_DISSECTOR_KEY_CONTROL, /* struct flow_dissector_key_control */
>> >> >> >> >> > FLOW_DISSECTOR_KEY_BASIC, /* struct flow_dissector_key_basic */
>> >> >> >> >> >@@ -205,6 +217,7 @@ enum flow_dissector_key_id {
>> >> >> >> >> > FLOW_DISSECTOR_KEY_MPLS, /* struct flow_dissector_key_mpls */
>> >> >> >> >> > FLOW_DISSECTOR_KEY_TCP, /* struct flow_dissector_key_tcp */
>> >> >> >> >> > FLOW_DISSECTOR_KEY_IP, /* struct flow_dissector_key_ip */
>> >> >> >> >> >+ FLOW_DISSECTOR_KEY_ENC_OPTS, /* struct flow_dissector_key_enc_opts */
>> >> >> >> >>
>> >> >> >> >> I don't see the actual dissection implementation. Where is it?
>> >> >> >> >> Did you test the patchset?
>> >> >> >> >
>> >> >> >> >Yes, I did test it. But it is also possible something went astray along the
>> >> >> >> >way and I will retest.
>> >> >> >> >
>> >> >> >> >I think that the code you are looking for is in
>> >> >> >> >fl_classify() in this patch.
>> >> >> >>
>> >> >> >> The dissection should be done in the flow_dissector. That's the whole
>> >> >> >> point in having it generic. You should move it there.
>> >> >> >
>> >> >> >Coming back to this after lunch, I believe what I have done in this patch
>> >> >> >is consistent with handling of other enc fields, which are set in
>> >> >> >fl_classify() rather than the dissector. In particular the ip_tunnel_info,
>> >> >> >which is used by this patch, is already used in fl_classify().
>> >> >>
>> >> >> That means the current code is wrong. The dissection should be done in
>> >> >> flow_dissector, not in fl_classify.
>> >> >
>> >> >Would an better approach be to move the fl_classify() below into, say,
>> >> >skb_flow_dissect_tunnel_info() and call that from fl_classify().
>> >>
>> >> No. There is one flow dissection function and you just set it up in a
>> >> way you need it. Makes no sense to me to split it up in any way.
>> >>
>> >>
>> >> >
>> >> >The reason I suggest this rather than moving the code into
>> >> >__skb_flow_dissect() is that currently flower assumes that tunnel_info
>> >> >is used if present. While I assume other users of () assume tunnel_info
>> >> >is not used even if present.
>> >>
>> >> __skb_flow_dissect should look at what caller wants, then check skb_tunnel_info
>> >> only in case it is needed.
>> >
>> >Ok, do you think it is sufficient for __skb_flow_dissect to look at the
>> >tunnel keys, say FLOW_DISSECTOR_KEY_ENC_*? I am a bit concerned this may
>> >break flower as it look at the tunnel info unconditionally.
>>
>> yeah. When flower needs that, it will get that from the flow dissector.
>> I don't see why it would break anything. Again, existing code is wrong:
>
>I understand that you think the existing code is wrong.
>But I also want to try not to add new bugs.
>
>I am concerned about the case where none of FLOW_DISSECTOR_KEY_ENC_* are
>set but flower currently dissects the tunnel info anyway. If I make
>dissection of tunnel info dependent on FLOW_DISSECTOR_KEY_ENC_*
>that may change things.
If none of FLOW_DISSECTOR_KEY_ENC_* are set, flower does not care about
the fields and therefore they are masked out by fl_set_masked_key.
Otherwise it would be a bug is flower would match on something user did
not specify.
^ permalink raw reply
* Re: [PATCH v2] tun: bail out from tun_get_user() if the skb is empty
From: Alexander Potapenko @ 2017-09-27 14:31 UTC (permalink / raw)
To: Eric Dumazet
Cc: Eric Dumazet, David Miller, Dmitriy Vyukov, syzkaller, Networking,
LKML
In-Reply-To: <CANn89iKL1GzT+9Mnn2OjvPCgmvV4GMhwn5mQzi7yW8_B1wziYA@mail.gmail.com>
On Wed, Sep 27, 2017 at 3:26 PM, 'Eric Dumazet' via syzkaller
<syzkaller@googlegroups.com> wrote:
> On Wed, Sep 27, 2017 at 5:58 AM, Alexander Potapenko <glider@google.com> wrote:
>> On Wed, Sep 27, 2017 at 2:45 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>> On Wed, 2017-09-27 at 05:42 -0700, Eric Dumazet wrote:
>>>
>>>> Or something cleaner to avoid copy/paste and focus on proper
>>>> skb->data[0] access and meaning.
>> By the way I'm wondering if this is the only place where skb->data is
>> being accessed.
>> Isn't eth_type_trans() under IFF_TAP also touching it? Then we need to
>> check the size earlier.
>
> It is already checked.
Indeed, thanks.
> --
> You received this message because you are subscribed to the Google Groups "syzkaller" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
--
Alexander Potapenko
Software Engineer
Google Germany GmbH
Erika-Mann-Straße, 33
80636 München
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
^ permalink raw reply
* Re: [patch net-next v3 06/12] net: mroute: Check if rule is a default rule
From: Nikolay Aleksandrov @ 2017-09-27 14:38 UTC (permalink / raw)
To: Jiri Pirko, netdev; +Cc: davem, yotamg, idosch, mlxsw, andrew, linyunsheng
In-Reply-To: <20170927062322.5476-7-jiri@resnulli.us>
On 27/09/17 09:23, Jiri Pirko wrote:
> From: Yotam Gigi <yotamg@mellanox.com>
>
> When the ipmr starts, it adds one default FIB rule that matches all packets
> and sends them to the DEFAULT (multicast) FIB table. A more complex rule
> can be added by user to specify that for a specific interface, a packet
> should be look up at either an arbitrary table or according to the l3mdev
> of the interface.
>
> For drivers willing to offload the ipmr logic into a hardware but don't
> want to offload all the FIB rules functionality, provide a function that
> can indicate whether the FIB rule is the default multicast rule, thus only
> one routing table is needed.
>
> This way, a driver can register to the FIB notification chain, get
> notifications about FIB rules added and trigger some kind of an internal
> abort mechanism when a non default rule is added by the user.
>
> Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
> Reviewed-by: Ido Schimmel <idosch@mellanox.com>
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
> ---
> v2->v3:
> - Use the already existing ifdef CONFIG_IP_MROUTE_MULTIPLE_TABLES
> v1->v2:
> - Update the lastuse MFC entry field too, in addition to packets an bytes.
> ---
> include/linux/mroute.h | 7 +++++++
> net/ipv4/ipmr.c | 12 ++++++++++++
> 2 files changed, 19 insertions(+)
>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
^ permalink raw reply
* Re: [PATCH net-next 2/6] bpf: add meta pointer for direct access
From: Jesper Dangaard Brouer @ 2017-09-27 14:54 UTC (permalink / raw)
To: John Fastabend
Cc: Daniel Borkmann, davem, alexei.starovoitov, peter.waskiewicz.jr,
jakub.kicinski, netdev, Andy Gospodarek, brouer
In-Reply-To: <645e7a39-c172-5882-5dd9-f038430114d1@gmail.com>
On Wed, 27 Sep 2017 06:35:40 -0700
John Fastabend <john.fastabend@gmail.com> wrote:
> On 09/27/2017 02:26 AM, Jesper Dangaard Brouer wrote:
> > On Tue, 26 Sep 2017 21:58:53 +0200
> > Daniel Borkmann <daniel@iogearbox.net> wrote:
> >
> >> On 09/26/2017 09:13 PM, Jesper Dangaard Brouer wrote:
> >> [...]
> >>> I'm currently implementing a cpumap type, that transfers raw XDP frames
> >>> to another CPU, and the SKB is allocated on the remote CPU. (It
> >>> actually works extremely well).
> >>
> >> Meaning you let all the XDP_PASS packets get processed on a
> >> different CPU, so you can reserve the whole CPU just for
> >> prefiltering, right?
> >
> > Yes, exactly. Except I use the XDP_REDIRECT action to steer packets.
> > The trick is using the map-flush point, to transfer packets in bulk to
> > the remote CPU (single call IPC is too slow), but at the same time
> > flush single packets if NAPI didn't see a bulk.
> >
> >> Do you have some numbers to share at this point, just curious when
> >> you mention it works extremely well.
> >
> > Sure... I've done a lot of benchmarking on this patchset ;-)
> > I have a benchmark program called xdp_redirect_cpu [1][2], that collect
> > stats via tracepoints (atm I'm limiting bulking 8 packets, and have
> > tracepoints at bulk spots, to amortize tracepoint cost 25ns/8=3.125ns)
> >
> > [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/samples/bpf/xdp_redirect_cpu_kern.c
> > [2] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/samples/bpf/xdp_redirect_cpu_user.c
> >
> > Here I'm installing a DDoS program that drops UDP port 9 (pktgen
> > packets) on RX CPU=0. I'm forcing my netperf to hit the same CPU, that
> > the 11.9Mpps DDoS attack is hitting.
> >
> > Running XDP/eBPF prog_num:4
> > XDP-cpumap CPU:to pps drop-pps extra-info
> > XDP-RX 0 12,030,471 11,966,982 0
> > XDP-RX total 12,030,471 11,966,982
> > cpumap-enqueue 0:2 63,488 0 0
> > cpumap-enqueue sum:2 63,488 0 0
> > cpumap_kthread 2 63,488 0 3 time_exceed
> > cpumap_kthread total 63,488 0 0
> > redirect_err total 0 0
> >
> > $ netperf -H 172.16.0.2 -t TCP_CRR -l 10 -D1 -T5,5 -- -r 1024,1024
> > Local /Remote
> > Socket Size Request Resp. Elapsed Trans.
> > Send Recv Size Size Time Rate
> > bytes Bytes bytes bytes secs. per sec
> >
> > 16384 87380 1024 1024 10.00 12735.97
> > 16384 87380
> >
> > The netperf TCP_CRR performance is the same, without XDP loaded.
> >
>
> Just curious could you also try this with RPS enabled (or does this have
> RPS enabled). RPS should effectively do the same thing but higher in the
> stack. I'm curious what the delta would be. Might be another interesting
> case and fairly easy to setup if you already have the above scripts.
Yes, I'm essentially competing with RSP, thus such a comparison is very
relevant...
This is only a 6 CPUs system. Allocate 2 CPUs to RPS receive and let
other 4 CPUS process packet.
Summary of RPS (Receive Packet Steering) performance:
* End result is 6.3 Mpps max performance
* netperf TCP_CRR is 1 trans/sec.
* Each RX-RPS CPU stall at ~3.2Mpps.
The full test report below with setup:
The mask needed::
perl -e 'printf "%b\n",0x3C'
111100
RPS setup::
sudo sh -c 'echo 32768 > /proc/sys/net/core/rps_sock_flow_entries'
for N in $(seq 0 5) ; do \
sudo sh -c "echo 8192 > /sys/class/net/ixgbe1/queues/rx-$N/rps_flow_cnt" ; \
sudo sh -c "echo 3c > /sys/class/net/ixgbe1/queues/rx-$N/rps_cpus" ; \
grep -H . /sys/class/net/ixgbe1/queues/rx-$N/rps_cpus ; \
done
Reduce RX queues to two ::
ethtool -L ixgbe1 combined 2
IRQ align to CPU numbers::
$ ~/setup01.sh
Not root, running with sudo
--- Disable Ethernet flow-control ---
rx unmodified, ignoring
tx unmodified, ignoring
no pause parameters changed, aborting
rx unmodified, ignoring
tx unmodified, ignoring
no pause parameters changed, aborting
--- Align IRQs ---
/proc/irq/54/ixgbe1-TxRx-0/../smp_affinity_list:0
/proc/irq/55/ixgbe1-TxRx-1/../smp_affinity_list:1
/proc/irq/56/ixgbe1/../smp_affinity_list:0-5
$ grep -H . /sys/class/net/ixgbe1/queues/rx-*/rps_cpus
/sys/class/net/ixgbe1/queues/rx-0/rps_cpus:3c
/sys/class/net/ixgbe1/queues/rx-1/rps_cpus:3c
Generator is sending: 12,715,782 tx_packets /sec
./pktgen_sample04_many_flows.sh -vi ixgbe2 -m 00:1b:21:bb:9a:84 \
-d 172.16.0.2 -t8
$ nstat > /dev/null && sleep 1 && nstat
#kernel
IpInReceives 6346544 0.0
IpInDelivers 6346544 0.0
IpOutRequests 1020 0.0
IcmpOutMsgs 1020 0.0
IcmpOutDestUnreachs 1020 0.0
IcmpMsgOutType3 1020 0.0
UdpNoPorts 6346898 0.0
IpExtInOctets 291964714 0.0
IpExtOutOctets 73440 0.0
IpExtInNoECTPkts 6347063 0.0
$ mpstat -P ALL -u -I SCPU -I SUM
Average: CPU %usr %nice %sys %irq %soft %idle
Average: all 0.00 0.00 0.00 0.42 72.97 26.61
Average: 0 0.00 0.00 0.00 0.17 99.83 0.00
Average: 1 0.00 0.00 0.00 0.17 99.83 0.00
Average: 2 0.00 0.00 0.00 0.67 60.37 38.96
Average: 3 0.00 0.00 0.00 0.67 58.70 40.64
Average: 4 0.00 0.00 0.00 0.67 59.53 39.80
Average: 5 0.00 0.00 0.00 0.67 58.93 40.40
Average: CPU intr/s
Average: all 152067.22
Average: 0 50064.73
Average: 1 50089.35
Average: 2 45095.17
Average: 3 44875.04
Average: 4 44906.32
Average: 5 45152.08
Average: CPU TIMER/s NET_TX/s NET_RX/s TASKLET/s SCHED/s RCU/s
Average: 0 609.48 0.17 49431.28 0.00 2.66 21.13
Average: 1 567.55 0.00 49498.00 0.00 2.66 21.13
Average: 2 998.34 0.00 43941.60 4.16 82.86 68.22
Average: 3 540.60 0.17 44140.27 0.00 85.52 108.49
Average: 4 537.27 0.00 44219.63 0.00 84.53 64.89
Average: 5 530.78 0.17 44445.59 0.00 85.02 90.52
>From mpstat it looks like it is the RX-RPS CPUs that are the bottleneck.
Show adapter(s) (ixgbe1) statistics (ONLY that changed!)
Ethtool(ixgbe1) stat: 11109531 ( 11,109,531) <= fdir_miss /sec
Ethtool(ixgbe1) stat: 380632356 ( 380,632,356) <= rx_bytes /sec
Ethtool(ixgbe1) stat: 812792611 ( 812,792,611) <= rx_bytes_nic /sec
Ethtool(ixgbe1) stat: 1753550 ( 1,753,550) <= rx_missed_errors /sec
Ethtool(ixgbe1) stat: 4602487 ( 4,602,487) <= rx_no_dma_resources /sec
Ethtool(ixgbe1) stat: 6343873 ( 6,343,873) <= rx_packets /sec
Ethtool(ixgbe1) stat: 10946441 ( 10,946,441) <= rx_pkts_nic /sec
Ethtool(ixgbe1) stat: 190287853 ( 190,287,853) <= rx_queue_0_bytes /sec
Ethtool(ixgbe1) stat: 3171464 ( 3,171,464) <= rx_queue_0_packets /sec
Ethtool(ixgbe1) stat: 190344503 ( 190,344,503) <= rx_queue_1_bytes /sec
Ethtool(ixgbe1) stat: 3172408 ( 3,172,408) <= rx_queue_1_packets /sec
Notice, each RX-CPU can only process 3.1Mpps.
RPS RX-CPU(0):
# Overhead CPU Symbol
# ........ ... .......................................
#
11.72% 000 [k] ixgbe_poll
11.29% 000 [k] _raw_spin_lock
10.35% 000 [k] dev_gro_receive
8.36% 000 [k] __build_skb
7.35% 000 [k] __skb_get_hash
6.22% 000 [k] enqueue_to_backlog
5.89% 000 [k] __skb_flow_dissect
4.43% 000 [k] inet_gro_receive
4.19% 000 [k] ___slab_alloc
3.90% 000 [k] queued_spin_lock_slowpath
3.85% 000 [k] kmem_cache_alloc
3.06% 000 [k] build_skb
2.66% 000 [k] get_rps_cpu
2.57% 000 [k] napi_gro_receive
2.34% 000 [k] eth_type_trans
1.81% 000 [k] __cmpxchg_double_slab.isra.61
1.47% 000 [k] ixgbe_alloc_rx_buffers
1.43% 000 [k] get_partial_node.isra.81
0.84% 000 [k] swiotlb_sync_single
0.74% 000 [k] udp4_gro_receive
0.73% 000 [k] netif_receive_skb_internal
0.72% 000 [k] udp_gro_receive
0.63% 000 [k] skb_gro_reset_offset
0.49% 000 [k] __skb_flow_get_ports
0.48% 000 [k] llist_add_batch
0.36% 000 [k] swiotlb_sync_single_for_cpu
0.34% 000 [k] __slab_alloc
Remote RPS-CPU(3) getting packets::
# Overhead CPU Symbol
# ........ ... ..............................................
#
33.02% 003 [k] poll_idle
10.99% 003 [k] __netif_receive_skb_core
10.45% 003 [k] page_frag_free
8.49% 003 [k] ip_rcv
4.19% 003 [k] fib_table_lookup
2.84% 003 [k] __udp4_lib_rcv
2.81% 003 [k] __slab_free
2.23% 003 [k] __udp4_lib_lookup
2.09% 003 [k] ip_route_input_rcu
2.07% 003 [k] kmem_cache_free
2.06% 003 [k] udp_v4_early_demux
1.73% 003 [k] ip_rcv_finish
1.44% 003 [k] process_backlog
1.32% 003 [k] icmp_send
1.30% 003 [k] cmpxchg_double_slab.isra.73
0.95% 003 [k] intel_idle
0.88% 003 [k] _raw_spin_lock
0.84% 003 [k] fib_validate_source
0.79% 003 [k] ip_local_deliver_finish
0.67% 003 [k] ip_local_deliver
0.56% 003 [k] skb_release_data
0.53% 003 [k] unfreeze_partials.isra.80
0.51% 003 [k] skb_release_head_state
0.44% 003 [k] kfree_skb
0.44% 003 [k] queued_spin_lock_slowpath
0.44% 003 [k] __cmpxchg_double_slab.isra.61
$ netperf -H 172.16.0.2 -t TCP_CRR -l 10 -T5,5 -- -r 1024,1024
MIGRATED TCP Connect/Request/Response TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.16.0.2 () port 0 AF_INET : histogram : demo : cpu bind
Local /Remote
Socket Size Request Resp. Elapsed Trans.
Send Recv Size Size Time Rate
bytes Bytes bytes bytes secs. per sec
16384 87380 1024 1024 10.00 1.10
16384 87380
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
^ permalink raw reply
* Re: tc H/W offload issue with vxlan tunnels [was: nfp: flower vxlan tunnel offload]
From: Jiri Benc @ 2017-09-27 15:27 UTC (permalink / raw)
To: Jiri Pirko
Cc: Paolo Abeni, Or Gerlitz, Simon Horman, David Miller,
Jakub Kicinski, Linux Netdev List, oss-drivers, John Hurley,
Paul Blakey, Jiri Pirko, Roi Dayan
In-Reply-To: <20170927125509.GG1944@nanopsycho.orion>
On Wed, 27 Sep 2017 14:55:09 +0200, Jiri Pirko wrote:
> So where do you attach the tc filter instead of eth0? vxlan0?
Yes, vxlan0. I'm pasting the example from earlier in this thread again:
This will match:
ip link add vxlan0 type vxlan dstport 4789 dev eth0 external
ip link set dev vxlan0 up
tc qdisc add dev vxlan0 ingress
ethtool -K eth0 hw-tc-offload on
tc filter add dev vxlan0 protocol ip parent ffff: flower enc_key_id 102 \
enc_dst_port 4789 src_ip 3.4.5.6 skip_sw action [...]
while this must NOT match:
ip link add vxlan0 type vxlan dstport 4789 dev eth0 external
ip link set dev vxlan0 up
tc qdisc add dev eth0 ingress
ethtool -K eth0 hw-tc-offload on
tc filter add dev eth0 protocol ip parent ffff: flower enc_key_id 102 \
enc_dst_port 4789 src_ip 3.4.5.6 skip_sw action [...]
Jiri
^ permalink raw reply
* Re: tc H/W offload issue with vxlan tunnels [was: nfp: flower vxlan tunnel offload]
From: Jiri Pirko @ 2017-09-27 15:35 UTC (permalink / raw)
To: Jiri Benc
Cc: Paolo Abeni, Or Gerlitz, Simon Horman, David Miller,
Jakub Kicinski, Linux Netdev List, oss-drivers, John Hurley,
Paul Blakey, Jiri Pirko, Roi Dayan
In-Reply-To: <20170927172729.5b7a8c44@griffin>
Wed, Sep 27, 2017 at 05:27:29PM CEST, jbenc@redhat.com wrote:
>On Wed, 27 Sep 2017 14:55:09 +0200, Jiri Pirko wrote:
>> So where do you attach the tc filter instead of eth0? vxlan0?
>
>Yes, vxlan0. I'm pasting the example from earlier in this thread again:
>
>This will match:
>
>ip link add vxlan0 type vxlan dstport 4789 dev eth0 external
>ip link set dev vxlan0 up
>tc qdisc add dev vxlan0 ingress
>ethtool -K eth0 hw-tc-offload on
>tc filter add dev vxlan0 protocol ip parent ffff: flower enc_key_id 102 \
> enc_dst_port 4789 src_ip 3.4.5.6 skip_sw action [...]
>
>while this must NOT match:
>
>ip link add vxlan0 type vxlan dstport 4789 dev eth0 external
>ip link set dev vxlan0 up
>tc qdisc add dev eth0 ingress
>ethtool -K eth0 hw-tc-offload on
>tc filter add dev eth0 protocol ip parent ffff: flower enc_key_id 102 \
> enc_dst_port 4789 src_ip 3.4.5.6 skip_sw action [...]
Right. Driver of eth0 should get ndo_setup_tc calls for vxlan0. Similar
thing will be needed for bonding/team. I'm taking this into
considaration for my sharedblock patchset. Work in progress:
https://github.com/jpirko/linux_mlxsw/commits/jiri_devel_shblock
Basically the eth0 driver will register a callback function that would
be called whenever filter is added/deleted on vxlan0
^ permalink raw reply
* Re: [PATCH net-next 1/6] net: dsa: Fix SWITCHDEV_ATTR_ID_PORT_PARENT_ID
From: Vivien Didelot @ 2017-09-27 15:31 UTC (permalink / raw)
To: Andrew Lunn, David Miller; +Cc: netdev, Andrew Lunn
In-Reply-To: <1506464764-12699-2-git-send-email-andrew@lunn.ch>
Hi Andrew,
Andrew Lunn <andrew@lunn.ch> writes:
> @@ -354,13 +354,16 @@ static int dsa_slave_port_attr_get(struct net_device *dev,
> struct switchdev_attr *attr)
> {
> struct dsa_slave_priv *p = netdev_priv(dev);
> - struct dsa_switch *ds = p->dp->ds;
>
> switch (attr->id) {
> - case SWITCHDEV_ATTR_ID_PORT_PARENT_ID:
> - attr->u.ppid.id_len = sizeof(ds->index);
> - memcpy(&attr->u.ppid.id, &ds->index, attr->u.ppid.id_len);
> + case SWITCHDEV_ATTR_ID_PORT_PARENT_ID: {
> + struct dsa_switch *ds = p->dp->ds;
> + struct dsa_switch_tree *dst = ds->dst;
> +
> + attr->u.ppid.id_len = sizeof(dst->tree);
> + memcpy(&attr->u.ppid.id, &dst->tree, sizeof(dst->tree));
> break;
> + }
> case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS_SUPPORT:
> attr->u.brport_flags_support = 0;
> break;
I am pretty sure the kernel folks won't like blocks within case
statments. Simply declare dst = p->dp->ds->dst at the beginning.
Also keeping attr->u.ppid.id_len as memcpy's third argument like other
switchdev users do would be prefered.
Otherwise using the tree index is indeed the way to go:
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Thanks,
Vivien
^ permalink raw reply
* Re: [PATCH v3 1/1] ip_tunnel: add mpls over gre encapsulation
From: Roopa Prabhu @ 2017-09-27 15:36 UTC (permalink / raw)
To: Amine Kherbouche; +Cc: netdev@vger.kernel.org, xeb, David Lamparter
In-Reply-To: <198fd7591bc1daae4727ee8b950e116b59f2d4c3.1506504229.git.amine.kherbouche@6wind.com>
On Wed, Sep 27, 2017 at 2:37 AM, Amine Kherbouche
<amine.kherbouche@6wind.com> wrote:
> This commit introduces the MPLSoGRE support (RFC 4023), using ip tunnel
> API.
>
> Encap:
> - Add a new iptunnel type mpls.
> - Share tx path: gre type mpls loaded from skb->protocol.
>
> Decap:
> - pull gre hdr and call mpls_forward().
>
> Signed-off-by: Amine Kherbouche <amine.kherbouche@6wind.com>
> ---
> include/linux/mpls.h | 2 ++
> include/uapi/linux/if_tunnel.h | 1 +
> net/ipv4/ip_gre.c | 11 +++++++++
> net/ipv6/ip6_gre.c | 11 +++++++++
> net/mpls/af_mpls.c | 52 ++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 77 insertions(+)
>
> diff --git a/include/linux/mpls.h b/include/linux/mpls.h
> index 384fb22..57203c1 100644
> --- a/include/linux/mpls.h
> +++ b/include/linux/mpls.h
> @@ -8,4 +8,6 @@
> #define MPLS_TC_MASK (MPLS_LS_TC_MASK >> MPLS_LS_TC_SHIFT)
> #define MPLS_LABEL_MASK (MPLS_LS_LABEL_MASK >> MPLS_LS_LABEL_SHIFT)
>
> +int mpls_gre_rcv(struct sk_buff *skb, int gre_hdr_len);
> +
> #endif /* _LINUX_MPLS_H */
> diff --git a/include/uapi/linux/if_tunnel.h b/include/uapi/linux/if_tunnel.h
> index 2e52088..a2f48c0 100644
> --- a/include/uapi/linux/if_tunnel.h
> +++ b/include/uapi/linux/if_tunnel.h
> @@ -84,6 +84,7 @@ enum tunnel_encap_types {
> TUNNEL_ENCAP_NONE,
> TUNNEL_ENCAP_FOU,
> TUNNEL_ENCAP_GUE,
> + TUNNEL_ENCAP_MPLS,
> };
>
> #define TUNNEL_ENCAP_FLAG_CSUM (1<<0)
> diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
> index 9cee986..0a898f4 100644
> --- a/net/ipv4/ip_gre.c
> +++ b/net/ipv4/ip_gre.c
> @@ -32,6 +32,9 @@
> #include <linux/netfilter_ipv4.h>
> #include <linux/etherdevice.h>
> #include <linux/if_ether.h>
> +#if IS_ENABLED(CONFIG_MPLS)
> +#include <linux/mpls.h>
> +#endif
>
> #include <net/sock.h>
> #include <net/ip.h>
> @@ -412,6 +415,14 @@ static int gre_rcv(struct sk_buff *skb)
> return 0;
> }
>
> + if (unlikely(tpi.proto == htons(ETH_P_MPLS_UC))) {
> +#if IS_ENABLED(CONFIG_MPLS)
> + return mpls_gre_rcv(skb, hdr_len);
> +#else
> + goto drop;
> +#endif
> + }
> +
Amine, one small nit here.., if you define mpls_gre_rcv in gre header
(like you had initially), you could do the below...
#if IS_ENABLED(CONFIG_MPLS)
mpls_gre_rcv()
{
/* real func */
}
#else
mpls_gre_rcv()
{
kfree_skb(skb)
return NET_RX_DROP
}
#endif
and the check in gre_rcv() reduces to
if (unlikely(tpi.proto == htons(ETH_P_MPLS_UC)))
return mpls_gre_rcv(skb, hdr_len);
Which looks much cleaner.
Other than that, looks great. pls add my Acked-by: Roopa Prabhu
<roopa@cumulusnetworks.com> to your next version.
thanks!
^ permalink raw reply
* Re: [PATCH v2 net-next 0/2] net/sched: support tunnel options in cls_flower and act_tunnel_key
From: Jiri Benc @ 2017-09-27 15:46 UTC (permalink / raw)
To: Simon Horman
Cc: David Miller, Jiri Pirko, Jamal Hadi Salim, Cong Wang, netdev,
oss-drivers
In-Reply-To: <1506500194-17637-1-git-send-email-simon.horman@netronome.com>
On Wed, 27 Sep 2017 10:16:32 +0200, Simon Horman wrote:
> * Geneve
>
> In the case of Geneve options are TLVs[1]. My reading is that in collect
> metadata mode the kernel does not appear to do anything other than pass
> them around as a bytestring.
>
> [1] https://tools.ietf.org/html/draft-ietf-nvo3-geneve-05#section-3.5
[...]
>
> Neither of the above appear to assume any structure for the data.
But that's not true. Geneve uses TLVs, you even mentioned that
yourself. Matching on a block of TLVs as a bytestring doesn't make
sense. The TLV fields may be in any order.
We need better matching here. Bytestring is useless for Geneve.
NACK for this direction of option matching. We'd need to introduce
matching on TLVs sooner or later anyway and this would be just a never
used compat cruft that we need to keep around forever.
Jiri
^ permalink raw reply
* Re: [PATCH net-next 4/6] net: dsa: mv88e6xxx: Print offending port when vlan check fails
From: Vivien Didelot @ 2017-09-27 15:49 UTC (permalink / raw)
To: Andrew Lunn, David Miller; +Cc: netdev, Andrew Lunn
In-Reply-To: <1506464764-12699-5-git-send-email-andrew@lunn.ch>
Andrew Lunn <andrew@lunn.ch> writes:
> When testing if a VLAN is one more than one bridge, we print an error
> message that the VLAN is already in use somewhere else. Print both the
> new port which would like the VLAN, and the port which already has it,
> to aid debugging.
>
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
^ permalink raw reply
* Re: [PATCH net-next 5/6] net: dsa: mv88e6xxx: Move mv88e6xxx_port_db_load_purge()
From: Vivien Didelot @ 2017-09-27 15:51 UTC (permalink / raw)
To: Andrew Lunn, David Miller; +Cc: netdev, Andrew Lunn
In-Reply-To: <1506464764-12699-6-git-send-email-andrew@lunn.ch>
Andrew Lunn <andrew@lunn.ch> writes:
> This function is going to be needed by a soon to be added new
> function. Move it earlier so we can avoid a forward declaration.
> No code changes.
s/code/functional/
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
^ permalink raw reply
* Re: [iproute PATCH v2 0/3] Check user supplied interface name lengths
From: Phil Sutter @ 2017-09-27 16:05 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20170927084249.0591ee3a@shemminger-XPS-13-9360>
On Wed, Sep 27, 2017 at 08:42:49AM +0100, Stephen Hemminger wrote:
> On Tue, 26 Sep 2017 18:35:45 +0200
> Phil Sutter <phil@nwl.cc> wrote:
>
> > This series adds explicit checks for user-supplied interface names to
> > make sure their length fits Linux's requirements.
> >
> > The first two patches simplify interface name parsing in some places -
> > these are side-effects of working on the actual implementation provided
> > in patch three.
> >
> > Changes since v1:
> > - Patches 1 and 2 introduced.
> > - Changes to patch 3 are listed in there.
> >
> > Phil Sutter (3):
> > ip{6,}tunnel: Avoid copying user-supplied interface name around
> > tc: flower: No need to cache indev arg
> > Check user supplied interface name lengths
> >
> > include/utils.h | 1 +
> > ip/ip6tunnel.c | 9 +++++----
> > ip/ipl2tp.c | 3 ++-
> > ip/iplink.c | 27 ++++++++-------------------
> > ip/ipmaddr.c | 1 +
> > ip/iprule.c | 4 ++++
> > ip/iptunnel.c | 27 +++++++++++++--------------
> > ip/iptuntap.c | 4 +++-
> > lib/utils.c | 10 ++++++++++
> > misc/arpd.c | 1 +
> > tc/f_flower.c | 6 ++----
> > 11 files changed, 50 insertions(+), 43 deletions(-)
> >
>
> I like the idea, and checking arguments is good.
Cool!
> Why not merge the check and copy and put in lib/utils.c
>
> int get_ifname(char *name, const char *arg)
> {
> ...
What do you have in mind exactly? There are basically three situations
to which check_ifname() is added:
1) Simple pointer caching:
| check_ifname("name", *argv);
| name = *argv;
2) Value caching:
| check_ifname("name", *argv);
| strncpy(name, *argv, IFNAMSIZ);
3) Direct netlink attribute creation:
| check_ifname("name", *argv);
| addattr_l(&req.n, sizeof(req), IFNAME, *argv, strlen(*argv) + 1);
To cover them all, I could introduce the following:
| char *check_ifname(const char *name, const char *argv)
| {
| /* check *arg, call invarg() if invalid */
| return *argv;
| }
|
| void copy_ifname(char *dst, const char *name, const char *argv)
| {
| strncpy(dst, check_ifname(name, argv), IFNAMSIZ);
| }
|
| void addattr_ifname(struct nlmsghdr *n, int maxlen, int type,
| const char *name, const char *argv)
| {
| addattr_l(n, maxlen, type, check_ifname(name, argv),
| strlen(*argv) + 1);
| }
What do you think?
Cheers, Phil
^ permalink raw reply
* RE: [PATCH v2 02/16] thunderbolt: Add support for XDomain properties
From: David Laight @ 2017-09-27 16:06 UTC (permalink / raw)
To: 'Mika Westerberg', David Miller
Cc: gregkh@linuxfoundation.org, andreas.noever@gmail.com,
michael.jamet@intel.com, yehezkel.bernat@intel.com,
amir.jer.levy@intel.com, Mario.Limonciello@dell.com,
lukas@wunner.de, andriy.shevchenko@linux.intel.com,
andrew@lunn.ch, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org
In-Reply-To: <20170927113241.GB4630@lahna.fi.intel.com>
From: Mika Westerberg
> Sent: 27 September 2017 12:33
...
> Just for my education, is there some rule which tells when __packed is
> to be used? For example the above structures are all 32-bit aligned but
> how about something like:
>
> struct foo {
> u32 value1;
> u8 value2;
> };
>
> If the on-wire format requires such structures I assume __packed
> is needed here?
You've endianness considerations as well with on-wire formats.
__packed indicates two things:
1) There will be no padding bytes between fields.
2) The structure itself might appear on any byte boundary.
The latter causes the compiler to do byte memory accesses and
shifts to load/store the data on some architectures.
So only mark things __packed when they might be misaligned in
memory.
David
^ permalink raw reply
* Re: [PATCH v3 1/1] ip_tunnel: add mpls over gre encapsulation
From: Amine Kherbouche @ 2017-09-27 16:08 UTC (permalink / raw)
To: Roopa Prabhu; +Cc: netdev@vger.kernel.org, xeb, David Lamparter
In-Reply-To: <CAJieiUgkOwJvrEq9vw+2-Up0qc9hvxt3M6LWnf8FSyx9VRX-Dw@mail.gmail.com>
On 09/27/2017 05:36 PM, Roopa Prabhu wrote:
> Amine, one small nit here.., if you define mpls_gre_rcv in gre header
> (like you had initially), you could do the below...
>
> #if IS_ENABLED(CONFIG_MPLS)
> mpls_gre_rcv()
> {
> /* real func */
> }
> #else
> mpls_gre_rcv()
> {
> kfree_skb(skb)
> return NET_RX_DROP
> }
> #endif
>
> and the check in gre_rcv() reduces to
>
> if (unlikely(tpi.proto == htons(ETH_P_MPLS_UC)))
> return mpls_gre_rcv(skb, hdr_len);
>
> Which looks much cleaner.
If I do that, do I have to add back the patch that export mpls_forward()
or just merge it with this one ?
^ permalink raw reply
* Re: [PATCH v3 1/1] ip_tunnel: add mpls over gre encapsulation
From: Roopa Prabhu @ 2017-09-27 16:20 UTC (permalink / raw)
To: Amine Kherbouche; +Cc: netdev@vger.kernel.org, xeb, David Lamparter
In-Reply-To: <1ce61dd3-293b-c26b-e173-3bdae4ba46a7@6wind.com>
On Wed, Sep 27, 2017 at 9:08 AM, Amine Kherbouche
<amine.kherbouche@6wind.com> wrote:
>
>
> On 09/27/2017 05:36 PM, Roopa Prabhu wrote:
>>
>> Amine, one small nit here.., if you define mpls_gre_rcv in gre header
>> (like you had initially), you could do the below...
>>
>> #if IS_ENABLED(CONFIG_MPLS)
>> mpls_gre_rcv()
>> {
>> /* real func */
>> }
>> #else
>> mpls_gre_rcv()
>> {
>> kfree_skb(skb)
>> return NET_RX_DROP
>> }
>> #endif
>>
>> and the check in gre_rcv() reduces to
>>
>> if (unlikely(tpi.proto == htons(ETH_P_MPLS_UC)))
>> return mpls_gre_rcv(skb, hdr_len);
>>
>> Which looks much cleaner.
>
>
> If I do that, do I have to add back the patch that export mpls_forward() or
> just merge it with this one ?
I think its better to bring the patch back in.
^ permalink raw reply
* Re: [PATCH v2 02/16] thunderbolt: Add support for XDomain properties
From: David Miller @ 2017-09-27 16:22 UTC (permalink / raw)
To: mika.westerberg
Cc: gregkh, andreas.noever, michael.jamet, yehezkel.bernat,
amir.jer.levy, Mario.Limonciello, lukas, andriy.shevchenko,
andrew, linux-kernel, netdev
In-Reply-To: <20170927113241.GB4630@lahna.fi.intel.com>
From: Mika Westerberg <mika.westerberg@linux.intel.com>
Date: Wed, 27 Sep 2017 14:32:41 +0300
> Just for my education, is there some rule which tells when __packed is
> to be used? For example the above structures are all 32-bit aligned but
> how about something like:
>
> struct foo {
> u32 value1;
> u8 value2;
> };
>
> If the on-wire format requires such structures I assume __packed
> is needed here?
Usually header elements are 32-bit aligned in a protocol, so it wouldn't
be specified like that.
The only legitimate case I've seen is where things are purposefully
misaligned within the header, like this:
struct foo {
u16 x;
u64 y;
u16 z;
};
Where the 'y' element is 2-byte aligned.
Fortunately, those situations are extremely rare.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox