Netdev List
 help / color / mirror / Atom feed
* [PATCH] net/netfilter/nf_nat_proto.c - make tables static
From: Valdis Klētnieks @ 2019-08-08  5:43 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal
  Cc: netfilter-devel, coreteam, netdev, linux-kernel

Sparse warns about two tables not being declared.

  CHECK   net/netfilter/nf_nat_proto.c
net/netfilter/nf_nat_proto.c:725:26: warning: symbol 'nf_nat_ipv4_ops' was not declared. Should it be static?
net/netfilter/nf_nat_proto.c:964:26: warning: symbol 'nf_nat_ipv6_ops' was not declared. Should it be static?

And in fact they can indeed be static.

Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>

diff --git a/net/netfilter/nf_nat_proto.c b/net/netfilter/nf_nat_proto.c
index 7ac733ebd060..0a59c14b5177 100644
--- a/net/netfilter/nf_nat_proto.c
+++ b/net/netfilter/nf_nat_proto.c
@@ -722,7 +722,7 @@ nf_nat_ipv4_local_fn(void *priv, struct sk_buff *skb,
 	return ret;
 }
 
-const struct nf_hook_ops nf_nat_ipv4_ops[] = {
+static const struct nf_hook_ops nf_nat_ipv4_ops[] = {
 	/* Before packet filtering, change destination */
 	{
 		.hook		= nf_nat_ipv4_in,
@@ -961,7 +961,7 @@ nf_nat_ipv6_local_fn(void *priv, struct sk_buff *skb,
 	return ret;
 }
 
-const struct nf_hook_ops nf_nat_ipv6_ops[] = {
+static const struct nf_hook_ops nf_nat_ipv6_ops[] = {
 	/* Before packet filtering, change destination */
 	{
 		.hook		= nf_nat_ipv6_in,


^ permalink raw reply related

* Re: [PATCH] team: Add vlan tx offload to hw_enc_features
From: Jesse Brandeburg @ 2019-08-08  5:48 UTC (permalink / raw)
  To: YueHaibing
  Cc: j.vosburgh, vfalico, andy, davem, jiri, jay.vosburgh,
	linux-kernel, netdev, jesse.brandeburg
In-Reply-To: <20190807023808.51976-1-yuehaibing@huawei.com>

On Wed, 7 Aug 2019 10:38:08 +0800
YueHaibing <yuehaibing@huawei.com> wrote:

> We should also enable bonding's vlan tx offload in hw_enc_features,

You mean team's vlan tx offload?

> pass the vlan packets to the slave devices with vlan tci, let them

s/let them to/let the slave/

> to handle vlan tunneling offload implementation.
> 
> Fixes: 3268e5cb494d ("team: Advertise tunneling offload features")
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>


^ permalink raw reply

* Re: [PATCH v2 bpf-next] btf: expose BTF info through sysfs
From: Greg KH @ 2019-08-08  6:08 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Andrii Nakryiko, bpf@vger.kernel.org, netdev@vger.kernel.org,
	Alexei Starovoitov, daniel@iogearbox.net,
	andrii.nakryiko@gmail.com, Kernel Team, Masahiro Yamada,
	Arnaldo Carvalho de Melo, Jiri Olsa, Sam Ravnborg
In-Reply-To: <89a6e282-0250-4264-128d-469be99073e9@fb.com>

On Thu, Aug 08, 2019 at 04:24:25AM +0000, Yonghong Song wrote:
> 
> 
> On 8/7/19 5:32 PM, Andrii Nakryiko wrote:
> > Make .BTF section allocated and expose its contents through sysfs.

Was this original patch not on bpf@vger?  I can't find it in my
archive.  Anyway...

> > /sys/kernel/btf directory is created to contain all the BTFs present
> > inside kernel. Currently there is only kernel's main BTF, represented as
> > /sys/kernel/btf/kernel file. Once kernel modules' BTFs are supported,
> > each module will expose its BTF as /sys/kernel/btf/<module-name> file.

Why are you using sysfs for this?  Who uses "BTF"s?  Are these debugging
images that only people working on developing bpf programs are going to
need, or are these things that you are going to need on a production
system?

I ask as maybe debugfs is the best place for this if they are not needed
on production systems.


> > 
> > Current approach relies on a few pieces coming together:
> > 1. pahole is used to take almost final vmlinux image (modulo .BTF and
> >     kallsyms) and generate .BTF section by converting DWARF info into
> >     BTF. This section is not allocated and not mapped to any segment,
> >     though, so is not yet accessible from inside kernel at runtime.
> > 2. objcopy dumps .BTF contents into binary file and subsequently
> >     convert binary file into linkable object file with automatically
> >     generated symbols _binary__btf_kernel_bin_start and
> >     _binary__btf_kernel_bin_end, pointing to start and end, respectively,
> >     of BTF raw data.
> > 3. final vmlinux image is generated by linking this object file (and
> >     kallsyms, if necessary). sysfs_btf.c then creates
> >     /sys/kernel/btf/kernel file and exposes embedded BTF contents through
> >     it. This allows, e.g., libbpf and bpftool access BTF info at
> >     well-known location, without resorting to searching for vmlinux image
> >     on disk (location of which is not standardized and vmlinux image
> >     might not be even available in some scenarios, e.g., inside qemu
> >     during testing).
> > 
> > Alternative approach using .incbin assembler directive to embed BTF
> > contents directly was attempted but didn't work, because sysfs_proc.o is
> > not re-compiled during link-vmlinux.sh stage. This is required, though,
> > to update embedded BTF data (initially empty data is embedded, then
> > pahole generates BTF info and we need to regenerate sysfs_btf.o with
> > updated contents, but it's too late at that point).
> > 
> > If BTF couldn't be generated due to missing or too old pahole,
> > sysfs_btf.c handles that gracefully by detecting that
> > _binary__btf_kernel_bin_start (weak symbol) is 0 and not creating
> > /sys/kernel/btf at all.
> > 
> > v1->v2:
> > - allow kallsyms stage to re-use vmlinux generated by gen_btf();
> > 
> > Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > Cc: Sam Ravnborg <sam@ravnborg.org>
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >   kernel/bpf/Makefile     |  3 +++
> >   kernel/bpf/sysfs_btf.c  | 52 ++++++++++++++++++++++++++++++++++++++
> >   scripts/link-vmlinux.sh | 55 +++++++++++++++++++++++++++--------------
> >   3 files changed, 91 insertions(+), 19 deletions(-)
> >   create mode 100644 kernel/bpf/sysfs_btf.c

First rule, you can't create new sysfs files without a matching
Documentation/ABI/ set of entries.  Please do that for the next version
of this patch so we can properly check to see if what you are
documenting lines up with the code.  Otherwise we just have to guess as
to what the entries you are creating actually do.

> > 
> > diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
> > index 29d781061cd5..e1d9adb212f9 100644
> > --- a/kernel/bpf/Makefile
> > +++ b/kernel/bpf/Makefile
> > @@ -22,3 +22,6 @@ obj-$(CONFIG_CGROUP_BPF) += cgroup.o
> >   ifeq ($(CONFIG_INET),y)
> >   obj-$(CONFIG_BPF_SYSCALL) += reuseport_array.o
> >   endif
> > +ifeq ($(CONFIG_SYSFS),y)
> > +obj-$(CONFIG_DEBUG_INFO_BTF) += sysfs_btf.o
> > +endif
> > diff --git a/kernel/bpf/sysfs_btf.c b/kernel/bpf/sysfs_btf.c
> > new file mode 100644
> > index 000000000000..ac06ce1d62e8
> > --- /dev/null
> > +++ b/kernel/bpf/sysfs_btf.c
> > @@ -0,0 +1,52 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Provide kernel BTF information for introspection and use by eBPF tools.
> > + */
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +#include <linux/kobject.h>
> > +#include <linux/init.h>
> > +
> > +/* See scripts/link-vmlinux.sh, gen_btf() func for details */
> > +extern char __weak _binary__btf_kernel_bin_start[];
> > +extern char __weak _binary__btf_kernel_bin_end[];
> > +
> > +static ssize_t
> > +btf_kernel_read(struct file *file, struct kobject *kobj,
> > +		struct bin_attribute *bin_attr,
> > +		char *buf, loff_t off, size_t len)
> > +{
> > +	memcpy(buf, _binary__btf_kernel_bin_start + off, len);
> > +	return len;
> > +}
> > +
> > +static struct bin_attribute btf_kernel_attr __ro_after_init = {
> > +	.attr = {
> > +		.name = "kernel",
> > +		.mode = 0444,
> > +	},
> > +	.read = btf_kernel_read,
> > +};

BIN_ATTR_RO()?

> > +
> > +static struct bin_attribute *btf_attrs[] __ro_after_init = {
> > +	&btf_kernel_attr,
> > +	NULL,
> > +};
> > +
> > +static struct attribute_group btf_group_attr __ro_after_init = {
> > +	.name = "btf",
> > +	.bin_attrs = btf_attrs,
> > +};
> > +
> > +static int __init btf_kernel_init(void)
> > +{
> > +	if (!_binary__btf_kernel_bin_start)
> > +		return 0;
> > +
> > +	btf_kernel_attr.size = _binary__btf_kernel_bin_end -
> > +			       _binary__btf_kernel_bin_start;
> > +
> > +	return sysfs_create_group(kernel_kobj, &btf_group_attr);

You are nesting directories here without a "real" kobject in the middle.
Are you _sure_ you want to do that?  It's going to get really tricky
later on based on your comments above about creating multiple files in
that directory over time once "modules" are allowed.

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH bpf-next] btf: expose BTF info through sysfs
From: Greg KH @ 2019-08-08  6:09 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, netdev, ast, daniel, yhs, andrii.nakryiko, kernel-team,
	Masahiro Yamada, Arnaldo Carvalho de Melo, Jiri Olsa
In-Reply-To: <20190807183821.138728-1-andriin@fb.com>

On Wed, Aug 07, 2019 at 11:38:21AM -0700, Andrii Nakryiko wrote:
> Make .BTF section allocated and expose its contents through sysfs.

Ah, found the original of this, sorry for the noise on the previous
response...

Still need Documentation/ABI/ entries though :)

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH net] net: phy: rtl8211f: do a double read to get real time link status
From: Heiner Kallweit @ 2019-08-08  6:11 UTC (permalink / raw)
  To: Yonglong Liu, davem, andrew
  Cc: netdev, linux-kernel, linuxarm, salil.mehta, yisen.zhuang,
	shiju.jose
In-Reply-To: <e663235c-93eb-702d-5a9c-8f781d631c42@huawei.com>

On 08.08.2019 03:15, Yonglong Liu wrote:
> 
> 
> On 2019/8/8 0:47, Heiner Kallweit wrote:
>> On 07.08.2019 15:16, Yonglong Liu wrote:
>>> [   27.232781] hns3 0000:bd:00.3 eth7: net open
>>> [   27.237303] 8021q: adding VLAN 0 to HW filter on device eth7
>>> [   27.242972] IPv6: ADDRCONF(NETDEV_CHANGE): eth7: link becomes ready
>>> [   27.244449] hns3 0000:bd:00.3: invalid speed (-1)
>>> [   27.253904] hns3 0000:bd:00.3 eth7: failed to adjust link.
>>> [   27.259379] RTL8211F Gigabit Ethernet mii-0000:bd:00.3:07: PHY state change UP -> RUNNING
>>> [   27.924903] hns3 0000:bd:00.3 eth7: link up
>>> [   28.280479] RTL8211F Gigabit Ethernet mii-0000:bd:00.3:07: PHY state change RUNNING -> NOLINK
>>> [   29.208452] hns3 0000:bd:00.3 eth7: link down
>>> [   32.376745] RTL8211F Gigabit Ethernet mii-0000:bd:00.3:07: PHY state change NOLINK -> RUNNING
>>> [   33.208448] hns3 0000:bd:00.3 eth7: link up
>>> [   35.253821] hns3 0000:bd:00.3 eth7: net stop
>>> [   35.258270] hns3 0000:bd:00.3 eth7: link down
>>>
>>> When using rtl8211f in polling mode, may get a invalid speed,
>>> because of reading a fake link up and autoneg complete status
>>> immediately after starting autoneg:
>>>
>>>         ifconfig-1176  [007] ....    27.232763: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x00 val:0x1040
>>>   kworker/u257:1-670   [015] ....    27.232805: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x04 val:0x01e1
>>>   kworker/u257:1-670   [015] ....    27.232815: mdio_access: mii-0000:bd:00.3 write phy:0x07 reg:0x04 val:0x05e1
>>>   kworker/u257:1-670   [015] ....    27.232869: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x01 val:0x79ad
>>>   kworker/u257:1-670   [015] ....    27.232904: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x09 val:0x0200
>>>   kworker/u257:1-670   [015] ....    27.232940: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x00 val:0x1040
>>>   kworker/u257:1-670   [015] ....    27.232949: mdio_access: mii-0000:bd:00.3 write phy:0x07 reg:0x00 val:0x1240
>>>   kworker/u257:1-670   [015] ....    27.233003: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x01 val:0x79ad
>>>   kworker/u257:1-670   [015] ....    27.233039: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x0a val:0x3002
>>>   kworker/u257:1-670   [015] ....    27.233074: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x09 val:0x0200
>>>   kworker/u257:1-670   [015] ....    27.233110: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x05 val:0x0000
>>>   kworker/u257:1-670   [000] ....    28.280475: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x01 val:0x7989
>>>   kworker/u257:1-670   [000] ....    29.304471: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x01 val:0x7989
>>>
>>> According to the datasheet of rtl8211f, to get the real time
>>> link status, need to read MII_BMSR twice.
>>>
>>> This patch add a read_status hook for rtl8211f, and do a fake
>>> phy_read before genphy_read_status(), so that can get real link
>>> status in genphy_read_status().
>>>
>>> Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
>>> ---
>>>  drivers/net/phy/realtek.c | 13 +++++++++++++
>>>  1 file changed, 13 insertions(+)
>>>
>> Is this an accidental resubmit? Because we discussed this in
>> https://marc.info/?t=156413509900003&r=1&w=2 and a fix has
>> been applied already.
>>
>> Heiner
>>
>> .
>>
> 
> In https://marc.info/?t=156413509900003&r=1&w=2 , the invalid speed
> recurrence rate is almost 100%, and I had test the solution about
> 5 times and it works. But yesterday it happen again suddenly, and than
> I fount that the recurrence rate reduce to 10%. This time we get 0x79ad
> after autoneg started which is not 0x798d from last discussion.
> 
> 
> 
OK, I'll have a look.
However the approach is wrong. The double read is related to the latching
of link-down events. This is done by all PHY's and not specific to RT8211F.
Also it's not related to the problem. I assume any sufficient delay would
do instead of the read.

^ permalink raw reply

* Re: [PATCH v2 1/2] tcp: add new tcp_mtu_probe_floor sysctl
From: Eric Dumazet @ 2019-08-08  6:12 UTC (permalink / raw)
  To: Josh Hunt, netdev; +Cc: davem, edumazet, ncardwell
In-Reply-To: <1565221950-1376-1-git-send-email-johunt@akamai.com>



On 8/8/19 1:52 AM, Josh Hunt wrote:
> The current implementation of TCP MTU probing can considerably
> underestimate the MTU on lossy connections allowing the MSS to get down to
> 48. We have found that in almost all of these cases on our networks these
> paths can handle much larger MTUs meaning the connections are being
> artificially limited. Even though TCP MTU probing can raise the MSS back up
> we have seen this not to be the case causing connections to be "stuck" with
> an MSS of 48 when heavy loss is present.
> 
> Prior to pushing out this change we could not keep TCP MTU probing enabled
> b/c of the above reasons. Now with a reasonble floor set we've had it
> enabled for the past 6 months.

I am still sad to see you do not share what is a reasonable value and let
everybody guess. 

It seems to be a top-secret value.

> 
> The new sysctl will still default to TCP_MIN_SND_MSS (48), but gives
> administrators the ability to control the floor of MSS probing.
> 
> Signed-off-by: Josh Hunt <johunt@akamai.com>

Signed-off-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply

* Re: [PATCH v2 2/2] tcp: Update TCP_BASE_MSS comment
From: Eric Dumazet @ 2019-08-08  6:13 UTC (permalink / raw)
  To: Josh Hunt, netdev; +Cc: davem, edumazet, ncardwell
In-Reply-To: <1565221950-1376-2-git-send-email-johunt@akamai.com>



On 8/8/19 1:52 AM, Josh Hunt wrote:
> TCP_BASE_MSS is used as the default initial MSS value when MTU probing is
> enabled. Update the comment to reflect this.
> 
> Suggested-by: Neal Cardwell <ncardwell@google.com>
> Signed-off-by: Josh Hunt <johunt@akamai.com>
> ---

Signed-off-by: Eric Dumazet <edumazet@google.com>


^ permalink raw reply

* Re: [PATCH net] net: phy: rtl8211f: do a double read to get real time link status
From: Yonglong Liu @ 2019-08-08  6:21 UTC (permalink / raw)
  To: Heiner Kallweit, davem, andrew
  Cc: netdev, linux-kernel, linuxarm, salil.mehta, yisen.zhuang,
	shiju.jose
In-Reply-To: <080b68c7-abe6-d142-da4b-26e8a7d4dc19@gmail.com>



On 2019/8/8 14:11, Heiner Kallweit wrote:
> On 08.08.2019 03:15, Yonglong Liu wrote:
>>
>>
>> On 2019/8/8 0:47, Heiner Kallweit wrote:
>>> On 07.08.2019 15:16, Yonglong Liu wrote:
>>>> [   27.232781] hns3 0000:bd:00.3 eth7: net open
>>>> [   27.237303] 8021q: adding VLAN 0 to HW filter on device eth7
>>>> [   27.242972] IPv6: ADDRCONF(NETDEV_CHANGE): eth7: link becomes ready
>>>> [   27.244449] hns3 0000:bd:00.3: invalid speed (-1)
>>>> [   27.253904] hns3 0000:bd:00.3 eth7: failed to adjust link.
>>>> [   27.259379] RTL8211F Gigabit Ethernet mii-0000:bd:00.3:07: PHY state change UP -> RUNNING
>>>> [   27.924903] hns3 0000:bd:00.3 eth7: link up
>>>> [   28.280479] RTL8211F Gigabit Ethernet mii-0000:bd:00.3:07: PHY state change RUNNING -> NOLINK
>>>> [   29.208452] hns3 0000:bd:00.3 eth7: link down
>>>> [   32.376745] RTL8211F Gigabit Ethernet mii-0000:bd:00.3:07: PHY state change NOLINK -> RUNNING
>>>> [   33.208448] hns3 0000:bd:00.3 eth7: link up
>>>> [   35.253821] hns3 0000:bd:00.3 eth7: net stop
>>>> [   35.258270] hns3 0000:bd:00.3 eth7: link down
>>>>
>>>> When using rtl8211f in polling mode, may get a invalid speed,
>>>> because of reading a fake link up and autoneg complete status
>>>> immediately after starting autoneg:
>>>>
>>>>         ifconfig-1176  [007] ....    27.232763: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x00 val:0x1040
>>>>   kworker/u257:1-670   [015] ....    27.232805: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x04 val:0x01e1
>>>>   kworker/u257:1-670   [015] ....    27.232815: mdio_access: mii-0000:bd:00.3 write phy:0x07 reg:0x04 val:0x05e1
>>>>   kworker/u257:1-670   [015] ....    27.232869: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x01 val:0x79ad
>>>>   kworker/u257:1-670   [015] ....    27.232904: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x09 val:0x0200
>>>>   kworker/u257:1-670   [015] ....    27.232940: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x00 val:0x1040
>>>>   kworker/u257:1-670   [015] ....    27.232949: mdio_access: mii-0000:bd:00.3 write phy:0x07 reg:0x00 val:0x1240
>>>>   kworker/u257:1-670   [015] ....    27.233003: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x01 val:0x79ad
>>>>   kworker/u257:1-670   [015] ....    27.233039: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x0a val:0x3002
>>>>   kworker/u257:1-670   [015] ....    27.233074: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x09 val:0x0200
>>>>   kworker/u257:1-670   [015] ....    27.233110: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x05 val:0x0000
>>>>   kworker/u257:1-670   [000] ....    28.280475: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x01 val:0x7989
>>>>   kworker/u257:1-670   [000] ....    29.304471: mdio_access: mii-0000:bd:00.3 read  phy:0x07 reg:0x01 val:0x7989
>>>>
>>>> According to the datasheet of rtl8211f, to get the real time
>>>> link status, need to read MII_BMSR twice.
>>>>
>>>> This patch add a read_status hook for rtl8211f, and do a fake
>>>> phy_read before genphy_read_status(), so that can get real link
>>>> status in genphy_read_status().
>>>>
>>>> Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
>>>> ---
>>>>  drivers/net/phy/realtek.c | 13 +++++++++++++
>>>>  1 file changed, 13 insertions(+)
>>>>
>>> Is this an accidental resubmit? Because we discussed this in
>>> https://marc.info/?t=156413509900003&r=1&w=2 and a fix has
>>> been applied already.
>>>
>>> Heiner
>>>
>>> .
>>>
>>
>> In https://marc.info/?t=156413509900003&r=1&w=2 , the invalid speed
>> recurrence rate is almost 100%, and I had test the solution about
>> 5 times and it works. But yesterday it happen again suddenly, and than
>> I fount that the recurrence rate reduce to 10%. This time we get 0x79ad
>> after autoneg started which is not 0x798d from last discussion.
>>
>>
>>
> OK, I'll have a look.
> However the approach is wrong. The double read is related to the latching
> of link-down events. This is done by all PHY's and not specific to RT8211F.
> Also it's not related to the problem. I assume any sufficient delay would
> do instead of the read.
> 
> .
> 

So you will send a new patch to fix this problem? I am waiting for it,
and can do a full test this time.


^ permalink raw reply

* [PATCH v2] team: Add vlan tx offload to hw_enc_features
From: YueHaibing @ 2019-08-08  6:22 UTC (permalink / raw)
  To: j.vosburgh, vfalico, andy, davem, jiri, jay.vosburgh
  Cc: linux-kernel, netdev, YueHaibing
In-Reply-To: <20190807023808.51976-1-yuehaibing@huawei.com>

We should also enable team's vlan tx offload in hw_enc_features,
pass the vlan packets to the slave devices with vlan tci, let the
slave handle vlan tunneling offload implementation.

Fixes: 3268e5cb494d ("team: Advertise tunneling offload features")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
v2: fix commit log typo
---
 drivers/net/team/team.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index abfa0da..e8089de 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1004,6 +1004,8 @@ static void __team_compute_features(struct team *team)
 
 	team->dev->vlan_features = vlan_features;
 	team->dev->hw_enc_features = enc_features | NETIF_F_GSO_ENCAP_ALL |
+				     NETIF_F_HW_VLAN_CTAG_TX |
+				     NETIF_F_HW_VLAN_STAG_TX |
 				     NETIF_F_GSO_UDP_L4;
 	team->dev->hard_header_len = max_hard_header_len;
 
-- 
2.7.4



^ permalink raw reply related

* Re: [PATCH 11/17] qca: no need to check return value of debugfs_create functions
From: Michael Heimpold @ 2019-08-08  6:30 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: netdev, David S. Miller, Stefan Wahren, Yangtao Li
In-Reply-To: <20190806161128.31232-12-gregkh@linuxfoundation.org>

Am Dienstag, 6. August 2019, 18:11:22 CEST schrieb Greg Kroah-Hartman:
> When calling debugfs functions, there is no need to ever check the
> return value.  The function can work or not, but the code logic should
> never do something different based on this.
> 
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Stefan Wahren <stefan.wahren@i2se.com>
> Cc: Michael Heimpold <michael.heimpold@i2se.com>
> Cc: Yangtao Li <tiny.windzz@gmail.com>
> Cc: netdev@vger.kernel.org
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  drivers/net/ethernet/qualcomm/qca_debug.c | 13 +++----------
>  1 file changed, 3 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/ethernet/qualcomm/qca_debug.c
> b/drivers/net/ethernet/qualcomm/qca_debug.c index
> bcb890b18a94..702aa217a27a 100644
> --- a/drivers/net/ethernet/qualcomm/qca_debug.c
> +++ b/drivers/net/ethernet/qualcomm/qca_debug.c
> @@ -131,17 +131,10 @@ DEFINE_SHOW_ATTRIBUTE(qcaspi_info);
>  void
>  qcaspi_init_device_debugfs(struct qcaspi *qca)
>  {
> -	struct dentry *device_root;
> +	qca->device_root = debugfs_create_dir(dev_name(&qca->net_dev->dev),
> +					      NULL);
> 
> -	device_root = debugfs_create_dir(dev_name(&qca->net_dev->dev), NULL);
> -	qca->device_root = device_root;
> -
> -	if (IS_ERR(device_root) || !device_root) {
> -		pr_warn("failed to create debugfs directory for %s\n",
> -			dev_name(&qca->net_dev->dev));
> -		return;
> -	}
> -	debugfs_create_file("info", S_IFREG | 0444, device_root, qca,
> +	debugfs_create_file("info", S_IFREG | 0444, qca->device_root, qca,
>  			    &qcaspi_info_fops);
>  }

Acked-by: Michael Heimpold <michael.heimpold@i2se.com>





^ permalink raw reply

* Re: [PATCH net-next] net: openvswitch: Set OvS recirc_id from tc chain index
From: Paul Blakey @ 2019-08-08  6:39 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: netdev@vger.kernel.org, David S. Miller, Justin Pettit,
	Pravin B Shelar, Simon Horman, Vlad Buslov, Jiri Pirko, Roi Dayan,
	Yossi Kuperman, Rony Efraim, Oz Shlomo
In-Reply-To: <20190807150026.GE21609@localhost.localdomain>


On 8/7/2019 6:00 PM, Marcelo Ricardo Leitner wrote:
> On Wed, Aug 07, 2019 at 03:08:42PM +0300, Paul Blakey wrote:
>> Offloaded OvS datapath rules are translated one to one to tc rules,
>> for example the following simplified OvS rule:
>>
>> recirc_id(0),in_port(dev1),eth_type(0x0800),ct_state(-trk) actions:ct(),recirc(2)
>>
>> Will be translated to the following tc rule:
>>
>> $ tc filter add dev dev1 ingress \
>> 	    prio 1 chain 0 proto ip \
>> 		flower tcp ct_state -trk \
>> 		action ct pipe \
>> 		action goto chain 2
>>
>> Received packets will first travel though tc, and if they aren't stolen
>> by it, like in the above rule, they will continue to OvS datapath.
>> Since we already did some actions (action ct in this case) which might
>> modify the packets, and updated action stats, we would like to continue
>> the proccessing with the correct recirc_id in OvS (here recirc_id(2))
>> where we left off.
>>
>> To support this, introduce a new skb extension for tc, which
>> will be used for translating tc chain to ovs recirc_id to
>> handle these miss cases. Last tc chain index will be set
>> by tc goto chain action and read by OvS datapath.
>>
>> Signed-off-by: Paul Blakey <paulb@mellanox.com>
>> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
>> Acked-by: Jiri Pirko <jiri@mellanox.com>
> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>

Thanks!


>> ---
>>   include/linux/skbuff.h    | 13 +++++++++++++
>>   include/net/sch_generic.h |  5 ++++-
>>   net/core/skbuff.c         |  6 ++++++
>>   net/openvswitch/flow.c    |  9 +++++++++
>>   net/sched/Kconfig         | 13 +++++++++++++
>>   net/sched/act_api.c       |  1 +
>>   net/sched/cls_api.c       | 12 ++++++++++++
>>   7 files changed, 58 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 3aef8d8..fb2a792 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -279,6 +279,16 @@ struct nf_bridge_info {
>>   };
>>   #endif
>>   
>> +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
>> +/* Chain in tc_skb_ext will be used to share the tc chain with
>> + * ovs recirc_id. It will be set to the current chain by tc
>> + * and read by ovs to recirc_id.
>> + */
>> +struct tc_skb_ext {
>> +	__u32 chain;
>> +};
>> +#endif
>> +
>>   struct sk_buff_head {
>>   	/* These two members must be first. */
>>   	struct sk_buff	*next;
>> @@ -4050,6 +4060,9 @@ enum skb_ext_id {
>>   #ifdef CONFIG_XFRM
>>   	SKB_EXT_SEC_PATH,
>>   #endif
>> +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
>> +	TC_SKB_EXT,
>> +#endif
>>   	SKB_EXT_NUM, /* must be last */
>>   };
>>   
>> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
>> index 6b6b012..871feea 100644
>> --- a/include/net/sch_generic.h
>> +++ b/include/net/sch_generic.h
>> @@ -275,7 +275,10 @@ struct tcf_result {
>>   			unsigned long	class;
>>   			u32		classid;
>>   		};
>> -		const struct tcf_proto *goto_tp;
>> +		struct {
>> +			const struct tcf_proto *goto_tp;
>> +			u32 goto_index;
>> +		};
>>   
>>   		/* used in the skb_tc_reinsert function */
>>   		struct {
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index ea8e8d3..2b40b5a 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -4087,6 +4087,9 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb)
>>   #ifdef CONFIG_XFRM
>>   	[SKB_EXT_SEC_PATH] = SKB_EXT_CHUNKSIZEOF(struct sec_path),
>>   #endif
>> +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
>> +	[TC_SKB_EXT] = SKB_EXT_CHUNKSIZEOF(struct tc_skb_ext),
>> +#endif
>>   };
>>   
>>   static __always_inline unsigned int skb_ext_total_length(void)
>> @@ -4098,6 +4101,9 @@ static __always_inline unsigned int skb_ext_total_length(void)
>>   #ifdef CONFIG_XFRM
>>   		skb_ext_type_len[SKB_EXT_SEC_PATH] +
>>   #endif
>> +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
>> +		skb_ext_type_len[TC_SKB_EXT] +
>> +#endif
>>   		0;
>>   }
>>   
>> diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
>> index bc89e16..0287ead 100644
>> --- a/net/openvswitch/flow.c
>> +++ b/net/openvswitch/flow.c
>> @@ -816,6 +816,9 @@ static int key_extract_mac_proto(struct sk_buff *skb)
>>   int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
>>   			 struct sk_buff *skb, struct sw_flow_key *key)
>>   {
>> +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
>> +	struct tc_skb_ext *tc_ext;
>> +#endif
>>   	int res, err;
>>   
>>   	/* Extract metadata from packet. */
>> @@ -848,7 +851,13 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
>>   	if (res < 0)
>>   		return res;
>>   	key->mac_proto = res;
>> +
>> +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
>> +	tc_ext = skb_ext_find(skb, TC_SKB_EXT);
>> +	key->recirc_id = tc_ext ? tc_ext->chain : 0;
>> +#else
>>   	key->recirc_id = 0;
>> +#endif
>>   
>>   	err = key_extract(skb, key);
>>   	if (!err)
>> diff --git a/net/sched/Kconfig b/net/sched/Kconfig
>> index afd2ba1..b3faafe 100644
>> --- a/net/sched/Kconfig
>> +++ b/net/sched/Kconfig
>> @@ -963,6 +963,19 @@ config NET_IFE_SKBTCINDEX
>>           tristate "Support to encoding decoding skb tcindex on IFE action"
>>           depends on NET_ACT_IFE
>>   
>> +config NET_TC_SKB_EXT
>> +	bool "TC recirculation support"
>> +	depends on NET_CLS_ACT
>> +	default y if NET_CLS_ACT
>> +	select SKB_EXTENSIONS
>> +
>> +	help
>> +	  Say Y here to allow tc chain misses to continue in OvS datapath in
>> +	  the correct recirc_id, and hardware chain misses to continue in
>> +	  the correct chain in tc software datapath.
>> +
>> +	  Say N here if you won't be using tc<->ovs offload or tc chains offload.
>> +
>>   endif # NET_SCHED
>>   
>>   config NET_SCH_FIFO
>> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
>> index 3397122..c393604 100644
>> --- a/net/sched/act_api.c
>> +++ b/net/sched/act_api.c
>> @@ -27,6 +27,7 @@ static void tcf_action_goto_chain_exec(const struct tc_action *a,
>>   {
>>   	const struct tcf_chain *chain = rcu_dereference_bh(a->goto_chain);
>>   
>> +	res->goto_index = chain->index;
>>   	res->goto_tp = rcu_dereference_bh(chain->filter_chain);
>>   }
>>   
>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> index 3565d9a..b0b829a 100644
>> --- a/net/sched/cls_api.c
>> +++ b/net/sched/cls_api.c
>> @@ -1660,6 +1660,18 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
>>   			goto reset;
>>   		} else if (unlikely(TC_ACT_EXT_CMP(err, TC_ACT_GOTO_CHAIN))) {
>>   			first_tp = res->goto_tp;
>> +
>> +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
>> +			{
>> +				struct tc_skb_ext *ext;
>> +
>> +				ext = skb_ext_add(skb, TC_SKB_EXT);
>> +				if (WARN_ON_ONCE(!ext))
>> +					return TC_ACT_SHOT;
>> +
>> +				ext->chain = res->goto_index;
>> +			}
>> +#endif
>>   			goto reset;
>>   		}
>>   #endif
>> -- 
>> 1.8.3.1
>>

^ permalink raw reply

* Re: [PATCH bpf-next 1/3] bpf: support cloning sk storage on accept()
From: Martin Lau @ 2019-08-08  6:39 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, davem@davemloft.net,
	ast@kernel.org, daniel@iogearbox.net
In-Reply-To: <20190807154720.260577-2-sdf@google.com>

On Wed, Aug 07, 2019 at 08:47:18AM -0700, Stanislav Fomichev wrote:
> Add new helper bpf_sk_storage_clone which optionally clones sk storage
> and call it from bpf_sk_storage_clone. Reuse the gap in
> bpf_sk_storage_elem to store clone/non-clone flag.
> 
> Cc: Martin KaFai Lau <kafai@fb.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  include/net/bpf_sk_storage.h |  10 ++++
>  include/uapi/linux/bpf.h     |   1 +
>  net/core/bpf_sk_storage.c    | 102 +++++++++++++++++++++++++++++++++--
>  net/core/sock.c              |   9 ++--
>  4 files changed, 115 insertions(+), 7 deletions(-)
> 
> diff --git a/include/net/bpf_sk_storage.h b/include/net/bpf_sk_storage.h
> index b9dcb02e756b..8e4f831d2e52 100644
> --- a/include/net/bpf_sk_storage.h
> +++ b/include/net/bpf_sk_storage.h
> @@ -10,4 +10,14 @@ void bpf_sk_storage_free(struct sock *sk);
>  extern const struct bpf_func_proto bpf_sk_storage_get_proto;
>  extern const struct bpf_func_proto bpf_sk_storage_delete_proto;
>  
> +#ifdef CONFIG_BPF_SYSCALL
> +int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk);
> +#else
> +static inline int bpf_sk_storage_clone(const struct sock *sk,
> +				       struct sock *newsk)
> +{
> +	return 0;
> +}
> +#endif
> +
>  #endif /* _BPF_SK_STORAGE_H */
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 4393bd4b2419..00459ca4c8cf 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2931,6 +2931,7 @@ enum bpf_func_id {
>  
>  /* BPF_FUNC_sk_storage_get flags */
>  #define BPF_SK_STORAGE_GET_F_CREATE	(1ULL << 0)
> +#define BPF_SK_STORAGE_GET_F_CLONE	(1ULL << 1)
It is only used in bpf_sk_storage_get().
What if the elem is created from bpf_fd_sk_storage_update_elem()
i.e. from the syscall API ?

What may be the use case for a map to have both CLONE and non-CLONE
elements?  If it is not the case, would it be better to add
BPF_F_CLONE to bpf_attr->map_flags?

>  
>  /* Mode for BPF_FUNC_skb_adjust_room helper. */
>  enum bpf_adj_room_mode {
> diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
> index 94c7f77ecb6b..b6dea67965bc 100644
> --- a/net/core/bpf_sk_storage.c
> +++ b/net/core/bpf_sk_storage.c
> @@ -12,6 +12,9 @@
>  
>  static atomic_t cache_idx;
>  
> +#define BPF_SK_STORAGE_GET_F_MASK	(BPF_SK_STORAGE_GET_F_CREATE | \
> +					 BPF_SK_STORAGE_GET_F_CLONE)
> +
>  struct bucket {
>  	struct hlist_head list;
>  	raw_spinlock_t lock;
> @@ -66,7 +69,8 @@ struct bpf_sk_storage_elem {
>  	struct hlist_node snode;	/* Linked to bpf_sk_storage */
>  	struct bpf_sk_storage __rcu *sk_storage;
>  	struct rcu_head rcu;
> -	/* 8 bytes hole */
> +	u8 clone:1;
> +	/* 7 bytes hole */
>  	/* The data is stored in aother cacheline to minimize
>  	 * the number of cachelines access during a cache hit.
>  	 */
> @@ -509,7 +513,7 @@ static int sk_storage_delete(struct sock *sk, struct bpf_map *map)
>  	return 0;
>  }
>  
> -/* Called by __sk_destruct() */
> +/* Called by __sk_destruct() & bpf_sk_storage_clone() */
>  void bpf_sk_storage_free(struct sock *sk)
>  {
>  	struct bpf_sk_storage_elem *selem;
> @@ -739,19 +743,106 @@ static int bpf_fd_sk_storage_delete_elem(struct bpf_map *map, void *key)
>  	return err;
>  }
>  
> +static struct bpf_sk_storage_elem *
> +bpf_sk_storage_clone_elem(struct sock *newsk,
> +			  struct bpf_sk_storage_map *smap,
> +			  struct bpf_sk_storage_elem *selem)
> +{
> +	struct bpf_sk_storage_elem *copy_selem;
> +
> +	copy_selem = selem_alloc(smap, newsk, NULL, true);
> +	if (!copy_selem)
> +		return ERR_PTR(-ENOMEM);
nit.
may be just return NULL as selem_alloc() does.

> +
> +	if (map_value_has_spin_lock(&smap->map))
> +		copy_map_value_locked(&smap->map, SDATA(copy_selem)->data,
> +				      SDATA(selem)->data, true);
> +	else
> +		copy_map_value(&smap->map, SDATA(copy_selem)->data,
> +			       SDATA(selem)->data);
> +
> +	return copy_selem;
> +}
> +
> +int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
> +{
> +	struct bpf_sk_storage *new_sk_storage = NULL;
> +	struct bpf_sk_storage *sk_storage;
> +	struct bpf_sk_storage_elem *selem;
> +	int ret;
> +
> +	RCU_INIT_POINTER(newsk->sk_bpf_storage, NULL);
> +
> +	rcu_read_lock();
> +	sk_storage = rcu_dereference(sk->sk_bpf_storage);
> +
> +	if (!sk_storage || hlist_empty(&sk_storage->list))
> +		goto out;
> +
> +	hlist_for_each_entry_rcu(selem, &sk_storage->list, snode) {
> +		struct bpf_sk_storage_map *smap;
> +		struct bpf_sk_storage_elem *copy_selem;
> +
> +		if (!selem->clone)
> +			continue;
> +
> +		smap = rcu_dereference(SDATA(selem)->smap);
> +		if (!smap)
smap should not be NULL.

> +			continue;
> +
> +		copy_selem = bpf_sk_storage_clone_elem(newsk, smap, selem);
> +		if (IS_ERR(copy_selem)) {
> +			ret = PTR_ERR(copy_selem);
> +			goto err;
> +		}
> +
> +		if (!new_sk_storage) {
> +			ret = sk_storage_alloc(newsk, smap, copy_selem);
> +			if (ret) {
> +				kfree(copy_selem);
> +				atomic_sub(smap->elem_size,
> +					   &newsk->sk_omem_alloc);
> +				goto err;
> +			}
> +
> +			new_sk_storage = rcu_dereference(copy_selem->sk_storage);
> +			continue;
> +		}
> +
> +		raw_spin_lock_bh(&new_sk_storage->lock);
> +		selem_link_map(smap, copy_selem);
Unlike the existing selem-update use-cases in bpf_sk_storage.c,
the smap->map.refcnt has not been held here.  Reading the smap
is fine.  However, adding a new selem to a deleting smap is an issue.
Hence, I think bpf_map_inc_not_zero() should be done first.

> +		__selem_link_sk(new_sk_storage, copy_selem);
> +		raw_spin_unlock_bh(&new_sk_storage->lock);
> +	}
> +
> +out:
> +	rcu_read_unlock();
> +	return 0;
> +
> +err:
> +	rcu_read_unlock();
> +
> +	bpf_sk_storage_free(newsk);
> +	return ret;
> +}
> +
>  BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
>  	   void *, value, u64, flags)
>  {
>  	struct bpf_sk_storage_data *sdata;
>  
> -	if (flags > BPF_SK_STORAGE_GET_F_CREATE)
> +	if (flags & ~BPF_SK_STORAGE_GET_F_MASK)
> +		return (unsigned long)NULL;
> +
> +	if ((flags & BPF_SK_STORAGE_GET_F_CLONE) &&
> +	    !(flags & BPF_SK_STORAGE_GET_F_CREATE))
>  		return (unsigned long)NULL;
>  
>  	sdata = sk_storage_lookup(sk, map, true);
>  	if (sdata)
>  		return (unsigned long)sdata->data;
>  
> -	if (flags == BPF_SK_STORAGE_GET_F_CREATE &&
> +	if ((flags & BPF_SK_STORAGE_GET_F_CREATE) &&
>  	    /* Cannot add new elem to a going away sk.
>  	     * Otherwise, the new elem may become a leak
>  	     * (and also other memory issues during map
> @@ -762,6 +853,9 @@ BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
>  		/* sk must be a fullsock (guaranteed by verifier),
>  		 * so sock_gen_put() is unnecessary.
>  		 */
> +		if (!IS_ERR(sdata))
> +			SELEM(sdata)->clone =
> +				!!(flags & BPF_SK_STORAGE_GET_F_CLONE);
>  		sock_put(sk);
>  		return IS_ERR(sdata) ?
>  			(unsigned long)NULL : (unsigned long)sdata->data;
> diff --git a/net/core/sock.c b/net/core/sock.c
> index d57b0cc995a0..f5e801a9cea4 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1851,9 +1851,12 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
>  			goto out;
>  		}
>  		RCU_INIT_POINTER(newsk->sk_reuseport_cb, NULL);
> -#ifdef CONFIG_BPF_SYSCALL
> -		RCU_INIT_POINTER(newsk->sk_bpf_storage, NULL);
> -#endif
> +
> +		if (bpf_sk_storage_clone(sk, newsk)) {
> +			sk_free_unlock_clone(newsk);
> +			newsk = NULL;
> +			goto out;
> +		}
>  
>  		newsk->sk_err	   = 0;
>  		newsk->sk_err_soft = 0;
> -- 
> 2.22.0.770.g0f2c4a37fd-goog
> 

^ permalink raw reply

* Re: [bpf-next PATCH 2/3] samples/bpf: make xdp_fwd more practically usable via devmap lookup
From: Jesper Dangaard Brouer @ 2019-08-08  6:51 UTC (permalink / raw)
  To: Y Song
  Cc: netdev, Daniel Borkmann, Alexei Starovoitov, Xdp, a.s.protopopov,
	David Ahern, brouer
In-Reply-To: <CAH3MdRUf_2Sk8v2dPeQ_+LfKPPwX9N3QoMDMCGFehd5JQVktcw@mail.gmail.com>

On Wed, 7 Aug 2019 11:04:17 -0700
Y Song <ys114321@gmail.com> wrote:

> On Wed, Aug 7, 2019 at 5:37 AM Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> >
> > This address the TODO in samples/bpf/xdp_fwd_kern.c, which points out
> > that the chosen egress index should be checked for existence in the
> > devmap. This can now be done via taking advantage of Toke's work in
> > commit 0cdbb4b09a06 ("devmap: Allow map lookups from eBPF").
> >
> > This change makes xdp_fwd more practically usable, as this allows for
> > a mixed environment, where IP-forwarding fallback to network stack, if
> > the egress device isn't configured to use XDP.
> >
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > ---
> >  samples/bpf/xdp_fwd_kern.c |   20 ++++++++++++++------
> >  samples/bpf/xdp_fwd_user.c |   36 +++++++++++++++++++++++++-----------
> >  2 files changed, 39 insertions(+), 17 deletions(-)
> >
> > diff --git a/samples/bpf/xdp_fwd_kern.c b/samples/bpf/xdp_fwd_kern.c
> > index e6ffc4ea06f4..4a5ad381ed2a 100644
> > --- a/samples/bpf/xdp_fwd_kern.c
> > +++ b/samples/bpf/xdp_fwd_kern.c
> > @@ -104,13 +104,21 @@ static __always_inline int xdp_fwd_flags(struct xdp_md *ctx, u32 flags)
> >
> >         rc = bpf_fib_lookup(ctx, &fib_params, sizeof(fib_params), flags);
> >
> > -       /* verify egress index has xdp support
> > -        * TO-DO bpf_map_lookup_elem(&tx_port, &key) fails with
> > -        *       cannot pass map_type 14 into func bpf_map_lookup_elem#1:
> > -        * NOTE: without verification that egress index supports XDP
> > -        *       forwarding packets are dropped.
> > -        */
> >         if (rc == 0) {
> > +               int *val;
> > +
> > +               /* Verify egress index has been configured as TX-port.
> > +                * (Note: User can still have inserted an egress ifindex that
> > +                * doesn't support XDP xmit, which will result in packet drops).
> > +                *
> > +                * Note: lookup in devmap supported since 0cdbb4b09a0.
> > +                * If not supported will fail with:
> > +                *  cannot pass map_type 14 into func bpf_map_lookup_elem#1:
> > +                */
> > +               val = bpf_map_lookup_elem(&tx_port, &fib_params.ifindex);  
> 
> It should be "xdp_tx_ports". Otherwise, you will have compilation errors.

Ups. This happened in my rebase, where I moved the rename patch [1/3]
before this one.  Thanks for catching this!

> > +               if (!val)
> > +                       return XDP_PASS;  
> 
> Also, maybe we can do
>          if (!bpf_map_lookup_elem(&tx_port, &fib_params.ifindex))
>             return XDP_PASS;
> so we do not need to define val at all.

I had it this way, because I also checked the contents (of the pointer
*val) to check if this was the correct ifindex, but I removed that
check again (as user side always insert correctly).  So, I guess I
could take your suggestion now.


> > +
> >                 if (h_proto == htons(ETH_P_IP))
> >                         ip_decrease_ttl(iph);
> >                 else if (h_proto == htons(ETH_P_IPV6))
> > diff --git a/samples/bpf/xdp_fwd_user.c b/samples/bpf/xdp_fwd_user.c
> > index ba012d9f93dd..20951bc27477 100644
> > --- a/samples/bpf/xdp_fwd_user.c
> > +++ b/samples/bpf/xdp_fwd_user.c
> > @@ -27,14 +27,20 @@
> >  #include "libbpf.h"
> >  #include <bpf/bpf.h>
> >
> > -
> > -static int do_attach(int idx, int fd, const char *name)
> > +static int do_attach(int idx, int prog_fd, int map_fd, const char
> > *name) {
> >         int err;
> >
> > -       err = bpf_set_link_xdp_fd(idx, fd, 0);
> > -       if (err < 0)
> > +       err = bpf_set_link_xdp_fd(idx, prog_fd, 0);
> > +       if (err < 0) {
> >                 printf("ERROR: failed to attach program to %s\n",
> > name);
> > +               return err;
> > +       }
> > +
> > +       /* Adding ifindex as a possible egress TX port */
> > +       err = bpf_map_update_elem(map_fd, &idx, &idx, 0);
> > +       if (err)
> > +               printf("ERROR: failed using device %s as
> > TX-port\n", name);
> >
> >         return err;
> >  }
> > @@ -47,6 +53,9 @@ static int do_detach(int idx, const char *name)
> >         if (err < 0)
> >                 printf("ERROR: failed to detach program from %s\n",
> > name);
> >
> > +       /* TODO: Remember to cleanup map, when adding use of shared
> > map
> > +        *  bpf_map_delete_elem((map_fd, &idx);
> > +        */
> >         return err;
> >  }
> >
> > @@ -67,10 +76,10 @@ int main(int argc, char **argv)
> >         };
> >         const char *prog_name = "xdp_fwd";
> >         struct bpf_program *prog;
> > +       int prog_fd, map_fd = -1;
> >         char filename[PATH_MAX];
> >         struct bpf_object *obj;
> >         int opt, i, idx, err;
> > -       int prog_fd, map_fd;
> >         int attach = 1;
> >         int ret = 0;
> >
> > @@ -103,8 +112,17 @@ int main(int argc, char **argv)
> >                         return 1;
> >                 }
> >
> > -               if (bpf_prog_load_xattr(&prog_load_attr, &obj,
> > &prog_fd))
> > +               err = bpf_prog_load_xattr(&prog_load_attr, &obj,
> > &prog_fd);
> > +               if (err) {
> > +                       if (err == -22) {  
> 
> -EINVAL?

Yes.

> For -EINVAL, many things could go wrong. But maybe the blow error
> is the most common one so I am fine with that.

Yes, it is rather sad, that we don't have better/more return codes,
such that we can react better to these.

E.g. if this was part of a open source project (external to the
kernel), I could have two XDP-BPF programs, one that use this feature
and one that don't.  If I had a more specific return code, then I could
load the other if the first failed.


> > +                               printf("Does kernel support devmap lookup?\n");
> > +                               /* If not, the error message will be:
> > +                                * "cannot pass map_type 14 into func
> > +                                * bpf_map_lookup_elem#1"
> > +                                */
> > +                       }
> >                         return 1;
> > +               }
> >
> >                 prog = bpf_object__find_program_by_title(obj, prog_name);
> >                 prog_fd = bpf_program__fd(prog);
> > @@ -119,10 +137,6 @@ int main(int argc, char **argv)
> >                         return 1;
> >                 }
> >         }
> > -       if (attach) {
> > -               for (i = 1; i < 64; ++i)
> > -                       bpf_map_update_elem(map_fd, &i, &i, 0);
> > -       }
> >
> >         for (i = optind; i < argc; ++i) {
> >                 idx = if_nametoindex(argv[i]);
> > @@ -138,7 +152,7 @@ int main(int argc, char **argv)
> >                         if (err)
> >                                 ret = err;
> >                 } else {
> > -                       err = do_attach(idx, prog_fd, argv[i]);
> > +                       err = do_attach(idx, prog_fd, map_fd, argv[i]);
> >                         if (err)
> >                                 ret = err;
> >                 }
> >  

I'll send a V2

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH v2] Fix non-kerneldoc comment in realtek/rtlwifi/usb.c
From: Kalle Valo @ 2019-08-08  7:10 UTC (permalink / raw)
  To: Larry Finger
  Cc: Valdis Klētnieks, Ping-Ke Shih, David S. Miller,
	linux-wireless, netdev, linux-kernel
In-Reply-To: <15df2564-8815-f351-8fb2-b46611a90234@lwfinger.net>

Larry Finger <Larry.Finger@lwfinger.net> writes:

> On 8/7/19 8:51 PM, Valdis Klētnieks wrote:
>> Fix spurious warning message when building with W=1:
>>
>>    CC [M]  drivers/net/wireless/realtek/rtlwifi/usb.o
>> drivers/net/wireless/realtek/rtlwifi/usb.c:243: warning: Cannot understand  * on line 243 - I thought it was a doc line
>> drivers/net/wireless/realtek/rtlwifi/usb.c:760: warning: Cannot understand  * on line 760 - I thought it was a doc line
>> drivers/net/wireless/realtek/rtlwifi/usb.c:790: warning: Cannot understand  * on line 790 - I thought it was a doc line
>>
>> Clean up the comment format.
>>
>> Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
>>
>> ---
>> Changes since v1:  Larry Finger pointed out the patch wasn't checkpatch-clean.
>>
>> diff --git a/drivers/net/wireless/realtek/rtlwifi/usb.c b/drivers/net/wireless/realtek/rtlwifi/usb.c
>> index 34d68dbf4b4c..4b59f3b46b28 100644
>> --- a/drivers/net/wireless/realtek/rtlwifi/usb.c
>> +++ b/drivers/net/wireless/realtek/rtlwifi/usb.c
>> @@ -239,10 +239,7 @@ static void _rtl_usb_io_handler_release(struct ieee80211_hw *hw)
>>   	mutex_destroy(&rtlpriv->io.bb_mutex);
>>   }
>>   -/**
>> - *
>> - *	Default aggregation handler. Do nothing and just return the oldest skb.
>> - */
>> +/*	Default aggregation handler. Do nothing and just return the oldest skb.  */
>>   static struct sk_buff *_none_usb_tx_aggregate_hdl(struct ieee80211_hw *hw,
>>   						  struct sk_buff_head *list)
>>   {
>> @@ -756,11 +753,6 @@ static int rtl_usb_start(struct ieee80211_hw *hw)
>>   	return err;
>>   }
>>   -/**
>> - *
>> - *
>> - */
>> -
>>   /*=======================  tx =========================================*/
>>   static void rtl_usb_cleanup(struct ieee80211_hw *hw)
>>   {
>> @@ -786,11 +778,7 @@ static void rtl_usb_cleanup(struct ieee80211_hw *hw)
>>   	usb_kill_anchored_urbs(&rtlusb->tx_submitted);
>>   }
>>   -/**
>> - *
>> - * We may add some struct into struct rtl_usb later. Do deinit here.
>> - *
>> - */
>> +/* We may add some struct into struct rtl_usb later. Do deinit here.  */
>>   static void rtl_usb_deinit(struct ieee80211_hw *hw)
>>   {
>>   	rtl_usb_cleanup(hw);
>
> I missed that the subject line should be "rtwifi: Fix ....". Otherwise it is OK.

I can fix the subject during commit.

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH bpf-next 1/3] bpf: support cloning sk storage on accept()
From: Martin Lau @ 2019-08-08  7:11 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Yonghong Song, Stanislav Fomichev, netdev@vger.kernel.org,
	bpf@vger.kernel.org, davem@davemloft.net, ast@kernel.org,
	daniel@iogearbox.net
In-Reply-To: <20190808000533.GA2820@mini-arch>

On Wed, Aug 07, 2019 at 05:05:33PM -0700, Stanislav Fomichev wrote:
[ ... ]
> > > +int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
> > > +{
> > > +	struct bpf_sk_storage *new_sk_storage = NULL;
> > > +	struct bpf_sk_storage *sk_storage;
> > > +	struct bpf_sk_storage_elem *selem;
> > > +	int ret;
> > > +
> > > +	RCU_INIT_POINTER(newsk->sk_bpf_storage, NULL);
> > > +
> > > +	rcu_read_lock();
> > > +	sk_storage = rcu_dereference(sk->sk_bpf_storage);
> > > +
> > > +	if (!sk_storage || hlist_empty(&sk_storage->list))
> > > +		goto out;
> > > +
> > > +	hlist_for_each_entry_rcu(selem, &sk_storage->list, snode) {
> > > +		struct bpf_sk_storage_map *smap;
> > > +		struct bpf_sk_storage_elem *copy_selem;
> > > +
> > > +		if (!selem->clone)
> > > +			continue;
> > > +
> > > +		smap = rcu_dereference(SDATA(selem)->smap);
> > > +		if (!smap)
> > > +			continue;
> > > +
> > > +		copy_selem = bpf_sk_storage_clone_elem(newsk, smap, selem);
> > > +		if (IS_ERR(copy_selem)) {
> > > +			ret = PTR_ERR(copy_selem);
> > > +			goto err;
> > > +		}
> > > +
> > > +		if (!new_sk_storage) {
> > > +			ret = sk_storage_alloc(newsk, smap, copy_selem);
> > > +			if (ret) {
> > > +				kfree(copy_selem);
> > > +				atomic_sub(smap->elem_size,
> > > +					   &newsk->sk_omem_alloc);
> > > +				goto err;
> > > +			}
> > > +
> > > +			new_sk_storage = rcu_dereference(copy_selem->sk_storage);
> > > +			continue;
> > > +		}
> > > +
> > > +		raw_spin_lock_bh(&new_sk_storage->lock);
> > > +		selem_link_map(smap, copy_selem);
> > > +		__selem_link_sk(new_sk_storage, copy_selem);
> > > +		raw_spin_unlock_bh(&new_sk_storage->lock);
> > 
> > Considering in this particular case, new socket is not visible to 
> > outside world yet (both kernel and user space), map_delete/map_update
> > operations are not applicable in this situation, so
> > the above raw_spin_lock_bh() probably not needed.
> I agree, it's doing nothing, but __selem_link_sk has the following comment:
> /* sk_storage->lock must be held and sk_storage->list cannot be empty */
I believe I may have forgotten to remove this comment after reusing it
in sk_storage_alloc() which also has a similar no-lock-needed situation
like here.

I would also prefer not to acquire the new_sk_storage->lock here to avoid
confusion when investigating racing bugs in the future.

> 
> Just wanted to keep that invariant for this call site as well (in case
> we add some lockdep enforcement or smth else). WDYT?
> 
> > > +	}
> > > +
> > > +out:
> > > +	rcu_read_unlock();
> > > +	return 0;
> > > +
> > > +err:
> > > +	rcu_read_unlock();
> > > +
> > > +	bpf_sk_storage_free(newsk);
> > > +	return ret;
> > > +}
> > > +
> > >   BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
> > >   	   void *, value, u64, flags)
> > >   {
> > >   	struct bpf_sk_storage_data *sdata;
> > >   
> > > -	if (flags > BPF_SK_STORAGE_GET_F_CREATE)
> > > +	if (flags & ~BPF_SK_STORAGE_GET_F_MASK)
> > > +		return (unsigned long)NULL;
> > > +
> > > +	if ((flags & BPF_SK_STORAGE_GET_F_CLONE) &&
> > > +	    !(flags & BPF_SK_STORAGE_GET_F_CREATE))
> > >   		return (unsigned long)NULL;
> > >   
> > >   	sdata = sk_storage_lookup(sk, map, true);
> > >   	if (sdata)
> > >   		return (unsigned long)sdata->data;
> > >   
> > > -	if (flags == BPF_SK_STORAGE_GET_F_CREATE &&
> > > +	if ((flags & BPF_SK_STORAGE_GET_F_CREATE) &&
> > >   	    /* Cannot add new elem to a going away sk.
> > >   	     * Otherwise, the new elem may become a leak
> > >   	     * (and also other memory issues during map
> > > @@ -762,6 +853,9 @@ BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
> > >   		/* sk must be a fullsock (guaranteed by verifier),
> > >   		 * so sock_gen_put() is unnecessary.
> > >   		 */
> > > +		if (!IS_ERR(sdata))
> > > +			SELEM(sdata)->clone =
> > > +				!!(flags & BPF_SK_STORAGE_GET_F_CLONE);
> > >   		sock_put(sk);
> > >   		return IS_ERR(sdata) ?
> > >   			(unsigned long)NULL : (unsigned long)sdata->data;
> > > diff --git a/net/core/sock.c b/net/core/sock.c
> > > index d57b0cc995a0..f5e801a9cea4 100644
> > > --- a/net/core/sock.c
> > > +++ b/net/core/sock.c
> > > @@ -1851,9 +1851,12 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
> > >   			goto out;
> > >   		}
> > >   		RCU_INIT_POINTER(newsk->sk_reuseport_cb, NULL);
> > > -#ifdef CONFIG_BPF_SYSCALL
> > > -		RCU_INIT_POINTER(newsk->sk_bpf_storage, NULL);
> > > -#endif
> > > +
> > > +		if (bpf_sk_storage_clone(sk, newsk)) {
> > > +			sk_free_unlock_clone(newsk);
> > > +			newsk = NULL;
> > > +			goto out;
> > > +		}
> > >   
> > >   		newsk->sk_err	   = 0;
> > >   		newsk->sk_err_soft = 0;
> > > 

^ permalink raw reply

* Re: [v3,1/4] tools: bpftool: add net attach command to attach XDP on interface
From: Daniel T. Lee @ 2019-08-07 22:15 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Daniel Borkmann, Alexei Starovoitov, netdev
In-Reply-To: <20190807134208.6601fad2@cakuba.netronome.com>

On Thu, Aug 8, 2019 at 5:42 AM Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> On Wed,  7 Aug 2019 11:25:06 +0900, Daniel T. Lee wrote:
> > By this commit, using `bpftool net attach`, user can attach XDP prog on
> > interface. New type of enum 'net_attach_type' has been made, as stated at
> > cover-letter, the meaning of 'attach' is, prog will be attached on interface.
> >
> > With 'overwrite' option at argument, attached XDP program could be replaced.
> > Added new helper 'net_parse_dev' to parse the network device at argument.
> >
> > BPF prog will be attached through libbpf 'bpf_set_link_xdp_fd'.
> >
> > Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
> > ---
> >  tools/bpf/bpftool/net.c | 141 ++++++++++++++++++++++++++++++++++++----
> >  1 file changed, 130 insertions(+), 11 deletions(-)
> >
> > diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
> > index 67e99c56bc88..c05a3fac5cac 100644
> > --- a/tools/bpf/bpftool/net.c
> > +++ b/tools/bpf/bpftool/net.c
> > @@ -55,6 +55,35 @@ struct bpf_attach_info {
> >       __u32 flow_dissector_id;
> >  };
> >
> > +enum net_attach_type {
> > +     NET_ATTACH_TYPE_XDP,
> > +     NET_ATTACH_TYPE_XDP_GENERIC,
> > +     NET_ATTACH_TYPE_XDP_DRIVER,
> > +     NET_ATTACH_TYPE_XDP_OFFLOAD,
> > +};
> > +
> > +static const char * const attach_type_strings[] = {
> > +     [NET_ATTACH_TYPE_XDP]           = "xdp",
> > +     [NET_ATTACH_TYPE_XDP_GENERIC]   = "xdpgeneric",
> > +     [NET_ATTACH_TYPE_XDP_DRIVER]    = "xdpdrv",
> > +     [NET_ATTACH_TYPE_XDP_OFFLOAD]   = "xdpoffload",
> > +};
> > +
> > +const size_t max_net_attach_type = ARRAY_SIZE(attach_type_strings);
>
> Nit: in practice max_.._type is num_types - 1, so perhaps rename this
> to num_.. or such?
>

I can see at 'map.c', it declares ARRAY_SIZE with '_size' suffix.
         const size_t map_type_name_size = ARRAY_SIZE(map_type_name);

I'll change this variable name 'max_net_attach_type' to 'net_attach_type_size'.

> > +static enum net_attach_type parse_attach_type(const char *str)
> > +{
> > +     enum net_attach_type type;
> > +
> > +     for (type = 0; type < max_net_attach_type; type++) {
> > +             if (attach_type_strings[type] &&
> > +                is_prefix(str, attach_type_strings[type]))
>
>                    ^
> this is misaligned by one space
>
> Please try checkpatch with the --strict option to catch these.
>

I didn't know checkpatch has strict option.
Thanks for letting me know!

> > +                     return type;
> > +     }
> > +
> > +     return max_net_attach_type;
> > +}
> > +
> >  static int dump_link_nlmsg(void *cookie, void *msg, struct nlattr **tb)
> >  {
> >       struct bpf_netdev_t *netinfo = cookie;
> > @@ -223,6 +252,97 @@ static int query_flow_dissector(struct bpf_attach_info *attach_info)
> >       return 0;
> >  }
> >
> > +static int net_parse_dev(int *argc, char ***argv)
> > +{
> > +     int ifindex;
> > +
> > +     if (is_prefix(**argv, "dev")) {
> > +             NEXT_ARGP();
> > +
> > +             ifindex = if_nametoindex(**argv);
> > +             if (!ifindex)
> > +                     p_err("invalid devname %s", **argv);
> > +
> > +             NEXT_ARGP();
> > +     } else {
> > +             p_err("expected 'dev', got: '%s'?", **argv);
> > +             return -1;
> > +     }
> > +
> > +     return ifindex;
> > +}
> > +
> > +static int do_attach_detach_xdp(int progfd, enum net_attach_type attach_type,
> > +                             int ifindex, bool overwrite)
> > +{
> > +     __u32 flags = 0;
> > +
> > +     if (!overwrite)
> > +             flags = XDP_FLAGS_UPDATE_IF_NOEXIST;
> > +     if (attach_type == NET_ATTACH_TYPE_XDP_GENERIC)
> > +             flags |= XDP_FLAGS_SKB_MODE;
> > +     if (attach_type == NET_ATTACH_TYPE_XDP_DRIVER)
> > +             flags |= XDP_FLAGS_DRV_MODE;
> > +     if (attach_type == NET_ATTACH_TYPE_XDP_OFFLOAD)
> > +             flags |= XDP_FLAGS_HW_MODE;
> > +
> > +     return bpf_set_link_xdp_fd(ifindex, progfd, flags);
> > +}
> > +
> > +static int do_attach(int argc, char **argv)
> > +{
> > +     enum net_attach_type attach_type;
> > +     int progfd, ifindex, err = 0;
> > +     bool overwrite = false;
> > +
> > +     /* parse attach args */
> > +     if (!REQ_ARGS(5))
> > +             return -EINVAL;
> > +
> > +     attach_type = parse_attach_type(*argv);
> > +     if (attach_type == max_net_attach_type) {
> > +             p_err("invalid net attach/detach type");
>
> worth adding the type to the error message so that user know which part
> of command line was wrong:
>
>         p_err("invalid net attach/detach type '%s'", *argv);
>

It sounds reasonable.
I'll update the error message.


> > +             return -EINVAL;
> > +     }
> > +
> > +     NEXT_ARG();
>
> nit: the new line should be before NEXT_ARG(), IOV NEXT_ARG() belongs
> to the code which consumed the argument
>

I'm not sure I'm following.
Are you saying that, at here the newline shouldn't be necessary?

> > +     progfd = prog_parse_fd(&argc, &argv);
> > +     if (progfd < 0)
> > +             return -EINVAL;
> > +
> > +     ifindex = net_parse_dev(&argc, &argv);
> > +     if (ifindex < 1) {
> > +             close(progfd);
> > +             return -EINVAL;
> > +     }
> > +
> > +     if (argc) {
> > +             if (is_prefix(*argv, "overwrite")) {
> > +                     overwrite = true;
> > +             } else {
> > +                     p_err("expected 'overwrite', got: '%s'?", *argv);
> > +                     close(progfd);
> > +                     return -EINVAL;
> > +             }
> > +     }
> > +
> > +     /* attach xdp prog */
> > +     if (is_prefix("xdp", attach_type_strings[attach_type]))
>
> I'm still unclear on why this if is needed
>

Just an code structure that shows extensibility for other attachment types.
Well, for now there's no other type than XDP, so it's not necessary.

> > +             err = do_attach_detach_xdp(progfd, attach_type, ifindex,
> > +                                        overwrite);
> > +
> > +     if (err < 0) {
> > +             p_err("interface %s attach failed",
> > +                   attach_type_strings[attach_type]);
>
> Please add the error string, like:
>
>                 p_err("interface %s attach failed: %s",
>                       attach_type_strings[attach_type], strerror(errno));
>
>

Oh. Didn't think of propagate errno to error message.
I'll update it right away.

> > +             return err;
> > +     }
> > +
> > +     if (json_output)
> > +             jsonw_null(json_wtr);
> > +
> > +     return 0;
> > +}
> > +
> >  static int do_show(int argc, char **argv)
> >  {
> >       struct bpf_attach_info attach_info = {};
> > @@ -231,17 +351,10 @@ static int do_show(int argc, char **argv)
> >       unsigned int nl_pid;
> >       char err_buf[256];
> >
> > -     if (argc == 2) {
> > -             if (strcmp(argv[0], "dev") != 0)
> > -                     usage();
> > -             filter_idx = if_nametoindex(argv[1]);
> > -             if (filter_idx == 0) {
> > -                     fprintf(stderr, "invalid dev name %s\n", argv[1]);
> > -                     return -1;
> > -             }
> > -     } else if (argc != 0) {
> > +     if (argc == 2)
> > +             filter_idx = net_parse_dev(&argc, &argv);
>
> You should check filter_idx is not negative here, no?
>

You're right.
I'll update it.

> > +     else if (argc != 0)
> >               usage();
> > -     }
> >
> >       ret = query_flow_dissector(&attach_info);
> >       if (ret)

Thank you for your assistance.

^ permalink raw reply

* Re: [RFC PATCH] kbuild: re-implement detection of CONFIG options leaked to user-space
From: Christoph Hellwig @ 2019-08-08  7:45 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Masahiro Yamada, Linux Kbuild mailing list, Sam Ravnborg,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, bpf, Linux Kernel Mailing List, Networking,
	Amit Pundir, David Howells
In-Reply-To: <CAK8P3a2POcb+AReLKib513i_RTN9kLM_Tun7+G5LOacDuy7gjQ@mail.gmail.com>

On Tue, Aug 06, 2019 at 11:00:19AM +0200, Arnd Bergmann wrote:
> > I was playing with sed yesterday, but the resulted code might be unreadable.
> >
> > Sed scripts tend to be somewhat unreadable.
> > I just wondered which language is appropriate for this?
> > Maybe perl, or what else? I am not good at perl, though.
> 
> I like the sed version, in particular as it seems to do the job and
> I'm not volunteering to write it in anything else.

Did anyone not like sed?  I have to say I do like scripts using sed and
awk because they are fairly readable and avoid dependencies on "big"
scripting language and their optional modules that sooner or later get
pulled in.

> This one is nontrivial, since it defines two incompatible layouts for
> this structure,
> and the fdpic version is currently not usable at all from user space. Also,
> the definition breaks configurations that have both CONFIG_BINFMT_ELF
> and CONFIG_BINFMT_ELF_FDPIC enabled, which has become possible
> with commit 382e67aec6a7 ("ARM: enable elf_fdpic on systems with an MMU").
> 
> The best way forward I see is to duplicate the structure definition, adding
> a new 'struct elf_fdpic_prstatus', and using that in fs/binfmt_elf_fdpic.c.
> The same change is required in include/linux/elfcore-compat.h.

Yeah, this is a mess.  David Howells suggested something similar when
I brought the issue to his attention last time.

^ permalink raw reply

* Re: [PATCH v2 1/1] net: bridge: use mac_len in bridge forwarding
From: Zahari Doychev @ 2019-08-08  8:04 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: netdev, bridge, roopa, jhs, dsahern, simon.horman,
	makita.toshiaki, xiyou.wangcong, jiri, ast, johannes,
	alexei.starovoitov
In-Reply-To: <48058179-9690-54e2-f60c-c372446bfde9@cumulusnetworks.com>

On Wed, Aug 07, 2019 at 12:17:43PM +0300, Nikolay Aleksandrov wrote:
> Hi Zahari,
> On 05/08/2019 18:37, Zahari Doychev wrote:
> > The bridge code cannot forward packets from various paths that set up the
> > SKBs in different ways. Some of these packets get corrupted during the
> > forwarding as not always is just ETH_HLEN pulled at the front. This happens
> > e.g. when VLAN tags are pushed bu using tc act_vlan on ingress.
> Overall the patch looks good, I think it shouldn't introduce any regressions
> at least from the codepaths I was able to inspect, but please include more
> details in here from the cover letter, in fact you don't need it just add all of
> the details here so we have them, especially the test setup. Also please provide
> some details how this patch was tested. It'd be great if you could provide a
> selftest for it so we can make sure it's considered when doing future changes.

Hi Nik,

Thanks for the reply. I will do the suggested corrections and try creating a
selftest. I assume it should go to the net/forwarding together with the other
bridge tests as a separate patch.

Regards,
Zahari

> 
> Thank you,
>  Nik
> 
> > 
> > The problem is fixed by using skb->mac_len instead of ETH_HLEN, which makes
> > sure that the skb headers are correctly restored. This usually does not
> > change anything, execpt the local bridge transmits which now need to set
> > the skb->mac_len correctly in br_dev_xmit, as well as the broken case noted
> > above.
> > 
> > Signed-off-by: Zahari Doychev <zahari.doychev@linux.com>
> > ---
> >  net/bridge/br_device.c  | 3 ++-
> >  net/bridge/br_forward.c | 4 ++--
> >  net/bridge/br_vlan.c    | 3 ++-
> >  3 files changed, 6 insertions(+), 4 deletions(-)
> > 
> > diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
> > index 681b72862c16..aeb77ff60311 100644
> > --- a/net/bridge/br_device.c
> > +++ b/net/bridge/br_device.c
> > @@ -55,8 +55,9 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
> >  	BR_INPUT_SKB_CB(skb)->frag_max_size = 0;
> >  
> >  	skb_reset_mac_header(skb);
> > +	skb_reset_mac_len(skb);
> >  	eth = eth_hdr(skb);
> > -	skb_pull(skb, ETH_HLEN);
> > +	skb_pull(skb, skb->mac_len);
> >  
> >  	if (!br_allowed_ingress(br, br_vlan_group_rcu(br), skb, &vid))
> >  		goto out;
> > diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
> > index 86637000f275..edb4f3533f05 100644
> > --- a/net/bridge/br_forward.c
> > +++ b/net/bridge/br_forward.c
> > @@ -32,7 +32,7 @@ static inline int should_deliver(const struct net_bridge_port *p,
> >  
> >  int br_dev_queue_push_xmit(struct net *net, struct sock *sk, struct sk_buff *skb)
> >  {
> > -	skb_push(skb, ETH_HLEN);
> > +	skb_push(skb, skb->mac_len);
> >  	if (!is_skb_forwardable(skb->dev, skb))
> >  		goto drop;
> >  
> > @@ -94,7 +94,7 @@ static void __br_forward(const struct net_bridge_port *to,
> >  		net = dev_net(indev);
> >  	} else {
> >  		if (unlikely(netpoll_tx_running(to->br->dev))) {
> > -			skb_push(skb, ETH_HLEN);
> > +			skb_push(skb, skb->mac_len);
> >  			if (!is_skb_forwardable(skb->dev, skb))
> >  				kfree_skb(skb);
> >  			else
> > diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
> > index 021cc9f66804..88244c9cc653 100644
> > --- a/net/bridge/br_vlan.c
> > +++ b/net/bridge/br_vlan.c
> > @@ -466,13 +466,14 @@ static bool __allowed_ingress(const struct net_bridge *br,
> >  		/* Tagged frame */
> >  		if (skb->vlan_proto != br->vlan_proto) {
> >  			/* Protocol-mismatch, empty out vlan_tci for new tag */
> > -			skb_push(skb, ETH_HLEN);
> > +			skb_push(skb, skb->mac_len);
> >  			skb = vlan_insert_tag_set_proto(skb, skb->vlan_proto,
> >  							skb_vlan_tag_get(skb));
> >  			if (unlikely(!skb))
> >  				return false;
> >  
> >  			skb_pull(skb, ETH_HLEN);
> > +			skb_reset_network_header(skb);
> >  			skb_reset_mac_len(skb);
> >  			*vid = 0;
> >  			tagged = false;
> > 
> 

^ permalink raw reply

* Re: [PATCH v2] Fix non-kerneldoc comment in realtek/rtlwifi/usb.c
From: Larry Finger @ 2019-08-08  8:14 UTC (permalink / raw)
  To: Kalle Valo
  Cc: Valdis Klētnieks, Ping-Ke Shih, David S. Miller,
	linux-wireless, netdev, linux-kernel
In-Reply-To: <877e7odte2.fsf@kamboji.qca.qualcomm.com>

On 8/8/19 2:10 AM, Kalle Valo wrote:
> Larry Finger <Larry.Finger@lwfinger.net> writes:
> 
>> On 8/7/19 8:51 PM, Valdis Klētnieks wrote:
>>> Fix spurious warning message when building with W=1:
>>>
>>>     CC [M]  drivers/net/wireless/realtek/rtlwifi/usb.o
>>> drivers/net/wireless/realtek/rtlwifi/usb.c:243: warning: Cannot understand  * on line 243 - I thought it was a doc line
>>> drivers/net/wireless/realtek/rtlwifi/usb.c:760: warning: Cannot understand  * on line 760 - I thought it was a doc line
>>> drivers/net/wireless/realtek/rtlwifi/usb.c:790: warning: Cannot understand  * on line 790 - I thought it was a doc line
>>>
>>> Clean up the comment format.
>>>
>>> Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
>>>
>>> ---
>>> Changes since v1:  Larry Finger pointed out the patch wasn't checkpatch-clean.
>>>
>>> diff --git a/drivers/net/wireless/realtek/rtlwifi/usb.c b/drivers/net/wireless/realtek/rtlwifi/usb.c
>>> index 34d68dbf4b4c..4b59f3b46b28 100644
>>> --- a/drivers/net/wireless/realtek/rtlwifi/usb.c
>>> +++ b/drivers/net/wireless/realtek/rtlwifi/usb.c
>>> @@ -239,10 +239,7 @@ static void _rtl_usb_io_handler_release(struct ieee80211_hw *hw)
>>>    	mutex_destroy(&rtlpriv->io.bb_mutex);
>>>    }
>>>    -/**
>>> - *
>>> - *	Default aggregation handler. Do nothing and just return the oldest skb.
>>> - */
>>> +/*	Default aggregation handler. Do nothing and just return the oldest skb.  */
>>>    static struct sk_buff *_none_usb_tx_aggregate_hdl(struct ieee80211_hw *hw,
>>>    						  struct sk_buff_head *list)
>>>    {
>>> @@ -756,11 +753,6 @@ static int rtl_usb_start(struct ieee80211_hw *hw)
>>>    	return err;
>>>    }
>>>    -/**
>>> - *
>>> - *
>>> - */
>>> -
>>>    /*=======================  tx =========================================*/
>>>    static void rtl_usb_cleanup(struct ieee80211_hw *hw)
>>>    {
>>> @@ -786,11 +778,7 @@ static void rtl_usb_cleanup(struct ieee80211_hw *hw)
>>>    	usb_kill_anchored_urbs(&rtlusb->tx_submitted);
>>>    }
>>>    -/**
>>> - *
>>> - * We may add some struct into struct rtl_usb later. Do deinit here.
>>> - *
>>> - */
>>> +/* We may add some struct into struct rtl_usb later. Do deinit here.  */
>>>    static void rtl_usb_deinit(struct ieee80211_hw *hw)
>>>    {
>>>    	rtl_usb_cleanup(hw);
>>
>> I missed that the subject line should be "rtwifi: Fix ....". Otherwise it is OK.
> 
> I can fix the subject during commit.

OK. Acked-by: Larry Finger<Larry.Finger@lwfinger.net>

Thanks,

Larry



^ permalink raw reply

* RE: Clause 73 and USXGMII
From: Jose Abreu @ 2019-08-08  8:17 UTC (permalink / raw)
  To: netdev@vger.kernel.org
  Cc: Andrew Lunn, Florian Fainelli, Heiner Kallweit, Russell King
In-Reply-To: <BN8PR12MB32662070AAC1C34BA47C0763D3D40@BN8PR12MB3266.namprd12.prod.outlook.com>

++ PHY Experts

From: Jose Abreu <joabreu@synopsys.com>
Date: Aug/07/2019, 16:46:23 (UTC+00:00)

> Hello,
> 
> I've some sample code for Clause 73 support using Synopsys based XPCS 
> but I would like to clarify some things that I noticed.
> 
> I'm using USXGMII as interface and a single SERDES that operates at 10G 
> rate but MAC side is working at 2.5G. Maximum available bandwidth is 
> therefore 2.5Gbps.
> 
> So, I configure USXGMII for 2.5G mode and it works but if I try to limit 
> the autoneg abilities to 2.5G max then it never finishes:
> # ethtool enp4s0
> Settings for enp4s0:
> 	Supported ports: [ ]
> 	Supported link modes:   1000baseKX/Full 
> 	                        2500baseX/Full 
> 	Supported pause frame use: Symmetric Receive-only
> 	Supports auto-negotiation: Yes
> 	Supported FEC modes: Not reported
> 	Advertised link modes:  1000baseKX/Full 
> 	                        2500baseX/Full 
> 	Advertised pause frame use: Symmetric Receive-only
> 	Advertised auto-negotiation: Yes
> 	Advertised FEC modes: Not reported
> 	Speed: Unknown!
> 	Duplex: Unknown! (255)
> 	Port: MII
> 	PHYAD: 0
> 	Transceiver: internal
> 	Auto-negotiation: on
> 	Supports Wake-on: ug
> 	Wake-on: d
> 	Current message level: 0x0000003f (63)
> 			       drv probe link timer ifdown ifup
> 	Link detected: no
> 
> When I do not limit autoneg and I say that maximum limit is 10G then I 
> get Link Up and autoneg finishes with this outcome:
> # ethtool enp4s0
> Settings for enp4s0:
> 	Supported ports: [ ]
> 	Supported link modes:   1000baseKX/Full 
> 	                        2500baseX/Full 
> 	                        10000baseKX4/Full 
> 	                        10000baseKR/Full 
> 	Supported pause frame use: Symmetric Receive-only
> 	Supports auto-negotiation: Yes
> 	Supported FEC modes: Not reported
> 	Advertised link modes:  1000baseKX/Full 
> 	                        2500baseX/Full 
> 	                        10000baseKX4/Full 
> 	                        10000baseKR/Full 
> 	Advertised pause frame use: Symmetric Receive-only
> 	Advertised auto-negotiation: Yes
> 	Advertised FEC modes: Not reported
> 	Link partner advertised link modes:  1000baseKX/Full 
> 	                                     2500baseX/Full 
> 	                                     10000baseKX4/Full 
> 	                                     10000baseKR/Full 
> 	Link partner advertised pause frame use: Symmetric Receive-only
> 	Link partner advertised auto-negotiation: Yes
> 	Link partner advertised FEC modes: Not reported
> 	Speed: 2500Mb/s
> 	Duplex: Full
> 	Port: MII <- Never mind this, it's a SW issue
> 	PHYAD: 0
> 	Transceiver: internal
> 	Auto-negotiation: on
> 	Supports Wake-on: ug
> 	Wake-on: d
> 	Current message level: 0x0000003f (63)
> 			       drv probe link timer ifdown ifup
> 	Link detected: yes
> 
> I was expecting that, as MAC side is limited to 2.5G, I should set in 
> phylink the correct capabilities and then outcome of autoneg would only 
> have up to 2.5G modes. Am I wrong ?
> 
> ---
> Thanks,
> Jose Miguel Abreu


---
Thanks,
Jose Miguel Abreu

^ permalink raw reply

* Re: Clause 73 and USXGMII
From: Russell King - ARM Linux admin @ 2019-08-08  8:26 UTC (permalink / raw)
  To: Jose Abreu
  Cc: netdev@vger.kernel.org, Andrew Lunn, Florian Fainelli,
	Heiner Kallweit
In-Reply-To: <BN8PR12MB3266A710111427071814D371D3D70@BN8PR12MB3266.namprd12.prod.outlook.com>

Hi,

Have you tried enabling debug mode in phylink (add #define DEBUG at the
top of the file) ?

Thanks.

On Thu, Aug 08, 2019 at 08:17:29AM +0000, Jose Abreu wrote:
> ++ PHY Experts
> 
> From: Jose Abreu <joabreu@synopsys.com>
> Date: Aug/07/2019, 16:46:23 (UTC+00:00)
> 
> > Hello,
> > 
> > I've some sample code for Clause 73 support using Synopsys based XPCS 
> > but I would like to clarify some things that I noticed.
> > 
> > I'm using USXGMII as interface and a single SERDES that operates at 10G 
> > rate but MAC side is working at 2.5G. Maximum available bandwidth is 
> > therefore 2.5Gbps.
> > 
> > So, I configure USXGMII for 2.5G mode and it works but if I try to limit 
> > the autoneg abilities to 2.5G max then it never finishes:
> > # ethtool enp4s0
> > Settings for enp4s0:
> > 	Supported ports: [ ]
> > 	Supported link modes:   1000baseKX/Full 
> > 	                        2500baseX/Full 
> > 	Supported pause frame use: Symmetric Receive-only
> > 	Supports auto-negotiation: Yes
> > 	Supported FEC modes: Not reported
> > 	Advertised link modes:  1000baseKX/Full 
> > 	                        2500baseX/Full 
> > 	Advertised pause frame use: Symmetric Receive-only
> > 	Advertised auto-negotiation: Yes
> > 	Advertised FEC modes: Not reported
> > 	Speed: Unknown!
> > 	Duplex: Unknown! (255)
> > 	Port: MII
> > 	PHYAD: 0
> > 	Transceiver: internal
> > 	Auto-negotiation: on
> > 	Supports Wake-on: ug
> > 	Wake-on: d
> > 	Current message level: 0x0000003f (63)
> > 			       drv probe link timer ifdown ifup
> > 	Link detected: no
> > 
> > When I do not limit autoneg and I say that maximum limit is 10G then I 
> > get Link Up and autoneg finishes with this outcome:
> > # ethtool enp4s0
> > Settings for enp4s0:
> > 	Supported ports: [ ]
> > 	Supported link modes:   1000baseKX/Full 
> > 	                        2500baseX/Full 
> > 	                        10000baseKX4/Full 
> > 	                        10000baseKR/Full 
> > 	Supported pause frame use: Symmetric Receive-only
> > 	Supports auto-negotiation: Yes
> > 	Supported FEC modes: Not reported
> > 	Advertised link modes:  1000baseKX/Full 
> > 	                        2500baseX/Full 
> > 	                        10000baseKX4/Full 
> > 	                        10000baseKR/Full 
> > 	Advertised pause frame use: Symmetric Receive-only
> > 	Advertised auto-negotiation: Yes
> > 	Advertised FEC modes: Not reported
> > 	Link partner advertised link modes:  1000baseKX/Full 
> > 	                                     2500baseX/Full 
> > 	                                     10000baseKX4/Full 
> > 	                                     10000baseKR/Full 
> > 	Link partner advertised pause frame use: Symmetric Receive-only
> > 	Link partner advertised auto-negotiation: Yes
> > 	Link partner advertised FEC modes: Not reported
> > 	Speed: 2500Mb/s
> > 	Duplex: Full
> > 	Port: MII <- Never mind this, it's a SW issue
> > 	PHYAD: 0
> > 	Transceiver: internal
> > 	Auto-negotiation: on
> > 	Supports Wake-on: ug
> > 	Wake-on: d
> > 	Current message level: 0x0000003f (63)
> > 			       drv probe link timer ifdown ifup
> > 	Link detected: yes
> > 
> > I was expecting that, as MAC side is limited to 2.5G, I should set in 
> > phylink the correct capabilities and then outcome of autoneg would only 
> > have up to 2.5G modes. Am I wrong ?
> > 
> > ---
> > Thanks,
> > Jose Miguel Abreu
> 
> 
> ---
> Thanks,
> Jose Miguel Abreu
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply

* Re: [PATCH v2 1/1] net: bridge: use mac_len in bridge forwarding
From: Nikolay Aleksandrov @ 2019-08-08  8:34 UTC (permalink / raw)
  To: Zahari Doychev
  Cc: netdev, bridge, roopa, jhs, dsahern, simon.horman,
	makita.toshiaki, xiyou.wangcong, jiri, ast, johannes,
	alexei.starovoitov
In-Reply-To: <20190808080428.o2eqqfdscl274sr5@tycho>

On 08/08/2019 11:04, Zahari Doychev wrote:
> On Wed, Aug 07, 2019 at 12:17:43PM +0300, Nikolay Aleksandrov wrote:
>> Hi Zahari,
>> On 05/08/2019 18:37, Zahari Doychev wrote:
>>> The bridge code cannot forward packets from various paths that set up the
>>> SKBs in different ways. Some of these packets get corrupted during the
>>> forwarding as not always is just ETH_HLEN pulled at the front. This happens
>>> e.g. when VLAN tags are pushed bu using tc act_vlan on ingress.
>> Overall the patch looks good, I think it shouldn't introduce any regressions
>> at least from the codepaths I was able to inspect, but please include more
>> details in here from the cover letter, in fact you don't need it just add all of
>> the details here so we have them, especially the test setup. Also please provide
>> some details how this patch was tested. It'd be great if you could provide a
>> selftest for it so we can make sure it's considered when doing future changes.
> 
> Hi Nik,
> 
> Thanks for the reply. I will do the suggested corrections and try creating a
> selftest. I assume it should go to the net/forwarding together with the other
> bridge tests as a separate patch.
> 
> Regards,
> Zahari
> 

Hi,
Yes, the selftest should target net-next and go to net/forwarding.

Thanks,
 Nik

>>
>> Thank you,
>>  Nik
>>
>>>
>>> The problem is fixed by using skb->mac_len instead of ETH_HLEN, which makes
>>> sure that the skb headers are correctly restored. This usually does not
>>> change anything, execpt the local bridge transmits which now need to set
>>> the skb->mac_len correctly in br_dev_xmit, as well as the broken case noted
>>> above.
>>>
>>> Signed-off-by: Zahari Doychev <zahari.doychev@linux.com>
>>> ---
>>>  net/bridge/br_device.c  | 3 ++-
>>>  net/bridge/br_forward.c | 4 ++--
>>>  net/bridge/br_vlan.c    | 3 ++-
>>>  3 files changed, 6 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
>>> index 681b72862c16..aeb77ff60311 100644
>>> --- a/net/bridge/br_device.c
>>> +++ b/net/bridge/br_device.c
>>> @@ -55,8 +55,9 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
>>>  	BR_INPUT_SKB_CB(skb)->frag_max_size = 0;
>>>  
>>>  	skb_reset_mac_header(skb);
>>> +	skb_reset_mac_len(skb);
>>>  	eth = eth_hdr(skb);
>>> -	skb_pull(skb, ETH_HLEN);
>>> +	skb_pull(skb, skb->mac_len);
>>>  
>>>  	if (!br_allowed_ingress(br, br_vlan_group_rcu(br), skb, &vid))
>>>  		goto out;
>>> diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
>>> index 86637000f275..edb4f3533f05 100644
>>> --- a/net/bridge/br_forward.c
>>> +++ b/net/bridge/br_forward.c
>>> @@ -32,7 +32,7 @@ static inline int should_deliver(const struct net_bridge_port *p,
>>>  
>>>  int br_dev_queue_push_xmit(struct net *net, struct sock *sk, struct sk_buff *skb)
>>>  {
>>> -	skb_push(skb, ETH_HLEN);
>>> +	skb_push(skb, skb->mac_len);
>>>  	if (!is_skb_forwardable(skb->dev, skb))
>>>  		goto drop;
>>>  
>>> @@ -94,7 +94,7 @@ static void __br_forward(const struct net_bridge_port *to,
>>>  		net = dev_net(indev);
>>>  	} else {
>>>  		if (unlikely(netpoll_tx_running(to->br->dev))) {
>>> -			skb_push(skb, ETH_HLEN);
>>> +			skb_push(skb, skb->mac_len);
>>>  			if (!is_skb_forwardable(skb->dev, skb))
>>>  				kfree_skb(skb);
>>>  			else
>>> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
>>> index 021cc9f66804..88244c9cc653 100644
>>> --- a/net/bridge/br_vlan.c
>>> +++ b/net/bridge/br_vlan.c
>>> @@ -466,13 +466,14 @@ static bool __allowed_ingress(const struct net_bridge *br,
>>>  		/* Tagged frame */
>>>  		if (skb->vlan_proto != br->vlan_proto) {
>>>  			/* Protocol-mismatch, empty out vlan_tci for new tag */
>>> -			skb_push(skb, ETH_HLEN);
>>> +			skb_push(skb, skb->mac_len);
>>>  			skb = vlan_insert_tag_set_proto(skb, skb->vlan_proto,
>>>  							skb_vlan_tag_get(skb));
>>>  			if (unlikely(!skb))
>>>  				return false;
>>>  
>>>  			skb_pull(skb, ETH_HLEN);
>>> +			skb_reset_network_header(skb);
>>>  			skb_reset_mac_len(skb);
>>>  			*vid = 0;
>>>  			tagged = false;
>>>
>>


^ permalink raw reply

* Re: net: micrel: confusion about phyids used in driver
From: Uwe Kleine-König @ 2019-08-08  8:36 UTC (permalink / raw)
  To: Yuiko.Oshino
  Cc: netdev, andrew, f.fainelli, kernel, hkallweit1, Ravi.Hegde,
	Tristram.Ha
In-Reply-To: <BL0PR11MB3251651EB9BC45DF4282D51D8EF80@BL0PR11MB3251.namprd11.prod.outlook.com>

Hello,

On Tue, Jul 02, 2019 at 08:55:07PM +0000, Yuiko.Oshino@microchip.com wrote:
> >On Fri, May 10, 2019 at 09:22:43AM +0200, Uwe Kleine-König wrote:
> >> On Thu, May 09, 2019 at 11:07:45PM +0200, Andrew Lunn wrote:
> >> > On Thu, May 09, 2019 at 10:55:29PM +0200, Heiner Kallweit wrote:
> >> > > On 09.05.2019 22:29, Uwe Kleine-König wrote:
> >> > > > I have a board here that has a KSZ8051MLL (datasheet:
> >> > > > http://ww1.microchip.com/downloads/en/DeviceDoc/ksz8051mll.pdf, phyid:
> >> > > > 0x0022155x) assembled. The actual phyid is 0x00221556.
> >> > >
> >> > > I think the datasheets are the source of the confusion. If the
> >> > > datasheets for different chips list 0x0022155x as PHYID each, and
> >> > > authors of support for additional chips don't check the existing
> >> > > code, then happens what happened.
> >> > >
> >> > > However it's not a rare exception and not Microchip-specific that
> >> > > sometimes vendors use the same PHYID for different chips.
> >>
> >> From the vendor's POV it is even sensible to reuse the phy IDs iff the
> >> chips are "compatible".
> >>
> >> Assuming that the last nibble of the phy ID actually helps to
> >> distinguish the different (not completely) compatible chips, we need
> >> some more detailed information than available in the data sheets I have.
> >> There is one person in the recipents of this mail with an
> >> @microchip.com address (hint, hint!).
> >
> >can you give some input here or forward to a person who can?
>
> I forward this to the team.

This thread still sits in my inbox waiting for some feedback. Did
something happen on your side?

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

^ permalink raw reply

* [PATCH rdma-next 0/4] Add XRQ and SRQ support to DEVX interface
From: Leon Romanovsky @ 2019-08-08  8:43 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Edward Srouji, Saeed Mahameed,
	Yishai Hadas, linux-netdev

From: Leon Romanovsky <leonro@mellanox.com>

Hi,

This small series extends DEVX interface with SRQ and XRQ legacy commands.

Thanks

Yishai Hadas (4):
  net/mlx5: Use debug message instead of warn
  net/mlx5: Add XRQ legacy commands opcodes
  IB/mlx5: Add legacy events to DEVX list
  IB/mlx5: Expose XRQ legacy commands over the DEVX interface

 drivers/infiniband/hw/mlx5/devx.c             | 12 ++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c |  4 ++++
 drivers/net/ethernet/mellanox/mlx5/core/qp.c  |  2 +-
 include/linux/mlx5/device.h                   |  9 +++++++++
 include/linux/mlx5/mlx5_ifc.h                 |  2 ++
 5 files changed, 28 insertions(+), 1 deletion(-)

--
2.20.1


^ permalink raw reply

* [PATCH mlx5-next 1/4] net/mlx5: Use debug message instead of warn
From: Leon Romanovsky @ 2019-08-08  8:43 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Edward Srouji, Saeed Mahameed,
	Yishai Hadas, linux-netdev
In-Reply-To: <20190808084358.29517-1-leon@kernel.org>

From: Yishai Hadas <yishaih@mellanox.com>

As QP may be created by DEVX, it may be valid to not find the rsn in
mlx5 core tree, change the level to be debug.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/qp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/qp.c b/drivers/net/ethernet/mellanox/mlx5/core/qp.c
index b8ba74de9555..f0f3abe331da 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/qp.c
@@ -162,7 +162,7 @@ static int rsc_event_notifier(struct notifier_block *nb,
 
 	common = mlx5_get_rsc(table, rsn);
 	if (!common) {
-		mlx5_core_warn(dev, "Async event for bogus resource 0x%x\n", rsn);
+		mlx5_core_dbg(dev, "Async event for unknown resource 0x%x\n", rsn);
 		return NOTIFY_OK;
 	}
 
-- 
2.20.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox