Netdev List


* Re: ip route JSON format is unparseable for "unreachable" routes
From: David Ahern @ 2019-07-29 21:16 UTC (permalink / raw)
  To: Michael Ziegler, netdev
In-Reply-To: <6e88311b-5edc-4c62-1581-0f5b160a5f4e@michaelziegler.name>

On 7/28/19 5:09 AM, Michael Ziegler wrote:
> Hi,
> 
> I created a couple "unreachable" routes on one of my systems, like such:
> 
>> ip route add unreachable 10.0.0.0/8     metric 255
>> ip route add unreachable 192.168.0.0/16 metric 255
> 
> Unfortunately this results in unparseable JSON output from "ip":
> 
>> # ip -j route show  | jq .
>> parse error: Objects must consist of key:value pairs at line 1, column 84
> 
> The offending JSON objects are these:
> 
>> {"unreachable","dst":"10.0.0.0/8","metric":255,"flags":[]}
>> {"unreachable","dst":"192.168.0.0/16","metric":255,"flags":[]}
> "unreachable" cannot appear on its own here, it needs to be some kind of
> field.
> 
> The manpage says to report here, thus I do :) I've searched the
> archives, but I wasn't able to find any existing bug reports about this.
> I'm running version
> 

actually that was fixed by:

073661773872 ip route: print route type in JSON output


* Re: [PATCH net] net: ipv6: Fix a bug in ndisc_send_ns when netdev only has a global address
From: David Miller @ 2019-07-29 21:17 UTC (permalink / raw)
  To: suyj.fnst; +Cc: kuznet, yoshfuji, netdev, dsahern
In-Reply-To: <1564368591-42301-1-git-send-email-suyj.fnst@cn.fujitsu.com>

From: Su Yanjun <suyj.fnst@cn.fujitsu.com>
Date: Mon, 29 Jul 2019 10:49:51 +0800

> When we send mpls packets and the interface only has a
> manual global ipv6 address, then the two hosts can't communicate.
> I find that in ndisc_send_ns it only tries to get a ll address.
> In my case, the execution path is as below.
> ip6_output
>  ->ip6_finish_output
>   ->lwtunnel_xmit
>    ->mpls_xmit
>     ->neigh_resolve_output
>      ->neigh_probe
>       ->ndisc_solicit
>        ->ndisc_send_ns
> 
> In RFC4861, 7.2.2 says
> "If the source address of the packet prompting the solicitation is the
> same as one of the addresses assigned to the outgoing interface, that
> address SHOULD be placed in the IP Source Address of the outgoing
> solicitation.  Otherwise, any one of the addresses assigned to the
> interface should be used."
> 
> In this patch we try to get a global address if getting the ll address failed.
> 
> Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>

David, can you take a quick look at this?

Thank you.


* Re: [PATCH bpf-next 10/10] selftests/bpf: add CO-RE relocs ints tests
From: Song Liu @ 2019-07-29 21:21 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann,
	Yonghong Song, Andrii Nakryiko, Kernel Team
In-Reply-To: <20190724192742.1419254-11-andriin@fb.com>

On Wed, Jul 24, 2019 at 1:34 PM Andrii Nakryiko <andriin@fb.com> wrote:
>
> Add various tests validating handling compatible/incompatible integer
> types.
>
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

Acked-by: Song Liu <songliubraving@fb.com>


* Re: [PATCH 1/3 net-next] linux: Add skb_frag_t page_offset accessors
From: Jakub Kicinski @ 2019-07-29 21:22 UTC (permalink / raw)
  To: Jonathan Lemon; +Cc: davem, kernel-team, netdev, Matthew Wilcox
In-Reply-To: <932D725D-62F1-47D6-807A-45F81E66B1C6@gmail.com>

On Mon, 29 Jul 2019 14:02:21 -0700, Jonathan Lemon wrote:
> On 29 Jul 2019, at 13:50, Jakub Kicinski wrote:
> > On Mon, 29 Jul 2019 10:19:39 -0700, Jonathan Lemon wrote:  
> >> Add skb_frag_off(), skb_frag_off_add(), skb_frag_off_set(),
> >> and skb_frag_off_set_from() accessors for page_offset.
> >>
> >> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>

> >> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> >> index 718742b1c505..7d94a78067ee 100644
> >> --- a/include/linux/skbuff.h
> >> +++ b/include/linux/skbuff.h
> >> @@ -331,7 +331,7 @@ static inline void skb_frag_size_set(skb_frag_t *frag, unsigned int size)
> >>  }
> >>
> >>  /**
> >> - * skb_frag_size_add - Incrementes the size of a skb fragment by %delta
> >> + * skb_frag_size_add - Increments the size of a skb fragment by %delta
> >>   * @frag: skb fragment
> >>   * @delta: value to add
> >>   */
> >> @@ -2857,6 +2857,46 @@ static inline void skb_propagate_pfmemalloc(struct page *page,
> >>  		skb->pfmemalloc = true;
> >>  }
> >>
> >> +/**
> >> + * skb_frag_off - Returns the offset of a skb fragment
> >> + * @frag: the paged fragment
> >> + */
> >> +static inline unsigned int skb_frag_off(const skb_frag_t *frag)
> >> +{
> >> +	return frag->page_offset;
> >> +}
> >> +
> >> +/**
> >> + * skb_frag_off_add - Increments the offset of a skb fragment by %delta
> >
> > I realize you're following the existing code, but should we perhaps use
> > the latest kdoc syntax? '()' after function name, and args should have
> > '@' prefix, '%' would be for constants.
> 
> That would be a task for a different cleanup.  Not that I disagree with 
> you, but there's also nothing worse than mixing styles in the same file.

Funny you should say that given that (a) I'm commenting on the new code
you're adding, and (b) you did do an unrelated spelling fix above ;)

> >> + * @frag: skb fragment
> >> + * @delta: value to add
> >> + */
> >> +static inline void skb_frag_off_add(skb_frag_t *frag, int delta)
> >> +{
> >> +	frag->page_offset += delta;
> >> +}
> >> +
> >> +/**
> >> + * skb_frag_off_set - Sets the offset of a skb fragment
> >> + * @frag: skb fragment
> >> + * @offset: offset of fragment
> >> + */
> >> +static inline void skb_frag_off_set(skb_frag_t *frag, unsigned int offset)
> >> +{
> >> +	frag->page_offset = offset;
> >> +}
> >> +
> >> +/**
> >> + * skb_frag_off_set_from - Sets the offset of a skb fragment from another fragment
> >> + * @fragto: skb fragment where offset is set
> >> + * @fragfrom: skb fragment offset is copied from
> >> + */
> >> +static inline void skb_frag_off_set_from(skb_frag_t *fragto,
> >> +					 const skb_frag_t *fragfrom)  
> >
> > skb_frag_off_copy() ?  
> 
> That was my initial inclination, but due to the often overloaded
> connotations of the word "copy", opted to use the same "set" verbiage
> that existed in the other functions.

There is no need to ponder the connotations of verbs. Please just 
look at other function names in skbuff.h, especially those which 
copy fields :)

static inline void skb_copy_hash(struct sk_buff *to, const struct sk_buff *from)
static inline void skb_copy_secmark(struct sk_buff *to, const struct sk_buff *from)
static inline void skb_copy_queue_mapping(struct sk_buff *to, const struct sk_buff *from)
static inline void skb_ext_copy(struct sk_buff *dst, const struct sk_buff *src)
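
For illustration only, a minimal sketch of what the accessor might look like under the suggested name, following the to/from convention of the helpers listed above. The struct is a simplified stand-in for skb_frag_t (only page_offset is modelled, which is an assumption of this sketch), and skb_frag_off_copy() is the name proposed in this thread, not necessarily what ends up merged:

```c
/*
 * Simplified stand-in for the kernel's skb_frag_t. Only the field that
 * matters for this discussion is modelled (assumption for the sketch).
 */
typedef struct {
	unsigned int page_offset;
} skb_frag_t;

/**
 * skb_frag_off_copy() - Copies the offset of one skb fragment to another
 * @fragto: skb fragment where the offset is set
 * @fragfrom: skb fragment the offset is copied from
 */
static inline void skb_frag_off_copy(skb_frag_t *fragto,
				     const skb_frag_t *fragfrom)
{
	fragto->page_offset = fragfrom->page_offset;
}
```

The behaviour is the same as the skb_frag_off_set_from() in the patch; only the name changes so that it reads like the other copy helpers in skbuff.h.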


* Re: [PATCH net-next 00/16] bnxt_en: Add TPA (GRO_HW and LRO) on 57500 chips.
From: David Miller @ 2019-07-29 21:24 UTC (permalink / raw)
  To: michael.chan; +Cc: netdev
In-Reply-To: <1564395033-19511-1-git-send-email-michael.chan@broadcom.com>

From: Michael Chan <michael.chan@broadcom.com>
Date: Mon, 29 Jul 2019 06:10:17 -0400

> This patchset adds TPA v2 support on the 57500 chips.  TPA v2 is
> different from the legacy TPA scheme on older chips and requires major
> refactoring and restructuring of the existing TPA logic.  The main
> difference is that the new TPA v2 has on-the-fly aggregation buffer
> completions before a TPA packet is completed.  The larger aggregation
> ID space also requires a new ID mapping logic to make it more
> memory efficient.

Series applied, but please explain something to me.

I thought initially while reviewing this that patch #5 makes the series
non-bisectable because this only includes the logic that appends new
entries to the agg array but lacks the changes to reset the agg count
at TPA end time (which occurs in patch #8).

However I then realized that you haven't turned on the logic yet that
can result in CMP_TYPE_RX_TPA_AGG_CMP entries in this context.

Am I right?


* linux-next: Fixes tag needs some work in the net-next tree
From: Stephen Rothwell @ 2019-07-29 21:25 UTC (permalink / raw)
  To: David Miller, Networking
  Cc: Linux Next Mailing List, Linux Kernel Mailing List,
	Oliver Hartkopp

Hi all,

In commit

  473d924d7d46 ("can: fix ioctl function removal")

Fixes tag

  Fixes: 60649d4e0af ("can: remove obsolete empty ioctl() handler")

has these problem(s):

  - SHA1 should be at least 12 digits long
    Can be fixed by setting core.abbrev to 12 (or more) or (for git v2.11
    or later) just making sure it is not set (or set to "auto").

-- 
Cheers,
Stephen Rothwell


* Re: [PATCH 1/3 net-next] linux: Add skb_frag_t page_offset accessors
From: David Miller @ 2019-07-29 21:25 UTC (permalink / raw)
  To: jakub.kicinski; +Cc: jonathan.lemon, kernel-team, netdev, willy
In-Reply-To: <20190729142211.43b5ccd8@cakuba.netronome.com>

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Mon, 29 Jul 2019 14:22:11 -0700

> There is no need to ponder the connotations of verbs. Please just 
> look at other function names in skbuff.h, especially those which 
> copy fields :)
> 
> static inline void skb_copy_hash(struct sk_buff *to, const struct sk_buff *from)
> static inline void skb_copy_secmark(struct sk_buff *to, const struct sk_buff *from)
> static inline void skb_copy_queue_mapping(struct sk_buff *to, const struct sk_buff *from)
> static inline void skb_ext_copy(struct sk_buff *dst, const struct sk_buff *src)

I have to agree :-)


* Re: [PATCH 1/3 net-next] linux: Add skb_frag_t page_offset accessors
From: Jakub Kicinski @ 2019-07-29 21:25 UTC (permalink / raw)
  To: Jonathan Lemon; +Cc: davem, kernel-team, netdev, Matthew Wilcox
In-Reply-To: <20190729142211.43b5ccd8@cakuba.netronome.com>

On Mon, 29 Jul 2019 14:22:11 -0700, Jakub Kicinski wrote:
> > > > I realize you're following the existing code, but should we perhaps use
> > > > the latest kdoc syntax? '()' after function name, and args should have
> > > > '@' prefix, '%' would be for constants.
> > 
> > That would be a task for a different cleanup.  Not that I disagree with 
> > you, but there's also nothing worse than mixing styles in the same file.  
> 
> Funny you should say that given that (a) I'm commenting on the new code
> you're adding, and (b) you did do an unrelated spelling fix above ;)

Ah, sorry I misread your comment there.

Some code already uses '()' in this file. As for the '%', the skb_frag_
functions are the only ones which have this mistake; the rest of the kdoc
is correct.
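
As a rough sketch of the kdoc style being asked for ('()' after the function name in the summary line, '@' for argument references, '%' reserved for constants), one of the new accessors could be documented as below. The struct is a simplified stand-in for skb_frag_t and is an assumption of this example, not the kernel definition:

```c
/*
 * Simplified stand-in for the kernel's skb_frag_t (assumption for the
 * sketch; the real structure has more fields).
 */
typedef struct {
	unsigned int page_offset;
} skb_frag_t;

/**
 * skb_frag_off_add() - Increments the offset of a skb fragment by @delta
 * @frag: skb fragment
 * @delta: value to add
 */
static inline void skb_frag_off_add(skb_frag_t *frag, int delta)
{
	frag->page_offset += delta;
}
```

Only the comment differs from the patch: "skb_frag_off_add()" instead of "skb_frag_off_add", and "@delta" instead of "%delta".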


* Re: [PATCH net] net/mlx5e: Fix unnecessary flow_block_cb_is_busy call
From: Saeed Mahameed @ 2019-07-29 21:30 UTC (permalink / raw)
  To: wenxu@ucloud.cn, netdev@vger.kernel.org
In-Reply-To: <1564239595-23786-1-git-send-email-wenxu@ucloud.cn>

On Sat, 2019-07-27 at 22:59 +0800, wenxu@ucloud.cn wrote:
> From: wenxu <wenxu@ucloud.cn>
> 
> When flow_block_cb_is_busy is called, indr_priv is guaranteed to be a
> NULL ptr, so there is no need to call flow_block_cb_is_busy.
> 
> Fixes: 0d4fd02e7199 ("net: flow_offload: add flow_block_cb_is_busy()
> and use it")
> Signed-off-by: wenxu <wenxu@ucloud.cn>

Applied to net-next-mlx5 branch.

Thanks,
Saeed.


* Re: [PATCH net-next 3/3] net: stmmac: Introducing support for Page Pool
From: Jon Hunter @ 2019-07-29 21:33 UTC (permalink / raw)
  To: Jose Abreu, Robin Murphy, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-stm32@st-md-mailman.stormreply.com,
	linux-arm-kernel@lists.infradead.org, Catalin Marinas,
	Will Deacon
  Cc: Joao Pinto, Alexandre Torgue, Maxime Ripard, Chen-Yu Tsai,
	Maxime Coquelin, linux-tegra, Giuseppe Cavallaro,
	David S . Miller
In-Reply-To: <MN2PR12MB3279ABF628C52883021123C5D3DD0@MN2PR12MB3279.namprd12.prod.outlook.com>


On 29/07/2019 15:08, Jose Abreu wrote:

...

>>> Hi Catalin and Will,
>>>
>>> Sorry to add you in such a long thread but we are seeing a DMA issue
>>> with stmmac driver in an ARM64 platform with IOMMU enabled.
>>>
>>> The issue seems to be solved when buffers allocated for DMA-based
>>> transfers are *not* mapped with the DMA_ATTR_SKIP_CPU_SYNC flag *OR*
>>> when IOMMU is disabled.
>>>
>>> Notice that after transfer is done we do use
>>> dma_sync_single_for_{cpu,device} and then we reuse *the same* page for
>>> another transfer.
>>>
>>> Can you please comment on whether DMA_ATTR_SKIP_CPU_SYNC cannot be used
>>> on ARM64 platforms with IOMMU?
>>
>> In terms of what they do, there should be no difference on arm64 between:
>>
>> dma_map_page(..., dir);
>> ...
>> dma_unmap_page(..., dir);
>>
>> and:
>>
>> dma_map_page_attrs(..., dir, DMA_ATTR_SKIP_CPU_SYNC);
>> dma_sync_single_for_device(..., dir);
>> ...
>> dma_sync_single_for_cpu(..., dir);
>> dma_unmap_page_attrs(..., dir, DMA_ATTR_SKIP_CPU_SYNC);
>>
>> provided that the first sync covers the whole buffer and any subsequent 
>> ones cover at least the parts of the buffer which may have changed. Plus 
>> for coherent hardware it's entirely moot either way.
> 
> Thanks for confirming. That's indeed what stmmac is doing when buffer is 
> received by syncing the packet size to CPU.
> 
>>
>> Given Jon's previous findings, I would lean towards the idea that 
>> performing the extra (redundant) cache maintenance plus barrier in 
>> dma_unmap is mostly just perturbing timing in the same way as the debug 
>> print which also made things seem OK.
> 
> Mikko said that Tegra186 is not coherent so we have to explicitly flush the
> pipeline but I don't understand why sync_single() is not doing it ...
> 
> Jon, can you please remove *all* debug prints, hacks, etc ... and test
> the attached one with plain -net tree ?

So far I have just been testing on the mainline kernel branch. The issue
still persists after applying this on mainline. I can test on the -net
tree, but I am not sure that will make a difference.

Cheers
Jon

-- 
nvpublic


* [PATCH bpf-next] tools: bpftool: add support for reporting the effective cgroup progs
From: Jakub Kicinski @ 2019-07-29 21:35 UTC (permalink / raw)
  To: alexei.starovoitov, daniel
  Cc: netdev, bpf, oss-drivers, ctakshak, kernel-team, Jakub Kicinski,
	Quentin Monnet

Takshak said in the original submission:

With different bpf attach_flags available to attach bpf programs, especially
with BPF_F_ALLOW_OVERRIDE and BPF_F_ALLOW_MULTI, the list of effective
bpf-programs available to any sub-cgroups really needs to be available for
easy debugging.

Using BPF_F_QUERY_EFFECTIVE flag, one can get the list of not only attached
bpf-programs to a cgroup but also the inherited ones from parent cgroup.

So a new option is introduced to use BPF_F_QUERY_EFFECTIVE query flag here
to list all the effective bpf-programs available for execution at a specified
cgroup.

Reused modified test program test_cgroup_attach from tools/testing/selftests/bpf:
  # ./test_cgroup_attach

With old bpftool:

 # bpftool cgroup show /sys/fs/cgroup/cgroup-test-work-dir/cg1/
  ID       AttachType      AttachFlags     Name
  271      egress          multi           pkt_cntr_1
  272      egress          multi           pkt_cntr_2

Attaching a new program pkt_cntr_4 in cg2 gives the following:

 # bpftool cgroup show /sys/fs/cgroup/cgroup-test-work-dir/cg1/cg2
  ID       AttachType      AttachFlags     Name
  273      egress          override        pkt_cntr_4

And with new "effective" option it shows all effective programs for cg2:

 # bpftool cgroup show /sys/fs/cgroup/cgroup-test-work-dir/cg1/cg2 effective
  ID       AttachType      AttachFlags     Name
  273      egress          override        pkt_cntr_4
  271      egress          override        pkt_cntr_1
  272      egress          override        pkt_cntr_2

Compared to the original submission, use a local flag instead of a global
option.

We need to clear query_flags on every command, in case batch mode
wants to use varying settings.

Signed-off-by: Takshak Chahande <ctakshak@fb.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 .../bpftool/Documentation/bpftool-cgroup.rst  | 16 +++--
 tools/bpf/bpftool/bash-completion/bpftool     | 15 +++--
 tools/bpf/bpftool/cgroup.c                    | 65 ++++++++++++-------
 3 files changed, 63 insertions(+), 33 deletions(-)

diff --git a/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst b/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst
index 585f270c2d25..06a28b07787d 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-cgroup.rst
@@ -20,8 +20,8 @@ SYNOPSIS
 CGROUP COMMANDS
 ===============
 
-|	**bpftool** **cgroup { show | list }** *CGROUP*
-|	**bpftool** **cgroup tree** [*CGROUP_ROOT*]
+|	**bpftool** **cgroup { show | list }** *CGROUP* [**effective**]
+|	**bpftool** **cgroup tree** [*CGROUP_ROOT*] [**effective**]
 |	**bpftool** **cgroup attach** *CGROUP* *ATTACH_TYPE* *PROG* [*ATTACH_FLAGS*]
 |	**bpftool** **cgroup detach** *CGROUP* *ATTACH_TYPE* *PROG*
 |	**bpftool** **cgroup help**
@@ -35,13 +35,17 @@ CGROUP COMMANDS
 
 DESCRIPTION
 ===========
-	**bpftool cgroup { show | list }** *CGROUP*
+	**bpftool cgroup { show | list }** *CGROUP* [**effective**]
 		  List all programs attached to the cgroup *CGROUP*.
 
 		  Output will start with program ID followed by attach type,
 		  attach flags and program name.
 
-	**bpftool cgroup tree** [*CGROUP_ROOT*]
+		  If **effective** is specified retrieve effective programs that
+		  will execute for events within a cgroup. This includes
+		  inherited along with attached ones.
+
+	**bpftool cgroup tree** [*CGROUP_ROOT*] [**effective**]
 		  Iterate over all cgroups in *CGROUP_ROOT* and list all
 		  attached programs. If *CGROUP_ROOT* is not specified,
 		  bpftool uses cgroup v2 mountpoint.
@@ -50,6 +54,10 @@ DESCRIPTION
 		  commands: it starts with absolute cgroup path, followed by
 		  program ID, attach type, attach flags and program name.
 
+		  If **effective** is specified retrieve effective programs that
+		  will execute for events within a cgroup. This includes
+		  inherited along with attached ones.
+
 	**bpftool cgroup attach** *CGROUP* *ATTACH_TYPE* *PROG* [*ATTACH_FLAGS*]
 		  Attach program *PROG* to the cgroup *CGROUP* with attach type
 		  *ATTACH_TYPE* and optional *ATTACH_FLAGS*.
diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
index 6b961a5ed100..df16c5415444 100644
--- a/tools/bpf/bpftool/bash-completion/bpftool
+++ b/tools/bpf/bpftool/bash-completion/bpftool
@@ -710,12 +710,15 @@ _bpftool()
             ;;
         cgroup)
             case $command in
-                show|list)
-                    _filedir
-                    return 0
-                    ;;
-                tree)
-                    _filedir
+                show|list|tree)
+                    case $cword in
+                        3)
+                            _filedir
+                            ;;
+                        4)
+                            COMPREPLY=( $( compgen -W 'effective' -- "$cur" ) )
+                            ;;
+                    esac
                     return 0
                     ;;
                 attach|detach)
diff --git a/tools/bpf/bpftool/cgroup.c b/tools/bpf/bpftool/cgroup.c
index f3c05b08c68c..339c2c78b8e4 100644
--- a/tools/bpf/bpftool/cgroup.c
+++ b/tools/bpf/bpftool/cgroup.c
@@ -29,6 +29,8 @@
 	"                        recvmsg4 | recvmsg6 | sysctl |\n"	       \
 	"                        getsockopt | setsockopt }"
 
+static unsigned int query_flags;
+
 static const char * const attach_type_strings[] = {
 	[BPF_CGROUP_INET_INGRESS] = "ingress",
 	[BPF_CGROUP_INET_EGRESS] = "egress",
@@ -107,7 +109,8 @@ static int count_attached_bpf_progs(int cgroup_fd, enum bpf_attach_type type)
 	__u32 prog_cnt = 0;
 	int ret;
 
-	ret = bpf_prog_query(cgroup_fd, type, 0, NULL, NULL, &prog_cnt);
+	ret = bpf_prog_query(cgroup_fd, type, query_flags, NULL,
+			     NULL, &prog_cnt);
 	if (ret)
 		return -1;
 
@@ -125,8 +128,8 @@ static int show_attached_bpf_progs(int cgroup_fd, enum bpf_attach_type type,
 	int ret;
 
 	prog_cnt = ARRAY_SIZE(prog_ids);
-	ret = bpf_prog_query(cgroup_fd, type, 0, &attach_flags, prog_ids,
-			     &prog_cnt);
+	ret = bpf_prog_query(cgroup_fd, type, query_flags, &attach_flags,
+			     prog_ids, &prog_cnt);
 	if (ret)
 		return ret;
 
@@ -158,20 +161,30 @@ static int show_attached_bpf_progs(int cgroup_fd, enum bpf_attach_type type,
 static int do_show(int argc, char **argv)
 {
 	enum bpf_attach_type type;
+	const char *path;
 	int cgroup_fd;
 	int ret = -1;
 
-	if (argc < 1) {
-		p_err("too few parameters for cgroup show");
-		goto exit;
-	} else if (argc > 1) {
-		p_err("too many parameters for cgroup show");
-		goto exit;
+	query_flags = 0;
+
+	if (!REQ_ARGS(1))
+		return -1;
+	path = GET_ARG();
+
+	while (argc) {
+		if (is_prefix(*argv, "effective")) {
+			query_flags |= BPF_F_QUERY_EFFECTIVE;
+			NEXT_ARG();
+		} else {
+			p_err("expected no more arguments, 'effective', got: '%s'?",
+			      *argv);
+			return -1;
+		}
 	}
 
-	cgroup_fd = open(argv[0], O_RDONLY);
+	cgroup_fd = open(path, O_RDONLY);
 	if (cgroup_fd < 0) {
-		p_err("can't open cgroup %s", argv[0]);
+		p_err("can't open cgroup %s", path);
 		goto exit;
 	}
 
@@ -297,23 +310,29 @@ static int do_show_tree(int argc, char **argv)
 	char *cgroup_root;
 	int ret;
 
-	switch (argc) {
-	case 0:
+	query_flags = 0;
+
+	if (!argc) {
 		cgroup_root = find_cgroup_root();
 		if (!cgroup_root) {
 			p_err("cgroup v2 isn't mounted");
 			return -1;
 		}
-		break;
-	case 1:
-		cgroup_root = argv[0];
-		break;
-	default:
-		p_err("too many parameters for cgroup tree");
-		return -1;
+	} else {
+		cgroup_root = GET_ARG();
+
+		while (argc) {
+			if (is_prefix(*argv, "effective")) {
+				query_flags |= BPF_F_QUERY_EFFECTIVE;
+				NEXT_ARG();
+			} else {
+				p_err("expected no more arguments, 'effective', got: '%s'?",
+				      *argv);
+				return -1;
+			}
+		}
 	}
 
-
 	if (json_output)
 		jsonw_start_array(json_wtr);
 	else
@@ -459,8 +478,8 @@ static int do_help(int argc, char **argv)
 	}
 
 	fprintf(stderr,
-		"Usage: %s %s { show | list } CGROUP\n"
-		"       %s %s tree [CGROUP_ROOT]\n"
+		"Usage: %s %s { show | list } CGROUP [effective]\n"
+		"       %s %s tree [CGROUP_ROOT] [effective]\n"
 		"       %s %s attach CGROUP ATTACH_TYPE PROG [ATTACH_FLAGS]\n"
 		"       %s %s detach CGROUP ATTACH_TYPE PROG\n"
 		"       %s %s help\n"
-- 
2.21.0
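
For readers who want to see the argument handling from do_show() in isolation, here is a minimal standalone sketch. It is not the bpftool code: parse_show_args() and QUERY_EFFECTIVE are hypothetical names made up for this example, the flag value merely stands in for BPF_F_QUERY_EFFECTIVE, and is_prefix() is a simplified local version rather than bpftool's own helper:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the UAPI flag BPF_F_QUERY_EFFECTIVE. */
#define QUERY_EFFECTIVE (1U << 0)

/* Simplified helper: true if @pfx is a non-empty prefix of @str. */
static bool is_prefix(const char *pfx, const char *str)
{
	size_t len = strlen(pfx);

	return len > 0 && len <= strlen(str) && strncmp(str, pfx, len) == 0;
}

/*
 * Parses "CGROUP [effective]". On success stores the cgroup path and the
 * query flags and returns 0. Returns -1 if the path is missing or an
 * unexpected argument follows it.
 */
static int parse_show_args(int argc, char **argv, const char **path,
			   unsigned int *flags)
{
	/* Reset on every command: batch mode may use varying settings. */
	*flags = 0;

	if (argc < 1)
		return -1;
	*path = *argv;
	argc--;
	argv++;

	while (argc) {
		if (!is_prefix(*argv, "effective"))
			return -1;
		*flags |= QUERY_EFFECTIVE;
		argc--;
		argv++;
	}

	return 0;
}
```

With this shape, "show CGROUP", "show CGROUP effective" and an abbreviated keyword such as "eff" are accepted, while any other trailing argument is rejected. The flag never leaks from one command to the next because it is cleared before parsing.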



* [net-next:master 57/59] drivers/staging/octeon/octeon-stubs.h:1205:9: warning: cast to pointer from integer of different size
From: kbuild test robot @ 2019-07-29 21:36 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: kbuild-all, netdev

tree:   https://kernel.googlesource.com/pub/scm/linux/kernel/git/davem/net-next.git master
head:   1cb9dfca39eb406960f8f84864ddd6ba329ec321
commit: 171a9bae68c72f2d1260c3825203760856e6793b [57/59] staging/octeon: Allow test build on !MIPS
config: sh-allmodconfig (attached as .config)
compiler: sh4-linux-gcc (GCC) 7.4.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 171a9bae68c72f2d1260c3825203760856e6793b
        # save the attached .config to linux build tree
        GCC_VERSION=7.4.0 make.cross ARCH=sh 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from drivers/staging/octeon/octeon-ethernet.h:41:0,
                    from drivers/staging/octeon/ethernet.c:22:
   drivers/staging/octeon/octeon-stubs.h: In function 'cvmx_phys_to_ptr':
>> drivers/staging/octeon/octeon-stubs.h:1205:9: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
     return (void *)(physical_address);
            ^
--
   In file included from drivers/staging/octeon/octeon-ethernet.h:41:0,
                    from drivers/staging/octeon/ethernet-mem.c:12:
   drivers/staging/octeon/octeon-stubs.h: In function 'cvmx_phys_to_ptr':
>> drivers/staging/octeon/octeon-stubs.h:1205:9: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
     return (void *)(physical_address);
            ^
   In file included from include/linux/scatterlist.h:9:0,
                    from include/linux/dma-mapping.h:11,
                    from include/linux/skbuff.h:31,
                    from include/linux/if_ether.h:19,
                    from include/uapi/linux/ethtool.h:19,
                    from include/linux/ethtool.h:18,
                    from include/linux/netdevice.h:37,
                    from drivers/staging/octeon/ethernet-mem.c:9:
   drivers/staging/octeon/ethernet-mem.c: In function 'cvm_oct_free_hw_memory':
   arch/sh/include/asm/io.h:244:32: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
    #define phys_to_virt(address) ((void *)(address))
                                   ^
>> drivers/staging/octeon/ethernet-mem.c:123:18: note: in expansion of macro 'phys_to_virt'
       fpa = (char *)phys_to_virt(cvmx_ptr_to_phys(fpa));
                     ^~~~~~~~~~~~
--
   In file included from drivers/staging/octeon/octeon-ethernet.h:41:0,
                    from drivers/staging/octeon/ethernet-tx.c:25:
   drivers/staging/octeon/octeon-stubs.h: In function 'cvmx_phys_to_ptr':
>> drivers/staging/octeon/octeon-stubs.h:1205:9: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
     return (void *)(physical_address);
            ^
   drivers/staging/octeon/ethernet-tx.c: In function 'cvm_oct_xmit':
>> drivers/staging/octeon/ethernet-tx.c:264:37: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      hw_buffer.s.addr = XKPHYS_TO_PHYS((u64)skb->data);
                                        ^
   drivers/staging/octeon/octeon-stubs.h:2:30: note: in definition of macro 'XKPHYS_TO_PHYS'
    #define XKPHYS_TO_PHYS(p)   (p)
                                 ^
   drivers/staging/octeon/ethernet-tx.c:268:37: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      hw_buffer.s.addr = XKPHYS_TO_PHYS((u64)skb->data);
                                        ^
   drivers/staging/octeon/octeon-stubs.h:2:30: note: in definition of macro 'XKPHYS_TO_PHYS'
    #define XKPHYS_TO_PHYS(p)   (p)
                                 ^
   drivers/staging/octeon/ethernet-tx.c:276:20: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
        XKPHYS_TO_PHYS((u64)skb_frag_address(fs));
                       ^
   drivers/staging/octeon/octeon-stubs.h:2:30: note: in definition of macro 'XKPHYS_TO_PHYS'
    #define XKPHYS_TO_PHYS(p)   (p)
                                 ^
   drivers/staging/octeon/ethernet-tx.c:280:37: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      hw_buffer.s.addr = XKPHYS_TO_PHYS((u64)CVM_OCT_SKB_CB(skb));
                                        ^
   drivers/staging/octeon/octeon-stubs.h:2:30: note: in definition of macro 'XKPHYS_TO_PHYS'
    #define XKPHYS_TO_PHYS(p)   (p)
                                 ^
--
   drivers/net/phy/mdio-octeon.c: In function 'octeon_mdiobus_probe':
>> drivers/net/phy/mdio-octeon.c:48:3: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      (u64)devm_ioremap(&pdev->dev, mdio_phys, regsize);
      ^
   In file included from include/linux/io.h:13:0,
                    from include/linux/of_address.h:7,
                    from drivers/net/phy/mdio-octeon.c:7:
>> drivers/net/phy/mdio-cavium.h:111:48: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
    #define oct_mdio_writeq(val, addr) writeq(val, (void *)addr)
                                                   ^
   arch/sh/include/asm/io.h:33:71: note: in definition of macro '__raw_writeq'
    #define __raw_writeq(v,a) (__chk_io_ptr(a), *(volatile u64 __force *)(a) = (v))
                                                                          ^
>> arch/sh/include/asm/io.h:58:32: note: in expansion of macro 'writeq_relaxed'
    #define writeq(v,a)  ({ wmb(); writeq_relaxed((v),(a)); })
                                   ^~~~~~~~~~~~~~
>> drivers/net/phy/mdio-cavium.h:111:36: note: in expansion of macro 'writeq'
    #define oct_mdio_writeq(val, addr) writeq(val, (void *)addr)
                                       ^~~~~~
>> drivers/net/phy/mdio-octeon.c:56:2: note: in expansion of macro 'oct_mdio_writeq'
     oct_mdio_writeq(smi_en.u64, bus->register_base + SMI_EN);
     ^~~~~~~~~~~~~~~
>> drivers/net/phy/mdio-cavium.h:111:48: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
    #define oct_mdio_writeq(val, addr) writeq(val, (void *)addr)
                                                   ^
   arch/sh/include/asm/io.h:33:71: note: in definition of macro '__raw_writeq'
    #define __raw_writeq(v,a) (__chk_io_ptr(a), *(volatile u64 __force *)(a) = (v))
                                                                          ^
>> arch/sh/include/asm/io.h:58:32: note: in expansion of macro 'writeq_relaxed'
    #define writeq(v,a)  ({ wmb(); writeq_relaxed((v),(a)); })
                                   ^~~~~~~~~~~~~~
>> drivers/net/phy/mdio-cavium.h:111:36: note: in expansion of macro 'writeq'
    #define oct_mdio_writeq(val, addr) writeq(val, (void *)addr)
                                       ^~~~~~
   drivers/net/phy/mdio-octeon.c:77:2: note: in expansion of macro 'oct_mdio_writeq'
     oct_mdio_writeq(smi_en.u64, bus->register_base + SMI_EN);
     ^~~~~~~~~~~~~~~
   drivers/net/phy/mdio-octeon.c: In function 'octeon_mdiobus_remove':
>> drivers/net/phy/mdio-cavium.h:111:48: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
    #define oct_mdio_writeq(val, addr) writeq(val, (void *)addr)
                                                   ^
   arch/sh/include/asm/io.h:33:71: note: in definition of macro '__raw_writeq'
    #define __raw_writeq(v,a) (__chk_io_ptr(a), *(volatile u64 __force *)(a) = (v))
                                                                          ^
>> arch/sh/include/asm/io.h:58:32: note: in expansion of macro 'writeq_relaxed'
    #define writeq(v,a)  ({ wmb(); writeq_relaxed((v),(a)); })
                                   ^~~~~~~~~~~~~~
>> drivers/net/phy/mdio-cavium.h:111:36: note: in expansion of macro 'writeq'
    #define oct_mdio_writeq(val, addr) writeq(val, (void *)addr)
                                       ^~~~~~
   drivers/net/phy/mdio-octeon.c:91:2: note: in expansion of macro 'oct_mdio_writeq'
     oct_mdio_writeq(smi_en.u64, bus->register_base + SMI_EN);
     ^~~~~~~~~~~~~~~

vim +1205 drivers/staging/octeon/octeon-stubs.h

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 51905 bytes --]


* Re: [PATCH 1/3 net-next] linux: Add skb_frag_t page_offset accessors
From: Jonathan Lemon @ 2019-07-29 21:37 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, kernel-team, netdev, Matthew Wilcox
In-Reply-To: <20190729142211.43b5ccd8@cakuba.netronome.com>



On 29 Jul 2019, at 14:22, Jakub Kicinski wrote:

> On Mon, 29 Jul 2019 14:02:21 -0700, Jonathan Lemon wrote:
>> On 29 Jul 2019, at 13:50, Jakub Kicinski wrote:
>>> On Mon, 29 Jul 2019 10:19:39 -0700, Jonathan Lemon wrote:
>>>> Add skb_frag_off(), skb_frag_off_add(), skb_frag_off_set(),
>>>> and skb_frag_off_set_from() accessors for page_offset.
>>>>
>>>> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
>
>>>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>>>> index 718742b1c505..7d94a78067ee 100644
>>>> --- a/include/linux/skbuff.h
>>>> +++ b/include/linux/skbuff.h
>>>> @@ -331,7 +331,7 @@ static inline void skb_frag_size_set(skb_frag_t
>>>> *frag, unsigned int size)
>>>>  }
>>>>
>>>>  /**
>>>> - * skb_frag_size_add - Incrementes the size of a skb fragment by
>>>> %delta
>>>> + * skb_frag_size_add - Increments the size of a skb fragment by
>>>> %delta
>>>>   * @frag: skb fragment
>>>>   * @delta: value to add
>>>>   */
>>>> @@ -2857,6 +2857,46 @@ static inline void
>>>> skb_propagate_pfmemalloc(struct page *page,
>>>>  		skb->pfmemalloc = true;
>>>>  }
>>>>
>>>> +/**
>>>> + * skb_frag_off - Returns the offset of a skb fragment
>>>> + * @frag: the paged fragment
>>>> + */
>>>> +static inline unsigned int skb_frag_off(const skb_frag_t *frag)
>>>> +{
>>>> +	return frag->page_offset;
>>>> +}
>>>> +
>>>> +/**
>>>> + * skb_frag_off_add - Increments the offset of a skb fragment by
>>>> %delta
>>>
>>> I realize you're following the existing code, but should we perhaps
>>> use
>>> the latest kdoc syntax? '()' after function name, and args should 
>>> have
>>> '@' prefix, '%' would be for constants.
>>
>> That would be a task for a different cleanup.  Not that I disagree 
>> with
>> you, but there's also nothing worse than mixing styles in the same 
>> file.
>
> Funny you should say that given that (a) I'm commenting on the new 
> code
> you're adding, and (b) you did do an unrelated spelling fix above ;)
>
>>>> + * @frag: skb fragment
>>>> + * @delta: value to add
>>>> + */
>>>> +static inline void skb_frag_off_add(skb_frag_t *frag, int delta)
>>>> +{
>>>> +	frag->page_offset += delta;
>>>> +}
>>>> +
>>>> +/**
>>>> + * skb_frag_off_set - Sets the offset of a skb fragment
>>>> + * @frag: skb fragment
>>>> + * @offset: offset of fragment
>>>> + */
>>>> +static inline void skb_frag_off_set(skb_frag_t *frag, unsigned int
>>>> offset)
>>>> +{
>>>> +	frag->page_offset = offset;
>>>> +}
>>>> +
>>>> +/**
>>>> + * skb_frag_off_set_from - Sets the offset of a skb fragment from
>>>> another fragment
>>>> + * @fragto: skb fragment where offset is set
>>>> + * @fragfrom: skb fragment offset is copied from
>>>> + */
>>>> +static inline void skb_frag_off_set_from(skb_frag_t *fragto,
>>>> +					 const skb_frag_t *fragfrom)
>>>
>>> skb_frag_off_copy() ?
>>
>> That was my initial inclination, but due to the often overloaded
>> connotations of the word "copy", opted to use the same "set" verbiage
>> that existed in the other functions.
>
> There is no need to ponder the connotations of verbs. Please just
> look at other function names in skbuff.h, especially those which
> copy fields :)
>
> static inline void skb_copy_hash(struct sk_buff *to, const struct 
> sk_buff *from)
> static inline void skb_copy_secmark(struct sk_buff *to, const struct 
> sk_buff *from)
> static inline void skb_copy_queue_mapping(struct sk_buff *to, const 
> struct sk_buff *from)
> static inline void skb_ext_copy(struct sk_buff *dst, const struct 
> sk_buff *src)

Okay, I missed those, let me respin.
-- 
Jonathan

^ permalink raw reply
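
The naming point in the reply above can be illustrated with a small user-space sketch. It is not the respun patch: `struct frag_copy_sketch` and `frag_sketch_off_copy()` are hypothetical stand-ins that only mirror the "copy" verb and the to/from argument order of the skb_copy_*() helpers quoted in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for skb_frag_t; only the field under discussion. */
struct frag_copy_sketch {
	unsigned int page_offset;
};

/*
 * Copy the page offset from one fragment to another.  The "copy" verb and
 * the (to, from) argument order follow the skb_copy_hash() style quoted
 * in the thread; the function itself is an assumption.
 */
static inline void frag_sketch_off_copy(struct frag_copy_sketch *to,
					const struct frag_copy_sketch *from)
{
	assert(to != NULL && from != NULL);
	to->page_offset = from->page_offset;
}
```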

* Re: [PATCH V36 23/29] bpf: Restrict bpf when kernel lockdown is in confidentiality mode
From: Matthew Garrett @ 2019-07-29 21:47 UTC (permalink / raw)
  To: James Morris
  Cc: LSM List, Linux Kernel Mailing List, Linux API, David Howells,
	Alexei Starovoitov, Network Development, Chun-Yi Lee,
	Daniel Borkmann
In-Reply-To: <20190718194415.108476-24-matthewgarrett@google.com>

On Thu, Jul 18, 2019 at 12:45 PM Matthew Garrett
<matthewgarrett@google.com> wrote:
> bpf_read() and bpf_read_str() could potentially be abused to (eg) allow
> private keys in kernel memory to be leaked. Disable them if the kernel
> has been locked down in confidentiality mode.
>
> Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Signed-off-by: Matthew Garrett <mjg59@google.com>
> cc: netdev@vger.kernel.org
> cc: Chun-Yi Lee <jlee@suse.com>
> cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>

Any further feedback on this?

^ permalink raw reply

* [net-next:master 57/59] drivers/net/phy/mdio-cavium.h:111:36: error: implicit declaration of function 'writeq'; did you mean 'writel'?
From: kbuild test robot @ 2019-07-29 21:45 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: kbuild-all, netdev

[-- Attachment #1: Type: text/plain, Size: 4206 bytes --]

tree:   https://kernel.googlesource.com/pub/scm/linux/kernel/git/davem/net-next.git master
head:   1cb9dfca39eb406960f8f84864ddd6ba329ec321
commit: 171a9bae68c72f2d1260c3825203760856e6793b [57/59] staging/octeon: Allow test build on !MIPS
config: c6x-allyesconfig (attached as .config)
compiler: c6x-elf-gcc (GCC) 7.4.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 171a9bae68c72f2d1260c3825203760856e6793b
        # save the attached .config to linux build tree
        GCC_VERSION=7.4.0 make.cross ARCH=c6x 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   drivers/net/phy/mdio-octeon.c: In function 'octeon_mdiobus_probe':
   drivers/net/phy/mdio-octeon.c:48:3: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      (u64)devm_ioremap(&pdev->dev, mdio_phys, regsize);
      ^
   In file included from drivers/net/phy/mdio-octeon.c:14:0:
>> drivers/net/phy/mdio-cavium.h:111:36: error: implicit declaration of function 'writeq'; did you mean 'writel'? [-Werror=implicit-function-declaration]
    #define oct_mdio_writeq(val, addr) writeq(val, (void *)addr)
                                       ^
   drivers/net/phy/mdio-octeon.c:56:2: note: in expansion of macro 'oct_mdio_writeq'
     oct_mdio_writeq(smi_en.u64, bus->register_base + SMI_EN);
     ^~~~~~~~~~~~~~~
   drivers/net/phy/mdio-cavium.h:111:48: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
    #define oct_mdio_writeq(val, addr) writeq(val, (void *)addr)
                                                   ^
   drivers/net/phy/mdio-octeon.c:56:2: note: in expansion of macro 'oct_mdio_writeq'
     oct_mdio_writeq(smi_en.u64, bus->register_base + SMI_EN);
     ^~~~~~~~~~~~~~~
   drivers/net/phy/mdio-cavium.h:111:48: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
    #define oct_mdio_writeq(val, addr) writeq(val, (void *)addr)
                                                   ^
   drivers/net/phy/mdio-octeon.c:77:2: note: in expansion of macro 'oct_mdio_writeq'
     oct_mdio_writeq(smi_en.u64, bus->register_base + SMI_EN);
     ^~~~~~~~~~~~~~~
   drivers/net/phy/mdio-octeon.c: In function 'octeon_mdiobus_remove':
   drivers/net/phy/mdio-cavium.h:111:48: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
    #define oct_mdio_writeq(val, addr) writeq(val, (void *)addr)
                                                   ^
   drivers/net/phy/mdio-octeon.c:91:2: note: in expansion of macro 'oct_mdio_writeq'
     oct_mdio_writeq(smi_en.u64, bus->register_base + SMI_EN);
     ^~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +111 drivers/net/phy/mdio-cavium.h

1eefee901fca020 David Daney 2016-03-11  105  
1eefee901fca020 David Daney 2016-03-11  106  static inline u64 oct_mdio_readq(u64 addr)
1eefee901fca020 David Daney 2016-03-11  107  {
1eefee901fca020 David Daney 2016-03-11  108  	return cvmx_read_csr(addr);
1eefee901fca020 David Daney 2016-03-11  109  }
1eefee901fca020 David Daney 2016-03-11  110  #else
1eefee901fca020 David Daney 2016-03-11 @111  #define oct_mdio_writeq(val, addr)	writeq(val, (void *)addr)
1eefee901fca020 David Daney 2016-03-11  112  #define oct_mdio_readq(addr)		readq((void *)addr)
1eefee901fca020 David Daney 2016-03-11  113  #endif
1eefee901fca020 David Daney 2016-03-11  114  
1eefee901fca020 David Daney 2016-03-11  115  int cavium_mdiobus_read(struct mii_bus *bus, int phy_id, int regnum);
1eefee901fca020 David Daney 2016-03-11  116  int cavium_mdiobus_write(struct mii_bus *bus, int phy_id, int regnum, u16 val);

:::::: The code at line 111 was first introduced by commit
:::::: 1eefee901fca0208b8a56f20cdc134e2b8638ae7 phy: mdio-octeon: Refactor into two files/modules

:::::: TO: David Daney <david.daney@cavium.com>
:::::: CC: David S. Miller <davem@davemloft.net>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 49823 bytes --]

^ permalink raw reply

* [PATCH bpf-next 0/2] bpf: allocate extra memory for setsockopt hook buffer
From: Stanislav Fomichev @ 2019-07-29 21:51 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev

The current setsockopt hook is limited to the size of the buffer that
the user supplied. Since we always allocate memory and copy the value
into kernel space, allocate just a little bit more in case the BPF
program needs to override input data with a larger value.

The canonical example is the TCP_CONGESTION socket option, where the
input buffer is a string: if the user calls it with a short string,
the BPF program has no way of extending it.

The tests are extended with a TCP_CONGESTION use case.

Stanislav Fomichev (2):
  bpf: always allocate at least 16 bytes for setsockopt hook
  selftests/bpf: extend sockopt_sk selftest with TCP_CONGESTION use case

 kernel/bpf/cgroup.c                           | 17 ++++++++++---
 .../testing/selftests/bpf/progs/sockopt_sk.c  | 22 ++++++++++++++++
 tools/testing/selftests/bpf/test_sockopt_sk.c | 25 +++++++++++++++++++
 3 files changed, 60 insertions(+), 4 deletions(-)

-- 
2.22.0.709.g102302147b-goog

^ permalink raw reply

* [PATCH bpf-next 1/2] bpf: always allocate at least 16 bytes for setsockopt hook
From: Stanislav Fomichev @ 2019-07-29 21:51 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev
In-Reply-To: <20190729215111.209219-1-sdf@google.com>

Since we always allocate memory, allocate just a little bit more
for the BPF program in case it needs to override user input with
a bigger value. The canonical example is TCP_CONGESTION, where the
input string might be too small to override (nv -> bbr or cubic).

16 bytes are chosen to match the size of TCP_CA_NAME_MAX and can
be extended in the future if needed.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 kernel/bpf/cgroup.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 0a00eaca6fae..6a6a154cfa7b 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -964,7 +964,6 @@ static int sockopt_alloc_buf(struct bpf_sockopt_kern *ctx, int max_optlen)
 		return -ENOMEM;
 
 	ctx->optval_end = ctx->optval + max_optlen;
-	ctx->optlen = max_optlen;
 
 	return 0;
 }
@@ -984,7 +983,7 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
 		.level = *level,
 		.optname = *optname,
 	};
-	int ret;
+	int ret, max_optlen;
 
 	/* Opportunistic check to see whether we have any BPF program
 	 * attached to the hook so we don't waste time allocating
@@ -994,10 +993,18 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
 	    __cgroup_bpf_prog_array_is_empty(cgrp, BPF_CGROUP_SETSOCKOPT))
 		return 0;
 
-	ret = sockopt_alloc_buf(&ctx, *optlen);
+	/* Allocate a bit more than the initial user buffer for
+	 * BPF program. The canonical use case is overriding
+	 * TCP_CONGESTION(nv) to TCP_CONGESTION(cubic).
+	 */
+	max_optlen = max_t(int, 16, *optlen);
+
+	ret = sockopt_alloc_buf(&ctx, max_optlen);
 	if (ret)
 		return ret;
 
+	ctx.optlen = *optlen;
+
 	if (copy_from_user(ctx.optval, optval, *optlen) != 0) {
 		ret = -EFAULT;
 		goto out;
@@ -1016,7 +1023,7 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
 	if (ctx.optlen == -1) {
 		/* optlen set to -1, bypass kernel */
 		ret = 1;
-	} else if (ctx.optlen > *optlen || ctx.optlen < -1) {
+	} else if (ctx.optlen > max_optlen || ctx.optlen < -1) {
 		/* optlen is out of bounds */
 		ret = -EFAULT;
 	} else {
@@ -1063,6 +1070,8 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
 	if (ret)
 		return ret;
 
+	ctx.optlen = max_optlen;
+
 	if (!retval) {
 		/* If kernel getsockopt finished successfully,
 		 * copy whatever was returned to the user back
-- 
2.22.0.709.g102302147b-goog


^ permalink raw reply related
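
The sizing rule in the patch above can be sketched in plain user-space C. This is only an illustration under stated assumptions: `SKETCH_MIN_OPTLEN`, `sketch_max_optlen()` and `sketch_optlen_ok()` are hypothetical names; the value 16 and the accepted range (-1 up to the allocated size) are taken from the diff.

```c
#include <assert.h>

/* Assumed to mirror the 16-byte minimum chosen in the patch. */
#define SKETCH_MIN_OPTLEN 16

/* Size of the kernel-side buffer: never smaller than the minimum. */
static int sketch_max_optlen(int user_optlen)
{
	assert(user_optlen >= 0);
	return user_optlen > SKETCH_MIN_OPTLEN ? user_optlen : SKETCH_MIN_OPTLEN;
}

/*
 * Returns 1 if the length written back by the BPF program is acceptable:
 * -1 means "bypass the kernel", otherwise it must fit the allocated buffer.
 */
static int sketch_optlen_ok(int prog_optlen, int max_optlen)
{
	return prog_optlen >= -1 && prog_optlen <= max_optlen;
}
```

With a two-byte "nv" input the buffer is still 16 bytes, so a five-byte "cubic" override passes the bounds check.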

* [PATCH bpf-next 2/2] selftests/bpf: extend sockopt_sk selftest with TCP_CONGESTION use case
From: Stanislav Fomichev @ 2019-07-29 21:51 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev
In-Reply-To: <20190729215111.209219-1-sdf@google.com>

Ignore SOL_TCP:TCP_CONGESTION in getsockopt and always override
SOL_TCP:TCP_CONGESTION with "cubic" in setsockopt hook.

Call setsockopt(SOL_TCP, TCP_CONGESTION) with short optval ("nv")
to make sure BPF program has enough buffer space to replace it
with "cubic".

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../testing/selftests/bpf/progs/sockopt_sk.c  | 22 ++++++++++++++++
 tools/testing/selftests/bpf/test_sockopt_sk.c | 25 +++++++++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/sockopt_sk.c b/tools/testing/selftests/bpf/progs/sockopt_sk.c
index 076122c898e9..9a3d1c79e6fe 100644
--- a/tools/testing/selftests/bpf/progs/sockopt_sk.c
+++ b/tools/testing/selftests/bpf/progs/sockopt_sk.c
@@ -1,5 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
+#include <string.h>
 #include <netinet/in.h>
+#include <netinet/tcp.h>
 #include <linux/bpf.h>
 #include "bpf_helpers.h"
 
@@ -42,6 +44,14 @@ int _getsockopt(struct bpf_sockopt *ctx)
 		return 1;
 	}
 
+	if (ctx->level == SOL_TCP && ctx->optname == TCP_CONGESTION) {
+		/* Not interested in SOL_TCP:TCP_CONGESTION;
+		 * let next BPF program in the cgroup chain or kernel
+		 * handle it.
+		 */
+		return 1;
+	}
+
 	if (ctx->level != SOL_CUSTOM)
 		return 0; /* EPERM, deny everything except custom level */
 
@@ -91,6 +101,18 @@ int _setsockopt(struct bpf_sockopt *ctx)
 		return 1;
 	}
 
+	if (ctx->level == SOL_TCP && ctx->optname == TCP_CONGESTION) {
+		/* Always use cubic */
+
+		if (optval + 5 > optval_end)
+			return 0; /* EPERM, bounds check */
+
+		memcpy(optval, "cubic", 5);
+		ctx->optlen = 5;
+
+		return 1;
+	}
+
 	if (ctx->level != SOL_CUSTOM)
 		return 0; /* EPERM, deny everything except custom level */
 
diff --git a/tools/testing/selftests/bpf/test_sockopt_sk.c b/tools/testing/selftests/bpf/test_sockopt_sk.c
index 036b652e5ca9..e4f6055d92e9 100644
--- a/tools/testing/selftests/bpf/test_sockopt_sk.c
+++ b/tools/testing/selftests/bpf/test_sockopt_sk.c
@@ -6,6 +6,7 @@
 #include <sys/types.h>
 #include <sys/socket.h>
 #include <netinet/in.h>
+#include <netinet/tcp.h>
 
 #include <linux/filter.h>
 #include <bpf/bpf.h>
@@ -25,6 +26,7 @@ static int getsetsockopt(void)
 	union {
 		char u8[4];
 		__u32 u32;
+		char cc[16]; /* TCP_CA_NAME_MAX */
 	} buf = {};
 	socklen_t optlen;
 
@@ -115,6 +117,29 @@ static int getsetsockopt(void)
 		goto err;
 	}
 
+	/* TCP_CONGESTION can extend the string */
+
+	strcpy(buf.cc, "nv");
+	err = setsockopt(fd, SOL_TCP, TCP_CONGESTION, &buf, strlen("nv"));
+	if (err) {
+		log_err("Failed to call setsockopt(TCP_CONGESTION)");
+		goto err;
+	}
+
+
+	optlen = sizeof(buf.cc);
+	err = getsockopt(fd, SOL_TCP, TCP_CONGESTION, &buf, &optlen);
+	if (err) {
+		log_err("Failed to call getsockopt(TCP_CONGESTION)");
+		goto err;
+	}
+
+	if (strcmp(buf.cc, "cubic") != 0) {
+		log_err("Unexpected getsockopt(TCP_CONGESTION) %s != %s",
+			buf.cc, "cubic");
+		goto err;
+	}
+
 	close(fd);
 	return 0;
 err:
-- 
2.22.0.709.g102302147b-goog


^ permalink raw reply related
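
The bounds-checked override in the selftest program above can be mimicked outside BPF. The sketch below is an assumption-laden user-space analogue: `struct sockopt_sketch` and `sketch_override_cc()` are hypothetical, and the space check is written as a pointer difference rather than the `optval + 5 > optval_end` form used in the BPF program.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical user-space stand-in for the fields of struct bpf_sockopt. */
struct sockopt_sketch {
	char *optval;     /* start of the option buffer */
	char *optval_end; /* one past the last usable byte */
	int optlen;       /* length reported back to the kernel */
};

/* Replace the option value with "cubic"; returns 1 on success, 0 if it does not fit. */
static int sketch_override_cc(struct sockopt_sketch *ctx)
{
	static const char name[] = "cubic";
	size_t n = sizeof(name) - 1; /* 5 bytes, no terminating NUL */

	assert(ctx != NULL && ctx->optval <= ctx->optval_end);
	if ((size_t)(ctx->optval_end - ctx->optval) < n)
		return 0; /* not enough room, reject */

	memcpy(ctx->optval, name, n);
	ctx->optlen = (int)n;
	return 1;
}
```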

* Re: [PATCH 1/3 net-next] linux: Add skb_frag_t page_offset accessors
From: Jonathan Lemon @ 2019-07-29 21:53 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, kernel-team, netdev, Matthew Wilcox
In-Reply-To: <20190729142548.028d5a2b@cakuba.netronome.com>



On 29 Jul 2019, at 14:25, Jakub Kicinski wrote:

> On Mon, 29 Jul 2019 14:22:11 -0700, Jakub Kicinski wrote:
>>>> I realize you're following the existing code, but should we perhaps
>>>> use
>>>> the latest kdoc syntax? '()' after function name, and args should have
>>>> '@' prefix, '%' would be for constants.
>>>
>>> That would be a task for a different cleanup.  Not that I disagree with
>>> you, but there's also nothing worse than mixing styles in the same file.
>>
>> Funny you should say that given that (a) I'm commenting on the new code
>> you're adding, and (b) you did do an unrelated spelling fix above ;)
>
> Ah, sorry I misread your comment there.
>
> Some code already uses '()' in this file, as for the '%' skb_frag_
> functions are the only one which have this mistake, the rest of kdoc
> is correct.

The kernel-doc.rst guide seems to indicate that function names should
have () at the end - but none of them do so within this file.  (only when
talking about the function in the document).

The %CONST indicates name of a constant - I'm unclear whether this is
supposed to refer to a constant parameter.  For example:

/**
 *      __skb_peek - peek at the head of a non-empty &sk_buff_head
 *      @list_: list to peek at
 *
 *      Like skb_peek(), but the caller knows that the list is not empty.
 */
static inline struct sk_buff *__skb_peek(const struct sk_buff_head *list_)
{
        return list_->next;
}

^ permalink raw reply

* Re: [PATCH net-next 00/16] bnxt_en: Add TPA (GRO_HW and LRO) on 57500 chips.
From: Michael Chan @ 2019-07-29 22:00 UTC (permalink / raw)
  To: David Miller; +Cc: Netdev
In-Reply-To: <20190729.142455.1728471766679878919.davem@davemloft.net>

On Mon, Jul 29, 2019 at 2:24 PM David Miller <davem@davemloft.net> wrote:
>
> From: Michael Chan <michael.chan@broadcom.com>
> Date: Mon, 29 Jul 2019 06:10:17 -0400
>
> > This patchset adds TPA v2 support on the 57500 chips.  TPA v2 is
> > different from the legacy TPA scheme on older chips and requires major
> > refactoring and restructuring of the existing TPA logic.  The main
> > difference is that the new TPA v2 has on-the-fly aggregation buffer
> > completions before a TPA packet is completed.  The larger aggregation
> > ID space also requires a new ID mapping logic to make it more
> > memory efficient.
>
> Series applied, but please explain something to me.
>
> I thought initially while reviewing this that patch #5 makes the series
> non-bisectable because this only includes the logic that appends new
> entries to the agg array but lacks the changes to reset the agg count
> at TPE end time (which occurs in patch #8).
>
> However I then realized that you haven't turned on the logic yet that
> can result in CMP_TYPE_RX_TPA_AGG_CMP entries in this context.
>
> Am I right?

Yes, correct.  Everything is built up incrementally and the new GRO_HW
and LRO features on the new chip can only be enabled after patch #14.

Thanks.

^ permalink raw reply

* Re: [PATCH 1/3 net-next] linux: Add skb_frag_t page_offset accessors
From: Jakub Kicinski @ 2019-07-29 22:02 UTC (permalink / raw)
  To: Jonathan Lemon; +Cc: davem, kernel-team, netdev, Matthew Wilcox
In-Reply-To: <94802E4A-536A-4249-BEA3-5D89E8073738@gmail.com>

On Mon, 29 Jul 2019 14:53:45 -0700, Jonathan Lemon wrote:
> On 29 Jul 2019, at 14:25, Jakub Kicinski wrote:
> 
> > On Mon, 29 Jul 2019 14:22:11 -0700, Jakub Kicinski wrote:  
> >>>> I realize you're following the existing code, but should we perhaps
> >>>> use
> >>>> the latest kdoc syntax? '()' after function name, and args should have
> >>>> '@' prefix, '%' would be for constants.  
> >>>
> >>> That would be a task for a different cleanup.  Not that I disagree with
> >>> you, but there's also nothing worse than mixing styles in the same file.  
> >>
> >> Funny you should say that given that (a) I'm commenting on the new code
> >> you're adding, and (b) you did do an unrelated spelling fix above ;)  
> >
> > Ah, sorry I misread your comment there.
> >
> > Some code already uses '()' in this file, as for the '%' skb_frag_
> > functions are the only one which have this mistake, the rest of kdoc
> > is correct.  
> 
> The kernel-doc.rst guide seems to indicate that function names should
> have () at the end - but none of them do so within this file.  (only when
> talking about the function in the document).

/**
 * skb_complete_tx_timestamp() - deliver cloned skb with tx timestamps

/**
 * skb_tx_timestamp() - Driver hook for transmit timestamping

> The %CONST indicates name of a constant - I'm unclear whether this is
> supposed to refer to a constant parameter.  For example:
> 
> /**
>  *      __skb_peek - peek at the head of a non-empty &sk_buff_head
>  *      @list_: list to peek at
>  *
>  *      Like skb_peek(), but the caller knows that the list is not empty.
>  */
> static inline struct sk_buff *__skb_peek(const struct sk_buff_head *list_)
> {
>         return list_->next;
> }

Hmm.. I'm not sure I follow, this example does not use %, but & which
is for types. Quoting from:

https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#highlights-and-cross-references

@parameter
    Name of a function parameter. (No cross-referencing, just formatting.)
%CONST
    Name of a constant. (No cross-referencing, just formatting.)


So in your case you should use @delta, rather than %delta.
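
As a rough illustration of the syntax described above (a sketch, not the code from the patch): `struct kdoc_frag` and `kdoc_frag_off_add()` are hypothetical user-space stand-ins, while the '()' after the function name and the '@' prefix on parameters follow the kernel-doc guide quoted in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for skb_frag_t; only the field used here. */
struct kdoc_frag {
	unsigned int page_offset;
};

/**
 * kdoc_frag_off_add() - Increment the offset of a fragment by @delta
 * @frag: fragment to update
 * @delta: value to add
 *
 * Parameters are referenced with '@'; '%' is kept for constants such
 * as %NULL.
 */
static inline void kdoc_frag_off_add(struct kdoc_frag *frag, int delta)
{
	assert(frag != NULL);
	frag->page_offset += delta;
}
```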

^ permalink raw reply

* [PATCH] net: smc911x: Mark expected switch fall-through
From: Gustavo A. R. Silva @ 2019-07-29 22:10 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-kernel, Gustavo A. R. Silva, Kees Cook

Mark switch cases where we are expecting to fall through.

This patch fixes the following warning (Building: arm):

drivers/net/ethernet/smsc/smc911x.c: In function ‘smc911x_phy_detect’:
drivers/net/ethernet/smsc/smc911x.c:677:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
    if (cfg & HW_CFG_EXT_PHY_DET_) {
       ^
drivers/net/ethernet/smsc/smc911x.c:715:3: note: here
   default:
   ^~~~~~~

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
---
 drivers/net/ethernet/smsc/smc911x.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/smsc/smc911x.c b/drivers/net/ethernet/smsc/smc911x.c
index bd14803545de..8d88e4083456 100644
--- a/drivers/net/ethernet/smsc/smc911x.c
+++ b/drivers/net/ethernet/smsc/smc911x.c
@@ -712,6 +712,7 @@ static void smc911x_phy_detect(struct net_device *dev)
 					/* Found an external PHY */
 					break;
 			}
+			/* Else, fall through */
 		default:
 			/* Internal media only */
 			SMC_GET_PHY_ID1(lp, 1, id1);
-- 
2.22.0


^ permalink raw reply related
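
A stripped-down sketch of the pattern the patch above annotates, assuming a made-up `sketch_phy_detect()` rather than the real smc911x code: the `break` is taken only when the external PHY is found, otherwise control intentionally continues into `default`, and the comment placed directly before the label is what tells the compiler this is expected.

```c
#include <assert.h>

/* Returns 1 for an external PHY, 0 for internal media (hypothetical encoding). */
static int sketch_phy_detect(int chip_has_ext_option, int ext_phy_found)
{
	int kind;

	assert(ext_phy_found == 0 || ext_phy_found == 1);

	switch (chip_has_ext_option) {
	case 1:
		if (ext_phy_found) {
			kind = 1; /* found an external PHY */
			break;
		}
		/* Else, fall through */
	default:
		kind = 0; /* internal media only */
	}
	return kind;
}
```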

* Re: [PATCH] net: smc911x: Mark expected switch fall-through
From: David Miller @ 2019-07-29 22:13 UTC (permalink / raw)
  To: gustavo; +Cc: netdev, linux-kernel, keescook
In-Reply-To: <20190729221016.GA17610@embeddedor>

From: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Date: Mon, 29 Jul 2019 17:10:16 -0500

> Mark switch cases where we are expecting to fall through.
> 
> This patch fixes the following warning (Building: arm):
> 
> drivers/net/ethernet/smsc/smc911x.c: In function ‘smc911x_phy_detect’:
> drivers/net/ethernet/smsc/smc911x.c:677:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
>     if (cfg & HW_CFG_EXT_PHY_DET_) {
>        ^
> drivers/net/ethernet/smsc/smc911x.c:715:3: note: here
>    default:
>    ^~~~~~~
> 
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

Applied.

^ permalink raw reply

* [PATCH] tipc: compat: allow tipc commands without arguments
From: Taras Kondratiuk @ 2019-07-29 22:15 UTC (permalink / raw)
  To: Jon Maloy, Ying Xue, David S. Miller
  Cc: netdev, tipc-discussion, linux-kernel, xe-linux-external, stable

Commit 2753ca5d9009 ("tipc: fix uninit-value in tipc_nl_compat_doit")
broke older tipc tools that use the compat interface (e.g. tipc-config
from the tipcutils package):

% tipc-config -p
operation not supported

The commit started to reject TIPC netlink compat messages that do not
have attributes. It is too restrictive because some of such messages are
valid (they don't need any arguments):

% grep 'tx none' include/uapi/linux/tipc_config.h
#define  TIPC_CMD_NOOP              0x0000    /* tx none, rx none */
#define  TIPC_CMD_GET_MEDIA_NAMES   0x0002    /* tx none, rx media_name(s) */
#define  TIPC_CMD_GET_BEARER_NAMES  0x0003    /* tx none, rx bearer_name(s) */
#define  TIPC_CMD_SHOW_PORTS        0x0006    /* tx none, rx ultra_string */
#define  TIPC_CMD_GET_REMOTE_MNG    0x4003    /* tx none, rx unsigned */
#define  TIPC_CMD_GET_MAX_PORTS     0x4004    /* tx none, rx unsigned */
#define  TIPC_CMD_GET_NETID         0x400B    /* tx none, rx unsigned */
#define  TIPC_CMD_NOT_NET_ADMIN     0xC001    /* tx none, rx none */

This patch relaxes the original fix and rejects messages without
arguments only if such arguments are expected by a command (req_type is
non-zero).

Fixes: 2753ca5d9009 ("tipc: fix uninit-value in tipc_nl_compat_doit")
Cc: stable@vger.kernel.org
Signed-off-by: Taras Kondratiuk <takondra@cisco.com>
---
The patch is based on v5.3-rc2.

 net/tipc/netlink_compat.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c
index d86030ef1232..e135d4e11231 100644
--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
@@ -55,6 +55,7 @@ struct tipc_nl_compat_msg {
 	int rep_type;
 	int rep_size;
 	int req_type;
+	int req_size;
 	struct net *net;
 	struct sk_buff *rep;
 	struct tlv_desc *req;
@@ -257,7 +258,8 @@ static int tipc_nl_compat_dumpit(struct tipc_nl_compat_cmd_dump *cmd,
 	int err;
 	struct sk_buff *arg;
 
-	if (msg->req_type && !TLV_CHECK_TYPE(msg->req, msg->req_type))
+	if (msg->req_type && (!msg->req_size ||
+			      !TLV_CHECK_TYPE(msg->req, msg->req_type)))
 		return -EINVAL;
 
 	msg->rep = tipc_tlv_alloc(msg->rep_size);
@@ -354,7 +356,8 @@ static int tipc_nl_compat_doit(struct tipc_nl_compat_cmd_doit *cmd,
 {
 	int err;
 
-	if (msg->req_type && !TLV_CHECK_TYPE(msg->req, msg->req_type))
+	if (msg->req_type && (!msg->req_size ||
+			      !TLV_CHECK_TYPE(msg->req, msg->req_type)))
 		return -EINVAL;
 
 	err = __tipc_nl_compat_doit(cmd, msg);
@@ -1278,8 +1281,8 @@ static int tipc_nl_compat_recv(struct sk_buff *skb, struct genl_info *info)
 		goto send;
 	}
 
-	len = nlmsg_attrlen(req_nlh, GENL_HDRLEN + TIPC_GENL_HDRLEN);
-	if (!len || !TLV_OK(msg.req, len)) {
+	msg.req_size = nlmsg_attrlen(req_nlh, GENL_HDRLEN + TIPC_GENL_HDRLEN);
+	if (msg.req_size && !TLV_OK(msg.req, msg.req_size)) {
 		msg.rep = tipc_get_err_tlv(TIPC_CFG_NOT_SUPPORTED);
 		err = -EOPNOTSUPP;
 		goto send;
-- 
2.19.1


^ permalink raw reply related
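
The relaxed check in the patch above boils down to a small predicate. The following is a minimal user-space sketch: `struct compat_msg_sketch` and `sketch_request_valid()` are hypothetical, and the `type_matches` field stands in for the result of TLV_CHECK_TYPE(), which is not reimplemented here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduction of struct tipc_nl_compat_msg to what the check needs. */
struct compat_msg_sketch {
	int req_type;      /* 0 when the command takes no argument */
	int req_size;      /* length of the attribute payload, 0 if absent */
	bool type_matches; /* stand-in for TLV_CHECK_TYPE(req, req_type) */
};

/* A request is rejected only when an argument is expected but missing or of the wrong type. */
static bool sketch_request_valid(const struct compat_msg_sketch *msg)
{
	assert(msg != NULL);
	if (!msg->req_type)
		return true; /* "tx none" commands need no payload */
	return msg->req_size != 0 && msg->type_matches;
}
```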

* Re: [PATCH net] net: ipv6: Fix a bug in ndisc_send_ns when netdev only has a global address
From: David Ahern @ 2019-07-29 22:28 UTC (permalink / raw)
  To: David Miller, suyj.fnst; +Cc: kuznet, yoshfuji, netdev
In-Reply-To: <20190729.141752.457438545178811941.davem@davemloft.net>

On 7/29/19 3:17 PM, David Miller wrote:
> David, can you take a quick look at this?

will do. I'll get back to you by tomorrow.

^ permalink raw reply
