Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH v1 2/3] net: hisilicon: fix hip04-xmit never return TX_BUSY
From: Jiangfeng Xiao @ 2019-08-03 12:31 UTC (permalink / raw)
  To: davem, yisen.zhuang, salil.mehta, xiaojiangfeng
  Cc: netdev, linux-kernel, leeyou.li, xiaowei774, nixiaoming
In-Reply-To: <1564835501-90257-1-git-send-email-xiaojiangfeng@huawei.com>

TX_DESC_NUM is 256, in tx_count, the maximum value of
mod(TX_DESC_NUM - 1) is 254, the variable "count" in
the hip04_mac_start_xmit function is never equal to
(TX_DESC_NUM - 1), so hip04_mac_start_xmit never
return NETDEV_TX_BUSY.

tx_count is modified to mod(TX_DESC_NUM) so that
the maximum value of tx_count can reach
(TX_DESC_NUM - 1), then hip04_mac_start_xmit can reurn
NETDEV_TX_BUSY.

Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
---
 drivers/net/ethernet/hisilicon/hip04_eth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c
index 1e1b154..d775b98 100644
--- a/drivers/net/ethernet/hisilicon/hip04_eth.c
+++ b/drivers/net/ethernet/hisilicon/hip04_eth.c
@@ -248,7 +248,7 @@ struct hip04_priv {
 
 static inline unsigned int tx_count(unsigned int head, unsigned int tail)
 {
-	return (head - tail) % (TX_DESC_NUM - 1);
+	return (head - tail) % TX_DESC_NUM;
 }
 
 static void hip04_config_port(struct net_device *ndev, u32 speed, u32 duplex)
-- 
1.8.5.6


^ permalink raw reply related

* Re: [PATCH] isdn: hysdn: Fix error spaces around '*'
From: Greg KH @ 2019-08-03 11:24 UTC (permalink / raw)
  To: Joe Perches; +Cc: Jose Carlos Cazarin Filho, devel, netdev, isdn, linux-kernel
In-Reply-To: <6ff800ceda4b1c1f1d9e519aac13db42dc703294.camel@perches.com>

On Sat, Aug 03, 2019 at 04:15:05AM -0700, Joe Perches wrote:
> On Sat, 2019-08-03 at 08:32 +0200, Greg KH wrote:
> > On Fri, Aug 02, 2019 at 07:56:02PM +0000, Jose Carlos Cazarin Filho wrote:
> > > Fix checkpath error:
> > > CHECK: spaces preferred around that '*' (ctx:WxV)
> > > +extern hysdn_card *card_root;        /* pointer to first card */
> []
> > > diff --git a/drivers/staging/isdn/hysdn/hysdn_defs.h b/drivers/staging/isdn/hysdn/hysdn_defs.h
> []
> > > @@ -220,7 +220,7 @@ typedef struct hycapictrl_info hycapictrl_info;
> > >  /*****************/
> > >  /* exported vars */
> > >  /*****************/
> > > -extern hysdn_card *card_root;	/* pointer to first card */
> > > +extern hysdn_card * card_root;	/* pointer to first card */
> > 
> > The original code here is correct, checkpatch must be reporting this
> > incorrectly.
> 
> Here checkpatch thinks that hydsn_card is an identifier rather
> than a typedef.
> 
> It's defined as:
> 	typedef struct HYSDN_CARD {
> 	...
> 	} hysdn_card;
> 
> And that confuses checkpatch.
> 
> kernel source code style would not use a typedef for a struct.
> 
> A change would be to remove the typedef and declare this as:
> 	struct hysdn_card {
> 		...
> 	};
> 
> And then do a global:
> 	sed 's/\bhysdn_card\b/struct hysdn_card/g'
> 
> But that's not necessary as the driver is likely to be removed.

Ah, that makes sense why checkpatch did this, thanks for the
information.

And yes, it's not worth being changed, as this is going to be deleted.
But, I bet we get this sent a lot until it is as it's "easy pickings" :)

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH] isdn: hysdn: Fix error spaces around '*'
From: Joe Perches @ 2019-08-03 11:15 UTC (permalink / raw)
  To: Greg KH, Jose Carlos Cazarin Filho; +Cc: isdn, devel, netdev, linux-kernel
In-Reply-To: <20190803063246.GA10186@kroah.com>

On Sat, 2019-08-03 at 08:32 +0200, Greg KH wrote:
> On Fri, Aug 02, 2019 at 07:56:02PM +0000, Jose Carlos Cazarin Filho wrote:
> > Fix checkpath error:
> > CHECK: spaces preferred around that '*' (ctx:WxV)
> > +extern hysdn_card *card_root;        /* pointer to first card */
[]
> > diff --git a/drivers/staging/isdn/hysdn/hysdn_defs.h b/drivers/staging/isdn/hysdn/hysdn_defs.h
[]
> > @@ -220,7 +220,7 @@ typedef struct hycapictrl_info hycapictrl_info;
> >  /*****************/
> >  /* exported vars */
> >  /*****************/
> > -extern hysdn_card *card_root;	/* pointer to first card */
> > +extern hysdn_card * card_root;	/* pointer to first card */
> 
> The original code here is correct, checkpatch must be reporting this
> incorrectly.

Here checkpatch thinks that hydsn_card is an identifier rather
than a typedef.

It's defined as:
	typedef struct HYSDN_CARD {
	...
	} hysdn_card;

And that confuses checkpatch.

kernel source code style would not use a typedef for a struct.

A change would be to remove the typedef and declare this as:
	struct hysdn_card {
		...
	};

And then do a global:
	sed 's/\bhysdn_card\b/struct hysdn_card/g'

But that's not necessary as the driver is likely to be removed.



^ permalink raw reply

* Re: [v2,1/2] tools: bpftool: add net attach command to attach XDP on interface
From: Daniel T. Lee @ 2019-08-03  9:39 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Daniel Borkmann, Alexei Starovoitov, netdev
In-Reply-To: <20190802113935.63be803a@cakuba.netronome.com>

On Sat, Aug 3, 2019 at 3:39 AM Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> On Fri, 2 Aug 2019 14:02:29 +0900, Daniel T. Lee wrote:
> > On Fri, Aug 2, 2019 at 8:36 AM Jakub Kicinski  wrote:
> > > On Thu,  1 Aug 2019 17:11:32 +0900, Daniel T. Lee wrote:
> > > > By this commit, using `bpftool net attach`, user can attach XDP prog on
> > > > interface. New type of enum 'net_attach_type' has been made, as stated at
> > > > cover-letter, the meaning of 'attach' is, prog will be attached on interface.
> > > >
> > > > BPF prog will be attached through libbpf 'bpf_set_link_xdp_fd'.
> > > >
> > > > Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
> > > > ---
> > > > Changes in v2:
> > > >   - command 'load' changed to 'attach' for the consistency
> > > >   - 'NET_ATTACH_TYPE_XDP_DRIVE' changed to 'NET_ATTACH_TYPE_XDP_DRIVER'
> > > >
> > > >  tools/bpf/bpftool/net.c | 107 +++++++++++++++++++++++++++++++++++++++-
> > > >  1 file changed, 106 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
> > > > index 67e99c56bc88..f3b57660b303 100644
> > > > --- a/tools/bpf/bpftool/net.c
> > > > +++ b/tools/bpf/bpftool/net.c
> > > > @@ -55,6 +55,35 @@ struct bpf_attach_info {
> > > >       __u32 flow_dissector_id;
> > > >  };
> > > >
> > > > +enum net_attach_type {
> > > > +     NET_ATTACH_TYPE_XDP,
> > > > +     NET_ATTACH_TYPE_XDP_GENERIC,
> > > > +     NET_ATTACH_TYPE_XDP_DRIVER,
> > > > +     NET_ATTACH_TYPE_XDP_OFFLOAD,
> > > > +     __MAX_NET_ATTACH_TYPE
> > > > +};
> > > > +
> > > > +static const char * const attach_type_strings[] = {
> > > > +     [NET_ATTACH_TYPE_XDP] = "xdp",
> > > > +     [NET_ATTACH_TYPE_XDP_GENERIC] = "xdpgeneric",
> > > > +     [NET_ATTACH_TYPE_XDP_DRIVER] = "xdpdrv",
> > > > +     [NET_ATTACH_TYPE_XDP_OFFLOAD] = "xdpoffload",
> > > > +     [__MAX_NET_ATTACH_TYPE] = NULL,
> > >
> > > Not sure if the terminator is necessary,
> > > ARRAY_SIZE(attach_type_strings) should suffice?
> >
> > Yes, ARRAY_SIZE is fine though. But I was just trying to make below
> > 'parse_attach_type' consistent with 'parse_attach_type' from the 'prog.c'.
> > At 'prog.c', It has same terminator at 'attach_type_strings'.
> >
> > Should I change it or keep it?
>
> Oh well, I guess there is some precedent for that :S
>
> Quick grep for const char * const reveals we have around 7 non-NULL
> terminated arrays, and 2 NULL terminated. Plus the NULL-terminated
> don't align the '=' sign, while most do.
>
> it's not a big deal, my preference is for not NULL terminating here,
> and aligning '='.
>

I think following the majority, is better for this case.

Like you mentioned, those 2 NULL-terminated arrays are at 'cgroup.c'
and 'prog.c'
and the strings they are defining are 'bpf_attach_type' which is from
uapi 'bpf.h'.

And, in my guess, the reason for those 2 arrays uses NULL-terminator
is because enum from 'bpf_attach_type' has '__MAX_BPF_ATTACH_TYPE' at
the end.

Actually I was kind of uncomfortable with adding an enum since it's
not used globally and *only* used in 'net.c' context. Instead of using
hacky enum entry, stick to the majority, ARRAY_SIZE, is better to keep
consistency.

I'll update this to next version with '=' alignment.

>
> > > > +     NEXT_ARG();
> > > > +     if (!REQ_ARGS(1))
> > > > +             return -EINVAL;
> > >
> > > Error message needed here.
> > >
> >
> > Actually it provides error message like:
> > Error: 'xdp' needs at least 1 arguments, 0 found
> >
> > are you suggesting that any additional error message is necessary?
>
> Ah, sorry, I missed REQ_ARGS() there!
>
> > > > +             return -EINVAL;
> > > > +     }
> > >
> > > Please require the dev keyword before the interface name.
> > > That'll make it feel closer to prog load syntax.
> >
> > If adding the dev keyword before interface name, will it be too long to type in?
>
> I think it's probably muscle memory for most. Plus we have excellent
> bash completions.
>
> > and also `bpftool prog` use extra keyword (such as dev) when it is
> > optional keyword.
> >
> >        bpftool prog dump jited  PROG [{ file FILE | opcodes | linum }]
> >        bpftool prog pin   PROG FILE
> >        bpftool prog { load | loadall } OBJ  PATH \
> >
> > as you can see here, FILE uses optional keyword 'file' when the
> > argument is optional.
>
> Not sure I follow
>

command 'dump' has optional argument '[ file FILE ]',
and command 'pin' has required argument with 'FILE'.

I thought the preceding optional keyword 'file' with lower-case is
intended for the optional argument since it isn't preceded when the
argument is required.

> >        bpftool prog { load | loadall } OBJ  PATH \
> >                          [type TYPE] [dev NAME] \
> >                          [map { idx IDX | name NAME } MAP]\
> >                          [pinmaps MAP_DIR]
> >
> > Yes, bpftool prog load has dev keyword with it,
> >
> > but first, like previous, the argument is optional so i think it is
> > unnecessary to use optional keyword 'dev'.
>
> The keyword should not be optional if device name is specified.
> Maybe lack of coffee on my side..
>
> > and secondly, 'bpftool net attach' isn't really related to 'bpftool prog load'.
> >
> > At previous version patch, I was using word 'load' instead of
> > 'attach', since XDP program is not
> > considered as 'BPF_PROG_ATTACH', so it might give a confusion. However
> > by the last patch discussion,
> > word 'load' has been replaced to 'attach'.
> >
> > Keeping the consistency is very important, but I was just wandering
> > about making command
> > similar to 'bpftool prog load' syntax.
>
> In case of TC the device argument is optional. You may specify it, or
> you can refer to TC blocks instead. So for that reason alone I think
> it'll be much cleaner to require dev before the interface name.
>

Previously I didn't really considered TC.

Considering the extensibility, since device argument could be optional,
requiring dev before the interface name seems necessary.

Thank you for letting me know! :)

> > > > +     return 0;
> > > > +}
> > > > +
> > > > +static int do_attach_detach_xdp(int *progfd, enum net_attach_type *attach_type,
> > > > +                             int *ifindex)
> > > > +{
> > > > +     __u32 flags;
> > > > +     int err;
> > > > +
> > > > +     flags = XDP_FLAGS_UPDATE_IF_NOEXIST;
> > >
> > > Please add this as an option so that user can decide whether overwrite
> > > is allowed or not.
> >
> > Adding force flag to bpftool seems necessary.
> > I will add an optional argument for this.
>
> Right, I was wondering if we want to call it force, though? force is
> sort of a reuse of iproute2 concept. But it's kind of hard to come up
> with names.
>
> Just to be sure - I mean something like:
>
> bpftool net attach xdp id xyz dev ethN noreplace
>
> Rather than:
>
> bpftool -f net attach xdp id xyz dev ethN
>

How about a word 'immutable'? 'noreplace' seems good though.
Just suggesting an option.

> > > > +     if (*attach_type == NET_ATTACH_TYPE_XDP_GENERIC)
> > > > +             flags |= XDP_FLAGS_SKB_MODE;
> > > > +     if (*attach_type == NET_ATTACH_TYPE_XDP_DRIVER)
> > > > +             flags |= XDP_FLAGS_DRV_MODE;
> > > > +     if (*attach_type == NET_ATTACH_TYPE_XDP_OFFLOAD)
> > > > +             flags |= XDP_FLAGS_HW_MODE;
> > > > +
> > > > +     err = bpf_set_link_xdp_fd(*ifindex, *progfd, flags);
> > > > +
> > > > +     return err;
> > >
> > > no need for the err variable here.
> >
> > My apologies, but I'm not sure why err variable isn't needed at here.
> > AFAIK, 'bpf_set_link_xdp_fd' from libbpf returns the netlink_recv result,
> > and in order to propagate error, err variable is necessary, I guess?
>
>         return bpf_set_link_xdp_fd(*ifindex, *progfd, flags);
>
> Is what I meant.
>

Oops. I've got it wrong.
I'll update to next patch.

> > > > +}
> > > > +
> > > > +static int do_attach(int argc, char **argv)
> > > > +{
> > > > +     enum net_attach_type attach_type;
> > > > +     int err, progfd, ifindex;
> > > > +
> > > > +     err = parse_attach_args(argc, argv, &progfd, &attach_type, &ifindex);
> > > > +     if (err)
> > > > +             return err;
> > >
> > > Probably not the best idea to move this out into a helper.
> >
> > Again, just trying to make consistent with 'prog.c'.
> >
> > But clearly it has differences with do_attach/detach from 'prog.c'.
> > From it, it uses the same parse logic 'parse_attach_detach_args' since
> > the two command 'bpftool prog attach/detach' uses the same argument format.
> >
> > However, in here, 'bpftool net' attach and detach requires different number of
> > argument, so function for parse argument has been defined separately.
> > The situation is little bit different, but keeping argument parse logic as an
> > helper, I think it's better in terms of consistency.
>
> Well they won't share the same arguments if you add the keyword for
> controlling IF_NOEXIST :(
>

My apologies, but I think I'm not following with you.

Currently detach/attach isn't sharing same arguments.
And even after adding the argument for IF_NOEXIST such as  [ noreplace ],
it'll stays the same for not sharing same arguments.

I'm not sure why it is not the best idea to move arg parse logic into a helper.

> > About the moving parse logic to a helper, I was trying to keep command
> > entry (do_attach)
> > as simple as possible. Parse all the argument in command entry will
> > make function longer
> > and might make harder to understand what it does.
> >
> > And I'm not pretty sure that argument parse logic will stays the same
> > after other attachment
> > type comes in. What I mean is, the argument count or type might be
> > added and to fulfill
> > all that specific cases, the code might grow larger.
> >
> > So for the consistency, simplicity and extensibility, I prefer to keep
> > it as a helper.
> >
> > > > +     if (is_prefix("xdp", attach_type_strings[attach_type]))
> > > > +             err = do_attach_detach_xdp(&progfd, &attach_type, &ifindex);
> > >
> > > Hm. We either need an error to be reported if it's not xdp or since we
> > > only accept XDP now perhaps the if() is superfluous?
> >
> > Well, if the attach_type isn't xdp, the error will be occurred from
> > the argument parse,
> > Will it be necessary to reinforce with error logic to make it more secure?
>
> Hm. it should already be fine, no? For non-xdp parse_attach_type() will
> return __MAX_NET_ATTACH_TYPE, then parsing returns EINVAL and we exit.
> Not sure I follow.

Yes. That was what i meant.

Thank you for taking your time for the review.

^ permalink raw reply

* [PATCH iproute2-next] ss: sctp: Formatting tweak in sctp_show_info for locals
From: Patrick Talbert @ 2019-08-03  8:47 UTC (permalink / raw)
  To: netdev; +Cc: stephen

'locals' output does not include a leading space so it runs up against
skmem:() output. Add a leading space to fix it.

Signed-off-by: Patrick Talbert <ptalbert@redhat.com>
---
 misc/ss.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/misc/ss.c b/misc/ss.c
index 0927b192..5e70709d 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -2937,7 +2937,7 @@ static void sctp_show_info(const struct nlmsghdr *nlh, struct inet_diag_msg *r,
 		len = RTA_PAYLOAD(tb[INET_DIAG_LOCALS]);
 		sa = RTA_DATA(tb[INET_DIAG_LOCALS]);
 
-		out("locals:%s", format_host_sa(sa));
+		out(" locals:%s", format_host_sa(sa));
 		for (sa++, len -= sizeof(*sa); len > 0; sa++, len -= sizeof(*sa))
 			out(",%s", format_host_sa(sa));
 
-- 
2.18.1


^ permalink raw reply related

* [PATCH iproute2-next] ss: sctp: fix typo for nodelay
From: Patrick Talbert @ 2019-08-03  8:37 UTC (permalink / raw)
  To: netdev; +Cc: stephen

nodealy should be nodelay.

Signed-off-by: Patrick Talbert <ptalbert@redhat.com>
---
 misc/ss.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/misc/ss.c b/misc/ss.c
index 0927b192..01b47fed 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -2414,7 +2414,7 @@ static void sctp_stats_print(struct sctp_info *s)
 	if (s->sctpi_s_pd_point)
 		out(" pdpoint:%d", s->sctpi_s_pd_point);
 	if (s->sctpi_s_nodelay)
-		out(" nodealy:%d", s->sctpi_s_nodelay);
+		out(" nodelay:%d", s->sctpi_s_nodelay);
 	if (s->sctpi_s_disable_fragments)
 		out(" nofrag:%d", s->sctpi_s_disable_fragments);
 	if (s->sctpi_s_v4mapped)
-- 
2.18.1


^ permalink raw reply related

* Re: KASAN: use-after-free Read in blkdev_direct_IO
From: syzbot @ 2019-08-03  7:43 UTC (permalink / raw)
  To: arvid.brodin, davem, linux-fsdevel, linux-kernel, netdev,
	syzkaller-bugs, viro, xiyou.wangcong
In-Reply-To: <000000000000a2db16058f2514fa@google.com>

syzbot has bisected this bug to:

commit b9a1e627405d68d475a3c1f35e685ccfb5bbe668
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date:   Thu Jul 4 00:21:13 2019 +0000

     hsr: implement dellink to clean up resources

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1589f3e8600000
start commit:   1e78030e Merge tag 'mmc-v5.3-rc1' of git://git.kernel.org/..
git tree:       upstream
final crash:    https://syzkaller.appspot.com/x/report.txt?x=1789f3e8600000
console output: https://syzkaller.appspot.com/x/log.txt?x=1389f3e8600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4c7b914a2680c9c6
dashboard link: https://syzkaller.appspot.com/bug?extid=0a0e5f37746013dc7476
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11fa7830600000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12f31c8a600000

Reported-by: syzbot+0a0e5f37746013dc7476@syzkaller.appspotmail.com
Fixes: b9a1e627405d ("hsr: implement dellink to clean up resources")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

^ permalink raw reply

* Re: [PATCH net 0/2] flow_offload hardware priority fixes
From: Pablo Neira Ayuso @ 2019-08-03  7:08 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netfilter-devel, davem, netdev, marcelo.leitner, jiri, wenxu,
	saeedm, paulb, gerlitz.or
In-Reply-To: <20190802152549.357784a7@cakuba.netronome.com>

Hi Jakub,

On Fri, Aug 02, 2019 at 03:25:49PM -0700, Jakub Kicinski wrote:
> On Sat, 3 Aug 2019 00:04:09 +0200, Pablo Neira Ayuso wrote:
> > That patch removed the reference to tcf_auto_prio() already, please
> > let me know if you have any more specific update you would like to see
> > on that patch.
> 
> Please explain why the artificial priorities are needed at all.
> Hardware should order tables based on table type - ethtool, TC, nft.
> As I mentioned in the first email, and unless you can make a strong 
> case against that.
> Within those tables we should follow the same ordering rules as we 
> do in software (modulo ethtool but ordering is pretty clear there).

The idea is that every subsystem (ethtool, tc, nf) sets up/binds its
own flow_block object. And each flow_block object has its own priority
range space. So whatever priority the user specifies only applies to
the specific subsystem. Drivers still need to be updated to support
for more than one flow_block/subsystem binding at this stage though.

^ permalink raw reply

* Re: [PATCH 10/34] genwqe: convert put_page() to put_user_page*()
From: Greg Kroah-Hartman @ 2019-08-03  7:06 UTC (permalink / raw)
  To: john.hubbard
  Cc: Andrew Morton, linux-fbdev, Jan Kara, kvm, Dave Hansen,
	Dave Chinner, dri-devel, linux-mm, sparclinux, ceph-devel, devel,
	rds-devel, linux-rdma, x86, amd-gfx, Christoph Hellwig,
	Jason Gunthorpe, xen-devel, devel, linux-media, Arnd Bergmann,
	Guilherme G. Piccoli, John Hubbard, intel-gfx, linux-block,
	Jérôme Glisse, linux-rpi-kernel, Dan Williams,
	linux-arm-kernel, linux-nfs, netdev, LKML, linux-xfs,
	linux-crypto, linux-fsdevel, Frank Haverkamp
In-Reply-To: <20190802022005.5117-11-jhubbard@nvidia.com>

On Thu, Aug 01, 2019 at 07:19:41PM -0700, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> For pages that were retained via get_user_pages*(), release those pages
> via the new put_user_page*() routines, instead of via put_page() or
> release_pages().
> 
> This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
> ("mm: introduce put_user_page*(), placeholder versions").
> 
> This changes the release code slightly, because each page slot in the
> page_list[] array is no longer checked for NULL. However, that check
> was wrong anyway, because the get_user_pages() pattern of usage here
> never allowed for NULL entries within a range of pinned pages.
> 
> Cc: Frank Haverkamp <haver@linux.vnet.ibm.com>
> Cc: "Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/misc/genwqe/card_utils.c | 17 +++--------------
>  1 file changed, 3 insertions(+), 14 deletions(-)

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply

* Re: [PATCH 15/34] staging/vc04_services: convert put_page() to put_user_page*()
From: Greg Kroah-Hartman @ 2019-08-03  7:06 UTC (permalink / raw)
  To: john.hubbard
  Cc: Andrew Morton, linux-fbdev, Jan Kara, kvm, Dave Hansen,
	Dave Chinner, dri-devel, linux-mm, sparclinux, ceph-devel, devel,
	rds-devel, linux-rdma, Suniel Mahesh, x86, amd-gfx,
	Christoph Hellwig, Jason Gunthorpe, Mihaela Muraru, xen-devel,
	devel, linux-media, Stefan Wahren, John Hubbard, intel-gfx,
	Kishore KP, linux-block, Jérôme Glisse,
	linux-rpi-kernel, Dan Williams, Sidong Yang, linux-arm-kernel,
	linux-nfs, Eric Anholt, netdev, LKML, linux-xfs, linux-crypto,
	linux-fsdevel, Al Viro
In-Reply-To: <20190802022005.5117-16-jhubbard@nvidia.com>

On Thu, Aug 01, 2019 at 07:19:46PM -0700, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> For pages that were retained via get_user_pages*(), release those pages
> via the new put_user_page*() routines, instead of via put_page() or
> release_pages().
> 
> This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
> ("mm: introduce put_user_page*(), placeholder versions").
> 
> Cc: Eric Anholt <eric@anholt.net>
> Cc: Stefan Wahren <stefan.wahren@i2se.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Mihaela Muraru <mihaela.muraru21@gmail.com>
> Cc: Suniel Mahesh <sunil.m@techveda.org>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Sidong Yang <realwakka@gmail.com>
> Cc: Kishore KP <kishore.p@techveda.org>
> Cc: linux-rpi-kernel@lists.infradead.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: devel@driverdev.osuosl.org
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  .../vc04_services/interface/vchiq_arm/vchiq_2835_arm.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply

* Re: [PATCH] drivers:staging:isdn:hysdn brace same line if
From: Greg KH @ 2019-08-03  6:35 UTC (permalink / raw)
  To: Fernando Eckhardt Valle; +Cc: isdn, netdev, devel, linux-kernel
In-Reply-To: <20190802195105.27788-1-phervalle@gmail.com>

On Fri, Aug 02, 2019 at 07:51:05PM +0000, Fernando Eckhardt Valle wrote:
> Fix checkpatch error "ERROR: that open brace { should be on the previous
> line" in drivers/staging/isdn/hysdn/hycapi.c:51.
> 
> Signed-off-by: Fernando Eckhardt Valle <phervalle@gmail.com>
> ---
>  drivers/staging/isdn/hysdn/hycapi.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

Your subject line does not make sense :(


^ permalink raw reply

* Re: [PATCH] isdn: hysdn: Fix error spaces around '*'
From: Greg KH @ 2019-08-03  6:32 UTC (permalink / raw)
  To: Jose Carlos Cazarin Filho; +Cc: isdn, devel, netdev, linux-kernel
In-Reply-To: <20190802195602.28414-1-joseespiriki@gmail.com>

On Fri, Aug 02, 2019 at 07:56:02PM +0000, Jose Carlos Cazarin Filho wrote:
> Fix checkpath error:
> CHECK: spaces preferred around that '*' (ctx:WxV)
> +extern hysdn_card *card_root;        /* pointer to first card */
> 
> Signed-off-by: Jose Carlos Cazarin Filho <joseespiriki@gmail.com>
> ---
>  Hello all!
>  This is my first commit to the Linux Kernel, I'm doing this to learn and be able
>  to contribute more in the future
>  Peace all! 
> 
>  drivers/staging/isdn/hysdn/hysdn_defs.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/isdn/hysdn/hysdn_defs.h b/drivers/staging/isdn/hysdn/hysdn_defs.h
> index cdac46a21..f20150d62 100644
> --- a/drivers/staging/isdn/hysdn/hysdn_defs.h
> +++ b/drivers/staging/isdn/hysdn/hysdn_defs.h
> @@ -220,7 +220,7 @@ typedef struct hycapictrl_info hycapictrl_info;
>  /*****************/
>  /* exported vars */
>  /*****************/
> -extern hysdn_card *card_root;	/* pointer to first card */
> +extern hysdn_card * card_root;	/* pointer to first card */

The original code here is correct, checkpatch must be reporting this
incorrectly.

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH v3 bpf-next 02/12] libbpf: implement BPF CO-RE offset relocation algorithm
From: Andrii Nakryiko @ 2019-08-03  6:30 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Yonghong Song, Song Liu, Kernel Team
In-Reply-To: <20190802215604.onihsysinwiu3shl@ast-mbp.dhcp.thefacebook.com>

On Fri, Aug 2, 2019 at 2:56 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Aug 02, 2019 at 12:16:52AM -0700, Andrii Nakryiko wrote:
> > On Thu, Aug 1, 2019 at 4:50 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Wed, Jul 31, 2019 at 11:47:53PM -0700, Andrii Nakryiko wrote:
> > > > This patch implements the core logic for BPF CO-RE offsets relocations.
> > > > Every instruction that needs to be relocated has corresponding
> > > > bpf_offset_reloc as part of BTF.ext. Relocations are performed by trying
> > > > to match recorded "local" relocation spec against potentially many
> > > > compatible "target" types, creating corresponding spec. Details of the
> > > > algorithm are noted in corresponding comments in the code.
> > > >
> > > > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > > > Acked-by: Song Liu <songliubraving@fb.com>
> > > ...
> > > > +             if (btf_is_composite(t)) {
> > > > +                     const struct btf_member *m = (void *)(t + 1);
> > > > +                     __u32 offset;
> > > > +
> > > > +                     if (access_idx >= BTF_INFO_VLEN(t->info))
> > > > +                             return -EINVAL;
> > > > +
> > > > +                     m = &m[access_idx];
> > > > +
> > > > +                     if (BTF_INFO_KFLAG(t->info)) {
> > > > +                             if (BTF_MEMBER_BITFIELD_SIZE(m->offset))
> > > > +                                     return -EINVAL;
> > > > +                             offset = BTF_MEMBER_BIT_OFFSET(m->offset);
> > > > +                     } else {
> > > > +                             offset = m->offset;
> > > > +                     }
> > >
> > > very similar logic exists in btf_dump.c
> > > probably makes sense to make a common helper at some point.
> >
> > Will add btf_member_bit_offset(type, member) and
> > btf_member_bit_size(type, member).
> >
> > >
> > > > +static size_t bpf_core_essential_name_len(const char *name)
> > > > +{
> > > > +     size_t n = strlen(name);
> > > > +     int i = n - 3;
> > > > +
> > > > +     while (i > 0) {
> > > > +             if (name[i] == '_' && name[i + 1] == '_' && name[i + 2] == '_')
> > > > +                     return i;
> > > > +             i--;
> > > > +     }
> > > > +     return n;
> > > > +}
> > >
> > > that's a bit of an eye irritant. How about?
> > >         size_t n = strlen(name);
> > >         int i, cnt = 0;
> > >
> > >         for (i = n - 1; i >= 0; i--) {
> > >                 if (name[i] == '_') {
> > >                     cnt++;
> > >                 } else {
> > >                    if (cnt == 3)
> > >                       return i + 1;
> > >                    cnt = 0;
> > >                 }
> > >         }
> > >         return n;
> >
> > I find this one much harder to read and understand. What's
> > eye-irritating about that loop?
> >
> > Your loop will also handle `a____b` differently. My version will
> > return "a_" as essential name, yours "a____b". Was this intentional on
> > your part?
>
> hmm. I think both will return sizeof("a") == 1

nope, there are 4 underscores, your implementation will bump cnt to 4
without checking it for `cnt == 3`, so will never detect flavor and
will just return sizeof("a____b"). It's easily fixable, but the point
is that my original irritating code is very straightforward and harder
to get wrong an, easier to understand at a glimpse, yours require much
more conscious thought to understand.

>
> > I'd rather use this instead, if you hate the first one:
> >
> > size_t n = strlen(name);
> > int i;
> >
> > for (i = n - 3; i > 0; i--) {
> >     if (strncmp(name + i, "___", 3) == 0)
> >         return i;
> > }
> >
> > Is this better?
>
> that is worse.
> What I don't like about it is that every byte is
> compared N=sizeof(string-to-found) times.
> I guess it's not such a big performance criticial path,
> but libbpf has to keep the bar high.

We are talking about searching for *three* characters in a short
string. Performance difference is negligible at best, unnoticeable at
worst. I'd rather have straightforward and easy code, but I'll rewrite
it as a state machine the way you proposed.

>
> > >
> > > > +     case BTF_KIND_ARRAY: {
> > > > +             const struct btf_array *loc_a, *targ_a;
> > > > +
> > > > +             loc_a = (void *)(local_type + 1);
> > > > +             targ_a = (void *)(targ_type + 1);
> > > > +             local_id = loc_a->type;
> > > > +             targ_id = targ_a->type;
> > >
> > > can we add a helper like:
> >
> > Yes, we can. I was thinking about that, but decided to not expand
> > patch set. But we do need to extract all those small, but nice
> > helpers. I'll put them in libbpf_internal.h for now, but I think it
> > might be good idea to expose them as part of btf.h. Thoughts?
>
> part of btf.h make sense to me.
>
> >
> > > const struct btf_array *btf_array(cosnt struct btf_type *t)
> > > {
> > >         return (const struct btf_array *)(t + 1);
> > > }
> > >
> > > then above will be:
> > >         case BTF_KIND_ARRAY: {
> > >                 local_id = btf_array(local_type)->type;
> > >                 targ_id = btf_array(targ_type)->type;
> > >
> > > and a bunch of code in btf.c and btf_dump.c would be cleaner as well?
> >
> > Yep, some of those are already scattered around btf.c and btf_dump.c,
> > will clean up and add patch to this patch set.
> >
> > >
> > > > +             goto recur;
> > > > +     }
> > > > +     default:
> > > > +             pr_warning("unexpected kind %d relocated, local [%d], target [%d]\n",
> > > > +                        kind, local_id, targ_id);
> > > > +             return 0;
> > > > +     }
> > > > +}
> > > > +
> > > > +/*
> > > > + * Given single high-level named field accessor in local type, find
> > > > + * corresponding high-level accessor for a target type. Along the way,
> > > > + * maintain low-level spec for target as well. Also keep updating target
> > > > + * offset.
> > > > + *
> > > > + * Searching is performed through recursive exhaustive enumeration of all
> > > > + * fields of a struct/union. If there are any anonymous (embedded)
> > > > + * structs/unions, they are recursively searched as well. If field with
> > > > + * desired name is found, check compatibility between local and target types,
> > > > + * before returning result.
> > > > + *
> > > > + * 1 is returned, if field is found.
> > > > + * 0 is returned if no compatible field is found.
> > > > + * <0 is returned on error.
> > > > + */
> > > > +static int bpf_core_match_member(const struct btf *local_btf,
> > > > +                              const struct bpf_core_accessor *local_acc,
> > > > +                              const struct btf *targ_btf,
> > > > +                              __u32 targ_id,
> > > > +                              struct bpf_core_spec *spec,
> > > > +                              __u32 *next_targ_id)
> > > > +{
> > > > +     const struct btf_type *local_type, *targ_type;
> > > > +     const struct btf_member *local_member, *m;
> > > > +     const char *local_name, *targ_name;
> > > > +     __u32 local_id;
> > > > +     int i, n, found;
> > > > +
> > > > +     targ_type = skip_mods_and_typedefs(targ_btf, targ_id, &targ_id);
> > > > +     if (!targ_type)
> > > > +             return -EINVAL;
> > > > +     if (!btf_is_composite(targ_type))
> > > > +             return 0;
> > > > +
> > > > +     local_id = local_acc->type_id;
> > > > +     local_type = btf__type_by_id(local_btf, local_id);
> > > > +     local_member = (void *)(local_type + 1);
> > > > +     local_member += local_acc->idx;
> > > > +     local_name = btf__name_by_offset(local_btf, local_member->name_off);
> > > > +
> > > > +     n = BTF_INFO_VLEN(targ_type->info);
> > > > +     m = (void *)(targ_type + 1);
> > >
> > > new btf_member() helper?
> > >
> > > > +     for (i = 0; i < n; i++, m++) {
> > > > +             __u32 offset;
> > > > +
> > > > +             /* bitfield relocations not supported */
> > > > +             if (BTF_INFO_KFLAG(targ_type->info)) {
> > > > +                     if (BTF_MEMBER_BITFIELD_SIZE(m->offset))
> > > > +                             continue;
> > > > +                     offset = BTF_MEMBER_BIT_OFFSET(m->offset);
> > > > +             } else {
> > > > +                     offset = m->offset;
> > > > +             }
> > > > +             if (offset % 8)
> > > > +                     continue;
> > >
> > > same bit of code again?
> > > definitely could use a helper.
> >
> > Different handling (continue here, return error above), but can use
> > those helpers I mentioned above.
> >
> > >
> > > > +     for (i = 0; i < local_spec->len; i++, local_acc++, targ_acc++) {
> > > > +             targ_type = skip_mods_and_typedefs(targ_spec->btf, targ_id,
> > > > +                                                &targ_id);
> > > > +             if (!targ_type)
> > > > +                     return -EINVAL;
> > > > +
> > > > +             if (local_acc->name) {
> > > > +                     matched = bpf_core_match_member(local_spec->btf,
> > > > +                                                     local_acc,
> > > > +                                                     targ_btf, targ_id,
> > > > +                                                     targ_spec, &targ_id);
> > > > +                     if (matched <= 0)
> > > > +                             return matched;
> > > > +             } else {
> > > > +                     /* for i=0, targ_id is already treated as array element
> > > > +                      * type (because it's the original struct), for others
> > > > +                      * we should find array element type first
> > > > +                      */
> > > > +                     if (i > 0) {
> > > > +                             const struct btf_array *a;
> > > > +
> > > > +                             if (!btf_is_array(targ_type))
> > > > +                                     return 0;
> > > > +
> > > > +                             a = (void *)(targ_type + 1);
> > > > +                             if (local_acc->idx >= a->nelems)
> > > > +                                     return 0;
> > >
> > > am I reading it correctly that the local spec requested out-of-bounds
> > > index in the target array type?
> > > Why this is 'ignore' instead of -EINVAL?
> >
> > Similar to any other mismatch (e.g., int in local type vs int64 in
> > target type). It just makes candidate not matching. Why would that be
> > error that will stop the whole relocation and subsequently object
> > loading process?
>
> Did the field name match or this is for anon types?
> I've read it as names matched and type miscompared.

No, not anonymous.

struct my_struct___local {
    int a;
};

struct my_struct___target {
    long long a;
};

my_struct___local->a will not match my_struct___target->a, but it's
not a reason to stop relocation process due to error.

>
> >
> > >
> > > > +/*
> > > > + * Probe few well-known locations for vmlinux kernel image and try to load BTF
> > > > + * data out of it to use for target BTF.
> > > > + */
> > > > +static struct btf *bpf_core_find_kernel_btf(void)
> > > > +{
> > > > +     const char *locations[] = {
> > > > +             "/lib/modules/%1$s/vmlinux-%1$s",
> > > > +             "/usr/lib/modules/%1$s/kernel/vmlinux",
> > > > +     };
> > > > +     char path[PATH_MAX + 1];
> > > > +     struct utsname buf;
> > > > +     struct btf *btf;
> > > > +     int i, err;
> > > > +
> > > > +     err = uname(&buf);
> > > > +     if (err) {
> > > > +             pr_warning("failed to uname(): %d\n", err);
> > >
> > > defensive programming ?
> > > I think uname() can fail only if &buf points to non-existing page like null.
> >
> > I haven't checked source for this syscall, but man page specified that
> > it might return -1 on error.
>
> man page says that it can only return EFAULT.

Ah, yeah, seems to be the only reason. I'll remove the check, it
wasn't paranoia :)

>
> >
> > >
> > > > +             return ERR_PTR(err);
> > > > +     }
> > > > +
> > > > +     for (i = 0; i < ARRAY_SIZE(locations); i++) {
> > > > +             snprintf(path, PATH_MAX, locations[i], buf.release);
> > > > +             pr_debug("attempting to load kernel BTF from '%s'\n", path);
> > >
> > > I think this debug message would have been more useful after access().
> >
> > Sure, will move.
> >
> > >
> > > > +
> > > > +             if (access(path, R_OK))
> > > > +                     continue;
> > > > +
> > > > +             btf = btf__parse_elf(path, NULL);
> > > > +             if (IS_ERR(btf))
> > > > +                     continue;
> > > > +
> > > > +             pr_debug("successfully loaded kernel BTF from '%s'\n", path);
> > > > +             return btf;
> > > > +     }
> > > > +
> > > > +     pr_warning("failed to find valid kernel BTF\n");
> > > > +     return ERR_PTR(-ESRCH);
> > > > +}
> > > > +
> > > > +/* Output spec definition in the format:
> > > > + * [<type-id>] (<type-name>) + <raw-spec> => <offset>@<spec>,
> > > > + * where <spec> is a C-syntax view of recorded field access, e.g.: x.a[3].b
> > > > + */
> > > > +static void bpf_core_dump_spec(int level, const struct bpf_core_spec *spec)
> > > > +{
> > > > +     const struct btf_type *t;
> > > > +     const char *s;
> > > > +     __u32 type_id;
> > > > +     int i;
> > > > +
> > > > +     type_id = spec->spec[0].type_id;
> > > > +     t = btf__type_by_id(spec->btf, type_id);
> > > > +     s = btf__name_by_offset(spec->btf, t->name_off);
> > > > +     libbpf_print(level, "[%u] (%s) + ", type_id, s);
> > >
> > > imo extra []() don't improve readability of the dump.
> >
> > [<num>] is the general convention I've been using throughout libbpf to
> > specify type ID, so I'd rather keep it for consistency. I can drop
> > parens, though, no problem.
> >
> > >
> > > > +
> > > > +     for (i = 0; i < spec->raw_len; i++)
> > > > +             libbpf_print(level, "%d%s", spec->raw_spec[i],
> > > > +                          i == spec->raw_len - 1 ? " => " : ":");
> > > > +
> > > > +     libbpf_print(level, "%u @ &x", spec->offset);
> > > > +
> > > > +     for (i = 0; i < spec->len; i++) {
> > > > +             if (spec->spec[i].name)
> > > > +                     libbpf_print(level, ".%s", spec->spec[i].name);
> > > > +             else
> > > > +                     libbpf_print(level, "[%u]", spec->spec[i].idx);
> > > > +     }
> > > > +
> > > > +}
> > > > +
> > > > +static size_t bpf_core_hash_fn(const void *key, void *ctx)
> > > > +{
> > > > +     return (size_t)key;
> > > > +}
> > > > +
> > > > +static bool bpf_core_equal_fn(const void *k1, const void *k2, void *ctx)
> > > > +{
> > > > +     return k1 == k2;
> > > > +}
> > > > +
> > > > +static void *u32_to_ptr(__u32 x)
> > > > +{
> > > > +     return (void *)(uintptr_t)x;
> > > > +}
> > >
> > > u32 to pointer on 64-bit arch?!
> > > That surely needs a comment.
> >
> > I should probably call it u32_to_hash_key() to make it obvious it's
> > conversion to hashmap's generic `void *` key type.
> >
> > >
> > > > +
> > > > +/*
> > > > + * CO-RE relocate single instruction.
> > > > + *
> > > > + * The outline and important points of the algorithm:
> > > > + * 1. For given local type, find corresponding candidate target types.
> > > > + *    Candidate type is a type with the same "essential" name, ignoring
> > > > + *    everything after last triple underscore (___). E.g., `sample`,
> > > > + *    `sample___flavor_one`, `sample___flavor_another_one`, are all candidates
> > > > + *    for each other. Names with triple underscore are referred to as
> > > > + *    "flavors" and are useful, among other things, to allow to
> > > > + *    specify/support incompatible variations of the same kernel struct, which
> > > > + *    might differ between different kernel versions and/or build
> > > > + *    configurations.
> > > > + *
> > > > + *    N.B. Struct "flavors" could be generated by bpftool's BTF-to-C
> > > > + *    converter, when deduplicated BTF of a kernel still contains more than
> > > > + *    one different types with the same name. In that case, ___2, ___3, etc
> > > > + *    are appended starting from second name conflict. But start flavors are
> > > > + *    also useful to be defined "locally", in BPF program, to extract same
> > > > + *    data from incompatible changes between different kernel
> > > > + *    versions/configurations. For instance, to handle field renames between
> > > > + *    kernel versions, one can use two flavors of the struct name with the
> > > > + *    same common name and use conditional relocations to extract that field,
> > > > + *    depending on target kernel version.
> > >
> > > there are actual kernel types that have ___ in the name.
> > > Ex: struct lmc___media
> > > We probably need to revisit this 'flavor' convention.
> >
> > There are only these:
> > - lmc___softc
> > - lmc___media
> > - lmc___ctl (all three in drivers/net/wan/lmc/lmc_var.h)
> > - ____ftrace_##name set of structs
> >
> > I couldn't come up with anything cleaner-looking. I think we can still
> > keep ___ convention, but:
> >
> > 1. Match only exactly 3 underscores, delimited by non-underscore from
> > both sides (so similar to your proposed loop above);
> > 2. We can also try matching candidates assuming full name without
> > ___xxx part removed, in addition to current logic. This seems like an
> > overkill at this point and unlikely to be useful in practice, so I'd
> > postpone implementing this until we really need it.
> >
> > What do you think? Which other convention did you have in mind?
>
> may be match ___[0-9]+ instead for now?
> Not as flexible, but user supplied "flavors" is not an immediate task.

All the tests I added use non-numeric flavors. While technically I can
use just ___1, ___2 and so on, it will greatly reduce readability,
while not really solving any problem (nothing prevents someone to add
something like lmc___1 eventually).

I think it's not worth it to complicate this logic just for
lmc___{softc,media,ctl}, but we can do 2) - try to match any struct as
is. If that fails, see if it's a "flavor" and match flavors.

>
> >
> > >
> > > > +     for (i = 0, j = 0; i < cand_ids->len; i++) {
> > > > +             cand_id = cand_ids->data[i];
> > > > +             cand_type = btf__type_by_id(targ_btf, cand_id);
> > > > +             cand_name = btf__name_by_offset(targ_btf, cand_type->name_off);
> > > > +
> > > > +             err = bpf_core_spec_match(&local_spec, targ_btf,
> > > > +                                       cand_id, &cand_spec);
> > > > +             if (err < 0) {
> > > > +                     pr_warning("prog '%s': relo #%d: failed to match spec ",
> > > > +                                prog_name, relo_idx);
> > > > +                     bpf_core_dump_spec(LIBBPF_WARN, &local_spec);
> > > > +                     libbpf_print(LIBBPF_WARN,
> > > > +                                  " to candidate #%d [%d] (%s): %d\n",
> > > > +                                  i, cand_id, cand_name, err);
> > > > +                     return err;
> > > > +             }
> > > > +             if (err == 0) {
> > > > +                     pr_debug("prog '%s': relo #%d: candidate #%d [%d] (%s) doesn't match spec ",
> > > > +                              prog_name, relo_idx, i, cand_id, cand_name);
> > > > +                     bpf_core_dump_spec(LIBBPF_DEBUG, &local_spec);
> > > > +                     libbpf_print(LIBBPF_DEBUG, "\n");
> > > > +                     continue;
> > > > +             }
> > > > +
> > > > +             pr_debug("prog '%s': relo #%d: candidate #%d matched as spec ",
> > > > +                      prog_name, relo_idx, i);
> > >
> > > did you mention that you're going to make a helper for this debug dumps?
> >
> > yeah, I added bpf_core_dump_spec(), but I don't know how to shorten
> > this further... This output is extremely useful to understand what's
> > happening and will be invaluable when users will inevitably report
> > confusing behavior in some cases, so I still want to keep it.
>
> not sure yet. Just pointing out that this function has more debug printfs
> than actual code which doesn't look right.
> We have complex algorithms in the kernel (like verifier).
> Yet we don't sprinkle printfs in there to this degree.
>

We do have a verbose verifier logging, though, exactly to help users
to debug issues, which is extremely helpful and is greatly appreciated
by users.
There is nothing worse for developer experience than getting -EINVAL
without any useful log message. Been there, banged my head against the
wall wishing for a bit more verbose log. What are we trying to
optimize for here?

^ permalink raw reply

* Re: [PATCH] net: mdio-octeon: Fix build error and Kconfig warning
From: Nathan Chancellor @ 2019-08-03  6:05 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: David Miller, andrew, broonie, devel, f.fainelli, gregkh,
	hkallweit1, kernel-build-reports, linux-arm-kernel, linux-next,
	netdev, lkp, rdunlap
In-Reply-To: <20190803013952.GF5597@bombadil.infradead.org>

On Fri, Aug 02, 2019 at 06:39:52PM -0700, Matthew Wilcox wrote:
> On Fri, Aug 02, 2019 at 06:30:31PM -0700, Nathan Chancellor wrote:
> > On Fri, Aug 02, 2019 at 06:11:32PM -0700, David Miller wrote:
> > > The proper way to fix this is to include either
> > > 
> > > 	linux/io-64-nonatomic-hi-lo.h
> > > 
> > > or
> > > 
> > > 	linux/io-64-nonatomic-lo-hi.h
> > > 
> > > whichever is appropriate.
> > 
> > Hmmmm, is that not what I did?
> > 
> > Although I did not know about io-64-nonatomic-hi-lo.h. What is the
> > difference and which one is needed here?
> 
> Whether you write the high or low 32 bits first.  For this, it doesn't
> matter, since the compiled driver will never be run on real hardware.

That's what I figured. I have only seen lo-hi used personally, which is
what I went with here. Thanks for the confirmation!

> 
> > There is apparently another failure when OF_MDIO is not set, I guess I
> > can try to look into that as well and respin into a series if
> > necessary.
> 
> Thanks for taking care of that!

^ permalink raw reply

* [PATCH v2] net: mdio-octeon: Fix Kconfig warnings and build errors
From: Nathan Chancellor @ 2019-08-03  6:01 UTC (permalink / raw)
  To: natechancellor
  Cc: andrew, broonie, davem, devel, f.fainelli, gregkh, hkallweit1,
	kernel-build-reports, linux-arm-kernel, linux-next, lkp, netdev,
	rdunlap, willy
In-Reply-To: <20190731185023.20954-1-natechancellor@gmail.com>

After commit 171a9bae68c7 ("staging/octeon: Allow test build on
!MIPS"), the following combination of configs cause a few Kconfig
warnings and build errors (distilled from arm allyesconfig and Randy's
randconfig builds):

    CONFIG_NETDEVICES=y
    CONFIG_STAGING=y
    CONFIG_COMPILE_TEST=y

and CONFIG_OCTEON_ETHERNET as either a module or built-in.

WARNING: unmet direct dependencies detected for MDIO_OCTEON
  Depends on [n]: NETDEVICES [=y] && MDIO_DEVICE [=y] && MDIO_BUS [=y]
&& 64BIT [=n] && HAS_IOMEM [=y] && OF_MDIO [=n]
  Selected by [y]:
  - OCTEON_ETHERNET [=y] && STAGING [=y] && (CAVIUM_OCTEON_SOC ||
COMPILE_TEST [=y]) && NETDEVICES [=y]

In file included from ../drivers/net/phy/mdio-octeon.c:14:
../drivers/net/phy/mdio-cavium.h:111:36: error: implicit declaration of
function ‘writeq’; did you mean ‘writel’?
[-Werror=implicit-function-declaration]
  111 | #define oct_mdio_writeq(val, addr) writeq(val, (void *)addr)
      |                                    ^~~~~~

CONFIG_64BIT is not strictly necessary if the proper readq/writeq
definitions are included from io-64-nonatomic-lo-hi.h.

CONFIG_OF_MDIO is not needed when CONFIG_COMPILE_TEST is enabled because
of commit f9dc9ac51610 ("of/mdio: Add dummy functions in of_mdio.h.").

Fixes: 171a9bae68c7 ("staging/octeon: Allow test build on !MIPS")
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Mark Brown <broonie@kernel.org>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
---

v1 -> v2:

* Address Randy's reported failure here: https://lore.kernel.org/netdev/b3444283-7a77-ece8-7ac6-41756aa7dc60@infradead.org/
  by not requiring CONFIG_OF_MDIO when CONFIG_COMPILE_TEST is set.

* Improve commit message

 drivers/net/phy/Kconfig       | 4 ++--
 drivers/net/phy/mdio-cavium.h | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 20f14c5fbb7e..0e3d9e3d3533 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -159,8 +159,8 @@ config MDIO_MSCC_MIIM
 
 config MDIO_OCTEON
 	tristate "Octeon and some ThunderX SOCs MDIO buses"
-	depends on 64BIT
-	depends on HAS_IOMEM && OF_MDIO
+	depends on (64BIT && OF_MDIO) || COMPILE_TEST
+	depends on HAS_IOMEM
 	select MDIO_CAVIUM
 	help
 	  This module provides a driver for the Octeon and ThunderX MDIO
diff --git a/drivers/net/phy/mdio-cavium.h b/drivers/net/phy/mdio-cavium.h
index ed5f9bb5448d..b7f89ad27465 100644
--- a/drivers/net/phy/mdio-cavium.h
+++ b/drivers/net/phy/mdio-cavium.h
@@ -108,6 +108,8 @@ static inline u64 oct_mdio_readq(u64 addr)
 	return cvmx_read_csr(addr);
 }
 #else
+#include <linux/io-64-nonatomic-lo-hi.h>
+
 #define oct_mdio_writeq(val, addr)	writeq(val, (void *)addr)
 #define oct_mdio_readq(addr)		readq((void *)addr)
 #endif
-- 
2.23.0.rc1


^ permalink raw reply related

* Re: [PATCH bpf-next 1/3] selftests/bpf: test_progs: switch to open_memstream
From: Andrii Nakryiko @ 2019-08-03  6:00 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Andrii Nakryiko, Stanislav Fomichev, netdev@vger.kernel.org,
	bpf@vger.kernel.org, davem@davemloft.net, ast@kernel.org,
	daniel@iogearbox.net
In-Reply-To: <20190802201456.GB4544@mini-arch>

On Fri, Aug 2, 2019 at 1:14 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
>
> On 08/02, Andrii Nakryiko wrote:
> > On 8/2/19 10:17 AM, Stanislav Fomichev wrote:
> > > Use open_memstream to override stdout during test execution.
> > > The copy of the original stdout is held in env.stdout and used
> > > to print subtest info and dump failed log.
> >
> > I really like the idea. I didn't know about open_memstream, it's awesome. Thanks!
> One possible downside of using open_memstream is that it's glibc
> specific. I probably need to wrap it in #ifdef __GLIBC__ to make
> it work with other libcs and just print everything as it was before :-(.
> I'm not sure we care though.

Given this is selftests/bpf, it is probably OK.

>
> > > test_{v,}printf are now simple wrappers around stdout and will be
> > > removed in the next patch.
> > >
> > > Cc: Andrii Nakryiko <andriin@fb.com>
> > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > ---
> > >  tools/testing/selftests/bpf/test_progs.c | 100 ++++++++++-------------
> > >  tools/testing/selftests/bpf/test_progs.h |   2 +-
> > >  2 files changed, 46 insertions(+), 56 deletions(-)
> > >
> > > diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
> > > index db00196c8315..00d1565d01a3 100644
> > > --- a/tools/testing/selftests/bpf/test_progs.c
> > > +++ b/tools/testing/selftests/bpf/test_progs.c
> > > @@ -40,14 +40,22 @@ static bool should_run(struct test_selector *sel, int num, const char *name)
> > >
> > >  static void dump_test_log(const struct prog_test_def *test, bool failed)
> > >  {
> > > -   if (env.verbose || test->force_log || failed) {
> > > -           if (env.log_cnt) {
> > > -                   fprintf(stdout, "%s", env.log_buf);
> > > -                   if (env.log_buf[env.log_cnt - 1] != '\n')
> > > -                           fprintf(stdout, "\n");
> > > +   if (stdout == env.stdout)
> > > +           return;
> > > +
> > > +   fflush(stdout); /* exports env.log_buf & env.log_cap */
> > > +
> > > +   if (env.log_cap && (env.verbose || test->force_log || failed)) {
> > > +           int len = strlen(env.log_buf);
> >
> > env.log_cap is not really a capacity, it's actual number of bytes (without terminating zero), so there is no need to do strlen and it's probably better to rename env.log_cap into env.log_cnt.
> I'll rename it to log_size to match open_memstream args.
> We probably still need to do strlen because open_memstream can allocate
> bigger buffer to hold the data.

If I read man page correctly, env.log_cnt will be exactly the value
that strlen will return - number of actual bytes written (omitting
terminal zero), not number of pre-allocated bytes, thus I'm saying
that strlen is redundant. Please take a look again.

>
> > > +
> > > +           if (len) {
> > > +                   fprintf(env.stdout, "%s", env.log_buf);
> > > +                   if (env.log_buf[len - 1] != '\n')
> > > +                           fprintf(env.stdout, "\n");
> > > +
> > > +                   fseeko(stdout, 0, SEEK_SET);
> > Same bug as I already fixed with env.log_cnt = 0 being inside this if. You want to do seek always, not just when you print output log.
> SG, will move to where we currently clear log_cnt, thanks!
>
> > >  /* rewind */
> > >             }
> > >     }
> > > -   env.log_cnt = 0;
> > >  }
> > >
> > >  void test__end_subtest()
> > > @@ -62,7 +70,7 @@ void test__end_subtest()
> > >
> > >     dump_test_log(test, sub_error_cnt);
> > >
> > > -   printf("#%d/%d %s:%s\n",
> > > +   fprintf(env.stdout, "#%d/%d %s:%s\n",
> > >            test->test_num, test->subtest_num,
> > >            test->subtest_name, sub_error_cnt ? "FAIL" : "OK");
> > >  }
> > > @@ -100,53 +108,7 @@ void test__force_log() {
> > >
> > >  void test__vprintf(const char *fmt, va_list args)
> > >  {
> > > -   size_t rem_sz;
> > > -   int ret = 0;
> > > -
> > > -   if (env.verbose || (env.test && env.test->force_log)) {
> > > -           vfprintf(stderr, fmt, args);
> > > -           return;
> > > -   }
> > > -
> > > -try_again:
> > > -   rem_sz = env.log_cap - env.log_cnt;
> > > -   if (rem_sz) {
> > > -           va_list ap;
> > > -
> > > -           va_copy(ap, args);
> > > -           /* we reserved extra byte for \0 at the end */
> > > -           ret = vsnprintf(env.log_buf + env.log_cnt, rem_sz + 1, fmt, ap);
> > > -           va_end(ap);
> > > -
> > > -           if (ret < 0) {
> > > -                   env.log_buf[env.log_cnt] = '\0';
> > > -                   fprintf(stderr, "failed to log w/ fmt '%s'\n", fmt);
> > > -                   return;
> > > -           }
> > > -   }
> > > -
> > > -   if (!rem_sz || ret > rem_sz) {
> > > -           size_t new_sz = env.log_cap * 3 / 2;
> > > -           char *new_buf;
> > > -
> > > -           if (new_sz < 4096)
> > > -                   new_sz = 4096;
> > > -           if (new_sz < ret + env.log_cnt)
> > > -                   new_sz = ret + env.log_cnt;
> > > -
> > > -           /* +1 for guaranteed space for terminating \0 */
> > > -           new_buf = realloc(env.log_buf, new_sz + 1);
> > > -           if (!new_buf) {
> > > -                   fprintf(stderr, "failed to realloc log buffer: %d\n",
> > > -                           errno);
> > > -                   return;
> > > -           }
> > > -           env.log_buf = new_buf;
> > > -           env.log_cap = new_sz;
> > > -           goto try_again;
> > > -   }
> > > -
> > > -   env.log_cnt += ret;
> > > +   vprintf(fmt, args);
> > >  }
> > >
> > >  void test__printf(const char *fmt, ...)
> > > @@ -477,6 +439,32 @@ static error_t parse_arg(int key, char *arg, struct argp_state *state)
> > >     return 0;
> > >  }
> > >
> > > +static void stdout_hijack(void)
> > > +{
> > > +   if (env.verbose || (env.test && env.test->force_log)) {
> > > +           /* nothing to do, output to stdout by default */
> > > +           return;
> > > +   }
> > > +
> > > +   /* stdout -> buffer */
> > > +   fflush(stdout);
> > > +   stdout = open_memstream(&env.log_buf, &env.log_cap);
> > Check errors and restore original stdout if something went wrong? (And emit some warning to stderr).
> Good point, will do.
>
> > > +}
> > > +
> > > +static void stdout_restore(void)
> > > +{
> > > +   if (stdout == env.stdout)
> > > +           return;
> > > +
> > > +   fclose(stdout);
> > > +   free(env.log_buf);
> > > +
> > > +   env.log_buf = NULL;
> > > +   env.log_cap = 0;
> > > +
> > > +   stdout = env.stdout;
> > > +}
> > > +
> > >  int main(int argc, char **argv)
> > >  {
> > >     static const struct argp argp = {
> > > @@ -495,6 +483,7 @@ int main(int argc, char **argv)
> > >     srand(time(NULL));
> > >
> > >     env.jit_enabled = is_jit_enabled();
> > > +   env.stdout = stdout;
> > >
> > >     for (i = 0; i < prog_test_cnt; i++) {
> > >             struct prog_test_def *test = &prog_test_defs[i];
> > > @@ -508,6 +497,7 @@ int main(int argc, char **argv)
> > >                             test->test_num, test->test_name))
> > >                     continue;
> > >
> > > +           stdout_hijack();
> > Why do you do this for every test? Just do once before all the tests and restore after?
> We can do that, my thinking was to limit the area of hijacking :-)

But why? We actually want to hijack stdout/stderr for entire duration
of all the tests. If test_progs needs some "infrastructural" mandatory
output, we have env.stdout/env.stderr for that.

> But that would work as well, less allocations per test, I guess. Will
> do.
>
> > >             test->run_test();
> > >             /* ensure last sub-test is finalized properly */
> > >             if (test->subtest_name)
> > > @@ -522,6 +512,7 @@ int main(int argc, char **argv)
> > >                     env.succ_cnt++;
> > >
> > >             dump_test_log(test, test->error_cnt);
> > > +           stdout_restore();
> > >
> > >             printf("#%d %s:%s\n", test->test_num, test->test_name,
> > >                    test->error_cnt ? "FAIL" : "OK");
> > > @@ -529,7 +520,6 @@ int main(int argc, char **argv)
> > >     printf("Summary: %d/%d PASSED, %d FAILED\n",
> > >            env.succ_cnt, env.sub_succ_cnt, env.fail_cnt);
> > >
> > > -   free(env.log_buf);
> > >     free(env.test_selector.num_set);
> > >     free(env.subtest_selector.num_set);
> > >
> > > diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
> > > index afd14962456f..9fd89078494f 100644
> > > --- a/tools/testing/selftests/bpf/test_progs.h
> > > +++ b/tools/testing/selftests/bpf/test_progs.h
> > > @@ -56,8 +56,8 @@ struct test_env {
> > >     bool jit_enabled;
> > >
> > >     struct prog_test_def *test;
> > > +   FILE *stdout;
> > >     char *log_buf;
> > > -   size_t log_cnt;
> > >     size_t log_cap;
> > So it's actually log_cnt that's assigned on fflush for memstream, according to man page, so probably keep log_cnt, delete log_cap.
> Ack. See above, will rename to log_size, let me know if you disagree.
>
> > >     int succ_cnt; /* successful tests */

^ permalink raw reply

* [PATCH 1/1] bpf: introduce new helper udp_flow_src_port
From: Farid Zakaria @ 2019-08-03  4:43 UTC (permalink / raw)
  To: ast, daniel, netdev, bpf; +Cc: Farid Zakaria
In-Reply-To: <20190803044320.5530-1-farid.m.zakaria@gmail.com>

Foo over UDP uses UDP encapsulation to add additional entropy
into the packets so that they get beter distribution across EMCP
routes.

Expose udp_flow_src_port as a bpf helper so that tunnel filters
can benefit from the helper.

Signed-off-by: Farid Zakaria <farid.m.zakaria@gmail.com>
---
 include/uapi/linux/bpf.h                      | 21 +++++++--
 net/core/filter.c                             | 20 ++++++++
 tools/include/uapi/linux/bpf.h                | 21 +++++++--
 tools/testing/selftests/bpf/bpf_helpers.h     |  2 +
 .../bpf/prog_tests/udp_flow_src_port.c        | 28 +++++++++++
 .../bpf/progs/test_udp_flow_src_port_kern.c   | 47 +++++++++++++++++++
 6 files changed, 131 insertions(+), 8 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/udp_flow_src_port.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_udp_flow_src_port_kern.c

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4393bd4b2419..90e814153dec 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2545,9 +2545,21 @@ union bpf_attr {
  * 		*th* points to the start of the TCP header, while *th_len*
  * 		contains **sizeof**\ (**struct tcphdr**).
  *
- * 	Return
- * 		0 if *iph* and *th* are a valid SYN cookie ACK, or a negative
- * 		error otherwise.
+ *  Return
+ *      0 if *iph* and *th* are a valid SYN cookie ACK, or a negative
+ *      error otherwise.
+ *
+ * int bpf_udp_flow_src_port(struct sk_buff *skb, int min, int max, int use_eth)
+ *  Description
+ *      It's common to implement tunnelling inside a UDP protocol to provide
+ *      additional randomness to the packet. The destination port of the UDP
+ *      header indicates the inner packet type whereas the source port is used
+ *      for additional entropy.
+ *
+ *  Return
+ *      An obfuscated hash of the packet that falls within the
+ *      min & max port range.
+ *      If min >= max, the default port range is used
  *
  * int bpf_sysctl_get_name(struct bpf_sysctl *ctx, char *buf, size_t buf_len, u64 flags)
  *	Description
@@ -2853,7 +2865,8 @@ union bpf_attr {
 	FN(sk_storage_get),		\
 	FN(sk_storage_delete),		\
 	FN(send_signal),		\
-	FN(tcp_gen_syncookie),
+	FN(tcp_gen_syncookie),  \
+	FN(udp_flow_src_port),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/net/core/filter.c b/net/core/filter.c
index 5a2707918629..fdf0ebb8c2c8 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2341,6 +2341,24 @@ static const struct bpf_func_proto bpf_msg_pull_data_proto = {
 	.arg4_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_4(bpf_udp_flow_src_port, struct sk_buff *, skb, int, min,
+	   int, max, int, use_eth)
+{
+	struct net *net = dev_net(skb->dev);
+
+	return udp_flow_src_port(net, skb, min, max, use_eth);
+}
+
+static const struct bpf_func_proto bpf_udp_flow_src_port_proto = {
+	.func           = bpf_udp_flow_src_port,
+	.gpl_only       = false,
+	.ret_type       = RET_INTEGER,
+	.arg1_type		= ARG_PTR_TO_CTX,
+	.arg2_type      = ARG_ANYTHING,
+	.arg3_type		= ARG_ANYTHING,
+	.arg4_type		= ARG_ANYTHING,
+};
+
 BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
 	   u32, len, u64, flags)
 {
@@ -6186,6 +6204,8 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_sk_storage_get_proto;
 	case BPF_FUNC_sk_storage_delete:
 		return &bpf_sk_storage_delete_proto;
+	case BPF_FUNC_udp_flow_src_port:
+		return &bpf_udp_flow_src_port_proto;
 #ifdef CONFIG_XFRM
 	case BPF_FUNC_skb_get_xfrm_state:
 		return &bpf_skb_get_xfrm_state_proto;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4393bd4b2419..90e814153dec 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2545,9 +2545,21 @@ union bpf_attr {
  * 		*th* points to the start of the TCP header, while *th_len*
  * 		contains **sizeof**\ (**struct tcphdr**).
  *
- * 	Return
- * 		0 if *iph* and *th* are a valid SYN cookie ACK, or a negative
- * 		error otherwise.
+ *  Return
+ *      0 if *iph* and *th* are a valid SYN cookie ACK, or a negative
+ *      error otherwise.
+ *
+ * int bpf_udp_flow_src_port(struct sk_buff *skb, int min, int max, int use_eth)
+ *  Description
+ *      It's common to implement tunnelling inside a UDP protocol to provide
+ *      additional randomness to the packet. The destination port of the UDP
+ *      header indicates the inner packet type whereas the source port is used
+ *      for additional entropy.
+ *
+ *  Return
+ *      An obfuscated hash of the packet that falls within the
+ *      min & max port range.
+ *      If min >= max, the default port range is used
  *
  * int bpf_sysctl_get_name(struct bpf_sysctl *ctx, char *buf, size_t buf_len, u64 flags)
  *	Description
@@ -2853,7 +2865,8 @@ union bpf_attr {
 	FN(sk_storage_get),		\
 	FN(sk_storage_delete),		\
 	FN(send_signal),		\
-	FN(tcp_gen_syncookie),
+	FN(tcp_gen_syncookie),  \
+	FN(udp_flow_src_port),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
index 120aa86c58d3..385bfd8b7436 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -313,6 +313,8 @@ static unsigned int (*bpf_set_hash)(void *ctx, __u32 hash) =
 static int (*bpf_skb_adjust_room)(void *ctx, __s32 len_diff, __u32 mode,
 				  unsigned long long flags) =
 	(void *) BPF_FUNC_skb_adjust_room;
+static int (*bpf_udp_flow_src_port)(void *ctx, int min, int max, int use_eth) =
+	(void *) BPF_FUNC_udp_flow_src_port;
 
 /* Scan the ARCH passed in from ARCH env variable (see Makefile) */
 #if defined(__TARGET_ARCH_x86)
diff --git a/tools/testing/selftests/bpf/prog_tests/udp_flow_src_port.c b/tools/testing/selftests/bpf/prog_tests/udp_flow_src_port.c
new file mode 100644
index 000000000000..0f7303b51d1d
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/udp_flow_src_port.c
@@ -0,0 +1,28 @@
+#include <test_progs.h>
+#include <linux/pkt_cls.h>
+
+void test_udp_flow_src_port(void)
+{
+	const char *file = "./test_udp_flow_src_port_kern.o";
+	struct bpf_object *obj;
+	__u32 duration, retval, size;
+	int err, prog_fd;
+    char buf[128];
+    struct tcphdr *tcph = (void *)buf + sizeof(struct ethhdr) + sizeof(struct iphdr);
+
+	err = bpf_prog_load(file, BPF_PROG_TYPE_SCHED_CLS, &obj, &prog_fd);
+	if (err) {
+		error_cnt++;
+		return;
+	}
+
+    short original = tcph->source;
+
+	err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),
+				buf, &size, &retval, &duration);
+	CHECK(err || retval != TC_ACT_OK || tcph->source == original, "ipv4",
+	      "err %d errno %d retval %d sport %d duration %d\n",
+	      err, errno, retval, tcph->source, duration);
+
+	bpf_object__close(obj);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_udp_flow_src_port_kern.c b/tools/testing/selftests/bpf/progs/test_udp_flow_src_port_kern.c
new file mode 100644
index 000000000000..6238bd5fa856
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_udp_flow_src_port_kern.c
@@ -0,0 +1,47 @@
+#include <stddef.h>
+#include <stdbool.h>
+#include <string.h>
+#include <linux/bpf.h>
+#include <linux/if_ether.h>
+#include <linux/in.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/pkt_cls.h>
+#include <linux/tcp.h>
+#include "bpf_helpers.h"
+#include "bpf_endian.h"
+
+int _version SEC("version") = 1;
+char _license[] SEC("license") = "GPL";
+
+SEC("skb_udp_flow")
+int process(struct __sk_buff *skb)
+{
+    void *data = (void *)(size_t)skb->data;
+    void *data_end = (void *)(size_t)skb->data_end;
+    /* Is it an Ethernet frame? */
+    struct ethhdr *ethernet_header = (struct ethhdr *)data;
+    data += sizeof(*ethernet_header);
+    if (data > data_end) {
+        return TC_ACT_SHOT;
+    }
+
+    struct iphdr *ip_header = (struct iphdr *)data;
+    data += sizeof(*ip_header);
+    if (data > data_end) {
+        return TC_ACT_SHOT;
+    }
+
+    struct tcphdr *tcp_header = (struct tcphdr*)data;
+    data += sizeof(*tcp_header);
+    if (data > data_end) {
+        return TC_ACT_SHOT;
+    }
+
+    //lets assign the calculated source port in the
+    // tcp packet and verify it in the test
+    int sport = bpf_udp_flow_src_port(skb, 0, 0, 0);
+    tcp_header->source = sport;
+
+    return TC_ACT_OK;
+}
-- 
2.21.0


^ permalink raw reply related

* [PATCH 0/1] bpf: introduce new helper udp_flow_src_port
From: Farid Zakaria @ 2019-08-03  4:43 UTC (permalink / raw)
  To: ast, daniel, netdev, bpf; +Cc: Farid Zakaria

This is a submission to expose a new eBPF helper method
to allow access to udp_flow_src_port -- which is useful
when doing any Foo Over UDP network tunneling.

I hope this change adheres to the submission guidelines.
I've included a test and verified it passes:

./test_progs -t udp_flow_src_port
#31 udp_flow_src_port:OK
Summary: 1/0 PASSED, 0 FAILED

* shoutout to https://github.com/g2p/vido/blob/master/vido
which made testing the kernel + change super easy *
(This is my first contribution)

Farid Zakaria (1):
  bpf: introduce new helper udp_flow_src_port

 include/uapi/linux/bpf.h                      | 21 +++++++--
 net/core/filter.c                             | 20 ++++++++
 tools/include/uapi/linux/bpf.h                | 21 +++++++--
 tools/testing/selftests/bpf/bpf_helpers.h     |  2 +
 .../bpf/prog_tests/udp_flow_src_port.c        | 28 +++++++++++
 .../bpf/progs/test_udp_flow_src_port_kern.c   | 47 +++++++++++++++++++
 6 files changed, 131 insertions(+), 8 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/udp_flow_src_port.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_udp_flow_src_port_kern.c

-- 
2.21.0


^ permalink raw reply

* Re: [PATCH] net/mlx4_core: Use refcount_t for refcount
From: Chuhong Yuan @ 2019-08-03  2:42 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: linux-kernel@vger.kernel.org, davem@davemloft.net, Tariq Toukan,
	linux-rdma@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <47bb83d0111f1132bbf532c16be483c5efbe839f.camel@mellanox.com>

Saeed Mahameed <saeedm@mellanox.com> 于2019年8月3日周六 上午2:38写道：
>
> On Sat, 2019-08-03 at 00:10 +0800, Chuhong Yuan wrote:
> > Chuhong Yuan <hslester96@gmail.com> 于2019年8月2日周五 下午8:10写道：
> > > refcount_t is better for reference counters since its
> > > implementation can prevent overflows.
> > > So convert atomic_t ref counters to refcount_t.
> > >
> > > Also convert refcount from 0-based to 1-based.
> > >
> >
> > It seems that directly converting refcount from 0-based
> > to 1-based is infeasible.
> > I am sorry for this mistake.
>
> Just curious, why not keep it 0 based and use refcout_t ?
>
> refcount API should have the same semantics as atomic_t API .. no ?

refcount API will warn when increase a 0 refcount.
It regards this as a use-after-free.

^ permalink raw reply

* Re: [PATCH 31/34] nfs: convert put_page() to put_user_page*()
From: John Hubbard @ 2019-08-03  1:41 UTC (permalink / raw)
  To: Calum Mackay, john.hubbard, Andrew Morton
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, Trond Myklebust, Anna Schumaker
In-Reply-To: <1738cb1e-15d8-0bbe-5362-341664f6efc8@oracle.com>

On 8/2/19 6:27 PM, Calum Mackay wrote:
> On 02/08/2019 3:20 am, john.hubbard@gmail.com wrote:
... 
> Since it's static, and only called twice, might it be better to change its two callers [nfs_direct_{read,write}_schedule_iovec()] to call put_user_pages() directly, and remove nfs_direct_release_pages() entirely?
> 
> thanks,
> calum.
> 
> 
>>     void nfs_init_cinfo_from_dreq(struct nfs_commit_info *cinfo,
>>
 
Hi Calum,

Absolutely! Is it OK to add your reviewed-by, with the following incremental
patch made to this one?

diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index b00b89dda3c5..c0c1b9f2c069 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -276,11 +276,6 @@ ssize_t nfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
        return nfs_file_direct_write(iocb, iter);
 }
 
-static void nfs_direct_release_pages(struct page **pages, unsigned int npages)
-{
-       put_user_pages(pages, npages);
-}
-
 void nfs_init_cinfo_from_dreq(struct nfs_commit_info *cinfo,
                              struct nfs_direct_req *dreq)
 {
@@ -510,7 +505,7 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
                        pos += req_len;
                        dreq->bytes_left -= req_len;
                }
-               nfs_direct_release_pages(pagevec, npages);
+               put_user_pages(pagevec, npages);
                kvfree(pagevec);
                if (result < 0)
                        break;
@@ -933,7 +928,7 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
                        pos += req_len;
                        dreq->bytes_left -= req_len;
                }
-               nfs_direct_release_pages(pagevec, npages);
+               put_user_pages(pagevec, npages);
                kvfree(pagevec);
                if (result < 0)
                        break;



thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply related

* Re: [PATCH] net: mdio-octeon: Fix build error and Kconfig warning
From: Matthew Wilcox @ 2019-08-03  1:39 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: David Miller, andrew, broonie, devel, f.fainelli, gregkh,
	hkallweit1, kernel-build-reports, linux-arm-kernel, linux-next,
	netdev, lkp, rdunlap
In-Reply-To: <20190803013031.GA76252@archlinux-threadripper>

On Fri, Aug 02, 2019 at 06:30:31PM -0700, Nathan Chancellor wrote:
> On Fri, Aug 02, 2019 at 06:11:32PM -0700, David Miller wrote:
> > The proper way to fix this is to include either
> > 
> > 	linux/io-64-nonatomic-hi-lo.h
> > 
> > or
> > 
> > 	linux/io-64-nonatomic-lo-hi.h
> > 
> > whichever is appropriate.
> 
> Hmmmm, is that not what I did?
> 
> Although I did not know about io-64-nonatomic-hi-lo.h. What is the
> difference and which one is needed here?

Whether you write the high or low 32 bits first.  For this, it doesn't
matter, since the compiled driver will never be run on real hardware.

> There is apparently another failure when OF_MDIO is not set, I guess I
> can try to look into that as well and respin into a series if
> necessary.

Thanks for taking care of that!

^ permalink raw reply

* Re: [PATCH 31/34] nfs: convert put_page() to put_user_page*()
From: Calum Mackay @ 2019-08-03  1:27 UTC (permalink / raw)
  To: john.hubbard, Andrew Morton
  Cc: calum.mackay, Christoph Hellwig, Dan Williams, Dave Chinner,
	Dave Hansen, Ira Weiny, Jan Kara, Jason Gunthorpe,
	Jérôme Glisse, LKML, amd-gfx, ceph-devel, devel, devel,
	dri-devel, intel-gfx, kvm, linux-arm-kernel, linux-block,
	linux-crypto, linux-fbdev, linux-fsdevel, linux-media, linux-mm,
	linux-nfs, linux-rdma, linux-rpi-kernel, linux-xfs, netdev,
	rds-devel, sparclinux, x86, xen-devel, John Hubbard,
	Trond Myklebust, Anna Schumaker
In-Reply-To: <20190802022005.5117-32-jhubbard@nvidia.com>

On 02/08/2019 3:20 am, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> For pages that were retained via get_user_pages*(), release those pages
> via the new put_user_page*() routines, instead of via put_page() or
> release_pages().
> 
> This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
> ("mm: introduce put_user_page*(), placeholder versions").
> 
> Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
> Cc: Anna Schumaker <anna.schumaker@netapp.com>
> Cc: linux-nfs@vger.kernel.org
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>   fs/nfs/direct.c | 4 +---
>   1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
> index 0cb442406168..b00b89dda3c5 100644
> --- a/fs/nfs/direct.c
> +++ b/fs/nfs/direct.c
> @@ -278,9 +278,7 @@ ssize_t nfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>   
>   static void nfs_direct_release_pages(struct page **pages, unsigned int npages)
>   {
> -	unsigned int i;
> -	for (i = 0; i < npages; i++)
> -		put_page(pages[i]);
> +	put_user_pages(pages, npages);
>   }

Since it's static, and only called twice, might it be better to change 
its two callers [nfs_direct_{read,write}_schedule_iovec()] to call 
put_user_pages() directly, and remove nfs_direct_release_pages() entirely?

thanks,
calum.


>   
>   void nfs_init_cinfo_from_dreq(struct nfs_commit_info *cinfo,
> 

^ permalink raw reply

* Re: [PATCH] net: mdio-octeon: Fix build error and Kconfig warning
From: Nathan Chancellor @ 2019-08-03  1:30 UTC (permalink / raw)
  To: David Miller
  Cc: andrew, broonie, devel, f.fainelli, gregkh, hkallweit1,
	kernel-build-reports, linux-arm-kernel, linux-next, netdev, willy,
	lkp, rdunlap
In-Reply-To: <20190802.181132.1425585873361511856.davem@davemloft.net>

On Fri, Aug 02, 2019 at 06:11:32PM -0700, David Miller wrote:
> From: Nathan Chancellor <natechancellor@gmail.com>
> Date: Wed, 31 Jul 2019 11:50:24 -0700
> 
> > arm allyesconfig warns:
> > 
> > WARNING: unmet direct dependencies detected for MDIO_OCTEON
> >   Depends on [n]: NETDEVICES [=y] && MDIO_DEVICE [=y] && MDIO_BUS [=y]
> > && 64BIT && HAS_IOMEM [=y] && OF_MDIO [=y]
> >   Selected by [y]:
> >   - OCTEON_ETHERNET [=y] && STAGING [=y] && (CAVIUM_OCTEON_SOC &&
> > NETDEVICES [=y] || COMPILE_TEST [=y])
> > 
> > and errors:
> > 
> > In file included from ../drivers/net/phy/mdio-octeon.c:14:
> > ../drivers/net/phy/mdio-octeon.c: In function 'octeon_mdiobus_probe':
> > ../drivers/net/phy/mdio-cavium.h:111:36: error: implicit declaration of
> > function 'writeq'; did you mean 'writeb'?
> 
> The proper way to fix this is to include either
> 
> 	linux/io-64-nonatomic-hi-lo.h
> 
> or
> 
> 	linux/io-64-nonatomic-lo-hi.h
> 
> whichever is appropriate.

Hmmmm, is that not what I did?

Although I did not know about io-64-nonatomic-hi-lo.h. What is the
difference and which one is needed here?

There is apparently another failure when OF_MDIO is not set, I guess I
can try to look into that as well and respin into a series if
necessary.

Cheers,
Nathan

^ permalink raw reply

* NFSv4: rare bug *and* root cause captured in the wild
From: John Hubbard @ 2019-08-03  1:27 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker, David S. Miller, Chuck Lever,
	J. Bruce Fields, linux-nfs, netdev, LKML

Hi,

While testing unrelated (put_user_pages) work on Linux 5.3-rc2+,
I rebooted the NFS *server*, tried to ssh to the client, and the
client dumped a backtrace as shown below.

Good news: I found that I can reliably reproduce it with those steps, 
at commit 1e78030e5e5b (in linux.git) plus my 34-patch series [1], which
off course is completely unrelated. :) Anyway, I'm making a note of the 
exact repro commit, so I don't lose the repro.

I see what's wrong, but I do *not* see an easy fix. Can one of the
designers please recommend an approach to fixing this?

This is almost certainly caused by commit 7e0a0e38fcfe ("SUNRPC: 
Replace the queue timer with a delayed work function"), which changed 
over to running things in process (kthread) context. The commit is dated
May 1, 2019, but I've only been running NFSv4 for a couple days, so
the problem has likely been there all along, not specific to 5.3.

The call stack starts off in atomic context, so we get the bug:

nfs4_do_reclaim
    rcu_read_lock /* we are now in_atomic() and must not sleep */
        nfs4_purge_state_owners
            nfs4_free_state_owner
                nfs4_destroy_seqid_counter
                    rpc_destroy_wait_queue
                        cancel_delayed_work_sync
                            __cancel_work_timer
                                __flush_work
                                    start_flush_work
                                        might_sleep: 
                                         (kernel/workqueue.c:2975: BUG)

Details: actual backtrace I am seeing:

BUG: sleeping function called from invalid context at kernel/workqueue.c:2975
in_atomic(): 1, irqs_disabled(): 0, pid: 2224, name: 10.110.48.28-ma
1 lock held by 10.110.48.28-ma/2224:
 #0: 00000000d338d2ec (rcu_read_lock){....}, at: nfs4_do_reclaim+0x22/0x6b0 [nfsv4]
CPU: 8 PID: 2224 Comm: 10.110.48.28-ma Not tainted 5.3.0-rc2-hubbard-github+ #52
Hardware name: ASUS X299-A/PRIME X299-A, BIOS 1704 02/14/2019
Call Trace:
 dump_stack+0x46/0x60
 ___might_sleep.cold+0x8e/0x9b
 __flush_work+0x61/0x370
 ? find_held_lock+0x2b/0x80
 ? add_timer+0x100/0x200
 ? _raw_spin_lock_irqsave+0x35/0x40
 __cancel_work_timer+0xfb/0x180
 ? nfs4_purge_state_owners+0xf4/0x170 [nfsv4]
 nfs4_free_state_owner+0x10/0x50 [nfsv4]
 nfs4_purge_state_owners+0x139/0x170 [nfsv4]
 nfs4_do_reclaim+0x7a/0x6b0 [nfsv4]
 ? pnfs_destroy_layouts_byclid+0xc4/0x100 [nfsv4]
 nfs4_state_manager+0x6be/0x7f0 [nfsv4]
 nfs4_run_state_manager+0x1b/0x40 [nfsv4]
 kthread+0xfb/0x130
 ? nfs4_state_manager+0x7f0/0x7f0 [nfsv4]
 ? kthread_bind+0x20/0x20
 ret_from_fork+0x35/0x40

And last but not least, some words of encouragement: the reason I moved 
from NFSv3 to NFSv4 is that the easy authentication (matching UIDs on
client and server) now works perfectly. Yea! So I'm enjoying v4, despite 
the occasional minor glitch. :)

[1] https://lore.kernel.org/r/20190802022005.5117-1-jhubbard@nvidia.com

thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply

* Re: [PATCH net] r8152: fix typo in register name
From: David Miller @ 2019-08-03  1:17 UTC (permalink / raw)
  To: kevlo; +Cc: hayeswang, netdev
In-Reply-To: <20190801032938.GA22256@ns.kevlo.org>

From: Kevin Lo <kevlo@kevlo.org>
Date: Thu, 1 Aug 2019 11:29:38 +0800

> It is likely that PAL_BDC_CR should be PLA_BDC_CR.
> 
> Signed-off-by: Kevin Lo <kevlo@kevlo.org>

Applied.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox