Netdev List
* [PATCH RFC 0/2] HWMON support for SFP modules
From: Andrew Lunn @ 2018-06-28 20:41 UTC (permalink / raw)
  To: netdev, Florian Fainelli, Guenter Roeck, Russell King, vadimp,
	linux-hwmon
  Cc: Andrew Lunn

This patchset adds HWMON support to SFP modules. The first patch adds
some attributes for power sensors which are currently missing from the
hwmon core. The second patch then extends the core SFP code to export
the sensors found in SFP modules.

This code has been tested with two SFP modules:

module OEM SFP-7000-85 rev 11.0 sn M1512220075 dc 160221
module FINISAR CORP. FTLF8524E2GNL rev A sn PW40MNN dc 160725

The anonymous module uses external calibration, while the FINISAR uses
internal calibration. Thus both code paths have been tested.

Andrew Lunn (2):
  hwmon: Add support for power min, lcrit, min_alarm and lcrit_alarm
  net: phy: sfp: Add HWMON support for module sensors

 drivers/hwmon/hwmon.c |   4 +
 drivers/net/phy/sfp.c | 732 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/hwmon.h |   8 +
 include/linux/sfp.h   |  72 ++++-
 4 files changed, 815 insertions(+), 1 deletion(-)

-- 
2.18.0.rc2

^ permalink raw reply

* Re: [PATCH net-next v2] tcp: force cwnd at least 2 in tcp_cwnd_reduction
From: Neal Cardwell @ 2018-06-28 20:47 UTC (permalink / raw)
  To: Lawrence Brakmo
  Cc: Yuchung Cheng, Matt Mathis, Netdev, Kernel Team, bmatheny, ast,
	Eric Dumazet, Wei Wang, Steve Ibanez, Yousuk Seung
In-Reply-To: <ADD5DEDF-213D-4375-B556-E9E44DD94130@fb.com>

On Thu, Jun 28, 2018 at 4:20 PM Lawrence Brakmo <brakmo@fb.com> wrote:
>
> I just looked at 4.18 traces and the behavior is as follows:
>
>    Host A sends the last packets of the request
>
>    Host B receives them, and the last packet is marked with congestion (CE)
>
>    Host B sends ACKs for packets not marked with congestion
>
>    Host B sends data packet with reply and ACK for packet marked with congestion (TCP flag ECE)
>
>    Host A receives ACKs with no ECE flag
>
>    Host A receives data packet with ACK for the last packet of request and has TCP ECE bit set
>
>    Host A sends 1st data packet of the next request with TCP flag CWR
>
>    Host B receives the packet (as seen in tcpdump at B), no CE flag
>
>    Host B sends a dup ACK that also has the TCP ECE flag
>
>    Host A RTO timer fires!
>
>    Host A to send the next packet
>
>    Host A receives an ACK for everything it has sent (i.e. Host B did receive 1st packet of request)
>
>    Host A send more packets…

Thanks, Larry! This is very interesting. I don't know the cause, but
this reminds me of an issue Steve Ibanez raised on the netdev list
last December, where he was seeing cases with DCTCP where a CWR packet
would be received and buffered by Host B but not ACKed by Host B. This
was the thread "Re: Linux ECN Handling", starting around December 5. I
have cc-ed Steve.

I wonder if this may somehow be related to the DCTCP logic to rewind
tp->rcv_nxt and call tcp_send_ack(), and then restore tp->rcv_nxt, if
DCTCP notices that the incoming CE bits have been changed while the
receiver thinks it is holding on to a delayed ACK (in
dctcp_ce_state_0_to_1() and dctcp_ce_state_1_to_0()). I wonder if the
"synthetic" call to tcp_send_ack() somehow has side effects in the
delayed ACK state machine that can cause the connection to forget that
it still needs to fire a delayed ACK, even though it just sent an ACK
just now.
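
A toy model of the hypothesis above, as a minimal sketch: this is not kernel
code, and every name in it (toy_rx, toy_send_ack(), and so on) is made up for
illustration. It assumes that sending any ACK clears the "delayed ACK pending"
state, which is exactly the side effect being wondered about, and shows how a
rewind/ACK/restore sequence could then leave the newest data with no ACK
scheduled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy receiver state; field names are illustrative, not the kernel's. */
struct toy_rx {
	uint32_t rcv_nxt;	/* next sequence number expected */
	uint32_t last_acked;	/* highest sequence number ACKed so far */
	bool ack_pending;	/* a delayed ACK is scheduled */
};

/* ASSUMPTION of this model: sending any ACK cancels the delayed ACK. */
static void toy_send_ack(struct toy_rx *rx)
{
	rx->last_acked = rx->rcv_nxt;
	rx->ack_pending = false;
}

/* New data arrives and an ACK for it is scheduled for later. */
static void toy_receive(struct toy_rx *rx, uint32_t len)
{
	assert(len > 0);
	rx->rcv_nxt += len;
	rx->ack_pending = true;
}

/* CE state change: rewind rcv_nxt, ACK the older data, then restore. */
static void toy_ce_state_change(struct toy_rx *rx, uint32_t prior_rcv_nxt)
{
	uint32_t saved = rx->rcv_nxt;

	rx->rcv_nxt = prior_rcv_nxt;
	toy_send_ack(rx);
	rx->rcv_nxt = saved;
}

/* True when data is still unACKed but no delayed ACK is scheduled. */
static bool toy_ack_forgotten(const struct toy_rx *rx)
{
	return rx->last_acked != rx->rcv_nxt && !rx->ack_pending;
}
```

In this model, receiving a segment and then running toy_ce_state_change()
with the previous rcv_nxt leaves toy_ack_forgotten() true: the newest segment
would only be ACKed once the sender retransmits after an RTO. Whether the real
delayed ACK state machine behaves this way is the open question.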

neal

^ permalink raw reply

* Re: [PATCH net-next v2] tcp: force cwnd at least 2 in tcp_cwnd_reduction
From: Lawrence Brakmo @ 2018-06-28 20:58 UTC (permalink / raw)
  To: Neal Cardwell
  Cc: Yuchung Cheng, Matt Mathis, Netdev, Kernel Team, Blake Matheny,
	Alexei Starovoitov, Eric Dumazet, Wei Wang, Steve Ibanez,
	Yousuk Seung
In-Reply-To: <CADVnQy=MsiEBCr+Mnp97mp0MxDqrA+_KiZEQehgcDfe9L-hghQ@mail.gmail.com>



On 6/28/18, 1:48 PM, "netdev-owner@vger.kernel.org on behalf of Neal Cardwell" <netdev-owner@vger.kernel.org on behalf of ncardwell@google.com> wrote:

    On Thu, Jun 28, 2018 at 4:20 PM Lawrence Brakmo <brakmo@fb.com> wrote:
    >
    > I just looked at 4.18 traces and the behavior is as follows:
    >
    >    Host A sends the last packets of the request
    >
    >    Host B receives them, and the last packet is marked with congestion (CE)
    >
    >    Host B sends ACKs for packets not marked with congestion
    >
    >    Host B sends data packet with reply and ACK for packet marked with congestion (TCP flag ECE)
    >
    >    Host A receives ACKs with no ECE flag
    >
    >    Host A receives data packet with ACK for the last packet of request and has TCP ECE bit set
    >
    >    Host A sends 1st data packet of the next request with TCP flag CWR
    >
    >    Host B receives the packet (as seen in tcpdump at B), no CE flag
    >
    >    Host B sends a dup ACK that also has the TCP ECE flag
    >
    >    Host A RTO timer fires!
    >
    >    Host A to send the next packet
    >
    >    Host A receives an ACK for everything it has sent (i.e. Host B did receive 1st packet of request)
    >
    >    Host A send more packets…
    
    Thanks, Larry! This is very interesting. I don't know the cause, but
    this reminds me of an issue Steve Ibanez raised on the netdev list
    last December, where he was seeing cases with DCTCP where a CWR packet
    would be received and buffered by Host B but not ACKed by Host B. This
    was the thread "Re: Linux ECN Handling", starting around December 5. I
    have cc-ed Steve.
    
    I wonder if this may somehow be related to the DCTCP logic to rewind
    tp->rcv_nxt and call tcp_send_ack(), and then restore tp->rcv_nxt, if
    DCTCP notices that the incoming CE bits have been changed while the
    receiver thinks it is holding on to a delayed ACK (in
    dctcp_ce_state_0_to_1() and dctcp_ce_state_1_to_0()). I wonder if the
    "synthetic" call to tcp_send_ack() somehow has side effects in the
    delayed ACK state machine that can cause the connection to forget that
    it still needs to fire a delayed ACK, even though it just sent an ACK
    just now.
    
    neal
    
Just to reiterate that there are 2 distinct issues:

1) When the last data packet of a request is received with the CE flag, it causes a dup ack with ECE flag to be sent after the first data packet of the next request is received.

2) When one of the last (but not the last) data packets of a request is received with the CE flag, it causes a delayed ack to be sent after the first data packet of the next request is received. This delayed ack does acknowledge the data in 1st packet of the reply.

^ permalink raw reply

* Re: [PATCH bpf-next 2/7] lib: reciprocal_div: implement the improved algorithm on the paper mentioned
From: Jakub Kicinski @ 2018-06-28 21:05 UTC (permalink / raw)
  To: Jiong Wang
  Cc: Song Liu, Alexei Starovoitov, Daniel Borkmann, oss-drivers,
	Networking
In-Reply-To: <CAMsOgNDapubGdbP=nB8r8Mps5pMtjK9CSr3sn-LvDCOvyCg4Wg@mail.gmail.com>

On Thu, 28 Jun 2018 20:02:43 +0100, Jiong Wang wrote:
> > If that's the case, we should at least add a WARNING on the slow path.  
> 
> OK, I will add a pr_warn inside "reciprocal_value_adv" when l == 32 is
> triggered.

WARN() seems useful, given seeing l == 32 means the code calling this
function is buggy, and we want to see the back trace to figure out how
it happened.
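
A user-space sketch of the check being discussed, under stated assumptions:
l is taken to be ceil(log2(d)) computed as fls(d - 1), and the helper names
(fls32(), ceil_log2_u32(), divisor_unsupported()) are hypothetical. The
fprintf() merely stands in for the kernel's WARN(), which would also print a
back trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* 1-based position of the highest set bit; 0 when x == 0. */
static unsigned int fls32(uint32_t x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* ceil(log2(d)) for d >= 1. */
static unsigned int ceil_log2_u32(uint32_t d)
{
	assert(d != 0);
	return fls32(d - 1);
}

/* Report divisors with l == 32, i.e. any d above 2^31. */
static bool divisor_unsupported(uint32_t d)
{
	unsigned int l = ceil_log2_u32(d);

	if (l == 32) {
		fprintf(stderr,
			"ceil(log2(0x%08x)) == 32, caller must handle this divisor\n",
			(unsigned int)d);
		return true;
	}
	return false;
}
```

Only divisors larger than 0x80000000 reach l == 32, so a caller that filters
those out beforehand should never trigger the warning.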

^ permalink raw reply

* Re: [PATCH bpf-next 03/14] bpf: pass a pointer to a cgroup storage using pcpu variable
From: kbuild test robot @ 2018-06-28 21:08 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: kbuild-all, netdev, linux-kernel, kernel-team, tj, Roman Gushchin,
	Alexei Starovoitov, Daniel Borkmann
In-Reply-To: <20180628164719.28215-4-guro@fb.com>

[-- Attachment #1: Type: text/plain, Size: 5144 bytes --]

Hi Roman,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Roman-Gushchin/bpf-cgroup-local-storage/20180629-031527
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.2.0 make.cross ARCH=arm 

All errors (new ones prefixed by >>):

   In file included from kernel/bpf/local_storage.c:2:0:
>> include/linux/bpf-cgroup.h:23:24: error: unknown type name 'bpf_cgroup_storage'
    DECLARE_PER_CPU(void*, bpf_cgroup_storage);
                           ^~~~~~~~~~~~~~~~~~
   include/linux/bpf-cgroup.h: In function 'bpf_cgroup_storage_set':
>> include/linux/bpf-cgroup.h:109:2: error: implicit declaration of function 'this_cpu_write'; did you mean 'init_cpu_online'? [-Werror=implicit-function-declaration]
     this_cpu_write(bpf_cgroup_storage, &buf->data[0]);
     ^~~~~~~~~~~~~~
     init_cpu_online
>> include/linux/bpf-cgroup.h:109:17: error: 'bpf_cgroup_storage' undeclared (first use in this function)
     this_cpu_write(bpf_cgroup_storage, &buf->data[0]);
                    ^~~~~~~~~~~~~~~~~~
   include/linux/bpf-cgroup.h:109:17: note: each undeclared identifier is reported only once for each function it appears in
   cc1: some warnings being treated as errors

vim +/bpf_cgroup_storage +23 include/linux/bpf-cgroup.h

    22	
  > 23	DECLARE_PER_CPU(void*, bpf_cgroup_storage);
    24	
    25	struct bpf_cgroup_storage_map;
    26	
    27	struct bpf_storage_buffer {
    28		struct rcu_head rcu;
    29		char data[0];
    30	};
    31	
    32	struct bpf_cgroup_storage {
    33		struct bpf_storage_buffer *buf;
    34		struct bpf_cgroup_storage_map *map;
    35		struct bpf_cgroup_storage_key key;
    36		struct list_head list;
    37		struct rb_node node;
    38		struct rcu_head rcu;
    39	};
    40	
    41	struct bpf_prog_list {
    42		struct list_head node;
    43		struct bpf_prog *prog;
    44	};
    45	
    46	struct bpf_prog_array;
    47	
    48	struct cgroup_bpf {
    49		/* array of effective progs in this cgroup */
    50		struct bpf_prog_array __rcu *effective[MAX_BPF_ATTACH_TYPE];
    51	
    52		/* attached progs to this cgroup and attach flags
    53		 * when flags == 0 or BPF_F_ALLOW_OVERRIDE the progs list will
    54		 * have either zero or one element
    55		 * when BPF_F_ALLOW_MULTI the list can have up to BPF_CGROUP_MAX_PROGS
    56		 */
    57		struct list_head progs[MAX_BPF_ATTACH_TYPE];
    58		u32 flags[MAX_BPF_ATTACH_TYPE];
    59	
    60		/* temp storage for effective prog array used by prog_attach/detach */
    61		struct bpf_prog_array __rcu *inactive;
    62	};
    63	
    64	void cgroup_bpf_put(struct cgroup *cgrp);
    65	int cgroup_bpf_inherit(struct cgroup *cgrp);
    66	
    67	int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
    68				enum bpf_attach_type type, u32 flags);
    69	int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
    70				enum bpf_attach_type type, u32 flags);
    71	int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
    72			       union bpf_attr __user *uattr);
    73	
    74	/* Wrapper for __cgroup_bpf_*() protected by cgroup_mutex */
    75	int cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
    76			      enum bpf_attach_type type, u32 flags);
    77	int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
    78			      enum bpf_attach_type type, u32 flags);
    79	int cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
    80			     union bpf_attr __user *uattr);
    81	
    82	int __cgroup_bpf_run_filter_skb(struct sock *sk,
    83					struct sk_buff *skb,
    84					enum bpf_attach_type type);
    85	
    86	int __cgroup_bpf_run_filter_sk(struct sock *sk,
    87				       enum bpf_attach_type type);
    88	
    89	int __cgroup_bpf_run_filter_sock_addr(struct sock *sk,
    90					      struct sockaddr *uaddr,
    91					      enum bpf_attach_type type,
    92					      void *t_ctx);
    93	
    94	int __cgroup_bpf_run_filter_sock_ops(struct sock *sk,
    95					     struct bpf_sock_ops_kern *sock_ops,
    96					     enum bpf_attach_type type);
    97	
    98	int __cgroup_bpf_check_dev_permission(short dev_type, u32 major, u32 minor,
    99					      short access, enum bpf_attach_type type);
   100	
   101	static inline void bpf_cgroup_storage_set(struct bpf_cgroup_storage *storage)
   102	{
   103		struct bpf_storage_buffer *buf;
   104	
   105		if (!storage)
   106			return;
   107	
   108		buf = rcu_dereference(storage->buf);
 > 109		this_cpu_write(bpf_cgroup_storage, &buf->data[0]);
   110	}
   111	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 66159 bytes --]

^ permalink raw reply

* Re: [PATCH 6/6] fs: replace f_ops->get_poll_head with a static ->f_poll_head pointer
From: Linus Torvalds @ 2018-06-28 21:11 UTC (permalink / raw)
  To: Al Viro; +Cc: Christoph Hellwig, linux-fsdevel, Network Development, LKP
In-Reply-To: <20180628202837.GI30522@ZenIV.linux.org.uk>

On Thu, Jun 28, 2018 at 1:28 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
>
> Sure, but...
>
> static __poll_t binder_poll(struct file *filp,
>                                 struct poll_table_struct *wait)
> {
>         struct binder_proc *proc = filp->private_data;
>         struct binder_thread *thread = NULL;
>         bool wait_for_proc_work;
>
>         thread = binder_get_thread(proc);
>         if (!thread)
>                 return POLLERR;

That's actually fine.

In particular, it's ok to *not* add yourself to the wait-queues if you
already return the value that you will always return.

For example, the poll function for a regular file could just always return

        EPOLLIN | EPOLLOUT | EPOLLRDNORM | EPOLLWRNORM

because it never changes. Now, it doesn't actually *do* that, because
we have the rule that "no ->poll function" means exactly that, but
it's not wrong.

Similarly, if a poll handler has an error that will not go away, the
above is actually the *right* thing to do. There simply isn't anything
to wait for.

Arguably, it probably should return not just POLLERR, but also
POLLIN/POLLOUT, since an error *also* means that it's going to return
immediately from read/write, but that's a separate thing entirely.
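
The "regular files are always ready" behaviour can be observed from user
space. A minimal sketch, assuming a POSIX system where tmpfile() yields a
regular file; poll_regular_file() is a made-up helper name. With a timeout of
0, poll() never sleeps, so whatever comes back is the mask that is "always
returned".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <stdio.h>

/* Returns the revents mask for a regular temp file, or -1 on failure. */
static int poll_regular_file(void)
{
	struct pollfd pfd;
	FILE *f = tmpfile();
	int ret;

	if (!f)
		return -1;

	pfd.fd = fileno(f);
	pfd.events = POLLIN | POLLOUT;
	pfd.revents = 0;

	ret = poll(&pfd, 1, 0);	/* timeout 0: report readiness, never wait */
	fclose(f);

	assert(ret <= 1);	/* only one descriptor was passed in */
	if (ret < 0)
		return -1;
	return pfd.revents;
}
```

Both POLLIN and POLLOUT should be set in the result, since there is nothing
for a regular file to wait for.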

> And that's hardly unique - we have instances playing with timers,
> allocations, whatnot.  Even straight mutex_lock(), as in

So?

Again, locking is permitted. It's not great, but it's not against the rules.

So none of the things you point to are excuses for changing interfaces
or adding any flags.

And no, we don't care at all about "blocking" in general. Somebody who
cares about _performance_ may care, but it's not fundamental. Even
select(*) and poll() itself will block for allocating select tables
etc, but also for reading (and updating) user space.

Anybody who thinks "select cannot block" or "->poll() mustn't block" is
just confused. It has *never* been about that. It waits asynchronously
for IO, but it may well wait synchronously for locks or memory or just
"lazy implementation".

And none of this has anything to do with the ->poll() interface
itself. If you want to improve performance on some _particular_ file
and actually worry about locking etc, you fix that particular
implementation. You don't go and change the interface for everybody.

The fact is, those interface changes were just broken shit. They were
confused. I don't actually believe that AIO even needed them.

Christoph, do you have a test program for IOCB_CMD_POLL and what it's
actually supposed to do?

Because I think that what it can do is simply to do the ->poll() calls
outside the iocb locks, and then just attach the poll table to the
kioctx afterwards.

This whole "poll must not block" is a complete red herring. It doesn't
come from any other requirements than BAD AIO GARBAGE CODE.

Seriously. Stop thinking this has to happen inside some spinlocked
region. That's AIO braindamage, it's irrelevant.

             Linus

^ permalink raw reply

* Re: [PATCH 6/6] fs: replace f_ops->get_poll_head with a static ->f_poll_head pointer
From: Linus Torvalds @ 2018-06-28 21:16 UTC (permalink / raw)
  To: Al Viro; +Cc: Christoph Hellwig, linux-fsdevel, Network Development, LKP
In-Reply-To: <20180628203746.GJ30522@ZenIV.linux.org.uk>

On Thu, Jun 28, 2018 at 1:37 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
>
> Speaking of obvious bogosities (a lot more so than a blocking allocation
> several calls down into helper):
>
> static __poll_t ca8210_test_int_poll(
>         struct file *filp,
>         struct poll_table_struct *ptable
> )

Ok, that's just garbage.

Again, this is not an excuse for changing "->poll()". This is just a
bogus driver that does stupid things.

And again, don't use this as an example of "poll must not block". The
fact is, poll() *can* block for locking and other such things, and
it's perfectly normal and ok. It just shouldn't block for IO.

I will not take any misguided patches to try to "fix" poll functions
that might block.

But I will take patches to fix bugs in drivers. In this case, I think
the fix is to just remove that crazy "wait_event_interruptible()"
entirely.

Not that anybody cares, obviously. Apparently nobody has ever noticed
how broken that poll function is.

               Linus

^ permalink raw reply

* Re: [PATCH v12 03/10] netdev: cavium: octeon: Add Octeon III BGX Ethernet Nexus
From: Carlos Munoz @ 2018-06-28 21:20 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Steven J. Hill, netdev, Chandrakala Chavva
In-Reply-To: <20180628084156.GF16727@lunn.ch>



On 06/28/2018 01:41 AM, Andrew Lunn wrote:
>
>> +static char *mix_port;
>> +module_param(mix_port, charp, 0444);
>> +MODULE_PARM_DESC(mix_port, "Specifies which ports connect to MIX interfaces.");
>> +
>> +static char *pki_port;
>> +module_param(pki_port, charp, 0444);
>> +MODULE_PARM_DESC(pki_port, "Specifies which ports connect to the PKI.");
> Module parameters are generally not liked. Can you do without them?

These parameters change the kernel port assignment required by user space applications. We would rather keep them as they simplify the process.

>
>> +             /* One time request driver module */
>> +             if (is_mix) {
>> +                     if (atomic_cmpxchg(&request_mgmt_once, 0, 1) == 0)
>> +                             request_module_nowait("octeon_mgmt");
> Why is this needed? So long as the driver has the needed properties,
> udev should load the module.
>
>      Andrew

The thing is the management module is only loaded when a port is assigned to it (determined by the above module parameter "mix_port").

Best regards,
Carlos

^ permalink raw reply

* Re: [PATCH 6/6] fs: replace f_ops->get_poll_head with a static ->f_poll_head pointer
From: Al Viro @ 2018-06-28 21:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christoph Hellwig, linux-fsdevel, Network Development, LKP
In-Reply-To: <CA+55aFxfkD9ozw77j-bshReLJN4dN2GEdW+R-LkxztceCW6OTg@mail.gmail.com>

On Thu, Jun 28, 2018 at 02:11:17PM -0700, Linus Torvalds wrote:
> On Thu, Jun 28, 2018 at 1:28 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> >
> > Sure, but...
> >
> > static __poll_t binder_poll(struct file *filp,
> >                                 struct poll_table_struct *wait)
> > {
> >         struct binder_proc *proc = filp->private_data;
> >         struct binder_thread *thread = NULL;
> >         bool wait_for_proc_work;
> >
> >         thread = binder_get_thread(proc);
> >         if (!thread)
> >                 return POLLERR;
> 
> That's actually fine.
> 
> In particular, it's ok to *not* add yourself to the wait-queues if you
> already return the value that you will always return.

Sure (and that's one of the problems I mentioned with ->get_poll_head() model).
But in this case I was referring to the GFP_KERNEL allocation down there.

> > And that's hardly unique - we have instances playing with timers,
> > allocations, whatnot.  Even straight mutex_lock(), as in
> 
> So?
> 
> Again, locking is permitted. It's not great, but it's not against the rules.

Me: a *LOT* of ->poll() instances only block in __pollwait() called (indirectly)
on the first pass.
 
You: They are *all* supposed to do it.

Me: <examples of instances that block elsewhere>

I'm not saying that blocking on other things is a bug; some of such *are* bogus,
but a lot aren't really broken.  What I said is that in a lot of cases we really
have hard "no blocking other than in callback" (and on subsequent passes there's
no callback at all).  Which is just about perfect for AIO purposes, so *IF* we
go for "new method just for AIO, those who don't have it can take a hike", we might
as well indicate that "can take a hike" in some way (be it opt-in or opt-out) and
use straight unchanged ->poll(), with alternative callback.

Looks like we were talking past each other for the last couple of rounds...

> So none of the things you point to are excuses for changing interfaces
> or adding any flags.

> Anybody who thinks "select cannot block" or "->poll() mustn't block" is
> just confused. It has *never* been about that. It waits asynchronously
> for IO, but it may well wait synchronously for locks or memory or just
> "lazy implementation".

Obviously.  I *do* understand how poll() works, really.

> The fact is, those interface changes were just broken shit. They were
> confused. I don't actually believe that AIO even needed them.
> 
> Christoph, do you have a test program for IOCB_CMD_POLL and what it's
> actually supposed to do?
> 
> Because I think that what it can do is simply to do the ->poll() calls
> outside the iocb locks, and then just attach the poll table to the
> kioctx afterwards.

I'd do a bit more - embed the first poll_table_entry into poll iocb itself,
so that the instances that use only one queue wouldn't need any allocations
at all.
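
The "embed the first entry" idea can be sketched in plain user-space C. All
names below (toy_wait_entry, toy_poll_req, toy_poll_req_add()) are
hypothetical stand-ins, not the kernel's poll_table_entry or the aio iocb;
the only point is that the first wait queue costs no allocation, while any
further queue falls back to the heap.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_wait_entry {
	int queue_id;			/* stand-in for a wait queue head */
};

struct toy_poll_req {
	struct toy_wait_entry first;	/* embedded: needs no allocation */
	struct toy_wait_entry *extra;	/* heap array, 2nd entry onwards */
	size_t nr;			/* total entries in use */
	size_t nr_allocs;		/* heap (re)allocations performed */
};

/* Register one more wait queue; returns 0 on success, -1 on OOM. */
static int toy_poll_req_add(struct toy_poll_req *req, int queue_id)
{
	struct toy_wait_entry *grown;

	assert(req != NULL);
	if (req->nr == 0) {
		req->first.queue_id = queue_id;
		req->nr = 1;
		return 0;
	}

	/* nr counts the embedded entry too, so extra holds nr - 1 entries. */
	grown = realloc(req->extra, req->nr * sizeof(*grown));
	if (!grown)
		return -1;
	grown[req->nr - 1].queue_id = queue_id;
	req->extra = grown;
	req->nr++;
	req->nr_allocs++;
	return 0;
}

static void toy_poll_req_release(struct toy_poll_req *req)
{
	free(req->extra);
	req->extra = NULL;
	req->nr = 0;
}
```

An instance that only ever registers one queue finishes with nr_allocs == 0,
which is the common case this layout is meant to favour.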

^ permalink raw reply

* Re: [PATCH 2/3] MIPS: AR7: Normalize clk API
From: Paul Burton @ 2018-06-28 21:31 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Greg Ungerer, Ralf Baechle, James Hogan, Giuseppe Cavallaro,
	Alexandre Torgue, Jose Abreu, Corentin Labbe, David S . Miller,
	Arnd Bergmann, linux-m68k, linux-mips, netdev, linux-kernel
In-Reply-To: <1528706663-20670-3-git-send-email-geert@linux-m68k.org>

Hi Geert,

On Mon, Jun 11, 2018 at 10:44:22AM +0200, Geert Uytterhoeven wrote:
> Coldfire still provides its own variant of the clk API rather than using
> the generic COMMON_CLK API.  This generally works, but it causes some
> link errors with drivers using the clk_round_rate(), clk_set_rate(),
> clk_set_parent(), or clk_get_parent() functions when a platform lacks
> those interfaces.
> 
> This adds empty stub implementations for each of them, and I don't even
> try to do something useful here but instead just print a WARN() message
> to make it obvious what is going on if they ever end up being called.
> 
> The drivers that call these won't be used on these platforms (otherwise
> we'd get a link error today), so the added code is harmless bloat and
> will warn about accidental use.
> 
> Based on commit bd7fefe1f06ca6cc ("ARM: w90x900: normalize clk API").
> 
> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
> ---
>  arch/mips/ar7/clock.c | 29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)

Applied to mips-next for 4.19.

Thanks,
    Paul

^ permalink raw reply

* Re: [PATCH bpf-next 08/14] bpf: introduce the bpf_get_local_storage() helper function
From: kbuild test robot @ 2018-06-28 21:34 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: kbuild-all, netdev, linux-kernel, kernel-team, tj, Roman Gushchin,
	Alexei Starovoitov, Daniel Borkmann
In-Reply-To: <20180628164719.28215-9-guro@fb.com>

[-- Attachment #1: Type: text/plain, Size: 1590 bytes --]

Hi Roman,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Roman-Gushchin/bpf-cgroup-local-storage/20180629-031527
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: parisc-c3000_defconfig (attached as .config)
compiler: hppa-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.2.0 make.cross ARCH=parisc 

All errors (new ones prefixed by >>):

   net/core/filter.o: In function `cg_skb_func_proto':
>> (.text.cg_skb_func_proto+0x18): undefined reference to `bpf_get_local_storage_proto'
   (.text.cg_skb_func_proto+0x20): undefined reference to `bpf_get_local_storage_proto'
   net/core/filter.o: In function `sock_filter_func_proto':
>> (.text.sock_filter_func_proto+0x1c): undefined reference to `bpf_get_local_storage_proto'
   (.text.sock_filter_func_proto+0x3c): undefined reference to `bpf_get_local_storage_proto'
   net/core/filter.o: In function `.L1715':
>> (.text.sock_ops_func_proto+0x144): undefined reference to `bpf_get_local_storage_proto'
   net/core/filter.o:(.text.sock_ops_func_proto+0x14c): more undefined references to `bpf_get_local_storage_proto' follow

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 14533 bytes --]

^ permalink raw reply

* [PATCH bpf 0/3] Three BPF fixes
From: Daniel Borkmann @ 2018-06-28 21:34 UTC (permalink / raw)
  To: ast; +Cc: netdev, Daniel Borkmann

This set contains three fixes that are mostly JIT and set_memory_*()
related. The third in the series in particular fixes the syzkaller
bugs that were still pending; aside from local reproduction & testing,
also 'syz test' wasn't able to trigger them anymore. I've tested this
series on x86_64, arm64 and s390x, and kbuild bot wasn't yelling either
for the rest. For details, please see patches as usual, thanks!

Daniel Borkmann (3):
  bpf, arm32: fix to use bpf_jit_binary_lock_ro api
  bpf, s390: fix potential memleak when later bpf_jit_prog fails
  bpf: undo prog rejection on read-only lock failure

 arch/arm/net/bpf_jit_32.c    |  2 +-
 arch/s390/net/bpf_jit_comp.c |  1 +
 include/linux/filter.h       | 56 +++++++-------------------------------------
 kernel/bpf/core.c            | 30 +-----------------------
 4 files changed, 11 insertions(+), 78 deletions(-)

-- 
2.9.5

^ permalink raw reply

* [PATCH bpf 1/3] bpf, arm32: fix to use bpf_jit_binary_lock_ro api
From: Daniel Borkmann @ 2018-06-28 21:34 UTC (permalink / raw)
  To: ast; +Cc: netdev, Daniel Borkmann
In-Reply-To: <20180628213459.28631-1-daniel@iogearbox.net>

Any eBPF JIT whose underlying arch supports ARCH_HAS_SET_MEMORY
needs to use the bpf_jit_binary_{un,}lock_ro() pair instead of calling
the set_memory_{ro,rw}() pair directly, as otherwise changes to the
former might break. arm32's eBPF conversion missed this change, so fix
it up here.

Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
---
 arch/arm/net/bpf_jit_32.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 6e8b716..f6a62ae 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -1844,7 +1844,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		/* there are 2 passes here */
 		bpf_jit_dump(prog->len, image_size, 2, ctx.target);
 
-	set_memory_ro((unsigned long)header, header->pages);
+	bpf_jit_binary_lock_ro(header);
 	prog->bpf_func = (void *)ctx.target;
 	prog->jited = 1;
 	prog->jited_len = image_size;
-- 
2.9.5

^ permalink raw reply related

* [PATCH bpf 2/3] bpf, s390: fix potential memleak when later bpf_jit_prog fails
From: Daniel Borkmann @ 2018-06-28 21:34 UTC (permalink / raw)
  To: ast; +Cc: netdev, Daniel Borkmann, Martin Schwidefsky
In-Reply-To: <20180628213459.28631-1-daniel@iogearbox.net>

If we would ever fail in the bpf_jit_prog() pass that writes the
actual insns to the image after we got header via bpf_jit_binary_alloc()
then we also need to make sure to free it through bpf_jit_binary_free()
again when bailing out. Given we had prior bpf_jit_prog() passes to
initially probe for clobbered registers, program size and to fill in
addrs array for jump targets, this is more of a theoretical one,
but at least make sure this doesn't break with future changes.
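
The cleanup rule described above, as a user-space sketch rather than the s390
JIT itself: toy_image_alloc(), toy_emit() and toy_compile() are hypothetical
stand-ins for bpf_jit_binary_alloc(), the final bpf_jit_prog() pass and
bpf_int_jit_compile(). A counter of live images makes a leak visible.

```c
#include <assert.h>
#include <stdlib.h>

static int live_images;		/* images allocated and not yet freed */

static void *toy_image_alloc(size_t size)
{
	void *p = malloc(size);

	if (p)
		live_images++;
	return p;
}

static void toy_image_free(void *p)
{
	if (p) {
		free(p);
		live_images--;
	}
}

/* Stand-in for the final code-emitting pass; fail != 0 simulates an error. */
static int toy_emit(void *image, int fail)
{
	(void)image;
	return fail ? -1 : 0;
}

/* Returns the image on success, NULL on failure with nothing left behind. */
static void *toy_compile(size_t size, int fail_emit)
{
	void *image;

	assert(size > 0);
	image = toy_image_alloc(size);
	if (!image)
		return NULL;

	if (toy_emit(image, fail_emit)) {
		toy_image_free(image);	/* without this, the image leaks */
		return NULL;
	}
	return image;
}
```

With the free in place, a failing toy_compile() leaves live_images at 0;
dropping that one line would leave it at 1 after every failed compile.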

Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
---
 arch/s390/net/bpf_jit_comp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index d2db8ac..5f0234e 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -1286,6 +1286,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 		goto free_addrs;
 	}
 	if (bpf_jit_prog(&jit, fp)) {
+		bpf_jit_binary_free(header);
 		fp = orig_fp;
 		goto free_addrs;
 	}
-- 
2.9.5

^ permalink raw reply related

* [PATCH bpf 3/3] bpf: undo prog rejection on read-only lock failure
From: Daniel Borkmann @ 2018-06-28 21:34 UTC (permalink / raw)
  To: ast; +Cc: netdev, Daniel Borkmann, Laura Abbott, Kees Cook
In-Reply-To: <20180628213459.28631-1-daniel@iogearbox.net>

Partially undo commit 9facc336876f ("bpf: reject any prog that failed
read-only lock") since it caused a regression, that is, syzkaller was
able to manage to cause a panic via fault injection deep in set_memory_ro()
path by letting an allocation fail: In x86's __change_page_attr_set_clr()
it was able to change the attributes of the primary mapping but not in
the alias mapping via cpa_process_alias(), so the second, inner call
to the __change_page_attr() via __change_page_attr_set_clr() had to split
a larger page and failed in the alloc_pages() with the artificially triggered
allocation error which is then propagated down to the call site.

Thus, for set_memory_ro() this means that it returned with an error, but
from debugging a probe_kernel_write() revealed EFAULT on that memory since
the primary mapping succeeded to get changed. Therefore the subsequent
hdr->locked = 0 reset triggered the panic as it was performed on read-only
memory, so call-site assumptions were in fact wrong to assume that it would
either succeed /or/ not succeed at all since there's no such rollback in
set_memory_*() calls from partial change of mappings, in other words, we're
left in a state that is "half done". A later undo via set_memory_rw() is
succeeding though due to matching permissions on that part (aka due to the
try_preserve_large_page() succeeding). While reproducing locally with
explicitly triggering this error, the initial splitting only happens on
rare occasions and in real world it would additionally need oom conditions,
but that said, it could partially fail. Therefore, it is definitely wrong
to bail out on set_memory_ro() error and reject the program with the
set_memory_*() semantics we have today. Shouldn't have gone the extra mile
since no other user in tree today in fact checks for any set_memory_*()
errors, e.g. neither module_enable_ro() / module_disable_ro() for module
RO/NX handling which is mostly default these days nor kprobes core with
alloc_insn_page() / free_insn_page() as examples that could be invoked long
after bootup and original 314beb9bcabf ("x86: bpf_jit_comp: secure bpf jit
against spraying attacks") did neither when it got first introduced to BPF
so "improving" with bailing out was clearly not right when set_memory_*()
cannot handle it today.

Kees suggested that if set_memory_*() can fail, we should annotate it with
__must_check, and all callers need to deal with it gracefully, given those
set_memory_*() markings aren't "advisory" but are expected to actually
do what they say. This might be an option worth pursuing in the future,
but it would at the same time require that set_memory_*() calls on
supporting archs are guaranteed to be "atomic" in that they provide
rollback if part of the range fails. Once that is in place, the transition
from RW -> RO could be made more robust that way, while the subsequent
RO -> RW transition /must/ continue to be guaranteed to always succeed as
the undo part.
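
To illustrate what "atomic" with rollback could mean here, below is a
minimal userspace sketch. It is a toy model and not kernel code: all names
(toy_page, toy_set_page_ro(), toy_set_range_ro_atomic(), fail_at) are
hypothetical, and the injected failure merely stands in for an allocation
error while splitting a large page.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a page: only tracks whether it is writable. */
struct toy_page {
	bool writable;
};

/*
 * Hypothetical per-page operation. It fails when idx == fail_at,
 * mimicking an allocation failure while splitting a large page.
 * Pass fail_at = (size_t)-1 to never fail.
 */
static int toy_set_page_ro(struct toy_page *pages, size_t idx, size_t fail_at)
{
	if (idx == fail_at)
		return -1;
	pages[idx].writable = false;
	return 0;
}

/*
 * All-or-nothing RW -> RO transition. The range is assumed to be RW on
 * entry. On failure, every page that was already changed is made
 * writable again, so the caller never sees a "half done" range.
 */
int toy_set_range_ro_atomic(struct toy_page *pages, size_t n, size_t fail_at)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (toy_set_page_ro(pages, i, fail_at)) {
			/* Roll back pages [0, i). */
			while (i-- > 0)
				pages[i].writable = true;
			return -1;
		}
	}
	return 0;
}
```

With such semantics a caller could safely treat an error as "nothing
changed", which is exactly the assumption that does not hold for
set_memory_*() today.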

Reported-by: syzbot+a4eb8c7766952a1ca872@syzkaller.appspotmail.com
Reported-by: syzbot+d866d1925855328eac3b@syzkaller.appspotmail.com
Fixes: 9facc336876f ("bpf: reject any prog that failed read-only lock")
Cc: Laura Abbott <labbott@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/filter.h | 56 ++++++++------------------------------------------
 kernel/bpf/core.c      | 30 +--------------------------
 2 files changed, 9 insertions(+), 77 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 20f2659..300baad 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -470,9 +470,7 @@ struct sock_fprog_kern {
 };
 
 struct bpf_binary_header {
-	u16 pages;
-	u16 locked:1;
-
+	u32 pages;
 	/* Some arches need word alignment for their instructions */
 	u8 image[] __aligned(4);
 };
@@ -481,7 +479,7 @@ struct bpf_prog {
 	u16			pages;		/* Number of allocated pages */
 	u16			jited:1,	/* Is our filter JIT'ed? */
 				jit_requested:1,/* archs need to JIT the prog */
-				locked:1,	/* Program image locked? */
+				undo_set_mem:1,	/* Passed set_memory_ro() checkpoint */
 				gpl_compatible:1, /* Is filter GPL compatible? */
 				cb_access:1,	/* Is control block accessed? */
 				dst_needed:1,	/* Do we need dst entry? */
@@ -677,46 +675,24 @@ bpf_ctx_narrow_access_ok(u32 off, u32 size, u32 size_default)
 
 static inline void bpf_prog_lock_ro(struct bpf_prog *fp)
 {
-#ifdef CONFIG_ARCH_HAS_SET_MEMORY
-	fp->locked = 1;
-	if (set_memory_ro((unsigned long)fp, fp->pages))
-		fp->locked = 0;
-#endif
+	fp->undo_set_mem = 1;
+	set_memory_ro((unsigned long)fp, fp->pages);
 }
 
 static inline void bpf_prog_unlock_ro(struct bpf_prog *fp)
 {
-#ifdef CONFIG_ARCH_HAS_SET_MEMORY
-	if (fp->locked) {
-		WARN_ON_ONCE(set_memory_rw((unsigned long)fp, fp->pages));
-		/* In case set_memory_rw() fails, we want to be the first
-		 * to crash here instead of some random place later on.
-		 */
-		fp->locked = 0;
-	}
-#endif
+	if (fp->undo_set_mem)
+		set_memory_rw((unsigned long)fp, fp->pages);
 }
 
 static inline void bpf_jit_binary_lock_ro(struct bpf_binary_header *hdr)
 {
-#ifdef CONFIG_ARCH_HAS_SET_MEMORY
-	hdr->locked = 1;
-	if (set_memory_ro((unsigned long)hdr, hdr->pages))
-		hdr->locked = 0;
-#endif
+	set_memory_ro((unsigned long)hdr, hdr->pages);
 }
 
 static inline void bpf_jit_binary_unlock_ro(struct bpf_binary_header *hdr)
 {
-#ifdef CONFIG_ARCH_HAS_SET_MEMORY
-	if (hdr->locked) {
-		WARN_ON_ONCE(set_memory_rw((unsigned long)hdr, hdr->pages));
-		/* In case set_memory_rw() fails, we want to be the first
-		 * to crash here instead of some random place later on.
-		 */
-		hdr->locked = 0;
-	}
-#endif
+	set_memory_rw((unsigned long)hdr, hdr->pages);
 }
 
 static inline struct bpf_binary_header *
@@ -728,22 +704,6 @@ bpf_jit_binary_hdr(const struct bpf_prog *fp)
 	return (void *)addr;
 }
 
-#ifdef CONFIG_ARCH_HAS_SET_MEMORY
-static inline int bpf_prog_check_pages_ro_single(const struct bpf_prog *fp)
-{
-	if (!fp->locked)
-		return -ENOLCK;
-	if (fp->jited) {
-		const struct bpf_binary_header *hdr = bpf_jit_binary_hdr(fp);
-
-		if (!hdr->locked)
-			return -ENOLCK;
-	}
-
-	return 0;
-}
-#endif
-
 int sk_filter_trim_cap(struct sock *sk, struct sk_buff *skb, unsigned int cap);
 static inline int sk_filter(struct sock *sk, struct sk_buff *skb)
 {
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index a9e6c04..1e5625d 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -598,8 +598,6 @@ bpf_jit_binary_alloc(unsigned int proglen, u8 **image_ptr,
 	bpf_fill_ill_insns(hdr, size);
 
 	hdr->pages = size / PAGE_SIZE;
-	hdr->locked = 0;
-
 	hole = min_t(unsigned int, size - (proglen + sizeof(*hdr)),
 		     PAGE_SIZE - sizeof(*hdr));
 	start = (get_random_int() % hole) & ~(alignment - 1);
@@ -1450,22 +1448,6 @@ static int bpf_check_tail_call(const struct bpf_prog *fp)
 	return 0;
 }
 
-static int bpf_prog_check_pages_ro_locked(const struct bpf_prog *fp)
-{
-#ifdef CONFIG_ARCH_HAS_SET_MEMORY
-	int i, err;
-
-	for (i = 0; i < fp->aux->func_cnt; i++) {
-		err = bpf_prog_check_pages_ro_single(fp->aux->func[i]);
-		if (err)
-			return err;
-	}
-
-	return bpf_prog_check_pages_ro_single(fp);
-#endif
-	return 0;
-}
-
 static void bpf_prog_select_func(struct bpf_prog *fp)
 {
 #ifndef CONFIG_BPF_JIT_ALWAYS_ON
@@ -1524,17 +1506,7 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
 	 * all eBPF JITs might immediately support all features.
 	 */
 	*err = bpf_check_tail_call(fp);
-	if (*err)
-		return fp;
-
-	/* Checkpoint: at this point onwards any cBPF -> eBPF or
-	 * native eBPF program is read-only. If we failed to change
-	 * the page attributes (e.g. allocation failure from
-	 * splitting large pages), then reject the whole program
-	 * in order to guarantee not ending up with any W+X pages
-	 * from BPF side in kernel.
-	 */
-	*err = bpf_prog_check_pages_ro_locked(fp);
+
 	return fp;
 }
 EXPORT_SYMBOL_GPL(bpf_prog_select_runtime);
-- 
2.9.5

^ permalink raw reply related

* Re: [PATCH 6/6] fs: replace f_ops->get_poll_head with a static ->f_poll_head pointer
From: Linus Torvalds @ 2018-06-28 21:39 UTC (permalink / raw)
  To: Al Viro; +Cc: Christoph Hellwig, linux-fsdevel, Network Development, LKP
In-Reply-To: <20180628213027.GK30522@ZenIV.linux.org.uk>

On Thu, Jun 28, 2018 at 2:30 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> > Again, locking is permitted. It's not great, but it's not against the rules.
>
> Me: a *LOT* of ->poll() instances only block in __pollwait() called (indirectly)
> on the first pass.
>
> You: They are *all* supposed to do it.
>
> Me: <examples of instances that block elsewhere>

Oh, I thought you were talking about the whole "first pass" adding to
wait queues, as opposed to doing it on the second pass.

The *blocking* is entirely immaterial. I didn't even react to it,
because it's simply not an issue.

I don't understand why you're even hung up about it.

The only reason "blocking" seems to be an issue is because AIO has
shit-for-brains and wanted to do poll() under the spinlock.

But that's literally just AIO being confused garbage. It has zero
relevance for anything else.

                Linus

^ permalink raw reply

* Re: [PATCH v2 net-next 0/2] net: preserve sock reference when scrubbing the skb.
From: Cong Wang @ 2018-06-28 21:41 UTC (permalink / raw)
  To: David Miller
  Cc: Flavio Leitner, Linux Kernel Network Developers, Eric Dumazet,
	Paolo Abeni, Florian Westphal, NetFilter
In-Reply-To: <20180628.222040.2117056805629988850.davem@davemloft.net>

On Thu, Jun 28, 2018 at 6:20 AM David Miller <davem@davemloft.net> wrote:
>
> From: Cong Wang <xiyou.wangcong@gmail.com>
> Date: Wed, 27 Jun 2018 12:39:01 -0700
>
> > Let me rephrase why I don't like this patchset:
>
> Cong, I don't think you are seeing the situation clearly and
> I am certainly going to apply this patch series even in the
> face of your objections.
>
> Suggesting that solving the lack of back pressure on a UDP
> socket caused by this problem by using cgroups or cpu
> usage controllers is just complete and utter madness.

Pretty sure you didn't even read the rest of my reply,
I can't help you if you just stop at where you quoted.

^ permalink raw reply

* [PATCH bpf-next 0/8] tools: bpf: updates to bpftool and libbpf
From: Jakub Kicinski @ 2018-06-28 21:41 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: oss-drivers, netdev, Jakub Kicinski

Hi!

A set of miscellaneous updates to bpftool and libbpf.  I'm preparing to
extend bpftool prog load, but there is a good number of improvements
that can be made before the bpf -> bpf-next merge, which also helps
keep the later patch set to a manageable size.

The first patch is a bpftool build speed improvement.  Next, missing
program types are added to libbpf's program type detection by section
name.  The ability to load programs from the '.text' section is restored
when the ELF file doesn't contain any pseudo calls.

In bpftool I remove my Author comments as an unnecessary sign of vanity.
Last but not least, a missing option is added to the bash completions and
the processing of options in the bash completions is improved.

Jakub Kicinski (8):
  tools: bpftool: use correct make variable type to improve compilation
    time
  tools: libbpf: add section names for missing program types
  tools: libbpf: allow setting ifindex for programs and maps
  tools: libbpf: restore the ability to load programs from .text section
  tools: libbpf: don't return '.text' as a program for multi-function
    programs
  tools: bpftool: drop unnecessary Author comments
  tools: bpftool: add missing --bpffs to completions
  tools: bpftool: deal with options upfront

 tools/bpf/bpftool/Makefile                |  2 +-
 tools/bpf/bpftool/bash-completion/bpftool | 32 ++++++++++-----
 tools/bpf/bpftool/common.c                |  2 -
 tools/bpf/bpftool/main.c                  |  4 +-
 tools/bpf/bpftool/main.h                  |  2 -
 tools/bpf/bpftool/map.c                   |  2 -
 tools/bpf/bpftool/prog.c                  |  4 +-
 tools/lib/bpf/libbpf.c                    | 49 ++++++++++++++++++-----
 tools/lib/bpf/libbpf.h                    |  2 +
 9 files changed, 66 insertions(+), 33 deletions(-)

-- 
2.17.1

^ permalink raw reply

* [PATCH bpf-next 1/8] tools: bpftool: use correct make variable type to improve compilation time
From: Jakub Kicinski @ 2018-06-28 21:41 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: oss-drivers, netdev, Jakub Kicinski
In-Reply-To: <20180628214142.11268-1-jakub.kicinski@netronome.com>

Commit 4bfe3bd3cc35 ("tools/bpftool: use version from the kernel
source tree") added a version to bpftool.  The version used is
equal to the kernel version and is obtained by running make kernelversion
against the kernel source tree.  The version is then communicated
to the sources with a command line define set in CFLAGS.

Use a simply expanded variable for the version, otherwise the
recursive make will run every time CFLAGS is used.

This brings the single-job compilation time for me from almost
16 sec down to less than 4 sec.
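
The cost difference between the two variable flavors can be modeled with
a small C sketch. This is only an analogy, not how make is implemented:
toy_kernelversion() is a hypothetical stand-in for the $(shell ...)
invocation, and the "x.y.z" string is a placeholder rather than a real
version.

```c
/* Counts how often the (hypothetical) expensive command runs. */
static unsigned int toy_shell_calls;

/* Stand-in for $(shell make ... kernelversion); "x.y.z" is a placeholder. */
static const char *toy_kernelversion(void)
{
	toy_shell_calls++;
	return "x.y.z";
}

/* '=' flavor: the right-hand side is re-evaluated on every reference. */
unsigned int toy_uses_recursive(unsigned int nr_uses)
{
	unsigned int i;

	toy_shell_calls = 0;
	for (i = 0; i < nr_uses; i++)
		(void)toy_kernelversion();
	return toy_shell_calls;
}

/* ':=' flavor: evaluated once at assignment, the stored value is reused. */
unsigned int toy_uses_simple(unsigned int nr_uses)
{
	const char *version;
	unsigned int i;

	toy_shell_calls = 0;
	version = toy_kernelversion();
	for (i = 0; i < nr_uses; i++)
		(void)version;
	return toy_shell_calls;
}
```

In this model the '=' flavor runs the command once per reference, while
the ':=' flavor runs it exactly once regardless of how often the value is
used.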

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 tools/bpf/bpftool/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index 892dbf095bff..0911b00b25cc 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -23,7 +23,7 @@ endif
 
 LIBBPF = $(BPF_PATH)libbpf.a
 
-BPFTOOL_VERSION=$(shell make --no-print-directory -sC ../../.. kernelversion)
+BPFTOOL_VERSION := $(shell make --no-print-directory -sC ../../.. kernelversion)
 
 $(LIBBPF): FORCE
 	$(Q)$(MAKE) -C $(BPF_DIR) OUTPUT=$(OUTPUT) $(OUTPUT)libbpf.a FEATURES_DUMP=$(FEATURE_DUMP_EXPORT)
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf-next 2/8] tools: libbpf: add section names for missing program types
From: Jakub Kicinski @ 2018-06-28 21:41 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: oss-drivers, netdev, Jakub Kicinski
In-Reply-To: <20180628214142.11268-1-jakub.kicinski@netronome.com>

Specify default section names for BPF_PROG_TYPE_LIRC_MODE2
and BPF_PROG_TYPE_LWT_SEG6LOCAL, these are the only two
missing right now.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 tools/lib/bpf/libbpf.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index a1e96b5de5ff..a1491e95edd0 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -2037,9 +2037,11 @@ static const struct {
 	BPF_PROG_SEC("lwt_in",		BPF_PROG_TYPE_LWT_IN),
 	BPF_PROG_SEC("lwt_out",		BPF_PROG_TYPE_LWT_OUT),
 	BPF_PROG_SEC("lwt_xmit",	BPF_PROG_TYPE_LWT_XMIT),
+	BPF_PROG_SEC("lwt_seg6local",	BPF_PROG_TYPE_LWT_SEG6LOCAL),
 	BPF_PROG_SEC("sockops",		BPF_PROG_TYPE_SOCK_OPS),
 	BPF_PROG_SEC("sk_skb",		BPF_PROG_TYPE_SK_SKB),
 	BPF_PROG_SEC("sk_msg",		BPF_PROG_TYPE_SK_MSG),
+	BPF_PROG_SEC("lirc_mode2",	BPF_PROG_TYPE_LIRC_MODE2),
 	BPF_SA_PROG_SEC("cgroup/bind4",	BPF_CGROUP_INET4_BIND),
 	BPF_SA_PROG_SEC("cgroup/bind6",	BPF_CGROUP_INET6_BIND),
 	BPF_SA_PROG_SEC("cgroup/connect4", BPF_CGROUP_INET4_CONNECT),
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf-next 3/8] tools: libbpf: allow setting ifindex for programs and maps
From: Jakub Kicinski @ 2018-06-28 21:41 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: oss-drivers, netdev, Jakub Kicinski
In-Reply-To: <20180628214142.11268-1-jakub.kicinski@netronome.com>

Users of bpf_object__open()/bpf_object__load() APIs may want to
load the programs and maps onto a device for offload.  Allow
setting ifindex on those sub-objects.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 tools/lib/bpf/libbpf.c | 10 ++++++++++
 tools/lib/bpf/libbpf.h |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index a1491e95edd0..7bc02d93e277 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1896,6 +1896,11 @@ void *bpf_program__priv(struct bpf_program *prog)
 	return prog ? prog->priv : ERR_PTR(-EINVAL);
 }
 
+void bpf_program__set_ifindex(struct bpf_program *prog, __u32 ifindex)
+{
+	prog->prog_ifindex = ifindex;
+}
+
 const char *bpf_program__title(struct bpf_program *prog, bool needs_copy)
 {
 	const char *title;
@@ -2122,6 +2127,11 @@ void *bpf_map__priv(struct bpf_map *map)
 	return map ? map->priv : ERR_PTR(-EINVAL);
 }
 
+void bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex)
+{
+	map->map_ifindex = ifindex;
+}
+
 struct bpf_map *
 bpf_map__next(struct bpf_map *prev, struct bpf_object *obj)
 {
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 09976531aa74..564f4be9bae0 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -109,6 +109,7 @@ int bpf_program__set_priv(struct bpf_program *prog, void *priv,
 			  bpf_program_clear_priv_t clear_priv);
 
 void *bpf_program__priv(struct bpf_program *prog);
+void bpf_program__set_ifindex(struct bpf_program *prog, __u32 ifindex);
 
 const char *bpf_program__title(struct bpf_program *prog, bool needs_copy);
 
@@ -251,6 +252,7 @@ typedef void (*bpf_map_clear_priv_t)(struct bpf_map *, void *);
 int bpf_map__set_priv(struct bpf_map *map, void *priv,
 		      bpf_map_clear_priv_t clear_priv);
 void *bpf_map__priv(struct bpf_map *map);
+void bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex);
 int bpf_map__pin(struct bpf_map *map, const char *path);
 
 long libbpf_get_error(const void *ptr);
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf-next 4/8] tools: libbpf: restore the ability to load programs from .text section
From: Jakub Kicinski @ 2018-06-28 21:41 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: oss-drivers, netdev, Jakub Kicinski
In-Reply-To: <20180628214142.11268-1-jakub.kicinski@netronome.com>

libbpf used to be able to load programs from the default section
called '.text'.  It's not very common to leave sections unnamed,
but if it happens libbpf will fail to load the programs reporting
-EINVAL from the kernel.  The -EINVAL comes from bpf_obj_name_cpy()
because since 48cca7e44f9f ("libbpf: add support for bpf_call")
libbpf does not resolve program names for programs in '.text',
defaulting to '.text'.  '.text', however, does not pass the
(isalnum(*src) || *src == '_') check in bpf_obj_name_cpy().

With a few extra lines of code we can limit the pseudo call
assumptions to only those objects which actually contain code relocations.
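
The rejection of '.text' can be reproduced with a small userspace sketch.
The helper below only mirrors the character check quoted above; it is not
the kernel's bpf_obj_name_cpy(), and its name and return convention are
made up for illustration.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if every character of name is
 * alphanumeric or '_', i.e. passes the quoted
 * (isalnum(*src) || *src == '_') check.
 */
bool toy_obj_name_ok(const char *name)
{
	const char *src;

	for (src = name; *src; src++) {
		/* Cast avoids undefined behaviour for negative char values. */
		if (!isalnum((unsigned char)*src) && *src != '_')
			return false;
	}
	return true;
}
```

Under this check a name such as "xdp_prog_1" passes, while ".text" fails
on its leading '.'.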

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 tools/lib/bpf/libbpf.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 7bc02d93e277..e2401b95f08d 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -234,6 +234,7 @@ struct bpf_object {
 	size_t nr_maps;
 
 	bool loaded;
+	bool has_pseudo_calls;
 
 	/*
 	 * Information when doing elf related work. Only valid if fd
@@ -400,10 +401,6 @@ bpf_object__init_prog_names(struct bpf_object *obj)
 		const char *name = NULL;
 
 		prog = &obj->programs[pi];
-		if (prog->idx == obj->efile.text_shndx) {
-			name = ".text";
-			goto skip_search;
-		}
 
 		for (si = 0; si < symbols->d_size / sizeof(GElf_Sym) && !name;
 		     si++) {
@@ -426,12 +423,15 @@ bpf_object__init_prog_names(struct bpf_object *obj)
 			}
 		}
 
+		if (!name && prog->idx == obj->efile.text_shndx)
+			name = ".text";
+
 		if (!name) {
 			pr_warning("failed to find sym for prog %s\n",
 				   prog->section_name);
 			return -EINVAL;
 		}
-skip_search:
+
 		prog->name = strdup(name);
 		if (!prog->name) {
 			pr_warning("failed to allocate memory for prog sym %s\n",
@@ -981,6 +981,7 @@ bpf_program__collect_reloc(struct bpf_program *prog, GElf_Shdr *shdr,
 			prog->reloc_desc[i].type = RELO_CALL;
 			prog->reloc_desc[i].insn_idx = insn_idx;
 			prog->reloc_desc[i].text_off = sym.st_value;
+			obj->has_pseudo_calls = true;
 			continue;
 		}
 
@@ -1426,6 +1427,12 @@ bpf_program__load(struct bpf_program *prog,
 	return err;
 }
 
+static bool bpf_program__is_function_storage(struct bpf_program *prog,
+					     struct bpf_object *obj)
+{
+	return prog->idx == obj->efile.text_shndx && obj->has_pseudo_calls;
+}
+
 static int
 bpf_object__load_progs(struct bpf_object *obj)
 {
@@ -1433,7 +1440,7 @@ bpf_object__load_progs(struct bpf_object *obj)
 	int err;
 
 	for (i = 0; i < obj->nr_programs; i++) {
-		if (obj->programs[i].idx == obj->efile.text_shndx)
+		if (bpf_program__is_function_storage(&obj->programs[i], obj))
 			continue;
 		err = bpf_program__load(&obj->programs[i],
 					obj->license,
@@ -2247,7 +2254,7 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
 		bpf_program__set_expected_attach_type(prog,
 						      expected_attach_type);
 
-		if (prog->idx != obj->efile.text_shndx && !first_prog)
+		if (!bpf_program__is_function_storage(prog, obj) && !first_prog)
 			first_prog = prog;
 	}
 
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf-next 5/8] tools: libbpf: don't return '.text' as a program for multi-function programs
From: Jakub Kicinski @ 2018-06-28 21:41 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: oss-drivers, netdev, Jakub Kicinski
In-Reply-To: <20180628214142.11268-1-jakub.kicinski@netronome.com>

Make bpf_program__next() skip over '.text' section if object file
has pseudo calls.  The '.text' section is hardly a program in that
case, it's more of a storage for code of functions other than main.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 tools/lib/bpf/libbpf.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index e2401b95f08d..38ed3e92e393 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1865,8 +1865,8 @@ void *bpf_object__priv(struct bpf_object *obj)
 	return obj ? obj->priv : ERR_PTR(-EINVAL);
 }
 
-struct bpf_program *
-bpf_program__next(struct bpf_program *prev, struct bpf_object *obj)
+static struct bpf_program *
+__bpf_program__next(struct bpf_program *prev, struct bpf_object *obj)
 {
 	size_t idx;
 
@@ -1887,6 +1887,18 @@ bpf_program__next(struct bpf_program *prev, struct bpf_object *obj)
 	return &obj->programs[idx];
 }
 
+struct bpf_program *
+bpf_program__next(struct bpf_program *prev, struct bpf_object *obj)
+{
+	struct bpf_program *prog = prev;
+
+	do {
+		prog = __bpf_program__next(prog, obj);
+	} while (prog && bpf_program__is_function_storage(prog, obj));
+
+	return prog;
+}
+
 int bpf_program__set_priv(struct bpf_program *prog, void *priv,
 			  bpf_program_clear_priv_t clear_priv)
 {
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf-next 6/8] tools: bpftool: drop unnecessary Author comments
From: Jakub Kicinski @ 2018-06-28 21:41 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: oss-drivers, netdev, Jakub Kicinski
In-Reply-To: <20180628214142.11268-1-jakub.kicinski@netronome.com>

Drop my author comments, those are from the early days of
bpftool and make little sense in tree, where we have quite
a few people contributing and git to attribute the work.

While at it bump some copyrights.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 tools/bpf/bpftool/common.c | 2 --
 tools/bpf/bpftool/main.c   | 4 +---
 tools/bpf/bpftool/main.h   | 2 --
 tools/bpf/bpftool/map.c    | 2 --
 tools/bpf/bpftool/prog.c   | 4 +---
 5 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
index 32f9e397a6c0..b432daea4520 100644
--- a/tools/bpf/bpftool/common.c
+++ b/tools/bpf/bpftool/common.c
@@ -31,8 +31,6 @@
  * SOFTWARE.
  */
 
-/* Author: Jakub Kicinski <kubakici@wp.pl> */
-
 #include <ctype.h>
 #include <errno.h>
 #include <fcntl.h>
diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
index eea7f14355f3..d15a62be6cf0 100644
--- a/tools/bpf/bpftool/main.c
+++ b/tools/bpf/bpftool/main.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2017 Netronome Systems, Inc.
+ * Copyright (C) 2017-2018 Netronome Systems, Inc.
  *
  * This software is dual licensed under the GNU General License Version 2,
  * June 1991 as shown in the file COPYING in the top-level directory of this
@@ -31,8 +31,6 @@
  * SOFTWARE.
  */
 
-/* Author: Jakub Kicinski <kubakici@wp.pl> */
-
 #include <bfd.h>
 #include <ctype.h>
 #include <errno.h>
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index 63fdb310b9a4..d39f7ef01d23 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -31,8 +31,6 @@
  * SOFTWARE.
  */
 
-/* Author: Jakub Kicinski <kubakici@wp.pl> */
-
 #ifndef __BPF_TOOL_H
 #define __BPF_TOOL_H
 
diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
index 097b1a5e046b..5989e1575ae4 100644
--- a/tools/bpf/bpftool/map.c
+++ b/tools/bpf/bpftool/map.c
@@ -31,8 +31,6 @@
  * SOFTWARE.
  */
 
-/* Author: Jakub Kicinski <kubakici@wp.pl> */
-
 #include <assert.h>
 #include <errno.h>
 #include <fcntl.h>
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index 05f42a46d6ed..fd8cd9b51621 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2017 Netronome Systems, Inc.
+ * Copyright (C) 2017-2018 Netronome Systems, Inc.
  *
  * This software is dual licensed under the GNU General License Version 2,
  * June 1991 as shown in the file COPYING in the top-level directory of this
@@ -31,8 +31,6 @@
  * SOFTWARE.
  */
 
-/* Author: Jakub Kicinski <kubakici@wp.pl> */
-
 #include <errno.h>
 #include <fcntl.h>
 #include <stdarg.h>
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf-next 7/8] tools: bpftool: add missing --bpffs to completions
From: Jakub Kicinski @ 2018-06-28 21:41 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: oss-drivers, netdev, Jakub Kicinski
In-Reply-To: <20180628214142.11268-1-jakub.kicinski@netronome.com>

--bpffs is not suggested by bash completions.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 tools/bpf/bpftool/bash-completion/bpftool | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
index 1e1083321643..b0b8022d3570 100644
--- a/tools/bpf/bpftool/bash-completion/bpftool
+++ b/tools/bpf/bpftool/bash-completion/bpftool
@@ -182,7 +182,7 @@ _bpftool()
     if [[ -z $object ]]; then
         case $cur in
             -*)
-                local c='--version --json --pretty'
+                local c='--version --json --pretty --bpffs'
                 COMPREPLY=( $( compgen -W "$c" -- "$cur" ) )
                 return 0
                 ;;
-- 
2.17.1

^ permalink raw reply related

