Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net-next 1/1] Added BASE-T1 PHY support to PHY Subsystem
From: Heiner Kallweit @ 2019-08-15 16:34 UTC (permalink / raw)
  To: Andrew Lunn, Christian Herber
  Cc: davem@davemloft.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Florian Fainelli
In-Reply-To: <20190815155613.GE15291@lunn.ch>

On 15.08.2019 17:56, Andrew Lunn wrote:
> On Thu, Aug 15, 2019 at 03:32:29PM +0000, Christian Herber wrote:
>> BASE-T1 is a category of Ethernet PHYs.
>> They use a single copper pair for transmission.
>> This patch add basic support for this category of PHYs.
>> It coveres the discovery of abilities and basic configuration.
>> It includes setting fixed speed and enabling auto-negotiation.
>> BASE-T1 devices should always Clause-45 managed.
>> Therefore, this patch extends phy-c45.c.
>> While for some functions like auto-neogtiation different registers are
>> used, the layout of these registers is the same for the used fields.
>> Thus, much of the logic of basic Clause-45 devices can be reused.
>>
>> Signed-off-by: Christian Herber <christian.herber@nxp.com>
>> ---
>>  drivers/net/phy/phy-c45.c    | 113 +++++++++++++++++++++++++++++++----
>>  drivers/net/phy/phy-core.c   |   4 +-
>>  include/uapi/linux/ethtool.h |   2 +
>>  include/uapi/linux/mdio.h    |  21 +++++++
>>  4 files changed, 129 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c
>> index b9d4145781ca..9ff0b8c785de 100644
>> --- a/drivers/net/phy/phy-c45.c
>> +++ b/drivers/net/phy/phy-c45.c
>> @@ -8,13 +8,23 @@
>>  #include <linux/mii.h>
>>  #include <linux/phy.h>
>>  
>> +#define IS_100BASET1(phy) (linkmode_test_bit( \
>> +			   ETHTOOL_LINK_MODE_100baseT1_Full_BIT, \
>> +			   (phy)->supported))
>> +#define IS_1000BASET1(phy) (linkmode_test_bit( \
>> +			    ETHTOOL_LINK_MODE_1000baseT1_Full_BIT, \
>> +			    (phy)->supported))
> 
> Hi Christian
> 
> We already have the flag phydev->is_gigabit_capable. Maybe add a flag
> phydev->is_t1_capable
> 
>> +
>> +static u32 get_aneg_ctrl(struct phy_device *phydev);
>> +static u32 get_aneg_stat(struct phy_device *phydev);
> 
> No forward declarations please. Put the code in the right order so
> they are not needed.
> 
> Thanks
> 
>      Andrew
> 

For whatever reason I don't have the original mail in my netdev inbox (yet).

+	if (IS_100BASET1(phydev) || IS_1000BASET1(phydev))
+		ctrl = MDIO_AN_BT1_CTRL;

Code like this could be problematic once a PHY supports one of the T1 modes
AND normal modes. Then normal modes would be unusable.

I think this scenario isn't completely hypothetical. See the Aquantia
AQCS109 that supports normal modes and (proprietary) 1000Base-T2.

Maybe we need separate versions of the generic functions for T1.
Then it would be up to the PHY driver to decide when to use which
version.

Heiner

^ permalink raw reply

* Re: [PATCH net-next] net/rds: Add RDS6_INFO_SOCKETS and RDS6_INFO_RECV_MESSAGES options
From: santosh.shilimkar @ 2019-08-15 16:36 UTC (permalink / raw)
  To: Ka-Cheong Poon, netdev; +Cc: davem, rds-devel
In-Reply-To: <1565861803-31268-1-git-send-email-ka-cheong.poon@oracle.com>

On 8/15/19 2:36 AM, Ka-Cheong Poon wrote:
> Add support of the socket options RDS6_INFO_SOCKETS and
> RDS6_INFO_RECV_MESSAGES which update the RDS_INFO_SOCKETS and
> RDS_INFO_RECV_MESSAGES options respectively.  The old options work
> for IPv4 sockets only.
> 
> Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
> ---
Thanks Ka-Cheong for getting this one out on list.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>

^ permalink raw reply

* Re: [PATCH] arm64: do_csum: implement accelerated scalar version
From: Will Deacon @ 2019-08-15 16:46 UTC (permalink / raw)
  To: Zhangshaokun
  Cc: Robin Murphy, Ard Biesheuvel, linux-arm-kernel, netdev,
	ilias.apalodimas, huanglingyan (A), steve.capper
In-Reply-To: <440eb674-0e59-a97e-4a90-0026e2327069@hisilicon.com>

On Thu, May 16, 2019 at 11:14:35AM +0800, Zhangshaokun wrote:
> On 2019/5/15 17:47, Will Deacon wrote:
> > On Mon, Apr 15, 2019 at 07:18:22PM +0100, Robin Murphy wrote:
> >> On 12/04/2019 10:52, Will Deacon wrote:
> >>> I'm waiting for Robin to come back with numbers for a C implementation.
> >>>
> >>> Robin -- did you get anywhere with that?
> >>
> >> Still not what I would call finished, but where I've got so far (besides an
> >> increasingly elaborate test rig) is as below - it still wants some unrolling
> >> in the middle to really fly (and actual testing on BE), but the worst-case
> >> performance already equals or just beats this asm version on Cortex-A53 with
> >> GCC 7 (by virtue of being alignment-insensitive and branchless except for
> >> the loop). Unfortunately, the advantage of C code being instrumentable does
> >> also come around to bite me...
> > 
> > Is there any interest from anybody in spinning a proper patch out of this?
> > Shaokun?
> 
> HiSilicon's Kunpeng920(Hi1620) benefits from do_csum optimization, if Ard and
> Robin are ok, Lingyan or I can try to do it.
> Of course, if any guy posts the patch, we are happy to test it.
> Any will be ok.

I don't mind who posts it, but Robin is super busy with SMMU stuff at the
moment so it probably makes more sense for you or Lingyan to do it.

Will

^ permalink raw reply

* Re: [PATCH bpf-next 0/5] Add support for SKIP_BPF flag for AF_XDP sockets
From: Samudrala, Sridhar @ 2019-08-15 16:46 UTC (permalink / raw)
  To: Björn Töpel, magnus.karlsson, netdev, bpf,
	intel-wired-lan, maciej.fijalkowski, tom.herbert
In-Reply-To: <bebfb097-5357-91d8-ebc7-2f8ede392ad7@intel.com>

On 8/15/2019 5:51 AM, Björn Töpel wrote:
> On 2019-08-15 05:46, Sridhar Samudrala wrote:
>> This patch series introduces XDP_SKIP_BPF flag that can be specified
>> during the bind() call of an AF_XDP socket to skip calling the BPF
>> program in the receive path and pass the buffer directly to the socket.
>>
>> When a single AF_XDP socket is associated with a queue and a HW
>> filter is used to redirect the packets and the app is interested in
>> receiving all the packets on that queue, we don't need an additional
>> BPF program to do further filtering or lookup/redirect to a socket.
>>
>> Here are some performance numbers collected on
>>    - 2 socket 28 core Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
>>    - Intel 40Gb Ethernet NIC (i40e)
>>
>> All tests use 2 cores and the results are in Mpps.
>>
>> turbo on (default)
>> ---------------------------------------------
>>                        no-skip-bpf    skip-bpf
>> ---------------------------------------------
>> rxdrop zerocopy           21.9         38.5
>> l2fwd  zerocopy           17.0         20.5
>> rxdrop copy               11.1         13.3
>> l2fwd  copy                1.9          2.0
>>
>> no turbo :  echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
>> ---------------------------------------------
>>                        no-skip-bpf    skip-bpf
>> ---------------------------------------------
>> rxdrop zerocopy           15.4         29.0
>> l2fwd  zerocopy           11.8         18.2
>> rxdrop copy                8.2         10.5
>> l2fwd  copy                1.7          1.7
>> ---------------------------------------------
>>
> 
> This work is somewhat similar to the XDP_ATTACH work [1]. Avoiding the
> retpoline in the XDP program call is a nice performance boost! I like
> the numbers! :-) I also like the idea of adding a flag that just does
> what most AF_XDP Rx users want -- just getting all packets of a
> certain queue into the XDP sockets.
> 
> In addition to Toke's mail, I have some more concerns with the series:
> 
> * AFAIU the SKIP_BPF only works for zero-copy enabled sockets. IMO, it
>    should work for all modes (including XDP_SKB).

This patch enables SKIP_BPF for AF_XDP sockets where an XDP program is 
attached at driver level (both zerocopy and copy modes)
I tried a quick hack to see the perf benefit with generic XDP mode, but 
i didn't see any significant improvement in performance in that 
scenario. so i didn't include that mode.

> 
> * In order to work, a user still needs an XDP program running. That's
>    clunky. I'd like the behavior that if no XDP program is attached,
>    and the option is set, the packets for a that queue end up in the
>    socket. If there's an XDP program attached, the program has
>    precedence.

I think this would require more changes in the drivers to take XDP 
datapath even when there is no XDP program loaded.

> 
> * It requires changes in all drivers. Not nice, and scales badly. Try
>    making it generic (xdp_do_redirect/xdp_flush), so it Just Works for
>    all XDP capable drivers.

I tried to make this as generic as possible and make the changes to the 
driver very minimal, but could not find a way to avoid any changes at 
all to the driver. xdp_do_direct() gets called based after the call to 
bpf_prog_run_xdp() in the drivers.

> 
> Thanks for working on this!
> 
> 
> Björn
> 
> [1] 
> https://lore.kernel.org/netdev/20181207114431.18038-1-bjorn.topel@gmail.com/ 
> 
> 
> 
>> Sridhar Samudrala (5):
>>    xsk: Convert bool 'zc' field in struct xdp_umem to a u32 bitmap
>>    xsk: Introduce XDP_SKIP_BPF bind option
>>    i40e: Enable XDP_SKIP_BPF option for AF_XDP sockets
>>    ixgbe: Enable XDP_SKIP_BPF option for AF_XDP sockets
>>    xdpsock_user: Add skip_bpf option
>>
>>   drivers/net/ethernet/intel/i40e/i40e_txrx.c   | 22 +++++++++-
>>   drivers/net/ethernet/intel/i40e/i40e_xsk.c    |  6 +++
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 20 ++++++++-
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c  | 16 ++++++-
>>   include/net/xdp_sock.h                        | 21 ++++++++-
>>   include/uapi/linux/if_xdp.h                   |  1 +
>>   include/uapi/linux/xdp_diag.h                 |  1 +
>>   net/xdp/xdp_umem.c                            |  9 ++--
>>   net/xdp/xsk.c                                 | 43 ++++++++++++++++---
>>   net/xdp/xsk_diag.c                            |  5 ++-
>>   samples/bpf/xdpsock_user.c                    |  8 ++++
>>   11 files changed, 135 insertions(+), 17 deletions(-)
>>

^ permalink raw reply

* Re: [PATCH -next] btf: fix return value check in btf_vmlinux_init()
From: Andrii Nakryiko @ 2019-08-15 17:06 UTC (permalink / raw)
  To: Wei Yongjun
  Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, Andrii Nakryiko, Networking, bpf, kernel-janitors
In-Reply-To: <20190815142432.101401-1-weiyongjun1@huawei.com>

On Thu, Aug 15, 2019 at 7:21 AM Wei Yongjun <weiyongjun1@huawei.com> wrote:
>
> In case of error, the function kobject_create_and_add() returns NULL
> pointer not ERR_PTR(). The IS_ERR() test in the return value check
> should be replaced with NULL test.
>
> Fixes: 341dfcf8d78e ("btf: expose BTF info through sysfs")
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
> ---


Argh... Thanks for the fix! Fix the comment below addressed:

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  kernel/bpf/sysfs_btf.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/bpf/sysfs_btf.c b/kernel/bpf/sysfs_btf.c
> index 4659349fc795..be5557deb958 100644
> --- a/kernel/bpf/sysfs_btf.c
> +++ b/kernel/bpf/sysfs_btf.c
> @@ -30,16 +30,13 @@ static struct kobject *btf_kobj;
>
>  static int __init btf_vmlinux_init(void)
>  {
> -       int err;
> -
>         if (!_binary__btf_vmlinux_bin_start)
>                 return 0;
>
>         btf_kobj = kobject_create_and_add("btf", kernel_kobj);
> -       if (IS_ERR(btf_kobj)) {
> -               err = PTR_ERR(btf_kobj);
> +       if (!btf_kobj) {
>                 btf_kobj = NULL;

This is now not necessary, please drop (and don't forget to remove {}
for this single-line if afterwards).

> -               return err;
> +               return -ENOMEM;
>         }
>
>         bin_attr_btf_vmlinux.size = _binary__btf_vmlinux_bin_end -
>
>
>

^ permalink raw reply

* Re: [PATCH bpf] tools: bpftool: close prog FD before exit on showing a single program
From: Andrii Nakryiko @ 2019-08-15 17:09 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, Networking, oss-drivers
In-Reply-To: <20190815142223.2203-1-quentin.monnet@netronome.com>

On Thu, Aug 15, 2019 at 7:24 AM Quentin Monnet
<quentin.monnet@netronome.com> wrote:
>
> When showing metadata about a single program by invoking
> "bpftool prog show PROG", the file descriptor referring to the program
> is not closed before returning from the function. Let's close it.
>
> Fixes: 71bb428fe2c1 ("tools: bpf: add bpftool")
> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> ---
>  tools/bpf/bpftool/prog.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
> index 66f04a4846a5..43fdbbfe41bb 100644
> --- a/tools/bpf/bpftool/prog.c
> +++ b/tools/bpf/bpftool/prog.c
> @@ -363,7 +363,9 @@ static int do_show(int argc, char **argv)
>                 if (fd < 0)
>                         return -1;
>
> -               return show_prog(fd);
> +               err = show_prog(fd);
> +               close(fd);
> +               return err;

There is a similar problem few lines above for special case of argc ==
2, which you didn't fix.
Would it be better to make show_prog(fd) close provided fd instead or
is it used in some other context where FD should live longer (I
haven't checked, sorry)?

>         }
>
>         if (argc)
> --
> 2.17.1
>

^ permalink raw reply

* Re: [PATCH bpf-next 0/5] Add support for SKIP_BPF flag for AF_XDP sockets
From: Toke Høiland-Jørgensen @ 2019-08-15 17:11 UTC (permalink / raw)
  To: Samudrala, Sridhar, magnus.karlsson, bjorn.topel, netdev, bpf,
	intel-wired-lan, maciej.fijalkowski, tom.herbert
In-Reply-To: <b9423054-247e-8b57-ea59-42368f60ea1e@intel.com>

"Samudrala, Sridhar" <sridhar.samudrala@intel.com> writes:

> On 8/15/2019 4:12 AM, Toke Høiland-Jørgensen wrote:
>> Sridhar Samudrala <sridhar.samudrala@intel.com> writes:
>> 
>>> This patch series introduces XDP_SKIP_BPF flag that can be specified
>>> during the bind() call of an AF_XDP socket to skip calling the BPF
>>> program in the receive path and pass the buffer directly to the socket.
>>>
>>> When a single AF_XDP socket is associated with a queue and a HW
>>> filter is used to redirect the packets and the app is interested in
>>> receiving all the packets on that queue, we don't need an additional
>>> BPF program to do further filtering or lookup/redirect to a socket.
>>>
>>> Here are some performance numbers collected on
>>>    - 2 socket 28 core Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
>>>    - Intel 40Gb Ethernet NIC (i40e)
>>>
>>> All tests use 2 cores and the results are in Mpps.
>>>
>>> turbo on (default)
>>> ---------------------------------------------	
>>>                        no-skip-bpf    skip-bpf
>>> ---------------------------------------------	
>>> rxdrop zerocopy           21.9         38.5
>>> l2fwd  zerocopy           17.0         20.5
>>> rxdrop copy               11.1         13.3
>>> l2fwd  copy                1.9          2.0
>>>
>>> no turbo :  echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
>>> ---------------------------------------------	
>>>                        no-skip-bpf    skip-bpf
>>> ---------------------------------------------	
>>> rxdrop zerocopy           15.4         29.0
>>> l2fwd  zerocopy           11.8         18.2
>>> rxdrop copy                8.2         10.5
>>> l2fwd  copy                1.7          1.7
>>> ---------------------------------------------
>> 
>> You're getting this performance boost by adding more code in the fast
>> path for every XDP program; so what's the performance impact of that for
>> cases where we do run an eBPF program?
>
> The no-skip-bpf results are pretty close to what i see before the 
> patches are applied. As umem is cached in rx_ring for zerocopy the 
> overhead is much smaller compared to the copy scenario where i am 
> currently calling xdp_get_umem_from_qid().

I meant more for other XDP programs; what is the performance impact of
XDP_DROP, for instance?

>> Also, this is basically a special-casing of a particular deployment
>> scenario. Without a way to control RX queue assignment and traffic
>> steering, you're basically hard-coding a particular app's takeover of
>> the network interface; I'm not sure that is such a good idea...
>
> Yes. This is mainly targeted for application that create 1 AF_XDP
> socket per RX queue and can use a HW filter (via ethtool or TC flower)
> to redirect the packets to a queue or a group of queues.

Yeah, and I'd prefer it if the handling of this to be unified somehow...

-Toke

^ permalink raw reply

* Re: [patch net-next v2 2/2] selftests: netdevsim: add devlink params tests
From: Jakub Kicinski @ 2019-08-15 17:12 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem, mlxsw
In-Reply-To: <20190815085214.GC2273@nanopsycho>

On Thu, 15 Aug 2019 10:52:14 +0200, Jiri Pirko wrote:
> Thu, Aug 15, 2019 at 10:45:45AM CEST, jiri@resnulli.us wrote:
> >Thu, Aug 15, 2019 at 03:09:00AM CEST, jakub.kicinski@netronome.com wrote:  
> >>On Wed, 14 Aug 2019 17:26:04 +0200, Jiri Pirko wrote:  
> >>> From: Jiri Pirko <jiri@mellanox.com>
> >>> 
> >>> Test recently added netdevsim devlink param implementation.
> >>> 
> >>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
> >>> ---
> >>> v1->v2:
> >>> -using cmd_jq helper  
> >>
> >>Still failing here :(
> >>
> >># ./devlink.sh 
> >>TEST: fw flash test                                                 [ OK ]
> >>TEST: params test                                                   [FAIL]
> >>	Failed to get test1 param value
> >>TEST: regions test                                                  [ OK ]
> >>
> >># jq --version
> >>jq-1.5-1-a5b5cbe
> >># echo '{ "a" : false }' | jq -e -r '.[]'
> >>false
> >># echo $?
> >>1  
> >
> >Odd, could you please try:
> >$ jq --version
> >jq-1.5
> >$ echo '{"param":{"netdevsim/netdevsim11":[{"name":"test1","type":"driver-specific","values":[{"cmode":"driverinit","value":"false"}]}]}}' | jq -e -r '.[][][].values[] | select(.cmode == "driverinit").value'
> >false
> >$ echo $?
> >0  
> 
> Ah, it is not the jq version, it is the iproute2 version:
> 8257e6c49cca9847e01262f6e749c6e88e2ddb72
> 
> I'll think about how to fix this.

Ah, wow, you're right! Old iproute2 works fine here, too!

> >>
> >>On another machine:
> >>
> >>$ echo '{ "a" : false }' | jq -e -r '.[]'
> >>false
> >>$ echo $?
> >>1
> >>
> >>Did you mean to drop the -e ?  


^ permalink raw reply

* Re: [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf
From: Alexei Starovoitov @ 2019-08-15 17:28 UTC (permalink / raw)
  To: Jordan Glover
  Cc: Andy Lutomirski, Daniel Colascione, Song Liu, Kees Cook,
	Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team,
	Lorenz Bauer, Jann Horn, Greg KH, Linux API, LSM List
In-Reply-To: <HG0x24u69mnaMFKuxHVAzHpyjwsD5-U6RpqFRua87wGWQCHg00Q8ZqPeA_5kJ9l-d6oe0cXa4HyYXMnOO0Aofp_LcPcQdG0WFV21z1MbgcE=@protonmail.ch>

On Thu, Aug 15, 2019 at 11:24:54AM +0000, Jordan Glover wrote:
> On Wednesday, August 14, 2019 10:05 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> > On Wed, Aug 14, 2019 at 10:51:23AM -0700, Andy Lutomirski wrote:
> >
> > > If eBPF is genuinely not usable by programs that are not fully trusted
> > > by the admin, then no kernel changes at all are needed. Programs that
> > > want to reduce their own privileges can easily fork() a privileged
> > > subprocess or run a little helper to which they delegate BPF
> > > operations. This is far more flexible than anything that will ever be
> > > in the kernel because it allows the helper to verify that the rest of
> > > the program is doing exactly what it's supposed to and restrict eBPF
> > > operations to exactly the subset that is needed. So a container
> > > manager or network manager that drops some provilege could have a
> > > little bpf-helper that manages its BPF XDP, firewalling, etc
> > > configuration. The two processes would talk over a socketpair.
> >
> > there were three projects that tried to delegate bpf operations.
> > All of them failed.
> > bpf operational workflow is much more complex than you're imagining.
> > fork() also doesn't work for all cases.
> > I gave this example before: consider multiple systemd-like deamons
> > that need to do bpf operations that want to pass this 'bpf capability'
> > to other deamons written by other teams. Some of them will start
> > non-root, but still need to do bpf. They will be rpm installed
> > and live upgraded while running.
> > We considered to make systemd such centralized bpf delegation
> > authority too. It didn't work. bpf in kernel grows quickly.
> > libbpf part grows independently. llvm keeps evolving.
> > All of them are being changed while system overall has to stay
> > operational. Centralized approach breaks apart.
> >
> > > The interesting cases you're talking about really do involved
> > > unprivileged or less privileged eBPF, though. Let's see:
> > > systemd --user: systemd --user is not privileged at all. There's no
> > > issue of reducing privilege, since systemd --user doesn't have any
> > > privilege to begin with. But systemd supports some eBPF features, and
> > > presumably it would like to support them in the systemd --user case.
> > > This is unprivileged eBPF.
> >
> > Let's disambiguate the terminology.
> > This /dev/bpf patch set started as describing the feature as 'unprivileged bpf'.
> > I think that was a mistake.
> > Let's call systemd-like deamon usage of bpf 'less privileged bpf'.
> > This is not unprivileged.
> > 'unprivileged bpf' is what sysctl kernel.unprivileged_bpf_disabled controls.
> >
> > There is a huge difference between the two.
> > I'm against extending 'unprivileged bpf' even a bit more than what it is
> > today for many reasons mentioned earlier.
> > The /dev/bpf is about 'less privileged'.
> > Less privileged than root. We need to split part of full root capability
> > into bpf capability. So that most of the root can be dropped.
> > This is very similar to what cap_net_admin does.
> > cap_net_amdin can bring down eth0 which is just as bad as crashing the box.
> > cap_net_admin is very much privileged. Just 'less privileged' than root.
> > Same thing for cap_bpf.
> >
> > May be we should do both cap_bpf and /dev/bpf to make it clear that
> > this is the same thing. Two interfaces to achieve the same result.
> >
> 
> systemd --user processes aren't "less privileged". The are COMPLETELY unprivileged.
> Granting them cap_bpf is the same as granting it to every other unprivileged user
> process. Also unprivileged user process can start systemd --user process with any
> command they like.

systemd itself is trusted. It's the same binary whether it runs as pid=1
or as pid=123. One of the use cases is to make IPAddressDeny= work with --user.
Subset of that feature already works with AmbientCapabilities=CAP_NET_ADMIN.
CAP_BPF is a natural step in the same direction.


^ permalink raw reply

* Re: [PATCH bpf] tools: bpftool: close prog FD before exit on showing a single program
From: Jakub Kicinski @ 2019-08-15 17:30 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Quentin Monnet, Alexei Starovoitov, Daniel Borkmann, bpf,
	Networking, oss-drivers
In-Reply-To: <CAEf4BzbL3K5XWSyY6BxrVeF3+3qomsYbXh67yzjyy7ApsosVBw@mail.gmail.com>

On Thu, 15 Aug 2019 10:09:38 -0700, Andrii Nakryiko wrote:
> On Thu, Aug 15, 2019 at 7:24 AM Quentin Monnet wrote:
> > When showing metadata about a single program by invoking
> > "bpftool prog show PROG", the file descriptor referring to the program
> > is not closed before returning from the function. Let's close it.
> >
> > Fixes: 71bb428fe2c1 ("tools: bpf: add bpftool")
> > Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
> > Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> > ---
> >  tools/bpf/bpftool/prog.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
> > index 66f04a4846a5..43fdbbfe41bb 100644
> > --- a/tools/bpf/bpftool/prog.c
> > +++ b/tools/bpf/bpftool/prog.c
> > @@ -363,7 +363,9 @@ static int do_show(int argc, char **argv)
> >                 if (fd < 0)
> >                         return -1;
> >
> > -               return show_prog(fd);
> > +               err = show_prog(fd);
> > +               close(fd);
> > +               return err;  
> 
> There is a similar problem few lines above for special case of argc ==
> 2, which you didn't fix.

This is the special argc == 2 case.

> Would it be better to make show_prog(fd) close provided fd instead or
> is it used in some other context where FD should live longer (I
> haven't checked, sorry)?

I think it used to close that's how the bug crept in. Other than the bug
it's fine the way it is.

^ permalink raw reply

* Re: [patch net-next v3 2/2] selftests: netdevsim: add devlink params tests
From: Jakub Kicinski @ 2019-08-15 17:33 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem, mlxsw
In-Reply-To: <20190815134229.8884-3-jiri@resnulli.us>

On Thu, 15 Aug 2019 15:42:29 +0200, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Test recently added netdevsim devlink param implementation.
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>

Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Thanks!

^ permalink raw reply

* Re: [patch net-next v4 1/2] netdevsim: implement support for devlink region and snapshots
From: Jakub Kicinski @ 2019-08-15 17:34 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem, mlxsw
In-Reply-To: <20190815134634.9858-2-jiri@resnulli.us>

On Thu, 15 Aug 2019 15:46:33 +0200, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Implement dummy region of size 32K and allow user to create snapshots
> or random data using debugfs file trigger.
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>

^ permalink raw reply

* Re: [patch net-next v4 2/2] selftests: netdevsim: add devlink regions tests
From: Jakub Kicinski @ 2019-08-15 17:34 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem, mlxsw
In-Reply-To: <20190815134634.9858-3-jiri@resnulli.us>

On Thu, 15 Aug 2019 15:46:34 +0200, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Test netdevsim devlink region implementation.
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>

Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Thank you!!

^ permalink raw reply

* [PATCH] wimax/i2400m: fix a memory leak bug
From: Wenwen Wang @ 2019-08-15 18:05 UTC (permalink / raw)
  To: Wenwen Wang
  Cc: Inaky Perez-Gonzalez,
	supporter:INTEL WIRELESS WIMAX CONNECTION 2400, David S. Miller,
	open list:NETWORKING DRIVERS, open list

In i2400m_barker_db_init(), 'options_orig' is allocated through kstrdup()
to hold the original command line options. Then, the options are parsed.
However, if an error occurs during the parsing process, 'options_orig' is
not deallocated, leading to a memory leak bug. To fix this issue, free
'options_orig' before returning the error.

Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
---
 drivers/net/wimax/i2400m/fw.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wimax/i2400m/fw.c b/drivers/net/wimax/i2400m/fw.c
index e9fc168..6b36f6d 100644
--- a/drivers/net/wimax/i2400m/fw.c
+++ b/drivers/net/wimax/i2400m/fw.c
@@ -342,6 +342,7 @@ int i2400m_barker_db_init(const char *_options)
 				       "a 32-bit number\n",
 				       __func__, token);
 				result = -EINVAL;
+				kfree(options_orig);
 				goto error_parse;
 			}
 			if (barker == 0) {
@@ -350,8 +351,10 @@ int i2400m_barker_db_init(const char *_options)
 				continue;
 			}
 			result = i2400m_barker_db_add(barker);
-			if (result < 0)
+			if (result < 0) {
+				kfree(options_orig);
 				goto error_add;
+			}
 		}
 		kfree(options_orig);
 	}
-- 
2.7.4


^ permalink raw reply related

* Re: [PATCH bpf] tools: bpftool: close prog FD before exit on showing a single program
From: Andrii Nakryiko @ 2019-08-15 18:05 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Quentin Monnet, Alexei Starovoitov, Daniel Borkmann, bpf,
	Networking, oss-drivers
In-Reply-To: <20190815103023.0bd2c210@cakuba.netronome.com>

On Thu, Aug 15, 2019 at 10:30 AM Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> On Thu, 15 Aug 2019 10:09:38 -0700, Andrii Nakryiko wrote:
> > On Thu, Aug 15, 2019 at 7:24 AM Quentin Monnet wrote:
> > > When showing metadata about a single program by invoking
> > > "bpftool prog show PROG", the file descriptor referring to the program
> > > is not closed before returning from the function. Let's close it.
> > >
> > > Fixes: 71bb428fe2c1 ("tools: bpf: add bpftool")
> > > Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
> > > Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> > > ---
> > >  tools/bpf/bpftool/prog.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
> > > index 66f04a4846a5..43fdbbfe41bb 100644
> > > --- a/tools/bpf/bpftool/prog.c
> > > +++ b/tools/bpf/bpftool/prog.c
> > > @@ -363,7 +363,9 @@ static int do_show(int argc, char **argv)
> > >                 if (fd < 0)
> > >                         return -1;
> > >
> > > -               return show_prog(fd);
> > > +               err = show_prog(fd);
> > > +               close(fd);
> > > +               return err;
> >
> > There is a similar problem few lines above for special case of argc ==
> > 2, which you didn't fix.
>
> This is the special argc == 2 case.

Yep, you are right, the other one already does this.

>
> > Would it be better to make show_prog(fd) close provided fd instead or
> > is it used in some other context where FD should live longer (I
> > haven't checked, sorry)?
>
> I think it used to close that's how the bug crept in. Other than the bug
> it's fine the way it is.

So are you saying that show_prog() should or should not close FD?

^ permalink raw reply

* Re: INFO: task hung in tls_sw_release_resources_tx
From: syzbot @ 2019-08-15 18:06 UTC (permalink / raw)
  To: ast, aviadye, borisp, bpf, daniel, davejwatson, davem, hdanton,
	jakub.kicinski, john.fastabend, kafai, linux-kernel, netdev,
	songliubraving, syzkaller-bugs, yhs
In-Reply-To: <000000000000523ea3059025b11d@google.com>

syzbot has bisected this bug to:

commit 130b392c6cd6b2aed1b7eb32253d4920babb4891
Author: Dave Watson <davejwatson@fb.com>
Date:   Wed Jan 30 21:58:31 2019 +0000

     net: tls: Add tls 1.3 support

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=118e8dee600000
start commit:   6d5afe20 sctp: fix memleak in sctp_send_reset_streams
git tree:       net
final crash:    https://syzkaller.appspot.com/x/report.txt?x=138e8dee600000
console output: https://syzkaller.appspot.com/x/log.txt?x=158e8dee600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
dashboard link: https://syzkaller.appspot.com/bug?extid=6a9ff159672dfbb41c95
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17cb0502600000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14d5dc22600000

Reported-by: syzbot+6a9ff159672dfbb41c95@syzkaller.appspotmail.com
Fixes: 130b392c6cd6 ("net: tls: Add tls 1.3 support")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

^ permalink raw reply

* Re: [PATCH bpf] tools: bpftool: close prog FD before exit on showing a single program
From: Jakub Kicinski @ 2019-08-15 18:09 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Quentin Monnet, Alexei Starovoitov, Daniel Borkmann, bpf,
	Networking, oss-drivers
In-Reply-To: <CAEf4BzYL-pJ79nKywsAH1b2S-EP_4SUZY5jS2wzYJ32pywsyrw@mail.gmail.com>

On Thu, 15 Aug 2019 11:05:16 -0700, Andrii Nakryiko wrote:
> > > Would it be better to make show_prog(fd) close provided fd instead or
> > > is it used in some other context where FD should live longer (I
> > > haven't checked, sorry)?  
> >
> > I think it used to close that's how the bug crept in. Other than the bug
> > it's fine the way it is.  
> 
> So are you saying that show_prog() should or should not close FD?

Yup, it we'd have to rename it to indicate it closes the fd, and it's
only called in two places. Not worth the churn.

^ permalink raw reply

* Re: [PATCH bpf-next v2 2/3] xdp: xdp_umem: replace kmap on vmap for umem map
From: Jonathan Lemon @ 2019-08-15 18:23 UTC (permalink / raw)
  To: Ivan Khoronzhuk
  Cc: magnus.karlsson, bjorn.topel, davem, hawk, john.fastabend,
	jakub.kicinski, daniel, netdev, bpf, xdp-newbies, linux-kernel,
	yhs, andrii.nakryiko
In-Reply-To: <20190815121356.8848-3-ivan.khoronzhuk@linaro.org>

On 15 Aug 2019, at 5:13, Ivan Khoronzhuk wrote:

> For 64-bit there is no reason to use vmap/vunmap, so use page_address
> as it was initially. For 32 bits, in some apps, like in samples
> xdpsock_user.c when number of pgs in use is quite big, the kmap
> memory can be not enough, despite on this, kmap looks like is
> deprecated in such cases as it can block and should be used rather
> for dynamic mm.
>
> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> ---
>  net/xdp/xdp_umem.c | 36 ++++++++++++++++++++++++++++++------
>  1 file changed, 30 insertions(+), 6 deletions(-)
>
> diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c
> index a0607969f8c0..d740c4f8810c 100644
> --- a/net/xdp/xdp_umem.c
> +++ b/net/xdp/xdp_umem.c
> @@ -14,7 +14,7 @@
>  #include <linux/netdevice.h>
>  #include <linux/rtnetlink.h>
>  #include <linux/idr.h>
> -#include <linux/highmem.h>
> +#include <linux/vmalloc.h>
>
>  #include "xdp_umem.h"
>  #include "xsk_queue.h"
> @@ -170,7 +170,30 @@ static void xdp_umem_unmap_pages(struct xdp_umem 
> *umem)
>  	unsigned int i;
>
>  	for (i = 0; i < umem->npgs; i++)
> -		kunmap(umem->pgs[i]);
> +		if (PageHighMem(umem->pgs[i]))
> +			vunmap(umem->pages[i].addr);
> +}
> +
> +static int xdp_umem_map_pages(struct xdp_umem *umem)
> +{
> +	unsigned int i;
> +	void *addr;
> +
> +	for (i = 0; i < umem->npgs; i++) {
> +		if (PageHighMem(umem->pgs[i]))
> +			addr = vmap(&umem->pgs[i], 1, VM_MAP, PAGE_KERNEL);
> +		else
> +			addr = page_address(umem->pgs[i]);
> +
> +		if (!addr) {
> +			xdp_umem_unmap_pages(umem);
> +			return -ENOMEM;
> +		}
> +
> +		umem->pages[i].addr = addr;
> +	}
> +
> +	return 0;
>  }

You'll want a __xdp_umem_unmap_pages() helper here that takes an
count of the number of pages to unmap, so it can be called from
xdp_umem_unmap_pages() in the normal case, and xdp_umem_map_pages()
in the error case.  Otherwise the error case ends up calling
PageHighMem on a null page.
-- 
Jonathan

>  static void xdp_umem_unpin_pages(struct xdp_umem *umem)
> @@ -312,7 +335,7 @@ static int xdp_umem_reg(struct xdp_umem *umem, 
> struct xdp_umem_reg *mr)
>  	u32 chunk_size = mr->chunk_size, headroom = mr->headroom;
>  	unsigned int chunks, chunks_per_page;
>  	u64 addr = mr->addr, size = mr->len;
> -	int size_chk, err, i;
> +	int size_chk, err;
>
>  	if (chunk_size < XDP_UMEM_MIN_CHUNK_SIZE || chunk_size > PAGE_SIZE) 
> {
>  		/* Strictly speaking we could support this, if:
> @@ -378,10 +401,11 @@ static int xdp_umem_reg(struct xdp_umem *umem, 
> struct xdp_umem_reg *mr)
>  		goto out_account;
>  	}
>
> -	for (i = 0; i < umem->npgs; i++)
> -		umem->pages[i].addr = kmap(umem->pgs[i]);
> +	err = xdp_umem_map_pages(umem);
> +	if (!err)
> +		return 0;
>
> -	return 0;
> +	kfree(umem->pages);
>
>  out_account:
>  	xdp_umem_unaccount_pages(umem);
> -- 
> 2.17.1

^ permalink raw reply

* Re: [PATCH 00/14] ARM: move lpc32xx and dove to multiplatform
From: Sylvain Lemieux @ 2019-08-15 18:32 UTC (permalink / raw)
  To: Arnd Bergmann, Russell King - ARM Linux admin
  Cc: SoC Team, Linux ARM, Vladimir Zapolskiy, Gregory Clement,
	Linus Walleij, Jason Cooper, Andrew Lunn, Sebastian Hesselbarth,
	David S. Miller, Greg Kroah-Hartman, Alan Stern, Guenter Roeck,
	open list:GPIO SUBSYSTEM, Networking, linux-serial, USB list,
	LINUXWATCHDOG
In-Reply-To: <CAK8P3a0=GrjM_HOBgqy5V3pOsA6w1EDOtEQO9dZG2Cw+-2niaw@mail.gmail.com>

Hi Arnd,

On 8/15/19 9:11 AM, Arnd Bergmann wrote:
> On Thu, Aug 1, 2019 at 9:33 AM Arnd Bergmann <arnd@arndb.de> wrote:
>>
>> On Thu, Aug 1, 2019 at 12:53 AM Russell King - ARM Linux admin
>> <linux@armlinux.org.uk> wrote:
>>>
>>> On Wed, Jul 31, 2019 at 09:56:42PM +0200, Arnd Bergmann wrote:
>>>> For dove, the patches are basically what I had proposed back in
>>>> 2015 when all other ARMv6/ARMv7 machines became part of a single
>>>> kernel build. I don't know what the state is mach-dove support is,
>>>> compared to the DT based support in mach-mvebu for the same
>>>> hardware. If they are functionally the same, we could also just
>>>> remove mach-dove rather than applying my patches.
>>>
>>> Well, the good news is that I'm down to a small board support file
>>> for the Dove Cubox now - but the bad news is, that there's still a
>>> board support file necessary to support everything the Dove SoC has
>>> to offer.
>>>
>>> Even for a DT based Dove Cubox, I'm still using mach-dove, but it
>>> may be possible to drop most of mach-dove now.  Without spending a
>>> lot of time digging through it, it's impossible to really know.
>>
>> Ok, so we won't remove it then, but I'd like to merge my patches to
>> at least get away from the special case of requiring a separate kernel
>> image for it.
>>
>> Can you try if applying patches 12 and 14 from my series causes
>> problems for you? (it may be easier to apply the entire set
>> or pull from [1] to avoid rebase conflicts).
> 
> I applied patches 12 and 13 into the soc tree now. There are some
> other pending multiplatform conversions (iop32x, ep93xx, lpc32xx,
> omap1), but it looks like none of those will be complete for 5.4.

I think the patchset (v2) for the LPC32xx is ready for 5.4
([PATCH v2 00/13] v2: ARM: move lpc32xx to multiplatform)
 >
> I now expect that we can get most of the preparation into 5.4,
> and maybe move them all over together in 5.5 after some more
> testing. If someone finds a problem with the one of the
> preparation steps, that we can revert the individual patches
> more easily.
> 
>        Arnd
> 
Sylvain

^ permalink raw reply

* Re: [PATCH v2] selftests: net: tcp_fastopen_backup_key.sh: fix shellcheck issue
From: David Miller @ 2019-08-15 18:34 UTC (permalink / raw)
  To: anders.roxell; +Cc: shuah, netdev, tim.bird, linux-kselftest, linux-kernel
In-Reply-To: <20190815075826.13210-1-anders.roxell@linaro.org>

From: Anders Roxell <anders.roxell@linaro.org>
Date: Thu, 15 Aug 2019 09:58:26 +0200

> When running tcp_fastopen_backup_key.sh the following issue was seen in
> a busybox environment.
> ./tcp_fastopen_backup_key.sh: line 33: [: -ne: unary operator expected
> 
> Shellcheck showed the following issue.
> $ shellcheck tools/testing/selftests/net/tcp_fastopen_backup_key.sh
> 
> In tools/testing/selftests/net/tcp_fastopen_backup_key.sh line 33:
>         if [ $val -ne 0 ]; then
>              ^-- SC2086: Double quote to prevent globbing and word splitting.
> 
> Rework to do a string comparison instead.
> 
> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>

Applied.

^ permalink raw reply

* Re: [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf
From: Andy Lutomirski @ 2019-08-15 18:36 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jordan Glover, Andy Lutomirski, Daniel Colascione, Song Liu,
	Kees Cook, Networking, bpf, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, Lorenz Bauer, Jann Horn, Greg KH, Linux API,
	LSM List
In-Reply-To: <20190815172856.yoqvgu2yfrgbkowu@ast-mbp.dhcp.thefacebook.com>

On Thu, Aug 15, 2019 at 10:29 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Aug 15, 2019 at 11:24:54AM +0000, Jordan Glover wrote:
> > systemd --user processes aren't "less privileged". The are COMPLETELY unprivileged.
> > Granting them cap_bpf is the same as granting it to every other unprivileged user
> > process. Also unprivileged user process can start systemd --user process with any
> > command they like.
>
> systemd itself is trusted. It's the same binary whether it runs as pid=1
> or as pid=123. One of the use cases is to make IPAddressDeny= work with --user.
> Subset of that feature already works with AmbientCapabilities=CAP_NET_ADMIN.
> CAP_BPF is a natural step in the same direction.
>

I have the feeling that we're somehow speaking different languages.
What, precisely, do you mean when you say "systemd itself is trusted"?
 Do you mean "the administrator trusts that the /lib/systemd/systemd
binary is not malicious"?  Do you mean "the administrator trusts that
the running systemd process is not malicious"?

On a regular Linux desktop or server box, passing CAP_NET_ADMIN, your
envisioned CAP_BPF, or /dev/bpf as in this patchset through to a
systemd --user binary would be a gaping security hole.  You are
welcome to do it on your own systemd, but if a distro did it, it would
be a major error.

If you want IPAddressDeny= to work in a user systemd unit (i.e.
/etc/systemd/user/*), then I think you have two choices.  You could
have an API by which systemd --user can ask a privileged helper to
assist (which has all the challenges you mentioned but is definitely
*possible*) or the kernel bpf() interfaces need to be designed so
that, in the absence of kernel bugs, they are safe to use from an
unprivileged process.  By "safe", I mean "would not expose the system
to attack if the kernel's implementation of the bpf() ABI were
perfect".

My suggestions upthread for incrementally making bpf() depend less on
privilege would accomplish this goal.  It would be entirely reasonable
to say that, even with those changes, bpf() is still a large attack
surface and access to it should be restricted, and having a capability
or other mechanism to explicitly grant access to the
hopefully-secure-but-plausibly-buggy parts of bpf() would make sense.
But you rejected that idea and said you "realized that [changing all
the capable() checks is] perfect as-is" without much explanation,
which makes it hard to understand where you're coming from.

^ permalink raw reply

* Re: [PATCH net-next] net/rds: Add RDS6_INFO_SOCKETS and RDS6_INFO_RECV_MESSAGES options
From: David Miller @ 2019-08-15 18:40 UTC (permalink / raw)
  To: ka-cheong.poon; +Cc: netdev, santosh.shilimkar, rds-devel
In-Reply-To: <1565861803-31268-1-git-send-email-ka-cheong.poon@oracle.com>

From: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Date: Thu, 15 Aug 2019 02:36:43 -0700

> Add support of the socket options RDS6_INFO_SOCKETS and
> RDS6_INFO_RECV_MESSAGES which update the RDS_INFO_SOCKETS and
> RDS_INFO_RECV_MESSAGES options respectively.  The old options work
> for IPv4 sockets only.
> 
> Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] net: phy: read MII_CTRL1000 in genphy_read_status only if needed
From: David Miller @ 2019-08-15 18:42 UTC (permalink / raw)
  To: hkallweit1; +Cc: andrew, f.fainelli, netdev
In-Reply-To: <84cbdf69-70b4-3dd0-c34d-7db0a4f69653@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Thu, 15 Aug 2019 13:15:19 +0200

> Value of MII_CTRL1000 is needed only if LPA_1000MSFAIL is set.
> Therefore move reading this register.
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf
From: Jordan Glover @ 2019-08-15 18:43 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andy Lutomirski, Daniel Colascione, Song Liu, Kees Cook,
	Networking, bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team,
	Lorenz Bauer, Jann Horn, Greg KH, Linux API, LSM List
In-Reply-To: <20190815172856.yoqvgu2yfrgbkowu@ast-mbp.dhcp.thefacebook.com>

On Thursday, August 15, 2019 5:28 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Thu, Aug 15, 2019 at 11:24:54AM +0000, Jordan Glover wrote:
>
> > On Wednesday, August 14, 2019 10:05 PM, Alexei Starovoitov alexei.starovoitov@gmail.com wrote:
> >
> > > On Wed, Aug 14, 2019 at 10:51:23AM -0700, Andy Lutomirski wrote:
> > >
> > > > If eBPF is genuinely not usable by programs that are not fully trusted
> > > > by the admin, then no kernel changes at all are needed. Programs that
> > > > want to reduce their own privileges can easily fork() a privileged
> > > > subprocess or run a little helper to which they delegate BPF
> > > > operations. This is far more flexible than anything that will ever be
> > > > in the kernel because it allows the helper to verify that the rest of
> > > > the program is doing exactly what it's supposed to and restrict eBPF
> > > > operations to exactly the subset that is needed. So a container
> > > > manager or network manager that drops some provilege could have a
> > > > little bpf-helper that manages its BPF XDP, firewalling, etc
> > > > configuration. The two processes would talk over a socketpair.
> > >
> > > there were three projects that tried to delegate bpf operations.
> > > All of them failed.
> > > bpf operational workflow is much more complex than you're imagining.
> > > fork() also doesn't work for all cases.
> > > I gave this example before: consider multiple systemd-like deamons
> > > that need to do bpf operations that want to pass this 'bpf capability'
> > > to other deamons written by other teams. Some of them will start
> > > non-root, but still need to do bpf. They will be rpm installed
> > > and live upgraded while running.
> > > We considered to make systemd such centralized bpf delegation
> > > authority too. It didn't work. bpf in kernel grows quickly.
> > > libbpf part grows independently. llvm keeps evolving.
> > > All of them are being changed while system overall has to stay
> > > operational. Centralized approach breaks apart.
> > >
> > > > The interesting cases you're talking about really do involved
> > > > unprivileged or less privileged eBPF, though. Let's see:
> > > > systemd --user: systemd --user is not privileged at all. There's no
> > > > issue of reducing privilege, since systemd --user doesn't have any
> > > > privilege to begin with. But systemd supports some eBPF features, and
> > > > presumably it would like to support them in the systemd --user case.
> > > > This is unprivileged eBPF.
> > >
> > > Let's disambiguate the terminology.
> > > This /dev/bpf patch set started as describing the feature as 'unprivileged bpf'.
> > > I think that was a mistake.
> > > Let's call systemd-like deamon usage of bpf 'less privileged bpf'.
> > > This is not unprivileged.
> > > 'unprivileged bpf' is what sysctl kernel.unprivileged_bpf_disabled controls.
> > > There is a huge difference between the two.
> > > I'm against extending 'unprivileged bpf' even a bit more than what it is
> > > today for many reasons mentioned earlier.
> > > The /dev/bpf is about 'less privileged'.
> > > Less privileged than root. We need to split part of full root capability
> > > into bpf capability. So that most of the root can be dropped.
> > > This is very similar to what cap_net_admin does.
> > > cap_net_amdin can bring down eth0 which is just as bad as crashing the box.
> > > cap_net_admin is very much privileged. Just 'less privileged' than root.
> > > Same thing for cap_bpf.
> > > May be we should do both cap_bpf and /dev/bpf to make it clear that
> > > this is the same thing. Two interfaces to achieve the same result.
> >
> > systemd --user processes aren't "less privileged". The are COMPLETELY unprivileged.
> > Granting them cap_bpf is the same as granting it to every other unprivileged user
> > process. Also unprivileged user process can start systemd --user process with any
> > command they like.
>
> systemd itself is trusted. It's the same binary whether it runs as pid=1
> or as pid=123. One of the use cases is to make IPAddressDeny= work with --user.
> Subset of that feature already works with AmbientCapabilities=CAP_NET_ADMIN.
> CAP_BPF is a natural step in the same direction.

The point was that systemd will run any arbitrary command you'll throw at it and you want
to automatically attach CAP_BPF to it. AmbientCapabilities is not valid option for
systemd --user instance (otherwise it would be nuts).

I think we may have misunderstanding here. Did you mean systemd "system" service with
"User=xxx" option instead of "systemd --user" service? It would make sense then.

^ permalink raw reply

* Re: [PATCH net-next] net: phy: swphy: emulate register MII_ESTATUS
From: David Miller @ 2019-08-15 18:44 UTC (permalink / raw)
  To: hkallweit1; +Cc: andrew, f.fainelli, netdev
In-Reply-To: <25690798-5122-d5a2-7d2b-c166b8649a2e@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Thu, 15 Aug 2019 13:19:22 +0200

> When the genphy driver binds to a swphy it will call
> genphy_read_abilites that will try to read MII_ESTATUS if BMSR_ESTATEN
> is set in MII_BMSR. So far this would read the default value 0xffff
> and 1000FD and 1000HD are reported as supported just by chance.
> Better add explicit support for emulating MII_ESTATUS.
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> Reviewed-by: Andrew Lunn <andrew@lunn.ch>

Applied.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox