Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH mlx5-next v2 11/13] IB/mlx5: Add flow counters binding support
From: Jason Gunthorpe @ 2018-05-29 19:56 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Boris Pismenny,
	Matan Barak, Raed Salem, Yishai Hadas, Saeed Mahameed,
	linux-netdev, Alex Rosenbaum
In-Reply-To: <20180529130917.13592-12-leon@kernel.org>

On Tue, May 29, 2018 at 04:09:15PM +0300, Leon Romanovsky wrote:
> diff --git a/include/uapi/rdma/mlx5-abi.h b/include/uapi/rdma/mlx5-abi.h
> index 508ea8c82da7..ef3f430a7050 100644
> +++ b/include/uapi/rdma/mlx5-abi.h
> @@ -443,4 +443,18 @@ enum {
>  enum {
>  	MLX5_IB_CLOCK_INFO_V1              = 0,
>  };
> +
> +struct mlx5_ib_flow_counters_data {
> +	__aligned_u64   counters_data;
> +	__u32   ncounters;
> +	__u32   reserved;
> +};
> +
> +struct mlx5_ib_create_flow {
> +	__u32   ncounters_data;
> +	__u32   reserved;
> +	/* Following are counters data based on ncounters_data */
> +	struct mlx5_ib_flow_counters_data data[];
> +};
> +
>  #endif /* MLX5_ABI_USER_H */

This uapi thing still needs to be fixed as I pointed out before.

I still can't figure out why this should be a 2d array. I think it
should be written simply as:

struct mlx5_ib_flow_counter_desc {
        __u32 description;
        __u32 index;
};

struct mlx5_ib_create_flow {
	RDMA_UAPI_PTR(struct mlx5_ib_flow_counter_desc, counters_data);
	__u32   ncounters;
	__u32   reserved;
};

With the corresponding changes elsewhere.

A flex array at the end of a struct means that the struct can never be
extended again which seems like a terrible idea, especially since I
can't fathom why we'd need more that one array of counters and the
current code doesn't even support more than one..

Jason

^ permalink raw reply

* Re: [PATCH bpf-next 04/11] bpf: show prog and map id in fdinfo
From: Daniel Borkmann @ 2018-05-29 19:55 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: netdev
In-Reply-To: <20180529192758.28240588@redhat.com>

On 05/29/2018 07:27 PM, Jesper Dangaard Brouer wrote:
> On Mon, 28 May 2018 02:43:37 +0200
> Daniel Borkmann <daniel@iogearbox.net> wrote:
> 
>> Its trivial and straight forward to expose it for scripts that can
>> then use it along with bpftool in order to inspect an individual
>> application's used maps and progs. Right now we dump some basic
>> information in the fdinfo file but with the help of the map/prog
>> id full introspection becomes possible now.
>>
>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>> Acked-by: Alexei Starovoitov <ast@kernel.org>
> 
> AFAICR iproute uses this proc fdinfo, for pinned maps.  Have you tested
> if this change is handled gracefully by tc ?

Yep, it works just fine, I also tested it before submission.

^ permalink raw reply

* Re: [PATCH bpf-next] bpf: Drop mpls from bpf_fib_lookup
From: Daniel Borkmann @ 2018-05-29 19:51 UTC (permalink / raw)
  To: dsahern, netdev, borkmann, ast; +Cc: David Ahern
In-Reply-To: <20180529175807.20734-1-dsahern@kernel.org>

On 05/29/2018 07:58 PM, dsahern@kernel.org wrote:
> From: David Ahern <dsahern@gmail.com>
> 
> MPLS support will not be submitted this dev cycle, but in working on it
> I do see a few changes are needed to the API. For now, drop mpls from the
> API. Since the fields in question are unions, the mpls fields can be added
> back later without affecting the uapi.
> 
> Signed-off-by: David Ahern <dsahern@gmail.com>

Applied to bpf-next, thanks David!

^ permalink raw reply

* Re: [PATCH bpf-next] bpf: Verify flags in bpf_fib_lookup
From: Daniel Borkmann @ 2018-05-29 19:51 UTC (permalink / raw)
  To: dsahern, netdev, borkmann, ast; +Cc: David Ahern
In-Reply-To: <20180529185913.11697-1-dsahern@kernel.org>

On 05/29/2018 08:59 PM, dsahern@kernel.org wrote:
> From: David Ahern <dsahern@gmail.com>
> 
> Verify flags argument contains only known flags. Allows programs to probe
> for support as more are added.
> 
> Signed-off-by: David Ahern <dsahern@gmail.com>

Applied to bpf-next, thanks David!

^ permalink raw reply

* Re: [PATCH bpf-next] bpf: hide the unused 'off' variable
From: Daniel Borkmann @ 2018-05-29 19:50 UTC (permalink / raw)
  To: Song Liu, John Fastabend
  Cc: Arnd Bergmann, YueHaibing, David Miller, Alexei Starovoitov,
	Networking, Linux Kernel Mailing List
In-Reply-To: <CAPhsuW7tkQVY0rVRt2=cAHf_fif7W_73Ns7_mnt3CriMbGEhrg@mail.gmail.com>

On 05/29/2018 08:18 PM, Song Liu wrote:
> On Tue, May 29, 2018 at 8:53 AM, John Fastabend
> <john.fastabend@gmail.com> wrote:
>> On 05/29/2018 03:35 AM, Arnd Bergmann wrote:
>>> On Tue, May 29, 2018 at 4:40 AM, YueHaibing <yuehaibing@huawei.com> wrote:
>>>> The local variable is only used while CONFIG_IPV6 enabled
>>>>
>>>> net/core/filter.c: In function ‘sk_msg_convert_ctx_access’:
>>>> net/core/filter.c:6489:6: warning: unused variable ‘off’ [-Wunused-variable]
>>>>   int off;
>>>>       ^
>>>> This puts it into #ifdef.
>>>>
>>>> Fixes: 303def35f64e ("bpf: allow sk_msg programs to read sock fields")
>>>> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
>>>
>>> I was about to send the same patch and found you had already sent one.
>>>
>>> Acked-by: Arnd Bergmann <arnd@arndb.de>
>>
>> Thanks! I'm curious why kbuild bot didn't catch this. Will
>> try to dig into that in a bit.
>>
>> Acked-by: John Fastabend <john.fastabend@gmail.com>
> 
> Acked-by: Song Liu <songliubraving@fb.com>

Applied to bpf-next, thanks Yue!

^ permalink raw reply

* Re: [OpenWrt-Devel] [PATCH 0/4 RFCv2] Realtek SMI RTL836x DSA driver
From: Linus Walleij @ 2018-05-29 19:47 UTC (permalink / raw)
  To: Kevin Darbyshire-Bryant
  Cc: Andrew Lunn, netdev, Florian Fainelli, LEDE Development List,
	Vivien Didelot
In-Reply-To: <6FB2D569-87D4-47A9-AB6D-51B85720B6DA@darbyshire-bryant.me.uk>

On Tue, May 29, 2018 at 8:57 PM, Kevin Darbyshire-Bryant
<kevin@darbyshire-bryant.me.uk> wrote:

> Oh lordy, that horrible device as exhibited in the netgear DGN3500.  Talk about magic values
> https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=42120bd7f323ff7170b32a5fd4674babd8b184bc

Yeah I even copied that init sequence over to this driver, however
I actually think I have the LED activity somewhat under control in
this new driver as those registers were kind-of-sort-of documented
in the vendor code drop.

It turns out the D-Link thing I am playing with doesn't even have
any LEDs mounted so I can't test it though.

Yours,
Linus Walleij

^ permalink raw reply

* Re: [PATCH bpf-next] bpf: clean up eBPF helpers documentation
From: Daniel Borkmann @ 2018-05-29 19:44 UTC (permalink / raw)
  To: Song Liu, Quentin Monnet; +Cc: Alexei Starovoitov, Networking, oss-drivers
In-Reply-To: <CAPhsuW4Oi5P=Mbxn1=b5XWiBptd6-SG7dLuEpmCe10G3Cufj6A@mail.gmail.com>

On 05/29/2018 08:27 PM, Song Liu wrote:
> On Tue, May 29, 2018 at 4:27 AM, Quentin Monnet
> <quentin.monnet@netronome.com> wrote:
>> These are minor edits for the eBPF helpers documentation in
>> include/uapi/linux/bpf.h.
>>
>> The main fix consists in removing "BPF_FIB_LOOKUP_", because it ends
>> with a non-escaped underscore that gets interpreted by rst2man and
>> produces the following message in the resulting manual page:
>>
>>     DOCUTILS SYSTEM MESSAGES
>>            System Message: ERROR/3 (/tmp/bpf-helpers.rst:, line 1514)
>>                   Unknown target name: "bpf_fib_lookup".
>>
>> Other edits consist in:
>>
>> - Improving formatting for flag values for "bpf_fib_lookup()" helper.
>> - Emphasising a parameter name in description of the return value for
>>   "bpf_get_stack()" helper.
>> - Removing unnecessary blank lines between "Description" and "Return"
>>   sections for the few helpers that would use it, for consistency.
>>
>> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
[...]
> 
> Please also apply the same changes to tools/include/uapi/linux/bpf.h.

Just did while applying to bpf-next, thanks guys!

> Other than this, it looks to me.
> 
> Acked-by: Song Liu <songliubraving@fb.com>
> 
> Thanks,
> Song
> 

^ permalink raw reply

* RE: [PATCH rdma-next v2 01/13] IB/uverbs: Add an ib_uobject getter to ioctl() infrastructure
From: Ruhl, Michael J @ 2018-05-29 19:31 UTC (permalink / raw)
  To: Leon Romanovsky, Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Boris Pismenny, Matan Barak,
	Raed Salem, Yishai Hadas, Saeed Mahameed, linux-netdev
In-Reply-To: <20180529130917.13592-2-leon@kernel.org>

>-----Original Message-----
>From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
>owner@vger.kernel.org] On Behalf Of Leon Romanovsky
>Sent: Tuesday, May 29, 2018 9:09 AM
>To: Doug Ledford <dledford@redhat.com>; Jason Gunthorpe
><jgg@mellanox.com>
>Cc: Leon Romanovsky <leonro@mellanox.com>; RDMA mailing list <linux-
>rdma@vger.kernel.org>; Boris Pismenny <borisp@mellanox.com>; Matan
>Barak <matanb@mellanox.com>; Raed Salem <raeds@mellanox.com>; Yishai
>Hadas <yishaih@mellanox.com>; Saeed Mahameed
><saeedm@mellanox.com>; linux-netdev <netdev@vger.kernel.org>
>Subject: [PATCH rdma-next v2 01/13] IB/uverbs: Add an ib_uobject getter to
>ioctl() infrastructure
>
>From: Matan Barak <matanb@mellanox.com>
>
>Previously, the user had to dig inside the attribute to get the uobject.
>Add a helper function that correctly extract it (and do the required
>checks) for him/her.
>
>Tested-by: Michael Guralnik <michaelgur@mellanox.com>
>Signed-off-by: Matan Barak <matanb@mellanox.com>
>Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
>---
> drivers/infiniband/core/uverbs_std_types_cq.c      | 23 +++++++++++---------
>--
> .../infiniband/core/uverbs_std_types_flow_action.c |  4 ++--
> include/rdma/uverbs_ioctl.h                        | 11 +++++++++++
> 3 files changed, 25 insertions(+), 13 deletions(-)
>
>diff --git a/drivers/infiniband/core/uverbs_std_types_cq.c
>b/drivers/infiniband/core/uverbs_std_types_cq.c
>index b0dbae9dd0d7..3d293d01afea 100644
>--- a/drivers/infiniband/core/uverbs_std_types_cq.c
>+++ b/drivers/infiniband/core/uverbs_std_types_cq.c
>@@ -65,7 +65,6 @@ static int
>UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(struct ib_device *ib_dev,
> 	struct ib_cq_init_attr attr = {};
> 	struct ib_cq                   *cq;
> 	struct ib_uverbs_completion_event_file    *ev_file = NULL;
>-	const struct uverbs_attr *ev_file_attr;
> 	struct ib_uobject *ev_file_uobj;
>
> 	if (!(ib_dev->uverbs_cmd_mask & 1ULL <<
>IB_USER_VERBS_CMD_CREATE_CQ))
>@@ -87,10 +86,8 @@ static int
>UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(struct ib_device *ib_dev,
>
>	UVERBS_ATTR_CREATE_CQ_FLAGS)))
> 		return -EFAULT;
>
>-	ev_file_attr = uverbs_attr_get(attrs,
>UVERBS_ATTR_CREATE_CQ_COMP_CHANNEL);
>-	if (!IS_ERR(ev_file_attr)) {
>-		ev_file_uobj = ev_file_attr->obj_attr.uobject;
>-
>+	ev_file_uobj = uverbs_attr_get_uobject(attrs,
>UVERBS_ATTR_CREATE_CQ_COMP_CHANNEL);
>+	if (!IS_ERR(ev_file_uobj)) {
> 		ev_file = container_of(ev_file_uobj,
> 				       struct ib_uverbs_completion_event_file,
> 				       uobj_file.uobj);
>@@ -102,8 +99,8 @@ static int
>UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(struct ib_device *ib_dev,
> 		goto err_event_file;
> 	}
>
>-	obj = container_of(uverbs_attr_get(attrs,
>-
>UVERBS_ATTR_CREATE_CQ_HANDLE)->obj_attr.uobject,

See comment below on error checking.  Does this need the error check?

>+	obj = container_of(uverbs_attr_get_uobject(attrs,
>+
>UVERBS_ATTR_CREATE_CQ_HANDLE),
> 			   typeof(*obj), uobject);
> 	obj->uverbs_file	   = ucontext->ufile;
> 	obj->comp_events_reported  = 0;
>@@ -170,13 +167,17 @@ static int
>UVERBS_HANDLER(UVERBS_METHOD_CQ_DESTROY)(struct ib_device
>*ib_dev,
> 						    struct ib_uverbs_file *file,
> 						    struct uverbs_attr_bundle
>*attrs)
> {
>-	struct ib_uverbs_destroy_cq_resp resp;
> 	struct ib_uobject *uobj =
>-		uverbs_attr_get(attrs,
>UVERBS_ATTR_DESTROY_CQ_HANDLE)->obj_attr.uobject;
>-	struct ib_ucq_object *obj = container_of(uobj, struct ib_ucq_object,
>-						 uobject);
>+		uverbs_attr_get_uobject(attrs,
>UVERBS_ATTR_DESTROY_CQ_HANDLE);
>+	struct ib_uverbs_destroy_cq_resp resp;
>+	struct ib_ucq_object *obj;
> 	int ret;
>
>+	if (IS_ERR(uobj))
>+		return PTR_ERR(uobj);
>+

I remember a conversation that if an method attribute was mandatory, that you did not need to
test the uobj for error (since it was checked in the infrastructure).

Is this error check necessary?

Thanks

Mike

>+	obj = container_of(uobj, struct ib_ucq_object, uobject);
>+
> 	if (!(ib_dev->uverbs_cmd_mask & 1ULL <<
>IB_USER_VERBS_CMD_DESTROY_CQ))
> 		return -EOPNOTSUPP;
>
>diff --git a/drivers/infiniband/core/uverbs_std_types_flow_action.c
>b/drivers/infiniband/core/uverbs_std_types_flow_action.c
>index b4f016dfa23d..a7be51cf2e42 100644
>--- a/drivers/infiniband/core/uverbs_std_types_flow_action.c
>+++ b/drivers/infiniband/core/uverbs_std_types_flow_action.c
>@@ -320,7 +320,7 @@ static int
>UVERBS_HANDLER(UVERBS_METHOD_FLOW_ACTION_ESP_CREATE)(struct
>ib_device
> 		return ret;
>
> 	/* No need to check as this attribute is marked as MANDATORY */
>-	uobj = uverbs_attr_get(attrs,
>UVERBS_ATTR_FLOW_ACTION_ESP_HANDLE)->obj_attr.uobject;
>+	uobj = uverbs_attr_get_uobject(attrs,
>UVERBS_ATTR_FLOW_ACTION_ESP_HANDLE);
> 	action = ib_dev->create_flow_action_esp(ib_dev, &esp_attr.hdr,
>attrs);
> 	if (IS_ERR(action))
> 		return PTR_ERR(action);
>@@ -350,7 +350,7 @@ static int
>UVERBS_HANDLER(UVERBS_METHOD_FLOW_ACTION_ESP_MODIFY)(struct
>ib_device
> 	if (ret)
> 		return ret;
>
>-	uobj = uverbs_attr_get(attrs,
>UVERBS_ATTR_FLOW_ACTION_ESP_HANDLE)->obj_attr.uobject;
>+	uobj = uverbs_attr_get_uobject(attrs,
>UVERBS_ATTR_FLOW_ACTION_ESP_HANDLE);
> 	action = uobj->object;
>
> 	if (action->type != IB_FLOW_ACTION_ESP)
>diff --git a/include/rdma/uverbs_ioctl.h b/include/rdma/uverbs_ioctl.h
>index 4a4201d997a7..7ac6271a5ee0 100644
>--- a/include/rdma/uverbs_ioctl.h
>+++ b/include/rdma/uverbs_ioctl.h
>@@ -420,6 +420,17 @@ static inline void *uverbs_attr_get_obj(const struct
>uverbs_attr_bundle *attrs_b
> 	return uobj->object;
> }
>
>+static inline struct ib_uobject *uverbs_attr_get_uobject(const struct
>uverbs_attr_bundle *attrs_bundle,
>+							 u16 idx)
>+{
>+	const struct uverbs_attr *attr = uverbs_attr_get(attrs_bundle, idx);
>+
>+	if (IS_ERR(attr))
>+		return ERR_CAST(attr);
>+
>+	return attr->obj_attr.uobject;
>+}
>+
> static inline int uverbs_copy_to(const struct uverbs_attr_bundle
>*attrs_bundle,
> 				 size_t idx, const void *from, size_t size)
> {
>--
>2.14.3
>
>--
>To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 2/4 RFCv2] net: dsa: Add bindings for Realtek SMI DSAs
From: Florian Fainelli @ 2018-05-29 19:30 UTC (permalink / raw)
  To: Linus Walleij, Andrew Lunn, Vivien Didelot
  Cc: netdev, openwrt-devel, LEDE Development List,
	Antti Seppälä, Roman Yeryomin, Colin Leitner,
	Gabor Juhos, devicetree
In-Reply-To: <20180528174752.6806-3-linus.walleij@linaro.org>



On 05/28/2018 10:47 AM, Linus Walleij wrote:
> The Realtek SMI family is a set of DSA chips that provide
> switching in routers. This binding just follows the pattern
> set by other switches but with the introduction of an embedded
> irqchip to demux and handle the interrupts fired by the single
> line from the chip.
> 
> This interrupt construction is similar to how we handle
> interrupt controllers inside PCI bridges etc.
> 
> Cc: Antti Seppälä <a.seppala@gmail.com>
> Cc: Roman Yeryomin <roman@advem.lv>
> Cc: Colin Leitner <colin.leitner@googlemail.com>
> Cc: Gabor Juhos <juhosg@openwrt.org>
> Cc: Florian Fainelli <f.fainelli@gmail.com>
> Cc: devicetree@vger.kernel.org
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---

> +
> +	mdio {
> +		compatible = "realtek,smi-mdio", "dsa-mdio";

You should drop the "dsa-mdio" compatible string here since it both non
documented and not matched either.

Other than that, this looks great, with that fixed:

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply

* Re: [PATCH net-next] bpfilter: fix a build err
From: David Miller @ 2018-05-29 19:20 UTC (permalink / raw)
  To: yuehaibing; +Cc: ast, netdev, linux-kernel
In-Reply-To: <20180525101757.13756-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Fri, 25 May 2018 18:17:57 +0800

> gcc-7.3.0 report following err:
> 
>   HOSTCC  net/bpfilter/main.o
> In file included from net/bpfilter/main.c:9:0:
> ./include/uapi/linux/bpf.h:12:10: fatal error: linux/bpf_common.h: No such file or directory
>  #include <linux/bpf_common.h>
> 
> remove it by adding a include path.
> Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module")
> 
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next 1/5] net: aquantia: Ethtool based ring size configuration
From: Jakub Kicinski @ 2018-05-29 19:04 UTC (permalink / raw)
  To: Igor Russkikh
  Cc: David S . Miller, netdev, David Arcari, Pavel Belous,
	Anton Mikaev
In-Reply-To: <7b86f8ec337f9c27a362c5a0a5c306928b07b04e.1527596210.git.igor.russkikh@aquantia.com>

On Tue, 29 May 2018 15:56:58 +0300, Igor Russkikh wrote:
> +static int aq_set_ringparam(struct net_device *ndev,
> +			    struct ethtool_ringparam *ring)
> +{
> +	int err = 0;
> +	struct aq_nic_s *aq_nic = netdev_priv(ndev);
> +	struct aq_nic_cfg_s *aq_nic_cfg = aq_nic_get_cfg(aq_nic);
> +	const struct aq_hw_caps_s *hw_caps = aq_nic_cfg->aq_hw_caps;
> +
> +	if (ring->rx_mini_pending || ring->rx_jumbo_pending) {
> +		err = -EOPNOTSUPP;
> +		goto err_exit;
> +	}
> +
> +	spin_lock(&aq_nic->aq_spinlock);
> +
> +	if (netif_running(ndev))
> +		dev_close(ndev);

I don't think you can hold a spinlock around dev_close()/dev_open()
calls.

> +	aq_nic_free_vectors(aq_nic);
> +
> +	aq_nic_cfg->rxds = max(ring->rx_pending, hw_caps->rxds_min);
> +	aq_nic_cfg->rxds = min(aq_nic_cfg->rxds, hw_caps->rxds_max);
> +	aq_nic_cfg->rxds = ALIGN(aq_nic_cfg->rxds, AQ_HW_RXD_MULTIPLE);
> +
> +	aq_nic_cfg->txds = max(ring->tx_pending, hw_caps->txds_min);
> +	aq_nic_cfg->txds = min(aq_nic_cfg->txds, hw_caps->txds_max);
> +	aq_nic_cfg->txds = ALIGN(aq_nic_cfg->txds, AQ_HW_TXD_MULTIPLE);
> +
> +	for (aq_nic->aq_vecs = 0; aq_nic->aq_vecs < aq_nic_cfg->vecs;
> +	     aq_nic->aq_vecs++) {
> +		aq_nic->aq_vec[aq_nic->aq_vecs] =
> +		    aq_vec_alloc(aq_nic, aq_nic->aq_vecs, aq_nic_cfg);
> +		if (unlikely(!aq_nic->aq_vec[aq_nic->aq_vecs])) {
> +			err = -ENOMEM;
> +			goto err_unlock;
> +		}
> +	}
> +	if (!netif_running(ndev))
> +		err = dev_open(ndev);

Will this not open the device regardless if it was open before or not?

> +err_unlock:
> +	spin_unlock(&aq_nic->aq_spinlock);
> +err_exit:
> +	return err;
> +}

^ permalink raw reply

* [PATCH bpf-next] bpf: Verify flags in bpf_fib_lookup
From: dsahern @ 2018-05-29 18:59 UTC (permalink / raw)
  To: netdev, borkmann, ast; +Cc: David Ahern

From: David Ahern <dsahern@gmail.com>

Verify flags argument contains only known flags. Allows programs to probe
for support as more are added.

Signed-off-by: David Ahern <dsahern@gmail.com>
---
 net/core/filter.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/core/filter.c b/net/core/filter.c
index 24e6ce8be567..4cff6d9cd724 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4270,6 +4270,9 @@ BPF_CALL_4(bpf_xdp_fib_lookup, struct xdp_buff *, ctx,
 	if (plen < sizeof(*params))
 		return -EINVAL;
 
+	if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT))
+		return -EINVAL;
+
 	switch (params->family) {
 #if IS_ENABLED(CONFIG_INET)
 	case AF_INET:
@@ -4304,6 +4307,9 @@ BPF_CALL_4(bpf_skb_fib_lookup, struct sk_buff *, skb,
 	if (plen < sizeof(*params))
 		return -EINVAL;
 
+	if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT))
+		return -EINVAL;
+
 	switch (params->family) {
 #if IS_ENABLED(CONFIG_INET)
 	case AF_INET:
-- 
2.11.0

^ permalink raw reply related

* Re: [OpenWrt-Devel] [PATCH 0/4 RFCv2] Realtek SMI RTL836x DSA driver
From: Kevin Darbyshire-Bryant @ 2018-05-29 18:57 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Andrew Lunn, netdev, Florian Fainelli, LEDE Development List,
	Vivien Didelot
In-Reply-To: <CACRpkdbem_jTe6cQwTeTi9Uhk2B88rGENePZvjbsq38mfvcs0A@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2631 bytes --]



> On 29 May 2018, at 19:41, Linus Walleij <linus.walleij@linaro.org> wrote:
> 
> On Tue, May 29, 2018 at 2:24 PM, Andrew Lunn <andrew@lunn.ch> wrote:
> 
>> Did you look at the switch end? I found a datasheet for the
>> 8366/8369. Register at 0x0050, P8GCR. It has two bits for RGMII
>> delays.
> 
> Unfortunately this datasheet is not applicable to RTL8366RB.
> 
> RTL documentation and model numbers are a complete mess
> around the time when this chip came out, unfortunately... I even
> started to implement using that datasheet and had to toss a bunch
> of stuff away.
> 
> There might not even be a proper datasheet for RTL8366RB,
> I'm afraid. The best we have is different (3 different AFAICT)
> vendor code drops. Here is one drop over at DD-WRT:
> https://svn.dd-wrt.com//browser/src/linux/universal/linux-3.2/drivers/net/ethernet/raeth/rb
> 
> As you can see, the RTL8366RB vendor driver consists of
> a hacked version of their RTL8368S driver, so apparently those
> two ASICs are similar, they even kept the same filenames.
> 
> For example the register defintions:
> https://svn.dd-wrt.com/browser/src/linux/universal/linux-3.2/drivers/net/ethernet/raeth/rb/rtl8368s_reg.h
> 
>> With RGMII delays, you have 3 'choices'.
>> 
>> 1) The hardware design includes the delay, by zig-zagging the clock
>> line to make it longer.
>> 2) The 'MAC' side does the delay.
>> 3) The 'PHY' side does the delay.
>> 
>> I normally recommend the PHY side doing it, because that's what most
>> board do. Gives us some consistency. But it does not really
>> matter. Just make sure one side, and only once side is inserting the
>> delays.
> 
> Makes sense! But I haven't found anything applicable in the
> RTL8366RB registers.
> 
> There are some jam tables with magic values written all over
> the place that have no documentation, I fear this is one of the
> settings poked around with there.
> 
> However, even if this router did not come with any code for
> the RTL8366RB driver, I disassembled the binary to verify
> that they use the same magic jam table, so the ASIC is
> initialized in the same way.
> 
> Yours,
> Linus Walleij
> 
> _______________________________________________
> openwrt-devel mailing list
> openwrt-devel@lists.openwrt.org
> https://lists.openwrt.org/listinfo/openwrt-devel

Oh lordy, that horrible device as exhibited in the netgear DGN3500.  Talk about magic values https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=42120bd7f323ff7170b32a5fd4674babd8b184bc

Cheers,

Kevin D-B

012C ACB2 28C6 C53E 9775  9123 B3A2 389B 9DE2 334A


[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [1/4,RFCv2] net: phy: realtek: Support RTL8366RB variant
From: Heiner Kallweit @ 2018-05-29 18:51 UTC (permalink / raw)
  To: Linus Walleij, Andrew Lunn, Vivien Didelot, Florian Fainelli
  Cc: netdev, openwrt-devel, LEDE Development List,
	Antti Seppälä, Roman Yeryomin, Colin Leitner,
	Gabor Juhos
In-Reply-To: <20180528174752.6806-2-linus.walleij@linaro.org>

Am 28.05.2018 um 19:47 schrieb Linus Walleij:
> The RTL8366RB is an ASIC with five internal PHYs for
> LAN0..LAN3 and WAN. The PHYs are spawn off the main
> device so they can be handled in a distributed manner
> by the Realtek PHY driver. All that is really needed
> is the power save feature enablement and letting the
> PHY driver core pick up the IRQ from the switch chip.
> 
> Cc: Antti Seppälä <a.seppala@gmail.com>
> Cc: Roman Yeryomin <roman@advem.lv>
> Cc: Colin Leitner <colin.leitner@googlemail.com>
> Cc: Gabor Juhos <juhosg@openwrt.org>
> Cc: Florian Fainelli <f.fainelli@gmail.com>
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
> ChangeLog RFCv1->RFCv2:
> - No real changes.
> ---
>  drivers/net/phy/realtek.c | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
> 
> diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
> index 9f48ecf9c627..21624d1c9d38 100644
> --- a/drivers/net/phy/realtek.c
> +++ b/drivers/net/phy/realtek.c
> @@ -37,6 +37,9 @@
>  #define RTL8201F_ISR				0x1e
>  #define RTL8201F_IER				0x13
>  
> +#define RTL8366RB_POWER_SAVE	0x21
Typically PHY register addresses are 5 bits wide, is 0x21 correct
and I miss something?

> +#define RTL8366RB_POWER_SAVE_ON 0x1000
Use BIT(12) instead?

> +
>  MODULE_DESCRIPTION("Realtek PHY driver");
>  MODULE_AUTHOR("Johnson Leung");
>  MODULE_LICENSE("GPL");
> @@ -145,6 +148,22 @@ static int rtl8211f_config_init(struct phy_device *phydev)
>  	return phy_modify_paged(phydev, 0xd08, 0x11, RTL8211F_TX_DELAY, val);
>  }
>  
> +static int rtl8366rb_config_init(struct phy_device *phydev)
> +{
> +	int ret;
> +	u16 reg;
> +
> +	ret = genphy_config_init(phydev);
> +	if (ret < 0)
> +		return ret;
> +
> +	reg = phy_read(phydev, RTL8366RB_POWER_SAVE);
> +	reg |= RTL8366RB_POWER_SAVE_ON;
> +	phy_write(phydev, RTL8366RB_POWER_SAVE, reg);
return phy_set_bits(phydev, RTL8366RB_POWER_SAVE, RTL8366RB_POWER_SAVE_ON);
would be easier.

> +
> +	return 0;
> +}
> +
>  static struct phy_driver realtek_drvs[] = {
>  	{
>  		.phy_id         = 0x00008201,
> @@ -207,6 +226,18 @@ static struct phy_driver realtek_drvs[] = {
>  		.resume		= genphy_resume,
>  		.read_page	= rtl821x_read_page,
>  		.write_page	= rtl821x_write_page,
> +	}, {
> +		/* The main part of this DSA is in drivers/net/dsa */
> +		.phy_id		= 0x001cc961,
> +		.name		= "RTL8366RB Gigabit Ethernet",
> +		.phy_id_mask	= 0x001fffff,
> +		.features	= PHY_GBIT_FEATURES,
> +		.flags		= PHY_HAS_INTERRUPT,
> +		.config_aneg	= &genphy_config_aneg,
Can be omitted, genphy_config_aneg is the default since 00fde79532d6
"net: phy: core: use genphy version of callbacks read_status and
config_aneg per default".

> +		.config_init	= &rtl8366rb_config_init,
> +		.read_status	= &genphy_read_status,
Can be omitted, genphy_read_status is the default.

> +		.suspend	= genphy_suspend,
> +		.resume		= genphy_resume,
>  	},
>  };
>  
> @@ -218,6 +249,7 @@ static struct mdio_device_id __maybe_unused realtek_tbl[] = {
>  	{ 0x001cc914, 0x001fffff },
>  	{ 0x001cc915, 0x001fffff },
>  	{ 0x001cc916, 0x001fffff },
> +	{ 0x001cc961, 0x001fffff },
>  	{ }
>  };
>  
> 

^ permalink raw reply

* Re: [pull request][for-next 00/12] Mellanox, mlx5e updates 2018-05-25
From: Saeed Mahameed @ 2018-05-29 18:45 UTC (permalink / raw)
  To: davem@davemloft.net; +Cc: netdev@vger.kernel.org
In-Reply-To: <20180529.094751.1897507632916752927.davem@davemloft.net>

On Tue, 2018-05-29 at 09:47 -0400, David Miller wrote:
> From: Saeed Mahameed <saeedm@mellanox.com>
> Date: Fri, 25 May 2018 17:01:55 -0700
> 
> > This is a mlx5e only pull request, for more information please see
> > tag
> > log below.
> > 
> > Please pull and let me know if there's any problem.
> 
> Pulled, thanks Saeed.
> 
> There was a minor conflict to resolve (simple overlapping changes).

Yes, Sorry i didn't notify on this in advance, the merge commit looks
ok, thanks a lot!.


^ permalink raw reply

* Re: [PATCH 0/4 RFCv2] Realtek SMI RTL836x DSA driver
From: Linus Walleij @ 2018-05-29 18:41 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Vivien Didelot, Florian Fainelli, netdev,
	OpenWrt Development List, LEDE Development List
In-Reply-To: <20180529122424.GC10919@lunn.ch>

On Tue, May 29, 2018 at 2:24 PM, Andrew Lunn <andrew@lunn.ch> wrote:

> Did you look at the switch end? I found a datasheet for the
> 8366/8369. Register at 0x0050, P8GCR. It has two bits for RGMII
> delays.

Unfortunately this datasheet is not applicable to RTL8366RB.

RTL documentation and model numbers are a complete mess
around the time when this chip came out, unfortunately... I even
started to implement using that datasheet and had to toss a bunch
of stuff away.

There might not even be a proper datasheet for RTL8366RB,
I'm afraid. The best we have is different (3 different AFAICT)
vendor code drops. Here is one drop over at DD-WRT:
https://svn.dd-wrt.com//browser/src/linux/universal/linux-3.2/drivers/net/ethernet/raeth/rb

As you can see, the RTL8366RB vendor driver consists of
a hacked version of their RTL8368S driver, so apparently those
two ASICs are similar, they even kept the same filenames.

For example the register defintions:
https://svn.dd-wrt.com/browser/src/linux/universal/linux-3.2/drivers/net/ethernet/raeth/rb/rtl8368s_reg.h

> With RGMII delays, you have 3 'choices'.
>
> 1) The hardware design includes the delay, by zig-zagging the clock
> line to make it longer.
> 2) The 'MAC' side does the delay.
> 3) The 'PHY' side does the delay.
>
> I normally recommend the PHY side doing it, because that's what most
> board do. Gives us some consistency. But it does not really
> matter. Just make sure one side, and only once side is inserting the
> delays.

Makes sense! But I haven't found anything applicable in the
RTL8366RB registers.

There are some jam tables with magic values written all over
the place that have no documentation, I fear this is one of the
settings poked around with there.

However, even if this router did not come with any code for
the RTL8366RB driver, I disassembled the binary to verify
that they use the same magic jam table, so the ASIC is
initialized in the same way.

Yours,
Linus Walleij

^ permalink raw reply

* Re: [PATCH bpf-next] bpf: clean up eBPF helpers documentation
From: Song Liu @ 2018-05-29 18:27 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Daniel Borkmann, Alexei Starovoitov, Networking, oss-drivers
In-Reply-To: <20180529112744.16406-1-quentin.monnet@netronome.com>

On Tue, May 29, 2018 at 4:27 AM, Quentin Monnet
<quentin.monnet@netronome.com> wrote:
> These are minor edits for the eBPF helpers documentation in
> include/uapi/linux/bpf.h.
>
> The main fix consists in removing "BPF_FIB_LOOKUP_", because it ends
> with a non-escaped underscore that gets interpreted by rst2man and
> produces the following message in the resulting manual page:
>
>     DOCUTILS SYSTEM MESSAGES
>            System Message: ERROR/3 (/tmp/bpf-helpers.rst:, line 1514)
>                   Unknown target name: "bpf_fib_lookup".
>
> Other edits consist in:
>
> - Improving formatting for flag values for "bpf_fib_lookup()" helper.
> - Emphasising a parameter name in description of the return value for
>   "bpf_get_stack()" helper.
> - Removing unnecessary blank lines between "Description" and "Return"
>   sections for the few helpers that would use it, for consistency.
>
> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
> ---
>  include/uapi/linux/bpf.h | 21 ++++++++++-----------
>  1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index cc68787f2d97..3f556b35ac8d 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1010,7 +1010,6 @@ union bpf_attr {
>   *             ::
>   *
>   *                     # sysctl kernel.perf_event_max_stack=<new value>
> - *
>   *     Return
>   *             The positive or null stack id on success, or a negative error
>   *             in case of failure.
> @@ -1821,10 +1820,9 @@ union bpf_attr {
>   *             ::
>   *
>   *                     # sysctl kernel.perf_event_max_stack=<new value>
> - *
>   *     Return
> - *             a non-negative value equal to or less than size on success, or
> - *             a negative error in case of failure.
> + *             A non-negative value equal to or less than *size* on success,
> + *             or a negative error in case of failure.
>   *
>   * int skb_load_bytes_relative(const struct sk_buff *skb, u32 offset, void *to, u32 len, u32 start_header)
>   *     Description
> @@ -1845,7 +1843,6 @@ union bpf_attr {
>   *             in socket filters where *skb*\ **->data** does not always point
>   *             to the start of the mac header and where "direct packet access"
>   *             is not available.
> - *
>   *     Return
>   *             0 on success, or a negative error in case of failure.
>   *
> @@ -1861,16 +1858,18 @@ union bpf_attr {
>   *             rt_metric is set to metric from route.
>   *
>   *             *plen* argument is the size of the passed in struct.
> - *             *flags* argument can be one or more BPF_FIB_LOOKUP_ flags:
> + *             *flags* argument can be a combination of one or more of the
> + *             following values:
>   *
> - *             **BPF_FIB_LOOKUP_DIRECT** means do a direct table lookup vs
> - *             full lookup using FIB rules
> - *             **BPF_FIB_LOOKUP_OUTPUT** means do lookup from an egress
> - *             perspective (default is ingress)
> + *             **BPF_FIB_LOOKUP_DIRECT**
> + *                     Do a direct table lookup vs full lookup using FIB
> + *                     rules.
> + *             **BPF_FIB_LOOKUP_OUTPUT**
> + *                     Perform lookup from an egress perspective (default is
> + *                     ingress).
>   *
>   *             *ctx* is either **struct xdp_md** for XDP programs or
>   *             **struct sk_buff** tc cls_act programs.
> - *
>   *     Return
>   *             Egress device index on success, 0 if packet needs to continue
>   *             up the stack for further processing or a negative error in case
> --
> 2.14.1
>

Please also apply the same changes to tools/include/uapi/linux/bpf.h.

Other than this, it looks to me.

Acked-by: Song Liu <songliubraving@fb.com>

Thanks,
Song

^ permalink raw reply

* Re: [PATCH bpf-next] bpf: hide the unused 'off' variable
From: Song Liu @ 2018-05-29 18:18 UTC (permalink / raw)
  To: John Fastabend
  Cc: Arnd Bergmann, YueHaibing, David Miller, Alexei Starovoitov,
	Daniel Borkmann, Networking, Linux Kernel Mailing List
In-Reply-To: <4a060ed5-f147-6778-caa2-7866b1f04479@gmail.com>

On Tue, May 29, 2018 at 8:53 AM, John Fastabend
<john.fastabend@gmail.com> wrote:
> On 05/29/2018 03:35 AM, Arnd Bergmann wrote:
>> On Tue, May 29, 2018 at 4:40 AM, YueHaibing <yuehaibing@huawei.com> wrote:
>>> The local variable is only used while CONFIG_IPV6 enabled
>>>
>>> net/core/filter.c: In function ‘sk_msg_convert_ctx_access’:
>>> net/core/filter.c:6489:6: warning: unused variable ‘off’ [-Wunused-variable]
>>>   int off;
>>>       ^
>>> This puts it into #ifdef.
>>>
>>> Fixes: 303def35f64e ("bpf: allow sk_msg programs to read sock fields")
>>> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
>>
>> I was about to send the same patch and found you had already sent one.
>>
>> Acked-by: Arnd Bergmann <arnd@arndb.de>
>>
>
> Thanks! I'm curious why kbuild bot didn't catch this. Will
> try to dig into that in a bit.
>
> Acked-by: John Fastabend <john.fastabend@gmail.com>

Acked-by: Song Liu <songliubraving@fb.com>

^ permalink raw reply

* Re: [PATCH bpf-next 03/11] bpf: fixup error message from gpl helpers on license mismatch
From: Song Liu @ 2018-05-29 18:10 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: Daniel Borkmann, netdev
In-Reply-To: <20180529191628.3f6aa7dd@redhat.com>

On Tue, May 29, 2018 at 10:16 AM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
> On Mon, 28 May 2018 02:43:36 +0200
> Daniel Borkmann <daniel@iogearbox.net> wrote:
>
>> Stating 'proprietary program' in the error is just silly since it
>> can also be a different open source license than that which is just
>> not compatible.
>>
>> Reference: https://twitter.com/majek04/status/998531268039102465
>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>> Acked-by: Alexei Starovoitov <ast@kernel.org>
>
> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
>
> Thank you for cleaning up this confusion :-)
>

Acked-by: Song Liu <songliubraving@fb.com>

>> ---
>>  kernel/bpf/verifier.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 1fd9667b..4f4786e 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -2462,7 +2462,7 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
>>
>>       /* eBPF programs must be GPL compatible to use GPL-ed functions */
>>       if (!env->prog->gpl_compatible && fn->gpl_only) {
>> -             verbose(env, "cannot call GPL only function from proprietary program\n");
>> +             verbose(env, "cannot call GPL-restricted function from non-GPL compatible program\n");
>>               return -EINVAL;
>>       }
>>
>
>
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH bpf-next 02/11] bpf: add also cbpf long jump test cases with heavy expansion
From: Song Liu @ 2018-05-29 18:09 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, netdev
In-Reply-To: <20180528004344.3606-3-daniel@iogearbox.net>

On Sun, May 27, 2018 at 5:43 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> We have one triggering on eBPF but lets also add a cBPF example to
> make sure we keep tracking them. Also add anther cBPF test running
> max number of MSH ops.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: Alexei Starovoitov <ast@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>


> ---
>  lib/test_bpf.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 63 insertions(+)
>
> diff --git a/lib/test_bpf.c b/lib/test_bpf.c
> index 317f231..60aedc8 100644
> --- a/lib/test_bpf.c
> +++ b/lib/test_bpf.c
> @@ -356,6 +356,52 @@ static int bpf_fill_maxinsns11(struct bpf_test *self)
>         return __bpf_fill_ja(self, BPF_MAXINSNS, 68);
>  }
>
> +static int bpf_fill_maxinsns12(struct bpf_test *self)
> +{
> +       unsigned int len = BPF_MAXINSNS;
> +       struct sock_filter *insn;
> +       int i = 0;
> +
> +       insn = kmalloc_array(len, sizeof(*insn), GFP_KERNEL);
> +       if (!insn)
> +               return -ENOMEM;
> +
> +       insn[0] = __BPF_JUMP(BPF_JMP | BPF_JA, len - 2, 0, 0);
> +
> +       for (i = 1; i < len - 1; i++)
> +               insn[i] = __BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 0);
> +
> +       insn[len - 1] = __BPF_STMT(BPF_RET | BPF_K, 0xabababab);
> +
> +       self->u.ptr.insns = insn;
> +       self->u.ptr.len = len;
> +
> +       return 0;
> +}
> +
> +static int bpf_fill_maxinsns13(struct bpf_test *self)
> +{
> +       unsigned int len = BPF_MAXINSNS;
> +       struct sock_filter *insn;
> +       int i = 0;
> +
> +       insn = kmalloc_array(len, sizeof(*insn), GFP_KERNEL);
> +       if (!insn)
> +               return -ENOMEM;
> +
> +       for (i = 0; i < len - 3; i++)
> +               insn[i] = __BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 0);
> +
> +       insn[len - 3] = __BPF_STMT(BPF_LD | BPF_IMM, 0xabababab);
> +       insn[len - 2] = __BPF_STMT(BPF_ALU | BPF_XOR | BPF_X, 0);
> +       insn[len - 1] = __BPF_STMT(BPF_RET | BPF_A, 0);
> +
> +       self->u.ptr.insns = insn;
> +       self->u.ptr.len = len;
> +
> +       return 0;
> +}
> +
>  static int bpf_fill_ja(struct bpf_test *self)
>  {
>         /* Hits exactly 11 passes on x86_64 JIT. */
> @@ -5290,6 +5336,23 @@ static struct bpf_test tests[] = {
>                 .expected_errcode = -ENOTSUPP,
>         },
>         {
> +               "BPF_MAXINSNS: jump over MSH",
> +               { },
> +               CLASSIC | FLAG_EXPECTED_FAIL,
> +               { 0xfa, 0xfb, 0xfc, 0xfd, },
> +               { { 4, 0xabababab } },
> +               .fill_helper = bpf_fill_maxinsns12,
> +               .expected_errcode = -EINVAL,
> +       },
> +       {
> +               "BPF_MAXINSNS: exec all MSH",
> +               { },
> +               CLASSIC,
> +               { 0xfa, 0xfb, 0xfc, 0xfd, },
> +               { { 4, 0xababab83 } },
> +               .fill_helper = bpf_fill_maxinsns13,
> +       },
> +       {
>                 "BPF_MAXINSNS: ld_abs+get_processor_id",
>                 { },
>                 CLASSIC,
> --
> 2.9.5
>

^ permalink raw reply

* Re: [PATCH net] sctp: not allow to set rto_min with a value below 200 msecs
From: Marcelo Ricardo Leitner @ 2018-05-29 18:02 UTC (permalink / raw)
  To: Xin Long
  Cc: Neal Cardwell, Michael Tuexen, Neil Horman, Dmitry Vyukov, Netdev,
	linux-sctp, David Miller, David Ahern, Eric Dumazet, syzkaller
In-Reply-To: <CADvbK_fbKbH2wm6Xurr+ELVag-LvyQdL+peJd=wp7OL7_zMZTQ@mail.gmail.com>

On Wed, May 30, 2018 at 01:45:08AM +0800, Xin Long wrote:
> If we're counting on max_t to fix this CPU stuck. It should not that
> matter if min rto < the value causing that stuck.

Yes but putting a floor to rto_{min,max} now is to protect the rtx
timer now, not the heartbeat one.

> 
> >
> > Anyway, what about we add a floor to rto_max too, so that RTO can
> > actually grow into something bigger that don't hog the CPU? Like:
> > rto_min floor = 5ms
> > rto_max floor = 50ms
> >
> >   Marcelo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply

* Re: [PATCH bpf-next 01/11] bpf: test case for map pointer poison with calls/branches
From: Song Liu @ 2018-05-29 18:01 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, netdev
In-Reply-To: <20180528004344.3606-2-daniel@iogearbox.net>

On Sun, May 27, 2018 at 5:43 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> Add several test cases where the same or different map pointers
> originate from different paths in the program and execute a map
> lookup or tail call at a common location.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: Alexei Starovoitov <ast@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  include/linux/filter.h                      |  10 ++
>  tools/include/linux/filter.h                |  10 ++
>  tools/testing/selftests/bpf/test_verifier.c | 185 ++++++++++++++++++++++++----
>  3 files changed, 178 insertions(+), 27 deletions(-)
>
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index d358d18..b443f70 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -289,6 +289,16 @@ struct xdp_buff;
>                 .off   = OFF,                                   \
>                 .imm   = 0 })
>
> +/* Relative call */
> +
> +#define BPF_CALL_REL(TGT)                                      \
> +       ((struct bpf_insn) {                                    \
> +               .code  = BPF_JMP | BPF_CALL,                    \
> +               .dst_reg = 0,                                   \
> +               .src_reg = BPF_PSEUDO_CALL,                     \
> +               .off   = 0,                                     \
> +               .imm   = TGT })
> +
>  /* Function call */
>
>  #define BPF_EMIT_CALL(FUNC)                                    \
> diff --git a/tools/include/linux/filter.h b/tools/include/linux/filter.h
> index c5e512d..af55acf 100644
> --- a/tools/include/linux/filter.h
> +++ b/tools/include/linux/filter.h
> @@ -263,6 +263,16 @@
>  #define BPF_LD_MAP_FD(DST, MAP_FD)                             \
>         BPF_LD_IMM64_RAW(DST, BPF_PSEUDO_MAP_FD, MAP_FD)
>
> +/* Relative call */
> +
> +#define BPF_CALL_REL(TGT)                                      \
> +       ((struct bpf_insn) {                                    \
> +               .code  = BPF_JMP | BPF_CALL,                    \
> +               .dst_reg = 0,                                   \
> +               .src_reg = BPF_PSEUDO_CALL,                     \
> +               .off   = 0,                                     \
> +               .imm   = TGT })
> +
>  /* Program exit */
>
>  #define BPF_EXIT_INSN()                                                \
> diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> index 4b4f015..7cb1d74 100644
> --- a/tools/testing/selftests/bpf/test_verifier.c
> +++ b/tools/testing/selftests/bpf/test_verifier.c
> @@ -50,7 +50,7 @@
>
>  #define MAX_INSNS      BPF_MAXINSNS
>  #define MAX_FIXUPS     8
> -#define MAX_NR_MAPS    4
> +#define MAX_NR_MAPS    7
>  #define POINTER_VALUE  0xcafe4all
>  #define TEST_DATA_LEN  64
>
> @@ -66,7 +66,9 @@ struct bpf_test {
>         int fixup_map1[MAX_FIXUPS];
>         int fixup_map2[MAX_FIXUPS];
>         int fixup_map3[MAX_FIXUPS];
> -       int fixup_prog[MAX_FIXUPS];
> +       int fixup_map4[MAX_FIXUPS];
> +       int fixup_prog1[MAX_FIXUPS];
> +       int fixup_prog2[MAX_FIXUPS];
>         int fixup_map_in_map[MAX_FIXUPS];
>         const char *errstr;
>         const char *errstr_unpriv;
> @@ -2769,7 +2771,7 @@ static struct bpf_test tests[] = {
>                         BPF_MOV64_IMM(BPF_REG_0, 0),
>                         BPF_EXIT_INSN(),
>                 },
> -               .fixup_prog = { 1 },
> +               .fixup_prog1 = { 1 },
>                 .errstr_unpriv = "R3 leaks addr into helper",
>                 .result_unpriv = REJECT,
>                 .result = ACCEPT,
> @@ -2856,7 +2858,7 @@ static struct bpf_test tests[] = {
>                         BPF_MOV64_IMM(BPF_REG_0, 1),
>                         BPF_EXIT_INSN(),
>                 },
> -               .fixup_prog = { 1 },
> +               .fixup_prog1 = { 1 },
>                 .result = ACCEPT,
>                 .retval = 42,
>         },
> @@ -2870,7 +2872,7 @@ static struct bpf_test tests[] = {
>                         BPF_MOV64_IMM(BPF_REG_0, 1),
>                         BPF_EXIT_INSN(),
>                 },
> -               .fixup_prog = { 1 },
> +               .fixup_prog1 = { 1 },
>                 .result = ACCEPT,
>                 .retval = 41,
>         },
> @@ -2884,7 +2886,7 @@ static struct bpf_test tests[] = {
>                         BPF_MOV64_IMM(BPF_REG_0, 1),
>                         BPF_EXIT_INSN(),
>                 },
> -               .fixup_prog = { 1 },
> +               .fixup_prog1 = { 1 },
>                 .result = ACCEPT,
>                 .retval = 1,
>         },
> @@ -2898,7 +2900,7 @@ static struct bpf_test tests[] = {
>                         BPF_MOV64_IMM(BPF_REG_0, 2),
>                         BPF_EXIT_INSN(),
>                 },
> -               .fixup_prog = { 1 },
> +               .fixup_prog1 = { 1 },
>                 .result = ACCEPT,
>                 .retval = 2,
>         },
> @@ -2912,7 +2914,7 @@ static struct bpf_test tests[] = {
>                         BPF_MOV64_IMM(BPF_REG_0, 2),
>                         BPF_EXIT_INSN(),
>                 },
> -               .fixup_prog = { 1 },
> +               .fixup_prog1 = { 1 },
>                 .result = ACCEPT,
>                 .retval = 2,
>         },
> @@ -2926,7 +2928,7 @@ static struct bpf_test tests[] = {
>                         BPF_MOV64_IMM(BPF_REG_0, 2),
>                         BPF_EXIT_INSN(),
>                 },
> -               .fixup_prog = { 2 },
> +               .fixup_prog1 = { 2 },
>                 .result = ACCEPT,
>                 .retval = 42,
>         },
> @@ -11682,6 +11684,112 @@ static struct bpf_test tests[] = {
>                 .prog_type = BPF_PROG_TYPE_XDP,
>         },
>         {
> +               "calls: two calls returning different map pointers for lookup (hash, array)",
> +               .insns = {
> +                       /* main prog */
> +                       BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 2),
> +                       BPF_CALL_REL(11),
> +                       BPF_JMP_IMM(BPF_JA, 0, 0, 1),
> +                       BPF_CALL_REL(12),
> +                       BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
> +                       BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
> +                       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> +                       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> +                                    BPF_FUNC_map_lookup_elem),
> +                       BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
> +                       BPF_ST_MEM(BPF_DW, BPF_REG_0, 0,
> +                                  offsetof(struct test_val, foo)),
> +                       BPF_MOV64_IMM(BPF_REG_0, 1),
> +                       BPF_EXIT_INSN(),
> +                       /* subprog 1 */
> +                       BPF_LD_MAP_FD(BPF_REG_0, 0),
> +                       BPF_EXIT_INSN(),
> +                       /* subprog 2 */
> +                       BPF_LD_MAP_FD(BPF_REG_0, 0),
> +                       BPF_EXIT_INSN(),
> +               },
> +               .prog_type = BPF_PROG_TYPE_SCHED_CLS,
> +               .fixup_map2 = { 13 },
> +               .fixup_map4 = { 16 },
> +               .result = ACCEPT,
> +               .retval = 1,
> +       },
> +       {
> +               "calls: two calls returning different map pointers for lookup (hash, map in map)",
> +               .insns = {
> +                       /* main prog */
> +                       BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 2),
> +                       BPF_CALL_REL(11),
> +                       BPF_JMP_IMM(BPF_JA, 0, 0, 1),
> +                       BPF_CALL_REL(12),
> +                       BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
> +                       BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
> +                       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> +                       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> +                                    BPF_FUNC_map_lookup_elem),
> +                       BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
> +                       BPF_ST_MEM(BPF_DW, BPF_REG_0, 0,
> +                                  offsetof(struct test_val, foo)),
> +                       BPF_MOV64_IMM(BPF_REG_0, 1),
> +                       BPF_EXIT_INSN(),
> +                       /* subprog 1 */
> +                       BPF_LD_MAP_FD(BPF_REG_0, 0),
> +                       BPF_EXIT_INSN(),
> +                       /* subprog 2 */
> +                       BPF_LD_MAP_FD(BPF_REG_0, 0),
> +                       BPF_EXIT_INSN(),
> +               },
> +               .prog_type = BPF_PROG_TYPE_SCHED_CLS,
> +               .fixup_map_in_map = { 16 },
> +               .fixup_map4 = { 13 },
> +               .result = REJECT,
> +               .errstr = "R0 invalid mem access 'map_ptr'",
> +       },
> +       {
> +               "cond: two branches returning different map pointers for lookup (tail, tail)",
> +               .insns = {
> +                       BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1,
> +                                   offsetof(struct __sk_buff, mark)),
> +                       BPF_JMP_IMM(BPF_JNE, BPF_REG_6, 0, 3),
> +                       BPF_LD_MAP_FD(BPF_REG_2, 0),
> +                       BPF_JMP_IMM(BPF_JA, 0, 0, 2),
> +                       BPF_LD_MAP_FD(BPF_REG_2, 0),
> +                       BPF_MOV64_IMM(BPF_REG_3, 7),
> +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> +                                    BPF_FUNC_tail_call),
> +                       BPF_MOV64_IMM(BPF_REG_0, 1),
> +                       BPF_EXIT_INSN(),
> +               },
> +               .fixup_prog1 = { 5 },
> +               .fixup_prog2 = { 2 },
> +               .result_unpriv = REJECT,
> +               .errstr_unpriv = "tail_call abusing map_ptr",
> +               .result = ACCEPT,
> +               .retval = 42,
> +       },
> +       {
> +               "cond: two branches returning same map pointers for lookup (tail, tail)",
> +               .insns = {
> +                       BPF_LDX_MEM(BPF_W, BPF_REG_6, BPF_REG_1,
> +                                   offsetof(struct __sk_buff, mark)),
> +                       BPF_JMP_IMM(BPF_JEQ, BPF_REG_6, 0, 3),
> +                       BPF_LD_MAP_FD(BPF_REG_2, 0),
> +                       BPF_JMP_IMM(BPF_JA, 0, 0, 2),
> +                       BPF_LD_MAP_FD(BPF_REG_2, 0),
> +                       BPF_MOV64_IMM(BPF_REG_3, 7),
> +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> +                                    BPF_FUNC_tail_call),
> +                       BPF_MOV64_IMM(BPF_REG_0, 1),
> +                       BPF_EXIT_INSN(),
> +               },
> +               .fixup_prog2 = { 2, 5 },
> +               .result_unpriv = ACCEPT,
> +               .result = ACCEPT,
> +               .retval = 42,
> +       },
> +       {
>                 "search pruning: all branches should be verified (nop operation)",
>                 .insns = {
>                         BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> @@ -12162,12 +12270,13 @@ static int probe_filter_length(const struct bpf_insn *fp)
>         return len + 1;
>  }
>
> -static int create_map(uint32_t size_value, uint32_t max_elem)
> +static int create_map(uint32_t type, uint32_t size_key,
> +                     uint32_t size_value, uint32_t max_elem)
>  {
>         int fd;
>
> -       fd = bpf_create_map(BPF_MAP_TYPE_HASH, sizeof(long long),
> -                           size_value, max_elem, BPF_F_NO_PREALLOC);
> +       fd = bpf_create_map(type, size_key, size_value, max_elem,
> +                           type == BPF_MAP_TYPE_HASH ? BPF_F_NO_PREALLOC : 0);
>         if (fd < 0)
>                 printf("Failed to create hash map '%s'!\n", strerror(errno));
>
> @@ -12200,13 +12309,13 @@ static int create_prog_dummy2(int mfd, int idx)
>                                 ARRAY_SIZE(prog), "GPL", 0, NULL, 0);
>  }
>
> -static int create_prog_array(void)
> +static int create_prog_array(uint32_t max_elem, int p1key)
>  {
> -       int p1key = 0, p2key = 1;
> +       int p2key = 1;
>         int mfd, p1fd, p2fd;
>
>         mfd = bpf_create_map(BPF_MAP_TYPE_PROG_ARRAY, sizeof(int),
> -                            sizeof(int), 4, 0);
> +                            sizeof(int), max_elem, 0);
>         if (mfd < 0) {
>                 printf("Failed to create prog array '%s'!\n", strerror(errno));
>                 return -1;
> @@ -12261,7 +12370,9 @@ static void do_test_fixup(struct bpf_test *test, struct bpf_insn *prog,
>         int *fixup_map1 = test->fixup_map1;
>         int *fixup_map2 = test->fixup_map2;
>         int *fixup_map3 = test->fixup_map3;
> -       int *fixup_prog = test->fixup_prog;
> +       int *fixup_map4 = test->fixup_map4;
> +       int *fixup_prog1 = test->fixup_prog1;
> +       int *fixup_prog2 = test->fixup_prog2;
>         int *fixup_map_in_map = test->fixup_map_in_map;
>
>         if (test->fill_helper)
> @@ -12272,7 +12383,8 @@ static void do_test_fixup(struct bpf_test *test, struct bpf_insn *prog,
>          * that really matters is value size in this case.
>          */
>         if (*fixup_map1) {
> -               map_fds[0] = create_map(sizeof(long long), 1);
> +               map_fds[0] = create_map(BPF_MAP_TYPE_HASH, sizeof(long long),
> +                                       sizeof(long long), 1);
>                 do {
>                         prog[*fixup_map1].imm = map_fds[0];
>                         fixup_map1++;
> @@ -12280,7 +12392,8 @@ static void do_test_fixup(struct bpf_test *test, struct bpf_insn *prog,
>         }
>
>         if (*fixup_map2) {
> -               map_fds[1] = create_map(sizeof(struct test_val), 1);
> +               map_fds[1] = create_map(BPF_MAP_TYPE_HASH, sizeof(long long),
> +                                       sizeof(struct test_val), 1);
>                 do {
>                         prog[*fixup_map2].imm = map_fds[1];
>                         fixup_map2++;
> @@ -12288,25 +12401,43 @@ static void do_test_fixup(struct bpf_test *test, struct bpf_insn *prog,
>         }
>
>         if (*fixup_map3) {
> -               map_fds[1] = create_map(sizeof(struct other_val), 1);
> +               map_fds[2] = create_map(BPF_MAP_TYPE_HASH, sizeof(long long),
> +                                       sizeof(struct other_val), 1);
>                 do {
> -                       prog[*fixup_map3].imm = map_fds[1];
> +                       prog[*fixup_map3].imm = map_fds[2];
>                         fixup_map3++;
>                 } while (*fixup_map3);
>         }
>
> -       if (*fixup_prog) {
> -               map_fds[2] = create_prog_array();
> +       if (*fixup_map4) {
> +               map_fds[3] = create_map(BPF_MAP_TYPE_ARRAY, sizeof(int),
> +                                       sizeof(struct test_val), 1);
> +               do {
> +                       prog[*fixup_map4].imm = map_fds[3];
> +                       fixup_map4++;
> +               } while (*fixup_map4);
> +       }
> +
> +       if (*fixup_prog1) {
> +               map_fds[4] = create_prog_array(4, 0);
> +               do {
> +                       prog[*fixup_prog1].imm = map_fds[4];
> +                       fixup_prog1++;
> +               } while (*fixup_prog1);
> +       }
> +
> +       if (*fixup_prog2) {
> +               map_fds[5] = create_prog_array(8, 7);
>                 do {
> -                       prog[*fixup_prog].imm = map_fds[2];
> -                       fixup_prog++;
> -               } while (*fixup_prog);
> +                       prog[*fixup_prog2].imm = map_fds[5];
> +                       fixup_prog2++;
> +               } while (*fixup_prog2);
>         }
>
>         if (*fixup_map_in_map) {
> -               map_fds[3] = create_map_in_map();
> +               map_fds[6] = create_map_in_map();
>                 do {
> -                       prog[*fixup_map_in_map].imm = map_fds[3];
> +                       prog[*fixup_map_in_map].imm = map_fds[6];
>                         fixup_map_in_map++;
>                 } while (*fixup_map_in_map);
>         }
> --
> 2.9.5
>

^ permalink raw reply

* [PATCH bpf-next] bpf: Drop mpls from bpf_fib_lookup
From: dsahern @ 2018-05-29 17:58 UTC (permalink / raw)
  To: netdev, borkmann, ast; +Cc: David Ahern

From: David Ahern <dsahern@gmail.com>

MPLS support will not be submitted this dev cycle, but in working on it
I do see a few changes are needed to the API. For now, drop mpls from the
API. Since the fields in question are unions, the mpls fields can be added
back later without affecting the uapi.

Signed-off-by: David Ahern <dsahern@gmail.com>
---
 include/uapi/linux/bpf.h | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index cc68787f2d97..2dd440e39802 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1855,10 +1855,10 @@ union bpf_attr {
  *		If lookup is successful and result shows packet is to be
  *		forwarded, the neighbor tables are searched for the nexthop.
  *		If successful (ie., FIB lookup shows forwarding and nexthop
- *		is resolved), the nexthop address is returned in ipv4_dst,
- *		ipv6_dst or mpls_out based on family, smac is set to mac
- *		address of egress device, dmac is set to nexthop mac address,
- *		rt_metric is set to metric from route.
+ *		is resolved), the nexthop address is returned in ipv4_dst
+ *		or ipv6_dst based on family, smac is set to mac address of
+ *		egress device, dmac is set to nexthop mac address, rt_metric
+ *		is set to metric from route (IPv4/IPv6 only).
  *
  *             *plen* argument is the size of the passed in struct.
  *             *flags* argument can be one or more BPF_FIB_LOOKUP_ flags:
@@ -2538,8 +2538,10 @@ struct bpf_raw_tracepoint_args {
 #define BPF_FIB_LOOKUP_OUTPUT  BIT(1)
 
 struct bpf_fib_lookup {
-	/* input */
-	__u8	family;   /* network family, AF_INET, AF_INET6, AF_MPLS */
+	/* input:  network family for lookup (AF_INET, AF_INET6)
+	 * output: network family of egress nexthop
+	 */
+	__u8	family;
 
 	/* set if lookup is to consider L4 data - e.g., FIB rules */
 	__u8	l4_protocol;
@@ -2555,22 +2557,20 @@ struct bpf_fib_lookup {
 		__u8	tos;		/* AF_INET  */
 		__be32	flowlabel;	/* AF_INET6 */
 
-		/* output: metric of fib result */
-		__u32 rt_metric;
+		/* output: metric of fib result (IPv4/IPv6 only) */
+		__u32	rt_metric;
 	};
 
 	union {
-		__be32		mpls_in;
 		__be32		ipv4_src;
 		__u32		ipv6_src[4];  /* in6_addr; network order */
 	};
 
-	/* input to bpf_fib_lookup, *dst is destination address.
-	 * output: bpf_fib_lookup sets to gateway address
+	/* input to bpf_fib_lookup, ipv{4,6}_dst is destination address in
+	 * network header. output: bpf_fib_lookup sets to gateway address
+	 * if FIB lookup returns gateway route
 	 */
 	union {
-		/* return for MPLS lookups */
-		__be32		mpls_out[4];  /* support up to 4 labels */
 		__be32		ipv4_dst;
 		__u32		ipv6_dst[4];  /* in6_addr; network order */
 	};
-- 
2.11.0

^ permalink raw reply related

* Re: [PATCH net] sctp: not allow to set rto_min with a value below 200 msecs
From: Xin Long @ 2018-05-29 17:45 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Neal Cardwell, Michael Tuexen, Neil Horman, Dmitry Vyukov, Netdev,
	linux-sctp, David Miller, David Ahern, Eric Dumazet, syzkaller
In-Reply-To: <20180529170604.GD3788@localhost.localdomain>

On Wed, May 30, 2018 at 1:06 AM, Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
> On Tue, May 29, 2018 at 12:03:46PM -0400, Neal Cardwell wrote:
>> On Tue, May 29, 2018 at 11:45 AM Marcelo Ricardo Leitner <
>> marcelo.leitner@gmail.com> wrote:
>> > - patch2 - fix rtx attack vector
>> >    - Add the floor value to rto_min to HZ/20 (which fits the values
>> >      that Michael shared on the other email)
>>
>> I would encourage allowing minimum RTO values down to 5ms, if the ACK
>> policy in the receiver makes this feasible. Our experience is that in
>> datacenter environments it can be advantageous to allow timer-based loss
>> recoveries using timeout values as low as 5ms, e.g.:
>
> Thanks Neal. On Xin's tests, the hearbeat timer becomes an issue at
> ~25ms already. Xin, can you share more details on the hw, which CPU
> was used?
It was on a KVM guest,  "-smp 2,cores=1,threads=1,sockets=2"
# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             2
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 13
Model name:            QEMU Virtual CPU version 1.5.3
Stepping:              3
CPU MHz:               2397.222
BogoMIPS:              4794.44
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
NUMA node0 CPU(s):     0,1
Flags:                 fpu de pse tsc msr pae mce cx8 apic sep mtrr
pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good
nopl cpuid pni cx16 hypervisor lahf_lm abm pti

If we're counting on max_t to fix this CPU stuck. It should not that
matter if min rto < the value causing that stuck.

>
> Anyway, what about we add a floor to rto_max too, so that RTO can
> actually grow into something bigger that don't hog the CPU? Like:
> rto_min floor = 5ms
> rto_max floor = 50ms
>
>   Marcelo

^ permalink raw reply

* Re: [PATCH bpf 2/2] bpf: enforce usage of __aligned_u64 in the UAPI header
From: Song Liu @ 2018-05-29 17:35 UTC (permalink / raw)
  To: Eugene Syromiatnikov
  Cc: netdev, open list, Martin KaFai Lau, Daniel Borkmann,
	Alexei Starovoitov, David S. Miller, Jiri Olsa, Ingo Molnar,
	Lawrence Brakmo, Andrey Ignatov, Jakub Kicinski, John Fastabend,
	Dmitry V. Levin
In-Reply-To: <20180527112847.GA18311@asgard.redhat.com>

On Sun, May 27, 2018 at 4:28 AM, Eugene Syromiatnikov <esyr@redhat.com> wrote:
> Use __aligned_u64 instead of __u64 everywhere in the UAPI header,
> similarly to v4.17-rc1~94^2~58^2 "RDMA: Change all uapi headers to use
> __aligned_u64 instead of __u64".
>
> This commit doesn't change structure layouts, but imposes additional
> alignment requirements on struct bpf_stack_build_id, struct
> bpf_sock_ops, struct bpf_perf_event_value, and struct
> bpf_raw_tracepoint_args; of them only struct bpf_sock_ops and struct
> bpf_perf_event_value have 64-bit fields that were present in a released
> kernel without this additional alignment requirement (bytes_received
> and bytes_acked were added to struct bpf_sock_ops in commit
> v4.16-rc1~123^2~33^2~5^2~4, struct bpf_perf_event_value was added
> in commit v4.15-rc1~84^2~532^2~3).
>
> Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
> ---
>  include/uapi/linux/bpf.h       | 22 +++++++++++-----------
>  tools/include/uapi/linux/bpf.h | 22 +++++++++++-----------
>  2 files changed, 22 insertions(+), 22 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 903010a..174e99a 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -259,8 +259,8 @@ struct bpf_stack_build_id {
>         __s32           status;
>         unsigned char   build_id[BPF_BUILD_ID_SIZE];
>         union {
> -               __u64   offset;
> -               __u64   ip;
> +               __aligned_u64   offset;
> +               __aligned_u64   ip;
>         };
>  };
>
> @@ -288,7 +288,7 @@ union bpf_attr {
>                         __aligned_u64 value;
>                         __aligned_u64 next_key;
>                 };
> -               __u64           flags;
> +               __aligned_u64   flags;
>         };
>
>         struct { /* anonymous struct used by BPF_PROG_LOAD command */
> @@ -360,7 +360,7 @@ union bpf_attr {
>         } query;
>
>         struct {
> -               __u64 name;
> +               __aligned_u64 name;
>                 __u32 prog_fd;
>         } raw_tracepoint;
>  } __attribute__((aligned(8)));
> @@ -1011,7 +1011,7 @@ struct bpf_prog_info {
>         __u32 xlated_prog_len;
>         __aligned_u64 jited_prog_insns;
>         __aligned_u64 xlated_prog_insns;
> -       __u64 load_time;        /* ns since boottime */
> +       __aligned_u64 load_time;        /* ns since boottime */
>         __u32 created_by_uid;
>         __u32 nr_map_ids;
>         __aligned_u64 map_ids;
> @@ -1101,8 +1101,8 @@ struct bpf_sock_ops {
>         __u32 lost_out;
>         __u32 sacked_out;
>         __u32 sk_txhash;
> -       __u64 bytes_received;
> -       __u64 bytes_acked;
> +       __aligned_u64 bytes_received;
> +       __aligned_u64 bytes_acked;
>  };
>
>  /* Definitions for bpf_sock_ops_cb_flags */
> @@ -1189,9 +1189,9 @@ enum {
>  #define TCP_BPF_SNDCWND_CLAMP  1002    /* Set sndcwnd_clamp */
>
>  struct bpf_perf_event_value {
> -       __u64 counter;
> -       __u64 enabled;
> -       __u64 running;
> +       __aligned_u64 counter;
> +       __aligned_u64 enabled;
> +       __aligned_u64 running;
>  };
>
>  #define BPF_DEVCG_ACC_MKNOD    (1ULL << 0)
> @@ -1209,7 +1209,7 @@ struct bpf_cgroup_dev_ctx {
>  };
>
>  struct bpf_raw_tracepoint_args {
> -       __u64 args[0];
> +       __aligned_u64 args[0];
>  };
>
>  #endif /* _UAPI__LINUX_BPF_H__ */
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 903010a..174e99a 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -259,8 +259,8 @@ struct bpf_stack_build_id {
>         __s32           status;
>         unsigned char   build_id[BPF_BUILD_ID_SIZE];
>         union {
> -               __u64   offset;
> -               __u64   ip;
> +               __aligned_u64   offset;
> +               __aligned_u64   ip;
>         };
>  };
>
> @@ -288,7 +288,7 @@ union bpf_attr {
>                         __aligned_u64 value;
>                         __aligned_u64 next_key;
>                 };
> -               __u64           flags;
> +               __aligned_u64   flags;
>         };
>
>         struct { /* anonymous struct used by BPF_PROG_LOAD command */
> @@ -360,7 +360,7 @@ union bpf_attr {
>         } query;
>
>         struct {
> -               __u64 name;
> +               __aligned_u64 name;
>                 __u32 prog_fd;
>         } raw_tracepoint;
>  } __attribute__((aligned(8)));
> @@ -1011,7 +1011,7 @@ struct bpf_prog_info {
>         __u32 xlated_prog_len;
>         __aligned_u64 jited_prog_insns;
>         __aligned_u64 xlated_prog_insns;
> -       __u64 load_time;        /* ns since boottime */
> +       __aligned_u64 load_time;        /* ns since boottime */
>         __u32 created_by_uid;
>         __u32 nr_map_ids;
>         __aligned_u64 map_ids;
> @@ -1101,8 +1101,8 @@ struct bpf_sock_ops {
>         __u32 lost_out;
>         __u32 sacked_out;
>         __u32 sk_txhash;
> -       __u64 bytes_received;
> -       __u64 bytes_acked;
> +       __aligned_u64 bytes_received;
> +       __aligned_u64 bytes_acked;
>  };
>
>  /* Definitions for bpf_sock_ops_cb_flags */
> @@ -1189,9 +1189,9 @@ enum {
>  #define TCP_BPF_SNDCWND_CLAMP  1002    /* Set sndcwnd_clamp */
>
>  struct bpf_perf_event_value {
> -       __u64 counter;
> -       __u64 enabled;
> -       __u64 running;
> +       __aligned_u64 counter;
> +       __aligned_u64 enabled;
> +       __aligned_u64 running;
>  };
>
>  #define BPF_DEVCG_ACC_MKNOD    (1ULL << 0)
> @@ -1209,7 +1209,7 @@ struct bpf_cgroup_dev_ctx {
>  };
>
>  struct bpf_raw_tracepoint_args {
> -       __u64 args[0];
> +       __aligned_u64 args[0];
>  };
>
>  #endif /* _UAPI__LINUX_BPF_H__ */
> --
> 2.1.4
>

I think these changes are not necessary. Is it a general guidance to
only use 64-bit aligned
variables in UAPI headers?

Thanks,
Song

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox