Netdev List


* Re: [PATCH bpf-next 04/11] bpf: show prog and map id in fdinfo
From: Song Liu @ 2018-05-30 16:15 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Jesper Dangaard Brouer, Networking
In-Reply-To: <c7a3a2bb-fe79-66cb-159e-b5680f53910f@iogearbox.net>

On Tue, May 29, 2018 at 12:55 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 05/29/2018 07:27 PM, Jesper Dangaard Brouer wrote:
>> On Mon, 28 May 2018 02:43:37 +0200
>> Daniel Borkmann <daniel@iogearbox.net> wrote:
>>
>>> It's trivial and straightforward to expose it for scripts that can
>>> then use it along with bpftool in order to inspect an individual
>>> application's used maps and progs. Right now we dump some basic
>>> information in the fdinfo file but with the help of the map/prog
>>> id full introspection becomes possible now.
>>>
>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>>> Acked-by: Alexei Starovoitov <ast@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

>>
>> AFAICR iproute uses this proc fdinfo, for pinned maps.  Have you tested
>> if this change is handled gracefully by tc ?
>
> Yep, it works just fine, I also tested it before submission.
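
For readers following the thread: the scripting use case in the commit message comes down to pulling one numeric field out of /proc/<pid>/fdinfo/<fd>. Below is a minimal sketch of that parsing step, not taken from the patch. The key names "prog_id" and "map_id" are assumed from the patch subject, and the helper name fdinfo_get_u32() is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan fdinfo-style text for "<key>:" at the start of a line and parse the
 * unsigned value that follows. Returns 0 on success, -1 if the key is
 * missing or has no number after it.
 * Key names such as "prog_id" are an assumption based on the patch subject.
 */
static int fdinfo_get_u32(const char *text, const char *key, unsigned int *out)
{
	const char *line = text;
	size_t klen;

	assert(text && key && out);
	klen = strlen(key);

	while (line && *line) {
		if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
			/* %u skips the tab between the colon and the value */
			if (sscanf(line + klen + 1, "%u", out) == 1)
				return 0;
			return -1;
		}
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}
	return -1;
}
```

A script-like caller would read the fdinfo file into a buffer, call fdinfo_get_u32(buf, "prog_id", &id), and then hand the id to bpftool.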


* Re: [PATCH bpf-next 10/11] bpf: sync bpf uapi header with tools
From: Song Liu @ 2018-05-30 16:10 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, Networking
In-Reply-To: <20180528004344.3606-11-daniel@iogearbox.net>

On Sun, May 27, 2018 at 5:43 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> Pull in recent changes from include/uapi/linux/bpf.h.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: Alexei Starovoitov <ast@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  tools/include/uapi/linux/bpf.h | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 9b8c6e3..7108711 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -2004,6 +2004,20 @@ union bpf_attr {
>   *             direct packet access.
>   *     Return
>   *             0 on success, or a negative error in case of failure.
> + *
> + * uint64_t bpf_skb_cgroup_id(struct sk_buff *skb)
> + *     Description
> + *             Return the cgroup v2 id of the socket associated with the *skb*.
> + *             This is roughly similar to the **bpf_get_cgroup_classid**\ ()
> + *             helper for cgroup v1 by providing a tag resp. identifier that
> + *             can be matched on or used for map lookups e.g. to implement
> + *             policy. The cgroup v2 id of a given path in the hierarchy is
> + *             exposed in user space through the f_handle API in order to get
> + *             to the same 64-bit id.
> + *
> + *             This helper can be used on TC egress path, but not on ingress.
> + *     Return
> + *             The id is returned or 0 in case the id could not be retrieved.
>   */
>  #define __BPF_FUNC_MAPPER(FN)          \
>         FN(unspec),                     \
> @@ -2082,7 +2096,8 @@ union bpf_attr {
>         FN(lwt_push_encap),             \
>         FN(lwt_seg6_store_bytes),       \
>         FN(lwt_seg6_adjust_srh),        \
> -       FN(lwt_seg6_action),
> +       FN(lwt_seg6_action),            \
> +       FN(skb_cgroup_id),
>
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>   * function eBPF program intends to call
> @@ -2199,7 +2214,7 @@ struct bpf_tunnel_key {
>         };
>         __u8 tunnel_tos;
>         __u8 tunnel_ttl;
> -       __u16 tunnel_ext;
> +       __u16 tunnel_ext;       /* Padding, future use. */
>         __u32 tunnel_label;
>  };
>
> @@ -2210,6 +2225,7 @@ struct bpf_xfrm_state {
>         __u32 reqid;
>         __u32 spi;      /* Stored in network byte order */
>         __u16 family;
> +       __u16 ext;      /* Padding, future use. */
>         union {
>                 __u32 remote_ipv4;      /* Stored in network byte order */
>                 __u32 remote_ipv6[4];   /* Stored in network byte order */
> --
> 2.9.5
>
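
The hunk that appends FN(skb_cgroup_id) relies on the X-macro pattern: one list of names is expanded several times with different FN definitions. The sketch below shows the idea with a toy list. All DEMO_* names are made up for illustration and the three entries are not the real helper list.

```c
#include <assert.h>
#include <string.h>

/* Toy mapper: three illustrative entries, not the real kernel helper list. */
#define DEMO_FUNC_MAPPER(FN)	\
	FN(unspec),		\
	FN(lookup),		\
	FN(skb_cgroup_id),

/* Expansion 1: one enum constant per entry, numbered from 0 in list order. */
#define DEMO_ENUM_FN(x) DEMO_FUNC_##x
enum demo_func_id {
	DEMO_FUNC_MAPPER(DEMO_ENUM_FN)
	DEMO_FUNC_MAX_ID,
};
#undef DEMO_ENUM_FN

/* Expansion 2: a name table indexed by the same ids. */
#define DEMO_NAME_FN(x) [DEMO_FUNC_##x] = #x
static const char *const demo_func_names[] = {
	DEMO_FUNC_MAPPER(DEMO_NAME_FN)
};
#undef DEMO_NAME_FN

/* Look up the printable name for an id generated by the mapper. */
static const char *demo_func_name(enum demo_func_id id)
{
	assert(id < DEMO_FUNC_MAX_ID);
	return demo_func_names[id];
}
```

Because ids follow list order, appending a new entry at the end, as the patch does, leaves every existing id unchanged.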


* Re: [RFC net-next 0/4] net: sched: support replay of filter offload when binding to block
From: Or Gerlitz @ 2018-05-30 15:59 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: John Hurley, Linux Netdev List, Jiri Pirko, Samudrala, Sridhar,
	oss-drivers, Rabie Loulou
In-Reply-To: <20180528130202.16981241@cakuba>

On Mon, May 28, 2018 at 11:02 PM, Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
> On Mon, 28 May 2018 13:48:28 +0300, Or Gerlitz wrote:
>> On Fri, May 25, 2018 at 5:25 AM, Jakub Kicinski wrote:
>> > This series from John adds the ability to replay filter offload requests
>> > when new offload callback is being registered on a TC block.  This is most
>> > likely to take place for shared blocks today, when a block which already
>> > has rules is bound to another interface.  Prior to this patch set if any
>> > of the rules were offloaded the block bind would fail.
>>
>> Can you elaborate a little further here? is this something that you are planning
>> to use for the uplink LAG use-case? AFAIU if we apply share-block to nfp as
>> things are prior to this patch, it would work, so there's a case where
>> it doesn't and this is now handled with the series?
>
> Just looking at things as they stand today, no bond/forward looking
> plans - nfp "supports" shared blocks by registering multiple callbacks
> to the block.  There are two problems:
>
> (a) one can't install a second callback if some rules are already
>     offloaded because of:
>
>         /* At this point, playback of previous block cb calls is not supported,
>          * so forbid to register to block which already has some offloaded
>          * filters present.
>          */
>         if (tcf_block_offload_in_use(block))
>                 return ERR_PTR(-EOPNOTSUPP);
>
>     in __tcf_block_cb_register(), so block sharing has to be set up
>     before any rules are added.
>
> (b) when block is unshared filters are not removed today and driver
>     would have to sweep its rule table, as John notes.  It's not a big
>     deal but this series fixes it nicely in the core, too.

OK, thanks for clarifying these two points, very helpful.

> Looking forward there are two things we can use shared blocks for: we
> can try to teach user space to share ingress blocks on all legs of bonds
> instead of trying to propagate the rules from the bond down in the
> kernel, which is more tricky to get right.  We will need reliable
> replay for that, because we want new links to be able to join and leave
> the bond when rules are already present.

> Second use case, which is more far fetched, is trying to discover and
> register callbacks for blocks of tunnel devices directly, and avoid the
> egdev infrastructure...

> We should discuss the above further, but regardless, I think this
> patchset is quite a nice addition on its own.  Would you agree?

yes, it sounds good, but I need to look deeper, a bit behind on that :(

Or.
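
As a reading aid for problem (a) quoted above: "replay" means that a callback registered late is shown every filter already offloaded on the block. The following is a toy user-space model of that idea, not the kernel implementation. The toy_* types and the roll-back-on-failure behaviour are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FILTERS 8

/* Hypothetical callback type: returns 0 on success, negative on failure. */
typedef int (*toy_block_cb)(int filter_id, int add, void *priv);

struct toy_block {
	int filters[TOY_MAX_FILTERS];	/* ids of filters already offloaded */
	size_t nfilters;
};

/*
 * Replay every existing filter to a newly bound callback. If one add fails,
 * remove the filters replayed so far and return the error, so the new
 * callback ends up with either all rules or none.
 */
static int toy_block_replay(const struct toy_block *block, toy_block_cb cb,
			    void *priv)
{
	size_t i;
	int err;

	assert(block && cb && block->nfilters <= TOY_MAX_FILTERS);

	for (i = 0; i < block->nfilters; i++) {
		err = cb(block->filters[i], 1, priv);
		if (err) {
			while (i-- > 0)
				cb(block->filters[i], 0, priv);
			return err;
		}
	}
	return 0;
}

/* Example driver state: counts installed rules, rejects ids above a limit. */
struct toy_driver {
	int installed;
	int max_id;
};

static int toy_driver_cb(int filter_id, int add, void *priv)
{
	struct toy_driver *drv = priv;

	if (!add) {
		drv->installed--;
		return 0;
	}
	if (filter_id > drv->max_id)
		return -1;	/* stand-in for an errno such as -EOPNOTSUPP */
	drv->installed++;
	return 0;
}
```

In this model a bind to a block that already has rules succeeds when the new callback accepts them all, and otherwise fails without leaving partial state behind.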


* Re: [PATCH net-next v4 7/8] net: bridge: Notify about bridge VLANs
From: Nikolay Aleksandrov @ 2018-05-30 15:58 UTC (permalink / raw)
  To: Petr Machata, netdev, devel, bridge
  Cc: jiri, idosch, davem, razvan.stefanescu, gregkh, stephen, andrew,
	vivien.didelot, f.fainelli, dan.carpenter
In-Reply-To: <583d583ad158363411fde87dbf6709024714e498.1527641426.git.petrm@mellanox.com>

On 30/05/18 04:00, Petr Machata wrote:
> A driver might need to react to changes in settings of brentry VLANs.
> Therefore send switchdev port notifications for these as well. Reuse
> SWITCHDEV_OBJ_ID_PORT_VLAN for this purpose. Listeners should use
> netif_is_bridge_master() on orig_dev to determine whether the
> notification is about a bridge port or a bridge.
> 
> Signed-off-by: Petr Machata <petrm@mellanox.com>
> ---
>  net/bridge/br_vlan.c | 28 +++++++++++++++++++++++++---
>  1 file changed, 25 insertions(+), 3 deletions(-)
> 

LGTM,
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>


* Re: [PATCH mlx5-next v2 11/13] IB/mlx5: Add flow counters binding support
From: Jason Gunthorpe @ 2018-05-30 15:35 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Leon Romanovsky, Doug Ledford, Leon Romanovsky, RDMA mailing list,
	Boris Pismenny, Matan Barak, Raed Salem, Yishai Hadas,
	Saeed Mahameed, linux-netdev, Alex Rosenbaum
In-Reply-To: <0dec7cc6-5715-4513-d55b-c53271c4fbee@dev.mellanox.co.il>

On Wed, May 30, 2018 at 06:24:00PM +0300, Yishai Hadas wrote:
> On 5/30/2018 6:06 PM, Jason Gunthorpe wrote:
> >On Wed, May 30, 2018 at 03:31:34PM +0300, Yishai Hadas wrote:
> >>On 5/29/2018 10:56 PM, Jason Gunthorpe wrote:
> >>>On Tue, May 29, 2018 at 04:09:15PM +0300, Leon Romanovsky wrote:
> >>>>diff --git a/include/uapi/rdma/mlx5-abi.h b/include/uapi/rdma/mlx5-abi.h
> >>>>index 508ea8c82da7..ef3f430a7050 100644
> >>>>+++ b/include/uapi/rdma/mlx5-abi.h
> >>>>@@ -443,4 +443,18 @@ enum {
> >>>>  enum {
> >>>>  	MLX5_IB_CLOCK_INFO_V1              = 0,
> >>>>  };
> >>>>+
> >>>>+struct mlx5_ib_flow_counters_data {
> >>>>+	__aligned_u64   counters_data;
> >>>>+	__u32   ncounters;
> >>>>+	__u32   reserved;
> >>>>+};
> >>>>+
> >>>>+struct mlx5_ib_create_flow {
> >>>>+	__u32   ncounters_data;
> >>>>+	__u32   reserved;
> >>>>+	/* Following are counters data based on ncounters_data */
> >>>>+	struct mlx5_ib_flow_counters_data data[];
> >>>>+};
> >>>>+
> >>>>  #endif /* MLX5_ABI_USER_H */
> >>>
> >>>This uapi thing still needs to be fixed as I pointed out before.
> >>
> >>In V3 we can go with below, no change in memory layout but it can clarify
> >>the code/usage.
> >>
> >>struct mlx5_ib_flow_counters_desc {
> >>         __u32   description;
> >>         __u32   index;
> >>};
> >>
> >>struct mlx5_ib_flow_counters_data {
> >>         RDMA_UAPI_PTR(struct mlx5_ib_flow_counters_desc *, counters_data);
> >>         __u32   ncounters;
> >>         __u32   reserved;
> >>};
> >
> >OK, this is what I asked for originally..
> >
> >>struct mlx5_ib_create_flow {
> >>         __u32   ncounters_data;
> >>         __u32   reserved;
> >>         /* Following are counters data based on ncounters_data */
> >>         struct mlx5_ib_flow_counters_data data[];
> >>
> >>
> >>>I still can't figure out why this should be a 2d array.
> >>
> >>This comes to support the future case of multiple counters objects/specs
> >>passed with the same flow. There is a need to differentiate mapping data for
> >>each counters object and that is done via the 'ncounters_data' field and the
> >>2d array.
> >
> >This still doesn't make any sense to me. How are these multiple
> >counters objects/specs going to be identified?
> >
> >Basically, what does the array index for data[] mean? Should it match
> >the spec index from the main verb or something?
> >
> 
> Each entry in the data[] should match a corresponding counter object that
> was pointed by a counters spec upon the flow creation.

What if there is a mixture of specs, some with counters and some
without?

The index is just matching the index of the spec? That makes sense.

> >This is a good place for a comment, since the intention is completely
> >opaque here.
> 
> Sure, we'll add comment to clarify the above.

Sure, can leave the flex array then too

Jason
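
To make the "no change in memory layout" claim and the data[] index question concrete, here is a small user-space sketch. It models the typed-pointer field with a hand-written union rather than the real RDMA_UAPI_PTR macro. All demo_* names are made up, and reading data[i] as "the i-th counters spec" is only the interpretation discussed above, not a confirmed ABI rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct demo_counters_desc {
	uint32_t description;
	uint32_t index;
};

/* Posted form: the user pointer travels as a bare 64-bit integer. */
struct demo_counters_data_u64 {
	_Alignas(8) uint64_t counters_data;
	uint32_t ncounters;
	uint32_t reserved;
};

/* Proposed form: the same 8 bytes, typed as a pointer to descriptors. */
struct demo_counters_data {
	union {
		struct demo_counters_desc *counters_data;
		_Alignas(8) uint64_t counters_data_u64;
	};
	uint32_t ncounters;
	uint32_t reserved;
};

static_assert(sizeof(struct demo_counters_data) ==
	      sizeof(struct demo_counters_data_u64), "size must not change");
static_assert(offsetof(struct demo_counters_data, ncounters) ==
	      offsetof(struct demo_counters_data_u64, ncounters),
	      "field offsets must not change");

struct demo_create_flow {
	uint32_t ncounters_data;
	uint32_t reserved;
	struct demo_counters_data data[];
};

/* Allocate a zeroed request with room for n data[] entries. */
static struct demo_create_flow *demo_create_flow_alloc(uint32_t n)
{
	struct demo_create_flow *flow;

	flow = calloc(1, sizeof(*flow) + (size_t)n * sizeof(flow->data[0]));
	if (flow)
		flow->ncounters_data = n;
	return flow;
}

/* Assumed meaning of the index: data[i] belongs to the i-th counters spec. */
static struct demo_counters_data *
demo_data_for_spec(struct demo_create_flow *flow, uint32_t spec_index)
{
	return spec_index < flow->ncounters_data ? &flow->data[spec_index] : NULL;
}
```

The static assertions are the point of the sketch: swapping the bare u64 for a typed pointer overlay changes how the field reads in code, not where it sits in memory.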


* [PATCH] b53: Add brcm5389 support
From: Damien Thébault @ 2018-05-30 15:33 UTC (permalink / raw)
  To: vivien.didelot@savoirfairelinux.com, f.fainelli@gmail.com,
	andrew@lunn.ch
  Cc: davem@davemloft.net, netdev@vger.kernel.org

This patch adds support for the BCM5389 switch connected through MDIO.

Signed-off-by: Damien Thébault <damien.thebault@vitec.com>
---
 drivers/net/dsa/b53/b53_common.c | 13 +++++++++++++
 drivers/net/dsa/b53/b53_mdio.c   |  5 ++++-
 drivers/net/dsa/b53/b53_priv.h   |  1 +
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 78616787f2a3..3da5fca77cbd 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1711,6 +1711,18 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.cpu_port = B53_CPU_PORT_25,
 		.duplex_reg = B53_DUPLEX_STAT_FE,
 	},
+	{
+		.chip_id = BCM5389_DEVICE_ID,
+		.dev_name = "BCM5389",
+		.vlans = 4096,
+		.enabled_ports = 0x1f,
+		.arl_entries = 4,
+		.cpu_port = B53_CPU_PORT,
+		.vta_regs = B53_VTA_REGS,
+		.duplex_reg = B53_DUPLEX_STAT_GE,
+		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
+		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+	},
 	{
 		.chip_id = BCM5395_DEVICE_ID,
 		.dev_name = "BCM5395",
@@ -2034,6 +2046,7 @@ int b53_switch_detect(struct b53_device *dev)
 		else
 			dev->chip_id = BCM5365_DEVICE_ID;
 		break;
+	case BCM5389_DEVICE_ID:
 	case BCM5395_DEVICE_ID:
 	case BCM5397_DEVICE_ID:
 	case BCM5398_DEVICE_ID:
diff --git a/drivers/net/dsa/b53/b53_mdio.c b/drivers/net/dsa/b53/b53_mdio.c
index fa7556f5d4fb..a533a90e3904 100644
--- a/drivers/net/dsa/b53/b53_mdio.c
+++ b/drivers/net/dsa/b53/b53_mdio.c
@@ -285,6 +285,7 @@ static const struct b53_io_ops b53_mdio_ops = {
 #define B53_BRCM_OUI_1	0x0143bc00
 #define B53_BRCM_OUI_2	0x03625c00
 #define B53_BRCM_OUI_3	0x00406000
+#define B53_BRCM_OUI_4	0x01410c00
 
 static int b53_mdio_probe(struct mdio_device *mdiodev)
 {
@@ -311,7 +312,8 @@ static int b53_mdio_probe(struct mdio_device *mdiodev)
 	 */
 	if ((phy_id & 0xfffffc00) != B53_BRCM_OUI_1 &&
 	    (phy_id & 0xfffffc00) != B53_BRCM_OUI_2 &&
-	    (phy_id & 0xfffffc00) != B53_BRCM_OUI_3) {
+	    (phy_id & 0xfffffc00) != B53_BRCM_OUI_3 &&
+	    (phy_id & 0xfffffc00) != B53_BRCM_OUI_4) {
 		dev_err(&mdiodev->dev, "Unsupported device: 0x%08x\n", phy_id);
 		return -ENODEV;
 	}
@@ -360,6 +362,7 @@ static const struct of_device_id b53_of_match[] = {
 	{ .compatible = "brcm,bcm53125" },
 	{ .compatible = "brcm,bcm53128" },
 	{ .compatible = "brcm,bcm5365" },
+	{ .compatible = "brcm,bcm5389" },
 	{ .compatible = "brcm,bcm5395" },
 	{ .compatible = "brcm,bcm5397" },
 	{ .compatible = "brcm,bcm5398" },
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index 1187ebd79287..3b57f47d0e79 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -48,6 +48,7 @@ struct b53_io_ops {
 enum {
 	BCM5325_DEVICE_ID = 0x25,
 	BCM5365_DEVICE_ID = 0x65,
+	BCM5389_DEVICE_ID = 0x89,
 	BCM5395_DEVICE_ID = 0x95,
 	BCM5397_DEVICE_ID = 0x97,
 	BCM5398_DEVICE_ID = 0x98,
-- 
2.17.0
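
The probe-time check extended by this patch masks off the low 10 bits of the PHY id and compares the rest against a list of known OUIs. A table-driven sketch of the same test follows. The OUI values are copied from the diff, while the demo_* names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_OUI_MASK 0xfffffc00u

/* OUI values copied from the diff; the fourth is the one being added. */
static const uint32_t demo_brcm_ouis[] = {
	0x0143bc00, 0x03625c00, 0x00406000, 0x01410c00,
};

/* The mask clears the low 10 bits, so table entries must have them clear. */
static_assert((0x01410c00 & ~DEMO_OUI_MASK) == 0,
	      "OUI constants must have the low 10 bits clear");

/* Return true if the masked PHY id matches any known OUI. */
static bool demo_is_supported_phy(uint32_t phy_id)
{
	size_t i;

	for (i = 0; i < sizeof(demo_brcm_ouis) / sizeof(demo_brcm_ouis[0]); i++)
		if ((phy_id & DEMO_OUI_MASK) == demo_brcm_ouis[i])
			return true;
	return false;
}
```

The patch itself keeps the chained != comparisons; the table form is only a compact way to show what those comparisons compute.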


* [PATCH v2 iproute2-next] ip route: print RTA_CACHEINFO if it exists
From: dsahern @ 2018-05-30 15:30 UTC (permalink / raw)
  To: netdev, stephen; +Cc: David Ahern

From: David Ahern <dsahern@gmail.com>

RTA_CACHEINFO can be sent for non-cloned routes. If the attribute is
present, print it. This allows route dumps to print, for example, expiry
times, which can exist on FIB entries.

Signed-off-by: David Ahern <dsahern@gmail.com>
---
v2
- leave print_cache_flags under r->rtm_flags & RTM_F_CLONED check

 ip/iproute.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/ip/iproute.c b/ip/iproute.c
index 56dd9f25e38e..254d7abd2abf 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -899,17 +899,14 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 			   rta_getattr_u32(tb[RTA_UID]));
 
 	if (r->rtm_family == AF_INET) {
-		if (r->rtm_flags & RTM_F_CLONED) {
+		if (r->rtm_flags & RTM_F_CLONED)
 			print_cache_flags(fp, r->rtm_flags);
 
-			if (tb[RTA_CACHEINFO])
-				print_rta_cacheinfo(fp, RTA_DATA(tb[RTA_CACHEINFO]));
-		}
+		if (tb[RTA_CACHEINFO])
+			print_rta_cacheinfo(fp, RTA_DATA(tb[RTA_CACHEINFO]));
 	} else if (r->rtm_family == AF_INET6) {
-		if (r->rtm_flags & RTM_F_CLONED) {
-			if (tb[RTA_CACHEINFO])
-				print_rta_cacheinfo(fp, RTA_DATA(tb[RTA_CACHEINFO]));
-		}
+		if (tb[RTA_CACHEINFO])
+			print_rta_cacheinfo(fp, RTA_DATA(tb[RTA_CACHEINFO]));
 	}
 
 	if (tb[RTA_METRICS])
-- 
2.11.0
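
The behavioural change in this diff can be stated as two predicates: before, cacheinfo was printed only for cloned routes; after, it is printed whenever the attribute is present. A sketch of that logic follows, with made-up demo_* names. DEMO_RTM_F_CLONED mirrors the value of RTM_F_CLONED in linux/rtnetlink.h.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_RTM_F_CLONED 0x200	/* same value as RTM_F_CLONED */

/* Old behaviour: cacheinfo printed only when the route is cloned. */
static bool demo_prints_cacheinfo_before(unsigned int rtm_flags,
					 bool has_cacheinfo)
{
	return (rtm_flags & DEMO_RTM_F_CLONED) && has_cacheinfo;
}

/* New behaviour: the attribute alone decides, flags are ignored. */
static bool demo_prints_cacheinfo_after(unsigned int rtm_flags,
					bool has_cacheinfo)
{
	(void)rtm_flags;
	return has_cacheinfo;
}

/* Unchanged in v2: cache flags stay behind the cloned check, IPv4 only. */
static bool demo_prints_cache_flags(unsigned int rtm_flags, bool is_ipv4)
{
	return is_ipv4 && (rtm_flags & DEMO_RTM_F_CLONED);
}

/* The case the patch targets: a non-cloned FIB entry with RTA_CACHEINFO. */
static void demo_check_cacheinfo(void)
{
	assert(!demo_prints_cacheinfo_before(0, true));
	assert(demo_prints_cacheinfo_after(0, true));
	assert(!demo_prints_cache_flags(0, true));
}
```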


* Re: [PATCH mlx5-next v2 11/13] IB/mlx5: Add flow counters binding support
From: Yishai Hadas @ 2018-05-30 15:24 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Doug Ledford, Leon Romanovsky, RDMA mailing list,
	Boris Pismenny, Matan Barak, Raed Salem, Yishai Hadas,
	Saeed Mahameed, linux-netdev, Alex Rosenbaum
In-Reply-To: <20180530150608.GA30754@ziepe.ca>

On 5/30/2018 6:06 PM, Jason Gunthorpe wrote:
> On Wed, May 30, 2018 at 03:31:34PM +0300, Yishai Hadas wrote:
>> On 5/29/2018 10:56 PM, Jason Gunthorpe wrote:
>>> On Tue, May 29, 2018 at 04:09:15PM +0300, Leon Romanovsky wrote:
>>>> diff --git a/include/uapi/rdma/mlx5-abi.h b/include/uapi/rdma/mlx5-abi.h
>>>> index 508ea8c82da7..ef3f430a7050 100644
>>>> +++ b/include/uapi/rdma/mlx5-abi.h
>>>> @@ -443,4 +443,18 @@ enum {
>>>>   enum {
>>>>   	MLX5_IB_CLOCK_INFO_V1              = 0,
>>>>   };
>>>> +
>>>> +struct mlx5_ib_flow_counters_data {
>>>> +	__aligned_u64   counters_data;
>>>> +	__u32   ncounters;
>>>> +	__u32   reserved;
>>>> +};
>>>> +
>>>> +struct mlx5_ib_create_flow {
>>>> +	__u32   ncounters_data;
>>>> +	__u32   reserved;
>>>> +	/* Following are counters data based on ncounters_data */
>>>> +	struct mlx5_ib_flow_counters_data data[];
>>>> +};
>>>> +
>>>>   #endif /* MLX5_ABI_USER_H */
>>>
>>> This uapi thing still needs to be fixed as I pointed out before.
>>
>> In V3 we can go with below, no change in memory layout but it can clarify
>> the code/usage.
>>
>> struct mlx5_ib_flow_counters_desc {
>>          __u32   description;
>>          __u32   index;
>> };
>>
>> struct mlx5_ib_flow_counters_data {
>>          RDMA_UAPI_PTR(struct mlx5_ib_flow_counters_desc *, counters_data);
>>          __u32   ncounters;
>>          __u32   reserved;
>> };
> 
> OK, this is what I asked for originally..
> 
>> struct mlx5_ib_create_flow {
>>          __u32   ncounters_data;
>>          __u32   reserved;
>>          /* Following are counters data based on ncounters_data */
>>          struct mlx5_ib_flow_counters_data data[];
>>
>>
>>> I still can't figure out why this should be a 2d array.
>>
>> This comes to support the future case of multiple counters objects/specs
>> passed with the same flow. There is a need to differentiate mapping data for
>> each counters object and that is done via the 'ncounters_data' field and the
>> 2d array.
> 
> This still doesn't make any sense to me. How are these multiple
> counters objects/specs going to be identified?
> 
> Basically, what does the array index for data[] mean? Should it match
> the spec index from the main verb or something?
> 

Each entry in the data[] should match a corresponding counter object 
that was pointed by a counters spec upon the flow creation.

> This is a good place for a comment, since the intention is completely
> opaque here.

Sure, we'll add comment to clarify the above.


* Re: [PATCH iproute2-next] ipaddress: Add support for address metric
From: David Ahern @ 2018-05-30 15:22 UTC (permalink / raw)
  To: dsahern, netdev; +Cc: roopa
In-Reply-To: <20180527151000.30488-9-dsahern@kernel.org>

On 5/27/18 9:10 AM, dsahern@kernel.org wrote:
> From: David Ahern <dsahern@gmail.com>
> 
> Add support for IFA_RT_PRIORITY using the same keywords as iproute for
> RTA_PRIORITY.
> 
> Signed-off-by: David Ahern <dsahern@gmail.com>
> ---
>  include/uapi/linux/if_addr.h |  1 +
>  ip/ipaddress.c               | 15 ++++++++++++++-
>  man/man8/ip-address.8.in     |  6 ++++++
>  3 files changed, 21 insertions(+), 1 deletion(-)

applied to iproute2-next.


* Re: [PATCH] [net-next, wrong] make BPFILTER_UMH depend on X86
From: Alexei Starovoitov @ 2018-05-30 15:17 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David S. Miller, Alexei Starovoitov, Masahiro Yamada,
	linux-kbuild, netdev, linux-kernel
In-Reply-To: <20180528153222.2037158-1-arnd@arndb.de>

On Mon, May 28, 2018 at 05:31:01PM +0200, Arnd Bergmann wrote:
> When build testing across architectures, I run into a build error on
> all targets other than X86:
> 
> gcc-8.1.0-nolibc/arm-linux-gnueabi/bin/arm-linux-gnueabi-objdump: net/bpfilter/bpfilter_umh: File format not recognized
> gcc-8.1.0-nolibc/arm-linux-gnueabi/bin/arm-linux-gnueabi-objcopy:net/bpfilter/bpfilter_umh.o: Invalid bfd target
> 
> The problem is that 'hostprogs' get built with 'gcc' rather than
> '$(CROSS_COMPILE)gcc', and my default gcc (as most people's) targets x86.
> 
> To work around it, adding an X86 dependency gets randconfigs building
> again on my box.
> 
> Clearly, this is not a good solution, since it should actually work fine
> when building native kernels on other architectures but that is now
> disabled, while cross building an x86 kernel on another host is still
> broken after my patch.
> 
> What we probably want here is to try out if the compiler is able to build
> executables for the target architecture and not build the helper otherwise,
> at least when compile-testing. No idea how to do that though.
> 
> Link: http://www.kernel.org/pub/tools/crosstool/
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: linux-kbuild@vger.kernel.org
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  net/bpfilter/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/bpfilter/Kconfig b/net/bpfilter/Kconfig
> index 60725c5f79db..61cc4fcbb4d0 100644
> --- a/net/bpfilter/Kconfig
> +++ b/net/bpfilter/Kconfig
> @@ -9,6 +9,7 @@ menuconfig BPFILTER
>  if BPFILTER
>  config BPFILTER_UMH
>  	tristate "bpfilter kernel module with user mode helper"
> +	depends on X86 # actually depends on native builds

depends on X86 will break it on arm.
I think the better short term fix would be to test that HOSTCC == CC.
It doesn't have to be the same compiler, only HOSTCC's arch == kernel ARCH.
Not sure how to hack the makefile to do that.
Long term we need to get rid of the HOSTCC dependency.


* Re: [PATCH net-next v4 2/8] net: bridge: Extract br_vlan_add_existing()
From: Nikolay Aleksandrov @ 2018-05-30 15:15 UTC (permalink / raw)
  To: Petr Machata, netdev, devel, bridge
  Cc: f.fainelli, andrew, gregkh, vivien.didelot, idosch, jiri,
	razvan.stefanescu, davem, dan.carpenter
In-Reply-To: <3f7f1f0580acedf385d993304d53d370a60410c2.1527641426.git.petrm@mellanox.com>

On 30/05/18 03:56, Petr Machata wrote:
> Extract the code that deals with adding a preexisting VLAN to bridge CPU
> port to a separate function. A follow-up patch introduces a need to roll
> back operations in this block due to an error, and this split will make
> the error-handling code clearer.
> 
> Signed-off-by: Petr Machata <petrm@mellanox.com>
> ---
>  net/bridge/br_vlan.c | 55 +++++++++++++++++++++++++++++++---------------------
>  1 file changed, 33 insertions(+), 22 deletions(-)
> 

Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>


* Re: [PATCH mlx5-next 2/2] net/mlx5: Add FPGA QP error event
From: Saeed Mahameed @ 2018-05-30 15:14 UTC (permalink / raw)
  To: andrew@lunn.ch
  Cc: Jason Gunthorpe, netdev@vger.kernel.org, Ilan Tayari,
	linux-rdma@vger.kernel.org, Leon Romanovsky, Adi Nissim
In-Reply-To: <20180530010710.GC30239@lunn.ch>

On Wed, 2018-05-30 at 03:07 +0200, Andrew Lunn wrote:
> On Tue, May 29, 2018 at 05:19:54PM -0700, Saeed Mahameed wrote:
> > From: Ilan Tayari <ilant@mellanox.com>
> > 
> > The FPGA QP event fires whenever a QP on the FPGA transitions
> > to the error state.
> 
> FPGA i know, field programmable gate array. Could you offer some clue
> as to what QP means?
> 

Hi Andrew, QP ("Queue Pair") is a well-known RDMA concept, widely used
in mlx drivers. Here the driver uses it as a ring buffer to
communicate with the FPGA device.
 
>    Thanks
> 	Andrew


* Re: [PATCH net-next v4 1/8] net: bridge: Extract boilerplate around switchdev_port_obj_*()
From: Nikolay Aleksandrov @ 2018-05-30 15:12 UTC (permalink / raw)
  To: Petr Machata, netdev, devel, bridge
  Cc: f.fainelli, andrew, gregkh, vivien.didelot, idosch, jiri,
	razvan.stefanescu, davem, dan.carpenter
In-Reply-To: <7b6f3bdc759168227bbff78173837d5fb5560528.1527641426.git.petrm@mellanox.com>

On 30/05/18 03:56, Petr Machata wrote:
> A call to switchdev_port_obj_add() or switchdev_port_obj_del() involves
> initializing a struct switchdev_obj_port_vlan, a piece of code that
> repeats on each call site almost verbatim. While in the current codebase
> there is just one duplicated add call, the follow-up patches add more of
> both add and del calls.
> 
> Thus to remove the duplication, extract the repetition into named
> functions and reuse.
> 
> Signed-off-by: Petr Machata <petrm@mellanox.com>
> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
> ---
>  net/bridge/br_private.h   | 13 +++++++++++++
>  net/bridge/br_switchdev.c | 25 +++++++++++++++++++++++++
>  net/bridge/br_vlan.c      | 26 +++-----------------------
>  3 files changed, 41 insertions(+), 23 deletions(-)
> 

Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>


* [PATCH][next] bpf: devmap: remove redundant assignment of dev = dev
From: Colin King @ 2018-05-30 15:09 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, netdev; +Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

The assignment dev = dev is redundant and should be removed.

Detected by CoverityScan, CID#1469486 ("Evaluation order violation")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 kernel/bpf/devmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index ae16d0c373ef..1fe3fe60508a 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -352,7 +352,7 @@ int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp,
 static void *dev_map_lookup_elem(struct bpf_map *map, void *key)
 {
 	struct bpf_dtab_netdev *obj = __dev_map_lookup_elem(map, *(u32 *)key);
-	struct net_device *dev = dev = obj ? obj->dev : NULL;
+	struct net_device *dev = obj ? obj->dev : NULL;
 
 	return dev ? &dev->ifindex : NULL;
 }
-- 
2.17.0
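
The corrected line is a plain NULL-safe two-step lookup. The removed form named dev inside its own initializer, which appears to be what the Coverity report refers to. A stand-alone sketch of the fixed shape follows, using made-up toy_* stand-ins for the kernel structures.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for struct net_device and struct bpf_dtab_netdev. */
struct toy_net_device {
	int ifindex;
};

struct toy_dtab_netdev {
	struct toy_net_device *dev;
};

/* Same shape as the fixed function: one initializer, NULL-safe at each step. */
static int *toy_lookup_ifindex(struct toy_dtab_netdev *obj)
{
	struct toy_net_device *dev = obj ? obj->dev : NULL;

	return dev ? &dev->ifindex : NULL;
}

/* Exercise the three cases: full entry, entry without device, no entry. */
static void toy_check_lookup(void)
{
	struct toy_net_device nd = { .ifindex = 7 };
	struct toy_dtab_netdev with = { .dev = &nd };
	struct toy_dtab_netdev without = { .dev = NULL };

	assert(*toy_lookup_ifindex(&with) == 7);
	assert(toy_lookup_ifindex(&without) == NULL);
	assert(toy_lookup_ifindex(NULL) == NULL);
}
```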


* Re: [PATCH mlx5-next 1/2] net/mlx5: Add temperature warning event to log
From: Saeed Mahameed @ 2018-05-30 15:08 UTC (permalink / raw)
  To: andrew@lunn.ch
  Cc: Jason Gunthorpe, netdev@vger.kernel.org, Ilan Tayari,
	linux-rdma@vger.kernel.org, Leon Romanovsky, Adi Nissim
In-Reply-To: <20180530010404.GB30239@lunn.ch>

On Wed, 2018-05-30 at 03:04 +0200, Andrew Lunn wrote:
> On Tue, May 29, 2018 at 05:19:53PM -0700, Saeed Mahameed wrote:
> > From: Ilan Tayari <ilant@mellanox.com>
> > 
> > Temperature warning event is sent by FW to indicate high
> > temperature
> > as detected by one of the sensors on the board.
> > Add handling of this event by writing the numbers of the alert
> > sensors
> > to the kernel log.
> 
> Hi Saeed
> 
> Is the temperature itself available? If so, it would be better to
> expose this as a hwmon device per temperature sensor.
> 

Hi Andrew, yes, the temperature is available by other means. This patch
is needed for the alert information, so we know which internal
sensors triggered the alarm.
We are working in parallel on exposing the temperature sensors to hwmon,
but this is still WIP.

Is it OK to have both?

>        Andrew


* Re: [PATCH mlx5-next v2 11/13] IB/mlx5: Add flow counters binding support
From: Jason Gunthorpe @ 2018-05-30 15:06 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Leon Romanovsky, Doug Ledford, Leon Romanovsky, RDMA mailing list,
	Boris Pismenny, Matan Barak, Raed Salem, Yishai Hadas,
	Saeed Mahameed, linux-netdev, Alex Rosenbaum
In-Reply-To: <316f5042-b47d-2cee-48de-514467817e7a@dev.mellanox.co.il>

On Wed, May 30, 2018 at 03:31:34PM +0300, Yishai Hadas wrote:
> On 5/29/2018 10:56 PM, Jason Gunthorpe wrote:
> >On Tue, May 29, 2018 at 04:09:15PM +0300, Leon Romanovsky wrote:
> >>diff --git a/include/uapi/rdma/mlx5-abi.h b/include/uapi/rdma/mlx5-abi.h
> >>index 508ea8c82da7..ef3f430a7050 100644
> >>+++ b/include/uapi/rdma/mlx5-abi.h
> >>@@ -443,4 +443,18 @@ enum {
> >>  enum {
> >>  	MLX5_IB_CLOCK_INFO_V1              = 0,
> >>  };
> >>+
> >>+struct mlx5_ib_flow_counters_data {
> >>+	__aligned_u64   counters_data;
> >>+	__u32   ncounters;
> >>+	__u32   reserved;
> >>+};
> >>+
> >>+struct mlx5_ib_create_flow {
> >>+	__u32   ncounters_data;
> >>+	__u32   reserved;
> >>+	/* Following are counters data based on ncounters_data */
> >>+	struct mlx5_ib_flow_counters_data data[];
> >>+};
> >>+
> >>  #endif /* MLX5_ABI_USER_H */
> >
> >This uapi thing still needs to be fixed as I pointed out before.
> 
> In V3 we can go with below, no change in memory layout but it can clarify
> the code/usage.
> 
> struct mlx5_ib_flow_counters_desc {
>         __u32   description;
>         __u32   index;
> };
> 
> struct mlx5_ib_flow_counters_data {
>         RDMA_UAPI_PTR(struct mlx5_ib_flow_counters_desc *, counters_data);
>         __u32   ncounters;
>         __u32   reserved;
> };

OK, this is what I asked for originally..

> struct mlx5_ib_create_flow {
>         __u32   ncounters_data;
>         __u32   reserved;
>         /* Following are counters data based on ncounters_data */
>         struct mlx5_ib_flow_counters_data data[];
> 
> 
> >I still can't figure out why this should be a 2d array.
> 
> This comes to support the future case of multiple counters objects/specs
> passed with the same flow. There is a need to differentiate mapping data for
> each counters object and that is done via the 'ncounters_data' field and the
> 2d array.

This still doesn't make any sense to me. How are these multiple
counters objects/specs going to be identified?

Basically, what does the array index for data[] mean? Should it match
the spec index from the main verb or something?

This is a good place for a comment, since the intention is completely
opaque here.

> >A flex array at the end of a struct means that the struct can never be
> >extended again which seems like a terrible idea,
> 
> The header [1] has a fixed size and will always exist even if there are
> no counters. Future extensions [2] will be added in the memory after the
> flex array, whose size depends on 'ncounters_data'. This pattern is also
> used in other extended APIs. [3]
> 
> struct mlx5_ib_create_flow {
>         __u32   ncounters_data;
>         __u32   reserved;
> [1] /* Header is above ********
> 
>         /* Following are counters data based on ncounters_data */
>         struct mlx5_ib_flow_counters_data data[];
> 
> [2] Future fields.

We could do that.. But we won't - if it comes to it this will have to
move to the new kabi.

> [3] https://elixir.bootlin.com/linux/latest/source/include/uapi/rdma/ib_user_verbs.h#L1145

?? That looks like a normal flex array to me.

Jason
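
The disagreement about extending a struct that ends in a flex array is easier to see with the arithmetic written out. In the scheme Yishai describes, anything added later would start at an offset that depends on ncounters_data, so it cannot be a named struct member at a fixed offset. A sketch follows, with made-up demo_* names and a simplified entry type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_entry {
	uint64_t counters_data;
	uint32_t ncounters;
	uint32_t reserved;
};

struct demo_hdr {
	uint32_t ncounters_data;
	uint32_t reserved;
	struct demo_entry data[];	/* length given by ncounters_data */
};

static_assert(offsetof(struct demo_hdr, data) == 8, "fixed header is 8 bytes");

/* Byte offset at which fields placed after data[] would begin. */
static size_t demo_ext_offset(uint32_t ncounters_data)
{
	return offsetof(struct demo_hdr, data) +
	       (size_t)ncounters_data * sizeof(struct demo_entry);
}

/* True if a request of len bytes carries anything beyond data[]. */
static bool demo_has_ext(uint32_t ncounters_data, size_t len)
{
	return len > demo_ext_offset(ncounters_data);
}
```

Every consumer of such an extension would have to compute this offset at run time from the header, which is one way to read Jason's preference for moving to the new kabi instead.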


* Re: [PATCH v4 net-next 00/19] inet: frags: bring rhashtables to IP defrag
From: Tariq Toukan @ 2018-05-30 14:42 UTC (permalink / raw)
  To: Eric Dumazet, Tariq Toukan, moshe
  Cc: Eric Dumazet, aring, David Miller, netdev, Florian Westphal,
	Herbert Xu, Thomas Graf, Jesper Dangaard Brouer, Alexander Aring,
	Stefan Schmidt, Kirill Tkhai, Eran Ben Elisha
In-Reply-To: <CANn89iJN8=ZdnErcwMnEyhywksQLONL1t7DX=UoAJhFv4t0PrA@mail.gmail.com>



On 30/05/2018 10:36 AM, Eric Dumazet wrote:
> On Wed, May 30, 2018 at 3:20 AM Tariq Toukan <tariqt@mellanox.com> wrote:
> 
>> Not sure, the transmit BW you get is higher than what we saw.
>> Anyway, we'll check this.
> 
> That is on a 40Gbit test bed (mlx4 cx/3), maybe you were using a 10Gbit NIC?
> 

It is a ConnectX-4 50G (mlx5).

Moshe is trying out the tuning you suggested.
He'll update once he's done.


* pull-request: wireless-drivers 2018-05-30
From: Kalle Valo @ 2018-05-30 14:17 UTC (permalink / raw)
  To: David Miller; +Cc: linux-wireless, netdev, linux-kernel

Hi Dave,

I know this is late, but hopefully this pull request can still make it to
the net tree and the final 4.17 release. If not, please let me know
and I'll pull this to wireless-drivers-next instead.

More info in the signed tag below, and please let me know if there are
any problems.

Kalle

The following changes since commit 813477aa49aac5deba04eb4956360dde58a0e807:

  MAINTAINERS: change Kalle as wcn36xx maintainer (2018-05-22 15:36:41 +0300)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git tags/wireless-drivers-for-davem-2018-05-30

for you to fetch changes up to ab1068d6866e28bf6427ceaea681a381e5870a4a:

  iwlwifi: pcie: compare with number of IRQs requested for, not number of CPUs (2018-05-29 10:40:25 +0300)

----------------------------------------------------------------
wireless-drivers fixes for 4.17

Two last minute fixes, hopefully they make it to 4.17 still.

rt2x00

* revert a fix which caused even more problems

iwlwifi

* fix a crash when there are 16 or more logical CPUs

----------------------------------------------------------------
Hao Wei Tee (1):
      iwlwifi: pcie: compare with number of IRQs requested for, not number of CPUs

Stanislaw Gruszka (1):
      Revert "rt2800: use TXOP_BACKOFF for probe frames"

 drivers/net/wireless/intel/iwlwifi/pcie/trans.c  | 10 +++++-----
 drivers/net/wireless/ralink/rt2x00/rt2x00queue.c |  7 +++----
 2 files changed, 8 insertions(+), 9 deletions(-)

^ permalink raw reply

* Re: [PATCH 2/4] arcnet: com20020: bindings for smsc com20020
From: Andrea Greco @ 2018-05-30 14:07 UTC (permalink / raw)
  To: Rob Herring
  Cc: Tobin C. Harding, Andrea Greco, Mark Rutland, netdev, devicetree,
	linux-kernel@vger.kernel.org
In-Reply-To: <CAL_JsqL1M3SuH_cJGUhh0+Xg+UYDjAc-=a+USENOKznvWib1Ng@mail.gmail.com>

On 05/24/2018 04:36 PM, Rob Herring wrote:
> If you want to add it, that's fine. But it's really not something that
> comes up often. For UARTs, there's already the "current-speed"
> property and most other things I can think of use Hz to express
> speeds.

No, I prefer to keep to the standard and use Hz.

This is the final version:
```
SMSC com20020 Arcnet network controller

Required properties:
- timeout-ns: Arcnet bus timeout, Idle Time (328000 - 20500)
- bus-speed-hz: Arcnet bus speed (10000000 - 156250)
- smsc,xtal-mhz: External oscillator frequency
- smsc,backplane-enabled: Controller uses backplane mode
- reset-gpios: Chip reset pin
- interrupts: Should contain controller interrupt

arcnet@28000000 {
	compatible = "smsc,com20020";

	timeout-ns = <20500>;
	bus-speed-hz = <10000000>;
	smsc,xtal-mhz = <20>;
	smsc,backplane-enabled;

	reset-gpios = <&gpio3 21 GPIO_ACTIVE_LOW>;
	interrupts = <&gpio2 10 GPIO_ACTIVE_LOW>;
};
```
If confirmed, this is fine for me.

Andrea

^ permalink raw reply

* Re: [PATCH v2] Revert "alx: remove WoL support"
From: Andrew Lunn @ 2018-05-30 13:58 UTC (permalink / raw)
  To: AceLan Kao
  Cc: Jay Cliburn, Chris Snook, David S . Miller, Rakesh Pandit, netdev,
	Emily Chien, Johannes Berg, Johannes Stezenbach, linux-kernel
In-Reply-To: <20180530021008.15080-1-acelan.kao@canonical.com>

On Wed, May 30, 2018 at 10:10:08AM +0800, AceLan Kao wrote:
> This reverts commit bc2bebe8de8ed4ba6482c9cc370b0dd72ffe8cd2.
> 
> The WoL feature is a must to pass Energy Star 6.1 and above,
> the power consumption will be measured during S3 with WoL is enabled.
> 
> Reverting "alx: remove WoL support", and will try to fix the unintentional
> wake up issue when WoL is enabled.

Hi AceLan

I find this change log entry rather odd.

If i remember correctly, you first argued that you did not want to
have to distribute out of tree patches.

It was suggested that you might be able to justify the revert using
the argument that the cure is worse than the disease. You ignored
that, and went with this Energy Star argument. That got shot down by
DaveM, and you were told to actually try to find the problem.

So you then came back and said you think the problem is fixed, but
don't know exactly what fixed it. So DaveM said try again.

Now you are back to Energy Star.

I don't get this. It was the fact you said it was probably fixed that
made DaveM reconsider. That is the argument you should be using in the
change log. We want to know what testing you have done. We would like to
see a Tested-by: from somebody who had the issue which caused the revert,
and who now says the issue is fixed.

Ideally we would like to know which change actually fixed the issue,
so it can be added to stable. But that requires somebody to do a long
git bisect.

    Andrew

^ permalink raw reply

* Re: [PATCH net] mlx4_core: restore optimal ICM memory allocation
From: Tariq Toukan @ 2018-05-30 13:49 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller
  Cc: netdev, Eric Dumazet, John Sperbeck, Tarick Bedeir, Qing Huang,
	Daniel Jurgens, Zhu Yanjun, Tariq Toukan
In-Reply-To: <20180530041152.113393-1-edumazet@google.com>



On 30/05/2018 7:11 AM, Eric Dumazet wrote:
> Commit 1383cb8103bb ("mlx4_core: allocate ICM memory in page size chunks")
> brought a regression caught in our regression suite, thanks to KASAN.
> 
> Note that mlx4_alloc_icm() is already able to try high order allocations
> and fallback to low-order allocations under high memory pressure.
> 
> We only have to tweak gfp_mask a bit, to help falling back faster,
> without risking OOM killings.
> 
> BUG: KASAN: slab-out-of-bounds in to_rdma_ah_attr+0x808/0x9e0 [mlx4_ib]
> Read of size 4 at addr ffff8817df584f68 by task qp_listing_test/92585
> 
> CPU: 38 PID: 92585 Comm: qp_listing_test Tainted: G           O
> Call Trace:
>   [<ffffffffba80d7bb>] dump_stack+0x4d/0x72
>   [<ffffffffb951dc5f>] print_address_description+0x6f/0x260
>   [<ffffffffb951e1c7>] kasan_report+0x257/0x370
>   [<ffffffffb951e339>] __asan_report_load4_noabort+0x19/0x20
>   [<ffffffffc0256d28>] to_rdma_ah_attr+0x808/0x9e0 [mlx4_ib]
>   [<ffffffffc02785b3>] mlx4_ib_query_qp+0x1213/0x1660 [mlx4_ib]
>   [<ffffffffc02dbfdb>] qpstat_print_qp+0x13b/0x500 [ib_uverbs]
>   [<ffffffffc02dc3ea>] qpstat_seq_show+0x4a/0xb0 [ib_uverbs]
>   [<ffffffffb95f125c>] seq_read+0xa9c/0x1230
>   [<ffffffffb96e0821>] proc_reg_read+0xc1/0x180
>   [<ffffffffb9577918>] __vfs_read+0xe8/0x730
>   [<ffffffffb9578057>] vfs_read+0xf7/0x300
>   [<ffffffffb95794d2>] SyS_read+0xd2/0x1b0
>   [<ffffffffb8e06b16>] do_syscall_64+0x186/0x420
>   [<ffffffffbaa00071>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> RIP: 0033:0x7f851a7bb30d
> RSP: 002b:00007ffd09a758c0 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
> RAX: ffffffffffffffda RBX: 00007f84ff959440 RCX: 00007f851a7bb30d
> RDX: 000000000003fc00 RSI: 00007f84ff60a000 RDI: 000000000000000b
> RBP: 00007ffd09a75900 R08: 00000000ffffffff R09: 0000000000000000
> R10: 0000000000000022 R11: 0000000000000293 R12: 0000000000000000
> R13: 000000000003ffff R14: 000000000003ffff R15: 00007f84ff60a000
> 
> Allocated by task 4488:
>   save_stack+0x46/0xd0
>   kasan_kmalloc+0xad/0xe0
>   __kmalloc+0x101/0x5e0
>   ib_register_device+0xc03/0x1250 [ib_core]
>   mlx4_ib_add+0x27d6/0x4dd0 [mlx4_ib]
>   mlx4_add_device+0xa9/0x340 [mlx4_core]
>   mlx4_register_interface+0x16e/0x390 [mlx4_core]
>   xhci_pci_remove+0x7a/0x180 [xhci_pci]
>   do_one_initcall+0xa0/0x230
>   do_init_module+0x1b9/0x5a4
>   load_module+0x63e6/0x94c0
>   SYSC_init_module+0x1a4/0x1c0
>   SyS_init_module+0xe/0x10
>   do_syscall_64+0x186/0x420
>   entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> 
> Freed by task 0:
> (stack is not available)
> 
> The buggy address belongs to the object at ffff8817df584f40
>   which belongs to the cache kmalloc-32 of size 32
> The buggy address is located 8 bytes to the right of
>   32-byte region [ffff8817df584f40, ffff8817df584f60)
> The buggy address belongs to the page:
> page:ffffea005f7d6100 count:1 mapcount:0 mapping:ffff8817df584000 index:0xffff8817df584fc1
> flags: 0x880000000000100(slab)
> raw: 0880000000000100 ffff8817df584000 ffff8817df584fc1 000000010000003f
> raw: ffffea005f3ac0a0 ffffea005c476760 ffff8817fec00900 ffff883ff78d26c0
> page dumped because: kasan: bad access detected
> page->mem_cgroup:ffff883ff78d26c0
> 
> Memory state around the buggy address:
>   ffff8817df584e00: 00 03 fc fc fc fc fc fc 00 03 fc fc fc fc fc fc
>   ffff8817df584e80: 00 00 00 04 fc fc fc fc 00 00 00 fc fc fc fc fc
>> ffff8817df584f00: fb fb fb fb fc fc fc fc 00 00 00 00 fc fc fc fc
>                                                            ^
>   ffff8817df584f80: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
>   ffff8817df585000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> 
> Fixes: 1383cb8103bb ("mlx4_core: allocate ICM memory in page size chunks")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: John Sperbeck <jsperbeck@google.com>
> Cc: Tarick Bedeir <tarick@google.com>
> Cc: Qing Huang <qing.huang@oracle.com>
> Cc: Daniel Jurgens <danielj@mellanox.com>
> Cc: Zhu Yanjun <yanjun.zhu@oracle.com>
> Cc: Tariq Toukan <tariqt@mellanox.com>
> ---
>   drivers/net/ethernet/mellanox/mlx4/icm.c | 17 +++++++++++------
>   1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/icm.c b/drivers/net/ethernet/mellanox/mlx4/icm.c
> index 685337d58276fc91baeeb64387c52985e1bc6dda..cae33d5c7dbd9ba7929adcf2127b104f6796fa5a 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/icm.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/icm.c
> @@ -43,12 +43,13 @@
>   #include "fw.h"
>   
>   /*
> - * We allocate in page size (default 4KB on many archs) chunks to avoid high
> - * order memory allocations in fragmented/high usage memory situation.
> + * We allocate in as big chunks as we can, up to a maximum of 256 KB
> + * per chunk. Note that the chunks are not necessarily in contiguous
> + * physical memory.
>    */
>   enum {
> -	MLX4_ICM_ALLOC_SIZE	= PAGE_SIZE,
> -	MLX4_TABLE_CHUNK_SIZE	= PAGE_SIZE,
> +	MLX4_ICM_ALLOC_SIZE	= 1 << 18,
> +	MLX4_TABLE_CHUNK_SIZE	= 1 << 18,
>   };
>   
>   static void mlx4_free_icm_pages(struct mlx4_dev *dev, struct mlx4_icm_chunk *chunk)
> @@ -135,6 +136,7 @@ struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
>   	struct mlx4_icm *icm;
>   	struct mlx4_icm_chunk *chunk = NULL;
>   	int cur_order;
> +	gfp_t mask;
>   	int ret;
>   
>   	/* We use sg_set_buf for coherent allocs, which assumes low memory */
> @@ -178,13 +180,16 @@ struct mlx4_icm *mlx4_alloc_icm(struct mlx4_dev *dev, int npages,
>   		while (1 << cur_order > npages)
>   			--cur_order;
>   
> +		mask = gfp_mask;
> +		if (cur_order)
> +			mask = (mask & ~__GFP_DIRECT_RECLAIM) | __GFP_NORETRY;
>   		if (coherent)
>   			ret = mlx4_alloc_icm_coherent(&dev->persist->pdev->dev,
>   						      &chunk->mem[chunk->npages],
> -						      cur_order, gfp_mask);
> +						      cur_order, mask);
>   		else
>   			ret = mlx4_alloc_icm_pages(&chunk->mem[chunk->npages],
> -						   cur_order, gfp_mask,
> +						   cur_order, mask,
>   						   dev->numa_node);
>   
>   		if (ret) {
> 

Thanks Eric.

I think it preserves the original intention of commit 1383cb8103bb 
("mlx4_core: allocate ICM memory in page size chunks").

Looks good to me.

Acked-by: Tariq Toukan <tariqt@mellanox.com>

^ permalink raw reply
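
A minimal userspace sketch of the two pieces of logic the patch above relies on: clamping the allocation order to the pages still needed, and relaxing the allocation mask only for high-order attempts. The `DEMO_GFP_*` values and function names are hypothetical and are not the kernel's flag bits; the order-6 figure assumes 4 KB pages, where 256 KB is 64 pages.

```c
#include <stdint.h>

/* Hypothetical flag bits for illustration; NOT the kernel's values. */
#define DEMO_GFP_DIRECT_RECLAIM	(1u << 0)
#define DEMO_GFP_NORETRY	(1u << 1)
#define DEMO_GFP_KERNEL_LIKE	(DEMO_GFP_DIRECT_RECLAIM | (1u << 2))

/* Mirror of the patch's mask tweak: high-order attempts must fail fast. */
static uint32_t demo_mask_for_order(uint32_t base, int order)
{
	if (order)
		return (base & ~DEMO_GFP_DIRECT_RECLAIM) | DEMO_GFP_NORETRY;
	return base;	/* order 0: keep the caller's mask untouched */
}

/*
 * Largest order not exceeding max_order with (1 << order) <= npages.
 * The caller guarantees npages >= 1, as the allocation loop does.
 */
static int demo_clamp_order(int max_order, int npages)
{
	int order = max_order;

	while ((1 << order) > npages)
		--order;
	return order;
}
```

With these helpers, an order-6 attempt gives up quickly instead of entering direct reclaim, while the final order-0 fallback keeps the caller's original mask and so retains its normal reclaim behaviour.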

* Re: [PATCH bpf v3 0/5] fix test_sockmap
From: John Fastabend @ 2018-05-30 13:32 UTC (permalink / raw)
  To: Prashant Bhole, Alexei Starovoitov, Daniel Borkmann
  Cc: David S . Miller, Shuah Khan, netdev, linux-kselftest
In-Reply-To: <20180530055611.10216-1-bhole_prashant_q7@lab.ntt.co.jp>

On 05/29/2018 10:56 PM, Prashant Bhole wrote:
> This series fixes error handling, timeout and data verification in
> test_sockmap. Previously it was not able to detect failure/timeout in
> RX/TX thread because error was not notified to the main thread.
> 
> Also slightly improved test output by printing parameter values (cork,
> apply, start, end) so that parameters for all tests are displayed.
> 
> Changes in v3:
>   - Skipped error checking for corked tests
> 
> Prashant Bhole (5):
>   selftests/bpf: test_sockmap, check test failure
>   selftests/bpf: test_sockmap, join cgroup in selftest mode
>   selftests/bpf: test_sockmap, fix test timeout
>   selftests/bpf: test_sockmap, fix data verification
>   selftests/bpf: test_sockmap, print additional test options
> 
>  tools/testing/selftests/bpf/test_sockmap.c | 76 +++++++++++++++++-----
>  1 file changed, 58 insertions(+), 18 deletions(-)
> 

Looks good thanks. We may want to tune the running time a bit but
let's get this applied first. A lot of nice improvements!

.John

^ permalink raw reply

* Re: [PATCH bpf v3 3/5] selftests/bpf: test_sockmap, fix test timeout
From: John Fastabend @ 2018-05-30 13:31 UTC (permalink / raw)
  To: Prashant Bhole, Alexei Starovoitov, Daniel Borkmann
  Cc: David S . Miller, Shuah Khan, netdev, linux-kselftest
In-Reply-To: <20180530055611.10216-4-bhole_prashant_q7@lab.ntt.co.jp>

On 05/29/2018 10:56 PM, Prashant Bhole wrote:
> In order to reduce runtime of tests, recently timeout for select() call
> was reduced from 1sec to 10usec. This was causing many test failures.
> It was caught with failure handling commits in this series.
> 
> Restoring the timeout from 10usec to 1sec
> 
> Fixes: a18fda1a62c3 ("bpf: reduce runtime of test_sockmap tests")
> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
> ---

Quick question, how long does it take to run now with the time increase?
If it's taking too long we may need to remove some tests. I have a longer
running test_sockmap script that I run as part of Cilium[1] project
where I put longer running stress tests.

Acked-by: John Fastabend <john.fastabend@gmail.com>

[1] cilium.io

^ permalink raw reply
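
A minimal sketch of a `select()` wait with an explicit timeout, to illustrate the 10 usec versus 1 sec difference discussed above. The helper name `demo_wait_readable` is hypothetical and is not taken from test_sockmap.c.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Wait until fd is readable or the timeout expires.
 * Returns 1 if readable, 0 on timeout, -1 on error (same as select()).
 */
static int demo_wait_readable(int fd, long sec, long usec)
{
	struct timeval timeout = { .tv_sec = sec, .tv_usec = usec };
	fd_set rfds;

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	return select(fd + 1, &rfds, NULL, NULL, &timeout);
}
```

Calling `demo_wait_readable(fd, 0, 10)` gives the peer only 10 microseconds to produce data, so a slow sender is reported as a timeout; `demo_wait_readable(fd, 1, 0)` restores the 1 second budget.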

* Re: [PATCH bpf v3 1/5] selftests/bpf: test_sockmap, check test failure
From: John Fastabend @ 2018-05-30 13:26 UTC (permalink / raw)
  To: Prashant Bhole, Alexei Starovoitov, Daniel Borkmann
  Cc: David S . Miller, Shuah Khan, netdev, linux-kselftest
In-Reply-To: <20180530055611.10216-2-bhole_prashant_q7@lab.ntt.co.jp>

On 05/29/2018 10:56 PM, Prashant Bhole wrote:
> Test failures are not identified because exit code of RX/TX threads
> is not checked. Also threads are not returning correct exit code.
> 
> - Return exit code from threads depending on test execution status
> - In main thread, check the exit code of RX/TX threads
> - Skip error checking for corked tests as they are expected to timeout
> 
> Fixes: 16962b2404ac ("bpf: sockmap, add selftests")
> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
> ---
>  tools/testing/selftests/bpf/test_sockmap.c | 25 ++++++++++++++++------
>  1 file changed, 19 insertions(+), 6 deletions(-)
> 

Looks good. Thanks.

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply
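
A minimal sketch of the pattern the commit message describes: a worker returns an exit code and the main thread checks it. It assumes the RX/TX workers are child processes created with `fork()`, which is an assumption here because the patch body is not shown; all `demo_*` names are hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run worker() in a child process and report its exit code to the caller.
 * Returns the child's exit status (0..255), or -1 if fork/waitpid failed
 * or the child did not exit normally.
 */
static int demo_run_worker(int (*worker)(void))
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(worker() ? 1 : 0);	/* child: propagate failure */

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}

/* Hypothetical workers used only to exercise the helper. */
static int demo_worker_ok(void)   { return 0; }
static int demo_worker_fail(void) { return -5; }
```

Without the `WEXITSTATUS()` check in the parent, a failing worker would go unnoticed, which matches the problem statement in the quoted commit message.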

* Re: [RFC PATCH ghak32 V2 00/13] audit: implement container id
From: Steve Grubb @ 2018-05-30 13:20 UTC (permalink / raw)
  To: linux-audit-H+wXaHxf7aLQT0dZR+AlfA
  Cc: simo-H+wXaHxf7aLQT0dZR+AlfA, jlayton-H+wXaHxf7aLQT0dZR+AlfA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, LKML,
	eparis-FjpueFixGhCM4zKIHC2jIg, dhowells-H+wXaHxf7aLQT0dZR+AlfA,
	carlos-H+wXaHxf7aLQT0dZR+AlfA, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	luto-DgEjT+Ai2ygdnm+yROfE0A, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn
In-Reply-To: <cover.1521179281.git.rgb-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On Friday, March 16, 2018 5:00:27 AM EDT Richard Guy Briggs wrote:
> Implement audit kernel container ID.
> 
> This patchset is a second RFC based on the proposal document (V3)
> posted:
> 	https://www.redhat.com/archives/linux-audit/2018-January/msg00014.html

So, if you work on a container orchestrator, how exactly is this set of 
interfaces to be used and in what order?

Thanks,
-Steve

> The first patch implements the proc fs write to set the audit container
> ID of a process, emitting an AUDIT_CONTAINER record to announce the
> registration of that container ID on that process.  This patch requires
> userspace support for record acceptance and proper type display.
> 
> The second checks for children or co-threads and refuses to set the
> container ID if either are present.  (This policy could be changed to
> set both with the same container ID provided they meet the rest of the
> requirements.)
> 
> The third implements the auxiliary record AUDIT_CONTAINER_INFO if a
> container ID is identifiable with an event.  This patch requires
> userspace support for proper type display.
> 
> The fourth adds container ID filtering to the exit, exclude and user
> lists.  This patch requires auditctl userspace support for the
> --containerid option.
> 
> The 5th adds signal and ptrace support.
> 
> The 6th creates a local audit context to be able to bind a standalone
> record with a locally created auxiliary record.
> 
> The 7th, 8th, 9th, 10th patches add container ID records to standalone
> records.  Some of these may end up being syscall auxiliary records and
> won't need this specific support since they'll be supported via
> syscalls.
> 
> The 11th adds network namespace container ID labelling based on member
> tasks' container ID labels.
> 
> The 12th adds container ID support to standalone netfilter records that
> don't have a task context and lists each container to which that net
> namespace belongs.
> 
> The 13th implements reading the container ID from the proc filesystem
> for debugging.  This patch isn't planned for upstream inclusion.
> 
> Feedback please!
> 
> Example: Set a container ID of 123456 to the "sleep" task:
> 	sleep 2&
> 	child=$!
> 	echo 123456 > /proc/$child/containerid; echo $?
> 	ausearch -ts recent -m container
> 	echo child:$child contid:$( cat /proc/$child/containerid)
> This should produce a record such as:
> 	type=CONTAINER msg=audit(1521122590.315:222): op=set pid=689 uid=0
> subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 auid=0 tty=pts0
> ses=3 opid=707 old-contid=18446744073709551615 contid=123456 res=1
> 
> Example: Set a filter on a container ID 123459 on /tmp/tmpcontainerid:
> 	containerid=123459
> 	key=tmpcontainerid
> 	auditctl -a exit,always -F dir=/tmp -F perm=wa -F containerid=$containerid -F key=$key
> 	perl -e "sleep 1; open(my \$tmpfile, '>', \"/tmp/$key\"); close(\$tmpfile);" &
> 	child=$!
> 	echo $containerid > /proc/$child/containerid
> 	sleep 2
> 	ausearch -i -ts recent -k $key
> 	auditctl -d exit,always -F dir=/tmp -F perm=wa -F containerid=$containerid -F key=$key
> 	rm -f /tmp/$key
> This should produce an event such as:
> 	type=CONTAINER_INFO msg=audit(1521122591.614:227): op=task contid=123459
> 	type=PROCTITLE msg=audit(1521122591.614:227):
> proctitle=7065726C002D6500736C65657020313B206F70656E286D792024746D7066696C
> 652C20273E272C20222F746D702F746D70636F6E7461696E6572696422293B20636C6F73652
> 824746D7066696C65293B type=PATH msg=audit(1521122591.614:227): item=1
> name="/tmp/tmpcontainerid" inode=18427 dev=00:26 mode=0100644 ouid=0
> ogid=0 rdev=00:00 obj=unconfined_u:object_r:user_tmp_t:s0 nametype=CREATE
> cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
> type=PATH msg=audit(1521122591.614:227): item=0 name="/tmp/" inode=13513
> dev=00:26 mode=041777 ouid=0 ogid=0 rdev=00:00
> obj=system_u:object_r:tmp_t:s0 nametype=PARENT cap_fp=0000000000000000
> cap_fi=0000000000000000 cap_fe=0 cap_fver=0 type=CWD
> msg=audit(1521122591.614:227): cwd="/root"
> 	type=SYSCALL msg=audit(1521122591.614:227): arch=c000003e syscall=257
> success=yes exit=3 a0=ffffffffffffff9c a1=55db90a28900 a2=241 a3=1b6
> items=2 ppid=689 pid=724 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0
> sgid=0 fsgid=0 tty=pts0 ses=3 comm="perl" exe="/usr/bin/perl"
> subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
> key="tmpcontainerid"
> 
> See:
> 	https://github.com/linux-audit/audit-kernel/issues/32
> 	https://github.com/linux-audit/audit-userspace/issues/40
> 	https://github.com/linux-audit/audit-testsuite/issues/64
> 
> Richard Guy Briggs (13):
>   audit: add container id
>   audit: check children and threading before allowing containerid
>   audit: log container info of syscalls
>   audit: add containerid filtering
>   audit: add containerid support for ptrace and signals
>   audit: add support for non-syscall auxiliary records
>   audit: add container aux record to watch/tree/mark
>   audit: add containerid support for tty_audit
>   audit: add containerid support for config/feature/user records
>   audit: add containerid support for seccomp and anom_abend records
>   audit: add support for containerid to network namespaces
>   audit: NETFILTER_PKT: record each container ID associated with a netNS
>   debug audit: read container ID of a process
> 
>  drivers/tty/tty_audit.c     |   5 +-
>  fs/proc/base.c              |  53 ++++++++++++++++
>  include/linux/audit.h       |  43 +++++++++++++
>  include/linux/init_task.h   |   4 +-
>  include/linux/sched.h       |   1 +
>  include/net/net_namespace.h |  12 ++++
>  include/uapi/linux/audit.h  |   8 ++-
>  kernel/audit.c              |  75 ++++++++++++++++++++---
>  kernel/audit.h              |   3 +
>  kernel/audit_fsnotify.c     |   5 +-
>  kernel/audit_tree.c         |   5 +-
>  kernel/audit_watch.c        |  33 +++++-----
>  kernel/auditfilter.c        |  52 +++++++++++++++-
>  kernel/auditsc.c            | 145 ++++++++++++++++++++++++++++++++++++++++++--
>  kernel/nsproxy.c            |   6 ++
>  net/core/net_namespace.c    |  45 ++++++++++++++
>  net/netfilter/xt_AUDIT.c    |  15 ++++-
>  17 files changed, 473 insertions(+), 37 deletions(-)

^ permalink raw reply
