Netdev List
 help / color / mirror / Atom feed
* RE: [PATCH 3/3] ocelot_ace: fix action of trap
From: Y.b. Lu @ 2019-08-13  2:12 UTC (permalink / raw)
  To: Allan W. Nielsen
  Cc: netdev@vger.kernel.org, David S . Miller, Alexandre Belloni,
	Microchip Linux Driver Support
In-Reply-To: <20190812123147.6jjd3kocityxbvcg@lx-anielsen.microsemi.net>

Hi Allan,

> -----Original Message-----
> From: Allan W. Nielsen <allan.nielsen@microchip.com>
> Sent: Monday, August 12, 2019 8:32 PM
> To: Y.b. Lu <yangbo.lu@nxp.com>
> Cc: netdev@vger.kernel.org; David S . Miller <davem@davemloft.net>;
> Alexandre Belloni <alexandre.belloni@bootlin.com>; Microchip Linux Driver
> Support <UNGLinuxDriver@microchip.com>
> Subject: Re: [PATCH 3/3] ocelot_ace: fix action of trap
> 
> The 08/12/2019 18:48, Yangbo Lu wrote:
> > The trap action should be copying the frame to CPU and dropping it for
> > forwarding, but current setting was just copying frame to CPU.
> 
> Are there any actions which do a "copy-to-cpu" and still forward the frame in
> HW?

[Y.b. Lu] We're using Felix switch whose code hadn't been accepted by upstream.
https://patchwork.ozlabs.org/project/netdev/list/?series=115399&state=*

I'd like to trap all IEEE 1588 PTP Ethernet frames to CPU through etype 0x88f7.
When I used current TRAP option, I found the frames were not only copied to CPU, but also forwarded to other ports.
So I just made the TRAP option same with DROP option except enabling CPU_COPY_ENA in the patch.

Thanks.

> 
> > Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
> > ---
> >  drivers/net/ethernet/mscc/ocelot_ace.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/mscc/ocelot_ace.c
> > b/drivers/net/ethernet/mscc/ocelot_ace.c
> > index 91250f3..59ad590 100644
> > --- a/drivers/net/ethernet/mscc/ocelot_ace.c
> > +++ b/drivers/net/ethernet/mscc/ocelot_ace.c
> > @@ -317,9 +317,9 @@ static void is2_action_set(struct vcap_data *data,
> >  		break;
> >  	case OCELOT_ACL_ACTION_TRAP:
> >  		VCAP_ACT_SET(PORT_MASK, 0x0);
> > -		VCAP_ACT_SET(MASK_MODE, 0x0);
> > -		VCAP_ACT_SET(POLICE_ENA, 0x0);
> > -		VCAP_ACT_SET(POLICE_IDX, 0x0);
> > +		VCAP_ACT_SET(MASK_MODE, 0x1);
> > +		VCAP_ACT_SET(POLICE_ENA, 0x1);
> > +		VCAP_ACT_SET(POLICE_IDX, OCELOT_POLICER_DISCARD);
> This seems wrong. The policer is used to ensure that traffic are discarded, even
> in the case where other users of the code has requested it to go to the CPU.
> 
> Are you sure this is working? If it is working, then I fear we have an issue with
> the DROP action which uses this to discard frames.
> 
> >  		VCAP_ACT_SET(CPU_QU_NUM, 0x0);
> >  		VCAP_ACT_SET(CPU_COPY_ENA, 0x1);
> >  		break;
> > --
> > 2.7.4
> 
> --
> /Allan

^ permalink raw reply

* RE: [PATCH 2/3] ocelot_ace: fix ingress ports setting for rule
From: Y.b. Lu @ 2019-08-13  2:36 UTC (permalink / raw)
  To: Allan W. Nielsen
  Cc: netdev@vger.kernel.org, David S . Miller, Alexandre Belloni,
	Microchip Linux Driver Support
In-Reply-To: <20190812123820.qjaclomo6bhpz5pg@lx-anielsen.microsemi.net>

Hi Allan,

> -----Original Message-----
> From: Allan W. Nielsen <allan.nielsen@microchip.com>
> Sent: Monday, August 12, 2019 8:38 PM
> To: Y.b. Lu <yangbo.lu@nxp.com>
> Cc: netdev@vger.kernel.org; David S . Miller <davem@davemloft.net>;
> Alexandre Belloni <alexandre.belloni@bootlin.com>; Microchip Linux Driver
> Support <UNGLinuxDriver@microchip.com>
> Subject: Re: [PATCH 2/3] ocelot_ace: fix ingress ports setting for rule
> 
> The 08/12/2019 18:48, Yangbo Lu wrote:
> > The ingress ports setting of rule should support covering all ports.
> > This patch is to use u16 ingress_port for ingress port mask setting
> > for ace rule. One bit corresponds one port.
> That is how the HW is working, and it would be nice if we could operate on a
> port masks/lists instead. But how can this be used?

[Y.b. Lu] Will the changes affect anything? Current usage in ocelot_flower.c will be converted as below.

-       rule->chip_port = block->port->chip_port;
+       rule->ingress_port = BIT(block->port->chip_port);


> 
> Can you please explain how/when this will make a difference?

[Y.b. Lu] Actually I have another internal patch based on this patch-set for setting rule of trapping IEEE 1588 PTP Ethernet frames.
For such rule which should be applied on several ingress ports, we can set it once when ocelot initialization I think.

The internal patch I mentioned is for felix which had different ports number (VCAP_PORT_CNT). So I hadn't sent it out.
Let me just send v2 patch-set with the patch dropping VCAP_PORT_CNT changes for your reviewing.
Please feel free to provide suggestion.

Thanks a lot:)
> 
> > Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
> > ---
> >  drivers/net/ethernet/mscc/ocelot_ace.c    | 2 +-
> >  drivers/net/ethernet/mscc/ocelot_ace.h    | 2 +-
> >  drivers/net/ethernet/mscc/ocelot_flower.c | 2 +-
> >  3 files changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/mscc/ocelot_ace.c
> > b/drivers/net/ethernet/mscc/ocelot_ace.c
> > index 5580a58..91250f3 100644
> > --- a/drivers/net/ethernet/mscc/ocelot_ace.c
> > +++ b/drivers/net/ethernet/mscc/ocelot_ace.c
> > @@ -352,7 +352,7 @@ static void is2_entry_set(struct ocelot *ocelot, int ix,
> >  	data.type = IS2_ACTION_TYPE_NORMAL;
> >
> >  	VCAP_KEY_ANY_SET(PAG);
> > -	VCAP_KEY_SET(IGR_PORT_MASK, 0, ~BIT(ace->chip_port));
> > +	VCAP_KEY_SET(IGR_PORT_MASK, 0, ~ace->ingress_port);
> >  	VCAP_KEY_BIT_SET(FIRST, OCELOT_VCAP_BIT_1);
> >  	VCAP_KEY_BIT_SET(HOST_MATCH, OCELOT_VCAP_BIT_ANY);
> >  	VCAP_KEY_BIT_SET(L2_MC, ace->dmac_mc); diff --git
> > a/drivers/net/ethernet/mscc/ocelot_ace.h
> > b/drivers/net/ethernet/mscc/ocelot_ace.h
> > index ce72f02..0fe23e0 100644
> > --- a/drivers/net/ethernet/mscc/ocelot_ace.h
> > +++ b/drivers/net/ethernet/mscc/ocelot_ace.h
> > @@ -193,7 +193,7 @@ struct ocelot_ace_rule {
> >
> >  	enum ocelot_ace_action action;
> >  	struct ocelot_ace_stats stats;
> > -	int chip_port;
> > +	u16 ingress_port;
> >
> >  	enum ocelot_vcap_bit dmac_mc;
> >  	enum ocelot_vcap_bit dmac_bc;
> > diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c
> > b/drivers/net/ethernet/mscc/ocelot_flower.c
> > index 7c60e8c..bfddc50 100644
> > --- a/drivers/net/ethernet/mscc/ocelot_flower.c
> > +++ b/drivers/net/ethernet/mscc/ocelot_flower.c
> > @@ -184,7 +184,7 @@ struct ocelot_ace_rule
> *ocelot_ace_rule_create(struct flow_cls_offload *f,
> >  		return NULL;
> >
> >  	rule->ocelot = block->port->ocelot;
> > -	rule->chip_port = block->port->chip_port;
> > +	rule->ingress_port = BIT(block->port->chip_port);
> >  	return rule;
> >  }
> 
> -- Allan

^ permalink raw reply

* Re: [v4,1/4] tools: bpftool: add net attach command to attach XDP on interface
From: Daniel T. Lee @ 2019-08-13  2:42 UTC (permalink / raw)
  To: Y Song; +Cc: Daniel Borkmann, Alexei Starovoitov, netdev
In-Reply-To: <CAH3MdRVJ5Z8FVF8XW8Ha-MwRAnO2mbmMDFb_s3cPswqq7MfsMQ@mail.gmail.com>

On Mon, Aug 12, 2019 at 9:27 AM Y Song <ys114321@gmail.com> wrote:
>
> On Fri, Aug 9, 2019 at 6:35 AM Daniel T. Lee <danieltimlee@gmail.com> wrote:
> >
> > By this commit, using `bpftool net attach`, user can attach XDP prog on
> > interface. New type of enum 'net_attach_type' has been made, as stated at
> > cover-letter, the meaning of 'attach' is, prog will be attached on interface.
> >
> > With 'overwrite' option at argument, attached XDP program could be replaced.
> > Added new helper 'net_parse_dev' to parse the network device at argument.
> >
> > BPF prog will be attached through libbpf 'bpf_set_link_xdp_fd'.
> >
> > Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
> > ---
> >  tools/bpf/bpftool/net.c | 136 +++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 129 insertions(+), 7 deletions(-)
> >
> > diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
> > index 67e99c56bc88..74cc346c36cd 100644
> > --- a/tools/bpf/bpftool/net.c
> > +++ b/tools/bpf/bpftool/net.c
> > @@ -55,6 +55,35 @@ struct bpf_attach_info {
> >         __u32 flow_dissector_id;
> >  };
> >
> > +enum net_attach_type {
> > +       NET_ATTACH_TYPE_XDP,
> > +       NET_ATTACH_TYPE_XDP_GENERIC,
> > +       NET_ATTACH_TYPE_XDP_DRIVER,
> > +       NET_ATTACH_TYPE_XDP_OFFLOAD,
> > +};
> > +
> > +static const char * const attach_type_strings[] = {
> > +       [NET_ATTACH_TYPE_XDP]           = "xdp",
> > +       [NET_ATTACH_TYPE_XDP_GENERIC]   = "xdpgeneric",
> > +       [NET_ATTACH_TYPE_XDP_DRIVER]    = "xdpdrv",
> > +       [NET_ATTACH_TYPE_XDP_OFFLOAD]   = "xdpoffload",
> > +};
> > +
> > +const size_t net_attach_type_size = ARRAY_SIZE(attach_type_strings);
> > +
> > +static enum net_attach_type parse_attach_type(const char *str)
> > +{
> > +       enum net_attach_type type;
> > +
> > +       for (type = 0; type < net_attach_type_size; type++) {
> > +               if (attach_type_strings[type] &&
> > +                   is_prefix(str, attach_type_strings[type]))
> > +                       return type;
> > +       }
> > +
> > +       return net_attach_type_size;
> > +}
> > +
> >  static int dump_link_nlmsg(void *cookie, void *msg, struct nlattr **tb)
> >  {
> >         struct bpf_netdev_t *netinfo = cookie;
> > @@ -223,6 +252,97 @@ static int query_flow_dissector(struct bpf_attach_info *attach_info)
> >         return 0;
> >  }
> >
> > +static int net_parse_dev(int *argc, char ***argv)
> > +{
> > +       int ifindex;
> > +
> > +       if (is_prefix(**argv, "dev")) {
> > +               NEXT_ARGP();
> > +
> > +               ifindex = if_nametoindex(**argv);
> > +               if (!ifindex)
> > +                       p_err("invalid devname %s", **argv);
> > +
> > +               NEXT_ARGP();
> > +       } else {
> > +               p_err("expected 'dev', got: '%s'?", **argv);
> > +               return -1;
> > +       }
> > +
> > +       return ifindex;
> > +}
> > +
> > +static int do_attach_detach_xdp(int progfd, enum net_attach_type attach_type,
> > +                               int ifindex, bool overwrite)
> > +{
> > +       __u32 flags = 0;
> > +
> > +       if (!overwrite)
> > +               flags = XDP_FLAGS_UPDATE_IF_NOEXIST;
> > +       if (attach_type == NET_ATTACH_TYPE_XDP_GENERIC)
> > +               flags |= XDP_FLAGS_SKB_MODE;
> > +       if (attach_type == NET_ATTACH_TYPE_XDP_DRIVER)
> > +               flags |= XDP_FLAGS_DRV_MODE;
> > +       if (attach_type == NET_ATTACH_TYPE_XDP_OFFLOAD)
> > +               flags |= XDP_FLAGS_HW_MODE;
> > +
> > +       return bpf_set_link_xdp_fd(ifindex, progfd, flags);
> > +}
> > +
> > +static int do_attach(int argc, char **argv)
> > +{
> > +       enum net_attach_type attach_type;
> > +       int progfd, ifindex, err = 0;
> > +       bool overwrite = false;
> > +
> > +       /* parse attach args */
> > +       if (!REQ_ARGS(5))
> > +               return -EINVAL;
> > +
> > +       attach_type = parse_attach_type(*argv);
> > +       if (attach_type == net_attach_type_size) {
> > +               p_err("invalid net attach/detach type: %s", *argv);
> > +               return -EINVAL;
> > +       }
> > +       NEXT_ARG();
> > +
> > +       progfd = prog_parse_fd(&argc, &argv);
> > +       if (progfd < 0)
> > +               return -EINVAL;
> > +
> > +       ifindex = net_parse_dev(&argc, &argv);
> > +       if (ifindex < 1) {
> > +               close(progfd);
> > +               return -EINVAL;
> > +       }
> > +
> > +       if (argc) {
> > +               if (is_prefix(*argv, "overwrite")) {
> > +                       overwrite = true;
> > +               } else {
> > +                       p_err("expected 'overwrite', got: '%s'?", *argv);
> > +                       close(progfd);
> > +                       return -EINVAL;
> > +               }
> > +       }
> > +
> > +       /* attach xdp prog */
> > +       if (is_prefix("xdp", attach_type_strings[attach_type]))
> > +               err = do_attach_detach_xdp(progfd, attach_type, ifindex,
> > +                                          overwrite);
> > +
> > +       if (err < 0) {
> > +               p_err("interface %s attach failed: %s",
> > +                     attach_type_strings[attach_type], strerror(errno));
> > +               return err;
> > +       }
>
> I tried the below example,
>
> -bash-4.4$ sudo ./bpftool net attach x pinned /sys/fs/bpf/xdp_example
> dev v1
> -bash-4.4$ sudo ./bpftool net attach x pinned /sys/fs/bpf/xdp_example dev v1
> Kernel error message: XDP program already attached
> Error: interface xdp attach failed: Success
> -bash-4.4$
>
> It printed out "Success" as errno here is 0.
> The errno is encoded in variable err. Function bpf_set_link_xdp_fd()
> uses netlink interface to do setting. The syscall may be find (errno = 0)
> but the netlink msg may contain error code, which is returned with err.
>
> So the above strerror(errno) should be strerror(-err).
> libbpf API libbpf_strerror_r() accepts positive or negative err code which
> you could use as well here.
>
> With this issue fixed. You can add:
> Acked-by: Yonghong Song <yhs@fb.com>
>

I didn't realize it would return 0 as 'errno'.
Thanks for letting me know.
I'll update to next patch.

Thank you for taking your time for the review.

> > +
> > +       if (json_output)
> > +               jsonw_null(json_wtr);
> > +
> > +       return 0;
> > +}
> > +
> [...]

^ permalink raw reply

* [v5,0/4] tools: bpftool: add net attach/detach command to attach XDP prog
From: Daniel T. Lee @ 2019-08-13  2:46 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov; +Cc: netdev

Currently, bpftool net only supports dumping progs attached on the
interface. To attach XDP prog on interface, user must use other tool
(eg. iproute2). By this patch, with `bpftool net attach/detach`, user
can attach/detach XDP prog on interface.

    # bpftool prog
        16: xdp  name xdp_prog1  tag 539ec6ce11b52f98  gpl
        loaded_at 2019-08-07T08:30:17+0900  uid 0
        ...
        20: xdp  name xdp_fwd_prog  tag b9cb69f121e4a274  gpl
        loaded_at 2019-08-07T08:30:17+0900  uid 0

    # bpftool net attach xdpdrv id 16 dev enp6s0np0
    # bpftool net
    xdp:
        enp6s0np0(4) driver id 16

    # bpftool net attach xdpdrv id 20 dev enp6s0np0 overwrite
    # bpftool net
    xdp:
        enp6s0np0(4) driver id 20

    # bpftool net detach xdpdrv dev enp6s0np0
    # bpftool net
    xdp:


While this patch only contains support for XDP, through `net
attach/detach`, bpftool can further support other prog attach types.

XDP attach/detach tested on Mellanox ConnectX-4 and Netronome Agilio.

---
Changes in v5:
  - fix wrong error message, from errno to err with do_attach/detach

Changes in v4:
  - rename variable, attach/detach error message enhancement
  - bash-completion cleanup, doc update with brief description (attach types)

Changes in v3:
  - added 'overwrite' option for replacing previously attached XDP prog
  - command argument order has been changed ('ATTACH_TYPE' comes first)
  - add 'dev' keyword in front of <devname>
  - added bash-completion and documentation

Changes in v2:
  - command 'load/unload' changed to 'attach/detach' for the consistency

Daniel T. Lee (4):
  tools: bpftool: add net attach command to attach XDP on interface
  tools: bpftool: add net detach command to detach XDP on interface
  tools: bpftool: add bash-completion for net attach/detach
  tools: bpftool: add documentation for net attach/detach

 .../bpf/bpftool/Documentation/bpftool-net.rst |  57 +++++-
 tools/bpf/bpftool/bash-completion/bpftool     |  65 ++++++-
 tools/bpf/bpftool/net.c                       | 176 +++++++++++++++++-
 3 files changed, 278 insertions(+), 20 deletions(-)

-- 
2.20.1


^ permalink raw reply

* [v5,1/4] tools: bpftool: add net attach command to attach XDP on interface
From: Daniel T. Lee @ 2019-08-13  2:46 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov; +Cc: netdev, Yonghong Song
In-Reply-To: <20190813024621.29886-1-danieltimlee@gmail.com>

By this commit, using `bpftool net attach`, user can attach XDP prog on
interface. New type of enum 'net_attach_type' has been made, as stat ted at
cover-letter, the meaning of 'attach' is, prog will be attached on interface.

With 'overwrite' option at argument, attached XDP program could be replaced.
Added new helper 'net_parse_dev' to parse the network device at argument.

BPF prog will be attached through libbpf 'bpf_set_link_xdp_fd'.

Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
---
 tools/bpf/bpftool/net.c | 136 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 129 insertions(+), 7 deletions(-)

diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
index 67e99c56bc88..33222ca1060e 100644
--- a/tools/bpf/bpftool/net.c
+++ b/tools/bpf/bpftool/net.c
@@ -55,6 +55,35 @@ struct bpf_attach_info {
 	__u32 flow_dissector_id;
 };
 
+enum net_attach_type {
+	NET_ATTACH_TYPE_XDP,
+	NET_ATTACH_TYPE_XDP_GENERIC,
+	NET_ATTACH_TYPE_XDP_DRIVER,
+	NET_ATTACH_TYPE_XDP_OFFLOAD,
+};
+
+static const char * const attach_type_strings[] = {
+	[NET_ATTACH_TYPE_XDP]		= "xdp",
+	[NET_ATTACH_TYPE_XDP_GENERIC]	= "xdpgeneric",
+	[NET_ATTACH_TYPE_XDP_DRIVER]	= "xdpdrv",
+	[NET_ATTACH_TYPE_XDP_OFFLOAD]	= "xdpoffload",
+};
+
+const size_t net_attach_type_size = ARRAY_SIZE(attach_type_strings);
+
+static enum net_attach_type parse_attach_type(const char *str)
+{
+	enum net_attach_type type;
+
+	for (type = 0; type < net_attach_type_size; type++) {
+		if (attach_type_strings[type] &&
+		    is_prefix(str, attach_type_strings[type]))
+			return type;
+	}
+
+	return net_attach_type_size;
+}
+
 static int dump_link_nlmsg(void *cookie, void *msg, struct nlattr **tb)
 {
 	struct bpf_netdev_t *netinfo = cookie;
@@ -223,6 +252,97 @@ static int query_flow_dissector(struct bpf_attach_info *attach_info)
 	return 0;
 }
 
+static int net_parse_dev(int *argc, char ***argv)
+{
+	int ifindex;
+
+	if (is_prefix(**argv, "dev")) {
+		NEXT_ARGP();
+
+		ifindex = if_nametoindex(**argv);
+		if (!ifindex)
+			p_err("invalid devname %s", **argv);
+
+		NEXT_ARGP();
+	} else {
+		p_err("expected 'dev', got: '%s'?", **argv);
+		return -1;
+	}
+
+	return ifindex;
+}
+
+static int do_attach_detach_xdp(int progfd, enum net_attach_type attach_type,
+				int ifindex, bool overwrite)
+{
+	__u32 flags = 0;
+
+	if (!overwrite)
+		flags = XDP_FLAGS_UPDATE_IF_NOEXIST;
+	if (attach_type == NET_ATTACH_TYPE_XDP_GENERIC)
+		flags |= XDP_FLAGS_SKB_MODE;
+	if (attach_type == NET_ATTACH_TYPE_XDP_DRIVER)
+		flags |= XDP_FLAGS_DRV_MODE;
+	if (attach_type == NET_ATTACH_TYPE_XDP_OFFLOAD)
+		flags |= XDP_FLAGS_HW_MODE;
+
+	return bpf_set_link_xdp_fd(ifindex, progfd, flags);
+}
+
+static int do_attach(int argc, char **argv)
+{
+	enum net_attach_type attach_type;
+	int progfd, ifindex, err = 0;
+	bool overwrite = false;
+
+	/* parse attach args */
+	if (!REQ_ARGS(5))
+		return -EINVAL;
+
+	attach_type = parse_attach_type(*argv);
+	if (attach_type == net_attach_type_size) {
+		p_err("invalid net attach/detach type: %s", *argv);
+		return -EINVAL;
+	}
+	NEXT_ARG();
+
+	progfd = prog_parse_fd(&argc, &argv);
+	if (progfd < 0)
+		return -EINVAL;
+
+	ifindex = net_parse_dev(&argc, &argv);
+	if (ifindex < 1) {
+		close(progfd);
+		return -EINVAL;
+	}
+
+	if (argc) {
+		if (is_prefix(*argv, "overwrite")) {
+			overwrite = true;
+		} else {
+			p_err("expected 'overwrite', got: '%s'?", *argv);
+			close(progfd);
+			return -EINVAL;
+		}
+	}
+
+	/* attach xdp prog */
+	if (is_prefix("xdp", attach_type_strings[attach_type]))
+		err = do_attach_detach_xdp(progfd, attach_type, ifindex,
+					   overwrite);
+
+	if (err < 0) {
+		p_err("interface %s attach failed: %s",
+		      attach_type_strings[attach_type], strerror(-err));
+		return err;
+	}
+
+	if (json_output)
+		jsonw_null(json_wtr);
+
+	return 0;
+}
+
 static int do_show(int argc, char **argv)
 {
 	struct bpf_attach_info attach_info = {};
@@ -232,13 +352,9 @@ static int do_show(int argc, char **argv)
 	char err_buf[256];
 
 	if (argc == 2) {
-		if (strcmp(argv[0], "dev") != 0)
-			usage();
-		filter_idx = if_nametoindex(argv[1]);
-		if (filter_idx == 0) {
-			fprintf(stderr, "invalid dev name %s\n", argv[1]);
+		filter_idx = net_parse_dev(&argc, &argv);
+		if (filter_idx < 1)
 			return -1;
-		}
 	} else if (argc != 0) {
 		usage();
 	}
@@ -305,13 +421,18 @@ static int do_help(int argc, char **argv)
 
 	fprintf(stderr,
 		"Usage: %s %s { show | list } [dev <devname>]\n"
+		"       %s %s attach ATTACH_TYPE PROG dev <devname> [ overwrite ]\n"
 		"       %s %s help\n"
+		"\n"
+		"       " HELP_SPEC_PROGRAM "\n"
+		"       ATTACH_TYPE := { xdp | xdpgeneric | xdpdrv | xdpoffload }\n"
+		"\n"
 		"Note: Only xdp and tc attachments are supported now.\n"
 		"      For progs attached to cgroups, use \"bpftool cgroup\"\n"
 		"      to dump program attachments. For program types\n"
 		"      sk_{filter,skb,msg,reuseport} and lwt/seg6, please\n"
 		"      consult iproute2.\n",
-		bin_name, argv[-2], bin_name, argv[-2]);
+		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2]);
 
 	return 0;
 }
@@ -319,6 +440,7 @@ static int do_help(int argc, char **argv)
 static const struct cmd cmds[] = {
 	{ "show",	do_show },
 	{ "list",	do_show },
+	{ "attach",	do_attach },
 	{ "help",	do_help },
 	{ 0 }
 };
-- 
2.20.1


^ permalink raw reply related

* [v5,2/4] tools: bpftool: add net detach command to detach XDP on interface
From: Daniel T. Lee @ 2019-08-13  2:46 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov; +Cc: netdev, Yonghong Song
In-Reply-To: <20190813024621.29886-1-danieltimlee@gmail.com>

By this commit, using `bpftool net detach`, the attached XDP prog can
be detached. Detaching the BPF prog will be done through libbpf
'bpf_set_link_xdp_fd' with the progfd set to -1.

Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
---
 tools/bpf/bpftool/net.c | 42 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
index 33222ca1060e..a213a9b7f69c 100644
--- a/tools/bpf/bpftool/net.c
+++ b/tools/bpf/bpftool/net.c
@@ -343,6 +343,43 @@ static int do_attach(int argc, char **argv)
 	return 0;
 }
 
+static int do_detach(int argc, char **argv)
+{
+	enum net_attach_type attach_type;
+	int progfd, ifindex, err = 0;
+
+	/* parse detach args */
+	if (!REQ_ARGS(3))
+		return -EINVAL;
+
+	attach_type = parse_attach_type(*argv);
+	if (attach_type == net_attach_type_size) {
+		p_err("invalid net attach/detach type: %s", *argv);
+		return -EINVAL;
+	}
+	NEXT_ARG();
+
+	ifindex = net_parse_dev(&argc, &argv);
+	if (ifindex < 1)
+		return -EINVAL;
+
+	/* detach xdp prog */
+	progfd = -1;
+	if (is_prefix("xdp", attach_type_strings[attach_type]))
+		err = do_attach_detach_xdp(progfd, attach_type, ifindex, NULL);
+
+	if (err < 0) {
+		p_err("interface %s detach failed: %s",
+		      attach_type_strings[attach_type], strerror(-err));
+		return err;
+	}
+
+	if (json_output)
+		jsonw_null(json_wtr);
+
+	return 0;
+}
+
 static int do_show(int argc, char **argv)
 {
 	struct bpf_attach_info attach_info = {};
@@ -422,6 +459,7 @@ static int do_help(int argc, char **argv)
 	fprintf(stderr,
 		"Usage: %s %s { show | list } [dev <devname>]\n"
 		"       %s %s attach ATTACH_TYPE PROG dev <devname> [ overwrite ]\n"
+		"       %s %s detach ATTACH_TYPE dev <devname>\n"
 		"       %s %s help\n"
 		"\n"
 		"       " HELP_SPEC_PROGRAM "\n"
@@ -432,7 +470,8 @@ static int do_help(int argc, char **argv)
 		"      to dump program attachments. For program types\n"
 		"      sk_{filter,skb,msg,reuseport} and lwt/seg6, please\n"
 		"      consult iproute2.\n",
-		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2]);
+		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2],
+		bin_name, argv[-2]);
 
 	return 0;
 }
@@ -441,6 +480,7 @@ static const struct cmd cmds[] = {
 	{ "show",	do_show },
 	{ "list",	do_show },
 	{ "attach",	do_attach },
+	{ "detach",	do_detach },
 	{ "help",	do_help },
 	{ 0 }
 };
-- 
2.20.1


^ permalink raw reply related

* [v5,3/4] tools: bpftool: add bash-completion for net attach/detach
From: Daniel T. Lee @ 2019-08-13  2:46 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov; +Cc: netdev
In-Reply-To: <20190813024621.29886-1-danieltimlee@gmail.com>

This commit adds bash-completion for new "net attach/detach"
subcommand for attaching XDP program on interface.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
---
 tools/bpf/bpftool/bash-completion/bpftool | 65 +++++++++++++++++++----
 1 file changed, 55 insertions(+), 10 deletions(-)

diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
index c8f42e1fcbc9..dbfcf50d8215 100644
--- a/tools/bpf/bpftool/bash-completion/bpftool
+++ b/tools/bpf/bpftool/bash-completion/bpftool
@@ -201,6 +201,10 @@ _bpftool()
             _bpftool_get_prog_tags
             return 0
             ;;
+        dev)
+            _sysfs_get_netdevs
+            return 0
+            ;;
         file|pinned)
             _filedir
             return 0
@@ -399,10 +403,6 @@ _bpftool()
                             _filedir
                             return 0
                             ;;
-                        dev)
-                            _sysfs_get_netdevs
-                            return 0
-                            ;;
                         *)
                             COMPREPLY=( $( compgen -W "map" -- "$cur" ) )
                             _bpftool_once_attr 'type'
@@ -498,10 +498,6 @@ _bpftool()
                         key|value|flags|name|entries)
                             return 0
                             ;;
-                        dev)
-                            _sysfs_get_netdevs
-                            return 0
-                            ;;
                         *)
                             _bpftool_once_attr 'type'
                             _bpftool_once_attr 'key'
@@ -775,18 +771,67 @@ _bpftool()
             esac
             ;;
         net)
+            local PROG_TYPE='id pinned tag'
+            local ATTACH_TYPES='xdp xdpgeneric xdpdrv xdpoffload'
             case $command in
+                show|list)
+                    [[ $prev != "$command" ]] && return 0
+                    COMPREPLY=( $( compgen -W 'dev' -- "$cur" ) )
+                    return 0
+                    ;;
+                attach)
+                    case $cword in
+                        3)
+                            COMPREPLY=( $( compgen -W "$ATTACH_TYPES" -- "$cur" ) )
+                            return 0
+                            ;;
+                        4)
+                            COMPREPLY=( $( compgen -W "$PROG_TYPE" -- "$cur" ) )
+                            return 0
+                            ;;
+                        5)
+                            case $prev in
+                                id)
+                                    _bpftool_get_prog_ids
+                                    ;;
+                                pinned)
+                                    _filedir
+                                    ;;
+                            esac
+                            return 0
+                            ;;
+                        6)
+                            COMPREPLY=( $( compgen -W 'dev' -- "$cur" ) )
+                            return 0
+                            ;;
+                        8)
+                            _bpftool_once_attr 'overwrite'
+                            return 0
+                            ;;
+                    esac
+                    ;;
+                detach)
+                    case $cword in
+                        3)
+                            COMPREPLY=( $( compgen -W "$ATTACH_TYPES" -- "$cur" ) )
+                            return 0
+                            ;;
+                        4)
+                            COMPREPLY=( $( compgen -W 'dev' -- "$cur" ) )
+                            return 0
+                            ;;
+                    esac
+                    ;;
                 *)
                     [[ $prev == $object ]] && \
                         COMPREPLY=( $( compgen -W 'help \
-                            show list' -- "$cur" ) )
+                            show list attach detach' -- "$cur" ) )
                     ;;
             esac
             ;;
         feature)
             case $command in
                 probe)
-                    [[ $prev == "dev" ]] && _sysfs_get_netdevs && return 0
                     [[ $prev == "prefix" ]] && return 0
                     if _bpftool_search_list 'macros'; then
                         COMPREPLY+=( $( compgen -W 'prefix' -- "$cur" ) )
-- 
2.20.1


^ permalink raw reply related

* [v5,4/4] tools: bpftool: add documentation for net attach/detach
From: Daniel T. Lee @ 2019-08-13  2:46 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov; +Cc: netdev
In-Reply-To: <20190813024621.29886-1-danieltimlee@gmail.com>

Since, new sub-command 'net attach/detach' has been added for
attaching XDP program on interface,
this commit documents usage and sample output of `net attach/detach`.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
---
 .../bpf/bpftool/Documentation/bpftool-net.rst | 57 ++++++++++++++++++-
 1 file changed, 54 insertions(+), 3 deletions(-)

diff --git a/tools/bpf/bpftool/Documentation/bpftool-net.rst b/tools/bpf/bpftool/Documentation/bpftool-net.rst
index d8e5237a2085..8651b00b81ea 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-net.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-net.rst
@@ -15,17 +15,22 @@ SYNOPSIS
 	*OPTIONS* := { [{ **-j** | **--json** }] [{ **-p** | **--pretty** }] }
 
 	*COMMANDS* :=
-	{ **show** | **list** } [ **dev** name ] | **help**
+	{ **show** | **list** | **attach** | **detach** | **help** }
 
 NET COMMANDS
 ============
 
-|	**bpftool** **net { show | list } [ dev name ]**
+|	**bpftool** **net { show | list }** [ **dev** *NAME* ]
+|	**bpftool** **net attach** *ATTACH_TYPE* *PROG* **dev** *NAME* [ **overwrite** ]
+|	**bpftool** **net detach** *ATTACH_TYPE* **dev** *NAME*
 |	**bpftool** **net help**
+|
+|	*PROG* := { **id** *PROG_ID* | **pinned** *FILE* | **tag** *PROG_TAG* }
+|	*ATTACH_TYPE* := { **xdp** | **xdpgeneric** | **xdpdrv** | **xdpoffload** }
 
 DESCRIPTION
 ===========
-	**bpftool net { show | list } [ dev name ]**
+	**bpftool net { show | list }** [ **dev** *NAME* ]
                   List bpf program attachments in the kernel networking subsystem.
 
                   Currently, only device driver xdp attachments and tc filter
@@ -47,6 +52,24 @@ DESCRIPTION
                   all bpf programs attached to non clsact qdiscs, and finally all
                   bpf programs attached to root and clsact qdisc.
 
+	**bpftool** **net attach** *ATTACH_TYPE* *PROG* **dev** *NAME* [ **overwrite** ]
+                  Attach bpf program *PROG* to network interface *NAME* with
+                  type specified by *ATTACH_TYPE*. Previously attached bpf program
+                  can be replaced by the command used with **overwrite** option.
+                  Currently, only XDP-related modes are supported for *ATTACH_TYPE*.
+
+                  *ATTACH_TYPE* can be of:
+                  **xdp** - try native XDP and fallback to generic XDP if NIC driver does not support it;
+                  **xdpgeneric** - Generic XDP. runs at generic XDP hook when packet already enters receive path as skb;
+                  **xdpdrv** - Native XDP. runs earliest point in driver's receive path;
+                  **xdpoffload** - Offload XDP. runs directly on NIC on each packet reception;
+
+	**bpftool** **net detach** *ATTACH_TYPE* **dev** *NAME*
+                  Detach bpf program attached to network interface *NAME* with
+                  type specified by *ATTACH_TYPE*. To detach bpf program, same
+                  *ATTACH_TYPE* previously used for attach must be specified.
+                  Currently, only XDP-related modes are supported for *ATTACH_TYPE*.
+
 	**bpftool net help**
 		  Print short help message.
 
@@ -137,6 +160,34 @@ EXAMPLES
         }
     ]
 
+|
+| **# bpftool net attach xdpdrv id 16 dev enp6s0np0**
+| **# bpftool net**
+
+::
+
+      xdp:
+      enp6s0np0(4) driver id 16
+
+|
+| **# bpftool net attach xdpdrv id 16 dev enp6s0np0**
+| **# bpftool net attach xdpdrv id 20 dev enp6s0np0 overwrite**
+| **# bpftool net**
+
+::
+
+      xdp:
+      enp6s0np0(4) driver id 20
+
+|
+| **# bpftool net attach xdpdrv id 16 dev enp6s0np0**
+| **# bpftool net detach xdpdrv dev enp6s0np0**
+| **# bpftool net**
+
+::
+
+      xdp:
+
 
 SEE ALSO
 ========
-- 
2.20.1


^ permalink raw reply related

* [v2, 0/4] ocelot: support PTP Ethernet frames trapping
From: Yangbo Lu @ 2019-08-13  2:52 UTC (permalink / raw)
  To: netdev, David S . Miller, Allan W . Nielsen, Alexandre Belloni,
	Microchip Linux Driver Support
  Cc: Yangbo Lu

This patch-set is to support PTP Ethernet frames trapping.
Before that, fix some issues and improve the ocelot_ace driver
for using.

---
Changes for v2:
	- Added PTP Ethernet frames trapping support patch.

Yangbo Lu (4):
  ocelot_ace: drop member port from ocelot_ace_rule structure
  ocelot_ace: fix ingress ports setting for rule
  ocelot_ace: fix action of trap
  ocelot: add VCAP IS2 rule to trap PTP Ethernet frames

 drivers/net/ethernet/mscc/ocelot.c        | 28 ++++++++++++++++++++++++++++
 drivers/net/ethernet/mscc/ocelot_ace.c    | 20 ++++++++++----------
 drivers/net/ethernet/mscc/ocelot_ace.h    |  4 ++--
 drivers/net/ethernet/mscc/ocelot_flower.c |  8 ++++----
 4 files changed, 44 insertions(+), 16 deletions(-)

-- 
2.7.4


^ permalink raw reply

* [v2, 1/4] ocelot_ace: drop member port from ocelot_ace_rule structure
From: Yangbo Lu @ 2019-08-13  2:52 UTC (permalink / raw)
  To: netdev, David S . Miller, Allan W . Nielsen, Alexandre Belloni,
	Microchip Linux Driver Support
  Cc: Yangbo Lu
In-Reply-To: <20190813025214.18601-1-yangbo.lu@nxp.com>

The ocelot_ace_rule is not port specific. We don't need a member port
in ocelot_ace_rule structure. Drop it and use member ocelot instead.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
---
Changes for v2:
	- None.
---
 drivers/net/ethernet/mscc/ocelot_ace.c    | 12 ++++++------
 drivers/net/ethernet/mscc/ocelot_ace.h    |  2 +-
 drivers/net/ethernet/mscc/ocelot_flower.c |  6 +++---
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mscc/ocelot_ace.c b/drivers/net/ethernet/mscc/ocelot_ace.c
index 39aca1a..5580a58 100644
--- a/drivers/net/ethernet/mscc/ocelot_ace.c
+++ b/drivers/net/ethernet/mscc/ocelot_ace.c
@@ -576,7 +576,7 @@ static void is2_entry_set(struct ocelot *ocelot, int ix,
 
 static void is2_entry_get(struct ocelot_ace_rule *rule, int ix)
 {
-	struct ocelot *op = rule->port->ocelot;
+	struct ocelot *op = rule->ocelot;
 	struct vcap_data data;
 	int row = (ix / 2);
 	u32 cnt;
@@ -655,11 +655,11 @@ int ocelot_ace_rule_offload_add(struct ocelot_ace_rule *rule)
 	/* Move down the rules to make place for the new rule */
 	for (i = acl_block->count - 1; i > index; i--) {
 		ace = ocelot_ace_rule_get_rule_index(acl_block, i);
-		is2_entry_set(rule->port->ocelot, i, ace);
+		is2_entry_set(rule->ocelot, i, ace);
 	}
 
 	/* Now insert the new rule */
-	is2_entry_set(rule->port->ocelot, index, rule);
+	is2_entry_set(rule->ocelot, index, rule);
 	return 0;
 }
 
@@ -697,11 +697,11 @@ int ocelot_ace_rule_offload_del(struct ocelot_ace_rule *rule)
 	/* Move up all the blocks over the deleted rule */
 	for (i = index; i < acl_block->count; i++) {
 		ace = ocelot_ace_rule_get_rule_index(acl_block, i);
-		is2_entry_set(rule->port->ocelot, i, ace);
+		is2_entry_set(rule->ocelot, i, ace);
 	}
 
 	/* Now delete the last rule, because it is duplicated */
-	is2_entry_set(rule->port->ocelot, acl_block->count, &del_ace);
+	is2_entry_set(rule->ocelot, acl_block->count, &del_ace);
 
 	return 0;
 }
@@ -717,7 +717,7 @@ int ocelot_ace_rule_stats_update(struct ocelot_ace_rule *rule)
 	/* After we get the result we need to clear the counters */
 	tmp = ocelot_ace_rule_get_rule_index(acl_block, index);
 	tmp->stats.pkts = 0;
-	is2_entry_set(rule->port->ocelot, index, tmp);
+	is2_entry_set(rule->ocelot, index, tmp);
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/mscc/ocelot_ace.h b/drivers/net/ethernet/mscc/ocelot_ace.h
index e98944c..ce72f02 100644
--- a/drivers/net/ethernet/mscc/ocelot_ace.h
+++ b/drivers/net/ethernet/mscc/ocelot_ace.h
@@ -186,7 +186,7 @@ struct ocelot_ace_stats {
 
 struct ocelot_ace_rule {
 	struct list_head list;
-	struct ocelot_port *port;
+	struct ocelot *ocelot;
 
 	u16 prio;
 	u32 id;
diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c b/drivers/net/ethernet/mscc/ocelot_flower.c
index 59487d4..7c60e8c 100644
--- a/drivers/net/ethernet/mscc/ocelot_flower.c
+++ b/drivers/net/ethernet/mscc/ocelot_flower.c
@@ -183,7 +183,7 @@ struct ocelot_ace_rule *ocelot_ace_rule_create(struct flow_cls_offload *f,
 	if (!rule)
 		return NULL;
 
-	rule->port = block->port;
+	rule->ocelot = block->port->ocelot;
 	rule->chip_port = block->port->chip_port;
 	return rule;
 }
@@ -219,7 +219,7 @@ static int ocelot_flower_destroy(struct flow_cls_offload *f,
 	int ret;
 
 	rule.prio = get_prio(f->common.prio);
-	rule.port = port_block->port;
+	rule.ocelot = port_block->port->ocelot;
 	rule.id = f->cookie;
 
 	ret = ocelot_ace_rule_offload_del(&rule);
@@ -237,7 +237,7 @@ static int ocelot_flower_stats_update(struct flow_cls_offload *f,
 	int ret;
 
 	rule.prio = get_prio(f->common.prio);
-	rule.port = port_block->port;
+	rule.ocelot = port_block->port->ocelot;
 	rule.id = f->cookie;
 	ret = ocelot_ace_rule_stats_update(&rule);
 	if (ret)
-- 
2.7.4


^ permalink raw reply related

* [v2, 4/4] ocelot: add VCAP IS2 rule to trap PTP Ethernet frames
From: Yangbo Lu @ 2019-08-13  2:52 UTC (permalink / raw)
  To: netdev, David S . Miller, Allan W . Nielsen, Alexandre Belloni,
	Microchip Linux Driver Support
  Cc: Yangbo Lu
In-Reply-To: <20190813025214.18601-1-yangbo.lu@nxp.com>

All the PTP messages over Ethernet have etype 0x88f7 on them.
Use etype as the key to trap PTP messages.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
---
Changes for v2:
	- Added this patch.
---
 drivers/net/ethernet/mscc/ocelot.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index 6932e61..40f4e0d 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -1681,6 +1681,33 @@ int ocelot_probe_port(struct ocelot *ocelot, u8 port,
 }
 EXPORT_SYMBOL(ocelot_probe_port);
 
+static int ocelot_ace_add_ptp_rule(struct ocelot *ocelot)
+{
+	struct ocelot_ace_rule *rule;
+
+	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
+	if (!rule)
+		return -ENOMEM;
+
+	/* Entry for PTP over Ethernet (etype 0x88f7)
+	 * Action: trap to CPU port
+	 */
+	rule->ocelot = ocelot;
+	rule->prio = 1;
+	rule->type = OCELOT_ACE_TYPE_ETYPE;
+	/* Available on all ingress port except CPU port */
+	rule->ingress_port = ~BIT(ocelot->num_phys_ports);
+	rule->dmac_mc = OCELOT_VCAP_BIT_1;
+	rule->frame.etype.etype.value[0] = 0x88;
+	rule->frame.etype.etype.value[1] = 0xf7;
+	rule->frame.etype.etype.mask[0] = 0xff;
+	rule->frame.etype.etype.mask[1] = 0xff;
+	rule->action = OCELOT_ACL_ACTION_TRAP;
+
+	ocelot_ace_rule_offload_add(rule);
+	return 0;
+}
+
 int ocelot_init(struct ocelot *ocelot)
 {
 	u32 port;
@@ -1708,6 +1735,7 @@ int ocelot_init(struct ocelot *ocelot)
 	ocelot_mact_init(ocelot);
 	ocelot_vlan_init(ocelot);
 	ocelot_ace_init(ocelot);
+	ocelot_ace_add_ptp_rule(ocelot);
 
 	for (port = 0; port < ocelot->num_phys_ports; port++) {
 		/* Clear all counters (5 groups) */
-- 
2.7.4


^ permalink raw reply related

* [v2, 3/4] ocelot_ace: fix action of trap
From: Yangbo Lu @ 2019-08-13  2:52 UTC (permalink / raw)
  To: netdev, David S . Miller, Allan W . Nielsen, Alexandre Belloni,
	Microchip Linux Driver Support
  Cc: Yangbo Lu
In-Reply-To: <20190813025214.18601-1-yangbo.lu@nxp.com>

The trap action should be copying the frame to CPU and
dropping it for forwarding, but current setting was just
copying frame to CPU.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
---
Changes for v2:
	- None.
---
 drivers/net/ethernet/mscc/ocelot_ace.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mscc/ocelot_ace.c b/drivers/net/ethernet/mscc/ocelot_ace.c
index 91250f3..59ad590 100644
--- a/drivers/net/ethernet/mscc/ocelot_ace.c
+++ b/drivers/net/ethernet/mscc/ocelot_ace.c
@@ -317,9 +317,9 @@ static void is2_action_set(struct vcap_data *data,
 		break;
 	case OCELOT_ACL_ACTION_TRAP:
 		VCAP_ACT_SET(PORT_MASK, 0x0);
-		VCAP_ACT_SET(MASK_MODE, 0x0);
-		VCAP_ACT_SET(POLICE_ENA, 0x0);
-		VCAP_ACT_SET(POLICE_IDX, 0x0);
+		VCAP_ACT_SET(MASK_MODE, 0x1);
+		VCAP_ACT_SET(POLICE_ENA, 0x1);
+		VCAP_ACT_SET(POLICE_IDX, OCELOT_POLICER_DISCARD);
 		VCAP_ACT_SET(CPU_QU_NUM, 0x0);
 		VCAP_ACT_SET(CPU_COPY_ENA, 0x1);
 		break;
-- 
2.7.4


^ permalink raw reply related

* [v2, 2/4] ocelot_ace: fix ingress ports setting for rule
From: Yangbo Lu @ 2019-08-13  2:52 UTC (permalink / raw)
  To: netdev, David S . Miller, Allan W . Nielsen, Alexandre Belloni,
	Microchip Linux Driver Support
  Cc: Yangbo Lu
In-Reply-To: <20190813025214.18601-1-yangbo.lu@nxp.com>

The ingress ports setting of rule should support covering all ports.
This patch is to use u16 ingress_port for ingress port mask setting
for ace rule. One bit corresponds one port.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
---
Changes for v2:
	- None.
---
 drivers/net/ethernet/mscc/ocelot_ace.c    | 2 +-
 drivers/net/ethernet/mscc/ocelot_ace.h    | 2 +-
 drivers/net/ethernet/mscc/ocelot_flower.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mscc/ocelot_ace.c b/drivers/net/ethernet/mscc/ocelot_ace.c
index 5580a58..91250f3 100644
--- a/drivers/net/ethernet/mscc/ocelot_ace.c
+++ b/drivers/net/ethernet/mscc/ocelot_ace.c
@@ -352,7 +352,7 @@ static void is2_entry_set(struct ocelot *ocelot, int ix,
 	data.type = IS2_ACTION_TYPE_NORMAL;
 
 	VCAP_KEY_ANY_SET(PAG);
-	VCAP_KEY_SET(IGR_PORT_MASK, 0, ~BIT(ace->chip_port));
+	VCAP_KEY_SET(IGR_PORT_MASK, 0, ~ace->ingress_port);
 	VCAP_KEY_BIT_SET(FIRST, OCELOT_VCAP_BIT_1);
 	VCAP_KEY_BIT_SET(HOST_MATCH, OCELOT_VCAP_BIT_ANY);
 	VCAP_KEY_BIT_SET(L2_MC, ace->dmac_mc);
diff --git a/drivers/net/ethernet/mscc/ocelot_ace.h b/drivers/net/ethernet/mscc/ocelot_ace.h
index ce72f02..0fe23e0 100644
--- a/drivers/net/ethernet/mscc/ocelot_ace.h
+++ b/drivers/net/ethernet/mscc/ocelot_ace.h
@@ -193,7 +193,7 @@ struct ocelot_ace_rule {
 
 	enum ocelot_ace_action action;
 	struct ocelot_ace_stats stats;
-	int chip_port;
+	u16 ingress_port;
 
 	enum ocelot_vcap_bit dmac_mc;
 	enum ocelot_vcap_bit dmac_bc;
diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c b/drivers/net/ethernet/mscc/ocelot_flower.c
index 7c60e8c..bfddc50 100644
--- a/drivers/net/ethernet/mscc/ocelot_flower.c
+++ b/drivers/net/ethernet/mscc/ocelot_flower.c
@@ -184,7 +184,7 @@ struct ocelot_ace_rule *ocelot_ace_rule_create(struct flow_cls_offload *f,
 		return NULL;
 
 	rule->ocelot = block->port->ocelot;
-	rule->chip_port = block->port->chip_port;
+	rule->ingress_port = BIT(block->port->chip_port);
 	return rule;
 }
 
-- 
2.7.4


^ permalink raw reply related

* [PATCH net-next v2 1/5] r8152: separate the rx buffer size
From: Hayes Wang @ 2019-08-13  3:42 UTC (permalink / raw)
  To: netdev; +Cc: nic_swsd, linux-kernel, linux-usb, Hayes Wang
In-Reply-To: <1394712342-15778-295-albertk@realtek.com>

The different chips may accept different rx buffer sizes. The RTL8152
supports 16K bytes, and RTL8153 support 32K bytes.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
 drivers/net/usb/r8152.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 0cc03a9ff545..94da79028a65 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -749,6 +749,7 @@ struct r8152 {
 	u32 msg_enable;
 	u32 tx_qlen;
 	u32 coalesce;
+	u32 rx_buf_sz;
 	u16 ocp_base;
 	u16 speed;
 	u8 *intr_buff;
@@ -1516,13 +1517,13 @@ static int alloc_all_mem(struct r8152 *tp)
 	skb_queue_head_init(&tp->rx_queue);
 
 	for (i = 0; i < RTL8152_MAX_RX; i++) {
-		buf = kmalloc_node(agg_buf_sz, GFP_KERNEL, node);
+		buf = kmalloc_node(tp->rx_buf_sz, GFP_KERNEL, node);
 		if (!buf)
 			goto err1;
 
 		if (buf != rx_agg_align(buf)) {
 			kfree(buf);
-			buf = kmalloc_node(agg_buf_sz + RX_ALIGN, GFP_KERNEL,
+			buf = kmalloc_node(tp->rx_buf_sz + RX_ALIGN, GFP_KERNEL,
 					   node);
 			if (!buf)
 				goto err1;
@@ -2113,7 +2114,7 @@ int r8152_submit_rx(struct r8152 *tp, struct rx_agg *agg, gfp_t mem_flags)
 		return 0;
 
 	usb_fill_bulk_urb(agg->urb, tp->udev, usb_rcvbulkpipe(tp->udev, 1),
-			  agg->head, agg_buf_sz,
+			  agg->head, tp->rx_buf_sz,
 			  (usb_complete_t)read_bulk_callback, agg);
 
 	ret = usb_submit_urb(agg->urb, mem_flags);
@@ -2447,7 +2448,7 @@ static void r8153_set_rx_early_timeout(struct r8152 *tp)
 
 static void r8153_set_rx_early_size(struct r8152 *tp)
 {
-	u32 ocp_data = agg_buf_sz - rx_reserved_size(tp->netdev->mtu);
+	u32 ocp_data = tp->rx_buf_sz - rx_reserved_size(tp->netdev->mtu);
 
 	switch (tp->version) {
 	case RTL_VER_03:
@@ -5115,6 +5116,7 @@ static int rtl_ops_init(struct r8152 *tp)
 		ops->in_nway		= rtl8152_in_nway;
 		ops->hw_phy_cfg		= r8152b_hw_phy_cfg;
 		ops->autosuspend_en	= rtl_runtime_suspend_enable;
+		tp->rx_buf_sz		= 16 * 1024;
 		break;
 
 	case RTL_VER_03:
@@ -5132,6 +5134,7 @@ static int rtl_ops_init(struct r8152 *tp)
 		ops->in_nway		= rtl8153_in_nway;
 		ops->hw_phy_cfg		= r8153_hw_phy_cfg;
 		ops->autosuspend_en	= rtl8153_runtime_enable;
+		tp->rx_buf_sz		= 32 * 1024;
 		break;
 
 	case RTL_VER_08:
@@ -5147,6 +5150,7 @@ static int rtl_ops_init(struct r8152 *tp)
 		ops->in_nway		= rtl8153_in_nway;
 		ops->hw_phy_cfg		= r8153b_hw_phy_cfg;
 		ops->autosuspend_en	= rtl8153b_runtime_enable;
+		tp->rx_buf_sz		= 32 * 1024;
 		break;
 
 	default:
-- 
2.21.0


^ permalink raw reply related

* [PATCH net-next v2 5/5] r8152: change rx_copybreak and rx_pending through ethtool
From: Hayes Wang @ 2019-08-13  3:42 UTC (permalink / raw)
  To: netdev; +Cc: nic_swsd, linux-kernel, linux-usb, Hayes Wang
In-Reply-To: <1394712342-15778-295-albertk@realtek.com>

Let the rx_copybreak and rx_pending could be modified by
ethtool.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
 drivers/net/usb/r8152.c | 91 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 86 insertions(+), 5 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 2ae04522cd5a..40d18e866269 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -26,7 +26,7 @@
 #include <linux/acpi.h>
 
 /* Information for net-next */
-#define NETNEXT_VERSION		"09"
+#define NETNEXT_VERSION		"10"
 
 /* Information for net */
 #define NET_VERSION		"10"
@@ -584,7 +584,7 @@ enum rtl_register_content {
 #define TX_ALIGN		4
 #define RX_ALIGN		8
 
-#define RTL8152_MAX_RX_AGG	(10 * RTL8152_MAX_RX)
+#define RTL8152_RX_MAX_PENDING	4096
 #define RTL8152_RXFG_HEADSZ	256
 
 #define INTR_LINK		0x0004
@@ -756,6 +756,9 @@ struct r8152 {
 	u32 tx_qlen;
 	u32 coalesce;
 	u32 rx_buf_sz;
+	u32 rx_copybreak;
+	u32 rx_pending;
+
 	u16 ocp_base;
 	u16 speed;
 	u8 *intr_buff;
@@ -1984,7 +1987,7 @@ static struct rx_agg *rtl_get_free_rx(struct r8152 *tp, gfp_t mflags)
 
 	spin_unlock_irqrestore(&tp->rx_lock, flags);
 
-	if (!agg_free && atomic_read(&tp->rx_count) < RTL8152_MAX_RX_AGG)
+	if (!agg_free && atomic_read(&tp->rx_count) < tp->rx_pending)
 		agg_free = alloc_rx_agg(tp, mflags);
 
 	return agg_free;
@@ -2064,10 +2067,10 @@ static int rx_bottom(struct r8152 *tp, int budget)
 			pkt_len -= ETH_FCS_LEN;
 			rx_data += sizeof(struct rx_desc);
 
-			if (!agg_free || RTL8152_RXFG_HEADSZ > pkt_len)
+			if (!agg_free || tp->rx_copybreak > pkt_len)
 				rx_frag_head_sz = pkt_len;
 			else
-				rx_frag_head_sz = RTL8152_RXFG_HEADSZ;
+				rx_frag_head_sz = tp->rx_copybreak;
 
 			skb = napi_alloc_skb(napi, rx_frag_head_sz);
 			if (!skb) {
@@ -5104,6 +5107,77 @@ static int rtl8152_set_coalesce(struct net_device *netdev,
 	return ret;
 }
 
+static int rtl8152_get_tunable(struct net_device *netdev,
+			       const struct ethtool_tunable *tunable, void *d)
+{
+	struct r8152 *tp = netdev_priv(netdev);
+
+	switch (tunable->id) {
+	case ETHTOOL_RX_COPYBREAK:
+		*(u32 *)d = tp->rx_copybreak;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static int rtl8152_set_tunable(struct net_device *netdev,
+			       const struct ethtool_tunable *tunable,
+			       const void *d)
+{
+	struct r8152 *tp = netdev_priv(netdev);
+	u32 val;
+
+	switch (tunable->id) {
+	case ETHTOOL_RX_COPYBREAK:
+		val = *(u32 *)d;
+		if (val < ETH_ZLEN) {
+			netif_err(tp, rx_err, netdev,
+				  "Invalid rx copy break value\n");
+			return -EINVAL;
+		}
+
+		if (tp->rx_copybreak != val) {
+			napi_disable(&tp->napi);
+			tp->rx_copybreak = val;
+			napi_enable(&tp->napi);
+		}
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static void rtl8152_get_ringparam(struct net_device *netdev,
+				  struct ethtool_ringparam *ring)
+{
+	struct r8152 *tp = netdev_priv(netdev);
+
+	ring->rx_max_pending = RTL8152_RX_MAX_PENDING;
+	ring->rx_pending = tp->rx_pending;
+}
+
+static int rtl8152_set_ringparam(struct net_device *netdev,
+				 struct ethtool_ringparam *ring)
+{
+	struct r8152 *tp = netdev_priv(netdev);
+
+	if (ring->rx_pending < (RTL8152_MAX_RX * 2))
+		return -EINVAL;
+
+	if (tp->rx_pending != ring->rx_pending) {
+		napi_disable(&tp->napi);
+		tp->rx_pending = ring->rx_pending;
+		napi_enable(&tp->napi);
+	}
+
+	return 0;
+}
+
 static const struct ethtool_ops ops = {
 	.get_drvinfo = rtl8152_get_drvinfo,
 	.get_link = ethtool_op_get_link,
@@ -5121,6 +5195,10 @@ static const struct ethtool_ops ops = {
 	.set_eee = rtl_ethtool_set_eee,
 	.get_link_ksettings = rtl8152_get_link_ksettings,
 	.set_link_ksettings = rtl8152_set_link_ksettings,
+	.get_tunable = rtl8152_get_tunable,
+	.set_tunable = rtl8152_set_tunable,
+	.get_ringparam = rtl8152_get_ringparam,
+	.set_ringparam = rtl8152_set_ringparam,
 };
 
 static int rtl8152_ioctl(struct net_device *netdev, struct ifreq *rq, int cmd)
@@ -5474,6 +5552,9 @@ static int rtl8152_probe(struct usb_interface *intf,
 	tp->speed = tp->mii.supports_gmii ? SPEED_1000 : SPEED_100;
 	tp->duplex = DUPLEX_FULL;
 
+	tp->rx_copybreak = RTL8152_RXFG_HEADSZ;
+	tp->rx_pending = 10 * RTL8152_MAX_RX;
+
 	intf->needs_remote_wakeup = 1;
 
 	tp->rtl_ops.init(tp);
-- 
2.21.0


^ permalink raw reply related

* [PATCH net-next v2 2/5] r8152: replace array with linking list for rx information
From: Hayes Wang @ 2019-08-13  3:42 UTC (permalink / raw)
  To: netdev; +Cc: nic_swsd, linux-kernel, linux-usb, Hayes Wang
In-Reply-To: <1394712342-15778-295-albertk@realtek.com>

The original method uses an array to store the rx information. The
new one uses a list to link each rx structure. Then, it is possible
to increase/decrease the number of rx structure dynamically.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
 drivers/net/usb/r8152.c | 182 +++++++++++++++++++++++++++-------------
 1 file changed, 125 insertions(+), 57 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 94da79028a65..d063c9b358e5 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -22,6 +22,7 @@
 #include <linux/mdio.h>
 #include <linux/usb/cdc.h>
 #include <linux/suspend.h>
+#include <linux/atomic.h>
 #include <linux/acpi.h>
 
 /* Information for net-next */
@@ -694,7 +695,7 @@ struct tx_desc {
 struct r8152;
 
 struct rx_agg {
-	struct list_head list;
+	struct list_head list, info_list;
 	struct urb *urb;
 	struct r8152 *context;
 	void *buffer;
@@ -719,7 +720,7 @@ struct r8152 {
 	struct net_device *netdev;
 	struct urb *intr_urb;
 	struct tx_agg tx_info[RTL8152_MAX_TX];
-	struct rx_agg rx_info[RTL8152_MAX_RX];
+	struct list_head rx_info;
 	struct list_head rx_done, tx_free;
 	struct sk_buff_head tx_queue, rx_queue;
 	spinlock_t rx_lock, tx_lock;
@@ -744,6 +745,8 @@ struct r8152 {
 		void (*autosuspend_en)(struct r8152 *tp, bool enable);
 	} rtl_ops;
 
+	atomic_t rx_count;
+
 	int intr_interval;
 	u32 saved_wolopts;
 	u32 msg_enable;
@@ -1468,18 +1471,81 @@ static inline void *tx_agg_align(void *data)
 	return (void *)ALIGN((uintptr_t)data, TX_ALIGN);
 }
 
+static void free_rx_agg(struct r8152 *tp, struct rx_agg *agg)
+{
+	list_del(&agg->info_list);
+
+	usb_free_urb(agg->urb);
+	kfree(agg->buffer);
+	kfree(agg);
+
+	atomic_dec(&tp->rx_count);
+}
+
+static struct rx_agg *alloc_rx_agg(struct r8152 *tp, gfp_t mflags)
+{
+	struct net_device *netdev = tp->netdev;
+	int node = netdev->dev.parent ? dev_to_node(netdev->dev.parent) : -1;
+	struct rx_agg *rx_agg;
+	unsigned long flags;
+	u8 *buf;
+
+	rx_agg = kmalloc_node(sizeof(*rx_agg), mflags, node);
+	if (!rx_agg)
+		return NULL;
+
+	buf = kmalloc_node(tp->rx_buf_sz, mflags, node);
+	if (!buf)
+		goto free_rx;
+
+	if (buf != rx_agg_align(buf)) {
+		kfree(buf);
+		buf = kmalloc_node(tp->rx_buf_sz + RX_ALIGN, mflags,
+				   node);
+		if (!buf)
+			goto free_rx;
+	}
+
+	rx_agg->buffer = buf;
+	rx_agg->head = rx_agg_align(buf);
+
+	rx_agg->urb = usb_alloc_urb(0, mflags);
+	if (!rx_agg->urb)
+		goto free_buf;
+
+	rx_agg->context = tp;
+
+	INIT_LIST_HEAD(&rx_agg->list);
+	INIT_LIST_HEAD(&rx_agg->info_list);
+	spin_lock_irqsave(&tp->rx_lock, flags);
+	list_add_tail(&rx_agg->info_list, &tp->rx_info);
+	spin_unlock_irqrestore(&tp->rx_lock, flags);
+
+	atomic_inc(&tp->rx_count);
+
+	return rx_agg;
+
+free_buf:
+	kfree(rx_agg->buffer);
+free_rx:
+	kfree(rx_agg);
+	return NULL;
+}
+
 static void free_all_mem(struct r8152 *tp)
 {
+	struct rx_agg *agg, *agg_next;
+	unsigned long flags;
 	int i;
 
-	for (i = 0; i < RTL8152_MAX_RX; i++) {
-		usb_free_urb(tp->rx_info[i].urb);
-		tp->rx_info[i].urb = NULL;
+	spin_lock_irqsave(&tp->rx_lock, flags);
 
-		kfree(tp->rx_info[i].buffer);
-		tp->rx_info[i].buffer = NULL;
-		tp->rx_info[i].head = NULL;
-	}
+	list_for_each_entry_safe(agg, agg_next, &tp->rx_info, info_list)
+		free_rx_agg(tp, agg);
+
+	spin_unlock_irqrestore(&tp->rx_lock, flags);
+
+	WARN_ON(atomic_read(&tp->rx_count));
 
 	for (i = 0; i < RTL8152_MAX_TX; i++) {
 		usb_free_urb(tp->tx_info[i].urb);
@@ -1503,46 +1569,28 @@ static int alloc_all_mem(struct r8152 *tp)
 	struct usb_interface *intf = tp->intf;
 	struct usb_host_interface *alt = intf->cur_altsetting;
 	struct usb_host_endpoint *ep_intr = alt->endpoint + 2;
-	struct urb *urb;
 	int node, i;
-	u8 *buf;
 
 	node = netdev->dev.parent ? dev_to_node(netdev->dev.parent) : -1;
 
 	spin_lock_init(&tp->rx_lock);
 	spin_lock_init(&tp->tx_lock);
+	INIT_LIST_HEAD(&tp->rx_info);
 	INIT_LIST_HEAD(&tp->tx_free);
 	INIT_LIST_HEAD(&tp->rx_done);
 	skb_queue_head_init(&tp->tx_queue);
 	skb_queue_head_init(&tp->rx_queue);
+	atomic_set(&tp->rx_count, 0);
 
 	for (i = 0; i < RTL8152_MAX_RX; i++) {
-		buf = kmalloc_node(tp->rx_buf_sz, GFP_KERNEL, node);
-		if (!buf)
+		if (!alloc_rx_agg(tp, GFP_KERNEL))
 			goto err1;
-
-		if (buf != rx_agg_align(buf)) {
-			kfree(buf);
-			buf = kmalloc_node(tp->rx_buf_sz + RX_ALIGN, GFP_KERNEL,
-					   node);
-			if (!buf)
-				goto err1;
-		}
-
-		urb = usb_alloc_urb(0, GFP_KERNEL);
-		if (!urb) {
-			kfree(buf);
-			goto err1;
-		}
-
-		INIT_LIST_HEAD(&tp->rx_info[i].list);
-		tp->rx_info[i].context = tp;
-		tp->rx_info[i].urb = urb;
-		tp->rx_info[i].buffer = buf;
-		tp->rx_info[i].head = rx_agg_align(buf);
 	}
 
 	for (i = 0; i < RTL8152_MAX_TX; i++) {
+		struct urb *urb;
+		u8 *buf;
+
 		buf = kmalloc_node(agg_buf_sz, GFP_KERNEL, node);
 		if (!buf)
 			goto err1;
@@ -2331,44 +2379,64 @@ static void rxdy_gated_en(struct r8152 *tp, bool enable)
 
 static int rtl_start_rx(struct r8152 *tp)
 {
-	int i, ret = 0;
+	struct rx_agg *agg, *agg_next;
+	struct list_head tmp_list;
+	unsigned long flags;
+	int ret = 0;
 
-	INIT_LIST_HEAD(&tp->rx_done);
-	for (i = 0; i < RTL8152_MAX_RX; i++) {
-		INIT_LIST_HEAD(&tp->rx_info[i].list);
-		ret = r8152_submit_rx(tp, &tp->rx_info[i], GFP_KERNEL);
-		if (ret)
-			break;
-	}
+	INIT_LIST_HEAD(&tmp_list);
 
-	if (ret && ++i < RTL8152_MAX_RX) {
-		struct list_head rx_queue;
-		unsigned long flags;
+	spin_lock_irqsave(&tp->rx_lock, flags);
 
-		INIT_LIST_HEAD(&rx_queue);
+	INIT_LIST_HEAD(&tp->rx_done);
 
-		do {
-			struct rx_agg *agg = &tp->rx_info[i++];
-			struct urb *urb = agg->urb;
+	list_splice_init(&tp->rx_info, &tmp_list);
 
-			urb->actual_length = 0;
-			list_add_tail(&agg->list, &rx_queue);
-		} while (i < RTL8152_MAX_RX);
+	spin_unlock_irqrestore(&tp->rx_lock, flags);
 
-		spin_lock_irqsave(&tp->rx_lock, flags);
-		list_splice_tail(&rx_queue, &tp->rx_done);
-		spin_unlock_irqrestore(&tp->rx_lock, flags);
+	list_for_each_entry_safe(agg, agg_next, &tmp_list, info_list) {
+		INIT_LIST_HEAD(&agg->list);
+
+		if (ret < 0)
+			list_add_tail(&agg->list, &tp->rx_done);
+		else
+			ret = r8152_submit_rx(tp, agg, GFP_KERNEL);
 	}
 
+	spin_lock_irqsave(&tp->rx_lock, flags);
+	WARN_ON(!list_empty(&tp->rx_info));
+	list_splice(&tmp_list, &tp->rx_info);
+	spin_unlock_irqrestore(&tp->rx_lock, flags);
+
 	return ret;
 }
 
 static int rtl_stop_rx(struct r8152 *tp)
 {
-	int i;
+	struct rx_agg *agg, *agg_next;
+	struct list_head tmp_list;
+	unsigned long flags;
+
+	INIT_LIST_HEAD(&tmp_list);
+
+	/* The usb_kill_urb() couldn't be used in atomic.
+	 * Therefore, move the list of rx_info to a tmp one.
+	 * Then, list_for_each_entry_safe could be used without
+	 * spin lock.
+	 */
+
+	spin_lock_irqsave(&tp->rx_lock, flags);
+	list_splice_init(&tp->rx_info, &tmp_list);
+	spin_unlock_irqrestore(&tp->rx_lock, flags);
+
+	list_for_each_entry_safe(agg, agg_next, &tmp_list, info_list)
+		usb_kill_urb(agg->urb);
 
-	for (i = 0; i < RTL8152_MAX_RX; i++)
-		usb_kill_urb(tp->rx_info[i].urb);
+	/* Move back the list of temp to the rx_info */
+	spin_lock_irqsave(&tp->rx_lock, flags);
+	WARN_ON(!list_empty(&tp->rx_info));
+	list_splice(&tmp_list, &tp->rx_info);
+	spin_unlock_irqrestore(&tp->rx_lock, flags);
 
 	while (!skb_queue_empty(&tp->rx_queue))
 		dev_kfree_skb(__skb_dequeue(&tp->rx_queue));
-- 
2.21.0


^ permalink raw reply related

* [PATCH net-next v2 4/5] r8152: support skb_add_rx_frag
From: Hayes Wang @ 2019-08-13  3:42 UTC (permalink / raw)
  To: netdev; +Cc: nic_swsd, linux-kernel, linux-usb, Hayes Wang
In-Reply-To: <1394712342-15778-295-albertk@realtek.com>

Use skb_add_rx_frag() to reduce the memory copy for rx data.

Use a new list of rx_used to store the rx buffer which couldn't be
reused yet.

Besides, the total number of rx buffer may be increased or decreased
dynamically. And it is limited by RTL8152_MAX_RX_AGG.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
 drivers/net/usb/r8152.c | 120 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 105 insertions(+), 15 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index f41cb728e999..2ae04522cd5a 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -584,6 +584,9 @@ enum rtl_register_content {
 #define TX_ALIGN		4
 #define RX_ALIGN		8
 
+#define RTL8152_MAX_RX_AGG	(10 * RTL8152_MAX_RX)
+#define RTL8152_RXFG_HEADSZ	256
+
 #define INTR_LINK		0x0004
 
 #define RTL8152_REQT_READ	0xc0
@@ -720,7 +723,7 @@ struct r8152 {
 	struct net_device *netdev;
 	struct urb *intr_urb;
 	struct tx_agg tx_info[RTL8152_MAX_TX];
-	struct list_head rx_info;
+	struct list_head rx_info, rx_used;
 	struct list_head rx_done, tx_free;
 	struct sk_buff_head tx_queue, rx_queue;
 	spinlock_t rx_lock, tx_lock;
@@ -1476,7 +1479,7 @@ static void free_rx_agg(struct r8152 *tp, struct rx_agg *agg)
 	list_del(&agg->info_list);
 
 	usb_free_urb(agg->urb);
-	__free_pages(agg->page, get_order(tp->rx_buf_sz));
+	put_page(agg->page);
 	kfree(agg);
 
 	atomic_dec(&tp->rx_count);
@@ -1494,7 +1497,7 @@ static struct rx_agg *alloc_rx_agg(struct r8152 *tp, gfp_t mflags)
 	if (!rx_agg)
 		return NULL;
 
-	rx_agg->page = alloc_pages(mflags, order);
+	rx_agg->page = alloc_pages(mflags | __GFP_COMP, order);
 	if (!rx_agg->page)
 		goto free_rx;
 
@@ -1947,6 +1950,46 @@ static u8 r8152_rx_csum(struct r8152 *tp, struct rx_desc *rx_desc)
 	return checksum;
 }
 
+static inline bool rx_count_exceed(struct r8152 *tp)
+{
+	return atomic_read(&tp->rx_count) > RTL8152_MAX_RX;
+}
+
+static inline int agg_offset(struct rx_agg *agg, void *addr)
+{
+	return (int)(addr - agg->buffer);
+}
+
+static struct rx_agg *rtl_get_free_rx(struct r8152 *tp, gfp_t mflags)
+{
+	struct rx_agg *agg, *agg_next, *agg_free = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&tp->rx_lock, flags);
+
+	list_for_each_entry_safe(agg, agg_next, &tp->rx_used, list) {
+		if (page_count(agg->page) == 1) {
+			if (!agg_free) {
+				list_del_init(&agg->list);
+				agg_free = agg;
+				continue;
+			}
+			if (rx_count_exceed(tp)) {
+				list_del_init(&agg->list);
+				free_rx_agg(tp, agg);
+			}
+			break;
+		}
+	}
+
+	spin_unlock_irqrestore(&tp->rx_lock, flags);
+
+	if (!agg_free && atomic_read(&tp->rx_count) < RTL8152_MAX_RX_AGG)
+		agg_free = alloc_rx_agg(tp, mflags);
+
+	return agg_free;
+}
+
 static int rx_bottom(struct r8152 *tp, int budget)
 {
 	unsigned long flags;
@@ -1982,7 +2025,7 @@ static int rx_bottom(struct r8152 *tp, int budget)
 
 	list_for_each_safe(cursor, next, &rx_queue) {
 		struct rx_desc *rx_desc;
-		struct rx_agg *agg;
+		struct rx_agg *agg, *agg_free;
 		int len_used = 0;
 		struct urb *urb;
 		u8 *rx_data;
@@ -1994,6 +2037,8 @@ static int rx_bottom(struct r8152 *tp, int budget)
 		if (urb->actual_length < ETH_ZLEN)
 			goto submit;
 
+		agg_free = rtl_get_free_rx(tp, GFP_ATOMIC);
+
 		rx_desc = agg->buffer;
 		rx_data = agg->buffer;
 		len_used += sizeof(struct rx_desc);
@@ -2001,7 +2046,7 @@ static int rx_bottom(struct r8152 *tp, int budget)
 		while (urb->actual_length > len_used) {
 			struct net_device *netdev = tp->netdev;
 			struct net_device_stats *stats = &netdev->stats;
-			unsigned int pkt_len;
+			unsigned int pkt_len, rx_frag_head_sz;
 			struct sk_buff *skb;
 
 			/* limite the skb numbers for rx_queue */
@@ -2019,22 +2064,37 @@ static int rx_bottom(struct r8152 *tp, int budget)
 			pkt_len -= ETH_FCS_LEN;
 			rx_data += sizeof(struct rx_desc);
 
-			skb = napi_alloc_skb(napi, pkt_len);
+			if (!agg_free || RTL8152_RXFG_HEADSZ > pkt_len)
+				rx_frag_head_sz = pkt_len;
+			else
+				rx_frag_head_sz = RTL8152_RXFG_HEADSZ;
+
+			skb = napi_alloc_skb(napi, rx_frag_head_sz);
 			if (!skb) {
 				stats->rx_dropped++;
 				goto find_next_rx;
 			}
 
 			skb->ip_summed = r8152_rx_csum(tp, rx_desc);
-			memcpy(skb->data, rx_data, pkt_len);
-			skb_put(skb, pkt_len);
+			memcpy(skb->data, rx_data, rx_frag_head_sz);
+			skb_put(skb, rx_frag_head_sz);
+			pkt_len -= rx_frag_head_sz;
+			rx_data += rx_frag_head_sz;
+			if (pkt_len) {
+				skb_add_rx_frag(skb, 0, agg->page,
+						agg_offset(agg, rx_data),
+						pkt_len,
+						SKB_DATA_ALIGN(pkt_len));
+				get_page(agg->page);
+			}
+
 			skb->protocol = eth_type_trans(skb, netdev);
 			rtl_rx_vlan_tag(rx_desc, skb);
 			if (work_done < budget) {
 				napi_gro_receive(napi, skb);
 				work_done++;
 				stats->rx_packets++;
-				stats->rx_bytes += pkt_len;
+				stats->rx_bytes += skb->len;
 			} else {
 				__skb_queue_tail(&tp->rx_queue, skb);
 			}
@@ -2042,10 +2102,24 @@ static int rx_bottom(struct r8152 *tp, int budget)
 find_next_rx:
 			rx_data = rx_agg_align(rx_data + pkt_len + ETH_FCS_LEN);
 			rx_desc = (struct rx_desc *)rx_data;
-			len_used = (int)(rx_data - (u8 *)agg->buffer);
+			len_used = agg_offset(agg, rx_data);
 			len_used += sizeof(struct rx_desc);
 		}
 
+		WARN_ON(!agg_free && page_count(agg->page) > 1);
+
+		if (agg_free) {
+			spin_lock_irqsave(&tp->rx_lock, flags);
+			if (page_count(agg->page) == 1) {
+				list_add(&agg_free->list, &tp->rx_used);
+			} else {
+				list_add_tail(&agg->list, &tp->rx_used);
+				agg = agg_free;
+				urb = agg->urb;
+			}
+			spin_unlock_irqrestore(&tp->rx_lock, flags);
+		}
+
 submit:
 		if (!ret) {
 			ret = r8152_submit_rx(tp, agg, GFP_ATOMIC);
@@ -2373,13 +2447,14 @@ static int rtl_start_rx(struct r8152 *tp)
 	struct rx_agg *agg, *agg_next;
 	struct list_head tmp_list;
 	unsigned long flags;
-	int ret = 0;
+	int ret = 0, i = 0;
 
 	INIT_LIST_HEAD(&tmp_list);
 
 	spin_lock_irqsave(&tp->rx_lock, flags);
 
 	INIT_LIST_HEAD(&tp->rx_done);
+	INIT_LIST_HEAD(&tp->rx_used);
 
 	list_splice_init(&tp->rx_info, &tmp_list);
 
@@ -2388,10 +2463,18 @@ static int rtl_start_rx(struct r8152 *tp)
 	list_for_each_entry_safe(agg, agg_next, &tmp_list, info_list) {
 		INIT_LIST_HEAD(&agg->list);
 
-		if (ret < 0)
+		/* Only RTL8152_MAX_RX rx_agg need to be submitted. */
+		if (++i > RTL8152_MAX_RX) {
+			spin_lock_irqsave(&tp->rx_lock, flags);
+			list_add_tail(&agg->list, &tp->rx_used);
+			spin_unlock_irqrestore(&tp->rx_lock, flags);
+		} else if (unlikely(ret < 0)) {
+			spin_lock_irqsave(&tp->rx_lock, flags);
 			list_add_tail(&agg->list, &tp->rx_done);
-		else
+			spin_unlock_irqrestore(&tp->rx_lock, flags);
+		} else {
 			ret = r8152_submit_rx(tp, agg, GFP_KERNEL);
+		}
 	}
 
 	spin_lock_irqsave(&tp->rx_lock, flags);
@@ -2420,8 +2503,15 @@ static int rtl_stop_rx(struct r8152 *tp)
 	list_splice_init(&tp->rx_info, &tmp_list);
 	spin_unlock_irqrestore(&tp->rx_lock, flags);
 
-	list_for_each_entry_safe(agg, agg_next, &tmp_list, info_list)
-		usb_kill_urb(agg->urb);
+	list_for_each_entry_safe(agg, agg_next, &tmp_list, info_list) {
+		/* At least RTL8152_MAX_RX rx_agg have the page_count being
+		 * equal to 1, so the other ones could be freed safely.
+		 */
+		if (page_count(agg->page) > 1)
+			free_rx_agg(tp, agg);
+		else
+			usb_kill_urb(agg->urb);
+	}
 
 	/* Move back the list of temp to the rx_info */
 	spin_lock_irqsave(&tp->rx_lock, flags);
-- 
2.21.0


^ permalink raw reply related

* [PATCH net-next v2 3/5] r8152: use alloc_pages for rx buffer
From: Hayes Wang @ 2019-08-13  3:42 UTC (permalink / raw)
  To: netdev; +Cc: nic_swsd, linux-kernel, linux-usb, Hayes Wang
In-Reply-To: <1394712342-15778-295-albertk@realtek.com>

Replace kmalloc_node() with alloc_pages() for rx buffer.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
 drivers/net/usb/r8152.c | 31 +++++++++++--------------------
 1 file changed, 11 insertions(+), 20 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index d063c9b358e5..f41cb728e999 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -698,8 +698,8 @@ struct rx_agg {
 	struct list_head list, info_list;
 	struct urb *urb;
 	struct r8152 *context;
+	struct page *page;
 	void *buffer;
-	void *head;
 };
 
 struct tx_agg {
@@ -1476,7 +1476,7 @@ static void free_rx_agg(struct r8152 *tp, struct rx_agg *agg)
 	list_del(&agg->info_list);
 
 	usb_free_urb(agg->urb);
-	kfree(agg->buffer);
+	__free_pages(agg->page, get_order(tp->rx_buf_sz));
 	kfree(agg);
 
 	atomic_dec(&tp->rx_count);
@@ -1486,28 +1486,19 @@ static struct rx_agg *alloc_rx_agg(struct r8152 *tp, gfp_t mflags)
 {
 	struct net_device *netdev = tp->netdev;
 	int node = netdev->dev.parent ? dev_to_node(netdev->dev.parent) : -1;
+	unsigned int order = get_order(tp->rx_buf_sz);
 	struct rx_agg *rx_agg;
 	unsigned long flags;
-	u8 *buf;
 
 	rx_agg = kmalloc_node(sizeof(*rx_agg), mflags, node);
 	if (!rx_agg)
 		return NULL;
 
-	buf = kmalloc_node(tp->rx_buf_sz, mflags, node);
-	if (!buf)
+	rx_agg->page = alloc_pages(mflags, order);
+	if (!rx_agg->page)
 		goto free_rx;
 
-	if (buf != rx_agg_align(buf)) {
-		kfree(buf);
-		buf = kmalloc_node(tp->rx_buf_sz + RX_ALIGN, mflags,
-				   node);
-		if (!buf)
-			goto free_rx;
-	}
-
-	rx_agg->buffer = buf;
-	rx_agg->head = rx_agg_align(buf);
+	rx_agg->buffer = page_address(rx_agg->page);
 
 	rx_agg->urb = usb_alloc_urb(0, mflags);
 	if (!rx_agg->urb)
@@ -1526,7 +1517,7 @@ static struct rx_agg *alloc_rx_agg(struct r8152 *tp, gfp_t mflags)
 	return rx_agg;
 
 free_buf:
-	kfree(rx_agg->buffer);
+	__free_pages(rx_agg->page, order);
 free_rx:
 	kfree(rx_agg);
 	return NULL;
@@ -2003,8 +1994,8 @@ static int rx_bottom(struct r8152 *tp, int budget)
 		if (urb->actual_length < ETH_ZLEN)
 			goto submit;
 
-		rx_desc = agg->head;
-		rx_data = agg->head;
+		rx_desc = agg->buffer;
+		rx_data = agg->buffer;
 		len_used += sizeof(struct rx_desc);
 
 		while (urb->actual_length > len_used) {
@@ -2051,7 +2042,7 @@ static int rx_bottom(struct r8152 *tp, int budget)
 find_next_rx:
 			rx_data = rx_agg_align(rx_data + pkt_len + ETH_FCS_LEN);
 			rx_desc = (struct rx_desc *)rx_data;
-			len_used = (int)(rx_data - (u8 *)agg->head);
+			len_used = (int)(rx_data - (u8 *)agg->buffer);
 			len_used += sizeof(struct rx_desc);
 		}
 
@@ -2162,7 +2153,7 @@ int r8152_submit_rx(struct r8152 *tp, struct rx_agg *agg, gfp_t mem_flags)
 		return 0;
 
 	usb_fill_bulk_urb(agg->urb, tp->udev, usb_rcvbulkpipe(tp->udev, 1),
-			  agg->head, tp->rx_buf_sz,
+			  agg->buffer, tp->rx_buf_sz,
 			  (usb_complete_t)read_bulk_callback, agg);
 
 	ret = usb_submit_urb(agg->urb, mem_flags);
-- 
2.21.0


^ permalink raw reply related

* [PATCH net-next v2 0/5] r8152: RX improve
From: Hayes Wang @ 2019-08-13  3:42 UTC (permalink / raw)
  To: netdev; +Cc: nic_swsd, linux-kernel, linux-usb, Hayes Wang
In-Reply-To: <1394712342-15778-289-Taiwan-albertk@realtek.com>

v2:
For patch #2, replace list_for_each_safe with list_for_each_entry_safe.
Remove unlikely in WARN_ON. Adjust the coding style.

For patch #4, replace list_for_each_safe with list_for_each_entry_safe.
Remove "else" after "continue".

For patch #5. replace sysfs with ethtool to modify rx_copybreak and
rx_pending.

v1:
The different chips use different rx buffer size.

Use skb_add_rx_frag() to reduce memory copy for RX.

Hayes Wang (5):
  r8152: separate the rx buffer size
  r8152: replace array with linking list for rx information
  r8152: use alloc_pages for rx buffer
  r8152: support skb_add_rx_frag
  r8152: change rx_copybreak and rx_pending through ethtool

 drivers/net/usb/r8152.c | 374 ++++++++++++++++++++++++++++++++--------
 1 file changed, 304 insertions(+), 70 deletions(-)

-- 
2.21.0


^ permalink raw reply

* Taking a day off...
From: David Miller @ 2019-08-13  4:26 UTC (permalink / raw)
  To: netdev, jakub.kicinski; +Cc: linux-wireless, netfilter-devel, bpf


Hello everyone,

Tomorrow I will be letting Jakub Kicinski manage the net and net-next
GIT trees.  So he will be integrating patches into GIT and doing git
pulls from people.  He will alway keep the patchwork states up to
date, just like I do.

I completely expect everyone to give Jakub the same respect and
consideration they usually give to me, because he deserves it.

In the future, when I need to take a few days off, I will hand things
over to Jakub in a similar way.

Thank you.

^ permalink raw reply

* Re: [PATCH bpf-next v2 2/4] bpf: support cloning sk storage on accept()
From: Stanislav Fomichev @ 2019-08-13  5:05 UTC (permalink / raw)
  To: Martin Lau
  Cc: Stanislav Fomichev, netdev@vger.kernel.org, bpf@vger.kernel.org,
	davem@davemloft.net, ast@kernel.org, daniel@iogearbox.net,
	Yonghong Song
In-Reply-To: <20190813014753.vftgwwzqxzx2pawg@kafai-mbp>

On 08/13, Martin Lau wrote:
> On Fri, Aug 09, 2019 at 09:10:36AM -0700, Stanislav Fomichev wrote:
> > Add new helper bpf_sk_storage_clone which optionally clones sk storage
> > and call it from sk_clone_lock.
> Thanks for v2.  Sorry for the delay.  I am traveling.
No worries, take your time, if you're OOO feel free to do another
round when you get back, not urgent.

> > 
> > Cc: Martin KaFai Lau <kafai@fb.com>
> > Cc: Yonghong Song <yhs@fb.com>
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  include/net/bpf_sk_storage.h |  10 ++++
> >  include/uapi/linux/bpf.h     |   3 ++
> >  net/core/bpf_sk_storage.c    | 100 +++++++++++++++++++++++++++++++++--
> >  net/core/sock.c              |   9 ++--
> >  4 files changed, 116 insertions(+), 6 deletions(-)
> > 
> > diff --git a/include/net/bpf_sk_storage.h b/include/net/bpf_sk_storage.h
> > index b9dcb02e756b..8e4f831d2e52 100644
> > --- a/include/net/bpf_sk_storage.h
> > +++ b/include/net/bpf_sk_storage.h
> > @@ -10,4 +10,14 @@ void bpf_sk_storage_free(struct sock *sk);
> >  extern const struct bpf_func_proto bpf_sk_storage_get_proto;
> >  extern const struct bpf_func_proto bpf_sk_storage_delete_proto;
> >  
> > +#ifdef CONFIG_BPF_SYSCALL
> > +int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk);
> > +#else
> > +static inline int bpf_sk_storage_clone(const struct sock *sk,
> > +				       struct sock *newsk)
> > +{
> > +	return 0;
> > +}
> > +#endif
> > +
> >  #endif /* _BPF_SK_STORAGE_H */
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 4393bd4b2419..0ef594ac3899 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -337,6 +337,9 @@ enum bpf_attach_type {
> >  #define BPF_F_RDONLY_PROG	(1U << 7)
> >  #define BPF_F_WRONLY_PROG	(1U << 8)
> >  
> > +/* Clone map from listener for newly accepted socket */
> > +#define BPF_F_CLONE		(1U << 9)
> > +
> >  /* flags for BPF_PROG_QUERY */
> >  #define BPF_F_QUERY_EFFECTIVE	(1U << 0)
> >  
> > diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
> > index 94c7f77ecb6b..584e08ee0ca3 100644
> > --- a/net/core/bpf_sk_storage.c
> > +++ b/net/core/bpf_sk_storage.c
> > @@ -12,6 +12,9 @@
> >  
> >  static atomic_t cache_idx;
> >  
> > +#define SK_STORAGE_CREATE_FLAG_MASK					\
> > +	(BPF_F_NO_PREALLOC | BPF_F_CLONE)
> > +
> >  struct bucket {
> >  	struct hlist_head list;
> >  	raw_spinlock_t lock;
> > @@ -209,7 +212,6 @@ static void selem_unlink_sk(struct bpf_sk_storage_elem *selem)
> >  		kfree_rcu(sk_storage, rcu);
> >  }
> >  
> > -/* sk_storage->lock must be held and sk_storage->list cannot be empty */
> >  static void __selem_link_sk(struct bpf_sk_storage *sk_storage,
> >  			    struct bpf_sk_storage_elem *selem)
> >  {
> > @@ -509,7 +511,7 @@ static int sk_storage_delete(struct sock *sk, struct bpf_map *map)
> >  	return 0;
> >  }
> >  
> > -/* Called by __sk_destruct() */
> > +/* Called by __sk_destruct() & bpf_sk_storage_clone() */
> >  void bpf_sk_storage_free(struct sock *sk)
> >  {
> >  	struct bpf_sk_storage_elem *selem;
> > @@ -557,6 +559,11 @@ static void bpf_sk_storage_map_free(struct bpf_map *map)
> >  
> >  	smap = (struct bpf_sk_storage_map *)map;
> >  
> > +	/* Note that this map might be concurrently cloned from
> > +	 * bpf_sk_storage_clone. Wait for any existing bpf_sk_storage_clone
> > +	 * RCU read section to finish before proceeding. New RCU
> > +	 * read sections should be prevented via bpf_map_inc_not_zero.
> > +	 */
> Thanks!
> 
> >  	synchronize_rcu();
> >  
> >  	/* bpf prog and the userspace can no longer access this map
> > @@ -601,7 +608,8 @@ static void bpf_sk_storage_map_free(struct bpf_map *map)
> >  
> >  static int bpf_sk_storage_map_alloc_check(union bpf_attr *attr)
> >  {
> > -	if (attr->map_flags != BPF_F_NO_PREALLOC || attr->max_entries ||
> > +	if (attr->map_flags & ~SK_STORAGE_CREATE_FLAG_MASK ||
> > +	    attr->max_entries ||
> I think "!(attr->map_flags & BPF_F_NO_PREALLOC)" should also be needed.
Makes sense, we always want to have BPF_F_NO_PREALLOC set. Will add,
thanks!

> >  	    attr->key_size != sizeof(int) || !attr->value_size ||
> >  	    /* Enforce BTF for userspace sk dumping */
> >  	    !attr->btf_key_type_id || !attr->btf_value_type_id)
> > @@ -739,6 +747,92 @@ static int bpf_fd_sk_storage_delete_elem(struct bpf_map *map, void *key)
> >  	return err;
> >  }
> >  
> > +static struct bpf_sk_storage_elem *
> > +bpf_sk_storage_clone_elem(struct sock *newsk,
> > +			  struct bpf_sk_storage_map *smap,
> > +			  struct bpf_sk_storage_elem *selem)
> > +{
> > +	struct bpf_sk_storage_elem *copy_selem;
> > +
> > +	copy_selem = selem_alloc(smap, newsk, NULL, true);
> > +	if (!copy_selem)
> > +		return NULL;
> > +
> > +	if (map_value_has_spin_lock(&smap->map))
> > +		copy_map_value_locked(&smap->map, SDATA(copy_selem)->data,
> > +				      SDATA(selem)->data, true);
> > +	else
> > +		copy_map_value(&smap->map, SDATA(copy_selem)->data,
> > +			       SDATA(selem)->data);
> > +
> > +	return copy_selem;
> > +}
> > +
> > +int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
> > +{
> > +	struct bpf_sk_storage *new_sk_storage = NULL;
> > +	struct bpf_sk_storage *sk_storage;
> > +	struct bpf_sk_storage_elem *selem;
> > +	int ret;
> > +
> > +	RCU_INIT_POINTER(newsk->sk_bpf_storage, NULL);
> > +
> > +	rcu_read_lock();
> > +	sk_storage = rcu_dereference(sk->sk_bpf_storage);
> > +
> > +	if (!sk_storage || hlist_empty(&sk_storage->list))
> > +		goto out;
> > +
> > +	hlist_for_each_entry_rcu(selem, &sk_storage->list, snode) {
> > +		struct bpf_sk_storage_elem *copy_selem;
> > +		struct bpf_sk_storage_map *smap;
> > +		struct bpf_map *map;
> > +		int refold;
> > +
> > +		smap = rcu_dereference(SDATA(selem)->smap);
> > +		if (!(smap->map.map_flags & BPF_F_CLONE))
> > +			continue;
> > +
> > +		map = bpf_map_inc_not_zero(&smap->map, false);
> > +		if (IS_ERR(map))
> > +			continue;
> > +
> > +		copy_selem = bpf_sk_storage_clone_elem(newsk, smap, selem);
> > +		if (!copy_selem) {
> > +			ret = -ENOMEM;
> > +			bpf_map_put(map);
> > +			goto err;
> > +		}
> > +
> > +		if (new_sk_storage) {
> > +			selem_link_map(smap, copy_selem);
> > +			__selem_link_sk(new_sk_storage, copy_selem);
> > +		} else {
> > +			ret = sk_storage_alloc(newsk, smap, copy_selem);
> > +			if (ret) {
> > +				kfree(copy_selem);
> > +				atomic_sub(smap->elem_size,
> > +					   &newsk->sk_omem_alloc);
> > +				bpf_map_put(map);
> > +				goto err;
> > +			}
> > +
> > +			new_sk_storage = rcu_dereference(copy_selem->sk_storage);
> > +		}
> > +		bpf_map_put(map);
> > +	}
> > +
> > +out:
> > +	rcu_read_unlock();
> > +	return 0;
> > +
> > +err:
> > +	rcu_read_unlock();
> > +
> > +	bpf_sk_storage_free(newsk);
> The later sk_free_unlock_clone(newsk) should eventually call
> bpf_sk_storage_free(newsk) also?
Hm, good point, I can drop it from here and rely on
sk_free_unlock_clone to call __sk_destruct/bpf_sk_storage_free.
That probably deserves a comment though :-)

> Others LGTM.
Thank you for a review!

> > +	return ret;
> > +}
> > +
> >  BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
> >  	   void *, value, u64, flags)
> >  {
> > diff --git a/net/core/sock.c b/net/core/sock.c
> > index d57b0cc995a0..f5e801a9cea4 100644
> > --- a/net/core/sock.c
> > +++ b/net/core/sock.c
> > @@ -1851,9 +1851,12 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
> >  			goto out;
> >  		}
> >  		RCU_INIT_POINTER(newsk->sk_reuseport_cb, NULL);
> > -#ifdef CONFIG_BPF_SYSCALL
> > -		RCU_INIT_POINTER(newsk->sk_bpf_storage, NULL);
> > -#endif
> > +
> > +		if (bpf_sk_storage_clone(sk, newsk)) {
> > +			sk_free_unlock_clone(newsk);
> > +			newsk = NULL;
> > +			goto out;
> > +		}
> >  
> >  		newsk->sk_err	   = 0;
> >  		newsk->sk_err_soft = 0;
> > -- 
> > 2.23.0.rc1.153.gdeed80330f-goog
> > 

^ permalink raw reply

* Re: [PATCH v2 15/34] staging/vc04_services: convert put_page() to put_user_page*()
From: Stefan Wahren @ 2019-08-13  5:23 UTC (permalink / raw)
  To: john.hubbard, Andrew Morton
  Cc: linux-fbdev, Jan Kara, kvm, Dave Hansen, Dave Chinner, dri-devel,
	linux-mm, sparclinux, Ira Weiny, ceph-devel, devel, rds-devel,
	linux-rdma, Suniel Mahesh, x86, amd-gfx, Christoph Hellwig,
	Jason Gunthorpe, Mihaela Muraru, xen-devel, devel, linux-media,
	John Hubbard, intel-gfx, Kishore KP, linux-block,
	Jérôme Glisse, linux-rpi-kernel, Dan Williams,
	Sidong Yang, linux-arm-kernel, linux-nfs, Eric Anholt, netdev,
	LKML, linux-xfs, linux-crypto, Greg Kroah-Hartman, linux-fsdevel,
	Al Viro
In-Reply-To: <20190804224915.28669-16-jhubbard@nvidia.com>

On 05.08.19 00:48, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
>
> For pages that were retained via get_user_pages*(), release those pages
> via the new put_user_page*() routines, instead of via put_page() or
> release_pages().
>
> This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
> ("mm: introduce put_user_page*(), placeholder versions").
>
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> Cc: Eric Anholt <eric@anholt.net>
> Cc: Stefan Wahren <stefan.wahren@i2se.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Mihaela Muraru <mihaela.muraru21@gmail.com>
> Cc: Suniel Mahesh <sunil.m@techveda.org>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Sidong Yang <realwakka@gmail.com>
> Cc: Kishore KP <kishore.p@techveda.org>
> Cc: linux-rpi-kernel@lists.infradead.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: devel@driverdev.osuosl.org
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Stefan Wahren <stefan.wahren@i2se.com>

^ permalink raw reply

* Re: [PATCH v4 10/14] net: phy: adin: implement PHY subsystem software reset
From: Ardelean, Alexandru @ 2019-08-13  5:42 UTC (permalink / raw)
  To: andrew@lunn.ch
  Cc: davem@davemloft.net, hkallweit1@gmail.com,
	devicetree@vger.kernel.org, mark.rutland@arm.com,
	linux-kernel@vger.kernel.org, f.fainelli@gmail.com,
	netdev@vger.kernel.org, robh+dt@kernel.org
In-Reply-To: <20190812141954.GP14290@lunn.ch>

On Mon, 2019-08-12 at 16:19 +0200, Andrew Lunn wrote:
> [External]
> 
> > +static int adin_reset(struct phy_device *phydev)
> > +{
> > +	/* If there is a reset GPIO just exit */
> > +	if (!IS_ERR_OR_NULL(phydev->mdio.reset_gpio))
> > +		return 0;
> 
> I'm not so happy with this.
> 
> First off, there are two possible GPIO configurations. The GPIO can be
> applied to all PHYs on the MDIO bus. That GPIO is used when the bus is
> probed. There can also be a per PHY GPIO, which is what you are
> looking at here.
> 
> The idea of putting the GPIO handling in the core is that PHYs don't
> need to worry about it. How much of a difference does it make if the
> PHY is both reset via GPIO and then again in software? How slow is the
> software reset? Maybe just unconditionally do the reset, if it is not
> too slow.
> 

Ack.
Will do it unconditionally.
The reset is not too slow.


> > +
> > +	/* Reset PHY core regs & subsystem regs */
> > +	return adin_subsytem_soft_reset(phydev);
> > +}
> > +
> > +static int adin_probe(struct phy_device *phydev)
> > +{
> > +	return adin_reset(phydev);
> > +}
> 
> Why did you decide to do this as part of probe, and not use the
> .soft_reset member of phy_driver?

Hmmm.
This is a left-over from when I had the GPIO handling in this PHY driver.
It's also a habit picked up from writing IIO drivers, where there is no {soft_}reset hook.
Typically, the reset is done in the probe.

> 
> > +
> >  static struct phy_driver adin_driver[] = {
> >  	{
> >  		PHY_ID_MATCH_MODEL(PHY_ID_ADIN1200),
> >  		.name		= "ADIN1200",
> >  		.config_init	= adin_config_init,
> > +		.probe		= adin_probe,
> >  		.config_aneg	= adin_config_aneg,
> >  		.read_status	= adin_read_status,
> >  		.ack_interrupt	= adin_phy_ack_intr,
> > @@ -461,6 +503,7 @@ static struct phy_driver adin_driver[] = {
> >  		PHY_ID_MATCH_MODEL(PHY_ID_ADIN1300),
> >  		.name		= "ADIN1300",
> >  		.config_init	= adin_config_init,
> > +		.probe		= adin_probe,
> >  		.config_aneg	= adin_config_aneg,
> >  		.read_status	= adin_read_status,
> >  		.ack_interrupt	= adin_phy_ack_intr,
> 
> Thanks
> 	Andrew

^ permalink raw reply

* Re: [PATCH v4 13/14] net: phy: adin: add ethtool get_stats support
From: Ardelean, Alexandru @ 2019-08-13  5:48 UTC (permalink / raw)
  To: andrew@lunn.ch
  Cc: davem@davemloft.net, hkallweit1@gmail.com,
	devicetree@vger.kernel.org, mark.rutland@arm.com,
	linux-kernel@vger.kernel.org, f.fainelli@gmail.com,
	netdev@vger.kernel.org, robh+dt@kernel.org
In-Reply-To: <20190812143315.GS14290@lunn.ch>

On Mon, 2019-08-12 at 16:33 +0200, Andrew Lunn wrote:
> [External]
> 
> > +static int adin_read_mmd_stat_regs(struct phy_device *phydev,
> > +				   struct adin_hw_stat *stat,
> > +				   u32 *val)
> > +{
> > +	int ret;
> > +
> > +	ret = phy_read_mmd(phydev, MDIO_MMD_VEND1, stat->reg1);
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	*val = (ret & 0xffff);
> > +
> > +	if (stat->reg2 == 0)
> > +		return 0;
> > +
> > +	ret = phy_read_mmd(phydev, MDIO_MMD_VEND1, stat->reg2);
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	*val <<= 16;
> > +	*val |= (ret & 0xffff);
> > +
> > +	return 0;
> > +}
> 
> It still looks like you have not dealt with overflow from the LSB into
> the MSB between the two reads.

Apologies for forgetting to address this.
I did not intentionally leave it out; this item got lost after V1 [which had the most remarks].
Changelog V1 -> V2 was quite bulky, and I did not look at V1 remarks after I finished V2.

Thanks for snippet.

> 
> 	do {
> 		hi1 = phy_read_mmd(phydev, MDIO_MMD_VEND1, stat->reg2);
> 		if (hi1 < 0)
> 			return hi1;
> 		
> 		low = phy_read_mmd(phydev, MDIO_MMD_VEND1, stat->reg1);
> 		if (low < 0)
> 			return low;
> 
> 		hi2 = phy_read_mmd(phydev, MDIO_MMD_VEND1, stat->reg2);
> 		if (hi2 < 0)
> 			return hi1;
> 	} while (hi1 != hi2)
> 
> 	return low | (hi << 16);
> 
> 	Andrew

^ permalink raw reply

* [PATCH] MAINTAINERS: net_failover: Fix typo in a filepath
From: Denis Efremov @ 2019-08-13  6:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Denis Efremov, joe, Sridhar Samudrala, David S . Miller, netdev
In-Reply-To: <20190325212732.27253-1-joe@perches.com>

Replace "driver" with "drivers" in the filepath to net_failover.c

Cc: Sridhar Samudrala <sridhar.samudrala@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Fixes: cfc80d9a1163 ("net: Introduce net_failover driver")
Signed-off-by: Denis Efremov <efremov@linux.com>
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 51ab502485ac..c2117e5f4ff8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11071,7 +11071,7 @@ NET_FAILOVER MODULE
 M:	Sridhar Samudrala <sridhar.samudrala@intel.com>
 L:	netdev@vger.kernel.org
 S:	Supported
-F:	driver/net/net_failover.c
+F:	drivers/net/net_failover.c
 F:	include/net/net_failover.h
 F:	Documentation/networking/net_failover.rst
 
-- 
2.21.0


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox