Netdev List
 help / color / mirror / Atom feed
* Re: problems with SCTP GSO
From: Marcelo Ricardo Leitner @ 2018-06-12 17:05 UTC (permalink / raw)
  To: David Miller; +Cc: lucien.xin, edumazet, netdev
In-Reply-To: <20180611.202905.1954825345357429286.davem@davemloft.net>

On Mon, Jun 11, 2018 at 08:29:05PM -0700, David Miller wrote:
> 
> I would like to bring up some problems with the current GSO
> implementation in SCTP.
> 
> The most important for me right now is that SCTP uses
> "skb_gro_receive()" to build "GSO" frames :-(
> 
> Really it just ends up using the slow path (basically, label 'merge'
> and onwards).
> 
> So, using a GRO helper to build GSO packets is not great.

Okay.

> 
> I want to make major surgery here and the only way I can is if
> it is exactly the GRO demuxing path that uses skb_gro_receive().
> 
> Those paths pass in the list head from the NAPI struct that initiated
> the GRO code paths.  That makes it easy for me to change this to use a
> list_head or a hash chain.
> 
> Probably in the short term SCTP should just have a private helper that
> builds the frag list, appending 'skb' to 'head'.
> 
> In the long term, SCTP should use the page frags just like TCP to
> append the data when building GSO frames.  Then it could actually be
> offloaded and passed into drivers without linearizing.

Sounds like a plan. Shouldn't be too hard to do it.
(I'm out on PTO, btw)

Thanks,
Marcelo

^ permalink raw reply

* Re: [PATCH net-next 3/6] net: ethernet: ti: cpsw: add MQPRIO Qdisc offload
From: Andrew Lunn @ 2018-06-12 16:36 UTC (permalink / raw)
  To: Ivan Khoronzhuk
  Cc: grygorii.strashko, davem, corbet, akpm, netdev, linux-doc,
	linux-kernel, linux-omap, vinicius.gomes, henrik,
	jesus.sanchez-palencia, ilias.apalodimas, p-varis, spatton,
	francois.ozog, yogeshs, nsekhar
In-Reply-To: <20180611133047.4818-4-ivan.khoronzhuk@linaro.org>

On Mon, Jun 11, 2018 at 04:30:44PM +0300, Ivan Khoronzhuk wrote:
> That's possible to offload vlan to tc priority mapping with
> assumption sk_prio == L2 prio.
> 
> Example:
> $ ethtool -L eth0 rx 1 tx 4
> 
> $ qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
> map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 1
> 
> $ tc -g class show dev eth0
> +---(100:ffe2) mqprio
> |    +---(100:3) mqprio
> |    +---(100:4) mqprio
> |    
> +---(100:ffe1) mqprio
> |    +---(100:2) mqprio
> |    
> +---(100:ffe0) mqprio
>      +---(100:1) mqprio
> 
> Here, 100:1 is txq0, 100:2 is txq1, 100:3 is txq2, 100:4 is txq3
> txq0 belongs to tc0, txq1 to tc1, txq2 and txq3 to tc2
> The offload part only maps L2 prio to classes of traffic, but not
> to transmit queues, so to direct traffic to traffic class vlan has
> to be created with appropriate egress map.
> 
> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> ---
>  drivers/net/ethernet/ti/cpsw.c | 82 ++++++++++++++++++++++++++++++++++
>  1 file changed, 82 insertions(+)
> 
> diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
> index 406537d74ec1..fd967d2bce5d 100644
> --- a/drivers/net/ethernet/ti/cpsw.c
> +++ b/drivers/net/ethernet/ti/cpsw.c
> @@ -39,6 +39,7 @@
>  #include <linux/sys_soc.h>
>  
>  #include <linux/pinctrl/consumer.h>
> +#include <net/pkt_cls.h>
>  
>  #include "cpsw.h"
>  #include "cpsw_ale.h"
> @@ -153,6 +154,8 @@ do {								\
>  #define IRQ_NUM			2
>  #define CPSW_MAX_QUEUES		8
>  #define CPSW_CPDMA_DESCS_POOL_SIZE_DEFAULT 256
> +#define CPSW_TC_NUM			4
> +#define CPSW_FIFO_SHAPERS_NUM		(CPSW_TC_NUM - 1)
>  
>  #define CPSW_RX_VLAN_ENCAP_HDR_PRIO_SHIFT	29
>  #define CPSW_RX_VLAN_ENCAP_HDR_PRIO_MSK		GENMASK(2, 0)
> @@ -453,6 +456,7 @@ struct cpsw_priv {
>  	u8				mac_addr[ETH_ALEN];
>  	bool				rx_pause;
>  	bool				tx_pause;
> +	bool				mqprio_hw;
>  	u32 emac_port;
>  	struct cpsw_common *cpsw;
>  };
> @@ -1577,6 +1581,14 @@ static void cpsw_slave_stop(struct cpsw_slave *slave, struct cpsw_common *cpsw)
>  	soft_reset_slave(slave);
>  }
>  
> +static int cpsw_tc_to_fifo(int tc, int num_tc)
> +{
> +	if (tc == num_tc - 1)
> +		return 0;
> +
> +	return CPSW_FIFO_SHAPERS_NUM - tc;
> +}
> +
>  static int cpsw_ndo_open(struct net_device *ndev)
>  {
>  	struct cpsw_priv *priv = netdev_priv(ndev);
> @@ -2190,6 +2202,75 @@ static int cpsw_ndo_set_tx_maxrate(struct net_device *ndev, int queue, u32 rate)
>  	return ret;
>  }
>  
> +static int cpsw_set_tc(struct net_device *ndev, void *type_data)
> +{

Hi Ivan

Maybe this is not the best of names. What if you add support for
another TC qdisc? So you have another case in the switch statement
below?

Maybe call it cpsw_set_mqprio?

> +static int cpsw_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
> +			     void *type_data)
> +{
> +	switch (type) {
> +	case TC_SETUP_QDISC_MQPRIO:
> +		return cpsw_set_tc(ndev, type_data);
> +
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}

  Andrew
  

^ permalink raw reply

* Re: [PATCH] net: stmmac: fix build failure due to missing COMMON_CLK dependency
From: Andy Shevchenko @ 2018-06-12 16:35 UTC (permalink / raw)
  To: David Miller, Geert Uytterhoeven
  Cc: Corentin Labbe, Alexandre TORGUE, Giuseppe CAVALLARO,
	Linux Kernel Mailing List, netdev, linux-sunxi
In-Reply-To: <20180608.105926.600207780816212953.davem@davemloft.net>

On Fri, Jun 8, 2018 at 5:59 PM, David Miller <davem@davemloft.net> wrote:
> From: Corentin Labbe <clabbe@baylibre.com>
> Date: Wed,  6 Jun 2018 18:45:22 +0000
>
>> This patch fix the build failure on m68k;
>> drivers/net/ethernet/stmicro/stmmac/dwmac-ipq806x.o: In function `ipq806x_gmac_probe':
>> dwmac-ipq806x.c:(.text+0xda): undefined reference to `clk_set_rate'
>> drivers/net/ethernet/stmicro/stmmac/dwmac-rk.o: In function `rk_gmac_probe':
>> dwmac-rk.c:(.text+0x1e58): undefined reference to `clk_set_rate'
>> drivers/net/ethernet/stmicro/stmmac/dwmac-sti.o: In function `stid127_fix_retime_src':
>> dwmac-sti.c:(.text+0xd8): undefined reference to `clk_set_rate'
>> dwmac-sti.c:(.text+0x114): undefined reference to `clk_set_rate'
>> drivers/net/ethernet/stmicro/stmmac/dwmac-sti.o:dwmac-sti.c:(.text+0x12c): more undefined references to `clk_set_rate' follow
>> Lots of stmmac platform drivers need COMMON_CLK in their Kconfig depends.
>>
>> Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
>
> Applied.

I think Geert has a better fix https://lkml.org/lkml/2018/6/11/122

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply

* Re: [PATCH 1/2] Convert target drivers to use sbitmap
From: Bart Van Assche @ 2018-06-12 16:32 UTC (permalink / raw)
  To: willy@infradead.org
  Cc: jgross@suse.com, axboe@kernel.dk, linux-scsi@vger.kernel.org,
	kvm@vger.kernel.org, mawilcox@microsoft.com,
	netdev@vger.kernel.org, linux-usb@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	target-devel@vger.kernel.org, qla2xxx-upstream@qlogic.com,
	linux1394-devel@lists.sourceforge.net, kent.overstreet@gmail.com
In-Reply-To: <20180612161526.GE19433@bombadil.infradead.org>

On Tue, 2018-06-12 at 09:15 -0700, Matthew Wilcox wrote:
> On Tue, Jun 12, 2018 at 03:22:42PM +0000, Bart Van Assche wrote:
> > On Tue, 2018-05-15 at 09:00 -0700, Matthew Wilcox wrote:
> > > diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
> > > index 025dc2d3f3de..cdf671c2af61 100644
> > > --- a/drivers/scsi/qla2xxx/qla_target.c
> > > +++ b/drivers/scsi/qla2xxx/qla_target.c
> > > @@ -3719,7 +3719,8 @@ void qlt_free_cmd(struct qla_tgt_cmd *cmd)
> > >  		return;
> > >  	}
> > >  	cmd->jiffies_at_free = get_jiffies_64();
> > > -	percpu_ida_free(&sess->se_sess->sess_tag_pool, cmd->se_cmd.map_tag);
> > > +	sbitmap_queue_clear(&sess->se_sess->sess_tag_pool, cmd->se_cmd.map_tag,
> > > +			cmd->se_cmd.map_cpu);
> > >  }
> > >  EXPORT_SYMBOL(qlt_free_cmd);
> > 
> > Please introduce functions in the target core for allocating and freeing a tag
> > instead of spreading the knowledge of how to allocate and free tags over all
> > target drivers.
> 
> I can't without doing an unreasonably large amount of work on drivers that
> I have no way to test.  Some of the drivers have the se_cmd already; some
> of them don't.  I'd be happy to introduce a common function for freeing
> a tag.

Which target drivers are you referring to? If you are referring to the sbp driver:
I think that driver is dead and can be removed from the kernel tree. I even don't
know whether that driver ever has had any users other than the developer of that
driver.

> > This looks weird to me. Shouldn't target code ignore signals instead of causing
> > tag allocation to fail if a signal is received?
> 
> It's what the current code did:
> 
> -               if (signal_pending_state(state, current)) {
> -                       tag = -ERESTARTSYS;
> -                       break;
> -               }
> 
> and the current callers literally indicate that they want signals:
> 
> drivers/infiniband/ulp/isert/ib_isert.c:        cmd = iscsit_allocate_cmd(conn, TASK_INTERRUPTIBLE);
> drivers/target/iscsi/iscsi_target.c:    cmd = iscsit_allocate_cmd(conn, TASK_INTERRUPTIBLE);

Right, the iSCSI target driver uses signals to wake up threads (see also the
send_sig() calls in the iSCSI target code).

Bart.

^ permalink raw reply

* Re: [PATCH 2/2] ktime: helpers to convert between ktime and jiffies
From: Andrew Lunn @ 2018-06-12 16:30 UTC (permalink / raw)
  To: Tejaswi Tanikella; +Cc: netdev, f.fainelli, davem
In-Reply-To: <20180611115218.GA23539@tejaswit-linux.qualcomm.com>

On Mon, Jun 11, 2018 at 05:22:28PM +0530, Tejaswi Tanikella wrote:
> Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org>
> ---
>  include/linux/ktime.h | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/include/linux/ktime.h b/include/linux/ktime.h
> index 5b9fddb..4881483 100644
> --- a/include/linux/ktime.h
> +++ b/include/linux/ktime.h
> @@ -96,6 +96,10 @@ static inline ktime_t timeval_to_ktime(struct timeval tv)
>  /* Convert ktime_t to nanoseconds - NOP in the scalar storage format: */
>  #define ktime_to_ns(kt)			(kt)
>  
> +/* ktime to jiffies and back */
> +#define ktime_to_jiffies(kt)		nsecs_to_jiffies(kt)
> +#define jiffies_to_ktime(j)		jiffies_to_nsecs(j)

Hi Tejaswi

You should also add some users of these new helpers.

    Andrew

^ permalink raw reply

* Re: [PATCH net 1/2] ipv4: igmp: use alarmtimer to prevent delayed reports
From: Andrew Lunn @ 2018-06-12 16:28 UTC (permalink / raw)
  To: Tejaswi Tanikella; +Cc: netdev, f.fainelli, davem
In-Reply-To: <20180611115058.GA12452@tejaswit-linux.qualcomm.com>

On Mon, Jun 11, 2018 at 05:21:05PM +0530, Tejaswi Tanikella wrote:
> On receiving a IGMPv2/v3 query, based on max_delay set in the header a
> timer is started to send out a response after a random time within
> max_delay. If the system then moves into suspend state, Report is
> delayed until system wakes up.
> 
> Use a alarmtimer instead of using a timer. Alarmtimer will wake the
> system up from suspend to send out the IGMP report.

Hi Tejaswi

I think i must be missing something here. If we are suspended, we are
not receiving multicast frames. If we are not receiving frames, why do
we need to reply to the query?

Once we resume, i expect we will reply to the next query. You could
optimise restarting the flow by immediately sending a membership
report, same as when the setsockopt is used to join the group.

	Andrew

^ permalink raw reply

* Re: [PATCH 1/2] Convert target drivers to use sbitmap
From: Matthew Wilcox @ 2018-06-12 16:15 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-usb@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	kent.overstreet@gmail.com, linux1394-devel@lists.sourceforge.net,
	jgross@suse.com, axboe@kernel.dk, linux-scsi@vger.kernel.org,
	qla2xxx-upstream@qlogic.com, target-devel@vger.kernel.org,
	netdev@vger.kernel.org, mawilcox@microsoft.com
In-Reply-To: <da5220a5ed4bed210c31a7517389e787a3b1a01f.camel@wdc.com>

On Tue, Jun 12, 2018 at 03:22:42PM +0000, Bart Van Assche wrote:
> On Tue, 2018-05-15 at 09:00 -0700, Matthew Wilcox wrote:
> > diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
> > index 025dc2d3f3de..cdf671c2af61 100644
> > --- a/drivers/scsi/qla2xxx/qla_target.c
> > +++ b/drivers/scsi/qla2xxx/qla_target.c
> > @@ -3719,7 +3719,8 @@ void qlt_free_cmd(struct qla_tgt_cmd *cmd)
> >  		return;
> >  	}
> >  	cmd->jiffies_at_free = get_jiffies_64();
> > -	percpu_ida_free(&sess->se_sess->sess_tag_pool, cmd->se_cmd.map_tag);
> > +	sbitmap_queue_clear(&sess->se_sess->sess_tag_pool, cmd->se_cmd.map_tag,
> > +			cmd->se_cmd.map_cpu);
> >  }
> >  EXPORT_SYMBOL(qlt_free_cmd);
> 
> Please introduce functions in the target core for allocating and freeing a tag
> instead of spreading the knowledge of how to allocate and free tags over all
> target drivers.

I can't without doing an unreasonably large amount of work on drivers that
I have no way to test.  Some of the drivers have the se_cmd already; some
of them don't.  I'd be happy to introduce a common function for freeing
a tag.

> > +int iscsit_wait_for_tag(struct se_session *se_sess, int state, int *cpup)
> > +{
> > +	int tag = -1;
> > +	DEFINE_WAIT(wait);
> > +	struct sbq_wait_state *ws;
> > +
> > +	if (state == TASK_RUNNING)
> > +		return tag;
> > +
> > +	ws = &se_sess->sess_tag_pool.ws[0];
> > +	for (;;) {
> > +		prepare_to_wait_exclusive(&ws->wait, &wait, state);
> > +		if (signal_pending_state(state, current))
> > +			break;
> 
> This looks weird to me. Shouldn't target code ignore signals instead of causing
> tag allocation to fail if a signal is received?

It's what the current code did:

-               if (signal_pending_state(state, current)) {
-                       tag = -ERESTARTSYS;
-                       break;
-               }

and the current callers literally indicate that they want signals:

drivers/infiniband/ulp/isert/ib_isert.c:        cmd = iscsit_allocate_cmd(conn, TASK_INTERRUPTIBLE);
drivers/target/iscsi/iscsi_target.c:    cmd = iscsit_allocate_cmd(conn, TASK_INTERRUPTIBLE);

(etc)

^ permalink raw reply

* REPLY URGENLY.
From: Mratthais @ 2018-06-12 16:06 UTC (permalink / raw)

In-Reply-To: <850140153.4238359.1528819602505.ref@mail.yahoo.com>

 Dear Friend,

Mr. john Matthias ouedraogo, the manager in charge of auditing and accounting section
of Bank of Africa (BOA) Ouagadougou Burkina-Faso West-Africa. I would like you
to indicate your interest to receive the transfer of $19.3 Million Dollars. I
will like you to stand as the next of kin to our late customer whose account
is presently dormant for claims. Please once you are interested kindly send
me the following details information below,

1.Your full name:...........
2.Resident address:........
3.Private phone........
4.fax numbers:...............
5.Country :................
6.Occupation:..............
7.Age:.........
8.sex........ 

I shall send you more details as soon as i hear from you.

Regards,
My Regards,
Mr. john Matthias ouedraogo



REPLY URGENTLY.

^ permalink raw reply

* [PATCH v4 net-next] net:sched: add action inheritdsfield to skbedit
From: Fu, Qiaobin @ 2018-06-12 15:42 UTC (permalink / raw)
  To: davem@davemloft.net
  Cc: netdev@vger.kernel.org, jhs@mojatatu.com, Michel Machado,
	Marcelo Ricardo Leitner, xiyou.wangcong@gmail.com
In-Reply-To: <38C2B0E3-E108-433B-906A-A2D72CEE4CAE@bu.edu>

The new action inheritdsfield copies the field DS of
IPv4 and IPv6 packets into skb->priority. This enables
later classification of packets based on the DS field.

v4:
*Not allow setting flags other than the expected ones.

*Allow dumping the pure flags.

Original idea by Jamal Hadi Salim <jhs@mojatatu.com>

Signed-off-by: Qiaobin Fu <qiaobinf@bu.edu>
Reviewed-by: Michel Machado <michel@digirati.com.br>
---

Note that the motivation for this patch is found in the following discussion:
https://www.spinics.net/lists/netdev/msg501061.html
---
diff --git a/include/uapi/linux/tc_act/tc_skbedit.h b/include/uapi/linux/tc_act/tc_skbedit.h
index fbcfe27a4e6c..6de6071ebed6 100644
--- a/include/uapi/linux/tc_act/tc_skbedit.h
+++ b/include/uapi/linux/tc_act/tc_skbedit.h
@@ -30,6 +30,7 @@
 #define SKBEDIT_F_MARK			0x4
 #define SKBEDIT_F_PTYPE			0x8
 #define SKBEDIT_F_MASK			0x10
+#define SKBEDIT_F_INHERITDSFIELD	0x20
 
 struct tc_skbedit {
 	tc_gen;
@@ -45,6 +46,7 @@ enum {
 	TCA_SKBEDIT_PAD,
 	TCA_SKBEDIT_PTYPE,
 	TCA_SKBEDIT_MASK,
+	TCA_SKBEDIT_FLAGS,
 	__TCA_SKBEDIT_MAX
 };
 #define TCA_SKBEDIT_MAX (__TCA_SKBEDIT_MAX - 1)
diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c
index 6138d1d71900..9adbcfa3f5fe 100644
--- a/net/sched/act_skbedit.c
+++ b/net/sched/act_skbedit.c
@@ -23,6 +23,9 @@
 #include <linux/rtnetlink.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
+#include <net/ip.h>
+#include <net/ipv6.h>
+#include <net/dsfield.h>
 
 #include <linux/tc_act/tc_skbedit.h>
 #include <net/tc_act/tc_skbedit.h>
@@ -41,6 +44,25 @@ static int tcf_skbedit(struct sk_buff *skb, const struct tc_action *a,
 
 	if (d->flags & SKBEDIT_F_PRIORITY)
 		skb->priority = d->priority;
+	if (d->flags & SKBEDIT_F_INHERITDSFIELD) {
+		int wlen = skb_network_offset(skb);
+
+		switch (tc_skb_protocol(skb)) {
+		case htons(ETH_P_IP):
+			wlen += sizeof(struct iphdr);
+			if (!pskb_may_pull(skb, wlen))
+				goto err;
+			skb->priority = ipv4_get_dsfield(ip_hdr(skb)) >> 2;
+			break;
+
+		case htons(ETH_P_IPV6):
+			wlen += sizeof(struct ipv6hdr);
+			if (!pskb_may_pull(skb, wlen))
+				goto err;
+			skb->priority = ipv6_get_dsfield(ipv6_hdr(skb)) >> 2;
+			break;
+		}
+	}
 	if (d->flags & SKBEDIT_F_QUEUE_MAPPING &&
 	    skb->dev->real_num_tx_queues > d->queue_mapping)
 		skb_set_queue_mapping(skb, d->queue_mapping);
@@ -53,6 +75,10 @@ static int tcf_skbedit(struct sk_buff *skb, const struct tc_action *a,
 
 	spin_unlock(&d->tcf_lock);
 	return d->tcf_action;
+
+err:
+	spin_unlock(&d->tcf_lock);
+	return TC_ACT_SHOT;
 }
 
 static const struct nla_policy skbedit_policy[TCA_SKBEDIT_MAX + 1] = {
@@ -62,6 +88,7 @@ static const struct nla_policy skbedit_policy[TCA_SKBEDIT_MAX + 1] = {
 	[TCA_SKBEDIT_MARK]		= { .len = sizeof(u32) },
 	[TCA_SKBEDIT_PTYPE]		= { .len = sizeof(u16) },
 	[TCA_SKBEDIT_MASK]		= { .len = sizeof(u32) },
+	[TCA_SKBEDIT_FLAGS]		= { .len = sizeof(u64) },
 };
 
 static int tcf_skbedit_init(struct net *net, struct nlattr *nla,
@@ -73,6 +100,7 @@ static int tcf_skbedit_init(struct net *net, struct nlattr *nla,
 	struct tc_skbedit *parm;
 	struct tcf_skbedit *d;
 	u32 flags = 0, *priority = NULL, *mark = NULL, *mask = NULL;
+	u64 *pure_flags = NULL;
 	u16 *queue_mapping = NULL, *ptype = NULL;
 	bool exists = false;
 	int ret = 0, err;
@@ -114,6 +142,12 @@ static int tcf_skbedit_init(struct net *net, struct nlattr *nla,
 		mask = nla_data(tb[TCA_SKBEDIT_MASK]);
 	}
 
+	if (tb[TCA_SKBEDIT_FLAGS] != NULL) {
+		pure_flags = nla_data(tb[TCA_SKBEDIT_FLAGS]);
+		if (*pure_flags & SKBEDIT_F_INHERITDSFIELD)
+			flags |= SKBEDIT_F_INHERITDSFIELD;
+	}
+
 	parm = nla_data(tb[TCA_SKBEDIT_PARMS]);
 
 	exists = tcf_idr_check(tn, parm->index, a, bind);
@@ -178,6 +212,7 @@ static int tcf_skbedit_dump(struct sk_buff *skb, struct tc_action *a,
 		.action  = d->tcf_action,
 	};
 	struct tcf_t t;
+	u64 pure_flags = 0;
 
 	if (nla_put(skb, TCA_SKBEDIT_PARMS, sizeof(opt), &opt))
 		goto nla_put_failure;
@@ -196,6 +231,11 @@ static int tcf_skbedit_dump(struct sk_buff *skb, struct tc_action *a,
 	if ((d->flags & SKBEDIT_F_MASK) &&
 	    nla_put_u32(skb, TCA_SKBEDIT_MASK, d->mask))
 		goto nla_put_failure;
+	if (d->flags & SKBEDIT_F_INHERITDSFIELD)
+		pure_flags |= SKBEDIT_F_INHERITDSFIELD;
+	if (pure_flags != 0 &&
+	    nla_put(skb, TCA_SKBEDIT_FLAGS, sizeof(pure_flags), &pure_flags))
+		goto nla_put_failure;
 
 	tcf_tm_dump(&t, &d->tcf_tm);
 	if (nla_put_64bit(skb, TCA_SKBEDIT_TM, sizeof(t), &t, TCA_SKBEDIT_PAD))

^ permalink raw reply related

* [jkirsher/next-queue PATCH v2 7/7] net: allow fallback function to pass netdev
From: Alexander Duyck @ 2018-06-12 15:19 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher, netdev
In-Reply-To: <20180612151322.86792.97587.stgit@ahduyck-green-test.jf.intel.com>

For most of these calls we can just pass NULL through to the fallback
function as the sb_dev. The only cases where we cannot are the cases where
we might be dealing with either an upper device or a driver that would
have configured things to support an sb_dev itself.

The only driver that has any signficant change in this patchset should be
ixgbe as we can drop the redundant functionality that existed in both the
ndo_select_queue function and the fallback function that was passed through
to us.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c    |    2 +-
 drivers/net/ethernet/broadcom/bcmsysport.c      |    4 ++--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |    3 ++-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |    2 +-
 drivers/net/ethernet/hisilicon/hns/hns_enet.c   |    2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c   |    4 ++--
 drivers/net/ethernet/mellanox/mlx4/en_tx.c      |    4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c |    2 +-
 drivers/net/hyperv/netvsc_drv.c                 |    2 +-
 drivers/net/net_failover.c                      |    2 +-
 drivers/net/xen-netback/interface.c             |    2 +-
 include/linux/netdevice.h                       |    3 ++-
 net/core/dev.c                                  |   12 +++---------
 net/packet/af_packet.c                          |    7 ++++---
 14 files changed, 24 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index e3befb1..c673ac2 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2224,7 +2224,7 @@ static u16 ena_select_queue(struct net_device *dev, struct sk_buff *skb,
 	if (skb_rx_queue_recorded(skb))
 		qid = skb_get_rx_queue(skb);
 	else
-		qid = fallback(dev, skb);
+		qid = fallback(dev, skb, NULL);
 
 	return qid;
 }
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 32f548e..eb890c4 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -2116,7 +2116,7 @@ static u16 bcm_sysport_select_queue(struct net_device *dev, struct sk_buff *skb,
 	unsigned int q, port;
 
 	if (!netdev_uses_dsa(dev))
-		return fallback(dev, skb);
+		return fallback(dev, skb, NULL);
 
 	/* DSA tagging layer will have configured the correct queue */
 	q = BRCM_TAG_GET_QUEUE(queue);
@@ -2124,7 +2124,7 @@ static u16 bcm_sysport_select_queue(struct net_device *dev, struct sk_buff *skb,
 	tx_ring = priv->ring_map[q + port * priv->per_port_num_tx_queues];
 
 	if (unlikely(!tx_ring))
-		return fallback(dev, skb);
+		return fallback(dev, skb, NULL);
 
 	return tx_ring->index;
 }
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 969dcc9..7a1b99f 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -1928,7 +1928,8 @@ u16 bnx2x_select_queue(struct net_device *dev, struct sk_buff *skb,
 	}
 
 	/* select a non-FCoE queue */
-	return fallback(dev, skb) % (BNX2X_NUM_ETH_QUEUES(bp) * bp->max_cos);
+	return fallback(dev, skb, NULL) %
+	       (BNX2X_NUM_ETH_QUEUES(bp) * bp->max_cos);
 }
 
 void bnx2x_set_num_queues(struct bnx2x *bp)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 8de3039..380931d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -972,7 +972,7 @@ static u16 cxgb_select_queue(struct net_device *dev, struct sk_buff *skb,
 		return txq;
 	}
 
-	return fallback(dev, skb) % dev->real_num_tx_queues;
+	return fallback(dev, skb, NULL) % dev->real_num_tx_queues;
 }
 
 static int closest_timer(const struct sge *s, int time)
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
index c36a231..8327254 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -2033,7 +2033,7 @@ static void hns_nic_get_stats64(struct net_device *ndev,
 	    is_multicast_ether_addr(eth_hdr->h_dest))
 		return 0;
 	else
-		return fallback(ndev, skb);
+		return fallback(ndev, skb, NULL);
 }
 
 static const struct net_device_ops hns_nic_netdev_ops = {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 5d9867e..eef64d0 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -8248,11 +8248,11 @@ static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb,
 	case htons(ETH_P_FIP):
 		adapter = netdev_priv(dev);
 
-		if (adapter->flags & IXGBE_FLAG_FCOE_ENABLED)
+		if (!sb_dev && (adapter->flags & IXGBE_FLAG_FCOE_ENABLED))
 			break;
 		/* fall through */
 	default:
-		return fallback(dev, skb);
+		return fallback(dev, skb, sb_dev);
 	}
 
 	f = &adapter->ring_feature[RING_F_FCOE];
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index df29966..1857ee0 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -695,9 +695,9 @@ u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb,
 	u16 rings_p_up = priv->num_tx_rings_p_up;
 
 	if (netdev_get_num_tc(dev))
-		return fallback(dev, skb);
+		return fallback(dev, skb, NULL);
 
-	return fallback(dev, skb) % rings_p_up;
+	return fallback(dev, skb, NULL) % rings_p_up;
 }
 
 static void mlx4_bf_copy(void __iomem *dst, const void *src,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index 0119e86..88c0c85 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -115,7 +115,7 @@ u16 mlx5e_select_queue(struct net_device *dev, struct sk_buff *skb,
 		       select_queue_fallback_t fallback)
 {
 	struct mlx5e_priv *priv = netdev_priv(dev);
-	int channel_ix = fallback(dev, skb);
+	int channel_ix = fallback(dev, skb, NULL);
 	u16 num_channels;
 	int up = 0;
 
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 0a01572..5bc32e7 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -344,7 +344,7 @@ static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
 			txq = vf_ops->ndo_select_queue(vf_netdev, skb,
 						       sb_dev, fallback);
 		else
-			txq = fallback(vf_netdev, skb);
+			txq = fallback(vf_netdev, skb, NULL);
 
 		/* Record the queue selected by VF so that it can be
 		 * used for common case where VF has more queues than
diff --git a/drivers/net/net_failover.c b/drivers/net/net_failover.c
index b2dc2e7..6f3d143 100644
--- a/drivers/net/net_failover.c
+++ b/drivers/net/net_failover.c
@@ -131,7 +131,7 @@ static u16 net_failover_select_queue(struct net_device *dev,
 			txq = ops->ndo_select_queue(primary_dev, skb,
 						    sb_dev, fallback);
 		else
-			txq = fallback(primary_dev, skb);
+			txq = fallback(primary_dev, skb, NULL);
 
 		qdisc_skb_cb(skb)->slave_dev_queue_mapping = skb->queue_mapping;
 
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 19c4c58..92274c2 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -155,7 +155,7 @@ static u16 xenvif_select_queue(struct net_device *dev, struct sk_buff *skb,
 	unsigned int size = vif->hash.size;
 
 	if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE)
-		return fallback(dev, skb) % dev->real_num_tx_queues;
+		return fallback(dev, skb, NULL) % dev->real_num_tx_queues;
 
 	xenvif_set_skb_hash(vif, skb);
 
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 94d8f9b..db02e5f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -782,7 +782,8 @@ static inline bool netdev_phys_item_id_same(struct netdev_phys_item_id *a,
 }
 
 typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
-				       struct sk_buff *skb);
+				       struct sk_buff *skb,
+				       struct net_device *sb_dev);
 
 enum tc_setup_type {
 	TC_SETUP_QDISC_MQPRIO,
diff --git a/net/core/dev.c b/net/core/dev.c
index a78000a..5abf7ee 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3521,8 +3521,8 @@ u16 dev_pick_tx_cpu_id(struct net_device *dev, struct sk_buff *skb,
 }
 EXPORT_SYMBOL(dev_pick_tx_cpu_id);
 
-static u16 ___netdev_pick_tx(struct net_device *dev, struct sk_buff *skb,
-			     struct net_device *sb_dev)
+static u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb,
+			    struct net_device *sb_dev)
 {
 	struct sock *sk = skb->sk;
 	int queue_index = sk_tx_queue_get(sk);
@@ -3547,12 +3547,6 @@ static u16 ___netdev_pick_tx(struct net_device *dev, struct sk_buff *skb,
 	return queue_index;
 }
 
-static u16 __netdev_pick_tx(struct net_device *dev,
-			    struct sk_buff *skb)
-{
-	return ___netdev_pick_tx(dev, skb, NULL);
-}
-
 struct netdev_queue *netdev_pick_tx(struct net_device *dev,
 				    struct sk_buff *skb,
 				    struct net_device *sb_dev)
@@ -3573,7 +3567,7 @@ struct netdev_queue *netdev_pick_tx(struct net_device *dev,
 			queue_index = ops->ndo_select_queue(dev, skb, sb_dev,
 							    __netdev_pick_tx);
 		else
-			queue_index = ___netdev_pick_tx(dev, skb, sb_dev);
+			queue_index = __netdev_pick_tx(dev, skb, sb_dev);
 
 		queue_index = netdev_cap_txqueue(dev, queue_index);
 	}
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 905f7cd..e24d9b4 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -275,9 +275,10 @@ static bool packet_use_direct_xmit(const struct packet_sock *po)
 	return po->xmit == packet_direct_xmit;
 }
 
-static u16 __packet_pick_tx_queue(struct net_device *dev, struct sk_buff *skb)
+static u16 __packet_pick_tx_queue(struct net_device *dev, struct sk_buff *skb,
+				  struct net_device *sb_dev)
 {
-	return dev_pick_tx_cpu_id(dev, skb, NULL, NULL);
+	return dev_pick_tx_cpu_id(dev, skb, sb_dev, NULL);
 }
 
 static u16 packet_pick_tx_queue(struct sk_buff *skb)
@@ -291,7 +292,7 @@ static u16 packet_pick_tx_queue(struct sk_buff *skb)
 						    __packet_pick_tx_queue);
 		queue_index = netdev_cap_txqueue(dev, queue_index);
 	} else {
-		queue_index = __packet_pick_tx_queue(dev, skb);
+		queue_index = __packet_pick_tx_queue(dev, skb, NULL);
 	}
 
 	return queue_index;

^ permalink raw reply related

* [jkirsher/next-queue PATCH v2 6/7] net: allow ndo_select_queue to pass netdev
From: Alexander Duyck @ 2018-06-12 15:18 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher, netdev
In-Reply-To: <20180612151322.86792.97587.stgit@ahduyck-green-test.jf.intel.com>

This patch makes it so that instead of passing a void pointer as the
accel_priv we instead pass a net_device pointer as sb_dev. Making this
change allows us to pass the subordinate device through to the fallback
function eventually so that we can keep the actual code in the
ndo_select_queue call as focused on possible on the exception cases.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/infiniband/hw/hfi1/vnic_main.c            |    2 +-
 drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c |    4 ++--
 drivers/net/bonding/bond_main.c                   |    3 ++-
 drivers/net/ethernet/amazon/ena/ena_netdev.c      |    3 ++-
 drivers/net/ethernet/broadcom/bcmsysport.c        |    2 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c   |    3 ++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h   |    3 ++-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c   |    3 ++-
 drivers/net/ethernet/hisilicon/hns/hns_enet.c     |    3 ++-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     |    7 ++++---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c        |    3 ++-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h      |    3 ++-
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    3 ++-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    3 ++-
 drivers/net/ethernet/renesas/ravb_main.c          |    3 ++-
 drivers/net/ethernet/sun/ldmvsw.c                 |    3 ++-
 drivers/net/ethernet/sun/sunvnet.c                |    3 ++-
 drivers/net/hyperv/netvsc_drv.c                   |    4 ++--
 drivers/net/net_failover.c                        |    5 +++--
 drivers/net/team/team.c                           |    3 ++-
 drivers/net/tun.c                                 |    3 ++-
 drivers/net/wireless/marvell/mwifiex/main.c       |    3 ++-
 drivers/net/xen-netback/interface.c               |    2 +-
 drivers/net/xen-netfront.c                        |    3 ++-
 drivers/staging/rtl8188eu/os_dep/os_intfs.c       |    3 ++-
 drivers/staging/rtl8723bs/os_dep/os_intfs.c       |    7 +++----
 include/linux/netdevice.h                         |   11 +++++++----
 net/core/dev.c                                    |    6 ++++--
 net/mac80211/iface.c                              |    4 ++--
 29 files changed, 66 insertions(+), 42 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/vnic_main.c b/drivers/infiniband/hw/hfi1/vnic_main.c
index 5d65582..616fc9b 100644
--- a/drivers/infiniband/hw/hfi1/vnic_main.c
+++ b/drivers/infiniband/hw/hfi1/vnic_main.c
@@ -423,7 +423,7 @@ static netdev_tx_t hfi1_netdev_start_xmit(struct sk_buff *skb,
 
 static u16 hfi1_vnic_select_queue(struct net_device *netdev,
 				  struct sk_buff *skb,
-				  void *accel_priv,
+				  struct net_device *sb_dev,
 				  select_queue_fallback_t fallback)
 {
 	struct hfi1_vnic_vport_info *vinfo = opa_vnic_dev_priv(netdev);
diff --git a/drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c b/drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c
index 0c8aec6..6155878 100644
--- a/drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c
+++ b/drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c
@@ -95,7 +95,7 @@ static netdev_tx_t opa_netdev_start_xmit(struct sk_buff *skb,
 }
 
 static u16 opa_vnic_select_queue(struct net_device *netdev, struct sk_buff *skb,
-				 void *accel_priv,
+				 struct net_device *sb_dev,
 				 select_queue_fallback_t fallback)
 {
 	struct opa_vnic_adapter *adapter = opa_vnic_priv(netdev);
@@ -107,7 +107,7 @@ static u16 opa_vnic_select_queue(struct net_device *netdev, struct sk_buff *skb,
 	mdata->entropy = opa_vnic_calc_entropy(skb);
 	mdata->vl = opa_vnic_get_vl(adapter, skb);
 	rc = adapter->rn_ops->ndo_select_queue(netdev, skb,
-					       accel_priv, fallback);
+					       sb_dev, fallback);
 	skb_pull(skb, sizeof(*mdata));
 	return rc;
 }
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index bd53a71..e33f689 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4094,7 +4094,8 @@ static inline int bond_slave_override(struct bonding *bond,
 
 
 static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb,
-			     void *accel_priv, select_queue_fallback_t fallback)
+			     struct net_device *sb_dev,
+			     select_queue_fallback_t fallback)
 {
 	/* This helper function exists to help dev_pick_tx get the correct
 	 * destination queue.  Using a helper function skips a call to
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index f2af87d..e3befb1 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2213,7 +2213,8 @@ static void ena_netpoll(struct net_device *netdev)
 #endif /* CONFIG_NET_POLL_CONTROLLER */
 
 static u16 ena_select_queue(struct net_device *dev, struct sk_buff *skb,
-			    void *accel_priv, select_queue_fallback_t fallback)
+			    struct net_device *sb_dev,
+			    select_queue_fallback_t fallback)
 {
 	u16 qid;
 	/* we suspect that this is good for in--kernel network services that
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index d5fca2e..32f548e 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -2107,7 +2107,7 @@ static int bcm_sysport_stop(struct net_device *dev)
 };
 
 static u16 bcm_sysport_select_queue(struct net_device *dev, struct sk_buff *skb,
-				    void *accel_priv,
+				    struct net_device *sb_dev,
 				    select_queue_fallback_t fallback)
 {
 	struct bcm_sysport_priv *priv = netdev_priv(dev);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 8cd73ff..969dcc9 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -1905,7 +1905,8 @@ void bnx2x_netif_stop(struct bnx2x *bp, int disable_hw)
 }
 
 u16 bnx2x_select_queue(struct net_device *dev, struct sk_buff *skb,
-		       void *accel_priv, select_queue_fallback_t fallback)
+		       struct net_device *sb_dev,
+		       select_queue_fallback_t fallback)
 {
 	struct bnx2x *bp = netdev_priv(dev);
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
index a8ce5c5..0e508e5 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
@@ -497,7 +497,8 @@ int bnx2x_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan, u8 qos,
 
 /* select_queue callback */
 u16 bnx2x_select_queue(struct net_device *dev, struct sk_buff *skb,
-		       void *accel_priv, select_queue_fallback_t fallback);
+		       struct net_device *sb_dev,
+		       select_queue_fallback_t fallback);
 
 static inline void bnx2x_update_rx_prod(struct bnx2x *bp,
 					struct bnx2x_fastpath *fp,
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 35cb3ae..8de3039 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -929,7 +929,8 @@ static int setup_sge_queues(struct adapter *adap)
 }
 
 static u16 cxgb_select_queue(struct net_device *dev, struct sk_buff *skb,
-			     void *accel_priv, select_queue_fallback_t fallback)
+			     struct net_device *sb_dev,
+			     select_queue_fallback_t fallback)
 {
 	int txq;
 
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
index 1ccb644..c36a231 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -2022,7 +2022,8 @@ static void hns_nic_get_stats64(struct net_device *ndev,
 
 static u16
 hns_nic_select_queue(struct net_device *ndev, struct sk_buff *skb,
-		     void *accel_priv, select_queue_fallback_t fallback)
+		     struct net_device *sb_dev,
+		     select_queue_fallback_t fallback)
 {
 	struct ethhdr *eth_hdr = (struct ethhdr *)skb->data;
 	struct hns_nic_priv *priv = netdev_priv(ndev);
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 053a54c..5d9867e 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -8221,15 +8221,16 @@ static void ixgbe_atr(struct ixgbe_ring *ring,
 
 #ifdef IXGBE_FCOE
 static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb,
-			      void *accel_priv, select_queue_fallback_t fallback)
+			      struct net_device *sb_dev,
+			      select_queue_fallback_t fallback)
 {
 	struct ixgbe_adapter *adapter;
 	struct ixgbe_ring_feature *f;
 	int txq;
 
-	if (accel_priv) {
+	if (sb_dev) {
 		u8 tc = netdev_get_prio_tc_map(dev, skb->priority);
-		struct net_device *vdev = accel_priv;
+		struct net_device *vdev = sb_dev;
 
 		txq = vdev->tc_to_txq[tc].offset;
 		txq += reciprocal_scale(skb_get_hash(skb),
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 0227786..df29966 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -688,7 +688,8 @@ static void build_inline_wqe(struct mlx4_en_tx_desc *tx_desc,
 }
 
 u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb,
-			 void *accel_priv, select_queue_fallback_t fallback)
+			 struct net_device *sb_dev,
+			 select_queue_fallback_t fallback)
 {
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	u16 rings_p_up = priv->num_tx_rings_p_up;
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index ace6545..c3228b8 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -699,7 +699,8 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
 
 void mlx4_en_tx_irq(struct mlx4_cq *mcq);
 u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb,
-			 void *accel_priv, select_queue_fallback_t fallback);
+			 struct net_device *sb_dev,
+			 select_queue_fallback_t fallback);
 netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev);
 netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_ring *rx_ring,
 			       struct mlx4_en_rx_alloc *frame,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index eb9eb7a..df2d1e8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -866,7 +866,8 @@ struct mlx5e_profile {
 void mlx5e_build_ptys2ethtool_map(void);
 
 u16 mlx5e_select_queue(struct net_device *dev, struct sk_buff *skb,
-		       void *accel_priv, select_queue_fallback_t fallback);
+		       struct net_device *sb_dev,
+		       select_queue_fallback_t fallback);
 netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev);
 netdev_tx_t mlx5e_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
 			  struct mlx5e_tx_wqe *wqe, u16 pi);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index f29deb4..0119e86 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -111,7 +111,8 @@ static inline int mlx5e_get_dscp_up(struct mlx5e_priv *priv, struct sk_buff *skb
 #endif
 
 u16 mlx5e_select_queue(struct net_device *dev, struct sk_buff *skb,
-		       void *accel_priv, select_queue_fallback_t fallback)
+		       struct net_device *sb_dev,
+		       select_queue_fallback_t fallback)
 {
 	struct mlx5e_priv *priv = netdev_priv(dev);
 	int channel_ix = fallback(dev, skb);
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 68f1221..4a7f54c 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -1656,7 +1656,8 @@ static netdev_tx_t ravb_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 }
 
 static u16 ravb_select_queue(struct net_device *ndev, struct sk_buff *skb,
-			     void *accel_priv, select_queue_fallback_t fallback)
+			     struct net_device *sb_dev,
+			     select_queue_fallback_t fallback)
 {
 	/* If skb needs TX timestamp, it is handled in network control queue */
 	return (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) ? RAVB_NC :
diff --git a/drivers/net/ethernet/sun/ldmvsw.c b/drivers/net/ethernet/sun/ldmvsw.c
index a5dd627..d42f47f 100644
--- a/drivers/net/ethernet/sun/ldmvsw.c
+++ b/drivers/net/ethernet/sun/ldmvsw.c
@@ -101,7 +101,8 @@ static struct vnet_port *vsw_tx_port_find(struct sk_buff *skb,
 }
 
 static u16 vsw_select_queue(struct net_device *dev, struct sk_buff *skb,
-			    void *accel_priv, select_queue_fallback_t fallback)
+			    struct net_device *sb_dev,
+			    select_queue_fallback_t fallback)
 {
 	struct vnet_port *port = netdev_priv(dev);
 
diff --git a/drivers/net/ethernet/sun/sunvnet.c b/drivers/net/ethernet/sun/sunvnet.c
index a94f504..12539b3 100644
--- a/drivers/net/ethernet/sun/sunvnet.c
+++ b/drivers/net/ethernet/sun/sunvnet.c
@@ -234,7 +234,8 @@ static struct vnet_port *vnet_tx_port_find(struct sk_buff *skb,
 }
 
 static u16 vnet_select_queue(struct net_device *dev, struct sk_buff *skb,
-			     void *accel_priv, select_queue_fallback_t fallback)
+			     struct net_device *sb_dev,
+			     select_queue_fallback_t fallback)
 {
 	struct vnet *vp = netdev_priv(dev);
 	struct vnet_port *port = __tx_port_find(vp, skb);
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 7b18a8c..0a01572 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -328,7 +328,7 @@ static u16 netvsc_pick_tx(struct net_device *ndev, struct sk_buff *skb)
 }
 
 static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
-			       void *accel_priv,
+			       struct net_device *sb_dev,
 			       select_queue_fallback_t fallback)
 {
 	struct net_device_context *ndc = netdev_priv(ndev);
@@ -342,7 +342,7 @@ static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
 
 		if (vf_ops->ndo_select_queue)
 			txq = vf_ops->ndo_select_queue(vf_netdev, skb,
-						       accel_priv, fallback);
+						       sb_dev, fallback);
 		else
 			txq = fallback(vf_netdev, skb);
 
diff --git a/drivers/net/net_failover.c b/drivers/net/net_failover.c
index 83f7420..b2dc2e7 100644
--- a/drivers/net/net_failover.c
+++ b/drivers/net/net_failover.c
@@ -115,7 +115,8 @@ static netdev_tx_t net_failover_start_xmit(struct sk_buff *skb,
 }
 
 static u16 net_failover_select_queue(struct net_device *dev,
-				     struct sk_buff *skb, void *accel_priv,
+				     struct sk_buff *skb,
+				     struct net_device *sb_dev,
 				     select_queue_fallback_t fallback)
 {
 	struct net_failover_info *nfo_info = netdev_priv(dev);
@@ -128,7 +129,7 @@ static u16 net_failover_select_queue(struct net_device *dev,
 
 		if (ops->ndo_select_queue)
 			txq = ops->ndo_select_queue(primary_dev, skb,
-						    accel_priv, fallback);
+						    sb_dev, fallback);
 		else
 			txq = fallback(primary_dev, skb);
 
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 8863fa0..b704051 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1706,7 +1706,8 @@ static netdev_tx_t team_xmit(struct sk_buff *skb, struct net_device *dev)
 }
 
 static u16 team_select_queue(struct net_device *dev, struct sk_buff *skb,
-			     void *accel_priv, select_queue_fallback_t fallback)
+			     struct net_device *sb_dev,
+			     select_queue_fallback_t fallback)
 {
 	/*
 	 * This helper function exists to help dev_pick_tx get the correct
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index a192a01..76f0f41 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -607,7 +607,8 @@ static u16 tun_ebpf_select_queue(struct tun_struct *tun, struct sk_buff *skb)
 }
 
 static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb,
-			    void *accel_priv, select_queue_fallback_t fallback)
+			    struct net_device *sb_dev,
+			    select_queue_fallback_t fallback)
 {
 	struct tun_struct *tun = netdev_priv(dev);
 	u16 ret;
diff --git a/drivers/net/wireless/marvell/mwifiex/main.c b/drivers/net/wireless/marvell/mwifiex/main.c
index 510f6b8..fa3e8dd 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.c
+++ b/drivers/net/wireless/marvell/mwifiex/main.c
@@ -1279,7 +1279,8 @@ static struct net_device_stats *mwifiex_get_stats(struct net_device *dev)
 
 static u16
 mwifiex_netdev_select_wmm_queue(struct net_device *dev, struct sk_buff *skb,
-				void *accel_priv, select_queue_fallback_t fallback)
+				struct net_device *sb_dev,
+				select_queue_fallback_t fallback)
 {
 	skb->priority = cfg80211_classify8021d(skb, NULL);
 	return mwifiex_1d_to_wmm_queue[skb->priority];
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 78ebe49..19c4c58 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -148,7 +148,7 @@ void xenvif_wake_queue(struct xenvif_queue *queue)
 }
 
 static u16 xenvif_select_queue(struct net_device *dev, struct sk_buff *skb,
-			       void *accel_priv,
+			       struct net_device *sb_dev,
 			       select_queue_fallback_t fallback)
 {
 	struct xenvif *vif = netdev_priv(dev);
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 679da1a..3c21a8f 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -545,7 +545,8 @@ static int xennet_count_skb_slots(struct sk_buff *skb)
 }
 
 static u16 xennet_select_queue(struct net_device *dev, struct sk_buff *skb,
-			       void *accel_priv, select_queue_fallback_t fallback)
+			       struct net_device *sb_dev,
+			       select_queue_fallback_t fallback)
 {
 	unsigned int num_queues = dev->real_num_tx_queues;
 	u32 hash;
diff --git a/drivers/staging/rtl8188eu/os_dep/os_intfs.c b/drivers/staging/rtl8188eu/os_dep/os_intfs.c
index add1ba0..38e85c8 100644
--- a/drivers/staging/rtl8188eu/os_dep/os_intfs.c
+++ b/drivers/staging/rtl8188eu/os_dep/os_intfs.c
@@ -253,7 +253,8 @@ static unsigned int rtw_classify8021d(struct sk_buff *skb)
 }
 
 static u16 rtw_select_queue(struct net_device *dev, struct sk_buff *skb,
-			    void *accel_priv, select_queue_fallback_t fallback)
+			    struct net_device *sb_dev,
+			    select_queue_fallback_t fallback)
 {
 	struct adapter	*padapter = rtw_netdev_priv(dev);
 	struct mlme_priv *pmlmepriv = &padapter->mlmepriv;
diff --git a/drivers/staging/rtl8723bs/os_dep/os_intfs.c b/drivers/staging/rtl8723bs/os_dep/os_intfs.c
index ace68f0..1816423 100644
--- a/drivers/staging/rtl8723bs/os_dep/os_intfs.c
+++ b/drivers/staging/rtl8723bs/os_dep/os_intfs.c
@@ -403,10 +403,9 @@ static unsigned int rtw_classify8021d(struct sk_buff *skb)
 }
 
 
-static u16 rtw_select_queue(struct net_device *dev, struct sk_buff *skb
-				, void *accel_priv
-				, select_queue_fallback_t fallback
-)
+static u16 rtw_select_queue(struct net_device *dev, struct sk_buff *skb,
+			    struct net_device *sb_dev,
+			    select_queue_fallback_t fallback)
 {
 	struct adapter	*padapter = rtw_netdev_priv(dev);
 	struct mlme_priv *pmlmepriv = &padapter->mlmepriv;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 70f7ee3..94d8f9b 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -945,7 +945,8 @@ struct dev_ifalias {
  *	those the driver believes to be appropriate.
  *
  * u16 (*ndo_select_queue)(struct net_device *dev, struct sk_buff *skb,
- *                         void *accel_priv, select_queue_fallback_t fallback);
+ *                         struct net_device *sb_dev,
+ *                         select_queue_fallback_t fallback);
  *	Called to decide which queue to use when device supports multiple
  *	transmit queues.
  *
@@ -1217,7 +1218,7 @@ struct net_device_ops {
 						      netdev_features_t features);
 	u16			(*ndo_select_queue)(struct net_device *dev,
 						    struct sk_buff *skb,
-						    void *accel_priv,
+						    struct net_device *sb_dev,
 						    select_queue_fallback_t fallback);
 	void			(*ndo_change_rx_flags)(struct net_device *dev,
 						       int flags);
@@ -2552,9 +2553,11 @@ struct net_device *__dev_get_by_flags(struct net *net, unsigned short flags,
 void dev_disable_lro(struct net_device *dev);
 int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *newskb);
 u16 dev_pick_tx_zero(struct net_device *dev, struct sk_buff *skb,
-		     void *accel_priv, select_queue_fallback_t fallback);
+		     struct net_device *sb_dev,
+		     select_queue_fallback_t fallback);
 u16 dev_pick_tx_cpu_id(struct net_device *dev, struct sk_buff *skb,
-		       void *accel_priv, select_queue_fallback_t fallback);
+		       struct net_device *sb_dev,
+		       select_queue_fallback_t fallback);
 int dev_queue_xmit(struct sk_buff *skb);
 int dev_queue_xmit_accel(struct sk_buff *skb, struct net_device *sb_dev);
 int dev_direct_xmit(struct sk_buff *skb, u16 queue_id);
diff --git a/net/core/dev.c b/net/core/dev.c
index 1a1cf2c..a78000a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3506,14 +3506,16 @@ static inline int get_xps_queue(struct net_device *dev,
 }
 
 u16 dev_pick_tx_zero(struct net_device *dev, struct sk_buff *skb,
-		     void *accel_priv, select_queue_fallback_t fallback)
+		     struct net_device *sb_dev,
+		     select_queue_fallback_t fallback)
 {
 	return 0;
 }
 EXPORT_SYMBOL(dev_pick_tx_zero);
 
 u16 dev_pick_tx_cpu_id(struct net_device *dev, struct sk_buff *skb,
-		       void *accel_priv, select_queue_fallback_t fallback)
+		       struct net_device *sb_dev,
+		       select_queue_fallback_t fallback)
 {
 	return (u16)raw_smp_processor_id() % dev->real_num_tx_queues;
 }
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 555e389..5e6cf2c 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -1130,7 +1130,7 @@ static void ieee80211_uninit(struct net_device *dev)
 
 static u16 ieee80211_netdev_select_queue(struct net_device *dev,
 					 struct sk_buff *skb,
-					 void *accel_priv,
+					 struct net_device *sb_dev,
 					 select_queue_fallback_t fallback)
 {
 	return ieee80211_select_queue(IEEE80211_DEV_TO_SUB_IF(dev), skb);
@@ -1176,7 +1176,7 @@ static u16 ieee80211_netdev_select_queue(struct net_device *dev,
 
 static u16 ieee80211_monitor_select_queue(struct net_device *dev,
 					  struct sk_buff *skb,
-					  void *accel_priv,
+					  struct net_device *sb_dev,
 					  select_queue_fallback_t fallback)
 {
 	struct ieee80211_sub_if_data *sdata = IEEE80211_DEV_TO_SUB_IF(dev);

^ permalink raw reply related

* [jkirsher/next-queue PATCH v2 5/7] net: Add generic ndo_select_queue functions
From: Alexander Duyck @ 2018-06-12 15:18 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher, netdev
In-Reply-To: <20180612151322.86792.97587.stgit@ahduyck-green-test.jf.intel.com>

This patch adds a generic version of the ndo_select_queue functions for
either returning 0 or selecting a queue based on the processor ID. This is
generally meant to just reduce the number of functions we have to change
in the future when we have to deal with ndo_select_queue changes.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/lantiq_etop.c   |   10 +---------
 drivers/net/ethernet/ti/netcp_core.c |    9 +--------
 drivers/staging/netlogic/xlr_net.c   |    9 +--------
 include/linux/netdevice.h            |    4 ++++
 net/core/dev.c                       |   14 ++++++++++++++
 net/packet/af_packet.c               |    2 +-
 6 files changed, 22 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ethernet/lantiq_etop.c b/drivers/net/ethernet/lantiq_etop.c
index afc8100..7a637b5 100644
--- a/drivers/net/ethernet/lantiq_etop.c
+++ b/drivers/net/ethernet/lantiq_etop.c
@@ -563,14 +563,6 @@ struct ltq_etop_priv {
 	spin_unlock_irqrestore(&priv->lock, flags);
 }
 
-static u16
-ltq_etop_select_queue(struct net_device *dev, struct sk_buff *skb,
-		      void *accel_priv, select_queue_fallback_t fallback)
-{
-	/* we are currently only using the first queue */
-	return 0;
-}
-
 static int
 ltq_etop_init(struct net_device *dev)
 {
@@ -641,7 +633,7 @@ struct ltq_etop_priv {
 	.ndo_set_mac_address = ltq_etop_set_mac_address,
 	.ndo_validate_addr = eth_validate_addr,
 	.ndo_set_rx_mode = ltq_etop_set_multicast_list,
-	.ndo_select_queue = ltq_etop_select_queue,
+	.ndo_select_queue = dev_pick_tx_zero,
 	.ndo_init = ltq_etop_init,
 	.ndo_tx_timeout = ltq_etop_tx_timeout,
 };
diff --git a/drivers/net/ethernet/ti/netcp_core.c b/drivers/net/ethernet/ti/netcp_core.c
index e40aa3e..2c455bd 100644
--- a/drivers/net/ethernet/ti/netcp_core.c
+++ b/drivers/net/ethernet/ti/netcp_core.c
@@ -1889,13 +1889,6 @@ static int netcp_rx_kill_vid(struct net_device *ndev, __be16 proto, u16 vid)
 	return err;
 }
 
-static u16 netcp_select_queue(struct net_device *dev, struct sk_buff *skb,
-			      void *accel_priv,
-			      select_queue_fallback_t fallback)
-{
-	return 0;
-}
-
 static int netcp_setup_tc(struct net_device *dev, enum tc_setup_type type,
 			  void *type_data)
 {
@@ -1972,7 +1965,7 @@ static int netcp_setup_tc(struct net_device *dev, enum tc_setup_type type,
 	.ndo_vlan_rx_add_vid	= netcp_rx_add_vid,
 	.ndo_vlan_rx_kill_vid	= netcp_rx_kill_vid,
 	.ndo_tx_timeout		= netcp_ndo_tx_timeout,
-	.ndo_select_queue	= netcp_select_queue,
+	.ndo_select_queue	= dev_pick_tx_zero,
 	.ndo_setup_tc		= netcp_setup_tc,
 };
 
diff --git a/drivers/staging/netlogic/xlr_net.c b/drivers/staging/netlogic/xlr_net.c
index e461168..4e6611e 100644
--- a/drivers/staging/netlogic/xlr_net.c
+++ b/drivers/staging/netlogic/xlr_net.c
@@ -290,13 +290,6 @@ static netdev_tx_t xlr_net_start_xmit(struct sk_buff *skb,
 	return NETDEV_TX_OK;
 }
 
-static u16 xlr_net_select_queue(struct net_device *ndev, struct sk_buff *skb,
-				void *accel_priv,
-				select_queue_fallback_t fallback)
-{
-	return (u16)smp_processor_id();
-}
-
 static void xlr_hw_set_mac_addr(struct net_device *ndev)
 {
 	struct xlr_net_priv *priv = netdev_priv(ndev);
@@ -403,7 +396,7 @@ static void xlr_stats(struct net_device *ndev, struct rtnl_link_stats64 *stats)
 	.ndo_open = xlr_net_open,
 	.ndo_stop = xlr_net_stop,
 	.ndo_start_xmit = xlr_net_start_xmit,
-	.ndo_select_queue = xlr_net_select_queue,
+	.ndo_select_queue = dev_pick_tx_cpu_id,
 	.ndo_set_mac_address = xlr_net_set_mac_addr,
 	.ndo_set_rx_mode = xlr_set_rx_mode,
 	.ndo_get_stats64 = xlr_stats,
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 91b3ca9..70f7ee3 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2551,6 +2551,10 @@ struct net_device *__dev_get_by_flags(struct net *net, unsigned short flags,
 void dev_close_many(struct list_head *head, bool unlink);
 void dev_disable_lro(struct net_device *dev);
 int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *newskb);
+u16 dev_pick_tx_zero(struct net_device *dev, struct sk_buff *skb,
+		     void *accel_priv, select_queue_fallback_t fallback);
+u16 dev_pick_tx_cpu_id(struct net_device *dev, struct sk_buff *skb,
+		       void *accel_priv, select_queue_fallback_t fallback);
 int dev_queue_xmit(struct sk_buff *skb);
 int dev_queue_xmit_accel(struct sk_buff *skb, struct net_device *sb_dev);
 int dev_direct_xmit(struct sk_buff *skb, u16 queue_id);
diff --git a/net/core/dev.c b/net/core/dev.c
index 2249294..1a1cf2c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3505,6 +3505,20 @@ static inline int get_xps_queue(struct net_device *dev,
 #endif
 }
 
+u16 dev_pick_tx_zero(struct net_device *dev, struct sk_buff *skb,
+		     void *accel_priv, select_queue_fallback_t fallback)
+{
+	return 0;
+}
+EXPORT_SYMBOL(dev_pick_tx_zero);
+
+u16 dev_pick_tx_cpu_id(struct net_device *dev, struct sk_buff *skb,
+		       void *accel_priv, select_queue_fallback_t fallback)
+{
+	return (u16)raw_smp_processor_id() % dev->real_num_tx_queues;
+}
+EXPORT_SYMBOL(dev_pick_tx_cpu_id);
+
 static u16 ___netdev_pick_tx(struct net_device *dev, struct sk_buff *skb,
 			     struct net_device *sb_dev)
 {
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index ee01856..905f7cd 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -277,7 +277,7 @@ static bool packet_use_direct_xmit(const struct packet_sock *po)
 
 static u16 __packet_pick_tx_queue(struct net_device *dev, struct sk_buff *skb)
 {
-	return (u16) raw_smp_processor_id() % dev->real_num_tx_queues;
+	return dev_pick_tx_cpu_id(dev, skb, NULL, NULL);
 }
 
 static u16 packet_pick_tx_queue(struct sk_buff *skb)

^ permalink raw reply related

* [jkirsher/next-queue PATCH v2 3/7] ixgbe: Add code to populate and use macvlan tc to Tx queue map
From: Alexander Duyck @ 2018-06-12 15:18 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher, netdev
In-Reply-To: <20180612151322.86792.97587.stgit@ahduyck-green-test.jf.intel.com>

This patch makes it so that we use the tc_to_txq mapping in the macvlan
device in order to select the Tx queue for outgoing packets.

The idea here is to try and move away from using ixgbe_select_queue and to
come up with a generic way to make this work for devices going forward. By
encoding this information in the netdev this can become something that can
be used generically as a solution for similar setups going forward.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   44 ++++++++++++++++++++++---
 1 file changed, 38 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index fc23e36..6e27848 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -5271,6 +5271,8 @@ static void ixgbe_clean_rx_ring(struct ixgbe_ring *rx_ring)
 static int ixgbe_fwd_ring_up(struct ixgbe_adapter *adapter,
 			     struct ixgbe_fwd_adapter *accel)
 {
+	u16 rss_i = adapter->ring_feature[RING_F_RSS].indices;
+	int num_tc = netdev_get_num_tc(adapter->netdev);
 	struct net_device *vdev = accel->netdev;
 	int i, baseq, err;
 
@@ -5282,6 +5284,11 @@ static int ixgbe_fwd_ring_up(struct ixgbe_adapter *adapter,
 	accel->rx_base_queue = baseq;
 	accel->tx_base_queue = baseq;
 
+	/* record configuration for macvlan interface in vdev */
+	for (i = 0; i < num_tc; i++)
+		netdev_bind_sb_channel_queue(adapter->netdev, vdev,
+					     i, rss_i, baseq + (rss_i * i));
+
 	for (i = 0; i < adapter->num_rx_queues_per_pool; i++)
 		adapter->rx_ring[baseq + i]->netdev = vdev;
 
@@ -5306,6 +5313,10 @@ static int ixgbe_fwd_ring_up(struct ixgbe_adapter *adapter,
 
 	netdev_err(vdev, "L2FW offload disabled due to L2 filter error\n");
 
+	/* unbind the queues and drop the subordinate channel config */
+	netdev_unbind_sb_channel(adapter->netdev, vdev);
+	netdev_set_sb_channel(vdev, 0);
+
 	clear_bit(accel->pool, adapter->fwd_bitmask);
 	kfree(accel);
 
@@ -8212,18 +8223,22 @@ static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb,
 			      void *accel_priv, select_queue_fallback_t fallback)
 {
 	struct ixgbe_fwd_adapter *fwd_adapter = accel_priv;
-	struct ixgbe_adapter *adapter;
-	int txq;
 #ifdef IXGBE_FCOE
+	struct ixgbe_adapter *adapter;
 	struct ixgbe_ring_feature *f;
 #endif
+	int txq;
 
 	if (fwd_adapter) {
-		adapter = netdev_priv(dev);
-		txq = reciprocal_scale(skb_get_hash(skb),
-				       adapter->num_rx_queues_per_pool);
+		u8 tc = netdev_get_num_tc(dev) ?
+			netdev_get_prio_tc_map(dev, skb->priority) : 0;
+		struct net_device *vdev = fwd_adapter->netdev;
+
+		txq = vdev->tc_to_txq[tc].offset;
+		txq += reciprocal_scale(skb_get_hash(skb),
+					vdev->tc_to_txq[tc].count);
 
-		return txq + fwd_adapter->tx_base_queue;
+		return txq;
 	}
 
 #ifdef IXGBE_FCOE
@@ -8777,6 +8792,11 @@ static int ixgbe_reassign_macvlan_pool(struct net_device *vdev, void *data)
 	/* if we cannot find a free pool then disable the offload */
 	netdev_err(vdev, "L2FW offload disabled due to lack of queue resources\n");
 	macvlan_release_l2fw_offload(vdev);
+
+	/* unbind the queues and drop the subordinate channel config */
+	netdev_unbind_sb_channel(adapter->netdev, vdev);
+	netdev_set_sb_channel(vdev, 0);
+
 	kfree(accel);
 
 	return 0;
@@ -9785,6 +9805,13 @@ static void *ixgbe_fwd_add(struct net_device *pdev, struct net_device *vdev)
 	if (!macvlan_supports_dest_filter(vdev))
 		return ERR_PTR(-EMEDIUMTYPE);
 
+	/* We need to lock down the macvlan to be a single queue device so that
+	 * we can reuse the tc_to_txq field in the macvlan netdev to represent
+	 * the queue mapping to our netdev.
+	 */
+	if (netif_is_multiqueue(vdev))
+		return ERR_PTR(-ERANGE);
+
 	pool = find_first_zero_bit(adapter->fwd_bitmask, adapter->num_rx_pools);
 	if (pool == adapter->num_rx_pools) {
 		u16 used_pools = adapter->num_vfs + adapter->num_rx_pools;
@@ -9841,6 +9868,7 @@ static void *ixgbe_fwd_add(struct net_device *pdev, struct net_device *vdev)
 		return ERR_PTR(-ENOMEM);
 
 	set_bit(pool, adapter->fwd_bitmask);
+	netdev_set_sb_channel(vdev, pool);
 	accel->pool = pool;
 	accel->netdev = vdev;
 
@@ -9882,6 +9910,10 @@ static void ixgbe_fwd_del(struct net_device *pdev, void *priv)
 		ring->netdev = NULL;
 	}
 
+	/* unbind the queues and drop the subordinate channel config */
+	netdev_unbind_sb_channel(pdev, accel->netdev);
+	netdev_set_sb_channel(accel->netdev, 0);
+
 	clear_bit(accel->pool, adapter->fwd_bitmask);
 	kfree(accel);
 }

^ permalink raw reply related

* [jkirsher/next-queue PATCH v2 4/7] net: Add support for subordinate traffic classes to netdev_pick_tx
From: Alexander Duyck @ 2018-06-12 15:18 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher, netdev
In-Reply-To: <20180612151322.86792.97587.stgit@ahduyck-green-test.jf.intel.com>

This change makes it so that we can support the concept of subordinate
device traffic classes to the core networking code. In doing this we can
start pulling out the driver specific bits needed to support selecting a
queue based on an upper device.

The solution at is currently stands is only partially implemented. I have
the start of some XPS bits in here, but I would still need to allow for
configuration of the XPS maps on the queues reserved for the subordinate
devices. For now I am using the reference to the sb_dev XPS map as just a
way to skip the lookup of the lower device XPS map for now as that would
result in the wrong queue being picked.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   19 +++-----
 drivers/net/macvlan.c                         |   10 +---
 include/linux/netdevice.h                     |    4 +-
 net/core/dev.c                                |   57 +++++++++++++++----------
 4 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 6e27848..053a54c 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -8219,20 +8219,17 @@ static void ixgbe_atr(struct ixgbe_ring *ring,
 					      input, common, ring->queue_index);
 }
 
+#ifdef IXGBE_FCOE
 static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb,
 			      void *accel_priv, select_queue_fallback_t fallback)
 {
-	struct ixgbe_fwd_adapter *fwd_adapter = accel_priv;
-#ifdef IXGBE_FCOE
 	struct ixgbe_adapter *adapter;
 	struct ixgbe_ring_feature *f;
-#endif
 	int txq;
 
-	if (fwd_adapter) {
-		u8 tc = netdev_get_num_tc(dev) ?
-			netdev_get_prio_tc_map(dev, skb->priority) : 0;
-		struct net_device *vdev = fwd_adapter->netdev;
+	if (accel_priv) {
+		u8 tc = netdev_get_prio_tc_map(dev, skb->priority);
+		struct net_device *vdev = accel_priv;
 
 		txq = vdev->tc_to_txq[tc].offset;
 		txq += reciprocal_scale(skb_get_hash(skb),
@@ -8241,8 +8238,6 @@ static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb,
 		return txq;
 	}
 
-#ifdef IXGBE_FCOE
-
 	/*
 	 * only execute the code below if protocol is FCoE
 	 * or FIP and we have FCoE enabled on the adapter
@@ -8268,11 +8263,9 @@ static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb,
 		txq -= f->indices;
 
 	return txq + f->offset;
-#else
-	return fallback(dev, skb);
-#endif
 }
 
+#endif
 static int ixgbe_xmit_xdp_ring(struct ixgbe_adapter *adapter,
 			       struct xdp_frame *xdpf)
 {
@@ -10076,7 +10069,6 @@ static int ixgbe_xdp_xmit(struct net_device *dev, int n,
 	.ndo_open		= ixgbe_open,
 	.ndo_stop		= ixgbe_close,
 	.ndo_start_xmit		= ixgbe_xmit_frame,
-	.ndo_select_queue	= ixgbe_select_queue,
 	.ndo_set_rx_mode	= ixgbe_set_rx_mode,
 	.ndo_validate_addr	= eth_validate_addr,
 	.ndo_set_mac_address	= ixgbe_set_mac,
@@ -10099,6 +10091,7 @@ static int ixgbe_xdp_xmit(struct net_device *dev, int n,
 	.ndo_poll_controller	= ixgbe_netpoll,
 #endif
 #ifdef IXGBE_FCOE
+	.ndo_select_queue	= ixgbe_select_queue,
 	.ndo_fcoe_ddp_setup = ixgbe_fcoe_ddp_get,
 	.ndo_fcoe_ddp_target = ixgbe_fcoe_ddp_target,
 	.ndo_fcoe_ddp_done = ixgbe_fcoe_ddp_put,
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index adde8fc..401e1d1 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -514,7 +514,6 @@ static int macvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev)
 	const struct macvlan_dev *vlan = netdev_priv(dev);
 	const struct macvlan_port *port = vlan->port;
 	const struct macvlan_dev *dest;
-	void *accel_priv = NULL;
 
 	if (vlan->mode == MACVLAN_MODE_BRIDGE) {
 		const struct ethhdr *eth = (void *)skb->data;
@@ -533,15 +532,10 @@ static int macvlan_queue_xmit(struct sk_buff *skb, struct net_device *dev)
 			return NET_XMIT_SUCCESS;
 		}
 	}
-
-	/* For packets that are non-multicast and not bridged we will pass
-	 * the necessary information so that the lowerdev can distinguish
-	 * the source of the packets via the accel_priv value.
-	 */
-	accel_priv = vlan->accel_priv;
 xmit_world:
 	skb->dev = vlan->lowerdev;
-	return dev_queue_xmit_accel(skb, accel_priv);
+	return dev_queue_xmit_accel(skb,
+				    netdev_get_sb_channel(dev) ? dev : NULL);
 }
 
 static inline netdev_tx_t macvlan_netpoll_send_skb(struct macvlan_dev *vlan, struct sk_buff *skb)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 41b4660..91b3ca9 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2090,7 +2090,7 @@ static inline void netdev_for_each_tx_queue(struct net_device *dev,
 
 struct netdev_queue *netdev_pick_tx(struct net_device *dev,
 				    struct sk_buff *skb,
-				    void *accel_priv);
+				    struct net_device *sb_dev);
 
 /* returns the headroom that the master device needs to take in account
  * when forwarding to this dev
@@ -2552,7 +2552,7 @@ struct net_device *__dev_get_by_flags(struct net *net, unsigned short flags,
 void dev_disable_lro(struct net_device *dev);
 int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *newskb);
 int dev_queue_xmit(struct sk_buff *skb);
-int dev_queue_xmit_accel(struct sk_buff *skb, void *accel_priv);
+int dev_queue_xmit_accel(struct sk_buff *skb, struct net_device *sb_dev);
 int dev_direct_xmit(struct sk_buff *skb, u16 queue_id);
 int register_netdevice(struct net_device *dev);
 void unregister_netdevice_queue(struct net_device *dev, struct list_head *head);
diff --git a/net/core/dev.c b/net/core/dev.c
index 27fe4f2..2249294 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2704,24 +2704,26 @@ void netif_device_attach(struct net_device *dev)
  * Returns a Tx hash based on the given packet descriptor a Tx queues' number
  * to be used as a distribution range.
  */
-static u16 skb_tx_hash(const struct net_device *dev, struct sk_buff *skb)
+static u16 skb_tx_hash(const struct net_device *dev,
+		       const struct net_device *sb_dev,
+		       struct sk_buff *skb)
 {
 	u32 hash;
 	u16 qoffset = 0;
 	u16 qcount = dev->real_num_tx_queues;
 
+	if (dev->num_tc) {
+		u8 tc = netdev_get_prio_tc_map(dev, skb->priority);
+
+		qoffset = sb_dev->tc_to_txq[tc].offset;
+		qcount = sb_dev->tc_to_txq[tc].count;
+	}
+
 	if (skb_rx_queue_recorded(skb)) {
 		hash = skb_get_rx_queue(skb);
 		while (unlikely(hash >= qcount))
 			hash -= qcount;
-		return hash;
-	}
-
-	if (dev->num_tc) {
-		u8 tc = netdev_get_prio_tc_map(dev, skb->priority);
-
-		qoffset = dev->tc_to_txq[tc].offset;
-		qcount = dev->tc_to_txq[tc].count;
+		return hash + qoffset;
 	}
 
 	return (u16) reciprocal_scale(skb_get_hash(skb), qcount) + qoffset;
@@ -3465,7 +3467,9 @@ int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *skb)
 }
 #endif /* CONFIG_NET_EGRESS */
 
-static inline int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
+static inline int get_xps_queue(struct net_device *dev,
+				struct net_device *sb_dev,
+				struct sk_buff *skb)
 {
 #ifdef CONFIG_XPS
 	struct xps_dev_maps *dev_maps;
@@ -3473,7 +3477,7 @@ static inline int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
 	int queue_index = -1;
 
 	rcu_read_lock();
-	dev_maps = rcu_dereference(dev->xps_maps);
+	dev_maps = rcu_dereference(sb_dev->xps_maps);
 	if (dev_maps) {
 		unsigned int tci = skb->sender_cpu - 1;
 
@@ -3501,17 +3505,20 @@ static inline int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
 #endif
 }
 
-static u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb)
+static u16 ___netdev_pick_tx(struct net_device *dev, struct sk_buff *skb,
+			     struct net_device *sb_dev)
 {
 	struct sock *sk = skb->sk;
 	int queue_index = sk_tx_queue_get(sk);
 
+	sb_dev = sb_dev ? : dev;
+
 	if (queue_index < 0 || skb->ooo_okay ||
 	    queue_index >= dev->real_num_tx_queues) {
-		int new_index = get_xps_queue(dev, skb);
+		int new_index = get_xps_queue(dev, sb_dev, skb);
 
 		if (new_index < 0)
-			new_index = skb_tx_hash(dev, skb);
+			new_index = skb_tx_hash(dev, sb_dev, skb);
 
 		if (queue_index != new_index && sk &&
 		    sk_fullsock(sk) &&
@@ -3524,9 +3531,15 @@ static u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb)
 	return queue_index;
 }
 
+static u16 __netdev_pick_tx(struct net_device *dev,
+			    struct sk_buff *skb)
+{
+	return ___netdev_pick_tx(dev, skb, NULL);
+}
+
 struct netdev_queue *netdev_pick_tx(struct net_device *dev,
 				    struct sk_buff *skb,
-				    void *accel_priv)
+				    struct net_device *sb_dev)
 {
 	int queue_index = 0;
 
@@ -3541,10 +3554,10 @@ struct netdev_queue *netdev_pick_tx(struct net_device *dev,
 		const struct net_device_ops *ops = dev->netdev_ops;
 
 		if (ops->ndo_select_queue)
-			queue_index = ops->ndo_select_queue(dev, skb, accel_priv,
+			queue_index = ops->ndo_select_queue(dev, skb, sb_dev,
 							    __netdev_pick_tx);
 		else
-			queue_index = __netdev_pick_tx(dev, skb);
+			queue_index = ___netdev_pick_tx(dev, skb, sb_dev);
 
 		queue_index = netdev_cap_txqueue(dev, queue_index);
 	}
@@ -3556,7 +3569,7 @@ struct netdev_queue *netdev_pick_tx(struct net_device *dev,
 /**
  *	__dev_queue_xmit - transmit a buffer
  *	@skb: buffer to transmit
- *	@accel_priv: private data used for L2 forwarding offload
+ *	@sb_dev: suboordinate device used for L2 forwarding offload
  *
  *	Queue a buffer for transmission to a network device. The caller must
  *	have set the device and priority and built the buffer before calling
@@ -3579,7 +3592,7 @@ struct netdev_queue *netdev_pick_tx(struct net_device *dev,
  *      the BH enable code must have IRQs enabled so that it will not deadlock.
  *          --BLG
  */
-static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
+static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
 {
 	struct net_device *dev = skb->dev;
 	struct netdev_queue *txq;
@@ -3618,7 +3631,7 @@ static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
 	else
 		skb_dst_force(skb);
 
-	txq = netdev_pick_tx(dev, skb, accel_priv);
+	txq = netdev_pick_tx(dev, skb, sb_dev);
 	q = rcu_dereference_bh(txq->qdisc);
 
 	trace_net_dev_queue(skb);
@@ -3692,9 +3705,9 @@ int dev_queue_xmit(struct sk_buff *skb)
 }
 EXPORT_SYMBOL(dev_queue_xmit);
 
-int dev_queue_xmit_accel(struct sk_buff *skb, void *accel_priv)
+int dev_queue_xmit_accel(struct sk_buff *skb, struct net_device *sb_dev)
 {
-	return __dev_queue_xmit(skb, accel_priv);
+	return __dev_queue_xmit(skb, sb_dev);
 }
 EXPORT_SYMBOL(dev_queue_xmit_accel);
 

^ permalink raw reply related

* [jkirsher/next-queue PATCH v2 2/7] net: Add support for subordinate device traffic classes
From: Alexander Duyck @ 2018-06-12 15:18 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher, netdev
In-Reply-To: <20180612151322.86792.97587.stgit@ahduyck-green-test.jf.intel.com>

This patch is meant to provide the basic tools needed to allow us to create
subordinate device traffic classes. The general idea here is to allow
subdividing the queues of a device into queue groups accessible through an
upper device such as a macvlan.

The idea here is to enforce the idea that an upper device has to be a
single queue device, ideally with IFF_NO_QUQUE set. With that being the
case we can pretty much guarantee that the tc_to_txq mappings and XPS maps
for the upper device are unused. As such we could reuse those in order to
support subdividing the lower device and distributing those queues between
the subordinate devices.

In order to distinguish between a regular set of traffic classes and if a
device is carrying subordinate traffic classes I changed num_tc from a u8
to a s16 value and use the negative values to represent the suboordinate
pool values. So starting at -1 and running to -32768 we can encode those as
pool values, and the existing values of 0 to 15 can be maintained.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 include/linux/netdevice.h |   16 ++++++++
 net/core/dev.c            |   89 +++++++++++++++++++++++++++++++++++++++++++++
 net/core/net-sysfs.c      |   21 ++++++++++-
 3 files changed, 124 insertions(+), 2 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3ec9850..41b4660 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -569,6 +569,9 @@ struct netdev_queue {
 	 * (/sys/class/net/DEV/Q/trans_timeout)
 	 */
 	unsigned long		trans_timeout;
+
+	/* Suboordinate device that the queue has been assigned to */
+	struct net_device	*sb_dev;
 /*
  * write-mostly part
  */
@@ -1978,7 +1981,7 @@ struct net_device {
 #ifdef CONFIG_DCB
 	const struct dcbnl_rtnl_ops *dcbnl_ops;
 #endif
-	u8			num_tc;
+	s16			num_tc;
 	struct netdev_tc_txq	tc_to_txq[TC_MAX_QUEUE];
 	u8			prio_tc_map[TC_BITMASK + 1];
 
@@ -2032,6 +2035,17 @@ int netdev_get_num_tc(struct net_device *dev)
 	return dev->num_tc;
 }
 
+void netdev_unbind_sb_channel(struct net_device *dev,
+			      struct net_device *sb_dev);
+int netdev_bind_sb_channel_queue(struct net_device *dev,
+				 struct net_device *sb_dev,
+				 u8 tc, u16 count, u16 offset);
+int netdev_set_sb_channel(struct net_device *dev, u16 channel);
+static inline int netdev_get_sb_channel(struct net_device *dev)
+{
+	return max_t(int, -dev->num_tc, 0);
+}
+
 static inline
 struct netdev_queue *netdev_get_tx_queue(const struct net_device *dev,
 					 unsigned int index)
diff --git a/net/core/dev.c b/net/core/dev.c
index 6e18242..27fe4f2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2068,11 +2068,13 @@ int netdev_txq_to_tc(struct net_device *dev, unsigned int txq)
 		struct netdev_tc_txq *tc = &dev->tc_to_txq[0];
 		int i;
 
+		/* walk through the TCs and see if it falls into any of them */
 		for (i = 0; i < TC_MAX_QUEUE; i++, tc++) {
 			if ((txq - tc->offset) < tc->count)
 				return i;
 		}
 
+		/* didn't find it, just return -1 to indicate no match */
 		return -1;
 	}
 
@@ -2215,7 +2217,14 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
 	bool active = false;
 
 	if (dev->num_tc) {
+		/* Do not allow XPS on subordinate device directly */
 		num_tc = dev->num_tc;
+		if (num_tc < 0)
+			return -EINVAL;
+
+		/* If queue belongs to subordinate dev use its map */
+		dev = netdev_get_tx_queue(dev, index)->sb_dev ? : dev;
+
 		tc = netdev_txq_to_tc(dev, index);
 		if (tc < 0)
 			return -EINVAL;
@@ -2366,11 +2375,25 @@ int netif_set_xps_queue(struct net_device *dev, const struct cpumask *mask,
 EXPORT_SYMBOL(netif_set_xps_queue);
 
 #endif
+static void netdev_unbind_all_sb_channels(struct net_device *dev)
+{
+	struct netdev_queue *txq = &dev->_tx[dev->num_tx_queues];
+
+	/* Unbind any subordinate channels */
+	while (txq-- != &dev->_tx[0]) {
+		if (txq->sb_dev)
+			netdev_unbind_sb_channel(dev, txq->sb_dev);
+	}
+}
+
 void netdev_reset_tc(struct net_device *dev)
 {
 #ifdef CONFIG_XPS
 	netif_reset_xps_queues_gt(dev, 0);
 #endif
+	netdev_unbind_all_sb_channels(dev);
+
+	/* Reset TC configuration of device */
 	dev->num_tc = 0;
 	memset(dev->tc_to_txq, 0, sizeof(dev->tc_to_txq));
 	memset(dev->prio_tc_map, 0, sizeof(dev->prio_tc_map));
@@ -2399,11 +2422,77 @@ int netdev_set_num_tc(struct net_device *dev, u8 num_tc)
 #ifdef CONFIG_XPS
 	netif_reset_xps_queues_gt(dev, 0);
 #endif
+	netdev_unbind_all_sb_channels(dev);
+
 	dev->num_tc = num_tc;
 	return 0;
 }
 EXPORT_SYMBOL(netdev_set_num_tc);
 
+void netdev_unbind_sb_channel(struct net_device *dev,
+			      struct net_device *sb_dev)
+{
+	struct netdev_queue *txq = &dev->_tx[dev->num_tx_queues];
+
+#ifdef CONFIG_XPS
+	netif_reset_xps_queues_gt(sb_dev, 0);
+#endif
+	memset(sb_dev->tc_to_txq, 0, sizeof(sb_dev->tc_to_txq));
+	memset(sb_dev->prio_tc_map, 0, sizeof(sb_dev->prio_tc_map));
+
+	while (txq-- != &dev->_tx[0]) {
+		if (txq->sb_dev == sb_dev)
+			txq->sb_dev = NULL;
+	}
+}
+EXPORT_SYMBOL(netdev_unbind_sb_channel);
+
+int netdev_bind_sb_channel_queue(struct net_device *dev,
+				 struct net_device *sb_dev,
+				 u8 tc, u16 count, u16 offset)
+{
+	/* Make certain the sb_dev and dev are already configured */
+	if (sb_dev->num_tc >= 0 || tc >= dev->num_tc)
+		return -EINVAL;
+
+	/* We cannot hand out queues we don't have */
+	if ((offset + count) > dev->real_num_tx_queues)
+		return -EINVAL;
+
+	/* Record the mapping */
+	sb_dev->tc_to_txq[tc].count = count;
+	sb_dev->tc_to_txq[tc].offset = offset;
+
+	/* Provide a way for Tx queue to find the tc_to_txq map or
+	 * XPS map for itself.
+	 */
+	while (count--)
+		netdev_get_tx_queue(dev, count + offset)->sb_dev = sb_dev;
+
+	return 0;
+}
+EXPORT_SYMBOL(netdev_bind_sb_channel_queue);
+
+int netdev_set_sb_channel(struct net_device *dev, u16 channel)
+{
+	/* Do not use a multiqueue device to represent a subordinate channel */
+	if (netif_is_multiqueue(dev))
+		return -ENODEV;
+
+	/* We allow channels 1 - 32767 to be used for subordinate channels.
+	 * Channel 0 is meant to be "native" mode and used only to represent
+	 * the main root device. We allow writing 0 to reset the device back
+	 * to normal mode after being used as a subordinate channel.
+	 */
+	if (channel > S16_MAX)
+		return -EINVAL;
+
+	dev->num_tc = -channel;
+
+	return 0;
+}
+EXPORT_SYMBOL(netdev_set_sb_channel);
+
 /*
  * Routine to help set real_num_tx_queues. To avoid skbs mapped to queues
  * greater than real_num_tx_queues stale skbs on the qdisc must be flushed.
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 335c6a4..bd067b1 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1054,11 +1054,23 @@ static ssize_t traffic_class_show(struct netdev_queue *queue,
 		return -ENOENT;
 
 	index = get_netdev_queue_index(queue);
+
+	/* If queue belongs to subordinate dev use its tc mapping */
+	dev = netdev_get_tx_queue(dev, index)->sb_dev ? : dev;
+
 	tc = netdev_txq_to_tc(dev, index);
 	if (tc < 0)
 		return -EINVAL;
 
-	return sprintf(buf, "%u\n", tc);
+	/* We can report the traffic class one of two ways:
+	 * Subordinate device traffic classes are reported with the traffic
+	 * class first, and then the subordinate class so for example TC0 on
+	 * subordinate device 2 will be reported as "0-2". If the queue
+	 * belongs to the root device it will be reported with just the
+	 * traffic class, so just "0" for TC 0 for example.
+	 */
+	return dev->num_tc < 0 ? sprintf(buf, "%u%d\n", tc, dev->num_tc) :
+				 sprintf(buf, "%u\n", tc);
 }
 
 #ifdef CONFIG_XPS
@@ -1225,7 +1237,14 @@ static ssize_t xps_cpus_show(struct netdev_queue *queue,
 	index = get_netdev_queue_index(queue);
 
 	if (dev->num_tc) {
+		/* Do not allow XPS on subordinate device directly */
 		num_tc = dev->num_tc;
+		if (num_tc < 0)
+			return -EINVAL;
+
+		/* If queue belongs to subordinate dev use its map */
+		dev = netdev_get_tx_queue(dev, index)->sb_dev ? : dev;
+
 		tc = netdev_txq_to_tc(dev, index);
 		if (tc < 0)
 			return -EINVAL;

^ permalink raw reply related

* [jkirsher/next-queue PATCH v2 1/7] net-sysfs: Drop support for XPS and traffic_class on single queue device
From: Alexander Duyck @ 2018-06-12 15:18 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher, netdev
In-Reply-To: <20180612151322.86792.97587.stgit@ahduyck-green-test.jf.intel.com>

This patch makes it so that we do not report the traffic class or allow XPS
configuration on single queue devices. This is mostly to avoid unnecessary
complexity with changes I have planned that will allow us to reuse
the unused tc_to_txq and XPS configuration on a single queue device to
allow it to make use of a subset of queues on an underlying device.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 net/core/net-sysfs.c |   15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index bb7e80f..335c6a4 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1047,9 +1047,14 @@ static ssize_t traffic_class_show(struct netdev_queue *queue,
 				  char *buf)
 {
 	struct net_device *dev = queue->dev;
-	int index = get_netdev_queue_index(queue);
-	int tc = netdev_txq_to_tc(dev, index);
+	int index;
+	int tc;
 
+	if (!netif_is_multiqueue(dev))
+		return -ENOENT;
+
+	index = get_netdev_queue_index(queue);
+	tc = netdev_txq_to_tc(dev, index);
 	if (tc < 0)
 		return -EINVAL;
 
@@ -1214,6 +1219,9 @@ static ssize_t xps_cpus_show(struct netdev_queue *queue,
 	cpumask_var_t mask;
 	unsigned long index;
 
+	if (!netif_is_multiqueue(dev))
+		return -ENOENT;
+
 	index = get_netdev_queue_index(queue);
 
 	if (dev->num_tc) {
@@ -1260,6 +1268,9 @@ static ssize_t xps_cpus_store(struct netdev_queue *queue,
 	cpumask_var_t mask;
 	int err;
 
+	if (!netif_is_multiqueue(dev))
+		return -ENOENT;
+
 	if (!capable(CAP_NET_ADMIN))
 		return -EPERM;
 

^ permalink raw reply related

* [jkirsher/next-queue PATCH v2 0/7] Add support for L2 Fwd Offload w/o ndo_select_queue
From: Alexander Duyck @ 2018-06-12 15:18 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher, netdev

This patch series is meant to allow support for the L2 forward offload, aka
MACVLAN offload without the need for using ndo_select_queue.

The existing solution currently requires that we use ndo_select_queue in
the transmit path if we want to associate specific Tx queues with a given
MACVLAN interface. In order to get away from this we need to repurpose the
tc_to_txq array and XPS pointer for the MACVLAN interface and use those as
a means of accessing the queues on the lower device. As a result we cannot
offload a device that is configured as multiqueue, however it doesn't
really make sense to configure a macvlan interfaced as being multiqueue
anyway since it doesn't really have a qdisc of its own in the first place.

I am submitting this as an RFC for the netdev mailing list, and officially
submitting it for testing to Jeff Kirsher's next-queue in order to validate
the ixgbe specific bits.

The big changes in this set are:
  Allow lower device to update tc_to_txq and XPS map of offloaded MACVLAN
  Disable XPS for single queue devices
  Replace accel_priv with sb_dev in ndo_select_queue
  Add sb_dev parameter to fallback function for ndo_select_queue
  Consolidated ndo_select_queue functions that appeared to be duplicates

v2: Implement generic "select_queue" functions instead of "fallback" functions.
    Tweak last two patches to account for changes in dev_pick_tx_xxx functions.

---

Alexander Duyck (7):
      net-sysfs: Drop support for XPS and traffic_class on single queue device
      net: Add support for subordinate device traffic classes
      ixgbe: Add code to populate and use macvlan tc to Tx queue map
      net: Add support for subordinate traffic classes to netdev_pick_tx
      net: Add generic ndo_select_queue functions
      net: allow ndo_select_queue to pass netdev
      net: allow fallback function to pass netdev


 drivers/infiniband/hw/hfi1/vnic_main.c            |    2 
 drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c |    4 -
 drivers/net/bonding/bond_main.c                   |    3 
 drivers/net/ethernet/amazon/ena/ena_netdev.c      |    5 -
 drivers/net/ethernet/broadcom/bcmsysport.c        |    6 -
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c   |    6 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h   |    3 
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c   |    5 -
 drivers/net/ethernet/hisilicon/hns/hns_enet.c     |    5 -
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     |   62 ++++++--
 drivers/net/ethernet/lantiq_etop.c                |   10 -
 drivers/net/ethernet/mellanox/mlx4/en_tx.c        |    7 +
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h      |    3 
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    3 
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    5 -
 drivers/net/ethernet/renesas/ravb_main.c          |    3 
 drivers/net/ethernet/sun/ldmvsw.c                 |    3 
 drivers/net/ethernet/sun/sunvnet.c                |    3 
 drivers/net/ethernet/ti/netcp_core.c              |    9 -
 drivers/net/hyperv/netvsc_drv.c                   |    6 -
 drivers/net/macvlan.c                             |   10 -
 drivers/net/net_failover.c                        |    7 +
 drivers/net/team/team.c                           |    3 
 drivers/net/tun.c                                 |    3 
 drivers/net/wireless/marvell/mwifiex/main.c       |    3 
 drivers/net/xen-netback/interface.c               |    4 -
 drivers/net/xen-netfront.c                        |    3 
 drivers/staging/netlogic/xlr_net.c                |    9 -
 drivers/staging/rtl8188eu/os_dep/os_intfs.c       |    3 
 drivers/staging/rtl8723bs/os_dep/os_intfs.c       |    7 -
 include/linux/netdevice.h                         |   34 ++++-
 net/core/dev.c                                    |  156 ++++++++++++++++++---
 net/core/net-sysfs.c                              |   36 ++++-
 net/mac80211/iface.c                              |    4 -
 net/packet/af_packet.c                            |    7 +
 35 files changed, 312 insertions(+), 130 deletions(-)

^ permalink raw reply

* Re: [PATCH 1/2] Convert target drivers to use sbitmap
From: Bart Van Assche @ 2018-06-12 15:22 UTC (permalink / raw)
  To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-usb@vger.kernel.org, willy@infradead.org,
	virtualization@lists.linux-foundation.org,
	kent.overstreet@gmail.com, linux1394-devel@lists.sourceforge.net,
	jgross@suse.com, axboe@kernel.dk, linux-scsi@vger.kernel.org,
	qla2xxx-upstream@qlogic.com, target-devel@vger.kernel.org,
	netdev@vger.kernel.org
  Cc: mawilcox@microsoft.com
In-Reply-To: <20180515160043.27044-2-willy@infradead.org>

On Tue, 2018-05-15 at 09:00 -0700, Matthew Wilcox wrote:
> diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
> index 025dc2d3f3de..cdf671c2af61 100644
> --- a/drivers/scsi/qla2xxx/qla_target.c
> +++ b/drivers/scsi/qla2xxx/qla_target.c
> @@ -3719,7 +3719,8 @@ void qlt_free_cmd(struct qla_tgt_cmd *cmd)
>  		return;
>  	}
>  	cmd->jiffies_at_free = get_jiffies_64();
> -	percpu_ida_free(&sess->se_sess->sess_tag_pool, cmd->se_cmd.map_tag);
> +	sbitmap_queue_clear(&sess->se_sess->sess_tag_pool, cmd->se_cmd.map_tag,
> +			cmd->se_cmd.map_cpu);
>  }
>  EXPORT_SYMBOL(qlt_free_cmd);

Please introduce functions in the target core for allocating and freeing a tag
instead of spreading the knowledge of how to allocate and free tags over all
target drivers.
 
> +int iscsit_wait_for_tag(struct se_session *se_sess, int state, int *cpup)
> +{
> +	int tag = -1;
> +	DEFINE_WAIT(wait);
> +	struct sbq_wait_state *ws;
> +
> +	if (state == TASK_RUNNING)
> +		return tag;
> +
> +	ws = &se_sess->sess_tag_pool.ws[0];
> +	for (;;) {
> +		prepare_to_wait_exclusive(&ws->wait, &wait, state);
> +		if (signal_pending_state(state, current))
> +			break;

This looks weird to me. Shouldn't target code ignore signals instead of causing
tag allocation to fail if a signal is received?

> +		schedule();
> +		tag = sbitmap_queue_get(&se_sess->sess_tag_pool, cpup);
> +	}
> +
> +	finish_wait(&ws->wait, &wait);
> +	return tag;
> +}

Thanks,

Bart.

^ permalink raw reply

* Re: [PATCH net 2/3] hv_netvsc: fix network namespace issues with VF support
From: Stephen Hemminger @ 2018-06-12 15:10 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: kys, haiyangz, sthemmin, devel, netdev
In-Reply-To: <20180612095128.kslkt6ban6otanbs@mwanda>

On Tue, 12 Jun 2018 12:51:28 +0300
Dan Carpenter <dan.carpenter@oracle.com> wrote:

> On Mon, Jun 11, 2018 at 12:44:55PM -0700, Stephen Hemminger wrote:
> > When finding the parent netvsc device, the search needs to be across
> > all netvsc device instances (independent of network namespace).
> > 
> > Find parent device of VF using upper_dev_get routine which
> > searches only adjacent list.
> > 
> > Fixes: e8ff40d4bff1 ("hv_netvsc: improve VF device matching")
> > Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> > 
> > netns aware byref  
> 
> What?  Presumably that wasn't supposed to be part of the commit message.

That was leftover from earlier commit message

^ permalink raw reply

* Re: [PATCH 3/3 RFC] Revert "net: stmmac: fix build failure due to missing COMMON_CLK dependency"
From: Jose Abreu @ 2018-06-12 14:51 UTC (permalink / raw)
  To: Geert Uytterhoeven, Greg Ungerer, Ralf Baechle, James Hogan,
	Giuseppe Cavallaro, Alexandre Torgue, Jose Abreu, Corentin Labbe,
	David S . Miller
  Cc: Arnd Bergmann, linux-m68k, linux-mips, netdev, linux-kernel
In-Reply-To: <1528706663-20670-4-git-send-email-geert@linux-m68k.org>

Hi,

On 11-06-2018 09:44, Geert Uytterhoeven wrote:
> This reverts commit bde4975310eb1982bd0bbff673989052d92fd481.
>
> All legacy clock implementations now implement clk_set_rate() (Some
> implementations may be dummies, though).
>
> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
>

This seems okay by me. You can send a non-rfc patch with my ack
once other 2 patches get accepted:

Acked-by: Jose Abreu <joabreu@synopsys.com>

Thanks and Best Regards,
Jose Miguel Abreu

^ permalink raw reply

* [RFC PATCH] ath10k: ath10k_snoc_get_ce_id_from_irq() can be static
From: kbuild test robot @ 2018-06-12 14:50 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: kbuild-all, Kalle Valo, David S. Miller, Niklas Cassel, ath10k,
	linux-wireless, netdev, linux-kernel
In-Reply-To: <20180612113907.15043-2-niklas.cassel@linaro.org>


Fixes: aecf55e7df3a ("ath10k: allow ATH10K_SNOC with COMPILE_TEST")
Signed-off-by: kbuild test robot <fengguang.wu@intel.com>
---
 snoc.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
index a3a7042..92ddb1c 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.c
+++ b/drivers/net/wireless/ath/ath10k/snoc.c
@@ -820,7 +820,7 @@ static const struct ath10k_bus_ops ath10k_snoc_bus_ops = {
 	.write32	= ath10k_snoc_write32,
 };
 
-int ath10k_snoc_get_ce_id_from_irq(struct ath10k *ar, int irq)
+static int ath10k_snoc_get_ce_id_from_irq(struct ath10k *ar, int irq)
 {
 	struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
 	int i;
@@ -868,7 +868,7 @@ static int ath10k_snoc_napi_poll(struct napi_struct *ctx, int budget)
 	return done;
 }
 
-void ath10k_snoc_init_napi(struct ath10k *ar)
+static void ath10k_snoc_init_napi(struct ath10k *ar)
 {
 	netif_napi_add(&ar->napi_dev, &ar->napi, ath10k_snoc_napi_poll,
 		       ATH10K_NAPI_BUDGET);

^ permalink raw reply related

* Re: [PATCH 2/2] ath10k: allow ATH10K_SNOC with COMPILE_TEST
From: kbuild test robot @ 2018-06-12 14:50 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: kbuild-all, Kalle Valo, David S. Miller, Niklas Cassel, ath10k,
	linux-wireless, netdev, linux-kernel
In-Reply-To: <20180612113907.15043-2-niklas.cassel@linaro.org>

Hi Niklas,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on ath6kl/ath-next]
[also build test WARNING on next-20180612]
[cannot apply to v4.17]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Niklas-Cassel/ath10k-do-not-mix-spaces-and-tabs-in-Kconfig/20180612-194241
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git ath-next
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> drivers/net/wireless/ath/ath10k/snoc.c:823:5: sparse: symbol 'ath10k_snoc_get_ce_id_from_irq' was not declared. Should it be static?
>> drivers/net/wireless/ath/ath10k/snoc.c:871:6: sparse: symbol 'ath10k_snoc_init_napi' was not declared. Should it be static?

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply

* RE: [v3, 00/10] Support DPAA PTP clock and timestamping
From: Madalin-cristian Bucur @ 2018-06-12 14:27 UTC (permalink / raw)
  To: Y.b. Lu, netdev@vger.kernel.org, Richard Cochran, Rob Herring,
	Shawn Guo, David S . Miller
  Cc: devicetree@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Y.b. Lu
In-Reply-To: <20180607092050.46128-1-yangbo.lu@nxp.com>

> -----Original Message-----
> From: Yangbo Lu [mailto:yangbo.lu@nxp.com]
> Sent: Thursday, June 7, 2018 12:21 PM
> To: netdev@vger.kernel.org; Madalin-cristian Bucur
> <madalin.bucur@nxp.com>; Richard Cochran <richardcochran@gmail.com>;
> Rob Herring <robh+dt@kernel.org>; Shawn Guo <shawnguo@kernel.org>;
> David S . Miller <davem@davemloft.net>
> Cc: devicetree@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-arm-
> kernel@lists.infradead.org; linux-kernel@vger.kernel.org; Y.b. Lu
> <yangbo.lu@nxp.com>
> Subject: [v3, 00/10] Support DPAA PTP clock and timestamping
> 
> This patchset is to support DPAA FMAN PTP clock and HW timestamping.
> It had been verified on both ARM platform and PPC platform.
> - The patch #1 to patch #5 are to support DPAA FMAN 1588 timer in
>   ptp_qoriq driver.
> - The patch #6 to patch #10 are to add HW timestamping support in
>   DPAA ethernet driver.
> 
> Yangbo Lu (10):
>   fsl/fman: share the event interrupt
>   ptp: support DPAA FMan 1588 timer in ptp_qoriq
>   dt-binding: ptp_qoriq: add DPAA FMan support
>   powerpc/mpc85xx: move ptp timer out of fman in dts
>   arm64: dts: fsl: move ptp timer out of fman
>   fsl/fman: add set_tstamp interface
>   fsl/fman_port: support getting timestamp
>   fsl/fman: define frame description command UPD
>   dpaa_eth: add support for hardware timestamping
>   dpaa_eth: add the get_ts_info interface for ethtool
> 
>  Documentation/devicetree/bindings/net/fsl-fman.txt |   25 +-----
>  .../devicetree/bindings/ptp/ptp-qoriq.txt          |   15 +++-
>  arch/arm64/boot/dts/freescale/qoriq-fman3-0.dtsi   |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman-0.dtsi        |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman-1.dtsi        |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman3-0.dtsi       |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman3-1.dtsi       |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman3l-0.dtsi      |   14 ++-
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c     |   88
> ++++++++++++++++-
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.h     |    3 +
>  drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c |   39 ++++++++
>  drivers/net/ethernet/freescale/fman/fman.c         |    3 +-
>  drivers/net/ethernet/freescale/fman/fman.h         |    1 +
>  drivers/net/ethernet/freescale/fman/fman_dtsec.c   |   27 +++++
>  drivers/net/ethernet/freescale/fman/fman_dtsec.h   |    1 +
>  drivers/net/ethernet/freescale/fman/fman_memac.c   |    5 +
>  drivers/net/ethernet/freescale/fman/fman_memac.h   |    1 +
>  drivers/net/ethernet/freescale/fman/fman_port.c    |   12 +++
>  drivers/net/ethernet/freescale/fman/fman_port.h    |    2 +
>  drivers/net/ethernet/freescale/fman/fman_tgec.c    |   21 ++++
>  drivers/net/ethernet/freescale/fman/fman_tgec.h    |    1 +
>  drivers/net/ethernet/freescale/fman/mac.c          |    3 +
>  drivers/net/ethernet/freescale/fman/mac.h          |    1 +
>  drivers/ptp/Kconfig                                |    2 +-
>  drivers/ptp/ptp_qoriq.c                            |  104 ++++++++++++-------
>  include/linux/fsl/ptp_qoriq.h                      |   38 ++++++--
>  26 files changed, 361 insertions(+), 115 deletions(-)

Acked-by: Madalin Bucur <madalin.bucur@nxp.com>

^ permalink raw reply

* [PATCH 1/1] ip: add rmnet initial support
From: Daniele Palmas @ 2018-06-12 14:12 UTC (permalink / raw)
  To: netdev, Stephen Hemminger; +Cc: Subash Abhinov Kasiviswanathan, Daniele Palmas

This patch adds basic support for Qualcomm rmnet devices.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
---
 ip/Makefile       |  2 +-
 ip/iplink.c       |  2 +-
 ip/iplink_rmnet.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 72 insertions(+), 2 deletions(-)
 create mode 100644 ip/iplink_rmnet.c

diff --git a/ip/Makefile b/ip/Makefile
index 77fadee..a88f936 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -10,7 +10,7 @@ IPOBJ=ip.o ipaddress.o ipaddrlabel.o iproute.o iprule.o ipnetns.o \
     link_iptnl.o link_gre6.o iplink_bond.o iplink_bond_slave.o iplink_hsr.o \
     iplink_bridge.o iplink_bridge_slave.o ipfou.o iplink_ipvlan.o \
     iplink_geneve.o iplink_vrf.o iproute_lwtunnel.o ipmacsec.o ipila.o \
-    ipvrf.o iplink_xstats.o ipseg6.o iplink_netdevsim.o
+    ipvrf.o iplink_xstats.o ipseg6.o iplink_netdevsim.o iplink_rmnet.o
 
 RTMONOBJ=rtmon.o
 
diff --git a/ip/iplink.c b/ip/iplink.c
index e4d4da9..f0f8fb8 100644
--- a/ip/iplink.c
+++ b/ip/iplink.c
@@ -121,7 +121,7 @@ void iplink_usage(void)
 			"          bridge | bond | team | ipoib | ip6tnl | ipip | sit | vxlan |\n"
 			"          gre | gretap | erspan | ip6gre | ip6gretap | ip6erspan |\n"
 			"          vti | nlmon | team_slave | bond_slave | ipvlan | geneve |\n"
-			"          bridge_slave | vrf | macsec | netdevsim }\n");
+			"          bridge_slave | vrf | macsec | netdevsim | rmnet}\n");
 	}
 	exit(-1);
 }
diff --git a/ip/iplink_rmnet.c b/ip/iplink_rmnet.c
new file mode 100644
index 0000000..2367754
--- /dev/null
+++ b/ip/iplink_rmnet.c
@@ -0,0 +1,70 @@
+/*
+ * iplink_rmnet.c	RMNET device support
+ *
+ *              This program is free software; you can redistribute it and/or
+ *              modify it under the terms of the GNU General Public License
+ *              as published by the Free Software Foundation; either version
+ *              2 of the License, or (at your option) any later version.
+ *
+ * Authors:     Daniele Palmas <dnlplm@gmail.com>
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "rt_names.h"
+#include "utils.h"
+#include "ip_common.h"
+
+static void print_explain(FILE *f)
+{
+	fprintf(f,
+		"Usage: ... rmnet mux_id MUXID\n"
+		"\n"
+		"MUXID := 1-127\n"
+	);
+}
+
+static void explain(void)
+{
+	print_explain(stderr);
+}
+
+static int rmnet_parse_opt(struct link_util *lu, int argc, char **argv,
+			   struct nlmsghdr *n)
+{
+	__u16 mux_id;
+
+	while (argc > 0) {
+		if (matches(*argv, "mux_id") == 0) {
+			NEXT_ARG();
+			if (get_u16(&mux_id, *argv, 0))
+				invarg("mux_id is invalid", *argv);
+			addattr_l(n, 1024, IFLA_RMNET_MUX_ID, &mux_id, 2);
+		} else if (matches(*argv, "help") == 0) {
+			explain();
+			return -1;
+		} else {
+			fprintf(stderr, "rmnet: unknown command \"%s\"?\n", *argv);
+			explain();
+			return -1;
+		}
+		argc--, argv++;
+	}
+
+	return 0;
+}
+
+static void rmnet_print_help(struct link_util *lu, int argc, char **argv,
+			     FILE *f)
+{
+	print_explain(f);
+}
+
+struct link_util rmnet_link_util = {
+	.id		= "rmnet",
+	.maxattr	= IFLA_RMNET_MAX,
+	.parse_opt	= rmnet_parse_opt,
+	.print_help	= rmnet_print_help,
+};
-- 
2.7.4

^ permalink raw reply related

* Re: [bpf PATCH v2 1/2] bpf: sockmap, fix crash when ipv6 sock is added
From: John Fastabend @ 2018-06-12 13:57 UTC (permalink / raw)
  To: Daniel Borkmann, edumazet, weiwan, ast; +Cc: netdev
In-Reply-To: <e96c1dcf-c811-3252-d7fb-0aedd8c70921@iogearbox.net>

On 06/11/2018 04:14 PM, Daniel Borkmann wrote:
> Hi John,
> 
> On 06/08/2018 05:06 PM, John Fastabend wrote:
>> This fixes a crash where we assign tcp_prot to IPv6 sockets instead
>> of tcpv6_prot.
>>
>> Previously we overwrote the sk->prot field with tcp_prot even in the
>> AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot
>> are used. Further, only allow ESTABLISHED connections to join the
>> map per note in TLS ULP,
>>
>>    /* The TLS ulp is currently supported only for TCP sockets
>>     * in ESTABLISHED state.
>>     * Supporting sockets in LISTEN state will require us
>>     * to modify the accept implementation to clone rather then
>>     * share the ulp context.
>>     */
>>
>> Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
>> 'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously
>> crashing case here.
>>
>> Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
>> Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com
>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>> Signed-off-by: Wei Wang <weiwan@google.com>
> [...]
> 
> Still one question for some more clarification below that popped up while
> review:
> 
>> @@ -162,6 +164,8 @@ static bool bpf_tcp_stream_read(const struct sock *sk)
>>  }
>>  
>>  static struct proto tcp_bpf_proto;
>> +static struct proto tcpv6_bpf_proto;
> 
> These two are global, w/o locking.
> 
>>  static int bpf_tcp_init(struct sock *sk)
>>  {
>>  	struct smap_psock *psock;
>> @@ -181,14 +185,30 @@ static int bpf_tcp_init(struct sock *sk)
>>  	psock->save_close = sk->sk_prot->close;
>>  	psock->sk_proto = sk->sk_prot;
>>  
>> +	if (sk->sk_family == AF_INET6) {
>> +		tcpv6_bpf_proto = *sk->sk_prot;
>> +		tcpv6_bpf_proto.close = bpf_tcp_close;
>> +	} else {
>> +		tcp_bpf_proto = *sk->sk_prot;
>> +		tcp_bpf_proto.close = bpf_tcp_close;
>> +	}
> 
> And each time we add a BPF ULP to a v4/v6 socket, we override tcp{,v6}_bpf_proto
> from scratch.
> 
>>  	if (psock->bpf_tx_msg) {
>> +		tcpv6_bpf_proto.sendmsg = bpf_tcp_sendmsg;
>> +		tcpv6_bpf_proto.sendpage = bpf_tcp_sendpage;
>> +		tcpv6_bpf_proto.recvmsg = bpf_tcp_recvmsg;
>> +		tcpv6_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
>>  		tcp_bpf_proto.sendmsg = bpf_tcp_sendmsg;
>>  		tcp_bpf_proto.sendpage = bpf_tcp_sendpage;
>>  		tcp_bpf_proto.recvmsg = bpf_tcp_recvmsg;
>>  		tcp_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
>>  	}
>>  
>> -	sk->sk_prot = &tcp_bpf_proto;
>> +	if (sk->sk_family == AF_INET6)
>> +		sk->sk_prot = &tcpv6_bpf_proto;
>> +	else
>> +		sk->sk_prot = &tcp_bpf_proto;
> 
> Where every active socket would be affected from it as well. Isn't that
> generally racy? E.g. existing ones where tcpv6_bpf_proto.sendmsg points
> to bpf_tcp_sendmsg would get overridden with earlier assignment on the
> tcpv6_bpf_proto = *sk->sk_prot during their lifetime after bpf_tcp_init().
> 

In general yes. At best it does feel fragile.

> In the kTLS case, the v4 protos are built up in module init via tls_register()
> and never change from there. The v6 ones are only reloaded when their addr
> changes e.g. module reload would come to mind, which should only be possible
> once no active v6 socket is present. What speaks against adapting similar
> scheme resp. what am I missing that the above would work? (Would be nice if
> there was some discussion in commit log related to it on 'why' this approach
> was done differently.)

I think its best to use the same scheme. Will post a new version. Also
would be nice to fix the selftests in the same series. Finally, I set
these pointers lazily adding a sendmsg hook for example even if it not
needed. Its harmless but does create an extra call through bpf for
no reason on some socks. To be complete we should avoid that.

> 
> Thanks,
> Daniel
> 
>>  	rcu_read_unlock();
>>  	return 0;
>>  }
>> @@ -1111,8 +1131,6 @@ static void bpf_tcp_msg_add(struct smap_psock *psock,
>>  
>>  static int bpf_tcp_ulp_register(void)
>>  {
>> -	tcp_bpf_proto = tcp_prot;
>> -	tcp_bpf_proto.close = bpf_tcp_close;
>>  	/* Once BPF TX ULP is registered it is never unregistered. It
>>  	 * will be in the ULP list for the lifetime of the system. Doing
>>  	 * duplicate registers is not a problem.
>>
> 

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox