* [PATCH v3] tcp: fix wrong checksum calculation on MTU probing
From: Douglas Caetano dos Santos @ 2016-09-22 18:52 UTC (permalink / raw)
To: Sergei Shtylyov, David Miller; +Cc: kuznet, jmorris, yoshfuji, kaber, netdev
In-Reply-To: <5ffc1020-a36f-75b4-ff25-4f58c243f803@cogentembedded.com>
With TCP MTU probing enabled and offload TX checksumming disabled,
tcp_mtu_probe() calculated the wrong checksum when a fragment being copied
into the probe's SKB had an odd length. This was caused by the direct use
of skb_copy_and_csum_bits() to calculate the checksum, as it pads the
fragment being copied, if needed. When this fragment was not the last, a
subsequent call used the previous checksum without considering this
padding.
The effect was a connection stalled in one direction: even retransmissions
could not fix the problem, because the checksum was never recalculated for
the full SKB length.
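For illustration, a minimal userspace sketch of the effect described above. It models the checksum as a folded 16-bit one's-complement sum over big-endian words; the helper names (csum_bytes, csum_add_naive, csum_add_at_offset) are hypothetical and are not the kernel's skb_copy_and_csum_bits() or csum_block_add(). It only shows why a fragment that starts at an odd offset needs its partial sum byte-swapped before it is added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold an accumulator into 16 bits using end-around carry. */
static uint16_t csum_fold16(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * One's-complement sum of buf, read as big-endian 16-bit words.
 * An odd trailing byte is padded with a zero low byte; this is the
 * padding the description above refers to.
 */
static uint16_t csum_bytes(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)
		sum += (uint32_t)buf[i] << 8;
	return csum_fold16(sum);
}

/* Naive chaining: adds the fragment sum without looking at its offset. */
static uint16_t csum_add_naive(uint16_t total, uint16_t frag)
{
	return csum_fold16((uint64_t)total + frag);
}

/*
 * Offset-aware chaining: a fragment that starts at an odd offset has its
 * bytes in the opposite halves of each word, so swap the sum's bytes first.
 */
static uint16_t csum_add_at_offset(uint16_t total, uint16_t frag, size_t offset)
{
	if (offset & 1)
		frag = (uint16_t)((frag << 8) | (frag >> 8));
	return csum_fold16((uint64_t)total + frag);
}
```

With a 5-byte buffer split after 3 bytes, csum_add_at_offset() reproduces the sum of the whole buffer while csum_add_naive() does not, which mirrors the odd-length case the patch addresses.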
Signed-off-by: Douglas Caetano dos Santos <douglascs@taghos.com.br>
---
net/ipv4/tcp_output.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index f53d0cc..2d32952 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1966,12 +1966,14 @@ static int tcp_mtu_probe(struct sock *sk)
len = 0;
tcp_for_write_queue_from_safe(skb, next, sk) {
copy = min_t(int, skb->len, probe_size - len);
- if (nskb->ip_summed)
+ if (nskb->ip_summed) {
skb_copy_bits(skb, 0, skb_put(nskb, copy), copy);
- else
- nskb->csum = skb_copy_and_csum_bits(skb, 0,
- skb_put(nskb, copy),
- copy, nskb->csum);
+ } else {
+ __wsum csum = skb_copy_and_csum_bits(skb, 0,
+ skb_put(nskb, copy),
+ copy, 0);
+ nskb->csum = csum_block_add(nskb->csum, csum, len);
+ }
if (skb->len <= copy) {
/* We've eaten all the data from this skb.
--
2.5.0
* Re: [PATCH net-next 4/4] net/sched: act_mirred: Implement ingress actions
From: Eric Dumazet @ 2016-09-22 18:42 UTC (permalink / raw)
To: Shmulik Ladkani
Cc: David S. Miller, Jamal Hadi Salim, WANG Cong, Eric Dumazet,
netdev
In-Reply-To: <20160922212745.789734c5@halley>
On Thu, 2016-09-22 at 21:27 +0300, Shmulik Ladkani wrote:
> On Thu, 22 Sep 2016 07:54:13 -0700 Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > Hmm... we probably need to apply the full rcu protection before this
> > patch.
> >
> > https://patchwork.ozlabs.org/patch/667680/
>
> Are you referring to order of application into net-next?
>
> This patch seems to present no new tcf_mirred_params members nor
> need-to-be-protected code regions (please, correct me if wrong).
> So it does not _depend_ on the 'full rcu fixes', does it?
No, simply a reminder that we run lockless there, so you might need to
read some control variables once, and in a consistent way.
(Or a concurrent writer could change params in the middle of the
function)
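A rough userspace analogy of "read once, in a consistent way", using C11 atomics rather than the kernel's RCU primitives. The struct and function names are hypothetical and do not correspond to tcf_mirred_params; memory reclamation of the old block, which RCU would handle, is deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical parameter block; not the real act_mirred parameters. */
struct params_sketch {
	int action;
	int ifindex;
};

struct action_state_sketch {
	_Atomic(struct params_sketch *) cur;	/* a writer swaps in a new block */
};

/*
 * Reader: load the pointer exactly once and take every value from that one
 * snapshot, so a concurrent writer cannot change parameters halfway through.
 */
static struct params_sketch reader_snapshot(struct action_state_sketch *st)
{
	struct params_sketch *p =
		atomic_load_explicit(&st->cur, memory_order_acquire);

	assert(p != NULL);
	return *p;
}

/* Writer: publish a fully initialised replacement block in one store. */
static void writer_publish(struct action_state_sketch *st,
			   struct params_sketch *next)
{
	atomic_store_explicit(&st->cur, next, memory_order_release);
}
```

The point of the sketch is the reader: re-reading st->cur for each field could mix values from two different parameter blocks.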
* Re: [PATCH v3 2/2] netfilter: Create revision 2 of xt_hashlimit to support higher pps rates
From: Vishwanath Pai @ 2016-09-22 18:39 UTC (permalink / raw)
To: Jan Engelhardt, pablo
Cc: kaber, kadlec, johunt, netfilter-devel, coreteam, netdev,
pai.vishwain
In-Reply-To: <alpine.LSU.2.20.1609221850530.27197@nerf40.vanv.qr>
Thanks for pointing this out, I will reorder the fields to:
struct hashlimit_cfg2 {
__u64 avg; /* Average secs between packets * scale */
__u64 burst;
__u32 mode; /* bitmask of XT_HASHLIMIT_HASH_* */
This should fix the hole and avoid padding.
-Vishwanath
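A small sketch for checking the layout, assuming only the three fields quoted above. The struct names are hypothetical stand-ins, and uint32_t/uint64_t replace __u32/__u64. Whether the original order has a hole depends on the ABI's alignment of 64-bit integers (typically 4 bytes of padding on x86_64 and none on i386), and trailing padding after the last field may still differ between ABIs depending on what follows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Original field order from the patch under review. */
struct cfg_mode_first {
	uint32_t mode;
	uint64_t avg;
	uint64_t burst;
};

/* Proposed order: 64-bit fields first. */
struct cfg_mode_last {
	uint64_t avg;
	uint64_t burst;
	uint32_t mode;
};

/* The reordered layout places mode directly after the two 64-bit fields. */
static_assert(offsetof(struct cfg_mode_last, mode) == 16,
	      "mode must follow the two 64-bit fields without a gap");

/* Padding bytes the compiler inserts between mode and avg (ABI dependent). */
static size_t hole_after_mode(void)
{
	return offsetof(struct cfg_mode_first, avg) - sizeof(uint32_t);
}

/* Total padding between adjacent fields in the reordered layout. */
static size_t holes_in_reordered(void)
{
	size_t gap1 = offsetof(struct cfg_mode_last, burst) - sizeof(uint64_t);
	size_t gap2 = offsetof(struct cfg_mode_last, mode) - 2 * sizeof(uint64_t);

	return gap1 + gap2;
}
```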
On 09/22/2016 12:53 PM, Jan Engelhardt wrote:
> On Thursday 2016-09-22 18:43, Vishwanath Pai wrote:
>> >+struct hashlimit_cfg2 {
>> >+ __u32 mode; /* bitmask of XT_HASHLIMIT_HASH_* */
>> >+ __u64 avg; /* Average secs between packets * scale */
>> >+ __u64 burst; /* Period multiplier for upper limit. */
> This would have different sizes between i386 and x86_64,
> necessitating additional compat functions. It should be padded
> or reordered instead.
>
* [PATCH v1] bpf: Set register type according to is_valid_access()
From: Mickaël Salaün @ 2016-09-22 18:35 UTC (permalink / raw)
To: linux-kernel
Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
Daniel Borkmann, Kees Cook, Sargun Dhillon, Tejun Heo, netdev
This fixes a pointer leak when an unprivileged eBPF program reads a pointer
value from the context. Even if is_valid_access() returns a pointer
type, the eBPF verifier replaces it with UNKNOWN_VALUE. The register
value containing an address is then allowed to leak. Moreover, this
prevented unprivileged eBPF programs from using functions with (legitimate)
pointer arguments.
This bug is not an issue for now because the only unprivileged eBPF
program allowed is of type BPF_PROG_TYPE_SOCKET_FILTER and all the types
from its context are UNKNOWN_VALUE. However, this fix is important for
future unprivileged eBPF program types which could use pointers in their
context.
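As a reading aid, a toy model of the behaviour change described above, reduced to the single decision the diff below touches. The enum and function names are hypothetical; only the control flow is taken from the patch, and nothing else about the verifier is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced, hypothetical subset of the verifier's register types. */
enum reg_type_sketch {
	SK_UNKNOWN_VALUE,
	SK_PTR_TO_PACKET,
};

/*
 * Before the patch: the register is first marked as an unknown value, and
 * the type reported by is_valid_access() is kept only if pointer leaks are
 * allowed.
 */
static enum reg_type_sketch ctx_load_type_old(enum reg_type_sketch reported,
					      bool allow_ptr_leaks)
{
	enum reg_type_sketch type = SK_UNKNOWN_VALUE;

	assert(reported == SK_UNKNOWN_VALUE || reported == SK_PTR_TO_PACKET);
	if (allow_ptr_leaks)
		type = reported;
	return type;
}

/* After the patch: the reported type is always kept. */
static enum reg_type_sketch ctx_load_type_new(enum reg_type_sketch reported)
{
	assert(reported == SK_UNKNOWN_VALUE || reported == SK_PTR_TO_PACKET);
	return reported;
}
```

In this model the two versions differ only for an unprivileged load of a pointer-typed context field, which is the case the description is concerned with.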
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Fixes: 969bf05eb3ce ("bpf: direct packet access")
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Sargun Dhillon <sargun@sargun.me>
---
kernel/bpf/verifier.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index daea765d72e6..0698ccd67715 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -794,10 +794,8 @@ static int check_mem_access(struct verifier_env *env, u32 regno, int off,
}
err = check_ctx_access(env, off, size, t, &reg_type);
if (!err && t == BPF_READ && value_regno >= 0) {
- mark_reg_unknown_value(state->regs, value_regno);
- if (env->allow_ptr_leaks)
- /* note that reg.[id|off|range] == 0 */
- state->regs[value_regno].type = reg_type;
+ /* note that reg.[id|off|range] == 0 */
+ state->regs[value_regno].type = reg_type;
}
} else if (reg->type == FRAME_PTR || reg->type == PTR_TO_STACK) {
--
2.9.3
* Re: [PATCH net-next 4/4] net/sched: act_mirred: Implement ingress actions
From: Shmulik Ladkani @ 2016-09-22 18:27 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S. Miller, Jamal Hadi Salim, WANG Cong, Eric Dumazet,
netdev, Shmulik Ladkani
In-Reply-To: <1474556053.23058.111.camel@edumazet-glaptop3.roam.corp.google.com>
On Thu, 22 Sep 2016 07:54:13 -0700 Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Hmm... we probably need to apply the full rcu protection before this
> patch.
>
> https://patchwork.ozlabs.org/patch/667680/
Are you referring to order of application into net-next?
This patch seems to present no new tcf_mirred_params members nor
need-to-be-protected code regions (please, correct me if wrong).
So it does not _depend_ on the 'full rcu fixes', does it?
Thanks,
Shmulik
* Re: XDP (eXpress Data Path) documentation
From: Jesper Dangaard Brouer via iovisor-dev @ 2016-09-22 18:16 UTC (permalink / raw)
To: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
iovisor-dev-9jONkmmOlFHEE9lA1F8Ukti2O/JbrIOy@public.gmane.org
Cc: Nathan Willis, Alexei Starovoitov, Tom Herbert, Jonathan Corbet,
linux-doc-u79uwXL29TY76Z2rM5mHXA, Saeed Mahameed, Edward Cree
In-Reply-To: <20160920110844.661965be-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
On Tue, 20 Sep 2016 11:08:44 +0200 Jesper Dangaard Brouer <brouer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> As promised, I've started documenting XDP (eXpress Data Path):
>
> [1] https://prototype-kernel.readthedocs.io/en/latest/networking/XDP/index.html
>
> IMHO the documentation has reached a stage where it is useful for the
> XDP project, BUT I request collaboration on improving the documentation
> from all. (Native English speakers are encouraged to send grammar fixes ;-))
I want to publicly thank Edward Cree for being the first contributor
to the XDP documentation with formulation and grammar fixes.
Pulled and pushed:
https://github.com/netoptimizer/prototype-kernel/commit/fb6a3de95
Thanks!
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
* [PATCH v2 iproute2 net-next] tc: m_vlan: Add vlan modify action
From: Shmulik Ladkani @ 2016-09-22 18:01 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev, Jamal Hadi Salim, Jiri Pirko, Shmulik Ladkani
The 'vlan modify' action allows replacing an existing 802.1q tag
according to user-provided settings.
It accepts the same arguments as the 'vlan push' action.
For example, this replaces vid 6 with vid 5:
# tc filter add dev veth0 parent ffff: pref 1 protocol 802.1q \
basic match 'meta(vlan mask 0xfff eq 6)' \
action vlan modify id 5 continue
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
---
v2: Coding. No need to encapsulate action_names[] access in a function
include/linux/tc_act/tc_vlan.h | 1 +
man/man8/tc-vlan.8 | 25 +++++++++++++++++++------
tc/m_vlan.c | 40 +++++++++++++++++++++++++++++++---------
3 files changed, 51 insertions(+), 15 deletions(-)
diff --git a/include/linux/tc_act/tc_vlan.h b/include/linux/tc_act/tc_vlan.h
index be72b6e384..bddb272b84 100644
--- a/include/linux/tc_act/tc_vlan.h
+++ b/include/linux/tc_act/tc_vlan.h
@@ -16,6 +16,7 @@
#define TCA_VLAN_ACT_POP 1
#define TCA_VLAN_ACT_PUSH 2
+#define TCA_VLAN_ACT_MODIFY 3
struct tc_vlan {
tc_gen;
diff --git a/man/man8/tc-vlan.8 b/man/man8/tc-vlan.8
index 4d0c5c8a15..af3de1c54e 100644
--- a/man/man8/tc-vlan.8
+++ b/man/man8/tc-vlan.8
@@ -6,7 +6,7 @@ vlan - vlan manipulation module
.in +8
.ti -8
.BR tc " ... " "action vlan" " { " pop " |"
-.IR PUSH " } [ " CONTROL " ]"
+.IR PUSH " | " MODIFY " } [ " CONTROL " ]"
.ti -8
.IR PUSH " := "
@@ -17,22 +17,30 @@ vlan - vlan manipulation module
.BI id " VLANID"
.ti -8
+.IR MODIFY " := "
+.BR modify " [ " protocol
+.IR VLANPROTO " ]"
+.BR " [ " priority
+.IR VLANPRIO " ] "
+.BI id " VLANID"
+
+.ti -8
.IR CONTROL " := { "
.BR reclassify " | " pipe " | " drop " | " continue " | " pass " }"
.SH DESCRIPTION
The
.B vlan
action allows to perform 802.1Q en- or decapsulation on a packet, reflected by
-the two operation modes
-.IR POP " and " PUSH .
+the operation modes
+.IR POP ", " PUSH " and " MODIFY .
The
.I POP
mode is simple, as no further information is required to just drop the
outer-most VLAN encapsulation. The
-.I PUSH
-mode on the other hand requires at least a
+.IR PUSH " and " MODIFY
+modes require at least a
.I VLANID
-and allows to optionally choose the
+and allow to optionally choose the
.I VLANPROTO
to use.
.SH OPTIONS
@@ -45,6 +53,11 @@ Encapsulation mode. Requires at least
.B id
option.
.TP
+.B modify
+Replace mode. Existing 802.1Q tag is replaced. Requires at least
+.B id
+option.
+.TP
.BI id " VLANID"
Specify the VLAN ID to encapsulate into.
.I VLANID
diff --git a/tc/m_vlan.c b/tc/m_vlan.c
index 05a63b48f1..b32f746015 100644
--- a/tc/m_vlan.c
+++ b/tc/m_vlan.c
@@ -19,10 +19,17 @@
#include "tc_util.h"
#include <linux/tc_act/tc_vlan.h>
+static const char * const action_names[] = {
+ [TCA_VLAN_ACT_POP] = "pop",
+ [TCA_VLAN_ACT_PUSH] = "push",
+ [TCA_VLAN_ACT_MODIFY] = "modify",
+};
+
static void explain(void)
{
fprintf(stderr, "Usage: vlan pop\n");
fprintf(stderr, " vlan push [ protocol VLANPROTO ] id VLANID [ priority VLANPRIO ] [CONTROL]\n");
+ fprintf(stderr, " vlan modify [ protocol VLANPROTO ] id VLANID [ priority VLANPRIO ] [CONTROL]\n");
fprintf(stderr, " VLANPROTO is one of 802.1Q or 802.1AD\n");
fprintf(stderr, " with default: 802.1Q\n");
fprintf(stderr, " CONTROL := reclassify | pipe | drop | continue | pass\n");
@@ -34,6 +41,11 @@ static void usage(void)
exit(-1);
}
+static bool has_push_attribs(int action)
+{
+ return action == TCA_VLAN_ACT_PUSH || action == TCA_VLAN_ACT_MODIFY;
+}
+
static int parse_vlan(struct action_util *a, int *argc_p, char ***argv_p,
int tca_id, struct nlmsghdr *n)
{
@@ -71,9 +83,17 @@ static int parse_vlan(struct action_util *a, int *argc_p, char ***argv_p,
return -1;
}
action = TCA_VLAN_ACT_PUSH;
+ } else if (matches(*argv, "modify") == 0) {
+ if (action) {
+ fprintf(stderr, "unexpected \"%s\" - action already specified\n",
+ *argv);
+ explain();
+ return -1;
+ }
+ action = TCA_VLAN_ACT_MODIFY;
} else if (matches(*argv, "id") == 0) {
- if (action != TCA_VLAN_ACT_PUSH) {
- fprintf(stderr, "\"%s\" is only valid for push\n",
+ if (!has_push_attribs(action)) {
+ fprintf(stderr, "\"%s\" is only valid for push/modify\n",
*argv);
explain();
return -1;
@@ -83,8 +103,8 @@ static int parse_vlan(struct action_util *a, int *argc_p, char ***argv_p,
invarg("id is invalid", *argv);
id_set = 1;
} else if (matches(*argv, "protocol") == 0) {
- if (action != TCA_VLAN_ACT_PUSH) {
- fprintf(stderr, "\"%s\" is only valid for push\n",
+ if (!has_push_attribs(action)) {
+ fprintf(stderr, "\"%s\" is only valid for push/modify\n",
*argv);
explain();
return -1;
@@ -94,8 +114,8 @@ static int parse_vlan(struct action_util *a, int *argc_p, char ***argv_p,
invarg("protocol is invalid", *argv);
proto_set = 1;
} else if (matches(*argv, "priority") == 0) {
- if (action != TCA_VLAN_ACT_PUSH) {
- fprintf(stderr, "\"%s\" is only valid for push\n",
+ if (!has_push_attribs(action)) {
+ fprintf(stderr, "\"%s\" is only valid for push/modify\n",
*argv);
explain();
return -1;
@@ -129,8 +149,9 @@ static int parse_vlan(struct action_util *a, int *argc_p, char ***argv_p,
}
}
- if (action == TCA_VLAN_ACT_PUSH && !id_set) {
- fprintf(stderr, "id needs to be set for push\n");
+ if (has_push_attribs(action) && !id_set) {
+ fprintf(stderr, "id needs to be set for %s\n",
+ action_names[action]);
explain();
return -1;
}
@@ -186,7 +207,8 @@ static int print_vlan(struct action_util *au, FILE *f, struct rtattr *arg)
fprintf(f, " pop");
break;
case TCA_VLAN_ACT_PUSH:
- fprintf(f, " push");
+ case TCA_VLAN_ACT_MODIFY:
+ fprintf(f, " %s", action_names[parm->v_action]);
if (tb[TCA_VLAN_PUSH_VLAN_ID]) {
val = rta_getattr_u16(tb[TCA_VLAN_PUSH_VLAN_ID]);
fprintf(f, " id %u", val);
--
2.7.4
* Re: [PATCH v2] fs/select: add vmalloc fallback for select(2)
From: Vlastimil Babka @ 2016-09-22 17:55 UTC (permalink / raw)
To: Eric Dumazet
Cc: Alexander Viro, Andrew Morton, linux-fsdevel, linux-kernel,
linux-mm, Michal Hocko, netdev, Linux API, linux-man
In-Reply-To: <1474564068.23058.144.camel@edumazet-glaptop3.roam.corp.google.com>
On 09/22/2016 07:07 PM, Eric Dumazet wrote:
> On Thu, 2016-09-22 at 18:56 +0200, Vlastimil Babka wrote:
>> On 09/22/2016 06:49 PM, Eric Dumazet wrote:
>> > On Thu, 2016-09-22 at 18:43 +0200, Vlastimil Babka wrote:
>> >> The select(2) syscall performs a kmalloc(size, GFP_KERNEL) where size grows
>> >> with the number of fds passed. We had a customer report page allocation
>> >> failures of order-4 for this allocation. This is a costly order, so it might
>> >> easily fail, as the VM expects such allocation to have a lower-order fallback.
>> >>
>> >> Such trivial fallback is vmalloc(), as the memory doesn't have to be
>> >> physically contiguous. Also the allocation is temporary for the duration of the
>> >> syscall, so it's unlikely to stress vmalloc too much.
>> >
>> > vmalloc() uses a vmap_area_lock spinlock, and TLB flushes.
>> >
>> > So I guess allowing vmalloc() to be called from an innocent application
>> > doing a select() might be dangerous, especially if this select() happens
>> > thousands of times per second.
>>
>> Isn't seq_buf_alloc() similarly exposed? And ipc_alloc()?
>
> Possibly.
>
> We don't have a library function (attempting kmalloc(), falling back to
> vmalloc()), presumably to avoid abuses, but I guess some patches were
> accepted without thinking about this.
So in the case of select() it seems like we need 6 bits of memory per file
descriptor, multiplied by the highest possible file descriptor (nfds) as passed
to the syscall. According to the man page of select:
EINVAL nfds is negative or exceeds the RLIMIT_NOFILE resource limit (see
getrlimit(2)).
The code actually seems to silently cap the value instead of returning EINVAL
though? (IIUC):
/* max_fds can increase, so grab it once to avoid race */
rcu_read_lock();
fdt = files_fdtable(current->files);
max_fds = fdt->max_fds;
rcu_read_unlock();
if (n > max_fds)
n = max_fds;
The default for this cap seems to be 1024 where I checked (again, IIUC, it's
what ulimit -n returns?). I wasn't able to change it to more than 2048, which
makes the bitmaps still below PAGE_SIZE.
So if I get that right, the system admin would have to allow really large
RLIMIT_NOFILE to even make vmalloc() possible here. So I don't see it as a large
concern?
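The arithmetic behind this can be sketched as follows, assuming six bitmaps of nfds bits each, every bitmap rounded up to whole unsigned longs, and a 4096-byte page. The function names are hypothetical, and the small on-stack buffer that select() can use for small sizes is not modelled.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bytes needed for six bitmaps of nfds bits, each rounded up to longs. */
static size_t select_bitmap_bytes(size_t nfds)
{
	const size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
	size_t longs = (nfds + bits_per_long - 1) / bits_per_long;

	return 6 * longs * sizeof(unsigned long);
}

/* Smallest page order (power-of-two page count) that can hold bytes. */
static unsigned int alloc_order(size_t bytes, size_t page_size)
{
	unsigned int order = 0;

	assert(page_size > 0);
	while ((page_size << order) < bytes)
		order++;
	return order;
}
```

Under these assumptions nfds of 1024 and 2048 need 768 and 1536 bytes, both well under one page, while an order-4 allocation is only reached for nfds in the tens of thousands (for example 65536, which needs 49152 bytes).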
Vlastimil
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [PATCH][V2] cxgb4: fix signed wrap around when decrementing index idx
From: Colin King @ 2016-09-22 17:48 UTC (permalink / raw)
To: Hariprasad S, netdev; +Cc: linux-kernel
From: Colin Ian King <colin.king@canonical.com>
Change predecrement compare to post decrement compare to avoid an
unsigned integer wrap-around comparison when decrementing idx in
the while loop.
For example, when idx is zero, the current situation will
predecrement idx in the while loop, wrapping idx to the maximum
signed integer and cause out of bounds reads on rxq_info->msix_tbl[idx].
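A standalone sketch of the two loop forms, assuming idx is an unsigned int as the "unsigned integer wrap-around" wording suggests. The function names are hypothetical, and the array stands in for the per-queue unwind work.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Post-decrement unwind: visits idx-1, idx-2, ..., 0 and then stops.
 * Records the visited indices in out and returns how many were visited.
 */
static size_t unwind_visits(unsigned int idx, unsigned int *out, size_t cap)
{
	size_t n = 0;

	while (idx-- > 0) {
		assert(n < cap);
		out[n++] = idx;
	}
	return n;
}

/*
 * First index the pre-decrement form would use when idx starts at zero.
 * Unsigned arithmetic wraps modulo 2^N, so 0 becomes UINT_MAX, and a
 * ">= 0" test on an unsigned value can never be false.
 */
static unsigned int first_index_predecrement(unsigned int idx)
{
	return --idx;
}
```

With idx equal to 0, the post-decrement loop body never runs, whereas the pre-decrement form would start by indexing with UINT_MAX.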
Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c
index d12a73e..42a9c8d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c
@@ -367,7 +367,7 @@ int request_msix_queue_irqs_uld(struct adapter *adap, unsigned int uld_type)
}
return 0;
unwind:
- while (--idx >= 0) {
+ while (idx-- > 0) {
bmap_idx = rxq_info->msix_tbl[idx];
free_msix_idx_in_bmap(adap, bmap_idx);
free_irq(adap->msix_info_ulds[bmap_idx].vec,
--
2.9.3
* Re: [PATCH iproute2 net-next] tc: m_vlan: Add vlan modify action
From: Shmulik Ladkani @ 2016-09-22 17:44 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev, Jamal Hadi Salim, Jiri Pirko, Shmulik Ladkani
In-Reply-To: <20160922090504.3eaa5266@xeon-e3>
On Thu, 22 Sep 2016 09:05:04 -0700 Stephen Hemminger <stephen@networkplumber.org> wrote:
> On Thu, 22 Sep 2016 12:31:10 +0300
> Shmulik Ladkani <shmulik.ladkani@ravellosystems.com> wrote:
>
> > +
> > +static const char *action_name(int action)
> > +{
> > + static const char * const names[] = {
> > + [TCA_VLAN_ACT_POP] = "pop",
> > + [TCA_VLAN_ACT_PUSH] = "push",
> > + [TCA_VLAN_ACT_MODIFY] = "modify",
> > + };
> > + return names[action];
> > +}
> > +
>
> Why are you wrapping a simple array lookup in a function?
No reason in particular, was probably code evolution, will amend, thanks.
* Re: [PATCH 2/2] net: thunderx: Support for byte queue limits
From: Florian Fainelli @ 2016-09-22 17:41 UTC (permalink / raw)
To: sunil.kovvuri, netdev; +Cc: Sunil Goutham, linux-kernel, linux-arm-kernel
In-Reply-To: <1474535121-13958-3-git-send-email-sunil.kovvuri@gmail.com>
On 09/22/2016 02:05 AM, sunil.kovvuri@gmail.com wrote:
> From: Sunil Goutham <sgoutham@cavium.com>
>
> This patch adds support for byte queue limits
>
> Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Where is the code that calls netdev_tx_reset_queue()? This is needed in
the function that brings down the interface. Did your tests survive an
up/down/up sequence?
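To make the concern concrete, a toy model of per-queue byte accounting. It is not the kernel's dynamic queue limits implementation; the struct and helpers are hypothetical and only mirror the roles of netdev_tx_sent_queue(), netdev_tx_completed_queue() and netdev_tx_reset_queue().

```c
#include <assert.h>
#include <stddef.h>

/* Toy per-queue accounting; not the kernel's struct dql. */
struct bql_sketch {
	size_t queued;		/* bytes reported at transmit time */
	size_t completed;	/* bytes reported at completion time */
};

/* Role of netdev_tx_sent_queue(): account bytes handed to hardware. */
static void sketch_sent(struct bql_sketch *q, size_t bytes)
{
	q->queued += bytes;
}

/* Role of netdev_tx_completed_queue(): account bytes the hardware finished. */
static void sketch_completed(struct bql_sketch *q, size_t bytes)
{
	assert(q->completed + bytes <= q->queued);
	q->completed += bytes;
}

/* Role of netdev_tx_reset_queue(): forget everything when the queue stops. */
static void sketch_reset(struct bql_sketch *q)
{
	q->queued = 0;
	q->completed = 0;
}

/* Bytes the accounting still believes are in flight. */
static size_t sketch_inflight(const struct bql_sketch *q)
{
	return q->queued - q->completed;
}
```

In this model, if the interface goes down while a packet is pending and that packet is freed without a completion being reported, sketch_inflight() stays non-zero until sketch_reset() is called, which is the situation an up/down/up test would exercise.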
> ---
> drivers/net/ethernet/cavium/thunder/nicvf_main.c | 11 ++++++--
> drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 30 ++++++++++++++--------
> 2 files changed, 29 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
> index 7d00162..453e3a0 100644
> --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
> +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
> @@ -516,7 +516,8 @@ static int nicvf_init_resources(struct nicvf *nic)
> static void nicvf_snd_pkt_handler(struct net_device *netdev,
> struct cmp_queue *cq,
> struct cqe_send_t *cqe_tx,
> - int cqe_type, int budget)
> + int cqe_type, int budget,
> + unsigned int *tx_pkts, unsigned int *tx_bytes)
> {
> struct sk_buff *skb = NULL;
> struct nicvf *nic = netdev_priv(netdev);
> @@ -547,6 +548,8 @@ static void nicvf_snd_pkt_handler(struct net_device *netdev,
> }
> nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
> prefetch(skb);
> + (*tx_pkts)++;
> + *tx_bytes += skb->len;
> napi_consume_skb(skb, budget);
> sq->skbuff[cqe_tx->sqe_ptr] = (u64)NULL;
> } else {
> @@ -662,6 +665,7 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
> struct cmp_queue *cq = &qs->cq[cq_idx];
> struct cqe_rx_t *cq_desc;
> struct netdev_queue *txq;
> + unsigned int tx_pkts = 0, tx_bytes = 0;
>
> spin_lock_bh(&cq->lock);
> loop:
> @@ -701,7 +705,7 @@ loop:
> case CQE_TYPE_SEND:
> nicvf_snd_pkt_handler(netdev, cq,
> (void *)cq_desc, CQE_TYPE_SEND,
> - budget);
> + budget, &tx_pkts, &tx_bytes);
> tx_done++;
> break;
> case CQE_TYPE_INVALID:
> @@ -730,6 +734,9 @@ done:
> netdev = nic->pnicvf->netdev;
> txq = netdev_get_tx_queue(netdev,
> nicvf_netdev_qidx(nic, cq_idx));
> + if (tx_pkts)
> + netdev_tx_completed_queue(txq, tx_pkts, tx_bytes);
> +
> nic = nic->pnicvf;
> if (netif_tx_queue_stopped(txq) && netif_carrier_ok(netdev)) {
> netif_tx_start_queue(txq);
> diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
> index 178c5c7..a4fc501 100644
> --- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
> +++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
> @@ -1082,6 +1082,24 @@ static inline void nicvf_sq_add_cqe_subdesc(struct snd_queue *sq, int qentry,
> imm->len = 1;
> }
>
> +static inline void nicvf_sq_doorbell(struct nicvf *nic, struct sk_buff *skb,
> + int sq_num, int desc_cnt)
> +{
> + struct netdev_queue *txq;
> +
> + txq = netdev_get_tx_queue(nic->pnicvf->netdev,
> + skb_get_queue_mapping(skb));
> +
> + netdev_tx_sent_queue(txq, skb->len);
> +
> + /* make sure all memory stores are done before ringing doorbell */
> + smp_wmb();
> +
> + /* Inform HW to xmit all TSO segments */
> + nicvf_queue_reg_write(nic, NIC_QSET_SQ_0_7_DOOR,
> + sq_num, desc_cnt);
> +}
> +
> /* Segment a TSO packet into 'gso_size' segments and append
> * them to SQ for transfer
> */
> @@ -1141,12 +1159,8 @@ static int nicvf_sq_append_tso(struct nicvf *nic, struct snd_queue *sq,
> /* Save SKB in the last segment for freeing */
> sq->skbuff[hdr_qentry] = (u64)skb;
>
> - /* make sure all memory stores are done before ringing doorbell */
> - smp_wmb();
> + nicvf_sq_doorbell(nic, skb, sq_num, desc_cnt);
>
> - /* Inform HW to xmit all TSO segments */
> - nicvf_queue_reg_write(nic, NIC_QSET_SQ_0_7_DOOR,
> - sq_num, desc_cnt);
> nic->drv_stats.tx_tso++;
> return 1;
> }
> @@ -1219,12 +1233,8 @@ doorbell:
> nicvf_sq_add_cqe_subdesc(sq, qentry, tso_sqe, skb);
> }
>
> - /* make sure all memory stores are done before ringing doorbell */
> - smp_wmb();
> + nicvf_sq_doorbell(nic, skb, sq_num, subdesc_cnt);
>
> - /* Inform HW to xmit new packet */
> - nicvf_queue_reg_write(nic, NIC_QSET_SQ_0_7_DOOR,
> - sq_num, subdesc_cnt);
> return 1;
>
> append_fail:
>
--
Florian
* Fwd: Re: nfs broken on Fedora-24, 32-bit?
From: Ben Greear @ 2016-09-22 17:35 UTC (permalink / raw)
To: netdev
In-Reply-To: <08ad07ee-72e2-b478-d70d-87b08c24452f@candelatech.com>
This is probably not an NFS specific issue, though I guess possibly it is.
Forwarding to netdev in case someone wants to take a look at it.
Thanks,
Ben
-------- Forwarded Message --------
Subject: Re: nfs broken on Fedora-24, 32-bit?
Date: Fri, 16 Sep 2016 16:31:51 -0700
From: Ben Greear <greearb@candelatech.com>
Organization: Candela Technologies
To: Trond Myklebust <trondmy@primarydata.com>
CC: List Linux NFS Mailing <linux-nfs@vger.kernel.org>
On 09/15/2016 04:06 PM, Ben Greear wrote:
> On 09/15/2016 04:00 PM, Trond Myklebust wrote:
>> Hi Ben,
>>
>>> On Sep 15, 2016, at 18:32, Ben Greear <greearb@candelatech.com> wrote:
>>>
>>> I have a Fedora-24 machine mounting an NFS server running Fedora-13 (kernel 2.6.34.9-69.fc13.x86_64).
>>>
>>> F24 machine has this in /etc/fstab:
>>>
>>> 192.168.100.3:/mnt/d2 /mnt/d2 nfs nfsvers=3 0 0
>>>
>>> When I copy a file from f24-32 to the F-13 machine, the file size is the same,
>>> but the file is corrupted on the file server. I see a different md5sum each time.
>>>
>>> Various other systems (F21, F19, etc) can all copy to the F13 machine fine.
>>>
>>> And, F24-64 machine can copy to the F13 machine fine.
>>>
>>> Anyone seen something similar?
>>
>> Do you know if the corruption is happening on the read()s or on the write()s? Do you, for instance get the same corruption if you copy from a local file on
>> the F-24 client to the server? ..or if you copy from a file on the server to a local directory on the F-24 client?
>>
>> Cheers
>> Trond
>>
>
> Seems to be a write issue:
>
> # This is the nfs server:
>
> [greearb@fs3 candela_cdrom.5.3.5]$ md5sum gua-f21-32
> ad4073fa8b806bb82b85a645e21f5e67 gua-f21-32
> [greearb@fs3 candela_cdrom.5.3.5]$ md5sum ../greearb/tmp/gua-f21-32
> 582bfea0cc8cc52aa38dc0f5048d0156 ../greearb/tmp/gua-f21-32
> [greearb@fs3 candela_cdrom.5.3.5]$
>
>
> # This is the v-f24-32 client:
>
> [greearb@v-f24-32 ~]$ cp /mnt/d2/pub/candela_cdrom.5.3.5/gua-f21-32 ./
> [greearb@v-f24-32 ~]$ md5sum gua-f21-32
> ad4073fa8b806bb82b85a645e21f5e67 gua-f21-32
> [greearb@v-f24-32 ~]$ cp gua-f21-32 /mnt/d2/pub/greearb/tmp/
> [greearb@v-f24-32 ~]$ md5sum /mnt/d2/pub/greearb/tmp/gua-f21-32
> ad4073fa8b806bb82b85a645e21f5e67 /mnt/d2/pub/greearb/tmp/gua-f21-32
>
>
> Interesting that the client reads back the file it copied over as if it were correct, but
> it shows up wrong on the nfs server. Maybe it is just reading a local cache?
>
> Thanks,
> Ben
>
Here is some more info on this:
We can only reproduce this on virtual machines using the KVM infrastructure, and only
when we use the rtl8139 virtual hardware (in bridge mode). With the e1000 virtual hardware
we cannot reproduce the problem.
Also, multiple different nfs servers (including much newer kernels) all show the same
behaviour with this broken nfs client.
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
* Re: [PATCH net] net: rtnl_register in net_ns_init need rtnl_lock
From: Cong Wang @ 2016-09-22 17:20 UTC (permalink / raw)
To: Hannes Frederic Sowa; +Cc: Eric Dumazet, Linux Kernel Network Developers
In-Reply-To: <4c1b9aba-e6e8-babc-1bbe-aef2bc0bca53@stressinduktion.org>
On Thu, Sep 22, 2016 at 6:41 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On 22.09.2016 15:03, Eric Dumazet wrote:
>> On Thu, 2016-09-22 at 13:03 +0200, Hannes Frederic Sowa wrote:
>>> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
>>> ---
>>> net/core/net_namespace.c | 2 ++
>>> 1 file changed, 2 insertions(+)
>>>
>>> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
>>> index 2c2eb1b629b11d..a2ace299f28355 100644
>>> --- a/net/core/net_namespace.c
>>> +++ b/net/core/net_namespace.c
>>> @@ -758,9 +758,11 @@ static int __init net_ns_init(void)
>>>
>>> register_pernet_subsys(&net_ns_ops);
>>>
>>> + rtnl_lock();
>>> rtnl_register(PF_UNSPEC, RTM_NEWNSID, rtnl_net_newid, NULL, NULL);
>>> rtnl_register(PF_UNSPEC, RTM_GETNSID, rtnl_net_getid, rtnl_net_dumpid,
>>> NULL);
>>> + rtnl_unlock();
>>>
>>> return 0;
>>> }
>>
>> Hi Hannes
>>
>> Why is this needed here, and not in other places ?
>
> I found this while working on the file and actually saw no live issues
> (it belonged to another series which I just split up).
>
> I don't think it is a big issue but wanted the writes to the
> rtnl_msg_handlers array to be strictly serialized. I was working on
> adding this to other places, too. Maybe better for net-next even?
But they are called during boot, why is it possible to have a parallel
reader/writer at that time?
* Re: [PATCH net-next v2 0/3] add support for RGMII on GMAC0 through TRGMII hardware module
From: Sergei Shtylyov @ 2016-09-22 17:08 UTC (permalink / raw)
To: David Miller, sean.wang
Cc: john, nbd, netdev, linux-kernel, linux-mediatek, andrew,
f.fainelli, keyhaede, objelf
In-Reply-To: <20160922.082247.198689057396545219.davem@davemloft.net>
Hello.
On 09/22/2016 03:22 PM, David Miller wrote:
>> By default, GMAC0 is connected to built-in switch called
>> MT7530 through the proprietary interface called Turbo RGMII
>> (TRGMII). TRGMII can also carry RGMII to a generic external PHY,
>> but that requires slight changes to the TRGMII setup and is not
>> well supported by the current driver.
>>
>> So this patchset
>> 1) provides the slight changes of the setup for RGMII can work
>> through TRGMII
>> 2) adds additional setting "trgmii" as PHY_INTERFACE_MODE_TRGMII
>> about phy-mode on device tree to make GMAC0 distinguish which
>> mode it runs
>> 3) changes dynamically source clock, TX/RX delay and interface
>> mode on TRGMII for adapting various link
>>
>> Changes since v1:
>> - fixed the style of comment which doesn't have a space at
>> the beginning and end of comment lines
>> - add support for phy-mode "trgmii" as PHY_INTERFACE_MODE_TRGMII
>> into linux/phy.h
>> - enhance the Documentation about device tree binding for trgmii
>> which is applicable only for GMAC0 which uses fixed-link
>
> Series applied.
Despite my comments? Sigh...
MBR, Sergei
* Re: [PATCH v2] fs/select: add vmalloc fallback for select(2)
From: Eric Dumazet @ 2016-09-22 17:07 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Alexander Viro, Andrew Morton, linux-fsdevel, linux-kernel,
linux-mm, Michal Hocko, netdev
In-Reply-To: <12efc491-a0e7-1012-5a8b-6d3533c720db@suse.cz>
On Thu, 2016-09-22 at 18:56 +0200, Vlastimil Babka wrote:
> On 09/22/2016 06:49 PM, Eric Dumazet wrote:
> > On Thu, 2016-09-22 at 18:43 +0200, Vlastimil Babka wrote:
> >> The select(2) syscall performs a kmalloc(size, GFP_KERNEL) where size grows
> >> with the number of fds passed. We had a customer report page allocation
> >> failures of order-4 for this allocation. This is a costly order, so it might
> >> easily fail, as the VM expects such allocation to have a lower-order fallback.
> >>
> >> Such trivial fallback is vmalloc(), as the memory doesn't have to be
> >> physically contiguous. Also the allocation is temporary for the duration of the
> >> syscall, so it's unlikely to stress vmalloc too much.
> >
> > vmalloc() uses a vmap_area_lock spinlock, and TLB flushes.
> >
> > So I guess allowing vmalloc() to be called from an innocent application
> > doing a select() might be dangerous, especially if this select() happens
> > thousands of times per second.
>
> Isn't seq_buf_alloc() similarly exposed? And ipc_alloc()?
Possibly.
We don't have a library function (attempting kmalloc(), falling back to
vmalloc()), presumably to avoid abuses, but I guess some patches were
accepted without thinking about this.
* [PATCH net-next 7/9] net/mlx5: E-Switch, Support VLAN actions in the offloads mode
From: Saeed Mahameed @ 2016-09-22 17:01 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Or Gerlitz, Hadar Hen-Zion, Jiri Pirko, Andy Gospodarek,
Jesse Brandeburg, John Fastabend, Amir Vadai, Saeed Mahameed
In-Reply-To: <1474563709-17943-1-git-send-email-saeedm@mellanox.com>
From: Or Gerlitz <ogerlitz@mellanox.com>
Many virtualization systems use a policy under which a vlan tag is
pushed to packets sent by guests, and popped before the packet is
forwarded to the VM.
The current generation of the mlx5 HW doesn't fully support that on
a per flow level. As such, we are addressing the above common use
case with the SRIOV e-Switch abilities to push vlan into packets
sent by VFs and pop vlan from packets forwarded to VFs.
The HW can match on the correct vlan being present in packets
forwarded to VFs (eSwitch steering is done before stripping
the tag), so this part is offloaded as is.
A common practice for vlans is to avoid both push vlan and pop vlan
for inter-host VM/VM (east-west) communication because in this case,
push on egress cancels out with pop on ingress.
For supporting that, we use a global eswitch vlan pop policy, hence
allowing guest A to communicate with both remote VM B and local VM C.
This works since the HW pops the vlan only if it exists (e.g. for
C --> A packets but not for B --> A packets).
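
A toy model of the policy described above, as a sketch only: the struct and
function names are hypothetical and nothing here touches real hardware. It
assumes push happens on the sending VF vport and the global pop policy
strips a tag only when one is present.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy packet: only tracks whether a vlan tag is present and its VID. */
struct pkt {
	bool has_vlan;
	uint16_t vid;
};

/* Push on the sending VF vport; inserts only if no vlan is in the packet. */
static void vport_push(struct pkt *p, uint16_t vid)
{
	if (!p->has_vlan) {
		p->has_vlan = true;
		p->vid = vid;
	}
}

/* Global pop policy on the receiving VF vport: strip only if a tag exists. */
static void vport_pop_if_present(struct pkt *p)
{
	if (p->has_vlan) {
		p->has_vlan = false;
		p->vid = 0;
	}
}

/* Deliver to a local VF: tagged or not, the guest sees an untagged frame. */
static struct pkt deliver_to_vf(struct pkt p)
{
	vport_pop_if_present(&p);
	return p;
}
```

In this model a C --> A packet is tagged by C's vport and untagged again on
delivery to A, while an untagged B --> A packet passes through unchanged.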
On the slow path, when a VF vport has an offloaded flow which involves
pushing vlans, whereas another flow is not currently offloaded, the
packets from the 2nd flow seen by the VF representor on the host carry
a vlan tag. The VF rep driver removes that tag before calling into the
host networking stack.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 21 ++-
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 33 ++++
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 15 ++
.../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 180 +++++++++++++++++++++
5 files changed, 249 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 3460154..460363b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -869,6 +869,7 @@ void mlx5e_nic_rep_unload(struct mlx5_eswitch *esw,
int mlx5e_add_sqs_fwd_rules(struct mlx5e_priv *priv);
void mlx5e_remove_sqs_fwd_rules(struct mlx5e_priv *priv);
int mlx5e_attr_get(struct net_device *dev, struct switchdev_attr *attr);
+void mlx5e_handle_rx_cqe_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe);
int mlx5e_create_direct_rqts(struct mlx5e_priv *priv);
void mlx5e_destroy_rqt(struct mlx5e_priv *priv, struct mlx5e_rqt *rqt);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index b309e7c..c127923 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -446,6 +446,16 @@ static void mlx5e_rq_free_mpwqe_info(struct mlx5e_rq *rq)
kfree(rq->mpwqe.info);
}
+static bool mlx5e_is_vf_vport_rep(struct mlx5e_priv *priv)
+{
+ struct mlx5_eswitch_rep *rep = (struct mlx5_eswitch_rep *)priv->ppriv;
+
+ if (rep && rep->vport != FDB_UPLINK_VPORT)
+ return true;
+
+ return false;
+}
+
static int mlx5e_create_rq(struct mlx5e_channel *c,
struct mlx5e_rq_param *param,
struct mlx5e_rq *rq)
@@ -487,6 +497,11 @@ static int mlx5e_create_rq(struct mlx5e_channel *c,
switch (priv->params.rq_wq_type) {
case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
+ if (mlx5e_is_vf_vport_rep(priv)) {
+ err = -EINVAL;
+ goto err_rq_wq_destroy;
+ }
+
rq->handle_rx_cqe = mlx5e_handle_rx_cqe_mpwrq;
rq->alloc_wqe = mlx5e_alloc_rx_mpwqe;
rq->dealloc_wqe = mlx5e_dealloc_rx_mpwqe;
@@ -512,7 +527,11 @@ static int mlx5e_create_rq(struct mlx5e_channel *c,
goto err_rq_wq_destroy;
}
- rq->handle_rx_cqe = mlx5e_handle_rx_cqe;
+ if (mlx5e_is_vf_vport_rep(priv))
+ rq->handle_rx_cqe = mlx5e_handle_rx_cqe_rep;
+ else
+ rq->handle_rx_cqe = mlx5e_handle_rx_cqe;
+
rq->alloc_wqe = mlx5e_alloc_rx_wqe;
rq->dealloc_wqe = mlx5e_dealloc_rx_wqe;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index e836e47..c6de6fb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -36,6 +36,7 @@
#include <net/busy_poll.h>
#include "en.h"
#include "en_tc.h"
+#include "eswitch.h"
static inline bool mlx5e_rx_hw_stamp(struct mlx5e_tstamp *tstamp)
{
@@ -803,6 +804,38 @@ wq_ll_pop:
&wqe->next.next_wqe_index);
}
+void mlx5e_handle_rx_cqe_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
+{
+ struct net_device *netdev = rq->netdev;
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ struct mlx5_eswitch_rep *rep = priv->ppriv;
+ struct mlx5e_rx_wqe *wqe;
+ struct sk_buff *skb;
+ __be16 wqe_counter_be;
+ u16 wqe_counter;
+ u32 cqe_bcnt;
+
+ wqe_counter_be = cqe->wqe_counter;
+ wqe_counter = be16_to_cpu(wqe_counter_be);
+ wqe = mlx5_wq_ll_get_wqe(&rq->wq, wqe_counter);
+ cqe_bcnt = be32_to_cpu(cqe->byte_cnt);
+
+ skb = skb_from_cqe(rq, cqe, wqe_counter, cqe_bcnt);
+ if (!skb)
+ goto wq_ll_pop;
+
+ mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb);
+
+ if (rep->vlan && skb_vlan_tag_present(skb))
+ skb_vlan_pop(skb);
+
+ napi_gro_receive(rq->cq.napi, skb);
+
+wq_ll_pop:
+ mlx5_wq_ll_pop(&rq->wq, wqe_counter_be,
+ &wqe->next.next_wqe_index);
+}
+
static inline void mlx5e_mpwqe_fill_rx_skb(struct mlx5e_rq *rq,
struct mlx5_cqe64 *cqe,
struct mlx5e_mpw_info *wi,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index eeeeadc..2e2938e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -157,6 +157,7 @@ struct mlx5_eswitch_fdb {
struct mlx5_flow_group *send_to_vport_grp;
struct mlx5_flow_group *miss_grp;
struct mlx5_flow_rule *miss_rule;
+ int vlan_push_pop_refcount;
} offloads;
};
};
@@ -183,6 +184,8 @@ struct mlx5_eswitch_rep {
struct mlx5_flow_rule *vport_rx_rule;
struct list_head vport_sqs_list;
+ u16 vlan;
+ u32 vlan_refcount;
bool valid;
};
@@ -252,11 +255,16 @@ enum {
SET_VLAN_INSERT = BIT(1)
};
+#define MLX5_FLOW_CONTEXT_ACTION_VLAN_POP 0x40
+#define MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH 0x80
+
struct mlx5_esw_flow_attr {
struct mlx5_eswitch_rep *in_rep;
struct mlx5_eswitch_rep *out_rep;
int action;
+ u16 vlan;
+ bool vlan_handled;
};
int mlx5_eswitch_sqs2vport_start(struct mlx5_eswitch *esw,
@@ -273,6 +281,13 @@ void mlx5_eswitch_register_vport_rep(struct mlx5_eswitch *esw,
void mlx5_eswitch_unregister_vport_rep(struct mlx5_eswitch *esw,
int vport_index);
+int mlx5_eswitch_add_vlan_action(struct mlx5_eswitch *esw,
+ struct mlx5_esw_flow_attr *attr);
+int mlx5_eswitch_del_vlan_action(struct mlx5_eswitch *esw,
+ struct mlx5_esw_flow_attr *attr);
+int __mlx5_eswitch_set_vport_vlan(struct mlx5_eswitch *esw,
+ int vport, u16 vlan, u8 qos, u8 set_flags);
+
#define MLX5_DEBUG_ESWITCH_MASK BIT(3)
#define esw_info(dev, format, ...) \
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index c0d9d1a..c910858 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -89,6 +89,186 @@ mlx5_eswitch_add_offloaded_rule(struct mlx5_eswitch *esw,
return rule;
}
+static int esw_set_global_vlan_pop(struct mlx5_eswitch *esw, u8 val)
+{
+ struct mlx5_eswitch_rep *rep;
+ int vf_vport, err = 0;
+
+ esw_debug(esw->dev, "%s applying global %s policy\n", __func__, val ? "pop" : "none");
+ for (vf_vport = 1; vf_vport < esw->enabled_vports; vf_vport++) {
+ rep = &esw->offloads.vport_reps[vf_vport];
+ if (!rep->valid)
+ continue;
+
+ err = __mlx5_eswitch_set_vport_vlan(esw, rep->vport, 0, 0, val);
+ if (err)
+ goto out;
+ }
+
+out:
+ return err;
+}
+
+static struct mlx5_eswitch_rep *
+esw_vlan_action_get_vport(struct mlx5_esw_flow_attr *attr, bool push, bool pop)
+{
+ struct mlx5_eswitch_rep *in_rep, *out_rep, *vport = NULL;
+
+ in_rep = attr->in_rep;
+ out_rep = attr->out_rep;
+
+ if (push)
+ vport = in_rep;
+ else if (pop)
+ vport = out_rep;
+ else
+ vport = in_rep;
+
+ return vport;
+}
+
+static int esw_add_vlan_action_check(struct mlx5_esw_flow_attr *attr,
+ bool push, bool pop, bool fwd)
+{
+ struct mlx5_eswitch_rep *in_rep, *out_rep;
+
+ if ((push || pop) && !fwd)
+ goto out_notsupp;
+
+ in_rep = attr->in_rep;
+ out_rep = attr->out_rep;
+
+ if (push && in_rep->vport == FDB_UPLINK_VPORT)
+ goto out_notsupp;
+
+ if (pop && out_rep->vport == FDB_UPLINK_VPORT)
+ goto out_notsupp;
+
+ /* vport has vlan push configured, can't offload VF --> wire rules w.o it */
+ if (!push && !pop && fwd)
+ if (in_rep->vlan && out_rep->vport == FDB_UPLINK_VPORT)
+ goto out_notsupp;
+
+ /* protects against (1) setting rules with different vlans to push and
+ * (2) setting rules w.o vlans (attr->vlan = 0) && w. vlans to push (!= 0)
+ */
+ if (push && in_rep->vlan_refcount && (in_rep->vlan != attr->vlan))
+ goto out_notsupp;
+
+ return 0;
+
+out_notsupp:
+ return -ENOTSUPP;
+}
+
+int mlx5_eswitch_add_vlan_action(struct mlx5_eswitch *esw,
+ struct mlx5_esw_flow_attr *attr)
+{
+ struct offloads_fdb *offloads = &esw->fdb_table.offloads;
+ struct mlx5_eswitch_rep *vport = NULL;
+ bool push, pop, fwd;
+ int err = 0;
+
+ push = !!(attr->action & MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH);
+ pop = !!(attr->action & MLX5_FLOW_CONTEXT_ACTION_VLAN_POP);
+ fwd = !!(attr->action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST);
+
+ err = esw_add_vlan_action_check(attr, push, pop, fwd);
+ if (err)
+ return err;
+
+ attr->vlan_handled = false;
+
+ vport = esw_vlan_action_get_vport(attr, push, pop);
+
+ if (!push && !pop && fwd) {
+ /* tracks VF --> wire rules without vlan push action */
+ if (attr->out_rep->vport == FDB_UPLINK_VPORT) {
+ vport->vlan_refcount++;
+ attr->vlan_handled = true;
+ }
+
+ return 0;
+ }
+
+ if (!push && !pop)
+ return 0;
+
+ if (!(offloads->vlan_push_pop_refcount)) {
+ /* it's the 1st vlan rule, apply global vlan pop policy */
+ err = esw_set_global_vlan_pop(esw, SET_VLAN_STRIP);
+ if (err)
+ goto out;
+ }
+ offloads->vlan_push_pop_refcount++;
+
+ if (push) {
+ if (vport->vlan_refcount)
+ goto skip_set_push;
+
+ err = __mlx5_eswitch_set_vport_vlan(esw, vport->vport, attr->vlan, 0,
+ SET_VLAN_INSERT | SET_VLAN_STRIP);
+ if (err)
+ goto out;
+ vport->vlan = attr->vlan;
+skip_set_push:
+ vport->vlan_refcount++;
+ }
+out:
+ if (!err)
+ attr->vlan_handled = true;
+ return err;
+}
+
+int mlx5_eswitch_del_vlan_action(struct mlx5_eswitch *esw,
+ struct mlx5_esw_flow_attr *attr)
+{
+ struct offloads_fdb *offloads = &esw->fdb_table.offloads;
+ struct mlx5_eswitch_rep *vport = NULL;
+ bool push, pop, fwd;
+ int err = 0;
+
+ if (!attr->vlan_handled)
+ return 0;
+
+ push = !!(attr->action & MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH);
+ pop = !!(attr->action & MLX5_FLOW_CONTEXT_ACTION_VLAN_POP);
+ fwd = !!(attr->action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST);
+
+ vport = esw_vlan_action_get_vport(attr, push, pop);
+
+ if (!push && !pop && fwd) {
+ /* tracks VF --> wire rules without vlan push action */
+ if (attr->out_rep->vport == FDB_UPLINK_VPORT)
+ vport->vlan_refcount--;
+
+ return 0;
+ }
+
+ if (push) {
+ vport->vlan_refcount--;
+ if (vport->vlan_refcount)
+ goto skip_unset_push;
+
+ vport->vlan = 0;
+ err = __mlx5_eswitch_set_vport_vlan(esw, vport->vport,
+ 0, 0, SET_VLAN_STRIP);
+ if (err)
+ goto out;
+ }
+
+skip_unset_push:
+ offloads->vlan_push_pop_refcount--;
+ if (offloads->vlan_push_pop_refcount)
+ return 0;
+
+ /* no more vlan rules, stop global vlan pop policy */
+ err = esw_set_global_vlan_pop(esw, 0);
+
+out:
+ return err;
+}
+
static struct mlx5_flow_rule *
mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *esw, int vport, u32 sqn)
{
--
2.7.4
^ permalink raw reply related
* [PATCH net-next 5/9] net/mlx5: Put elements related to offloaded TC rule in one struct
From: Saeed Mahameed @ 2016-09-22 17:01 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Or Gerlitz, Hadar Hen-Zion, Jiri Pirko, Andy Gospodarek,
Jesse Brandeburg, John Fastabend, Amir Vadai, Saeed Mahameed
In-Reply-To: <1474563709-17943-1-git-send-email-saeedm@mellanox.com>
From: Or Gerlitz <ogerlitz@mellanox.com>
Put the representors related to the source and dest vports and the
action in struct mlx5_esw_flow_attr which is used while setting the FDB rule.
This patch doesn't change any functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 51 ++++++++++++----------
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 10 ++++-
.../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 9 ++--
3 files changed, 44 insertions(+), 26 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 783e122..3eb319b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -39,6 +39,7 @@
#include <linux/rhashtable.h>
#include <net/switchdev.h>
#include <net/tc_act/tc_mirred.h>
+#include <net/tc_act/tc_vlan.h>
#include "en.h"
#include "en_tc.h"
#include "eswitch.h"
@@ -47,6 +48,7 @@ struct mlx5e_tc_flow {
struct rhash_head node;
u64 cookie;
struct mlx5_flow_rule *rule;
+ struct mlx5_esw_flow_attr *attr;
};
#define MLX5E_TC_TABLE_NUM_ENTRIES 1024
@@ -114,15 +116,11 @@ err_create_ft:
static struct mlx5_flow_rule *mlx5e_tc_add_fdb_flow(struct mlx5e_priv *priv,
struct mlx5_flow_spec *spec,
- u32 action, u32 dst_vport)
+ struct mlx5_esw_flow_attr *attr)
{
struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
- struct mlx5_eswitch_rep *rep = priv->ppriv;
- u32 src_vport;
- src_vport = rep->vport;
-
- return mlx5_eswitch_add_offloaded_rule(esw, spec, action, src_vport, dst_vport);
+ return mlx5_eswitch_add_offloaded_rule(esw, spec, attr);
}
static void mlx5e_tc_del_flow(struct mlx5e_priv *priv,
@@ -358,7 +356,7 @@ static int parse_tc_nic_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
}
static int parse_tc_fdb_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
- u32 *action, u32 *dest_vport)
+ struct mlx5_esw_flow_attr *attr)
{
const struct tc_action *a;
LIST_HEAD(actions);
@@ -366,17 +364,18 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
if (tc_no_actions(exts))
return -EINVAL;
- *action = 0;
+ memset(attr, 0, sizeof(*attr));
+ attr->in_rep = priv->ppriv;
tcf_exts_to_list(exts, &actions);
list_for_each_entry(a, &actions, list) {
/* Only support a single action per rule */
- if (*action)
+ if (attr->action)
return -EINVAL;
if (is_tcf_gact_shot(a)) {
- *action = MLX5_FLOW_CONTEXT_ACTION_DROP |
- MLX5_FLOW_CONTEXT_ACTION_COUNT;
+ attr->action = MLX5_FLOW_CONTEXT_ACTION_DROP |
+ MLX5_FLOW_CONTEXT_ACTION_COUNT;
continue;
}
@@ -384,7 +383,6 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
int ifindex = tcf_mirred_ifindex(a);
struct net_device *out_dev;
struct mlx5e_priv *out_priv;
- struct mlx5_eswitch_rep *out_rep;
out_dev = __dev_get_by_index(dev_net(priv->netdev), ifindex);
@@ -394,10 +392,9 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
return -EINVAL;
}
+ attr->action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
out_priv = netdev_priv(out_dev);
- out_rep = out_priv->ppriv;
- *dest_vport = out_rep->vport;
- *action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
+ attr->out_rep = out_priv->ppriv;
continue;
}
@@ -411,18 +408,27 @@ int mlx5e_configure_flower(struct mlx5e_priv *priv, __be16 protocol,
{
struct mlx5e_tc_table *tc = &priv->fs.tc;
int err = 0;
- u32 flow_tag, action, dest_vport = 0;
+ bool fdb_flow = false;
+ u32 flow_tag, action;
struct mlx5e_tc_flow *flow;
struct mlx5_flow_spec *spec;
struct mlx5_flow_rule *old = NULL;
struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
+ if (esw && esw->mode == SRIOV_OFFLOADS)
+ fdb_flow = true;
+
flow = rhashtable_lookup_fast(&tc->ht, &f->cookie,
tc->ht_params);
- if (flow)
+ if (flow) {
old = flow->rule;
- else
- flow = kzalloc(sizeof(*flow), GFP_KERNEL);
+ } else {
+ if (fdb_flow)
+ flow = kzalloc(sizeof(*flow) + sizeof(struct mlx5_esw_flow_attr),
+ GFP_KERNEL);
+ else
+ flow = kzalloc(sizeof(*flow), GFP_KERNEL);
+ }
spec = mlx5_vzalloc(sizeof(*spec));
if (!spec || !flow) {
@@ -436,11 +442,12 @@ int mlx5e_configure_flower(struct mlx5e_priv *priv, __be16 protocol,
if (err < 0)
goto err_free;
- if (esw && esw->mode == SRIOV_OFFLOADS) {
- err = parse_tc_fdb_actions(priv, f->exts, &action, &dest_vport);
+ if (fdb_flow) {
+ flow->attr = (struct mlx5_esw_flow_attr *)(flow + 1);
+ err = parse_tc_fdb_actions(priv, f->exts, flow->attr);
if (err < 0)
goto err_free;
- flow->rule = mlx5e_tc_add_fdb_flow(priv, spec, action, dest_vport);
+ flow->rule = mlx5e_tc_add_fdb_flow(priv, spec, flow->attr);
} else {
err = parse_tc_nic_actions(priv, f->exts, &action, &flow_tag);
if (err < 0)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index 4f5391a..eeeeadc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -238,11 +238,12 @@ int mlx5_eswitch_get_vport_stats(struct mlx5_eswitch *esw,
struct ifla_vf_stats *vf_stats);
struct mlx5_flow_spec;
+struct mlx5_esw_flow_attr;
struct mlx5_flow_rule *
mlx5_eswitch_add_offloaded_rule(struct mlx5_eswitch *esw,
struct mlx5_flow_spec *spec,
- u32 action, u32 src_vport, u32 dst_vport);
+ struct mlx5_esw_flow_attr *attr);
struct mlx5_flow_rule *
mlx5_eswitch_create_vport_rx_rule(struct mlx5_eswitch *esw, int vport, u32 tirn);
@@ -251,6 +252,13 @@ enum {
SET_VLAN_INSERT = BIT(1)
};
+struct mlx5_esw_flow_attr {
+ struct mlx5_eswitch_rep *in_rep;
+ struct mlx5_eswitch_rep *out_rep;
+
+ int action;
+};
+
int mlx5_eswitch_sqs2vport_start(struct mlx5_eswitch *esw,
struct mlx5_eswitch_rep *rep,
u16 *sqns_array, int sqns_num);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index b901cd4..c0d9d1a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -46,19 +46,22 @@ enum {
struct mlx5_flow_rule *
mlx5_eswitch_add_offloaded_rule(struct mlx5_eswitch *esw,
struct mlx5_flow_spec *spec,
- u32 action, u32 src_vport, u32 dst_vport)
+ struct mlx5_esw_flow_attr *attr)
{
struct mlx5_flow_destination dest = { 0 };
struct mlx5_fc *counter = NULL;
struct mlx5_flow_rule *rule;
void *misc;
+ int action;
if (esw->mode != SRIOV_OFFLOADS)
return ERR_PTR(-EOPNOTSUPP);
+ action = attr->action;
+
if (action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST) {
dest.type = MLX5_FLOW_DESTINATION_TYPE_VPORT;
- dest.vport_num = dst_vport;
+ dest.vport_num = attr->out_rep->vport;
action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
} else if (action & MLX5_FLOW_CONTEXT_ACTION_COUNT) {
counter = mlx5_fc_create(esw->dev, true);
@@ -69,7 +72,7 @@ mlx5_eswitch_add_offloaded_rule(struct mlx5_eswitch *esw,
}
misc = MLX5_ADDR_OF(fte_match_param, spec->match_value, misc_parameters);
- MLX5_SET(fte_match_set_misc, misc, source_port, src_vport);
+ MLX5_SET(fte_match_set_misc, misc, source_port, attr->in_rep->vport);
misc = MLX5_ADDR_OF(fte_match_param, spec->match_criteria, misc_parameters);
MLX5_SET_TO_ONES(fte_match_set_misc, misc, source_port);
--
2.7.4
^ permalink raw reply related
* [PATCH net-next 4/9] net/mlx5: E-Switch, Allow fine tuning of eswitch vport push/pop vlan
From: Saeed Mahameed @ 2016-09-22 17:01 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Or Gerlitz, Hadar Hen-Zion, Jiri Pirko, Andy Gospodarek,
Jesse Brandeburg, John Fastabend, Amir Vadai, Saeed Mahameed
In-Reply-To: <1474563709-17943-1-git-send-email-saeedm@mellanox.com>
From: Or Gerlitz <ogerlitz@mellanox.com>
The HW can be programmed to push vlan, pop vlan or both.
A factorization step towards using the push/pop capabilities in the
eswitch offloads mode. This patch doesn't add new functionality.
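
As a standalone sketch of the flag split (userspace C, not driver code): the
two flag values mirror the enum added by this patch, legacy_set_flags()
mirrors the logic of the mlx5_eswitch_set_vport_vlan() wrapper, and the
wants_*() helpers are hypothetical names added for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as in the patch: BIT(0) and BIT(1). */
enum {
	SET_VLAN_STRIP  = 1 << 0,
	SET_VLAN_INSERT = 1 << 1,
};

/* Mirrors the wrapper: any vlan or qos value enables both strip and insert. */
static uint8_t legacy_set_flags(uint16_t vlan, uint8_t qos)
{
	return (vlan || qos) ? (SET_VLAN_STRIP | SET_VLAN_INSERT) : 0;
}

/* Hypothetical helpers: each direction can now be tested independently. */
static bool wants_strip(uint8_t set_flags)
{
	return set_flags & SET_VLAN_STRIP;
}

static bool wants_insert(uint8_t set_flags)
{
	return set_flags & SET_VLAN_INSERT;
}
```

With a boolean "set" the caller could only ask for both or neither; with a
bitmask, strip-only (pop) and insert-only (push) become expressible.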
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 33 +++++++++++++++--------
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 5 ++++
2 files changed, 27 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 4927494..6605453 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -127,7 +127,7 @@ static int modify_esw_vport_context_cmd(struct mlx5_core_dev *dev, u16 vport,
}
static int modify_esw_vport_cvlan(struct mlx5_core_dev *dev, u32 vport,
- u16 vlan, u8 qos, bool set)
+ u16 vlan, u8 qos, u8 set_flags)
{
u32 in[MLX5_ST_SZ_DW(modify_esw_vport_context_in)] = {0};
@@ -135,14 +135,18 @@ static int modify_esw_vport_cvlan(struct mlx5_core_dev *dev, u32 vport,
!MLX5_CAP_ESW(dev, vport_cvlan_insert_if_not_exist))
return -ENOTSUPP;
- esw_debug(dev, "Set Vport[%d] VLAN %d qos %d set=%d\n",
- vport, vlan, qos, set);
- if (set) {
+ esw_debug(dev, "Set Vport[%d] VLAN %d qos %d set=%x\n",
+ vport, vlan, qos, set_flags);
+
+ if (set_flags & SET_VLAN_STRIP)
MLX5_SET(modify_esw_vport_context_in, in,
esw_vport_context.vport_cvlan_strip, 1);
+
+ if (set_flags & SET_VLAN_INSERT) {
/* insert only if no vlan in packet */
MLX5_SET(modify_esw_vport_context_in, in,
esw_vport_context.vport_cvlan_insert, 1);
+
MLX5_SET(modify_esw_vport_context_in, in,
esw_vport_context.cvlan_pcp, qos);
MLX5_SET(modify_esw_vport_context_in, in,
@@ -1777,25 +1781,21 @@ int mlx5_eswitch_get_vport_config(struct mlx5_eswitch *esw,
return 0;
}
-int mlx5_eswitch_set_vport_vlan(struct mlx5_eswitch *esw,
- int vport, u16 vlan, u8 qos)
+int __mlx5_eswitch_set_vport_vlan(struct mlx5_eswitch *esw,
+ int vport, u16 vlan, u8 qos, u8 set_flags)
{
struct mlx5_vport *evport;
int err = 0;
- int set = 0;
if (!ESW_ALLOWED(esw))
return -EPERM;
if (!LEGAL_VPORT(esw, vport) || (vlan > 4095) || (qos > 7))
return -EINVAL;
- if (vlan || qos)
- set = 1;
-
mutex_lock(&esw->state_lock);
evport = &esw->vports[vport];
- err = modify_esw_vport_cvlan(esw->dev, vport, vlan, qos, set);
+ err = modify_esw_vport_cvlan(esw->dev, vport, vlan, qos, set_flags);
if (err)
goto unlock;
@@ -1813,6 +1813,17 @@ unlock:
return err;
}
+int mlx5_eswitch_set_vport_vlan(struct mlx5_eswitch *esw,
+ int vport, u16 vlan, u8 qos)
+{
+ u8 set_flags = 0;
+
+ if (vlan || qos)
+ set_flags = SET_VLAN_STRIP | SET_VLAN_INSERT;
+
+ return __mlx5_eswitch_set_vport_vlan(esw, vport, vlan, qos, set_flags);
+}
+
int mlx5_eswitch_set_vport_spoofchk(struct mlx5_eswitch *esw,
int vport, bool spoofchk)
{
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index ebfcde0..4f5391a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -246,6 +246,11 @@ mlx5_eswitch_add_offloaded_rule(struct mlx5_eswitch *esw,
struct mlx5_flow_rule *
mlx5_eswitch_create_vport_rx_rule(struct mlx5_eswitch *esw, int vport, u32 tirn);
+enum {
+ SET_VLAN_STRIP = BIT(0),
+ SET_VLAN_INSERT = BIT(1)
+};
+
int mlx5_eswitch_sqs2vport_start(struct mlx5_eswitch *esw,
struct mlx5_eswitch_rep *rep,
u16 *sqns_array, int sqns_num);
--
2.7.4
^ permalink raw reply related
* [PATCH net-next 8/9] net/mlx5e: Add TC vlan action for SRIOV offloads
From: Saeed Mahameed @ 2016-09-22 17:01 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Or Gerlitz, Hadar Hen-Zion, Jiri Pirko, Andy Gospodarek,
Jesse Brandeburg, John Fastabend, Amir Vadai, Saeed Mahameed
In-Reply-To: <1474563709-17943-1-git-send-email-saeedm@mellanox.com>
From: Or Gerlitz <ogerlitz@mellanox.com>
Parse TC vlan actions and set the required elements to allow offloading.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 43 ++++++++++++++++++-------
1 file changed, 32 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 3eb319b..e61bd52 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -119,17 +119,27 @@ static struct mlx5_flow_rule *mlx5e_tc_add_fdb_flow(struct mlx5e_priv *priv,
struct mlx5_esw_flow_attr *attr)
{
struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
+ int err;
+
+ err = mlx5_eswitch_add_vlan_action(esw, attr);
+ if (err)
+ return ERR_PTR(err);
return mlx5_eswitch_add_offloaded_rule(esw, spec, attr);
}
static void mlx5e_tc_del_flow(struct mlx5e_priv *priv,
- struct mlx5_flow_rule *rule)
+ struct mlx5_flow_rule *rule,
+ struct mlx5_esw_flow_attr *attr)
{
+ struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
struct mlx5_fc *counter = NULL;
counter = mlx5_flow_rule_counter(rule);
+ if (esw && esw->mode == SRIOV_OFFLOADS)
+ mlx5_eswitch_del_vlan_action(esw, attr);
+
mlx5_del_flow_rule(rule);
mlx5_fc_destroy(priv->mdev, counter);
@@ -369,13 +379,9 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
tcf_exts_to_list(exts, &actions);
list_for_each_entry(a, &actions, list) {
- /* Only support a single action per rule */
- if (attr->action)
- return -EINVAL;
-
if (is_tcf_gact_shot(a)) {
- attr->action = MLX5_FLOW_CONTEXT_ACTION_DROP |
- MLX5_FLOW_CONTEXT_ACTION_COUNT;
+ attr->action |= MLX5_FLOW_CONTEXT_ACTION_DROP |
+ MLX5_FLOW_CONTEXT_ACTION_COUNT;
continue;
}
@@ -392,12 +398,25 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
return -EINVAL;
}
- attr->action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
+ attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
out_priv = netdev_priv(out_dev);
attr->out_rep = out_priv->ppriv;
continue;
}
+ if (is_tcf_vlan(a)) {
+ if (tcf_vlan_action(a) == VLAN_F_POP) {
+ attr->action |= MLX5_FLOW_CONTEXT_ACTION_VLAN_POP;
+ } else if (tcf_vlan_action(a) == VLAN_F_PUSH) {
+ if (tcf_vlan_push_proto(a) != htons(ETH_P_8021Q))
+ return -EOPNOTSUPP;
+
+ attr->action |= MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH;
+ attr->vlan = tcf_vlan_push_vid(a);
+ }
+ continue;
+ }
+
return -EINVAL;
}
return 0;
@@ -413,6 +432,7 @@ int mlx5e_configure_flower(struct mlx5e_priv *priv, __be16 protocol,
struct mlx5e_tc_flow *flow;
struct mlx5_flow_spec *spec;
struct mlx5_flow_rule *old = NULL;
+ struct mlx5_esw_flow_attr *old_attr;
struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
if (esw && esw->mode == SRIOV_OFFLOADS)
@@ -422,6 +442,7 @@ int mlx5e_configure_flower(struct mlx5e_priv *priv, __be16 protocol,
tc->ht_params);
if (flow) {
old = flow->rule;
+ old_attr = flow->attr;
} else {
if (fdb_flow)
flow = kzalloc(sizeof(*flow) + sizeof(struct mlx5_esw_flow_attr),
@@ -466,7 +487,7 @@ int mlx5e_configure_flower(struct mlx5e_priv *priv, __be16 protocol,
goto err_del_rule;
if (old)
- mlx5e_tc_del_flow(priv, old);
+ mlx5e_tc_del_flow(priv, old, old_attr);
goto out;
@@ -494,7 +515,7 @@ int mlx5e_delete_flower(struct mlx5e_priv *priv,
rhashtable_remove_fast(&tc->ht, &flow->node, tc->ht_params);
- mlx5e_tc_del_flow(priv, flow->rule);
+ mlx5e_tc_del_flow(priv, flow->rule, flow->attr);
kfree(flow);
@@ -551,7 +572,7 @@ static void _mlx5e_tc_del_flow(void *ptr, void *arg)
struct mlx5e_tc_flow *flow = ptr;
struct mlx5e_priv *priv = arg;
- mlx5e_tc_del_flow(priv, flow->rule);
+ mlx5e_tc_del_flow(priv, flow->rule, flow->attr);
kfree(flow);
}
--
2.7.4
^ permalink raw reply related
* [PATCH net-next 2/9] net/mlx5: E-Switch, Set the vport when registering the uplink rep
From: Saeed Mahameed @ 2016-09-22 17:01 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Or Gerlitz, Hadar Hen-Zion, Jiri Pirko, Andy Gospodarek,
Jesse Brandeburg, John Fastabend, Amir Vadai, Saeed Mahameed
In-Reply-To: <1474563709-17943-1-git-send-email-saeedm@mellanox.com>
From: Or Gerlitz <ogerlitz@mellanox.com>
Set the vport value in the PF entry to be that of the uplink so
we can use it blindly over the tc / eswitch offload code without
translating it each time we deal with the uplink representor.
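
A small userspace sketch of the idea, with hypothetical names throughout
(struct rep, register_rep(), rule_source_vport()); the 0xffff uplink number
is an illustrative assumption, not taken from this patch. The table slot is
chosen by index, while the vport number used for steering is stored in the
entry itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the uplink uses a reserved vport number; 0xffff is illustrative. */
#define UPLINK_VPORT 0xffff
#define MAX_REPS 4

struct rep {
	uint16_t vport;	/* number used in steering rules */
	bool valid;
};

static struct rep reps[MAX_REPS];

/* Register by table index; the vport number is stored, not derived. */
static void register_rep(int index, uint16_t vport)
{
	reps[index].vport = vport;
	reps[index].valid = true;
}

/* Callers read the stored number directly, with no "0 means uplink" check. */
static uint16_t rule_source_vport(int index)
{
	return reps[index].vport;
}
```

Decoupling the index from the stored number is what lets the PF entry sit in
slot 0 while still reporting the uplink vport to rule-building code.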
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 6 ++---
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 10 ++------
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 3 ++-
.../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 27 +++++++++++-----------
4 files changed, 20 insertions(+), 26 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index a9fc9d4..b309e7c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3726,9 +3726,9 @@ static void mlx5e_nic_enable(struct mlx5e_priv *priv)
mlx5_query_nic_vport_mac_address(mdev, 0, rep.hw_id);
rep.load = mlx5e_nic_rep_load;
rep.unload = mlx5e_nic_rep_unload;
- rep.vport = 0;
+ rep.vport = FDB_UPLINK_VPORT;
rep.priv_data = priv;
- mlx5_eswitch_register_vport_rep(esw, &rep);
+ mlx5_eswitch_register_vport_rep(esw, 0, &rep);
}
}
@@ -3867,7 +3867,7 @@ static void mlx5e_register_vport_rep(struct mlx5_core_dev *mdev)
rep.unload = mlx5e_vport_rep_unload;
rep.vport = vport;
ether_addr_copy(rep.hw_id, mac);
- mlx5_eswitch_register_vport_rep(esw, &rep);
+ mlx5_eswitch_register_vport_rep(esw, vport, &rep);
}
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 22cfc4a..783e122 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -120,10 +120,7 @@ static struct mlx5_flow_rule *mlx5e_tc_add_fdb_flow(struct mlx5e_priv *priv,
struct mlx5_eswitch_rep *rep = priv->ppriv;
u32 src_vport;
- if (rep->vport) /* set source vport for the flow */
- src_vport = rep->vport;
- else
- src_vport = FDB_UPLINK_VPORT;
+ src_vport = rep->vport;
return mlx5_eswitch_add_offloaded_rule(esw, spec, action, src_vport, dst_vport);
}
@@ -399,10 +396,7 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
out_priv = netdev_priv(out_dev);
out_rep = out_priv->ppriv;
- if (out_rep->vport == 0)
- *dest_vport = FDB_UPLINK_VPORT;
- else
- *dest_vport = out_rep->vport;
+ *dest_vport = out_rep->vport;
*action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
continue;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index b96e8c9..6d8c5a2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -254,9 +254,10 @@ void mlx5_eswitch_sqs2vport_stop(struct mlx5_eswitch *esw,
int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode);
int mlx5_devlink_eswitch_mode_get(struct devlink *devlink, u16 *mode);
void mlx5_eswitch_register_vport_rep(struct mlx5_eswitch *esw,
+ int vport_index,
struct mlx5_eswitch_rep *rep);
void mlx5_eswitch_unregister_vport_rep(struct mlx5_eswitch *esw,
- int vport);
+ int vport_index);
#define MLX5_DEBUG_ESWITCH_MASK BIT(3)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 3dc83a9..a73721b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -144,16 +144,12 @@ int mlx5_eswitch_sqs2vport_start(struct mlx5_eswitch *esw,
{
struct mlx5_flow_rule *flow_rule;
struct mlx5_esw_sq *esw_sq;
- int vport;
int err;
int i;
if (esw->mode != SRIOV_OFFLOADS)
return 0;
- vport = rep->vport == 0 ?
- FDB_UPLINK_VPORT : rep->vport;
-
for (i = 0; i < sqns_num; i++) {
esw_sq = kzalloc(sizeof(*esw_sq), GFP_KERNEL);
if (!esw_sq) {
@@ -163,7 +159,7 @@ int mlx5_eswitch_sqs2vport_start(struct mlx5_eswitch *esw,
/* Add re-inject rule to the PF/representor sqs */
flow_rule = mlx5_eswitch_add_send_to_vport_rule(esw,
- vport,
+ rep->vport,
sqns_array[i]);
if (IS_ERR(flow_rule)) {
err = PTR_ERR(flow_rule);
@@ -612,27 +608,30 @@ int mlx5_devlink_eswitch_mode_get(struct devlink *devlink, u16 *mode)
}
void mlx5_eswitch_register_vport_rep(struct mlx5_eswitch *esw,
- struct mlx5_eswitch_rep *rep)
+ int vport_index,
+ struct mlx5_eswitch_rep *__rep)
{
struct mlx5_esw_offload *offloads = &esw->offloads;
+ struct mlx5_eswitch_rep *rep;
+
+ rep = &offloads->vport_reps[vport_index];
- memcpy(&offloads->vport_reps[rep->vport], rep,
- sizeof(struct mlx5_eswitch_rep));
+ memcpy(rep, __rep, sizeof(struct mlx5_eswitch_rep));
- INIT_LIST_HEAD(&offloads->vport_reps[rep->vport].vport_sqs_list);
- offloads->vport_reps[rep->vport].valid = true;
+ INIT_LIST_HEAD(&rep->vport_sqs_list);
+ rep->valid = true;
}
void mlx5_eswitch_unregister_vport_rep(struct mlx5_eswitch *esw,
- int vport)
+ int vport_index)
{
struct mlx5_esw_offload *offloads = &esw->offloads;
struct mlx5_eswitch_rep *rep;
- rep = &offloads->vport_reps[vport];
+ rep = &offloads->vport_reps[vport_index];
- if (esw->mode == SRIOV_OFFLOADS && esw->vports[vport].enabled)
+ if (esw->mode == SRIOV_OFFLOADS && esw->vports[vport_index].enabled)
rep->unload(esw, rep);
- offloads->vport_reps[vport].valid = false;
+ rep->valid = false;
}
--
2.7.4
* [PATCH net-next 3/9] net/mlx5: E-Switch, Set vport representor fields explicitly on registration
From: Saeed Mahameed @ 2016-09-22 17:01 UTC
To: David S. Miller
Cc: netdev, Or Gerlitz, Hadar Hen-Zion, Jiri Pirko, Andy Gospodarek,
Jesse Brandeburg, John Fastabend, Amir Vadai, Saeed Mahameed
In-Reply-To: <1474563709-17943-1-git-send-email-saeedm@mellanox.com>
From: Or Gerlitz <ogerlitz@mellanox.com>
The structure we use for the eswitch vport representor (mlx5_eswitch_rep)
has some fields that upper layers of the driver set when they register the
rep. Set these fields explicitly at registration time and avoid a global
memcpy. This patch doesn't add new functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 5 +++--
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 8 +++++++-
2 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index 6d8c5a2..ebfcde0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -178,11 +178,12 @@ struct mlx5_eswitch_rep {
void (*unload)(struct mlx5_eswitch *esw,
struct mlx5_eswitch_rep *rep);
u16 vport;
- struct mlx5_flow_rule *vport_rx_rule;
+ u8 hw_id[ETH_ALEN];
void *priv_data;
+
+ struct mlx5_flow_rule *vport_rx_rule;
struct list_head vport_sqs_list;
bool valid;
- u8 hw_id[ETH_ALEN];
};
struct mlx5_esw_offload {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index a73721b..b901cd4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -616,7 +616,13 @@ void mlx5_eswitch_register_vport_rep(struct mlx5_eswitch *esw,
rep = &offloads->vport_reps[vport_index];
- memcpy(rep, __rep, sizeof(struct mlx5_eswitch_rep));
+ memset(rep, 0, sizeof(*rep));
+
+ rep->load = __rep->load;
+ rep->unload = __rep->unload;
+ rep->vport = __rep->vport;
+ rep->priv_data = __rep->priv_data;
+ ether_addr_copy(rep->hw_id, __rep->hw_id);
INIT_LIST_HEAD(&rep->vport_sqs_list);
rep->valid = true;
--
2.7.4
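Editorial note on the patch above (not part of the posted patch): a minimal userspace sketch of the registration pattern it describes, assuming a simplified stand-in struct. The names demo_rep and demo_register_rep are hypothetical and exist only for illustration. The slot is zeroed and only the caller-owned fields are copied, so core-owned state in the caller's template never leaks into the registered entry.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN 6

/* Hypothetical, simplified stand-in for struct mlx5_eswitch_rep. */
struct demo_rep {
	/* Fields owned by the registering (upper) layer. */
	uint16_t vport;
	uint8_t hw_id[DEMO_ETH_ALEN];
	void *priv_data;

	/* Fields owned by the e-switch core. */
	void *vport_rx_rule;
	bool valid;
};

/* Zero the slot, then copy only the caller-owned fields from the template. */
static void demo_register_rep(struct demo_rep *slot, const struct demo_rep *tmpl)
{
	memset(slot, 0, sizeof(*slot));

	slot->vport = tmpl->vport;
	slot->priv_data = tmpl->priv_data;
	memcpy(slot->hw_id, tmpl->hw_id, DEMO_ETH_ALEN);

	/* Core-owned state is initialized here, never taken from tmpl. */
	slot->valid = true;
}
```

With a whole-struct memcpy, a stale vport_rx_rule pointer in the template would have been copied into the slot; with the explicit form it stays NULL until the core sets it.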
* [PATCH net-next 9/9] net/mlx5e: Add TC vlan match parsing
From: Saeed Mahameed @ 2016-09-22 17:01 UTC
To: David S. Miller
Cc: netdev, Or Gerlitz, Hadar Hen-Zion, Jiri Pirko, Andy Gospodarek,
Jesse Brandeburg, John Fastabend, Amir Vadai, Saeed Mahameed
In-Reply-To: <1474563709-17943-1-git-send-email-saeedm@mellanox.com>
From: Or Gerlitz <ogerlitz@mellanox.com>
Enhance the match parsing of offloaded TC rules to handle VLANs.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index e61bd52..a350b71 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -164,6 +164,7 @@ static int parse_cls_flower(struct mlx5e_priv *priv, struct mlx5_flow_spec *spec
~(BIT(FLOW_DISSECTOR_KEY_CONTROL) |
BIT(FLOW_DISSECTOR_KEY_BASIC) |
BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS) |
+ BIT(FLOW_DISSECTOR_KEY_VLAN) |
BIT(FLOW_DISSECTOR_KEY_IPV4_ADDRS) |
BIT(FLOW_DISSECTOR_KEY_IPV6_ADDRS) |
BIT(FLOW_DISSECTOR_KEY_PORTS))) {
@@ -227,6 +228,24 @@ static int parse_cls_flower(struct mlx5e_priv *priv, struct mlx5_flow_spec *spec
key->src);
}
+ if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_VLAN)) {
+ struct flow_dissector_key_vlan *key =
+ skb_flow_dissector_target(f->dissector,
+ FLOW_DISSECTOR_KEY_VLAN,
+ f->key);
+ struct flow_dissector_key_vlan *mask =
+ skb_flow_dissector_target(f->dissector,
+ FLOW_DISSECTOR_KEY_VLAN,
+ f->mask);
+ if (mask->vlan_id) {
+ MLX5_SET(fte_match_set_lyr_2_4, headers_c, vlan_tag, 1);
+ MLX5_SET(fte_match_set_lyr_2_4, headers_v, vlan_tag, 1);
+
+ MLX5_SET(fte_match_set_lyr_2_4, headers_c, first_vid, mask->vlan_id);
+ MLX5_SET(fte_match_set_lyr_2_4, headers_v, first_vid, key->vlan_id);
+ }
+ }
+
if (addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) {
struct flow_dissector_key_ipv4_addrs *key =
skb_flow_dissector_target(f->dissector,
--
2.7.4
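Editorial note on the patch above (not part of the posted patch): the patch writes the VLAN ID into both the match criteria (headers_c, from the flower mask) and the match value (headers_v, from the flower key). The sketch below models that pair in plain C, assuming the usual masked-compare semantics where only bits set in the criteria take part in the comparison. The names demo_vlan_match and demo_vlan_matches are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a VLAN match: criteria (mask) plus value. */
struct demo_vlan_match {
	bool tag_required;   /* models vlan_tag set in criteria and value */
	uint16_t vid_mask;   /* models first_vid in the match criteria */
	uint16_t vid_value;  /* models first_vid in the match value */
};

/* Return true if a packet with the given tag state and VID hits the match. */
static bool demo_vlan_matches(const struct demo_vlan_match *m,
			      bool pkt_tagged, uint16_t pkt_vid)
{
	if (m->tag_required && !pkt_tagged)
		return false;

	/* Only bits set in the mask are compared. */
	return (pkt_vid & m->vid_mask) == (m->vid_value & m->vid_mask);
}
```

A zero mask with tag_required unset matches every packet, which corresponds to the patch leaving the VLAN fields untouched when mask->vlan_id is zero.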
* [PATCH net-next 6/9] net/mlx5e: Refactor retrieval of skb from rx completion element (cqe)
From: Saeed Mahameed @ 2016-09-22 17:01 UTC
To: David S. Miller
Cc: netdev, Or Gerlitz, Hadar Hen-Zion, Jiri Pirko, Andy Gospodarek,
Jesse Brandeburg, John Fastabend, Amir Vadai, Saeed Mahameed
In-Reply-To: <1474563709-17943-1-git-send-email-saeedm@mellanox.com>
From: Or Gerlitz <ogerlitz@mellanox.com>
Factor the code that retrieves the skb from an rx completion element into
a static inline helper (skb_from_cqe).
Move the call to napi_gro_receive so that it is carried out just
after mlx5e_complete_rx_cqe returns.
Both changes will also be used for the VF representor
in the next commit.
This patch doesn't change any functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 41 +++++++++++++++++--------
1 file changed, 28 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 0a81bd3..e836e47 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -629,7 +629,6 @@ static inline void mlx5e_complete_rx_cqe(struct mlx5e_rq *rq,
rq->stats.packets++;
rq->stats.bytes += cqe_bcnt;
mlx5e_build_rx_skb(cqe, cqe_bcnt, rq, skb);
- napi_gro_receive(rq->cq.napi, skb);
}
static inline void mlx5e_xmit_xdp_doorbell(struct mlx5e_sq *sq)
@@ -733,20 +732,15 @@ static inline bool mlx5e_xdp_handle(struct mlx5e_rq *rq,
}
}
-void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
+static inline
+struct sk_buff *skb_from_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe,
+ u16 wqe_counter, u32 cqe_bcnt)
{
struct bpf_prog *xdp_prog = READ_ONCE(rq->xdp_prog);
struct mlx5e_dma_info *di;
- struct mlx5e_rx_wqe *wqe;
- __be16 wqe_counter_be;
struct sk_buff *skb;
- u16 wqe_counter;
void *va, *data;
- u32 cqe_bcnt;
- wqe_counter_be = cqe->wqe_counter;
- wqe_counter = be16_to_cpu(wqe_counter_be);
- wqe = mlx5_wq_ll_get_wqe(&rq->wq, wqe_counter);
di = &rq->dma_info[wqe_counter];
va = page_address(di->page);
data = va + MLX5_RX_HEADROOM;
@@ -757,22 +751,21 @@ void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
rq->buff.wqe_sz,
DMA_FROM_DEVICE);
prefetch(data);
- cqe_bcnt = be32_to_cpu(cqe->byte_cnt);
if (unlikely((cqe->op_own >> 4) != MLX5_CQE_RESP_SEND)) {
rq->stats.wqe_err++;
mlx5e_page_release(rq, di, true);
- goto wq_ll_pop;
+ return NULL;
}
if (mlx5e_xdp_handle(rq, xdp_prog, di, data, cqe_bcnt))
- goto wq_ll_pop; /* page/packet was consumed by XDP */
+ return NULL; /* page/packet was consumed by XDP */
skb = build_skb(va, RQ_PAGE_SIZE(rq));
if (unlikely(!skb)) {
rq->stats.buff_alloc_err++;
mlx5e_page_release(rq, di, true);
- goto wq_ll_pop;
+ return NULL;
}
/* queue up for recycling ..*/
@@ -782,7 +775,28 @@ void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
skb_reserve(skb, MLX5_RX_HEADROOM);
skb_put(skb, cqe_bcnt);
+ return skb;
+}
+
+void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
+{
+ struct mlx5e_rx_wqe *wqe;
+ __be16 wqe_counter_be;
+ struct sk_buff *skb;
+ u16 wqe_counter;
+ u32 cqe_bcnt;
+
+ wqe_counter_be = cqe->wqe_counter;
+ wqe_counter = be16_to_cpu(wqe_counter_be);
+ wqe = mlx5_wq_ll_get_wqe(&rq->wq, wqe_counter);
+ cqe_bcnt = be32_to_cpu(cqe->byte_cnt);
+
+ skb = skb_from_cqe(rq, cqe, wqe_counter, cqe_bcnt);
+ if (!skb)
+ goto wq_ll_pop;
+
mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb);
+ napi_gro_receive(rq->cq.napi, skb);
wq_ll_pop:
mlx5_wq_ll_pop(&rq->wq, wqe_counter_be,
@@ -861,6 +875,7 @@ void mlx5e_handle_rx_cqe_mpwrq(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
mlx5e_mpwqe_fill_rx_skb(rq, cqe, wi, cqe_bcnt, skb);
mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb);
+ napi_gro_receive(rq->cq.napi, skb);
mpwrq_cqe_out:
if (likely(wi->consumed_strides < rq->mpwqe_num_strides))
--
2.7.4
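Editorial note on the patch above (not part of the posted patch): a minimal userspace sketch of the control-flow shape the refactoring produces, assuming simplified stand-in types. The helper returns NULL when the completion must be dropped, and the caller still releases the ring entry on every path. All demo_* names are hypothetical, and the counters merely stand in for the real calls.

```c
#include <stddef.h>

struct demo_pkt {
	unsigned int len;
};

struct demo_rq {
	unsigned int delivered; /* stands in for complete + napi_gro_receive */
	unsigned int dropped;   /* stands in for the error and XDP paths */
	unsigned int popped;    /* stands in for mlx5_wq_ll_pop */
	struct demo_pkt slot;
};

/* Helper: return a packet, or NULL when the completion must be dropped. */
static struct demo_pkt *demo_pkt_from_completion(struct demo_rq *rq,
						 int ok, unsigned int len)
{
	if (!ok) {
		rq->dropped++;
		return NULL;
	}

	rq->slot.len = len;
	return &rq->slot;
}

/* Caller: deliver the packet if there is one, and always pop the entry. */
static void demo_handle_completion(struct demo_rq *rq, int ok, unsigned int len)
{
	struct demo_pkt *pkt = demo_pkt_from_completion(rq, ok, len);

	if (!pkt)
		goto pop;

	rq->delivered++;
pop:
	rq->popped++;
}
```

Keeping delivery in the caller rather than in the helper is what lets a second caller reuse the helper and deliver the packet differently.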
* [PATCH net-next 0/9] Mellanox 100G SRIOV offloads vlan push/pop
From: Saeed Mahameed @ 2016-09-22 17:01 UTC
To: David S. Miller
Cc: netdev, Or Gerlitz, Hadar Hen-Zion, Jiri Pirko, Andy Gospodarek,
Jesse Brandeburg, John Fastabend, Amir Vadai, Saeed Mahameed
Hi Dave,
From Or Gerlitz:
This series further enhances the SRIOV TC offloads of mlx5 to handle
the TC vlan push and pop actions. This serves a common use-case in
virtualization systems where the virtual switch adds (pushes) vlan tags
to packets sent from VMs and removes (pops) vlan tags before the packet
is received by the VM. We use the new E-Switch switchdev mode and the
TC vlan action to achieve this in SW defined SRIOV environments as well,
by offloading TC rules that contain this action along with forwarding
(TC mirred/redirect action) of the packet.
In the first patch we add some helpers that let offloading drivers access
the TC vlan action info. The next five patches don't add any new
functionality; they refactor and clean up the current code for use by the
later patches. The seventh patch deals with supporting vlans by the mlx5
e-switch in switchdev mode. The eighth patch does the vlan action offload
from TC and the last patch adds matching for vlans, as typically required
by TC flows that involve the vlan pop action.
The series was applied on top of commit 524605e "cxgb4: Convert to use simple_open()"
Thanks.
Or Gerlitz (9):
net_sched: act_vlan: add helper inlines to access tcf_vlan info
net/mlx5: E-Switch, Set the vport when registering the uplink rep
net/mlx5: E-Switch, Set vport representor fields explicitly on
registration
net/mlx5: E-Switch, Allow fine tuning of eswitch vport push/pop vlan
net/mlx5: Put elements related to offloaded TC rule in one struct
net/mlx5e: Refactor retrieval of skb from rx completion element (cqe)
net/mlx5: E-Switch, Support VLAN actions in the offloads mode
net/mlx5e: Add TC vlan action for SRIOV offloads
net/mlx5e: Add TC vlan match parsing
drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 27 ++-
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 74 +++++--
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 109 ++++++----
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 33 ++-
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 38 +++-
.../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 222 +++++++++++++++++++--
include/net/tc_act/tc_vlan.h | 25 +++
8 files changed, 446 insertions(+), 83 deletions(-)
--
2.7.4
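Editorial note on the cover letter above (not part of the posted series): a minimal userspace sketch of what pushing and popping a VLAN tag means at the frame level, assuming a plain 802.1Q tag (TPID 0x8100) inserted right after the two MAC addresses. This only illustrates the operation the series offloads to hardware; it is not how the driver implements it, and the demo_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAC_HDR_LEN  12     /* destination MAC + source MAC */
#define DEMO_VLAN_TAG_LEN 4      /* TPID (2 bytes) + TCI (2 bytes) */
#define DEMO_TPID_8021Q   0x8100

/*
 * Push a tag after the MAC addresses. The caller must guarantee
 * len >= 12 and room for len + 4 bytes. Returns the new frame length.
 */
static size_t demo_vlan_push(uint8_t *buf, size_t len, uint16_t vid)
{
	uint16_t tci = vid & 0x0fff; /* priority and DEI bits left at zero */

	memmove(buf + DEMO_MAC_HDR_LEN + DEMO_VLAN_TAG_LEN,
		buf + DEMO_MAC_HDR_LEN, len - DEMO_MAC_HDR_LEN);
	buf[12] = (uint8_t)(DEMO_TPID_8021Q >> 8);
	buf[13] = (uint8_t)(DEMO_TPID_8021Q & 0xff);
	buf[14] = (uint8_t)(tci >> 8);
	buf[15] = (uint8_t)(tci & 0xff);
	return len + DEMO_VLAN_TAG_LEN;
}

/*
 * Pop the outer tag and store its VLAN ID in *vid. Returns the new
 * frame length, or len unchanged if the frame carries no 802.1Q tag.
 */
static size_t demo_vlan_pop(uint8_t *buf, size_t len, uint16_t *vid)
{
	if (len < DEMO_MAC_HDR_LEN + DEMO_VLAN_TAG_LEN ||
	    buf[12] != (DEMO_TPID_8021Q >> 8) ||
	    buf[13] != (DEMO_TPID_8021Q & 0xff))
		return len;

	*vid = (uint16_t)(((buf[14] << 8) | buf[15]) & 0x0fff);
	memmove(buf + DEMO_MAC_HDR_LEN,
		buf + DEMO_MAC_HDR_LEN + DEMO_VLAN_TAG_LEN,
		len - DEMO_MAC_HDR_LEN - DEMO_VLAN_TAG_LEN);
	return len - DEMO_VLAN_TAG_LEN;
}
```

In the use-case described, the push side corresponds to traffic leaving a VM and the pop side to traffic about to be delivered to it.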
* [PATCH net-next 1/9] net_sched: act_vlan: add helper inlines to access tcf_vlan info
From: Saeed Mahameed @ 2016-09-22 17:01 UTC
To: David S. Miller
Cc: netdev, Or Gerlitz, Hadar Hen-Zion, Jiri Pirko, Andy Gospodarek,
Jesse Brandeburg, John Fastabend, Amir Vadai, Saeed Mahameed
In-Reply-To: <1474563709-17943-1-git-send-email-saeedm@mellanox.com>
From: Or Gerlitz <ogerlitz@mellanox.com>
Needed, e.g., by offloading drivers to pick the relevant attributes.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
include/net/tc_act/tc_vlan.h | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/include/net/tc_act/tc_vlan.h b/include/net/tc_act/tc_vlan.h
index 6b83588..48cca32 100644
--- a/include/net/tc_act/tc_vlan.h
+++ b/include/net/tc_act/tc_vlan.h
@@ -11,6 +11,7 @@
#define __NET_TC_VLAN_H
#include <net/act_api.h>
+#include <linux/tc_act/tc_vlan.h>
#define VLAN_F_POP 0x1
#define VLAN_F_PUSH 0x2
@@ -24,4 +25,28 @@ struct tcf_vlan {
};
#define to_vlan(a) ((struct tcf_vlan *)a)
+static inline bool is_tcf_vlan(const struct tc_action *a)
+{
+#ifdef CONFIG_NET_CLS_ACT
+ if (a->ops && a->ops->type == TCA_ACT_VLAN)
+ return true;
+#endif
+ return false;
+}
+
+static inline u32 tcf_vlan_action(const struct tc_action *a)
+{
+ return to_vlan(a)->tcfv_action;
+}
+
+static inline u16 tcf_vlan_push_vid(const struct tc_action *a)
+{
+ return to_vlan(a)->tcfv_push_vid;
+}
+
+static inline __be16 tcf_vlan_push_proto(const struct tc_action *a)
+{
+ return to_vlan(a)->tcfv_push_proto;
+}
+
#endif /* __NET_TC_VLAN_H */
--
2.7.4
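Editorial note on the patch above (not part of the posted patch): a minimal userspace sketch of how an offloading driver might consume accessor helpers like these while walking a rule's action list. Everything here is a hypothetical stand-in: demo_action replaces struct tc_action, demo_is_vlan plays the role of is_tcf_vlan, and the DEMO_VLAN_* values are illustrative rather than the kernel's constants.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the TC action types and VLAN operations. */
enum demo_act_type { DEMO_ACT_MIRRED, DEMO_ACT_VLAN };
enum demo_vlan_op { DEMO_VLAN_POP = 1, DEMO_VLAN_PUSH = 2 };

struct demo_action {
	enum demo_act_type type;
	enum demo_vlan_op vlan_op; /* meaningful only for DEMO_ACT_VLAN */
	uint16_t push_vid;         /* meaningful only for a push */
};

/* What the driver would program into hardware for this rule. */
struct demo_offload {
	bool pop;
	bool push;
	uint16_t vid;
};

/* Plays the role of is_tcf_vlan(): type check before touching VLAN data. */
static bool demo_is_vlan(const struct demo_action *a)
{
	return a->type == DEMO_ACT_VLAN;
}

/* Walk the actions; return 0 on success, -1 on an unsupported VLAN op. */
static int demo_parse_actions(const struct demo_action *acts, int n,
			      struct demo_offload *out)
{
	for (int i = 0; i < n; i++) {
		if (!demo_is_vlan(&acts[i]))
			continue;

		switch (acts[i].vlan_op) {
		case DEMO_VLAN_POP:
			out->pop = true;
			break;
		case DEMO_VLAN_PUSH:
			out->push = true;
			out->vid = acts[i].push_vid;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

The type check comes first because the VLAN-specific fields are only valid once the action is known to be a VLAN action, which mirrors why is_tcf_vlan is provided alongside the field accessors.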