* [PATCH v3 0/2] net: qcom/emac: add shared mdio bus support
From: Wang Dongsheng @ 2018-10-25 8:15 UTC (permalink / raw)
To: timur, andrew; +Cc: netdev, Wang Dongsheng, yu.zheng, f.fainelli
The emac includes an MDIO controller, and the motherboard has more than one
PHY connected to a single MDIO bus. So share that mii_bus with the other MAC
devices that have no MDIO bus of their own.
On ACPI, "phy-handle" cannot point directly to a _DSD sub-package, so we
use "phy-handle" to point to an internal MDIO device port.
The port describes the PHY address.
Tested: QDF2400 (ACPI), built-in/insmod/rmmod
V3:
- Add "phy-handle" support.
- Remove all of DT changes.
V2:
- Separate patch.
Wang Dongsheng (2):
net: qcom/emac: split phy_config to mdio bus create and get phy device
net: qcom/emac: add phy-handle support for ACPI
drivers/net/ethernet/qualcomm/emac/emac-phy.c | 183 ++++++++++++++----
1 file changed, 142 insertions(+), 41 deletions(-)
--
2.18.0
^ permalink raw reply
* RE: [PATCH v3 0/2] net: qcom/emac: add shared mdio bus support
From: Wang, Dongsheng @ 2018-10-25 8:16 UTC (permalink / raw)
To: timur@kernel.org, andrew@lunn.ch
Cc: netdev@vger.kernel.org, Zheng, Joey, f.fainelli@gmail.com
In-Reply-To: <20181025081503.15683-1-dongsheng.wang@hxt-semitech.com>
Sorry, please ignore this patch.
Cheers,
-Dongsheng
> -----Original Message-----
> From: Wang, Dongsheng
> Sent: Thursday, October 25, 2018 4:15 PM
> To: timur@kernel.org; andrew@lunn.ch
> Cc: netdev@vger.kernel.org; Wang, Dongsheng
> <dongsheng.wang@hxt-semitech.com>; Zheng, Joey
> <yu.zheng@hxt-semitech.com>; f.fainelli@gmail.com
> Subject: [PATCH v3 0/2] net: qcom/emac: add shared mdio bus support
>
> The emac includes an MDIO controller, and the motherboard has more than one
> PHY connected to a single MDIO bus. So share that mii_bus with the other MAC
> devices that have no MDIO bus of their own.
>
> On ACPI, "phy-handle" cannot point directly to a _DSD sub-package, so we
> use "phy-handle" to point to an internal MDIO device port.
> The port describes the PHY address.
>
> Tested: QDF2400 (ACPI), built-in/insmod/rmmod
>
> V3:
> - Add "phy-handle" support.
> - Remove all of DT changes.
>
> V2:
> - Separate patch.
>
> Wang Dongsheng (2):
> net: qcom/emac: split phy_config to mdio bus create and get phy device
> net: qcom/emac: add phy-handle support for ACPI
>
> drivers/net/ethernet/qualcomm/emac/emac-phy.c | 183 ++++++++++++++----
> 1 file changed, 142 insertions(+), 41 deletions(-)
>
> --
> 2.18.0
^ permalink raw reply
* Re: [PATCH nf] netfilter: ipv6: fix oops when defragmenting locally generated fragments
From: Pablo Neira Ayuso @ 2018-10-25 8:18 UTC (permalink / raw)
To: Florian Westphal
Cc: netfilter-devel, lorenzo, zenczykowski, edumazet, netdev, maze
In-Reply-To: <20181023144716.19746-1-fw@strlen.de>
On Tue, Oct 23, 2018 at 04:47:16PM +0200, Florian Westphal wrote:
> Unlike ipv4 and normal ipv6 defrag, netfilter ipv6 defragmentation did
> not save/restore skb->dst.
>
> This causes oops when handling locally generated ipv6 fragments, as
> output path needs a valid dst.
Applied, thanks!
^ permalink raw reply
* Re: [PATCH bpf-next 6/6] selftests/bpf: test_verifier, check bpf_map_lookup_elem access in bpf prog
From: Naresh Kamboju @ 2018-10-25 8:54 UTC (permalink / raw)
To: liu.song.a23, bhole_prashant_q7
Cc: ast, Daniel Borkmann, jakub.kicinski, David S. Miller,
quentin.monnet, netdev, open list:KERNEL SELFTEST FRAMEWORK
In-Reply-To: <CAPhsuW72jhD+962NjSyxPrMhoeE9d24ArEVm0oDsP4FV46nNVA@mail.gmail.com>
On Tue, 9 Oct 2018 at 12:32, Song Liu <liu.song.a23@gmail.com> wrote:
>
> On Mon, Oct 8, 2018 at 6:07 PM Prashant Bhole
> <bhole_prashant_q7@lab.ntt.co.jp> wrote:
> >
> > map_lookup_elem isn't supported by certain map types like:
> > - BPF_MAP_TYPE_PROG_ARRAY
> > - BPF_MAP_TYPE_STACK_TRACE
> > - BPF_MAP_TYPE_XSKMAP
> > - BPF_MAP_TYPE_SOCKMAP/BPF_MAP_TYPE_SOCKHASH
> > Let's add verifier tests to check whether the verifier prevents
> > bpf_map_lookup_elem calls on the above map types from a bpf program.
> >
> > Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
> > Acked-by: Alexei Starovoitov <ast@kernel.org>
> Acked-by: Song Liu <songliubraving@fb.com>
>
> > ---
> > tools/testing/selftests/bpf/test_verifier.c | 121 +++++++++++++++++++-
> > 1 file changed, 120 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> > index 65ae44c85d27..cf4cd32b6772 100644
> > --- a/tools/testing/selftests/bpf/test_verifier.c
> > +++ b/tools/testing/selftests/bpf/test_verifier.c
> > @@ -48,7 +48,7 @@
> >
> > #define MAX_INSNS BPF_MAXINSNS
> > #define MAX_FIXUPS 8
> > -#define MAX_NR_MAPS 8
> > +#define MAX_NR_MAPS 13
> > #define POINTER_VALUE 0xcafe4all
> > #define TEST_DATA_LEN 64
> >
> > @@ -65,6 +65,10 @@ struct bpf_test {
> > int fixup_map_hash_48b[MAX_FIXUPS];
> > int fixup_map_hash_16b[MAX_FIXUPS];
> > int fixup_map_array_48b[MAX_FIXUPS];
> > + int fixup_map_sockmap[MAX_FIXUPS];
> > + int fixup_map_sockhash[MAX_FIXUPS];
> > + int fixup_map_xskmap[MAX_FIXUPS];
> > + int fixup_map_stacktrace[MAX_FIXUPS];
> > int fixup_prog1[MAX_FIXUPS];
> > int fixup_prog2[MAX_FIXUPS];
> > int fixup_map_in_map[MAX_FIXUPS];
> > @@ -4541,6 +4545,85 @@ static struct bpf_test tests[] = {
> > .errstr = "invalid access to packet",
> > .prog_type = BPF_PROG_TYPE_SCHED_CLS,
> > },
> > + {
> > + "prevent map lookup in sockmap",
> > + .insns = {
> > + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
> > + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> > + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> > + BPF_LD_MAP_FD(BPF_REG_1, 0),
> > + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> > + BPF_FUNC_map_lookup_elem),
> > + BPF_EXIT_INSN(),
> > + },
> > + .fixup_map_sockmap = { 3 },
> > + .result = REJECT,
> > + .errstr = "cannot pass map_type 15 into func bpf_map_lookup_elem",
> > + .prog_type = BPF_PROG_TYPE_SOCK_OPS,
> > + },
> > + {
> > + "prevent map lookup in sockhash",
> > + .insns = {
> > + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
> > + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> > + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> > + BPF_LD_MAP_FD(BPF_REG_1, 0),
> > + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> > + BPF_FUNC_map_lookup_elem),
> > + BPF_EXIT_INSN(),
> > + },
> > + .fixup_map_sockhash = { 3 },
> > + .result = REJECT,
> > + .errstr = "cannot pass map_type 18 into func bpf_map_lookup_elem",
> > + .prog_type = BPF_PROG_TYPE_SOCK_OPS,
> > + },
> > + {
> > + "prevent map lookup in xskmap",
> > + .insns = {
> > + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
> > + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> > + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> > + BPF_LD_MAP_FD(BPF_REG_1, 0),
> > + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> > + BPF_FUNC_map_lookup_elem),
> > + BPF_EXIT_INSN(),
> > + },
> > + .fixup_map_xskmap = { 3 },
> > + .result = REJECT,
> > + .errstr = "cannot pass map_type 17 into func bpf_map_lookup_elem",
> > + .prog_type = BPF_PROG_TYPE_XDP,
> > + },
> > + {
> > + "prevent map lookup in stack trace",
> > + .insns = {
> > + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
> > + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> > + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> > + BPF_LD_MAP_FD(BPF_REG_1, 0),
> > + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> > + BPF_FUNC_map_lookup_elem),
> > + BPF_EXIT_INSN(),
> > + },
> > + .fixup_map_stacktrace = { 3 },
> > + .result = REJECT,
> > + .errstr = "cannot pass map_type 7 into func bpf_map_lookup_elem",
> > + .prog_type = BPF_PROG_TYPE_PERF_EVENT,
> > + },
> > + {
> > + "prevent map lookup in prog array",
> > + .insns = {
> > + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
> > + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> > + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> > + BPF_LD_MAP_FD(BPF_REG_1, 0),
> > + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> > + BPF_FUNC_map_lookup_elem),
> > + BPF_EXIT_INSN(),
> > + },
> > + .fixup_prog2 = { 3 },
> > + .result = REJECT,
> > + .errstr = "cannot pass map_type 3 into func bpf_map_lookup_elem",
> > + },
> > {
> > "valid map access into an array with a constant",
> > .insns = {
> > @@ -13515,6 +13598,10 @@ static void do_test_fixup(struct bpf_test *test, enum bpf_map_type prog_type,
> > int *fixup_map_hash_48b = test->fixup_map_hash_48b;
> > int *fixup_map_hash_16b = test->fixup_map_hash_16b;
> > int *fixup_map_array_48b = test->fixup_map_array_48b;
> > + int *fixup_map_sockmap = test->fixup_map_sockmap;
> > + int *fixup_map_sockhash = test->fixup_map_sockhash;
> > + int *fixup_map_xskmap = test->fixup_map_xskmap;
> > + int *fixup_map_stacktrace = test->fixup_map_stacktrace;
> > int *fixup_prog1 = test->fixup_prog1;
> > int *fixup_prog2 = test->fixup_prog2;
> > int *fixup_map_in_map = test->fixup_map_in_map;
> > @@ -13603,6 +13690,38 @@ static void do_test_fixup(struct bpf_test *test, enum bpf_map_type prog_type,
> > fixup_percpu_cgroup_storage++;
> > } while (*fixup_percpu_cgroup_storage);
> > }
> > + if (*fixup_map_sockmap) {
> > + map_fds[9] = create_map(BPF_MAP_TYPE_SOCKMAP, sizeof(int),
> > + sizeof(int), 1);
> > + do {
> > + prog[*fixup_map_sockmap].imm = map_fds[9];
> > + fixup_map_sockmap++;
> > + } while (*fixup_map_sockmap);
> > + }
> > + if (*fixup_map_sockhash) {
> > + map_fds[10] = create_map(BPF_MAP_TYPE_SOCKHASH, sizeof(int),
> > + sizeof(int), 1);
> > + do {
> > + prog[*fixup_map_sockhash].imm = map_fds[10];
> > + fixup_map_sockhash++;
> > + } while (*fixup_map_sockhash);
> > + }
> > + if (*fixup_map_xskmap) {
> > + map_fds[11] = create_map(BPF_MAP_TYPE_XSKMAP, sizeof(int),
> > + sizeof(int), 1);
> > + do {
> > + prog[*fixup_map_xskmap].imm = map_fds[11];
> > + fixup_map_xskmap++;
> > + } while (*fixup_map_xskmap);
> > + }
selftests: bpf: test_verifier sockmap, sockhash, xskmap failed on
mainline and next
(from 4.19.0-rc7-next-20181011 to date)
Are we missing any prerequisite kernel configs?
Test log,
------------
selftests: bpf: test_verifier
<>
#274/p prevent map lookup in sockmap Failed to create hash map
'Invalid argument'!
FAIL
Unexpected error message!
EXP: cannot pass map_type 15 into func bpf_map_lookup_elem
RES: fd -1 is not pointing to valid bpf_map
fd -1 is not pointing to valid bpf_map
#275/p prevent map lookup in sockhash Failed to create hash map
'Invalid argument'!
FAIL
Unexpected error message!
EXP: cannot pass map_type 18 into func bpf_map_lookup_elem
RES: fd -1 is not pointing to valid bpf_map
fd -1 is not pointing to valid bpf_map
#276/p prevent map lookup in xskmap Failed to create hash map 'Invalid
argument'!
FAIL
Unexpected error message!
EXP: cannot pass map_type 17 into func bpf_map_lookup_elem
RES: fd -1 is not pointing to valid bpf_map
fd -1 is not pointing to valid bpf_map
<>
Summary: 962 PASSED, 0 SKIPPED, 3 FAILED
not ok 1..1 selftests: bpf: test_verifier [FAIL]
selftests: bpf_test_verifier [FAIL]
-mainline results history,
https://qa-reports.linaro.org/lkft/linux-next-oe/tests/kselftest/bpf_test_verifier
-next results history,
https://qa-reports.linaro.org/lkft/linux-next-oe/tests/kselftest/bpf_test_verifier
Test case full log,
https://lkft.validation.linaro.org/scheduler/job/461881#L1655
Best regards
Naresh Kamboju
> > + if (*fixup_map_stacktrace) {
> > + map_fds[12] = create_map(BPF_MAP_TYPE_STACK_TRACE, sizeof(u32),
> > + sizeof(u64), 1);
> > + do {
> > + prog[*fixup_map_stacktrace].imm = map_fds[12];
> > + fixup_map_stacktrace++;
> > + } while (*fixup_map_stacktrace);
> > + }
> > }
> >
> > static void do_test_single(struct bpf_test *test, bool unpriv,
> > --
> > 2.17.1
> >
> >
^ permalink raw reply
* Re: Regression: kernel 4.14 an later very slow with many ipsec tunnels
From: David Miller @ 2018-10-25 17:34 UTC (permalink / raw)
To: linux
Cc: netdev, fw, steffen.klassert, linux-kernel, torvalds,
christophe.gouault, gregkh
In-Reply-To: <2766296.15tpkxTHJV@stwm.de>
From: Wolfgang Walter <linux@stwm.de>
Date: Thu, 25 Oct 2018 11:38:19 +0200
> there is now a new 4.19 which still has the big performance regression when
> many ipsec tunnels are configured (throughput and latency get worse by 10 to
> 50 times) which makes any kernel > 4.9 unusable for our routers.
>
> I still don't understand why a revert of the flow cache removal at least for
> the longterm kernels is that a bad option (maybe as a compile time option),
> especially as there is no workaround available.
You do know that the flow cache is DDoS targetable, right?
That's why we removed it, we did not make the change lightly.
Adding a DDoS vector back into the kernel is not an option, sorry.
Please work diligently with Florian and others to try and find ways to
soften the performance hit.
Thank you.
^ permalink raw reply
* Re: [PATCH ghak90 (was ghak32) V4 03/10] audit: log container info of syscalls
From: Richard Guy Briggs @ 2018-10-25 17:38 UTC (permalink / raw)
To: Steve Grubb
Cc: Paul Moore, simo, carlos, linux-api, containers, linux-kernel,
dhowells, linux-audit, netfilter-devel, ebiederm, luto, netdev,
linux-fsdevel, Eric Paris, Serge Hallyn, viro
In-Reply-To: <20181025175745.5b2b13e9@ivy-bridge>
On 2018-10-25 17:57, Steve Grubb wrote:
> On Thu, 25 Oct 2018 08:27:32 -0400
> Richard Guy Briggs <rgb@redhat.com> wrote:
>
> > On 2018-10-25 06:49, Paul Moore wrote:
> > > On Thu, Oct 25, 2018 at 2:06 AM Steve Grubb <sgrubb@redhat.com>
> > > wrote:
> > > > On Wed, 24 Oct 2018 20:42:55 -0400
> > > > Richard Guy Briggs <rgb@redhat.com> wrote:
> > > > > On 2018-10-24 16:55, Paul Moore wrote:
> > > > > > On Wed, Oct 24, 2018 at 11:15 AM Richard Guy Briggs
> > > > > > <rgb@redhat.com> wrote:
> > > > > > > On 2018-10-19 19:16, Paul Moore wrote:
> > > > > > > > On Sun, Aug 5, 2018 at 4:32 AM Richard Guy Briggs
> > > > > > > > <rgb@redhat.com> wrote:
> > >
> > > ...
> > >
> > > > > > > > > +/*
> > > > > > > > > + * audit_log_contid - report container info
> > > > > > > > > + * @tsk: task to be recorded
> > > > > > > > > + * @context: task or local context for record
> > > > > > > > > + * @op: contid string description
> > > > > > > > > + */
> > > > > > > > > > +int audit_log_contid(struct task_struct *tsk,
> > > > > > > > > > + struct audit_context *context, char *op)
> > > > > > > > > > +{
> > > > > > > > > > + struct audit_buffer *ab;
> > > > > > > > > > +
> > > > > > > > > > + if (!audit_contid_set(tsk))
> > > > > > > > > > + return 0;
> > > > > > > > > > + /* Generate AUDIT_CONTAINER record with container ID */
> > > > > > > > > > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
> > > > > > > > > + if (!ab)
> > > > > > > > > + return -ENOMEM;
> > > > > > > > > + audit_log_format(ab, "op=%s contid=%llu",
> > > > > > > > > + op, audit_get_contid(tsk));
> > > > > > > > > + audit_log_end(ab);
> > > > > > > > > + return 0;
> > > > > > > > > +}
> > > > > > > > > +EXPORT_SYMBOL(audit_log_contid);
> > > > > > > >
> > > > > > > > As discussed in the previous iteration of the patch, I
> > > > > > > > prefer AUDIT_CONTAINER_ID here over AUDIT_CONTAINER. If
> > > > > > > > you feel strongly about keeping it as-is with
> > > > > > > > AUDIT_CONTAINER I suppose I could live with that, but it
> > > > > > > > > isn't my first choice.
> > > > > > >
> > > > > > > I don't have a strong opinion on this one, mildly
> > > > > > > preferring the shorter one only because it is shorter.
> > > > > >
> > > > > > We already have multiple AUDIT_CONTAINER* record types, so it
> > > > > > seems as though we should use "AUDIT_CONTAINER" as a prefix
> > > > > > of sorts, rather than a type itself.
> > > > >
> > > > > I'm fine with that. I'd still like to hear Steve's input. He
> > > > > had stronger opinions than me.
> > > >
> > > > The creation event should be separate and distinct from the
> > > > > continuing use when it's used as a supplemental record. IOW,
> > > > binding the ID to a container is part of the lifecycle and needs
> > > > to be kept distinct.
> > >
> > > Steve's comment is pretty ambiguous when it comes to AUDIT_CONTAINER
> > > vs AUDIT_CONTAINER_ID, but one could argue that AUDIT_CONTAINER_ID
> > > helps distinguish the audit container id marking record and gets to
> > > what I believe is the spirit of Steve's comment. Taking this in
> > > context with my previous remarks, let's switch to using
> > > AUDIT_CONTAINER_ID.
> >
> > I suspect Steve is mixing up AUDIT_CONTAINER_OP with
> > AUDIT_CONTAINER_ID, confusing the fact that they are two separate
> > records. As a summary, the suggested records are:
> > CONTAINER_OP audit container identifier creation
> > CONTAINER audit container identifier aux record to an
> > event
> >
> > and what Paul is suggesting (which is fine by me) is:
> > CONTAINER_OP audit container identifier creation event
> > CONTAINER_ID audit container identifier aux record to
> > an event
> >
> > Steve, please indicate you are fine with this.
>
> I thought it was:
It *was*. It was changed at Paul's request in this v3 thread:
https://www.redhat.com/archives/linux-audit/2018-July/msg00087.html
And listed in the examples and changelog to this v4 patchset:
https://www.redhat.com/archives/linux-audit/2018-July/msg00178.html
It is also listed in this userspace patchset update v4 (which should
also have had a changelog added to it, note to self...):
https://www.redhat.com/archives/linux-audit/2018-July/msg00189.html
I realize it is hard to keep up with all the detail changes in these
patchsets...
> CONTAINER_ID audit container identifier creation event
> CONTAINER audit container identifier aux record to an event
>
> Or vice versa. Don't mix up creation of the identifier with operations.
Exactly what I'm trying to avoid... Worded another way: "Don't mix up
the creation operation with routine reporting of the identifier in
events." Steve, can you and Paul discuss and agree on what they should
be called? I don't have a horse in this race, but I need to record the
result of that run. ;-)
> -Steve
- RGB
--
Richard Guy Briggs <rgb@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635
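For readers following the naming question: the quoted audit_log_contid() formats its payload as "op=%s contid=%llu", and as far as this thread shows, the choice between AUDIT_CONTAINER and AUDIT_CONTAINER_ID only changes the record type passed to audit_log_start(). The sketch below reproduces just the payload formatting in user space. sketch_format_contid and the "task" op string are made up for illustration and are not kernel APIs.

```c
#include <stdio.h>

/*
 * Format the payload of the auxiliary container record the way the
 * quoted patch does. Returns what snprintf() returns: the length the
 * full string needs, excluding the terminating NUL.
 */
int sketch_format_contid(char *buf, size_t len, const char *op,
			 unsigned long long contid)
{
	return snprintf(buf, len, "op=%s contid=%llu", op, contid);
}
```

Calling it with op "task" and contid 123456 fills the buffer with "op=task contid=123456".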
^ permalink raw reply
* Re: [PATCH bpf-next 6/6] selftests/bpf: test_verifier, check bpf_map_lookup_elem access in bpf prog
From: Prashant Bhole @ 2018-10-25 9:09 UTC (permalink / raw)
To: Naresh Kamboju, liu.song.a23
Cc: ast, Daniel Borkmann, jakub.kicinski, David S. Miller,
quentin.monnet, netdev, open list:KERNEL SELFTEST FRAMEWORK
In-Reply-To: <CA+G9fYtT78hr4i6JDdeKBNoO4G_NMSVWUTyxMoU4_1ZfqGdthg@mail.gmail.com>
On 10/25/2018 5:54 PM, Naresh Kamboju wrote:
> On Tue, 9 Oct 2018 at 12:32, Song Liu <liu.song.a23@gmail.com> wrote:
>>
>> On Mon, Oct 8, 2018 at 6:07 PM Prashant Bhole
>> <bhole_prashant_q7@lab.ntt.co.jp> wrote:
>>>
>>> map_lookup_elem isn't supported by certain map types like:
>>> - BPF_MAP_TYPE_PROG_ARRAY
>>> - BPF_MAP_TYPE_STACK_TRACE
>>> - BPF_MAP_TYPE_XSKMAP
>>> - BPF_MAP_TYPE_SOCKMAP/BPF_MAP_TYPE_SOCKHASH
>>> Let's add verifier tests to check whether the verifier prevents
>>> bpf_map_lookup_elem calls on the above map types from a bpf program.
>>>
>>> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
>>> Acked-by: Alexei Starovoitov <ast@kernel.org>
>> Acked-by: Song Liu <songliubraving@fb.com>
>>
>>> ---
>>> tools/testing/selftests/bpf/test_verifier.c | 121 +++++++++++++++++++-
>>> 1 file changed, 120 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
>>> index 65ae44c85d27..cf4cd32b6772 100644
>>> --- a/tools/testing/selftests/bpf/test_verifier.c
>>> +++ b/tools/testing/selftests/bpf/test_verifier.c
>>> @@ -48,7 +48,7 @@
>>>
>>> #define MAX_INSNS BPF_MAXINSNS
>>> #define MAX_FIXUPS 8
>>> -#define MAX_NR_MAPS 8
>>> +#define MAX_NR_MAPS 13
>>> #define POINTER_VALUE 0xcafe4all
>>> #define TEST_DATA_LEN 64
>>>
>>> @@ -65,6 +65,10 @@ struct bpf_test {
>>> int fixup_map_hash_48b[MAX_FIXUPS];
>>> int fixup_map_hash_16b[MAX_FIXUPS];
>>> int fixup_map_array_48b[MAX_FIXUPS];
>>> + int fixup_map_sockmap[MAX_FIXUPS];
>>> + int fixup_map_sockhash[MAX_FIXUPS];
>>> + int fixup_map_xskmap[MAX_FIXUPS];
>>> + int fixup_map_stacktrace[MAX_FIXUPS];
>>> int fixup_prog1[MAX_FIXUPS];
>>> int fixup_prog2[MAX_FIXUPS];
>>> int fixup_map_in_map[MAX_FIXUPS];
>>> @@ -4541,6 +4545,85 @@ static struct bpf_test tests[] = {
>>> .errstr = "invalid access to packet",
>>> .prog_type = BPF_PROG_TYPE_SCHED_CLS,
>>> },
>>> + {
>>> + "prevent map lookup in sockmap",
>>> + .insns = {
>>> + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
>>> + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
>>> + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
>>> + BPF_LD_MAP_FD(BPF_REG_1, 0),
>>> + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
>>> + BPF_FUNC_map_lookup_elem),
>>> + BPF_EXIT_INSN(),
>>> + },
>>> + .fixup_map_sockmap = { 3 },
>>> + .result = REJECT,
>>> + .errstr = "cannot pass map_type 15 into func bpf_map_lookup_elem",
>>> + .prog_type = BPF_PROG_TYPE_SOCK_OPS,
>>> + },
>>> + {
>>> + "prevent map lookup in sockhash",
>>> + .insns = {
>>> + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
>>> + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
>>> + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
>>> + BPF_LD_MAP_FD(BPF_REG_1, 0),
>>> + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
>>> + BPF_FUNC_map_lookup_elem),
>>> + BPF_EXIT_INSN(),
>>> + },
>>> + .fixup_map_sockhash = { 3 },
>>> + .result = REJECT,
>>> + .errstr = "cannot pass map_type 18 into func bpf_map_lookup_elem",
>>> + .prog_type = BPF_PROG_TYPE_SOCK_OPS,
>>> + },
>>> + {
>>> + "prevent map lookup in xskmap",
>>> + .insns = {
>>> + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
>>> + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
>>> + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
>>> + BPF_LD_MAP_FD(BPF_REG_1, 0),
>>> + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
>>> + BPF_FUNC_map_lookup_elem),
>>> + BPF_EXIT_INSN(),
>>> + },
>>> + .fixup_map_xskmap = { 3 },
>>> + .result = REJECT,
>>> + .errstr = "cannot pass map_type 17 into func bpf_map_lookup_elem",
>>> + .prog_type = BPF_PROG_TYPE_XDP,
>>> + },
>>> + {
>>> + "prevent map lookup in stack trace",
>>> + .insns = {
>>> + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
>>> + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
>>> + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
>>> + BPF_LD_MAP_FD(BPF_REG_1, 0),
>>> + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
>>> + BPF_FUNC_map_lookup_elem),
>>> + BPF_EXIT_INSN(),
>>> + },
>>> + .fixup_map_stacktrace = { 3 },
>>> + .result = REJECT,
>>> + .errstr = "cannot pass map_type 7 into func bpf_map_lookup_elem",
>>> + .prog_type = BPF_PROG_TYPE_PERF_EVENT,
>>> + },
>>> + {
>>> + "prevent map lookup in prog array",
>>> + .insns = {
>>> + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
>>> + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
>>> + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
>>> + BPF_LD_MAP_FD(BPF_REG_1, 0),
>>> + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
>>> + BPF_FUNC_map_lookup_elem),
>>> + BPF_EXIT_INSN(),
>>> + },
>>> + .fixup_prog2 = { 3 },
>>> + .result = REJECT,
>>> + .errstr = "cannot pass map_type 3 into func bpf_map_lookup_elem",
>>> + },
>>> {
>>> "valid map access into an array with a constant",
>>> .insns = {
>>> @@ -13515,6 +13598,10 @@ static void do_test_fixup(struct bpf_test *test, enum bpf_map_type prog_type,
>>> int *fixup_map_hash_48b = test->fixup_map_hash_48b;
>>> int *fixup_map_hash_16b = test->fixup_map_hash_16b;
>>> int *fixup_map_array_48b = test->fixup_map_array_48b;
>>> + int *fixup_map_sockmap = test->fixup_map_sockmap;
>>> + int *fixup_map_sockhash = test->fixup_map_sockhash;
>>> + int *fixup_map_xskmap = test->fixup_map_xskmap;
>>> + int *fixup_map_stacktrace = test->fixup_map_stacktrace;
>>> int *fixup_prog1 = test->fixup_prog1;
>>> int *fixup_prog2 = test->fixup_prog2;
>>> int *fixup_map_in_map = test->fixup_map_in_map;
>>> @@ -13603,6 +13690,38 @@ static void do_test_fixup(struct bpf_test *test, enum bpf_map_type prog_type,
>>> fixup_percpu_cgroup_storage++;
>>> } while (*fixup_percpu_cgroup_storage);
>>> }
>>> + if (*fixup_map_sockmap) {
>>> + map_fds[9] = create_map(BPF_MAP_TYPE_SOCKMAP, sizeof(int),
>>> + sizeof(int), 1);
>>> + do {
>>> + prog[*fixup_map_sockmap].imm = map_fds[9];
>>> + fixup_map_sockmap++;
>>> + } while (*fixup_map_sockmap);
>>> + }
>>> + if (*fixup_map_sockhash) {
>>> + map_fds[10] = create_map(BPF_MAP_TYPE_SOCKHASH, sizeof(int),
>>> + sizeof(int), 1);
>>> + do {
>>> + prog[*fixup_map_sockhash].imm = map_fds[10];
>>> + fixup_map_sockhash++;
>>> + } while (*fixup_map_sockhash);
>>> + }
>>> + if (*fixup_map_xskmap) {
>>> + map_fds[11] = create_map(BPF_MAP_TYPE_XSKMAP, sizeof(int),
>>> + sizeof(int), 1);
>>> + do {
>>> + prog[*fixup_map_xskmap].imm = map_fds[11];
>>> + fixup_map_xskmap++;
>>> + } while (*fixup_map_xskmap);
>>> + }
>
> selftests: bpf: test_verifier sockmap, sockhash, xskmap failed on
> mainline and next
> (from 4.19.0-rc7-next-20181011 to date)
> Are we missing any prerequisite kernel configs?
sockmap/sockhash depend on CONFIG_BPF_STREAM_PARSER, and xskmap
depends on CONFIG_XDP_SOCKETS.
Reference: include/linux/bpf_types.h
-Prashant
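One way a test harness could tell "map type not built into this kernel" apart from a real failure is to look at the errno from map creation. The log in this thread shows 'Invalid argument' (EINVAL) for all three failing maps. The sketch below is only an illustration of that idea, not what test_verifier.c does. The enum and sketch_classify_map_fd are hypothetical names, and EINVAL can also come from other bad attributes, so a skip decided this way is a guess rather than a proof.

```c
#include <errno.h>

/* Hypothetical outcome of trying to create a map for a test. */
enum sketch_outcome {
	SKETCH_RUN,	/* map was created, run the test */
	SKETCH_SKIP,	/* creation failed with EINVAL, map type likely not configured */
	SKETCH_FAIL,	/* any other error, report it */
};

/* Classify the result of a map creation call from its fd and saved errno. */
enum sketch_outcome sketch_classify_map_fd(int fd, int err)
{
	if (fd >= 0)
		return SKETCH_RUN;
	if (err == EINVAL)
		return SKETCH_SKIP;
	return SKETCH_FAIL;
}
```

A caller would save errno right after the creation call and pass it in, so that later library calls cannot overwrite it.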
>
> Test log,
> ------------
> selftests: bpf: test_verifier
> <>
> #274/p prevent map lookup in sockmap Failed to create hash map
> 'Invalid argument'!
> FAIL
> Unexpected error message!
> EXP: cannot pass map_type 15 into func bpf_map_lookup_elem
> RES: fd -1 is not pointing to valid bpf_map
> fd -1 is not pointing to valid bpf_map
> #275/p prevent map lookup in sockhash Failed to create hash map
> 'Invalid argument'!
> FAIL
> Unexpected error message!
> EXP: cannot pass map_type 18 into func bpf_map_lookup_elem
> RES: fd -1 is not pointing to valid bpf_map
> fd -1 is not pointing to valid bpf_map
> #276/p prevent map lookup in xskmap Failed to create hash map 'Invalid
> argument'!
> FAIL
> Unexpected error message!
> EXP: cannot pass map_type 17 into func bpf_map_lookup_elem
> RES: fd -1 is not pointing to valid bpf_map
> fd -1 is not pointing to valid bpf_map
> <>
> Summary: 962 PASSED, 0 SKIPPED, 3 FAILED
> not ok 1..1 selftests: bpf: test_verifier [FAIL]
> selftests: bpf_test_verifier [FAIL]
>
> -mainline results history,
> https://qa-reports.linaro.org/lkft/linux-next-oe/tests/kselftest/bpf_test_verifier
>
> -next results history,
> https://qa-reports.linaro.org/lkft/linux-next-oe/tests/kselftest/bpf_test_verifier
>
> Test case full log,
> https://lkft.validation.linaro.org/scheduler/job/461881#L1655
>
> Best regards
> Naresh Kamboju
>
>>> + if (*fixup_map_stacktrace) {
>>> + map_fds[12] = create_map(BPF_MAP_TYPE_STACK_TRACE, sizeof(u32),
>>> + sizeof(u64), 1);
>>> + do {
>>> + prog[*fixup_map_stacktrace].imm = map_fds[12];
>>> + fixup_map_stacktrace++;
>>> + } while (*fixup_map_stacktrace);
>>> + }
>>> }
>>>
>>> static void do_test_single(struct bpf_test *test, bool unpriv,
>>> --
>>> 2.17.1
>>>
>>>
>
>
^ permalink raw reply
* Re: Regression in 4.19 net/phy/realtek: garbled sysfs output
From: Holger Hoffstätte @ 2018-10-25 9:29 UTC (permalink / raw)
To: Andrew Lunn; +Cc: Netdev, Jassi Brar, David S. Miller, Heiner Kallweit
In-Reply-To: <20181024201219.GB19440@lunn.ch>
On 10/24/18 22:12, Andrew Lunn wrote:
> On Wed, Oct 24, 2018 at 09:36:02PM +0200, Holger Hoffstätte wrote:
>> Hi,
>>
>> Since 4.19 r8169 depends on phylib:
>>
>> $lsmod | grep r8169
>> r8169 81920 0
>> libphy 57344 2 r8169,realtek
>>
>> Unfortunately this now gives me the following sysfs error:
>>
>> $cd /sys/module/realtek/drivers
>> $ls -l
>> ls: cannot access 'mdio_bus:RTL8201F 10/100Mbps Ethernet': No such file or directory
>> total 0
>> lrwxrwxrwx 1 root root 0 Oct 24 21:09 'mdio_bus:RTL8201CP Ethernet' -> '../../../bus/mdio_bus/drivers/RTL8201CP Ethernet'
>> l????????? ? ? ? ? ? 'mdio_bus:RTL8201F 10/100Mbps Ethernet'
>> lrwxrwxrwx 1 root root 0 Oct 24 21:09 'mdio_bus:RTL8211 Gigabit Ethernet' -> '../../../bus/mdio_bus/drivers/RTL8211 Gigabit Ethernet'
>> [..]
>>
>> Apparently the forward slash in "10/100Mbps Ethernet" is interpreted as
>> directory separator that leads nowhere, and was introduced in commit
>> 513588dd44b ("net: phy: realtek: add RTL8201F phy-id and functions").
>>
>> Would it be acceptable to change the name simply to "RTL8201F Ethernet"?
>
> Hi Holger
>
> Or use "RTL8201F Fast Ethernet"
Yes, even better since it's correct. :)
As expected, changing the .name entry fixes the sysfs behaviour.
diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index 7fc8508b5..271e8adc3 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -220,7 +220,7 @@ static struct phy_driver realtek_drvs[] = {
.flags = PHY_HAS_INTERRUPT,
}, {
.phy_id = 0x001cc816,
- .name = "RTL8201F 10/100Mbps Ethernet",
+ .name = "RTL8201F Fast Ethernet",
.phy_id_mask = 0x001fffff,
.features = PHY_BASIC_FEATURES,
.flags = PHY_HAS_INTERRUPT,
> I wonder if other drivers have similar problems?
>
> davicom.c: .name = "Davicom DM9161B/C",
> intel-xway.c: .name = "Intel XWAY PHY11G (PEF 7071/PEF 7072) v1.3",
> intel-xway.c: .name = "Intel XWAY PHY11G (PEF 7071/PEF 7072) v1.4",
> intel-xway.c: .name = "Intel XWAY PHY11G (PEF 7071/PEF 7072) v1.5 / v1.6",
> intel-xway.c: .name = "Intel XWAY PHY22F (PEF 7061) v1.5 / v1.6",
> smsc.c: .name = "SMSC LAN8710/LAN8720",
I'm open to suggestions about how to rename those identifiers.
"|" seems to work but IMHO looks a bit weird:
"Davicom DM9161B/C" -> "Davicom DM9161B|C"
"(PEF 7071/PEF 7072) v1.5 / v1.6" -> "(PEF 7071|7072) v1.5|6"
We can go full regex, which will probably get me voted off the island:
"(PEF 7071/PEF 7072) v1.5 / v1.6" -> "(PEF {7071,7072}) v1.{5,6}"
Cast your votes now!
cheers,
Holger
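The check discussed above (does a driver name contain a '/' that ends up in a sysfs path) is easy to express in plain C. The following is a user-space sketch only. sketch_name_is_sysfs_safe and sketch_sanitize_name are hypothetical helpers, not phylib functions, and the replacement character is left to the caller since the thread has not settled on one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A name is usable as a single path component if it is non-empty and has no '/'. */
bool sketch_name_is_sysfs_safe(const char *name)
{
	return name != NULL && name[0] != '\0' && strchr(name, '/') == NULL;
}

/*
 * Copy name into out, replacing every '/' with repl.
 * Returns the number of characters replaced, or -1 if out is too small.
 */
int sketch_sanitize_name(const char *name, char repl, char *out, size_t len)
{
	size_t i;
	int replaced = 0;

	if (len == 0 || strlen(name) >= len)
		return -1;
	for (i = 0; name[i] != '\0'; i++) {
		if (name[i] == '/') {
			out[i] = repl;
			replaced++;
		} else {
			out[i] = name[i];
		}
	}
	out[i] = '\0';
	return replaced;
}
```

Run against the names listed in this thread, the check rejects "RTL8201F 10/100Mbps Ethernet" and accepts "RTL8201F Fast Ethernet", and sanitizing "Davicom DM9161B/C" with '|' gives "Davicom DM9161B|C".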
^ permalink raw reply related
* Re: [PATCH net-next 0/4] net: ethernet: ti: cpsw: fix vlan mcast
From: David Miller @ 2018-10-25 18:34 UTC (permalink / raw)
To: ivan.khoronzhuk
Cc: grygorii.strashko, linux-omap, netdev, linux-kernel,
alexander.h.duyck, bjorn
In-Reply-To: <20181024221059.21834-1-ivan.khoronzhuk@linaro.org>
From: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Date: Thu, 25 Oct 2018 01:10:55 +0300
> The cpsw holds separate mcast entries for vlan entries. At this moment
> the driver adds only non-vlan mcast addresses, omitting vlan/mcast entries.
> As a result, mcast for vlans doesn't work. It can be fixed by adding the same
> mcast entries for every created vlan, but this patch series uses a more
> sophisticated way and creates mcast entries only for vlans
> that really require it. Generic functions from this series can be
> reused for fixing vlan and macvlan unicast.
This is a bug fix but targeted at net-next, and indeed it is quite
invasive as it adds new core infrastructure and converts the generic
vlan code over to using it.
Unfortunately net-next is closed.
So if you want this bug fixed in mainline you will have to come up
with a less invasive fix, and resubmit this net-next approach when the
net-next tree opens back up.
Thank you.
^ permalink raw reply
* Re: [RFC PATCH 01/11] phy: core add phy_set_netif_mode() api
From: Kishon Vijay Abraham I @ 2018-10-25 10:05 UTC (permalink / raw)
To: Grygorii Strashko, David S. Miller, netdev, Tony Lindgren,
Rob Herring
Cc: Sekhar Nori, linux-kernel, linux-omap, devicetree
In-Reply-To: <18219d97-f86e-27ee-a116-a8febc3dcd24@ti.com>
Hi,
On Wednesday 10 October 2018 04:13 AM, Grygorii Strashko wrote:
>
>
> On 10/09/2018 12:22 AM, Kishon Vijay Abraham I wrote:
>> Hi Grygorii,
>>
>> On Tuesday 09 October 2018 05:19 AM, Grygorii Strashko wrote:
>>> Add a new API phy_set_netif_mode(struct phy *phy, phy_interface_t mode) and
>>> a new PHY operation callback .set_netif_mode(), which is intended to be
>>> implemented by PHY drivers that support network interface mode selection.
>>> Both accept a phy_interface_t value as the input parameter.
>>>
>>> Cc: Kishon Vijay Abraham I <kishon@ti.com>
>>> Cc: Tony Lindgren <tony@atomide.com>
>>> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
>>> ---
>>> drivers/phy/phy-core.c | 15 +++++++++++++++
>>> include/linux/phy/phy.h | 12 ++++++++++++
>>> 2 files changed, 27 insertions(+)
>>>
>>> diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c
>>> index 35fd38c..d9aba1a 100644
>>> --- a/drivers/phy/phy-core.c
>>> +++ b/drivers/phy/phy-core.c
>>> @@ -377,6 +377,21 @@ int phy_set_mode(struct phy *phy, enum phy_mode mode)
>>> }
>>> EXPORT_SYMBOL_GPL(phy_set_mode);
>>>
>>> +int phy_set_netif_mode(struct phy *phy, phy_interface_t mode)
>>> +{
>>> + int ret;
>>> +
>>> + if (!phy || !phy->ops->set_netif_mode)
>>> + return 0;
>>> +
>>> + mutex_lock(&phy->mutex);
>>> + ret = phy->ops->set_netif_mode(phy, mode);
>>> + mutex_unlock(&phy->mutex);
>>> +
>>> + return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(phy_set_netif_mode);
>>
>> We should try to add only generic PHY APIs and not subsystem specific APIs. In
>> this case I think phy_set_mode should suffice.
>
>
> This is what I've had in mind first, but all my guts argued against it after I've tried:
>
> diff --git a/include/linux/phy/phy.h b/include/linux/phy/phy.h
> index bc73d2b..961b156 100644
> --- a/include/linux/phy/phy.h
> +++ b/include/linux/phy/phy.h
> @@ -41,6 +41,14 @@ enum phy_mode {
> PHY_MODE_10GKR,
> PHY_MODE_UFS_HS_A,
> PHY_MODE_UFS_HS_B,
> + PHY_MODE_MODE_MII,
> + PHY_MODE_MODE_GMII,
> + PHY_MODE_MODE_SGMII,
> + PHY_MODE_MODE_RMII,
> + PHY_MODE_MODE_RGMII,
> + PHY_MODE_MODE_RGMII_ID,
> + PHY_MODE_MODE_RGMII_RXID,
> + PHY_MODE_MODE_RGMII_TXID,
> };
>
> above introduces ugly constants duplication and required every network phy driver
> to maintain conversation table phy_interface_t -> enum phy_mode.
> More over, if above change happens third time (first time PHY_MODE_SGMII/PHY_MODE_10GKR were added,
> second - PHY_MODE_2500SGMII) it will never ends (there are ~15 more items only in phy_interface_t).
> As result, enum phy_mode might became a un-maintainable monster.
>
> So, as per above, and considering that Network subsystem is based on standards (phy_interface_t lists standard intf)
> I've tried to add separate PHY API.
>
> As an idea:
> - seems it could be reasonable to introduce PHY_MODE_NETWORK (or PHY_MODE_ETHERNET) and
> add generic phy_set_submode(struct phy *phy, long submode).
>
> So, single functional PHY device can just use phy_set_submode() and
> multi-functional devices (like serdes which can be muxed between PCIe, USB, NET), can use:
>
> phy_set_mode(PHY_MODE_ETHERNET)
> phy_set_submode(X);
Agreed on the constant duplication comment above. We can modify set_mode to
take submode as an additional parameter and fix all the users of phy_set_mode.
int phy_set_mode(struct phy *phy, enum phy_mode mode, int submode)
Thanks
Kishon
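
A minimal userspace sketch of the three-argument signature proposed above,
for illustration only. The trimmed-down struct layouts, the
PHY_MODE_ETHERNET constant (floated earlier in this thread as an idea) and
every demo_* name are assumptions, not the in-kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the kernel's enum phy_mode. */
enum phy_mode {
	PHY_MODE_INVALID,
	PHY_MODE_USB_HOST,
	PHY_MODE_PCIE,
	PHY_MODE_ETHERNET,	/* assumed: one entry covering all network modes */
};

/* Hypothetical stand-in for a few phy_interface_t values. */
enum demo_phy_interface {
	DEMO_INTERFACE_MODE_MII = 1,
	DEMO_INTERFACE_MODE_RGMII,
	DEMO_INTERFACE_MODE_SGMII,
};

struct phy;

struct phy_ops {
	int (*set_mode)(struct phy *phy, enum phy_mode mode, int submode);
};

struct phy {
	const struct phy_ops *ops;
	enum phy_mode mode;
	int submode;
};

/* Generic entry point: mode picks the subsystem, submode stays opaque here. */
int phy_set_mode(struct phy *phy, enum phy_mode mode, int submode)
{
	if (!phy || !phy->ops || !phy->ops->set_mode)
		return 0;
	return phy->ops->set_mode(phy, mode, submode);
}

/* Example driver callback: only the Ethernet mode interprets submode. */
int demo_serdes_set_mode(struct phy *phy, enum phy_mode mode, int submode)
{
	if (mode == PHY_MODE_ETHERNET &&
	    (submode < DEMO_INTERFACE_MODE_MII ||
	     submode > DEMO_INTERFACE_MODE_SGMII))
		return -22;	/* -EINVAL */

	phy->mode = mode;
	phy->submode = submode;
	return 0;
}

const struct phy_ops demo_serdes_ops = { .set_mode = demo_serdes_set_mode };
```

With this shape a multi-function serdes would be switched with a single
call such as phy_set_mode(phy, PHY_MODE_ETHERNET, DEMO_INTERFACE_MODE_RGMII),
and enum phy_mode would not need one entry per network interface type.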
^ permalink raw reply
* [PATCH v3 0/2] net: qcom/emac: add shared mdio bus support
From: Wang Dongsheng @ 2018-10-25 10:08 UTC (permalink / raw)
To: timur, andrew; +Cc: Wang Dongsheng, yu.zheng, f.fainelli, netdev
In-Reply-To: <78719753-77bd-596a-dfc7-ccd676850283@kernel.org>
The EMAC includes an MDIO controller, and the motherboard has more than one
PHY connected to a single MDIO bus. So share that mii_bus with the other MAC
devices that have no MDIO bus connected.
With ACPI, "phy-handle" cannot point directly to a _DSD sub-package, so we
use "phy-handle" to point to an internal MDIO device port. The port
describes the PHY address.
Tested: QDF2400 (ACPI), built-in/insmod/rmmod
V3:
- Add "phy-handle" support.
- Remove all of DT changes.
V2:
- Separate patch.
Wang Dongsheng (2):
net: qcom/emac: split phy_config to mdio bus create and get phy device
net: qcom/emac: add phy-handle support for ACPI
drivers/net/ethernet/qualcomm/emac/emac-phy.c | 183 ++++++++++++++----
1 file changed, 142 insertions(+), 41 deletions(-)
--
2.18.0
^ permalink raw reply
* [PATCH v3 1/2] net: qcom/emac: split phy_config to mdio bus create and get phy device
From: Wang Dongsheng @ 2018-10-25 10:08 UTC (permalink / raw)
To: timur, andrew; +Cc: Wang Dongsheng, yu.zheng, f.fainelli, netdev
In-Reply-To: <cover.1540459999.git.dongsheng.wang@hxt-semitech.com>
This patch splits emac_mdio_bus_create() and emac_get_phydev() out of
emac_phy_config() and does some code cleanup.
Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com>
---
drivers/net/ethernet/qualcomm/emac/emac-phy.c | 96 +++++++++++--------
1 file changed, 56 insertions(+), 40 deletions(-)
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-phy.c b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
index 53dbf1e163a8..f2ed013ce5d5 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-phy.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
@@ -96,15 +96,15 @@ static int emac_mdio_write(struct mii_bus *bus, int addr, int regnum, u16 val)
return 0;
}
-/* Configure the MDIO bus and connect the external PHY */
-int emac_phy_config(struct platform_device *pdev, struct emac_adapter *adpt)
+static int emac_mdio_bus_create(struct platform_device *pdev,
+ struct emac_adapter *adpt)
{
struct device_node *np = pdev->dev.of_node;
struct mii_bus *mii_bus;
int ret;
/* Create the mii_bus object for talking to the MDIO bus */
- adpt->mii_bus = mii_bus = devm_mdiobus_alloc(&pdev->dev);
+ mii_bus = devm_mdiobus_alloc(&pdev->dev);
if (!mii_bus)
return -ENOMEM;
@@ -115,50 +115,66 @@ int emac_phy_config(struct platform_device *pdev, struct emac_adapter *adpt)
mii_bus->parent = &pdev->dev;
mii_bus->priv = adpt;
- if (has_acpi_companion(&pdev->dev)) {
- u32 phy_addr;
-
- ret = mdiobus_register(mii_bus);
- if (ret) {
- dev_err(&pdev->dev, "could not register mdio bus\n");
- return ret;
- }
- ret = device_property_read_u32(&pdev->dev, "phy-channel",
- &phy_addr);
- if (ret)
- /* If we can't read a valid phy address, then assume
- * that there is only one phy on this mdio bus.
- */
- adpt->phydev = phy_find_first(mii_bus);
- else
- adpt->phydev = mdiobus_get_phy(mii_bus, phy_addr);
-
- /* of_phy_find_device() claims a reference to the phydev,
- * so we do that here manually as well. When the driver
- * later unloads, it can unilaterally drop the reference
- * without worrying about ACPI vs DT.
- */
- if (adpt->phydev)
- get_device(&adpt->phydev->mdio.dev);
- } else {
- struct device_node *phy_np;
-
- ret = of_mdiobus_register(mii_bus, np);
- if (ret) {
- dev_err(&pdev->dev, "could not register mdio bus\n");
- return ret;
- }
+ ret = of_mdiobus_register(mii_bus, has_acpi_companion(&pdev->dev) ?
+ NULL : np);
+ if (ret)
+ dev_err(&pdev->dev, "Could not register mdio bus\n");
+
+ adpt->mii_bus = ret ? NULL : mii_bus;
+ return ret;
+}
+
+static int emac_get_phydev(struct platform_device *pdev,
+ struct emac_adapter *adpt)
+{
+ struct device_node *np = pdev->dev.of_node;
+ struct mii_bus *bus = adpt->mii_bus;
+ struct device_node *phy_np;
+ struct phy_device *phydev;
+ u32 phy_addr;
+ int ret;
+
+ if (!has_acpi_companion(&pdev->dev)) {
phy_np = of_parse_phandle(np, "phy-handle", 0);
adpt->phydev = of_phy_find_device(phy_np);
of_node_put(phy_np);
+ return adpt->phydev ? 0 : -ENODEV;
}
- if (!adpt->phydev) {
- dev_err(&pdev->dev, "could not find external phy\n");
- mdiobus_unregister(mii_bus);
+ ret = device_property_read_u32(&pdev->dev, "phy-channel",
+ &phy_addr);
+ /* If we can't read a valid phy address, then assume
+ * that there is only one phy on this mdio bus.
+ */
+ phydev = ret ? phy_find_first(bus) : mdiobus_get_phy(bus, phy_addr);
+ if (!phydev)
return -ENODEV;
- }
+ /* of_phy_find_device() claims a reference to the phydev,
+ * so we do that here manually as well. When the driver
+ * later unloads, it can unilaterally drop the reference
+ * without worrying about ACPI vs DT.
+ */
+ get_device(&phydev->mdio.dev);
+ adpt->phydev = phydev;
return 0;
}
+
+/* Configure the MDIO bus and connect the external PHY */
+int emac_phy_config(struct platform_device *pdev, struct emac_adapter *adpt)
+{
+ int ret;
+
+ ret = emac_mdio_bus_create(pdev, adpt);
+ if (ret)
+ return ret;
+
+ ret = emac_get_phydev(pdev, adpt);
+ if (ret) {
+ dev_err(&pdev->dev, "Could not find external phy\n");
+ mdiobus_unregister(adpt->mii_bus);
+ }
+
+ return ret;
+}
--
2.18.0
^ permalink raw reply related
* [PATCH v3 2/2] net: qcom/emac: add phy-handle support for ACPI
From: Wang Dongsheng @ 2018-10-25 10:09 UTC (permalink / raw)
To: timur, andrew; +Cc: Wang Dongsheng, yu.zheng, f.fainelli, netdev
In-Reply-To: <cover.1540459999.git.dongsheng.wang@hxt-semitech.com>
Use "phy-handle" to point to an internal MDIO device port.
Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com>
---
drivers/net/ethernet/qualcomm/emac/emac-phy.c | 115 +++++++++++++++---
1 file changed, 100 insertions(+), 15 deletions(-)
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-phy.c b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
index f2ed013ce5d5..3dc3ae55e5bb 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-phy.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
@@ -96,6 +96,96 @@ static int emac_mdio_write(struct mii_bus *bus, int addr, int regnum, u16 val)
return 0;
}
+static int acpi_device_match(struct device *dev, void *fwnode)
+{
+ return dev->fwnode == fwnode;
+}
+
+static struct phy_device *
+emac_acpi_get_phydev_from_phy_handle(struct platform_device *pdev)
+{
+ struct fwnode_reference_args args;
+ struct fwnode_handle *fw_node;
+ struct acpi_device *adev;
+ acpi_handle handle;
+ struct device *dev;
+ struct phy_device *phydev;
+ struct net_device *netdev;
+ struct emac_adapter *adpt;
+ int phy_addr;
+ int ret;
+
+ /* Get PHY Port reference from phy-handle */
+ fw_node = acpi_fwnode_handle(ACPI_COMPANION(&pdev->dev));
+ ret = acpi_node_get_property_reference(fw_node, "phy-handle", 0,
+ &args);
+ if (ACPI_FAILURE(ret) || !is_acpi_device_node(args.fwnode))
+ return ERR_PTR(-ENODEV);
+
+ /* Get PHY addr from the port node */
+ if (fwnode_property_read_u32(args.fwnode, "phy-channel", &phy_addr))
+ return ERR_PTR(-ENODEV);
+
+ /* Get the MDIO bus that included the port */
+ handle = ACPI_HANDLE_FWNODE(args.fwnode);
+ if (!handle || acpi_bus_get_device(handle, &adev))
+ return ERR_PTR(-ENODEV);
+
+ while (adev->parent) {
+ if (!strcmp(acpi_device_hid(adev), "QCOM8070"))
+ break;
+ adev = adev->parent;
+ }
+ if (!adev->parent)
+ return ERR_PTR(-ENODEV);
+
+ dev = bus_find_device(&platform_bus_type, NULL,
+ &adev->fwnode,
+ acpi_device_match);
+ if (!dev)
+ return ERR_PTR(-ENODEV);
+
+ netdev = dev_get_drvdata(dev);
+ if (!netdev)
+ return ERR_PTR(-EPROBE_DEFER);
+
+ adpt = netdev_priv(netdev);
+ if (!adpt->mii_bus)
+ return ERR_PTR(-EPROBE_DEFER);
+
+ phydev = mdiobus_get_phy(adpt->mii_bus, phy_addr);
+ return phydev ? phydev : ERR_PTR(-ENODEV);
+}
+
+static struct phy_device *
+emac_acpi_get_phydev(struct platform_device *pdev, struct emac_adapter *adpt)
+{
+ struct phy_device *phydev = NULL;
+ int phy_addr;
+ int ret;
+
+ /* Compatible with "phy-channel" */
+ ret = device_property_read_u32(&pdev->dev, "phy-channel",
+ &phy_addr);
+ if (!ret)
+ phydev = mdiobus_get_phy(adpt->mii_bus, phy_addr);
+ if (phydev)
+ return phydev;
+
+ /* Get PHY Port reference from phy-handle */
+ phydev = emac_acpi_get_phydev_from_phy_handle(pdev);
+ if (!IS_ERR(phydev))
+ return phydev;
+ if (PTR_ERR(phydev) == -EPROBE_DEFER)
+ return ERR_PTR(-EPROBE_DEFER);
+
+ /* If we can't read a valid phy address from "phy-channel"/"phy-handle",
+ * then assume that there is only one phy on local mdio bus.
+ */
+ phydev = phy_find_first(adpt->mii_bus);
+ return phydev ? phydev : ERR_PTR(-ENODEV);
+}
+
static int emac_mdio_bus_create(struct platform_device *pdev,
struct emac_adapter *adpt)
{
@@ -128,13 +218,9 @@ static int emac_get_phydev(struct platform_device *pdev,
struct emac_adapter *adpt)
{
struct device_node *np = pdev->dev.of_node;
- struct mii_bus *bus = adpt->mii_bus;
struct device_node *phy_np;
struct phy_device *phydev;
- u32 phy_addr;
- int ret;
-
if (!has_acpi_companion(&pdev->dev)) {
phy_np = of_parse_phandle(np, "phy-handle", 0);
adpt->phydev = of_phy_find_device(phy_np);
@@ -142,14 +228,9 @@ static int emac_get_phydev(struct platform_device *pdev,
return adpt->phydev ? 0 : -ENODEV;
}
- ret = device_property_read_u32(&pdev->dev, "phy-channel",
- &phy_addr);
- /* If we can't read a valid phy address, then assume
- * that there is only one phy on this mdio bus.
- */
- phydev = ret ? phy_find_first(bus) : mdiobus_get_phy(bus, phy_addr);
- if (!phydev)
- return -ENODEV;
+ phydev = emac_acpi_get_phydev(pdev, adpt);
+ if (IS_ERR(phydev))
+ return PTR_ERR(phydev);
/* of_phy_find_device() claims a reference to the phydev,
* so we do that here manually as well. When the driver
@@ -171,10 +252,14 @@ int emac_phy_config(struct platform_device *pdev, struct emac_adapter *adpt)
return ret;
ret = emac_get_phydev(pdev, adpt);
- if (ret) {
+ if (!ret)
+ return 0;
+
+ if (ret != -EPROBE_DEFER)
dev_err(&pdev->dev, "Could not find external phy\n");
- mdiobus_unregister(adpt->mii_bus);
- }
+ else
+ dev_warn(&pdev->dev, "Phy is not available yet, deferred probing\n");
+ mdiobus_unregister(adpt->mii_bus);
return ret;
}
--
2.18.0
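
The lookup order that the patch above gives the ACPI path ("phy-channel"
first, then "phy-handle", then the first PHY on the local bus, with
-EPROBE_DEFER passed straight through) can be sketched in userspace as
below. This is an illustrative model, not driver code: the demo_* names and
the pre-computed lookup results are assumptions, and the error-pointer
helpers only imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_EPROBE_DEFER 517	/* kernel-internal value, absent from <errno.h> */
#define DEMO_MAX_ERRNO 4095

/* Userspace imitations of ERR_PTR(), PTR_ERR() and IS_ERR(). */
static inline void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_phy {
	int addr;
};

/* Pre-computed results of the three lookups, to keep the sketch small. */
struct demo_lookup {
	struct demo_phy *by_channel;	/* "phy-channel" result, or NULL */
	struct demo_phy *by_handle;	/* "phy-handle": valid or error pointer, never NULL */
	struct demo_phy *first_on_bus;	/* first PHY on the local bus, or NULL */
};

struct demo_phy *demo_get_phydev(const struct demo_lookup *l)
{
	struct demo_phy *phy;

	/* 1. Stay compatible with the older "phy-channel" property. */
	if (l->by_channel)
		return l->by_channel;

	/* 2. Follow "phy-handle"; a deferral must not fall through. */
	phy = l->by_handle;
	if (!demo_is_err(phy))
		return phy;
	if (demo_ptr_err(phy) == -DEMO_EPROBE_DEFER)
		return phy;

	/* 3. Otherwise assume a single PHY on the local bus. */
	phy = l->first_on_bus;
	return phy ? phy : demo_err_ptr(-ENODEV);
}
```

The point of step 2 is that a deferral means the MAC owning the shared bus
has not finished probing yet, so falling back to the local bus at that
moment could pick the wrong PHY.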
^ permalink raw reply related
* Re: [PATCH 2/2] Bluetooth: btqcomsmd: use HCI_QUIRK_USE_BDADDR_PROPERTY
From: Matthias Kaehlcke @ 2018-10-25 19:03 UTC (permalink / raw)
To: Balakrishna Godavarthi
Cc: Marcel Holtmann, Johan Hedberg, David S . Miller, Loic Poulain,
linux-bluetooth, netdev, linux-kernel, Brian Norris,
Dmitry Grinberg
In-Reply-To: <bff4f1b7d82e217f45273872eeffa77f@codeaurora.org>
On Thu, Oct 25, 2018 at 08:10:30PM +0530, Balakrishna Godavarthi wrote:
> Hi Matthias,
>
> On 2018-10-25 05:51, Matthias Kaehlcke wrote:
> > Use the HCI_QUIRK_USE_BDADDR_PROPERTY quirk to let the HCI
> > core handle the reading of 'local-bd-address'. With this there
> > is no need to set HCI_QUIRK_INVALID_BDADDR, the case of a
> > non-existing or invalid fwnode property is handled by the core
> > code.
> >
> > Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> > ---
> > I couldn't actually test the changes in this driver since I
> > don't have a device with this controller. Could someone
> > from Qualcomm help with this?
> > ---
> > drivers/bluetooth/btqcomsmd.c | 28 +++-------------------------
> > 1 file changed, 3 insertions(+), 25 deletions(-)
> >
> > diff --git a/drivers/bluetooth/btqcomsmd.c
> > b/drivers/bluetooth/btqcomsmd.c
> > index 7df3eed1ef5e..e5841602c4f1 100644
> > --- a/drivers/bluetooth/btqcomsmd.c
> > +++ b/drivers/bluetooth/btqcomsmd.c
> > @@ -125,23 +125,10 @@ static int btqcomsmd_setup(struct hci_dev *hdev)
> > return PTR_ERR(skb);
> > kfree_skb(skb);
> >
> > - /* Devices do not have persistent storage for BD address. If no
> > - * BD address has been retrieved during probe, mark the device
> > - * as having an invalid BD address.
> > + /* Devices do not have persistent storage for BD address. Retrieve
> > + * it from the firmware node property.
> > */
> > - if (!bacmp(&btq->bdaddr, BDADDR_ANY)) {
> > - set_bit(HCI_QUIRK_INVALID_BDADDR, &hdev->quirks);
> > - return 0;
> > - }
> > -
> > - /* When setting a configured BD address fails, mark the device
> > - * as having an invalid BD address.
> > - */
> > - err = qca_set_bdaddr_rome(hdev, &btq->bdaddr);
> > - if (err) {
> > - set_bit(HCI_QUIRK_INVALID_BDADDR, &hdev->quirks);
> > - return 0;
> > - }
> > + set_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks);
> >
> > return 0;
> > }
> > @@ -169,15 +156,6 @@ static int btqcomsmd_probe(struct platform_device
> > *pdev)
> > if (IS_ERR(btq->cmd_channel))
> > return PTR_ERR(btq->cmd_channel);
> >
> > - /* The local-bd-address property is usually injected by the
> > - * bootloader which has access to the allocated BD address.
> > - */
> > - if (!of_property_read_u8_array(pdev->dev.of_node, "local-bd-address",
> > - (u8 *)&btq->bdaddr, sizeof(bdaddr_t))) {
> > - dev_info(&pdev->dev, "BD address %pMR retrieved from device-tree",
> > - &btq->bdaddr);
> > - }
> > -
> > hdev = hci_alloc_dev();
> > if (!hdev)
> > return -ENOMEM;
>
> nit: can be remove unused "bdaddr_t bdaddr" variable.
> https://elixir.bootlin.com/linux/v4.19-rc8/source/drivers/bluetooth/btqcomsmd.c#L31
> Apart from this, It looks ok to me.
Thanks for the reviews! I'll remove the field in the next revision.
^ permalink raw reply
* [PATCH v2] netfilter: conntrack: fix calculation of next bucket number in early_drop
From: Vasily Khoruzhick @ 2018-10-25 19:15 UTC (permalink / raw)
To: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
David S. Miller, netfilter-devel, coreteam, netdev, linux-kernel,
Dmitry Safonov
Cc: Vasily Khoruzhick, stable
If there's no entry to drop in the bucket that corresponds to the hash,
early_drop() should look for it in other buckets. But since it increments
hash instead of bucket number, it actually looks in the same bucket 8
times: hsize is 16k by default (14 bits) and hash is 32-bit value, so
reciprocal_scale(hash, hsize) returns the same value for hash..hash+7 in
most cases.
Fix it by incrementing the bucket number instead of the hash, and rename
_hash to hash with a separate bucket variable to avoid future confusion.
Fixes: 3e86638e9a0b ("netfilter: conntrack: consider ct netns in early_drop logic")
Cc: <stable@vger.kernel.org> # v4.7+
Signed-off-by: Vasily Khoruzhick <vasilykh@arista.com>
---
v2: move 'bucket' outside of the loop
net/netfilter/nf_conntrack_core.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index ca1168d67fac..e92e749aff53 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1073,19 +1073,22 @@ static unsigned int early_drop_list(struct net *net,
return drops;
}
-static noinline int early_drop(struct net *net, unsigned int _hash)
+static noinline int early_drop(struct net *net, unsigned int hash)
{
- unsigned int i;
+ unsigned int i, bucket;
for (i = 0; i < NF_CT_EVICTION_RANGE; i++) {
struct hlist_nulls_head *ct_hash;
- unsigned int hash, hsize, drops;
+ unsigned int hsize, drops;
rcu_read_lock();
nf_conntrack_get_ht(&ct_hash, &hsize);
- hash = reciprocal_scale(_hash++, hsize);
+ if (!i)
+ bucket = reciprocal_scale(hash, hsize);
+ else
+ bucket = (bucket + 1) % hsize;
- drops = early_drop_list(net, &ct_hash[hash]);
+ drops = early_drop_list(net, &ct_hash[bucket]);
rcu_read_unlock();
if (drops) {
--
2.19.1
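
The arithmetic behind the commit message above can be checked with a small
userspace sketch. It assumes the usual definition of reciprocal_scale(),
((u64)val * ep_ro) >> 32, and a fixed hsize of 16384; the demo_* names are
made up for illustration, and unlike the kernel loop the sketch does not
re-read the table size on every iteration.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_EVICTION_RANGE 8	/* mirrors NF_CT_EVICTION_RANGE */

/* Same arithmetic as reciprocal_scale(): maps val into [0, ep_ro). */
uint32_t demo_reciprocal_scale(uint32_t val, uint32_t ep_ro)
{
	return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}

/* Old walk: increment the hash, then scale it each time. */
void demo_walk_old(uint32_t hash, uint32_t hsize, uint32_t *out)
{
	for (unsigned int i = 0; i < DEMO_EVICTION_RANGE; i++)
		out[i] = demo_reciprocal_scale(hash++, hsize);
}

/* New walk: scale once, then step the bucket index with wraparound. */
void demo_walk_new(uint32_t hash, uint32_t hsize, uint32_t *out)
{
	uint32_t bucket = 0;

	for (unsigned int i = 0; i < DEMO_EVICTION_RANGE; i++) {
		if (!i)
			bucket = demo_reciprocal_scale(hash, hsize);
		else
			bucket = (bucket + 1) % hsize;
		out[i] = bucket;
	}
}

/* Count distinct values in a small array (quadratic, fine for 8 items). */
unsigned int demo_count_distinct(const uint32_t *v, unsigned int n)
{
	unsigned int distinct = 0;

	for (unsigned int i = 0; i < n; i++) {
		unsigned int j;

		for (j = 0; j < i; j++)
			if (v[j] == v[i])
				break;
		if (j == i)
			distinct++;
	}
	return distinct;
}
```

With hsize = 2^14 the scaling reduces to hash >> 18, so eight consecutive
hash values land in the same bucket unless they straddle a multiple of
2^18. Stepping the bucket index instead visits eight different buckets.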
^ permalink raw reply related
* Re: [PATCH ghak90 (was ghak32) V4 03/10] audit: log container info of syscalls
From: Paul Moore @ 2018-10-25 10:49 UTC (permalink / raw)
To: sgrubb
Cc: rgb, simo, carlos, linux-api, containers, linux-kernel, dhowells,
linux-audit, netfilter-devel, ebiederm, luto, netdev,
linux-fsdevel, Eric Paris, Serge Hallyn, viro
In-Reply-To: <20181025080638.771621a3@ivy-bridge>
On Thu, Oct 25, 2018 at 2:06 AM Steve Grubb <sgrubb@redhat.com> wrote:
> On Wed, 24 Oct 2018 20:42:55 -0400
> Richard Guy Briggs <rgb@redhat.com> wrote:
> > On 2018-10-24 16:55, Paul Moore wrote:
> > > On Wed, Oct 24, 2018 at 11:15 AM Richard Guy Briggs
> > > <rgb@redhat.com> wrote:
> > > > On 2018-10-19 19:16, Paul Moore wrote:
> > > > > On Sun, Aug 5, 2018 at 4:32 AM Richard Guy Briggs
> > > > > <rgb@redhat.com> wrote:
...
> > > > > > +/*
> > > > > > + * audit_log_contid - report container info
> > > > > > + * @tsk: task to be recorded
> > > > > > + * @context: task or local context for record
> > > > > > + * @op: contid string description
> > > > > > + */
> > > > > > +int audit_log_contid(struct task_struct *tsk,
> > > > > > + struct audit_context *context,
> > > > > > char *op) +{
> > > > > > + struct audit_buffer *ab;
> > > > > > +
> > > > > > + if (!audit_contid_set(tsk))
> > > > > > + return 0;
> > > > > > + /* Generate AUDIT_CONTAINER record with container ID
> > > > > > */
> > > > > > + ab = audit_log_start(context, GFP_KERNEL,
> > > > > > AUDIT_CONTAINER);
> > > > > > + if (!ab)
> > > > > > + return -ENOMEM;
> > > > > > + audit_log_format(ab, "op=%s contid=%llu",
> > > > > > + op, audit_get_contid(tsk));
> > > > > > + audit_log_end(ab);
> > > > > > + return 0;
> > > > > > +}
> > > > > > +EXPORT_SYMBOL(audit_log_contid);
> > > > >
> > > > > As discussed in the previous iteration of the patch, I prefer
> > > > > AUDIT_CONTAINER_ID here over AUDIT_CONTAINER. If you feel
> > > > > strongly about keeping it as-is with AUDIT_CONTAINER I suppose
> > > > > I could live with that, but it is isn't my first choice.
> > > >
> > > > I don't have a strong opinion on this one, mildly preferring the
> > > > shorter one only because it is shorter.
> > >
> > > We already have multiple AUDIT_CONTAINER* record types, so it seems
> > > as though we should use "AUDIT_CONTAINER" as a prefix of sorts,
> > > rather than a type itself.
> >
> > I'm fine with that. I'd still like to hear Steve's input. He had
> > stronger opinions than me.
>
> The creation event should be separate and distinct from the continuing
> use when its used as a supplemental record. IOW, binding the ID to a
> container is part of the lifecycle and needs to be kept distinct.
Steve's comment is pretty ambiguous when it comes to AUDIT_CONTAINER
vs AUDIT_CONTAINER_ID, but one could argue that AUDIT_CONTAINER_ID
helps distinguish the audit container id marking record and gets to
what I believe is the spirit of Steve's comment. Taking this in
context with my previous remarks, let's switch to using
AUDIT_CONTAINER_ID.
--
paul moore
www.paul-moore.com
^ permalink raw reply
* Re: Regression: kernel 4.14 an later very slow with many ipsec tunnels
From: Florian Westphal @ 2018-10-25 19:24 UTC (permalink / raw)
To: David Miller
Cc: linux, netdev, fw, steffen.klassert, linux-kernel, torvalds,
christophe.gouault, gregkh
In-Reply-To: <20181025.103450.1966639999117342457.davem@davemloft.net>
David Miller <davem@davemloft.net> wrote:
> Please work diligently with Florian and others to try and find ways to
> soften the performance hit.
I will send a patch series that pre-sorts inexact policies into rbtrees
at insert time as soon as net-next opens up again.
^ permalink raw reply
* Re: [PATCH net-next 0/4] net: ethernet: ti: cpsw: fix vlan mcast
From: Grygorii Strashko @ 2018-10-25 19:25 UTC (permalink / raw)
To: David Miller, ivan.khoronzhuk
Cc: linux-omap, netdev, linux-kernel, alexander.h.duyck, bjorn
In-Reply-To: <20181025.113441.567599603045046210.davem@davemloft.net>
Hi David,
On 10/25/18 1:34 PM, David Miller wrote:
> From: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> Date: Thu, 25 Oct 2018 01:10:55 +0300
>
>> The cpsw holds separate mcast entires for vlan entries. At this moment
>> driver adds only not vlan mcast addresses, omitting vlan/mcast entries.
>> As result mcast for vlans doesn't work. It can be fixed by adding same
>> mcast entries for every created vlan, but this patchseries uses more
>> sophisticated way and allows to create mcast entries only for vlans
>> that really require it. Generic functions from this series can be
>> reused for fixing vlan and macvlan unicast.
>
> This is a bug fix but targetted at net-next, and indeed it is quite
> invasive as it adds new core infrastructure and converts the generic
> vlan code over to using it.
>
> Unfortunately net-next is closed.
>
> So if you want this bug fixed in mainline you will have to come up
> with a less invasive fix, and resubmit this net-next approach when the
> net-next tree opens back up.
I think it's ok to wait as this issue was always here, so it's not critical.
--
regards,
-grygorii
^ permalink raw reply
* [PATCH net V2 1/1] net/smc: fix smc_buf_unuse to use the lgr pointer
From: Ursula Braun @ 2018-10-25 11:25 UTC (permalink / raw)
To: davem; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl, ubraun
From: Karsten Graul <kgraul@linux.ibm.com>
The pointer to the link group is unset in the smc connection structure
right before the call to smc_buf_unuse. Provide the lgr pointer to
smc_buf_unuse explicitly.
And move the call to smc_lgr_schedule_free_work to the end of
smc_conn_free.
Fixes: a6920d1d130c ("net/smc: handle unregistered buffers")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
---
net/smc/smc_core.c | 25 ++++++++++++-------------
1 file changed, 12 insertions(+), 13 deletions(-)
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index e871368500e3..18daebcef181 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -122,22 +122,17 @@ static void __smc_lgr_unregister_conn(struct smc_connection *conn)
sock_put(&smc->sk); /* sock_hold in smc_lgr_register_conn() */
}
-/* Unregister connection and trigger lgr freeing if applicable
+/* Unregister connection from lgr
*/
static void smc_lgr_unregister_conn(struct smc_connection *conn)
{
struct smc_link_group *lgr = conn->lgr;
- int reduced = 0;
write_lock_bh(&lgr->conns_lock);
if (conn->alert_token_local) {
- reduced = 1;
__smc_lgr_unregister_conn(conn);
}
write_unlock_bh(&lgr->conns_lock);
- if (!reduced || lgr->conns_num)
- return;
- smc_lgr_schedule_free_work(lgr);
}
/* Send delete link, either as client to request the initiation
@@ -291,7 +286,8 @@ static int smc_lgr_create(struct smc_sock *smc, bool is_smcd,
return rc;
}
-static void smc_buf_unuse(struct smc_connection *conn)
+static void smc_buf_unuse(struct smc_connection *conn,
+ struct smc_link_group *lgr)
{
if (conn->sndbuf_desc)
conn->sndbuf_desc->used = 0;
@@ -301,8 +297,6 @@ static void smc_buf_unuse(struct smc_connection *conn)
conn->rmb_desc->used = 0;
} else {
/* buf registration failed, reuse not possible */
- struct smc_link_group *lgr = conn->lgr;
-
write_lock_bh(&lgr->rmbs_lock);
list_del(&conn->rmb_desc->list);
write_unlock_bh(&lgr->rmbs_lock);
@@ -315,16 +309,21 @@ static void smc_buf_unuse(struct smc_connection *conn)
/* remove a finished connection from its link group */
void smc_conn_free(struct smc_connection *conn)
{
- if (!conn->lgr)
+ struct smc_link_group *lgr = conn->lgr;
+
+ if (!lgr)
return;
- if (conn->lgr->is_smcd) {
+ if (lgr->is_smcd) {
smc_ism_unset_conn(conn);
tasklet_kill(&conn->rx_tsklet);
} else {
smc_cdc_tx_dismiss_slots(conn);
}
- smc_lgr_unregister_conn(conn);
- smc_buf_unuse(conn);
+ smc_lgr_unregister_conn(conn); /* unsets conn->lgr */
+ smc_buf_unuse(conn, lgr); /* allow buffer reuse */
+
+ if (!lgr->conns_num)
+ smc_lgr_schedule_free_work(lgr);
}
static void smc_link_clear(struct smc_link *lnk)
--
2.16.4
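
The ordering problem fixed above follows a common pattern: a helper clears
a back-pointer, and a later step still needs the object it pointed to. A
minimal userspace sketch of the fixed ordering follows. The demo_* structs
and counters are simplified assumptions and do not mirror the real smc data
structures or locking.

```c
#include <assert.h>
#include <stddef.h>

struct demo_lgr {
	int conns_num;		/* connections still registered */
	int bufs_in_use;	/* buffers currently handed out */
	int free_scheduled;	/* set once delayed freeing is queued */
};

struct demo_conn {
	struct demo_lgr *lgr;
	int has_buf;
};

/* Unregister the connection; as in the patch, this unsets conn->lgr. */
void demo_unregister_conn(struct demo_conn *conn)
{
	conn->lgr->conns_num--;
	conn->lgr = NULL;
}

/* Takes lgr explicitly because conn->lgr is already NULL by now. */
void demo_buf_unuse(struct demo_conn *conn, struct demo_lgr *lgr)
{
	if (conn->has_buf) {
		lgr->bufs_in_use--;
		conn->has_buf = 0;
	}
}

void demo_conn_free(struct demo_conn *conn)
{
	struct demo_lgr *lgr = conn->lgr;	/* cache before it is cleared */

	if (!lgr)
		return;

	demo_unregister_conn(conn);
	demo_buf_unuse(conn, lgr);

	/* Scheduling the free comes last, after all other uses of lgr. */
	if (!lgr->conns_num)
		lgr->free_scheduled = 1;
}
```

Calling demo_conn_free() a second time on the same connection is a no-op,
since the cached pointer is NULL and the function returns early.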
^ permalink raw reply related
* [RFC net-next v2 0/8] indirect tc block cb registration
From: John Hurley @ 2018-10-25 12:26 UTC (permalink / raw)
To: netdev, oss-drivers, jiri, gerlitz.or, ozsh, jakub.kicinski,
simon.horman, avivh
Cc: John Hurley
This patchset introduces an alternative to egdev offload by allowing a
driver to register for block updates when an external device (e.g. tunnel
netdev) is bound to a TC block. Drivers can track new netdevs or register
to existing ones to receive information on such events. Based on this,
they may register for block offload rules using already existing
functions.
The patchset also implements this new indirect block registration in the
NFP driver to allow the offloading of tunnel rules. The use of egdev
offload (which is currently only used for tunnel offload) is subsequently
removed.
John Hurley (8):
net: sched: register callbacks for indirect tc block binds
net: add netif_is_geneve()
nfp: flower: include geneve as supported offload tunnel type
nfp: flower: allow non repr netdev offload
nfp: flower: add infastructure for indirect TC block register
nfp: flower: offload tunnel decap rules via indirect TC blocks
nfp: flower: remove TC egdev offloads
nfp: flower: remove unnecessary code in flow lookup
drivers/net/ethernet/netronome/nfp/flower/action.c | 29 +-
drivers/net/ethernet/netronome/nfp/flower/cmsg.h | 13 +
drivers/net/ethernet/netronome/nfp/flower/main.c | 25 +-
drivers/net/ethernet/netronome/nfp/flower/main.h | 17 +-
drivers/net/ethernet/netronome/nfp/flower/match.c | 38 +--
.../net/ethernet/netronome/nfp/flower/metadata.c | 12 +-
.../net/ethernet/netronome/nfp/flower/offload.c | 246 +++++++++++------
.../ethernet/netronome/nfp/flower/tunnel_conf.c | 11 +-
include/net/geneve.h | 6 +
include/net/pkt_cls.h | 56 ++++
include/net/sch_generic.h | 3 +
net/sched/cls_api.c | 299 ++++++++++++++++++++-
12 files changed, 609 insertions(+), 146 deletions(-)
--
2.7.4
^ permalink raw reply
* [RFC net-next v2 1/8] net: sched: register callbacks for indirect tc block binds
From: John Hurley @ 2018-10-25 12:26 UTC (permalink / raw)
To: netdev, oss-drivers, jiri, gerlitz.or, ozsh, jakub.kicinski,
simon.horman, avivh
Cc: John Hurley
In-Reply-To: <1540470417-14803-1-git-send-email-john.hurley@netronome.com>
Currently drivers can register to receive TC block bind/unbind callbacks
by implementing the setup_tc ndo in any of their given netdevs. However,
drivers may also be interested in binds to higher level devices (e.g.
tunnel drivers) to potentially offload filters applied to them.
Introduce indirect block devs which allow drivers to register callbacks
for block binds on other devices. The calling driver is expected to
reference an 'owner' struct that it will pass to all block registrations.
This is used to track the callbacks from a given driver and free them if
the driver is removed while the upper level device is still active.
Freeing a callback will also trigger an unbind event (if necessary) to
direct the driver to remove any offloaded rules and unreg any block filter
callbacks.
Allow registering an indirect block dev callback for a device that is
already bound to a block. In this case (if it is an ingress block),
register and also trigger the callback meaning that any already installed
rules can be replayed to the calling driver.
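
The owner tracking described above can be pictured with a small userspace
sketch. It is a deliberately simplified model, assuming a fixed-size table
instead of the rhashtable and lists used in the patch, no locking, and no
ingress/egress distinction. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_CBS 8

/* Hypothetical callback type: bind is 1 for a bind event, 0 for unbind. */
typedef int demo_bind_cb_t(void *cb_priv, int bind);

struct demo_owner {
	int id;		/* only the address matters; it identifies one driver */
};

struct demo_cb_entry {
	int used;
	const void *dev;
	demo_bind_cb_t *cb;
	void *cb_priv;
	const struct demo_owner *owner;
	int bound;
};

static struct demo_cb_entry demo_cbs[DEMO_MAX_CBS];

/* Register a callback; replay the bind if the device already has a block. */
int demo_cb_register(const void *dev, demo_bind_cb_t *cb, void *cb_priv,
		     const struct demo_owner *owner, int dev_already_bound)
{
	for (size_t i = 0; i < DEMO_MAX_CBS; i++) {
		struct demo_cb_entry *e = &demo_cbs[i];

		if (e->used)
			continue;
		e->used = 1;
		e->dev = dev;
		e->cb = cb;
		e->cb_priv = cb_priv;
		e->owner = owner;
		e->bound = dev_already_bound;
		if (dev_already_bound)
			cb(cb_priv, 1);
		return 0;
	}
	return -1;	/* table full */
}

/* Drop every callback of one owner, sending an unbind where needed. */
void demo_owner_clean(const struct demo_owner *owner)
{
	for (size_t i = 0; i < DEMO_MAX_CBS; i++) {
		struct demo_cb_entry *e = &demo_cbs[i];

		if (!e->used || e->owner != owner)
			continue;
		if (e->bound)
			e->cb(e->cb_priv, 0);
		e->used = 0;
	}
}

/* Number of callbacks currently registered. */
unsigned int demo_cb_count(void)
{
	unsigned int n = 0;

	for (size_t i = 0; i < DEMO_MAX_CBS; i++)
		n += demo_cbs[i].used ? 1 : 0;
	return n;
}

/* Example callback: keeps a net count of binds in an int. */
int demo_count_binds(void *cb_priv, int bind)
{
	*(int *)cb_priv += bind ? 1 : -1;
	return 0;
}
```

The design point is that the core, not the driver, remembers which
callbacks belong to which driver, so a driver that goes away while the
tunnel device is still up can be cleaned out with one call.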
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
include/net/pkt_cls.h | 56 +++++++++
include/net/sch_generic.h | 3 +
net/sched/cls_api.c | 299 +++++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 357 insertions(+), 1 deletion(-)
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 72ffb31..1b47837 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -37,6 +37,7 @@ struct tcf_block_ext_info {
};
struct tcf_block_cb;
+struct tcf_indr_block_owner;
bool tcf_queue_work(struct rcu_work *rwork, work_func_t func);
#ifdef CONFIG_NET_CLS
@@ -81,6 +82,20 @@ void __tcf_block_cb_unregister(struct tcf_block *block,
struct tcf_block_cb *block_cb);
void tcf_block_cb_unregister(struct tcf_block *block,
tc_setup_cb_t *cb, void *cb_ident);
+int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+ tc_indr_block_bind_cb_t *cb, void *cb_ident,
+ struct tcf_indr_block_owner *owner);
+int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+ tc_indr_block_bind_cb_t *cb, void *cb_ident,
+ struct tcf_indr_block_owner *owner);
+void __tc_indr_block_cb_unregister(struct net_device *dev,
+ tc_indr_block_bind_cb_t *cb, void *cb_ident);
+void tc_indr_block_cb_unregister(struct net_device *dev,
+ tc_indr_block_bind_cb_t *cb,
+ void *cb_ident);
+
+struct tcf_indr_block_owner *tc_indr_block_owner_create(void);
+void tc_indr_block_owner_clean(struct tcf_indr_block_owner *owner);
int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
struct tcf_result *res, bool compat_mode);
@@ -183,6 +198,47 @@ void tcf_block_cb_unregister(struct tcf_block *block,
{
}
+static inline
+int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+ tc_indr_block_bind_cb_t *cb,
+ void *cb_ident,
+ struct tcf_indr_block_owner *owner)
+{
+ return 0;
+}
+
+static inline
+int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+ tc_indr_block_bind_cb_t *cb, void *cb_ident,
+ struct tcf_indr_block_owner *owner)
+{
+ return 0;
+}
+
+static inline
+void __tc_indr_block_cb_unregister(struct net_device *dev,
+ tc_indr_block_bind_cb_t *cb,
+ void *cb_ident)
+{
+}
+
+static inline
+void tc_indr_block_cb_unregister(struct net_device *dev,
+ tc_indr_block_bind_cb_t *cb,
+ void *cb_ident)
+{
+}
+
+static inline struct tcf_indr_block_owner *tc_indr_block_owner_create(void)
+{
+ /* NULL would mean an error, only CONFIG_NET_CLS can dereference this */
+ return (void *)1;
+}
+
+static inline void tc_indr_block_owner_clean(struct tcf_indr_block_owner *owner)
+{
+}
+
static inline int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
struct tcf_result *res, bool compat_mode)
{
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 4d73642..8301581 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -24,6 +24,9 @@ struct bpf_flow_keys;
typedef int tc_setup_cb_t(enum tc_setup_type type,
void *type_data, void *cb_priv);
+typedef int tc_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
+ enum tc_setup_type type, void *type_data);
+
struct qdisc_rate_table {
struct tc_ratespec rate;
u32 data[256];
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index f427a1e..79d97bf 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -25,6 +25,7 @@
#include <linux/kmod.h>
#include <linux/slab.h>
#include <linux/idr.h>
+#include <linux/rhashtable.h>
#include <net/net_namespace.h>
#include <net/sock.h>
#include <net/netlink.h>
@@ -365,6 +366,288 @@ static void tcf_chain_flush(struct tcf_chain *chain)
}
}
+static struct tcf_block *tc_dev_ingress_block(struct net_device *dev)
+{
+ const struct Qdisc_class_ops *cops;
+ struct Qdisc *qdisc;
+
+ if (!dev_ingress_queue(dev))
+ return NULL;
+
+ qdisc = dev_ingress_queue(dev)->qdisc_sleeping;
+ if (!qdisc)
+ return NULL;
+
+ cops = qdisc->ops->cl_ops;
+ if (!cops)
+ return NULL;
+
+ if (!cops->tcf_block)
+ return NULL;
+
+ return cops->tcf_block(qdisc, TC_H_MIN_INGRESS, NULL);
+}
+
+static struct rhashtable indr_setup_block_ht;
+
+struct tc_indr_block_dev {
+ struct rhash_head ht_node;
+ struct net_device *dev;
+ unsigned int refcnt;
+ struct list_head cb_list;
+ struct tcf_block *block;
+};
+
+struct tc_indr_block_cb {
+ struct tc_indr_block_dev *indr_dev;
+ struct list_head list;
+ void *cb_priv;
+ tc_indr_block_bind_cb_t *cb;
+ void *cb_ident;
+ struct list_head owner_list;
+};
+
+struct tcf_indr_block_owner {
+ struct list_head cb_list;
+};
+
+static const struct rhashtable_params tc_indr_setup_block_ht_params = {
+ .key_offset = offsetof(struct tc_indr_block_dev, dev),
+ .head_offset = offsetof(struct tc_indr_block_dev, ht_node),
+ .key_len = sizeof(struct net_device *),
+};
+
+static struct tc_indr_block_dev *
+tc_indr_block_dev_lookup(struct net_device *dev)
+{
+ return rhashtable_lookup_fast(&indr_setup_block_ht, &dev,
+ tc_indr_setup_block_ht_params);
+}
+
+static struct tc_indr_block_dev *tc_indr_block_dev_get(struct net_device *dev)
+{
+ struct tc_indr_block_dev *indr_dev;
+
+ indr_dev = tc_indr_block_dev_lookup(dev);
+ if (indr_dev)
+ goto inc_ref;
+
+ indr_dev = kzalloc(sizeof(*indr_dev), GFP_KERNEL);
+ if (!indr_dev)
+ return NULL;
+
+ INIT_LIST_HEAD(&indr_dev->cb_list);
+ indr_dev->dev = dev;
+ indr_dev->block = tc_dev_ingress_block(dev);
+ if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
+ tc_indr_setup_block_ht_params)) {
+ kfree(indr_dev);
+ return NULL;
+ }
+
+inc_ref:
+ indr_dev->refcnt++;
+ return indr_dev;
+}
+
+static void tc_indr_block_dev_put(struct tc_indr_block_dev *indr_dev)
+{
+ if (--indr_dev->refcnt)
+ return;
+
+ rhashtable_remove_fast(&indr_setup_block_ht, &indr_dev->ht_node,
+ tc_indr_setup_block_ht_params);
+ kfree(indr_dev);
+}
+
+static struct tc_indr_block_cb *
+tc_indr_block_cb_lookup(struct tc_indr_block_dev *indr_dev,
+ tc_indr_block_bind_cb_t *cb, void *cb_ident)
+{
+ struct tc_indr_block_cb *indr_block_cb;
+
+ list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
+ if (indr_block_cb->cb == cb &&
+ indr_block_cb->cb_ident == cb_ident)
+ return indr_block_cb;
+ return NULL;
+}
+
+static struct tc_indr_block_cb *
+tc_indr_block_cb_add(struct tc_indr_block_dev *indr_dev,
+ void *cb_priv, tc_indr_block_bind_cb_t *cb,
+ void *cb_ident, struct tcf_indr_block_owner *owner)
+{
+ struct tc_indr_block_cb *indr_block_cb;
+
+ indr_block_cb = tc_indr_block_cb_lookup(indr_dev, cb, cb_ident);
+ if (indr_block_cb)
+ return ERR_PTR(-EEXIST);
+
+ indr_block_cb = kzalloc(sizeof(*indr_block_cb), GFP_KERNEL);
+ if (!indr_block_cb)
+ return ERR_PTR(-ENOMEM);
+
+ indr_block_cb->indr_dev = indr_dev;
+ indr_block_cb->cb_priv = cb_priv;
+ indr_block_cb->cb = cb;
+ indr_block_cb->cb_ident = cb_ident;
+ list_add(&indr_block_cb->list, &indr_dev->cb_list);
+ list_add(&indr_block_cb->owner_list, &owner->cb_list);
+
+ return indr_block_cb;
+}
+
+static void tc_indr_block_cb_del(struct tc_indr_block_cb *indr_block_cb)
+{
+ list_del(&indr_block_cb->list);
+ list_del(&indr_block_cb->owner_list);
+ kfree(indr_block_cb);
+}
+
+static int tc_indr_block_ing_cmd(struct tc_indr_block_dev *indr_dev,
+ struct tc_indr_block_cb *indr_block_cb,
+ enum tc_block_command command)
+{
+ struct tc_block_offload bo = {
+ .command = command,
+ .binder_type = TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS,
+ .block = indr_dev->block,
+ };
+
+ if (!indr_dev->block)
+ return 0;
+ return indr_block_cb->cb(indr_dev->dev, indr_block_cb->cb_priv,
+ TC_SETUP_BLOCK, &bo);
+}
+
+int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+ tc_indr_block_bind_cb_t *cb, void *cb_ident,
+ struct tcf_indr_block_owner *owner)
+{
+ struct tc_indr_block_cb *indr_block_cb;
+ struct tc_indr_block_dev *indr_dev;
+ int err;
+
+ indr_dev = tc_indr_block_dev_get(dev);
+ if (!indr_dev)
+ return -ENOMEM;
+
+ indr_block_cb = tc_indr_block_cb_add(indr_dev, cb_priv, cb, cb_ident,
+ owner);
+ err = PTR_ERR_OR_ZERO(indr_block_cb);
+ if (err)
+ goto err_dev_put;
+
+ tc_indr_block_ing_cmd(indr_dev, indr_block_cb, TC_BLOCK_BIND);
+ return 0;
+
+err_dev_put:
+ tc_indr_block_dev_put(indr_dev);
+ return err;
+}
+EXPORT_SYMBOL_GPL(__tc_indr_block_cb_register);
+
+int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+ tc_indr_block_bind_cb_t *cb, void *cb_ident,
+ struct tcf_indr_block_owner *owner)
+{
+ int err;
+
+ rtnl_lock();
+ err = __tc_indr_block_cb_register(dev, cb_priv, cb, cb_ident, owner);
+ rtnl_unlock();
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(tc_indr_block_cb_register);
+
+void __tc_indr_block_cb_unregister(struct net_device *dev,
+ tc_indr_block_bind_cb_t *cb, void *cb_ident)
+{
+ struct tc_indr_block_cb *indr_block_cb;
+ struct tc_indr_block_dev *indr_dev;
+
+ indr_dev = tc_indr_block_dev_lookup(dev);
+ if (!indr_dev)
+ return;
+
+ indr_block_cb = tc_indr_block_cb_lookup(indr_dev, cb, cb_ident);
+ if (!indr_block_cb)
+ return;
+
+ /* Send unbind message if required to free any block cbs. */
+ tc_indr_block_ing_cmd(indr_dev, indr_block_cb, TC_BLOCK_UNBIND);
+ tc_indr_block_cb_del(indr_block_cb);
+ tc_indr_block_dev_put(indr_dev);
+}
+EXPORT_SYMBOL_GPL(__tc_indr_block_cb_unregister);
+
+void tc_indr_block_cb_unregister(struct net_device *dev,
+ tc_indr_block_bind_cb_t *cb, void *cb_ident)
+{
+ rtnl_lock();
+ __tc_indr_block_cb_unregister(dev, cb, cb_ident);
+ rtnl_unlock();
+}
+EXPORT_SYMBOL_GPL(tc_indr_block_cb_unregister);
+
+struct tcf_indr_block_owner *tc_indr_block_owner_create(void)
+{
+ struct tcf_indr_block_owner *owner;
+
+ owner = kzalloc(sizeof(*owner), GFP_KERNEL);
+ if (!owner)
+ return NULL;
+ INIT_LIST_HEAD(&owner->cb_list);
+ return owner;
+}
+EXPORT_SYMBOL_GPL(tc_indr_block_owner_create);
+
+void tc_indr_block_owner_clean(struct tcf_indr_block_owner *owner)
+{
+ struct tc_indr_block_cb *indr_block_cb, *store;
+ struct tc_indr_block_dev *indr_dev;
+
+ rtnl_lock();
+ list_for_each_entry_safe(indr_block_cb, store, &owner->cb_list,
+ owner_list) {
+ indr_dev = indr_block_cb->indr_dev;
+ tc_indr_block_ing_cmd(indr_dev, indr_block_cb, TC_BLOCK_UNBIND);
+ tc_indr_block_cb_del(indr_block_cb);
+ tc_indr_block_dev_put(indr_dev);
+ }
+ rtnl_unlock();
+
+ kfree(owner);
+}
+EXPORT_SYMBOL_GPL(tc_indr_block_owner_clean);
+
+static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
+ struct tcf_block_ext_info *ei,
+ enum tc_block_command command,
+ struct netlink_ext_ack *extack)
+{
+ struct tc_indr_block_cb *indr_block_cb;
+ struct tc_indr_block_dev *indr_dev;
+ struct tc_block_offload bo = {
+ .command = command,
+ .binder_type = ei->binder_type,
+ .block = block,
+ .extack = extack,
+ };
+
+ indr_dev = tc_indr_block_dev_lookup(dev);
+ if (!indr_dev)
+ return;
+
+ indr_dev->block = command == TC_BLOCK_BIND ? block : NULL;
+
+ list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
+ indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
+ &bo);
+}
+
static bool tcf_block_offload_in_use(struct tcf_block *block)
{
return block->offloadcnt;
@@ -406,12 +689,17 @@ static int tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q,
err = tcf_block_offload_cmd(block, dev, ei, TC_BLOCK_BIND, extack);
if (err == -EOPNOTSUPP)
goto no_offload_dev_inc;
- return err;
+ if (err)
+ return err;
+
+ tc_indr_block_call(block, dev, ei, TC_BLOCK_BIND, extack);
+ return 0;
no_offload_dev_inc:
if (tcf_block_offload_in_use(block))
return -EOPNOTSUPP;
block->nooffloaddevcnt++;
+ tc_indr_block_call(block, dev, ei, TC_BLOCK_BIND, extack);
return 0;
}
@@ -421,6 +709,8 @@ static void tcf_block_offload_unbind(struct tcf_block *block, struct Qdisc *q,
struct net_device *dev = q->dev_queue->dev;
int err;
+ tc_indr_block_call(block, dev, ei, TC_BLOCK_UNBIND, NULL);
+
if (!dev->netdev_ops->ndo_setup_tc)
goto no_offload_dev_dec;
err = tcf_block_offload_cmd(block, dev, ei, TC_BLOCK_UNBIND, NULL);
@@ -2355,6 +2645,11 @@ static int __init tc_filter_init(void)
if (err)
goto err_register_pernet_subsys;
+ err = rhashtable_init(&indr_setup_block_ht,
+ &tc_indr_setup_block_ht_params);
+ if (err)
+ goto err_rhash_setup_block_ht;
+
rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_new_tfilter, NULL, 0);
rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_del_tfilter, NULL, 0);
rtnl_register(PF_UNSPEC, RTM_GETTFILTER, tc_get_tfilter,
@@ -2366,6 +2661,8 @@ static int __init tc_filter_init(void)
return 0;
+err_rhash_setup_block_ht:
+ unregister_pernet_subsys(&tcf_net_ops);
err_register_pernet_subsys:
destroy_workqueue(tc_filter_wq);
return err;
--
2.7.4
^ permalink raw reply related
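
Reader's note on the patch above: the registration path rejects a second registration of the same (cb, cb_ident) pair with -EEXIST and keeps one device reference per registered callback. The following is a minimal userspace sketch of that bookkeeping only, not the kernel code. All demo_* names are hypothetical, a single device entry stands in for the rhashtable, a singly linked list stands in for list_head, and no BIND/UNBIND command is sent.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for tc_indr_block_bind_cb_t. */
typedef int demo_bind_cb_t(void *dev, void *cb_priv, int command);

struct demo_cb {
	struct demo_cb *next;
	demo_bind_cb_t *cb;
	void *cb_ident;
	void *cb_priv;
};

/* Stand-in for one per-device entry; the patch keeps these in a hash table. */
struct demo_indr_dev {
	unsigned int refcnt;	/* one reference per registered callback */
	struct demo_cb *cbs;
};

/* Callback that does nothing; only its address matters in this sketch. */
int demo_noop_cb(void *dev, void *cb_priv, int command)
{
	(void)dev;
	(void)cb_priv;
	(void)command;
	return 0;
}

int demo_cb_register(struct demo_indr_dev *d, void *cb_priv,
		     demo_bind_cb_t *cb, void *cb_ident)
{
	struct demo_cb *n;

	assert(d != NULL);

	/* A (cb, cb_ident) pair may be registered only once per device. */
	for (n = d->cbs; n; n = n->next)
		if (n->cb == cb && n->cb_ident == cb_ident)
			return -EEXIST;

	n = calloc(1, sizeof(*n));
	if (!n)
		return -ENOMEM;

	n->cb = cb;
	n->cb_ident = cb_ident;
	n->cb_priv = cb_priv;
	n->next = d->cbs;
	d->cbs = n;
	d->refcnt++;
	return 0;
}

void demo_cb_unregister(struct demo_indr_dev *d, demo_bind_cb_t *cb,
			void *cb_ident)
{
	struct demo_cb **pp, *n;

	assert(d != NULL);

	/* Unknown pairs are ignored, as in the patch's unregister path. */
	for (pp = &d->cbs; *pp; pp = &(*pp)->next) {
		n = *pp;
		if (n->cb == cb && n->cb_ident == cb_ident) {
			*pp = n->next;
			free(n);
			d->refcnt--;
			return;
		}
	}
}
```

In the patch itself the reference is taken before the callback is added and dropped again on failure; the sketch folds that into a single increment on success.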
* [RFC net-next v2 2/8] net: add netif_is_geneve()
From: John Hurley @ 2018-10-25 12:26 UTC (permalink / raw)
To: netdev, oss-drivers, jiri, gerlitz.or, ozsh, jakub.kicinski,
simon.horman, avivh
Cc: John Hurley
In-Reply-To: <1540470417-14803-1-git-send-email-john.hurley@netronome.com>
Add a helper function to determine if the type of a netdev is geneve based
on its rtnl_link_ops. This allows drivers that may wish to offload tunnels
to check the underlying type of the device.
A recent patch added a similar helper to vxlan.h
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
include/net/geneve.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/include/net/geneve.h b/include/net/geneve.h
index a7600ed..fc6a7e0 100644
--- a/include/net/geneve.h
+++ b/include/net/geneve.h
@@ -60,6 +60,12 @@ struct genevehdr {
struct geneve_opt options[];
};
+static inline bool netif_is_geneve(const struct net_device *dev)
+{
+ return dev->rtnl_link_ops &&
+ !strcmp(dev->rtnl_link_ops->kind, "geneve");
+}
+
#ifdef CONFIG_INET
struct net_device *geneve_dev_create_fb(struct net *net, const char *name,
u8 name_assign_type, u16 dst_port);
--
2.7.4
^ permalink raw reply related
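
Reader's note on the patch above: the helper treats a device as geneve only when it has link ops and their kind string equals "geneve". A userspace sketch of that check follows. The demo_* structs are hypothetical stand-ins for struct net_device and struct rtnl_link_ops, and the kind is passed as a parameter purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct rtnl_link_ops. */
struct demo_link_ops {
	const char *kind;
};

/* Hypothetical stand-in for struct net_device. */
struct demo_netdev {
	const struct demo_link_ops *link_ops;
};

/* True only if the device has link ops and their kind matches exactly. */
bool demo_netif_is_kind(const struct demo_netdev *dev, const char *kind)
{
	assert(dev != NULL && kind != NULL);

	return dev->link_ops && !strcmp(dev->link_ops->kind, kind);
}
```

The NULL test comes first because devices created without rtnl link ops (many physical NICs, for instance) have no kind string to compare.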
* [RFC net-next v2 3/8] nfp: flower: include geneve as supported offload tunnel type
From: John Hurley @ 2018-10-25 12:26 UTC (permalink / raw)
To: netdev, oss-drivers, jiri, gerlitz.or, ozsh, jakub.kicinski,
simon.horman, avivh
Cc: John Hurley
In-Reply-To: <1540470417-14803-1-git-send-email-john.hurley@netronome.com>
Offload of geneve decap rules is supported in NFP. Include geneve in the
check for supported types.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c b/drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c
index 8e5bec0..170f314 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/tunnel_conf.c
@@ -190,6 +190,8 @@ static bool nfp_tun_is_netdev_to_offload(struct net_device *netdev)
return true;
if (netif_is_vxlan(netdev))
return true;
+ if (netif_is_geneve(netdev))
+ return true;
return false;
}
--
2.7.4
^ permalink raw reply related
* [RFC net-next v2 5/8] nfp: flower: add infrastructure for indirect TC block register
From: John Hurley @ 2018-10-25 12:26 UTC (permalink / raw)
To: netdev, oss-drivers, jiri, gerlitz.or, ozsh, jakub.kicinski,
simon.horman, avivh
Cc: John Hurley
In-Reply-To: <1540470417-14803-1-git-send-email-john.hurley@netronome.com>
Add support structures and functions that can be used by NFP to implement
the indirect block register functionality of TC.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
drivers/net/ethernet/netronome/nfp/flower/main.c | 13 +++
drivers/net/ethernet/netronome/nfp/flower/main.h | 8 ++
.../net/ethernet/netronome/nfp/flower/offload.c | 129 +++++++++++++++++++++
3 files changed, 150 insertions(+)
diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.c b/drivers/net/ethernet/netronome/nfp/flower/main.c
index 3a54728..518006c 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.c
@@ -568,8 +568,18 @@ static int nfp_flower_init(struct nfp_app *app)
goto err_cleanup_metadata;
}
+ INIT_LIST_HEAD(&app_priv->indr_block_cb_priv);
+ app_priv->indr_block_owner = tc_indr_block_owner_create();
+ if (!app_priv->indr_block_owner) {
+ err = -ENOMEM;
+ goto err_lag_clean;
+ }
+
return 0;
+err_lag_clean:
+ if (app_priv->flower_ext_feats & NFP_FL_FEATS_LAG)
+ nfp_flower_lag_cleanup(&app_priv->nfp_lag);
err_cleanup_metadata:
nfp_flower_metadata_cleanup(app);
err_free_app_priv:
@@ -588,6 +598,8 @@ static void nfp_flower_clean(struct nfp_app *app)
if (app_priv->flower_ext_feats & NFP_FL_FEATS_LAG)
nfp_flower_lag_cleanup(&app_priv->nfp_lag);
+ nfp_flower_clean_indr_block_priv(app);
+
nfp_flower_metadata_cleanup(app);
vfree(app->priv);
app->priv = NULL;
@@ -678,6 +690,7 @@ static void nfp_flower_stop(struct nfp_app *app)
unregister_netdevice_notifier(&app_priv->nfp_lag.lag_nb);
nfp_tunnel_config_stop(app);
+ tc_indr_block_owner_clean(app_priv->indr_block_owner);
}
const struct nfp_app_type app_flower = {
diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.h b/drivers/net/ethernet/netronome/nfp/flower/main.h
index a91ac52..8b4bcf3 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.h
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.h
@@ -133,6 +133,8 @@ struct nfp_fl_lag {
* @reify_wait_queue: wait queue for repr reify response counting
* @mtu_conf: Configuration of repr MTU value
* @nfp_lag: Link aggregation data block
+ * @indr_block_cb_priv: List of priv data passed to indirect block registers
+ * @indr_block_owner: Struct required for indirect blocks
*/
struct nfp_flower_priv {
struct nfp_app *app;
@@ -166,6 +168,8 @@ struct nfp_flower_priv {
wait_queue_head_t reify_wait_queue;
struct nfp_mtu_conf mtu_conf;
struct nfp_fl_lag nfp_lag;
+ struct list_head indr_block_cb_priv;
+ struct tcf_indr_block_owner *indr_block_owner;
};
/**
@@ -269,5 +273,9 @@ int nfp_flower_lag_populate_pre_action(struct nfp_app *app,
struct nfp_fl_pre_lag *pre_act);
int nfp_flower_lag_get_output_id(struct nfp_app *app,
struct net_device *master);
+void
+nfp_flower_register_indr_block(struct nfp_app *app, struct net_device *netdev);
+void nfp_flower_unregister_indr_block(struct net_device *netdev);
+void nfp_flower_clean_indr_block_priv(struct nfp_app *app);
#endif
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index 2c32edf..f701b2e 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -693,3 +693,132 @@ int nfp_flower_setup_tc(struct nfp_app *app, struct net_device *netdev,
return -EOPNOTSUPP;
}
}
+
+struct nfp_flower_indr_block_cb_priv {
+ struct net_device *netdev;
+ struct nfp_app *app;
+ struct list_head list;
+};
+
+static struct nfp_flower_indr_block_cb_priv *
+nfp_flower_indr_block_cb_priv_lookup(struct nfp_app *app,
+ struct net_device *netdev)
+{
+ struct nfp_flower_indr_block_cb_priv *cb_priv;
+ struct nfp_flower_priv *priv = app->priv;
+
+ /* All callback list access should be protected by RTNL. */
+ ASSERT_RTNL();
+
+ list_for_each_entry(cb_priv, &priv->indr_block_cb_priv, list)
+ if (cb_priv->netdev == netdev)
+ return cb_priv;
+
+ return NULL;
+}
+
+void nfp_flower_clean_indr_block_priv(struct nfp_app *app)
+{
+ struct nfp_flower_indr_block_cb_priv *cb_priv, *temp;
+ struct nfp_flower_priv *priv = app->priv;
+
+ list_for_each_entry_safe(cb_priv, temp, &priv->indr_block_cb_priv, list)
+ kfree(cb_priv);
+}
+
+static int nfp_flower_setup_indr_block_cb(enum tc_setup_type type,
+ void *type_data, void *cb_priv)
+{
+ struct nfp_flower_indr_block_cb_priv *priv = cb_priv;
+ struct tc_cls_flower_offload *flower = type_data;
+
+ if (flower->common.chain_index)
+ return -EOPNOTSUPP;
+
+ switch (type) {
+ case TC_SETUP_CLSFLOWER:
+ return nfp_flower_repr_offload(priv->app, priv->netdev,
+ type_data, false);
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+static int
+nfp_flower_setup_indr_tc_block(struct net_device *netdev, struct nfp_app *app,
+ struct tc_block_offload *f)
+{
+ struct nfp_flower_indr_block_cb_priv *cb_priv;
+ struct nfp_flower_priv *priv = app->priv;
+ int err;
+
+ if (f->binder_type != TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS)
+ return -EOPNOTSUPP;
+
+ switch (f->command) {
+ case TC_BLOCK_BIND:
+ cb_priv = kmalloc(sizeof(*cb_priv), GFP_KERNEL);
+ if (!cb_priv)
+ return -ENOMEM;
+
+ cb_priv->netdev = netdev;
+ cb_priv->app = app;
+ list_add(&cb_priv->list, &priv->indr_block_cb_priv);
+
+ err = tcf_block_cb_register(f->block,
+ nfp_flower_setup_indr_block_cb,
+ netdev, cb_priv, f->extack);
+ if (err) {
+ list_del(&cb_priv->list);
+ kfree(cb_priv);
+ }
+
+ return err;
+ case TC_BLOCK_UNBIND:
+ tcf_block_cb_unregister(f->block,
+ nfp_flower_setup_indr_block_cb, netdev);
+ cb_priv = nfp_flower_indr_block_cb_priv_lookup(app, netdev);
+ if (cb_priv) {
+ list_del(&cb_priv->list);
+ kfree(cb_priv);
+ }
+
+ return 0;
+ default:
+ return -EOPNOTSUPP;
+ }
+ return 0;
+}
+
+static int
+nfp_flower_indr_setup_tc_cb(struct net_device *netdev, void *cb_priv,
+ enum tc_setup_type type, void *type_data)
+{
+ switch (type) {
+ case TC_SETUP_BLOCK:
+ return nfp_flower_setup_indr_tc_block(netdev, cb_priv,
+ type_data);
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+void
+nfp_flower_register_indr_block(struct nfp_app *app, struct net_device *netdev)
+{
+ struct nfp_flower_priv *priv = app->priv;
+ int err;
+
+ err = __tc_indr_block_cb_register(netdev, app,
+ nfp_flower_indr_setup_tc_cb, netdev,
+ priv->indr_block_owner);
+ if (err)
+ nfp_flower_cmsg_warn(priv->app,
+ "Failed to reg indirect block cb for %s\n", netdev->name);
+}
+
+void nfp_flower_unregister_indr_block(struct net_device *netdev)
+{
+ __tc_indr_block_cb_unregister(netdev, nfp_flower_indr_setup_tc_cb,
+ netdev);
+}
--
2.7.4
^ permalink raw reply related
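
Reader's note on the patch above: on TC_BLOCK_BIND the driver allocates a private record for the netdev and links it into a per-app list; on TC_BLOCK_UNBIND it looks the record up by netdev and frees it; any other command is rejected. A userspace sketch of that bookkeeping follows. The demo_* names are hypothetical, the tcf_block_cb_register/unregister calls are left out, and a singly linked list replaces list_head.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

enum demo_cmd { DEMO_BIND, DEMO_UNBIND };

/* Hypothetical stand-in for nfp_flower_indr_block_cb_priv. */
struct demo_priv {
	struct demo_priv *next;
	const void *netdev;
};

/* Hypothetical stand-in for the per-app list head. */
struct demo_app {
	struct demo_priv *privs;
};

int demo_setup_block(struct demo_app *app, const void *netdev,
		     enum demo_cmd cmd)
{
	struct demo_priv **pp, *p;

	assert(app != NULL);

	switch (cmd) {
	case DEMO_BIND:
		/* Track one private record per bound netdev. */
		p = malloc(sizeof(*p));
		if (!p)
			return -ENOMEM;
		p->netdev = netdev;
		p->next = app->privs;
		app->privs = p;
		return 0;
	case DEMO_UNBIND:
		/* Drop the record for this netdev, if one exists. */
		for (pp = &app->privs; *pp; pp = &(*pp)->next) {
			if ((*pp)->netdev == netdev) {
				p = *pp;
				*pp = p->next;
				free(p);
				break;
			}
		}
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}
```

Unbinding a netdev that was never bound is a no-op that still returns 0, which mirrors the patch's UNBIND branch.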
* [RFC net-next v2 4/8] nfp: flower: allow non repr netdev offload
From: John Hurley @ 2018-10-25 12:26 UTC (permalink / raw)
To: netdev, oss-drivers, jiri, gerlitz.or, ozsh, jakub.kicinski,
simon.horman, avivh
Cc: John Hurley
In-Reply-To: <1540470417-14803-1-git-send-email-john.hurley@netronome.com>
Previously the offload functions in NFP assumed that the ingress (or
egress) netdev passed to them was an nfp repr.
Modify the driver to permit the passing of non repr netdevs as the ingress
device for an offload rule candidate. This may include devices such as
tunnels. The driver should then base its offload decision on a combination
of ingress device and egress port for a rule.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
drivers/net/ethernet/netronome/nfp/flower/action.c | 14 ++++----
drivers/net/ethernet/netronome/nfp/flower/main.h | 3 +-
drivers/net/ethernet/netronome/nfp/flower/match.c | 38 ++++++++++++----------
.../net/ethernet/netronome/nfp/flower/offload.c | 33 +++++++++++--------
4 files changed, 49 insertions(+), 39 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/flower/action.c b/drivers/net/ethernet/netronome/nfp/flower/action.c
index 244dc26..04349c7 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/action.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/action.c
@@ -151,11 +151,12 @@ nfp_fl_output(struct nfp_app *app, struct nfp_fl_output *output,
/* Set action output parameters. */
output->flags = cpu_to_be16(tmp_flags);
- /* Only offload if egress ports are on the same device as the
- * ingress port.
- */
- if (!switchdev_port_same_parent_id(in_dev, out_dev))
- return -EOPNOTSUPP;
+ if (nfp_netdev_is_nfp_repr(in_dev)) {
+ /* Confirm ingress and egress are on same device. */
+ if (!switchdev_port_same_parent_id(in_dev, out_dev))
+ return -EOPNOTSUPP;
+ }
+
if (!nfp_netdev_is_nfp_repr(out_dev))
return -EOPNOTSUPP;
@@ -728,9 +729,8 @@ nfp_flower_loop_action(struct nfp_app *app, const struct tc_action *a,
*a_len += sizeof(struct nfp_fl_push_vlan);
} else if (is_tcf_tunnel_set(a)) {
struct ip_tunnel_info *ip_tun = tcf_tunnel_info(a);
- struct nfp_repr *repr = netdev_priv(netdev);
- *tun_type = nfp_fl_get_tun_from_act_l4_port(repr->app, a);
+ *tun_type = nfp_fl_get_tun_from_act_l4_port(app, a);
if (*tun_type == NFP_FL_TUNNEL_NONE)
return -EOPNOTSUPP;
diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.h b/drivers/net/ethernet/netronome/nfp/flower/main.h
index 90045ba..a91ac52 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.h
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.h
@@ -226,7 +226,8 @@ void nfp_flower_metadata_cleanup(struct nfp_app *app);
int nfp_flower_setup_tc(struct nfp_app *app, struct net_device *netdev,
enum tc_setup_type type, void *type_data);
-int nfp_flower_compile_flow_match(struct tc_cls_flower_offload *flow,
+int nfp_flower_compile_flow_match(struct nfp_app *app,
+ struct tc_cls_flower_offload *flow,
struct nfp_fl_key_ls *key_ls,
struct net_device *netdev,
struct nfp_fl_payload *nfp_flow,
diff --git a/drivers/net/ethernet/netronome/nfp/flower/match.c b/drivers/net/ethernet/netronome/nfp/flower/match.c
index e54fb60..cdf7559 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/match.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/match.c
@@ -52,10 +52,13 @@ nfp_flower_compile_port(struct nfp_flower_in_port *frame, u32 cmsg_port,
return 0;
}
- if (tun_type)
+ if (tun_type) {
frame->in_port = cpu_to_be32(NFP_FL_PORT_TYPE_TUN | tun_type);
- else
+ } else {
+ if (!cmsg_port)
+ return -EOPNOTSUPP;
frame->in_port = cpu_to_be32(cmsg_port);
+ }
return 0;
}
@@ -289,17 +292,21 @@ nfp_flower_compile_ipv4_udp_tun(struct nfp_flower_ipv4_udp_tun *frame,
}
}
-int nfp_flower_compile_flow_match(struct tc_cls_flower_offload *flow,
+int nfp_flower_compile_flow_match(struct nfp_app *app,
+ struct tc_cls_flower_offload *flow,
struct nfp_fl_key_ls *key_ls,
struct net_device *netdev,
struct nfp_fl_payload *nfp_flow,
enum nfp_flower_tun_type tun_type)
{
- struct nfp_repr *netdev_repr;
+ u32 cmsg_port = 0;
int err;
u8 *ext;
u8 *msk;
+ if (nfp_netdev_is_nfp_repr(netdev))
+ cmsg_port = nfp_repr_get_port_id(netdev);
+
memset(nfp_flow->unmasked_data, 0, key_ls->key_size);
memset(nfp_flow->mask_data, 0, key_ls->key_size);
@@ -327,15 +334,13 @@ int nfp_flower_compile_flow_match(struct tc_cls_flower_offload *flow,
/* Populate Exact Port data. */
err = nfp_flower_compile_port((struct nfp_flower_in_port *)ext,
- nfp_repr_get_port_id(netdev),
- false, tun_type);
+ cmsg_port, false, tun_type);
if (err)
return err;
/* Populate Mask Port Data. */
err = nfp_flower_compile_port((struct nfp_flower_in_port *)msk,
- nfp_repr_get_port_id(netdev),
- true, tun_type);
+ cmsg_port, true, tun_type);
if (err)
return err;
@@ -399,16 +404,13 @@ int nfp_flower_compile_flow_match(struct tc_cls_flower_offload *flow,
msk += sizeof(struct nfp_flower_ipv4_udp_tun);
/* Configure tunnel end point MAC. */
- if (nfp_netdev_is_nfp_repr(netdev)) {
- netdev_repr = netdev_priv(netdev);
- nfp_tunnel_write_macs(netdev_repr->app);
-
- /* Store the tunnel destination in the rule data.
- * This must be present and be an exact match.
- */
- nfp_flow->nfp_tun_ipv4_addr = tun_dst;
- nfp_tunnel_add_ipv4_off(netdev_repr->app, tun_dst);
- }
+ nfp_tunnel_write_macs(app);
+
+ /* Store the tunnel destination in the rule data.
+ * This must be present and be an exact match.
+ */
+ nfp_flow->nfp_tun_ipv4_addr = tun_dst;
+ nfp_tunnel_add_ipv4_off(app, tun_dst);
if (key_ls->key_layer_two & NFP_FLOWER_LAYER2_GENEVE_OP) {
err = nfp_flower_compile_geneve_opt(ext, flow, false);
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index 29c9542..2c32edf 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -56,11 +56,10 @@
BIT(FLOW_DISSECTOR_KEY_ENC_PORTS))
static int
-nfp_flower_xmit_flow(struct net_device *netdev,
- struct nfp_fl_payload *nfp_flow, u8 mtype)
+nfp_flower_xmit_flow(struct nfp_app *app, struct nfp_fl_payload *nfp_flow,
+ u8 mtype)
{
u32 meta_len, key_len, mask_len, act_len, tot_len;
- struct nfp_repr *priv = netdev_priv(netdev);
struct sk_buff *skb;
unsigned char *msg;
@@ -78,7 +77,7 @@ nfp_flower_xmit_flow(struct net_device *netdev,
nfp_flow->meta.mask_len >>= NFP_FL_LW_SIZ;
nfp_flow->meta.act_len >>= NFP_FL_LW_SIZ;
- skb = nfp_flower_cmsg_alloc(priv->app, tot_len, mtype, GFP_KERNEL);
+ skb = nfp_flower_cmsg_alloc(app, tot_len, mtype, GFP_KERNEL);
if (!skb)
return -ENOMEM;
@@ -96,7 +95,7 @@ nfp_flower_xmit_flow(struct net_device *netdev,
nfp_flow->meta.mask_len <<= NFP_FL_LW_SIZ;
nfp_flow->meta.act_len <<= NFP_FL_LW_SIZ;
- nfp_ctrl_tx(priv->app->ctrl, skb);
+ nfp_ctrl_tx(app->ctrl, skb);
return 0;
}
@@ -427,13 +426,16 @@ nfp_flower_add_offload(struct nfp_app *app, struct net_device *netdev,
struct tc_cls_flower_offload *flow, bool egress)
{
enum nfp_flower_tun_type tun_type = NFP_FL_TUNNEL_NONE;
- struct nfp_port *port = nfp_port_from_netdev(netdev);
struct nfp_flower_priv *priv = app->priv;
struct nfp_fl_payload *flow_pay;
struct nfp_fl_key_ls *key_layer;
+ struct nfp_port *port = NULL;
struct net_device *ingr_dev;
int err;
+ if (nfp_netdev_is_nfp_repr(netdev))
+ port = nfp_port_from_netdev(netdev);
+
ingr_dev = egress ? NULL : netdev;
flow_pay = nfp_flower_search_fl_table(app, flow->cookie, ingr_dev,
NFP_FL_STATS_CTX_DONT_CARE);
@@ -462,8 +464,8 @@ nfp_flower_add_offload(struct nfp_app *app, struct net_device *netdev,
flow_pay->ingress_dev = egress ? NULL : netdev;
- err = nfp_flower_compile_flow_match(flow, key_layer, netdev, flow_pay,
- tun_type);
+ err = nfp_flower_compile_flow_match(app, flow, key_layer, netdev,
+ flow_pay, tun_type);
if (err)
goto err_destroy_flow;
@@ -476,7 +478,7 @@ nfp_flower_add_offload(struct nfp_app *app, struct net_device *netdev,
if (err)
goto err_destroy_flow;
- err = nfp_flower_xmit_flow(netdev, flow_pay,
+ err = nfp_flower_xmit_flow(app, flow_pay,
NFP_FLOWER_CMSG_TYPE_FLOW_ADD);
if (err)
goto err_destroy_flow;
@@ -487,7 +489,8 @@ nfp_flower_add_offload(struct nfp_app *app, struct net_device *netdev,
if (err)
goto err_destroy_flow;
- port->tc_offload_cnt++;
+ if (port)
+ port->tc_offload_cnt++;
/* Deallocate flow payload when flower rule has been destroyed. */
kfree(key_layer);
@@ -520,12 +523,15 @@ static int
nfp_flower_del_offload(struct nfp_app *app, struct net_device *netdev,
struct tc_cls_flower_offload *flow, bool egress)
{
- struct nfp_port *port = nfp_port_from_netdev(netdev);
struct nfp_flower_priv *priv = app->priv;
struct nfp_fl_payload *nfp_flow;
+ struct nfp_port *port = NULL;
struct net_device *ingr_dev;
int err;
+ if (nfp_netdev_is_nfp_repr(netdev))
+ port = nfp_port_from_netdev(netdev);
+
ingr_dev = egress ? NULL : netdev;
nfp_flow = nfp_flower_search_fl_table(app, flow->cookie, ingr_dev,
NFP_FL_STATS_CTX_DONT_CARE);
@@ -539,13 +545,14 @@ nfp_flower_del_offload(struct nfp_app *app, struct net_device *netdev,
if (nfp_flow->nfp_tun_ipv4_addr)
nfp_tunnel_del_ipv4_off(app, nfp_flow->nfp_tun_ipv4_addr);
- err = nfp_flower_xmit_flow(netdev, nfp_flow,
+ err = nfp_flower_xmit_flow(app, nfp_flow,
NFP_FLOWER_CMSG_TYPE_FLOW_DEL);
if (err)
goto err_free_flow;
err_free_flow:
- port->tc_offload_cnt--;
+ if (port)
+ port->tc_offload_cnt--;
kfree(nfp_flow->action_data);
kfree(nfp_flow->mask_data);
kfree(nfp_flow->unmasked_data);
--
2.7.4
^ permalink raw reply related
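
Reader's note on the patch above: with non-repr ingress devices allowed, the port id defaults to 0 and is only filled in for nfp reprs, so the in-port match must refuse a rule that has neither a tunnel type nor a port id. A userspace sketch of that decision follows. The demo_* names and the DEMO_PORT_TYPE_TUN value are hypothetical, and the byte-order conversion and mask handling of the real function are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value marking a tunnel in-port. */
#define DEMO_PORT_TYPE_TUN 0x50000000u

int demo_compile_in_port(uint32_t *in_port, uint32_t cmsg_port,
			 uint32_t tun_type)
{
	assert(in_port != NULL);

	if (tun_type) {
		/* Tunnel rules match on the tunnel type, not a port id. */
		*in_port = DEMO_PORT_TYPE_TUN | tun_type;
		return 0;
	}

	/* No tunnel and no repr port id: nothing to match on. */
	if (!cmsg_port)
		return -EOPNOTSUPP;

	*in_port = cmsg_port;
	return 0;
}
```

A caller in this model would pass cmsg_port as 0 for any ingress device that is not a repr, so offload succeeds for such devices only when a tunnel type was detected.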