Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH] Clock-independent TCP ISN generation
From: David Miller @ 2019-09-03 22:43 UTC (permalink / raw)
  To: sirus.shahini; +Cc: eric.dumazet, shiraz.saleem, jgg, arnd, netdev, sirus
In-Reply-To: <e3bf138f-672e-cefa-5fe5-ea25af8d3d61@gmail.com>

From: Cyrus Sh <sirus.shahini@gmail.com>
Date: Tue, 3 Sep 2019 10:06:03 -0600

> On 9/3/19 9:59 AM, Eric Dumazet wrote:
>> 
>> You could add a random delay to all SYN packets, if you believe your host has clock skews.
> 
> In theory yes, but again do you know any practical example with tested
> applications and the list of the rules? I'm interested to see an actual example
> that somebody has carried out and observed its results.

That's ironic that you're asking for specific and "practical" examples
when you submitted a kernel patch that doesn't even compile.

^ permalink raw reply

* Re: [PATCH] Clock-independent TCP ISN generation
From: David Miller @ 2019-09-03 22:45 UTC (permalink / raw)
  To: sirus.shahini; +Cc: eric.dumazet, shiraz.saleem, jgg, arnd, netdev, sirus
In-Reply-To: <e02c0aac-05c5-e0a4-9ae1-57685a0c3160@gmail.com>

From: Cyrus Sh <sirus.shahini@gmail.com>
Date: Tue, 3 Sep 2019 10:27:41 -0600

> It's up to you whether to want to keep using a problematic code that
> may endanger users or want to do something about it since we won't
> insist on having a patch accepted.

At least our problematic code, unlike your patch, compiles.

^ permalink raw reply

* Re: Proposal: r8152 firmware patching framework
From: David Miller @ 2019-09-03 22:50 UTC (permalink / raw)
  To: pmalani
  Cc: bambi.yeh, hayeswang, amber.chen, netdev, ryankao, jackc, albertk,
	marcochen, nic_swsd, grundler
In-Reply-To: <CACeCKadhJz3fdR+0rm+O2E39EbJgmN5NipMT8GRNtorus8myEg@mail.gmail.com>

From: Prashant Malani <pmalani@chromium.org>
Date: Tue, 3 Sep 2019 14:32:01 -0700

> I've moved David to the TO list to hopefully get his suggestions and
> guidance about how to design this in a upstream-compatible way.

I am not an expert in this area so please do not solicit my opinion.

Thank you.

^ permalink raw reply

* [PATCH bpf-next] selftests/bpf: precision tracking tests
From: Alexei Starovoitov @ 2019-09-03 22:51 UTC (permalink / raw)
  To: davem; +Cc: daniel, netdev, bpf, kernel-team

Add two tests to check that stack slot marking during backtracking
doesn't trigger 'spi > allocated_stack' warning.
One test is using BPF_ST insn. Another is using BPF_STX.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
these tests depend on the fix https://patchwork.ozlabs.org/patch/1157368/
---
 .../testing/selftests/bpf/verifier/precise.c  | 52 +++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/tools/testing/selftests/bpf/verifier/precise.c b/tools/testing/selftests/bpf/verifier/precise.c
index a455a4a71f11..02151f8c940f 100644
--- a/tools/testing/selftests/bpf/verifier/precise.c
+++ b/tools/testing/selftests/bpf/verifier/precise.c
@@ -140,3 +140,55 @@
 	.errstr = "!read_ok",
 	.result = REJECT,
 },
+{
+	"precise: ST insn causing spi > allocated_stack",
+	.insns = {
+	BPF_MOV64_REG(BPF_REG_3, BPF_REG_10),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_3, 123, 0),
+	BPF_ST_MEM(BPF_DW, BPF_REG_3, -8, 0),
+	BPF_LDX_MEM(BPF_DW, BPF_REG_4, BPF_REG_10, -8),
+	BPF_MOV64_IMM(BPF_REG_0, -1),
+	BPF_JMP_REG(BPF_JGT, BPF_REG_4, BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_XDP,
+	.flags = BPF_F_TEST_STATE_FREQ,
+	.errstr = "5: (2d) if r4 > r0 goto pc+0\
+	last_idx 5 first_idx 5\
+	parent didn't have regs=10 stack=0 marks\
+	last_idx 4 first_idx 2\
+	regs=10 stack=0 before 4\
+	regs=10 stack=0 before 3\
+	regs=0 stack=1 before 2\
+	last_idx 5 first_idx 5\
+	parent didn't have regs=1 stack=0 marks",
+	.result = VERBOSE_ACCEPT,
+	.retval = -1,
+},
+{
+	"precise: STX insn causing spi > allocated_stack",
+	.insns = {
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_prandom_u32),
+	BPF_MOV64_REG(BPF_REG_3, BPF_REG_10),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_3, 123, 0),
+	BPF_STX_MEM(BPF_DW, BPF_REG_3, BPF_REG_0, -8),
+	BPF_LDX_MEM(BPF_DW, BPF_REG_4, BPF_REG_10, -8),
+	BPF_MOV64_IMM(BPF_REG_0, -1),
+	BPF_JMP_REG(BPF_JGT, BPF_REG_4, BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_XDP,
+	.flags = BPF_F_TEST_STATE_FREQ,
+	.errstr = "last_idx 6 first_idx 6\
+	parent didn't have regs=10 stack=0 marks\
+	last_idx 5 first_idx 3\
+	regs=10 stack=0 before 5\
+	regs=10 stack=0 before 4\
+	regs=0 stack=1 before 3\
+	last_idx 6 first_idx 6\
+	parent didn't have regs=1 stack=0 marks\
+	last_idx 5 first_idx 3\
+	regs=1 stack=0 before 5",
+	.result = VERBOSE_ACCEPT,
+	.retval = -1,
+},
-- 
2.20.0


^ permalink raw reply related

* Re: [PATCH bpf-next 00/13] bpf: adding map batch processing support
From: Brian Vazquez @ 2019-09-03 23:07 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Alexei Starovoitov, Yonghong Song, Jakub Kicinski,
	Alexei Starovoitov, bpf@vger.kernel.org, netdev@vger.kernel.org,
	Daniel Borkmann, Kernel Team
In-Reply-To: <20190903223043.GC2101@mini-arch>

On Tue, Sep 3, 2019 at 3:30 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
>
> On 09/03, Alexei Starovoitov wrote:
> > On Fri, Aug 30, 2019 at 02:18:09PM -0700, Stanislav Fomichev wrote:
> > > > >
> > > > > I personally like Jakub's/Quentin's proposal more. So if I get to choose
> > > > > between this series and Jakub's filter+dump in BPF, I'd pick filter+dump
> > > > > (pending per-cpu issue which we actually care about).
> > > > >
> > > > > But if we can have both, I don't have any objections; this patch
> >
> > I think we need to have both.
> > imo Jakub's and Yonghong's approach are solving slightly different cases.
> >
> > filter+dump via program is better suited for LRU map walks where filter prog
> > would do some non-trivial logic.
> > Whereas plain 'delete all' or 'dump all' is much simpler to use without
> > loading yet another prog just to dump it.
> > bpf infra today isn't quite ready for this very short lived auxiliary progs.
> > At prog load pages get read-only mapping, tlbs across cpus flushed,
> > kallsyms populated, FDs allocated, etc.
> > Loading the prog is a heavy operation. There was a chatter before to have
> > built-in progs. This filter+dump could benefit from builtin 'allow all'
> > or 'delete all' progs, but imo that complicates design and asks even
> > more questions than it answers. Should this builtin progs show up
> > in 'bpftool prog show' ? When do they load/unload? Same safety requirements
> > as normal progs? etc.
> > imo it's fine to have little bit overlap between apis.
> > So I think we should proceed with both batching apis.
> We don't need to load filter+dump every time we need a dump, right?
> We, internally, want to have this 'batch dump' only for long running daemons
> (I think the same applies to bcc), we can load this filter+dump once and
> then have a sys_bpf() command to trigger it.
>
> Also, related, if we add this batch dump, it doesn't mean that
> everything should switch to it. For example, I feel like we
> are perfectly fine if bpftool still uses get_next_key+lookup
> since we use it only for debugging.
>
> > Having said that I think both are suffering from the important issue pointed out
> > by Brian: when kernel deletes an element get_next_key iterator over hash/lru
> > map will produce duplicates.
> > The amount of duplicates can be huge. When batched iterator is slow and
> > bpf prog is doing a lot of update/delete, there could be 10x worth of duplicates,
> > since walk will resume from the beginning.
> > User space cannot be tasked to deal with it.
> > I think this issue has to be solved in the kernel first and it may require
> > different batching api.

We could also modify get_next_key behaviour _only_ when it's called
from a dumping function in which case we do know that we want to move
forward not backwards (basically if prev_key is not found, then
retrieve the first key in the next bucket).

That approach might miss entries that are in the same bucket where the
prev_key used to be, but missing entries is something that can always
happen (new additions in previous buckets), we can not control that
and as it has said before, if users care about consistency they can
use map-in-map.

> > One idea is to use bucket spin_lock and batch process it bucket-at-a-time.
> > From api pov the user space will tell kernel:
> > - here is the buffer for N element. start dump from the beginning.
> > - kernel will return <= N elements and an iterator.
> > - user space will pass this opaque iterator back to get another batch
> > For well behaved hash/lru map there will be zero or one elements per bucket.
> > When there are 2+ the batching logic can process them together.
> > If 'lookup' is requested the kernel can check whether user space provided
> > enough space for these 2 elements. If not abort the batch earlier.
> > get_next_key won't be used. Instead some sort of opaque iterator
> > will be returned to user space, so next batch lookup can start from it.
> > This iterator could be the index of the last dumped bucket.
> > This idea won't work for pathological hash tables though.
> > A lot of elements in a single bucket may be more than room for single batch.
> > In such case iterator will get stuck, since num_of_elements_in_bucket > batch_buf_size.
> > May be special error code can be used to solve that?
> This all requires new per-map implementations unfortunately :-(
> We were trying to see if we can somehow improve the existing bpf_map_ops
> to be more friendly towards batching.

Agreed that one of the motivations for current batching implementation
was to re use the existing code as much as we can. Although this might
be a good opportunity to do some per-map implementations as long as
they behave better than the current ones.

>
> You also bring a valid point with regard to well behaved hash/lru,
> we might be optimizing for the wrong case :-)

For pathological hash tables, wouldn't batching lookup_and_delete
help? In theory you are always requesting for the current first_key,
copying it and deleting it. And even if you retrieve the same key
during the batch (because another cpu added it again) it would contain
different info right?

> > I hope we can come up with other ideas to have a stable iterator over hash table.
> > Let's use email to describe the ideas and upcoming LPC conference to
> > sort out details and finalize the one to use.

^ permalink raw reply

* Re: [PATCH bpf-next 00/13] bpf: adding map batch processing support
From: Yonghong Song @ 2019-09-03 23:07 UTC (permalink / raw)
  To: Stanislav Fomichev, Alexei Starovoitov
  Cc: Jakub Kicinski, Brian Vazquez, Alexei Starovoitov,
	bpf@vger.kernel.org, netdev@vger.kernel.org, Daniel Borkmann,
	Kernel Team
In-Reply-To: <20190903223043.GC2101@mini-arch>



On 9/3/19 3:30 PM, Stanislav Fomichev wrote:
> On 09/03, Alexei Starovoitov wrote:
>> On Fri, Aug 30, 2019 at 02:18:09PM -0700, Stanislav Fomichev wrote:
>>>>>
>>>>> I personally like Jakub's/Quentin's proposal more. So if I get to choose
>>>>> between this series and Jakub's filter+dump in BPF, I'd pick filter+dump
>>>>> (pending per-cpu issue which we actually care about).
>>>>>
>>>>> But if we can have both, I don't have any objections; this patch
>>
>> I think we need to have both.
>> imo Jakub's and Yonghong's approach are solving slightly different cases.
>>
>> filter+dump via program is better suited for LRU map walks where filter prog
>> would do some non-trivial logic.
>> Whereas plain 'delete all' or 'dump all' is much simpler to use without
>> loading yet another prog just to dump it.
>> bpf infra today isn't quite ready for this very short lived auxiliary progs.
>> At prog load pages get read-only mapping, tlbs across cpus flushed,
>> kallsyms populated, FDs allocated, etc.
>> Loading the prog is a heavy operation. There was a chatter before to have
>> built-in progs. This filter+dump could benefit from builtin 'allow all'
>> or 'delete all' progs, but imo that complicates design and asks even
>> more questions than it answers. Should this builtin progs show up
>> in 'bpftool prog show' ? When do they load/unload? Same safety requirements
>> as normal progs? etc.
>> imo it's fine to have little bit overlap between apis.
>> So I think we should proceed with both batching apis.
> We don't need to load filter+dump every time we need a dump, right?
> We, internally, want to have this 'batch dump' only for long running daemons
> (I think the same applies to bcc), we can load this filter+dump once and
> then have a sys_bpf() command to trigger it.
> 
> Also, related, if we add this batch dump, it doesn't mean that
> everything should switch to it. For example, I feel like we
> are perfectly fine if bpftool still uses get_next_key+lookup
> since we use it only for debugging.
> 
>> Having said that I think both are suffering from the important issue pointed out
>> by Brian: when kernel deletes an element get_next_key iterator over hash/lru
>> map will produce duplicates.
>> The amount of duplicates can be huge. When batched iterator is slow and
>> bpf prog is doing a lot of update/delete, there could be 10x worth of duplicates,
>> since walk will resume from the beginning.
>> User space cannot be tasked to deal with it.
>> I think this issue has to be solved in the kernel first and it may require
>> different batching api.
>>
>> One idea is to use bucket spin_lock and batch process it bucket-at-a-time.
>>  From api pov the user space will tell kernel:
>> - here is the buffer for N element. start dump from the beginning.
>> - kernel will return <= N elements and an iterator.
>> - user space will pass this opaque iterator back to get another batch
>> For well behaved hash/lru map there will be zero or one elements per bucket.
>> When there are 2+ the batching logic can process them together.
>> If 'lookup' is requested the kernel can check whether user space provided
>> enough space for these 2 elements. If not abort the batch earlier.
>> get_next_key won't be used. Instead some sort of opaque iterator
>> will be returned to user space, so next batch lookup can start from it.
>> This iterator could be the index of the last dumped bucket.
>> This idea won't work for pathological hash tables though.
>> A lot of elements in a single bucket may be more than room for single batch.
>> In such case iterator will get stuck, since num_of_elements_in_bucket > batch_buf_size.
>> May be special error code can be used to solve that?
> This all requires new per-map implementations unfortunately :-(
> We were trying to see if we can somehow improve the existing bpf_map_ops
> to be more friendly towards batching.

The below is a link to folly current hashmap implementation:
https://github.com/facebook/folly/blob/master/folly/concurrency/ConcurrentHashMap.h
It uses segment level batching and each segment can contain multiple 
buckets. It support concurrent iterating vs. deletion.
I am yet to fully understand its implementation. But my guess is
that old elements are somehow kept long enough until all the
queries, read, etc. done and then garbage collection kicks in
and removes those old elements.

Kernel bucket-level locking should be okay. Indeed, per-map 
implementation or tweaking existing per-map interfaces will be necessary.

> 
> You also bring a valid point with regard to well behaved hash/lru,
> we might be optimizing for the wrong case :-)
> 
>> I hope we can come up with other ideas to have a stable iterator over hash table.
>> Let's use email to describe the ideas and upcoming LPC conference to
>> sort out details and finalize the one to use.

^ permalink raw reply

* Re: [PATCH] Clock-independent TCP ISN generation
From: Cyrus Sh @ 2019-09-04  0:45 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, shiraz.saleem, jgg, arnd, netdev, sirus
In-Reply-To: <20190903.154553.508717744184330290.davem@davemloft.net>

On 9/3/19 4:45 PM, David Miller wrote:

> At least our problematic code, unlike your patch, compiles.

I obviously compiled and tested the code before sending along and this should be
easy to understand. Even I published the results in the link that I mentioned in
the initial message. Now I'm not sure what you're doing that you're unable to
compile the code. What compilation error message do you get? This patch has
been written for kernel 5.3-rc7. 
Further the reason that I asked for a specific practical example was that I
wanted to actually test different techniques and see their effectiveness. (Again
should be easy to understand, I don't know why it's not for you and why it made
you upset)

^ permalink raw reply

* Re: [PATCH bpf-next 00/13] bpf: adding map batch processing support
From: Alexei Starovoitov @ 2019-09-04  1:35 UTC (permalink / raw)
  To: Brian Vazquez
  Cc: Stanislav Fomichev, Yonghong Song, Jakub Kicinski,
	Alexei Starovoitov, bpf@vger.kernel.org, netdev@vger.kernel.org,
	Daniel Borkmann, Kernel Team
In-Reply-To: <CAMzD94Qge5CtUZ+_0DsuQ_VLpmoADDZemFH1=WA-H8P0ax8qDQ@mail.gmail.com>

On Tue, Sep 03, 2019 at 04:07:17PM -0700, Brian Vazquez wrote:
> 
> We could also modify get_next_key behaviour _only_ when it's called
> from a dumping function in which case we do know that we want to move
> forward not backwards (basically if prev_key is not found, then
> retrieve the first key in the next bucket).
> 
> That approach might miss entries that are in the same bucket where the
> prev_key used to be, but missing entries is something that can always
> happen (new additions in previous buckets), we can not control that
> and as it has said before, if users care about consistency they can
> use map-in-map.

for dump-all case such miss of elements might be ok-ish,
but for delete-all it's probably not.
Imagine bpf prog is doing delete and bcc script is doing delete-all.
With 'go to the next bucket' logic there will be a case where
both kernel and user side are doing delete, but some elements are still left.
True that map-in-map is the answer, but if we're doing new get_next op
we probably should do it with less corner cases.

> > This all requires new per-map implementations unfortunately :-(
> > We were trying to see if we can somehow improve the existing bpf_map_ops
> > to be more friendly towards batching.
> 
> Agreed that one of the motivations for current batching implementation
> was to re use the existing code as much as we can. Although this might
> be a good opportunity to do some per-map implementations as long as
> they behave better than the current ones.

I don't think non reuse of get_next is a big deal.
Adding another get_next_v2 to all maps is a low cost comparing
with advantages of having stable iterate api.

stable walk over hash map is imo more important feature of api than
perf improvement brought by batching.

^ permalink raw reply

* Re: [PATCH v2 bpf-next 2/3] bpf: implement CAP_BPF
From: Alexei Starovoitov @ 2019-09-04  1:39 UTC (permalink / raw)
  To: nicolas.dichtel@6wind.com, Alexei Starovoitov, Daniel Borkmann
  Cc: Alexei Starovoitov, luto@amacapital.net, davem@davemloft.net,
	peterz@infradead.org, rostedt@goodmis.org, netdev@vger.kernel.org,
	bpf@vger.kernel.org, Kernel Team, linux-api@vger.kernel.org
In-Reply-To: <59ac111e-7ce7-5e00-32c9-9b55482fe701@6wind.com>

On 8/30/19 8:19 AM, Nicolas Dichtel wrote:
> Le 29/08/2019 à 19:30, Alexei Starovoitov a écrit :
> [snip]
>> These are the links that showing that k8 can delegates caps.
>> Are you saying that you know of folks who specifically
>> delegate cap_sys_admin and cap_net_admin _only_ to a container to run bpf in there?
>>
> Yes, we need cap_sys_admin only to load bpf:
> tc filter add dev eth0 ingress matchall action bpf obj ./tc_test_kern.o sec test
> 
> I'm not sure to understand why cap_net_admin is not enough to run the previous
> command (ie why load is forbidden).

because bpf syscall prog_load command requires cap_sys_admin in
the current implementation.

> I want to avoid sys_admin, thus cap_bpf will be ok. But we need to manage the
> backward compatibility.

re: backward compatibility...
do you know of any case where task is running under userid=nobody
with cap_sys_admin and cap_net_admin in order to do bpf ?

If not then what is the concern about compatibility?

^ permalink raw reply

* Re: [BACKPORT 4.14.y 4/8] net: sctp: fix warning "NULL check before some freeing functions is not needed"
From: Baolin Wang @ 2019-09-04  2:38 UTC (permalink / raw)
  To: Greg KH
  Cc: Marcelo Ricardo Leitner, # 3.4.x, vyasevich, Neil Horman,
	David Miller, hariprasad.kelam, linux-sctp, Networking,
	Arnd Bergmann, Orson Zhai, Vincent Guittot, LKML
In-Reply-To: <20190903183331.GB26562@kroah.com>

On Wed, 4 Sep 2019 at 02:33, Greg KH <greg@kroah.com> wrote:
>
> On Tue, Sep 03, 2019 at 11:52:06AM -0300, Marcelo Ricardo Leitner wrote:
> > On Tue, Sep 03, 2019 at 02:58:16PM +0800, Baolin Wang wrote:
> > > From: Hariprasad Kelam <hariprasad.kelam@gmail.com>
> > >
> > > This patch removes NULL checks before calling kfree.
> > >
> > > fixes below issues reported by coccicheck
> > > net/sctp/sm_make_chunk.c:2586:3-8: WARNING: NULL check before some
> > > freeing functions is not needed.
> > > net/sctp/sm_make_chunk.c:2652:3-8: WARNING: NULL check before some
> > > freeing functions is not needed.
> > > net/sctp/sm_make_chunk.c:2667:3-8: WARNING: NULL check before some
> > > freeing functions is not needed.
> > > net/sctp/sm_make_chunk.c:2684:3-8: WARNING: NULL check before some
> > > freeing functions is not needed.
> >
> > Hi. This doesn't seem the kind of patch that should be backported to
> > such old/stable releases. After all, it's just a cleanup.
>
> I agree, this does not seem necessary _unless_ it is needed for a later
> real fix.

It can remove warnings from our product kernel since this patch
(c4964bfaf433 sctp: Free cookie before we memdup a new one) was merged
into stable, we still need backport it to our product kernel manually.

But if you still think this is unnecessary, please ignore this patch.
Thanks for your comments.

-- 
Baolin Wang
Best Regards

^ permalink raw reply

* [PATCH 1/3] ixgbe: Use kzfree() rather than its implementation.
From: zhong jiang @ 2019-09-04  2:39 UTC (permalink / raw)
  To: davem, anna.schumaker, trond.myklebust; +Cc: zhongjiang, netdev, linux-kernel
In-Reply-To: <1567564752-6430-1-git-send-email-zhongjiang@huawei.com>

Use kzfree() instead of memset() + kfree().

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 31629fc..113f608 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -960,11 +960,9 @@ int ixgbe_ipsec_vf_add_sa(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf)
 	return 0;
 
 err_aead:
-	memset(xs->aead, 0, sizeof(*xs->aead));
-	kfree(xs->aead);
+	kzfree(xs->aead);
 err_xs:
-	memset(xs, 0, sizeof(*xs));
-	kfree(xs);
+	kzfree(xs);
 err_out:
 	msgbuf[1] = err;
 	return err;
@@ -1049,8 +1047,7 @@ int ixgbe_ipsec_vf_del_sa(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf)
 	ixgbe_ipsec_del_sa(xs);
 
 	/* remove the xs that was made-up in the add request */
-	memset(xs, 0, sizeof(*xs));
-	kfree(xs);
+	kzfree(xs);
 
 	return 0;
 }
-- 
1.7.12.4


^ permalink raw reply related

* [PATCH 0/3] net: Use kzfree() directly
From: zhong jiang @ 2019-09-04  2:39 UTC (permalink / raw)
  To: davem, anna.schumaker, trond.myklebust; +Cc: zhongjiang, netdev, linux-kernel

With the help of Coccinelle. We find some place to replace.

@@
expression M, S;
@@

- memset(M, 0, S);
- kfree(M);
+ kzfree(M); 

zhong jiang (3):
  ixgbe: Use kzfree() rather than its implementation.
  sunrpc: Use kzfree rather than its implementation.
  net: mpoa: Use kzfree rather than its implementation.

 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 9 +++------
 net/atm/mpoa_caches.c                          | 6 ++----
 net/sunrpc/auth_gss/gss_krb5_keys.c            | 9 +++------
 3 files changed, 8 insertions(+), 16 deletions(-)

-- 
1.7.12.4

^ permalink raw reply

* [PATCH 2/3] sunrpc: Use kzfree rather than its implementation.
From: zhong jiang @ 2019-09-04  2:39 UTC (permalink / raw)
  To: davem, anna.schumaker, trond.myklebust; +Cc: zhongjiang, netdev, linux-kernel
In-Reply-To: <1567564752-6430-1-git-send-email-zhongjiang@huawei.com>

Use kzfree instead of memset() + kfree().

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 net/sunrpc/auth_gss/gss_krb5_keys.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/sunrpc/auth_gss/gss_krb5_keys.c b/net/sunrpc/auth_gss/gss_krb5_keys.c
index 550fdf1..3b7f721 100644
--- a/net/sunrpc/auth_gss/gss_krb5_keys.c
+++ b/net/sunrpc/auth_gss/gss_krb5_keys.c
@@ -228,14 +228,11 @@ u32 krb5_derive_key(const struct gss_krb5_enctype *gk5e,
 	ret = 0;
 
 err_free_raw:
-	memset(rawkey, 0, keybytes);
-	kfree(rawkey);
+	kzfree(rawkey);
 err_free_out:
-	memset(outblockdata, 0, blocksize);
-	kfree(outblockdata);
+	kzfree(outblockdata);
 err_free_in:
-	memset(inblockdata, 0, blocksize);
-	kfree(inblockdata);
+	kzfree(inblockdata);
 err_free_cipher:
 	crypto_free_sync_skcipher(cipher);
 err_return:
-- 
1.7.12.4


^ permalink raw reply related

* [PATCH 3/3] net: mpoa: Use kzfree rather than its implementation.
From: zhong jiang @ 2019-09-04  2:39 UTC (permalink / raw)
  To: davem, anna.schumaker, trond.myklebust; +Cc: zhongjiang, netdev, linux-kernel
In-Reply-To: <1567564752-6430-1-git-send-email-zhongjiang@huawei.com>

Use kzfree instead of memset() + kfree().

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 net/atm/mpoa_caches.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/atm/mpoa_caches.c b/net/atm/mpoa_caches.c
index 4bb4183..3286f9d 100644
--- a/net/atm/mpoa_caches.c
+++ b/net/atm/mpoa_caches.c
@@ -180,8 +180,7 @@ static int cache_hit(in_cache_entry *entry, struct mpoa_client *mpc)
 static void in_cache_put(in_cache_entry *entry)
 {
 	if (refcount_dec_and_test(&entry->use)) {
-		memset(entry, 0, sizeof(in_cache_entry));
-		kfree(entry);
+		kzfree(entry);
 	}
 }
 
@@ -416,8 +415,7 @@ static eg_cache_entry *eg_cache_get_by_src_ip(__be32 ipaddr,
 static void eg_cache_put(eg_cache_entry *entry)
 {
 	if (refcount_dec_and_test(&entry->use)) {
-		memset(entry, 0, sizeof(eg_cache_entry));
-		kfree(entry);
+		kzfree(entry);
 	}
 }
 
-- 
1.7.12.4


^ permalink raw reply related

* Re: [RFC v3] vhost: introduce mdev based hardware vhost backend
From: Tiwei Bie @ 2019-09-04  2:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: jasowang, alex.williamson, maxime.coquelin, linux-kernel, kvm,
	virtualization, netdev, dan.daly, cunming.liang, zhihong.wang,
	lingshan.zhu
In-Reply-To: <20190903043704-mutt-send-email-mst@kernel.org>

On Tue, Sep 03, 2019 at 07:26:03AM -0400, Michael S. Tsirkin wrote:
> On Wed, Aug 28, 2019 at 01:37:12PM +0800, Tiwei Bie wrote:
> > Details about this can be found here:
> > 
> > https://lwn.net/Articles/750770/
> > 
> > What's new in this version
> > ==========================
> > 
> > There are three choices based on the discussion [1] in RFC v2:
> > 
> > > #1. We expose a VFIO device, so we can reuse the VFIO container/group
> > >     based DMA API and potentially reuse a lot of VFIO code in QEMU.
> > >
> > >     But in this case, we have two choices for the VFIO device interface
> > >     (i.e. the interface on top of VFIO device fd):
> > >
> > >     A) we may invent a new vhost protocol (as demonstrated by the code
> > >        in this RFC) on VFIO device fd to make it work in VFIO's way,
> > >        i.e. regions and irqs.
> > >
> > >     B) Or as you proposed, instead of inventing a new vhost protocol,
> > >        we can reuse most existing vhost ioctls on the VFIO device fd
> > >        directly. There should be no conflicts between the VFIO ioctls
> > >        (type is 0x3B) and VHOST ioctls (type is 0xAF) currently.
> > >
> > > #2. Instead of exposing a VFIO device, we may expose a VHOST device.
> > >     And we will introduce a new mdev driver vhost-mdev to do this.
> > >     It would be natural to reuse the existing kernel vhost interface
> > >     (ioctls) on it as much as possible. But we will need to invent
> > >     some APIs for DMA programming (reusing VHOST_SET_MEM_TABLE is a
> > >     choice, but it's too heavy and doesn't support vIOMMU by itself).
> > 
> > This version is more like a quick PoC to try Jason's proposal on
> > reusing vhost ioctls. And the second way (#1/B) in above three
> > choices was chosen in this version to demonstrate the idea quickly.
> > 
> > Now the userspace API looks like this:
> > 
> > - VFIO's container/group based IOMMU API is used to do the
> >   DMA programming.
> > 
> > - Vhost's existing ioctls are used to setup the device.
> > 
> > And the device will report device_api as "vfio-vhost".
> > 
> > Note that, there are dirty hacks in this version. If we decide to
> > go this way, some refactoring in vhost.c/vhost.h may be needed.
> > 
> > PS. The direct mapping of the notify registers isn't implemented
> >     in this version.
> > 
> > [1] https://lkml.org/lkml/2019/7/9/101
> > 
> > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> 
> ....
> 
> > +long vhost_mdev_ioctl(struct mdev_device *mdev, unsigned int cmd,
> > +		      unsigned long arg)
> > +{
> > +	void __user *argp = (void __user *)arg;
> > +	struct vhost_mdev *vdpa;
> > +	unsigned long minsz;
> > +	int ret = 0;
> > +
> > +	if (!mdev)
> > +		return -EINVAL;
> > +
> > +	vdpa = mdev_get_drvdata(mdev);
> > +	if (!vdpa)
> > +		return -ENODEV;
> > +
> > +	switch (cmd) {
> > +	case VFIO_DEVICE_GET_INFO:
> > +	{
> > +		struct vfio_device_info info;
> > +
> > +		minsz = offsetofend(struct vfio_device_info, num_irqs);
> > +
> > +		if (copy_from_user(&info, (void __user *)arg, minsz)) {
> > +			ret = -EFAULT;
> > +			break;
> > +		}
> > +
> > +		if (info.argsz < minsz) {
> > +			ret = -EINVAL;
> > +			break;
> > +		}
> > +
> > +		info.flags = VFIO_DEVICE_FLAGS_VHOST;
> > +		info.num_regions = 0;
> > +		info.num_irqs = 0;
> > +
> > +		if (copy_to_user((void __user *)arg, &info, minsz)) {
> > +			ret = -EFAULT;
> > +			break;
> > +		}
> > +
> > +		break;
> > +	}
> > +	case VFIO_DEVICE_GET_REGION_INFO:
> > +	case VFIO_DEVICE_GET_IRQ_INFO:
> > +	case VFIO_DEVICE_SET_IRQS:
> > +	case VFIO_DEVICE_RESET:
> > +		ret = -EINVAL;
> > +		break;
> > +
> > +	case VHOST_MDEV_SET_STATE:
> > +		ret = vhost_set_state(vdpa, argp);
> > +		break;
> > +	case VHOST_GET_FEATURES:
> > +		ret = vhost_get_features(vdpa, argp);
> > +		break;
> > +	case VHOST_SET_FEATURES:
> > +		ret = vhost_set_features(vdpa, argp);
> > +		break;
> > +	case VHOST_GET_VRING_BASE:
> > +		ret = vhost_get_vring_base(vdpa, argp);
> > +		break;
> > +	default:
> > +		ret = vhost_dev_ioctl(&vdpa->dev, cmd, argp);
> > +		if (ret == -ENOIOCTLCMD)
> > +			ret = vhost_vring_ioctl(&vdpa->dev, cmd, argp);
> > +	}
> > +
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(vhost_mdev_ioctl);
> 
> 
> I don't have a problem with this approach. A small question:
> would it make sense to have two fds: send vhost ioctls
> on one and vfio ioctls on another?
> We can then pass vfio fd to the vhost fd with a
> SET_BACKEND ioctl.
> 
> What do you think?

I like this idea! I will give it a try.
So we can introduce /dev/vhost-mdev to have the vhost fd, and let
userspace pass vfio fd to the vhost fd with a SET_BACKEND ioctl.

Thanks a lot!
Tiwei

> 
> -- 
> MST

^ permalink raw reply

* [PATCH] net: hsr: remove an redundant null check before kfree_skb
From: zhong jiang @ 2019-09-04  3:09 UTC (permalink / raw)
  To: davem, arvid.brodin; +Cc: zhongjiang, netdev, linux-kernel

kfree_skb has taken the null pointer into account. Hence just remove
the null check before kfree_skb.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 net/hsr/hsr_forward.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c
index ddd9605..0c9e5b0 100644
--- a/net/hsr/hsr_forward.c
+++ b/net/hsr/hsr_forward.c
@@ -367,10 +367,8 @@ void hsr_forward_skb(struct sk_buff *skb, struct hsr_port *port)
 		port->dev->stats.tx_bytes += skb->len;
 	}
 
-	if (frame.skb_hsr)
-		kfree_skb(frame.skb_hsr);
-	if (frame.skb_std)
-		kfree_skb(frame.skb_std);
+	kfree_skb(frame.skb_hsr);
+	kfree_skb(frame.skb_std);
 	return;
 
 out_drop:
-- 
1.7.12.4


^ permalink raw reply related

* Re: [PATCH] net-ipv6: fix excessive RTF_ADDRCONF flag on ::1/128 local route (and others)
From: Lorenzo Colitti @ 2019-09-04  3:17 UTC (permalink / raw)
  To: David Ahern; +Cc: Maciej Żenczykowski, David S . Miller, Linux NetDev
In-Reply-To: <60b98521-cf3a-1130-896d-2947fc4d5290@gmail.com>

On Wed, Sep 4, 2019 at 4:45 AM David Ahern <dsahern@gmail.com> wrote:
>
> exactly. It was shortsighted of me to add the ADDRCONF flag and removing
> it reverts back to the previous behavior.
>
> When I enable radvd, I do see the flag set when it should be and not for
> other addresses. I believe the patch is correct.

Ah, wait, I was confused. Sorry.

What I was saying is that RTF_ADDRCONF flag should be set on the local
table /128 route for the IPv6 addresses configured by RAs. The way
those are configured is ndisc_router_discovery -> addrconf_prefix_rcv
-> addrconf_prefix_rcv_add_addr -> ipv6_add_addr ->
addrconf_f6i_alloc. Because in this patch, addrconf_f6i_alloc
unconditionally clears RTF_ADDRCONF, I didn't see how the flag could
be set. But I now realize that that was not happening before David's
commit, either: in 5.1 those routes were not created with RTF_ADDRCONF
either.

In other words: before 5.1, the /128 routes in the local table to IPv6
addresses created by SLAAC did not have RTF_ADDRCONF. After David's
commit, they did, because *all* /128 routes to IPv6 addresses had
RTF_ADDRCONF (correct for SLAAC adresses, but definitely incorrect for
manual addresses, loopback, etc.). If this commit is applied, we'll go
back to the 5.1 state. In the future it would be good to ensure that
at least the /128 routes created by SLAAC do have RTF_ADDRCONF, but no
need to do so in this commit.

^ permalink raw reply

* [PATCHv2 net-next] ipmr: remove cache_resolve_queue_len
From: Hangbin Liu @ 2019-09-04  3:34 UTC (permalink / raw)
  To: netdev
  Cc: Phil Karn, Sukumar Gopalakrishnan, David S . Miller,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Hangbin Liu
In-Reply-To: <20190903084359.13310-1-liuhangbin@gmail.com>

This is a re-post of previous patch wrote by David Miller[1].

Phil Karn reported[2] that on busy networks with lots of unresolved
multicast routing entries, the creation of new multicast group routes
can be extremely slow and unreliable.

The reason is we hard-coded multicast route entries with unresolved source
addresses(cache_resolve_queue_len) to 10. If some multicast route never
resolves and the unresolved source addresses increased, there will
be no ability to create new multicast route cache.

To resolve this issue, we need either add a sysctl entry to make the
cache_resolve_queue_len configurable, or just remove cache_resolve_queue_len
directly, as we already have the socket receive queue limits of mrouted
socket, pointed by David.

From my side, I'd perfer to remove the cache_resolve_queue_len instead
of creating two more(IPv4 and IPv6 version) sysctl entry.

[1] https://lkml.org/lkml/2018/7/22/11
[2] https://lkml.org/lkml/2018/7/21/343

v2: hold the mfc_unres_lock while walking the unresolved list in
queue_count(), as Nikolay Aleksandrov remind.

Reported-by: Phil Karn <karn@ka9q.net>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 include/linux/mroute_base.h |  2 --
 net/ipv4/ipmr.c             | 30 +++++++++++++++++++++---------
 net/ipv6/ip6mr.c            | 10 +++-------
 3 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/include/linux/mroute_base.h b/include/linux/mroute_base.h
index 34de06b426ef..50fb89533029 100644
--- a/include/linux/mroute_base.h
+++ b/include/linux/mroute_base.h
@@ -235,7 +235,6 @@ struct mr_table_ops {
  * @mfc_hash: Hash table of all resolved routes for easy lookup
  * @mfc_cache_list: list of resovled routes for possible traversal
  * @maxvif: Identifier of highest value vif currently in use
- * @cache_resolve_queue_len: current size of unresolved queue
  * @mroute_do_assert: Whether to inform userspace on wrong ingress
  * @mroute_do_pim: Whether to receive IGMP PIMv1
  * @mroute_reg_vif_num: PIM-device vif index
@@ -252,7 +251,6 @@ struct mr_table {
 	struct rhltable		mfc_hash;
 	struct list_head	mfc_cache_list;
 	int			maxvif;
-	atomic_t		cache_resolve_queue_len;
 	bool			mroute_do_assert;
 	bool			mroute_do_pim;
 	bool			mroute_do_wrvifwhole;
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index c07bc82cbbe9..4d67c64d94a4 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -744,8 +744,6 @@ static void ipmr_destroy_unres(struct mr_table *mrt, struct mfc_cache *c)
 	struct sk_buff *skb;
 	struct nlmsgerr *e;
 
-	atomic_dec(&mrt->cache_resolve_queue_len);
-
 	while ((skb = skb_dequeue(&c->_c.mfc_un.unres.unresolved))) {
 		if (ip_hdr(skb)->version == 0) {
 			struct nlmsghdr *nlh = skb_pull(skb,
@@ -1133,9 +1131,11 @@ static int ipmr_cache_unresolved(struct mr_table *mrt, vifi_t vifi,
 	}
 
 	if (!found) {
+		bool was_empty;
+
 		/* Create a new entry if allowable */
-		if (atomic_read(&mrt->cache_resolve_queue_len) >= 10 ||
-		    (c = ipmr_cache_alloc_unres()) == NULL) {
+		c = ipmr_cache_alloc_unres();
+		if (!c) {
 			spin_unlock_bh(&mfc_unres_lock);
 
 			kfree_skb(skb);
@@ -1161,11 +1161,11 @@ static int ipmr_cache_unresolved(struct mr_table *mrt, vifi_t vifi,
 			return err;
 		}
 
-		atomic_inc(&mrt->cache_resolve_queue_len);
+		was_empty = list_empty(&mrt->mfc_unres_queue);
 		list_add(&c->_c.list, &mrt->mfc_unres_queue);
 		mroute_netlink_event(mrt, c, RTM_NEWROUTE);
 
-		if (atomic_read(&mrt->cache_resolve_queue_len) == 1)
+		if (was_empty)
 			mod_timer(&mrt->ipmr_expire_timer,
 				  c->_c.mfc_un.unres.expires);
 	}
@@ -1272,7 +1272,6 @@ static int ipmr_mfc_add(struct net *net, struct mr_table *mrt,
 		if (uc->mfc_origin == c->mfc_origin &&
 		    uc->mfc_mcastgrp == c->mfc_mcastgrp) {
 			list_del(&_uc->list);
-			atomic_dec(&mrt->cache_resolve_queue_len);
 			found = true;
 			break;
 		}
@@ -1328,7 +1327,7 @@ static void mroute_clean_tables(struct mr_table *mrt, int flags)
 	}
 
 	if (flags & MRT_FLUSH_MFC) {
-		if (atomic_read(&mrt->cache_resolve_queue_len) != 0) {
+		if (!list_empty(&mrt->mfc_unres_queue)) {
 			spin_lock_bh(&mfc_unres_lock);
 			list_for_each_entry_safe(c, tmp, &mrt->mfc_unres_queue, list) {
 				list_del(&c->list);
@@ -2750,9 +2749,22 @@ static int ipmr_rtm_route(struct sk_buff *skb, struct nlmsghdr *nlh,
 		return ipmr_mfc_delete(tbl, &mfcc, parent);
 }
 
+static int queue_count(struct mr_table *mrt)
+{
+	struct list_head *pos;
+	int count = 0;
+
+	spin_lock_bh(&mfc_unres_lock);
+	list_for_each(pos, &mrt->mfc_unres_queue)
+		count++;
+	spin_unlock_bh(&mfc_unres_lock);
+
+	return count;
+}
+
 static bool ipmr_fill_table(struct mr_table *mrt, struct sk_buff *skb)
 {
-	u32 queue_len = atomic_read(&mrt->cache_resolve_queue_len);
+	u32 queue_len = queue_count(mrt);
 
 	if (nla_put_u32(skb, IPMRA_TABLE_ID, mrt->id) ||
 	    nla_put_u32(skb, IPMRA_TABLE_CACHE_RES_QUEUE_LEN, queue_len) ||
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index e80d36c5073d..d02f0f903559 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -768,8 +768,6 @@ static void ip6mr_destroy_unres(struct mr_table *mrt, struct mfc6_cache *c)
 	struct net *net = read_pnet(&mrt->net);
 	struct sk_buff *skb;
 
-	atomic_dec(&mrt->cache_resolve_queue_len);
-
 	while ((skb = skb_dequeue(&c->_c.mfc_un.unres.unresolved)) != NULL) {
 		if (ipv6_hdr(skb)->version == 0) {
 			struct nlmsghdr *nlh = skb_pull(skb,
@@ -1148,8 +1146,8 @@ static int ip6mr_cache_unresolved(struct mr_table *mrt, mifi_t mifi,
 		 *	Create a new entry if allowable
 		 */
 
-		if (atomic_read(&mrt->cache_resolve_queue_len) >= 10 ||
-		    (c = ip6mr_cache_alloc_unres()) == NULL) {
+		c = ip6mr_cache_alloc_unres();
+		if (!c) {
 			spin_unlock_bh(&mfc_unres_lock);
 
 			kfree_skb(skb);
@@ -1176,7 +1174,6 @@ static int ip6mr_cache_unresolved(struct mr_table *mrt, mifi_t mifi,
 			return err;
 		}
 
-		atomic_inc(&mrt->cache_resolve_queue_len);
 		list_add(&c->_c.list, &mrt->mfc_unres_queue);
 		mr6_netlink_event(mrt, c, RTM_NEWROUTE);
 
@@ -1468,7 +1465,6 @@ static int ip6mr_mfc_add(struct net *net, struct mr_table *mrt,
 		if (ipv6_addr_equal(&uc->mf6c_origin, &c->mf6c_origin) &&
 		    ipv6_addr_equal(&uc->mf6c_mcastgrp, &c->mf6c_mcastgrp)) {
 			list_del(&_uc->list);
-			atomic_dec(&mrt->cache_resolve_queue_len);
 			found = true;
 			break;
 		}
@@ -1526,7 +1522,7 @@ static void mroute_clean_tables(struct mr_table *mrt, int flags)
 	}
 
 	if (flags & MRT6_FLUSH_MFC) {
-		if (atomic_read(&mrt->cache_resolve_queue_len) != 0) {
+		if (!list_empty(&mrt->mfc_unres_queue)) {
 			spin_lock_bh(&mfc_unres_lock);
 			list_for_each_entry_safe(c, tmp, &mrt->mfc_unres_queue, list) {
 				list_del(&c->list);
-- 
2.19.2


^ permalink raw reply related

* [PATCH 0/3] net: remove an redundant continue
From: zhong jiang @ 2019-09-04  3:46 UTC (permalink / raw)
  To: davem, kvalo, pkshih; +Cc: zhongjiang, netdev, linux-kernel

With the help of Coccinelle. we find some place to replace.

@@

for (...;...;...) {
   ...
   if (...) {
     ...
-   continue;
   }
}


zhong jiang (3):
  rtlwifi: Remove an unnecessary continue in
    _rtl8723be_phy_config_bb_with_pgheaderfile
  nfp: Drop unnecessary continue in nfp_net_pf_alloc_vnics
  ath10k: Drop unnecessary continue in ath10k_mac_update_vif_chan

 drivers/net/ethernet/netronome/nfp/nfp_net_main.c    | 4 +---
 drivers/net/wireless/ath/ath10k/mac.c                | 4 +---
 drivers/net/wireless/realtek/rtlwifi/rtl8723be/phy.c | 1 -
 3 files changed, 2 insertions(+), 7 deletions(-)

-- 
1.7.12.4


^ permalink raw reply

* [PATCH 3/3] ath10k: Drop unnecessary continue in ath10k_mac_update_vif_chan
From: zhong jiang @ 2019-09-04  3:46 UTC (permalink / raw)
  To: davem, kvalo, pkshih; +Cc: zhongjiang, netdev, linux-kernel
In-Reply-To: <1567568784-9669-1-git-send-email-zhongjiang@huawei.com>

Continue is not needed at the bottom of a loop. Hence just remove it.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 drivers/net/wireless/ath/ath10k/mac.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 12dad65..91e4635 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -7804,11 +7804,9 @@ static int ath10k_ampdu_action(struct ieee80211_hw *hw,
 			continue;
 
 		ret = ath10k_wmi_vdev_down(ar, arvif->vdev_id);
-		if (ret) {
+		if (ret)
 			ath10k_warn(ar, "failed to down vdev %d: %d\n",
 				    arvif->vdev_id, ret);
-			continue;
-		}
 	}
 
 	/* All relevant vdevs are downed and associated channel resources
-- 
1.7.12.4


^ permalink raw reply related

* [PATCH 2/3] nfp: Drop unnecessary continue in nfp_net_pf_alloc_vnics
From: zhong jiang @ 2019-09-04  3:46 UTC (permalink / raw)
  To: davem, kvalo, pkshih; +Cc: zhongjiang, netdev, linux-kernel
In-Reply-To: <1567568784-9669-1-git-send-email-zhongjiang@huawei.com>

Continue is not needed at the bottom of a loop.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_net_main.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_main.c b/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
index 986464d..68db47d 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
@@ -205,10 +205,8 @@ static void nfp_net_pf_free_vnics(struct nfp_pf *pf)
 		ctrl_bar += NFP_PF_CSR_SLICE_SIZE;
 
 		/* Kill the vNIC if app init marked it as invalid */
-		if (nn->port && nn->port->type == NFP_PORT_INVALID) {
+		if (nn->port && nn->port->type == NFP_PORT_INVALID)
 			nfp_net_pf_free_vnic(pf, nn);
-			continue;
-		}
 	}
 
 	if (list_empty(&pf->vnics))
-- 
1.7.12.4


^ permalink raw reply related

* [PATCH 1/3] rtlwifi: Remove an unnecessary continue in _rtl8723be_phy_config_bb_with_pgheaderfile
From: zhong jiang @ 2019-09-04  3:46 UTC (permalink / raw)
  To: davem, kvalo, pkshih; +Cc: zhongjiang, netdev, linux-kernel
In-Reply-To: <1567568784-9669-1-git-send-email-zhongjiang@huawei.com>

Continue is not needed at the bottom of a loop. Hence just drop it.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 drivers/net/wireless/realtek/rtlwifi/rtl8723be/phy.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723be/phy.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723be/phy.c
index aa8a095..4805b84 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8723be/phy.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723be/phy.c
@@ -732,7 +732,6 @@ static bool _rtl8723be_phy_config_bb_with_pgheaderfile(struct ieee80211_hw *hw,
 				else
 					_rtl8723be_store_tx_power_by_rate(hw,
 							v1, v2, v3, v4, v5, v6);
-				continue;
 			}
 		}
 	} else {
-- 
1.7.12.4


^ permalink raw reply related

* RE: [PATCH net-next 1/5] net/tls: use the full sk_proto pointer
From: John Fastabend @ 2019-09-04  4:27 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, oss-drivers, davejwatson, borisp, aviadye, john.fastabend,
	daniel, Jakub Kicinski, John Hurley, Dirk van der Merwe
In-Reply-To: <20190903043106.27570-2-jakub.kicinski@netronome.com>

Jakub Kicinski wrote:
> Since we already have the pointer to the full original sk_proto
> stored use that instead of storing all individual callback
> pointers as well.
> 
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Reviewed-by: John Hurley <john.hurley@netronome.com>
> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
> ---
>  drivers/crypto/chelsio/chtls/chtls_main.c |  6 +++--
>  include/net/tls.h                         | 10 ---------
>  net/tls/tls_main.c                        | 27 +++++++++--------------
>  3 files changed, 14 insertions(+), 29 deletions(-)
> 

I like it should probably do the same to tcp_bpf.c.

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply

* RE: [PATCH net-next 4/5] net/tls: clean up the number of #ifdefs for CONFIG_TLS_DEVICE
From: John Fastabend @ 2019-09-04  4:31 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, oss-drivers, davejwatson, borisp, aviadye, john.fastabend,
	daniel, Jakub Kicinski, John Hurley, Dirk van der Merwe
In-Reply-To: <20190903043106.27570-5-jakub.kicinski@netronome.com>

Jakub Kicinski wrote:
> TLS code has a number of #ifdefs which make the code a little
> harder to follow. Recent fixes removed the ifdef around the
> TLS_HW define, so we can switch to the often used pattern
> of defining tls_device functions as empty static inlines
> in the header when CONFIG_TLS_DEVICE=n.
> 
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Reviewed-by: John Hurley <john.hurley@netronome.com>
> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
> ---
>  include/net/tls.h  | 38 ++++++++++++++++++++++++++++++++------
>  net/tls/tls_main.c | 19 +------------------
>  net/tls/tls_sw.c   |  6 ++----
>  3 files changed, 35 insertions(+), 28 deletions(-)

Thanks I've been meaning to do this I agree it looks nicer.

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply

* Re: [RFC v3] vhost: introduce mdev based hardware vhost backend
From: Jason Wang @ 2019-09-04  4:32 UTC (permalink / raw)
  To: Tiwei Bie, Michael S. Tsirkin
  Cc: alex.williamson, maxime.coquelin, linux-kernel, kvm,
	virtualization, netdev, dan.daly, cunming.liang, zhihong.wang,
	lingshan.zhu
In-Reply-To: <20190904024801.GA5671@___>


On 2019/9/4 上午10:48, Tiwei Bie wrote:
> On Tue, Sep 03, 2019 at 07:26:03AM -0400, Michael S. Tsirkin wrote:
>> On Wed, Aug 28, 2019 at 01:37:12PM +0800, Tiwei Bie wrote:
>>> Details about this can be found here:
>>>
>>> https://lwn.net/Articles/750770/
>>>
>>> What's new in this version
>>> ==========================
>>>
>>> There are three choices based on the discussion [1] in RFC v2:
>>>
>>>> #1. We expose a VFIO device, so we can reuse the VFIO container/group
>>>>      based DMA API and potentially reuse a lot of VFIO code in QEMU.
>>>>
>>>>      But in this case, we have two choices for the VFIO device interface
>>>>      (i.e. the interface on top of VFIO device fd):
>>>>
>>>>      A) we may invent a new vhost protocol (as demonstrated by the code
>>>>         in this RFC) on VFIO device fd to make it work in VFIO's way,
>>>>         i.e. regions and irqs.
>>>>
>>>>      B) Or as you proposed, instead of inventing a new vhost protocol,
>>>>         we can reuse most existing vhost ioctls on the VFIO device fd
>>>>         directly. There should be no conflicts between the VFIO ioctls
>>>>         (type is 0x3B) and VHOST ioctls (type is 0xAF) currently.
>>>>
>>>> #2. Instead of exposing a VFIO device, we may expose a VHOST device.
>>>>      And we will introduce a new mdev driver vhost-mdev to do this.
>>>>      It would be natural to reuse the existing kernel vhost interface
>>>>      (ioctls) on it as much as possible. But we will need to invent
>>>>      some APIs for DMA programming (reusing VHOST_SET_MEM_TABLE is a
>>>>      choice, but it's too heavy and doesn't support vIOMMU by itself).
>>> This version is more like a quick PoC to try Jason's proposal on
>>> reusing vhost ioctls. And the second way (#1/B) in above three
>>> choices was chosen in this version to demonstrate the idea quickly.
>>>
>>> Now the userspace API looks like this:
>>>
>>> - VFIO's container/group based IOMMU API is used to do the
>>>    DMA programming.
>>>
>>> - Vhost's existing ioctls are used to setup the device.
>>>
>>> And the device will report device_api as "vfio-vhost".
>>>
>>> Note that, there are dirty hacks in this version. If we decide to
>>> go this way, some refactoring in vhost.c/vhost.h may be needed.
>>>
>>> PS. The direct mapping of the notify registers isn't implemented
>>>      in this version.
>>>
>>> [1] https://lkml.org/lkml/2019/7/9/101
>>>
>>> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
>> ....
>>
>>> +long vhost_mdev_ioctl(struct mdev_device *mdev, unsigned int cmd,
>>> +		      unsigned long arg)
>>> +{
>>> +	void __user *argp = (void __user *)arg;
>>> +	struct vhost_mdev *vdpa;
>>> +	unsigned long minsz;
>>> +	int ret = 0;
>>> +
>>> +	if (!mdev)
>>> +		return -EINVAL;
>>> +
>>> +	vdpa = mdev_get_drvdata(mdev);
>>> +	if (!vdpa)
>>> +		return -ENODEV;
>>> +
>>> +	switch (cmd) {
>>> +	case VFIO_DEVICE_GET_INFO:
>>> +	{
>>> +		struct vfio_device_info info;
>>> +
>>> +		minsz = offsetofend(struct vfio_device_info, num_irqs);
>>> +
>>> +		if (copy_from_user(&info, (void __user *)arg, minsz)) {
>>> +			ret = -EFAULT;
>>> +			break;
>>> +		}
>>> +
>>> +		if (info.argsz < minsz) {
>>> +			ret = -EINVAL;
>>> +			break;
>>> +		}
>>> +
>>> +		info.flags = VFIO_DEVICE_FLAGS_VHOST;
>>> +		info.num_regions = 0;
>>> +		info.num_irqs = 0;
>>> +
>>> +		if (copy_to_user((void __user *)arg, &info, minsz)) {
>>> +			ret = -EFAULT;
>>> +			break;
>>> +		}
>>> +
>>> +		break;
>>> +	}
>>> +	case VFIO_DEVICE_GET_REGION_INFO:
>>> +	case VFIO_DEVICE_GET_IRQ_INFO:
>>> +	case VFIO_DEVICE_SET_IRQS:
>>> +	case VFIO_DEVICE_RESET:
>>> +		ret = -EINVAL;
>>> +		break;
>>> +
>>> +	case VHOST_MDEV_SET_STATE:
>>> +		ret = vhost_set_state(vdpa, argp);
>>> +		break;
>>> +	case VHOST_GET_FEATURES:
>>> +		ret = vhost_get_features(vdpa, argp);
>>> +		break;
>>> +	case VHOST_SET_FEATURES:
>>> +		ret = vhost_set_features(vdpa, argp);
>>> +		break;
>>> +	case VHOST_GET_VRING_BASE:
>>> +		ret = vhost_get_vring_base(vdpa, argp);
>>> +		break;
>>> +	default:
>>> +		ret = vhost_dev_ioctl(&vdpa->dev, cmd, argp);
>>> +		if (ret == -ENOIOCTLCMD)
>>> +			ret = vhost_vring_ioctl(&vdpa->dev, cmd, argp);
>>> +	}
>>> +
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL(vhost_mdev_ioctl);
>>
>> I don't have a problem with this approach. A small question:
>> would it make sense to have two fds: send vhost ioctls
>> on one and vfio ioctls on another?
>> We can then pass vfio fd to the vhost fd with a
>> SET_BACKEND ioctl.
>>
>> What do you think?
> I like this idea! I will give it a try.
> So we can introduce /dev/vhost-mdev to have the vhost fd,


You still need to think about how to connect it to current sysfs based 
mdev management interface, or you want to invent another API, or just 
use the /dev/vhost-net but pass vfio fd through ioctl to the file.

Thanks


>   and let
> userspace pass vfio fd to the vhost fd with a SET_BACKEND ioctl.
>
> Thanks a lot!
> Tiwei
>
>> -- 
>> MST

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox