Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v3 20/24] wireless: Remove call to memset after dma_alloc_coherent
From: Kalle Valo @ 2019-07-15  9:06 UTC (permalink / raw)
  To: Fuqian Huang
  Cc: David S . Miller, Arend van Spriel, Franky Lin, Hante Meuleman,
	Chi-Hsien Lin, Wright Feng, Igor Mitsyanko, Avinash Patil,
	Sergey Matyukevich, ath10k, linux-wireless, netdev, linux-kernel,
	brcm80211-dev-list.pdl, brcm80211-dev-list
In-Reply-To: <20190715031941.7120-1-huangfq.daxian@gmail.com>

Fuqian Huang <huangfq.daxian@gmail.com> writes:

> In commit 518a2f1925c3
> ("dma-mapping: zero memory returned from dma_alloc_*"),
> dma_alloc_coherent has already zeroed the memory.
> So memset is not needed.
>
> Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
> ---
> Changes in v3:
>   - Use actual commit rather than the merge commit in the commit message
>
>  drivers/net/wireless/ath/ath10k/ce.c                     | 5 -----
>  drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c  | 2 --
>  drivers/net/wireless/quantenna/qtnfmac/pcie/pearl_pcie.c | 2 --
>  drivers/net/wireless/quantenna/qtnfmac/pcie/topaz_pcie.c | 2 --
>  4 files changed, 11 deletions(-)

Via which tree is this supposed to go? Can I take this to
wireless-drivers-next?

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH v3 20/24] wireless: Remove call to memset after dma_alloc_coherent
From: Kalle Valo @ 2019-07-15  9:07 UTC (permalink / raw)
  To: Fuqian Huang
  Cc: David S . Miller, Arend van Spriel, Franky Lin, Hante Meuleman,
	Chi-Hsien Lin, Wright Feng, Igor Mitsyanko, Avinash Patil,
	Sergey Matyukevich, ath10k, linux-wireless, netdev, linux-kernel,
	brcm80211-dev-list.pdl, brcm80211-dev-list
In-Reply-To: <20190715031941.7120-1-huangfq.daxian@gmail.com>

Fuqian Huang <huangfq.daxian@gmail.com> writes:

> In commit 518a2f1925c3
> ("dma-mapping: zero memory returned from dma_alloc_*"),
> dma_alloc_coherent has already zeroed the memory.
> So memset is not needed.
>
> Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
> ---
> Changes in v3:
>   - Use actual commit rather than the merge commit in the commit message
>
>  drivers/net/wireless/ath/ath10k/ce.c                     | 5 -----
>  drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c  | 2 --
>  drivers/net/wireless/quantenna/qtnfmac/pcie/pearl_pcie.c | 2 --
>  drivers/net/wireless/quantenna/qtnfmac/pcie/topaz_pcie.c | 2 --
>  4 files changed, 11 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath10k/ce.c b/drivers/net/wireless/ath/ath10k/ce.c
> index eca87f7c5b6c..294fbc1e89ab 100644
> --- a/drivers/net/wireless/ath/ath10k/ce.c
> +++ b/drivers/net/wireless/ath/ath10k/ce.c
> @@ -1704,9 +1704,6 @@ ath10k_ce_alloc_dest_ring_64(struct ath10k *ar, unsigned int ce_id,
>  	/* Correctly initialize memory to 0 to prevent garbage
>  	 * data crashing system when download firmware
>  	 */
> -	memset(dest_ring->base_addr_owner_space_unaligned, 0,
> -	       nentries * sizeof(struct ce_desc_64) + CE_DESC_RING_ALIGN);

Shouldn't you also remove the comment?

-- 
Kalle Valo

^ permalink raw reply

* [PATCH bpf] samples/bpf: build with -D__TARGET_ARCH_$(SRCARCH)
From: Ilya Leoshkevich @ 2019-07-15  9:11 UTC (permalink / raw)
  To: bpf, netdev; +Cc: gor, heiko.carstens, Ilya Leoshkevich

While $ARCH can be relatively flexible (see Makefile and
tools/scripts/Makefile.arch), $SRCARCH always corresponds to a directory
name under arch/.

Therefore, build samples with -D__TARGET_ARCH_$(SRCARCH), since that
matches the expectations of bpf_helpers.h.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
---
 samples/bpf/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index f90daadfbc89..1d9be26b4edd 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -284,7 +284,7 @@ $(obj)/%.o: $(src)/%.c
 	$(Q)$(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) -I$(obj) \
 		-I$(srctree)/tools/testing/selftests/bpf/ \
 		-D__KERNEL__ -D__BPF_TRACING__ -Wno-unused-value -Wno-pointer-sign \
-		-D__TARGET_ARCH_$(ARCH) -Wno-compare-distinct-pointer-types \
+		-D__TARGET_ARCH_$(SRCARCH) -Wno-compare-distinct-pointer-types \
 		-Wno-gnu-variable-sized-type-not-at-end \
 		-Wno-address-of-packed-member -Wno-tautological-compare \
 		-Wno-unknown-warning-option $(CLANG_ARCH_ARGS) \
-- 
2.21.0


^ permalink raw reply related

* Re: [PATCH] ethtool: igb: dump RR2DCDELAY register
From: Michal Kubecek @ 2019-07-15  9:13 UTC (permalink / raw)
  To: Artem Bityutskiy; +Cc: John W . Linville, netdev
In-Reply-To: <20190715065228.88377-1-artem.bityutskiy@linux.intel.com>

On Mon, Jul 15, 2019 at 09:52:28AM +0300, Artem Bityutskiy wrote:
> Decode 'RR2DCDELAY' register which Linux kernel provides starting from version
> 5.3. The corresponding commit in the Linux kernel is:
>     cd502a7f7c9c igb: add RR2DCDELAY to ethtool registers dump
> 
> The RR2DCDELAY register is present in I210 and I211 Intel Gigabit Ethernet
> chips and it stands for "Read Request To Data Completion Delay". Here is how
> this register is described in the I210 datasheet:
> 
> "This field captures the maximum PCIe split time in 16 ns units, which is the
> maximum delay between the read request to the first data completion. This is
> giving an estimation of the PCIe round trip time."
> 
> In practice, this register can be used to measure the time it takes the NIC to
> read data from the host memory.
> 
> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> ---
>  igb.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/igb.c b/igb.c
> index e0ccef9..ab0896f 100644
> --- a/igb.c
> +++ b/igb.c
> @@ -859,6 +859,18 @@ igb_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs)
>  		"0x03430: TDFPC       (Tx data FIFO packet count)      0x%08X\n",
>  		regs_buff[550]);
>  
> +	/*
> +	 * Starting from kernel version 5.3 the registers dump buffer grew from
> +	 * 739 4-byte words to 740 words, and word 740 contains the RR2DCDLAY

nit: "E" missing here:                                                    ^

> +	 * register.
> +	 */
> +	if (regs->len < 740)
> +		return 0;
> +
> +	fprintf(stdout,
> +		"0x05BF4: RR2DCDELAY  (Max. DMA read delay)            0x%08X\n",
> +		regs_buff[739]);

Showing a delay as hex number doesn't seem very user friendly but that
also applies to many existing registers so it's probably better to be
consistent and perhaps do an overall cleanup later.

Michal

^ permalink raw reply

* Re: [RFC] virtio-net: share receive_*() and add_recvbuf_*() with virtio-vsock
From: Jason Wang @ 2019-07-15  9:16 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Michael S. Tsirkin, Stefan Hajnoczi, virtualization, netdev
In-Reply-To: <20190715074416.a3s2i5ausognotbn@steredhat>


>>>>>>>        struct sk_buff *virtskb_receive_small(struct virtskb *vs, ...);
>>>>>>>        struct sk_buff *virtskb_receive_big(struct virtskb *vs, ...);
>>>>>>>        struct sk_buff *virtskb_receive_mergeable(struct virtskb *vs, ...);
>>>>>>>
>>>>>>>        int virtskb_add_recvbuf_small(struct virtskb*vs, ...);
>>>>>>>        int virtskb_add_recvbuf_big(struct virtskb *vs, ...);
>>>>>>>        int virtskb_add_recvbuf_mergeable(struct virtskb *vs, ...);
>>>>>>>
>>>>>>> For the Guest->Host path it should be easier, so maybe I can add a
>>>>>>> "virtskb_send(struct virtskb *vs, struct sk_buff *skb)" with a part of the code
>>>>>>> of xmit_skb().
>>>>>> I may miss something, but I don't see any thing that prevents us from using
>>>>>> xmit_skb() directly.
>>>>>>
>>>>> Yes, but my initial idea was to make it more parametric and not related to the
>>>>> virtio_net_hdr, so the 'hdr_len' could be a parameter and the
>>>>> 'num_buffers' should be handled by the caller.
>>>>>
>>>>>>> Let me know if you have in mind better names or if I should put these function
>>>>>>> in another place.
>>>>>>>
>>>>>>> I would like to leave the control part completely separate, so, for example,
>>>>>>> the two drivers will negotiate the features independently and they will call
>>>>>>> the right virtskb_receive_*() function based on the negotiation.
>>>>>> If it's one the issue of negotiation, we can simply change the
>>>>>> virtnet_probe() to deal with different devices.
>>>>>>
>>>>>>
>>>>>>> I already started to work on it, but before to do more steps and send an RFC
>>>>>>> patch, I would like to hear your opinion.
>>>>>>> Do you think that makes sense?
>>>>>>> Do you see any issue or a better solution?
>>>>>> I still think we need to seek a way of adding some codes on virtio-net.c
>>>>>> directly if there's no huge different in the processing of TX/RX. That would
>>>>>> save us a lot time.
>>>>> After the reading of the buffers from the virtqueue I think the process
>>>>> is slightly different, because virtio-net will interface with the network
>>>>> stack, while virtio-vsock will interface with the vsock-core (socket).
>>>>> So the virtio-vsock implements the following:
>>>>> - control flow mechanism to avoid to loose packets, informing the peer
>>>>>     about the amount of memory available in the receive queue using some
>>>>>     fields in the virtio_vsock_hdr
>>>>> - de-multiplexing parsing the virtio_vsock_hdr and choosing the right
>>>>>     socket depending on the port
>>>>> - socket state handling
>>
>> I think it's just a branch, for ethernet, go for networking stack. otherwise
>> go for vsock core?
>>
> Yes, that should work.
>
> So, I should refactor the functions that can be called also from the vsock
> core, in order to remove "struct net_device *dev" parameter.
> Maybe creating some wrappers for the network stack.
>
> Otherwise I should create a fake net_device for vsock_core.
>
> What do you suggest?


I'm not quite sure I get the question. Can you just use the one that 
created by virtio_net?


Thanks

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH] e1000e: Make speed detection on hotplugging cable more reliable
From: Kai Heng Feng @ 2019-07-15  9:21 UTC (permalink / raw)
  To: Paul Menzel; +Cc: Jeff Kirsher, netdev, intel-wired-lan, linux-kernel
In-Reply-To: <37a1e2af-64c6-4515-5dcc-6051e1192636@molgen.mpg.de>

at 5:06 PM, Paul Menzel <pmenzel@molgen.mpg.de> wrote:

> Dear Kai Heng,
>
>
> (with or without hyphen?)
>
> On 7/15/19 11:00 AM, Kai Heng Feng wrote:
>> at 4:52 PM, Paul Menzel <pmenzel@molgen.mpg.de> wrote:
>
>>> On 7/15/19 10:43 AM, Kai-Heng Feng wrote:
>>>> After hotplugging an 1Gbps ethernet cable with 1Gbps link partner, the
>>>> MII_BMSR may reports 10Mbps, renders the network rather slow.
>>>
>>> s/may reports/may report/
>>> s/renders/rendering/
>>
>> Apparently English isn’t my mother tongue ;)
>
> No problem. Mine neither.
>
>>>> The issue has much lower fail rate after commit 59653e6497d1 ("e1000e:
>>>> Make watchdog use delayed work"), which esssentially introduces some
>>>
>>> essentially
>>
>> Ok.
>>
>>>> delay before running the watchdog task.
>>>>
>>>> But there's still a chance that the hotplugging event and the queued
>>>> watchdog task gets run at the same time, then the original issue can be
>>>> observed once again.
>>>>
>>>> So let's use mod_delayed_work() to add a deterministic 1 second delay
>>>> before running watchdog task, after an interrupt.
>>>
>>> I am not clear about the effects for the user. Could you elaborate
>>> please? Does the link now come up up to one second later?
>>
>> Yes, the link will be up on a fixed one second later.
>>
>> The delay varies between 0 to 2 seconds without this patch.
>
> Is there no other fix? Regarding booting a system fast (less than six
> seconds), a fixed one second delay is quite a regression on systems where
> it worked before.

This only affects when ethernet cable is hot plugged.

Kai-Heng

>
>>>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>>>
>>> Any bug URL?
>>
>> If maintainers think it’s necessary then I’ll file one.
>
> Not necessary, if there is none. I thought you had one in Launchpad or so.
>
>
> Kind regards,
>
> Paul



^ permalink raw reply

* Re: [RFC bpf-next 0/8] bpf: accelerate insn patching speed
From: Jiong Wang @ 2019-07-15  9:21 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiong Wang, Alexei Starovoitov, Daniel Borkmann, Edward Cree,
	Naveen N. Rao, Andrii Nakryiko, Jakub Kicinski, bpf, Networking,
	oss-drivers, Yonghong Song
In-Reply-To: <CAEf4BzaPFbYKUQzu7VoRd7idrqPDMEFF=UEmT2pGf+Lxz06+sA@mail.gmail.com>


Andrii Nakryiko writes:

> On Thu, Jul 11, 2019 at 4:22 AM Jiong Wang <jiong.wang@netronome.com> wrote:
>>
>>
>> Andrii Nakryiko writes:
>>
>> > On Thu, Jul 4, 2019 at 2:31 PM Jiong Wang <jiong.wang@netronome.com> wrote:
>> >>
>> >> This is an RFC based on latest bpf-next about acclerating insn patching
>> >> speed, it is now near the shape of final PATCH set, and we could see the
>> >> changes migrating to list patching would brings, so send out for
>> >> comments. Most of the info are in cover letter. I splitted the code in a
>> >> way to show API migration more easily.
>> >
>> >
>> > Hey Jiong,
>> >
>> >
>> > Sorry, took me a while to get to this and learn more about instruction
>> > patching. Overall this looks good and I think is a good direction.
>> > I'll post high-level feedback here, and some more
>> > implementation-specific ones in corresponding patches.
>>
>> Great, thanks very much for the feedbacks. Most of your feedbacks are
>> hitting those pain points I exactly had ran into. For some of them, I
>> thought similar solutions like yours, but failed due to various
>> reasons. Let's go through them again, I could have missed some important
>> things.
>>
>> Please see my replies below.
>
> Thanks for thoughtful reply :)
>
>>
>> >>
>> >> Test Results
>> >> ===
>> >>   - Full pass on test_verifier/test_prog/test_prog_32 under all three
>> >>     modes (interpreter, JIT, JIT with blinding).
>> >>
>> >>   - Benchmarking shows 10 ~ 15x faster on medium sized prog, and reduce
>> >>     patching time from 5100s (nearly one and a half hour) to less than
>> >>     0.5s for 1M insn patching.
>> >>
>> >> Known Issues
>> >> ===
>> >>   - The following warning is triggered when running scale test which
>> >>     contains 1M insns and patching:
>> >>       warning of mm/page_alloc.c:4639 __alloc_pages_nodemask+0x29e/0x330
>> >>
>> >>     This is caused by existing code, it can be reproduced on bpf-next
>> >>     master with jit blinding enabled, then run scale unit test, it will
>> >>     shown up after half an hour. After this set, patching is very fast, so
>> >>     it shows up quickly.
>> >>
>> >>   - No line info adjustment support when doing insn delete, subprog adj
>> >>     is with bug when doing insn delete as well. Generally, removal of insns
>> >>     could possibly cause remove of entire line or subprog, therefore
>> >>     entries of prog->aux->linfo or env->subprog needs to be deleted. I
>> >>     don't have good idea and clean code for integrating this into the
>> >>     linearization code at the moment, will do more experimenting,
>> >>     appreciate ideas and suggestions on this.
>> >
>> > Is there any specific problem to detect which line info to delete? Or
>> > what am I missing besides careful implementation?
>>
>> Mostly line info and subprog info are range info which covers a range of
>> insns. Deleting insns could causing you adjusting the range or removing one
>> range entirely. subprog info could be fully recalcuated during
>> linearization while line info I need some careful implementation and I
>> failed to have clean code for this during linearization also as said no
>> unit tests to help me understand whether the code is correct or not.
>>
>
> Ok, that's good that it's just about clean implementation. Try to
> implement it as clearly as possible. Then post it here, and if it can
> be improved someone (me?) will try to help to clean it up further.
>
> Not a big expert on line info, so can't comment on that,
> unfortunately. Maybe Yonghong can chime in (cc'ed)
>
>
>> I will described this latter, spent too much time writing the following
>> reply. Might worth an separate discussion thread.
>>
>> >>
>> >>     Insn delete doesn't happen on normal programs, for example Cilium
>> >>     benchmarks, and happens rarely on test_progs, so the test coverage is
>> >>     not good. That's also why this RFC have a full pass on selftest with
>> >>     this known issue.
>> >
>> > I hope you'll add test for deletion (and w/ corresponding line info)
>> > in final patch set :)
>>
>> Will try. Need to spend some time on BTF format.
>> >
>> >>
>> >>   - Could further use mem pool to accelerate the speed, changes are trivial
>> >>     on top of this RFC, and could be 2x extra faster. Not included in this
>> >>     RFC as reducing the algo complexity from quadratic to linear of insn
>> >>     number is the first step.
>> >
>> > Honestly, I think that would add more complexity than necessary, and I
>> > think we can further speed up performance without that, see below.
>> >
>> >>
>> >> Background
>> >> ===
>> >> This RFC aims to accelerate BPF insn patching speed, patching means expand
>> >> one bpf insn at any offset inside bpf prog into a set of new insns, or
>> >> remove insns.
>> >>
>> >> At the moment, insn patching is quadratic of insn number, this is due to
>> >> branch targets of jump insns needs to be adjusted, and the algo used is:
>> >>
>> >>   for insn inside prog
>> >>     patch insn + regeneate bpf prog
>> >>     for insn inside new prog
>> >>       adjust jump target
>> >>
>> >> This is causing significant time spending when a bpf prog requires large
>> >> amount of patching on different insns. Benchmarking shows it could take
>> >> more than half minutes to finish patching when patching number is more
>> >> than 50K, and the time spent could be more than one hour when patching
>> >> number is around 1M.
>> >>
>> >>   15000   :    3s
>> >>   45000   :   29s
>> >>   95000   :  125s
>> >>   195000  :  712s
>> >>   1000000 : 5100s
>> >>
>> >> This RFC introduces new patching infrastructure. Before doing insn
>> >> patching, insns in bpf prog are turned into a singly linked list, insert
>> >> new insns just insert new list node, delete insns just set delete flag.
>> >> And finally, the list is linearized back into array, and branch target
>> >> adjustment is done for all jump insns during linearization. This algo
>> >> brings the time complexity from quadratic to linear of insn number.
>> >>
>> >> Benchmarking shows the new patching infrastructure could be 10 ~ 15x faster
>> >> on medium sized prog, and for a 1M patching it reduce the time from 5100s
>> >> to less than 0.5s.
>> >>
>> >> Patching API
>> >> ===
>> >> Insn patching could happen on two layers inside BPF. One is "core layer"
>> >> where only BPF insns are patched. The other is "verification layer" where
>> >> insns have corresponding aux info as well high level subprog info, so
>> >> insn patching means aux info needs to be patched as well, and subprog info
>> >> needs to be adjusted. BPF prog also has debug info associated, so line info
>> >> should always be updated after insn patching.
>> >>
>> >> So, list creation, destroy, insert, delete is the same for both layer,
>> >> but lineration is different. "verification layer" patching require extra
>> >> work. Therefore the patch APIs are:
>> >>
>> >>    list creation:                bpf_create_list_insn
>> >>    list patch:                   bpf_patch_list_insn
>> >>    list pre-patch:               bpf_prepatch_list_insn
>> >
>> > I think pre-patch name is very confusing, until I read full
>> > description I couldn't understand what it's supposed to be used for.
>> > Speaking of bpf_patch_list_insn, patch is also generic enough to leave
>> > me wondering whether instruction buffer is inserted after instruction,
>> > or instruction is replaced with a bunch of instructions.
>> >
>> > So how about two more specific names:
>> > bpf_patch_list_insn -> bpf_list_insn_replace (meaning replace given
>> > instruction with a list of patch instructions)
>> > bpf_prepatch_list_insn -> bpf_list_insn_prepend (well, I think this
>> > one is pretty clear).
>>
>> My sense on English word is not great, will switch to above which indeed
>> reads more clear.
>>
>> >>    list lineration (core layer): prog = bpf_linearize_list_insn(prog, list)
>> >>    list lineration (veri layer): env = verifier_linearize_list_insn(env, list)
>> >
>> > These two functions are both quite involved, as well as share a lot of
>> > common code. I'd rather have one linearize instruction, that takes env
>> > as an optional parameter. If env is specified (which is the case for
>> > all cases except for constant blinding pass), then adjust aux_data and
>> > subprogs along the way.
>>
>> Two version of lineration and how to unify them was a painpoint to me. I
>> thought to factor out some of the common code out, but it actually doesn't
>> count much, the final size counting + insnsi resize parts are the same,
>> then things start to diverge since the "Copy over insn" loop.
>>
>> verifier layer needs to copy and initialize aux data etc. And jump
>> relocation is different. At core layer, the use case is JIT blinding which
>> could expand an jump_imm insn into a and/or/jump_reg sequence, and the
>
> Sorry, I didn't get what "could expand an jump_imm insn into a
> and/or/jump_reg sequence", maybe you can clarify if I'm missing
> something.
>
> But from your cover letter description, core layer has no jumps at
> all, while verifier has jumps inside patch buffer. So, if you support
> jumps inside of patch buffer, it will automatically work for core
> layer. Or what am I missing?

I meant in core layer (JIT blinding), there is the following patching:

input:
  insn 0             insn 0
  insn 1             insn 1
  jmp_imm   >>       mov_imm  \
  insn 2             xor_imm    insn seq expanded from jmp_imm
  insn 3             jmp_reg  /
                     insn 2
                     insn 3


jmp_imm is the insn that will be patched, and the actually transformation
is to expand it into mov_imm/xor_imm/jmp_reg sequence. "jmp_reg", sitting
at the end of the patch buffer, must jump to the same destination as the
original jmp_imm, so "jmp_reg" is an insn inside patch buffer but should
be relocated, and the jump destination is outside of patch buffer.

This means for core layer (jit blinding), it needs to take care of insn
inside patch buffer.
  
> Just compared two version of linearize side by side. From what I can
> see, unified version could look like this, high-level:
>
> 1. Count final insn count (but see my other suggestions how to avoid
> that altogether). If not changed - exit.
> 2. Realloc insn buffer, copy just instructions (not aux_data yet).
> Build idx_map, if necessary.
> 3. (if env) then bpf_patchable_insn has aux_data, so now do another
> pass and copy it into resulting array.
> 4. (if env) Copy sub info. Though I'd see if we can just reuse old
> ones and just adjust offsets. I'm not sure why we need to allocate new
> array, subprogram count shouldn't change, right?

If there is no dead insn elimination opt, then we could just adjust
offsets. When there is insn deleting, I feel the logic becomes more
complex. One subprog could be completely deleted or partially deleted, so
I feel just recalculate the whole subprog info as a side-product is
much simpler.

> 5. (common) Relocate jumps. Not clear why core layer doesn't care
> about PATCHED (or, alternatively, why verifier layer cares).

See above, in this RFC, core layer care PATCHED during relocating jumps,
and verifier layer doesn't.

> And again, with targets pointer it will look totally different (and
> simpler).

Yes, will see how the code looks.

> 6. (if env) adjust subprogs
> 7. (common) Adjust prog's line info.
>
> The devil is in the details, but I think this will still be better if
> contained in one function if a bunch of `if (env)` checks. Still
> pretty linear.
>
>> jump_reg is at the end of the patch buffer, it should be relocated. While
>> all use case in verifier layer, no jump in the prog will be patched and all
>> new jumps in patch buffer will jump inside the buffer locally so no need to
>> resolve.
>>
>> And yes we could unify them into one and control the diverge using
>> argument, but then where to place the function is an issue. My
>> understanding is verifier.c is designed to be on top of core.c and core.c
>> should not reference and no need to be aware of any verifier specific data
>> structures, for example env or bpf_aux_insn_data etc.
>
> Func prototype where it is. Maybe forward-declare verifier env struct.
> Implementation in verifier.c?
>
>>
>> So, in this RFC, I had choosed to write separate linerization function for
>> core and verifier layer. Does this make sense?
>
> See above. Let's still try to make it better.
>
>>
>> >
>> > This would keep logic less duplicated and shouldn't complexity beyond
>> > few null checks in few places.
>> >
>> >>    list destroy:                 bpf_destroy_list_insn
>> >>
>> >
>> > I'd also add a macro foreach_list_insn instead of explicit for loops
>> > in multiple places. That would also allow to skip deleted instructions
>> > transparently.
>> >
>> >> list patch could change the insn at patch point, it will invalid the aux
>> >
>> > typo: invalid -> invalidate
>>
>> Ack.
>>
>> >
>> >> info at patching point. list pre-patch insert new insns before patch point
>> >> where the insn and associated aux info are not touched, it is used for
>> >> example in convert_ctx_access when generating prologue.
>> >>
>> >> Typical API sequence for one patching pass:
>> >>
>> >>    struct bpf_list_insn list = bpf_create_list_insn(struct bpf_prog);
>> >>    for (elem = list; elem; elem = elem->next)
>> >>       patch_buf = gen_patch_buf_logic;
>> >>       elem = bpf_patch_list_insn(elem, patch_buf, cnt);
>> >>    bpf_prog = bpf_linearize_list_insn(list)
>> >>    bpf_destroy_list_insn(list)
>> >>
>> >> Several patching passes could also share the same list:
>> >>
>> >>    struct bpf_list_insn list = bpf_create_list_insn(struct bpf_prog);
>> >>    for (elem = list; elem; elem = elem->next)
>> >>       patch_buf = gen_patch_buf_logic1;
>> >>       elem = bpf_patch_list_insn(elem, patch_buf, cnt);
>> >>    for (elem = list; elem; elem = elem->next)
>> >>       patch_buf = gen_patch_buf_logic2;
>> >>       elem = bpf_patch_list_insn(elem, patch_buf, cnt);
>> >>    bpf_prog = bpf_linearize_list_insn(list)
>> >>    bpf_destroy_list_insn(list)
>> >>
>> >> but note new inserted insns int early passes won't have aux info except
>> >> zext info. So, if one patch pass requires all aux info updated and
>> >> recalculated for all insns including those pathced, it should first
>> >> linearize the old list, then re-create the list. The RFC always create and
>> >> linearize the list for each migrated patching pass separately.
>> >
>> > I think we should do just one list creation, few passes of patching
>> > and then linearize once. That will save quite a lot of memory
>> > allocation and will speed up a lot of things. All the verifier
>> > patching happens one after the other without any other functionality
>> > in between, so there shouldn't be any problem.
>>
>> Yes, as mentioned above, it is possible and I had tried to do it in an very
>> initial impl. IIRC convert_ctx_access + fixup_bpf_calls could share the
>> same list, but then the 32-bit zero extension insertion pass requires
>> aux.zext_dst set properly for all instructions including those patched
>
> So zext_dst. Seems like it's easily calculatable, so doesn't seem like
> it even needs to be accessed from aux_data.
>
> But. I can see at least two ways to do this:
> 1. those patching passes that care about aux_data, should just do
> extra check for NULL. Because when we adjust insns now, we just leave
> zero-initialized aux_data, except for zext_dst and seen. So it's easy
> to default to them if aux_data is NULL for patchable_insn.
> 2. just allocate and fill them out them when applying patch insns
> buffer. It's not a duplication, we already fill them out during
> patching today. So just do the same, except through malloc()'ed
> pointer instead. At the end they will be copied into linear resulting
> array during linearization (uniformly with non-patched insns).
>
>> one which we need to linearize the list first (as we set zext_dst during
>> linerization), or the other choice is we do the zext_dst initialization
>> during bpf_patch_list_insn, but this then make bpf_patch_list_insn diverge
>> between core and verifier layer.
>
> List construction is much simpler, even if we have to have extra
> check, similar to `if (env) { do_extra(); }`, IMO, it's fine.
>
>>
>> > As for aux_data. We can solve that even more simply and reliably by
>> > storing a pointer along the struct bpf_list_insn
>>
>> This is exactly what I had implemented initially, but then the issue is how
>> to handle aux_data for patched insn? IIRC I was leave it as a NULL pointer,
>> but later found zext_dst info is required for all insns, so I end up
>> duplicating zext_dst in bpf_list_insn.
>
> See above. No duplication. You have a pointer. Whether aux_data is in
> original array or was malloc()'ed, doesn't matter. But no duplication
> of fields.
>
>>
>> This leads me worrying we need to keep duplicating fields there as soon as
>> there is new similar requirements in future patching pass and I thought it
>> might be better to just reference the aux_data inside env using orig_idx,
>> this avoids duplicating information, but we need to make sure used fields
>> inside aux_data for patched insn update-to-date during linearization or
>> patching list.
>>
>> > (btw, how about calling it bpf_patchable_insn?).
>>
>> No preference, will use this one.
>>
>> > Here's how I propose to represent this patchable instruction:
>> >
>> > struct bpf_list_insn {
>> >        struct bpf_insn insn;
>> >        struct bpf_list_insn *next;
>> >        struct bpf_list_insn *target;
>> >        struct bpf_insn_aux_data *aux_data;
>> >        s32 orig_idx; // can repurpose this to have three meanings:
>> >                      // -2 - deleted
>> >                      // -1 - patched/inserted insn
>> >                      // >=0 - original idx
>>
>> I actually had experimented the -2/-1/0 trick, exactly the same number
>> assignment :) IIRC the code was not clear compared with using flag, the
>> reason seems to be:
>>   1. we still need orig_idx of an patched insn somehow, meaning negate the
>>      index.
>
> Not following, original index with be >=0, no?
>
>>   2. somehow somecode need to know whether one insn is deleted or patched
>>      after the negation, so I end up with some ugly code.
>
> So that's why you'll have constants with descriptive name for -2 and -1.
>
>>
>> Anyway, I might had not thought hard enough on this, I will retry using the
>> special index instead of flag, hopefully I could have clean code this time.
>>
>
> Yeah, please try again. All those `orig_idx = insn->orig_idx - 1; if
> (orig_idx >= 0) { ... }` are very confusing.
>
>> > };
>> >
>> > The idea would be as follows:
>> > 1. when creating original list, target pointer will point directly to
>> > a patchable instruction wrapper for jumps/calls. This will allow to
>> > stop tracking and re-calculating jump offsets and instruction indicies
>> > until linearization.
>>
>> Not sure I have followed the idea of "target" pointer. At the moment we are
>> using index mapping array (generated as by-product during coping insn).
>>
>> While the "target" pointer means to during list initialization, each jump
>> insn will have target initialized to the list node of the converted jump
>> destination insn, and all those non-jump insns are with NULL? Then during
>> linearization you assign index to each list node (could be done as
>> by-product of other pass) before insn coping which could then relocate the
>> insn during the coping as the "target" would have final index calculated?
>> Am I following correctly?
>
> Yes, I think you are understanding correctly what I'm saying. For
> implementation, you can do it in few ways, through few passes or with
> some additional data, is less important. See what's cleanest.
>
>>
>> > 2. aux_data is also filled at that point. Later at linearization time
>> > you'd just iterate over all the instructions in final order and copy
>> > original aux_data, if it's present. And then just repace env's
>> > aux_data array at the end, should be very simple and fast.
>>
>> As explained, I am worried making aux_data a pointer will causing
>> duplicating some fields into list_insn if the fields are required for
>> patched insns.
>
> Addressed above, I don't think there will be any duplication, because
> we pass aux_data by pointer.
>
>>
>> > 3. during fix_bpf_calls, zext, ctx rewrite passes, we'll reuse the
>> > same list of instructions and those passes will just keep inserting
>> > instruction buffers. Given we have restriction that all the jumps are
>> > only within patch buffer, it will be trivial to construct proper
>> > patchable instruction wrappers for newly added instructions, with NULL
>> > for aux_data and possibly non-NULL target (if it's a JMP insn).
>> > 4. After those passes, linearize, adjust subprogs (for this you'll
>> > probably still need to create index mapping, right?), copy or create
>> > new aux_data.
>> > 5. Done.
>> >
>> > What do you think? I think this should be overall simpler and faster.
>> > But let me know if I'm missing something.
>>
>> Thanks for all these thoughts, they are very good suggestions and reminds
>> me to revisit some points I had forgotten. I will do the following things:
>>
>>   1. retry the negative index solution to eliminate flag if the result code
>>      could be clean.
>>   2. the "target" pointer seems make sense, it makes list_insn bigger but
>>      normally space trade with time, so I will try to implement it to see
>>      how the code looks like.
>>   3. I still have concerns on making aux_data as pointer. Mostly due to
>>      patched insn will have NULL pointer and in case aux info of patched
>>      insn is required, we need to duplicate info inside list_insn. For
>>      example 32-bit zext opt requires zext_dst.
>>
>
>
> So one more thing I wanted to suggest. I'll try to keep high-level
> suggestions here.
>
> What about having a wrapper for patchable_insn list, where you can
> store some additional data, like final count and whatever else. It
> will eliminate some passes (counting) and will make list handling
> easier (because you can have a dummy head pointer, so no special
> handling of first element

Will try it.

> you had this concern in patch #1, I
> believe). But it will be clear if it's beneficial once implemented.

>> Regards,
>> Jiong
>>
>> >>
>> >> Compared with old patching code, this new infrastructure has much less core
>> >> code, even though the final code has a couple of extra lines but that is
>> >> mostly due to for list based infrastructure, we need to do more error
>> >> checks, so the list and associated aux data structure could be freed when
>> >> errors happens.
>> >>
>> >> Patching Restrictions
>> >> ===
>> >>   - For core layer, the linearization assume no new jumps inside patch buf.
>> >>     Currently, the only user of this layer is jit blinding.
>> >>   - For verifier layer, there could be new jumps inside patch buf, but
>> >>     they should have branch target resolved themselves, meaning new jumps
>> >>     doesn't jump to insns out of the patch buf. This is the case for all
>> >>     existing verifier layer users.
>> >>   - bpf_insn_aux_data for all patched insns including the one at patch
>> >>     point are invalidated, only 32-bit zext info will be recalcuated.
>> >>     If the aux data of insn at patch point needs to be retained, it is
>> >>     purely insn insertion, so need to use the pre-patch API.
>> >>
>> >> I plan to send out a PATCH set once I finished insn deletion line info adj
>> >> support, please have a looks at this RFC, and appreciate feedbacks.
>> >>
>> >> Jiong Wang (8):
>> >>   bpf: introducing list based insn patching infra to core layer
>> >>   bpf: extend list based insn patching infra to verification layer
>> >>   bpf: migrate jit blinding to list patching infra
>> >>   bpf: migrate convert_ctx_accesses to list patching infra
>> >>   bpf: migrate fixup_bpf_calls to list patching infra
>> >>   bpf: migrate zero extension opt to list patching infra
>> >>   bpf: migrate insn remove to list patching infra
>> >>   bpf: delete all those code around old insn patching infrastructure
>> >>
>> >>  include/linux/bpf_verifier.h |   1 -
>> >>  include/linux/filter.h       |  27 +-
>> >>  kernel/bpf/core.c            | 431 +++++++++++++++++-----------
>> >>  kernel/bpf/verifier.c        | 649 +++++++++++++++++++------------------------
>> >>  4 files changed, 580 insertions(+), 528 deletions(-)
>> >>
>> >> --
>> >> 2.7.4
>> >>
>>


^ permalink raw reply

* Re: [PATCH 16/30] net/wireless: Use kmemdup rather than duplicating its implementation
From: Kalle Valo @ 2019-07-15  9:32 UTC (permalink / raw)
  To: Fuqian Huang
  Cc: David S . Miller, Solomon Peachy, linux-wireless, netdev,
	linux-kernel
In-Reply-To: <20190703131614.25408-1-huangfq.daxian@gmail.com>

Fuqian Huang <huangfq.daxian@gmail.com> writes:

> kmemdup is introduced to duplicate a region of memory in a neat way.
> Rather than kmalloc/kzalloc + memset, which the programmer needs to
> write the size twice (sometimes lead to mistakes), kmemdup improves
> readability, leads to smaller code and also reduce the chances of mistakes.
> Suggestion to use kmemdup rather than using kmalloc/kzalloc + memset.
>
> Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>

I assume I can take this to wireless-drivers-next. If not, please
holler.

Fuqian, it would help the maintainers a lot if you could clearly
indicate to which tree the patches are planned to be commited. If I just
see one patch from a 30 patch set I have no clue what is your plan,
either is someone else going to apply the full patchset or the
maintainers should pick respective patches individually.

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH v2 17/35] net/wireless: Use kmemdup rather than duplicating its implementation
From: Kalle Valo @ 2019-07-15  9:33 UTC (permalink / raw)
  To: Fuqian Huang
  Cc: David S . Miller, Solomon Peachy, linux-wireless, netdev,
	linux-kernel
In-Reply-To: <20190703162934.32645-1-huangfq.daxian@gmail.com>

Fuqian Huang <huangfq.daxian@gmail.com> writes:

> kmemdup is introduced to duplicate a region of memory in a neat way.
> Rather than kmalloc/kzalloc + memcpy, which the programmer needs to
> write the size twice (sometimes lead to mistakes), kmemdup improves
> readability, leads to smaller code and also reduce the chances of mistakes.
> Suggestion to use kmemdup rather than using kmalloc/kzalloc + memcpy.
>
> Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>

I'm planning to take this.

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH] wl3501_cs: remove redundant variable ret
From: Kalle Valo @ 2019-07-15  9:40 UTC (permalink / raw)
  To: Colin King
  Cc: David S . Miller, linux-wireless, netdev, kernel-janitors,
	linux-kernel
In-Reply-To: <20190705103732.30568-1-colin.king@canonical.com>

Colin King <colin.king@canonical.com> writes:

> From: Colin Ian King <colin.king@canonical.com>
>
> The variable ret is being initialized with a value that is never
> read and it is being updated later with a new value that is returned.
> The variable is redundant and can be replaced with a return 0 as
> there are no other return points in this function.
>
> Addresses-Coverity: ("Unused value")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ---
>  drivers/net/wireless/wl3501_cs.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/net/wireless/wl3501_cs.c b/drivers/net/wireless/wl3501_cs.c
> index a25b17932edb..007bf6803293 100644
> --- a/drivers/net/wireless/wl3501_cs.c
> +++ b/drivers/net/wireless/wl3501_cs.c
> @@ -1226,7 +1226,6 @@ static int wl3501_init_firmware(struct wl3501_card *this)
>  static int wl3501_close(struct net_device *dev)
>  {
>  	struct wl3501_card *this = netdev_priv(dev);
> -	int rc = -ENODEV;

I'll manually fix the commit log with:

s/variable ret/variable rc/

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH] ath10k: work around uninitialized vht_pfr variable
From: Kalle Valo @ 2019-07-15  9:44 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Miaoqing Pan, David S. Miller, Rakesh Pillai, Brian Norris,
	Balaji Pothunoori, Wen Gong, Pradeep kumar Chitrapu, Sriram R,
	ath10k, linux-wireless, netdev, linux-kernel, clang-built-linux
In-Reply-To: <20190708125050.3689133-1-arnd@arndb.de>

Arnd Bergmann <arnd@arndb.de> writes:

> As clang points out, the vht_pfr is assigned to a struct member
> without being initialized in one case:
>
> drivers/net/wireless/ath/ath10k/mac.c:7528:7: error: variable 'vht_pfr' is used uninitialized whenever 'if' condition
>       is false [-Werror,-Wsometimes-uninitialized]
>                 if (!ath10k_mac_can_set_bitrate_mask(ar, band, mask,
>                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> drivers/net/wireless/ath/ath10k/mac.c:7551:20: note: uninitialized use occurs here
>                 arvif->vht_pfr = vht_pfr;
>                                  ^~~~~~~
> drivers/net/wireless/ath/ath10k/mac.c:7528:3: note: remove the 'if' if its condition is always true
>                 if (!ath10k_mac_can_set_bitrate_mask(ar, band, mask,
>                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> drivers/net/wireless/ath/ath10k/mac.c:7483:12: note: initialize the variable 'vht_pfr' to silence this warning
>         u8 vht_pfr;
>
> Add an explicit but probably incorrect initialization here.
> I suspect we want a better fix here, but chose this approach to
> illustrate the issue.
>
> Fixes: 8b97b055dc9d ("ath10k: fix failure to set multiple fixed rate")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

I'll queue this for v5.3.

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH net-next 01/16] qlge: Remove irq_cnt
From: Greg Kroah-Hartman @ 2019-07-15  9:48 UTC (permalink / raw)
  To: Benjamin Poirier; +Cc: David Miller, Manish Chopra, GR-Linux-NIC-Dev, netdev
In-Reply-To: <20190715014016.GA27883@f1>

On Mon, Jul 15, 2019 at 10:40:16AM +0900, Benjamin Poirier wrote:
> On 2019/06/17 16:48, Benjamin Poirier wrote:
> > qlge uses an irq enable/disable refcounting scheme that is:
> > * poorly implemented
> > 	Uses a spin_lock to protect accesses to the irq_cnt atomic variable
> > * buggy
> > 	Breaks when there is not a 1:1 sequence of irq - napi_poll, such as
> > 	when using SO_BUSY_POLL.
> > * unnecessary
> > 	The purpose or irq_cnt is to reduce irq control writes when
> > 	multiple work items result from one irq: the irq is re-enabled
> > 	after all work is done.
> > 	Analysis of the irq handler shows that there is only one case where
> > 	there might be two workers scheduled at once, and those have
> > 	separate irq masking bits.
> > 
> > Therefore, remove irq_cnt.
> > 
> > Additionally, we get a performance improvement:
> > perf stat -e cycles -a -r5 super_netperf 100 -H 192.168.33.1 -t TCP_RR
> > 
> > Before:
> > 628560
> > 628056
> > 622103
> > 622744
> > 627202
> > [...]
> >    268,803,947,669      cycles                 ( +-  0.09% )
> > 
> > After:
> > 636300
> > 634106
> > 634984
> > 638555
> > 634188
> > [...]
> >    259,237,291,449      cycles                 ( +-  0.19% )
> > 
> > Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
> 
> David, Greg,
> 
> Before I send v2 of this patchset, how about moving this driver out to
> staging? The hardware has been declared EOL by the vendor but what's
> more relevant to the Linux kernel is that the quality of this driver is
> not on par with many other mainline drivers, IMO. Over in staging, the
> code might benefit from the attention of interested parties (drive-by
> fixers) or fade away into obscurity.
> 
> Now that I check, the code has >500 basic checkpatch issues. While
> working on the rx processing code for the current patchset, I noticed
> the following more structural issues:
> * commit 7c734359d350 ("qlge: Size RX buffers based on MTU.",
>   v2.6.33-rc1) introduced dead code in the receive routines, which
>   should be rewritten anyways by the admission of the author himself
>   (see the comment above ql_build_rx_skb())
> * truesize accounting is incorrect (ex: a 9000B frame has truesize 10280
>   while containing two frags of order-1 allocations, ie. >16K)
> * in some cases the driver allocates skbs with head room but only puts
>   data in the frags
> * there is an inordinate amount of disparate debugging code, most of
>   which is of questionable value. In particular, qlge_dbg.c has hundreds
>   of lines of code bitrotting away in ifdef land (doesn't compile since
>   commit 18c49b91777c ("qlge: do vlan cleanup", v3.1-rc1), 8 years ago).
> * triggering an ethtool regdump will instead hexdump a 176k struct to
>   dmesg depending on some module parameters
> * many structures are defined full of holes, as we found in this
>   thread already; are used inefficiently (struct rx_ring is used for rx
>   and tx completions, with some members relevant to one case only); are
>   initialized redundantly (ex. memset 0 after alloc_etherdev). The
>   driver also has a habit of using runtime checks where compile time
>   checks are possible.
> * in terms of namespace, the driver uses either qlge_, ql_ (used by
>   other qlogic drivers, with clashes, ex: ql_sem_spinlock) or nothing (with
>   clashes, ex: struct ob_mac_iocb_req)
> 
> I'm guessing other parts of the driver have more issues of the same
> nature. The driver also has many smaller issues like deprecated apis,
> duplicate or useless comments, weird code structure (some "while" loops
> that could be rewritten with simple "for", ex. ql_wait_reg_rdy()) and
> weird line wrapping (all over, ex. the ql_set_routing_reg() calls in
> qlge_set_multicast_list()).
> 
> I'm aware of some precedents for code moving from mainline to staging:
> 1bb8155080c6 ncpfs: move net/ncpfs to drivers/staging/ncpfs (v4.16-rc1)
> 6c391ff758eb irda: move drivers/net/irda to drivers/staging/irda/drivers
> (v4.14-rc1)
> 
> What do you think is the best course of action in this case?

Feel free to move it to staging if you want it removed from the tree in
a few releases if no one is willing to do the work to keep it alive.  If
someone comes along later, it's a trivial revert to add it back.

So send a patch moving it, with all of the information you listed above
in a TODO file for the directory, and I'll be glad to take it.

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH ipsec 0/2] xfrm interface: bug fix on changelink
From: Nicolas Dichtel @ 2019-07-15  9:52 UTC (permalink / raw)
  To: steffen.klassert, davem; +Cc: netdev
In-Reply-To: <20190710074536.7505-1-nicolas.dichtel@6wind.com>

Le 10/07/2019 à 09:45, Nicolas Dichtel a écrit :
> 
> Here are two bug fix seen by code review. The first one avoids a corruption of
> existing xfrm interfaces and the second is a minor fix of an error message.
> 
>  include/net/xfrm.h        |  1 -
>  net/xfrm/xfrm_interface.c | 20 ++++++--------------
>  2 files changed, 6 insertions(+), 15 deletions(-)
Please, drop this series, I will resend it.


Regards,
Nicolas

^ permalink raw reply

* Re: [PATCH ipsec] xfrm interface: fix list corruption for x-netns
From: Nicolas Dichtel @ 2019-07-15  9:53 UTC (permalink / raw)
  To: steffen.klassert, davem; +Cc: netdev, Julien Floret
In-Reply-To: <20190710131137.4787-1-nicolas.dichtel@6wind.com>

Le 10/07/2019 à 15:11, Nicolas Dichtel a écrit :
> dev_net(dev) is the netns of the device and xi->net is the link netns,
> where the device has been linked.
> changelink() must operate in the link netns to avoid a corruption of
> the xfrm lists.
> 
> Note that xi->net and dev_net(xi->physdev) are always the same.
> 
> Before the patch, the xfrmi lists may be corrupted and can later trigger a
> kernel panic.
> 
> Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
> Reported-by: Julien Floret <julien.floret@6wind.com>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Tested-by: Julien Floret <julien.floret@6wind.com>
Please, drop this one too.


Regards,
Nicolas

^ permalink raw reply

* Re: [RFC bpf-next 1/8] bpf: introducing list based insn patching infra to core layer
From: Jiong Wang @ 2019-07-15  9:58 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiong Wang, Alexei Starovoitov, Daniel Borkmann, Edward Cree,
	Naveen N. Rao, Andrii Nakryiko, Jakub Kicinski, bpf, Networking,
	oss-drivers
In-Reply-To: <CAEf4BzbR_ieGmaOTjCrN6jQRo=QoEJNz1zVeFizZbzGBGaF=Cg@mail.gmail.com>


Andrii Nakryiko writes:

> On Thu, Jul 11, 2019 at 4:53 AM Jiong Wang <jiong.wang@netronome.com> wrote:
>>
>>
>> Andrii Nakryiko writes:
>>
>> > On Thu, Jul 4, 2019 at 2:32 PM Jiong Wang <jiong.wang@netronome.com> wrote:
>> >>
>> >> This patch introduces list based bpf insn patching infra to bpf core layer
>> >> which is lower than verification layer.
>> >>
>> >> This layer has bpf insn sequence as the solo input, therefore the tasks
>> >> to be finished during list linerization is:
>> >>   - copy insn
>> >>   - relocate jumps
>> >>   - relocation line info.
>> >>
>> >> Suggested-by: Alexei Starovoitov <ast@kernel.org>
>> >> Suggested-by: Edward Cree <ecree@solarflare.com>
>> >> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
>> >> ---
>> >>  include/linux/filter.h |  25 +++++
>> >>  kernel/bpf/core.c      | 268 +++++++++++++++++++++++++++++++++++++++++++++++++
>> >>  2 files changed, 293 insertions(+)
>> >>
>> >> diff --git a/include/linux/filter.h b/include/linux/filter.h
>> >> index 1fe53e7..1fea68c 100644
>> >> --- a/include/linux/filter.h
>> >> +++ b/include/linux/filter.h
>> >> @@ -842,6 +842,31 @@ struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
>> >>                                        const struct bpf_insn *patch, u32 len);
>> >>  int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt);
>> >>
>> >> +int bpf_jit_adj_imm_off(struct bpf_insn *insn, int old_idx, int new_idx,
>> >> +                       int idx_map[]);
>> >> +
>> >> +#define LIST_INSN_FLAG_PATCHED 0x1
>> >> +#define LIST_INSN_FLAG_REMOVED 0x2
>> >> +struct bpf_list_insn {
>> >> +       struct bpf_insn insn;
>> >> +       struct bpf_list_insn *next;
>> >> +       s32 orig_idx;
>> >> +       u32 flag;
>> >> +};
>> >> +
>> >> +struct bpf_list_insn *bpf_create_list_insn(struct bpf_prog *prog);
>> >> +void bpf_destroy_list_insn(struct bpf_list_insn *list);
>> >> +/* Replace LIST_INSN with new list insns generated from PATCH. */
>> >> +struct bpf_list_insn *bpf_patch_list_insn(struct bpf_list_insn *list_insn,
>> >> +                                         const struct bpf_insn *patch,
>> >> +                                         u32 len);
>> >> +/* Pre-patch list_insn with insns inside PATCH, meaning LIST_INSN is not
>> >> + * touched. New list insns are inserted before it.
>> >> + */
>> >> +struct bpf_list_insn *bpf_prepatch_list_insn(struct bpf_list_insn *list_insn,
>> >> +                                            const struct bpf_insn *patch,
>> >> +                                            u32 len);
>> >> +
>> >>  void bpf_clear_redirect_map(struct bpf_map *map);
>> >>
>> >>  static inline bool xdp_return_frame_no_direct(void)
>> >> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
>> >> index e2c1b43..e60703e 100644
>> >> --- a/kernel/bpf/core.c
>> >> +++ b/kernel/bpf/core.c
>> >> @@ -502,6 +502,274 @@ int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt)
>> >>         return WARN_ON_ONCE(bpf_adj_branches(prog, off, off + cnt, off, false));
>> >>  }
>> >>
>> >> +int bpf_jit_adj_imm_off(struct bpf_insn *insn, int old_idx, int new_idx,
>> >> +                       s32 idx_map[])
>> >> +{
>> >> +       u8 code = insn->code;
>> >> +       s64 imm;
>> >> +       s32 off;
>> >> +
>> >> +       if (BPF_CLASS(code) != BPF_JMP && BPF_CLASS(code) != BPF_JMP32)
>> >> +               return 0;
>> >> +
>> >> +       if (BPF_CLASS(code) == BPF_JMP &&
>> >> +           (BPF_OP(code) == BPF_EXIT ||
>> >> +            (BPF_OP(code) == BPF_CALL && insn->src_reg != BPF_PSEUDO_CALL)))
>> >> +               return 0;
>> >> +
>> >> +       /* BPF to BPF call. */
>> >> +       if (BPF_OP(code) == BPF_CALL) {
>> >> +               imm = idx_map[old_idx + insn->imm + 1] - new_idx - 1;
>> >> +               if (imm < S32_MIN || imm > S32_MAX)
>> >> +                       return -ERANGE;
>> >> +               insn->imm = imm;
>> >> +               return 1;
>> >> +       }
>> >> +
>> >> +       /* Jump. */
>> >> +       off = idx_map[old_idx + insn->off + 1] - new_idx - 1;
>> >> +       if (off < S16_MIN || off > S16_MAX)
>> >> +               return -ERANGE;
>> >> +       insn->off = off;
>> >> +       return 0;
>> >> +}
>> >> +
>> >> +void bpf_destroy_list_insn(struct bpf_list_insn *list)
>> >> +{
>> >> +       struct bpf_list_insn *elem, *next;
>> >> +
>> >> +       for (elem = list; elem; elem = next) {
>> >> +               next = elem->next;
>> >> +               kvfree(elem);
>> >> +       }
>> >> +}
>> >> +
>> >> +struct bpf_list_insn *bpf_create_list_insn(struct bpf_prog *prog)
>> >> +{
>> >> +       unsigned int idx, len = prog->len;
>> >> +       struct bpf_list_insn *hdr, *prev;
>> >> +       struct bpf_insn *insns;
>> >> +
>> >> +       hdr = kvzalloc(sizeof(*hdr), GFP_KERNEL);
>> >> +       if (!hdr)
>> >> +               return ERR_PTR(-ENOMEM);
>> >> +
>> >> +       insns = prog->insnsi;
>> >> +       hdr->insn = insns[0];
>> >> +       hdr->orig_idx = 1;
>> >> +       prev = hdr;
>> >
>> > I'm not sure why you need this "prologue" instead of handling first
>> > instruction uniformly in for loop below?
>>
>> It is because the head of the list doesn't have precessor, so no need of
>> the prev->next assignment, not could do a check inside the loop to rule the
>> head out when doing it.
>
> yeah, prev = NULL initially. Then
>
> if (prev) prev->next = node;
>
> Or see my suggestiong about having patchabel_insns_list wrapper struct
> (in cover letter thread).
>
>>
>> >> +
>> >> +       for (idx = 1; idx < len; idx++) {
>> >> +               struct bpf_list_insn *node = kvzalloc(sizeof(*node),
>> >> +                                                     GFP_KERNEL);
>> >> +
>> >> +               if (!node) {
>> >> +                       /* Destroy what has been allocated. */
>> >> +                       bpf_destroy_list_insn(hdr);
>> >> +                       return ERR_PTR(-ENOMEM);
>> >> +               }
>> >> +               node->insn = insns[idx];
>> >> +               node->orig_idx = idx + 1;
>> >
>> > Why orig_idx is 1-based? It's really confusing.
>>
>> orig_idx == 0 means one insn is without original insn, means it is an new
>> insn generated for patching purpose.
>>
>> While the LIST_INSN_FLAG_PATCHED in the RFC means one insn in original prog
>> is patched.
>>
>> I had been trying to differenciate above two cases, but yes, they are
>> confusing and differenciating them might be useless, if an insn in original
>> prog is patched, all its info could be treated as clobbered and needing
>> re-calculating or should do conservative assumption.
>
> Instruction will be new and not patched only in patch_buffer. Once you
> add them to the list, they are patched, no? Not sure what's the
> distinction you are trying to maintain here.

Never mind, the reason I was trying to differenciating them is because I
had some strange preference on the insn patched.

insn 1          insn 1
insn 2   >>     insn 2.1
insn 3          insn 2.2
                insn 2.3
                insn 3

I kind of thinking the it is better to maintain the original info of one
patched insn, that is to say insn 2 above is patched and expanded into insn
2.1/2.2/2.3, then I slightly felt better to copy the aux info of insn to
insn 2.1 and only rebuilt those we are sure needs to be updated, for
example zext because the insn is changed.

>> >
>> >> +               prev->next = node;
>> >> +               prev = node;
>> >> +       }
>> >> +
>> >> +       return hdr;
>> >> +}
>> >> +
>
> [...]
>
>> >> +
>> >> +       len--;
>> >> +       patch++;
>> >> +
>> >> +       prev = list_insn;
>> >> +       next = list_insn->next;
>> >> +       for (idx = 0; idx < len; idx++) {
>> >> +               struct bpf_list_insn *node = kvzalloc(sizeof(*node),
>> >> +                                                     GFP_KERNEL);
>> >> +
>> >> +               if (!node) {
>> >> +                       /* Link what's allocated, so list destroyer could
>> >> +                        * free them.
>> >> +                        */
>> >> +                       prev->next = next;
>> >
>> > Why this special handling, if you can just insert element so that list
>> > is well-formed after each instruction?
>>
>> Good idea, just always do "node->next = next", the "prev->next = node" in
>> next round will fix it.
>>
>> >
>> >> +                       return ERR_PTR(-ENOMEM);
>> >> +               }
>> >> +
>> >> +               node->insn = patch[idx];
>> >> +               prev->next = node;
>> >> +               prev = node;
>> >
>> > E.g.,
>> >
>> > node->next = next;
>> > prev->next = node;
>> > prev = node;
>> >
>> >> +       }
>> >> +
>> >> +       prev->next = next;
>> >
>> > And no need for this either.
>> >
>> >> +       return prev;
>> >> +}
>> >> +
>> >> +struct bpf_list_insn *bpf_prepatch_list_insn(struct bpf_list_insn *list_insn,
>> >> +                                            const struct bpf_insn *patch,
>> >> +                                            u32 len)
>> >
>> > prepatch and patch functions should share the same logic.
>> >
>> > Prepend is just that - insert all instructions from buffer before current insns.
>> > Patch -> replace current one with first instriction in a buffer, then
>> > prepend remaining ones before the next instruction (so patch should
>> > call info prepend, with adjusted count and array pointer).
>>
>> Ack, there indeed has quite a few things to simplify.
>>
>> >
>> >> +{
>> >> +       struct bpf_list_insn *prev, *node, *begin_node;
>> >> +       u32 idx;
>> >> +
>> >> +       if (!len)
>> >> +               return list_insn;
>> >> +
>> >> +       node = kvzalloc(sizeof(*node), GFP_KERNEL);
>> >> +       if (!node)
>> >> +               return ERR_PTR(-ENOMEM);
>> >> +       node->insn = patch[0];
>> >> +       begin_node = node;
>> >> +       prev = node;
>> >> +
>> >> +       for (idx = 1; idx < len; idx++) {
>> >> +               node = kvzalloc(sizeof(*node), GFP_KERNEL);
>> >> +               if (!node) {
>> >> +                       node = begin_node;
>> >> +                       /* Release what's has been allocated. */
>> >> +                       while (node) {
>> >> +                               struct bpf_list_insn *next = node->next;
>> >> +
>> >> +                               kvfree(node);
>> >> +                               node = next;
>> >> +                       }
>> >> +                       return ERR_PTR(-ENOMEM);
>> >> +               }
>> >> +               node->insn = patch[idx];
>> >> +               prev->next = node;
>> >> +               prev = node;
>> >> +       }
>> >> +
>> >> +       prev->next = list_insn;
>> >> +       return begin_node;
>> >> +}
>> >> +
>> >>  void bpf_prog_kallsyms_del_subprogs(struct bpf_prog *fp)
>> >>  {
>> >>         int i;
>> >> --
>> >> 2.7.4
>> >>
>>


^ permalink raw reply

* [PATCH ipsec v2 0/4] xfrm interface: bugs fixes
From: Nicolas Dichtel @ 2019-07-15 10:00 UTC (permalink / raw)
  To: steffen.klassert, davem; +Cc: netdev
In-Reply-To: <df990564-819a-314f-dda6-aab58a2e7b6e@6wind.com>


Here is a bunch of bugs fixes. Some have been seen by code review and some when
playing with x-netns.
The details are in each patch.

v1 -> v2:
 - add patch #3 and #4

 include/net/xfrm.h        |  2 --
 net/xfrm/xfrm_interface.c | 56 +++++++++++++++++++++--------------------------
 2 files changed, 25 insertions(+), 33 deletions(-)

Regards,
Nicolas


^ permalink raw reply

* [PATCH ipsec v2 3/4] xfrm interface: fix list corruption for x-netns
From: Nicolas Dichtel @ 2019-07-15 10:00 UTC (permalink / raw)
  To: steffen.klassert, davem; +Cc: netdev, Nicolas Dichtel, Julien Floret
In-Reply-To: <20190715100023.8475-1-nicolas.dichtel@6wind.com>

dev_net(dev) is the netns of the device and xi->net is the link netns,
where the device has been linked.
changelink() must operate in the link netns to avoid a corruption of
the xfrm lists.

Note that xi->net and dev_net(xi->physdev) are always the same.

Before the patch, the xfrmi lists may be corrupted and can later trigger a
kernel panic.

Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
Reported-by: Julien Floret <julien.floret@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Julien Floret <julien.floret@6wind.com>
---
 net/xfrm/xfrm_interface.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
index 68336ee00d72..53e5e47b2c55 100644
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -503,7 +503,7 @@ static int xfrmi_change(struct xfrm_if *xi, const struct xfrm_if_parms *p)
 
 static int xfrmi_update(struct xfrm_if *xi, struct xfrm_if_parms *p)
 {
-	struct net *net = dev_net(xi->dev);
+	struct net *net = xi->net;
 	struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id);
 	int err;
 
@@ -663,9 +663,9 @@ static int xfrmi_changelink(struct net_device *dev, struct nlattr *tb[],
 			   struct nlattr *data[],
 			   struct netlink_ext_ack *extack)
 {
-	struct net *net = dev_net(dev);
+	struct xfrm_if *xi = netdev_priv(dev);
+	struct net *net = xi->net;
 	struct xfrm_if_parms p;
-	struct xfrm_if *xi;
 
 	xfrmi_netlink_parms(data, &p);
 	xi = xfrmi_locate(net, &p);
@@ -707,7 +707,7 @@ static struct net *xfrmi_get_link_net(const struct net_device *dev)
 {
 	struct xfrm_if *xi = netdev_priv(dev);
 
-	return dev_net(xi->phydev);
+	return xi->net;
 }
 
 static const struct nla_policy xfrmi_policy[IFLA_XFRM_MAX + 1] = {
-- 
2.21.0


^ permalink raw reply related

* [PATCH ipsec v2 2/4] xfrm interface: ifname may be wrong in logs
From: Nicolas Dichtel @ 2019-07-15 10:00 UTC (permalink / raw)
  To: steffen.klassert, davem; +Cc: netdev, Nicolas Dichtel
In-Reply-To: <20190715100023.8475-1-nicolas.dichtel@6wind.com>

The ifname is copied when the interface is created, but is never updated
later. In fact, this property is used only in one error message, where the
netdevice pointer is available, thus let's use it.

Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 include/net/xfrm.h        |  1 -
 net/xfrm/xfrm_interface.c | 10 +---------
 2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index b22db30c3d88..ad761ef84797 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -983,7 +983,6 @@ static inline void xfrm_dst_destroy(struct xfrm_dst *xdst)
 void xfrm_dst_ifdown(struct dst_entry *dst, struct net_device *dev);
 
 struct xfrm_if_parms {
-	char name[IFNAMSIZ];	/* name of XFRM device */
 	int link;		/* ifindex of underlying L2 interface */
 	u32 if_id;		/* interface identifyer */
 };
diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
index 2310dc9e35c2..68336ee00d72 100644
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -145,8 +145,6 @@ static int xfrmi_create(struct net_device *dev)
 	if (err < 0)
 		goto out;
 
-	strcpy(xi->p.name, dev->name);
-
 	dev_hold(dev);
 	xfrmi_link(xfrmn, xi);
 
@@ -294,7 +292,7 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 	if (tdev == dev) {
 		stats->collisions++;
 		net_warn_ratelimited("%s: Local routing loop detected!\n",
-				     xi->p.name);
+				     dev->name);
 		goto tx_err_dst_release;
 	}
 
@@ -638,12 +636,6 @@ static int xfrmi_newlink(struct net *src_net, struct net_device *dev,
 	int err;
 
 	xfrmi_netlink_parms(data, &p);
-
-	if (!tb[IFLA_IFNAME])
-		return -EINVAL;
-
-	nla_strlcpy(p.name, tb[IFLA_IFNAME], IFNAMSIZ);
-
 	xi = xfrmi_locate(net, &p);
 	if (xi)
 		return -EEXIST;
-- 
2.21.0


^ permalink raw reply related

* [PATCH ipsec v2 4/4] xfrm interface: fix management of phydev
From: Nicolas Dichtel @ 2019-07-15 10:00 UTC (permalink / raw)
  To: steffen.klassert, davem; +Cc: netdev, Nicolas Dichtel, Julien Floret
In-Reply-To: <20190715100023.8475-1-nicolas.dichtel@6wind.com>

With the current implementation, phydev cannot be removed:

$ ip link add dummy type dummy
$ ip link add xfrm1 type xfrm dev dummy if_id 1
$ ip l d dummy
 kernel:[77938.465445] unregister_netdevice: waiting for dummy to become free. Usage count = 1

Manage it like in ip tunnels, ie just keep the ifindex. Not that the side
effect, is that the phydev is now optional.

Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Julien Floret <julien.floret@6wind.com>
---
 include/net/xfrm.h        |  1 -
 net/xfrm/xfrm_interface.c | 32 +++++++++++++++++---------------
 2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index ad761ef84797..aa08a7a5f6ac 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -990,7 +990,6 @@ struct xfrm_if_parms {
 struct xfrm_if {
 	struct xfrm_if __rcu *next;	/* next interface in list */
 	struct net_device *dev;		/* virtual device associated with interface */
-	struct net_device *phydev;	/* physical device */
 	struct net *net;		/* netns for packet i/o */
 	struct xfrm_if_parms p;		/* interface parms */
 
diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
index 53e5e47b2c55..2ab4859df55a 100644
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -175,7 +175,6 @@ static void xfrmi_dev_uninit(struct net_device *dev)
 	struct xfrmi_net *xfrmn = net_generic(xi->net, xfrmi_net_id);
 
 	xfrmi_unlink(xfrmn, xi);
-	dev_put(xi->phydev);
 	dev_put(dev);
 }
 
@@ -362,7 +361,7 @@ static netdev_tx_t xfrmi_xmit(struct sk_buff *skb, struct net_device *dev)
 		goto tx_err;
 	}
 
-	fl.flowi_oif = xi->phydev->ifindex;
+	fl.flowi_oif = xi->p.link;
 
 	ret = xfrmi_xmit2(skb, dev, &fl);
 	if (ret < 0)
@@ -548,7 +547,7 @@ static int xfrmi_get_iflink(const struct net_device *dev)
 {
 	struct xfrm_if *xi = netdev_priv(dev);
 
-	return xi->phydev->ifindex;
+	return xi->p.link;
 }
 
 
@@ -574,12 +573,14 @@ static void xfrmi_dev_setup(struct net_device *dev)
 	dev->needs_free_netdev	= true;
 	dev->priv_destructor	= xfrmi_dev_free;
 	netif_keep_dst(dev);
+
+	eth_broadcast_addr(dev->broadcast);
 }
 
 static int xfrmi_dev_init(struct net_device *dev)
 {
 	struct xfrm_if *xi = netdev_priv(dev);
-	struct net_device *phydev = xi->phydev;
+	struct net_device *phydev = __dev_get_by_index(xi->net, xi->p.link);
 	int err;
 
 	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
@@ -594,13 +595,19 @@ static int xfrmi_dev_init(struct net_device *dev)
 
 	dev->features |= NETIF_F_LLTX;
 
-	dev->needed_headroom = phydev->needed_headroom;
-	dev->needed_tailroom = phydev->needed_tailroom;
+	if (phydev) {
+		dev->needed_headroom = phydev->needed_headroom;
+		dev->needed_tailroom = phydev->needed_tailroom;
 
-	if (is_zero_ether_addr(dev->dev_addr))
-		eth_hw_addr_inherit(dev, phydev);
-	if (is_zero_ether_addr(dev->broadcast))
-		memcpy(dev->broadcast, phydev->broadcast, dev->addr_len);
+		if (is_zero_ether_addr(dev->dev_addr))
+			eth_hw_addr_inherit(dev, phydev);
+		if (is_zero_ether_addr(dev->broadcast))
+			memcpy(dev->broadcast, phydev->broadcast,
+			       dev->addr_len);
+	} else {
+		eth_hw_addr_random(dev);
+		eth_broadcast_addr(dev->broadcast);
+	}
 
 	return 0;
 }
@@ -644,13 +651,8 @@ static int xfrmi_newlink(struct net *src_net, struct net_device *dev,
 	xi->p = p;
 	xi->net = net;
 	xi->dev = dev;
-	xi->phydev = dev_get_by_index(net, p.link);
-	if (!xi->phydev)
-		return -ENODEV;
 
 	err = xfrmi_create(dev);
-	if (err < 0)
-		dev_put(xi->phydev);
 	return err;
 }
 
-- 
2.21.0


^ permalink raw reply related

* [PATCH ipsec v2 1/4] xfrm interface: avoid corruption on changelink
From: Nicolas Dichtel @ 2019-07-15 10:00 UTC (permalink / raw)
  To: steffen.klassert, davem; +Cc: netdev, Nicolas Dichtel, Julien Floret
In-Reply-To: <20190715100023.8475-1-nicolas.dichtel@6wind.com>

The new parameters must not be stored in the netdev_priv() before
validation, it may corrupt the interface. Note also that if data is NULL,
only a memset() is done.

$ ip link add xfrm1 type xfrm dev lo if_id 1
$ ip link add xfrm2 type xfrm dev lo if_id 2
$ ip link set xfrm1 type xfrm dev lo if_id 2
RTNETLINK answers: File exists
$ ip -d link list dev xfrm1
5: xfrm1@lo: <NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/none 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0 minmtu 68 maxmtu 1500
    xfrm if_id 0x2 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

=> "if_id 0x2"

Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Julien Floret <julien.floret@6wind.com>
---
 net/xfrm/xfrm_interface.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
index 74868f9d81fb..2310dc9e35c2 100644
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -671,12 +671,12 @@ static int xfrmi_changelink(struct net_device *dev, struct nlattr *tb[],
 			   struct nlattr *data[],
 			   struct netlink_ext_ack *extack)
 {
-	struct xfrm_if *xi = netdev_priv(dev);
 	struct net *net = dev_net(dev);
+	struct xfrm_if_parms p;
+	struct xfrm_if *xi;
 
-	xfrmi_netlink_parms(data, &xi->p);
-
-	xi = xfrmi_locate(net, &xi->p);
+	xfrmi_netlink_parms(data, &p);
+	xi = xfrmi_locate(net, &p);
 	if (!xi) {
 		xi = netdev_priv(dev);
 	} else {
@@ -684,7 +684,7 @@ static int xfrmi_changelink(struct net_device *dev, struct nlattr *tb[],
 			return -EEXIST;
 	}
 
-	return xfrmi_update(xi, &xi->p);
+	return xfrmi_update(xi, &p);
 }
 
 static size_t xfrmi_get_size(const struct net_device *dev)
-- 
2.21.0


^ permalink raw reply related

* Re: [oss-drivers] Re: [RFC bpf-next 2/8] bpf: extend list based insn patching infra to verification layer
From: Jiong Wang @ 2019-07-15 10:02 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiong Wang, Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann,
	Edward Cree, Naveen N. Rao, Jakub Kicinski, bpf, Networking,
	oss-drivers
In-Reply-To: <CAEf4BzZX45+QJLnHdwW-Cmo1uAFd+4zzds0jJoQVWFLrUVABgA@mail.gmail.com>


Andrii Nakryiko writes:

> On Thu, Jul 11, 2019 at 5:20 AM Jiong Wang <jiong.wang@netronome.com> wrote:
>>
>>
>> Jiong Wang writes:
>>
>> > Andrii Nakryiko writes:
>> >
>> >> On Thu, Jul 4, 2019 at 2:32 PM Jiong Wang <jiong.wang@netronome.com> wrote:
>> >>>
>> >>> Verification layer also needs to handle auxiliar info as well as adjusting
>> >>> subprog start.
>> >>>
>> >>> At this layer, insns inside patch buffer could be jump, but they should
>> >>> have been resolved, meaning they shouldn't jump to insn outside of the
>> >>> patch buffer. Lineration function for this layer won't touch insns inside
>> >>> patch buffer.
>> >>>
>> >>> Adjusting subprog is finished along with adjusting jump target when the
>> >>> input will cover bpf to bpf call insn, re-register subprog start is cheap.
>> >>> But adjustment when there is insn deleteion is not considered yet.
>> >>>
>> >>> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
>> >>> ---
>> >>>  kernel/bpf/verifier.c | 150 ++++++++++++++++++++++++++++++++++++++++++++++++++
>> >>>  1 file changed, 150 insertions(+)
>> >>>
>> >>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> >>> index a2e7637..2026d64 100644
>> >>> --- a/kernel/bpf/verifier.c
>> >>> +++ b/kernel/bpf/verifier.c
>> >>> @@ -8350,6 +8350,156 @@ static void opt_hard_wire_dead_code_branches(struct bpf_verifier_env *env)
>> >>>         }
>> >>>  }
>> >>>
>> >>> +/* Linearize bpf list insn to array (verifier layer). */
>> >>> +static struct bpf_verifier_env *
>> >>> +verifier_linearize_list_insn(struct bpf_verifier_env *env,
>> >>> +                            struct bpf_list_insn *list)
>> >>
>> >> It's unclear why this returns env back? It's not allocating a new env,
>> >> so it's weird and unnecessary. Just return error code.
>> >
>> > The reason is I was thinking we have two layers in BPF, the core and the
>> > verifier.
>> >
>> > For core layer (the relevant file is core.c), when doing patching, the
>> > input is insn list and bpf_prog, the linearization should linearize the
>> > insn list into insn array, and also whatever others affect inside bpf_prog
>> > due to changing on insns, for example line info inside prog->aux. So the
>> > return value is bpf_prog for core layer linearization hook.
>> >
>> > For verifier layer, it is similar, but the context if bpf_verifier_env, the
>> > linearization hook should linearize the insn list, and also those affected
>> > inside env, for example bpf_insn_aux_data, so the return value is
>> > bpf_verifier_env, meaning returning an updated verifier context
>> > (bpf_verifier_env) after insn list linearization.
>>
>> Realized your point is no new env is allocated, so just return error
>> code. Yes, the env pointer is not changed, just internal data is
>> updated. Return bpf_verifier_env mostly is trying to make the hook more
>> clear that it returns an updated "context" where the linearization happens,
>> for verifier layer, it is bpf_verifier_env, and for core layer, it is
>> bpf_prog, so return value was designed to return these two types.
>
> Oh, I missed that core layer returns bpf_prog*. I think this is
> confusing as hell and is very contrary to what one would expect. If
> the function doesn't allocate those objects, it shouldn't return them,
> except for rare cases of some accessor functions. Me reading this,
> I'll always be suprised and will have to go skim code just to check
> whether those functions really return new bpf_prog or
> bpf_verifier_env, respectively.

bpf_prog_realloc do return new bpf_prog, so we will need to return bpf_prog
* for core layer.

>
> Please change them both to just return error code.
>
>>
>> >
>> > Make sense?
>> >
>> > Regards,
>> > Jiong
>> >
>> >>
>> >>> +{
>> >>> +       u32 *idx_map, idx, orig_cnt, fini_cnt = 0;
>> >>> +       struct bpf_subprog_info *new_subinfo;
>> >>> +       struct bpf_insn_aux_data *new_data;
>> >>> +       struct bpf_prog *prog = env->prog;
>> >>> +       struct bpf_verifier_env *ret_env;
>> >>> +       struct bpf_insn *insns, *insn;
>> >>> +       struct bpf_list_insn *elem;
>> >>> +       int ret;
>> >>> +
>> >>> +       /* Calculate final size. */
>> >>> +       for (elem = list; elem; elem = elem->next)
>> >>> +               if (!(elem->flag & LIST_INSN_FLAG_REMOVED))
>> >>> +                       fini_cnt++;
>> >>> +
>> >>> +       orig_cnt = prog->len;
>> >>> +       insns = prog->insnsi;
>> >>> +       /* If prog length remains same, nothing else to do. */
>> >>> +       if (fini_cnt == orig_cnt) {
>> >>> +               for (insn = insns, elem = list; elem; elem = elem->next, insn++)
>> >>> +                       *insn = elem->insn;
>> >>> +               return env;
>> >>> +       }
>> >>> +       /* Realloc insn buffer when necessary. */
>> >>> +       if (fini_cnt > orig_cnt)
>> >>> +               prog = bpf_prog_realloc(prog, bpf_prog_size(fini_cnt),
>> >>> +                                       GFP_USER);
>> >>> +       if (!prog)
>> >>> +               return ERR_PTR(-ENOMEM);
>> >>> +       insns = prog->insnsi;
>> >>> +       prog->len = fini_cnt;
>> >>> +       ret_env = env;
>> >>> +
>> >>> +       /* idx_map[OLD_IDX] = NEW_IDX */
>> >>> +       idx_map = kvmalloc(orig_cnt * sizeof(u32), GFP_KERNEL);
>> >>> +       if (!idx_map)
>> >>> +               return ERR_PTR(-ENOMEM);
>> >>> +       memset(idx_map, 0xff, orig_cnt * sizeof(u32));
>> >>> +
>> >>> +       /* Use the same alloc method used when allocating env->insn_aux_data. */
>> >>> +       new_data = vzalloc(array_size(sizeof(*new_data), fini_cnt));
>> >>> +       if (!new_data) {
>> >>> +               kvfree(idx_map);
>> >>> +               return ERR_PTR(-ENOMEM);
>> >>> +       }
>> >>> +
>> >>> +       /* Copy over insn + calculate idx_map. */
>> >>> +       for (idx = 0, elem = list; elem; elem = elem->next) {
>> >>> +               int orig_idx = elem->orig_idx - 1;
>> >>> +
>> >>> +               if (orig_idx >= 0) {
>> >>> +                       idx_map[orig_idx] = idx;
>> >>> +
>> >>> +                       if (elem->flag & LIST_INSN_FLAG_REMOVED)
>> >>> +                               continue;
>> >>> +
>> >>> +                       new_data[idx] = env->insn_aux_data[orig_idx];
>> >>> +
>> >>> +                       if (elem->flag & LIST_INSN_FLAG_PATCHED)
>> >>> +                               new_data[idx].zext_dst =
>> >>> +                                       insn_has_def32(env, &elem->insn);
>> >>> +               } else {
>> >>> +                       new_data[idx].seen = true;
>> >>> +                       new_data[idx].zext_dst = insn_has_def32(env,
>> >>> +                                                               &elem->insn);
>> >>> +               }
>> >>> +               insns[idx++] = elem->insn;
>> >>> +       }
>> >>> +
>> >>> +       new_subinfo = kvzalloc(sizeof(env->subprog_info), GFP_KERNEL);
>> >>> +       if (!new_subinfo) {
>> >>> +               kvfree(idx_map);
>> >>> +               vfree(new_data);
>> >>> +               return ERR_PTR(-ENOMEM);
>> >>> +       }
>> >>> +       memcpy(new_subinfo, env->subprog_info, sizeof(env->subprog_info));
>> >>> +       memset(env->subprog_info, 0, sizeof(env->subprog_info));
>> >>> +       env->subprog_cnt = 0;
>> >>> +       env->prog = prog;
>> >>> +       ret = add_subprog(env, 0);
>> >>> +       if (ret < 0) {
>> >>> +               ret_env = ERR_PTR(ret);
>> >>> +               goto free_all_ret;
>> >>> +       }
>> >>> +       /* Relocate jumps using idx_map.
>> >>> +        *   old_dst = jmp_insn.old_target + old_pc + 1;
>> >>> +        *   new_dst = idx_map[old_dst] = jmp_insn.new_target + new_pc + 1;
>> >>> +        *   jmp_insn.new_target = new_dst - new_pc - 1;
>> >>> +        */
>> >>> +       for (idx = 0, elem = list; elem; elem = elem->next) {
>> >>> +               int orig_idx = elem->orig_idx;
>> >>> +
>> >>> +               if (elem->flag & LIST_INSN_FLAG_REMOVED)
>> >>> +                       continue;
>> >>> +               if ((elem->flag & LIST_INSN_FLAG_PATCHED) || !orig_idx) {
>> >>> +                       idx++;
>> >>> +                       continue;
>> >>> +               }
>> >>> +
>> >>> +               ret = bpf_jit_adj_imm_off(&insns[idx], orig_idx - 1, idx,
>> >>> +                                         idx_map);
>> >>> +               if (ret < 0) {
>> >>> +                       ret_env = ERR_PTR(ret);
>> >>> +                       goto free_all_ret;
>> >>> +               }
>> >>> +               /* Recalculate subprog start as we are at bpf2bpf call insn. */
>> >>> +               if (ret > 0) {
>> >>> +                       ret = add_subprog(env, idx + insns[idx].imm + 1);
>> >>> +                       if (ret < 0) {
>> >>> +                               ret_env = ERR_PTR(ret);
>> >>> +                               goto free_all_ret;
>> >>> +                       }
>> >>> +               }
>> >>> +               idx++;
>> >>> +       }
>> >>> +       if (ret < 0) {
>> >>> +               ret_env = ERR_PTR(ret);
>> >>> +               goto free_all_ret;
>> >>> +       }
>> >>> +
>> >>> +       env->subprog_info[env->subprog_cnt].start = fini_cnt;
>> >>> +       for (idx = 0; idx <= env->subprog_cnt; idx++)
>> >>> +               new_subinfo[idx].start = env->subprog_info[idx].start;
>> >>> +       memcpy(env->subprog_info, new_subinfo, sizeof(env->subprog_info));
>> >>> +
>> >>> +       /* Adjust linfo.
>> >>> +        * FIXME: no support for insn removal at the moment.
>> >>> +        */
>> >>> +       if (prog->aux->nr_linfo) {
>> >>> +               struct bpf_line_info *linfo = prog->aux->linfo;
>> >>> +               u32 nr_linfo = prog->aux->nr_linfo;
>> >>> +
>> >>> +               for (idx = 0; idx < nr_linfo; idx++)
>> >>> +                       linfo[idx].insn_off = idx_map[linfo[idx].insn_off];
>> >>> +       }
>> >>> +       vfree(env->insn_aux_data);
>> >>> +       env->insn_aux_data = new_data;
>> >>> +       goto free_mem_list_ret;
>> >>> +free_all_ret:
>> >>> +       vfree(new_data);
>> >>> +free_mem_list_ret:
>> >>> +       kvfree(new_subinfo);
>> >>> +       kvfree(idx_map);
>> >>> +       return ret_env;
>> >>> +}
>> >>> +
>> >>>  static int opt_remove_dead_code(struct bpf_verifier_env *env)
>> >>>  {
>> >>>         struct bpf_insn_aux_data *aux_data = env->insn_aux_data;
>> >>> --
>> >>> 2.7.4
>> >>>
>>


^ permalink raw reply

* Re: [RFC] virtio-net: share receive_*() and add_recvbuf_*() with virtio-vsock
From: Stefano Garzarella @ 2019-07-15 10:42 UTC (permalink / raw)
  To: Jason Wang; +Cc: Michael S. Tsirkin, Stefan Hajnoczi, virtualization, netdev
In-Reply-To: <880c1ad2-7e02-3d5d-82d7-49076cc8d02b@redhat.com>

On Mon, Jul 15, 2019 at 05:16:09PM +0800, Jason Wang wrote:
> 
> > > > > > > >        struct sk_buff *virtskb_receive_small(struct virtskb *vs, ...);
> > > > > > > >        struct sk_buff *virtskb_receive_big(struct virtskb *vs, ...);
> > > > > > > >        struct sk_buff *virtskb_receive_mergeable(struct virtskb *vs, ...);
> > > > > > > > 
> > > > > > > >        int virtskb_add_recvbuf_small(struct virtskb*vs, ...);
> > > > > > > >        int virtskb_add_recvbuf_big(struct virtskb *vs, ...);
> > > > > > > >        int virtskb_add_recvbuf_mergeable(struct virtskb *vs, ...);
> > > > > > > > 
> > > > > > > > For the Guest->Host path it should be easier, so maybe I can add a
> > > > > > > > "virtskb_send(struct virtskb *vs, struct sk_buff *skb)" with a part of the code
> > > > > > > > of xmit_skb().
> > > > > > > I may miss something, but I don't see any thing that prevents us from using
> > > > > > > xmit_skb() directly.
> > > > > > > 
> > > > > > Yes, but my initial idea was to make it more parametric and not related to the
> > > > > > virtio_net_hdr, so the 'hdr_len' could be a parameter and the
> > > > > > 'num_buffers' should be handled by the caller.
> > > > > > 
> > > > > > > > Let me know if you have in mind better names or if I should put these function
> > > > > > > > in another place.
> > > > > > > > 
> > > > > > > > I would like to leave the control part completely separate, so, for example,
> > > > > > > > the two drivers will negotiate the features independently and they will call
> > > > > > > > the right virtskb_receive_*() function based on the negotiation.
> > > > > > > If it's one the issue of negotiation, we can simply change the
> > > > > > > virtnet_probe() to deal with different devices.
> > > > > > > 
> > > > > > > 
> > > > > > > > I already started to work on it, but before to do more steps and send an RFC
> > > > > > > > patch, I would like to hear your opinion.
> > > > > > > > Do you think that makes sense?
> > > > > > > > Do you see any issue or a better solution?
> > > > > > > I still think we need to seek a way of adding some codes on virtio-net.c
> > > > > > > directly if there's no huge different in the processing of TX/RX. That would
> > > > > > > save us a lot time.
> > > > > > After the reading of the buffers from the virtqueue I think the process
> > > > > > is slightly different, because virtio-net will interface with the network
> > > > > > stack, while virtio-vsock will interface with the vsock-core (socket).
> > > > > > So the virtio-vsock implements the following:
> > > > > > - control flow mechanism to avoid to loose packets, informing the peer
> > > > > >     about the amount of memory available in the receive queue using some
> > > > > >     fields in the virtio_vsock_hdr
> > > > > > - de-multiplexing parsing the virtio_vsock_hdr and choosing the right
> > > > > >     socket depending on the port
> > > > > > - socket state handling
> > > 
> > > I think it's just a branch, for ethernet, go for networking stack. otherwise
> > > go for vsock core?
> > > 
> > Yes, that should work.
> > 
> > So, I should refactor the functions that can be called also from the vsock
> > core, in order to remove "struct net_device *dev" parameter.
> > Maybe creating some wrappers for the network stack.
> > 
> > Otherwise I should create a fake net_device for vsock_core.
> > 
> > What do you suggest?
> 
> 
> I'm not quite sure I get the question. Can you just use the one that created
> by virtio_net?

Sure, sorry but I missed that it is allocated in the virtnet_probe()!

Thanks,
Stefano

^ permalink raw reply

* [PATCH v2] ethtool: igb: dump RR2DCDELAY register
From: Artem Bityutskiy @ 2019-07-15 10:59 UTC (permalink / raw)
  To: John W . Linville; +Cc: netdev
In-Reply-To: <20190715065228.88377-1-artem.bityutskiy@linux.intel.com>

From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

Decode 'RR2DCDELAY' register which Linux kernel provides starting from version
5.3. The corresponding commit in the Linux kernel is:
    cd502a7f7c9c igb: add RR2DCDELAY to ethtool registers dump

The RR2DCDELAY register is present in I210 and I211 Intel Gigabit Ethernet
chips and it stands for "Read Request To Data Completion Delay". Here is how
this register is described in the I210 datasheet:

"This field captures the maximum PCIe split time in 16 ns units, which is the
maximum delay between the read request to the first data completion. This is
giving an estimation of the PCIe round trip time."

In practice, this register can be used to measure the time it takes the NIC to
read data from the host memory.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
---
 igb.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Changelog:
v2: Fixed a typo in the commentary.
v1: Initial pach version.


diff --git a/igb.c b/igb.c
index e0ccef9..cb24877 100644
--- a/igb.c
+++ b/igb.c
@@ -859,6 +859,18 @@ igb_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs)
 		"0x03430: TDFPC       (Tx data FIFO packet count)      0x%08X\n",
 		regs_buff[550]);
 
+	/*
+	 * Starting from kernel version 5.3 the registers dump buffer grew from
+	 * 739 4-byte words to 740 words, and word 740 contains the RR2DCDELAY
+	 * register.
+	 */
+	if (regs->len < 740)
+		return 0;
+
+	fprintf(stdout,
+		"0x05BF4: RR2DCDELAY  (Max. DMA read delay)            0x%08X\n",
+		regs_buff[739]);
+
 	return 0;
 }
 
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH] ethtool: igb: dump RR2DCDELAY register
From: Artem Bityutskiy @ 2019-07-15 11:01 UTC (permalink / raw)
  To: Michal Kubecek; +Cc: John W . Linville, netdev
In-Reply-To: <20190715091334.GB24551@unicorn.suse.cz>

On Mon, 2019-07-15 at 11:13 +0200, Michal Kubecek wrote:
> > +	/*
> > +	 * Starting from kernel version 5.3 the registers dump buffer grew from
> > +	 * 739 4-byte words to 740 words, and word 740 contains the RR2DCDLAY
> 
> nit: "E" missing here:                                                    ^

Fixed and sent out v2.

> > +	fprintf(stdout,
> > +		"0x05BF4: RR2DCDELAY  (Max. DMA read delay)            0x%08X\n",
> > +		regs_buff[739]);
> 
> Showing a delay as hex number doesn't seem very user friendly but that
> also applies to many existing registers so it's probably better to be
> consistent and perhaps do an overall cleanup later.

Agree.

Thanks!


^ permalink raw reply

* [PATCH v2] e1000e: Make speed detection on hotplugging cable more reliable
From: Kai-Heng Feng @ 2019-07-15 12:25 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: intel-wired-lan, netdev, linux-kernel, Kai-Heng Feng
In-Reply-To: <20190715084355.9962-1-kai.heng.feng@canonical.com>

After hotplugging an 1Gbps ethernet cable with 1Gbps link partner, the
MII_BMSR may report 10Mbps, renders the network rather slow.

The issue has much lower fail rate after commit 59653e6497d1 ("e1000e:
Make watchdog use delayed work"), which essentially introduces some
delay before running the watchdog task.

But there's still a chance that the hotplugging event and the queued
watchdog task gets run at the same time, then the original issue can be
observed once again.

So let's use mod_delayed_work() to add a deterministic 1 second delay
before running watchdog task, after an interrupt.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index e4baa13b3cda..c83bf5349d53 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -1780,8 +1780,8 @@ static irqreturn_t e1000_intr_msi(int __always_unused irq, void *data)
 		}
 		/* guard against interrupt when we're going down */
 		if (!test_bit(__E1000_DOWN, &adapter->state))
-			queue_delayed_work(adapter->e1000_workqueue,
-					   &adapter->watchdog_task, 1);
+			mod_delayed_work(adapter->e1000_workqueue,
+					 &adapter->watchdog_task, HZ);
 	}
 
 	/* Reset on uncorrectable ECC error */
@@ -1861,8 +1861,8 @@ static irqreturn_t e1000_intr(int __always_unused irq, void *data)
 		}
 		/* guard against interrupt when we're going down */
 		if (!test_bit(__E1000_DOWN, &adapter->state))
-			queue_delayed_work(adapter->e1000_workqueue,
-					   &adapter->watchdog_task, 1);
+			mod_delayed_work(adapter->e1000_workqueue,
+					 &adapter->watchdog_task, HZ);
 	}
 
 	/* Reset on uncorrectable ECC error */
@@ -1907,8 +1907,8 @@ static irqreturn_t e1000_msix_other(int __always_unused irq, void *data)
 		hw->mac.get_link_status = true;
 		/* guard against interrupt when we're going down */
 		if (!test_bit(__E1000_DOWN, &adapter->state))
-			queue_delayed_work(adapter->e1000_workqueue,
-					   &adapter->watchdog_task, 1);
+			mod_delayed_work(adapter->e1000_workqueue,
+					 &adapter->watchdog_task, HZ);
 	}
 
 	if (!test_bit(__E1000_DOWN, &adapter->state))
-- 
2.17.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox