Netdev List


* RE: [PATCH net-netx] net: lan78xx: add LAN7801 MAC-only support
From: Woojung.Huh @ 2016-11-29 23:55 UTC (permalink / raw)
  To: f.fainelli, davem, andrew; +Cc: netdev, UNGLinuxDriver
In-Reply-To: <dcabc07f-ab5f-ad25-5beb-09f717c06e4e@gmail.com>

> There are two ways to get these settings propagated to the PHY driver:
> 
> - using a board fixup which is going to be invoked during
> drv->config_init() time
> 
> - specifying a phydev->dev_flags and reading it from the PHY driver to
> act upon and configure the PHY based on that value, there are only
> 32-bits available though, and you need to make sure they are not
> conflicting with other potential users in tree
> 
> My preference would go with 1, since you could just register it in your
> PHY driver and re-use the code you are proposing to include here.

Florian,

It seems phy_unregister_fixup() will be needed for a network driver built as a module.
phy_fixup_list keeps its entries even after the module is unloaded.
Do you know whether any update is waiting for submission? If not, I'll make a patch.

Thanks.
Woojung

* Re: [PATCH net-netx] net: lan78xx: add LAN7801 MAC-only support
From: Florian Fainelli @ 2016-11-29 23:57 UTC (permalink / raw)
  To: Woojung.Huh, davem, andrew; +Cc: netdev, UNGLinuxDriver
In-Reply-To: <9235D6609DB808459E95D78E17F2E43D409700F2@CHN-SV-EXMX02.mchp-main.com>

On 11/29/2016 03:55 PM, Woojung.Huh@microchip.com wrote:
>> There are two ways to get these settings propagated to the PHY driver:
>>
>> - using a board fixup which is going to be invoked during
>> drv->config_init() time
>>
>> - specifying a phydev->dev_flags and reading it from the PHY driver to
>> act upon and configure the PHY based on that value, there are only
>> 32-bits available though, and you need to make sure they are not
>> conflicting with other potential users in tree
>>
>> My preference would go with 1, since you could just register it in your
>> PHY driver and re-use the code you are proposing to include here.
> 
> Florian,
> 
> It seems phy_unregister_fixup() will be needed for a network driver built as a module.
> phy_fixup_list keeps its entries even after the module is unloaded.
> Do you know whether any update is waiting for submission? If not, I'll make a patch.

Oh, yes, that's a good point, we need such a thing, so far fixups have
been exclusively used by code that is built-in, but there really is not
a reason for that. Please go ahead and cook a patch for this, thanks!
-- 
Florian
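
The lifetime problem discussed above can be sketched in userspace C. This is
only a model under stated assumptions: fixup_register(), fixup_unregister(),
fixup_count() and noop_fixup() are hypothetical names, the match rule (exact
bus id, uid and mask) is an assumption, and nothing here is the kernel's
phylib API. It shows why a registration list that outlives its owner needs a
matching removal call.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Userspace model only; all names are hypothetical, not the kernel API. */
struct fixup {
	struct fixup *next;
	char bus_id[32];
	uint32_t phy_uid;
	uint32_t phy_uid_mask;
	int (*run)(void *phydev);
};

/* Stands in for the global list that survives a module unload. */
static struct fixup *fixup_list;

/* Example callback a driver module might register. */
static int noop_fixup(void *phydev)
{
	(void)phydev;
	return 0;
}

/* Add a fixup to the global list; returns 0, or -1 if allocation fails. */
static int fixup_register(const char *bus_id, uint32_t uid, uint32_t mask,
			  int (*run)(void *phydev))
{
	struct fixup *f;

	assert(bus_id);
	f = calloc(1, sizeof(*f));
	if (!f)
		return -1;
	/* calloc() zeroed the buffer, so the copy stays NUL-terminated. */
	strncpy(f->bus_id, bus_id, sizeof(f->bus_id) - 1);
	f->phy_uid = uid;
	f->phy_uid_mask = mask;
	f->run = run;
	f->next = fixup_list;
	fixup_list = f;
	return 0;
}

/* Remove every entry matching all three keys; returns how many were removed. */
static int fixup_unregister(const char *bus_id, uint32_t uid, uint32_t mask)
{
	struct fixup **pp = &fixup_list;
	int removed = 0;

	while (*pp) {
		struct fixup *f = *pp;

		if (!strcmp(f->bus_id, bus_id) && f->phy_uid == uid &&
		    f->phy_uid_mask == mask) {
			*pp = f->next;
			free(f);
			removed++;
		} else {
			pp = &f->next;
		}
	}
	return removed;
}

/* Count the entries still on the list. */
static int fixup_count(void)
{
	int n = 0;

	for (const struct fixup *f = fixup_list; f; f = f->next)
		n++;
	return n;
}
```

In this model a module's exit path would call fixup_unregister() with the same
keys it registered; without that call the entry, and its pointer into unloaded
code, would stay on the list.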

* Re: [PATCH] net: arc_emac: add dependencies on associated arches and compile test
From: David Miller @ 2016-11-29 23:57 UTC (permalink / raw)
  To: pbrobinson; +Cc: zhengxing, al.kochet, tremyfr, netdev
In-Reply-To: <20161128071237.9016-1-pbrobinson@gmail.com>

From: Peter Robinson <pbrobinson@gmail.com>
Date: Mon, 28 Nov 2016 07:12:37 +0000

> Add dependencies on the architectures that support these devices and
> add compile test to ensure ongoing code build coverage.
> 
> Signed-off-by: Peter Robinson <pbrobinson@gmail.com>

Applied.

* RE: [PATCH net-netx] net: lan78xx: add LAN7801 MAC-only support
From: Woojung.Huh @ 2016-11-30  0:01 UTC (permalink / raw)
  To: f.fainelli, davem, andrew; +Cc: netdev, UNGLinuxDriver
In-Reply-To: <b0dea0b1-11d1-3d58-c055-41824eaf1d1f@gmail.com>

> > It seems phy_unregister_fixup() will be needed for a network driver built as a module.
> > phy_fixup_list keeps its entries even after the module is unloaded.
> > Do you know whether any update is waiting for submission? If not, I'll make a patch.
> 
> Oh, yes, that's a good point, we need such a thing, so far fixups have
> been exclusively used by code that is built-in, but there really is not
> a reason for that. Please go ahead and cook a patch for this, thanks!

OK. Will do it.

Thanks.
- Woojung

* Re: [PATCH] net: macb: Write only necessary bits in NCR in macb reset
From: David Miller @ 2016-11-30  0:05 UTC (permalink / raw)
  To: harini.katakam
  Cc: nicolas.ferre, harinikatakamlinux, netdev, linux-kernel, harinik,
	michals
In-Reply-To: <1480325029-39224-1-git-send-email-harinik@xilinx.com>

From: Harini Katakam <harini.katakam@xilinx.com>
Date: Mon, 28 Nov 2016 14:53:49 +0530

> In macb_reset_hw, use read-modify-write to disable RX and TX.
> This way existing settings and reserved bits won't be disturbed.
> Use the same method for clearing statistics as well.
> 
> Signed-off-by: Harini Katakam <harinik@xilinx.com>

This doesn't make much sense to me.

Consider the two callers of this function.

macb_init_hw() is going to do a non-masking write to the NCR
register:

	/* Enable TX and RX */
	macb_writel(bp, NCR, MACB_BIT(RE) | MACB_BIT(TE) | MACB_BIT(MPE));

So obviously no other writable fields matter at all for programming
the chip properly, otherwise macb_init_hw() would "or" in the bits
after a read of NCR.  But that's not what this code does, it
writes "RE | TE | MPE" directly.

And the other caller is macb_close() which is shutting down the
chip so can zero out all the other bits and it can't possibly
matter, also due to the assertion above about macb_init_hw()
showing that only the RE, TE, and MPE bits matter for proper
functioning of the chip.

You haven't shown an issue caused by the way the code works now, so
this patch isn't fixing a bug.  In fact, the "bit preserving" would
even be misleading to someone reading the code.  They will ask
themselves what bits need to be preserved, and as shown above none of
them need to be.

I'm not applying this, sorry.
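
The argument above can be modelled in userspace C. This is a sketch under
assumptions: the bit positions, the NCR_OTHER bit and all function names are
illustrative, and reset_plain() assumes the existing reset path simply zeroes
the register. It shows that whichever reset style is used, the non-masking
write in the init path leaves the same final value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; they are not taken from the thread. */
#define NCR_RE		(1u << 2)
#define NCR_TE		(1u << 3)
#define NCR_MPE		(1u << 4)
#define NCR_OTHER	(1u << 8)	/* stands for any other writable bit */

static uint32_t ncr;			/* models the NCR register */

/* Reset as the patch proposed: clear only RE and TE, keep the rest. */
static void reset_rmw(void)
{
	ncr &= ~(NCR_RE | NCR_TE);
}

/* Reset assumed for the existing code: overwrite the whole register. */
static void reset_plain(void)
{
	ncr = 0;
}

/* Models the non-masking write in macb_init_hw() quoted above. */
static void init_hw(void)
{
	ncr = NCR_RE | NCR_TE | NCR_MPE;
}

/* Run reset followed by init and return the final register value. */
static uint32_t reset_then_init(uint32_t start, bool use_rmw)
{
	ncr = start;
	if (use_rmw)
		reset_rmw();
	else
		reset_plain();
	init_hw();
	/* Whatever reset preserved is gone after the plain write. */
	assert(ncr == (NCR_RE | NCR_TE | NCR_MPE));
	return ncr;
}
```

In this model a bit preserved by reset_rmw() survives only until init_hw()
runs, which is the reason given above for calling the bit-preserving reset
misleading.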

* Re: [PATCH] net: stmmac: enable tx queue 0 for gmac4 IPs synthesized with multiple TX queues
From: David Miller @ 2016-11-30  0:11 UTC (permalink / raw)
  To: niklas.cassel
  Cc: peppe.cavallaro, alexandre.torgue, niklass, netdev, linux-kernel
In-Reply-To: <1479998194-7113-1-git-send-email-niklass@axis.com>

From: Niklas Cassel <niklas.cassel@axis.com>
Date: Thu, 24 Nov 2016 15:36:33 +0100

> From: Niklas Cassel <niklas.cassel@axis.com>
> 
> The dwmac4 IP can be synthesized with 1-8 tx queues.
> On an IP synthesized with DWC_EQOS_NUM_TXQ > 1, all txqueues are disabled
> by default. For these IPs, the bitfield TXQEN is R/W.
> 
> Always enable tx queue 0. The write will have no effect on IPs synthesized
> with DWC_EQOS_NUM_TXQ == 1.
> 
> The driver still does not utilize more than one tx queue in the IP.
> 
> Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>

Applied to net-next, thanks.

* Re: [PATCH net-next v3 3/4] bpf: BPF for lightweight tunnel infrastructure
From: Alexei Starovoitov @ 2016-11-30  0:15 UTC (permalink / raw)
  To: Thomas Graf; +Cc: davem, netdev, daniel, tom, roopa, hannes
In-Reply-To: <7cc79a82e49996a9ebbff861af471018fdf118fb.1480424542.git.tgraf@suug.ch>

On Tue, Nov 29, 2016 at 02:21:22PM +0100, Thomas Graf wrote:
> Registers new BPF program types which correspond to the LWT hooks:
>   - BPF_PROG_TYPE_LWT_IN   => dst_input()
>   - BPF_PROG_TYPE_LWT_OUT  => dst_output()
>   - BPF_PROG_TYPE_LWT_XMIT => lwtunnel_xmit()
> 
> The separate program types are required to differentiate between the
> capabilities each LWT hook allows:
> 
>  * Programs attached to dst_input() or dst_output() are restricted and
>    may only read the data of an skb. This prevents modification and
>    possible invalidation of already validated packet headers on receive
>    and the construction of illegal headers while the IP headers are
>    still being assembled.
> 
>  * Programs attached to lwtunnel_xmit() are allowed to modify packet
>    content as well as prepending an L2 header via a newly introduced
>    helper bpf_skb_push(). This is safe as lwtunnel_xmit() is invoked
>    after the IP header has been assembled completely.
> 
> All BPF programs receive an skb with L3 headers attached and may return
> one of the following error codes:
> 
>  BPF_OK - Continue routing as per nexthop
>  BPF_DROP - Drop skb and return EPERM
>  BPF_REDIRECT - Redirect skb to device as per redirect() helper.
>                 (Only valid in lwtunnel_xmit() context)
> 
> The return codes are binary compatible with their TC_ACT_
> relatives to ease compatibility.
> 
> Signed-off-by: Thomas Graf <tgraf@suug.ch>
...
> +#define LWT_BPF_MAX_HEADROOM 128

why 128?
btw I'm thinking for XDP to use 256, so metadata can be stored in there.

> +static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt,
> +		       struct dst_entry *dst, bool can_redirect)
> +{
> +	int ret;
> +
> +	/* Preempt disable is needed to protect per-cpu redirect_info between
> +	 * BPF prog and skb_do_redirect(). The call_rcu in bpf_prog_put() and
> +	 * access to maps strictly require a rcu_read_lock() for protection,
> +	 * mixing with BH RCU lock doesn't work.
> +	 */
> +	preempt_disable();
> +	rcu_read_lock();
> +	bpf_compute_data_end(skb);
> +	ret = BPF_PROG_RUN(lwt->prog, skb);
> +	rcu_read_unlock();
> +
> +	switch (ret) {
> +	case BPF_OK:
> +		break;
> +
> +	case BPF_REDIRECT:
> +		if (!can_redirect) {
> +			WARN_ONCE(1, "Illegal redirect return code in prog %s\n",
> +				  lwt->name ? : "<unknown>");
> +			ret = BPF_OK;
> +		} else {
> +			ret = skb_do_redirect(skb);

I think this assumes that program did bpf_skb_push and L2 header is present.
Would it make sense to check that mac_header < network_header here to make
sure that it actually happened? I think the cost of single 'if' isn't much.
Also skb_do_redirect() can redirect to l3 tunnels like ipip ;)
so program shouldn't be doing bpf_skb_push in such case...
Maybe rename bpf_skb_push to bpf_skb_push_l2 ?
since it's doing skb_reset_mac_header(skb); at the end of it?
Or it's probably better to use 'flags' argument to tell whether
bpf_skb_push() should set mac_header or not ? Then this bit:

> +		case BPF_OK:
> +			/* If the L3 header was expanded, headroom might be too
> +			 * small for L2 header now, expand as needed.
> +			 */
> +			ret = xmit_check_hhlen(skb);

will work fine as well...
which probably needs "mac_header wasn't set" check? or it's fine?

All bpf bits look great. Thanks!

* Re: [PATCH net-next v3 4/4] bpf: Add tests and samples for LWT-BPF
From: Alexei Starovoitov @ 2016-11-30  0:17 UTC (permalink / raw)
  To: Thomas Graf; +Cc: davem, netdev, daniel, tom, roopa, hannes
In-Reply-To: <25a6f8d0d56175bb27a05c18eb54bc3bc5c09fa1.1480424542.git.tgraf@suug.ch>

On Tue, Nov 29, 2016 at 02:21:23PM +0100, Thomas Graf wrote:
> Adds a series of tests to verify the functionality of attaching
> BPF programs at LWT hooks.
> 
> Also adds a sample which collects a histogram of packet sizes which
> pass through an LWT hook.
> 
> $ ./lwt_len_hist.sh
> Starting netserver with host 'IN(6)ADDR_ANY' port '12865' and family AF_UNSPEC
> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.253.2 () port 0 AF_INET : demo
> Recv   Send    Send
> Socket Socket  Message  Elapsed
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
> 
>  87380  16384  16384    10.00    39857.69

Nice!

> +	ret = bpf_redirect(ifindex, 0);
> +	if (ret < 0) {
> +		printk("bpf_redirect() failed: %d\n", ret);
> +		return BPF_DROP;
> +	}

this 'if' looks a bit weird. You're passing 0 as flags,
so this helper will always succeed.
Other sample code often does 'return bpf_redirect(...)'
due to this reasoning.

* Re: [PATCH] net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during resume
From: Ivan Khoronzhuk @ 2016-11-30  0:18 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: David S. Miller, netdev, Mugunthan V N, Sekhar Nori, linux-kernel,
	linux-omap, Dave Gerlach
In-Reply-To: <20161129222703.10908-1-grygorii.strashko@ti.com>

On Tue, Nov 29, 2016 at 04:27:03PM -0600, Grygorii Strashko wrote:
> netif_set_real_num_tx/rx_queues() are required to be called with rtnl_lock
> taken, otherwise ASSERT_RTNL() warning will be triggered - which happens
> now during System resume from suspend:
> cpsw_resume()
> |- cpsw_ndo_open()
>   |- netif_set_real_num_tx/rx_queues()
>      |- ASSERT_RTNL();
> 
> Hence, fix it by surrounding cpsw_ndo_open() by rtnl_lock/unlock() calls.
> 
> Cc: Dave Gerlach <d-gerlach@ti.com>
> Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> Fixes: commit e05107e6b747 ("net: ethernet: ti: cpsw: add multi queue support")
> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

> ---
>  drivers/net/ethernet/ti/cpsw.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
> index ae1ec6a..fd6c03b 100644
> --- a/drivers/net/ethernet/ti/cpsw.c
> +++ b/drivers/net/ethernet/ti/cpsw.c
> @@ -2944,6 +2944,8 @@ static int cpsw_resume(struct device *dev)
>  	/* Select default pin state */
>  	pinctrl_pm_select_default_state(dev);
>  
> +	/* shut up ASSERT_RTNL() warning in netif_set_real_num_tx/rx_queues */
> +	rtnl_lock();
>  	if (cpsw->data.dual_emac) {
>  		int i;
>  
> @@ -2955,6 +2957,8 @@ static int cpsw_resume(struct device *dev)
>  		if (netif_running(ndev))
>  			cpsw_ndo_open(ndev);
>  	}
> +	rtnl_unlock();
> +
>  	return 0;
>  }
>  #endif
> -- 
> 2.10.1
> 

* Re: [PATCH net-next 5/5] udp: add recvmmsg implementation
From: David Miller @ 2016-11-30  0:22 UTC (permalink / raw)
  To: hannes; +Cc: pabeni, netdev, edumazet, brouer, sd
In-Reply-To: <1165706e-b828-cb12-4bea-b77ccca1cb95@stressinduktion.org>

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Fri, 25 Nov 2016 18:09:00 +0100

> During review we discussed on how to handle major errors in the kernel:
> 
> The old code and the new code still can report back success even though
> the kernel got back an EFAULT while copying from kernel space to user
> space (due to bad pointers).
> 
> I favor that we drop all packets (also the already received batches) in
> this case and let the code report -EFAULT and increase sk_drops for all
> dropped packets from the queue.
> 
> Currently sk_err is set so the next syscall would get an -EFAULT, which
> seems very bad and can also be overwritten by incoming icmp packets, so
> we never get a notification that we actually had a bad pointer somewhere
> in the mmsghdr. Also delivering -EFAULT on the follow-up syscalls really
> will make people confused that use strace.
> 
> If people would like to know the amount of packets dropped we can make
> sk_drops readable by a getsockopt.
> 
> Thoughts?
> 
> Unfortunately the interface doesn't allow for better error handling.

I think this is a major problem.

If, as a side effect of batch dequeueing the SKBs from the socket,
you cannot stop properly mid-transfer if an error occurs, well then
you simply cannot batch like that.

You have to stop at the exact byte where an error occurs mid-stream,
return the successful amount of bytes transferred, and then return
the error on the next recvmmsg call.

There is no other sane error reporting strategy.

If I get 4 frames, and the kernel can successfully copy the first
three and get an -EFAULT on the 4th.  Dammit you better tell the
application this so it can properly process the first 3 packets and
then determine how it is going to error out and recover for the 4th
one.

If we need to add prioritized sk_err stuff, or another value like
"sk_app_err" to handle the ICMP vs. -EFAULT issue, so be it.

I know what you guys are thinking, in that you can't figure out a
way to avoid the transactional overhead if it is necessary to
"put back" some SKBs if one of them in the batch gets a fault.

That's too bad, we need a proper implementation and proper error
reporting.  Those performance numbers are useless if we effectively
lose error notifications.
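
The reporting strategy described above can be sketched as a small userspace
model. Everything here is an assumption made for illustration: struct
model_sock, copy_to_user_model() and model_recvmmsg() are hypothetical names,
a NULL buffer stands in for a faulting user pointer, and the real socket code
is far more involved. The model stops at the first failing copy, returns the
number of messages already delivered, keeps the failed packet queued, and
reports the error on the next call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define QLEN		8
#define PKT_SIZE	16

struct model_sock {
	char pkts[QLEN][PKT_SIZE];
	int head;		/* next packet to deliver */
	int tail;		/* one past the last queued packet */
	int pending_err;	/* error deferred to the next call */
};

/* Copy one packet to a "user" buffer; NULL models a faulting pointer. */
static int copy_to_user_model(char *dst, const char *src)
{
	if (!dst)
		return -EFAULT;
	memcpy(dst, src, PKT_SIZE);
	return 0;
}

/* Deliver up to n packets; returns the count delivered or a negative errno. */
static int model_recvmmsg(struct model_sock *sk, char **bufs, int n)
{
	int done = 0;

	assert(sk && bufs && n >= 0);

	/* An error left over from the previous batch is reported first. */
	if (sk->pending_err) {
		int err = sk->pending_err;

		sk->pending_err = 0;
		return err;
	}

	while (done < n && sk->head < sk->tail) {
		int err = copy_to_user_model(bufs[done], sk->pkts[sk->head]);

		if (err) {
			if (!done)
				return err;	/* nothing delivered yet */
			sk->pending_err = err;	/* report on the next call */
			break;
		}
		sk->head++;	/* dequeue only after a successful copy */
		done++;
	}
	return done;
}
```

Because head advances only after a successful copy, the packet that faulted
is still queued, so the application can process the first three frames and
then decide how to recover the fourth.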

* Re: [PATCH net-next 2/2] tcp: allow to turn tcp timestamp randomization off
From: Florian Westphal @ 2016-11-30  0:21 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Florian Westphal, netdev
In-Reply-To: <1480442832.18162.148.camel@edumazet-glaptop3.roam.corp.google.com>

Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2016-11-29 at 16:45 +0100, Florian Westphal wrote:
> > Eric says: "By looking at tcpdump, and TS val of xmit packets of multiple
> > flows, we can deduce the relative qdisc delays (think of fq pacing).
> > This should work even if we have one flow per remote peer."
> > 
> > Having random per flow (or host) offsets doesn't allow that anymore so add
> > a way to turn this off.
> > 
> > Suggested-by: Eric Dumazet <edumazet@google.com>
> > Signed-off-by: Florian Westphal <fw@strlen.de>
> > ---
> 
> Excellent, thanks !
> 
> Acked-by: Eric Dumazet <edumazet@google.com>

Thanks for the ack, I missed connect() side though so this doesn't work
for outgoing connections, sorry :-/

I will send a v2.

* Re: bpf debug info
From: Alexei Starovoitov @ 2016-11-30  0:28 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Jakub Kicinski, netdev, Brenden Blanco, Thomas Graf, Wangnan,
	He Kuang, kernel-team
In-Reply-To: <583DE3AE.8030103@iogearbox.net>

On Tue, Nov 29, 2016 at 09:23:10PM +0100, Daniel Borkmann wrote:
> On 11/29/2016 07:51 PM, Alexei Starovoitov wrote:
> >On Tue, Nov 29, 2016 at 03:38:18PM +0000, Jakub Kicinski wrote:
> [...]
> >>>>So next step is to improve verifier messages to be more human friendly.
> >>>>The step after is to introduce BPF_COMMENT pseudo instruction
> >>>>that will be ignored by the interpreter yet it will contain the text
> >>>>of original source code. Then llvm-objdump step won't be necessary.
> >>>>The bpf loader will load both instructions and pieces of C sources.
> >>>>Then verifier errors should be even easier to read and humans
> >>>>can easily understand the purpose of the program.
> >>>
> >>>So the BPF_COMMENT pseudo insn will get stripped away from the insn array
> >>>after verification step, so we don't need to hold/account for this mem? I
> >>>assume in its ->imm member it will just hold offset into text blob?
> >>
> >>Associating any form of opaque data with programs always makes me
> >>worried about opening a side channel of communication with a specialized
> >>user space implementations/compilers.  But I guess if the BPF_COMMENTs
> >>are stripped in the verifier as Daniel assumes drivers and JITs will
> >>never see it.
> >
> >yes. the idea is that it's a comment. It can contain any text,
> >not only C code, but any other language.
> >It's definitely going to be stripped before JITs and kernel will
> >not make any safety or translation decisions based on such comment.
> >
> >>Just to clarify, however - is there any reason why pushing the source
> >>code into the kernel is necessary?  Or is it just for convenience?
> >>Provided the user space loader has access to the debug info it should
> >>have no problems matching the verifier output to code lines?
> >
> >correct. just for convenience. The user space has to keep .o around,
> >since it can crash, would have to reload and so on.
> >Only for some script that ssh-es into servers and wants to see
> >what is being loaded, it might help to dump full asm and these comments
> >along with prog_digest that Daniel is working on in parallel.
> 
> Which would mean we'd need to keep it around somewhere (prog aux data?)
> in post-verification time (so potentially drivers/JITs could see it, too,
> just not inside insn stream). Some API glue code could probably blind
> this information for the JITing time to stop incentive of playing side
> channel games (e.g. core code could encrypt the pointer value and only
> core kernel knows how to access that data, no modules, no out-of-tree
> code). The other thing I'm wondering is, when we strip this info anyway
> from the insn stream to keep it in aux data (so it can later be reconstructed
> on a dump), then perhaps that is best done before prog loading time? It
> would then allow to keep complexity with stripping that insns out of the
> verifier. If semantics are that these comments are acting as a hole/gap
> (in a similar sense of what we have with cBPF today), then it can never
> become a jmp target and loaders could strip it out already (instead of
> teaching DFS, etc about it), and prepare a meta data structure in bpf_attr
> for bpf(2), and verifier works based on that one. What makes this problematic
> however is when you have rewrites in the kernel (ctx access, constant
> blinding, etc), but perhaps they could just adjust the offsets from that
> meta data thing as well?

yes. all correct.
if we keep comment==nop instructions as part of the program, it's not great,
since we'll be wasting performance for no strong reason.
If we remove them from instruction stream after the verifier then they
can only be useful in verifier messages and that's not much better
than existing 'llvm-objdump -S file.o' approach.
Hmm.

> >Alternatively instead of doing BPF_COMMENT we can load the whole .o
> >as-is into bpffs as a blob. Later (based on digest) the kernel can
> >dump such .o back for user space to run objdump on. It all can be
> >done without kernel involvement. Like tc command can copy .o and so on.
> >But not everything is using tc.
> 
> That means kernel must ensure/verify that loaded insns also come from
> that claimed object file; not sure if easily possible w/o parsing elf.
> It could work if the kernel loads everything based on the content of
> the object file itself,

it will only check that the program section of the file contains valid insns.
the user may still cheat with junk in dwarf section of such elf.
Sounds more and more that this should really be solved by user space
and correlation to be done via prog_digest.

* Re: [net-next PATCH v3 6/6] virtio_net: xdp, add slowpath case for non contiguous buffers
From: Alexei Starovoitov @ 2016-11-30  0:37 UTC (permalink / raw)
  To: John Fastabend
  Cc: eric.dumazet, daniel, shm, davem, tgraf, john.r.fastabend, netdev,
	bblanco, brouer
In-Reply-To: <20161129201133.26851.31803.stgit@john-Precision-Tower-5810>

On Tue, Nov 29, 2016 at 12:11:33PM -0800, John Fastabend wrote:
> virtio_net XDP support expects receive buffers to be contiguous.
> If this is not the case we enable a slowpath to allow connectivity
> to continue but at a significant performance overhead associated with
> linearizing data. To make users painfully aware that XDP is
> running in a degraded mode we throw an xdp buffer error.
> 
> To linearize packets we allocate a page and copy the segments of
> the data, including the header, into it. After this the page can be
> handled by XDP code flow as normal.
> 
> Then depending on the return code the page is either freed or sent
> to the XDP xmit path. There is no attempt to optimize this path.
> 
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
...
> +/* The conditions to enable XDP should preclude the underlying device from
> + * sending packets across multiple buffers (num_buf > 1). However per spec
> + * it does not appear to be illegal to do so but rather just against convention.
> + * So in order to avoid making a system unresponsive the packets are pushed
> + * into a page and the XDP program is run. This will be extremely slow and we
> + * push a warning to the user to fix this as soon as possible. Fixing this may
> + * require resolving the underlying hardware to determine why multiple buffers
> + * are being received or simply loading the XDP program in the ingress stack
> + * after the skb is built because there is no advantage to running it here
> + * anymore.
> + */
...
>  		if (num_buf > 1) {
>  			bpf_warn_invalid_xdp_buffer();
> -			goto err_xdp;
> +
> +			/* linearize data for XDP */
> +			xdp_page = xdp_linearize_page(rq, num_buf,
> +						      page, offset, &len);
> +			if (!xdp_page)
> +				goto err_xdp;

in case when we're 'lucky' the performance will silently be bad.
Can we do warn_once here? so at least something in dmesg points out
that performance is not as expected. Am I reading it correctly that
you had to do a special kernel hack to trigger this situation and
in all normal cases it's not the case?

* Re: [PATCH net-next v3 0/4] Fix OdroidC2 Gigabit Tx link issue
From: David Miller @ 2016-11-30  0:38 UTC (permalink / raw)
  To: jbrunet-rdvid1DuHRBWk0Htik3J/w
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA,
	f.fainelli-Re5JQEeQqe8AvxtiuMwx3w, carlo-KA+7E9HrN00dnm+yROfE0A,
	khilman-rdvid1DuHRBWk0Htik3J/w, peppe.cavallaro-qxv4g6HH51o,
	alexandre.torgue-qxv4g6HH51o,
	martin.blumenstingl-gM/Ye1E23mwN+BqQ9rBEUg,
	neolynx-Re5JQEeQqe8AvxtiuMwx3w, andrew-g2DYL2Zd6BY,
	narmstrong-rdvid1DuHRBWk0Htik3J/w,
	linux-amlogic-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1480326409-25419-1-git-send-email-jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>

From: Jerome Brunet <jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
Date: Mon, 28 Nov 2016 10:46:45 +0100

> This patchset fixes an issue with the OdroidC2 board (DWMAC + RTL8211F).
> The platform seems to enter LPI on the Rx path too often while performing
> relatively high TX transfer. This eventually breaks the link (both Tx and
> Rx), and requires bringing the interface down and up again to get the Rx
> path working again.
> 
> The root cause of this issue is not fully understood yet but disabling EEE
> advertisement on the PHY prevents this feature from being negotiated.
> With this change, the link is stable and reliable, with the expected
> throughput performance.
> 
> The patchset adds options in the generic phy driver to disable EEE
> advertisement, through device tree. The way it is done is very similar
> to the handling of the max-speed property.

Patches 1-3 applied to net-next, thanks.

* Re: [PATCH] cpsw: ethtool: add support for nway reset
From: David Miller @ 2016-11-30  0:41 UTC (permalink / raw)
  To: yegorslists; +Cc: netdev, linux-omap, grygorii.strashko, mugunthanvnm
In-Reply-To: <1480326472-5849-1-git-send-email-yegorslists@googlemail.com>

From: yegorslists@googlemail.com
Date: Mon, 28 Nov 2016 10:47:52 +0100

> From: Yegor Yefremov <yegorslists@googlemail.com>
> 
> This patch adds support for ethtool's '-r' command. Restarting
> N-WAY negotiation can be useful to activate newly changed EEE
> settings etc.
> 
> Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>

This doesn't apply cleanly to net-next.

* Re: [PATCH net-next v5 2/3] bpf: Add new cgroup attach type to enable sock modifications
From: David Ahern @ 2016-11-30  0:43 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: netdev, daniel, ast, daniel, maheshb, tgraf
In-Reply-To: <20161129200141.GB24152@ast-mbp.thefacebook.com>

On 11/29/16 1:01 PM, Alexei Starovoitov wrote:
> Could you also expose sk_protocol and sk_type as read only fields?

Those are bitfields in struct sock, so can't use offsetof or sizeof. Any existing use cases that try to load a bitfield in a bpf that I can look at?
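
The limitation mentioned above can be illustrated in plain C. This is a sketch
under assumptions, not the layout of struct sock: struct flags_word, struct
sock_model and the helper names are hypothetical. offsetof() cannot be applied
to a bit-field because a bit-field has no address, but the word that contains
it does, so one possible approach is to load that word and mask it. The mask
is discovered at run time because bit-field layout is implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout, loosely modelled on the fields discussed above. */
struct flags_word {
	unsigned int padding  : 8;
	unsigned int protocol : 8;
	unsigned int type     : 16;
};

_Static_assert(sizeof(struct flags_word) == sizeof(uint32_t),
	       "model assumes the bit-fields share one 32-bit word");

struct sock_model {
	int other_state;
	struct flags_word flags;	/* named member: offsetof() works here */
};

/* offsetof(struct flags_word, protocol) would not compile, so read the
 * whole containing word through the offset of the named member instead.
 */
static uint32_t load_flags_word(const struct sock_model *sk)
{
	uint32_t word;

	memcpy(&word, (const char *)sk + offsetof(struct sock_model, flags),
	       sizeof(word));
	return word;
}

/* Find which bits of the word hold 'protocol' on this compiler and ABI. */
static uint32_t protocol_mask(void)
{
	struct sock_model probe;

	memset(&probe, 0, sizeof(probe));
	probe.flags.protocol = 0xff;
	return load_flags_word(&probe);
}

/* Extract 'protocol' using only a word load, a mask and a shift. */
static unsigned int read_protocol(const struct sock_model *sk)
{
	uint32_t mask = protocol_mask();
	unsigned int shift = 0;

	assert(mask != 0);
	while (!((mask >> shift) & 1))
		shift++;
	return (load_flags_word(sk) & mask) >> shift;
}
```

The same load, mask and shift sequence is the kind of access a program could
be limited to when only offsets and sizes of whole words are available.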

* Re: [PATCH net-next v3 0/4] Fix OdroidC2 Gigabit Tx link issue
From: Florian Fainelli @ 2016-11-30  0:43 UTC (permalink / raw)
  To: David Miller, jbrunet-rdvid1DuHRBWk0Htik3J/w
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA,
	carlo-KA+7E9HrN00dnm+yROfE0A, khilman-rdvid1DuHRBWk0Htik3J/w,
	peppe.cavallaro-qxv4g6HH51o, alexandre.torgue-qxv4g6HH51o,
	martin.blumenstingl-gM/Ye1E23mwN+BqQ9rBEUg,
	neolynx-Re5JQEeQqe8AvxtiuMwx3w, andrew-g2DYL2Zd6BY,
	narmstrong-rdvid1DuHRBWk0Htik3J/w,
	linux-amlogic-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20161129.193853.827524417068912706.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

On 11/29/2016 04:38 PM, David Miller wrote:
> From: Jerome Brunet <jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> Date: Mon, 28 Nov 2016 10:46:45 +0100
> 
>> This patchset fixes an issue with the OdroidC2 board (DWMAC + RTL8211F).
>> The platform seems to enter LPI on the Rx path too often while performing
>> relatively high TX transfer. This eventually breaks the link (both Tx and
>> Rx), and requires bringing the interface down and up again to get the Rx
>> path working again.
>>
>> The root cause of this issue is not fully understood yet but disabling EEE
>> advertisement on the PHY prevents this feature from being negotiated.
>> With this change, the link is stable and reliable, with the expected
>> throughput performance.
>>
>> The patchset adds options in the generic phy driver to disable EEE
>> advertisement, through device tree. The way it is done is very similar
>> to the handling of the max-speed property.
> 
> Patches 1-3 applied to net-next, thanks.

Meh, there was a v4 submitted shortly after, and I objected to the whole
idea of using that kind of Device Tree properties to disable EEE, we can
send reverts though..
-- 
Florian

* Re: [PATCH net V2] net/sched: pedit: make sure that offset is valid
From: David Miller @ 2016-11-30  0:46 UTC (permalink / raw)
  To: amir; +Cc: netdev, xiyou.wangcong, jhs, ogerlitz, hadarh, jiri
In-Reply-To: <20161128105640.32363-1-amir@vadai.me>

From: Amir Vadai <amir@vadai.me>
Date: Mon, 28 Nov 2016 12:56:40 +0200

> Add a validation function to make sure offset is valid:
> 1. Not below skb head (could happen when offset is negative).
> 2. Validate both 'offset' and 'at'.
> 
> Signed-off-by: Amir Vadai <amir@vadai.me>

Applied and queued up for -stable, thanks.

* Re: Crash due to mutex genl_lock called from RCU context
From: David Miller @ 2016-11-30  0:49 UTC (permalink / raw)
  To: herbert; +Cc: xiyou.wangcong, eric.dumazet, subashab, tgraf, netdev
In-Reply-To: <20161128112211.GA990@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 28 Nov 2016 19:22:12 +0800

> netlink: Call cb->done from a worker thread
> 
> The cb->done interface expects to be called in process context.
> This was broken by the netlink RCU conversion.  This patch fixes
> it by adding a worker struct to make the cb->done call where
> necessary.
> 
> Fixes: 21e4902aea80 ("netlink: Lockless lookup with RCU grace...")
> Reported-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied and queued up for -stable, thanks Herbert.

* [PATCH net-next 1/1] driver: ipvlan: Remove useless member mtu_adj of struct ipvl_dev
From: fgao @ 2016-11-30  0:48 UTC (permalink / raw)
  To: davem, maheshb, edumazet, netdev, gfree.wind; +Cc: Gao Feng

From: Gao Feng <fgao@ikuai8.com>

mtu_adj is initialized to zero when the memory is allocated, and nothing
ever assigns to it afterwards. It is only read in ipvlan_adjust_mtu, as
an rvalue.
So it is a useless member of struct ipvl_dev; remove it.

Signed-off-by: Gao Feng <fgao@ikuai8.com>
---
 drivers/net/ipvlan/ipvlan.h      | 1 -
 drivers/net/ipvlan/ipvlan_main.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 7e0732f..05a62d2 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -73,7 +73,6 @@ struct ipvl_dev {
 	DECLARE_BITMAP(mac_filters, IPVLAN_MAC_FILTER_SIZE);
 	netdev_features_t	sfeatures;
 	u32			msg_enable;
-	u16			mtu_adj;
 };
 
 struct ipvl_addr {
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index ab90b22..c6aa667 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -32,7 +32,7 @@
 
 static void ipvlan_adjust_mtu(struct ipvl_dev *ipvlan, struct net_device *dev)
 {
-	ipvlan->dev->mtu = dev->mtu - ipvlan->mtu_adj;
+	ipvlan->dev->mtu = dev->mtu;
 }
 
 static int ipvlan_register_nf_hook(void)
-- 
1.9.1

^ permalink raw reply related

* Re: [PATCH net-next v2] bpf: cgroup: fix documentation of __cgroup_bpf_update()
From: David Miller @ 2016-11-30  0:52 UTC (permalink / raw)
  To: daniel; +Cc: ast, daniel, netdev, roszenrami, cgroups
In-Reply-To: <1480338664-22616-1-git-send-email-daniel@zonque.org>

From: Daniel Mack <daniel@zonque.org>
Date: Mon, 28 Nov 2016 14:11:04 +0100

> There's a 'not' missing in one paragraph. Add it.
> 
> Signed-off-by: Daniel Mack <daniel@zonque.org>
> Reported-by: Rami Rosen <roszenrami@gmail.com>
> Fixes: 3007098494be ("cgroup: add support for eBPF programs")

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] stmmac: fix comments, make debug output consistent
From: David Miller @ 2016-11-30  0:53 UTC (permalink / raw)
  To: pavel; +Cc: alexandre.torgue, peppe.cavallaro, netdev, linux-kernel, akpm
In-Reply-To: <20161128115559.GB15034@amd>

From: Pavel Machek <pavel@ucw.cz>
Date: Mon, 28 Nov 2016 12:55:59 +0100

> Fix comments, add some new, and make debugfs output consistent.
>
> Signed-off-by: Pavel Machek <pavel@denx.de>

Applied to net-next, thanks.

^ permalink raw reply

* Re: [PATCH net-next v5 2/3] bpf: Add new cgroup attach type to enable sock modifications
From: Alexei Starovoitov @ 2016-11-30  0:59 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, daniel, ast, daniel, maheshb, tgraf
In-Reply-To: <781e66b2-007c-bd56-2cd1-543c1f5dcdb7@cumulusnetworks.com>

On Tue, Nov 29, 2016 at 05:43:08PM -0700, David Ahern wrote:
> On 11/29/16 1:01 PM, Alexei Starovoitov wrote:
> > Could you also expose sk_protocol and sk_type as read-only fields?
> 
> Those are bitfields in struct sock, so can't use offsetof or sizeof. Any existing use cases that try to load a bitfield in a bpf that I can look at?

pkt_type and vlan are also bitfields in skb. Please see convert_skb_access().
There is a bit of ugliness due to __BIG_ENDIAN_BITFIELD, though.
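
The exchange above turns on the fact that offsetof and sizeof cannot be applied to a bitfield, so a bitfield has to be read by loading the byte that contains it and then masking and shifting. A minimal userspace sketch of that idea follows. It makes several assumptions: struct demo_sock and its field widths are hypothetical and are not the real struct sock layout, and the byte offset and mask are discovered by probing at run time, where kernel code would fix them at compile time (which is where the __BIG_ENDIAN_BITFIELD ugliness comes from).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct with bitfields; widths chosen for illustration. */
struct demo_sock {
	int refcnt;
	unsigned int type : 3;
	unsigned int protocol : 8;
};

/* Where the 'type' bitfield lives inside the struct. */
struct field_loc {
	size_t offset;		/* offset of the byte holding the field */
	unsigned char mask;	/* bits of that byte owned by the field */
	unsigned int shift;	/* position of the field's lowest bit */
};

/* offsetof(struct demo_sock, type) is not valid C, so find the field
 * by setting all of its bits in a zeroed struct and scanning for the
 * first non-zero byte. This works on either bitfield ordering. */
static struct field_loc probe_type_field(void)
{
	struct demo_sock s;
	struct field_loc loc = { 0, 0, 0 };
	const unsigned char *p = (const unsigned char *)&s;
	size_t i;

	memset(&s, 0, sizeof(s));
	s.type = 7;		/* all three bits set */
	for (i = 0; i < sizeof(s); i++) {
		if (p[i]) {
			loc.offset = i;
			loc.mask = p[i];
			break;
		}
	}
	assert(loc.mask != 0);
	while (!((loc.mask >> loc.shift) & 1))
		loc.shift++;
	return loc;
}

/* Read the field the way a byte-load instruction would: load the
 * containing byte, mask off the neighbours, shift down. */
static unsigned int load_type(const struct demo_sock *s, struct field_loc loc)
{
	const unsigned char *p = (const unsigned char *)s;

	return (unsigned int)(p[loc.offset] & loc.mask) >> loc.shift;
}
```

The sketch only covers a field that fits inside one byte. A wider field that straddles a byte boundary would need a wider load or more than one load.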

^ permalink raw reply

* Re: [PATCH net v2 1/1] net: macb: fix the RX queue reset in macb_rx()
From: David Miller @ 2016-11-30  1:02 UTC (permalink / raw)
  To: cyrille.pitchen
  Cc: nicolas.ferre, netdev, soren.brinkmann, Andrei.Pistirica,
	linux-arm-kernel, linux-kernel
In-Reply-To: <80ebb550eb6155e3b882cab1fb8d78a7385f8227.1480339901.git.cyrille.pitchen@atmel.com>

From: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Date: Mon, 28 Nov 2016 14:40:55 +0100

> On macb only (not gem), when a RX queue corruption was detected from
> macb_rx(), the RX queue was reset: during this process the RX ring
> buffer descriptor was initialized by macb_init_rx_ring() but we forgot
> to also set bp->rx_tail to 0.
> 
> Indeed, when processing the received frames, bp->rx_tail provides the
> macb driver with the index in the RX ring buffer of the next buffer to
> process. So when the whole ring buffer is reset we must also reset
> bp->rx_tail so the driver is synchronized again with the hardware.
> 
> Since macb_init_rx_ring() is called from many locations, currently from
> macb_rx() and macb_init_rings(), we'd rather add the "bp->rx_tail = 0;"
> line inside macb_init_rx_ring() than add the very same line after each
> call of this function.
> 
> Without this fix, the rx queue is not reset properly to recover from
> queue corruption and connection drop may occur.
> 
> Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
> Fixes: 9ba723b081a2 ("net: macb: remove BUG_ON() and reset the queue to handle RX errors")
> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>

Applied and queued up for -stable, thanks.
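
The reasoning in the commit message above (the driver's tail index must be reset together with the ring so driver and hardware agree on where the next frame is) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct demo_ring, DEMO_RING_SIZE and the demo_* functions are hypothetical and do not reproduce the macb driver; the only detail taken from the message is that the ring-init helper also sets rx_tail to 0.

```c
#include <assert.h>
#include <string.h>

#define DEMO_RING_SIZE 8	/* hypothetical ring size */

struct demo_ring {
	int used[DEMO_RING_SIZE];	/* 1 = slot holds a received frame */
	unsigned int rx_tail;		/* next slot the driver will process */
};

/* Reset the ring. Resetting rx_tail here, rather than after each
 * call, mirrors the choice described in the commit message. */
static void demo_init_rx_ring(struct demo_ring *r)
{
	memset(r->used, 0, sizeof(r->used));
	r->rx_tail = 0;
}

/* Model of the hardware filling 'count' slots starting at 'start'. */
static void demo_hw_receive(struct demo_ring *r, unsigned int start,
			    unsigned int count)
{
	unsigned int i;

	assert(count <= DEMO_RING_SIZE);
	for (i = 0; i < count; i++)
		r->used[(start + i) % DEMO_RING_SIZE] = 1;
}

/* Driver side: consume frames from rx_tail until an empty slot is
 * found, and return the number of frames processed. */
static unsigned int demo_rx(struct demo_ring *r)
{
	unsigned int n = 0;

	while (n < DEMO_RING_SIZE && r->used[r->rx_tail]) {
		r->used[r->rx_tail] = 0;
		r->rx_tail = (r->rx_tail + 1) % DEMO_RING_SIZE;
		n++;
	}
	return n;
}
```

If demo_init_rx_ring() left rx_tail untouched, the driver in this model would look at a stale slot after a reset while the hardware refilled the ring from slot 0, and demo_rx() would find no frames.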

^ permalink raw reply

* Re: [PATCH net-next v5 2/3] bpf: Add new cgroup attach type to enable sock modifications
From: David Ahern @ 2016-11-30  1:07 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: netdev, daniel, ast, daniel, maheshb, tgraf
In-Reply-To: <20161130005906.GB29591@ast-mbp.thefacebook.com>

On 11/29/16 5:59 PM, Alexei Starovoitov wrote:
> On Tue, Nov 29, 2016 at 05:43:08PM -0700, David Ahern wrote:
>> On 11/29/16 1:01 PM, Alexei Starovoitov wrote:
>>> Could you also expose sk_protocol and sk_type as read-only fields?
>>
>> Those are bitfields in struct sock, so can't use offsetof or sizeof. Any existing use cases that try to load a bitfield in a bpf that I can look at?
> 
> pkt_type and vlan are also bitfields in skb. Please see convert_skb_access().
> There is a bit of ugliness due to __BIG_ENDIAN_BITFIELD, though.
> 

Given the added complexity, I'd prefer to defer this second use case to a follow-on patch set. This one introduces the infrastructure for sockets, and I don't see anything in it that would need to change to add reads of 3 more sock elements. Agree?

^ permalink raw reply
