Netdev List
 help / color / mirror / Atom feed
* Re: [EXT] Re: [PATCH net-next 10/18] net: mvpp2: use the GoP interrupt for link status changes
From: Antoine Tenart @ 2017-08-23 16:04 UTC (permalink / raw)
  To: Stefan Chulski
  Cc: Antoine Tenart, Andrew Lunn, davem@davemloft.net,
	jason@lakedaemon.net, gregory.clement@free-electrons.com,
	sebastian.hesselbarth@gmail.com,
	thomas.petazzoni@free-electrons.com, Nadav Haklai,
	linux@armlinux.org.uk, mw@semihalf.com, netdev@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
In-Reply-To: <5ebd192c54984b25adda0a294035b1b9@IL-EXCH01.marvell.com>

[-- Attachment #1: Type: text/plain, Size: 1190 bytes --]

On Wed, Aug 23, 2017 at 03:24:55PM +0000, Stefan Chulski wrote:
> > When the cable is connected (there is signal) and the serdes is in sync and AN
> > succeeded.
> > 
> > > With SFF/SFP ports, you generally need a gpio line the fibre module
> > > can use to indicate if it has link. Fixed-phy has such support, and
> > > your link_change function will get called when the link changes.
> > 
> > So that would work when using SFP modules but I wonder if the GoP irq isn't
> > needed when using passive cable, in which case this patch would still be needed
> > (and of course we should support the new Russell phylib capabilities).
> 
> Even if new phylib driver supports passive direct cables connection, 
> GoP IRQ required for SOHO/Peridot external switch support.
> SOHO/Peridot external switch could be connected directly to Serdes line.

So I guess the GoP link irq patches are needed. Should I resend them
then?

We'll have to discuss how to handle fixed-phy vs GoP IRQ, but I guess we
can do this when adding the fixed-phy support later.

Thanks,
Antoine

-- 
Antoine Ténart, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [RFC PATCH] dt-binding: net: sfp binding documentation
From: Rob Herring @ 2017-08-23 16:18 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Baruch Siach, Mark Rutland, Andrew Lunn, Florian Fainelli,
	David S . Miller, netdev,
	devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <20170823135744.GU20805-l+eeeJia6m9URfEZ8mYm6t73F7V6hmMc@public.gmane.org>

On Wed, Aug 23, 2017 at 8:57 AM, Russell King - ARM Linux
<linux-I+IVW8TIWO2tmTQ+vhA3Yw@public.gmane.org> wrote:
> On Mon, Aug 21, 2017 at 02:12:42PM -0500, Rob Herring wrote:
>> On Mon, Aug 21, 2017 at 10:06 AM, Baruch Siach <baruch-NswTu9S1W3P6gbPvEgmw2w@public.gmane.org> wrote:
>> > Hi Russell,
>> >
>> > On Mon, Aug 21, 2017 at 01:53:17PM +0100, Russell King - ARM Linux wrote:
>> >> On Sun, Aug 20, 2017 at 01:28:06PM +0300, Baruch Siach wrote:
>> >> > Add device-tree binding documentation SFP transceivers. Support for SFP
>> >> > transceivers has been recently introduced (drivers/net/phy/sfp.c).
>> >> >
>> >> > Signed-off-by: Baruch Siach <baruch-NswTu9S1W3P6gbPvEgmw2w@public.gmane.org>
>> >> > ---
>> >> >
>> >> > The SFP driver is on net-next.
>> >> >
>> >> > Not sure about the rate-select-gpio property name. The SFP+ standard
>> >> > (not supported yet) uses two signals, RS0 and RS1. RS0 is compatible
>> >> > with the SFP rate select signal, while RS1 controls the Tx rate.
>> >>
>> >> SFP+ is usable with this, but the platforms I have do not wire the
>> >> rate select pins on the SFP+ sockets to GPIOs, but hard-wire them.
>> >
>> > So maybe naming this signal 'rate-select0-gpio' would make it more future
>> > (SPF+) proof? Or 'rate-select-rx-gpio'?
>>
>> Just extend it by making it an array of 2 gpios.
>
> What do you do if you have only one rate select wired up and it doesn't
> correspond with the first?

Seems unlikely, but possible I guess. In that case, 2 properties is
probably better. Otherwise, you'd have to put in -1 or 0 for the first
phandle.

Rob
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v3 3/4] net: stmmac: register parent MDIO node for sun8i-h3-emac
From: Florian Fainelli @ 2017-08-23 16:31 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Corentin Labbe, Chen-Yu Tsai, Andrew Lunn, Rob Herring,
	Mark Rutland, Russell King, Giuseppe Cavallaro, Alexandre Torgue,
	devicetree, linux-arm-kernel, linux-kernel, netdev
In-Reply-To: <20170823074942.yt3c5gvnmdw5pqge@flea.home>

On 08/23/2017 12:49 AM, Maxime Ripard wrote:
> Hi Florian,
> 
> On Tue, Aug 22, 2017 at 11:35:01AM -0700, Florian Fainelli wrote:
>>>>> So I think what you are saying is either impossible or engineering-wise
>>>>> a very stupid design, like using an external MAC with a discrete PHY
>>>>> connected to the internal MAC's MDIO bus, while using the internal MAC
>>>>> with the internal PHY.
>>>>>
>>>>> Now can we please decide on something? We're a week and a half from
>>>>> the 4.13 release. If mdio-mux is wrong, then we could have two mdio
>>>>> nodes (internal-mdio & external-mdio).
>>>>
>>>> I really don't see a need for a mdio-mux in the first place, just have
>>>> one MDIO controller (current state) sub-node which describes the
>>>> built-in STMMAC MDIO controller and declare the internal PHY as a child
>>>> node (along with 'phy-is-integrated'). If a different configuration is
>>>> used, then just put the external PHY as a child node there.
>>>>
>>>> If fixed-link is required, the mdio node becomes unused anyway.
>>>>
>>>> Works for everyone?
>>>
>>> If we put an external PHY with reg=1 as a child of internal MDIO,
>>> il will be merged with internal PHY node and get
>>> phy-is-integrated.
>>
>> Then have the .dtsi file contain just the mdio node, but no internal or
>> external PHY and push all the internal and external PHY node definition
>> (in its entirety) to the per-board DTS file, does not that work?
> 
> If possible, I'd really like to have the internal PHY in the
> DTSI. It's always there in hardware anyway, and duplicating the PHY,
> with its clock, reset line, and whatever info we might need in the
> future in each and every board DTS that uses it will be very error
> prone and we will have the usual bunch of issues that come up with
> duplication.

OK, then what if you put the internal PHY in the DTSI, mark it with a
status = "disabled" property, and have the per-board DTS put a status =
"okay" property along with a "phy-is-integrated" boolean property? Would
that work?

What I really don't think is necessary is:

- duplicating the "mdio" controller node for external vs. internal PHY,
because this is not accurate, there is just one MDIO controller, but
there may be different kinds of MDIO/PHY devices attached

- having the STMMAC driver MDIO probing code having to deal with a
"mdio" sub-node or an "internal-mdio" sub-node because this is confusing
and requiring more driver-level changes that are error prone

Thanks
-- 
Florian

^ permalink raw reply

* Re: [PATCH net-next 1/8] bpf: Recursively apply cgroup sock filters
From: David Ahern @ 2017-08-23 16:33 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: netdev, daniel, ast, tj, davem
In-Reply-To: <20170823014005.6riceht5iwv7zzxh@ast-mbp>

On 8/22/17 6:40 PM, Alexei Starovoitov wrote:
>> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
>> index df2e0f14a95d..7480cebab073 100644
>> --- a/kernel/cgroup/cgroup.c
>> +++ b/kernel/cgroup/cgroup.c
>> @@ -5186,4 +5186,22 @@ int cgroup_bpf_update(struct cgroup *cgrp, struct bpf_prog *prog,
>>  	mutex_unlock(&cgroup_mutex);
>>  	return ret;
>>  }
>> +
>> +int cgroup_bpf_run_filter_sk(struct sock *sk,
>> +			     enum bpf_attach_type type)
>> +{
>> +	struct cgroup *cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
>> +	int ret = 0;
>> +
>> +	while (cgrp) {
>> +		ret = __cgroup_bpf_run_filter_sk(cgrp, sk, type);
>> +		if (ret < 0)
>> +			break;
>> +
>> +		cgrp = cgroup_parent(cgrp);
>> +	}
> 
> I think this walk changes semantics for existing setups, so we cannot do it
> by default and have to add new attach flag.

I can add a flag similar to the override.

> Also why break on (ret < 0) ?

Because __cgroup_bpf_run_filter_sk returns either 0 or -EPERM.

> The caller of this does:
>   err = BPF_CGROUP_RUN_PROG_INET_SOCK(sk);
>   if (err) {
>           sk_common_release(sk);
> so we should probably break out of the loop on if (ret) too.
> 

I'll do that in v2.

^ permalink raw reply

* Re: [PATCH v3] net: stmmac: Delete dead code for MDIO registration
From: Florian Fainelli @ 2017-08-23 16:46 UTC (permalink / raw)
  To: Romain Perier, Giuseppe Cavallaro, Alexandre Torgue, Andrew Lunn
  Cc: netdev, linux-kernel
In-Reply-To: <20170823085058.23333-1-romain.perier@collabora.com>

On 08/23/2017 01:50 AM, Romain Perier wrote:
> This code is no longer used, the logging function was changed by commit
> fbca164776e4 ("net: stmmac: Use the right logging functi"). It was
> previously showing information about the type if the IRQ, if it's polled,
> ignored or a normal interrupt. As we don't want information loss, I have
> moved this code to phy_attached_print().
> 
> Fixes: fbca164776e4 ("net: stmmac: Use the right logging functi")

For future submissions, do not truncate the commit subject that you are
referencing.

> Signed-off-by: Romain Perier <romain.perier@collabora.com>

Since this is a fix you should indicate in the patch subject that this
is targeting David's "net" tree, see
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/netdev-FAQ.txt#n85


>  	const char *drv_name = phydev->drv ? phydev->drv->name : "unbound";
> +	char *irq_str;
> +	char irq_num[4];
> +
> +	switch(phydev->irq) {
> +	case PHY_POLL:
> +		irq_str = "POLL";
> +		break;
> +	case PHY_IGNORE_INTERRUPT:
> +		irq_str = "IGNORE";
> +		break;
> +	default:
> +		snprintf(irq_num, sizeof(irq_str), "%d", phydev->irq);

sizeof(irq_str) = 4 or 8 depending on the architecture because it's a
pointer, did not you mean sizeof(irq_num) here instead?

> +		irq_str = irq_num;
> +		break;
> +	}
>  
>  	if (!fmt) {
>  		dev_info(&phydev->mdio.dev, ATTACHED_FMT "\n",
>  			 drv_name, phydev_name(phydev),
> -			 phydev->irq);
> +			 irq_str);
>  	} else {
>  		va_list ap;
>  
>  		dev_info(&phydev->mdio.dev, ATTACHED_FMT,
>  			 drv_name, phydev_name(phydev),
> -			 phydev->irq);
> +			 irq_str);
>  
>  		va_start(ap, fmt);
>  		vprintk(fmt, ap);
> 


-- 
Florian

^ permalink raw reply

* Re: [PATCH net-next] bpf, doc: Add arm32 as arch supporting eBPF JIT
From: Alexei Starovoitov @ 2017-08-23 16:55 UTC (permalink / raw)
  To: Shubham Bansal, davem; +Cc: netdev, daniel
In-Reply-To: <1503503950-5146-1-git-send-email-illusionist.neo@gmail.com>

On 8/23/17 8:59 AM, Shubham Bansal wrote:
> As eBPF JIT support for arm32 was added recently with
> commit 39c13c204bb1150d401e27d41a9d8b332be47c49, it seems appropriate to
> add arm32 as arch with support for eBPF JIT in bpf and sysctl docs as well.
>
> Signed-off-by: Shubham Bansal <illusionist.neo@gmail.com>

Acked-by: Alexei Starovoitov <ast@kernel.org>

^ permalink raw reply

* Re: [PATCH net] net: xfrm: don't double-hold dst when sk_policy in use.
From: Wei Wang @ 2017-08-23 17:00 UTC (permalink / raw)
  To: Lorenzo Colitti
  Cc: Linux Kernel Network Developers, David S . Miller,
	steffen.klassert, misterikkit
In-Reply-To: <20170823081439.52796-1-lorenzo@google.com>

> While removing dst_entry garbage collection, commit 52df157f17e5
> ("xfrm: take refcnt of dst when creating struct xfrm_dst bundle")
> changed xfrm_resolve_and_create_bundle so it returns an xdst with
> a refcount of 1 instead of 0.
>
> However, it did not delete the dst_hold performed by xfrm_lookup
> when a per-socket policy is in use. This means that when a
> socket policy is in use, dst entries returned by xfrm_lookup have
> a refcount of 2, and are not freed when no longer in use.
>
> Cc: Wei Wang <weiwan@google.com>
> Fixes: 52df157f17 ("xfrm: take refcnt of dst when creating struct xfrm_dst bundle")
> Tested: https://android-review.googlesource.com/417481
> Tested: https://android-review.googlesource.com/418659
> Tested: https://android-review.googlesource.com/424463
> Tested: https://android-review.googlesource.com/452776 passes on net-next
> Signed-off-by: Lorenzo Colitti <lorenzo@google.com>

Thanks for the fix.

Acked-by: Wei Wang <weiwan@google.com>


On Wed, Aug 23, 2017 at 1:14 AM, Lorenzo Colitti <lorenzo@google.com> wrote:
> While removing dst_entry garbage collection, commit 52df157f17e5
> ("xfrm: take refcnt of dst when creating struct xfrm_dst bundle")
> changed xfrm_resolve_and_create_bundle so it returns an xdst with
> a refcount of 1 instead of 0.
>
> However, it did not delete the dst_hold performed by xfrm_lookup
> when a per-socket policy is in use. This means that when a
> socket policy is in use, dst entries returned by xfrm_lookup have
> a refcount of 2, and are not freed when no longer in use.
>
> Cc: Wei Wang <weiwan@google.com>
> Fixes: 52df157f17 ("xfrm: take refcnt of dst when creating struct xfrm_dst bundle")
> Tested: https://android-review.googlesource.com/417481
> Tested: https://android-review.googlesource.com/418659
> Tested: https://android-review.googlesource.com/424463
> Tested: https://android-review.googlesource.com/452776 passes on net-next
> Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
> ---
>  net/xfrm/xfrm_policy.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
> index 6f5a0dad50..69b16ee327 100644
> --- a/net/xfrm/xfrm_policy.c
> +++ b/net/xfrm/xfrm_policy.c
> @@ -2226,7 +2226,6 @@ struct dst_entry *xfrm_lookup(struct net *net, struct dst_entry *dst_orig,
>                                 goto no_transform;
>                         }
>
> -                       dst_hold(&xdst->u.dst);
>                         route = xdst->route;
>                 }
>         }
> --
> 2.14.1.480.gb18f417b89-goog
>

^ permalink raw reply

* Re: [PATCH net-next] bpf, doc: Add arm32 as arch supporting eBPF JIT
From: Daniel Borkmann @ 2017-08-23 17:03 UTC (permalink / raw)
  To: Shubham Bansal, davem; +Cc: netdev, ast
In-Reply-To: <1503503950-5146-1-git-send-email-illusionist.neo@gmail.com>

On 08/23/2017 05:59 PM, Shubham Bansal wrote:
> As eBPF JIT support for arm32 was added recently with
> commit 39c13c204bb1150d401e27d41a9d8b332be47c49, it seems appropriate to
> add arm32 as arch with support for eBPF JIT in bpf and sysctl docs as well.
>
> Signed-off-by: Shubham Bansal <illusionist.neo@gmail.com>

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

^ permalink raw reply

* Re: [PATCH net] sctp: Avoid out-of-bounds reads from address storage
From: Marcelo Ricardo Leitner @ 2017-08-23 17:12 UTC (permalink / raw)
  To: Stefano Brivio
  Cc: David S . Miller, netdev, linux-kernel, stable, Xin Long,
	Vlad Yasevich, Neil Horman, linux-sctp
In-Reply-To: <7763d91bcf14744e49f09fc4bec0fb22c097774f.1502384055.git.sbrivio@redhat.com>

On Wed, Aug 23, 2017 at 01:27:13PM +0200, Stefano Brivio wrote:
> inet_diag_msg_sctp{,l}addr_fill() and sctp_get_sctp_info() copy
> sizeof(sockaddr_storage) bytes to fill in sockaddr structs used
> to export diagnostic information to userspace.
> 
> However, the memory allocated to store sockaddr information is
> smaller than that and depends on the address family, so we leak
> up to 100 uninitialized bytes to userspace. Just use the size of
> the source structs instead, in all the three cases this is what
> userspace expects. Zero out the remaining memory.
> 
> Unused bytes (i.e. when IPv4 addresses are used) in source
> structs sctp_sockaddr_entry and sctp_transport are already
> cleared by sctp_add_bind_addr() and sctp_transport_new(),
> respectively.
> 
> Noticed while testing KASAN-enabled kernel with 'ss':
> 
> [ 2326.885243] BUG: KASAN: slab-out-of-bounds in inet_sctp_diag_fill+0x42c/0x6c0 [sctp_diag] at addr ffff881be8779800
> [ 2326.896800] Read of size 128 by task ss/9527
> [ 2326.901564] CPU: 0 PID: 9527 Comm: ss Not tainted 4.11.0-22.el7a.x86_64 #1
> [ 2326.909236] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017
> [ 2326.917585] Call Trace:
> [ 2326.920312]  dump_stack+0x63/0x8d
> [ 2326.924014]  kasan_object_err+0x21/0x70
> [ 2326.928295]  kasan_report+0x288/0x540
> [ 2326.932380]  ? inet_sctp_diag_fill+0x42c/0x6c0 [sctp_diag]
> [ 2326.938500]  ? skb_put+0x8b/0xd0
> [ 2326.942098]  ? memset+0x31/0x40
> [ 2326.945599]  check_memory_region+0x13c/0x1a0
> [ 2326.950362]  memcpy+0x23/0x50
> [ 2326.953669]  inet_sctp_diag_fill+0x42c/0x6c0 [sctp_diag]
> [ 2326.959596]  ? inet_diag_msg_sctpasoc_fill+0x460/0x460 [sctp_diag]
> [ 2326.966495]  ? __lock_sock+0x102/0x150
> [ 2326.970671]  ? sock_def_wakeup+0x60/0x60
> [ 2326.975048]  ? remove_wait_queue+0xc0/0xc0
> [ 2326.979619]  sctp_diag_dump+0x44a/0x760 [sctp_diag]
> [ 2326.985063]  ? sctp_ep_dump+0x280/0x280 [sctp_diag]
> [ 2326.990504]  ? memset+0x31/0x40
> [ 2326.994007]  ? mutex_lock+0x12/0x40
> [ 2326.997900]  __inet_diag_dump+0x57/0xb0 [inet_diag]
> [ 2327.003340]  ? __sys_sendmsg+0x150/0x150
> [ 2327.007715]  inet_diag_dump+0x4d/0x80 [inet_diag]
> [ 2327.012979]  netlink_dump+0x1e6/0x490
> [ 2327.017064]  __netlink_dump_start+0x28e/0x2c0
> [ 2327.021924]  inet_diag_handler_cmd+0x189/0x1a0 [inet_diag]
> [ 2327.028045]  ? inet_diag_rcv_msg_compat+0x1b0/0x1b0 [inet_diag]
> [ 2327.034651]  ? inet_diag_dump_compat+0x190/0x190 [inet_diag]
> [ 2327.040965]  ? __netlink_lookup+0x1b9/0x260
> [ 2327.045631]  sock_diag_rcv_msg+0x18b/0x1e0
> [ 2327.050199]  netlink_rcv_skb+0x14b/0x180
> [ 2327.054574]  ? sock_diag_bind+0x60/0x60
> [ 2327.058850]  sock_diag_rcv+0x28/0x40
> [ 2327.062837]  netlink_unicast+0x2e7/0x3b0
> [ 2327.067212]  ? netlink_attachskb+0x330/0x330
> [ 2327.071975]  ? kasan_check_write+0x14/0x20
> [ 2327.076544]  netlink_sendmsg+0x5be/0x730
> [ 2327.080918]  ? netlink_unicast+0x3b0/0x3b0
> [ 2327.085486]  ? kasan_check_write+0x14/0x20
> [ 2327.090057]  ? selinux_socket_sendmsg+0x24/0x30
> [ 2327.095109]  ? netlink_unicast+0x3b0/0x3b0
> [ 2327.099678]  sock_sendmsg+0x74/0x80
> [ 2327.103567]  ___sys_sendmsg+0x520/0x530
> [ 2327.107844]  ? __get_locked_pte+0x178/0x200
> [ 2327.112510]  ? copy_msghdr_from_user+0x270/0x270
> [ 2327.117660]  ? vm_insert_page+0x360/0x360
> [ 2327.122133]  ? vm_insert_pfn_prot+0xb4/0x150
> [ 2327.126895]  ? vm_insert_pfn+0x32/0x40
> [ 2327.131077]  ? vvar_fault+0x71/0xd0
> [ 2327.134968]  ? special_mapping_fault+0x69/0x110
> [ 2327.140022]  ? __do_fault+0x42/0x120
> [ 2327.144008]  ? __handle_mm_fault+0x1062/0x17a0
> [ 2327.148965]  ? __fget_light+0xa7/0xc0
> [ 2327.153049]  __sys_sendmsg+0xcb/0x150
> [ 2327.157133]  ? __sys_sendmsg+0xcb/0x150
> [ 2327.161409]  ? SyS_shutdown+0x140/0x140
> [ 2327.165688]  ? exit_to_usermode_loop+0xd0/0xd0
> [ 2327.170646]  ? __do_page_fault+0x55d/0x620
> [ 2327.175216]  ? __sys_sendmsg+0x150/0x150
> [ 2327.179591]  SyS_sendmsg+0x12/0x20
> [ 2327.183384]  do_syscall_64+0xe3/0x230
> [ 2327.187471]  entry_SYSCALL64_slow_path+0x25/0x25
> [ 2327.192622] RIP: 0033:0x7f41d18fa3b0
> [ 2327.196608] RSP: 002b:00007ffc3b731218 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> [ 2327.205055] RAX: ffffffffffffffda RBX: 00007ffc3b731380 RCX: 00007f41d18fa3b0
> [ 2327.213017] RDX: 0000000000000000 RSI: 00007ffc3b731340 RDI: 0000000000000003
> [ 2327.220978] RBP: 0000000000000002 R08: 0000000000000004 R09: 0000000000000040
> [ 2327.228939] R10: 00007ffc3b730f30 R11: 0000000000000246 R12: 0000000000000003
> [ 2327.236901] R13: 00007ffc3b731340 R14: 00007ffc3b7313d0 R15: 0000000000000084
> [ 2327.244865] Object at ffff881be87797e0, in cache kmalloc-64 size: 64
> [ 2327.251953] Allocated:
> [ 2327.254581] PID = 9484
> [ 2327.257215]  save_stack_trace+0x1b/0x20
> [ 2327.261485]  save_stack+0x46/0xd0
> [ 2327.265179]  kasan_kmalloc+0xad/0xe0
> [ 2327.269165]  kmem_cache_alloc_trace+0xe6/0x1d0
> [ 2327.274138]  sctp_add_bind_addr+0x58/0x180 [sctp]
> [ 2327.279400]  sctp_do_bind+0x208/0x310 [sctp]
> [ 2327.284176]  sctp_bind+0x61/0xa0 [sctp]
> [ 2327.288455]  inet_bind+0x5f/0x3a0
> [ 2327.292151]  SYSC_bind+0x1a4/0x1e0
> [ 2327.295944]  SyS_bind+0xe/0x10
> [ 2327.299349]  do_syscall_64+0xe3/0x230
> [ 2327.303433]  return_from_SYSCALL_64+0x0/0x6a
> [ 2327.308194] Freed:
> [ 2327.310434] PID = 4131
> [ 2327.313065]  save_stack_trace+0x1b/0x20
> [ 2327.317344]  save_stack+0x46/0xd0
> [ 2327.321040]  kasan_slab_free+0x73/0xc0
> [ 2327.325220]  kfree+0x96/0x1a0
> [ 2327.328530]  dynamic_kobj_release+0x15/0x40
> [ 2327.333195]  kobject_release+0x99/0x1e0
> [ 2327.337472]  kobject_put+0x38/0x70
> [ 2327.341266]  free_notes_attrs+0x66/0x80
> [ 2327.345545]  mod_sysfs_teardown+0x1a5/0x270
> [ 2327.350211]  free_module+0x20/0x2a0
> [ 2327.354099]  SyS_delete_module+0x2cb/0x2f0
> [ 2327.358667]  do_syscall_64+0xe3/0x230
> [ 2327.362750]  return_from_SYSCALL_64+0x0/0x6a
> [ 2327.367510] Memory state around the buggy address:
> [ 2327.372855]  ffff881be8779700: fc fc fc fc 00 00 00 00 00 00 00 00 fc fc fc fc
> [ 2327.380914]  ffff881be8779780: fb fb fb fb fb fb fb fb fc fc fc fc 00 00 00 00
> [ 2327.388972] >ffff881be8779800: 00 00 00 00 fc fc fc fc fb fb fb fb fb fb fb fb
> [ 2327.397031]                                ^
> [ 2327.401792]  ffff881be8779880: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
> [ 2327.409850]  ffff881be8779900: 00 00 00 00 00 04 fc fc fc fc fc fc 00 00 00 00
> [ 2327.417907] ==================================================================
> 
> This fixes CVE-2017-7558.
> 
> References: https://bugzilla.redhat.com/show_bug.cgi?id=1480266
> Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file")
> Cc: <stable@vger.kernel.org> # 4.7+
> Cc: Xin Long <lucien.xin@gmail.com>
> Cc: Vlad Yasevich <vyasevich@gmail.com>
> Cc: Neil Horman <nhorman@tuxdriver.com>
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>

Thanks Stefano.
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>

> ---
>  net/sctp/sctp_diag.c | 7 +++++--
>  net/sctp/socket.c    | 3 +--
>  2 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/net/sctp/sctp_diag.c b/net/sctp/sctp_diag.c
> index 9a647214a91e..e99518e79b52 100644
> --- a/net/sctp/sctp_diag.c
> +++ b/net/sctp/sctp_diag.c
> @@ -70,7 +70,8 @@ static int inet_diag_msg_sctpladdrs_fill(struct sk_buff *skb,
>  
>  	info = nla_data(attr);
>  	list_for_each_entry_rcu(laddr, address_list, list) {
> -		memcpy(info, &laddr->a, addrlen);
> +		memcpy(info, &laddr->a, sizeof(laddr->a));
> +		memset(info + sizeof(laddr->a), 0, addrlen - sizeof(laddr->a));
>  		info += addrlen;
>  	}
>  
> @@ -93,7 +94,9 @@ static int inet_diag_msg_sctpaddrs_fill(struct sk_buff *skb,
>  	info = nla_data(attr);
>  	list_for_each_entry(from, &asoc->peer.transport_addr_list,
>  			    transports) {
> -		memcpy(info, &from->ipaddr, addrlen);
> +		memcpy(info, &from->ipaddr, sizeof(from->ipaddr));
> +		memset(info + sizeof(from->ipaddr), 0,
> +		       addrlen - sizeof(from->ipaddr));
>  		info += addrlen;
>  	}
>  
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 1db478e34520..8d760863bc41 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -4538,8 +4538,7 @@ int sctp_get_sctp_info(struct sock *sk, struct sctp_association *asoc,
>  	info->sctpi_ictrlchunks = asoc->stats.ictrlchunks;
>  
>  	prim = asoc->peer.primary_path;
> -	memcpy(&info->sctpi_p_address, &prim->ipaddr,
> -	       sizeof(struct sockaddr_storage));
> +	memcpy(&info->sctpi_p_address, &prim->ipaddr, sizeof(prim->ipaddr));
>  	info->sctpi_p_state = prim->state;
>  	info->sctpi_p_cwnd = prim->cwnd;
>  	info->sctpi_p_srtt = prim->srtt;
> -- 
> 2.9.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply

* Re: [PATCH 0/2] net: Fix crashes due to activity during suspend
From: Florian Fainelli @ 2017-08-23 17:13 UTC (permalink / raw)
  To: Geert Uytterhoeven, marc_gonzalez, slash.tmp, andrew
  Cc: Geert Uytterhoeven, David S . Miller, Steve Glendinning,
	Andrew Lunn, Lukas Wunner, Rafael J . Wysocki,
	netdev@vger.kernel.org, Linux PM list, Linux-Renesas,
	linux-kernel@vger.kernel.org
In-Reply-To: <CAMuHMdVMuW+5UQc3Tg5bWkix_PLEaxsoGA6sgSdVz0fV6UsO6Q@mail.gmail.com>

On 08/23/2017 04:45 AM, Geert Uytterhoeven wrote:
> Hi Florian,
> 
> On Tue, Aug 22, 2017 at 8:49 PM, Florian Fainelli <f.fainelli@gmail.com> wrote:
>> On 08/22/2017 11:37 AM, Geert Uytterhoeven wrote:
>>> If an Ethernet device is used while the device is suspended, the system may
>>> crash.
>>>
>>> E.g. on sh73a0/kzm9g and r8a73a4/ape6evm, the external Ethernet chip is
>>> driven by a PM controlled clock.  If the Ethernet registers are accessed
>>> while the clock is not running, the system will crash with an imprecise
>>> external abort.
>>>
>>> This patch series fixes two of such crashes:
>>>   1. The first patch prevents the PHY polling state machine from accessing
>>>      PHY registers while a device is suspended,
>>>   2. The second patch prevents the net core from trying to transmit packets
>>>      when an smsc911x device is suspended.
>>>
>>> Both crashes can be reproduced on sh73a0/kzm9g and r8a73a4/ape6evm during
>>> s2ram (rarely), or by using pm_test (more likely to trigger):
>>>
>>>     # echo 0 > /sys/module/printk/parameters/console_suspend
>>>     # echo platform > /sys/power/pm_test
>>>     # echo mem > /sys/power/state
>>>
>>> With this series applied, my test systems survive a loop of 100 test
>>> suspends.
>>
>> It seems to me like part, if not the entire problem is that smsc91xx's
>> suspend and resume functions are way too simplistic and absolutely do
>> not manage the PHY during suspend/resume, the PHY state machine is not
>> even stopped, so of course, this will cause bus errors if you access
>> those registers.
>>
>> You are addressing this as part of patch 2, but this seems to me like
>> this is still a bit incomplete and you'd need at least phy_stop() and/or
>> phy_suspend() (does a power down of the PHY) and phy_start() and/or
>> phy_resume() calls to complete the PHY state machine shutdown during
>> suspend.
>>
>> Have you tried that?
> 
> Unfortunately that doesn't help.
> In state PHY_HALTED, the PHY state machine still calls the .adjust_link()
> callback while the device is suspended.

Humm that is correct yes.

> 
> Do you have a clue? This is too far beyond my phy-foo...

I was initially contemplating a revert of
7ad813f208533cebfcc32d3d7474dc1677d1b09a ("net: phy: Correctly process
PHY_HALTED in phy_stop_machine()") but this is not the root of the
problem. The problem really is that phy_stop() does not wait for the PHY
state machine to be stopped so you cannot rely on that and past the
function return be offered any guarantees that adjust_link is not called.

We seem to be getting away with that in most drivers because when we see
phydev->link = 0, we either do nothing or actually turn of the HW block.

How about we export phy_stop_machine() to drivers which would provide a
synchronization point that would ensure that no HW accesses are done
past this point?

I am absolutely not clear on the implications of using a freezable
workqueue with respect to the PHY state machine and how devices are
going to wind-up being powered down or not...

Thanks!
-- 
Florian

^ permalink raw reply

* [Patch net-next] net_sched: kill u32_node pointer in Qdisc
From: Cong Wang @ 2017-08-23 17:58 UTC (permalink / raw)
  To: netdev; +Cc: Cong Wang, Jamal Hadi Salim, Jiri Pirko

It is ugly to hide a u32-filter-specific pointer inside Qdisc,
this breaks the TC layers:

1. Qdisc is a generic representation, should not have any specific
   data of any type

2. Qdisc layer is above filter layer, should only save filters in
   the list of struct tcf_proto.

This pointer is used as the head of the chain of u32 hash tables,
that is struct tc_u_hnode, because u32 filter is very special,
it allows to create multiple hash tables within one qdisc and
across multiple u32 filters.

Instead of using this ugly pointer, we can just save it in a global
hash table key'ed by (dev ifindex, qdisc handle), therefore we can
still treat it as a per qdisc basis data structure conceptually.

Of course, because of network namespaces, this key is not unique
at all, but it is fine as we already have a pointer to Qdisc in
struct tc_u_common, we can just compare the pointers when collision.

And this only affects slow paths, has no impact to fast path,
thanks to the pointer ->tp_c.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 include/net/sch_generic.h |  1 -
 net/sched/cls_u32.c       | 55 +++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index d817585aa5df..c30b634c5f82 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -75,7 +75,6 @@ struct Qdisc {
 	struct hlist_node       hash;
 	u32			handle;
 	u32			parent;
-	void			*u32_node;
 
 	struct netdev_queue	*dev_queue;
 
diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index af22742d2847..816d8622df02 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -40,6 +40,8 @@
 #include <linux/rtnetlink.h>
 #include <linux/skbuff.h>
 #include <linux/bitmap.h>
+#include <linux/netdevice.h>
+#include <linux/hash.h>
 #include <net/netlink.h>
 #include <net/act_api.h>
 #include <net/pkt_cls.h>
@@ -92,6 +94,7 @@ struct tc_u_common {
 	struct Qdisc		*q;
 	int			refcnt;
 	u32			hgenerator;
+	struct hlist_node	hnode;
 	struct rcu_head		rcu;
 };
 
@@ -323,12 +326,40 @@ static u32 gen_new_htid(struct tc_u_common *tp_c)
 	return i > 0 ? (tp_c->hgenerator|0x800)<<20 : 0;
 }
 
+static struct hlist_head *tc_u_common_hash;
+
+#define U32_HASH_SHIFT 10
+#define U32_HASH_SIZE (1 << U32_HASH_SHIFT)
+
+static unsigned int tc_u_hash(const struct tcf_proto *tp)
+{
+	struct net_device *dev = tp->q->dev_queue->dev;
+	u32 qhandle = tp->q->handle;
+	int ifindex = dev->ifindex;
+
+	return hash_64((u64)ifindex << 32 | qhandle, U32_HASH_SHIFT);
+}
+
+static struct tc_u_common *tc_u_common_find(const struct tcf_proto *tp)
+{
+	struct tc_u_common *tc;
+	unsigned int h;
+
+	h = tc_u_hash(tp);
+	hlist_for_each_entry(tc, &tc_u_common_hash[h], hnode) {
+		if (tc->q == tp->q)
+			return tc;
+	}
+	return NULL;
+}
+
 static int u32_init(struct tcf_proto *tp)
 {
 	struct tc_u_hnode *root_ht;
 	struct tc_u_common *tp_c;
+	unsigned int h;
 
-	tp_c = tp->q->u32_node;
+	tp_c = tc_u_common_find(tp);
 
 	root_ht = kzalloc(sizeof(*root_ht), GFP_KERNEL);
 	if (root_ht == NULL)
@@ -345,7 +376,10 @@ static int u32_init(struct tcf_proto *tp)
 			return -ENOBUFS;
 		}
 		tp_c->q = tp->q;
-		tp->q->u32_node = tp_c;
+		INIT_HLIST_NODE(&tp_c->hnode);
+
+		h = tc_u_hash(tp);
+		hlist_add_head(&tp_c->hnode, &tc_u_common_hash[h]);
 	}
 
 	tp_c->refcnt++;
@@ -585,7 +619,7 @@ static void u32_destroy(struct tcf_proto *tp)
 	if (--tp_c->refcnt == 0) {
 		struct tc_u_hnode *ht;
 
-		tp->q->u32_node = NULL;
+		hlist_del(&tp_c->hnode);
 
 		for (ht = rtnl_dereference(tp_c->hlist);
 		     ht;
@@ -1213,6 +1247,8 @@ static struct tcf_proto_ops cls_u32_ops __read_mostly = {
 
 static int __init init_u32(void)
 {
+	int i, ret;
+
 	pr_info("u32 classifier\n");
 #ifdef CONFIG_CLS_U32_PERF
 	pr_info("    Performance counters on\n");
@@ -1223,12 +1259,23 @@ static int __init init_u32(void)
 #ifdef CONFIG_NET_CLS_ACT
 	pr_info("    Actions configured\n");
 #endif
-	return register_tcf_proto_ops(&cls_u32_ops);
+	tc_u_common_hash = kvmalloc_array(U32_HASH_SIZE, sizeof(struct hlist_head), GFP_KERNEL);
+	if (!tc_u_common_hash)
+		return -ENOMEM;
+
+	for (i = 0; i < U32_HASH_SIZE; i++)
+		INIT_HLIST_HEAD(&tc_u_common_hash[i]);
+
+	ret = register_tcf_proto_ops(&cls_u32_ops);
+	if (ret)
+		kvfree(tc_u_common_hash);
+	return ret;
 }
 
 static void __exit exit_u32(void)
 {
 	unregister_tcf_proto_ops(&cls_u32_ops);
+	kvfree(tc_u_common_hash);
 }
 
 module_init(init_u32)
-- 
2.13.0

^ permalink raw reply related

* Re: Stable apply request
From: David Miller @ 2017-08-23 18:07 UTC (permalink / raw)
  To: jslaby
  Cc: marcel, jeffy.chen, stable, linux-kernel, linux-bluetooth,
	rvaswani, briannorris, dmitry.torokhov, dianders, acho,
	johan.hedberg, netdev, gustavo
In-Reply-To: <373da46b-41f7-9aa2-946d-75633c9efe86@suse.cz>

From: Jiri Slaby <jslaby@suse.cz>
Date: Wed, 23 Aug 2017 09:29:44 +0200

> On 06/27/2017, 07:32 PM, Marcel Holtmann wrote:
>>> It looks like bnep_session has same pattern as the issue reported in
>>> old rfcomm:
>>>
>>> 	while (1) {
>>> 		set_current_state(TASK_INTERRUPTIBLE);
>>> 		if (condition)
>>> 			break;
>>> 		// may call might_sleep here
>>> 		schedule();
>>> 	}
>>> 	__set_current_state(TASK_RUNNING);
>>>
>>> Which fixed at:
>>> 	dfb2fae Bluetooth: Fix nested sleeps
>>>
>>> So let's fix it at the same way, also follow the suggestion of:
>>> https://lwn.net/Articles/628628/
> 
> ...
> 
>> all 3 patches have been applied to bluetooth-next tree.
> 
> Hi,
> 
> given users are hitting it in at least 4.4 and 4.12, can we have all
> three in all stables where this applies?
> 
> 5da8e47d849d Bluetooth: hidp: fix possible might sleep error in
> hidp_session_thread
> f06d977309d0 Bluetooth: cmtp: fix possible might sleep error in cmtp_session
> 25717382c1dd Bluetooth: bnep: fix possible might sleep error in bnep_session
> 
> I am not sure: to stable directly or via net stable?

I generally let the wireless family handle their own -stable submissions
and that includes bluetooth.

^ permalink raw reply

* Re: Stable apply request [was: Bluetooth: bnep: fix possible might sleep error in bnep_session]
From: Marcel Holtmann @ 2017-08-23 18:14 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Jeffy Chen, stable, LKML, open list:BLUETOOTH DRIVERS, rvaswani,
	Brian Norris, dmitry.torokhov, Douglas Anderson, acho,
	Johan Hedberg, Network Development, David S. Miller,
	Gustavo F. Padovan
In-Reply-To: <373da46b-41f7-9aa2-946d-75633c9efe86@suse.cz>

Hi Jiri,

>>> It looks like bnep_session has same pattern as the issue reported in
>>> old rfcomm:
>>> 
>>> 	while (1) {
>>> 		set_current_state(TASK_INTERRUPTIBLE);
>>> 		if (condition)
>>> 			break;
>>> 		// may call might_sleep here
>>> 		schedule();
>>> 	}
>>> 	__set_current_state(TASK_RUNNING);
>>> 
>>> Which fixed at:
>>> 	dfb2fae Bluetooth: Fix nested sleeps
>>> 
>>> So let's fix it at the same way, also follow the suggestion of:
>>> https://lwn.net/Articles/628628/
> 
> ...
> 
>> all 3 patches have been applied to bluetooth-next tree.
> 
> Hi,
> 
> given users are hitting it in at least 4.4 and 4.12, can we have all
> three in all stables where this applies?
> 
> 5da8e47d849d Bluetooth: hidp: fix possible might sleep error in
> hidp_session_thread
> f06d977309d0 Bluetooth: cmtp: fix possible might sleep error in cmtp_session
> 25717382c1dd Bluetooth: bnep: fix possible might sleep error in bnep_session
> 
> I am not sure: to stable directly or via net stable?

as Dave said, just email -stable directly and have Greg pick them up.

Regards

Marcel

^ permalink raw reply

* Re: [PATCH][net-next] net: hinic: make functions set_ctrl0 and set_ctrl1 static
From: David Miller @ 2017-08-23 18:42 UTC (permalink / raw)
  To: aviad.krawczyk; +Cc: colin.king, netdev, kernel-janitors, linux-kernel
In-Reply-To: <1C47E215DC9A8B4DBCB6C24F1DE66D8E0294323E@FRAEML521-MBX.china.huawei.com>

From: "Aviad Krawczyk (A)" <aviad.krawczyk@huawei.com>
Date: Wed, 23 Aug 2017 10:08:58 +0000

> Thanks

Please do not top-post.

Especially quoting the original message in the way that you did.

As a result of how you replied to this, this patch will show up
twice in my patchwork queue making more administrative work
for me.

Please learn how to properly reply in ASCII text using proper
traditional email quoting mechanisms just like other developers
do on this mailing list.

Every time you do this improperly, you make more work for everyone
on this mailing list.

Thank you.

^ permalink raw reply

* Re: [PATCH][netdev-next] gre: remove duplicated assignment of iph
From: David Miller @ 2017-08-23 18:46 UTC (permalink / raw)
  To: colin.king; +Cc: kuznet, yoshfuji, netdev, kernel-janitors, linux-kernel
In-Reply-To: <20170823111305.19185-1-colin.king@canonical.com>

From: Colin King <colin.king@canonical.com>
Date: Wed, 23 Aug 2017 12:13:05 +0100

> From: Colin Ian King <colin.king@canonical.com>
> 
> iph is being assigned the same value twice; remove the redundant
> second assignment.
> 
> Fixes warning:
> net/ipv4/ip_gre.c:265:2: warning: Value stored to 'iph' is never read
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

It is always necessary to reload any SKB header pointer after calls
to functions which potentially change the data buffer of the SKB.

pskb_may_pull() is one such call.

Therefore your patch is adding a bug.

If you want to remove the dup, you must remove the first not
the second one.

^ permalink raw reply

* Re: [PATCH net-next 3/3] samples/bpf: extend test_tunnel_bpf.sh with ERSPAN
From: Alexei Starovoitov @ 2017-08-23 19:15 UTC (permalink / raw)
  To: William Tu; +Cc: netdev
In-Reply-To: <1503500157-3717-3-git-send-email-u9012063@gmail.com>

On Wed, Aug 23, 2017 at 07:55:57AM -0700, William Tu wrote:
> Extend existing tests for vxlan, gre, geneve, ipip to
> include ERSPAN tunnel.
> 
> Signed-off-by: William Tu <u9012063@gmail.com>

the test looks fine, hence
Acked-by: Alexei Starovoitov <ast@kernel.org>
cannot properly review patches 1,2, since I'm not familiar with erspan.

^ permalink raw reply

* UDP sockets oddities
From: Florian Fainelli @ 2017-08-23 20:02 UTC (permalink / raw)
  To: netdev, edumazet, pabeni, willemb; +Cc: davem

Hi,

On Broadcom STB chips using bcmsysport.c and bcm_sf2.c we have an out of
band HW mechanism (not using per-flow pause frames) where we can have
the integrated network switch backpressure the CPU Ethernet controller
which translates in completing TX packets interrupts at the appropriate
pace and therefore get flow control applied end-to-end from the host CPU
port towards any downstream port. At least that is the premise and this
works reasonably well.

This has a few drawbacks in that each of the bcmsysport TX queues need
to semi-statically map to their switch port output queues such that the
switch can calculate buffer occupancy and report congestion status,
which prompted this email [1] but this is tangential and is a policy not
a mechanism issue.

[1]: https://www.spinics.net/lists/netdev/msg448153.html

This is useful when your CPU / integrated switch links up at 1Gbits/sec
internally, and tries to push 1Gbits/sec worth of UDP traffic to e.g: a
downstream port linking at 100Mbits/sec, which could happen depending on
what you have connected to this device.

Now the problem that I am facing, is the following:

- net.core.wmem_default = 160KB (default value)
- using iperf -b 800M -u towards an iperf UDP server with the physical
link to that server established at 100Mbits/sec
- iperf does synchronous write(2) AFAICT so this gives it flow control
- using the default duration of 10s, you can barely see any packet loss
from one run to another
- the longer the run, the higher you are going to see some packet loss,
usually in the range of ~0.15% top

The transmit flow looks like this:

gphy (net/dsa/slave.c::dsa_slave_xmit, IFF_NO_QUEUE device)
 -> eth0 (drivers/net/ethernet/broadcom/bcmsysport.c, "regular" network
device)

I can clearly see that the network stack pushed N UDP packets (Udp and
Ip counters in /proc/net/snmp concur) however what the driver
transmitted and what the switch transmistted is N - M, and matches the
packet loss reported by the UDP server. I don't measure any SndbufErrors
which is not making sense yet.

If I reduce the default socket size to say, 10x less than 160KB, 16KB,
then I either don't see any packet loss at 100Mbits/sec for 5 minutes or
more, or just very very little, down to 0.001%. Now if I repeat the
experiment with the physical link at 10Mbits/sec, same thing, the 16KB
wmem_default setting is no longer working and we need to lower the
socket write buffer size again.

So what I am wondering is:

- do I have an obvious flow control problem in my network driver that
usually does not lead to packet loss, but may sometime happen?

- why would lowering the socket write size appear to masquerade or solve
this problem?

I can consistently reproduce this across several kernel versions, 4.1,
4.9 and latest net-next and therefore can also test patches.

Thanks for reading thus far!
-- 
Florian

^ permalink raw reply

* Re: [Patch net-next] net_sched: kill u32_node pointer in Qdisc
From: Jiri Pirko @ 2017-08-23 20:25 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, Jamal Hadi Salim
In-Reply-To: <20170823175854.21924-1-xiyou.wangcong@gmail.com>

Wed, Aug 23, 2017 at 07:58:54PM CEST, xiyou.wangcong@gmail.com wrote:
>It is ugly to hide a u32-filter-specific pointer inside Qdisc,
>this breaks the TC layers:
>
>1. Qdisc is a generic representation, should not have any specific
>   data of any type
>
>2. Qdisc layer is above filter layer, should only save filters in
>   the list of struct tcf_proto.
>
>This pointer is used as the head of the chain of u32 hash tables,
>that is struct tc_u_hnode, because u32 filter is very special,
>it allows to create multiple hash tables within one qdisc and
>across multiple u32 filters.
>
>Instead of using this ugly pointer, we can just save it in a global
>hash table key'ed by (dev ifindex, qdisc handle), therefore we can
>still treat it as a per qdisc basis data structure conceptually.
>
>Of course, because of network namespaces, this key is not unique
>at all, but it is fine as we already have a pointer to Qdisc in
>struct tc_u_common, we can just compare the pointers when collision.
>
>And this only affects slow paths, has no impact to fast path,
>thanks to the pointer ->tp_c.

Thanks for taking care of this. Comments below.



>
>Cc: Jamal Hadi Salim <jhs@mojatatu.com>
>Cc: Jiri Pirko <jiri@resnulli.us>
>Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>---
> include/net/sch_generic.h |  1 -
> net/sched/cls_u32.c       | 55 +++++++++++++++++++++++++++++++++++++++++++----
> 2 files changed, 51 insertions(+), 5 deletions(-)
>
>diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
>index d817585aa5df..c30b634c5f82 100644
>--- a/include/net/sch_generic.h
>+++ b/include/net/sch_generic.h
>@@ -75,7 +75,6 @@ struct Qdisc {
> 	struct hlist_node       hash;
> 	u32			handle;
> 	u32			parent;
>-	void			*u32_node;
> 
> 	struct netdev_queue	*dev_queue;
> 
>diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
>index af22742d2847..816d8622df02 100644
>--- a/net/sched/cls_u32.c
>+++ b/net/sched/cls_u32.c
>@@ -40,6 +40,8 @@
> #include <linux/rtnetlink.h>
> #include <linux/skbuff.h>
> #include <linux/bitmap.h>
>+#include <linux/netdevice.h>
>+#include <linux/hash.h>
> #include <net/netlink.h>
> #include <net/act_api.h>
> #include <net/pkt_cls.h>
>@@ -92,6 +94,7 @@ struct tc_u_common {
> 	struct Qdisc		*q;
> 	int			refcnt;
> 	u32			hgenerator;
>+	struct hlist_node	hnode;
> 	struct rcu_head		rcu;
> };
> 
>@@ -323,12 +326,40 @@ static u32 gen_new_htid(struct tc_u_common *tp_c)
> 	return i > 0 ? (tp_c->hgenerator|0x800)<<20 : 0;
> }
> 
>+static struct hlist_head *tc_u_common_hash;

Why not use rhashtable?


>+
>+#define U32_HASH_SHIFT 10
>+#define U32_HASH_SIZE (1 << U32_HASH_SHIFT)
>+
>+static unsigned int tc_u_hash(const struct tcf_proto *tp)
>+{
>+	struct net_device *dev = tp->q->dev_queue->dev;
>+	u32 qhandle = tp->q->handle;
>+	int ifindex = dev->ifindex;
>+
>+	return hash_64((u64)ifindex << 32 | qhandle, U32_HASH_SHIFT);
>+}
>+
>+static struct tc_u_common *tc_u_common_find(const struct tcf_proto *tp)
>+{
>+	struct tc_u_common *tc;
>+	unsigned int h;
>+
>+	h = tc_u_hash(tp);
>+	hlist_for_each_entry(tc, &tc_u_common_hash[h], hnode) {
>+		if (tc->q == tp->q)
>+			return tc;
>+	}
>+	return NULL;
>+}
>+
> static int u32_init(struct tcf_proto *tp)
> {
> 	struct tc_u_hnode *root_ht;
> 	struct tc_u_common *tp_c;
>+	unsigned int h;
> 
>-	tp_c = tp->q->u32_node;
>+	tp_c = tc_u_common_find(tp);
> 
> 	root_ht = kzalloc(sizeof(*root_ht), GFP_KERNEL);
> 	if (root_ht == NULL)
>@@ -345,7 +376,10 @@ static int u32_init(struct tcf_proto *tp)
> 			return -ENOBUFS;
> 		}
> 		tp_c->q = tp->q;
>-		tp->q->u32_node = tp_c;
>+		INIT_HLIST_NODE(&tp_c->hnode);
>+
>+		h = tc_u_hash(tp);
>+		hlist_add_head(&tp_c->hnode, &tc_u_common_hash[h]);
> 	}
> 
> 	tp_c->refcnt++;
>@@ -585,7 +619,7 @@ static void u32_destroy(struct tcf_proto *tp)
> 	if (--tp_c->refcnt == 0) {
> 		struct tc_u_hnode *ht;
> 
>-		tp->q->u32_node = NULL;
>+		hlist_del(&tp_c->hnode);
> 
> 		for (ht = rtnl_dereference(tp_c->hlist);
> 		     ht;
>@@ -1213,6 +1247,8 @@ static struct tcf_proto_ops cls_u32_ops __read_mostly = {
> 
> static int __init init_u32(void)
> {
>+	int i, ret;
>+
> 	pr_info("u32 classifier\n");
> #ifdef CONFIG_CLS_U32_PERF
> 	pr_info("    Performance counters on\n");
>@@ -1223,12 +1259,23 @@ static int __init init_u32(void)
> #ifdef CONFIG_NET_CLS_ACT
> 	pr_info("    Actions configured\n");
> #endif
>-	return register_tcf_proto_ops(&cls_u32_ops);
>+	tc_u_common_hash = kvmalloc_array(U32_HASH_SIZE, sizeof(struct hlist_head), GFP_KERNEL);

This is over 80cols.


>+	if (!tc_u_common_hash)
>+		return -ENOMEM;
>+
>+	for (i = 0; i < U32_HASH_SIZE; i++)
>+		INIT_HLIST_HEAD(&tc_u_common_hash[i]);
>+
>+	ret = register_tcf_proto_ops(&cls_u32_ops);
>+	if (ret)
>+		kvfree(tc_u_common_hash);
>+	return ret;
> }
> 
> static void __exit exit_u32(void)
> {
> 	unregister_tcf_proto_ops(&cls_u32_ops);
>+	kvfree(tc_u_common_hash);
> }
> 
> module_init(init_u32)
>-- 
>2.13.0
>

^ permalink raw reply

* Re: [EXT] Re: [PATCH net-next 10/18] net: mvpp2: use the GoP interrupt for link status changes
From: Marcin Wojtas @ 2017-08-23 21:05 UTC (permalink / raw)
  To: Antoine Tenart
  Cc: Stefan Chulski, Andrew Lunn, davem@davemloft.net,
	jason@lakedaemon.net, gregory.clement@free-electrons.com,
	sebastian.hesselbarth@gmail.com,
	thomas.petazzoni@free-electrons.com, Nadav Haklai,
	linux@armlinux.org.uk, netdev@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
In-Reply-To: <20170823160444.GC23960@kwain>

Hi Antoine,

2017-08-23 18:04 GMT+02:00 Antoine Tenart <antoine.tenart@free-electrons.com>:
> On Wed, Aug 23, 2017 at 03:24:55PM +0000, Stefan Chulski wrote:
>> > When the cable is connected (there is signal) and the serdes is in sync and AN
>> > succeeded.
>> >
>> > > With SFF/SFP ports, you generally need a gpio line the fibre module
>> > > can use to indicate if it has link. Fixed-phy has such support, and
>> > > your link_change function will get called when the link changes.
>> >
>> > So that would work when using SFP modules but I wonder if the GoP irq isn't
>> > needed when using passive cable, in which case this patch would still be needed
>> > (and of course we should support the new Russell phylib capabilities).
>>
>> Even if new phylib driver supports passive direct cables connection,
>> GoP IRQ required for SOHO/Peridot external switch support.
>> SOHO/Peridot external switch could be connected directly to Serdes line.
>
> So I guess the GoP link irq patches are needed. Should I resend them
> then?
>
> We'll have to discuss how to handle fixed-phy vs GoP IRQ, but I guess we
> can do this when adding the fixed-phy support later.
>

Please check mvneta.c - you can find there coexistence of normal
libphy support, fixed phy and link irq for the inband management mode.
IMO it's exactly, what you need here.

Best regards,
Marcin

^ permalink raw reply

* Re: [Patch net-next] net_sched: kill u32_node pointer in Qdisc
From: Cong Wang @ 2017-08-23 21:14 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Linux Kernel Network Developers, Jamal Hadi Salim
In-Reply-To: <20170823202553.GC2015@nanopsycho>

On Wed, Aug 23, 2017 at 1:25 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>>+static struct hlist_head *tc_u_common_hash;
>
> Why not use rhashtable?
>

It doesn't have to be so complicated, it is not fast path and
we don't have so many qdisc's and u32 filters in system
relatively.


>>+      tc_u_common_hash = kvmalloc_array(U32_HASH_SIZE, sizeof(struct hlist_head), GFP_KERNEL);
>
> This is over 80cols.
>

Yeah, it doesn't harm anything, but I can fix it.

^ permalink raw reply

* Re: [Patch net-next] net_sched: kill u32_node pointer in Qdisc
From: Jiri Pirko @ 2017-08-23 21:20 UTC (permalink / raw)
  To: Cong Wang; +Cc: Linux Kernel Network Developers, Jamal Hadi Salim
In-Reply-To: <CAM_iQpXx4BnkOQPBu7G12BAnGB_W18=dSJRjg206K6ek+8=o8g@mail.gmail.com>

Wed, Aug 23, 2017 at 11:14:15PM CEST, xiyou.wangcong@gmail.com wrote:
>On Wed, Aug 23, 2017 at 1:25 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>>>+static struct hlist_head *tc_u_common_hash;
>>
>> Why not use rhashtable?
>>
>
>It doesn't have to be so complicated, it is not fast path and
>we don't have so many qdisc's and u32 filters in system
>relatively.

Well, it is easier to work with. So why not use it?


>
>
>>>+      tc_u_common_hash = kvmalloc_array(U32_HASH_SIZE, sizeof(struct hlist_head), GFP_KERNEL);
>>
>> This is over 80cols.
>>
>
>Yeah, it doesn't harm anything, but I can fix it.

At least checkpatch warns you about it.

^ permalink raw reply

* Re: [Patch net-next] net_sched: kill u32_node pointer in Qdisc
From: Cong Wang @ 2017-08-23 21:31 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Linux Kernel Network Developers, Jamal Hadi Salim
In-Reply-To: <20170823212056.GA22713@nanopsycho>

On Wed, Aug 23, 2017 at 2:20 PM, Jiri Pirko <jiri@resnulli.us> wrote:
> Wed, Aug 23, 2017 at 11:14:15PM CEST, xiyou.wangcong@gmail.com wrote:
>>On Wed, Aug 23, 2017 at 1:25 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>>>>+static struct hlist_head *tc_u_common_hash;
>>>
>>> Why not use rhashtable?
>>>
>>
>>It doesn't have to be so complicated, it is not fast path and
>>we don't have so many qdisc's and u32 filters in system
>>relatively.
>
> Well, it is easier to work with. So why not use it?
>

OMG... You must have a different definition for "easier".


>
>>
>>
>>>>+      tc_u_common_hash = kvmalloc_array(U32_HASH_SIZE, sizeof(struct hlist_head), GFP_KERNEL);
>>>
>>> This is over 80cols.
>>>
>>
>>Yeah, it doesn't harm anything, but I can fix it.
>
> At least checkpatch warns you about it.

Not every checkpatch.pl warning worth your time.

Jiri, spend your time on something more value than
80-cols. ;)

^ permalink raw reply

* Re: [Patch net-next] net_sched: kill u32_node pointer in Qdisc
From: Jiri Pirko @ 2017-08-23 21:38 UTC (permalink / raw)
  To: Cong Wang; +Cc: Linux Kernel Network Developers, Jamal Hadi Salim
In-Reply-To: <CAM_iQpXHg4jd4vtq0O3uAV6ONYA18M0MqJRs=P=FAmyCyf1eFA@mail.gmail.com>

Wed, Aug 23, 2017 at 11:31:23PM CEST, xiyou.wangcong@gmail.com wrote:
>On Wed, Aug 23, 2017 at 2:20 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>> Wed, Aug 23, 2017 at 11:14:15PM CEST, xiyou.wangcong@gmail.com wrote:
>>>On Wed, Aug 23, 2017 at 1:25 PM, Jiri Pirko <jiri@resnulli.us> wrote:
>>>>>+static struct hlist_head *tc_u_common_hash;
>>>>
>>>> Why not use rhashtable?
>>>>
>>>
>>>It doesn't have to be so complicated, it is not fast path and
>>>we don't have so many qdisc's and u32 filters in system
>>>relatively.
>>
>> Well, it is easier to work with. So why not use it?
>>
>
>OMG... You must have a different definition for "easier".

Apparently.


>
>
>>
>>>
>>>
>>>>>+      tc_u_common_hash = kvmalloc_array(U32_HASH_SIZE, sizeof(struct hlist_head), GFP_KERNEL);
>>>>
>>>> This is over 80cols.
>>>>
>>>
>>>Yeah, it doesn't harm anything, but I can fix it.
>>
>> At least checkpatch warns you about it.
>
>Not every checkpatch.pl warning worth your time.
>
>Jiri, spend your time on something more value than
>80-cols. ;)

I would not have to spend any time on it, if you would just follow the
usual workflow. Clearly, you have some problem with that. I cannot
say I understand it :/

^ permalink raw reply

* [PATCH net] nfp: TX time stamp packets before HW doorbell is rung
From: Jakub Kicinski @ 2017-08-23 21:41 UTC (permalink / raw)
  To: netdev; +Cc: oss-drivers, Jakub Kicinski

TX completion may happen any time after HW queue was kicked.
We can't access the skb afterwards.  Move the time stamping
before ringing the doorbell.

Fixes: 4c3523623dc0 ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
---
Sorry, another one just popped up.  It has been in the driver 
"for ever", it could go into net-next as well.

 drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 1ff0c577819e..66a09e490cf5 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -895,6 +895,8 @@ static int nfp_net_tx(struct sk_buff *skb, struct net_device *netdev)
 
 	netdev_tx_sent_queue(nd_q, txbuf->real_len);
 
+	skb_tx_timestamp(skb);
+
 	tx_ring->wr_p += nr_frags + 1;
 	if (nfp_net_tx_ring_should_stop(tx_ring))
 		nfp_net_tx_ring_stop(nd_q, tx_ring);
@@ -903,8 +905,6 @@ static int nfp_net_tx(struct sk_buff *skb, struct net_device *netdev)
 	if (!skb->xmit_more || netif_xmit_stopped(nd_q))
 		nfp_net_tx_xmit_more_flush(tx_ring);
 
-	skb_tx_timestamp(skb);
-
 	return NETDEV_TX_OK;
 
 err_unmap:
-- 
2.11.0

^ permalink raw reply related

* RE: [PATCH net v2 2/2] net: dsa: skb_put_padto() already frees nskb
From: Woojung.Huh @ 2017-08-23 21:42 UTC (permalink / raw)
  To: f.fainelli, netdev; +Cc: davem, vivien.didelot, UNGLinuxDriver
In-Reply-To: <20170822221215.16305-3-f.fainelli@gmail.com>

> The first call of skb_put_padto() will free up the SKB on error, but we
> return NULL which tells dsa_slave_xmit() that the original SKB should be
> freed so this would lead to a double free here.
> 
> The second skb_put_padto() already frees the passed sk_buff reference
> upon error, so calling kfree_skb() on it again is not necessary.
> 
> Detected by CoverityScan, CID#1416687 ("USE_AFTER_FREE")
> 
> Fixes: e71cb9e00922 ("net: dsa: ksz: fix skb freeing")
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>

Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>

- Woojung

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox