Netdev List

Netdev List
 help / color / mirror / Atom feed

* [net-next 0/9][pull request] 40GbE Intel Wired LAN Driver Updates 2018-03-14
From: Jeff Kirsher @ 2018-03-14 20:44 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, jogreene

This series contains updates to i40e and i40evf only.

Corentin Labbe cleans up the left over FCoE files in the i40e driver.

Gustavo A R Silva fixes a cut and paste error.

Paweł fixes a race condition when the VF driver is loaded on a host and
virsh is trying to attach it to the virtual machine and set a MAC
address.  Resolve the issue by adding polling in i40e_ndo_set_vf_mac()
when the VF is in reset mode.

Jake cleans up i40e_vlan_rx_register() since this only used in a single
location, so just inline the contents of the function.  Created a helper
function to proper update the per-filter statistics when we delete it.
Factored out the re-enabling ATR and SB rules.  Fixed an issue when
re-enabling ATR after the last TCPv4 filter is removed and ntuple is
still active, we were not restoring the TCPv4 filter input set.

Filip modifies the permission check function to ensure that it knows how
many filters are being requested, which allows the check to ensure that
the total number of filters in a single request does not cause us to go
over the limit.

Mariusz fixed an issue where the wrong calculation of partition id was
being done on OCP PHY mezzanine cards, which in turn caused wake on LAN
to be disabled on certain ports.

The following are changes since commit c292566a7779f88ed63d555b1015ab29ca0d5530:
  Merge branch 'sctp-add-support-for-some-sctp-auth-APIs-from-RFC6458'
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 40GbE

Corentin Labbe (1):
  i40e: remove i40e_fcoe files

Filip Sadowski (1):
  i40e: Fix permission check for VF MAC filters

Gustavo A R Silva (1):
  i40evf/i40evf_main: Fix variable assignment in i40evf_parse_cls_flower

Jacob Keller (4):
  i40e: Cleanup i40e_vlan_rx_register
  i40e: track filter type statistics when deleting invalid filters
  i40e: factor out re-enable functions for ATR and SB
  i40e: restore TCPv4 input set when re-enabling ATR

Mariusz Stachura (1):
  i40e: fix for wrong partition id calculation on OCP mezz cards

Paweł Jabłoński (1):
  i40e: Fix attach VF to VM issue

 drivers/net/ethernet/intel/i40e/i40e_common.c      |   23 +-
 drivers/net/ethernet/intel/i40e/i40e_fcoe.c        | 1571 --------------------
 drivers/net/ethernet/intel/i40e/i40e_fcoe.h        |  127 --
 drivers/net/ethernet/intel/i40e/i40e_main.c        |  134 +-
 drivers/net/ethernet/intel/i40e/i40e_type.h        |    3 +
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |   90 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c    |    2 +-
 7 files changed, 181 insertions(+), 1769 deletions(-)
 delete mode 100644 drivers/net/ethernet/intel/i40e/i40e_fcoe.c
 delete mode 100644 drivers/net/ethernet/intel/i40e/i40e_fcoe.h

-- 
2.14.3

^ permalink raw reply

* Re: [PATCH] brcmfmac: drop Inter-Access Point Protocol packets by default
From: Arend van Spriel @ 2018-03-14 20:44 UTC (permalink / raw)
  To: Rafał Miłecki
  Cc: Rafał Miłecki, Kalle Valo, Franky Lin, Hante Meuleman,
	Chi-Hsien Lin, Wright Feng, Pieter-Paul Giesberts, James Hughes,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	brcm80211-dev-list.pdl-dY08KVG/lbpWk0Htik3J/w,
	brcm80211-dev-list-+wT8y+m8/X5BDgjK7y7TUQ,
	netdev-u79uwXL29TY76Z2rM5mHXA, Linus Lüssing, Felix Fietkau,
	bridge-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
In-Reply-To: <ec7ef5fd9fb4193f92c0e0189cbe5e3e-g1n6cQUeyibVItvQsEIGlw@public.gmane.org>

On 3/14/2018 4:57 PM, Rafał Miłecki wrote:
> On 2018-03-14 16:39, Rafał Miłecki wrote:
>> On 2018-03-14 13:58, Arend van Spriel wrote:
>>> On 3/14/2018 12:01 PM, Rafał Miłecki wrote:
>>>> From: Rafał Miłecki <rafal-g1n6cQUeyibVItvQsEIGlw@public.gmane.org>
>>>>
>>>> Testing brcmfmac with more recent firmwares resulted in AP interfaces
>>>> not working in some specific setups. Debugging resulted in discovering
>>>> support for IAPP in Broadcom's firmwares. This is an obsoleted standard
>>>> and its implementation is something that:
>>>> 1) Most people don't need / want to use
>>>> 2) Can allow local DoS attacks
>>>> 3) Breaks AP interfaces in some specific bridge setups
>>>>
>>>> To solve issues it can cause this commit modifies brcmfmac to drop IAPP
>>>> packets. If affects:
>>>> 1) Rx path: driver won't be sending these unwanted packets up.
>>>> 2) Tx path: driver will reject packets that would trigger STA
>>>>     disassociation perfromed by a firmware (possible local DoS attack).
>>>>
>>>> It appears there are some Broadcom's clients/users who care about this
>>>> feature despite the drawbacks. They can switch it on by a newly added
>>>> Kconfig option.
>>>
>>> Thanks for taking this approach. Looks fine except for .... (see below)
>>>
>>> Reviewed-by: Arend van Spriel <arend.vanspriel-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
>>>> Signed-off-by: Rafał Miłecki <rafal-g1n6cQUeyibVItvQsEIGlw@public.gmane.org>
>>>> ---
>>>>   drivers/net/wireless/broadcom/brcm80211/Kconfig    | 20 +++++++++++
>>>>   .../wireless/broadcom/brcm80211/brcmfmac/core.c    | 39
>>>> ++++++++++++++++++++++
>>>>   2 files changed, 59 insertions(+)
>>>>
>>>> diff --git a/drivers/net/wireless/broadcom/brcm80211/Kconfig
>>>> b/drivers/net/wireless/broadcom/brcm80211/Kconfig
>>>> index 9d99eb42d917..876787ef991a 100644
>>>> --- a/drivers/net/wireless/broadcom/brcm80211/Kconfig
>>>> +++ b/drivers/net/wireless/broadcom/brcm80211/Kconfig
>>>> @@ -68,6 +68,26 @@ config BRCMFMAC_PCIE
>>>>         IEEE802.11ac embedded FullMAC WLAN driver. Say Y if you want to
>>>>         use the driver for an PCIE wireless card.
>>>>
>>>> +config BRCMFMAC_IAPP
>>>> +    bool "Partial support for obsoleted Inter-Access Point Protocol"
>>>> +    depends on BRCMFMAC
>>>> +    ---help---
>>>> +      Most of Broadcom's firmwares can send 802.11f ADD frame every
>>>> +      time new STA connects to the AP interface. Some recent ones
>>>> +      can also disassociate STA when they receive such a frame.
>>>
>>> I do not see any evidence that this would occur only for recent
>>> firmware. That stuff is old and not touched recently.
>>
>> My evidence is comparing firmwares for 4366b1: 10.10.69.3309 (r610991)
>> vs. 10.10 (TOB) (r663589).
>>
>> The first one is from linux-firmware.git and it doesn't implement IAPP
>> in the TX path. The later one is what I got from you privately and it
>> implements it.
>>
>> Also a firmware for 4366c0: 10.10.122.20 (r683106) which is relatively
>> new implements IAPP in the TX path.
>>
>>
>>>> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
>>>> b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
>>>> index 19048526b4af..db6987015fb1 100644
>>>> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
>>>> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
>>>
>>> [...]
>>>
>>>>   static netdev_tx_t brcmf_netdev_start_xmit(struct sk_buff *skb,
>>>>                          struct net_device *ndev)
>>>>   {
>>>> @@ -250,6 +278,12 @@ static netdev_tx_t
>>>> brcmf_netdev_start_xmit(struct sk_buff *skb,
>>>>           goto done;
>>>>       }
>>>>
>>>> +    if (!IS_ENABLED(CONFIG_BRCMFMAC_IAPP) && brcmf_skb_is_iapp(skb)) {
>>>> +        dev_kfree_skb(skb);
>>>> +        ret = -EINVAL;
>>>> +        goto done;
>>>> +    }
>>>
>>> This is not right. The function must return netdev_tx_t type. Here is
>>> kerneldoc of .start_xmit():
>>>
>>>  * netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb,
>>>  *                               struct net_device *dev);
>>>  *    Called when a packet needs to be transmitted.
>>>  *    Returns NETDEV_TX_OK.  Can return NETDEV_TX_BUSY, but you
>>> should stop
>>>  *    the queue before that can happen; it's for obsolete devices and
>>> weird
>>>  *    corner cases, but the stack really does a non-trivial amount
>>>  *    of useless work if you return NETDEV_TX_BUSY.
>>>  *    Required; cannot be NULL.
>>
>> Please take a closer look at how brcmf_netdev_start_xmit() works. Above
>>  code *will* return netdev_tx_t.
>>
>>
>>> You may want to increase dropped netstat or add driver internal
>>> statistic counter so there is visibility of IAPP packets being
>>> dropped.
>>
>> OK, I'll try to find a stat to increase.
>
> So after checking brcmf_netdev_start_xmit() again, I realized I actually
> *do* that. Doing:
> ret = -EINVAL;
> goto done;
> results in increasing tx_dropped.

Okay, okay. Admittedly I only looked at the patch. Feel free to remove 
the Reviewed-by.

Regards,
Arend

^ permalink raw reply

* Re: [PATCH] vhost: add vsock compat ioctl
From: Sonny Rao @ 2018-03-14 20:39 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, Stefan Hajnoczi, Jason Wang, virtualization, netdev,
	linux-kernel
In-Reply-To: <20180314210417-mutt-send-email-mst@kernel.org>

On Wed, Mar 14, 2018 at 12:05 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Wed, Mar 14, 2018 at 10:26:05AM -0700, Sonny Rao wrote:
>> This will allow usage of vsock from 32-bit binaries on a 64-bit
>> kernel.
>>
>> Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
>
> I think you need to convert the pointer argument though.
> Something along the lines of:
>
> #ifdef CONFIG_COMPAT
> static long vhost_vsock_dev_compat_ioctl(struct file *f, unsigned int ioctl,
>                                          unsigned long arg)
> {
>         return vhost_vsock_dev_ioctl(f, ioctl, (unsigned long)compat_ptr(arg));
> }
> #endif

Ok, thanks for pointing that out -- it has worked for me so far, but
I'll re-spin as you suggested.

>
>
>
>> ---
>>  drivers/vhost/vsock.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>> index 0d14e2ff19f16..d0e65e92110e5 100644
>> --- a/drivers/vhost/vsock.c
>> +++ b/drivers/vhost/vsock.c
>> @@ -705,6 +705,7 @@ static const struct file_operations vhost_vsock_fops = {
>>       .release        = vhost_vsock_dev_release,
>>       .llseek         = noop_llseek,
>>       .unlocked_ioctl = vhost_vsock_dev_ioctl,
>> +     .compat_ioctl   = vhost_vsock_dev_ioctl,
>>  };
>>
>>  static struct miscdevice vhost_vsock_misc = {
>> --
>> 2.13.5

^ permalink raw reply

* [PATCH net] skbuff: Fix not waking applications when errors are enqueued
From: Vinicius Costa Gomes @ 2018-03-14 20:32 UTC (permalink / raw)
  To: netdev; +Cc: Vinicius Costa Gomes, randy.e.witt, davem, eric.dumazet

When errors are enqueued to the error queue via sock_queue_err_skb()
function, it is possible that the waiting application is not notified.

Calling 'sk->sk_data_ready()' would not notify applications that
selected only POLLERR events in poll() (for example).

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Randy E. Witt <randy.e.witt@intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---
 net/core/skbuff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index baf990528943..1ea4be79c0a7 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4179,7 +4179,7 @@ int sock_queue_err_skb(struct sock *sk, struct sk_buff *skb)
 
 	skb_queue_tail(&sk->sk_error_queue, skb);
 	if (!sock_flag(sk, SOCK_DEAD))
-		sk->sk_data_ready(sk);
+		sk->sk_error_report(sk);
 	return 0;
 }
 EXPORT_SYMBOL(sock_queue_err_skb);
-- 
2.16.2

^ permalink raw reply related

* Re: [PATCH iproute2] Revert "iproute: "list/flush/save default" selected all of the routes"
From: Alexander Zubkov @ 2018-03-14 20:26 UTC (permalink / raw)
  To: Serhey Popovych, Luca Boccassi, Stephen Hemminger; +Cc: netdev@vger.kernel.org
In-Reply-To: <135881521017992@web3o.yandex.ru>

Hello,

For example, it can be fixed in such way (patch is below):
- split handling of default and all/any
- set needed attributes in get_addr: PREFIXLEN_SPECIFIED flag for default
- and AF_UNSPEC for all/any
In this case "ip route show default" shows only default route and "ip 
route show all" shows all routes. And both also work when family (-4 or 
-6) is specified.
Serhey, does it goes in line with what you wanted to achieve? Because I 
do not know - may be there are reasons why all/any should be provided 
with specific family. If you think this solution is suitable, I'll do 
some additional tests and package the patch in a proper way for this 
mailing list.
And I'm unsure if check for AF_DECnet and AF_MPLS should be kept in both 
branches. May be someone have some additional thoughts on that?

--- a/lib/utils.c
+++ b/lib/utils.c
@@ -560,14 +560,23 @@ static int __get_addr_1(inet_prefix *addr, const 
char *name, int family)
  {
  	memset(addr, 0, sizeof(*addr));

-	if (strcmp(name, "default") == 0 ||
-	    strcmp(name, "all") == 0 ||
-	    strcmp(name, "any") == 0) {
+	if (strcmp(name, "default") == 0) {
  		if ((family == AF_DECnet) || (family == AF_MPLS))
  			return -1;
  		addr->family = (family != AF_UNSPEC) ? family : AF_INET;
  		addr->bytelen = af_byte_len(addr->family);
  		addr->bitlen = -2;
+		addr->flags |= PREFIXLEN_SPECIFIED;
+		return 0;
+	}
+
+	if (strcmp(name, "all") == 0 ||
+	    strcmp(name, "any") == 0) {
+		if ((family == AF_DECnet) || (family == AF_MPLS))
+			return -1;
+		addr->family = AF_UNSPEC;
+		addr->bytelen = 0;
+		addr->bitlen = -2;
  		return 0;
  	}

@@ -695,7 +704,7 @@ int get_prefix_1(inet_prefix *dst, char *arg, int 
family)

  	bitlen = af_bit_len(dst->family);

-	flags = PREFIXLEN_SPECIFIED;
+	flags = 0;
  	if (slash) {
  		unsigned int plen;

@@ -706,12 +715,11 @@ int get_prefix_1(inet_prefix *dst, char *arg, int 
family)
  		if (plen > bitlen)
  			return -1;

+		flags |= PREFIXLEN_SPECIFIED;
  		bitlen = plen;
  	} else {
  		if (dst->bitlen == -2)
  			bitlen = 0;
-		else
-			flags = 0;
  	}

  	dst->flags |= flags;


On 14.03.2018 09:59, Alexander Zubkov wrote:
> Hello,
> 
> There was a series of patches by Serhey and specifically this one:
> https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=93fa12418dc6f5943692250244be303bb162175b
> 
> It drops handling of special prefix names in get_prefix_1(), and in get_addr_1() they always receive family and bytelen. But as I unerstand for any case it was important to keep it with unspecified family for further filtering. As I do not know what is the global idea, I want to discuss it. Because there are options depending on how and where we want to handle those special names. Like keep unspecified family or change filtering logic.
> 
> I have added Serhey Popovych in the recepients, so he can give some ideas on what his aim is and help choose better solution.
> 
> 13.03.2018, 21:12, "Alexander Zubkov" <green@msu.ru>:
>> Hi,
>>
>> I just realized that you need patch for v4.15.0, which is easier to do.
>> I'll send it as separate message now. I will make patch for the master
>> branch, but later.
>>
>> On 13.03.2018 13:02, Luca Boccassi wrote:
>>>   On Tue, 2018-03-13 at 12:05 +0100, Alexander Zubkov wrote:
>>>>   Hello again,
>>>>
>>>>   The fun thing is that before the commit "ip route ls all" showed all
>>>>   routes, but "ip -[4|6] route ls all" showed only default. So it was
>>>>   broken too, but in other way.
>>>>   I see parsing of prefix was changed since my patch. So I need several
>>>>   days to propose fix. I think if "ip route ls [all|any]" shows all
>>>>   routes and "ip route ls default" shows only default, everybody will
>>>>   be happy with that?
>>>
>>>   Hi,
>>>
>>>   My only concern is that behaviour of existing commands that have been
>>>   in releases is not changed, otherwise I get bugs raised :-)
>>>
>>>   Thank you for your work!
>>>
>>>>   13.03.2018, 09:46, "Alexander Zubkov" <green@msu.ru>:
>>>>>   Hello.
>>>>>
>>>>>   May be the better way would be to change how "all"/"any" argument
>>>>>   behaves? My original concern was about "default" only. I agree too,
>>>>>   that "all" or "any" should work for all routes. But not for the
>>>>>   default.
>>>>>
>>>>>   12.03.2018, 22:37, "Luca Boccassi" <bluca@debian.org>:
>>>>>>     On Mon, 2018-03-12 at 14:03 -0700, Stephen Hemminger wrote:
>>>>>>>      This reverts commit 9135c4d6037ff9f1818507bac0049fc44db8c3d2.
>>>>>>>
>>>>>>>      Debian maintainer found that basic command:
>>>>>>>              # ip route flush all
>>>>>>>      No longer worked as expected which breaks user scripts and
>>>>>>>      expectations. It no longer flushed all IPv4 routes.
>>>>>>>
>>>>>>>      Reported-by: Luca Boccassi <bluca@debian.org>
>>>>>>>      Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>>>>>>>      ---
>>>>>>>       ip/iproute.c | 65 ++++++++++++++++++----------------------
>>>>>>>   --------
>>>>>>>      ------------
>>>>>>>       lib/utils.c  | 13 ++++++++++++
>>>>>>>       2 files changed, 32 insertions(+), 46 deletions(-)
>>>>>>
>>>>>>     Tested-by: Luca Boccassi <bluca@debian.org>
>>>>>>
>>>>>>     Thanks, solves the problem. I'll backport it to Debian.
>>>>>>
>>>>>>     Alexander, reproducing the issue is quite simple - before that
>>>>>>   commit,
>>>>>>     ip route ls all showed all routes, but with the change it
>>>>>>   started
>>>>>>     showing only the default table. Same for ip route flush.
>>>>>>
>>>>>>     --
>>>>>>     Kind regards,
>>>>>>     Luca Boccassi

^ permalink raw reply

* Re: regression: ti: cpsw: warning from phy_connect()->sysfs_create_link()-sysfs_warn_dup()
From: Grygorii Strashko @ 2018-03-14 20:20 UTC (permalink / raw)
  To: Florian Fainelli, netdev, Andrew Lunn; +Cc: David Miller, Sekhar Nori
In-Reply-To: <c656f66c-b521-2afe-1338-e794657c049b@gmail.com>



On 03/14/2018 02:50 PM, Florian Fainelli wrote:
> On 03/14/2018 12:40 PM, Grygorii Strashko wrote:
>>
>>
>> On 02/26/2018 02:16 PM, Florian Fainelli wrote:
>>> On 02/26/2018 12:08 PM, Grygorii Strashko wrote:
>>>> Hi Florian,
>>>>
>>>> The TI CPSW driver produces warning as below when booted in switch mode:
>>>> [    8.882295] sysfs: cannot create duplicate filename '/devices/platform/44000000.ocp/48484000.ethernet/net/eth0/phydev'
>>>> [    8.999859] CPU: 1 PID: 356 Comm: systemd-network Not tainted 4.16.0-rc3-00010-g6cc3ff6-dirty #225
>>>> ...
>>>> [    9.012352] Hardware name: Generic DRA74X (Flattened Device Tree)
>>>> [    9.018901] Backtrace:
>>>> [    9.021376] [<c010c918>] (dump_backtrace) from [<c010cbe8>] (show_stack+0x18/0x1c)
>>>> [    9.028986]  r7:ed036240 r6:60070013 r5:00000000 r4:c0d598a0
>>>> [    9.034684] [<c010cbd0>] (show_stack) from [<c088e1f0>] (dump_stack+0x8c/0xa0)
>>>> [    9.041954] [<c088e164>] (dump_stack) from [<c02ba59c>] (sysfs_warn_dup+0x60/0x6c)
>>>> [    9.049564]  r7:ed036240 r6:ed036240 r5:c0b579bc r4:ed10c000
>>>> [    9.055264] [<c02ba53c>] (sysfs_warn_dup) from [<c02ba930>] (sysfs_do_create_link_sd+0xbc/0xc4)
>>>> [    9.064006]  r7:ed036240 r6:ffffffef r5:00000000 r4:ed034660
>>>> [    9.069701] [<c02ba874>] (sysfs_do_create_link_sd) from [<c02ba968>] (sysfs_create_link+0x30/0x3c)
>>>> [    9.078706]  r9:00000008 r8:00000000 r7:ed02f008 r6:ee015ae8 r5:ee015800 r4:ed02f000
>>>> [    9.086497] [<c02ba938>] (sysfs_create_link) from [<c060f070>] (phy_attach_direct+0x180/0x1f4)
>>>> [    9.095163] [<c060eef0>] (phy_attach_direct) from [<c060f1cc>] (phy_connect_direct+0x1c/0x54)
>>>> [    9.103735]  r10:00000001 r9:ee015d0c r8:00000008 r7:c062e84c r6:c062e84c r5:ed02f000
>>>> [    9.111609]  r4:ed02f000 r3:00000008
>>>> [    9.115217] [<c060f1b0>] (phy_connect_direct) from [<c060f250>] (phy_connect+0x4c/0x80)
>>>> [    9.123270]  r7:c062e84c r6:ee015800 r5:ed02f000 r4:ee693664
>>>> [    9.128969] [<c060f204>] (phy_connect) from [<c062abc8>] (cpsw_slave_open+0x21c/0x274)
>>>> [    9.136926]  r9:ee015d0c r8:ee015d00 r7:ed018a10 r6:ee015800 r5:ee015d00 r4:ed032630
>>>> [    9.144715] [<c062a9ac>] (cpsw_slave_open) from [<c062b28c>] (cpsw_ndo_open+0x158/0x55c)
>>>> [    9.152860]  r10:00000001 r9:00000000 r8:ee015d00 r7:c0d04c48 r6:ee015800 r5:ed018a10
>>>> [    9.160730]  r4:ed032630
>>>> [    9.163291] [<c062b134>] (cpsw_ndo_open) from [<c0765350>] (__dev_open+0xd4/0x158)
>>>> [    9.170900]  r10:00000000 r9:ec885db4 r8:ee01582c r7:c09a9c00 r6:00000000 r5:c0d04c48
>>>> [    9.178768]  r4:ee015800
>>>> [    9.181326] [<c076527c>] (__dev_open) from [<c0765748>] (__dev_change_flags+0x174/0x1c0)
>>>>
>>>> The reason of the warning is that in switch mode CPSW drivers is connecting two Net PHYs to
>>>> a single network device (it has been done this way when driver was initially introduced), so
>>>> now it produces warning during boot because of commits:
>>>>
>>>> 5568363f0cb3 ("net: phy: Create sysfs reciprocal links for attached_dev/phydev"
>>>> a3995460491d ("net: phy: Relax error checking on sysfs_create_link()"
>>>> ^ both went in v4.13
>>>>
>>>> Honestly, I'm not sure how to fix it the best way (the simplest fix is below), taking into account
>>>> that we are not ready to do big reworks in CPSW driver.
>>
>> I've got additional testing data and this actually a *regression*, because
>> second CPSW Port became broken after above commits due to Net PHY
>> connection failure.
>>
>>>
>>> The CPSW driver is duplicating a fair amount of what the DSA subsystem
>>> does properly without breaking any assumptions about the 1:1 mapping
>>> that can exist between a network device and PHY device. Having a PHY
>>> device without a network device is fine in premise, although discouraged.
>>>
>>> Migrating to DSA is certainly a big task, but I would strongly encourage
>>> you to consider doing it, since that would solve this problem, and
>>> probably many more.
>>
>> We are actively investigating possibility to use DSA for this driver,
>> but unfortunately this is looks very problematic, because this is old driver with
>> stable ABI and stable device tree binding which are also ABI.
>> So, this driver can't be simply migrating to use DSA and as possible solution
>> new driver can be introduced in long term perspective which will
>> follow DSA binding requirements and reuse as max as possible code
>> of the current CPSW driver. Even in this case, old driver
>> will need to be supported during some transition period and *be functional*.
> 
> That is fair enough, maybe we could explore breaking things more nicely
> within DSA such that the Device Tree parsing is largely independent from
> the internal representation, and as a result,you could reuse the
> existing binding you have, but leverage the DSA infrastructure where it
> makes sense, food for thought.
> 
>>
>> For now I'd be very appreciated if functionality of current TI CPSW driver will be
>> restored and propose to consider below fix, which will make it work again:
>>
>> =============
>>  From 5f67320d985a533da785e3643e5e63ba7395b4ae Mon Sep 17 00:00:00 2001
>> From: Grygorii Strashko <grygorii.strashko@ti.com>
>> Date: Wed, 14 Mar 2018 14:01:12 -0500
>> Subject: [PATCH] net: phy: skip error checking when creating sysfs link
>>   netdev->phydev
>>
>> Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per
>> one netdevice, as result such drivers will produce warning during system
>> boot and fail to connect second phy to netdevice when PHY device
>> framework will try to create sysfs link netdev->phydev for second PHY,
>> because sysfs link with the same name has been created already for the first PHY.
>> As result, second CPSW external port will became unusable.
>>
>> Fix it by skipping error checking when PHY device
>> framework creating sysfs link netdev->phydev. The sysfs_create_link()
>> will still produce warning, but we can live with it as
>> system functionality will not be broken.
>>
>> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
>> ---
>>   drivers/net/phy/phy_device.c | 7 ++++---
>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>> index 67f25ac..e2c34c3 100644
>> --- a/drivers/net/phy/phy_device.c
>> +++ b/drivers/net/phy/phy_device.c
>> @@ -1000,9 +1000,10 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
>>   				"attached_dev");
>>   	if (!err) {
>>   		err = sysfs_create_link(&dev->dev.kobj, &phydev->mdio.dev.kobj,
>> -					"phydev");
>> -		if (err)
>> -			goto error;
>> +				  "phydev");
>> +		/* don't check return value here as some net drivers can
>> +		 * use one netdevice with more then one phy
>> +		 */
>>   
>>   		phydev->sysfs_links = true;
> 
> Can you make sure that the single boolean tracking the state of both
> sysfs links is not going to cause any problem in your case?

I've tested it with ifconfig up/down as CPSW driver does
PHY connect in ndo_open() and PHY disconnect in ndo_close()
and saw no issues (sysfs_remove_link()->kernfs_remove_by_name()
handles -ENOENT internally with no warning or other issues for second phy).


 Try
> unbinding the PHY driver from your PHY device for instance to check
> that. If that still works and does not produce a warning then:

It will still produce warning, a can also use sysfs_create_link_nowarn()
and resend if you agree.

> 
> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> 
> and maybe add a Fixes: tag when you submit this officially?
> 

sure.

-- 
regards,
-grygorii

^ permalink raw reply

* Re: [RESEND] rsi: Remove stack VLA usage
From: Tobin C. Harding @ 2018-03-14 20:19 UTC (permalink / raw)
  To: Kalle Valo
  Cc: Kees Cook, Andy Shevchenko, Kernel Hardening,
	Linux Kernel Mailing List, netdev, open list:TI WILINK WIRELES...,
	Tycho Andersen, Konstantin Ryabitsev
In-Reply-To: <87fu536mkm.fsf@codeaurora.org>

On Wed, Mar 14, 2018 at 11:19:53AM +0200, Kalle Valo wrote:
> "Tobin C. Harding" <me@tobin.cc> writes:
> 
> > Added Konstantin in case he is in charge of administering patchwork.kernel.org?
> >
> > On Tue, Mar 13, 2018 at 07:53:34PM -0700, Kees Cook wrote:
> >> On Tue, Mar 13, 2018 at 7:11 PM, Tobin C. Harding <me@tobin.cc> wrote:
> >> > On Tue, Mar 13, 2018 at 11:00:47PM +0200, Andy Shevchenko wrote:
> >> >> On Tue, Mar 13, 2018 at 10:17 PM, tcharding <me@tobin.cc> wrote:
> >> >> > On Mon, Mar 12, 2018 at 09:46:06AM +0000, Kalle Valo wrote:
> >> >> >> tcharding <me@tobin.cc> wrote:
> >> >>
> >> >> I'm pretty much sure it depends on the original email headers, like
> >> >> above ^^^ — no name.
> >> >> Perhaps git config on your side should be done.
> >> >
> >> > Thanks for the suggestion Andy but the 'tcharding' as the name was
> >> > munged by either Kalle or patchwork.  I'm guessing patchwork.
> >> 
> >> Something you're sending from is using "tcharding" (see the email Andy
> >> quotes). I see the headers as:
> >> 
> >> Date: Wed, 14 Mar 2018 07:17:57 +1100
> >> From: tcharding <me@tobin.cc>
> >> ...
> >> Message-ID: <20180313201757.GK8631@eros>
> >> X-Mailer: Mutt 1.5.24 (2015-08-30)
> >> User-Agent: Mutt/1.5.24 (2015-08-30)
> >> 
> >> Your most recently email shows "Tobin C. Harding" though, and also
> >> sent with Mutt...
> >>
> >> Do you have multiple Mutt configurations? Is something lacking a
> >> "From" header insertion and your MTA is filling it in for you from
> >> your username?
> >
> > Thanks for taking the time to respond Kees (and Tycho).  I have mutt
> > configured to reply from whichever email address I receive from so if
> > patchwork sent an email to 'tcharding <me@tobin.cc>' (which is the
> > details it has) and I hit reply it would have come from 'tcharding',
> > hence Andy's reply.  I wouldn't bet my life on it but I'm kinda
> > confident that I cannot initiate an email from 'tcharding' with my
> > current set up.
> >
> > Super bad form to blame someone (or something else) but I think this is
> > a problem with how my patchwork account is configured.  Either way, that
> > is still my fault I should have added my real name to patchwork when I
> > signed up (not just username 'tcharding').
> >
> > Is patchwork.kernel.org administered by Konstantin Ryabitsev? Added
> > Konstantin to CC's.
> 
> Like I said earlier, just send a request to helpdesk@kernel.org and
> admins should fix your name.

thanks Kalle, I'm on it.

^ permalink raw reply

* [PATCH RFC 7/7] net: phy: remove phy_stop_machine
From: Heiner Kallweit @ 2018-03-14 20:16 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: Geert Uytterhoeven, netdev@vger.kernel.org
In-Reply-To: <421476ed-03c7-63a0-44a4-c6d9c7702647@gmail.com>

Now that the functionality of phy_stop() was integrated to __phy_stop()
we can remove phy_stop_machine().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/phy.c        | 18 ------------------
 drivers/net/phy/phy_device.c |  2 --
 include/linux/phy.h          |  1 -
 3 files changed, 21 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 54144cd10..4c80e5ecb 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -565,24 +565,6 @@ void phy_trigger_machine(struct phy_device *phydev, bool sync)
 	queue_delayed_work(system_power_efficient_wq, &phydev->state_queue, 0);
 }
 
-/**
- * phy_stop_machine - stop the PHY state machine tracking
- * @phydev: target phy_device struct
- *
- * Description: Stops the state machine delayed workqueue, sets the
- *   state to UP (unless it wasn't up yet). This function must be
- *   called BEFORE phy_detach.
- */
-void phy_stop_machine(struct phy_device *phydev)
-{
-	cancel_delayed_work_sync(&phydev->state_queue);
-
-	mutex_lock(&phydev->lock);
-	if (phydev->state > PHY_UP && phydev->state != PHY_HALTED)
-		phydev->state = PHY_UP;
-	mutex_unlock(&phydev->lock);
-}
-
 /**
  * phy_error - enter HALTED state for this PHY device
  * @phydev: target phy_device struct
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index ed501fabe..75afefac5 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -830,8 +830,6 @@ void phy_disconnect(struct phy_device *phydev)
 	if (phydev->irq > 0)
 		phy_stop_interrupts(phydev);
 
-	phy_stop_machine(phydev);
-
 	phydev->adjust_link = NULL;
 
 	phy_detach(phydev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index be43bd270..f51c68515 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -1032,7 +1032,6 @@ int phy_drivers_register(struct phy_driver *new_driver, int n,
 void phy_state_machine(struct work_struct *work);
 void phy_change_work(struct work_struct *work);
 void phy_mac_interrupt(struct phy_device *phydev);
-void phy_stop_machine(struct phy_device *phydev);
 void phy_trigger_machine(struct phy_device *phydev, bool sync);
 int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd);
 void phy_ethtool_ksettings_get(struct phy_device *phydev,
-- 
2.16.2

^ permalink raw reply related

* [PATCH RFC 6/7] net: phy: use new function phy_stop_suspending in, mdio_bus_phy_suspend
From: Heiner Kallweit @ 2018-03-14 20:16 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: Geert Uytterhoeven, netdev@vger.kernel.org
In-Reply-To: <421476ed-03c7-63a0-44a4-c6d9c7702647@gmail.com>

Use new function phy_stop_suspending() in mdio_bus_phy_suspend() to also
disable interrupts and set link state to down.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/phy_device.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 08a5ff14f..ed501fabe 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -139,6 +139,7 @@ static bool mdio_bus_phy_needs_start(struct phy_device *phydev)
 static int mdio_bus_phy_suspend(struct device *dev)
 {
 	struct phy_device *phydev = to_phy_device(dev);
+	int ret = 0;
 
 	/* We must stop the state machine manually, otherwise it stops out of
 	 * control, possibly with the phydev->lock held. Upon resume, netdev
@@ -146,9 +147,11 @@ static int mdio_bus_phy_suspend(struct device *dev)
 	 * lead to a deadlock.
 	 */
 	if (phydev->attached_dev && phydev->adjust_link)
-		phy_stop_machine(phydev);
+		phy_stop_suspending(phydev);
+	else
+		ret = phy_suspend(phydev);
 
-	return phy_suspend(phydev);
+	return ret;
 }
 
 static int mdio_bus_phy_resume(struct device *dev)
-- 
2.16.2

^ permalink raw reply related

* [PATCH RFC 5/7] net: phy: make phy_stop synchronous
From: Heiner Kallweit @ 2018-03-14 20:16 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: Geert Uytterhoeven, netdev@vger.kernel.org
In-Reply-To: <421476ed-03c7-63a0-44a4-c6d9c7702647@gmail.com>

Currently phy_stop() just sets the state to PHY_HALTED and relies on the
state machine to do the remaining work. It can take up to 1s until the
state machine runs again what causes issues in situations where e.g.
driver / device is brought down directly after executing phy_stop().

Fix this by executing all phy_stop() activities synchronously.

Add a function phy_stop_suspending() which does basically the same as
phy_stop() and just adopts the state adjustment logic from
phy_stop_machine() to inform the resume callback about the status of
the PHY before suspending.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/phy.c | 48 ++++++++++++++++++++++++++++++++----------------
 include/linux/phy.h   | 12 +++++++++++-
 2 files changed, 43 insertions(+), 17 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 0ca1672a5..54144cd10 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -737,21 +737,49 @@ int phy_stop_interrupts(struct phy_device *phydev)
 }
 EXPORT_SYMBOL(phy_stop_interrupts);
 
+static void phy_link_up(struct phy_device *phydev)
+{
+	phydev->phy_link_change(phydev, true, true);
+	phy_led_trigger_change_speed(phydev);
+}
+
+static void phy_link_down(struct phy_device *phydev, bool do_carrier)
+{
+	phydev->phy_link_change(phydev, false, do_carrier);
+	phy_led_trigger_change_speed(phydev);
+}
+
 /**
- * phy_stop - Bring down the PHY link, and stop checking the status
+ * __phy_stop - Bring down the PHY link, and stop checking the status
  * @phydev: target phy_device struct
+ * @suspending: called from a suspend callback
  */
-void phy_stop(struct phy_device *phydev)
+void __phy_stop(struct phy_device *phydev, bool suspending)
 {
 	mutex_lock(&phydev->lock);
 
 	if (PHY_HALTED == phydev->state)
 		goto out_unlock;
 
+	/* stop state machine */
+	cancel_delayed_work_sync(&phydev->state_queue);
+
 	if (phy_interrupt_is_valid(phydev))
 		phy_disable_interrupts(phydev);
 
-	phydev->state = PHY_HALTED;
+	if (phydev->link) {
+		phydev->link = 0;
+		phy_link_down(phydev, true);
+	}
+
+	phy_suspend(phydev);
+
+	if (suspending) {
+		if (phydev->state > PHY_UP && phydev->state != PHY_HALTED)
+			phydev->state = PHY_UP;
+	} else {
+		phydev->state = PHY_HALTED;
+	}
 
 out_unlock:
 	mutex_unlock(&phydev->lock);
@@ -761,7 +789,7 @@ void phy_stop(struct phy_device *phydev)
 	 * will not reenable interrupts.
 	 */
 }
-EXPORT_SYMBOL(phy_stop);
+EXPORT_SYMBOL(__phy_stop);
 
 /**
  * phy_start - start or restart a PHY device
@@ -804,18 +832,6 @@ void phy_start(struct phy_device *phydev)
 }
 EXPORT_SYMBOL(phy_start);
 
-static void phy_link_up(struct phy_device *phydev)
-{
-	phydev->phy_link_change(phydev, true, true);
-	phy_led_trigger_change_speed(phydev);
-}
-
-static void phy_link_down(struct phy_device *phydev, bool do_carrier)
-{
-	phydev->phy_link_change(phydev, false, do_carrier);
-	phy_led_trigger_change_speed(phydev);
-}
-
 /**
  * phy_state_machine - Handle the state machine
  * @work: work_struct that describes the work to be done
diff --git a/include/linux/phy.h b/include/linux/phy.h
index bc7aa93c6..be43bd270 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -940,7 +940,7 @@ struct phy_device *phy_connect(struct net_device *dev, const char *bus_id,
 void phy_disconnect(struct phy_device *phydev);
 void phy_detach(struct phy_device *phydev);
 void phy_start(struct phy_device *phydev);
-void phy_stop(struct phy_device *phydev);
+void __phy_stop(struct phy_device *phydev, bool suspending);
 int phy_start_aneg(struct phy_device *phydev);
 int phy_aneg_done(struct phy_device *phydev);
 
@@ -964,6 +964,16 @@ static inline const char *phydev_name(const struct phy_device *phydev)
 	return dev_name(&phydev->mdio.dev);
 }
 
+static inline void phy_stop(struct phy_device *phydev)
+{
+	__phy_stop(phydev, false);
+}
+
+static inline void phy_stop_suspending(struct phy_device *phydev)
+{
+	__phy_stop(phydev, true);
+}
+
 void phy_attached_print(struct phy_device *phydev, const char *fmt, ...)
 	__printf(2, 3);
 void phy_attached_info(struct phy_device *phydev);
-- 
2.16.2

^ permalink raw reply related

* [PATCH RFC 4/7] net: phy: remove phy_start_machine
From: Heiner Kallweit @ 2018-03-14 20:16 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: Geert Uytterhoeven, netdev@vger.kernel.org
In-Reply-To: <421476ed-03c7-63a0-44a4-c6d9c7702647@gmail.com>

Now that phy_start() integrated the functionality of phy_start_machine()
we can remove it.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/phy.c        | 16 ----------------
 drivers/net/phy/phy_device.c |  1 -
 drivers/net/phy/phylink.c    |  1 -
 include/linux/phy.h          |  1 -
 4 files changed, 19 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 0aef35efd..0ca1672a5 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -546,22 +546,6 @@ int phy_start_aneg(struct phy_device *phydev)
 }
 EXPORT_SYMBOL(phy_start_aneg);
 
-/**
- * phy_start_machine - start PHY state machine tracking
- * @phydev: the phy_device struct
- *
- * Description: The PHY infrastructure can run a state machine
- *   which tracks whether the PHY is starting up, negotiating,
- *   etc.  This function starts the delayed workqueue which tracks
- *   the state of the PHY. If you want to maintain your own state machine,
- *   do not call this function.
- */
-void phy_start_machine(struct phy_device *phydev)
-{
-	queue_delayed_work(system_power_efficient_wq, &phydev->state_queue, HZ);
-}
-EXPORT_SYMBOL_GPL(phy_start_machine);
-
 /**
  * phy_trigger_machine - trigger the state machine to run
  *
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index c6fd79758..08a5ff14f 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -768,7 +768,6 @@ int phy_connect_direct(struct net_device *dev, struct phy_device *phydev,
 		return rc;
 
 	phy_prepare_link(phydev, handler);
-	phy_start_machine(phydev);
 	if (phydev->irq > 0)
 		phy_start_interrupts(phydev);
 
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 51a011a34..402d08899 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -694,7 +694,6 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy)
 		   __ETHTOOL_LINK_MODE_MASK_NBITS, pl->supported,
 		   phy->advertising);
 
-	phy_start_machine(phy);
 	if (phy->irq > 0)
 		phy_start_interrupts(phy);
 
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 68127b002..bc7aa93c6 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -1022,7 +1022,6 @@ int phy_drivers_register(struct phy_driver *new_driver, int n,
 void phy_state_machine(struct work_struct *work);
 void phy_change_work(struct work_struct *work);
 void phy_mac_interrupt(struct phy_device *phydev);
-void phy_start_machine(struct phy_device *phydev);
 void phy_stop_machine(struct phy_device *phydev);
 void phy_trigger_machine(struct phy_device *phydev, bool sync);
 int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd);
-- 
2.16.2

^ permalink raw reply related

* [PATCH RFC 2/7] net: phy: improve checking for when PHY is allowed to, suspend
From: Heiner Kallweit @ 2018-03-14 20:16 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: Geert Uytterhoeven, netdev@vger.kernel.org
In-Reply-To: <421476ed-03c7-63a0-44a4-c6d9c7702647@gmail.com>

This patch improves and unifies checking for when PHY is allowed to
suspend. New is a check for the parent of the MDIO bus being
runtime-suspended. In this case the MDIO bus may not be accessible and
therefore we don't try to suspend the PHY. Instead we rely on the
parent to suspend all devices on the MDIO bus when it's suspended.

In addition change the behavior to return 0 instead of -EBUSY from
phy_suspend() if we detect a situation where the PHY shouldn't be
suspended. Returning -EBUSY from a suspend callback is supported
for runtime pm, however in case of system suspend it prevents the
system from suspending.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/phy_device.c | 37 +++++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 478405e54..a5691536f 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -35,6 +35,7 @@
 #include <linux/io.h>
 #include <linux/uaccess.h>
 #include <linux/of.h>
+#include <linux/pm_runtime.h>
 
 #include <asm/irq.h>
 
@@ -75,14 +76,27 @@ extern struct phy_driver genphy_10g_driver;
 static LIST_HEAD(phy_fixup_list);
 static DEFINE_MUTEX(phy_fixup_lock);
 
-#ifdef CONFIG_PM
-static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
+static bool phy_may_suspend(struct phy_device *phydev)
 {
 	struct device_driver *drv = phydev->mdio.dev.driver;
 	struct phy_driver *phydrv = to_phy_driver(drv);
 	struct net_device *netdev = phydev->attached_dev;
+	struct device *mdio_parent = phydev->mdio.bus->parent;
+	struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL };
+
+	if (!drv || !phydrv->suspend || phydev->suspended)
+		return false;
+
+	/* If the device has WOL enabled, we cannot suspend the PHY */
+	phy_ethtool_get_wol(phydev, &wol);
+	if (wol.wolopts)
+		return false;
 
-	if (!drv || !phydrv->suspend)
+	/* If the parent of the MDIO bus is runtime-suspended, the MDIO bus may
+	 * not be accessible and we expect the parent to suspend all devices
+	 * on the MDIO bus when it suspends.
+	 */
+	if (mdio_parent && pm_runtime_suspended(mdio_parent))
 		return false;
 
 	/* PHY not attached? May suspend if the PHY has not already been
@@ -91,7 +105,7 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
 	 * MDIO bus driver and clock gated at this point.
 	 */
 	if (!netdev)
-		return !phydev->suspended;
+		return true;
 
 	/* Don't suspend PHY if the attached netdev parent may wakeup.
 	 * The parent may point to a PCI device, as in tg3 driver.
@@ -109,6 +123,7 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
 	return true;
 }
 
+#ifdef CONFIG_PM
 static int mdio_bus_phy_suspend(struct device *dev)
 {
 	struct phy_device *phydev = to_phy_device(dev);
@@ -121,9 +136,6 @@ static int mdio_bus_phy_suspend(struct device *dev)
 	if (phydev->attached_dev && phydev->adjust_link)
 		phy_stop_machine(phydev);
 
-	if (!mdio_bus_phy_may_suspend(phydev))
-		return 0;
-
 	return phy_suspend(phydev);
 }
 
@@ -132,14 +144,10 @@ static int mdio_bus_phy_resume(struct device *dev)
 	struct phy_device *phydev = to_phy_device(dev);
 	int ret;
 
-	if (!mdio_bus_phy_may_suspend(phydev))
-		goto no_resume;
-
 	ret = phy_resume(phydev);
 	if (ret < 0)
 		return ret;
 
-no_resume:
 	if (phydev->attached_dev && phydev->adjust_link)
 		phy_start_machine(phydev);
 
@@ -1148,13 +1156,10 @@ EXPORT_SYMBOL(phy_detach);
 int phy_suspend(struct phy_device *phydev)
 {
 	struct phy_driver *phydrv = to_phy_driver(phydev->mdio.dev.driver);
-	struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL };
 	int ret = 0;
 
-	/* If the device has WOL enabled, we cannot suspend the PHY */
-	phy_ethtool_get_wol(phydev, &wol);
-	if (wol.wolopts)
-		return -EBUSY;
+	if (!phy_may_suspend(phydev))
+		return 0;
 
 	if (phydev->drv && phydrv->suspend)
 		ret = phydrv->suspend(phydev);
-- 
2.16.2

^ permalink raw reply related

* [PATCH RFC 3/7] net: phy: resume PHY only if needed in, mdio_bus_phy_suspend
From: Heiner Kallweit @ 2018-03-14 20:16 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: Geert Uytterhoeven, netdev@vger.kernel.org
In-Reply-To: <421476ed-03c7-63a0-44a4-c6d9c7702647@gmail.com>

Currently the PHY is unconditionally resumed in mdio_bus_phy_suspend().
In cases where the PHY was sleepinh before suspending or if somebody else
takes care of resuming later, this is not needed and wastes energy.

Also start the state machine only if it's used by the driver (indicated
by the adjust_link callback being defined).

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/phy_device.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index a5691536f..c6fd79758 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -124,6 +124,18 @@ static bool phy_may_suspend(struct phy_device *phydev)
 }
 
 #ifdef CONFIG_PM
+
+static bool mdio_bus_phy_needs_start(struct phy_device *phydev)
+{
+	bool start;
+
+	mutex_lock(&phydev->lock);
+	start = phydev->state == PHY_UP && phydev->adjust_link;
+	mutex_unlock(&phydev->lock);
+
+	return start;
+}
+
 static int mdio_bus_phy_suspend(struct device *dev)
 {
 	struct phy_device *phydev = to_phy_device(dev);
@@ -142,25 +154,25 @@ static int mdio_bus_phy_suspend(struct device *dev)
 static int mdio_bus_phy_resume(struct device *dev)
 {
 	struct phy_device *phydev = to_phy_device(dev);
-	int ret;
+	int ret = 0;
 
-	ret = phy_resume(phydev);
-	if (ret < 0)
-		return ret;
+	if (!phydev->attached_dev)
+		return 0;
 
-	if (phydev->attached_dev && phydev->adjust_link)
-		phy_start_machine(phydev);
+	if (mdio_bus_phy_needs_start(phydev))
+		phy_start(phydev);
+	else if (!phydev->adjust_link)
+		ret = phy_resume(phydev);
 
-	return 0;
+	return ret;
 }
 
 static int mdio_bus_phy_restore(struct device *dev)
 {
 	struct phy_device *phydev = to_phy_device(dev);
-	struct net_device *netdev = phydev->attached_dev;
 	int ret;
 
-	if (!netdev)
+	if (!phydev->attached_dev)
 		return 0;
 
 	ret = phy_init_hw(phydev);
@@ -171,7 +183,8 @@ static int mdio_bus_phy_restore(struct device *dev)
 	phydev->link = 0;
 	phydev->state = PHY_UP;
 
-	phy_start_machine(phydev);
+	if (mdio_bus_phy_needs_start(phydev))
+		phy_start(phydev);
 
 	return 0;
 }
-- 
2.16.2

^ permalink raw reply related

* [PATCH RFC 1/7] net: phy: unconditionally resume and re-enable interrupts, in phy_start
From: Heiner Kallweit @ 2018-03-14 20:16 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: Geert Uytterhoeven, netdev@vger.kernel.org
In-Reply-To: <421476ed-03c7-63a0-44a4-c6d9c7702647@gmail.com>

In subsequent patches of this series interrupts will be disabled during
system suspend, also we will have to resume the PHY in states other than
PHY_HALTED. To prepare for this unconditionally resume and re-enable
interrupts in phy_start().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/phy.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 05c1e8ef1..0aef35efd 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -791,10 +791,16 @@ EXPORT_SYMBOL(phy_stop);
  */
 void phy_start(struct phy_device *phydev)
 {
-	int err = 0;
+	/* if phy was suspended, bring the physical link up again */
+	phy_resume(phydev);
 
-	mutex_lock(&phydev->lock);
+	/* make sure interrupts are re-enabled for the PHY */
+	if (phy_interrupt_is_valid(phydev) && phy_enable_interrupts(phydev)) {
+		phy_error(phydev);
+		return;
+	}
 
+	mutex_lock(&phydev->lock);
 	switch (phydev->state) {
 	case PHY_STARTING:
 		phydev->state = PHY_PENDING;
@@ -803,16 +809,6 @@ void phy_start(struct phy_device *phydev)
 		phydev->state = PHY_UP;
 		break;
 	case PHY_HALTED:
-		/* if phy was suspended, bring the physical link up again */
-		__phy_resume(phydev);
-
-		/* make sure interrupts are re-enabled for the PHY */
-		if (phy_interrupt_is_valid(phydev)) {
-			err = phy_enable_interrupts(phydev);
-			if (err < 0)
-				break;
-		}
-
 		phydev->state = PHY_RESUMING;
 		break;
 	default:
-- 
2.16.2

^ permalink raw reply related

* [PATCH RFC 0/7] net: phy: patch series aiming to improve few aspects of phylib
From: Heiner Kallweit @ 2018-03-14 20:10 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: Geert Uytterhoeven, netdev@vger.kernel.org

This patch series aims to tackle few issues with phylib:
 
- address issues with patch series [1] (smsc911x + phylib changes)
- make phy_stop synchronous
- get rid of phy_start/stop_machine and handle it in phy_start/phy_stop
- in mdio_suspend consider runtime pm state of mdio bus parent
- consider more WOL conditions when deciding whether PHY is allowed to
  suspend
- only resume phy after system suspend if needed

[1] https://www.mail-archive.com/netdev@vger.kernel.org/msg196061.html

It works fine here but other NIC drivers may use phylib differently. 
Therefore I'd appreciate feedback and more testing.

I could think of some subsequent patches, e.g. phy_error() could be
reduced to calling phy_stop() and printing an error message
(today it silently sets the PHY state to PHY_HALTED).

Heiner Kallweit (7):
  net: phy: unconditionally resume and re-enable interrupts in phy_start
  net: phy: improve checking for when PHY is allowed to suspend
  net: phy: resume PHY only if needed in mdio_bus_phy_suspend
  net: phy: remove phy_start_machine
  net: phy: make phy_stop synchronous
  net: phy: use new function phy_stop_suspending in mdio_bus_phy_suspend
  net: phy: remove phy_stop_machine

 drivers/net/phy/phy.c        | 102 +++++++++++++++++--------------------------
 drivers/net/phy/phy_device.c |  80 ++++++++++++++++++++-------------
 drivers/net/phy/phylink.c    |   1 -
 include/linux/phy.h          |  14 ++++--
 4 files changed, 100 insertions(+), 97 deletions(-)

-- 
2.16.2

^ permalink raw reply

* [PATCH net] netlink: avoid a double skb free in genlmsg_mcast()
From: Nicolas Dichtel @ 2018-03-14 20:10 UTC (permalink / raw)
  To: ben.hutchings
  Cc: davem, netdev, stable, johannes.berg, gregkh, linux-kernel,
	Nicolas Dichtel
In-Reply-To: <1520899459.23626.87.camel@codethink.co.uk>

nlmsg_multicast() consumes always the skb, thus the original skb must be
freed only when this function is called with a clone.

Fixes: cb9f7a9a5c96 ("netlink: ensure to loop over all netns in genlmsg_multicast_allns()")
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 net/netlink/genetlink.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
index 6f02499ef007..b9ce82c9440f 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -1106,7 +1106,7 @@ static int genlmsg_mcast(struct sk_buff *skb, u32 portid, unsigned long group,
 	if (!err)
 		delivered = true;
 	else if (err != -ESRCH)
-		goto error;
+		return err;
 	return delivered ? 0 : -ESRCH;
  error:
 	kfree_skb(skb);
-- 
2.15.1

^ permalink raw reply related

* RE: [Intel-wired-lan] [next-queue PATCH v4 6/8] igb: Add MAC address support for ethtool nftuple filters
From: Vinicius Costa Gomes @ 2018-03-14 19:58 UTC (permalink / raw)
  To: Brown, Aaron F, intel-wired-lan@lists.osuosl.org
  Cc: netdev@vger.kernel.org, Sanchez-Palencia, Jesus
In-Reply-To: <309B89C4C689E141A5FF6A0C5FB2118B8C8032A0@ORSMSX101.amr.corp.intel.com>

Hi,

"Brown, Aaron F" <aaron.f.brown@intel.com> writes:

>> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On
>> Behalf Of Vinicius Costa Gomes
>> Sent: Wednesday, March 7, 2018 4:37 PM
>> To: intel-wired-lan@lists.osuosl.org
>> Cc: netdev@vger.kernel.org; Sanchez-Palencia, Jesus <jesus.sanchez-
>> palencia@intel.com>
>> Subject: [Intel-wired-lan] [next-queue PATCH v4 6/8] igb: Add MAC address
>> support for ethtool nftuple filters
>> 
>> This adds the capability of configuring the queue steering of arriving
>> packets based on their source and destination MAC addresses.
>> 
>> In practical terms this adds support for the following use cases,
>> characterized by these examples:
>> 
>> $ ethtool -N eth0 flow-type ether dst aa:aa:aa:aa:aa:aa action 0
>> (this will direct packets with destination address "aa:aa:aa:aa:aa:aa"
>> to the RX queue 0)
>> 
>> $ ethtool -N eth0 flow-type ether src 44:44:44:44:44:44 action 3
>> (this will direct packets with source address "44:44:44:44:44:44" to
>> the RX queue 3)
>
> This seems to work fine on i210, and the patch series allows me to set
> the rx filters on the i350, i354 and i211, but it is not directing the
> packets to the queue I request.
>

For the i211, it seems that the datasheet is slightly misleading: it has
the QSEL bit documented on the RAH registers, but the queue selection
bits are not mentioned, so it really seems that queue selection won't
work for this controller.

For the other cases (in a quick search I couldn't find the i354
datasheet), the semantics changes, it's more about pool selection than
queue selection, and it depends on vfs_allocated_count (>= 1) and the
number of rss_queues (<= 1) to get to the state where setting the queue
via filters would have the expected effect.

> With the exception of i210 the rx_queues number does not seem to be
> effected by setting the filter. In the case of i211 the rx packets
> stay on rx_queue 0 with or without an ether src or dst filter. The
> first example one seems to work at first since it's directing to queue
> 0, but changing the filter to "action 1" does not change the behavior.
> With the i350 and i354 ports the packets are spread across the
> rx_queues with or without the filter set.
>

So, what I am thinking is: make it an error selecting any queue
different than zero (this is expected to work for all controllers, and
it's what will be called when the user does something like 'ip maddr'),
for controller different than the i210. Later, if/when this feature is
needed for other controllers we can extend the checks.

Does this make sense?

Thanks,
--
Vinicius

^ permalink raw reply

* Re: regression: ti: cpsw: warning from phy_connect()->sysfs_create_link()-sysfs_warn_dup()
From: Florian Fainelli @ 2018-03-14 19:50 UTC (permalink / raw)
  To: Grygorii Strashko, netdev, Andrew Lunn; +Cc: David Miller, Sekhar Nori
In-Reply-To: <d7b6d13b-ecec-2039-4a77-514f9666b729@ti.com>

On 03/14/2018 12:40 PM, Grygorii Strashko wrote:
> 
> 
> On 02/26/2018 02:16 PM, Florian Fainelli wrote:
>> On 02/26/2018 12:08 PM, Grygorii Strashko wrote:
>>> Hi Florian,
>>>
>>> The TI CPSW driver produces warning as below when booted in switch mode:
>>> [    8.882295] sysfs: cannot create duplicate filename '/devices/platform/44000000.ocp/48484000.ethernet/net/eth0/phydev'
>>> [    8.999859] CPU: 1 PID: 356 Comm: systemd-network Not tainted 4.16.0-rc3-00010-g6cc3ff6-dirty #225
>>> ...
>>> [    9.012352] Hardware name: Generic DRA74X (Flattened Device Tree)
>>> [    9.018901] Backtrace:
>>> [    9.021376] [<c010c918>] (dump_backtrace) from [<c010cbe8>] (show_stack+0x18/0x1c)
>>> [    9.028986]  r7:ed036240 r6:60070013 r5:00000000 r4:c0d598a0
>>> [    9.034684] [<c010cbd0>] (show_stack) from [<c088e1f0>] (dump_stack+0x8c/0xa0)
>>> [    9.041954] [<c088e164>] (dump_stack) from [<c02ba59c>] (sysfs_warn_dup+0x60/0x6c)
>>> [    9.049564]  r7:ed036240 r6:ed036240 r5:c0b579bc r4:ed10c000
>>> [    9.055264] [<c02ba53c>] (sysfs_warn_dup) from [<c02ba930>] (sysfs_do_create_link_sd+0xbc/0xc4)
>>> [    9.064006]  r7:ed036240 r6:ffffffef r5:00000000 r4:ed034660
>>> [    9.069701] [<c02ba874>] (sysfs_do_create_link_sd) from [<c02ba968>] (sysfs_create_link+0x30/0x3c)
>>> [    9.078706]  r9:00000008 r8:00000000 r7:ed02f008 r6:ee015ae8 r5:ee015800 r4:ed02f000
>>> [    9.086497] [<c02ba938>] (sysfs_create_link) from [<c060f070>] (phy_attach_direct+0x180/0x1f4)
>>> [    9.095163] [<c060eef0>] (phy_attach_direct) from [<c060f1cc>] (phy_connect_direct+0x1c/0x54)
>>> [    9.103735]  r10:00000001 r9:ee015d0c r8:00000008 r7:c062e84c r6:c062e84c r5:ed02f000
>>> [    9.111609]  r4:ed02f000 r3:00000008
>>> [    9.115217] [<c060f1b0>] (phy_connect_direct) from [<c060f250>] (phy_connect+0x4c/0x80)
>>> [    9.123270]  r7:c062e84c r6:ee015800 r5:ed02f000 r4:ee693664
>>> [    9.128969] [<c060f204>] (phy_connect) from [<c062abc8>] (cpsw_slave_open+0x21c/0x274)
>>> [    9.136926]  r9:ee015d0c r8:ee015d00 r7:ed018a10 r6:ee015800 r5:ee015d00 r4:ed032630
>>> [    9.144715] [<c062a9ac>] (cpsw_slave_open) from [<c062b28c>] (cpsw_ndo_open+0x158/0x55c)
>>> [    9.152860]  r10:00000001 r9:00000000 r8:ee015d00 r7:c0d04c48 r6:ee015800 r5:ed018a10
>>> [    9.160730]  r4:ed032630
>>> [    9.163291] [<c062b134>] (cpsw_ndo_open) from [<c0765350>] (__dev_open+0xd4/0x158)
>>> [    9.170900]  r10:00000000 r9:ec885db4 r8:ee01582c r7:c09a9c00 r6:00000000 r5:c0d04c48
>>> [    9.178768]  r4:ee015800
>>> [    9.181326] [<c076527c>] (__dev_open) from [<c0765748>] (__dev_change_flags+0x174/0x1c0)
>>>
>>> The reason of the warning is that in switch mode CPSW drivers is connecting two Net PHYs to
>>> a single network device (it has been done this way when driver was initially introduced), so
>>> now it produces warning during boot because of commits:
>>>
>>> 5568363f0cb3 ("net: phy: Create sysfs reciprocal links for attached_dev/phydev"
>>> a3995460491d ("net: phy: Relax error checking on sysfs_create_link()"
>>> ^ both went in v4.13
>>>
>>> Honestly, I'm not sure how to fix it the best way (the simplest fix is below), taking into account
>>> that we are not ready to do big reworks in CPSW driver.
> 
> I've got additional testing data and this actually a *regression*, because 
> second CPSW Port became broken after above commits due to Net PHY 
> connection failure.
> 
>>
>> The CPSW driver is duplicating a fair amount of what the DSA subsystem
>> does properly without breaking any assumptions about the 1:1 mapping
>> that can exist between a network device and PHY device. Having a PHY
>> device without a network device is fine in premise, although discouraged.
>>
>> Migrating to DSA is certainly a big task, but I would strongly encourage
>> you to consider doing it, since that would solve this problem, and
>> probably many more.
> 
> We are actively investigating possibility to use DSA for this driver,
> but unfortunately this is looks very problematic, because this is old driver with
> stable ABI and stable device tree binding which are also ABI. 
> So, this driver can't be simply migrating to use DSA and as possible solution
> new driver can be introduced in long term perspective which will
> follow DSA binding requirements and reuse as max as possible code
> of the current CPSW driver. Even in this case, old driver
> will need to be supported during some transition period and *be functional*.

That is fair enough, maybe we could explore breaking things more nicely
within DSA such that the Device Tree parsing is largely independent from
the internal representation, and as a result,you could reuse the
existing binding you have, but leverage the DSA infrastructure where it
makes sense, food for thought.

> 
> For now I'd be very appreciated if functionality of current TI CPSW driver will be
> restored and propose to consider below fix, which will make it work again:
> 
> =============
> From 5f67320d985a533da785e3643e5e63ba7395b4ae Mon Sep 17 00:00:00 2001
> From: Grygorii Strashko <grygorii.strashko@ti.com>
> Date: Wed, 14 Mar 2018 14:01:12 -0500
> Subject: [PATCH] net: phy: skip error checking when creating sysfs link
>  netdev->phydev
> 
> Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per
> one netdevice, as result such drivers will produce warning during system
> boot and fail to connect second phy to netdevice when PHY device
> framework will try to create sysfs link netdev->phydev for second PHY,
> because sysfs link with the same name has been created already for the first PHY.
> As result, second CPSW external port will became unusable.
> 
> Fix it by skipping error checking when PHY device
> framework creating sysfs link netdev->phydev. The sysfs_create_link()
> will still produce warning, but we can live with it as
> system functionality will not be broken. 
> 
> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
> ---
>  drivers/net/phy/phy_device.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 67f25ac..e2c34c3 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -1000,9 +1000,10 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
>  				"attached_dev");
>  	if (!err) {
>  		err = sysfs_create_link(&dev->dev.kobj, &phydev->mdio.dev.kobj,
> -					"phydev");
> -		if (err)
> -			goto error;
> +				  "phydev");
> +		/* don't check return value here as some net drivers can
> +		 * use one netdevice with more then one phy
> +		 */
>  
>  		phydev->sysfs_links = true;

Can you make sure that the single boolean tracking the state of both
sysfs links is not going to cause any problem in your case? Try
unbinding the PHY driver from your PHY device for instance to check
that. If that still works and does not produce a warning then:

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>

and maybe add a Fixes: tag when you submit this officially?
-- 
Florian

^ permalink raw reply

* Re: [PATCH net-next 1/5] sctp: add refcnt support for sh_key
From: Marcelo Ricardo Leitner @ 2018-03-14 19:41 UTC (permalink / raw)
  To: Xin Long; +Cc: Neil Horman, network dev, linux-sctp, davem
In-Reply-To: <CADvbK_ew__wdmX9KeE38S7=3YWJ7nFJB7YXyMKxPr8y+Bo9+qw@mail.gmail.com>

On Thu, Mar 15, 2018 at 12:12:32AM +0800, Xin Long wrote:
> On Wed, Mar 14, 2018 at 9:59 PM, Neil Horman <nhorman@tuxdriver.com> wrote:
> > On Wed, Mar 14, 2018 at 07:05:30PM +0800, Xin Long wrote:
> >> With refcnt support for sh_key, chunks auth sh_keys can be decided
> >> before enqueuing it. Changing the active key later will not affect
> >> the chunks already enqueued.
> >>
> >> Furthermore, this is necessary when adding the support for authinfo
> >> for sendmsg in next patch.
> >>
> >> Note that struct sctp_chunk can't be grown due to that performance
> >> drop issue on slow cpu, so it just reuses head_skb memory for shkey
> >> in sctp_chunk.
> >>
> >> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> >> ---
> >>  include/net/sctp/auth.h    |  9 +++--
> >>  include/net/sctp/sm.h      |  3 +-
> >>  include/net/sctp/structs.h |  9 +++--
> >>  net/sctp/auth.c            | 86 +++++++++++++++++++++++-----------------------
> >>  net/sctp/chunk.c           |  5 +++
> >>  net/sctp/output.c          | 18 ++++++++--
> >>  net/sctp/sm_make_chunk.c   | 15 ++++++--
> >>  net/sctp/sm_statefuns.c    | 11 +++---
> >>  net/sctp/socket.c          |  6 ++++
> >>  9 files changed, 104 insertions(+), 58 deletions(-)
> >>
> >> diff --git a/include/net/sctp/auth.h b/include/net/sctp/auth.h
> >> index e5c57d0..017c1aa 100644
> >> --- a/include/net/sctp/auth.h
> >> +++ b/include/net/sctp/auth.h
> >> @@ -62,8 +62,9 @@ struct sctp_auth_bytes {
> >>  /* Definition for a shared key, weather endpoint or association */
> >>  struct sctp_shared_key {
> >>       struct list_head key_list;
> >> -     __u16 key_id;
> >>       struct sctp_auth_bytes *key;
> >> +     refcount_t refcnt;
> >> +     __u16 key_id;
> >>  };
> >>
> >>  #define key_for_each(__key, __list_head) \
> >> @@ -103,8 +104,10 @@ int sctp_auth_send_cid(enum sctp_cid chunk,
> >>  int sctp_auth_recv_cid(enum sctp_cid chunk,
> >>                      const struct sctp_association *asoc);
> >>  void sctp_auth_calculate_hmac(const struct sctp_association *asoc,
> >> -                         struct sk_buff *skb,
> >> -                         struct sctp_auth_chunk *auth, gfp_t gfp);
> >> +                           struct sk_buff *skb, struct sctp_auth_chunk *auth,
> >> +                           struct sctp_shared_key *ep_key, gfp_t gfp);
> >> +void sctp_auth_shkey_release(struct sctp_shared_key *sh_key);
> >> +void sctp_auth_shkey_hold(struct sctp_shared_key *sh_key);
> >>
> >>  /* API Helpers */
> >>  int sctp_auth_ep_add_chunkid(struct sctp_endpoint *ep, __u8 chunk_id);
> >> diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
> >> index 2883c43..2d0e782 100644
> >> --- a/include/net/sctp/sm.h
> >> +++ b/include/net/sctp/sm.h
> >> @@ -263,7 +263,8 @@ int sctp_process_asconf_ack(struct sctp_association *asoc,
> >>  struct sctp_chunk *sctp_make_fwdtsn(const struct sctp_association *asoc,
> >>                                   __u32 new_cum_tsn, size_t nstreams,
> >>                                   struct sctp_fwdtsn_skip *skiplist);
> >> -struct sctp_chunk *sctp_make_auth(const struct sctp_association *asoc);
> >> +struct sctp_chunk *sctp_make_auth(const struct sctp_association *asoc,
> >> +                               __u16 key_id);
> >>  struct sctp_chunk *sctp_make_strreset_req(const struct sctp_association *asoc,
> >>                                         __u16 stream_num, __be16 *stream_list,
> >>                                         bool out, bool in);
> >> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> >> index ec6e46b..49ad67b 100644
> >> --- a/include/net/sctp/structs.h
> >> +++ b/include/net/sctp/structs.h
> >> @@ -577,8 +577,12 @@ struct sctp_chunk {
> >>       /* This points to the sk_buff containing the actual data.  */
> >>       struct sk_buff *skb;
> >>
> >> -     /* In case of GSO packets, this will store the head one */
> >> -     struct sk_buff *head_skb;
> >> +     union {
> >> +             /* In case of GSO packets, this will store the head one */
> >> +             struct sk_buff *head_skb;
> >> +             /* In case of auth enabled, this will point to the shkey */
> >> +             struct sctp_shared_key *shkey;
> >> +     };
> > Why do you need to add this at all?  You add the shared key pointer to the
> > association in this patch, and sctp_chunk already has a pointer to the
> > association, so you should already be able to find the shared key via
> > chunk->asoc->shkey, no?
> We need this, because one asoc can have a list of shared keys.
> When sending a msg, we can even choose one of these keys with
> cmsg info (cmsgs->authinfo) for this msg, which is that
> 5.3.8.  SCTP AUTH Information Structure (SCTP_AUTHINFO)
> is supposed to do. (Patch 2/5)
> 
> After this patchset, asoc->shkey (or we can say active shkey) is
> more like default shkey. It will be used only when SCTP_AUTHINFO
> is not set in cmsg info.

And even then, now once the chunk is enqueued, asoc->shkey is allowed
to change without affecting it.

This is probably one of the main improvements on this patchset.

  M.

^ permalink raw reply

* Re: regression: ti: cpsw: warning from phy_connect()->sysfs_create_link()-sysfs_warn_dup()
From: Grygorii Strashko @ 2018-03-14 19:40 UTC (permalink / raw)
  To: Florian Fainelli, netdev, Andrew Lunn; +Cc: David Miller, Sekhar Nori
In-Reply-To: <2679eb4c-b08a-4ef7-a6e8-de7425a47e94@gmail.com>



On 02/26/2018 02:16 PM, Florian Fainelli wrote:
> On 02/26/2018 12:08 PM, Grygorii Strashko wrote:
>> Hi Florian,
>>
>> The TI CPSW driver produces warning as below when booted in switch mode:
>> [    8.882295] sysfs: cannot create duplicate filename '/devices/platform/44000000.ocp/48484000.ethernet/net/eth0/phydev'
>> [    8.999859] CPU: 1 PID: 356 Comm: systemd-network Not tainted 4.16.0-rc3-00010-g6cc3ff6-dirty #225
>> ...
>> [    9.012352] Hardware name: Generic DRA74X (Flattened Device Tree)
>> [    9.018901] Backtrace:
>> [    9.021376] [<c010c918>] (dump_backtrace) from [<c010cbe8>] (show_stack+0x18/0x1c)
>> [    9.028986]  r7:ed036240 r6:60070013 r5:00000000 r4:c0d598a0
>> [    9.034684] [<c010cbd0>] (show_stack) from [<c088e1f0>] (dump_stack+0x8c/0xa0)
>> [    9.041954] [<c088e164>] (dump_stack) from [<c02ba59c>] (sysfs_warn_dup+0x60/0x6c)
>> [    9.049564]  r7:ed036240 r6:ed036240 r5:c0b579bc r4:ed10c000
>> [    9.055264] [<c02ba53c>] (sysfs_warn_dup) from [<c02ba930>] (sysfs_do_create_link_sd+0xbc/0xc4)
>> [    9.064006]  r7:ed036240 r6:ffffffef r5:00000000 r4:ed034660
>> [    9.069701] [<c02ba874>] (sysfs_do_create_link_sd) from [<c02ba968>] (sysfs_create_link+0x30/0x3c)
>> [    9.078706]  r9:00000008 r8:00000000 r7:ed02f008 r6:ee015ae8 r5:ee015800 r4:ed02f000
>> [    9.086497] [<c02ba938>] (sysfs_create_link) from [<c060f070>] (phy_attach_direct+0x180/0x1f4)
>> [    9.095163] [<c060eef0>] (phy_attach_direct) from [<c060f1cc>] (phy_connect_direct+0x1c/0x54)
>> [    9.103735]  r10:00000001 r9:ee015d0c r8:00000008 r7:c062e84c r6:c062e84c r5:ed02f000
>> [    9.111609]  r4:ed02f000 r3:00000008
>> [    9.115217] [<c060f1b0>] (phy_connect_direct) from [<c060f250>] (phy_connect+0x4c/0x80)
>> [    9.123270]  r7:c062e84c r6:ee015800 r5:ed02f000 r4:ee693664
>> [    9.128969] [<c060f204>] (phy_connect) from [<c062abc8>] (cpsw_slave_open+0x21c/0x274)
>> [    9.136926]  r9:ee015d0c r8:ee015d00 r7:ed018a10 r6:ee015800 r5:ee015d00 r4:ed032630
>> [    9.144715] [<c062a9ac>] (cpsw_slave_open) from [<c062b28c>] (cpsw_ndo_open+0x158/0x55c)
>> [    9.152860]  r10:00000001 r9:00000000 r8:ee015d00 r7:c0d04c48 r6:ee015800 r5:ed018a10
>> [    9.160730]  r4:ed032630
>> [    9.163291] [<c062b134>] (cpsw_ndo_open) from [<c0765350>] (__dev_open+0xd4/0x158)
>> [    9.170900]  r10:00000000 r9:ec885db4 r8:ee01582c r7:c09a9c00 r6:00000000 r5:c0d04c48
>> [    9.178768]  r4:ee015800
>> [    9.181326] [<c076527c>] (__dev_open) from [<c0765748>] (__dev_change_flags+0x174/0x1c0)
>>
>> The reason of the warning is that in switch mode CPSW drivers is connecting two Net PHYs to
>> a single network device (it has been done this way when driver was initially introduced), so
>> now it produces warning during boot because of commits:
>>
>> 5568363f0cb3 ("net: phy: Create sysfs reciprocal links for attached_dev/phydev"
>> a3995460491d ("net: phy: Relax error checking on sysfs_create_link()"
>> ^ both went in v4.13
>>
>> Honestly, I'm not sure how to fix it the best way (the simplest fix is below), taking into account
>> that we are not ready to do big reworks in CPSW driver.

I've got additional testing data and this actually a *regression*, because 
second CPSW Port became broken after above commits due to Net PHY 
connection failure.

> 
> The CPSW driver is duplicating a fair amount of what the DSA subsystem
> does properly without breaking any assumptions about the 1:1 mapping
> that can exist between a network device and PHY device. Having a PHY
> device without a network device is fine in premise, although discouraged.
> 
> Migrating to DSA is certainly a big task, but I would strongly encourage
> you to consider doing it, since that would solve this problem, and
> probably many more.

We are actively investigating possibility to use DSA for this driver,
but unfortunately this is looks very problematic, because this is old driver with
stable ABI and stable device tree binding which are also ABI. 
So, this driver can't be simply migrating to use DSA and as possible solution
new driver can be introduced in long term perspective which will
follow DSA binding requirements and reuse as max as possible code
of the current CPSW driver. Even in this case, old driver
will need to be supported during some transition period and *be functional*.

For now I'd be very appreciated if functionality of current TI CPSW driver will be
restored and propose to consider below fix, which will make it work again:

=============
>From 5f67320d985a533da785e3643e5e63ba7395b4ae Mon Sep 17 00:00:00 2001
From: Grygorii Strashko <grygorii.strashko@ti.com>
Date: Wed, 14 Mar 2018 14:01:12 -0500
Subject: [PATCH] net: phy: skip error checking when creating sysfs link
 netdev->phydev

Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per
one netdevice, as result such drivers will produce warning during system
boot and fail to connect second phy to netdevice when PHY device
framework will try to create sysfs link netdev->phydev for second PHY,
because sysfs link with the same name has been created already for the first PHY.
As result, second CPSW external port will became unusable.

Fix it by skipping error checking when PHY device
framework creating sysfs link netdev->phydev. The sysfs_create_link()
will still produce warning, but we can live with it as
system functionality will not be broken. 

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
---
 drivers/net/phy/phy_device.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 67f25ac..e2c34c3 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1000,9 +1000,10 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 				"attached_dev");
 	if (!err) {
 		err = sysfs_create_link(&dev->dev.kobj, &phydev->mdio.dev.kobj,
-					"phydev");
-		if (err)
-			goto error;
+				  "phydev");
+		/* don't check return value here as some net drivers can
+		 * use one netdevice with more then one phy
+		 */
 
 		phydev->sysfs_links = true;
 	}
-- 
2.10.5




-- 
regards,
-grygorii

^ permalink raw reply related

* Re: [PATCH net-next 0/2] skbuff: Fix applications not being woken for errors
From: Vinicius Costa Gomes @ 2018-03-14 19:37 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, randy.e.witt, David Miller, Eric Dumazet
In-Reply-To: <CAF=yD-K2xH6uGxC3P7UQab5v26kP85z3axvY8L9NfigCLrP6Hw@mail.gmail.com>

Hi,

Willem de Bruijn <willemdebruijn.kernel@gmail.com> writes:

>> Another interesting fact is that if the POLLIN event is added to the
>> poll() .events, poll() no longer becomes stuck,
>
> The process has registered interest only in POLLIN, which the call to
> sk_data_read (sock_def_readable) will trigger.
>
>> and more interestingly
>> the returned event in .revents is only POLLERR.
>
> datagram_poll will set (E)POLLERR based on non-empty sk_error_queue.
>
>> After a few debugging sessions, we got to 'sock_queue_err_skb()' and
>> how it notifies applications of the error just enqueued. Changing it
>> to use 'sk->sk_error_report()', fixes the issue for hardware and
>> software timestamping. That is patch (2).
>>
>> The "solution" proposed in patch (2) looks like too big a hammer,
>
> It looks fine to me. POLLERR is returned regardless of the mask a
> process sets up in pollfd.events. So waking with sk_error_report
> will fix this while still waking callers waiting on POLLIN.
>
> Note that on sock_dequeue_err_skb, if another notification (of the
> right kind) is waiting, sk_error_report is already called instead of
> sk_data_ready.

Thank you all who did confirm that this was the right fix.

>
> This should perhaps go to net, instead of net-next (though not the
> test).

Will propose it to net, what I am thinking is that now that a few TSN
features are upstream, applications that use hardware timestamping (the
easiest way to trigger this bug) could be more common.

>
> If resending, a small nit in the test: please keep the alphabetical
> order in getopt. The filepath also looks a bit fishy, but git am applied
> the mbox from patchwork without issue.

Will send a second version.


Thanks,
--
Vinicius

^ permalink raw reply

* Re: [PATCH net-next v2 0/4] net: qualcomm: rmnet: Updates 2018-03-12
From: Subash Abhinov Kasiviswanathan @ 2018-03-14 19:27 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20180314.140615.802964606635535521.davem@davemloft.net>

> Please address Joe's feedback and only update the copyright date on
> files that actually had changes this year.
> 
> Thank you.

Hi David

I have fixed that now in v3.
However, patchwork is not showing the entire series.
It shows only the first patch for some reason even if I search with me 
as submitter.

https://patchwork.ozlabs.org/project/netdev/list/?submitter=65547&state=%2A&archive=both

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply

* [PATCH net-next 2/2] net: Use rtnl_lock_killable() in register_netdev()
From: Kirill Tkhai @ 2018-03-14 19:17 UTC (permalink / raw)
  To: davem, ktkhai, vyasevic, edumazet, nicolas.dichtel, netdev
In-Reply-To: <152105492788.24797.10467675660981296096.stgit@localhost.localdomain>

This patch adds rtnl_lock_killable() to one of hot path
using rtnl_lock().

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/dev.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 12a9aad0b057..d8887cc38e7b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -8018,7 +8018,8 @@ int register_netdev(struct net_device *dev)
 {
 	int err;
 
-	rtnl_lock();
+	if (rtnl_lock_killable())
+		return -EINTR;
 	err = register_netdevice(dev);
 	rtnl_unlock();
 	return err;

^ permalink raw reply related

* [PATCH net-next 1/2] net: Add rtnl_lock_killable()
From: Kirill Tkhai @ 2018-03-14 19:17 UTC (permalink / raw)
  To: davem, ktkhai, vyasevic, edumazet, nicolas.dichtel, netdev
In-Reply-To: <152105492788.24797.10467675660981296096.stgit@localhost.localdomain>

rtnl_lock() is widely used mutex in kernel. Some of kernel code
does memory allocations under it. In case of memory deficit this
may invoke OOM killer, but the problem is a killed task can't
exit if it's waiting for the mutex. This may be a reason of deadlock
and panic.

This patch adds a new primitive, which responds on SIGKILL, and
it allows to use it in the places, where we don't want to sleep
forever.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 include/linux/rtnetlink.h |    1 +
 net/core/rtnetlink.c      |    6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 3573b4bf2fdf..562a175c35a9 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -33,6 +33,7 @@ extern void rtnl_lock(void);
 extern void rtnl_unlock(void);
 extern int rtnl_trylock(void);
 extern int rtnl_is_locked(void);
+extern int rtnl_lock_killable(void);
 
 extern wait_queue_head_t netdev_unregistering_wq;
 extern struct rw_semaphore net_sem;
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 67f375cfb982..87079eaa871b 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -75,6 +75,12 @@ void rtnl_lock(void)
 }
 EXPORT_SYMBOL(rtnl_lock);
 
+int rtnl_lock_killable(void)
+{
+	return mutex_lock_killable(&rtnl_mutex);
+}
+EXPORT_SYMBOL(rtnl_lock_killable);
+
 static struct sk_buff *defer_kfree_skb_list;
 void rtnl_kfree_skbs(struct sk_buff *head, struct sk_buff *tail)
 {

^ permalink raw reply related

* [PATCH net-next 0/2] Introduce rtnl_lock_killable()
From: Kirill Tkhai @ 2018-03-14 19:17 UTC (permalink / raw)
  To: davem, ktkhai, vyasevic, edumazet, nicolas.dichtel, netdev

rtnl_lock() is widely used mutex in kernel. Some of kernel code
does memory allocations under it. In case of memory deficit this
may invoke OOM killer, but the problem is a killed task can't
exit if it's waiting for the mutex. This may be a reason of deadlock
and panic.

This patchset adds a new primitive, which responds on SIGKILL,
and it allows to use it in the places, where we don't want
to sleep forever. Also, the first place is made to use it.

---

Kirill Tkhai (2):
      net: Add rtnl_lock_killable()
      net: Use rtnl_lock_killable() in register_netdev()

 include/linux/rtnetlink.h |    1 +
 net/core/dev.c            |    3 ++-
 net/core/rtnetlink.c      |    6 ++++++
 3 files changed, 9 insertions(+), 1 deletion(-)

--
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox