* Re: [PATCH net-next 4/4] net: dsa: add port enable and disable helpers
From: Florian Fainelli @ 2017-09-22 17:16 UTC (permalink / raw)
To: Vivien Didelot, netdev; +Cc: linux-kernel, kernel, David S. Miller, Andrew Lunn
In-Reply-To: <20170922161753.19563-5-vivien.didelot@savoirfairelinux.com>
On 09/22/2017 09:17 AM, Vivien Didelot wrote:
> Provide dsa_port_enable and dsa_port_disable helpers to respectively
> enable and disable a switch port. This makes the dsa_port_set_state_now
> helper static.
>
> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
--
Florian
^ permalink raw reply
* Re: [PATCH,v2,net-next 2/2] tun: enable napi_gro_frags() for TUN/TAP driver
From: Petar Penkov @ 2017-09-22 17:48 UTC (permalink / raw)
To: Mahesh Bandewar (महेश बंडेवार)
Cc: Willem de Bruijn, Network Development, Eric Dumazet,
Willem de Bruijn, David Miller, Petar Bozhidarov Penkov
In-Reply-To: <CAF2d9jhng1-jQypPqA1XdQtocBQ9ayJYfGa1UGGvBhGP=CXoaA@mail.gmail.com>
On Fri, Sep 22, 2017 at 9:51 AM, Mahesh Bandewar (महेश बंडेवार)
<maheshb@google.com> wrote:
> On Fri, Sep 22, 2017 at 7:06 AM, Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>>> @@ -2061,6 +2174,9 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
>>> if (tfile->detached)
>>> return -EINVAL;
>>>
>>> + if ((ifr->ifr_flags & IFF_NAPI_FRAGS) && !capable(CAP_NET_ADMIN))
>>> + return -EPERM;
>>> +
>>
>> This should perhaps be moved into the !dev branch, directly below the
>> ns_capable check.
>>
> Hmm, does that mean fail only on creation but allow to attach if
> exists? That would be wrong, isn't it? Correct me if I'm wrong but we
> want to prevent both these scenarios if user does not have sufficient
> privileges (i.e. NET_ADMIN in init-ns).
>
My understanding is we want to protect both scenarios.
>>> dev = __dev_get_by_name(net, ifr->ifr_name);
>>> if (dev) {
>>> if (ifr->ifr_flags & IFF_TUN_EXCL)
>>> @@ -2185,6 +2301,9 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
>>> tun->flags = (tun->flags & ~TUN_FEATURES) |
>>> (ifr->ifr_flags & TUN_FEATURES);
>>>
>>> + if (!(tun->flags & IFF_NAPI) || (tun->flags & TUN_TYPE_MASK) != IFF_TAP)
>>> + tun->flags = tun->flags & ~IFF_NAPI_FRAGS;
>>> +
>>
>> Similarly, this check only need to be performed in that branch.
>> Instead of reverting to non-frags mode, a tun_set_iff with the wrong
>> set of flags should probably fail hard.
> Yes, agree, wrong set of flags should fail hard and probably be done
> before attach or open, no?
Agreed, in v3 I will push this check before the conditional so both
branches can be rejected with EINVAL.
^ permalink raw reply
* Re: [PATCH,v2,net-next 2/2] tun: enable napi_gro_frags() for TUN/TAP driver
From: Willem de Bruijn @ 2017-09-22 17:50 UTC (permalink / raw)
To: Petar Penkov
Cc: Mahesh Bandewar (महेश बंडेवार),
Network Development, Eric Dumazet, Willem de Bruijn, David Miller,
Petar Bozhidarov Penkov
In-Reply-To: <CA+DcSEidWRHa1ovXfPOfG3OwQ=NASMt9NbTGpk=-NdC8BJQKhw@mail.gmail.com>
On Fri, Sep 22, 2017 at 1:48 PM, Petar Penkov <peterpenkov96@gmail.com> wrote:
> On Fri, Sep 22, 2017 at 9:51 AM, Mahesh Bandewar (महेश बंडेवार)
> <maheshb@google.com> wrote:
>> On Fri, Sep 22, 2017 at 7:06 AM, Willem de Bruijn
>> <willemdebruijn.kernel@gmail.com> wrote:
>>>> @@ -2061,6 +2174,9 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
>>>> if (tfile->detached)
>>>> return -EINVAL;
>>>>
>>>> + if ((ifr->ifr_flags & IFF_NAPI_FRAGS) && !capable(CAP_NET_ADMIN))
>>>> + return -EPERM;
>>>> +
>>>
>>> This should perhaps be moved into the !dev branch, directly below the
>>> ns_capable check.
>>>
>> Hmm, does that mean fail only on creation but allow to attach if
>> exists? That would be wrong, isn't it? Correct me if I'm wrong but we
>> want to prevent both these scenarios if user does not have sufficient
>> privileges (i.e. NET_ADMIN in init-ns).
Ok.
>>
> My understanding is we want to protect both scenarios.
>>>> dev = __dev_get_by_name(net, ifr->ifr_name);
>>>> if (dev) {
>>>> if (ifr->ifr_flags & IFF_TUN_EXCL)
>>>> @@ -2185,6 +2301,9 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
>>>> tun->flags = (tun->flags & ~TUN_FEATURES) |
>>>> (ifr->ifr_flags & TUN_FEATURES);
>>>>
>>>> + if (!(tun->flags & IFF_NAPI) || (tun->flags & TUN_TYPE_MASK) != IFF_TAP)
>>>> + tun->flags = tun->flags & ~IFF_NAPI_FRAGS;
>>>> +
>>>
>>> Similarly, this check only need to be performed in that branch.
>>> Instead of reverting to non-frags mode, a tun_set_iff with the wrong
>>> set of flags should probably fail hard.
>> Yes, agree, wrong set of flags should fail hard and probably be done
>> before attach or open, no?
> Agreed, in v3 I will push this check before the conditional so both
> branches can be rejected with EINVAL.
Sounds great.
^ permalink raw reply
* Re: [PATCH] Add a driver for Renesas uPD60620 and uPD60620A PHYs
From: Andrew Lunn @ 2017-09-22 17:59 UTC (permalink / raw)
To: Bernd Edlinger; +Cc: netdev@vger.kernel.org, Florian Fainelli
In-Reply-To: <AM5PR0701MB26574EBBE4E7CACC9173B345E4670@AM5PR0701MB2657.eurprd07.prod.outlook.com>
On Fri, Sep 22, 2017 at 05:08:45PM +0000, Bernd Edlinger wrote:
> Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
> ---
> drivers/net/phy/Kconfig | 5 +
> drivers/net/phy/Makefile | 1 +
> drivers/net/phy/uPD60620.c | 226
> +++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 232 insertions(+)
> create mode 100644 drivers/net/phy/uPD60620.c
>
> diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
> index a9d16a3..25089f0 100644
> --- a/drivers/net/phy/Kconfig
> +++ b/drivers/net/phy/Kconfig
> @@ -287,6 +287,11 @@ config DP83867_PHY
> ---help---
> Currently supports the DP83867 PHY.
>
> +config RENESAS_PHY
> + tristate "Driver for Renesas PHYs"
> + ---help---
> + Supports the uPD60620 and uPD60620A PHYs.
> +
Hi Bernd
Please call this "Reneseas PHYs" and place in it alphabetical order.
> config FIXED_PHY
> tristate "MDIO Bus/PHY emulation with fixed speed/link PHYs"
> depends on PHYLIB
> diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
> index 416df92..1404ad3 100644
> --- a/drivers/net/phy/Makefile
> +++ b/drivers/net/phy/Makefile
> @@ -72,6 +72,7 @@ obj-$(CONFIG_MICROSEMI_PHY) += mscc.o
> obj-$(CONFIG_NATIONAL_PHY) += national.o
> obj-$(CONFIG_QSEMI_PHY) += qsemi.o
> obj-$(CONFIG_REALTEK_PHY) += realtek.o
> +obj-$(CONFIG_RENESAS_PHY) += uPD60620.o
> obj-$(CONFIG_ROCKCHIP_PHY) += rockchip.o
> obj-$(CONFIG_SMSC_PHY) += smsc.o
> obj-$(CONFIG_STE10XP) += ste10Xp.o
> diff --git a/drivers/net/phy/uPD60620.c b/drivers/net/phy/uPD60620.c
> new file mode 100644
> index 0000000..b3d900c
> --- /dev/null
> +++ b/drivers/net/phy/uPD60620.c
> @@ -0,0 +1,226 @@
> +/*
> + * Driver for the Renesas PHY uPD60620.
> + *
> + * Copyright (C) 2015 Softing Industrial Automation GmbH
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/phy.h>
> +
> +#define UPD60620_PHY_ID 0xb8242824
> +
> +/* Extended Registers and values */
> +/* PHY Special Control/Status */
> +#define PHY_PHYSCR 0x1F /* PHY.31 */
> +#define PHY_PHYSCR_10MB 0x0004 /* PHY speed = 10mb */
> +#define PHY_PHYSCR_100MB 0x0008 /* PHY speed = 100mb */
> +#define PHY_PHYSCR_DUPLEX 0x0010 /* PHY Duplex */
> +#define PHY_PHYSCR_RSVD5 0x0020 /* Reserved Bit 5 */
> +#define PHY_PHYSCR_MIIMOD 0x0040 /* Enable 4B5B MII mode */
Are any of these comments actually useful. It seems like the defines
are pretty obvious.
> +#define PHY_PHYSCR_RSVD7 0x0080 /* Reserved Bit 7 */
> +#define PHY_PHYSCR_RSVD8 0x0100 /* Reserved Bit 8 */
> +#define PHY_PHYSCR_RSVD9 0x0200 /* Reserved Bit 9 */
> +#define PHY_PHYSCR_RSVD10 0x0400 /* Reserved Bit 10 */
> +#define PHY_PHYSCR_RSVD11 0x0800 /* Reserved Bit 11 */
> +#define PHY_PHYSCR_ANDONE 0x1000 /* Auto negotiation done */
> +#define PHY_PHYSCR_RSVD13 0x2000 /* Reserved Bit 13 */
> +#define PHY_PHYSCR_RSVD14 0x4000 /* Reserved Bit 14 */
> +#define PHY_PHYSCR_RSVD15 0x8000 /* Reserved Bit 15 */
It looks like the only register you use is SCR and SPM. Maybe delete
all the rest? Or do you plan to add more features making use of these
registers?
> +/* Init PHY */
> +
> +static int upd60620_config_init(struct phy_device *phydev)
> +{
> + /* Enable support for passive HUBs (could be a strap option) */
> + /* PHYMODE: All speeds, HD in parallel detect */
> + return phy_write(phydev, PHY_SPM, 0x0180 | phydev->mdio.addr);
> +}
> +
> +/* Get PHY status from common registers */
> +
> +static int upd60620_read_status(struct phy_device *phydev)
> +{
> + int phy_state;
> +
> + /* Read negotiated state */
> + phy_state = phy_read(phydev, MII_BMSR);
> + if (phy_state < 0)
> + return phy_state;
> +
> + phydev->link = 0;
> + phydev->lp_advertising = 0;
> + phydev->pause = 0;
> + phydev->asym_pause = 0;
> +
> + if (phy_state & BMSR_ANEGCOMPLETE) {
It is worth comparing this against genphy_read_status() which is the
reference implementation. You would normally check if auto negotiation
is enabled, not if it has completed. If it is enabled you read the
current negotiated state, even if it is not completed.
> + phy_state = phy_read(phydev, PHY_PHYSCR);
> + if (phy_state < 0)
> + return phy_state;
> +
> + if (phy_state & (PHY_PHYSCR_10MB | PHY_PHYSCR_100MB)) {
> + phydev->link = 1;
> + phydev->speed = SPEED_10;
> + phydev->duplex = DUPLEX_HALF;
> +
> + if (phy_state & PHY_PHYSCR_100MB)
> + phydev->speed = SPEED_100;
> + if (phy_state & PHY_PHYSCR_DUPLEX)
> + phydev->duplex = DUPLEX_FULL;
> +
> + phy_state = phy_read(phydev, MII_LPA);
> + if (phy_state < 0)
> + return phy_state;
> +
> + phydev->lp_advertising
> + = mii_lpa_to_ethtool_lpa_t(phy_state);
> +
> + if (phydev->duplex == DUPLEX_FULL) {
> + if (phy_state & LPA_PAUSE_CAP)
> + phydev->pause = 1;
> + if (phy_state & LPA_PAUSE_ASYM)
> + phydev->asym_pause = 1;
> + }
> + }
> + } else if (phy_state & BMSR_LSTATUS) {
The else clause is then for a fixed configuration. Since all you are
looking at is BMCR, you can probably just cut/paste from
genphy_read_status().
> + phy_state = phy_read(phydev, MII_BMCR);
> + if (phy_state < 0)
> + return phy_state;
> +
> + if (!(phy_state & BMCR_ANENABLE)) {
> + phydev->link = 1;
> + phydev->speed = SPEED_10;
> + phydev->duplex = DUPLEX_HALF;
> +
> + if (phy_state & BMCR_SPEED100)
> + phydev->speed = SPEED_100;
> + if (phy_state & BMCR_FULLDPLX)
> + phydev->duplex = DUPLEX_FULL;
> + }
> + }
> + return 0;
> +}
Andrew
^ permalink raw reply
* Re: [PATCH,v2,net-next 1/2] tun: enable NAPI for TUN/TAP driver
From: Willem de Bruijn @ 2017-09-22 18:03 UTC (permalink / raw)
To: Mahesh Bandewar (महेश बंडेवार)
Cc: Petar Penkov, linux-netdev, Eric Dumazet, Willem de Bruijn,
David Miller, Petar Bozhidarov Penkov
In-Reply-To: <CAF2d9jh=H1O4JR3u=Rs3ODuQRF-Qp2wzzhAVf3PQ_f_Ox98eKQ@mail.gmail.com>
On Fri, Sep 22, 2017 at 1:11 PM, Mahesh Bandewar (महेश बंडेवार)
<maheshb@google.com> wrote:
>> #ifdef CONFIG_TUN_VNET_CROSS_LE
>> static inline bool tun_legacy_is_little_endian(struct tun_struct *tun)
>> {
>> @@ -541,6 +604,11 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
>>
>> tun = rtnl_dereference(tfile->tun);
>>
>> + if (tun && clean) {
>> + tun_napi_disable(tun, tfile);
> are we missing synchronize_net() separating disable and del calls?
That is not needed here. napi_disable has its own mechanism for waiting
until a napi struct is no longer run. netif_napi_del will call synchronize_net
if needed. These two calls are made one after the other in quite a few drivers.
^ permalink raw reply
* Re: [Intel-wired-lan] [PATCH][V3] e1000: avoid null pointer dereference on invalid stat type
From: Alexander Duyck @ 2017-09-22 18:09 UTC (permalink / raw)
To: Colin King
Cc: Jeff Kirsher, intel-wired-lan, Netdev, kernel-janitors,
linux-kernel@vger.kernel.org
In-Reply-To: <20170922171348.17630-1-colin.king@canonical.com>
On Fri, Sep 22, 2017 at 10:13 AM, Colin King <colin.king@canonical.com> wrote:
> From: Colin Ian King <colin.king@canonical.com>
>
> Currently if the stat type is invalid then data[i] is being set
> either by dereferencing a null pointer p, or it is reading from
> an incorrect previous location if we had a valid stat type
> previously. Fix this by skipping over the read of p on an invalid
> stat type.
>
> Detected by CoverityScan, CID#113385 ("Explicit null dereferenced")
>
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
Looks good to me.
Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
> drivers/net/ethernet/intel/e1000/e1000_ethtool.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
> index ec8aa4562cc9..3b3983a1ffbb 100644
> --- a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
> +++ b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
> @@ -1824,11 +1824,12 @@ static void e1000_get_ethtool_stats(struct net_device *netdev,
> {
> struct e1000_adapter *adapter = netdev_priv(netdev);
> int i;
> - char *p = NULL;
> const struct e1000_stats *stat = e1000_gstrings_stats;
>
> e1000_update_stats(adapter);
> - for (i = 0; i < E1000_GLOBAL_STATS_LEN; i++) {
> + for (i = 0; i < E1000_GLOBAL_STATS_LEN; i++, stat++) {
> + char *p;
> +
> switch (stat->type) {
> case NETDEV_STATS:
> p = (char *)netdev + stat->stat_offset;
> @@ -1839,15 +1840,13 @@ static void e1000_get_ethtool_stats(struct net_device *netdev,
> default:
> WARN_ONCE(1, "Invalid E1000 stat type: %u index %d\n",
> stat->type, i);
> - break;
> + continue;
> }
>
> if (stat->sizeof_stat == sizeof(u64))
> data[i] = *(u64 *)p;
> else
> data[i] = *(u32 *)p;
> -
> - stat++;
> }
> /* BUG_ON(i != E1000_STATS_LEN); */
> }
> --
> 2.14.1
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply
* Re: [PATCH net-next 2/4] net: dsa: remove phy arg from port enable/disable
From: Vivien Didelot @ 2017-09-22 18:12 UTC (permalink / raw)
To: Florian Fainelli, netdev
Cc: linux-kernel, kernel, David S. Miller, Andrew Lunn
In-Reply-To: <f8d74c3c-bd7f-8a84-2d57-c37250ff25f8@gmail.com>
Hi Florian,
Florian Fainelli <f.fainelli@gmail.com> writes:
> On 09/22/2017 09:17 AM, Vivien Didelot wrote:
>> The .port_enable and .port_disable functions are meant to deal with the
>> switch ports only, and no driver is using the phy argument anyway.
>> Remove it.
>
> I don't think this makes sense, there are perfectly legit reasons why a
> switch driver may have something to do with the PHY device attached to
> its per-port network interface, we should definitively keep that around,
> unless you think we should be accessing the PHY within the switch
> drivers by doing:
>
> struct phy_device *phydev = ds->ports[port].netdev->phydev?
bcm_sf2 is the only user for this phy argument right now. The reason I'm
doing this is because I prefer to discourage switch drivers to dig into
the phy device themselves while as you said there must be a cleaner
solution. This must be handled somehow elsewhere in the stack.
In the meantime, moving the PHY device up to the dsa_port structure is a
good solution, in order not to expose it in switch ops, but still make
it available to more complex drivers.
Do you know if netdev->phydev is usable? Why do DSA has its own copy in
dsa_slave_priv then?
I'll respin, thanks.
Vivien
^ permalink raw reply
* Re: [PATCH,v2,net-next 1/2] tun: enable NAPI for TUN/TAP driver
From: Mahesh Bandewar (महेश बंडेवार) @ 2017-09-22 18:12 UTC (permalink / raw)
To: Willem de Bruijn
Cc: Petar Penkov, linux-netdev, Eric Dumazet, Willem de Bruijn,
David Miller, Petar Bozhidarov Penkov
In-Reply-To: <CAF=yD-Ldc_K+MTd_wXNjvvQ-+UsjZOs+13irmRoGJzJmq5yfVA@mail.gmail.com>
On Fri, Sep 22, 2017 at 11:03 AM, Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
> On Fri, Sep 22, 2017 at 1:11 PM, Mahesh Bandewar (महेश बंडेवार)
> <maheshb@google.com> wrote:
>>> #ifdef CONFIG_TUN_VNET_CROSS_LE
>>> static inline bool tun_legacy_is_little_endian(struct tun_struct *tun)
>>> {
>>> @@ -541,6 +604,11 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
>>>
>>> tun = rtnl_dereference(tfile->tun);
>>>
>>> + if (tun && clean) {
>>> + tun_napi_disable(tun, tfile);
>> are we missing synchronize_net() separating disable and del calls?
>
> That is not needed here. napi_disable has its own mechanism for waiting
> until a napi struct is no longer run. netif_napi_del will call synchronize_net
> if needed.
Yes, that will do. Thanks.
> These two calls are made one after the other in quite a few drivers.
^ permalink raw reply
* Re: [PATCH net-next 2/4] net: dsa: remove phy arg from port enable/disable
From: Florian Fainelli @ 2017-09-22 18:23 UTC (permalink / raw)
To: Vivien Didelot, netdev; +Cc: linux-kernel, kernel, David S. Miller, Andrew Lunn
In-Reply-To: <87377eob5t.fsf@weeman.i-did-not-set--mail-host-address--so-tickle-me>
On 09/22/2017 11:12 AM, Vivien Didelot wrote:
> Hi Florian,
>
> Florian Fainelli <f.fainelli@gmail.com> writes:
>
>> On 09/22/2017 09:17 AM, Vivien Didelot wrote:
>>> The .port_enable and .port_disable functions are meant to deal with the
>>> switch ports only, and no driver is using the phy argument anyway.
>>> Remove it.
>>
>> I don't think this makes sense, there are perfectly legit reasons why a
>> switch driver may have something to do with the PHY device attached to
>> its per-port network interface, we should definitively keep that around,
>> unless you think we should be accessing the PHY within the switch
>> drivers by doing:
>>
>> struct phy_device *phydev = ds->ports[port].netdev->phydev?
>
> bcm_sf2 is the only user for this phy argument right now. The reason I'm
> doing this is because I prefer to discourage switch drivers to dig into
> the phy device themselves while as you said there must be a cleaner
> solution. This must be handled somehow elsewhere in the stack.
The current approach of passing the phy_device reference as an argument
is certainly a cleaner way then. The port_enable caller can provide the
correct phy_device and that lifts the switch driver from having to dig
it itself from its per-port netdev.
>
> In the meantime, moving the PHY device up to the dsa_port structure is a
> good solution, in order not to expose it in switch ops, but still make
> it available to more complex drivers.
>
> Do you know if netdev->phydev is usable? Why do DSA has its own copy in
> dsa_slave_priv then?
Historical reasons mostly. Considering the complexity of
dsa_slave_phy_setup(), I would certainly be extremely careful in
changing any of this, the potential for breakage is pretty big. At first
glance, I would say that this is a safe conversion to do, and I can test
this on the HW I have here anyway.
--
Florian
^ permalink raw reply
* [PATCH] r8152: add Linksys USB3GIGV1 id
From: Grant Grundler @ 2017-09-22 19:06 UTC (permalink / raw)
To: Hayes Wang
Cc: linux-usb, David S . Miller, LKML, netdev-u79uwXL29TY76Z2rM5mHXA,
Grant Grundler
This Linksys dongle by default comes up in cdc_ether mode.
This patch allows r8152 to claim the device:
Bus 002 Device 002: ID 13b1:0041 Linksys
Signed-off-by: Grant Grundler <grundler-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
---
drivers/net/usb/r8152.c | 2 ++
1 file changed, 2 insertions(+)
This was tested on chromeos-3.14, chromeos-3.18, and chromeos-4.4 kernels
with a mix of ARM/x86-64 systems.
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index ceb78e2ea4f0..941ece08ba78 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -613,6 +613,7 @@ enum rtl8152_flags {
#define VENDOR_ID_MICROSOFT 0x045e
#define VENDOR_ID_SAMSUNG 0x04e8
#define VENDOR_ID_LENOVO 0x17ef
+#define VENDOR_ID_LINKSYS 0x13b1
#define VENDOR_ID_NVIDIA 0x0955
#define MCU_TYPE_PLA 0x0100
@@ -5316,6 +5317,7 @@ static const struct usb_device_id rtl8152_table[] = {
{REALTEK_USB_DEVICE(VENDOR_ID_LENOVO, 0x7205)},
{REALTEK_USB_DEVICE(VENDOR_ID_LENOVO, 0x720c)},
{REALTEK_USB_DEVICE(VENDOR_ID_LENOVO, 0x7214)},
+ {REALTEK_USB_DEVICE(VENDOR_ID_LINKSYS, 0x0041)},
{REALTEK_USB_DEVICE(VENDOR_ID_NVIDIA, 0x09ff)},
{}
};
--
2.14.1.821.g8fa685d3b7-goog
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* Re: [PATCH net-next 2/4] net: dsa: remove phy arg from port enable/disable
From: Andrew Lunn @ 2017-09-22 19:11 UTC (permalink / raw)
To: Florian Fainelli
Cc: Vivien Didelot, netdev, linux-kernel, kernel, David S. Miller
In-Reply-To: <6d7799bb-2b33-0696-1805-63cea5e52667@gmail.com>
> Historical reasons mostly. Considering the complexity of
> dsa_slave_phy_setup(), I would certainly be extremely careful in
> changing any of this, the potential for breakage is pretty big.
Yes, i took a look at this, wondering how to convert to phylink. I
went away and got a stiff drink :-)
Andrew
^ permalink raw reply
* [PATCH net-next v2 0/3] net: dsa: use slave device phydev
From: Vivien Didelot @ 2017-09-22 19:40 UTC (permalink / raw)
To: netdev
Cc: linux-kernel, kernel, David S. Miller, Florian Fainelli,
Andrew Lunn, Vivien Didelot
This patchset removes the private phy_device in favor of the one
provided by the slave net_device, makes slave open and close symmetrical
and finally provides helpers for enabling or disabling a DSA port.
Changes in v2:
- do not remove the phy argument from port enable/disable
Vivien Didelot (3):
net: dsa: use slave device phydev
net: dsa: make slave close symmetrical to open
net: dsa: add port enable and disable helpers
net/dsa/dsa_priv.h | 3 +-
net/dsa/port.c | 31 +++++++++++-
net/dsa/slave.c | 143 +++++++++++++++++++++++------------------------------
3 files changed, 94 insertions(+), 83 deletions(-)
--
2.14.1
^ permalink raw reply
* [PATCH net-next v2 1/3] net: dsa: use slave device phydev
From: Vivien Didelot @ 2017-09-22 19:40 UTC (permalink / raw)
To: netdev
Cc: linux-kernel, kernel, David S. Miller, Florian Fainelli,
Andrew Lunn, Vivien Didelot
In-Reply-To: <20170922194045.18814-1-vivien.didelot@savoirfairelinux.com>
There is no need to store a phy_device in dsa_slave_priv since
net_device already provides one. Simply s/p->phy/dev->phydev/.
While at it, return -ENODEV when it is NULL instead of -EOPNOTSUPP.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
---
net/dsa/slave.c | 126 ++++++++++++++++++++++++++------------------------------
1 file changed, 58 insertions(+), 68 deletions(-)
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 02ace7d462c4..3760472bf41d 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -99,15 +99,15 @@ static int dsa_slave_open(struct net_device *dev)
}
if (ds->ops->port_enable) {
- err = ds->ops->port_enable(ds, p->dp->index, p->phy);
+ err = ds->ops->port_enable(ds, p->dp->index, dev->phydev);
if (err)
goto clear_promisc;
}
dsa_port_set_state_now(p->dp, stp_state);
- if (p->phy)
- phy_start(p->phy);
+ if (dev->phydev)
+ phy_start(dev->phydev);
return 0;
@@ -130,8 +130,8 @@ static int dsa_slave_close(struct net_device *dev)
struct net_device *master = dsa_master_netdev(p);
struct dsa_switch *ds = p->dp->ds;
- if (p->phy)
- phy_stop(p->phy);
+ if (dev->phydev)
+ phy_stop(dev->phydev);
dev_mc_unsync(master, dev);
dev_uc_unsync(master, dev);
@@ -144,7 +144,7 @@ static int dsa_slave_close(struct net_device *dev)
dev_uc_del(master, dev->dev_addr);
if (ds->ops->port_disable)
- ds->ops->port_disable(ds, p->dp->index, p->phy);
+ ds->ops->port_disable(ds, p->dp->index, dev->phydev);
dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
@@ -273,12 +273,10 @@ dsa_slave_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb,
static int dsa_slave_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
{
- struct dsa_slave_priv *p = netdev_priv(dev);
+ if (!dev->phydev)
+ return -ENODEV;
- if (p->phy != NULL)
- return phy_mii_ioctl(p->phy, ifr, cmd);
-
- return -EOPNOTSUPP;
+ return phy_mii_ioctl(dev->phydev, ifr, cmd);
}
static int dsa_slave_port_attr_set(struct net_device *dev,
@@ -435,12 +433,10 @@ static int
dsa_slave_get_link_ksettings(struct net_device *dev,
struct ethtool_link_ksettings *cmd)
{
- struct dsa_slave_priv *p = netdev_priv(dev);
+ if (!dev->phydev)
+ return -ENODEV;
- if (!p->phy)
- return -EOPNOTSUPP;
-
- phy_ethtool_ksettings_get(p->phy, cmd);
+ phy_ethtool_ksettings_get(dev->phydev, cmd);
return 0;
}
@@ -449,12 +445,10 @@ static int
dsa_slave_set_link_ksettings(struct net_device *dev,
const struct ethtool_link_ksettings *cmd)
{
- struct dsa_slave_priv *p = netdev_priv(dev);
+ if (!dev->phydev)
+ return -ENODEV;
- if (p->phy != NULL)
- return phy_ethtool_ksettings_set(p->phy, cmd);
-
- return -EOPNOTSUPP;
+ return phy_ethtool_ksettings_set(dev->phydev, cmd);
}
static void dsa_slave_get_drvinfo(struct net_device *dev,
@@ -488,24 +482,20 @@ dsa_slave_get_regs(struct net_device *dev, struct ethtool_regs *regs, void *_p)
static int dsa_slave_nway_reset(struct net_device *dev)
{
- struct dsa_slave_priv *p = netdev_priv(dev);
+ if (!dev->phydev)
+ return -ENODEV;
- if (p->phy != NULL)
- return genphy_restart_aneg(p->phy);
-
- return -EOPNOTSUPP;
+ return genphy_restart_aneg(dev->phydev);
}
static u32 dsa_slave_get_link(struct net_device *dev)
{
- struct dsa_slave_priv *p = netdev_priv(dev);
+ if (!dev->phydev)
+ return -ENODEV;
- if (p->phy != NULL) {
- genphy_update_link(p->phy);
- return p->phy->link;
- }
+ genphy_update_link(dev->phydev);
- return -EOPNOTSUPP;
+ return dev->phydev->link;
}
static int dsa_slave_get_eeprom_len(struct net_device *dev)
@@ -640,7 +630,7 @@ static int dsa_slave_set_eee(struct net_device *dev, struct ethtool_eee *e)
int ret;
/* Port's PHY and MAC both need to be EEE capable */
- if (!p->phy)
+ if (!dev->phydev)
return -ENODEV;
if (!ds->ops->set_mac_eee)
@@ -651,12 +641,12 @@ static int dsa_slave_set_eee(struct net_device *dev, struct ethtool_eee *e)
return ret;
if (e->eee_enabled) {
- ret = phy_init_eee(p->phy, 0);
+ ret = phy_init_eee(dev->phydev, 0);
if (ret)
return ret;
}
- return phy_ethtool_set_eee(p->phy, e);
+ return phy_ethtool_set_eee(dev->phydev, e);
}
static int dsa_slave_get_eee(struct net_device *dev, struct ethtool_eee *e)
@@ -666,7 +656,7 @@ static int dsa_slave_get_eee(struct net_device *dev, struct ethtool_eee *e)
int ret;
/* Port's PHY and MAC both need to be EEE capable */
- if (!p->phy)
+ if (!dev->phydev)
return -ENODEV;
if (!ds->ops->get_mac_eee)
@@ -676,7 +666,7 @@ static int dsa_slave_get_eee(struct net_device *dev, struct ethtool_eee *e)
if (ret)
return ret;
- return phy_ethtool_get_eee(p->phy, e);
+ return phy_ethtool_get_eee(dev->phydev, e);
}
#ifdef CONFIG_NET_POLL_CONTROLLER
@@ -985,26 +975,26 @@ static void dsa_slave_adjust_link(struct net_device *dev)
struct dsa_switch *ds = p->dp->ds;
unsigned int status_changed = 0;
- if (p->old_link != p->phy->link) {
+ if (p->old_link != dev->phydev->link) {
status_changed = 1;
- p->old_link = p->phy->link;
+ p->old_link = dev->phydev->link;
}
- if (p->old_duplex != p->phy->duplex) {
+ if (p->old_duplex != dev->phydev->duplex) {
status_changed = 1;
- p->old_duplex = p->phy->duplex;
+ p->old_duplex = dev->phydev->duplex;
}
- if (p->old_pause != p->phy->pause) {
+ if (p->old_pause != dev->phydev->pause) {
status_changed = 1;
- p->old_pause = p->phy->pause;
+ p->old_pause = dev->phydev->pause;
}
if (ds->ops->adjust_link && status_changed)
- ds->ops->adjust_link(ds, p->dp->index, p->phy);
+ ds->ops->adjust_link(ds, p->dp->index, dev->phydev);
if (status_changed)
- phy_print_status(p->phy);
+ phy_print_status(dev->phydev);
}
static int dsa_slave_fixed_link_update(struct net_device *dev,
@@ -1029,17 +1019,18 @@ static int dsa_slave_phy_connect(struct net_device *slave_dev, int addr)
struct dsa_slave_priv *p = netdev_priv(slave_dev);
struct dsa_switch *ds = p->dp->ds;
- p->phy = mdiobus_get_phy(ds->slave_mii_bus, addr);
- if (!p->phy) {
+ slave_dev->phydev = mdiobus_get_phy(ds->slave_mii_bus, addr);
+ if (!slave_dev->phydev) {
netdev_err(slave_dev, "no phy at %d\n", addr);
return -ENODEV;
}
/* Use already configured phy mode */
if (p->phy_interface == PHY_INTERFACE_MODE_NA)
- p->phy_interface = p->phy->interface;
- return phy_connect_direct(slave_dev, p->phy, dsa_slave_adjust_link,
- p->phy_interface);
+ p->phy_interface = slave_dev->phydev->interface;
+
+ return phy_connect_direct(slave_dev, slave_dev->phydev,
+ dsa_slave_adjust_link, p->phy_interface);
}
static int dsa_slave_phy_setup(struct net_device *slave_dev)
@@ -1091,22 +1082,23 @@ static int dsa_slave_phy_setup(struct net_device *slave_dev)
return ret;
}
} else {
- p->phy = of_phy_connect(slave_dev, phy_dn,
- dsa_slave_adjust_link,
- phy_flags,
- p->phy_interface);
+ slave_dev->phydev = of_phy_connect(slave_dev, phy_dn,
+ dsa_slave_adjust_link,
+ phy_flags,
+ p->phy_interface);
}
of_node_put(phy_dn);
}
- if (p->phy && phy_is_fixed)
- fixed_phy_set_link_update(p->phy, dsa_slave_fixed_link_update);
+ if (slave_dev->phydev && phy_is_fixed)
+ fixed_phy_set_link_update(slave_dev->phydev,
+ dsa_slave_fixed_link_update);
/* We could not connect to a designated PHY, so use the switch internal
* MDIO bus instead
*/
- if (!p->phy) {
+ if (!slave_dev->phydev) {
ret = dsa_slave_phy_connect(slave_dev, p->dp->index);
if (ret) {
netdev_err(slave_dev, "failed to connect to port %d: %d\n",
@@ -1117,7 +1109,7 @@ static int dsa_slave_phy_setup(struct net_device *slave_dev)
}
}
- phy_attached_info(p->phy);
+ phy_attached_info(slave_dev->phydev);
return 0;
}
@@ -1137,12 +1129,12 @@ int dsa_slave_suspend(struct net_device *slave_dev)
netif_device_detach(slave_dev);
- if (p->phy) {
- phy_stop(p->phy);
+ if (slave_dev->phydev) {
+ phy_stop(slave_dev->phydev);
p->old_pause = -1;
p->old_link = -1;
p->old_duplex = -1;
- phy_suspend(p->phy);
+ phy_suspend(slave_dev->phydev);
}
return 0;
@@ -1150,13 +1142,11 @@ int dsa_slave_suspend(struct net_device *slave_dev)
int dsa_slave_resume(struct net_device *slave_dev)
{
- struct dsa_slave_priv *p = netdev_priv(slave_dev);
-
netif_device_attach(slave_dev);
- if (p->phy) {
- phy_resume(p->phy);
- phy_start(p->phy);
+ if (slave_dev->phydev) {
+ phy_resume(slave_dev->phydev);
+ phy_start(slave_dev->phydev);
}
return 0;
@@ -1249,8 +1239,8 @@ void dsa_slave_destroy(struct net_device *slave_dev)
port_dn = p->dp->dn;
netif_carrier_off(slave_dev);
- if (p->phy) {
- phy_disconnect(p->phy);
+ if (slave_dev->phydev) {
+ phy_disconnect(slave_dev->phydev);
if (of_phy_is_fixed_link(port_dn))
of_phy_deregister_fixed_link(port_dn);
--
2.14.1
^ permalink raw reply related
* [PATCH net-next v2 2/3] net: dsa: make slave close symmetrical to open
From: Vivien Didelot @ 2017-09-22 19:40 UTC (permalink / raw)
To: netdev
Cc: linux-kernel, kernel, David S. Miller, Florian Fainelli,
Andrew Lunn, Vivien Didelot
In-Reply-To: <20170922194045.18814-1-vivien.didelot@savoirfairelinux.com>
The DSA slave open function configures the unicast MAC addresses on the
master device, enable the switch port, change its STP state, then start
the PHY device.
Make the close function symmetric, by first stopping the PHY device,
then changing the STP state, disabling the switch port and restore the
master device.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
---
net/dsa/slave.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 3760472bf41d..0aab29928152 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -133,6 +133,11 @@ static int dsa_slave_close(struct net_device *dev)
if (dev->phydev)
phy_stop(dev->phydev);
+ dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
+
+ if (ds->ops->port_disable)
+ ds->ops->port_disable(ds, p->dp->index, dev->phydev);
+
dev_mc_unsync(master, dev);
dev_uc_unsync(master, dev);
if (dev->flags & IFF_ALLMULTI)
@@ -143,11 +148,6 @@ static int dsa_slave_close(struct net_device *dev)
if (!ether_addr_equal(dev->dev_addr, master->dev_addr))
dev_uc_del(master, dev->dev_addr);
- if (ds->ops->port_disable)
- ds->ops->port_disable(ds, p->dp->index, dev->phydev);
-
- dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
-
return 0;
}
--
2.14.1
^ permalink raw reply related
* [PATCH net-next v2 3/3] net: dsa: add port enable and disable helpers
From: Vivien Didelot @ 2017-09-22 19:40 UTC (permalink / raw)
To: netdev
Cc: linux-kernel, kernel, David S. Miller, Florian Fainelli,
Andrew Lunn, Vivien Didelot
In-Reply-To: <20170922194045.18814-1-vivien.didelot@savoirfairelinux.com>
Provide dsa_port_enable and dsa_port_disable helpers to respectively
enable and disable a switch port. This makes the dsa_port_set_state_now
helper static.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
---
net/dsa/dsa_priv.h | 3 ++-
net/dsa/port.c | 31 ++++++++++++++++++++++++++++++-
net/dsa/slave.c | 19 +++++--------------
3 files changed, 37 insertions(+), 16 deletions(-)
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 9803952a5b40..0298a0f6a349 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -117,7 +117,8 @@ void dsa_master_ethtool_restore(struct net_device *dev);
/* port.c */
int dsa_port_set_state(struct dsa_port *dp, u8 state,
struct switchdev_trans *trans);
-void dsa_port_set_state_now(struct dsa_port *dp, u8 state);
+int dsa_port_enable(struct dsa_port *dp, struct phy_device *phy);
+void dsa_port_disable(struct dsa_port *dp, struct phy_device *phy);
int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br);
void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br);
int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
diff --git a/net/dsa/port.c b/net/dsa/port.c
index 76d43a82d397..72c8dbd3d3f2 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -56,7 +56,7 @@ int dsa_port_set_state(struct dsa_port *dp, u8 state,
return 0;
}
-void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
+static void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
{
int err;
@@ -65,6 +65,35 @@ void dsa_port_set_state_now(struct dsa_port *dp, u8 state)
pr_err("DSA: failed to set STP state %u (%d)\n", state, err);
}
+int dsa_port_enable(struct dsa_port *dp, struct phy_device *phy)
+{
+ u8 stp_state = dp->bridge_dev ? BR_STATE_BLOCKING : BR_STATE_FORWARDING;
+ struct dsa_switch *ds = dp->ds;
+ int port = dp->index;
+ int err;
+
+ if (ds->ops->port_enable) {
+ err = ds->ops->port_enable(ds, port, phy);
+ if (err)
+ return err;
+ }
+
+ dsa_port_set_state_now(dp, stp_state);
+
+ return 0;
+}
+
+void dsa_port_disable(struct dsa_port *dp, struct phy_device *phy)
+{
+ struct dsa_switch *ds = dp->ds;
+ int port = dp->index;
+
+ dsa_port_set_state_now(dp, BR_STATE_DISABLED);
+
+ if (ds->ops->port_disable)
+ ds->ops->port_disable(ds, port, phy);
+}
+
int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br)
{
struct dsa_notifier_bridge_info info = {
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 0aab29928152..4ea1c6eb0da8 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -73,9 +73,7 @@ static int dsa_slave_open(struct net_device *dev)
{
struct dsa_slave_priv *p = netdev_priv(dev);
struct dsa_port *dp = p->dp;
- struct dsa_switch *ds = dp->ds;
struct net_device *master = dsa_master_netdev(p);
- u8 stp_state = dp->bridge_dev ? BR_STATE_BLOCKING : BR_STATE_FORWARDING;
int err;
if (!(master->flags & IFF_UP))
@@ -98,13 +96,9 @@ static int dsa_slave_open(struct net_device *dev)
goto clear_allmulti;
}
- if (ds->ops->port_enable) {
- err = ds->ops->port_enable(ds, p->dp->index, dev->phydev);
- if (err)
- goto clear_promisc;
- }
-
- dsa_port_set_state_now(p->dp, stp_state);
+ err = dsa_port_enable(dp, dev->phydev);
+ if (err)
+ goto clear_promisc;
if (dev->phydev)
phy_start(dev->phydev);
@@ -128,15 +122,12 @@ static int dsa_slave_close(struct net_device *dev)
{
struct dsa_slave_priv *p = netdev_priv(dev);
struct net_device *master = dsa_master_netdev(p);
- struct dsa_switch *ds = p->dp->ds;
+ struct dsa_port *dp = p->dp;
if (dev->phydev)
phy_stop(dev->phydev);
- dsa_port_set_state_now(p->dp, BR_STATE_DISABLED);
-
- if (ds->ops->port_disable)
- ds->ops->port_disable(ds, p->dp->index, dev->phydev);
+ dsa_port_disable(dp, dev->phydev);
dev_mc_unsync(master, dev);
dev_uc_unsync(master, dev);
--
2.14.1
^ permalink raw reply related
* Re: [PATCH net-next v2 1/3] net: dsa: use slave device phydev
From: Florian Fainelli @ 2017-09-22 20:00 UTC (permalink / raw)
To: Vivien Didelot, netdev; +Cc: linux-kernel, kernel, David S. Miller, Andrew Lunn
In-Reply-To: <20170922194045.18814-2-vivien.didelot@savoirfairelinux.com>
On 09/22/2017 12:40 PM, Vivien Didelot wrote:
> There is no need to store a phy_device in dsa_slave_priv since
> net_device already provides one. Simply s/p->phy/dev->phydev/.
You can therefore remove the phy_device from dsa_slave_priv, see below
for more comments. I will have to regress test the heck out of this,
this should take a few hours.
>
> While at it, return -ENODEV when it is NULL instead of -EOPNOTSUPP.
>
> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
> ---
> static int dsa_slave_port_attr_set(struct net_device *dev,
> @@ -435,12 +433,10 @@ static int
> dsa_slave_get_link_ksettings(struct net_device *dev,
> struct ethtool_link_ksettings *cmd)
> {
> - struct dsa_slave_priv *p = netdev_priv(dev);
> + if (!dev->phydev)
> + return -ENODEV;
>
> - if (!p->phy)
> - return -EOPNOTSUPP;
> -
> - phy_ethtool_ksettings_get(p->phy, cmd);
> + phy_ethtool_ksettings_get(dev->phydev, cmd);
This can be replaced by phy_ethtool_get_link_ksettings()
>
> return 0;
> }
> @@ -449,12 +445,10 @@ static int
> dsa_slave_set_link_ksettings(struct net_device *dev,
> const struct ethtool_link_ksettings *cmd)
> {
> - struct dsa_slave_priv *p = netdev_priv(dev);
> + if (!dev->phydev)
> + return -ENODEV;
>
> - if (p->phy != NULL)
> - return phy_ethtool_ksettings_set(p->phy, cmd);
> -
> - return -EOPNOTSUPP;
> + return phy_ethtool_ksettings_set(dev->phydev, cmd);
> }
This can disappear and you can assign this ethtool operation to
phy_ethtool_set_link_ksettings()
>
> static void dsa_slave_get_drvinfo(struct net_device *dev,
> @@ -488,24 +482,20 @@ dsa_slave_get_regs(struct net_device *dev, struct ethtool_regs *regs, void *_p)
>
> static int dsa_slave_nway_reset(struct net_device *dev)
> {
> - struct dsa_slave_priv *p = netdev_priv(dev);
> + if (!dev->phydev)
> + return -ENODEV;
>
> - if (p->phy != NULL)
> - return genphy_restart_aneg(p->phy);
> -
> - return -EOPNOTSUPP;
> + return genphy_restart_aneg(dev->phydev);
> }
This can now disappear and you can use phy_ethtool_nway_reset() directly
in ethtool_ops
>
> static u32 dsa_slave_get_link(struct net_device *dev)
> {
> - struct dsa_slave_priv *p = netdev_priv(dev);
> + if (!dev->phydev)
> + return -ENODEV;
>
> - if (p->phy != NULL) {
> - genphy_update_link(p->phy);
> - return p->phy->link;
> - }
> + genphy_update_link(dev->phydev);
>
> - return -EOPNOTSUPP;
> + return dev->phydev->link;
> }
This should certainly be just ethtool_op_get_link(), not sure why we
kept that around here...
--
Florian
^ permalink raw reply
* Re: [PATCH net-next v2 1/3] net: dsa: use slave device phydev
From: Andrew Lunn @ 2017-09-22 20:03 UTC (permalink / raw)
To: Vivien Didelot
Cc: netdev, linux-kernel, kernel, David S. Miller, Florian Fainelli
In-Reply-To: <20170922194045.18814-2-vivien.didelot@savoirfairelinux.com>
On Fri, Sep 22, 2017 at 03:40:43PM -0400, Vivien Didelot wrote:
> There is no need to store a phy_device in dsa_slave_priv since
> net_device already provides one. Simply s/p->phy/dev->phydev/.
>
> While at it, return -ENODEV when it is NULL instead of -EOPNOTSUPP.
I just did a quick poll for calling phy_mii_ioctl(). ENODEV seems the
most popular, second to EINVAL. Marvell drivers all use EOPNOTSUPP.
> static int dsa_slave_nway_reset(struct net_device *dev)
> {
> - struct dsa_slave_priv *p = netdev_priv(dev);
> + if (!dev->phydev)
> + return -ENODEV;
>
> - if (p->phy != NULL)
> - return genphy_restart_aneg(p->phy);
> -
> - return -EOPNOTSUPP;
> + return genphy_restart_aneg(dev->phydev);
> }
It looks like this can now be replaced with phy_ethtool_nway_reset().
It could be there are other phy_ethtool_ helpers which can be used,
now that we have phydev in ndev.
Andrew
^ permalink raw reply
* Re: [PATCH net-next 2/2] net: dsa: lan9303: Add basic offloading of unicast traffic
From: Andrew Lunn @ 2017-09-22 20:08 UTC (permalink / raw)
To: Egil Hjelmeland; +Cc: vivien.didelot, f.fainelli, netdev, linux-kernel
In-Reply-To: <b27d2dd4-84e3-b930-a6fe-1e5b36a7d213@egil-hjelmeland.no>
> >I'm wondering how this is supposed to work. Please add a good comment
> >here, since the hardware is forcing you to do something odd.
> >
> >Maybe it would be a good idea to save the STP state in chip. And then
> >when chip->is_bridged is set true, change the state in the hardware to
> >the saved value?
> >
> >What happens when port 0 is added to the bridge, there is then a
> >minute pause and then port 1 is added? I would expect that as soon as
> >port 0 is added, the STP state machine for port 0 will start and move
> >into listening and then forwarding. Due to hardware limitations it
> >looks like you cannot do this. So what state is the hardware
> >effectively in? Blocking? Forwarding?
> >
> >Then port 1 is added. You can then can respect the states. port 1 will
> >do blocking->listening->forwarding, but what about port 0? The calls
> >won't get repeated? How does it transition to forwarding?
> >
> > Andrew
> >
>
> I see your point with the "minute pause" argument. Although a bit
> contrived use case, it is easy to fix by caching the STP state, as
> you suggest. So I can do that.
I don't think it is contrived. I've done bridge configuration by hand
for testing purposes. I've also set the forwarding delay to very small
values, so there is a clear race condition here.
> How does other DSA HW chips handle port separation? Knowing that
> could perhaps help me know what to look for.
They have better hardware :-)
Generally each port is totally independent. You can change the STP
state per port without restrictions.
Andrew
^ permalink raw reply
* Re: [PATCH net-next v2 1/3] net: dsa: use slave device phydev
From: Vivien Didelot @ 2017-09-22 20:11 UTC (permalink / raw)
To: Florian Fainelli, netdev
Cc: linux-kernel, kernel, David S. Miller, Andrew Lunn
In-Reply-To: <2262f266-6fb9-cce4-f83a-933dd41b9062@gmail.com>
Hi Florian,
Florian Fainelli <f.fainelli@gmail.com> writes:
> On 09/22/2017 12:40 PM, Vivien Didelot wrote:
>> There is no need to store a phy_device in dsa_slave_priv since
>> net_device already provides one. Simply s/p->phy/dev->phydev/.
>
> You can therefore remove the phy_device from dsa_slave_priv, see below
> for more comments. I will have to regress test the heck out of this,
> this should take a few hours.
OK, since this is a sensible topic, I will respin a v3 without this
patch, so that a future patchset can address your comments below and
also gives you time to test this one patch alone.
>> static int dsa_slave_port_attr_set(struct net_device *dev,
>> @@ -435,12 +433,10 @@ static int
>> dsa_slave_get_link_ksettings(struct net_device *dev,
>> struct ethtool_link_ksettings *cmd)
>> {
>> - struct dsa_slave_priv *p = netdev_priv(dev);
>> + if (!dev->phydev)
>> + return -ENODEV;
>>
>> - if (!p->phy)
>> - return -EOPNOTSUPP;
>> -
>> - phy_ethtool_ksettings_get(p->phy, cmd);
>> + phy_ethtool_ksettings_get(dev->phydev, cmd);
>
> This can be replaced by phy_ethtool_get_link_ksettings()
>
>>
>> return 0;
>> }
>> @@ -449,12 +445,10 @@ static int
>> dsa_slave_set_link_ksettings(struct net_device *dev,
>> const struct ethtool_link_ksettings *cmd)
>> {
>> - struct dsa_slave_priv *p = netdev_priv(dev);
>> + if (!dev->phydev)
>> + return -ENODEV;
>>
>> - if (p->phy != NULL)
>> - return phy_ethtool_ksettings_set(p->phy, cmd);
>> -
>> - return -EOPNOTSUPP;
>> + return phy_ethtool_ksettings_set(dev->phydev, cmd);
>> }
>
> This can disappear and you can assign this ethtool operation to
> phy_ethtool_set_link_ksettings()
>
>>
>> static void dsa_slave_get_drvinfo(struct net_device *dev,
>> @@ -488,24 +482,20 @@ dsa_slave_get_regs(struct net_device *dev, struct ethtool_regs *regs, void *_p)
>>
>> static int dsa_slave_nway_reset(struct net_device *dev)
>> {
>> - struct dsa_slave_priv *p = netdev_priv(dev);
>> + if (!dev->phydev)
>> + return -ENODEV;
>>
>> - if (p->phy != NULL)
>> - return genphy_restart_aneg(p->phy);
>> -
>> - return -EOPNOTSUPP;
>> + return genphy_restart_aneg(dev->phydev);
>> }
>
> This can now disappear and you can use phy_ethtool_nway_reset() directly
> in ethtool_ops
>
>>
>> static u32 dsa_slave_get_link(struct net_device *dev)
>> {
>> - struct dsa_slave_priv *p = netdev_priv(dev);
>> + if (!dev->phydev)
>> + return -ENODEV;
>>
>> - if (p->phy != NULL) {
>> - genphy_update_link(p->phy);
>> - return p->phy->link;
>> - }
>> + genphy_update_link(dev->phydev);
>>
>> - return -EOPNOTSUPP;
>> + return dev->phydev->link;
>> }
>
> This should certainly be just ethtool_op_get_link(), not sure why we
> kept that around here...
Haaa, good to read that! I wasn't sure about this, but with this patch
the slave phy ethtool functions seemed indeed quite generic...
Thanks,
Vivien
^ permalink raw reply
* [PATCH] [for 4.14] net: qcom/emac: specify the correct size when mapping a DMA buffer
From: Timur Tabi @ 2017-09-22 20:32 UTC (permalink / raw)
To: David S. Miller, netdev, stable; +Cc: timur
When mapping the RX DMA buffers, the driver was accidentally specifying
zero for the buffer length. Under normal circumstances, SWIOTLB does not
need to allocate a bounce buffer, so the address is just mapped without
checking the size field. This is why the error was not detected earlier.
Fixes: b9b17debc69d ("net: emac: emac gigabit ethernet controller driver")
Cc: stable@vger.kernel.org
Signed-off-by: Timur Tabi <timur@codeaurora.org>
---
drivers/net/ethernet/qualcomm/emac/emac-mac.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
index 0ea3ca09c689..3ed9033e56db 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
@@ -898,7 +898,8 @@ static void emac_mac_rx_descs_refill(struct emac_adapter *adpt,
curr_rxbuf->dma_addr =
dma_map_single(adpt->netdev->dev.parent, skb->data,
- curr_rxbuf->length, DMA_FROM_DEVICE);
+ adpt->rxbuf_size, DMA_FROM_DEVICE);
+
ret = dma_mapping_error(adpt->netdev->dev.parent,
curr_rxbuf->dma_addr);
if (ret) {
--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
^ permalink raw reply related
* [PATCH,v3,net-next 0/2] Improve code coverage of syzkaller
From: Petar Penkov @ 2017-09-22 20:49 UTC (permalink / raw)
To: netdev; +Cc: edumazet, maheshb, willemb, davem, ppenkov, Petar Penkov
This patch series is intended to improve code coverage of syzkaller on
the early receive path, specifically including flow dissector, GRO,
and GRO with frags parts of the networking stack. Syzkaller exercises
the stack through the TUN driver and this is therefore where changes
reside. Current coverage through netif_receive_skb() is limited as it
does not touch on any of the aforementioned code paths. Furthermore,
for full coverage, it is necessary to have more flexibility over the
linear and non-linear data of the skbs.
The following patches address this by providing the user(syzkaller)
with the ability to send via napi_gro_receive() and napi_gro_frags().
Additionally, syzkaller can specify how many fragments there are and
how much data per fragment there is. This is done by exploiting the
convenient structure of iovecs. Finally, this patch series adds
support for exercising the flow dissector during fuzzing.
The code path including napi_gro_receive() can be enabled via the
IFF_NAPI flag. The remainder of the changes in this patch series give
the user significantly more control over packets entering the kernel.
To avoid potential security vulnerabilities, hide the ability to send
custom skbs and the flow dissector code paths behind a
capable(CAP_NET_ADMIN) check to require special user privileges.
Changes since v2 based on feedback from Willem de Bruijn and Mahesh
Bandewar:
Patch 1/ No changes.
Patch 2/ Check if the preconditions for IFF_NAPI_FRAGS (IFF_NAPI and
IFF_TAP) are met before opening/attaching rather than after.
If they are not, change the behavior from discarding the
flag to rejecting the command with EINVAL.
Petar Penkov (2):
tun: enable NAPI for TUN/TAP driver
tun: enable napi_gro_frags() for TUN/TAP driver
drivers/net/tun.c | 261 +++++++++++++++++++++++++++++++++++++++++---
include/uapi/linux/if_tun.h | 2 +
2 files changed, 245 insertions(+), 18 deletions(-)
--
2.11.0
^ permalink raw reply
* [PATCH,v3,net-next 1/2] tun: enable NAPI for TUN/TAP driver
From: Petar Penkov @ 2017-09-22 20:49 UTC (permalink / raw)
To: netdev; +Cc: edumazet, maheshb, willemb, davem, ppenkov, Petar Penkov
In-Reply-To: <20170922204915.7889-1-peterpenkov96@gmail.com>
Changes TUN driver to use napi_gro_receive() upon receiving packets
rather than netif_rx_ni(). Adds flag IFF_NAPI that enables these
changes and operation is not affected if the flag is disabled. SKBs
are constructed upon packet arrival and are queued to be processed
later.
The new path was evaluated with a benchmark with the following setup:
Open two tap devices and a receiver thread that reads in a loop for
each device. Start one sender thread and pin all threads to different
CPUs. Send 1M minimum UDP packets to each device and measure sending
time for each of the sending methods:
napi_gro_receive(): 4.90s
netif_rx_ni(): 4.90s
netif_receive_skb(): 7.20s
Signed-off-by: Petar Penkov <peterpenkov96@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: davem@davemloft.net
Cc: ppenkov@stanford.edu
---
drivers/net/tun.c | 133 +++++++++++++++++++++++++++++++++++++++-----
include/uapi/linux/if_tun.h | 1 +
2 files changed, 119 insertions(+), 15 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 3c9985f29950..f16407242b18 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -121,7 +121,7 @@ do { \
#define TUN_VNET_BE 0x40000000
#define TUN_FEATURES (IFF_NO_PI | IFF_ONE_QUEUE | IFF_VNET_HDR | \
- IFF_MULTI_QUEUE)
+ IFF_MULTI_QUEUE | IFF_NAPI)
#define GOODCOPY_LEN 128
#define FLT_EXACT_COUNT 8
@@ -172,6 +172,7 @@ struct tun_file {
u16 queue_index;
unsigned int ifindex;
};
+ struct napi_struct napi;
struct list_head next;
struct tun_struct *detached;
struct skb_array tx_array;
@@ -229,6 +230,68 @@ struct tun_struct {
struct bpf_prog __rcu *xdp_prog;
};
+static int tun_napi_receive(struct napi_struct *napi, int budget)
+{
+ struct tun_file *tfile = container_of(napi, struct tun_file, napi);
+ struct sk_buff_head *queue = &tfile->sk.sk_write_queue;
+ struct sk_buff_head process_queue;
+ struct sk_buff *skb;
+ int received = 0;
+
+ __skb_queue_head_init(&process_queue);
+
+ spin_lock(&queue->lock);
+ skb_queue_splice_tail_init(queue, &process_queue);
+ spin_unlock(&queue->lock);
+
+ while (received < budget && (skb = __skb_dequeue(&process_queue))) {
+ napi_gro_receive(napi, skb);
+ ++received;
+ }
+
+ if (!skb_queue_empty(&process_queue)) {
+ spin_lock(&queue->lock);
+ skb_queue_splice(&process_queue, queue);
+ spin_unlock(&queue->lock);
+ }
+
+ return received;
+}
+
+static int tun_napi_poll(struct napi_struct *napi, int budget)
+{
+ unsigned int received;
+
+ received = tun_napi_receive(napi, budget);
+
+ if (received < budget)
+ napi_complete_done(napi, received);
+
+ return received;
+}
+
+static void tun_napi_init(struct tun_struct *tun, struct tun_file *tfile,
+ bool napi_en)
+{
+ if (napi_en) {
+ netif_napi_add(tun->dev, &tfile->napi, tun_napi_poll,
+ NAPI_POLL_WEIGHT);
+ napi_enable(&tfile->napi);
+ }
+}
+
+static void tun_napi_disable(struct tun_struct *tun, struct tun_file *tfile)
+{
+ if (tun->flags & IFF_NAPI)
+ napi_disable(&tfile->napi);
+}
+
+static void tun_napi_del(struct tun_struct *tun, struct tun_file *tfile)
+{
+ if (tun->flags & IFF_NAPI)
+ netif_napi_del(&tfile->napi);
+}
+
#ifdef CONFIG_TUN_VNET_CROSS_LE
static inline bool tun_legacy_is_little_endian(struct tun_struct *tun)
{
@@ -541,6 +604,11 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
tun = rtnl_dereference(tfile->tun);
+ if (tun && clean) {
+ tun_napi_disable(tun, tfile);
+ tun_napi_del(tun, tfile);
+ }
+
if (tun && !tfile->detached) {
u16 index = tfile->queue_index;
BUG_ON(index >= tun->numqueues);
@@ -598,6 +666,7 @@ static void tun_detach_all(struct net_device *dev)
for (i = 0; i < n; i++) {
tfile = rtnl_dereference(tun->tfiles[i]);
BUG_ON(!tfile);
+ tun_napi_disable(tun, tfile);
tfile->socket.sk->sk_shutdown = RCV_SHUTDOWN;
tfile->socket.sk->sk_data_ready(tfile->socket.sk);
RCU_INIT_POINTER(tfile->tun, NULL);
@@ -613,6 +682,7 @@ static void tun_detach_all(struct net_device *dev)
synchronize_net();
for (i = 0; i < n; i++) {
tfile = rtnl_dereference(tun->tfiles[i]);
+ tun_napi_del(tun, tfile);
/* Drop read queue */
tun_queue_purge(tfile);
sock_put(&tfile->sk);
@@ -631,7 +701,8 @@ static void tun_detach_all(struct net_device *dev)
module_put(THIS_MODULE);
}
-static int tun_attach(struct tun_struct *tun, struct file *file, bool skip_filter)
+static int tun_attach(struct tun_struct *tun, struct file *file,
+ bool skip_filter, bool napi)
{
struct tun_file *tfile = file->private_data;
struct net_device *dev = tun->dev;
@@ -677,10 +748,12 @@ static int tun_attach(struct tun_struct *tun, struct file *file, bool skip_filte
rcu_assign_pointer(tun->tfiles[tun->numqueues], tfile);
tun->numqueues++;
- if (tfile->detached)
+ if (tfile->detached) {
tun_enable_queue(tfile);
- else
+ } else {
sock_hold(&tfile->sk);
+ tun_napi_init(tun, tfile, napi);
+ }
tun_set_real_num_queues(tun);
@@ -956,13 +1029,28 @@ static void tun_poll_controller(struct net_device *dev)
* Tun only receives frames when:
* 1) the char device endpoint gets data from user space
* 2) the tun socket gets a sendmsg call from user space
- * Since both of those are synchronous operations, we are guaranteed
- * never to have pending data when we poll for it
- * so there is nothing to do here but return.
+ * If NAPI is not enabled, since both of those are synchronous
+ * operations, we are guaranteed never to have pending data when we poll
+ * for it so there is nothing to do here but return.
* We need this though so netpoll recognizes us as an interface that
* supports polling, which enables bridge devices in virt setups to
* still use netconsole
+ * If NAPI is enabled, however, we need to schedule polling for all
+ * queues.
*/
+ struct tun_struct *tun = netdev_priv(dev);
+
+ if (tun->flags & IFF_NAPI) {
+ struct tun_file *tfile;
+ int i;
+
+ rcu_read_lock();
+ for (i = 0; i < tun->numqueues; i++) {
+ tfile = rcu_dereference(tun->tfiles[i]);
+ napi_schedule(&tfile->napi);
+ }
+ rcu_read_unlock();
+ }
return;
}
#endif
@@ -1549,11 +1637,25 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
}
rxhash = __skb_get_hash_symmetric(skb);
-#ifndef CONFIG_4KSTACKS
- tun_rx_batched(tun, tfile, skb, more);
-#else
- netif_rx_ni(skb);
-#endif
+
+ if (tun->flags & IFF_NAPI) {
+ struct sk_buff_head *queue = &tfile->sk.sk_write_queue;
+ int queue_len;
+
+ spin_lock_bh(&queue->lock);
+ __skb_queue_tail(queue, skb);
+ queue_len = skb_queue_len(queue);
+ spin_unlock(&queue->lock);
+
+ if (!more || queue_len > NAPI_POLL_WEIGHT)
+ napi_schedule(&tfile->napi);
+
+ local_bh_enable();
+ } else if (!IS_ENABLED(CONFIG_4KSTACKS)) {
+ tun_rx_batched(tun, tfile, skb, more);
+ } else {
+ netif_rx_ni(skb);
+ }
stats = get_cpu_ptr(tun->pcpu_stats);
u64_stats_update_begin(&stats->syncp);
@@ -1980,7 +2082,8 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
if (err < 0)
return err;
- err = tun_attach(tun, file, ifr->ifr_flags & IFF_NOFILTER);
+ err = tun_attach(tun, file, ifr->ifr_flags & IFF_NOFILTER,
+ ifr->ifr_flags & IFF_NAPI);
if (err < 0)
return err;
@@ -2066,7 +2169,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
NETIF_F_HW_VLAN_STAG_TX);
INIT_LIST_HEAD(&tun->disabled);
- err = tun_attach(tun, file, false);
+ err = tun_attach(tun, file, false, ifr->ifr_flags & IFF_NAPI);
if (err < 0)
goto err_free_flow;
@@ -2216,7 +2319,7 @@ static int tun_set_queue(struct file *file, struct ifreq *ifr)
ret = security_tun_dev_attach_queue(tun->security);
if (ret < 0)
goto unlock;
- ret = tun_attach(tun, file, false);
+ ret = tun_attach(tun, file, false, tun->flags & IFF_NAPI);
} else if (ifr->ifr_flags & IFF_DETACH_QUEUE) {
tun = rtnl_dereference(tfile->tun);
if (!tun || !(tun->flags & IFF_MULTI_QUEUE) || tfile->detached)
diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index 3cb5e1d85ddd..30b6184884eb 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -60,6 +60,7 @@
/* TUNSETIFF ifr flags */
#define IFF_TUN 0x0001
#define IFF_TAP 0x0002
+#define IFF_NAPI 0x0010
#define IFF_NO_PI 0x1000
/* This flag has no real effect */
#define IFF_ONE_QUEUE 0x2000
--
2.11.0
^ permalink raw reply related
* [PATCH,v3,net-next 2/2] tun: enable napi_gro_frags() for TUN/TAP driver
From: Petar Penkov @ 2017-09-22 20:49 UTC (permalink / raw)
To: netdev; +Cc: edumazet, maheshb, willemb, davem, ppenkov, Petar Penkov
In-Reply-To: <20170922204915.7889-1-peterpenkov96@gmail.com>
Add a TUN/TAP receive mode that exercises the napi_gro_frags()
interface. This mode is available only in TAP mode, as the interface
expects packets with Ethernet headers.
Furthermore, packets follow the layout of the iovec_iter that was
received. The first iovec is the linear data, and every one after the
first is a fragment. If there are more fragments than the max number,
drop the packet. Additionally, invoke eth_get_headlen() to exercise flow
dissector code and to verify that the header resides in the linear data.
The napi_gro_frags() mode requires setting the IFF_NAPI_FRAGS option.
This is imposed because this mode is intended for testing via tools like
syzkaller and packetdrill, and the increased flexibility it provides can
introduce security vulnerabilities. This flag is accepted only if the
device is in TAP mode and has the IFF_NAPI flag set as well. This is
done because both of these are explicit requirements for correct
operation in this mode.
Signed-off-by: Petar Penkov <peterpenkov96@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: davem@davemloft.net
Cc: ppenkov@stanford.edu
---
drivers/net/tun.c | 134 ++++++++++++++++++++++++++++++++++++++++++--
include/uapi/linux/if_tun.h | 1 +
2 files changed, 129 insertions(+), 6 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index f16407242b18..9880b3bc8fa5 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -75,6 +75,7 @@
#include <linux/skb_array.h>
#include <linux/bpf.h>
#include <linux/bpf_trace.h>
+#include <linux/mutex.h>
#include <linux/uaccess.h>
@@ -121,7 +122,8 @@ do { \
#define TUN_VNET_BE 0x40000000
#define TUN_FEATURES (IFF_NO_PI | IFF_ONE_QUEUE | IFF_VNET_HDR | \
- IFF_MULTI_QUEUE | IFF_NAPI)
+ IFF_MULTI_QUEUE | IFF_NAPI | IFF_NAPI_FRAGS)
+
#define GOODCOPY_LEN 128
#define FLT_EXACT_COUNT 8
@@ -173,6 +175,7 @@ struct tun_file {
unsigned int ifindex;
};
struct napi_struct napi;
+ struct mutex napi_mutex; /* Protects access to the above napi */
struct list_head next;
struct tun_struct *detached;
struct skb_array tx_array;
@@ -277,6 +280,7 @@ static void tun_napi_init(struct tun_struct *tun, struct tun_file *tfile,
netif_napi_add(tun->dev, &tfile->napi, tun_napi_poll,
NAPI_POLL_WEIGHT);
napi_enable(&tfile->napi);
+ mutex_init(&tfile->napi_mutex);
}
}
@@ -292,6 +296,11 @@ static void tun_napi_del(struct tun_struct *tun, struct tun_file *tfile)
netif_napi_del(&tfile->napi);
}
+static bool tun_napi_frags_enabled(const struct tun_struct *tun)
+{
+ return READ_ONCE(tun->flags) & IFF_NAPI_FRAGS;
+}
+
#ifdef CONFIG_TUN_VNET_CROSS_LE
static inline bool tun_legacy_is_little_endian(struct tun_struct *tun)
{
@@ -1036,7 +1045,8 @@ static void tun_poll_controller(struct net_device *dev)
* supports polling, which enables bridge devices in virt setups to
* still use netconsole
* If NAPI is enabled, however, we need to schedule polling for all
- * queues.
+ * queues unless we are using napi_gro_frags(), which we call in
+ * process context and not in NAPI context.
*/
struct tun_struct *tun = netdev_priv(dev);
@@ -1044,6 +1054,9 @@ static void tun_poll_controller(struct net_device *dev)
struct tun_file *tfile;
int i;
+ if (tun_napi_frags_enabled(tun))
+ return;
+
rcu_read_lock();
for (i = 0; i < tun->numqueues; i++) {
tfile = rcu_dereference(tun->tfiles[i]);
@@ -1266,6 +1279,64 @@ static unsigned int tun_chr_poll(struct file *file, poll_table *wait)
return mask;
}
+static struct sk_buff *tun_napi_alloc_frags(struct tun_file *tfile,
+ size_t len,
+ const struct iov_iter *it)
+{
+ struct sk_buff *skb;
+ size_t linear;
+ int err;
+ int i;
+
+ if (it->nr_segs > MAX_SKB_FRAGS + 1)
+ return ERR_PTR(-ENOMEM);
+
+ local_bh_disable();
+ skb = napi_get_frags(&tfile->napi);
+ local_bh_enable();
+ if (!skb)
+ return ERR_PTR(-ENOMEM);
+
+ linear = iov_iter_single_seg_count(it);
+ err = __skb_grow(skb, linear);
+ if (err)
+ goto free;
+
+ skb->len = len;
+ skb->data_len = len - linear;
+ skb->truesize += skb->data_len;
+
+ for (i = 1; i < it->nr_segs; i++) {
+ size_t fragsz = it->iov[i].iov_len;
+ unsigned long offset;
+ struct page *page;
+ void *data;
+
+ if (fragsz == 0 || fragsz > PAGE_SIZE) {
+ err = -EINVAL;
+ goto free;
+ }
+
+ local_bh_disable();
+ data = napi_alloc_frag(fragsz);
+ local_bh_enable();
+ if (!data) {
+ err = -ENOMEM;
+ goto free;
+ }
+
+ page = virt_to_head_page(data);
+ offset = data - page_address(page);
+ skb_fill_page_desc(skb, i - 1, page, offset, fragsz);
+ }
+
+ return skb;
+free:
+ /* frees skb and all frags allocated with napi_alloc_frag() */
+ napi_free_frags(&tfile->napi);
+ return ERR_PTR(err);
+}
+
/* prepad is the amount to reserve at front. len is length after that.
* linear is a hint as to how much to copy (usually headers). */
static struct sk_buff *tun_alloc_skb(struct tun_file *tfile,
@@ -1478,6 +1549,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
int err;
u32 rxhash;
int skb_xdp = 1;
+ bool frags = tun_napi_frags_enabled(tun);
if (!(tun->dev->flags & IFF_UP))
return -EIO;
@@ -1535,7 +1607,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
zerocopy = true;
}
- if (tun_can_build_skb(tun, tfile, len, noblock, zerocopy)) {
+ if (!frags && tun_can_build_skb(tun, tfile, len, noblock, zerocopy)) {
/* For the packet that is not easy to be processed
* (e.g gso or jumbo packet), we will do it at after
* skb was created with generic XDP routine.
@@ -1556,10 +1628,24 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
linear = tun16_to_cpu(tun, gso.hdr_len);
}
- skb = tun_alloc_skb(tfile, align, copylen, linear, noblock);
+ if (frags) {
+ mutex_lock(&tfile->napi_mutex);
+ skb = tun_napi_alloc_frags(tfile, copylen, from);
+ /* tun_napi_alloc_frags() enforces a layout for the skb.
+ * If zerocopy is enabled, then this layout will be
+ * overwritten by zerocopy_sg_from_iter().
+ */
+ zerocopy = false;
+ } else {
+ skb = tun_alloc_skb(tfile, align, copylen, linear,
+ noblock);
+ }
+
if (IS_ERR(skb)) {
if (PTR_ERR(skb) != -EAGAIN)
this_cpu_inc(tun->pcpu_stats->rx_dropped);
+ if (frags)
+ mutex_unlock(&tfile->napi_mutex);
return PTR_ERR(skb);
}
@@ -1571,6 +1657,11 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
if (err) {
this_cpu_inc(tun->pcpu_stats->rx_dropped);
kfree_skb(skb);
+ if (frags) {
+ tfile->napi.skb = NULL;
+ mutex_unlock(&tfile->napi_mutex);
+ }
+
return -EFAULT;
}
}
@@ -1578,6 +1669,11 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
if (virtio_net_hdr_to_skb(skb, &gso, tun_is_little_endian(tun))) {
this_cpu_inc(tun->pcpu_stats->rx_frame_errors);
kfree_skb(skb);
+ if (frags) {
+ tfile->napi.skb = NULL;
+ mutex_unlock(&tfile->napi_mutex);
+ }
+
return -EINVAL;
}
@@ -1603,7 +1699,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
skb->dev = tun->dev;
break;
case IFF_TAP:
- skb->protocol = eth_type_trans(skb, tun->dev);
+ if (!frags)
+ skb->protocol = eth_type_trans(skb, tun->dev);
break;
}
@@ -1638,7 +1735,23 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
rxhash = __skb_get_hash_symmetric(skb);
- if (tun->flags & IFF_NAPI) {
+ if (frags) {
+ /* Exercise flow dissector code path. */
+ u32 headlen = eth_get_headlen(skb->data, skb_headlen(skb));
+
+ if (headlen > skb_headlen(skb) || headlen < ETH_HLEN) {
+ this_cpu_inc(tun->pcpu_stats->rx_dropped);
+ napi_free_frags(&tfile->napi);
+ mutex_unlock(&tfile->napi_mutex);
+ WARN_ON(1);
+ return -ENOMEM;
+ }
+
+ local_bh_disable();
+ napi_gro_frags(&tfile->napi);
+ local_bh_enable();
+ mutex_unlock(&tfile->napi_mutex);
+ } else if (tun->flags & IFF_NAPI) {
struct sk_buff_head *queue = &tfile->sk.sk_write_queue;
int queue_len;
@@ -2061,6 +2174,15 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
if (tfile->detached)
return -EINVAL;
+ if ((ifr->ifr_flags & IFF_NAPI_FRAGS)) {
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
+ if (!(ifr->ifr_flags & IFF_NAPI) ||
+ (ifr->ifr_flags & TUN_TYPE_MASK) != IFF_TAP)
+ return -EINVAL;
+ }
+
dev = __dev_get_by_name(net, ifr->ifr_name);
if (dev) {
if (ifr->ifr_flags & IFF_TUN_EXCL)
diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index 30b6184884eb..365ade5685c9 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -61,6 +61,7 @@
#define IFF_TUN 0x0001
#define IFF_TAP 0x0002
#define IFF_NAPI 0x0010
+#define IFF_NAPI_FRAGS 0x0020
#define IFF_NO_PI 0x1000
/* This flag has no real effect */
#define IFF_ONE_QUEUE 0x2000
--
2.11.0
^ permalink raw reply related
* [RFC PATCH 00/11] udp: full early demux for unconnected sockets
From: Paolo Abeni @ 2017-09-22 21:06 UTC (permalink / raw)
To: netdev
Cc: David S. Miller, Pablo Neira Ayuso, Florian Westphal,
Eric Dumazet, Hannes Frederic Sowa
This series refactor the UDP early demux code so that:
* full socket lookup is performed for unicast packets
* a sk is grabbed even for unconnected socket match
* a dst cache is used even in such scenario
To perform this tasks a couple of facilities are added:
* noref socket references, scoped inside the current RCU section, to be
explicitly cleared before leaving such section
* a dst cache inside the inet and inet6 local addresses tables, caching the
related local dst entry
The measured performance gain under small packet UDP flood is as follow:
ingress NIC vanilla patched delta
rx queues (kpps) (kpps) (%)
[ipv4]
1 2177 2414 10
2 2527 2892 14
3 3050 3733 22
4 3918 4643 18
5 5074 5699 12
6 5654 6869 21
[ipv6]
1 2002 2821 40
2 2087 3148 50
3 2583 4008 55
4 3072 4963 61
5 3719 5992 61
6 4314 6910 60
The number of user space process in use is equal to the number of
NIC rx queue; when multiple user space processes the SO_REUSEPORT
options is used, as described below:
ethtool -L em2 combined $n
MASK=1
for I in `seq 0 $((n - 1))`; do
udp_sink --reuse-port --recvfrom --count 1000000000 --port 9 $1 &
taskset -p $((MASK << ($I + $n) )) $!
done
Paolo Abeni (11):
net: add support for noref skb->sk
net: allow early demux to fetch noref socket
udp: do not touch socket refcount in early demux
net: add simple socket-like dst cache helpers
udp: perform full socket lookup in early demux
ip/route: factor out helper for local route creation
ipv6/addrconf: add an helper for inet6 address lookup
net: implement local route cache inside ifaddr
route: add ipv4/6 helpers to do partial route lookup vs local dst
IP: early demux can return an error code
udp: dst lookup in early demux for unconnected sockets
include/linux/inetdevice.h | 4 ++
include/linux/skbuff.h | 31 +++++++++++
include/linux/udp.h | 2 +
include/net/addrconf.h | 3 ++
include/net/dst.h | 20 +++++++
include/net/if_inet6.h | 4 ++
include/net/ip6_route.h | 1 +
include/net/protocol.h | 4 +-
include/net/route.h | 4 ++
include/net/tcp.h | 2 +-
include/net/udp.h | 2 +-
net/core/dst.c | 12 +++++
net/core/sock.c | 7 +++
net/ipv4/devinet.c | 29 ++++++++++-
net/ipv4/ip_input.c | 33 ++++++++----
net/ipv4/netfilter/nf_dup_ipv4.c | 3 ++
net/ipv4/route.c | 73 +++++++++++++++++++++++---
net/ipv4/tcp_ipv4.c | 9 ++--
net/ipv4/udp.c | 95 +++++++++++++++-------------------
net/ipv6/addrconf.c | 109 +++++++++++++++++++++++++++------------
net/ipv6/ip6_input.c | 4 ++
net/ipv6/netfilter/nf_dup_ipv6.c | 3 ++
net/ipv6/route.c | 13 +++++
net/ipv6/udp.c | 72 ++++++++++----------------
net/netfilter/nf_queue.c | 3 ++
25 files changed, 383 insertions(+), 159 deletions(-)
--
2.13.5
^ permalink raw reply
* [RFC PATCH 01/11] net: add support for noref skb->sk
From: Paolo Abeni @ 2017-09-22 21:06 UTC (permalink / raw)
To: netdev
Cc: David S. Miller, Pablo Neira Ayuso, Florian Westphal,
Eric Dumazet, Hannes Frederic Sowa
In-Reply-To: <cover.1506114055.git.pabeni@redhat.com>
Noref sk do not carry a socket refcount, are valid
only inside the current RCU section and must be
explicitly cleared before exiting such section.
They will be used in a later patch to allow early demux
without sock refcounting.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
include/linux/skbuff.h | 31 +++++++++++++++++++++++++++++++
net/core/sock.c | 7 +++++++
2 files changed, 38 insertions(+)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 492828801acb..c3fc32636690 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -922,6 +922,37 @@ static inline struct rtable *skb_rtable(const struct sk_buff *skb)
return (struct rtable *)skb_dst(skb);
}
+void sock_dummyfree(struct sk_buff *skb);
+
+/* only early demux can set noref socks
+ * noref socks do not carry any refcount and must be
+ * cleared before exiting the current RCU section
+ */
+static inline void skb_set_noref_sk(struct sk_buff *skb, struct sock *sk)
+{
+ skb->sk = sk;
+ skb->destructor = sock_dummyfree;
+}
+
+static inline bool skb_has_noref_sk(struct sk_buff *skb)
+{
+ return skb->destructor == sock_dummyfree;
+}
+
+static inline struct sock *skb_clear_noref_sk(struct sk_buff *skb)
+{
+ struct sock *ret;
+
+ if (!skb_has_noref_sk(skb))
+ return NULL;
+
+ WARN_ON_ONCE(!rcu_read_lock_held());
+ ret = skb->sk;
+ skb->sk = NULL;
+ skb->destructor = NULL;
+ return ret;
+}
+
/* For mangling skb->pkt_type from user space side from applications
* such as nft, tc, etc, we only allow a conservative subset of
* possible pkt_types to be set.
diff --git a/net/core/sock.c b/net/core/sock.c
index 9b7b6bbb2a23..33da8e7e58a0 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1893,6 +1893,13 @@ void sock_efree(struct sk_buff *skb)
}
EXPORT_SYMBOL(sock_efree);
+/* dummy destructor used by noref sockets */
+void sock_dummyfree(struct sk_buff *skb)
+{
+ WARN_ON_ONCE(!rcu_read_lock_held());
+}
+EXPORT_SYMBOL(sock_dummyfree);
+
kuid_t sock_i_uid(struct sock *sk)
{
kuid_t uid;
--
2.13.5
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox