* RE: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Parav Pandit @ 2019-08-21 3:57 UTC (permalink / raw)
To: Cornelia Huck, Alex Williamson
Cc: Jiri Pirko, David S . Miller, Kirti Wankhede, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, cjia, netdev@vger.kernel.org
In-Reply-To: <20190820195519.47d6fd6a.cohuck@redhat.com>
> -----Original Message-----
> From: Cornelia Huck <cohuck@redhat.com>
> Sent: Tuesday, August 20, 2019 11:25 PM
> To: Alex Williamson <alex.williamson@redhat.com>
> Cc: Parav Pandit <parav@mellanox.com>; Jiri Pirko <jiri@mellanox.com>;
> David S . Miller <davem@davemloft.net>; Kirti Wankhede
> <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux-
> kernel@vger.kernel.org; cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On Tue, 20 Aug 2019 11:19:04 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
>
> > What about an alias based on the uuid? For example, we use 160-bit
> > sha1s daily with git (uuids are only 128-bit), but we generally don't
> > reference git commits with the full 20 character string. Generally 12
> > characters is recommended to avoid ambiguity. Could mdev
> > automatically create an abbreviated sha1 alias for the device? If so,
> > how many characters should we use and what do we do on collision? The
> > colliding device could add enough alias characters to disambiguate (we
> > likely couldn't re-alias the existing device to disambiguate, but I'm
> > not sure it matters, userspace has sysfs to associate aliases). Ex.
> >
> > UUID=$(uuidgen)
> > ALIAS=$(echo $UUID | sha1sum | colrm 13)
> >
> > Since there seems to be some prefix overhead, as I ask about above in
> > how many characters we actually have to work with in IFNAMESZ, maybe
> > we start with 8 characters (matching your "index" namespace) and
> > expand as necessary for disambiguation. If we can eliminate overhead
> > in IFNAMESZ, let's start with 12. Thanks,
> >
> > Alex
>
> I really like that idea, and it seems the best option proposed yet, as we don't
> need to create a secondary identifier.
User setting this alias at mdev creation time and exposed via sysfs as read only attribute works.
Exposing that as
const char *mdev_alias(struct mdev_device *dev) to vendor drivers..
^ permalink raw reply
* Re: [PATCH 08/38] nfp: Convert to XArray
From: Jakub Kicinski @ 2019-08-21 3:59 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: netdev
In-Reply-To: <20190820223259.22348-9-willy@infradead.org>
On Tue, 20 Aug 2019 15:32:29 -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
>
> A minor change in semantics where we simply store into the XArray rather
> than insert; this only matters if there could already be something stored
> at that index, and from my reading of the code that can't happen.
>
> Use xa_for_each() rather than xas_for_each() as none of these loops
> appear to be performance-critical.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Looks reasonable (indeed IIRC there should not be anything at the
index we try to store to). I'll try to test tomorrow (CCing maintainers
could speed things up a little.. 🤭)
> @@ -285,9 +275,9 @@ static void
> nfp_abm_qdisc_clear_mq(struct net_device *netdev, struct nfp_abm_link *alink,
> struct nfp_qdisc *qdisc)
> {
> - struct radix_tree_iter iter;
> unsigned int mq_refs = 0;
> - void __rcu **slot;
> + unsigned long index;
> + struct nfp_qdisc *mq;
Could you keep the variables sorted longest to shortest as is customary
in networking code if you respin?
^ permalink raw reply
* Re: [PATCH] net: pch_gbe: Fix memory leaks
From: Wenwen Wang @ 2019-08-21 4:10 UTC (permalink / raw)
To: David Miller
Cc: Richard Fontana, Allison Randal, Alexios Zavras,
Greg Kroah-Hartman, Thomas Gleixner,
open list:NETWORKING [GENERAL], open list, Wenwen Wang
In-Reply-To: <20190815.135111.1048854967874803531.davem@davemloft.net>
On Thu, Aug 15, 2019 at 4:51 PM David Miller <davem@davemloft.net> wrote:
>
> From: Wenwen Wang <wenwen@cs.uga.edu>
> Date: Thu, 15 Aug 2019 16:46:05 -0400
>
> > On Thu, Aug 15, 2019 at 4:42 PM David Miller <davem@davemloft.net> wrote:
> >>
> >> From: Wenwen Wang <wenwen@cs.uga.edu>
> >> Date: Thu, 15 Aug 2019 16:03:39 -0400
> >>
> >> > On Thu, Aug 15, 2019 at 3:34 PM David Miller <davem@davemloft.net> wrote:
> >> >>
> >> >> From: Wenwen Wang <wenwen@cs.uga.edu>
> >> >> Date: Tue, 13 Aug 2019 20:33:45 -0500
> >> >>
> >> >> > In pch_gbe_set_ringparam(), if netif_running() returns false, 'tx_old' and
> >> >> > 'rx_old' are not deallocated, leading to memory leaks. To fix this issue,
> >> >> > move the free statements after the if branch.
> >> >> >
> >> >> > Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
> >> >>
> >> >> Why would they be "deallocated"? They are still assigned to
> >> >> adapter->tx_ring and adapter->rx_ring.
> >> >
> >> > 'adapter->tx_ring' and 'adapter->rx_ring' has been covered by newly
> >> > allocated 'txdr' and 'rxdr' respectively before this if statement.
> >>
> >> That only happens inside of the if() statement, that's why rx_old and
> >> tx_old are only freed in that code path.
> >
> > That happens not only inside of the if statement, but also before the
> > if statement, just after 'txdr' and 'rxdr' are allocated.
>
> Then the assignments inside of the if() statement are redundant.
>
> Something doesn't add up here, please make the code consistent.
Thanks for your suggestion! I will remove the assignments inside of
the if() statement.
Wenwen
^ permalink raw reply
* [PATCH v2] net: pch_gbe: Fix memory leaks
From: Wenwen Wang @ 2019-08-21 4:20 UTC (permalink / raw)
To: Wenwen Wang
Cc: David S. Miller, Richard Fontana, Alexios Zavras, Allison Randal,
Greg Kroah-Hartman, Thomas Gleixner, open list:NETWORKING DRIVERS,
open list
In pch_gbe_set_ringparam(), if netif_running() returns false, 'tx_old' and
'rx_old' are not deallocated, leading to memory leaks. To fix this issue,
move the free statements to the outside of the if() statement.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
---
drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_ethtool.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_ethtool.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_ethtool.c
index 1a3008e..cb43919 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_ethtool.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_ethtool.c
@@ -340,12 +340,10 @@ static int pch_gbe_set_ringparam(struct net_device *netdev,
goto err_setup_tx;
pch_gbe_free_rx_resources(adapter, rx_old);
pch_gbe_free_tx_resources(adapter, tx_old);
- kfree(tx_old);
- kfree(rx_old);
- adapter->rx_ring = rxdr;
- adapter->tx_ring = txdr;
err = pch_gbe_up(adapter);
}
+ kfree(tx_old);
+ kfree(rx_old);
return err;
err_setup_tx:
--
2.7.4
^ permalink raw reply related
* Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Alex Williamson @ 2019-08-21 4:20 UTC (permalink / raw)
To: Parav Pandit
Cc: Jiri Pirko, David S . Miller, Kirti Wankhede, Cornelia Huck,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org, cjia,
netdev@vger.kernel.org
In-Reply-To: <AM0PR05MB486686D3C311F3C61BE0997DD1AA0@AM0PR05MB4866.eurprd05.prod.outlook.com>
On Wed, 21 Aug 2019 03:42:25 +0000
Parav Pandit <parav@mellanox.com> wrote:
> > -----Original Message-----
> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Tuesday, August 20, 2019 10:49 PM
> > To: Parav Pandit <parav@mellanox.com>
> > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>;
> > Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck
> > <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> > cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> >
> > On Tue, 20 Aug 2019 08:58:02 +0000
> > Parav Pandit <parav@mellanox.com> wrote:
> >
> > > + Dave.
> > >
> > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > >
> > > Please provide your feedback on it, how shall we proceed?
> > >
> > > Short summary of requirements.
> > > For a given mdev (mediated device [1]), there is one representor
> > > netdevice and devlink port in switchdev mode (similar to SR-IOV VF),
> > > And there is one netdevice for the actual mdev when mdev is probed.
> > >
> > > (a) representor netdev and devlink port should be able derive
> > > phys_port_name(). So that representor netdev name can be built
> > > deterministically across reboots.
> > >
> > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > This attribute can be used by udev rules/systemd or something else to
> > > rename netdev name deterministically.
> > >
> > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > A simple grep IFNAMSIZ in stack hints hundreds of users of IFNAMSIZ in
> > > drivers, uapi, netlink, boot config area and more. Changing IFNAMSIZ
> > > for a mdev bus doesn't really look reasonable option to me.
> >
> > How many characters do we really have to work with? Your examples below
> > prepend various characters, ex. option-1 results in ens2f0_m10 or enm10. Do
> > the extra 8 or 3 characters in these count against IFNAMSIZ?
> >
> Maximum 15. Last is null termination.
> Some udev rules setting by user prefix the PF netdev interface. I took such example below where ens2f0 netdev named is prefixed.
> Some prefer not to prefix.
>
> > > Hence, I would like to discuss below options.
> > >
> > > Option-1: mdev index
> > > Introduce an optional mdev index/handle as u32 during mdev create
> > > time. User passes mdev index/handle as input.
> > >
> > > phys_port_name=mIndex=m%u
> > > mdev_index will be available in sysfs as mdev attribute for udev to
> > > name the mdev's netdev.
> > >
> > > example mdev create command:
> > > UUID=$(uuidgen)
> > > echo $UUID index=10
> > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> >
> > Nit, IIRC previous discussions of additional parameters used comma separators,
> > ex. echo $UUID,index=10 >...
> >
> Yes, ok.
>
> > > > example netdevs:
> > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */
> >
> > Is the parent really relevant in the name?
> No. I just picked one udev example who prefixed the parent netdev name.
> But there are users who do not prefix it.
>
> > Tools like mdevctl are meant to
> > provide persistence, creating the same mdev devices on the same parent, but
> > that's simply the easiest policy decision. We can also imagine that multiple
> > parent devices might support a specified mdev type and policies factoring in
> > proximity, load-balancing, power consumption, etc might be weighed such that
> > we really don't want to promote userspace creating dependencies on the
> > parent association.
> >
> > > mdev_netdev=enm10
> > >
> > > Pros:
> > > 1. mdevctl and any other existing tools are unaffected.
> > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > 3. achieves unique phys_port_name for representor netdev 4. achieves
> > > unique mdev eth netdev name for the mdev using udev/systemd extension.
> > > 5. Aligns well with mdev and netdev subsystem and similar to existing
> > > sriov bdf's.
> >
> > A user provided index seems strange to me. It's not really an index, just a user
> > specified instance number. Presumably you have the user providing this
> > because if it really were an index, then the value depends on the creation order
> > and persistence is lost. Now the user needs to both avoid uuid collision as well
> > as "index" number collision. The uuid namespace is large enough to mostly
> > ignore this, but this is not. This seems like a burden.
> >
> I liked the term 'instance number', which is lot better way to say than index/handle.
> Yes, user needs to avoid both the collision.
> UUID collision should not occur in most cases, they way UUID are generated.
> So practically users needs to pick unique 'instance number', similar to how it picks unique netdev names.
>
> Burden to user comes from the requirement to get uniqueness.
>
> > > Option-2: shorter mdev name
> > > Extend mdev to have shorter mdev device name in addition to UUID.
> > > such as 'foo', 'bar'.
> > > Mdev will continue to have UUID.
> > > phys_port_name=mdev_name
> > >
> > > Pros:
> > > 1. All same as option-1, except mdevctl needs upgrade for newer usage.
> > > It is common practice to upgrade iproute2 package along with the
> > > kernel. Similar practice to be done with mdevctl.
> > > 2. Newer users of mdevctl who wants to work with non_UUID names, will
> > > use newer mdevctl/tools. Cons:
> > > 1. Dual naming scheme of mdev might affect some of the existing tools.
> > > It's unclear how/if it actually affects.
> > > mdevctl [2] is very recently developed and can be enhanced for dual
> > > naming scheme.
> >
> > I think we've already nak'ed this one, the device namespace becomes
> > meaningless if the name becomes just a string where a uuid might be an
> > example string. mdevs are named by uuid.
> >
> > > Option-3: mdev uuid alias
> > > Instead of shorter mdev name or mdev index, have alpha-numeric name
> > > alias. Alias is an optional mdev sysfs attribute such as 'foo', 'bar'.
> > > example mdev create command:
> > > UUID=$(uuidgen)
> > > echo $UUID alias=foo
> > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > > > example netdevs:
> > > examle netdevs:
> > > repnetdev = ens2f0_mfoo
> > > mdev_netdev=enmfoo
> > >
> > > Pros:
> > > 1. All same as option-1.
> > > 2. Doesn't affect existing mdev naming scheme.
> > > Cons:
> > > 1. Index scheme of option-1 is better which can number large number of
> > > mdevs with fewer characters, simplifying the management tool.
> >
> > No better than option-1, simply a larger secondary namespace, but still
> > requires the user to come up with two independent names for the device.
> >
> > > Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ from 16 to
> > > 64 bytes phys_port_name=mdev_UUID_string mdev_netdev_name=enmUUID
> > >
> > > Pros:
> > > 1. Doesn't require mdev extension
> > > Cons:
> > > 1. netdev stack, driver, uapi, user space, boot config wide changes 2.
> > > Possible user space extensions who assumed name size being 16
> > > characters 3. Single device type demands namesize change for all
> > > netdev types
> >
> > What about an alias based on the uuid? For example, we use 160-bit sha1s
> > daily with git (uuids are only 128-bit), but we generally don't reference git
> > commits with the full 20 character string. Generally 12 characters is
> > recommended to avoid ambiguity. Could mdev automatically create an
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > abbreviated sha1 alias for the device? If so, how many characters should we
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > use and what do we do on collision? The colliding device could add enough
> > alias characters to disambiguate (we likely couldn't re-alias the existing device
> > to disambiguate, but I'm not sure it matters, userspace has sysfs to associate
> > aliases). Ex.
> >
> > UUID=$(uuidgen)
> > ALIAS=$(echo $UUID | sha1sum | colrm 13)
> >
> I explained in previous reply to Cornelia, we should set UUID and ALIAS at the same time.
> Setting is via different sysfs attribute is lot code burden with no extra benefit.
Just an example of the alias, not proposing how it's set. In fact,
proposing that the user does not set it, mdev-core provides one
automatically.
> > Since there seems to be some prefix overhead, as I ask about above in how
> > many characters we actually have to work with in IFNAMESZ, maybe we start
> > with 8 characters (matching your "index" namespace) and expand as necessary
> > for disambiguation. If we can eliminate overhead in IFNAMESZ, let's start with
> > 12. Thanks,
> >
> If user is going to choose the alias, why does it have to be limited to sha1?
> Or you just told it as an example?
>
> It can be an alpha-numeric string.
No, I'm proposing a different solution where mdev-core creates an alias
based on an abbreviated sha1. The user does not provide the alias.
> Instead of mdev imposing number of characters on the alias, it should be best left to the user.
> Because in future if netdev improves on the naming scheme, mdev will be limiting it, which is not right.
> So not restricting alias size seems right to me.
> User configuring mdev for networking devices in a given kernel knows what user is doing.
> So user can choose alias name size as it finds suitable.
That's not what I'm proposing, please read again. Thanks,
Alex
^ permalink raw reply
* Re: [EXT] [PATCH] qed: Add cleanup in qed_slowpath_start()
From: Wenwen Wang @ 2019-08-21 4:31 UTC (permalink / raw)
To: Sudarsana Reddy Kalluru
Cc: Ariel Elior, GR-everest-linux-l2, David S. Miller,
open list:QLOGIC QL4xxx ETHERNET DRIVER, open list, Wenwen Wang
In-Reply-To: <MN2PR18MB2528D8046DFC6BB880D8EFF6D3D20@MN2PR18MB2528.namprd18.prod.outlook.com>
On Tue, Aug 13, 2019 at 6:46 AM Sudarsana Reddy Kalluru
<skalluru@marvell.com> wrote:
>
> > -----Original Message-----
> > From: Wenwen Wang <wenwen@cs.uga.edu>
> > Sent: Tuesday, August 13, 2019 3:35 PM
> > To: Wenwen Wang <wenwen@cs.uga.edu>
> > Cc: Ariel Elior <aelior@marvell.com>; GR-everest-linux-l2 <GR-everest-linux-
> > l2@marvell.com>; David S. Miller <davem@davemloft.net>; open
> > list:QLOGIC QL4xxx ETHERNET DRIVER <netdev@vger.kernel.org>; open list
> > <linux-kernel@vger.kernel.org>
> > Subject: [EXT] [PATCH] qed: Add cleanup in qed_slowpath_start()
> >
> > External Email
> >
> > ----------------------------------------------------------------------
> > If qed_mcp_send_drv_version() fails, no cleanup is executed, leading to
> > memory leaks. To fix this issue, redirect the execution to the label 'err3'
> > before returning the error.
> >
> > Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
> > ---
> > drivers/net/ethernet/qlogic/qed/qed_main.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c
> > b/drivers/net/ethernet/qlogic/qed/qed_main.c
> > index 829dd60..d16a251 100644
> > --- a/drivers/net/ethernet/qlogic/qed/qed_main.c
> > +++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
> > @@ -1325,7 +1325,7 @@ static int qed_slowpath_start(struct qed_dev
> > *cdev,
> > &drv_version);
> > if (rc) {
> > DP_NOTICE(cdev, "Failed sending drv version
> > command\n");
> > - return rc;
> > + goto err3;
>
> In this case, we might need to free the ll2-buf allocated at the below path (?),
> 1312 /* Allocate LL2 interface if needed */
> 1313 if (QED_LEADING_HWFN(cdev)->using_ll2) {
> 1314 rc = qed_ll2_alloc_if(cdev);
> May be by adding a new goto label 'err4'.
Thanks for your suggestion! I will rework the patch.
Wenwen
>
> > }
> > }
> >
> > --
> > 2.7.4
>
^ permalink raw reply
* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
From: Richard Cochran @ 2019-08-21 4:38 UTC (permalink / raw)
To: Vladimir Oltean
Cc: Mark Brown, Hubert Feurstein, mlichvar, Andrew Lunn,
Florian Fainelli, linux-spi, netdev
In-Reply-To: <CA+h21hr4UcoJK7upNJjG0ibtX7CkF=akxVdrb--1AJn6-z=sUQ@mail.gmail.com>
On Tue, Aug 20, 2019 at 06:57:10PM +0300, Vladimir Oltean wrote:
> What if all we need is just a mini-"phc2sys-over-Ethernet" that runs
> on a kernel thread internally to DSA? We say that DSA switches are
> "slave" to the "master" netdevice - their PTP clock can be the same.
> I am fairly confident that the sja1105 at least can be configured in
> hardware to work in this mode. One just needs to enable the CPU port
> in its own reachability matrix. None of the switch ports is really a
> "CPU port" hardware speaking.
I did consider this method when working on an early version of the
marvell driver. At least for that chip, there was no way to get the
time stamps on the port attached to the CPU port.
The DSA system (as I understand it) does not allow using the Linux
interface acting as the CPU port in a way that would allow taking time
stamps with the existing code base. You might find a way, but I guess
it won't be easy.
Overall, the PTP switch use case is well supported by Linux. The
synchronization of the management CPU to the PTP, while nice to have,
is not required to implement a Transparent Clock. Your specific
application might require it, but honestly, if the management CPU
needs good synchronization, then you really aught to feed a PPS from
the switch into a gpio (for example) on the CPU.
Using SPI or MDIO or I2C or UART as a synchronization bus is not a
wise solution.
Having said that, I don't oppose improving the situation for these
slow, non-deterministic serial buses, but you will have to sell your
solution to the maintainers of said buses.
Thanks,
Richard
^ permalink raw reply
* RE: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Parav Pandit @ 2019-08-21 4:40 UTC (permalink / raw)
To: Alex Williamson
Cc: Jiri Pirko, David S . Miller, Kirti Wankhede, Cornelia Huck,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org, cjia,
netdev@vger.kernel.org
In-Reply-To: <20190820222051.7aeafb69@x1.home>
> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, August 21, 2019 9:51 AM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>;
> Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck
> <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On Wed, 21 Aug 2019 03:42:25 +0000
> Parav Pandit <parav@mellanox.com> wrote:
>
> > > -----Original Message-----
> > > From: Alex Williamson <alex.williamson@redhat.com>
> > > Sent: Tuesday, August 20, 2019 10:49 PM
> > > To: Parav Pandit <parav@mellanox.com>
> > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller
> > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>;
> > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> > > netdev@vger.kernel.org
> > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > >
> > > On Tue, 20 Aug 2019 08:58:02 +0000
> > > Parav Pandit <parav@mellanox.com> wrote:
> > >
> > > > + Dave.
> > > >
> > > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > > >
> > > > Please provide your feedback on it, how shall we proceed?
> > > >
> > > > Short summary of requirements.
> > > > For a given mdev (mediated device [1]), there is one representor
> > > > netdevice and devlink port in switchdev mode (similar to SR-IOV
> > > > VF), And there is one netdevice for the actual mdev when mdev is probed.
> > > >
> > > > (a) representor netdev and devlink port should be able derive
> > > > phys_port_name(). So that representor netdev name can be built
> > > > deterministically across reboots.
> > > >
> > > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > > This attribute can be used by udev rules/systemd or something else
> > > > to rename netdev name deterministically.
> > > >
> > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > > A simple grep IFNAMSIZ in stack hints hundreds of users of
> > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more.
> > > > Changing IFNAMSIZ for a mdev bus doesn't really look reasonable option
> to me.
> > >
> > > How many characters do we really have to work with? Your examples
> > > below prepend various characters, ex. option-1 results in ens2f0_m10
> > > or enm10. Do the extra 8 or 3 characters in these count against IFNAMSIZ?
> > >
> > Maximum 15. Last is null termination.
> > Some udev rules setting by user prefix the PF netdev interface. I took such
> example below where ens2f0 netdev named is prefixed.
> > Some prefer not to prefix.
> >
> > > > Hence, I would like to discuss below options.
> > > >
> > > > Option-1: mdev index
> > > > Introduce an optional mdev index/handle as u32 during mdev create
> > > > time. User passes mdev index/handle as input.
> > > >
> > > > phys_port_name=mIndex=m%u
> > > > mdev_index will be available in sysfs as mdev attribute for udev
> > > > to name the mdev's netdev.
> > > >
> > > > example mdev create command:
> > > > UUID=$(uuidgen)
> > > > echo $UUID index=10
> > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > >
> > > Nit, IIRC previous discussions of additional parameters used comma
> > > separators, ex. echo $UUID,index=10 >...
> > >
> > Yes, ok.
> >
> > > > > example netdevs:
> > > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */
> > >
> > > Is the parent really relevant in the name?
> > No. I just picked one udev example who prefixed the parent netdev name.
> > But there are users who do not prefix it.
> >
> > > Tools like mdevctl are meant to
> > > provide persistence, creating the same mdev devices on the same
> > > parent, but that's simply the easiest policy decision. We can also
> > > imagine that multiple parent devices might support a specified mdev
> > > type and policies factoring in proximity, load-balancing, power
> > > consumption, etc might be weighed such that we really don't want to
> > > promote userspace creating dependencies on the parent association.
> > >
> > > > mdev_netdev=enm10
> > > >
> > > > Pros:
> > > > 1. mdevctl and any other existing tools are unaffected.
> > > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > > 3. achieves unique phys_port_name for representor netdev 4.
> > > > achieves unique mdev eth netdev name for the mdev using udev/systemd
> extension.
> > > > 5. Aligns well with mdev and netdev subsystem and similar to
> > > > existing sriov bdf's.
> > >
> > > A user provided index seems strange to me. It's not really an
> > > index, just a user specified instance number. Presumably you have
> > > the user providing this because if it really were an index, then the
> > > value depends on the creation order and persistence is lost. Now
> > > the user needs to both avoid uuid collision as well as "index"
> > > number collision. The uuid namespace is large enough to mostly ignore
> this, but this is not. This seems like a burden.
> > >
> > I liked the term 'instance number', which is lot better way to say than
> index/handle.
> > Yes, user needs to avoid both the collision.
> > UUID collision should not occur in most cases, they way UUID are generated.
> > So practically users needs to pick unique 'instance number', similar to how it
> picks unique netdev names.
> >
> > Burden to user comes from the requirement to get uniqueness.
> >
> > > > Option-2: shorter mdev name
> > > > Extend mdev to have shorter mdev device name in addition to UUID.
> > > > such as 'foo', 'bar'.
> > > > Mdev will continue to have UUID.
> > > > phys_port_name=mdev_name
> > > >
> > > > Pros:
> > > > 1. All same as option-1, except mdevctl needs upgrade for newer usage.
> > > > It is common practice to upgrade iproute2 package along with the
> > > > kernel. Similar practice to be done with mdevctl.
> > > > 2. Newer users of mdevctl who wants to work with non_UUID names,
> > > > will use newer mdevctl/tools. Cons:
> > > > 1. Dual naming scheme of mdev might affect some of the existing tools.
> > > > It's unclear how/if it actually affects.
> > > > mdevctl [2] is very recently developed and can be enhanced for
> > > > dual naming scheme.
> > >
> > > I think we've already nak'ed this one, the device namespace becomes
> > > meaningless if the name becomes just a string where a uuid might be
> > > an example string. mdevs are named by uuid.
> > >
> > > > Option-3: mdev uuid alias
> > > > Instead of shorter mdev name or mdev index, have alpha-numeric
> > > > name alias. Alias is an optional mdev sysfs attribute such as 'foo', 'bar'.
> > > > example mdev create command:
> > > > UUID=$(uuidgen)
> > > > echo $UUID alias=foo
> > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > > > > example netdevs:
> > > > examle netdevs:
> > > > repnetdev = ens2f0_mfoo
> > > > mdev_netdev=enmfoo
> > > >
> > > > Pros:
> > > > 1. All same as option-1.
> > > > 2. Doesn't affect existing mdev naming scheme.
> > > > Cons:
> > > > 1. Index scheme of option-1 is better which can number large
> > > > number of mdevs with fewer characters, simplifying the management
> tool.
> > >
> > > No better than option-1, simply a larger secondary namespace, but
> > > still requires the user to come up with two independent names for the
> device.
> > >
> > > > Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ from 16
> > > > to
> > > > 64 bytes phys_port_name=mdev_UUID_string
> mdev_netdev_name=enmUUID
> > > >
> > > > Pros:
> > > > 1. Doesn't require mdev extension
> > > > Cons:
> > > > 1. netdev stack, driver, uapi, user space, boot config wide changes 2.
> > > > Possible user space extensions who assumed name size being 16
> > > > characters 3. Single device type demands namesize change for all
> > > > netdev types
> > >
> > > What about an alias based on the uuid? For example, we use 160-bit
> > > sha1s daily with git (uuids are only 128-bit), but we generally
> > > don't reference git commits with the full 20 character string.
> > > Generally 12 characters is recommended to avoid ambiguity. Could
> > > mdev automatically create an
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > abbreviated sha1 alias for the device? If so, how many characters
> > > should we
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > use and what do we do on collision? The colliding device could add
> > > enough alias characters to disambiguate (we likely couldn't re-alias
> > > the existing device to disambiguate, but I'm not sure it matters,
> > > userspace has sysfs to associate aliases). Ex.
> > >
> > > UUID=$(uuidgen)
> > > ALIAS=$(echo $UUID | sha1sum | colrm 13)
> > >
> > I explained in previous reply to Cornelia, we should set UUID and ALIAS at the
> same time.
> > Setting is via different sysfs attribute is lot code burden with no extra benefit.
>
> Just an example of the alias, not proposing how it's set. In fact, proposing that
> the user does not set it, mdev-core provides one automatically.
>
> > > Since there seems to be some prefix overhead, as I ask about above
> > > in how many characters we actually have to work with in IFNAMESZ,
> > > maybe we start with 8 characters (matching your "index" namespace)
> > > and expand as necessary for disambiguation. If we can eliminate
> > > overhead in IFNAMESZ, let's start with 12. Thanks,
> > >
> > If user is going to choose the alias, why does it have to be limited to sha1?
> > Or you just told it as an example?
> >
> > It can be an alpha-numeric string.
>
> No, I'm proposing a different solution where mdev-core creates an alias based
> on an abbreviated sha1. The user does not provide the alias.
>
> > Instead of mdev imposing number of characters on the alias, it should be best
> left to the user.
> > Because in future if netdev improves on the naming scheme, mdev will be
> limiting it, which is not right.
> > So not restricting alias size seems right to me.
> > User configuring mdev for networking devices in a given kernel knows what
> user is doing.
> > So user can choose alias name size as it finds suitable.
>
> That's not what I'm proposing, please read again. Thanks,
I understood your point. But mdev doesn't know how user is going to use udev/systemd to name the netdev.
So even if mdev chose to pick 12 characters, it could result in collision.
Hence the proposal to provide the alias by the user, as user know the best policy for its use case in the environment its using.
So 12 character sha1 method will still work by user.
^ permalink raw reply
* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
From: Richard Cochran @ 2019-08-21 4:42 UTC (permalink / raw)
To: Vladimir Oltean
Cc: Mark Brown, Hubert Feurstein, mlichvar, Andrew Lunn,
Florian Fainelli, linux-spi, netdev
In-Reply-To: <CA+h21hr4UcoJK7upNJjG0ibtX7CkF=akxVdrb--1AJn6-z=sUQ@mail.gmail.com>
On Tue, Aug 20, 2019 at 06:57:10PM +0300, Vladimir Oltean wrote:
> I believe something similar should be possible on other hardware as well.
Not for the marvell link street devices, AFAICT.
Thanks,
Richard
^ permalink raw reply
* [PATCH v2] qed: Add cleanup in qed_slowpath_start()
From: Wenwen Wang @ 2019-08-21 4:46 UTC (permalink / raw)
To: Wenwen Wang
Cc: Sudarsana Reddy Kalluru, Ariel Elior,
supporter:QLOGIC QL4xxx ETHERNET DRIVER, David S. Miller,
open list:QLOGIC QL4xxx ETHERNET DRIVER, open list
If qed_mcp_send_drv_version() fails, no cleanup is executed, leading to
memory leaks. To fix this issue, introduce the label 'err4' to perform the
cleanup work before returning the error.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
---
drivers/net/ethernet/qlogic/qed/qed_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c b/drivers/net/ethernet/qlogic/qed/qed_main.c
index 829dd60..1efff7f 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_main.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
@@ -1325,7 +1325,7 @@ static int qed_slowpath_start(struct qed_dev *cdev,
&drv_version);
if (rc) {
DP_NOTICE(cdev, "Failed sending drv version command\n");
- return rc;
+ goto err4;
}
}
@@ -1333,6 +1333,8 @@ static int qed_slowpath_start(struct qed_dev *cdev,
return 0;
+err4:
+ qed_ll2_dealloc_if(cdev);
err3:
qed_hw_stop(cdev);
err2:
--
2.7.4
^ permalink raw reply related
* Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Alex Williamson @ 2019-08-21 4:57 UTC (permalink / raw)
To: Parav Pandit
Cc: Jiri Pirko, David S . Miller, Kirti Wankhede, Cornelia Huck,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org, cjia,
netdev@vger.kernel.org
In-Reply-To: <AM0PR05MB48664CDF05C3D02F9441440DD1AA0@AM0PR05MB4866.eurprd05.prod.outlook.com>
On Wed, 21 Aug 2019 04:40:15 +0000
Parav Pandit <parav@mellanox.com> wrote:
> > -----Original Message-----
> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Wednesday, August 21, 2019 9:51 AM
> > To: Parav Pandit <parav@mellanox.com>
> > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>;
> > Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck
> > <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> > cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> >
> > On Wed, 21 Aug 2019 03:42:25 +0000
> > Parav Pandit <parav@mellanox.com> wrote:
> >
> > > > -----Original Message-----
> > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > Sent: Tuesday, August 20, 2019 10:49 PM
> > > > To: Parav Pandit <parav@mellanox.com>
> > > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller
> > > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>;
> > > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> > > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> > > > netdev@vger.kernel.org
> > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > >
> > > > On Tue, 20 Aug 2019 08:58:02 +0000
> > > > Parav Pandit <parav@mellanox.com> wrote:
> > > >
> > > > > + Dave.
> > > > >
> > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > > > >
> > > > > Please provide your feedback on it, how shall we proceed?
> > > > >
> > > > > Short summary of requirements.
> > > > > For a given mdev (mediated device [1]), there is one representor
> > > > > netdevice and devlink port in switchdev mode (similar to SR-IOV
> > > > > VF), And there is one netdevice for the actual mdev when mdev is probed.
> > > > >
> > > > > (a) representor netdev and devlink port should be able derive
> > > > > phys_port_name(). So that representor netdev name can be built
> > > > > deterministically across reboots.
> > > > >
> > > > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > > > This attribute can be used by udev rules/systemd or something else
> > > > > to rename netdev name deterministically.
> > > > >
> > > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > > > A simple grep IFNAMSIZ in stack hints hundreds of users of
> > > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more.
> > > > > Changing IFNAMSIZ for a mdev bus doesn't really look reasonable option
> > to me.
> > > >
> > > > How many characters do we really have to work with? Your examples
> > > > below prepend various characters, ex. option-1 results in ens2f0_m10
> > > > or enm10. Do the extra 8 or 3 characters in these count against IFNAMSIZ?
> > > >
> > > Maximum 15. Last is null termination.
> > > Some udev rules setting by user prefix the PF netdev interface. I took such
> > example below where ens2f0 netdev named is prefixed.
> > > Some prefer not to prefix.
> > >
> > > > > Hence, I would like to discuss below options.
> > > > >
> > > > > Option-1: mdev index
> > > > > Introduce an optional mdev index/handle as u32 during mdev create
> > > > > time. User passes mdev index/handle as input.
> > > > >
> > > > > phys_port_name=mIndex=m%u
> > > > > mdev_index will be available in sysfs as mdev attribute for udev
> > > > > to name the mdev's netdev.
> > > > >
> > > > > example mdev create command:
> > > > > UUID=$(uuidgen)
> > > > > echo $UUID index=10
> > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > > >
> > > > Nit, IIRC previous discussions of additional parameters used comma
> > > > separators, ex. echo $UUID,index=10 >...
> > > >
> > > Yes, ok.
> > >
> > > > > > example netdevs:
> > > > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */
> > > >
> > > > Is the parent really relevant in the name?
> > > No. I just picked one udev example who prefixed the parent netdev name.
> > > But there are users who do not prefix it.
> > >
> > > > Tools like mdevctl are meant to
> > > > provide persistence, creating the same mdev devices on the same
> > > > parent, but that's simply the easiest policy decision. We can also
> > > > imagine that multiple parent devices might support a specified mdev
> > > > type and policies factoring in proximity, load-balancing, power
> > > > consumption, etc might be weighed such that we really don't want to
> > > > promote userspace creating dependencies on the parent association.
> > > >
> > > > > mdev_netdev=enm10
> > > > >
> > > > > Pros:
> > > > > 1. mdevctl and any other existing tools are unaffected.
> > > > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > > > 3. achieves unique phys_port_name for representor netdev 4.
> > > > > achieves unique mdev eth netdev name for the mdev using udev/systemd
> > extension.
> > > > > 5. Aligns well with mdev and netdev subsystem and similar to
> > > > > existing sriov bdf's.
> > > >
> > > > A user provided index seems strange to me. It's not really an
> > > > index, just a user specified instance number. Presumably you have
> > > > the user providing this because if it really were an index, then the
> > > > value depends on the creation order and persistence is lost. Now
> > > > the user needs to both avoid uuid collision as well as "index"
> > > > number collision. The uuid namespace is large enough to mostly ignore
> > this, but this is not. This seems like a burden.
> > > >
> > > I liked the term 'instance number', which is lot better way to say than
> > index/handle.
> > > Yes, user needs to avoid both the collision.
> > > UUID collision should not occur in most cases, they way UUID are generated.
> > > So practically users needs to pick unique 'instance number', similar to how it
> > picks unique netdev names.
> > >
> > > Burden to user comes from the requirement to get uniqueness.
> > >
> > > > > Option-2: shorter mdev name
> > > > > Extend mdev to have shorter mdev device name in addition to UUID.
> > > > > such as 'foo', 'bar'.
> > > > > Mdev will continue to have UUID.
> > > > > phys_port_name=mdev_name
> > > > >
> > > > > Pros:
> > > > > 1. All same as option-1, except mdevctl needs upgrade for newer usage.
> > > > > It is common practice to upgrade iproute2 package along with the
> > > > > kernel. Similar practice to be done with mdevctl.
> > > > > 2. Newer users of mdevctl who wants to work with non_UUID names,
> > > > > will use newer mdevctl/tools. Cons:
> > > > > 1. Dual naming scheme of mdev might affect some of the existing tools.
> > > > > It's unclear how/if it actually affects.
> > > > > mdevctl [2] is very recently developed and can be enhanced for
> > > > > dual naming scheme.
> > > >
> > > > I think we've already nak'ed this one, the device namespace becomes
> > > > meaningless if the name becomes just a string where a uuid might be
> > > > an example string. mdevs are named by uuid.
> > > >
> > > > > Option-3: mdev uuid alias
> > > > > Instead of shorter mdev name or mdev index, have alpha-numeric
> > > > > name alias. Alias is an optional mdev sysfs attribute such as 'foo', 'bar'.
> > > > > example mdev create command:
> > > > > UUID=$(uuidgen)
> > > > > echo $UUID alias=foo
> > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > > > > > example netdevs:
> > > > > examle netdevs:
> > > > > repnetdev = ens2f0_mfoo
> > > > > mdev_netdev=enmfoo
> > > > >
> > > > > Pros:
> > > > > 1. All same as option-1.
> > > > > 2. Doesn't affect existing mdev naming scheme.
> > > > > Cons:
> > > > > 1. Index scheme of option-1 is better which can number large
> > > > > number of mdevs with fewer characters, simplifying the management
> > tool.
> > > >
> > > > No better than option-1, simply a larger secondary namespace, but
> > > > still requires the user to come up with two independent names for the
> > device.
> > > >
> > > > > Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ from 16
> > > > > to
> > > > > 64 bytes phys_port_name=mdev_UUID_string
> > mdev_netdev_name=enmUUID
> > > > >
> > > > > Pros:
> > > > > 1. Doesn't require mdev extension
> > > > > Cons:
> > > > > 1. netdev stack, driver, uapi, user space, boot config wide changes 2.
> > > > > Possible user space extensions who assumed name size being 16
> > > > > characters 3. Single device type demands namesize change for all
> > > > > netdev types
> > > >
> > > > What about an alias based on the uuid? For example, we use 160-bit
> > > > sha1s daily with git (uuids are only 128-bit), but we generally
> > > > don't reference git commits with the full 20 character string.
> > > > Generally 12 characters is recommended to avoid ambiguity. Could
> > > > mdev automatically create an
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > abbreviated sha1 alias for the device? If so, how many characters
> > > > should we
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > use and what do we do on collision? The colliding device could add
> > > > enough alias characters to disambiguate (we likely couldn't re-alias
> > > > the existing device to disambiguate, but I'm not sure it matters,
> > > > userspace has sysfs to associate aliases). Ex.
> > > >
> > > > UUID=$(uuidgen)
> > > > ALIAS=$(echo $UUID | sha1sum | colrm 13)
> > > >
> > > I explained in previous reply to Cornelia, we should set UUID and ALIAS at the
> > same time.
> > > Setting is via different sysfs attribute is lot code burden with no extra benefit.
> >
> > Just an example of the alias, not proposing how it's set. In fact, proposing that
> > the user does not set it, mdev-core provides one automatically.
> >
> > > > Since there seems to be some prefix overhead, as I ask about above
> > > > in how many characters we actually have to work with in IFNAMESZ,
> > > > maybe we start with 8 characters (matching your "index" namespace)
> > > > and expand as necessary for disambiguation. If we can eliminate
> > > > overhead in IFNAMESZ, let's start with 12. Thanks,
> > > >
> > > If user is going to choose the alias, why does it have to be limited to sha1?
> > > Or you just told it as an example?
> > >
> > > It can be an alpha-numeric string.
> >
> > No, I'm proposing a different solution where mdev-core creates an alias based
> > on an abbreviated sha1. The user does not provide the alias.
> >
> > > Instead of mdev imposing number of characters on the alias, it should be best
> > left to the user.
> > > Because in future if netdev improves on the naming scheme, mdev will be
> > limiting it, which is not right.
> > > So not restricting alias size seems right to me.
> > > User configuring mdev for networking devices in a given kernel knows what
> > user is doing.
> > > So user can choose alias name size as it finds suitable.
> >
> > That's not what I'm proposing, please read again. Thanks,
>
> I understood your point. But mdev doesn't know how user is going to use udev/systemd to name the netdev.
> So even if mdev chose to pick 12 characters, it could result in collision.
> Hence the proposal to provide the alias by the user, as user know the best policy for its use case in the environment its using.
> So 12 character sha1 method will still work by user.
Haven't you already provided examples where certain drivers or
subsystems have unique netdev prefixes? If mdev provides a unique
alias within the subsystem, couldn't we simply define a netdev prefix
for the mdev subsystem and avoid all other collisions? I'm not in favor
of the user providing both a uuid and an alias/instance. Thanks,
Alex
^ permalink raw reply
* RE: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Parav Pandit @ 2019-08-21 5:01 UTC (permalink / raw)
To: Alex Williamson
Cc: Jiri Pirko, David S . Miller, Kirti Wankhede, Cornelia Huck,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org, cjia,
netdev@vger.kernel.org
In-Reply-To: <20190820225722.237a57d2@x1.home>
> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, August 21, 2019 10:27 AM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>;
> Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck
> <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On Wed, 21 Aug 2019 04:40:15 +0000
> Parav Pandit <parav@mellanox.com> wrote:
>
> > > -----Original Message-----
> > > From: Alex Williamson <alex.williamson@redhat.com>
> > > Sent: Wednesday, August 21, 2019 9:51 AM
> > > To: Parav Pandit <parav@mellanox.com>
> > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller
> > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>;
> > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> > > netdev@vger.kernel.org
> > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > >
> > > On Wed, 21 Aug 2019 03:42:25 +0000
> > > Parav Pandit <parav@mellanox.com> wrote:
> > >
> > > > > -----Original Message-----
> > > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > > Sent: Tuesday, August 20, 2019 10:49 PM
> > > > > To: Parav Pandit <parav@mellanox.com>
> > > > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller
> > > > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>;
> > > > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> > > > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> > > > > netdev@vger.kernel.org
> > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > > >
> > > > > On Tue, 20 Aug 2019 08:58:02 +0000 Parav Pandit
> > > > > <parav@mellanox.com> wrote:
> > > > >
> > > > > > + Dave.
> > > > > >
> > > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > > > > >
> > > > > > Please provide your feedback on it, how shall we proceed?
> > > > > >
> > > > > > Short summary of requirements.
> > > > > > For a given mdev (mediated device [1]), there is one
> > > > > > representor netdevice and devlink port in switchdev mode
> > > > > > (similar to SR-IOV VF), And there is one netdevice for the actual mdev
> when mdev is probed.
> > > > > >
> > > > > > (a) representor netdev and devlink port should be able derive
> > > > > > phys_port_name(). So that representor netdev name can be built
> > > > > > deterministically across reboots.
> > > > > >
> > > > > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > > > > This attribute can be used by udev rules/systemd or something
> > > > > > else to rename netdev name deterministically.
> > > > > >
> > > > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > > > > A simple grep IFNAMSIZ in stack hints hundreds of users of
> > > > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more.
> > > > > > Changing IFNAMSIZ for a mdev bus doesn't really look
> > > > > > reasonable option
> > > to me.
> > > > >
> > > > > How many characters do we really have to work with? Your
> > > > > examples below prepend various characters, ex. option-1 results
> > > > > in ens2f0_m10 or enm10. Do the extra 8 or 3 characters in these count
> against IFNAMSIZ?
> > > > >
> > > > Maximum 15. Last is null termination.
> > > > Some udev rules setting by user prefix the PF netdev interface. I
> > > > took such
> > > example below where ens2f0 netdev named is prefixed.
> > > > Some prefer not to prefix.
> > > >
> > > > > > Hence, I would like to discuss below options.
> > > > > >
> > > > > > Option-1: mdev index
> > > > > > Introduce an optional mdev index/handle as u32 during mdev
> > > > > > create time. User passes mdev index/handle as input.
> > > > > >
> > > > > > phys_port_name=mIndex=m%u
> > > > > > mdev_index will be available in sysfs as mdev attribute for
> > > > > > udev to name the mdev's netdev.
> > > > > >
> > > > > > example mdev create command:
> > > > > > UUID=$(uuidgen)
> > > > > > echo $UUID index=10
> > > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/cr
> > > > > > > eate
> > > > >
> > > > > Nit, IIRC previous discussions of additional parameters used
> > > > > comma separators, ex. echo $UUID,index=10 >...
> > > > >
> > > > Yes, ok.
> > > >
> > > > > > > example netdevs:
> > > > > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */
> > > > >
> > > > > Is the parent really relevant in the name?
> > > > No. I just picked one udev example who prefixed the parent netdev name.
> > > > But there are users who do not prefix it.
> > > >
> > > > > Tools like mdevctl are meant to
> > > > > provide persistence, creating the same mdev devices on the same
> > > > > parent, but that's simply the easiest policy decision. We can
> > > > > also imagine that multiple parent devices might support a
> > > > > specified mdev type and policies factoring in proximity,
> > > > > load-balancing, power consumption, etc might be weighed such
> > > > > that we really don't want to promote userspace creating dependencies
> on the parent association.
> > > > >
> > > > > > mdev_netdev=enm10
> > > > > >
> > > > > > Pros:
> > > > > > 1. mdevctl and any other existing tools are unaffected.
> > > > > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > > > > 3. achieves unique phys_port_name for representor netdev 4.
> > > > > > achieves unique mdev eth netdev name for the mdev using
> > > > > > udev/systemd
> > > extension.
> > > > > > 5. Aligns well with mdev and netdev subsystem and similar to
> > > > > > existing sriov bdf's.
> > > > >
> > > > > A user provided index seems strange to me. It's not really an
> > > > > index, just a user specified instance number. Presumably you
> > > > > have the user providing this because if it really were an index,
> > > > > then the value depends on the creation order and persistence is
> > > > > lost. Now the user needs to both avoid uuid collision as well as "index"
> > > > > number collision. The uuid namespace is large enough to mostly
> > > > > ignore
> > > this, but this is not. This seems like a burden.
> > > > >
> > > > I liked the term 'instance number', which is lot better way to say
> > > > than
> > > index/handle.
> > > > Yes, user needs to avoid both the collision.
> > > > UUID collision should not occur in most cases, they way UUID are
> generated.
> > > > So practically users needs to pick unique 'instance number',
> > > > similar to how it
> > > picks unique netdev names.
> > > >
> > > > Burden to user comes from the requirement to get uniqueness.
> > > >
> > > > > > Option-2: shorter mdev name
> > > > > > Extend mdev to have shorter mdev device name in addition to UUID.
> > > > > > such as 'foo', 'bar'.
> > > > > > Mdev will continue to have UUID.
> > > > > > phys_port_name=mdev_name
> > > > > >
> > > > > > Pros:
> > > > > > 1. All same as option-1, except mdevctl needs upgrade for newer
> usage.
> > > > > > It is common practice to upgrade iproute2 package along with
> > > > > > the kernel. Similar practice to be done with mdevctl.
> > > > > > 2. Newer users of mdevctl who wants to work with non_UUID
> > > > > > names, will use newer mdevctl/tools. Cons:
> > > > > > 1. Dual naming scheme of mdev might affect some of the existing
> tools.
> > > > > > It's unclear how/if it actually affects.
> > > > > > mdevctl [2] is very recently developed and can be enhanced for
> > > > > > dual naming scheme.
> > > > >
> > > > > I think we've already nak'ed this one, the device namespace
> > > > > becomes meaningless if the name becomes just a string where a
> > > > > uuid might be an example string. mdevs are named by uuid.
> > > > >
> > > > > > Option-3: mdev uuid alias
> > > > > > Instead of shorter mdev name or mdev index, have alpha-numeric
> > > > > > name alias. Alias is an optional mdev sysfs attribute such as 'foo',
> 'bar'.
> > > > > > example mdev create command:
> > > > > > UUID=$(uuidgen)
> > > > > > echo $UUID alias=foo
> > > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/cr
> > > > > > > eate
> > > > > > > example netdevs:
> > > > > > examle netdevs:
> > > > > > repnetdev = ens2f0_mfoo
> > > > > > mdev_netdev=enmfoo
> > > > > >
> > > > > > Pros:
> > > > > > 1. All same as option-1.
> > > > > > 2. Doesn't affect existing mdev naming scheme.
> > > > > > Cons:
> > > > > > 1. Index scheme of option-1 is better which can number large
> > > > > > number of mdevs with fewer characters, simplifying the
> > > > > > management
> > > tool.
> > > > >
> > > > > No better than option-1, simply a larger secondary namespace,
> > > > > but still requires the user to come up with two independent
> > > > > names for the
> > > device.
> > > > >
> > > > > > Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ
> > > > > > from 16 to
> > > > > > 64 bytes phys_port_name=mdev_UUID_string
> > > mdev_netdev_name=enmUUID
> > > > > >
> > > > > > Pros:
> > > > > > 1. Doesn't require mdev extension
> > > > > > Cons:
> > > > > > 1. netdev stack, driver, uapi, user space, boot config wide changes 2.
> > > > > > Possible user space extensions who assumed name size being 16
> > > > > > characters 3. Single device type demands namesize change for
> > > > > > all netdev types
> > > > >
> > > > > What about an alias based on the uuid? For example, we use
> > > > > 160-bit sha1s daily with git (uuids are only 128-bit), but we
> > > > > generally don't reference git commits with the full 20 character string.
> > > > > Generally 12 characters is recommended to avoid ambiguity.
> > > > > Could mdev automatically create an
> > >
> > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > abbreviated sha1 alias for the device? If so, how many
> > > > > characters should we
> > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > use and what do we do on collision? The colliding device could
> > > > > add enough alias characters to disambiguate (we likely couldn't
> > > > > re-alias the existing device to disambiguate, but I'm not sure
> > > > > it matters, userspace has sysfs to associate aliases). Ex.
> > > > >
> > > > > UUID=$(uuidgen)
> > > > > ALIAS=$(echo $UUID | sha1sum | colrm 13)
> > > > >
> > > > I explained in previous reply to Cornelia, we should set UUID and
> > > > ALIAS at the
> > > same time.
> > > > Setting is via different sysfs attribute is lot code burden with no extra
> benefit.
> > >
> > > Just an example of the alias, not proposing how it's set. In fact,
> > > proposing that the user does not set it, mdev-core provides one
> automatically.
> > >
> > > > > Since there seems to be some prefix overhead, as I ask about
> > > > > above in how many characters we actually have to work with in
> > > > > IFNAMESZ, maybe we start with 8 characters (matching your
> > > > > "index" namespace) and expand as necessary for disambiguation.
> > > > > If we can eliminate overhead in IFNAMESZ, let's start with 12.
> > > > > Thanks,
> > > > >
> > > > If user is going to choose the alias, why does it have to be limited to sha1?
> > > > Or you just told it as an example?
> > > >
> > > > It can be an alpha-numeric string.
> > >
> > > No, I'm proposing a different solution where mdev-core creates an
> > > alias based on an abbreviated sha1. The user does not provide the alias.
> > >
> > > > Instead of mdev imposing number of characters on the alias, it
> > > > should be best
> > > left to the user.
> > > > Because in future if netdev improves on the naming scheme, mdev
> > > > will be
> > > limiting it, which is not right.
> > > > So not restricting alias size seems right to me.
> > > > User configuring mdev for networking devices in a given kernel
> > > > knows what
> > > user is doing.
> > > > So user can choose alias name size as it finds suitable.
> > >
> > > That's not what I'm proposing, please read again. Thanks,
> >
> > I understood your point. But mdev doesn't know how user is going to use
> udev/systemd to name the netdev.
> > So even if mdev chose to pick 12 characters, it could result in collision.
> > Hence the proposal to provide the alias by the user, as user know the best
> policy for its use case in the environment its using.
> > So 12 character sha1 method will still work by user.
>
> Haven't you already provided examples where certain drivers or subsystems
> have unique netdev prefixes? If mdev provides a unique alias within the
> subsystem, couldn't we simply define a netdev prefix for the mdev subsystem
> and avoid all other collisions? I'm not in favor of the user providing both a uuid
> and an alias/instance. Thanks,
>
For a given prefix, say ens2f0, can two UUID->sha1 first 9 characters have collision?
^ permalink raw reply
* RE: [PATCH v2] qed: Add cleanup in qed_slowpath_start()
From: Sudarsana Reddy Kalluru @ 2019-08-21 5:03 UTC (permalink / raw)
To: Wenwen Wang
Cc: Ariel Elior, GR-everest-linux-l2, David S. Miller,
open list:QLOGIC QL4xxx ETHERNET DRIVER, open list
In-Reply-To: <1566362796-5399-1-git-send-email-wenwen@cs.uga.edu>
> -----Original Message-----
> From: Wenwen Wang <wenwen@cs.uga.edu>
> Sent: Wednesday, August 21, 2019 10:17 AM
> To: Wenwen Wang <wenwen@cs.uga.edu>
> Cc: Sudarsana Reddy Kalluru <skalluru@marvell.com>; Ariel Elior
> <aelior@marvell.com>; GR-everest-linux-l2 <GR-everest-linux-
> l2@marvell.com>; David S. Miller <davem@davemloft.net>; open
> list:QLOGIC QL4xxx ETHERNET DRIVER <netdev@vger.kernel.org>; open list
> <linux-kernel@vger.kernel.org>
> Subject: [PATCH v2] qed: Add cleanup in qed_slowpath_start()
>
> If qed_mcp_send_drv_version() fails, no cleanup is executed, leading to
> memory leaks. To fix this issue, introduce the label 'err4' to perform the
> cleanup work before returning the error.
>
> Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
> ---
> drivers/net/ethernet/qlogic/qed/qed_main.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c
> b/drivers/net/ethernet/qlogic/qed/qed_main.c
> index 829dd60..1efff7f 100644
> --- a/drivers/net/ethernet/qlogic/qed/qed_main.c
> +++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
> @@ -1325,7 +1325,7 @@ static int qed_slowpath_start(struct qed_dev
> *cdev,
> &drv_version);
> if (rc) {
> DP_NOTICE(cdev, "Failed sending drv version
> command\n");
> - return rc;
> + goto err4;
> }
> }
>
> @@ -1333,6 +1333,8 @@ static int qed_slowpath_start(struct qed_dev
> *cdev,
>
> return 0;
>
> +err4:
> + qed_ll2_dealloc_if(cdev);
> err3:
> qed_hw_stop(cdev);
> err2:
> --
> 2.7.4
Acked-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
^ permalink raw reply
* Re: various TLS bug fixes...
From: John Fastabend @ 2019-08-21 5:18 UTC (permalink / raw)
To: Jakub Kicinski, David Miller; +Cc: netdev
In-Reply-To: <20190820172411.70250551@cakuba.netronome.com>
Jakub Kicinski wrote:
> On Tue, 20 Aug 2019 16:05:17 -0700 (PDT), David Miller wrote:
> > Jakub,
> >
> > I just did a batch of networking -stable submissions, however I ran
> > into some troubles with the various TLS backports.
>
> Yes, the TLS back ports are a little messy :S
>
> > I was able to backport commit 414776621d10 ("net/tls: prevent
> > skb_orphan() from leaking TLS plain text with offload") to v5.2
> > but not to v4.19
>
> We can probably leave that out of v4.19. The only practical scenario
> where the issue occurs to my knowledge is if admin configured TC
> redirect, or netem in the setup. Let me know if you'd rather we did the
> backport, I'm not 100% sure the risk/benefit ratio is favourable.
>
> > I was not able to backport neither d85f01775850 ("net: tls, fix
> > sk_write_space NULL write when tx disabled") nor commit 57c722e932cf
> > ("net/tls: swap sk_write_space on close") to any release. It seems
> > like there are a bunch of dependencies and perhaps other fixes.
>
> Right, those should still be applicable but John and I rejigged
> the close path quite a bit. I think the back port would be this:
Looks correct to me but would need to double check as well.
>
> diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> index 4c0ac79f82d4..3288bdff9889 100644
> --- a/net/tls/tls_main.c
> +++ b/net/tls/tls_main.c
> @@ -301,6 +301,8 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
> #else
> {
> #endif
> + if (sk->sk_write_space == tls_write_space)
> + sk->sk_write_space = ctx->sk_write_space;
> tls_ctx_free(ctx);
> ctx = NULL;
> }
>
> > I suspect you've triaged through this already on your side for other
> > reasons, so perhaps you could help come up with a sane set of TLS
> > bug fix backports that would be appropriate for -stable?
>
> I'm planning to spend tomorrow working exactly on v4.19 backport.
> I have internal reports of openssl failing on v4.19 while v4.20
> works fine.. Hopefully I'll be able to figure that one out, test the
> above and see if there are any other missing fixes.
>
> Is it okay if I come back to this tomorrow?
Is the failure with hw offload or sw case? If its sendpage related
looks like we also need to push the following patch back to 4.19,
commit 648ee6cea7dde4a5cdf817e5d964fd60b22006a4
Author: John Fastabend <john.fastabend@gmail.com>
Date: Wed Jun 12 17:23:57 2019 +0000
net: tls, correctly account for copied bytes with multiple sk_msgs
If you have more details I can also spend some cycles looking into it.
.John
^ permalink raw reply
* Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Alex Williamson @ 2019-08-21 5:26 UTC (permalink / raw)
To: Parav Pandit
Cc: Jiri Pirko, David S . Miller, Kirti Wankhede, Cornelia Huck,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org, cjia,
netdev@vger.kernel.org
In-Reply-To: <AM0PR05MB4866AE8FC4AA3CC24B08B326D1AA0@AM0PR05MB4866.eurprd05.prod.outlook.com>
On Wed, 21 Aug 2019 05:01:52 +0000
Parav Pandit <parav@mellanox.com> wrote:
> > -----Original Message-----
> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Wednesday, August 21, 2019 10:27 AM
> > To: Parav Pandit <parav@mellanox.com>
> > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>;
> > Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck
> > <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> > cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> >
> > On Wed, 21 Aug 2019 04:40:15 +0000
> > Parav Pandit <parav@mellanox.com> wrote:
> >
> > > > -----Original Message-----
> > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > Sent: Wednesday, August 21, 2019 9:51 AM
> > > > To: Parav Pandit <parav@mellanox.com>
> > > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller
> > > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>;
> > > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> > > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> > > > netdev@vger.kernel.org
> > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > >
> > > > On Wed, 21 Aug 2019 03:42:25 +0000
> > > > Parav Pandit <parav@mellanox.com> wrote:
> > > >
> > > > > > -----Original Message-----
> > > > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > > > Sent: Tuesday, August 20, 2019 10:49 PM
> > > > > > To: Parav Pandit <parav@mellanox.com>
> > > > > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller
> > > > > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>;
> > > > > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> > > > > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> > > > > > netdev@vger.kernel.org
> > > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > > > >
> > > > > > On Tue, 20 Aug 2019 08:58:02 +0000 Parav Pandit
> > > > > > <parav@mellanox.com> wrote:
> > > > > >
> > > > > > > + Dave.
> > > > > > >
> > > > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > > > > > >
> > > > > > > Please provide your feedback on it, how shall we proceed?
> > > > > > >
> > > > > > > Short summary of requirements.
> > > > > > > For a given mdev (mediated device [1]), there is one
> > > > > > > representor netdevice and devlink port in switchdev mode
> > > > > > > (similar to SR-IOV VF), And there is one netdevice for the actual mdev
> > when mdev is probed.
> > > > > > >
> > > > > > > (a) representor netdev and devlink port should be able derive
> > > > > > > phys_port_name(). So that representor netdev name can be built
> > > > > > > deterministically across reboots.
> > > > > > >
> > > > > > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > > > > > This attribute can be used by udev rules/systemd or something
> > > > > > > else to rename netdev name deterministically.
> > > > > > >
> > > > > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > > > > > A simple grep IFNAMSIZ in stack hints hundreds of users of
> > > > > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more.
> > > > > > > Changing IFNAMSIZ for a mdev bus doesn't really look
> > > > > > > reasonable option
> > > > to me.
> > > > > >
> > > > > > How many characters do we really have to work with? Your
> > > > > > examples below prepend various characters, ex. option-1 results
> > > > > > in ens2f0_m10 or enm10. Do the extra 8 or 3 characters in these count
> > against IFNAMSIZ?
> > > > > >
> > > > > Maximum 15. Last is null termination.
> > > > > Some udev rules setting by user prefix the PF netdev interface. I
> > > > > took such
> > > > example below where ens2f0 netdev named is prefixed.
> > > > > Some prefer not to prefix.
> > > > >
> > > > > > > Hence, I would like to discuss below options.
> > > > > > >
> > > > > > > Option-1: mdev index
> > > > > > > Introduce an optional mdev index/handle as u32 during mdev
> > > > > > > create time. User passes mdev index/handle as input.
> > > > > > >
> > > > > > > phys_port_name=mIndex=m%u
> > > > > > > mdev_index will be available in sysfs as mdev attribute for
> > > > > > > udev to name the mdev's netdev.
> > > > > > >
> > > > > > > example mdev create command:
> > > > > > > UUID=$(uuidgen)
> > > > > > > echo $UUID index=10
> > > > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/cr
> > > > > > > > eate
> > > > > >
> > > > > > Nit, IIRC previous discussions of additional parameters used
> > > > > > comma separators, ex. echo $UUID,index=10 >...
> > > > > >
> > > > > Yes, ok.
> > > > >
> > > > > > > > example netdevs:
> > > > > > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */
> > > > > >
> > > > > > Is the parent really relevant in the name?
> > > > > No. I just picked one udev example who prefixed the parent netdev name.
> > > > > But there are users who do not prefix it.
> > > > >
> > > > > > Tools like mdevctl are meant to
> > > > > > provide persistence, creating the same mdev devices on the same
> > > > > > parent, but that's simply the easiest policy decision. We can
> > > > > > also imagine that multiple parent devices might support a
> > > > > > specified mdev type and policies factoring in proximity,
> > > > > > load-balancing, power consumption, etc might be weighed such
> > > > > > that we really don't want to promote userspace creating dependencies
> > on the parent association.
> > > > > >
> > > > > > > mdev_netdev=enm10
> > > > > > >
> > > > > > > Pros:
> > > > > > > 1. mdevctl and any other existing tools are unaffected.
> > > > > > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > > > > > 3. achieves unique phys_port_name for representor netdev 4.
> > > > > > > achieves unique mdev eth netdev name for the mdev using
> > > > > > > udev/systemd
> > > > extension.
> > > > > > > 5. Aligns well with mdev and netdev subsystem and similar to
> > > > > > > existing sriov bdf's.
> > > > > >
> > > > > > A user provided index seems strange to me. It's not really an
> > > > > > index, just a user specified instance number. Presumably you
> > > > > > have the user providing this because if it really were an index,
> > > > > > then the value depends on the creation order and persistence is
> > > > > > lost. Now the user needs to both avoid uuid collision as well as "index"
> > > > > > number collision. The uuid namespace is large enough to mostly
> > > > > > ignore
> > > > this, but this is not. This seems like a burden.
> > > > > >
> > > > > I liked the term 'instance number', which is lot better way to say
> > > > > than
> > > > index/handle.
> > > > > Yes, user needs to avoid both the collision.
> > > > > UUID collision should not occur in most cases, they way UUID are
> > generated.
> > > > > So practically users needs to pick unique 'instance number',
> > > > > similar to how it
> > > > picks unique netdev names.
> > > > >
> > > > > Burden to user comes from the requirement to get uniqueness.
> > > > >
> > > > > > > Option-2: shorter mdev name
> > > > > > > Extend mdev to have shorter mdev device name in addition to UUID.
> > > > > > > such as 'foo', 'bar'.
> > > > > > > Mdev will continue to have UUID.
> > > > > > > phys_port_name=mdev_name
> > > > > > >
> > > > > > > Pros:
> > > > > > > 1. All same as option-1, except mdevctl needs upgrade for newer
> > usage.
> > > > > > > It is common practice to upgrade iproute2 package along with
> > > > > > > the kernel. Similar practice to be done with mdevctl.
> > > > > > > 2. Newer users of mdevctl who wants to work with non_UUID
> > > > > > > names, will use newer mdevctl/tools. Cons:
> > > > > > > 1. Dual naming scheme of mdev might affect some of the existing
> > tools.
> > > > > > > It's unclear how/if it actually affects.
> > > > > > > mdevctl [2] is very recently developed and can be enhanced for
> > > > > > > dual naming scheme.
> > > > > >
> > > > > > I think we've already nak'ed this one, the device namespace
> > > > > > becomes meaningless if the name becomes just a string where a
> > > > > > uuid might be an example string. mdevs are named by uuid.
> > > > > >
> > > > > > > Option-3: mdev uuid alias
> > > > > > > Instead of shorter mdev name or mdev index, have alpha-numeric
> > > > > > > name alias. Alias is an optional mdev sysfs attribute such as 'foo',
> > 'bar'.
> > > > > > > example mdev create command:
> > > > > > > UUID=$(uuidgen)
> > > > > > > echo $UUID alias=foo
> > > > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/cr
> > > > > > > > eate
> > > > > > > > example netdevs:
> > > > > > > examle netdevs:
> > > > > > > repnetdev = ens2f0_mfoo
> > > > > > > mdev_netdev=enmfoo
> > > > > > >
> > > > > > > Pros:
> > > > > > > 1. All same as option-1.
> > > > > > > 2. Doesn't affect existing mdev naming scheme.
> > > > > > > Cons:
> > > > > > > 1. Index scheme of option-1 is better which can number large
> > > > > > > number of mdevs with fewer characters, simplifying the
> > > > > > > management
> > > > tool.
> > > > > >
> > > > > > No better than option-1, simply a larger secondary namespace,
> > > > > > but still requires the user to come up with two independent
> > > > > > names for the
> > > > device.
> > > > > >
> > > > > > > Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ
> > > > > > > from 16 to
> > > > > > > 64 bytes phys_port_name=mdev_UUID_string
> > > > mdev_netdev_name=enmUUID
> > > > > > >
> > > > > > > Pros:
> > > > > > > 1. Doesn't require mdev extension
> > > > > > > Cons:
> > > > > > > 1. netdev stack, driver, uapi, user space, boot config wide changes 2.
> > > > > > > Possible user space extensions who assumed name size being 16
> > > > > > > characters 3. Single device type demands namesize change for
> > > > > > > all netdev types
> > > > > >
> > > > > > What about an alias based on the uuid? For example, we use
> > > > > > 160-bit sha1s daily with git (uuids are only 128-bit), but we
> > > > > > generally don't reference git commits with the full 20 character string.
> > > > > > Generally 12 characters is recommended to avoid ambiguity.
> > > > > > Could mdev automatically create an
> > > >
> > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > > abbreviated sha1 alias for the device? If so, how many
> > > > > > characters should we
> > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > > use and what do we do on collision? The colliding device could
> > > > > > add enough alias characters to disambiguate (we likely couldn't
> > > > > > re-alias the existing device to disambiguate, but I'm not sure
> > > > > > it matters, userspace has sysfs to associate aliases). Ex.
> > > > > >
> > > > > > UUID=$(uuidgen)
> > > > > > ALIAS=$(echo $UUID | sha1sum | colrm 13)
> > > > > >
> > > > > I explained in previous reply to Cornelia, we should set UUID and
> > > > > ALIAS at the
> > > > same time.
> > > > > Setting is via different sysfs attribute is lot code burden with no extra
> > benefit.
> > > >
> > > > Just an example of the alias, not proposing how it's set. In fact,
> > > > proposing that the user does not set it, mdev-core provides one
> > automatically.
> > > >
> > > > > > Since there seems to be some prefix overhead, as I ask about
> > > > > > above in how many characters we actually have to work with in
> > > > > > IFNAMESZ, maybe we start with 8 characters (matching your
> > > > > > "index" namespace) and expand as necessary for disambiguation.
> > > > > > If we can eliminate overhead in IFNAMESZ, let's start with 12.
> > > > > > Thanks,
> > > > > >
> > > > > If user is going to choose the alias, why does it have to be limited to sha1?
> > > > > Or you just told it as an example?
> > > > >
> > > > > It can be an alpha-numeric string.
> > > >
> > > > No, I'm proposing a different solution where mdev-core creates an
> > > > alias based on an abbreviated sha1. The user does not provide the alias.
> > > >
> > > > > Instead of mdev imposing number of characters on the alias, it
> > > > > should be best
> > > > left to the user.
> > > > > Because in future if netdev improves on the naming scheme, mdev
> > > > > will be
> > > > limiting it, which is not right.
> > > > > So not restricting alias size seems right to me.
> > > > > User configuring mdev for networking devices in a given kernel
> > > > > knows what
> > > > user is doing.
> > > > > So user can choose alias name size as it finds suitable.
> > > >
> > > > That's not what I'm proposing, please read again. Thanks,
> > >
> > > I understood your point. But mdev doesn't know how user is going to use
> > udev/systemd to name the netdev.
> > > So even if mdev chose to pick 12 characters, it could result in collision.
> > > Hence the proposal to provide the alias by the user, as user know the best
> > policy for its use case in the environment its using.
> > > So 12 character sha1 method will still work by user.
> >
> > Haven't you already provided examples where certain drivers or subsystems
> > have unique netdev prefixes? If mdev provides a unique alias within the
> > subsystem, couldn't we simply define a netdev prefix for the mdev subsystem
> > and avoid all other collisions? I'm not in favor of the user providing both a uuid
> > and an alias/instance. Thanks,
> >
> For a given prefix, say ens2f0, can two UUID->sha1 first 9 characters have collision?
I think it would be a mistake to waste so many chars on a prefix, but 9
characters of sha1 likely wouldn't have a collision before we have 10s
of thousands of devices. Thanks,
Alex
^ permalink raw reply
* RE: [EXT] Re: [PATCH net-next 0/1] net: fec: add C45 MDIO read/write support
From: Andy Duan @ 2019-08-21 5:55 UTC (permalink / raw)
To: Andrew Lunn
Cc: Marco Hartmann, davem@davemloft.net, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, Christian Herber
In-Reply-To: <20190820130403.GH29991@lunn.ch>
From: Andrew Lunn <andrew@lunn.ch> Sent: Tuesday, August 20, 2019 9:04 PM
> On Tue, Aug 20, 2019 at 02:32:26AM +0000, Andy Duan wrote:
> > From: Andrew Lunn <andrew@lunn.ch>
> > > On Mon, Aug 19, 2019 at 05:11:14PM +0000, Marco Hartmann wrote:
> > > > As of yet, the Fast Ethernet Controller (FEC) driver only supports
> > > > Clause 22 conform MDIO transactions. IEEE 802.3ae Clause 45
> > > > defines a modified MDIO protocol that uses a two staged access
> > > > model in order to increase the address space.
> > > >
> > > > This patch adds support for Clause 45 conform MDIO read and write
> > > > operations to the FEC driver.
> > >
> > > Hi Marco
> > >
> > > Do all versions of the FEC hardware support C45? Or do we need to
> > > make use of the quirk support in this driver to just enable it for some
> revisions of FEC?
> > >
> > > Thanks
> > > Andrew
> >
> > i.MX legacy platforms like i.MX6/7 series, they doesn't support Write & Read
> Increment.
> > But for i.MX8MQ/MM series, it support C45 full features like Write & Read
> Increment.
> >
> > For the patch itself, it doesn't support Write & Read Increment, so I
> > think the patch doesn't need to add quirk support.
>
> Hi Andy
>
> So what happens with something older than a i.MX8MQ/MM when a C45
> transfer is attempted? This patch adds a new write. Does that write
> immediately trigger a completion interrupt? Does it never trigger an interrupt,
> and we have to wait FEC_MII_TIMEOUT?
>
> Ideally, if the hardware does not support C45, we want it to return
> EOPNOTSUPP.
>
> Thanks
> Andrew
It still trigger an interrupt to wakeup the completion, we have to wait FEC_MII_TIMEOUT.
Older chips just support part of C45 feature just like the patch implementation.
^ permalink raw reply
* pull-request: wireless-drivers 2019-08-21
From: Kalle Valo @ 2019-08-21 6:07 UTC (permalink / raw)
To: David Miller; +Cc: linux-wireless, netdev, linux-kernel, Johannes Berg
Hi Dave,
here's a pull request to net for 5.3, more info below. I will be offline
the next week, but Johannes should be able to help if there are any
issues.
Kalle
The following changes since commit d1abaeb3be7b5fa6d7a1fbbd2e14e3310005c4c1:
Linux 5.3-rc5 (2019-08-18 14:31:08 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git tags/wireless-drivers-for-davem-2019-08-21
for you to fetch changes up to 5a8c31aa63578cb0ff390a57537f1cb4b312a1ed:
iwlwifi: pcie: fix recognition of QuZ devices (2019-08-20 17:00:42 +0300)
----------------------------------------------------------------
wireless-drivers fixes for 5.3
Third set of fixes for 5.3, and most likely the last one. The rt2x00
regression has been reported multiple times, others are of lower
priority.
mt76
* fix hang on resume on certain machines
rt2x00
* fix AP mode regression related to encryption
iwlwifi
* avoid unnecessary error messages due to multicast frames when not
associated
* fix configuration for ax201 devices
* fix recognition of QuZ devices
----------------------------------------------------------------
Emmanuel Grumbach (1):
iwlwifi: pcie: fix the byte count table format for 22560 devices
Ilan Peer (1):
iwlwifi: mvm: Allow multicast data frames only when associated
Luca Coelho (2):
iwlwifi: pcie: don't switch FW to qnj when ax201 is detected
iwlwifi: pcie: fix recognition of QuZ devices
Stanislaw Gruszka (2):
mt76: mt76x0u: do not reset radio on resume
rt2x00: clear IV's on start to fix AP mode regression
drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c | 33 ++++++++++++++++++++---
drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c | 10 +++++++
drivers/net/wireless/intel/iwlwifi/pcie/drv.c | 17 ++++++++++++
drivers/net/wireless/intel/iwlwifi/pcie/trans.c | 1 +
drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c | 20 +++++++++-----
drivers/net/wireless/mediatek/mt76/mt76x0/usb.c | 8 +++---
drivers/net/wireless/ralink/rt2x00/rt2800lib.c | 9 +++++++
drivers/net/wireless/ralink/rt2x00/rt2x00.h | 1 +
drivers/net/wireless/ralink/rt2x00/rt2x00dev.c | 13 +++++----
9 files changed, 93 insertions(+), 19 deletions(-)
^ permalink raw reply
* RE: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Parav Pandit @ 2019-08-21 6:23 UTC (permalink / raw)
To: Alex Williamson
Cc: Jiri Pirko, David S . Miller, Kirti Wankhede, Cornelia Huck,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org, cjia,
netdev@vger.kernel.org
In-Reply-To: <20190820232622.164962d3@x1.home>
> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, August 21, 2019 10:56 AM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>;
> Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck
> <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> > > > > Just an example of the alias, not proposing how it's set. In
> > > > > fact, proposing that the user does not set it, mdev-core
> > > > > provides one
> > > automatically.
> > > > >
> > > > > > > Since there seems to be some prefix overhead, as I ask about
> > > > > > > above in how many characters we actually have to work with
> > > > > > > in IFNAMESZ, maybe we start with 8 characters (matching your
> > > > > > > "index" namespace) and expand as necessary for disambiguation.
> > > > > > > If we can eliminate overhead in IFNAMESZ, let's start with 12.
> > > > > > > Thanks,
> > > > > > >
> > > > > > If user is going to choose the alias, why does it have to be limited to
> sha1?
> > > > > > Or you just told it as an example?
> > > > > >
> > > > > > It can be an alpha-numeric string.
> > > > >
> > > > > No, I'm proposing a different solution where mdev-core creates
> > > > > an alias based on an abbreviated sha1. The user does not provide the
> alias.
> > > > >
> > > > > > Instead of mdev imposing number of characters on the alias, it
> > > > > > should be best
> > > > > left to the user.
> > > > > > Because in future if netdev improves on the naming scheme,
> > > > > > mdev will be
> > > > > limiting it, which is not right.
> > > > > > So not restricting alias size seems right to me.
> > > > > > User configuring mdev for networking devices in a given kernel
> > > > > > knows what
> > > > > user is doing.
> > > > > > So user can choose alias name size as it finds suitable.
> > > > >
> > > > > That's not what I'm proposing, please read again. Thanks,
> > > >
> > > > I understood your point. But mdev doesn't know how user is going
> > > > to use
> > > udev/systemd to name the netdev.
> > > > So even if mdev chose to pick 12 characters, it could result in collision.
> > > > Hence the proposal to provide the alias by the user, as user know
> > > > the best
> > > policy for its use case in the environment its using.
> > > > So 12 character sha1 method will still work by user.
> > >
> > > Haven't you already provided examples where certain drivers or
> > > subsystems have unique netdev prefixes? If mdev provides a unique
> > > alias within the subsystem, couldn't we simply define a netdev
> > > prefix for the mdev subsystem and avoid all other collisions? I'm
> > > not in favor of the user providing both a uuid and an
> > > alias/instance. Thanks,
> > >
> > For a given prefix, say ens2f0, can two UUID->sha1 first 9 characters have
> collision?
>
> I think it would be a mistake to waste so many chars on a prefix, but 9
> characters of sha1 likely wouldn't have a collision before we have 10s of
> thousands of devices. Thanks,
>
> Alex
Jiri, Dave,
Are you ok with it for devlink/netdev part?
Mdev core will create an alias from a UUID.
This will be supplied during devlink port attr set such as,
devlink_port_attrs_mdev_set(struct devlink_port *port, const char *mdev_alias);
This alias is used to generate representor netdev's phys_port_name.
This alias from the mdev device's sysfs will be used by the udev/systemd to generate predicable netdev's name.
Example: enm<mdev_alias_first_12_chars>
I took Ethernet mdev as an example.
New prefix 'm' stands for mediated device.
Remaining 12 characters are first 12 chars of the mdev alias.
^ permalink raw reply
* [PATCH net-next] net: dsa: remove bitmap operations
From: Vivien Didelot @ 2019-08-21 6:24 UTC (permalink / raw)
To: netdev; +Cc: davem, f.fainelli, andrew, olteanv, Vivien Didelot
The bitmap operations were introduced to simplify the switch drivers
in the future, since most of them could implement the common VLAN and
MDB operations (add, del, dump) with simple functions taking all target
ports at once, and thus limiting the number of hardware accesses.
Programming an MDB or VLAN this way in a single operation would clearly
simplify the drivers a lot but would require a new get-set interface
in DSA. The usage of such bitmap from the stack also raised concerned
in the past, leading to the dynamic allocation of a new ds->_bitmap
member in the dsa_switch structure. So let's get rid of them for now.
This commit nicely wraps the ds->ops->port_{mdb,vlan}_{prepare,add}
switch operations into new dsa_switch_{mdb,vlan}_{prepare,add}
variants not using any bitmap argument anymore.
New dsa_switch_{mdb,vlan}_match helpers have been introduced to make
clear which local port of a switch must be programmed with the target
object. While the targeted user port is an obvious candidate, the
DSA links must also be programmed, as well as the CPU port for VLANs.
While at it, also remove local variables that are only used once.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
---
include/net/dsa.h | 3 --
net/dsa/dsa2.c | 14 -----
net/dsa/switch.c | 132 +++++++++++++++++++++-------------------------
3 files changed, 59 insertions(+), 90 deletions(-)
diff --git a/include/net/dsa.h b/include/net/dsa.h
index 147b757ef8ea..96acb14ec1a8 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -275,9 +275,6 @@ struct dsa_switch {
*/
bool vlan_filtering;
- unsigned long *bitmap;
- unsigned long _bitmap;
-
/* Dynamically allocated ports, keep last */
size_t num_ports;
struct dsa_port ports[];
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 8c4eccb0cfe6..f8445fa73448 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -834,20 +834,6 @@ struct dsa_switch *dsa_switch_alloc(struct device *dev, size_t n)
if (!ds)
return NULL;
- /* We avoid allocating memory outside dsa_switch
- * if it is not needed.
- */
- if (n <= sizeof(ds->_bitmap) * 8) {
- ds->bitmap = &ds->_bitmap;
- } else {
- ds->bitmap = devm_kcalloc(dev,
- BITS_TO_LONGS(n),
- sizeof(unsigned long),
- GFP_KERNEL);
- if (unlikely(!ds->bitmap))
- return NULL;
- }
-
ds->dev = dev;
ds->num_ports = n;
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index 09d9286b27cc..489eb7b430a4 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -128,57 +128,51 @@ static int dsa_switch_fdb_del(struct dsa_switch *ds,
return ds->ops->port_fdb_del(ds, port, info->addr, info->vid);
}
-static int
-dsa_switch_mdb_prepare_bitmap(struct dsa_switch *ds,
- const struct switchdev_obj_port_mdb *mdb,
- const unsigned long *bitmap)
+static bool dsa_switch_mdb_match(struct dsa_switch *ds, int port,
+ struct dsa_notifier_mdb_info *info)
+{
+ if (ds->index == info->sw_index && port == info->port)
+ return true;
+
+ if (dsa_is_dsa_port(ds, port))
+ return true;
+
+ return false;
+}
+
+static int dsa_switch_mdb_prepare(struct dsa_switch *ds,
+ struct dsa_notifier_mdb_info *info)
{
int port, err;
if (!ds->ops->port_mdb_prepare || !ds->ops->port_mdb_add)
return -EOPNOTSUPP;
- for_each_set_bit(port, bitmap, ds->num_ports) {
- err = ds->ops->port_mdb_prepare(ds, port, mdb);
- if (err)
- return err;
+ for (port = 0; port < ds->num_ports; port++) {
+ if (dsa_switch_mdb_match(ds, port, info)) {
+ err = ds->ops->port_mdb_prepare(ds, port, info->mdb);
+ if (err)
+ return err;
+ }
}
return 0;
}
-static void dsa_switch_mdb_add_bitmap(struct dsa_switch *ds,
- const struct switchdev_obj_port_mdb *mdb,
- const unsigned long *bitmap)
-{
- int port;
-
- if (!ds->ops->port_mdb_add)
- return;
-
- for_each_set_bit(port, bitmap, ds->num_ports)
- ds->ops->port_mdb_add(ds, port, mdb);
-}
-
static int dsa_switch_mdb_add(struct dsa_switch *ds,
struct dsa_notifier_mdb_info *info)
{
- const struct switchdev_obj_port_mdb *mdb = info->mdb;
- struct switchdev_trans *trans = info->trans;
int port;
- /* Build a mask of Multicast group members */
- bitmap_zero(ds->bitmap, ds->num_ports);
- if (ds->index == info->sw_index)
- set_bit(info->port, ds->bitmap);
- for (port = 0; port < ds->num_ports; port++)
- if (dsa_is_dsa_port(ds, port))
- set_bit(port, ds->bitmap);
+ if (switchdev_trans_ph_prepare(info->trans))
+ return dsa_switch_mdb_prepare(ds, info);
- if (switchdev_trans_ph_prepare(trans))
- return dsa_switch_mdb_prepare_bitmap(ds, mdb, ds->bitmap);
+ if (!ds->ops->port_mdb_add)
+ return 0;
- dsa_switch_mdb_add_bitmap(ds, mdb, ds->bitmap);
+ for (port = 0; port < ds->num_ports; port++)
+ if (dsa_switch_mdb_match(ds, port, info))
+ ds->ops->port_mdb_add(ds, port, info->mdb);
return 0;
}
@@ -186,13 +180,11 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
static int dsa_switch_mdb_del(struct dsa_switch *ds,
struct dsa_notifier_mdb_info *info)
{
- const struct switchdev_obj_port_mdb *mdb = info->mdb;
-
if (!ds->ops->port_mdb_del)
return -EOPNOTSUPP;
if (ds->index == info->sw_index)
- return ds->ops->port_mdb_del(ds, info->port, mdb);
+ return ds->ops->port_mdb_del(ds, info->port, info->mdb);
return 0;
}
@@ -234,59 +226,55 @@ static int dsa_port_vlan_check(struct dsa_switch *ds, int port,
(void *)vlan);
}
-static int
-dsa_switch_vlan_prepare_bitmap(struct dsa_switch *ds,
- const struct switchdev_obj_port_vlan *vlan,
- const unsigned long *bitmap)
+static bool dsa_switch_vlan_match(struct dsa_switch *ds, int port,
+ struct dsa_notifier_vlan_info *info)
+{
+ if (ds->index == info->sw_index && port == info->port)
+ return true;
+
+ if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
+ return true;
+
+ return false;
+}
+
+static int dsa_switch_vlan_prepare(struct dsa_switch *ds,
+ struct dsa_notifier_vlan_info *info)
{
int port, err;
if (!ds->ops->port_vlan_prepare || !ds->ops->port_vlan_add)
return -EOPNOTSUPP;
- for_each_set_bit(port, bitmap, ds->num_ports) {
- err = dsa_port_vlan_check(ds, port, vlan);
- if (err)
- return err;
+ for (port = 0; port < ds->num_ports; port++) {
+ if (dsa_switch_vlan_match(ds, port, info)) {
+ err = dsa_port_vlan_check(ds, port, info->vlan);
+ if (err)
+ return err;
- err = ds->ops->port_vlan_prepare(ds, port, vlan);
- if (err)
- return err;
+ err = ds->ops->port_vlan_prepare(ds, port, info->vlan);
+ if (err)
+ return err;
+ }
}
return 0;
}
-static void
-dsa_switch_vlan_add_bitmap(struct dsa_switch *ds,
- const struct switchdev_obj_port_vlan *vlan,
- const unsigned long *bitmap)
-{
- int port;
-
- for_each_set_bit(port, bitmap, ds->num_ports)
- ds->ops->port_vlan_add(ds, port, vlan);
-}
-
static int dsa_switch_vlan_add(struct dsa_switch *ds,
struct dsa_notifier_vlan_info *info)
{
- const struct switchdev_obj_port_vlan *vlan = info->vlan;
- struct switchdev_trans *trans = info->trans;
int port;
- /* Build a mask of VLAN members */
- bitmap_zero(ds->bitmap, ds->num_ports);
- if (ds->index == info->sw_index)
- set_bit(info->port, ds->bitmap);
- for (port = 0; port < ds->num_ports; port++)
- if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
- set_bit(port, ds->bitmap);
+ if (switchdev_trans_ph_prepare(info->trans))
+ return dsa_switch_vlan_prepare(ds, info);
- if (switchdev_trans_ph_prepare(trans))
- return dsa_switch_vlan_prepare_bitmap(ds, vlan, ds->bitmap);
+ if (!ds->ops->port_vlan_add)
+ return 0;
- dsa_switch_vlan_add_bitmap(ds, vlan, ds->bitmap);
+ for (port = 0; port < ds->num_ports; port++)
+ if (dsa_switch_vlan_match(ds, port, info))
+ ds->ops->port_vlan_add(ds, port, info->vlan);
return 0;
}
@@ -294,13 +282,11 @@ static int dsa_switch_vlan_add(struct dsa_switch *ds,
static int dsa_switch_vlan_del(struct dsa_switch *ds,
struct dsa_notifier_vlan_info *info)
{
- const struct switchdev_obj_port_vlan *vlan = info->vlan;
-
if (!ds->ops->port_vlan_del)
return -EOPNOTSUPP;
if (ds->index == info->sw_index)
- return ds->ops->port_vlan_del(ds, info->port, vlan);
+ return ds->ops->port_vlan_del(ds, info->port, info->vlan);
return 0;
}
--
2.23.0
^ permalink raw reply related
* Re: [pull request][net-next v2 00/16] Mellanox, mlx5 devlink RX health reporters
From: David Miller @ 2019-08-21 6:36 UTC (permalink / raw)
To: saeedm; +Cc: netdev
In-Reply-To: <20190820202352.2995-1-saeedm@mellanox.com>
From: Saeed Mahameed <saeedm@mellanox.com>
Date: Tue, 20 Aug 2019 20:24:10 +0000
> This series is adding a new devlink health reporter for RX related
> errors from Aya.
>
> Last two patches from Vlad and Gavi, are trivial fixes for previously
> submitted patches on this release cycle.
>
> v1->v2:
> - Improve reversed xmas tree variable declaration.
> - Rebase on top of net-next to avoid a new conflict due to latest
> merge with net.
>
> For more information please see tag log below.
>
> Please pull and let me know if there is any problem.
Pulled, thanks Saeed.
^ permalink raw reply
* Re: INFO: task hung in tls_sw_release_resources_tx
From: Steffen Klassert @ 2019-08-21 6:37 UTC (permalink / raw)
To: Jakub Kicinski, syzbot, ast, aviadye, borisp, bpf, daniel,
davejwatson, davem, hdanton, john.fastabend, netdev,
syzkaller-bugs, herbert, linux-crypto
In-Reply-To: <20190817054743.GE8209@sol.localdomain>
On Fri, Aug 16, 2019 at 10:47:43PM -0700, Eric Biggers wrote:
> [+Steffen, who is the maintainer of pcrypt]
>
> On Fri, Aug 16, 2019 at 07:02:34PM -0700, Jakub Kicinski wrote:
> > On Thu, 15 Aug 2019 11:06:00 -0700, syzbot wrote:
> > > syzbot has bisected this bug to:
> > >
> > > commit 130b392c6cd6b2aed1b7eb32253d4920babb4891
> > > Author: Dave Watson <davejwatson@fb.com>
> > > Date: Wed Jan 30 21:58:31 2019 +0000
> > >
> > > net: tls: Add tls 1.3 support
> > >
> > > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=118e8dee600000
> > > start commit: 6d5afe20 sctp: fix memleak in sctp_send_reset_streams
> > > git tree: net
> > > final crash: https://syzkaller.appspot.com/x/report.txt?x=138e8dee600000
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=158e8dee600000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=6a9ff159672dfbb41c95
> > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=17cb0502600000
> > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=14d5dc22600000
> > >
> > > Reported-by: syzbot+6a9ff159672dfbb41c95@syzkaller.appspotmail.com
> > > Fixes: 130b392c6cd6 ("net: tls: Add tls 1.3 support")
> > >
> > > For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> >
> > CC Herbert, linux-crypto
> >
> > This is got to be something in the crypto code :S
> >
> > The test case opens a ktls socket and back log writes to it.
> > Then it opens a AF_ALG socket, binds "pcrypt(gcm(aes))" and dies.
> >
> > The ktls socket upon close waits for async crypto callbacks, but they
> > never come. If I unset CRYPTO_USER_API_AEAD or change the alg to bind
> > to "gcm(aes)" the bug does not trigger.
> >
> > Any suggestions?
>
> Seeing as pcrypt is involved and this is a "task hung" bug, this is probably
> caused by the recursive pcrypt deadlock, which is yet to be fixed.
>
> See the original thread for more info:
>
> https://groups.google.com/forum/#!msg/syzkaller-bugs/1_CXUd3gBcg/BvsRLH0lAgAJ
>
> And the syzbot dashboard link:
>
> https://syzkaller.appspot.com/bug?id=178f2528d10720d563091fb51dceb4cb20f75525
>
> Let's tell syzbot this is a duplicate:
>
> #syz dup: INFO: task hung in aead_recvmsg
>
>
> Steffen, do you have any plan to fix this?
I've tried to use different padata instances for each pcrypt template,
but then each pcrypt template needs to expose its cpumask configuration
to a new file in /sys/kernel/pcrypt/. Currently we have one file
there for the encrytion and on for the decryption cpumask. If we have
more than these two files, we need some naming convention to now which
pcrypt template we want to configure. That would be a bit odd because
a such a nested pcrypt in pcrypt algorithm would not make sense at all.
I still think we should somehow forbid these nested configurations.
If I remember correct, the only objection to your original patch
was that it would still deadlock if an underlying algorithm uses
pcrypt as a fallback.
Maybe we can use your patch and also refuse instanitating if an
underlying algorithm needs a fallback.
The patch would look like this then:
Subject: [PATCH] crypto: pcrypt - forbid recursive instantiation
If the pcrypt template is used multiple times in an algorithm, then a
deadlock occurs because all pcrypt instances share the same
padata_instance, which completes requests in the order submitted. That
is, the inner pcrypt request waits for the outer pcrypt request while
the outer request is already waiting for the inner.
Fix this by making pcrypt forbid instantiation if pcrypt appears in the
underlying ->cra_driver_name and if an underlying algorithm needs a
fallback. This is somewhat of a hack, but it's a simple fix that should
be sufficient to prevent the deadlock.
Reproducer:
#include <linux/if_alg.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
struct sockaddr_alg addr = {
.salg_type = "aead",
.salg_name = "pcrypt(pcrypt(rfc4106-gcm-aesni))"
};
int algfd, reqfd;
char buf[32] = { 0 };
algfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
bind(algfd, (void *)&addr, sizeof(addr));
setsockopt(algfd, SOL_ALG, ALG_SET_KEY, buf, 20);
reqfd = accept(algfd, 0, 0);
write(reqfd, buf, 32);
read(reqfd, buf, 16);
}
Reported-by: syzbot+56c7151cad94eec37c521f0e47d2eee53f9361c4@syzkaller.appspotmail.com
Fixes: 5068c7a883d1 ("crypto: pcrypt - Add pcrypt crypto parallelization wrapper")
Cc: <stable@vger.kernel.org> # v2.6.34+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
crypto/pcrypt.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
index 543792e0ebf0..932a77b61b47 100644
--- a/crypto/pcrypt.c
+++ b/crypto/pcrypt.c
@@ -198,6 +198,12 @@ static void pcrypt_free(struct aead_instance *inst)
static int pcrypt_init_instance(struct crypto_instance *inst,
struct crypto_alg *alg)
{
+ /* Recursive pcrypt deadlocks due to the shared padata_instance */
+ if (!strncmp(alg->cra_driver_name, "pcrypt(", 7) ||
+ strstr(alg->cra_driver_name, "(pcrypt(") ||
+ strstr(alg->cra_driver_name, ",pcrypt("))
+ return -EINVAL;
+
if (snprintf(inst->alg.cra_driver_name, CRYPTO_MAX_ALG_NAME,
"pcrypt(%s)", alg->cra_driver_name) >= CRYPTO_MAX_ALG_NAME)
return -ENAMETOOLONG;
@@ -236,7 +242,7 @@ static int pcrypt_create_aead(struct crypto_template *tmpl, struct rtattr **tb,
ctx = aead_instance_ctx(inst);
crypto_set_aead_spawn(&ctx->spawn, aead_crypto_instance(inst));
- err = crypto_grab_aead(&ctx->spawn, name, 0, 0);
+ err = crypto_grab_aead(&ctx->spawn, name, 0, CRYPTO_ALG_NEED_FALLBACK);
if (err)
goto out_free_inst;
--
2.17.1
^ permalink raw reply related
* Re: [PATCH ipsec] xfrm: policy: avoid warning splat when merging nodes
From: Steffen Klassert @ 2019-08-21 6:41 UTC (permalink / raw)
To: Florian Westphal; +Cc: netdev, syzkaller-bugs, syzbot+8cc27ace5f6972910b31
In-Reply-To: <20190812083213.5010-1-fw@strlen.de>
On Mon, Aug 12, 2019 at 10:32:13AM +0200, Florian Westphal wrote:
> syzbot reported a splat:
> xfrm_policy_inexact_list_reinsert+0x625/0x6e0 net/xfrm/xfrm_policy.c:877
> CPU: 1 PID: 6756 Comm: syz-executor.1 Not tainted 5.3.0-rc2+ #57
> Call Trace:
> xfrm_policy_inexact_node_reinsert net/xfrm/xfrm_policy.c:922 [inline]
> xfrm_policy_inexact_node_merge net/xfrm/xfrm_policy.c:958 [inline]
> xfrm_policy_inexact_insert_node+0x537/0xb50 net/xfrm/xfrm_policy.c:1023
> xfrm_policy_inexact_alloc_chain+0x62b/0xbd0 net/xfrm/xfrm_policy.c:1139
> xfrm_policy_inexact_insert+0xe8/0x1540 net/xfrm/xfrm_policy.c:1182
> xfrm_policy_insert+0xdf/0xce0 net/xfrm/xfrm_policy.c:1574
> xfrm_add_policy+0x4cf/0x9b0 net/xfrm/xfrm_user.c:1670
> xfrm_user_rcv_msg+0x46b/0x720 net/xfrm/xfrm_user.c:2676
> netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2477
> xfrm_netlink_rcv+0x74/0x90 net/xfrm/xfrm_user.c:2684
> netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
> netlink_unicast+0x809/0x9a0 net/netlink/af_netlink.c:1328
> netlink_sendmsg+0xa70/0xd30 net/netlink/af_netlink.c:1917
> sock_sendmsg_nosec net/socket.c:637 [inline]
> sock_sendmsg net/socket.c:657 [inline]
>
> There is no reproducer, however, the warning can be reproduced
> by adding rules with ever smaller prefixes.
>
> The sanity check ("does the policy match the node") uses the prefix value
> of the node before its updated to the smaller value.
>
> To fix this, update the prefix earlier. The bug has no impact on tree
> correctness, this is only to prevent a false warning.
>
> Reported-by: syzbot+8cc27ace5f6972910b31@syzkaller.appspotmail.com
> Signed-off-by: Florian Westphal <fw@strlen.de>
Applied, thanks Florian!
^ permalink raw reply
* Re: various TLS bug fixes...
From: Jakub Kicinski @ 2019-08-21 6:51 UTC (permalink / raw)
To: John Fastabend; +Cc: David Miller, netdev
In-Reply-To: <5d5cd426e18be_67732ae0ef5705bc4@john-XPS-13-9370.notmuch>
On Tue, 20 Aug 2019 22:18:30 -0700, John Fastabend wrote:
> > > I suspect you've triaged through this already on your side for other
> > > reasons, so perhaps you could help come up with a sane set of TLS
> > > bug fix backports that would be appropriate for -stable?
> >
> > I'm planning to spend tomorrow working exactly on v4.19 backport.
> > I have internal reports of openssl failing on v4.19 while v4.20
> > works fine.. Hopefully I'll be able to figure that one out, test the
> > above and see if there are any other missing fixes.
> >
> > Is it okay if I come back to this tomorrow?
>
> Is the failure with hw offload or sw case?
SW case, strangely enough. Large file transfer, I think with openssl
client..
> If its sendpage related looks like we also need to push the following
> patch back to 4.19,
>
> commit 648ee6cea7dde4a5cdf817e5d964fd60b22006a4
> Author: John Fastabend <john.fastabend@gmail.com>
> Date: Wed Jun 12 17:23:57 2019 +0000
>
> net: tls, correctly account for copied bytes with multiple sk_msgs
I had a quick look at that, but the commit in Fixes is not in v4.19.
> If you have more details I can also spend some cycles looking into it.
Awesome, I'll let you know what the details are as soon as I get them.
^ permalink raw reply
* Re: [PATCH RFC ipsec-next 0/7] ipsec: add TCP encapsulation support (RFC 8229)
From: Steffen Klassert @ 2019-08-21 6:59 UTC (permalink / raw)
To: Sabrina Dubroca; +Cc: netdev, Herbert Xu
In-Reply-To: <20190816141814.GA12002@bistromath.localdomain>
On Fri, Aug 16, 2019 at 04:18:14PM +0200, Sabrina Dubroca wrote:
> Hi Steffen,
>
> 2019-06-25, 12:11:33 +0200, Sabrina Dubroca wrote:
> > This patchset introduces support for TCP encapsulation of IKE and ESP
> > messages, as defined by RFC 8229 [0]. It is an evolution of what
> > Herbert Xu proposed in January 2018 [1] that addresses the main
> > criticism against it, by not interfering with the TCP implementation
> > at all. The networking stack now has infrastructure for this: TCP ULPs
> > and Stream Parsers.
>
> Have you had a chance to look at this? I was going to rebase and
> resend, but the patches still apply to ipsec-next and net-next (patch
> 2 is already in net-next as commit bd95e678e0f6).
I had a look and I have no general objection against this. If you
think the patchset is ready for inclusion, just remove the RFC and
resend it. I'll have a closer on it look then.
^ permalink raw reply
* [PATCH net-next 0/7] mlxsw: Add devlink-trap support
From: Ido Schimmel @ 2019-08-21 7:19 UTC (permalink / raw)
To: netdev; +Cc: davem, jiri, dsahern, mlxsw, Ido Schimmel
From: Ido Schimmel <idosch@mellanox.com>
This patchset adds devlink-trap support in mlxsw.
Patches #1-#4 add the necessary APIs and defines in mlxsw.
Patch #5 implements devlink-trap support for layer 2 drops. More drops
will be added in the future.
Patches #6-#7 add selftests to make sure that all the new code paths are
exercised and that the feature is working as expected.
Ido Schimmel (7):
mlxsw: core: Add API to set trap action
mlxsw: reg: Add new trap actions
mlxsw: Add layer 2 discard trap IDs
mlxsw: Add trap group for layer 2 discards
mlxsw: spectrum: Add devlink-trap support
selftests: mlxsw: Add test cases for devlink-trap L2 drops
selftests: mlxsw: Add a test case for devlink-trap
drivers/net/ethernet/mellanox/mlxsw/Makefile | 2 +-
drivers/net/ethernet/mellanox/mlxsw/core.c | 64 +++
drivers/net/ethernet/mellanox/mlxsw/core.h | 12 +
drivers/net/ethernet/mellanox/mlxsw/reg.h | 12 +
.../net/ethernet/mellanox/mlxsw/spectrum.c | 21 +
.../net/ethernet/mellanox/mlxsw/spectrum.h | 13 +
.../ethernet/mellanox/mlxsw/spectrum_trap.c | 267 ++++++++++
drivers/net/ethernet/mellanox/mlxsw/trap.h | 7 +
.../drivers/net/mlxsw/devlink_trap.sh | 129 +++++
.../net/mlxsw/devlink_trap_l2_drops.sh | 484 ++++++++++++++++++
10 files changed, 1010 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/ethernet/mellanox/mlxsw/spectrum_trap.c
create mode 100755 tools/testing/selftests/drivers/net/mlxsw/devlink_trap.sh
create mode 100755 tools/testing/selftests/drivers/net/mlxsw/devlink_trap_l2_drops.sh
--
2.21.0
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox