* RE: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Parav Pandit @ 2019-08-21 4:40 UTC (permalink / raw)
To: Alex Williamson
Cc: Jiri Pirko, David S . Miller, Kirti Wankhede, Cornelia Huck,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org, cjia,
netdev@vger.kernel.org
In-Reply-To: <20190820222051.7aeafb69@x1.home>
> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, August 21, 2019 9:51 AM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>;
> Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck
> <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On Wed, 21 Aug 2019 03:42:25 +0000
> Parav Pandit <parav@mellanox.com> wrote:
>
> > > -----Original Message-----
> > > From: Alex Williamson <alex.williamson@redhat.com>
> > > Sent: Tuesday, August 20, 2019 10:49 PM
> > > To: Parav Pandit <parav@mellanox.com>
> > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller
> > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>;
> > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> > > netdev@vger.kernel.org
> > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > >
> > > On Tue, 20 Aug 2019 08:58:02 +0000
> > > Parav Pandit <parav@mellanox.com> wrote:
> > >
> > > > + Dave.
> > > >
> > > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > > >
> > > > Please provide your feedback on it, how shall we proceed?
> > > >
> > > > Short summary of requirements.
> > > > For a given mdev (mediated device [1]), there is one representor
> > > > netdevice and devlink port in switchdev mode (similar to SR-IOV
> > > > VF), And there is one netdevice for the actual mdev when mdev is probed.
> > > >
> > > > (a) representor netdev and devlink port should be able derive
> > > > phys_port_name(). So that representor netdev name can be built
> > > > deterministically across reboots.
> > > >
> > > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > > This attribute can be used by udev rules/systemd or something else
> > > > to rename netdev name deterministically.
> > > >
> > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > > A simple grep IFNAMSIZ in stack hints hundreds of users of
> > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more.
> > > > Changing IFNAMSIZ for a mdev bus doesn't really look reasonable option
> to me.
> > >
> > > How many characters do we really have to work with? Your examples
> > > below prepend various characters, ex. option-1 results in ens2f0_m10
> > > or enm10. Do the extra 8 or 3 characters in these count against IFNAMSIZ?
> > >
> > Maximum 15. Last is null termination.
> > Some udev rules setting by user prefix the PF netdev interface. I took such
> example below where ens2f0 netdev named is prefixed.
> > Some prefer not to prefix.
> >
> > > > Hence, I would like to discuss below options.
> > > >
> > > > Option-1: mdev index
> > > > Introduce an optional mdev index/handle as u32 during mdev create
> > > > time. User passes mdev index/handle as input.
> > > >
> > > > phys_port_name=mIndex=m%u
> > > > mdev_index will be available in sysfs as mdev attribute for udev
> > > > to name the mdev's netdev.
> > > >
> > > > example mdev create command:
> > > > UUID=$(uuidgen)
> > > > echo $UUID index=10
> > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > >
> > > Nit, IIRC previous discussions of additional parameters used comma
> > > separators, ex. echo $UUID,index=10 >...
> > >
> > Yes, ok.
> >
> > > > > example netdevs:
> > > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */
> > >
> > > Is the parent really relevant in the name?
> > No. I just picked one udev example who prefixed the parent netdev name.
> > But there are users who do not prefix it.
> >
> > > Tools like mdevctl are meant to
> > > provide persistence, creating the same mdev devices on the same
> > > parent, but that's simply the easiest policy decision. We can also
> > > imagine that multiple parent devices might support a specified mdev
> > > type and policies factoring in proximity, load-balancing, power
> > > consumption, etc might be weighed such that we really don't want to
> > > promote userspace creating dependencies on the parent association.
> > >
> > > > mdev_netdev=enm10
> > > >
> > > > Pros:
> > > > 1. mdevctl and any other existing tools are unaffected.
> > > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > > 3. achieves unique phys_port_name for representor netdev 4.
> > > > achieves unique mdev eth netdev name for the mdev using udev/systemd
> extension.
> > > > 5. Aligns well with mdev and netdev subsystem and similar to
> > > > existing sriov bdf's.
> > >
> > > A user provided index seems strange to me. It's not really an
> > > index, just a user specified instance number. Presumably you have
> > > the user providing this because if it really were an index, then the
> > > value depends on the creation order and persistence is lost. Now
> > > the user needs to both avoid uuid collision as well as "index"
> > > number collision. The uuid namespace is large enough to mostly ignore
> this, but this is not. This seems like a burden.
> > >
> > I liked the term 'instance number', which is lot better way to say than
> index/handle.
> > Yes, user needs to avoid both the collision.
> > UUID collision should not occur in most cases, they way UUID are generated.
> > So practically users needs to pick unique 'instance number', similar to how it
> picks unique netdev names.
> >
> > Burden to user comes from the requirement to get uniqueness.
> >
> > > > Option-2: shorter mdev name
> > > > Extend mdev to have shorter mdev device name in addition to UUID.
> > > > such as 'foo', 'bar'.
> > > > Mdev will continue to have UUID.
> > > > phys_port_name=mdev_name
> > > >
> > > > Pros:
> > > > 1. All same as option-1, except mdevctl needs upgrade for newer usage.
> > > > It is common practice to upgrade iproute2 package along with the
> > > > kernel. Similar practice to be done with mdevctl.
> > > > 2. Newer users of mdevctl who wants to work with non_UUID names,
> > > > will use newer mdevctl/tools. Cons:
> > > > 1. Dual naming scheme of mdev might affect some of the existing tools.
> > > > It's unclear how/if it actually affects.
> > > > mdevctl [2] is very recently developed and can be enhanced for
> > > > dual naming scheme.
> > >
> > > I think we've already nak'ed this one, the device namespace becomes
> > > meaningless if the name becomes just a string where a uuid might be
> > > an example string. mdevs are named by uuid.
> > >
> > > > Option-3: mdev uuid alias
> > > > Instead of shorter mdev name or mdev index, have alpha-numeric
> > > > name alias. Alias is an optional mdev sysfs attribute such as 'foo', 'bar'.
> > > > example mdev create command:
> > > > UUID=$(uuidgen)
> > > > echo $UUID alias=foo
> > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > > > > example netdevs:
> > > > examle netdevs:
> > > > repnetdev = ens2f0_mfoo
> > > > mdev_netdev=enmfoo
> > > >
> > > > Pros:
> > > > 1. All same as option-1.
> > > > 2. Doesn't affect existing mdev naming scheme.
> > > > Cons:
> > > > 1. Index scheme of option-1 is better which can number large
> > > > number of mdevs with fewer characters, simplifying the management
> tool.
> > >
> > > No better than option-1, simply a larger secondary namespace, but
> > > still requires the user to come up with two independent names for the
> device.
> > >
> > > > Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ from 16
> > > > to
> > > > 64 bytes phys_port_name=mdev_UUID_string
> mdev_netdev_name=enmUUID
> > > >
> > > > Pros:
> > > > 1. Doesn't require mdev extension
> > > > Cons:
> > > > 1. netdev stack, driver, uapi, user space, boot config wide changes 2.
> > > > Possible user space extensions who assumed name size being 16
> > > > characters 3. Single device type demands namesize change for all
> > > > netdev types
> > >
> > > What about an alias based on the uuid? For example, we use 160-bit
> > > sha1s daily with git (uuids are only 128-bit), but we generally
> > > don't reference git commits with the full 20 character string.
> > > Generally 12 characters is recommended to avoid ambiguity. Could
> > > mdev automatically create an
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > abbreviated sha1 alias for the device? If so, how many characters
> > > should we
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > use and what do we do on collision? The colliding device could add
> > > enough alias characters to disambiguate (we likely couldn't re-alias
> > > the existing device to disambiguate, but I'm not sure it matters,
> > > userspace has sysfs to associate aliases). Ex.
> > >
> > > UUID=$(uuidgen)
> > > ALIAS=$(echo $UUID | sha1sum | colrm 13)
> > >
> > I explained in previous reply to Cornelia, we should set UUID and ALIAS at the
> same time.
> > Setting is via different sysfs attribute is lot code burden with no extra benefit.
>
> Just an example of the alias, not proposing how it's set. In fact, proposing that
> the user does not set it, mdev-core provides one automatically.
>
> > > Since there seems to be some prefix overhead, as I ask about above
> > > in how many characters we actually have to work with in IFNAMESZ,
> > > maybe we start with 8 characters (matching your "index" namespace)
> > > and expand as necessary for disambiguation. If we can eliminate
> > > overhead in IFNAMESZ, let's start with 12. Thanks,
> > >
> > If user is going to choose the alias, why does it have to be limited to sha1?
> > Or you just told it as an example?
> >
> > It can be an alpha-numeric string.
>
> No, I'm proposing a different solution where mdev-core creates an alias based
> on an abbreviated sha1. The user does not provide the alias.
>
> > Instead of mdev imposing number of characters on the alias, it should be best
> left to the user.
> > Because in future if netdev improves on the naming scheme, mdev will be
> limiting it, which is not right.
> > So not restricting alias size seems right to me.
> > User configuring mdev for networking devices in a given kernel knows what
> user is doing.
> > So user can choose alias name size as it finds suitable.
>
> That's not what I'm proposing, please read again. Thanks,
I understood your point. But mdev doesn't know how user is going to use udev/systemd to name the netdev.
So even if mdev chose to pick 12 characters, it could result in collision.
Hence the proposal to provide the alias by the user, as user know the best policy for its use case in the environment its using.
So 12 character sha1 method will still work by user.
^ permalink raw reply
* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
From: Richard Cochran @ 2019-08-21 4:42 UTC (permalink / raw)
To: Vladimir Oltean
Cc: Mark Brown, Hubert Feurstein, mlichvar, Andrew Lunn,
Florian Fainelli, linux-spi, netdev
In-Reply-To: <CA+h21hr4UcoJK7upNJjG0ibtX7CkF=akxVdrb--1AJn6-z=sUQ@mail.gmail.com>
On Tue, Aug 20, 2019 at 06:57:10PM +0300, Vladimir Oltean wrote:
> I believe something similar should be possible on other hardware as well.
Not for the marvell link street devices, AFAICT.
Thanks,
Richard
^ permalink raw reply
* [PATCH v2] qed: Add cleanup in qed_slowpath_start()
From: Wenwen Wang @ 2019-08-21 4:46 UTC (permalink / raw)
To: Wenwen Wang
Cc: Sudarsana Reddy Kalluru, Ariel Elior,
supporter:QLOGIC QL4xxx ETHERNET DRIVER, David S. Miller,
open list:QLOGIC QL4xxx ETHERNET DRIVER, open list
If qed_mcp_send_drv_version() fails, no cleanup is executed, leading to
memory leaks. To fix this issue, introduce the label 'err4' to perform the
cleanup work before returning the error.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
---
drivers/net/ethernet/qlogic/qed/qed_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c b/drivers/net/ethernet/qlogic/qed/qed_main.c
index 829dd60..1efff7f 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_main.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
@@ -1325,7 +1325,7 @@ static int qed_slowpath_start(struct qed_dev *cdev,
&drv_version);
if (rc) {
DP_NOTICE(cdev, "Failed sending drv version command\n");
- return rc;
+ goto err4;
}
}
@@ -1333,6 +1333,8 @@ static int qed_slowpath_start(struct qed_dev *cdev,
return 0;
+err4:
+ qed_ll2_dealloc_if(cdev);
err3:
qed_hw_stop(cdev);
err2:
--
2.7.4
^ permalink raw reply related
* Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Alex Williamson @ 2019-08-21 4:57 UTC (permalink / raw)
To: Parav Pandit
Cc: Jiri Pirko, David S . Miller, Kirti Wankhede, Cornelia Huck,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org, cjia,
netdev@vger.kernel.org
In-Reply-To: <AM0PR05MB48664CDF05C3D02F9441440DD1AA0@AM0PR05MB4866.eurprd05.prod.outlook.com>
On Wed, 21 Aug 2019 04:40:15 +0000
Parav Pandit <parav@mellanox.com> wrote:
> > -----Original Message-----
> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Wednesday, August 21, 2019 9:51 AM
> > To: Parav Pandit <parav@mellanox.com>
> > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>;
> > Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck
> > <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> > cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> >
> > On Wed, 21 Aug 2019 03:42:25 +0000
> > Parav Pandit <parav@mellanox.com> wrote:
> >
> > > > -----Original Message-----
> > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > Sent: Tuesday, August 20, 2019 10:49 PM
> > > > To: Parav Pandit <parav@mellanox.com>
> > > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller
> > > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>;
> > > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> > > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> > > > netdev@vger.kernel.org
> > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > >
> > > > On Tue, 20 Aug 2019 08:58:02 +0000
> > > > Parav Pandit <parav@mellanox.com> wrote:
> > > >
> > > > > + Dave.
> > > > >
> > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > > > >
> > > > > Please provide your feedback on it, how shall we proceed?
> > > > >
> > > > > Short summary of requirements.
> > > > > For a given mdev (mediated device [1]), there is one representor
> > > > > netdevice and devlink port in switchdev mode (similar to SR-IOV
> > > > > VF), And there is one netdevice for the actual mdev when mdev is probed.
> > > > >
> > > > > (a) representor netdev and devlink port should be able derive
> > > > > phys_port_name(). So that representor netdev name can be built
> > > > > deterministically across reboots.
> > > > >
> > > > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > > > This attribute can be used by udev rules/systemd or something else
> > > > > to rename netdev name deterministically.
> > > > >
> > > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > > > A simple grep IFNAMSIZ in stack hints hundreds of users of
> > > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more.
> > > > > Changing IFNAMSIZ for a mdev bus doesn't really look reasonable option
> > to me.
> > > >
> > > > How many characters do we really have to work with? Your examples
> > > > below prepend various characters, ex. option-1 results in ens2f0_m10
> > > > or enm10. Do the extra 8 or 3 characters in these count against IFNAMSIZ?
> > > >
> > > Maximum 15. Last is null termination.
> > > Some udev rules setting by user prefix the PF netdev interface. I took such
> > example below where ens2f0 netdev named is prefixed.
> > > Some prefer not to prefix.
> > >
> > > > > Hence, I would like to discuss below options.
> > > > >
> > > > > Option-1: mdev index
> > > > > Introduce an optional mdev index/handle as u32 during mdev create
> > > > > time. User passes mdev index/handle as input.
> > > > >
> > > > > phys_port_name=mIndex=m%u
> > > > > mdev_index will be available in sysfs as mdev attribute for udev
> > > > > to name the mdev's netdev.
> > > > >
> > > > > example mdev create command:
> > > > > UUID=$(uuidgen)
> > > > > echo $UUID index=10
> > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > > >
> > > > Nit, IIRC previous discussions of additional parameters used comma
> > > > separators, ex. echo $UUID,index=10 >...
> > > >
> > > Yes, ok.
> > >
> > > > > > example netdevs:
> > > > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */
> > > >
> > > > Is the parent really relevant in the name?
> > > No. I just picked one udev example who prefixed the parent netdev name.
> > > But there are users who do not prefix it.
> > >
> > > > Tools like mdevctl are meant to
> > > > provide persistence, creating the same mdev devices on the same
> > > > parent, but that's simply the easiest policy decision. We can also
> > > > imagine that multiple parent devices might support a specified mdev
> > > > type and policies factoring in proximity, load-balancing, power
> > > > consumption, etc might be weighed such that we really don't want to
> > > > promote userspace creating dependencies on the parent association.
> > > >
> > > > > mdev_netdev=enm10
> > > > >
> > > > > Pros:
> > > > > 1. mdevctl and any other existing tools are unaffected.
> > > > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > > > 3. achieves unique phys_port_name for representor netdev 4.
> > > > > achieves unique mdev eth netdev name for the mdev using udev/systemd
> > extension.
> > > > > 5. Aligns well with mdev and netdev subsystem and similar to
> > > > > existing sriov bdf's.
> > > >
> > > > A user provided index seems strange to me. It's not really an
> > > > index, just a user specified instance number. Presumably you have
> > > > the user providing this because if it really were an index, then the
> > > > value depends on the creation order and persistence is lost. Now
> > > > the user needs to both avoid uuid collision as well as "index"
> > > > number collision. The uuid namespace is large enough to mostly ignore
> > this, but this is not. This seems like a burden.
> > > >
> > > I liked the term 'instance number', which is lot better way to say than
> > index/handle.
> > > Yes, user needs to avoid both the collision.
> > > UUID collision should not occur in most cases, they way UUID are generated.
> > > So practically users needs to pick unique 'instance number', similar to how it
> > picks unique netdev names.
> > >
> > > Burden to user comes from the requirement to get uniqueness.
> > >
> > > > > Option-2: shorter mdev name
> > > > > Extend mdev to have shorter mdev device name in addition to UUID.
> > > > > such as 'foo', 'bar'.
> > > > > Mdev will continue to have UUID.
> > > > > phys_port_name=mdev_name
> > > > >
> > > > > Pros:
> > > > > 1. All same as option-1, except mdevctl needs upgrade for newer usage.
> > > > > It is common practice to upgrade iproute2 package along with the
> > > > > kernel. Similar practice to be done with mdevctl.
> > > > > 2. Newer users of mdevctl who wants to work with non_UUID names,
> > > > > will use newer mdevctl/tools. Cons:
> > > > > 1. Dual naming scheme of mdev might affect some of the existing tools.
> > > > > It's unclear how/if it actually affects.
> > > > > mdevctl [2] is very recently developed and can be enhanced for
> > > > > dual naming scheme.
> > > >
> > > > I think we've already nak'ed this one, the device namespace becomes
> > > > meaningless if the name becomes just a string where a uuid might be
> > > > an example string. mdevs are named by uuid.
> > > >
> > > > > Option-3: mdev uuid alias
> > > > > Instead of shorter mdev name or mdev index, have alpha-numeric
> > > > > name alias. Alias is an optional mdev sysfs attribute such as 'foo', 'bar'.
> > > > > example mdev create command:
> > > > > UUID=$(uuidgen)
> > > > > echo $UUID alias=foo
> > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > > > > > example netdevs:
> > > > > examle netdevs:
> > > > > repnetdev = ens2f0_mfoo
> > > > > mdev_netdev=enmfoo
> > > > >
> > > > > Pros:
> > > > > 1. All same as option-1.
> > > > > 2. Doesn't affect existing mdev naming scheme.
> > > > > Cons:
> > > > > 1. Index scheme of option-1 is better which can number large
> > > > > number of mdevs with fewer characters, simplifying the management
> > tool.
> > > >
> > > > No better than option-1, simply a larger secondary namespace, but
> > > > still requires the user to come up with two independent names for the
> > device.
> > > >
> > > > > Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ from 16
> > > > > to
> > > > > 64 bytes phys_port_name=mdev_UUID_string
> > mdev_netdev_name=enmUUID
> > > > >
> > > > > Pros:
> > > > > 1. Doesn't require mdev extension
> > > > > Cons:
> > > > > 1. netdev stack, driver, uapi, user space, boot config wide changes 2.
> > > > > Possible user space extensions who assumed name size being 16
> > > > > characters 3. Single device type demands namesize change for all
> > > > > netdev types
> > > >
> > > > What about an alias based on the uuid? For example, we use 160-bit
> > > > sha1s daily with git (uuids are only 128-bit), but we generally
> > > > don't reference git commits with the full 20 character string.
> > > > Generally 12 characters is recommended to avoid ambiguity. Could
> > > > mdev automatically create an
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > abbreviated sha1 alias for the device? If so, how many characters
> > > > should we
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > use and what do we do on collision? The colliding device could add
> > > > enough alias characters to disambiguate (we likely couldn't re-alias
> > > > the existing device to disambiguate, but I'm not sure it matters,
> > > > userspace has sysfs to associate aliases). Ex.
> > > >
> > > > UUID=$(uuidgen)
> > > > ALIAS=$(echo $UUID | sha1sum | colrm 13)
> > > >
> > > I explained in previous reply to Cornelia, we should set UUID and ALIAS at the
> > same time.
> > > Setting is via different sysfs attribute is lot code burden with no extra benefit.
> >
> > Just an example of the alias, not proposing how it's set. In fact, proposing that
> > the user does not set it, mdev-core provides one automatically.
> >
> > > > Since there seems to be some prefix overhead, as I ask about above
> > > > in how many characters we actually have to work with in IFNAMESZ,
> > > > maybe we start with 8 characters (matching your "index" namespace)
> > > > and expand as necessary for disambiguation. If we can eliminate
> > > > overhead in IFNAMESZ, let's start with 12. Thanks,
> > > >
> > > If user is going to choose the alias, why does it have to be limited to sha1?
> > > Or you just told it as an example?
> > >
> > > It can be an alpha-numeric string.
> >
> > No, I'm proposing a different solution where mdev-core creates an alias based
> > on an abbreviated sha1. The user does not provide the alias.
> >
> > > Instead of mdev imposing number of characters on the alias, it should be best
> > left to the user.
> > > Because in future if netdev improves on the naming scheme, mdev will be
> > limiting it, which is not right.
> > > So not restricting alias size seems right to me.
> > > User configuring mdev for networking devices in a given kernel knows what
> > user is doing.
> > > So user can choose alias name size as it finds suitable.
> >
> > That's not what I'm proposing, please read again. Thanks,
>
> I understood your point. But mdev doesn't know how user is going to use udev/systemd to name the netdev.
> So even if mdev chose to pick 12 characters, it could result in collision.
> Hence the proposal to provide the alias by the user, as user know the best policy for its use case in the environment its using.
> So 12 character sha1 method will still work by user.
Haven't you already provided examples where certain drivers or
subsystems have unique netdev prefixes? If mdev provides a unique
alias within the subsystem, couldn't we simply define a netdev prefix
for the mdev subsystem and avoid all other collisions? I'm not in favor
of the user providing both a uuid and an alias/instance. Thanks,
Alex
^ permalink raw reply
* RE: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Parav Pandit @ 2019-08-21 5:01 UTC (permalink / raw)
To: Alex Williamson
Cc: Jiri Pirko, David S . Miller, Kirti Wankhede, Cornelia Huck,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org, cjia,
netdev@vger.kernel.org
In-Reply-To: <20190820225722.237a57d2@x1.home>
> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, August 21, 2019 10:27 AM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>;
> Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck
> <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On Wed, 21 Aug 2019 04:40:15 +0000
> Parav Pandit <parav@mellanox.com> wrote:
>
> > > -----Original Message-----
> > > From: Alex Williamson <alex.williamson@redhat.com>
> > > Sent: Wednesday, August 21, 2019 9:51 AM
> > > To: Parav Pandit <parav@mellanox.com>
> > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller
> > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>;
> > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> > > netdev@vger.kernel.org
> > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > >
> > > On Wed, 21 Aug 2019 03:42:25 +0000
> > > Parav Pandit <parav@mellanox.com> wrote:
> > >
> > > > > -----Original Message-----
> > > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > > Sent: Tuesday, August 20, 2019 10:49 PM
> > > > > To: Parav Pandit <parav@mellanox.com>
> > > > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller
> > > > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>;
> > > > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> > > > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> > > > > netdev@vger.kernel.org
> > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > > >
> > > > > On Tue, 20 Aug 2019 08:58:02 +0000 Parav Pandit
> > > > > <parav@mellanox.com> wrote:
> > > > >
> > > > > > + Dave.
> > > > > >
> > > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > > > > >
> > > > > > Please provide your feedback on it, how shall we proceed?
> > > > > >
> > > > > > Short summary of requirements.
> > > > > > For a given mdev (mediated device [1]), there is one
> > > > > > representor netdevice and devlink port in switchdev mode
> > > > > > (similar to SR-IOV VF), And there is one netdevice for the actual mdev
> when mdev is probed.
> > > > > >
> > > > > > (a) representor netdev and devlink port should be able derive
> > > > > > phys_port_name(). So that representor netdev name can be built
> > > > > > deterministically across reboots.
> > > > > >
> > > > > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > > > > This attribute can be used by udev rules/systemd or something
> > > > > > else to rename netdev name deterministically.
> > > > > >
> > > > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > > > > A simple grep IFNAMSIZ in stack hints hundreds of users of
> > > > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more.
> > > > > > Changing IFNAMSIZ for a mdev bus doesn't really look
> > > > > > reasonable option
> > > to me.
> > > > >
> > > > > How many characters do we really have to work with? Your
> > > > > examples below prepend various characters, ex. option-1 results
> > > > > in ens2f0_m10 or enm10. Do the extra 8 or 3 characters in these count
> against IFNAMSIZ?
> > > > >
> > > > Maximum 15. Last is null termination.
> > > > Some udev rules setting by user prefix the PF netdev interface. I
> > > > took such
> > > example below where ens2f0 netdev named is prefixed.
> > > > Some prefer not to prefix.
> > > >
> > > > > > Hence, I would like to discuss below options.
> > > > > >
> > > > > > Option-1: mdev index
> > > > > > Introduce an optional mdev index/handle as u32 during mdev
> > > > > > create time. User passes mdev index/handle as input.
> > > > > >
> > > > > > phys_port_name=mIndex=m%u
> > > > > > mdev_index will be available in sysfs as mdev attribute for
> > > > > > udev to name the mdev's netdev.
> > > > > >
> > > > > > example mdev create command:
> > > > > > UUID=$(uuidgen)
> > > > > > echo $UUID index=10
> > > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/cr
> > > > > > > eate
> > > > >
> > > > > Nit, IIRC previous discussions of additional parameters used
> > > > > comma separators, ex. echo $UUID,index=10 >...
> > > > >
> > > > Yes, ok.
> > > >
> > > > > > > example netdevs:
> > > > > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */
> > > > >
> > > > > Is the parent really relevant in the name?
> > > > No. I just picked one udev example who prefixed the parent netdev name.
> > > > But there are users who do not prefix it.
> > > >
> > > > > Tools like mdevctl are meant to
> > > > > provide persistence, creating the same mdev devices on the same
> > > > > parent, but that's simply the easiest policy decision. We can
> > > > > also imagine that multiple parent devices might support a
> > > > > specified mdev type and policies factoring in proximity,
> > > > > load-balancing, power consumption, etc might be weighed such
> > > > > that we really don't want to promote userspace creating dependencies
> on the parent association.
> > > > >
> > > > > > mdev_netdev=enm10
> > > > > >
> > > > > > Pros:
> > > > > > 1. mdevctl and any other existing tools are unaffected.
> > > > > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > > > > 3. achieves unique phys_port_name for representor netdev 4.
> > > > > > achieves unique mdev eth netdev name for the mdev using
> > > > > > udev/systemd
> > > extension.
> > > > > > 5. Aligns well with mdev and netdev subsystem and similar to
> > > > > > existing sriov bdf's.
> > > > >
> > > > > A user provided index seems strange to me. It's not really an
> > > > > index, just a user specified instance number. Presumably you
> > > > > have the user providing this because if it really were an index,
> > > > > then the value depends on the creation order and persistence is
> > > > > lost. Now the user needs to both avoid uuid collision as well as "index"
> > > > > number collision. The uuid namespace is large enough to mostly
> > > > > ignore
> > > this, but this is not. This seems like a burden.
> > > > >
> > > > I liked the term 'instance number', which is lot better way to say
> > > > than
> > > index/handle.
> > > > Yes, user needs to avoid both the collision.
> > > > UUID collision should not occur in most cases, they way UUID are
> generated.
> > > > So practically users needs to pick unique 'instance number',
> > > > similar to how it
> > > picks unique netdev names.
> > > >
> > > > Burden to user comes from the requirement to get uniqueness.
> > > >
> > > > > > Option-2: shorter mdev name
> > > > > > Extend mdev to have shorter mdev device name in addition to UUID.
> > > > > > such as 'foo', 'bar'.
> > > > > > Mdev will continue to have UUID.
> > > > > > phys_port_name=mdev_name
> > > > > >
> > > > > > Pros:
> > > > > > 1. All same as option-1, except mdevctl needs upgrade for newer
> usage.
> > > > > > It is common practice to upgrade iproute2 package along with
> > > > > > the kernel. Similar practice to be done with mdevctl.
> > > > > > 2. Newer users of mdevctl who wants to work with non_UUID
> > > > > > names, will use newer mdevctl/tools. Cons:
> > > > > > 1. Dual naming scheme of mdev might affect some of the existing
> tools.
> > > > > > It's unclear how/if it actually affects.
> > > > > > mdevctl [2] is very recently developed and can be enhanced for
> > > > > > dual naming scheme.
> > > > >
> > > > > I think we've already nak'ed this one, the device namespace
> > > > > becomes meaningless if the name becomes just a string where a
> > > > > uuid might be an example string. mdevs are named by uuid.
> > > > >
> > > > > > Option-3: mdev uuid alias
> > > > > > Instead of shorter mdev name or mdev index, have alpha-numeric
> > > > > > name alias. Alias is an optional mdev sysfs attribute such as 'foo',
> 'bar'.
> > > > > > example mdev create command:
> > > > > > UUID=$(uuidgen)
> > > > > > echo $UUID alias=foo
> > > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/cr
> > > > > > > eate
> > > > > > > example netdevs:
> > > > > > examle netdevs:
> > > > > > repnetdev = ens2f0_mfoo
> > > > > > mdev_netdev=enmfoo
> > > > > >
> > > > > > Pros:
> > > > > > 1. All same as option-1.
> > > > > > 2. Doesn't affect existing mdev naming scheme.
> > > > > > Cons:
> > > > > > 1. Index scheme of option-1 is better which can number large
> > > > > > number of mdevs with fewer characters, simplifying the
> > > > > > management
> > > tool.
> > > > >
> > > > > No better than option-1, simply a larger secondary namespace,
> > > > > but still requires the user to come up with two independent
> > > > > names for the
> > > device.
> > > > >
> > > > > > Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ
> > > > > > from 16 to
> > > > > > 64 bytes phys_port_name=mdev_UUID_string
> > > mdev_netdev_name=enmUUID
> > > > > >
> > > > > > Pros:
> > > > > > 1. Doesn't require mdev extension
> > > > > > Cons:
> > > > > > 1. netdev stack, driver, uapi, user space, boot config wide changes 2.
> > > > > > Possible user space extensions who assumed name size being 16
> > > > > > characters 3. Single device type demands namesize change for
> > > > > > all netdev types
> > > > >
> > > > > What about an alias based on the uuid? For example, we use
> > > > > 160-bit sha1s daily with git (uuids are only 128-bit), but we
> > > > > generally don't reference git commits with the full 20 character string.
> > > > > Generally 12 characters is recommended to avoid ambiguity.
> > > > > Could mdev automatically create an
> > >
> > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > abbreviated sha1 alias for the device? If so, how many
> > > > > characters should we
> > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > use and what do we do on collision? The colliding device could
> > > > > add enough alias characters to disambiguate (we likely couldn't
> > > > > re-alias the existing device to disambiguate, but I'm not sure
> > > > > it matters, userspace has sysfs to associate aliases). Ex.
> > > > >
> > > > > UUID=$(uuidgen)
> > > > > ALIAS=$(echo $UUID | sha1sum | colrm 13)
> > > > >
> > > > I explained in previous reply to Cornelia, we should set UUID and
> > > > ALIAS at the
> > > same time.
> > > > Setting is via different sysfs attribute is lot code burden with no extra
> benefit.
> > >
> > > Just an example of the alias, not proposing how it's set. In fact,
> > > proposing that the user does not set it, mdev-core provides one
> automatically.
> > >
> > > > > Since there seems to be some prefix overhead, as I ask about
> > > > > above in how many characters we actually have to work with in
> > > > > IFNAMESZ, maybe we start with 8 characters (matching your
> > > > > "index" namespace) and expand as necessary for disambiguation.
> > > > > If we can eliminate overhead in IFNAMESZ, let's start with 12.
> > > > > Thanks,
> > > > >
> > > > If user is going to choose the alias, why does it have to be limited to sha1?
> > > > Or you just told it as an example?
> > > >
> > > > It can be an alpha-numeric string.
> > >
> > > No, I'm proposing a different solution where mdev-core creates an
> > > alias based on an abbreviated sha1. The user does not provide the alias.
> > >
> > > > Instead of mdev imposing number of characters on the alias, it
> > > > should be best
> > > left to the user.
> > > > Because in future if netdev improves on the naming scheme, mdev
> > > > will be
> > > limiting it, which is not right.
> > > > So not restricting alias size seems right to me.
> > > > User configuring mdev for networking devices in a given kernel
> > > > knows what
> > > user is doing.
> > > > So user can choose alias name size as it finds suitable.
> > >
> > > That's not what I'm proposing, please read again. Thanks,
> >
> > I understood your point. But mdev doesn't know how user is going to use
> udev/systemd to name the netdev.
> > So even if mdev chose to pick 12 characters, it could result in collision.
> > Hence the proposal to provide the alias by the user, as user know the best
> policy for its use case in the environment its using.
> > So 12 character sha1 method will still work by user.
>
> Haven't you already provided examples where certain drivers or subsystems
> have unique netdev prefixes? If mdev provides a unique alias within the
> subsystem, couldn't we simply define a netdev prefix for the mdev subsystem
> and avoid all other collisions? I'm not in favor of the user providing both a uuid
> and an alias/instance. Thanks,
>
For a given prefix, say ens2f0, can two UUID->sha1 first 9 characters have collision?
^ permalink raw reply
* RE: [PATCH v2] qed: Add cleanup in qed_slowpath_start()
From: Sudarsana Reddy Kalluru @ 2019-08-21 5:03 UTC (permalink / raw)
To: Wenwen Wang
Cc: Ariel Elior, GR-everest-linux-l2, David S. Miller,
open list:QLOGIC QL4xxx ETHERNET DRIVER, open list
In-Reply-To: <1566362796-5399-1-git-send-email-wenwen@cs.uga.edu>
> -----Original Message-----
> From: Wenwen Wang <wenwen@cs.uga.edu>
> Sent: Wednesday, August 21, 2019 10:17 AM
> To: Wenwen Wang <wenwen@cs.uga.edu>
> Cc: Sudarsana Reddy Kalluru <skalluru@marvell.com>; Ariel Elior
> <aelior@marvell.com>; GR-everest-linux-l2 <GR-everest-linux-
> l2@marvell.com>; David S. Miller <davem@davemloft.net>; open
> list:QLOGIC QL4xxx ETHERNET DRIVER <netdev@vger.kernel.org>; open list
> <linux-kernel@vger.kernel.org>
> Subject: [PATCH v2] qed: Add cleanup in qed_slowpath_start()
>
> If qed_mcp_send_drv_version() fails, no cleanup is executed, leading to
> memory leaks. To fix this issue, introduce the label 'err4' to perform the
> cleanup work before returning the error.
>
> Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
> ---
> drivers/net/ethernet/qlogic/qed/qed_main.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c
> b/drivers/net/ethernet/qlogic/qed/qed_main.c
> index 829dd60..1efff7f 100644
> --- a/drivers/net/ethernet/qlogic/qed/qed_main.c
> +++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
> @@ -1325,7 +1325,7 @@ static int qed_slowpath_start(struct qed_dev
> *cdev,
> &drv_version);
> if (rc) {
> DP_NOTICE(cdev, "Failed sending drv version
> command\n");
> - return rc;
> + goto err4;
> }
> }
>
> @@ -1333,6 +1333,8 @@ static int qed_slowpath_start(struct qed_dev
> *cdev,
>
> return 0;
>
> +err4:
> + qed_ll2_dealloc_if(cdev);
> err3:
> qed_hw_stop(cdev);
> err2:
> --
> 2.7.4
Acked-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
^ permalink raw reply
* Re: various TLS bug fixes...
From: John Fastabend @ 2019-08-21 5:18 UTC (permalink / raw)
To: Jakub Kicinski, David Miller; +Cc: netdev
In-Reply-To: <20190820172411.70250551@cakuba.netronome.com>
Jakub Kicinski wrote:
> On Tue, 20 Aug 2019 16:05:17 -0700 (PDT), David Miller wrote:
> > Jakub,
> >
> > I just did a batch of networking -stable submissions, however I ran
> > into some troubles with the various TLS backports.
>
> Yes, the TLS back ports are a little messy :S
>
> > I was able to backport commit 414776621d10 ("net/tls: prevent
> > skb_orphan() from leaking TLS plain text with offload") to v5.2
> > but not to v4.19
>
> We can probably leave that out of v4.19. The only practical scenario
> where the issue occurs to my knowledge is if admin configured TC
> redirect, or netem in the setup. Let me know if you'd rather we did the
> backport, I'm not 100% sure the risk/benefit ratio is favourable.
>
> > I was not able to backport neither d85f01775850 ("net: tls, fix
> > sk_write_space NULL write when tx disabled") nor commit 57c722e932cf
> > ("net/tls: swap sk_write_space on close") to any release. It seems
> > like there are a bunch of dependencies and perhaps other fixes.
>
> Right, those should still be applicable but John and I rejigged
> the close path quite a bit. I think the back port would be this:
Looks correct to me but would need to double check as well.
>
> diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> index 4c0ac79f82d4..3288bdff9889 100644
> --- a/net/tls/tls_main.c
> +++ b/net/tls/tls_main.c
> @@ -301,6 +301,8 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
> #else
> {
> #endif
> + if (sk->sk_write_space == tls_write_space)
> + sk->sk_write_space = ctx->sk_write_space;
> tls_ctx_free(ctx);
> ctx = NULL;
> }
>
> > I suspect you've triaged through this already on your side for other
> > reasons, so perhaps you could help come up with a sane set of TLS
> > bug fix backports that would be appropriate for -stable?
>
> I'm planning to spend tomorrow working exactly on v4.19 backport.
> I have internal reports of openssl failing on v4.19 while v4.20
> works fine.. Hopefully I'll be able to figure that one out, test the
> above and see if there are any other missing fixes.
>
> Is it okay if I come back to this tomorrow?
Is the failure with hw offload or sw case? If its sendpage related
looks like we also need to push the following patch back to 4.19,
commit 648ee6cea7dde4a5cdf817e5d964fd60b22006a4
Author: John Fastabend <john.fastabend@gmail.com>
Date: Wed Jun 12 17:23:57 2019 +0000
net: tls, correctly account for copied bytes with multiple sk_msgs
If you have more details I can also spend some cycles looking into it.
.John
^ permalink raw reply
* Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Alex Williamson @ 2019-08-21 5:26 UTC (permalink / raw)
To: Parav Pandit
Cc: Jiri Pirko, David S . Miller, Kirti Wankhede, Cornelia Huck,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org, cjia,
netdev@vger.kernel.org
In-Reply-To: <AM0PR05MB4866AE8FC4AA3CC24B08B326D1AA0@AM0PR05MB4866.eurprd05.prod.outlook.com>
On Wed, 21 Aug 2019 05:01:52 +0000
Parav Pandit <parav@mellanox.com> wrote:
> > -----Original Message-----
> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Wednesday, August 21, 2019 10:27 AM
> > To: Parav Pandit <parav@mellanox.com>
> > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>;
> > Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck
> > <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> > cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> >
> > On Wed, 21 Aug 2019 04:40:15 +0000
> > Parav Pandit <parav@mellanox.com> wrote:
> >
> > > > -----Original Message-----
> > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > Sent: Wednesday, August 21, 2019 9:51 AM
> > > > To: Parav Pandit <parav@mellanox.com>
> > > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller
> > > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>;
> > > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> > > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> > > > netdev@vger.kernel.org
> > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > >
> > > > On Wed, 21 Aug 2019 03:42:25 +0000
> > > > Parav Pandit <parav@mellanox.com> wrote:
> > > >
> > > > > > -----Original Message-----
> > > > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > > > Sent: Tuesday, August 20, 2019 10:49 PM
> > > > > > To: Parav Pandit <parav@mellanox.com>
> > > > > > Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller
> > > > > > <davem@davemloft.net>; Kirti Wankhede <kwankhede@nvidia.com>;
> > > > > > Cornelia Huck <cohuck@redhat.com>; kvm@vger.kernel.org;
> > > > > > linux-kernel@vger.kernel.org; cjia <cjia@nvidia.com>;
> > > > > > netdev@vger.kernel.org
> > > > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > > > >
> > > > > > On Tue, 20 Aug 2019 08:58:02 +0000 Parav Pandit
> > > > > > <parav@mellanox.com> wrote:
> > > > > >
> > > > > > > + Dave.
> > > > > > >
> > > > > > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > > > > > >
> > > > > > > Please provide your feedback on it, how shall we proceed?
> > > > > > >
> > > > > > > Short summary of requirements.
> > > > > > > For a given mdev (mediated device [1]), there is one
> > > > > > > representor netdevice and devlink port in switchdev mode
> > > > > > > (similar to SR-IOV VF), And there is one netdevice for the actual mdev
> > when mdev is probed.
> > > > > > >
> > > > > > > (a) representor netdev and devlink port should be able derive
> > > > > > > phys_port_name(). So that representor netdev name can be built
> > > > > > > deterministically across reboots.
> > > > > > >
> > > > > > > (b) for mdev's netdevice, mdev's device should have an attribute.
> > > > > > > This attribute can be used by udev rules/systemd or something
> > > > > > > else to rename netdev name deterministically.
> > > > > > >
> > > > > > > (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> > > > > > > A simple grep IFNAMSIZ in stack hints hundreds of users of
> > > > > > > IFNAMSIZ in drivers, uapi, netlink, boot config area and more.
> > > > > > > Changing IFNAMSIZ for a mdev bus doesn't really look
> > > > > > > reasonable option
> > > > to me.
> > > > > >
> > > > > > How many characters do we really have to work with? Your
> > > > > > examples below prepend various characters, ex. option-1 results
> > > > > > in ens2f0_m10 or enm10. Do the extra 8 or 3 characters in these count
> > against IFNAMSIZ?
> > > > > >
> > > > > Maximum 15. Last is null termination.
> > > > > Some udev rules setting by user prefix the PF netdev interface. I
> > > > > took such
> > > > example below where ens2f0 netdev named is prefixed.
> > > > > Some prefer not to prefix.
> > > > >
> > > > > > > Hence, I would like to discuss below options.
> > > > > > >
> > > > > > > Option-1: mdev index
> > > > > > > Introduce an optional mdev index/handle as u32 during mdev
> > > > > > > create time. User passes mdev index/handle as input.
> > > > > > >
> > > > > > > phys_port_name=mIndex=m%u
> > > > > > > mdev_index will be available in sysfs as mdev attribute for
> > > > > > > udev to name the mdev's netdev.
> > > > > > >
> > > > > > > example mdev create command:
> > > > > > > UUID=$(uuidgen)
> > > > > > > echo $UUID index=10
> > > > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/cr
> > > > > > > > eate
> > > > > >
> > > > > > Nit, IIRC previous discussions of additional parameters used
> > > > > > comma separators, ex. echo $UUID,index=10 >...
> > > > > >
> > > > > Yes, ok.
> > > > >
> > > > > > > > example netdevs:
> > > > > > > repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */
> > > > > >
> > > > > > Is the parent really relevant in the name?
> > > > > No. I just picked one udev example who prefixed the parent netdev name.
> > > > > But there are users who do not prefix it.
> > > > >
> > > > > > Tools like mdevctl are meant to
> > > > > > provide persistence, creating the same mdev devices on the same
> > > > > > parent, but that's simply the easiest policy decision. We can
> > > > > > also imagine that multiple parent devices might support a
> > > > > > specified mdev type and policies factoring in proximity,
> > > > > > load-balancing, power consumption, etc might be weighed such
> > > > > > that we really don't want to promote userspace creating dependencies
> > on the parent association.
> > > > > >
> > > > > > > mdev_netdev=enm10
> > > > > > >
> > > > > > > Pros:
> > > > > > > 1. mdevctl and any other existing tools are unaffected.
> > > > > > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > > > > > 3. achieves unique phys_port_name for representor netdev 4.
> > > > > > > achieves unique mdev eth netdev name for the mdev using
> > > > > > > udev/systemd
> > > > extension.
> > > > > > > 5. Aligns well with mdev and netdev subsystem and similar to
> > > > > > > existing sriov bdf's.
> > > > > >
> > > > > > A user provided index seems strange to me. It's not really an
> > > > > > index, just a user specified instance number. Presumably you
> > > > > > have the user providing this because if it really were an index,
> > > > > > then the value depends on the creation order and persistence is
> > > > > > lost. Now the user needs to both avoid uuid collision as well as "index"
> > > > > > number collision. The uuid namespace is large enough to mostly
> > > > > > ignore
> > > > this, but this is not. This seems like a burden.
> > > > > >
> > > > > I liked the term 'instance number', which is lot better way to say
> > > > > than
> > > > index/handle.
> > > > > Yes, user needs to avoid both the collision.
> > > > > UUID collision should not occur in most cases, they way UUID are
> > generated.
> > > > > So practically users needs to pick unique 'instance number',
> > > > > similar to how it
> > > > picks unique netdev names.
> > > > >
> > > > > Burden to user comes from the requirement to get uniqueness.
> > > > >
> > > > > > > Option-2: shorter mdev name
> > > > > > > Extend mdev to have shorter mdev device name in addition to UUID.
> > > > > > > such as 'foo', 'bar'.
> > > > > > > Mdev will continue to have UUID.
> > > > > > > phys_port_name=mdev_name
> > > > > > >
> > > > > > > Pros:
> > > > > > > 1. All same as option-1, except mdevctl needs upgrade for newer
> > usage.
> > > > > > > It is common practice to upgrade iproute2 package along with
> > > > > > > the kernel. Similar practice to be done with mdevctl.
> > > > > > > 2. Newer users of mdevctl who wants to work with non_UUID
> > > > > > > names, will use newer mdevctl/tools. Cons:
> > > > > > > 1. Dual naming scheme of mdev might affect some of the existing
> > tools.
> > > > > > > It's unclear how/if it actually affects.
> > > > > > > mdevctl [2] is very recently developed and can be enhanced for
> > > > > > > dual naming scheme.
> > > > > >
> > > > > > I think we've already nak'ed this one, the device namespace
> > > > > > becomes meaningless if the name becomes just a string where a
> > > > > > uuid might be an example string. mdevs are named by uuid.
> > > > > >
> > > > > > > Option-3: mdev uuid alias
> > > > > > > Instead of shorter mdev name or mdev index, have alpha-numeric
> > > > > > > name alias. Alias is an optional mdev sysfs attribute such as 'foo',
> > 'bar'.
> > > > > > > example mdev create command:
> > > > > > > UUID=$(uuidgen)
> > > > > > > echo $UUID alias=foo
> > > > > > > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/cr
> > > > > > > > eate
> > > > > > > > example netdevs:
> > > > > > > examle netdevs:
> > > > > > > repnetdev = ens2f0_mfoo
> > > > > > > mdev_netdev=enmfoo
> > > > > > >
> > > > > > > Pros:
> > > > > > > 1. All same as option-1.
> > > > > > > 2. Doesn't affect existing mdev naming scheme.
> > > > > > > Cons:
> > > > > > > 1. Index scheme of option-1 is better which can number large
> > > > > > > number of mdevs with fewer characters, simplifying the
> > > > > > > management
> > > > tool.
> > > > > >
> > > > > > No better than option-1, simply a larger secondary namespace,
> > > > > > but still requires the user to come up with two independent
> > > > > > names for the
> > > > device.
> > > > > >
> > > > > > > Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ
> > > > > > > from 16 to
> > > > > > > 64 bytes phys_port_name=mdev_UUID_string
> > > > mdev_netdev_name=enmUUID
> > > > > > >
> > > > > > > Pros:
> > > > > > > 1. Doesn't require mdev extension
> > > > > > > Cons:
> > > > > > > 1. netdev stack, driver, uapi, user space, boot config wide changes 2.
> > > > > > > Possible user space extensions who assumed name size being 16
> > > > > > > characters 3. Single device type demands namesize change for
> > > > > > > all netdev types
> > > > > >
> > > > > > What about an alias based on the uuid? For example, we use
> > > > > > 160-bit sha1s daily with git (uuids are only 128-bit), but we
> > > > > > generally don't reference git commits with the full 20 character string.
> > > > > > Generally 12 characters is recommended to avoid ambiguity.
> > > > > > Could mdev automatically create an
> > > >
> > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > > abbreviated sha1 alias for the device? If so, how many
> > > > > > characters should we
> > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > > use and what do we do on collision? The colliding device could
> > > > > > add enough alias characters to disambiguate (we likely couldn't
> > > > > > re-alias the existing device to disambiguate, but I'm not sure
> > > > > > it matters, userspace has sysfs to associate aliases). Ex.
> > > > > >
> > > > > > UUID=$(uuidgen)
> > > > > > ALIAS=$(echo $UUID | sha1sum | colrm 13)
> > > > > >
> > > > > I explained in previous reply to Cornelia, we should set UUID and
> > > > > ALIAS at the
> > > > same time.
> > > > > Setting is via different sysfs attribute is lot code burden with no extra
> > benefit.
> > > >
> > > > Just an example of the alias, not proposing how it's set. In fact,
> > > > proposing that the user does not set it, mdev-core provides one
> > automatically.
> > > >
> > > > > > Since there seems to be some prefix overhead, as I ask about
> > > > > > above in how many characters we actually have to work with in
> > > > > > IFNAMESZ, maybe we start with 8 characters (matching your
> > > > > > "index" namespace) and expand as necessary for disambiguation.
> > > > > > If we can eliminate overhead in IFNAMESZ, let's start with 12.
> > > > > > Thanks,
> > > > > >
> > > > > If user is going to choose the alias, why does it have to be limited to sha1?
> > > > > Or you just told it as an example?
> > > > >
> > > > > It can be an alpha-numeric string.
> > > >
> > > > No, I'm proposing a different solution where mdev-core creates an
> > > > alias based on an abbreviated sha1. The user does not provide the alias.
> > > >
> > > > > Instead of mdev imposing number of characters on the alias, it
> > > > > should be best
> > > > left to the user.
> > > > > Because in future if netdev improves on the naming scheme, mdev
> > > > > will be
> > > > limiting it, which is not right.
> > > > > So not restricting alias size seems right to me.
> > > > > User configuring mdev for networking devices in a given kernel
> > > > > knows what
> > > > user is doing.
> > > > > So user can choose alias name size as it finds suitable.
> > > >
> > > > That's not what I'm proposing, please read again. Thanks,
> > >
> > > I understood your point. But mdev doesn't know how user is going to use
> > udev/systemd to name the netdev.
> > > So even if mdev chose to pick 12 characters, it could result in collision.
> > > Hence the proposal to provide the alias by the user, as user know the best
> > policy for its use case in the environment its using.
> > > So 12 character sha1 method will still work by user.
> >
> > Haven't you already provided examples where certain drivers or subsystems
> > have unique netdev prefixes? If mdev provides a unique alias within the
> > subsystem, couldn't we simply define a netdev prefix for the mdev subsystem
> > and avoid all other collisions? I'm not in favor of the user providing both a uuid
> > and an alias/instance. Thanks,
> >
> For a given prefix, say ens2f0, can two UUID->sha1 first 9 characters have collision?
I think it would be a mistake to waste so many chars on a prefix, but 9
characters of sha1 likely wouldn't have a collision before we have 10s
of thousands of devices. Thanks,
Alex
^ permalink raw reply
* RE: [EXT] Re: [PATCH net-next 0/1] net: fec: add C45 MDIO read/write support
From: Andy Duan @ 2019-08-21 5:55 UTC (permalink / raw)
To: Andrew Lunn
Cc: Marco Hartmann, davem@davemloft.net, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, Christian Herber
In-Reply-To: <20190820130403.GH29991@lunn.ch>
From: Andrew Lunn <andrew@lunn.ch> Sent: Tuesday, August 20, 2019 9:04 PM
> On Tue, Aug 20, 2019 at 02:32:26AM +0000, Andy Duan wrote:
> > From: Andrew Lunn <andrew@lunn.ch>
> > > On Mon, Aug 19, 2019 at 05:11:14PM +0000, Marco Hartmann wrote:
> > > > As of yet, the Fast Ethernet Controller (FEC) driver only supports
> > > > Clause 22 conform MDIO transactions. IEEE 802.3ae Clause 45
> > > > defines a modified MDIO protocol that uses a two staged access
> > > > model in order to increase the address space.
> > > >
> > > > This patch adds support for Clause 45 conform MDIO read and write
> > > > operations to the FEC driver.
> > >
> > > Hi Marco
> > >
> > > Do all versions of the FEC hardware support C45? Or do we need to
> > > make use of the quirk support in this driver to just enable it for some
> revisions of FEC?
> > >
> > > Thanks
> > > Andrew
> >
> > i.MX legacy platforms like i.MX6/7 series, they doesn't support Write & Read
> Increment.
> > But for i.MX8MQ/MM series, it support C45 full features like Write & Read
> Increment.
> >
> > For the patch itself, it doesn't support Write & Read Increment, so I
> > think the patch doesn't need to add quirk support.
>
> Hi Andy
>
> So what happens with something older than a i.MX8MQ/MM when a C45
> transfer is attempted? This patch adds a new write. Does that write
> immediately trigger a completion interrupt? Does it never trigger an interrupt,
> and we have to wait FEC_MII_TIMEOUT?
>
> Ideally, if the hardware does not support C45, we want it to return
> EOPNOTSUPP.
>
> Thanks
> Andrew
It still trigger an interrupt to wakeup the completion, we have to wait FEC_MII_TIMEOUT.
Older chips just support part of C45 feature just like the patch implementation.
^ permalink raw reply
* pull-request: wireless-drivers 2019-08-21
From: Kalle Valo @ 2019-08-21 6:07 UTC (permalink / raw)
To: David Miller; +Cc: linux-wireless, netdev, linux-kernel, Johannes Berg
Hi Dave,
here's a pull request to net for 5.3, more info below. I will be offline
the next week, but Johannes should be able to help if there are any
issues.
Kalle
The following changes since commit d1abaeb3be7b5fa6d7a1fbbd2e14e3310005c4c1:
Linux 5.3-rc5 (2019-08-18 14:31:08 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git tags/wireless-drivers-for-davem-2019-08-21
for you to fetch changes up to 5a8c31aa63578cb0ff390a57537f1cb4b312a1ed:
iwlwifi: pcie: fix recognition of QuZ devices (2019-08-20 17:00:42 +0300)
----------------------------------------------------------------
wireless-drivers fixes for 5.3
Third set of fixes for 5.3, and most likely the last one. The rt2x00
regression has been reported multiple times, others are of lower
priority.
mt76
* fix hang on resume on certain machines
rt2x00
* fix AP mode regression related to encryption
iwlwifi
* avoid unnecessary error messages due to multicast frames when not
associated
* fix configuration for ax201 devices
* fix recognition of QuZ devices
----------------------------------------------------------------
Emmanuel Grumbach (1):
iwlwifi: pcie: fix the byte count table format for 22560 devices
Ilan Peer (1):
iwlwifi: mvm: Allow multicast data frames only when associated
Luca Coelho (2):
iwlwifi: pcie: don't switch FW to qnj when ax201 is detected
iwlwifi: pcie: fix recognition of QuZ devices
Stanislaw Gruszka (2):
mt76: mt76x0u: do not reset radio on resume
rt2x00: clear IV's on start to fix AP mode regression
drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c | 33 ++++++++++++++++++++---
drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c | 10 +++++++
drivers/net/wireless/intel/iwlwifi/pcie/drv.c | 17 ++++++++++++
drivers/net/wireless/intel/iwlwifi/pcie/trans.c | 1 +
drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c | 20 +++++++++-----
drivers/net/wireless/mediatek/mt76/mt76x0/usb.c | 8 +++---
drivers/net/wireless/ralink/rt2x00/rt2800lib.c | 9 +++++++
drivers/net/wireless/ralink/rt2x00/rt2x00.h | 1 +
drivers/net/wireless/ralink/rt2x00/rt2x00dev.c | 13 +++++----
9 files changed, 93 insertions(+), 19 deletions(-)
^ permalink raw reply
* RE: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Parav Pandit @ 2019-08-21 6:23 UTC (permalink / raw)
To: Alex Williamson
Cc: Jiri Pirko, David S . Miller, Kirti Wankhede, Cornelia Huck,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org, cjia,
netdev@vger.kernel.org
In-Reply-To: <20190820232622.164962d3@x1.home>
> -----Original Message-----
> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, August 21, 2019 10:56 AM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Jiri Pirko <jiri@mellanox.com>; David S . Miller <davem@davemloft.net>;
> Kirti Wankhede <kwankhede@nvidia.com>; Cornelia Huck
> <cohuck@redhat.com>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org;
> cjia <cjia@nvidia.com>; netdev@vger.kernel.org
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> > > > > Just an example of the alias, not proposing how it's set. In
> > > > > fact, proposing that the user does not set it, mdev-core
> > > > > provides one
> > > automatically.
> > > > >
> > > > > > > Since there seems to be some prefix overhead, as I ask about
> > > > > > > above in how many characters we actually have to work with
> > > > > > > in IFNAMESZ, maybe we start with 8 characters (matching your
> > > > > > > "index" namespace) and expand as necessary for disambiguation.
> > > > > > > If we can eliminate overhead in IFNAMESZ, let's start with 12.
> > > > > > > Thanks,
> > > > > > >
> > > > > > If user is going to choose the alias, why does it have to be limited to
> sha1?
> > > > > > Or you just told it as an example?
> > > > > >
> > > > > > It can be an alpha-numeric string.
> > > > >
> > > > > No, I'm proposing a different solution where mdev-core creates
> > > > > an alias based on an abbreviated sha1. The user does not provide the
> alias.
> > > > >
> > > > > > Instead of mdev imposing number of characters on the alias, it
> > > > > > should be best
> > > > > left to the user.
> > > > > > Because in future if netdev improves on the naming scheme,
> > > > > > mdev will be
> > > > > limiting it, which is not right.
> > > > > > So not restricting alias size seems right to me.
> > > > > > User configuring mdev for networking devices in a given kernel
> > > > > > knows what
> > > > > user is doing.
> > > > > > So user can choose alias name size as it finds suitable.
> > > > >
> > > > > That's not what I'm proposing, please read again. Thanks,
> > > >
> > > > I understood your point. But mdev doesn't know how user is going
> > > > to use
> > > udev/systemd to name the netdev.
> > > > So even if mdev chose to pick 12 characters, it could result in collision.
> > > > Hence the proposal to provide the alias by the user, as user know
> > > > the best
> > > policy for its use case in the environment its using.
> > > > So 12 character sha1 method will still work by user.
> > >
> > > Haven't you already provided examples where certain drivers or
> > > subsystems have unique netdev prefixes? If mdev provides a unique
> > > alias within the subsystem, couldn't we simply define a netdev
> > > prefix for the mdev subsystem and avoid all other collisions? I'm
> > > not in favor of the user providing both a uuid and an
> > > alias/instance. Thanks,
> > >
> > For a given prefix, say ens2f0, can two UUID->sha1 first 9 characters have
> collision?
>
> I think it would be a mistake to waste so many chars on a prefix, but 9
> characters of sha1 likely wouldn't have a collision before we have 10s of
> thousands of devices. Thanks,
>
> Alex
Jiri, Dave,
Are you ok with it for devlink/netdev part?
Mdev core will create an alias from a UUID.
This will be supplied during devlink port attr set such as,
devlink_port_attrs_mdev_set(struct devlink_port *port, const char *mdev_alias);
This alias is used to generate representor netdev's phys_port_name.
This alias from the mdev device's sysfs will be used by the udev/systemd to generate predicable netdev's name.
Example: enm<mdev_alias_first_12_chars>
I took Ethernet mdev as an example.
New prefix 'm' stands for mediated device.
Remaining 12 characters are first 12 chars of the mdev alias.
^ permalink raw reply
* [PATCH net-next] net: dsa: remove bitmap operations
From: Vivien Didelot @ 2019-08-21 6:24 UTC (permalink / raw)
To: netdev; +Cc: davem, f.fainelli, andrew, olteanv, Vivien Didelot
The bitmap operations were introduced to simplify the switch drivers
in the future, since most of them could implement the common VLAN and
MDB operations (add, del, dump) with simple functions taking all target
ports at once, and thus limiting the number of hardware accesses.
Programming an MDB or VLAN this way in a single operation would clearly
simplify the drivers a lot but would require a new get-set interface
in DSA. The usage of such bitmap from the stack also raised concerned
in the past, leading to the dynamic allocation of a new ds->_bitmap
member in the dsa_switch structure. So let's get rid of them for now.
This commit nicely wraps the ds->ops->port_{mdb,vlan}_{prepare,add}
switch operations into new dsa_switch_{mdb,vlan}_{prepare,add}
variants not using any bitmap argument anymore.
New dsa_switch_{mdb,vlan}_match helpers have been introduced to make
clear which local port of a switch must be programmed with the target
object. While the targeted user port is an obvious candidate, the
DSA links must also be programmed, as well as the CPU port for VLANs.
While at it, also remove local variables that are only used once.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
---
include/net/dsa.h | 3 --
net/dsa/dsa2.c | 14 -----
net/dsa/switch.c | 132 +++++++++++++++++++++-------------------------
3 files changed, 59 insertions(+), 90 deletions(-)
diff --git a/include/net/dsa.h b/include/net/dsa.h
index 147b757ef8ea..96acb14ec1a8 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -275,9 +275,6 @@ struct dsa_switch {
*/
bool vlan_filtering;
- unsigned long *bitmap;
- unsigned long _bitmap;
-
/* Dynamically allocated ports, keep last */
size_t num_ports;
struct dsa_port ports[];
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 8c4eccb0cfe6..f8445fa73448 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -834,20 +834,6 @@ struct dsa_switch *dsa_switch_alloc(struct device *dev, size_t n)
if (!ds)
return NULL;
- /* We avoid allocating memory outside dsa_switch
- * if it is not needed.
- */
- if (n <= sizeof(ds->_bitmap) * 8) {
- ds->bitmap = &ds->_bitmap;
- } else {
- ds->bitmap = devm_kcalloc(dev,
- BITS_TO_LONGS(n),
- sizeof(unsigned long),
- GFP_KERNEL);
- if (unlikely(!ds->bitmap))
- return NULL;
- }
-
ds->dev = dev;
ds->num_ports = n;
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index 09d9286b27cc..489eb7b430a4 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -128,57 +128,51 @@ static int dsa_switch_fdb_del(struct dsa_switch *ds,
return ds->ops->port_fdb_del(ds, port, info->addr, info->vid);
}
-static int
-dsa_switch_mdb_prepare_bitmap(struct dsa_switch *ds,
- const struct switchdev_obj_port_mdb *mdb,
- const unsigned long *bitmap)
+static bool dsa_switch_mdb_match(struct dsa_switch *ds, int port,
+ struct dsa_notifier_mdb_info *info)
+{
+ if (ds->index == info->sw_index && port == info->port)
+ return true;
+
+ if (dsa_is_dsa_port(ds, port))
+ return true;
+
+ return false;
+}
+
+static int dsa_switch_mdb_prepare(struct dsa_switch *ds,
+ struct dsa_notifier_mdb_info *info)
{
int port, err;
if (!ds->ops->port_mdb_prepare || !ds->ops->port_mdb_add)
return -EOPNOTSUPP;
- for_each_set_bit(port, bitmap, ds->num_ports) {
- err = ds->ops->port_mdb_prepare(ds, port, mdb);
- if (err)
- return err;
+ for (port = 0; port < ds->num_ports; port++) {
+ if (dsa_switch_mdb_match(ds, port, info)) {
+ err = ds->ops->port_mdb_prepare(ds, port, info->mdb);
+ if (err)
+ return err;
+ }
}
return 0;
}
-static void dsa_switch_mdb_add_bitmap(struct dsa_switch *ds,
- const struct switchdev_obj_port_mdb *mdb,
- const unsigned long *bitmap)
-{
- int port;
-
- if (!ds->ops->port_mdb_add)
- return;
-
- for_each_set_bit(port, bitmap, ds->num_ports)
- ds->ops->port_mdb_add(ds, port, mdb);
-}
-
static int dsa_switch_mdb_add(struct dsa_switch *ds,
struct dsa_notifier_mdb_info *info)
{
- const struct switchdev_obj_port_mdb *mdb = info->mdb;
- struct switchdev_trans *trans = info->trans;
int port;
- /* Build a mask of Multicast group members */
- bitmap_zero(ds->bitmap, ds->num_ports);
- if (ds->index == info->sw_index)
- set_bit(info->port, ds->bitmap);
- for (port = 0; port < ds->num_ports; port++)
- if (dsa_is_dsa_port(ds, port))
- set_bit(port, ds->bitmap);
+ if (switchdev_trans_ph_prepare(info->trans))
+ return dsa_switch_mdb_prepare(ds, info);
- if (switchdev_trans_ph_prepare(trans))
- return dsa_switch_mdb_prepare_bitmap(ds, mdb, ds->bitmap);
+ if (!ds->ops->port_mdb_add)
+ return 0;
- dsa_switch_mdb_add_bitmap(ds, mdb, ds->bitmap);
+ for (port = 0; port < ds->num_ports; port++)
+ if (dsa_switch_mdb_match(ds, port, info))
+ ds->ops->port_mdb_add(ds, port, info->mdb);
return 0;
}
@@ -186,13 +180,11 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
static int dsa_switch_mdb_del(struct dsa_switch *ds,
struct dsa_notifier_mdb_info *info)
{
- const struct switchdev_obj_port_mdb *mdb = info->mdb;
-
if (!ds->ops->port_mdb_del)
return -EOPNOTSUPP;
if (ds->index == info->sw_index)
- return ds->ops->port_mdb_del(ds, info->port, mdb);
+ return ds->ops->port_mdb_del(ds, info->port, info->mdb);
return 0;
}
@@ -234,59 +226,55 @@ static int dsa_port_vlan_check(struct dsa_switch *ds, int port,
(void *)vlan);
}
-static int
-dsa_switch_vlan_prepare_bitmap(struct dsa_switch *ds,
- const struct switchdev_obj_port_vlan *vlan,
- const unsigned long *bitmap)
+static bool dsa_switch_vlan_match(struct dsa_switch *ds, int port,
+ struct dsa_notifier_vlan_info *info)
+{
+ if (ds->index == info->sw_index && port == info->port)
+ return true;
+
+ if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
+ return true;
+
+ return false;
+}
+
+static int dsa_switch_vlan_prepare(struct dsa_switch *ds,
+ struct dsa_notifier_vlan_info *info)
{
int port, err;
if (!ds->ops->port_vlan_prepare || !ds->ops->port_vlan_add)
return -EOPNOTSUPP;
- for_each_set_bit(port, bitmap, ds->num_ports) {
- err = dsa_port_vlan_check(ds, port, vlan);
- if (err)
- return err;
+ for (port = 0; port < ds->num_ports; port++) {
+ if (dsa_switch_vlan_match(ds, port, info)) {
+ err = dsa_port_vlan_check(ds, port, info->vlan);
+ if (err)
+ return err;
- err = ds->ops->port_vlan_prepare(ds, port, vlan);
- if (err)
- return err;
+ err = ds->ops->port_vlan_prepare(ds, port, info->vlan);
+ if (err)
+ return err;
+ }
}
return 0;
}
-static void
-dsa_switch_vlan_add_bitmap(struct dsa_switch *ds,
- const struct switchdev_obj_port_vlan *vlan,
- const unsigned long *bitmap)
-{
- int port;
-
- for_each_set_bit(port, bitmap, ds->num_ports)
- ds->ops->port_vlan_add(ds, port, vlan);
-}
-
static int dsa_switch_vlan_add(struct dsa_switch *ds,
struct dsa_notifier_vlan_info *info)
{
- const struct switchdev_obj_port_vlan *vlan = info->vlan;
- struct switchdev_trans *trans = info->trans;
int port;
- /* Build a mask of VLAN members */
- bitmap_zero(ds->bitmap, ds->num_ports);
- if (ds->index == info->sw_index)
- set_bit(info->port, ds->bitmap);
- for (port = 0; port < ds->num_ports; port++)
- if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
- set_bit(port, ds->bitmap);
+ if (switchdev_trans_ph_prepare(info->trans))
+ return dsa_switch_vlan_prepare(ds, info);
- if (switchdev_trans_ph_prepare(trans))
- return dsa_switch_vlan_prepare_bitmap(ds, vlan, ds->bitmap);
+ if (!ds->ops->port_vlan_add)
+ return 0;
- dsa_switch_vlan_add_bitmap(ds, vlan, ds->bitmap);
+ for (port = 0; port < ds->num_ports; port++)
+ if (dsa_switch_vlan_match(ds, port, info))
+ ds->ops->port_vlan_add(ds, port, info->vlan);
return 0;
}
@@ -294,13 +282,11 @@ static int dsa_switch_vlan_add(struct dsa_switch *ds,
static int dsa_switch_vlan_del(struct dsa_switch *ds,
struct dsa_notifier_vlan_info *info)
{
- const struct switchdev_obj_port_vlan *vlan = info->vlan;
-
if (!ds->ops->port_vlan_del)
return -EOPNOTSUPP;
if (ds->index == info->sw_index)
- return ds->ops->port_vlan_del(ds, info->port, vlan);
+ return ds->ops->port_vlan_del(ds, info->port, info->vlan);
return 0;
}
--
2.23.0
^ permalink raw reply related
* Re: [pull request][net-next v2 00/16] Mellanox, mlx5 devlink RX health reporters
From: David Miller @ 2019-08-21 6:36 UTC (permalink / raw)
To: saeedm; +Cc: netdev
In-Reply-To: <20190820202352.2995-1-saeedm@mellanox.com>
From: Saeed Mahameed <saeedm@mellanox.com>
Date: Tue, 20 Aug 2019 20:24:10 +0000
> This series is adding a new devlink health reporter for RX related
> errors from Aya.
>
> Last two patches from Vlad and Gavi, are trivial fixes for previously
> submitted patches on this release cycle.
>
> v1->v2:
> - Improve reversed xmas tree variable declaration.
> - Rebase on top of net-next to avoid a new conflict due to latest
> merge with net.
>
> For more information please see tag log below.
>
> Please pull and let me know if there is any problem.
Pulled, thanks Saeed.
^ permalink raw reply
* Re: INFO: task hung in tls_sw_release_resources_tx
From: Steffen Klassert @ 2019-08-21 6:37 UTC (permalink / raw)
To: Jakub Kicinski, syzbot, ast, aviadye, borisp, bpf, daniel,
davejwatson, davem, hdanton, john.fastabend, netdev,
syzkaller-bugs, herbert, linux-crypto
In-Reply-To: <20190817054743.GE8209@sol.localdomain>
On Fri, Aug 16, 2019 at 10:47:43PM -0700, Eric Biggers wrote:
> [+Steffen, who is the maintainer of pcrypt]
>
> On Fri, Aug 16, 2019 at 07:02:34PM -0700, Jakub Kicinski wrote:
> > On Thu, 15 Aug 2019 11:06:00 -0700, syzbot wrote:
> > > syzbot has bisected this bug to:
> > >
> > > commit 130b392c6cd6b2aed1b7eb32253d4920babb4891
> > > Author: Dave Watson <davejwatson@fb.com>
> > > Date: Wed Jan 30 21:58:31 2019 +0000
> > >
> > > net: tls: Add tls 1.3 support
> > >
> > > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=118e8dee600000
> > > start commit: 6d5afe20 sctp: fix memleak in sctp_send_reset_streams
> > > git tree: net
> > > final crash: https://syzkaller.appspot.com/x/report.txt?x=138e8dee600000
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=158e8dee600000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=6a9ff159672dfbb41c95
> > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=17cb0502600000
> > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=14d5dc22600000
> > >
> > > Reported-by: syzbot+6a9ff159672dfbb41c95@syzkaller.appspotmail.com
> > > Fixes: 130b392c6cd6 ("net: tls: Add tls 1.3 support")
> > >
> > > For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> >
> > CC Herbert, linux-crypto
> >
> > This is got to be something in the crypto code :S
> >
> > The test case opens a ktls socket and back log writes to it.
> > Then it opens a AF_ALG socket, binds "pcrypt(gcm(aes))" and dies.
> >
> > The ktls socket upon close waits for async crypto callbacks, but they
> > never come. If I unset CRYPTO_USER_API_AEAD or change the alg to bind
> > to "gcm(aes)" the bug does not trigger.
> >
> > Any suggestions?
>
> Seeing as pcrypt is involved and this is a "task hung" bug, this is probably
> caused by the recursive pcrypt deadlock, which is yet to be fixed.
>
> See the original thread for more info:
>
> https://groups.google.com/forum/#!msg/syzkaller-bugs/1_CXUd3gBcg/BvsRLH0lAgAJ
>
> And the syzbot dashboard link:
>
> https://syzkaller.appspot.com/bug?id=178f2528d10720d563091fb51dceb4cb20f75525
>
> Let's tell syzbot this is a duplicate:
>
> #syz dup: INFO: task hung in aead_recvmsg
>
>
> Steffen, do you have any plan to fix this?
I've tried to use different padata instances for each pcrypt template,
but then each pcrypt template needs to expose its cpumask configuration
to a new file in /sys/kernel/pcrypt/. Currently we have one file
there for the encrytion and on for the decryption cpumask. If we have
more than these two files, we need some naming convention to now which
pcrypt template we want to configure. That would be a bit odd because
a such a nested pcrypt in pcrypt algorithm would not make sense at all.
I still think we should somehow forbid these nested configurations.
If I remember correct, the only objection to your original patch
was that it would still deadlock if an underlying algorithm uses
pcrypt as a fallback.
Maybe we can use your patch and also refuse instanitating if an
underlying algorithm needs a fallback.
The patch would look like this then:
Subject: [PATCH] crypto: pcrypt - forbid recursive instantiation
If the pcrypt template is used multiple times in an algorithm, then a
deadlock occurs because all pcrypt instances share the same
padata_instance, which completes requests in the order submitted. That
is, the inner pcrypt request waits for the outer pcrypt request while
the outer request is already waiting for the inner.
Fix this by making pcrypt forbid instantiation if pcrypt appears in the
underlying ->cra_driver_name and if an underlying algorithm needs a
fallback. This is somewhat of a hack, but it's a simple fix that should
be sufficient to prevent the deadlock.
Reproducer:
#include <linux/if_alg.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
struct sockaddr_alg addr = {
.salg_type = "aead",
.salg_name = "pcrypt(pcrypt(rfc4106-gcm-aesni))"
};
int algfd, reqfd;
char buf[32] = { 0 };
algfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
bind(algfd, (void *)&addr, sizeof(addr));
setsockopt(algfd, SOL_ALG, ALG_SET_KEY, buf, 20);
reqfd = accept(algfd, 0, 0);
write(reqfd, buf, 32);
read(reqfd, buf, 16);
}
Reported-by: syzbot+56c7151cad94eec37c521f0e47d2eee53f9361c4@syzkaller.appspotmail.com
Fixes: 5068c7a883d1 ("crypto: pcrypt - Add pcrypt crypto parallelization wrapper")
Cc: <stable@vger.kernel.org> # v2.6.34+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
crypto/pcrypt.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
index 543792e0ebf0..932a77b61b47 100644
--- a/crypto/pcrypt.c
+++ b/crypto/pcrypt.c
@@ -198,6 +198,12 @@ static void pcrypt_free(struct aead_instance *inst)
static int pcrypt_init_instance(struct crypto_instance *inst,
struct crypto_alg *alg)
{
+ /* Recursive pcrypt deadlocks due to the shared padata_instance */
+ if (!strncmp(alg->cra_driver_name, "pcrypt(", 7) ||
+ strstr(alg->cra_driver_name, "(pcrypt(") ||
+ strstr(alg->cra_driver_name, ",pcrypt("))
+ return -EINVAL;
+
if (snprintf(inst->alg.cra_driver_name, CRYPTO_MAX_ALG_NAME,
"pcrypt(%s)", alg->cra_driver_name) >= CRYPTO_MAX_ALG_NAME)
return -ENAMETOOLONG;
@@ -236,7 +242,7 @@ static int pcrypt_create_aead(struct crypto_template *tmpl, struct rtattr **tb,
ctx = aead_instance_ctx(inst);
crypto_set_aead_spawn(&ctx->spawn, aead_crypto_instance(inst));
- err = crypto_grab_aead(&ctx->spawn, name, 0, 0);
+ err = crypto_grab_aead(&ctx->spawn, name, 0, CRYPTO_ALG_NEED_FALLBACK);
if (err)
goto out_free_inst;
--
2.17.1
^ permalink raw reply related
* Re: [PATCH ipsec] xfrm: policy: avoid warning splat when merging nodes
From: Steffen Klassert @ 2019-08-21 6:41 UTC (permalink / raw)
To: Florian Westphal; +Cc: netdev, syzkaller-bugs, syzbot+8cc27ace5f6972910b31
In-Reply-To: <20190812083213.5010-1-fw@strlen.de>
On Mon, Aug 12, 2019 at 10:32:13AM +0200, Florian Westphal wrote:
> syzbot reported a splat:
> xfrm_policy_inexact_list_reinsert+0x625/0x6e0 net/xfrm/xfrm_policy.c:877
> CPU: 1 PID: 6756 Comm: syz-executor.1 Not tainted 5.3.0-rc2+ #57
> Call Trace:
> xfrm_policy_inexact_node_reinsert net/xfrm/xfrm_policy.c:922 [inline]
> xfrm_policy_inexact_node_merge net/xfrm/xfrm_policy.c:958 [inline]
> xfrm_policy_inexact_insert_node+0x537/0xb50 net/xfrm/xfrm_policy.c:1023
> xfrm_policy_inexact_alloc_chain+0x62b/0xbd0 net/xfrm/xfrm_policy.c:1139
> xfrm_policy_inexact_insert+0xe8/0x1540 net/xfrm/xfrm_policy.c:1182
> xfrm_policy_insert+0xdf/0xce0 net/xfrm/xfrm_policy.c:1574
> xfrm_add_policy+0x4cf/0x9b0 net/xfrm/xfrm_user.c:1670
> xfrm_user_rcv_msg+0x46b/0x720 net/xfrm/xfrm_user.c:2676
> netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2477
> xfrm_netlink_rcv+0x74/0x90 net/xfrm/xfrm_user.c:2684
> netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
> netlink_unicast+0x809/0x9a0 net/netlink/af_netlink.c:1328
> netlink_sendmsg+0xa70/0xd30 net/netlink/af_netlink.c:1917
> sock_sendmsg_nosec net/socket.c:637 [inline]
> sock_sendmsg net/socket.c:657 [inline]
>
> There is no reproducer, however, the warning can be reproduced
> by adding rules with ever smaller prefixes.
>
> The sanity check ("does the policy match the node") uses the prefix value
> of the node before its updated to the smaller value.
>
> To fix this, update the prefix earlier. The bug has no impact on tree
> correctness, this is only to prevent a false warning.
>
> Reported-by: syzbot+8cc27ace5f6972910b31@syzkaller.appspotmail.com
> Signed-off-by: Florian Westphal <fw@strlen.de>
Applied, thanks Florian!
^ permalink raw reply
* Re: various TLS bug fixes...
From: Jakub Kicinski @ 2019-08-21 6:51 UTC (permalink / raw)
To: John Fastabend; +Cc: David Miller, netdev
In-Reply-To: <5d5cd426e18be_67732ae0ef5705bc4@john-XPS-13-9370.notmuch>
On Tue, 20 Aug 2019 22:18:30 -0700, John Fastabend wrote:
> > > I suspect you've triaged through this already on your side for other
> > > reasons, so perhaps you could help come up with a sane set of TLS
> > > bug fix backports that would be appropriate for -stable?
> >
> > I'm planning to spend tomorrow working exactly on v4.19 backport.
> > I have internal reports of openssl failing on v4.19 while v4.20
> > works fine.. Hopefully I'll be able to figure that one out, test the
> > above and see if there are any other missing fixes.
> >
> > Is it okay if I come back to this tomorrow?
>
> Is the failure with hw offload or sw case?
SW case, strangely enough. Large file transfer, I think with openssl
client..
> If its sendpage related looks like we also need to push the following
> patch back to 4.19,
>
> commit 648ee6cea7dde4a5cdf817e5d964fd60b22006a4
> Author: John Fastabend <john.fastabend@gmail.com>
> Date: Wed Jun 12 17:23:57 2019 +0000
>
> net: tls, correctly account for copied bytes with multiple sk_msgs
I had a quick look at that, but the commit in Fixes is not in v4.19.
> If you have more details I can also spend some cycles looking into it.
Awesome, I'll let you know what the details are as soon as I get them.
^ permalink raw reply
* Re: [PATCH RFC ipsec-next 0/7] ipsec: add TCP encapsulation support (RFC 8229)
From: Steffen Klassert @ 2019-08-21 6:59 UTC (permalink / raw)
To: Sabrina Dubroca; +Cc: netdev, Herbert Xu
In-Reply-To: <20190816141814.GA12002@bistromath.localdomain>
On Fri, Aug 16, 2019 at 04:18:14PM +0200, Sabrina Dubroca wrote:
> Hi Steffen,
>
> 2019-06-25, 12:11:33 +0200, Sabrina Dubroca wrote:
> > This patchset introduces support for TCP encapsulation of IKE and ESP
> > messages, as defined by RFC 8229 [0]. It is an evolution of what
> > Herbert Xu proposed in January 2018 [1] that addresses the main
> > criticism against it, by not interfering with the TCP implementation
> > at all. The networking stack now has infrastructure for this: TCP ULPs
> > and Stream Parsers.
>
> Have you had a chance to look at this? I was going to rebase and
> resend, but the patches still apply to ipsec-next and net-next (patch
> 2 is already in net-next as commit bd95e678e0f6).
I had a look and I have no general objection against this. If you
think the patchset is ready for inclusion, just remove the RFC and
resend it. I'll have a closer on it look then.
^ permalink raw reply
* [PATCH net-next 0/7] mlxsw: Add devlink-trap support
From: Ido Schimmel @ 2019-08-21 7:19 UTC (permalink / raw)
To: netdev; +Cc: davem, jiri, dsahern, mlxsw, Ido Schimmel
From: Ido Schimmel <idosch@mellanox.com>
This patchset adds devlink-trap support in mlxsw.
Patches #1-#4 add the necessary APIs and defines in mlxsw.
Patch #5 implements devlink-trap support for layer 2 drops. More drops
will be added in the future.
Patches #6-#7 add selftests to make sure that all the new code paths are
exercised and that the feature is working as expected.
Ido Schimmel (7):
mlxsw: core: Add API to set trap action
mlxsw: reg: Add new trap actions
mlxsw: Add layer 2 discard trap IDs
mlxsw: Add trap group for layer 2 discards
mlxsw: spectrum: Add devlink-trap support
selftests: mlxsw: Add test cases for devlink-trap L2 drops
selftests: mlxsw: Add a test case for devlink-trap
drivers/net/ethernet/mellanox/mlxsw/Makefile | 2 +-
drivers/net/ethernet/mellanox/mlxsw/core.c | 64 +++
drivers/net/ethernet/mellanox/mlxsw/core.h | 12 +
drivers/net/ethernet/mellanox/mlxsw/reg.h | 12 +
.../net/ethernet/mellanox/mlxsw/spectrum.c | 21 +
.../net/ethernet/mellanox/mlxsw/spectrum.h | 13 +
.../ethernet/mellanox/mlxsw/spectrum_trap.c | 267 ++++++++++
drivers/net/ethernet/mellanox/mlxsw/trap.h | 7 +
.../drivers/net/mlxsw/devlink_trap.sh | 129 +++++
.../net/mlxsw/devlink_trap_l2_drops.sh | 484 ++++++++++++++++++
10 files changed, 1010 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/ethernet/mellanox/mlxsw/spectrum_trap.c
create mode 100755 tools/testing/selftests/drivers/net/mlxsw/devlink_trap.sh
create mode 100755 tools/testing/selftests/drivers/net/mlxsw/devlink_trap_l2_drops.sh
--
2.21.0
^ permalink raw reply
* [PATCH net-next 1/7] mlxsw: core: Add API to set trap action
From: Ido Schimmel @ 2019-08-21 7:19 UTC (permalink / raw)
To: netdev; +Cc: davem, jiri, dsahern, mlxsw, Ido Schimmel
In-Reply-To: <20190821071937.13622-1-idosch@idosch.org>
From: Ido Schimmel <idosch@mellanox.com>
Up until now the action of a trap was never changed during its lifetime.
This is going to change by subsequent patches that will allow devlink to
control the action of certain traps.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
drivers/net/ethernet/mellanox/mlxsw/core.c | 12 ++++++++++++
drivers/net/ethernet/mellanox/mlxsw/core.h | 3 +++
2 files changed, 15 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index 17ceac7505e5..6ec07ecfb5f6 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -1477,6 +1477,18 @@ void mlxsw_core_trap_unregister(struct mlxsw_core *mlxsw_core,
}
EXPORT_SYMBOL(mlxsw_core_trap_unregister);
+int mlxsw_core_trap_action_set(struct mlxsw_core *mlxsw_core,
+ const struct mlxsw_listener *listener,
+ enum mlxsw_reg_hpkt_action action)
+{
+ char hpkt_pl[MLXSW_REG_HPKT_LEN];
+
+ mlxsw_reg_hpkt_pack(hpkt_pl, action, listener->trap_id,
+ listener->trap_group, listener->is_ctrl);
+ return mlxsw_reg_write(mlxsw_core, MLXSW_REG(hpkt), hpkt_pl);
+}
+EXPORT_SYMBOL(mlxsw_core_trap_action_set);
+
static u64 mlxsw_core_tid_get(struct mlxsw_core *mlxsw_core)
{
return atomic64_inc_return(&mlxsw_core->emad.tid);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.h b/drivers/net/ethernet/mellanox/mlxsw/core.h
index 8efcff4b59cb..19cea16c30bb 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.h
@@ -128,6 +128,9 @@ int mlxsw_core_trap_register(struct mlxsw_core *mlxsw_core,
void mlxsw_core_trap_unregister(struct mlxsw_core *mlxsw_core,
const struct mlxsw_listener *listener,
void *priv);
+int mlxsw_core_trap_action_set(struct mlxsw_core *mlxsw_core,
+ const struct mlxsw_listener *listener,
+ enum mlxsw_reg_hpkt_action action);
typedef void mlxsw_reg_trans_cb_t(struct mlxsw_core *mlxsw_core, char *payload,
size_t payload_len, unsigned long cb_priv);
--
2.21.0
^ permalink raw reply related
* [PATCH net-next 2/7] mlxsw: reg: Add new trap actions
From: Ido Schimmel @ 2019-08-21 7:19 UTC (permalink / raw)
To: netdev; +Cc: davem, jiri, dsahern, mlxsw, Ido Schimmel
In-Reply-To: <20190821071937.13622-1-idosch@idosch.org>
From: Ido Schimmel <idosch@mellanox.com>
Subsequent patches will add discard traps support in mlxsw. The driver
cannot configure such traps with a normal trap action, but needs to use
exception trap action, which also increments an error counter.
On the other hand, when these traps are initialized or set to drop
action, they should use the default drop action set by the firmware.
This guarantees that when the feature is disabled we get the exact same
behavior as before the feature was introduced.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
drivers/net/ethernet/mellanox/mlxsw/reg.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h b/drivers/net/ethernet/mellanox/mlxsw/reg.h
index ead36702549a..59e296562b5a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/reg.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h
@@ -5559,6 +5559,8 @@ enum mlxsw_reg_hpkt_action {
MLXSW_REG_HPKT_ACTION_DISCARD,
MLXSW_REG_HPKT_ACTION_SOFT_DISCARD,
MLXSW_REG_HPKT_ACTION_TRAP_AND_SOFT_DISCARD,
+ MLXSW_REG_HPKT_ACTION_TRAP_EXCEPTION_TO_CPU,
+ MLXSW_REG_HPKT_ACTION_SET_FW_DEFAULT = 15,
};
/* reg_hpkt_action
@@ -5569,6 +5571,8 @@ enum mlxsw_reg_hpkt_action {
* 3 - Discard.
* 4 - Soft discard (allow other traps to act on the packet).
* 5 - Trap and soft discard (allow other traps to overwrite this trap).
+ * 6 - Trap to CPU (CPU receives sole copy) and count it as error.
+ * 15 - Restore the firmware's default action.
* Access: RW
*
* Note: Must be set to 0 (forward) for event trap IDs, as they are already
--
2.21.0
^ permalink raw reply related
* [PATCH net-next 3/7] mlxsw: Add layer 2 discard trap IDs
From: Ido Schimmel @ 2019-08-21 7:19 UTC (permalink / raw)
To: netdev; +Cc: davem, jiri, dsahern, mlxsw, Ido Schimmel
In-Reply-To: <20190821071937.13622-1-idosch@idosch.org>
From: Ido Schimmel <idosch@mellanox.com>
Add the trap IDs used to report layer 2 drops.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
drivers/net/ethernet/mellanox/mlxsw/trap.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/trap.h b/drivers/net/ethernet/mellanox/mlxsw/trap.h
index 19202bdb5105..7618f084cae9 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/trap.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/trap.h
@@ -66,6 +66,13 @@ enum {
MLXSW_TRAP_ID_NVE_ENCAP_ARP = 0xBD,
MLXSW_TRAP_ID_ROUTER_ALERT_IPV4 = 0xD6,
MLXSW_TRAP_ID_ROUTER_ALERT_IPV6 = 0xD7,
+ MLXSW_TRAP_ID_DISCARD_ING_PACKET_SMAC_MC = 0x140,
+ MLXSW_TRAP_ID_DISCARD_ING_SWITCH_VTAG_ALLOW = 0x148,
+ MLXSW_TRAP_ID_DISCARD_ING_SWITCH_VLAN = 0x149,
+ MLXSW_TRAP_ID_DISCARD_ING_SWITCH_STP = 0x14A,
+ MLXSW_TRAP_ID_DISCARD_LOOKUP_SWITCH_UC = 0x150,
+ MLXSW_TRAP_ID_DISCARD_LOOKUP_SWITCH_MC_NULL = 0x151,
+ MLXSW_TRAP_ID_DISCARD_LOOKUP_SWITCH_LB = 0x152,
MLXSW_TRAP_ID_ACL0 = 0x1C0,
/* Multicast trap used for routes with trap action */
MLXSW_TRAP_ID_ACL1 = 0x1C1,
--
2.21.0
^ permalink raw reply related
* [PATCH net-next 4/7] mlxsw: Add trap group for layer 2 discards
From: Ido Schimmel @ 2019-08-21 7:19 UTC (permalink / raw)
To: netdev; +Cc: davem, jiri, dsahern, mlxsw, Ido Schimmel
In-Reply-To: <20190821071937.13622-1-idosch@idosch.org>
From: Ido Schimmel <idosch@mellanox.com>
Discard trap groups are defined in a different enum so that they could
all share the same policer ID: MLXSW_REG_HTGT_TRAP_GROUP_MAX + 1.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
drivers/net/ethernet/mellanox/mlxsw/reg.h | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h b/drivers/net/ethernet/mellanox/mlxsw/reg.h
index 59e296562b5a..baa20cdd65df 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/reg.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h
@@ -5422,6 +5422,14 @@ enum mlxsw_reg_htgt_trap_group {
MLXSW_REG_HTGT_TRAP_GROUP_SP_LBERROR,
MLXSW_REG_HTGT_TRAP_GROUP_SP_PTP0,
MLXSW_REG_HTGT_TRAP_GROUP_SP_PTP1,
+
+ __MLXSW_REG_HTGT_TRAP_GROUP_MAX,
+ MLXSW_REG_HTGT_TRAP_GROUP_MAX = __MLXSW_REG_HTGT_TRAP_GROUP_MAX - 1
+};
+
+enum mlxsw_reg_htgt_discard_trap_group {
+ MLXSW_REG_HTGT_DISCARD_TRAP_GROUP_BASE = MLXSW_REG_HTGT_TRAP_GROUP_MAX,
+ MLXSW_REG_HTGT_TRAP_GROUP_SP_L2_DISCARDS,
};
/* reg_htgt_trap_group
--
2.21.0
^ permalink raw reply related
* [PATCH net-next 5/7] mlxsw: spectrum: Add devlink-trap support
From: Ido Schimmel @ 2019-08-21 7:19 UTC (permalink / raw)
To: netdev; +Cc: davem, jiri, dsahern, mlxsw, Ido Schimmel
In-Reply-To: <20190821071937.13622-1-idosch@idosch.org>
From: Ido Schimmel <idosch@mellanox.com>
Register supported packet traps (layer 2 drops only, currently) and
associated trap group with devlink during driver initialization.
The amount of traffic generated by these packet drop traps is capped at
10Kpps to ensure the CPU is not overwhelmed by incoming packets.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
drivers/net/ethernet/mellanox/mlxsw/Makefile | 2 +-
drivers/net/ethernet/mellanox/mlxsw/core.c | 52 ++++
drivers/net/ethernet/mellanox/mlxsw/core.h | 9 +
.../net/ethernet/mellanox/mlxsw/spectrum.c | 21 ++
.../net/ethernet/mellanox/mlxsw/spectrum.h | 13 +
.../ethernet/mellanox/mlxsw/spectrum_trap.c | 267 ++++++++++++++++++
6 files changed, 363 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/ethernet/mellanox/mlxsw/spectrum_trap.c
diff --git a/drivers/net/ethernet/mellanox/mlxsw/Makefile b/drivers/net/ethernet/mellanox/mlxsw/Makefile
index 171b36bd8a4e..0e86a581d45b 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/Makefile
+++ b/drivers/net/ethernet/mellanox/mlxsw/Makefile
@@ -29,7 +29,7 @@ mlxsw_spectrum-objs := spectrum.o spectrum_buffers.o \
spectrum_mr_tcam.o spectrum_mr.o \
spectrum_qdisc.o spectrum_span.o \
spectrum_nve.o spectrum_nve_vxlan.o \
- spectrum_dpipe.o
+ spectrum_dpipe.o spectrum_trap.o
mlxsw_spectrum-$(CONFIG_MLXSW_SPECTRUM_DCB) += spectrum_dcb.o
mlxsw_spectrum-$(CONFIG_PTP_1588_CLOCK) += spectrum_ptp.o
obj-$(CONFIG_MLXSW_MINIMAL) += mlxsw_minimal.o
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index 6ec07ecfb5f6..963a2b4b61b1 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -1017,6 +1017,54 @@ static int mlxsw_devlink_flash_update(struct devlink *devlink,
component, extack);
}
+static int mlxsw_devlink_trap_init(struct devlink *devlink,
+ const struct devlink_trap *trap,
+ void *trap_ctx)
+{
+ struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
+ struct mlxsw_driver *mlxsw_driver = mlxsw_core->driver;
+
+ if (!mlxsw_driver->trap_init)
+ return -EOPNOTSUPP;
+ return mlxsw_driver->trap_init(mlxsw_core, trap, trap_ctx);
+}
+
+static void mlxsw_devlink_trap_fini(struct devlink *devlink,
+ const struct devlink_trap *trap,
+ void *trap_ctx)
+{
+ struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
+ struct mlxsw_driver *mlxsw_driver = mlxsw_core->driver;
+
+ if (!mlxsw_driver->trap_fini)
+ return;
+ mlxsw_driver->trap_fini(mlxsw_core, trap, trap_ctx);
+}
+
+static int mlxsw_devlink_trap_action_set(struct devlink *devlink,
+ const struct devlink_trap *trap,
+ enum devlink_trap_action action)
+{
+ struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
+ struct mlxsw_driver *mlxsw_driver = mlxsw_core->driver;
+
+ if (!mlxsw_driver->trap_action_set)
+ return -EOPNOTSUPP;
+ return mlxsw_driver->trap_action_set(mlxsw_core, trap, action);
+}
+
+static int
+mlxsw_devlink_trap_group_init(struct devlink *devlink,
+ const struct devlink_trap_group *group)
+{
+ struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
+ struct mlxsw_driver *mlxsw_driver = mlxsw_core->driver;
+
+ if (!mlxsw_driver->trap_group_init)
+ return -EOPNOTSUPP;
+ return mlxsw_driver->trap_group_init(mlxsw_core, group);
+}
+
static const struct devlink_ops mlxsw_devlink_ops = {
.reload = mlxsw_devlink_core_bus_device_reload,
.port_type_set = mlxsw_devlink_port_type_set,
@@ -1034,6 +1082,10 @@ static const struct devlink_ops mlxsw_devlink_ops = {
.sb_occ_tc_port_bind_get = mlxsw_devlink_sb_occ_tc_port_bind_get,
.info_get = mlxsw_devlink_info_get,
.flash_update = mlxsw_devlink_flash_update,
+ .trap_init = mlxsw_devlink_trap_init,
+ .trap_fini = mlxsw_devlink_trap_fini,
+ .trap_action_set = mlxsw_devlink_trap_action_set,
+ .trap_group_init = mlxsw_devlink_trap_group_init,
};
static int
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.h b/drivers/net/ethernet/mellanox/mlxsw/core.h
index 19cea16c30bb..b65a17d49e43 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.h
@@ -292,6 +292,15 @@ struct mlxsw_driver {
int (*flash_update)(struct mlxsw_core *mlxsw_core,
const char *file_name, const char *component,
struct netlink_ext_ack *extack);
+ int (*trap_init)(struct mlxsw_core *mlxsw_core,
+ const struct devlink_trap *trap, void *trap_ctx);
+ void (*trap_fini)(struct mlxsw_core *mlxsw_core,
+ const struct devlink_trap *trap, void *trap_ctx);
+ int (*trap_action_set)(struct mlxsw_core *mlxsw_core,
+ const struct devlink_trap *trap,
+ enum devlink_trap_action action);
+ int (*trap_group_init)(struct mlxsw_core *mlxsw_core,
+ const struct devlink_trap_group *group);
void (*txhdr_construct)(struct sk_buff *skb,
const struct mlxsw_tx_info *tx_info);
int (*resources_register)(struct mlxsw_core *mlxsw_core);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 389861ece418..7de9833fc60b 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -4665,6 +4665,12 @@ static int mlxsw_sp_init(struct mlxsw_core *mlxsw_core,
goto err_traps_init;
}
+ err = mlxsw_sp_devlink_traps_init(mlxsw_sp);
+ if (err) {
+ dev_err(mlxsw_sp->bus_info->dev, "Failed to initialize devlink traps\n");
+ goto err_devlink_traps_init;
+ }
+
err = mlxsw_sp_buffers_init(mlxsw_sp);
if (err) {
dev_err(mlxsw_sp->bus_info->dev, "Failed to initialize buffers\n");
@@ -4798,6 +4804,8 @@ static int mlxsw_sp_init(struct mlxsw_core *mlxsw_core,
err_lag_init:
mlxsw_sp_buffers_fini(mlxsw_sp);
err_buffers_init:
+ mlxsw_sp_devlink_traps_fini(mlxsw_sp);
+err_devlink_traps_init:
mlxsw_sp_traps_fini(mlxsw_sp);
err_traps_init:
mlxsw_sp_fids_fini(mlxsw_sp);
@@ -4870,6 +4878,7 @@ static void mlxsw_sp_fini(struct mlxsw_core *mlxsw_core)
mlxsw_sp_span_fini(mlxsw_sp);
mlxsw_sp_lag_fini(mlxsw_sp);
mlxsw_sp_buffers_fini(mlxsw_sp);
+ mlxsw_sp_devlink_traps_fini(mlxsw_sp);
mlxsw_sp_traps_fini(mlxsw_sp);
mlxsw_sp_fids_fini(mlxsw_sp);
mlxsw_sp_kvdl_fini(mlxsw_sp);
@@ -5251,6 +5260,10 @@ static struct mlxsw_driver mlxsw_sp1_driver = {
.sb_occ_port_pool_get = mlxsw_sp_sb_occ_port_pool_get,
.sb_occ_tc_port_bind_get = mlxsw_sp_sb_occ_tc_port_bind_get,
.flash_update = mlxsw_sp_flash_update,
+ .trap_init = mlxsw_sp_trap_init,
+ .trap_fini = mlxsw_sp_trap_fini,
+ .trap_action_set = mlxsw_sp_trap_action_set,
+ .trap_group_init = mlxsw_sp_trap_group_init,
.txhdr_construct = mlxsw_sp_txhdr_construct,
.resources_register = mlxsw_sp1_resources_register,
.kvd_sizes_get = mlxsw_sp_kvd_sizes_get,
@@ -5281,6 +5294,10 @@ static struct mlxsw_driver mlxsw_sp2_driver = {
.sb_occ_port_pool_get = mlxsw_sp_sb_occ_port_pool_get,
.sb_occ_tc_port_bind_get = mlxsw_sp_sb_occ_tc_port_bind_get,
.flash_update = mlxsw_sp_flash_update,
+ .trap_init = mlxsw_sp_trap_init,
+ .trap_fini = mlxsw_sp_trap_fini,
+ .trap_action_set = mlxsw_sp_trap_action_set,
+ .trap_group_init = mlxsw_sp_trap_group_init,
.txhdr_construct = mlxsw_sp_txhdr_construct,
.resources_register = mlxsw_sp2_resources_register,
.params_register = mlxsw_sp2_params_register,
@@ -5310,6 +5327,10 @@ static struct mlxsw_driver mlxsw_sp3_driver = {
.sb_occ_port_pool_get = mlxsw_sp_sb_occ_port_pool_get,
.sb_occ_tc_port_bind_get = mlxsw_sp_sb_occ_tc_port_bind_get,
.flash_update = mlxsw_sp_flash_update,
+ .trap_init = mlxsw_sp_trap_init,
+ .trap_fini = mlxsw_sp_trap_fini,
+ .trap_action_set = mlxsw_sp_trap_action_set,
+ .trap_group_init = mlxsw_sp_trap_group_init,
.txhdr_construct = mlxsw_sp_txhdr_construct,
.resources_register = mlxsw_sp2_resources_register,
.params_register = mlxsw_sp2_params_register,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index db17ba35ec84..20c14bba9ccb 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -958,4 +958,17 @@ void mlxsw_sp_nve_fini(struct mlxsw_sp *mlxsw_sp);
int mlxsw_sp_nve_inc_parsing_depth_get(struct mlxsw_sp *mlxsw_sp);
void mlxsw_sp_nve_inc_parsing_depth_put(struct mlxsw_sp *mlxsw_sp);
+/* spectrum_trap.c */
+int mlxsw_sp_devlink_traps_init(struct mlxsw_sp *mlxsw_sp);
+void mlxsw_sp_devlink_traps_fini(struct mlxsw_sp *mlxsw_sp);
+int mlxsw_sp_trap_init(struct mlxsw_core *mlxsw_core,
+ const struct devlink_trap *trap, void *trap_ctx);
+void mlxsw_sp_trap_fini(struct mlxsw_core *mlxsw_core,
+ const struct devlink_trap *trap, void *trap_ctx);
+int mlxsw_sp_trap_action_set(struct mlxsw_core *mlxsw_core,
+ const struct devlink_trap *trap,
+ enum devlink_trap_action action);
+int mlxsw_sp_trap_group_init(struct mlxsw_core *mlxsw_core,
+ const struct devlink_trap_group *group);
+
#endif
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_trap.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_trap.c
new file mode 100644
index 000000000000..899450b28621
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_trap.c
@@ -0,0 +1,267 @@
+// SPDX-License-Identifier: BSD-3-Clause OR GPL-2.0
+/* Copyright (c) 2019 Mellanox Technologies. All rights reserved */
+
+#include <linux/kernel.h>
+#include <net/devlink.h>
+#include <uapi/linux/devlink.h>
+
+#include "core.h"
+#include "reg.h"
+#include "spectrum.h"
+
+#define MLXSW_SP_TRAP_METADATA DEVLINK_TRAP_METADATA_TYPE_F_IN_PORT
+
+static void mlxsw_sp_rx_drop_listener(struct sk_buff *skb, u8 local_port,
+ void *priv);
+
+#define MLXSW_SP_TRAP_DROP(_id, _group_id) \
+ DEVLINK_TRAP_GENERIC(DROP, DROP, _id, \
+ DEVLINK_TRAP_GROUP_GENERIC(_group_id), \
+ MLXSW_SP_TRAP_METADATA)
+
+#define MLXSW_SP_RXL_DISCARD(_id, _group_id) \
+ MLXSW_RXL(mlxsw_sp_rx_drop_listener, DISCARD_##_id, SET_FW_DEFAULT, \
+ false, SP_##_group_id, DISCARD)
+
+static struct devlink_trap mlxsw_sp_traps_arr[] = {
+ MLXSW_SP_TRAP_DROP(SMAC_MC, L2_DROPS),
+ MLXSW_SP_TRAP_DROP(VLAN_TAG_MISMATCH, L2_DROPS),
+ MLXSW_SP_TRAP_DROP(INGRESS_VLAN_FILTER, L2_DROPS),
+ MLXSW_SP_TRAP_DROP(INGRESS_STP_FILTER, L2_DROPS),
+ MLXSW_SP_TRAP_DROP(EMPTY_TX_LIST, L2_DROPS),
+ MLXSW_SP_TRAP_DROP(PORT_LOOPBACK_FILTER, L2_DROPS),
+};
+
+static struct mlxsw_listener mlxsw_sp_listeners_arr[] = {
+ MLXSW_SP_RXL_DISCARD(ING_PACKET_SMAC_MC, L2_DISCARDS),
+ MLXSW_SP_RXL_DISCARD(ING_SWITCH_VTAG_ALLOW, L2_DISCARDS),
+ MLXSW_SP_RXL_DISCARD(ING_SWITCH_VLAN, L2_DISCARDS),
+ MLXSW_SP_RXL_DISCARD(ING_SWITCH_STP, L2_DISCARDS),
+ MLXSW_SP_RXL_DISCARD(LOOKUP_SWITCH_UC, L2_DISCARDS),
+ MLXSW_SP_RXL_DISCARD(LOOKUP_SWITCH_MC_NULL, L2_DISCARDS),
+ MLXSW_SP_RXL_DISCARD(LOOKUP_SWITCH_LB, L2_DISCARDS),
+};
+
+/* Mapping between hardware trap and devlink trap. Multiple hardware traps can
+ * be mapped to the same devlink trap. Order is according to
+ * 'mlxsw_sp_listeners_arr'.
+ */
+static u16 mlxsw_sp_listener_devlink_map[] = {
+ DEVLINK_TRAP_GENERIC_ID_SMAC_MC,
+ DEVLINK_TRAP_GENERIC_ID_VLAN_TAG_MISMATCH,
+ DEVLINK_TRAP_GENERIC_ID_INGRESS_VLAN_FILTER,
+ DEVLINK_TRAP_GENERIC_ID_INGRESS_STP_FILTER,
+ DEVLINK_TRAP_GENERIC_ID_EMPTY_TX_LIST,
+ DEVLINK_TRAP_GENERIC_ID_EMPTY_TX_LIST,
+ DEVLINK_TRAP_GENERIC_ID_PORT_LOOPBACK_FILTER,
+};
+
+static int mlxsw_sp_rx_listener(struct mlxsw_sp *mlxsw_sp, struct sk_buff *skb,
+ u8 local_port,
+ struct mlxsw_sp_port *mlxsw_sp_port)
+{
+ struct mlxsw_sp_port_pcpu_stats *pcpu_stats;
+
+ if (unlikely(!mlxsw_sp_port)) {
+ dev_warn_ratelimited(mlxsw_sp->bus_info->dev, "Port %d: skb received for non-existent port\n",
+ local_port);
+ kfree_skb(skb);
+ return -EINVAL;
+ }
+
+ skb->dev = mlxsw_sp_port->dev;
+
+ pcpu_stats = this_cpu_ptr(mlxsw_sp_port->pcpu_stats);
+ u64_stats_update_begin(&pcpu_stats->syncp);
+ pcpu_stats->rx_packets++;
+ pcpu_stats->rx_bytes += skb->len;
+ u64_stats_update_end(&pcpu_stats->syncp);
+
+ skb->protocol = eth_type_trans(skb, skb->dev);
+
+ return 0;
+}
+
+static void mlxsw_sp_rx_drop_listener(struct sk_buff *skb, u8 local_port,
+ void *trap_ctx)
+{
+ struct devlink_port *in_devlink_port;
+ struct mlxsw_sp_port *mlxsw_sp_port;
+ struct mlxsw_sp *mlxsw_sp;
+ struct devlink *devlink;
+
+ mlxsw_sp = devlink_trap_ctx_priv(trap_ctx);
+ mlxsw_sp_port = mlxsw_sp->ports[local_port];
+
+ if (mlxsw_sp_rx_listener(mlxsw_sp, skb, local_port, mlxsw_sp_port))
+ return;
+
+ devlink = priv_to_devlink(mlxsw_sp->core);
+ in_devlink_port = mlxsw_core_port_devlink_port_get(mlxsw_sp->core,
+ local_port);
+ devlink_trap_report(devlink, skb, trap_ctx, in_devlink_port);
+ consume_skb(skb);
+}
+
+int mlxsw_sp_devlink_traps_init(struct mlxsw_sp *mlxsw_sp)
+{
+ struct devlink *devlink = priv_to_devlink(mlxsw_sp->core);
+
+ if (WARN_ON(ARRAY_SIZE(mlxsw_sp_listener_devlink_map) !=
+ ARRAY_SIZE(mlxsw_sp_listeners_arr)))
+ return -EINVAL;
+
+ return devlink_traps_register(devlink, mlxsw_sp_traps_arr,
+ ARRAY_SIZE(mlxsw_sp_traps_arr),
+ mlxsw_sp);
+}
+
+void mlxsw_sp_devlink_traps_fini(struct mlxsw_sp *mlxsw_sp)
+{
+ struct devlink *devlink = priv_to_devlink(mlxsw_sp->core);
+
+ devlink_traps_unregister(devlink, mlxsw_sp_traps_arr,
+ ARRAY_SIZE(mlxsw_sp_traps_arr));
+}
+
+int mlxsw_sp_trap_init(struct mlxsw_core *mlxsw_core,
+ const struct devlink_trap *trap, void *trap_ctx)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(mlxsw_sp_listener_devlink_map); i++) {
+ struct mlxsw_listener *listener;
+ int err;
+
+ if (mlxsw_sp_listener_devlink_map[i] != trap->id)
+ continue;
+ listener = &mlxsw_sp_listeners_arr[i];
+
+ err = mlxsw_core_trap_register(mlxsw_core, listener, trap_ctx);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+void mlxsw_sp_trap_fini(struct mlxsw_core *mlxsw_core,
+ const struct devlink_trap *trap, void *trap_ctx)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(mlxsw_sp_listener_devlink_map); i++) {
+ struct mlxsw_listener *listener;
+
+ if (mlxsw_sp_listener_devlink_map[i] != trap->id)
+ continue;
+ listener = &mlxsw_sp_listeners_arr[i];
+
+ mlxsw_core_trap_unregister(mlxsw_core, listener, trap_ctx);
+ }
+}
+
+int mlxsw_sp_trap_action_set(struct mlxsw_core *mlxsw_core,
+ const struct devlink_trap *trap,
+ enum devlink_trap_action action)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(mlxsw_sp_listener_devlink_map); i++) {
+ enum mlxsw_reg_hpkt_action hw_action;
+ struct mlxsw_listener *listener;
+ int err;
+
+ if (mlxsw_sp_listener_devlink_map[i] != trap->id)
+ continue;
+ listener = &mlxsw_sp_listeners_arr[i];
+
+ switch (action) {
+ case DEVLINK_TRAP_ACTION_DROP:
+ hw_action = MLXSW_REG_HPKT_ACTION_SET_FW_DEFAULT;
+ break;
+ case DEVLINK_TRAP_ACTION_TRAP:
+ hw_action = MLXSW_REG_HPKT_ACTION_TRAP_EXCEPTION_TO_CPU;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ err = mlxsw_core_trap_action_set(mlxsw_core, listener,
+ hw_action);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+#define MLXSW_SP_DISCARD_POLICER_ID (MLXSW_REG_HTGT_TRAP_GROUP_MAX + 1)
+
+static int
+mlxsw_sp_trap_group_policer_init(struct mlxsw_sp *mlxsw_sp,
+ const struct devlink_trap_group *group)
+{
+ enum mlxsw_reg_qpcr_ir_units ir_units;
+ char qpcr_pl[MLXSW_REG_QPCR_LEN];
+ u16 policer_id;
+ u8 burst_size;
+ bool is_bytes;
+ u32 rate;
+
+ switch (group->id) {
+ case DEVLINK_TRAP_GROUP_GENERIC_ID_L2_DROPS:
+ policer_id = MLXSW_SP_DISCARD_POLICER_ID;
+ ir_units = MLXSW_REG_QPCR_IR_UNITS_M;
+ is_bytes = false;
+ rate = 10 * 1024; /* 10Kpps */
+ burst_size = 7;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ mlxsw_reg_qpcr_pack(qpcr_pl, policer_id, ir_units, is_bytes, rate,
+ burst_size);
+ return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(qpcr), qpcr_pl);
+}
+
+static int
+__mlxsw_sp_trap_group_init(struct mlxsw_sp *mlxsw_sp,
+ const struct devlink_trap_group *group)
+{
+ char htgt_pl[MLXSW_REG_HTGT_LEN];
+ u8 priority, tc, group_id;
+ u16 policer_id;
+
+ switch (group->id) {
+ case DEVLINK_TRAP_GROUP_GENERIC_ID_L2_DROPS:
+ group_id = MLXSW_REG_HTGT_TRAP_GROUP_SP_L2_DISCARDS;
+ policer_id = MLXSW_SP_DISCARD_POLICER_ID;
+ priority = 0;
+ tc = 1;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ mlxsw_reg_htgt_pack(htgt_pl, group_id, policer_id, priority, tc);
+ return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(htgt), htgt_pl);
+}
+
+int mlxsw_sp_trap_group_init(struct mlxsw_core *mlxsw_core,
+ const struct devlink_trap_group *group)
+{
+ struct mlxsw_sp *mlxsw_sp = mlxsw_core_driver_priv(mlxsw_core);
+ int err;
+
+ err = mlxsw_sp_trap_group_policer_init(mlxsw_sp, group);
+ if (err)
+ return err;
+
+ err = __mlxsw_sp_trap_group_init(mlxsw_sp, group);
+ if (err)
+ return err;
+
+ return 0;
+}
--
2.21.0
^ permalink raw reply related
* [PATCH net-next 6/7] selftests: mlxsw: Add test cases for devlink-trap L2 drops
From: Ido Schimmel @ 2019-08-21 7:19 UTC (permalink / raw)
To: netdev; +Cc: davem, jiri, dsahern, mlxsw, Ido Schimmel
In-Reply-To: <20190821071937.13622-1-idosch@idosch.org>
From: Ido Schimmel <idosch@mellanox.com>
Test that each supported packet trap is triggered under the right
conditions and that packets are indeed dropped and not forwarded.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
.../net/mlxsw/devlink_trap_l2_drops.sh | 484 ++++++++++++++++++
1 file changed, 484 insertions(+)
create mode 100755 tools/testing/selftests/drivers/net/mlxsw/devlink_trap_l2_drops.sh
diff --git a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_l2_drops.sh b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_l2_drops.sh
new file mode 100755
index 000000000000..5dcdfa20fc6c
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap_l2_drops.sh
@@ -0,0 +1,484 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Test devlink-trap L2 drops functionality over mlxsw. Each registered L2 drop
+# packet trap is tested to make sure it is triggered under the right
+# conditions.
+
+lib_dir=$(dirname $0)/../../../net/forwarding
+
+ALL_TESTS="
+ source_mac_is_multicast_test
+ vlan_tag_mismatch_test
+ ingress_vlan_filter_test
+ ingress_stp_filter_test
+ port_list_is_empty_test
+ port_loopback_filter_test
+"
+NUM_NETIFS=4
+source $lib_dir/tc_common.sh
+source $lib_dir/lib.sh
+source $lib_dir/devlink_lib.sh
+
+h1_create()
+{
+ simple_if_init $h1
+}
+
+h1_destroy()
+{
+ simple_if_fini $h1
+}
+
+h2_create()
+{
+ simple_if_init $h2
+}
+
+h2_destroy()
+{
+ simple_if_fini $h2
+}
+
+switch_create()
+{
+ ip link add dev br0 type bridge vlan_filtering 1 mcast_snooping 0
+
+ ip link set dev $swp1 master br0
+ ip link set dev $swp2 master br0
+
+ ip link set dev br0 up
+ ip link set dev $swp1 up
+ ip link set dev $swp2 up
+
+ tc qdisc add dev $swp2 clsact
+}
+
+switch_destroy()
+{
+ tc qdisc del dev $swp2 clsact
+
+ ip link set dev $swp2 down
+ ip link set dev $swp1 down
+
+ ip link del dev br0
+}
+
+setup_prepare()
+{
+ h1=${NETIFS[p1]}
+ swp1=${NETIFS[p2]}
+
+ swp2=${NETIFS[p3]}
+ h2=${NETIFS[p4]}
+
+ vrf_prepare
+
+ h1_create
+ h2_create
+
+ switch_create
+}
+
+cleanup()
+{
+ pre_cleanup
+
+ switch_destroy
+
+ h2_destroy
+ h1_destroy
+
+ vrf_cleanup
+}
+
+l2_drops_test()
+{
+ local trap_name=$1; shift
+ local group_name=$1; shift
+
+ # This is the common part of all the tests. It checks that stats are
+ # initially idle, then non-idle after changing the trap action and
+ # finally idle again. It also makes sure the packets are dropped and
+ # never forwarded.
+ devlink_trap_stats_idle_test $trap_name
+ check_err $? "Trap stats not idle with initial drop action"
+ devlink_trap_group_stats_idle_test $group_name
+ check_err $? "Trap group stats not idle with initial drop action"
+
+ devlink_trap_action_set $trap_name "trap"
+
+ devlink_trap_stats_idle_test $trap_name
+ check_fail $? "Trap stats idle after setting action to trap"
+ devlink_trap_group_stats_idle_test $group_name
+ check_fail $? "Trap group stats idle after setting action to trap"
+
+ devlink_trap_action_set $trap_name "drop"
+
+ devlink_trap_stats_idle_test $trap_name
+ check_err $? "Trap stats not idle after setting action to drop"
+ devlink_trap_group_stats_idle_test $group_name
+ check_err $? "Trap group stats not idle after setting action to drop"
+
+ tc_check_packets "dev $swp2 egress" 101 0
+ check_err $? "Packets were not dropped"
+}
+
+l2_drops_cleanup()
+{
+ local mz_pid=$1; shift
+
+ kill $mz_pid && wait $mz_pid &> /dev/null
+ tc filter del dev $swp2 egress protocol ip pref 1 handle 101 flower
+}
+
+source_mac_is_multicast_test()
+{
+ local trap_name="source_mac_is_multicast"
+ local smac=01:02:03:04:05:06
+ local group_name="l2_drops"
+ local mz_pid
+
+ tc filter add dev $swp2 egress protocol ip pref 1 handle 101 \
+ flower src_mac $smac action drop
+
+ $MZ $h1 -c 0 -p 100 -a $smac -b bcast -t ip -d 1msec -q &
+ mz_pid=$!
+
+ RET=0
+
+ l2_drops_test $trap_name $group_name
+
+ log_test "Source MAC is multicast"
+
+ l2_drops_cleanup $mz_pid
+}
+
+__vlan_tag_mismatch_test()
+{
+ local trap_name="vlan_tag_mismatch"
+ local dmac=de:ad:be:ef:13:37
+ local group_name="l2_drops"
+ local opt=$1; shift
+ local mz_pid
+
+ # Remove PVID flag. This should prevent untagged and prio-tagged
+ # packets from entering the bridge.
+ bridge vlan add vid 1 dev $swp1 untagged master
+
+ tc filter add dev $swp2 egress protocol ip pref 1 handle 101 \
+ flower dst_mac $dmac action drop
+
+ $MZ $h1 "$opt" -c 0 -p 100 -a own -b $dmac -t ip -d 1msec -q &
+ mz_pid=$!
+
+ l2_drops_test $trap_name $group_name
+
+ # Add PVID and make sure packets are no longer dropped.
+ bridge vlan add vid 1 dev $swp1 pvid untagged master
+ devlink_trap_action_set $trap_name "trap"
+
+ devlink_trap_stats_idle_test $trap_name
+ check_err $? "Trap stats not idle when packets should not be dropped"
+ devlink_trap_group_stats_idle_test $group_name
+ check_err $? "Trap group stats not idle with when packets should not be dropped"
+
+ tc_check_packets "dev $swp2 egress" 101 0
+ check_fail $? "Packets not forwarded when should"
+
+ devlink_trap_action_set $trap_name "drop"
+
+ l2_drops_cleanup $mz_pid
+}
+
+vlan_tag_mismatch_untagged_test()
+{
+ RET=0
+
+ __vlan_tag_mismatch_test
+
+ log_test "VLAN tag mismatch - untagged packets"
+}
+
+vlan_tag_mismatch_vid_0_test()
+{
+ RET=0
+
+ __vlan_tag_mismatch_test "-Q 0"
+
+ log_test "VLAN tag mismatch - prio-tagged packets"
+}
+
+vlan_tag_mismatch_test()
+{
+ vlan_tag_mismatch_untagged_test
+ vlan_tag_mismatch_vid_0_test
+}
+
+ingress_vlan_filter_test()
+{
+ local trap_name="ingress_vlan_filter"
+ local dmac=de:ad:be:ef:13:37
+ local group_name="l2_drops"
+ local mz_pid
+ local vid=10
+
+ bridge vlan add vid $vid dev $swp2 master
+ # During initialization the firmware enables all the VLAN filters and
+ # the driver does not turn them off since the traffic will be discarded
+ # by the STP filter whose default is DISCARD state. Add the VID on the
+ # ingress bridge port and then remove it to make sure it is not member
+ # in the VLAN.
+ bridge vlan add vid $vid dev $swp1 master
+ bridge vlan del vid $vid dev $swp1 master
+
+ RET=0
+
+ tc filter add dev $swp2 egress protocol ip pref 1 handle 101 \
+ flower dst_mac $dmac action drop
+
+ $MZ $h1 -Q $vid -c 0 -p 100 -a own -b $dmac -t ip -d 1msec -q &
+ mz_pid=$!
+
+ l2_drops_test $trap_name $group_name
+
+ # Add the VLAN on the bridge port and make sure packets are no longer
+ # dropped.
+ bridge vlan add vid $vid dev $swp1 master
+ devlink_trap_action_set $trap_name "trap"
+
+ devlink_trap_stats_idle_test $trap_name
+ check_err $? "Trap stats not idle when packets should not be dropped"
+ devlink_trap_group_stats_idle_test $group_name
+ check_err $? "Trap group stats not idle with when packets should not be dropped"
+
+ tc_check_packets "dev $swp2 egress" 101 0
+ check_fail $? "Packets not forwarded when should"
+
+ devlink_trap_action_set $trap_name "drop"
+
+ log_test "Ingress VLAN filter"
+
+ l2_drops_cleanup $mz_pid
+
+ bridge vlan del vid $vid dev $swp1 master
+ bridge vlan del vid $vid dev $swp2 master
+}
+
+__ingress_stp_filter_test()
+{
+ local trap_name="ingress_spanning_tree_filter"
+ local dmac=de:ad:be:ef:13:37
+ local group_name="l2_drops"
+ local state=$1; shift
+ local mz_pid
+ local vid=20
+
+ bridge vlan add vid $vid dev $swp2 master
+ bridge vlan add vid $vid dev $swp1 master
+ ip link set dev $swp1 type bridge_slave state $state
+
+ tc filter add dev $swp2 egress protocol ip pref 1 handle 101 \
+ flower dst_mac $dmac action drop
+
+ $MZ $h1 -Q $vid -c 0 -p 100 -a own -b $dmac -t ip -d 1msec -q &
+ mz_pid=$!
+
+ l2_drops_test $trap_name $group_name
+
+ # Change STP state to forwarding and make sure packets are no longer
+ # dropped.
+ ip link set dev $swp1 type bridge_slave state 3
+ devlink_trap_action_set $trap_name "trap"
+
+ devlink_trap_stats_idle_test $trap_name
+ check_err $? "Trap stats not idle when packets should not be dropped"
+ devlink_trap_group_stats_idle_test $group_name
+ check_err $? "Trap group stats not idle with when packets should not be dropped"
+
+ tc_check_packets "dev $swp2 egress" 101 0
+ check_fail $? "Packets not forwarded when should"
+
+ devlink_trap_action_set $trap_name "drop"
+
+ l2_drops_cleanup $mz_pid
+
+ bridge vlan del vid $vid dev $swp1 master
+ bridge vlan del vid $vid dev $swp2 master
+}
+
+ingress_stp_filter_listening_test()
+{
+ local state=$1; shift
+
+ RET=0
+
+ __ingress_stp_filter_test $state
+
+ log_test "Ingress STP filter - listening state"
+}
+
+ingress_stp_filter_learning_test()
+{
+ local state=$1; shift
+
+ RET=0
+
+ __ingress_stp_filter_test $state
+
+ log_test "Ingress STP filter - learning state"
+}
+
+ingress_stp_filter_test()
+{
+ ingress_stp_filter_listening_test 1
+ ingress_stp_filter_learning_test 2
+}
+
+port_list_is_empty_uc_test()
+{
+ local trap_name="port_list_is_empty"
+ local dmac=de:ad:be:ef:13:37
+ local group_name="l2_drops"
+ local mz_pid
+
+ # Disable unicast flooding on both ports, so that packets cannot egress
+ # any port.
+ ip link set dev $swp1 type bridge_slave flood off
+ ip link set dev $swp2 type bridge_slave flood off
+
+ RET=0
+
+ tc filter add dev $swp2 egress protocol ip pref 1 handle 101 \
+ flower dst_mac $dmac action drop
+
+ $MZ $h1 -c 0 -p 100 -a own -b $dmac -t ip -d 1msec -q &
+ mz_pid=$!
+
+ l2_drops_test $trap_name $group_name
+
+ # Allow packets to be flooded to one port.
+ ip link set dev $swp2 type bridge_slave flood on
+ devlink_trap_action_set $trap_name "trap"
+
+ devlink_trap_stats_idle_test $trap_name
+ check_err $? "Trap stats not idle when packets should not be dropped"
+ devlink_trap_group_stats_idle_test $group_name
+ check_err $? "Trap group stats not idle with when packets should not be dropped"
+
+ tc_check_packets "dev $swp2 egress" 101 0
+ check_fail $? "Packets not forwarded when should"
+
+ devlink_trap_action_set $trap_name "drop"
+
+ log_test "Port list is empty - unicast"
+
+ l2_drops_cleanup $mz_pid
+
+ ip link set dev $swp1 type bridge_slave flood on
+}
+
+port_list_is_empty_mc_test()
+{
+ local trap_name="port_list_is_empty"
+ local dmac=01:00:5e:00:00:01
+ local group_name="l2_drops"
+ local dip=239.0.0.1
+ local mz_pid
+
+ # Disable multicast flooding on both ports, so that packets cannot
+ # egress any port. We also need to flush IP addresses from the bridge
+ # in order to prevent packets from being flooded to the router port.
+ ip link set dev $swp1 type bridge_slave mcast_flood off
+ ip link set dev $swp2 type bridge_slave mcast_flood off
+ ip address flush dev br0
+
+ RET=0
+
+ tc filter add dev $swp2 egress protocol ip pref 1 handle 101 \
+ flower dst_mac $dmac action drop
+
+ $MZ $h1 -c 0 -p 100 -a own -b $dmac -t ip -B $dip -d 1msec -q &
+ mz_pid=$!
+
+ l2_drops_test $trap_name $group_name
+
+ # Allow packets to be flooded to one port.
+ ip link set dev $swp2 type bridge_slave mcast_flood on
+ devlink_trap_action_set $trap_name "trap"
+
+ devlink_trap_stats_idle_test $trap_name
+ check_err $? "Trap stats not idle when packets should not be dropped"
+ devlink_trap_group_stats_idle_test $group_name
+ check_err $? "Trap group stats not idle with when packets should not be dropped"
+
+ tc_check_packets "dev $swp2 egress" 101 0
+ check_fail $? "Packets not forwarded when should"
+
+ devlink_trap_action_set $trap_name "drop"
+
+ log_test "Port list is empty - multicast"
+
+ l2_drops_cleanup $mz_pid
+
+ ip link set dev $swp1 type bridge_slave mcast_flood on
+}
+
+port_list_is_empty_test()
+{
+ port_list_is_empty_uc_test
+ port_list_is_empty_mc_test
+}
+
+port_loopback_filter_uc_test()
+{
+ local trap_name="port_loopback_filter"
+ local dmac=de:ad:be:ef:13:37
+ local group_name="l2_drops"
+ local mz_pid
+
+ # Make sure packets can only egress the input port.
+ ip link set dev $swp2 type bridge_slave flood off
+
+ RET=0
+
+ tc filter add dev $swp2 egress protocol ip pref 1 handle 101 \
+ flower dst_mac $dmac action drop
+
+ $MZ $h1 -c 0 -p 100 -a own -b $dmac -t ip -d 1msec -q &
+ mz_pid=$!
+
+ l2_drops_test $trap_name $group_name
+
+ # Allow packets to be flooded.
+ ip link set dev $swp2 type bridge_slave flood on
+ devlink_trap_action_set $trap_name "trap"
+
+ devlink_trap_stats_idle_test $trap_name
+ check_err $? "Trap stats not idle when packets should not be dropped"
+ devlink_trap_group_stats_idle_test $group_name
+ check_err $? "Trap group stats not idle with when packets should not be dropped"
+
+ tc_check_packets "dev $swp2 egress" 101 0
+ check_fail $? "Packets not forwarded when should"
+
+ devlink_trap_action_set $trap_name "drop"
+
+ log_test "Port loopback filter - unicast"
+
+ l2_drops_cleanup $mz_pid
+}
+
+port_loopback_filter_test()
+{
+ port_loopback_filter_uc_test
+}
+
+trap cleanup EXIT
+
+setup_prepare
+setup_wait
+
+tests_run
+
+exit $EXIT_STATUS
--
2.21.0
^ permalink raw reply related
* [PATCH net-next 7/7] selftests: mlxsw: Add a test case for devlink-trap
From: Ido Schimmel @ 2019-08-21 7:19 UTC (permalink / raw)
To: netdev; +Cc: davem, jiri, dsahern, mlxsw, Ido Schimmel
In-Reply-To: <20190821071937.13622-1-idosch@idosch.org>
From: Ido Schimmel <idosch@mellanox.com>
Test generic devlink-trap functionality over mlxsw. These tests are not
specific to a single trap, but do not check the devlink-trap common
infrastructure either.
Currently, the only test case is device deletion (by reloading the
driver) while packets are being trapped.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
.../drivers/net/mlxsw/devlink_trap.sh | 129 ++++++++++++++++++
1 file changed, 129 insertions(+)
create mode 100755 tools/testing/selftests/drivers/net/mlxsw/devlink_trap.sh
diff --git a/tools/testing/selftests/drivers/net/mlxsw/devlink_trap.sh b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap.sh
new file mode 100755
index 000000000000..89b55e946eed
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/mlxsw/devlink_trap.sh
@@ -0,0 +1,129 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Test generic devlink-trap functionality over mlxsw. These tests are not
+# specific to a single trap, but do not check the devlink-trap common
+# infrastructure either.
+
+lib_dir=$(dirname $0)/../../../net/forwarding
+
+ALL_TESTS="
+ dev_del_test
+"
+NUM_NETIFS=4
+source $lib_dir/tc_common.sh
+source $lib_dir/lib.sh
+source $lib_dir/devlink_lib.sh
+
+h1_create()
+{
+ simple_if_init $h1
+}
+
+h1_destroy()
+{
+ simple_if_fini $h1
+}
+
+h2_create()
+{
+ simple_if_init $h2
+}
+
+h2_destroy()
+{
+ simple_if_fini $h2
+}
+
+switch_create()
+{
+ ip link add dev br0 type bridge vlan_filtering 1 mcast_snooping 0
+
+ ip link set dev $swp1 master br0
+ ip link set dev $swp2 master br0
+
+ ip link set dev br0 up
+ ip link set dev $swp1 up
+ ip link set dev $swp2 up
+}
+
+switch_destroy()
+{
+ ip link set dev $swp2 down
+ ip link set dev $swp1 down
+
+ ip link del dev br0
+}
+
+setup_prepare()
+{
+ h1=${NETIFS[p1]}
+ swp1=${NETIFS[p2]}
+
+ swp2=${NETIFS[p3]}
+ h2=${NETIFS[p4]}
+
+ vrf_prepare
+
+ h1_create
+ h2_create
+
+ switch_create
+}
+
+cleanup()
+{
+ pre_cleanup
+
+ switch_destroy
+
+ h2_destroy
+ h1_destroy
+
+ vrf_cleanup
+}
+
+dev_del_test()
+{
+ local trap_name="source_mac_is_multicast"
+ local smac=01:02:03:04:05:06
+ local num_iter=5
+ local mz_pid
+ local i
+
+ $MZ $h1 -c 0 -p 100 -a $smac -b bcast -t ip -q &
+ mz_pid=$!
+
+ # The purpose of this test is to make sure we correctly dismantle a
+ # port while packets are trapped from it. This is done by reloading the
+ # the driver while the 'ingress_smac_mc_drop' trap is triggered.
+ RET=0
+
+ for i in $(seq 1 $num_iter); do
+ log_info "Iteration $i / $num_iter"
+
+ devlink_trap_action_set $trap_name "trap"
+ sleep 1
+
+ devlink_reload
+ # Allow netdevices to be re-created following the reload
+ sleep 20
+
+ cleanup
+ setup_prepare
+ setup_wait
+ done
+
+ log_test "Device delete"
+
+ kill $mz_pid && wait $mz_pid &> /dev/null
+}
+
+trap cleanup EXIT
+
+setup_prepare
+setup_wait
+
+tests_run
+
+exit $EXIT_STATUS
--
2.21.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox