* Re: [PATCH] libbpf: Don't error out if getsockopt() fails for XDP_OPTIONS
From: Daniel Borkmann @ 2019-09-16 8:08 UTC (permalink / raw)
To: Yonghong Song, Toke Høiland-Jørgensen,
Alexei Starovoitov, netdev@vger.kernel.org, bpf@vger.kernel.org,
maximmi@mellanox.com
In-Reply-To: <60651b4b-c185-1e17-1664-88957537e3f1@fb.com>
On 9/13/19 8:53 PM, Yonghong Song wrote:
> On 9/10/19 12:06 AM, Toke Høiland-Jørgensen wrote:
>> Yonghong Song <yhs@fb.com> writes:
>>> On 9/9/19 10:46 AM, Toke Høiland-Jørgensen wrote:
>>>> The xsk_socket__create() function fails and returns an error if it cannot
>>>> get the XDP_OPTIONS through getsockopt(). However, support for XDP_OPTIONS
>>>> was not added until kernel 5.3, so this means that creating XSK sockets
>>>> always fails on older kernels.
>>>>
>>>> Since the option is just used to set the zero-copy flag in the xsk struct,
>>>> there really is no need to error out if the getsockopt() call fails.
>>>>
>>>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>>> ---
>>>> tools/lib/bpf/xsk.c | 8 ++------
>>>> 1 file changed, 2 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
>>>> index 680e63066cf3..598e487d9ce8 100644
>>>> --- a/tools/lib/bpf/xsk.c
>>>> +++ b/tools/lib/bpf/xsk.c
>>>> @@ -603,12 +603,8 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
>>>>
>>>> optlen = sizeof(opts);
>>>> err = getsockopt(xsk->fd, SOL_XDP, XDP_OPTIONS, &opts, &optlen);
>>>> - if (err) {
>>>> - err = -errno;
>>>> - goto out_mmap_tx;
>>>> - }
>>>> -
>>>> - xsk->zc = opts.flags & XDP_OPTIONS_ZEROCOPY;
>>>> + if (!err)
>>>> + xsk->zc = opts.flags & XDP_OPTIONS_ZEROCOPY;
>>>>
>>>> if (!(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
>>>> err = xsk_setup_xdp_prog(xsk);
>>>
>>> Since 'zc' is not used by anybody, maybe all 'zc'-related code can be
>>> removed? It can be added back once there is an interface to use
>>> 'zc'.
>>
>> Fine with me; up to the maintainers what they prefer, I guess? :)
Given this is not exposed to applications at this point and we don't do anything
useful with it, let's just remove the zc cruft until there is a proper interface
added to libbpf. Toke, please respin with the suggested removal, thanks!
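For reference, the behaviour proposed in the original patch (tolerate a failing XDP_OPTIONS query instead of aborting socket creation) can be sketched as below. This is only an illustration, not libbpf code: every `sketch_*` name is hypothetical, the query hook stands in for the real getsockopt() call, and the flag value is assumed to mirror XDP_OPTIONS_ZEROCOPY from <linux/if_xdp.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors XDP_OPTIONS_ZEROCOPY from <linux/if_xdp.h>. */
#define SKETCH_XDP_OPTIONS_ZEROCOPY (1u << 0)

/* Local stand-in for struct xdp_options. */
struct sketch_xdp_options {
	uint32_t flags;
};

/* Hypothetical hook standing in for
 * getsockopt(fd, SOL_XDP, XDP_OPTIONS, ...): 0 on success, non-zero on error.
 */
typedef int (*sketch_getopt_fn)(struct sketch_xdp_options *opts);

/* Report zero-copy only when the query succeeds. A failing query (as on
 * kernels without XDP_OPTIONS) is not treated as a fatal error; the
 * caller simply sees "zero-copy not known to be enabled".
 */
static bool sketch_query_zc(sketch_getopt_fn query)
{
	struct sketch_xdp_options opts = { 0 };

	assert(query != NULL);
	if (query(&opts) != 0)
		return false;
	return (opts.flags & SKETCH_XDP_OPTIONS_ZEROCOPY) != 0;
}

/* Stub: a kernel that rejects the option. */
static int sketch_old_kernel(struct sketch_xdp_options *opts)
{
	(void)opts;
	return -1;
}

/* Stub: a kernel that supports the option and reports zero-copy. */
static int sketch_new_kernel_zc(struct sketch_xdp_options *opts)
{
	opts->flags = SKETCH_XDP_OPTIONS_ZEROCOPY;
	return 0;
}
```

With the stubs above, sketch_query_zc(sketch_old_kernel) yields false without signalling an error, which is the point of the proposal: socket setup can continue on kernels that lack the option.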
* Re: [PATCH net-next v2 3/3] mlxsw: spectrum_buffers: Add the ability to query the CPU port's shared buffer
From: Jiri Pirko @ 2019-09-16 8:00 UTC (permalink / raw)
To: Ido Schimmel; +Cc: netdev, davem, jiri, shalomt, mlxsw, Ido Schimmel
In-Reply-To: <20190916061750.26207-4-idosch@idosch.org>
Mon, Sep 16, 2019 at 08:17:50AM CEST, idosch@idosch.org wrote:
>From: Shalom Toledo <shalomt@mellanox.com>
>
>While debugging packet loss towards the CPU, it is useful to be able to
>query the CPU port's shared buffer quotas and occupancy.
>
>Since the CPU port has no ingress buffers, all the shared buffers ingress
>information will be cleared.
>
>Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
>Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
* macb: inconsistent Rx descriptor chain after OOM
From: Andreas Schwab @ 2019-09-16 7:41 UTC (permalink / raw)
To: Nicolas Ferre; +Cc: netdev
When there is an OOM situation, the macb driver cannot recover from it:
[245622.872993] macb 10090000.ethernet eth0: Unable to allocate sk_buff
[245622.891438] macb 10090000.ethernet eth0: inconsistent Rx descriptor chain
After that, the interface is dead. Since this system is using NFS root,
it then stalled as a whole.
Andreas.
--
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
* [PATCH v5 0/2] ethtool: implement Energy Detect Powerdown support via phy-tunable
From: Alexandru Ardelean @ 2019-09-16 7:35 UTC (permalink / raw)
To: netdev, devicetree, linux-kernel
Cc: davem, robh+dt, mark.rutland, f.fainelli, hkallweit1, andrew,
mkubecek, Alexandru Ardelean
This changeset proposes a new control for PHY tunable to control Energy
Detect Power Down.
The `phy_tunable_id` has been named `ETHTOOL_PHY_EDPD` since it looks like
this feature is common across other PHYs (like EEE), and defining
`ETHTOOL_PHY_ENERGY_DETECT_POWER_DOWN` seems too long.
EDPD works by putting the RX block into a lower power mode, except for the
link-pulse detection circuits. The TX block is also put into a low power
mode, but the PHY wakes up periodically to send link pulses, to avoid
lock-ups in case the other side is also in EDPD mode.
Currently, there are 2 PHY drivers that look like they could use this new
PHY tunable feature: the `adin` and `micrel` PHYs.
This series updates only the `adin` PHY driver to support this new feature,
as this chip has been tested. A change for `micrel` can be proposed after a
discussion of the PHY-tunable API is resolved.
Alexandru Ardelean (2):
ethtool: implement Energy Detect Powerdown support via phy-tunable
net: phy: adin: implement Energy Detect Powerdown mode via phy-tunable
drivers/net/phy/adin.c | 61 ++++++++++++++++++++++++++++++++++++
include/uapi/linux/ethtool.h | 22 +++++++++++++
net/core/ethtool.c | 6 ++++
3 files changed, 89 insertions(+)
--
Changelog v4 -> v5:
* add Andrew's & Florian's Reviewed-by tags for patch 1
* fixed patch 2 goof:
`rc = adin_set_edpd(phydev, 1);`
->
`rc = adin_set_edpd(phydev, ETHTOOL_PHY_EDPD_DFLT_TX_MSECS);`
this was omitted when re-spin to v4 was done
* for patch 2 added Florian's Reviewed-by tag as this was suggested by him
and his accord was
`with that fixed: Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>`
Changelog v3 -> v4:
* impose the TX interval unit for EDPD to be milliseconds; the point was
raised by Michal; this should allow for intervals:
- as small as 1 millisecond, which does not sound like a power-saver
- as large as 65 seconds, which sounds like a lot to wait for a link to
come up
Changelog v2 -> v3:
* implement Andrew's review comments:
1. for patch `ethtool: implement Energy Detect Powerdown support via
phy-tunable`
- ETHTOOL_PHY_EDPD_DFLT_TX_INTERVAL == 0xffff
- ETHTOOL_PHY_EDPD_NO_TX == 0xfffe
- added comment in include/uapi/linux/ethtool.h
2. for patch `net: phy: adin: implement Energy Detect Powerdown mode via
phy-tunable`
- added comments about interval & units for the ADIN PHY
- in `adin_set_edpd()`: add a switch statement of all the valid values
- in `adin_get_edpd()`: return `ETHTOOL_PHY_EDPD_DFLT_TX_INTERVAL`
since the PHY only supports a single TX-interval value (1 second)
Changelog v1 -> v2:
* initial series was made up of 2 sub-series: 1 for kernel & 1 for ethtool
in userspace; v2 contains only the kernel series
2.20.1
* [PATCH v5 2/2] net: phy: adin: implement Energy Detect Powerdown mode via phy-tunable
From: Alexandru Ardelean @ 2019-09-16 7:35 UTC (permalink / raw)
To: netdev, devicetree, linux-kernel
Cc: davem, robh+dt, mark.rutland, f.fainelli, hkallweit1, andrew,
mkubecek, Alexandru Ardelean
In-Reply-To: <20190916073526.24711-1-alexandru.ardelean@analog.com>
This driver becomes the first user of the kernel's `ETHTOOL_PHY_EDPD`
phy-tunable feature.
EDPD is also enabled by default on PHY config_init, but can be disabled via
the phy-tunable control.
When enabling EDPD, it's also a good idea (for the ADIN PHYs) to enable
periodic TX pulses, so that if the other PHY is also in EDPD mode, there
is no lock-up situation where both sides are waiting for the other to
transmit.
Via the phy-tunable control, TX pulses can be disabled if specifying 0
`tx-interval` via ethtool.
The ADIN PHY supports only fixed 1 second intervals; they cannot be
configured. That is why the acceptable values are 1000 (1 second, expressed
in milliseconds), ETHTOOL_PHY_EDPD_DFLT_TX_MSECS and ETHTOOL_PHY_EDPD_NO_TX
(which disables TX pulses).
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
---
drivers/net/phy/adin.c | 61 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 61 insertions(+)
diff --git a/drivers/net/phy/adin.c b/drivers/net/phy/adin.c
index 4dec83df048d..cf5a391c93e6 100644
--- a/drivers/net/phy/adin.c
+++ b/drivers/net/phy/adin.c
@@ -26,6 +26,11 @@
#define ADIN1300_RX_ERR_CNT 0x0014
+#define ADIN1300_PHY_CTRL_STATUS2 0x0015
+#define ADIN1300_NRG_PD_EN BIT(3)
+#define ADIN1300_NRG_PD_TX_EN BIT(2)
+#define ADIN1300_NRG_PD_STATUS BIT(1)
+
#define ADIN1300_PHY_CTRL2 0x0016
#define ADIN1300_DOWNSPEED_AN_100_EN BIT(11)
#define ADIN1300_DOWNSPEED_AN_10_EN BIT(10)
@@ -328,12 +333,62 @@ static int adin_set_downshift(struct phy_device *phydev, u8 cnt)
ADIN1300_DOWNSPEEDS_EN);
}
+static int adin_get_edpd(struct phy_device *phydev, u16 *tx_interval)
+{
+ int val;
+
+ val = phy_read(phydev, ADIN1300_PHY_CTRL_STATUS2);
+ if (val < 0)
+ return val;
+
+ if (ADIN1300_NRG_PD_EN & val) {
+ if (val & ADIN1300_NRG_PD_TX_EN)
+ /* default is 1 second */
+ *tx_interval = ETHTOOL_PHY_EDPD_DFLT_TX_MSECS;
+ else
+ *tx_interval = ETHTOOL_PHY_EDPD_NO_TX;
+ } else {
+ *tx_interval = ETHTOOL_PHY_EDPD_DISABLE;
+ }
+
+ return 0;
+}
+
+static int adin_set_edpd(struct phy_device *phydev, u16 tx_interval)
+{
+ u16 val;
+
+ if (tx_interval == ETHTOOL_PHY_EDPD_DISABLE)
+ return phy_clear_bits(phydev, ADIN1300_PHY_CTRL_STATUS2,
+ (ADIN1300_NRG_PD_EN | ADIN1300_NRG_PD_TX_EN));
+
+ val = ADIN1300_NRG_PD_EN;
+
+ switch (tx_interval) {
+ case 1000: /* 1 second */
+ /* fallthrough */
+ case ETHTOOL_PHY_EDPD_DFLT_TX_MSECS:
+ val |= ADIN1300_NRG_PD_TX_EN;
+ /* fallthrough */
+ case ETHTOOL_PHY_EDPD_NO_TX:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return phy_modify(phydev, ADIN1300_PHY_CTRL_STATUS2,
+ (ADIN1300_NRG_PD_EN | ADIN1300_NRG_PD_TX_EN),
+ val);
+}
+
static int adin_get_tunable(struct phy_device *phydev,
struct ethtool_tunable *tuna, void *data)
{
switch (tuna->id) {
case ETHTOOL_PHY_DOWNSHIFT:
return adin_get_downshift(phydev, data);
+ case ETHTOOL_PHY_EDPD:
+ return adin_get_edpd(phydev, data);
default:
return -EOPNOTSUPP;
}
@@ -345,6 +400,8 @@ static int adin_set_tunable(struct phy_device *phydev,
switch (tuna->id) {
case ETHTOOL_PHY_DOWNSHIFT:
return adin_set_downshift(phydev, *(const u8 *)data);
+ case ETHTOOL_PHY_EDPD:
+ return adin_set_edpd(phydev, *(const u16 *)data);
default:
return -EOPNOTSUPP;
}
@@ -368,6 +425,10 @@ static int adin_config_init(struct phy_device *phydev)
if (rc < 0)
return rc;
+ rc = adin_set_edpd(phydev, ETHTOOL_PHY_EDPD_DFLT_TX_MSECS);
+ if (rc < 0)
+ return rc;
+
phydev_dbg(phydev, "PHY is using mode '%s'\n",
phy_modes(phydev->interface));
--
2.20.1
* [PATCH v5 1/2] ethtool: implement Energy Detect Powerdown support via phy-tunable
From: Alexandru Ardelean @ 2019-09-16 7:35 UTC (permalink / raw)
To: netdev, devicetree, linux-kernel
Cc: davem, robh+dt, mark.rutland, f.fainelli, hkallweit1, andrew,
mkubecek, Alexandru Ardelean
In-Reply-To: <20190916073526.24711-1-alexandru.ardelean@analog.com>
The `phy_tunable_id` has been named `ETHTOOL_PHY_EDPD` since it looks like
this feature is common across other PHYs (like EEE), and defining
`ETHTOOL_PHY_ENERGY_DETECT_POWER_DOWN` seems too long.
EDPD works by putting the RX block into a lower power mode, except for the
link-pulse detection circuits. The TX block is also put into a low power
mode, but the PHY wakes up periodically to send link pulses, to avoid
lock-ups in case the other side is also in EDPD mode.
Currently, there are 2 PHY drivers that look like they could use this new
PHY tunable feature: the `adin` and `micrel` PHYs.
The ADIN datasheet mentions that TX pulses are sent at a default interval of
1 second, and that they can be disabled. For the Micrel KSZ9031 PHY, the
datasheet does not mention whether they can be disabled, but mentions that
the interval can be modified.
The way this change is structured, is similar to the PHY tunable downshift
control:
* a `ETHTOOL_PHY_EDPD_DFLT_TX_MSECS` value is exposed to cover a default
TX interval; some PHYs could specify a certain value that makes sense
* `ETHTOOL_PHY_EDPD_NO_TX` would disable TX when EDPD is enabled
* `ETHTOOL_PHY_EDPD_DISABLE` will disable EDPD
As noted by the `ETHTOOL_PHY_EDPD_DFLT_TX_MSECS` the interval unit is 1
millisecond, which should cover a reasonable range of intervals:
- from 1 millisecond, which does not sound like much of a power-saver
- to ~65 seconds which is quite a lot to wait for a link to come up when
plugging a cable
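To illustrate the encoding described above, here is a small userspace sketch of how a raw u16 tunable value could be interpreted. The three ETHTOOL_PHY_EDPD_* values are the ones proposed in this patch; the enum and the `sketch_edpd_decode()` helper are hypothetical names used only for this example and are not part of the kernel or ethtool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as proposed in this patch for include/uapi/linux/ethtool.h. */
#define ETHTOOL_PHY_EDPD_DFLT_TX_MSECS	0xffff
#define ETHTOOL_PHY_EDPD_NO_TX		0xfffe
#define ETHTOOL_PHY_EDPD_DISABLE	0

/* Hypothetical classification of a raw tunable value. */
enum sketch_edpd_kind {
	SKETCH_EDPD_OFF,		/* EDPD disabled */
	SKETCH_EDPD_ON_NO_TX,		/* EDPD on, no TX link pulses */
	SKETCH_EDPD_ON_DFLT_TX,		/* EDPD on, PHY default TX interval */
	SKETCH_EDPD_ON_TX_MSECS,	/* EDPD on, explicit interval in ms */
};

/* Decode a raw u16 value. *msecs is only meaningful (non-zero) for
 * SKETCH_EDPD_ON_TX_MSECS; it is set to 0 in all other cases.
 */
static enum sketch_edpd_kind sketch_edpd_decode(uint16_t raw, uint16_t *msecs)
{
	assert(msecs != NULL);
	*msecs = 0;

	switch (raw) {
	case ETHTOOL_PHY_EDPD_DISABLE:
		return SKETCH_EDPD_OFF;
	case ETHTOOL_PHY_EDPD_NO_TX:
		return SKETCH_EDPD_ON_NO_TX;
	case ETHTOOL_PHY_EDPD_DFLT_TX_MSECS:
		return SKETCH_EDPD_ON_DFLT_TX;
	default:
		*msecs = raw;	/* 1 .. 0xfffd milliseconds */
		return SKETCH_EDPD_ON_TX_MSECS;
	}
}
```

Because 0, 0xfffe and 0xffff are reserved, the largest explicit interval that fits is 0xfffd, i.e. 65533 milliseconds, which matches the "~65 seconds" upper bound mentioned above.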
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
---
include/uapi/linux/ethtool.h | 22 ++++++++++++++++++++++
net/core/ethtool.c | 6 ++++++
2 files changed, 28 insertions(+)
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index dd06302aa93e..8938b76c4ee3 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -259,10 +259,32 @@ struct ethtool_tunable {
#define ETHTOOL_PHY_FAST_LINK_DOWN_ON 0
#define ETHTOOL_PHY_FAST_LINK_DOWN_OFF 0xff
+/* Energy Detect Power Down (EDPD) is a feature supported by some PHYs, where
+ * the PHY's RX & TX blocks are put into a low-power mode when there is no
+ * link detected (typically cable is un-plugged). For RX, only a minimal
+ * link-detection is available, and for TX the PHY wakes up to send link pulses
+ * to avoid any lock-ups in case the peer PHY may also be running in EDPD mode.
+ *
+ * Some PHYs may support configuration of the wake-up interval for TX pulses,
+ * and some PHYs may support only disabling TX pulses entirely. For the latter
+ * a special value is required (ETHTOOL_PHY_EDPD_NO_TX) so that this can be
+ * configured from userspace (should the user want it).
+ *
+ * The interval units for TX wake-up are in milliseconds, since this should
+ * cover a reasonable range of intervals:
+ * - from 1 millisecond, which does not sound like much of a power-saver
+ * - to ~65 seconds which is quite a lot to wait for a link to come up when
+ * plugging a cable
+ */
+#define ETHTOOL_PHY_EDPD_DFLT_TX_MSECS 0xffff
+#define ETHTOOL_PHY_EDPD_NO_TX 0xfffe
+#define ETHTOOL_PHY_EDPD_DISABLE 0
+
enum phy_tunable_id {
ETHTOOL_PHY_ID_UNSPEC,
ETHTOOL_PHY_DOWNSHIFT,
ETHTOOL_PHY_FAST_LINK_DOWN,
+ ETHTOOL_PHY_EDPD,
/*
* Add your fresh new phy tunable attribute above and remember to update
* phy_tunable_strings[] in net/core/ethtool.c
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 6288e69e94fc..c763106c73fc 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -133,6 +133,7 @@ phy_tunable_strings[__ETHTOOL_PHY_TUNABLE_COUNT][ETH_GSTRING_LEN] = {
[ETHTOOL_ID_UNSPEC] = "Unspec",
[ETHTOOL_PHY_DOWNSHIFT] = "phy-downshift",
[ETHTOOL_PHY_FAST_LINK_DOWN] = "phy-fast-link-down",
+ [ETHTOOL_PHY_EDPD] = "phy-energy-detect-power-down",
};
static int ethtool_get_features(struct net_device *dev, void __user *useraddr)
@@ -2451,6 +2452,11 @@ static int ethtool_phy_tunable_valid(const struct ethtool_tunable *tuna)
tuna->type_id != ETHTOOL_TUNABLE_U8)
return -EINVAL;
break;
+ case ETHTOOL_PHY_EDPD:
+ if (tuna->len != sizeof(u16) ||
+ tuna->type_id != ETHTOOL_TUNABLE_U16)
+ return -EINVAL;
+ break;
default:
return -EINVAL;
}
--
2.20.1
* Re: [PATCH v3] net: stmmac: socfpga: re-use the `interface` parameter from platform data
From: David Miller @ 2019-09-16 7:22 UTC (permalink / raw)
To: alexandru.ardelean
Cc: netdev, linux-stm32, linux-arm-kernel, linux-kernel,
peppe.cavallaro, alexandre.torgue, joabreu, mcoquelin.stm32
In-Reply-To: <20190916070400.18721-1-alexandru.ardelean@analog.com>
From: Alexandru Ardelean <alexandru.ardelean@analog.com>
Date: Mon, 16 Sep 2019 10:04:00 +0300
> The socfpga sub-driver defines an `interface` field in the `socfpga_dwmac`
> struct and parses it on init.
>
> The shared `stmmac_probe_config_dt()` function also parses this from the
> device-tree and makes it available on the returned `plat_data` (which is
> the same data available via `netdev_priv()`).
>
> All that's needed now is to dig that information out, via some
> `dev_get_drvdata()` && `netdev_priv()` calls and re-use it.
>
> Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Applied.
* Re: [PATCH net-next 0/3] More fixes for unlocked cls hardware offload API refactoring
From: David Miller @ 2019-09-16 7:21 UTC (permalink / raw)
To: vladbu; +Cc: netdev, jhs, xiyou.wangcong, jiri
In-Reply-To: <20190913152841.15755-1-vladbu@mellanox.com>
From: Vlad Buslov <vladbu@mellanox.com>
Date: Fri, 13 Sep 2019 18:28:38 +0300
> Two fixes for my "Refactor cls hardware offload API to support
> rtnl-independent drivers" series and refactoring patch that implements
> infrastructure necessary for the fixes.
Series applied, thanks.
* Re: [PATCH net-next 0/6] net: add support for ip_tun_info options setting
From: David Miller @ 2019-09-16 7:20 UTC (permalink / raw)
To: lucien.xin; +Cc: netdev, jbenc, tgraf, u9012063
In-Reply-To: <cover.1568617721.git.lucien.xin@gmail.com>
From: Xin Long <lucien.xin@gmail.com>
Date: Mon, 16 Sep 2019 15:10:14 +0800
> With this patchset, users can configure options with LWTUNNEL_IP(6)_OPTS
> via ip route encap for an erspan or vxlan lwtunnel. Note that the kernel
> does not parse the option details; it only does some checks and a memcpy,
> and the options are parsed by iproute in userspace.
Sorry, this will have to wait until net-next opens back up.
Thank you.
* Re: [PATCH net-next 0/6] net: add support for ip_tun_info options setting
From: Xin Long @ 2019-09-16 7:17 UTC (permalink / raw)
To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, William Tu
In-Reply-To: <cover.1568617721.git.lucien.xin@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 2040 bytes --]
On Mon, Sep 16, 2019 at 3:10 PM Xin Long <lucien.xin@gmail.com> wrote:
>
> With this patchset, users can configure options with LWTUNNEL_IP(6)_OPTS
> via ip route encap for an erspan or vxlan lwtunnel. Note that the kernel
> does not parse the option details; it only does some checks and a memcpy,
> and the options are parsed by iproute in userspace.
>
> We also improve the vxlan and erspan options processing in this patchset.
>
> As an example I also wrote a patch for iproute2 that I will reply on this
> mail, with it we can add options for erspan lwtunnel like:
>
> # ip net a a; ip net a b
> # ip -n a l a eth0 type veth peer name eth0 netns b
> # ip -n a l s eth0 up; ip -n b link set eth0 up
> # ip -n a a a 10.1.0.1/24 dev eth0; ip -n b a a 10.1.0.2/24 dev eth0
> # ip -n b l a erspan1 type erspan key 1 seq erspan 123 \
> local 10.1.0.2 remote 10.1.0.1
> # ip -n b a a 1.1.1.1/24 dev erspan1; ip -n b l s erspan1 up
> # ip -n b r a 2.1.1.0/24 dev erspan1
> # ip -n a l a erspan1 type erspan key 1 seq local 10.1.0.1 external
> # ip -n a a a 2.1.1.1/24 dev erspan1; ip -n a l s erspan1 up
> # ip -n a r a 1.1.1.0/24 encap ip id 1 erspan ver 1 idx 123 \
> dst 10.1.0.2 dev erspan1
> # ip -n a r s; ip net exec a ping 1.1.1.1 -c 1
The iproute2 patch for testing is attached.
>
> Xin Long (6):
> lwtunnel: add options process for arp request
> lwtunnel: add LWTUNNEL_IP_OPTS support for lwtunnel_ip
> lwtunnel: add LWTUNNEL_IP6_OPTS support for lwtunnel_ip6
> vxlan: check tun_info options_len properly
> erspan: fix the tun_info options_len check
> erspan: make md work without TUNNEL_ERSPAN_OPT set
>
> drivers/net/vxlan.c | 6 +++--
> include/uapi/linux/lwtunnel.h | 2 ++
> net/ipv4/ip_gre.c | 31 ++++++++++-------------
> net/ipv4/ip_tunnel_core.c | 59 +++++++++++++++++++++++++++++++++----------
> net/ipv6/ip6_gre.c | 35 +++++++++++++------------
> 5 files changed, 84 insertions(+), 49 deletions(-)
>
> --
> 2.1.0
>
[-- Attachment #2: 0001-iproute_lwtunnel-add-support-options-for-erspan-meta.patch --]
[-- Type: application/octet-stream, Size: 7224 bytes --]
From c852e8b4c6838b5beed3e477bb6bfb9dbd9b2900 Mon Sep 17 00:00:00 2001
From: Xin Long <lucien.xin@gmail.com>
Date: Mon, 16 Sep 2019 03:14:07 -0400
Subject: [PATCH] iproute_lwtunnel: add support options for erspan metadata
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
include/uapi/linux/lwtunnel.h | 2 +
ip/iproute_lwtunnel.c | 131 ++++++++++++++++++++++++++++++++--
2 files changed, 129 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
index 3f3fe6f3..c25ff92d 100644
--- a/include/uapi/linux/lwtunnel.h
+++ b/include/uapi/linux/lwtunnel.h
@@ -27,6 +27,7 @@ enum lwtunnel_ip_t {
LWTUNNEL_IP_TOS,
LWTUNNEL_IP_FLAGS,
LWTUNNEL_IP_PAD,
+ LWTUNNEL_IP_OPTS,
__LWTUNNEL_IP_MAX,
};
@@ -41,6 +42,7 @@ enum lwtunnel_ip6_t {
LWTUNNEL_IP6_TC,
LWTUNNEL_IP6_FLAGS,
LWTUNNEL_IP6_PAD,
+ LWTUNNEL_IP6_OPTS,
__LWTUNNEL_IP6_MAX,
};
diff --git a/ip/iproute_lwtunnel.c b/ip/iproute_lwtunnel.c
index 03217b8f..ff7eb450 100644
--- a/ip/iproute_lwtunnel.c
+++ b/ip/iproute_lwtunnel.c
@@ -32,6 +32,7 @@
#include <linux/seg6_hmac.h>
#include <linux/seg6_local.h>
#include <linux/if_tunnel.h>
+#include <linux/erspan.h>
static const char *format_encap_type(int type)
{
@@ -294,7 +295,7 @@ static void print_encap_mpls(FILE *fp, struct rtattr *encap)
static void print_encap_ip(FILE *fp, struct rtattr *encap)
{
struct rtattr *tb[LWTUNNEL_IP_MAX+1];
- __u16 flags;
+ __u16 flags = 0;
parse_rtattr_nested(tb, LWTUNNEL_IP_MAX, encap);
@@ -329,6 +330,25 @@ static void print_encap_ip(FILE *fp, struct rtattr *encap)
if (flags & TUNNEL_SEQ)
print_bool(PRINT_ANY, "seq", "seq ", true);
}
+
+ if (tb[LWTUNNEL_IP_OPTS]) {
+ if (flags & TUNNEL_ERSPAN_OPT) {
+ struct erspan_metadata *em = RTA_DATA(tb[LWTUNNEL_IP_OPTS]);
+
+ if (em->version == 1) {
+ print_bool(PRINT_ANY, "erspan", "erspan ", true);
+ print_uint(PRINT_ANY, "ver", "ver %u ", 1);
+ print_uint(PRINT_ANY, "idx", "idx %u ",
+ ntohl(em->u.index));
+ } else if (em->version == 2) {
+ print_bool(PRINT_ANY, "erspan", "erspan ", true);
+ print_color_string(PRINT_ANY, COLOR_INET,
+ "dir", "dir %s ",
+ em->u.md2.dir ? "exgress" : "ingress");
+ print_uint(PRINT_ANY, "hwid", "hwid %u ", em->u.md2.hwid);
+ }
+ }
+ }
}
static void print_encap_ila(FILE *fp, struct rtattr *encap)
@@ -365,7 +385,7 @@ static void print_encap_ila(FILE *fp, struct rtattr *encap)
static void print_encap_ip6(FILE *fp, struct rtattr *encap)
{
struct rtattr *tb[LWTUNNEL_IP6_MAX+1];
- __u16 flags;
+ __u16 flags = 0;
parse_rtattr_nested(tb, LWTUNNEL_IP6_MAX, encap);
@@ -401,6 +421,25 @@ static void print_encap_ip6(FILE *fp, struct rtattr *encap)
if (flags & TUNNEL_SEQ)
print_bool(PRINT_ANY, "seq", "seq ", true);
}
+
+ if (tb[LWTUNNEL_IP6_OPTS]) {
+ if (flags & TUNNEL_ERSPAN_OPT) {
+ struct erspan_metadata *em = RTA_DATA(tb[LWTUNNEL_IP6_OPTS]);
+
+ if (em->version == 1) {
+ print_bool(PRINT_ANY, "erspan", "erspan ", true);
+ print_uint(PRINT_ANY, "ver", "ver %u ", 1);
+ print_uint(PRINT_ANY, "idx", "idx %u ",
+ ntohl(em->u.index));
+ } else if (em->version == 2) {
+ print_bool(PRINT_ANY, "erspan", "erspan ", true);
+ print_color_string(PRINT_ANY, COLOR_INET,
+ "dir", "dir %s ",
+ em->u.md2.dir ? "exgress" : "ingress");
+ print_uint(PRINT_ANY, "hwid", "hwid %u ", em->u.md2.hwid);
+ }
+ }
+ }
}
static void print_encap_bpf(FILE *fp, struct rtattr *encap)
@@ -799,7 +838,7 @@ static int parse_encap_ip(struct rtattr *rta, size_t len,
int *argcp, char ***argvp)
{
int id_ok = 0, dst_ok = 0, src_ok = 0, tos_ok = 0, ttl_ok = 0;
- int key_ok = 0, csum_ok = 0, seq_ok = 0;
+ int key_ok = 0, csum_ok = 0, seq_ok = 0, erspan_ok = 0;
char **argv = *argvp;
int argc = *argcp;
int ret = 0;
@@ -851,6 +890,48 @@ static int parse_encap_ip(struct rtattr *rta, size_t len,
if (get_u8(&ttl, *argv, 0))
invarg("\"ttl\" value is invalid\n", *argv);
ret = rta_addattr8(rta, len, LWTUNNEL_IP_TTL, ttl);
+ } else if (strcmp(*argv, "erspan") == 0) {
+ struct erspan_metadata em;
+
+ memset(&em, 0, sizeof(em));
+ if (erspan_ok++)
+ duparg2("erspan", *argv);
+ flags |= TUNNEL_ERSPAN_OPT;
+ NEXT_ARG();
+ if (strcmp(*argv, "ver"))
+ duparg2("ver", *argv);
+ NEXT_ARG();
+ if (get_s32(&em.version, *argv, 0) ||
+ (em.version != 1 && em.version != 2))
+ invarg("\"ver\" value is invalid\n", *argv);
+ NEXT_ARG();
+ if (em.version == 1) {
+ if (strcmp(*argv, "idx"))
+ duparg2("idx", *argv);
+ NEXT_ARG();
+ if (get_be32(&em.u.index, *argv, 0))
+ invarg("\"idx\" value is invalid\n", *argv);
+ } else {
+ __u8 hwid;
+
+ if (strcmp(*argv, "dir"))
+ duparg2("dir", *argv);
+ NEXT_ARG();
+ if (strcmp(*argv, "ingress") == 0)
+ em.u.md2.dir = 0;
+ else if (strcmp(*argv, "exgress") == 0)
+ em.u.md2.dir = 1;
+ else
+ invarg("\"dir\" value is invalid\n", *argv);
+ NEXT_ARG();
+ if (strcmp(*argv, "hwid"))
+ duparg2("hwid", *argv);
+ NEXT_ARG();
+ if (get_u8(&hwid, *argv, 0))
+ invarg("\"hwid\" value is invalid\n", *argv);
+ em.u.md2.hwid = hwid;
+ }
+ rta_addattr_l(rta, len, LWTUNNEL_IP_OPTS, &em, sizeof(em));
} else if (strcmp(*argv, "key") == 0) {
if (key_ok++)
duparg2("key", *argv);
@@ -966,7 +1047,7 @@ static int parse_encap_ip6(struct rtattr *rta, size_t len,
int *argcp, char ***argvp)
{
int id_ok = 0, dst_ok = 0, src_ok = 0, tos_ok = 0, ttl_ok = 0;
- int key_ok = 0, csum_ok = 0, seq_ok = 0;
+ int key_ok = 0, csum_ok = 0, seq_ok = 0, erspan_ok = 0;
char **argv = *argvp;
int argc = *argcp;
int ret = 0;
@@ -1020,6 +1101,48 @@ static int parse_encap_ip6(struct rtattr *rta, size_t len,
*argv);
ret = rta_addattr8(rta, len, LWTUNNEL_IP6_HOPLIMIT,
hoplimit);
+ } else if (strcmp(*argv, "erspan") == 0) {
+ struct erspan_metadata em;
+
+ memset(&em, 0, sizeof(em));
+ if (erspan_ok++)
+ duparg2("erspan", *argv);
+ flags |= TUNNEL_ERSPAN_OPT;
+ NEXT_ARG();
+ if (strcmp(*argv, "ver"))
+ duparg2("ver", *argv);
+ NEXT_ARG();
+ if (get_s32(&em.version, *argv, 0) ||
+ (em.version != 1 && em.version != 2))
+ invarg("\"ver\" value is invalid\n", *argv);
+ NEXT_ARG();
+ if (em.version == 1) {
+ if (strcmp(*argv, "idx"))
+ duparg2("idx", *argv);
+ NEXT_ARG();
+ if (get_be32(&em.u.index, *argv, 0))
+ invarg("\"idx\" value is invalid\n", *argv);
+ } else {
+ __u8 hwid;
+
+ if (strcmp(*argv, "dir"))
+ duparg2("dir", *argv);
+ NEXT_ARG();
+ if (strcmp(*argv, "ingress") == 0)
+ em.u.md2.dir = 0;
+ else if (strcmp(*argv, "exgress") == 0)
+ em.u.md2.dir = 1;
+ else
+ invarg("\"dir\" value is invalid\n", *argv);
+ NEXT_ARG();
+ if (strcmp(*argv, "hwid"))
+ duparg2("hwid", *argv);
+ NEXT_ARG();
+ if (get_u8(&hwid, *argv, 0))
+ invarg("\"hwid\" value is invalid\n", *argv);
+ em.u.md2.hwid = hwid;
+ }
+ rta_addattr_l(rta, len, LWTUNNEL_IP6_OPTS, &em, sizeof(em));
} else if (strcmp(*argv, "key") == 0) {
if (key_ok++)
duparg2("key", *argv);
--
2.18.1
* Re: [PATCH 1/1] MAINTAINERS: update FORCEDETH MAINTAINERS info
From: David Miller @ 2019-09-16 7:16 UTC (permalink / raw)
To: rain.1986.08.12; +Cc: mchehab+samsung, gregkh, linux-kernel, netdev
In-Reply-To: <20190913134345.28318-1-rain.1986.08.12@gmail.com>
From: rain.1986.08.12@gmail.com
Date: Fri, 13 Sep 2019 21:43:45 +0800
> From: Rain River <rain.1986.08.12@gmail.com>
>
> Many FORCEDETH NICs are used in our hosts. Several bugs are fixed and
> some features are developed for FORCEDETH NICs. And I have been
> reviewing patches for FORCEDETH NIC for several months. Mark me as the
> FORCEDETH NIC maintainer. I will send out the patches and maintain
> FORCEDETH NIC.
>
> Signed-off-by: Rain River <rain.1986.08.12@gmail.com>
Applied.
* Re: [PATCH] net/wan: dscc4: remove broken dscc4 driver
From: David Miller @ 2019-09-16 7:15 UTC (permalink / raw)
To: dan.carpenter; +Cc: romieu, netdev, kernel-janitors
In-Reply-To: <20190913132817.GA13179@mwanda>
From: Dan Carpenter <dan.carpenter@oracle.com>
Date: Fri, 13 Sep 2019 16:28:17 +0300
> Using static analysis, I discovered that the "dpriv->pci_priv->pdev"
> pointer is always NULL. This pointer was supposed to be initialized
> during probe and is essential for the driver to work. It would be easy
> to add a "ppriv->pdev = pdev;" to dscc4_found1() but this driver has
> been broken since before we started using git and no one has complained
> so probably we should just remove it.
>
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Applied to net-next.
* Re: [PATCH net-next] MAINTAINERS: xen-netback: update my email address
From: David Miller @ 2019-09-16 7:13 UTC (permalink / raw)
To: paul.durrant; +Cc: netdev, xen-devel, wei.liu
In-Reply-To: <20190913124727.3277-1-paul.durrant@citrix.com>
From: Paul Durrant <paul.durrant@citrix.com>
Date: Fri, 13 Sep 2019 13:47:27 +0100
> My Citrix email address will expire shortly.
>
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Applied.
* Re: [PATCH net] net: stmmac: Hold rtnl lock in suspend/resume callbacks
From: David Miller @ 2019-09-16 7:12 UTC (permalink / raw)
To: Jose.Abreu
Cc: netdev, Joao.Pinto, peppe.cavallaro, alexandre.torgue,
mcoquelin.stm32, linux-stm32, linux-arm-kernel, linux-kernel,
christophe.roullier
In-Reply-To: <66b6c1395e4bbc836e80083b89b2189ce7382d7b.1568360548.git.joabreu@synopsys.com>
From: Jose Abreu <Jose.Abreu@synopsys.com>
Date: Fri, 13 Sep 2019 11:50:32 +0200
> We need to hold the rtnl lock in suspend and resume callbacks because phylink
> requires it. Otherwise we will get a WARN() in suspend and resume.
>
> Also, move phylink start and stop callbacks to inside device's internal
> lock so that we prevent concurrent HW accesses.
>
> Fixes: 74371272f97f ("net: stmmac: Convert to phylink and remove phylib logic")
> Reported-by: Christophe ROULLIER <christophe.roullier@st.com>
> Tested-by: Christophe ROULLIER <christophe.roullier@st.com>
> Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Applied and queued up for v5.3 -stable.
Thanks.
* [PATCH net-next 6/6] erspan: make md work without TUNNEL_ERSPAN_OPT set
From: Xin Long @ 2019-09-16 7:10 UTC (permalink / raw)
To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, u9012063
In-Reply-To: <cover.1568617721.git.lucien.xin@gmail.com>
Currently, when an skb that carries ip_tun_info but has no TUNNEL_ERSPAN_OPT
set reaches a metadata-mode (md) erspan device, it is dropped.
This patch allows such an skb to go through the erspan device; the
options (version, index, hwid, dir, etc.) are then filled in from the
tunnel's params, which can be configured by users.
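The selection logic can be sketched in plain C as follows. This is a simplified userspace model, not the kernel code: the `sketch_*` types and function are hypothetical, only the version and the v1 index are modelled (dir and hwid for v2 would follow the same pattern), and byte-order conversion is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for per-packet metadata and per-device config. */
struct sketch_md {
	int version;
	uint32_t index;
};

struct sketch_tunnel {
	int erspan_ver;
	uint32_t index;
};

struct sketch_hdr_params {
	int version;
	uint32_t index;
};

/* Pick ERSPAN header parameters for one packet.
 * Returns 0 and fills *out, or -1 when the packet should be dropped.
 */
static int sketch_erspan_params(bool has_erspan_opt,
				const struct sketch_md *md, size_t options_len,
				const struct sketch_tunnel *tunnel,
				struct sketch_hdr_params *out)
{
	assert(tunnel != NULL && out != NULL);

	if (has_erspan_opt) {
		/* Options were announced: they must be large enough. */
		if (md == NULL || options_len < sizeof(*md))
			return -1;
	} else {
		/* No options attached: fall back to the device's settings. */
		md = NULL;
	}

	out->version = md ? md->version : tunnel->erspan_ver;
	out->index = md ? md->index : tunnel->index;

	/* Only ERSPAN versions 1 and 2 can be built. */
	if (out->version != 1 && out->version != 2)
		return -1;
	return 0;
}
```

The design point is that a missing option flag is no longer a reason to drop the packet; only an announced but truncated option block or an unsupported version is.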
This can be verified by:
# ip net a a; ip net a b
# ip -n a l a eth0 type veth peer name eth0 netns b
# ip -n a l s eth0 up; ip -n b link set eth0 up
# ip -n a a a 10.1.0.1/24 dev eth0; ip -n b a a 10.1.0.2/24 dev eth0
# ip -n b l a erspan1 type erspan key 1 seq local 10.1.0.2 remote 10.1.0.1
# ip -n b a a 1.1.1.1/24 dev erspan1; ip -n b l s erspan1 up
# ip -n b r a 2.1.1.0/24 dev erspan1
# ip -n a l a erspan1 type erspan key 1 seq local 10.1.0.1 external
# ip -n a a a 2.1.1.1/24 dev erspan1; ip -n a l s erspan1 up
# ip -n a r a 1.1.1.0/24 encap ip id 1 dst 10.1.0.2 dev erspan1
# ip net exec a ping 1.1.1.1 -c 1
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
net/ipv4/ip_gre.c | 31 +++++++++++++------------------
net/ipv6/ip6_gre.c | 35 +++++++++++++++++++----------------
2 files changed, 32 insertions(+), 34 deletions(-)
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index df7149c..ac4cbb8 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -491,15 +491,12 @@ static void gre_fb_xmit(struct sk_buff *skb, struct net_device *dev,
static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
+ struct erspan_metadata *md = NULL;
struct ip_tunnel_info *tun_info;
const struct ip_tunnel_key *key;
- struct erspan_metadata *md;
+ int version, nhoff, thoff;
bool truncate = false;
__be16 proto;
- int tunnel_hlen;
- int version;
- int nhoff;
- int thoff;
tun_info = skb_tunnel_info(skb);
if (unlikely(!tun_info || !(tun_info->mode & IP_TUNNEL_INFO_TX) ||
@@ -507,15 +504,11 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
goto err_free_skb;
key = &tun_info->key;
- if (!(tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT))
- goto err_free_skb;
- if (sizeof(*md) > tun_info->options_len)
- goto err_free_skb;
- md = ip_tunnel_info_opts(tun_info);
-
- /* ERSPAN has fixed 8 byte GRE header */
- version = md->version;
- tunnel_hlen = 8 + erspan_hdr_len(version);
+ if (key->tun_flags & TUNNEL_ERSPAN_OPT) {
+ if (tun_info->options_len < sizeof(*md))
+ goto err_free_skb;
+ md = ip_tunnel_info_opts(tun_info);
+ }
if (skb_cow_head(skb, dev->needed_headroom))
goto err_free_skb;
@@ -538,15 +531,17 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
(ntohs(ipv6_hdr(skb)->payload_len) > skb->len - thoff))
truncate = true;
+ version = md ? md->version : tunnel->erspan_ver;
if (version == 1) {
erspan_build_header(skb, ntohl(tunnel_id_to_key32(key->tun_id)),
- ntohl(md->u.index), truncate, true);
+ md ? ntohl(md->u.index) : tunnel->index,
+ truncate, true);
proto = htons(ETH_P_ERSPAN);
} else if (version == 2) {
erspan_build_header_v2(skb,
ntohl(tunnel_id_to_key32(key->tun_id)),
- md->u.md2.dir,
- get_hwid(&md->u.md2),
+ md ? md->u.md2.dir : tunnel->dir,
+ md ? get_hwid(&md->u.md2) : tunnel->hwid,
truncate, true);
proto = htons(ETH_P_ERSPAN2);
} else {
@@ -556,7 +551,7 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
gre_build_header(skb, 8, TUNNEL_SEQ,
proto, 0, htonl(tunnel->o_seqno++));
- ip_md_tunnel_xmit(skb, dev, IPPROTO_GRE, tunnel_hlen);
+ ip_md_tunnel_xmit(skb, dev, IPPROTO_GRE, 8 + erspan_hdr_len(version));
return;
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 4aba9e0..a48cec3 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -959,10 +959,11 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
* for native mode, call prepare_ip6gre_xmit_{ipv4,ipv6}.
*/
if (t->parms.collect_md) {
+ struct erspan_metadata *md = NULL;
struct ip_tunnel_info *tun_info;
const struct ip_tunnel_key *key;
- struct erspan_metadata *md;
__be32 tun_id;
+ int version;
tun_info = skb_tunnel_info(skb);
if (unlikely(!tun_info ||
@@ -978,23 +979,25 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
fl6.flowi6_uid = sock_net_uid(dev_net(dev), NULL);
dsfield = key->tos;
- if (!(tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT))
- goto tx_err;
- if (sizeof(*md) > tun_info->options_len)
- goto tx_err;
- md = ip_tunnel_info_opts(tun_info);
+ if (key->tun_flags & TUNNEL_ERSPAN_OPT) {
+ if (tun_info->options_len < sizeof(*md))
+ goto tx_err;
+ md = ip_tunnel_info_opts(tun_info);
+ }
tun_id = tunnel_id_to_key32(key->tun_id);
- if (md->version == 1) {
- erspan_build_header(skb,
- ntohl(tun_id),
- ntohl(md->u.index), truncate,
- false);
- } else if (md->version == 2) {
- erspan_build_header_v2(skb,
- ntohl(tun_id),
- md->u.md2.dir,
- get_hwid(&md->u.md2),
+ version = md ? md->version : t->parms.erspan_ver;
+ if (version == 1) {
+ erspan_build_header(skb, ntohl(tun_id),
+ md ? ntohl(md->u.index)
+ : t->parms.index,
+ truncate, false);
+ } else if (version == 2) {
+ erspan_build_header_v2(skb, ntohl(tun_id),
+ md ? md->u.md2.dir
+ : t->parms.dir,
+ md ? get_hwid(&md->u.md2)
+ : t->parms.hwid,
truncate, false);
} else {
goto tx_err;
--
2.1.0
^ permalink raw reply related
* [PATCH net-next 5/6] erspan: fix the tun_info options_len check
From: Xin Long @ 2019-09-16 7:10 UTC (permalink / raw)
To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, u9012063
In-Reply-To: <cover.1568617721.git.lucien.xin@gmail.com>
The check for !md doesn't really work, because ip_tunnel_info_opts(info)
only does info + 1 and so never returns NULL. Also, to avoid out-of-bounds
access on info, the code should ensure that options_len is not less than the
size of erspan_metadata in both erspan_fb_xmit() and ip6erspan_tunnel_xmit().
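A small user-space model may make the point clearer. It is a sketch under
stated assumptions: tun_info_model, md_model, tun_info_opts and tun_info_md
are hypothetical names, and the structs are heavily reduced compared with the
kernel ones. It shows that an "info + 1" accessor is non-NULL even when no
options are stored, so only a length check can tell whether metadata is
really there.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: options, if any, are stored right after the struct. */
struct tun_info_model {
	unsigned int options_len;
};

struct md_model {
	int version;
	uint32_t index;
};

/* Mirrors the "info + 1" behaviour: never NULL for a non-NULL info. */
static void *tun_info_opts(struct tun_info_model *info)
{
	return info + 1;
}

/*
 * Return the metadata only when options_len says that a full md_model
 * is stored behind info; otherwise report that none is available.
 */
static struct md_model *tun_info_md(struct tun_info_model *info)
{
	if (info->options_len < sizeof(struct md_model))
		return NULL;
	return tun_info_opts(info);
}
```

Checking the length before taking the pointer is the same ordering the patch
introduces.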
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
net/ipv4/ip_gre.c | 4 ++--
net/ipv6/ip6_gre.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index a53a543..df7149c 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -509,9 +509,9 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
key = &tun_info->key;
if (!(tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT))
goto err_free_skb;
- md = ip_tunnel_info_opts(tun_info);
- if (!md)
+ if (sizeof(*md) > tun_info->options_len)
goto err_free_skb;
+ md = ip_tunnel_info_opts(tun_info);
/* ERSPAN has fixed 8 byte GRE header */
version = md->version;
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index dd2d0b96..4aba9e0 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -980,9 +980,9 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
dsfield = key->tos;
if (!(tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT))
goto tx_err;
- md = ip_tunnel_info_opts(tun_info);
- if (!md)
+ if (sizeof(*md) > tun_info->options_len)
goto tx_err;
+ md = ip_tunnel_info_opts(tun_info);
tun_id = tunnel_id_to_key32(key->tun_id);
if (md->version == 1) {
--
2.1.0
^ permalink raw reply related
* [PATCH net-next 4/6] vxlan: check tun_info options_len properly
From: Xin Long @ 2019-09-16 7:10 UTC (permalink / raw)
To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, u9012063
In-Reply-To: <cover.1568617721.git.lucien.xin@gmail.com>
This patch improves the tun_info options_len check by dropping
the skb when TUNNEL_VXLAN_OPT is set but options_len is less
than the size of vxlan_metadata. This avoids a potential
out-of-bounds access on ip_tun_info.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
drivers/net/vxlan.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 3d9bcc9..e0787286 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2487,9 +2487,11 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
vni = tunnel_id_to_key32(info->key.tun_id);
ifindex = 0;
dst_cache = &info->dst_cache;
- if (info->options_len &&
- info->key.tun_flags & TUNNEL_VXLAN_OPT)
+ if (info->key.tun_flags & TUNNEL_VXLAN_OPT) {
+ if (info->options_len < sizeof(*md))
+ goto drop;
md = ip_tunnel_info_opts(info);
+ }
ttl = info->key.ttl;
tos = info->key.tos;
label = info->key.label;
--
2.1.0
^ permalink raw reply related
* [PATCH net-next 3/6] lwtunnel: add LWTUNNEL_IP6_OPTS support for lwtunnel_ip6
From: Xin Long @ 2019-09-16 7:10 UTC (permalink / raw)
To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, u9012063
In-Reply-To: <cover.1568617721.git.lucien.xin@gmail.com>
Similar to lwtunnel_ip, this patch is to add options set/dump support
for lwtunnel_ip6.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
include/uapi/linux/lwtunnel.h | 1 +
net/ipv4/ip_tunnel_core.c | 22 ++++++++++++++++++----
2 files changed, 19 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
index 93f2c05..4bed5e6 100644
--- a/include/uapi/linux/lwtunnel.h
+++ b/include/uapi/linux/lwtunnel.h
@@ -42,6 +42,7 @@ enum lwtunnel_ip6_t {
LWTUNNEL_IP6_TC,
LWTUNNEL_IP6_FLAGS,
LWTUNNEL_IP6_PAD,
+ LWTUNNEL_IP6_OPTS,
__LWTUNNEL_IP6_MAX,
};
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index d9b7188..c8f5375a 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -353,6 +353,7 @@ static const struct nla_policy ip6_tun_policy[LWTUNNEL_IP6_MAX + 1] = {
[LWTUNNEL_IP6_HOPLIMIT] = { .type = NLA_U8 },
[LWTUNNEL_IP6_TC] = { .type = NLA_U8 },
[LWTUNNEL_IP6_FLAGS] = { .type = NLA_U16 },
+ [LWTUNNEL_IP6_OPTS] = { .type = NLA_BINARY },
};
static int ip6_tun_build_state(struct nlattr *attr,
@@ -363,14 +364,20 @@ static int ip6_tun_build_state(struct nlattr *attr,
struct ip_tunnel_info *tun_info;
struct lwtunnel_state *new_state;
struct nlattr *tb[LWTUNNEL_IP6_MAX + 1];
- int err;
+ int err, opts_len = 0;
+ void *opts;
err = nla_parse_nested_deprecated(tb, LWTUNNEL_IP6_MAX, attr,
ip6_tun_policy, extack);
if (err < 0)
return err;
- new_state = lwtunnel_state_alloc(sizeof(*tun_info));
+ if (tb[LWTUNNEL_IP6_OPTS]) {
+ opts = nla_data(tb[LWTUNNEL_IP6_OPTS]);
+ opts_len = nla_len(tb[LWTUNNEL_IP6_OPTS]);
+ }
+
+ new_state = lwtunnel_state_alloc(sizeof(*tun_info) + opts_len);
if (!new_state)
return -ENOMEM;
@@ -396,8 +403,10 @@ static int ip6_tun_build_state(struct nlattr *attr,
if (tb[LWTUNNEL_IP6_FLAGS])
tun_info->key.tun_flags = nla_get_be16(tb[LWTUNNEL_IP6_FLAGS]);
+ if (opts_len)
+ ip_tunnel_info_opts_set(tun_info, opts, opts_len, 0);
+
tun_info->mode = IP_TUNNEL_INFO_TX | IP_TUNNEL_INFO_IPV6;
- tun_info->options_len = 0;
*ts = new_state;
@@ -417,6 +426,10 @@ static int ip6_tun_fill_encap_info(struct sk_buff *skb,
nla_put_u8(skb, LWTUNNEL_IP6_HOPLIMIT, tun_info->key.ttl) ||
nla_put_be16(skb, LWTUNNEL_IP6_FLAGS, tun_info->key.tun_flags))
return -ENOMEM;
+ if (tun_info->options_len &&
+ nla_put(skb, LWTUNNEL_IP6_OPTS,
+ tun_info->options_len, ip_tunnel_info_opts(tun_info)))
+ return -ENOMEM;
return 0;
}
@@ -428,7 +441,8 @@ static int ip6_tun_encap_nlsize(struct lwtunnel_state *lwtstate)
+ nla_total_size(16) /* LWTUNNEL_IP6_SRC */
+ nla_total_size(1) /* LWTUNNEL_IP6_HOPLIMIT */
+ nla_total_size(1) /* LWTUNNEL_IP6_TC */
- + nla_total_size(2); /* LWTUNNEL_IP6_FLAGS */
+ + nla_total_size(2) /* LWTUNNEL_IP6_FLAGS */
+ + lwt_tun_info(lwtstate)->options_len; /* LWTUNNEL_IP6_OPTS */
}
static const struct lwtunnel_encap_ops ip6_tun_lwt_ops = {
--
2.1.0
^ permalink raw reply related
* [PATCH net-next 2/6] lwtunnel: add LWTUNNEL_IP_OPTS support for lwtunnel_ip
From: Xin Long @ 2019-09-16 7:10 UTC (permalink / raw)
To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, u9012063
In-Reply-To: <cover.1568617721.git.lucien.xin@gmail.com>
This patch adds LWTUNNEL_IP_OPTS to lwtunnel_ip_t, which lets users
set options for ip_tunnel_info through "ip route encap", for example
the private metadata of erspan and vxlan. One possible iproute2
syntax is:
# ip route add 1.1.1.0/24 encap ip id 1 erspan ver 1 idx 123 \
dst 10.1.0.2 dev erspan1
# ip route show
1.1.1.0/24 encap ip id 1 src 0.0.0.0 dst 10.1.0.2 ttl 0 \
tos 0 erspan ver 1 idx 123 dev erspan1 scope link
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
include/uapi/linux/lwtunnel.h | 1 +
net/ipv4/ip_tunnel_core.c | 30 ++++++++++++++++++++++++------
2 files changed, 25 insertions(+), 6 deletions(-)
diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
index de696ca..93f2c05 100644
--- a/include/uapi/linux/lwtunnel.h
+++ b/include/uapi/linux/lwtunnel.h
@@ -27,6 +27,7 @@ enum lwtunnel_ip_t {
LWTUNNEL_IP_TOS,
LWTUNNEL_IP_FLAGS,
LWTUNNEL_IP_PAD,
+ LWTUNNEL_IP_OPTS,
__LWTUNNEL_IP_MAX,
};
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 10f0848..d9b7188 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -218,6 +218,7 @@ static const struct nla_policy ip_tun_policy[LWTUNNEL_IP_MAX + 1] = {
[LWTUNNEL_IP_TTL] = { .type = NLA_U8 },
[LWTUNNEL_IP_TOS] = { .type = NLA_U8 },
[LWTUNNEL_IP_FLAGS] = { .type = NLA_U16 },
+ [LWTUNNEL_IP_OPTS] = { .type = NLA_BINARY },
};
static int ip_tun_build_state(struct nlattr *attr,
@@ -228,14 +229,20 @@ static int ip_tun_build_state(struct nlattr *attr,
struct ip_tunnel_info *tun_info;
struct lwtunnel_state *new_state;
struct nlattr *tb[LWTUNNEL_IP_MAX + 1];
- int err;
+ int err, opts_len = 0;
+ void *opts;
err = nla_parse_nested_deprecated(tb, LWTUNNEL_IP_MAX, attr,
ip_tun_policy, extack);
if (err < 0)
return err;
- new_state = lwtunnel_state_alloc(sizeof(*tun_info));
+ if (tb[LWTUNNEL_IP_OPTS]) {
+ opts = nla_data(tb[LWTUNNEL_IP_OPTS]);
+ opts_len = nla_len(tb[LWTUNNEL_IP_OPTS]);
+ }
+
+ new_state = lwtunnel_state_alloc(sizeof(*tun_info) + opts_len);
if (!new_state)
return -ENOMEM;
@@ -269,8 +276,10 @@ static int ip_tun_build_state(struct nlattr *attr,
if (tb[LWTUNNEL_IP_FLAGS])
tun_info->key.tun_flags = nla_get_be16(tb[LWTUNNEL_IP_FLAGS]);
+ if (opts_len)
+ ip_tunnel_info_opts_set(tun_info, opts, opts_len, 0);
+
tun_info->mode = IP_TUNNEL_INFO_TX;
- tun_info->options_len = 0;
*ts = new_state;
@@ -299,6 +308,10 @@ static int ip_tun_fill_encap_info(struct sk_buff *skb,
nla_put_u8(skb, LWTUNNEL_IP_TTL, tun_info->key.ttl) ||
nla_put_be16(skb, LWTUNNEL_IP_FLAGS, tun_info->key.tun_flags))
return -ENOMEM;
+ if (tun_info->options_len &&
+ nla_put(skb, LWTUNNEL_IP_OPTS,
+ tun_info->options_len, ip_tunnel_info_opts(tun_info)))
+ return -ENOMEM;
return 0;
}
@@ -310,13 +323,18 @@ static int ip_tun_encap_nlsize(struct lwtunnel_state *lwtstate)
+ nla_total_size(4) /* LWTUNNEL_IP_SRC */
+ nla_total_size(1) /* LWTUNNEL_IP_TOS */
+ nla_total_size(1) /* LWTUNNEL_IP_TTL */
- + nla_total_size(2); /* LWTUNNEL_IP_FLAGS */
+ + nla_total_size(2) /* LWTUNNEL_IP_FLAGS */
+ + lwt_tun_info(lwtstate)->options_len; /* LWTUNNEL_IP_OPTS */
}
static int ip_tun_cmp_encap(struct lwtunnel_state *a, struct lwtunnel_state *b)
{
- return memcmp(lwt_tun_info(a), lwt_tun_info(b),
- sizeof(struct ip_tunnel_info));
+ struct ip_tunnel_info *info_a = lwt_tun_info(a);
+ struct ip_tunnel_info *info_b = lwt_tun_info(b);
+ u8 opts_len;
+
+ opts_len = min(info_a->options_len, info_b->options_len);
+ return memcmp(info_a, info_b, sizeof(*info_a) + opts_len);
}
static const struct lwtunnel_encap_ops ip_tun_lwt_ops = {
--
2.1.0
^ permalink raw reply related
* [PATCH net-next 1/6] lwtunnel: add options process for arp request
From: Xin Long @ 2019-09-16 7:10 UTC (permalink / raw)
To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, u9012063
In-Reply-To: <cover.1568617721.git.lucien.xin@gmail.com>
Without options copied to the dst tun_info in iptunnel_metadata_reply()
called by arp_process for handling arp_request, the generated arp_reply
packet may be dropped or sent out with wrong options for some tunnels
like erspan and vxlan, and the traffic will break.
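The idea of carrying the options over to the reply can be sketched in
user-space C. This is a simplified model only: tun_info_model and
tun_info_reply are hypothetical names, malloc() stands in for the kernel
allocator, and the key fields that the real function also copies are omitted.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model: variable-length options trail the fixed part. */
struct tun_info_model {
	unsigned int options_len;
	unsigned char opts[];
};

/*
 * Build a reply descriptor sized for the source's options and copy them,
 * so the reply is sent with the same options as the request.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static struct tun_info_model *tun_info_reply(const struct tun_info_model *src)
{
	struct tun_info_model *dst;

	dst = malloc(sizeof(*dst) + src->options_len);
	if (!dst)
		return NULL;

	dst->options_len = src->options_len;
	memcpy(dst->opts, src->opts, src->options_len);
	return dst;
}
```

Allocating with a length of 0, as the old code did, leaves no room for the
options, which is why the allocation size has to come from the source first.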
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
net/ipv4/ip_tunnel_core.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 1452a97..10f0848 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -126,15 +126,14 @@ struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md,
if (!md || md->type != METADATA_IP_TUNNEL ||
md->u.tun_info.mode & IP_TUNNEL_INFO_TX)
-
return NULL;
- res = metadata_dst_alloc(0, METADATA_IP_TUNNEL, flags);
+ src = &md->u.tun_info;
+ res = metadata_dst_alloc(src->options_len, METADATA_IP_TUNNEL, flags);
if (!res)
return NULL;
dst = &res->u.tun_info;
- src = &md->u.tun_info;
dst->key.tun_id = src->key.tun_id;
if (src->mode & IP_TUNNEL_INFO_IPV6)
memcpy(&dst->key.u.ipv6.dst, &src->key.u.ipv6.src,
@@ -143,6 +142,8 @@ struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md,
dst->key.u.ipv4.dst = src->key.u.ipv4.src;
dst->key.tun_flags = src->key.tun_flags;
dst->mode = src->mode | IP_TUNNEL_INFO_TX;
+ ip_tunnel_info_opts_set(dst, ip_tunnel_info_opts(src),
+ src->options_len, 0);
return res;
}
--
2.1.0
^ permalink raw reply related
* [PATCH net-next 0/6] net: add support for ip_tun_info options setting
From: Xin Long @ 2019-09-16 7:10 UTC (permalink / raw)
To: network dev; +Cc: davem, Jiri Benc, Thomas Graf, u9012063
With this patchset, users can configure options with LWTUNNEL_IP(6)_OPTS
by ip route encap for an erspan or vxlan lwtunnel. Note that the kernel
does not parse the option details; it only does some checks and a memcpy.
The options are parsed by iproute in userspace.
We also improve the vxlan and erspan options processing in this patchset.
As an example, I also wrote a patch for iproute2, which I will post as a
reply to this mail. With it, we can add options for an erspan lwtunnel like:
# ip net a a; ip net a b
# ip -n a l a eth0 type veth peer name eth0 netns b
# ip -n a l s eth0 up; ip -n b link set eth0 up
# ip -n a a a 10.1.0.1/24 dev eth0; ip -n b a a 10.1.0.2/24 dev eth0
# ip -n b l a erspan1 type erspan key 1 seq erspan 123 \
local 10.1.0.2 remote 10.1.0.1
# ip -n b a a 1.1.1.1/24 dev erspan1; ip -n b l s erspan1 up
# ip -n b r a 2.1.1.0/24 dev erspan1
# ip -n a l a erspan1 type erspan key 1 seq local 10.1.0.1 external
# ip -n a a a 2.1.1.1/24 dev erspan1; ip -n a l s erspan1 up
# ip -n a r a 1.1.1.0/24 encap ip id 1 erspan ver 1 idx 123 \
dst 10.1.0.2 dev erspan1
# ip -n a r s; ip net exec a ping 1.1.1.1 -c 1
Xin Long (6):
lwtunnel: add options process for arp request
lwtunnel: add LWTUNNEL_IP_OPTS support for lwtunnel_ip
lwtunnel: add LWTUNNEL_IP6_OPTS support for lwtunnel_ip6
vxlan: check tun_info options_len properly
erspan: fix the tun_info options_len check
erspan: make md work without TUNNEL_ERSPAN_OPT set
drivers/net/vxlan.c | 6 +++--
include/uapi/linux/lwtunnel.h | 2 ++
net/ipv4/ip_gre.c | 31 ++++++++++-------------
net/ipv4/ip_tunnel_core.c | 59 +++++++++++++++++++++++++++++++++----------
net/ipv6/ip6_gre.c | 35 +++++++++++++------------
5 files changed, 84 insertions(+), 49 deletions(-)
--
2.1.0
^ permalink raw reply
* Re: [patch net-next 00/15] devlink: allow devlink instances to change network namespace
From: Jiri Pirko @ 2019-09-16 7:09 UTC (permalink / raw)
To: David Miller
Cc: netdev, idosch, dsahern, jakub.kicinski, tariqt, saeedm, kuznet,
yoshfuji, shuah, mlxsw
In-Reply-To: <20190916.090111.605211597512563157.davem@davemloft.net>
Mon, Sep 16, 2019 at 09:01:11AM CEST, davem@davemloft.net wrote:
>
>Jiri, this has to wait until the next merge window sorry.
Sure, no worries :)
^ permalink raw reply
* Re: [PATCH net] ip6_gre: fix a dst leak in ip6erspan_tunnel_xmit
From: David Miller @ 2019-09-16 7:09 UTC (permalink / raw)
To: lucien.xin; +Cc: netdev, u9012063
In-Reply-To: <1bfbf329c5b3649a6c6362350a0d609ff184deba.1568367947.git.lucien.xin@gmail.com>
From: Xin Long <lucien.xin@gmail.com>
Date: Fri, 13 Sep 2019 17:45:47 +0800
> In ip6erspan_tunnel_xmit(), if the skb will not be sent out, it has to
> be freed on the tx_err path. Otherwise when deleting a netns, it would
> cause dst/dev to leak, and dmesg shows:
>
> unregister_netdevice: waiting for lo to become free. Usage count = 1
>
> Fixes: ef7baf5e083c ("ip6_gre: add ip6 erspan collect_md mode")
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
Applied and queued up for -stable.
^ permalink raw reply
* Re: [PATCH net-next v2 2/3] mlxsw: spectrum: Register CPU port with devlink
From: Jiri Pirko @ 2019-09-16 7:08 UTC (permalink / raw)
To: Ido Schimmel; +Cc: netdev, davem, jiri, shalomt, mlxsw, Ido Schimmel
In-Reply-To: <20190916061750.26207-3-idosch@idosch.org>
Mon, Sep 16, 2019 at 08:17:49AM CEST, idosch@idosch.org wrote:
>From: Shalom Toledo <shalomt@mellanox.com>
>
>Register CPU port with devlink.
>
>Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
>Signed-off-by: Ido Schimmel <idosch@mellanox.com>
>---
> drivers/net/ethernet/mellanox/mlxsw/core.c | 65 ++++++++++++++++---
> drivers/net/ethernet/mellanox/mlxsw/core.h | 5 ++
> .../net/ethernet/mellanox/mlxsw/spectrum.c | 46 +++++++++++++
> 3 files changed, 107 insertions(+), 9 deletions(-)
>
>diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
>index 3fa96076e8a5..66354b05fd6c 100644
>--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
>+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
>@@ -1864,11 +1864,13 @@ u64 mlxsw_core_res_get(struct mlxsw_core *mlxsw_core,
> }
> EXPORT_SYMBOL(mlxsw_core_res_get);
>
>-int mlxsw_core_port_init(struct mlxsw_core *mlxsw_core, u8 local_port,
>- u32 port_number, bool split,
>- u32 split_port_subnumber,
>- const unsigned char *switch_id,
>- unsigned char switch_id_len)
>+static int
>+__mlxsw_core_port_init(struct mlxsw_core *mlxsw_core, u8 local_port,
>+ enum devlink_port_flavour flavour,
>+ u32 port_number, bool split,
>+ u32 split_port_subnumber,
>+ const unsigned char *switch_id,
>+ unsigned char switch_id_len)
No need to wrap after "static int":
static int __mlxsw_core_port_init(struct mlxsw_core *mlxsw_core, u8 local_port,
enum devlink_port_flavour flavour,
u32 port_number, bool split,
u32 split_port_subnumber,
const unsigned char *switch_id,
unsigned char switch_id_len)
> {
> struct devlink *devlink = priv_to_devlink(mlxsw_core);
> struct mlxsw_core_port *mlxsw_core_port =
>@@ -1877,17 +1879,17 @@ int mlxsw_core_port_init(struct mlxsw_core *mlxsw_core, u8 local_port,
> int err;
>
> mlxsw_core_port->local_port = local_port;
>- devlink_port_attrs_set(devlink_port, DEVLINK_PORT_FLAVOUR_PHYSICAL,
>- port_number, split, split_port_subnumber,
>+ devlink_port_attrs_set(devlink_port, flavour, port_number,
>+ split, split_port_subnumber,
> switch_id, switch_id_len);
> err = devlink_port_register(devlink, devlink_port, local_port);
> if (err)
> memset(mlxsw_core_port, 0, sizeof(*mlxsw_core_port));
> return err;
> }
>-EXPORT_SYMBOL(mlxsw_core_port_init);
>
>-void mlxsw_core_port_fini(struct mlxsw_core *mlxsw_core, u8 local_port)
>+static void
>+__mlxsw_core_port_fini(struct mlxsw_core *mlxsw_core, u8 local_port)
No need to wrap:
static void __mlxsw_core_port_fini(struct mlxsw_core *mlxsw_core, u8 local_port)
> {
> struct mlxsw_core_port *mlxsw_core_port =
> &mlxsw_core->ports[local_port];
>@@ -1896,8 +1898,53 @@ void mlxsw_core_port_fini(struct mlxsw_core *mlxsw_core, u8 local_port)
> devlink_port_unregister(devlink_port);
> memset(mlxsw_core_port, 0, sizeof(*mlxsw_core_port));
> }
>+
>+int mlxsw_core_port_init(struct mlxsw_core *mlxsw_core, u8 local_port,
>+ u32 port_number, bool split,
>+ u32 split_port_subnumber,
>+ const unsigned char *switch_id,
>+ unsigned char switch_id_len)
>+{
>+ return __mlxsw_core_port_init(mlxsw_core, local_port,
>+ DEVLINK_PORT_FLAVOUR_PHYSICAL,
>+ port_number, split, split_port_subnumber,
>+ switch_id, switch_id_len);
>+}
>+EXPORT_SYMBOL(mlxsw_core_port_init);
>+
>+void mlxsw_core_port_fini(struct mlxsw_core *mlxsw_core, u8 local_port)
>+{
>+ __mlxsw_core_port_fini(mlxsw_core, local_port);
>+}
> EXPORT_SYMBOL(mlxsw_core_port_fini);
>
>+int mlxsw_core_cpu_port_init(struct mlxsw_core *mlxsw_core,
>+ void *port_driver_priv,
>+ const unsigned char *switch_id,
>+ unsigned char switch_id_len)
>+{
>+ struct mlxsw_core_port *mlxsw_core_port =
>+ &mlxsw_core->ports[MLXSW_PORT_CPU_PORT];
>+ int err;
>+
>+ err = __mlxsw_core_port_init(mlxsw_core, MLXSW_PORT_CPU_PORT,
>+ DEVLINK_PORT_FLAVOUR_CPU,
>+ 0, false, 0,
>+ switch_id, switch_id_len);
>+ if (err)
>+ return err;
>+
>+ mlxsw_core_port->port_driver_priv = port_driver_priv;
It is a bit confusing why this is done here, compared to physical ports,
where it is done during type set. But I didn't find a better solution.
>+ return 0;
>+}
>+EXPORT_SYMBOL(mlxsw_core_cpu_port_init);
>+
>+void mlxsw_core_cpu_port_fini(struct mlxsw_core *mlxsw_core)
>+{
>+ __mlxsw_core_port_fini(mlxsw_core, MLXSW_PORT_CPU_PORT);
>+}
>+EXPORT_SYMBOL(mlxsw_core_cpu_port_fini);
>+
> void mlxsw_core_port_eth_set(struct mlxsw_core *mlxsw_core, u8 local_port,
> void *port_driver_priv, struct net_device *dev)
> {
>diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.h b/drivers/net/ethernet/mellanox/mlxsw/core.h
>index b65a17d49e43..5d7d2ab6d155 100644
>--- a/drivers/net/ethernet/mellanox/mlxsw/core.h
>+++ b/drivers/net/ethernet/mellanox/mlxsw/core.h
>@@ -177,6 +177,11 @@ int mlxsw_core_port_init(struct mlxsw_core *mlxsw_core, u8 local_port,
> const unsigned char *switch_id,
> unsigned char switch_id_len);
> void mlxsw_core_port_fini(struct mlxsw_core *mlxsw_core, u8 local_port);
>+int mlxsw_core_cpu_port_init(struct mlxsw_core *mlxsw_core,
>+ void *port_driver_priv,
>+ const unsigned char *switch_id,
>+ unsigned char switch_id_len);
>+void mlxsw_core_cpu_port_fini(struct mlxsw_core *mlxsw_core);
> void mlxsw_core_port_eth_set(struct mlxsw_core *mlxsw_core, u8 local_port,
> void *port_driver_priv, struct net_device *dev);
> void mlxsw_core_port_ib_set(struct mlxsw_core *mlxsw_core, u8 local_port,
>diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
>index 91e4792bb7e7..dd234cf7b39d 100644
>--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
>+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
>@@ -3872,6 +3872,45 @@ static void mlxsw_sp_port_remove(struct mlxsw_sp *mlxsw_sp, u8 local_port)
> mlxsw_core_port_fini(mlxsw_sp->core, local_port);
> }
>
>+static int mlxsw_sp_cpu_port_create(struct mlxsw_sp *mlxsw_sp)
>+{
>+ struct mlxsw_sp_port *mlxsw_sp_port;
>+ int err;
>+
>+ mlxsw_sp_port = kzalloc(sizeof(*mlxsw_sp_port), GFP_KERNEL);
>+ if (!mlxsw_sp_port)
>+ return -ENOMEM;
>+
>+ mlxsw_sp_port->mlxsw_sp = mlxsw_sp;
>+ mlxsw_sp_port->local_port = MLXSW_PORT_CPU_PORT;
>+
>+ err = mlxsw_core_cpu_port_init(mlxsw_sp->core,
>+ mlxsw_sp_port,
>+ mlxsw_sp->base_mac,
>+ sizeof(mlxsw_sp->base_mac));
>+ if (err) {
>+ dev_err(mlxsw_sp->bus_info->dev, "Failed to initialize core CPU port\n");
>+ goto err_core_cpu_port_init;
>+ }
>+
>+ mlxsw_sp->ports[MLXSW_PORT_CPU_PORT] = mlxsw_sp_port;
>+ return 0;
>+
>+err_core_cpu_port_init:
>+ kfree(mlxsw_sp_port);
>+ return err;
>+}
>+
>+static void mlxsw_sp_cpu_port_remove(struct mlxsw_sp *mlxsw_sp)
>+{
>+ struct mlxsw_sp_port *mlxsw_sp_port =
>+ mlxsw_sp->ports[MLXSW_PORT_CPU_PORT];
>+
>+ mlxsw_core_cpu_port_fini(mlxsw_sp->core);
>+ mlxsw_sp->ports[MLXSW_PORT_CPU_PORT] = NULL;
>+ kfree(mlxsw_sp_port);
>+}
>+
> static bool mlxsw_sp_port_created(struct mlxsw_sp *mlxsw_sp, u8 local_port)
> {
> return mlxsw_sp->ports[local_port] != NULL;
>@@ -3884,6 +3923,7 @@ static void mlxsw_sp_ports_remove(struct mlxsw_sp *mlxsw_sp)
> for (i = 1; i < mlxsw_core_max_ports(mlxsw_sp->core); i++)
> if (mlxsw_sp_port_created(mlxsw_sp, i))
> mlxsw_sp_port_remove(mlxsw_sp, i);
>+ mlxsw_sp_cpu_port_remove(mlxsw_sp);
> kfree(mlxsw_sp->port_to_module);
> kfree(mlxsw_sp->ports);
> }
>@@ -3908,6 +3948,10 @@ static int mlxsw_sp_ports_create(struct mlxsw_sp *mlxsw_sp)
> goto err_port_to_module_alloc;
> }
>
>+ err = mlxsw_sp_cpu_port_create(mlxsw_sp);
>+ if (err)
>+ goto err_cpu_port_create;
>+
> for (i = 1; i < max_ports; i++) {
> /* Mark as invalid */
> mlxsw_sp->port_to_module[i] = -1;
>@@ -3931,6 +3975,8 @@ static int mlxsw_sp_ports_create(struct mlxsw_sp *mlxsw_sp)
> for (i--; i >= 1; i--)
> if (mlxsw_sp_port_created(mlxsw_sp, i))
> mlxsw_sp_port_remove(mlxsw_sp, i);
>+ mlxsw_sp_cpu_port_remove(mlxsw_sp);
>+err_cpu_port_create:
> kfree(mlxsw_sp->port_to_module);
> err_port_to_module_alloc:
> kfree(mlxsw_sp->ports);
>--
>2.21.0
>
^ permalink raw reply
* Re: [PATCH] qed: fix spelling mistake "fullill" -> "fulfill"
From: David Miller @ 2019-09-16 7:07 UTC (permalink / raw)
To: colin.king
Cc: aelior, GR-everest-linux-l2, netdev, kernel-janitors,
linux-kernel
In-Reply-To: <20190913090759.3490-1-colin.king@canonical.com>
From: Colin King <colin.king@canonical.com>
Date: Fri, 13 Sep 2019 10:07:59 +0100
> From: Colin Ian King <colin.king@canonical.com>
>
> There is a spelling mistake in a DP_VERBOSE debug message. Fix it.
> (Using American English spelling as this is the most common way
> to spell this in the kernel).
>
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
Applied to net-next
^ permalink raw reply