* Re: [PATCH v4 9/9] Input: add IOC3 serio driver
From: Thomas Bogendoerfer @ 2019-08-14 14:37 UTC (permalink / raw)
To: Jonas Gorski
Cc: Ralf Baechle, Paul Burton, James Hogan, Dmitry Torokhov,
Lee Jones, David S. Miller, Srinivas Kandagatla, Alessandro Zummo,
Alexandre Belloni, Greg Kroah-Hartman, Jiri Slaby,
Evgeniy Polyakov, linux-mips, linux-kernel, linux-input,
Network Development, linux-rtc, linux-serial
In-Reply-To: <CAOiHx=kuQtOuNfsJ+fDrps+hbrbp5cPujmQpi8Vfy+0qeP8dtA@mail.gmail.com>
On Wed, 14 Aug 2019 15:20:14 +0200
Jonas Gorski <jonas.gorski@gmail.com> wrote:
> > + d = devm_kzalloc(&pdev->dev, sizeof(*d), GFP_KERNEL);
>
> &pdev->dev => dev
will change.
>
> > + if (!d)
> > + return -ENOMEM;
> > +
> > + sk = kzalloc(sizeof(*sk), GFP_KERNEL);
>
> any reason not to devm_kzalloc this as well? Then you won't need to
> manually free it in the error cases.
it has different life time than the device, so it may not allocated
via devm_kzalloc
> > +static int ioc3kbd_remove(struct platform_device *pdev)
> > +{
> > + struct ioc3kbd_data *d = platform_get_drvdata(pdev);
> > +
> > + devm_free_irq(&pdev->dev, d->irq, d);
> > + serio_unregister_port(d->kbd);
> > + serio_unregister_port(d->aux);
> > + return 0;
> > +}
>
> and on that topic, won't you need to kfree d->kbd and d->aux here?
that's done in serio_release_port() by the serio core.
Thomas.
--
SUSE Linux GmbH
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)
^ permalink raw reply
* [PATCH net-next v2] net: dsa: mv88e6xxx: check for mode change in port_setup_mac
From: Marek Behún @ 2019-08-14 14:40 UTC (permalink / raw)
To: netdev; +Cc: Andrew Lunn, Vivien Didelot, Heiner Kallweit, Marek Behún
The mv88e6xxx_port_setup_mac checks if the requested MAC settings are
different from the current ones, and if not, does nothing (since chaning
them requires putting the link down).
In this check it only looks if the triplet [link, speed, duplex] is
being changed.
This patch adds support to also check if the mode parameter (of type
phy_interface_t) is requested to be changed. The current mode is
computed by the ->port_link_state() method, and if it is different from
PHY_INTERFACE_MODE_NA, we check for equality with the requested mode.
In the implementations of the mv88e6250_port_link_state() method we set
the current mode to PHY_INTERFACE_MODE_NA - so the code does not check
for mode change on 6250.
In the mv88e6352_port_link_state() method, we use the cached cmode of
the port to determine the mode as phy_interface_t (and if it is not
enough, eg. for RGMII, we also look at the port control register for
RX/TX timings).
Signed-off-by: Marek Behún <marek.behun@nic.cz>
---
drivers/net/dsa/mv88e6xxx/chip.c | 4 +++-
drivers/net/dsa/mv88e6xxx/port.c | 38 ++++++++++++++++++++++++++++++++
drivers/net/dsa/mv88e6xxx/port.h | 1 +
3 files changed, 42 insertions(+), 1 deletion(-)
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 818a83eb2dcb..9b3ad22a5b98 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -417,7 +417,9 @@ int mv88e6xxx_port_setup_mac(struct mv88e6xxx_chip *chip, int port, int link,
*/
if (state.link == link &&
state.speed == speed &&
- state.duplex == duplex)
+ state.duplex == duplex &&
+ (state.interface == mode ||
+ state.interface == PHY_INTERFACE_MODE_NA))
return 0;
/* Port's MAC control must not be changed unless the link is down */
diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
index 04309ef0a1cc..c95cdb73e5a2 100644
--- a/drivers/net/dsa/mv88e6xxx/port.c
+++ b/drivers/net/dsa/mv88e6xxx/port.c
@@ -590,6 +590,7 @@ int mv88e6250_port_link_state(struct mv88e6xxx_chip *chip, int port,
state->link = !!(reg & MV88E6250_PORT_STS_LINK);
state->an_enabled = 1;
state->an_complete = state->link;
+ state->interface = PHY_INTERFACE_MODE_NA;
return 0;
}
@@ -600,6 +601,43 @@ int mv88e6352_port_link_state(struct mv88e6xxx_chip *chip, int port,
int err;
u16 reg;
+ switch (chip->ports[port].cmode) {
+ case MV88E6XXX_PORT_STS_CMODE_RGMII:
+ err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_MAC_CTL,
+ ®);
+ if (err)
+ return err;
+
+ if ((reg & MV88E6XXX_PORT_MAC_CTL_RGMII_DELAY_RXCLK) &&
+ (reg & MV88E6XXX_PORT_MAC_CTL_RGMII_DELAY_TXCLK))
+ state->interface = PHY_INTERFACE_MODE_RGMII_ID;
+ else if (reg & MV88E6XXX_PORT_MAC_CTL_RGMII_DELAY_RXCLK)
+ state->interface = PHY_INTERFACE_MODE_RGMII_RXID;
+ else if (reg & MV88E6XXX_PORT_MAC_CTL_RGMII_DELAY_TXCLK)
+ state->interface = PHY_INTERFACE_MODE_RGMII_TXID;
+ else
+ state->interface = PHY_INTERFACE_MODE_RGMII;
+ break;
+ case MV88E6XXX_PORT_STS_CMODE_1000BASE_X:
+ state->interface = PHY_INTERFACE_MODE_1000BASEX;
+ break;
+ case MV88E6XXX_PORT_STS_CMODE_SGMII:
+ state->interface = PHY_INTERFACE_MODE_SGMII;
+ break;
+ case MV88E6XXX_PORT_STS_CMODE_2500BASEX:
+ state->interface = PHY_INTERFACE_MODE_2500BASEX;
+ break;
+ case MV88E6XXX_PORT_STS_CMODE_XAUI:
+ state->interface = PHY_INTERFACE_MODE_XAUI;
+ break;
+ case MV88E6XXX_PORT_STS_CMODE_RXAUI:
+ state->interface = PHY_INTERFACE_MODE_RXAUI;
+ break;
+ default:
+ /* we do not support other cmode values here */
+ state->interface = PHY_INTERFACE_MODE_NA;
+ }
+
err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_STS, ®);
if (err)
return err;
diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
index ceec771f8bfc..1abf5ea033e2 100644
--- a/drivers/net/dsa/mv88e6xxx/port.h
+++ b/drivers/net/dsa/mv88e6xxx/port.h
@@ -42,6 +42,7 @@
#define MV88E6XXX_PORT_STS_TX_PAUSED 0x0020
#define MV88E6XXX_PORT_STS_FLOW_CTL 0x0010
#define MV88E6XXX_PORT_STS_CMODE_MASK 0x000f
+#define MV88E6XXX_PORT_STS_CMODE_RGMII 0x0007
#define MV88E6XXX_PORT_STS_CMODE_100BASE_X 0x0008
#define MV88E6XXX_PORT_STS_CMODE_1000BASE_X 0x0009
#define MV88E6XXX_PORT_STS_CMODE_SGMII 0x000a
--
2.21.0
^ permalink raw reply related
* [PATCH net-next 0/4] net: bridge: mdb: allow dump/add/del of host-joined entries
From: Nikolay Aleksandrov @ 2019-08-14 14:40 UTC (permalink / raw)
To: netdev; +Cc: davem, roopa, bridge, Nikolay Aleksandrov
Hi,
This set makes the bridge dump host-joined mdb entries, they should be
treated as normal entries since they take a slot and are aging out.
We already have notifications for them but we couldn't dump them until
now so they remained hidden. We dump them similar to how they're
notified, in order to keep user-space compatibility with the dumped
objects (e.g. iproute2 dumps mdbs in a format which can be fed into
add/del commands) we allow host-joined groups also to be added/deleted via
mdb commands. That can later be used for L2 mcast MAC manipulation as
was recently discussed. Note that iproute2 changes are not necessary,
this set will work with the current user-space mdb code.
Patch 01 - a trivial comment move
Patch 02 - factors out the mdb filling code so it can be
re-used for the host-joined entries
Patch 03 - dumps host-joined entries
Patch 04 - allows manipulation of host-joined entries via standard mdb
calls
Thanks,
Nik
Nikolay Aleksandrov (4):
net: bridge: mdb: move vlan comments
net: bridge: mdb: factor out mdb filling
net: bridge: mdb: dump host-joined entries as well
net: bridge: mdb: allow add/delete for host-joined groups
net/bridge/br_mdb.c | 171 +++++++++++++++++++++++++-------------
net/bridge/br_multicast.c | 24 ++++--
net/bridge/br_private.h | 2 +
3 files changed, 133 insertions(+), 64 deletions(-)
--
2.21.0
^ permalink raw reply
* [PATCH net-next 1/4] net: bridge: mdb: move vlan comments
From: Nikolay Aleksandrov @ 2019-08-14 14:40 UTC (permalink / raw)
To: netdev; +Cc: davem, roopa, bridge, Nikolay Aleksandrov
In-Reply-To: <20190814144024.9710-1-nikolay@cumulusnetworks.com>
Trivial patch to move the vlan comments in their proper places above the
vid 0 checks.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
net/bridge/br_mdb.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index 428af1abf8cc..ee6208c6d946 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -653,9 +653,6 @@ static int br_mdb_add(struct sk_buff *skb, struct nlmsghdr *nlh,
br = netdev_priv(dev);
- /* If vlan filtering is enabled and VLAN is not specified
- * install mdb entry on all vlans configured on the port.
- */
pdev = __dev_get_by_index(net, entry->ifindex);
if (!pdev)
return -ENODEV;
@@ -665,6 +662,9 @@ static int br_mdb_add(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EINVAL;
vg = nbp_vlan_group(p);
+ /* If vlan filtering is enabled and VLAN is not specified
+ * install mdb entry on all vlans configured on the port.
+ */
if (br_vlan_enabled(br->dev) && vg && entry->vid == 0) {
list_for_each_entry(v, &vg->vlan_list, vlist) {
entry->vid = v->vid;
@@ -745,9 +745,6 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
br = netdev_priv(dev);
- /* If vlan filtering is enabled and VLAN is not specified
- * delete mdb entry on all vlans configured on the port.
- */
pdev = __dev_get_by_index(net, entry->ifindex);
if (!pdev)
return -ENODEV;
@@ -757,6 +754,9 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
return -EINVAL;
vg = nbp_vlan_group(p);
+ /* If vlan filtering is enabled and VLAN is not specified
+ * delete mdb entry on all vlans configured on the port.
+ */
if (br_vlan_enabled(br->dev) && vg && entry->vid == 0) {
list_for_each_entry(v, &vg->vlan_list, vlist) {
entry->vid = v->vid;
--
2.21.0
^ permalink raw reply related
* [PATCH net-next 2/4] net: bridge: mdb: factor out mdb filling
From: Nikolay Aleksandrov @ 2019-08-14 14:40 UTC (permalink / raw)
To: netdev; +Cc: davem, roopa, bridge, Nikolay Aleksandrov
In-Reply-To: <20190814144024.9710-1-nikolay@cumulusnetworks.com>
We have to factor out the mdb fill portion in order to re-use it later for
the bridge mdb entries. No functional changes intended.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
net/bridge/br_mdb.c | 68 ++++++++++++++++++++++++---------------------
1 file changed, 37 insertions(+), 31 deletions(-)
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index ee6208c6d946..77730983097e 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -77,6 +77,40 @@ static void __mdb_entry_to_br_ip(struct br_mdb_entry *entry, struct br_ip *ip)
#endif
}
+static int __mdb_fill_info(struct sk_buff *skb,
+ struct net_bridge_port_group *p)
+{
+ struct nlattr *nest_ent;
+ struct br_mdb_entry e;
+
+ memset(&e, 0, sizeof(e));
+ __mdb_entry_fill_flags(&e, p->flags);
+ e.ifindex = p->port->dev->ifindex;
+ e.vid = p->addr.vid;
+ if (p->addr.proto == htons(ETH_P_IP))
+ e.addr.u.ip4 = p->addr.u.ip4;
+#if IS_ENABLED(CONFIG_IPV6)
+ if (p->addr.proto == htons(ETH_P_IPV6))
+ e.addr.u.ip6 = p->addr.u.ip6;
+#endif
+ e.addr.proto = p->addr.proto;
+ nest_ent = nla_nest_start_noflag(skb,
+ MDBA_MDB_ENTRY_INFO);
+ if (!nest_ent)
+ return -EMSGSIZE;
+
+ if (nla_put_nohdr(skb, sizeof(e), &e) ||
+ nla_put_u32(skb,
+ MDBA_MDB_EATTR_TIMER,
+ br_timer_value(&p->timer))) {
+ nla_nest_cancel(skb, nest_ent);
+ return -EMSGSIZE;
+ }
+ nla_nest_end(skb, nest_ent);
+
+ return 0;
+}
+
static int br_mdb_fill_info(struct sk_buff *skb, struct netlink_callback *cb,
struct net_device *dev)
{
@@ -95,7 +129,6 @@ static int br_mdb_fill_info(struct sk_buff *skb, struct netlink_callback *cb,
hlist_for_each_entry_rcu(mp, &br->mdb_list, mdb_node) {
struct net_bridge_port_group *p;
struct net_bridge_port_group __rcu **pp;
- struct net_bridge_port *port;
if (idx < s_idx)
goto skip;
@@ -108,41 +141,14 @@ static int br_mdb_fill_info(struct sk_buff *skb, struct netlink_callback *cb,
for (pp = &mp->ports; (p = rcu_dereference(*pp)) != NULL;
pp = &p->next) {
- struct nlattr *nest_ent;
- struct br_mdb_entry e;
-
- port = p->port;
- if (!port)
+ if (!p->port)
continue;
- memset(&e, 0, sizeof(e));
- e.ifindex = port->dev->ifindex;
- e.vid = p->addr.vid;
- __mdb_entry_fill_flags(&e, p->flags);
- if (p->addr.proto == htons(ETH_P_IP))
- e.addr.u.ip4 = p->addr.u.ip4;
-#if IS_ENABLED(CONFIG_IPV6)
- if (p->addr.proto == htons(ETH_P_IPV6))
- e.addr.u.ip6 = p->addr.u.ip6;
-#endif
- e.addr.proto = p->addr.proto;
- nest_ent = nla_nest_start_noflag(skb,
- MDBA_MDB_ENTRY_INFO);
- if (!nest_ent) {
- nla_nest_cancel(skb, nest2);
- err = -EMSGSIZE;
- goto out;
- }
- if (nla_put_nohdr(skb, sizeof(e), &e) ||
- nla_put_u32(skb,
- MDBA_MDB_EATTR_TIMER,
- br_timer_value(&p->timer))) {
- nla_nest_cancel(skb, nest_ent);
+ err = __mdb_fill_info(skb, p);
+ if (err) {
nla_nest_cancel(skb, nest2);
- err = -EMSGSIZE;
goto out;
}
- nla_nest_end(skb, nest_ent);
}
nla_nest_end(skb, nest2);
skip:
--
2.21.0
^ permalink raw reply related
* [PATCH net-next 3/4] net: bridge: mdb: dump host-joined entries as well
From: Nikolay Aleksandrov @ 2019-08-14 14:40 UTC (permalink / raw)
To: netdev; +Cc: davem, roopa, bridge, Nikolay Aleksandrov
In-Reply-To: <20190814144024.9710-1-nikolay@cumulusnetworks.com>
Currently we dump only the port mdb entries but we can have host-joined
entries on the bridge itself and they should be treated as normal temp
mdbs, they're already notified:
$ bridge monitor all
[MDB]dev br0 port br0 grp ff02::8 temp
The group will not be shown in the bridge mdb output, but it takes 1 slot
and it's timing out. If it's only host-joined then the mdb show output
can even be empty.
After this patch we show the host-joined groups:
$ bridge mdb show
dev br0 port br0 grp ff02::8 temp
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
net/bridge/br_mdb.c | 41 +++++++++++++++++++++++++++++++----------
1 file changed, 31 insertions(+), 10 deletions(-)
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index 77730983097e..985273425117 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -78,22 +78,35 @@ static void __mdb_entry_to_br_ip(struct br_mdb_entry *entry, struct br_ip *ip)
}
static int __mdb_fill_info(struct sk_buff *skb,
+ struct net_bridge_mdb_entry *mp,
struct net_bridge_port_group *p)
{
+ struct timer_list *mtimer;
struct nlattr *nest_ent;
struct br_mdb_entry e;
+ u8 flags = 0;
+ int ifindex;
memset(&e, 0, sizeof(e));
- __mdb_entry_fill_flags(&e, p->flags);
- e.ifindex = p->port->dev->ifindex;
- e.vid = p->addr.vid;
- if (p->addr.proto == htons(ETH_P_IP))
- e.addr.u.ip4 = p->addr.u.ip4;
+ if (p) {
+ ifindex = p->port->dev->ifindex;
+ mtimer = &p->timer;
+ flags = p->flags;
+ } else {
+ ifindex = mp->br->dev->ifindex;
+ mtimer = &mp->timer;
+ }
+
+ __mdb_entry_fill_flags(&e, flags);
+ e.ifindex = ifindex;
+ e.vid = mp->addr.vid;
+ if (mp->addr.proto == htons(ETH_P_IP))
+ e.addr.u.ip4 = mp->addr.u.ip4;
#if IS_ENABLED(CONFIG_IPV6)
- if (p->addr.proto == htons(ETH_P_IPV6))
- e.addr.u.ip6 = p->addr.u.ip6;
+ if (mp->addr.proto == htons(ETH_P_IPV6))
+ e.addr.u.ip6 = mp->addr.u.ip6;
#endif
- e.addr.proto = p->addr.proto;
+ e.addr.proto = mp->addr.proto;
nest_ent = nla_nest_start_noflag(skb,
MDBA_MDB_ENTRY_INFO);
if (!nest_ent)
@@ -102,7 +115,7 @@ static int __mdb_fill_info(struct sk_buff *skb,
if (nla_put_nohdr(skb, sizeof(e), &e) ||
nla_put_u32(skb,
MDBA_MDB_EATTR_TIMER,
- br_timer_value(&p->timer))) {
+ br_timer_value(mtimer))) {
nla_nest_cancel(skb, nest_ent);
return -EMSGSIZE;
}
@@ -139,12 +152,20 @@ static int br_mdb_fill_info(struct sk_buff *skb, struct netlink_callback *cb,
break;
}
+ if (mp->host_joined) {
+ err = __mdb_fill_info(skb, mp, NULL);
+ if (err) {
+ nla_nest_cancel(skb, nest2);
+ break;
+ }
+ }
+
for (pp = &mp->ports; (p = rcu_dereference(*pp)) != NULL;
pp = &p->next) {
if (!p->port)
continue;
- err = __mdb_fill_info(skb, p);
+ err = __mdb_fill_info(skb, mp, p);
if (err) {
nla_nest_cancel(skb, nest2);
goto out;
--
2.21.0
^ permalink raw reply related
* [PATCH net-next 4/4] net: bridge: mdb: allow add/delete for host-joined groups
From: Nikolay Aleksandrov @ 2019-08-14 14:40 UTC (permalink / raw)
To: netdev; +Cc: davem, roopa, bridge, Nikolay Aleksandrov
In-Reply-To: <20190814144024.9710-1-nikolay@cumulusnetworks.com>
Currently this is needed only for user-space compatibility, so similar
object adds/deletes as the dumped ones would succeed. Later it can be
used for L2 mcast MAC add/delete.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
net/bridge/br_mdb.c | 74 +++++++++++++++++++++++++++------------
net/bridge/br_multicast.c | 24 +++++++++----
net/bridge/br_private.h | 2 ++
3 files changed, 71 insertions(+), 29 deletions(-)
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index 985273425117..331a130b83b1 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -616,6 +616,19 @@ static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port,
return err;
}
+ /* host join */
+ if (!port) {
+ /* don't allow any flags for host-joined groups */
+ if (state)
+ return -EINVAL;
+ if (mp->host_joined)
+ return -EEXIST;
+
+ br_multicast_host_join(mp);
+
+ return 0;
+ }
+
for (pp = &mp->ports;
(p = mlock_dereference(*pp, br)) != NULL;
pp = &p->next) {
@@ -640,19 +653,21 @@ static int __br_mdb_add(struct net *net, struct net_bridge *br,
{
struct br_ip ip;
struct net_device *dev;
- struct net_bridge_port *p;
+ struct net_bridge_port *p = NULL;
int ret;
if (!netif_running(br->dev) || !br_opt_get(br, BROPT_MULTICAST_ENABLED))
return -EINVAL;
- dev = __dev_get_by_index(net, entry->ifindex);
- if (!dev)
- return -ENODEV;
+ if (entry->ifindex != br->dev->ifindex) {
+ dev = __dev_get_by_index(net, entry->ifindex);
+ if (!dev)
+ return -ENODEV;
- p = br_port_get_rtnl(dev);
- if (!p || p->br != br || p->state == BR_STATE_DISABLED)
- return -EINVAL;
+ p = br_port_get_rtnl(dev);
+ if (!p || p->br != br || p->state == BR_STATE_DISABLED)
+ return -EINVAL;
+ }
__mdb_entry_to_br_ip(entry, &ip);
@@ -680,15 +695,19 @@ static int br_mdb_add(struct sk_buff *skb, struct nlmsghdr *nlh,
br = netdev_priv(dev);
- pdev = __dev_get_by_index(net, entry->ifindex);
- if (!pdev)
- return -ENODEV;
+ if (entry->ifindex != br->dev->ifindex) {
+ pdev = __dev_get_by_index(net, entry->ifindex);
+ if (!pdev)
+ return -ENODEV;
- p = br_port_get_rtnl(pdev);
- if (!p || p->br != br || p->state == BR_STATE_DISABLED)
- return -EINVAL;
+ p = br_port_get_rtnl(pdev);
+ if (!p || p->br != br || p->state == BR_STATE_DISABLED)
+ return -EINVAL;
+ vg = nbp_vlan_group(p);
+ } else {
+ vg = br_vlan_group(br);
+ }
- vg = nbp_vlan_group(p);
/* If vlan filtering is enabled and VLAN is not specified
* install mdb entry on all vlans configured on the port.
*/
@@ -727,6 +746,13 @@ static int __br_mdb_del(struct net_bridge *br, struct br_mdb_entry *entry)
if (!mp)
goto unlock;
+ /* host leave */
+ if (entry->ifindex == mp->br->dev->ifindex && mp->host_joined) {
+ br_multicast_host_leave(mp);
+ err = 0;
+ goto unlock;
+ }
+
for (pp = &mp->ports;
(p = mlock_dereference(*pp, br)) != NULL;
pp = &p->next) {
@@ -759,9 +785,9 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
{
struct net *net = sock_net(skb->sk);
struct net_bridge_vlan_group *vg;
+ struct net_bridge_port *p = NULL;
struct net_device *dev, *pdev;
struct br_mdb_entry *entry;
- struct net_bridge_port *p;
struct net_bridge_vlan *v;
struct net_bridge *br;
int err;
@@ -772,15 +798,19 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
br = netdev_priv(dev);
- pdev = __dev_get_by_index(net, entry->ifindex);
- if (!pdev)
- return -ENODEV;
+ if (entry->ifindex != br->dev->ifindex) {
+ pdev = __dev_get_by_index(net, entry->ifindex);
+ if (!pdev)
+ return -ENODEV;
- p = br_port_get_rtnl(pdev);
- if (!p || p->br != br || p->state == BR_STATE_DISABLED)
- return -EINVAL;
+ p = br_port_get_rtnl(pdev);
+ if (!p || p->br != br || p->state == BR_STATE_DISABLED)
+ return -EINVAL;
+ vg = nbp_vlan_group(p);
+ } else {
+ vg = br_vlan_group(br);
+ }
- vg = nbp_vlan_group(p);
/* If vlan filtering is enabled and VLAN is not specified
* delete mdb entry on all vlans configured on the port.
*/
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 9b379e110129..f92cb6751898 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -148,8 +148,7 @@ static void br_multicast_group_expired(struct timer_list *t)
if (!netif_running(br->dev) || timer_pending(&mp->timer))
goto out;
- mp->host_joined = false;
- br_mdb_notify(br->dev, NULL, &mp->addr, RTM_DELMDB, 0);
+ br_multicast_host_leave(mp);
if (mp->ports)
goto out;
@@ -512,6 +511,21 @@ static bool br_port_group_equal(struct net_bridge_port_group *p,
return ether_addr_equal(src, p->eth_addr);
}
+void br_multicast_host_join(struct net_bridge_mdb_entry *mp)
+{
+ if (!mp->host_joined) {
+ mp->host_joined = true;
+ br_mdb_notify(mp->br->dev, NULL, &mp->addr, RTM_NEWMDB, 0);
+ }
+ mod_timer(&mp->timer, jiffies + mp->br->multicast_membership_interval);
+}
+
+void br_multicast_host_leave(struct net_bridge_mdb_entry *mp)
+{
+ mp->host_joined = false;
+ br_mdb_notify(mp->br->dev, NULL, &mp->addr, RTM_DELMDB, 0);
+}
+
static int br_multicast_add_group(struct net_bridge *br,
struct net_bridge_port *port,
struct br_ip *group,
@@ -534,11 +548,7 @@ static int br_multicast_add_group(struct net_bridge *br,
goto err;
if (!port) {
- if (!mp->host_joined) {
- mp->host_joined = true;
- br_mdb_notify(br->dev, NULL, &mp->addr, RTM_NEWMDB, 0);
- }
- mod_timer(&mp->timer, now + br->multicast_membership_interval);
+ br_multicast_host_join(mp);
goto out;
}
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index b7a4942ff1b3..a99dcbb9825c 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -702,6 +702,8 @@ void br_multicast_get_stats(const struct net_bridge *br,
struct br_mcast_stats *dest);
void br_mdb_init(void);
void br_mdb_uninit(void);
+void br_multicast_host_join(struct net_bridge_mdb_entry *mp);
+void br_multicast_host_leave(struct net_bridge_mdb_entry *mp);
#define mlock_dereference(X, br) \
rcu_dereference_protected(X, lockdep_is_held(&br->multicast_lock))
--
2.21.0
^ permalink raw reply related
* Re: [PATCH net 2/4] tcp: tcp_fragment() should apply sane memory limits
From: Marcelo Ricardo Leitner @ 2019-08-14 14:41 UTC (permalink / raw)
To: Bernd; +Cc: Neal Cardwell, netdev, edumazet
In-Reply-To: <CABOR3+zQ0yfbcon6bv5TXrrAomoWLxy101iEXqBycDTrhytDiA@mail.gmail.com>
On Fri, Aug 02, 2019 at 09:58:12PM +0200, Bernd wrote:
> 2019-08-02 21:14 GMT+02:00, Neal Cardwell <ncardwell@google.com>:
> > What's the exact kernel version you are using?
>
> It is the RHEL errata kernel 3.10.0-957.21.3.el7 (rhsa-2019:1481), i
> need to check if there is a newer one.
FWIW, this one doesn't have the patch below.
Marcelo
>
> > Eric submitted a patch recently that may address your issue:
> > tcp: be more careful in tcp_fragment()
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=b617158dc096709d8600c53b6052144d12b89fab
> >
> > Would you be able to test your workload with that commit
> > cherry-picked, and see if the issue still occurs?
>
> It only happens on a customer system in production up to now, so most
> likely not.
>
> > That commit was targeted to many stable releases, so you may be able
> > to pick up that fix from a stable branch.
>
> The only thing which is a bit strange, this is a Java client, and I am
> pretty sure we don’t set a small SO_SNDBUF, if anything it is
> increased (I need to verify that).
>
> Not to worry, I guess I can now with your helpful pointer sort that
> out with Redhat.
>
> gruss
> Bernd
>
^ permalink raw reply
* Re: [PATCH bpf-next v4 1/8] xsk: replace ndo_xsk_async_xmit with ndo_xsk_wakeup
From: Jonathan Lemon @ 2019-08-14 14:46 UTC (permalink / raw)
To: Magnus Karlsson
Cc: bjorn.topel, ast, daniel, netdev, brouer, maximmi, bpf,
bruce.richardson, ciara.loftus, jakub.kicinski, xiaolong.ye,
qi.z.zhang, sridhar.samudrala, kevin.laatz, ilias.apalodimas,
kiran.patil, axboe, maciej.fijalkowski, maciejromanfijalkowski,
intel-wired-lan
In-Reply-To: <1565767643-4908-2-git-send-email-magnus.karlsson@intel.com>
On 14 Aug 2019, at 0:27, Magnus Karlsson wrote:
> This commit replaces ndo_xsk_async_xmit with ndo_xsk_wakeup. This new
> ndo provides the same functionality as before but with the addition of
> a new flags field that is used to specifiy if Rx, Tx or both should be
> woken up. The previous ndo only woke up Tx, as implied by the
> name. The i40e and ixgbe drivers (which are all the supported ones)
> are updated with this new interface.
>
> This new ndo will be used by the new need_wakeup functionality of XDP
> sockets that need to be able to wake up both Rx and Tx driver
> processing.
>
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
^ permalink raw reply
* Re: [PATCH bpf-next v4 2/8] xsk: add support for need_wakeup flag in AF_XDP rings
From: Jonathan Lemon @ 2019-08-14 14:47 UTC (permalink / raw)
To: Magnus Karlsson
Cc: bjorn.topel, ast, daniel, netdev, brouer, maximmi, bpf,
bruce.richardson, ciara.loftus, jakub.kicinski, xiaolong.ye,
qi.z.zhang, sridhar.samudrala, kevin.laatz, ilias.apalodimas,
kiran.patil, axboe, maciej.fijalkowski, maciejromanfijalkowski,
intel-wired-lan
In-Reply-To: <1565767643-4908-3-git-send-email-magnus.karlsson@intel.com>
On 14 Aug 2019, at 0:27, Magnus Karlsson wrote:
> This commit adds support for a new flag called need_wakeup in the
> AF_XDP Tx and fill rings. When this flag is set, it means that the
> application has to explicitly wake up the kernel Rx (for the bit in
> the fill ring) or kernel Tx (for bit in the Tx ring) processing by
> issuing a syscall. Poll() can wake up both depending on the flags
> submitted and sendto() will wake up tx processing only.
>
> The main reason for introducing this new flag is to be able to
> efficiently support the case when application and driver is executing
> on the same core. Previously, the driver was just busy-spinning on the
> fill ring if it ran out of buffers in the HW and there were none on
> the fill ring. This approach works when the application is running on
> another core as it can replenish the fill ring while the driver is
> busy-spinning. Though, this is a lousy approach if both of them are
> running on the same core as the probability of the fill ring getting
> more entries when the driver is busy-spinning is zero. With this new
> feature the driver now sets the need_wakeup flag and returns to the
> application. The application can then replenish the fill queue and
> then explicitly wake up the Rx processing in the kernel using the
> syscall poll(). For Tx, the flag is only set to one if the driver has
> no outstanding Tx completion interrupts. If it has some, the flag is
> zero as it will be woken up by a completion interrupt anyway.
>
> As a nice side effect, this new flag also improves the performance of
> the case where application and driver are running on two different
> cores as it reduces the number of syscalls to the kernel. The kernel
> tells user space if it needs to be woken up by a syscall, and this
> eliminates many of the syscalls.
>
> This flag needs some simple driver support. If the driver does not
> support this, the Rx flag is always zero and the Tx flag is always
> one. This makes any application relying on this feature default to the
> old behaviour of not requiring any syscalls in the Rx path and always
> having to call sendto() in the Tx path.
>
> For backwards compatibility reasons, this feature has to be explicitly
> turned on using a new bind flag (XDP_USE_NEED_WAKEUP). I recommend
> that you always turn it on as it so far always have had a positive
> performance impact.
>
> The name and inspiration of the flag has been taken from io_uring by
> Jens Axboe. Details about this feature in io_uring can be found in
> http://kernel.dk/io_uring.pdf, section 8.3.
>
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
^ permalink raw reply
* Re: [PATCH net-next v2] net: dsa: mv88e6xxx: check for mode change in port_setup_mac
From: Andrew Lunn @ 2019-08-14 14:48 UTC (permalink / raw)
To: Marek Behún; +Cc: netdev, Vivien Didelot, Heiner Kallweit
In-Reply-To: <20190814144024.27975-1-marek.behun@nic.cz>
On Wed, Aug 14, 2019 at 04:40:24PM +0200, Marek Behún wrote:
> The mv88e6xxx_port_setup_mac checks if the requested MAC settings are
> different from the current ones, and if not, does nothing (since chaning
> them requires putting the link down).
>
> In this check it only looks if the triplet [link, speed, duplex] is
> being changed.
>
> This patch adds support to also check if the mode parameter (of type
> phy_interface_t) is requested to be changed. The current mode is
> computed by the ->port_link_state() method, and if it is different from
> PHY_INTERFACE_MODE_NA, we check for equality with the requested mode.
>
> In the implementations of the mv88e6250_port_link_state() method we set
> the current mode to PHY_INTERFACE_MODE_NA - so the code does not check
> for mode change on 6250.
>
> In the mv88e6352_port_link_state() method, we use the cached cmode of
> the port to determine the mode as phy_interface_t (and if it is not
> enough, eg. for RGMII, we also look at the port control register for
> RX/TX timings).
>
> Signed-off-by: Marek Behún <marek.behun@nic.cz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH bpf-next v4 3/8] i40e: add support for AF_XDP need_wakeup feature
From: Jonathan Lemon @ 2019-08-14 14:48 UTC (permalink / raw)
To: Magnus Karlsson
Cc: bjorn.topel, ast, daniel, netdev, brouer, maximmi, bpf,
bruce.richardson, ciara.loftus, jakub.kicinski, xiaolong.ye,
qi.z.zhang, sridhar.samudrala, kevin.laatz, ilias.apalodimas,
kiran.patil, axboe, maciej.fijalkowski, maciejromanfijalkowski,
intel-wired-lan
In-Reply-To: <1565767643-4908-4-git-send-email-magnus.karlsson@intel.com>
On 14 Aug 2019, at 0:27, Magnus Karlsson wrote:
> This patch adds support for the need_wakeup feature of AF_XDP. If the
> application has told the kernel that it might sleep using the new bind
> flag XDP_USE_NEED_WAKEUP, the driver will then set this flag if it has
> no more buffers on the NIC Rx ring and yield to the application. For
> Tx, it will set the flag if it has no outstanding Tx completion
> interrupts and return to the application.
>
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> ---
> drivers/net/ethernet/intel/i40e/i40e_xsk.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> index d0ff5d8..42c9012 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> @@ -626,6 +626,15 @@ int i40e_clean_rx_irq_zc(struct i40e_ring
> *rx_ring, int budget)
>
> i40e_finalize_xdp_rx(rx_ring, xdp_xmit);
> i40e_update_rx_stats(rx_ring, total_rx_bytes, total_rx_packets);
> +
> + if (xsk_umem_uses_need_wakeup(rx_ring->xsk_umem)) {
> + if (failure || rx_ring->next_to_clean == rx_ring->next_to_use)
> + xsk_set_rx_need_wakeup(rx_ring->xsk_umem);
> + else
> + xsk_clear_rx_need_wakeup(rx_ring->xsk_umem);
> +
> + return (int)total_rx_packets;
> + }
> return failure ? budget : (int)total_rx_packets;
Can you elaborate why we're not returning the total budget on failure
for the wakeup case?
^ permalink raw reply
* Re: [PATCH bpf-next v4 5/8] libbpf: add support for need_wakeup flag in AF_XDP part
From: Jonathan Lemon @ 2019-08-14 14:49 UTC (permalink / raw)
To: Magnus Karlsson
Cc: bjorn.topel, ast, daniel, netdev, brouer, maximmi, bpf,
bruce.richardson, ciara.loftus, jakub.kicinski, xiaolong.ye,
qi.z.zhang, sridhar.samudrala, kevin.laatz, ilias.apalodimas,
kiran.patil, axboe, maciej.fijalkowski, maciejromanfijalkowski,
intel-wired-lan
In-Reply-To: <1565767643-4908-6-git-send-email-magnus.karlsson@intel.com>
On 14 Aug 2019, at 0:27, Magnus Karlsson wrote:
> This commit adds support for the new need_wakeup flag in AF_XDP. The
> xsk_socket__create function is updated to handle this and a new
> function is introduced called xsk_ring_prod__needs_wakeup(). This
> function can be used by the application to check if Rx and/or Tx
> processing needs to be explicitly woken up.
>
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
^ permalink raw reply
* Re: [PATCH bpf-next v4 6/8] samples/bpf: add use of need_wakeup flag in xdpsock
From: Jonathan Lemon @ 2019-08-14 14:53 UTC (permalink / raw)
To: Magnus Karlsson
Cc: bjorn.topel, ast, daniel, netdev, brouer, maximmi, bpf,
bruce.richardson, ciara.loftus, jakub.kicinski, xiaolong.ye,
qi.z.zhang, sridhar.samudrala, kevin.laatz, ilias.apalodimas,
kiran.patil, axboe, maciej.fijalkowski, maciejromanfijalkowski,
intel-wired-lan
In-Reply-To: <1565767643-4908-7-git-send-email-magnus.karlsson@intel.com>
On 14 Aug 2019, at 0:27, Magnus Karlsson wrote:
> This commit adds using the need_wakeup flag to the xdpsock sample
> application. It is turned on by default as we think it is a feature
> that seems to always produce a performance benefit, if the application
> has been written taking advantage of it. It can be turned off in the
> sample app by using the '-m' command line option.
>
> The txpush and l2fwd sub applications have also been updated to
> support poll() with multiple sockets.
>
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
> ---
> samples/bpf/xdpsock_user.c | 192
> ++++++++++++++++++++++++++++-----------------
> 1 file changed, 120 insertions(+), 72 deletions(-)
>
> diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
> index 93eaaf7..da84c76 100644
> --- a/samples/bpf/xdpsock_user.c
> +++ b/samples/bpf/xdpsock_user.c
> @@ -67,8 +67,10 @@ static int opt_ifindex;
> static int opt_queue;
> static int opt_poll;
> static int opt_interval = 1;
> -static u32 opt_xdp_bind_flags;
> +static u32 opt_xdp_bind_flags = XDP_USE_NEED_WAKEUP;
> static int opt_xsk_frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE;
> +static int opt_timeout = 1000;
> +static bool opt_need_wakeup = true;
> static __u32 prog_id;
>
> struct xsk_umem_info {
> @@ -352,6 +354,7 @@ static struct option long_options[] = {
> {"zero-copy", no_argument, 0, 'z'},
> {"copy", no_argument, 0, 'c'},
> {"frame-size", required_argument, 0, 'f'},
> + {"no-need-wakeup", no_argument, 0, 'm'},
> {0, 0, 0, 0}
> };
>
> @@ -372,6 +375,7 @@ static void usage(const char *prog)
> " -z, --zero-copy Force zero-copy mode.\n"
> " -c, --copy Force copy mode.\n"
> " -f, --frame-size=n Set the frame size (must be a power of two,
> default is %d).\n"
> + " -m, --no-need-wakeup Turn off use of driver need wakeup flag.\n"
> "\n";
> fprintf(stderr, str, prog, XSK_UMEM__DEFAULT_FRAME_SIZE);
> exit(EXIT_FAILURE);
> @@ -384,8 +388,9 @@ static void parse_command_line(int argc, char
> **argv)
> opterr = 0;
>
> for (;;) {
> - c = getopt_long(argc, argv, "Frtli:q:psSNn:czf:", long_options,
> - &option_index);
> +
> + c = getopt_long(argc, argv, "Frtli:q:psSNn:czf:m",
> + long_options, &option_index);
> if (c == -1)
> break;
>
> @@ -429,6 +434,9 @@ static void parse_command_line(int argc, char
> **argv)
> break;
> case 'f':
> opt_xsk_frame_size = atoi(optarg);
> + case 'm':
> + opt_need_wakeup = false;
> + opt_xdp_bind_flags &= ~XDP_USE_NEED_WAKEUP;
> break;
> default:
> usage(basename(argv[0]));
> @@ -459,7 +467,8 @@ static void kick_tx(struct xsk_socket_info *xsk)
> exit_with_error(errno);
> }
>
> -static inline void complete_tx_l2fwd(struct xsk_socket_info *xsk)
> +static inline void complete_tx_l2fwd(struct xsk_socket_info *xsk,
> + struct pollfd *fds)
> {
> u32 idx_cq = 0, idx_fq = 0;
> unsigned int rcvd;
> @@ -468,7 +477,9 @@ static inline void complete_tx_l2fwd(struct
> xsk_socket_info *xsk)
> if (!xsk->outstanding_tx)
> return;
>
> - kick_tx(xsk);
> + if (!opt_need_wakeup || xsk_ring_prod__needs_wakeup(&xsk->tx))
> + kick_tx(xsk);
> +
> ndescs = (xsk->outstanding_tx > BATCH_SIZE) ? BATCH_SIZE :
> xsk->outstanding_tx;
>
> @@ -482,6 +493,8 @@ static inline void complete_tx_l2fwd(struct
> xsk_socket_info *xsk)
> while (ret != rcvd) {
> if (ret < 0)
> exit_with_error(-ret);
> + if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq))
> + ret = poll(fds, num_socks, opt_timeout);
> ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd,
> &idx_fq);
> }
> @@ -505,7 +518,8 @@ static inline void complete_tx_only(struct
> xsk_socket_info *xsk)
> if (!xsk->outstanding_tx)
> return;
>
> - kick_tx(xsk);
> + if (!opt_need_wakeup || xsk_ring_prod__needs_wakeup(&xsk->tx))
> + kick_tx(xsk);
>
> rcvd = xsk_ring_cons__peek(&xsk->umem->cq, BATCH_SIZE, &idx);
> if (rcvd > 0) {
> @@ -515,20 +529,25 @@ static inline void complete_tx_only(struct
> xsk_socket_info *xsk)
> }
> }
>
> -static void rx_drop(struct xsk_socket_info *xsk)
> +static void rx_drop(struct xsk_socket_info *xsk, struct pollfd *fds)
> {
> unsigned int rcvd, i;
> u32 idx_rx = 0, idx_fq = 0;
> int ret;
>
> rcvd = xsk_ring_cons__peek(&xsk->rx, BATCH_SIZE, &idx_rx);
> - if (!rcvd)
> + if (!rcvd) {
> + if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq))
> + ret = poll(fds, num_socks, opt_timeout);
> return;
> + }
>
> ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq);
> while (ret != rcvd) {
> if (ret < 0)
> exit_with_error(-ret);
> + if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq))
> + ret = poll(fds, num_socks, opt_timeout);
> ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq);
> }
I'll just point out that it seems a little odd that if one ring
needs a wakeup, all the rings get polled again. However, I suppose
that it does amortize costs of a kernel call.
--
Jonathan
> @@ -549,42 +568,65 @@ static void rx_drop(struct xsk_socket_info *xsk)
> static void rx_drop_all(void)
> {
> struct pollfd fds[MAX_SOCKS + 1];
> - int i, ret, timeout, nfds = 1;
> + int i, ret;
>
> memset(fds, 0, sizeof(fds));
>
> for (i = 0; i < num_socks; i++) {
> fds[i].fd = xsk_socket__fd(xsks[i]->xsk);
> fds[i].events = POLLIN;
> - timeout = 1000; /* 1sn */
> }
>
> for (;;) {
> if (opt_poll) {
> - ret = poll(fds, nfds, timeout);
> + ret = poll(fds, num_socks, opt_timeout);
> if (ret <= 0)
> continue;
> }
>
> for (i = 0; i < num_socks; i++)
> - rx_drop(xsks[i]);
> + rx_drop(xsks[i], fds);
> + }
> +}
> +
> +static void tx_only(struct xsk_socket_info *xsk, u32 frame_nb)
> +{
> + u32 idx;
> +
> + if (xsk_ring_prod__reserve(&xsk->tx, BATCH_SIZE, &idx) ==
> BATCH_SIZE) {
> + unsigned int i;
> +
> + for (i = 0; i < BATCH_SIZE; i++) {
> + xsk_ring_prod__tx_desc(&xsk->tx, idx + i)->addr =
> + (frame_nb + i) << XSK_UMEM__DEFAULT_FRAME_SHIFT;
> + xsk_ring_prod__tx_desc(&xsk->tx, idx + i)->len =
> + sizeof(pkt_data) - 1;
> + }
> +
> + xsk_ring_prod__submit(&xsk->tx, BATCH_SIZE);
> + xsk->outstanding_tx += BATCH_SIZE;
> + frame_nb += BATCH_SIZE;
> + frame_nb %= NUM_FRAMES;
> }
> +
> + complete_tx_only(xsk);
> }
>
> -static void tx_only(struct xsk_socket_info *xsk)
> +static void tx_only_all(void)
> {
> - int timeout, ret, nfds = 1;
> - struct pollfd fds[nfds + 1];
> - u32 idx, frame_nb = 0;
> + struct pollfd fds[MAX_SOCKS];
> + u32 frame_nb[MAX_SOCKS] = {};
> + int i, ret;
>
> memset(fds, 0, sizeof(fds));
> - fds[0].fd = xsk_socket__fd(xsk->xsk);
> - fds[0].events = POLLOUT;
> - timeout = 1000; /* 1sn */
> + for (i = 0; i < num_socks; i++) {
> + fds[0].fd = xsk_socket__fd(xsks[i]->xsk);
> + fds[0].events = POLLOUT;
> + }
>
> for (;;) {
> if (opt_poll) {
> - ret = poll(fds, nfds, timeout);
> + ret = poll(fds, num_socks, opt_timeout);
> if (ret <= 0)
> continue;
>
> @@ -592,69 +634,75 @@ static void tx_only(struct xsk_socket_info *xsk)
> continue;
> }
>
> - if (xsk_ring_prod__reserve(&xsk->tx, BATCH_SIZE, &idx) ==
> - BATCH_SIZE) {
> - unsigned int i;
> -
> - for (i = 0; i < BATCH_SIZE; i++) {
> - xsk_ring_prod__tx_desc(&xsk->tx, idx + i)->addr
> - = (frame_nb + i) * opt_xsk_frame_size;
> - xsk_ring_prod__tx_desc(&xsk->tx, idx + i)->len =
> - sizeof(pkt_data) - 1;
> - }
> -
> - xsk_ring_prod__submit(&xsk->tx, BATCH_SIZE);
> - xsk->outstanding_tx += BATCH_SIZE;
> - frame_nb += BATCH_SIZE;
> - frame_nb %= NUM_FRAMES;
> - }
> -
> - complete_tx_only(xsk);
> + for (i = 0; i < num_socks; i++)
> + tx_only(xsks[i], frame_nb[i]);
> }
> }
>
> -static void l2fwd(struct xsk_socket_info *xsk)
> +static void l2fwd(struct xsk_socket_info *xsk, struct pollfd *fds)
> {
> - for (;;) {
> - unsigned int rcvd, i;
> - u32 idx_rx = 0, idx_tx = 0;
> - int ret;
> + unsigned int rcvd, i;
> + u32 idx_rx = 0, idx_tx = 0;
> + int ret;
>
> - for (;;) {
> - complete_tx_l2fwd(xsk);
> + complete_tx_l2fwd(xsk, fds);
>
> - rcvd = xsk_ring_cons__peek(&xsk->rx, BATCH_SIZE,
> - &idx_rx);
> - if (rcvd > 0)
> - break;
> - }
> + rcvd = xsk_ring_cons__peek(&xsk->rx, BATCH_SIZE, &idx_rx);
> + if (!rcvd) {
> + if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq))
> + ret = poll(fds, num_socks, opt_timeout);
> + return;
> + }
>
> + ret = xsk_ring_prod__reserve(&xsk->tx, rcvd, &idx_tx);
> + while (ret != rcvd) {
> + if (ret < 0)
> + exit_with_error(-ret);
> + if (xsk_ring_prod__needs_wakeup(&xsk->tx))
> + kick_tx(xsk);
> ret = xsk_ring_prod__reserve(&xsk->tx, rcvd, &idx_tx);
> - while (ret != rcvd) {
> - if (ret < 0)
> - exit_with_error(-ret);
> - ret = xsk_ring_prod__reserve(&xsk->tx, rcvd, &idx_tx);
> - }
> + }
> +
> + for (i = 0; i < rcvd; i++) {
> + u64 addr = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx)->addr;
> + u32 len = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx++)->len;
> + char *pkt = xsk_umem__get_data(xsk->umem->buffer, addr);
> +
> + swap_mac_addresses(pkt);
>
> - for (i = 0; i < rcvd; i++) {
> - u64 addr = xsk_ring_cons__rx_desc(&xsk->rx,
> - idx_rx)->addr;
> - u32 len = xsk_ring_cons__rx_desc(&xsk->rx,
> - idx_rx++)->len;
> - char *pkt = xsk_umem__get_data(xsk->umem->buffer, addr);
> + hex_dump(pkt, len, addr);
> + xsk_ring_prod__tx_desc(&xsk->tx, idx_tx)->addr = addr;
> + xsk_ring_prod__tx_desc(&xsk->tx, idx_tx++)->len = len;
> + }
>
> - swap_mac_addresses(pkt);
> + xsk_ring_prod__submit(&xsk->tx, rcvd);
> + xsk_ring_cons__release(&xsk->rx, rcvd);
>
> - hex_dump(pkt, len, addr);
> - xsk_ring_prod__tx_desc(&xsk->tx, idx_tx)->addr = addr;
> - xsk_ring_prod__tx_desc(&xsk->tx, idx_tx++)->len = len;
> - }
> + xsk->rx_npkts += rcvd;
> + xsk->outstanding_tx += rcvd;
> +}
> +
> +static void l2fwd_all(void)
> +{
> + struct pollfd fds[MAX_SOCKS];
> + int i, ret;
> +
> + memset(fds, 0, sizeof(fds));
> +
> + for (i = 0; i < num_socks; i++) {
> + fds[i].fd = xsk_socket__fd(xsks[i]->xsk);
> + fds[i].events = POLLOUT | POLLIN;
> + }
>
> - xsk_ring_prod__submit(&xsk->tx, rcvd);
> - xsk_ring_cons__release(&xsk->rx, rcvd);
> + for (;;) {
> + if (opt_poll) {
> + ret = poll(fds, num_socks, opt_timeout);
> + if (ret <= 0)
> + continue;
> + }
>
> - xsk->rx_npkts += rcvd;
> - xsk->outstanding_tx += rcvd;
> + for (i = 0; i < num_socks; i++)
> + l2fwd(xsks[i], fds);
> }
> }
>
> @@ -705,9 +753,9 @@ int main(int argc, char **argv)
> if (opt_bench == BENCH_RXDROP)
> rx_drop_all();
> else if (opt_bench == BENCH_TXONLY)
> - tx_only(xsks[0]);
> + tx_only_all();
> else
> - l2fwd(xsks[0]);
> + l2fwd_all();
>
> return 0;
> }
> --
> 2.7.4
^ permalink raw reply
* Re: [PATCH bpf-next v4 7/8] net/mlx5e: Move the SW XSK code from NAPI poll to a separate function
From: Jonathan Lemon @ 2019-08-14 14:53 UTC (permalink / raw)
To: Magnus Karlsson
Cc: bjorn.topel, ast, daniel, netdev, brouer, maximmi, bpf,
bruce.richardson, ciara.loftus, jakub.kicinski, xiaolong.ye,
qi.z.zhang, sridhar.samudrala, kevin.laatz, ilias.apalodimas,
kiran.patil, axboe, maciej.fijalkowski, maciejromanfijalkowski,
intel-wired-lan
In-Reply-To: <1565767643-4908-8-git-send-email-magnus.karlsson@intel.com>
On 14 Aug 2019, at 0:27, Magnus Karlsson wrote:
> From: Maxim Mikityanskiy <maximmi@mellanox.com>
>
> Two XSK tasks are performed during NAPI polling, that are not bound to
> hardware interrupts: TXing packets and polling for frames in the Fill
> Ring. They are special in a way that the hardware doesn't know about
> these tasks, so it doesn't trigger interrupts if there is still some
> work to be done, it's our driver's responsibility to ensure NAPI will be
> rescheduled if needed.
>
> Create a new function to handle these tasks and move the corresponding
> code from mlx5e_napi_poll to the new function to improve modularity and
> prepare for the changes in the following patch.
>
> Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
^ permalink raw reply
* [net-next 1/1] tipc: clean up skb list lock handling on send path
From: Jon Maloy @ 2019-08-14 14:55 UTC (permalink / raw)
To: davem, netdev
Cc: tung.q.nguyen, hoang.h.le, jon.maloy, lxin, shuali, ying.xue,
edumazet, tipc-discussion
The policy for handling the skb list locks on the send and receive paths
is simple.
- On the send path we never need to grab the lock on the 'xmitq' list
when the destination is an exernal node.
- On the receive path we always need to grab the lock on the 'inputq'
list, irrespective of source node.
However, when transmitting node local messages those will eventually
end up on the receive path of a local socket, meaning that the argument
'xmitq' in tipc_node_xmit() will become the 'ínputq' argument in the
function tipc_sk_rcv(). This has been handled by always initializing
the spinlock of the 'xmitq' list at message creation, just in case it
may end up on the receive path later, and despite knowing that the lock
in most cases never will be used.
This approach is inaccurate and confusing, and has also concealed the
fact that the stated 'no lock grabbing' policy for the send path is
violated in some cases.
We now clean up this by never initializing the lock at message creation,
instead doing this at the moment we find that the message actually will
enter the receive path. At the same time we fix the four locations
where we incorrectly access the spinlock on the send/error path.
This patch also reverts commit d12cffe9329f ("tipc: ensure head->lock
is initialised") which has now become redundant.
CC: Eric Dumazet <edumazet@google.com>
Reported-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
---
net/tipc/bcast.c | 10 +++++-----
net/tipc/link.c | 4 ++--
net/tipc/name_distr.c | 2 +-
net/tipc/node.c | 7 ++++---
net/tipc/socket.c | 14 +++++++-------
5 files changed, 19 insertions(+), 18 deletions(-)
diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 34f3e56..6ef1abd 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -185,7 +185,7 @@ static void tipc_bcbase_xmit(struct net *net, struct sk_buff_head *xmitq)
}
/* We have to transmit across all bearers */
- skb_queue_head_init(&_xmitq);
+ __skb_queue_head_init(&_xmitq);
for (bearer_id = 0; bearer_id < MAX_BEARERS; bearer_id++) {
if (!bb->dests[bearer_id])
continue;
@@ -256,7 +256,7 @@ static int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
struct sk_buff_head xmitq;
int rc = 0;
- skb_queue_head_init(&xmitq);
+ __skb_queue_head_init(&xmitq);
tipc_bcast_lock(net);
if (tipc_link_bc_peers(l))
rc = tipc_link_xmit(l, pkts, &xmitq);
@@ -286,7 +286,7 @@ static int tipc_rcast_xmit(struct net *net, struct sk_buff_head *pkts,
u32 dnode, selector;
selector = msg_link_selector(buf_msg(skb_peek(pkts)));
- skb_queue_head_init(&_pkts);
+ __skb_queue_head_init(&_pkts);
list_for_each_entry_safe(dst, tmp, &dests->list, list) {
dnode = dst->node;
@@ -344,7 +344,7 @@ static int tipc_mcast_send_sync(struct net *net, struct sk_buff *skb,
msg_set_size(_hdr, MCAST_H_SIZE);
msg_set_is_rcast(_hdr, !msg_is_rcast(hdr));
- skb_queue_head_init(&tmpq);
+ __skb_queue_head_init(&tmpq);
__skb_queue_tail(&tmpq, _skb);
if (method->rcast)
tipc_bcast_xmit(net, &tmpq, cong_link_cnt);
@@ -378,7 +378,7 @@ int tipc_mcast_xmit(struct net *net, struct sk_buff_head *pkts,
int rc = 0;
skb_queue_head_init(&inputq);
- skb_queue_head_init(&localq);
+ __skb_queue_head_init(&localq);
/* Clone packets before they are consumed by next call */
if (dests->local && !tipc_msg_reassemble(pkts, &localq)) {
diff --git a/net/tipc/link.c b/net/tipc/link.c
index dd3155b..ba057a9 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -959,7 +959,7 @@ int tipc_link_xmit(struct tipc_link *l, struct sk_buff_head *list,
pr_warn("Too large msg, purging xmit list %d %d %d %d %d!\n",
skb_queue_len(list), msg_user(hdr),
msg_type(hdr), msg_size(hdr), mtu);
- skb_queue_purge(list);
+ __skb_queue_purge(list);
return -EMSGSIZE;
}
@@ -988,7 +988,7 @@ int tipc_link_xmit(struct tipc_link *l, struct sk_buff_head *list,
if (likely(skb_queue_len(transmq) < maxwin)) {
_skb = skb_clone(skb, GFP_ATOMIC);
if (!_skb) {
- skb_queue_purge(list);
+ __skb_queue_purge(list);
return -ENOBUFS;
}
__skb_dequeue(list);
diff --git a/net/tipc/name_distr.c b/net/tipc/name_distr.c
index 44abc8e..61219f0 100644
--- a/net/tipc/name_distr.c
+++ b/net/tipc/name_distr.c
@@ -190,7 +190,7 @@ void tipc_named_node_up(struct net *net, u32 dnode)
struct name_table *nt = tipc_name_table(net);
struct sk_buff_head head;
- skb_queue_head_init(&head);
+ __skb_queue_head_init(&head);
read_lock_bh(&nt->cluster_scope_lock);
named_distribute(net, &head, dnode, &nt->cluster_scope);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index 1bdcf0f..c8f6177 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1444,13 +1444,14 @@ int tipc_node_xmit(struct net *net, struct sk_buff_head *list,
if (in_own_node(net, dnode)) {
tipc_loopback_trace(net, list);
+ spin_lock_init(&list->lock);
tipc_sk_rcv(net, list);
return 0;
}
n = tipc_node_find(net, dnode);
if (unlikely(!n)) {
- skb_queue_purge(list);
+ __skb_queue_purge(list);
return -EHOSTUNREACH;
}
@@ -1459,7 +1460,7 @@ int tipc_node_xmit(struct net *net, struct sk_buff_head *list,
if (unlikely(bearer_id == INVALID_BEARER_ID)) {
tipc_node_read_unlock(n);
tipc_node_put(n);
- skb_queue_purge(list);
+ __skb_queue_purge(list);
return -EHOSTUNREACH;
}
@@ -1491,7 +1492,7 @@ int tipc_node_xmit_skb(struct net *net, struct sk_buff *skb, u32 dnode,
{
struct sk_buff_head head;
- skb_queue_head_init(&head);
+ __skb_queue_head_init(&head);
__skb_queue_tail(&head, skb);
tipc_node_xmit(net, &head, dnode, selector);
return 0;
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 83ae41d..3b9f8cc 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -809,7 +809,7 @@ static int tipc_sendmcast(struct socket *sock, struct tipc_name_seq *seq,
msg_set_nameupper(hdr, seq->upper);
/* Build message as chain of buffers */
- skb_queue_head_init(&pkts);
+ __skb_queue_head_init(&pkts);
rc = tipc_msg_build(hdr, msg, 0, dlen, mtu, &pkts);
/* Send message if build was successful */
@@ -853,7 +853,7 @@ static int tipc_send_group_msg(struct net *net, struct tipc_sock *tsk,
msg_set_grp_bc_seqno(hdr, bc_snd_nxt);
/* Build message as chain of buffers */
- skb_queue_head_init(&pkts);
+ __skb_queue_head_init(&pkts);
mtu = tipc_node_get_mtu(net, dnode, tsk->portid);
rc = tipc_msg_build(hdr, m, 0, dlen, mtu, &pkts);
if (unlikely(rc != dlen))
@@ -1058,7 +1058,7 @@ static int tipc_send_group_bcast(struct socket *sock, struct msghdr *m,
msg_set_grp_bc_ack_req(hdr, ack);
/* Build message as chain of buffers */
- skb_queue_head_init(&pkts);
+ __skb_queue_head_init(&pkts);
rc = tipc_msg_build(hdr, m, 0, dlen, mtu, &pkts);
if (unlikely(rc != dlen))
return rc;
@@ -1387,7 +1387,7 @@ static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dlen)
if (unlikely(rc))
return rc;
- skb_queue_head_init(&pkts);
+ __skb_queue_head_init(&pkts);
mtu = tipc_node_get_mtu(net, dnode, tsk->portid);
rc = tipc_msg_build(hdr, m, 0, dlen, mtu, &pkts);
if (unlikely(rc != dlen))
@@ -1445,7 +1445,7 @@ static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dlen)
int send, sent = 0;
int rc = 0;
- skb_queue_head_init(&pkts);
+ __skb_queue_head_init(&pkts);
if (unlikely(dlen > INT_MAX))
return -EMSGSIZE;
@@ -1805,7 +1805,7 @@ static int tipc_recvmsg(struct socket *sock, struct msghdr *m,
/* Send group flow control advertisement when applicable */
if (tsk->group && msg_in_group(hdr) && !grp_evt) {
- skb_queue_head_init(&xmitq);
+ __skb_queue_head_init(&xmitq);
tipc_group_update_rcv_win(tsk->group, tsk_blocks(hlen + dlen),
msg_orignode(hdr), msg_origport(hdr),
&xmitq);
@@ -2674,7 +2674,7 @@ static void tipc_sk_timeout(struct timer_list *t)
struct sk_buff_head list;
int rc = 0;
- skb_queue_head_init(&list);
+ __skb_queue_head_init(&list);
bh_lock_sock(sk);
/* Try again later if socket is busy */
--
2.1.4
^ permalink raw reply related
* Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Alex Williamson @ 2019-08-14 14:57 UTC (permalink / raw)
To: Parav Pandit
Cc: Cornelia Huck, Kirti Wankhede, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, cjia@nvidia.com, Jiri Pirko,
netdev@vger.kernel.org
In-Reply-To: <AM0PR05MB48666CCDFE985A25F42A0259D1AD0@AM0PR05MB4866.eurprd05.prod.outlook.com>
On Wed, 14 Aug 2019 13:45:49 +0000
Parav Pandit <parav@mellanox.com> wrote:
> > -----Original Message-----
> > From: Cornelia Huck <cohuck@redhat.com>
> > Sent: Wednesday, August 14, 2019 6:39 PM
> > To: Parav Pandit <parav@mellanox.com>
> > Cc: Alex Williamson <alex.williamson@redhat.com>; Kirti Wankhede
> > <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux-
> > kernel@vger.kernel.org; cjia@nvidia.com; Jiri Pirko <jiri@mellanox.com>;
> > netdev@vger.kernel.org
> > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> >
> > On Wed, 14 Aug 2019 12:27:01 +0000
> > Parav Pandit <parav@mellanox.com> wrote:
> >
> > > + Jiri, + netdev
> > > To get perspective on the ndo->phys_port_name for the representor netdev
> > of mdev.
> > >
> > > Hi Cornelia,
> > >
> > > > -----Original Message-----
> > > > From: Cornelia Huck <cohuck@redhat.com>
> > > > Sent: Wednesday, August 14, 2019 1:32 PM
> > > > To: Parav Pandit <parav@mellanox.com>
> > > > Cc: Alex Williamson <alex.williamson@redhat.com>; Kirti Wankhede
> > > > <kwankhede@nvidia.com>; kvm@vger.kernel.org; linux-
> > > > kernel@vger.kernel.org; cjia@nvidia.com
> > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > >
> > > > On Wed, 14 Aug 2019 05:54:36 +0000
> > > > Parav Pandit <parav@mellanox.com> wrote:
> > > >
> > > > > > > I get that part. I prefer to remove the UUID itself from the
> > > > > > > structure and therefore removing this API makes lot more sense?
> > > > > >
> > > > > > Mdev and support tools around mdev are based on UUIDs because
> > > > > > it's
> > > > defined
> > > > > > in the documentation.
> > > > > When we introduce newer device naming scheme, it will update the
> > > > documentation also.
> > > > > May be that is the time to move to .rst format too.
> > > >
> > > > You are aware that there are existing tools that expect a uuid
> > > > naming scheme, right?
> > > >
> > > Yes, Alex mentioned too.
> > > The good tool that I am aware of is [1], which is 4 months old. Not sure if it is
> > part of any distros yet.
> > >
> > > README also says, that it is in 'early in development. So we have scope to
> > improve it for non UUID names, but lets discuss that more below.
> >
> > The up-to-date reference for mdevctl is
> > https://github.com/mdevctl/mdevctl. There is currently an effort to get this
> > packaged in Fedora.
> >
> Awesome.
>
> > >
> > > > >
> > > > > > I don't think it's as simple as saying "voila, UUID dependencies
> > > > > > are removed, users are free to use arbitrary strings". We'd
> > > > > > need to create some kind of naming policy, what characters are
> > > > > > allows so that we can potentially expand the creation parameters
> > > > > > as has been proposed a couple times, how do we deal with
> > > > > > collisions and races, and why should we make such a change when
> > > > > > a UUID is a perfectly reasonable devices name. Thanks,
> > > > > >
> > > > > Sure, we should define a policy on device naming to be more relaxed.
> > > > > We have enough examples in-kernel.
> > > > > Few that I am aware of are netdev (vxlan, macvlan, ipvlan, lot
> > > > > more), rdma
> > > > etc which has arbitrary device names and ID based device names.
> > > > >
> > > > > Collisions and race is already taken care today in the mdev core.
> > > > > Same
> > > > unique device names continue.
> > > >
> > > > I'm still completely missing a rationale _why_ uuids are supposedly
> > > > bad/restricting/etc.
> > > There is nothing bad about uuid based naming.
> > > Its just too long name to derive phys_port_name of a netdev.
> > > In details below.
> > >
> > > For a given mdev of networking type, we would like to have
> > > (a) representor netdevice [2]
> > > (b) associated devlink port [3]
> > >
> > > Currently these representor netdevice exist only for the PCIe SR-IOV VFs.
> > > It is further getting extended for mdev without SR-IOV.
> > >
> > > Each of the devlink port is attached to representor netdevice [4].
> > >
> > > This netdevice phys_port_name should be a unique derived from some
> > property of mdev.
> > > Udev/systemd uses phys_port_name to derive unique representor netdev
> > name.
> > > This netdev name is further use by orchestration and switching software in
> > user space.
> > > One such distro supported switching software is ovs [4], which relies on the
> > persistent device name of the representor netdevice.
> >
> > Ok, let me rephrase this to check that I understand this correctly. I'm not sure
> > about some of the terms you use here (even after looking at the linked
> > doc/code), but that's probably still ok.
> >
> > We want to derive an unique (and probably persistent?) netdev name so that
> > userspace can refer to a representor netdevice. Makes sense.
> > For generating that name, udev uses the phys_port_name (which represents
> > the devlink port, IIUC). Also makes sense.
> >
> You understood it correctly.
>
> > >
> > > phys_port_name has limitation to be only 15 characters long.
> > > UUID doesn't fit in phys_port_name.
> >
> > Understood. But why do we need to derive the phys_port_name from the mdev
> > device name? This netdevice use case seems to be just one use case for using
> > mdev devices? If this is a specialized mdev type for this setup, why not just
> > expose a shorter identifier via an extra attribute?
> >
> Representor netdev, represents mdev's switch port (like PCI SRIOV VF's switch port).
> So user must be able to relate this two objects in similar manner as SRIOV VFs.
> Phys_port_name is derived from the PCI PF and VF numbering scheme.
> Similarly mdev's such port should be derived from mdev's id/name/attribute.
>
> > > Longer UUID names are creating snow ball effect, not just in networking stack
> > but many user space tools too.
> >
> > This snowball effect mainly comes from the device name -> phys_port_name
> > setup, IIUC.
> >
> Right.
>
> > > (as opposed to recently introduced mdevctl, are they more mdev tools
> > > which has dependency on UUID name?)
> >
> > I am aware that people have written scripts etc. to manage their mdevs.
> > Given that the mdev infrastructure has been around for quite some time, I'd
> > say the chance of some of those scripts relying on uuid names is non-zero.
> >
> Ok. but those scripts have never managed networking devices.
> So those scripts won't break because they will always create mdev devices using UUID.
> When they use these new networking devices, they need more things than their scripts.
> So user space upgrade for such mixed mode case is reasonable.
Tools like mdevctl are agnostic of the type of mdev device they're
managing, it shouldn't matter than they've never managed a networking
mdev previously, it follows the standards of mdev management.
> > >
> > > Instead of mdev subsystem creating such effect, one option we are
> > considering is to have shorter mdev names.
> > > (Similar to netdev, rdma, nvme devices).
> > > Such as mdev1, mdev2000 etc.
Note that these are kernel generated names, as are the other examples.
In the case of mdev, the user is providing the UUID, which becomes the
device name. When a user writes to the create attribute, there needs
to be determinism that the user can identify the device they created vs
another that may have been created concurrently. I don't see that we
can put users in the path of managing device instance numbers.
> > > Second option I was considering is to have an optional alias for UUID based
> > mdev.
> > > This name alias is given at time of mdev creation.
> > > Devlink port's phys_port_name is derived out of this shorter mdev name
> > alias.
> > > This way, mdev remains to be UUID based with optional extension.
> > > However, I prefer first option to relax mdev naming scheme.
> >
> > Actually, I think that second option makes much more sense, as you avoid
> > potentially breaking existing tooling.
> Let's first understand of what exactly will break with existing tool
> if they see non_uuid based device.
Do we really want a mixed namespace of device names, some UUID, some...
something else? That seems like a mess.
> Existing tooling continue to work with UUID devices.
> Do you have example of what can break if they see non_uuid based
> device name? I think you are clear, but to be sure, UUID based
> creation will continue to be there. Optionally mdev will be created
> with alpha-numeric string, if we don't it as additional attribute.
I'm not onboard with a UUID being just one of the possible naming
strings via which we can create mdev devices. I think that becomes
untenable for userspace. I don't think a sufficient argument has been
made against the alias approach, which seems to keep the UUID as a
canonical name, providing a consistent namespace, augmented with user
or kernel provided short alias. Thanks,
Alex
> > >
> > > > We want to uniquely identify a device, across different types of
> > > > vendor drivers. An uuid is a unique identifier and even a
> > > > well-defined one. Tools (e.g. mdevctl) are relying on it for
> > > > mdev devices
> > today.
> > > >
> > > > What is the problem you're trying to solve?
> > > Unique device naming is still achieved without UUID scheme by
> > > various
> > subsystems in kernel using alpha-numeric string.
> > > Having such string based continue to provide unique names.
> > >
> > > I hope I described the problem and two solutions above.
> > >
> > > [1] https://github.com/awilliam/mdevctl
> > > [2]
> > > https://elixir.bootlin.com/linux/v5.3-rc4/source/drivers/net/ethernet/
> > > mellanox/mlx5/core/en_rep.c [3]
> > > http://man7.org/linux/man-pages/man8/devlink-port.8.html
> > > [4]
> > > https://elixir.bootlin.com/linux/v5.3-rc4/source/net/core/devlink.c#L6
> > > 921
> > > [5] https://www.openvswitch.org/
> > >
>
^ permalink raw reply
* Re: [PATCH bpf-next v4 3/8] i40e: add support for AF_XDP need_wakeup feature
From: Magnus Karlsson @ 2019-08-14 14:59 UTC (permalink / raw)
To: Jonathan Lemon
Cc: Magnus Karlsson, Björn Töpel, Alexei Starovoitov,
Daniel Borkmann, Network Development, Jesper Dangaard Brouer,
Maxim Mikityanskiy, bpf, bruce.richardson, ciara.loftus,
Jakub Kicinski, Ye Xiaolong, Zhang, Qi Z, Samudrala, Sridhar,
Kevin Laatz, ilias.apalodimas, Kiran, axboe, Fijalkowski, Maciej,
Maciej Fijalkowski, intel-wired-lan
In-Reply-To: <3B2C7C21-4AAC-4126-A31D-58A61D941709@gmail.com>
On Wed, Aug 14, 2019 at 4:48 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>
>
>
> On 14 Aug 2019, at 0:27, Magnus Karlsson wrote:
>
> > This patch adds support for the need_wakeup feature of AF_XDP. If the
> > application has told the kernel that it might sleep using the new bind
> > flag XDP_USE_NEED_WAKEUP, the driver will then set this flag if it has
> > no more buffers on the NIC Rx ring and yield to the application. For
> > Tx, it will set the flag if it has no outstanding Tx completion
> > interrupts and return to the application.
> >
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > ---
> > drivers/net/ethernet/intel/i40e/i40e_xsk.c | 18 ++++++++++++++++++
> > 1 file changed, 18 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > index d0ff5d8..42c9012 100644
> > --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > @@ -626,6 +626,15 @@ int i40e_clean_rx_irq_zc(struct i40e_ring
> > *rx_ring, int budget)
> >
> > i40e_finalize_xdp_rx(rx_ring, xdp_xmit);
> > i40e_update_rx_stats(rx_ring, total_rx_bytes, total_rx_packets);
> > +
> > + if (xsk_umem_uses_need_wakeup(rx_ring->xsk_umem)) {
> > + if (failure || rx_ring->next_to_clean == rx_ring->next_to_use)
> > + xsk_set_rx_need_wakeup(rx_ring->xsk_umem);
> > + else
> > + xsk_clear_rx_need_wakeup(rx_ring->xsk_umem);
> > +
> > + return (int)total_rx_packets;
> > + }
> > return failure ? budget : (int)total_rx_packets;
>
> Can you elaborate why we're not returning the total budget on failure
> for the wakeup case?
In the non need_wakeup case (the old behavior), when allocation fails
from the fill queue we want to retry right away basically busy
spinning on the fill queue until we find at least one entry and then
go on processing packets. Works well when the app and the driver are
on different cores, but a lousy strategy when they execute on the same
core. That is why in the need_wakeup feature case, we do not return
the total budget if there is a failure. We will just come back at a
later point in time from a syscall since the need_wakeup flag will
have been set and check the fill queue again. We do not want a
busy-spinning behavior in this case.
Thanks: Magnus
^ permalink raw reply
* can: tcan4x5x: spi bits_per_word issue on Raspberry PI
From: FIXED-TERM Buecheler Konstantin (ETAS-SEC/ECT-Mu) @ 2019-08-14 15:01 UTC (permalink / raw)
To: linux-can@vger.kernel.org, netdev@vger.kernel.org; +Cc: Dan Murphy
> Hi all,
>
> I am trying to use a tcan4550 together with a Raspberry PI 3 B. I am using the
> tcan4x5x driver from net-next.
> I always get the following error during startup.
> tcan4x5x spi0.0: Probe failed, err=-22
> tcan4x5x: probe of spi0.0 failed with error -22
>
> I realized that this happens because the Raspberry PI does only support 8/9 bit
> words. https://elinux.org/index.php?title=RPi_SPI#Supported_bits_per_word
> In the driver it is set to 32.
> spi->bits_per_word = 32;
>
> Setting this to 8 does not help of course since the tcan chip expects a multiple of
> 32 per spi transaction.
> I don't know if this is a Raspberry Pi specific problem or if there are more devices
> with this hardware limitation.
>
> Does anyone have a workaround for that?
It seems to be enough to just change the bits_per_word value to 8
>
> If this a common issue it might be a good idea to patch the driver. I will check if I
> can find proper a way to do so.
>
> Regards,
> Konstantin
Now I have another really confusing problem. Anything I write to SPI is written little endian. The tcan chip expects big endian.
Anything I read from SPI is treated as little endian but is big endian. Does anyone know why this happens?
Is there a flag or something I can set for the SPI device/wire to fix this?
Regards,
Konstantin
^ permalink raw reply
* Re: [PATCH v2 bpf-next] mm: mmap: increase sockets maximum memory size pgoff for 32bits
From: Ivan Khoronzhuk @ 2019-08-14 15:09 UTC (permalink / raw)
To: Andrew Morton
Cc: bjorn.topel, linux-mm, xdp-newbies, netdev, bpf, linux-kernel,
ast, magnus.karlsson
In-Reply-To: <20190812141924.32136e040904d0c5a819dcb1@linux-foundation.org>
On Mon, Aug 12, 2019 at 02:19:24PM -0700, Andrew Morton wrote:
Hi, Andrew
>On Mon, 12 Aug 2019 15:43:26 +0300 Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
>
>> The AF_XDP sockets umem mapping interface uses XDP_UMEM_PGOFF_FILL_RING
>> and XDP_UMEM_PGOFF_COMPLETION_RING offsets. The offsets seems like are
>> established already and are part of configuration interface.
>>
>> But for 32-bit systems, while AF_XDP socket configuration, the values
>> are to large to pass maximum allowed file size verification.
>> The offsets can be tuned ofc, but instead of changing existent
>> interface - extend max allowed file size for sockets.
>
>
>What are the implications of this? That all code in the kernel which
>handles mapped sockets needs to be audited (and tested) for correctly
>handling mappings larger than 4G on 32-bit machines? Has that been
That's to allow only offset to be passed, mapping length is less than 4Gb.
I have verified all list of mmap for sockets and all of them contain dummy
cb sock_no_mmap() except the following:
xsk_mmap()
tcp_mmap()
packet_mmap()
xsk_mmap() - it's what this fix is needed for.
tcp_mmap() - doesn't have obvious issues with pgoff - no any references on it.
packet_mmap() - return -EINVAL if it's even set.
>done? Are we confident that we aren't introducing user-visible buggy
>behaviour into unsuspecting legacy code?
>
>Also... what are the user-visible runtime effects of this change?
>Please send along a paragraph which explains this, for the changelog.
>Does this patch fix some user-visible problem? If so, should be code
>be backported into -stable kernels?
It should go to linux-next, no one has been using it till this patch
with 32 bits as w/o this fix af_xdp sockets can't be used at all.
It unblocks af_xdp socket usage for 32bit systems.
That's example of potential next commit message:
Subject: mm: mmap: increase sockets maximum memory size pgoff for 32bits
The AF_XDP sockets umem mapping interface uses XDP_UMEM_PGOFF_FILL_RING
and XDP_UMEM_PGOFF_COMPLETION_RING offsets. These offsets are established
already and are part of the configuration interface.
But for 32-bit systems, using AF_XDP socket configuration, these values
are too large to pass the maximum allowed file size verification. The
offsets can be tuned off, but instead of changing the existing interface,
let's extend the max allowed file size for sockets.
No one has been using it till this patch with 32 bits as w/o this fix
af_xdp sockets can't be used at all, so it unblocks af_xdp socket usage
for 32bit systems.
All list of mmap cbs for sockets were verified on side effects and
all of them contain dummy cb - sock_no_mmap() at this moment, except the
following:
xsk_mmap() - it's what this fix is needed for.
tcp_mmap() - doesn't have obvious issues with pgoff - no any references on it.
packet_mmap() - return -EINVAL if it's even set.
Is it ok to be replicated in PATCH v2 or this explanation is enough here
to use v1?
--
Regards,
Ivan Khoronzhuk
^ permalink raw reply
* [PATCH v2 2/2] treewide: remove dummy Makefiles for single targets
From: Masahiro Yamada @ 2019-08-14 15:19 UTC (permalink / raw)
To: linux-kbuild
Cc: Masahiro Yamada, Alexei Starovoitov, Boris Pismenny,
Daniel Borkmann, David S. Miller, Igor Russkikh, Jakub Kicinski,
Leon Romanovsky, Martin KaFai Lau, Saeed Mahameed, Song Liu,
Yonghong Song, bpf, linux-kernel, linux-rdma, netdev, oss-drivers
In-Reply-To: <20190814151919.16300-1-yamada.masahiro@socionext.com>
Now that the single target build descends into sub-directories in the
same way as the normal build, these dummy Makefiles are not needed
any more.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
---
Changes in v2: None
drivers/net/ethernet/aquantia/atlantic/hw_atl/Makefile | 2 --
drivers/net/ethernet/mellanox/mlx5/core/accel/Makefile | 2 --
drivers/net/ethernet/mellanox/mlx5/core/diag/Makefile | 2 --
drivers/net/ethernet/mellanox/mlx5/core/en/Makefile | 2 --
drivers/net/ethernet/mellanox/mlx5/core/en/xsk/Makefile | 1 -
drivers/net/ethernet/mellanox/mlx5/core/en_accel/Makefile | 2 --
drivers/net/ethernet/mellanox/mlx5/core/fpga/Makefile | 2 --
drivers/net/ethernet/mellanox/mlx5/core/ipoib/Makefile | 2 --
drivers/net/ethernet/mellanox/mlx5/core/lib/Makefile | 2 --
drivers/net/ethernet/netronome/nfp/bpf/Makefile | 2 --
drivers/net/ethernet/netronome/nfp/flower/Makefile | 2 --
drivers/net/ethernet/netronome/nfp/nfpcore/Makefile | 2 --
drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000/Makefile | 2 --
drivers/net/ethernet/netronome/nfp/nic/Makefile | 2 --
14 files changed, 27 deletions(-)
delete mode 100644 drivers/net/ethernet/aquantia/atlantic/hw_atl/Makefile
delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/accel/Makefile
delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/diag/Makefile
delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/Makefile
delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/Makefile
delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_accel/Makefile
delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fpga/Makefile
delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/ipoib/Makefile
delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/lib/Makefile
delete mode 100644 drivers/net/ethernet/netronome/nfp/bpf/Makefile
delete mode 100644 drivers/net/ethernet/netronome/nfp/flower/Makefile
delete mode 100644 drivers/net/ethernet/netronome/nfp/nfpcore/Makefile
delete mode 100644 drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000/Makefile
delete mode 100644 drivers/net/ethernet/netronome/nfp/nic/Makefile
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/Makefile b/drivers/net/ethernet/aquantia/atlantic/hw_atl/Makefile
deleted file mode 100644
index 805fa28f391a..000000000000
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-# kbuild requires Makefile in a directory to build individual objects
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/accel/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/accel/Makefile
deleted file mode 100644
index c78512eed8d7..000000000000
--- a/drivers/net/ethernet/mellanox/mlx5/core/accel/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-subdir-ccflags-y += -I$(src)/..
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/diag/Makefile
deleted file mode 100644
index c78512eed8d7..000000000000
--- a/drivers/net/ethernet/mellanox/mlx5/core/diag/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-subdir-ccflags-y += -I$(src)/..
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/en/Makefile
deleted file mode 100644
index c78512eed8d7..000000000000
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-subdir-ccflags-y += -I$(src)/..
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/Makefile
deleted file mode 100644
index 5ee42991900a..000000000000
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/Makefile
+++ /dev/null
@@ -1 +0,0 @@
-subdir-ccflags-y += -I$(src)/../..
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/Makefile
deleted file mode 100644
index c78512eed8d7..000000000000
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-subdir-ccflags-y += -I$(src)/..
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fpga/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/fpga/Makefile
deleted file mode 100644
index c78512eed8d7..000000000000
--- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-subdir-ccflags-y += -I$(src)/..
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/Makefile
deleted file mode 100644
index c78512eed8d7..000000000000
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-subdir-ccflags-y += -I$(src)/..
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/lib/Makefile
deleted file mode 100644
index c78512eed8d7..000000000000
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-subdir-ccflags-y += -I$(src)/..
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/Makefile b/drivers/net/ethernet/netronome/nfp/bpf/Makefile
deleted file mode 100644
index 805fa28f391a..000000000000
--- a/drivers/net/ethernet/netronome/nfp/bpf/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-# kbuild requires Makefile in a directory to build individual objects
diff --git a/drivers/net/ethernet/netronome/nfp/flower/Makefile b/drivers/net/ethernet/netronome/nfp/flower/Makefile
deleted file mode 100644
index 805fa28f391a..000000000000
--- a/drivers/net/ethernet/netronome/nfp/flower/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-# kbuild requires Makefile in a directory to build individual objects
diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/Makefile b/drivers/net/ethernet/netronome/nfp/nfpcore/Makefile
deleted file mode 100644
index 805fa28f391a..000000000000
--- a/drivers/net/ethernet/netronome/nfp/nfpcore/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-# kbuild requires Makefile in a directory to build individual objects
diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000/Makefile b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000/Makefile
deleted file mode 100644
index 805fa28f391a..000000000000
--- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-# kbuild requires Makefile in a directory to build individual objects
diff --git a/drivers/net/ethernet/netronome/nfp/nic/Makefile b/drivers/net/ethernet/netronome/nfp/nic/Makefile
deleted file mode 100644
index 805fa28f391a..000000000000
--- a/drivers/net/ethernet/netronome/nfp/nic/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-# kbuild requires Makefile in a directory to build individual objects
--
2.17.1
^ permalink raw reply related
* [patch net-next v2 0/2] selftests: netdevsim: add devlink paramstests
From: Jiri Pirko @ 2019-08-14 15:26 UTC (permalink / raw)
To: netdev; +Cc: davem, jakub.kicinski, mlxsw
From: Jiri Pirko <jiri@mellanox.com>
The first patch is just a helper addition as a dependency of the actual
test in patch number two.
Jiri Pirko (2):
selftests: net: push jq workaround into separate helper
selftests: netdevsim: add devlink params tests
.../drivers/net/netdevsim/devlink.sh | 62 ++++++++++++++++++-
tools/testing/selftests/net/forwarding/lib.sh | 16 +++++
.../selftests/net/forwarding/tc_common.sh | 17 ++---
3 files changed, 81 insertions(+), 14 deletions(-)
--
2.21.0
^ permalink raw reply
* [patch net-next v2 1/2] selftests: net: push jq workaround into separate helper
From: Jiri Pirko @ 2019-08-14 15:26 UTC (permalink / raw)
To: netdev; +Cc: davem, jakub.kicinski, mlxsw
In-Reply-To: <20190814152604.6385-1-jiri@resnulli.us>
From: Jiri Pirko <jiri@mellanox.com>
Push the jq return value workaround code into a separate helper so it
could be used by the rest of the code.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v1->v2:
-new patch
---
tools/testing/selftests/net/forwarding/lib.sh | 16 ++++++++++++++++
.../selftests/net/forwarding/tc_common.sh | 17 ++++-------------
2 files changed, 20 insertions(+), 13 deletions(-)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 9385dc971269..9d78841efef6 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -250,6 +250,22 @@ setup_wait()
sleep $WAIT_TIME
}
+cmd_jq()
+{
+ local cmd=$1
+ local jq_exp=$2
+ local ret
+ local output
+
+ output="$($cmd)"
+ # workaround the jq bug which causes jq to return 0 in case input is ""
+ ret=$?
+ if [[ $ret -ne 0 ]]; then
+ return $ret
+ fi
+ echo $output | jq -r -e "$jq_exp"
+}
+
lldpad_app_wait_set()
{
local dev=$1; shift
diff --git a/tools/testing/selftests/net/forwarding/tc_common.sh b/tools/testing/selftests/net/forwarding/tc_common.sh
index 9d3b64a2a264..315e934358d4 100644
--- a/tools/testing/selftests/net/forwarding/tc_common.sh
+++ b/tools/testing/selftests/net/forwarding/tc_common.sh
@@ -8,18 +8,9 @@ tc_check_packets()
local id=$1
local handle=$2
local count=$3
- local ret
- output="$(tc -j -s filter show $id)"
- # workaround the jq bug which causes jq to return 0 in case input is ""
- ret=$?
- if [[ $ret -ne 0 ]]; then
- return $ret
- fi
- echo $output | \
- jq -e ".[] \
- | select(.options.handle == $handle) \
- | select(.options.actions[0].stats.packets == $count)" \
- &> /dev/null
- return $?
+ cmd_jq "tc -j -s filter show $id" \
+ ".[] | select(.options.handle == $handle) | \
+ select(.options.actions[0].stats.packets == $count)" \
+ &> /dev/null
}
--
2.21.0
^ permalink raw reply related
* [patch net-next v2 2/2] selftests: netdevsim: add devlink params tests
From: Jiri Pirko @ 2019-08-14 15:26 UTC (permalink / raw)
To: netdev; +Cc: davem, jakub.kicinski, mlxsw
In-Reply-To: <20190814152604.6385-1-jiri@resnulli.us>
From: Jiri Pirko <jiri@mellanox.com>
Test recently added netdevsim devlink param implementation.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v1->v2:
-using cmd_jq helper
---
.../drivers/net/netdevsim/devlink.sh | 62 ++++++++++++++++++-
1 file changed, 61 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/drivers/net/netdevsim/devlink.sh b/tools/testing/selftests/drivers/net/netdevsim/devlink.sh
index 9d8baf5d14b3..6828e9404460 100755
--- a/tools/testing/selftests/drivers/net/netdevsim/devlink.sh
+++ b/tools/testing/selftests/drivers/net/netdevsim/devlink.sh
@@ -3,7 +3,7 @@
lib_dir=$(dirname $0)/../../../net/forwarding
-ALL_TESTS="fw_flash_test"
+ALL_TESTS="fw_flash_test params_test"
NUM_NETIFS=0
source $lib_dir/lib.sh
@@ -30,6 +30,66 @@ fw_flash_test()
log_test "fw flash test"
}
+param_get()
+{
+ local name=$1
+
+ cmd_jq "devlink dev param show $DL_HANDLE name $name -j" \
+ '.[][][].values[] | select(.cmode == "driverinit").value'
+}
+
+param_set()
+{
+ local name=$1
+ local value=$2
+
+ devlink dev param set $DL_HANDLE name $name cmode driverinit value $value
+}
+
+check_value()
+{
+ local name=$1
+ local phase_name=$2
+ local expected_param_value=$3
+ local expected_debugfs_value=$4
+ local value
+
+ value=$(param_get $name)
+ check_err $? "Failed to get $name param value"
+ [ "$value" == "$expected_param_value" ]
+ check_err $? "Unexpected $phase_name $name param value"
+ value=$(<$DEBUGFS_DIR/$name)
+ check_err $? "Failed to get $name debugfs value"
+ [ "$value" == "$expected_debugfs_value" ]
+ check_err $? "Unexpected $phase_name $name debugfs value"
+}
+
+params_test()
+{
+ RET=0
+
+ local max_macs
+ local test1
+
+ check_value max_macs initial 32 32
+ check_value test1 initial true Y
+
+ param_set max_macs 16
+ check_err $? "Failed to set max_macs param value"
+ param_set test1 false
+ check_err $? "Failed to set test1 param value"
+
+ check_value max_macs post-set 16 32
+ check_value test1 post-set false Y
+
+ devlink dev reload $DL_HANDLE
+
+ check_value max_macs post-reload 16 16
+ check_value test1 post-reload false N
+
+ log_test "params test"
+}
+
setup_prepare()
{
modprobe netdevsim
--
2.21.0
^ permalink raw reply related
* Re: [PATCH RFC 1/4] net: phy: swphy: emulate register MII_ESTATUS
From: Andrew Lunn @ 2019-08-14 15:34 UTC (permalink / raw)
To: Heiner Kallweit
Cc: Florian Fainelli, Marek Behun, David Miller,
netdev@vger.kernel.org
In-Reply-To: <5e3d85cd-43b6-c581-be99-b6b0cf025771@gmail.com>
On Tue, Aug 13, 2019 at 11:24:56PM +0200, Heiner Kallweit wrote:
> When the genphy driver binds to a swphy it will call
> genphy_read_abilites that will try to read MII_ESTATUS if BMSR_ESTATEN
> is set in MII_BMSR. So far this would read the default value 0xffff
> and 1000FD and 1000HD are reported as supported just by chance.
> Better add explicit support for emulating MII_ESTATUS.
>
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox