* [PATCH net-next 1/2] net/mlx5e: Introduce mlx5e_flow_esw_attr_init() helper
From: xiangxia.m.yue @ 2019-02-12 3:39 UTC (permalink / raw)
To: saeedm, gerlitz.or; +Cc: netdev, Tonghao Zhang
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Introduce the mlx5e_flow_esw_attr_init() helper
for simplifying codes.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 41 +++++++++++++++++--------
1 file changed, 29 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index e943775..98b002c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -2736,6 +2736,30 @@ static bool is_peer_flow_needed(struct mlx5e_tc_flow *flow)
return err;
}
+static void
+mlx5e_flow_esw_attr_init(struct mlx5_esw_flow_attr *esw_attr,
+ struct mlx5e_priv *priv,
+ struct mlx5e_tc_flow_parse_attr *parse_attr,
+ struct tc_cls_flower_offload *f,
+ struct mlx5_eswitch_rep *in_rep,
+ struct mlx5_core_dev *in_mdev)
+{
+ struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
+
+ esw_attr->parse_attr = parse_attr;
+ esw_attr->chain = f->common.chain_index;
+ esw_attr->prio = TC_H_MAJ(f->common.prio) >> 16;
+
+ esw_attr->in_rep = in_rep;
+ esw_attr->in_mdev = in_mdev;
+
+ if (MLX5_CAP_ESW(esw->dev, counter_eswitch_affinity) ==
+ MLX5_COUNTER_SOURCE_ESWITCH)
+ esw_attr->counter_dev = in_mdev;
+ else
+ esw_attr->counter_dev = priv->mdev;
+}
+
static struct mlx5e_tc_flow *
__mlx5e_add_fdb_flow(struct mlx5e_priv *priv,
struct tc_cls_flower_offload *f,
@@ -2757,28 +2781,21 @@ static bool is_peer_flow_needed(struct mlx5e_tc_flow *flow)
&parse_attr, &flow);
if (err)
goto out;
+
parse_attr->filter_dev = filter_dev;
- flow->esw_attr->parse_attr = parse_attr;
+ mlx5e_flow_esw_attr_init(flow->esw_attr,
+ priv, parse_attr,
+ f, in_rep, in_mdev);
+
err = parse_cls_flower(flow->priv, flow, &parse_attr->spec,
f, filter_dev);
if (err)
goto err_free;
- flow->esw_attr->chain = f->common.chain_index;
- flow->esw_attr->prio = TC_H_MAJ(f->common.prio) >> 16;
err = parse_tc_fdb_actions(priv, &rule->action, parse_attr, flow, extack);
if (err)
goto err_free;
- flow->esw_attr->in_rep = in_rep;
- flow->esw_attr->in_mdev = in_mdev;
-
- if (MLX5_CAP_ESW(esw->dev, counter_eswitch_affinity) ==
- MLX5_COUNTER_SOURCE_ESWITCH)
- flow->esw_attr->counter_dev = in_mdev;
- else
- flow->esw_attr->counter_dev = priv->mdev;
-
err = mlx5e_tc_add_fdb_flow(priv, parse_attr, flow, extack);
if (err)
goto err_free;
--
1.8.3.1
^ permalink raw reply related
* [PATCH net-next 2/2] net/mlx5e: Remove 'parse_attr' argument in mlx5e_tc_add_fdb_flow()
From: xiangxia.m.yue @ 2019-02-12 3:39 UTC (permalink / raw)
To: saeedm, gerlitz.or; +Cc: netdev, Tonghao Zhang
In-Reply-To: <1549942783-56833-1-git-send-email-xiangxia.m.yue@gmail.com>
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
This patch is a little improvement. Simplify the mlx5e_tc_add_fdb_flow().
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 98b002c..15cf26d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -929,13 +929,13 @@ static int mlx5e_attach_encap(struct mlx5e_priv *priv,
static int
mlx5e_tc_add_fdb_flow(struct mlx5e_priv *priv,
- struct mlx5e_tc_flow_parse_attr *parse_attr,
struct mlx5e_tc_flow *flow,
struct netlink_ext_ack *extack)
{
struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
u32 max_chain = mlx5_eswitch_get_chain_range(esw);
struct mlx5_esw_flow_attr *attr = flow->esw_attr;
+ struct mlx5e_tc_flow_parse_attr *parse_attr = attr->parse_attr;
u16 max_prio = mlx5_eswitch_get_prio_range(esw);
struct net_device *out_dev, *encap_dev = NULL;
struct mlx5_fc *counter = NULL;
@@ -967,7 +967,7 @@ static int mlx5e_attach_encap(struct mlx5e_priv *priv,
if (!(attr->dests[out_index].flags & MLX5_ESW_DEST_ENCAP))
continue;
- mirred_ifindex = attr->parse_attr->mirred_ifindex[out_index];
+ mirred_ifindex = parse_attr->mirred_ifindex[out_index];
out_dev = __dev_get_by_index(dev_net(priv->netdev),
mirred_ifindex);
err = mlx5e_attach_encap(priv,
@@ -2796,7 +2796,7 @@ static bool is_peer_flow_needed(struct mlx5e_tc_flow *flow)
if (err)
goto err_free;
- err = mlx5e_tc_add_fdb_flow(priv, parse_attr, flow, extack);
+ err = mlx5e_tc_add_fdb_flow(priv, flow, extack);
if (err)
goto err_free;
--
1.8.3.1
^ permalink raw reply related
* Re: [PATCH] net: phy: at803x: disable delay only for RGMII mode
From: Peter Ujfalusi @ 2019-02-14 13:22 UTC (permalink / raw)
To: Niklas Cassel
Cc: Marc Gonzalez, Andrew Lunn, Florian Fainelli, Vinod Koul,
David S Miller, linux-arm-msm, Bjorn Andersson, netdev,
Nori, Sekhar
In-Reply-To: <20190214123922.GA28897@centauri.ideon.se>
Hi Niklas,
On 14/02/2019 14.39, Niklas Cassel wrote:
>>> So, I've rebased your old patch, see attachment.
>>> I suggest that Peter test it on am335x-evm.
>>
>> with the patch + s/rgmii-txid/rgmii-id in the am335x-evmsk.dts ethernet
>> is working.
>> I don't have am335x-evm to test, but it has the same PHY as evmsk.
>>
>
> Florian's concern was that this PHY driver looked at "phy-mode" from the
> perspective of the MAC rather than the PHY.
> However, if s/rgmii-txid/rgmii-id is the correct fix for am335x-evm,
> then this means that this PHY driver was just broken.
>
> If the driver had misinterpreted the perspective, then the correct
> fix for am335x-evm would have been s/rgmii-txid/rgmii-rxid.
Not sure if I got this right, but:
rgmii-id/txid/rxid is the delay mode between PHY and MAC, right?
on the PHY node it is from the PHY perspective, right?
The errata I have mentioned for am335x say:
"The reset state of RGMII1_IDMODE (bit 4) and RGMII2_IDMODE (bit 5) in
the GMII_SEL register enables internal delay mode on the transmit clock
of the respective RGMII port. The AM335x device does not support
internal delay mode, so RGMII1_IDMODE and RGMII2_IDMODE must be set to 1b."
If the delay mode on the transmit clock is not working on the am335x,
then this translate that the rxid needs to be enabled on the PHY side?
But then why it worked when only the txid was enabled and rxid was not
on the PHY side, and why it works if both txid and rxid is enabled?
Just tried w/ your patch and setting rgmii-rxid for am335x-evmsk and
ethernet is not working, it only works w/ rgmii-id (so both tx and rx
delay is enabled on the PHY side?)
> So considering that this driver seems to be really broken
> (rather then just inverted perspective),
> perhaps we can merge the patch I attached in my previous email after all?
> (Together with a s/rgmii-txid/rgmii-id in the am335x-evmsk.dts.)
at the same time am335x-evm.dts needs to have the same change and most
likely other boards which uses the same PHY needs to be checked?
PS: sorry for my lack of knowledge on the networking stuff...
- Péter
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
^ permalink raw reply
* Re: [PATCH 2/8] net: ethernet: stmmac: update to support all PHY config for stm32mp157c.
From: Andrew Lunn @ 2019-02-14 13:40 UTC (permalink / raw)
To: Christophe Roullier
Cc: robh, davem, joabreu, mark.rutland, mcoquelin.stm32,
alexandre.torgue, peppe.cavallaro, linux-stm32, linux-kernel,
devicetree, linux-arm-kernel, netdev
In-Reply-To: <1550126763-22669-3-git-send-email-christophe.roullier@st.com>
On Thu, Feb 14, 2019 at 07:45:57AM +0100, Christophe Roullier wrote:
> @@ -131,19 +185,19 @@ static int stm32mp1_set_mode(struct plat_stmmacenet_data *plat_dat)
> case PHY_INTERFACE_MODE_RGMII:
> val = SYSCFG_PMCR_ETH_SEL_RGMII;
> - if (dwmac->int_phyclk)
> + if (dwmac->eth_clk_sel_reg)
> val |= SYSCFG_PMCR_ETH_CLK_SEL;
> pr_debug("SYSCFG init : PHY_INTERFACE_MODE_RGMII\n");
> break;
Hi Christophe
This code should handle all 4 PHY_INTERFACE_MODE_RGMII* values.
Andrew
^ permalink raw reply
* Re: [PATCH 7/8] ARM: dts: stm32: Add Ethernet support on stm32h7 SOC and activate it for eval and disco boards
From: Andrew Lunn @ 2019-02-14 13:44 UTC (permalink / raw)
To: Christophe Roullier
Cc: robh, davem, joabreu, mark.rutland, mcoquelin.stm32,
alexandre.torgue, peppe.cavallaro, linux-stm32, linux-kernel,
devicetree, linux-arm-kernel, netdev
In-Reply-To: <1550126763-22669-8-git-send-email-christophe.roullier@st.com>
On Thu, Feb 14, 2019 at 07:46:02AM +0100, Christophe Roullier wrote:
> + mdio0 {
> + #address-cells = <1>;
> + #size-cells = <0>;
> + compatible = "snps,dwmac-mdio";
> + phy1: ethernet-phy@1 {
> + reg = <0>;
> + };
The reg value should be the same as ethernet-phy@X.
Andrew
> + mdio0 {
> + #address-cells = <1>;
> + #size-cells = <0>;
> + compatible = "snps,dwmac-mdio";
> + phy1: ethernet-phy@1 {
> + reg = <0>;
> + };
Here as well.
Andrew
^ permalink raw reply
* Re: [PATCH iproute2] lib/libnetlink: ensure a minimum of 32KB for the buffer used in rtnl_recvmsg()
From: Michal Kubecek @ 2019-02-14 13:49 UTC (permalink / raw)
To: netdev
Cc: David Ahern, Eric Dumazet, Stephen Hemminger, Eric Dumazet,
Hangbin Liu, Phil Sutter
In-Reply-To: <b42f0dcb-3c8c-9797-a9f1-da71642e26cc@gmail.com>
On Tue, Feb 12, 2019 at 07:04:17PM -0700, David Ahern wrote:
>
> Do we know of any single message sizes > 32k? 2d34851cd341 cites
> increasing VF's but at some point there is a limit. If not, the whole
> PEEK thing should go away and we just malloc 32k (or 64k) buffers for
> each recvmsg.
IFLA_VF_LIST is by far the biggest thing I have seen so far. I don't
remember exact numbers but the issue with 32KB buffer (for the whole
RTM_NELINK message) was encountered by some of our customers with NICs
having 120 or 128 VFs.
There is a bigger issue with IFLA_VFINFO_LIST, though, as it's an
attribute so that netlink limits its size to 64 KB. IIRC with current
size of IFLA_VF_INFO this would be reached with 270-280 VFs (I'm sure
the number was higer than 256 but not too much higher.)
This would mean unless we let something else grow too much, the whole
message shouldn't get much bigger than 64 KB. And if we can find some
other solution (e.g. passing VF information in separate messages if
client declares support), even 32 KB would be more than enough.
Michal Kubecek
^ permalink raw reply
* Re: [PATCH iproute2] lib/libnetlink: ensure a minimum of 32KB for the buffer used in rtnl_recvmsg()
From: Michal Kubecek @ 2019-02-14 13:51 UTC (permalink / raw)
To: netdev
Cc: Hangbin Liu, David Ahern, Eric Dumazet, Stephen Hemminger,
Phil Sutter
In-Reply-To: <20190213062012.GC10051@dhcp-12-139.nay.redhat.com>
On Wed, Feb 13, 2019 at 02:20:12PM +0800, Hangbin Liu wrote:
>
> > > Do we know of any single message sizes > 32k? 2d34851cd341 cites
> > > increasing VF's but at some point there is a limit. If not, the whole
> > > PEEK thing should go away and we just malloc 32k (or 64k) buffers for
> > > each recvmsg.
>
> Apart from the 200 VFs example, I think there will be more and more virtual
> interfaces be used in cloud environment, like openstack/OVS, so useing 32K
> or 64K is still not safe.
Many intefraces are not a problem. Those will have separate messages so
that they don't need to fit into one buffer.
Michal Kubecek
^ permalink raw reply
* Re: [PATCH iproute2] iplink: document XDP subcommand to force the XDP mode.
From: Matteo Croce @ 2019-02-14 14:01 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev, David Ahern, Stephen Hemminger, Jakub Kicinski
In-Reply-To: <20190213140403.5fcf761e@shemminger-XPS-13-9360>
On Wed, Feb 13, 2019 at 11:04 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Wed, 13 Feb 2019 15:40:30 +0100
> Matteo Croce <mcroce@redhat.com> wrote:
>
> > When attaching an eBPF program to a device, ip link can force the XDP mode
> > by using the xdp{generic,drv,offload} keyword instead of just 'xdp'.
> > Document this behaviour also in the help output.
> >
> > Signed-off-by: Matteo Croce <mcroce@redhat.com>
> > Fixes: 14683814 ("bpf: add xdpdrv for requesting XDP driver mode")
> > Fixes: 1b5e8094 ("bpf: allow requesting XDP HW offload")
>
> Applied, thanks.
> The man page already has this as well.
>
Yes, I found it just after I made the patch. However, it could be nice
to have the generic "xdp" and a command like "type" or "mode" to
specify the XDP mode, eg.
ip link set dev eth0 xdp mode [ auto | generic | drv | offload ]
I was trying to add it, but unfortunately it seems that the arguments
aren't parsed in a loop, and are required to be in the exact order.
Would this change make sense?
Regards,
--
Matteo Croce
per aspera ad upstream
^ permalink raw reply
* [v3 PATCH 0/4] mac80211: Fix incorrect usage of rhashtable walk API
From: Herbert Xu @ 2019-02-14 14:02 UTC (permalink / raw)
To: David Miller, johannes, linux-wireless, netdev, j, tgraf,
johannes.berg, Julia Lawall
In-Reply-To: <20190213143853.labj6zdcsoupkris@gondor.apana.org.au>
Hi:
v3 fixes a bug in patch two where we would return a NULL pointer
when we should return an existing path.
The first two patches in this series are bug fixes and should be
backported to stable.
They fixes a number of issues with the use of the rhashtable API
in mac80211. First of all it converts the use of rashtable walks over
to a simple linked list. This is because an rhashtable walk is
inherently unstable and not meant for uses that require stability,
e.g., when you're trying to lookup an object to delete.
It also fixes a potential memory leak when the rhashtable insertion
fails (which can occur due to OOM).
The third patch is a code-cleanup to mac80211 while the last patch
removes an obsolete rhashtable API.
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* [PATCH 2/4] mac80211: Free mpath object when rhashtable insertion fails
From: Herbert Xu @ 2019-02-14 14:03 UTC (permalink / raw)
To: David Miller, johannes, linux-wireless, netdev, j, tgraf,
johannes.berg, Julia Lawall
In-Reply-To: <20190214140236.omt74prxhkfaasue@gondor.apana.org.au>
When rhashtable insertion fails the mesh table code doesn't free
the now-orphan mesh path object. This patch fixes that.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
net/mac80211/mesh_pathtbl.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/net/mac80211/mesh_pathtbl.c b/net/mac80211/mesh_pathtbl.c
index 884a0d212e8b..c3a7396fb955 100644
--- a/net/mac80211/mesh_pathtbl.c
+++ b/net/mac80211/mesh_pathtbl.c
@@ -436,17 +436,15 @@ struct mesh_path *mesh_path_add(struct ieee80211_sub_if_data *sdata,
} while (unlikely(ret == -EEXIST && !mpath));
spin_unlock_bh(&tbl->walk_lock);
- if (ret && ret != -EEXIST)
- return ERR_PTR(ret);
-
- /* At this point either new_mpath was added, or we found a
- * matching entry already in the table; in the latter case
- * free the unnecessary new entry.
- */
- if (ret == -EEXIST) {
+ if (ret) {
kfree(new_mpath);
+
+ if (ret != -EEXIST)
+ return ERR_PTR(ret);
+
new_mpath = mpath;
}
+
sdata->u.mesh.mesh_paths_generation++;
return new_mpath;
}
@@ -481,6 +479,9 @@ int mpp_path_add(struct ieee80211_sub_if_data *sdata,
hlist_add_head_rcu(&new_mpath->walk_list, &tbl->walk_head);
spin_unlock_bh(&tbl->walk_lock);
+ if (ret)
+ kfree(new_mpath);
+
sdata->u.mesh.mpp_paths_generation++;
return ret;
}
^ permalink raw reply related
* [PATCH 4/4] rhashtable: Remove obsolete rhashtable_walk_init function
From: Herbert Xu @ 2019-02-14 14:03 UTC (permalink / raw)
To: David Miller, johannes, linux-wireless, netdev, j, tgraf,
johannes.berg, Julia Lawall
In-Reply-To: <20190214140236.omt74prxhkfaasue@gondor.apana.org.au>
The rhashtable_walk_init function has been obsolete for more than
two years. This patch finally converts its last users over to
rhashtable_walk_enter and removes it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
include/linux/rhashtable.h | 8 --------
lib/rhashtable.c | 2 +-
lib/test_rhashtable.c | 9 ++-------
net/ipv6/ila/ila_xlat.c | 15 +++------------
net/netlink/af_netlink.c | 10 +---------
5 files changed, 7 insertions(+), 37 deletions(-)
diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index 20f9c6af7473..ae9c0f71f311 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -1113,14 +1113,6 @@ static inline int rhashtable_replace_fast(
return err;
}
-/* Obsolete function, do not use in new code. */
-static inline int rhashtable_walk_init(struct rhashtable *ht,
- struct rhashtable_iter *iter, gfp_t gfp)
-{
- rhashtable_walk_enter(ht, iter);
- return 0;
-}
-
/**
* rhltable_walk_enter - Initialise an iterator
* @hlt: Table to walk over
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 852ffa5160f1..0a105d4af166 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -682,7 +682,7 @@ EXPORT_SYMBOL_GPL(rhashtable_walk_enter);
* rhashtable_walk_exit - Free an iterator
* @iter: Hash table Iterator
*
- * This function frees resources allocated by rhashtable_walk_init.
+ * This function frees resources allocated by rhashtable_walk_enter.
*/
void rhashtable_walk_exit(struct rhashtable_iter *iter)
{
diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index 6a8ac7626797..d96962eea942 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -177,16 +177,11 @@ static int __init test_rht_lookup(struct rhashtable *ht, struct test_obj *array,
static void test_bucket_stats(struct rhashtable *ht, unsigned int entries)
{
- unsigned int err, total = 0, chain_len = 0;
+ unsigned int total = 0, chain_len = 0;
struct rhashtable_iter hti;
struct rhash_head *pos;
- err = rhashtable_walk_init(ht, &hti, GFP_KERNEL);
- if (err) {
- pr_warn("Test failed: allocation error");
- return;
- }
-
+ rhashtable_walk_enter(ht, &hti);
rhashtable_walk_start(&hti);
while ((pos = rhashtable_walk_next(&hti))) {
diff --git a/net/ipv6/ila/ila_xlat.c b/net/ipv6/ila/ila_xlat.c
index 17c455ff69ff..ae6cd4cef8db 100644
--- a/net/ipv6/ila/ila_xlat.c
+++ b/net/ipv6/ila/ila_xlat.c
@@ -385,10 +385,7 @@ int ila_xlat_nl_cmd_flush(struct sk_buff *skb, struct genl_info *info)
spinlock_t *lock;
int ret;
- ret = rhashtable_walk_init(&ilan->xlat.rhash_table, &iter, GFP_KERNEL);
- if (ret)
- goto done;
-
+ rhashtable_walk_enter(&ilan->xlat.rhash_table, &iter);
rhashtable_walk_start(&iter);
for (;;) {
@@ -509,23 +506,17 @@ int ila_xlat_nl_dump_start(struct netlink_callback *cb)
struct net *net = sock_net(cb->skb->sk);
struct ila_net *ilan = net_generic(net, ila_net_id);
struct ila_dump_iter *iter;
- int ret;
iter = kmalloc(sizeof(*iter), GFP_KERNEL);
if (!iter)
return -ENOMEM;
- ret = rhashtable_walk_init(&ilan->xlat.rhash_table, &iter->rhiter,
- GFP_KERNEL);
- if (ret) {
- kfree(iter);
- return ret;
- }
+ rhashtable_walk_enter(&ilan->xlat.rhash_table, &iter->rhiter);
iter->skip = 0;
cb->args[0] = (long)iter;
- return ret;
+ return 0;
}
int ila_xlat_nl_dump_done(struct netlink_callback *cb)
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 3c023d6120f6..f369cc751cea 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2541,15 +2541,7 @@ struct nl_seq_iter {
static int netlink_walk_start(struct nl_seq_iter *iter)
{
- int err;
-
- err = rhashtable_walk_init(&nl_table[iter->link].hash, &iter->hti,
- GFP_KERNEL);
- if (err) {
- iter->link = MAX_LINKS;
- return err;
- }
-
+ rhashtable_walk_enter(&nl_table[iter->link].hash, &iter->hti);
rhashtable_walk_start(&iter->hti);
return 0;
^ permalink raw reply related
* [PATCH 1/4] mac80211: Use linked list instead of rhashtable walk for mesh tables
From: Herbert Xu @ 2019-02-14 14:03 UTC (permalink / raw)
To: David Miller, johannes, linux-wireless, netdev, j, tgraf,
johannes.berg, Julia Lawall
In-Reply-To: <20190214140236.omt74prxhkfaasue@gondor.apana.org.au>
The mesh table code walks over hash tables for two purposes. First of
all it's used as part of a netlink dump process, but it is also used
for looking up entries to delete using criteria other than the hash
key.
The second purpose is directly contrary to the design specification
of rhashtable walks. It is only meant for use by netlink dumps.
This is because rhashtable is resizable and you cannot obtain a
stable walk over it during a resize process.
In fact mesh's use of rhashtable for dumping is bogus too. Rather
than using rhashtable walk's iterator to keep track of the current
position, it always converts the current position to an integer
which defeats the purpose of the iterator.
Therefore this patch converts all uses of rhashtable walk into a
simple linked list.
This patch also adds a new spin lock to protect the hash table
insertion/removal as well as the walk list modifications. In fact
the previous code was buggy as the removals can race with each
other, potentially resulting in a double-free.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
net/mac80211/mesh.h | 6 +
net/mac80211/mesh_pathtbl.c | 138 +++++++++++---------------------------------
2 files changed, 43 insertions(+), 101 deletions(-)
diff --git a/net/mac80211/mesh.h b/net/mac80211/mesh.h
index cad6592c52a1..2ec7011a4d07 100644
--- a/net/mac80211/mesh.h
+++ b/net/mac80211/mesh.h
@@ -70,6 +70,7 @@ enum mesh_deferred_task_flags {
* @dst: mesh path destination mac address
* @mpp: mesh proxy mac address
* @rhash: rhashtable list pointer
+ * @walk_list: linked list containing all mesh_path objects.
* @gate_list: list pointer for known gates list
* @sdata: mesh subif
* @next_hop: mesh neighbor to which frames for this destination will be
@@ -105,6 +106,7 @@ struct mesh_path {
u8 dst[ETH_ALEN];
u8 mpp[ETH_ALEN]; /* used for MPP or MAP */
struct rhash_head rhash;
+ struct hlist_node walk_list;
struct hlist_node gate_list;
struct ieee80211_sub_if_data *sdata;
struct sta_info __rcu *next_hop;
@@ -133,12 +135,16 @@ struct mesh_path {
* gate's mpath may or may not be resolved and active.
* @gates_lock: protects updates to known_gates
* @rhead: the rhashtable containing struct mesh_paths, keyed by dest addr
+ * @walk_head: linked list containging all mesh_path objects
+ * @walk_lock: lock protecting walk_head
* @entries: number of entries in the table
*/
struct mesh_table {
struct hlist_head known_gates;
spinlock_t gates_lock;
struct rhashtable rhead;
+ struct hlist_head walk_head;
+ spinlock_t walk_lock;
atomic_t entries; /* Up to MAX_MESH_NEIGHBOURS */
};
diff --git a/net/mac80211/mesh_pathtbl.c b/net/mac80211/mesh_pathtbl.c
index a5125624a76d..884a0d212e8b 100644
--- a/net/mac80211/mesh_pathtbl.c
+++ b/net/mac80211/mesh_pathtbl.c
@@ -59,8 +59,10 @@ static struct mesh_table *mesh_table_alloc(void)
return NULL;
INIT_HLIST_HEAD(&newtbl->known_gates);
+ INIT_HLIST_HEAD(&newtbl->walk_head);
atomic_set(&newtbl->entries, 0);
spin_lock_init(&newtbl->gates_lock);
+ spin_lock_init(&newtbl->walk_lock);
return newtbl;
}
@@ -249,28 +251,15 @@ mpp_path_lookup(struct ieee80211_sub_if_data *sdata, const u8 *dst)
static struct mesh_path *
__mesh_path_lookup_by_idx(struct mesh_table *tbl, int idx)
{
- int i = 0, ret;
- struct mesh_path *mpath = NULL;
- struct rhashtable_iter iter;
-
- ret = rhashtable_walk_init(&tbl->rhead, &iter, GFP_ATOMIC);
- if (ret)
- return NULL;
-
- rhashtable_walk_start(&iter);
+ int i = 0;
+ struct mesh_path *mpath;
- while ((mpath = rhashtable_walk_next(&iter))) {
- if (IS_ERR(mpath) && PTR_ERR(mpath) == -EAGAIN)
- continue;
- if (IS_ERR(mpath))
- break;
+ hlist_for_each_entry_rcu(mpath, &tbl->walk_head, walk_list) {
if (i++ == idx)
break;
}
- rhashtable_walk_stop(&iter);
- rhashtable_walk_exit(&iter);
- if (IS_ERR(mpath) || !mpath)
+ if (!mpath)
return NULL;
if (mpath_expired(mpath)) {
@@ -432,6 +421,7 @@ struct mesh_path *mesh_path_add(struct ieee80211_sub_if_data *sdata,
return ERR_PTR(-ENOMEM);
tbl = sdata->u.mesh.mesh_paths;
+ spin_lock_bh(&tbl->walk_lock);
do {
ret = rhashtable_lookup_insert_fast(&tbl->rhead,
&new_mpath->rhash,
@@ -441,8 +431,10 @@ struct mesh_path *mesh_path_add(struct ieee80211_sub_if_data *sdata,
mpath = rhashtable_lookup_fast(&tbl->rhead,
dst,
mesh_rht_params);
-
+ else if (!ret)
+ hlist_add_head(&new_mpath->walk_list, &tbl->walk_head);
} while (unlikely(ret == -EEXIST && !mpath));
+ spin_unlock_bh(&tbl->walk_lock);
if (ret && ret != -EEXIST)
return ERR_PTR(ret);
@@ -480,9 +472,14 @@ int mpp_path_add(struct ieee80211_sub_if_data *sdata,
memcpy(new_mpath->mpp, mpp, ETH_ALEN);
tbl = sdata->u.mesh.mpp_paths;
+
+ spin_lock_bh(&tbl->walk_lock);
ret = rhashtable_lookup_insert_fast(&tbl->rhead,
&new_mpath->rhash,
mesh_rht_params);
+ if (!ret)
+ hlist_add_head_rcu(&new_mpath->walk_list, &tbl->walk_head);
+ spin_unlock_bh(&tbl->walk_lock);
sdata->u.mesh.mpp_paths_generation++;
return ret;
@@ -503,20 +500,9 @@ void mesh_plink_broken(struct sta_info *sta)
struct mesh_table *tbl = sdata->u.mesh.mesh_paths;
static const u8 bcast[ETH_ALEN] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
struct mesh_path *mpath;
- struct rhashtable_iter iter;
- int ret;
-
- ret = rhashtable_walk_init(&tbl->rhead, &iter, GFP_ATOMIC);
- if (ret)
- return;
-
- rhashtable_walk_start(&iter);
- while ((mpath = rhashtable_walk_next(&iter))) {
- if (IS_ERR(mpath) && PTR_ERR(mpath) == -EAGAIN)
- continue;
- if (IS_ERR(mpath))
- break;
+ rcu_read_lock();
+ hlist_for_each_entry_rcu(mpath, &tbl->walk_head, walk_list) {
if (rcu_access_pointer(mpath->next_hop) == sta &&
mpath->flags & MESH_PATH_ACTIVE &&
!(mpath->flags & MESH_PATH_FIXED)) {
@@ -530,8 +516,7 @@ void mesh_plink_broken(struct sta_info *sta)
WLAN_REASON_MESH_PATH_DEST_UNREACHABLE, bcast);
}
}
- rhashtable_walk_stop(&iter);
- rhashtable_walk_exit(&iter);
+ rcu_read_unlock();
}
static void mesh_path_free_rcu(struct mesh_table *tbl,
@@ -551,6 +536,7 @@ static void mesh_path_free_rcu(struct mesh_table *tbl,
static void __mesh_path_del(struct mesh_table *tbl, struct mesh_path *mpath)
{
+ hlist_del_rcu(&mpath->walk_list);
rhashtable_remove_fast(&tbl->rhead, &mpath->rhash, mesh_rht_params);
mesh_path_free_rcu(tbl, mpath);
}
@@ -571,27 +557,14 @@ void mesh_path_flush_by_nexthop(struct sta_info *sta)
struct ieee80211_sub_if_data *sdata = sta->sdata;
struct mesh_table *tbl = sdata->u.mesh.mesh_paths;
struct mesh_path *mpath;
- struct rhashtable_iter iter;
- int ret;
-
- ret = rhashtable_walk_init(&tbl->rhead, &iter, GFP_ATOMIC);
- if (ret)
- return;
-
- rhashtable_walk_start(&iter);
-
- while ((mpath = rhashtable_walk_next(&iter))) {
- if (IS_ERR(mpath) && PTR_ERR(mpath) == -EAGAIN)
- continue;
- if (IS_ERR(mpath))
- break;
+ struct hlist_node *n;
+ spin_lock_bh(&tbl->walk_lock);
+ hlist_for_each_entry_safe(mpath, n, &tbl->walk_head, walk_list) {
if (rcu_access_pointer(mpath->next_hop) == sta)
__mesh_path_del(tbl, mpath);
}
-
- rhashtable_walk_stop(&iter);
- rhashtable_walk_exit(&iter);
+ spin_unlock_bh(&tbl->walk_lock);
}
static void mpp_flush_by_proxy(struct ieee80211_sub_if_data *sdata,
@@ -599,51 +572,26 @@ static void mpp_flush_by_proxy(struct ieee80211_sub_if_data *sdata,
{
struct mesh_table *tbl = sdata->u.mesh.mpp_paths;
struct mesh_path *mpath;
- struct rhashtable_iter iter;
- int ret;
-
- ret = rhashtable_walk_init(&tbl->rhead, &iter, GFP_ATOMIC);
- if (ret)
- return;
-
- rhashtable_walk_start(&iter);
-
- while ((mpath = rhashtable_walk_next(&iter))) {
- if (IS_ERR(mpath) && PTR_ERR(mpath) == -EAGAIN)
- continue;
- if (IS_ERR(mpath))
- break;
+ struct hlist_node *n;
+ spin_lock_bh(&tbl->walk_lock);
+ hlist_for_each_entry_safe(mpath, n, &tbl->walk_head, walk_list) {
if (ether_addr_equal(mpath->mpp, proxy))
__mesh_path_del(tbl, mpath);
}
-
- rhashtable_walk_stop(&iter);
- rhashtable_walk_exit(&iter);
+ spin_unlock_bh(&tbl->walk_lock);
}
static void table_flush_by_iface(struct mesh_table *tbl)
{
struct mesh_path *mpath;
- struct rhashtable_iter iter;
- int ret;
-
- ret = rhashtable_walk_init(&tbl->rhead, &iter, GFP_ATOMIC);
- if (ret)
- return;
-
- rhashtable_walk_start(&iter);
+ struct hlist_node *n;
- while ((mpath = rhashtable_walk_next(&iter))) {
- if (IS_ERR(mpath) && PTR_ERR(mpath) == -EAGAIN)
- continue;
- if (IS_ERR(mpath))
- break;
+ spin_lock_bh(&tbl->walk_lock);
+ hlist_for_each_entry_safe(mpath, n, &tbl->walk_head, walk_list) {
__mesh_path_del(tbl, mpath);
}
-
- rhashtable_walk_stop(&iter);
- rhashtable_walk_exit(&iter);
+ spin_unlock_bh(&tbl->walk_lock);
}
/**
@@ -675,7 +623,7 @@ static int table_path_del(struct mesh_table *tbl,
{
struct mesh_path *mpath;
- rcu_read_lock();
+ spin_lock_bh(&tbl->walk_lock);
mpath = rhashtable_lookup_fast(&tbl->rhead, addr, mesh_rht_params);
if (!mpath) {
rcu_read_unlock();
@@ -683,7 +631,7 @@ static int table_path_del(struct mesh_table *tbl,
}
__mesh_path_del(tbl, mpath);
- rcu_read_unlock();
+ spin_unlock_bh(&tbl->walk_lock);
return 0;
}
@@ -854,28 +802,16 @@ void mesh_path_tbl_expire(struct ieee80211_sub_if_data *sdata,
struct mesh_table *tbl)
{
struct mesh_path *mpath;
- struct rhashtable_iter iter;
- int ret;
+ struct hlist_node *n;
- ret = rhashtable_walk_init(&tbl->rhead, &iter, GFP_KERNEL);
- if (ret)
- return;
-
- rhashtable_walk_start(&iter);
-
- while ((mpath = rhashtable_walk_next(&iter))) {
- if (IS_ERR(mpath) && PTR_ERR(mpath) == -EAGAIN)
- continue;
- if (IS_ERR(mpath))
- break;
+ spin_lock_bh(&tbl->walk_lock);
+ hlist_for_each_entry_safe(mpath, n, &tbl->walk_head, walk_list) {
if ((!(mpath->flags & MESH_PATH_RESOLVING)) &&
(!(mpath->flags & MESH_PATH_FIXED)) &&
time_after(jiffies, mpath->exp_time + MESH_PATH_EXPIRE))
__mesh_path_del(tbl, mpath);
}
-
- rhashtable_walk_stop(&iter);
- rhashtable_walk_exit(&iter);
+ spin_unlock_bh(&tbl->walk_lock);
}
void mesh_path_expire(struct ieee80211_sub_if_data *sdata)
^ permalink raw reply related
* [PATCH 3/4] mac80211: Use rhashtable_lookup_get_insert_fast instead of racy code
From: Herbert Xu @ 2019-02-14 14:03 UTC (permalink / raw)
To: David Miller, johannes, linux-wireless, netdev, j, tgraf,
johannes.berg, Julia Lawall
In-Reply-To: <20190214140236.omt74prxhkfaasue@gondor.apana.org.au>
The code in mesh_path_add tries to handle the case where a duplicate
entry is added to the rhashtable by doing a lookup after a failed
insertion. It also tries to handle races by repeating the insertion
should the lookup fail.
This is now unnecessary as we have rhashtable API functions that can
directly return the mathcing object.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
net/mac80211/mesh_pathtbl.c | 24 ++++++++----------------
1 file changed, 8 insertions(+), 16 deletions(-)
diff --git a/net/mac80211/mesh_pathtbl.c b/net/mac80211/mesh_pathtbl.c
index c3a7396fb955..8902395e406e 100644
--- a/net/mac80211/mesh_pathtbl.c
+++ b/net/mac80211/mesh_pathtbl.c
@@ -404,7 +404,6 @@ struct mesh_path *mesh_path_add(struct ieee80211_sub_if_data *sdata,
{
struct mesh_table *tbl;
struct mesh_path *mpath, *new_mpath;
- int ret;
if (ether_addr_equal(dst, sdata->vif.addr))
/* never add ourselves as neighbours */
@@ -422,25 +421,18 @@ struct mesh_path *mesh_path_add(struct ieee80211_sub_if_data *sdata,
tbl = sdata->u.mesh.mesh_paths;
spin_lock_bh(&tbl->walk_lock);
- do {
- ret = rhashtable_lookup_insert_fast(&tbl->rhead,
- &new_mpath->rhash,
- mesh_rht_params);
-
- if (ret == -EEXIST)
- mpath = rhashtable_lookup_fast(&tbl->rhead,
- dst,
- mesh_rht_params);
- else if (!ret)
- hlist_add_head(&new_mpath->walk_list, &tbl->walk_head);
- } while (unlikely(ret == -EEXIST && !mpath));
+ mpath = rhashtable_lookup_get_insert_fast(&tbl->rhead,
+ &new_mpath->rhash,
+ mesh_rht_params);
+ if (!mpath)
+ hlist_add_head(&new_mpath->walk_list, &tbl->walk_head);
spin_unlock_bh(&tbl->walk_lock);
- if (ret) {
+ if (mpath) {
kfree(new_mpath);
- if (ret != -EEXIST)
- return ERR_PTR(ret);
+ if (IS_ERR(mpath))
+ return mpath;
new_mpath = mpath;
}
^ permalink raw reply related
* Re: KASAN: use-after-free Read in sctp_outq_tail
From: Xin Long @ 2019-02-14 14:03 UTC (permalink / raw)
To: Marcelo Ricardo Leitner
Cc: syzbot, davem, LKML, linux-sctp, network dev, Neil Horman,
syzkaller-bugs, Vlad Yasevich
In-Reply-To: <20190213135157.GJ10665@localhost.localdomain>
On Wed, Feb 13, 2019 at 9:52 PM Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
>
> On Wed, Feb 13, 2019 at 12:35:56PM +0800, Xin Long wrote:
> > On Wed, Feb 13, 2019 at 4:00 AM syzbot
> > <syzbot+7823fa3f3e2d69341ea8@syzkaller.appspotmail.com> wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit: d4104460aec1 Add linux-next specific files for 20190211
> > > git tree: linux-next
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=14140124c00000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=c8a112d3b0d6719b
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=7823fa3f3e2d69341ea8
> > > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+7823fa3f3e2d69341ea8@syzkaller.appspotmail.com
> > >
> > > ==================================================================
> > > BUG: KASAN: use-after-free in list_add_tail include/linux/list.h:93 [inline]
> > > BUG: KASAN: use-after-free in sctp_outq_tail_data net/sctp/outqueue.c:105
> > > [inline]
> > > BUG: KASAN: use-after-free in sctp_outq_tail+0x816/0x930
> > > net/sctp/outqueue.c:313
> > > Read of size 8 at addr ffff88807b19a7b8 by task syz-executor.0/30745
> > I think https://patchwork.ozlabs.org/patch/1040500/ will fix this.
>
> I don't think so. Seems it will switch from use-after-free to NULL deref
> instead with that patch.
It will allocate ext for the stream if its ext is NULL in
sctp_sendmsg_to_asoc(), the new data will work fine. As for
the old data on the surplus streams, it has been dropped in
sctp_stream_outq_migrate().
>
> > > CPU: 1 PID: 30745 Comm: syz-executor.0 Not tainted 5.0.0-rc5-next-20190211
> > > #32
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > Google 01/01/2011
> > > Call Trace:
> > > __dump_stack lib/dump_stack.c:77 [inline]
> > > dump_stack+0x172/0x1f0 lib/dump_stack.c:113
> > > print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
> > > kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
> > > __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
> > > list_add_tail include/linux/list.h:93 [inline]
> > > sctp_outq_tail_data net/sctp/outqueue.c:105 [inline]
> > > sctp_outq_tail+0x816/0x930 net/sctp/outqueue.c:313
> > > sctp_cmd_send_msg net/sctp/sm_sideeffect.c:1109 [inline]
> > > sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1784 [inline]
> > > sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
> > > sctp_do_sm+0x68e/0x5380 net/sctp/sm_sideeffect.c:1191
> > > sctp_primitive_SEND+0xa0/0xd0 net/sctp/primitive.c:178
> > > sctp_sendmsg_to_asoc+0xa63/0x17b0 net/sctp/socket.c:1955
>
> sctp_sendmsg_to_asoc()
> ...
> if (sinfo->sinfo_stream >= asoc->stream.outcnt) {
> err = -EINVAL;
> goto err;
> }
>
> if (unlikely(!SCTP_SO(&asoc->stream, sinfo->sinfo_stream)->ext)) {
> ...
>
> It should have aborted even if an old ->ext was still there because
> outcnt is correctly updated. So somehow outcnt was wrong here.
>
> sctp_stream_init()
> ...
> /* Filter out chunks queued on streams that won't exist anymore */
> sched->unsched_all(stream);
> sctp_stream_outq_migrate(stream, NULL, outcnt); <--- [A]
> sched->sched_all(stream);
>
> ret = sctp_stream_alloc_out(stream, outcnt, gfp); <--- [B]
> if (ret)
> goto out;
>
> stream->outcnt = outcnt; <--- [C]
> ...
>
> We have a problem here because [A] is freeing ->ext, but [B] can fail (ENOMEM),
> which would lead it to not update outcnt in [C] even after the
> changes already performed in [A].
>
> It should handle the freeing of ->ext in sctp_stream_alloc_out()
> instead, or allocate the flexarray earlier (so it can bail out before
> freeing stuff).
Agree that it's better to do:
sched->unsched_all(stream);
sctp_stream_outq_migrate(stream, NULL, outcnt);
sched->sched_all(stream);
after the flexarray allocation.
Just sctp_stream_alloc_out() can not be used here anymore, as
stream->out will be set in it.
^ permalink raw reply
* Re: [PATCH 2/4] mac80211: Free mpath object when rhashtable insertion fails
From: Herbert Xu @ 2019-02-14 14:04 UTC (permalink / raw)
To: Johannes Berg; +Cc: David Miller, linux-wireless, netdev, j, tgraf
In-Reply-To: <36adaef5c982ba10444d6f837b0d5c55ac953bdf.camel@sipsolutions.net>
On Wed, Feb 13, 2019 at 04:04:29PM +0100, Johannes Berg wrote:
>
> Yes, that's a small change in behaviour as to when the generation
> counter is updated, but I'm pretty sure it really only needs updating
> when we inserted something, not if we found an old mpath.
I decided not to do this in my patch as it's not directly related
to the kfree issue.
But I agree that this makes more sense and we should make that
change in another patch.
Thanks!
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [B.A.T.M.A.N.] [PATCH v2] batman-adv: Add multicast-to-unicast support for multiple targets
From: Sven Eckelmann @ 2019-02-14 14:04 UTC (permalink / raw)
To: Linus Lüssing; +Cc: b.a.t.m.a.n, David Bauer, Jiri Pirko, netdev
In-Reply-To: <20190214134452.GA1602@otheros>
[-- Attachment #1: Type: text/plain, Size: 1018 bytes --]
On Thursday, 14 February 2019 14:44:52 CET Linus Lüssing wrote:
[...]
> > No new sysfs config files.
>
> Why? The bridge for instance does the same.
https://patchwork.open-mesh.org/patch/16763/ - here the quote
On Samstag, 29. Oktober 2016 12:33:01 CEST Jiri Pirko wrote:
> I strongly believe it is a huge mistake to use sysfs for things like
> this. This should be done via generic netlink api.
We don't need to configuration interfaces - we only need the preferred one. If
this is sysfs for you guys then we should not have started with generic
netlink at all. And why wasn't this brought up now *after* the stuff was
merged by David. It isn't the first time that I've stated clearly that there
should be no new sysfs configuration files when we switch to genl.
If it now preferred to have sysfs again for configuration then please discuss
it with the netdev folks and find out how the new generic netlink interface
can be removed again before the next release.
Kind regards,
Sven
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* [net-next, PATCH] net: stmmac: use correct define to get rx timestamp on GMAC4
From: Alexandre Torgue @ 2019-02-14 14:12 UTC (permalink / raw)
To: Giuseppe Cavallaro, Jose Abreu, davem
Cc: netdev, linux-stm32, linux-arm-kernel, linux-kernel,
Alexandre Torgue
In dwmac4_wrback_get_rx_timestamp_status we looking for a RX timestamp.
For that receive descriptors are handled and so we should use defines
related to receive descriptors. It'll no change the functional behavior
as RDES3_RDES1_VALID=TDES3_RS1V=BIT(26) but it makes code easier to read.
Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
index 20299f6..9f062b3 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
@@ -268,7 +268,7 @@ static int dwmac4_wrback_get_rx_timestamp_status(void *desc, void *next_desc,
int ret = -EINVAL;
/* Get the status from normal w/b descriptor */
- if (likely(p->des3 & TDES3_RS1V)) {
+ if (likely(p->des3 & RDES3_RDES1_VALID)) {
if (likely(le32_to_cpu(p->des1) & RDES1_TIMESTAMP_AVAILABLE)) {
int i = 0;
--
2.7.4
^ permalink raw reply related
* Re: [PATCH net-next v5] ipmr: ip6mr: Create new sockopt to clear mfc cache or vifs
From: Nicolas Dichtel @ 2019-02-14 14:16 UTC (permalink / raw)
To: Callum Sinclair, davem, kuznet, yoshfuji, nikolay, netdev,
linux-kernel
In-Reply-To: <20190214024418.21490-2-callum.sinclair@alliedtelesis.co.nz>
Le 14/02/2019 à 03:44, Callum Sinclair a écrit :
> Currently the only way to clear the forwarding cache was to delete the
> entries one by one using the MRT_DEL_MFC socket option or to destroy and
> recreate the socket.
>
> Create a new socket option which with the use of optional flags can
> clear any combination of multicast entries (static or not static) and
> multicast vifs (static or not static).
>
> Calling the new socket option MRT_FLUSH with the flags MRT_FLUSH_MFC and
> MRT_FLUSH_VIFS will clear all entries and vifs on the socket except for
> static entries.
>
> Signed-off-by: Callum Sinclair <callum.sinclair@alliedtelesis.co.nz>
Except two minor comments (see below),
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
[snip]
> +/* MRT6_FLUSH optional flags */
> +#define MRT6_FLUSH_MFC 1 /* Flush multicast entries */
> +#define MRT6_FLUSH_MFC_STATIC 2 /* Flush static multicast entries */
> +#define MRT6_FLUSH_VIFS 4 /* Flushing multicast vifs */
> +#define MRT6_FLUSH_VIFS_STATIC 8 /* Flush static multicast vifs */
vifs are called mifs in ipv6, maybe it's better to keep the consistency with
MRT6_FLUSH_MIFS and MRT6_FLUSH_MIFS_STATIC.
[snip]
> + if (flags & (MRT6_FLUSH_MFC | MRT6_FLUSH_MFC_STATIC)) {
> + list_for_each_entry_safe(c, tmp, &mrt->mfc_cache_list, list) {
> + if (((c->mfc_flags & MFC_STATIC) && !(flags & MRT6_FLUSH_MFC_STATIC)) ||
> + (!(c->mfc_flags & MFC_STATIC) && !(flags & MRT6_FLUSH_MFC)))
> + continue;
> + rhltable_remove(&mrt->mfc_hash, &c->mnode, ip6mr_rht_params);
> + list_del_rcu(&c->list);
> + call_ip6mr_mfc_entry_notifiers(read_pnet(&mrt->net),
> + FIB_EVENT_ENTRY_DEL,
> + (struct mfc6_cache *)c, mrt->id);
Two many tabs here.
Regards,
Nicolas
^ permalink raw reply
* Re: KASAN: use-after-free Read in sctp_outq_tail
From: Marcelo Ricardo Leitner @ 2019-02-14 14:17 UTC (permalink / raw)
To: Xin Long
Cc: syzbot, davem, LKML, linux-sctp, network dev, Neil Horman,
syzkaller-bugs, Vlad Yasevich
In-Reply-To: <CADvbK_fwUdU3_CWBFYfw+ZeOLvEF9p6koP2FUdo+m0SCqGN+Ug@mail.gmail.com>
On Thu, Feb 14, 2019 at 10:03:30PM +0800, Xin Long wrote:
> On Wed, Feb 13, 2019 at 9:52 PM Marcelo Ricardo Leitner
> <marcelo.leitner@gmail.com> wrote:
> >
> > On Wed, Feb 13, 2019 at 12:35:56PM +0800, Xin Long wrote:
> > > On Wed, Feb 13, 2019 at 4:00 AM syzbot
> > > <syzbot+7823fa3f3e2d69341ea8@syzkaller.appspotmail.com> wrote:
> > > >
> > > > Hello,
> > > >
> > > > syzbot found the following crash on:
> > > >
> > > > HEAD commit: d4104460aec1 Add linux-next specific files for 20190211
> > > > git tree: linux-next
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=14140124c00000
> > > > kernel config: https://syzkaller.appspot.com/x/.config?x=c8a112d3b0d6719b
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=7823fa3f3e2d69341ea8
> > > > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > > >
> > > > Unfortunately, I don't have any reproducer for this crash yet.
> > > >
> > > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > > Reported-by: syzbot+7823fa3f3e2d69341ea8@syzkaller.appspotmail.com
> > > >
> > > > ==================================================================
> > > > BUG: KASAN: use-after-free in list_add_tail include/linux/list.h:93 [inline]
> > > > BUG: KASAN: use-after-free in sctp_outq_tail_data net/sctp/outqueue.c:105
> > > > [inline]
> > > > BUG: KASAN: use-after-free in sctp_outq_tail+0x816/0x930
> > > > net/sctp/outqueue.c:313
> > > > Read of size 8 at addr ffff88807b19a7b8 by task syz-executor.0/30745
> > > I think https://patchwork.ozlabs.org/patch/1040500/ will fix this.
> >
> > I don't think so. Seems it will switch from use-after-free to NULL deref
> > instead with that patch.
> It will allocate ext for the stream if its ext is NULL in
> sctp_sendmsg_to_asoc(), the new data will work fine. As for
Ehh, right!
> the old data on the surplus streams, it has been dropped in
> sctp_stream_outq_migrate().
Yep.
>
> >
> > > > CPU: 1 PID: 30745 Comm: syz-executor.0 Not tainted 5.0.0-rc5-next-20190211
> > > > #32
> > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > > Google 01/01/2011
> > > > Call Trace:
> > > > __dump_stack lib/dump_stack.c:77 [inline]
> > > > dump_stack+0x172/0x1f0 lib/dump_stack.c:113
> > > > print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
> > > > kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
> > > > __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
> > > > list_add_tail include/linux/list.h:93 [inline]
> > > > sctp_outq_tail_data net/sctp/outqueue.c:105 [inline]
> > > > sctp_outq_tail+0x816/0x930 net/sctp/outqueue.c:313
> > > > sctp_cmd_send_msg net/sctp/sm_sideeffect.c:1109 [inline]
> > > > sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1784 [inline]
> > > > sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
> > > > sctp_do_sm+0x68e/0x5380 net/sctp/sm_sideeffect.c:1191
> > > > sctp_primitive_SEND+0xa0/0xd0 net/sctp/primitive.c:178
> > > > sctp_sendmsg_to_asoc+0xa63/0x17b0 net/sctp/socket.c:1955
> >
> > sctp_sendmsg_to_asoc()
> > ...
> > if (sinfo->sinfo_stream >= asoc->stream.outcnt) {
> > err = -EINVAL;
> > goto err;
> > }
> >
> > if (unlikely(!SCTP_SO(&asoc->stream, sinfo->sinfo_stream)->ext)) {
> > ...
> >
> > It should have aborted even if an old ->ext was still there because
> > outcnt is correctly updated. So somehow outcnt was wrong here.
> >
> > sctp_stream_init()
> > ...
> > /* Filter out chunks queued on streams that won't exist anymore */
> > sched->unsched_all(stream);
> > sctp_stream_outq_migrate(stream, NULL, outcnt); <--- [A]
> > sched->sched_all(stream);
> >
> > ret = sctp_stream_alloc_out(stream, outcnt, gfp); <--- [B]
> > if (ret)
> > goto out;
> >
> > stream->outcnt = outcnt; <--- [C]
> > ...
> >
> > We have a problem here because [A] is freeing ->ext, but [B] can fail (ENOMEM),
> > which would lead it to not update outcnt in [C] even after the
> > changes already performed in [A].
> >
> > It should handle the freeing of ->ext in sctp_stream_alloc_out()
> > instead, or allocate the flexarray earlier (so it can bail out before
> > freeing stuff).
> Agree that it's better to do:
> sched->unsched_all(stream);
> sctp_stream_outq_migrate(stream, NULL, outcnt);
> sched->sched_all(stream);
> after the flexarray allocation.
>
> Just sctp_stream_alloc_out() can not be used here anymore, as
> stream->out will be set in it.
What about carving out a sctp_stream_init_out() ? Like
struct flex_array *new_out;
/* just allocate it, nothing more */
new_out = sctp_stream_alloc_out(outcnt, gfp);
if (!new_out)
goto out;
/* Filter out chunks queued on streams that won't exist anymore */
sched->unsched_all(stream);
sctp_stream_outq_migrate(stream, NULL, outcnt);
sched->sched_all(stream);
/* initialization stuff, and it can't fail anymore */
sctp_stream_init_out(stream, outcnt, new_out);
This may hurt a bit more with the genradix migration, but then we
don't end up dropping data for nothing.
^ permalink raw reply
* Re: [net-next, PATCH] net: stmmac: use correct define to get rx timestamp on GMAC4
From: Jose Abreu @ 2019-02-14 14:18 UTC (permalink / raw)
To: Alexandre Torgue, Giuseppe Cavallaro, Jose Abreu, davem
Cc: netdev, linux-stm32, linux-arm-kernel, linux-kernel
In-Reply-To: <1550153571-14404-1-git-send-email-alexandre.torgue@st.com>
Hi Alexandre,
On 2/14/2019 2:12 PM, Alexandre Torgue wrote:
> In dwmac4_wrback_get_rx_timestamp_status we looking for a RX timestamp.
> For that receive descriptors are handled and so we should use defines
> related to receive descriptors. It'll no change the functional behavior
> as RDES3_RDES1_VALID=TDES3_RS1V=BIT(26) but it makes code easier to read.
>
> Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
> index 20299f6..9f062b3 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
> @@ -268,7 +268,7 @@ static int dwmac4_wrback_get_rx_timestamp_status(void *desc, void *next_desc,
> int ret = -EINVAL;
>
> /* Get the status from normal w/b descriptor */
> - if (likely(p->des3 & TDES3_RS1V)) {
> + if (likely(p->des3 & RDES3_RDES1_VALID)) {
Shouldn't this also use le32_to_cpu() like bellow ?
Thanks,
Jose Miguel Abreu
> if (likely(le32_to_cpu(p->des1) & RDES1_TIMESTAMP_AVAILABLE)) {
> int i = 0;
>
>
^ permalink raw reply
* Re: KASAN: use-after-free Read in sctp_outq_tail
From: Marcelo Ricardo Leitner @ 2019-02-14 14:23 UTC (permalink / raw)
To: Xin Long
Cc: syzbot, davem, LKML, linux-sctp, network dev, Neil Horman,
syzkaller-bugs, Vlad Yasevich
In-Reply-To: <20190214141745.GL13621@localhost.localdomain>
On Thu, Feb 14, 2019 at 12:17:45PM -0200, Marcelo Ricardo Leitner wrote:
> On Thu, Feb 14, 2019 at 10:03:30PM +0800, Xin Long wrote:
> > On Wed, Feb 13, 2019 at 9:52 PM Marcelo Ricardo Leitner
> > <marcelo.leitner@gmail.com> wrote:
> > >
> > > On Wed, Feb 13, 2019 at 12:35:56PM +0800, Xin Long wrote:
> > > > On Wed, Feb 13, 2019 at 4:00 AM syzbot
> > > > <syzbot+7823fa3f3e2d69341ea8@syzkaller.appspotmail.com> wrote:
> > > > >
> > > > > Hello,
> > > > >
> > > > > syzbot found the following crash on:
> > > > >
> > > > > HEAD commit: d4104460aec1 Add linux-next specific files for 20190211
> > > > > git tree: linux-next
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=14140124c00000
> > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=c8a112d3b0d6719b
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=7823fa3f3e2d69341ea8
> > > > > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > > > >
> > > > > Unfortunately, I don't have any reproducer for this crash yet.
> > > > >
> > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > > > Reported-by: syzbot+7823fa3f3e2d69341ea8@syzkaller.appspotmail.com
> > > > >
> > > > > ==================================================================
> > > > > BUG: KASAN: use-after-free in list_add_tail include/linux/list.h:93 [inline]
> > > > > BUG: KASAN: use-after-free in sctp_outq_tail_data net/sctp/outqueue.c:105
> > > > > [inline]
> > > > > BUG: KASAN: use-after-free in sctp_outq_tail+0x816/0x930
> > > > > net/sctp/outqueue.c:313
> > > > > Read of size 8 at addr ffff88807b19a7b8 by task syz-executor.0/30745
> > > > I think https://patchwork.ozlabs.org/patch/1040500/ will fix this.
> > >
> > > I don't think so. Seems it will switch from use-after-free to NULL deref
> > > instead with that patch.
> > It will allocate ext for the stream if its ext is NULL in
> > sctp_sendmsg_to_asoc(), the new data will work fine. As for
>
> Ehh, right!
Marking it as fixed again then, as with this patch it won't crash
anymore.
#syz fix:
sctp: set stream ext to NULL after freeing it in sctp_stream_outq_migrate
>
> > the old data on the surplus streams, it has been dropped in
> > sctp_stream_outq_migrate().
>
> Yep.
>
> >
> > >
> > > > > CPU: 1 PID: 30745 Comm: syz-executor.0 Not tainted 5.0.0-rc5-next-20190211
> > > > > #32
> > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > > > Google 01/01/2011
> > > > > Call Trace:
> > > > > __dump_stack lib/dump_stack.c:77 [inline]
> > > > > dump_stack+0x172/0x1f0 lib/dump_stack.c:113
> > > > > print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
> > > > > kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
> > > > > __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
> > > > > list_add_tail include/linux/list.h:93 [inline]
> > > > > sctp_outq_tail_data net/sctp/outqueue.c:105 [inline]
> > > > > sctp_outq_tail+0x816/0x930 net/sctp/outqueue.c:313
> > > > > sctp_cmd_send_msg net/sctp/sm_sideeffect.c:1109 [inline]
> > > > > sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1784 [inline]
> > > > > sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
> > > > > sctp_do_sm+0x68e/0x5380 net/sctp/sm_sideeffect.c:1191
> > > > > sctp_primitive_SEND+0xa0/0xd0 net/sctp/primitive.c:178
> > > > > sctp_sendmsg_to_asoc+0xa63/0x17b0 net/sctp/socket.c:1955
> > >
> > > sctp_sendmsg_to_asoc()
> > > ...
> > > if (sinfo->sinfo_stream >= asoc->stream.outcnt) {
> > > err = -EINVAL;
> > > goto err;
> > > }
> > >
> > > if (unlikely(!SCTP_SO(&asoc->stream, sinfo->sinfo_stream)->ext)) {
> > > ...
> > >
> > > It should have aborted even if an old ->ext was still there because
> > > outcnt is correctly updated. So somehow outcnt was wrong here.
> > >
> > > sctp_stream_init()
> > > ...
> > > /* Filter out chunks queued on streams that won't exist anymore */
> > > sched->unsched_all(stream);
> > > sctp_stream_outq_migrate(stream, NULL, outcnt); <--- [A]
> > > sched->sched_all(stream);
> > >
> > > ret = sctp_stream_alloc_out(stream, outcnt, gfp); <--- [B]
> > > if (ret)
> > > goto out;
> > >
> > > stream->outcnt = outcnt; <--- [C]
> > > ...
> > >
> > > We have a problem here because [A] is freeing ->ext, but [B] can fail (ENOMEM),
> > > which would lead it to not update outcnt in [C] even after the
> > > changes already performed in [A].
> > >
> > > It should handle the freeing of ->ext in sctp_stream_alloc_out()
> > > instead, or allocate the flexarray earlier (so it can bail out before
> > > freeing stuff).
> > Agree that it's better to do:
> > sched->unsched_all(stream);
> > sctp_stream_outq_migrate(stream, NULL, outcnt);
> > sched->sched_all(stream);
> > after the flexarray allocation.
> >
> > Just sctp_stream_alloc_out() can not be used here anymore, as
> > stream->out will be set in it.
>
> What about carving out a sctp_stream_init_out() ? Like
>
> struct flex_array *new_out;
>
> /* just allocate it, nothing more */
> new_out = sctp_stream_alloc_out(outcnt, gfp);
> if (!new_out)
> goto out;
>
> /* Filter out chunks queued on streams that won't exist anymore */
> sched->unsched_all(stream);
> sctp_stream_outq_migrate(stream, NULL, outcnt);
> sched->sched_all(stream);
>
> /* initialization stuff, and it can't fail anymore */
> sctp_stream_init_out(stream, outcnt, new_out);
>
> This may hurt a bit more with the genradix migration, but then we
> don't end up dropping data for nothing.
>
^ permalink raw reply
* Re: [PATCH net-next 09/12] mlxsw: core: Extend hwmon interface with fan fault attribute
From: Guenter Roeck @ 2019-02-14 14:29 UTC (permalink / raw)
To: Vadim Pasternak, Andrew Lunn, Ido Schimmel
Cc: netdev@vger.kernel.org, davem@davemloft.net, Jiri Pirko, mlxsw
In-Reply-To: <AM6PR05MB5224D8338CD23225746CBA72A2670@AM6PR05MB5224.eurprd05.prod.outlook.com>
On 2/13/19 11:06 PM, Vadim Pasternak wrote:
>
>
>> -----Original Message-----
>> From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter Roeck
>> Sent: Wednesday, February 13, 2019 5:03 PM
>> To: Andrew Lunn <andrew@lunn.ch>; Ido Schimmel <idosch@mellanox.com>
>> Cc: netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko
>> <jiri@mellanox.com>; mlxsw <mlxsw@mellanox.com>; Vadim Pasternak
>> <vadimp@mellanox.com>
>> Subject: Re: [PATCH net-next 09/12] mlxsw: core: Extend hwmon interface with
>> fan fault attribute
>>
>> On 2/13/19 5:53 AM, Andrew Lunn wrote:
>>> On Wed, Feb 13, 2019 at 11:28:53AM +0000, Ido Schimmel wrote:
>>>> From: Vadim Pasternak <vadimp@mellanox.com>
>>>>
>>>> Add new fan hwmon attribute for exposing fan faults (fault indication
>>>> is read from Fan Out of Range Event Register).
>>>>
>>>> Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
>>>> Reviewed-by: Jiri Pirko <jiri@mellanox.com>
>>>> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
>>>
>>> Hi Ido
>>>
>>> You should include the HWMON maintainer in the Cc: list.
>>>
>>> I would not be too surprised if he says to use
>>> hwmon_device_register_with_info().
>>>
>>
>> I would ask to do that for new drivers, but this is is not a new driver.
>> On top of that, I wasn't included in its initial review. Since I wasn't involved, I
>> have no idea what shape the driver is in, and for sure won't review it now (to
>> retain my sanity).
>>
>> Only comment I have is that using the _with_info API and using devm_ functions
>> might simplify the driver a lot. I'll be happy to do a review if/when that is done.
>>
>> Guenter
>
> Hi Guenter,
>
> Thank you for your comments.
>
> But, this patch adds one new FAN attribute to the existing infrastructure.
> At the time mlxsw core_hwmon module has been submitted, the API
> hwmon_device_register_with_info() was not available yet.
> Of course in a new modules we are using this API, like in
> our mlxreg-fan for example.
>
> The same is relevant for the patch 10/12 from this series – it also extends
> the existing infrastructure.
> But there, even in case of a new code the config structure for
> hwmon_device_register_with_info() would be look like:
> {
> /* 3 entries like below - for chip ambient temperatures (could be from 1 to 3. */
> HWMON_F_INPUT | HWMON_T_HIGHEST | HWMON_T_RESET_HISTORY,
> ...
> /* 128 entries like below - for modules temperatures (could be from 16 to 128. */
> HWMON_F_INPUT | HWMON_T_CRIT | HWMON_T_EMERGENCY, HWMON_T_FAULT | HWMON_T_LABEL,
> ...
> /* 64 entries like below - for interconnects temperatures (could be from 0 to 64). */
> HWMON_F_INPUT | HWMON_T_HIGHEST | HWMON_T_RESET_HISTORY | HWMON_T_LABEL,
> 0
> };
>
I would probably have created the above dynamically, just like the current code does,
except it now generates all the attributes. AFAICS the current code doesn't leave holes
in attribute numbering, so allocating everything statically doesn't really make much
sense. Note that you would need separate arrays, one per sensor type (temperature, fan,
pwm).
> Means we'll have 195 lines for configuration description and in case future systems will
> be equipped with the bigger number of port slots and/or with the bigger number of
> interconnects devices, the below structure will require modification.
> Modification will be not necessary, in case we'll keep configuration like it is now.
>
> Regarding devm_ API - it has been used initially in core_hwmon module, but the in
> commit (9b3bc7db759e) it has been removed. And it was a good reason for that.
> Chip can be re-programmed after initialization in case driver determines it needs to
> flash a new firmware version. In such case after re-programming driver will make
> registration again and among the other stuff it will make new registration with
> hwmon subsystem. And in case it 's registered with
> hwmon subsystem using devm_hwmon_device_register_with_groups(), the old
> hwmon device (registered before the flashing) was never unregistered and was
> referencing stale data, resulting in a use-after free.
>
Well, devm_ obviously doesn't work if the object lifetime doesn't match device
lifetime.
Guenter
^ permalink raw reply
* [patch net-next] lib: objagg: fix handling of object with 0 users when assembling hints
From: Jiri Pirko @ 2019-02-14 14:39 UTC (permalink / raw)
To: netdev; +Cc: davem, mlxsw
From: Jiri Pirko <jiri@mellanox.com>
It is possible that there might be an originally parent object with 0
direct users that is in hints no longer considered as parent. Then the
weight of this object is 0 and current code ignores him. That's why the
total amount of hint objects might be lower than for the original
objagg and WARN_ON is hit. Fix this be considering 0 weight valid.
Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
lib/objagg.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/lib/objagg.c b/lib/objagg.c
index d552ec9c60ed..576be22e86de 100644
--- a/lib/objagg.c
+++ b/lib/objagg.c
@@ -744,8 +744,6 @@ static unsigned int objagg_tmp_graph_node_weight(struct objagg_tmp_graph *graph,
* that this node can represent with delta.
*/
- if (node->crossed_out)
- return 0;
for (j = 0; j < graph->nodes_count; j++) {
if (!objagg_tmp_graph_is_edge(graph, index, j))
continue;
@@ -759,14 +757,18 @@ static unsigned int objagg_tmp_graph_node_weight(struct objagg_tmp_graph *graph,
static int objagg_tmp_graph_node_max_weight(struct objagg_tmp_graph *graph)
{
+ struct objagg_tmp_node *node;
unsigned int max_weight = 0;
unsigned int weight;
int max_index = -1;
int i;
for (i = 0; i < graph->nodes_count; i++) {
+ node = &graph->nodes[i];
+ if (node->crossed_out)
+ continue;
weight = objagg_tmp_graph_node_weight(graph, i);
- if (weight > max_weight) {
+ if (weight >= max_weight) {
max_weight = weight;
max_index = i;
}
--
2.14.5
^ permalink raw reply related
* Re: KASAN: use-after-free Read in sctp_outq_tail
From: Marcelo Ricardo Leitner @ 2019-02-14 14:39 UTC (permalink / raw)
To: Xin Long
Cc: syzbot, davem, LKML, linux-sctp, network dev, Neil Horman,
syzkaller-bugs, Vlad Yasevich
In-Reply-To: <20190214141745.GL13621@localhost.localdomain>
On Thu, Feb 14, 2019 at 12:17:45PM -0200, Marcelo Ricardo Leitner wrote:
> On Thu, Feb 14, 2019 at 10:03:30PM +0800, Xin Long wrote:
> > On Wed, Feb 13, 2019 at 9:52 PM Marcelo Ricardo Leitner
> > <marcelo.leitner@gmail.com> wrote:
> > >
> > > On Wed, Feb 13, 2019 at 12:35:56PM +0800, Xin Long wrote:
> > > > On Wed, Feb 13, 2019 at 4:00 AM syzbot
> > > > <syzbot+7823fa3f3e2d69341ea8@syzkaller.appspotmail.com> wrote:
> > > > >
> > > > > Hello,
> > > > >
> > > > > syzbot found the following crash on:
> > > > >
> > > > > HEAD commit: d4104460aec1 Add linux-next specific files for 20190211
> > > > > git tree: linux-next
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=14140124c00000
> > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=c8a112d3b0d6719b
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=7823fa3f3e2d69341ea8
> > > > > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > > > >
> > > > > Unfortunately, I don't have any reproducer for this crash yet.
> > > > >
> > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > > > Reported-by: syzbot+7823fa3f3e2d69341ea8@syzkaller.appspotmail.com
> > > > >
> > > > > ==================================================================
> > > > > BUG: KASAN: use-after-free in list_add_tail include/linux/list.h:93 [inline]
> > > > > BUG: KASAN: use-after-free in sctp_outq_tail_data net/sctp/outqueue.c:105
> > > > > [inline]
> > > > > BUG: KASAN: use-after-free in sctp_outq_tail+0x816/0x930
> > > > > net/sctp/outqueue.c:313
> > > > > Read of size 8 at addr ffff88807b19a7b8 by task syz-executor.0/30745
> > > > I think https://patchwork.ozlabs.org/patch/1040500/ will fix this.
> > >
> > > I don't think so. Seems it will switch from use-after-free to NULL deref
> > > instead with that patch.
> > It will allocate ext for the stream if its ext is NULL in
> > sctp_sendmsg_to_asoc(), the new data will work fine. As for
>
> Ehh, right!
>
> > the old data on the surplus streams, it has been dropped in
> > sctp_stream_outq_migrate().
>
> Yep.
>
> >
> > >
> > > > > CPU: 1 PID: 30745 Comm: syz-executor.0 Not tainted 5.0.0-rc5-next-20190211
> > > > > #32
> > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > > > Google 01/01/2011
> > > > > Call Trace:
> > > > > __dump_stack lib/dump_stack.c:77 [inline]
> > > > > dump_stack+0x172/0x1f0 lib/dump_stack.c:113
> > > > > print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
> > > > > kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
> > > > > __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
> > > > > list_add_tail include/linux/list.h:93 [inline]
> > > > > sctp_outq_tail_data net/sctp/outqueue.c:105 [inline]
> > > > > sctp_outq_tail+0x816/0x930 net/sctp/outqueue.c:313
> > > > > sctp_cmd_send_msg net/sctp/sm_sideeffect.c:1109 [inline]
> > > > > sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1784 [inline]
> > > > > sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
> > > > > sctp_do_sm+0x68e/0x5380 net/sctp/sm_sideeffect.c:1191
> > > > > sctp_primitive_SEND+0xa0/0xd0 net/sctp/primitive.c:178
> > > > > sctp_sendmsg_to_asoc+0xa63/0x17b0 net/sctp/socket.c:1955
> > >
> > > sctp_sendmsg_to_asoc()
> > > ...
> > > if (sinfo->sinfo_stream >= asoc->stream.outcnt) {
> > > err = -EINVAL;
> > > goto err;
> > > }
> > >
> > > if (unlikely(!SCTP_SO(&asoc->stream, sinfo->sinfo_stream)->ext)) {
> > > ...
> > >
> > > It should have aborted even if an old ->ext was still there because
> > > outcnt is correctly updated. So somehow outcnt was wrong here.
> > >
> > > sctp_stream_init()
> > > ...
> > > /* Filter out chunks queued on streams that won't exist anymore */
> > > sched->unsched_all(stream);
> > > sctp_stream_outq_migrate(stream, NULL, outcnt); <--- [A]
> > > sched->sched_all(stream);
> > >
> > > ret = sctp_stream_alloc_out(stream, outcnt, gfp); <--- [B]
> > > if (ret)
> > > goto out;
> > >
> > > stream->outcnt = outcnt; <--- [C]
> > > ...
> > >
> > > We have a problem here because [A] is freeing ->ext, but [B] can fail (ENOMEM),
> > > which would lead it to not update outcnt in [C] even after the
> > > changes already performed in [A].
> > >
> > > It should handle the freeing of ->ext in sctp_stream_alloc_out()
> > > instead, or allocate the flexarray earlier (so it can bail out before
> > > freeing stuff).
> > Agree that it's better to do:
> > sched->unsched_all(stream);
> > sctp_stream_outq_migrate(stream, NULL, outcnt);
> > sched->sched_all(stream);
> > after the flexarray allocation.
> >
> > Just sctp_stream_alloc_out() can not be used here anymore, as
> > stream->out will be set in it.
>
> What about carving out a sctp_stream_init_out() ? Like
>
> struct flex_array *new_out;
>
> /* just allocate it, nothing more */
> new_out = sctp_stream_alloc_out(outcnt, gfp);
> if (!new_out)
> goto out;
>
> /* Filter out chunks queued on streams that won't exist anymore */
> sched->unsched_all(stream);
> sctp_stream_outq_migrate(stream, NULL, outcnt);
> sched->sched_all(stream);
>
> /* initialization stuff, and it can't fail anymore */
> sctp_stream_init_out(stream, outcnt, new_out);
>
> This may hurt a bit more with the genradix migration, but then we
> don't end up dropping data for nothing.
Reviewing calls to this function, if this allocation fails it will
either cause the asoc to be terminated or sendmsg error. So data loss
is not really an issue here.
^ permalink raw reply
* [PATCH net] net: i825xx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: Yang Wei @ 2019-02-14 14:45 UTC (permalink / raw)
To: netdev; +Cc: davem, yang.wei9, albin_yang
From: Yang Wei <yang.wei9@zte.com.cn>
dev_consume_skb_irq() should be called in i596_interrupt() when skb
xmit done. It makes drop profiles(dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
---
drivers/net/ethernet/i825xx/lib82596.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/i825xx/lib82596.c b/drivers/net/ethernet/i825xx/lib82596.c
index 2f7ae11..1274ad2 100644
--- a/drivers/net/ethernet/i825xx/lib82596.c
+++ b/drivers/net/ethernet/i825xx/lib82596.c
@@ -1194,7 +1194,7 @@ static irqreturn_t i596_interrupt(int irq, void *dev_id)
dma_unmap_single(dev->dev.parent,
tx_cmd->dma_addr,
skb->len, DMA_TO_DEVICE);
- dev_kfree_skb_irq(skb);
+ dev_consume_skb_irq(skb);
tx_cmd->cmd.command = 0; /* Mark free */
break;
--
2.7.4
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox