* [PATCH net 0/3] mlx5 fixes to 4.8-rc6
@ 2016-09-18 15:20 Or Gerlitz
2016-09-18 15:20 ` [PATCH net 1/3] net/mlx5: Fix flow counter bulk command out mailbox allocation Or Gerlitz
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Or Gerlitz @ 2016-09-18 15:20 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Tariq Toukan, Hadar Har-Zion, Amir Vadai, Or Gerlitz
Hi Dave,
This series has a fix from Roi for a memory corruption bug in
the bulk flow counters code and two late, and hopefully last, fixes
from me to the new eswitch offloads code.
The series is based on net commit 37dd348 "bna: fix crash in bnad_get_strings()"
Or.
Or Gerlitz (2):
net/mlx5: E-Switch, Fix error flow in the SRIOV e-switch init code
net/mlx5: E-Switch, Handle mode change failures
Roi Dayan (1):
net/mlx5: Fix flow counter bulk command out mailbox allocation
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 1 +
.../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 20 ++++++++++++++------
drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c | 4 ++--
3 files changed, 17 insertions(+), 8 deletions(-)
--
2.3.7
* [PATCH net 1/3] net/mlx5: Fix flow counter bulk command out mailbox allocation
2016-09-18 15:20 [PATCH net 0/3] mlx5 fixes to 4.8-rc6 Or Gerlitz
@ 2016-09-18 15:20 ` Or Gerlitz
2016-09-18 18:02 ` Leon Romanovsky
2016-09-18 15:20 ` [PATCH net 2/3] net/mlx5: E-Switch, Fix error flow in the SRIOV e-switch init code Or Gerlitz
` (2 subsequent siblings)
3 siblings, 1 reply; 7+ messages in thread
From: Or Gerlitz @ 2016-09-18 15:20 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Tariq Toukan, Hadar Har-Zion, Amir Vadai, Roi Dayan,
Or Gerlitz
From: Roi Dayan <roid@mellanox.com>
The FW command output length should cover only the out field of struct
mlx5_cmd_fc_bulk, not the whole struct. Otherwise, the memcpy call
invoked later in the driver writes past the end of the allocation and
corrupts kernel memory, which results in random crashes.
This bug was found using the kernel address sanitizer (kasan).
Fixes: a351a1b03bf1 ('net/mlx5: Introduce bulk reading of flow counters')
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
index 9134010..287ade1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
@@ -425,11 +425,11 @@ struct mlx5_cmd_fc_bulk *
mlx5_cmd_fc_bulk_alloc(struct mlx5_core_dev *dev, u16 id, int num)
{
struct mlx5_cmd_fc_bulk *b;
- int outlen = sizeof(*b) +
+ int outlen =
MLX5_ST_SZ_BYTES(query_flow_counter_out) +
MLX5_ST_SZ_BYTES(traffic_counter) * num;
- b = kzalloc(outlen, GFP_KERNEL);
+ b = kzalloc(sizeof(*b) + outlen, GFP_KERNEL);
if (!b)
return NULL;
--
2.3.7
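The sizing rule behind this fix can be sketched in userspace C. This is only an illustration, not the driver code: the demo_ names, the struct layout, and the two size constants are assumptions standing in for struct mlx5_cmd_fc_bulk and the MLX5_ST_SZ_BYTES() values. The point it shows is that outlen describes the output buffer alone, while the allocation adds the bookkeeping header on top of it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for MLX5_ST_SZ_BYTES(query_flow_counter_out)
 * and MLX5_ST_SZ_BYTES(traffic_counter); the real values come from the
 * mlx5 interface definitions. */
#define DEMO_QUERY_OUT_HDR_BYTES   16
#define DEMO_TRAFFIC_COUNTER_BYTES 16

/* Simplified model of a bulk-query descriptor: bookkeeping fields
 * followed by the command output buffer. */
struct demo_fc_bulk {
	unsigned short id;
	int num;
	int outlen;          /* length of out[] only, not of the struct */
	unsigned char out[]; /* command output is copied here */
};

struct demo_fc_bulk *demo_fc_bulk_alloc(unsigned short id, int num)
{
	struct demo_fc_bulk *b;
	/* Output length excludes sizeof(*b): a later copy of outlen
	 * bytes into b->out must not run past the allocation. */
	int outlen = DEMO_QUERY_OUT_HDR_BYTES +
		     DEMO_TRAFFIC_COUNTER_BYTES * num;

	/* The header is added only when sizing the allocation. */
	b = calloc(1, sizeof(*b) + (size_t)outlen);
	if (!b)
		return NULL;

	b->id = id;
	b->num = num;
	b->outlen = outlen;
	return b;
}
```

With the pre-fix arithmetic, outlen would also include sizeof(*b), so a copy of outlen bytes starting at b->out would overrun the buffer by the size of the header.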
* [PATCH net 2/3] net/mlx5: E-Switch, Fix error flow in the SRIOV e-switch init code
2016-09-18 15:20 [PATCH net 0/3] mlx5 fixes to 4.8-rc6 Or Gerlitz
2016-09-18 15:20 ` [PATCH net 1/3] net/mlx5: Fix flow counter bulk command out mailbox allocation Or Gerlitz
@ 2016-09-18 15:20 ` Or Gerlitz
2016-09-18 15:20 ` [PATCH net 3/3] net/mlx5: E-Switch, Handle mode change failures Or Gerlitz
2016-09-20 2:10 ` [PATCH net 0/3] mlx5 fixes to 4.8-rc6 David Miller
3 siblings, 0 replies; 7+ messages in thread
From: Or Gerlitz @ 2016-09-18 15:20 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Tariq Toukan, Hadar Har-Zion, Amir Vadai, Or Gerlitz
When enabling the SRIOV e-switch in a certain mode (switchdev or legacy)
fails, we must set the mode to none. Otherwise, we'll run into double-free
crashes on any further attempt to deal with the e-switch (such
as when disabling sriov or unloading the driver).
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 8b78f15..b247949 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -1554,6 +1554,7 @@ int mlx5_eswitch_enable_sriov(struct mlx5_eswitch *esw, int nvfs, int mode)
abort:
esw_enable_vport(esw, 0, UC_ADDR_CHANGE);
+ esw->mode = SRIOV_NONE;
return err;
}
--
2.3.7
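A small userspace model may help show why the one-line reset matters. Everything here is hypothetical (the demo_ names, the fail_setup knob, the free_calls counter), and it assumes that teardown is guarded by the current mode; it is a sketch of the pattern, not of the mlx5 code.

```c
#include <stdlib.h>

enum demo_mode { DEMO_NONE, DEMO_LEGACY, DEMO_OFFLOADS };

struct demo_esw {
	enum demo_mode mode;
	void *tables;    /* stands in for HW tables and other resources */
	int free_calls;  /* counts releases so a second one is visible */
};

static void demo_release(struct demo_esw *esw)
{
	free(esw->tables);
	esw->tables = NULL;
	esw->free_calls++;
}

/* fail_setup is a hypothetical knob that forces the error path. */
int demo_enable(struct demo_esw *esw, enum demo_mode mode, int fail_setup)
{
	esw->mode = mode;
	esw->tables = malloc(64);
	if (!esw->tables || fail_setup)
		goto abort;
	return 0;

abort:
	demo_release(esw);
	esw->mode = DEMO_NONE; /* mark as not enabled after cleanup */
	return -1;
}

void demo_disable(struct demo_esw *esw)
{
	if (esw->mode == DEMO_NONE)
		return;        /* nothing left to tear down */
	demo_release(esw);
	esw->mode = DEMO_NONE;
}
```

Without the reset in the abort path, a later demo_disable() would see a stale mode and release the same resources a second time.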
* [PATCH net 3/3] net/mlx5: E-Switch, Handle mode change failures
2016-09-18 15:20 [PATCH net 0/3] mlx5 fixes to 4.8-rc6 Or Gerlitz
2016-09-18 15:20 ` [PATCH net 1/3] net/mlx5: Fix flow counter bulk command out mailbox allocation Or Gerlitz
2016-09-18 15:20 ` [PATCH net 2/3] net/mlx5: E-Switch, Fix error flow in the SRIOV e-switch init code Or Gerlitz
@ 2016-09-18 15:20 ` Or Gerlitz
2016-09-20 2:10 ` [PATCH net 0/3] mlx5 fixes to 4.8-rc6 David Miller
3 siblings, 0 replies; 7+ messages in thread
From: Or Gerlitz @ 2016-09-18 15:20 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Tariq Toukan, Hadar Har-Zion, Amir Vadai, Or Gerlitz
E-switch mode changes involve creating HW tables, potentially allocating
netdevices, etc., and things can fail. Add an attempt to roll back to the
existing mode when changing to the new mode fails. Only if the rollback
also fails does restoring proper SRIOV functionality require a module
unload or sriov disablement/enablement.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
.../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 3dc83a9..7de40e6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -446,7 +446,7 @@ out:
static int esw_offloads_start(struct mlx5_eswitch *esw)
{
- int err, num_vfs = esw->dev->priv.sriov.num_vfs;
+ int err, err1, num_vfs = esw->dev->priv.sriov.num_vfs;
if (esw->mode != SRIOV_LEGACY) {
esw_warn(esw->dev, "Can't set offloads mode, SRIOV legacy not enabled\n");
@@ -455,8 +455,12 @@ static int esw_offloads_start(struct mlx5_eswitch *esw)
mlx5_eswitch_disable_sriov(esw);
err = mlx5_eswitch_enable_sriov(esw, num_vfs, SRIOV_OFFLOADS);
- if (err)
- esw_warn(esw->dev, "Failed set eswitch to offloads, err %d\n", err);
+ if (err) {
+ esw_warn(esw->dev, "Failed setting eswitch to offloads, err %d\n", err);
+ err1 = mlx5_eswitch_enable_sriov(esw, num_vfs, SRIOV_LEGACY);
+ if (err1)
+ esw_warn(esw->dev, "Failed setting eswitch back to legacy, err %d\n", err1);
+ }
return err;
}
@@ -508,12 +512,16 @@ create_ft_err:
static int esw_offloads_stop(struct mlx5_eswitch *esw)
{
- int err, num_vfs = esw->dev->priv.sriov.num_vfs;
+ int err, err1, num_vfs = esw->dev->priv.sriov.num_vfs;
mlx5_eswitch_disable_sriov(esw);
err = mlx5_eswitch_enable_sriov(esw, num_vfs, SRIOV_LEGACY);
- if (err)
- esw_warn(esw->dev, "Failed set eswitch legacy mode. err %d\n", err);
+ if (err) {
+ esw_warn(esw->dev, "Failed setting eswitch to legacy, err %d\n", err);
+ err1 = mlx5_eswitch_enable_sriov(esw, num_vfs, SRIOV_OFFLOADS);
+ if (err1)
+ esw_warn(esw->dev, "Failed setting eswitch back to offloads, err %d\n", err1);
+ }
return err;
}
--
2.3.7
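The rollback pattern described in this patch can be sketched as follows. The demo_ names, the "broken" field used to simulate a failing mode, and the error value -5 are all assumptions made for illustration; the sketch keeps the original error for the caller and tracks the rollback result in a separate variable.

```c
#include <stdio.h>

enum demo_sw_mode { DEMO_SW_NONE, DEMO_SW_LEGACY, DEMO_SW_OFFLOADS };

struct demo_switch {
	enum demo_sw_mode mode;
	enum demo_sw_mode broken; /* hypothetical: mode that fails to enable */
};

int demo_switch_enable(struct demo_switch *sw, enum demo_sw_mode mode)
{
	if (mode == sw->broken) {
		sw->mode = DEMO_SW_NONE; /* failed enable leaves no mode set */
		return -5;               /* arbitrary error code for the demo */
	}
	sw->mode = mode;
	return 0;
}

int demo_switch_set_mode(struct demo_switch *sw, enum demo_sw_mode new_mode)
{
	enum demo_sw_mode old_mode = sw->mode;
	int err, err1;

	sw->mode = DEMO_SW_NONE; /* models disabling the current mode */
	err = demo_switch_enable(sw, new_mode);
	if (err) {
		fprintf(stderr, "failed setting mode %d, err %d\n",
			(int)new_mode, err);
		/* Try to restore the previous mode; report that result
		 * separately so the original error is not lost. */
		err1 = demo_switch_enable(sw, old_mode);
		if (err1)
			fprintf(stderr, "failed rolling back to mode %d, err %d\n",
				(int)old_mode, err1);
	}
	return err; /* the caller sees the original failure */
}
```

Returning err rather than err1 tells the caller that the requested change did not happen, even when the rollback itself succeeded.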
* Re: [PATCH net 1/3] net/mlx5: Fix flow counter bulk command out mailbox allocation
2016-09-18 15:20 ` [PATCH net 1/3] net/mlx5: Fix flow counter bulk command out mailbox allocation Or Gerlitz
@ 2016-09-18 18:02 ` Leon Romanovsky
2016-09-18 20:24 ` Or Gerlitz
0 siblings, 1 reply; 7+ messages in thread
From: Leon Romanovsky @ 2016-09-18 18:02 UTC (permalink / raw)
To: Or Gerlitz
Cc: David S. Miller, netdev, Tariq Toukan, Hadar Har-Zion, Amir Vadai,
Roi Dayan
On Sun, Sep 18, 2016 at 06:20:27PM +0300, Or Gerlitz wrote:
> From: Roi Dayan <roid@mellanox.com>
>
> The FW command output length should be only the length of struct
> mlx5_cmd_fc_bulk out field. Failing to do so will cause the memcpy
> call which is invoked later in the driver to write over wrong memory
> address and corrupt kernel memory which results in random crashes.
>
> This bug was found using the kernel address sanitizer (kasan).
>
> Fixes: a351a1b03bf1 ('net/mlx5: Introduce bulk reading of flow counters')
> Signed-off-by: Roi Dayan <roid@mellanox.com>
> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
> index 9134010..287ade1 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
> @@ -425,11 +425,11 @@ struct mlx5_cmd_fc_bulk *
> mlx5_cmd_fc_bulk_alloc(struct mlx5_core_dev *dev, u16 id, int num)
> {
> struct mlx5_cmd_fc_bulk *b;
> - int outlen = sizeof(*b) +
> + int outlen =
> MLX5_ST_SZ_BYTES(query_flow_counter_out) +
> MLX5_ST_SZ_BYTES(traffic_counter) * num;
>
> - b = kzalloc(outlen, GFP_KERNEL);
> + b = kzalloc(sizeof(*b) + outlen, GFP_KERNEL);
> if (!b)
> return NULL;
^^^^^^^^^ very controversial decision.
The code flow mlx5_fc_stats_query->mlx5_cmd_fc_bulk_alloc->kzalloc
failure is the same for success scenario too.
It is not related to the proposed patch.
>
> --
> 2.3.7
>
* Re: [PATCH net 1/3] net/mlx5: Fix flow counter bulk command out mailbox allocation
2016-09-18 18:02 ` Leon Romanovsky
@ 2016-09-18 20:24 ` Or Gerlitz
0 siblings, 0 replies; 7+ messages in thread
From: Or Gerlitz @ 2016-09-18 20:24 UTC (permalink / raw)
To: Leon Romanovsky, Amir Vadai
Cc: David S. Miller, Linux Netdev List, Tariq Toukan, Hadar Har-Zion,
Roi Dayan, Or Gerlitz
On Sun, Sep 18, 2016 at 9:02 PM, Leon Romanovsky <leon@kernel.org> wrote:
> On Sun, Sep 18, 2016 at 06:20:27PM +0300, Or Gerlitz wrote:
>> From: Roi Dayan <roid@mellanox.com>
>> @@ -425,11 +425,11 @@ struct mlx5_cmd_fc_bulk *
>> mlx5_cmd_fc_bulk_alloc(struct mlx5_core_dev *dev, u16 id, int num)
>> {
>> struct mlx5_cmd_fc_bulk *b;
>> - int outlen = sizeof(*b) +
>> + int outlen =
>> MLX5_ST_SZ_BYTES(query_flow_counter_out) +
>> MLX5_ST_SZ_BYTES(traffic_counter) * num;
>>
>> - b = kzalloc(outlen, GFP_KERNEL);
>> + b = kzalloc(sizeof(*b) + outlen, GFP_KERNEL);
>> if (!b)
>> return NULL;
> ^^^^^^^^^ very controversial decision.
> The code flow mlx5_fc_stats_query->mlx5_cmd_fc_bulk_alloc->kzalloc
> failure is the same for success scenario too.
Sure, we will look into your comment and, if needed, come up with a
cleanup patch for net-next (4.9).
> It is not related to the proposed patch.
Correct, the proposed patch fixes a memory corruption that we want to
sort out for net (4.8)
Or.
* Re: [PATCH net 0/3] mlx5 fixes to 4.8-rc6
2016-09-18 15:20 [PATCH net 0/3] mlx5 fixes to 4.8-rc6 Or Gerlitz
` (2 preceding siblings ...)
2016-09-18 15:20 ` [PATCH net 3/3] net/mlx5: E-Switch, Handle mode change failures Or Gerlitz
@ 2016-09-20 2:10 ` David Miller
3 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2016-09-20 2:10 UTC (permalink / raw)
To: ogerlitz; +Cc: netdev, tariqt, hadarh, amirva
From: Or Gerlitz <ogerlitz@mellanox.com>
Date: Sun, 18 Sep 2016 18:20:26 +0300
> This series has a fix from Roi for a memory corruption bug in
> the bulk flow counters code and two late, and hopefully last, fixes
> from me to the new eswitch offloads code.
>
> The series is based on net commit 37dd348 "bna: fix crash in bnad_get_strings()"
Series applied, thanks.