From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.linuxfoundation.org ([140.211.169.12]:46036 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751159AbdAQQNx (ORCPT ); Tue, 17 Jan 2017 11:13:53 -0500 Subject: Patch "net/mlx5: Only cancel recovery work when cleaning up device" has been added to the 4.9-stable tree To: danielj@mellanox.com, davem@davemloft.net, gregkh@linuxfoundation.org, saeedm@mellanox.com Cc: , From: Date: Tue, 17 Jan 2017 17:13:07 +0100 Message-ID: <148466958768191@kroah.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ANSI_X3.4-1968 Content-Transfer-Encoding: 8bit Sender: stable-owner@vger.kernel.org List-ID: This is a note to let you know that I've just added the patch titled net/mlx5: Only cancel recovery work when cleaning up device to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: net-mlx5-only-cancel-recovery-work-when-cleaning-up-device.patch and it can be found in the queue-4.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let know about it. >>From 5e44fca5047054f1762813751626b5245e0da022 Mon Sep 17 00:00:00 2001 From: Daniel Jurgens Date: Tue, 10 Jan 2017 22:33:39 +0200 Subject: net/mlx5: Only cancel recovery work when cleaning up device From: Daniel Jurgens commit 5e44fca5047054f1762813751626b5245e0da022 upstream. Do not attempt to drain the health workqueue when unloading the device in the recovery flow, this can cause a deadlock when the recovery work tries to cancel itself with sync. Because the work is no longer unconditionally canceled when unloading, it must be explicitly canceled in the AER flow. fixes: 689a248df83b ("net/mlx5: Cancel recovery work in remove flow") Signed-off-by: Daniel Jurgens Signed-off-by: Saeed Mahameed Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- drivers/net/ethernet/mellanox/mlx5/core/main.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -1158,7 +1158,8 @@ static int mlx5_unload_one(struct mlx5_c { int err = 0; - mlx5_drain_health_wq(dev); + if (cleanup) + mlx5_drain_health_wq(dev); mutex_lock(&dev->intf_state_mutex); if (test_bit(MLX5_INTERFACE_STATE_DOWN, &dev->intf_state)) { @@ -1320,9 +1321,10 @@ static pci_ers_result_t mlx5_pci_err_det mlx5_enter_error_state(dev); mlx5_unload_one(dev, priv, false); - /* In case of kernel call save the pci state */ + /* In case of kernel call save the pci state and drain the health wq */ if (state) { pci_save_state(pdev); + mlx5_drain_health_wq(dev); mlx5_pci_disable_device(dev); } Patches currently in stable-queue which might be from danielj@mellanox.com are queue-4.9/net-mlx5-only-cancel-recovery-work-when-cleaning-up-device.patch