From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Mohamad Haj Yahia, Saeed Mahameed, "David S. Miller"
Subject: [PATCH 4.4 084/312] net/mlx5: Add timeout handle to commands with callback
Date: Fri, 8 May 2020 14:31:15 +0200
Message-Id: <20200508123130.457457744@linuxfoundation.org>
In-Reply-To: <20200508123124.574959822@linuxfoundation.org>
References: <20200508123124.574959822@linuxfoundation.org>

From: Mohamad Haj Yahia

commit 65ee67084589c1783a74b4a4a5db38d7264ec8b5 upstream.

The current implementation does not handle timeout in case of command
with callback request, and this can lead to deadlock if the command
doesn't get fw response.

Add delayed callback timeout work before posting the command to fw.
In case of real fw command completion we will cancel the delayed work.
In case of fw command timeout the callback timeout handler will be
called and it will simulate fw completion with timeout error.

Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Mohamad Haj Yahia
Signed-off-by: Saeed Mahameed
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c |   38 +++++++++++++++++++++-----
 include/linux/mlx5/driver.h                   |    1
 2 files changed, 32 insertions(+), 7 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -634,11 +634,36 @@ static void free_msg(struct mlx5_core_de
 static void mlx5_free_cmd_msg(struct mlx5_core_dev *dev,
 			      struct mlx5_cmd_msg *msg);
 
+static u16 msg_to_opcode(struct mlx5_cmd_msg *in)
+{
+	struct mlx5_inbox_hdr *hdr = (struct mlx5_inbox_hdr *)(in->first.data);
+
+	return be16_to_cpu(hdr->opcode);
+}
+
+static void cb_timeout_handler(struct work_struct *work)
+{
+	struct delayed_work *dwork = container_of(work, struct delayed_work,
+						  work);
+	struct mlx5_cmd_work_ent *ent = container_of(dwork,
+						     struct mlx5_cmd_work_ent,
+						     cb_timeout_work);
+	struct mlx5_core_dev *dev = container_of(ent->cmd, struct mlx5_core_dev,
+						 cmd);
+
+	ent->ret = -ETIMEDOUT;
+	mlx5_core_warn(dev, "%s(0x%x) timeout. Will cause a leak of a command resource\n",
+		       mlx5_command_str(msg_to_opcode(ent->in)),
+		       msg_to_opcode(ent->in));
+	mlx5_cmd_comp_handler(dev, 1UL << ent->idx);
+}
+
 static void cmd_work_handler(struct work_struct *work)
 {
 	struct mlx5_cmd_work_ent *ent = container_of(work, struct mlx5_cmd_work_ent, work);
 	struct mlx5_cmd *cmd = ent->cmd;
 	struct mlx5_core_dev *dev = container_of(cmd, struct mlx5_core_dev, cmd);
+	unsigned long cb_timeout = msecs_to_jiffies(MLX5_CMD_TIMEOUT_MSEC);
 	struct mlx5_cmd_layout *lay;
 	struct semaphore *sem;
 	unsigned long flags;
@@ -691,6 +716,9 @@ static void cmd_work_handler(struct work
 	ent->ts1 = ktime_get_ns();
 	cmd_mode = cmd->mode;
 
+	if (ent->callback)
+		schedule_delayed_work(&ent->cb_timeout_work, cb_timeout);
+
 	/* ring doorbell after the descriptor is valid */
 	mlx5_core_dbg(dev, "writing 0x%x to command doorbell\n", 1 << ent->idx);
 	wmb();
@@ -735,13 +763,6 @@ static const char *deliv_status_to_str(u
 	}
 }
 
-static u16 msg_to_opcode(struct mlx5_cmd_msg *in)
-{
-	struct mlx5_inbox_hdr *hdr = (struct mlx5_inbox_hdr *)(in->first.data);
-
-	return be16_to_cpu(hdr->opcode);
-}
-
 static int wait_func(struct mlx5_core_dev *dev, struct mlx5_cmd_work_ent *ent)
 {
 	unsigned long timeout = msecs_to_jiffies(MLX5_CMD_TIMEOUT_MSEC);
@@ -808,6 +829,7 @@ static int mlx5_cmd_invoke(struct mlx5_c
 	if (!callback)
 		init_completion(&ent->done);
 
+	INIT_DELAYED_WORK(&ent->cb_timeout_work, cb_timeout_handler);
 	INIT_WORK(&ent->work, cmd_work_handler);
 	if (page_queue) {
 		cmd_work_handler(&ent->work);
@@ -1287,6 +1309,8 @@ void mlx5_cmd_comp_handler(struct mlx5_c
 			struct semaphore *sem;
 
 			ent = cmd->ent_arr[i];
+			if (ent->callback)
+				cancel_delayed_work(&ent->cb_timeout_work);
 			if (ent->page_queue)
 				sem = &cmd->pages_sem;
 			else
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -566,6 +566,7 @@ struct mlx5_cmd_work_ent {
 	void		       *uout;
 	int			uout_size;
 	mlx5_cmd_cbk_t		callback;
+	struct delayed_work	cb_timeout_work;
 	void		       *context;
 	int			idx;
 	struct completion	done;
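
For readers who want to see the control flow in isolation, here is a
rough userspace sketch of the pattern the changelog describes: arm a
timeout before posting a callback command, cancel it on real completion,
and otherwise let the timeout handler simulate completion with
-ETIMEDOUT. It is not driver code. struct cmd_ent, post_cmd(), tick(),
comp_handler(), timeout_handler(), record_status() and run_examples()
are hypothetical names, a logical tick counter stands in for jiffies and
the workqueue, and the "completed" guard is a simplification of this
sketch that the patch itself does not have.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type, loosely modelled on mlx5_cmd_cbk_t. */
typedef void (*cmd_cbk_t)(int status, void *context);

/* Hypothetical stand-in for struct mlx5_cmd_work_ent. */
struct cmd_ent {
	cmd_cbk_t callback;      /* set only for asynchronous commands */
	void *context;
	int ret;                 /* 0 on fw completion, -ETIMEDOUT on timeout */
	bool timer_armed;        /* models the pending delayed work */
	unsigned long deadline;  /* logical tick at which the timeout fires */
	bool completed;          /* sketch-only guard against double completion */
};

/* Models mlx5_cmd_comp_handler(): cancel the timer, then run the callback. */
static void comp_handler(struct cmd_ent *ent)
{
	if (ent->completed)
		return;
	ent->timer_armed = false;   /* stands in for cancel_delayed_work() */
	ent->completed = true;
	if (ent->callback)
		ent->callback(ent->ret, ent->context);
}

/* Models cb_timeout_handler(): simulate a completion with a timeout error. */
static void timeout_handler(struct cmd_ent *ent)
{
	ent->ret = -ETIMEDOUT;
	comp_handler(ent);
}

/* Models the posting path: arm the timer before the doorbell would ring. */
static void post_cmd(struct cmd_ent *ent, unsigned long now,
		     unsigned long timeout_ticks)
{
	ent->ret = 0;
	ent->completed = false;
	if (ent->callback) {
		ent->timer_armed = true;  /* stands in for schedule_delayed_work() */
		ent->deadline = now + timeout_ticks;
	}
}

/* Advances the logical clock and fires the timeout once it is due. */
static void tick(struct cmd_ent *ent, unsigned long now)
{
	if (ent->timer_armed && now >= ent->deadline)
		timeout_handler(ent);
}

/* Example callback: store the status where the caller can inspect it. */
static void record_status(int status, void *context)
{
	*(int *)context = status;
}

/* Exercises both paths; returns 0 if the assertions hold. */
int run_examples(void)
{
	int timed_out = 1, completed = 1;
	struct cmd_ent a = { .callback = record_status, .context = &timed_out };
	struct cmd_ent b = { .callback = record_status, .context = &completed };

	post_cmd(&a, 0, 5);
	tick(&a, 4);
	assert(timed_out == 1);            /* deadline not reached yet */
	tick(&a, 5);
	assert(timed_out == -ETIMEDOUT);   /* no fw response: simulated completion */

	post_cmd(&b, 0, 5);
	comp_handler(&b);                  /* fw completes first, timer cancelled */
	tick(&b, 10);
	assert(completed == 0);            /* the late tick does not fire the timeout */
	return 0;
}
```

Calling run_examples() from any test harness walks through both the
timeout path and the normal completion path. In the real driver the two
paths run concurrently, so the ordering guarantees shown here are an
artifact of the single-threaded model.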