From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 72314C43334
	for <linux-kernel@archiver.kernel.org>; Tue,  7 Jun 2022 10:36:06 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S241193AbiFGKgE (ORCPT <rfc822;linux-kernel@archiver.kernel.org>);
        Tue, 7 Jun 2022 06:36:04 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39562 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S241162AbiFGKf4 (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Tue, 7 Jun 2022 06:35:56 -0400
Received: from verein.lst.de (verein.lst.de [213.95.11.211])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AABC8EC3E5
        for <linux-kernel@vger.kernel.org>; Tue,  7 Jun 2022 03:35:43 -0700 (PDT)
Received: by verein.lst.de (Postfix, from userid 2407)
        id 3541567373; Tue,  7 Jun 2022 12:35:39 +0200 (CEST)
Date:   Tue, 7 Jun 2022 12:35:38 +0200
From:   Christoph Hellwig <hch@lst.de>
To:     Michael Kelley <mikelley@microsoft.com>
Cc:     kbusch@kernel.org, axboe@fb.com, hch@lst.de, sagi@grimberg.me,
        linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org,
        caroline.subramoney@microsoft.com, riwurd@microsoft.com,
        nathan.obr@microsoft.com
Subject: Re: [PATCH v3 2/2] nvme: handle persistent internal error AER from
 NVMe controller
Message-ID: <20220607103538.GA25071@lst.de>
References: <1654560915-22449-1-git-send-email-mikelley@microsoft.com> <1654560915-22449-2-git-send-email-mikelley@microsoft.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1654560915-22449-2-git-send-email-mikelley@microsoft.com>
User-Agent: Mutt/1.5.17 (2007-11-01)
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jun 06, 2022 at 05:15:15PM -0700, Michael Kelley wrote:
> +static void nvme_handle_aer_persistent_error(struct nvme_ctrl *ctrl)
> +{
> +	trace_nvme_async_event(ctrl, NVME_AER_ERROR);
> +
> +	/*
> +	 * We can't read the CSTS here because we're in an atomic context on
> +	 * some transports and the read may require submitting a request to the
> +	 * controller and getting a response. Such a sequence isn't
> +	 * likely to be successful anyway if the controller is reporting a
> +	 * persistent internal error. So assume CSTS.CFS is set.
> +	 */
> +	if (nvme_should_reset(ctrl, NVME_CSTS_CFS)) {
> +		dev_warn(ctrl->device, "resetting controller due to AER\n");
> +		nvme_reset_ctrl(ctrl);

I don't think we even need the nvme_should_reset check now.

nvme_reset_ctrl() first calls nvme_change_ctrl_state(), which only allows
the transition to the RESETTING state if the controller was previously in
the NEW or LIVE state, so we are already covered.  The only downside would
be an extra kernel message if the controller was already in another state.
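
To illustrate the argument, here is a minimal user-space sketch.  It is
not the kernel code: every toy_* name and the resets_queued counter are
made up for illustration, and only the RESETTING transition is modelled.
It shows why an unconditional reset call is harmless when the state
change itself is the gate.

```c
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-ins for controller states; not the kernel's enum. */
enum toy_ctrl_state {
	TOY_NEW,
	TOY_LIVE,
	TOY_RESETTING,
	TOY_DELETING,
};

/* Hypothetical controller: current state plus a count of queued resets. */
struct toy_ctrl {
	enum toy_ctrl_state state;
	int resets_queued;
};

/* Allow the move to RESETTING only from NEW or LIVE; reject the rest. */
static bool toy_change_state(struct toy_ctrl *ctrl,
			     enum toy_ctrl_state new_state)
{
	bool allowed = false;

	switch (new_state) {
	case TOY_RESETTING:
		allowed = ctrl->state == TOY_NEW || ctrl->state == TOY_LIVE;
		break;
	default:
		/* Other transitions are out of scope for this sketch. */
		break;
	}

	if (allowed)
		ctrl->state = new_state;
	return allowed;
}

/* The state change is the gate: no reset is queued if it is refused. */
static int toy_reset_ctrl(struct toy_ctrl *ctrl)
{
	if (!toy_change_state(ctrl, TOY_RESETTING))
		return -1;
	ctrl->resets_queued++;
	return 0;
}

/* AER handler without a separate "should reset" check. */
static void toy_handle_aer_persistent_error(struct toy_ctrl *ctrl)
{
	/* Printed even when the reset is refused: the extra message. */
	fprintf(stderr, "resetting controller due to AER\n");
	toy_reset_ctrl(ctrl);
}
```

Calling toy_handle_aer_persistent_error() on a controller that is already
RESETTING or DELETING still prints the warning, but leaves both the state
and the reset count untouched.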