From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7B345C0015E for ; Sun, 16 Jul 2023 20:19:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231740AbjGPUTS (ORCPT ); Sun, 16 Jul 2023 16:19:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44430 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231735AbjGPUTR (ORCPT ); Sun, 16 Jul 2023 16:19:17 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B3E55126 for ; Sun, 16 Jul 2023 13:19:16 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4A12B60EAE for ; Sun, 16 Jul 2023 20:19:16 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5B2DFC433C7; Sun, 16 Jul 2023 20:19:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1689538755; bh=jRSihTJI2EMdEgMB0WWBFvNNICXqTi0ER/XzAzwH8v4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=BykCqFZIcxIo7m43yWUltrZkRcXKvkDvXFJ6yQ77ygGoZg0kqHg0A7yBx1g1RTfrO e4OXZZO1Vv1xJuSfNUDOEvdWjbKXd7HFSmBqp7s+94GPjAq4JXGXGzz/6/4go/q1vY 6LaVuU/vyqFLJxrgWqccwPgVVvj5NIMDjwP8AmYw= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Jonathan Cameron , Dave Jiang , Dan Williams , Sasha Levin Subject: [PATCH 6.4 535/800] cxl/region: Flag partially torn down regions as unusable Date: Sun, 16 Jul 2023 21:46:28 +0200 Message-ID: <20230716195001.521202249@linuxfoundation.org> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230716194949.099592437@linuxfoundation.org> References: <20230716194949.099592437@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Dan Williams [ Upstream commit 2ab47045ac96a605e3037d479a7d5854570ee5bf ] cxl_region_decode_reset() walks all the decoders associated with a given region and disables them. Due to decoder ordering rules it is possible that a switch in the topology notices that a given decoder can not be shutdown before another region with a higher HPA is shutdown first. That can leave the region in a partially committed state. Capture that state in a new CXL_REGION_F_NEEDS_RESET flag and require that a successful cxl_region_decode_reset() attempt must be completed before cxl_region_probe() accepts the region. This is a corollary for the bug that Jonathan identified in "CXL/region : commit reset of out of order region appears to succeed." [1]. Cc: Jonathan Cameron Link: http://lore.kernel.org/r/20230316171441.0000205b@Huawei.com [1] Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware") Reviewed-by: Dave Jiang Link: https://lore.kernel.org/r/168696507423.3590522.16254212607926684429.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams Signed-off-by: Sasha Levin --- drivers/cxl/core/region.c | 12 ++++++++++++ drivers/cxl/cxl.h | 8 ++++++++ 2 files changed, 20 insertions(+) diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 594ce3c2565df..fa29bd2ec3227 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -182,14 +182,19 @@ static int cxl_region_decode_reset(struct cxl_region *cxlr, int count) rc = cxld->reset(cxld); if (rc) return rc; + set_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags); } endpoint_reset: rc = cxled->cxld.reset(&cxled->cxld); if (rc) return rc; + set_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags); } + /* all decoders associated with this region have been torn down */ + clear_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags); + return 0; } @@ -2864,6 +2869,13 @@ static int cxl_region_probe(struct device *dev) goto out; } + if (test_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags)) { + dev_err(&cxlr->dev, + "failed to activate, re-commit region and retry\n"); + rc = -ENXIO; + goto out; + } + /* * From this point on any path that changes the region's state away from * CXL_CONFIG_COMMIT is also responsible for releasing the driver. diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 70ab8fcd0377c..dcebe48bb5bb5 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -469,6 +469,14 @@ struct cxl_region_params { */ #define CXL_REGION_F_AUTO 0 +/* + * Require that a committed region successfully complete a teardown once + * any of its associated decoders have been torn down. This maintains + * the commit state for the region since there are committed decoders, + * but blocks cxl_region_probe(). + */ +#define CXL_REGION_F_NEEDS_RESET 1 + /** * struct cxl_region - CXL region * @dev: This region's device -- 2.39.2