Date: Thu, 12 Mar 2026 18:24:01 +0000
From: Jonathan Cameron
To: Alex Williamson
CC: Lukas Wunner, ..., Terry Bowman
Subject: Re: [PATCH v4 09/10] PCI: save/restore CXL config around reset
Message-ID: <20260312182401.00001adc@huawei.com>
In-Reply-To: <20260126153435.5f1557df@shazbot.org>
References: <20260120222610.2227109-1-smadhavan@nvidia.com>
 <20260120222610.2227109-10-smadhavan@nvidia.com>
 <20260122104745.00001fea@huawei.com>
 <20260126153435.5f1557df@shazbot.org>
X-Mailing-List: linux-pci@vger.kernel.org

On Mon, 26 Jan 2026 15:34:35 -0700
Alex Williamson wrote:

> On Thu, 22 Jan 2026 10:47:45 +0000
> Jonathan Cameron wrote:
>
> > On Thu, 22 Jan 2026 11:01:57 +0100
> > Lukas Wunner wrote:
> >
> > > On Tue, Jan 20, 2026 at 10:26:09PM +0000, smadhavan@nvidia.com wrote:
> > > > +++ b/drivers/pci/pci.c
> > > > @@ -4989,6 +4990,11 @@ static int cxl_reset(struct pci_dev *dev, bool probe)
> > > >  	if (probe)
> > > >  		return 0;
> > > >
> > > > +	pci_save_state(dev);
> > > > +	rc = cxl_config_save_state(dev, &cxl_state);
> > > > +	if (rc)
> > > > +		pci_warn(dev, "Failed to save CXL config state: %d\n", rc);
> > > > +
> > >
> > > Hm, shouldn't the call to cxl_config_save_state() be moved to
> > > pci_save_state() (and likewise, cxl_config_restore_state() moved to
> > > pci_restore_state())?
> > >
> > > E.g. when a DPC event occurs, I assume CXL registers need to
> > > be restored as well on recovery, right?
> > The CXL spec has some comic language around DPC that basically says
> > "use with care, DPC trigger will bring down the physical link, reset
> > device state, disrupt CXL.cache and CXL.mem traffic",
> > or in shorter words: 'Good luck'.
> >
> > If a CXL device undergoes DPC there is a high chance you'll either
> > trigger CXL isolation, which we aren't handling yet in Linux because
> > we aren't convinced software can really recover from it, or stall a
> > CPU and end up rebooting.
> >
> > Maybe one day we'll figure this out. Today, turn off DPC on CXL
> > ports! :)
>
> Even if we hand-wave that DPC isn't an issue, save/restore of the PCI
> state happens at a higher level for every other PCI reset method and
> we're creating inconsistency here.
>
> The PCI core includes interfaces for saving PCI state, offloading PCI
> state as an opaque blob, reloading and restoring that state, and for
> performing resets without saving and restoring state. This has a
> couple of users, including vfio.
>
> If we want similar behavior for CXL type 2 devices for a future vfio
> use case, we shouldn't create unnecessary differentiation here by
> saving the CXL state separately and making the reset method behave
> differently. Thanks,

I'm a bit concerned that, unlike PCI, where no traffic flows after reset
and restore of the basic PCIe state, for CXL once you've put the
decoders etc. back in place, CXL.mem traffic can happen autonomously.
It's cacheable, and physical-address prefetchers on the CPU side may be
able to wander into it more or less randomly, whether there are page
tables yet or not. This is somewhat similar to PCI devices misbehaving
if you enable bus mastering without ensuring they are in a clean state
(just in the other direction). So I'm not sure how safe it is to
restore the generic CXL state without the driver taking control.
I don't think there are tight enough guarantees that devices should be
able to survive this if their drivers haven't managed the setup of
CXL.mem carefully, as they did during driver bind etc. Maybe they had
to load a firmware first before there was anything behind a CXL
protocol front end. The drivers can't stop CXL.mem in a prepare-reset
callback prior to saving state, as it may be RWL by an annoying BIOS.

Maybe I'm overly paranoid and all device manufacturers are sensible.
Or I missed some spec text that says devices should politely handle
traffic turning up before they are ready. If they implement the memory
ready checks then we may be fine, as hopefully Media Status == Ready
doesn't happen until it's safe to enable access (though the spec
doesn't actually say that is sufficient, as far as I can find). I need
to do some more digging and maybe a spot of prototyping. It's also more
than plausible that I'm missing a nugget of code in here that makes
this all safe.

Jonathan

> Alex