From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 073AA40DFA0 for ; Sat, 2 May 2026 00:46:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777682795; cv=none; b=L9nebjxJoK0bglKZqt2jMEaJZe5sLjRVispUGlqp9qvtSrpX5uBILN1Dy56l+HktBkGl+EZPIcInRsxDUloFsmCNozsl1WGMft1o+0DFQhEikUyNc8KUfgc/EiEU4pcr/QA9yFBn6huk28iaZ8eYTog2GwPTzlKYMQLx/JrCtZM= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777682795; c=relaxed/simple; bh=XTGflNX+Q2aPS2DfKrLc/p0uoaXIOkmiEjOTKw1RbaI=; h=Date:From:To:Message-ID:In-Reply-To:References:Subject: Mime-Version:Content-Type; b=oWb69YrAtmzVlxeWnrdnsTG306D0ZCUVv0J8kzWIVesjpmLuGxDr+EdZAPaATUH3twiBPdQmYc5d5lL0Sxa4Lhf+0Xgd+w6W/87gZNA5ykjdO7z0krJrKUd7StK4Q6SjcHyV5Y4ICgIikFuL/hEe1p0GQAyfpkvxGsnYI5Zv1Q4= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Q7JXXict; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Q7JXXict" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 69F21C2BCB4; Sat, 2 May 2026 00:46:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777682794; bh=XTGflNX+Q2aPS2DfKrLc/p0uoaXIOkmiEjOTKw1RbaI=; h=Date:From:To:In-Reply-To:References:Subject:From; b=Q7JXXictQBhxCWn158/ZJdF1UOn89unsN0/BHywntWXydVS3LOtCELX1bbVSdq3Bh 6c5/p8/oocDL4o9fNS2VEwf04mIkVpAaI+GOXW+9g1vtm2yGIn1/Le5wVrnAC1E4wN FEupvycxSKwCgEoR0ihf6/Ydn988m3pZxiE1ox7asUfgCIVTml9E1PDVh7yg2r7Wnz N4gzRpa4N8W7NoVchIk9K85qW0D0Pn1QngB5J7Z0eywg1anxi3KoM/Pcwh0/U9W9Ii +3fQU7pFLl3F3o6w4+Nru8xRUu9jQSipQT9zlXN4Ux/chkDuWekmLMBlVkNz6qy4LZ 6ud38w1lsOaYQ== Received: from phl-compute-09.internal (phl-compute-09.internal [10.202.2.49]) by mailfauth.phl.internal (Postfix) with ESMTP id 8AC79F40082; Fri, 1 May 2026 20:46:33 -0400 (EDT) Received: from phl-frontend-04 ([10.202.2.163]) by phl-compute-09.internal (MEProxy); Fri, 01 May 2026 20:46:33 -0400 X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeefhedrtddtgdeludeikecutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpuffrtefokffrpgfnqfghnecuuegr ihhlohhuthemuceftddtnecunecujfgurhepfffhvffkjghfufggtgfgsehtjeertddttd ejnecuhfhrohhmpedfffgrnhcuhghilhhlihgrmhhsucdlnhhvihguihgrmddfuceoughj sgifsehkvghrnhgvlhdrohhrgheqnecuggftrfgrthhtvghrnhepjedvleeuffffvdehvd egledttdetfeeugfdvuedugedugeefiedvvdduleefhefhnecuvehluhhsthgvrhfuihii vgeptdenucfrrghrrghmpehmrghilhhfrhhomhepughjsgifodhmvghsmhhtphgruhhthh hpvghrshhonhgrlhhithihqddujeejvdeftdegheehqdeffeefleegtdegjedqughjsgif peepkhgvrhhnvghlrdhorhhgsehfrghsthhmrghilhdrtghomhdpnhgspghrtghpthhtoh epuddtpdhmohguvgepshhmthhpohhuthdprhgtphhtthhopegrlhhutggvrhhophesrghm ugdrtghomhdprhgtphhtthhopegujhgsfieskhgvrhhnvghlrdhorhhgpdhrtghpthhtoh eprghlvghjrghnughrohdrlhhutggvrhhoqdhprghlrghusegrmhgurdgtohhmpdhrtghp thhtoheplhhinhhugidqtgiglhesvhhgvghrrdhkvghrnhgvlhdrohhrghdprhgtphhtth hopegvugifrghrugdrtghrvggvsegrmhgurdgtohhmpdhrtghpthhtohepuggrvhgvmhes uggrvhgvmhhlohhfthdrnhgvthdprhgtphhtthhopehkuhgsrgeskhgvrhhnvghlrdhorh hgpdhrtghpthhtohepphgrsggvnhhisehrvgguhhgrthdrtghomhdprhgtphhtthhopegv ughumhgriigvthesghhoohhglhgvrdgtohhm X-ME-Proxy: Feedback-ID: i67ae4b3e:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 1 May 2026 20:46:33 -0400 (EDT) Date: Fri, 01 May 2026 17:46:31 -0700 From: "Dan Williams (nvidia)" To: Alejandro Lucero Palau , "Dan Williams (nvidia)" , alejandro.lucero-palau@amd.com, linux-cxl@vger.kernel.org, edward.cree@amd.com, davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, edumazet@google.com, dave.jiang@intel.com Message-ID: <69f54967edd72_3291a9100c3@djbw-dev.notmuch> In-Reply-To: References: <20260423180528.17166-1-alejandro.lucero-palau@amd.com> <20260423180528.17166-7-alejandro.lucero-palau@amd.com> <69f409325f7c0_3291a910046@djbw-dev.notmuch> Subject: Re: [PATCH v26 6/8] cxl: attach region to an accelerator/type2 memdev Precedence: bulk X-Mailing-List: linux-cxl@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Alejandro Lucero Palau wrote: [..] > >> + } > > A couple problems here. > > > > 1/ Nothing stops a CXL class device from implementing a decoder with > > CXL_DECODER_DEVMEM (HDM-DB). > > Uhmmm... Consider a helpful expander that does not require the host OS to use cpu_cache_invalidate_memregion() whenever DPA space is changed. I imagine that would be useful for DCD where there could be a higher frequency of extent changes. > >> + /* hold endpoint lock to setup autoremove of the region */ > >> + guard(device)(&endpoint->dev); > > This does not handle the case when ->endpoint is an ERR_PTR() because > > the memdev never attached in the first instance. > > Not sure about this but, is it not the success of devm_cxl_add_memdev() > ensuring this can not happen? That is only ensured by using the "attach" mechanism. devm_cxl_add_memdev(..., NULL) is only for the generic memory expander case. Where the entire usage model is governed by memdev ABIs. > Note I decouple region attach from memdev creation. No memdev, no call > to this function. Which leads to this problem. > >> +/* Called at driver exit or when user space triggers cxl region removal. */ > >> +static void efx_cxl_unmap_region(void *data) { > >> + struct efx_probe_data *probe_data = data; > >> + > >> + probe_data->cxl_pio_initialised = false; > >> + iounmap(probe_data->cxl->ctpio_cxl); > >> +} > > I do not see how an async event can safely zap that ctpio_cxl space with > > zero coordination with the driver, and I do not think you want to burden > > the fast path with new locks to coordinate this. > > You should look patch 8 where your concern is hopefully tackled. I need > to test this further, but there is no need of additional locks. Please do not structure patches with bugs in the middle of the series. It burns reviewer resources. > > Can we please stick with the violent but simple "unload driver" approach > > for now? Someone removing cxl_acpi, disabling port drivers, or disabling > > the cxl_mem driver gets to keep all the pieces. Just like force > > unloading your storage driver underneath your root filesystem. Do not do > > it unless you want to see the fireworks or test various hotplug flows. > > > I have to disagree. The use of CXL is only part of the datapath. The > driver can keep going without CXL. The related buffers can be used until > the sync between the efx_ef10_disable_piobufs() added in patch 8 and the > CXL CTPIO datapath. > > > Although the option of unloading the driver is possible, I do not think > CXL should decide what to do here when there exists another option. The concern is creeping complexity. There is no such concept in existing Linux drivers where an MMIO mapping disappears out from underneath a running driver. It may start returning all 0xff and fire an error interrupt, but the driver does not need to worry about responding to async unmap. All drivers must already be prepared to be unloaded. So we start with the simple semantic first to get this functionality landed and then think about adding sophistication like live fallback to PCI operation.