From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
    by smtp.subspace.kernel.org (Postfix) with ESMTP id 86F0C28378;
    Wed, 15 Nov 2023 20:24:07 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
    dmarc=pass (p=none dis=none) header.from=arm.com
Authentication-Results: smtp.subspace.kernel.org;
    spf=pass smtp.mailfrom=arm.com
Authentication-Results: smtp.subspace.kernel.org; dkim=none
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
    by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 0FE5F1595;
    Wed, 15 Nov 2023 12:24:52 -0800 (PST)
Received: from [10.57.83.164] (unknown [10.57.83.164])
    by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 2ED0A3F6C4;
    Wed, 15 Nov 2023 12:23:57 -0800 (PST)
Message-ID: <6442d24b-6352-46e9-89e0-72d4a493f77c@arm.com>
Date: Wed, 15 Nov 2023 20:23:54 +0000
Precedence: bulk
X-Mailing-List: patches@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH v2 00/17] Solve iommu probe races around iommu_fwspec
To: Jason Gunthorpe
Cc: acpica-devel@lists.linux.dev, Alyssa Rosenzweig, Albert Ou,
    asahi@lists.linux.dev, Catalin Marinas, Dexuan Cui,
    devicetree@vger.kernel.org, David Woodhouse, Frank Rowand,
    Hanjun Guo, Haiyang Zhang, iommu@lists.linux.dev,
    Jean-Philippe Brucker, Jonathan Hunter, Joerg Roedel,
    "K. Y. Srinivasan", Len Brown, linux-acpi@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, linux-hyperv@vger.kernel.org,
    linux-mips@vger.kernel.org, linux-riscv@lists.infradead.org,
    linux-snps-arc@lists.infradead.org, linux-tegra@vger.kernel.org,
    Russell King, Lorenzo Pieralisi, Marek Szyprowski, Hector Martin,
    Palmer Dabbelt, patches@lists.linux.dev, Paul Walmsley,
    "Rafael J. Wysocki", Robert Moore, Rob Herring, Sudeep Holla,
    Suravee Suthikulpanit, Sven Peter, Thierry Reding,
    Thomas Bogendoerfer, Krishna Reddy, Vineet Gupta,
    virtualization@lists.linux.dev, Wei Liu, Will Deacon,
    André Draszik, Lu Baolu, Christoph Hellwig, Jerry Snitselaar,
    Moritz Fischer, Zhenhua Huang, "Rafael J. Wysocki", Rob Herring
References: <0-v2-36a0088ecaa7+22c6e-iommu_fwspec_jgg@nvidia.com>
    <1316b55e-8074-4b2f-99df-585df2f3dd06@arm.com>
Content-Language: en-GB
From: Robin Murphy
In-Reply-To:
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 2023-11-15 3:36 pm, Jason Gunthorpe wrote:
> On Wed, Nov 15, 2023 at 03:22:09PM +0000, Robin Murphy wrote:
>> On 2023-11-15 2:05 pm, Jason Gunthorpe wrote:
>>> [Several people have tested this now, so it is something that should sit in
>>> linux-next for a while]
>>
>> What's the aim here? This is obviously far, far too much for a
>> stable fix,
>
> To fix the locking bug and ugly abuse of dev->iommu?

Fixing the locking can be achieved by fixing the locking, as I have now
demonstrated.

> I wouldn't say that, it is up to the people who care about this to
> decide. It seems a lot of people are hitting it so maybe it should be
> backported in some situations. Regardless, we should not continue to
> have this locking bug in v6.8.
>
>> but then it's also not the refactoring we want for the future either, since
>> it's moving in the wrong direction of cementing the fundamental brokenness
>> further in place rather than getting any closer to removing it.
>
> I haven't seen patches or an outline on what you have in mind though?
>
> In my view I would like to get rid of of_xlate(), at a minimum. It is
> a micro-optimization I don't think we need. I see a pretty
> straightforward path to get there from here.

Micro-optimisation!? OK, I think I have to say it. Please stop trying to
rewrite code you don't understand.

> Do you also want to get rid of iommu_fwspec, or at least thin it out?
> That seems reasonable too, I think that becomes within reach once
> of_xlate is gone.
>
> What do you see as "cementing"?

Most of this series constitutes a giant sweeping redesign of a whole
bunch of internal machinery to permit it to be used concurrently, where
that concurrency should still not exist in the first place because the
thing that allows it to happen also causes other problems like groups
being broken. Once the real problem is fixed there will be no need for
any of this, and at worst some of it will then actually get in the way.

I feel like I've explained it many times already, but what needs to
happen is for the firmware parsing and of_xlate stage to be initiated by
__iommu_probe_device() itself. The first step is my bus ops series (if
I'm ever allowed to get it landed...) which gets to the state of
expecting to start from a fwspec. Then it's a case of shuffling around
what's currently in the bus_type dma_configure methods such that point
is where the fwspec is created as well, and the driver-probe-time work
is almost removed except for still deferring if a device is waiting for
its IOMMU instance (since that instance turning up and registering will
retrigger the rest itself).

And there at last, a trivial lifecycle and access pattern for dev->iommu
(with the overlapping bits of iommu_fwspec finally able to be squashed
as well), and finally an end to 8 long and unfortunate years of calling
things in the wrong order in ways they were never supposed to be.

Thanks,
Robin.
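
For readers following the thread, the ordering argued for above can be sketched as a toy model in plain C. This is only an illustration under stated assumptions, not kernel code: every name in it (toy_probe_device, toy_iommu_register, toy_fwspec and so on) is hypothetical, and it only loosely mirrors the roles of __iommu_probe_device(), IOMMU instance registration and iommu_fwspec. The point it tries to show is that the fwspec is created in exactly one place, inside the probe path, and that a device whose IOMMU instance has not yet appeared simply waits until registration retriggers the probe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */

#define TOY_MAX_DEVS 8

enum { TOY_OK = 0, TOY_DEFER = 1 };

struct toy_iommu {
    bool registered;            /* has the IOMMU instance turned up yet? */
};

struct toy_fwspec {
    struct toy_iommu *iommu;    /* instance the device is attached to */
    unsigned int id;            /* stands in for what an xlate step records */
};

struct toy_device {
    struct toy_iommu *fw_iommu; /* what the firmware description points at */
    unsigned int fw_id;         /* ID given in the firmware description */
    struct toy_fwspec fwspec;   /* written only by toy_probe_device() */
    bool has_fwspec;
    bool waiting;               /* deferred until its instance registers */
};

static struct toy_device *toy_devs[TOY_MAX_DEVS];
static size_t toy_ndevs;

/* Single entry point: firmware parsing and translation happen here. */
static int toy_probe_device(struct toy_device *dev)
{
    if (dev->has_fwspec)
        return TOY_OK;          /* already probed, nothing to redo */

    if (!dev->fw_iommu->registered) {
        dev->waiting = true;    /* defer; registration will retrigger us */
        return TOY_DEFER;
    }

    /* The fwspec is created at this point and nowhere else. */
    dev->fwspec.iommu = dev->fw_iommu;
    dev->fwspec.id = dev->fw_id;
    dev->has_fwspec = true;
    dev->waiting = false;
    return TOY_OK;
}

/* Adding a device tracks it and runs the probe path once. */
static int toy_add_device(struct toy_device *dev)
{
    if (toy_ndevs == TOY_MAX_DEVS)
        return -1;
    toy_devs[toy_ndevs++] = dev;
    return toy_probe_device(dev);
}

/* An instance registering re-runs the probe for devices waiting on it. */
static void toy_iommu_register(struct toy_iommu *iommu)
{
    iommu->registered = true;
    for (size_t i = 0; i < toy_ndevs; i++) {
        if (toy_devs[i]->waiting && toy_devs[i]->fw_iommu == iommu)
            toy_probe_device(toy_devs[i]);
    }
}
```

In this sketch, adding a device before its instance exists returns TOY_DEFER and leaves no fwspec behind; calling toy_iommu_register() afterwards completes the probe with no second code path touching the fwspec, which is the "trivial lifecycle" idea in miniature.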