Date: Mon, 13 Apr 2026 19:31:22 +0000
From: Samiullah Khawaja
To: Jason Gunthorpe
Cc: Pranjal Shrivastava, David Woodhouse, Lu Baolu, Joerg Roedel,
	Will Deacon, Robin Murphy, Kevin Tian, Alex Williamson, Shuah Khan,
	iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, Saeed Mahameed, Adithya Jayachandran,
	Parav Pandit, Leon Romanovsky, William Tu, Pratyush Yadav,
	Pasha Tatashin, David Matlack, Andrew Morton, Chris Li,
	Vipin Sharma, YiFei Zhu
Subject: Re: [PATCH 05/14] iommupt: Implement preserve/unpreserve/restore callbacks
References: <20260203220948.2176157-1-skhawaja@google.com>
	<20260203220948.2176157-6-skhawaja@google.com>
	<20260410141652.GV2551565@ziepe.ca>
	<20260410231650.GD3694781@ziepe.ca>
In-Reply-To: <20260410231650.GD3694781@ziepe.ca>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Disposition: inline

On Fri, Apr 10, 2026 at 08:16:50PM -0300, Jason Gunthorpe wrote:
>On Fri, Apr 10, 2026 at 11:02:52PM +0000, Samiullah Khawaja wrote:
>> On Fri, Apr 10, 2026 at 11:16:52AM -0300, Jason Gunthorpe wrote:
>> > On Fri, Mar 20, 2026 at 09:57:08PM +0000, Pranjal Shrivastava wrote:
>> > > > +static int __restore_tables(struct pt_range *range, void *arg,
>> > > > +			    unsigned int level, struct pt_table_p *table)
>> > > > +{
>> > > > +	struct pt_state pts = pt_init(range, level, table);
>> > > > +	int ret;
>> > > > +
>> > > > +	for_each_pt_level_entry(&pts) {
>> > > > +		if (pts.type == PT_ENTRY_TABLE) {
>> > > > +			iommu_restore_page(virt_to_phys(pts.table_lower));
>> > > > +			ret = pt_descend(&pts, arg, __restore_tables);
>> > > > +			if (ret)
>> > > > +				return ret;
>> > >
>> > > If pt_descend() returns an error, we immediately return ret. However, we
>> > > have already successfully called iommu_restore_page() on pts.table_lower
>> > > and potentially many other tables earlier in the loop or higher up in
>> > > the tree..
>> >
>> > It doesn't return an error, it just propogates errors from the
>> > callbacks which this one never errors. So this is just dead code.
>> >
>> > > > +int DOMAIN_NS(restore)(struct iommu_domain *domain, struct iommu_domain_ser *ser)
>> > > > +{
>> > > > +	struct pt_iommu *iommu_table =
>> > > > +		container_of(domain, struct pt_iommu, domain);
>> > > > +	struct pt_common *common = common_from_iommu(iommu_table);
>> > > > +	struct pt_range range = pt_all_range(common);
>> > > > +
>> > > > +	iommu_restore_page(ser->top_table);
>> > > > +
>> > > > +	/* Free new table */
>> > > > +	iommu_free_pages(range.top_table);
>> > > > +
>> > > > +	/* Set the restored top table */
>> > > > +	pt_top_set(common, phys_to_virt(ser->top_table), ser->top_level);
>> > > > +
>> > > > +	/* Restore all pages*/
>> > > > +	range = pt_all_range(common);
>> > > > +	return pt_walk_range(&range, __restore_tables, NULL);
>> >
>> > This should probably be doing something with the FEAT flags and
>> > ias/oas too or do you imagine the calling driver has to deal with
>> > that?
>>
>> During boot the iommu_domain is recreated in driver and it sets up the
>> FEAT flags and ias/oas properly. Then this generic callback is used to
>> restore the page tables.
>>
>> Currently the FEAT flags of a domain are not explicitly preserved, I
>> will preserve them and error out here if there is a mismatch.
>
>Hrm, that expands the ABI a bit though
>
>If the only operation on the restored table is free then I suppose it
>can be simplified quite a bit, you just need the minimal things that
>the collect walker in free touches.

Yes, we use the collect walker during KHO restore of the preserved pages
and also during free. But if I understand correctly, the collect walker
behaviour changes based on some FEAT_ flags (like SIGN_EXTEND). So we
have to be careful if the previous kernel was using different FEAT flags
that affect the collect walker. To handle this, we can just preserve the
u32 features from struct pt_common and deduce everything using that. Or
are you suggesting not to save u32 features at all?

Thinking about it more, we do preserve the top_level, so that could
potentially be used to walk over the page tables of these free-only
domains if we just set up the pts->index and pts->end_index properly by
initializing the range based on the top_level. Are you thinking of a
similar approach to walk these free-only domains?

>
>Is that the intention, free only?

Yes, the intention is to free only. This domain will be immutable and
can only be freed.

>
>If so then the restored iommu_domain should be some special free only
>domain too.

Agreed.

>
>Jason
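
To make the two ideas above concrete (compare the preserved features
against the recreated domain, and mark the restored domain free-only),
here is a rough user-space sketch. It is not the iommupt code: every
demo_* name, the struct layouts and the bit values are made up for
illustration, and which feature bits actually influence the collect
walker is an assumption that would need checking against the real
walker.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; the values are illustrative only. */
#define DEMO_FEAT_SIGN_EXTEND	(1u << 0)
#define DEMO_FEAT_OTHER		(1u << 1)

/* Assumption: only these bits change how the table is walked. */
#define DEMO_WALK_FEATURES	DEMO_FEAT_SIGN_EXTEND

/* Hypothetical state serialized by the previous kernel. */
struct demo_domain_ser {
	uint64_t top_table;	/* physical address of the preserved top table */
	uint8_t top_level;	/* level of the preserved top table */
	uint32_t features;	/* copy of the previous kernel's feature bits */
};

/* Hypothetical domain recreated by the driver in the new kernel. */
struct demo_domain {
	uint32_t features;
	uint64_t top_table;
	uint8_t top_level;
	bool free_only;		/* set once the preserved table is adopted */
};

/*
 * Adopt the preserved table only if the walk-relevant feature bits
 * match; otherwise leave the domain untouched and report the mismatch.
 */
static int demo_restore(struct demo_domain *domain,
			const struct demo_domain_ser *ser)
{
	assert(domain && ser);

	if ((domain->features ^ ser->features) & DEMO_WALK_FEATURES)
		return -EINVAL;

	domain->top_table = ser->top_table;
	domain->top_level = ser->top_level;
	domain->free_only = true;
	return 0;
}

/* Any mutating operation is refused on a free-only domain. */
static int demo_map(struct demo_domain *domain)
{
	return domain->free_only ? -EPERM : 0;
}
```

With this shape, a mismatch in DEMO_FEAT_OTHER alone would still
restore, while a mismatch in DEMO_FEAT_SIGN_EXTEND returns -EINVAL
before anything is adopted. Whether comparing a masked subset is
preferable to comparing the whole u32 is exactly the ABI question
raised above, so treat the mask as a placeholder.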