From: Baolu Lu <baolu.lu@linux.intel.com>
To: "Tian, Kevin" <kevin.tian@intel.com>,
	Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Jason Gunthorpe <jgg@nvidia.com>, Jann Horn <jannh@google.com>,
	Vasant Hegde <vasant.hegde@amd.com>,
	"Hansen, Dave" <dave.hansen@intel.com>,
	Alistair Popple <apopple@nvidia.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Andy Lutomirski <luto@kernel.org>, "Lai, Yi1" <yi1.lai@intel.com>
Cc: "iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	"security@kernel.org" <security@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>
Subject: Re: [PATCH v4 6/8] mm: Introduce deferred freeing for kernel page tables
Date: Mon, 15 Sep 2025 09:16:47 +0800
Message-ID: <0c97eb75-731a-49bf-a247-cd5be8835843@linux.intel.com>
In-Reply-To: <BL1PR11MB52719E654E5AE227FE29B75F8C08A@BL1PR11MB5271.namprd11.prod.outlook.com>

On 9/12/25 16:14, Tian, Kevin wrote:
>> From: Lu Baolu <baolu.lu@linux.intel.com>
>> Sent: Friday, September 5, 2025 1:51 PM
>>
>> From: Dave Hansen <dave.hansen@linux.intel.com>
>>
>> On x86 and other architectures that map the kernel's virtual address space
>> into the upper portion of every process's page table, the IOMMU's paging
>> structure caches can become stale when the CPU page table is shared with
>> IOMMU in the Shared Virtual Address (SVA) context. This occurs when a page
>> used for the kernel's page tables is freed and reused without the IOMMU
>> being notified.
>>
>> While the IOMMU driver is notified of changes to user virtual address
>> mappings, there is no similar notification mechanism for kernel page
>> table changes. This can lead to data corruption or system instability
>> when Shared Virtual Address (SVA) is enabled, as the IOMMU's internal
>> caches may retain stale entries for kernel virtual addresses.
> 
> above could be saved to the last patch.

Yes.

> 
>>
>> This introduces a conditional asynchronous mechanism, enabled by
>> CONFIG_ASYNC_PGTABLE_FREE. When enabled, this mechanism defers the freeing
>> of pages that are used as page tables for kernel address mappings. These
>> pages are now queued to a work struct instead of being freed immediately.
>>
>> This deferred freeing provides a safe context for a future patch to add
>> an IOMMU-specific callback, which might be expensive on large-scale
>> systems. This ensures the necessary IOMMU cache invalidation is performed
>> before the page is finally returned to the page allocator outside of any
>> critical, non-sleepable path.
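
For illustration only, the queue-then-free idea described above can be
sketched in user space. This is not the code in this patch: every name
below (pt_page, defer_pt_free, pt_free_worker, iommu_invalidate_stub) is
hypothetical, malloc/free stand in for the page allocator, and the
locking and work_struct scheduling a kernel implementation would need
are left out. The point it shows is the ordering: pages are only queued
on the free path, and the worker runs the (possibly expensive)
invalidation hook once per batch before any page is returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a page that backed a kernel page table. */
struct pt_page {
	struct pt_page *next;
	void *mem;
};

static struct pt_page *pending;	/* pages queued for deferred freeing */
static int invalidations;	/* how often the hook ran */
static int freed;		/* how many pages were released */

/* Hypothetical placeholder for an IOMMU cache invalidation callback. */
static void iommu_invalidate_stub(void)
{
	invalidations++;
}

/* Free path: queue the page instead of releasing it immediately. */
static int defer_pt_free(void *mem)
{
	struct pt_page *p;

	assert(mem != NULL);
	p = malloc(sizeof(*p));
	if (!p)
		return -1;
	p->mem = mem;
	p->next = pending;
	pending = p;
	return 0;
}

/* Worker: detach the batch, invalidate once, then release the pages. */
static void pt_free_worker(void)
{
	struct pt_page *list = pending;

	pending = NULL;
	if (!list)
		return;

	iommu_invalidate_stub();
	while (list) {
		struct pt_page *next = list->next;

		free(list->mem);
		free(list);
		freed++;
		list = next;
	}
}
```

A caller would pass each retired page to defer_pt_free() and later run
pt_free_worker() from a context that is allowed to sleep. Batching keeps
the number of invalidations independent of the number of queued pages.
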
>>
>> In the current kernel, some page table pages are allocated with an
>> associated struct ptdesc, while others are not. Those without a ptdesc are
>> freed using free_pages() and its variants, which bypasses the destructor
>> that pagetable_dtor_free() would run. While the long-term plan is to
>> convert all page table pages to use struct ptdesc, this uses a temporary
>> flag within ptdesc to indicate whether a page needs a destructor,
>> considering that this aims to fix a potential security issue in IOMMU SVA.
>> The flag and its associated logic can be removed once the conversion is
>> complete.
> 
> stale comment?

Yes. Fixed.

> 
>>
>> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> 
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>

Thanks,
baolu
