From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Carlier
To: akpm@linux-foundation.org
Cc: syzbot+fd95a72470f5a44e464c@syzkaller.appspotmail.com,
	David Carlier, David Hildenbrand, Lorenzo Stoakes,
	"Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Kevin Tian, Jason Gunthorpe,
	Lu Baolu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH] mm: pgtable: protect lockless kernel page table walks with RCU
Date: Fri, 12 Jun 2026 05:38:27 +0100
Message-ID: <20260612043828.23558-1-devnexen@gmail.com>
X-Mailer: git-send-email 2.53.0
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

ptdump walks the kernel page tables locklessly through
walk_kernel_page_table_range_lockless(). It only holds the init_mm mmap
lock and the memory hotplug lock, and neither excludes vmalloc/ioremap
teardown from freeing kernel PTE pages via pmd_free_pte_page() ->
pagetable_free_kernel().

syzbot hit a use-after-free in ptdump_pte_entry() reading a PTE page
that was freed underneath the walk.

Deferring the kernel page table free only batches the TLB flush; it does
not wait for lockless walkers. Mirror the user page table walk, where
pte_offset_map() already takes the RCU read lock: hold rcu_read_lock()
across the lockless kernel walk and wait for a grace period in the
kernel page table free worker before releasing the pages. A walker then
either observes the cleared PMD and skips the page, or keeps it alive
until it drops the RCU read lock.

Fixes: 5ba2f0a15564 ("mm: introduce deferred freeing for kernel page tables")
Reported-by: syzbot+fd95a72470f5a44e464c@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6a287988.39669fcc.33b062.00a0.GAE@google.com/T/
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: David Carlier
---
 mm/pagewalk.c        | 15 ++++++++++++++-
 mm/pgtable-generic.c |  8 ++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index 3ae2586ff45b..6d9f14f86784 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -655,13 +655,26 @@ int walk_kernel_page_table_range_lockless(unsigned long start, unsigned long end
 		.private	= private,
 		.no_vma		= true
 	};
+	int err;
 
 	if (start >= end)
 		return -EINVAL;
 	if (!check_ops_safe(ops))
 		return -EINVAL;
 
-	return walk_pgd_range(start, end, &walk);
+	/*
+	 * Kernel intermediate page tables can be freed concurrently by
+	 * vmalloc/ioremap teardown (e.g. pmd_free_pte_page()), which routes
+	 * the freed pages through pagetable_free_kernel(). That path defers
+	 * the free past an RCU grace period, so hold the RCU read lock across
+	 * the lockless walk to prevent a page table from being freed while we
+	 * are still dereferencing it.
+	 */
+	rcu_read_lock();
+	err = walk_pgd_range(start, end, &walk);
+	rcu_read_unlock();
+
+	return err;
 }
 
 /**
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index b91b1a98029c..59e1315185b4 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -434,6 +434,14 @@ static void kernel_pgtable_work_func(struct work_struct *work)
 	spin_unlock(&kernel_pgtable_work.lock);
 
 	iommu_sva_invalidate_kva_range(PAGE_OFFSET, TLB_FLUSH_ALL);
+
+	/*
+	 * Lockless kernel page table walkers (ptdump, and any other user of
+	 * walk_kernel_page_table_range_lockless()) dereference these pages
+	 * under rcu_read_lock(). Wait for a grace period so no walker can
+	 * still be reading a page we are about to free.
+	 */
+	synchronize_rcu();
 	list_for_each_entry_safe(pt, next, &page_list, pt_list)
 		__pagetable_free(pt);
 }
-- 
2.53.0
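
Illustration only, not part of the patch: the commit message says a
walker either observes the cleared PMD and skips the page, or keeps the
page alive until it drops the RCU read lock. The sketch below is a
single-threaded userspace toy model of those two outcomes. Every toy_*
name is hypothetical and none of this is kernel API. In particular,
toy_try_reclaim() merely reports that the grace period has not elapsed,
whereas the real synchronize_rcu() blocks, and a plain reader counter
stands in for RCU's grace-period tracking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel PTE page. */
struct toy_pte_page {
	int entries[4];
	bool freed;	/* set instead of really freeing, so callers can inspect it */
};

/* Hypothetical stand-in for the PMD slot plus the deferred-free queue. */
struct toy_state {
	struct toy_pte_page *pmd;	/* NULL once teardown cleared the entry */
	struct toy_pte_page *pending;	/* page waiting for the grace period */
	int readers;			/* walkers inside a read-side section */
};

/* Models rcu_read_lock(): the walker enters its read-side section. */
static void toy_read_lock(struct toy_state *s)
{
	s->readers++;
}

/* Models rcu_read_unlock(): the walker leaves its read-side section. */
static void toy_read_unlock(struct toy_state *s)
{
	assert(s->readers > 0);
	s->readers--;
}

/* Walker step: returns the PTE page, or NULL if the entry was cleared. */
static struct toy_pte_page *toy_lookup(const struct toy_state *s)
{
	return s->pmd;
}

/* Teardown: clear the entry first, then queue the page for a deferred free. */
static void toy_teardown(struct toy_state *s)
{
	s->pending = s->pmd;
	s->pmd = NULL;
}

/*
 * Free worker: release the queued page only when no walker is inside a
 * read-side section. Returns true if the page was released.
 */
static bool toy_try_reclaim(struct toy_state *s)
{
	if (s->readers > 0 || s->pending == NULL)
		return false;
	s->pending->freed = true;
	s->pending = NULL;
	return true;
}
```

A caller would take toy_read_lock(), fetch the page with toy_lookup(),
and observe that toy_try_reclaim() keeps returning false after
toy_teardown() until toy_read_unlock() runs. A walker that starts after
the teardown gets NULL from toy_lookup() and skips the page.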