From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 072F322D9EB; Fri, 1 Aug 2025 13:26:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754054802; cv=none; b=S8uQZqdZioLuR+JaX/TjAwXd1vX4UuJRzVNU6wkzkihbmpxnIobSweJLr+He7V206VH83W8Thk833w4loORCOrg57KmstVhhCBSOBGkb2xLjxUJeSyjvWFfpFQlhlECGAPKhBDsVHQkq1fsXgiv7urkU2Ta0XOrQStR4E67rmxI= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754054802; c=relaxed/simple; bh=/ZS8Ia0VupnzJGFGSAG2VKydIIX0xcJaBtsknSGclrY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=UjmPrGxauLM6B0xlDL+xcmpvipGTEhAvgYjuAkvQasyxSKH8cDVAwb6rE/DCpnpprtGCDIubyNeTMSMyf78K1LIGSNuXL4RWq5n4dbmh6WqJNvAG+rc/MoMKbYDSNDBxOzu3EbYKoa8/F+GiBTWA4XYEGmvaiNpiCZHbZ78DCoM= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Y6GzlzgU; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Y6GzlzgU" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3F080C4CEF6; Fri, 1 Aug 2025 13:26:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1754054801; bh=/ZS8Ia0VupnzJGFGSAG2VKydIIX0xcJaBtsknSGclrY=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=Y6GzlzgUA0rcyWqKvroZW0tg8EN52RnGrzprPaOd0jOC+Dvw5E267dFvhKU1xZqBh mMVACQx1kqstD1LAdN7d8bkiAWPjXO7JIRX3zOEkXUwLmuDiLw8Og2mDOcF63b3Ldh iiFIa+ZTRZn4LW9upGOuC2WteE5mKqeWde9dxFDzPzHrvuXESWq23CKygGrQO8aG8B suLa4C7Vqdx14fmIKI75doF0XZQn+cPZrMWiCp+PqpKBpBm/9eEGg5x3wtKjGf/gnB mEWU/daa8yenSITT6K2vVxt0P6ifhKxHDBbCa7NXLpAAfAyMyTUZepQn/6PeVFm1Bs 219dGjfygNTKw== Date: Fri, 1 Aug 2025 09:26:39 -0400 From: Sasha Levin To: David Hildenbrand Cc: Andrew Morton , peterx@redhat.com, aarcange@redhat.com, surenb@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org Subject: Re: [PATCH] mm/userfaultfd: fix missing PTE unmap for non-migration entries Message-ID: References: <20250630031958.1225651-1-sashal@kernel.org> <20250630175746.e52af129fd2d88deecc25169@linux-foundation.org> <214e78a0-7774-4b1e-8d85-9a66d2384744@redhat.com> Precedence: bulk X-Mailing-List: stable@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: <214e78a0-7774-4b1e-8d85-9a66d2384744@redhat.com> On Thu, Jul 31, 2025 at 02:56:25PM +0200, David Hildenbrand wrote: >On 31.07.25 14:37, Sasha Levin wrote: >>On Tue, Jul 08, 2025 at 05:42:16PM +0200, David Hildenbrand wrote: >>>On 08.07.25 17:33, Sasha Levin wrote: >>>>On Tue, Jul 08, 2025 at 05:10:44PM +0200, David Hildenbrand wrote: >>>>>On 01.07.25 02:57, Andrew Morton wrote: >>>>>>On Sun, 29 Jun 2025 23:19:58 -0400 Sasha Levin wrote: >>>>>> >>>>>>>When handling non-swap entries in move_pages_pte(), the error handling >>>>>>>for entries that are NOT migration entries fails to unmap the page table >>>>>>>entries before jumping to the error handling label. >>>>>>> >>>>>>>This results in a kmap/kunmap imbalance which on CONFIG_HIGHPTE systems >>>>>>>triggers a WARNING in kunmap_local_indexed() because the kmap stack is >>>>>>>corrupted. >>>>>>> >>>>>>>Example call trace on ARM32 (CONFIG_HIGHPTE enabled): >>>>>>> WARNING: CPU: 1 PID: 633 at mm/highmem.c:622 kunmap_local_indexed+0x178/0x17c >>>>>>> Call trace: >>>>>>> kunmap_local_indexed from move_pages+0x964/0x19f4 >>>>>>> move_pages from userfaultfd_ioctl+0x129c/0x2144 >>>>>>> userfaultfd_ioctl from sys_ioctl+0x558/0xd24 >>>>>>> >>>>>>>The issue was introduced with the UFFDIO_MOVE feature but became more >>>>>>>frequent with the addition of guard pages (commit 7c53dfbdb024 ("mm: add >>>>>>>PTE_MARKER_GUARD PTE marker")) which made the non-migration entry code >>>>>>>path more commonly executed during userfaultfd operations. >>>>>>> >>>>>>>Fix this by ensuring PTEs are properly unmapped in all non-swap entry >>>>>>>paths before jumping to the error handling label, not just for migration >>>>>>>entries. >>>>>> >>>>>>I don't get it. >>>>>> >>>>>>>--- a/mm/userfaultfd.c >>>>>>>+++ b/mm/userfaultfd.c >>>>>>>@@ -1384,14 +1384,15 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd, >>>>>>> entry = pte_to_swp_entry(orig_src_pte); >>>>>>> if (non_swap_entry(entry)) { >>>>>>>+ pte_unmap(src_pte); >>>>>>>+ pte_unmap(dst_pte); >>>>>>>+ src_pte = dst_pte = NULL; >>>>>>> if (is_migration_entry(entry)) { >>>>>>>- pte_unmap(src_pte); >>>>>>>- pte_unmap(dst_pte); >>>>>>>- src_pte = dst_pte = NULL; >>>>>>> migration_entry_wait(mm, src_pmd, src_addr); >>>>>>> err = -EAGAIN; >>>>>>>- } else >>>>>>>+ } else { >>>>>>> err = -EFAULT; >>>>>>>+ } >>>>>>> goto out; >>>>>> >>>>>>where we have >>>>>> >>>>>>out: >>>>>> ... >>>>>> if (dst_pte) >>>>>> pte_unmap(dst_pte); >>>>>> if (src_pte) >>>>>> pte_unmap(src_pte); >>>>> >>>>>AI slop? >>>> >>>>Nah, this one is sadly all me :( >>> >>>Haha, sorry :P >> >>So as I was getting nowhere with this, I asked AI to help me :) >> >>If you're not interested in reading LLM generated code, feel free to >>stop reading now... >> >>After it went over the logs, and a few prompts to point it the right >>way, it ended up generating a patch (below) that made sense, and fixed >>the warning that LKFT was being able to trigger. >> >>If anyone who's more familiar with the code than me (and the AI) agrees >>with the patch and ways to throw their Reviewed-by, I'll send out the >>patch. > >Seems to check out for me. In particular, out pte_unmap() everywhere >else in that function (and mremap.c:move_ptes) are ordered properly. > >Even if it would not fix the issue, it would be a cleanup :) > >Acked-by: David Hildenbrand David, I ended up LLM generating a .cocci script to detect this type of issues, and it ended up detecting a similar issue in arch/loongarch/mm/init.c. Would you be open to reviewing both the .cocci script as well as the loongarch fix? -- Thanks, Sasha