Date: Fri, 10 Apr 2026 11:26:55 -0400
From: Peter Xu
To: Mike Rapoport
Cc: David CARLIER, Andrew Morton, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Lorenzo Stoakes, "Liam R.
    Howlett", Vlastimil Babka, Andrea Arcangeli
Subject: Re: [PATCH v4] mm/userfaultfd: detect VMA replacement after copy retry in mfill_copy_folio_retry()
References: <20260331134158.622084-1-devnexen@gmail.com> <20260331200148.cc0c95deaf070579a68af041@linux-foundation.org>

On Thu, Apr 09, 2026 at 02:31:46PM +0300, Mike Rapoport wrote:
> Hi Peter,

Hi, Mike,

>
> On Thu, Apr 02, 2026 at 09:42:01AM -0400, Peter Xu wrote:
> > Hi, Mike,
> >
> > Let me also leave my comments inline just for you to consider.
> >
> > On Thu, Apr 02, 2026 at 06:58:33AM +0300, Mike Rapoport wrote:
> > > Hi David,
> > >
> > > It feels that you use an LLM for correspondence. Please tune it down to
> > > produce more laconic and to the point responses.
> > >
> > > On Wed, Apr 01, 2026 at 09:06:36AM +0100, David CARLIER wrote:
> > > > On Tue, Apr 01, 2026 at 08:49:00AM +0300, Mike Rapoport wrote:
> > > > > What does "folio allocated from the original VMA's backing store" exactly
> > > > > mean? Why is this a problem?
> > > >
> > > > Fair point, the commit message was vague here. What I meant is:
> > > >
> > > > mfill_atomic_pte_copy() captures ops = vma_uffd_ops(state->vma) and
> > > > passes it to __mfill_atomic_pte(). There, ops->alloc_folio() allocates
> > > > a folio for the original VMA's inode (e.g. a shmem folio for that
> > > > specific shmem inode).
> > >
> > > I wouldn't say ->alloc_folio() allocates a folio _for_ the inode, it
> > > allocates it with inode's memory policy. Worst can happen without any
> > > changes is that the allocated folio will end up in a wrong node.
> >
> > For shmem it's only about mempolicy indeed, but since we're trying to
> > export it as an API in the series, IMHO it would be nice to be generic.
> > So we shouldn't assume it's only about mempolicy, we should rely on
> > detecting any context change and bail out with -EAGAIN, relying all rest
> > checks to the next UFFDIO_COPY ioctl done on top of the new mapping
> > topology.
>
> My point was that this is preexisting bug and that we don't need to rush
> with the complete fix that will extensively compare VMA compatibility...

Yes, I fully agree it was pre-existing. My guess is that we just haven't
reached a consensus yet on how to fix it completely, and on whether we need
an intermediate fix only for the "VMA suddenly changed to hugetlb" case.

>
> > > This is still a footgun, but I don't see it as a big deal.
> >
> > IIUC this is a real bug reported. Actually, if my understanding is
> > correct, we should be able to easily write a reproducer by registering the
> > src addr of UFFDIO_COPY to userfaultfd too, then the ioctl(UFFDIO_COPY)
> > thread will get blocked faulting in the src_addr. During that, we can
> > change the VMA layout in another thread to test whatever setup we want.
> >
> > > Let's revisit it after -rc1 and please make sure to cc "MEMORY MAPPING"
> > > folks for insights about how to better track VMA changes or their absence.
> >
> > No strong feeling here if we want to slightly postpone this fix. It looks
> > like not easy to happen as it looks to be a bug present for a while, indeed.
> >
> > It's just that if my understanding is correct, with above reproducer we can
> > crash the kernel easily without a proper fix.
>
> ... but we do need a more urgent fix for the case when a VMA suddenly
> becomes hugetlb, because that could not happen before the refactoring.

Personally, this is the least of my concerns. Hugetlbfs is managed so
specially in user applications that it is even less likely to trigger the
same bug than VM_SOFTDIRTY changes or other paths.

But I understand your point: you want to cover what your series changed
here. If you think this is the right way to go, I'll follow your decision.
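
For illustration only, here is a small userspace model of the ->ops
comparison discussed in this thread: snapshot the ops before the window in
which the lock is dropped for copy_from_user(), look the VMA up again
afterwards, and bail out with -EAGAIN if the ops differ. This is not the
kernel code and not the proposed patch. Every model_* name is made up for
the sketch, and locking is only described in comments.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
struct model_uffd_ops {
	const char *name;
};

struct model_vma {
	const struct model_uffd_ops *ops;
};

/* Models the address space: which VMA currently covers the dst address. */
struct model_mm {
	struct model_vma *dst_vma;
};

static const struct model_uffd_ops model_shmem_ops = { "shmem" };
static const struct model_uffd_ops model_hugetlb_ops = { "hugetlb" };
static struct model_vma model_hugetlb_vma = { &model_hugetlb_ops };

/* Runs while the (modelled) lock is dropped, i.e. during the user copy. */
typedef void (*model_unlocked_fn)(struct model_mm *mm);

/* Another thread replaces the destination mapping with a hugetlb one. */
static void model_replace_with_hugetlb(struct model_mm *mm)
{
	mm->dst_vma = &model_hugetlb_vma;
}

/* Returns 0 if the ops are unchanged across the window, -EAGAIN otherwise. */
static int model_copy_retry(struct model_mm *mm, model_unlocked_fn unlocked)
{
	const struct model_uffd_ops *snap;

	assert(mm != NULL && mm->dst_vma != NULL);

	snap = mm->dst_vma->ops;	/* snapshot taken "under the lock" */

	if (unlocked)
		unlocked(mm);		/* lock dropped: the mapping may change */

	/* "Lock re-taken": look the VMA up again and compare. */
	if (mm->dst_vma->ops != snap)
		return -EAGAIN;		/* caller is expected to retry the copy */

	return 0;
}
```

In this model a retry simply re-runs the check against the new mapping:
calling model_copy_retry() again after the replacement returns 0, because
the snapshot is then taken from the hugetlb VMA.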
>
> For that, it would be enough to check that ->ops are the same before and
> after copy_from_user().
>
> @David, do you mind to send a patch for this without waiting for rc1?
>
> > > > The vm_flags comparison was a secondary guard against permission/type
> > > > changes during the window.
> > >
> > > Permissions should be fine, they are checked in userfaultfd_register.
> > > Some other flags that don't matter to uffd operation may change during the
> > > window, though and then a comparison of vm_flags will give a false
> > > positive.
> >
> > IMHO false positive is fine in this case when -EAGAIN will be used (which I
> > still think we should), if it only causes a retry.
>
> I still disagree, but let's postpone this discussion for later, when David
> resends the patch that compares VMA properties.

Maybe it's because we have different checks in mind: you want to check only
the "invalid cases", while I am trying to make it fall back on all
detectable changes.

IMHO it's hard to define an "invalid case" here, and it's also unnecessary
when an EAGAIN will be handled with another UFFDIO_COPY attempt that simply
redoes all the checks. Hence relying on detecting any VMA change looks like
the simplest and safest option to me.

Again, I'm open to any suggestion for replacing the VMA snapshot logic, as
long as all the possible issues that were reported are properly fixed,
especially in the latest mainline. I'm not too worried about backporting
yet; if this bug has existed for 10 years and nobody has reported it so far,
we can always evaluate later whether to backport or skip it. A proper fix
in mainline is IMHO more important.

Thanks,

--
Peter Xu