Date: Fri, 5 Jun 2026 10:55:43 -0400
From: "Michael S. Tsirkin"
To: "Garg, Shivank"
Cc: linux-kernel@vger.kernel.org, Sean Christopherson, Paolo Bonzini, David Hildenbrand, Vlastimil Babka, kvm@vger.kernel.org
Subject: Re: [PATCH] KVM: guest_memfd: fix NUMA interleave index double-counting
Message-ID: <20260605105455-mutt-send-email-mst@kernel.org>
References: <0eff0a90667b900bee837d06b5db5025e1f304b5.1780501924.git.mst@redhat.com>
 <916681a5-dd66-4773-a46f-2273a72c11cf@amd.com>
 <20260604194613-mutt-send-email-mst@kernel.org>
 <42c42370-2cf1-4b98-8d6a-8d7cd62f95f4@amd.com>
In-Reply-To: <42c42370-2cf1-4b98-8d6a-8d7cd62f95f4@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 05, 2026 at 06:31:51PM +0530, Garg, Shivank wrote:
>
>
> On 6/5/2026 5:16 AM, Michael S. Tsirkin wrote:
> >
> > On Thu, Jun 04, 2026 at 12:21:15AM +0530, Garg, Shivank wrote:
> >>
> >>
> >> On 6/3/2026 9:27 PM, Michael S. Tsirkin wrote:
> >>> kvm_gmem_get_policy() sets *ilx to the full page offset
> >>> (vm_pgoff + vma offset). But get_vma_policy() adds the page
> >>> offset on top of *ilx, so the offset is counted twice. This
> >>> causes NUMA interleaving to skip nodes: for order-0 pages the
> >>> effective index jumps by 2 for each consecutive page.
> >>>
> >>> The get_policy vm_op should return only a per-file bias in *ilx
> >>> (like shmem_get_policy does with inode->i_ino), letting
> >>> get_vma_policy() add the page-offset component.
> >>>
> >>> Fix by setting *ilx to inode->i_ino instead of the full page
> >>> offset. The page offset is computed by get_vma_policy() in
> >>> mm/mempolicy.c. The full offset is still computed
> >>> in kvm_gmem_get_policy() for mpol_shared_policy_lookup().
> >>> shmem_get_policy() follows the same pattern.
> >>>
> >>> Found by Sashiko (sashiko.dev) AI code review.
> >>>
> >>> Fixes: ed1ffa810bd6 ("KVM: guest_memfd: Enforce NUMA mempolicy using shared policy")
> >>> Cc: Sean Christopherson
> >>> Cc: Paolo Bonzini
> >>> Assisted-by: Claude:claude-opus-4-6
> >>> Signed-off-by: Michael S. Tsirkin
> >>> ---
> >>>  virt/kvm/guest_memfd.c | 7 ++++---
> >>>  1 file changed, 4 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> >>> index 69c9d6d546b2..0bcf6fc08e2d 100644
> >>> --- a/virt/kvm/guest_memfd.c
> >>> +++ b/virt/kvm/guest_memfd.c
> >>> @@ -438,11 +438,12 @@ static int kvm_gmem_set_policy(struct vm_area_struct *vma, struct mempolicy *mpo
> >>>  }
> >>>
> >>>  static struct mempolicy *kvm_gmem_get_policy(struct vm_area_struct *vma,
> >>> -					     unsigned long addr, pgoff_t *pgoff)
> >>> +					     unsigned long addr, pgoff_t *ilx)
> >>>  {
> >>>  	struct inode *inode = file_inode(vma->vm_file);
> >>> +	pgoff_t pgoff = vma->vm_pgoff + ((addr - vma->vm_start) >> PAGE_SHIFT);
> >>>
> >>> -	*pgoff = vma->vm_pgoff + ((addr - vma->vm_start) >> PAGE_SHIFT);
> >>> +	*ilx = inode->i_ino;
> >>>
> >>>  	/*
> >>>  	 * Return the memory policy for this index, or NULL if none is set.
> >>> @@ -453,7 +454,7 @@ static struct mempolicy *kvm_gmem_get_policy(struct vm_area_struct *vma,
> >>>  	 * can then replace NULL with the default memory policy instead of the
> >>>  	 * current task's memory policy.
> >>>  	 */
> >>> -	return mpol_shared_policy_lookup(&GMEM_I(inode)->policy, *pgoff);
> >>> +	return mpol_shared_policy_lookup(&GMEM_I(inode)->policy, pgoff);
> >>>  }
> >>>  #endif /* CONFIG_NUMA */
> >>>
> >>> --
> >>> MST
> >>>
> >>
> >> Thanks for fixing this. LGTM!
> >>
> >> Reviewed-by: Shivank Garg
> >
> >
> > Can u actually test it though pls?
> > Because I think another patch I sent in response to Sashiko
> > is also needed.
>
> Hi Michael,
>
> Yes, I tested this.
>
> I used kretprobes to read *ilx on each kvm_gmem_get_policy() call, while
> calling get_mempolicy(MPOL_F_ADDR) on consecutive offsets (0..7) of a
> guest_memfd mapping:
>
> BEFORE:
> page offset: 0 1 2 3 4 5 6 7
> *ilx:        0 1 2 3 4 5 6 7
>
> get_vma_policy() then adds the page offset on top again, so the
> effective index increases with a stride of 2.
>
> AFTER fix:
> page offset: 0      1      2      3      ... 7
> *ilx:        128376 128376 128376 128376 ... 128376
>
> It stores i_ino, so after get_vma_policy() the index increases by just 1.
>
> It's hard to show any wrong allocation with the bug because this index
> value is not used by the allocation path, which uses NO_INTERLEAVE_INDEX.
>
> Tested-by: Shivank Garg
>
> Thanks,
> Shivank
>

So for this to be useful at all we do need the patch I sent in
response to sashiko, right? Mind trying out that one?

--
MST