From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Huang, Ying" <ying.huang@linux.alibaba.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,  David Hildenbrand
 <david@kernel.org>,  Lorenzo Stoakes <ljs@kernel.org>,  "Liam R . Howlett"
 <Liam.Howlett@oracle.com>,  Vlastimil Babka <vbabka@kernel.org>,  Mike
 Rapoport <rppt@kernel.org>,  Suren Baghdasaryan <surenb@google.com>,
  Michal Hocko <mhocko@suse.com>,  Zi Yan <ziy@nvidia.com>,  Matthew Brost
 <matthew.brost@intel.com>,  Joshua Hahn <joshua.hahnjy@gmail.com>,  Rakie
 Kim <rakie.kim@sk.com>,  Byungchul Park <byungchul@sk.com>,  Gregory Price
 <gourry@gourry.net>,  Alistair Popple <apopple@nvidia.com>,  Axel
 Rasmussen <axelrasmussen@google.com>,  Yuanchu Xie <yuanchu@google.com>,
  Wei Xu <weixugc@google.com>,  Chris Li <chrisl@kernel.org>,  Kairui Song
 <kasong@tencent.com>,  Kemeng Shi <shikemeng@huaweicloud.com>,  Nhat Pham
 <nphamcs@gmail.com>,  Baoquan He <bhe@redhat.com>,  Barry Song
 <baohua@kernel.org>,  LKML <linux-kernel@vger.kernel.org>,
  linux-mm@kvack.org
Subject: Re: [RFC PATCH 2/2] mm/migrate: wait for folio refcount during
 longterm pin migration
In-Reply-To: <20260410032333.400406-3-jhubbard@nvidia.com> (John Hubbard's
	message of "Thu, 9 Apr 2026 20:23:33 -0700")
References: <20260410032333.400406-1-jhubbard@nvidia.com>
	<20260410032333.400406-3-jhubbard@nvidia.com>
Date: Tue, 21 Apr 2026 17:21:43 +0800
Message-ID: <87cxzsis7s.fsf@DESKTOP-5N7EMDA>
User-Agent: Gnus/5.13 (Gnus v5.13)
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=ascii

John Hubbard <jhubbard@nvidia.com> writes:

> When migrating pages for FOLL_LONGTERM pinning (MR_LONGTERM_PIN), the
> migration can fail with -EAGAIN if the folio has unexpected references.
> These references are often transient (e.g., from GPU operations like
> cuMemset that will complete shortly).
>
> Previously, the migration code would retry up to 10 times
> (NR_MAX_MIGRATE_PAGES_RETRY), but this busy-retry approach failed when
> the transient reference holder needed more time than the retry loop
> provides.
>
> Fix this by waiting up to one second for the folio's refcount to drop
> to the expected value before retrying migration. The wait uses
> wait_var_event_timeout() paired with the wake_up_var() calls added to
> folio_put() in the previous commit. If the timeout expires, the
> existing retry loop continues as before. The folio_put_wakeup_key
> static key is enabled for the duration of migrate_pages() so that
> folio_put() only wakes waiters when migration is active.
>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  mm/migrate.c | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 2c3d489ecf51..a5d9f85aa376 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -47,6 +47,8 @@
>  #include <asm/tlbflush.h>
>  
>  #include <trace/events/migrate.h>
> +#include <linux/jump_label.h>
> +#include <linux/wait_bit.h>
>  
>  #include "internal.h"
>  #include "swap.h"
> @@ -1732,6 +1734,17 @@ static void migrate_folios_move(struct list_head *src_folios,
>  			*retry += 1;
>  			*thp_retry += is_thp;
>  			*nr_retry_pages += nr_pages;
> +			/*
> +			 * For longterm pinning, wait for references
> +			 * to be released before retrying.
> +			 */
> +			if (reason == MR_LONGTERM_PIN) {
> +				int expected = folio_expected_ref_count(folio) + 1;
> +
> +				wait_var_event_timeout(&folio->_refcount,
> +					folio_ref_count(folio) <= expected,
> +					HZ);
> +			}
>  			break;
>  		case 0:
>  			stats->nr_succeeded += nr_pages;
> @@ -1941,6 +1954,17 @@ static int migrate_pages_batch(struct list_head *from,
>  				retry++;
>  				thp_retry += is_thp;
>  				nr_retry_pages += nr_pages;
> +				/*
> +				 * For longterm pinning, wait for references
> +				 * to be released.
> +				 */
> +				if (reason == MR_LONGTERM_PIN) {
> +					int expected = folio_expected_ref_count(folio) + 1;
> +
> +					wait_var_event_timeout(&folio->_refcount,
> +							folio_ref_count(folio) <= expected,
> +							HZ);
> +				}
>  				break;
>  			case 0:
>  				list_move_tail(&folio->lru, &unmap_folios);
> @@ -2085,6 +2109,9 @@ int migrate_pages(struct list_head *from, new_folio_t get_new_folio,
>  
>  	memset(&stats, 0, sizeof(stats));
>  
> +	if (reason == MR_LONGTERM_PIN)
> +		static_branch_inc(&folio_put_wakeup_key);
> +

The static_branch_inc() should be done in migrate_pages_sync(), before
the sync loop, rather than here.

>  	rc_gather = migrate_hugetlbs(from, get_new_folio, put_new_folio, private,
>  				     mode, reason, &stats, &ret_folios);
>  	if (rc_gather < 0)
> @@ -2137,6 +2164,9 @@ int migrate_pages(struct list_head *from, new_folio_t get_new_folio,
>  	if (!list_empty(from))
>  		goto again;
>  out:
> +	if (reason == MR_LONGTERM_PIN)
> +		static_branch_dec(&folio_put_wakeup_key);
> +
>  	/*
>  	 * Put the permanent failure folio back to migration list, they
>  	 * will be put back to the right list by the caller.

---
Best Regards,
Huang, Ying