From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 21 Apr 2026 15:52:06 +1000
From: Alistair Popple
To: John Hubbard
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
    "Liam R . Howlett", Vlastimil Babka, Mike Rapoport,
    Suren Baghdasaryan, Michal Hocko, Zi Yan, Matthew Brost,
    Joshua Hahn, Rakie Kim, Byungchul Park, Gregory Price, Ying Huang,
    Axel Rasmussen, Yuanchu Xie, Wei Xu, Chris Li, Kairui Song,
    Kemeng Shi, Nhat Pham, Baoquan He, Barry Song, LKML,
    linux-mm@kvack.org
Subject: Re: [RFC PATCH 0/2] mm/migrate: wait for folio refcount during longterm pin migration
References: <20260410032333.400406-1-jhubbard@nvidia.com>
In-Reply-To: <20260410032333.400406-1-jhubbard@nvidia.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On 2026-04-10 at 13:23 +1000, John Hubbard wrote...
> Hi,
>
> This adds a bounded sleep to migration so that FOLL_LONGTERM pinning can
> wait for transient folio references to drain, instead of failing after a
> fixed number of retries. The wait uses a one-second timeout. An
> alternative approach would be to call wait_var_event_killable() with no
> timeout, but that doesn't match as well with migration's "this will
> probably work" API. In other words, a short sleeping wait is more
> appropriate here.

This is much better than retrying $RANDOM times. It also seems it would
provide a nice definition of what a transient vs. longterm pin is. Any
pins longer than the migration timeout would be longterm.

> When migrating pages for FOLL_LONGTERM pinning, migration can fail with
> -EAGAIN if a folio has unexpected references. These references are often
> transient, but the current retry loop gives up too quickly. This series
> adds wait_var_event_timeout() at the retry points, paired with
> wake_up_var() in folio_put() to wake the sleeper as soon as the refcount
> drops.

Nothing wrong with the above, just a minor nit that I wanted to check my
understanding of. FOLL_LONGTERM causing migration implies this is in
ZONE_MOVABLE, and the aim of ZONE_MOVABLE is that memory is always
movable. That implies any unexpected page references should *always* be
transient, not often transient. At least that's my understanding assuming
drivers are behaving.

> The wake_up_var() calls in folio_put() are gated behind a static key,
> disabled by default, so non-migration workloads pay zero cost.
> migrate_pages() enables the key on entry when the reason is
> MR_LONGTERM_PIN, and disables it on exit.
>
> Toggling the key is not free. folio_put() is static inline, so every
> compilation unit that calls it gets its own patch site (roughly 500 in
> vmlinux, plus modules). On x86, jump label patching is batched (256
> sites per batch, 3 IPI rounds per batch), so enabling the key costs
> 6-9 IPI broadcasts, a few hundred microseconds on a large machine.
> That cost is paid twice per migrate_pages() call. Migration itself
> spends several milliseconds per batch on LRU isolation, TLB flushes,
> and page copies. Concurrent longterm-pin migrations after the first
> just do an atomic_inc (no patching).
>
> Matthew Brost offered to performance-test this series [1], as Intel has
> tests that stress migration and good metrics to catch regressions.
>
> [1] https://lore.kernel.org/all/aX+oUorOWPt1xbgw@lstrano-desk.jf.intel.com/
>
> John Hubbard (2):
>   mm: wake up folio refcount waiters on folio_put()
>   mm/migrate: wait for folio refcount during longterm pin migration
>
>  include/linux/mm.h |  8 ++++++++
>  mm/migrate.c       | 30 ++++++++++++++++++++++++++++++
>  mm/swap.c          | 10 +++++++++-
>  3 files changed, 47 insertions(+), 1 deletion(-)
>
>
> base-commit: 9a9c8ce300cd3859cc87b408ef552cd697cc2ab7
> --
> 2.53.0
>
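
As a sanity check on the "6-9 IPI broadcasts" figure in the quoted cover letter, here is the arithmetic as a small userspace sketch. It takes the cover letter's numbers at face value (256 sites per batch, 3 IPI rounds per batch, roughly 500 sites in vmlinux). The function and macro names are made up for illustration and are not kernel APIs, and the 700-site case is a hypothetical "vmlinux plus some modules" count, not a measurement.

```c
#include <assert.h>

/* Figures quoted in the cover letter; not read from kernel headers. */
#define SITES_PER_BATCH       256u
#define IPI_ROUNDS_PER_BATCH  3u

/* Hypothetical helper: IPI rounds needed to patch 'sites' jump label sites once. */
unsigned int ipi_rounds_for_toggle(unsigned int sites)
{
	/* Round up: a partially filled batch still costs a full set of rounds. */
	unsigned int batches = (sites + SITES_PER_BATCH - 1) / SITES_PER_BATCH;

	return batches * IPI_ROUNDS_PER_BATCH;
}

/* Usage: the two ends of the quoted 6-9 range. */
void ipi_rounds_examples(void)
{
	assert(ipi_rounds_for_toggle(500) == 6); /* ~500 sites: 2 batches */
	assert(ipi_rounds_for_toggle(700) == 9); /* made-up count with modules: 3 batches */
}
```

Since the key is enabled on entry and disabled on exit, a lone migrate_pages() call would pay this twice under the same assumptions.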
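
The remark that concurrent longterm-pin migrations after the first "just do an atomic_inc (no patching)" suggests a counted key, where only the 0 -> 1 and 1 -> 0 transitions patch code. A toy userspace model of that counting, assuming that is how the key is used (the cover letter does not show the calls): all names here are hypothetical, the model is single-threaded, and it ignores the locking a real implementation needs around the patching itself.

```c
#include <assert.h>
#include <stdatomic.h>

/* Toy stand-in for a counted static key; not the kernel's struct static_key. */
struct toy_key {
	atomic_int enabled;  /* number of current users of the key */
	int patch_events;    /* how many times code would have been patched */
};

void toy_key_init(struct toy_key *k)
{
	atomic_init(&k->enabled, 0);
	k->patch_events = 0;
}

/* First user (0 -> 1) pays for patching; later users only bump the count. */
void toy_key_inc(struct toy_key *k)
{
	if (atomic_fetch_add(&k->enabled, 1) == 0)
		k->patch_events++;
}

/* Last user (1 -> 0) pays for patching the sites back. */
void toy_key_dec(struct toy_key *k)
{
	if (atomic_fetch_sub(&k->enabled, 1) == 1)
		k->patch_events++;
}

/* Usage: overlapping migrations share one enable/disable pair. */
void toy_key_examples(void)
{
	struct toy_key k;

	toy_key_init(&k);
	toy_key_inc(&k);  /* migration A enters: patch */
	toy_key_inc(&k);  /* migration B enters: count only */
	toy_key_dec(&k);  /* A exits: count only */
	toy_key_dec(&k);  /* B exits: patch */
	assert(k.patch_events == 2);
}
```

Back-to-back, non-overlapping migrations would each pay both patch events in this model, which is the case the per-call cost estimate in the cover letter describes.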