From: David Hildenbrand
Organization: Red Hat
Date: Wed, 5 Mar 2025 20:22:26 +0100
Subject: Re: [RFC PATCH 0/2] SKSM: Synchronous Kernel Samepage Merging
To: Mathieu Desnoyers, Peter Xu
Cc: Linus Torvalds, Andrew Morton, linux-kernel@vger.kernel.org, Matthew Wilcox, Olivier Dion, linux-mm@kvack.org
In-Reply-To: <08506527-5d0b-44c7-9d09-a4d53b2fda2d@efficios.com>

On 05.03.25 15:06, Mathieu Desnoyers wrote:
> On 2025-03-03 15:49, David Hildenbrand wrote:
>> On 03.03.25 21:01, Mathieu Desnoyers wrote:
>>> On 2025-02-28 17:32, Peter Xu wrote:
>>>> On Fri, Feb 28, 2025 at 12:53:02PM -0500, Mathieu Desnoyers wrote:
>>>>> On 2025-02-28 11:32, Peter Xu wrote:
>>>>>> On Fri, Feb 28, 2025 at 09:59:00AM -0500, Mathieu Desnoyers wrote:
>>>>>>> For the VM use-case, I wonder if we could just add a userfaultfd
>>>>>>> "COW" event that would notify userspace when a COW happens ?
>>>>>>
>>>>>> I don't know what's the best for KSM and how well this will work,
>>>>>> but we have such event for years..  See UFFDIO_REGISTER_MODE_WP:
>>>>>>
>>>>>> https://man7.org/linux/man-pages/man2/userfaultfd.2.html
>>>>>
>>>>> userfaultfd UFFDIO_REGISTER only seems to work if I pass an address
>>>>> resulting from a mmap mapping, but returns EINVAL if I pass a
>>>>> page-aligned address which sits within a private file mapping
>>>>> (e.g. executable data).
>>>>
>>>> Yes, so far sync traps only supports RAM-based file systems, or
>>>> anonymous. Generic private file mappings (that stores executables
>>>> and libraries) are not yet supported.
>>>>
>>>>>
>>>>> Also, I notice that do_wp_page() only calls handle_userfault
>>>>> VM_UFFD_WP when vm_fault flags does not have FAULT_FLAG_UNSHARE
>>>>> set.
>>>>
>>>> AFAICT that's expected, unshare should only be set on reads, never
>>>> writes. So uffd-wp shouldn't trap any of those.
>>>>
>>>>>
>>>>> AFAIU, as it stands now userfaultfd would not help tracking COW faults
>>>>> caused by stores to private file mappings. Am I missing something ?
>>>>
>>>> I think you're right.  So we have UFFD_FEATURE_WP_ASYNC that should
>>>> work on most mappings.  That one is async, though, so more like
>>>> soft-dirty.  It might be doable to try making it sync too without a
>>>> lot of changes based on how async tracking works.
>>>
>>> I'm looking more closely at admin-guide/mm/pagemap.rst and it appears to
>>> be a good fit. Here is what I have in mind to replace the ksmd scanning
>>> thread for the VM use-case by a purely user-space driven scanning:
>>>
>>> Within qemu or similar user-space process:
>>>
>>> 1) Track guest memory with the userfaultfd UFFD_FEATURE_WP_ASYNC
>>>    feature and UFFDIO_REGISTER_MODE_WP mode.
>>>
>>> 2) Protect user-space memory with the PAGEMAP_SCAN ioctl
>>>    PM_SCAN_WP_MATCHING flag to detect memory which stays invariant
>>>    for a long time.
>>>
>>> 3) Use the PAGEMAP_SCAN ioctl with PAGE_IS_WRITTEN to detect which
>>>    pages are written to. Keep track of memory which is frequently
>>>    modified, so it can be left alone and not write-protected nor
>>>    merged anymore.
>>>
>>> 4) Whenever pages stay invariant for a given lapse of time, merge them
>>>    with the new madvise(2) KSM_MERGE behavior.
>>>
>>> Let me know if that makes sense.
>>
>> Note that one of the strengths of ksm in the kernel right now is that we
>> write-protect + try-deduplicate only when we are fairly sure that we can
>> deduplicate (unstable tree), and that the interaction with THPs / large
>> folios is fairly well thought-through.
>>
>> Also note that, just because data hasn't been written in some time
>> interval, doesn't mean that it should be deduplicated and result in CoW
>> on next write access.
>
> Right. This tracking of address range access pattern would have to be
> implemented in user-space.
>
>> One probably would have to mimic what the KSM implementation in the
>> kernel does, and build something like the unstable tree, to find
>> candidates where we can actually deduplicate. Then, have a way to
>> not-deduplicate if the content changed.
>
> With madvise MADV_MERGE, there is no need to "unmerge". The merge
> write-protects the page and merges its content at the time of the
> MADV_MERGE with exact duplicates, and keeps that write protected page in
> a global hash table indexed by checksum.

Right, and that's a real problem.

>
> However, unlike KSM, it won't track that range on an ongoing basis.
>
> "Unmerging" the page is done naturally by writing to the merged address
> range. Because it is write-protected, this will trigger COW, and will
> therefore provide a new anonymous page to the process, thus "unmerging"
> that page.
>
> It's really just up to userspace to track COW faults and figure out
> that it really should not try to merge that range anymore, based on
> the access pattern monitored through write-protection faults.
>

Just to be clear, what you described here is very likely not a feasible
replacement, performance-wise, for the in-tree ksm for the VM use case
(again, the thing that was primarily invented for VMs).

-- 
Cheers,

David / dhildenb
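
Illustrative sketch (not part of the original message): the thread suggests
that a user-space scanner would have to mimic something like KSM's unstable
tree, so that pages are write-protected and merged only when a duplicate is
likely to exist. The Python below is a rough model of such a policy under
stated assumptions. It is not the kernel's algorithm and does not call
userfaultfd, PAGEMAP_SCAN, or madvise. The names CandidateTracker, scan, and
checksum are hypothetical, blake2b stands in for whatever checksum a real
implementation would use, and a dict replaces the tree. A page is proposed
only if its checksum is unchanged since the previous scan and another stable
page has byte-identical content.

```python
import hashlib
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size for this illustration


def checksum(page: bytes) -> bytes:
    """Stand-in content checksum (an assumption, not the kernel's hash)."""
    return hashlib.blake2b(page, digest_size=16).digest()


class CandidateTracker:
    """Hypothetical user-space policy: propose a page for merging only if
    (a) its checksum is unchanged since the previous scan, and
    (b) at least one other stable page has identical content."""

    def __init__(self):
        # page index -> checksum observed during the previous scan
        self.last_sum = {}

    def scan(self, memory: bytes):
        """Return sorted groups of page indices that look worth merging."""
        pages = [memory[off:off + PAGE_SIZE]
                 for off in range(0, len(memory), PAGE_SIZE)]
        stable = defaultdict(list)  # checksum -> indices of stable pages
        for idx, page in enumerate(pages):
            digest = checksum(page)
            if self.last_sum.get(idx) == digest:
                stable[digest].append(idx)
            self.last_sum[idx] = digest

        groups = []
        for indices in stable.values():
            # Checksums can collide, so confirm with a full byte comparison.
            by_content = defaultdict(list)
            for idx in indices:
                by_content[pages[idx]].append(idx)
            groups.extend(g for g in by_content.values() if len(g) > 1)
        return sorted(groups)


if __name__ == "__main__":
    mem = bytearray(4 * PAGE_SIZE)                       # pages 0..3, all zero
    mem[2 * PAGE_SIZE:3 * PAGE_SIZE] = b"\x01" * PAGE_SIZE  # page 2 differs
    tracker = CandidateTracker()
    print(tracker.scan(bytes(mem)))  # first scan: no page is known stable yet
    print(tracker.scan(bytes(mem)))  # second scan: stable duplicates grouped
    mem[PAGE_SIZE] = 0xFF            # a write to page 1 between scans
    print(tracker.scan(bytes(mem)))  # page 1 drops out of the candidates
```

In this model a page that was just written is skipped for one scan, and a
page with no duplicate is never proposed, so it would never be
write-protected only to take a COW fault on its next store. Whether such
scanning can be made cheap enough in user space is exactly the concern
raised in the reply above.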