Message-ID: <9a58e794-2156-4a9f-a383-1cdfc07eee5e@gmail.com>
Date: Mon, 19 Aug 2024 16:16:07 -0400
Subject: Re: [PATCH v4 4/6] mm: Introduce a pageflag for partially mapped folios
From: Usama Arif <usamaarif642@gmail.com>
To: Barry Song
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, hannes@cmpxchg.org,
 riel@surriel.com, shakeel.butt@linux.dev, roman.gushchin@linux.dev,
 yuzhao@google.com, david@redhat.com, ryan.roberts@arm.com, rppt@kernel.org,
 willy@infradead.org, cerasuolodomenico@gmail.com, ryncsn@gmail.com,
 corbet@lwn.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 kernel-team@meta.com
References: <20240819023145.2415299-1-usamaarif642@gmail.com>
 <20240819023145.2415299-5-usamaarif642@gmail.com>

On 19/08/2024 20:00, Barry Song wrote:
> On Tue, Aug 20, 2024 at 2:17 AM Usama Arif wrote:
>>
>> On 19/08/2024 09:29, Barry Song wrote:
>>> Hi Usama,
>>>
>>> I feel it is much better now! Thanks!
>>>
>>> On Mon, Aug 19, 2024 at 2:31 PM Usama Arif wrote:
>>>>
>>>> Currently folio->_deferred_list is used to keep track of
>>>> partially_mapped folios that are going to be split under memory
>>>> pressure. In the next patch, all THPs that are faulted in and collapsed
>>>> by khugepaged are also going to be tracked using _deferred_list.
>>>>
>>>> This patch introduces a pageflag to be able to distinguish between
>>>> partially mapped folios and others in the deferred_list at split time
>>>> in deferred_split_scan. It's needed because __folio_remove_rmap
>>>> decrements _mapcount, _large_mapcount and _entire_mapcount, hence it
>>>> won't be possible to distinguish between partially mapped folios and
>>>> others in deferred_split_scan.
>>>>
>>>> Even though it introduces an extra flag to track whether the folio is
>>>> partially mapped, there is no functional change intended with this
>>>> patch and the flag is not useful in this patch itself; it will
>>>> become useful in the next patch when _deferred_list has non-partially
>>>> mapped folios.
>>>>
>>>> Signed-off-by: Usama Arif
>>>> ---
>>>>  include/linux/huge_mm.h    |  4 ++--
>>>>  include/linux/page-flags.h | 11 +++++++++++
>>>>  mm/huge_memory.c           | 23 ++++++++++++++++-------
>>>>  mm/internal.h              |  4 +++-
>>>>  mm/memcontrol.c            |  3 ++-
>>>>  mm/migrate.c               |  3 ++-
>>>>  mm/page_alloc.c            |  5 +++--
>>>>  mm/rmap.c                  |  5 +++--
>>>>  mm/vmscan.c                |  3 ++-
>>>>  9 files changed, 44 insertions(+), 17 deletions(-)
>>>>
>>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>>>> index 4c32058cacfe..969f11f360d2 100644
>>>> --- a/include/linux/huge_mm.h
>>>> +++ b/include/linux/huge_mm.h
>>>> @@ -321,7 +321,7 @@ static inline int split_huge_page(struct page *page)
>>>>  {
>>>>  	return split_huge_page_to_list_to_order(page, NULL, 0);
>>>>  }
>>>> -void deferred_split_folio(struct folio *folio);
>>>> +void deferred_split_folio(struct folio *folio, bool partially_mapped);
>>>>
>>>>  void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
>>>>  		unsigned long address, bool freeze, struct folio *folio);
>>>> @@ -495,7 +495,7 @@ static inline int split_huge_page(struct page *page)
>>>>  {
>>>>  	return 0;
>>>>  }
>>>> -static inline void deferred_split_folio(struct folio *folio) {}
>>>> +static inline void deferred_split_folio(struct folio *folio, bool partially_mapped) {}
>>>>  #define split_huge_pmd(__vma, __pmd, __address)	\
>>>>  	do { } while (0)
>>>>
>>>> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
>>>> index a0a29bd092f8..c3bb0e0da581 100644
>>>> --- a/include/linux/page-flags.h
>>>> +++ b/include/linux/page-flags.h
>>>> @@ -182,6 +182,7 @@ enum pageflags {
>>>>  	/* At least one page in this folio has the hwpoison flag set */
>>>>  	PG_has_hwpoisoned = PG_active,
>>>>  	PG_large_rmappable = PG_workingset, /* anon or file-backed */
>>>> +	PG_partially_mapped = PG_reclaim, /* was identified to be partially mapped */
>>>>  };
>>>>
>>>>  #define PAGEFLAGS_MASK ((1UL << NR_PAGEFLAGS) - 1)
>>>> @@ -861,8 +862,18 @@ static inline void ClearPageCompound(struct page *page)
>>>>  	ClearPageHead(page);
>>>>  }
>>>>  FOLIO_FLAG(large_rmappable, FOLIO_SECOND_PAGE)
>>>> +FOLIO_TEST_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
>>>> +/*
>>>> + * PG_partially_mapped is protected by deferred_split split_queue_lock,
>>>> + * so its safe to use non-atomic set/clear.
>>>> + */
>>>> +__FOLIO_SET_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
>>>> +__FOLIO_CLEAR_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
>>>>  #else
>>>>  FOLIO_FLAG_FALSE(large_rmappable)
>>>> +FOLIO_TEST_FLAG_FALSE(partially_mapped)
>>>> +__FOLIO_SET_FLAG_NOOP(partially_mapped)
>>>> +__FOLIO_CLEAR_FLAG_NOOP(partially_mapped)
>>>>  #endif
>>>>
>>>>  #define PG_head_mask ((1UL << PG_head))
>>>>
>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>> index 2d77b5d2291e..70ee49dfeaad 100644
>>>> --- a/mm/huge_memory.c
>>>> +++ b/mm/huge_memory.c
>>>> @@ -3398,6 +3398,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>>>  		 * page_deferred_list.
>>>>  		 */
>>>>  		list_del_init(&folio->_deferred_list);
>>>> +		__folio_clear_partially_mapped(folio);
>>>>  	}
>>>>  	spin_unlock(&ds_queue->split_queue_lock);
>>>>  	if (mapping) {
>>>> @@ -3454,11 +3455,13 @@ void __folio_undo_large_rmappable(struct folio *folio)
>>>>  	if (!list_empty(&folio->_deferred_list)) {
>>>>  		ds_queue->split_queue_len--;
>>>>  		list_del_init(&folio->_deferred_list);
>>>> +		__folio_clear_partially_mapped(folio);
>>>
>>> is it possible to make things clearer by
>>>
>>>         if (folio_clear_partially_mapped)
>>>                 __folio_clear_partially_mapped(folio);
>>>
>>> While writing without conditions isn't necessarily wrong, adding a
>>> condition will improve the readability of the code and enhance the
>>> clarity of my mTHP counters series. It would also help decrease SMP
>>> cache synchronization if we can avoid unnecessary writes.
>>>
>>
>> Do you mean if (folio_test_partially_mapped(folio))?
>>
>> I don't like this idea. I think it makes the readability worse? If I
>> was looking at if (test) -> clear for the first time, I would become
>> confused why it's being tested if it's going to be cleared at the end
>> anyway.
>
> In the pmd-order case, the majority of folios are not partially mapped.
> Unconditional writes will trigger cache synchronization across all
> CPUs (related to the MESI protocol), making them more costly. By
> using conditional writes, such as "if (test) write", we can avoid
> most unnecessary writes, which is much more efficient. Additionally,
> we only need to manage nr_split_deferred when the condition
> is met. We are carefully evaluating all scenarios to determine
> if modifications to the partially_mapped flag are necessary.
>

Hmm okay, as you said, it's needed for nr_split_deferred anyway. Would
something like below be okay to fold in?

commit 4ae9e2067346effd902b342296987b97dee29018 (HEAD)
Author: Usama Arif
Date:   Mon Aug 19 21:07:16 2024 +0100

    mm: Introduce a pageflag for partially mapped folios fix

    Test the partially_mapped flag before clearing it. This should avoid
    unnecessary writes and will be needed in the nr_split_deferred series.

    Signed-off-by: Usama Arif

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 5d67d3b3c1b2..ccde60aaaa0f 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3479,7 +3479,8 @@ void __folio_undo_large_rmappable(struct folio *folio)
 	if (!list_empty(&folio->_deferred_list)) {
 		ds_queue->split_queue_len--;
 		list_del_init(&folio->_deferred_list);
-		__folio_clear_partially_mapped(folio);
+		if (folio_test_partially_mapped(folio))
+			__folio_clear_partially_mapped(folio);
 	}
 	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
 }
@@ -3610,7 +3611,8 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 		} else {
 			/* We lost race with folio_put() */
 			list_del_init(&folio->_deferred_list);
-			__folio_clear_partially_mapped(folio);
+			if (folio_test_partially_mapped(folio))
+				__folio_clear_partially_mapped(folio);
 			ds_queue->split_queue_len--;
 		}
 		if (!--sc->nr_to_scan)
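
To make the cache-line argument above concrete, here is a minimal,
self-contained userspace sketch of the test-before-write pattern being
folded in. The struct and helper names are illustrative stand-ins, not
the kernel's actual API; the real code uses the FOLIO_*_FLAG accessors
under split_queue_lock.

#include <stdbool.h>

/* Illustrative stand-in for the flags word on the folio's second page. */
struct folio_like {
	unsigned long flags;
};

#define PARTIALLY_MAPPED	0x1UL

static bool test_partially_mapped(const struct folio_like *f)
{
	return f->flags & PARTIALLY_MAPPED;
}

/* Non-atomic clear; assumes the caller holds the queue lock. */
static void clear_partially_mapped(struct folio_like *f)
{
	f->flags &= ~PARTIALLY_MAPPED;
}

static void remove_from_deferred_list(struct folio_like *f)
{
	/*
	 * Test before clearing: in the common pmd-order case the flag is
	 * already clear, so this path only reads the (possibly shared)
	 * cache line. An unconditional clear would dirty the line and
	 * force other CPUs holding copies of it to invalidate them.
	 */
	if (test_partially_mapped(f))
		clear_partially_mapped(f);
}

int main(void)
{
	struct folio_like f = { .flags = PARTIALLY_MAPPED };

	remove_from_deferred_list(&f);	/* flag set: one write happens */
	remove_from_deferred_list(&f);	/* flag clear: read only, no write */
	return 0;
}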