From: Gavin Shan
Date: Sat, 13 Jul 2024 19:25:34 +1000
Subject: Re: [PATCH] mm/huge_memory: Avoid PMD-size page cache if needed
To: David Hildenbrand, Matthew Wilcox
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, william.kucharski@oracle.com, ryan.roberts@arm.com, shan.gavin@gmail.com
References: <20240711104840.200573-1-gshan@redhat.com> <63a0364b-a2e0-48c2-b255-e976112deeb1@redhat.com>

On 7/13/24 11:03 AM, David Hildenbrand wrote:
> On 12.07.24 07:39, Gavin
Shan wrote:
>>
>> David, I can help to clean it up. Could you please help to confirm the following
>
> Thanks!
>
>> changes are exactly what you're suggesting? Hopefully, there are nothing I've missed.
>> The original issue can be fixed by the changes. With the changes applied, madvise(MADV_COLLAPSE)
>> returns with errno -22 in the test program.
>>
>> The fix tag needs to adjusted either.
>>
>> Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface")
>>
>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>> index 2aa986a5cd1b..45909efb0ef0 100644
>> --- a/include/linux/huge_mm.h
>> +++ b/include/linux/huge_mm.h
>> @@ -74,7 +74,12 @@ extern struct kobj_attribute shmem_enabled_attr;
>>    /*
>>     * Mask of all large folio orders supported for file THP.
>>     */
>> -#define THP_ORDERS_ALL_FILE    (BIT(PMD_ORDER) | BIT(PUD_ORDER))
>
> DAX doesn't have any MAX_PAGECACHE_ORDER restrictions (like hugetlb). So this should be
>
> /*
>  * FSDAX never splits folios, so the MAX_PAGECACHE_ORDER limit does not
>  * apply here.
>  */
> THP_ORDERS_ALL_FILE_DAX ((BIT(PMD_ORDER) | BIT(PUD_ORDER))
>
> Something like that
>

Ok. It will be corrected in v2.

>> +#define THP_ORDERS_ALL_FILE_DAX                \
>> +       ((BIT(PMD_ORDER) | BIT(PUD_ORDER)) & (BIT(MAX_PAGECACHE_ORDER + 1) - 1))
>> +#define THP_ORDERS_ALL_FILE_DEFAULT    \
>> +       ((BIT(MAX_PAGECACHE_ORDER + 1) - 1) & ~BIT(0))
>> +#define THP_ORDERS_ALL_FILE            \
>> +       (THP_ORDERS_ALL_FILE_DAX | THP_ORDERS_ALL_FILE_DEFAULT)
>
> Maybe we can get rid of THP_ORDERS_ALL_FILE (to prevent abuse) and fixup
> THP_ORDERS_ALL instead.
>

Sure, it will be removed in v2.

>>    /*
>>     * Mask of all large folio orders supported for THP.
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 2120f7478e55..4690f33afaa6 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -88,9 +88,17 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
>>           bool smaps = tva_flags & TVA_SMAPS;
>>           bool in_pf = tva_flags & TVA_IN_PF;
>>           bool enforce_sysfs = tva_flags & TVA_ENFORCE_SYSFS;
>> +       unsigned long supported_orders;
>> +
>>           /* Check the intersection of requested and supported orders. */
>> -       orders &= vma_is_anonymous(vma) ?
>> -                       THP_ORDERS_ALL_ANON : THP_ORDERS_ALL_FILE;
>> +       if (vma_is_anonymous(vma))
>> +               supported_orders = THP_ORDERS_ALL_ANON;
>> +       else if (vma_is_dax(vma))
>> +               supported_orders = THP_ORDERS_ALL_FILE_DAX;
>> +       else
>> +               supported_orders = THP_ORDERS_ALL_FILE_DEFAULT;
>
> This is what I had in mind.
>
> But, do we have to special-case shmem as well or will that be handled correctly?
>

With the previous fixes and this one, I don't see any remaining case where
shmem can end up with a 512MB page cache folio, which would exceed
MAX_PAGECACHE_ORDER. Hopefully I haven't missed anything in the code
inspection.

- regular read/write paths: covered by the previous fixes
- synchronous readahead: covered by the previous fixes
- asynchronous readahead: page size granularity, no huge page
- page fault handling: covered by the previous fixes
- collapsing PTEs to PMD: to be covered by this patch
- swapin: shouldn't see 512MB huge pages, since no such huge pages exist
  at swapout time
- other cases I missed (?)

Thanks,
Gavin
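
For reference, the mask arithmetic discussed above can be sketched in userspace C. This is only an illustration, not kernel code: BIT() is re-defined locally, the three order values are assumptions chosen to resemble a configuration where a PMD maps 512MB (they are not taken from this mail), and enum vma_kind, supported_orders() and pmd_collapse_allowed() are hypothetical helpers. The DAX mask follows David's suggestion, without the MAX_PAGECACHE_ORDER clamp.

```c
#include <stdbool.h>

/* Userspace stand-in for the kernel's BIT() macro. */
#define BIT(n) (1ULL << (n))

/*
 * Assumed example values, not taken from the mail: a PMD maps 512MB
 * (order 13 with 64KB base pages) while the page cache is capped at
 * order 11.
 */
enum {
	PMD_ORDER = 13,
	PUD_ORDER = 26,
	MAX_PAGECACHE_ORDER = 11,
};

/* DAX never splits folios, so no MAX_PAGECACHE_ORDER clamp is applied. */
#define THP_ORDERS_ALL_FILE_DAX \
	(BIT(PMD_ORDER) | BIT(PUD_ORDER))

/* Orders 1..MAX_PAGECACHE_ORDER; order 0 is not a large folio. */
#define THP_ORDERS_ALL_FILE_DEFAULT \
	((BIT(MAX_PAGECACHE_ORDER + 1) - 1) & ~BIT(0))

/* Hypothetical classification of a file-backed VMA. */
enum vma_kind { VMA_DAX, VMA_FILE_DEFAULT };

/* Pick the supported-order mask the way the proposed hunk does. */
static unsigned long long supported_orders(enum vma_kind kind)
{
	return kind == VMA_DAX ? THP_ORDERS_ALL_FILE_DAX
			       : THP_ORDERS_ALL_FILE_DEFAULT;
}

/* A PMD-sized collapse is allowed only if PMD_ORDER survives the mask. */
static bool pmd_collapse_allowed(enum vma_kind kind)
{
	return (BIT(PMD_ORDER) & supported_orders(kind)) != 0;
}
```

Under these assumed values, PMD_ORDER lies above MAX_PAGECACHE_ORDER, so the default file mask leaves no PMD bit and the collapse request is rejected, which would match the -22 (EINVAL) result from madvise(MADV_COLLAPSE) mentioned earlier. The DAX mask still allows it.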