From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4DC2C3DA66 for ; Fri, 25 Aug 2023 15:09:49 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E60D2280091; Fri, 25 Aug 2023 11:09:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E0FFC280073; Fri, 25 Aug 2023 11:09:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CB04B280091; Fri, 25 Aug 2023 11:09:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id B7548280073 for ; Fri, 25 Aug 2023 11:09:48 -0400 (EDT) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 263A4B273C for ; Fri, 25 Aug 2023 15:09:48 +0000 (UTC) X-FDA: 81162961656.29.43A3C12 Received: from mail-ed1-f41.google.com (mail-ed1-f41.google.com [209.85.208.41]) by imf18.hostedemail.com (Postfix) with ESMTP id 463691C000C for ; Fri, 25 Aug 2023 15:09:46 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b="L+/2ijjO"; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf18.hostedemail.com: domain of zokeefe@google.com designates 209.85.208.41 as permitted sender) smtp.mailfrom=zokeefe@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1692976186; a=rsa-sha256; cv=none; b=jGTVmD8jil42lTH+UmQ7+nT8EzgzKEbgTGlvhGbaO+ypzftRR2Ae3hR9IL2IRA2ePbRkqx ehBQY4CcNkXO94/JngRUO1iB7RXxOUIt3DxlhRjaW3gzHA96SBHkPPY6TwNv/wrBTwr4wH LkRuvZKzfibt9lYfIqbSOgm7FyQ5D3k= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b="L+/2ijjO"; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf18.hostedemail.com: domain of zokeefe@google.com designates 209.85.208.41 as permitted sender) smtp.mailfrom=zokeefe@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1692976186; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=+Gq9jP84cRzBl+sgoqLbq4czFux7bqHqPTg17tu5c2I=; b=1wR3PTFcZJz2OHGibWOYFmwwAHmpTROpQEWdWBcz3v2KWABhcL53zG4wd5D1Dzv1bka0JD pnqu1yaCqs0GhmHUWPonlP8FAsyYPOvWfXW6QnqVHKqbD5mLHyvTFhAUoXV26gNeaG4V8Y QF9cWQBZcgOWbi311NE1/s7kwLcUiIs= Received: by mail-ed1-f41.google.com with SMTP id 4fb4d7f45d1cf-5232ce75e26so12601a12.1 for ; Fri, 25 Aug 2023 08:09:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1692976184; x=1693580984; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=+Gq9jP84cRzBl+sgoqLbq4czFux7bqHqPTg17tu5c2I=; b=L+/2ijjO7QMRxKQGgMtzCarHkeFlcapGa26NuEbm8kguByvlnG5fHSFxOOwRmKNYyt EgY+gxipRVAa9Nfghjlu+D5b4qI8VvURQZCPbF47nEH/pM2CWVrD7oz7z6661UaZSy9f ZrmKF7aQzee5hri031Ox+1dn6UEK0vncx7hzmE8+imYtfTnl7BYKsdz6ZbyY1RWIssMn YRKqR2f+ZSAz+zGtS/NLUIkwgiBIFGP/DLcc67HmOtfh0skJrBCzulBet6HCP/0r9/2+ xuB203/NM1e/RPWg6H8m9aZwvBEz0QbJKNYBRFC0SSffEqOA7eAtHf5Xxpi+lN1fmuyE ocxA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692976184; x=1693580984; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+Gq9jP84cRzBl+sgoqLbq4czFux7bqHqPTg17tu5c2I=; b=I30Qw7d6+Z3QlniNhH9viJ3VkgVpTVVEzaLunDZqdYaTNwdqm6kMjkYKUzuDRJHd1X wrUN158U3Fj0DQN6fjDb587RcmDacuOO0cHOeHJYm09NcOZ+4vGjwp0bIT4qVBH3Qy4S AhJ6NXTXmNBYNvfwws2HIKh89GVp1g2tMHnHpwkb3khtdDJslfOM+kvKRbFY73fvliCf 9LBOWYv9pXhDvCxK+mt8oHzNSZhykhFHebFjW7hPKHhERgOc2wc1/w7dNQobFYALx1wY WBRdXMsMDHzZl/2VoiE/wqnZjJzij4kqqV4ZUZzS6XWh8Kjm/N8zqcFfucfwvl08Q0jW db1w== X-Gm-Message-State: AOJu0YzaY01AVbWWtGVzRo9fn2D2kacR9NK4nGJR7Wi1DMLtp823nFRM hNUv56/fB38J3gSob5YhX5SgkDwsLCJjPx7Uf5GJ4Q== X-Google-Smtp-Source: AGHT+IHUSbWA6D7Kr7ih9MQvH+AX9k2ln5in8pCdueT+yk36uQ89b3zjjhcJfDCPyElxPO1Arxlzr+hUBFrnlLobzrE= X-Received: by 2002:a50:9fa5:0:b0:523:d5bc:8424 with SMTP id c34-20020a509fa5000000b00523d5bc8424mr172162edf.5.1692976184443; Fri, 25 Aug 2023 08:09:44 -0700 (PDT) MIME-Version: 1.0 References: <20230821234844.699818-1-zokeefe@google.com> <37c2b525-5c2c-d400-552c-9ccb91f4d7bf@redhat.com> <3e08d48b-7b70-cc7f-0ec1-12ad9b1a33db@redhat.com> <3408ff54-f353-0334-0d66-c808389d2f01@redhat.com> <9f967665-2cbd-f80b-404e-ac741eab1ced@redhat.com> In-Reply-To: <9f967665-2cbd-f80b-404e-ac741eab1ced@redhat.com> From: "Zach O'Keefe" Date: Fri, 25 Aug 2023 08:09:07 -0700 Message-ID: Subject: Re: [EXTERNAL] Re: [PATCH v3] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()" To: David Hildenbrand Cc: Matthew Wilcox , Saurabh Singh Sengar , "linux-mm@kvack.org" , Yang Shi , "linux-kernel@vger.kernel.org" , Greg KH , Saurabh Sengar Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 463691C000C X-Stat-Signature: zet6bxhh5chmyt4dsszfcfp8fsqtcije X-HE-Tag: 1692976186-682034 X-HE-Meta: U2FsdGVkX1+q2gc8rbTHoZKuFqWXOgq0IyKYtboqAfXDJs5JDencZweQqVXhfon9DCBMmNZz+gSCsHg2A9LuHMJBAHMAoOwj/l3CWESKOmtCz66oMFy6yrcOoxZ0vyRmt5+UFKyHuOwxsOQpf172uWaylvbQ2w1pvAprT3IVegOhlsXLLleeTYBsz0amVNp45vvbOTpYT0DRoCBeM3UesbNqzbhBkKfFrYqX1pKbp4RUpsDeTxr5sgr6bxYBrup1/yvOeCZeY5hDeIrYre0WdtTU/zq7JXYjZVBPRF3TZTXG14Q4V3ZQ8pUdAJaXMkRTYMS+8Ag1m+iVQWrcDwbH2iPS0OKgAuo2rGE/pNqp/Ba0DRpBFrTQiAz3DuY8iE60M7IvviT1vHmlAzB8l5f6bIL9G3BHAxokC38f01ra4wavS68Sqma1s/V0TnkouRF8ILW+sVR+NuCMNLg1Q5rzKgTMteBC3v6dANCvp86l3LDIoxTi2FqiBP4SFsCRya3RzwB6OMxgjjzRi5+Otpz1Hj7qZAoTT/zzBXob5ojK8+l03Zw4VLpjtCJBSMDNB9RFk29dvjTYZnNKz1tGSL4zRPkmxMKb0c0fXPnowkSp/RO+tqMHvUg/iVkVDXlTVQXUJYu/fwh5NkUrLZAujbJ9mYhwVSmip4QmHQMum/H8bZD6SjHdYY1p/XM57QPN2iM3Q/JD9CozMg8uRTbXnCW73nX4Fe8Tpts1MY3Hjj0XHp9y7cvdX4I/JgW5wUV3OVCs9ixecsZM0hNVuSQpScaUymK5wqgQzSLBUUR559X9gZhaKzVn1xrqKD8pGPBc4ibsj1B4k4ju1GcHbFPbEJNuIuIOSM2h094kLI6Uzza1m7YpbQT70sFktAd3pR1OjD5Hp5oAokv8+WmTUFfdHG+Pm0TiFHD+97pWzR0ZqdOTEIqbS6fseMw+auTx+sbPBXP1TGHLfm/ux/0Bp4NVqHA nUW3f3Hm IEf5+DKwNFO6ME3P946wug9JboailCbC7+dh2VHHvzGj+413Er3998q6kts8vEmryBzGD1Qb32ZWJRomhjQwnFVS6b4/eC7+jiZ4pQtcGMA2CMs0NLLbg+mfRHQmMJhVp3jtWTMgNjwMZ8+tCrvEkc4VkOwMn0F7gPFRDCMPcHZSt32HXGL9f6W6a8KCEqgZgUnqlzVZ6pT7l9fBEd2Ngx4bJv9L5K8OF353Zvc90vw3ot6HlpkiXR3gP4TngOZ5nK6/w+YT0Nr9AYp4ozLW7MsIbPDKDDsS1LA8QhdHJhcWKpXB2gUmT6XMeVVoL3+6OAyQWyu8scToY5vNfPyoXuq5mNy2JAvyxFsl3QhJHdy/iiE+GqT3G7pWwXPC6SNhksc3RB7wlWp4q5mH/qA5/ViX2PJXlk3odTZ4laUUELHd279PTg883ynLDw7Tw3kyN/K4S4pIydAyxMrQ= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Aug 25, 2023 at 5:58=E2=80=AFAM David Hildenbrand wrote: > > On 25.08.23 14:49, Matthew Wilcox wrote: > > On Fri, Aug 25, 2023 at 09:59:23AM +0200, David Hildenbrand wrote: > >> Especially, we do have bigger ->huge_fault changes coming up: > >> > >> https://lkml.kernel.org/r/20230818202335.2739663-1-willy@infradead.org FWIW, one of those patches updates the docs to read, "->huge_fault() is called when there is no PUD or PMD entry present. This gives the filesystem the opportunity to install a PUD or PMD sized page. Filesystems can also use the ->fault method to return a PMD sized page, so implementing this function may not be necessary. In particular, filesystems should not call filemap_fault() from ->huge_fault(). [..]" Which won't work (in the general case) without this patch (well, at least the ->huge_fault() check part). So, if we're advertising this is the way it works, maybe that gives a stronger argument for addressing it sooner vs when the first in-tree user depends on it? > >> If the driver is not in the tree, people don't care. > >> > >> You really should try upstreaming that driver. > >> > >> > >> So this patch here adds complexity (which I don't like) in order to ke= ep an > >> OOT driver working -- possibly for a short time. I'm tempted to say "p= lease > >> fix your driver to not use huge faults in that scenario, it is no long= er > >> supported". > >> > >> But I'm just about to vanish for 1.5 week into vacation :) > >> > >> @Willy, what are your thoughts? > > > > Fundamentally there was a bad assumption with the original patch -- > > it assumed that the only reason to support ->huge_fault was for DAX, > > and that's not true. It's just that the only drivers in-tree which > > support ->huge_fault do so in order to support DAX. > > Okay, and we are willing to continue supporting that then and it's > nothing we want to stop OOT drivers from doing. > > Fine with me; we should probably reflect that in the patch description. I can change these paragraphs, "During the review of the above commits, it was determined that in-tree users weren't affected by the change; most notably, since the only relevant user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is explicitly approved early in approval logic. However, there is at least one occurrence where an out-of-tree driver that used VM_HUGEPAGE|VM_MIXEDMAP with a vm_ops->huge_fault handler, was broken. Remove the VM_NO_KHUGEPAGED check when not in collapse path and give any ->huge_fault handler a chance to handle the fault. Note that we don't validate the file mode or mapping alignment, which is consistent with the behavior before the aforementioned commits." To read, "The above commits, however, overfit the existing in-tree use cases, and assume that the only reason to support ->huge_fault was for DAX (which is explicitly approved early in the approval logic). This is a bad assumption to make and unnecessarily prevents general support of ->huge_fault by filesystems. Allow returning "true" if such a handler exists, giving the fault path an opportunity to exercise it. Similarly, the rationale for including the VM_NO_KHUGEPAGED check along the fault path was that it didn't alter any in-tree users, but was likewise similarly unnecessarily restrictive (and reads odd). Remove the check from the fault path." > > > > Keeping a driver out of tree is always a risky and costly proposition. > > It will continue to be broken by core kernel changes, particularly > > if/when it does unusual things. > > > > Yes. > > > I think the complexity is entirely on us. I think there's a simpler wa= y > > to handle the problem, but I'd start by turning all of this "admin and > > app get to control when THP are used" nonsense into no-ops. > > Well, simpler, yes, but also more controversial :) > > -- > Cheers, > > David / dhildenb >