Subject: Re: [PATCH 1/2] arm64/mm: Change THP helpers to comply with generic MM semantics
To: Jonathan Cameron
References: <1597655984-15428-1-git-send-email-anshuman.khandual@arm.com>
 <1597655984-15428-2-git-send-email-anshuman.khandual@arm.com>
 <20200818101301.000027ef@Huawei.com>
 <8db455b6-8fe5-b552-119f-4abab0cc8501@arm.com>
 <20200818132625.00003d05@Huawei.com>
From: Anshuman Khandual
Date: Wed, 19 Aug 2020 14:40:36 +0530
In-Reply-To: <20200818132625.00003d05@Huawei.com>
Cc: Mark Rutland, Suzuki Poulose, catalin.marinas@arm.com,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Marc Zyngier,
 akpm@linux-foundation.org, will@kernel.org, linux-arm-kernel@lists.infradead.org

On 08/18/2020 05:56 PM, Jonathan Cameron wrote:
> On Tue, 18 Aug 2020 15:11:58 +0530
> Anshuman Khandual wrote:
>
>> On 08/18/2020 02:43 PM, Jonathan Cameron wrote:
>>> On Mon, 17 Aug 2020 14:49:43 +0530
>>> Anshuman Khandual wrote:
>>>
>>>> pmd_present() and pmd_trans_huge() are expected to behave in the following
>>>> manner during various phases of a given PMD. It is derived from a previous
>>>> detailed discussion on this topic [1] and present THP documentation [2].
>>>>
>>>> pmd_present(pmd):
>>>>
>>>> - Returns true if pmd refers to system RAM with a valid pmd_page(pmd)
>>>> - Returns false if pmd does not refer to system RAM - Invalid pmd_page(pmd)
>>>>
>>>> pmd_trans_huge(pmd):
>>>>
>>>> - Returns true if pmd refers to system RAM and is a trans huge mapping
>>>>
>>>> -------------------------------------------------------------------------
>>>> |       PMD states      |       pmd_present     |       pmd_trans_huge  |
>>>> -------------------------------------------------------------------------
>>>> |       Mapped          |       Yes             |       Yes             |
>>>> -------------------------------------------------------------------------
>>>> |       Splitting       |       Yes             |       Yes             |
>>>> -------------------------------------------------------------------------
>>>> |       Migration/Swap  |       No              |       No              |
>>>> -------------------------------------------------------------------------
>>>>
>>>> The problem:
>>>>
>>>> PMD is first invalidated with pmdp_invalidate() before it's splitting. This
>>>> invalidation clears PMD_SECT_VALID as below.
>>>>
>>>> PMD Split -> pmdp_invalidate() -> pmd_mkinvalid -> Clears PMD_SECT_VALID
>>>>
>>>> Once PMD_SECT_VALID gets cleared, it results in pmd_present() return false
>>>> on the PMD entry.
>>>> It will need another bit apart from PMD_SECT_VALID to re-affirm
>>>> pmd_present() as true during the THP split process. To comply with
>>>> above mentioned semantics, pmd_trans_huge() should also check pmd_present()
>>>> first before testing presence of an actual transparent huge mapping.
>>>>
>>>> The solution:
>>>>
>>>> Ideally PMD_TYPE_SECT should have been used here instead. But it shares the
>>>> bit position with PMD_SECT_VALID which is used for THP invalidation. Hence
>>>> it will not be there for pmd_present() check after pmdp_invalidate().
>>>>
>>>> A new software defined PMD_PRESENT_INVALID (bit 59) can be set on the PMD
>>>> entry during invalidation which can help pmd_present() return true and in
>>>> recognizing the fact that it still points to memory.
>>>>
>>>> This bit is transient. During the split process it will be overridden by a
>>>> page table page representing normal pages in place of erstwhile huge page.
>>>> Other pmdp_invalidate() callers always write a fresh PMD value on the entry
>>>> overriding this transient PMD_PRESENT_INVALID bit, which makes it safe.
>>>>
>>>> [1]: https://lkml.org/lkml/2018/10/17/231
>>>> [2]: https://www.kernel.org/doc/Documentation/vm/transhuge.txt
>>>
>>> Hi Anshuman,
>>>
>>> One query on this. From my reading of the ARM ARM, bit 59 is not
>>> an ignored bit. The exact requirements for hardware to be using
>>> it are a bit complex though.
>>>
>>> It 'might' be safe to use it for this, but if so can we have a comment
>>> explaining why. Also more than possible I'm misunderstanding things!
>>
>> We are using this bit 59 only when the entry is not active from MMU
>> perspective i.e PMD_SECT_VALID is clear.
>>
>
> Understood. I guess we ran out of bits that were always ignored so had
> to start using ones that are ignored in this particular state.

Right, there are no more available SW PTE bits.

#define PTE_DIRTY       (_AT(pteval_t, 1) << 55)
#define PTE_SPECIAL     (_AT(pteval_t, 1) << 56)
#define PTE_DEVMAP      (_AT(pteval_t, 1) << 57)
#define PTE_PROT_NONE   (_AT(pteval_t, 1) << 58) /* only when !PTE_VALID */

Earlier I had proposed using PTE_SPECIAL at PMD level for this purpose. But
Catalin prefers using these otherwise unused bits, since the entry is invalid
anyway, which also leaves PTE_SPECIAL free at a mapped PMD for later use.
There is already a comment near the PMD_PRESENT_INVALID definition which
explains this situation.

+/*
+ * This help indicate that the entry is present i.e pmd_page()
+ * still points to a valid huge page in memory even if the pmd
+ * has been invalidated.
+ */
+#define PMD_PRESENT_INVALID    (_AT(pteval_t, 1) << 59) /* only when !PMD_SECT_VALID */
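
To make the three PMD states from the table concrete, here is a rough
userspace sketch of the scheme discussed in this thread. It is only a model,
not the patch itself: pmdval_t is a plain uint64_t, the positions of
PMD_SECT_VALID (bit 0) and PMD_TABLE_BIT (bit 1) are assumptions made for the
illustration, the helper bodies are simplified guesses at the intended
semantics, and check_thp_table() and its sample values are hypothetical. Only
bit 58 (PTE_PROT_NONE) and bit 59 (PMD_PRESENT_INVALID) come from the
definitions quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pmdval_t;	/* stand-in for the kernel type */

#define PMD_SECT_VALID		((pmdval_t)1 << 0)	/* assumed position */
#define PMD_TABLE_BIT		((pmdval_t)1 << 1)	/* assumed: set only for table entries */
#define PTE_PROT_NONE		((pmdval_t)1 << 58)	/* only when !PTE_VALID */
#define PMD_PRESENT_INVALID	((pmdval_t)1 << 59)	/* only when !PMD_SECT_VALID */

/* Present if valid, PROT_NONE, or invalidated but still pointing to memory. */
static inline bool pmd_present(pmdval_t pmd)
{
	return (pmd & (PMD_SECT_VALID | PTE_PROT_NONE | PMD_PRESENT_INVALID)) != 0;
}

/* Check pmd_present() first, then that this is not a table entry. */
static inline bool pmd_trans_huge(pmdval_t pmd)
{
	return pmd != 0 && pmd_present(pmd) && !(pmd & PMD_TABLE_BIT);
}

/* Mark the entry before clearing the valid bit, as done during a split. */
static inline pmdval_t pmd_mkinvalid(pmdval_t pmd)
{
	pmd |= PMD_PRESENT_INVALID;
	pmd &= ~PMD_SECT_VALID;
	return pmd;
}

/* Walk the three rows of the table with made-up entry values. */
static inline void check_thp_table(void)
{
	pmdval_t mapped = PMD_SECT_VALID | ((pmdval_t)0x200 << 12);
	pmdval_t splitting = pmd_mkinvalid(mapped);
	pmdval_t swap = (pmdval_t)0x1234 << 8;	/* no valid or marker bits */

	assert(pmd_present(mapped) && pmd_trans_huge(mapped));
	assert(pmd_present(splitting) && pmd_trans_huge(splitting));
	assert(!pmd_present(swap) && !pmd_trans_huge(swap));
}
```

Calling check_thp_table() from a small main() exercises all three rows. The
point of the model is the ordering in pmd_mkinvalid(): bit 59 is only ever
set together with PMD_SECT_VALID being cleared, so it is never interpreted on
an entry that is live from the MMU perspective.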