From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.3 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 59FA1C63797 for ; Thu, 22 Jul 2021 06:23:02 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CDD506128C for ; Thu, 22 Jul 2021 06:23:01 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CDD506128C Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 6C61A6B0036; Thu, 22 Jul 2021 02:23:01 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 676946B005D; Thu, 22 Jul 2021 02:23:01 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 517356B006C; Thu, 22 Jul 2021 02:23:01 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0202.hostedemail.com [216.40.44.202]) by kanga.kvack.org (Postfix) with ESMTP id 309706B0036 for ; Thu, 22 Jul 2021 02:23:01 -0400 (EDT) Received: from smtpin34.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id C7249180AD81D for ; Thu, 22 Jul 2021 06:23:00 +0000 (UTC) X-FDA: 78389230920.34.727ABE0 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf25.hostedemail.com (Postfix) with ESMTP id 2D153B058BE9 for ; Thu, 22 Jul 2021 06:22:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1626934978; h=from:from:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+/j9lDJJR//R/1rtU4Q72mkECn/lkZxHLRnO/7ICo6A=; b=NMm4E2ONsxUGSceUIfdCH5XW4lPoMiqTZ4QbHV3ZrRZcMTPDar9VWY6STgeUFmP+afuqaE ukSatUFsAB9lmjO/kp6wLczorXlRtg4Mou4FEqv8gRLxH3H0yUG42JPXQeUoibp2tgExts hyBT7Mx4Qm0vbIQAT0rkwMi7AxQ7W4c= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-230-8vIb5JljPI6QSFF9xcnuzg-1; Thu, 22 Jul 2021 02:22:56 -0400 X-MC-Unique: 8vIb5JljPI6QSFF9xcnuzg-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 13723107ACF5; Thu, 22 Jul 2021 06:22:55 +0000 (UTC) Received: from [10.64.54.195] (vpn2-54-195.bne.redhat.com [10.64.54.195]) by smtp.corp.redhat.com (Postfix) with ESMTPS id D59305B4BC; Thu, 22 Jul 2021 06:22:51 +0000 (UTC) Reply-To: Gavin Shan Subject: Re: [PATCH v3 01/12] mm/debug_vm_pgtable: Introduce struct pgtable_debug_args To: Anshuman Khandual , linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org, catalin.marinas@arm.com, will@kernel.org, akpm@linux-foundation.org, chuhu@redhat.com, shan.gavin@gmail.com References: <20210719130613.334901-1-gshan@redhat.com> <20210719130613.334901-2-gshan@redhat.com> <280a5740-b5dc-4b78-3a38-67e5adbb0afd@redhat.com> <04a4618f-9899-1518-cee1-0a48cb4df4c6@arm.com> From: Gavin Shan Message-ID: <65078a0c-c35c-8e3f-d4d3-3090b0c3daaf@redhat.com> Date: Thu, 22 Jul 2021 16:23:08 +1000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.0 MIME-Version: 1.0 In-Reply-To: <04a4618f-9899-1518-cee1-0a48cb4df4c6@arm.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=NMm4E2ON; spf=none (imf25.hostedemail.com: domain of gshan@redhat.com has no SPF policy when checking 170.10.133.124) smtp.mailfrom=gshan@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Stat-Signature: qyxdcyufpn4sxnhu9c9195uumeq87o6t X-Rspamd-Queue-Id: 2D153B058BE9 X-Rspamd-Server: rspam01 X-HE-Tag: 1626934979-115324 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Hi Anshuman, On 7/22/21 2:41 PM, Anshuman Khandual wrote: > On 7/21/21 3:50 PM, Gavin Shan wrote: >> On 7/21/21 3:44 PM, Anshuman Khandual wrote: >>> On 7/19/21 6:36 PM, Gavin Shan wrote: >>>> In debug_vm_pgtable(), there are many local variables introduced to >>>> track the needed information and they are passed to the functions fo= r >>>> various test cases. It'd better to introduce a struct as place holde= r >>>> for these information. With it, what the functions for various test >>>> cases need is the struct, to simplify the code. It also makes code >>>> easier to be maintained. >>>> >>>> Besides, set_xxx_at() could access the data on the corresponding pag= es >>>> in the page table modifying tests. So the accessed pages in the test= s >>>> should have been allocated from buddy. Otherwise, we're accessing pa= ges >>>> that aren't owned by us. This causes issues like page flag corruptio= n. >>>> >>>> This introduces "struct pgtable_debug_args". The struct is initializ= ed >>>> and destroyed, but the information in the struct isn't used yet. The= y >>>> will be used in subsequent patches. >>>> >>>> Signed-off-by: Gavin Shan >>>> --- >>>> =C2=A0 mm/debug_vm_pgtable.c | 197 ++++++++++++++++++++++++++++++++= +++++++++- >>>> =C2=A0 1 file changed, 196 insertions(+), 1 deletion(-) >>>> >> >> I saw you've finished the review on PATCH[v3 01/12] and PATCH[v3 02/12= ]. >> I will wait to integrate your comments to v4 until you finish the revi= ew >> on all patches in v3 series. >> >>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c >>>> index 1c922691aa61..ea153ff40d23 100644 >>>> --- a/mm/debug_vm_pgtable.c >>>> +++ b/mm/debug_vm_pgtable.c >>>> @@ -58,6 +58,36 @@ >>>> =C2=A0 #define RANDOM_ORVALUE (GENMASK(BITS_PER_LONG - 1, 0) & ~ARC= H_SKIP_MASK) >>>> =C2=A0 #define RANDOM_NZVALUE=C2=A0=C2=A0=C2=A0 GENMASK(7, 0) >>>> =C2=A0 +struct pgtable_debug_args { >>>> +=C2=A0=C2=A0=C2=A0 struct mm_struct=C2=A0=C2=A0=C2=A0 *mm; >>>> +=C2=A0=C2=A0=C2=A0 struct vm_area_struct=C2=A0=C2=A0=C2=A0 *vma; >>>> + >>>> +=C2=A0=C2=A0=C2=A0 pgd_t=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 *pgdp; >>>> +=C2=A0=C2=A0=C2=A0 p4d_t=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 *p4dp; >>>> +=C2=A0=C2=A0=C2=A0 pud_t=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 *pudp; >>>> +=C2=A0=C2=A0=C2=A0 pmd_t=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 *pmdp; >>>> +=C2=A0=C2=A0=C2=A0 pte_t=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 *ptep; >>>> + >>>> +=C2=A0=C2=A0=C2=A0 p4d_t=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 *start_p4dp; >>>> +=C2=A0=C2=A0=C2=A0 pud_t=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 *start_pudp; >>>> +=C2=A0=C2=A0=C2=A0 pmd_t=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 *start_pmdp; >>>> +=C2=A0=C2=A0=C2=A0 pgtable_t=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 start_ptep; >>>> + >>>> +=C2=A0=C2=A0=C2=A0 unsigned long=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 vaddr; >>>> +=C2=A0=C2=A0=C2=A0 pgprot_t=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= page_prot; >>>> +=C2=A0=C2=A0=C2=A0 pgprot_t=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= page_prot_none; >>>> + >>>> +=C2=A0=C2=A0=C2=A0 unsigned long=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 pud_pfn; >>>> +=C2=A0=C2=A0=C2=A0 unsigned long=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 pmd_pfn; >>>> +=C2=A0=C2=A0=C2=A0 unsigned long=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 pte_pfn; >>>> + >>>> +=C2=A0=C2=A0=C2=A0 unsigned long=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 fixed_pgd_pfn; >>>> +=C2=A0=C2=A0=C2=A0 unsigned long=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 fixed_p4d_pfn; >>>> +=C2=A0=C2=A0=C2=A0 unsigned long=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 fixed_pud_pfn; >>>> +=C2=A0=C2=A0=C2=A0 unsigned long=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 fixed_pmd_pfn; >>>> +=C2=A0=C2=A0=C2=A0 unsigned long=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 fixed_pte_pfn; >>>> +}; >>>> + >>>> =C2=A0 static void __init pte_basic_tests(unsigned long pfn, int id= x) >>>> =C2=A0 { >>>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 pgprot_t prot =3D protection_map[idx= ]; >>>> @@ -955,8 +985,167 @@ static unsigned long __init get_random_vaddr(v= oid) >>>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return random_vaddr; >>>> =C2=A0 } >>>> =C2=A0 +static void __init destroy_args(struct pgtable_debug_args *= args) >>>> +{ >>>> +=C2=A0=C2=A0=C2=A0 struct page *page =3D NULL; >>>> + >>>> +=C2=A0=C2=A0=C2=A0 /* Free (huge) page */ >>>> +=C2=A0=C2=A0=C2=A0 if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 IS_ENABLED(CONFIG_HAVE_A= RCH_TRANSPARENT_HUGEPAGE_PUD) && >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 has_transparent_hugepage= () && >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 args->pud_pfn !=3D ULONG= _MAX) { >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 page =3D pfn_to_page(arg= s->pud_pfn); >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 __free_pages(page, HPAGE= _PUD_SHIFT - PAGE_SHIFT); >>>> +=C2=A0=C2=A0=C2=A0 } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAG= E) && >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 has_tr= ansparent_hugepage() && >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 args->= pmd_pfn !=3D ULONG_MAX) { >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 page =3D pfn_to_page(arg= s->pmd_pfn); >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 __free_pages(page, HPAGE= _PMD_ORDER); >>>> +=C2=A0=C2=A0=C2=A0 } else if (args->pte_pfn !=3D ULONG_MAX) { >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 page =3D pfn_to_page(arg= s->pte_pfn); >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 __free_pages(page, 0); >>>> +=C2=A0=C2=A0=C2=A0 } >>>> + >>>> +=C2=A0=C2=A0=C2=A0 /* Free page table */ >>>> +=C2=A0=C2=A0=C2=A0 if (args->start_ptep) { >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 pte_free(args->mm, args-= >start_ptep); >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 mm_dec_nr_ptes(args->mm)= ; >>>> +=C2=A0=C2=A0=C2=A0 } >>>> + >>>> +=C2=A0=C2=A0=C2=A0 if (args->start_pmdp) { >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 pmd_free(args->mm, args-= >start_pmdp); >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 mm_dec_nr_pmds(args->mm)= ; >>>> +=C2=A0=C2=A0=C2=A0 } >>>> + >>>> +=C2=A0=C2=A0=C2=A0 if (args->start_pudp) { >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 pud_free(args->mm, args-= >start_pudp); >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 mm_dec_nr_puds(args->mm)= ; >>>> +=C2=A0=C2=A0=C2=A0 } >>>> + >>>> +=C2=A0=C2=A0=C2=A0 if (args->start_p4dp) >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 p4d_free(args->mm, args-= >p4dp); >>>> + >>>> +=C2=A0=C2=A0=C2=A0 /* Free vma and mm struct */ >>>> +=C2=A0=C2=A0=C2=A0 if (args->vma) >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 vm_area_free(args->vma); >>>> +=C2=A0=C2=A0=C2=A0 if (args->mm) >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 mmdrop(args->mm); >>>> +} >>>> + >>>> +static int __init init_args(struct pgtable_debug_args *args) >>>> +{ >>>> +=C2=A0=C2=A0=C2=A0 struct page *page =3D NULL; >>>> +=C2=A0=C2=A0=C2=A0 phys_addr_t phys; >>>> +=C2=A0=C2=A0=C2=A0 int ret =3D 0; >>>> + >>>> +=C2=A0=C2=A0=C2=A0 /* Initialize the debugging data */ >>>> +=C2=A0=C2=A0=C2=A0 memset(args, 0, sizeof(*args)); >>>> +=C2=A0=C2=A0=C2=A0 args->page_prot=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =3D= vm_get_page_prot(VMFLAGS); >>>> +=C2=A0=C2=A0=C2=A0 args->page_prot_none =3D __P000; >>> >>> Please preserve the existing comments before this assignment. >>> >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /* >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * __P000 (or = even __S000) will help create page table entries with >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * PROT_NONE p= ermission as required for pxx_protnone_tests(). >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 */ >>> >> >> Sure. I will combine the comments in v4 as below: >> >> =C2=A0=C2=A0=C2=A0=C2=A0/* >> =C2=A0=C2=A0=C2=A0=C2=A0 * Initialize the debugging arguments. >> =C2=A0=C2=A0=C2=A0=C2=A0 * >> =C2=A0=C2=A0=C2=A0=C2=A0 * __P000 (or even __S000) will help create p= age table entries with >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 * PROT_NONE permissi= on as required for pxx_protnone_tests(). >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 */ >> >> >>>> +=C2=A0=C2=A0=C2=A0 args->pud_pfn=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 =3D ULONG_MAX; >>>> +=C2=A0=C2=A0=C2=A0 args->pmd_pfn=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 =3D ULONG_MAX; >>>> +=C2=A0=C2=A0=C2=A0 args->pte_pfn=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 =3D ULONG_MAX; >>>> +=C2=A0=C2=A0=C2=A0 args->fixed_pgd_pfn=C2=A0 =3D ULONG_MAX; >>>> +=C2=A0=C2=A0=C2=A0 args->fixed_p4d_pfn=C2=A0 =3D ULONG_MAX; >>>> +=C2=A0=C2=A0=C2=A0 args->fixed_pud_pfn=C2=A0 =3D ULONG_MAX; >>>> +=C2=A0=C2=A0=C2=A0 args->fixed_pmd_pfn=C2=A0 =3D ULONG_MAX; >>>> +=C2=A0=C2=A0=C2=A0 args->fixed_pte_pfn=C2=A0 =3D ULONG_MAX; >>>> + >>>> +=C2=A0=C2=A0=C2=A0 /* Allocate mm and vma */ >>>> +=C2=A0=C2=A0=C2=A0 args->mm =3D mm_alloc(); >>>> +=C2=A0=C2=A0=C2=A0 if (!args->mm) { >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 pr_err("Failed to alloca= te mm struct\n"); >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ret =3D -ENOMEM; >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 goto error; >>>> +=C2=A0=C2=A0=C2=A0 } >>>> + >>>> +=C2=A0=C2=A0=C2=A0 args->vma =3D vm_area_alloc(args->mm); >>>> +=C2=A0=C2=A0=C2=A0 if (!args->vma) { >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 pr_err("Failed to alloca= te vma\n"); >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ret =3D -ENOMEM; >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 goto error; >>>> +=C2=A0=C2=A0=C2=A0 } >>>> + >>>> +=C2=A0=C2=A0=C2=A0 /* Figure out the virtual address and allocate p= age table entries */ >>>> +=C2=A0=C2=A0=C2=A0 args->vaddr =3D get_random_vaddr(); >>> >>> Please group args->vaddr's init with page_prot and page_prot_none abo= ve. >>> >> >> Yes, It will make the code tidy. I'll move this line accordingly in v4= , >> but the related comments will be dropped as the code is self-explanato= ry. >> >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /* Allocate page table ent= ries */ >> >>>> +=C2=A0=C2=A0=C2=A0 args->pgdp =3D pgd_offset(args->mm, args->vaddr)= ; >>>> +=C2=A0=C2=A0=C2=A0 args->p4dp =3D p4d_alloc(args->mm, args->pgdp, a= rgs->vaddr); >>>> +=C2=A0=C2=A0=C2=A0 args->pudp =3D args->p4dp ? >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 pud_alloc(args->mm, args->p4dp, args->vaddr) : NULL; >>>> +=C2=A0=C2=A0=C2=A0 args->pmdp =3D args->pudp ? >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 pmd_alloc(args->mm, args->pudp, args->vaddr) : NULL; >>>> +=C2=A0=C2=A0=C2=A0 args->ptep =3D args->pmdp ? >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 pte_alloc_map(args->mm, args->pmdp, args->vaddr) : NULL; >>>> +=C2=A0=C2=A0=C2=A0 if (!args->ptep) { >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 pr_err("Failed to alloca= te page table\n"); >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ret =3D -ENOMEM; >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 goto error; >>>> +=C2=A0=C2=A0=C2=A0 } >>> >>> Why not just assert that all page table level pointers are allocated >>> successfully, otherwise bail out the test completely. Something like >>> this at each level. >>> >>> =C2=A0=C2=A0=C2=A0=C2=A0if (!args->p4dp) { >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 pr_err("Failed to allocat= e page table\n"); >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ret =3D -ENOMEM; >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 goto error; >>> =C2=A0=C2=A0=C2=A0=C2=A0} >>> >>> Is there any value in proceeding with the test when some page table >>> pointers have not been allocated. Also individual tests do not cross >>> check these pointers. Also asserting successful allocations will >>> make the freeing path simpler, as I had mentioned earlier. >>> >> >> There is no tests will be carried out if we fail to allocate any level >> of page table entries. For other questions, please refer below respons= e. >> In summary, this snippet needs to be combined with next snippet, as be= low. >> >>>> + >>>> +=C2=A0=C2=A0=C2=A0 /* >>>> +=C2=A0=C2=A0=C2=A0=C2=A0 * The above page table entries will be mod= ified. Lets save the >>>> +=C2=A0=C2=A0=C2=A0=C2=A0 * page table entries so that they can be r= eleased when the tests >>>> +=C2=A0=C2=A0=C2=A0=C2=A0 * are completed. >>>> +=C2=A0=C2=A0=C2=A0=C2=A0 */ >>>> +=C2=A0=C2=A0=C2=A0 args->start_p4dp =3D p4d_offset(args->pgdp, 0UL)= ; >>>> +=C2=A0=C2=A0=C2=A0 args->start_pudp =3D pud_offset(args->p4dp, 0UL)= ; >>>> +=C2=A0=C2=A0=C2=A0 args->start_pmdp =3D pmd_offset(args->pudp, 0UL)= ; >>>> +=C2=A0=C2=A0=C2=A0 args->start_ptep =3D pmd_pgtable(READ_ONCE(*(arg= s->pmdp))); >>> >>> If the above page table pointers have been validated to be allocated >>> successfully, we could add these here. >>> >>> =C2=A0=C2=A0=C2=A0=C2=A0WARN_ON(!args->start_p4dp) >>> =C2=A0=C2=A0=C2=A0=C2=A0WARN_ON(!args->start_pudp) >>> =C2=A0=C2=A0=C2=A0=C2=A0WARN_ON(!args->start_pmdp) >>> =C2=A0=C2=A0=C2=A0=C2=A0WARN_ON(!args->start_ptep) >>> >>> Afterwards all those if (args->start_pxdp) checks in the freeing path >>> will not be required anymore. >>> >> >> The check on @args->start_pxdp is still needed in destroy_args() for >> couple of cases: (1) destroy_args() is called on failing to allocate >> @args->mm or @args->vma. That time, no page table entries are allocate= d. >> (2) It's possible to fail allocating current level of page table entri= es >> even the previous levels of page table entries are allocated successfu= lly. >=20 > This makes sense as destroy_args() is getting called if any of these > allocations fails during init_args(). Did not realize that earlier. >=20 >> >> So Lets change these (above) two snippets as below in v4: >> >> =C2=A0=C2=A0=C2=A0=C2=A0/* >> =C2=A0=C2=A0=C2=A0=C2=A0 * Allocate page table entries. The allocated= page table entries >> =C2=A0=C2=A0=C2=A0=C2=A0 * will be modified in the tests. Lets save t= he page table entries >> =C2=A0=C2=A0=C2=A0=C2=A0 * so that they can be released when the test= s are completed. >> =C2=A0=C2=A0=C2=A0=C2=A0 */ >> =C2=A0=C2=A0=C2=A0=C2=A0args->pgdp =3D pgd_offset(args->mm, args->vad= dr); >> =C2=A0=C2=A0=C2=A0=C2=A0args->p4dp =3D p4d_alloc(args->mm, args->pgdp= , args->vaddr); >> =C2=A0=C2=A0=C2=A0=C2=A0if (!args->p4dp) { >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 pr_err("Failed to allocate= p4d entries\n"); >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ret =3D -ENOMEM; >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 goto error; >> =C2=A0=C2=A0=C2=A0=C2=A0} >> >> =C2=A0=C2=A0=C2=A0=C2=A0args->start_p4dp =3D p4d_offset(args->pgdp, 0= UL); >=20 > Dont bring the arg->start_pxdp assignments here. If all page table leve= l > pointer allocations succeed, they all get assigned together like we hav= e > right now. Although a sanity check afterwards like the following, might > still be better. >=20 > WARN_ON(!args->start_p4dp) > WARN_ON(!args->start_pudp) > WARN_ON(!args->start_pmdp) > WARN_ON(!args->start_ptep) >=20 We have to assign arg->start_pxdp here because destroy_args() relies it to release the corresponding page tables in failing path. For example, the args->start_p4dp is going to be release if we fail to populate args->start_pudp. Ok. I will add WARN_ON() for each level of page table entries right after they are assigned in v4. >> =C2=A0=C2=A0=C2=A0=C2=A0args->pudp =3D pud_alloc(args->mm, args->p4dp= , args->vaddr); >> =C2=A0=C2=A0=C2=A0=C2=A0if (!args->pudp) { >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 pr_err("Failed to allocate= pud entries\n"); >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ret =3D -ENOMEM; >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 goto error; >> =C2=A0=C2=A0=C2=A0=C2=A0} >> >> =C2=A0=C2=A0=C2=A0=C2=A0args->pmdp =3D pmd_alloc(args->mm, args->pudp= , args->vaddr); >> =C2=A0=C2=A0=C2=A0=C2=A0if (!args->pmdp) { >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 pr_err("Failed to allocate= PMD entries\n"); >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ret =3D -ENOMEM; >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 goto error; >> =C2=A0=C2=A0=C2=A0=C2=A0} >> >> =C2=A0=C2=A0=C2=A0=C2=A0args->start_pmdp =3D pmd_offset(args->pudp, 0= UL); >> =C2=A0=C2=A0=C2=A0=C2=A0args->ptep =3D pte_alloc_map(args->mm, args->= pmdp, args->vaddr); >> =C2=A0=C2=A0=C2=A0=C2=A0if (!args->ptep) { >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 pr_err("Failed to allocate= page table\n"); >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ret =3D -ENOMEM; >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 goto error; >> =C2=A0=C2=A0=C2=A0=C2=A0} >> >> =C2=A0=C2=A0=C2=A0=C2=A0args->start_ptep =3D pmd_pgtable(READ_ONCE(*(= args->pmdp))); >> >>>> + >>>> +=C2=A0=C2=A0=C2=A0 /* >>>> +=C2=A0=C2=A0=C2=A0=C2=A0 * Figure out the fixed addresses, which ar= e all around the kernel >>>> +=C2=A0=C2=A0=C2=A0=C2=A0 * symbol (@start_kernel). The correspondin= g PFNs might be invalid, >>>> +=C2=A0=C2=A0=C2=A0=C2=A0 * but it's fine as the following tests won= 't access the pages. >>>> +=C2=A0=C2=A0=C2=A0=C2=A0 */ >>>> +=C2=A0=C2=A0=C2=A0 phys =3D __pa_symbol(&start_kernel); >>>> +=C2=A0=C2=A0=C2=A0 args->fixed_pgd_pfn =3D __phys_to_pfn(phys & PGD= IR_MASK); >>>> +=C2=A0=C2=A0=C2=A0 args->fixed_p4d_pfn =3D __phys_to_pfn(phys & P4D= _MASK); >>>> +=C2=A0=C2=A0=C2=A0 args->fixed_pud_pfn =3D __phys_to_pfn(phys & PUD= _MASK); >>>> +=C2=A0=C2=A0=C2=A0 args->fixed_pmd_pfn =3D __phys_to_pfn(phys & PMD= _MASK); >>>> +=C2=A0=C2=A0=C2=A0 args->fixed_pte_pfn =3D __phys_to_pfn(phys & PAG= E_MASK); >>>> + >>>> +=C2=A0=C2=A0=C2=A0 /* >>>> +=C2=A0=C2=A0=C2=A0=C2=A0 * Allocate (huge) pages because some of th= e tests need to access >>>> +=C2=A0=C2=A0=C2=A0=C2=A0 * the data in the pages. The corresponding= tests will be skipped >>>> +=C2=A0=C2=A0=C2=A0=C2=A0 * if we fail to allocate (huge) pages. >>>> +=C2=A0=C2=A0=C2=A0=C2=A0 */ >>>> +=C2=A0=C2=A0=C2=A0 if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 IS_ENABLED(CONFIG_HAVE_A= RCH_TRANSPARENT_HUGEPAGE_PUD) && >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 has_transparent_hugepage= ()) { >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 page =3D alloc_pages(GFP= _KERNEL | __GFP_NOWARN, >>>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 HPAGE_PUD_SHIFT - PAGE_SHIFT); >>> >>> Please drop __GFP_NOWARN and instead use something like alloc_contig_= pages() >>> when required allocation order exceed (MAX_ORDER - 1). Else the test = might >>> not be able to execute on platform configurations, where PUD THP is e= nabled. >>> >> >> Yes, It's correct that alloc_contig_pages() should be used here, depen= ding >> on CONFIG_CONTIG_ALLOC. Otherwise, alloc_pages(...__GFP_NOWARN...) is = still >> used as we're doing. This snippet will be changed like below in v4: >=20 > First 'order > (MAX_ORDER - 1)' needs to be established before calling = into > alloc_contig_pages() without __GFP_NOWARN and set a new flag indicating= that > there is contig page allocated. But if 'order <=3D (MAX_ORDER - 1)', th= en call > alloc_pages(..) without __GFP_NOWARN. There is no need to add __GFP_N= OWARN > in any case. In case CONFIG_CONTIG_ALLOC is not available, directly ret= urn a > NULL as that would have been the case with alloc_pages(...__GFP_NOWARN.= ..) as > well. >=20 > Symbol alloc_contig_pages() is not available outside CONFIG_CONTIG_ALLO= C. So > IS_ENABLED() construct will not work, unless there is an empty stub add= ed in > the header. Otherwise #ifdef CONFIG_CONTIG_ALLOC needs to be used inste= ad. >=20 > Regardless please do test this on a x86 platform with PUD based THP in = order > to make sure every thing works as expected. >=20 Thanks, I will change the code accordingly in v4 and test it on x86 before posting it. >> >> =C2=A0=C2=A0=C2=A0=C2=A0/* >> =C2=A0=C2=A0=C2=A0=C2=A0 * Allocate (huge) pages because some of the = tests need to access >> =C2=A0=C2=A0=C2=A0=C2=A0 * the data in the pages. The corresponding t= ests will be skipped >> =C2=A0=C2=A0=C2=A0=C2=A0 * if we fail to allocate (huge) pages. >> =C2=A0=C2=A0=C2=A0=C2=A0 */ >> =C2=A0=C2=A0=C2=A0=C2=A0if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &= & >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 IS_ENABLED(CONFIG_HAVE_ARC= H_TRANSPARENT_HUGEPAGE_PUD) && >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 IS_ENABLED(CONFIG_CONTIG_A= LLOC)) && >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 has_transparent_hugepage()= ) { >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 page =3D alloc_contig_page= s((1 << (HPAGE_PUD_SHIFT - PAGE_SHIFT)), >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 GFP_KERNEL | __= GFP_NOWARN, >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 first_online_no= de, NULL); >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if (page) { >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ar= gs->is_contiguous_pud_page =3D true; >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ar= gs->pud_pfn =3D page_to_pfn(page); >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ar= gs->pmd_pfn =3D args->pud_pfn; >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ar= gs->pte_pfn =3D args->pud_pfn; >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 re= turn 0; >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } >> =C2=A0=C2=A0=C2=A0=C2=A0} >> >> =C2=A0=C2=A0=C2=A0=C2=A0if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &= & >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 IS_ENABLED(CONFIG_HAVE_ARC= H_TRANSPARENT_HUGEPAGE_PUD) && >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 has_transparent_hugepage()= ) { >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 page =3D alloc_pages(GFP_K= ERNEL | __GFP_NOWARN, >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 HPAGE_PUD_SHIFT - PAGE_SHIFT); >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if (page) { >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ar= gs->is_contiguous_pud_page =3D false; >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ar= gs->pud_pfn =3D page_to_pfn(page); >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ar= gs->pmd_pfn =3D args->pud_pfn; >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ar= gs->pte_pfn =3D args->pud_pfn; >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 re= turn 0; >> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 } >> =C2=A0=C2=A0=C2=A0=C2=A0} >> >> =C2=A0=C2=A0=C2=A0=C2=A0[... The logic to allocate PMD huge page or p= age is kept as of being] >=20 > IIRC it is also not guaranteed that PMD_SHIFT <=3D (MAX_ORDER - 1). Hen= ce > this same scheme should be followed for PMD level allocation as well. >=20 In theory, it's possible to have PMD_SHIFT <=3D (MAX_ORDER - 1) with misc= onfigured kernel. I will apply the similar logic to PMD huge page in v4. >> =C2=A0=C2=A0=C2=A0 [... The code to release the PUD huge page needs c= hanges based on @args->is_contiguous_pud_page] >=20 > Right, a flag would be needed to call the appropriate free function. >=20 Yes. We need two falgs for PUD and PMD huge pages separately. Thanks, Gavin