From: Xavier Xia
Date: Fri, 20 Jun 2025 15:56:55 +0800
Subject: Re: [PATCH v6] arm64/mm: Optimize loop to reduce redundant operations of contpte_ptep_get
To: Ryan Roberts, will@kernel.org
Cc: Xavier Xia, 21cnbao@gmail.com, dev.jain@arm.com, ioworker0@gmail.com, akpm@linux-foundation.org, catalin.marinas@arm.com, david@redhat.com, gshan@redhat.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, willy@infradead.org, ziy@nvidia.com, Barry Song, linux-mm@kvack.org
References: <20250510125948.2383778-1-xavier_qy@163.com> <99a0a2c8-d98e-4c81-9207-c55c72c00872@arm.com> <225fd9dd-2b97-4ec6-a9a6-fe148c4b901e@arm.com>
In-Reply-To: <225fd9dd-2b97-4ec6-a9a6-fe148c4b901e@arm.com>

Hi all,

May I follow up: does this patch require any further changes, or does it now meet the criteria for merging?

--
Thanks,
Xavier

On Thu, Jun 5, 2025 at 3:16 PM Ryan Roberts wrote:
>
> On 05/06/2025 06:54, Xavier Xia wrote:
> > Hi Ryan,
> >
> > Thank you for your review, and for reproducing and verifying the test cases.
> > I am using a Gmail email to reply to your message, hoping you can receive it.
> > Please check the details below.
>
> Ahh yes, this arrived in my inbox without issue!
>
> Thanks,
> Ryan
>
>
> >
> >
> >
> > On Thu, Jun 5, 2025 at 11:20 AM Ryan Roberts wrote:
> >>
> >> On 10/05/2025 13:59, Xavier Xia wrote:
> >>> This commit optimizes the contpte_ptep_get and contpte_ptep_get_lockless
> >>> function by adding early termination logic. It checks if the dirty and
> >>> young bits of orig_pte are already set and skips redundant bit-setting
> >>> operations during the loop. This reduces unnecessary iterations and
> >>> improves performance.
> >>>
> >>> In order to verify the optimization performance, a test function has been
> >>> designed. The function's execution time and instruction statistics have
> >>> been traced using perf, and the following are the operation results on a
> >>> certain Qualcomm mobile phone chip:
> >>>
> >>> Test Code:
> >>
> >> nit: It would have been good to include the source for the whole program,
> >> including #includes and the main() function to make it quicker for others to get
> >> up and running.
> >
> > OK, I will pay attention to it in the future. This test case is quite
> > simple, so I didn't add it.
> >
> >>
> >>>
> >>> #define PAGE_SIZE 4096
> >>> #define CONT_PTES 16
> >>> #define TEST_SIZE (4096* CONT_PTES * PAGE_SIZE)
> >>> #define YOUNG_BIT 8
> >>> void rwdata(char *buf)
> >>> {
> >>>         for (size_t i = 0; i < TEST_SIZE; i += PAGE_SIZE) {
> >>>                 buf[i] = 'a';
> >>>                 volatile char c = buf[i];
> >>>         }
> >>> }
> >>> void clear_young_dirty(char *buf)
> >>> {
> >>>         if (madvise(buf, TEST_SIZE, MADV_FREE) == -1) {
> >>>                 perror("madvise free failed");
> >>>                 free(buf);
> >>>                 exit(EXIT_FAILURE);
> >>>         }
> >>>         if (madvise(buf, TEST_SIZE, MADV_COLD) == -1) {
> >>>                 perror("madvise free failed");
> >>>                 free(buf);
> >>>                 exit(EXIT_FAILURE);
> >>>         }
> >>
> >> nit: MADV_FREE clears both young and dirty so I don't think MADV_COLD is
> >> required? (MADV_COLD only clears young I think?)
> >
> > You're right, MADV_COLD here can probably be removed.
> >
> >>
> >>> }
> >>> void set_one_young(char *buf)
> >>> {
> >>>         for (size_t i = 0; i < TEST_SIZE; i += CONT_PTES * PAGE_SIZE) {
> >>>                 volatile char c = buf[i + YOUNG_BIT * PAGE_SIZE];
> >>>         }
> >>> }
> >>>
> >>> void test_contpte_perf() {
> >>>         char *buf;
> >>>         int ret = posix_memalign((void **)&buf, CONT_PTES * PAGE_SIZE,
> >>>                         TEST_SIZE);
> >>>         if ((ret != 0) || ((unsigned long)buf % CONT_PTES * PAGE_SIZE)) {
> >>>                 perror("posix_memalign failed");
> >>>                 exit(EXIT_FAILURE);
> >>>         }
> >>>
> >>>         rwdata(buf);
> >>> #if TEST_CASE2 || TEST_CASE3
> >>>         clear_young_dirty(buf);
> >>> #endif
> >>> #if TEST_CASE2
> >>>         set_one_young(buf);
> >>> #endif
> >>>
> >>>         for (int j = 0; j < 500; j++) {
> >>>                 mlock(buf, TEST_SIZE);
> >>>
> >>>                 munlock(buf, TEST_SIZE);
> >>>         }
> >>>         free(buf);
> >>> }
> >>>
> >>> Descriptions of three test scenarios
> >>>
> >>> Scenario 1
> >>>         The data of all 16 PTEs are both dirty and young.
> >>>         #define TEST_CASE2 0
> >>>         #define TEST_CASE3 0
> >>>
> >>> Scenario 2
> >>>         Among the 16 PTEs, only the 8th one is young, and there are no dirty ones.
> >>>         #define TEST_CASE2 1
> >>>         #define TEST_CASE3 0
> >>>
> >>> Scenario 3
> >>>         Among the 16 PTEs, there are neither young nor dirty ones.
> >>>         #define TEST_CASE2 0
> >>>         #define TEST_CASE3 1
> >>>
> >>> Test results
> >>>
> >>> |Scenario 1         |       Original|       Optimized|
> >>> |-------------------|---------------|----------------|
> >>> |instructions       |    37912436160|     18731580031|
> >>> |test time          |         4.2797|          2.2949|
> >>> |overhead of        |               |                |
> >>> |contpte_ptep_get() |         21.31%|           4.80%|
> >>>
> >>> |Scenario 2         |       Original|       Optimized|
> >>> |-------------------|---------------|----------------|
> >>> |instructions       |    36701270862|     36115790086|
> >>> |test time          |         3.2335|          3.0874|
> >>> |Overhead of        |               |                |
> >>> |contpte_ptep_get() |         32.26%|          33.57%|
> >>>
> >>> |Scenario 3         |       Original|       Optimized|
> >>> |-------------------|---------------|----------------|
> >>> |instructions       |    36706279735|     36750881878|
> >>> |test time          |         3.2008|          3.1249|
> >>> |Overhead of        |               |                |
> >>> |contpte_ptep_get() |         31.94%|          34.59%|
> >>>
> >>> For Scenario 1, optimized code can achieve an instruction benefit of 50.59%
> >>> and a time benefit of 46.38%.
> >>> For Scenario 2, optimized code can achieve an instruction count benefit of
> >>> 1.6% and a time benefit of 4.5%.
> >>> For Scenario 3, since all the PTEs have neither the young nor the dirty
> >>> flag, the branches taken by optimized code should be the same as those of
> >>> the original code. In fact, the test results of optimized code seem to be
> >>> closer to those of the original code.
> >>
> >> I re-ran these tests on Apple M2 with 4K base pages + 64K mTHP.
> >>
> >> Scenario 1: reduced to 56% of baseline execution time
> >> Scenario 2: reduced to 89% of baseline execution time
> >> Scenario 3: reduced to 91% of baseline execution time
> >>
> >> I'm pretty amazed that scenario 3 got faster given it is doing the same number
> >> of loops.
> >
> > It seems that the data you obtained is similar to my test data.
> > For scenario 3, it's faster even when running the same code, which I
> > can't quite figure out either.
> >
> >>>
> >>> It can be proven through test function that the optimization for
> >>> contpte_ptep_get is effective. Since the logic of contpte_ptep_get_lockless
> >>> is similar to that of contpte_ptep_get, the same optimization scheme is
> >>> also adopted for it.
> >>>
> >>> Reviewed-by: Barry Song
> >>> Signed-off-by: Xavier Xia
> >>
> >> I don't love the extra complexity, but this version is much tidier. While the
> >> micro-benchmark is clearly contrived, it shows that there will be cases where it
> >> will be faster and there are no cases where it is slower. This will probably be
> >> more valuable for 16K kernels because the number of PTEs in a contpte block is
> >> 128 there:
> >
> > Okay, this version has been revised multiple times based on your
> > previous feedback and Barry's comments, and it seems much less
> > complicated to understand now. :)
> >
> >>
> >> Reviewed-by: Ryan Roberts
> >> Tested-by: Ryan Roberts
> >>
> >>> ---
> >>> Changes in v6:
> >>> - Move prot = pte_pgprot(pte_mkold(pte_mkclean(pte))) into the contpte_is_consistent(),
> >>>   as suggested by Barry.
> >>> - Link to v5: https://lore.kernel.org/all/20250509122728.2379466-1-xavier_qy@163.com/
> >>>
> >>> Changes in v5:
> >>> - Replace macro CHECK_CONTPTE_CONSISTENCY with inline function contpte_is_consistent
> >>>   for improved readability and clarity, as suggested by Barry.
> >>> - Link to v4: https://lore.kernel.org/all/20250508070353.2370826-1-xavier_qy@163.com/
> >>>
> >>> Changes in v4:
> >>> - Convert macro CHECK_CONTPTE_FLAG to an internal loop for better readability.
> >>> - Refactor contpte_ptep_get_lockless using the same optimization logic, as suggested by Ryan.
> >>> - Link to v3: https://lore.kernel.org/all/3d338f91.8c71.1965cd8b1b8.Coremail.xavier_qy@163.com/
> >>> ---
> >>>  arch/arm64/mm/contpte.c | 74 +++++++++++++++++++++++++++++++++++------
> >>>  1 file changed, 64 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
> >>> index bcac4f55f9c1..71efe7dff0ad 100644
> >>> --- a/arch/arm64/mm/contpte.c
> >>> +++ b/arch/arm64/mm/contpte.c
> >>> @@ -169,17 +169,46 @@ pte_t contpte_ptep_get(pte_t *ptep, pte_t orig_pte)
> >>>         for (i = 0; i < CONT_PTES; i++, ptep++) {
> >>>                 pte = __ptep_get(ptep);
> >>>
> >>> -               if (pte_dirty(pte))
> >>> +               if (pte_dirty(pte)) {
> >>>                         orig_pte = pte_mkdirty(orig_pte);
> >>> -
> >>> -               if (pte_young(pte))
> >>> +                       for (; i < CONT_PTES; i++, ptep++) {
> >>> +                               pte = __ptep_get(ptep);
> >>> +                               if (pte_young(pte)) {
> >>> +                                       orig_pte = pte_mkyoung(orig_pte);
> >>> +                                       break;
> >>> +                               }
> >>> +                       }
> >>> +                       break;
> >>> +               }
> >>> +
> >>> +               if (pte_young(pte)) {
> >>>                         orig_pte = pte_mkyoung(orig_pte);
> >>> +                       i++;
> >>> +                       ptep++;
> >>> +                       for (; i < CONT_PTES; i++, ptep++) {
> >>> +                               pte = __ptep_get(ptep);
> >>> +                               if (pte_dirty(pte)) {
> >>> +                                       orig_pte = pte_mkdirty(orig_pte);
> >>> +                                       break;
> >>> +                               }
> >>> +                       }
> >>> +                       break;
> >>> +               }
> >>>         }
> >>>
> >>>         return orig_pte;
> >>>  }
> >>>  EXPORT_SYMBOL_GPL(contpte_ptep_get);
> >>>
> >>> +static inline bool contpte_is_consistent(pte_t pte, unsigned long pfn,
> >>> +                                       pgprot_t orig_prot)
> >>> +{
> >>> +       pgprot_t prot = pte_pgprot(pte_mkold(pte_mkclean(pte)));
> >>> +
> >>> +       return pte_valid_cont(pte) && pte_pfn(pte) == pfn &&
> >>> +                       pgprot_val(prot) == pgprot_val(orig_prot);
> >>> +}
> >>> +
> >>>  pte_t contpte_ptep_get_lockless(pte_t *orig_ptep)
> >>>  {
> >>>         /*
> >>> @@ -202,7 +231,6 @@ pte_t contpte_ptep_get_lockless(pte_t *orig_ptep)
> >>>         pgprot_t orig_prot;
> >>>         unsigned long pfn;
> >>>         pte_t orig_pte;
> >>> -       pgprot_t prot;
> >>>         pte_t *ptep;
> >>>         pte_t pte;
> >>>         int i;
> >>> @@ -219,18 +247,44 @@ pte_t contpte_ptep_get_lockless(pte_t *orig_ptep)
> >>>
> >>>         for (i = 0; i < CONT_PTES; i++, ptep++, pfn++) {
> >>>                 pte = __ptep_get(ptep);
> >>> -               prot = pte_pgprot(pte_mkold(pte_mkclean(pte)));
> >>>
> >>> -               if (!pte_valid_cont(pte) ||
> >>> -                  pte_pfn(pte) != pfn ||
> >>> -                  pgprot_val(prot) != pgprot_val(orig_prot))
> >>> +               if (!contpte_is_consistent(pte, pfn, orig_prot))
> >>>                         goto retry;
> >>>
> >>> -               if (pte_dirty(pte))
> >>> +               if (pte_dirty(pte)) {
> >>>                         orig_pte = pte_mkdirty(orig_pte);
> >>> +                       for (; i < CONT_PTES; i++, ptep++, pfn++) {
> >>> +                               pte = __ptep_get(ptep);
> >>> +
> >>> +                               if (!contpte_is_consistent(pte, pfn, orig_prot))
> >>> +                                       goto retry;
> >>> +
> >>> +                               if (pte_young(pte)) {
> >>> +                                       orig_pte = pte_mkyoung(orig_pte);
> >>> +                                       break;
> >>> +                               }
> >>> +                       }
> >>> +                       break;
> >>
> >> I considered for a while whether it is safe for contpte_ptep_get_lockless() to
> >> exit early having not seen every PTE in the contpte block and confirmed that
> >> they are all consistent. I eventually concluded that it is, as long as all the
> >> PTEs that it does check are consistent I believe this is fine.
> >
> > So, it looks like my changes here will be okay.
> >
> >>
> >>> +               }
> >>>
> >>> -               if (pte_young(pte))
> >>> +               if (pte_young(pte)) {
> >>>                         orig_pte = pte_mkyoung(orig_pte);
> >>> +                       i++;
> >>> +                       ptep++;
> >>> +                       pfn++;
> >>> +                       for (; i < CONT_PTES; i++, ptep++, pfn++) {
> >>> +                               pte = __ptep_get(ptep);
> >>> +
> >>> +                               if (!contpte_is_consistent(pte, pfn, orig_prot))
> >>> +                                       goto retry;
> >>> +
> >>> +                               if (pte_dirty(pte)) {
> >>> +                                       orig_pte = pte_mkdirty(orig_pte);
> >>> +                                       break;
> >>> +                               }
> >>> +                       }
> >>> +                       break;
> >>> +               }
> >>>         }
> >>>
> >>>         return orig_pte;
> >>
>
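
For readers who want to see the early-termination idea from the quoted patch in isolation, here is a minimal userspace sketch. It is not the kernel code: the flag bits F_DIRTY and F_YOUNG, the plain unsigned int standing in for pte_t, and the helpers gather_full() and gather_early() are all assumptions made for illustration. It also folds the patch's two nested loops into a single loop condition, which should give the same result, since once both bits are set no further read can change orig.

```c
#include <assert.h>
#include <stddef.h>

#define CONT_PTES 16            /* entries per contpte block with 4K pages */
#define F_DIRTY   (1u << 0)     /* hypothetical bit, not the arm64 PTE layout */
#define F_YOUNG   (1u << 1)     /* hypothetical bit, not the arm64 PTE layout */
#define F_BOTH    (F_DIRTY | F_YOUNG)

/* Baseline: read every entry and OR its dirty/young bits into orig. */
static unsigned int gather_full(const unsigned int *ptes, unsigned int orig,
                                int *reads)
{
        assert(ptes != NULL && reads != NULL);
        for (int i = 0; i < CONT_PTES; i++) {
                (*reads)++;
                orig |= ptes[i] & F_BOTH;
        }
        return orig;
}

/*
 * Early exit: stop reading as soon as orig carries both bits.
 * Simplification: this also tests the incoming orig before the first
 * read, whereas the quoted diff always reads at least one entry.
 */
static unsigned int gather_early(const unsigned int *ptes, unsigned int orig,
                                 int *reads)
{
        assert(ptes != NULL && reads != NULL);
        for (int i = 0; i < CONT_PTES && (orig & F_BOTH) != F_BOTH; i++) {
                (*reads)++;
                orig |= ptes[i] & F_BOTH;
        }
        return orig;
}
```

In this model, a block where every entry is dirty and young (Scenario 1) needs one read with gather_early() against 16 with gather_full(). A block with a single young entry or with no flags at all (Scenarios 2 and 3) still needs all 16 reads in both versions, which is in line with the much smaller differences reported for those two scenarios.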