From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f46.google.com (mail-pa0-f46.google.com [209.85.220.46]) by kanga.kvack.org (Postfix) with ESMTP id DD2186B0253 for ; Tue, 13 Oct 2015 09:58:43 -0400 (EDT) Received: by padhy16 with SMTP id hy16so22402108pad.1 for ; Tue, 13 Oct 2015 06:58:43 -0700 (PDT) Received: from smtprelay.synopsys.com (smtprelay4.synopsys.com. [198.182.47.9]) by mx.google.com with ESMTPS id wv8si5304934pbc.216.2015.10.13.06.58.43 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 13 Oct 2015 06:58:43 -0700 (PDT) From: Vineet Gupta Subject: pmd_modify() semantics Date: Tue, 13 Oct 2015 13:58:39 +0000 Message-ID: Content-Language: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Sender: owner-linux-mm@kvack.org List-ID: To: "Kirill A. Shutemov" , Andrew Morton Cc: lkml , "linux-mm@kvack.org" , Minchan Kim Hi Kirill,=0A= =0A= I'm running LTP tests on the new ARC THP code and thp03 seems to be trigger= ing mm=0A= spew.=0A= =0A= --------------->8---------------------=0A= [ARCLinux]# ./ltp-thp03-extract=0A= PID 60=0A= bad pmd bf1c4600 be600231=0A= ../mm/pgtable-generic.c:34: bad pgd be600231.=0A= bad pmd bf1c4604 bd800231=0A= ../mm/pgtable-generic.c:34: bad pgd bd800231.=0A= BUG: Bad rss-counter state mm:bf12e900 idx:1 val:512=0A= BUG: non-zero nr_ptes on freeing mm: 2=0A= --------------->8---------------------=0A= =0A= I know what exactly is happening and the likely fix, but would want to get = some=0A= thoughts from you if possible.=0A= =0A= background: ARC is software page walked with PGD -> PTE -> page for normal = and PMD=0A= -> page for THP case. A vanilla PGD doesn't have any flags - only pointer t= o PTE=0A= =0A= A reduced version of thp03 allocates a THP, dirties it, followed by=0A= mprotect(PROT_NONE).=0A= At the time of mprotect() -> change_huge_pmd() -> pmd_modify() needs to cha= nge=0A= some of the bits.=0A= =0A= The issue is ARC implementation of pmd_modify() based on pte variant, which= =0A= retains the soft pte bits (dirty and accessed).=0A= =0A= static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)=0A= {=0A= return pte_pmd(pte_modify(pmd_pte(pmd), newprot));=0A= }=0A= =0A= Obvious fix is to rewrite pmd_modify() so that it clears out all pte type f= lags=0A= but that assumes PMD is becoming PGD (a vanilla PGD on ARC doesn't have any= =0A= flags). Can we have pmd_modify() ever be called for NOT splitting pmd e.g.= =0A= mprotect Write to Read which won't split the THP like it does now and simpl= y=0A= changes the prot flags. My proposed version of pmd_modify() will loose the = dirty bit.=0A= =0A= In short, what are the semantics of pmd_modify() - essentially does it impl= y pmd=0A= is being split so are free to make it like PGD.=0A= =0A= TIA,=0A= -Vineet=0A= -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f176.google.com (mail-wi0-f176.google.com [209.85.212.176]) by kanga.kvack.org (Postfix) with ESMTP id 4D8EE6B0253 for ; Tue, 13 Oct 2015 12:07:22 -0400 (EDT) Received: by wijq8 with SMTP id q8so38868016wij.0 for ; Tue, 13 Oct 2015 09:07:21 -0700 (PDT) Received: from mail-wi0-f180.google.com (mail-wi0-f180.google.com. [209.85.212.180]) by mx.google.com with ESMTPS id gl16si4794409wjc.187.2015.10.13.09.06.58 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 13 Oct 2015 09:06:58 -0700 (PDT) Received: by wicge5 with SMTP id ge5so199418263wic.0 for ; Tue, 13 Oct 2015 09:06:58 -0700 (PDT) Date: Tue, 13 Oct 2015 19:06:56 +0300 From: "Kirill A. Shutemov" Subject: Re: pmd_modify() semantics Message-ID: <20151013160656.GA14071@node> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: owner-linux-mm@kvack.org List-ID: To: Vineet Gupta Cc: "Kirill A. Shutemov" , Andrew Morton , lkml , "linux-mm@kvack.org" , Minchan Kim On Tue, Oct 13, 2015 at 01:58:39PM +0000, Vineet Gupta wrote: > Hi Kirill, > > I'm running LTP tests on the new ARC THP code and thp03 seems to be triggering mm > spew. > > --------------->8--------------------- > [ARCLinux]# ./ltp-thp03-extract > PID 60 > bad pmd bf1c4600 be600231 > ../mm/pgtable-generic.c:34: bad pgd be600231. > bad pmd bf1c4604 bd800231 > ../mm/pgtable-generic.c:34: bad pgd bd800231. > BUG: Bad rss-counter state mm:bf12e900 idx:1 val:512 > BUG: non-zero nr_ptes on freeing mm: 2 > --------------->8--------------------- > > I know what exactly is happening and the likely fix, but would want to get some > thoughts from you if possible. > > background: ARC is software page walked with PGD -> PTE -> page for normal and PMD > -> page for THP case. A vanilla PGD doesn't have any flags - only pointer to PTE > > A reduced version of thp03 allocates a THP, dirties it, followed by > mprotect(PROT_NONE). > At the time of mprotect() -> change_huge_pmd() -> pmd_modify() needs to change > some of the bits. > > The issue is ARC implementation of pmd_modify() based on pte variant, which > retains the soft pte bits (dirty and accessed). > > static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) > { > return pte_pmd(pte_modify(pmd_pte(pmd), newprot)); > } > > Obvious fix is to rewrite pmd_modify() so that it clears out all pte type flags > but that assumes PMD is becoming PGD (a vanilla PGD on ARC doesn't have any > flags). Can we have pmd_modify() ever be called for NOT splitting pmd e.g. > mprotect Write to Read which won't split the THP like it does now and simply > changes the prot flags. My proposed version of pmd_modify() will loose the dirty bit. Hm? pmd_modify() is nothing to do with splitting. The mprotect() codepath you've mentioned above calls pmd_modify() only if the THP is fully in mprotect range. > In short, what are the semantics of pmd_modify() - essentially does it imply pmd > is being split so are free to make it like PGD. No, pmd_modify() cannot make such assumption. That's just not true -- we don't split PMD in such codepath. And even if we do, we construct new PMD entry from scratch instead of modifying existing one. So the semantics of pmd_modify(): you can assume that the entry is pmd_large(), going to stay this way and you need to touch only protection-related bit. -- Kirill A. Shutemov -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f47.google.com (mail-pa0-f47.google.com [209.85.220.47]) by kanga.kvack.org (Postfix) with ESMTP id 419466B0254 for ; Wed, 14 Oct 2015 02:34:05 -0400 (EDT) Received: by pabws5 with SMTP id ws5so14264706pab.1 for ; Tue, 13 Oct 2015 23:34:05 -0700 (PDT) Received: from smtprelay.synopsys.com (us01smtprelay-2.synopsys.com. [198.182.47.9]) by mx.google.com with ESMTPS id ow7si10795925pbb.237.2015.10.13.23.34.04 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 13 Oct 2015 23:34:04 -0700 (PDT) From: Vineet Gupta Subject: Re: pmd_modify() semantics Date: Wed, 14 Oct 2015 06:33:59 +0000 Message-ID: References: <20151013160656.GA14071@node> Content-Language: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Sender: owner-linux-mm@kvack.org List-ID: To: "Kirill A. Shutemov" Cc: "Kirill A. Shutemov" , Andrew Morton , lkml , "linux-mm@kvack.org" , Minchan Kim On Tuesday 13 October 2015 09:37 PM, Kirill A. Shutemov wrote:=0A= > On Tue, Oct 13, 2015 at 01:58:39PM +0000, Vineet Gupta wrote:=0A= >> Hi Kirill,=0A= >>=0A= >> I'm running LTP tests on the new ARC THP code and thp03 seems to be trig= gering mm=0A= >> spew.=0A= >>=0A= >> --------------->8---------------------=0A= >> [ARCLinux]# ./ltp-thp03-extract=0A= >> PID 60=0A= >> bad pmd bf1c4600 be600231=0A= >> ../mm/pgtable-generic.c:34: bad pgd be600231.=0A= >> bad pmd bf1c4604 bd800231=0A= >> ../mm/pgtable-generic.c:34: bad pgd bd800231.=0A= >> BUG: Bad rss-counter state mm:bf12e900 idx:1 val:512=0A= >> BUG: non-zero nr_ptes on freeing mm: 2=0A= >> --------------->8---------------------=0A= >>=0A= >> I know what exactly is happening and the likely fix, but would want to g= et some=0A= >> thoughts from you if possible.=0A= >>=0A= >> background: ARC is software page walked with PGD -> PTE -> page for norm= al and PMD=0A= >> -> page for THP case. A vanilla PGD doesn't have any flags - only pointe= r to PTE=0A= >>=0A= >> A reduced version of thp03 allocates a THP, dirties it, followed by=0A= >> mprotect(PROT_NONE).=0A= >> At the time of mprotect() -> change_huge_pmd() -> pmd_modify() needs to = change=0A= >> some of the bits.=0A= >>=0A= >> The issue is ARC implementation of pmd_modify() based on pte variant, wh= ich=0A= >> retains the soft pte bits (dirty and accessed).=0A= >>=0A= >> static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)=0A= >> {=0A= >> return pte_pmd(pte_modify(pmd_pte(pmd), newprot));=0A= >> }=0A= >>=0A= >> Obvious fix is to rewrite pmd_modify() so that it clears out all pte typ= e flags=0A= >> but that assumes PMD is becoming PGD (a vanilla PGD on ARC doesn't have = any=0A= >> flags). Can we have pmd_modify() ever be called for NOT splitting pmd e.= g.=0A= >> mprotect Write to Read which won't split the THP like it does now and si= mply=0A= >> changes the prot flags. My proposed version of pmd_modify() will loose t= he dirty bit.=0A= > Hm? pmd_modify() is nothing to do with splitting. The mprotect() codepath= =0A= > you've mentioned above calls pmd_modify() only if the THP is fully in=0A= > mprotect range.=0A= =0A= Indeed my mental picture of this was messed up - specially because behind t= he=0A= back, pmd_modify() for ARC (based on pte_modify()) was buggered to clear th= e huge=0A= page bit itself :-) So we had a THP PMD which would start failing for=0A= pmd_trans_huge() and thus treated like a normal PGD. But it had the leftove= r PMD=0A= soft bits, which triggered the MM spew.=0A= =0A= The localized fix is below, while better fix is to make pte_modify() only c= lear=0A= R-W-X bits (currently it clears everything except soft accessed/dirty bits)= =0A= =0A= static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)=0A= {=0A= - return pte_pmd(pte_modify(pmd_pte(pmd), newprot));=0A= + /*=0A= + * open-coded pte_modify() additionally retaining HW_SZ bit=0A= + * otherwise, pmd_trans_huge() checks start failing=0A= + */=0A= + return __pmd((pmd_val(pmd) & (_PAGE_CHG_MASK | _PAGE_HW_SZ)) |=0A= pgprot_val(newprot));=0A= }=0A= =0A= =0A= >=0A= >> In short, what are the semantics of pmd_modify() - essentially does it i= mply pmd=0A= >> is being split so are free to make it like PGD.=0A= > No, pmd_modify() cannot make such assumption. That's just not true -- we= =0A= > don't split PMD in such codepath. And even if we do, we construct new PMD= =0A= > entry from scratch instead of modifying existing one.=0A= >=0A= > So the semantics of pmd_modify(): you can assume that the entry is=0A= > pmd_large(), going to stay this way and you need to touch only=0A= > protection-related bit.=0A= =0A= Thx !=0A= =0A= -Vineet=0A= -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753121AbbJMN6p (ORCPT ); Tue, 13 Oct 2015 09:58:45 -0400 Received: from smtprelay4.synopsys.com ([198.182.47.9]:57362 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752860AbbJMN6n convert rfc822-to-8bit (ORCPT ); Tue, 13 Oct 2015 09:58:43 -0400 From: Vineet Gupta To: "Kirill A. Shutemov" , Andrew Morton CC: lkml , "linux-mm@kvack.org" , Minchan Kim Subject: pmd_modify() semantics Thread-Topic: pmd_modify() semantics Thread-Index: AdEFv0JDFPDvDHmyQyGUnRjrG/HdHg== Date: Tue, 13 Oct 2015 13:58:39 +0000 Message-ID: Accept-Language: en-US, en-IN Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.12.197.182] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8BIT MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Kirill, I'm running LTP tests on the new ARC THP code and thp03 seems to be triggering mm spew. --------------->8--------------------- [ARCLinux]# ./ltp-thp03-extract PID 60 bad pmd bf1c4600 be600231 ../mm/pgtable-generic.c:34: bad pgd be600231. bad pmd bf1c4604 bd800231 ../mm/pgtable-generic.c:34: bad pgd bd800231. BUG: Bad rss-counter state mm:bf12e900 idx:1 val:512 BUG: non-zero nr_ptes on freeing mm: 2 --------------->8--------------------- I know what exactly is happening and the likely fix, but would want to get some thoughts from you if possible. background: ARC is software page walked with PGD -> PTE -> page for normal and PMD -> page for THP case. A vanilla PGD doesn't have any flags - only pointer to PTE A reduced version of thp03 allocates a THP, dirties it, followed by mprotect(PROT_NONE). At the time of mprotect() -> change_huge_pmd() -> pmd_modify() needs to change some of the bits. The issue is ARC implementation of pmd_modify() based on pte variant, which retains the soft pte bits (dirty and accessed). static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) { return pte_pmd(pte_modify(pmd_pte(pmd), newprot)); } Obvious fix is to rewrite pmd_modify() so that it clears out all pte type flags but that assumes PMD is becoming PGD (a vanilla PGD on ARC doesn't have any flags). Can we have pmd_modify() ever be called for NOT splitting pmd e.g. mprotect Write to Read which won't split the THP like it does now and simply changes the prot flags. My proposed version of pmd_modify() will loose the dirty bit. In short, what are the semantics of pmd_modify() - essentially does it imply pmd is being split so are free to make it like PGD. TIA, -Vineet From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932523AbbJMQHA (ORCPT ); Tue, 13 Oct 2015 12:07:00 -0400 Received: from mail-wi0-f174.google.com ([209.85.212.174]:35742 "EHLO mail-wi0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932298AbbJMQG7 (ORCPT ); Tue, 13 Oct 2015 12:06:59 -0400 Date: Tue, 13 Oct 2015 19:06:56 +0300 From: "Kirill A. Shutemov" To: Vineet Gupta Cc: "Kirill A. Shutemov" , Andrew Morton , lkml , "linux-mm@kvack.org" , Minchan Kim Subject: Re: pmd_modify() semantics Message-ID: <20151013160656.GA14071@node> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23.1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Oct 13, 2015 at 01:58:39PM +0000, Vineet Gupta wrote: > Hi Kirill, > > I'm running LTP tests on the new ARC THP code and thp03 seems to be triggering mm > spew. > > --------------->8--------------------- > [ARCLinux]# ./ltp-thp03-extract > PID 60 > bad pmd bf1c4600 be600231 > ../mm/pgtable-generic.c:34: bad pgd be600231. > bad pmd bf1c4604 bd800231 > ../mm/pgtable-generic.c:34: bad pgd bd800231. > BUG: Bad rss-counter state mm:bf12e900 idx:1 val:512 > BUG: non-zero nr_ptes on freeing mm: 2 > --------------->8--------------------- > > I know what exactly is happening and the likely fix, but would want to get some > thoughts from you if possible. > > background: ARC is software page walked with PGD -> PTE -> page for normal and PMD > -> page for THP case. A vanilla PGD doesn't have any flags - only pointer to PTE > > A reduced version of thp03 allocates a THP, dirties it, followed by > mprotect(PROT_NONE). > At the time of mprotect() -> change_huge_pmd() -> pmd_modify() needs to change > some of the bits. > > The issue is ARC implementation of pmd_modify() based on pte variant, which > retains the soft pte bits (dirty and accessed). > > static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) > { > return pte_pmd(pte_modify(pmd_pte(pmd), newprot)); > } > > Obvious fix is to rewrite pmd_modify() so that it clears out all pte type flags > but that assumes PMD is becoming PGD (a vanilla PGD on ARC doesn't have any > flags). Can we have pmd_modify() ever be called for NOT splitting pmd e.g. > mprotect Write to Read which won't split the THP like it does now and simply > changes the prot flags. My proposed version of pmd_modify() will loose the dirty bit. Hm? pmd_modify() is nothing to do with splitting. The mprotect() codepath you've mentioned above calls pmd_modify() only if the THP is fully in mprotect range. > In short, what are the semantics of pmd_modify() - essentially does it imply pmd > is being split so are free to make it like PGD. No, pmd_modify() cannot make such assumption. That's just not true -- we don't split PMD in such codepath. And even if we do, we construct new PMD entry from scratch instead of modifying existing one. So the semantics of pmd_modify(): you can assume that the entry is pmd_large(), going to stay this way and you need to touch only protection-related bit. -- Kirill A. Shutemov From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753084AbbJNGeH (ORCPT ); Wed, 14 Oct 2015 02:34:07 -0400 Received: from us01smtprelay-2.synopsys.com ([198.182.47.9]:34367 "EHLO smtprelay.synopsys.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752708AbbJNGeE convert rfc822-to-8bit (ORCPT ); Wed, 14 Oct 2015 02:34:04 -0400 From: Vineet Gupta To: "Kirill A. Shutemov" CC: "Kirill A. Shutemov" , Andrew Morton , lkml , "linux-mm@kvack.org" , Minchan Kim Subject: Re: pmd_modify() semantics Thread-Topic: pmd_modify() semantics Thread-Index: AdEFv0JDFPDvDHmyQyGUnRjrG/HdHg== Date: Wed, 14 Oct 2015 06:33:59 +0000 Message-ID: References: <20151013160656.GA14071@node> Accept-Language: en-US, en-IN Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.12.197.182] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8BIT MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tuesday 13 October 2015 09:37 PM, Kirill A. Shutemov wrote: > On Tue, Oct 13, 2015 at 01:58:39PM +0000, Vineet Gupta wrote: >> Hi Kirill, >> >> I'm running LTP tests on the new ARC THP code and thp03 seems to be triggering mm >> spew. >> >> --------------->8--------------------- >> [ARCLinux]# ./ltp-thp03-extract >> PID 60 >> bad pmd bf1c4600 be600231 >> ../mm/pgtable-generic.c:34: bad pgd be600231. >> bad pmd bf1c4604 bd800231 >> ../mm/pgtable-generic.c:34: bad pgd bd800231. >> BUG: Bad rss-counter state mm:bf12e900 idx:1 val:512 >> BUG: non-zero nr_ptes on freeing mm: 2 >> --------------->8--------------------- >> >> I know what exactly is happening and the likely fix, but would want to get some >> thoughts from you if possible. >> >> background: ARC is software page walked with PGD -> PTE -> page for normal and PMD >> -> page for THP case. A vanilla PGD doesn't have any flags - only pointer to PTE >> >> A reduced version of thp03 allocates a THP, dirties it, followed by >> mprotect(PROT_NONE). >> At the time of mprotect() -> change_huge_pmd() -> pmd_modify() needs to change >> some of the bits. >> >> The issue is ARC implementation of pmd_modify() based on pte variant, which >> retains the soft pte bits (dirty and accessed). >> >> static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) >> { >> return pte_pmd(pte_modify(pmd_pte(pmd), newprot)); >> } >> >> Obvious fix is to rewrite pmd_modify() so that it clears out all pte type flags >> but that assumes PMD is becoming PGD (a vanilla PGD on ARC doesn't have any >> flags). Can we have pmd_modify() ever be called for NOT splitting pmd e.g. >> mprotect Write to Read which won't split the THP like it does now and simply >> changes the prot flags. My proposed version of pmd_modify() will loose the dirty bit. > Hm? pmd_modify() is nothing to do with splitting. The mprotect() codepath > you've mentioned above calls pmd_modify() only if the THP is fully in > mprotect range. Indeed my mental picture of this was messed up - specially because behind the back, pmd_modify() for ARC (based on pte_modify()) was buggered to clear the huge page bit itself :-) So we had a THP PMD which would start failing for pmd_trans_huge() and thus treated like a normal PGD. But it had the leftover PMD soft bits, which triggered the MM spew. The localized fix is below, while better fix is to make pte_modify() only clear R-W-X bits (currently it clears everything except soft accessed/dirty bits) static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) { - return pte_pmd(pte_modify(pmd_pte(pmd), newprot)); + /* + * open-coded pte_modify() additionally retaining HW_SZ bit + * otherwise, pmd_trans_huge() checks start failing + */ + return __pmd((pmd_val(pmd) & (_PAGE_CHG_MASK | _PAGE_HW_SZ)) | pgprot_val(newprot)); } > >> In short, what are the semantics of pmd_modify() - essentially does it imply pmd >> is being split so are free to make it like PGD. > No, pmd_modify() cannot make such assumption. That's just not true -- we > don't split PMD in such codepath. And even if we do, we construct new PMD > entry from scratch instead of modifying existing one. > > So the semantics of pmd_modify(): you can assume that the entry is > pmd_large(), going to stay this way and you need to touch only > protection-related bit. Thx ! -Vineet