From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Hansen
Subject: Re: [RFC PATCH v3 12/24] x86/mm: Modify ptep_set_wrprotect and pmdp_set_wrprotect for _PAGE_DIRTY_SW
Date: Thu, 30 Aug 2018 09:08:13 -0700
Message-ID: <079a55f2-4654-4adf-a6ef-6e480b594a2f@linux.intel.com>
References: <20180830143904.3168-1-yu-cheng.yu@intel.com> <20180830143904.3168-13-yu-cheng.yu@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Return-path:
In-Reply-To:
Content-Language: en-US
Sender: linux-kernel-owner@vger.kernel.org
To: Jann Horn, yu-cheng.yu@intel.com
Cc: the arch/x86 maintainers, "H . Peter Anvin", Thomas Gleixner, Ingo Molnar, kernel list, linux-doc@vger.kernel.org, Linux-MM, linux-arch, Linux API, Arnd Bergmann, Andy Lutomirski, Balbir Singh, Cyrill Gorcunov, Florian Weimer, hjl.tools@gmail.com, Jonathan Corbet, keescook@chromiun.org, Mike Kravetz, Nadav Amit, Oleg Nesterov, Pavel Machek, Peter Zijlstra
List-Id: linux-arch.vger.kernel.org

On 08/30/2018 08:49 AM, Jann Horn wrote:
>> @@ -1203,7 +1203,28 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
>>  static inline void ptep_set_wrprotect(struct mm_struct *mm,
>>  				      unsigned long addr, pte_t *ptep)
>>  {
>> +	pte_t pte;
>> +
>>  	clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte);
>> +	pte = *ptep;
>> +
>> +	/*
>> +	 * Some processors can start a write, but ending up seeing
>> +	 * a read-only PTE by the time they get to the Dirty bit.
>> +	 * In this case, they will set the Dirty bit, leaving a
>> +	 * read-only, Dirty PTE which looks like a Shadow Stack PTE.
>> +	 *
>> +	 * However, this behavior has been improved and will not occur
>> +	 * on processors supporting Shadow Stacks.  Without this
>> +	 * guarantee, a transition to a non-present PTE and flush the
>> +	 * TLB would be needed.
>> +	 *
>> +	 * When change a writable PTE to read-only and if the PTE has
>> +	 * _PAGE_DIRTY_HW set, we move that bit to _PAGE_DIRTY_SW so
>> +	 * that the PTE is not a valid Shadow Stack PTE.
>> +	 */
>> +	pte = pte_move_flags(pte, _PAGE_DIRTY_HW, _PAGE_DIRTY_SW);
>> +	set_pte_at(mm, addr, ptep, pte);
>>  }
> I don't understand why it's okay that you first atomically clear the
> RW bit, then atomically switch from DIRTY_HW to DIRTY_SW. Doesn't that
> mean that between the two atomic writes, another core can incorrectly
> see a shadow stack?

Good point.

This could result in a spurious shadow-stack fault, or allow a
shadow-stack write to the page in the transient state.

But, the shadow-stack permissions are more restrictive than what could
be in the TLB at this point, so I don't think there's a real security
implication here.

The only trouble is handling the spurious shadow-stack fault. The
alternative is to go !Present for a bit, which we would probably just
handle fine in the existing page fault code.
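
To make the transient state concrete, here is a small userspace model of
the two-step update, written with C11 atomics. It is only a sketch under
stated assumptions: the PTE is modelled as a bare 64-bit word, the helper
names are hypothetical and not kernel APIs, and the bit position used for
the software Dirty bit is a placeholder rather than the value from the
patch. The second function shows one conceivable way to close the window
with a single compare-exchange; it is neither what the patch does nor the
!Present alternative mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_RW        (UINT64_C(1) << 1)   /* x86 R/W bit */
#define PTE_DIRTY_HW  (UINT64_C(1) << 6)   /* x86 hardware Dirty bit */
#define PTE_DIRTY_SW  (UINT64_C(1) << 58)  /* ASSUMPTION: placeholder position */

/* A read-only PTE with the hardware Dirty bit set reads as shadow stack. */
static bool looks_like_shadow_stack(uint64_t pte)
{
	return !(pte & PTE_RW) && (pte & PTE_DIRTY_HW);
}

/* Hypothetical stand-in for pte_move_flags(pte, DIRTY_HW, DIRTY_SW). */
static uint64_t move_dirty_to_sw(uint64_t pte)
{
	if (pte & PTE_DIRTY_HW)
		pte = (pte & ~PTE_DIRTY_HW) | PTE_DIRTY_SW;
	return pte;
}

/*
 * Two separate writes, mirroring the order in the quoted patch.
 * Returns the value that other CPUs could observe between the writes.
 */
static uint64_t wrprotect_two_step(_Atomic uint64_t *ptep)
{
	atomic_fetch_and(ptep, ~PTE_RW);           /* step 1: clear RW */
	uint64_t transient = atomic_load(ptep);    /* visible in between */
	atomic_store(ptep, move_dirty_to_sw(transient)); /* step 2 */
	return transient;
}

/*
 * Single-step variant: compute the final value and publish it with one
 * compare-exchange, retrying if the word changed underneath us.
 */
static void wrprotect_one_step(_Atomic uint64_t *ptep)
{
	uint64_t old = atomic_load(ptep);
	uint64_t new;

	do {
		new = move_dirty_to_sw(old & ~PTE_RW);
	} while (!atomic_compare_exchange_weak(ptep, &old, new));

	/* The published value never has RW clear with DIRTY_HW set. */
	assert(!looks_like_shadow_stack(new));
}
```

Starting from a writable, Dirty entry, the value returned by
wrprotect_two_step() satisfies looks_like_shadow_stack(), which is the
window described above. With wrprotect_one_step() no such intermediate
value is ever stored.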