From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753552AbaJVOBW (ORCPT );
	Wed, 22 Oct 2014 10:01:22 -0400
Received: from mx1.redhat.com ([209.132.183.28]:1467 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752122AbaJVOBS (ORCPT );
	Wed, 22 Oct 2014 10:01:18 -0400
Message-ID: <5447B862.4060707@redhat.com>
Date: Wed, 22 Oct 2014 16:00:02 +0200
From: Paolo Bonzini
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101
	Thunderbird/31.2.0
MIME-Version: 1.0
To: Dominik Dingel, Andrew Morton, linux-mm@kvack.org, Mel Gorman,
	Michal Hocko, Dave Hansen, Rik van Riel
CC: Andrea Arcangeli, Andy Lutomirski, "Aneesh Kumar K.V", Bob Liu,
	Christian Borntraeger, Cornelia Huck, Gleb Natapov, Heiko Carstens,
	"H. Peter Anvin", Hugh Dickins, Ingo Molnar, Jianyu Zhan,
	Johannes Weiner, "Kirill A. Shutemov", kvm@vger.kernel.org,
	linux390@de.ibm.com, linux-kernel@vger.kernel.org,
	linux-s390@vger.kernel.org, Martin Schwidefsky, Peter Zijlstra,
	Sasha Levin
Subject: Re: [PATCH 3/4] s390/mm: prevent and break zero page mappings in
	case of storage keys
References: <1413976170-42501-1-git-send-email-dingel@linux.vnet.ibm.com>
	<1413976170-42501-4-git-send-email-dingel@linux.vnet.ibm.com>
In-Reply-To: <1413976170-42501-4-git-send-email-dingel@linux.vnet.ibm.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Reviewed-by: Paolo Bonzini

On 10/22/2014 01:09 PM, Dominik Dingel wrote:
> As soon as storage keys are enabled we need to stop working on zero page
> mappings to prevent inconsistencies between storage keys and pgste.
>
> Otherwise the following data corruption could happen:
> 1) guest enables storage keys
> 2) guest sets storage key for unmapped page X
>    -> change goes to PGSTE
> 3) guest reads from page X
>    -> as X was not dirty before, the page will be zero page backed,
>       the storage key from the PGSTE for X will go to the storage key
>       for the zero page
> 4) guest sets storage key for unmapped page Y (same logic as above)
> 5) guest reads from page Y
>    -> as Y was not dirty before, the page will be zero page backed,
>       the storage key from the PGSTE for Y will go to the storage key
>       for the zero page, overwriting the storage key for X
>
> While holding the mmap_sem, we are safe against changes on entries we
> already fixed, as every fault would need to take the mmap_sem (read).
>
> Other vCPUs executing storage key instructions will get a one-time
> interception and are also serialized by the mmap_sem.
>
> Signed-off-by: Dominik Dingel
> ---
>  arch/s390/include/asm/pgtable.h |  5 +++++
>  arch/s390/mm/pgtable.c          | 13 ++++++++++++-
>  2 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
> index 1e991f6a..0da98d6 100644
> --- a/arch/s390/include/asm/pgtable.h
> +++ b/arch/s390/include/asm/pgtable.h
> @@ -481,6 +481,11 @@ static inline int mm_has_pgste(struct mm_struct *mm)
>  	return 0;
>  }
>  
> +/*
> + * In the case that a guest uses storage keys,
> + * faults should no longer be backed by zero pages
> + */
> +#define mm_forbids_zeropage mm_use_skey
>  static inline int mm_use_skey(struct mm_struct *mm)
>  {
>  #ifdef CONFIG_PGSTE
> diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
> index ab55ba8..58d7eb2 100644
> --- a/arch/s390/mm/pgtable.c
> +++ b/arch/s390/mm/pgtable.c
> @@ -1309,6 +1309,15 @@ static int __s390_enable_skey(pte_t *pte, unsigned long addr,
>  	pgste_t pgste;
>  
>  	pgste = pgste_get_lock(pte);
> +	/*
> +	 * Remove all zero page mappings after establishing a policy
> +	 * to forbid zero page mappings; following faults for that
> +	 * page will get fresh anonymous pages.
> +	 */
> +	if (is_zero_pfn(pte_pfn(*pte))) {
> +		ptep_flush_direct(walk->mm, addr, pte);
> +		pte_val(*pte) = _PAGE_INVALID;
> +	}
>  	/* Clear storage key */
>  	pgste_val(pgste) &= ~(PGSTE_ACC_BITS | PGSTE_FP_BIT |
>  			      PGSTE_GR_BIT | PGSTE_GC_BIT);
> @@ -1327,9 +1336,11 @@ void s390_enable_skey(void)
>  	down_write(&mm->mmap_sem);
>  	if (mm_use_skey(mm))
>  		goto out_up;
> +
> +	mm->context.use_skey = 1;
> +
>  	walk.mm = mm;
>  	walk_page_range(0, TASK_SIZE, &walk);
> -	mm->context.use_skey = 1;
>  
>  out_up:
>  	up_write(&mm->mmap_sem);
>