From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 40FrVs03SjzF27s for ; Wed, 4 Apr 2018 00:06:44 +1000 (AEST) Received: from pps.filterd (m0098396.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w33E1WGl100947 for ; Tue, 3 Apr 2018 10:06:42 -0400 Received: from e06smtp15.uk.ibm.com (e06smtp15.uk.ibm.com [195.75.94.111]) by mx0a-001b2d01.pphosted.com with ESMTP id 2h47eyvdk5-1 (version=TLSv1.2 cipher=AES256-SHA256 bits=256 verify=NOT) for ; Tue, 03 Apr 2018 10:06:37 -0400 Received: from localhost by e06smtp15.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 3 Apr 2018 15:06:28 +0100 Subject: Re: [PATCH v2] cxl: Fix possible deadlock when processing page faults from cxllib To: Frederic Barrat , linuxppc-dev@lists.ozlabs.org, mpe@ellerman.id.au Cc: clombard@linux.vnet.ibm.com, andrew.donnellan@au1.ibm.com, vaibhav@linux.vnet.ibm.com References: <20180403135402.30374-1-fbarrat@linux.vnet.ibm.com> From: Laurent Dufour Date: Tue, 3 Apr 2018 16:06:25 +0200 MIME-Version: 1.0 In-Reply-To: <20180403135402.30374-1-fbarrat@linux.vnet.ibm.com> Content-Type: text/plain; charset=utf-8 Message-Id: <904fbfd8-5224-78b5-ca74-4c4dec1125e7@linux.vnet.ibm.com> List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On 03/04/2018 15:54, Frederic Barrat wrote: > cxllib_handle_fault() is called by an external driver when it needs to > have the host resolve page faults for a buffer. The buffer can cover > several pages and VMAs. The function iterates over all the pages used > by the buffer, based on the page size of the VMA. > > To ensure some stability while processing the faults, the thread T1 > grabs the mm->mmap_sem semaphore with read access (R1). However, when > processing a page fault for a single page, one of the underlying > functions, copro_handle_mm_fault(), also grabs the same semaphore with > read access (R2). So the thread T1 takes the semaphore twice. > > If another thread T2 tries to access the semaphore in write mode W1 > (say, because it wants to allocate memory and calls 'brk'), then that > thread T2 will have to wait because there's a reader (R1). If the > thread T1 is processing a new page at that time, it won't get an > automatic grant at R2, because there's now a writer thread > waiting (T2). And we have a deadlock. > > The timeline is: > 1. thread T1 owns the semaphore with read access R1 > 2. thread T2 requests write access W1 and waits > 3. thread T1 requests read access R2 and waits > > The fix is for the thread T1 to release the semaphore R1 once it got > the information it needs from the current VMA. The address space/VMAs > could evolve while T1 iterates over the full buffer, but in the > unlikely case where T1 misses a page, the external driver will raise a > new page fault when retrying the memory access. > > Fixes: 3ced8d730063 ("cxl: Export library to support IBM XSL") > Cc: stable@vger.kernel.org # 4.13+ > Signed-off-by: Frederic Barrat What are the changes introduced in this v2 ? > --- > drivers/misc/cxl/cxllib.c | 85 ++++++++++++++++++++++++++++++----------------- > 1 file changed, 55 insertions(+), 30 deletions(-) > > diff --git a/drivers/misc/cxl/cxllib.c b/drivers/misc/cxl/cxllib.c > index 30ccba436b3b..55cd35d1a9cc 100644 > --- a/drivers/misc/cxl/cxllib.c > +++ b/drivers/misc/cxl/cxllib.c > @@ -208,49 +208,74 @@ int cxllib_get_PE_attributes(struct task_struct *task, > } > EXPORT_SYMBOL_GPL(cxllib_get_PE_attributes); > > -int cxllib_handle_fault(struct mm_struct *mm, u64 addr, u64 size, u64 flags) > +static int get_vma_info(struct mm_struct *mm, u64 addr, > + u64 *vma_start, u64 *vma_end, > + unsigned long *page_size) > { > - int rc; > - u64 dar; > struct vm_area_struct *vma = NULL; > - unsigned long page_size; > - > - if (mm == NULL) > - return -EFAULT; > + int rc = 0; > > down_read(&mm->mmap_sem); > > vma = find_vma(mm, addr); > if (!vma) { > - pr_err("Can't find vma for addr %016llx\n", addr); > rc = -EFAULT; > goto out; > } > - /* get the size of the pages allocated */ > - page_size = vma_kernel_pagesize(vma); > - > - for (dar = (addr & ~(page_size - 1)); dar < (addr + size); dar += page_size) { > - if (dar < vma->vm_start || dar >= vma->vm_end) { > - vma = find_vma(mm, addr); > - if (!vma) { > - pr_err("Can't find vma for addr %016llx\n", addr); > - rc = -EFAULT; > - goto out; > - } > - /* get the size of the pages allocated */ > - page_size = vma_kernel_pagesize(vma); > + *page_size = vma_kernel_pagesize(vma); > + *vma_start = vma->vm_start; > + *vma_end = vma->vm_end; > +out: > + up_read(&mm->mmap_sem); > + return rc; > +} > + > +int cxllib_handle_fault(struct mm_struct *mm, u64 addr, u64 size, u64 flags) > +{ > + int rc; > + u64 dar, vma_start, vma_end; > + unsigned long page_size; > + > + if (mm == NULL) > + return -EFAULT; > + > + /* > + * The buffer we have to process can extend over several pages > + * and may also cover several VMAs. > + * We iterate over all the pages. The page size could vary > + * between VMAs. > + */ > + rc = get_vma_info(mm, addr, &vma_start, &vma_end, &page_size); > + if (rc) > + return rc; > + > + for (dar = (addr & ~(page_size - 1)); dar < (addr + size); > + dar += page_size) { > + if (dar < vma_start || dar >= vma_end) { > + /* > + * We don't hold the mm->mmap_sem semaphore > + * while iterating, since the semaphore is > + * required by one of the lower-level page > + * fault processing functions and it could > + * create a deadlock. > + * > + * It means the VMAs can be altered between 2 > + * loop iterations and we could theoretically > + * miss a page (however unlikely). But that's > + * not really a problem, as the driver will > + * retry access, get another page fault on the > + * missing page and call us again. > + */ > + rc = get_vma_info(mm, dar, &vma_start, &vma_end, > + &page_size); > + if (rc) > + return rc; > } > > rc = cxl_handle_mm_fault(mm, flags, dar); > - if (rc) { > - pr_err("cxl_handle_mm_fault failed %d", rc); > - rc = -EFAULT; > - goto out; > - } > + if (rc) > + return -EFAULT; > } > - rc = 0; > -out: > - up_read(&mm->mmap_sem); > - return rc; > + return 0; > } > EXPORT_SYMBOL_GPL(cxllib_handle_fault); >