From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
MIME-Version: 1.0
In-Reply-To: <20150922211716.GA32623@linux.intel.com>
References: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
	<20150922211716.GA32623@linux.intel.com>
Date: Tue, 22 Sep 2015 14:26:30 -0700
Message-ID:
Subject: Re: [PATCH v2] dax: fix NULL pointer in __dax_pmd_fault()
From: Dan Williams
Content-Type: text/plain; charset=UTF-8
Sender: owner-linux-mm@kvack.org
To: Ross Zwisler, Dan Williams, Andrew Morton,
	"linux-kernel@vger.kernel.org", Alexander Viro, Matthew Wilcox,
	linux-fsdevel, "Kirill A. Shutemov", "linux-nvdimm@lists.01.org",
	Dave Chinner, Linux MM
List-ID:

On Tue, Sep 22, 2015 at 2:17 PM, Ross Zwisler wrote:
> On Tue, Sep 22, 2015 at 01:51:04PM -0700, Dan Williams wrote:
>> [ adding Andrew ]
>>
>> On Tue, Sep 22, 2015 at 12:36 PM, Ross Zwisler
>> wrote:
>> > The following commit:
>> >
>> > commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for
>> > DAX")
>> >
>> > moved some code in __dax_pmd_fault() that was responsible for zeroing
>> > newly allocated PMD pages. The new location didn't properly set up
>> > 'kaddr', though, so when run this code resulted in a NULL pointer BUG.
>> >
>> > Fix this by getting the correct 'kaddr' via bdev_direct_access().
>> >
>> > Signed-off-by: Ross Zwisler
>> > Reported-by: Dan Williams
>>
>> Taking into account the comment below,
>>
>> Reviewed-by: Dan Williams
>>
>> > ---
>> > fs/dax.c | 13 ++++++++++++-
>> > 1 file changed, 12 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/fs/dax.c b/fs/dax.c
>> > index 7ae6df7..bcfb14b 100644
>> > --- a/fs/dax.c
>> > +++ b/fs/dax.c
>> > @@ -569,8 +569,20 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
>> >  	if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
>> >  		goto fallback;
>> >
>> > +	sector = bh.b_blocknr << (blkbits - 9);
>> > +
>> >  	if (buffer_unwritten(&bh) || buffer_new(&bh)) {
>> >  		int i;
>> > +
>> > +		length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
>> > +						bh.b_size);
>> > +		if (length < 0) {
>> > +			result = VM_FAULT_SIGBUS;
>> > +			goto out;
>> > +		}
>> > +		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
>> > +			goto fallback;
>> > +
>>
>> Hmm, we don't need the PG_PMD_COLOUR check since we aren't using the
>> pfn in this path, right?
>
> I think we care, because we'll end up bailing anyway at the later
> PG_PMD_COLOUR check before we actually insert the pfn via
> vmf_insert_pfn_pmd(). If we don't check the alignment we'll do 2 MiB worth of
> zeroing to the media, then later fall back to PTE faults.

Ok, good point.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Tue, 22 Sep 2015 15:17:16 -0600
From: Ross Zwisler
Subject: Re: [PATCH v2] dax: fix NULL pointer in __dax_pmd_fault()
Message-ID: <20150922211716.GA32623@linux.intel.com>
References: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
To: Dan Williams, @linux.intel.com
Cc: Ross Zwisler, Andrew Morton, "linux-kernel@vger.kernel.org",
	Alexander Viro, Matthew Wilcox, linux-fsdevel, "Kirill A. Shutemov",
	"linux-nvdimm@lists.01.org", Dave Chinner, Linux MM
List-ID:

On Tue, Sep 22, 2015 at 01:51:04PM -0700, Dan Williams wrote:
> [ adding Andrew ]
>
> On Tue, Sep 22, 2015 at 12:36 PM, Ross Zwisler
> wrote:
> > The following commit:
> >
> > commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for
> > DAX")
> >
> > moved some code in __dax_pmd_fault() that was responsible for zeroing
> > newly allocated PMD pages. The new location didn't properly set up
> > 'kaddr', though, so when run this code resulted in a NULL pointer BUG.
> >
> > Fix this by getting the correct 'kaddr' via bdev_direct_access().
> >
> > Signed-off-by: Ross Zwisler
> > Reported-by: Dan Williams
>
> Taking into account the comment below,
>
> Reviewed-by: Dan Williams
>
> > ---
> > fs/dax.c | 13 ++++++++++++-
> > 1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/dax.c b/fs/dax.c
> > index 7ae6df7..bcfb14b 100644
> > --- a/fs/dax.c
> > +++ b/fs/dax.c
> > @@ -569,8 +569,20 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
> >  	if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
> >  		goto fallback;
> >
> > +	sector = bh.b_blocknr << (blkbits - 9);
> > +
> >  	if (buffer_unwritten(&bh) || buffer_new(&bh)) {
> >  		int i;
> > +
> > +		length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
> > +						bh.b_size);
> > +		if (length < 0) {
> > +			result = VM_FAULT_SIGBUS;
> > +			goto out;
> > +		}
> > +		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
> > +			goto fallback;
> > +
>
> Hmm, we don't need the PG_PMD_COLOUR check since we aren't using the
> pfn in this path, right?

I think we care, because we'll end up bailing anyway at the later
PG_PMD_COLOUR check before we actually insert the pfn via
vmf_insert_pfn_pmd(). If we don't check the alignment we'll do 2 MiB worth of
zeroing to the media, then later fall back to PTE faults.

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
MIME-Version: 1.0
In-Reply-To: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
References: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
Date: Tue, 22 Sep 2015 13:51:04 -0700
Message-ID:
Subject: Re: [PATCH v2] dax: fix NULL pointer in __dax_pmd_fault()
From: Dan Williams
Content-Type: text/plain; charset=UTF-8
Sender: owner-linux-mm@kvack.org
To: Ross Zwisler, Andrew Morton
Cc: "linux-kernel@vger.kernel.org", Alexander Viro, Matthew Wilcox,
	linux-fsdevel, "Kirill A. Shutemov", "linux-nvdimm@lists.01.org",
	Dave Chinner, Linux MM
List-ID:

[ adding Andrew ]

On Tue, Sep 22, 2015 at 12:36 PM, Ross Zwisler wrote:
> The following commit:
>
> commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for
> DAX")
>
> moved some code in __dax_pmd_fault() that was responsible for zeroing
> newly allocated PMD pages. The new location didn't properly set up
> 'kaddr', though, so when run this code resulted in a NULL pointer BUG.
>
> Fix this by getting the correct 'kaddr' via bdev_direct_access().
>
> Signed-off-by: Ross Zwisler
> Reported-by: Dan Williams

Taking into account the comment below,

Reviewed-by: Dan Williams

> ---
> fs/dax.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 7ae6df7..bcfb14b 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -569,8 +569,20 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
>  	if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
>  		goto fallback;
>
> +	sector = bh.b_blocknr << (blkbits - 9);
> +
>  	if (buffer_unwritten(&bh) || buffer_new(&bh)) {
>  		int i;
> +
> +		length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
> +						bh.b_size);
> +		if (length < 0) {
> +			result = VM_FAULT_SIGBUS;
> +			goto out;
> +		}
> +		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
> +			goto fallback;
> +

Hmm, we don't need the PG_PMD_COLOUR check since we aren't using the
pfn in this path, right?

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
From: Ross Zwisler
Subject: [PATCH v2] dax: fix NULL pointer in __dax_pmd_fault()
Date: Tue, 22 Sep 2015 13:36:22 -0600
Message-Id: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
Sender: linux-kernel-owner@vger.kernel.org
To: linux-kernel@vger.kernel.org
Cc: Ross Zwisler, Alexander Viro, Matthew Wilcox,
	linux-fsdevel@vger.kernel.org, "Kirill A. Shutemov",
	linux-nvdimm@lists.01.org, Dan Williams, Dave Chinner
List-ID:

The following commit:

commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for
DAX")

moved some code in __dax_pmd_fault() that was responsible for zeroing
newly allocated PMD pages. The new location didn't properly set up
'kaddr', though, so when run this code resulted in a NULL pointer BUG.

Fix this by getting the correct 'kaddr' via bdev_direct_access().

Signed-off-by: Ross Zwisler
Reported-by: Dan Williams
---
 fs/dax.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/dax.c b/fs/dax.c
index 7ae6df7..bcfb14b 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -569,8 +569,20 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
 		goto fallback;

+	sector = bh.b_blocknr << (blkbits - 9);
+
 	if (buffer_unwritten(&bh) || buffer_new(&bh)) {
 		int i;
+
+		length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
+						bh.b_size);
+		if (length < 0) {
+			result = VM_FAULT_SIGBUS;
+			goto out;
+		}
+		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
+			goto fallback;
+
 		for (i = 0; i < PTRS_PER_PMD; i++)
 			clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
 		wmb_pmem();
@@ -623,7 +635,6 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 		result = VM_FAULT_NOPAGE;
 		spin_unlock(ptl);
 	} else {
-		sector = bh.b_blocknr << (blkbits - 9);
 		length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
 						bh.b_size);
 		if (length < 0) {
--
2.1.0

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Morton
Subject: Re: [PATCH v2] dax: fix NULL pointer in __dax_pmd_fault()
Date: Tue, 22 Sep 2015 14:13:33 -0700
Message-ID: <20150922141333.c28e3c5d800267937ca7b29a@linux-foundation.org>
References: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: linux-kernel@vger.kernel.org, Alexander Viro, Matthew Wilcox,
	linux-fsdevel@vger.kernel.org, "Kirill A. Shutemov",
	linux-nvdimm@ml01.01.org, Dan Williams, Dave Chinner
To: Ross Zwisler
Return-path:
In-Reply-To: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Tue, 22 Sep 2015 13:36:22 -0600 Ross Zwisler wrote:

> The following commit:
>
> commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for
> DAX")
>
> moved some code in __dax_pmd_fault() that was responsible for zeroing
> newly allocated PMD pages. The new location didn't properly set up
> 'kaddr', though, so when run this code resulted in a NULL pointer BUG.
>
> Fix this by getting the correct 'kaddr' via bdev_direct_access().

Why the heck didn't gcc warn?

I had a fiddle:

--- a/fs/dax.c~a
+++ a/fs/dax.c
@@ -529,15 +529,18 @@ int __dax_pmd_fault(struct vm_area_struc
 	unsigned long pmd_addr = address & PMD_MASK;
 	bool write = flags & FAULT_FLAG_WRITE;
 	long length;
-	void __pmem *kaddr;
+	void *kaddr;
 	pgoff_t size, pgoff;
 	sector_t block, sector;
 	unsigned long pfn;
 	int result = 0;

+//	printk("%p\n", kaddr);
+
 	/* Fall back to PTEs if we're going to COW */
 	if (write && !(vma->vm_flags & VM_SHARED))
 		return VM_FAULT_FALLBACK;
+	printk("%p\n", kaddr);
 	/* If the PMD would extend outside the VMA */
 	if (pmd_addr < vma->vm_start)
 		return VM_FAULT_FALLBACK;

gcc warns about the first printk, but not about the second. So that
"if (...) return ..." seems to have defeated gcc uninitialized-var
detection. wtf?

> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -569,8 +569,20 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
>  	if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
>  		goto fallback;
>
> +	sector = bh.b_blocknr << (blkbits - 9);
> +
>  	if (buffer_unwritten(&bh) || buffer_new(&bh)) {
>  		int i;
> +
> +		length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
> +						bh.b_size);
> +		if (length < 0) {
> +			result = VM_FAULT_SIGBUS;
> +			goto out;
> +		}
> +		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
> +			goto fallback;
> +
>  		for (i = 0; i < PTRS_PER_PMD; i++)
>  			clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
>  		wmb_pmem();

hm, that's a lot of copy-n-paste. Do we really need to run
bdev_direct_access() twice? Will `kaddr' and `pfn' change?
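
Note on the arithmetic in the patch: the two checks it relies on, the
block-to-sector shift and the PG_PMD_COLOUR alignment test, can be tried
out in user space. This is a minimal sketch, not kernel code. It assumes
x86-64 values (PAGE_SHIFT 12, PMD_SHIFT 21) and assumes PG_PMD_COLOUR is
defined as the number of pages per PMD minus one; block_to_sector() and
pmd_mappable() are hypothetical helper names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86-64 values; the kernel takes these from arch headers. */
#define PAGE_SHIFT	12
#define PMD_SHIFT	21
#define PMD_SIZE	(1UL << PMD_SHIFT)
/* Assumed definition: low pfn bits that must be zero for a PMD mapping. */
#define PG_PMD_COLOUR	((PMD_SIZE >> PAGE_SHIFT) - 1)

/* Hypothetical helper: filesystem block number -> 512-byte sector. */
static uint64_t block_to_sector(uint64_t blocknr, unsigned int blkbits)
{
	assert(blkbits >= 9);	/* blocks are never smaller than a sector */
	return blocknr << (blkbits - 9);
}

/*
 * Hypothetical helper: true if `length` bytes starting at `pfn` can back
 * one PMD entry, i.e. the range is long enough and the pfn is aligned.
 */
static bool pmd_mappable(long length, unsigned long pfn)
{
	return length >= (long)PMD_SIZE && (pfn & PG_PMD_COLOUR) == 0;
}
```

With 4 KiB blocks (blkbits == 12) each block covers eight sectors, and
with 4 KiB pages a PMD covers 512 pages, so only a pfn that is a multiple
of 512 passes the colour test. Doing that test before the zeroing loop is
what avoids clearing 2 MiB that would then fall back to PTE faults.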
From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dan Williams
Subject: Re: [PATCH v2] dax: fix NULL pointer in __dax_pmd_fault()
Date: Tue, 22 Sep 2015 14:25:19 -0700
Message-ID:
References: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
	<20150922141333.c28e3c5d800267937ca7b29a@linux-foundation.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Cc: Ross Zwisler, "linux-kernel@vger.kernel.org", Alexander Viro,
	Matthew Wilcox, linux-fsdevel, "Kirill A. Shutemov", linux-nvdimm,
	Dave Chinner
To: Andrew Morton
Return-path:
Received: from mail-wi0-f181.google.com ([209.85.212.181]:36832 "EHLO
	mail-wi0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S934304AbbIVVZU (ORCPT ); Tue, 22 Sep 2015 17:25:20 -0400
Received: by wicgb1 with SMTP id gb1so179048654wic.1 for ;
	Tue, 22 Sep 2015 14:25:19 -0700 (PDT)
In-Reply-To: <20150922141333.c28e3c5d800267937ca7b29a@linux-foundation.org>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Tue, Sep 22, 2015 at 2:13 PM, Andrew Morton wrote:
> On Tue, 22 Sep 2015 13:36:22 -0600 Ross Zwisler wrote:
>
>> The following commit:
>>
>> commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for
>> DAX")
>>
>> moved some code in __dax_pmd_fault() that was responsible for zeroing
>> newly allocated PMD pages. The new location didn't properly set up
>> 'kaddr', though, so when run this code resulted in a NULL pointer BUG.
>>
>> Fix this by getting the correct 'kaddr' via bdev_direct_access().
>
> Why the heck didn't gcc warn?
>
> I had a fiddle:
>
> --- a/fs/dax.c~a
> +++ a/fs/dax.c
> @@ -529,15 +529,18 @@ int __dax_pmd_fault(struct vm_area_struc
>  	unsigned long pmd_addr = address & PMD_MASK;
>  	bool write = flags & FAULT_FLAG_WRITE;
>  	long length;
> -	void __pmem *kaddr;
> +	void *kaddr;
>  	pgoff_t size, pgoff;
>  	sector_t block, sector;
>  	unsigned long pfn;
>  	int result = 0;
>
> +//	printk("%p\n", kaddr);
> +
>  	/* Fall back to PTEs if we're going to COW */
>  	if (write && !(vma->vm_flags & VM_SHARED))
>  		return VM_FAULT_FALLBACK;
> +	printk("%p\n", kaddr);
>  	/* If the PMD would extend outside the VMA */
>  	if (pmd_addr < vma->vm_start)
>  		return VM_FAULT_FALLBACK;
>
> gcc warns about the first printk, but not about the second. So that
> "if (...) return ..." seems to have defeated gcc uninitialized-var
> detection. wtf?
>
>> --- a/fs/dax.c
>> +++ b/fs/dax.c
>> @@ -569,8 +569,20 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
>>  	if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
>>  		goto fallback;
>>
>> +	sector = bh.b_blocknr << (blkbits - 9);
>> +
>>  	if (buffer_unwritten(&bh) || buffer_new(&bh)) {
>>  		int i;
>> +
>> +		length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
>> +						bh.b_size);
>> +		if (length < 0) {
>> +			result = VM_FAULT_SIGBUS;
>> +			goto out;
>> +		}
>> +		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
>> +			goto fallback;
>> +
>>  		for (i = 0; i < PTRS_PER_PMD; i++)
>>  			clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
>>  		wmb_pmem();
>
> hm, that's a lot of copy-n-paste. Do we really need to run
> bdev_direct_access() twice? Will `kaddr' and `pfn' change?
>

They shouldn't change, but I'm working on a fix for handling the race
of unbinding the pmem device while that kaddr is in use (unbind
invalidates kaddr). The proposal is a dax_map_bh()/dax_unmap_bh()
interface to temporarily pin the mapping around each usage.
From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Chinner
Subject: Re: [PATCH v2] dax: fix NULL pointer in __dax_pmd_fault()
Date: Wed, 23 Sep 2015 09:30:16 +1000
Message-ID: <20150922233016.GH3902@dastard>
References: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
	<20150922141333.c28e3c5d800267937ca7b29a@linux-foundation.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Andrew Morton, Ross Zwisler, "linux-kernel@vger.kernel.org",
	Alexander Viro, Matthew Wilcox, linux-fsdevel, "Kirill A. Shutemov",
	linux-nvdimm
To: Dan Williams
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Tue, Sep 22, 2015 at 02:25:19PM -0700, Dan Williams wrote:
> On Tue, Sep 22, 2015 at 2:13 PM, Andrew Morton
> wrote:
> > On Tue, 22 Sep 2015 13:36:22 -0600 Ross Zwisler wrote:
> >
> >> The following commit:
> >>
> >> commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for
> >> DAX")
> >>
> >> moved some code in __dax_pmd_fault() that was responsible for zeroing
> >> newly allocated PMD pages. The new location didn't properly set up
> >> 'kaddr', though, so when run this code resulted in a NULL pointer BUG.
> >>
> >> Fix this by getting the correct 'kaddr' via bdev_direct_access().
> >
> > Why the heck didn't gcc warn?
> >
> > I had a fiddle:
> >
> > --- a/fs/dax.c~a
> > +++ a/fs/dax.c
> > @@ -529,15 +529,18 @@ int __dax_pmd_fault(struct vm_area_struc
> >  	unsigned long pmd_addr = address & PMD_MASK;
> >  	bool write = flags & FAULT_FLAG_WRITE;
> >  	long length;
> > -	void __pmem *kaddr;
> > +	void *kaddr;
> >  	pgoff_t size, pgoff;
> >  	sector_t block, sector;
> >  	unsigned long pfn;
> >  	int result = 0;
> >
> > +//	printk("%p\n", kaddr);
> > +
> >  	/* Fall back to PTEs if we're going to COW */
> >  	if (write && !(vma->vm_flags & VM_SHARED))
> >  		return VM_FAULT_FALLBACK;
> > +	printk("%p\n", kaddr);
> >  	/* If the PMD would extend outside the VMA */
> >  	if (pmd_addr < vma->vm_start)
> >  		return VM_FAULT_FALLBACK;
> >
> > gcc warns about the first printk, but not about the second. So that
> > "if (...) return ..." seems to have defeated gcc uninitialized-var
> > detection. wtf?
> >
> >> --- a/fs/dax.c
> >> +++ b/fs/dax.c
> >> @@ -569,8 +569,20 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
> >>  	if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
> >>  		goto fallback;
> >>
> >> +	sector = bh.b_blocknr << (blkbits - 9);
> >> +
> >>  	if (buffer_unwritten(&bh) || buffer_new(&bh)) {
> >>  		int i;
> >> +
> >> +		length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
> >> +						bh.b_size);
> >> +		if (length < 0) {
> >> +			result = VM_FAULT_SIGBUS;
> >> +			goto out;
> >> +		}
> >> +		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
> >> +			goto fallback;
> >> +
> >>  		for (i = 0; i < PTRS_PER_PMD; i++)
> >>  			clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
> >>  		wmb_pmem();
> >
> > hm, that's a lot of copy-n-paste. Do we really need to run
> > bdev_direct_access() twice? Will `kaddr' and `pfn' change?
> >
>
> They shouldn't change, but I'm working on a fix for handling the race
> of unbinding the pmem device while that kaddr is in use (unbind
> invalidates kaddr).

Exactly what does "unbinding the pmem device" mean, and why can
(parts of) the pmem device "go away" when there are active
references to it?

> The proposal is a dax_map_bh()/dax_unmap_bh()
> interface to temporarily pin the mapping around each usage.

Which mapping? The bufferhead maps file offset to filesystem block
addresses, so I'm not sure what problem you are actually refering
to here...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dan Williams
Subject: Re: [PATCH v2] dax: fix NULL pointer in __dax_pmd_fault()
Date: Tue, 22 Sep 2015 20:00:29 -0700
Message-ID:
References: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
	<20150922141333.c28e3c5d800267937ca7b29a@linux-foundation.org>
	<20150922233016.GH3902@dastard>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Cc: Andrew Morton, Ross Zwisler, "linux-kernel@vger.kernel.org",
	Alexander Viro, Matthew Wilcox, linux-fsdevel, "Kirill A. Shutemov",
	linux-nvdimm
To: Dave Chinner
Return-path:
In-Reply-To: <20150922233016.GH3902@dastard>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Tue, Sep 22, 2015 at 4:30 PM, Dave Chinner wrote:
> On Tue, Sep 22, 2015 at 02:25:19PM -0700, Dan Williams wrote:
>> On Tue, Sep 22, 2015 at 2:13 PM, Andrew Morton
>> wrote:
>> > On Tue, 22 Sep 2015 13:36:22 -0600 Ross Zwisler wrote:
>> >
>> >> The following commit:
>> >>
>> >> commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for
>> >> DAX")
>> >>
>> >> moved some code in __dax_pmd_fault() that was responsible for zeroing
>> >> newly allocated PMD pages. The new location didn't properly set up
>> >> 'kaddr', though, so when run this code resulted in a NULL pointer BUG.
>> >>
>> >> Fix this by getting the correct 'kaddr' via bdev_direct_access().
>> >
>> > Why the heck didn't gcc warn?
>> >
>> > I had a fiddle:
>> >
>> > --- a/fs/dax.c~a
>> > +++ a/fs/dax.c
>> > @@ -529,15 +529,18 @@ int __dax_pmd_fault(struct vm_area_struc
>> >  	unsigned long pmd_addr = address & PMD_MASK;
>> >  	bool write = flags & FAULT_FLAG_WRITE;
>> >  	long length;
>> > -	void __pmem *kaddr;
>> > +	void *kaddr;
>> >  	pgoff_t size, pgoff;
>> >  	sector_t block, sector;
>> >  	unsigned long pfn;
>> >  	int result = 0;
>> >
>> > +//	printk("%p\n", kaddr);
>> > +
>> >  	/* Fall back to PTEs if we're going to COW */
>> >  	if (write && !(vma->vm_flags & VM_SHARED))
>> >  		return VM_FAULT_FALLBACK;
>> > +	printk("%p\n", kaddr);
>> >  	/* If the PMD would extend outside the VMA */
>> >  	if (pmd_addr < vma->vm_start)
>> >  		return VM_FAULT_FALLBACK;
>> >
>> > gcc warns about the first printk, but not about the second. So that
>> > "if (...) return ..." seems to have defeated gcc uninitialized-var
>> > detection. wtf?
>> >
>> >> --- a/fs/dax.c
>> >> +++ b/fs/dax.c
>> >> @@ -569,8 +569,20 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
>> >>  	if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
>> >>  		goto fallback;
>> >>
>> >> +	sector = bh.b_blocknr << (blkbits - 9);
>> >> +
>> >>  	if (buffer_unwritten(&bh) || buffer_new(&bh)) {
>> >>  		int i;
>> >> +
>> >> +		length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
>> >> +						bh.b_size);
>> >> +		if (length < 0) {
>> >> +			result = VM_FAULT_SIGBUS;
>> >> +			goto out;
>> >> +		}
>> >> +		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
>> >> +			goto fallback;
>> >> +
>> >>  		for (i = 0; i < PTRS_PER_PMD; i++)
>> >>  			clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
>> >>  		wmb_pmem();
>> >
>> > hm, that's a lot of copy-n-paste. Do we really need to run
>> > bdev_direct_access() twice? Will `kaddr' and `pfn' change?
>> >
>>
>> They shouldn't change, but I'm working on a fix for handling the race
>> of unbinding the pmem device while that kaddr is in use (unbind
>> invalidates kaddr).
>
> Exactly what does "unbinding the pmem device" mean,

echo namespace0.0 > /sys/bus/nd/drivers/nd_pmem/unbind

> and why can
> (parts of) the pmem device "go away" when there are active
> references to it?

Normally we have outstanding i/o requests to hold off
blk_cleanup_queue(), but in the dax case we don't have any mechanism
(yet) to flag the queue as busy. I have some patches to add a
percpu_refcount for this purpose.

>
>> The proposal is a dax_map_bh()/dax_unmap_bh()
>> interface to temporarily pin the mapping around each usage.
>
> Which mapping? The bufferhead maps file offset to filesystem block
> addresses, so I'm not sure what problem you are actually refering
> to here...

The kaddr is coming from the devm_memremap() in the pmem driver that
gets unmapped after the device is released by the driver.
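
Note on the pinning idea: the "pin the mapping around each usage"
proposal can be illustrated with a small user-space model. This is only
a sketch under stated assumptions: it is single-threaded, it uses a
plain counter where the kernel patches would use a percpu refcount, and
every name in it (struct pmem_map, map_pin(), map_unpin(), map_unbind())
is hypothetical rather than the proposed dax_map_bh()/dax_unmap_bh()
interface itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's mapping of the pmem range. */
struct pmem_map {
	void *kaddr;	/* models the address the driver remapped */
	long pins;	/* users currently inside a pin/unpin bracket */
	bool dying;	/* set once unbind has been requested */
};

/* Models the driver unmapping the range; kaddr is invalid afterwards. */
static void map_teardown(struct pmem_map *m)
{
	m->kaddr = NULL;
}

/* Pin the mapping for one use; returns NULL if unbind already started. */
static void *map_pin(struct pmem_map *m)
{
	if (m->dying)
		return NULL;
	m->pins++;
	return m->kaddr;
}

/* Drop one pin; the last unpin after unbind performs the teardown. */
static void map_unpin(struct pmem_map *m)
{
	assert(m->pins > 0);
	if (--m->pins == 0 && m->dying)
		map_teardown(m);
}

/* Request unbind: refuse new pins, tear down once nobody holds one. */
static void map_unbind(struct pmem_map *m)
{
	m->dying = true;
	if (m->pins == 0)
		map_teardown(m);
}
```

In this model a fault handler would bracket the zeroing loop with
map_pin()/map_unpin(), so an unbind that arrives in the middle cannot
invalidate kaddr until the loop has finished.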
From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dave Chinner Subject: Re: [PATCH v2] dax: fix NULL pointer in __dax_pmd_fault() Date: Wed, 23 Sep 2015 19:04:59 +1000 Message-ID: <20150923090459.GO19114@dastard> References: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com> <20150922141333.c28e3c5d800267937ca7b29a@linux-foundation.org> <20150922233016.GH3902@dastard> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Andrew Morton , Ross Zwisler , "linux-kernel@vger.kernel.org" , Alexander Viro , Matthew Wilcox , linux-fsdevel , "Kirill A. Shutemov" , linux-nvdimm To: Dan Williams Return-path: Content-Disposition: inline In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org On Tue, Sep 22, 2015 at 08:00:29PM -0700, Dan Williams wrote: > On Tue, Sep 22, 2015 at 4:30 PM, Dave Chinner wrote: > > On Tue, Sep 22, 2015 at 02:25:19PM -0700, Dan Williams wrote: > >> On Tue, Sep 22, 2015 at 2:13 PM, Andrew Morton > >> wrote: > >> > On Tue, 22 Sep 2015 13:36:22 -0600 Ross Zwisler wrote: > >> > > >> >> The following commit: > >> >> > >> >> commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for > >> >> DAX") > >> >> > >> >> moved some code in __dax_pmd_fault() that was responsible for zeroing > >> >> newly allocated PMD pages. The new location didn't properly set up > >> >> 'kaddr', though, so when run this code resulted in a NULL pointer BUG. > >> >> > >> >> Fix this by getting the correct 'kaddr' via bdev_direct_access(). > >> > > >> > Why the heck didn't gcc warn? 
> >> > > >> > I had a fiddle: > >> > > >> > --- a/fs/dax.c~a > >> > +++ a/fs/dax.c > >> > @@ -529,15 +529,18 @@ int __dax_pmd_fault(struct vm_area_struc > >> > unsigned long pmd_addr = address & PMD_MASK; > >> > bool write = flags & FAULT_FLAG_WRITE; > >> > long length; > >> > - void __pmem *kaddr; > >> > + void *kaddr; > >> > pgoff_t size, pgoff; > >> > sector_t block, sector; > >> > unsigned long pfn; > >> > int result = 0; > >> > > >> > +// printk("%p\n", kaddr); > >> > + > >> > /* Fall back to PTEs if we're going to COW */ > >> > if (write && !(vma->vm_flags & VM_SHARED)) > >> > return VM_FAULT_FALLBACK; > >> > + printk("%p\n", kaddr); > >> > /* If the PMD would extend outside the VMA */ > >> > if (pmd_addr < vma->vm_start) > >> > return VM_FAULT_FALLBACK; > >> > > >> > gcc warns about the first printk, but not about the second. So that > >> > "if (...) return ..." seems to have defeated gcc uninitialized-var > >> > detection. wtf? > >> > > >> >> --- a/fs/dax.c > >> >> +++ b/fs/dax.c > >> >> @@ -569,8 +569,20 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, > >> >> if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE) > >> >> goto fallback; > >> >> > >> >> + sector = bh.b_blocknr << (blkbits - 9); > >> >> + > >> >> if (buffer_unwritten(&bh) || buffer_new(&bh)) { > >> >> int i; > >> >> + > >> >> + length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn, > >> >> + bh.b_size); > >> >> + if (length < 0) { > >> >> + result = VM_FAULT_SIGBUS; > >> >> + goto out; > >> >> + } > >> >> + if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR)) > >> >> + goto fallback; > >> >> + > >> >> for (i = 0; i < PTRS_PER_PMD; i++) > >> >> clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE); > >> >> wmb_pmem(); > >> > > >> > hm, that's a lot of copy-n-paste. Do we really need to run > >> > bdev_direct_access() twice? Will `kaddr' and `pfn' change? 
> >> >
> >>
> >> They shouldn't change, but I'm working on a fix for handling the race
> >> of unbinding the pmem device while that kaddr is in use (unbind
> >> invalidates kaddr).
> >
> > Exactly what does "unbinding the pmem device" mean,
>
> echo namespace0.0 > /sys/bus/nd/drivers/nd_pmem/unbind

That tells me "how", not "what"..... :/

> > and why can
> > (parts of) the pmem device "go away" when there are active
> > references to it?
>
> Normally we have outstanding i/o requests to hold off
> blk_cleanup_queue(), but in the dax case we don't have any mechanism
> (yet) to flag the queue as busy. I have some patches to add a
> percpu_refcount for this purpose.

So this comes back to the fact that we allow a block device to be torn
down and freed while a filesystem has active references to it?

> >> The proposal is a dax_map_bh()/dax_unmap_bh()
> >> interface to temporarily pin the mapping around each usage.
> >
> > Which mapping? The bufferhead maps file offset to filesystem block
> > addresses, so I'm not sure what problem you are actually referring
> > to here...
>
> The kaddr is coming from the devm_memremap() in the pmem driver that
> gets unmapped after the device is released by the driver.

Perhaps the better solution is to not tear down the block device
until all active references have gone away? i.e. unbind puts the
device into a persistent error state and forces all active mappings
to refault. Hence all future accesses error out and then when the
user unmounts the unhappy filesystem the last reference to the
blockdev goes away and the mappings can be torn down safely...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Boaz Harrosh
Subject: Re: [PATCH v2] dax: fix NULL pointer in __dax_pmd_fault()
Date: Thu, 24 Sep 2015 11:50:00 +0300
Message-ID: <5603B938.1@plexistor.com>
References: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
	<20150922141333.c28e3c5d800267937ca7b29a@linux-foundation.org>
	<20150922233016.GH3902@dastard>
	<20150923090459.GO19114@dastard>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Cc: linux-nvdimm, "linux-kernel@vger.kernel.org", Alexander Viro,
	linux-fsdevel, Andrew Morton, "Kirill A. Shutemov"
To: Dave Chinner, Dan Williams
Return-path:
In-Reply-To: <20150923090459.GO19114@dastard>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On 09/23/2015 12:04 PM, Dave Chinner wrote:
> On Tue, Sep 22, 2015 at 08:00:29PM -0700, Dan Williams wrote:
<>
>> The kaddr is coming from the devm_memremap() in the pmem driver that
>> gets unmapped after the device is released by the driver.
>
> Perhaps the better solution is to not tear down the block device
> until all active references have gone away? i.e. unbind puts the
> device into a persistent error state and forces all active mappings
> to refault. Hence all future accesses error out and then when the
> user unmounts the unhappy filesystem the last reference to the
> blockdev goes away and the mappings can be torn down safely...
>

Me too

> Cheers,
>
> Dave.
>

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dan Williams
Subject: Re: [PATCH v2] dax: fix NULL pointer in __dax_pmd_fault()
Date: Thu, 24 Sep 2015 09:06:55 -0700
Message-ID:
References: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
	<20150922141333.c28e3c5d800267937ca7b29a@linux-foundation.org>
	<20150922233016.GH3902@dastard>
	<20150923090459.GO19114@dastard>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Cc: Andrew Morton, Ross Zwisler, "linux-kernel@vger.kernel.org",
	Alexander Viro, Matthew Wilcox, linux-fsdevel,
	"Kirill A. Shutemov", linux-nvdimm
To: Dave Chinner
Return-path:
In-Reply-To: <20150923090459.GO19114@dastard>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Wed, Sep 23, 2015 at 2:04 AM, Dave Chinner wrote:
> On Tue, Sep 22, 2015 at 08:00:29PM -0700, Dan Williams wrote:
>> The kaddr is coming from the devm_memremap() in the pmem driver that
>> gets unmapped after the device is released by the driver.
>
> Perhaps the better solution is to not tear down the block device
> until all active references have gone away? i.e. unbind puts the
> device into a persistent error state and forces all active mappings
> to refault. Hence all future accesses error out and then when the
> user unmounts the unhappy filesystem the last reference to the
> blockdev goes away and the mappings can be torn down safely...

In fact this is how it already works in the block layer, it's just
that the pmem driver was not participating in that mechanism. The
filesystem prevents the gendisk and hosting driver module from going
away via the heavyweight get_disk(). The gendisk keeps the
request_queue from being de-allocated, but the queue can go "dead" to
new requests at any time. Single-queue based drivers take the
queue_lock and check blk_queue_dying() before allowing new requests.
Multi-queue drivers take a lighter-weight approach and try to get a
new "live" reference from a percpu_refcount.
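The "live reference" idea described above can be illustrated outside the kernel. What follows is only a minimal userspace sketch using C11 atomics, not the kernel's percpu_refcount and not the pmem patch; the names struct live_ref, live_ref_tryget(), live_ref_put(), live_ref_kill() and live_ref_demo() are hypothetical and exist purely for illustration:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a queue's "live" reference count. */
struct live_ref {
	atomic_long count;	/* users, plus one base reference held by the owner */
	atomic_bool dying;	/* set once the device is being unbound */
};

static void live_ref_init(struct live_ref *ref)
{
	atomic_init(&ref->count, 1);
	atomic_init(&ref->dying, false);
}

/* Take a reference only while the queue is live and not yet released. */
static bool live_ref_tryget(struct live_ref *ref)
{
	long old;

	if (atomic_load(&ref->dying))
		return false;

	old = atomic_load(&ref->count);
	while (old > 0) {
		/* On failure, 'old' is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&ref->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if it was the last one. */
static bool live_ref_put(struct live_ref *ref)
{
	return atomic_fetch_sub(&ref->count, 1) == 1;
}

/*
 * Mark the queue dead and drop the owner's base reference.
 * Returns true if no users remain, i.e. teardown may proceed at once.
 */
static bool live_ref_kill(struct live_ref *ref)
{
	atomic_store(&ref->dying, true);
	return live_ref_put(ref);
}

/* Walk through an unbind while one user still holds the mapping. */
static void live_ref_demo(void)
{
	struct live_ref ref;

	live_ref_init(&ref);
	assert(live_ref_tryget(&ref));	/* a fault pins the mapping */
	assert(!live_ref_kill(&ref));	/* unbind: one user still active */
	assert(!live_ref_tryget(&ref));	/* new users are refused */
	assert(live_ref_put(&ref));	/* last user leaves; safe to unmap */
}
```

In this sketch the mapping would only be torn down once a put (or the kill itself) reports that the count has reached zero, so a kaddr obtained under a successful tryget stays valid until the matching put.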
When the backing device is unplugged or otherwise unbound from its
driver it calls blk_cleanup_queue() in its shutdown path. That marks
the queue dead and flushes any outstanding requests. From that point
forward all requests end in error until the final put_disk().

This is what I came up with for pmem:

https://lists.01.org/pipermail/linux-nvdimm/2015-September/002206.html

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S934373AbbIVTgf (ORCPT ); Tue, 22 Sep 2015 15:36:35 -0400
Received: from mga11.intel.com ([192.55.52.93]:55205 "EHLO mga11.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1758363AbbIVTge (ORCPT ); Tue, 22 Sep 2015 15:36:34 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.17,574,1437462000"; d="scan'208";a="810694122"
From: Ross Zwisler
To: linux-kernel@vger.kernel.org
Cc: Ross Zwisler, Alexander Viro, Matthew Wilcox,
	linux-fsdevel@vger.kernel.org, "Kirill A. Shutemov",
	linux-nvdimm@ml01.01.org, Dan Williams, Dave Chinner
Subject: [PATCH v2] dax: fix NULL pointer in __dax_pmd_fault()
Date: Tue, 22 Sep 2015 13:36:22 -0600
Message-Id: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
X-Mailer: git-send-email 2.1.0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

The following commit:

commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for
DAX")

moved some code in __dax_pmd_fault() that was responsible for zeroing
newly allocated PMD pages. The new location didn't properly set up
'kaddr', though, so when run this code resulted in a NULL pointer BUG.

Fix this by getting the correct 'kaddr' via bdev_direct_access().

Signed-off-by: Ross Zwisler
Reported-by: Dan Williams
---
 fs/dax.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/dax.c b/fs/dax.c
index 7ae6df7..bcfb14b 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -569,8 +569,20 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
 		goto fallback;
 
+	sector = bh.b_blocknr << (blkbits - 9);
+
 	if (buffer_unwritten(&bh) || buffer_new(&bh)) {
 		int i;
+
+		length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
+						bh.b_size);
+		if (length < 0) {
+			result = VM_FAULT_SIGBUS;
+			goto out;
+		}
+		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
+			goto fallback;
+
 		for (i = 0; i < PTRS_PER_PMD; i++)
 			clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
 		wmb_pmem();
@@ -623,7 +635,6 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 			result = VM_FAULT_NOPAGE;
 			spin_unlock(ptl);
 		} else {
-			sector = bh.b_blocknr << (blkbits - 9);
 			length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
 						bh.b_size);
 			if (length < 0) {
-- 
2.1.0

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S934773AbbIVUvJ (ORCPT ); Tue, 22 Sep 2015 16:51:09 -0400
Received: from mail-wi0-f182.google.com ([209.85.212.182]:34977 "EHLO
	mail-wi0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S934747AbbIVUvH (ORCPT ); Tue, 22 Sep 2015 16:51:07 -0400
MIME-Version: 1.0
In-Reply-To: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
References: <1442950582-10140-1-git-send-email-ross.zwisler@linux.intel.com>
Date: Tue, 22 Sep 2015 13:51:04 -0700
Message-ID:
Subject: Re: [PATCH v2] dax: fix NULL pointer in __dax_pmd_fault()
From: Dan Williams
To: Ross Zwisler, Andrew Morton
Cc: "linux-kernel@vger.kernel.org", Alexander Viro, Matthew Wilcox,
	linux-fsdevel, "Kirill A.
	Shutemov", "linux-nvdimm@lists.01.org", Dave Chinner, Linux MM
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

[ adding Andrew ]

On Tue, Sep 22, 2015 at 12:36 PM, Ross Zwisler wrote:
> The following commit:
>
> commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for
> DAX")
>
> moved some code in __dax_pmd_fault() that was responsible for zeroing
> newly allocated PMD pages. The new location didn't properly set up
> 'kaddr', though, so when run this code resulted in a NULL pointer BUG.
>
> Fix this by getting the correct 'kaddr' via bdev_direct_access().
>
> Signed-off-by: Ross Zwisler
> Reported-by: Dan Williams

Taking into account the comment below,

Reviewed-by: Dan Williams

> ---
>  fs/dax.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 7ae6df7..bcfb14b 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -569,8 +569,20 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
>  	if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
>  		goto fallback;
>
> +	sector = bh.b_blocknr << (blkbits - 9);
> +
>  	if (buffer_unwritten(&bh) || buffer_new(&bh)) {
>  		int i;
> +
> +		length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
> +						bh.b_size);
> +		if (length < 0) {
> +			result = VM_FAULT_SIGBUS;
> +			goto out;
> +		}
> +		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
> +			goto fallback;
> +

Hmm, we don't need the PG_PMD_COLOUR check since we aren't using the
pfn in this path, right?
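The sector computation and the PG_PMD_COLOUR alignment test from the patch can be tried out in userspace. This is a minimal sketch, assuming x86-64 values (4 KiB pages, 2 MiB PMDs) and assuming PG_PMD_COLOUR is defined as (PMD_SIZE >> PAGE_SHIFT) - 1; the helpers block_to_sector() and pmd_mappable() are hypothetical names and are not part of fs/dax.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for x86-64 with 4 KiB pages; other configurations differ. */
#define PAGE_SHIFT	12
#define PMD_SHIFT	21
#define PMD_SIZE	(1UL << PMD_SHIFT)		/* 2 MiB */
#define PG_PMD_COLOUR	((PMD_SIZE >> PAGE_SHIFT) - 1)	/* low 9 pfn bits */

/* Convert a filesystem block number to a 512-byte sector, as the patch does. */
static uint64_t block_to_sector(uint64_t blocknr, unsigned int blkbits)
{
	return blocknr << (blkbits - 9);
}

/*
 * Hypothetical helper mirroring the check added by the patch: the extent
 * must cover a whole PMD and its first pfn must be PMD-aligned.
 */
static bool pmd_mappable(long length, unsigned long pfn)
{
	if (length < (long)PMD_SIZE)
		return false;
	return (pfn & PG_PMD_COLOUR) == 0;
}
```

With 4 KiB filesystem blocks (blkbits = 12), block 1 starts at sector 8. A pfn of 512 passes the colour check and a pfn of 513 does not, so testing the alignment before the zeroing loop avoids clearing 2 MiB of media only to fall back to PTE faults afterwards.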