linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Matthew Wilcox <willy@linux.intel.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Matthew Wilcox <matthew.r.wilcox@intel.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH v1 5/7] dax: Add huge page fault support
Date: Mon, 13 Oct 2014 12:13:15 +1100	[thread overview]
Message-ID: <20141013011315.GB4503@dastard> (raw)
In-Reply-To: <20141009204716.GQ5098@wil.cx>

On Thu, Oct 09, 2014 at 04:47:16PM -0400, Matthew Wilcox wrote:
> On Wed, Oct 08, 2014 at 11:11:00PM +0300, Kirill A. Shutemov wrote:
> > On Wed, Oct 08, 2014 at 09:25:27AM -0400, Matthew Wilcox wrote:
> > > +	pgoff = ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
> > > +	size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT;
> > > +	if (pgoff >= size)
> > > +		return VM_FAULT_SIGBUS;
> > > +	/* If the PMD would cover blocks out of the file */
> > > +	if ((pgoff | PG_PMD_COLOUR) >= size)
> > > +		return VM_FAULT_FALLBACK;
> > 
> > IIUC, zero pading would work too.
> 
> The blocks after this file might be allocated to another file already.
> I suppose we could ask the filesystem if it wants to allocate them to
> this file.
> 
> Dave, Jan, is it acceptable to call get_block() for blocks that extend
> beyond the current i_size?

In what context? XFS basically does nothing for certain cases (e.g.
read mapping for direct IO) where zeroes are always going to be
returned, so essentially filesystems right now may actually just
return a "hole" for any read mapping request beyond EOF.

If "create" is set, then we'll either create or map existing blocks
beyond EOF because the we have to reserve space or allocate blocks
before the EOF gets extended when the write succeeds fully...

> > > +	if (length < PMD_SIZE)
> > > +		goto fallback;
> > > +	if (pfn & PG_PMD_COLOUR)
> > > +		goto fallback;	/* not aligned */
> > 
> > So, are you rely on pure luck to make get_block() allocate 2M aligned pfn?
> > Not really productive. You would need assistance from fs and
> > arch_get_unmapped_area() sides.
> 
> Certainly ext4 and XFS will align their allocations; if you ask it for a
> 2MB block, it will try to allocate a 2MB block aligned on a 2MB boundary.

As a sweeping generalisation, that's wrong. Empty filesystems might
behave that way, but we don't *guarantee* that this sort of
alignment will occur.

XFS has several different extent alignment strategies and
none of them will always work that way. Many of them are dependent
on mkfs parameters, and even then are used only as *guidelines*.
Further, alignment is dependent on the size of the write being done
- on some filesystem configs a 2MB write might be aligned, but on
others it won't be. More complex still is that mount options can
change alignment behaviour, as can per-file extent size hints, as
can truncation that removes post-eof blocks...

IOWs, if you want the filesystem to guarantee alignment to the
underlying hardware in this way for DAX, we're going to need to make
some modifications to the allocator alignment strategy.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2014-10-13  1:13 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-08 13:25 [PATCH v1 0/7] Huge page support for DAX Matthew Wilcox
2014-10-08 13:25 ` [PATCH v1 1/7] thp: vma_adjust_trans_huge(): adjust file-backed VMA too Matthew Wilcox
2014-10-08 13:25 ` [PATCH v1 2/7] mm: Prepare for DAX huge pages Matthew Wilcox
2014-10-08 15:21   ` Kirill A. Shutemov
2014-10-08 15:57     ` Matthew Wilcox
2014-10-08 19:43       ` Kirill A. Shutemov
2014-10-09 20:40         ` Matthew Wilcox
2014-10-13 20:36           ` Kirill A. Shutemov
2014-10-08 13:25 ` [PATCH v1 3/7] mm: Add vm_insert_pfn_pmd() Matthew Wilcox
2014-10-08 13:25 ` [PATCH v1 4/7] mm: Add a pmd_fault handler Matthew Wilcox
2014-10-08 13:25 ` [PATCH v1 5/7] dax: Add huge page fault support Matthew Wilcox
2014-10-08 20:11   ` Kirill A. Shutemov
2014-10-09 20:47     ` Matthew Wilcox
2014-10-13  1:13       ` Dave Chinner [this message]
2014-10-08 13:25 ` [PATCH v1 6/7] ext2: Huge " Matthew Wilcox
2014-10-08 13:25 ` [PATCH v1 7/7] ext4: " Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141013011315.GB4503@dastard \
    --to=david@fromorbit.com \
    --cc=kirill@shutemov.name \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthew.r.wilcox@intel.com \
    --cc=willy@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).