All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Matthew Wilcox <willy@linux.intel.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH 0/8] Support multi-order entries in the radix tree
Date: Wed, 24 Feb 2016 13:24:06 -0700	[thread overview]
Message-ID: <20160224202406.GA13473@linux.intel.com> (raw)
In-Reply-To: <1453213533-6040-1-git-send-email-matthew.r.wilcox@intel.com>

On Tue, Jan 19, 2016 at 09:25:25AM -0500, Matthew Wilcox wrote:
> From: Matthew Wilcox <willy@linux.intel.com>
> 
> In order to support huge pages in the page cache, Kirill has proposed
> simply creating 512 entries.  I think this runs into problems with
> fsync() tracking dirty bits in the radix tree.  Ross inserts a special
> entry to represent the PMD at the index for the start of the PMD, but
> this requires probing the tree twice; once for the PTE and once for the PMD.
> When we add PUD entries, that will become three times.
> 
> The approach in this patch set is to modify the radix tree to support
> multi-order entries.  Pointers to internal radix tree nodes mostly do not
> have the 'indirect' bit set.  I change that so they always have that bit
> set; then any pointer without the indirect bit set is a multi-order entry.
> 
> If the order of the entry is a multiple of the fanout of the tree,
> then all is well.  If not, it is necessary to insert alias nodes into
> the tree that point to the canonical entry.  At this point, I have not
> added support for entries which are smaller than the last-level fanout of
> the tree (and I put a BUG_ON in to prevent that usage).  Adding support
> would be a simple matter of one last pointer-chase when we get to the
> bottom of the tree, but I am not aware of any reason to add support for
> smaller multi-order entries at this point, so I haven't.
> 
> Note that no actual users are modified at this point.  I think it'd be
> mostly a matter of deleting code from the DAX fsync support at this point,
> but with that code in flux, I'm a little reluctant to add more churn
> to it.  I'm also not entriely sure where Kirill is on the page-cache
> modifications; he seems to have his hands full fixing up the MM right now.
> 
> Before diving into the important modifications, I add Andrew Morton's
> radix tree test harness to the tree in patches 1 & 2.  It was absolutely
> invaluable in catching some of my bugs.  Patches 3 & 4 are minor tweaks.
> Patches 5-7 are the interesting ones.  Patch 8 we might want to leave
> out entirely or shift over to the test harness.  I found it useful during
> debugging and others might too.
> 
> Matthew Wilcox (8):
>   radix-tree: Add an explicit include of bitops.h
>   radix tree test harness
>   radix-tree: Cleanups
>   radix_tree: Convert some variables to unsigned types
>   radix_tree: Tag all internal tree nodes as indirect pointers
>   radix_tree: Loop based on shift count, not height
>   radix_tree: Add support for multi-order entries
>   radix_tree: Add radix_tree_dump

I like the idea of this approach - I'll work on integrating it into DAX *sync.

One quick note - some of the patches are prefixed with "radix-tree" and others
with "radix_tree".

Also, if we go through the trouble of including the radix tree test harness,
should we include a new test at the end of the series that tests out
multi-order radix tree entries?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Matthew Wilcox <willy@linux.intel.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH 0/8] Support multi-order entries in the radix tree
Date: Wed, 24 Feb 2016 13:24:06 -0700	[thread overview]
Message-ID: <20160224202406.GA13473@linux.intel.com> (raw)
In-Reply-To: <1453213533-6040-1-git-send-email-matthew.r.wilcox@intel.com>

On Tue, Jan 19, 2016 at 09:25:25AM -0500, Matthew Wilcox wrote:
> From: Matthew Wilcox <willy@linux.intel.com>
> 
> In order to support huge pages in the page cache, Kirill has proposed
> simply creating 512 entries.  I think this runs into problems with
> fsync() tracking dirty bits in the radix tree.  Ross inserts a special
> entry to represent the PMD at the index for the start of the PMD, but
> this requires probing the tree twice; once for the PTE and once for the PMD.
> When we add PUD entries, that will become three times.
> 
> The approach in this patch set is to modify the radix tree to support
> multi-order entries.  Pointers to internal radix tree nodes mostly do not
> have the 'indirect' bit set.  I change that so they always have that bit
> set; then any pointer without the indirect bit set is a multi-order entry.
> 
> If the order of the entry is a multiple of the fanout of the tree,
> then all is well.  If not, it is necessary to insert alias nodes into
> the tree that point to the canonical entry.  At this point, I have not
> added support for entries which are smaller than the last-level fanout of
> the tree (and I put a BUG_ON in to prevent that usage).  Adding support
> would be a simple matter of one last pointer-chase when we get to the
> bottom of the tree, but I am not aware of any reason to add support for
> smaller multi-order entries at this point, so I haven't.
> 
> Note that no actual users are modified at this point.  I think it'd be
> mostly a matter of deleting code from the DAX fsync support at this point,
> but with that code in flux, I'm a little reluctant to add more churn
> to it.  I'm also not entriely sure where Kirill is on the page-cache
> modifications; he seems to have his hands full fixing up the MM right now.
> 
> Before diving into the important modifications, I add Andrew Morton's
> radix tree test harness to the tree in patches 1 & 2.  It was absolutely
> invaluable in catching some of my bugs.  Patches 3 & 4 are minor tweaks.
> Patches 5-7 are the interesting ones.  Patch 8 we might want to leave
> out entirely or shift over to the test harness.  I found it useful during
> debugging and others might too.
> 
> Matthew Wilcox (8):
>   radix-tree: Add an explicit include of bitops.h
>   radix tree test harness
>   radix-tree: Cleanups
>   radix_tree: Convert some variables to unsigned types
>   radix_tree: Tag all internal tree nodes as indirect pointers
>   radix_tree: Loop based on shift count, not height
>   radix_tree: Add support for multi-order entries
>   radix_tree: Add radix_tree_dump

I like the idea of this approach - I'll work on integrating it into DAX *sync.

One quick note - some of the patches are prefixed with "radix-tree" and others
with "radix_tree".

Also, if we go through the trouble of including the radix tree test harness,
should we include a new test at the end of the series that tests out
multi-order radix tree entries?

  parent reply	other threads:[~2016-02-24 20:24 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-19 14:25 [PATCH 0/8] Support multi-order entries in the radix tree Matthew Wilcox
2016-01-19 14:25 ` Matthew Wilcox
2016-01-19 14:25 ` [PATCH 1/8] radix-tree: Add an explicit include of bitops.h Matthew Wilcox
2016-01-19 14:25   ` Matthew Wilcox
2016-01-19 14:25 ` [PATCH 2/8] radix tree test harness Matthew Wilcox
2016-01-19 14:25   ` Matthew Wilcox
2016-01-26 23:44   ` Andrew Morton
2016-01-26 23:44     ` Andrew Morton
2016-01-27  3:20     ` Matthew Wilcox
2016-01-27  3:20       ` Matthew Wilcox
2016-01-19 14:25 ` [PATCH 3/8] radix-tree: Cleanups Matthew Wilcox
2016-01-19 14:25   ` Matthew Wilcox
2016-01-19 14:25 ` [PATCH 4/8] radix_tree: Convert some variables to unsigned types Matthew Wilcox
2016-01-19 14:25   ` Matthew Wilcox
2016-01-19 14:25 ` [PATCH 5/8] radix_tree: Tag all internal tree nodes as indirect pointers Matthew Wilcox
2016-01-19 14:25   ` Matthew Wilcox
2016-01-19 14:25 ` [PATCH 6/8] radix_tree: Loop based on shift count, not height Matthew Wilcox
2016-01-19 14:25   ` Matthew Wilcox
2016-01-19 14:25 ` [PATCH 7/8] radix_tree: Add support for multi-order entries Matthew Wilcox
2016-01-19 14:25   ` Matthew Wilcox
2016-01-19 14:25 ` [PATCH 8/8] radix_tree: Add radix_tree_dump Matthew Wilcox
2016-01-19 14:25   ` Matthew Wilcox
2016-01-22  0:28 ` [PATCH 0/8] Support multi-order entries in the radix tree Andrew Morton
2016-01-22  0:28   ` Andrew Morton
2016-02-24 20:24 ` Ross Zwisler [this message]
2016-02-24 20:24   ` Ross Zwisler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160224202406.GA13473@linux.intel.com \
    --to=ross.zwisler@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthew.r.wilcox@intel.com \
    --cc=willy@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.