linux-mm.kvack.org archive mirror
From: Andrew Morton <akpm@linux-foundation.org>
To: Hugh Dickins <hughd@google.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 1/12] radix_tree: exceptional entries and indices
Date: Tue, 12 Jul 2011 16:24:31 -0700
Message-ID: <20110712162431.75bfe77b.akpm@linux-foundation.org>
In-Reply-To: <alpine.LSU.2.00.1107121536100.2112@sister.anvils>

<tries to remember what this is all about>

On Tue, 12 Jul 2011 15:56:14 -0700 (PDT)
Hugh Dickins <hughd@google.com> wrote:

> On Sat, 18 Jun 2011, Andrew Morton wrote:
> > On Fri, 17 Jun 2011 17:13:38 -0700 (PDT) Hugh Dickins <hughd@google.com> wrote:
> > > On Fri, 17 Jun 2011, Andrew Morton wrote:
> > > > On Tue, 14 Jun 2011 03:42:27 -0700 (PDT)
> > > > Hugh Dickins <hughd@google.com> wrote:
> > > > 
> > > > > The low bit of a radix_tree entry is already used to denote an indirect
> > > > > pointer, for internal use, and the unlikely radix_tree_deref_retry() case.
> > > > > Define the next bit as denoting an exceptional entry, and supply inline
> > > > > functions radix_tree_exception() to return non-0 in either unlikely case,
> > > > > and radix_tree_exceptional_entry() to return non-0 in the second case.
> > > > 
> > > > Yes, the RADIX_TREE_INDIRECT_PTR hack is internal-use-only, and doesn't
> > > > operate on (and hence doesn't corrupt) client-provided items.
> > > > 
> > > > This patch uses bit 1 and uses it against client items, so for
> > > > practical purposes it can only be used when the client is storing
> > > > addresses.  And it needs new APIs to access that flag.
> > > > 
> > > > All a bit ugly.  Why not just add another tag for this?  Or reuse an
> > > > existing tag if the current tags aren't all used for these types of
> > > > pages?
> > > 
> > > I couldn't see how to use tags without losing the "lockless" lookups:
> > 
> > So lockless pagecache broke the radix-tree tag-versus-item coherency as
> > well as the address_space nrpages-vs-radix-tree coherency.
> 
> I don't think that remark is fair to lockless pagecache at all.  If we
> want the scalability advantage of lockless lookup, yes, we don't have
> strict coherency with tagging at that time.  But those places that need
> to worry about that coherency, can lock to do so.

Nobody thought about these issues, afaik.  Things have broken and the
code has become significantly more complex/fragile.

Does the locking in mapping_tagged() make any sense?

> > Isn't it fun learning these things.
> > 
> > > because the tag is a separate bit from the entry itself, unless you're
> > > under tree_lock, there would be races when changing from page pointer
> > > to swap entry or back, when slot was updated but tag not or vice versa.
> > 
> > So...  take tree_lock?
> 
> I wouldn't call that an improvement...

I wouldn't call the proposed changes to radix-tree.c an improvement,
either.  It's an expedient, once-off, single-caller hack.

If the cost of adding locking is negligible then that is a superior fix.

> > What effect does that have?
> 
> ... but admit I have not measured: I rather assume that if we now change
> tmpfs from lockless to locked lookup, someone else will soon come up with
> the regression numbers.
> 
> > It'd better be
> > "really bad", because this patchset does nothing at all to improve core
> > MM maintainability :(
> 
> I was aiming to improve shmem.c maintainability; and you have good grounds
> to accuse me of hurting shmem.c maintainability when I highmem-ized the
> swap vector nine years ago.
> 
> I was not aiming to improve core MM maintainability, nor to harm it.
> I am extending the use to which the radix-tree can be put, but is that
> so bad?

I find it hard to believe that this wart added to the side of the
radix-tree code will find any other users.  And the wart spreads
contagion into core filemap pagecache lookup.

It's pretty nasty stuff.  Please, what is a better way of doing all this?
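
For readers following the thread, the bit layout described in the quoted
patch text can be sketched in userspace C. This is only an illustration under
stated assumptions, not the code from the patch: every sketch_* and SKETCH_*
name is hypothetical, the shift of 2 is an assumption, and the sketch relies
on client items being pointers aligned to at least 4 bytes so that their two
low bits are zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and values: the patch's own constants are not quoted here. */
#define SKETCH_INDIRECT_BIT     ((uintptr_t)1) /* low bit: internal indirect pointer */
#define SKETCH_EXCEPTIONAL_BIT  ((uintptr_t)2) /* next bit: exceptional entry */
#define SKETCH_SHIFT            2              /* assumed: payload sits above both flag bits */

/* Non-zero in either unlikely case: indirect pointer or exceptional entry. */
static inline int sketch_exception(const void *entry)
{
	return ((uintptr_t)entry &
		(SKETCH_INDIRECT_BIT | SKETCH_EXCEPTIONAL_BIT)) != 0;
}

/* Non-zero only in the second case: the entry is exceptional. */
static inline int sketch_exceptional_entry(const void *entry)
{
	return ((uintptr_t)entry & SKETCH_EXCEPTIONAL_BIT) != 0;
}

/* Pack a non-pointer value (for example a swap-entry-like number) into a slot word. */
static inline void *sketch_encode(uintptr_t value)
{
	return (void *)((value << SKETCH_SHIFT) | SKETCH_EXCEPTIONAL_BIT);
}

/* Recover the packed value from an exceptional slot word. */
static inline uintptr_t sketch_decode(const void *entry)
{
	return (uintptr_t)entry >> SKETCH_SHIFT;
}
```

Because the flag lives in the same word as the value, a lookup that loads the
slot once sees either an ordinary pointer or an exceptional entry, never a
mixture. That is the property a separate tag bit cannot give without taking
tree_lock, and also why the scheme only works when clients store addresses.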

