Re: XFS_IOC_FSEMAP requirements


From: Christoph Hellwig <hch@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	linux-xfs@vger.kernel.org
Subject: Re: XFS_IOC_FSEMAP requirements
Date: Thu, 22 Dec 2016 12:24:13 -0800
Message-ID: <20161222202413.GA4951@infradead.org>
In-Reply-To: <20161222200729.GT4326@dastard>

On Fri, Dec 23, 2016 at 07:07:29AM +1100, Dave Chinner wrote:
> An extent per rbtree node is almost certainly not the right choice
> because of the object count requirement - we do not want to do a
> kmalloc for every extent we add to the list.

People are doing a kmalloc for each packet / I/O at millions of
I/Os per second, so I'm not that worried about that.  Based on my
first preliminary numbers it's certainly more efficient than the
crazy amount of memmoves we're currently doing.

That being said, I'm still looking for something even better.
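
To make the memmove cost concrete, here is a minimal sketch of inserting
into a sorted flat array of extent records.  The struct and function names
are hypothetical and the record layout is simplified; this is not the
actual XFS in-core extent code, just an illustration of why every insert
is O(n) in the number of records behind the insertion point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified extent record; not the real XFS format. */
struct ext_rec {
	uint64_t startoff;	/* file offset of the extent, in blocks */
	uint64_t blockcount;	/* length of the extent, in blocks */
};

/*
 * Insert *new into the sorted array recs[0..*nrecs), which must have room
 * for one more record.  Returns the number of records that had to be
 * shifted with memmove() to open the slot.
 */
static size_t ext_array_insert(struct ext_rec *recs, size_t *nrecs,
			       const struct ext_rec *new)
{
	size_t lo = 0, hi, moved;

	assert(recs && nrecs && new);
	hi = *nrecs;

	/* Binary search for the first record starting at or after *new. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (recs[mid].startoff < new->startoff)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* Every record behind the slot moves up by one. */
	moved = *nrecs - lo;
	memmove(&recs[lo + 1], &recs[lo], moved * sizeof(*recs));
	recs[lo] = *new;
	(*nrecs)++;
	return moved;
}
```

With a per-extent tree node the insert instead pays for one allocation
plus rebalancing, and no existing records are copied.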

> That was before I found out how easy it is to use the rhashtable
> code and how much faster it is for large lists than an rbtree.
> That's the way I've been thinking recently, anyway...

Hashes generally aren't very good for sequential iteration, of which
we do a lot for the extent tree.  That being said, it was on my todo
list to simply give it a try after I saw the buffer cache patch.
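
A small sketch of the iteration problem, assuming a simple multiplicative
hash (the function name and the hash are illustrative only; rhashtable
uses its own hash function): adjacent extent offsets land in unrelated
buckets, so walking the table bucket by bucket does not visit extents in
file order.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical bucket function: multiplicative hash of a 32-bit key,
 * keeping the top 'bits' bits as the bucket index (1 <= bits <= 31).
 */
static unsigned int bucket_of(uint32_t key, unsigned int bits)
{
	assert(bits >= 1 && bits <= 31);
	return (uint32_t)(key * 2654435761u) >> (32 - bits);
}
```

With 8 buckets (bits = 3), offsets 0, 1, 2, 3 map to buckets 0, 4, 1, 6,
so a bucket-order walk sees them as 0, 2, 1, 3.  An ordered structure
returns them in offset order without an extra sort.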

> My plan for the blocksize < page size was simply to track dirtiness
> on pages and forget about sub-page dirtiness. That way the
> IO path simply iterates entire pages to cover all the
> mapped regions of the page. iomap already does that for us, and I
> started on making writepage work that way, too. Haven't got to
> working writepage code yet, though.

There are four things that buffer_heads are used for in the blocksize <
pagesize case.

 - dirtiness - could be handled as mentioned by you
 - uptodateness - we could always read in the whole page and things
	would just work.  But on 64k page size this actually seems
	to be a performance issue, otherwise we wouldn't have the
	is_partially_uptodate address_space operation
 - tracking the block number for pure overwrites.  Probably not
	really needed
 - tracking of I/O completions - we must write out the whole page
	on a writepage call, and something must track when all I/Os
	for the page have finished so that we can unlock it (or
	drop the writeback bit for the write case).

Nothing unsolvable, but at least the last one is a little nasty,
and doing the dumb things for the first two might cause performance
regressions.
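
For illustration, the uptodate and I/O completion items could in principle
be covered by a small per-page structure like the sketch below.  All names
are hypothetical, it assumes a 64k page with 4k blocks (16 blocks per
page), and it is a userspace approximation using C11 atomics rather than
kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define BLOCKS_PER_PAGE	16	/* assumption: 64k page, 4k blocks */

/* Hypothetical per-page state replacing the buffer_head chain. */
struct subpage_state {
	uint16_t uptodate;	/* one bit per block; needs a lock in real code */
	atomic_uint io_pending;	/* I/Os submitted but not yet completed */
};

/* Mark blocks [first, first + count) as uptodate. */
static void subpage_set_uptodate(struct subpage_state *s,
				 unsigned int first, unsigned int count)
{
	assert(first + count <= BLOCKS_PER_PAGE);
	for (unsigned int i = first; i < first + count; i++)
		s->uptodate |= (uint16_t)(1u << i);
}

/* True if every block in [first, first + count) is uptodate. */
static bool subpage_range_uptodate(const struct subpage_state *s,
				   unsigned int first, unsigned int count)
{
	uint32_t mask;

	assert(first + count <= BLOCKS_PER_PAGE);
	mask = ((1u << count) - 1) << first;
	return (s->uptodate & mask) == mask;
}

/* Account for one more I/O submitted against the page. */
static void subpage_io_start(struct subpage_state *s)
{
	atomic_fetch_add(&s->io_pending, 1);
}

/*
 * Account for one completed I/O.  Returns true only for the completion
 * that drops the count to zero; that caller would unlock the page or
 * end writeback on it.
 */
static bool subpage_io_end(struct subpage_state *s)
{
	return atomic_fetch_sub(&s->io_pending, 1) == 1;
}
```

The nasty part is that completions can arrive while the page is still
being submitted, so the count could reach zero too early; a real
implementation would probably hold an extra reference during submission
and drop it once all I/Os for the page have been issued.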

Thread overview: 6+ messages
2016-12-20 10:29 XFS_IOC_FSEMAP requirements Carlos Maiolino
2016-12-21  1:48 ` Darrick J. Wong
2016-12-22  8:57   ` Christoph Hellwig
2016-12-22 20:07     ` Dave Chinner
2016-12-22 20:24       ` Christoph Hellwig [this message]
2016-12-22  9:28   ` Carlos Maiolino