From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 14/17] xfs: use bios directly to read and write the log recovery buffers
Date: Wed, 22 May 2019 16:19:19 +1000
Message-ID: <20190522061919.GJ29573@dread.disaster.area>
In-Reply-To: <20190522051214.GA19467@lst.de>

On Wed, May 22, 2019 at 07:12:14AM +0200, Christoph Hellwig wrote:
> On Wed, May 22, 2019 at 08:24:34AM +1000, Dave Chinner wrote:
> > Yeah, the log recovery code should probably be split in three - the
> > kernel specific IO code/API, the log parsing code (the bit that
> > finds head/tail and parses it into transactions for recovery) and
> > then the bit that actually does the recovery. The logprint code in
> > userspace uses the parsing code, so that's the bit we need to share
> > with userspace...
>
> Actually one thing I have on my TODO list is to move the log item type
> specific recovery code first into an ops vector, and then out to the
> xfs_*_item.c together with the code creating those items. That isn't
> really all of the recovery code, but it seems like a useful split.

Sounds like the right place to me - it's roughly where I had in mind
to split the code as it's not until logprint decodes the
transactions and needs to parse the individual log items that it
diverges from the kernel code. So just having a set of op vectors
that we can supply from userspace to implement logprint would make
it much simpler....

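
For illustration only, here is a minimal sketch of the "set of op vectors
supplied from userspace" idea. Every name in it (demo_item_ops,
demo_handle_item, the DEMO_LI_* codes) is hypothetical and is not taken
from the kernel or xfsprogs; it assumes the shared parsing code hands over
one decoded item at a time and that only the per-type handler differs
between the kernel and a logprint-style tool.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item type codes for illustration; not the real XFS_LI_* values. */
enum demo_item_type {
	DEMO_LI_INODE,
	DEMO_LI_BUF,
	DEMO_LI_MAX,
};

/* A decoded log item as handed over by the shared parsing code (assumed shape). */
struct demo_log_item {
	enum demo_item_type	type;
	const void		*data;
	size_t			len;
};

/* Per-item-type vector: the caller decides what "handling" an item means. */
struct demo_item_ops {
	int	(*handle)(const struct demo_log_item *item, void *ctx);
};

/* Shared dispatch: look up the vector for the item type and call it. */
static int
demo_handle_item(const struct demo_item_ops *const *table,
		 const struct demo_log_item *item, void *ctx)
{
	assert(table != NULL && item != NULL);
	if ((int)item->type < 0 || item->type >= DEMO_LI_MAX)
		return -1;		/* unknown item type */
	if (!table[item->type] || !table[item->type]->handle)
		return -1;		/* no handler supplied for this type */
	return table[item->type]->handle(item, ctx);
}

/* Userspace "logprint" flavour: only account for what was seen. */
struct demo_print_ctx {
	unsigned int	seen[DEMO_LI_MAX];
	size_t		bytes;
};

static int
demo_print_item(const struct demo_log_item *item, void *ctx)
{
	struct demo_print_ctx	*pc = ctx;

	pc->seen[item->type]++;
	pc->bytes += item->len;
	return 0;
}

static const struct demo_item_ops demo_print_ops = {
	.handle	= demo_print_item,
};

/* Table supplied by the userspace tool; DEMO_LI_BUF is deliberately left out. */
static const struct demo_item_ops *const demo_print_table[DEMO_LI_MAX] = {
	[DEMO_LI_INODE]	= &demo_print_ops,
};
```

In this sketch a kernel build would pass a different table whose handlers
actually replay the items, while the dispatch code stays common to both.
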
> Note that the I/O code isn't really very log specific, it basically
> just is trivial I/O to a vmalloc buffer code. In fact I wonder if
> I could just generalize it a little more and move it to the block layer.

Yeah, it's not complex, just different to userspace. Which is why
I thought just having a simple API between it and the kernel log
code would make it easy to port...

> > I've got a rough AIO implementation backing the xfs_buf.c code in
> > userspace already. It works just fine and is massively faster than
> > the existing code on SSDs, so I don't see a problem with porting IO
> > code that assumes an AIO model anymore. i.e. Re-using the kernel AIO
> > model for all the buffer code in userspace is one of the reasons I'm
> > porting xfs_buf.c to userspace.
>
> Given that we:
>
> a) do direct I/O everywhere
> b) tend to do it on either a block device, or a file where we don't
> need to allocate over holes
>
> aio should be a win everywhere.

So far it is, but I haven't tested on spinning disks so I can't say
for certain that it is a win there. The biggest difference for SSDs
is that we completely bypass the prefetching code and so the
buffer cache memory footprint goes way down. Hence we save huge
amounts of CPU by avoiding allocating, freeing and faulting in
memory so we essentially stop bashing on and being limited by
mmap_sem contention.

> The only caveat is that CONFIG_AIO
> is a kernel option and could be turned off in some low end configs.

Should be trivial to add a configure option to turn it off and
have the IO code just call pread/pwrite directly and run the
completions synchronously. That's kind of how I'm building up the
patchset, anyway - AIO doesn't come along until after the xfs_buf.c
infrastructure is in place doing sync IO. I'll make a note to add a
--disable-aio config option when I get there....

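
As a rough sketch of that synchronous fallback, and not code from the
patchset: the read path below calls pread() directly and then runs the
completion inline, so callers written against a completion-callback model
need no changes. struct demo_io, demo_submit_read() and demo_mark_done()
are hypothetical names, and treating a short read at EOF as EIO is an
assumption made for the example.

```c
#define _XOPEN_SOURCE 700	/* for pread() under strict compiler modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

struct demo_io;
typedef void (*demo_io_done_t)(struct demo_io *io);

/* One IO request; the same structure would be used by an AIO backend. */
struct demo_io {
	int		fd;
	void		*buf;
	size_t		len;
	off_t		offset;
	int		error;		/* 0 on success, else a positive errno */
	demo_io_done_t	done;		/* completion callback */
	void		*private;	/* owned by the submitter */
};

/*
 * Synchronous fallback: issue the whole read with pread(), then run the
 * completion before returning to the caller.
 */
static void
demo_submit_read(struct demo_io *io)
{
	char	*p = io->buf;
	size_t	left = io->len;
	off_t	off = io->offset;

	assert(io->done != NULL);
	io->error = 0;
	while (left > 0) {
		ssize_t	n = pread(io->fd, p, left, off);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			io->error = errno;
			break;
		}
		if (n == 0) {
			io->error = EIO;	/* hit EOF before len bytes */
			break;
		}
		p += n;
		left -= (size_t)n;
		off += n;
	}
	io->done(io);
}

/* Example completion: count completions through io->private. */
static void
demo_mark_done(struct demo_io *io)
{
	int	*count = io->private;

	(*count)++;
}
```

A build configured without AIO could compile only this path, while an AIO
build would queue the same struct demo_io and call ->done from its
completion handling instead.
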
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
Thread overview: 26+ messages
2019-05-20 16:13 use bios directly in the log code Christoph Hellwig
2019-05-20 16:13 ` [PATCH 01/17] xfs: remove the no-op spinlock_destroy stub Christoph Hellwig
2019-05-20 16:13 ` [PATCH 02/17] xfs: remove the never used _XBF_COMPOUND flag Christoph Hellwig
2019-05-20 16:13 ` [PATCH 03/17] xfs: renumber XBF_WRITE_FAIL Christoph Hellwig
2019-05-20 16:13 ` [PATCH 04/17] xfs: reformat xlog_get_lowest_lsn Christoph Hellwig
2019-05-20 16:13 ` [PATCH 05/17] xfs: don't use REQ_PREFLUSH for split log writes Christoph Hellwig
2019-05-20 16:13 ` [PATCH 06/17] xfs: factor out log buffer writing Christoph Hellwig
2019-05-20 16:13 ` [PATCH 07/17] xfs: factor out splitting of an iclog from xlog_sync Christoph Hellwig
2019-05-20 16:13 ` [PATCH 08/17] xfs: split iclog size calculation out of xlog_sync Christoph Hellwig
2019-05-20 16:13 ` [PATCH 09/17] xfs: update both state counters together in xlog_sync Christoph Hellwig
2019-05-20 16:13 ` [PATCH 10/17] xfs: remove the syncing argument from xlog_verify_iclog Christoph Hellwig
2019-05-20 16:13 ` [PATCH 11/17] xfs: make use of the l_targ field in struct xlog Christoph Hellwig
2019-05-20 16:13 ` [PATCH 12/17] xfs: use bios directly to write log buffers Christoph Hellwig
2019-05-20 16:13 ` [PATCH 13/17] xfs: return an offset instead of a pointer from xlog_align Christoph Hellwig
2019-05-20 16:13 ` [PATCH 14/17] xfs: use bios directly to read and write the log recovery buffers Christoph Hellwig
2019-05-20 23:32 ` Dave Chinner
2019-05-21 5:09 ` Christoph Hellwig
2019-05-21 22:24 ` Dave Chinner
2019-05-22 5:12 ` Christoph Hellwig
2019-05-22 6:19 ` Dave Chinner [this message]
2019-05-22 17:31 ` Christoph Hellwig
2019-05-22 23:28 ` Dave Chinner
2019-05-23 6:22 ` Christoph Hellwig
2019-05-20 16:13 ` [PATCH 15/17] xfs: remove unused buffer cache APIs Christoph Hellwig
2019-05-20 16:13 ` [PATCH 16/17] xfs: properly type the b_log_item field in struct xfs_buf Christoph Hellwig
2019-05-20 16:13 ` [PATCH 17/17] xfs: remove the b_io_length " Christoph Hellwig