From: Dave Chinner <david@fromorbit.com>
To: Kent Overstreet <koverstreet@google.com>
Cc: axboe@kernel.dk, tytso@mit.edu, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: Re: Immutable biovecs, dio rewrite
Date: Tue, 11 Jun 2013 15:20:12 +1000
Message-ID: <20130611052012.GJ29376@dastard>
In-Reply-To: <1370744348-15407-1-git-send-email-koverstreet@google.com>

On Sat, Jun 08, 2013 at 07:18:42PM -0700, Kent Overstreet wrote:
> Immutable biovecs: Drivers no longer modify the biovec array directly
> (bv_len/bv_offset in particular) - we add a real iterator to struct bio
> that lets drivers partially complete a bio while only modifying the
> iterator. The iterator has the existing bi_sector, bi_size, bi_idx
> members, and also bi_bvec_done.
>
> This gets us a couple things:
> * Changing all the drivers to go through the iterator means that we can
> submit a partially completed bio to generic_make_request() - this
> previously worked on some drivers, but not on others.
>
> This makes it much easier for upper layers to process bios
> incrementally - not just stacking drivers, my dio rewrite relies
> heavily on this strategy.
>
> * Previously, any code that might need to retry a bio somehow if it
> errored (mainly stacking drivers) had to clone not just the bio, but
> the entire biovec. The biovec can be up to BIO_MAX_PAGES, which works
> out to 4k...
>
> * When cloning a bio, now we don't have to clone the biovec unless we
> want to modify it. Bio splitting also becomes just a special case of
> cloning a bio.
>
> We also get to delete a lot of code. And this patch series barely
> scratches the surface - I've got more patches that delete another 1.5k
> lines of code, without trying all that hard.
>
> I'd like to get as much of this into 3.11 as possible - I don't know if
> the dio rewrite is a realistic possibility (it currently breaks btrfs -
> we need to add a different hook for them) and it does need a lot of
> review and testing from the various driver maintainers. The dio rewrite
> does pass xfstests for me, though.

Please test with XFS and CONFIG_XFS_DEBUG=y - xfstests will stress
the dio subsystem a lot more when it is run on XFS. Indeed, xfstests
generic/013 assert fails almost immediately with:

[ 58.859136] XFS (vda): Mounting Filesystem
[ 58.881742] XFS (vda): Ending clean mount
[ 58.989301] XFS: Assertion failed: bh_result->b_size >= (1 << inode->i_blkbits), file: fs/xfs/xfs_aops.c, line: 1209
[ 58.992672] ------------[ cut here ]------------
[ 58.994093] kernel BUG at fs/xfs/xfs_message.c:108!
[ 58.995385] invalid opcode: 0000 [#1] SMP
[ 58.996569] Modules linked in:
[ 58.997427] CPU: 1 PID: 9529 Comm: fsstress Not tainted 3.10.0-rc4-next-20130510-dgc+ #85
[ 58.999556] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 59.001143] task: ffff880079a34530 ti: ffff880079a0a000 task.ti: ffff880079a0a000
[ 59.003130] RIP: 0010:[<ffffffff814655c2>] [<ffffffff814655c2>] assfail+0x22/0x30
[ 59.005263] RSP: 0018:ffff880079a0b998 EFLAGS: 00010292
[ 59.006334] RAX: 0000000000000068 RBX: 0000000000054000 RCX: ffff88007fd0eb68
[ 59.007676] RDX: 0000000000000000 RSI: ffff88007fd0d0d8 RDI: 0000000000000246
[ 59.009076] RBP: ffff880079a0b998 R08: 000000000000000a R09: 00000000000001fa
[ 59.010883] R10: 0000000000000000 R11: 00000000000001f9 R12: ffff880073730b50
[ 59.012855] R13: ffff88007a91e800 R14: ffff880079a0baa0 R15: 0000000000000a00
[ 59.014753] FS: 00007f784eb7b700(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[ 59.016947] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 59.018174] CR2: 00007f5d4c40e1a0 CR3: 000000007b177000 CR4: 00000000000006e0
[ 59.019509] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 59.020906] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 59.022252] Stack:
[ 59.022648] ffff880079a0ba48 ffffffff81450b2e ffff88007ba2d2c0 0000000000000001
[ 59.024181] ffff880000000000 ffffffff00000000 ffff880000000008 00010000ffff159d
[ 59.025650] 0000000000000054 ffff880073730980 ffff880079a34530 0000000179a34530
[ 59.027117] Call Trace:
[ 59.027592] [<ffffffff81450b2e>] __xfs_get_blocks+0x7e/0x5d0
[ 59.028744] [<ffffffff81451094>] xfs_get_blocks_direct+0x14/0x20
[ 59.029914] [<ffffffff811c157b>] get_blocks+0x9b/0x1b0
[ 59.030903] [<ffffffff8115da92>] ? get_user_pages+0x52/0x60
[ 59.031968] [<ffffffff811c1af7>] __blockdev_direct_IO+0x367/0x850
[ 59.033191] [<ffffffff81451080>] ? __xfs_get_blocks+0x5d0/0x5d0
[ 59.034336] [<ffffffff8144f3ed>] xfs_vm_direct_IO+0x18d/0x1b0
[ 59.035436] [<ffffffff81451080>] ? __xfs_get_blocks+0x5d0/0x5d0
[ 59.036635] [<ffffffff81144b42>] ? pagevec_lookup+0x22/0x30
[ 59.037718] [<ffffffff8113a6ef>] generic_file_aio_read+0x6bf/0x710
[ 59.038899] [<ffffffff814579a2>] xfs_file_aio_read+0x152/0x320
[ 59.040089] [<ffffffff81186910>] do_sync_read+0x80/0xb0
[ 59.041100] [<ffffffff811876d5>] vfs_read+0xa5/0x160
[ 59.042098] [<ffffffff81187912>] SyS_read+0x52/0xa0
[ 59.043045] [<ffffffff81c39e99>] system_call_fastpath+0x16/0x1b
[ 59.044246] Code: 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 f1 41 89 d0 48 89 e5 48 89 fa 48 c7 c6 48 f4 fb 81 31 ff 31 c0 e8 de fb ff ff <0f> 0b 66 66 66
[ 59.049085] RIP [<ffffffff814655c2>] assfail+0x22/0x30
[ 59.050100] RSP <ffff880079a0b998>
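
For reference, the condition in the assertion message above can be
modelled in userspace. This is only a rough sketch, not the code in
fs/xfs/xfs_aops.c: the struct and function names are hypothetical
stand-ins, and only the two fields named in the assertion text are
modelled. It shows that a mapping request passes the check only when
its size covers at least one filesystem block (1 << i_blkbits bytes).

```c
#include <stddef.h>

/*
 * Hypothetical stand-ins for the kernel's struct inode and
 * struct buffer_head. Only the fields read by the assertion
 * are present; nothing else about the real structures is implied.
 */
struct sketch_inode {
	unsigned int i_blkbits;	/* log2 of the filesystem block size */
};

struct sketch_bh {
	size_t b_size;		/* size in bytes of the mapping request */
};

/*
 * Mirrors the asserted condition:
 *   bh_result->b_size >= (1 << inode->i_blkbits)
 * Returns 1 when the request spans at least one whole block, else 0.
 */
static int map_request_ok(const struct sketch_inode *inode,
			  const struct sketch_bh *bh_result)
{
	return bh_result->b_size >= ((size_t)1 << inode->i_blkbits);
}
```

With i_blkbits = 12 (4096-byte blocks, an assumed value), a request
with b_size = 4096 satisfies the check, while one with b_size = 512
would trip the assertion.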
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com