From: Liu Bo <bo.li.liu@oracle.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 1/2] Btrfs: use flag EXTENT_DEFRAG for snapshot-aware defrag
Date: Sat, 25 Aug 2012 18:36:58 +0800 [thread overview]
Message-ID: <5038AACA.2090804@oracle.com> (raw)
In-Reply-To: <20120824173447.GE17430@twin.jikos.cz>
On 08/25/2012 01:34 AM, David Sterba wrote:
> On Thu, Aug 23, 2012 at 07:01:33PM +0800, Liu Bo wrote:
>> We're going to use this flag EXTENT_DEFRAG to indicate which range
>> belongs to
>> defragment so that we can implement snapshow-aware defrag:
>>
>> We set the EXTENT_DEFRAG flag when dirtying the extents that need
>> defragmented, so later on writeback thread can differentiate between
>> normal writeback and writeback started by defragmentation.
>>
>> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
>> ---
>> fs/btrfs/extent_io.c | 11 +++++++++--
>> fs/btrfs/extent_io.h | 29 ++++++++++++++---------------
>> fs/btrfs/file.c | 4 ++--
>> fs/btrfs/free-space-cache.c | 7 ++++---
>> fs/btrfs/inode.c | 20 ++++++++++++--------
>> fs/btrfs/ioctl.c | 8 ++++----
>> 6 files changed, 45 insertions(+), 34 deletions(-)
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index bb25e89..f03ceff 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -1143,6 +1143,14 @@ int set_extent_delalloc(struct extent_io_tree *tree, u64 start, u64 end,
>> NULL, cached_state, mask);
>> }
>>
>> +int set_extent_defrag(struct extent_io_tree *tree, u64 start, u64 end,
>> + struct extent_state **cached_state, gfp_t mask)
>> +{
>> + return set_extent_bit(tree, start, end,
>> + EXTENT_DELALLOC | EXTENT_UPTODATE | EXTENT_DEFRAG,
>> + NULL, cached_state, mask);
>> +}
>> +
>> int clear_extent_dirty(struct extent_io_tree *tree, u64 start, u64 end,
>> gfp_t mask)
>> {
>> @@ -3700,8 +3708,7 @@ int try_release_extent_mapping(struct extent_map_tree *map,
>> }
>> if (!test_range_bit(tree, em->start,
>> extent_map_end(em) - 1,
>> - EXTENT_LOCKED | EXTENT_WRITEBACK,
>> - 0, NULL)) {
>> + EXTENT_LOCKED, 0, NULL)) {
>> remove_extent_mapping(map, em);
>> /* once for the rb tree */
>> free_extent_map(em);
>> diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
>> index 25900af..d9dee94 100644
>> --- a/fs/btrfs/extent_io.h
>> +++ b/fs/btrfs/extent_io.h
>> @@ -5,21 +5,18 @@
>>
>> /* bits for the extent state */
>> #define EXTENT_DIRTY 1
>> -#define EXTENT_WRITEBACK (1 << 1)
>> -#define EXTENT_UPTODATE (1 << 2)
>> -#define EXTENT_LOCKED (1 << 3)
>> -#define EXTENT_NEW (1 << 4)
>> -#define EXTENT_DELALLOC (1 << 5)
>> -#define EXTENT_DEFRAG (1 << 6)
>> -#define EXTENT_DEFRAG_DONE (1 << 7)
>> -#define EXTENT_BUFFER_FILLED (1 << 8)
>> -#define EXTENT_BOUNDARY (1 << 9)
>> -#define EXTENT_NODATASUM (1 << 10)
>> -#define EXTENT_DO_ACCOUNTING (1 << 11)
>> -#define EXTENT_FIRST_DELALLOC (1 << 12)
>> -#define EXTENT_NEED_WAIT (1 << 13)
>> -#define EXTENT_DAMAGED (1 << 14)
>> -#define EXTENT_IOBITS (EXTENT_LOCKED | EXTENT_WRITEBACK)
>> +#define EXTENT_UPTODATE (1 << 1)
>> +#define EXTENT_LOCKED (1 << 2)
>> +#define EXTENT_NEW (1 << 3)
>> +#define EXTENT_DELALLOC (1 << 4)
>> +#define EXTENT_DEFRAG (1 << 5)
>> +#define EXTENT_BOUNDARY (1 << 6)
>> +#define EXTENT_NODATASUM (1 << 7)
>> +#define EXTENT_DO_ACCOUNTING (1 << 8)
>> +#define EXTENT_FIRST_DELALLOC (1 << 9)
>> +#define EXTENT_NEED_WAIT (1 << 10)
>> +#define EXTENT_DAMAGED (1 << 11)
>> +#define EXTENT_IOBITS (EXTENT_LOCKED)
>
> Uh, please don't do that, the values are used only for in-memory
> accounting and we do not care about their real values. You create a hole
> in the number sequence, removing EXTENT_WRITEBACK, EXTENT_DEFRAG_DONE
> and EXTENT_BUFFER_FILLED, fine, so put there a comment instead. We can
> reuse the value anytime later.
>
> Also I'd like to hear why it's safe to remove the WRITEBACK flag. It's
> embedded inside the IOBITS flag that is checked in many places, so if you
> remove it the condition would be more relaxed and some code can be
> unexpectedly executed.
>
> On the other hand, the WRITEBACK flag is not set anywhere, either
> directly or via IOBITS, so it's probably a leftover (like the other
> flags you remove).
>
Yeah, it might be some historical reasons why WRITEBACK flag is kept till now.
> So, I'd rather see a separate patch that just removes the unused bits,
> and then patch that adds the snapshot-aware defrag logic, this is more
> friendly to testing.
>
hmm, make sense, I'll make a separated patch.
>
>> #define EXTENT_CTLBITS (EXTENT_DO_ACCOUNTING | EXTENT_FIRST_DELALLOC)
>>
>> /*
>> @@ -235,6 +232,8 @@ int convert_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
>> int bits, int clear_bits, gfp_t mask);
>> int set_extent_delalloc(struct extent_io_tree *tree, u64 start, u64 end,
>> struct extent_state **cached_state, gfp_t mask);
>> +int set_extent_defrag(struct extent_io_tree *tree, u64 start, u64 end,
>> + struct extent_state **cached_state, gfp_t mask);
>> int find_first_extent_bit(struct extent_io_tree *tree, u64 start,
>> u64 *start_ret, u64 *end_ret, int bits);
>> struct extent_state *find_first_extent_bit_state(struct extent_io_tree *tree,
>> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
>> index 9aa01ec..c172868 100644
>> --- a/fs/btrfs/file.c
>> +++ b/fs/btrfs/file.c
>> @@ -1173,8 +1173,8 @@ again:
>>
>> clear_extent_bit(&BTRFS_I(inode)->io_tree, start_pos,
>> last_pos - 1, EXTENT_DIRTY | EXTENT_DELALLOC |
>> - EXTENT_DO_ACCOUNTING, 0, 0, &cached_state,
>> - GFP_NOFS);
>> + EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG,
>> + 0, 0, &cached_state, GFP_NOFS);
>> unlock_extent_cached(&BTRFS_I(inode)->io_tree,
>> start_pos, last_pos - 1, &cached_state,
>> GFP_NOFS);
>> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
>> index 6b10acf..53ff2da 100644
>> --- a/fs/btrfs/free-space-cache.c
>> +++ b/fs/btrfs/free-space-cache.c
>> @@ -1023,7 +1023,8 @@ int __btrfs_write_out_cache(struct btrfs_root *root, struct inode *inode,
>> ret = btrfs_search_slot(trans, root, &key, path, 0, 1);
>> if (ret < 0) {
>> clear_extent_bit(&BTRFS_I(inode)->io_tree, 0, inode->i_size - 1,
>> - EXTENT_DIRTY | EXTENT_DELALLOC, 0, 0, NULL,
>> + EXTENT_DIRTY | EXTENT_DELALLOC | EXTENT_DEFRAG,
>> + 0, 0, NULL,
>
> This does not make much sense to me, why do you need to clear DEFRAG for
> the free space inode?
>
...Actually I didn't notice this, I just did grep for clear_extent_bit and add that flag...
>> GFP_NOFS);
>> goto out;
>> }
>> @@ -1037,8 +1038,8 @@ int __btrfs_write_out_cache(struct btrfs_root *root, struct inode *inode,
>> found_key.offset != offset) {
>> clear_extent_bit(&BTRFS_I(inode)->io_tree, 0,
>> inode->i_size - 1,
>> - EXTENT_DIRTY | EXTENT_DELALLOC, 0, 0,
>> - NULL, GFP_NOFS);
>> + EXTENT_DIRTY | EXTENT_DELALLOC |
>> + EXTENT_DEFRAG, 0, 0, NULL, GFP_NOFS);
>> btrfs_release_path(path);
>> goto out;
>> }
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index ea6a4ee..85bc35f 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -3529,7 +3529,8 @@ again:
>> }
>>
>> clear_extent_bit(&BTRFS_I(inode)->io_tree, page_start, page_end,
>> - EXTENT_DIRTY | EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING,
>> + EXTENT_DIRTY | EXTENT_DELALLOC |
>> + EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG,
>> 0, 0, &cached_state, GFP_NOFS);
>>
>> ret = btrfs_set_extent_delalloc(inode, page_start, page_end,
>> @@ -5996,7 +5997,8 @@ unlock:
>> if (lockstart < lockend) {
>> if (create && len < lockend - lockstart) {
>> clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart,
>> - lockstart + len - 1, unlock_bits, 1, 0,
>> + lockstart + len - 1,
>> + unlock_bits | EXTENT_DEFRAG, 1, 0,
>
> this
>
>> &cached_state, GFP_NOFS);
>> /*
>> * Beside unlock, we also need to cleanup reserved space
>> @@ -6004,8 +6006,8 @@ unlock:
>> */
>> clear_extent_bit(&BTRFS_I(inode)->io_tree,
>> lockstart + len, lockend,
>> - unlock_bits | EXTENT_DO_ACCOUNTING,
>> - 1, 0, NULL, GFP_NOFS);
>> + unlock_bits | EXTENT_DO_ACCOUNTING |
>> + EXTENT_DEFRAG, 1, 0, NULL, GFP_NOFS);
>
> and this come from the patch that fixes the DIO problem (Btrfs: fix a
> dio write regression), right? Please mention that, makes reviewer's life
> easier :)
>
I'm so sorry, Dave, I should write a NOTE in front of the patchset.
>> } else {
>> clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart,
>> lockend, unlock_bits, 1, 0,
>> @@ -6570,8 +6572,8 @@ static void btrfs_invalidatepage(struct page *page, unsigned long offset)
>> */
>> clear_extent_bit(tree, page_start, page_end,
>> EXTENT_DIRTY | EXTENT_DELALLOC |
>> - EXTENT_LOCKED | EXTENT_DO_ACCOUNTING, 1, 0,
>> - &cached_state, GFP_NOFS);
>> + EXTENT_LOCKED | EXTENT_DO_ACCOUNTING |
>> + EXTENT_DEFRAG, 1, 0, &cached_state, GFP_NOFS);
>> /*
>> * whoever cleared the private bit is responsible
>> * for the finish_ordered_io
>> @@ -6587,7 +6589,8 @@ static void btrfs_invalidatepage(struct page *page, unsigned long offset)
>> }
>> clear_extent_bit(tree, page_start, page_end,
>> EXTENT_LOCKED | EXTENT_DIRTY | EXTENT_DELALLOC |
>> - EXTENT_DO_ACCOUNTING, 1, 1, &cached_state, GFP_NOFS);
>> + EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, 1, 1,
>> + &cached_state, GFP_NOFS);
>> __btrfs_releasepage(page, GFP_NOFS);
>>
>> ClearPageChecked(page);
>> @@ -6683,7 +6686,8 @@ again:
>> * prepare_pages in the normal write path.
>> */
>> clear_extent_bit(&BTRFS_I(inode)->io_tree, page_start, page_end,
>> - EXTENT_DIRTY | EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING,
>> + EXTENT_DIRTY | EXTENT_DELALLOC |
>> + EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG,
>> 0, 0, &cached_state, GFP_NOFS);
>>
>> ret = btrfs_set_extent_delalloc(inode, page_start, page_end,
>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> index a1fbca0..9449b84 100644
>> --- a/fs/btrfs/ioctl.c
>> +++ b/fs/btrfs/ioctl.c
>> @@ -1022,8 +1022,8 @@ again:
>> page_start, page_end - 1, 0, &cached_state);
>> clear_extent_bit(&BTRFS_I(inode)->io_tree, page_start,
>> page_end - 1, EXTENT_DIRTY | EXTENT_DELALLOC |
>> - EXTENT_DO_ACCOUNTING, 0, 0, &cached_state,
>> - GFP_NOFS);
>> + EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, 0, 0,
>> + &cached_state, GFP_NOFS);
>>
>> if (i_done != page_cnt) {
>> spin_lock(&BTRFS_I(inode)->lock);
>> @@ -1034,8 +1034,8 @@ again:
>> }
>>
>>
>> - btrfs_set_extent_delalloc(inode, page_start, page_end - 1,
>> - &cached_state);
>> + set_extent_defrag(&BTRFS_I(inode)->io_tree, page_start, page_end - 1,
>> + &cached_state, GFP_NOFS);
>>
>> unlock_extent_cached(&BTRFS_I(inode)->io_tree,
>> page_start, page_end - 1, &cached_state,
>> ---
>
> otherwise looks ok.
>
Thanks for reviewing this! :)
thanks,
liubo
> david
>
prev parent reply other threads:[~2012-08-25 10:36 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-08-23 11:01 [PATCH 1/2] Btrfs: use flag EXTENT_DEFRAG for snapshot-aware defrag Liu Bo
2012-08-23 11:01 ` [PATCH 2/2] Btrfs: " Liu Bo
2012-08-23 11:08 ` [PATCH 1/2] Btrfs: use flag EXTENT_DEFRAG for " Liu Bo
2012-08-24 17:34 ` David Sterba
2012-08-25 10:36 ` Liu Bo [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5038AACA.2090804@oracle.com \
--to=bo.li.liu@oracle.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).