Linux Btrfs filesystem development
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Filipe Manana <fdmanana@kernel.org>, Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org, David Sterba <dsterba@suse.com>
Subject: Re: [PATCH v2 07/11] btrfs: remove extent_map::block_start member
Date: Fri, 17 May 2024 08:15:36 +0930	[thread overview]
Message-ID: <bcec9364-4de6-4286-9d9b-a3c3731211c5@gmx.com> (raw)
In-Reply-To: <CAL3q7H7bT71DicGMZZ7aHu9xcthHG4SiCAn3cC_-5rjrdiodBw@mail.gmail.com>



在 2024/5/17 02:58, Filipe Manana 写道:
> On Fri, May 3, 2024 at 7:02 AM Qu Wenruo <wqu@suse.com> wrote:
>>
>> The member extent_map::block_start can be calculated from
>> extent_map::disk_bytenr + extent_map::offset for regular extents.
>> And otherwise just extent_map::disk_bytenr.
>>
>> And this is already validated by the validate_extent_map().
>> Now we can remove the member.
>>
>> However there is a special case in btrfs_alloc_ordered_extent(), where
>> we intentionally pass a pseudo ordered extent, exploiting the fact that
>> for NOCOW/PREALLOC writes we do not rely on ordered extent members to
>> update the file extent items.
>>
>> And that's the only way to pass btrfs_extract_ordered_extent(), as it
>> doesn't accept any ordered extent with an offset.
>>
>> For now we will keep the old pseudo ordered extent members, and leave
>> the cleanup of it for the future.
>
> These last 3 paragraphs, the rant about NOCOW writes and ordered
> extents, seem unnecessary to me.
>
> This is just how things work, and for me it makes total sense that an
> ordered extent for a NOCOW write does not represent the entirety of an
> existing file extent item.
> NOCOW writes don't create new file extent items, so they can refer
> only to a section of an existing extent (or the whole extent).

To me, NOCOW/PREALLOC is really the exception, not following all the
existing definition for all the OE/extent map definitions, even it's
only for transient NOCOW/PREALLOC OE.

And I really hope to remove the exception in the near future, even it
may mean some more complex OE spliting etc.

>
> I don't see how this rant is relevant to have here, as the patchset is
> about extent maps and not ordered extents.
> I would leave it out, as it's a separate thing and doesn't affect anything here.

Sure, I can leave it out for now.

>
> More comments inlined below.
>
>>
[...]
>>
>>          if (type != BTRFS_ORDERED_NOCOW) {
>> -               em = create_io_em(inode, start, len, block_start,
>> +               em = create_io_em(inode, start, len,
>>                                    orig_block_len, ram_bytes,
>>                                    BTRFS_COMPRESS_NONE, /* compress_type */
>>                                    file_extent, type);
>>                  if (IS_ERR(em))
>>                          goto out;
>>          }
>> +
>> +       /*
>> +        * NOTE: I know the numbers are totally wrong for NOCOW/PREALLOC,
>> +        * but it doesn't cause problem at least for now.
>
> They are not wrong, they are what they must be.
>
> This is all about passing the right values to
> btrfs_alloc_ordered_extent(), which has nothing to do with extent
> maps.

I think it's pretty obvious that, only for NOCOW/PREALLOC that OE
members differs from their corresponding extent map.

Sure, it is not causing anything wrong (in fact, anything other than the
current behavior is going to cause a lot of problems).

But just for the sake of consistency, even it doesn't affect any on-disk
change, I still want to modify the call sites so that we can directly
pass the btrfs_file_extent item to OE allocation.
Even if it means other modification mostly inside OE splitting part.

For now I can remove/modify the comments, but I do not think this is
consistent.

>
>> +        *
>> +        * For regular writes, we would have file_extent->offset as 0,
>> +        * thus we really only need disk_bytenr, every other length
>> +        * (disk_num_bytes/ram_bytes) would match @len and fe->num_bytes.
>> +        * The current numbers are totally fine.
>> +        *
>> +        * For NOCOW, we don't really care about the numbers except @file_pos
>
> I don't see any variable named @file_pos anywhere in this function.
>
>> +        * and @num_bytes, as we won't insert a file extent item at all.
>
> There's no @num_bytes either, there's a @len however.
>
>> +        *
>> +        * For PREALLOC, we do not use ordered extent's member, but
>
> ordered extent's member -> ordered extent members
>
>> +        * btrfs_mark_extent_written() would handle everything.
>
> would handle -> handles
>
>> +        *
>> +        * So here we intentionally go with pseudo numbers for the NOCOW/PREALLOC
>> +        * OEs, or btrfs_extract_ordered_extent() would need a completely new
>> +        * routine to handle NOCOW/PREALLOC splits, meanwhile result nothing
>> +        * different.
>> +        */
>
> I would just leave the entire comment out.
>
>>          ordered = btrfs_alloc_ordered_extent(inode, start, len, len,
>> -                                            block_start, len, 0,
[...]
>> @@ -1778,7 +1778,9 @@ static void btrfs_rewrite_logical_zoned(struct btrfs_ordered_extent *ordered,
>>          write_lock(&em_tree->lock);
>>          em = search_extent_mapping(em_tree, ordered->file_offset,
>>                                     ordered->num_bytes);
>> -       em->block_start = logical;
>> +       /* The em should be a new COW extent, thus it should not has offset. */
>
> not has offset -> not have an offset
>
> Otherwise it seems fine, but I still want to go over the rest of the patchset.
> I'm going slowly over it, and after commenting on each inidividual
> patch, I'll comment on the cover letter.

Really appreciate the review so far.
And considering the patchset is not urgent, just take your time so that
I can also address the comments at the same time.

Thanks,
Qu
>
> Thanks.
>
>> +       ASSERT(em->offset == 0);
>> +       em->disk_bytenr = logical;
>>          free_extent_map(em);
>>          write_unlock(&em_tree->lock);
>>   }
>> diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
>> index 3743719d13f2..89b2b66e5bc0 100644
>> --- a/include/trace/events/btrfs.h
>> +++ b/include/trace/events/btrfs.h
>> @@ -291,7 +291,6 @@ TRACE_EVENT_CONDITION(btrfs_get_extent,
>>                  __field(        u64,  ino               )
>>                  __field(        u64,  start             )
>>                  __field(        u64,  len               )
>> -               __field(        u64,  block_start       )
>>                  __field(        u32,  flags             )
>>                  __field(        int,  refs              )
>>          ),
>> @@ -301,18 +300,16 @@ TRACE_EVENT_CONDITION(btrfs_get_extent,
>>                  __entry->ino            = btrfs_ino(inode);
>>                  __entry->start          = map->start;
>>                  __entry->len            = map->len;
>> -               __entry->block_start    = map->block_start;
>>                  __entry->flags          = map->flags;
>>                  __entry->refs           = refcount_read(&map->refs);
>>          ),
>>
>>          TP_printk_btrfs("root=%llu(%s) ino=%llu start=%llu len=%llu "
>> -                 "block_start=%llu(%s) flags=%s refs=%u",
>> +                 "flags=%s refs=%u",
>>                    show_root_type(__entry->root_objectid),
>>                    __entry->ino,
>>                    __entry->start,
>>                    __entry->len,
>> -                 show_map_type(__entry->block_start),
>>                    show_map_flags(__entry->flags),
>>                    __entry->refs)
>>   );
>> @@ -2608,7 +2605,6 @@ TRACE_EVENT(btrfs_extent_map_shrinker_remove_em,
>>                  __field(        u64,    root_id         )
>>                  __field(        u64,    start           )
>>                  __field(        u64,    len             )
>> -               __field(        u64,    block_start     )
>>                  __field(        u32,    flags           )
>>          ),
>>
>> @@ -2617,15 +2613,12 @@ TRACE_EVENT(btrfs_extent_map_shrinker_remove_em,
>>                  __entry->root_id        = inode->root->root_key.objectid;
>>                  __entry->start          = em->start;
>>                  __entry->len            = em->len;
>> -               __entry->block_start    = em->block_start;
>>                  __entry->flags          = em->flags;
>>          ),
>>
>> -       TP_printk_btrfs(
>> -"ino=%llu root=%llu(%s) start=%llu len=%llu block_start=%llu(%s) flags=%s",
>> +       TP_printk_btrfs("ino=%llu root=%llu(%s) start=%llu len=%llu flags=%s",
>>                          __entry->ino, show_root_type(__entry->root_id),
>>                          __entry->start, __entry->len,
>> -                       show_map_type(__entry->block_start),
>>                          show_map_flags(__entry->flags))
>>   );
>>
>> --
>> 2.45.0
>>
>>
>

  reply	other threads:[~2024-05-16 22:45 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-03  6:01 [PATCH v2 00/11] btrfs: extent-map: unify the members with btrfs_ordered_extent Qu Wenruo
2024-05-03  6:01 ` [PATCH v2 01/11] btrfs: rename extent_map::orig_block_len to disk_num_bytes Qu Wenruo
2024-05-09 16:15   ` Filipe Manana
2024-05-03  6:01 ` [PATCH v2 02/11] btrfs: export the expected file extent through can_nocow_extent() Qu Wenruo
2024-05-09 16:22   ` Filipe Manana
2024-05-09 21:55     ` Qu Wenruo
2024-05-03  6:01 ` [PATCH v2 03/11] btrfs: introduce new members for extent_map Qu Wenruo
2024-05-09 17:05   ` Filipe Manana
2024-05-09 22:11     ` Qu Wenruo
2024-05-10 11:26       ` Filipe Manana
2024-05-10 22:26         ` Qu Wenruo
2024-05-13 12:48   ` Filipe Manana
2024-05-13 12:54     ` Filipe Manana
2024-05-13 17:31   ` Filipe Manana
2024-05-03  6:01 ` [PATCH v2 04/11] btrfs: introduce extra sanity checks for extent maps Qu Wenruo
2024-05-13 12:21   ` Filipe Manana
2024-05-13 22:34     ` Qu Wenruo
2024-05-03  6:01 ` [PATCH v2 05/11] btrfs: remove extent_map::orig_start member Qu Wenruo
2024-05-13 13:09   ` Filipe Manana
2024-05-13 22:14     ` Qu Wenruo
2024-05-03  6:01 ` [PATCH v2 06/11] btrfs: remove extent_map::block_len member Qu Wenruo
2024-05-13 17:44   ` Filipe Manana
2024-05-14  7:09     ` Qu Wenruo
2024-05-03  6:01 ` [PATCH v2 07/11] btrfs: remove extent_map::block_start member Qu Wenruo
2024-05-16 17:28   ` Filipe Manana
2024-05-16 22:45     ` Qu Wenruo [this message]
2024-05-03  6:01 ` [PATCH v2 08/11] btrfs: cleanup duplicated parameters related to can_nocow_file_extent_args Qu Wenruo
2024-05-20 15:55   ` Filipe Manana
2024-05-20 22:13     ` Qu Wenruo
2024-05-03  6:01 ` [PATCH v2 09/11] btrfs: cleanup duplicated parameters related to btrfs_alloc_ordered_extent Qu Wenruo
2024-05-20 16:31   ` Filipe Manana
2024-05-03  6:01 ` [PATCH v2 10/11] btrfs: cleanup duplicated parameters related to create_io_em() Qu Wenruo
2024-05-20 16:46   ` Filipe Manana
2024-05-03  6:01 ` [PATCH v2 11/11] btrfs: cleanup duplicated parameters related to btrfs_create_dio_extent() Qu Wenruo
2024-05-20 16:48   ` Filipe Manana
2024-05-23  4:03     ` Qu Wenruo
2024-05-03 11:53 ` [PATCH v2 00/11] btrfs: extent-map: unify the members with btrfs_ordered_extent David Sterba
2024-05-20 16:55 ` Filipe Manana

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bcec9364-4de6-4286-9d9b-a3c3731211c5@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=dsterba@suse.com \
    --cc=fdmanana@kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox