From: Josef Bacik <josef@toxicpanda.com>
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 23/23] btrfs: add a comment explaining the data flush steps
Date: Tue, 7 Jul 2020 11:42:46 -0400
Message-ID: <20200707154246.52844-24-josef@toxicpanda.com>
In-Reply-To: <20200707154246.52844-1-josef@toxicpanda.com>

The data flushing steps are not obvious to people other than myself and
Chris. Write a giant comment explaining the reasoning behind each flush
step for data as well as why it is in that particular order.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/space-info.c | 47 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 47 insertions(+)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index bc99864c30d9..7fa7f580b4cc 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -998,6 +998,53 @@ static void btrfs_async_reclaim_metadata_space(struct work_struct *work)
 	} while (flush_state <= COMMIT_TRANS);
 }
 
+/*
+ * FLUSH_DELALLOC_WAIT:
+ * Space is free'd from flushing delalloc in one of two ways.
+ *
+ * 1) compression is on and we allocate less space than we reserved.
+ * 2) We are overwriting existing space.
+ *
+ * For #1 that extra space is reclaimed as soon as the delalloc pages are
+ * cow'ed, by way of btrfs_add_reserved_bytes() which adds the actual extent
+ * length to ->bytes_reserved, and subtracts the reserved space from
+ * ->bytes_may_use.
+ *
+ * For #2 this is trickier. Once the ordered extent runs we will drop the
+ * extent in the range we are overwriting, which creates a delayed ref for
+ * that freed extent. This however is not reclaimed until the transaction
+ * commits, thus the next stages.
+ *
+ * RUN_DELAYED_IPUTS
+ * If we are freeing inodes, we want to make sure all delayed iputs have
+ * completed, because they could have been on an inode with i_nlink == 0, and
+ * thus have been truncated and free'd up space. But again this space is not
+ * immediately re-usable, it comes in the form of a delayed ref, which must be
+ * run and then the transaction must be committed.
+ *
+ * FLUSH_DELAYED_REFS
+ * The above two cases generate delayed refs that will affect
+ * ->total_bytes_pinned. However this counter can be inconsistent with
+ * reality if there are outstanding delayed refs. This is because we adjust
+ * the counter based solely on the current set of delayed refs and disregard
+ * any on-disk state which might include more refs. So for example, if we
+ * have an extent with 2 references, but we only drop 1, we'll see that there
+ * is a negative delayed ref count for the extent and assume that the space
+ * will be free'd, and thus increase ->total_bytes_pinned.
+ *
+ * Running the delayed refs gives us the actual real view of what will be
+ * freed at the transaction commit time. This stage will not actually free
+ * space for us, it just makes sure that may_commit_transaction() has all of
+ * the information it needs to make the right decision.
+ *
+ * COMMIT_TRANS
+ * This is where we reclaim all of the pinned space generated by the previous
+ * two stages. We will not commit the transaction if we don't think we're
+ * likely to satisfy our request, which means if our current free space +
+ * total_bytes_pinned < reservation we will not commit. This is why the
+ * previous states are actually important, to make sure we know for sure
+ * whether committing the transaction will allow us to make progress.
+ */
 static const enum btrfs_flush_state data_flush_states[] = {
 	FLUSH_DELALLOC_WAIT,
 	RUN_DELAYED_IPUTS,
--
2.24.1
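
The accounting described in the comment can be sketched as a small
user-space model. This is only an illustration under stated assumptions,
not the kernel code: struct toy_space_info and every toy_*() helper are
hypothetical names, the free-space formula is simplified, and the
delayed-ref handling is reduced to a single optimistic estimate that is
corrected once the on-disk reference count is consulted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the counters named in the comment. */
struct toy_space_info {
	uint64_t total_bytes;
	uint64_t bytes_used;
	uint64_t bytes_reserved;
	uint64_t bytes_may_use;
	uint64_t total_bytes_pinned;
};

/* Simplified view of what is free right now (assumption, not the kernel formula). */
static uint64_t toy_free_bytes(const struct toy_space_info *s)
{
	return s->total_bytes - s->bytes_used - s->bytes_reserved -
	       s->bytes_may_use;
}

/*
 * FLUSH_DELALLOC_WAIT, case #1: we reserved 'reserved' bytes but, with
 * compression, only 'actual' bytes are allocated.  The difference becomes
 * free immediately.
 */
static void toy_add_reserved_bytes(struct toy_space_info *s,
				   uint64_t reserved, uint64_t actual)
{
	assert(actual <= reserved && s->bytes_may_use >= reserved);
	s->bytes_reserved += actual;
	s->bytes_may_use -= reserved;
}

/*
 * Case #2 and RUN_DELAYED_IPUTS: dropping an extent queues a delayed ref.
 * The pinned counter is bumped based on the delayed ref alone.
 */
static void toy_queue_drop(struct toy_space_info *s, uint64_t len)
{
	s->total_bytes_pinned += len;
}

/*
 * FLUSH_DELAYED_REFS: running the ref consults the on-disk state.  If other
 * references remain, the extent will not be freed and the estimate is undone.
 */
static void toy_run_drop(struct toy_space_info *s, uint64_t len,
			 unsigned int on_disk_refs)
{
	if (on_disk_refs > 1)
		s->total_bytes_pinned -= len;
}

/* COMMIT_TRANS: only commit if free space plus pinned space covers the request. */
static bool toy_should_commit(const struct toy_space_info *s, uint64_t request)
{
	return toy_free_bytes(s) + s->total_bytes_pinned >= request;
}
```

With total_bytes = 100, bytes_used = 60 and bytes_may_use = 40, calling
toy_add_reserved_bytes(&s, 40, 10) leaves 30 bytes free. A queued drop of
25 bytes makes toy_should_commit(&s, 50) true, but if running the ref
finds two on-disk references the estimate is undone and the check turns
false again, which is why the model runs delayed refs before deciding
whether to commit.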