From: "Theodore Ts'o" <tytso@mit.edu>
To: Jan Kara <jack@suse.cz>
Cc: Li Dongyang <dongyangli@ddn.com>,
linux-ext4@vger.kernel.org, Andreas Dilger <adilger@dilger.ca>,
Alex Zhuravlev <bzzz@whamcloud.com>
Subject: Re: [PATCH V2] jbd2: use rhashtable for revoke records during replay
Date: Fri, 8 Nov 2024 11:11:18 -0500
Message-ID: <20241108161118.GA42603@mit.edu>
In-Reply-To: <20241108103358.ziocxsyapli2pexv@quack3>
On Fri, Nov 08, 2024 at 11:33:58AM +0100, Jan Kara wrote:
> > 1048576 records - 95 seconds
> > 2097152 records - 580 seconds
>
> These are really high numbers of revoke records. Deleting couple GB of
> metadata doesn't happen so easily. Are they from a real workload or just
> a stress test?
For context, the background of this is that this has been an
out-of-tree patch that's been around for a very long time, for use
with Lustre servers where, apparently, this very large number of
revoke records is a real thing.
> If my interpretation is correct, then rhashtable is unnecessarily
> huge hammer for this. Firstly, as the big hash is needed only during
> replay, there's no concurrent access to the data
> structure. Secondly, we just fill the data structure in the
> PASS_REVOKE scan and then use it. Thirdly, we know the number of
> elements we need to store in the table in advance (well, currently
> we don't but it's trivial to modify PASS_SCAN to get that number).
>
> So rather than playing with rhashtable, I'd modify PASS_SCAN to sum
> up number of revoke records we're going to process and then prepare
> a static hash of appropriate size for replay (we can just use the
> standard hashing fs/jbd2/revoke.c uses, just with differently sized
> hash table allocated for replay and point journal->j_revoke to
> it). And once recovery completes jbd2_journal_clear_revoke() can
> free the table and point journal->j_revoke back to the original
> table. What do you think?
Hmm, that's a really nice idea; Andreas, what do you think?
- Ted
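
For illustration only, here is a rough userspace sketch of the idea Jan
describes above: count the revoke records in the scan pass, then allocate
a plain chained hash table sized to that count and fill it during the
revoke pass. It is not code from fs/jbd2/revoke.c or from the patch under
discussion. Every name in it (revoke_rec, replay_table, replay_table_*)
is hypothetical, the hash function is an arbitrary multiplicative hash,
and the sequence comparison ignores the transaction ID wraparound that
the kernel has to handle.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are jbd2 symbols. */
struct revoke_rec {
	uint64_t blocknr;        /* revoked block number */
	uint32_t sequence;       /* latest transaction that revoked it */
	struct revoke_rec *next; /* bucket chain */
};

struct replay_table {
	size_t nbuckets;         /* always a power of two */
	struct revoke_rec **buckets;
};

static size_t roundup_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

static size_t bucket_of(const struct replay_table *t, uint64_t blocknr)
{
	/* Arbitrary multiplicative hash, chosen for the sketch only. */
	uint64_t h = blocknr * 0x9E3779B97F4A7C15ULL;

	return (size_t)(h >> 32) & (t->nbuckets - 1);
}

/* nrecords is the count a scan pass would have summed up beforehand. */
static struct replay_table *replay_table_create(size_t nrecords)
{
	struct replay_table *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->nbuckets = roundup_pow2(nrecords ? nrecords : 1);
	t->buckets = calloc(t->nbuckets, sizeof(*t->buckets));
	if (!t->buckets) {
		free(t);
		return NULL;
	}
	return t;
}

static struct revoke_rec *replay_table_find(const struct replay_table *t,
					    uint64_t blocknr)
{
	struct revoke_rec *r = t->buckets[bucket_of(t, blocknr)];

	while (r && r->blocknr != blocknr)
		r = r->next;
	return r;
}

/* Record a revoke; keeps the newest sequence. Returns 0, or -1 on ENOMEM. */
static int replay_table_insert(struct replay_table *t, uint64_t blocknr,
			       uint32_t sequence)
{
	struct revoke_rec *r;
	size_t b;

	assert(t != NULL);
	r = replay_table_find(t, blocknr);
	if (r) {
		if (sequence > r->sequence) /* no wraparound handling here */
			r->sequence = sequence;
		return 0;
	}
	r = malloc(sizeof(*r));
	if (!r)
		return -1;
	b = bucket_of(t, blocknr);
	r->blocknr = blocknr;
	r->sequence = sequence;
	r->next = t->buckets[b];
	t->buckets[b] = r;
	return 0;
}

/* 1 if a block logged in transaction `sequence` must be skipped on replay. */
static int replay_table_revoked(const struct replay_table *t,
				uint64_t blocknr, uint32_t sequence)
{
	const struct revoke_rec *r = replay_table_find(t, blocknr);

	return r && sequence <= r->sequence;
}

static void replay_table_destroy(struct replay_table *t)
{
	for (size_t i = 0; i < t->nbuckets; i++) {
		struct revoke_rec *r = t->buckets[i];

		while (r) {
			struct revoke_rec *next = r->next;

			free(r);
			r = next;
		}
	}
	free(t->buckets);
	free(t);
}
```

Because the table is filled once and then only read, and replay is
single-threaded, the sketch needs no locking and no resizing. With the
bucket count rounded up from the record count, the average chain length
stays at or below one entry regardless of how many records the journal
holds.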