public inbox for linux-ext4@vger.kernel.org
From: "Theodore Ts'o" <tytso@mit.edu>
To: Jan Kara <jack@suse.cz>
Cc: Li Dongyang <dongyangli@ddn.com>,
	linux-ext4@vger.kernel.org, Andreas Dilger <adilger@dilger.ca>,
	Alex Zhuravlev <bzzz@whamcloud.com>
Subject: Re: [PATCH V2] jbd2: use rhashtable for revoke records during replay
Date: Fri, 8 Nov 2024 11:11:18 -0500
Message-ID: <20241108161118.GA42603@mit.edu>
In-Reply-To: <20241108103358.ziocxsyapli2pexv@quack3>

On Fri, Nov 08, 2024 at 11:33:58AM +0100, Jan Kara wrote:
> > 1048576 records - 95 seconds
> > 2097152 records - 580 seconds
> 
> These are really high numbers of revoke records. Deleting couple GB of
> metadata doesn't happen so easily. Are they from a real workload or just
> a stress test?

For context, this has been an out-of-tree patch that's been around
for a very long time, for use with Lustre servers where, apparently,
this very large number of revoke records is a real thing.

> If my interpretation is correct, then rhashtable is unnecessarily
> huge hammer for this. Firstly, as the big hash is needed only during
> replay, there's no concurrent access to the data
> structure. Secondly, we just fill the data structure in the
> PASS_REVOKE scan and then use it. Thirdly, we know the number of
> elements we need to store in the table in advance (well, currently
> we don't but it's trivial to modify PASS_SCAN to get that number).
> 
> So rather than playing with rhashtable, I'd modify PASS_SCAN to sum
> up number of revoke records we're going to process and then prepare
> a static hash of appropriate size for replay (we can just use the
> standard hashing fs/jbd2/revoke.c uses, just with differently sized
> hash table allocated for replay and point journal->j_revoke to
> it). And once recovery completes jbd2_journal_clear_revoke() can
> free the table and point journal->j_revoke back to the original
> table. What do you think?

Hmm, that's a really nice idea; Andreas, what do you think?

     	      	     	  		 - Ted
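
To make the proposal quoted above concrete, here is a rough userspace
sketch of a replay-only revoke table that is sized once from a record
count gathered in an earlier pass. It is not jbd2 code: the names
(struct replay_table, replay_table_alloc() and friends), the hash
function, and the plain integer sequence comparison (no transaction ID
wraparound handling) are all assumptions made for illustration. The
count itself is assumed to come from a scan pass that is not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the jbd2 structures. */
struct revoke_rec {
	uint64_t blocknr;		/* revoked block number */
	uint32_t sequence;		/* newest transaction that revoked it */
	struct revoke_rec *next;	/* chain within one bucket */
};

struct replay_table {
	size_t nbuckets;		/* always a power of two */
	struct revoke_rec **buckets;
};

static size_t roundup_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Multiplicative hash (an assumption; jbd2 has its own hash). */
static size_t hash_block(const struct replay_table *t, uint64_t blocknr)
{
	uint64_t h = blocknr * 0x9E3779B97F4A7C15ULL;

	return (size_t)(h >> 32) & (t->nbuckets - 1);
}

/* Size the table once from the count collected by the earlier scan. */
static struct replay_table *replay_table_alloc(size_t nr_revokes)
{
	struct replay_table *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->nbuckets = roundup_pow2(nr_revokes ? nr_revokes : 1);
	assert((t->nbuckets & (t->nbuckets - 1)) == 0);
	t->buckets = calloc(t->nbuckets, sizeof(*t->buckets));
	if (!t->buckets) {
		free(t);
		return NULL;
	}
	return t;
}

static struct revoke_rec *replay_table_find(const struct replay_table *t,
					    uint64_t blocknr)
{
	struct revoke_rec *r = t->buckets[hash_block(t, blocknr)];

	while (r && r->blocknr != blocknr)
		r = r->next;
	return r;
}

/* Record a revoke; keep only the newest sequence per block. */
static int replay_table_set(struct replay_table *t, uint64_t blocknr,
			    uint32_t sequence)
{
	struct revoke_rec *r = replay_table_find(t, blocknr);
	size_t b;

	if (r) {
		if (sequence > r->sequence)
			r->sequence = sequence;
		return 0;
	}
	r = malloc(sizeof(*r));
	if (!r)
		return -1;
	b = hash_block(t, blocknr);
	r->blocknr = blocknr;
	r->sequence = sequence;
	r->next = t->buckets[b];
	t->buckets[b] = r;
	return 0;
}

/* A block is skipped if it was revoked at or after this transaction. */
static int replay_table_is_revoked(const struct replay_table *t,
				   uint64_t blocknr, uint32_t sequence)
{
	const struct revoke_rec *r = replay_table_find(t, blocknr);

	return r && sequence <= r->sequence;
}

/* Free everything once replay is done. */
static void replay_table_free(struct replay_table *t)
{
	for (size_t i = 0; i < t->nbuckets; i++) {
		struct revoke_rec *r = t->buckets[i];

		while (r) {
			struct revoke_rec *next = r->next;

			free(r);
			r = next;
		}
	}
	free(t->buckets);
	free(t);
}
```

Because the table is filled and then only read during replay, with no
concurrent access, a fixed bucket array with simple chaining is enough
in this sketch; with roughly one bucket per record the chains should
stay short even for millions of revokes.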


Thread overview: 8+ messages
2024-11-05  3:44 [PATCH V2] jbd2: use rhashtable for revoke records during replay Li Dongyang
2024-11-08 10:33 ` Jan Kara
2024-11-08 16:11   ` Theodore Ts'o [this message]
2024-11-12 18:44     ` Andreas Dilger
2024-11-13 14:47       ` Jan Kara
2025-01-16  0:08         ` Andreas Dilger
2025-01-16 18:04           ` Jan Kara
2024-11-09  3:12   ` Zhang Yi
