linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: harshad shirwadkar <harshadshirwadkar@gmail.com>
Cc: Jan Kara <jack@suse.cz>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	"Theodore Y. Ts'o" <tytso@mit.edu>
Subject: Re: [PATCH 04/10] ext4: clean up the JBD2 API that initializes fast commits
Date: Thu, 5 Nov 2020 11:30:24 +0100	[thread overview]
Message-ID: <20201105103024.GA32718@quack2.suse.cz> (raw)
In-Reply-To: <CAD+ocbykJ61MmkLqq78p=AOT0f_6j066J3ivNHjXJVbtLEvNag@mail.gmail.com>

On Wed 04-11-20 11:52:24, harshad shirwadkar wrote:
> On Tue, Nov 3, 2020 at 8:29 AM Jan Kara <jack@suse.cz> wrote:
> > > -int jbd2_fc_init(journal_t *journal, int num_fc_blks)
> > > +int jbd2_fc_init(journal_t *journal)
> > >  {
> > > -     journal->j_fc_wbufsize = num_fc_blks;
> > > -     journal->j_fc_wbuf = kmalloc_array(journal->j_fc_wbufsize,
> > > -                             sizeof(struct buffer_head *), GFP_KERNEL);
> > > -     if (!journal->j_fc_wbuf)
> > > -             return -ENOMEM;
> > > +     /*
> > > +      * Only set j_fc_wbufsize here to indicate that the client file
> > > +      * system is interested in using fast commits. The actual number of
> > > +      * fast commit blocks is found inside jbd2_superblock and is only
> > > +      * valid if j_fc_wbufsize is non-zero. The real value of j_fc_wbufsize
> > > +      * gets set in journal_reset().
> > > +      */
> > > +     journal->j_fc_wbufsize = JBD2_MIN_FC_BLOCKS;
> > >       return 0;
> > >  }
> >
> > When looking at this, is there a reason why jbd2_fc_init() still exists?  I
> > mean why not just make the rule that the journal has FC block number set
> > iff FC gets enabled? Anything else seems a bit confusing to me and also
> > dangerous - imagine we have fs with FC running, we write some FCs and then
> > crash. Then on system recovery we mount with no_fc mount option. We have
> > just lost data on the filesystem AFAIU... So I'd just remove all the mount
> > options related to fastcommits and leave everything to the journal setup
> > (which can be modified with e2fsprogs if needed) to keep things simple.
> The problem is whether or not to use fast commits is the file system's
> call. The JBD2 feature flag will be cleared on a clean unmount and if
> we rely solely on the JBD2 feature flag, fast commit will be turned
> off after a clean unmount. Whereas the FS compat flag is the source of
> truth about whether fast commit needs to be used or not. That's why we
> need an API for the file system to tell JBD2 to still do fast commits.

Yes, I meant the API could be just that the filesystem either calls
jbd2_journal_set_features() with FASTCOMMIT feature or it won't. Similarly
to how e.g. JBD2_FEATURE_INCOMPAT_64BIT is handled. No need for
jbd2_fc_init() function AFAICT.

> Mount options that override the feature flag in Ext4 were mainly meant
> for debugging purposes. So, perhaps there should be a clear warning
> message in the kernel if any of these options are used? Even if we get
> rid of the mount options, we still need the jbd2_fc_init() API for the
> FS to tell JBD2 that it wants to use fast commit. Note that even if
> jbd2_fc_init() is not called, JBD2 will still try to replay fast
> commit blocks.



> > >  EXPORT_SYMBOL(jbd2_fc_init);
> > > @@ -1500,7 +1494,7 @@ static void journal_fail_superblock(journal_t *journal)
> > >  static int journal_reset(journal_t *journal)
> > >  {
> > >       journal_superblock_t *sb = journal->j_superblock;
> > > -     unsigned long long first, last;
> > > +     unsigned long long first, last, num_fc_blocks;
> > >
> > >       first = be32_to_cpu(sb->s_first);
> > >       last = be32_to_cpu(sb->s_maxlen);
> > > @@ -1513,6 +1507,28 @@ static int journal_reset(journal_t *journal)
> > >
> > >       journal->j_first = first;
> > >
> > > +     /*
> > > +      * At this point, fast commit recovery has finished. Now, we solely
> > > +      * rely on the file system to decide whether it wants fast commits
> > > +      * or not. File system that wishes to use fast commits must have
> > > +      * already called jbd2_fc_init() before we get here.
> > > +      */
> > > +     if (journal->j_fc_wbufsize > 0)
> > > +             jbd2_journal_set_features(journal, 0, 0,
> > > +                                       JBD2_FEATURE_INCOMPAT_FAST_COMMIT);
> > > +     else
> > > +             jbd2_journal_clear_features(journal, 0, 0,
> > > +                                       JBD2_FEATURE_INCOMPAT_FAST_COMMIT);
> > > +
> > > +     /* If valid, prefer the value found in superblock over the default */
> > > +     num_fc_blocks = be32_to_cpu(sb->s_num_fc_blks);
> > > +     if (num_fc_blocks > 0 && num_fc_blocks < last)
> > > +             journal->j_fc_wbufsize = num_fc_blocks;
> > > +
> > > +     if (jbd2_has_feature_fast_commit(journal))
> > > +             journal->j_fc_wbuf = kmalloc_array(journal->j_fc_wbufsize,
> > > +                                     sizeof(struct buffer_head *), GFP_KERNEL);
> > > +
> > >       if (jbd2_has_feature_fast_commit(journal) &&
> > >           journal->j_fc_wbufsize > 0) {
> > >               journal->j_fc_last = last;
> > > @@ -1531,7 +1547,8 @@ static int journal_reset(journal_t *journal)
> > >       journal->j_commit_sequence = journal->j_transaction_sequence - 1;
> > >       journal->j_commit_request = journal->j_commit_sequence;
> > >
> > > -     journal->j_max_transaction_buffers = journal->j_maxlen / 4;
> > > +     journal->j_max_transaction_buffers =
> > > +             (journal->j_maxlen - journal->j_fc_wbufsize) / 4;
> > >
> > >       /*
> > >        * As a special case, if the on-disk copy is already marked as needing
> > > @@ -1872,6 +1889,7 @@ static int load_superblock(journal_t *journal)
> > >  {
> > >       int err;
> > >       journal_superblock_t *sb;
> > > +     int num_fc_blocks;
> > >
> > >       err = journal_get_superblock(journal);
> > >       if (err)
> > > @@ -1884,10 +1902,12 @@ static int load_superblock(journal_t *journal)
> > >       journal->j_first = be32_to_cpu(sb->s_first);
> > >       journal->j_errno = be32_to_cpu(sb->s_errno);
> > >
> > > -     if (jbd2_has_feature_fast_commit(journal) &&
> > > -         journal->j_fc_wbufsize > 0) {
> > > +     if (jbd2_has_feature_fast_commit(journal)) {
> > >               journal->j_fc_last = be32_to_cpu(sb->s_maxlen);
> > > -             journal->j_last = journal->j_fc_last - journal->j_fc_wbufsize;
> > > +             num_fc_blocks = be32_to_cpu(sb->s_num_fc_blks);
> > > +             if (!num_fc_blocks || num_fc_blocks >= journal->j_fc_last)
> >
> > I think this needs to be stricter - we need the check that the journal is
> > at least JBD2_MIN_JOURNAL_BLOCKS long (which happens at the beginning of
> > journal_reset()) to happen after we've subtracted fastcommit blocks...
> So are you saying that with FC, the minimum journal size is
> JBD2_MIN_JOURNAL_BLOCKS + JBD2_MIN_FC_BLOCKS? I was assuming that we

Yes. JBD2_MIN_JOURNAL_BLOCKS is minimum number of blocks we need for normal
commits to get reasonable behavior. So as you say with fastcommits enabled,
the minimal journal size is JBD2_MIN_JOURNAL_BLOCKS + JBD2_MIN_FC_BLOCKS.

> will reserve JBD2_MIN_FC_BLOCKS (256) blocks out of the total journal
> size. That way the users who rely on the journal size to be 1024
> blocks, won't see a difference in journal size even after turning FC
> on. But I'm not sure if that's something we should care about.

Well, e2fsprogs need to check journal size when enabling fastcommits so
that we don't get invalid configurations.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply	other threads:[~2020-11-05 10:30 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-31 20:05 [PATCH 00/10] Ext4 fast commit fixes Harshad Shirwadkar
2020-10-31 20:05 ` [PATCH 01/10] ext4: describe fast_commit feature flags Harshad Shirwadkar
2020-11-03 10:33   ` Jan Kara
2020-10-31 20:05 ` [PATCH 02/10] ext4: mark fc ineligible if inode gets evictied due to mem pressure Harshad Shirwadkar
2020-11-03 14:13   ` Jan Kara
2020-11-03 18:33     ` harshad shirwadkar
2020-11-04  9:52       ` Jan Kara
2020-10-31 20:05 ` [PATCH 03/10] ext4: pass handle to ext4_fc_track_* functions Harshad Shirwadkar
2020-11-03 14:46   ` Jan Kara
2020-11-05  1:53     ` harshad shirwadkar
2020-10-31 20:05 ` [PATCH 04/10] ext4: clean up the JBD2 API that initializes fast commits Harshad Shirwadkar
2020-11-03 16:29   ` Jan Kara
2020-11-04 19:52     ` harshad shirwadkar
2020-11-05 10:30       ` Jan Kara [this message]
2020-11-05 12:44         ` Jan Kara
     [not found]           ` <CAE1WUT6oxiKtrdF6vzLv5vYFxPjefRhzDE2xfQGF6CqQQdPv1Q@mail.gmail.com>
2020-11-06  3:02             ` harshad shirwadkar
2020-10-31 20:05 ` [PATCH 05/10] jbd2: fix fast commit journalling APIs Harshad Shirwadkar
2020-11-03 16:38   ` Jan Kara
2020-11-04 21:25     ` harshad shirwadkar
2020-10-31 20:05 ` [PATCH 06/10] ext4: dedpulicate the code to wait on inode that's being committed Harshad Shirwadkar
2020-11-03 16:42   ` Jan Kara
2020-11-04 21:35     ` harshad shirwadkar
2020-10-31 20:05 ` [PATCH 07/10] ext4: misc fast commit fixes Harshad Shirwadkar
2020-11-03 16:52   ` Jan Kara
2020-11-06  3:07     ` harshad shirwadkar
2020-10-31 20:05 ` [PATCH 08/10] ext4: fix inode dirty check in case of fast commits Harshad Shirwadkar
2020-11-01  7:09   ` Andrea Righi
2020-11-03 16:58   ` Jan Kara
2020-10-31 20:05 ` [PATCH 09/10] ext4: disable fast commit with data journalling Harshad Shirwadkar
2020-11-03 17:01   ` Jan Kara
2020-10-31 20:05 ` [PATCH 10/10] ext4: issue fsdev cache flush before starting fast commit Harshad Shirwadkar
2020-11-03 17:02   ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201105103024.GA32718@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=harshadshirwadkar@gmail.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).