From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: harshad shirwadkar <harshadshirwadkar@gmail.com>
Cc: Andreas Dilger <adilger@dilger.ca>,
"Theodore Y. Ts'o" <tytso@mit.edu>,
Ext4 Developers List <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH 01/11] ext4: add handling for extended mount options
Date: Tue, 23 Jul 2019 23:12:31 -0700 [thread overview]
Message-ID: <20190724061231.GA7074@magnolia> (raw)
In-Reply-To: <CAD+ocbwCYZDrj9D=85AVaB_RLYjUFwNs1V02fRn4tHh04_k7_A@mail.gmail.com>
On Tue, Jul 23, 2019 at 11:03:54PM -0700, harshad shirwadkar wrote:
> Before I respond to your questions, I would like to explain how fast
> commits differ from ijournal in a few key aspects (I will make sure to
> explain it in detail in patch 00/11 and documentation):
Please do; I hadn't realized there were also journal ondisk format
changes, and these must be recorded in the ext4 disk format
documentation.
> - Instead of storing extent blocks in a fast commit block, we only
> store extents that were modified in a particular fast commit
> transaction in tag-length-value format.
>
> - Whenever the fast commit information (inode structure + changed
> extents in TLV format) exceeds one block, we fall back to full commit.
> Thus at this point, the number of blocks we write per fast commit
> transaction is either the total number of files changed (if fast
> commit was successfully performed) or the number of blocks that would
> be written by a full commit transaction.
>
> - To reduce complexity, there is no support for per-core fast commit areas.
>
> Current design of fast commits is such that we try to perform fast
> commits whenever possible but either if it's impossible to record file
> system changes by fast commits or if we haven't yet added support for
> dealing with a particular type of file system change, we fall back to
> full commits. Whenever we later add more features to fast commits, we
> probably would need more on-disk format changes for the fast commit
> blocks and that would mean we burn feature flags. So, my guess is that
> we would need to make a few judgement calls on whether we want to
> exclude a few fast commit features, keep the patch series simple and
> potentially burn feature flags later OR we save feature flags by
> implementing those fast commit features.
Every feature flag you add doubles the size of the testing matrix.
If I were you I'd only want to test the (fastcommit) and (!fastcommit)
scenarios.
--D
> On Tue, Jul 23, 2019 at 2:59 PM Andreas Dilger <adilger@dilger.ca> wrote:
> >
> > On Jul 22, 2019, at 3:02 PM, Theodore Y. Ts'o <tytso@mit.edu> wrote:
> > >
> > > On Mon, Jul 22, 2019 at 12:15:11PM -0600, Andreas Dilger wrote:
> > >> Unless I missed it, this patch series needs a 00/11 email that describes
> > >> *what* "fast commit" is, and why we want it. This should include some
> > >> benchmark results, since (I'd assume) that the "fast" part of the feature
> > >> name implies a performance improvement?
> > >
> > > For background, it's a simplified version of the scheme proposed by
> > > Park and Shin, in their paper, "iJournaling: Fine-Grained Journaling
> > > for Improving the Latency of Fsync System Call"[1]
> > >
> > > [1] https://www.usenix.org/conference/atc17/technical-sessions/presentation/park
> > >
> > > I agree we should have a cover letter for this patch series. Also, we
> > > should add documentation to Documentation/filesystems/journaling.rst
> > > about this feature; what it does, how it works, its basic on-disk
> > > format changes, etc.
> >
> > Thanks for the link, I hadn't read that paper previously. From reading the
> > paper, it seems there are some things that should be addressed before the
> > patch is committed to the tree in order to maintain proper disk format
> > compatibility:
> > - the ijournal header shows a 256-byte inode. In Lustre today (and also
> > Samba and other xattr-intensive workloads) 512- or 1024-byte inodes are used
> > in order to store more xattrs within the inode, so the size of the inode
> > data in the ijournal header needs to match the actual inode size of the
> > filesystem and not be a fixed size. What if the inode size == blocksize?
>
> Okay, I agree. This is one of such questions where we need to decide
> whether to exclude this fast commit feature request for now or not. I
> think whether or not we actually support 512 or 1024 byte inodes in
> this patch series, we definitely shouldn't assume in the fast commit
> header that inodes are of a fixed size. I will fix it. Supporting
> bigger inodes doesn't sound like it would result in big change in the
> patch series. But I would like to know whether you think it's okay to
> wait or not.
>
> > - the ijournal header also shows a 4-byte inode number. It would be prudent
> > to reserve space for 64-bit inode numbers, or at least have some mechanism
> > (flag) to indicate that a 64-bit inode is stored instead of a 32-bit inode.
>
> Noted, will change that.
>
> > - if there are many cores in a system, say 96, how much space will be used
> > from the journal file by the per-core ijournal?
> > - what happens if multiple threads are writing to the same file with ijournal
> > and per-core ijournal areas? Will the same inode information be recorded
> > in multiple ijournal areas?
>
> As mentioned above, at least for now we keep it simple by not having
> per-core fast commit areas.
>
> Thanks,
> Harshad
>
> >
> > Cheers, Andreas
> >
> >
> >
> >
> >
next prev parent reply other threads:[~2019-07-24 6:12 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-07-22 4:00 [PATCH 01/11] ext4: add handling for extended mount options Harshad Shirwadkar
2019-07-22 4:00 ` [PATCH 02/11] jbd2: add fast commit fields to journal_s structure Harshad Shirwadkar
2019-07-22 4:00 ` [PATCH 03/11] jbd2: fast commit setup and enable Harshad Shirwadkar
2019-07-22 4:00 ` [PATCH 04/11] jbd2: fast-commit commit path changes Harshad Shirwadkar
2019-07-22 4:00 ` [PATCH 05/11] jbd2: fast-commit commit path new APIs Harshad Shirwadkar
2019-07-22 17:45 ` kbuild test robot
2019-07-22 21:02 ` kbuild test robot
2019-07-22 4:00 ` [PATCH 06/11] jbd2: fast-commit recovery path changes Harshad Shirwadkar
2019-07-22 4:00 ` [PATCH 07/11] ext4: add fields that are needed to track changed files Harshad Shirwadkar
2019-07-22 4:00 ` [PATCH 08/11] ext4: track changed files for fast commit Harshad Shirwadkar
2019-07-22 4:00 ` [PATCH 09/11] ext4: fast-commit commit range tracking Harshad Shirwadkar
2019-07-22 4:00 ` [PATCH 10/11] ext4: fast-commit commit path changes Harshad Shirwadkar
2019-07-22 4:00 ` [PATCH 11/11] ext4: fast-commit recovery " Harshad Shirwadkar
2019-07-22 21:34 ` kbuild test robot
2019-07-22 18:15 ` [PATCH 01/11] ext4: add handling for extended mount options Andreas Dilger
2019-07-22 21:02 ` Theodore Y. Ts'o
2019-07-22 21:36 ` harshad shirwadkar
2019-07-23 21:59 ` Andreas Dilger
2019-07-24 6:03 ` harshad shirwadkar
2019-07-24 6:12 ` Darrick J. Wong [this message]
2019-07-24 16:07 ` Theodore Y. Ts'o
2019-07-24 16:56 ` Darrick J. Wong
2019-07-24 18:14 ` harshad shirwadkar
2019-07-25 3:46 ` Theodore Y. Ts'o
2019-07-25 20:03 ` Andreas Dilger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190724061231.GA7074@magnolia \
--to=darrick.wong@oracle.com \
--cc=adilger@dilger.ca \
--cc=harshadshirwadkar@gmail.com \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.