public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Huang\, Ying" <ying.huang@intel.com>
To: Theodore Ts'o <tytso@mit.edu>
Cc: huang ying <huang.ying.caritas@gmail.com>,
	kernel test robot <rong.a.chen@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	LKP ML <lkp@01.org>
Subject: Re: [LKP] [ext4] fde872682e: fsmark.files_per_sec -38.0% regression
Date: Thu, 14 Mar 2019 09:52:52 +0800	[thread overview]
Message-ID: <87h8c66x3f.fsf@yhuang-dev.intel.com> (raw)
In-Reply-To: <20190313153056.GB672@mit.edu> (Theodore Ts'o's message of "Wed, 13 Mar 2019 11:30:56 -0400")

Theodore Ts'o <tytso@mit.edu> writes:

 > On Wed, Mar 13, 2019 at 03:26:39PM +0800, huang ying wrote:
>> >
>> >
>> > commit: fde872682e175743e0c3ef939c89e3c6008a1529 ("ext4: force inode writes when nfsd calls commit_metadata()")
>> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>> 
>> It appears that this is a performance regression caused by a
>> functionality fixing.  So we should ignore this?
>
> Yes, this is a correctness issue that we discovered while tracking
> down user data loss issue after a crash of the NFS server, so this is
> a change we have to keep.  When the NFS folks added the
> commit_metadata() hook, they didn't realize that the fallback path in
> nfsd/vfs.c using sync_inode_metadata() doesn't work on all file
> systems --- and in particular doesn't work for ext3 and ext4 because
> of how we do journalling.
>
> It only applies to NFS serving, not local ext4 use cases, so most ext4
> users won't be impacted on it; only those who export those file
> systems using NFS.
>
> I do have some plans on how to claw back the performance hit.  The
> good news is that it won't require an on-disk format change; the bad
> news is that it's a non-trivial change to how journalling works, and
> it's not something we can backport to the stable kernel series.  It's
> something we're going to have to leave to a distribution who is
> willing to do a lot of careful regression testing, once the change is
> available, maybe in 3 months or so.
>
>      	 	      	  	      	  - Ted
>
> P.S.  I *believe* all other file systems should be OK, and I didn't
> want to impose a performance tax on all other file systems (such as
> btrfs), so I fixed it in an ext4-specific way.  The more
> general/conservative change would be to fall back to using fsync in
> nfs/vfs.c:commit_metadata() unless the file system specifically set a
> superblock flag indicating that using sync_inode_metadata is safe.
> OTOH we lived with this flaw in ext3/ext4 for *years* without anyone
> noticing or complaining, so....

Got it!  Thanks for your detailed explanation!

Best Regards,
Huang, Ying

      reply	other threads:[~2019-03-14  1:54 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-02  0:40 [LKP] [ext4] fde872682e: fsmark.files_per_sec -38.0% regression kernel test robot
2019-03-13  7:26 ` huang ying
2019-03-13 15:30   ` Theodore Ts'o
2019-03-14  1:52     ` Huang, Ying [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h8c66x3f.fsf@yhuang-dev.intel.com \
    --to=ying.huang@intel.com \
    --cc=huang.ying.caritas@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@01.org \
    --cc=rong.a.chen@intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox