From: Shaohua Li <shaohua.li@intel.com>
To: "Shi, Alex" <alex.shi@intel.com>, Jan Kara <jack@suse.cz>
Cc: Corrado Zoccolo <czoccolo@gmail.com>,
Vivek Goyal <vgoyal@redhat.com>, "jack@suse.cz" <jack@suse.cz>,
"tytso@mit.edu" <tytso@mit.edu>,
"jaxboe@fusionio.com" <jaxboe@fusionio.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"Chen, Tim C" <tim.c.chen@intel.com>
Subject: Re: [performance bug] kernel building regression on 64 LCPUs machine
Date: Tue, 15 Feb 2011 09:10:01 +0800
Message-ID: <1297732201.24560.2.camel@sli10-conroe>
In-Reply-To: <1297650318.29573.2482.camel@debian>
On Mon, 2011-02-14 at 10:25 +0800, Shi, Alex wrote:
> On Sun, 2011-02-13 at 02:25 +0800, Corrado Zoccolo wrote:
> > On Sat, Feb 12, 2011 at 10:21 AM, Alex,Shi <alex.shi@intel.com> wrote:
> > > On Wed, 2011-01-26 at 16:15 +0800, Li, Shaohua wrote:
> > >> On Thu, Jan 20, 2011 at 11:16:56PM +0800, Vivek Goyal wrote:
> > >> > On Wed, Jan 19, 2011 at 10:03:26AM +0800, Shaohua Li wrote:
> > >> > > add Jan and Theodore to the loop.
> > >> > >
> > >> > > On Wed, 2011-01-19 at 09:55 +0800, Shi, Alex wrote:
> > >> > > > Shaohua and I tested kernel build performance on the latest kernel and
> > >> > > > found that it dropped about 15% on our 64-LCPU NHM-EX machine on an ext4
> > >> > > > file system. We found that this drop is due to commit
> > >> > > > 749ef9f8423054e326f. If we revert this patch, or just change
> > >> > > > WRITE_SYNC back to WRITE in jbd2/commit.c, the performance
> > >> > > > recovers.
> > >> > > >
> > >> > > > iostat reports show that with the commit, the number of merged read
> > >> > > > requests increased and the number of merged write requests dropped. The
> > >> > > > total request size increased and the queue length dropped. So we tested
> > >> > > > another patch that only changes WRITE_SYNC to WRITE_SYNC_PLUG in
> > >> > > > jbd2/commit.c, but it had no effect.
> > >> > > since WRITE_SYNC_PLUG doesn't work, this isn't a simple no-write-merge issue.
> > >> > >
> > >> >
> > >> > Yep, it does sound like reduced write merging. But if journal commits
> > >> > are moved back to WRITE, fsync performance will drop, because idling
> > >> > will be introduced between the fsync thread and the journalling thread.
> > >> > So that does not sound like a good idea either.
> > >> >
> > >> > Secondly, in the presence of a mixed workload (some other sync reads
> > >> > happening), WRITEs can get less bandwidth and the sync workload much
> > >> > more. So by marking journal commits as WRITE you might increase the
> > >> > delay in their completion in the presence of other sync workloads.
> > >> >
> > >> > So Jan Kara's approach makes sense: if somebody is waiting on the
> > >> > commit, make it WRITE_SYNC; otherwise make it WRITE. Not sure why
> > >> > it did not work for you. Is it possible to run some traces and do
> > >> > more debugging to figure out what's happening?
> > >> Sorry for the long delay.
> > >>
> > >> It looks like Fedora enables ccache by default. Our kbuild test runs on an
> > >> ext4 disk, but the rootfs, where the ccache cache files live, is on ext3.
> > >> Jan's patch only covers ext4, so maybe this is the reason.
> > >> I changed jbd to use WRITE for journal_commit_transaction. With that change
> > >> and Jan's patch, the test seems fine.
> > > Let me clarify the bug situation again.
> > > The regression is clear in the following scenario:
> > > 1, ccache_dir set up on the rootfs, which is ext3 on /dev/sda1; 2,
> > > kbuild on /dev/sdb1 with ext4.
> > > But if we disable ccache and only do the kbuild on sdb1 with ext4, there
> > > is no regression, with or without Jan's patch.
> > > So the problem is specific to the ccache scenario (since Fedora 11,
> > > ccache is enabled by default).
> > >
> > > If we compare the vmstat output with and without ccache, there are far
> > > more writes when ccache is enabled. According to the result, ext3
> > > probably needs some tuning.
> > Is ext3 configured with data ordered or writeback?
>
> The ext3 on sda and the ext4 on sdb are both mounted in 'ordered' mode.
>
> > I think ccache might be performing fsyncs, and this is a bad workload
> > for ext3, especially in ordered mode.
> > It might be that my patch introduced a regression in ext3 fsync
> > performance, but I don't understand how reverting only the change in
> > jbd2 (that is the ext4 specific journaling daemon) could restore it.
> > The two partitions are on different disks, so each one should be
> > isolated from the I/O perspective (do they share a single
> > controller?).
>
> No, sda and sdb use separate controllers.
>
> > The only interaction I see happens at the VM level,
> > since changing performance of any of the two changes the rate at which
> > pages can be cleaned.
> >
> > Corrado
> > >
> > >
> > > vmstat average output per 10 seconds, without ccache
> > > --procs-- -------------memory------------- --swap- -----io------ ----system---- ---------cpu----------
> > >    r    b swpd       free    buff    cache  si  so     bi     bo      in     cs   us  sy   id   wa  st
> > > 26.8  0.5  0.0 63930192.3  9677.0  96544.9 0.0 0.0 2486.9  337.9 17729.9 4496.4 17.5 2.5 79.8  0.2 0.0
> > >
> > > vmstat average output per 10 seconds, with ccache
> > > --procs-- -------------memory------------- --swap- -----io------ ----system---- ---------cpu----------
> > >    r    b swpd       free    buff    cache  si  so     bi     bo      in     cs   us  sy   id   wa  st
> > >  2.4 40.7  0.0 64316231.0 17260.6 119533.8 0.0 0.0 2477.6 1493.1  8606.4 3565.2  2.5 1.1 83.0 13.5 0.0
> > >
> > >
> > >>
> > >> Jan,
> > >> can you send a patch with a similar change for ext3, so we can do more tests?
Hi Jan,
can you send a patch with both the ext3 and ext4 changes? Our tests show
that your patch has a positive effect, but we need to confirm this with the
ext3 change.
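To put rough numbers on the gap in the vmstat averages quoted above, here is a minimal Python sketch. The values are copied from the two quoted summaries. The dict layout and the `ratio` helper are only illustrative assumptions and not part of any test tooling.

```python
# Averages copied from the two vmstat summaries quoted in this thread
# (10-second samples). Keys follow the vmstat column headers; the dict
# layout and helper name are illustrative assumptions.
without_ccache = {"r": 26.8, "b": 0.5, "bo": 337.9, "us": 17.5, "wa": 0.2}
with_ccache = {"r": 2.4, "b": 40.7, "bo": 1493.1, "us": 2.5, "wa": 13.5}


def ratio(field):
    """Return the with-ccache / without-ccache ratio for one vmstat column."""
    return with_ccache[field] / without_ccache[field]


# Print each column side by side with its ratio.
for field in ("bo", "b", "wa", "r", "us"):
    print(
        f"{field:>2}: {without_ccache[field]:8.1f} -> "
        f"{with_ccache[field]:8.1f}  (x{ratio(field):.2f})"
    )
```

With ccache enabled, blocks written out (bo) are roughly 4.4 times higher and far more processes sit blocked on I/O (b), while runnable processes (r) and user CPU time (us) fall sharply. That is consistent with the build stalling on writes to the ext3 rootfs.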
Thanks,
Shaohua