All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@suse.de>
To: Theodore Ts'o <tytso@mit.edu>, Jan Kara <jack@suse.cz>,
	linux-ext4@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>, Jiri Slaby <jslaby@suse.cz>
Subject: Re: Excessive stall times on ext4 in 3.9-rc2
Date: Tue, 23 Apr 2013 17:13:19 +0100	[thread overview]
Message-ID: <20130423161319.GC2108@suse.de> (raw)
In-Reply-To: <20130423155019.GH31170@thunk.org>

On Tue, Apr 23, 2013 at 11:50:19AM -0400, Theodore Ts'o wrote:
> On Tue, Apr 23, 2013 at 04:33:05PM +0100, Mel Gorman wrote:
> > That's a pretty big drop but it gets bad again for the second worst stall --
> > wait_on_page_bit as a result of generic_file_buffered_write.
> > 
> > Vanilla kernel  1336064 ms stalled with 109 events
> > Patched kernel  2338781 ms stalled with 164 events
> 
> Do you have the stack trace for this stall?  I'm wondering if this is
> caused by the waiting for stable pages in write_begin() , or something
> else.
> 

[<ffffffff81110238>] wait_on_page_bit+0x78/0x80
[<ffffffff815af294>] kretprobe_trampoline+0x0/0x4c
[<ffffffff81110e84>] generic_file_buffered_write+0x114/0x2a0
[<ffffffff81111ccd>] __generic_file_aio_write+0x1bd/0x3c0
[<ffffffff81111f4a>] generic_file_aio_write+0x7a/0xf0
[<ffffffff811ee639>] ext4_file_write+0x99/0x420
[<ffffffff81174d87>] do_sync_write+0xa7/0xe0
[<ffffffff81175447>] vfs_write+0xa7/0x180
[<ffffffff811758cd>] sys_write+0x4d/0x90
[<ffffffff815b3eed>] system_call_fastpath+0x1a/0x1f
[<ffffffffffffffff>] 0xffffffffffffffff

The processes that stalled in this particular trace are wget, latency-output,
tar and tclsh. Most of these are sequential writers except for tar which
is both a sequential reader and sequential writers.

> If it is blocking caused by stable page writeback that's interesting,
> since it would imply that something in your workload is trying to
> write to a page that has already been modified (i.e., appending to a
> log file, or updating a database file).  Does that make sense given
> what your workload might be running?
> 

I doubt it is stable write consider the type of processes that are running. I
would expect the bulk of the activity to be sequential readers or writers
of multiple files. The summarised report from the raw data is now at

http://www.csn.ul.ie/~mel/postings/ext4tag-20130423/dstate-summary-vanilla
http://www.csn.ul.ie/~mel/postings/ext4tag-20130423/dstate-summary-ext4tag

It's an aside but the worst of the stalls are incurred by systemd-tmpfile
which were not a deliberate part of the test and yet another thing that
I would not have caught unless I was running tests on my laptop. Looking
closer at that thing, the default configuration is to run the service 15
minutes after boot and after that it runs once a day. It looks like the
bulk of the scanning would be in /var/tmp/ looking at systemds own files
(over 3000 of them) which I'm a little amused by.

My normal test machines would not hit this because they are not systemd
based but the existance of thing thing is worth noting. Any IO-based tests
run on systemd-based distributions may give different results depending
on whether this service triggered during the test or not.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Mel Gorman <mgorman@suse.de>
To: "Theodore Ts'o" <tytso@mit.edu>, Jan Kara <jack@suse.cz>,
	linux-ext4@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>, Jiri Slaby <jslaby@suse.cz>
Subject: Re: Excessive stall times on ext4 in 3.9-rc2
Date: Tue, 23 Apr 2013 17:13:19 +0100	[thread overview]
Message-ID: <20130423161319.GC2108@suse.de> (raw)
In-Reply-To: <20130423155019.GH31170@thunk.org>

On Tue, Apr 23, 2013 at 11:50:19AM -0400, Theodore Ts'o wrote:
> On Tue, Apr 23, 2013 at 04:33:05PM +0100, Mel Gorman wrote:
> > That's a pretty big drop but it gets bad again for the second worst stall --
> > wait_on_page_bit as a result of generic_file_buffered_write.
> > 
> > Vanilla kernel  1336064 ms stalled with 109 events
> > Patched kernel  2338781 ms stalled with 164 events
> 
> Do you have the stack trace for this stall?  I'm wondering if this is
> caused by the waiting for stable pages in write_begin() , or something
> else.
> 

[<ffffffff81110238>] wait_on_page_bit+0x78/0x80
[<ffffffff815af294>] kretprobe_trampoline+0x0/0x4c
[<ffffffff81110e84>] generic_file_buffered_write+0x114/0x2a0
[<ffffffff81111ccd>] __generic_file_aio_write+0x1bd/0x3c0
[<ffffffff81111f4a>] generic_file_aio_write+0x7a/0xf0
[<ffffffff811ee639>] ext4_file_write+0x99/0x420
[<ffffffff81174d87>] do_sync_write+0xa7/0xe0
[<ffffffff81175447>] vfs_write+0xa7/0x180
[<ffffffff811758cd>] sys_write+0x4d/0x90
[<ffffffff815b3eed>] system_call_fastpath+0x1a/0x1f
[<ffffffffffffffff>] 0xffffffffffffffff

The processes that stalled in this particular trace are wget, latency-output,
tar and tclsh. Most of these are sequential writers except for tar which
is both a sequential reader and sequential writers.

> If it is blocking caused by stable page writeback that's interesting,
> since it would imply that something in your workload is trying to
> write to a page that has already been modified (i.e., appending to a
> log file, or updating a database file).  Does that make sense given
> what your workload might be running?
> 

I doubt it is stable write consider the type of processes that are running. I
would expect the bulk of the activity to be sequential readers or writers
of multiple files. The summarised report from the raw data is now at

http://www.csn.ul.ie/~mel/postings/ext4tag-20130423/dstate-summary-vanilla
http://www.csn.ul.ie/~mel/postings/ext4tag-20130423/dstate-summary-ext4tag

It's an aside but the worst of the stalls are incurred by systemd-tmpfile
which were not a deliberate part of the test and yet another thing that
I would not have caught unless I was running tests on my laptop. Looking
closer at that thing, the default configuration is to run the service 15
minutes after boot and after that it runs once a day. It looks like the
bulk of the scanning would be in /var/tmp/ looking at systemds own files
(over 3000 of them) which I'm a little amused by.

My normal test machines would not hit this because they are not systemd
based but the existance of thing thing is worth noting. Any IO-based tests
run on systemd-based distributions may give different results depending
on whether this service triggered during the test or not.

-- 
Mel Gorman
SUSE Labs

  reply	other threads:[~2013-04-23 16:13 UTC|newest]

Thread overview: 105+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-02 14:27 Excessive stall times on ext4 in 3.9-rc2 Mel Gorman
2013-04-02 14:27 ` Mel Gorman
2013-04-02 15:00 ` Jiri Slaby
2013-04-02 15:00   ` Jiri Slaby
2013-04-02 15:03 ` Zheng Liu
2013-04-02 15:03   ` Zheng Liu
2013-04-02 15:15   ` Mel Gorman
2013-04-02 15:15     ` Mel Gorman
2013-04-02 15:06 ` Theodore Ts'o
2013-04-02 15:06   ` Theodore Ts'o
2013-04-02 15:14   ` Theodore Ts'o
2013-04-02 15:14     ` Theodore Ts'o
2013-04-02 18:19     ` Theodore Ts'o
2013-04-02 18:19       ` Theodore Ts'o
2013-04-07 21:59       ` Frank Ch. Eigler
2013-04-07 21:59         ` Frank Ch. Eigler
2013-04-08  8:36         ` Mel Gorman
2013-04-08  8:36           ` Mel Gorman
2013-04-08 10:52           ` Frank Ch. Eigler
2013-04-08 10:52             ` Frank Ch. Eigler
2013-04-08 11:01         ` Theodore Ts'o
2013-04-08 11:01           ` Theodore Ts'o
2013-04-03 10:19     ` Mel Gorman
2013-04-03 10:19       ` Mel Gorman
2013-04-03 12:05       ` Theodore Ts'o
2013-04-03 12:05         ` Theodore Ts'o
2013-04-03 15:15         ` Mel Gorman
2013-04-05 22:18       ` Jiri Slaby
2013-04-05 22:18         ` Jiri Slaby
2013-04-05 23:16         ` Theodore Ts'o
2013-04-05 23:16           ` Theodore Ts'o
2013-04-06  7:29           ` Jiri Slaby
2013-04-06  7:29             ` Jiri Slaby
2013-04-06  7:37             ` Jiri Slaby
2013-04-06  7:37               ` Jiri Slaby
2013-04-06  8:19               ` Jiri Slaby
2013-04-06 13:15             ` Theodore Ts'o
2013-04-06 13:15               ` Theodore Ts'o
2013-04-10 10:56   ` Mel Gorman
2013-04-10 10:56     ` Mel Gorman
2013-04-10 13:12     ` Theodore Ts'o
2013-04-10 13:12       ` Theodore Ts'o
2013-04-11 17:04       ` Mel Gorman
2013-04-11 17:04         ` Mel Gorman
2013-04-11 18:35         ` Theodore Ts'o
2013-04-11 18:35           ` Theodore Ts'o
2013-04-11 21:33           ` Jan Kara
2013-04-11 21:33             ` Jan Kara
2013-04-12  2:57             ` Theodore Ts'o
2013-04-12  2:57               ` Theodore Ts'o
2013-04-12  4:50               ` Dave Chinner
2013-04-12  4:50                 ` Dave Chinner
2013-04-12 15:19                 ` Theodore Ts'o
2013-04-12 15:19                   ` Theodore Ts'o
2013-04-13  1:23                   ` Dave Chinner
2013-04-13  1:23                     ` Dave Chinner
2013-04-22 14:38                   ` Mel Gorman
2013-04-22 14:38                     ` Mel Gorman
2013-04-22 22:42                     ` Jeff Moyer
2013-04-22 22:42                       ` Jeff Moyer
2013-04-23  0:02                       ` Theodore Ts'o
2013-04-23  0:02                         ` Theodore Ts'o
2013-04-23  9:31                       ` Jan Kara
2013-04-23  9:31                         ` Jan Kara
2013-04-23 14:01                       ` Mel Gorman
2013-04-23 14:01                         ` Mel Gorman
2013-04-24 19:09                         ` Jeff Moyer
2013-04-24 19:09                           ` Jeff Moyer
2013-04-25 12:21                           ` Mel Gorman
2013-04-25 12:21                             ` Mel Gorman
2013-04-12  9:47               ` Mel Gorman
2013-04-12  9:47                 ` Mel Gorman
2013-04-21  0:05                 ` Theodore Ts'o
2013-04-21  0:05                   ` Theodore Ts'o
2013-04-21  0:07                   ` [PATCH 1/3] ext4: mark all metadata I/O with REQ_META Theodore Ts'o
2013-04-21  0:07                     ` Theodore Ts'o
2013-04-21  0:07                     ` [PATCH 2/3] buffer: add BH_Prio and BH_Meta flags Theodore Ts'o
2013-04-21  0:07                       ` Theodore Ts'o
2013-04-21  0:07                     ` [PATCH 3/3] ext4: mark metadata blocks using bh flags Theodore Ts'o
2013-04-21  0:07                       ` Theodore Ts'o
2013-04-21  6:09                       ` Jiri Slaby
2013-04-21  6:09                         ` Jiri Slaby
2013-04-21  6:09                         ` Jiri Slaby
2013-04-21 19:55                         ` Theodore Ts'o
2013-04-21 19:55                           ` Theodore Ts'o
2013-04-21 19:55                           ` Theodore Ts'o
2013-04-21 20:48                           ` [PATCH 3/3 -v2] " Theodore Ts'o
2013-04-21 20:48                             ` Theodore Ts'o
2013-04-21 20:48                             ` Theodore Ts'o
2013-04-22 12:06                     ` [PATCH 1/3] ext4: mark all metadata I/O with REQ_META Zheng Liu
2013-04-22 12:06                       ` Zheng Liu
2013-04-23 15:33                   ` Excessive stall times on ext4 in 3.9-rc2 Mel Gorman
2013-04-23 15:33                     ` Mel Gorman
2013-04-23 15:50                     ` Theodore Ts'o
2013-04-23 15:50                       ` Theodore Ts'o
2013-04-23 16:13                       ` Mel Gorman [this message]
2013-04-23 16:13                         ` Mel Gorman
2013-04-12 10:18               ` Tvrtko Ursulin
2013-04-12 10:18                 ` Tvrtko Ursulin
2013-04-12  9:45           ` Mel Gorman
2013-04-12  9:45             ` Mel Gorman
2013-04-02 23:16 ` Theodore Ts'o
2013-04-02 23:16   ` Theodore Ts'o
2013-04-03 15:22   ` Mel Gorman
2013-04-03 15:22     ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130423161319.GC2108@suse.de \
    --to=mgorman@suse.de \
    --cc=jack@suse.cz \
    --cc=jslaby@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.