From: Shawn Bohrer <shawn.bohrer@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: High latencies writing to a memory mapped file
Date: Tue, 3 Aug 2010 17:03:53 -0500
Message-ID: <20100803220336.GA7282@BohrerMBP.rgmadvisors.com>
In-Reply-To: <20100726232224.GJ7362@dastard>
Hey Dave,
Thanks for the suggestions.
On Tue, Jul 27, 2010 at 09:22:24AM +1000, Dave Chinner wrote:
> On Mon, Jul 26, 2010 at 05:09:09PM -0500, Shawn Bohrer wrote:
> > On Fri, Jul 23, 2010 at 09:28:46AM +1000, Dave Chinner wrote:
> > > On Thu, Jul 22, 2010 at 09:47:06AM -0500, Shawn Bohrer wrote:
> > > > Hello,
> > > >
> > > > We have an application that receives data over the network, and then
> > > > writes data out to a shared memory mapped file on xfs. Previously
> > > > everything worked great when we were running a CentOS 5.3 kernel
> > > > (2.6.18-128.7.1.el5), but when we upgraded to the newer CentOS 5.4 or
> > > > 5.5 kernels we started to see occasional spikes (up to 1 second) in
> > > > the time it takes to write the data to the memory mapped file. I
> > > > noticed that between the CentOS 5.3 and 5.4 kernels the upstream xfs
> > > > changes were backported. Testing on 2.6.32.16 shows the same
> > > > occasional latency spikes.
> > > >
> > > > After doing some debugging on 2.6.32.16 I have found the cause of the
> > > > latency spikes. The spikes occur every time pdflush runs to flush the
> > > > dirty pages. Note our app only writes data at about 1.5 MB/s and we
> > > > are well below both the dirty_background_ratio and the dirty_ratio so
> > > > these flushes only cover pages that have been dirty for longer
> > > > than dirty_expire_centisecs.
> > > > Capturing one of the spikes with ftrace shows:
> > > >
> > > > ...
> > > > flush-8:0-601 [011] 1291592.127590: funcgraph_entry: | xfs_ilock() {
> > > > flush-8:0-601 [011] 1291592.127590: funcgraph_entry: | down_write() {
> > > > flush-8:0-601 [011] 1291592.127590: funcgraph_entry: 0.213 us | _cond_resched();
> > > > flush-8:0-601 [011] 1291592.127591: funcgraph_exit: 0.657 us | }
> > > > flush-8:0-601 [011] 1291592.127591: funcgraph_exit: 1.174 us | }
> > > > flush-8:0-601 [011] 1291592.127764: sched_switch: task flush-8:0:601 [120] (D) ==> swapper:0 [120]
> > > > ...
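
For reference, the writeback knobs involved live under /proc/sys/vm,
and a quick way to inspect them is below; the values shown are the
usual 2.6.32 defaults, so a given box may differ:

# grep . /proc/sys/vm/dirty_expire_centisecs \
      /proc/sys/vm/dirty_writeback_centisecs
/proc/sys/vm/dirty_expire_centisecs:3000
/proc/sys/vm/dirty_writeback_centisecs:500

With those defaults the flusher wakes every 5 seconds and writes back
pages that have been dirty for more than 30 seconds.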
> > >
> > > What is the lead-in context to this hunk? i.e. how does xfs_ilock()
> > > get called? That will tell me if it is blocking on the iolock or the
> > > ilock...
> >
> > I don't have the lead-in context here. I'm using the function_graph
> > tracer and with stalls up to a second in length it easily overflowed
> > the ftrace ring buffer. Once I found my process was blocking in
> > xfs_ilock(), I limited my traces to include xfs_ilock, xfs_iunlock,
> > handle_mm_fault, and the scheduler events. If you are still
> > interested in seeing some more detail here I can try to collect some
> > more captures.
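
For reference, the filtering I used was along these lines; this is a
sketch from memory, assuming the standard debugfs mount, and exact
event paths may vary by kernel:

# cd /sys/kernel/debug/tracing
# echo function_graph > current_tracer
# echo 'xfs_ilock xfs_iunlock handle_mm_fault' > set_ftrace_filter
# echo 1 > events/sched/enable
# echo 1 > tracing_on

That keeps the trace down to the xfs lock functions, the fault
handler, and the scheduler events.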
>
> I'm not sure if the tracepoint buffer is the same as the ftrace
> buffer, but if it is you can increase the size by doing this sort of
> thing:
>
> # echo 32768 > /sys/kernel/debug/tracing/buffer_size_kb
>
> I often do that to capture long XFS traces during testing....
>
> [snip]
>
> > > Hmmm - perhaps the issue is that the transaction commit has not been
> > > able to get an iclog buffer to write the transaction into and so is
> > > sleeping waiting for an iclog IO to complete. That would be woken by
> > > the xfslogd, and then the transaction would write into the
> > > now-available iclog and unlock the inode.
> > >
> > > So, I'd say that in this case your application is waiting for log IO
> > > completion to occur. Nothing you can do to avoid that, but you can
> > > minimise the impact - increase the log buffer size via the mount
> > > option logbsize (i.e. -o logbsize=262144) to reduce the frequency of
> > > iclog buffer IOs and make it less likely the system will stall
> > > after running out of iclog space.
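
For anyone following the thread, the suggested mount would look
something like this; the device and mount point are illustrative,
and logbsize cannot be changed on a live remount, so the filesystem
has to be unmounted first:

# umount /data
# mount -o logbufs=8,logbsize=262144 /dev/sdb1 /data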
> > >
> > > The other possibility is that the log IO is being slowed by barriers
> > > being issued with the log writes. If you've got non-volatile caches
> > > on your storage, you can turn off barriers and that should help
> > > reduce log IO latency significantly.
> >
> > I've tried mounting with "noatime,nobarrier,logbufs=8,logbsize=262144"
> > but I don't notice much difference and still see the stalls.
>
> Ok. One other possibility came to mind - the transaction commit is
> blocking while locking the superblock buffer to apply free block
> count changes. Do you have the "lazy-count" option set? (xfs_info
> <mtpt> will tell you.) If it isn't set, transaction commit latency
> can be quite high if the sb buffer is being written to disk when the
> transaction tries to lock it; in that case you can use xfs_admin to
> set that bit (mkfs has defaulted to lazy-count=1 for some time now).
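
For the record, this is roughly what I ran; the device and mount
point are illustrative, and xfs_admin needs the filesystem unmounted:

# xfs_info /data | grep lazy-count
# umount /data
# xfs_admin -c 1 /dev/sdb1
# mount /dev/sdb1 /data
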
Setting lazy-count=1 looks like it has helped quite a bit to reduce
the frequency of the spikes. I am still seeing spikes
of up to a second, but I haven't had time to capture any more traces
to verify if they are the same issue. Overall though this may be good
enough.
Thanks,
Shawn