public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Shawn Bohrer <shawn.bohrer@gmail.com>
Cc: xfs@oss.sgi.com
Subject: Re: Latencies writing to memory mapped files
Date: Wed, 22 Sep 2010 09:15:31 +1000	[thread overview]
Message-ID: <20100921231531.GE2614@dastard> (raw)
In-Reply-To: <20100921180541.GA6877@BohrerMBP.rgmadvisors.com>

On Tue, Sep 21, 2010 at 01:05:41PM -0500, Shawn Bohrer wrote:
> On Tue, Sep 21, 2010 at 08:48:33AM +1000, Dave Chinner wrote:
> > On Mon, Sep 20, 2010 at 05:17:26PM -0500, Shawn Bohrer wrote:
> > > On Mon, Sep 20, 2010 at 10:05:35AM +1000, Dave Chinner wrote:
> > > > FWIW, if you can work out the magic necessary (perhaps systemtap) it
> > > > would be really interesting to know whether it is the same buffer
> > > > that is causing the latency every time (i.e. get the bp->b_bn field
> > > > from the metadata buffer). With the block number, we can use xfs_db
> > > > to dig out what the buffer actually is....
.....
> > Hmmm - it would be good to know which one produced the latency,
> > given there does not appear to be a pattern in the block numbers.
> 
> OK here is a little more information which may be relevant.  I've
> currently got 12 processes that read data from a socket and each write
> to a different memory mapped file.  The apps are only appending to the
> file, but they don't write the data in one sequential chunk so it is
> quite possible that it appears as random IO for a short time period.
> 
> This time I made some more modifications so that I would only capture
> the bp->b_bn when one of my processes sees a delay of 300ms or greater.
> There still is a small chance for false positives, but most of these
> should have caused real delays.  All of the following came from the
> same process:
> 
> [001] 81758.886627: _xfs_buf_find: xfs_buf_lock blocked for 871ms on bp->b_bn: 474487328
> [001] 81920.808163: _xfs_buf_find: xfs_buf_lock blocked for 1038ms on bp->b_bn: 474487568
> [001] 82933.428627: _xfs_buf_find: xfs_buf_lock blocked for 767ms on bp->b_bn: 542273864
> [000] 83048.558413: _xfs_buf_find: xfs_buf_lock blocked for 363ms on bp->b_bn: 474488080
> [001] 86717.251189: _xfs_buf_find: xfs_buf_lock blocked for 514ms on bp->b_bn: 474487560
> [001] 88466.635802: _xfs_buf_find: xfs_buf_lock blocked for 380ms on bp->b_bn: 542274544
> [000] 88972.122062: _xfs_buf_find: xfs_buf_lock blocked for 521ms on bp->b_bn: 474487560
> [001] 89271.988589: _xfs_buf_find: xfs_buf_lock blocked for 496ms on bp->b_bn: 474487560
> [001] 91046.566309: _xfs_buf_find: xfs_buf_lock blocked for 540ms on bp->b_bn: 474487560
> [001] 91047.280042: _xfs_buf_find: xfs_buf_lock blocked for 713ms on bp->b_bn: 542274216
> [001] 91212.458472: _xfs_buf_find: xfs_buf_lock blocked for 490ms on bp->b_bn: 542274544
> [001] 92397.667196: _xfs_buf_find: xfs_buf_lock blocked for 549ms on bp->b_bn: 474487560
> 
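That answers my earlier question about whether it is the same buffer
every time. Tallying the block numbers from your trace lines (a
throwaway Python sketch, numbers copied from above) shows one block
recurring:

```python
from collections import Counter

# bp->b_bn values pulled from the trace lines above, in order
b_bn = [474487328, 474487568, 542273864, 474488080, 474487560,
        542274544, 474487560, 474487560, 474487560, 542274216,
        542274544, 474487560]

# count how often each block number shows up
for bn, n in Counter(b_bn).most_common():
    print(bn, n)
```

474487560 shows up five times and 542274544 twice, so your process is
repeatedly blocking on a small set of buffers.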
> > > What do I need to do to convert the block numbers to a buffer?
> > 
> > # xfs_db -r -c "daddr 812730376" -c "print" <dev>
> > 
> > Will dump the sector at that address. That should be enough to tell
> > us what is in the buffer (by the magic number).
> 
> So I have no idea what I'm looking at but here is the output for the
> above numbers (duplicates removed):
> 
> xfs_db -r -c "daddr 474487328" -c "print" /dev/sda5
> 000: 424d4150 0000007f 00000000 07000082 00000000 07000092 00000000 0039a000
       ^^^^^^^^
        B M A P

#define XFS_BMAP_MAGIC  0x424d4150      /* 'BMAP' */

So these are inode extent btree blocks your application is getting
stuck on.  These only get written back as a result of either log
pressure (i.e.  tail pushing) or by the xfsbufd based on age. They
aren't actually flushed with the data because changes are logged.
IOWs, the writeback of the bmap btree blocks is asynchronous to any
operation that modifies them, so there's no direct connection
between modification and writeback.
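For the other block numbers you dumped, the same trick applies: look at
the first 32-bit word of the sector. A quick sketch (Python; the magic
values are from memory of the kernel's XFS headers, so treat the table
as an assumption to check against your tree):

```python
# Magic numbers as defined in the kernel XFS headers (values here are
# from memory -- verify against your kernel source before relying on them).
XFS_MAGIC = {
    0x424d4150: "BMAP  (bmap btree block)",
    0x41425442: "ABTB  (free space btree, keyed by block)",
    0x41425443: "ABTC  (free space btree, keyed by count)",
    0x49414254: "IABT  (inode allocation btree)",
    0x58414746: "XAGF  (AG free space header)",
    0x58414749: "XAGI  (AG inode header)",
}

def classify(first_word):
    """Identify a metadata block from the first hex word of an
    xfs_db -r -c "daddr N" -c print dump."""
    magic = int(first_word, 16)
    return XFS_MAGIC.get(magic, "unknown magic 0x%08x" % magic)

print(classify("424d4150"))  # the dump above -> bmap btree block
```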

I'm not sure that there is anything that can really be done to
prevent this. If the cause of writeback is age-based flushing on the
metadata buffers, you could try increasing the xfsbufd writeback age
so that only log pressure will cause them to be flushed.
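IIRC the knob on kernels of this vintage is the
fs.xfs.age_buffer_centisecs sysctl (default 1500, i.e. 15s), with
fs.xfs.xfsbufd_centisecs controlling how often xfsbufd wakes up --
check Documentation/filesystems/xfs.txt for your kernel. A hedged
sketch (the demo writes to a scratch file so it runs without root; on a
real system you would point it at the /proc path):

```python
import pathlib, tempfile

# Real path on a live system (needs root and the xfs module loaded):
AGE_SYSCTL = "/proc/sys/fs/xfs/age_buffer_centisecs"

def set_flush_age(centisecs, path=AGE_SYSCTL):
    """Raise the age at which xfsbufd flushes dirty metadata buffers."""
    pathlib.Path(path).write_text("%d\n" % centisecs)

# demo against a scratch file so the sketch runs anywhere
scratch = tempfile.NamedTemporaryFile(delete=False)
set_flush_age(72000, scratch.name)  # 12 minutes instead of the 15s default
print(pathlib.Path(scratch.name).read_text().strip())
```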

Alternatively, you could change your application to pre-fault pages
in an async thread so the latency of allocation during the page
fault is not taken by the main writer...
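A minimal sketch of that pre-fault idea in Python (your app is
presumably C, so treat this as pseudocode with working syntax; the file
name and sizes are invented): a background thread dirties each page in
the region the writer is about to append into, so the page fault -- and
any allocation behind it -- is taken off the hot path:

```python
import mmap, os, tempfile, threading

PAGE = mmap.PAGESIZE

def prefault(mm, start, length):
    # touch each page with a write so the fault (and any allocation
    # behind it) happens here, not in the latency-sensitive writer
    for off in range(start, min(start + length, len(mm)), PAGE):
        mm[off] = mm[off]

# stand-in for the real memory-mapped output file
fd, path = tempfile.mkstemp()
os.ftruncate(fd, 16 * PAGE)
mm = mmap.mmap(fd, 16 * PAGE)

# fault the next chunk in ahead of the writer, off the hot path
t = threading.Thread(target=prefault, args=(mm, 0, 16 * PAGE))
t.start()
t.join()

mm[0:5] = b"hello"     # the writer now lands on already-resident pages
mm.flush()
mm.close()
os.close(fd)
```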

Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


Thread overview: 10+ messages
2010-09-15 15:26 Latencies writing to memory mapped files Shawn Bohrer
2010-09-16  0:18 ` Dave Chinner
2010-09-17 15:45   ` Shawn Bohrer
2010-09-20  0:05     ` Dave Chinner
2010-09-20 22:17       ` Shawn Bohrer
2010-09-20 22:48         ` Dave Chinner
2010-09-21 18:05           ` Shawn Bohrer
2010-09-21 23:15             ` Dave Chinner [this message]
2010-09-29 15:31               ` Shawn Bohrer
2010-09-29 23:19                 ` Dave Chinner
