From: Dave Chinner <david@fromorbit.com>
To: Alex Lyakas <alex@zadarastorage.com>
Cc: Christoph Hellwig <hch@infradead.org>,
Brian Foster <bfoster@redhat.com>,
linux-xfs@vger.kernel.org
Subject: Re: xfs_alloc_ag_vextent_near() takes minutes to complete
Date: Fri, 5 May 2017 13:29:05 +1000
Message-ID: <20170505032905.GF17542@dastard>
In-Reply-To: <DDDCF9D44B6A4D1493F62FCA5AC73CF6@alyakaslap>

On Thu, May 04, 2017 at 11:07:45AM +0300, Alex Lyakas wrote:
> Hello Brian, Christoph,
>
> Thank you for your responses.
>
> >The search overhead could be high due to either fragmented free space or
> >perhaps waiting on busy extents (since you have enabled online discard).
> >Do you have any threads freeing space and waiting on discard operations
> >when this occurs? Also, what does 'xfs_db -c "freesp -s" <dev>' show for
> >this filesystem?
> I disabled the discard, but the problem still happens. Output of the
> freesp command is at [1]. To my understanding this means that 60% of
> the free space is 16-31 continuous blocks, i.e., 64kb-124kb. Does
> this count as a fragmented free space?
>
> I debugged the issue further, profiling the
> xfs_alloc_ag_vextent_near() call and what it does. Some results:
>
> # it appears to not be triggering any READs of xfs_buf, i.e., no
> calls to xfs_buf_ioapply_map() with rw==READ or rw==READA in the
> same thread
> # most of the time (about 95%) is spent in xfs_buf_lock() waiting in
> "down(&bp->b_sema)" call
> # the average time to lock an xfs_buf is about 10-12 ms
>
> For example, in one test it took 45778 ms to complete the
> xfs_alloc_ag_vextent_near() execution. During this time, 6240
> xfs_buf were locked, totalling 42810 ms spent locking the
> buffers, which is about 93%. On average 7 ms to lock a buffer.
>
> # it is still not clear who is holding the lock
>
> Christoph, I understand that kernel 3.18 is EOL at the moment, but it
> used to be a long-term kernel, so there is an expectation of
> stability, but perhaps not community support at this point.
>
> Thanks,
> Alex.
>
>
> [1]
>    from      to    extents      blocks    pct
>       1       1     155759      155759   0.00
>       2       3       1319        3328   0.00
>       4       7      13153       56265   0.00
>       8      15     152663     1752813   0.03
>      16      31  143626908  4019133338  60.17

There's your problem. 143 million small free space extents totalling
4TB of free space. That's going to require (roughly speaking)
somewhere between 300,000 and 500,000 4k btree leaf blocks to index,
i.e. a footprint of 10-20GB of metadata.

Even accounting for it being evenly spread across 50 AGs, that's
still 5-10k btree blocks per free space btree per AG, and so if
that's not in cache when we end up doing a linear search for a near
block of a size that falls into this bucket, it's going to get stuck
reading btree leaf siblings from disk synchronously....
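
As a rough cross-check of the leaf block counts above, here is a
back-of-the-envelope sketch. The extent count and the 50 AGs come from
this thread; the 8-byte record size, the 56-byte block header and the
fill levels are assumptions chosen for illustration, not values
reported for this filesystem.

```python
import math

# Figures taken from the freesp output and the estimate in this thread.
EXTENTS_IN_BUCKET = 143_626_908   # free extents of 16-31 blocks
AG_COUNT = 50                     # AG count used in the estimate

# Assumptions (not stated in the thread): 4 KiB btree blocks, 8-byte
# free space records, 56-byte per-block header.
BLOCK_SIZE = 4096
RECORD_SIZE = 8
HEADER_SIZE = 56


def leaf_blocks(records, fill_pct=100):
    """Leaf blocks needed to index `records` at a given fill percentage."""
    max_per_block = (BLOCK_SIZE - HEADER_SIZE) // RECORD_SIZE
    per_block = max_per_block * fill_pct // 100
    return math.ceil(records / per_block)


if __name__ == "__main__":
    for fill in (100, 60):
        total = leaf_blocks(EXTENTS_IN_BUCKET, fill)
        print(f"fill {fill}%: {total} leaf blocks, "
              f"about {total // AG_COUNT} per btree per AG")
```

Under these assumptions both fill levels land in the same range as the
per-AG figure quoted above, and every one of those leaf blocks is a
potential synchronous read if it is not already cached.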

Perhaps this "near block" search needs to terminate at a certain
search radius, similar to how the old AGI btree searches during inode
allocation were terminated after a certain radius of allocated inode
clusters had been searched for free inodes....
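
To illustrate what a radius-limited search could look like, here is a
toy sketch over a sorted list of free extents. It is not the XFS
allocator code: find_near(), max_radius and the (start, length) tuples
are hypothetical names used only for this example, and the real search
walks btree records rather than a Python list.

```python
from bisect import bisect_left


def find_near(extents, target, want, max_radius):
    """Look for a free extent of at least `want` blocks near `target`.

    `extents` is a list of (start, length) tuples sorted by start.
    At most `max_radius` records are examined on each side of the
    target position. Returns (start, length), or None if nothing
    suitable lies within the radius.
    """
    pos = bisect_left(extents, (target, 0))
    for step in range(max_radius):
        left, right = pos - 1 - step, pos + step
        if left < 0 and right >= len(extents):
            break  # ran off both ends of the list
        best = None
        for idx in (left, right):
            if 0 <= idx < len(extents):
                start, length = extents[idx]
                if length >= want:
                    dist = abs(start - target)
                    if best is None or dist < best[0]:
                        best = (dist, start, length)
        if best is not None:
            # Closer of the (up to) two candidates at this step.
            return best[1], best[2]
    return None  # radius exhausted: caller must fall back


if __name__ == "__main__":
    free = [(0, 4), (10, 4), (20, 32), (30, 4), (40, 4), (50, 64)]
    print(find_near(free, target=32, want=16, max_radius=4))
    print(find_near(free, target=32, want=16, max_radius=1))
```

The point of the bound is that a failed search costs at most
2 * max_radius record visits instead of a walk across thousands of
leaf blocks; what the caller does after a None (for example, a lookup
by size instead of by locality) is a separate design decision.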

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com