* [PATCH v2] xfs: Avoid pathological backwards allocation
@ 2013-04-11 20:09 Jan Kara
2013-04-16 15:41 ` Mark Tinguely
2013-05-20 13:56 ` Jan Kara
0 siblings, 2 replies; 5+ messages in thread
From: Jan Kara @ 2013-04-11 20:09 UTC
To: xfs; +Cc: Jan Kara, tinguely, dchinner
Writing a large file using direct IO in 16 MB chunks sometimes results
in a pathological allocation pattern where 16 MB chunks of a large free
extent are allocated to a file in reverse order. The extents of such a
file look, for example, like this:
 ext  logical  physical  expected   length flags
   0        0        13             4550656
   1  4550656 188136807   4550668 12562432
   2 17113088 200699240 200699238   622592
   3 17735680 182046055 201321831     4096
   4 17739776 182041959 182050150     4096
   5 17743872 182037863 182046054     4096
   6 17747968 182033767 182041958     4096
   7 17752064 182029671 182037862     4096
 ...
6757 45400064 154381644 154389835     4096
6758 45404160 154377548 154385739     4096
6759 45408256 252951571 154381643    73728 eof

This happens because the XFS_ALLOCTYPE_THIS_BNO allocation fails (the last
extent in the file cannot be extended any further), so we fall back to an
XFS_ALLOCTYPE_NEAR_BNO allocation, which picks the end of a large free
extent as the best place to continue the file. Since the chunk at the
end of the free extent again cannot be extended, this behavior
repeats until the whole free extent is consumed in reverse order.

For data allocations this backward allocation isn't beneficial, so make
xfs_alloc_compute_diff() pick the start of a free extent instead of its
end for them. That avoids the backward allocation pattern.
See thread at http://oss.sgi.com/archives/xfs/2013-03/msg00144.html for
more details about the reproduction case and why this solution was
chosen.
Based on an idea by Dave Chinner <dchinner@redhat.com>.
CC: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/xfs/xfs_alloc.c | 24 ++++++++++++++++++------
1 files changed, 18 insertions(+), 6 deletions(-)
v2: Updated comment and commit description.
diff --git a/fs/xfs/xfs_alloc.c b/fs/xfs/xfs_alloc.c
index 0ad2325..f99113d 100644
--- a/fs/xfs/xfs_alloc.c
+++ b/fs/xfs/xfs_alloc.c
@@ -173,6 +173,7 @@ xfs_alloc_compute_diff(
xfs_agblock_t wantbno, /* target starting block */
xfs_extlen_t wantlen, /* target length */
xfs_extlen_t alignment, /* target alignment */
+ char userdata, /* are we allocating data? */
xfs_agblock_t freebno, /* freespace's starting block */
xfs_extlen_t freelen, /* freespace's length */
xfs_agblock_t *newbnop) /* result: best start block from free */
@@ -187,7 +188,14 @@ xfs_alloc_compute_diff(
ASSERT(freelen >= wantlen);
freeend = freebno + freelen;
wantend = wantbno + wantlen;
- if (freebno >= wantbno) {
+ /*
+ * We want to allocate from the start of a free extent if it is past
+ * the desired block or if we are allocating user data and the free
+ * extent is before desired block. The second case is there to allow
+ * for contiguous allocation from the remaining free space if the file
+ * grows in the short term.
+ */
+ if (freebno >= wantbno || (userdata && freeend < wantend)) {
if ((newbno1 = roundup(freebno, alignment)) >= freeend)
newbno1 = NULLAGBLOCK;
} else if (freeend >= wantend && alignment > 1) {
@@ -772,7 +780,8 @@ xfs_alloc_find_best_extent(
xfs_alloc_fix_len(args);
sdiff = xfs_alloc_compute_diff(args->agbno, args->len,
- args->alignment, *sbnoa,
+ args->alignment,
+ args->userdata, *sbnoa,
*slena, &new);
/*
@@ -943,7 +952,8 @@ restart:
if (args->len < blen)
continue;
ltdiff = xfs_alloc_compute_diff(args->agbno, args->len,
- args->alignment, ltbnoa, ltlena, &ltnew);
+ args->alignment, args->userdata, ltbnoa,
+ ltlena, &ltnew);
if (ltnew != NULLAGBLOCK &&
(args->len > blen || ltdiff < bdiff)) {
bdiff = ltdiff;
@@ -1095,7 +1105,8 @@ restart:
args->len = XFS_EXTLEN_MIN(ltlena, args->maxlen);
xfs_alloc_fix_len(args);
ltdiff = xfs_alloc_compute_diff(args->agbno, args->len,
- args->alignment, ltbnoa, ltlena, &ltnew);
+ args->alignment, args->userdata, ltbnoa,
+ ltlena, &ltnew);
error = xfs_alloc_find_best_extent(args,
&bno_cur_lt, &bno_cur_gt,
@@ -1111,7 +1122,8 @@ restart:
args->len = XFS_EXTLEN_MIN(gtlena, args->maxlen);
xfs_alloc_fix_len(args);
gtdiff = xfs_alloc_compute_diff(args->agbno, args->len,
- args->alignment, gtbnoa, gtlena, &gtnew);
+ args->alignment, args->userdata, gtbnoa,
+ gtlena, &gtnew);
error = xfs_alloc_find_best_extent(args,
&bno_cur_gt, &bno_cur_lt,
@@ -1170,7 +1182,7 @@ restart:
}
rlen = args->len;
(void)xfs_alloc_compute_diff(args->agbno, rlen, args->alignment,
- ltbnoa, ltlena, &ltnew);
+ args->userdata, ltbnoa, ltlena, &ltnew);
ASSERT(ltnew >= ltbno);
ASSERT(ltnew + rlen <= ltbnoa + ltlena);
ASSERT(ltnew + rlen <= be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length));
--
1.7.1
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: [PATCH v2] xfs: Avoid pathological backwards allocation
2013-04-11 20:09 [PATCH v2] xfs: Avoid pathological backwards allocation Jan Kara
@ 2013-04-16 15:41 ` Mark Tinguely
2013-05-20 13:56 ` Jan Kara
1 sibling, 0 replies; 5+ messages in thread
From: Mark Tinguely @ 2013-04-16 15:41 UTC
To: Jan Kara; +Cc: xfs
On 04/11/13 15:09, Jan Kara wrote:
> [...]
Looks good. I also agree this should wait for Linux 3.11.
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
* Re: [PATCH v2] xfs: Avoid pathological backwards allocation
2013-04-11 20:09 [PATCH v2] xfs: Avoid pathological backwards allocation Jan Kara
2013-04-16 15:41 ` Mark Tinguely
@ 2013-05-20 13:56 ` Jan Kara
2013-05-20 14:57 ` Mark Tinguely
2013-05-20 18:10 ` Ben Myers
1 sibling, 2 replies; 5+ messages in thread
From: Jan Kara @ 2013-05-20 13:56 UTC
To: xfs; +Cc: Jan Kara, tinguely, dchinner
On Thu 11-04-13 22:09:56, Jan Kara wrote:
> [...]
Could anybody pull this patch into the XFS tree? I don't see it there...
Honza
> [...]
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: [PATCH v2] xfs: Avoid pathological backwards allocation
2013-05-20 13:56 ` Jan Kara
@ 2013-05-20 14:57 ` Mark Tinguely
2013-05-20 18:10 ` Ben Myers
1 sibling, 0 replies; 5+ messages in thread
From: Mark Tinguely @ 2013-05-20 14:57 UTC
To: Jan Kara; +Cc: dchinner, xfs
On 05/20/13 08:56, Jan Kara wrote:
> On Thu 11-04-13 22:09:56, Jan Kara wrote:
>> [...]
> Could anybody pull this patch into XFS tree? I don't see it there...
>
> Honza
Sorry, that was a miscommunication on my part: this belonged in the dev
tree, but not in the for-Linus pull for Linux 3.10.
--Mark.
* Re: [PATCH v2] xfs: Avoid pathological backwards allocation
2013-05-20 13:56 ` Jan Kara
2013-05-20 14:57 ` Mark Tinguely
@ 2013-05-20 18:10 ` Ben Myers
1 sibling, 0 replies; 5+ messages in thread
From: Ben Myers @ 2013-05-20 18:10 UTC
To: Jan Kara; +Cc: dchinner, tinguely, xfs
On Mon, May 20, 2013 at 03:56:07PM +0200, Jan Kara wrote:
> On Thu 11-04-13 22:09:56, Jan Kara wrote:
> > [...]
> Could anybody pull this patch into XFS tree? I don't see it there...
Applied.
Thanks Jan.