From: "Alexander Y. Fomichev" <gluk@php4.ru>
To: linux-kernel@vger.kernel.org
Cc: admin@list.net.ru
Subject: xfs cluster rewrites is broken?
Date: Fri, 17 Mar 2006 15:32:03 +0300
Message-ID: <200603171532.04385.gluk@php4.ru>

Hello,

Two days ago I tried 2.6.16-rc5 on a 2-way dual-core Opteron server
and ran into strange system behaviour.

Bulk database updates (the host is intended to be a MySQL database server)
at some point lead to a state where the system writes to disk continuously
at about 100-250 MB/s, which is close to the limit of the RAID controller
(lsi320-2x).

On the affected drive, the only files related to MySQL are the InnoDB log
files (rollback segments). This seems strange, because the write rate to the
InnoDB data file itself stays within 20 MB/s (both are on XFS partitions in
this case).

vmstat 1 shows something like this:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd    free   buff  cache   si   so    bi     bo   in    cs us sy id wa
 1  0      0 2746688   4232 465784    0    0   132 156046 2853   762 12  4 73 11
 1  0      0 2777440   4232 465784    0    0     0 242742 4050   432  6  4 74 17
 1  1      0 2746696   4232 465784    0    0     0 134551 2201   556 13  5 74  8
 1  1      0 2760712   4232 465920    0    0   128 296360 4892  1083  5  5 70 19
 0  1      0 2746596   4232 465920    0    0     0 209254 3560  9072 10  5 70 15
 0  1      0 2745736   4232 466192    0    0   256 142445 2477   721 12  4 75  9
 1  0      0 2757396   4232 466328    0    0   128 190102 3375   829  8  4 74 14
 0  1      0 2746360   4232 466328    0    0     0 192885 3122   256  9  4 75

and iostat:
nuclear ~ # iostat 1
Linux 2.6.16-rc6-g232a347a (nuclear.srv.ehouse.ru) 03/17/06
[skip]
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda            1196.76       222.90    152462.91     194620  133118417
sdb              42.69        11.82      5024.36      10322    4386869

avg-cpu:  %user   %nice    %sys %iowait   %idle
          12.72    0.00    3.99    8.73   74.56

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda            2079.00         0.00    265946.00          0     265946
sdb               0.00         0.00         0.00

(sda holds ib_logfile[0-3], sdb holds ib_data itself.)

Normally, vmstat looks like this:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd    free   buff   cache   si   so    bi     bo   in    cs us sy id wa
 2  0      0 1772460  98660 1242776    0    0     0   2045  329   697 24  2 74  0
 1  0      0 1801452  98660 1242776    0    0     0  20022  671  1164 23  2 74  1
 2  0      0 1782372  98660 1242776    0    0   880   3845  383  1070 20  2 74  4
 1  0      0 1771876  98660 1242776    0    0     0   7897  428   718 24  2 74  0
 1  0      0 1781572  98660 1242776    0    0     0   8200  446  1314 24  2 74  0
 1  0      0 1770020  98660 1242776    0    0     0  11402  478   742 23  2 74  0

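As a rough cross-check of the two vmstat samples, the write rates can be averaged and compared. The sketch below is only illustrative: the bo values are copied from the output above, the function names are made up for this example, and it assumes that one vmstat block is 1024 bytes, as vmstat(8) documents for Linux.

```python
# Write rates ("bo" column) copied from the two vmstat samples above.
BO_DURING_PROBLEM = [156046, 242742, 134551, 296360,
                     209254, 142445, 190102, 192885]
BO_NORMAL = [2045, 20022, 3845, 7897, 8200, 11402]

# Assumption: one vmstat block is 1024 bytes (see vmstat(8) on Linux).
BLOCK_BYTES = 1024


def mean_bo(samples):
    """Average of the bo column, in blocks per second."""
    return sum(samples) / len(samples)


def blocks_to_mib(blocks_per_s):
    """Convert blocks/s to MiB/s under the BLOCK_BYTES assumption."""
    return blocks_per_s * BLOCK_BYTES / (1024 * 1024)


if __name__ == "__main__":
    bad = mean_bo(BO_DURING_PROBLEM)
    normal = mean_bo(BO_NORMAL)
    print(f"problem: {blocks_to_mib(bad):.1f} MiB/s")
    print(f"normal:  {blocks_to_mib(normal):.1f} MiB/s")
    print(f"ratio:   {bad / normal:.1f}x")
```

Under that block-size assumption, the problem sample averages roughly 191 MiB/s against roughly 9 MiB/s in the normal sample, about 22 times higher. Both samples are only a few seconds long, so this is an order-of-magnitude comparison rather than a measurement.
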
I haven't seen anything similar with the previous 2.6.15, so I assume this
is a kernel issue.

A clone of the latest git tree seems to be affected too, so I tried to bisect
a little. A day of crawling with git bisect revealed the commit related to this.

$ git bisect bad
6c4fe19f66a839bce68fcb7b99cdcb0f31c7a59e is first bad commit
diff-tree 6c4fe19f66a839bce68fcb7b99cdcb0f31c7a59e (from 7336cea8c2737bbaf0296d67782f760828301d56)
Author: Christoph Hellwig <hch@sgi.com>
Date: Wed Jan 11 20:49:28 2006 +1100

    [XFS] cluster rewrites We can cluster mapped pages aswell, this improves
    performances on rewrites since we can reduce the number of allocator calls.

    SGI-PV: 947118
    SGI-Modid: xfs-linux-melb:xfs-kern:203829a

    Signed-off-by: Christoph Hellwig <hch@sgi.com>
    Signed-off-by: Nathan Scott <nathans@sgi.com>

http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=6c4fe19f66a839bce68fcb7b99cdcb0f31c7a59e;hp=7336cea8c2737bbaf0296d67782f760828301d56

Reverting this commit on 2.6.16-rc5 eliminates the symptoms completely.
A half-intuitive patch:

diff -urN a/fs/xfs/linux-2.6/xfs_aops.c b/fs/xfs/linux-2.6/xfs_aops.c
--- a/fs/xfs/linux-2.6/xfs_aops.c 2006-03-17 13:13:53.000000000 +0300
+++ b/fs/xfs/linux-2.6/xfs_aops.c 2006-03-17 15:12:12.000000000 +0300
@@ -616,8 +616,6 @@
 				acceptable = (type == IOMAP_UNWRITTEN);
 			else if (buffer_delay(bh))
 				acceptable = (type == IOMAP_DELAY);
-			else if (buffer_mapped(bh))
-				acceptable = (type == 0);
 			else
 				break;
 		} while ((bh = bh->b_this_page) != head);

works too, as far as I can see, but this is just an illustration.
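
To make the effect of the two removed lines easier to see, here is a small user-space model of the loop from the hunk. It is a sketch in Python, not kernel code: the `Buf` states, the numeric type values, the function name `page_acceptable`, and the `cluster_mapped` switch are all invented for illustration, and returning `acceptable` at the end is an assumption, since the hunk shows only the loop itself.

```python
from enum import Enum


class Buf(Enum):
    """Hypothetical stand-ins for the buffer_head state tests in the hunk."""
    UNWRITTEN = "unwritten"   # buffer_unwritten(bh)
    DELAY = "delay"           # buffer_delay(bh)
    MAPPED = "mapped"         # buffer_mapped(bh)
    OTHER = "other"           # none of the above


# Stand-ins for the type values compared in the hunk; the numbers are assumed.
TYPE_NONE = 0
TYPE_DELAY = 1
TYPE_UNWRITTEN = 2


def page_acceptable(buffers, io_type, cluster_mapped=True):
    """Model of the do/while loop over one page's buffers.

    cluster_mapped=True keeps the 'else if (buffer_mapped(bh))' branch;
    cluster_mapped=False models the hunk with that branch removed.
    """
    acceptable = False
    for bh in buffers:
        if bh is Buf.UNWRITTEN:
            acceptable = (io_type == TYPE_UNWRITTEN)
        elif bh is Buf.DELAY:
            acceptable = (io_type == TYPE_DELAY)
        elif cluster_mapped and bh is Buf.MAPPED:
            acceptable = (io_type == TYPE_NONE)
        else:
            break
    return acceptable


if __name__ == "__main__":
    mapped_page = [Buf.MAPPED] * 4
    print(page_acceptable(mapped_page, TYPE_NONE, cluster_mapped=True))
    print(page_acceptable(mapped_page, TYPE_NONE, cluster_mapped=False))
```

In this model, a page consisting only of already-mapped buffers is accepted for type 0 when the branch is present and is never accepted once it is removed, while pages of delayed or unwritten buffers are treated the same either way. The model says nothing about why accepting mapped pages would lead to the continuous writes described above.
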
--
Best regards.
Alexander Y. Fomichev <gluk@php4.ru>
Public PGP key: http://sysadminday.org.ru/gluk.asc