public inbox for linux-ext4@vger.kernel.org
From: Andrei Banu <andrei.banu@redhost.ro>
To: linux-ext4@vger.kernel.org
Subject: Weird jbd2 I/O load
Date: Wed, 16 Oct 2013 00:41:13 +0300
Message-ID: <525DB679.4070008@redhost.ro>

Hello,

First off let me state that my level of knowledge and expertise is in no 
way a match for that of the people on this list. I am not even sure if 
what I want to ask is in any way related to my problem or it's just a 
side effect (or even plain irrelevant).

I am trying to identify the source of the problems I face with an 
md RAID-1 array built from 2 Samsung 840 Pro SSDs. The filesystem is 
ext4. I face many problems with this array:

- write speeds around 10MB/s and serious server overloads (loads of 20 
to 100 - this is a quad core CPU) when copying larger files (100+ MBs):
root [~]# time dd if=arch.tar.gz of=test4 bs=2M oflag=sync
146+1 records in
146+1 records out
307191761 bytes (307 MB) copied, 23.6788 s, 13.0 MB/s
real    0m23.680s
user    0m0.000s
sys     0m0.932s

- asymmetrical wear on the 2 SSDs (one SSD has a wear of 6% while the 
other has a wear of 30%):
root [~]# smartctl --attributes /dev/sda | grep -i wear
177 Wear_Leveling_Count     0x0013   094%   094   000    Pre-fail  Always       -       196
root [~]# smartctl --attributes /dev/sdb | grep -i wear
177 Wear_Leveling_Count     0x0013   070%   070   000    Pre-fail  Always       -       1073

- very asymmetrical await, svctm and %util in iostat when copying larger 
files (100+ MB):
Device:  rrqm/s   wrqm/s   r/s    w/s  rsec/s    wsec/s  avgrq-sz  avgqu-sz    await  svctm   %util
sda        0.00  1589.50  0.00  54.00    0.00  13148.00    243.48      0.60    11.17   0.46    2.50
sdb        0.00  1627.50  0.00  16.50    0.00   9524.00    577.21    144.25  1439.33  60.61  100.00
md1        0.00     0.00  0.00   0.00    0.00      0.00      0.00      0.00     0.00   0.00    0.00
md2        0.00     0.00  0.00   1602    0.00  12816.00      8.00      0.00     0.00   0.00    0.00
md0        0.00     0.00  0.00   0.00    0.00      0.00      0.00      0.00     0.00   0.00    0.00
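
The asymmetry in the iostat sample can be cross-checked with a little
arithmetic. The following is only a rough sketch: it assumes iostat
counts 512-byte sectors and that avgrq-sz is (rsec/s + wsec/s) divided
by (r/s + w/s). The function names are made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: iostat reports 512-byte sectors

# (r/s, w/s, rsec/s, wsec/s) copied from the iostat sample
samples = {
    "sda": (0.00, 54.00, 0.00, 13148.00),
    "sdb": (0.00, 16.50, 0.00, 9524.00),
    "md2": (0.00, 1602.0, 0.00, 12816.00),
}


def avg_request_sectors(r_s, w_s, rsec_s, wsec_s):
    """Average request size in sectors (what iostat calls avgrq-sz)."""
    ios = r_s + w_s
    return (rsec_s + wsec_s) / ios if ios else 0.0


def write_mb_per_s(wsec_s, sector_bytes=SECTOR_BYTES):
    """Write throughput in MB/s (10^6 bytes) from sectors per second."""
    return wsec_s * sector_bytes / 1e6


for dev, (r_s, w_s, rsec_s, wsec_s) in samples.items():
    size = avg_request_sectors(r_s, w_s, rsec_s, wsec_s)
    rate = write_mb_per_s(wsec_s)
    print(f"{dev}: avgrq-sz={size:.2f} sectors, write={rate:.2f} MB/s")
```

The computed request sizes match the avgrq-sz column. Under these
assumptions, sdb completes fewer and larger writes per second than sda
during the sample while sitting at 100% utilization.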

- asymmetrical Total_LBAs_Written, but the gap is much smaller than the 
wear gap above:
root [~]# smartctl --attributes /dev/sda | grep "Total_LBAs_Written"
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       23628284668
root [~]# smartctl --attributes /dev/sdb | grep "Total_LBAs_Written"
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       25437073579
(the gap seems to be getting narrower and narrower here though - it 
seems some event in the past caused this)
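
To put the two SMART attributes side by side, here is a rough sketch
that compares the raw values of the two drives. It assumes one LBA is
512 bytes, which the smartctl output does not state, and it compares
the Wear_Leveling_Count raw values only as a ratio without
interpreting their unit. The names are illustrative.

```python
LBA_BYTES = 512  # assumption: Total_LBAs_Written counts 512-byte units

# Raw values copied from the smartctl output
smart = {
    "sda": {"wear_raw": 196, "lbas_written": 23628284668},
    "sdb": {"wear_raw": 1073, "lbas_written": 25437073579},
}


def terabytes_written(lbas, lba_bytes=LBA_BYTES):
    """Convert an LBA count to terabytes (10^12 bytes)."""
    return lbas * lba_bytes / 1e12


# How much larger sdb's raw values are than sda's
wear_ratio = smart["sdb"]["wear_raw"] / smart["sda"]["wear_raw"]
lba_ratio = smart["sdb"]["lbas_written"] / smart["sda"]["lbas_written"]

for dev, attrs in smart.items():
    print(f"{dev}: {terabytes_written(attrs['lbas_written']):.2f} TB written")
print(f"wear raw ratio sdb/sda: {wear_ratio:.2f}")
print(f"LBAs written ratio sdb/sda: {lba_ratio:.3f}")
```

Under that assumption both drives have taken a similar volume of host
writes, while the raw wear values differ by a much larger factor.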


And the number one reason I am asking for help on this list:
root # iotop -o
Total DISK READ: 247.78 K/s | Total DISK WRITE: 495.56 K/s
TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO> COMMAND
534 be/3 root        0.00 B/s   55.06 K/s  0.00 % 99.99 % [jbd2/md2-8]
....

When there are problems, jbd2 shows 99.99% in iotop's IO> column 
without doing any apparent significant reads or writes. It seems like 
jbd2 just keeps the devices busy.

What could be the reason for some of the above anomalies? In particular, 
why is jbd2 keeping the RAID members busy while not doing any reads or 
writes? Why the abysmal write speed?

So far I have updated the SSDs' firmware, checked the alignment, which 
seems OK (1MB boundary), and tried all 3 I/O schedulers. The swap is on 
an md device, so swap cannot explain the asymmetrical use and wear 
either. I have also looked for "hard resetting link" in dmesg but found 
nothing, so I guess it's not a cable or backplane issue. What else can 
I check? What else can I try?

Kind regards!

Thread overview: 8+ messages
2013-10-15 21:41 Andrei Banu [this message]
2013-10-21 13:53 ` Weird jbd2 I/O load Zheng Liu
2013-10-21 14:24   ` Andrei Banu
2013-10-21 16:55     ` Zheng Liu
2013-10-21 17:11       ` Zheng Liu
2013-10-21 17:42   ` Andrei Banu
2013-10-22  2:57     ` Zheng Liu
2013-10-22  7:22       ` Andrei Banu