* Weird jbd2 I/O load
From: Andrei Banu @ 2013-10-15 21:41 UTC
To: linux-ext4
Hello,
First off let me state that my level of knowledge and expertise is in no
way a match for that of the people on this list. I am not even sure
whether what I want to ask is related to my problem, is just a side
effect of it, or is plain irrelevant.
I am trying to identify the source of the problems I am having with an
md RAID-1 array built from 2 Samsung 840 Pro SSDs. The filesystem is
ext4. I see several problems with this array:
- write speeds of around 10 MB/s and serious server overload (load
averages of 20 to 100 on a quad-core CPU) when copying larger files (100+ MB):
root [~]# time dd if=arch.tar.gz of=test4 bs=2M oflag=sync
146+1 records in
146+1 records out
307191761 bytes (307 MB) copied, 23.6788 s, 13.0 MB/s
real 0m23.680s
user 0m0.000s
sys 0m0.932s
- asymmetrical wear on the 2 SSDs (one SSD has a wear of 6% while the
other has a wear of 30%):
root [~]# smartctl --attributes /dev/sda | grep -i wear
177 Wear_Leveling_Count     0x0013   094%   094   000   Pre-fail   Always   -   196
root [~]# smartctl --attributes /dev/sdb | grep -i wear
177 Wear_Leveling_Count     0x0013   070%   070   000   Pre-fail   Always   -   1073
- very asymmetrical await, svctm and %util in iostat when copying larger
files (100+ MB):
Device:  rrqm/s   wrqm/s   r/s    w/s    rsec/s   wsec/s    avgrq-sz  avgqu-sz  await    svctm  %util
sda      0.00     1589.50  0.00   54.00  0.00     13148.00  243.48    0.60      11.17    0.46   2.50
sdb      0.00     1627.50  0.00   16.50  0.00     9524.00   577.21    144.25    1439.33  60.61  100.00
md1      0.00     0.00     0.00   0.00   0.00     0.00      0.00      0.00      0.00     0.00   0.00
md2      0.00     0.00     0.00   1602   0.00     12816.00  8.00      0.00      0.00     0.00   0.00
md0      0.00     0.00     0.00   0.00   0.00     0.00      0.00      0.00      0.00     0.00   0.00
- total LBAs written are also asymmetrical, though far less so than the wear figures above:
root [~]# smartctl --attributes /dev/sda | grep "Total_LBAs_Written"
241 Total_LBAs_Written 0x0032 099 099 000 Old_age
Always - 23628284668
root [~]# smartctl --attributes /dev/sdb | grep "Total_LBAs_Written"
241 Total_LBAs_Written 0x0032 099 099 000 Old_age
Always - 25437073579
(the gap seems to be narrowing here, though, so it looks as if some
event in the past caused it)
And the number one reason I am asking for help on this list:
root # iotop -o
Total DISK READ: 247.78 K/s | Total DISK WRITE: 495.56 K/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
534 be/3 root 0.00 B/s 55.06 K/s 0.00 % 99.99 % [jbd2/md2-8]
....
When the problems occur, jbd2 accounts for 99.9% of the I/O time while
doing no apparent significant reads or writes. It seems as if jbd2 just
keeps the devices busy.
What could be the reason for some of the above anomalies? In particular,
why is jbd2 keeping the RAID members busy while not doing any reads or
writes? Why the abysmal write speed?
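One way to see what that journal thread is actually committing, assuming
the running kernel exposes jbd2 statistics (the md2-8 name is taken from
the iotop output above), would be:
root [~]# cat /proc/fs/jbd2/md2-8/info
which reports, among other things, per-transaction timing such as the
average transaction commit time.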
So far I have updated the SSDs' firmware, checked the partition
alignment (which seems OK, on a 1 MB boundary), tried all 3 I/O
schedulers, and confirmed that swap is on an md device (so it cannot
explain the asymmetrical use and wear either). I have also looked for
"hard resetting link" in dmesg but found nothing, so I guess it is not a
cable or backplane issue. What else can I check? What else can I try?
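A few further checks along the same lines (all standard tools; the file
and device names are only placeholders taken from the outputs above):
root [~]# cat /proc/mdstat            # confirm the array is clean and not resyncing
root [~]# iostat -x 1                 # watch per-device latency while a copy runs
root [~]# time dd if=arch.tar.gz of=test5 bs=2M oflag=direct
The last command repeats the earlier copy with direct I/O instead of
per-write syncs, which would show whether the raw array speed is fine
and only the flush/journal path is slow.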
Kind regards!
* Re: Weird jbd2 I/O load
From: Zheng Liu @ 2013-10-21 13:53 UTC
To: Andrei Banu; +Cc: linux-ext4
Hi Andrei,
Could you please disable barriers for ext4 and try your 'dd' test again?
  $ sudo mount -t ext4 -o remount,barrier=0 ${DEV} ${MNT}
*WARNING: you could lose your data with barrier=0 if you get a power
failure or cold reset.*
We have seen a similar problem that was caused by SSDs that couldn't
handle the barrier command properly.
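A quick way to double-check afterwards (just a sketch; hdparm only
reports the drive's volatile write-cache setting and does not prove
anything about flush handling):
  $ grep ' / ' /proc/mounts      # should now list barrier=0 among the root fs options
  $ sudo hdparm -W /dev/sda      # report whether the drive's write cache is enabled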
Regards,
- Zheng
On Wed, Oct 16, 2013 at 12:41:13AM +0300, Andrei Banu wrote:
> Hello,
>
> First off let me state that my level of knowledge and expertise is
> in no way a match for that of the people on this list. I am not even
> sure if what I want to ask is in any way related to my problem or
> it's just a side effect (or even plain irrelevant).
>
> I am trying to identify the source of the problems I face with an
> mdraid-1 built with 2 Samsung 840 Pro SSDs. The filesystem is ext-4.
> I face many problems with this array:
>
> - write speeds around 10MB/s and serious server overloads (loads of
> 20 to 100 - this is a quad core CPU) when copying larger files (100+
> MBs):
> root [~]# time dd if=arch.tar.gz of=test4 bs=2M oflag=sync
> 146+1 records in
> 146+1 records out
> 307191761 bytes (307 MB) copied, 23.6788 s, 13.0 MB/s
> real 0m23.680s
> user 0m0.000s
> sys 0m0.932s
>
> - asymmetrical wear on the 2 SSDs (one SSD has a wear of 6% while
> the other has a wear of 30%):
> root [~]# smartctl --attributes /dev/sda | grep -i wear
> 177 Wear_Leveling_Count 0x0013 094% 094 000 Pre-fail
> Always - 196
> root [~]# smartctl --attributes /dev/sdb | grep -i wear
> 177 Wear_Leveling_Count 0x0013 070% 070 000 Pre-fail
> Always - 1073
>
> - very asymmetrical await, svctm and %util in iostat when copying
> larger files (100+ MB):
> Device: rrqm/s wrqm/s r/s w/s rsec/s
> wsec/s avgrq-sz avgqu-sz await svctm %util
> sda 0.00 1589.50 0.00 54.00 0.00
> 13148.00 243.48 0.60 11.17 0.46 2.50
> sdb 0.00 1627.50 0.00 16.50 0.00
> 9524.00 577.21 144.25 1439.33 60.61 100.00
> md1 0.00 0.00 0.00 0.00 0.00
> 0.00 0.00 0.00 0.00 0.00 0.00
> md2 0.00 0.00 0.00 1602 0.00
> 12816.00 8.00 0.00 0.00 0.00 0.00
> md0 0.00 0.00 0.00 0.00 0.00
> 0.00 0.00 0.00 0.00 0.00 0.00
>
> - asymmetrical total LBA written but much lower than the above:
> root [~]# smartctl --attributes /dev/sda | grep "Total_LBAs_Written"
> 241 Total_LBAs_Written 0x0032 099 099 000 Old_age
> Always - 23628284668
> root [~]# smartctl --attributes /dev/sdb | grep "Total_LBAs_Written"
> 241 Total_LBAs_Written 0x0032 099 099 000 Old_age
> Always - 25437073579
> (the gap seems to be getting narrower and narrower here though - it
> seems some event in the past caused this)
>
>
> And the number one reason I am trying for help on this list:
> root # iotop -o
> Total DISK READ: 247.78 K/s | Total DISK WRITE: 495.56 K/s
> TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
> 534 be/3 root 0.00 B/s 55.06 K/s 0.00 % 99.99 % [jbd2/md2-8]
> ....
>
> When there are problems, jbd2 seems to do 99.9% I/O without doing
> any apparent significant reads or writes. It seems like jbd2 just
> keeps the devices busy.
>
> What could be the reason of some of the above anomalies? Especially
> why is jbd2 keeping the raid members busy while not doing any reads
> or writes? Why the abysmal write speed?
>
> So far I have updated the SSDs firmware, checked the alignment which
> seems ok (1MB boundary), checked with all 3 schedulers, the swap is
> on an md device (so the asymmetrical use and wear again can't be
> explained), I have looked for "hard resetting link" in dmesg but
> found nothing so I guess it's not a cable or back plane issue). What
> else can I check? What else can I try?
>
> Kind regards!
* Re: Weird jbd2 I/O load
From: Andrei Banu @ 2013-10-21 14:24 UTC
To: linux-ext4
Hi Zheng,
Thank you for your reply. We can run this test, and if it doesn't help
we'll re-enable the barrier, but first I need to ask a few questions:
1. /dev/md2 is mounted on /. So your command should look like this?
$ mount -t ext4 -o remount,barrier=0 /dev/md2 /
In /etc/fstab I have other parameters as well:
noatime,usrjquota=quota.user,jqfmt=vfsv0
Do I also add these like this:
$ mount -t ext4 -o
remount,barrier=0,noatime,usrjquota=quota.user,jqfmt=vfsv0 /dev/md2 /
2. Can the command above be run on an active (very active) cPanel server?
3. How do I re-enable the barrier?
4. What is the probability of data loss in case of cold reboot?
Thanks!
On 10/21/2013 4:53 PM, Zheng Liu wrote:
> Hi Andrei,
>
> Could you please disable barrier for ext4 and try your 'dd' test again?
> $ sudo mount -t ext4 -o remount,barrier=0 ${DEV} ${MNT}
>
> *WARNING: you could lost your data with barrier=0 when you get a power
> failure or cold reset.*
>
> We have met a similar problem that is because some SSDs couldn't handle
> barrier command properly.
>
> Regards,
> - Zheng
* Re: Weird jbd2 I/O load
From: Zheng Liu @ 2013-10-21 16:55 UTC
To: Andrei Banu; +Cc: linux-ext4
On 10/21/2013 10:24 PM, Andrei Banu wrote:
> Hi Zheng,
>
> Thank you for your reply. We can make this test and if it doesn't help
> we'll re-enable
> the barrier but first I need to ask a few questions:
>
> 1. /dev/md2 is mounted on /. So your command should look like this?
>
> $ mount -t ext4 -o remount,barrier=0 /dev/md2 /
Yes.
>
> In /etc/fstab I have other parameters as well:
> noatime,usrjquota=quota.user,jqfmt=vfsv0
> Do I also add these like this:
>
> $ mount -t ext4 -o
> remount,barrier=0,noatime,usrjquota=quota.user,jqfmt=vfsv0 /dev/md2 /
If I remember correctly, this command is OK, but frankly I haven't
tested it. I am not sure all of these options will be accepted when you
remount a partition.
>
> 2. Can the command above be run on an active (very active) cPanel server?
Sorry, I am not familiar with cPanel. But if your previous tests were
run on a server under heavy load, that would certainly affect the
results, because the disk bandwidth is shared with other processes.
>
> 3. How do I re-enable the barrier?
sudo mount -t ext4 -o remount,barrier=1 ${DEV} ${MNT}
>
> 4. What is the probability of data loss in case of cold reboot?
After disabling barriers, the system cannot ensure that dirty data has
been written to the disk medium if your device doesn't support the FUA
command. I just want to confirm whether your SSD can handle the barrier
command properly. *Please do not use 'barrier=0' on your production
system*.
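One way to see how the kernel sized up the drives' caching and FUA
support (the exact wording varies with the kernel version, so treat this
only as a sketch):
  $ dmesg | grep -i 'write cache'
  $ cat /sys/class/scsi_disk/*/cache_type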
- Zheng
>
> Thanks!
>
> On 10/21/2013 4:53 PM, Zheng Liu wrote:
>> Hi Andrei,
>>
>> Could you please disable barrier for ext4 and try your 'dd' test again?
>> $ sudo mount -t ext4 -o remount,barrier=0 ${DEV} ${MNT}
>>
>> *WARNING: you could lost your data with barrier=0 when you get a power
>> failure or cold reset.*
>>
>> We have met a similar problem that is because some SSDs couldn't handle
>> barrier command properly.
>>
>> Regards,
>> - Zheng
>
* Re: Weird jbd2 I/O load
From: Zheng Liu @ 2013-10-21 17:11 UTC
To: Andrei Banu; +Cc: linux-ext4
On 10/22/2013 12:55 AM, Zheng Liu wrote:
> On 10/21/2013 10:24 PM, Andrei Banu wrote:
>> Hi Zheng,
>>
>> Thank you for your reply. We can make this test and if it doesn't help
>> we'll re-enable
>> the barrier but first I need to ask a few questions:
>>
>> 1. /dev/md2 is mounted on /. So your command should look like this?
>>
>> $ mount -t ext4 -o remount,barrier=0 /dev/md2 /
>
> Yes.
>
>>
>> In /etc/fstab I have other parameters as well:
>> noatime,usrjquota=quota.user,jqfmt=vfsv0
>> Do I also add these like this:
>>
>> $ mount -t ext4 -o
>> remount,barrier=0,noatime,usrjquota=quota.user,jqfmt=vfsv0 /dev/md2 /
>
> If I remember correctly, this command is OK. But frankly I don't do the
> test. I am not sure all these options will pass when you remount a
> partition.
>
>>
>> 2. Can the command above be run on an active (very active) cPanel server?
>
> Sorry, I am not familiar with cPanel server. But if your previous
> testings run on a server under a heavy pressure. It quite impacts the
> result because disk bandwidth is occupied by this process.
Ah, sorry, I misunderstood your question. Yes, you can remount a
partition on an active server. But as I said above, please do not run
your tests on a server under heavy load, because the results could be
skewed.
Regards,
- Zheng
>
>>
>> 3. How do I re-enable the barrier?
>
> sudo mount -t ext4 -o remount,barrier=1 ${DEV} ${MNT}
>
>>
>> 4. What is the probability of data loss in case of cold reboot?
>
> After disabling barrier, the system couldn't ensure that the dirty data
> has been written into the disk medium if your device couldn't support
> FUA command. So I just want to make sure that your SSD can not handle
> barrier command properly. *Please do not use 'barrier=0' in your
> product system*.
>
> - Zheng
>
>>
>> Thanks!
>>
>> On 10/21/2013 4:53 PM, Zheng Liu wrote:
>>> Hi Andrei,
>>>
>>> Could you please disable barrier for ext4 and try your 'dd' test again?
>>> $ sudo mount -t ext4 -o remount,barrier=0 ${DEV} ${MNT}
>>>
>>> *WARNING: you could lost your data with barrier=0 when you get a power
>>> failure or cold reset.*
>>>
>>> We have met a similar problem that is because some SSDs couldn't handle
>>> barrier command properly.
>>>
>>> Regards,
>>> - Zheng
>>
>
* Re: Weird jbd2 I/O load
From: Andrei Banu @ 2013-10-21 17:42 UTC
To: linux-ext4, Zheng Liu
Hi,
In the meantime I've created another md device (just 5 GB) and redone
the tests. I believe this is easier and less risky than remounting a
used md device.
root [/home2]# mount -l | grep md3
/dev/md3 on /home2 type ext4 (rw,barrier=0)
root [/home2]# dd bs=2M count=64 if=/dev/zero of=test6 conv=fdatasync
64+0 records in
64+0 records out
134217728 bytes (134 MB) copied, 12.3287 s, 10.9 MB/s
So the speed issue is still with us, I believe.
Is there some way to check that barrier is really set to 0?
Thanks a lot!
On 10/21/2013 4:53 PM, Zheng Liu wrote:
> Hi Andrei,
>
> Could you please disable barrier for ext4 and try your 'dd' test again?
> $ sudo mount -t ext4 -o remount,barrier=0 ${DEV} ${MNT}
>
> *WARNING: you could lost your data with barrier=0 when you get a power
> failure or cold reset.*
>
> We have met a similar problem that is because some SSDs couldn't handle
> barrier command properly.
>
> Regards,
> - Zheng
* Re: Weird jbd2 I/O load
From: Zheng Liu @ 2013-10-22 2:57 UTC
To: Andrei Banu; +Cc: linux-ext4
On Mon, Oct 21, 2013 at 08:42:50PM +0300, Andrei Banu wrote:
> Hi,
>
> Meantime I've created another md device (just 5GB) and I've redone
> the tests. I believe
> this is easier and less risky than remounting an used md device.
>
> root [/home2]# mount -l | grep md3
> /dev/md3 on /home2 type ext4 (rw,barrier=0)
>
> root [/home2]# dd bs=2M count=64 if=/dev/zero of=test6 conv=fdatasync
> 64+0 records in
> 64+0 records out
> 134217728 bytes (134 MB) copied, 12.3287 s, 10.9 MB/s
>
> So the speed issue is still with us I believe.
Thanks for doing this. It seems that the problems we hit are different.
>
> Is there some way to check the barrier is really set to 0?
You have already seen from the output of the 'mount' command that
barrier is 0. You can 'cat /proc/mounts' to double-check, but it should
show the same thing.
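For example (md3 taken from your mount output above):
  $ grep md3 /proc/mounts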
Regards,
- Zheng
>
> Thanks a lot!
>
> On 10/21/2013 4:53 PM, Zheng Liu wrote:
> >Hi Andrei,
> >
> >Could you please disable barrier for ext4 and try your 'dd' test again?
> > $ sudo mount -t ext4 -o remount,barrier=0 ${DEV} ${MNT}
> >
> >*WARNING: you could lost your data with barrier=0 when you get a power
> >failure or cold reset.*
> >
> >We have met a similar problem that is because some SSDs couldn't handle
> >barrier command properly.
> >
> >Regards,
> > - Zheng
>
* Re: Weird jbd2 I/O load
From: Andrei Banu @ 2013-10-22 7:22 UTC
To: linux-ext4
Hi,
Thank you for trying to help me! By the way, I have checked /proc/mounts
and it showed barrier=0, so I guess that is that.
Unfortunately this leaves me with only one thing to test: swapping the
two physical SSDs with each other and seeing whether the apparent
problems move from sdb to sda or stay on sdb. If they move, I guess it's
the SSD currently on sdb; if they stay, the problem probably lies
elsewhere. I am just not sure it's safe to swap the two devices of an
mdraid array.
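If it comes to that, the usual way to take one member out of the mirror
and put it back, rather than physically swapping cables first, would be
roughly the following (the partition name is only a guess, and the array
runs degraded until the re-added member finishes resyncing):
  root [~]# mdadm /dev/md2 --fail /dev/sdb3
  root [~]# mdadm /dev/md2 --remove /dev/sdb3
  (test or replace the drive, then)
  root [~]# mdadm /dev/md2 --add /dev/sdb3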
Thanks again and kind regards!
On 10/22/2013 5:57 AM, Zheng Liu wrote:
> On Mon, Oct 21, 2013 at 08:42:50PM +0300, Andrei Banu wrote:
>> Hi,
>>
>> Meantime I've created another md device (just 5GB) and I've redone
>> the tests. I believe
>> this is easier and less risky than remounting an used md device.
>>
>> root [/home2]# mount -l | grep md3
>> /dev/md3 on /home2 type ext4 (rw,barrier=0)
>>
>> root [/home2]# dd bs=2M count=64 if=/dev/zero of=test6 conv=fdatasync
>> 64+0 records in
>> 64+0 records out
>> 134217728 bytes (134 MB) copied, 12.3287 s, 10.9 MB/s
>>
>> So the speed issue is still with us I believe.
> Thanks for doing this. It seems that the problem we met are different.
>
>> Is there some way to check the barrier is really set to 0?
> You have seen that from the output of 'mount' command barrier is 0. You
> can 'cat /proc/mounts' to double-check it. But it should be the same.
>
> Regards,
> - Zheng