linux-raid.vger.kernel.org archive mirror
* RAID1 performance and "task X blocked for more than 120 seconds"
@ 2012-11-18 18:39 Martijn
  2012-11-18 21:05 ` Stan Hoeppner
  0 siblings, 1 reply; 5+ messages in thread
From: Martijn @ 2012-11-18 18:39 UTC (permalink / raw)
  To: linux-raid

Good day,

I have a machine here that suffers from poor performance that I'm not 
able to explain, even with previous tips from this list.

Some details:
- Ubuntu Linux 10.04 LTS install
- mdadm 2.6.7.1-1ubuntu15

- RAID is a RAID1 array with 3 disks: 2 active, 1 spare
- a partitionable device is used: /dev/md127

- Mount lists the following partitions:
/dev/md127p2 on / type ext3 (rw)
/dev/md127p6 on /tmp type ext3 (rw,noexec,noatime)
/dev/md127p1 on /boot type ext2 (rw)
/dev/md127p3 on /home type ext3 (rw,noatime,usrquota)
/dev/md127p5 on /var type ext3 (rw,noatime)

- All 3 disks are connected over SATA 150 to a K8S Pro S2882 motherboard 
(http://www.tyan.com/archive/products/html/thunderk8spro_spec.html)
- Disks are all Seagate Barracuda 7200.12 ST31000528AS, 1TB.
- Smart finds the disks all in good condition

These tips from earlier performance questions were checked and applied 
already:
- The RAID member disks, /dev/sda and /dev/sdb, both have their 
scheduler set to deadline.
- NCQ is disabled by setting queue_depth to 1.
- The system has run without problems for almost 250 days. During this 
time, no large copy-actions were done. Installation and other processes 
were completed without trouble or any sign of performance problems.
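
For reference, the two settings above were applied along these lines (a 
sketch from memory, run as root; sda and sdb are this machine's RAID 
members):

```shell
# Set the deadline I/O scheduler on both RAID member disks:
echo deadline > /sys/block/sda/queue/scheduler
echo deadline > /sys/block/sdb/queue/scheduler
# Disable NCQ by forcing the command queue depth to 1:
echo 1 > /sys/block/sda/device/queue_depth
echo 1 > /sys/block/sdb/device/queue_depth
```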

The problem:
I was copying 3 GB of data using rsync, from another server to this 
machine over a 100 Mbit connection. After some time it appeared to me as 
if one of the two systems was having trouble keeping up. Copying speed 
was a few MB/s and the transfer sometimes stopped for a longer period of 
time before continuing again.

Looking at the receiving system, I noticed this in syslog:
task kjournald blocked for more than 120 seconds
task dpkg-preconfigure blocked for more than 120 seconds
[...]

dpkg-preconfigure being a process running at that time.

Eventually, the copy completed. But some time after the copy was 
completed, I still noticed a high (50-80%) %iowait and 2000 to 4000 
blocks being written to sda and sdb. I monitored this using iostat.
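
For completeness, I was watching with something like this (5-second 
intervals; iostat comes from the sysstat package):

```shell
# Extended per-device statistics every 5 seconds. %iowait is in the
# avg-cpu line; writes show up in the wkB/s (or Blk_wrtn/s) column
# for sda and sdb.
iostat -x 5
```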

I waited for the system to return to 0 writes and a load of near 0, then 
attempted to copy the data on disk from directory A to B, and the same 
problem occurred.

This is a pretty normal Ubuntu 10.04 installation on a machine that I've 
used with mdadm RAID1 before with much older disks, performing without 
any problem.

Basically:
Any tips on how to trace this and how to fix it?

Thanks in advance,
- Martijn

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: RAID1 performance and "task X blocked for more than 120 seconds"
  2012-11-18 18:39 RAID1 performance and "task X blocked for more than 120 seconds" Martijn
@ 2012-11-18 21:05 ` Stan Hoeppner
  2012-11-18 21:53   ` Martijn
  0 siblings, 1 reply; 5+ messages in thread
From: Stan Hoeppner @ 2012-11-18 21:05 UTC (permalink / raw)
  To: mailinglist; +Cc: linux-raid

On 11/18/2012 12:39 PM, Martijn wrote:

> - Disks are all Seagate Barracuda 7200.12 ST31000528AS, 1TB.
> - NCQ is disabled by setting queue_depth to 1.

WRT write throughput, you have effectively a single 7.2k spindle.  The
only way to get lower performance is a 5xxx RPM 'green' or laptop drive.
 This is a low performance machine.

> The problem:
> I was copying 3 GB of data using rsync, from another server to this
> machine over a 100 Mbit connection. After some time it appeared to me as
> if one of the two systems was having trouble keeping up. Copying speed
> was a few MB/s and the transfer sometimes stopped for a longer period of
> time before continuing again.
> 
> Looking at the receiving system, I noticed this in syslog:
> task kjournald blocked for more than 120 seconds
> task dpkg-preconfigure blocked for more than 120 seconds
> [...]
> 
> dpkg-preconfigure being a process running at that time.

Multiple disk intensive processes running concurrently.

> Eventually, the copy completed. But some time after the copy was
> completed, I still noticed a high (50-80%) %iowait and 2000 to 4000
> blocks being written to sda and sdb. I monitored this using iostat.

This is the buffer cache flushing.

> I waited for the system to return to 0 writes and a load of near 0, then
> attempted to copy the data on disk from directory A to B, and the same
> problem occurred.

Your previously mentioned symptoms were leading me to this, but this one
kinda seals the deal.  This sounds like classic filesystem free space
fragmentation.  What filesystem is this?  The 3GB of files--are they
large or small files?

-- 
Stan




* Re: RAID1 performance and "task X blocked for more than 120 seconds"
  2012-11-18 21:05 ` Stan Hoeppner
@ 2012-11-18 21:53   ` Martijn
  2012-11-19 16:51     ` Stan Hoeppner
  0 siblings, 1 reply; 5+ messages in thread
From: Martijn @ 2012-11-18 21:53 UTC (permalink / raw)
  To: linux-raid; +Cc: stan

Thank you for your response Stan.

On 18-11-2012 22:05, Stan Hoeppner wrote:
> On 11/18/2012 12:39 PM, Martijn wrote:
>> - Disks are all Seagate Barracuda 7200.12 ST31000528AS, 1TB.
>> - NCQ is disabled by setting queue_depth to 1.
> WRT write throughput, you have effectively a single 7.2k spindle.  The
> only way to get lower performance is a 5xxx RPM 'green' or laptop drive.
>   This is a low performance machine.
It's certainly not a top notch performance machine and I know that. It's 
old hardware. The disks are the newest component. For the record: no 
great performance is needed. I only expect the machine to behave 
normally under normal ("copying a few files") circumstances.

For comparison:
I've (had) more of these machines, working well, with mdadm RAID1 on 
much lower performance disks. Same motherboard. Same (deadline) 
scheduler. A difference is they don't use a partitionable device, but 
separate partitions in RAID1, so /dev/sda1 mirroring /dev/sdb1, and so 
on. Also, a different OS: an older version of Gentoo.

They never had any trouble keeping up and certainly never had an entry 
in the syslog like the one I got now. Actually I just tried copying that 
same 3 GB of files, and it worked flawlessly. No hiccups and without 
starving the machine. That is even while it's in production, under some 
load.

%iowait on that machine is around 30% while copying. When the copy is 
done, writing very quickly returns to 0 blocks/s on that machine.

>> The problem:
>> I was copying 3 GB of data using rsync, from another server to this
>> machine over a 100 Mbit connection. After some time it appeared to me as
>> if one of the two systems was having trouble keeping up. Copying speed
>> was a few MB/s and the transfer sometimes stopped for a longer period of
>> time before continuing again.
>>
>> Looking at the receiving system, I noticed this in syslog:
>> task kjournald blocked for more than 120 seconds
>> task dpkg-preconfigure blocked for more than 120 seconds
>> [...]
>>
>> dpkg-preconfigure being a process running at that time.
>
> Multiple disk intensive processes running concurrently.
The dpkg-preconfigure was a coincidence. It wasn't running when I did 
the local copy. The syslog entries then mentioned a few vim editors I 
had open to edit config files.

>> Eventually, the copy completed. But some time after the copy was
>> completed, I still noticed a high (50-80%) %iowait and 2000 to 4000
>> blocks being written to sda and sdb. I monitored this using iostat.
>
> This is the buffer cache flushing.
>
>> I waited for the system to return to 0 writes and a load of near 0, then
>> attempted to copy the data on disk from directory A to B, and the same
>> problem occurred.
>
> Your previously mentioned symptoms were leading me to this, but this one
> kinda seals the deal.  This sounds like classic filesystem free space
> fragmentation.  What filesystem is this?  The 3GB of files--are they
> large or small files?

Except for /boot, it's all ext3. Free space:
Filesystem            Size  Used Avail Use% Mounted on
/dev/md127p2           60G  1.1G   56G   2% /
/dev/md127p6          7.9G  149M  7.4G   2% /tmp
/dev/md127p1          243M   19M  212M   9% /boot
/dev/md127p3          709G  6.5G  667G   1% /home
/dev/md127p5          119G  420M  112G   1% /var

Less than 10% usage on every partition. The filesystems have always been 
empty. This data was amongst the very first data written to the /home. 
All partitions were created using standard Linux fdisk and then 
formatted using mkfs.ext3.

The 3GB consists of very mixed content: mostly small files (~1KB), and 
just a few bigger (50MB+).
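
If it helps, this is roughly how I'd count the size mix (demonstrated on 
a scratch directory so it's self-contained; in practice you'd point find 
at the real data path):

```shell
# Build a small scratch directory to demonstrate against:
d=$(mktemp -d)
printf 'x' > "$d/small1"
printf 'x' > "$d/small2"
dd if=/dev/zero of="$d/big" bs=1024 count=100 2>/dev/null
# Count files under 4 KiB and files over 50 KiB (tr strips wc padding):
echo "small: $(find "$d" -type f -size -4k | wc -l | tr -d ' ')"
echo "big: $(find "$d" -type f -size +50k | wc -l | tr -d ' ')"
rm -rf "$d"
```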

Thanks,
- Martijn


* Re: RAID1 performance and "task X blocked for more than 120 seconds"
  2012-11-18 21:53   ` Martijn
@ 2012-11-19 16:51     ` Stan Hoeppner
  2012-11-20  9:14       ` Martijn
  0 siblings, 1 reply; 5+ messages in thread
From: Stan Hoeppner @ 2012-11-19 16:51 UTC (permalink / raw)
  To: mailinglist; +Cc: linux-raid

On 11/18/2012 3:53 PM, Martijn wrote:
> Thank you for your response Stan.
> 
> On 18-11-2012 22:05, Stan Hoeppner wrote:
>> On 11/18/2012 12:39 PM, Martijn wrote:
>>> - Disks are all Seagate Barracuda 7200.12 ST31000528AS, 1TB.
>>> - NCQ is disabled by setting queue_depth to 1.
>> WRT write throughput, you have effectively a single 7.2k spindle.  The
>> only way to get lower performance is a 5xxx RPM 'green' or laptop drive.
>>   This is a low performance machine.
> It's certainly not a top notch performance machine and I know that. It's
> old hardware. The disks are the newest component. For the record: no
> great performance is needed. I only expect the machine to behave
> normally under normal ("copying a few files") circumstances.
> 
> For comparison:
> I've (had) more of these machines, working well, with mdadm RAID1 on
> much lower performance disks. Same motherboard. Same (deadline)
> scheduler. A difference is they don't use a partitionable device, but
> separate partitions in RAID1, so /dev/sda1 mirroring /dev/sdb1, and so
> on. Also, a different OS: an older version of Gentoo.
> 
> They never had any trouble keeping up and certainly never had an entry
> in the syslog like the one I got now. Actually I just tried copying that
> same 3 GB of files, and it worked flawlessly. No hiccups and without
> starving the machine. That is even while it's in production, under some
> load.
> 
> %iowait on that machine is around 30% while copying. When the copy is
> done, writing very quickly returns to 0 blocks/s on that machine.
> 
>>> The problem:
>>> I was copying 3 GB of data using rsync, from another server to this
>>> machine over a 100 Mbit connection. After some time it appeared to me as
>>> if one of the two systems was having trouble keeping up. Copying speed
>>> was a few MB/s and the transfer sometimes stopped for a longer period of
>>> time before continuing again.
>>>
>>> Looking at the receiving system, I noticed this in syslog:
>>> task kjournald blocked for more than 120 seconds
>>> task dpkg-preconfigure blocked for more than 120 seconds
>>> [...]
>>>
>>> dpkg-preconfigure being a process running at that time.
>>
>> Multiple disk intensive processes running concurrently.
> The dpkg-preconfigure was a coincidence. It wasn't running when I did
> the local copy. The syslog entries then mentioned a few vim editors I
> had open to edit config files.
> 
>>> Eventually, the copy completed. But some time after the copy was
>>> completed, I still noticed a high (50-80%) %iowait and 2000 to 4000
>>> blocks being written to sda and sdb. I monitored this using iostat.
>>
>> This is the buffer cache flushing.
>>
>>> I waited for the system to return to 0 writes and a load of near 0, then
>>> attempted to copy the data on disk from directory A to B, and the same
>>> problem occurred.
>>
>> Your previously mentioned symptoms were leading me to this, but this one
>> kinda seals the deal.  This sounds like classic filesystem free space
>> fragmentation.  What filesystem is this?  The 3GB of files--are they
>> large or small files?
> 
> Except for /boot, it's all ext3. Free space:
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/md127p2           60G  1.1G   56G   2% /
> /dev/md127p6          7.9G  149M  7.4G   2% /tmp
> /dev/md127p1          243M   19M  212M   9% /boot
> /dev/md127p3          709G  6.5G  667G   1% /home
> /dev/md127p5          119G  420M  112G   1% /var
> 
> Less than 10% usage on every partition. The filesystems have always been
> empty. This data was amongst the very first data written to the /home.
> All partitions were created using standard Linux fdisk and then
> formatted using mkfs.ext3.
> 
> The 3GB consists of very mixed content: mostly small files (~1KB), and
> just a few bigger (50MB+).

Ok, so it's not a fragmentation issue.  When you copied the 3GB of files
from one directory to another on /home you said you experienced the
"same problem".  Could you describe that in more detail?  Also, do you
see any lines in dmesg showing any SATA links being reset, or any
disk/interface messages, either warnings or errors?
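
Off the top of my head, something like this should surface them if 
they're there:

```shell
# Pull link resets, ATA exceptions, and I/O errors out of the kernel log:
dmesg | grep -iE 'ata[0-9]|reset|exception|error|fail'
```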


-- 
Stan



* Re: RAID1 performance and "task X blocked for more than 120 seconds"
  2012-11-19 16:51     ` Stan Hoeppner
@ 2012-11-20  9:14       ` Martijn
  0 siblings, 0 replies; 5+ messages in thread
From: Martijn @ 2012-11-20  9:14 UTC (permalink / raw)
  To: linux-raid; +Cc: stan

Hi Stan,

On 19-11-2012 17:51, Stan Hoeppner wrote:
> On 11/18/2012 3:53 PM, Martijn wrote:
>> Thank you for your response Stan.
>> On 18-11-2012 22:05, Stan Hoeppner wrote:
>>> On 11/18/2012 12:39 PM, Martijn wrote:
>>> Your previously mentioned symptoms were leading me to this, but this one
>>> kinda seals the deal.  This sounds like classic filesystem free space
>>> fragmentation.  What filesystem is this?  The 3GB of files--are they
>>> large or small files?
>>
>> Except for /boot, it's all ext3. Free space:
>> Filesystem            Size  Used Avail Use% Mounted on
>> /dev/md127p2           60G  1.1G   56G   2% /
>> /dev/md127p6          7.9G  149M  7.4G   2% /tmp
>> /dev/md127p1          243M   19M  212M   9% /boot
>> /dev/md127p3          709G  6.5G  667G   1% /home
>> /dev/md127p5          119G  420M  112G   1% /var
>>
>> Less than 10% usage on every partition. The filesystems have always been
>> empty. This data was amongst the very first data written to the /home.
>> All partitions where created using standard Linux fdisk and then
>> formatted using mkfs.ext3.
>>
>> The 3GB consists of very mixed content: mostly small files (~1KB), and
>> just a few bigger (50MB+).
>
> Ok, so it's not a fragmentation issue.  When you copied the 3GB of files
> from one directory to another on /home you said you experienced the
> "same problem".  Could you describe that in more detail?  Also, do you
> see any lines in dmesg showing any SATA links being reset, or any
> disk/interface messages, either warnings or errors?

Thank you for taking the time to look into this. I have some news. 
Although it's not 100% clear what exactly the cause is, chances are good 
it's hardware and not software.

On your questions:
- dmesg, logs in general: no entries suggesting ANY sort of trouble with 
SATA, or any of the disks. Actually, no problems at all.
- SMART on all drives looks good. (more on this later)
- Settings for devices in /sys all seem perfectly the same
- hdparm settings all seem equal for all disks
- No settings in BIOS that catch my attention. There's not that much to 
set up anyway ;-)

The copy is a (not very exciting) cp -rv /old/ /new with source and 
destination on the same partition. Immediately after it starts, the 
machine starts to feel sluggish. Very shortly after that, other programs 
you want to run start to pause for... many seconds, even minutes. This 
continues until some time after the copy is done. Meanwhile, the 
log entries mentioned earlier appear.

Watching the list of files go by with -v, the copy sometimes just pauses 
for a while on a file that isn't big, then continues copying again after 
some time, then pauses again, and so on.

Last night I had a look together with a colleague and he suggested 
testing write performance on the separate disks to better isolate the 
problem: determine whether it's write performance in general, or just 
that of the software RAID.

- Read performances are good across all disks.

We tested write performance on the spare (sdc) and that seemed OK. Then 
we marked sda as failed and had the RAID rebuild itself with sdb and sdc 
as active devices.

- So, write performance on spare sdc seems good.

Then, we tested write performance on sda:

- Write performance on sda looks bad. 1/4th of that of sdc.
- Write performance on sdb also looks bad, but we've not yet been able 
to test its performance outside the RAID set. The disk seems to 'lag' 
behind sdc when writing.

We've put in an extra disk (sdd) which seems to have good write 
performance when tested separately, and we plan to rebuild the RAID on 
sdc and sdd, then see what the true write performance of sdb is.
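
The write tests themselves were plain sequential dd runs, along these 
lines (shown here against a scratch file; pointing of= at a raw device 
node instead gives the true per-disk number but destroys its contents):

```shell
# Sequential write test. conv=fsync makes dd flush to disk before
# reporting, so the MB/s figure isn't just the page cache; on a real
# disk you can also add oflag=direct to bypass the cache entirely.
dd if=/dev/zero of=/tmp/ddtest bs=1M count=16 conv=fsync 2>&1
rm -f /tmp/ddtest
```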

My colleague also noted that:

- Requesting SMART regularly fails when the disk is still writing.
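
We were polling SMART with smartctl, roughly like this (sda as an 
example):

```shell
# Overall health verdict, then the raw attributes. Reallocated_Sector_Ct
# and UDMA_CRC_Error_Count are the usual suspects for distinguishing a
# bad disk from a bad cable.
smartctl -H /dev/sda
smartctl -A /dev/sda
```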

So summing it up, it certainly seems that this is all caused by some 
sort of hardware issue: either two of the disks (the two formerly active 
devices in the RAID), the cables, or the controller. And, more 
importantly, it's probably not related to software.

- Martijn

