From: Martijn <mailinglist@mindconnect.nl>
To: linux-raid@vger.kernel.org
Cc: stan@hardwarefreak.com
Subject: Re: RAID1 performance and "task X blocked for more than 120 seconds"
Date: Sun, 18 Nov 2012 22:53:18 +0100
Message-ID: <50A958CE.5030902@mindconnect.nl>
In-Reply-To: <50A94D9F.3060601@hardwarefreak.com>
Thank you for your response Stan.
On 18-11-2012 22:05, Stan Hoeppner wrote:
> On 11/18/2012 12:39 PM, Martijn wrote:
>> - Disks are all Seagate Barracuda 7200.12 ST31000528AS, 1TB.
>> - NCQ is disabled by setting queue_depth to 1.
> WRT write throughput, you have effectively a single 7.2k spindle. The
> only way to get lower performance is a 5xxx RPM 'green' or laptop drive.
> This is a low performance machine.
It's certainly not a top notch performance machine and I know that. It's
old hardware. The disks are the newest component. For the record: no
great performance is needed. I only expect the machine to behave
normally under normal ("copying a few files") circumstances.
For comparison:
I've (had) more of these machines, working well, with mdadm RAID1 on
much lower performance disks. Same motherboard. Same (deadline)
scheduler. A difference is they don't use a partitionable device, but
separate partitions in RAID1, so /dev/sda1 mirroring /dev/sdb1, and so
on. Also, a different OS: an older version of Gentoo.
They never had any trouble keeping up and certainly never had an entry
in the syslog like the one I got now. Actually I just tried copying that
same 3 GB of files, and it worked flawlessly. No hiccups and without
starving the machine. That is even while it's in production, under some
load.
%iowait on that machine is around 30% while copying. When the copy is
done, writing very quickly returns to the normal 0 blocks/s on that machine.
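For anyone who wants to compare the two machines without iostat, here is
a minimal sketch of the same measurement, assuming the usual
/proc/diskstats layout in which the tenth whitespace-separated column is
the cumulative count of 512-byte sectors written. The function names and
the 5 second default interval are illustrative choices, not anything
from sysstat.

```python
import time


def sectors_written(diskstats_text, device):
    """Return the cumulative 'sectors written' counter for one device."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Columns: major, minor, name, then the I/O counters.
        # Index 9 (the tenth column) is sectors written.
        if len(fields) > 9 and fields[2] == device:
            return int(fields[9])
    raise KeyError(device)


def write_rate(before_text, after_text, device, interval_s):
    """Sectors written per second between two diskstats snapshots."""
    delta = sectors_written(after_text, device) - sectors_written(before_text, device)
    return delta / interval_s


def sample(device, interval_s=5.0, path="/proc/diskstats"):
    """Take two snapshots interval_s apart and return the write rate."""
    with open(path) as f:
        before = f.read()
    time.sleep(interval_s)
    with open(path) as f:
        after = f.read()
    return write_rate(before, after, device, interval_s)
```

Calling sample("sda") and sample("sdb") after the copy has finished
should show whether the disks are still being written to, which is the
behaviour that differs between the two machines.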
>> The problem:
>> I was copying 3 GB of data using rsync, from another server to this
>> machine over a 100 mbit connection. After some time it appeared to me as
>> if one of the two systems was having trouble keeping up. Copying speed
>> was a few MB/s and the transfer sometimes stopped for a longer period of
>> time, then to continue again.
>>
>> Looking at the receiving system, I noticed this in syslog:
>> task kjournald blocked for more than 120 seconds
>> task dpkg-preconfigure blocked for more than 120 seconds
>> [...]
>>
>> dpkg-preconfigure being a process running at that time.
>
> Multiple disk intensive processes running concurrently.
The dpkg-preconfigure run was a coincidence. It wasn't running when I did
the local copy. The syslog entries then mentioned a few vim editors I
had open to edit config files.
>> Eventually, the copy completed. But some time after the copy was
>> completed, I still noticed a high (50-80%) %iowait and 2000 to 4000
>> blocks being written to sda and sdb. I monitored this using iostat.
>
> This is the buffer cache flushing.
>
>> I waited for the system to return to 0 writes and a load of near 0 when
>> I attempted to copy the data on disk from directory A to B, and the same
>> problem occurred.
>
> Your previously mentioned symptoms were leading me to this, but this one
> kinda seals the deal. This sounds like classic filesystem free space
> fragmentation. What filesystem is this? The 3GB of files--are they
> large or small files?
Except for /boot, it's all ext3. Free space:
Filesystem      Size  Used Avail Use% Mounted on
/dev/md127p2     60G  1.1G   56G   2% /
/dev/md127p6    7.9G  149M  7.4G   2% /tmp
/dev/md127p1    243M   19M  212M   9% /boot
/dev/md127p3    709G  6.5G  667G   1% /home
/dev/md127p5    119G  420M  112G   1% /var
Less than 10% usage on every partition. The filesystems have always been
empty. This data was amongst the very first data written to /home.
All partitions were created using standard Linux fdisk and then
formatted using mkfs.ext3.
The 3GB consists of very mixed content: mostly small files (~1KB), and
just a few bigger (50MB+).
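To put numbers on "mostly small, a few big", a rough sketch like the one
below could be run over the copied directory. The bucket thresholds are
assumptions chosen for illustration: 4 KiB as the upper bound for
"small" and 50 MiB as the lower bound for "large", matching the figures
mentioned above only approximately. The function name is hypothetical.

```python
import os
from collections import Counter


def size_buckets(root, small_max=4096, large_min=50 * 1024 * 1024):
    """Count files and total bytes under root in small/medium/large buckets."""
    counts = Counter()
    total_bytes = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so a link counts as itself.
                size = os.lstat(path).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size <= small_max:
                bucket = "small"
            elif size >= large_min:
                bucket = "large"
            else:
                bucket = "medium"
            counts[bucket] += 1
            total_bytes[bucket] += size
    return counts, total_bytes
```

The two counters returned show both how many files fall in each bucket
and how much of the 3 GB each bucket accounts for, which are different
questions when the small files dominate by count but not by size.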
Thanks,
- Martijn