public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Dimitrios Apostolou <jimis@gmx.net>
To: "Rafał Bilski" <rafalbilski@interia.pl>
Cc: linux-kernel@vger.kernel.org, Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: high system cpu load during intense disk i/o
Date: Thu, 9 Aug 2007 11:17:50 +0300	[thread overview]
Message-ID: <200708091117.51687.jimis@gmx.net> (raw)
In-Reply-To: <46BA14A2.5030705@interia.pl>

Hi Rafal, thank you for your help! 

On Wednesday 08 August 2007 22:08:18 Rafał Bilski wrote:
> > Hello again,
>
> Hi,
>
> > I 'm now using libata on the same system described before (see attached
> > dmesg.txt). When writing to both disks I think the problem is now worse
> > (pata_oprof_bad.txt, pata_vmstat_bad.txt), even the oprofile script
> > needed half an hour to complete! For completion I also attach the same
> > tests when I write to only one disk (pata_vmstat_1disk.txt,
> > pata_oprof_1disk.txt), whence everything is normal.
> >
> > FWIW, libata did not give me any performance benefit, 20MB/s is again the
> > peak hdparm reports.
>
> This OProfile thing is extremly not usefull in this case. It says that Your
> system is using 25% CPU time for no-op loops, but it doesn't say why. Your
> system really isn't very busy. Look here:
> procs -----------memory---------- ---swap-- -----io---- -system--
> ----cpu---- r  b   swpd   free   buff  cache   si   so    bi    bo   in  
> cs us sy id wa 2  2      0 225352   5604  91700    0    0 18112  1664 28145
> 6315 29 71  0  0 2  2      0 225360   5604  91700    0    0 18496  1664
> 27992 6358 30 70  0  0 1  2      0 225360   5604  91700    0    0 18432 
> 1472 28511 6315 28 72  0  0 1  2      0 225360   5604  91700    0    0
> 18240  1536 28031 6153 31 69  0  0 + video 720x576 25fps yuv stream over
> PCI. And system is fully responsive. Of course programs which need disk
> access must wait a bit longer, but later are running fine.
> I don't have disks so fast like Yours and I can't do destructive write
> test. First disk:
> 1  1      0 241848   7312 100768    0    0 27712     0  927 1270 29 13  0
> 58 1  1      0 241052   7580 100896    0    0  4612  4676  519  702 34 12 
> 0 54 Second disk:
> 0  1      0 237752   7268 100980    0    0  6464     0  468  583 37 10  0
> 53 0  1      0 241060   7532 100884    0    0  1728  1728  465  578 31  9 
> 0 60 Both:
> 0  2      0 241592   7384 100776    0    0 33024     0  905 1415 33 16  0
> 51 1  2      0 240804   7528 100884    0    0  6848  6848  642  780 38 10 
> 0 52 So sda + sdc = both.
>
> Your single disk:
> 0  1      0 128804  19620  82484    0    0     0 21120  335  675  0  4  0
> 96 Both:
> 5  2      0 168480  10972  47152    0    0     0 16000  252  470 22 78  0 
> 0 I would expect 2*21k, but we have only 2*8k and it is lower then single
> disk case. Of course this math isn't moving us forward. Only thing which
> would help would be function call trace as Andrew wrote. Which function is
> calling delay_tsc()? Is it calling it often or once but with long delay?

Please guide me on doing these call traces. Unfortunately I don't have any 
experience at all regarding kernel-space debugging. What tools should I use? 
Keep in mind also that I mostly have remote access to this PC. 

However I am not really confident that the cause is delay_tsc(). It only 
appeared on the libata tests. To summarize I will write down the first lines 
of all problematic oprofilings I've done until now: 

first test: 
3832     23.4818  __switch_to
3380     20.7121  schedule
653       4.0015  mask_and_ack_8259A
452       2.7698  __blockdev_direct_IO
327       2.0038  dequeue_task

second test, with idle=poll (which didn't really help, it seems the cpu is not 
idling at all): 
2477     31.2871  __switch_to
1625     20.5255  schedule
246       3.1072  mask_and_ack_8259A
222       2.8041  __blockdev_direct_IO
147       1.8568  follow_page

third test, libata and idle=poll (the problem felt like it was greater here, 
oprofiling took a really long time):
9556     26.0523  delay_tsc
6424     17.5136  iowrite8
6178     16.8430  __switch_to
4015     10.9460  schedule
3335      9.0921  ioread8
1429      3.8959  iowrite32


So I would assume that delay_tsc() probably only makes the situation worse for 
the libata tests, but the real problem is at __switch_to() and schedule(). Do 
you agree with these assumptions? What are these functions used for? My guess 
would be that __switch_to() is for context switching and schedule() for 
process scheduling. However the context switch rate, as reported by vmstat, 
is not that great to verify the time lost in __switch_to(). As shown from the 
measurements on one disk the context switching rate is greater, but the time 
spent is much less. 

Is there a way to for oprofile to report the time spent in function calls 
depending on the call trace?

> So far it looks like some kind of hardware limit for me. Do You have any
> options in BIOS which can degrade PCI or disk performance?

None that I can think of. Do you have any specific options in mind?


Thanks again for the help. I guess if that doesn't lead anywhere I'll just 
start compiling older vanilla kernels and see when the problem dissapears. 
But this needs a lot of time and I'm not sure for how long I'll be able to 
not offer any service with those disks (I was supposed to use RAID 0 to 
provide storage space with them, but with the current problems that wouldn't 
be so wise). 

Dimitris

>
> > Thanks,
> > Dimitris
>
> Rafał

  reply	other threads:[~2007-08-09  8:21 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-08-03 16:03 high system cpu load during intense disk i/o Dimitrios Apostolou
2007-08-05 16:03 ` Dimitrios Apostolou
2007-08-05 17:58   ` Rafał Bilski
2007-08-05 18:42     ` Dimitrios Apostolou
2007-08-05 20:08       ` Rafał Bilski
2007-08-06 16:14       ` Rafał Bilski
2007-08-06 19:18         ` Dimitrios Apostolou
2007-08-06 19:48           ` Alan Cox
2007-08-07  0:40             ` Dimitrios Apostolou
2007-08-07  0:37               ` Alan Cox
2007-08-07 13:15                 ` Dimitrios Apostolou
2007-08-06 22:12           ` Rafał Bilski
2007-08-07  0:49             ` Dimitrios Apostolou
2007-08-07  9:03               ` Rafał Bilski
2007-08-07  9:43                 ` Dimitrios Apostolou
2007-08-06  1:28   ` Andrew Morton
2007-08-06 14:20     ` Dimitrios Apostolou
2007-08-06 17:33       ` Andrew Morton
2007-08-06 19:27         ` Dimitrios Apostolou
2007-08-06 20:04         ` Dimitrios Apostolou
2007-08-06 16:09     ` Dimitrios Apostolou
2007-08-07 14:50 ` Dimitrios Apostolou
2007-08-08 19:08   ` Rafał Bilski
2007-08-09  8:17     ` Dimitrios Apostolou [this message]
2007-08-10  7:06       ` Rafał Bilski
2007-08-17 23:19         ` Dimitrios Apostolou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200708091117.51687.jimis@gmx.net \
    --to=jimis@gmx.net \
    --cc=akpm@linux-foundation.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rafalbilski@interia.pl \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox