From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
To: Karl Vogel <karl.vogel@seagha.com>, axboe@suse.de
Cc: linux-kernel@vger.kernel.org
Subject: Re: Kernel 2.6.8.1: swap storm of death - CFQ scheduler=culprit
Date: Mon, 23 Aug 2004 11:12:06 -0300
Message-ID: <20040823141206.GE2157@logos.cnet>
In-Reply-To: <m33c2f56ck.fsf_-_@seagha.com>

On Sun, Aug 22, 2004 at 09:18:51PM +0200, Karl Vogel wrote:
> When using elevator=as I'm unable to trigger the swap of death, so it seems
> that the CFQ scheduler is at blame here.
> 
> With AS scheduler, the system recovers in +-10 seconds, vmstat output during
> that time:
> 
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
>  1  0      0 295632  40372  49400   87  278   324   303 1424   784  7  2 78 13
>  0  0      0 295632  40372  49400    0    0     0     0 1210   648  3  1 96  0
>  0  0      0 295632  40372  49400    0    0     0     0 1209   652  4  0 96  0
>  2  0      0 112784  40372  49400    0    0     0     0 1204   630 23 34 43  0
>  1  9 156236    788    264   8128   28 156220  3012 156228 3748  3655 11 31  0 59
>  0 15 176656   2196    280   8664    0 20420   556 20436 1108   374  2  5  0 93
>  0 17 205320    724    232   7960   28 28664   396 28664 1118   503  7 12  0 81
>  2 12 217892   1812    252   8556  248 12584   864 12584 1495   318  2  7  0 91
>  4 14 253268   2500    268   8728  188 35392   432 35392 1844   399  3  7  0 90
>  0 13 255692   1188    288   9152  960 2424  1408  2424 1173  2215 10  5  0 85
>  0  7 266140   2288    312   9276  604 10468   752 10468 1248   644  5  5  0 90
>  0  7 190516 340636    348   9860 1400    0  2016     0 1294   817  4  8  0 88
>  1  8 190516 339460    384  10844  552    0  1556     4 1241   642  3  1  0 96
>  1  3 190516 337084    404  11968 1432    0  2576     4 1292   788  3  1  0 96
>  0  6 190516 333892    420  13612 1844    0  3500     0 1343   850  5  2  0 93
>  0  1 190516 333700    424  13848  480    0   720     0 1250   654  3  2  0 95
>  0  1 190516 334468    424  13848  188    0   188     0 1224   589  3  2  0 95
> 
> With CFQ processes got stuck in 'D' and never left that state. See URL's in my
> initial post for diagnostics.

I can confirm this on a 512MB box with 512MB swap (2.6.8-rc4). Using CFQ, the machine swaps out
400 MB; with AS it swaps out 30 MB.

The heavy swap-out under CFQ leads to allocation failures, etc.

CFQ allocates a huge number of bio/biovecs:

 cat /proc/slabinfo | grep bio
biovec-(256)         256    256   3072    2    2 : tunables   24   12    0 : slabdata    128    128      0
biovec-128           256    260   1536    5    2 : tunables   24   12    0 : slabdata     52     52      0
biovec-64            265    265    768    5    1 : tunables   54   27    0 : slabdata     53     53      0
biovec-16            260    260    192   20    1 : tunables  120   60    0 : slabdata     13     13      0
biovec-4             272    305     64   61    1 : tunables  120   60    0 : slabdata      5      5      0
biovec-1          121088 122040     16  226    1 : tunables  120   60    0 : slabdata    540    540      0
bio               121131 121573     64   61    1 : tunables  120   60    0 : slabdata   1992   1993      0


biovec-(256)         256    256   3072    2    2 : tunables   24   12    0 : slabdata    128    128      0
biovec-128           256    260   1536    5    2 : tunables   24   12    0 : slabdata     52     52      0
biovec-64            265    265    768    5    1 : tunables   54   27    0 : slabdata     53     53      0
biovec-16            258    260    192   20    1 : tunables  120   60    0 : slabdata     13     13      0
biovec-4             257    305     64   61    1 : tunables  120   60    0 : slabdata      5      5      0
biovec-1           66390  68026     16  226    1 : tunables  120   60    0 : slabdata    301    301      0
bio                66389  67222     64   61    1 : tunables  120   60    0 : slabdata   1102   1102      0

(These are freed later on, but they are the cause of the thrashing during the swap IO.)

While AS does:

[marcelo@yage marcelo]$ cat /proc/slabinfo | grep bio
biovec-(256)         256    256   3072    2    2 : tunables   24   12    0 : slabdata    128    128      0
biovec-128           256    260   1536    5    2 : tunables   24   12    0 : slabdata     52     52      0
biovec-64            260    260    768    5    1 : tunables   54   27    0 : slabdata     52     52      0
biovec-16            280    280    192   20    1 : tunables  120   60    0 : slabdata     14     14      0
biovec-4             264    305     64   61    1 : tunables  120   60    0 : slabdata      5      5      0
biovec-1            4478   5424     16  226    1 : tunables  120   60    0 : slabdata     24     24      0
bio                 4525   5002     64   61    1 : tunables  120   60    0 : slabdata     81     82      0
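
To put the bio numbers side by side, here is a minimal Python sketch that
parses the `bio` line from the first CFQ listing and from the AS listing and
estimates the pages held by each cache. It assumes the slabinfo 2.x column
order (name, active_objs, num_objs, objsize, objperslab, pagesperslab, then
the tunables and slabdata groups) and a 4 KiB page size; `parse_slab_line`
and `footprint_bytes` are hypothetical helper names, not kernel interfaces.

```python
# Sketch: compare bio slab usage between the CFQ and AS samples.
# Assumptions: slabinfo 2.x column order, 4 KiB pages.
PAGE_SIZE = 4096  # assumed page size


def parse_slab_line(line):
    """Parse one /proc/slabinfo line into the fields used below."""
    # The line has three groups separated by ':' (stats, tunables, slabdata).
    stats, _tunables, slabdata = line.split(":")
    name, active_objs, num_objs, objsize, objperslab, pagesperslab = stats.split()
    _label, active_slabs, num_slabs, _sharedavail = slabdata.split()
    return {
        "name": name,
        "active_objs": int(active_objs),
        "num_objs": int(num_objs),
        "objsize": int(objsize),
        "pagesperslab": int(pagesperslab),
        "num_slabs": int(num_slabs),
    }


def footprint_bytes(slab):
    """Memory spanned by the cache: slabs * pages per slab * page size."""
    return slab["num_slabs"] * slab["pagesperslab"] * PAGE_SIZE


# Lines copied from the listings in this message.
cfq_bio = parse_slab_line(
    "bio               121131 121573     64   61    1 : tunables  120   60    0"
    " : slabdata   1992   1993      0"
)
as_bio = parse_slab_line(
    "bio                 4525   5002     64   61    1 : tunables  120   60    0"
    " : slabdata     81     82      0"
)

ratio = cfq_bio["active_objs"] / as_bio["active_objs"]
print("CFQ bio cache bytes:", footprint_bytes(cfq_bio))
print("AS  bio cache bytes:", footprint_bytes(as_bio))
print("active bio ratio CFQ/AS: %.1f" % ratio)
```

Under these assumptions the bio cache alone spans 1993 pages (about 7.8 MiB)
with CFQ versus 82 pages with AS, roughly 27 times as many live bio objects.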


The odd thing is that the 400M swapped out is not reclaimed after exp (the program that calloc()s
512MB) exits. With AS, almost all swapped-out memory is reclaimed on exit.

 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0 492828  13308    320   3716    0    0     0     0 1002     5  0  0 100  0
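
For reference, the vmstat line above can be read mechanically. This is a
small sketch that pairs the header with the row and converts `swpd` to MiB,
assuming vmstat's default unit of 1024-byte blocks; `parse_vmstat` is a
hypothetical helper name.

```python
# Sketch: turn one vmstat header/row pair into a dict and convert swpd.
# Assumption: vmstat reports memory columns in KiB (its default unit).


def parse_vmstat(header, row):
    """Map each vmstat column name to the integer value in the row."""
    names = header.split()
    values = [int(v) for v in row.split()]
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return dict(zip(names, values))


# Header and row copied from this message.
header = " r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa"
row = " 0  0 492828  13308    320   3716    0    0     0     0 1002     5  0  0 100  0"

sample = parse_vmstat(header, row)
swapped_mib = sample["swpd"] / 1024
print("swap in use: %.0f MiB, si=%d so=%d" % (swapped_mib, sample["si"], sample["so"]))
```

With si and so both at 0 and the box idle, the swap usage shown is not being
paged back in or released on its own.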


Jens, is this huge number of bio/biovec allocations expected with CFQ? It's really, really bad.

