From: Con Kolivas <conman@kolivas.net>
To: linux kernel mailing list <linux-kernel@vger.kernel.org>
Cc: Andrew Morton <akpm@digeo.com>
Subject: Re: Pathological case identified from contest
Date: Thu, 17 Oct 2002 17:16:46 +1000
Message-ID: <1034839006.3dae63de3f69a@kolivas.net>
In-Reply-To: <1034828795.3dae3bfb42911@kolivas.net>
Quoting Con Kolivas <conman@kolivas.net>:
> Quoting Andrew Morton <akpm@digeo.com>:
>
> > Con Kolivas wrote:
> > >
> > > I found a pathological case in 2.5 while running contest with process_load
> > > recently after checking the results, which showed a bad result for 2.5.43-mm1:
> > >
> > > 2.5.43-mm1 101.38 72% 42 31%
> > > 2.5.43-mm1 102.90 75% 34 28%
> > > 2.5.43-mm1 504.12 14% 603 85%
> > > 2.5.43-mm1 96.73 77% 34 26%
> > >
> > > This was very strange, so I looked into it further.
> > >
> > > The default for process_load is this command:
> > >
> > > process_load --processes $nproc --recordsize 8192 --injections 2
> > >
> > > where $nproc=4*num_cpus
> > >
> > > When I changed recordsize to 16384, many of the 2.5 kernels started
> > > exhibiting the same behaviour. While the machine was apparently still alive
> > > and would respond to my request to abort, the kernel compile would all but
> > > stop while process_load just continued, without allowing anything to happen
> > > from kernel compilation for up to 5 minutes at a time. This doesn't happen
> > > with any 2.4 kernels.
> > >
> >
> > Well it doesn't happen on my test machine (UP or SMP). I tried
> > various recordsizes. It's probably related to HZ, memory bandwidth
> > and the precise timing at which things happen.
> >
> > The test describes itself thusly:
> >
> > * This test generates a load which simulates a process-loaded system.
> > *
> > * The test creates a ring of processes, each connected to its predecessor
> > * and successor by a pipe. After the ring is created, the parent process
> > * injects some dummy data records into the ring and then joins. The
> > * processes pass the data records around the ring until they are killed.
> > *
> >
> > It'll be starvation in the CPU scheduler I expect. For some reason
> > the ring of piping processes is just never giving a timeslice to
> > anything else. Or maybe something to do with the exceptional
> > wakeup strategy which pipes use.
> >
> > Don't know, sorry. One for the kernel/*.c guys.
>
> Ok, well I've done some profiling as suggested by wli, and it shows pretty much
> what I find in the results - it gets stuck while doing process_load and never
> moves on.
>
> recordsize 8192 kern profile:
> c01223ac 76997 4.48583 do_anonymous_page
> c0188694 135835 7.91373 __generic_copy_from_user
> c0188610 345071 20.1038 __generic_copy_to_user
> c0105298 801429 46.6911 poll_idle
> sysprofile:
> 00000000 160258 5.03854 (no symbol)
> /lib/i686/libc-2.2.5.so
> c0188610 345071 10.8491 __generic_copy_to_user
> /home/con/kernel/linux-2.5.43/vmlinux
> c0105298 801429 25.1971 poll_idle
> /home/con/kernel/linux-2.5.43/vmlinux
> 00000000 1132668 35.6113 (no symbol)
> /usr/lib/gcc-lib/i686-pc-linux-gnu/2.95.3/cc1
>
> Normal run consistent with doing kernel compilation most of the time.
>
> recordsize 16384 kernprofile:
> c0111ef4 403545 4.3407 do_schedule
> c0105298 558704 6.00965 poll_idle
> c0188694 2571995 27.6655 __generic_copy_from_user
> c0188610 4489796 48.2941 __generic_copy_to_user
> sysprofile:
> c0111ef4 403545 4.24896 do_schedule
> /home/con/kernel/linux-2.5.43/vmlinux
> c0105298 558704 5.88264 poll_idle
> /home/con/kernel/linux-2.5.43/vmlinux
> c0188694 2571995 27.0807 __generic_copy_from_user
> /home/con/kernel/linux-2.5.43/vmlinux
> c0188610 4489796 47.2734 __generic_copy_to_user
> /home/con/kernel/linux-2.5.43/vmlinux
>
> I had to abort the run with recordsize 16384, but you can see it's just stuck
> in process_load copying data between forked processes.
>
> Can someone else on lkml decipher why it gets stuck here?
Well, this has become more common with 2.5.43-mm2. I had to abort the
process_load run 3 times when benchmarking it. Going back to other kernels and
trying them, it didn't happen, so I don't think it's my hardware failing or
anything like that.
Con
Thread overview: 9+ messages
2002-10-17 2:13 Pathological case identified from contest Con Kolivas
2002-10-17 2:49 ` Andrew Morton
2002-10-17 4:26 ` Con Kolivas
2002-10-17 7:16 ` Con Kolivas [this message]
2002-10-17 7:35 ` Andrew Morton
2002-10-17 17:15 ` Rik van Riel
2002-10-20 2:59 ` Con Kolivas
2002-10-20 3:05 ` Andrew Morton
2002-10-20 6:27 ` Con Kolivas