Re: Strange delays / what usually happens every 10 min?


From: Eric Dumazet <dada1@cosmosbay.com>
To: Florian Boelstler <kernel@boelstler.net>
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: Strange delays / what usually happens every 10 min?
Date: Tue, 13 Nov 2007 17:23:29 +0100
Message-ID: <4739CF81.8050704@cosmosbay.com>
In-Reply-To: <fhcc39$51b$1@ger.gmane.org>

Florian Boelstler wrote:
> Hi,
>
> this issue has already been discussed on the kernelnewbies mailing
> list [1],[2], and it was suggested that it be discussed further here.
>
> I am currently working on an MPC8540-based custom board, which runs Linux
> 2.6.15 (arch/ppc). The original Linux sources have been modified to 
> support that custom board. (Additional patches to support LTT are 
> applied as well, though disabled in the running kernel)
>
> I set up a periodically running kernel thread, which is delayed for a
> single jiffy using schedule_timeout() in an infinite loop. It is used to
> measure delays between invocations of that thread. For measuring the
> distance in time the PPC's time base lower half register is used
> (obtained using get_cycles() defined in asm/timex.h).
>
> The thread calculates the delay to the previous run and only outputs the
> result if a new maximum value has been determined (with respect to all
> previous cycles). Furthermore, the thread outputs a warning if a very
> "high" delay was determined, i.e. a delay greater than 5 ms.
>
> While running that test driver a delay of about 10ms _exactly_ occurs
> every 10 minutes.
>
> The kernel is configured using CONFIG_HZ=1000 and CONFIG_PREEMPT.
> The CCB is at 333 MHz, whereas the TBR update rate is 333 MHz / 8, i.e.
> 41.625 MHz.
> Kernel configuration as a whole is found here: 
> http://nopaste.info/5e4d0283bb.html
>
> And now the funny part starts.
> I got a response from Bruce Rowen on kernelnewbies, telling me that he 
> came across the same problem. He increased his AMD-Geode-based 
> platform to 1GB of RAM (256MB before) and also hit the 
> 10-minutes-issue a few months ago (using Linux 2.6.13).
> Going back to 256MB cured the problem. I did the same thing by
> instructing the boot loader to use only 256 MB of RAM
> (instead of 512MB) and yes, the 10-minutes-issue was gone as well.
>
> Apart from some kernel threads, almost all user processes had been killed
> during the test. Only SSH and a bash were running (whereas a test with
> network interfaces completely disabled and only operated from a serial
> console produced the same results).
> The kernel comes with compiled-in CIFS support, some kernel debugging
> features like soft-lockup detection and preemption debugging. I.e. ps
> lists the kernel threads ksoftirqd, watchdog, events, khelper, kthread,
> kblockd, pdflush, aio, cifsoplockd and cifsdnotifyd.
>
> A corresponding userspace test tool based on nanosleep() produced the
> same results as the kernel thread:
>
> root@mpc0:/# /tmp/wait.rt
> looping 1 milli seconds nanosleep ...
> 15:26:16: #1 FRAME MAX 1996 us (at 4139773004 ticks)
> 15:26:16: #2 FRAME MAX 2002 us (at 4139856360 ticks)
> 15:26:16: #155 FRAME MAX 2102 us (at 4152597854 ticks)
> 15:41:37: #460398 FRAME MAX 8941 us (at 3813406605 ticks)
> 15:41:37: #460398 FRAME HIGH 8941 us (at 3813406605 ticks)
> 15:51:37: #760394 FRAME MAX 9936 us (at 3018602602 ticks)
> 15:51:37: #760394 FRAME HIGH 9936 us (at 3018602602 ticks)
> 16:01:37: #1060390 FRAME HIGH 9935 us (at 2223798809 ticks)
> 16:11:37: #1360386 FRAME HIGH 9934 us (at 1428994989 ticks)
> 16:21:37: #1660382 FRAME HIGH 9935 us (at 634191241 ticks)
> [...]
>
> Thanks for any help!
>
> Cheers,
>
>   Florian
>
> [1] http://thread.gmane.org/gmane.linux.kernel.kernelnewbies/23419
> [2] http://thread.gmane.org/gmane.linux.kernel.kernelnewbies/23426
>
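
The tick values in the log above are read from the 32-bit lower half of the time base, so they wrap around (roughly every 103 seconds at 41.625 MHz) and cannot be subtracted naively. A minimal sketch of the conversion, assuming the nominal 333 MHz / 8 rate from the quoted message; the function names are illustrative and not taken from the original test driver:

```python
# Nominal time base rate from the quoted message: 333 MHz / 8 = 41.625 MHz.
TBR_HZ = 333_000_000 / 8


def tick_delta(prev, now):
    """Ticks elapsed between two reads of a 32-bit counter.

    Masking to 32 bits gives the right answer across a single wrap;
    it cannot detect intervals longer than one full wrap.
    """
    return (now - prev) & 0xFFFFFFFF


def ticks_to_us(ticks):
    """Convert time base ticks to microseconds at the assumed rate."""
    return ticks * 1_000_000 / TBR_HZ
```

Applied to the first two log lines, tick_delta(4139773004, 4139856360) gives 83356 ticks, or about 2002.5 us, which agrees with the 2002 us logged for frame #2.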

Hi Florian

I think you hit the periodic flush of the IP route cache, which fires
every 600 seconds by default.

(Check /proc/sys/net/ipv4/route/secret_interval)
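
A small sketch of that check from a script. The helper name is hypothetical, and the file is only present on kernels that expose this sysctl, so the helper returns None when it is missing or unreadable:

```python
from pathlib import Path

# Path named in the message; not every kernel provides it.
SECRET_INTERVAL = "/proc/sys/net/ipv4/route/secret_interval"


def read_secret_interval(path=SECRET_INTERVAL):
    """Return the flush interval in seconds, or None if it is unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (FileNotFoundError, PermissionError):
        return None
```

On a system with the default setting described above, read_secret_interval() would be expected to return 600.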

For a 1GB machine, this hash table is so big that a full scan might take 
more than 10 ms, even if empty.
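
To illustrate why an empty table still costs time, here is a toy model (not the kernel code) of a flush that visits every bucket: the work is proportional to the number of buckets, regardless of how many entries they hold.

```python
def flush(buckets):
    """Empty every bucket; return (buckets_visited, entries_freed).

    Toy model: each bucket is a Python list standing in for a hash chain.
    """
    visited = 0
    freed = 0
    for i in range(len(buckets)):
        visited += 1              # every bucket is touched, even if empty
        freed += len(buckets[i])
        buckets[i] = []
    return visited, freed
```

With 1024 empty buckets the loop still runs 1024 times, so a smaller table shortens the flush even when the cache holds no routes.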

Instead of using less RAM, you could just boot with rhash_entries=1024 
to lower the size of this table.

Or just change secret_interval to 2000000, for example (not much more,
because the value is multiplied by HZ and the product could overflow)
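
A quick check of that limit, assuming the product of the interval and HZ is held in a signed 32-bit integer (an assumption about the storage type, not something stated in this message):

```python
INT_MAX = 2**31 - 1  # assumed: signed 32-bit storage for interval * HZ


def max_interval_seconds(hz):
    """Largest interval in seconds whose product with hz still fits."""
    return INT_MAX // hz


def fits(seconds, hz):
    """True if seconds * hz does not exceed the assumed 32-bit limit."""
    return seconds * hz <= INT_MAX
```

Under that assumption and with HZ=1000, max_interval_seconds(1000) is 2147483, so 2000000 fits and leaves little headroom above it.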

Eric
