From: guillaume ranquet
To: linux-kernel@vger.kernel.org
Subject: [sh4][2.6.17] latency peaks with unix sockets on heavy loads
Date: Mon, 22 Sep 2008 18:02:40 +0200

I'm experiencing little glitches when I try to send/receive data over
local (unix) sockets under heavy load. Under normal load everything
behaves normally, but as the load increases I get latency peaks once
every second.

An image is worth a thousand words:
http://img255.imageshack.us/img255/3700/capplottsy7.th.png

X: time elapsed since the beginning of execution
Y: call latency
red: under heavy load
green: no load at all

Those 200ms peaks really disturb me: I'm using the sockets for RPC
calls, and 200ms (and far more as load increases) is really too much.

I've been testing various things (rough sketches of the timing loop
and the workarounds are appended at the end of this mail):

- enabling/disabling kernel preemption: no effect
- active waiting (doing some cpu-consuming work) between RPC calls:
  no effect (even worse)
- non-blocking sockets: no improvement, and they never return
  EWOULDBLOCK

Setting the policy to SCHED_FIFO solves the problem:
http://img47.imageshack.us/img47/4449/capplotschedfifoxh5.th.png

Also, adding a usleep(0) between each call (still with the
SCHED_NORMAL policy) removes the peaks. From my understanding,
usleep(0) puts the task to sleep until the next tick and may cause a
context switch if there is another runnable task.

Calling sched_yield() once every 1000 calls also helps greatly (some
peaks still appear here and there, though).

Upgrading to 2.6.23: hooray, it solves everything:
http://img371.imageshack.us/img371/7028/capplotkernel2623lldnl6.th.png

Still, the mean time is a bit higher, which adds about 30% overhead to
the test run. My problem is that I can't upgrade my kernel (yet) and
need to find a solution on 2.6.17. I couldn't reproduce the 2.6.17
behaviour on 2.6.23, no matter the kernel config.

What has changed between the two kernels that could have an impact on
that glitch:

- the lock classes of the AF_UNIX domain became bh-unsafe: seems out
  of suspicion, since the peaks haven't shown up with AF_INET sockets
- the scheduler for SCHED_NORMAL tasks was completely rewritten (CFS):
  looks guilty of the new (improved?) behaviour

Is that a known bug of the pre-CFS scheduler? Am I totally wrong and
should not blame the scheduler? Is there a solution on 2.6.17 with
SCHED_NORMAL?

ps: since I'm not subscribed (my e-mail account can't handle the
traffic), would you please CC me?
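
---

For reference, a rough sketch of the kind of timing loop I mean. This
is not my actual test program -- the buffer size, iteration count and
output format are made up for the example:

#include <stdio.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

int main(void)
{
	int fds[2];
	int i;
	char buf[64];
	struct timeval t0, t1;
	long us;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
		perror("socketpair");
		return 1;
	}

	if (fork() == 0) {
		/* "server" child: echo every request back */
		close(fds[0]);
		for (;;) {
			ssize_t n = read(fds[1], buf, sizeof(buf));
			if (n <= 0)
				_exit(0);
			write(fds[1], buf, n);
		}
	}
	close(fds[1]);

	for (i = 0; i < 100000; i++) {
		gettimeofday(&t0, NULL);
		write(fds[0], buf, sizeof(buf));	/* "RPC" request */
		read(fds[0], buf, sizeof(buf));		/* wait for reply */
		gettimeofday(&t1, NULL);
		us = (t1.tv_sec - t0.tv_sec) * 1000000L
		     + (t1.tv_usec - t0.tv_usec);
		printf("%d %ld\n", i, us);	/* per-call latency, usecs */
	}
	return 0;
}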
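
The SCHED_FIFO switch is nothing fancy -- just sched_setscheduler()
before entering the loop (priority 1 is an arbitrary pick, and it
needs root):

#include <sched.h>
#include <stdio.h>

int main(void)
{
	struct sched_param sp;

	sp.sched_priority = 1;
	if (sched_setscheduler(0, SCHED_FIFO, &sp) < 0) {
		perror("sched_setscheduler");
		return 1;
	}
	/* ... run the RPC loop from above here ... */
	return 0;
}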
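
And the SCHED_NORMAL workaround, yielding once every 1000 calls --
do_rpc_call() is just a placeholder for the real RPC wrapper:

#include <sched.h>

extern void do_rpc_call(void);	/* placeholder for the real thing */

void rpc_loop(long ncalls)
{
	long i;

	for (i = 0; i < ncalls; i++) {
		do_rpc_call();
		if (i % 1000 == 999)
			sched_yield();	/* a usleep(0) after every call
					 * also kills the peaks */
	}
}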