From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ingo Molnar
Subject: Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
Date: Thu, 30 Nov 2006 07:17:58 +0100
Message-ID: <20061130061758.GA2003@elte.hu>
References: <2f14bf623344.456de60a@fnal.gov> <20061129.181950.31643130.davem@davemloft.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: wenji@fnal.gov, akpm@osdl.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Return-path:
To: David Miller
Content-Disposition: inline
In-Reply-To: <20061129.181950.31643130.davem@davemloft.net>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

* David Miller wrote:

> We can make explicit preemption checks in the main loop of
> tcp_recvmsg(), and release the socket and run the backlog if
> need_resched() is TRUE.
>
> This is the simplest and most elegant solution to this problem.

yeah, i like this one. If the problem is "too long locked section", then
the most natural solution is to "break up the lock", not to "boost the
priority of the lock-holding task" (which is what the proposed patch
does).

[ Also note that "sprinkle the code with preempt_disable()" kind of
  solutions, besides hurting interactivity, are also a pain to resolve
  in something like PREEMPT_RT. (unlike say a spinlock,
  preempt_disable() is quite opaque in what data structure it protects,
  etc., making it hard to convert it to a preemptible primitive) ]

> The one suggested in your patch and paper are way overkill, there is
> no reason to solve a TCP specific problem inside of the generic
> scheduler.

agreed.

What we could also add is a /reverse/ mechanism to the scheduler: a task
could query whether it has just a small amount of time left in its
timeslice, and could in that case voluntarily drop its current lock and
yield, and thus give up its current timeslice and wait for a new, full
timeslice, instead of being forcibly preempted due to lack of timeslices
with a possibly critical lock still held.

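[ A minimal userspace sketch of the "query the timeslice, drop the lock,
  yield" idea, for illustration only. Everything in it (struct fake_task,
  struct fake_sock, timeslice_nearly_expired(), fake_yield(),
  fake_recv_loop()) is a hypothetical stand-in and not a kernel API. In
  the kernel the check could be need_resched(), and the lock drop would
  be the socket release that also runs the backlog. ]

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct fake_task {
	int ticks_left;   /* remaining timeslice, in ticks */
	int full_slice;   /* length of a fresh timeslice */
};

struct fake_sock {
	bool locked;
	int backlog;      /* packets queued while the lock was held */
	int delivered;    /* packets handed on after the lock was dropped */
	int lock_breaks;  /* how often the lock was dropped voluntarily */
};

static void lock_fake_sock(struct fake_sock *sk)
{
	sk->locked = true;
}

/* Dropping the lock also drains the backlog that built up meanwhile. */
static void release_fake_sock(struct fake_sock *sk)
{
	sk->delivered += sk->backlog;
	sk->backlog = 0;
	sk->locked = false;
}

/* The "reverse" query: is only a small amount of timeslice left? */
static bool timeslice_nearly_expired(const struct fake_task *t, int threshold)
{
	return t->ticks_left <= threshold;
}

/* Stand-in for yielding: the task returns with a fresh, full timeslice. */
static void fake_yield(struct fake_task *t)
{
	t->ticks_left = t->full_slice;
}

/*
 * Do `work` units under the lock. Before each unit, check the remaining
 * timeslice; if it is nearly used up, break the lock at this safe point,
 * yield, and retake the lock with a full timeslice.
 */
static void fake_recv_loop(struct fake_sock *sk, struct fake_task *t,
			   int work, int threshold)
{
	lock_fake_sock(sk);
	while (work > 0) {
		if (timeslice_nearly_expired(t, threshold)) {
			release_fake_sock(sk);
			sk->lock_breaks++;
			fake_yield(t);
			lock_fake_sock(sk);
		}
		sk->backlog++;    /* one packet arrives while we hold the lock */
		t->ticks_left--;  /* one unit of work costs one tick */
		work--;
	}
	release_fake_sock(sk);
}
```

[ With a 4-tick timeslice, a threshold of 1 and 10 units of work, this
  model breaks the lock three times and never uses up its timeslice
  while the lock is held. ]
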
But the suggested solution here, to "prolong the running of this task
just a little bit longer" only starts a perpetual arms race between
users of such a facility and other kernel subsystems. (besides not being
adequate anyway, there can always be /so/ long lock-hold times that the
scheduler would have no other option but to preempt the task)

	Ingo