From: Tao Ma
Date: Mon, 08 Aug 2011 14:31:50 +0800
To: Shaohua Li
CC: Jens Axboe, "linux-kernel@vger.kernel.org", Christoph Hellwig, Roland Dreier, Dan Williams
Subject: Re: [PATCH] block: Make rq_affinity = 1 work as expected.
Message-ID: <4E3F82D6.1020609@tao.ma>
References: <1312519150-3261-1-git-send-email-tm@tao.ma> <4E3B9CAC.7020802@fusionio.com> <4E3F5C0E.2060207@tao.ma> <4E3F76D7.4010708@tao.ma>

On 08/08/2011 01:56 PM, Shaohua Li wrote:
> 2011/8/8 Tao Ma:
>> On 08/08/2011 12:33 PM, Shaohua Li wrote:
>>> 2011/8/8 Tao Ma:
>>>> Hi Shaohua,
>>>> On 08/08/2011 10:58 AM, Shaohua Li wrote:
>>>>> 2011/8/5 Jens Axboe:
>>>>>> On 2011-08-05 06:39, Tao Ma wrote:
>>>>>>> From: Tao Ma
>>>>>>>
>>>>>>> Commit 5757a6d76c introduced a new rq_affinity = 2 so that the
>>>>>>> request is completed on the __make_request cpu. But it makes the
>>>>>>> old rq_affinity = 1 not work any more. The root cause is that
>>>>>>> if 'cpu' and 'req->cpu' are in the same group and cpu != req->cpu,
>>>>>>> ccpu will be the same as group_cpu, so the completion will be
>>>>>>> executed on 'cpu', not on 'group_cpu'.
>>>>>>>
>>>>>>> This patch fixes the problem by simply removing group_cpu, and the
>>>>>>> code is more explicit now. If ccpu == cpu, we complete on cpu;
>>>>>>> otherwise we raise_blk_irq to ccpu.
>>>>>>
>>>>>> Thanks Tao Ma, much more readable too.
>>>>> Hi Jens,
>>>>> I rethought the problem when I checked the interrupts on my system. I
>>>>> think we don't need Tao's patch, though it makes the code behave as
>>>>> before. Let's take an example. My test box has cpu 0-7, one socket.
>>>>> Say a request is added on CPU 1 and blk_complete_request occurs on
>>>>> CPU 7. Without Tao's patch, the softirq will be done on CPU 7. With
>>>>> it, an IPI will be directed to CPU 0, and the softirq will be done on
>>>>> CPU 0. In this case, doing the softirq on CPU 0 or on CPU 7 makes no
>>>>> difference, and we can avoid an IPI by doing it on CPU 7.
>>>> I totally agree with your analysis, but what worries me is that this
>>>> does change the old system behavior.
>>>> And without this patch, '1' and '2' in rq_affinity actually have the
>>>> same effect now in your case. If you do prefer the new code and the
>>>> new behavior, then '1' doesn't need to exist any more (since from your
>>>> description it seems to only add an additional IPI overhead and no
>>>> benefit), or '2' is totally unneeded here.
>>> With rq_affinity 2, CPU 1 will do the softirq in the above case. It's
>>> still different from the rq_affinity 1 case.
>> OK, so let's see what's going on without the patch in case rq_affinity = 1.
>> If the complete cpu and the request cpu are in the same group, the
>> complete cpu will call softirq.
>> If the complete cpu and the request cpu are not in the same group, the
>> group cpu of the request cpu will call softirq.
>>
>> These behaviors are totally different. How can you tell the user what's
>> going on there? And that's the reason we want 0, 1, 2 for rq_affinity. If
>> the user does care about the extra IPI (in your case), fine, just set
>> rq_affinity = 2.
> rq_affinity=2: finish request in each cpu
> rq_affinity=1: finish request in one CPU for each socket.
> Even without your patch, rq_affinity=1 finishes requests on one CPU too.

We always finish a request on one CPU, that is. The only difference is
which cpu does the softirq work.

> Remember the controller only has one interrupt source. The only difference
> is that the request isn't always finished on the first CPU of a socket. I
> don't think this is a behavior change that users even care about.

That is your view. Thanks. At least it struck me as strange when I
came across it, and that is why I found it.

I am done with it.

Thanks
Tao
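
For readers following the thread, the completion-CPU selection being debated can be modelled in user space roughly as below. This is only a sketch under stated assumptions, not the kernel code: `cpu_to_group`, `softirq_cpu`, `check_thread_example`, `CPUS_PER_SOCKET`, and the `patched` flag are hypothetical names, a CPU's group is assumed to be the first CPU of its socket, and -1 is used as the "no group" sentinel. The logic follows the commit message and the two behaviors described above.

```c
#include <assert.h>

/* Assumption: 8 CPUs per socket, as on the test box described in the thread. */
#define CPUS_PER_SOCKET 8

/* Hypothetical stand-in for the CPU-to-group mapping:
 * the first CPU of the socket that 'cpu' belongs to. */
static int cpu_to_group(int cpu)
{
    return cpu - (cpu % CPUS_PER_SOCKET);
}

/*
 * Return the CPU on which the completion softirq would run.
 *   rq_affinity: 0, 1 or 2
 *   patched:     nonzero models the behavior with the group_cpu check removed
 *   cpu:         CPU on which the completion is reported
 *   req_cpu:     CPU on which the request was submitted (-1 if unknown)
 */
static int softirq_cpu(int rq_affinity, int patched, int cpu, int req_cpu)
{
    int ccpu = cpu;      /* default: complete where we are */
    int group_cpu = -1;  /* sentinel: no group comparison */

    if (rq_affinity != 0 && req_cpu != -1) {
        ccpu = req_cpu;
        if (rq_affinity == 1) {
            ccpu = cpu_to_group(ccpu);
            if (!patched)
                group_cpu = cpu_to_group(cpu);
        }
    }

    /* Unpatched: also complete locally when both CPUs share a group. */
    if (ccpu == cpu || (!patched && ccpu == group_cpu))
        return cpu;

    return ccpu;         /* otherwise an IPI is sent to ccpu */
}

/* The example from the thread: request added on CPU 1, completion on CPU 7. */
static void check_thread_example(void)
{
    assert(softirq_cpu(1, 0, 7, 1) == 7); /* without the patch: stays on CPU 7 */
    assert(softirq_cpu(1, 1, 7, 1) == 0); /* with the patch: IPI to CPU 0 */
    assert(softirq_cpu(2, 0, 7, 1) == 1); /* rq_affinity = 2: IPI to CPU 1 */
    assert(softirq_cpu(0, 0, 7, 1) == 7); /* rq_affinity = 0: no steering */
}
```

Calling `check_thread_example()` from a small `main` exercises the single-socket case; a cross-socket case such as `softirq_cpu(1, 0, 9, 1)` yields 0 in this model with or without the `patched` flag, which is the "group cpu of the request cpu" behavior.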