From: MANISH PANDEY <quic_mapa@quicinc.com>
To: Christian Loehle <christian.loehle@arm.com>,
Bart Van Assche <bvanassche@acm.org>,
Sandeep Dhavale <dhavale@google.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Qais Yousef <qyousef@layalina.io>, <axboe@kernel.dk>,
<mingo@kernel.org>, <peterz@infradead.org>,
<vincent.guittot@linaro.org>, <linux-block@vger.kernel.org>,
<sudeep.holla@arm.com>, Jaegeuk Kim <jaegeuk@kernel.org>,
Christoph Hellwig <hch@infradead.org>, <kailash@google.com>,
<tkjos@google.com>, <bvanassche@google.com>,
<quic_nitirawa@quicinc.com>, <quic_cang@quicinc.com>,
<quic_rampraka@quicinc.com>, <quic_narepall@quicinc.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: Regarding patch "block/blk-mq: Don't complete locally if capacities are different"
Date: Fri, 23 Aug 2024 19:19:31 +0530 [thread overview]
Message-ID: <22d1bd64-934f-49f1-bb82-1367f4a43f40@quicinc.com> (raw)
In-Reply-To: <3d37e8ba-25a8-45c2-93a3-02888dad2c9e@arm.com>
On 8/23/2024 5:33 PM, Christian Loehle wrote:
> On 8/23/24 08:57, MANISH PANDEY wrote:
>>
>>
>> On 8/22/2024 7:54 PM, Bart Van Assche wrote:
>>> On 8/22/24 3:46 AM, MANISH PANDEY wrote:
>>>> On 8/21/2024 10:52 PM, Bart Van Assche wrote:
>>>>> What is the performance impact of the above change?
>>> >
>>>> No impact at all
>>> Is this a good summary of this email thread?
>>> * The first email in this thread reports an important performance
>>> regression.
>>> * In your previous email there is a candidate fix for the performance
>>> regression.
>>> * Above I read that the proposed fix has no performance impact at all
>>> on any setup.
>>>
>>> Is this a good summary of this email thread? If so, do you agree that
>>> this must be confusing everyone who is following this email thread?
>>>
>>> Thanks,
>>>
>>> Bart.
>>
>> Hi Bart,
>>
>> Performance impact due to addition of cpu capacity check (https://lore.kernel.org/all/20240223155749.2958009-3-qyousef@layalina.io/) ...[1]
>> is already mentioned in the first email.
>>
>> But let me summarize it again:
>>
>> We are not able to get advantage of affining the IRQ in different capacity CPU(s)/clusters and complete the request in higher cluster cpu(s), even though the LLC is shared between these clusters as it is causing the block completion to happen on SOFTIRQ context, if requester and completion clusters are different.
>>
>> Below is the performance impact with the current patch [1]
>>
>> 1. For MCQ capable UFS host (paired with UFS 4.x), we are observing ~20% random R/W performance drop.
>>
>> 2. For single doorbell ufs hosts (paired with UFS 2.x/ UFS 3.x), we are observing ~7-10% random R/W performance drop.
>>
>
> If you do decide to write your proposal up as a patch, a description of the
> topology would be helpful as well.
>
Thanks Christian for meaningful discussions and suggestions.
We would mention the same in the commit text description in the patch.
>>
>> Also in previous emails on this thread, below were few suggestions to add check for equal or greater capacity cpus by @Christian Loehle
>> https://lore.kernel.org/all/3feb5226-7872-432b-9781-29903979d34a@arm.com/
>>
>>> From: Christian Loehle @ 2024-08-02 9:03 UTC (permalink / raw)
>>> [......]
>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>> index e3c3c0c21b55..a4a2500c4ef6 100644
>>> --- a/block/blk-mq.c
>>> +++ b/block/blk-mq.c
>>> @@ -1164,7 +1164,7 @@ static inline bool
>>> blk_mq_complete_need_ipi(struct request *rq)
>>> if (cpu == rq->mq_ctx->cpu ||
>>> (!test_bit(QUEUE_FLAG_SAME_FORCE, &rq->q->queue_flags) &&
>>> cpus_share_cache(cpu, rq->mq_ctx->cpu) &&
>>> - cpus_equal_capacity(cpu, rq->mq_ctx->cpu)))
>>> + arch_scale_cpu_capacity(cpu) >=
>>> arch_scale_cpu_capacity(rq->mq_ctx->cpu)))
>>> return false;
>>>
>>> /* don't try to IPI to an offline CPU */
>>
>>
>> There can be SoCs with different CPU cluster configurations and to have optimal IO load balancing or to avoid contention b/w submission path and completion path, we may need to complete IO request of large capacity CPU(s) on small cluster cpus. So the above propose solution may not be suffice to all the use cases.
>>
>> Hence with below proposed solution, we are trying to propose a new rq flag QUEUE_FLAG_CPU_CAPACITY. The proposed solution will provide us a way such that users who are benefited with CPU capacity check [1] would be able to use the fix as it is, and if a user (including us) want to bypass cpu capacity fix [1], they can set rq_affinity to 3 and would be able to retain performance drop as mentioned in first email. This would give flexibility to user to choose what's the best for their system.
>>
>
> FWIW I'd agree with introducing a new queue_flag that behaves like
> QUEUE_FLAG_SAME_COMP before commit af550e4c9682 ("block/blk-mq: Don't complete locally if capacities are different").
> Equal capacity makes sense as the default behavior for
> QUEUE_FLAG_SAME_COMP, but is limiting, there might be just one
> CPU of that capacity and that might be fully utilized by submission
> (and related work), so completing locally makes sense.
>
> So QUEUE_FLAG_SAME_COMP I'd leave as-is and introduce
> QUEUE_FLAG_SAME_LLC that actually just checks LLC.
>
> Regards,
> Christian
>
Hi Christian,
while making the patch, i figured out that queue_flags is unsigned long
type and we already reached up to 32 flags as of now.
So we need to figure out some other way.
Please let us know if you have some suggestions for the solution, which
can provide flexibility of with and without cpu capacity check.
Regards
Manish
next prev parent reply other threads:[~2024-08-23 13:49 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-31 13:46 Regarding patch "block/blk-mq: Don't complete locally if capacities are different" MANISH PANDEY
2024-08-01 9:25 ` MANISH PANDEY
2024-08-01 16:05 ` Qais Yousef
2024-08-02 9:03 ` Christian Loehle
2024-08-05 2:07 ` Qais Yousef
2024-08-05 10:18 ` Christian Loehle
2024-08-05 17:24 ` MANISH PANDEY
2024-08-09 0:47 ` Qais Yousef
2024-08-09 0:23 ` Qais Yousef
2024-08-13 16:20 ` Christian Loehle
2024-09-01 17:25 ` Qais Yousef
2024-08-05 17:17 ` Bart Van Assche
2024-08-05 17:35 ` MANISH PANDEY
2024-08-05 17:52 ` Bart Van Assche
2024-08-08 6:05 ` MANISH PANDEY
2024-08-09 0:36 ` Qais Yousef
2024-08-11 17:41 ` Dietmar Eggemann
2024-08-12 18:15 ` Sandeep Dhavale
2024-08-21 12:29 ` MANISH PANDEY
2024-08-21 17:22 ` Bart Van Assche
2024-08-22 10:46 ` MANISH PANDEY
2024-08-22 14:24 ` Bart Van Assche
2024-08-23 7:57 ` MANISH PANDEY
2024-08-23 12:03 ` Christian Loehle
2024-08-23 13:49 ` MANISH PANDEY [this message]
2024-08-23 14:12 ` Bart Van Assche
2024-08-26 17:32 ` Christian Loehle
2024-09-01 17:13 ` Qais Yousef
2024-08-09 0:28 ` Qais Yousef
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=22d1bd64-934f-49f1-bb82-1367f4a43f40@quicinc.com \
--to=quic_mapa@quicinc.com \
--cc=axboe@kernel.dk \
--cc=bvanassche@acm.org \
--cc=bvanassche@google.com \
--cc=christian.loehle@arm.com \
--cc=dhavale@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=hch@infradead.org \
--cc=jaegeuk@kernel.org \
--cc=kailash@google.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=quic_cang@quicinc.com \
--cc=quic_narepall@quicinc.com \
--cc=quic_nitirawa@quicinc.com \
--cc=quic_rampraka@quicinc.com \
--cc=qyousef@layalina.io \
--cc=sudeep.holla@arm.com \
--cc=tkjos@google.com \
--cc=vincent.guittot@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox