From: Christian Loehle <christian.loehle@arm.com>
To: MANISH PANDEY <quic_mapa@quicinc.com>, qyousef@layalina.io
Cc: axboe@kernel.dk, mingo@kernel.org, peterz@infradead.org,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
linux-block@vger.kernel.org, sudeep.holla@arm.com,
Jaegeuk Kim <jaegeuk@kernel.org>,
Bart Van Assche <bvanassche@acm.org>,
Christoph Hellwig <hch@infradead.org>,
kailash@google.com, tkjos@google.com, dhavale@google.com,
bvanassche@google.com, quic_nitirawa@quicinc.com,
quic_cang@quicinc.com, quic_rampraka@quicinc.com,
quic_narepall@quicinc.com
Subject: Re: Regarding patch "block/blk-mq: Don't complete locally if capacities are different"
Date: Fri, 2 Aug 2024 10:03:16 +0100 [thread overview]
Message-ID: <3feb5226-7872-432b-9781-29903979d34a@arm.com> (raw)
In-Reply-To: <10c7f773-7afd-4409-b392-5d987a4024e4@quicinc.com>
On 7/31/24 14:46, MANISH PANDEY wrote:
> Hi Qais Yousef,
Qais already asked the important question, still some from my end.
> Recently we observed below patch has been merged
> https://lore.kernel.org/all/20240223155749.2958009-3-qyousef@layalina.io
>
> This patch is causing performance degradation ~20% in Random IO along with significant drop in Sequential IO performance. So we would like to revert this patch as it impacts MCQ UFS devices heavily. Though Non MCQ devices are also getting impacted due to this.
I'm curious about the sequential IO part in particular, what's the blocksize and throughput?
If blocksize is large enough the completion and submission parts are hopefully not as critical.
>
> We have several concerns with the patch
> 1. This patch takes away the luxury of affining best possible cpus from device drivers and limits driver to fall in same group of CPUs.
>
> 2. Why can't device driver use irq affinity to use desired CPUs to complete the IO request, instead of forcing it from block layer.
>
> 3. Already CPUs are grouped based on LLC, then if a new categorization is required ?
As Qais hinted at, because of systems that share LLC on all CPUs but are HMP.
>
>> big performance impact if the IO request
>> was done from a CPU with higher capacity but the interrupt is serviced
>> on a lower capacity CPU.
>
> This patch doesn't considers the issue of contention in submission path and completion path. Also what if we want to complete the request of smaller capacity CPU to Higher capacity CPU?
> Shouldn't a device driver take care of this and allow the vendors to use the best possible combination they want to use?
> Does it considers MCQ devices and different SQ<->CQ mappings?
So I'm assuming you're seeing something like the following:
Some CPU(s) (call them S) are submitting IO, hardirq triggers on
S.
Before the patch the completion softirq could run on a !S CPU,
now it runs on S. Am I then correct in assuming your workload
is CPU-bound on S? Would you share some details about the
workload, too?
What's the capacity of CPU(s) S then?
IOW does this help?
-->8--
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e3c3c0c21b55..a4a2500c4ef6 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1164,7 +1164,7 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq)
if (cpu == rq->mq_ctx->cpu ||
(!test_bit(QUEUE_FLAG_SAME_FORCE, &rq->q->queue_flags) &&
cpus_share_cache(cpu, rq->mq_ctx->cpu) &&
- cpus_equal_capacity(cpu, rq->mq_ctx->cpu)))
+ arch_scale_cpu_capacity(cpu) >= arch_scale_cpu_capacity(rq->mq_ctx->cpu)))
return false;
/* don't try to IPI to an offline CPU */
next prev parent reply other threads:[~2024-08-02 9:03 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-31 13:46 Regarding patch "block/blk-mq: Don't complete locally if capacities are different" MANISH PANDEY
2024-08-01 9:25 ` MANISH PANDEY
2024-08-01 16:05 ` Qais Yousef
2024-08-02 9:03 ` Christian Loehle [this message]
2024-08-05 2:07 ` Qais Yousef
2024-08-05 10:18 ` Christian Loehle
2024-08-05 17:24 ` MANISH PANDEY
2024-08-09 0:47 ` Qais Yousef
2024-08-09 0:23 ` Qais Yousef
2024-08-13 16:20 ` Christian Loehle
2024-09-01 17:25 ` Qais Yousef
2024-08-05 17:17 ` Bart Van Assche
2024-08-05 17:35 ` MANISH PANDEY
2024-08-05 17:52 ` Bart Van Assche
2024-08-08 6:05 ` MANISH PANDEY
2024-08-09 0:36 ` Qais Yousef
2024-08-11 17:41 ` Dietmar Eggemann
2024-08-12 18:15 ` Sandeep Dhavale
2024-08-21 12:29 ` MANISH PANDEY
2024-08-21 17:22 ` Bart Van Assche
2024-08-22 10:46 ` MANISH PANDEY
2024-08-22 14:24 ` Bart Van Assche
2024-08-23 7:57 ` MANISH PANDEY
2024-08-23 12:03 ` Christian Loehle
2024-08-23 13:49 ` MANISH PANDEY
2024-08-23 14:12 ` Bart Van Assche
2024-08-26 17:32 ` Christian Loehle
2024-09-01 17:13 ` Qais Yousef
2024-08-09 0:28 ` Qais Yousef
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3feb5226-7872-432b-9781-29903979d34a@arm.com \
--to=christian.loehle@arm.com \
--cc=axboe@kernel.dk \
--cc=bvanassche@acm.org \
--cc=bvanassche@google.com \
--cc=dhavale@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=hch@infradead.org \
--cc=jaegeuk@kernel.org \
--cc=kailash@google.com \
--cc=linux-block@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=quic_cang@quicinc.com \
--cc=quic_mapa@quicinc.com \
--cc=quic_narepall@quicinc.com \
--cc=quic_nitirawa@quicinc.com \
--cc=quic_rampraka@quicinc.com \
--cc=qyousef@layalina.io \
--cc=sudeep.holla@arm.com \
--cc=tkjos@google.com \
--cc=vincent.guittot@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox