linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] sched: blk: Handle HMP systems when completing IO
@ 2024-02-23 15:57 Qais Yousef
  2024-02-23 15:57 ` [PATCH v2 1/2] sched: Add a new function to compare if two cpus have the same capacity Qais Yousef
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Qais Yousef @ 2024-02-23 15:57 UTC (permalink / raw)
  To: Jens Axboe, Ingo Molnar, Peter Zijlstra, Vincent Guittot,
	Dietmar Eggemann
  Cc: linux-kernel, linux-block, Sudeep Holla, Wei Wang, Jaegeuk Kim,
	Bart Van Assche, Christoph Hellwig, Qais Yousef

Due to recent changes in how topology is represented on asymmetric multi
processing systems like big.LITTLE where all cpus share the last LLC, there is
a performance regression as cpus with different compute capacities appear under
the same LLC and we no longer send an IPI when the requester is running on
a different cluster with different compute capacity.

Restore the old behavior by adding a new cpus_equal_capacity() function to help
check for the new condition for these systems.

Changes since v1:

	* Split the patch per subsystem.
	* Convert cpus_gte_capacity() to cpus_equal_capacity()
	* Make cpus_equal_capacity() return immediately for SMP systems.

Qais Yousef (2):
  sched: Add a new function to compare if two cpus have the same
    capacity
  block/blk-mq: Don't complete locally if capacities are different

 block/blk-mq.c                 |  5 +++--
 include/linux/sched/topology.h |  6 ++++++
 kernel/sched/core.c            | 11 +++++++++++
 3 files changed, 20 insertions(+), 2 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 1/2] sched: Add a new function to compare if two cpus have the same capacity
  2024-02-23 15:57 [PATCH v2 0/2] sched: blk: Handle HMP systems when completing IO Qais Yousef
@ 2024-02-23 15:57 ` Qais Yousef
  2024-02-23 16:12   ` Bart Van Assche
  2024-02-23 15:57 ` [PATCH v2 2/2] block/blk-mq: Don't complete locally if capacities are different Qais Yousef
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Qais Yousef @ 2024-02-23 15:57 UTC (permalink / raw)
  To: Jens Axboe, Ingo Molnar, Peter Zijlstra, Vincent Guittot,
	Dietmar Eggemann
  Cc: linux-kernel, linux-block, Sudeep Holla, Wei Wang, Jaegeuk Kim,
	Bart Van Assche, Christoph Hellwig, Qais Yousef

The new helper function is needed to help blk-mq check if it needs to
dispatch the softirq on another CPU to match the performance level the
IO requester is running at. This is important on HMP systems where not
all CPUs have the same compute capacity.

Signed-off-by: Qais Yousef <qyousef@layalina.io>
---
 include/linux/sched/topology.h |  6 ++++++
 kernel/sched/core.c            | 11 +++++++++++
 2 files changed, 17 insertions(+)

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index a6e04b4a21d7..11e0e00e0bb9 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -176,6 +176,7 @@ extern void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
 cpumask_var_t *alloc_sched_domains(unsigned int ndoms);
 void free_sched_domains(cpumask_var_t doms[], unsigned int ndoms);
 
+bool cpus_equal_capacity(int this_cpu, int that_cpu);
 bool cpus_share_cache(int this_cpu, int that_cpu);
 bool cpus_share_resources(int this_cpu, int that_cpu);
 
@@ -226,6 +227,11 @@ partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
 {
 }
 
+static inline bool cpus_equal_capacity(int this_cpu, int that_cpu)
+{
+	return true;
+}
+
 static inline bool cpus_share_cache(int this_cpu, int that_cpu)
 {
 	return true;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a76c7095f736..adbaabb23fa1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3953,6 +3953,17 @@ void wake_up_if_idle(int cpu)
 	}
 }
 
+bool cpus_equal_capacity(int this_cpu, int that_cpu)
+{
+	if (!sched_asym_cpucap_active())
+		return true;
+
+	if (this_cpu == that_cpu)
+		return true;
+
+	return arch_scale_cpu_capacity(this_cpu) == arch_scale_cpu_capacity(that_cpu);
+}
+
 bool cpus_share_cache(int this_cpu, int that_cpu)
 {
 	if (this_cpu == that_cpu)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 2/2] block/blk-mq: Don't complete locally if capacities are different
  2024-02-23 15:57 [PATCH v2 0/2] sched: blk: Handle HMP systems when completing IO Qais Yousef
  2024-02-23 15:57 ` [PATCH v2 1/2] sched: Add a new function to compare if two cpus have the same capacity Qais Yousef
@ 2024-02-23 15:57 ` Qais Yousef
  2024-02-23 16:09   ` Bart Van Assche
  2024-02-24 13:43 ` [PATCH v2 0/2] sched: blk: Handle HMP systems when completing IO Shrikanth Hegde
  2024-02-24 19:48 ` Jens Axboe
  3 siblings, 1 reply; 9+ messages in thread
From: Qais Yousef @ 2024-02-23 15:57 UTC (permalink / raw)
  To: Jens Axboe, Ingo Molnar, Peter Zijlstra, Vincent Guittot,
	Dietmar Eggemann
  Cc: linux-kernel, linux-block, Sudeep Holla, Wei Wang, Jaegeuk Kim,
	Bart Van Assche, Christoph Hellwig, Qais Yousef

The logic in blk_mq_complete_need_ipi() assumes SMP systems where all
CPUs have equal compute capacities and only LLC cache can make
a different on perceived performance. But this assumption falls apart on
HMP systems where LLC is shared, but the CPUs have different capacities.
Staying local then can have a big performance impact if the IO request
was done from a CPU with higher capacity but the interrupt is serviced
on a lower capacity CPU.

Use the new cpus_equal_capacity() function to check if we need to send
an IPI.

Without the patch I see the BLOCK softirq always running on little cores
(where the hardirq is serviced). With it I can see it running on all
cores.

This was noticed after the topology change [1] where now on a big.LITTLE
we truly get that the LLC is shared between all cores where as in the
past it was being misrepresented for historical reasons. The logic
exposed a missing dependency on capacities for such systems where there
can be a big performance difference between the CPUs.

This of course introduced a noticeable change in behavior depending on
how the topology is presented. Leading to regressions in some workloads
as the performance of the BLOCK softirq on littles can be noticeably
worse on some platforms.

Worth noting that we could have checked for capacities being greater
than or equal instead for equality. This will lead to favouring higher
performance always. But opted for equality instead to match the
performance of the requester without making an assumption that can lead
to power trade-offs which these systems tend to be sensitive about. If
the requester would like to run faster, it's better to rely on the
scheduler to give the IO requester via some facility to run on a faster
core; and then if the interrupt triggered on a CPU with different
capacity we'll make sure to match the performance the requester is
supposed to run at.

[1] https://lpc.events/event/16/contributions/1342/attachments/962/1883/LPC-2022-Android-MC-Phantom-Domains.pdf

Signed-off-by: Qais Yousef <qyousef@layalina.io>
---
 block/blk-mq.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 2dc01551e27c..ea69047e12f7 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1167,10 +1167,11 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq)
 	if (force_irqthreads())
 		return false;
 
-	/* same CPU or cache domain?  Complete locally */
+	/* same CPU or cache domain and capacity?  Complete locally */
 	if (cpu == rq->mq_ctx->cpu ||
 	    (!test_bit(QUEUE_FLAG_SAME_FORCE, &rq->q->queue_flags) &&
-	     cpus_share_cache(cpu, rq->mq_ctx->cpu)))
+	     cpus_share_cache(cpu, rq->mq_ctx->cpu) &&
+	     cpus_equal_capacity(cpu, rq->mq_ctx->cpu)))
 		return false;
 
 	/* don't try to IPI to an offline CPU */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/2] block/blk-mq: Don't complete locally if capacities are different
  2024-02-23 15:57 ` [PATCH v2 2/2] block/blk-mq: Don't complete locally if capacities are different Qais Yousef
@ 2024-02-23 16:09   ` Bart Van Assche
  2024-02-24 23:06     ` Qais Yousef
  0 siblings, 1 reply; 9+ messages in thread
From: Bart Van Assche @ 2024-02-23 16:09 UTC (permalink / raw)
  To: Qais Yousef, Jens Axboe, Ingo Molnar, Peter Zijlstra,
	Vincent Guittot, Dietmar Eggemann
  Cc: linux-kernel, linux-block, Sudeep Holla, Wei Wang, Jaegeuk Kim,
	Christoph Hellwig

On 2/23/24 07:57, Qais Yousef wrote:
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 2dc01551e27c..ea69047e12f7 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1167,10 +1167,11 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq)
>   	if (force_irqthreads())
>   		return false;
>   
> -	/* same CPU or cache domain?  Complete locally */
> +	/* same CPU or cache domain and capacity?  Complete locally */
>   	if (cpu == rq->mq_ctx->cpu ||
>   	    (!test_bit(QUEUE_FLAG_SAME_FORCE, &rq->q->queue_flags) &&
> -	     cpus_share_cache(cpu, rq->mq_ctx->cpu)))
> +	     cpus_share_cache(cpu, rq->mq_ctx->cpu) &&
> +	     cpus_equal_capacity(cpu, rq->mq_ctx->cpu)))
>   		return false;
>   
>   	/* don't try to IPI to an offline CPU */

I think it's worth mentioning that this change is intended for storage
controllers that only support a single completion interrupt. Anyway:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 1/2] sched: Add a new function to compare if two cpus have the same capacity
  2024-02-23 15:57 ` [PATCH v2 1/2] sched: Add a new function to compare if two cpus have the same capacity Qais Yousef
@ 2024-02-23 16:12   ` Bart Van Assche
  0 siblings, 0 replies; 9+ messages in thread
From: Bart Van Assche @ 2024-02-23 16:12 UTC (permalink / raw)
  To: Qais Yousef, Jens Axboe, Ingo Molnar, Peter Zijlstra,
	Vincent Guittot, Dietmar Eggemann
  Cc: linux-kernel, linux-block, Sudeep Holla, Wei Wang, Jaegeuk Kim,
	Christoph Hellwig

On 2/23/24 07:57, Qais Yousef wrote:
> The new helper function is needed to help blk-mq check if it needs to
> dispatch the softirq on another CPU to match the performance level the
> IO requester is running at. This is important on HMP systems where not
> all CPUs have the same compute capacity.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 0/2] sched: blk: Handle HMP systems when completing IO
  2024-02-23 15:57 [PATCH v2 0/2] sched: blk: Handle HMP systems when completing IO Qais Yousef
  2024-02-23 15:57 ` [PATCH v2 1/2] sched: Add a new function to compare if two cpus have the same capacity Qais Yousef
  2024-02-23 15:57 ` [PATCH v2 2/2] block/blk-mq: Don't complete locally if capacities are different Qais Yousef
@ 2024-02-24 13:43 ` Shrikanth Hegde
  2024-02-24 23:12   ` Qais Yousef
  2024-02-24 19:48 ` Jens Axboe
  3 siblings, 1 reply; 9+ messages in thread
From: Shrikanth Hegde @ 2024-02-24 13:43 UTC (permalink / raw)
  To: Qais Yousef
  Cc: linux-kernel, linux-block, Sudeep Holla, Wei Wang, Jaegeuk Kim,
	Bart Van Assche, Christoph Hellwig, Jens Axboe, Ingo Molnar,
	Peter Zijlstra, Vincent Guittot, Dietmar Eggemann



On 2/23/24 9:27 PM, Qais Yousef wrote:
> Due to recent changes in how topology is represented on asymmetric multi
> processing systems like big.LITTLE where all cpus share the last LLC, there is
> a performance regression as cpus with different compute capacities appear under
> the same LLC and we no longer send an IPI when the requester is running on
> a different cluster with different compute capacity.
> 
> Restore the old behavior by adding a new cpus_equal_capacity() function to help
> check for the new condition for these systems.
> 
> Changes since v1:
> 
> 	* Split the patch per subsystem.
> 	* Convert cpus_gte_capacity() to cpus_equal_capacity()
> 	* Make cpus_equal_capacity() return immediately for SMP systems.
> 

nit: Did you mean !SMP systems here? 
Because in changes i see its returning true directly if its in !CONFIG_SMP path. 

> Qais Yousef (2):
>   sched: Add a new function to compare if two cpus have the same
>     capacity
>   block/blk-mq: Don't complete locally if capacities are different
> 
>  block/blk-mq.c                 |  5 +++--
>  include/linux/sched/topology.h |  6 ++++++
>  kernel/sched/core.c            | 11 +++++++++++
>  3 files changed, 20 insertions(+), 2 deletions(-)
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 0/2] sched: blk: Handle HMP systems when completing IO
  2024-02-23 15:57 [PATCH v2 0/2] sched: blk: Handle HMP systems when completing IO Qais Yousef
                   ` (2 preceding siblings ...)
  2024-02-24 13:43 ` [PATCH v2 0/2] sched: blk: Handle HMP systems when completing IO Shrikanth Hegde
@ 2024-02-24 19:48 ` Jens Axboe
  3 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2024-02-24 19:48 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann,
	Qais Yousef
  Cc: linux-kernel, linux-block, Sudeep Holla, Wei Wang, Jaegeuk Kim,
	Bart Van Assche, Christoph Hellwig


On Fri, 23 Feb 2024 15:57:47 +0000, Qais Yousef wrote:
> Due to recent changes in how topology is represented on asymmetric multi
> processing systems like big.LITTLE where all cpus share the last LLC, there is
> a performance regression as cpus with different compute capacities appear under
> the same LLC and we no longer send an IPI when the requester is running on
> a different cluster with different compute capacity.
> 
> Restore the old behavior by adding a new cpus_equal_capacity() function to help
> check for the new condition for these systems.
> 
> [...]

Applied, thanks!

[1/2] sched: Add a new function to compare if two cpus have the same capacity
      commit: b361c9027b4e4159e7bcca4eb64fd26507c19994
[2/2] block/blk-mq: Don't complete locally if capacities are different
      commit: af550e4c968294398fc76b075f12d51c76caf753

Best regards,
-- 
Jens Axboe




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 2/2] block/blk-mq: Don't complete locally if capacities are different
  2024-02-23 16:09   ` Bart Van Assche
@ 2024-02-24 23:06     ` Qais Yousef
  0 siblings, 0 replies; 9+ messages in thread
From: Qais Yousef @ 2024-02-24 23:06 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, Ingo Molnar, Peter Zijlstra, Vincent Guittot,
	Dietmar Eggemann, linux-kernel, linux-block, Sudeep Holla,
	Wei Wang, Jaegeuk Kim, Christoph Hellwig

On 02/23/24 08:09, Bart Van Assche wrote:
> On 2/23/24 07:57, Qais Yousef wrote:
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index 2dc01551e27c..ea69047e12f7 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1167,10 +1167,11 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq)
> >   	if (force_irqthreads())
> >   		return false;
> > -	/* same CPU or cache domain?  Complete locally */
> > +	/* same CPU or cache domain and capacity?  Complete locally */
> >   	if (cpu == rq->mq_ctx->cpu ||
> >   	    (!test_bit(QUEUE_FLAG_SAME_FORCE, &rq->q->queue_flags) &&
> > -	     cpus_share_cache(cpu, rq->mq_ctx->cpu)))
> > +	     cpus_share_cache(cpu, rq->mq_ctx->cpu) &&
> > +	     cpus_equal_capacity(cpu, rq->mq_ctx->cpu)))
> >   		return false;
> >   	/* don't try to IPI to an offline CPU */
> 
> I think it's worth mentioning that this change is intended for storage
> controllers that only support a single completion interrupt. Anyway:

Sorry I didn't realize it is only applied for this scenario.

> 
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>

Thanks for the reviews!

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 0/2] sched: blk: Handle HMP systems when completing IO
  2024-02-24 13:43 ` [PATCH v2 0/2] sched: blk: Handle HMP systems when completing IO Shrikanth Hegde
@ 2024-02-24 23:12   ` Qais Yousef
  0 siblings, 0 replies; 9+ messages in thread
From: Qais Yousef @ 2024-02-24 23:12 UTC (permalink / raw)
  To: Shrikanth Hegde
  Cc: linux-kernel, linux-block, Sudeep Holla, Wei Wang, Jaegeuk Kim,
	Bart Van Assche, Christoph Hellwig, Jens Axboe, Ingo Molnar,
	Peter Zijlstra, Vincent Guittot, Dietmar Eggemann

On 02/24/24 19:13, Shrikanth Hegde wrote:
> 
> 
> On 2/23/24 9:27 PM, Qais Yousef wrote:
> > Due to recent changes in how topology is represented on asymmetric multi
> > processing systems like big.LITTLE where all cpus share the last LLC, there is
> > a performance regression as cpus with different compute capacities appear under
> > the same LLC and we no longer send an IPI when the requester is running on
> > a different cluster with different compute capacity.
> > 
> > Restore the old behavior by adding a new cpus_equal_capacity() function to help
> > check for the new condition for these systems.
> > 
> > Changes since v1:
> > 
> > 	* Split the patch per subsystem.
> > 	* Convert cpus_gte_capacity() to cpus_equal_capacity()
> > 	* Make cpus_equal_capacity() return immediately for SMP systems.
> > 
> 
> nit: Did you mean !SMP systems here? 
> Because in changes i see its returning true directly if its in !CONFIG_SMP path. 

I was referring to this hunk

+       if (!sched_asym_cpucap_active())
+               return true;

in cpus_equal_capacity(). In SMP system the condition is always true and
there's a static key that tells us if the system is asymmetric.

> 
> > Qais Yousef (2):
> >   sched: Add a new function to compare if two cpus have the same
> >     capacity
> >   block/blk-mq: Don't complete locally if capacities are different
> > 
> >  block/blk-mq.c                 |  5 +++--
> >  include/linux/sched/topology.h |  6 ++++++
> >  kernel/sched/core.c            | 11 +++++++++++
> >  3 files changed, 20 insertions(+), 2 deletions(-)
> > 

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-02-24 23:12 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-23 15:57 [PATCH v2 0/2] sched: blk: Handle HMP systems when completing IO Qais Yousef
2024-02-23 15:57 ` [PATCH v2 1/2] sched: Add a new function to compare if two cpus have the same capacity Qais Yousef
2024-02-23 16:12   ` Bart Van Assche
2024-02-23 15:57 ` [PATCH v2 2/2] block/blk-mq: Don't complete locally if capacities are different Qais Yousef
2024-02-23 16:09   ` Bart Van Assche
2024-02-24 23:06     ` Qais Yousef
2024-02-24 13:43 ` [PATCH v2 0/2] sched: blk: Handle HMP systems when completing IO Shrikanth Hegde
2024-02-24 23:12   ` Qais Yousef
2024-02-24 19:48 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).