linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 5.15 20/29] block: reduce kblockd_mod_delayed_work_on() CPU consumption
       [not found] <20211221015751.116328-1-sashal@kernel.org>
@ 2021-12-21  1:57 ` Sasha Levin
  2021-12-21 15:35   ` Michael Kelley (LINUX)
  2021-12-21  1:57 ` [PATCH AUTOSEL 5.15 29/29] Revert "block: reduce kblockd_mod_delayed_work_on() CPU consumption" Sasha Levin
  1 sibling, 1 reply; 6+ messages in thread
From: Sasha Levin @ 2021-12-21  1:57 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Jens Axboe, Dexuan Cui, Ming Lei, Sasha Levin, linux-block

From: Jens Axboe <axboe@kernel.dk>

[ Upstream commit cb2ac2912a9ca7d3d26291c511939a41361d2d83 ]

Dexuan reports that he's seeing spikes of very heavy CPU utilization when
running 24 disks and using the 'none' scheduler. This happens off the
sched restart path, because SCSI requires the queue to be restarted async,
and hence we're hammering on mod_delayed_work_on() to ensure that the work
item gets run appropriately.

Avoid hammering on the timer and just use queue_work_on() if no delay
has been specified.

Reported-and-tested-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/linux-block/BYAPR21MB1270C598ED214C0490F47400BF719@BYAPR21MB1270.namprd21.prod.outlook.com/
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 block/blk-core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index c2d912d0c976c..a728434fcff87 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1625,6 +1625,8 @@ EXPORT_SYMBOL(kblockd_schedule_work);
 int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
 				unsigned long delay)
 {
+	if (!delay)
+		return queue_work_on(cpu, kblockd_workqueue, &dwork->work);
 	return mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
 }
 EXPORT_SYMBOL(kblockd_mod_delayed_work_on);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH AUTOSEL 5.15 29/29] Revert "block: reduce kblockd_mod_delayed_work_on() CPU consumption"
       [not found] <20211221015751.116328-1-sashal@kernel.org>
  2021-12-21  1:57 ` [PATCH AUTOSEL 5.15 20/29] block: reduce kblockd_mod_delayed_work_on() CPU consumption Sasha Levin
@ 2021-12-21  1:57 ` Sasha Levin
  1 sibling, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2021-12-21  1:57 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Jens Axboe, Alex Xu, kernel test robot, Sasha Levin, linux-block

From: Jens Axboe <axboe@kernel.dk>

[ Upstream commit 87959fa16cfbcf76245c11559db1940069621274 ]

This reverts commit cb2ac2912a9ca7d3d26291c511939a41361d2d83.

Alex and the kernel test robot report that this causes a significant
performance regression with BFQ. I can reproduce that result, so let's
revert this one as we're close to -rc6 and we there's no point in trying
to rush a fix.

Link: https://lore.kernel.org/linux-block/1639853092.524jxfaem2.none@localhost/
Link: https://lore.kernel.org/lkml/20211219141852.GH14057@xsang-OptiPlex-9020/
Reported-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 block/blk-core.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index a728434fcff87..c2d912d0c976c 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1625,8 +1625,6 @@ EXPORT_SYMBOL(kblockd_schedule_work);
 int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
 				unsigned long delay)
 {
-	if (!delay)
-		return queue_work_on(cpu, kblockd_workqueue, &dwork->work);
 	return mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
 }
 EXPORT_SYMBOL(kblockd_mod_delayed_work_on);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* RE: [PATCH AUTOSEL 5.15 20/29] block: reduce kblockd_mod_delayed_work_on() CPU consumption
  2021-12-21  1:57 ` [PATCH AUTOSEL 5.15 20/29] block: reduce kblockd_mod_delayed_work_on() CPU consumption Sasha Levin
@ 2021-12-21 15:35   ` Michael Kelley (LINUX)
  2021-12-21 15:36     ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Michael Kelley (LINUX) @ 2021-12-21 15:35 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel@vger.kernel.org, stable@vger.kernel.org
  Cc: Jens Axboe, Dexuan Cui, Ming Lei, linux-block@vger.kernel.org

From: Sasha Levin <sashal@kernel.org> Sent: Monday, December 20, 2021 5:58 PM
> 
> From: Jens Axboe <axboe@kernel.dk>
> 
> [ Upstream commit cb2ac2912a9ca7d3d26291c511939a41361d2d83 ]
> 
> Dexuan reports that he's seeing spikes of very heavy CPU utilization when
> running 24 disks and using the 'none' scheduler. This happens off the
> sched restart path, because SCSI requires the queue to be restarted async,
> and hence we're hammering on mod_delayed_work_on() to ensure that the work
> item gets run appropriately.
> 
> Avoid hammering on the timer and just use queue_work_on() if no delay
> has been specified.
> 
> Reported-and-tested-by: Dexuan Cui <decui@microsoft.com>
> Link: https://lore.kernel.org/linux-block/BYAPR21MB1270C598ED214C0490F47400BF719@BYAPR21MB1270.namprd21.prod.outlook.com/
> Reviewed-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  block/blk-core.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index c2d912d0c976c..a728434fcff87 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1625,6 +1625,8 @@ EXPORT_SYMBOL(kblockd_schedule_work);
>  int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
>  				unsigned long delay)
>  {
> +	if (!delay)
> +		return queue_work_on(cpu, kblockd_workqueue, &dwork->work);
>  	return mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
>  }
>  EXPORT_SYMBOL(kblockd_mod_delayed_work_on);
> --
> 2.34.1

Sasha -- there are reports of this patch causing performance problems.  See https://lore.kernel.org/lkml/1639853092.524jxfaem2.none@localhost/.
I would suggest *not* backporting it to any of the stable branches until the
issues are fully sorted out.

Michael

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH AUTOSEL 5.15 20/29] block: reduce kblockd_mod_delayed_work_on() CPU consumption
  2021-12-21 15:35   ` Michael Kelley (LINUX)
@ 2021-12-21 15:36     ` Jens Axboe
  2021-12-21 17:58       ` Sasha Levin
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2021-12-21 15:36 UTC (permalink / raw)
  To: Michael Kelley (LINUX), Sasha Levin, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org
  Cc: Dexuan Cui, Ming Lei, linux-block@vger.kernel.org

On 12/21/21 8:35 AM, Michael Kelley (LINUX) wrote:
> From: Sasha Levin <sashal@kernel.org> Sent: Monday, December 20, 2021 5:58 PM
>>
>> From: Jens Axboe <axboe@kernel.dk>
>>
>> [ Upstream commit cb2ac2912a9ca7d3d26291c511939a41361d2d83 ]
>>
>> Dexuan reports that he's seeing spikes of very heavy CPU utilization when
>> running 24 disks and using the 'none' scheduler. This happens off the
>> sched restart path, because SCSI requires the queue to be restarted async,
>> and hence we're hammering on mod_delayed_work_on() to ensure that the work
>> item gets run appropriately.
>>
>> Avoid hammering on the timer and just use queue_work_on() if no delay
>> has been specified.
>>
>> Reported-and-tested-by: Dexuan Cui <decui@microsoft.com>
>> Link: https://lore.kernel.org/linux-block/BYAPR21MB1270C598ED214C0490F47400BF719@BYAPR21MB1270.namprd21.prod.outlook.com/
>> Reviewed-by: Ming Lei <ming.lei@redhat.com>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> Signed-off-by: Sasha Levin <sashal@kernel.org>
>> ---
>>  block/blk-core.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index c2d912d0c976c..a728434fcff87 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -1625,6 +1625,8 @@ EXPORT_SYMBOL(kblockd_schedule_work);
>>  int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
>>  				unsigned long delay)
>>  {
>> +	if (!delay)
>> +		return queue_work_on(cpu, kblockd_workqueue, &dwork->work);
>>  	return mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
>>  }
>>  EXPORT_SYMBOL(kblockd_mod_delayed_work_on);
>> --
>> 2.34.1
> 
> Sasha -- there are reports of this patch causing performance problems.
> See
> https://lore.kernel.org/lkml/1639853092.524jxfaem2.none@localhost/. I
> would suggest *not* backporting it to any of the stable branches until
> the issues are fully sorted out.

Both this and the revert were backported. Which arguably doesn't make a
lot of sense, but at least it's consistent and won't cause any issues...

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH AUTOSEL 5.15 20/29] block: reduce kblockd_mod_delayed_work_on() CPU consumption
  2021-12-21 15:36     ` Jens Axboe
@ 2021-12-21 17:58       ` Sasha Levin
  2021-12-21 18:11         ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Sasha Levin @ 2021-12-21 17:58 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Michael Kelley (LINUX), linux-kernel@vger.kernel.org,
	stable@vger.kernel.org, Dexuan Cui, Ming Lei,
	linux-block@vger.kernel.org

On Tue, Dec 21, 2021 at 08:36:33AM -0700, Jens Axboe wrote:
>On 12/21/21 8:35 AM, Michael Kelley (LINUX) wrote:
>> From: Sasha Levin <sashal@kernel.org> Sent: Monday, December 20, 2021 5:58 PM
>>>
>>> From: Jens Axboe <axboe@kernel.dk>
>>>
>>> [ Upstream commit cb2ac2912a9ca7d3d26291c511939a41361d2d83 ]
>>>
>>> Dexuan reports that he's seeing spikes of very heavy CPU utilization when
>>> running 24 disks and using the 'none' scheduler. This happens off the
>>> sched restart path, because SCSI requires the queue to be restarted async,
>>> and hence we're hammering on mod_delayed_work_on() to ensure that the work
>>> item gets run appropriately.
>>>
>>> Avoid hammering on the timer and just use queue_work_on() if no delay
>>> has been specified.
>>>
>>> Reported-and-tested-by: Dexuan Cui <decui@microsoft.com>
>>> Link: https://lore.kernel.org/linux-block/BYAPR21MB1270C598ED214C0490F47400BF719@BYAPR21MB1270.namprd21.prod.outlook.com/
>>> Reviewed-by: Ming Lei <ming.lei@redhat.com>
>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>> Signed-off-by: Sasha Levin <sashal@kernel.org>
>>> ---
>>>  block/blk-core.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>> index c2d912d0c976c..a728434fcff87 100644
>>> --- a/block/blk-core.c
>>> +++ b/block/blk-core.c
>>> @@ -1625,6 +1625,8 @@ EXPORT_SYMBOL(kblockd_schedule_work);
>>>  int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
>>>  				unsigned long delay)
>>>  {
>>> +	if (!delay)
>>> +		return queue_work_on(cpu, kblockd_workqueue, &dwork->work);
>>>  	return mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
>>>  }
>>>  EXPORT_SYMBOL(kblockd_mod_delayed_work_on);
>>> --
>>> 2.34.1
>>
>> Sasha -- there are reports of this patch causing performance problems.
>> See
>> https://lore.kernel.org/lkml/1639853092.524jxfaem2.none@localhost/. I
>> would suggest *not* backporting it to any of the stable branches until
>> the issues are fully sorted out.
>
>Both this and the revert were backported. Which arguably doesn't make a
>lot of sense, but at least it's consistent and won't cause any issues...

The logic behind it is that it makes it easy for both us as well as
everyone else to annotate why a certain patch might be "missing" from
the trees - in this case because it was reverted.

It looks dumb now, but it saves a lot of time as well as mitigates the
risk of it being picked up again at some point in the future.

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH AUTOSEL 5.15 20/29] block: reduce kblockd_mod_delayed_work_on() CPU consumption
  2021-12-21 17:58       ` Sasha Levin
@ 2021-12-21 18:11         ` Jens Axboe
  0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2021-12-21 18:11 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Michael Kelley (LINUX), linux-kernel@vger.kernel.org,
	stable@vger.kernel.org, Dexuan Cui, Ming Lei,
	linux-block@vger.kernel.org

On 12/21/21 10:58 AM, Sasha Levin wrote:
> On Tue, Dec 21, 2021 at 08:36:33AM -0700, Jens Axboe wrote:
>> On 12/21/21 8:35 AM, Michael Kelley (LINUX) wrote:
>>> From: Sasha Levin <sashal@kernel.org> Sent: Monday, December 20, 2021 5:58 PM
>>>>
>>>> From: Jens Axboe <axboe@kernel.dk>
>>>>
>>>> [ Upstream commit cb2ac2912a9ca7d3d26291c511939a41361d2d83 ]
>>>>
>>>> Dexuan reports that he's seeing spikes of very heavy CPU utilization when
>>>> running 24 disks and using the 'none' scheduler. This happens off the
>>>> sched restart path, because SCSI requires the queue to be restarted async,
>>>> and hence we're hammering on mod_delayed_work_on() to ensure that the work
>>>> item gets run appropriately.
>>>>
>>>> Avoid hammering on the timer and just use queue_work_on() if no delay
>>>> has been specified.
>>>>
>>>> Reported-and-tested-by: Dexuan Cui <decui@microsoft.com>
>>>> Link: https://lore.kernel.org/linux-block/BYAPR21MB1270C598ED214C0490F47400BF719@BYAPR21MB1270.namprd21.prod.outlook.com/
>>>> Reviewed-by: Ming Lei <ming.lei@redhat.com>
>>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>>> Signed-off-by: Sasha Levin <sashal@kernel.org>
>>>> ---
>>>>  block/blk-core.c | 2 ++
>>>>  1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>> index c2d912d0c976c..a728434fcff87 100644
>>>> --- a/block/blk-core.c
>>>> +++ b/block/blk-core.c
>>>> @@ -1625,6 +1625,8 @@ EXPORT_SYMBOL(kblockd_schedule_work);
>>>>  int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
>>>>  				unsigned long delay)
>>>>  {
>>>> +	if (!delay)
>>>> +		return queue_work_on(cpu, kblockd_workqueue, &dwork->work);
>>>>  	return mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
>>>>  }
>>>>  EXPORT_SYMBOL(kblockd_mod_delayed_work_on);
>>>> --
>>>> 2.34.1
>>>
>>> Sasha -- there are reports of this patch causing performance problems.
>>> See
>>> https://lore.kernel.org/lkml/1639853092.524jxfaem2.none@localhost/. I
>>> would suggest *not* backporting it to any of the stable branches until
>>> the issues are fully sorted out.
>>
>> Both this and the revert were backported. Which arguably doesn't make a
>> lot of sense, but at least it's consistent and won't cause any issues...
> 
> The logic behind it is that it makes it easy for both us as well as
> everyone else to annotate why a certain patch might be "missing" from
> the trees - in this case because it was reverted.
> 
> It looks dumb now, but it saves a lot of time as well as mitigates the
> risk of it being picked up again at some point in the future.

It's fine with me, when I saw the first patch yesterday I did get
worried, but then I saw the revert was picked too. As I said, as long
as the end result is sane, then there's no harm in doing it this way.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-12-21 18:11 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20211221015751.116328-1-sashal@kernel.org>
2021-12-21  1:57 ` [PATCH AUTOSEL 5.15 20/29] block: reduce kblockd_mod_delayed_work_on() CPU consumption Sasha Levin
2021-12-21 15:35   ` Michael Kelley (LINUX)
2021-12-21 15:36     ` Jens Axboe
2021-12-21 17:58       ` Sasha Levin
2021-12-21 18:11         ` Jens Axboe
2021-12-21  1:57 ` [PATCH AUTOSEL 5.15 29/29] Revert "block: reduce kblockd_mod_delayed_work_on() CPU consumption" Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).