From: "Yanjun.Zhu" <yanjun.zhu@linux.dev>
To: Leon Romanovsky <leon@kernel.org>, Zhu Yanjun <yanjun.zhu@linux.dev>
Cc: Marco Crivellari <marco.crivellari@suse.com>,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Michal Hocko <mhocko@suse.com>, Zhu Yanjun <zyjzyj2000@gmail.com>,
	Jason Gunthorpe <jgg@ziepe.ca>
Subject: Re: [PATCH] RDMA/rxe: Replace use of system_unbound_wq with system_dfl_wq
Date: Tue, 17 Mar 2026 12:31:08 -0700
Message-ID: <089fe865-0077-4253-85de-1bb05216b6e7@linux.dev>
In-Reply-To: <20260317190314.GC61385@unreal>


On 3/17/26 12:03 PM, Leon Romanovsky wrote:
> On Tue, Mar 17, 2026 at 10:24:11AM -0700, Yanjun.Zhu wrote:
>> On 3/17/26 7:38 AM, Zhu Yanjun wrote:
>>> On 2026/3/16 13:13, Leon Romanovsky wrote:
>>>> On Fri, Mar 13, 2026 at 04:40:23PM +0100, Marco Crivellari wrote:
>>>>> This patch continues the effort to refactor workqueue APIs, which began
>>>>> with the changes introducing new workqueues and a new alloc_workqueue flag:
>>>>>
>>>>>      commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
>>>>>      commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
>>>>>
>>>>> The point of the refactoring is to eventually alter the default behavior of
>>>>> workqueues to become unbound by default so that their workload placement is
>>>>> optimized by the scheduler.
>>>>>
>>>>> Before that can happen, workqueue users must be converted to the better-named
>>>>> new workqueues with no intended behaviour changes:
>>>>>
>>>>>      system_wq -> system_percpu_wq
>>>>>      system_unbound_wq -> system_dfl_wq
>>>>>
>>>>> This way the old obsolete workqueues (system_wq, system_unbound_wq) can be
>>>>> removed in the future.
>>>> I recall earlier efforts to replace system workqueues with per-driver queues,
>>>> because unloading a driver forces a flush of the entire system workqueue,
>>>> which is undesirable for overall system behavior.
>>>>
>>>> Wouldn't it be better to introduce a local workqueue here and use that instead?
>>> Thanks.
>>>
>>> 1. The initialization should be:
>>>
>>> my_wq = alloc_workqueue("my_driver_queue", WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
>>> if (!my_wq)
>>>      return -ENOMEM;
>>>
>>> 2. The submission should be:
>>>
>>> queue_work(my_wq, &my_work);
>>>
>>> 3. The teardown should be:
>>>
>>> destroy_workqueue(my_wq);
>>>
>>> Thanks,
>>> Zhu Yanjun
>> Hi, Leon
>>
>> The diff for a new work queue in rxe is as below. Please review it.
> I'm not sure that you need a second workqueue, and destroy_workqueue()
> already does flush_workqueue(). There is no need to call it explicitly.
The explicit flush_workqueue() calls can be removed.
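
For reference, here is a minimal userspace sketch of the teardown without the explicit flush calls. It is only a model: the workqueue functions below are simplified stand-ins rather than the kernel API (the real alloc_workqueue() also takes flags and max_active), work runs synchronously when the queue is drained, and demo_work_fn/demo_work_ran are hypothetical names used only to exercise it. It models the point that destroy_workqueue() drains queued work before freeing the queue:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Userspace stand-ins for the kernel workqueue API, for illustration only.
 * The real definitions live in <linux/workqueue.h>.
 */
struct work_struct {
	void (*func)(struct work_struct *work);
	struct work_struct *next;
};

struct workqueue_struct {
	struct work_struct *head;
	struct work_struct *tail;
};

/* Simplified: the kernel version also takes flags and max_active. */
static struct workqueue_struct *alloc_workqueue(const char *name)
{
	(void)name;
	return calloc(1, sizeof(struct workqueue_struct));
}

/* Append the work item to the tail of the queue; nothing runs yet. */
static void queue_work(struct workqueue_struct *wq, struct work_struct *work)
{
	assert(wq != NULL);
	work->next = NULL;
	if (wq->tail)
		wq->tail->next = work;
	else
		wq->head = work;
	wq->tail = work;
}

/* Models the drain that the kernel's destroy_workqueue() performs. */
static void destroy_workqueue(struct workqueue_struct *wq)
{
	while (wq->head) {
		struct work_struct *work = wq->head;

		/* Read the next pointer first: func may free the item. */
		wq->head = work->next;
		work->func(work);
	}
	free(wq);
}

static struct workqueue_struct *rxe_wq;     /* rxe_task work */
static struct workqueue_struct *rxe_aux_wq; /* other rxe jobs */

int rxe_alloc_wq(void)
{
	rxe_wq = alloc_workqueue("rxe_wq");
	if (!rxe_wq)
		return -ENOMEM;

	rxe_aux_wq = alloc_workqueue("rxe_aux_wq");
	if (!rxe_aux_wq) {
		destroy_workqueue(rxe_wq);
		return -ENOMEM;
	}

	return 0;
}

void rxe_queue_aux_work(struct work_struct *work)
{
	queue_work(rxe_aux_wq, work);
}

/* No explicit flush_workqueue(): destroy_workqueue() drains pending work. */
void rxe_destroy_wq(void)
{
	destroy_workqueue(rxe_aux_wq);
	destroy_workqueue(rxe_wq);
}

/* Hypothetical work item, present only to exercise the sketch. */
int demo_work_ran;

void demo_work_fn(struct work_struct *work)
{
	(void)work;
	demo_work_ran++;
}
```

In this model, a work item queued through rxe_queue_aux_work() and still pending when rxe_destroy_wq() is called is run by the destroy path itself, so the item is not lost when the flush calls are dropped.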

The second workqueue was introduced because rxe_wq is heavily utilized by
QP tasks. The additional workqueue helps offload and distribute the
workload, preventing rxe_wq from becoming a bottleneck.

If you believe that the workload on rxe_wq is not significant, I can
simplify the design by removing the second workqueue and using rxe_wq for
all work items instead.

Zhu Yanjun

>
> Thanks
>
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_odp.c b/drivers/infiniband/sw/rxe/rxe_odp.c
>> index bc11b1ec59ac..03199fef47fb 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_odp.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_odp.c
>> @@ -545,7 +545,7 @@ static int rxe_ib_advise_mr_prefetch(struct ib_pd *ibpd,
>>           work->frags[i].mr = mr;
>>       }
>>
>> -    queue_work(system_unbound_wq, &work->work);
>> +    rxe_queue_aux_work(&work->work);
>>
>>       return 0;
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
>> index f522820b950c..a2da699b969e 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_task.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_task.c
>> @@ -6,19 +6,36 @@
>>
>>   #include "rxe.h"
>>
>> +/* work for rxe_task */
>>   static struct workqueue_struct *rxe_wq;
>>
>> +/* work for other rxe jobs */
>> +static struct workqueue_struct *rxe_aux_wq;
>> +
>>   int rxe_alloc_wq(void)
>>   {
>> -    rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE);
>> +    rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND | WQ_MEM_RECLAIM,
>> +                WQ_MAX_ACTIVE);
>>       if (!rxe_wq)
>>           return -ENOMEM;
>>
>> +    rxe_aux_wq = alloc_workqueue("rxe_aux_wq",
>> +                WQ_UNBOUND | WQ_MEM_RECLAIM, WQ_MAX_ACTIVE);
>> +    if (!rxe_aux_wq) {
>> +        destroy_workqueue(rxe_wq);
>> +        return -ENOMEM;
>> +
>> +    }
>> +
>>       return 0;
>>   }
>>
>>   void rxe_destroy_wq(void)
>>   {
>> +    flush_workqueue(rxe_aux_wq);
>> +    destroy_workqueue(rxe_aux_wq);
>> +
>> +    flush_workqueue(rxe_wq);
>>       destroy_workqueue(rxe_wq);
>>   }
>>
>> @@ -254,6 +271,14 @@ void rxe_sched_task(struct rxe_task *task)
>>       spin_unlock_irqrestore(&task->lock, flags);
>>   }
>>
>> +/* rxe_wq for rxe tasks. rxe_aux_wq for other rxe jobs.
>> + */
>> +void rxe_queue_aux_work(struct work_struct *work)
>> +{
>> +    WARN_ON_ONCE(!rxe_aux_wq);
>> +    queue_work(rxe_aux_wq, work);
>> +}
>> +
>>   /* rxe_disable/enable_task are only called from
>>    * rxe_modify_qp in process context. Task is moved
>>    * to the drained state by do_task.
>> diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
>> index a8c9a77b6027..e1c0a34808b4 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_task.h
>> +++ b/drivers/infiniband/sw/rxe/rxe_task.h
>> @@ -36,6 +36,7 @@ int rxe_alloc_wq(void);
>>
>>   void rxe_destroy_wq(void);
>>
>> +void rxe_queue_aux_work(struct work_struct *work);
>>   /*
>>    * init rxe_task structure
>>    *    qp  => parameter to pass to func
>>
>> Zhu Yanjun
>>
>>>> Thanks
>>>>
>>>>> Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
>>>>> Suggested-by: Tejun Heo <tj@kernel.org>
>>>>> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
>>>>> ---
>>>>>    drivers/infiniband/sw/rxe/rxe_odp.c | 2 +-
>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_odp.c b/drivers/infiniband/sw/rxe/rxe_odp.c
>>>>> index bc11b1ec59ac..d440c8cbaea5 100644
>>>>> --- a/drivers/infiniband/sw/rxe/rxe_odp.c
>>>>> +++ b/drivers/infiniband/sw/rxe/rxe_odp.c
>>>>> @@ -545,7 +545,7 @@ static int rxe_ib_advise_mr_prefetch(struct ib_pd *ibpd,
>>>>>            work->frags[i].mr = mr;
>>>>>        }
>>>>>
>>>>> -    queue_work(system_unbound_wq, &work->work);
>>>>> +    queue_work(system_dfl_wq, &work->work);
>>>>>
>>>>>        return 0;
>>>>>
>>>>> --
>>>>> 2.53.0
>>>>>

