public inbox for linux-rdma@vger.kernel.org
From: "Yanjun.Zhu" <yanjun.zhu@linux.dev>
To: Leon Romanovsky <leon@kernel.org>, Zhu Yanjun <yanjun.zhu@linux.dev>
Cc: Marco Crivellari <marco.crivellari@suse.com>,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Michal Hocko <mhocko@suse.com>, Zhu Yanjun <zyjzyj2000@gmail.com>,
	Jason Gunthorpe <jgg@ziepe.ca>
Subject: Re: [PATCH] RDMA/rxe: Replace use of system_unbound_wq with system_dfl_wq
Date: Tue, 17 Mar 2026 12:31:08 -0700
Message-ID: <089fe865-0077-4253-85de-1bb05216b6e7@linux.dev>
In-Reply-To: <20260317190314.GC61385@unreal>


On 3/17/26 12:03 PM, Leon Romanovsky wrote:
> On Tue, Mar 17, 2026 at 10:24:11AM -0700, Yanjun.Zhu wrote:
>> On 3/17/26 7:38 AM, Zhu Yanjun wrote:
>>> On 2026/3/16 13:13, Leon Romanovsky wrote:
>>>> On Fri, Mar 13, 2026 at 04:40:23PM +0100, Marco Crivellari wrote:
>>>>> This patch continues the effort to refactor workqueue APIs, which began
>>>>> with the changes introducing new workqueues and a new alloc_workqueue flag:
>>>>>
>>>>>      commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
>>>>>      commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
>>>>>
>>>>> The point of the refactoring is to eventually alter the default behavior of
>>>>> workqueues to become unbound by default so that their workload placement is
>>>>> optimized by the scheduler.
>>>>>
>>>>> Before that can happen, workqueue users must be converted to the better named
>>>>> new workqueues with no intended behaviour changes:
>>>>>
>>>>>      system_wq -> system_percpu_wq
>>>>>      system_unbound_wq -> system_dfl_wq
>>>>>
>>>>> This way the old obsolete workqueues (system_wq, system_unbound_wq) can be
>>>>> removed in the future.
>>>> I recall earlier efforts to replace system workqueues with per-driver queues,
>>>> because unloading a driver forces a flush of the entire system workqueue,
>>>> which is undesirable for overall system behavior.
>>>>
>>>> Wouldn't it be better to introduce a local workqueue here and use that instead?
>>> Thanks.
>>>
>>> 1. The initialization should be:
>>>
>>> my_wq = alloc_workqueue("my_driver_queue", WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
>>> if (!my_wq)
>>>      return -ENOMEM;
>>>
>>> 2. The submission should be:
>>>
>>> queue_work(my_wq, &my_work);
>>>
>>> 3. The teardown should be:
>>>
>>> destroy_workqueue(my_wq);
>>>
>>> Thanks,
>>> Zhu Yanjun
>> Hi, Leon
>>
>> The diff for a new work queue in rxe is as below. Please review it.
> I'm not sure that you need a second workqueue. Also, destroy_workqueue()
> already does flush_workqueue(), so there is no need to call it explicitly.
The explicit flush_workqueue() calls can be removed.

The second workqueue was introduced because rxe_wq is heavily utilized
by QP tasks. The additional workqueue helps offload and distribute the
workload, preventing rxe_wq from becoming a bottleneck.

If you believe that the workload on rxe_wq is not significant, I can
simplify the design by removing the second workqueue and using rxe_wq
for all work items instead.
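
For illustration only, the simplified single-workqueue variant could look
roughly like the untested sketch below. It assumes the
WQ_UNBOUND | WQ_MEM_RECLAIM flags from the diff are kept and drops the
explicit flush_workqueue() call, since destroy_workqueue() already takes
care of pending work. The helper name rxe_queue_work() is hypothetical and
is not part of the posted diff.

```c
#include <linux/errno.h>
#include <linux/workqueue.h>

/* single workqueue shared by rxe tasks and other rxe jobs */
static struct workqueue_struct *rxe_wq;

int rxe_alloc_wq(void)
{
	rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND | WQ_MEM_RECLAIM,
				 WQ_MAX_ACTIVE);
	if (!rxe_wq)
		return -ENOMEM;

	return 0;
}

void rxe_destroy_wq(void)
{
	/* no explicit flush_workqueue() needed before destroying */
	destroy_workqueue(rxe_wq);
}

/* hypothetical helper: queue non-task work (e.g. ODP prefetch) on rxe_wq */
void rxe_queue_work(struct work_struct *work)
{
	queue_work(rxe_wq, work);
}
```

With this shape, rxe_ib_advise_mr_prefetch() would call the helper instead
of queue_work(system_unbound_wq, ...), and module unload would no longer
depend on a system-wide workqueue.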

Zhu Yanjun

>
> Thanks
>
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_odp.c b/drivers/infiniband/sw/rxe/rxe_odp.c
>> index bc11b1ec59ac..03199fef47fb 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_odp.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_odp.c
>> @@ -545,7 +545,7 @@ static int rxe_ib_advise_mr_prefetch(struct ib_pd *ibpd,
>>           work->frags[i].mr = mr;
>>       }
>>
>> -    queue_work(system_unbound_wq, &work->work);
>> +    rxe_queue_aux_work(&work->work);
>>
>>       return 0;
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
>> index f522820b950c..a2da699b969e 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_task.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_task.c
>> @@ -6,19 +6,36 @@
>>
>>   #include "rxe.h"
>>
>> +/* work for rxe_task */
>>   static struct workqueue_struct *rxe_wq;
>>
>> +/* work for other rxe jobs */
>> +static struct workqueue_struct *rxe_aux_wq;
>> +
>>   int rxe_alloc_wq(void)
>>   {
>> -    rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE);
>> +    rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND | WQ_MEM_RECLAIM,
>> +                WQ_MAX_ACTIVE);
>>       if (!rxe_wq)
>>           return -ENOMEM;
>>
>> +    rxe_aux_wq = alloc_workqueue("rxe_aux_wq",
>> +                WQ_UNBOUND | WQ_MEM_RECLAIM, WQ_MAX_ACTIVE);
>> +    if (!rxe_aux_wq) {
>> +        destroy_workqueue(rxe_wq);
>> +        return -ENOMEM;
>> +
>> +    }
>> +
>>       return 0;
>>   }
>>
>>   void rxe_destroy_wq(void)
>>   {
>> +    flush_workqueue(rxe_aux_wq);
>> +    destroy_workqueue(rxe_aux_wq);
>> +
>> +    flush_workqueue(rxe_wq);
>>       destroy_workqueue(rxe_wq);
>>   }
>>
>> @@ -254,6 +271,14 @@ void rxe_sched_task(struct rxe_task *task)
>>       spin_unlock_irqrestore(&task->lock, flags);
>>   }
>>
>> +/* rxe_wq for rxe tasks. rxe_aux_wq for other rxe jobs.
>> + */
>> +void rxe_queue_aux_work(struct work_struct *work)
>> +{
>> +    WARN_ON_ONCE(!rxe_aux_wq);
>> +    queue_work(rxe_aux_wq, work);
>> +}
>> +
>>   /* rxe_disable/enable_task are only called from
>>    * rxe_modify_qp in process context. Task is moved
>>    * to the drained state by do_task.
>> diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
>> index a8c9a77b6027..e1c0a34808b4 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_task.h
>> +++ b/drivers/infiniband/sw/rxe/rxe_task.h
>> @@ -36,6 +36,7 @@ int rxe_alloc_wq(void);
>>
>>   void rxe_destroy_wq(void);
>>
>> +void rxe_queue_aux_work(struct work_struct *work);
>>   /*
>>    * init rxe_task structure
>>    *    qp  => parameter to pass to func
>>
>> Zhu Yanjun
>>
>>>> Thanks
>>>>
>>>>> Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
>>>>> Suggested-by: Tejun Heo <tj@kernel.org>
>>>>> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
>>>>> ---
>>>>>    drivers/infiniband/sw/rxe/rxe_odp.c | 2 +-
>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_odp.c b/drivers/infiniband/sw/rxe/rxe_odp.c
>>>>> index bc11b1ec59ac..d440c8cbaea5 100644
>>>>> --- a/drivers/infiniband/sw/rxe/rxe_odp.c
>>>>> +++ b/drivers/infiniband/sw/rxe/rxe_odp.c
>>>>> @@ -545,7 +545,7 @@ static int rxe_ib_advise_mr_prefetch(struct ib_pd *ibpd,
>>>>>            work->frags[i].mr = mr;
>>>>>        }
>>>>> -    queue_work(system_unbound_wq, &work->work);
>>>>> +    queue_work(system_dfl_wq, &work->work);
>>>>>        return 0;
>>>>> --
>>>>> 2.53.0
>>>>>
