From: Zhu Yanjun <yanjun.zhu@linux.dev>
To: Leon Romanovsky <leon@kernel.org>,
RDMA mailing list <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH 1/1] Revert "RDMA/rxe: Add workqueue support for rxe tasks"
Date: Wed, 4 Oct 2023 09:00:17 +0800
Message-ID: <f3c3f4bd-0375-41b4-b479-5d3194ecb985@linux.dev>
In-Reply-To: <be4c9b0e-8acf-7fee-5ad0-209df5d3b0f9@linux.dev>
On 2023/10/4 8:46, Zhu Yanjun wrote:
>
> On 2023/10/4 2:11, Leon Romanovsky wrote:
>> On Tue, Oct 03, 2023 at 11:29:42PM +0800, Zhu Yanjun wrote:
>>> On 2023/10/3 17:59, Leon Romanovsky wrote:
>>>> On Tue, Oct 03, 2023 at 04:55:40PM +0800, Zhu Yanjun wrote:
>>>>> On 2023/10/1 14:50, Leon Romanovsky wrote:
>>>>>> On Sun, Oct 1, 2023, at 09:47, Zhu Yanjun wrote:
>>>>>>> On 2023/10/1 14:39, Leon Romanovsky wrote:
>>>>>>>> On Sun, Oct 1, 2023, at 09:34, Zhu Yanjun wrote:
>>>>>>>>> On 2023/10/1 14:30, Leon Romanovsky wrote:
>>>>>>>>>> On Wed, Sep 27, 2023 at 11:51:12AM -0500, Bob Pearson wrote:
>>>>>>>>>>> On 9/26/23 15:24, Bart Van Assche wrote:
>>>>>>>>>>>> On 9/26/23 11:34, Bob Pearson wrote:
>>>>>>>>>>>>> I am working to try to reproduce the KASAN warning. Unfortunately,
>>>>>>>>>>>>> so far I am not able to see it in Ubuntu + Linus' kernel
>>>>>>>>>>>>> (as you described) on metal. The config file is different
>>>>>>>>>>>>> but copies the CONFIG_KASAN_xxx exactly as yours. With
>>>>>>>>>>>>> KASAN enabled it hangs on every iteration of srp/002 but
>>>>>>>>>>>>> without a KASAN warning. I am now building an openSuSE VM
>>>>>>>>>>>>> for qemu and will see if that causes the warning.
>>>>>>>>>>>> Hi Bob,
>>>>>>>>>>>>
>>>>>>>>>>>> Did you try to understand the report that I shared? My conclusion from
>>>>>>>>>>>> the report is that when using tasklets rxe_completer() only runs after
>>>>>>>>>>>> rxe_requester() has finished and also that when using work queues that
>>>>>>>>>>>> rxe_completer() may run concurrently with rxe_requester(). This patch
>>>>>>>>>>>> seems to fix all issues that I ran into with the rdma_rxe workqueue
>>>>>>>>>>>> patch (I have not tried to verify the performance implications of this
>>>>>>>>>>>> patch):
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
>>>>>>>>>>>> index 1501120d4f52..6cd5d5a7a316 100644
>>>>>>>>>>>> --- a/drivers/infiniband/sw/rxe/rxe_task.c
>>>>>>>>>>>> +++ b/drivers/infiniband/sw/rxe/rxe_task.c
>>>>>>>>>>>> @@ -10,7 +10,7 @@ static struct workqueue_struct *rxe_wq;
>>>>>>>>>>>>
>>>>>>>>>>>>  int rxe_alloc_wq(void)
>>>>>>>>>>>>  {
>>>>>>>>>>>> -       rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE);
>>>>>>>>>>>> +       rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, 1);
>>>>>>>>>>>>         if (!rxe_wq)
>>>>>>>>>>>>                 return -ENOMEM;
>>>>> With this commit, a test ran for several days, and a similar problem
>>>>> still occurred.
>>>>>
>>>>> The problem is very similar to the one that Bart mentioned.
>>>>>
>>>>> It is very likely that changing WQ_MAX_ACTIVE to 1 alleviates this
>>>>> problem.
>>>>>
>>>>> In the following
>>>>>
>>>>> __printf(1, 4)
>>>>> struct workqueue_struct *alloc_workqueue(const char *fmt,
>>>>>                                          unsigned int flags,
>>>>>                                          int max_active, ...)
>>>>> {
>>>>>         va_list args;
>>>>>         struct workqueue_struct *wq;
>>>>>         struct pool_workqueue *pwq;
>>>>>
>>>>>         /*
>>>>>          * Unbound && max_active == 1 used to imply ordered, which is no longer
>>>>>          * the case on many machines due to per-pod pools. While
>>>>>          * alloc_ordered_workqueue() is the right way to create an ordered
>>>>>          * workqueue, keep the previous behavior to avoid subtle breakages.
>>>>>          */
>>>>>         if ((flags & WQ_UNBOUND) && max_active == 1) /* <--- This means that the workqueue is ordered. */
>>>>>                 flags |= __WQ_ORDERED;
>>>>> ...
>>>>>
>>>>> Does this mean that the ordered workqueue hides the root cause? When
>>>>> the workqueue is changed to ordered, it is difficult to reproduce this
>>>>> problem.
>>>>> Got it.
>
> Is there any way to ensure the following?
>
> if a mail does not appear in the rdma maillist, this mail will not be
> reviewed?
Sorry, my bad. I used the wrong RDMA mailing list.
>
>>
>>> The analysis is as below:
>>>
>>> Because a workqueue can sleep when it is preempted, the sleep time
>>> will sometimes exceed the timeout of RDMA packets. As a result, the
>>> RDMA stack or the ULP will OOM or hang. This is why the workqueue
>>> causes the ULP to hang.
>>>
>>> A tasklet does not sleep, so this kind of problem does not occur with
>>> tasklets.
>>>
>>> About the performance: currently an ordered workqueue can execute at
>>> most one work item at any given time, in the queued order. So in RXE,
>>> the workqueue will not execute more jobs than the tasklet.
>> It is because of changing max_active to be 1. Once that bug is
>> fixed, RXE will be able to spread traffic on all CPUs.
>
Sure. I agree with you.

After max_active is changed to 1, the workqueue becomes an ordered
workqueue. An ordered workqueue executes work items one at a time in the
queued order: only after one work item completes does it execute the next
one, and successive items may run on different CPUs. A tasklet also
executes its jobs one by one, but on the same CPU.

So if the total number of jobs is the same, the ordered workqueue takes
about the same execution time as the tasklet, but it has more scheduling
overhead. Overall, the ordered workqueue performs worse than the tasklet.
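
To make the comparison concrete, here is a minimal sketch of the cost model
implied above. It is not RXE or kernel code. The function names and the idea
of a fixed per-item dispatch cost are assumptions made only for illustration:
both mechanisms run the jobs strictly one after another, and they differ only
in how much scheduling overhead each job pays.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical cost model, not RXE code: total time to run `jobs` work
 * items strictly one at a time, where each item costs `work_ns` of useful
 * work plus `dispatch_ns` of per-item scheduling overhead.
 */
static uint64_t serialized_total_ns(uint64_t jobs, uint64_t work_ns,
                                    uint64_t dispatch_ns)
{
	return jobs * (work_ns + dispatch_ns);
}

/*
 * Extra time the ordered workqueue pays over the tasklet in this model.
 * The useful work cancels out, so only the dispatch costs matter.
 */
static uint64_t ordered_wq_penalty_ns(uint64_t jobs, uint64_t work_ns,
                                      uint64_t tasklet_dispatch_ns,
                                      uint64_t wq_dispatch_ns)
{
	/* Assumption of the model: workqueue dispatch is at least as costly. */
	assert(wq_dispatch_ns >= tasklet_dispatch_ns);

	return serialized_total_ns(jobs, work_ns, wq_dispatch_ns) -
	       serialized_total_ns(jobs, work_ns, tasklet_dispatch_ns);
}
```

With made-up numbers (1000 jobs of 2000 ns each, 100 ns of dispatch cost
for the tasklet and 500 ns for the workqueue), the penalty in this model is
1000 * (500 - 100) = 400000 ns. Real dispatch costs would have to be
measured.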
Zhu Yanjun