Linux-NVME Archive on lore.kernel.org
From: Ping Gan <jacky_gam_2001@163.com>
To: sagi@grimberg.me, hch@lst.de, kch@nvidia.com,
	linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: ping.gan@dell.com
Subject: Re: [PATCH 0/2] nvmet: support polling task for RDMA and TCP
Date: Mon,  1 Jul 2024 15:42:44 +0800
Message-ID: <20240701074245.73348-1-jacky_gam_2001@163.com>
In-Reply-To: <0779b376-38e3-42ef-b32a-a9cfab2749f2@grimberg.me>

>Hey Ping Gan,
>
>
>On 26/06/2024 11:28, Ping Gan wrote:
>> When running nvmf on an SMP platform, the current nvme target's RDMA
>> and TCP transports use kworkers to handle IO. But if there is another
>> heavy workload on the system (e.g. on kubernetes), the competition
>> between the kworkers and that workload is intense. And since the
>> kworkers are scheduled by the OS unpredictably, it is difficult to
>> control OS resources and to tune performance. If the target supports
>> using a dedicated polling task to handle IO, it becomes easier to
>> control OS resources and gain good performance. So it makes sense to
>> add a polling task to the nvmet-rdma and nvmet-tcp modules.
>
>This is NOT the way to go here.
>
>Both rdma and tcp are driven from workqueue context, which are bound
>workqueues.
>
>So there are two ways to go here:
>1. Add generic port cpuset and use that to direct traffic to the
>appropriate set of cores (i.e. select an appropriate comp_vector for
>rdma and add an appropriate steering rule for tcp).
>2. Add options to rdma/tcp to use UNBOUND workqueues, and allow users
>to control these UNBOUND workqueues cpumask via sysfs.
>
>(2) will not control interrupts to steer to other workloads cpus, but
>the handlers may run on a set of dedicated cpus.
>
>(1) is a better solution, but harder to implement.
>
>You also should look into nvmet-fc as well (and nvmet-loop for that
>matter).

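For reference, option (2) above comes down to writing a cpumask string
to a workqueue attribute in sysfs. The following is a minimal Python
sketch of building such a string, assuming the usual kernel bitmap
format (hex, in comma-separated 32-bit groups). The helper name
cpus_to_cpumask is hypothetical, and the sysfs path in the comment is
an assumption based on how WQ_SYSFS workqueues are exposed; nvmet does
not provide such a workqueue in the patches discussed here.

```python
def cpus_to_cpumask(cpus):
    """Format CPU ids as a hex cpumask string in 32-bit comma-separated groups.

    Hypothetical helper for illustration only. The result is the kind of
    string one would write to an attribute such as
    /sys/devices/virtual/workqueue/<wq-name>/cpumask (assumed path).
    """
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU ids must be non-negative")
        mask |= 1 << cpu

    hex_digits = format(mask, "x")
    groups = []
    # Split from the right into groups of 8 hex digits (32 bits each).
    while hex_digits:
        groups.append(hex_digits[-8:])
        hex_digits = hex_digits[:-8]
    return ",".join(reversed(groups))


if __name__ == "__main__":
    # CPUs 0, 1 and 4 set bits 0, 1 and 4: 0b10011 == 0x13
    print(cpus_to_cpumask([0, 1, 4]))
```

As noted in the quoted reply, this would only place the handlers on the
chosen CPUs; it would not steer interrupts.
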
Hi Sagi Grimberg,

Thanks for your reply. We actually tried the first approach you
suggested, but found that performance was poor when using spdk as the
initiator. This patch is meant to resolve not only the OS resource
competition issue but also the perf issue. Our analysis shows that when
the initiator is a polling driver (e.g. spdk) and the target still uses
a workqueue (kworker), the workqueue/kworker target is the bottleneck,
since every nvmf request may incur a wait latency between being queued
on the workqueue and the start of processing. That latency can be
traced with wqlat from bcc
(https://github.com/iovisor/bcc/blob/master/tools/wqlat.py).
We think this latency is a disaster for a polling-driver data plane,
right? So we think adding a polling task mode on the nvmet side to
handle IO really does make sense; what's your opinion about this?

You mentioned that we should also look into nvmet-fc, and I agree.
However, we currently have no nvmet-fc testbed; once we get one, we
will do that.
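
To illustrate the kind of latency meant here, below is a userspace
Python analogy, not kernel code: it records the delay between a request
being put on a queue and a worker thread starting to handle it. The
function name measure_queue_latency is hypothetical, and the numbers it
produces say nothing about real kworker scheduling; it only shows which
interval is being measured.

```python
import queue
import threading
import time


def measure_queue_latency(n_requests=100):
    """Return per-request delays (seconds) from enqueue to start of handling."""
    work = queue.Queue()
    latencies = []

    def worker():
        while True:
            enqueued_at = work.get()
            if enqueued_at is None:  # sentinel: no more requests
                break
            # Delay between queuing and the handler actually starting.
            latencies.append(time.perf_counter() - enqueued_at)

    t = threading.Thread(target=worker)
    t.start()
    for _ in range(n_requests):
        work.put(time.perf_counter())
    work.put(None)
    t.join()
    return latencies


if __name__ == "__main__":
    lat = measure_queue_latency()
    print(f"max queue-to-start delay: {max(lat) * 1e6:.1f} us")
```

A polling task avoids this hand-off by checking for new requests in a
loop on its own thread instead of waiting to be woken up.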


Thanks,
Ping
