* Re: [PATCH net-next 3/3] virtio-net: reduce the CPU consumption of dim worker
2024-01-16 13:11 ` [PATCH net-next 3/3] virtio-net: reduce the CPU consumption of dim worker Heng Qi
@ 2024-01-16 19:56 ` Simon Horman
2024-01-17 3:41 ` Heng Qi
2024-01-19 10:31 ` kernel test robot
` (3 subsequent siblings)
4 siblings, 1 reply; 16+ messages in thread
From: Simon Horman @ 2024-01-16 19:56 UTC (permalink / raw)
To: Heng Qi
Cc: netdev, virtualization, Jason Wang, Michael S. Tsirkin,
Paolo Abeni, Jakub Kicinski, Eric Dumazet, David S. Miller,
Xuan Zhuo
On Tue, Jan 16, 2024 at 09:11:33PM +0800, Heng Qi wrote:
> Accumulate multiple request commands to kick the device once,
> and obtain the processing results of the corresponding commands
> asynchronously. The batch command method is used to optimize the
> CPU overhead of the DIM worker caused by the guest being busy
> waiting for the command response result.
>
> On an 8-queue device, without this patch, the guest cpu overhead
> due to waiting for cvq could be 10+% and above. With this patch,
> the corresponding overhead is basically invisible.
>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
...
> @@ -3520,22 +3568,99 @@ static int virtnet_send_notf_coal_vq_cmds(struct virtnet_info *vi,
> return 0;
> }
>
> +static bool virtnet_add_dim_command(struct virtnet_info *vi,
> + struct virtnet_batch_coal *ctrl)
> +{
> + struct scatterlist *sgs[4], hdr, stat, out;
> + unsigned out_num = 0;
> + int ret;
> +
> + /* Caller should know better */
> + BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ));
> +
> + ctrl->hdr.class = VIRTIO_NET_CTRL_NOTF_COAL;
> + ctrl->hdr.cmd = VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET;
> +
> + /* Add header */
> + sg_init_one(&hdr, &ctrl->hdr, sizeof(ctrl->hdr));
> + sgs[out_num++] = &hdr;
> +
> + /* Add body */
> + sg_init_one(&out, &ctrl->num_entries, sizeof(ctrl->num_entries) +
> + ctrl->num_entries * sizeof(struct virtnet_coal_entry));
Hi Heng Qi,
I am a bit confused.
With this series applied on top of net-next
struct virtnet_coal_entry doesn't seem to exist.
> + sgs[out_num++] = &out;
> +
> + /* Add return status. */
> + ctrl->status = VIRTIO_NET_OK;
> + sg_init_one(&stat, &ctrl->status, sizeof(ctrl->status));
> + sgs[out_num] = &stat;
> +
> + BUG_ON(out_num + 1 > ARRAY_SIZE(sgs));
> + ret = virtqueue_add_sgs(vi->cvq, sgs, out_num, 1, ctrl, GFP_ATOMIC);
> + if (ret < 0) {
> + dev_warn(&vi->vdev->dev, "Failed to add sgs for command vq: %d\n.", ret);
> + return false;
> + }
> +
> + virtqueue_kick(vi->cvq);
> +
> + ctrl->usable = false;
> + vi->cvq_cmd_nums++;
> +
> + return true;
> +}
...
^ permalink raw reply [flat|nested] 16+ messages in thread* Re: [PATCH net-next 3/3] virtio-net: reduce the CPU consumption of dim worker
2024-01-16 19:56 ` Simon Horman
@ 2024-01-17 3:41 ` Heng Qi
0 siblings, 0 replies; 16+ messages in thread
From: Heng Qi @ 2024-01-17 3:41 UTC (permalink / raw)
To: Simon Horman
Cc: netdev, virtualization, Jason Wang, Michael S. Tsirkin,
Paolo Abeni, Jakub Kicinski, Eric Dumazet, David S. Miller,
Xuan Zhuo
在 2024/1/17 上午3:56, Simon Horman 写道:
> On Tue, Jan 16, 2024 at 09:11:33PM +0800, Heng Qi wrote:
>> Accumulate multiple request commands to kick the device once,
>> and obtain the processing results of the corresponding commands
>> asynchronously. The batch command method is used to optimize the
>> CPU overhead of the DIM worker caused by the guest being busy
>> waiting for the command response result.
>>
>> On an 8-queue device, without this patch, the guest cpu overhead
>> due to waiting for cvq could be 10+% and above. With this patch,
>> the corresponding overhead is basically invisible.
>>
>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> ...
>
>> @@ -3520,22 +3568,99 @@ static int virtnet_send_notf_coal_vq_cmds(struct virtnet_info *vi,
>> return 0;
>> }
>>
>> +static bool virtnet_add_dim_command(struct virtnet_info *vi,
>> + struct virtnet_batch_coal *ctrl)
>> +{
>> + struct scatterlist *sgs[4], hdr, stat, out;
>> + unsigned out_num = 0;
>> + int ret;
>> +
>> + /* Caller should know better */
>> + BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ));
>> +
>> + ctrl->hdr.class = VIRTIO_NET_CTRL_NOTF_COAL;
>> + ctrl->hdr.cmd = VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET;
>> +
>> + /* Add header */
>> + sg_init_one(&hdr, &ctrl->hdr, sizeof(ctrl->hdr));
>> + sgs[out_num++] = &hdr;
>> +
>> + /* Add body */
>> + sg_init_one(&out, &ctrl->num_entries, sizeof(ctrl->num_entries) +
>> + ctrl->num_entries * sizeof(struct virtnet_coal_entry));
> Hi Heng Qi,
>
> I am a bit confused.
> With this series applied on top of net-next
> struct virtnet_coal_entry doesn't seem to exist.
Hi Simon,
It should be struct virtio_net_ctrl_coal_vq.
Thanks.
>
>
>> + sgs[out_num++] = &out;
>> +
>> + /* Add return status. */
>> + ctrl->status = VIRTIO_NET_OK;
>> + sg_init_one(&stat, &ctrl->status, sizeof(ctrl->status));
>> + sgs[out_num] = &stat;
>> +
>> + BUG_ON(out_num + 1 > ARRAY_SIZE(sgs));
>> + ret = virtqueue_add_sgs(vi->cvq, sgs, out_num, 1, ctrl, GFP_ATOMIC);
>> + if (ret < 0) {
>> + dev_warn(&vi->vdev->dev, "Failed to add sgs for command vq: %d\n.", ret);
>> + return false;
>> + }
>> +
>> + virtqueue_kick(vi->cvq);
>> +
>> + ctrl->usable = false;
>> + vi->cvq_cmd_nums++;
>> +
>> + return true;
>> +}
> ...
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH net-next 3/3] virtio-net: reduce the CPU consumption of dim worker
2024-01-16 13:11 ` [PATCH net-next 3/3] virtio-net: reduce the CPU consumption of dim worker Heng Qi
2024-01-16 19:56 ` Simon Horman
@ 2024-01-19 10:31 ` kernel test robot
2024-01-19 13:43 ` kernel test robot
` (2 subsequent siblings)
4 siblings, 0 replies; 16+ messages in thread
From: kernel test robot @ 2024-01-19 10:31 UTC (permalink / raw)
To: Heng Qi, netdev, virtualization
Cc: llvm, oe-kbuild-all, Jason Wang, Michael S. Tsirkin, Paolo Abeni,
Jakub Kicinski, Eric Dumazet, Xuan Zhuo
Hi Heng,
kernel test robot noticed the following build errors:
[auto build test ERROR on net-next/main]
url: https://github.com/intel-lab-lkp/linux/commits/Heng-Qi/virtio-net-fix-possible-dim-status-unrecoverable/20240116-211306
base: net-next/main
patch link: https://lore.kernel.org/r/1705410693-118895-4-git-send-email-hengqi%40linux.alibaba.com
patch subject: [PATCH net-next 3/3] virtio-net: reduce the CPU consumption of dim worker
config: i386-randconfig-012-20240119 (https://download.01.org/0day-ci/archive/20240119/202401191807.RIhMHJ4U-lkp@intel.com/config)
compiler: ClangBuiltLinux clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240119/202401191807.RIhMHJ4U-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202401191807.RIhMHJ4U-lkp@intel.com/
All errors (new ones prefixed by >>):
>> drivers/net/virtio_net.c:3590:27: error: invalid application of 'sizeof' to an incomplete type 'struct virtnet_coal_entry'
3590 | ctrl->num_entries * sizeof(struct virtnet_coal_entry));
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/virtio_net.c:3590:41: note: forward declaration of 'struct virtnet_coal_entry'
3590 | ctrl->num_entries * sizeof(struct virtnet_coal_entry));
| ^
drivers/net/virtio_net.c:3658:27: error: invalid application of 'sizeof' to an incomplete type 'struct virtnet_coal_entry'
3658 | vi->max_queue_pairs * sizeof(struct virtnet_coal_entry));
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/virtio_net.c:3658:41: note: forward declaration of 'struct virtnet_coal_entry'
3658 | vi->max_queue_pairs * sizeof(struct virtnet_coal_entry));
| ^
drivers/net/virtio_net.c:4544:31: error: invalid application of 'sizeof' to an incomplete type 'struct virtnet_coal_entry'
4544 | vi->max_queue_pairs * sizeof(struct virtnet_coal_entry)), GFP_KERNEL);
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/virtio_net.c:4544:45: note: forward declaration of 'struct virtnet_coal_entry'
4544 | vi->max_queue_pairs * sizeof(struct virtnet_coal_entry)), GFP_KERNEL);
| ^
drivers/net/virtio_net.c:4551:27: error: invalid application of 'sizeof' to an incomplete type 'struct virtnet_coal_entry'
4551 | vi->max_queue_pairs * sizeof(struct virtnet_coal_entry));
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/virtio_net.c:4544:45: note: forward declaration of 'struct virtnet_coal_entry'
4544 | vi->max_queue_pairs * sizeof(struct virtnet_coal_entry)), GFP_KERNEL);
| ^
4 errors generated.
vim +3590 drivers/net/virtio_net.c
3570
3571 static bool virtnet_add_dim_command(struct virtnet_info *vi,
3572 struct virtnet_batch_coal *ctrl)
3573 {
3574 struct scatterlist *sgs[4], hdr, stat, out;
3575 unsigned out_num = 0;
3576 int ret;
3577
3578 /* Caller should know better */
3579 BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ));
3580
3581 ctrl->hdr.class = VIRTIO_NET_CTRL_NOTF_COAL;
3582 ctrl->hdr.cmd = VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET;
3583
3584 /* Add header */
3585 sg_init_one(&hdr, &ctrl->hdr, sizeof(ctrl->hdr));
3586 sgs[out_num++] = &hdr;
3587
3588 /* Add body */
3589 sg_init_one(&out, &ctrl->num_entries, sizeof(ctrl->num_entries) +
> 3590 ctrl->num_entries * sizeof(struct virtnet_coal_entry));
3591 sgs[out_num++] = &out;
3592
3593 /* Add return status. */
3594 ctrl->status = VIRTIO_NET_OK;
3595 sg_init_one(&stat, &ctrl->status, sizeof(ctrl->status));
3596 sgs[out_num] = &stat;
3597
3598 BUG_ON(out_num + 1 > ARRAY_SIZE(sgs));
3599 ret = virtqueue_add_sgs(vi->cvq, sgs, out_num, 1, ctrl, GFP_ATOMIC);
3600 if (ret < 0) {
3601 dev_warn(&vi->vdev->dev, "Failed to add sgs for command vq: %d\n.", ret);
3602 return false;
3603 }
3604
3605 virtqueue_kick(vi->cvq);
3606
3607 ctrl->usable = false;
3608 vi->cvq_cmd_nums++;
3609
3610 return true;
3611 }
3612
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 16+ messages in thread* Re: [PATCH net-next 3/3] virtio-net: reduce the CPU consumption of dim worker
2024-01-16 13:11 ` [PATCH net-next 3/3] virtio-net: reduce the CPU consumption of dim worker Heng Qi
2024-01-16 19:56 ` Simon Horman
2024-01-19 10:31 ` kernel test robot
@ 2024-01-19 13:43 ` kernel test robot
2024-01-22 7:19 ` Xuan Zhuo
2024-01-22 7:42 ` Xuan Zhuo
4 siblings, 0 replies; 16+ messages in thread
From: kernel test robot @ 2024-01-19 13:43 UTC (permalink / raw)
To: Heng Qi, netdev, virtualization
Cc: oe-kbuild-all, Jason Wang, Michael S. Tsirkin, Paolo Abeni,
Jakub Kicinski, Eric Dumazet, Xuan Zhuo
Hi Heng,
kernel test robot noticed the following build errors:
[auto build test ERROR on net-next/main]
url: https://github.com/intel-lab-lkp/linux/commits/Heng-Qi/virtio-net-fix-possible-dim-status-unrecoverable/20240116-211306
base: net-next/main
patch link: https://lore.kernel.org/r/1705410693-118895-4-git-send-email-hengqi%40linux.alibaba.com
patch subject: [PATCH net-next 3/3] virtio-net: reduce the CPU consumption of dim worker
config: i386-buildonly-randconfig-004-20240119 (https://download.01.org/0day-ci/archive/20240119/202401192156.ZUNUJmuA-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240119/202401192156.ZUNUJmuA-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202401192156.ZUNUJmuA-lkp@intel.com/
All errors (new ones prefixed by >>):
drivers/net/virtio_net.c: In function 'virtnet_add_dim_command':
>> drivers/net/virtio_net.c:3590:48: error: invalid application of 'sizeof' to incomplete type 'struct virtnet_coal_entry'
3590 | ctrl->num_entries * sizeof(struct virtnet_coal_entry));
| ^~~~~~
drivers/net/virtio_net.c: In function 'virtnet_rx_dim_work':
drivers/net/virtio_net.c:3658:62: error: invalid application of 'sizeof' to incomplete type 'struct virtnet_coal_entry'
3658 | vi->max_queue_pairs * sizeof(struct virtnet_coal_entry));
| ^~~~~~
drivers/net/virtio_net.c: In function 'virtnet_alloc_queues':
drivers/net/virtio_net.c:4544:52: error: invalid application of 'sizeof' to incomplete type 'struct virtnet_coal_entry'
4544 | vi->max_queue_pairs * sizeof(struct virtnet_coal_entry)), GFP_KERNEL);
| ^~~~~~
drivers/net/virtio_net.c:4551:62: error: invalid application of 'sizeof' to incomplete type 'struct virtnet_coal_entry'
4551 | vi->max_queue_pairs * sizeof(struct virtnet_coal_entry));
| ^~~~~~
vim +3590 drivers/net/virtio_net.c
3570
3571 static bool virtnet_add_dim_command(struct virtnet_info *vi,
3572 struct virtnet_batch_coal *ctrl)
3573 {
3574 struct scatterlist *sgs[4], hdr, stat, out;
3575 unsigned out_num = 0;
3576 int ret;
3577
3578 /* Caller should know better */
3579 BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ));
3580
3581 ctrl->hdr.class = VIRTIO_NET_CTRL_NOTF_COAL;
3582 ctrl->hdr.cmd = VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET;
3583
3584 /* Add header */
3585 sg_init_one(&hdr, &ctrl->hdr, sizeof(ctrl->hdr));
3586 sgs[out_num++] = &hdr;
3587
3588 /* Add body */
3589 sg_init_one(&out, &ctrl->num_entries, sizeof(ctrl->num_entries) +
> 3590 ctrl->num_entries * sizeof(struct virtnet_coal_entry));
3591 sgs[out_num++] = &out;
3592
3593 /* Add return status. */
3594 ctrl->status = VIRTIO_NET_OK;
3595 sg_init_one(&stat, &ctrl->status, sizeof(ctrl->status));
3596 sgs[out_num] = &stat;
3597
3598 BUG_ON(out_num + 1 > ARRAY_SIZE(sgs));
3599 ret = virtqueue_add_sgs(vi->cvq, sgs, out_num, 1, ctrl, GFP_ATOMIC);
3600 if (ret < 0) {
3601 dev_warn(&vi->vdev->dev, "Failed to add sgs for command vq: %d\n.", ret);
3602 return false;
3603 }
3604
3605 virtqueue_kick(vi->cvq);
3606
3607 ctrl->usable = false;
3608 vi->cvq_cmd_nums++;
3609
3610 return true;
3611 }
3612
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 16+ messages in thread* Re: [PATCH net-next 3/3] virtio-net: reduce the CPU consumption of dim worker
2024-01-16 13:11 ` [PATCH net-next 3/3] virtio-net: reduce the CPU consumption of dim worker Heng Qi
` (2 preceding siblings ...)
2024-01-19 13:43 ` kernel test robot
@ 2024-01-22 7:19 ` Xuan Zhuo
2024-01-22 7:42 ` Xuan Zhuo
4 siblings, 0 replies; 16+ messages in thread
From: Xuan Zhuo @ 2024-01-22 7:19 UTC (permalink / raw)
To: Heng Qi
Cc: Jason Wang, Michael S. Tsirkin, Paolo Abeni, Jakub Kicinski,
Eric Dumazet, David S. Miller, netdev, virtualization
On Tue, 16 Jan 2024 21:11:33 +0800, Heng Qi <hengqi@linux.alibaba.com> wrote:
> Accumulate multiple request commands to kick the device once,
> and obtain the processing results of the corresponding commands
> asynchronously. The batch command method is used to optimize the
> CPU overhead of the DIM worker caused by the guest being busy
> waiting for the command response result.
>
> On an 8-queue device, without this patch, the guest cpu overhead
> due to waiting for cvq could be 10+% and above. With this patch,
> the corresponding overhead is basically invisible.
>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> ---
> drivers/net/virtio_net.c | 185 ++++++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 158 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index e4305ad..9f22c85 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -33,6 +33,8 @@
> module_param(gso, bool, 0444);
> module_param(napi_tx, bool, 0644);
>
> +#define BATCH_CMD 25
> +
> /* FIXME: MTU in config. */
> #define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
> #define GOOD_COPY_LEN 128
> @@ -134,6 +136,9 @@ struct virtnet_interrupt_coalesce {
> };
>
> struct virtnet_batch_coal {
> + struct virtio_net_ctrl_hdr hdr;
> + virtio_net_ctrl_ack status;
> + __u8 usable;
> __le32 num_entries;
> struct virtio_net_ctrl_coal_vq coal_vqs[];
> };
> @@ -299,6 +304,7 @@ struct virtnet_info {
>
> /* Work struct for delayed refilling if we run low on memory. */
> struct delayed_work refill;
> + struct delayed_work get_cvq;
>
> /* Is delayed refill enabled? */
> bool refill_enabled;
> @@ -326,6 +332,7 @@ struct virtnet_info {
> bool rx_dim_enabled;
>
> /* Interrupt coalescing settings */
> + int cvq_cmd_nums;
> struct virtnet_batch_coal *batch_coal;
> struct virtnet_interrupt_coalesce intr_coal_tx;
> struct virtnet_interrupt_coalesce intr_coal_rx;
> @@ -2512,6 +2519,46 @@ static int virtnet_tx_resize(struct virtnet_info *vi,
> return err;
> }
>
> +static bool virtnet_process_dim_cmd(struct virtnet_info *vi, void *res)
> +{
> + struct virtnet_batch_coal *batch_coal;
> + u16 queue;
> + int i;
> +
> + if (res != ((void *)vi)) {
> + batch_coal = (struct virtnet_batch_coal *)res;
> + batch_coal->usable = true;
> + vi->cvq_cmd_nums--;
> + for (i = 0; i < batch_coal->num_entries; i++) {
> + queue = batch_coal->coal_vqs[i].vqn / 2;
> + vi->rq[queue].dim.state = DIM_START_MEASURE;
> + }
> + } else {
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static bool virtnet_cvq_response(struct virtnet_info *vi, bool poll)
> +{
> + unsigned tmp;
> + void *res;
> +
> + if (!poll) {
> + while ((res = virtqueue_get_buf(vi->cvq, &tmp)) &&
> + !virtqueue_is_broken(vi->cvq))
> + virtnet_process_dim_cmd(vi, res);
> + return 0;
> + }
> +
> + while (!(res = virtqueue_get_buf(vi->cvq, &tmp)) &&
> + !virtqueue_is_broken(vi->cvq))
> + cpu_relax();
> +
> + return virtnet_process_dim_cmd(vi, res);
How about?
while (true) {
res = virtqueue_get_buf(vi->cvq, &tmp);
if (!res) {
if (poll) {
if (virtqueue_is_broken(vi->cvq))
return 0;
cpu_relax();
continue;
}
return false;
}
if (res == ((void *)vi))
return true;
virtnet_process_dim_cmd(vi, res);
}
Thanks.
> +}
> +
> /*
> * Send command via the control virtqueue and check status. Commands
> * supported by the hypervisor, as indicated by feature bits, should
> @@ -2521,7 +2568,7 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
> struct scatterlist *out)
> {
> struct scatterlist *sgs[4], hdr, stat;
> - unsigned out_num = 0, tmp;
> + unsigned out_num = 0;
> int ret;
>
> /* Caller should know better */
> @@ -2555,9 +2602,9 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
> /* Spin for a response, the kick causes an ioport write, trapping
> * into the hypervisor, so the request should be handled immediately.
> */
> - while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> - !virtqueue_is_broken(vi->cvq))
> - cpu_relax();
> + while (true)
> + if (virtnet_cvq_response(vi, true))
> + break;
>
> return vi->ctrl->status == VIRTIO_NET_OK;
> }
> @@ -2709,6 +2756,7 @@ static int virtnet_close(struct net_device *dev)
> cancel_work_sync(&vi->rq[i].dim.work);
> }
>
> + cancel_delayed_work_sync(&vi->get_cvq);
> return 0;
> }
>
> @@ -3520,22 +3568,99 @@ static int virtnet_send_notf_coal_vq_cmds(struct virtnet_info *vi,
> return 0;
> }
>
> +static bool virtnet_add_dim_command(struct virtnet_info *vi,
> + struct virtnet_batch_coal *ctrl)
> +{
> + struct scatterlist *sgs[4], hdr, stat, out;
> + unsigned out_num = 0;
> + int ret;
> +
> + /* Caller should know better */
> + BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ));
> +
> + ctrl->hdr.class = VIRTIO_NET_CTRL_NOTF_COAL;
> + ctrl->hdr.cmd = VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET;
> +
> + /* Add header */
> + sg_init_one(&hdr, &ctrl->hdr, sizeof(ctrl->hdr));
> + sgs[out_num++] = &hdr;
> +
> + /* Add body */
> + sg_init_one(&out, &ctrl->num_entries, sizeof(ctrl->num_entries) +
> + ctrl->num_entries * sizeof(struct virtnet_coal_entry));
> + sgs[out_num++] = &out;
> +
> + /* Add return status. */
> + ctrl->status = VIRTIO_NET_OK;
> + sg_init_one(&stat, &ctrl->status, sizeof(ctrl->status));
> + sgs[out_num] = &stat;
> +
> + BUG_ON(out_num + 1 > ARRAY_SIZE(sgs));
> + ret = virtqueue_add_sgs(vi->cvq, sgs, out_num, 1, ctrl, GFP_ATOMIC);
> + if (ret < 0) {
> + dev_warn(&vi->vdev->dev, "Failed to add sgs for command vq: %d\n.", ret);
> + return false;
> + }
> +
> + virtqueue_kick(vi->cvq);
> +
> + ctrl->usable = false;
> + vi->cvq_cmd_nums++;
> +
> + return true;
> +}
> +
> +static void get_cvq_work(struct work_struct *work)
> +{
> + struct virtnet_info *vi =
> + container_of(work, struct virtnet_info, get_cvq.work);
> +
> + if (!rtnl_trylock()) {
> + schedule_delayed_work(&vi->get_cvq, 5);
> + return;
> + }
> +
> + if (!vi->cvq_cmd_nums)
> + goto ret;
> +
> + virtnet_cvq_response(vi, false);
> +
> + if (vi->cvq_cmd_nums)
> + schedule_delayed_work(&vi->get_cvq, 5);
> +
> +ret:
> + rtnl_unlock();
> +}
> +
> static void virtnet_rx_dim_work(struct work_struct *work)
> {
> struct dim *dim = container_of(work, struct dim, work);
> struct receive_queue *rq = container_of(dim,
> struct receive_queue, dim);
> struct virtnet_info *vi = rq->vq->vdev->priv;
> + struct virtnet_batch_coal *avail_coal;
> struct dim_cq_moder update_moder;
> - struct virtnet_batch_coal *coal = vi->batch_coal;
> - struct scatterlist sgs;
> - int i, j = 0;
> + int i, j = 0, position;
> + u8 *buf;
>
> if (!rtnl_trylock()) {
> schedule_work(&dim->work);
> return;
> }
>
> + if (vi->cvq_cmd_nums == BATCH_CMD || vi->cvq->num_free < 3 ||
> + vi->cvq->num_free <= (virtqueue_get_vring_size(vi->cvq) / 3))
> + virtnet_cvq_response(vi, true);
> +
> + for (i = 0; i < BATCH_CMD; i++) {
> + buf = (u8 *)vi->batch_coal;
> + position = i * (sizeof(struct virtnet_batch_coal) +
> + vi->max_queue_pairs * sizeof(struct virtnet_coal_entry));
> + avail_coal = (struct virtnet_batch_coal *)(&buf[position]);
> + if (avail_coal->usable)
> + break;
> + }
> +
> /* Each rxq's work is queued by "net_dim()->schedule_work()"
> * in response to NAPI traffic changes. Note that dim->profile_ix
> * for each rxq is updated prior to the queuing action.
> @@ -3552,30 +3677,26 @@ static void virtnet_rx_dim_work(struct work_struct *work)
> update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
> if (update_moder.usec != rq->intr_coal.max_usecs ||
> update_moder.pkts != rq->intr_coal.max_packets) {
> - coal->coal_vqs[j].vqn = cpu_to_le16(rxq2vq(i));
> - coal->coal_vqs[j].coal.max_usecs = cpu_to_le32(update_moder.usec);
> - coal->coal_vqs[j].coal.max_packets = cpu_to_le32(update_moder.pkts);
> + avail_coal->coal_vqs[j].vqn = cpu_to_le16(rxq2vq(i));
> + avail_coal->coal_vqs[j].coal.max_usecs = cpu_to_le32(update_moder.usec);
> + avail_coal->coal_vqs[j].coal.max_packets = cpu_to_le32(update_moder.pkts);
> rq->intr_coal.max_usecs = update_moder.usec;
> rq->intr_coal.max_packets = update_moder.pkts;
> j++;
> - }
> + } else if (dim->state == DIM_APPLY_NEW_PROFILE)
> + dim->state = DIM_START_MEASURE;
> }
>
> if (!j)
> goto ret;
>
> - coal->num_entries = cpu_to_le32(j);
> - sg_init_one(&sgs, coal, sizeof(struct virtnet_batch_coal) +
> - j * sizeof(struct virtio_net_ctrl_coal_vq));
> - if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL,
> - VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET,
> - &sgs))
> - dev_warn(&vi->vdev->dev, "Failed to add dim command\n.");
> + avail_coal->num_entries = cpu_to_le32(j);
> + if (!virtnet_add_dim_command(vi, avail_coal))
> + goto ret;
>
> - for (i = 0; i < j; i++) {
> - rq = &vi->rq[(coal->coal_vqs[i].vqn) / 2];
> - rq->dim.state = DIM_START_MEASURE;
> - }
> + virtnet_cvq_response(vi, false);
> + if (vi->cvq_cmd_nums)
> + schedule_delayed_work(&vi->get_cvq, 1);
>
> ret:
> rtnl_unlock();
> @@ -4402,7 +4523,9 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>
> static int virtnet_alloc_queues(struct virtnet_info *vi)
> {
> - int i, len;
> + struct virtnet_batch_coal *batch_coal;
> + int i, position;
> + u8 *buf;
>
> if (vi->has_cvq) {
> vi->ctrl = kzalloc(sizeof(*vi->ctrl), GFP_KERNEL);
> @@ -4418,13 +4541,21 @@ static int virtnet_alloc_queues(struct virtnet_info *vi)
> if (!vi->rq)
> goto err_rq;
>
> - len = sizeof(struct virtnet_batch_coal) +
> - vi->max_queue_pairs * sizeof(struct virtio_net_ctrl_coal_vq);
> - vi->batch_coal = kzalloc(len, GFP_KERNEL);
> - if (!vi->batch_coal)
> + buf = kzalloc(BATCH_CMD * (sizeof(struct virtnet_batch_coal) +
> + vi->max_queue_pairs * sizeof(struct virtnet_coal_entry)), GFP_KERNEL);
> + if (!buf)
> goto err_coal;
>
> + vi->batch_coal = (struct virtnet_batch_coal *)buf;
> + for (i = 0; i < BATCH_CMD; i++) {
> + position = i * (sizeof(struct virtnet_batch_coal) +
> + vi->max_queue_pairs * sizeof(struct virtnet_coal_entry));
> + batch_coal = (struct virtnet_batch_coal *)(&buf[position]);
> + batch_coal->usable = true;
> + }
> +
> INIT_DELAYED_WORK(&vi->refill, refill_work);
> + INIT_DELAYED_WORK(&vi->get_cvq, get_cvq_work);
> for (i = 0; i < vi->max_queue_pairs; i++) {
> vi->rq[i].pages = NULL;
> netif_napi_add_weight(vi->dev, &vi->rq[i].napi, virtnet_poll,
> --
> 1.8.3.1
>
^ permalink raw reply [flat|nested] 16+ messages in thread* Re: [PATCH net-next 3/3] virtio-net: reduce the CPU consumption of dim worker
2024-01-16 13:11 ` [PATCH net-next 3/3] virtio-net: reduce the CPU consumption of dim worker Heng Qi
` (3 preceding siblings ...)
2024-01-22 7:19 ` Xuan Zhuo
@ 2024-01-22 7:42 ` Xuan Zhuo
4 siblings, 0 replies; 16+ messages in thread
From: Xuan Zhuo @ 2024-01-22 7:42 UTC (permalink / raw)
To: Heng Qi
Cc: Jason Wang, Michael S. Tsirkin, Paolo Abeni, Jakub Kicinski,
Eric Dumazet, David S. Miller, netdev, virtualization
On Tue, 16 Jan 2024 21:11:33 +0800, Heng Qi <hengqi@linux.alibaba.com> wrote:
> Accumulate multiple request commands to kick the device once,
> and obtain the processing results of the corresponding commands
> asynchronously. The batch command method is used to optimize the
> CPU overhead of the DIM worker caused by the guest being busy
> waiting for the command response result.
>
> On an 8-queue device, without this patch, the guest cpu overhead
> due to waiting for cvq could be 10+% and above. With this patch,
> the corresponding overhead is basically invisible.
>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> ---
> drivers/net/virtio_net.c | 185 ++++++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 158 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index e4305ad..9f22c85 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -33,6 +33,8 @@
> module_param(gso, bool, 0444);
> module_param(napi_tx, bool, 0644);
>
> +#define BATCH_CMD 25
> +
> /* FIXME: MTU in config. */
> #define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
> #define GOOD_COPY_LEN 128
> @@ -134,6 +136,9 @@ struct virtnet_interrupt_coalesce {
> };
>
> struct virtnet_batch_coal {
> + struct virtio_net_ctrl_hdr hdr;
> + virtio_net_ctrl_ack status;
> + __u8 usable;
> __le32 num_entries;
> struct virtio_net_ctrl_coal_vq coal_vqs[];
> };
> @@ -299,6 +304,7 @@ struct virtnet_info {
>
> /* Work struct for delayed refilling if we run low on memory. */
> struct delayed_work refill;
> + struct delayed_work get_cvq;
>
> /* Is delayed refill enabled? */
> bool refill_enabled;
> @@ -326,6 +332,7 @@ struct virtnet_info {
> bool rx_dim_enabled;
>
> /* Interrupt coalescing settings */
> + int cvq_cmd_nums;
> struct virtnet_batch_coal *batch_coal;
> struct virtnet_interrupt_coalesce intr_coal_tx;
> struct virtnet_interrupt_coalesce intr_coal_rx;
> @@ -2512,6 +2519,46 @@ static int virtnet_tx_resize(struct virtnet_info *vi,
> return err;
> }
>
> +static bool virtnet_process_dim_cmd(struct virtnet_info *vi, void *res)
> +{
> + struct virtnet_batch_coal *batch_coal;
> + u16 queue;
> + int i;
> +
> + if (res != ((void *)vi)) {
> + batch_coal = (struct virtnet_batch_coal *)res;
> + batch_coal->usable = true;
> + vi->cvq_cmd_nums--;
> + for (i = 0; i < batch_coal->num_entries; i++) {
> + queue = batch_coal->coal_vqs[i].vqn / 2;
> + vi->rq[queue].dim.state = DIM_START_MEASURE;
> + }
> + } else {
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static bool virtnet_cvq_response(struct virtnet_info *vi, bool poll)
> +{
> + unsigned tmp;
> + void *res;
> +
> + if (!poll) {
> + while ((res = virtqueue_get_buf(vi->cvq, &tmp)) &&
> + !virtqueue_is_broken(vi->cvq))
> + virtnet_process_dim_cmd(vi, res);
> + return 0;
> + }
> +
> + while (!(res = virtqueue_get_buf(vi->cvq, &tmp)) &&
> + !virtqueue_is_broken(vi->cvq))
> + cpu_relax();
> +
> + return virtnet_process_dim_cmd(vi, res);
> +}
> +
> /*
> * Send command via the control virtqueue and check status. Commands
> * supported by the hypervisor, as indicated by feature bits, should
> @@ -2521,7 +2568,7 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
> struct scatterlist *out)
> {
> struct scatterlist *sgs[4], hdr, stat;
> - unsigned out_num = 0, tmp;
> + unsigned out_num = 0;
> int ret;
>
> /* Caller should know better */
> @@ -2555,9 +2602,9 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
> /* Spin for a response, the kick causes an ioport write, trapping
> * into the hypervisor, so the request should be handled immediately.
> */
> - while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> - !virtqueue_is_broken(vi->cvq))
> - cpu_relax();
> + while (true)
> + if (virtnet_cvq_response(vi, true))
> + break;
>
> return vi->ctrl->status == VIRTIO_NET_OK;
> }
> @@ -2709,6 +2756,7 @@ static int virtnet_close(struct net_device *dev)
> cancel_work_sync(&vi->rq[i].dim.work);
> }
>
> + cancel_delayed_work_sync(&vi->get_cvq);
> return 0;
> }
>
> @@ -3520,22 +3568,99 @@ static int virtnet_send_notf_coal_vq_cmds(struct virtnet_info *vi,
> return 0;
> }
>
> +static bool virtnet_add_dim_command(struct virtnet_info *vi,
> + struct virtnet_batch_coal *ctrl)
> +{
> + struct scatterlist *sgs[4], hdr, stat, out;
> + unsigned out_num = 0;
> + int ret;
> +
> + /* Caller should know better */
> + BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ));
> +
> + ctrl->hdr.class = VIRTIO_NET_CTRL_NOTF_COAL;
> + ctrl->hdr.cmd = VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET;
> +
> + /* Add header */
> + sg_init_one(&hdr, &ctrl->hdr, sizeof(ctrl->hdr));
> + sgs[out_num++] = &hdr;
> +
> + /* Add body */
> + sg_init_one(&out, &ctrl->num_entries, sizeof(ctrl->num_entries) +
> + ctrl->num_entries * sizeof(struct virtnet_coal_entry));
> + sgs[out_num++] = &out;
> +
> + /* Add return status. */
> + ctrl->status = VIRTIO_NET_OK;
> + sg_init_one(&stat, &ctrl->status, sizeof(ctrl->status));
> + sgs[out_num] = &stat;
> +
> + BUG_ON(out_num + 1 > ARRAY_SIZE(sgs));
> + ret = virtqueue_add_sgs(vi->cvq, sgs, out_num, 1, ctrl, GFP_ATOMIC);
> + if (ret < 0) {
> + dev_warn(&vi->vdev->dev, "Failed to add sgs for command vq: %d\n.", ret);
> + return false;
> + }
> +
> + virtqueue_kick(vi->cvq);
> +
> + ctrl->usable = false;
> + vi->cvq_cmd_nums++;
> +
> + return true;
> +}
We should merge this to the function virtnet_send_command.
> +
> +static void get_cvq_work(struct work_struct *work)
> +{
> + struct virtnet_info *vi =
> + container_of(work, struct virtnet_info, get_cvq.work);
> +
> + if (!rtnl_trylock()) {
> + schedule_delayed_work(&vi->get_cvq, 5);
> + return;
> + }
> +
> + if (!vi->cvq_cmd_nums)
> + goto ret;
> +
> + virtnet_cvq_response(vi, false);
> +
> + if (vi->cvq_cmd_nums)
> + schedule_delayed_work(&vi->get_cvq, 5);
> +
> +ret:
> + rtnl_unlock();
> +}
> +
> static void virtnet_rx_dim_work(struct work_struct *work)
> {
> struct dim *dim = container_of(work, struct dim, work);
> struct receive_queue *rq = container_of(dim,
> struct receive_queue, dim);
> struct virtnet_info *vi = rq->vq->vdev->priv;
> + struct virtnet_batch_coal *avail_coal;
> struct dim_cq_moder update_moder;
> - struct virtnet_batch_coal *coal = vi->batch_coal;
> - struct scatterlist sgs;
> - int i, j = 0;
> + int i, j = 0, position;
> + u8 *buf;
>
> if (!rtnl_trylock()) {
> schedule_work(&dim->work);
> return;
> }
>
> + if (vi->cvq_cmd_nums == BATCH_CMD || vi->cvq->num_free < 3 ||
> + vi->cvq->num_free <= (virtqueue_get_vring_size(vi->cvq) / 3))
> + virtnet_cvq_response(vi, true);
> +
> + for (i = 0; i < BATCH_CMD; i++) {
> + buf = (u8 *)vi->batch_coal;
> + position = i * (sizeof(struct virtnet_batch_coal) +
> + vi->max_queue_pairs * sizeof(struct virtnet_coal_entry));
> + avail_coal = (struct virtnet_batch_coal *)(&buf[position]);
> + if (avail_coal->usable)
> + break;
list or kmalloc here are all better way.
> + }
> +
> /* Each rxq's work is queued by "net_dim()->schedule_work()"
> * in response to NAPI traffic changes. Note that dim->profile_ix
> * for each rxq is updated prior to the queuing action.
> @@ -3552,30 +3677,26 @@ static void virtnet_rx_dim_work(struct work_struct *work)
> update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
> if (update_moder.usec != rq->intr_coal.max_usecs ||
> update_moder.pkts != rq->intr_coal.max_packets) {
> - coal->coal_vqs[j].vqn = cpu_to_le16(rxq2vq(i));
> - coal->coal_vqs[j].coal.max_usecs = cpu_to_le32(update_moder.usec);
> - coal->coal_vqs[j].coal.max_packets = cpu_to_le32(update_moder.pkts);
> + avail_coal->coal_vqs[j].vqn = cpu_to_le16(rxq2vq(i));
> + avail_coal->coal_vqs[j].coal.max_usecs = cpu_to_le32(update_moder.usec);
> + avail_coal->coal_vqs[j].coal.max_packets = cpu_to_le32(update_moder.pkts);
> rq->intr_coal.max_usecs = update_moder.usec;
> rq->intr_coal.max_packets = update_moder.pkts;
> j++;
> - }
> + } else if (dim->state == DIM_APPLY_NEW_PROFILE)
> + dim->state = DIM_START_MEASURE;
> }
>
> if (!j)
> goto ret;
>
> - coal->num_entries = cpu_to_le32(j);
> - sg_init_one(&sgs, coal, sizeof(struct virtnet_batch_coal) +
> - j * sizeof(struct virtio_net_ctrl_coal_vq));
> - if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL,
> - VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET,
> - &sgs))
> - dev_warn(&vi->vdev->dev, "Failed to add dim command\n.");
> + avail_coal->num_entries = cpu_to_le32(j);
> + if (!virtnet_add_dim_command(vi, avail_coal))
> + goto ret;
>
> - for (i = 0; i < j; i++) {
> - rq = &vi->rq[(coal->coal_vqs[i].vqn) / 2];
> - rq->dim.state = DIM_START_MEASURE;
> - }
> + virtnet_cvq_response(vi, false);
Is this usable?
> + if (vi->cvq_cmd_nums)
> + schedule_delayed_work(&vi->get_cvq, 1);
>
> ret:
> rtnl_unlock();
> @@ -4402,7 +4523,9 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>
> static int virtnet_alloc_queues(struct virtnet_info *vi)
> {
> - int i, len;
> + struct virtnet_batch_coal *batch_coal;
> + int i, position;
> + u8 *buf;
>
> if (vi->has_cvq) {
> vi->ctrl = kzalloc(sizeof(*vi->ctrl), GFP_KERNEL);
> @@ -4418,13 +4541,21 @@ static int virtnet_alloc_queues(struct virtnet_info *vi)
> if (!vi->rq)
> goto err_rq;
>
> - len = sizeof(struct virtnet_batch_coal) +
> - vi->max_queue_pairs * sizeof(struct virtio_net_ctrl_coal_vq);
> - vi->batch_coal = kzalloc(len, GFP_KERNEL);
> - if (!vi->batch_coal)
> + buf = kzalloc(BATCH_CMD * (sizeof(struct virtnet_batch_coal) +
> + vi->max_queue_pairs * sizeof(struct virtnet_coal_entry)), GFP_KERNEL);
> + if (!buf)
> goto err_coal;
>
> + vi->batch_coal = (struct virtnet_batch_coal *)buf;
> + for (i = 0; i < BATCH_CMD; i++) {
> + position = i * (sizeof(struct virtnet_batch_coal) +
> + vi->max_queue_pairs * sizeof(struct virtnet_coal_entry));
> + batch_coal = (struct virtnet_batch_coal *)(&buf[position]);
> + batch_coal->usable = true;
> + }
> +
> INIT_DELAYED_WORK(&vi->refill, refill_work);
> + INIT_DELAYED_WORK(&vi->get_cvq, get_cvq_work);
> for (i = 0; i < vi->max_queue_pairs; i++) {
> vi->rq[i].pages = NULL;
> netif_napi_add_weight(vi->dev, &vi->rq[i].napi, virtnet_poll,
> --
> 1.8.3.1
>
^ permalink raw reply [flat|nested] 16+ messages in thread