From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
Eli Cohen <elic@nvidia.com>, Hillf Danton <hdanton@sina.com>,
virtualization <virtualization@lists.linux-foundation.org>
Subject: Re: [PATCH 1/2] vdpa: mlx5: prevent cvq work from hogging CPU
Date: Fri, 25 Mar 2022 02:45:25 -0400
Message-ID: <20220325024324-mutt-send-email-mst@kernel.org>
In-Reply-To: <CACGkMEtVUqJcS1W2p1U9RCjQqQOSREu1J9zjmw37TbPBNqq7tA@mail.gmail.com>
On Fri, Mar 25, 2022 at 11:22:25AM +0800, Jason Wang wrote:
> On Thu, Mar 24, 2022 at 8:24 PM Eli Cohen <elic@nvidia.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Hillf Danton <hdanton@sina.com>
> > > Sent: Thursday, March 24, 2022 2:02 PM
> > > To: Jason Wang <jasowang@redhat.com>
> > > Cc: Eli Cohen <elic@nvidia.com>; Michael S. Tsirkin <mst@redhat.com>; virtualization <virtualization@lists.linux-foundation.org>; linux-kernel <linux-kernel@vger.kernel.org>
> > > Subject: Re: [PATCH 1/2] vdpa: mlx5: prevent cvq work from hogging CPU
> > >
> > > On Thu, 24 Mar 2022 16:20:34 +0800 Jason Wang wrote:
> > > > On Thu, Mar 24, 2022 at 2:17 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > On Thu, Mar 24, 2022 at 02:04:19PM +0800, Hillf Danton wrote:
> > > > > > On Thu, 24 Mar 2022 10:34:09 +0800 Jason Wang wrote:
> > > > > > > On Thu, Mar 24, 2022 at 8:54 AM Hillf Danton <hdanton@sina.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, 22 Mar 2022 09:59:14 +0800 Jason Wang wrote:
> > > > > > > > >
> > > > > > > > > Yes, there will be no "infinite" loop, but since the loop is triggered
> > > > > > > > > by userspace. It looks to me it will delay the flush/drain of the
> > > > > > > > > workqueue forever which is still suboptimal.
> > > > > > > >
> > > > > > > > Usually it is barely possible to shoot two birds using a stone.
> > > > > > > >
> > > > > > > > Given the "forever", I am inclined to not running faster, hehe, though
> > > > > > > > another cobble is to add another line in the loop checking if mvdev is
> > > > > > > > unregistered, and for example make mvdev->cvq unready before destroying
> > > > > > > > workqueue.
> > > > > > > >
> > > > > > > > static void mlx5_vdpa_dev_del(struct vdpa_mgmt_dev *v_mdev, struct vdpa_device *dev)
> > > > > > > > {
> > > > > > > > struct mlx5_vdpa_mgmtdev *mgtdev = container_of(v_mdev, struct mlx5_vdpa_mgmtdev, mgtdev);
> > > > > > > > struct mlx5_vdpa_dev *mvdev = to_mvdev(dev);
> > > > > > > > struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> > > > > > > >
> > > > > > > > mlx5_notifier_unregister(mvdev->mdev, &ndev->nb);
> > > > > > > > destroy_workqueue(mvdev->wq);
> > > > > > > > _vdpa_unregister_device(dev);
> > > > > > > > mgtdev->ndev = NULL;
> > > > > > > > }
> > > > > > > >
> > > > > > >
> > > > > > > Yes, so we had
> > > > > > >
> > > > > > > 1) using a quota for re-requeue
> > > > > > > 2) using something like
> > > > > > >
> > > > > > > while (READ_ONCE(cvq->ready)) {
> > > > > > > ...
> > > > > > > cond_resched();
> > > > > > > }
> > > > > > >
> > > > > > > There should not be too much difference except we need to use
> > > > > > > cancel_work_sync() instead of flush_work for 1).
> > > > > > >
> > > > > > > I would keep the code as is but if you stick I can change.
> > > > > >
> > > > > > No Sir I would not - I am simply not a fan of work requeue.
> > > > > >
> > > > > > Hillf
> > > > >
> > > > > I think I agree - requeue adds latency spikes under heavy load -
> > > > > unfortunately, not measured by netperf but still important
> > > > > for latency sensitive workloads. Checking a flag is cheaper.
> > > >
> > > > Just spot another possible issue.
> > > >
> > > > The workqueue will be used by another work to update the carrier
> > > > (event_handler()). Using cond_resched() may still have unfair issue
> > > > which blocks the carrier update for infinite time,
> > >
> > > Then would you please specify the reason why mvdev->wq is single
> > > threaded?
>
> I didn't see a reason why it needs to be a single threaded (ordered).
>
> > > Given requeue, the serialization of the two works is not
> > > strong. Otherwise unbound WQ that can process works in parallel is
> > > a cure to the unfairness above.
>
> Yes, and we probably don't want a per device workqueue but a per
> module one. Or simply use the system_wq one.
>
> > >
> >
> > I think the proposed patch can still be used with quota equal to one.
> > That would guarantee fairness.
> > This is not performance critical and a single workqueue should be enough.
>
> Yes, but both Hillf and Michael don't like requeuing. So my plan is
>
> 1) send patch 2 first since it's a hard requirement for the next RHEL release
> 2) a series to fix this hogging issue by
> 2.1) switch to use a per module workqueue
> 2.2) READ_ONCE(cvq->ready) + cond_resched()
>
> Thanks
Actually, if we don't care about speed here, then requeuing with a quota
of 1 is fine: at that point there is no quota at all, we just always
requeue instead of looping.

It's the mix of requeue and a loop that I consider confusing.
> >
> > > Thanks
> > > Hillf
> >
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization