From: Andrea Parri <parri.andrea@gmail.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Dexuan Cui <decui@microsoft.com>,
"K . Y . Srinivasan" <kys@microsoft.com>,
Haiyang Zhang <haiyangz@microsoft.com>,
Stephen Hemminger <sthemmin@microsoft.com>,
Wei Liu <wei.liu@kernel.org>,
linux-hyperv@vger.kernel.org,
Michael Kelley <mikelley@microsoft.com>,
Boqun Feng <boqun.feng@gmail.com>,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 02/11] Drivers: hv: vmbus: Don't bind the offer&rescind works to a specific CPU
Date: Thu, 26 Mar 2020 16:47:10 +0100
Message-ID: <20200326154710.GA13711@andrea>
In-Reply-To: <871rpf5hhm.fsf@vitty.brq.redhat.com>

On Thu, Mar 26, 2020 at 03:16:21PM +0100, Vitaly Kuznetsov wrote:
> "Andrea Parri (Microsoft)" <parri.andrea@gmail.com> writes:
>
> > The offer and rescind works are currently scheduled on the so called
> > "connect CPU". However, this is not really needed: we can synchronize
> > the works by relying on the usage of the offer_in_progress counter and
> > of the channel_mutex mutex. This synchronization is already in place.
> > So, remove this unnecessary "bind to the connect CPU" constraint and
> > update the inline comments accordingly.
> >
> > Suggested-by: Dexuan Cui <decui@microsoft.com>
> > Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
> > ---
> > drivers/hv/channel_mgmt.c | 21 ++++++++++++++++-----
> > drivers/hv/vmbus_drv.c | 39 ++++++++++++++++++++++++++++-----------
> > 2 files changed, 44 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> > index 0370364169c4e..1191f3d76d111 100644
> > --- a/drivers/hv/channel_mgmt.c
> > +++ b/drivers/hv/channel_mgmt.c
> > @@ -1025,11 +1025,22 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr)
> > * offer comes in first and then the rescind.
> > * Since we process these events in work elements,
> > * and with preemption, we may end up processing
> > - * the events out of order. Given that we handle these
> > - * work elements on the same CPU, this is possible only
> > - * in the case of preemption. In any case wait here
> > - * until the offer processing has moved beyond the
> > - * point where the channel is discoverable.
> > + * the events out of order. We rely on the synchronization
> > + * provided by offer_in_progress and by channel_mutex for
> > + * ordering these events:
> > + *
> > + * { Initially: offer_in_progress = 1 }
> > + *
> > + * CPU1 CPU2
> > + *
> > + * [vmbus_process_offer()] [vmbus_onoffer_rescind()]
> > + *
> > + * LOCK channel_mutex WAIT_ON offer_in_progress == 0
> > + * DECREMENT offer_in_progress LOCK channel_mutex
> > + * INSERT chn_list SEARCH chn_list
> > + * UNLOCK channel_mutex UNLOCK channel_mutex
> > + *
> > + * Forbids: CPU2's SEARCH from *not* seeing CPU1's INSERT
>
> WAIT_ON offer_in_progress == 0
> LOCK channel_mutex
>
> seems to be racy: what happens if offer_in_progress increments after we
> read it but before we managed to acquire channel_mutex?

Note that the RESCIND work must see the increment, which is performed
before queueing the work in question (and the associated OFFER work);
cf. the comment in vmbus_on_msg_dpc() below and commit

  dbb92f88648d6 ("workqueue: Document (some) memory-ordering properties of {queue,schedule}_work()")

AFAICT, this suffices to meet the intended behavior as sketched above.
I might be missing something, of course; can you elaborate on the issue
here?
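
For concreteness, here is a rough user-space model of the ordering
argued above. It is only a sketch, not kernel code: the variables are
hypothetical stand-ins for the vmbus_connection fields, pthread_create()
stands in for queue_work()/schedule_work() (both order the preceding
store before the start of the work), and sched_yield() stands in for
msleep(1).

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the vmbus_connection state. */
static atomic_int offer_in_progress;
static pthread_mutex_t channel_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool chn_inserted;       /* models INSERT chn_list */
static bool rescind_saw_insert; /* models the result of SEARCH chn_list */

/* Models vmbus_process_offer() ("CPU1" in the diagram). */
static void *offer_work(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&channel_mutex);
	atomic_fetch_sub(&offer_in_progress, 1); /* DECREMENT */
	chn_inserted = true;                     /* INSERT */
	pthread_mutex_unlock(&channel_mutex);
	return NULL;
}

/* Models vmbus_onoffer_rescind() ("CPU2" in the diagram). */
static void *rescind_work(void *arg)
{
	(void)arg;
	/* WAIT_ON offer_in_progress == 0 */
	while (atomic_load(&offer_in_progress) != 0)
		sched_yield();
	/*
	 * The decrement is done with channel_mutex held, so taking the
	 * mutex here cannot succeed before the INSERT has completed.
	 */
	pthread_mutex_lock(&channel_mutex);
	rescind_saw_insert = chn_inserted;       /* SEARCH */
	pthread_mutex_unlock(&channel_mutex);
	return NULL;
}

/* Models vmbus_on_msg_dpc(): increment first, then queue both works. */
static bool run_offer_rescind_model(void)
{
	pthread_t offer, rescind;
	int rc;

	chn_inserted = false;
	rescind_saw_insert = false;
	atomic_store(&offer_in_progress, 1);     /* atomic_inc() on OFFER */

	rc = pthread_create(&offer, NULL, offer_work, NULL);
	assert(rc == 0);
	rc = pthread_create(&rescind, NULL, rescind_work, NULL);
	assert(rc == 0);

	pthread_join(offer, NULL);
	pthread_join(rescind, NULL);
	return rescind_saw_insert;
}
```

In this model run_offer_rescind_model() is expected to return true
whichever thread runs first, because the counter is already non-zero by
the time either work can start.
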
Thanks,
Andrea
>
> I think this should be changed to
>
> LOCK channel_mutex
> CHECK offer_in_progress == 0
> EQUAL? GOTO proceed with rescind handling
> NOT EQUAL?
> WHILE offer_in_progress != 0 {
> UNLOCK channel_mutex
> MSLEEP(1)
> LOCK channel_mutex
> }
> proceed with rescind handling:
> ...
> UNLOCK channel_mutex
>
> > */
> >
> > while (atomic_read(&vmbus_connection.offer_in_progress) != 0) {
> > diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> > index 7600615e13754..903b1ec6a259e 100644
> > --- a/drivers/hv/vmbus_drv.c
> > +++ b/drivers/hv/vmbus_drv.c
> > @@ -1048,8 +1048,9 @@ void vmbus_on_msg_dpc(unsigned long data)
> > /*
> > * The host can generate a rescind message while we
> > * may still be handling the original offer. We deal with
> > - * this condition by ensuring the processing is done on the
> > - * same CPU.
> > + * this condition by relying on the synchronization provided
> > + * by offer_in_progress and by channel_mutex. See also the
> > + * inline comments in vmbus_onoffer_rescind().
> > */
> > switch (hdr->msgtype) {
> > case CHANNELMSG_RESCIND_CHANNELOFFER:
> > @@ -1071,16 +1072,34 @@ void vmbus_on_msg_dpc(unsigned long data)
> > * work queue: the RESCIND handler can not start to
> > * run before the OFFER handler finishes.
> > */
> > - schedule_work_on(VMBUS_CONNECT_CPU,
> > - &ctx->work);
> > + schedule_work(&ctx->work);
> > break;
> >
> > case CHANNELMSG_OFFERCHANNEL:
> > + /*
> > + * The host sends the offer message of a given channel
> > + * before sending the rescind message of the same
> > + * channel. These messages are sent to the guest's
> > + * connect CPU; the guest then starts processing them
> > + * in the tasklet handler on this CPU:
> > + *
> > + * VMBUS_CONNECT_CPU
> > + *
> > + * [vmbus_on_msg_dpc()]
> > + * atomic_inc() // CHANNELMSG_OFFERCHANNEL
> > + * queue_work()
> > + * ...
> > + * [vmbus_on_msg_dpc()]
> > + * schedule_work() // CHANNELMSG_RESCIND_CHANNELOFFER
> > + *
> > + * We rely on the memory-ordering properties of the
> > + * queue_work() and schedule_work() primitives, which
> > + * guarantee that the atomic increment will be visible
> > + * to the CPUs which will execute the offer & rescind
> > + * works by the time these works will start execution.
> > + */
> > atomic_inc(&vmbus_connection.offer_in_progress);
> > - queue_work_on(VMBUS_CONNECT_CPU,
> > - vmbus_connection.work_queue,
> > - &ctx->work);
> > - break;
> > + fallthrough;
> >
> > default:
> > queue_work(vmbus_connection.work_queue, &ctx->work);
> > @@ -1124,9 +1143,7 @@ static void vmbus_force_channel_rescinded(struct vmbus_channel *channel)
> >
> > INIT_WORK(&ctx->work, vmbus_onmessage_work);
> >
> > - queue_work_on(VMBUS_CONNECT_CPU,
> > - vmbus_connection.work_queue,
> > - &ctx->work);
> > + queue_work(vmbus_connection.work_queue, &ctx->work);
> > }
> > #endif /* CONFIG_PM_SLEEP */
>
> --
> Vitaly
>