From: Andrea Parri <parri.andrea@gmail.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Michael Kelley <mikelley@microsoft.com>,
Dexuan Cui <decui@microsoft.com>,
KY Srinivasan <kys@microsoft.com>,
Haiyang Zhang <haiyangz@microsoft.com>,
Stephen Hemminger <sthemmin@microsoft.com>,
Wei Liu <wei.liu@kernel.org>,
"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
Boqun Feng <boqun.feng@gmail.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 02/11] Drivers: hv: vmbus: Don't bind the offer&rescind works to a specific CPU
Date: Fri, 3 Apr 2020 14:04:56 +0200 [thread overview]
Message-ID: <20200403120456.GA24298@andrea> (raw)
In-Reply-To: <87o8se2fpr.fsf@vitty.brq.redhat.com>
On Mon, Mar 30, 2020 at 02:24:16PM +0200, Vitaly Kuznetsov wrote:
> Michael Kelley <mikelley@microsoft.com> writes:
>
> > From: Andrea Parri <parri.andrea@gmail.com> Sent: Saturday, March 28, 2020 10:09 AM
> >>
> >> > In case we believe that OFFER -> RESCINF sequence is always ordered
> >> > by the host AND we don't care about other offers in the queue the
> >> > suggested locking is OK: we're guaranteed to process RESCIND after we
> >> > finished processing OFFER for the same channel. However, waiting for
> >> > 'offer_in_progress == 0' looks fishy so I'd suggest we at least add a
> >> > comment explaining that the wait is only needed to serialize us with
> >> > possible OFFER for the same channel - and nothing else. I'd personally
> >> > still slightly prefer the algorythm I suggested as it guarantees we take
> >> > channel_mutex with offer_in_progress == 0 -- even if there are no issues
> >> > we can think of today (not strongly though).
> >>
> >> Does it? offer_in_progress is incremented without channel_mutex...
> >>
>
> No, it does not, you're right, by itself the change is insufficient.
>
> >> IAC, I have no objections to apply the changes you suggested. To avoid
> >> misunderstandings: vmbus_bus_suspend() presents a similar usage... Are
> >> you suggesting that I apply similar changes there?
> >>
> >> Alternatively: FWIW, the comment in vmbus_onoffer_rescind() does refer
> >> to "The offer msg and the corresponding rescind msg...". I am all ears
> >> if you have any concrete suggestions to improve these comments.
> >>
> >
> > Given that waiting for 'offer_in_progress == 0' is the current code, I think
> > there's an argument to made for not changing it if the change isn't strictly
> > necessary. This patch set introduces enough change that *is* necessary. :-)
> >
>
> Sure. I was thinking a bit more about this and it seems that over years
> we've made the synchronization of channels code too complex (every time
> for a good reason but still). Now (before this series) we have at least:
>
> vmbus_connection.channel_mutex
> vmbus_connection.offer_in_progress
> channel.probe_done
> channel.rescind
> Workqueues (vmbus_connection.work_queue,
> queue_work_on(vmbus_connection.connect_cpu),...)
> channel.lock spinlock (the least of the problems)
>
> Maybe there's room for improvement? Out of top of my head I'd suggest a
> state machine for each channel (e.g something like
> OFFERED->OPENING->OPEN->RESCIND_REQ->RESCINDED->CLOSED) + refcounting
> (subchannels, open/rescind/... requests in progress, ...) + non-blocking
> request handling like "Can we handle this rescind offer now? No,
> refcount is too big. OK, rescheduling the work". Maybe not the best
> design ever and I'd gladly support any other which improves the
> readability of the code and makes all state changes and synchronization
> between them more obvious.
>
> Note, VMBus channel handling driven my messages (unlike events for ring
> buffer) is not performance critical, we just need to ensure completeness
> (all requests are handled correctly) with forward progress guarantees
> (no deadlocks).
>
> I understand the absence of 'hot' issues in the current code is what can
> make the virtue of redesign questionable and sorry for hijacking the
> series which doesn't seem to make things worse :-)
(Back here... Sorry for the delay.)
FWIW, what you wrote above makes sense to me; and *yes*, the series in
question was not intended as "let us undertake such a redesign" (quite
the opposite in fact...)
With regard to this specific patch, it seems to me that we've reached
a certain consensus or, at least, I don't see complaints ;-). Please
let me know if I misunderstood.
Thanks,
Andrea
next prev parent reply other threads:[~2020-04-03 12:05 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-25 22:54 [RFC PATCH 00/11] VMBus channel interrupt reassignment Andrea Parri (Microsoft)
2020-03-25 22:54 ` [RFC PATCH 01/11] Drivers: hv: vmbus: Always handle the VMBus messages on CPU0 Andrea Parri (Microsoft)
2020-03-26 14:05 ` Vitaly Kuznetsov
2020-03-28 18:50 ` Andrea Parri
2020-03-25 22:54 ` [RFC PATCH 02/11] Drivers: hv: vmbus: Don't bind the offer&rescind works to a specific CPU Andrea Parri (Microsoft)
2020-03-26 14:16 ` Vitaly Kuznetsov
2020-03-26 15:47 ` Andrea Parri
2020-03-26 17:26 ` Vitaly Kuznetsov
2020-03-28 17:08 ` Andrea Parri
2020-03-29 3:43 ` Michael Kelley
2020-03-30 12:24 ` Vitaly Kuznetsov
2020-04-03 12:04 ` Andrea Parri [this message]
2020-03-25 22:54 ` [RFC PATCH 03/11] Drivers: hv: vmbus: Replace the per-CPU channel lists with a global array of channels Andrea Parri (Microsoft)
2020-03-26 14:31 ` Vitaly Kuznetsov
2020-03-26 17:05 ` Andrea Parri
2020-03-26 17:43 ` Vitaly Kuznetsov
2020-03-28 18:21 ` Andrea Parri
2020-03-29 3:49 ` Michael Kelley
2020-03-30 12:45 ` Vitaly Kuznetsov
2020-04-03 13:38 ` Andrea Parri
2020-04-03 14:56 ` Vitaly Kuznetsov
2020-03-25 22:54 ` [RFC PATCH 04/11] hv_netvsc: Disable NAPI before closing the VMBus channel Andrea Parri (Microsoft)
2020-03-26 15:26 ` Stephen Hemminger
2020-03-26 17:55 ` Andrea Parri
2020-03-25 22:54 ` [RFC PATCH 05/11] hv_utils: Always execute the fcopy and vss callbacks in a tasklet Andrea Parri (Microsoft)
2020-03-25 22:55 ` [RFC PATCH 06/11] Drivers: hv: vmbus: Use a spin lock for synchronizing channel scheduling vs. channel removal Andrea Parri (Microsoft)
2020-03-25 22:55 ` [RFC PATCH 07/11] PCI: hv: Prepare hv_compose_msi_msg() for the VMBus-channel-interrupt-to-vCPU reassignment functionality Andrea Parri (Microsoft)
2020-03-25 22:55 ` [RFC PATCH 08/11] Drivers: hv: vmbus: Remove the unused HV_LOCALIZED channel affinity logic Andrea Parri (Microsoft)
2020-03-25 22:55 ` [RFC PATCH 09/11] Drivers: hv: vmbus: Synchronize init_vp_index() vs. CPU hotplug Andrea Parri (Microsoft)
2020-03-25 22:55 ` [RFC PATCH 10/11] Drivers: hv: vmbus: Introduce the CHANNELMSG_MODIFYCHANNEL message type Andrea Parri (Microsoft)
2020-03-26 14:46 ` Vitaly Kuznetsov
2020-03-28 18:48 ` Andrea Parri
2020-04-03 14:55 ` Andrea Parri
2020-03-25 22:55 ` [RFC PATCH 11/11] scsi: storvsc: Re-init stor_chns when a channel interrupt is re-assigned Andrea Parri (Microsoft)
2020-03-30 16:42 ` Michael Kelley
2020-03-30 18:55 ` Andrea Parri
2020-03-30 19:49 ` Michael Kelley
2020-04-03 13:41 ` Andrea Parri
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200403120456.GA24298@andrea \
--to=parri.andrea@gmail.com \
--cc=boqun.feng@gmail.com \
--cc=decui@microsoft.com \
--cc=haiyangz@microsoft.com \
--cc=kys@microsoft.com \
--cc=linux-hyperv@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mikelley@microsoft.com \
--cc=sthemmin@microsoft.com \
--cc=vkuznets@redhat.com \
--cc=wei.liu@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.