From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Cooper Subject: Re: [PATCH 1/2] vm_event: sync domctl Date: Wed, 23 Dec 2015 19:14:26 +0000 Message-ID: <567AF292.8050606@citrix.com> References: <1450882432-10484-1-git-send-email-tamas@tklengyel.com> <567AC09B.2040403@bitdefender.com> <567AD725.9090207@citrix.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0924701162387604731==" Return-path: Received: from mail6.bemta14.messagelabs.com ([193.109.254.103]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1aBorq-0008OV-AT for xen-devel@lists.xenproject.org; Wed, 23 Dec 2015 19:14:34 +0000 In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Tamas K Lengyel Cc: Wei Liu , Ian Campbell , Razvan Cojocaru , Stefano Stabellini , Ian Jackson , Xen-devel List-Id: xen-devel@lists.xenproject.org This is a multi-part message in MIME format. --===============0924701162387604731== Content-Type: multipart/alternative; boundary="------------040101020707060404060300" This is a multi-part message in MIME format. --------------040101020707060404060300 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit On 23/12/2015 18:11, Tamas K Lengyel wrote: > > > On Wed, Dec 23, 2015 at 6:17 PM, Andrew Cooper > > wrote: > > On 23/12/2015 15:41, Razvan Cojocaru wrote: > > On 12/23/2015 04:53 PM, Tamas K Lengyel wrote: > >> Introduce new vm_event domctl option which allows an event > subscriber > >> to request all vCPUs not currently pending a vm_event request > to be paused, > >> thus allowing the subscriber to sync up on the state of the > domain. This > >> is especially useful when the subscribed wants to disable > certain events > >> from being delivered and wants to ensure no more requests are > pending on the > >> ring before doing so. > >> > >> Cc: Ian Jackson > > >> Cc: Stefano Stabellini > > >> Cc: Ian Campbell > > >> Cc: Wei Liu > > >> Cc: Razvan Cojocaru > > >> Signed-off-by: Tamas K Lengyel > > > This certainly looks very interesting. Would xc_domain_pause() > not be > > enough for your use case then? > > I second this query. I would have thought xc_domain_pause() does > exactly what you want in this case. > > > The problem is in what order the responses are processed. I may not be > correct about the logic but here is what my impression was: > xc_domain_unpause resumes all vCPUs even if there is still a vm_event > response that has not been processed. Now, if the subscriber set > response flags (altp2m switch, singlestep toggle, etc) those actions > would not be properly performed on the vCPU before it's resumed. If > the subscriber processes all requests and signals via the event > channel that the responses are on the ring, then calls > xc_domain_unpause, we can still have a race between processing the > responses from the ring and unpausing the vCPU. > > > The code provided is racy, as it is liable to alter which pause > references it takes/releases depending on what other pause/unpause > actions are being made. > > > It's understood that the user would not use xc_domain_pause/unpause > while using vm_event responses with response flags specified. Even > then, it was already racy IMHO if the user called xc_domain_unpause > before processing requests from the vm_event ring that originally > paused the vCPU, so this doesn't change that situation. Pausing is strictly reference counted. (or rather, it is since c/s 3eb1c70 "properly reference count DOMCTL_{,un}pausedomain hypercalls". Before then, it definitely was buggy.) There is the domain pause count, and pause counts per vcpu. All domain pause operations take both a domain pause reference, and a vcpu pause reference on each vcpu. A vcpu is only eligible to be scheduled if its pause reference count is zero. If two independent tasks call vcpu_pause() on the same vcpu, it will remain paused until both independent tasks have called vcpu_unpause(). Having said this, I can well believe that there might be issues with the current uses of pausing. The vital factor is that the entity which pauses a vcpu is also responsible for unpausing it, and it must be resistant to accidentally leaking its reference. In this case, I believe that what you want to do is: 1) Identify condition requiring a sync 2) xc_domain_pause() 3) Process all of the pending vm_events 4) Synchronise the state 5) xc_domain_unpause() All vcpus of the domain should stay descheduled between points 2 and 5. If this doesn't have the intended effect, then I suspect there is a bug in the pause reference handing of the vm_event subsystem. Is this clearer, or have I misunderstood the problem? ~Andrew --------------040101020707060404060300 Content-Type: text/html; charset=windows-1252 Content-Length: 7055 Content-Transfer-Encoding: quoted-printable
On 23/12/2015 18:11, Tamas K Lengyel wrote:


On Wed, Dec 23, 2015 at 6:17 PM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
On 23/12/2015 15:41, Razvan Cojocaru wrote:
> On 12/23/2015 04:53 PM, Tamas K Lengyel wrote:
>> Introduce new vm_event domctl option which allows an event subscriber
>> to request all vCPUs not currently pending a vm_event request to be paused,
>> thus allowing the subscriber to sync up on the state of the domain. This
>> is especially useful when the subscribed wants to disable certain events
>> from being delivered and wants to ensure no more requests are pending on the
>> ring before doing so.
>>
>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>> Cc: Ian Campbell <ian.campbell@citrix.com>
>> Cc: Wei Liu <wei.liu2@citrix.com>
>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
> This certainly looks very interesting. Would xc_domain_pause() not be
> enough for your use case then=3F

I second this query.=A0 I would have thought xc_domain_pause() does
exactly what you want in this case.

The problem is in what order the responses are processed. I may not be correct about the logic but here is what my impression was: xc_domain_unpause resumes all vCPUs even if there is still a vm_event response that has not been processed. Now, if the subscriber set response flags (altp2m switch, singlestep toggle, etc) those actions would not be properly performed on the vCPU before it's resumed. If the subscriber processes all requests and signals via the event channel that the responses are on the ring, then calls xc_domain_unpause, we can still have a race between processing the responses from the ring and unpausing the vCPU.
=A0
The code provided is racy, as it is liable to alter which pause
references it takes/releases depending on what other pause/unpause
actions are being made.

It's understood that the user would not use xc_domain_pause/unpause while using vm_event responses with response flags specified. Even then, it was already racy IMHO if the user called xc_domain_unpause before processing requests from the vm_event ring that originally paused the vCPU, so this doesn't change that situation.

Pausing is strictly reference counted. (or rather, it is since c/s 3eb1c70 "properly reference count DOMCTL_{,un}pausedomain hypercalls".=A0 Before then, it definitely was buggy.)

There is the domain pause count, and pause counts per vcpu.=A0 All domain pause operations take both a domain pause reference, and a vcpu pause reference on each vcpu.=A0 A vcpu is only eligible to be scheduled if its pause reference count is zero.=A0 If two independent tasks call vcpu_pause() on the same vcpu, it will remain paused until both independent tasks have called vcpu_unpause().

Having said this, I can well believe that there might be issues with the current uses of pausing.

The vital factor is that the entity which pauses a vcpu is also responsible for unpausing it, and it must be resistant to accidentally leaking its reference.

In this case, I believe that what you want to do is:

1) Identify condition requiring a sync
2) xc_domain_pause()
3) Process all of the pending vm_events
4) Synchronise the state
5) xc_domain_unpause()

All vcpus of the domain should stay descheduled between points 2 and 5.=A0 If this doesn't have the intended effect, then I suspect there is a bug in the pause reference handing of the vm_event subsystem.

Is this clearer, or have I misunderstood the problem=3F

~Andrew
--------------040101020707060404060300-- --===============0924701162387604731== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel --===============0924701162387604731==--