From: Paul Durrant <xadimgnik@gmail.com>
To: "'Jürgen Groß'" <jgross@suse.com>,
	"'Jan Beulich'" <jbeulich@suse.com>,
	"'Marek Marczykowski-Górecki'" <marmarek@invisiblethingslab.com>,
	"'Ian Jackson'" <ian.jackson@eu.citrix.com>
Cc: 'Andrew Cooper' <andrew.cooper3@citrix.com>,
	'xen-devel' <xen-devel@lists.xenproject.org>
Subject: RE: handle_pio looping during domain shutdown, with qemu 4.2.0 in stubdom
Date: Mon, 8 Jun 2020 13:38:02 +0100
Message-ID: <002e01d63d91$a701e5c0$f505b140$@xen.org>
In-Reply-To: <3811b700-7bd7-859a-2c84-a9885acf64a1@suse.com>

> -----Original Message-----
> From: Jürgen Groß <jgross@suse.com>
> Sent: 08 June 2020 10:25
> To: paul@xen.org; 'Jan Beulich' <jbeulich@suse.com>; 'Marek Marczykowski-Górecki'
> <marmarek@invisiblethingslab.com>; 'Ian Jackson' <ian.jackson@eu.citrix.com>
> Cc: 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'xen-devel' <xen-devel@lists.xenproject.org>
> Subject: Re: handle_pio looping during domain shutdown, with qemu 4.2.0 in stubdom
> 
> On 08.06.20 11:15, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 08 June 2020 09:14
> >> To: 'Marek Marczykowski-Górecki' <marmarek@invisiblethingslab.com>; paul@xen.org
> >> Cc: 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'xen-devel' <xen-devel@lists.xenproject.org>
> >> Subject: Re: handle_pio looping during domain shutdown, with qemu 4.2.0 in stubdom
> >>
> >> On 05.06.2020 18:18, 'Marek Marczykowski-Górecki' wrote:
> >>> On Fri, Jun 05, 2020 at 04:39:56PM +0100, Paul Durrant wrote:
> >>>>> From: Jan Beulich <jbeulich@suse.com>
> >>>>> Sent: 05 June 2020 14:57
> >>>>>
> >>>>> On 05.06.2020 15:37, Paul Durrant wrote:
> >>>>>>> From: Jan Beulich <jbeulich@suse.com>
> >>>>>>> Sent: 05 June 2020 14:32
> >>>>>>>
> >>>>>>> On 05.06.2020 13:05, Paul Durrant wrote:
> >>>>>>>> That would mean we wouldn't be seeing the "Unexpected PIO" message. From that message this clearly
> >>>>>>>> X86EMUL_UNHANDLEABLE which suggests a race with ioreq server teardown, possibly due to selecting a
> >>>>>>>> server but then not finding a vcpu match in ioreq_vcpu_list.
> >>>>>>>
> >>>>>>> I was suspecting such, but at least the tearing down of all servers
> >>>>>>> happens only from relinquish-resources, which gets started only
> >>>>>>> after ->is_shut_down got set (unless the tool stack invoked
> >>>>>>> XEN_DOMCTL_destroydomain without having observed XEN_DOMINF_shutdown
> >>>>>>> set for the domain).
> >>>>>>>
> >>>>>>> For individually unregistered servers - yes, if qemu did so, this
> >>>>>>> would be a problem. They need to remain registered until all vCPU-s
> >>>>>>> in the domain got paused.
> >>>>>>
> >>>>>> It shouldn't be a problem should it? Destroying an individual server is only done with the domain
> >>>>>> paused, so no vcpus can be running at the time.
> >>>>>
> >>>>> Consider the case of one getting destroyed after it has already
> >>>>> returned data, but the originating vCPU didn't consume that data
> >>>>> yet. Once that vCPU gets unpaused, handle_hvm_io_completion()
> >>>>> won't find the matching server anymore, and hence the chain
> >>>>> hvm_wait_for_io() -> hvm_io_assist() ->
> >>>>> vcpu_end_shutdown_deferral() would be skipped. handle_pio()
> >>>>> would then still correctly consume the result.
> >>>>
> >>>> True, and skipping hvm_io_assist() means the vcpu internal ioreq state will be left set to
> >>>> IOREQ_READY and *that* explains why we would then exit hvmemul_do_io() with X86EMUL_UNHANDLEABLE (from
> >>>> the first switch).
> >>>
> >>> I can confirm X86EMUL_UNHANDLEABLE indeed comes from the first switch in
> >>> hvmemul_do_io(). And it happens shortly after ioreq server is destroyed:
> >>>
> >>> (XEN) d12v0 XEN_DMOP_remote_shutdown domain 11 reason 0
> >>> (XEN) d12v0 domain 11 domain_shutdown vcpu_id 0 defer_shutdown 1
> >>> (XEN) d12v0 XEN_DMOP_remote_shutdown domain 11 done
> >>> (XEN) d12v0 hvm_destroy_ioreq_server called for 11, id 0
> >>
> >> Can either of you tell why this is? As said before, qemu shouldn't
> >> start tearing down ioreq servers until the domain has made it out
> >> of all shutdown deferrals, and all its vCPU-s have been paused.
> >> For the moment I think the proposed changes, while necessary, will
> >> mask another issue elsewhere. The @releaseDomain xenstore watch,
> >> being the trigger I would consider relevant here, will trigger
> >> only once XEN_DOMINF_shutdown is reported set for a domain, which
> >> gets derived from d->is_shut_down (i.e. not mistakenly
> >> d->is_shutting_down).
> >
> > I can't find anything that actually calls xendevicemodel_shutdown(). It was added by:
> 
> destroy_hvm_domain() in qemu does.
> 

Ah ok, thanks. So it looks like this should normally only be called when the guest has written to the PIIX to request shutdown. Presumably the hvm_destroy_ioreq_server call we see afterwards is then QEMU exiting.
There is one other circumstance in which destroy_hvm_domain() would be called, and that is if the ioreq state is not STATE_IOREQ_INPROCESS, in which case there should be an accompanying error message in the qemu log.
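
For reference, the race discussed in the quoted text above can be modelled with a toy state machine. This is only a sketch in Python, not the Xen code: the class and function names (Vcpu, io_completion, do_io, run) and the string return values are made up for illustration, and the real paths (handle_hvm_io_completion(), hvm_io_assist(), hvmemul_do_io()) do considerably more. It assumes the first switch in hvmemul_do_io() only accepts "no request in flight" or "response ready", which is how the thread describes it.

```python
from enum import Enum, auto


class IoreqState(Enum):
    """Simplified per-vCPU ioreq states (illustrative names)."""
    NONE = auto()          # no request in flight
    IOREQ_READY = auto()   # request handed to the ioreq server
    IORESP_READY = auto()  # response folded back into the vCPU state


class Vcpu:
    def __init__(self):
        self.io_state = IoreqState.NONE


def io_completion(vcpu, server_registered):
    """Toy stand-in for the completion path run when the vCPU resumes.

    If the server that handled the request is no longer registered,
    no matching server is found and the assist step is skipped.
    """
    if server_registered and vcpu.io_state is IoreqState.IOREQ_READY:
        vcpu.io_state = IoreqState.IORESP_READY


def do_io(vcpu):
    """Toy stand-in for the first switch on the vCPU's ioreq state."""
    if vcpu.io_state is IoreqState.NONE:
        # Fresh request: send it to the server and ask to be re-entered.
        vcpu.io_state = IoreqState.IOREQ_READY
        return "RETRY"
    if vcpu.io_state is IoreqState.IORESP_READY:
        # Response available: consume it.
        vcpu.io_state = IoreqState.NONE
        return "OKAY"
    # Any other state (here: a stale IOREQ_READY) is rejected.
    return "UNHANDLEABLE"


def run(server_destroyed_before_completion):
    """Issue one I/O, run completion, then re-enter the emulation once."""
    vcpu = Vcpu()
    first = do_io(vcpu)
    io_completion(vcpu, not server_destroyed_before_completion)
    second = do_io(vcpu)
    return [first, second]


if __name__ == "__main__":
    print("server still registered:", run(False))
    print("server destroyed early: ", run(True))
```

In this model nothing ever moves the state away from IOREQ_READY once the server has gone, so every re-entry fails the same way, which would be consistent with the looping reported at the start of the thread.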

  Paul

> 
> Juergen


