From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
Paul Durrant <Paul.Durrant@citrix.com>,
Wei Liu <wei.liu2@citrix.com>
Subject: Re: Race condition on device add hanling in xl devd
Date: Thu, 28 Feb 2019 13:38:07 +0100
Message-ID: <20190228123807.GO5348@mail-itl>
In-Reply-To: <20190228100837.3velwvbmrfcs4eor@Air-de-Roger>
On Thu, Feb 28, 2019 at 11:08:37AM +0100, Roger Pau Monné wrote:
> On Mon, Feb 25, 2019 at 12:14:02AM +0100, Marek Marczykowski-Górecki wrote:
> > On Mon, Dec 17, 2018 at 05:09:19PM +0100, Roger Pau Monné wrote:
> > > On Mon, Dec 17, 2018 at 02:42:23PM +0000, Paul Durrant wrote:
> > > > I suspect I must be remembering a XenServer-specific hack^Wpatch then. I'd have to dig... it's been a while since I messed with the netif state model, which is of course different from the blkif state model.
> > >
> > > Quite likely. With udev scripts it was feasible to only execute
> > > hotplug scripts for vifs with an attached frontend.
> > >
> > > With libxl this is not possible, since hotplug scripts are run during
> > > domain creation, at which point the guest is completely paused.
> > >
> > > I'm not that familiar with bridges and vifs, but maybe the vifs status
> > > can be set to offline until there's a frontend attached in order to
> > > reduce the bridge distributor load? (if that's not already the case).
> >
> > I've found what the problem was, and with some definition of "race
> > condition" it could be named this way.
> > The problem is that for some reason the xenstore watch on device add
> > sometimes does not fire in xl devd. But then, when libxl in dom0
> > times out and removes the device, the xenstore watch in xl devd fires
> > and the hotplug script is called. At this point the device is already
> > gone, so it fails. xl devd then quickly calls the hotplug script a
> > second time, for device removal.
> >
> > I have no idea why this xenstore watch does not fire, but triggering a
> > no-op write into the watched path (to trigger the watch again) works
> > around the problem. I use a xenstore watch in dom0 for that[1] - which works.
> > I suspect something related to KVM nested virtualization (lost
> > interrupt?)...
>
> That's very weird, could you try to run xenstored in dom0 with trace
> enabled [0] in order to try to figure out what's happening?
I've tried already, but it was way too slow (remember it's nested KVM,
which doesn't exactly help performance). I hit multiple timeouts even
without hitting this problem. Unfortunately I don't have logs from that
experiment anymore.
I can try again...
> I assume this only happens when running nested in KVM?
I'd say so. I'm not entirely sure, because I've seen similar symptoms on
bare metal Xen too in the past, but I think that could be a different
problem, and I haven't seen it in the past 3 months.
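For reference, the no-op write workaround mentioned in the quoted text
above could look roughly like this. It is only a sketch, not the script
linked as [1]: it assumes the xenstore-read and xenstore-write
command-line tools are available where it runs, and the function name
noop_rewrite and its run parameter are made up for illustration.

```python
import subprocess


def noop_rewrite(path, run=subprocess.run):
    """Write a xenstore key back with its current value.

    The stored content does not change, but the write should make
    watches registered on that path fire again.
    `run` is injectable (hypothetical parameter) so the function can be
    exercised without a real xenstore.
    """
    # Read the current value with the xenstore-read CLI.
    result = run(
        ["xenstore-read", path],
        check=True,
        capture_output=True,
        text=True,
    )
    value = result.stdout.rstrip("\n")

    # Write the very same value back with the xenstore-write CLI.
    run(["xenstore-write", path, value], check=True)
    return value
```

It would be called with the full xenstore path of the watched backend
node; which path that is depends on the device and is not shown here.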
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?