* libxl refactoring, call for discussion @ 2016-01-25 13:38 Pavlo Suikov 2016-01-27 14:03 ` Pavlo Suikov 2016-01-27 14:26 ` Ian Campbell 0 siblings, 2 replies; 5+ messages in thread From: Pavlo Suikov @ 2016-01-25 13:38 UTC (permalink / raw) To: xen-devel [-- Attachment #1.1: Type: text/plain, Size: 3525 bytes --] Hi everyone, I want to bring domain restart question for a discussion. It originates from DomD restart, but the solution I am about to offer can be quite generic. Problem is, domain specification currently holds only frontend info, which is used to generate both frontend and backend entries for a device; that means that backend xenstore data is handled not by a domain that owns backends, but by a domain that has frontends linked to these backends. As a result, reboot of any domain with backends without reboot of the corresponding frontend domains is impossible. This is wrong on many levels, but the main thing is: some domains don't know something they should (do they have any backends) and some know something they should not (there are backends for their frontends in different domains). I propose following change: make domain "require" and "provide" interfaces visible (think CORBA), hold connections between the two in the priveleged domain (where toolstack is, think controller from the MVC idiom). With this change domains (except for Dom0 which is a special case) can be rebooted in any order whatsoever, and frontend/backend link can be adjusted as a static config or during runtime (e.g. if hardware rendering backend hangs, switch to software rendering to avoid glitches). However, it requires change in the libxl internal device representation (device should not be a frontend/backend pair any more) and config format change, which breaks backwards compatibility. That is, I want domain configuration hold records on both frontends (what this domain require) and backends (what this domain provides) and libxl to create corresponding xenstore branches separately. Moreover, I'd like to have frontend/backend connection information be held in a different config belonging to Dom0, so that on any domain reboot (or any exceptional situation like watchdog failure) supervisor (Dom0) can use this information to initiate a reconnect. And, as we talk about libxl refactoring, I'd like to state one point more: code duplication. Libxl support for a split-driver model consists of an declarative IDL device specification, xenstore read, xenstore write, config read, config write, xl args read, JSON read/write and device chain of responsibility with async device creation. The only thing IDL is used for is type and JSON read/write code generation, everything else is an error-prone hand-written duplicated code. Why won't we generate as much as we can? That means generation of xenstore read, xenstore write, config read, config write and xl args read - these all directly depend on device IDL specification. If we already have external code generation tool, why not use it to full extent instead of writing all this serialization/deserialization code manually (and in different styles - e.g. block device is the only one that uses lexx, and xl_cmdimpl.c parse_config_data implementations differs from device to device quite a lot)? As a matter of fact, I'd be doing some work in this general direction because we need DomD restart anyway and libxl boilerplate is kinda messy (we have ~12 devices and xl/libxl interface patches for them are almost copy and pasted), but I would like to hear as much criticism and ideas as possible. It would be nice if we can come out from this discussion with something potentially upstreamable. Suikov Pavlo GlobalLogic P +x.xxx.xxx.xxxx M +38.066.667.1296 S psujkov www.globallogic.com <http://www.globallogic.com/> http://www.globallogic.com/email_disclaimer.txt [-- Attachment #1.2: Type: text/html, Size: 5789 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: libxl refactoring, call for discussion 2016-01-25 13:38 libxl refactoring, call for discussion Pavlo Suikov @ 2016-01-27 14:03 ` Pavlo Suikov 2016-01-27 14:26 ` Ian Campbell 1 sibling, 0 replies; 5+ messages in thread From: Pavlo Suikov @ 2016-01-27 14:03 UTC (permalink / raw) To: xen-devel [-- Attachment #1.1: Type: text/plain, Size: 4136 bytes --] Hi again, any thoughts on this topic? Questions, suggestions, criticism? I can make patches to discuss code, but it would rather be handly if I'd be on the same page with everyone here to make something generically useful. Suikov Pavlo GlobalLogic P +x.xxx.xxx.xxxx M +38.066.667.1296 S psujkov www.globallogic.com <http://www.globallogic.com/> http://www.globallogic.com/email_disclaimer.txt On Mon, Jan 25, 2016 at 3:38 PM, Pavlo Suikov <pavlo.suikov@globallogic.com> wrote: > Hi everyone, > > I want to bring domain restart question for a discussion. It originates > from DomD restart, but the solution I am about to offer can be quite > generic. > > Problem is, domain specification currently holds only frontend info, which > is used to generate both frontend and backend entries for a device; that > means that backend xenstore data is handled not by a domain that owns > backends, but by a domain that has frontends linked to these backends. As a > result, reboot of any domain with backends without reboot of the > corresponding frontend domains is impossible. > > This is wrong on many levels, but the main thing is: some domains don't > know something they should (do they have any backends) and some know > something they should not (there are backends for their frontends in > different domains). > > I propose following change: make domain "require" and "provide" interfaces > visible (think CORBA), hold connections between the two in the priveleged > domain (where toolstack is, think controller from the MVC idiom). With this > change domains (except for Dom0 which is a special case) can be rebooted in > any order whatsoever, and frontend/backend link can be adjusted as a static > config or during runtime (e.g. if hardware rendering backend hangs, switch > to software rendering to avoid glitches). However, it requires change in > the libxl internal device representation (device should not be a > frontend/backend pair any more) and config format change, which breaks > backwards compatibility. > > That is, I want domain configuration hold records on both frontends (what > this domain require) and backends (what this domain provides) and libxl to > create corresponding xenstore branches separately. Moreover, I'd like to > have frontend/backend connection information be held in a different config > belonging to Dom0, so that on any domain reboot (or any exceptional > situation like watchdog failure) supervisor (Dom0) can use this information > to initiate a reconnect. > > And, as we talk about libxl refactoring, I'd like to state one point more: > code duplication. Libxl support for a split-driver model consists of an > declarative IDL device specification, xenstore read, xenstore write, config > read, config write, xl args read, JSON read/write and device chain of > responsibility with async device creation. The only thing IDL is used for > is type and JSON read/write code generation, everything else is an > error-prone hand-written duplicated code. > > Why won't we generate as much as we can? That means generation of xenstore > read, xenstore write, config read, config write and xl args read - these > all directly depend on device IDL specification. If we already have > external code generation tool, why not use it to full extent instead of > writing all this serialization/deserialization code manually (and in > different styles - e.g. block device is the only one that uses lexx, and > xl_cmdimpl.c parse_config_data implementations differs from device to > device quite a lot)? > > As a matter of fact, I'd be doing some work in this general direction > because we need DomD restart anyway and libxl boilerplate is kinda messy > (we have ~12 devices and xl/libxl interface patches for them are almost > copy and pasted), but I would like to hear as much criticism and ideas as > possible. It would be nice if we can come out from this discussion with > something potentially upstreamable. > > Suikov Pavlo > GlobalLogic > P +x.xxx.xxx.xxxx M +38.066.667.1296 S psujkov > www.globallogic.com > <http://www.globallogic.com/> > http://www.globallogic.com/email_disclaimer.txt > [-- Attachment #1.2: Type: text/html, Size: 8568 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: libxl refactoring, call for discussion 2016-01-25 13:38 libxl refactoring, call for discussion Pavlo Suikov 2016-01-27 14:03 ` Pavlo Suikov @ 2016-01-27 14:26 ` Ian Campbell 2016-01-27 16:20 ` Pavlo Suikov 1 sibling, 1 reply; 5+ messages in thread From: Ian Campbell @ 2016-01-27 14:26 UTC (permalink / raw) To: Pavlo Suikov, xen-devel On Mon, 2016-01-25 at 15:38 +0200, Pavlo Suikov wrote: > Hi everyone, > > I want to bring domain restart question for a discussion. It originates > from DomD restart, but the solution I am about to offer can be quite > generic. > > Problem is, domain specification currently holds only frontend info, > which is used to generate both frontend and backend entries for a device; > that means that backend xenstore data is handled not by a domain that > owns backends, but by a domain that has frontends linked to these > backends. As a result, reboot of any domain with backends without reboot > of the corresponding frontend domains is impossible. > > This is wrong on many levels, but the main thing is: some domains don't > know something they should (do they have any backends) and some know > something they should not (there are backends for their frontends in > different domains). > > I propose following change: make domain "require" and "provide" > interfaces visible (think CORBA), hold connections between the two in the > priveleged domain (where toolstack is, think controller from the MVC > idiom). With this change domains (except for Dom0 which is a special > case) can be rebooted in any order whatsoever, and frontend/backend link > can be adjusted as a static config or during runtime (e.g. if hardware > rendering backend hangs, switch to software rendering to avoid glitches). > However, it requires change in the libxl internal device representation > (device should not be a frontend/backend pair any more) and config format > change, which breaks backwards compatibility. > > That is, I want domain configuration hold records on both frontends (what > this domain require) and backends (what this domain provides) and libxl > to create corresponding xenstore branches separately. Moreover, I'd like > to have frontend/backend connection information be held in a different > config belonging to Dom0, so that on any domain reboot (or any > exceptional situation like watchdog failure) supervisor (Dom0) can use > this information to initiate a reconnect. > > And, as we talk about libxl refactoring, I'd like to state one point > more: code duplication. Libxl support for a split-driver model consists > of an declarative IDL device specification, xenstore read, xenstore > write, config read, config write, xl args read, JSON read/write and > device chain of responsibility with async device creation. The only thing > IDL is used for is type and JSON read/write code generation, everything > else is an error-prone hand-written duplicated code. > > Why won't we generate as much as we can? That means generation of > xenstore read, xenstore write, config read, config write and xl args read > - these all directly depend on device IDL specification. If we already > have external code generation tool, why not use it to full extent instead > of writing all this serialization/deserialization code manually (and in > different styles - e.g. block device is the only one that uses lexx, and > xl_cmdimpl.c parse_config_data implementations differs from device to > device quite a lot)? It seems that much of this is conflating several separate things: 1. The contents of xenstore representing a front/back pair, and the protocol which they imply. 2. The code in libxl which populates that state 3. The data structures used inside libxl to describe such devices 4. The code in xl which parses xl's own config file format (which is not common to all toolstacks) Currently only #3 is covered by the IDL. It's possible (likely even) that some or all of #1 and #2 might be describable either in an extension to the IDL or in some other domain specific language or that more of the setup in these cases could be made common than is the case in the code today, and I would expect such patches to be welcomed (although it might be wise to discuss a specific plan first). Note that for historical reasons their are some subtle differences between the handling required for some devices. #4 isn't really subject to the libxl IDL, being xl code relating to xl's own configuration file. It's possible that xl could benefit from some more code automation. Wei is looking at cleaning some of the various adhoc parsing of key-value lists for example (and has posted some patches in the last week or so). There's probably more cleanup which could be done in this area and again I would expect patches to be welcome. It seems like aside from proposing to autogenerate more code you may also proposing some sort of change to the actual xenstore protocol used by pv front and backends (the "require"/"provide" concept from CORBA and the MVC idiom bits). It's not clear to me exactly what the proposal is but just let me say that (unfortunately) any changes to these arrangements need to be made in a backward compatible way, we can't simply rework everything and break all existing guests (or dom0's). Maybe I've misunderstood what you meant to suggest though. Note that the xl config file syntax is subject to evolution and can be extended, or even changed, although we would of course prefer to minimise breakage where possible (i.e. extending is preferred). Ian. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: libxl refactoring, call for discussion 2016-01-27 14:26 ` Ian Campbell @ 2016-01-27 16:20 ` Pavlo Suikov 2016-01-27 16:50 ` Ian Campbell 0 siblings, 1 reply; 5+ messages in thread From: Pavlo Suikov @ 2016-01-27 16:20 UTC (permalink / raw) To: Ian Campbell; +Cc: xen-devel [-- Attachment #1.1: Type: text/plain, Size: 9140 bytes --] Hi Ian, I'll try to make clear domain restart thing first. Currently there are only device entries in domain config for frontends with backend paths in them, e.g.: FrontendDomain.cfg: vdevice = [ 'backend=BackendDomain,devid=0,device=/dev/device' ] it is parsed only on frontend domain start, and libxl creates both frontend and backend xenstore branches from that entry. Problem is, on backend domain restart libxl does not re-read this config, because it belongs to the frontend domain which is clearly not restarted, and as a consequence xenstore backend branches are cleared (on backend shutdown) but not filled again, because backend domain does not have any entries for this device. What I propose is to add information needed to fill xenstore backend branches to backend domains and remove it from frontend domains, so it would possibly look like this: FrontendDomain.cfg: frontend:vdevice = [ 'backend=BackendDomain,devid=0' ] BackendDomain.cfg: backend:vdevice = ['frontend=FrontendDomain,devid=0,device=/dev/device'] Now we have all the information needed for both domains to restart intependently filling their own xenstore branches. On FrontendDomain start libxl fills only frontend branch for vdevice, and on backendDomain start libxl fills only backend branch. This can be easily done in backwards-compatible way: if frontend entry is filled in the old format, it is used to fill both branches, but if we explicitly state that it's only frontend or only backend entry, then it is used to fill only the corresponding branch. For the main part, that's it: currently the main thing we need is a DomD restart, and such a change would be enough for that purpose with no need for dirty hacks. But from that point we can make one step further: to remove split driver connection information from user domains. It would possibly look like this: FrontendDomain.cfg: frontend:vdevice = [ 'devid=0' ] BackendDomain.cfg: backend:vdevice = ['devid=0,devname=Main,device=/dev/device'] SupervisorDomain.cfg: backend:vdevice = ['devid=0,devname=Fallback,device=/dev/device'] connection:vdevice = ['devid=0, frontend=FrontendDomain, backend=BackendDomain, devname=Main'] connection:vdevice = ['devid=0, frontend=FrontendDomain, backend=SupervisorDomain, devname=Fallback'] Motivation for this example is the following: if we have, e.g., PV GPU which is used in user domain and we have restrictions on graphics availability, we can monitor if main device backend hanged and make a runtime switch to a fallback device (software rendering) till main backend would be restated. Connections are static, they generally do not change and do not actually belong to any user domain, so there is no need clearing and refilling them on every domain restart. This can be easily done in backwards-compatible way as well: if entries already have connection information, we stick to the old model; if they don't, we look for the connection branch to fill missing fields and to setup fallback paths. There still is an issue with hoplug devices being non-persistent (xl vdevice-attach fills both frontend and backend branches, but restart of any of user domains looses this information). At the moment we fix this rather dirty with additional 'hotplugged' xenstore branch bot belonging to any domain so it's not erased on shutdown but is being read on domain start if exists. In context of reworking libxl, I believe this one can be done as well, though I still don't have a solution I would like. Suikov Pavlo GlobalLogic P +x.xxx.xxx.xxxx M +38.066.667.1296 S psujkov www.globallogic.com <http://www.globallogic.com/> http://www.globallogic.com/email_disclaimer.txt On Wed, Jan 27, 2016 at 4:26 PM, Ian Campbell <ian.campbell@citrix.com> wrote: > On Mon, 2016-01-25 at 15:38 +0200, Pavlo Suikov wrote: > > Hi everyone, > > > > I want to bring domain restart question for a discussion. It originates > > from DomD restart, but the solution I am about to offer can be quite > > generic. > > > > Problem is, domain specification currently holds only frontend info, > > which is used to generate both frontend and backend entries for a device; > > that means that backend xenstore data is handled not by a domain that > > owns backends, but by a domain that has frontends linked to these > > backends. As a result, reboot of any domain with backends without reboot > > of the corresponding frontend domains is impossible. > > > > This is wrong on many levels, but the main thing is: some domains don't > > know something they should (do they have any backends) and some know > > something they should not (there are backends for their frontends in > > different domains). > > > > I propose following change: make domain "require" and "provide" > > interfaces visible (think CORBA), hold connections between the two in the > > priveleged domain (where toolstack is, think controller from the MVC > > idiom). With this change domains (except for Dom0 which is a special > > case) can be rebooted in any order whatsoever, and frontend/backend link > > can be adjusted as a static config or during runtime (e.g. if hardware > > rendering backend hangs, switch to software rendering to avoid glitches). > > However, it requires change in the libxl internal device representation > > (device should not be a frontend/backend pair any more) and config format > > change, which breaks backwards compatibility. > > > > That is, I want domain configuration hold records on both frontends (what > > this domain require) and backends (what this domain provides) and libxl > > to create corresponding xenstore branches separately. Moreover, I'd like > > to have frontend/backend connection information be held in a different > > config belonging to Dom0, so that on any domain reboot (or any > > exceptional situation like watchdog failure) supervisor (Dom0) can use > > this information to initiate a reconnect. > > > > And, as we talk about libxl refactoring, I'd like to state one point > > more: code duplication. Libxl support for a split-driver model consists > > of an declarative IDL device specification, xenstore read, xenstore > > write, config read, config write, xl args read, JSON read/write and > > device chain of responsibility with async device creation. The only thing > > IDL is used for is type and JSON read/write code generation, everything > > else is an error-prone hand-written duplicated code. > > > > Why won't we generate as much as we can? That means generation of > > xenstore read, xenstore write, config read, config write and xl args read > > - these all directly depend on device IDL specification. If we already > > have external code generation tool, why not use it to full extent instead > > of writing all this serialization/deserialization code manually (and in > > different styles - e.g. block device is the only one that uses lexx, and > > xl_cmdimpl.c parse_config_data implementations differs from device to > > device quite a lot)? > > It seems that much of this is conflating several separate things: > > 1. The contents of xenstore representing a front/back pair, and the > protocol which they imply. > 2. The code in libxl which populates that state > 3. The data structures used inside libxl to describe such devices > 4. The code in xl which parses xl's own config file format (which is not > common to all toolstacks) > > Currently only #3 is covered by the IDL. > > It's possible (likely even) that some or all of #1 and #2 might be > describable either in an extension to the IDL or in some other domain > specific language or that more of the setup in these cases could be made > common than is the case in the code today, and I would expect such patches > to be welcomed (although it might be wise to discuss a specific plan > first). > > Note that for historical reasons their are some subtle differences between > the handling required for some devices. > > #4 isn't really subject to the libxl IDL, being xl code relating to xl's > own configuration file. It's possible that xl could benefit from some more > code automation. Wei is looking at cleaning some of the various adhoc > parsing of key-value lists for example (and has posted some patches in the > last week or so). There's probably more cleanup which could be done in this > area and again I would expect patches to be welcome. > > It seems like aside from proposing to autogenerate more code you may also > proposing some sort of change to the actual xenstore protocol used by pv > front and backends (the "require"/"provide" concept from CORBA and the MVC > idiom bits). It's not clear to me exactly what the proposal is but just let > me say that (unfortunately) any changes to these arrangements need to be > made in a backward compatible way, we can't simply rework everything and > break all existing guests (or dom0's). Maybe I've misunderstood what you > meant to suggest though. > > Note that the xl config file syntax is subject to evolution and can be > extended, or even changed, although we would of course prefer to minimise > breakage where possible (i.e. extending is preferred). > > Ian. > [-- Attachment #1.2: Type: text/html, Size: 13096 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: libxl refactoring, call for discussion 2016-01-27 16:20 ` Pavlo Suikov @ 2016-01-27 16:50 ` Ian Campbell 0 siblings, 0 replies; 5+ messages in thread From: Ian Campbell @ 2016-01-27 16:50 UTC (permalink / raw) To: Pavlo Suikov; +Cc: xen-devel On Wed, 2016-01-27 at 18:20 +0200, Pavlo Suikov wrote: > Hi Ian, Please don't top post and please use plain text emails, not html. > I'll try to make clear domain restart thing first. Currently there are > only device entries in domain config for frontends with backend paths in > them, e.g.: > > FrontendDomain.cfg: > vdevice = [ 'backend=BackendDomain,devid=0,device=/dev/device' ] > > it is parsed only on frontend domain start, and libxl creates both > frontend and backend xenstore branches from that entry. Problem is, on > backend domain restart libxl does not re-read this config, libxl _never_ reads this config. This is an xl configuration file. xl is one user of libxl, but not the only one (libvirt is another for example). It is important to keep the distinction in mind when discussing these APIs. libxl gets libxl_device_disk structs from xl. > because it belongs to the frontend domain which is clearly not > restarted, and as a consequence xenstore backend branches are cleared (on > backend shutdown) but not filled again, because backend domain does not > have any entries for this device. The information regarding which backends a domain has is present in xenstore (in /local/domain/<domid>/backends) and libxl could make use of that information when restarting a domain, including following the trail to the frontend configuration. If needed libxl could also make a note in a backend domain location when starting a frontend of anything which it might need in order to restart that backend. Likewise if anything which libxl needs in order to restart a backend is missing it should just be added. I think most of the problem here is just that libxl is not making use of the information which it already has available when rebooting a domain which has backends attached. > What I propose is to add information needed to fill xenstore backend > branches to backend domains and remove it from frontend domains, so it > would possibly look like this: > > FrontendDomain.cfg: > frontend:vdevice = [ 'backend=BackendDomain,devid=0' ] > > BackendDomain.cfg: > backend:vdevice = ['frontend=FrontendDomain,devid=0,device=/dev/device'] Leaving aside that this proposal is xl specific and not addressing the more important libxl API I don't think that having to predeclare the set of backends when starting the backend domain is going to be feasible in the general case (where domains are created dynamically and in an adhoc way by users without the foresight afforded to embedded situations). In any case as described above I think you are solving an issue which doesn't actually exist. [...] > Motivation for this example is the following: if we have, e.g., PV GPU > which is used in user domain and we have restrictions on graphics > availability, we can monitor if main device backend hanged and make a > runtime switch to a fallback device (software rendering) till main > backend would be restated. Connections are static, they generally do not > change and do not actually belong to any user domain, so there is no need > clearing and refilling them on every domain restart. I think you need to think about this in terms of the underlying libxl APIs (i.e. libxl_device_<foo>_reconnect() or something) before thinking about what the xl cfg file format and commands might be like. > There still is an issue with hoplug devices being non-persistent (xl > vdevice-attach fills both frontend and backend branches, but restart of > any of user domains looses this information). Not since 4.5 (or maybe 4.6), libxl now records such dynamic changes and they are preserved on reboot (or should be, if not then there is a bug somewhere). Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2016-01-27 16:50 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-01-25 13:38 libxl refactoring, call for discussion Pavlo Suikov 2016-01-27 14:03 ` Pavlo Suikov 2016-01-27 14:26 ` Ian Campbell 2016-01-27 16:20 ` Pavlo Suikov 2016-01-27 16:50 ` Ian Campbell
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).