* [PATCH v3 1/3] docs: add pod variant of xen-pv-channel.7
2017-07-26 14:39 [PATCH v3 0/3] docs: convert manpages to pod Olaf Hering
@ 2017-07-26 14:39 ` Olaf Hering
2017-07-26 14:39 ` [PATCH v3 2/3] docs: add pod variant of xl-network-configuration.5 Olaf Hering
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Olaf Hering @ 2017-07-26 14:39 UTC
To: xen-devel, Ian Jackson, Wei Liu; +Cc: Olaf Hering
Convert source for xen-pv-channel.7 from markdown to pod.
This removes the build-time requirement for pandoc, and consequently the
need for ghc, from the BuildRequires chain of xen.rpm.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
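For reference when reviewing step 5 of the runtime sequence in the
converted page: the dom0 side of the handshake could look roughly like the
Python sketch below. It is an illustration only and not part of the patch.
The handshake strings and the names connect_channel and do_handshake are
hypothetical; the socket path is whatever was given as path= in the xl
channel option.

```python
import socket

HELLO = b"hello-from-dom0\n"   # hypothetical handshake message
REPLY = b"hello-from-guest\n"  # hypothetical reply expected from the guest


def connect_channel(path):
    """Connect to the Unix domain socket named by path= in the channel spec."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    return sock


def do_handshake(sock, timeout=30.0):
    """Write the handshake, then wait for the in-guest agent's reply."""
    sock.settimeout(timeout)
    sock.sendall(HELLO)
    received = b""
    # A channel is a byte stream, so the reply may arrive in pieces.
    while len(received) < len(REPLY):
        chunk = sock.recv(len(REPLY) - len(received))
        if not chunk:  # peer closed the stream before replying
            return False
        received += chunk
    return received == REPLY
```

An orchestration agent would call do_handshake(connect_channel(path)) once
the VM has been started, and only then send the VM configuration.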
docs/man/xen-pv-channel.markdown.7 | 106 ---------------------
docs/man/xen-pv-channel.pod.7 | 188 +++++++++++++++++++++++++++++++++++++
2 files changed, 188 insertions(+), 106 deletions(-)
delete mode 100644 docs/man/xen-pv-channel.markdown.7
create mode 100644 docs/man/xen-pv-channel.pod.7
diff --git a/docs/man/xen-pv-channel.markdown.7 b/docs/man/xen-pv-channel.markdown.7
deleted file mode 100644
index 1c6149dae0..0000000000
--- a/docs/man/xen-pv-channel.markdown.7
+++ /dev/null
@@ -1,106 +0,0 @@
-Xen PV Channels
-===============
-
-A channel is a low-bandwidth private byte stream similar to a serial
-link. Typical uses of channels are
-
- 1. to provide initial configuration information to a VM on boot
- (example use: CloudStack's cloud-early-config service)
- 2. to signal/query an in-guest agent
- (example use: oVirt's guest agent)
-
-Channels are similar to virtio-serial devices and emulated serial links.
-Channels are intended to be used in the implementation of libvirt <channel>s
-when running on Xen.
-
-Note: if an application requires a high-bandwidth link then it should use
-vchan instead.
-
-How to use channels: an example
--------------------------------
-
-Consider a cloud deployment where VMs are cloned from pre-made templates,
-and customised on first boot by an in-guest agent which sets the IP address,
-hostname, ssh keys etc. To install the system the cloud administrator would
-first:
-
- 1. Install a guest as normal (no channel configuration necessary)
- 2. Install the in-guest agent specific to the cloud software. This will
- prepare the guest to communicate over the channel, and also prepare
- the guest to be cloned safely (sometimes known as "sysprepping")
- 3. Shutdown the guest
- 4. Register the guest as a template with the cloud orchestration software
- 5. Install the cloud orchestration agent in dom0
-
-At runtime, when a cloud tenant requests that a VM is created from the template,
-the sequence of events would be: (assuming a Linux domU)
-
- 1. A VM is "cloned" from the template
- 2. A unique Unix domain socket path in dom0 is allocated
- (e.g. /my/cloud/software/talk/to/domain/<vm uuid>)
- 3. Domain configuration is created for the VM, listing the channel
- name expected by the in-guest agent. In xl syntax this would be:
-
- channel = [ "connection=socket, name=org.my.cloud.software.agent.version1,
- path = /my/cloud/software/talk/to/domain/<vm uuid>" ]
-
- 4. The VM is started
- 5. In dom0 the cloud orchestration agent connects to the Unix domain
- socket, writes a handshake message and waits for a reply
- 6. Assuming the guest kernel has CONFIG_HVC_XEN_FRONTEND set then the console
- driver will generate a hotplug event
- 7. A udev rule is activated by the hotplug event.
-
- The udev rule would look something like:
-
- SUBSYSTEM=="xen", DEVPATH=="/devices/console-[0-9]", RUN+="xen-console-setup"
-
- where the "xen-console-setup" script would read the channel name and
- make a symlink in /dev/xen-channel/org.my.cloud.software.agent.version1
-
- 8. The in-guest agent uses inotify to see the creation of the /dev/xen-channel
- symlink and opens the device.
- 9. The in-guest agent completes the handshake with the dom0 agent
- 10. The dom0 agent transmits the unique VM configuration: hostname, IP
- address, ssh keys etc etc
- 11. The in-guest agent receives the configuration and applies it.
-
-Using channels avoids having to use a temporary disk device or network
-connection.
-
-Design recommendations and pitfalls
------------------------------------
-
-It's necessary to install channel-specific software (an "agent") into the guest
-before you can use a channel. By default a channel will appear as a device
-which could be mistaken for a serial port or regular console. It is known
-that some software will proactively seek out serial ports and issue AT commands
-at them; make sure such software is disabled!
-
-Since channels are identified by names, application authors must ensure their
-channel names are unique to avoid clashes. We recommend that channel names
-include parts unique to the application such as a domain names. To assist
-prevent clashes we recommend authors add their names to our global channel
-registry at the end of this document.
-
-Limitations
------------
-
-Hotplug and unplug of channels is not currently implemented.
-
-Channel name registry
----------------------
-
-It is important that channel names are globally unique. To help ensure
-that no-one's name clashes with yours, please add yours to this list.
-
- Key:
- N: Name
- C: Contact
- D: Short description of use, possibly including a URL to your software
- or API
-
- N: org.xenproject.guest.clipboard.0.1
- C: David Scott <dave.scott@citrix.com>
- D: Share clipboard data via an in-guest agent. See:
- http://wiki.xenproject.org/wiki/Clipboard_sharing_protocol
diff --git a/docs/man/xen-pv-channel.pod.7 b/docs/man/xen-pv-channel.pod.7
new file mode 100644
index 0000000000..2333083cce
--- /dev/null
+++ b/docs/man/xen-pv-channel.pod.7
@@ -0,0 +1,188 @@
+=encoding utf8
+
+
+=head1 NAME
+
+xen-pv-channel - Xen PV Channels
+
+=head1 DESCRIPTION
+
+A channel is a low-bandwidth private byte stream similar to a serial
+link. Typical uses of channels are
+
+=over
+
+=item 1.
+
+to provide initial configuration information to a VM on boot
+(example use: CloudStack's cloud-early-config service)
+
+
+=item 2.
+
+to signal/query an in-guest agent
+(example use: oVirt's guest agent)
+
+
+=back
+
+Channels are similar to virtio-serial devices and emulated serial links.
+Channels are intended to be used in the implementation of libvirt E<lt>channelE<gt>s
+when running on Xen.
+
+Note: if an application requires a high-bandwidth link then it should use
+vchan instead.
+
+
+=head2 How to use channels: an example
+
+Consider a cloud deployment where VMs are cloned from pre-made templates,
+and customised on first boot by an in-guest agent which sets the IP address,
+hostname, ssh keys etc. To install the system the cloud administrator would
+first:
+
+=over
+
+=item 1.
+
+Install a guest as normal (no channel configuration necessary)
+
+
+=item 2.
+
+Install the in-guest agent specific to the cloud software. This will
+prepare the guest to communicate over the channel, and also prepare
+the guest to be cloned safely (sometimes known as "sysprepping")
+
+
+=item 3.
+
+Shut down the guest
+
+
+=item 4.
+
+Register the guest as a template with the cloud orchestration software
+
+
+=item 5.
+
+Install the cloud orchestration agent in dom0
+
+
+=back
+
+At runtime, when a cloud tenant requests that a VM is created from the template,
+the sequence of events would be: (assuming a Linux domU)
+
+=over
+
+=item 1.
+
+A VM is "cloned" from the template
+
+
+=item 2.
+
+A unique Unix domain socket path in dom0 is allocated
+(e.g. /my/cloud/software/talk/to/domain/E<lt>vm uuidE<gt>)
+
+
+=item 3.
+
+Domain configuration is created for the VM, listing the channel
+name expected by the in-guest agent. In xl syntax this would be:
+
+ channel = [ "connection=socket, name=org.my.cloud.software.agent.version1, path = /my/cloud/software/talk/to/domain/<vm uuid>" ]
+
+=item 4.
+
+The VM is started
+
+
+=item 5.
+
+In dom0 the cloud orchestration agent connects to the Unix domain
+socket, writes a handshake message and waits for a reply
+
+
+=item 6.
+
+Assuming the guest kernel has CONFIG_HVC_XEN_FRONTEND set then the console
+driver will generate a hotplug event
+
+
+=item 7.
+
+A udev rule is activated by the hotplug event.
+
+The udev rule would look something like:
+
+ SUBSYSTEM=="xen", DEVPATH=="/devices/console-[0-9]", RUN+="xen-console-setup"
+
+where the "xen-console-setup" script would read the channel name and
+make a symlink in /dev/xen-channel/org.my.cloud.software.agent.version1
+
+
+=item 8.
+
+The in-guest agent uses inotify to see the creation of the /dev/xen-channel
+symlink and opens the device.
+
+
+=item 9.
+
+The in-guest agent completes the handshake with the dom0 agent
+
+
+=item 10.
+
+The dom0 agent transmits the unique VM configuration: hostname, IP
+address, ssh keys etc.
+
+
+=item 11.
+
+The in-guest agent receives the configuration and applies it.
+
+
+=back
+
+Using channels avoids having to use a temporary disk device or network
+connection.
+
+
+=head2 Design recommendations and pitfalls
+
+It's necessary to install channel-specific software (an "agent") into the guest
+before you can use a channel. By default a channel will appear as a device
+which could be mistaken for a serial port or regular console. It is known
+that some software will proactively seek out serial ports and issue AT commands
+at them; make sure such software is disabled!
+
+Since channels are identified by names, application authors must ensure their
+channel names are unique to avoid clashes. We recommend that channel names
+include parts unique to the application such as a domain name. To help
+prevent clashes we recommend authors add their names to our global channel
+registry at the end of this document.
+
+
+=head2 Limitations
+
+Hotplug and unplug of channels is not currently implemented.
+
+
+=head2 Channel name registry
+
+It is important that channel names are globally unique. To help ensure
+that no-one's name clashes with yours, please add yours to this list.
+
+ Key:
+ N: Name
+ C: Contact
+ D: Short description of use, possibly including a URL to your software or API
+
+ N: org.xenproject.guest.clipboard.0.1
+ C: David Scott <dave.scott@citrix.com>
+ D: Share clipboard data via an in-guest agent. See:
+ http://wiki.xenproject.org/wiki/Clipboard_sharing_protocol
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
* [PATCH v3 2/3] docs: add pod variant of xl-network-configuration.5
2017-07-26 14:39 [PATCH v3 0/3] docs: convert manpages to pod Olaf Hering
2017-07-26 14:39 ` [PATCH v3 1/3] docs: add pod variant of xen-pv-channel.7 Olaf Hering
@ 2017-07-26 14:39 ` Olaf Hering
2017-07-26 14:39 ` [PATCH v3 3/3] docs: add pod variant of xl-numa-placement Olaf Hering
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Olaf Hering @ 2017-07-26 14:39 UTC
To: xen-devel, Ian Jackson, Wei Liu; +Cc: Olaf Hering
Convert source for xl-network-configuration.5 from markdown to pod.
This removes the build-time requirement for pandoc, and consequently the
need for ghc, from the BuildRequires chain of xen.rpm.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
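For reference when reviewing the "mac" section: the first strategy in the
list (random bytes, locally administered bit set, multicast bit cleared)
can be sketched in Python as below. This is an illustration only and not
part of the patch; the function name random_local_mac is hypothetical.

```python
import random


def random_local_mac(rng=None):
    """Return a random, locally administered, unicast MAC address."""
    if rng is None:
        rng = random.SystemRandom()
    octets = [rng.randrange(256) for _ in range(6)]
    # The first byte must match the bit pattern xxxxxx10: set the locally
    # administered bit (0x02) and clear the multicast bit (0x01).
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{octet:02x}" for octet in octets)
```

The resulting string can be used as the value of the mac keyword in a
vifspec, alongside keys such as bridge.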
...n.markdown.5 => xl-network-configuration.pod.5} | 196 ++++++++++++++-------
1 file changed, 137 insertions(+), 59 deletions(-)
rename docs/man/{xl-network-configuration.markdown.5 => xl-network-configuration.pod.5} (55%)
diff --git a/docs/man/xl-network-configuration.markdown.5 b/docs/man/xl-network-configuration.pod.5
similarity index 55%
rename from docs/man/xl-network-configuration.markdown.5
rename to docs/man/xl-network-configuration.pod.5
index 84c2645ad8..e9ac3c5b9e 100644
--- a/docs/man/xl-network-configuration.markdown.5
+++ b/docs/man/xl-network-configuration.pod.5
@@ -1,6 +1,11 @@
-# XL Network Configuration
+=encoding utf8
-## Syntax Overview
+=head1 NAME
+
+xl-network-configuration - XL Network Configuration Syntax
+
+
+=head1 SYNTAX
This document specifies the xl config file format vif configuration
option. It has the following form:
@@ -8,7 +13,7 @@ option. It has the following form:
vif = [ '<vifspec>', '<vifspec>', ... ]
where each vifspec is in this form:
-
+
[<key>=<value>|<flag>,]
For example:
@@ -24,11 +29,13 @@ These might be specified in the domain config file like this:
More formally, the string is a series of comma-separated keyword/value
pairs. All keywords are optional.
-Each device has a `DEVID` which is its index within the vif list, starting from 0.
+Each device has a C<DEVID> which is its index within the vif list, starting from 0.
-## Keywords
-### mac
+=head1 Keywords
+
+
+=head2 mac
If specified then this option specifies the MAC address inside the
guest of this VIF device. The value is a 48-bit number represented as
@@ -36,89 +43,137 @@ six groups of two hexadecimal digits, separated by colons (:).
The default if this keyword is not specified is to be automatically
generate a MAC address inside the space assigned to Xen's
-[Organizationally Unique Identifier][oui] (00:16:3e).
+L<Organizationally Unique Identifier|http://en.wikipedia.org/wiki/Organizationally_Unique_Identifier> (00:16:3e).
If you are choosing a MAC address then it is strongly recommend to
follow one of the following strategies:
- * Generate a random sequence of 6 byte, set the locally administered
- bit (bit 2 of the first byte) and clear the multicast bit (bit 1
- of the first byte). In other words the first byte should have the
- bit pattern xxxxxx10 (where x is a randomly generated bit) and the
- remaining 5 bytes are randomly generated See
- [http://en.wikipedia.org/wiki/MAC_address] for more details the
- structure of a MAC address.
- * Allocate an address from within the space defined by your
- organization's OUI (if you have one) following your organization's
- procedures for doing so.
- * Allocate an address from within the space defined by Xen's OUI
- (00:16:3e). Taking care not to clash with other users of the
- physical network segment where this VIF will reside.
+=over
+
+=item *
+
+Generate a random sequence of 6 bytes, set the locally administered
+bit (bit 2 of the first byte) and clear the multicast bit (bit 1
+of the first byte). In other words the first byte should have the
+bit pattern xxxxxx10 (where x is a randomly generated bit) and the
+remaining 5 bytes are randomly generated. See
+L<http://en.wikipedia.org/wiki/MAC_address> for more details on the
+structure of a MAC address.
+
+
+=item *
+
+Allocate an address from within the space defined by your
+organization's OUI (if you have one) following your organization's
+procedures for doing so.
+
+
+=item *
+
+Allocate an address from within the space defined by Xen's OUI
+(00:16:3e), taking care not to clash with other users of the
+physical network segment where this VIF will reside.
+
+
+=back
If you have an OUI for your own use then that is the preferred
strategy. Otherwise in general you should prefer to generate a random
MAC and set the locally administered bit since this allows for more
bits of randomness than using the Xen OUI.
-### bridge
+
+=head2 bridge
Specifies the name of the network bridge which this VIF should be
-added to. The default is `xenbr0`. The bridge must be configured using
-your distribution's network configuration tools. See the [wiki][net]
+added to. The default is C<xenbr0>. The bridge must be configured using
+your distribution's network configuration tools. See the L<wiki|http://wiki.xen.org/wiki/HostConfiguration/Networking>
for guidance and examples.
-### gatewaydev
+
+=head2 gatewaydev
Specifies the name of the network interface which has an IP and which
is in the network the VIF should communicate with. This is used in the host
-by the vif-route hotplug script. See [wiki][vifroute] for guidance and
+by the vif-route hotplug script. See L<wiki|http://wiki.xen.org/wiki/Vif-route> for guidance and
examples.
NOTE: netdev is a deprecated alias of this option.
-### type
+
+=head2 type
This keyword is valid for HVM guests only.
Specifies the type of device to valid values are:
- * `ioemu` (default) -- this device will be provided as an emulate
- device to the guest and also as a paravirtualised device which the
- guest may choose to use instead if it has suitable drivers
- available.
- * `vif` -- this device will be provided as a paravirtualised device
- only.
+=over
+
+=item *
+
+C<ioemu> (default) -- this device will be provided as an emulated
+device to the guest and also as a paravirtualised device which the
+guest may choose to use instead if it has suitable drivers
+available.
+
+
+=item *
-### model
+C<vif> -- this device will be provided as a paravirtualised device
+only.
-This keyword is valid for HVM guest devices with `type=ioemu` only.
+
+=back
+
+
+=head2 model
+
+This keyword is valid for HVM guest devices with C<type=ioemu> only.
Specifies the type device to emulated for this guest. Valid values
are:
- * `rtl8139` (default) -- Realtek RTL8139
- * `e1000` -- Intel E1000
- * in principle any device supported by your device model
+=over
+
+=item *
+
+C<rtl8139> (default) -- Realtek RTL8139
+
-### vifname
+=item *
+
+C<e1000> -- Intel E1000
+
+
+=item *
+
+in principle any device supported by your device model
+
+
+=back
+
+
+=head2 vifname
Specifies the backend device name for the virtual device.
If the domain is an HVM domain then the associated emulated (tap)
device will have a "-emu" suffice added.
-The default name for the virtual device is `vifDOMID.DEVID` where
-`DOMID` is the guest domain ID and `DEVID` is the device
-number. Likewise the default tap name is `vifDOMID.DEVID-emu`.
+The default name for the virtual device is C<vifDOMID.DEVID> where
+C<DOMID> is the guest domain ID and C<DEVID> is the device
+number. Likewise the default tap name is C<vifDOMID.DEVID-emu>.
-### script
+
+=head2 script
Specifies the hotplug script to run to configure this device (e.g. to
add it to the relevant bridge). Defaults to
-`XEN_SCRIPT_DIR/vif-bridge` but can be set to any script. Some example
-scripts are installed in `XEN_SCRIPT_DIR`.
+C<XEN_SCRIPT_DIR/vif-bridge> but can be set to any script. Some example
+scripts are installed in C<XEN_SCRIPT_DIR>.
+
-### ip
+=head2 ip
Specifies the IP address for the device, the default is not to
specify an IP address.
@@ -128,25 +183,51 @@ configured. A typically behaviour (exhibited by the example hotplug
scripts) if set might be to configure firewall rules to allow only the
specified IP address to be used by the guest (blocking all others).
-### backend
+
+=head2 backend
Specifies the backend domain which this device should attach to. This
defaults to domain 0. Specifying another domain requires setting up a
driver domain which is outside the scope of this document.
-### rate
+
+=head2 rate
Specifies the rate at which the outgoing traffic will be limited to.
The default if this keyword is not specified is unlimited.
-The rate may be specified as "<RATE>/s" or optionally "<RATE>/s@<INTERVAL>".
+The rate may be specified as "E<lt>RATEE<gt>/s" or optionally "E<lt>RATEE<gt>/s@E<lt>INTERVALE<gt>".
+
+=over
+
+=item *
+
+C<RATE> is in bytes and can accept suffixes:
- * `RATE` is in bytes and can accept suffixes:
- * GB, MB, KB, B for bytes.
- * Gb, Mb, Kb, b for bits.
- * `INTERVAL` is in microseconds and can accept suffixes: ms, us, s.
- It determines the frequency at which the vif transmission credit
- is replenished. The default is 50ms.
+=over
+
+=item *
+
+GB, MB, KB, B for bytes.
+
+
+=item *
+
+Gb, Mb, Kb, b for bits.
+
+
+=back
+
+
+
+=item *
+
+C<INTERVAL> is in microseconds and can accept suffixes: ms, us, s.
+It determines the frequency at which the vif transmission credit
+is replenished. The default is 50ms.
+
+
+=back
Vif rate limiting is credit-based. It means that for "1MB/s@20ms", the
available credit will be equivalent of the traffic you would have done
@@ -162,12 +243,9 @@ For example:
NOTE: The actual underlying limits of rate limiting are dependent
on the underlying netback implementation.
-### devid
+
+=head2 devid
Specifies the devid manually instead of letting xl choose the lowest index available.
NOTE: This should not be set unless you have a reason to.
-
-[oui]: http://en.wikipedia.org/wiki/Organizationally_Unique_Identifier
-[net]: http://wiki.xen.org/wiki/HostConfiguration/Networking
-[vifroute]: http://wiki.xen.org/wiki/Vif-route
* [PATCH v3 3/3] docs: add pod variant of xl-numa-placement
2017-07-26 14:39 [PATCH v3 0/3] docs: convert manpages to pod Olaf Hering
2017-07-26 14:39 ` [PATCH v3 1/3] docs: add pod variant of xen-pv-channel.7 Olaf Hering
2017-07-26 14:39 ` [PATCH v3 2/3] docs: add pod variant of xl-network-configuration.5 Olaf Hering
@ 2017-07-26 14:39 ` Olaf Hering
2017-07-27 11:46 ` Dario Faggioli
2017-07-28 14:54 ` [PATCH v3 0/3] docs: convert manpages to pod Wei Liu
2017-09-19 14:36 ` Ian Jackson
4 siblings, 1 reply; 7+ messages in thread
From: Olaf Hering @ 2017-07-26 14:39 UTC
To: xen-devel, Ian Jackson, Wei Liu; +Cc: Olaf Hering
Convert source for xl-numa-placement.7 from markdown to pod.
This removes the build-time requirement for pandoc, and consequently the
need for ghc, from the BuildRequires chain of xen.rpm.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
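For reference when reviewing the "Placing the guest automatically"
section: the three heuristics listed there amount to a lexicographic
comparison, which the Python sketch below illustrates. It follows the
wording of the page only and is not a transcription of libxl's actual
comparison function, which may weigh the criteria differently. The
Candidate type, its field names and pick_candidate are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    nodes: tuple          # NUMA node ids the candidate spans
    runnable_vcpus: int   # vCPUs already able to run on those nodes
    free_memory_kb: int   # free memory available across those nodes


def pick_candidate(candidates):
    """Pick the best candidate: fewer nodes, then fewer runnable vCPUs,
    then more free memory."""
    return min(
        candidates,
        key=lambda c: (len(c.nodes), c.runnable_vcpus, -c.free_memory_kb),
    )
```

With two single-node candidates that have the same number of runnable
vCPUs, the one with more free memory is chosen; a two-node candidate loses
to either of them regardless of its free memory.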
...lacement.markdown.7 => xl-numa-placement.pod.7} | 166 ++++++++++++++-------
1 file changed, 110 insertions(+), 56 deletions(-)
rename docs/man/{xl-numa-placement.markdown.7 => xl-numa-placement.pod.7} (74%)
diff --git a/docs/man/xl-numa-placement.markdown.7 b/docs/man/xl-numa-placement.pod.7
similarity index 74%
rename from docs/man/xl-numa-placement.markdown.7
rename to docs/man/xl-numa-placement.pod.7
index f863492093..54a444172e 100644
--- a/docs/man/xl-numa-placement.markdown.7
+++ b/docs/man/xl-numa-placement.pod.7
@@ -1,6 +1,12 @@
-# Guest Automatic NUMA Placement in libxl and xl #
+=encoding utf8
-## Rationale ##
+=head1 NAME
+
+xl-numa-placement - Guest Automatic NUMA Placement in libxl and xl
+
+=head1 DESCRIPTION
+
+=head2 Rationale
NUMA (which stands for Non-Uniform Memory Access) means that the memory
accessing times of a program running on a CPU depends on the relative
@@ -17,13 +23,14 @@ running memory-intensive workloads on a shared host. In fact, the cost
of accessing non node-local memory locations is very high, and the
performance degradation is likely to be noticeable.
-For more information, have a look at the [Xen NUMA Introduction][numa_intro]
+For more information, have a look at the L<Xen NUMA Introduction|http://wiki.xen.org/wiki/Xen_NUMA_Introduction>
page on the Wiki.
-## Xen and NUMA machines: the concept of _node-affinity_ ##
+
+=head2 Xen and NUMA machines: the concept of I<node-affinity>
The Xen hypervisor deals with NUMA machines throughout the concept of
-_node-affinity_. The node-affinity of a domain is the set of NUMA nodes
+I<node-affinity>. The node-affinity of a domain is the set of NUMA nodes
of the host where the memory for the domain is being allocated (mostly,
at domain creation time). This is, at least in principle, different and
unrelated with the vCPU (hard and soft, see below) scheduling affinity,
@@ -42,15 +49,16 @@ it is very important to "place" the domain correctly when it is fist
created, as the most of its memory is allocated at that time and can
not (for now) be moved easily.
-### Placing via pinning and cpupools ###
+
+=head2 Placing via pinning and cpupools
The simplest way of placing a domain on a NUMA node is setting the hard
scheduling affinity of the domain's vCPUs to the pCPUs of the node. This
also goes under the name of vCPU pinning, and can be done through the
"cpus=" option in the config file (more about this below). Another option
is to pool together the pCPUs spanning the node and put the domain in
-such a _cpupool_ with the "pool=" config option (as documented in our
-[Wiki][cpupools_howto]).
+such a I<cpupool> with the "pool=" config option (as documented in our
+L<Wiki|http://wiki.xen.org/wiki/Cpupools_Howto>).
In both the above cases, the domain will not be able to execute outside
the specified set of pCPUs for any reasons, even if all those pCPUs are
@@ -59,7 +67,8 @@ busy doing something else while there are others, idle, pCPUs.
So, when doing this, local memory accesses are 100% guaranteed, but that
may come at he cost of some load imbalances.
-### NUMA aware scheduling ###
+
+=head2 NUMA aware scheduling
If using the credit1 scheduler, and starting from Xen 4.3, the scheduler
itself always tries to run the domain's vCPUs on one of the nodes in
@@ -87,21 +96,37 @@ workload.
Notice that, for each vCPU, the following three scenarios are possbile:
- * a vCPU *is pinned* to some pCPUs and *does not have* any soft affinity
- In this case, the vCPU is always scheduled on one of the pCPUs to which
- it is pinned, without any specific peference among them.
- * a vCPU *has* its own soft affinity and *is not* pinned to any particular
- pCPU. In this case, the vCPU can run on every pCPU. Nevertheless, the
- scheduler will try to have it running on one of the pCPUs in its soft
- affinity;
- * a vCPU *has* its own vCPU soft affinity and *is also* pinned to some
- pCPUs. In this case, the vCPU is always scheduled on one of the pCPUs
- onto which it is pinned, with, among them, a preference for the ones
- that also forms its soft affinity. In case pinning and soft affinity
- form two disjoint sets of pCPUs, pinning "wins", and the soft affinity
- is just ignored.
-
-## Guest placement in xl ##
+=over
+
+=item *
+
+a vCPU I<is pinned> to some pCPUs and I<does not have> any soft affinity.
+In this case, the vCPU is always scheduled on one of the pCPUs to which
+it is pinned, without any specific preference among them.
+
+
+=item *
+
+a vCPU I<has> its own soft affinity and I<is not> pinned to any particular
+pCPU. In this case, the vCPU can run on every pCPU. Nevertheless, the
+scheduler will try to have it running on one of the pCPUs in its soft
+affinity;
+
+
+=item *
+
+a vCPU I<has> its own vCPU soft affinity and I<is also> pinned to some
+pCPUs. In this case, the vCPU is always scheduled on one of the pCPUs
+onto which it is pinned, with, among them, a preference for the ones
+that also form its soft affinity. In case pinning and soft affinity
+form two disjoint sets of pCPUs, pinning "wins", and the soft affinity
+is just ignored.
+
+
+=back
+
+
+=head2 Guest placement in xl
If using xl for creating and managing guests, it is very easy to ask for
both manual or automatic placement of them across the host's NUMA nodes.
@@ -111,7 +136,8 @@ the details of the heuristics adopted for automatic placement (see below),
and the lack of support (in both xm/xend and the Xen versions where that
was the default toolstack) for NUMA aware scheduling.
-### Placing the guest manually ###
+
+=head2 Placing the guest manually
Thanks to the "cpus=" option, it is possible to specify where a domain
should be created and scheduled on, directly in its config file. This
@@ -126,19 +152,31 @@ or Xen won't be able to guarantee the locality for their memory accesses.
That, of course, also mean the vCPUs of the domain will only be able to
execute on those same pCPUs.
-It is is also possible to have a "cpus\_soft=" option in the xl config file,
+It is also possible to have a "cpus_soft=" option in the xl config file,
to specify the soft affinity for all the vCPUs of the domain. This affects
the NUMA placement in the following way:
- * if only "cpus\_soft=" is present, the VM's node-affinity will be equal
- to the nodes to which the pCPUs in the soft affinity mask belong;
- * if both "cpus\_soft=" and "cpus=" are present, the VM's node-affinity
- will be equal to the nodes to which the pCPUs present both in hard and
- soft affinity belong.
+=over
+
+=item *
+
+if only "cpus_soft=" is present, the VM's node-affinity will be equal
+to the nodes to which the pCPUs in the soft affinity mask belong;
-### Placing the guest automatically ###
-If neither "cpus=" nor "cpus\_soft=" are present in the config file, libxl
+=item *
+
+if both "cpus_soft=" and "cpus=" are present, the VM's node-affinity
+will be equal to the nodes to which the pCPUs present both in hard and
+soft affinity belong.
+
+
+=back
+
+
+=head2 Placing the guest automatically
+
+If neither "cpus=" nor "cpus_soft=" are present in the config file, libxl
tries to figure out on its own on which node(s) the domain could fit best.
If it finds one (some), the domain's node affinity get set to there,
and both memory allocations and NUMA aware scheduling (for the credit
@@ -160,14 +198,29 @@ to have, and as much pCPUs as it has vCPUs. After that, the actual
decision on which candidate to pick happens accordingly to the following
heuristics:
- * candidates involving fewer nodes are considered better. In case
- two (or more) candidates span the same number of nodes,
- * candidates with a smaller number of vCPUs runnable on them (due
- to previous placement and/or plain vCPU pinning) are considered
- better. In case the same number of vCPUs can run on two (or more)
- candidates,
- * the candidate with with the greatest amount of free memory is
- considered to be the best one.
+=over
+
+=item *
+
+candidates involving fewer nodes are considered better. In case
+two (or more) candidates span the same number of nodes,
+
+
+=item *
+
+candidates with a smaller number of vCPUs runnable on them (due
+to previous placement and/or plain vCPU pinning) are considered
+better. In case the same number of vCPUs can run on two (or more)
+candidates,
+
+
+=item *
+
+the candidate with the greatest amount of free memory is
+considered to be the best one.
+
+
+=back
Giving preference to candidates with fewer nodes ensures better
performance for the guest, as it avoid spreading its memory among
@@ -178,35 +231,37 @@ largest amounts of free memory helps keeping the memory fragmentation
small, and maximizes the probability of being able to put more domains
there.
-## Guest placement in libxl ##
+
+=head2 Guest placement in libxl
xl achieves automatic NUMA placement because that is what libxl does
by default. No API is provided (yet) for modifying the behaviour of
the placement algorithm. However, if your program is calling libxl,
-it is possible to set the `numa_placement` build info key to `false`
-(it is `true` by default) with something like the below, to prevent
+it is possible to set the C<numa_placement> build info key to C<false>
+(it is C<true> by default) with something like the below, to prevent
any placement from happening:
libxl_defbool_set(&domain_build_info->numa_placement, false);
-Also, if `numa_placement` is set to `true`, the domain's vCPUs must
-not be pinned (i.e., `domain_build_info->cpumap` must have all its
+Also, if C<numa_placement> is set to C<true>, the domain's vCPUs must
+not be pinned (i.e., C<<< domain_build_info->cpumap >>> must have all its
bits set, as it is by default), or domain creation will fail with
-`ERROR_INVAL`.
+C<ERROR_INVAL>.
Starting from Xen 4.3, in case automatic placement happens (and is
-successful), it will affect the domain's node-affinity and _not_ its
+successful), it will affect the domain's node-affinity and I<not> its
vCPU pinning. Namely, the domain's vCPUs will not be pinned to any
pCPU on the host, but the memory from the domain will come from the
selected node(s) and the NUMA aware scheduling (if the credit scheduler
is in use) will try to keep the domain's vCPUs there as much as possible.
Besides than that, looking and/or tweaking the placement algorithm
-search "Automatic NUMA placement" in libxl\_internal.h.
+search "Automatic NUMA placement" in libxl_internal.h.
Note this may change in future versions of Xen/libxl.
-## Xen < 4.5 ##
+
+=head2 Xen < 4.5
The concept of vCPU soft affinity has been introduced for the first time
in Xen 4.5. In 4.3, it is the domain's node-affinity that drives the
@@ -215,25 +270,24 @@ and so each vCPU can have its own mask of pCPUs, while node-affinity is
per-domain, that is the equivalent of having all the vCPUs with the same
soft affinity.
-## Xen < 4.3 ##
+
+=head2 Xen < 4.3
As NUMA aware scheduling is a new feature of Xen 4.3, things are a little
bit different for earlier version of Xen. If no "cpus=" option is specified
and Xen 4.2 is in use, the automatic placement algorithm still runs, but
-the results is used to _pin_ the vCPUs of the domain to the output node(s).
+the result is used to I<pin> the vCPUs of the domain to the output node(s).
This is consistent with what was happening with xm/xend.
On a version of Xen earlier than 4.2, there is not automatic placement at
all in xl or libxl, and hence no node-affinity, vCPU affinity or pinning
being introduced/modified.
-## Limitations ##
+
+=head2 Limitations
Analyzing various possible placement solutions is what makes the
algorithm flexible and quite effective. However, that also means
it won't scale well to systems with arbitrary number of nodes.
For this reason, automatic placement is disabled (with a warning)
if it is requested on a host with more than 16 NUMA nodes.
-
-[numa_intro]: http://wiki.xen.org/wiki/Xen_NUMA_Introduction
-[cpupools_howto]: http://wiki.xen.org/wiki/Cpupools_Howto