* [PATCH 01/14] kdbus: add documentation
From: Greg Kroah-Hartman @ 2015-03-09 13:09 UTC (permalink / raw)
To: arnd-r2nGTMty4D4, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
gnomes-qBU/x9rampVanCEyBjwyrvXRex20P6io, teg-B22kvLQNl6c,
jkosina-AlSwsSmVLrQ, luto-kltTT9wpgjJwATOyAt5JVQ,
linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
Cc: daniel-cYrQPVfZoowdnm+yROfE0A, dh.herrmann-Re5JQEeQqe8AvxtiuMwx3w,
tixxdz-Umm1ozX2/EEdnm+yROfE0A, Greg Kroah-Hartman
In-Reply-To: <1425906560-13798-1-git-send-email-gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org>
From: Daniel Mack <daniel-cYrQPVfZoowdnm+yROfE0A@public.gmane.org>
kdbus is a system for low-latency, low-overhead, easy to use
interprocess communication (IPC).
The interface to all functions in this driver is implemented via ioctls
on files exposed through a filesystem called 'kdbusfs'. The default
mount point of kdbusfs is /sys/fs/kdbus. This patch adds detailed
documentation about the kernel level API design.
This patch adds a set of comprehensive set of DocBook files which
can be turned into man-pages using 'make mandocs', or into HTML
files with 'make htmldocs'.
Signed-off-by: Daniel Mack <daniel-cYrQPVfZoowdnm+yROfE0A@public.gmane.org>
Signed-off-by: David Herrmann <dh.herrmann-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Signed-off-by: Djalal Harouni <tixxdz-Umm1ozX2/EEdnm+yROfE0A@public.gmane.org>
Signed-off-by: Greg Kroah-Hartman <gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org>
---
Documentation/Makefile | 2 +-
Documentation/kdbus/Makefile | 30 +
Documentation/kdbus/kdbus.bus.xml | 360 +++++++++
Documentation/kdbus/kdbus.connection.xml | 1252 +++++++++++++++++++++++++++++
Documentation/kdbus/kdbus.endpoint.xml | 436 ++++++++++
Documentation/kdbus/kdbus.fs.xml | 124 +++
Documentation/kdbus/kdbus.item.xml | 840 ++++++++++++++++++++
Documentation/kdbus/kdbus.match.xml | 553 +++++++++++++
Documentation/kdbus/kdbus.message.xml | 1277 ++++++++++++++++++++++++++++++
Documentation/kdbus/kdbus.name.xml | 711 +++++++++++++++++
Documentation/kdbus/kdbus.policy.xml | 406 ++++++++++
Documentation/kdbus/kdbus.pool.xml | 320 ++++++++
Documentation/kdbus/kdbus.xml | 1012 +++++++++++++++++++++++
Documentation/kdbus/stylesheet.xsl | 16 +
Makefile | 1 +
15 files changed, 7339 insertions(+), 1 deletion(-)
create mode 100644 Documentation/kdbus/Makefile
create mode 100644 Documentation/kdbus/kdbus.bus.xml
create mode 100644 Documentation/kdbus/kdbus.connection.xml
create mode 100644 Documentation/kdbus/kdbus.endpoint.xml
create mode 100644 Documentation/kdbus/kdbus.fs.xml
create mode 100644 Documentation/kdbus/kdbus.item.xml
create mode 100644 Documentation/kdbus/kdbus.match.xml
create mode 100644 Documentation/kdbus/kdbus.message.xml
create mode 100644 Documentation/kdbus/kdbus.name.xml
create mode 100644 Documentation/kdbus/kdbus.policy.xml
create mode 100644 Documentation/kdbus/kdbus.pool.xml
create mode 100644 Documentation/kdbus/kdbus.xml
create mode 100644 Documentation/kdbus/stylesheet.xsl
diff --git a/Documentation/Makefile b/Documentation/Makefile
index 6883a1b9b351..5e3fde632d03 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -1,4 +1,4 @@
subdir-y := accounting arm auxdisplay blackfin connector \
- filesystems filesystems ia64 laptops mic misc-devices \
+ filesystems filesystems ia64 kdbus laptops mic misc-devices \
networking pcmcia prctl ptp spi timers vDSO video4linux \
watchdog
diff --git a/Documentation/kdbus/Makefile b/Documentation/kdbus/Makefile
new file mode 100644
index 000000000000..cd6b48ee41bf
--- /dev/null
+++ b/Documentation/kdbus/Makefile
@@ -0,0 +1,30 @@
+DOCS := \
+ kdbus.xml \
+ kdbus.bus.xml \
+ kdbus.connection.xml \
+ kdbus.endpoint.xml \
+ kdbus.fs.xml \
+ kdbus.item.xml \
+ kdbus.match.xml \
+ kdbus.message.xml \
+ kdbus.name.xml \
+ kdbus.policy.xml \
+ kdbus.pool.xml
+
+XMLFILES := $(addprefix $(obj)/,$(DOCS))
+MANFILES := $(patsubst %.xml, %.7, $(XMLFILES))
+HTMLFILES := $(patsubst %.xml, %.html, $(XMLFILES))
+
+XMLTO_ARGS := -m $(obj)/stylesheet.xsl
+
+%.7: %.xml
+ xmlto man $(XMLTO_ARGS) -o . $<
+
+%.html: %.xml
+ xmlto html-nochunks $(XMLTO_ARGS) -o . $<
+
+mandocs: $(MANFILES)
+
+htmldocs: $(HTMLFILES)
+
+clean-files := $(MANFILES) $(HTMLFILES)
diff --git a/Documentation/kdbus/kdbus.bus.xml b/Documentation/kdbus/kdbus.bus.xml
new file mode 100644
index 000000000000..4d875e59ac02
--- /dev/null
+++ b/Documentation/kdbus/kdbus.bus.xml
@@ -0,0 +1,360 @@
+<?xml version='1.0'?> <!--*-nxml-*-->
+<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+ "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
+
+<refentry id="kdbus.bus">
+
+ <refentryinfo>
+ <title>kdbus.bus</title>
+ <productname>kdbus.bus</productname>
+ </refentryinfo>
+
+ <refmeta>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </refmeta>
+
+ <refnamediv>
+ <refname>kdbus.bus</refname>
+ <refpurpose>kdbus bus</refpurpose>
+ </refnamediv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ A bus is a resource that is shared between connections in order to
+ transmit messages (see
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ ).
+ Each bus is independent, and operations on the bus will not have any
+ effect on other buses. A bus is a management entity that controls the
+ addresses of its connections, their policies and message transactions
+ performed via this bus.
+ </para>
+ <para>
+ Each bus is bound to the mount instance it was created on. It has a
+ custom name that is unique across all buses of a domain. In
+ <citerefentry>
+ <refentrytitle>kdbus.fs</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ , a bus is presented as a directory. No operations can be performed on
+ the bus itself; instead you need to perform the operations on an endpoint
+ associated with the bus. Endpoints are accessible as files underneath the
+ bus directory. A default endpoint called <constant>bus</constant> is
+ provided on each bus.
+ </para>
+ <para>
+ Bus names may be chosen freely except for one restriction: the name must
+ be prefixed with the numeric effective UID of the creator and a dash. This
+ is required to avoid namespace clashes between different users. When
+ creating a bus, the name that is passed in must be properly formatted, or
+ the kernel will refuse creation of the bus. Example:
+ <literal>1047-foobar</literal> is an acceptable name for a bus
+ registered by a user with UID 1047. However,
+ <literal>1024-foobar</literal> is not, and neither is
+ <literal>foobar</literal>. The UID must be provided in the
+ user-namespace of the bus owner.
+ </para>
+ <para>
+ To create a new bus, you need to open the control file of a domain and
+ employ the <constant>KDBUS_CMD_BUS_MAKE</constant> ioctl. The control
+ file descriptor that was used to issue
+ <constant>KDBUS_CMD_BUS_MAKE</constant> must not previously have been
+ used for any other control-ioctl and must be kept open for the entire
+ life-time of the created bus. Closing it will immediately cleanup the
+ entire bus and all its associated resources and endpoints. Every control
+ file descriptor can only be used to create a single new bus; from that
+ point on, it is not used for any further communication until the final
+ <citerefentry>
+ <refentrytitle>close</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ .
+ </para>
+ <para>
+ Each bus will generate a random, 128-bit UUID upon creation. This UUID
+ will be returned to creators of connections through
+ <varname>kdbus_cmd_hello.id128</varname> and can be used to uniquely
+ identify buses, even across different machines or containers. The UUID
+ will have its variant bits set to <literal>DCE</literal>, and denote
+ version 4 (random). For more details on UUIDs, see <ulink
+ url="https://en.wikipedia.org/wiki/Universally_unique_identifier">
+ the Wikipedia article on UUIDs</ulink>.
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Creating buses</title>
+ <para>
+ To create a new bus, the <constant>KDBUS_CMD_BUS_MAKE</constant>
+ command is used. It takes a <type>struct kdbus_cmd</type> argument.
+ </para>
+ <programlisting>
+struct kdbus_cmd {
+ __u64 size;
+ __u64 flags;
+ __u64 return_flags;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>The flags for creation.</para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_MAKE_ACCESS_GROUP</constant></term>
+ <listitem>
+ <para>Make the bus file group-accessible.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_MAKE_ACCESS_WORLD</constant></term>
+ <listitem>
+ <para>Make the bus file world-accessible.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_FLAG_NEGOTIATE</constant></term>
+ <listitem>
+ <para>
+ Requests a set of valid flags for this ioctl. When this bit is
+ set, no action is taken; the ioctl will return
+ <errorcode>0</errorcode>, and the <varname>flags</varname>
+ field will have all bits set that are valid for this command.
+ The <constant>KDBUS_FLAG_NEGOTIATE</constant> bit will be
+ cleared by the operation.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem><para>
+ Flags returned by the kernel. Currently unused and always set to
+ <constant>0</constant> by the kernel.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ The following items (see
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ ) are expected for <constant>KDBUS_CMD_BUS_MAKE</constant>.
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_MAKE_NAME</constant></term>
+ <listitem>
+ <para>
+ Contains a null-terminated string that identifies the
+ bus. The name must be unique across the kdbus domain and
+ must start with the effective UID of the caller, followed by
+ a '<literal>-</literal>' (dash). This item is mandatory.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_BLOOM_PARAMETER</constant></term>
+ <listitem>
+ <para>
+ Bus-wide bloom parameters passed in a
+ <type>struct kdbus_bloom_parameter</type>. These settings are
+ copied back to new connections verbatim. This item is
+ mandatory. See
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for a more detailed description of this item.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_ATTACH_FLAGS_RECV</constant></term>
+ <listitem>
+ <para>
+ An optional item that contains a set of required attach flags
+ that connections must allow. This item is used as a
+ negotiation measure during connection creation. If connections
+ do not satisfy the bus requirements, they are not allowed on
+ the bus. If not set, the bus does not require any metadata to
+ be attached; in this case connections are free to set their
+ own attach flags.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_ATTACH_FLAGS_SEND</constant></term>
+ <listitem>
+ <para>
+ An optional item that contains a set of attach flags that are
+ returned to connections when they query the bus creator
+ metadata. If not set, no metadata is returned.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NEGOTIATE</constant></term>
+ <listitem><para>
+ With this item, programs can <emphasis>probe</emphasis> the
+ kernel for known item types. See
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ Unrecognized items are rejected, and the ioctl will fail with
+ <varname>errno</varname> set to <constant>EINVAL</constant>.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Return value</title>
+ <para>
+ On success, all mentioned ioctl commands return <errorcode>0</errorcode>;
+ on error, <errorcode>-1</errorcode> is returned, and
+ <varname>errno</varname> is set to indicate the error.
+ If the issued ioctl is illegal for the file descriptor used,
+ <varname>errno</varname> will be set to <constant>ENOTTY</constant>.
+ </para>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_BUS_MAKE</constant> may fail with the following
+ errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>EBADMSG</constant></term>
+ <listitem><para>
+ A mandatory item is missing.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ The flags supplied in the <constant>struct kdbus_cmd</constant>
+ are invalid or the supplied name does not start with the current
+ UID and a '<literal>-</literal>' (dash).
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EEXIST</constant></term>
+ <listitem><para>
+ A bus of that name already exists.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ESHUTDOWN</constant></term>
+ <listitem><para>
+ The kdbus mount instance for the bus was already shut down.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EMFILE</constant></term>
+ <listitem><para>
+ The maximum number of buses for the current user is exhausted.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+ <simplelist type="inline">
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.fs</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/Documentation/kdbus/kdbus.connection.xml b/Documentation/kdbus/kdbus.connection.xml
new file mode 100644
index 000000000000..09852125b2d4
--- /dev/null
+++ b/Documentation/kdbus/kdbus.connection.xml
@@ -0,0 +1,1252 @@
+<?xml version='1.0'?> <!--*-nxml-*-->
+<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+ "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
+
+<refentry id="kdbus.connection">
+
+ <refentryinfo>
+ <title>kdbus.connection</title>
+ <productname>kdbus.connection</productname>
+ </refentryinfo>
+
+ <refmeta>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </refmeta>
+
+ <refnamediv>
+ <refname>kdbus.connection</refname>
+ <refpurpose>kdbus connection</refpurpose>
+ </refnamediv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ Connections are identified by their <emphasis>connection ID</emphasis>,
+ internally implemented as a <type>uint64_t</type> counter.
+ The IDs of every newly created bus start at <constant>1</constant>, and
+ every new connection will increment the counter by <constant>1</constant>.
+ The IDs are not reused.
+ </para>
+ <para>
+ In higher level tools, the user visible representation of a connection is
+ defined by the D-Bus protocol specification as
+ <constant>":1.<ID>"</constant>.
+ </para>
+ <para>
+ Messages with a specific <type>uint64_t</type> destination ID are
+ directly delivered to the connection with the corresponding ID. Signal
+ messages (see
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>)
+ may be addressed to the special destination ID
+ <constant>KDBUS_DST_ID_BROADCAST</constant> (~0ULL) and will then
+ potentially be delivered to all currently active connections on the bus.
+ However, in order to receive any signal messages, clients must subscribe
+ to them by installing a match (see
+ <citerefentry>
+ <refentrytitle>kdbus.match</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ ).
+ </para>
+ <para>
+ Messages synthesized and sent directly by the kernel will carry the
+ special source ID <constant>KDBUS_SRC_ID_KERNEL</constant> (0).
+ </para>
+ <para>
+ In addition to the unique <type>uint64_t</type> connection ID,
+ established connections can request the ownership of
+ <emphasis>well-known names</emphasis>, under which they can be found and
+ addressed by other bus clients. A well-known name is associated with one
+ and only one connection at a time. See
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ on name acquisition, the name registry, and the validity of names.
+ </para>
+ <para>
+ Messages can specify the special destination ID
+ <constant>KDBUS_DST_ID_NAME</constant> (0) and carry a well-known name
+ in the message data. Such a message is delivered to the destination
+ connection which owns that well-known name.
+ </para>
+
+ <programlisting><![CDATA[
+ +-------------------------------------------------------------------------+
+ | +---------------+ +---------------------------+ |
+ | | Connection | | Message | -----------------+ |
+ | | :1.22 | --> | src: 22 | | |
+ | | | | dst: 25 | | |
+ | | | | | | |
+ | | | | | | |
+ | | | +---------------------------+ | |
+ | | | | |
+ | | | <--------------------------------------+ | |
+ | +---------------+ | | |
+ | | | |
+ | +---------------+ +---------------------------+ | | |
+ | | Connection | | Message | -----+ | |
+ | | :1.25 | --> | src: 25 | | |
+ | | | | dst: 0xffffffffffffffff | -------------+ | |
+ | | | | (KDBUS_DST_ID_BROADCAST) | | | |
+ | | | | | ---------+ | | |
+ | | | +---------------------------+ | | | |
+ | | | | | | |
+ | | | <--------------------------------------------------+ |
+ | +---------------+ | | |
+ | | | |
+ | +---------------+ +---------------------------+ | | |
+ | | Connection | | Message | --+ | | |
+ | | :1.55 | --> | src: 55 | | | | |
+ | | | | dst: 0 / org.foo.bar | | | | |
+ | | | | | | | | |
+ | | | | | | | | |
+ | | | +---------------------------+ | | | |
+ | | | | | | |
+ | | | <------------------------------------------+ | |
+ | +---------------+ | | |
+ | | | |
+ | +---------------+ | | |
+ | | Connection | | | |
+ | | :1.81 | | | |
+ | | org.foo.bar | | | |
+ | | | | | |
+ | | | | | |
+ | | | <-----------------------------------+ | |
+ | | | | |
+ | | | <----------------------------------------------+ |
+ | +---------------+ |
+ +-------------------------------------------------------------------------+
+ ]]></programlisting>
+ </refsect1>
+
+ <refsect1>
+ <title>Privileged connections</title>
+ <para>
+ A connection is considered <emphasis>privileged</emphasis> if the user
+ it was created by is the same that created the bus, or if the creating
+ task had <constant>CAP_IPC_OWNER</constant> set when it called
+ <constant>KDBUS_CMD_HELLO</constant> (see below).
+ </para>
+ <para>
+ Privileged connections have permission to employ certain restricted
+ functions and commands, which are explained below and in other kdbus
+ man-pages.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Activator and policy holder connection</title>
+ <para>
+ An <emphasis>activator</emphasis> connection is a placeholder for a
+ <emphasis>well-known name</emphasis>. Messages sent to such a connection
+ can be used to start an implementer connection, which will then get all
+ the messages from the activator copied over. An activator connection
+ cannot be used to send any message.
+ </para>
+ <para>
+ A <emphasis>policy holder</emphasis> connection only installs a policy
+ for one or more names. These policy entries are kept active as long as
+ the connection is alive, and are removed once it terminates. Such a
+ policy connection type can be used to deploy restrictions for names that
+ are not yet active on the bus. A policy holder connection cannot be used
+ to send any message.
+ </para>
+ <para>
+ The creation of activator or policy holder connections is restricted to
+ privileged users on the bus (see above).
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Monitor connections</title>
+ <para>
+ Monitors are eavesdropping connections that receive all the traffic on the
+ bus, but is invisible to other connections. Such connections have all
+ properties of any other, regular connection, except for the following
+ details:
+ </para>
+
+ <itemizedlist>
+ <listitem><para>
+ They will get every message sent over the bus, both unicasts and
+ broadcasts.
+ </para></listitem>
+
+ <listitem><para>
+ Installing matches for signal messages is neither necessary
+ nor allowed.
+ </para></listitem>
+
+ <listitem><para>
+ They cannot send messages or be directly addressed as receiver.
+ </para></listitem>
+
+ <listitem><para>
+ They cannot own well-known names. Therefore, they also can't operate as
+ activators.
+ </para></listitem>
+
+ <listitem><para>
+ Their creation and destruction will not cause
+ <constant>KDBUS_ITEM_ID_{ADD,REMOVE}</constant> (see
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>).
+ </para></listitem>
+
+ <listitem><para>
+ They are not listed with their unique name in name registry dumps
+ (see <constant>KDBUS_CMD_NAME_LIST</constant> in
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>), so other connections cannot detect the presence of
+ a monitor.
+ </para></listitem>
+ </itemizedlist>
+ <para>
+ The creation of monitor connections is restricted to privileged users on
+ the bus (see above).
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Creating connections</title>
+ <para>
+ A connection to a bus is created by opening an endpoint file (see
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>)
+ of a bus and becoming an active client with the
+ <constant>KDBUS_CMD_HELLO</constant> ioctl. Every connection has a unique
+ identifier on the bus and can address messages to every other connection
+ on the same bus by using the peer's connection ID as the destination.
+ </para>
+ <para>
+ The <constant>KDBUS_CMD_HELLO</constant> ioctl takes a <type>struct
+ kdbus_cmd_hello</type> as argument.
+ </para>
+
+ <programlisting>
+struct kdbus_cmd_hello {
+ __u64 size;
+ __u64 flags;
+ __u64 return_flags;
+ __u64 attach_flags_send;
+ __u64 attach_flags_recv;
+ __u64 bus_flags;
+ __u64 id;
+ __u64 pool_size;
+ __u64 offset;
+ __u8 id128[16];
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem>
+ <para>Flags to apply to this connection</para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_HELLO_ACCEPT_FD</constant></term>
+ <listitem>
+ <para>
+ When this flag is set, the connection can be sent file
+ descriptors as message payload of unicast messages. If it's
+ not set, an attempt to send file descriptors will result in
+ <constant>-ECOMM</constant> on the sender's side.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_HELLO_ACTIVATOR</constant></term>
+ <listitem>
+ <para>
+ Make this connection an activator (see above). With this bit
+ set, an item of type <constant>KDBUS_ITEM_NAME</constant> has
+ to be attached. This item describes the well-known name this
+ connection should be an activator for.
+ A connection can not be an activator and a policy holder at
+ the same time time, so this bit is not allowed together with
+ <constant>KDBUS_HELLO_POLICY_HOLDER</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_HELLO_POLICY_HOLDER</constant></term>
+ <listitem>
+ <para>
+ Make this connection a policy holder (see above). With this
+ bit set, an item of type <constant>KDBUS_ITEM_NAME</constant>
+ has to be attached. This item describes the well-known name
+ this connection should hold a policy for.
+ A connection can not be an activator and a policy holder at
+ the same time time, so this bit is not allowed together with
+ <constant>KDBUS_HELLO_ACTIVATOR</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_HELLO_MONITOR</constant></term>
+ <listitem>
+ <para>
+ Make this connection a monitor connection (see above).
+ </para>
+ <para>
+ This flag can only be set by privileged bus connections. See
+ below for more information.
+ A connection can not be monitor and an activator or a policy
+ holder at the same time time, so this bit is not allowed
+ together with <constant>KDBUS_HELLO_ACTIVATOR</constant> or
+ <constant>KDBUS_HELLO_POLICY_HOLDER</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_FLAG_NEGOTIATE</constant></term>
+ <listitem>
+ <para>
+ Requests a set of valid flags for this ioctl. When this bit is
+ set, no action is taken; the ioctl will return
+ <errorcode>0</errorcode>, and the <varname>flags</varname>
+ field will have all bits set that are valid for this command.
+ The <constant>KDBUS_FLAG_NEGOTIATE</constant> bit will be
+ cleared by the operation.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem><para>
+ Flags returned by the kernel. Currently unused and always set to
+ <constant>0</constant> by the kernel.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>attach_flags_send</varname></term>
+ <listitem><para>
+ Set the bits for metadata this connection permits to be sent to the
+ receiving peer. Only metadata items that are both allowed to be sent
+ by the sender and that are requested by the receiver will be attached
+ to the message. Note, however, that the bus may optionally require
+ some of those bits to be set. If the match fails, the ioctl will fail
+ with <varname>errno</varname> set to
+ <constant>ECONNREFUSED</constant>. In either case, when returning the
+ field will be set to the mask of metadata items that are enforced by
+ the bus with the <constant>KDBUS_FLAGS_KERNEL</constant> bit set as
+ well.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>attach_flags_recv</varname></term>
+ <listitem><para>
+ Request the attachment of metadata for each message received by this
+ connection. See
+ <citerefentry>
+ <refentrytitle>kdbus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for information about metadata, and
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ regarding items in general.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>bus_flags</varname></term>
+ <listitem><para>
+ Upon successful completion of the ioctl, this member will contain the
+ flags of the bus it connected to.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>id</varname></term>
+ <listitem><para>
+ Upon successful completion of the command, this member will contain
+ the numerical ID of the new connection.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>pool_size</varname></term>
+ <listitem><para>
+ The size of the communication pool, in bytes. The pool can be
+ accessed by calling
+ <citerefentry>
+ <refentrytitle>mmap</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ on the file descriptor that was used to issue the
+ <constant>KDBUS_CMD_HELLO</constant> ioctl.
+ The pool size of a connection must be greater than
+ <constant>0</constant> and a multiple of
+ <constant>PAGE_SIZE</constant>. See
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>offset</varname></term>
+ <listitem><para>
+ The kernel will return the offset in the pool where returned details
+ will be stored. See below.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>id128</varname></term>
+ <listitem><para>
+ Upon successful completion of the ioctl, this member will contain the
+ <emphasis>128-bit UUID</emphasis> of the connected bus.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ Variable list of items containing optional additional information.
+ The following items are currently expected/valid:
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_CONN_DESCRIPTION</constant></term>
+ <listitem>
+ <para>
+ Contains a string that describes this connection, so it can
+ be identified later.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NAME</constant></term>
+ <term><constant>KDBUS_ITEM_POLICY_ACCESS</constant></term>
+ <listitem>
+ <para>
+ For activators and policy holders only, combinations of
+ these two items describe policy access entries. See
+ <citerefentry>
+ <refentrytitle>kdbus.policy</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for further details.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_CREDS</constant></term>
+ <term><constant>KDBUS_ITEM_PIDS</constant></term>
+ <term><constant>KDBUS_ITEM_SECLABEL</constant></term>
+ <listitem>
+ <para>
+ Privileged bus users may submit these types in order to
+ create connections with faked credentials. This information
+ will be returned when peer information is queried by
+ <constant>KDBUS_CMD_CONN_INFO</constant>. See below for more
+ information on retrieving information on connections.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NEGOTIATE</constant></term>
+ <listitem><para>
+ With this item, programs can <emphasis>probe</emphasis> the
+ kernel for known item types. See
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ Unrecognized items are rejected, and the ioctl will fail with
+ <varname>errno</varname> set to <constant>EINVAL</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ At the offset returned in the <varname>offset</varname> field of
+ <type>struct kdbus_cmd_hello</type>, the kernel will store items
+ of the following types:
+ </para>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_BLOOM_PARAMETER</constant></term>
+ <listitem>
+ <para>
+ Bloom filter parameter as defined by the bus creator.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ The offset in the pool has to be freed with the
+ <constant>KDBUS_CMD_FREE</constant> ioctl. See
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for further information.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Retrieving information on a connection</title>
+ <para>
+ The <constant>KDBUS_CMD_CONN_INFO</constant> ioctl can be used to
+ retrieve credentials and properties of the initial creator of a
+ connection. This ioctl uses the following struct.
+ </para>
+
+ <programlisting>
+struct kdbus_cmd_info {
+ __u64 size;
+ __u64 flags;
+ __u64 return_flags;
+ __u64 id;
+ __u64 attach_flags;
+ __u64 offset;
+ __u64 info_size;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>
+ Currently, no flags are supported.
+ <constant>KDBUS_FLAG_NEGOTIATE</constant> is accepted to probe for
+ valid flags. If set, the ioctl will return <errorcode>0</errorcode>,
+ and the <varname>flags</varname> field is set to
+ <constant>0</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem><para>
+ Flags returned by the kernel. Currently unused and always set to
+ <constant>0</constant> by the kernel.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>id</varname></term>
+ <listitem><para>
+ The numerical ID of the connection for which information is to be
+ retrieved. If set to a non-zero value, the
+ <constant>KDBUS_ITEM_OWNED_NAME</constant> item is ignored.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>
+ Specifies which metadata items should be attached to the answer. See
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>offset</varname></term>
+ <listitem><para>
+ When the ioctl returns, this field will contain the offset of the
+ connection information inside the caller's pool. See
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for further information.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>info_size</varname></term>
+ <listitem><para>
+ The kernel will return the size of the returned information, so
+ applications can optionally
+ <citerefentry>
+ <refentrytitle>mmap</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ specific parts of the pool. See
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for further information.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ The following items are expected for
+ <constant>KDBUS_CMD_CONN_INFO</constant>.
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_OWNED_NAME</constant></term>
+ <listitem>
+ <para>
+ Contains the well-known name of the connection to look up as.
+ This item is mandatory if the <varname>id</varname> field is
+ set to 0.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NEGOTIATE</constant></term>
+ <listitem><para>
+ With this item, programs can <emphasis>probe</emphasis> the
+ kernel for known item types. See
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ <para>
+ Unrecognized items are rejected, and the ioctl will fail with
+ <varname>errno</varname> set to <constant>EINVAL</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ When the ioctl returns, the following struct will be stored in the
+ caller's pool at <varname>offset</varname>. The fields in this struct
+ are described below.
+ </para>
+
+ <programlisting>
+struct kdbus_info {
+ __u64 size;
+ __u64 id;
+ __u64 flags;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>id</varname></term>
+ <listitem><para>
+ The connection's unique ID.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>
+ The connection's flags as specified when it was created.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ Depending on the <varname>flags</varname> field in
+ <type>struct kdbus_cmd_info</type>, items of types
+ <constant>KDBUS_ITEM_OWNED_NAME</constant> and
+ <constant>KDBUS_ITEM_CONN_DESCRIPTION</constant> may follow here.
+ <constant>KDBUS_ITEM_NEGOTIATE</constant> is also allowed.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ Once the caller is finished with parsing the return buffer, it needs to
+ employ the <constant>KDBUS_CMD_FREE</constant> command for the offset, in
+ order to free the buffer part. See
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for further information.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Getting information about a connection's bus creator</title>
+ <para>
+ The <constant>KDBUS_CMD_BUS_CREATOR_INFO</constant> ioctl takes the same
+ struct as <constant>KDBUS_CMD_CONN_INFO</constant>, but is used to
+ retrieve information about the creator of the bus the connection is
+ attached to. The metadata returned by this call is collected during the
+ creation of the bus and is never altered afterwards, so it provides
+ pristine information on the task that created the bus, at the moment when
+ it did so.
+ </para>
+ <para>
+ In response to this call, a slice in the connection's pool is allocated
+ and filled with an object of type <type>struct kdbus_info</type>,
+ pointed to by the ioctl's <varname>offset</varname> field.
+ </para>
+
+ <programlisting>
+struct kdbus_info {
+ __u64 size;
+ __u64 id;
+ __u64 flags;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>id</varname></term>
+ <listitem><para>
+ The bus ID.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>
+ The bus flags as specified when it was created.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ Metadata information is stored in items here. The item list
+ contains a <constant>KDBUS_ITEM_MAKE_NAME</constant> item that
+ indicates the bus name of the calling connection.
+ <constant>KDBUS_ITEM_NEGOTIATE</constant> is allowed to probe
+ for known item types.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ Once the caller is finished with parsing the return buffer, it needs to
+ employ the <constant>KDBUS_CMD_FREE</constant> command for the offset, in
+ order to free the buffer part. See
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for further information.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Updating connection details</title>
+ <para>
+ Some of a connection's details can be updated with the
+ <constant>KDBUS_CMD_CONN_UPDATE</constant> ioctl, using the file
+ descriptor that was used to create the connection. The update command
+ uses the following struct.
+ </para>
+
+ <programlisting>
+struct kdbus_cmd {
+ __u64 size;
+ __u64 flags;
+ __u64 return_flags;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>
+ Currently, no flags are supported.
+ <constant>KDBUS_FLAG_NEGOTIATE</constant> is accepted to probe for
+ valid flags. If set, the ioctl will return <errorcode>0</errorcode>,
+ and the <varname>flags</varname> field is set to
+ <constant>0</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem><para>
+ Flags returned by the kernel. Currently unused and always set to
+ <constant>0</constant> by the kernel.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ Items to describe the connection details to be updated. The
+ following item types are supported.
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_ATTACH_FLAGS_SEND</constant></term>
+ <listitem>
+ <para>
+ Supply a new set of metadata items that this connection
+ permits to be sent along with messages.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_ATTACH_FLAGS_RECV</constant></term>
+ <listitem>
+ <para>
+ Supply a new set of metadata items that this connection
+ requests to be attached to each message.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NAME</constant></term>
+ <term><constant>KDBUS_ITEM_POLICY_ACCESS</constant></term>
+ <listitem>
+ <para>
+ Policy holder connections may supply a new set of policy
+ information with these items. For other connection types,
+ <constant>EOPNOTSUPP</constant> is returned in
+ <varname>errno</varname>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NEGOTIATE</constant></term>
+ <listitem><para>
+ With this item, programs can <emphasis>probe</emphasis> the
+ kernel for known item types. See
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ Unrecognized items are rejected, and the ioctl will fail with
+ <varname>errno</varname> set to <constant>EINVAL</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Termination of connections</title>
+ <para>
+ A connection can be terminated by simply calling
+ <citerefentry>
+ <refentrytitle>close</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ on its file descriptor. All pending incoming messages will be discarded,
+ and the memory allocated by the pool will be freed.
+ </para>
+
+ <para>
+ An alternative way of closing down a connection is via the
+ <constant>KDBUS_CMD_BYEBYE</constant> ioctl. This ioctl will succeed only
+ if the message queue of the connection is empty at the time of closing;
+ otherwise, the ioctl will fail with <varname>errno</varname> set to
+ <constant>EBUSY</constant>. When this ioctl returns
+ successfully, the connection has been terminated and won't accept any new
+ messages from remote peers. This way, a connection can be terminated
+ race-free, without losing any messages. The ioctl takes an argument of
+ type <type>struct kdbus_cmd</type>.
+ </para>
+
+ <programlisting>
+struct kdbus_cmd {
+ __u64 size;
+ __u64 flags;
+ __u64 return_flags;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>
+ Currently, no flags are supported.
+ <constant>KDBUS_FLAG_NEGOTIATE</constant> is accepted to probe for
+ valid flags. If set, the ioctl will fail with
+ <varname>errno</varname> set to <constant>EPROTO</constant>, and
+ the <varname>flags</varname> field is set to <constant>0</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem><para>
+ Flags returned by the kernel. Currently unused and always set to
+ <constant>0</constant> by the kernel.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ Items to describe the connection details to be updated. The
+ following item types are supported.
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NEGOTIATE</constant></term>
+ <listitem><para>
+ With this item, programs can <emphasis>probe</emphasis> the
+ kernel for known item types. See
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ Unrecognized items are rejected, and the ioctl will fail with
+ <varname>errno</varname> set to <constant>EINVAL</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return value</title>
+ <para>
+ On success, all mentioned ioctl commands return <errorcode>0</errorcode>;
+ on error, <errorcode>-1</errorcode> is returned, and
+ <varname>errno</varname> is set to indicate the error.
+ If the issued ioctl is illegal for the file descriptor used,
+ <varname>errno</varname> will be set to <constant>ENOTTY</constant>.
+ </para>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_HELLO</constant> may fail with the following
+ errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>EFAULT</constant></term>
+ <listitem><para>
+ The supplied pool size was 0 or not a multiple of the page size.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ The flags supplied in <type>struct kdbus_cmd_hello</type>
+ are invalid.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ An illegal combination of
+ <constant>KDBUS_HELLO_MONITOR</constant>,
+ <constant>KDBUS_HELLO_ACTIVATOR</constant> and
+ <constant>KDBUS_HELLO_POLICY_HOLDER</constant> was passed in
+ <varname>flags</varname>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ An invalid set of items was supplied.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ECONNREFUSED</constant></term>
+ <listitem><para>
+ The attach_flags_send field did not satisfy the requirements of
+ the bus.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EPERM</constant></term>
+ <listitem><para>
+ A <constant>KDBUS_ITEM_CREDS</constant> items was supplied, but the
+ current user is not privileged.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ESHUTDOWN</constant></term>
+ <listitem><para>
+ The bus you were trying to connect to has already been shut down.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EMFILE</constant></term>
+ <listitem><para>
+ The maximum number of connections on the bus has been reached.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EOPNOTSUPP</constant></term>
+ <listitem><para>
+ The endpoint does not support the connection flags supplied in
+ <type>struct kdbus_cmd_hello</type>.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_BYEBYE</constant> may fail with the following
+ errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>EALREADY</constant></term>
+ <listitem><para>
+ The connection has already been shut down.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EBUSY</constant></term>
+ <listitem><para>
+ There are still messages queued up in the connection's pool.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_CONN_INFO</constant> may fail with the following
+ errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ Invalid flags, or neither an ID nor a name was provided, or the
+ name is invalid.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ESRCH</constant></term>
+ <listitem><para>
+ Connection lookup by name failed.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ENXIO</constant></term>
+ <listitem><para>
+ No connection with the provided connection ID found.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_CONN_UPDATE</constant> may fail with the following
+ errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ Illegal flags or items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ Wildcards submitted in policy entries, or illegal sequence
+ of policy items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EOPNOTSUPP</constant></term>
+ <listitem><para>
+ Operation not supported by connection.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>E2BIG</constant></term>
+ <listitem><para>
+ Too many policy items attached.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+ <simplelist type="inline">
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.policy</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/Documentation/kdbus/kdbus.endpoint.xml b/Documentation/kdbus/kdbus.endpoint.xml
new file mode 100644
index 000000000000..76e325d4e931
--- /dev/null
+++ b/Documentation/kdbus/kdbus.endpoint.xml
@@ -0,0 +1,436 @@
+<?xml version='1.0'?> <!--*-nxml-*-->
+<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+ "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
+
+<refentry id="kdbus.endpoint">
+
+ <refentryinfo>
+ <title>kdbus.endpoint</title>
+ <productname>kdbus.endpoint</productname>
+ </refentryinfo>
+
+ <refmeta>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </refmeta>
+
+ <refnamediv>
+ <refname>kdbus.endpoint</refname>
+ <refpurpose>kdbus endpoint</refpurpose>
+ </refnamediv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ Endpoints are entry points to a bus (see
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>).
+ By default, each bus has a default
+ endpoint called 'bus'. The bus owner has the ability to create custom
+ endpoints with specific names, permissions, and policy databases
+ (see below). An endpoint is presented as file underneath the directory
+ of the parent bus.
+ </para>
+ <para>
+ To create a custom endpoint, open the default endpoint
+ (<literal>bus</literal>) and use the
+ <constant>KDBUS_CMD_ENDPOINT_MAKE</constant> ioctl with
+ <type>struct kdbus_cmd</type>. Custom endpoints always have a policy
+ database that, by default, forbids any operation. You have to explicitly
+ install policy entries to allow any operation on this endpoint.
+ </para>
+ <para>
+ Once <constant>KDBUS_CMD_ENDPOINT_MAKE</constant> succeeded, the new
+ endpoint will appear in the filesystem
+ (<citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>), and the used file descriptor will manage the
+ newly created endpoint resource. It cannot be used to manage further
+ resources and must be kept open as long as the endpoint is needed. The
+ endpoint will be terminated as soon as the file descriptor is closed.
+ </para>
+ <para>
+ Endpoint names may be chosen freely except for one restriction: the name
+ must be prefixed with the numeric effective UID of the creator and a dash.
+ This is required to avoid namespace clashes between different users. When
+ creating an endpoint, the name that is passed in must be properly
+ formatted or the kernel will refuse creation of the endpoint. Example:
+ <literal>1047-my-endpoint</literal> is an acceptable name for an
+ endpoint registered by a user with UID 1047. However,
+ <literal>1024-my-endpoint</literal> is not, and neither is
+ <literal>my-endpoint</literal>. The UID must be provided in the
+ user-namespace of the bus.
+ </para>
+ <para>
+ To create connections to a bus, use <constant>KDBUS_CMD_HELLO</constant>
+ on a file descriptor returned by <function>open()</function> on an
+ endpoint node. See
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for further details.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Creating custom endpoints</title>
+ <para>
+ To create a new endpoint, the
+ <constant>KDBUS_CMD_ENDPOINT_MAKE</constant> command is used. Along with
+ the endpoint's name, which will be used to expose the endpoint in the
+ <citerefentry>
+ <refentrytitle>kdbus.fs</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>,
+ the command also optionally takes items to set up the endpoint's
+ <citerefentry>
+ <refentrytitle>kdbus.policy</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ <constant>KDBUS_CMD_ENDPOINT_MAKE</constant> takes a
+ <type>struct kdbus_cmd</type> argument.
+ </para>
+ <programlisting>
+struct kdbus_cmd {
+ __u64 size;
+ __u64 flags;
+ __u64 return_flags;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>The flags for creation.</para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_MAKE_ACCESS_GROUP</constant></term>
+ <listitem>
+ <para>Make the endpoint file group-accessible.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_MAKE_ACCESS_WORLD</constant></term>
+ <listitem>
+ <para>Make the endpoint file world-accessible.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_FLAG_NEGOTIATE</constant></term>
+ <listitem>
+ <para>
+ Requests a set of valid flags for this ioctl. When this bit is
+ set, no action is taken; the ioctl will return
+ <errorcode>0</errorcode>, and the <varname>flags</varname>
+ field will have all bits set that are valid for this command.
+ The <constant>KDBUS_FLAG_NEGOTIATE</constant> bit will be
+ cleared by the operation.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem><para>
+ Flags returned by the kernel. Currently unused and always set to
+ <constant>0</constant> by the kernel.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ The following items are expected for
+ <constant>KDBUS_CMD_ENDPOINT_MAKE</constant>.
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_MAKE_NAME</constant></term>
+ <listitem>
+ <para>Contains a string to identify the endpoint name.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NAME</constant></term>
+ <term><constant>KDBUS_ITEM_POLICY_ACCESS</constant></term>
+ <listitem>
+ <para>
+ These items are used to set the policy attached to the
+ endpoint. For more details on bus and endpoint policies, see
+ <citerefentry>
+ <refentrytitle>kdbus.policy</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ <para>
+ Unrecognized items are rejected, and the ioctl will fail with
+ <varname>errno</varname> set to <varname>EINVAL</varname>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Updating endpoints</title>
+ <para>
+ To update an existing endpoint, the
+ <constant>KDBUS_CMD_ENDPOINT_UPDATE</constant> command is used on the file
+ descriptor that was used to create the update, using
+ <constant>KDBUS_CMD_ENDPOINT_MAKE</constant>. The only relevant detail of
+ the endpoint that can be updated is the policy. When the command is
+ employed, the policy of the endpoint is <emphasis>replaced</emphasis>
+ atomically with the new set of rules.
+ The command takes a <type>struct kdbus_cmd</type> argument.
+ </para>
+ <programlisting>
+struct kdbus_cmd {
+ __u64 size;
+ __u64 flags;
+ __u64 return_flags;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>
+ Unused for this command.
+ <constant>KDBUS_FLAG_NEGOTIATE</constant> is accepted to probe for
+ valid flags. If set, the ioctl will return <errorcode>0</errorcode>,
+ and the <varname>flags</varname> field is set to
+ <constant>0</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem><para>
+ Flags returned by the kernel. Currently unused and always set to
+ <constant>0</constant> by the kernel.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ The following items are expected for
+ <constant>KDBUS_CMD_ENDPOINT_UPDATE</constant>.
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NAME</constant></term>
+ <term><constant>KDBUS_ITEM_POLICY_ACCESS</constant></term>
+ <listitem>
+ <para>
+ These items are used to set the policy attached to the
+ endpoint. For more details on bus and endpoint policies, see
+ <citerefentry>
+ <refentrytitle>kdbus.policy</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ Existing policy is atomically replaced with the new rules
+ provided.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NEGOTIATE</constant></term>
+ <listitem><para>
+ With this item, programs can <emphasis>probe</emphasis> the
+ kernel for known item types. See
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ <para>
+ Unrecognized items are rejected, and the ioctl will fail with
+ <varname>errno</varname> set to <constant>EINVAL</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return value</title>
+ <para>
+ On success, all mentioned ioctl commands return <errorcode>0</errorcode>;
+ on error, <errorcode>-1</errorcode> is returned, and
+ <varname>errno</varname> is set to indicate the error.
+ If the issued ioctl is illegal for the file descriptor used,
+ <varname>errno</varname> will be set to <constant>ENOTTY</constant>.
+ </para>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_ENDPOINT_MAKE</constant> may fail with the
+ following errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ The flags supplied in the <type>struct kdbus_cmd</type>
+ are invalid.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ Illegal combination of <constant>KDBUS_ITEM_NAME</constant> and
+ <constant>KDBUS_ITEM_POLICY_ACCESS</constant> was provided.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EEXIST</constant></term>
+ <listitem><para>
+ An endpoint of that name already exists.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EPERM</constant></term>
+ <listitem><para>
+ The calling user is not privileged. See
+ <citerefentry>
+ <refentrytitle>kdbus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for information about privileged users.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_ENDPOINT_UPDATE</constant> may fail with the
+ following errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ The flags supplied in <type>struct kdbus_cmd</type>
+ are invalid.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ Illegal combination of <constant>KDBUS_ITEM_NAME</constant> and
+ <constant>KDBUS_ITEM_POLICY_ACCESS</constant> was provided.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EEXIST</constant></term>
+ <listitem><para>
+ An endpoint of that name already exists.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+ <simplelist type="inline">
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.fs</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/Documentation/kdbus/kdbus.fs.xml b/Documentation/kdbus/kdbus.fs.xml
new file mode 100644
index 000000000000..8c2a90e10b66
--- /dev/null
+++ b/Documentation/kdbus/kdbus.fs.xml
@@ -0,0 +1,124 @@
+<?xml version='1.0'?> <!--*-nxml-*-->
+<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+ "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
+
+<refentry id="kdbus_fs">
+
+ <refentryinfo>
+ <title>kdbus.fs</title>
+ <productname>kdbus.fs</productname>
+ </refentryinfo>
+
+ <refmeta>
+ <refentrytitle>kdbus.fs</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </refmeta>
+
+ <refnamediv>
+ <refname>kdbus.fs</refname>
+ <refpurpose>kdbus file system</refpurpose>
+ </refnamediv>
+
+ <refsect1>
+ <title>File-system Layout</title>
+
+ <para>
+ The <emphasis>kdbusfs</emphasis> pseudo filesystem provides access to
+ kdbus entities, such as <emphasis>buses</emphasis> and
+ <emphasis>endpoints</emphasis>. Each time the filesystem is mounted,
+ a new, isolated kdbus instance is created, which is independent from the
+ other instances.
+ </para>
+ <para>
+ The system-wide standard mount point for <emphasis>kdbusfs</emphasis> is
+ <constant>/sys/fs/kdbus</constant>.
+ </para>
+
+ <para>
+ Buses are represented as directories in the file system layout, whereas
+ endpoints are exposed as files inside these directories. At the top-level,
+ a <emphasis>control</emphasis> node is present, which can be opened to
+ create new buses via the <constant>KDBUS_CMD_BUS_MAKE</constant> ioctl.
+ Each <emphasis>bus</emphasis> shows a default endpoint called
+ <varname>bus</varname>, which can be opened to either create a connection
+ with the <constant>KDBUS_CMD_HELLO</constant> ioctl, or to create new
+ custom endpoints for the bus with
+ <constant>KDBUS_CMD_ENDPOINT_MAKE</constant>. See
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>,
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry> and
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para>
+
+ <para>Following, you can see an example layout of the
+ <emphasis>kdbusfs</emphasis> filesystem:</para>
+
+<programlisting>
+ /sys/fs/kdbus/ ; mount-point
+ |-- 0-system ; bus directory
+ | |-- bus ; default endpoint
+ | `-- 1017-custom ; custom endpoint
+ |-- 1000-user ; bus directory
+ | |-- bus ; default endpoint
+ | |-- 1000-service-A ; custom endpoint
+ | `-- 1000-service-B ; custom endpoint
+ `-- control ; control file
+</programlisting>
+ </refsect1>
+
+ <refsect1>
+ <title>Mounting instances</title>
+ <para>
+ In order to get a new and separate kdbus environment, a new instance
+ of <emphasis>kdbusfs</emphasis> can be mounted like this:
+ </para>
+<programlisting>
+ # mount -t kdbusfs kdbusfs /tmp/new_kdbus/
+</programlisting>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+ <simplelist type="inline">
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>mount</refentrytitle>
+ <manvolnum>8</manvolnum>
+ </citerefentry>
+ </member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/Documentation/kdbus/kdbus.item.xml b/Documentation/kdbus/kdbus.item.xml
new file mode 100644
index 000000000000..bfe47362097f
--- /dev/null
+++ b/Documentation/kdbus/kdbus.item.xml
@@ -0,0 +1,840 @@
+<?xml version='1.0'?> <!--*-nxml-*-->
+<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+ "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
+
+<refentry id="kdbus">
+
+ <refentryinfo>
+ <title>kdbus.item</title>
+ <productname>kdbus item</productname>
+ </refentryinfo>
+
+ <refmeta>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </refmeta>
+
+ <refnamediv>
+ <refname>kdbus.item</refname>
+ <refpurpose>kdbus item structure, layout and usage</refpurpose>
+ </refnamediv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ To flexibly augment transport structures, data blobs of type
+ <type>struct kdbus_item</type> can be attached to the structs passed
+ into the ioctls. Some ioctls make items of certain types mandatory,
+ others are optional. Items that are unsupported by ioctls they are
+ attached to will cause the ioctl to fail with <varname>errno</varname>
+ set to <constant>EINVAL</constant>.
+ Items are also used for information stored in a connection's
+ <emphasis>pool</emphasis>, such as received messages, name lists or
+ requested connection or bus owner information. Depending on the type of
+ an item, its total size is either fixed or variable.
+ </para>
+
+ <refsect2>
+ <title>Chaining items</title>
+ <para>
+ Whenever items are used as part of the kdbus kernel API, they are
+ embedded in structs that are embedded inside structs that themselves
+ include a size field containing the overall size of the structure.
+ This allows multiple items to be chained up, and an item iterator
+ (see below) is capable of detecting the end of an item chain.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Alignment</title>
+ <para>
+ The kernel expects all items to be aligned to 8-byte boundaries.
+ Unaligned items will cause the ioctl they are used with to fail
+ with <varname>errno</varname> set to <constant>EINVAL</constant>.
+ An item that has an unaligned size itself hence needs to be padded
+ if it is followed by another item.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Iterating items</title>
+ <para>
+ A simple iterator would iterate over the items until the items have
+ reached the embedding structure's overall size. An example
+ implementation is shown below.
+ </para>
+
+ <programlisting><![CDATA[
+#define KDBUS_ALIGN8(val) (((val) + 7) & ~7)
+
+#define KDBUS_ITEM_NEXT(item) \
+ (typeof(item))(((uint8_t *)item) + KDBUS_ALIGN8((item)->size))
+
+#define KDBUS_ITEM_FOREACH(item, head, first) \
+ for (item = (head)->first; \
+ ((uint8_t *)(item) < (uint8_t *)(head) + (head)->size) && \
+ ((uint8_t *)(item) >= (uint8_t *)(head)); \
+ item = KDBUS_ITEM_NEXT(item))
+ ]]></programlisting>
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>Item layout</title>
+ <para>
+ A <type>struct kdbus_item</type> consists of a
+ <varname>size</varname> field, describing its overall size, and a
+ <varname>type</varname> field, both 64 bit wide. They are followed by
+ a union to store information that is specific to the item's type.
+ The struct layout is shown below.
+ </para>
+
+ <programlisting>
+struct kdbus_item {
+ __u64 size;
+ __u64 type;
+ /* item payload - see below */
+ union {
+ __u8 data[0];
+ __u32 data32[0];
+ __u64 data64[0];
+ char str[0];
+
+ __u64 id;
+ struct kdbus_vec vec;
+ struct kdbus_creds creds;
+ struct kdbus_pids pids;
+ struct kdbus_audit audit;
+ struct kdbus_caps caps;
+ struct kdbus_timestamp timestamp;
+ struct kdbus_name name;
+ struct kdbus_bloom_parameter bloom_parameter;
+ struct kdbus_bloom_filter bloom_filter;
+ struct kdbus_memfd memfd;
+ int fds[0];
+ struct kdbus_notify_name_change name_change;
+ struct kdbus_notify_id_change id_change;
+ struct kdbus_policy_access policy_access;
+ };
+};
+ </programlisting>
+
+ <para>
+ <type>struct kdbus_item</type> should never be used to allocate
+ an item instance, as its size may grow in future releases of the API.
+ Instead, it should be manually assembled by storing the
+ <varname>size</varname>, <varname>type</varname> and payload to a
+ struct of its own.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Item types</title>
+
+ <refsect2>
+ <title>Negotiation item</title>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NEGOTIATE</constant></term>
+ <listitem><para>
+ With this item is attached to any ioctl, programs can
+ <emphasis>probe</emphasis> the kernel for known item items.
+ The item carries an array of <type>uint64_t</type> values in
+ <varname>item.data64</varname>, each set to an item type to
+ probe. The kernel will reset each member of this array that is
+ not recognized as valid item type to <constant>0</constant>.
+ This way, users can negotiate kernel features at start-up to
+ keep newer userspace compatible with older kernels. This item
+ is never attached by the kernel in response to any command.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+
+ <refsect2>
+ <title>Command specific items</title>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_PAYLOAD_VEC</constant></term>
+ <term><constant>KDBUS_ITEM_PAYLOAD_OFF</constant></term>
+ <listitem><para>
+ Messages are directly copied by the sending process into the
+ receiver's
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ This way, two peers can exchange data by effectively doing a
+ single-copy from one process to another; the kernel will not buffer
+ the data anywhere else. <constant>KDBUS_ITEM_PAYLOAD_VEC</constant>
+ is used when <emphasis>sending</emphasis> message. The item
+ references a memory address when the payload data can be found.
+ <constant>KDBUS_ITEM_PAYLOAD_OFF</constant> is used when messages
+ are <emphasis>received</emphasis>, and the
+ <constant>offset</constant> value describes the offset inside the
+ receiving connection's
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ where the message payload can be found. See
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on passing of payload data along with a
+ message.
+ <programlisting>
+struct kdbus_vec {
+ __u64 size;
+ union {
+ __u64 address;
+ __u64 offset;
+ };
+};
+ </programlisting>
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_PAYLOAD_MEMFD</constant></term>
+ <listitem><para>
+ Transports a file descriptor of a <emphasis>memfd</emphasis> in
+ <type>struct kdbus_memfd</type> in <varname>item.memfd</varname>.
+ The <varname>size</varname> field has to match the actual size of
+ the memfd that was specified when it was created. The
+ <varname>start</varname> parameter denotes the offset inside the
+ memfd at which the referenced payload starts. See
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on passing of payload data along with a
+ message.
+ <programlisting>
+struct kdbus_memfd {
+ __u64 start;
+ __u64 size;
+ int fd;
+ __u32 __pad;
+};
+ </programlisting>
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_FDS</constant></term>
+ <listitem><para>
+ Contains an array of <emphasis>file descriptors</emphasis>.
+ When used with <constant>KDBUS_CMD_SEND</constant>, the values of
+ this array must be filled with valid file descriptor numbers.
+ When received as item attached to a message, the array will
+ contain the numbers of the installed file descriptors, or
+ <constant>-1</constant> in case an error occurred.
+ file descriptor.
+ In either case, the number of entries in the array is derived from
+ the item's total size. See
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+
+ <refsect2>
+ <title>Items specific to some commands</title>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_CANCEL_FD</constant></term>
+ <listitem><para>
+ Transports a file descriptor that can be used to cancel a
+ synchronous <constant>KDBUS_CMD_SEND</constant> operation by
+ writing to it. The file descriptor is stored in
+ <varname>item.fd[0]</varname>. The item may only contain one
+ file descriptor. See
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on this item and how to use it.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_BLOOM_PARAMETER</constant></term>
+ <listitem><para>
+ Contains a set of <emphasis>bloom parameters</emphasis> as
+ <type>struct kdbus_bloom_parameter</type> in
+ <varname>item.bloom_parameter</varname>.
+ The item is passed from userspace to kernel during the
+ <constant>KDBUS_CMD_BUS_MAKE</constant> ioctl, and returned
+ verbatim when <constant>KDBUS_CMD_HELLO</constant> is called.
+ The kernel does not use the bloom parameters, but they need to
+ be known by each connection on the bus in order to define the
+ bloom filter hash details. See
+ <citerefentry>
+ <refentrytitle>kdbus.match</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on matching and bloom filters.
+ <programlisting>
+struct kdbus_bloom_parameter {
+ __u64 size;
+ __u64 n_hash;
+};
+ </programlisting>
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_BLOOM_FILTER</constant></term>
+ <listitem><para>
+ Carries a <emphasis>bloom filter</emphasis> as
+ <type>struct kdbus_bloom_filter</type> in
+ <varname>item.bloom_filter</varname>. It is mandatory to send this
+ item attached to a <type>struct kdbus_msg</type>, in case the
+ message is a signal. This item is never transported from kernel to
+ userspace. See
+ <citerefentry>
+ <refentrytitle>kdbus.match</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on matching and bloom filters.
+ <programlisting>
+struct kdbus_bloom_filter {
+ __u64 generation;
+ __u64 data[0];
+};
+ </programlisting>
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_BLOOM_MASK</constant></term>
+ <listitem><para>
+ Transports a <emphasis>bloom mask</emphasis> as binary data blob
+ stored in <varname>item.data</varname>. This item is used to
+ describe a match into a connection's match database. See
+ <citerefentry>
+ <refentrytitle>kdbus.match</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on matching and bloom filters.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_DST_NAME</constant></term>
+ <listitem><para>
+ Contains a <emphasis>well-known name</emphasis> to send a
+ message to, as null-terminated string in
+ <varname>item.str</varname>. This item is used with
+ <constant>KDBUS_CMD_SEND</constant>. See
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on how to send a message.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_MAKE_NAME</constant></term>
+ <listitem><para>
+ Contains a <emphasis>bus name</emphasis> or
+ <emphasis>endpoint name</emphasis>, stored as null-terminated
+ string in <varname>item.str</varname>. This item is sent from
+ userspace to kernel when buses or endpoints are created, and
+ returned back to userspace when the bus creator information is
+ queried. See
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ and
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_ATTACH_FLAGS_SEND</constant></term>
+ <term><constant>KDBUS_ITEM_ATTACH_FLAGS_RECV</constant></term>
+ <listitem><para>
+ Contains a set of <emphasis>attach flags</emphasis> at
+ <emphasis>send</emphasis> or <emphasis>receive</emphasis> time. See
+ <citerefentry>
+ <refentrytitle>kdbus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>,
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry> and
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on attach flags.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_ID</constant></term>
+ <listitem><para>
+ Transports a connection's <emphasis>numerical ID</emphasis> of
+ a connection as <type>uint64_t</type> value in
+ <varname>item.id</varname>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NAME</constant></term>
+ <listitem><para>
+ Transports a name associated with the
+ <emphasis>name registry</emphasis> as null-terminated string as
+ <type>struct kdbus_name</type> in
+ <varname>item.name</varname>. The <varname>flags</varname>
+ contains the flags of the name. See
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on how to access the name registry of a bus.
+ <programlisting>
+struct kdbus_name {
+ __u64 flags;
+ char name[0];
+};
+ </programlisting>
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+
+ <refsect2>
+ <title>Items attached by the kernel as metadata</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_TIMESTAMP</constant></term>
+ <listitem><para>
+ Contains both the <emphasis>monotonic</emphasis> and the
+ <emphasis>realtime</emphasis> timestamp, taken when the message
+ was processed on the kernel side.
+ Stored as <type>struct kdbus_timestamp</type> in
+ <varname>item.timestamp</varname>.
+ <programlisting>
+struct kdbus_timestamp {
+ __u64 seqnum;
+ __u64 monotonic_ns;
+ __u64 realtime_ns;
+};
+ </programlisting>
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_CREDS</constant></term>
+ <listitem><para>
+ Contains a set of <emphasis>user</emphasis> and
+ <emphasis>group</emphasis> information as 32-bit values, in the
+ usual four flavors: real, effective, saved and filesystem related.
+ Stored as <type>struct kdbus_creds</type> in
+ <varname>item.creds</varname>.
+ <programlisting>
+struct kdbus_creds {
+ __u32 uid;
+ __u32 euid;
+ __u32 suid;
+ __u32 fsuid;
+ __u32 gid;
+ __u32 egid;
+ __u32 sgid;
+ __u32 fsgid;
+};
+ </programlisting>
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_PIDS</constant></term>
+ <listitem><para>
+ Contains the <emphasis>PID</emphasis>, <emphasis>TID</emphasis>
+ and <emphasis>parent PID (PPID)</emphasis> of a remote peer.
+ Stored as <type>struct kdbus_pids</type> in
+ <varname>item.pids</varname>.
+ <programlisting>
+struct kdbus_pids {
+ __u64 pid;
+ __u64 tid;
+ __u64 ppid;
+};
+ </programlisting>
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_AUXGROUPS</constant></term>
+ <listitem><para>
+ Contains the <emphasis>auxiliary (supplementary) groups</emphasis>
+ a remote peer is a member of, stored as array of
+ <type>uint32_t</type> values in <varname>item.data32</varname>.
+ The array length can be determined by looking at the item's total
+ size, subtracting the size of the header and and dividing the
+ remainder by <constant>sizeof(uint32_t)</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_OWNED_NAME</constant></term>
+ <listitem><para>
+ Contains a <emphasis>well-known name</emphasis> currently owned
+ by a connection. The name is stored as null-terminated string in
+ <varname>item.str</varname>. Its length can also be derived from
+ the item's total size.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_TID_COMM</constant> [*]</term>
+ <listitem><para>
+ Contains the <emphasis>comm</emphasis> string of a task's
+ <emphasis>TID</emphasis> (thread ID), stored as null-terminated
+ string in <varname>item.str</varname>. Its length can also be
+ derived from the item's total size. Receivers of this item should
+ not use its contents for any kind of security measures. See below.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_PID_COMM</constant> [*]</term>
+ <listitem><para>
+ Contains the <emphasis>comm</emphasis> string of a task's
+ <emphasis>PID</emphasis> (process ID), stored as null-terminated
+ string in <varname>item.str</varname>. Its length can also be
+ derived from the item's total size. Receivers of this item should
+ not use its contents for any kind of security measures. See below.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_EXE</constant> [*]</term>
+ <listitem><para>
+ Contains the <emphasis>path to the executable</emphasis> of a task,
+ stored as null-terminated string in <varname>item.str</varname>. Its
+ length can also be derived from the item's total size. Receivers of
+ this item should not use its contents for any kind of security
+ measures. See below.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_CMDLINE</constant> [*]</term>
+ <listitem><para>
+ Contains the <emphasis>command line arguments</emphasis> of a
+ task, stored as an <emphasis>array</emphasis> of null-terminated
+ strings in <varname>item.str</varname>. The total length of all
+ strings in the array can be derived from the item's total size.
+ Receivers of this item should not use its contents for any kind
+ of security measures. See below.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_CGROUP</constant></term>
+ <listitem><para>
+ Contains the <emphasis>cgroup path</emphasis> of a task, stored
+ as null-terminated string in <varname>item.str</varname>. Its
+ length can also be derived from the item's total size.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_CAPS</constant></term>
+ <listitem><para>
+ Contains sets of <emphasis>capabilities</emphasis>, stored as
+ <type>struct kdbus_caps</type> in <varname>item.caps</varname>.
+ As the item size may increase in the future, programs should be
+ written in a way that it takes
+ <varname>item.caps.last_cap</varname> into account, and derive
+ the number of sets and rows from the item size and the reported
+ number of valid capability bits.
+ <programlisting>
+struct kdbus_caps {
+ __u32 last_cap;
+ __u32 caps[0];
+};
+ </programlisting>
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_SECLABEL</constant></term>
+ <listitem><para>
+ Contains the <emphasis>LSM label</emphasis> of a task, stored as
+ null-terminated string in <varname>item.str</varname>. Its length
+ can also be derived from the item's total size.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_AUDIT</constant></term>
+ <listitem><para>
+ Contains the audit <emphasis>sessionid</emphasis> and
+ <emphasis>loginuid</emphasis> of a task, stored as
+ <type>struct kdbus_audit</type> in
+ <varname>item.audit</varname>.
+ <programlisting>
+struct kdbus_audit {
+ __u32 sessionid;
+ __u32 loginuid;
+};
+ </programlisting>
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_CONN_DESCRIPTION</constant></term>
+ <listitem><para>
+ Contains the <emphasis>connection description</emphasis>, as set
+ by <constant>KDBUS_CMD_HELLO</constant> or
+ <constant>KDBUS_CMD_CONN_UPDATE</constant>, stored as
+ null-terminated string in <varname>item.str</varname>. Its length
+ can also be derived from the item's total size.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ All metadata is automatically translated into the
+ <emphasis>namespaces</emphasis> of the task that receives them. See
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information.
+ </para>
+
+ <para>
+ [*] Note that the content stored in metadata items of type
+ <constant>KDBUS_ITEM_TID_COMM</constant>,
+ <constant>KDBUS_ITEM_PID_COMM</constant>,
+ <constant>KDBUS_ITEM_EXE</constant> and
+ <constant>KDBUS_ITEM_CMDLINE</constant>
+ can easily be tampered by the sending tasks. Therefore, they should
+ <emphasis>not</emphasis> be used for any sort of security relevant
+ assumptions. The only reason they are transmitted is to let
+ receivers know about details that were set when metadata was
+ collected, even though the task they were collected from is not
+ active any longer when the items are received.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Items used for policy entries, matches and notifications</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_POLICY_ACCESS</constant></term>
+ <listitem><para>
+ This item describes a <emphasis>policy access</emphasis> entry to
+ access the policy database of a
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry> or
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ Please refer to
+ <citerefentry>
+ <refentrytitle>kdbus.policy</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on the policy database and how to access it.
+ <programlisting>
+struct kdbus_policy_access {
+ __u64 type;
+ __u64 access;
+ __u64 id;
+};
+ </programlisting>
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_ID_ADD</constant></term>
+ <term><constant>KDBUS_ITEM_ID_REMOVE</constant></term>
+ <listitem><para>
+ This item is sent as attachment to a
+ <emphasis>kernel notification</emphasis> and indicates that a
+ new connection was created on the bus, or that a connection was
+ disconnected, respectively. It stores a
+ <type>struct kdbus_notify_id_change</type> in
+ <varname>item.id_change</varname>.
+ The <varname>id</varname> field contains the numeric ID of the
+ connection that was added or removed, and <varname>flags</varname>
+ is set to the connection flags, as passed by
+ <constant>KDBUS_CMD_HELLO</constant>. See
+ <citerefentry>
+ <refentrytitle>kdbus.match</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ and
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on matches and notification messages.
+ <programlisting>
+struct kdbus_notify_id_change {
+ __u64 id;
+ __u64 flags;
+};
+ </programlisting>
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NAME_ADD</constant></term>
+ <term><constant>KDBUS_ITEM_NAME_REMOVE</constant></term>
+ <term><constant>KDBUS_ITEM_NAME_CHANGE</constant></term>
+ <listitem><para>
+ This item is sent as attachment to a
+ <emphasis>kernel notification</emphasis> and indicates that a
+ <emphasis>well-known name</emphasis> appeared, disappeared or
+ transferred to another owner on the bus. It stores a
+ <type>struct kdbus_notify_name_change</type> in
+ <varname>item.name_change</varname>.
+ <varname>old_id</varname> describes the former owner of the name
+ and is set to <constant>0</constant> values in case of
+ <constant>KDBUS_ITEM_NAME_ADD</constant>.
+ <varname>new_id</varname> describes the new owner of the name and
+ is set to <constant>0</constant> values in case of
+ <constant>KDBUS_ITEM_NAME_REMOVE</constant>.
+ The <varname>name</varname> field contains the well-known name the
+ notification is about, as null-terminated string. See
+ <citerefentry>
+ <refentrytitle>kdbus.match</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ and
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on matches and notification messages.
+ <programlisting>
+struct kdbus_notify_name_change {
+ struct kdbus_notify_id_change old_id;
+ struct kdbus_notify_id_change new_id;
+ char name[0];
+};
+ </programlisting>
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_REPLY_TIMEOUT</constant></term>
+ <listitem><para>
+ This item is sent as attachment to a
+ <emphasis>kernel notification</emphasis>. It informs the receiver
+ that an expected reply to a message was not received in time.
+ The remote peer ID and the message cookie is stored in the message
+ header. See
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information about messages, timeouts and notifications.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_REPLY_DEAD</constant></term>
+ <listitem><para>
+ This item is sent as attachment to a
+ <emphasis>kernel notification</emphasis>. It informs the receiver
+ that a remote connection a reply is expected from was disconnected
+ before that reply was sent. The remote peer ID and the message
+ cookie is stored in the message header. See
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information about messages, timeouts and notifications.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+ <simplelist type="inline">
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.fs</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>memfd_create</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ </member>
+ </simplelist>
+ </refsect1>
+
+</refentry>
diff --git a/Documentation/kdbus/kdbus.match.xml b/Documentation/kdbus/kdbus.match.xml
new file mode 100644
index 000000000000..ef77b64e5890
--- /dev/null
+++ b/Documentation/kdbus/kdbus.match.xml
@@ -0,0 +1,553 @@
+<?xml version='1.0'?> <!--*-nxml-*-->
+<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+ "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
+
+<refentry id="kdbus.match">
+
+ <refentryinfo>
+ <title>kdbus.match</title>
+ <productname>kdbus.match</productname>
+ </refentryinfo>
+
+ <refmeta>
+ <refentrytitle>kdbus.match</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </refmeta>
+
+ <refnamediv>
+ <refname>kdbus.match</refname>
+ <refpurpose>kdbus match</refpurpose>
+ </refnamediv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ kdbus connections can install matches in order to subscribe to signal
+ messages sent on the bus. Such signal messages can be either directed
+ to a single connection (by setting a specific connection ID in
+ <varname>struct kdbus_msg.dst_id</varname> or by sending it to a
+ well-known name), or to potentially <emphasis>all</emphasis> currently
+ active connections on the bus (by setting
+ <varname>struct kdbus_msg.dst_id</varname> to
+ <constant>KDBUS_DST_ID_BROADCAST</constant>).
+ A signal message always has the <constant>KDBUS_MSG_SIGNAL</constant>
+ bit set in the <varname>flags</varname> bitfield.
+ Also, signal messages can originate from either the kernel (called
+ <emphasis>notifications</emphasis>), or from other bus connections.
+ In either case, a bus connection needs to have a suitable
+ <emphasis>match</emphasis> installed in order to receive any signal
+ message. Without any rules installed in the connection, no signal message
+ will be received.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Matches for signal messages from other connections</title>
+ <para>
+ Matches for messages from other connections (not kernel notifications)
+ are implemented as bloom filters (see below). The sender adds certain
+ properties of the message as elements to a bloom filter bit field, and
+ sends that along with the signal message.
+
+ The receiving connection adds the message properties it is interested in
+ as elements to a bloom mask bit field, and uploads the mask as match rule,
+ possibly along with some other rules to further limit the match.
+
+ The kernel will match the signal message's bloom filter against the
+ connections bloom mask (simply by &-ing it), and will decide whether
+ the message should be delivered to a connection.
+ </para>
+ <para>
+ The kernel has no notion of any specific properties of the signal message,
+ all it sees are the bit fields of the bloom filter and the mask to match
+ against. The use of bloom filters allows simple and efficient matching,
+ without exposing any message properties or internals to the kernel side.
+ Clients need to deal with the fact that they might receive signal messages
+ which they did not subscribe to, as the bloom filter might allow
+ false-positives to pass the filter.
+
+ To allow the future extension of the set of elements in the bloom filter,
+ the filter specifies a <emphasis>generation</emphasis> number. A later
+ generation must always contain all elements of the set of the previous
+ generation, but can add new elements to the set. The match rules mask can
+ carry an array with all previous generations of masks individually stored.
+ When the filter and mask are matched by the kernel, the mask with the
+ closest matching generation is selected as the index into the mask array.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Bloom filters</title>
+ <para>
+ Bloom filters allow checking whether a given word is present in a
+ dictionary. This allows connections to set up a mask for information it
+ is interested in, and will be delivered signal messages that have a
+ matching filter.
+
+ For general information, see
+ <ulink url="https://en.wikipedia.org/wiki/Bloom_filter">the Wikipedia
+ article on bloom filters</ulink>.
+ </para>
+ <para>
+ The size of the bloom filter is defined per bus when it is created, in
+ <varname>kdbus_bloom_parameter.size</varname>. All bloom filters attached
+ to signal messages on the bus must match this size, and all bloom filter
+ matches uploaded by connections must also match the size, or a multiple
+ thereof (see below).
+
+ The calculation of the mask has to be done in userspace applications. The
+ kernel just checks the bitmasks to decide whether or not to let the
+ message pass. All bits in the mask must match the filter in and bit-wise
+ <emphasis>AND</emphasis> logic, but the mask may have more bits set than
+ the filter. Consequently, false positive matches are expected to happen,
+ and programs must deal with that fact by checking the contents of the
+ payload again at receive time.
+ </para>
+ <para>
+ Masks are entities that are always passed to the kernel as part of a
+ match (with an item of type <constant>KDBUS_ITEM_BLOOM_MASK</constant>),
+ and filters can be attached to signals, with an item of type
+ <constant>KDBUS_ITEM_BLOOM_FILTER</constant>. For a filter to match, all
+ its bits have to be set in the match mask as well.
+ </para>
+ <para>
+ For example, consider a bus that has a bloom size of 8 bytes, and the
+ following mask/filter combinations:
+ </para>
+ <programlisting><![CDATA[
+ filter 0x0101010101010101
+ mask 0x0101010101010101
+ -> matches
+
+ filter 0x0303030303030303
+ mask 0x0101010101010101
+ -> doesn't match
+
+ filter 0x0101010101010101
+ mask 0x0303030303030303
+ -> matches
+ ]]></programlisting>
+
+ <para>
+ Hence, in order to catch all messages, a mask filled with
+ <constant>0xff</constant> bytes can be installed as a wildcard match rule.
+ </para>
+
+ <refsect2>
+ <title>Generations</title>
+
+ <para>
+ Uploaded matches may contain multiple masks, which have are as large as
+ the bloom size defined by the bus. Each block of a mask is called a
+ <emphasis>generation</emphasis>, starting at index 0.
+
+ At match time, when a signal is about to be delivered, a bloom mask
+ generation is passed, which denotes which of the bloom masks the filter
+ should be matched against. This allows programs to provide backward
+ compatible masks at upload time, while older clients can still match
+ against older versions of filters.
+ </para>
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>Matches for kernel notifications</title>
+ <para>
+ To receive kernel generated notifications (see
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>),
+ a connection must install match rules that are different from
+ the bloom filter matches described in the section above. They can be
+ filtered by the connection ID that caused the notification to be sent, by
+ one of the names it currently owns, or by the type of the notification
+ (ID/name add/remove/change).
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Adding a match</title>
+ <para>
+ To add a match, the <constant>KDBUS_CMD_MATCH_ADD</constant> ioctl is
+ used, which takes a struct of the struct described below.
+
+ Note that each of the items attached to this command will internally
+ create one match <emphasis>rule</emphasis>, and the collection of them,
+ which is submitted as one block via the ioctl, is called a
+ <emphasis>match</emphasis>. To allow a message to pass, all rules of a
+ match have to be satisfied. Hence, adding more items to the command will
+ only narrow the possibility of a match to effectively let the message
+ pass, and will decrease the chance that the connection's process will be
+ woken up needlessly.
+
+ Multiple matches can be installed per connection. As long as one of it has
+ a set of rules which allows the message to pass, this one will be
+ decisive.
+ </para>
+
+ <programlisting>
+struct kdbus_cmd_match {
+ __u64 size;
+ __u64 flags;
+ __u64 return_flags;
+ __u64 cookie;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>Flags to control the behavior of the ioctl.</para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_MATCH_REPLACE</constant></term>
+ <listitem>
+ <para>Make the endpoint file group-accessible</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_FLAG_NEGOTIATE</constant></term>
+ <listitem>
+ <para>
+ Requests a set of valid flags for this ioctl. When this bit is
+ set, no action is taken; the ioctl will return
+ <errorcode>0</errorcode>, and the <varname>flags</varname>
+ field will have all bits set that are valid for this command.
+ The <constant>KDBUS_FLAG_NEGOTIATE</constant> bit will be
+ cleared by the operation.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem><para>
+ Flags returned by the kernel. Currently unused and always set to
+ <constant>0</constant> by the kernel.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>cookie</varname></term>
+ <listitem><para>
+ A cookie which identifies the match, so it can be referred to when
+ removing it.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ Items to define the actual rules of the matches. The following item
+ types are expected. Each item will create one new match rule.
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_BLOOM_MASK</constant></term>
+ <listitem>
+ <para>
+ An item that carries the bloom filter mask to match against
+ in its data field. The payload size must match the bloom
+ filter size that was specified when the bus was created.
+ See the section below for more information on bloom filters.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NAME</constant></term>
+ <listitem>
+ <para>
+ When used as part of kernel notifications, this item specifies
+ a name that is acquired, lost or that changed its owner (see
+ below). When used as part of a match for user-generated signal
+ messages, it specifies a name that the sending connection must
+ own at the time of sending the signal.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_ID</constant></term>
+ <listitem>
+ <para>
+ Specify a sender connection's ID that will match this rule.
+ For kernel notifications, this specifies the ID of a
+ connection that was added to or removed from the bus.
+ For used-generated signals, it specifies the ID of the
+ connection that sent the signal message.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NAME_ADD</constant></term>
+ <term><constant>KDBUS_ITEM_NAME_REMOVE</constant></term>
+ <term><constant>KDBUS_ITEM_NAME_CHANGE</constant></term>
+ <listitem>
+ <para>
+ These items request delivery of kernel notifications that
+ describe a name acquisition, loss, or change. The details
+ are stored in the item's
+ <varname>kdbus_notify_name_change</varname> member.
+ All information specified must be matched in order to make
+ the message pass. Use
+ <constant>KDBUS_MATCH_ID_ANY</constant> to
+ match against any unique connection ID.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_ID_ADD</constant></term>
+ <term><constant>KDBUS_ITEM_ID_REMOVE</constant></term>
+ <listitem>
+ <para>
+ These items request delivery of kernel notifications that are
+ generated when a connection is created or terminated.
+ <type>struct kdbus_notify_id_change</type> is used to
+ store the actual match information. This item can be used to
+ monitor one particular connection ID, or, when the ID field
+ is set to <constant>KDBUS_MATCH_ID_ANY</constant>,
+ all of them.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_NEGOTIATE</constant></term>
+ <listitem><para>
+ With this item, programs can <emphasis>probe</emphasis> the
+ kernel for known item types. See
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ Unrecognized items are rejected, and the ioctl will fail with
+ <varname>errno</varname> set to <constant>EINVAL</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ Refer to
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on message types.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Removing a match</title>
+ <para>
+ Matches can be removed with the
+ <constant>KDBUS_CMD_MATCH_REMOVE</constant> ioctl, which takes
+ <type>struct kdbus_cmd_match</type> as argument, but its fields
+ usage slightly differs compared to that of
+ <constant>KDBUS_CMD_MATCH_ADD</constant>.
+ </para>
+
+ <programlisting>
+struct kdbus_cmd_match {
+ __u64 size;
+ __u64 cookie;
+ __u64 flags;
+ __u64 return_flags;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>cookie</varname></term>
+ <listitem><para>
+ The cookie of the match, as it was passed when the match was added.
+ All matches that have this cookie will be removed.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>
+ No flags are supported for this use case.
+ <constant>KDBUS_FLAG_NEGOTIATE</constant> is accepted to probe for
+ valid flags. If set, the ioctl will fail with
+ <errorcode>-1</errorcode>, <varname>errno</varname> is set to
+ <constant>EPROTO</constant>, and the <varname>flags</varname> field
+ is set to <constant>0</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem><para>
+ Flags returned by the kernel. Currently unused and always set to
+ <constant>0</constant> by the kernel.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ No items are supported for this use case, but
+ <constant>KDBUS_ITEM_NEGOTIATE</constant> is allowed nevertheless.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return value</title>
+ <para>
+ On success, all mentioned ioctl commands return <errorcode>0</errorcode>;
+ on error, <errorcode>-1</errorcode> is returned, and
+ <varname>errno</varname> is set to indicate the error.
+ If the issued ioctl is illegal for the file descriptor used,
+ <varname>errno</varname> will be set to <constant>ENOTTY</constant>.
+ </para>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_MATCH_ADD</constant> may fail with the following
+ errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ Illegal flags or items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EDOM</constant></term>
+ <listitem><para>
+ Illegal bloom filter size.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EMFILE</constant></term>
+ <listitem><para>
+ Too many matches for this connection.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_MATCH_REMOVE</constant> may fail with the following
+ errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ Illegal flags.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EBADSLT</constant></term>
+ <listitem><para>
+ A match entry with the given cookie could not be found.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+ <simplelist type="inline">
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.match</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.fs</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/Documentation/kdbus/kdbus.message.xml b/Documentation/kdbus/kdbus.message.xml
new file mode 100644
index 000000000000..c25000dcfbc7
--- /dev/null
+++ b/Documentation/kdbus/kdbus.message.xml
@@ -0,0 +1,1277 @@
+<?xml version='1.0'?> <!--*-nxml-*-->
+<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+ "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
+
+<refentry id="kdbus.message">
+
+ <refentryinfo>
+ <title>kdbus.message</title>
+ <productname>kdbus.message</productname>
+ </refentryinfo>
+
+ <refmeta>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </refmeta>
+
+ <refnamediv>
+ <refname>kdbus.message</refname>
+ <refpurpose>kdbus message</refpurpose>
+ </refnamediv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ A kdbus message is used to exchange information between two connections
+ on a bus, or to transport notifications from the kernel to one or many
+ connections. This document describes the layout of messages, how payload
+ is added to them and how they are sent and received.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Message layout</title>
+
+ <para>The layout of a message is shown below.</para>
+
+ <programlisting>
+ +-------------------------------------------------------------------------+
+ | Message |
+ | +---------------------------------------------------------------------+ |
+ | | Header | |
+ | | size: overall message size, including the data records | |
+ | | destination: connection ID of the receiver | |
+ | | source: connection ID of the sender (set by kernel) | |
+ | | payload_type: "DBusDBus" textual identifier stored as uint64_t | |
+ | +---------------------------------------------------------------------+ |
+ | +---------------------------------------------------------------------+ |
+ | | Data Record | |
+ | | size: overall record size (without padding) | |
+ | | type: type of data | |
+ | | data: reference to data (address or file descriptor) | |
+ | +---------------------------------------------------------------------+ |
+ | +---------------------------------------------------------------------+ |
+ | | padding bytes to the next 8 byte alignment | |
+ | +---------------------------------------------------------------------+ |
+ | +---------------------------------------------------------------------+ |
+ | | Data Record | |
+ | | size: overall record size (without padding) | |
+ | | ... | |
+ | +---------------------------------------------------------------------+ |
+ | +---------------------------------------------------------------------+ |
+ | | padding bytes to the next 8 byte alignment | |
+ | +---------------------------------------------------------------------+ |
+ | +---------------------------------------------------------------------+ |
+ | | Data Record | |
+ | | size: overall record size | |
+ | | ... | |
+ | +---------------------------------------------------------------------+ |
+ | ... further data records ... |
+ +-------------------------------------------------------------------------+
+ </programlisting>
+ </refsect1>
+
+ <refsect1>
+ <title>Message payload</title>
+
+ <para>
+ When connecting to the bus, receivers request a memory pool of a given
+ size, large enough to carry all backlog of data enqueued for the
+ connection. The pool is internally backed by a shared memory file which
+ can be <function>mmap()</function>ed by the receiver. See
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information.
+ </para>
+
+ <para>
+ Message payload must be described in items attached to a message when
+ it is sent. A receiver can access the payload by looking at the items
+ that are attached to a message in its pool. The following items are used.
+ </para>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_PAYLOAD_VEC</constant></term>
+ <listitem>
+ <para>
+ This item references a piece of memory on the sender side which is
+ directly copied into the receiver's pool. This way, two peers can
+ exchange data by effectively doing a single-copy from one process
+ to another; the kernel will not buffer the data anywhere else.
+ This item is never found in a message received by a connection.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_PAYLOAD_OFF</constant></term>
+ <listitem>
+ <para>
+ This item is attached to messages on the receiving side and points
+ to a memory area inside the receiver's pool. The
+ <varname>offset</varname> variable in the item denotes the memory
+ location relative to the message itself.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_PAYLOAD_MEMFD</constant></term>
+ <listitem>
+ <para>
+ Messages can reference <emphasis>memfd</emphasis> files which
+ contain the data. memfd files are tmpfs-backed files that allow
+ sealing of the content of the file, which prevents all writable
+ access to the file content.
+ </para>
+ <para>
+ Only memfds that have
+ <constant>(F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE|F_SEAL_SEAL)
+ </constant>
+ set are accepted as payload data, which enforces reliable passing of
+ data. The receiver can assume that neither the sender nor anyone
+ else can alter the content after the message is sent. If those
+ seals are not set on the memfd, the ioctl will fail with
+ <errorcode>-1</errorcode>, and <varname>errno</varname> will be
+ set to <constant>ETXTBUSY</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_FDS</constant></term>
+ <listitem>
+ <para>
+ Messages can transport regular file descriptors via
+ <constant>KDBUS_ITEM_FDS</constant>. This item carries an array
+ of <type>int</type> values in <varname>item.fd</varname>. The
+ maximum number of file descriptors in the item is
+ <constant>253</constant>, and only one item of this type is
+ accepted per message. All passed values must be valid file
+ descriptors; the open count of each file descriptors is increased
+ by installing it to the receiver's task. This item can only be
+ used for directed messages, not for broadcasts, and only to
+ remote peers that have opted-in for receiving file descriptors
+ at connection time (<constant>KDBUS_HELLO_ACCEPT_FD</constant>).
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ The sender must not make any assumptions on the type in which data is
+ received by the remote peer. The kernel is free to re-pack multiple
+ <constant>KDBUS_ITEM_PAYLOAD_VEC</constant> and
+ <constant>KDBUS_ITEM_PAYLOAD_MEMFD</constant> payloads. For instance, the
+ kernel may decide to merge multiple <constant>VECs</constant> into a
+ single <constant>VEC</constant>, inline <constant>MEMFD</constant>
+ payloads into memory, or merge all passed <constant>VECs</constant> into a
+ single <constant>MEMFD</constant>. However, the kernel preserves the order
+ of passed data. This means that the order of all <constant>VEC</constant>
+ and <constant>MEMFD</constant> items is not changed in respect to each
+ other. In other words: All passed <constant>VEC</constant> and
+ <constant>MEMFD</constant> data payloads are treated as a single stream
+ of data that may be received by the remote peer in a different set of
+ chunks than it was sent as.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Sending messages</title>
+
+ <para>
+ Messages are passed to the kernel with the
+ <constant>KDBUS_CMD_SEND</constant> ioctl. Depending on the destination
+ address of the message, the kernel delivers the message to the specific
+ destination connection, or to some subset of all connections on the same
+ bus. Sending messages across buses is not possible. Messages are always
+ queued in the memory pool of the destination connection (see above).
+ </para>
+
+ <para>
+ The <constant>KDBUS_CMD_SEND</constant> ioctl uses a
+ <type>struct kdbus_cmd_send</type> to describe the message
+ transfer.
+ </para>
+ <programlisting>
+struct kdbus_cmd_send {
+ __u64 size;
+ __u64 flags;
+ __u64 return_flags;
+ __u64 msg_address;
+ struct kdbus_msg_info reply;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>Flags for message delivery</para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_SEND_SYNC_REPLY</constant></term>
+ <listitem>
+ <para>
+ By default, all calls to kdbus are considered asynchronous,
+ non-blocking. However, as there are many use cases that need
+ to wait for a remote peer to answer a method call, there's a
+ way to send a message and wait for a reply in a synchronous
+ fashion. This is what the
+ <constant>KDBUS_SEND_SYNC_REPLY</constant> controls. The
+ <constant>KDBUS_CMD_SEND</constant> ioctl will block until the
+ reply has arrived, the timeout limit is reached, in case the
+ remote connection was shut down, or if interrupted by a signal
+ before any reply; see
+ <citerefentry>
+ <refentrytitle>signal</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+
+ The offset of the reply message in the sender's pool is stored
+ in in <varname>offset_reply</varname> when the ioctl has
+ returned without error. Hence, there is no need for another
+ <constant>KDBUS_CMD_RECV</constant> ioctl or anything else to
+ receive the reply.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_FLAG_NEGOTIATE</constant></term>
+ <listitem>
+ <para>
+ Request a set of valid flags for this ioctl. When this bit is
+ set, no action is taken; the ioctl will fail with
+ <errorcode>-1</errorcode>, <varname>errno</varname>
+ is set to <constant>EPROTO</constant>.
+ Once the ioctl returned, the <varname>flags</varname>
+ field will have all bits set that the kernel recognizes as
+ valid for this command.
+ The <constant>KDBUS_FLAG_NEGOTIATE</constant> bit will be
+ cleared by the operation.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem><para>
+ Flags returned by the kernel. Currently unused and always set to
+ <constant>0</constant> by the kernel.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>msg_address</varname></term>
+ <listitem><para>
+ In this field, users have to provide a pointer to a message
+ (<type>struct kdbus_msg</type>) to send. See below for a
+ detailed description.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>reply</varname></term>
+ <listitem><para>
+ Only used for synchronous replies. See description of
+ <type>struct kdbus_cmd_recv</type> for more details.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ The following items are currently recognized.
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_CANCEL_FD</constant></term>
+ <listitem>
+ <para>
+ When this optional item is passed in, and the call is
+ executed as SYNC call, the passed in file descriptor can be
+ used as alternative cancellation point. The kernel will call
+ <citerefentry>
+ <refentrytitle>poll</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ on this file descriptor, and once it reports any incoming
+ bytes, the blocking send operation will be canceled; the
+ blocking, synchronous ioctl call will return
+ <errorcode>-1</errorcode>, and <varname>errno</varname> will
+ be set to <errorname>ECANCELED</errorname>.
+ Any type of file descriptor on which
+ <citerefentry>
+ <refentrytitle>poll</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ can be called on can be used as payload to this item; for
+ example, an eventfd can be used for this purpose, see
+ <citerefentry>
+ <refentrytitle>eventfd</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>.
+ For asynchronous message sending, this item is allowed but
+ ignored.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ <para>
+ Unrecognized items are rejected, and the ioctl will fail with
+ <varname>errno</varname> set to <constant>EINVAL</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ The fields in this struct are described below.
+ The message referenced the <varname>msg_address</varname> above has
+ the following layout.
+ </para>
+
+ <programlisting>
+struct kdbus_msg {
+ __u64 size;
+ __u64 flags;
+ __s64 priority;
+ __u64 dst_id;
+ __u64 src_id;
+ __u64 payload_type;
+ __u64 cookie;
+ __u64 timeout_ns;
+ __u64 cookie_reply;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>Flags to describe message details.</para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_MSG_EXPECT_REPLY</constant></term>
+ <listitem>
+ <para>
+ Expect a reply to this message from the remote peer. With
+ this bit set, the timeout_ns field must be set to a non-zero
+ number of nanoseconds in which the receiving peer is expected
+ to reply. If such a reply is not received in time, the sender
+ will be notified with a timeout message (see below). The
+ value must be an absolute value, in nanoseconds and based on
+ <constant>CLOCK_MONOTONIC</constant>.
+ </para><para>
+ For a message to be accepted as reply, it must be a direct
+ message to the original sender (not a broadcast and not a
+ signal message), and its
+ <varname>kdbus_msg.reply_cookie</varname> must match the
+ previous message's <varname>kdbus_msg.cookie</varname>.
+ </para><para>
+ Expected replies also temporarily open the policy of the
+ sending connection, so the other peer is allowed to respond
+ within the given time window.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_MSG_NO_AUTO_START</constant></term>
+ <listitem>
+ <para>
+ By default, when a message is sent to an activator
+ connection, the activator is notified and will start an
+ implementer. This flag inhibits that behavior. With this bit
+ set, and the remote being an activator, the ioctl will fail
+ with <varname>errno</varname> set to
+ <constant>EADDRNOTAVAIL</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_FLAG_NEGOTIATE</constant></term>
+ <listitem>
+ <para>
+ Requests a set of valid flags for this ioctl. When this bit is
+ set, no action is taken; the ioctl will return
+ <errorcode>0</errorcode>, and the <varname>flags</varname>
+ field will have all bits set that are valid for this command.
+ The <constant>KDBUS_FLAG_NEGOTIATE</constant> bit will be
+ cleared by the operation.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>priority</varname></term>
+ <listitem><para>
+ The priority of this message. Receiving messages (see below) may
+ optionally be constrained to messages of a minimal priority. This
+ allows for use cases where timing critical data is interleaved with
+ control data on the same connection. If unused, the priority field
+ should be set to <constant>0</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>dst_id</varname></term>
+ <listitem><para>
+ The numeric ID of the destination connection, or
+ <constant>KDBUS_DST_ID_BROADCAST</constant>
+ (~0ULL) to address every peer on the bus, or
+ <constant>KDBUS_DST_ID_NAME</constant> (0) to look
+ it up dynamically from the bus' name registry.
+ In the latter case, an item of type
+ <constant>KDBUS_ITEM_DST_NAME</constant> is mandatory.
+ Also see
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ .
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>src_id</varname></term>
+ <listitem><para>
+ Upon return of the ioctl, this member will contain the sending
+ connection's numerical ID. Should be 0 at send time.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>payload_type</varname></term>
+ <listitem><para>
+ Type of the payload in the actual data records. Currently, only
+ <constant>KDBUS_PAYLOAD_DBUS</constant> is accepted as input value
+ of this field. When receiving messages that are generated by the
+ kernel (notifications), this field will contain
+ <constant>KDBUS_PAYLOAD_KERNEL</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>cookie</varname></term>
+ <listitem><para>
+ Cookie of this message, for later recognition. Also, when replying
+ to a message (see above), the <varname>cookie_reply</varname>
+ field must match this value.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>timeout_ns</varname></term>
+ <listitem><para>
+ If the message sent requires a reply from the remote peer (see above),
+ this field contains the timeout in absolute nanoseconds based on
+ <constant>CLOCK_MONOTONIC</constant>. Also see
+ <citerefentry>
+ <refentrytitle>clock_gettime</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>cookie_reply</varname></term>
+ <listitem><para>
+ If the message sent is a reply to another message, this field must
+ match the cookie of the formerly received message.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ A dynamically sized list of items to contain additional information.
+ The following items are expected/valid:
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_PAYLOAD_VEC</constant></term>
+ <term><constant>KDBUS_ITEM_PAYLOAD_MEMFD</constant></term>
+ <term><constant>KDBUS_ITEM_FDS</constant></term>
+ <listitem>
+ <para>
+ Actual data records containing the payload. See section
+ "Passing of Payload Data".
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_BLOOM_FILTER</constant></term>
+ <listitem>
+ <para>
+ Bloom filter for matches (see below).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ITEM_DST_NAME</constant></term>
+ <listitem>
+ <para>
+ Well-known name to send this message to. Required if
+ <varname>dst_id</varname> is set to
+ <constant>KDBUS_DST_ID_NAME</constant>.
+ If a connection holding the given name can't be found,
+ the ioctl will fail with <varname>errno</varname> set to
+ <constant>ESRCH</constant> is returned.
+ </para>
+ <para>
+ For messages to a unique name (ID), this item is optional. If
+ present, the kernel will make sure the name owner matches the
+ given unique name. This allows programs to tie the message
+ sending to the condition that a name is currently owned by a
+ certain unique name.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ The message will be augmented by the requested metadata items when
+ queued into the receiver's pool. See
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ and
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on metadata.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Receiving messages</title>
+
+ <para>
+ Messages are received by the client with the
+ <constant>KDBUS_CMD_RECV</constant> ioctl. The endpoint file of the bus
+ supports <function>poll()/epoll()/select()</function>; when new messages
+ are available on the connection's file descriptor,
+ <constant>POLLIN</constant> is reported. For compatibility reasons,
+ <constant>POLLOUT</constant> is always reported as well. Note, however,
+ that the latter does not guarantee that a message can in fact be sent, as
+ this depends on how many pending messages the receiver has in its pool.
+ </para>
+
+ <para>
+ With the <constant>KDBUS_CMD_RECV</constant> ioctl, a
+ <type>struct kdbus_cmd_recv</type> is used.
+ </para>
+
+ <programlisting>
+struct kdbus_cmd_recv {
+ __u64 size;
+ __u64 flags;
+ __u64 return_flags;
+ __s64 priority;
+ __u64 dropped_msgs;
+ struct kdbus_msg_info msg;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>Flags to control the receive command.</para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_RECV_PEEK</constant></term>
+ <listitem>
+ <para>
+ Just return the location of the next message. Do not install
+ file descriptors or anything else. This is usually used to
+ determine the sender of the next queued message.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_RECV_DROP</constant></term>
+ <listitem>
+ <para>
+ Drop the next message without doing anything else with it,
+ and free the pool slice. This a short-cut for
+ <constant>KDBUS_RECV_PEEK</constant> and
+ <constant>KDBUS_CMD_FREE</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_RECV_USE_PRIORITY</constant></term>
+ <listitem>
+ <para>
+ Dequeue the messages ordered by their priority, and filtering
+ them with the priority field (see below).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_FLAG_NEGOTIATE</constant></term>
+ <listitem>
+ <para>
+ Request a set of valid flags for this ioctl. When this bit is
+ set, no action is taken; the ioctl will fail with
+ <errorcode>-1</errorcode>, <varname>errno</varname>
+ is set to <constant>EPROTO</constant>.
+ Once the ioctl returned, the <varname>flags</varname>
+ field will have all bits set that the kernel recognizes as
+ valid for this command.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem><para>
+ Flags returned by the kernel. If the <varname>dropped_msgs</varname>
+ field is non-zero, <constant>KDBUS_RECV_RETURN_DROPPED_MSGS</constant>
+ is set. If a file descriptor could not be installed, the
+ <constant>KDBUS_RECV_RETURN_INCOMPLETE_FDS</constant> flag is set.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>priority</varname></term>
+ <listitem><para>
+ With <constant>KDBUS_RECV_USE_PRIORITY</constant> set in
+ <varname>flags</varname>, messages will be dequeued ordered by their
+ priority, starting with the highest value. Also, messages will be
+ filtered by the value given in this field, so the returned message
+ will at least have the requested priority. If no such message is
+ waiting in the queue, the ioctl will fail, and
+ <varname>errno</varname> will be set to <constant>EAGAIN</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>dropped_msgs</varname></term>
+ <listitem><para>
+ Whenever a message with <constant>KDBUS_MSG_SIGNAL</constant> is sent
+ but cannot be queued on a peer (e.g., as it contains FDs but the peer
+ does not support FDs, or there is no space left in the peer's pool..)
+ the 'dropped_msgs' counter of the peer is incremented. On the next
+ RECV ioctl, the 'dropped_msgs' field is copied into the ioctl struct
+ and cleared on the peer. If it was non-zero, the
+ <constant>KDBUS_RECV_RETURN_DROPPED_MSGS</constant> flag will be set
+ in <varname>return_flags</varname>. Note that this will only happen
+ if the ioctl succeeded or failed with <constant>EAGAIN</constant>. In
+ other error cases, the 'dropped_msgs' field of the peer is left
+ untouched.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>msg</varname></term>
+ <listitem><para>
+ Embedded struct containing information on the received message when
+ this command succeeded (see below).
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem><para>
+ Items to specify further details for the receive command.
+ Currently unused, and all items will be rejected with
+ <varname>errno</varname> set to <constant>EINVAL</constant>.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ Both <type>struct kdbus_cmd_recv</type> and
+ <type>struct kdbus_cmd_send</type> embed
+ <type>struct kdbus_msg_info</type>.
+ For the <constant>KDBUS_CMD_SEND</constant> ioctl, it is used to catch
+ synchronous replies, if one was requested, and is unused otherwise.
+ </para>
+
+ <programlisting>
+struct kdbus_msg_info {
+ __u64 offset;
+ __u64 msg_size;
+ __u64 return_flags;
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>offset</varname></term>
+ <listitem><para>
+ Upon return of the ioctl, this field contains the offset in the
+ receiver's memory pool. The memory must be freed with
+ <constant>KDBUS_CMD_FREE</constant>. See
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for further details.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>msg_size</varname></term>
+ <listitem><para>
+ Upon successful return of the ioctl, this field contains the size of
+ the allocated slice at offset <varname>offset</varname>.
+ It is the combination of the size of the stored
+ <type>struct kdbus_msg</type> object plus all appended VECs.
+ You can use it in combination with <varname>offset</varname> to map
+ a single message, instead of mapping the entire pool. See
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for further details.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem>
+ <para>
+ Kernel-provided return flags. Currently, the following flags are
+ defined.
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_RECV_RETURN_INCOMPLETE_FDS</constant></term>
+ <listitem>
+ <para>
+ The message contained memfds or file descriptors, and the
+ kernel failed to install one or more of them at receive time.
+ Most probably that happened because the maximum number of
+ file descriptors for the receiver's task were exceeded.
+ In such cases, the message is still delivered, so this is not
+ a fatal condition. File descriptors numbers inside the
+ <constant>KDBUS_ITEM_FDS</constant> item or memfd files
+ referenced by <constant>KDBUS_ITEM_PAYLOAD_MEMFD</constant>
+ items which could not be installed will be set to
+ <constant>-1</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ Unless <constant>KDBUS_RECV_DROP</constant> was passed, the
+ <varname>offset</varname> field contains the location of the new message
+ inside the receiver's pool after the <constant>KDBUS_CMD_RECV</constant>
+ ioctl was employed. The message is stored as <type>struct kdbus_msg</type>
+ at this offset, and can be interpreted with the semantics described above.
+ </para>
+ <para>
+ Also, if the connection allowed for file descriptor to be passed
+ (<constant>KDBUS_HELLO_ACCEPT_FD</constant>), and if the message contained
+ any, they will be installed into the receiving process when the
+ <constant>KDBUS_CMD_RECV</constant> ioctl is called.
+ <emphasis>memfds</emphasis> may always be part of the message payload.
+ The receiving task is obliged to close all file descriptors appropriately
+ once no longer needed. If <constant>KDBUS_RECV_PEEK</constant> is set, no
+ file descriptors are installed. This allows for peeking at a message,
+ looking at its metadata only and dropping it via
+ <constant>KDBUS_RECV_DROP</constant>, without installing any of the file
+ descriptors into the receiving process.
+ </para>
+ <para>
+ The caller is obliged to call the <constant>KDBUS_CMD_FREE</constant>
+ ioctl with the returned offset when the memory is no longer needed.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Notifications</title>
+ <para>
+ A kernel notification is a regular kdbus message with the following
+ details.
+ </para>
+
+ <itemizedlist>
+ <listitem><para>
+ kdbus_msg.src_id == <constant>KDBUS_SRC_ID_KERNEL</constant>
+ </para></listitem>
+ <listitem><para>
+ kdbus_msg.dst_id == <constant>KDBUS_DST_ID_BROADCAST</constant>
+ </para></listitem>
+ <listitem><para>
+ kdbus_msg.payload_type == <constant>KDBUS_PAYLOAD_KERNEL</constant>
+ </para></listitem>
+ <listitem><para>
+ Has exactly one of the items attached that are described below.
+ </para></listitem>
+ <listitem><para>
+ Always has a timestamp item (<constant>KDBUS_ITEM_TIMESTAMP</constant>)
+ attached.
+ </para></listitem>
+ </itemizedlist>
+
+ <para>
+ The kernel will notify its users of the following events.
+ </para>
+
+ <itemizedlist>
+ <listitem><para>
+ When connection <emphasis>A</emphasis> is terminated while connection
+ <emphasis>B</emphasis> is waiting for a reply from it, connection
+ <emphasis>B</emphasis> is notified with a message with an item of
+ type <constant>KDBUS_ITEM_REPLY_DEAD</constant>.
+ </para></listitem>
+
+ <listitem><para>
+ When connection <emphasis>A</emphasis> does not receive a reply from
+ connection <emphasis>B</emphasis> within the specified timeout window,
+ connection <emphasis>A</emphasis> will receive a message with an
+ item of type <constant>KDBUS_ITEM_REPLY_TIMEOUT</constant>.
+ </para></listitem>
+
+ <listitem><para>
+ When an ordinary connection (not a monitor) is created on or removed
+ from a bus, messages with an item of type
+ <constant>KDBUS_ITEM_ID_ADD</constant> or
+ <constant>KDBUS_ITEM_ID_REMOVE</constant>, respectively, are delivered
+ to all bus members that match these messages through their match
+ database. Eavesdroppers (monitor connections) do not cause such
+ notifications to be sent. They are invisible on the bus.
+ </para></listitem>
+
+ <listitem><para>
+ When a connection gains or loses ownership of a name, messages with an
+ item of type <constant>KDBUS_ITEM_NAME_ADD</constant>,
+ <constant>KDBUS_ITEM_NAME_REMOVE</constant> or
+ <constant>KDBUS_ITEM_NAME_CHANGE</constant> are delivered to all bus
+ members that match these messages through their match database.
+ </para></listitem>
+ </itemizedlist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return value</title>
+ <para>
+ On success, all mentioned ioctl commands return <errorcode>0</errorcode>;
+ on error, <errorcode>-1</errorcode> is returned, and
+ <varname>errno</varname> is set to indicate the error.
+ If the issued ioctl is illegal for the file descriptor used,
+ <varname>errno</varname> will be set to <constant>ENOTTY</constant>.
+ </para>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_SEND</constant> may fail with the following
+ errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>EOPNOTSUPP</constant></term>
+ <listitem><para>
+ The connection is not an ordinary connection, or the passed
+ file descriptors in <constant>KDBUS_ITEM_FDS</constant> item are
+ either kdbus handles or unix domain sockets. Both are currently
+ unsupported.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ The submitted payload type is
+ <constant>KDBUS_PAYLOAD_KERNEL</constant>,
+ <constant>KDBUS_MSG_EXPECT_REPLY</constant> was set without timeout
+ or cookie values, <constant>KDBUS_SEND_SYNC_REPLY</constant> was
+ set without <constant>KDBUS_MSG_EXPECT_REPLY</constant>, an invalid
+ item was supplied, <constant>src_id</constant> was non-zero and was
+ different from the current connection's ID, a supplied memfd had a
+ size of 0, or a string was not properly null-terminated.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ENOTUNIQ</constant></term>
+ <listitem><para>
+ The supplied destination is
+ <constant>KDBUS_DST_ID_BROADCAST</constant> and either
+ file descriptors were passed, or
+ <constant>KDBUS_MSG_EXPECT_REPLY</constant> was set,
+ or a timeout was given.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>E2BIG</constant></term>
+ <listitem><para>
+ Too many items
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EMSGSIZE</constant></term>
+ <listitem><para>
+ The size of the message header and items or the payload vector
+ is excessive.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EEXIST</constant></term>
+ <listitem><para>
+ Multiple <constant>KDBUS_ITEM_FDS</constant>,
+ <constant>KDBUS_ITEM_BLOOM_FILTER</constant> or
+ <constant>KDBUS_ITEM_DST_NAME</constant> items were supplied.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EBADF</constant></term>
+ <listitem><para>
+ The supplied <constant>KDBUS_ITEM_FDS</constant> or
+ <constant>KDBUS_ITEM_PAYLOAD_MEMFD</constant> items
+ contained an illegal file descriptor.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EMEDIUMTYPE</constant></term>
+ <listitem><para>
+ The supplied memfd is not a sealed kdbus memfd.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EMFILE</constant></term>
+ <listitem><para>
+ Too many file descriptors inside a
+ <constant>KDBUS_ITEM_FDS</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EBADMSG</constant></term>
+ <listitem><para>
+ An item had illegal size, both a <constant>dst_id</constant> and a
+ <constant>KDBUS_ITEM_DST_NAME</constant> was given, or both a name
+ and a bloom filter was given.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ETXTBSY</constant></term>
+ <listitem><para>
+ The supplied kdbus memfd file cannot be sealed or the seal
+ was removed, because it is shared with other processes or
+ still mapped with
+ <citerefentry>
+ <refentrytitle>mmap</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ECOMM</constant></term>
+ <listitem><para>
+ A peer does not accept the file descriptors addressed to it.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EFAULT</constant></term>
+ <listitem><para>
+ The supplied bloom filter size was not 64-bit aligned, or supplied
+ memory could not be accessed by the kernel.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EDOM</constant></term>
+ <listitem><para>
+ The supplied bloom filter size did not match the bloom filter
+ size of the bus.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EDESTADDRREQ</constant></term>
+ <listitem><para>
+ <constant>dst_id</constant> was set to
+ <constant>KDBUS_DST_ID_NAME</constant>, but no
+ <constant>KDBUS_ITEM_DST_NAME</constant> was attached.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ESRCH</constant></term>
+ <listitem><para>
+ The name to look up was not found in the name registry.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EADDRNOTAVAIL</constant></term>
+ <listitem><para>
+ <constant>KDBUS_MSG_NO_AUTO_START</constant> was given but the
+ destination connection is an activator.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ENXIO</constant></term>
+ <listitem><para>
+ The passed numeric destination connection ID couldn't be found,
+ or is not connected.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ECONNRESET</constant></term>
+ <listitem><para>
+ The destination connection is no longer active.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ETIMEDOUT</constant></term>
+ <listitem><para>
+ Timeout while synchronously waiting for a reply.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EINTR</constant></term>
+ <listitem><para>
+ Interrupted system call while synchronously waiting for a reply.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EPIPE</constant></term>
+ <listitem><para>
+ When sending a message, a synchronous reply from the receiving
+ connection was expected but the connection died before answering.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ENOBUFS</constant></term>
+ <listitem><para>
+ Too many pending messages on the receiver side.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EREMCHG</constant></term>
+ <listitem><para>
+ Both a well-known name and a unique name (ID) was given, but
+ the name is not currently owned by that connection.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EXFULL</constant></term>
+ <listitem><para>
+ The memory pool of the receiver is full.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EREMOTEIO</constant></term>
+ <listitem><para>
+ While synchronously waiting for a reply, the remote peer
+ failed with an I/O error.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_RECV</constant> may fail with the following
+ errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>EOPNOTSUPP</constant></term>
+ <listitem><para>
+ The connection is not an ordinary connection, or the passed
+ file descriptors are either kdbus handles or unix domain
+ sockets. Both are currently unsupported.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ Invalid flags or offset.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EAGAIN</constant></term>
+ <listitem><para>
+ No message found in the queue
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+ <simplelist type="inline">
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.fs</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>clock_gettime</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>ioctl</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>poll</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>select</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>epoll</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>eventfd</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>memfd_create</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ </member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/Documentation/kdbus/kdbus.name.xml b/Documentation/kdbus/kdbus.name.xml
new file mode 100644
index 000000000000..3f5f6a6c5ed6
--- /dev/null
+++ b/Documentation/kdbus/kdbus.name.xml
@@ -0,0 +1,711 @@
+<?xml version='1.0'?> <!--*-nxml-*-->
+<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+ "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
+
+<refentry id="kdbus.name">
+
+ <refentryinfo>
+ <title>kdbus.name</title>
+ <productname>kdbus.name</productname>
+ </refentryinfo>
+
+ <refmeta>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </refmeta>
+
+ <refnamediv>
+ <refname>kdbus.name</refname>
+ <refpurpose>kdbus.name</refpurpose>
+ </refnamediv>
+
+ <refsect1>
+ <title>Description</title>
+ <para>
+ Each
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ instantiates a name registry to resolve well-known names into unique
+ connection IDs for message delivery. The registry will be queried when a
+ message is sent with <varname>kdbus_msg.dst_id</varname> set to
+ <constant>KDBUS_DST_ID_NAME</constant>, or when a registry dump is
+ requested with <constant>KDBUS_CMD_NAME_LIST</constant>.
+ </para>
+
+ <para>
+ All of the below is subject to policy rules for <emphasis>SEE</emphasis>
+ and <emphasis>OWN</emphasis> permissions. See
+ <citerefentry>
+ <refentrytitle>kdbus.policy</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Name validity</title>
+ <para>
+ A name has to comply with the following rules in order to be considered
+ valid.
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ The name has two or more elements separated by a
+ '<literal>.</literal>' (period) character.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ All elements must contain at least one character.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Each element must only contain the ASCII characters
+ <literal>[A-Z][a-z][0-9]_</literal> and must not begin with a
+ digit.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The name must contain at least one '<literal>.</literal>' (period)
+ character (and thus at least two elements).
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The name must not begin with a '<literal>.</literal>' (period)
+ character.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The name must not exceed <constant>255</constant> characters in
+ length.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </refsect1>
+
+ <refsect1>
+ <title>Acquiring a name</title>
+ <para>
+ To acquire a name, a client uses the
+ <constant>KDBUS_CMD_NAME_ACQUIRE</constant> ioctl with
+ <type>struct kdbus_cmd</type> as argument.
+ </para>
+
+ <programlisting>
+struct kdbus_cmd {
+ __u64 size;
+ __u64 flags;
+ __u64 return_flags;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>Flags to control details in the name acquisition.</para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_NAME_REPLACE_EXISTING</constant></term>
+ <listitem>
+ <para>
+ Acquiring a name that is already present usually fails,
+ unless this flag is set in the call, and
+ <constant>KDBUS_NAME_ALLOW_REPLACEMENT</constant> (see below)
+ was set when the current owner of the name acquired it, or
+ if the current owner is an activator connection (see
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_NAME_ALLOW_REPLACEMENT</constant></term>
+ <listitem>
+ <para>
+ Allow other connections to take over this name. When this
+ happens, the former owner of the connection will be notified
+ of the name loss.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_NAME_QUEUE</constant></term>
+ <listitem>
+ <para>
+ A name that is already acquired by a connection can not be
+ acquired again (unless the
+ <constant>KDBUS_NAME_ALLOW_REPLACEMENT</constant> flag was
+ set during acquisition; see above).
+ However, a connection can put itself in a queue of
+ connections waiting for the name to be released. Once that
+ happens, the first connection in that queue becomes the new
+ owner and is notified accordingly.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_FLAG_NEGOTIATE</constant></term>
+ <listitem>
+ <para>
+ Request a set of valid flags for this ioctl. When this bit is
+ set, no action is taken; the ioctl will fail with
+ <errorcode>-1</errorcode>, and <varname>errno</varname>
+ is set to <constant>EPROTO</constant>.
+ Once the ioctl returned, the <varname>flags</varname>
+ field will have all bits set that the kernel recognizes as
+ valid for this command.
+ The <constant>KDBUS_FLAG_NEGOTIATE</constant> bit will be
+ cleared by the operation.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem>
+ <para>
+ Flags returned by the kernel. Currently, the following may be
+ returned by the kernel.
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_NAME_IN_QUEUE</constant></term>
+ <listitem>
+ <para>
+ The name was not acquired yet, but the connection was
+ placed in the queue of peers waiting for the name.
+ This can only happen if <constant>KDBUS_NAME_QUEUE</constant>
+ was set in the <varname>flags</varname> member (see above).
+ The connection will receive a name owner change notification
+ once the current owner has given up the name and its
+ ownership was transferred.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ Items to submit the name. Currently, one item of type
+ <constant>KDBUS_ITEM_NAME</constant> is expected and allowed, and
+ the contained string must be a valid bus name.
+ <constant>KDBUS_ITEM_NEGOTIATE</constant> may be used to probe for
+ valid item types. See
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for a detailed description of how this item is used.
+ </para>
+ <para>
+ Unrecognized items are rejected, and the ioctl will fail with
+ <varname>errno</varname> set to <errorname>>EINVAL</errorname>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Releasing a name</title>
+ <para>
+ A connection may release a name explicitly with the
+ <constant>KDBUS_CMD_NAME_RELEASE</constant> ioctl. If the connection was
+ an implementer of an activatable name, its pending messages are moved
+ back to the activator. If there are any connections queued up as waiters
+ for the name, the first one in the queue (the oldest entry) will become
+ the new owner. The same happens implicitly for all names once a
+ connection terminates. See
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on connections.
+ </para>
+ <para>
+ The <constant>KDBUS_CMD_NAME_RELEASE</constant> ioctl uses the same data
+ structure as the acquisition call
+ (<constant>KDBUS_CMD_NAME_ACQUIRE</constant>),
+ but with slightly different field usage.
+ </para>
+
+ <programlisting>
+struct kdbus_cmd {
+ __u64 size;
+ __u64 flags;
+ __u64 return_flags;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>
+ Flags to the command. Currently unused.
+ <constant>KDBUS_FLAG_NEGOTIATE</constant> is accepted to probe for
+ valid flags. If set, the ioctl will return <errorcode>0</errorcode>,
+ and the <varname>flags</varname> field is set to
+ <constant>0</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem><para>
+ Flags returned by the kernel. Currently unused and always set to
+ <constant>0</constant> by the kernel.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ Items to submit the name. Currently, one item of type
+ <constant>KDBUS_ITEM_NAME</constant> is expected and allowed, and
+ the contained string must be a valid bus name.
+ <constant>KDBUS_ITEM_NEGOTIATE</constant> may be used to probe for
+ valid item types. See
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for a detailed description of how this item is used.
+ </para>
+ <para>
+ Unrecognized items are rejected, and the ioctl will fail with
+ <varname>errno</varname> set to <constant>EINVAL</constant>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Dumping the name registry</title>
+ <para>
+ A connection may request a complete or filtered dump of currently active
+ bus names with the <constant>KDBUS_CMD_LIST</constant> ioctl, which
+ takes a <type>struct kdbus_cmd_list</type> as argument.
+ </para>
+
+ <programlisting>
+struct kdbus_cmd_list {
+ __u64 flags;
+ __u64 return_flags;
+ __u64 offset;
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem>
+ <para>
+ Any combination of flags to specify which names should be dumped.
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_LIST_UNIQUE</constant></term>
+ <listitem>
+ <para>
+ List the unique (numeric) IDs of the connection, whether it
+ owns a name or not.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_LIST_NAMES</constant></term>
+ <listitem>
+ <para>
+ List well-known names stored in the database which are
+ actively owned by a real connection (not an activator).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_LIST_ACTIVATORS</constant></term>
+ <listitem>
+ <para>
+ List names that are owned by an activator.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_LIST_QUEUED</constant></term>
+ <listitem>
+ <para>
+ List connections that are not yet owning a name but are
+ waiting for it to become available.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_FLAG_NEGOTIATE</constant></term>
+ <listitem>
+ <para>
+ Request a set of valid flags for this ioctl. When this bit is
+ set, no action is taken; the ioctl will fail with
+ <errorcode>-1</errorcode>, and <varname>errno</varname>
+ is set to <constant>EPROTO</constant>.
+ Once the ioctl returned, the <varname>flags</varname>
+ field will have all bits set that the kernel recognizes as
+ valid for this command.
+ The <constant>KDBUS_FLAG_NEGOTIATE</constant> bit will be
+ cleared by the operation.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem><para>
+ Flags returned by the kernel. Currently unused and always set to
+ <constant>0</constant> by the kernel.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>offset</varname></term>
+ <listitem><para>
+ When the ioctl returns successfully, the offset to the name registry
+ dump inside the connection's pool will be stored in this field.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ The returned list of names is stored in a <type>struct kdbus_list</type>
+ that in turn contains an array of type <type>struct kdbus_info</type>,
+ The array-size in bytes is given as <varname>list_size</varname>.
+ The fields inside <type>struct kdbus_info</type> is described next.
+ </para>
+
+ <programlisting>
+struct kdbus_info {
+ __u64 size;
+ __u64 id;
+ __u64 flags;
+ struct kdbus_item items[0];
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>id</varname></term>
+ <listitem><para>
+ The owning connection's unique ID.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>
+ The flags of the owning connection.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem>
+ <para>
+ Items containing the actual name. Currently, one item of type
+ <constant>KDBUS_ITEM_OWNED_NAME</constant> will be attached,
+ including the name's flags. In that item, the flags field of the
+ name may carry the following bits:
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_NAME_ALLOW_REPLACEMENT</constant></term>
+ <listitem>
+ <para>
+ Other connections are allowed to take over this name from the
+ connection that owns it.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_NAME_IN_QUEUE</constant></term>
+ <listitem>
+ <para>
+ When retrieving a list of currently acquired names in the
+ registry, this flag indicates whether the connection
+ actually owns the name or is currently waiting for it to
+ become available.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_NAME_ACTIVATOR</constant></term>
+ <listitem>
+ <para>
+ An activator connection owns a name as a placeholder for an
+ implementer, which is started on demand by programs as soon
+ as the first message arrives. There's some more information
+ on this topic in
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ .
+ </para>
+ <para>
+ In contrast to
+ <constant>KDBUS_NAME_REPLACE_EXISTING</constant>,
+ when a name is taken over from an activator connection, all
+ the messages that have been queued in the activator
+ connection will be moved over to the new owner. The activator
+ connection will still be tracked for the name and will take
+ control again if the implementer connection terminates.
+ </para>
+ <para>
+ This flag can not be used when acquiring a name, but is
+ implicitly set through <constant>KDBUS_CMD_HELLO</constant>
+ with <constant>KDBUS_HELLO_ACTIVATOR</constant> set in
+ <varname>kdbus_cmd_hello.conn_flags</varname>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_FLAG_NEGOTIATE</constant></term>
+ <listitem>
+ <para>
+ Requests a set of valid flags for this ioctl. When this bit is
+ set, no action is taken; the ioctl will return
+ <errorcode>0</errorcode>, and the <varname>flags</varname>
+ field will have all bits set that are valid for this command.
+ The <constant>KDBUS_FLAG_NEGOTIATE</constant> bit will be
+ cleared by the operation.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ The returned buffer must be freed with the
+ <constant>KDBUS_CMD_FREE</constant> ioctl when the user is finished with
+ it. See
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Return value</title>
+ <para>
+ On success, all mentioned ioctl commands return <errorcode>0</errorcode>;
+ on error, <errorcode>-1</errorcode> is returned, and
+ <varname>errno</varname> is set to indicate the error.
+ If the issued ioctl is illegal for the file descriptor used,
+ <varname>errno</varname> will be set to <constant>ENOTTY</constant>.
+ </para>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_NAME_ACQUIRE</constant> may fail with the following
+ errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ Illegal command flags, illegal name provided, or an activator
+ tried to acquire a second name.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EPERM</constant></term>
+ <listitem><para>
+ Policy prohibited name ownership.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EALREADY</constant></term>
+ <listitem><para>
+ Connection already owns that name.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EEXIST</constant></term>
+ <listitem><para>
+ The name already exists and can not be taken over.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>E2BIG</constant></term>
+ <listitem><para>
+ The maximum number of well-known names per connection is exhausted.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_NAME_RELEASE</constant>
+ may fail with the following errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ Invalid command flags, or invalid name provided.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ESRCH</constant></term>
+ <listitem><para>
+ Name is not found in the registry.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EADDRINUSE</constant></term>
+ <listitem><para>
+ Name is owned by a different connection and can't be released.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_LIST</constant> may fail with the following
+ errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ Invalid command flags
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>ENOBUFS</constant></term>
+ <listitem><para>
+ No available memory in the connection's pool.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+ <simplelist type="inline">
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.policy</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/Documentation/kdbus/kdbus.policy.xml b/Documentation/kdbus/kdbus.policy.xml
new file mode 100644
index 000000000000..67324163880a
--- /dev/null
+++ b/Documentation/kdbus/kdbus.policy.xml
@@ -0,0 +1,406 @@
+<?xml version='1.0'?> <!--*-nxml-*-->
+<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+ "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
+
+<refentry id="kdbus.policy">
+
+ <refentryinfo>
+ <title>kdbus.policy</title>
+ <productname>kdbus.policy</productname>
+ </refentryinfo>
+
+ <refmeta>
+ <refentrytitle>kdbus.policy</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </refmeta>
+
+ <refnamediv>
+ <refname>kdbus.policy</refname>
+ <refpurpose>kdbus policy</refpurpose>
+ </refnamediv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ A kdbus policy restricts the possibilities of connections to own, see and
+ talk to well-known names. A policy can be associated with a bus (through a
+ policy holder connection) or a custom endpoint. kdbus stores its policy
+ information in a database that can be accessed through the following
+ ioctl commands:
+ </para>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_CMD_HELLO</constant></term>
+ <listitem><para>
+ When creating, or updating, a policy holder connection. See
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_CMD_ENDPOINT_MAKE</constant></term>
+ <term><constant>KDBUS_CMD_ENDPOINT_UPDATE</constant></term>
+ <listitem><para>
+ When creating, or updating, a bus custom endpoint. See
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ In all cases, the name and policy access information is stored in items
+ of type <constant>KDBUS_ITEM_NAME</constant> and
+ <constant>KDBUS_ITEM_POLICY_ACCESS</constant>. For this transport, the
+ following rules apply.
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ An item of type <constant>KDBUS_ITEM_NAME</constant> must be followed
+ by at least one <constant>KDBUS_ITEM_POLICY_ACCESS</constant> item.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ An item of type <constant>KDBUS_ITEM_NAME</constant> can be followed
+ by an arbitrary number of
+ <constant>KDBUS_ITEM_POLICY_ACCESS</constant> items.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ An arbitrary number of groups of names and access levels can be given.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ Names passed in items of type <constant>KDBUS_ITEM_NAME</constant> must
+ comply to the rules of valid kdbus.name. See
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information.
+
+ The payload of an item of type
+ <constant>KDBUS_ITEM_POLICY_ACCESS</constant> is defined by the following
+ struct. For more information on the layout of items, please refer to
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para>
+
+ <programlisting>
+struct kdbus_policy_access {
+ __u64 type;
+ __u64 access;
+ __u64 id;
+};
+ </programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>type</varname></term>
+ <listitem>
+ <para>
+ One of the following.
+ </para>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_POLICY_ACCESS_USER</constant></term>
+ <listitem><para>
+ Grant access to a user with the UID stored in the
+ <varname>id</varname> field.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_POLICY_ACCESS_GROUP</constant></term>
+ <listitem><para>
+ Grant access to a user with the GID stored in the
+ <varname>id</varname> field.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_POLICY_ACCESS_WORLD</constant></term>
+ <listitem><para>
+ Grant access to everyone. The <varname>id</varname> field
+ is ignored.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>access</varname></term>
+ <listitem>
+ <para>
+ The access to grant. One of the following.
+ </para>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_POLICY_SEE</constant></term>
+ <listitem><para>
+ Allow the name to be seen.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_POLICY_TALK</constant></term>
+ <listitem><para>
+ Allow the name to be talked to.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_POLICY_OWN</constant></term>
+ <listitem><para>
+ Allow the name to be owned.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>id</varname></term>
+ <listitem><para>
+ For <constant>KDBUS_POLICY_ACCESS_USER</constant>, stores the UID.
+ For <constant>KDBUS_POLICY_ACCESS_GROUP</constant>, stores the GID.
+ </para></listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <para>
+ All endpoints of buses have an empty policy database by default.
+ Therefore, unless policy rules are added, all operations will also be
+ denied by default. Also see
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Wildcard names</title>
+ <para>
+ Policy holder connections may upload names that contain the wildcard
+ suffix (<literal>".*"</literal>). Such a policy entry is effective for
+ every well-known name that extends the provided name by exactly one more
+ level.
+
+ For example, the name <literal>foo.bar.*</literal> matches both
+ <literal>"foo.bar.baz"</literal> and
+ <literal>"foo.bar.bazbaz"</literal> are, but not
+ <literal>"foo.bar.baz.baz"</literal>.
+
+ This allows connections to take control over multiple names that the
+ policy holder doesn't need to know about when uploading the policy.
+
+ Such wildcard entries are not allowed for custom endpoints.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Privileged connections</title>
+ <para>
+ The policy database is overruled when action is taken by a privileged
+ connection. Please refer to
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information on what makes a connection privileged.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Examples</title>
+ <para>
+ For instance, a set of policy rules may look like this:
+ </para>
+
+ <programlisting>
+KDBUS_ITEM_NAME: str='org.foo.bar'
+KDBUS_ITEM_POLICY_ACCESS: type=USER, access=OWN, ID=1000
+KDBUS_ITEM_POLICY_ACCESS: type=USER, access=TALK, ID=1001
+KDBUS_ITEM_POLICY_ACCESS: type=WORLD, access=SEE
+
+KDBUS_ITEM_NAME: str='org.blah.baz'
+KDBUS_ITEM_POLICY_ACCESS: type=USER, access=OWN, ID=0
+KDBUS_ITEM_POLICY_ACCESS: type=WORLD, access=TALK
+ </programlisting>
+
+ <para>
+ That means that 'org.foo.bar' may only be owned by UID 1000, but every
+ user on the bus is allowed to see the name. However, only UID 1001 may
+ actually send a message to the connection and receive a reply from it.
+
+ The second rule allows 'org.blah.baz' to be owned by UID 0 only, but
+ every user may talk to it.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>TALK access and multiple well-known names per connection</title>
+ <para>
+ Note that TALK access is checked against all names of a connection. For
+ example, if a connection owns both <constant>'org.foo.bar'</constant> and
+ <constant>'org.blah.baz'</constant>, and the policy database allows
+ <constant>'org.blah.baz'</constant> to be talked to by WORLD, then this
+ permission is also granted to <constant>'org.foo.bar'</constant>. That
+ might sound illogical, but after all, we allow messages to be directed to
+ either the ID or a well-known name, and policy is applied to the
+ connection, not the name. In other words, the effective TALK policy for a
+ connection is the most permissive of all names the connection owns.
+
+ For broadcast messages, the receiver needs TALK permissions to the sender
+ to receive the broadcast.
+ </para>
+ <para>
+ Both the endpoint and the bus policy databases are consulted to allow
+ name registry listing, owning a well-known name and message delivery.
+ If either one fails, the operation is failed with
+ <varname>errno</varname> set to <constant>EPERM</constant>.
+
+ For best practices, connections that own names with a restricted TALK
+ access should not install matches. This avoids cases where the sent
+ message may pass the bloom filter due to false-positives and may also
+ satisfy the policy rules.
+
+ Also see
+ <citerefentry>
+ <refentrytitle>kdbus.match</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Implicit policies</title>
+ <para>
+ Depending on the type of the endpoint, a set of implicit rules that
+ override installed policies might be enforced.
+
+ On default endpoints, the following set is enforced and checked before
+ any user-supplied policy is checked.
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ Privileged connections always override any installed policy. Those
+ connections could easily install their own policies, so there is no
+ reason to enforce installed policies.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Connections can always talk to connections of the same user. This
+ includes broadcast messages.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ Custom endpoints have stricter policies. The following rules apply:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ Policy rules are always enforced, even if the connection is a
+ privileged connection.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Policy rules are always enforced for <constant>TALK</constant> access,
+ even if both ends are running under the same user. This includes
+ broadcast messages.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ To restrict the set of names that can be seen, endpoint policies can
+ install <constant>SEE</constant> policies.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+ <simplelist type="inline">
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.fs</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/Documentation/kdbus/kdbus.pool.xml b/Documentation/kdbus/kdbus.pool.xml
new file mode 100644
index 000000000000..05fd01902ad4
--- /dev/null
+++ b/Documentation/kdbus/kdbus.pool.xml
@@ -0,0 +1,320 @@
+<?xml version='1.0'?> <!--*-nxml-*-->
+<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+ "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
+
+<refentry id="kdbus.pool">
+
+ <refentryinfo>
+ <title>kdbus.pool</title>
+ <productname>kdbus.pool</productname>
+ </refentryinfo>
+
+ <refmeta>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </refmeta>
+
+ <refnamediv>
+ <refname>kdbus.pool</refname>
+ <refpurpose>kdbus pool</refpurpose>
+ </refnamediv>
+
+ <refsect1>
+ <title>Description</title>
+ <para>
+ A pool for data received from the kernel is installed for every
+ <emphasis>connection</emphasis> of the <emphasis>bus</emphasis>, and
+ is sized according to the information stored in the
+ <varname>pool_size</varname> member of <type>struct kdbus_cmd_hello</type>
+ when <constant>KDBUS_CMD_HELLO</constant> is employed. Internally, the
+ pool is segmented into <emphasis>slices</emphasis>, each referenced by its
+ <emphasis>offset</emphasis> in the pool, expressed in <type>bytes</type>.
+ See
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more information about <constant>KDBUS_CMD_HELLO</constant>.
+ </para>
+
+ <para>
+ The pool is written to by the kernel when one of the following
+ <emphasis>ioctls</emphasis> is issued:
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_CMD_HELLO</constant></term>
+ <listitem><para>
+ ... to receive details about the bus the connection was made to
+ </para></listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><constant>KDBUS_CMD_RECV</constant></term>
+ <listitem><para>
+ ... to receive a message
+ </para></listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><constant>KDBUS_CMD_LIST</constant></term>
+ <listitem><para>
+ ... to dump the name registry
+ </para></listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><constant>KDBUS_CMD_CONN_INFO</constant></term>
+ <listitem><para>
+ ... to retrieve information on a connection
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+
+ </para>
+ <para>
+ The <varname>offset</varname> fields returned by either one of the
+ aforementioned ioctls describe offsets inside the pool. In order to make
+ the slice available for subsequent calls,
+ <constant>KDBUS_CMD_FREE</constant> has to be called on that offset
+ (see below). Otherwise, the pool will fill up, and the connection won't
+ be able to receive any more information through its pool.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Pool slice allocation</title>
+ <para>
+ Pool slices are allocated by the kernel in order to report information
+ back to a task, such as messages, returned name list etc.
+ Allocation of pool slices cannot be initiated by userspace. See
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ and
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for examples of commands that use the <emphasis>pool</emphasis> to
+ return data.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Accessing the pool memory</title>
+ <para>
+ Memory in the pool is read-only for userspace and may only be written
+ to by the kernel. To read from the pool memory, the caller is expected to
+ <citerefentry>
+ <refentrytitle>mmap</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ the buffer into its task, like this:
+ </para>
+ <programlisting>
+uint8_t *buf = mmap(NULL, size, PROT_READ, MAP_SHARED, conn_fd, 0);
+ </programlisting>
+
+ <para>
+ In order to map the entire pool, the <varname>size</varname> parameter in
+ the example above should be set to the value of the
+ <varname>pool_size</varname> member of
+ <type>struct kdbus_cmd_hello</type> when
+ <constant>KDBUS_CMD_HELLO</constant> was employed to create the
+ connection (see above).
+ </para>
+
+ <para>
+ The <emphasis>file descriptor</emphasis> used to map the memory must be
+ the one that was used to create the <emphasis>connection</emphasis>.
+ In other words, the one that was used to call
+ <constant>KDBUS_CMD_HELLO</constant>. See
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para>
+
+ <para>
+ Alternatively, instead of mapping the entire pool buffer, only parts
+ of it can be mapped. Every kdbus command that returns an
+ <emphasis>offset</emphasis> (see above) also reports a
+ <emphasis>size</emphasis> along with it, so programs can be written
+ in a way that it only maps portions of the pool to access a specific
+ <emphasis>slice</emphasis>.
+ </para>
+
+ <para>
+ When access to the pool memory is no longer needed, programs should
+ call <function>munmap()</function> on the pointer returned by
+ <function>mmap()</function>.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Freeing pool slices</title>
+ <para>
+ The <constant>KDBUS_CMD_FREE</constant> ioctl is used to free a slice
+ inside the pool, describing an offset that was returned in an
+ <varname>offset</varname> field of another ioctl struct.
+ The <constant>KDBUS_CMD_FREE</constant> command takes a
+ <type>struct kdbus_cmd_free</type> as argument.
+ </para>
+
+<programlisting>
+struct kdbus_cmd_free {
+ __u64 size;
+ __u64 flags;
+ __u64 return_flags;
+ __u64 offset;
+ struct kdbus_item items[0];
+};
+</programlisting>
+
+ <para>The fields in this struct are described below.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term><varname>size</varname></term>
+ <listitem><para>
+ The overall size of the struct, including its items.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>flags</varname></term>
+ <listitem><para>
+ Currently unused.
+ <constant>KDBUS_FLAG_NEGOTIATE</constant> is accepted to probe for
+ valid flags. If set, the ioctl will return <errorcode>0</errorcode>,
+ and the <varname>flags</varname> field is set to
+ <constant>0</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>return_flags</varname></term>
+ <listitem><para>
+ Flags returned by the kernel. Currently unused and always set to
+ <constant>0</constant> by the kernel.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>offset</varname></term>
+ <listitem><para>
+ The offset to free, as returned by other ioctls that allocated
+ memory for returned information.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><varname>items</varname></term>
+ <listitem><para>
+ Items to specify further details for the receive command.
+ Currently unused.
+ Unrecognized items are rejected, and the ioctl will fail with
+ <varname>errno</varname> set to <constant>EINVAL</constant>.
+ All items except for
+ <constant>KDBUS_ITEM_NEGOTIATE</constant> (see
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ ) will be rejected.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return value</title>
+ <para>
+ On success, all mentioned ioctl commands return <errorcode>0</errorcode>;
+ on error, <errorcode>-1</errorcode> is returned, and
+ <varname>errno</varname> is set to indicate the error.
+ If the issued ioctl is illegal for the file descriptor used,
+ <varname>errno</varname> will be set to <constant>ENOTTY</constant>.
+ </para>
+
+ <refsect2>
+ <title>
+ <constant>KDBUS_CMD_FREE</constant> may fail with the following
+ errors
+ </title>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>ENXIO</constant></term>
+ <listitem><para>
+ No pool slice found at given offset.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ Invalid flags provided.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>EINVAL</constant></term>
+ <listitem><para>
+ The offset is valid, but the user is not allowed to free the slice.
+ This happens, for example, if the offset was retrieved with
+ <constant>KDBUS_RECV_PEEK</constant>.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+ <simplelist type="inline">
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>mmap</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>munmap</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ </member>
+ </simplelist>
+ </refsect1>
+</refentry>
diff --git a/Documentation/kdbus/kdbus.xml b/Documentation/kdbus/kdbus.xml
new file mode 100644
index 000000000000..194abd2e76cc
--- /dev/null
+++ b/Documentation/kdbus/kdbus.xml
@@ -0,0 +1,1012 @@
+<?xml version='1.0'?> <!--*-nxml-*-->
+<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+ "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
+
+<refentry id="kdbus">
+
+ <refentryinfo>
+ <title>kdbus</title>
+ <productname>kdbus</productname>
+ </refentryinfo>
+
+ <refmeta>
+ <refentrytitle>kdbus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </refmeta>
+
+ <refnamediv>
+ <refname>kdbus</refname>
+ <refpurpose>Kernel Message Bus</refpurpose>
+ </refnamediv>
+
+ <refsect1>
+ <title>Synopsis</title>
+ <para>
+ kdbus is an inter-process communication bus system controlled by the
+ kernel. It provides user-space with an API to create buses and send
+ unicast and multicast messages to one, or many, peers connected to the
+ same bus. It does not enforce any layout on the transmitted data, but
+ only provides the transport layer used for message interchange between
+ peers.
+ </para>
+ <para>
+ This set of man-pages gives a comprehensive overview of the kernel-level
+ API, with all ioctl commands, associated structs and bit masks. However,
+ most people will not use this API level directly, but rather let one of
+ the high-level abstraction libraries help them integrate D-Bus
+ functionality into their applications.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Description</title>
+ <para>
+ kdbus provides a pseudo filesystem called <emphasis>kdbusfs</emphasis>,
+ which is usually mounted on <filename>/sys/fs/kdbus</filename>. Bus
+ primitives can be accessed as files and sub-directories underneath this
+ mount-point. Any advanced operations are done via
+ <function>ioctl()</function> on files created by
+ <emphasis>kdbusfs</emphasis>. Multiple mount-points of
+ <emphasis>kdbusfs</emphasis> are independent of each other. This allows
+ namespacing of kdbus by mounting a new instance of
+ <emphasis>kdbusfs</emphasis> in a new mount-namespace. kdbus calls these
+ mount instances domains and each bus belongs to exactly one domain.
+ </para>
+
+ <para>
+ kdbus was designed as a transport layer for D-Bus, but is in no way
+ limited, nor controlled by the D-Bus protocol specification. The D-Bus
+ protocol is one possible application layer on top of kdbus.
+ </para>
+
+ <para>
+ For the general D-Bus protocol specification, its payload format, its
+ marshaling, and its communication semantics, please refer to the
+ <ulink url="http://dbus.freedesktop.org/doc/dbus-specification.html">
+ D-Bus specification</ulink>.
+ </para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Terminology</title>
+
+ <refsect2>
+ <title>Domain</title>
+ <para>
+ A domain is a <emphasis>kdbusfs</emphasis> mount-point containing all
+ the bus primitives. Each domain is independent, and separate domains
+ do not affect each other.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Bus</title>
+ <para>
+ A bus is a named object inside a domain. Clients exchange messages
+ over a bus. Multiple buses themselves have no connection to each other;
+ messages can only be exchanged on the same bus. The default endpoint of
+ a bus, to which clients establish connections, is the "bus" file
+ /sys/fs/kdbus/<bus name>/bus.
+ Common operating system setups create one "system bus" per system,
+ and one "user bus" for every logged-in user. Applications or services
+ may create their own private buses. The kernel driver does not
+ distinguish between different bus types, they are all handled the same
+ way. See
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Endpoint</title>
+ <para>
+ An endpoint provides a file to talk to a bus. Opening an endpoint
+ creates a new connection to the bus to which the endpoint belongs. All
+ endpoints have unique names and are accessible as files underneath the
+ directory of a bus, e.g., /sys/fs/kdbus/<bus>/<endpoint>
+ Every bus has a default endpoint called "bus".
+ A bus can optionally offer additional endpoints with custom names
+ to provide restricted access to the bus. Custom endpoints carry
+ additional policy which can be used to create sandboxes with
+ locked-down, limited, filtered access to a bus. See
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Connection</title>
+ <para>
+ A connection to a bus is created by opening an endpoint file of a
+ bus. Every ordinary client connection has a unique identifier on the
+ bus and can address messages to every other connection on the same
+ bus by using the peer's connection ID as the destination. See
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Pool</title>
+ <para>
+ Each connection allocates a piece of shmem-backed memory that is
+ used to receive messages and answers to ioctl commands from the kernel.
+ It is never used to send anything to the kernel. In order to access that
+ memory, an application must mmap() it into its address space. See
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Well-known Name</title>
+ <para>
+ A connection can, in addition to its implicit unique connection ID,
+ request the ownership of a textual well-known name. Well-known names are
+ noted in reverse-domain notation, such as com.example.service1. A
+ connection that offers a service on a bus is usually reached by its
+ well-known name. An analogy of connection ID and well-known name is an
+ IP address and a DNS name associated with that address. See
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Message</title>
+ <para>
+ Connections can exchange messages with other connections by addressing
+ the peers with their connection ID or well-known name. A message
+ consists of a message header with information on how to route the
+ message, and the message payload, which is a logical byte stream of
+ arbitrary size. Messages can carry additional file descriptors to be
+ passed from one connection to another, just like passing file
+ descriptors over UNIX domain sockets. Every connection can specify which
+ set of metadata the kernel should attach to the message when it is
+ delivered to the receiving connection. Metadata contains information
+ like: system time stamps, UID, GID, TID, proc-starttime, well-known
+ names, process comm, process exe, process argv, cgroup, capabilities,
+ seclabel, audit session, loginuid and the connection's human-readable
+ name. See
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Item</title>
+ <para>
+ The API of kdbus implements the notion of items, submitted through and
+ returned by most ioctls, and stored inside data structures in the
+ connection's pool. See
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Broadcast, signal, filter, match</title>
+ <para>
+ Signals are messages that a receiver opts in for by installing a blob of
+ bytes, called a 'match'. Signal messages must always carry a
+ counter-part blob, called a 'filter', and signals are only delivered to
+ peers which have a match that white-lists the message's filter. Senders
+ of signal messages can use either a single connection ID as receiver,
+ or the special connection ID
+ <constant>KDBUS_DST_ID_BROADCAST</constant> to potentially send it to
+ all connections of a bus, following the logic described above. See
+ <citerefentry>
+ <refentrytitle>kdbus.match</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ and
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Policy</title>
+ <para>
+ A policy is a set of rules that define which connections can see, talk
+ to, or register a well-known name on the bus. A policy is attached to
+ buses and custom endpoints, and modified by policy holder connections or
+ owners of custom endpoints. See
+ <citerefentry>
+ <refentrytitle>kdbus.policy</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Privileged bus users</title>
+ <para>
+ A user connecting to the bus is considered privileged if it is either
+ the creator of the bus, or if it has the CAP_IPC_OWNER capability flag
+ set. See
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para>
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>Bus Layout</title>
+
+ <para>
+ A <emphasis>bus</emphasis> provides and defines an environment that peers
+ can connect to for message interchange. A bus is created via the kdbus
+ control interface and can be modified by the bus creator. It applies the
+ policy that control all bus operations. The bus creator itself does not
+ participate as a peer. To establish a peer
+ <emphasis>connection</emphasis>, you have to open one of the
+ <emphasis>endpoints</emphasis> of a bus. Each bus provides a default
+ endpoint, but further endpoints can be created on-demand. Endpoints are
+ used to apply additional policies for all connections on this endpoint.
+ Thus, they provide additional filters to further restrict access of
+ specific connections to the bus.
+ </para>
+
+ <para>
+ Following, you can see an example bus layout:
+ </para>
+
+ <programlisting><![CDATA[
+ Bus Creator
+ |
+ |
+ +-----+
+ | Bus |
+ +-----+
+ |
+ __________________/ \__________________
+ / \
+ | |
+ +----------+ +----------+
+ | Endpoint | | Endpoint |
+ +----------+ +----------+
+ _________/|\_________ _________/|\_________
+ / | \ / | \
+ | | | | | |
+ | | | | | |
+ Connection Connection Connection Connection Connection Connection
+ ]]></programlisting>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Data structures and interconnections</title>
+ <programlisting><![CDATA[
+ +--------------------------------------------------------------------------+
+ | Domain (Mount Point) |
+ | /sys/fs/kdbus/control |
+ | +----------------------------------------------------------------------+ |
+ | | Bus (System Bus) | |
+ | | /sys/fs/kdbus/0-system/ | |
+ | | +-------------------------------+ +--------------------------------+ | |
+ | | | Endpoint | | Endpoint | | |
+ | | | /sys/fs/kdbus/0-system/bus | | /sys/fs/kdbus/0-system/ep.app | | |
+ | | +-------------------------------+ +--------------------------------+ | |
+ | | +--------------+ +--------------+ +--------------+ +---------------+ | |
+ | | | Connection | | Connection | | Connection | | Connection | | |
+ | | | :1.22 | | :1.25 | | :1.55 | | :1.81 | | |
+ | | +--------------+ +--------------+ +--------------+ +---------------+ | |
+ | +----------------------------------------------------------------------+ |
+ | |
+ | +----------------------------------------------------------------------+ |
+ | | Bus (User Bus for UID 2702) | |
+ | | /sys/fs/kdbus/2702-user/ | |
+ | | +-------------------------------+ +--------------------------------+ | |
+ | | | Endpoint | | Endpoint | | |
+ | | | /sys/fs/kdbus/2702-user/bus | | /sys/fs/kdbus/2702-user/ep.app | | |
+ | | +-------------------------------+ +--------------------------------+ | |
+ | | +--------------+ +--------------+ +--------------+ +---------------+ | |
+ | | | Connection | | Connection | | Connection | | Connection | | |
+ | | | :1.22 | | :1.25 | | :1.55 | | :1.81 | | |
+ | | +--------------+ +--------------+ +--------------------------------+ | |
+ | +----------------------------------------------------------------------+ |
+ +--------------------------------------------------------------------------+
+ ]]></programlisting>
+ </refsect1>
+
+ <refsect1>
+ <title>Metadata</title>
+
+ <refsect2>
+ <title>When metadata is collected</title>
+ <para>
+ kdbus records data about the system in certain situations. Such metadata
+ can refer to the currently active process (creds, PIDs, current user
+ groups, process names and its executable path, cgroup membership,
+ capabilities, security label and audit information), connection
+ information (description string, currently owned names) and time stamps.
+ </para>
+ <para>
+ Metadata is collected at the following times.
+ </para>
+
+ <itemizedlist>
+ <listitem><para>
+ When a bus is created (<constant>KDBUS_CMD_MAKE</constant>),
+ information about the calling task is collected. This data is returned
+ by the kernel via the <constant>KDBUS_CMD_BUS_CREATOR_INFO</constant>
+ call.
+ </para></listitem>
+
+ <listitem>
+ <para>
+ When a connection is created (<constant>KDBUS_CMD_HELLO</constant>),
+ information about the calling task is collected. Alternatively, a
+ privileged connection may provide 'faked' information about
+ credentials, PIDs and security labels which will be stored instead.
+ This data is returned by the kernel as information on a connection
+ (<constant>KDBUS_CMD_CONN_INFO</constant>). Only metadata that a
+ connection allowed to be sent (by setting its bit in
+ <varname>attach_flags_send</varname>) will be exported in this way.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ When a message is sent (<constant>KDBUS_CMD_SEND</constant>),
+ information about the sending task and the sending connection are
+ collected. This metadata will be attached to the message when it
+ arrives in the receiver's pool. If the connection sending the
+ message installed faked credentials (see
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>),
+ the message will not be augmented by any information about the
+ currently sending task. Note that only metadata that was requested
+ by the receiving connection will be collected and attached to
+ messages.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ Which metadata items are actually delivered depends on the following
+ sets and masks:
+ </para>
+
+ <itemizedlist>
+ <listitem><para>
+ (a) the system-wide kmod creds mask
+ (module parameter <varname>attach_flags_mask</varname>)
+ </para></listitem>
+
+ <listitem><para>
+ (b) the per-connection send creds mask, set by the connecting client
+ </para></listitem>
+
+ <listitem><para>
+ (c) the per-connection receive creds mask, set by the connecting
+ client
+ </para></listitem>
+
+ <listitem><para>
+ (d) the per-bus minimal creds mask, set by the bus creator
+ </para></listitem>
+
+ <listitem><para>
+ (e) the per-bus owner creds mask, set by the bus creator
+ </para></listitem>
+
+ <listitem><para>
+ (f) the mask specified when querying creds of a bus peer
+ </para></listitem>
+
+ <listitem><para>
+ (g) the mask specified when querying creds of a bus owner
+ </para></listitem>
+ </itemizedlist>
+
+ <para>
+ With the following rules:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ [1] The creds attached to messages are determined as
+ <constant>a & b & c</constant>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ [2] When connecting to a bus (<constant>KDBUS_CMD_HELLO</constant>),
+ and <constant>~b & d != 0</constant>, the call will fail with,
+ <errorcode>-1</errorcode>, and <varname>errno</varname> is set to
+ <constant>ECONNREFUSED</constant>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ [3] When querying creds of a bus peer, the creds returned are
+ <constant>a & b & f</constant>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ [4] When querying creds of a bus owner, the creds returned are
+ <constant>a & e & g</constant>.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ Hence, programs might not always get all requested metadata items that
+ it requested. Code must be written so that it can cope with this fact.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Benefits and heads-up</title>
+ <para>
+ Attaching metadata to messages has two major benefits.
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ Metadata attached to messages is gathered at the moment when the
+ other side calls <constant>KDBUS_CMD_SEND</constant>, or,
+ respectively, then the kernel notification is generated. There is
+ no need for the receiving peer to retrieve information about the
+ task in a second step. This closes a race gap that would otherwise
+ be inherent.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ As metadata is delivered along with messages in the same data
+ blob, no extra calls to kernel functions etc. are needed to gather
+ them.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ Note, however, that collecting metadata does come at a price for
+ performance, so developers should carefully assess which metadata to
+ really opt-in for. For best practice, data that is not needed as part
+ of a message should not be requested by the connection in the first
+ place (see <varname>attach_flags_recv</varname> in
+ <constant>KDBUS_CMD_HELLO</constant>).
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title>Attach flags for metadata items</title>
+ <para>
+ To let the kernel know which metadata information to attach as items
+ to the aforementioned commands, it uses a bitmask. In those, the
+ following <emphasis>attach flags</emphasis> are currently supported.
+ Both the the <varname>attach_flags_recv</varname> and
+ <varname>attach_flags_send</varname> fields of
+ <type>struct kdbus_cmd_hello</type>, as well as the payload of the
+ <constant>KDBUS_ITEM_ATTACH_FLAGS_SEND</constant> and
+ <constant>KDBUS_ITEM_ATTACH_FLAGS_RECV</constant> items follow this
+ scheme.
+ </para>
+
+ <variablelist>
+ <varlistentry>
+ <term><constant>KDBUS_ATTACH_TIMESTAMP</constant></term>
+ <listitem><para>
+ Requests the attachment of an item of type
+ <constant>KDBUS_ITEM_TIMESTAMP</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ATTACH_CREDS</constant></term>
+ <listitem><para>
+ Requests the attachment of an item of type
+ <constant>KDBUS_ITEM_CREDS</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ATTACH_PIDS</constant></term>
+ <listitem><para>
+ Requests the attachment of an item of type
+ <constant>KDBUS_ITEM_PIDS</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ATTACH_AUXGROUPS</constant></term>
+ <listitem><para>
+ Requests the attachment of an item of type
+ <constant>KDBUS_ITEM_AUXGROUPS</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ATTACH_NAMES</constant></term>
+ <listitem><para>
+ Requests the attachment of an item of type
+ <constant>KDBUS_ITEM_OWNED_NAME</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ATTACH_TID_COMM</constant></term>
+ <listitem><para>
+ Requests the attachment of an item of type
+ <constant>KDBUS_ITEM_TID_COMM</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ATTACH_PID_COMM</constant></term>
+ <listitem><para>
+ Requests the attachment of an item of type
+ <constant>KDBUS_ITEM_PID_COMM</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ATTACH_EXE</constant></term>
+ <listitem><para>
+ Requests the attachment of an item of type
+ <constant>KDBUS_ITEM_EXE</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ATTACH_CMDLINE</constant></term>
+ <listitem><para>
+ Requests the attachment of an item of type
+ <constant>KDBUS_ITEM_CMDLINE</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ATTACH_CGROUP</constant></term>
+ <listitem><para>
+ Requests the attachment of an item of type
+ <constant>KDBUS_ITEM_CGROUP</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ATTACH_CAPS</constant></term>
+ <listitem><para>
+ Requests the attachment of an item of type
+ <constant>KDBUS_ITEM_CAPS</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ATTACH_SECLABEL</constant></term>
+ <listitem><para>
+ Requests the attachment of an item of type
+ <constant>KDBUS_ITEM_SECLABEL</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ATTACH_AUDIT</constant></term>
+ <listitem><para>
+ Requests the attachment of an item of type
+ <constant>KDBUS_ITEM_AUDIT</constant>.
+ </para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><constant>KDBUS_ATTACH_CONN_DESCRIPTION</constant></term>
+ <listitem><para>
+ Requests the attachment of an item of type
+ <constant>KDBUS_ITEM_CONN_DESCRIPTION</constant>.
+ </para></listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ Please refer to
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for detailed information about the layout and payload of items and
+ what metadata should be used to.
+ </para>
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>The ioctl interface</title>
+
+ <para>
+ As stated in the 'synopsis' section above, application developers are
+ strongly encouraged to use kdbus through one of the high-level D-Bus
+ abstraction libraries, rather than using the low-level API directly.
+ </para>
+
+ <para>
+ kdbus on the kernel level exposes its functions exclusively through
+ <citerefentry>
+ <refentrytitle>ioctl</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>,
+ employed on file descriptors returned by
+ <citerefentry>
+ <refentrytitle>open</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ on pseudo files exposed by
+ <citerefentry>
+ <refentrytitle>kdbus.fs</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para>
+ <para>
+ Following is a list of all the ioctls, along with the command structs
+ they must be used with.
+ </para>
+
+ <informaltable frame="none">
+ <tgroup cols="3" colsep="1">
+ <thead>
+ <row>
+ <entry>ioctl signature</entry>
+ <entry>command</entry>
+ <entry>transported struct</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry><constant>0x40189500</constant></entry>
+ <entry><constant>KDBUS_CMD_BUS_MAKE</constant></entry>
+ <entry><type>struct kdbus_cmd *</type></entry>
+ </row><row>
+ <entry><constant>0x40189510</constant></entry>
+ <entry><constant>KDBUS_CMD_ENDPOINT_MAKE</constant></entry>
+ <entry><type>struct kdbus_cmd *</type></entry>
+ </row><row>
+ <entry><constant>0xc0609580</constant></entry>
+ <entry><constant>KDBUS_CMD_HELLO</constant></entry>
+ <entry><type>struct kdbus_cmd_hello *</type></entry>
+ </row><row>
+ <entry><constant>0x40189582</constant></entry>
+ <entry><constant>KDBUS_CMD_BYEBYE</constant></entry>
+ <entry><type>struct kdbus_cmd *</type></entry>
+ </row><row>
+ <entry><constant>0x40389590</constant></entry>
+ <entry><constant>KDBUS_CMD_SEND</constant></entry>
+ <entry><type>struct kdbus_cmd_send *</type></entry>
+ </row><row>
+ <entry><constant>0x80409591</constant></entry>
+ <entry><constant>KDBUS_CMD_RECV</constant></entry>
+ <entry><type>struct kdbus_cmd_recv *</type></entry>
+ </row><row>
+ <entry><constant>0x40209583</constant></entry>
+ <entry><constant>KDBUS_CMD_FREE</constant></entry>
+ <entry><type>struct kdbus_cmd_free *</type></entry>
+ </row><row>
+ <entry><constant>0x401895a0</constant></entry>
+ <entry><constant>KDBUS_CMD_NAME_ACQUIRE</constant></entry>
+ <entry><type>struct kdbus_cmd *</type></entry>
+ </row><row>
+ <entry><constant>0x401895a1</constant></entry>
+ <entry><constant>KDBUS_CMD_NAME_RELEASE</constant></entry>
+ <entry><type>struct kdbus_cmd *</type></entry>
+ </row><row>
+ <entry><constant>0x80289586</constant></entry>
+ <entry><constant>KDBUS_CMD_LIST</constant></entry>
+ <entry><type>struct kdbus_cmd_list *</type></entry>
+ </row><row>
+ <entry><constant>0x80309584</constant></entry>
+ <entry><constant>KDBUS_CMD_CONN_INFO</constant></entry>
+ <entry><type>struct kdbus_cmd_info *</type></entry>
+ </row><row>
+ <entry><constant>0x40209551</constant></entry>
+ <entry><constant>KDBUS_CMD_UPDATE</constant></entry>
+ <entry><type>struct kdbus_cmd *</type></entry>
+ </row><row>
+ <entry><constant>0x80309585</constant></entry>
+ <entry><constant>KDBUS_CMD_BUS_CREATOR_INFO</constant></entry>
+ <entry><type>struct kdbus_cmd_info *</type></entry>
+ </row><row>
+ <entry><constant>0x40189511</constant></entry>
+ <entry><constant>KDBUS_CMD_ENDPOINT_UPDATE</constant></entry>
+ <entry><type>struct kdbus_cmd *</type></entry>
+ </row><row>
+ <entry><constant>0x402095b0</constant></entry>
+ <entry><constant>KDBUS_CMD_MATCH_ADD</constant></entry>
+ <entry><type>struct kdbus_cmd_match *</type></entry>
+ </row><row>
+ <entry><constant>0x402095b1</constant></entry>
+ <entry><constant>KDBUS_CMD_MATCH_REMOVE</constant></entry>
+ <entry><type>struct kdbus_cmd_match *</type></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </informaltable>
+
+ <para>
+ Depending on the type of <emphasis>kdbusfs</emphasis> node that was
+ opened and what ioctls have been executed on a file descriptor before,
+ a different sub-set of ioctl commands is allowed.
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ On a file descriptor resulting from opening a
+ <emphasis>control node</emphasis>, only the
+ <constant>KDBUS_CMD_BUS_MAKE</constant> ioctl may be executed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ On a file descriptor resulting from opening a
+ <emphasis>bus endpoint node</emphasis>, only the
+ <constant>KDBUS_CMD_ENDPOINT_MAKE</constant> and
+ <constant>KDBUS_CMD_HELLO</constant> ioctls may be executed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ A file descriptor that was used to create a bus
+ (via <constant>KDBUS_CMD_BUS_MAKE</constant>) is called a
+ <emphasis>bus owner</emphasis> file descriptor. The bus will be
+ active as long as the file descriptor is kept open.
+ A bus owner file descriptor can not be used to
+ employ any further ioctls. As soon as
+ <citerefentry>
+ <refentrytitle>close</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ is called on it, the bus will be shut down, along will all associated
+ endpoints and connections. See
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ A file descriptor that was used to create an endpoint
+ (via <constant>KDBUS_CMD_ENDPOINT_MAKE</constant>) is called an
+ <emphasis>endpoint owner</emphasis> file descriptor. The endpoint
+ will be active as long as the file descriptor is kept open.
+ An endpoint owner file descriptor can only be used
+ to update details of an endpoint through the
+ <constant>KDBUS_CMD_ENDPOINT_UPDATE</constant> ioctl. As soon as
+ <citerefentry>
+ <refentrytitle>close</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ is called on it, the endpoint will be removed from the bus, and all
+ connections that are connected to the bus through it are shut down.
+ See
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ for more details.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ A file descriptor that was used to create a connection
+ (via <constant>KDBUS_CMD_HELLO</constant>) is called a
+ <emphasis>connection owner</emphasis> file descriptor. The connection
+ will be active as long as the file descriptor is kept open.
+ A connection owner file descriptor may be used to
+ issue any of the following ioctls.
+ </para>
+
+ <itemizedlist>
+ <listitem><para>
+ <constant>KDBUS_CMD_UPDATE</constant> to tweak details of the
+ connection. See
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para></listitem>
+
+ <listitem><para>
+ <constant>KDBUS_CMD_BYEBYE</constant> to shut down a connection
+ without losing messages. See
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para></listitem>
+
+ <listitem><para>
+ <constant>KDBUS_CMD_FREE</constant> to free a slice of memory in
+ the pool. See
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para></listitem>
+
+ <listitem><para>
+ <constant>KDBUS_CMD_CONN_INFO</constant> to retrieve information
+ on other connections on the bus. See
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para></listitem>
+
+ <listitem><para>
+ <constant>KDBUS_CMD_BUS_CREATOR_INFO</constant> to retrieve
+ information on the bus creator. See
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para></listitem>
+
+ <listitem><para>
+ <constant>KDBUS_CMD_LIST</constant> to retrieve a list of
+ currently active well-known names and unique IDs on the bus. See
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para></listitem>
+
+ <listitem><para>
+ <constant>KDBUS_CMD_SEND</constant> and
+ <constant>KDBUS_CMD_RECV</constant> to send or receive a message.
+ See
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para></listitem>
+
+ <listitem><para>
+ <constant>KDBUS_CMD_NAME_ACQUIRE</constant> and
+ <constant>KDBUS_CMD_NAME_RELEASE</constant> to acquire or release
+ a well-known name on the bus. See
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para></listitem>
+
+ <listitem><para>
+ <constant>KDBUS_CMD_MATCH_ADD</constant> and
+ <constant>KDBUS_CMD_MATCH_REMOVE</constant> to add or remove
+ a match for signal messages. See
+ <citerefentry>
+ <refentrytitle>kdbus.match</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>.
+ </para></listitem>
+ </itemizedlist>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ These ioctls, along with the structs they transport, are explained in
+ detail in the other documents linked to in the 'see also' section below.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+ <simplelist type="inline">
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.bus</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.connection</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.endpoint</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.fs</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.item</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.message</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.name</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>kdbus.pool</refentrytitle>
+ <manvolnum>7</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>ioctl</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>mmap</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>open</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <citerefentry>
+ <refentrytitle>close</refentrytitle>
+ <manvolnum>2</manvolnum>
+ </citerefentry>
+ </member>
+ <member>
+ <ulink url="http://freedesktop.org/wiki/Software/dbus">D-Bus</ulink>
+ </member>
+ </simplelist>
+ </refsect1>
+
+</refentry>
diff --git a/Documentation/kdbus/stylesheet.xsl b/Documentation/kdbus/stylesheet.xsl
new file mode 100644
index 000000000000..52565eac7d0d
--- /dev/null
+++ b/Documentation/kdbus/stylesheet.xsl
@@ -0,0 +1,16 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<stylesheet xmlns="http://www.w3.org/1999/XSL/Transform" version="1.0">
+ <param name="chunk.quietly">1</param>
+ <param name="funcsynopsis.style">ansi</param>
+ <param name="funcsynopsis.tabular.threshold">80</param>
+ <param name="callout.graphics">0</param>
+ <param name="paper.type">A4</param>
+ <param name="generate.section.toc.level">2</param>
+ <param name="use.id.as.filename">1</param>
+ <param name="citerefentry.link">1</param>
+ <strip-space elements="*"/>
+ <template name="generate.citerefentry.link">
+ <value-of select="refentrytitle"/>
+ <text>.html</text>
+ </template>
+</stylesheet>
diff --git a/Makefile b/Makefile
index 1100ff3c77e3..08c98188a37a 100644
--- a/Makefile
+++ b/Makefile
@@ -1350,6 +1350,7 @@ $(help-board-dirs): help-%:
%docs: scripts_basic FORCE
$(Q)$(MAKE) $(build)=scripts build_docproc
$(Q)$(MAKE) $(build)=Documentation/DocBook $@
+ $(Q)$(MAKE) $(build)=Documentation/kdbus $@
else # KBUILD_EXTMOD
--
2.3.1
^ permalink raw reply related
* [PATCH v4 00/14] Add kdbus implementation
From: Greg Kroah-Hartman @ 2015-03-09 13:09 UTC (permalink / raw)
To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api,
linux-kernel
Cc: daniel, dh.herrmann, tixxdz
kdbus is a kernel-level IPC implementation that aims for resemblance to
the the protocol layer with the existing userspace D-Bus daemon while
enabling some features that couldn't be implemented before in userspace.
The documentation in the first patch in this series explains the
protocol and the API details.
This is v4 of the kdbus series for inclusion into the mainline kernel.
Changes since v3 are:
* Drop KDBUS_FLAG_KERNEL and the 'kernel_flags' member from all
struct kdbus_cmd_*, and introduce a new KDBUS_FLAGS_NEGOTIATE
instead. Requested by Michael Kerrisk.
* Transform kdbus.txt into DocBook man-pages for better readablity,
and extend the documentation significantly. Requested by Michael
Kerrisk and Christoph Hellwig.
* Add a walk-through example for using the low-level ioctl API from
userspace.
* Consolidate some 'struct kdbus_cmd_*' types to make the API
interface easier to grasp.
* Drop 'struct kdbus_item_list'. The information stored in this
struct was redundant as all ioctls report the returned size
in the command struct already.
* KDBUS_CMD_NAME_ACQUIRE now returns the KDBUS_NAME_IN_QUEUE flag
in cmd->return_flags rather than modifying cmd->flags.
* Get rid of the need for a 2nd pool slice at install time. This
avoids pool fragmentation, message memory footprint and complexity.
* Separate flags from attach_flags in struct kdbus_cmd_info.
* Fix handling of messages with file descriptors with regard to
monitor connections that don't accept file descriptors.
* Revisited and reimplemented the quota logic. 50% are now always
kept reserved for the connection to receive notification etc,
and the rest is accounted per remote peer to avoid denial of
service attacks.
* Make use of new functions introduced with 4.0-rc1
(vfs_iter_write(), {kstrdup,kfree}_const())
* Some internal restructuring and cleanups.
Reasons why this should be done in the kernel, instead of userspace as
it is currently done today include the following:
* Performance: Fewer process context switches, fewer copies, fewer
syscalls, larger memory chunks via memfd. This is really important
for a whole class of userspace programs that are ported from other
operating systems that are run on tiny ARM systems that rely on
hundreds of thousands of messages passed at boot time, and at
"critical" times in their user interaction loops. DBus is not used
for performance sensitive applications because DBus is slow.
We want to make it fast so we can finally use it for low-latency,
high-throughput applications. A simple DBus method-call+reply takes
200us on an up-to-date test machine, with kdbus it takes 8us (with
UDS about 2us). If the packet size is increased from 8k to 128k,
kdbus even beats UDS due to single-copy transfers.
* Security: The peers which communicate do not have to trust each
other, as the only trustworthy component in the game is the kernel
which adds metadata and ensures that all data passed as payload is
either copied or sealed, so that the receiver can parse the data
without having to protect against changing memory while parsing
buffers. Also, all the data transfer is controlled by the kernel,
so that LSMs can track and control what is going on, without
involving userspace. Because of the LSM issue, security people are
much happier with this model than the current scheme of having to
hook into dbus to mediate things.
* More types of metadata can be attached to messages than in userspace
* Semantics for apps with heavy data payloads (media apps, for
instance) with optinal priority message dequeuing, and global
message ordering. Some "crazy" people are playing with using kdbus
for audio data in the system. I'm not saying that this is the best
model for this, but until now, there wasn't any other way to do this
without having to create custom "buses", one for each application
library.
* Being in the kernel closes a lot of races which can't be fixed with
the current userspace solutions. For example, with kdbus, there is a
way a client can disconnect from a bus, but do so only if no further
messages present in its queue, which is crucial for implementing
race-free "exit-on-idle" services
* Eavesdropping on the kernel level, so privileged users can hook into
the message stream without hacking support for that into their
userspace processes
* A number of smaller benefits: for example kdbus learned a way to peek
full messages without dequeing them, which is really useful for
logging metadata when handling bus-activation requests.
* dbus-daemon is not available during early-boot or shutdown.
DBus marshaling is the de-facto standard in all major(!) Linux desktop
systems. It is well established and accepted by many DEs. It also
solves many other problems, including: policy, authentication /
authorization, well-known name registry, efficient broadcasts /
multicasts, peer discovery, bus discovery, metadata transmission, and
more.
It is a shame that we cannot use this well-established protocol for
low-latency applications. We, effectively, have to duplicate all this
code on custom UDS and other transports just because DBus is too slow.
kdbus tries to unify those efforts, so that we don't need multiple
policy implementations, name registries and peer discovery mechanisms.
Furthermore, kdbus implements comprehensive, yet optional, metadata
transmission that allows to identify and authenticate peers in a
race-free manner (which is *not* possible with UDS).
Also, kdbus provides a single transport bus with sequential message
numbering. If you use multiple channels, you cannot give any ordering
guarantees across peers (for instance, regarding parallel name-registry
changes).
Of course, some of the bits above could be implemented in userspace
alone, for example with more sophisticated memory management APIs, but
this is usually done by losing out on the other details. For example,
for many of the memory management APIs, it's hard to not require the
communicating peers to fully trust each other. And we _really_ don't
want peers to have to trust each other.
Another benefit of having this in the kernel, rather than as a userspace
daemon, is that you can now easily use the bus from the initrd, or up to
the very end when the system shuts down. On current userspace D-Bus,
this is not really possible, as this requires passing the bus instance
around between initrd and the "real" system. Such a transition of all
fds also requires keeping full state of what has already been read from
the connection fds. kdbus makes this much simpler, as we can change the
ownership of the bus, just by passing one fd over from one part to the
other.
Given the theoretical advantages above, here are some real-world
examples:
* The Tizen developers have been complaining about the high latency
of DBus for polkit'ish policy queries. That's why their
authentication framework uses custom UDS sockets (called 'Cynara').
If a UI-interaction needs multiple authentication-queries, you don't
want it to take multiple milliseconds, given that you usually want
to render the result in the same frame.
* PulseAudio doesn't use DBus for data transmission. They had to
implement their own marshaling code, transport layer and so on, just
because DBus1-latency is horrible. With kdbus, we can basically drop
this code-duplication and unify the IPC layer. Same is true for
Wayland, btw.
* By moving broadcast-transmission into the kernel, we can use the
time-slices of the sender to perform heavy operations. This is also
true for policy decisions, etc. With a userspace daemon, we cannot
perform operations in a time-slice of the caller. This makes DoS
attacks much harder.
* With priority-inheritance, we can do synchronous calls into trusted
peers and let them optionally use our time-slice to perform the
action. This allows syscall-like/binder-like method-calls into other
processes. Without priority-inheritance, this is not possible in a
secure manner (see 'priority-inheritance').
* Logging-daemons often want to attach metadata to log-messages so
debugging/filtering gets easier. If short-lived programs send
log-messages, the destination peer might not be able to read such
metadata from /proc, as the process might no longer be available at
that time. Same is true for policy-decisions like polkit does. You
cannot send off method-calls and exit. You have to wait for a reply,
even though you might not even care for it. If you don't wait, the
other side might not be able to verify your identity and as such
reject the request.
* Even though the dbus traffic on idle-systems might be low, this
doesn't mean it's not significant at boot-times or under high-load.
If you run a dbus-monitor of your choice, you will see there is an
significant number of messages exchanged during VT-switches, startup,
shutdown, suspend, wakeup, hotplugging and similar situations where
lots of control-messages are exchanged. We don't want to spend
hundreds of ms just to transmit those messages.
These patches can also be found in a git tree, the kdbus branch of
char-misc.git at:
https://git.kernel.org/cgit/linux/kernel/git/gregkh/char-misc.git/
Daniel Mack (14):
kdbus: add documentation
kdbus: add uapi header file
kdbus: add driver skeleton, ioctl entry points and utility functions
kdbus: add connection pool implementation
kdbus: add connection, queue handling and message validation code
kdbus: add node and filesystem implementation
kdbus: add code to gather metadata
kdbus: add code for notifications and matches
kdbus: add code for buses, domains and endpoints
kdbus: add name registry implementation
kdbus: add policy database implementation
kdbus: add Makefile, Kconfig and MAINTAINERS entry
kdbus: add walk-through user space example
kdbus: add selftests
Documentation/Makefile | 2 +-
Documentation/ioctl/ioctl-number.txt | 1 +
Documentation/kdbus/Makefile | 30 +
Documentation/kdbus/kdbus.bus.xml | 360 ++++
Documentation/kdbus/kdbus.connection.xml | 1252 ++++++++++++
Documentation/kdbus/kdbus.endpoint.xml | 436 ++++
Documentation/kdbus/kdbus.fs.xml | 124 ++
Documentation/kdbus/kdbus.item.xml | 840 ++++++++
Documentation/kdbus/kdbus.match.xml | 553 +++++
Documentation/kdbus/kdbus.message.xml | 1277 ++++++++++++
Documentation/kdbus/kdbus.name.xml | 711 +++++++
Documentation/kdbus/kdbus.policy.xml | 406 ++++
Documentation/kdbus/kdbus.pool.xml | 320 +++
Documentation/kdbus/kdbus.xml | 1012 ++++++++++
Documentation/kdbus/stylesheet.xsl | 16 +
MAINTAINERS | 13 +
Makefile | 1 +
include/uapi/linux/Kbuild | 1 +
include/uapi/linux/kdbus.h | 979 +++++++++
include/uapi/linux/magic.h | 2 +
init/Kconfig | 12 +
ipc/Makefile | 2 +-
ipc/kdbus/Makefile | 22 +
ipc/kdbus/bus.c | 560 ++++++
ipc/kdbus/bus.h | 101 +
ipc/kdbus/connection.c | 2215 +++++++++++++++++++++
ipc/kdbus/connection.h | 257 +++
ipc/kdbus/domain.c | 296 +++
ipc/kdbus/domain.h | 77 +
ipc/kdbus/endpoint.c | 275 +++
ipc/kdbus/endpoint.h | 67 +
ipc/kdbus/fs.c | 510 +++++
ipc/kdbus/fs.h | 28 +
ipc/kdbus/handle.c | 617 ++++++
ipc/kdbus/handle.h | 85 +
ipc/kdbus/item.c | 339 ++++
ipc/kdbus/item.h | 64 +
ipc/kdbus/limits.h | 64 +
ipc/kdbus/main.c | 125 ++
ipc/kdbus/match.c | 559 ++++++
ipc/kdbus/match.h | 35 +
ipc/kdbus/message.c | 616 ++++++
ipc/kdbus/message.h | 133 ++
ipc/kdbus/metadata.c | 1164 +++++++++++
ipc/kdbus/metadata.h | 57 +
ipc/kdbus/names.c | 772 +++++++
ipc/kdbus/names.h | 74 +
ipc/kdbus/node.c | 910 +++++++++
ipc/kdbus/node.h | 84 +
ipc/kdbus/notify.c | 248 +++
ipc/kdbus/notify.h | 30 +
ipc/kdbus/policy.c | 489 +++++
ipc/kdbus/policy.h | 51 +
ipc/kdbus/pool.c | 728 +++++++
ipc/kdbus/pool.h | 46 +
ipc/kdbus/queue.c | 678 +++++++
ipc/kdbus/queue.h | 92 +
ipc/kdbus/reply.c | 259 +++
ipc/kdbus/reply.h | 68 +
ipc/kdbus/util.c | 201 ++
ipc/kdbus/util.h | 74 +
samples/Makefile | 3 +-
samples/kdbus/.gitignore | 1 +
samples/kdbus/Makefile | 10 +
samples/kdbus/kdbus-api.h | 114 ++
samples/kdbus/kdbus-workers.c | 1327 ++++++++++++
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/kdbus/.gitignore | 3 +
tools/testing/selftests/kdbus/Makefile | 46 +
tools/testing/selftests/kdbus/kdbus-enum.c | 94 +
tools/testing/selftests/kdbus/kdbus-enum.h | 14 +
tools/testing/selftests/kdbus/kdbus-test.c | 923 +++++++++
tools/testing/selftests/kdbus/kdbus-test.h | 85 +
tools/testing/selftests/kdbus/kdbus-util.c | 1615 +++++++++++++++
tools/testing/selftests/kdbus/kdbus-util.h | 222 +++
tools/testing/selftests/kdbus/test-activator.c | 318 +++
tools/testing/selftests/kdbus/test-attach-flags.c | 750 +++++++
tools/testing/selftests/kdbus/test-benchmark.c | 451 +++++
tools/testing/selftests/kdbus/test-bus.c | 175 ++
tools/testing/selftests/kdbus/test-chat.c | 122 ++
tools/testing/selftests/kdbus/test-connection.c | 616 ++++++
tools/testing/selftests/kdbus/test-daemon.c | 65 +
tools/testing/selftests/kdbus/test-endpoint.c | 341 ++++
tools/testing/selftests/kdbus/test-fd.c | 789 ++++++++
tools/testing/selftests/kdbus/test-free.c | 64 +
tools/testing/selftests/kdbus/test-match.c | 441 ++++
tools/testing/selftests/kdbus/test-message.c | 731 +++++++
tools/testing/selftests/kdbus/test-metadata-ns.c | 506 +++++
tools/testing/selftests/kdbus/test-monitor.c | 176 ++
tools/testing/selftests/kdbus/test-names.c | 194 ++
tools/testing/selftests/kdbus/test-policy-ns.c | 632 ++++++
tools/testing/selftests/kdbus/test-policy-priv.c | 1269 ++++++++++++
tools/testing/selftests/kdbus/test-policy.c | 80 +
tools/testing/selftests/kdbus/test-sync.c | 369 ++++
tools/testing/selftests/kdbus/test-timeout.c | 99 +
95 files changed, 34063 insertions(+), 3 deletions(-)
create mode 100644 Documentation/kdbus/Makefile
create mode 100644 Documentation/kdbus/kdbus.bus.xml
create mode 100644 Documentation/kdbus/kdbus.connection.xml
create mode 100644 Documentation/kdbus/kdbus.endpoint.xml
create mode 100644 Documentation/kdbus/kdbus.fs.xml
create mode 100644 Documentation/kdbus/kdbus.item.xml
create mode 100644 Documentation/kdbus/kdbus.match.xml
create mode 100644 Documentation/kdbus/kdbus.message.xml
create mode 100644 Documentation/kdbus/kdbus.name.xml
create mode 100644 Documentation/kdbus/kdbus.policy.xml
create mode 100644 Documentation/kdbus/kdbus.pool.xml
create mode 100644 Documentation/kdbus/kdbus.xml
create mode 100644 Documentation/kdbus/stylesheet.xsl
create mode 100644 include/uapi/linux/kdbus.h
create mode 100644 ipc/kdbus/Makefile
create mode 100644 ipc/kdbus/bus.c
create mode 100644 ipc/kdbus/bus.h
create mode 100644 ipc/kdbus/connection.c
create mode 100644 ipc/kdbus/connection.h
create mode 100644 ipc/kdbus/domain.c
create mode 100644 ipc/kdbus/domain.h
create mode 100644 ipc/kdbus/endpoint.c
create mode 100644 ipc/kdbus/endpoint.h
create mode 100644 ipc/kdbus/fs.c
create mode 100644 ipc/kdbus/fs.h
create mode 100644 ipc/kdbus/handle.c
create mode 100644 ipc/kdbus/handle.h
create mode 100644 ipc/kdbus/item.c
create mode 100644 ipc/kdbus/item.h
create mode 100644 ipc/kdbus/limits.h
create mode 100644 ipc/kdbus/main.c
create mode 100644 ipc/kdbus/match.c
create mode 100644 ipc/kdbus/match.h
create mode 100644 ipc/kdbus/message.c
create mode 100644 ipc/kdbus/message.h
create mode 100644 ipc/kdbus/metadata.c
create mode 100644 ipc/kdbus/metadata.h
create mode 100644 ipc/kdbus/names.c
create mode 100644 ipc/kdbus/names.h
create mode 100644 ipc/kdbus/node.c
create mode 100644 ipc/kdbus/node.h
create mode 100644 ipc/kdbus/notify.c
create mode 100644 ipc/kdbus/notify.h
create mode 100644 ipc/kdbus/policy.c
create mode 100644 ipc/kdbus/policy.h
create mode 100644 ipc/kdbus/pool.c
create mode 100644 ipc/kdbus/pool.h
create mode 100644 ipc/kdbus/queue.c
create mode 100644 ipc/kdbus/queue.h
create mode 100644 ipc/kdbus/reply.c
create mode 100644 ipc/kdbus/reply.h
create mode 100644 ipc/kdbus/util.c
create mode 100644 ipc/kdbus/util.h
create mode 100644 samples/kdbus/.gitignore
create mode 100644 samples/kdbus/Makefile
create mode 100644 samples/kdbus/kdbus-api.h
create mode 100644 samples/kdbus/kdbus-workers.c
create mode 100644 tools/testing/selftests/kdbus/.gitignore
create mode 100644 tools/testing/selftests/kdbus/Makefile
create mode 100644 tools/testing/selftests/kdbus/kdbus-enum.c
create mode 100644 tools/testing/selftests/kdbus/kdbus-enum.h
create mode 100644 tools/testing/selftests/kdbus/kdbus-test.c
create mode 100644 tools/testing/selftests/kdbus/kdbus-test.h
create mode 100644 tools/testing/selftests/kdbus/kdbus-util.c
create mode 100644 tools/testing/selftests/kdbus/kdbus-util.h
create mode 100644 tools/testing/selftests/kdbus/test-activator.c
create mode 100644 tools/testing/selftests/kdbus/test-attach-flags.c
create mode 100644 tools/testing/selftests/kdbus/test-benchmark.c
create mode 100644 tools/testing/selftests/kdbus/test-bus.c
create mode 100644 tools/testing/selftests/kdbus/test-chat.c
create mode 100644 tools/testing/selftests/kdbus/test-connection.c
create mode 100644 tools/testing/selftests/kdbus/test-daemon.c
create mode 100644 tools/testing/selftests/kdbus/test-endpoint.c
create mode 100644 tools/testing/selftests/kdbus/test-fd.c
create mode 100644 tools/testing/selftests/kdbus/test-free.c
create mode 100644 tools/testing/selftests/kdbus/test-match.c
create mode 100644 tools/testing/selftests/kdbus/test-message.c
create mode 100644 tools/testing/selftests/kdbus/test-metadata-ns.c
create mode 100644 tools/testing/selftests/kdbus/test-monitor.c
create mode 100644 tools/testing/selftests/kdbus/test-names.c
create mode 100644 tools/testing/selftests/kdbus/test-policy-ns.c
create mode 100644 tools/testing/selftests/kdbus/test-policy-priv.c
create mode 100644 tools/testing/selftests/kdbus/test-policy.c
create mode 100644 tools/testing/selftests/kdbus/test-sync.c
create mode 100644 tools/testing/selftests/kdbus/test-timeout.c
^ permalink raw reply
* Re: [PATCH] capabilities: Ambient capability set V2
From: Christoph Lameter @ 2015-03-09 12:05 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: Andy Lutomirski, Serge Hallyn, Jonathan Corbet, Aaron Jones,
LSM List, linux-kernel@vger.kernel.org, Andrew Morton,
Andrew G. Morgan, Mimi Zohar, Austin S Hemmelgarn, Markku Savela,
Jarkko Sakkinen, Linux API, Michael Kerrisk
In-Reply-To: <20150307213554.GB9833@mail.hallyn.com>
On Sat, 7 Mar 2015, Serge E. Hallyn wrote:
> > The ancestor here is ambient_test and when it is run pI will not be set
> > despite the cap setting.
>
> ambient_test is supposed to set it.
I thought the setcap +i would do it.
So the setcap and setting of the file inheritance bits has no effect on
pI? When the process starts pI is off despite fI being set?
^ permalink raw reply
* Re: [PATCH v1 1/6] eeprom: Add a simple EEPROM framework for eeprom providers
From: Srinivas Kandagatla @ 2015-03-09 7:13 UTC (permalink / raw)
To: Mark Brown
Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Maxime Ripard,
Rob Herring, Pawel Moll, Kumar Gala,
linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA, Stephen Boyd,
andrew-g2DYL2Zd6BY, Arnd Bergmann, Greg Kroah-Hartman
In-Reply-To: <20150307150035.GN28806-GFdadSzt00ze9xe1eoZjHA@public.gmane.org>
On 07/03/15 15:00, Mark Brown wrote:
> On Thu, Mar 05, 2015 at 09:45:41AM +0000, Srinivas Kandagatla wrote:
>
>> +
>> + return eeprom;
>> +}
>> +EXPORT_SYMBOL(eeprom_register);
>
> This framework uses regmap but regmap is EXPORT_SYMBOL_GPL() and this is
> using EXPORT_SYMBOL().
>
Thanks for spotting this, I will fix this in next version.
>> +int eeprom_unregister(struct eeprom_device *eeprom)
>> +{
>> + mutex_lock(&eeprom_mutex);
>> + if (atomic_read(&eeprom->users)) {
>> + mutex_unlock(&eeprom_mutex);
>
> Atomic reads and a mutex - isn't the mutex enough? Atomics are
> generally a recipie for bugs due to the complexity in using them.
Yes, you are right as long as we protect users variable with mutex,
using atomic is really redundant, will fix it in next version.
>
^ permalink raw reply
* Re: [PATCH v2 02/18] ARM: ARMv7M: Enlarge vector table to 256 entries
From: Stefan Agner @ 2015-03-09 0:29 UTC (permalink / raw)
To: Maxime Coquelin
Cc: u.kleine-koenig, afaerber, geert, Rob Herring, Philipp Zabel,
Jonathan Corbet, Pawel Moll, Mark Rutland, Ian Campbell,
Kumar Gala, Russell King, Daniel Lezcano, Thomas Gleixner,
Linus Walleij, Greg Kroah-Hartman, Jiri Slaby, Arnd Bergmann,
Andrew Morton, David S. Miller, Mauro Carvalho Chehab,
Joe Perches, Antti Palosaari, Tejun Heo, Will Deacon, Nikolay
In-Reply-To: <1424455277-29983-3-git-send-email-mcoquelin.stm32@gmail.com>
On 2015-02-20 19:01, Maxime Coquelin wrote:
> From Cortex-M reference manuals, the nvic supports up to 240 interrupts.
> So the number of entries in vectors table is up to 256.
>
> This patch adds a new config flag to specify the number of external interrupts.
> Some ifdeferies are added in order to respect the natural alignment without
> wasting too much space on smaller systems.
>
> Signed-off-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
> ---
> arch/arm/kernel/entry-v7m.S | 13 +++++++++----
> arch/arm/mm/Kconfig | 15 +++++++++++++++
> 2 files changed, 24 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm/kernel/entry-v7m.S b/arch/arm/kernel/entry-v7m.S
> index 8944f49..68cde36 100644
> --- a/arch/arm/kernel/entry-v7m.S
> +++ b/arch/arm/kernel/entry-v7m.S
> @@ -117,9 +117,14 @@ ENTRY(__switch_to)
> ENDPROC(__switch_to)
>
> .data
> - .align 8
> +#if CONFIG_CPUV7M_NUM_IRQ <= 112
> + .align 9
> +#else
> + .align 10
> +#endif
> +
> /*
> - * Vector table (64 words => 256 bytes natural alignment)
> + * Vector table (Natural alignment need to be ensured)
> */
> ENTRY(vector_table)
> .long 0 @ 0 - Reset stack pointer
> @@ -138,6 +143,6 @@ ENTRY(vector_table)
> .long __invalid_entry @ 13 - Reserved
> .long __pendsv_entry @ 14 - PendSV
> .long __invalid_entry @ 15 - SysTick
> - .rept 64 - 16
> - .long __irq_entry @ 16..64 - External Interrupts
> + .rept CONFIG_CPUV7M_NUM_IRQ
> + .long __irq_entry @ External Interrupts
> .endr
> diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
> index c43c714..27eb835 100644
> --- a/arch/arm/mm/Kconfig
> +++ b/arch/arm/mm/Kconfig
> @@ -604,6 +604,21 @@ config CPU_USE_DOMAINS
> This option enables or disables the use of domain switching
> via the set_fs() function.
>
> +config CPUV7M_NUM_IRQ
> + int "Number of external interrupts connected to the NVIC"
> + depends on CPU_V7M
> + default 90 if ARCH_STM32
> + default 38 if ARCH_EFM32
> + default 240
> + help
> + This option indicates the number of interrupts connected to the NVIC.
> + The value can be larger than the real number of interrupts supported
> + by the system, but must not be lower.
> + The default value is 240, corresponding to the maximum number of
> + interrupts supported by the NVIC on Cortex-M family.
> +
> + If unsure, keep default value.
> +
> #
> # CPU supports 36-bit I/O
> #
I sent a patch which extended that vector table some weeks ago:
https://lkml.org/lkml/2014/12/29/296
But your solution is definitely more flexible, and given that we deal
with small devices here, it's worth saving memory.
Acked-by: Stefan Agner <stefan@agner.ch>
^ permalink raw reply
* Re: [PATCH v5 tip 0/7] tracing: attach eBPF programs to kprobes
From: Alexei Starovoitov @ 2015-03-08 0:21 UTC (permalink / raw)
To: Steven Rostedt, Ingo Molnar
Cc: Namhyung Kim, Arnaldo Carvalho de Melo, Jiri Olsa,
Masami Hiramatsu, David S. Miller, Daniel Borkmann,
Peter Zijlstra, Linux API, Network Development, LKML
In-Reply-To: <20150306200938.6a6387c0@gandalf.local.home>
On 3/6/15 5:09 PM, Steven Rostedt wrote:
> On Wed, 4 Mar 2015 15:48:24 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
>
>> On Wed, 4 Mar 2015 21:33:16 +0100
>> Ingo Molnar <mingo@kernel.org> wrote:
>>
>>>
>>> * Alexei Starovoitov <ast@plumgrid.com> wrote:
>>>
>>>> On Sun, Mar 1, 2015 at 3:27 PM, Alexei Starovoitov <ast@plumgrid.com> wrote:
>>>>> Peter, Steven,
>>>>> I think this set addresses everything we've discussed.
>>>>> Please review/ack. Thanks!
>>>>
>>>> icmp echo request
>>>
>>> I'd really like to have an Acked-by from Steve (propagated into the
>>> changelogs) before looking at applying these patches.
>>
>> I'll have to look at this tomorrow. I'm a bit swamped with other things
>> at the moment :-/
>>
>
> Just an update. I started looking at it but then was pulled off to do
> other things. I'll make this a priority next week. Sorry for the delay.
There is no rush. Please let me know if I need to clarify anything.
One thing I just caught which I'm planning to address in the follow on
patch is missing 'recursion check'. Since attaching programs to kprobes
means that root may create loops by adding a kprobe somewhere in
the call chain invoked from bpf program. So far I'm thinking to do
simple stack_trace_call()-like check. I don't think it's a blocker
for this set, but if I'm done coding recursion soon, I'll just
roll it in and respin this set :)
^ permalink raw reply
* Re: [PATCH v0 01/11] stm class: Introduce an abstraction for System Trace Module devices
From: Paul Bolle @ 2015-03-07 22:26 UTC (permalink / raw)
To: Alexander Shishkin
Cc: Greg Kroah-Hartman, linux-kernel, Pratik Patel, mathieu.poirier,
peter.lachner, norbert.schulz, keven.boell, yann.fouassier,
laurent.fert, linux-api
In-Reply-To: <1425728161-164217-2-git-send-email-alexander.shishkin@linux.intel.com>
On Sat, 2015-03-07 at 13:35 +0200, Alexander Shishkin wrote:
> Documentation/ABI/testing/configfs-stp-policy | 44 ++
git am whined about this file when I tried to apply this patch:
Applying: stm class: Introduce an abstraction for System Trace Module devices
[...]/.git/rebase-apply/patch:77: new blank line at EOF.
> Documentation/ABI/testing/sysfs-class-stm | 14 +
> Documentation/ABI/testing/sysfs-class-stm_source | 11 +
> Documentation/trace/stm.txt | 77 +++
> drivers/Kconfig | 2 +
> drivers/Makefile | 1 +
> drivers/stm/Kconfig | 8 +
> drivers/stm/Makefile | 3 +
> drivers/stm/core.c | 839 +++++++++++++++++++++++
> drivers/stm/policy.c | 470 +++++++++++++
> drivers/stm/stm.h | 77 +++
> include/linux/stm.h | 87 +++
> include/uapi/linux/stm.h | 47 ++
> --- /dev/null
> +++ b/drivers/stm/Kconfig
> @@ -0,0 +1,8 @@
> +config STM
> + tristate "System Trace Module devices"
> + help
> + A System Trace Module (STM) is a device exporting data in System
> + Trace Protocol (STP) format as defined by MIPI STP standards.
> + Examples of such devices are Intel Trace Hub and Coresight STM.
> +
> + Say Y here to enable System Trace Module device support.
> diff --git a/drivers/stm/Makefile b/drivers/stm/Makefile
> new file mode 100644
> index 0000000000..adec701649
> --- /dev/null
> +++ b/drivers/stm/Makefile
> @@ -0,0 +1,3 @@
> +obj-$(CONFIG_STM) += stm_core.o
> +
> +stm_core-y := core.o policy.o
I tried to compile this as a module:
$ make -C ../.. M=$PWD CONFIG_STM=m stm_core.ko
make: Entering directory `[...]'
LD [M] [...]/drivers/stm/stm_core.o
[...]/drivers/stm/policy.o: In function `stp_configfs_init':
policy.c:(.text+0x5f0): multiple definition of `init_module'
[...]/drivers/stm/core.o:core.c:(.init.text+0x0): first defined here
make[1]: *** [[...]/drivers/stm/stm_core.o] Error 1
make: *** [stm_core.ko] Error 2
make: Leaving directory `[...]'
I think that's because
postcore_initcall(stm_core_init);
in core.c becomes
module_init(stm_core_init);
if this driver is compiled as a module. And that will clash with
module_init(stp_configfs_init);
in policy.c. Am I missing something obvious or should STM not be a
tristate symbol?
Paul Bolle
^ permalink raw reply
* Re: [PATCH] capabilities: Ambient capability set V2
From: Serge E. Hallyn @ 2015-03-07 21:35 UTC (permalink / raw)
To: Christoph Lameter
Cc: Serge E. Hallyn, Andy Lutomirski, Serge Hallyn, Jonathan Corbet,
Aaron Jones, LSM List,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Andrew Morton, Andrew G. Morgan, Mimi Zohar, Austin S Hemmelgarn,
Markku Savela, Jarkko Sakkinen, Linux API, Michael Kerrisk
In-Reply-To: <alpine.DEB.2.11.1503070907330.15173-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org>
On Sat, Mar 07, 2015 at 09:09:05AM -0600, Christoph Lameter wrote:
> On Fri, 6 Mar 2015, Serge E. Hallyn wrote:
>
> > > I think that's right. fI doesn't set pI.
> >
> > Right. The idea is that for the running binary to get capability x in its
> > pP, its privileged ancestor must have set x in pI, and the binary itself
> > must be trusted with x in fI.
>
> The ancestor here is ambient_test and when it is run pI will not be set
> despite the cap setting.
ambient_test is supposed to set it.
> Therefore anything is spawns cannot have the inheritance bits set either.
> This plainly does not make any sense whatsoever. If this is so as it seems
> to be then we should be able to remove the inheritance bits because they
> have no effect.
>
^ permalink raw reply
* Re: [PATCH] capabilities: Ambient capability set V2
From: Serge E. Hallyn @ 2015-03-07 21:35 UTC (permalink / raw)
To: Christoph Lameter
Cc: Andy Lutomirski, Serge E. Hallyn, Serge Hallyn, Jonathan Corbet,
Aaron Jones, LSM List, linux-kernel@vger.kernel.org,
Andrew Morton, Andrew G. Morgan, Mimi Zohar, Austin S Hemmelgarn,
Markku Savela, Jarkko Sakkinen, Linux API, Michael Kerrisk
In-Reply-To: <alpine.DEB.2.11.1503070906130.15173@gentwo.org>
On Sat, Mar 07, 2015 at 09:06:46AM -0600, Christoph Lameter wrote:
> On Fri, 6 Mar 2015, Andy Lutomirski wrote:
>
> > > christoph@fujitsu-haswell:~$ getcap ambient_test
> > >
> > > ambient_test = cap_setpcap,cap_net_admin,cap_net_raw,cap_sys_nice+eip
> >
> > I think that's right. fI doesn't set pI.
>
> Ok then that is the point of pI if it cannot be set?
It can be set! Anything with CAP_SETPCAP can fill it's pI. When
it and its children exec(), pI' = pI.
^ permalink raw reply
* Re: [PATCH] capabilities: Ambient capability set V2
From: Christoph Lameter @ 2015-03-07 15:09 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: Andy Lutomirski, Serge Hallyn, Jonathan Corbet, Aaron Jones,
LSM List, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Andrew Morton, Andrew G. Morgan, Mimi Zohar, Austin S Hemmelgarn,
Markku Savela, Jarkko Sakkinen, Linux API, Michael Kerrisk
In-Reply-To: <20150306200838.GA29198-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
On Fri, 6 Mar 2015, Serge E. Hallyn wrote:
> > I think that's right. fI doesn't set pI.
>
> Right. The idea is that for the running binary to get capability x in its
> pP, its privileged ancestor must have set x in pI, and the binary itself
> must be trusted with x in fI.
The ancestor here is ambient_test and when it is run pI will not be set
despite the cap setting.
Therefore anything is spawns cannot have the inheritance bits set either.
This plainly does not make any sense whatsoever. If this is so as it seems
to be then we should be able to remove the inheritance bits because they
have no effect.
^ permalink raw reply
* Re: [PATCH] capabilities: Ambient capability set V2
From: Christoph Lameter @ 2015-03-07 15:06 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Serge E. Hallyn, Serge Hallyn, Jonathan Corbet, Aaron Jones,
LSM List, linux-kernel@vger.kernel.org, Andrew Morton,
Andrew G. Morgan, Mimi Zohar, Austin S Hemmelgarn, Markku Savela,
Jarkko Sakkinen, Linux API, Michael Kerrisk
In-Reply-To: <CALCETrVQF22rkZFD8VAW_xrVQOjwpej6W4TJS9gbN9B431TEKg@mail.gmail.com>
On Fri, 6 Mar 2015, Andy Lutomirski wrote:
> > christoph@fujitsu-haswell:~$ getcap ambient_test
> >
> > ambient_test = cap_setpcap,cap_net_admin,cap_net_raw,cap_sys_nice+eip
>
> I think that's right. fI doesn't set pI.
Ok then that is the point of pI if it cannot be set?
^ permalink raw reply
* Re: [PATCH v1 1/6] eeprom: Add a simple EEPROM framework for eeprom providers
From: Mark Brown @ 2015-03-07 15:00 UTC (permalink / raw)
To: Srinivas Kandagatla
Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Maxime Ripard,
Rob Herring, Pawel Moll, Kumar Gala,
linux-api-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA, Stephen Boyd,
andrew-g2DYL2Zd6BY, Arnd Bergmann, Greg Kroah-Hartman
In-Reply-To: <1425548741-12930-1-git-send-email-srinivas.kandagatla-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
[-- Attachment #1: Type: text/plain, Size: 526 bytes --]
On Thu, Mar 05, 2015 at 09:45:41AM +0000, Srinivas Kandagatla wrote:
> +
> + return eeprom;
> +}
> +EXPORT_SYMBOL(eeprom_register);
This framework uses regmap but regmap is EXPORT_SYMBOL_GPL() and this is
using EXPORT_SYMBOL().
> +int eeprom_unregister(struct eeprom_device *eeprom)
> +{
> + mutex_lock(&eeprom_mutex);
> + if (atomic_read(&eeprom->users)) {
> + mutex_unlock(&eeprom_mutex);
Atomic reads and a mutex - isn't the mutex enough? Atomics are
generally a recipie for bugs due to the complexity in using them.
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]
^ permalink raw reply
* Re: [PATCH v3 0/3] epoll: introduce round robin wakeup mode
From: Jason Baron @ 2015-03-07 12:35 UTC (permalink / raw)
To: Ingo Molnar
Cc: Andrew Morton, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
mingo-H+wXaHxf7aLQT0dZR+AlfA,
viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn, normalperson-rMlxZR9MS24,
davidel-AhlLAIvw+VEjIGhXcJzhZg,
mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, luto-kltTT9wpgjJwATOyAt5JVQ,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
linux-api-u79uwXL29TY76Z2rM5mHXA, Linus Torvalds, Alexander Viro
In-Reply-To: <20150305091517.GA25158-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
On 03/05/2015 04:15 AM, Ingo Molnar wrote:
> * Jason Baron <jbaron-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org> wrote:
>
>> 2) We are using the wakeup in this case to 'assign' work more
>> permanently to the thread. That is, in the case of a listen socket
>> we then add the connected socket to the woken up threads local set
>> of epoll events. So the load persists past the wake up. And in this
>> case, doing the round robin wakeups, simply allows us to access more
>> cpu bandwidth. (I'm also looking into potentially using cpu affinity
>> to do the wakeups as well as you suggested.)
> So this is the part that I still don't understand.
Here's maybe another way to frame this. Epoll sets add
a waiter on the wait queue in a fixed order when epoll sets
are added (via EPOLL_CTL_ADD). This order does not change
modulo adds/dels which are usually not common. So if
we don't want to wake all threads, when say an interrupt
occurs at some random point, we can either:
1) Walk the list, wake up the first epoll set that has idle
threads (queued via epoll_wait()) and return.
or:
2) Walk the list and wake up the first epoll set that has idle
threads, but then 'rotate' or move this epoll set to the tail
of the queue before returning.
So because the epoll sets are in a fixed order there is
an extreme bias to pick the same epoll sets over and over
regardless of the order in which threads return to wait
via (epoll_wait()). So I think the rotate makes sense for
the case where I am trying to assign work to threads that
may persist past the wake up point, and for cases where
the threads can finish all their work before returning
back to epoll_wait().
Thanks,
-Jason
^ permalink raw reply
* [PATCH v0 01/11] stm class: Introduce an abstraction for System Trace Module devices
From: Alexander Shishkin @ 2015-03-07 11:35 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: linux-kernel, Pratik Patel, mathieu.poirier, peter.lachner,
norbert.schulz, keven.boell, yann.fouassier, laurent.fert,
Alexander Shishkin, linux-api
In-Reply-To: <1425728161-164217-1-git-send-email-alexander.shishkin@linux.intel.com>
A System Trace Module (STM) is a device exporting data in System Trace
Protocol (STP) format as defined by MIPI STP standards. Examples of such
devices are Intel Trace Hub and Coresight STM.
This abstraction provides a unified interface for software trace sources
to send their data over an STM device to a debug host. In order to do
that, such a trace source needs to be assigned a pair of master/channel
identifiers that all the data from this source will be tagged with. The
STP decoder on the debug host side will use these master/channel tags to
distinguish different trace streams from one another inside one STP
stream.
This abstraction provides a configfs-based policy management mechanism
for dynamic allocation of these master/channel pairs based on trace
source-supplied string identifier. It has the flexibility of being
defined at runtime and at the same time (provided that the policy
definition is aligned with the decoding end) consistency.
For userspace trace sources, this abstraction provides write()-based and
mmap()-based (if the underlying stm device allows this) output mechanism.
For kernel-side trace sources, we provide "stm_source" device class that
can be connected to an stm device at run time.
Cc: linux-api@vger.kernel.org
Cc: Pratik Patel <pratikp@codeaurora.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
Documentation/ABI/testing/configfs-stp-policy | 44 ++
Documentation/ABI/testing/sysfs-class-stm | 14 +
Documentation/ABI/testing/sysfs-class-stm_source | 11 +
Documentation/trace/stm.txt | 77 +++
drivers/Kconfig | 2 +
drivers/Makefile | 1 +
drivers/stm/Kconfig | 8 +
drivers/stm/Makefile | 3 +
drivers/stm/core.c | 839 +++++++++++++++++++++++
drivers/stm/policy.c | 470 +++++++++++++
drivers/stm/stm.h | 77 +++
include/linux/stm.h | 87 +++
include/uapi/linux/stm.h | 47 ++
13 files changed, 1680 insertions(+)
create mode 100644 Documentation/ABI/testing/configfs-stp-policy
create mode 100644 Documentation/ABI/testing/sysfs-class-stm
create mode 100644 Documentation/ABI/testing/sysfs-class-stm_source
create mode 100644 Documentation/trace/stm.txt
create mode 100644 drivers/stm/Kconfig
create mode 100644 drivers/stm/Makefile
create mode 100644 drivers/stm/core.c
create mode 100644 drivers/stm/policy.c
create mode 100644 drivers/stm/stm.h
create mode 100644 include/linux/stm.h
create mode 100644 include/uapi/linux/stm.h
diff --git a/Documentation/ABI/testing/configfs-stp-policy b/Documentation/ABI/testing/configfs-stp-policy
new file mode 100644
index 0000000000..1c7ab3dbcd
--- /dev/null
+++ b/Documentation/ABI/testing/configfs-stp-policy
@@ -0,0 +1,44 @@
+What: /config/stp-policy
+Date: Jan 2015
+KernelVersion: 3.20
+Description:
+ This group contains policies mandating Master/Channel allocation
+ for software sources wishing to send trace data over an STM
+ device.
+
+What: /config/stp-policy/<policy>
+Date: Jan 2015
+KernelVersion: 3.20
+Description:
+ Root of a policy. This name is an arbitrary string.
+
+What: /config/stp-policy/<policy>/device
+Date: Jan 2015
+KernelVersion: 3.20
+Description:
+ STM device to which this policy applies. Write a valid stm class
+ device name here to assign this policy to that device.
+
+What: /config/stp-policy/<policy>/<node>
+Date: Jan 2015
+KernelVersion: 3.20
+Description:
+ Policy node is a string identifier that software clients will
+ use to request a master/channel to be allocated and assigned to
+ them.
+
+What: /config/stp-policy/<policy>/<node>/masters
+Date: Jan 2015
+KernelVersion: 3.20
+Description:
+ Range of masters from which to allocate for users of this node.
+ Write two numbers: the first master and the last master number.
+
+What: /config/stp-policy/<policy>/<node>/channels
+Date: Jan 2015
+KernelVersion: 3.20
+Description:
+ Range of channels from which to allocate for users of this node.
+ Write two numbers: the first channel and the last channel
+ number.
+
diff --git a/Documentation/ABI/testing/sysfs-class-stm b/Documentation/ABI/testing/sysfs-class-stm
new file mode 100644
index 0000000000..186f8e66e1
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-stm
@@ -0,0 +1,14 @@
+What: /sys/class/stm/<stm>/masters
+Date: Jan 2015
+KernelVersion: 3.20
+Contact: Alexander Shishkin <alexander.shishkin@linux.intel.com>
+Description:
+ Shows first and last available to software master numbers on
+ this STM device.
+
+What: /sys/class/stm/<stm>/channels
+Date: Jan 2015
+KernelVersion: 3.20
+Contact: Alexander Shishkin <alexander.shishkin@linux.intel.com>
+Description:
+ Shows the number of channels per master on this STM device.
diff --git a/Documentation/ABI/testing/sysfs-class-stm_source b/Documentation/ABI/testing/sysfs-class-stm_source
new file mode 100644
index 0000000000..735b0d657f
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-stm_source
@@ -0,0 +1,11 @@
+What: /sys/class/stm_source/<stm_source>/stm_source_link
+Date: Jan 2015
+KernelVersion: 3.20
+Contact: Alexander Shishkin <alexander.shishkin@linux.intel.com>
+Description:
+ stm_source device linkage to stm device, where its tracing data
+ is directed. Reads return an existing connection or "<none>" if
+ this stm_source is not connected to any stm device yet.
+ Write an existing (registered) stm device's name here to
+ connect that device. If a device is already connected to this
+ stm_source, it will first be disconnected.
diff --git a/Documentation/trace/stm.txt b/Documentation/trace/stm.txt
new file mode 100644
index 0000000000..0ba5c9115c
--- /dev/null
+++ b/Documentation/trace/stm.txt
@@ -0,0 +1,77 @@
+System Trace Module
+===================
+
+System Trace Module (STM) is a device described in MIPI STP specs as
+STP trace stream generator. STP (System Trace Protocol) is a trace
+protocol multiplexing data from multiple trace sources, each one of
+which is assigned a unique pair of master and channel. While some of
+these masters and channels are statically allocated to certain
+hardware trace sources, others are available to software. Software
+trace sources are usually free to pick for themselves any
+master/channel combination from this pool.
+
+On the receiving end of this STP stream (the decoder side), trace
+sources can only be identified by master/channel combination, so in
+order for the decoder to be able to make sense of the trace that
+involves multiple trace sources, it needs to be able to map those
+master/channel pairs to the trace sources that it understands.
+
+For instance, it is helpful to know that syslog messages come on
+master 7 channel 15, while arbitrary user applications can use masters
+48 to 63 and channels 0 to 127.
+
+To solve this mapping problem, stm class provides a policy management
+mechanism via configfs, that allows defining rules that map string
+identifiers to ranges of masters and channels. If these rules (policy)
+are consistent with what decoder expects, it will be able to properly
+process the trace data.
+
+This policy is a tree structure containing rules (policy_node) that
+have a name (string identifier) and a range of masters and channels
+associated with it, located in "stp-policy" subsystem directory in
+configfs. From the examle above, a rule may look like this:
+
+$ ls /config/stp-policy/my-policy/user
+channels masters
+$ cat /config/stp-policy/my-policy/user/masters
+48 63
+$ cat /config/stp-policy/my-policy/user/channels
+0 127
+
+which means that the master allocation pool for this rule consists of
+masters 48 through 63 and channel allocation pool has channels 0
+through 127 in it. Now, any producer (trace source) identifying itself
+with "user" identification string will be allocated a master and
+channel from within these ranges.
+
+These rules can be nested, for example, one can define a rule "dummy"
+under "user" directory from the example above and this new rule will
+be used for trace sources with the id string of "user/dummy".
+
+Trace sources have to open the stm class device's node and write their
+trace data into its file descriptor. In order to identify themselves
+to the policy, they need to do a STP_POLICY_ID_SET ioctl on this file
+descriptor providing their id string. Otherwise, they will be
+automatically allocated a master/channel pair upon first write to this
+file descriptor according to the "default" rule of the policy, if such
+exists.
+
+Some STM devices may allow direct mapping of the channel mmio regions
+to userspace for zero-copy writing. One mappable page (in terms of
+mmu) will usually contain multiple channels' mmios, so the user will
+need to allocate that many channels to themselves (via the
+aforementioned ioctl() call) to be able to do this. That is, if your
+stm device's channel mmio region is 64 bytes and hardware page size is
+4096 bytes, after a successful STP_POLICY_ID_SET ioctl() call with
+width==64, you should be able to mmap() one page on this file
+descriptor and obtain direct access to an mmio region for 64 channels.
+
+For kernel-based trace sources, there is "stm_source" device
+class. Devices of this class can be connected and disconnected to/from
+stm devices at runtime via a sysfs attribute.
+
+Examples of STM devices are Intel Trace Hub [1] and Coresight STM
+[2].
+
+[1] https://software.intel.com/sites/default/files/managed/d3/3c/intel-th-developer-manual.pdf
+[2] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0444b/index.html
diff --git a/drivers/Kconfig b/drivers/Kconfig
index c0cc96bab9..7bc80670bb 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -182,4 +182,6 @@ source "drivers/thunderbolt/Kconfig"
source "drivers/android/Kconfig"
+source "drivers/stm/Kconfig"
+
endmenu
diff --git a/drivers/Makefile b/drivers/Makefile
index 527a6da8d5..2d511b411a 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -165,3 +165,4 @@ obj-$(CONFIG_RAS) += ras/
obj-$(CONFIG_THUNDERBOLT) += thunderbolt/
obj-$(CONFIG_CORESIGHT) += coresight/
obj-$(CONFIG_ANDROID) += android/
+obj-$(CONFIG_STM) += stm/
diff --git a/drivers/stm/Kconfig b/drivers/stm/Kconfig
new file mode 100644
index 0000000000..90ed327461
--- /dev/null
+++ b/drivers/stm/Kconfig
@@ -0,0 +1,8 @@
+config STM
+ tristate "System Trace Module devices"
+ help
+ A System Trace Module (STM) is a device exporting data in System
+ Trace Protocol (STP) format as defined by MIPI STP standards.
+ Examples of such devices are Intel Trace Hub and Coresight STM.
+
+ Say Y here to enable System Trace Module device support.
diff --git a/drivers/stm/Makefile b/drivers/stm/Makefile
new file mode 100644
index 0000000000..adec701649
--- /dev/null
+++ b/drivers/stm/Makefile
@@ -0,0 +1,3 @@
+obj-$(CONFIG_STM) += stm_core.o
+
+stm_core-y := core.o policy.o
diff --git a/drivers/stm/core.c b/drivers/stm/core.c
new file mode 100644
index 0000000000..ba0ce55b0a
--- /dev/null
+++ b/drivers/stm/core.c
@@ -0,0 +1,839 @@
+/*
+ * System Trace Module (STM) infrastructure
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * STM class implements generic infrastructure for System Trace Module devices
+ * as defined in MIPI STPv2 specification.
+ */
+
+#include <linux/uaccess.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/device.h>
+#include <linux/compat.h>
+#include <linux/kdev_t.h>
+#include <linux/srcu.h>
+#include <linux/slab.h>
+#include <linux/stm.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include "stm.h"
+
+#include <uapi/linux/stm.h>
+
+static unsigned int stm_core_up;
+
+/*
+ * The SRCU here makes sure that STM device doesn't disappear from under a
+ * stm_source_write() caller, which may want to have as little overhead as
+ * possible.
+ */
+static struct srcu_struct stm_source_srcu;
+
+static ssize_t masters_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct stm_device *stm = dev_get_drvdata(dev);
+ int ret;
+
+ ret = sprintf(buf, "%u %u\n", stm->data->sw_start, stm->data->sw_end);
+
+ return ret;
+}
+
+static DEVICE_ATTR_RO(masters);
+
+static ssize_t channels_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct stm_device *stm = dev_get_drvdata(dev);
+ int ret;
+
+ ret = sprintf(buf, "%u\n", stm->data->sw_nchannels);
+
+ return ret;
+}
+
+static DEVICE_ATTR_RO(channels);
+
+static struct attribute *stm_attrs[] = {
+ &dev_attr_masters.attr,
+ &dev_attr_channels.attr,
+ NULL,
+};
+
+static const struct attribute_group stm_group = {
+ .attrs = stm_attrs,
+};
+
+static const struct attribute_group *stm_groups[] = {
+ &stm_group,
+ NULL,
+};
+
+static struct class stm_class = {
+ .name = "stm",
+ .dev_groups = stm_groups,
+};
+
+static int stm_dev_match(struct device *dev, const void *data)
+{
+ const char *name = data;
+
+ return sysfs_streq(name, dev_name(dev));
+}
+
+/**
+ * stm_find_device() - find stm device by name
+ * @buf: character buffer containing the name
+ * @len: length of the name in @buf
+ *
+ * This is called from attributes' store methods, so it will
+ * also trim the trailing newline if necessary.
+ *
+ * Return: device pointer or null if lookup failed.
+ */
+struct device *stm_find_device(const char *buf, size_t len)
+{
+ if (!stm_core_up)
+ return NULL;
+
+ return class_find_device(&stm_class, NULL, buf, stm_dev_match);
+}
+
+#define __stm_master(_s, _m) \
+ ((_s)->masters[(_m) - (_s)->data->sw_start])
+
+static inline struct stp_master *
+stm_master(struct stm_device *stm, unsigned int idx)
+{
+ if (idx < stm->data->sw_start || idx > stm->data->sw_end)
+ return NULL;
+
+ return __stm_master(stm, idx);
+}
+
+static int stp_master_alloc(struct stm_device *stm, unsigned int idx)
+{
+ struct stp_master *master;
+ size_t size;
+
+ size = ALIGN(stm->data->sw_nchannels, 8) / 8;
+ size += sizeof(struct stp_master);
+ master = kzalloc(size, GFP_ATOMIC);
+ if (!master)
+ return -ENOMEM;
+
+ master->nr_free = stm->data->sw_nchannels;
+ __stm_master(stm, idx) = master;
+
+ return 0;
+}
+
+static void stp_master_free(struct stm_device *stm, unsigned int idx)
+{
+ struct stp_master *master = stm_master(stm, idx);
+
+ if (!master)
+ return;
+
+ __stm_master(stm, idx) = NULL;
+ kfree(master);
+}
+
+static void stm_output_claim(struct stm_device *stm, struct stm_output *output)
+{
+ struct stp_master *master = stm_master(stm, output->master);
+
+ if (WARN_ON_ONCE(master->nr_free < output->nr_chans))
+ return;
+
+ bitmap_allocate_region(&master->chan_map[0], output->channel,
+ ilog2(output->nr_chans));
+
+ master->nr_free -= output->nr_chans;
+}
+
+static void
+stm_output_disclaim(struct stm_device *stm, struct stm_output *output)
+{
+ struct stp_master *master = stm_master(stm, output->master);
+
+ bitmap_release_region(&master->chan_map[0], output->channel,
+ ilog2(output->nr_chans));
+
+ master->nr_free += output->nr_chans;
+}
+
+/*
+ * This is like bitmap_find_free_region(), except it can ignore @start bits
+ * at the beginning.
+ */
+static int find_free_channels(unsigned long *bitmap, unsigned int start,
+ unsigned int end, unsigned int width)
+{
+ unsigned int pos;
+ int i;
+
+ for (pos = start; pos < end + 1; pos = ALIGN(pos, width)) {
+ pos = find_next_zero_bit(bitmap, end + 1, pos);
+ if (pos + width > end + 1)
+ break;
+
+ if (pos & (width - 1))
+ continue;
+
+ for (i = 1; i < width && !test_bit(pos + i, bitmap); i++)
+ ;
+ if (i == width)
+ return pos;
+ }
+
+ return -1;
+}
+
+static unsigned int
+stm_find_master_chan(struct stm_device *stm, unsigned int width,
+ unsigned int *mstart, unsigned int mend,
+ unsigned int *cstart, unsigned int cend)
+{
+ struct stp_master *master;
+ unsigned int midx;
+ int pos, err;
+
+ for (midx = *mstart; midx <= mend; midx++) {
+ if (!stm_master(stm, midx)) {
+ err = stp_master_alloc(stm, midx);
+ if (err)
+ return err;
+ }
+
+ master = stm_master(stm, midx);
+
+ if (!master->nr_free)
+ continue;
+
+ pos = find_free_channels(master->chan_map, *cstart, cend,
+ width);
+ if (pos < 0)
+ continue;
+
+ *mstart = midx;
+ *cstart = pos;
+ return 0;
+ }
+
+ return -ENOSPC;
+}
+
+static int stm_output_assign(struct stm_device *stm, unsigned int width,
+ struct stp_policy_node *policy_node,
+ struct stm_output *output)
+{
+ unsigned int midx, cidx, mend, cend;
+ int ret = -EBUSY;
+
+ if (width > stm->data->sw_nchannels)
+ return -EINVAL;
+
+ if (policy_node) {
+ stp_policy_node_get_ranges(policy_node,
+ &midx, &mend, &cidx, &cend);
+ } else {
+ midx = stm->data->sw_start;
+ cidx = 0;
+ mend = stm->data->sw_end;
+ cend = stm->data->sw_nchannels - 1;
+ }
+
+ spin_lock(&stm->mc_lock);
+ if (output->nr_chans)
+ goto unlock;
+
+ ret = stm_find_master_chan(stm, width, &midx, mend, &cidx, cend);
+ if (ret)
+ goto unlock;
+
+ output->master = midx;
+ output->channel = cidx;
+ output->nr_chans = width;
+ stm_output_claim(stm, output);
+ dev_dbg(stm->dev, "assigned %u:%u (+%u)\n", midx, cidx, width);
+
+ ret = 0;
+unlock:
+ spin_unlock(&stm->mc_lock);
+
+ return ret;
+}
+
+static void stm_output_free(struct stm_device *stm, struct stm_output *output)
+{
+ spin_lock(&stm->mc_lock);
+ if (output->nr_chans)
+ stm_output_disclaim(stm, output);
+ spin_unlock(&stm->mc_lock);
+}
+
+static int major_match(struct device *dev, const void *data)
+{
+ unsigned int major = *(unsigned int *)data;
+
+ return MAJOR(dev->devt) == major;
+}
+
+static int stm_char_open(struct inode *inode, struct file *file)
+{
+ struct stm_file *stmf;
+ struct device *dev;
+ unsigned int major = imajor(inode);
+ int err = -ENODEV;
+
+ dev = class_find_device(&stm_class, NULL, &major, major_match);
+ if (!dev)
+ return -ENODEV;
+
+ stmf = kzalloc(sizeof(*stmf), GFP_KERNEL);
+ if (!stmf)
+ return -ENOMEM;
+
+ stmf->stm = dev_get_drvdata(dev);
+
+ if (!try_module_get(stmf->stm->owner))
+ goto err_free;
+
+ file->private_data = stmf;
+
+ return nonseekable_open(inode, file);
+
+err_free:
+ kfree(stmf);
+
+ return err;
+}
+
+static int stm_char_release(struct inode *inode, struct file *file)
+{
+ struct stm_file *stmf = file->private_data;
+
+ stm_output_free(stmf->stm, &stmf->output);
+ module_put(stmf->stm->owner);
+ kfree(stmf);
+
+ return 0;
+}
+
+static int stm_file_assign(struct stm_file *stmf, char *id, unsigned int width)
+{
+ struct stm_device *stm = stmf->stm;
+ int ret;
+
+ mutex_lock(&stm->policy_mutex);
+ if (stm->policy)
+ stmf->policy_node = stp_policy_node_lookup(stm->policy, id);
+
+ ret = stm_output_assign(stm, width, stmf->policy_node, &stmf->output);
+ mutex_unlock(&stm->policy_mutex);
+
+ return ret;
+}
+
+static ssize_t stm_char_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct stm_file *stmf = file->private_data;
+ struct stm_device *stm = stmf->stm;
+ char *kbuf;
+ int err;
+
+ /*
+ * if no m/c have been assigned to this writer up to this
+ * point, use "default" policy entry
+ */
+ if (!stmf->output.nr_chans) {
+ err = stm_file_assign(stmf, "default", 1);
+ /*
+ * EBUSY means that somebody else just assigned this
+ * output, which is just fine for write()
+ */
+ if (err && err != -EBUSY)
+ return err;
+ }
+
+ kbuf = kmalloc(count + 1, GFP_KERNEL);
+ if (!kbuf)
+ return -ENOMEM;
+
+ err = copy_from_user(kbuf, buf, count);
+ if (err) {
+ kfree(kbuf);
+ return -EFAULT;
+ }
+
+ stm->data->write(stm->data, stmf->output.master,
+ stmf->output.channel, kbuf, count);
+
+
+ kfree(kbuf);
+
+ return count;
+}
+
+static int stm_char_mmap(struct file *file, struct vm_area_struct *vma)
+{
+ struct stm_file *stmf = file->private_data;
+ struct stm_device *stm = stmf->stm;
+ unsigned long size, phys;
+
+ if (!stm->data->mmio_addr)
+ return -EOPNOTSUPP;
+
+ if (vma->vm_pgoff)
+ return -EINVAL;
+
+ size = vma->vm_end - vma->vm_start;
+
+ if (stmf->output.nr_chans * stm->data->sw_mmiosz != size)
+ return -EINVAL;
+
+ phys = stm->data->mmio_addr(stm->data, stmf->output.master,
+ stmf->output.channel,
+ stmf->output.nr_chans);
+
+ if (!phys)
+ return -EINVAL;
+
+ vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+ vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
+ vm_iomap_memory(vma, phys, size);
+
+ return 0;
+}
+
+static int stm_char_policy_set_ioctl(struct stm_file *stmf, void __user *arg)
+{
+ struct stm_device *stm = stmf->stm;
+ struct stp_policy_id *id;
+ int ret = -EFAULT;
+ u32 size;
+
+ if (stmf->output.nr_chans)
+ return -EBUSY;
+
+ if (copy_from_user(&size, arg, sizeof(size)))
+ return -EFAULT;
+
+ if (size >= PATH_MAX + sizeof(*id))
+ return -EINVAL;
+
+ id = kzalloc(size + 1, GFP_KERNEL);
+ if (!id)
+ return -ENOMEM;
+
+ if (copy_from_user(id, arg, size))
+ goto err_free;
+
+ if (id->__reserved_0 || id->__reserved_1)
+ return -EINVAL;
+
+ if (id->width < 1 ||
+ id->width > PAGE_SIZE / stm->data->sw_mmiosz) {
+ ret = -EINVAL;
+ goto err_free;
+ }
+
+ ret = stm_file_assign(stmf, id->id, id->width);
+ if (ret)
+ goto err_free;
+
+ if (stm->data->link)
+ stm->data->link(stm->data, stmf->output.master,
+ stmf->output.channel);
+
+ ret = 0;
+
+err_free:
+ kfree(id);
+
+ return ret;
+}
+
+static int stm_char_policy_get_ioctl(struct stm_file *stmf, void __user *arg)
+{
+ struct stp_policy_id id = {
+ .size = sizeof(id),
+ .master = stmf->output.master,
+ .channel = stmf->output.channel,
+ .width = stmf->output.nr_chans,
+ .__reserved_0 = 0,
+ .__reserved_1 = 0,
+ };
+
+ return copy_to_user(arg, &id, id.size) ? -EFAULT : 0;
+}
+
+static long
+stm_char_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+ struct stm_file *stmf = file->private_data;
+ int err;
+
+ switch (cmd) {
+ case STP_POLICY_ID_SET:
+ err = stm_char_policy_set_ioctl(stmf, (void __user *)arg);
+ if (err)
+ return err;
+
+ return stm_char_policy_get_ioctl(stmf, (void __user *)arg);
+
+ case STP_POLICY_ID_GET:
+ return stm_char_policy_get_ioctl(stmf, (void __user *)arg);
+
+ default:
+ return -ENOTTY;
+ }
+
+ return 0;
+}
+
+#ifdef CONFIG_COMPAT
+static long
+stm_char_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+ return stm_char_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
+}
+#else
+#define stm_char_compat_ioctl NULL
+#endif
+
+static const struct file_operations stm_fops = {
+ .open = stm_char_open,
+ .release = stm_char_release,
+ .write = stm_char_write,
+ .mmap = stm_char_mmap,
+ .unlocked_ioctl = stm_char_ioctl,
+ .compat_ioctl = stm_char_compat_ioctl,
+ .llseek = no_llseek,
+};
+
+int stm_register_device(struct device *parent, struct stm_data *stm_data,
+ struct module *owner)
+{
+ struct stm_device *stm;
+ struct device *dev;
+ unsigned int nmasters;
+ int err = -ENOMEM;
+
+ if (!stm_core_up)
+ return -EPROBE_DEFER;
+
+ if (!stm_data->write || !stm_data->sw_nchannels)
+ return -EINVAL;
+
+ nmasters = stm_data->sw_end - stm_data->sw_start;
+ stm = kzalloc(sizeof(*stm) + nmasters * sizeof(void *), GFP_KERNEL);
+ if (!stm)
+ return -ENOMEM;
+
+ stm->major = register_chrdev(0, stm_data->name, &stm_fops);
+ if (stm->major < 0)
+ goto err_free;
+
+ dev = device_create(&stm_class, parent, MKDEV(stm->major, 0), NULL,
+ "%s", stm_data->name);
+ if (IS_ERR(dev)) {
+ err = PTR_ERR(dev);
+ goto err_device;
+ }
+
+ spin_lock_init(&stm->link_lock);
+ INIT_LIST_HEAD(&stm->link_list);
+
+ spin_lock_init(&stm->mc_lock);
+ mutex_init(&stm->policy_mutex);
+ stm->sw_nmasters = nmasters;
+ stm->owner = owner;
+ stm->data = stm_data;
+ stm->dev = dev;
+ stm_data->stm = stm;
+
+ dev_set_drvdata(dev, stm);
+
+ return 0;
+
+err_device:
+ device_unregister(dev);
+err_free:
+ kfree(stm);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(stm_register_device);
+
+static void stm_source_link_drop(struct stm_source_device *src);
+
+void stm_unregister_device(struct stm_data *stm_data)
+{
+ struct stm_device *stm = stm_data->stm;
+ struct stm_source_device *src, *iter;
+ int i;
+
+ spin_lock(&stm->link_lock);
+ list_for_each_entry_safe(src, iter, &stm->link_list, link_entry) {
+ stm_source_link_drop(src);
+ }
+ spin_unlock(&stm->link_lock);
+
+ synchronize_srcu(&stm_source_srcu);
+
+ unregister_chrdev(stm->major, stm_data->name);
+
+ if (stm->policy)
+ stp_policy_unbind(stm->policy);
+
+ for (i = 0; i < stm->sw_nmasters; i++)
+ stp_master_free(stm, i);
+
+ device_unregister(stm->dev);
+ kfree(stm);
+ stm_data->stm = NULL;
+}
+EXPORT_SYMBOL_GPL(stm_unregister_device);
+
+static int stm_source_link_add(struct stm_source_device *src,
+ struct stm_device *stm)
+{
+ int err;
+
+ spin_lock(&stm->link_lock);
+ spin_lock(&src->link_lock);
+
+ /* src->link is dereferenced under stm_source_srcu but not the list */
+ rcu_assign_pointer(src->link, stm);
+ list_add_tail(&src->link_entry, &stm->link_list);
+
+ spin_unlock(&src->link_lock);
+ spin_unlock(&stm->link_lock);
+
+ if (stm->policy) {
+ char *id = kstrdup(src->data->name, GFP_KERNEL);
+
+ if (id) {
+ src->policy_node =
+ stp_policy_node_lookup(stm->policy, id);
+
+ kfree(id);
+ }
+ }
+
+ err = stm_output_assign(stm, src->data->nr_chans,
+ src->policy_node, &src->output);
+ if (err)
+ return err;
+
+ if (stm->data->link)
+ stm->data->link(stm->data, src->output.master,
+ src->output.channel);
+ if (src->data->link)
+ src->data->link(src->data);
+
+ return 0;
+}
+
+static void stm_source_link_drop(struct stm_source_device *src)
+{
+ int idx = srcu_read_lock(&stm_source_srcu);
+
+ if (src->link && src->data->unlink)
+ src->data->unlink(src->data);
+
+ srcu_read_unlock(&stm_source_srcu, idx);
+
+ spin_lock(&src->link_lock);
+ if (src->link) {
+ stm_output_free(src->link, &src->output);
+ list_del_init(&src->link_entry);
+ rcu_assign_pointer(src->link, NULL);
+ }
+ spin_unlock(&src->link_lock);
+}
+
+static ssize_t stm_source_link_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct stm_source_device *src = dev_get_drvdata(dev);
+ int idx, ret;
+
+ idx = srcu_read_lock(&stm_source_srcu);
+ ret = sprintf(buf, "%s\n",
+ src->link ? dev_name(src->link->dev) : "<none>");
+ srcu_read_unlock(&stm_source_srcu, idx);
+
+ return ret;
+}
+
+static ssize_t stm_source_link_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct stm_source_device *src = dev_get_drvdata(dev);
+ struct stm_device *link;
+ struct device *linkdev;
+ int err;
+
+ stm_source_link_drop(src);
+
+ linkdev = stm_find_device(buf, count);
+ if (!linkdev)
+ return -EINVAL;
+
+ link = dev_get_drvdata(linkdev);
+
+ err = stm_source_link_add(src, link);
+
+ return err ? : count;
+}
+
+static DEVICE_ATTR_RW(stm_source_link);
+
+static struct attribute *stm_source_attrs[] = {
+ &dev_attr_stm_source_link.attr,
+ NULL,
+};
+
+static const struct attribute_group stm_source_group = {
+ .attrs = stm_source_attrs,
+};
+
+static const struct attribute_group *stm_source_groups[] = {
+ &stm_source_group,
+ NULL,
+};
+
+static struct class stm_source_class = {
+ .name = "stm_source",
+ .dev_groups = stm_source_groups,
+};
+
+/**
+ * stm_source_register_device() - register an stm_source device
+ * @parent: parent device
+ * @data: device description structure
+ *
+ * This will create a device of stm_source class that can write
+ * data to an stm device once linked.
+ *
+ * Return: 0 on success, -errno otherwise.
+ */
+int stm_source_register_device(struct device *parent,
+ struct stm_source_data *data)
+{
+ struct stm_source_device *src;
+ struct device *dev;
+
+ if (!stm_core_up)
+ return -EPROBE_DEFER;
+
+ src = kzalloc(sizeof(*src), GFP_KERNEL);
+ if (!src)
+ return -ENOMEM;
+
+ dev = device_create(&stm_source_class, parent, MKDEV(0, 0), NULL, "%s",
+ data->name);
+ if (IS_ERR(dev)) {
+ kfree(src);
+ return PTR_ERR(dev);
+ }
+
+ spin_lock_init(&src->link_lock);
+ INIT_LIST_HEAD(&src->link_entry);
+ src->dev = dev;
+ src->data = data;
+ data->src = src;
+ dev_set_drvdata(dev, src);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(stm_source_register_device);
+
+/**
+ * stm_source_unregister_device() - unregister an stm_source device
+ * @data: device description that was used to register the device
+ *
+ * This will remove a previously created stm_source device from the system.
+ */
+void stm_source_unregister_device(struct stm_source_data *data)
+{
+ struct stm_source_device *src = data->src;
+
+ stm_source_link_drop(src);
+
+ device_destroy(&stm_source_class, src->dev->devt);
+
+ kfree(src);
+}
+EXPORT_SYMBOL_GPL(stm_source_unregister_device);
+
+int stm_source_write(struct stm_source_data *data, unsigned int chan,
+ const char *buf, size_t count)
+{
+ struct stm_source_device *src = data->src;
+ struct stm_device *stm;
+ int idx;
+
+ if (!src->output.nr_chans)
+ return -ENODEV;
+
+ if (chan >= src->output.nr_chans)
+ return -EINVAL;
+
+ idx = srcu_read_lock(&stm_source_srcu);
+
+ stm = srcu_dereference(src->link, &stm_source_srcu);
+ if (stm)
+ count = stm->data->write(stm->data, src->output.master,
+ src->output.channel + chan, buf,
+ count);
+ else
+ count = -ENODEV;
+
+ srcu_read_unlock(&stm_source_srcu, idx);
+
+ return count;
+}
+EXPORT_SYMBOL_GPL(stm_source_write);
+
+static int __init stm_core_init(void)
+{
+ int err;
+
+ err = class_register(&stm_class);
+ if (err)
+ return err;
+
+ err = class_register(&stm_source_class);
+ if (err) {
+ class_unregister(&stm_class);
+ return err;
+ }
+
+ init_srcu_struct(&stm_source_srcu);
+
+ stm_core_up++;
+
+ return 0;
+}
+
+postcore_initcall(stm_core_init);
diff --git a/drivers/stm/policy.c b/drivers/stm/policy.c
new file mode 100644
index 0000000000..6ce125da3b
--- /dev/null
+++ b/drivers/stm/policy.c
@@ -0,0 +1,470 @@
+/*
+ * System Trace Module (STM) master/channel allocation policy management
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * A master/channel allocation policy allows mapping string identifiers to
+ * master and channel ranges, where allocation can be done.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/device.h>
+#include <linux/configfs.h>
+#include <linux/slab.h>
+#include <linux/stm.h>
+#include "stm.h"
+
+/*
+ * STP Master/Channel allocation policy configfs layout.
+ */
+
+struct stp_policy {
+ struct config_group group;
+ struct stm_device *stm;
+};
+
+struct stp_policy_node {
+ struct config_group group;
+ struct stm_device *stm;
+ struct stp_policy *policy;
+ unsigned int first_master;
+ unsigned int last_master;
+ unsigned int first_channel;
+ unsigned int last_channel;
+};
+
+void stp_policy_node_get_ranges(struct stp_policy_node *policy_node,
+ unsigned int *mstart, unsigned int *mend,
+ unsigned int *cstart, unsigned int *cend)
+{
+ *mstart = policy_node->first_master;
+ *mend = policy_node->last_master;
+ *cstart = policy_node->first_channel;
+ *cend = policy_node->last_channel;
+}
+
+static inline char *stp_policy_node_name(struct stp_policy_node *policy_node)
+{
+ return policy_node->group.cg_item.ci_name ? : "<none>";
+}
+
+static inline struct stp_policy *to_stp_policy(struct config_item *item)
+{
+ return item ?
+ container_of(to_config_group(item), struct stp_policy, group) :
+ NULL;
+}
+
+static inline struct stp_policy_node *
+to_stp_policy_node(struct config_item *item)
+{
+ return item ?
+ container_of(to_config_group(item), struct stp_policy_node,
+ group) :
+ NULL;
+}
+
+static ssize_t stp_policy_node_masters_show(struct stp_policy_node *policy_node,
+ char *page)
+{
+ ssize_t count;
+
+ count = sprintf(page, "%u %u\n", policy_node->first_master,
+ policy_node->last_master);
+
+ return count;
+}
+
+static ssize_t
+stp_policy_node_masters_store(struct stp_policy_node *policy_node,
+ const char *page, size_t count)
+{
+ struct stm_device *stm = policy_node->stm;
+ unsigned int first, last;
+ char *p = (char *) page;
+
+ if (sscanf(p, "%u %u", &first, &last) != 2)
+ return -EINVAL;
+
+ /* must be within [sw_start..sw_end], which is an inclusive range */
+ if (first > INT_MAX || last > INT_MAX || first > last ||
+ first < stm->data->sw_start ||
+ last > stm->data->sw_end)
+ return -ERANGE;
+
+ policy_node->first_master = first;
+ policy_node->last_master = last;
+
+ return count;
+}
+
+static ssize_t
+stp_policy_node_channels_show(struct stp_policy_node *policy_node, char *page)
+{
+ ssize_t count;
+
+ count = sprintf(page, "%u %u\n", policy_node->first_channel,
+ policy_node->last_channel);
+
+ return count;
+}
+
+static ssize_t
+stp_policy_node_channels_store(struct stp_policy_node *policy_node,
+ const char *page, size_t count)
+{
+ unsigned int first, last;
+ char *p = (char *) page;
+
+ if (sscanf(p, "%u %u", &first, &last) != 2)
+ return -EINVAL;
+
+ if (first > INT_MAX || last > INT_MAX || first > last ||
+ last >= policy_node->stm->data->sw_nchannels)
+ return -ERANGE;
+
+ policy_node->first_channel = first;
+ policy_node->last_channel = last;
+
+ return count;
+}
+
+static void stp_policy_node_release(struct config_item *item)
+{
+ kfree(to_stp_policy_node(item));
+}
+
+struct stp_policy_node_attribute {
+ struct configfs_attribute attr;
+ ssize_t (*show)(struct stp_policy_node *, char *);
+ ssize_t (*store)(struct stp_policy_node *, const char *, size_t);
+};
+
+static ssize_t stp_policy_node_attr_show(struct config_item *item,
+ struct configfs_attribute *attr,
+ char *page)
+{
+ struct stp_policy_node *policy_node = to_stp_policy_node(item);
+ struct stp_policy_node_attribute *pn_attr =
+ container_of(attr, struct stp_policy_node_attribute, attr);
+ ssize_t count = 0;
+
+ if (pn_attr->show)
+ count = pn_attr->show(policy_node, page);
+
+ return count;
+}
+
+static ssize_t stp_policy_node_attr_store(struct config_item *item,
+ struct configfs_attribute *attr,
+ const char *page, size_t len)
+{
+ struct stp_policy_node *policy_node = to_stp_policy_node(item);
+ struct stp_policy_node_attribute *pn_attr =
+ container_of(attr, struct stp_policy_node_attribute, attr);
+ ssize_t count = -EINVAL;
+
+ if (pn_attr->store)
+ count = pn_attr->store(policy_node, page, len);
+
+ return count;
+}
+
+static struct configfs_item_operations stp_policy_node_item_ops = {
+ .release = stp_policy_node_release,
+ .show_attribute = stp_policy_node_attr_show,
+ .store_attribute = stp_policy_node_attr_store,
+};
+
+static struct stp_policy_node_attribute stp_policy_node_attr_range = {
+ .attr = {
+ .ca_owner = THIS_MODULE,
+ .ca_name = "masters",
+ .ca_mode = S_IRUGO | S_IWUSR,
+ },
+ .show = stp_policy_node_masters_show,
+ .store = stp_policy_node_masters_store,
+};
+
+static struct stp_policy_node_attribute stp_policy_node_attr_channels = {
+ .attr = {
+ .ca_owner = THIS_MODULE,
+ .ca_name = "channels",
+ .ca_mode = S_IRUGO | S_IWUSR,
+ },
+ .show = stp_policy_node_channels_show,
+ .store = stp_policy_node_channels_store,
+};
+
+static struct configfs_attribute *stp_policy_node_attrs[] = {
+ &stp_policy_node_attr_range.attr,
+ &stp_policy_node_attr_channels.attr,
+ NULL,
+};
+
+static struct config_item_type stp_policy_type;
+static struct config_item_type stp_policy_node_type;
+
+static struct config_group *
+stp_policy_node_make(struct config_group *group, const char *name)
+{
+ struct stp_policy_node *policy_node, *parent_node;
+ struct stp_policy *policy;
+
+ if (group->cg_item.ci_type == &stp_policy_type) {
+ policy = container_of(group, struct stp_policy, group);
+ } else {
+ parent_node = container_of(group, struct stp_policy_node,
+ group);
+ policy = parent_node->policy;
+ }
+
+ if (!policy->stm)
+ return ERR_PTR(-ENODEV);
+
+ policy_node = kzalloc(sizeof(struct stp_policy_node), GFP_KERNEL);
+ if (!policy_node)
+ return ERR_PTR(-ENOMEM);
+
+ config_group_init_type_name(&policy_node->group, name,
+ &stp_policy_node_type);
+
+ policy_node->policy = policy;
+ policy_node->stm = policy->stm;
+
+ /* default values for the attributes */
+ policy_node->first_master = policy->stm->data->sw_start;
+ policy_node->last_master = policy->stm->data->sw_end;
+ policy_node->first_channel = 0;
+ policy_node->last_channel = policy->stm->data->sw_nchannels - 1;
+
+ return &policy_node->group;
+}
+
+static void
+stp_policy_node_drop(struct config_group *group, struct config_item *item)
+{
+ config_item_put(item);
+}
+
+static struct configfs_group_operations stp_policy_node_group_ops = {
+ .make_group = stp_policy_node_make,
+ .drop_item = stp_policy_node_drop,
+};
+
+static struct config_item_type stp_policy_node_type = {
+ .ct_item_ops = &stp_policy_node_item_ops,
+ .ct_group_ops = &stp_policy_node_group_ops,
+ .ct_attrs = stp_policy_node_attrs,
+ .ct_owner = THIS_MODULE,
+};
+
+/*
+ * Root group: policies.
+ */
+static struct configfs_attribute stp_policy_attr_device = {
+ .ca_owner = THIS_MODULE,
+ .ca_name = "device",
+ .ca_mode = S_IRUGO | S_IWUSR,
+};
+
+static struct configfs_attribute *stp_policy_attrs[] = {
+ &stp_policy_attr_device,
+ NULL,
+};
+
+static ssize_t stp_policy_attr_show(struct config_item *item,
+ struct configfs_attribute *attr,
+ char *page)
+{
+ struct stp_policy *policy = to_stp_policy(item);
+
+ return sprintf(page, "%s\n",
+ (policy && policy->stm) ?
+ policy->stm->data->name :
+ "<none>");
+}
+
+static ssize_t stp_policy_attr_store(struct config_item *item,
+ struct configfs_attribute *attr,
+ const char *page, size_t len)
+{
+ struct stp_policy *policy = to_stp_policy(item);
+ ssize_t count = -EINVAL;
+ struct device *dev;
+
+ dev = stm_find_device(page, len);
+ if (dev) {
+ count = len;
+ if (policy->stm)
+ put_device(policy->stm->dev);
+
+ policy->stm = dev_get_drvdata(dev);
+
+ mutex_lock(&policy->stm->policy_mutex);
+ policy->stm->policy = policy;
+ mutex_unlock(&policy->stm->policy_mutex);
+ }
+
+ return count;
+}
+
+void stp_policy_unbind(struct stp_policy *policy)
+{
+ put_device(policy->stm->dev);
+
+ mutex_lock(&policy->stm->policy_mutex);
+ policy->stm->policy = NULL;
+ mutex_unlock(&policy->stm->policy_mutex);
+
+ policy->stm = NULL;
+}
+
+static void stp_policy_release(struct config_item *item)
+{
+ struct stp_policy *policy = to_stp_policy(item);
+
+ stp_policy_unbind(policy);
+ kfree(policy);
+}
+
+static struct configfs_item_operations stp_policy_item_ops = {
+ .release = stp_policy_release,
+ .show_attribute = stp_policy_attr_show,
+ .store_attribute = stp_policy_attr_store,
+};
+
+static struct configfs_group_operations stp_policy_group_ops = {
+ .make_group = stp_policy_node_make,
+};
+
+static struct config_item_type stp_policy_type = {
+ .ct_item_ops = &stp_policy_item_ops,
+ .ct_group_ops = &stp_policy_group_ops,
+ .ct_attrs = stp_policy_attrs,
+ .ct_owner = THIS_MODULE,
+};
+
+static struct config_group *
+stp_policies_make(struct config_group *group, const char *name)
+{
+ struct stp_policy *policy;
+
+ policy = kzalloc(sizeof(*policy), GFP_KERNEL);
+ if (!policy)
+ return ERR_PTR(-ENOMEM);
+
+ config_group_init_type_name(&policy->group, name,
+ &stp_policy_type);
+ policy->stm = NULL;
+
+ return &policy->group;
+}
+
+static struct configfs_group_operations stp_policies_group_ops = {
+ .make_group = stp_policies_make,
+};
+
+static struct config_item_type stp_policies_type = {
+ .ct_group_ops = &stp_policies_group_ops,
+ .ct_owner = THIS_MODULE,
+};
+
+static struct configfs_subsystem stp_policy_subsys = {
+ .su_group = {
+ .cg_item = {
+ .ci_namebuf = "stp-policy",
+ .ci_type = &stp_policies_type,
+ },
+ },
+};
+
+/*
+ * Lock the policy mutex from the outside
+ */
+static struct stp_policy_node *
+__stp_policy_node_lookup(struct stp_policy *policy, char *s)
+{
+ struct stp_policy_node *policy_node, *ret;
+ struct list_head *head = &policy->group.cg_children;
+ struct config_item *item;
+ char *start, *end = s;
+
+ if (list_empty(head))
+ return NULL;
+
+ /* return the first entry if everything else fails */
+ item = list_entry(head->next, struct config_item, ci_entry);
+ ret = to_stp_policy_node(item);
+
+next:
+ for (;;) {
+ start = strsep(&end, "/");
+ if (!start)
+ break;
+
+ if (!*start)
+ continue;
+
+ list_for_each_entry(item, head, ci_entry) {
+ policy_node = to_stp_policy_node(item);
+
+ if (!strcmp(start,
+ policy_node->group.cg_item.ci_name)) {
+ ret = policy_node;
+
+ if (!end)
+ goto out;
+
+ head = &policy_node->group.cg_children;
+ goto next;
+ }
+ }
+ break;
+ }
+
+out:
+ return ret;
+}
+
+struct stp_policy_node *
+stp_policy_node_lookup(struct stp_policy *policy, char *s)
+{
+ struct stp_policy_node *policy_node;
+
+ mutex_lock(&stp_policy_subsys.su_mutex);
+ policy_node = __stp_policy_node_lookup(policy, s);
+ mutex_unlock(&stp_policy_subsys.su_mutex);
+
+ return policy_node;
+}
+
+static int stp_configfs_init(void)
+{
+ int err;
+
+ config_group_init(&stp_policy_subsys.su_group);
+ mutex_init(&stp_policy_subsys.su_mutex);
+ err = configfs_register_subsystem(&stp_policy_subsys);
+
+ return err;
+}
+
+static void stp_configfs_done(void)
+{
+ configfs_unregister_subsystem(&stp_policy_subsys);
+}
+
+module_init(stp_configfs_init);
+module_exit(stp_configfs_done);
diff --git a/drivers/stm/stm.h b/drivers/stm/stm.h
new file mode 100644
index 0000000000..7d7d9954e2
--- /dev/null
+++ b/drivers/stm/stm.h
@@ -0,0 +1,77 @@
+/*
+ * System Trace Module (STM) infrastructure
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * STM class implements generic infrastructure for System Trace Module devices
+ * as defined in MIPI STPv2 specification.
+ */
+
+#ifndef _CLASS_STM_H_
+#define _CLASS_STM_H_
+
+struct stp_policy;
+struct stp_policy_node;
+
+struct stp_policy_node *
+stp_policy_node_lookup(struct stp_policy *policy, char *s);
+void stp_policy_unbind(struct stp_policy *policy);
+
+void stp_policy_node_get_ranges(struct stp_policy_node *policy_node,
+ unsigned int *mstart, unsigned int *mend,
+ unsigned int *cstart, unsigned int *cend);
+
+struct stp_master {
+ unsigned int nr_free;
+ unsigned long chan_map[0];
+};
+
+struct stm_device {
+ struct device *dev;
+ struct module *owner;
+ struct stp_policy *policy;
+ struct mutex policy_mutex;
+ int major;
+ unsigned int sw_nmasters;
+ struct stm_data *data;
+ spinlock_t link_lock;
+ struct list_head link_list;
+ /* master allocation */
+ spinlock_t mc_lock;
+ struct stp_master *masters[0];
+};
+
+struct stm_output {
+ unsigned int master;
+ unsigned int channel;
+ unsigned int nr_chans;
+};
+
+struct stm_file {
+ struct stm_device *stm;
+ struct stp_policy_node *policy_node;
+ struct stm_output output;
+};
+
+struct device *stm_find_device(const char *name, size_t len);
+
+struct stm_source_device {
+ struct device *dev;
+ struct stm_source_data *data;
+ spinlock_t link_lock;
+ struct stm_device *link;
+ struct list_head link_entry;
+ /* one output per stm_source device */
+ struct stp_policy_node *policy_node;
+ struct stm_output output;
+};
+
+#endif /* _CLASS_STM_H_ */
diff --git a/include/linux/stm.h b/include/linux/stm.h
new file mode 100644
index 0000000000..d00fa9a3fc
--- /dev/null
+++ b/include/linux/stm.h
@@ -0,0 +1,87 @@
+/*
+ * System Trace Module (STM) infrastructure apis
+ * Copyright (C) 2014 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ */
+
+#ifndef _STM_H_
+#define _STM_H_
+
+struct stp_policy;
+
+struct stm_device;
+
+/**
+ * struct stm_data - STM device description and callbacks
+ * @name: device name
+ * @stm: internal structure, only used by stm class code
+ * @sw_start: first STP master
+ * @sw_end: last STP master
+ * @sw_nchannels: number of STP channels per master
+ * @sw_mmiosz: size of one channel's IO space, for mmap, optional
+ * @write: write callback
+ * @mmio_addr: mmap callback, optional
+ *
+ * Fill out this structure before calling stm_register_device() to create
+ * an STM device and stm_unregister_device() to destroy it. It will also be
+ * passed back to write() and mmio_addr() callbacks.
+ */
+struct stm_data {
+ const char *name;
+ struct stm_device *stm;
+ unsigned int sw_start;
+ unsigned int sw_end;
+ unsigned int sw_nchannels;
+ unsigned int sw_mmiosz;
+ ssize_t (*write)(struct stm_data *, unsigned int,
+ unsigned int, const char *, size_t);
+ phys_addr_t (*mmio_addr)(struct stm_data *, unsigned int,
+ unsigned int, unsigned int);
+ void (*link)(struct stm_data *, unsigned int,
+ unsigned int);
+ void (*unlink)(struct stm_data *, unsigned int,
+ unsigned int);
+};
+
+int stm_register_device(struct device *parent, struct stm_data *stm_data,
+ struct module *owner);
+void stm_unregister_device(struct stm_data *stm_data);
+
+struct stm_source_device;
+
+/**
+ * struct stm_source_data - STM source device description and callbacks
+ * @name: device name, will be used for policy lookup
+ * @src: internal structure, only used by stm class code
+ * @nr_chans: number of channels to allocate
+ * @link: called when STM device gets linked to this source
+ * @unlink: called when STH device is about to be unlinked
+ *
+ * Fill in this structure before calling stm_source_register_device() to
+ * register a source device. Also pass it to unregister and write calls.
+ */
+struct stm_source_data {
+ const char *name;
+ struct stm_source_device *src;
+ unsigned int percpu;
+ unsigned int nr_chans;
+ int (*link)(struct stm_source_data *data);
+ void (*unlink)(struct stm_source_data *data);
+};
+
+int stm_source_register_device(struct device *parent,
+ struct stm_source_data *data);
+void stm_source_unregister_device(struct stm_source_data *data);
+
+int stm_source_write(struct stm_source_data *data, unsigned int chan,
+ const char *buf, size_t count);
+
+#endif /* _STM_H_ */
diff --git a/include/uapi/linux/stm.h b/include/uapi/linux/stm.h
new file mode 100644
index 0000000000..042b58b53b
--- /dev/null
+++ b/include/uapi/linux/stm.h
@@ -0,0 +1,47 @@
+/*
+ * System Trace Module (STM) userspace interfaces
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * STM class implements generic infrastructure for System Trace Module devices
+ * as defined in MIPI STPv2 specification.
+ */
+
+#ifndef _UAPI_LINUX_STM_H
+#define _UAPI_LINUX_STM_H
+
+/**
+ * struct stp_policy_id - identification for the STP policy
+ * @size: size of the structure including real id[] length
+ * @master: assigned master
+ * @channel: first assigned channel
+ * @width: number of requested channels
+ * @id: identification string
+ *
+ * User must calculate the total size of the structure and put it into
+ * @size field, fill out the @id and desired @width. In return, kernel
+ * fills out @master, @channel and @width.
+ */
+struct stp_policy_id {
+ __u32 size;
+ __u16 master;
+ __u16 channel;
+ __u16 width;
+ /* padding */
+ __u16 __reserved_0;
+ __u32 __reserved_1;
+ char id[0];
+};
+
+#define STP_POLICY_ID_SET _IOWR('%', 0, struct stp_policy_id)
+#define STP_POLICY_ID_GET _IOR('%', 1, struct stp_policy_id)
+
+#endif /* _UAPI_LINUX_STM_H */
--
2.1.4
^ permalink raw reply related
* Re: [PATCH v5 tip 0/7] tracing: attach eBPF programs to kprobes
From: Steven Rostedt @ 2015-03-07 1:09 UTC (permalink / raw)
To: Ingo Molnar
Cc: Alexei Starovoitov, Namhyung Kim, Arnaldo Carvalho de Melo,
Jiri Olsa, Masami Hiramatsu, David S. Miller, Daniel Borkmann,
Peter Zijlstra, Linux API, Network Development, LKML
In-Reply-To: <20150304154824.5f165c6d@gandalf.local.home>
On Wed, 4 Mar 2015 15:48:24 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:
> On Wed, 4 Mar 2015 21:33:16 +0100
> Ingo Molnar <mingo@kernel.org> wrote:
>
> >
> > * Alexei Starovoitov <ast@plumgrid.com> wrote:
> >
> > > On Sun, Mar 1, 2015 at 3:27 PM, Alexei Starovoitov <ast@plumgrid.com> wrote:
> > > > Peter, Steven,
> > > > I think this set addresses everything we've discussed.
> > > > Please review/ack. Thanks!
> > >
> > > icmp echo request
> >
> > I'd really like to have an Acked-by from Steve (propagated into the
> > changelogs) before looking at applying these patches.
>
> I'll have to look at this tomorrow. I'm a bit swamped with other things
> at the moment :-/
>
Just an update. I started looking at it but then was pulled off to do
other things. I'll make this a priority next week. Sorry for the delay.
-- Steve
^ permalink raw reply
* Re: Right interface for cellphone modem audio (was Re: [PATCHv2 0/2] N900 Modem Speech Support)
From: Kai Vehmanen @ 2015-03-06 20:49 UTC (permalink / raw)
To: Pavel Machek
Cc: perex-/Fr2/VpizcU, Takashi Iwai,
alsa-devel-K7yf7f+aM1XWsZ/bQMPhNw, Sebastian Reichel,
Peter Ujfalusi, Kai Vehmanen, Pali Rohar, Aaro Koskinen,
Ivaylo Dimitrov, linux-omap-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-api-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20150306094354.GA32369@amd>
Hi,
On Fri, 6 Mar 2015, Pavel Machek wrote:
>> Our take was that ALSA is not the right interface for cmt_speech. The
>> cmt_speech interface in the modem is _not_ a PCM interface as modelled by
>> ALSA. Specifically:
>>
>> - the interface is lossy in both directions
>> - data is sent in packets, not a stream of samples (could be other things
>> than PCM samples), with timing and meta-data
>> - timing of uplink is of utmost importance
>
> I see that you may not have data available in "downlink" scenario, but
> how is it lossy in "uplink" scenario? Phone should always try to fill
> the uplink, no? (Or do you detect silence and not transmit in this
Lossy was perhaps not the best choice of words, non-continuous would be
a better choice in the uplink case. To adjust timing, some samples from
the continuous locally recorded PCM stream need to be skipped and/or
duplicated. This would normally be done between speech bursts to avoid
artifacts.
> Packets vs. stream of samples... does userland need to know about the
> packets? Could we simply hide it from the userland? As userland daemon
> is (supposed to be) realtime, do we really need extra set of
> timestamps? What other metadata are there?
Yes, we need flags that tell about the frame. Please see docs for
'frame_flags' and 'spc_flags' in libcmtspeechdata cmtspeech.h:
https://www.gitorious.org/libcmtspeechdata/libcmtspeechdata/source/9206835ea3c96815840a80ccba9eaeb16ff7e294:cmtspeech.h
Kernel space does not have enough info to handle these flags as the audio
mixer is not implemented in kernel, so they have to be passed to/from
user-space.
And some further info in libcmtspeechdata/docs/
https://www.gitorious.org/libcmtspeechdata/libcmtspeechdata/source/9206835ea3c96815840a80ccba9eaeb16ff7e294:doc/libcmtspeechdata_api_docs_main.txt
> Uplink timing... As the daemon is realtime, can it just send the data
> at the right time? Also normally uplink would be filled, no?
But how would you implement that via the ALSA API? With cmt_speech, a
speech packet is prepared in a mmap'ed buffer, flags are set to describe
the buffer, and at the correct time, write() is called to trigger
transmission in HW (see cmtspeech_ul_buffer_release() in
libcmtspeechdata() -> compare this to snd_pcm_mmap_commit() in ALSA). In
ALSA, the mmap commit and PCM write variants just add data to the
ringbuffer and update the appl pointer. Only initial start (and stop) on
stream have the "do something now" semantics in ALSA.
The ALSA compressed offload API did not exist back when we were working on
cmt_speech, but that's still not a good fit, although adds some of the
concepts (notably frames).
> Well, packets are of fixed size, right? So the userland can simply
> supply the right size in the common case. As for sending at the right
> time... well... if the userspace is already real-time, that should be
> easy
See above, ALSA just doesn't work like that, there's no syscall for "send
these samples now", the model is different.
Br, Kai
^ permalink raw reply
* Re: [PATCH] capabilities: Ambient capability set V2
From: Serge E. Hallyn @ 2015-03-06 20:08 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Christoph Lameter, Serge E. Hallyn, Serge Hallyn, Jonathan Corbet,
Aaron Jones, LSM List, linux-kernel@vger.kernel.org,
Andrew Morton, Andrew G. Morgan, Mimi Zohar, Austin S Hemmelgarn,
Markku Savela, Jarkko Sakkinen, Linux API, Michael Kerrisk
In-Reply-To: <CALCETrVQF22rkZFD8VAW_xrVQOjwpej6W4TJS9gbN9B431TEKg@mail.gmail.com>
On Fri, Mar 06, 2015 at 11:02:43AM -0800, Andy Lutomirski wrote:
> On Fri, Mar 6, 2015 at 10:53 AM, Christoph Lameter <cl@linux.com> wrote:
> > On Fri, 6 Mar 2015, Serge E. Hallyn wrote:
> >
> >> Sorry, something about that patch-patch didn't make sense to me, but I
> >> need to look more closely. My objection was that you were able to get the
> >> pA capabilities into pP without them being in your pI. Your proposed
> >> change didn't seem like it would fix that.
> >
> > Just tried to fix that. Could it be that cap_inherited is never set even
> > for a binary that has
> >
> > christoph@fujitsu-haswell:~$ getcap ambient_test
> >
> > ambient_test = cap_setpcap,cap_net_admin,cap_net_raw,cap_sys_nice+eip
>
> I think that's right. fI doesn't set pI.
Right. The idea is that for the running binary to get capability x in its
pP, its privileged ancestor must have set x in pI, and the binary itself
must be trusted with x in fI.
What we are doing is allowing bypassing fI using pA, without bypassing the
requirement for x to be in pI. Since pI is intended to be filled (for
instance) at login based on username/group, pI generally does not get cleared.
At the same time, any software which thinks it is running untrusted code
safely without privilege by clearing pI and pP won't be fooled by pA.
-serge
^ permalink raw reply
* Re: [PATCH] capabilities: Ambient capability set V2
From: Andy Lutomirski @ 2015-03-06 19:02 UTC (permalink / raw)
To: Christoph Lameter
Cc: Serge E. Hallyn, Serge Hallyn, Jonathan Corbet, Aaron Jones,
LSM List, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Andrew Morton, Andrew G. Morgan, Mimi Zohar, Austin S Hemmelgarn,
Markku Savela, Jarkko Sakkinen, Linux API, Michael Kerrisk
In-Reply-To: <alpine.DEB.2.11.1503061244130.9804-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org>
On Fri, Mar 6, 2015 at 10:53 AM, Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote:
> On Fri, 6 Mar 2015, Serge E. Hallyn wrote:
>
>> Sorry, something about that patch-patch didn't make sense to me, but I
>> need to look more closely. My objection was that you were able to get the
>> pA capabilities into pP without them being in your pI. Your proposed
>> change didn't seem like it would fix that.
>
> Just tried to fix that. Could it be that cap_inherited is never set even
> for a binary that has
>
> christoph@fujitsu-haswell:~$ getcap ambient_test
>
> ambient_test = cap_setpcap,cap_net_admin,cap_net_raw,cap_sys_nice+eip
I think that's right. fI doesn't set pI.
--Andy
^ permalink raw reply
* Re: [PATCH] capabilities: Ambient capability set V2
From: Christoph Lameter @ 2015-03-06 18:53 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: Serge Hallyn, Andy Lutomirski, Jonathan Corbet, Aaron Jones,
linux-security-module, linux-kernel, akpm, Andrew G. Morgan,
Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
linux-api, Michael Kerrisk
In-Reply-To: <20150306163443.GA28386@mail.hallyn.com>
On Fri, 6 Mar 2015, Serge E. Hallyn wrote:
> Sorry, something about that patch-patch didn't make sense to me, but I
> need to look more closely. My objection was that you were able to get the
> pA capabilities into pP without them being in your pI. Your proposed
> change didn't seem like it would fix that.
Just tried to fix that. Could it be that cap_inherited is never set even
for a binary that has
christoph@fujitsu-haswell:~$ getcap ambient_test
ambient_test = cap_setpcap,cap_net_admin,cap_net_raw,cap_sys_nice+eip
I added some printks and it seems that current_cred()->cap_inherited is
not set when running ambient_test.
Index: linux/security/commoncap.c
===================================================================
--- linux.orig/security/commoncap.c 2015-03-06 11:05:10.802218196
-0600
+++ linux/security/commoncap.c 2015-03-06 12:50:38.424330679 -0600
@@ -456,6 +456,10 @@ static int get_file_caps(struct linux_bi
kernel_cap_t relevant_ambient = cap_intersect(
current_cred()->cap_ambient,
current_cred()->cap_inheritable);
+ printk("task->comm %s: Amb=%x Inh=%x relevant=%x\n",
+ current->comm, current_cred()->cap_ambient.cap[0],
+ current_cred()->cap_inheritable.cap[0],
+ relevant_ambient.cap[0]);
rc = 0;
if (!cap_isclear(relevant_ambient)) {
/*
Mar 6 12:42:18 fujitsu-haswell kernel: [ 284.715051] task->comm ambient_test: Amb=803000 Inh=0 relevant=0
^ permalink raw reply
* Re: [PATCH] capabilities: Ambient capability set V2
From: Serge E. Hallyn @ 2015-03-06 16:34 UTC (permalink / raw)
To: Christoph Lameter
Cc: Serge E. Hallyn, Serge Hallyn, Andy Lutomirski, Jonathan Corbet,
Aaron Jones, linux-security-module-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
akpm-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, Andrew G. Morgan,
Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
linux-api-u79uwXL29TY76Z2rM5mHXA, Michael Kerrisk
In-Reply-To: <alpine.DEB.2.11.1503060948460.8207-gkYfJU5Cukgdnm+yROfE0A@public.gmane.org>
On Fri, Mar 06, 2015 at 09:50:02AM -0600, Christoph Lameter wrote:
> On Thu, 5 Mar 2015, Serge E. Hallyn wrote:
>
> > > > So I'd say drop this change ^
> > >
> > > Then the ambient caps get ignored for a executables that have capabilities
> > > seton the file?
> >
> > Yes. Those are assumed to already know what they're doing.
>
> Ok can we get this patch merged now if I do this change
> (effectively ambient caps for binaries that have no caps set) and deal with the
> other issues later? This would cover most of the use cases here at least.
Sorry, something about that patch-patch didn't make sense to me, but I
need to look more closely. My objection was that you were able to get the
pA capabilities into pP without them being in your pI. Your proposed
change didn't seem like it would fix that.
It also seems worth waiting until you talk to Andy in person next week.
-serge
^ permalink raw reply
* Re: [PATCH] capabilities: Ambient capability set V2
From: Christoph Lameter @ 2015-03-06 15:50 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: Serge Hallyn, Andy Lutomirski, Jonathan Corbet, Aaron Jones,
linux-security-module-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
akpm-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, Andrew G. Morgan,
Mimi Zohar, Austin S Hemmelgarn, Markku Savela, Jarkko Sakkinen,
linux-api-u79uwXL29TY76Z2rM5mHXA, Michael Kerrisk
In-Reply-To: <20150305171326.GA14998-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
On Thu, 5 Mar 2015, Serge E. Hallyn wrote:
> > > So I'd say drop this change ^
> >
> > Then the ambient caps get ignored for a executables that have capabilities
> > seton the file?
>
> Yes. Those are assumed to already know what they're doing.
Ok can we get this patch merged now if I do this change
(effectively ambient caps for binaries that have no caps set) and deal with the
other issues later? This would cover most of the use cases here at least.
^ permalink raw reply
* Re: [PATCH] capabilities: Ambient capability set V2
From: Christoph Lameter @ 2015-03-06 15:47 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Jarkko Sakkinen, Andrew Morton, LSM List, Andrew G. Morgan,
Michael Kerrisk, Mimi Zohar,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Austin S Hemmelgarn, Aaron Jones, Serge Hallyn, Serge E. Hallyn,
Markku Savela, Linux API, Jonathan Corbet
In-Reply-To: <CALCETrUVrfPBpb69WFyptzFoJ8Sx4LwhhjirVx=KQ11ofCcwYg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
[-- Attachment #1: Type: TEXT/PLAIN, Size: 825 bytes --]
On Thu, 5 Mar 2015, Andy Lutomirski wrote:
> > Yes due to the library issues.
>
> You can't LD_PRELOAD and fP together. And I'm still unconvinced that
> ambient caps can ever be safe in conjunction with fP. I'll grill you
> next week on what you're trying to do that makes you want this :)
>From the ld.so manpage:
LD_PRELOAD
A whitespace-separated list of additional, user-specified, ELF shared
libraries to be loaded before all others. This can be used to selec‐
tively override functions in other shared libraries. For setuid/set‐
gid ELF binaries, only libraries in the standard search directories
that are also setgid will be loaded.
So this mechanism has not been made to work for binaries with caps? We
have to keep using setuid?
^ permalink raw reply
* Re: [Qemu-devel] [PATCH 02/21] userfaultfd: linux/Documentation/vm/userfaultfd.txt
From: Eric Blake @ 2015-03-06 15:39 UTC (permalink / raw)
To: Andrea Arcangeli, qemu-devel, kvm, linux-kernel, linux-mm,
linux-api, Android Kernel Team
Cc: Robert Love, Dave Hansen, Jan Kara, Neil Brown, Stefan Hajnoczi,
Andrew Jones, Sanidhya Kashyap, KOSAKI Motohiro,
Michel Lespinasse, Taras Glek, zhang.zhanghailiang,
Pavel Emelyanov, Hugh Dickins, Mel Gorman, Sasha Levin,
Dr. David Alan Gilbert, Huangpeng (Peter), Andres Lagar-Cavilla,
Christopher Covington, Anthony Liguori, Paolo Bonzini
In-Reply-To: <1425575884-2574-3-git-send-email-aarcange@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 6342 bytes --]
On 03/05/2015 10:17 AM, Andrea Arcangeli wrote:
> Add documentation.
>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
> Documentation/vm/userfaultfd.txt | 97 ++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 97 insertions(+)
> create mode 100644 Documentation/vm/userfaultfd.txt
Just a grammar review (no analysis of technical correctness)
>
> diff --git a/Documentation/vm/userfaultfd.txt b/Documentation/vm/userfaultfd.txt
> new file mode 100644
> index 0000000..2ec296c
> --- /dev/null
> +++ b/Documentation/vm/userfaultfd.txt
> @@ -0,0 +1,97 @@
> += Userfaultfd =
> +
> +== Objective ==
> +
> +Userfaults allow to implement on demand paging from userland and more
s/to implement/the implementation of/
and maybe: s/on demand/on-demand/
> +generally they allow userland to take control various memory page
> +faults, something otherwise only the kernel code could do.
> +
> +For example userfaults allows a proper and more optimal implementation
> +of the PROT_NONE+SIGSEGV trick.
> +
> +== Design ==
> +
> +Userfaults are delivered and resolved through the userfaultfd syscall.
> +
> +The userfaultfd (aside from registering and unregistering virtual
> +memory ranges) provides for two primary functionalities:
s/provides for/provides/
> +
> +1) read/POLLIN protocol to notify an userland thread of the faults
s/an userland/a userland/ (remember, 'a unicorn gets an umbrella' - if
the 'u' is pronounced 'you' the correct article is 'a')
> + happening
> +
> +2) various UFFDIO_* ioctls that can mangle over the virtual memory
> + regions registered in the userfaultfd that allows userland to
> + efficiently resolve the userfaults it receives via 1) or to mangle
> + the virtual memory in the background
maybe: s/mangle/manage/2
> +
> +The real advantage of userfaults if compared to regular virtual memory
> +management of mremap/mprotect is that the userfaults in all their
> +operations never involve heavyweight structures like vmas (in fact the
> +userfaultfd runtime load never takes the mmap_sem for writing).
> +
> +Vmas are not suitable for page(or hugepage)-granular fault tracking
s/page(or hugepage)-granular/page- (or hugepage-) granular/
> +when dealing with virtual address spaces that could span
> +Terabytes. Too many vmas would be needed for that.
> +
> +The userfaultfd once opened by invoking the syscall, can also be
> +passed using unix domain sockets to a manager process, so the same
> +manager process could handle the userfaults of a multitude of
> +different process without them being aware about what is going on
s/process/processes/
> +(well of course unless they later try to use the userfaultfd themself
s/themself/themselves/
> +on the same region the manager is already tracking, which is a corner
> +case that would currently return -EBUSY).
> +
> +== API ==
> +
> +When first opened the userfaultfd must be enabled invoking the
> +UFFDIO_API ioctl specifying an uffdio_api.api value set to UFFD_API
s/an uffdio/a uffdio/
> +which will specify the read/POLLIN protocol userland intends to speak
> +on the UFFD. The UFFDIO_API ioctl if successful (i.e. if the requested
> +uffdio_api.api is spoken also by the running kernel), will return into
> +uffdio_api.bits and uffdio_api.ioctls two 64bit bitmasks of
> +respectively the activated feature bits below PAGE_SHIFT in the
> +userfault addresses returned by read(2) and the generic ioctl
> +available.
> +
> +Once the userfaultfd has been enabled the UFFDIO_REGISTER ioctl should
> +be invoked (if present in the returned uffdio_api.ioctls bitmask) to
> +register a memory range in the userfaultfd by setting the
> +uffdio_register structure accordingly. The uffdio_register.mode
> +bitmask will specify to the kernel which kind of faults to track for
> +the range (UFFDIO_REGISTER_MODE_MISSING would track missing
> +pages). The UFFDIO_REGISTER ioctl will return the
> +uffdio_register.ioctls bitmask of ioctls that are suitable to resolve
> +userfaults on the range reigstered. Not all ioctls will necessarily be
s/reigstered/registered/
> +supported for all memory types depending on the underlying virtual
> +memory backend (anonymous memory vs tmpfs vs real filebacked
> +mappings).
> +
> +Userland can use the uffdio_register.ioctls to mangle the virtual
maybe s/mangle/manage/
> +address space in the background (to add or potentially also remove
> +memory from the userfaultfd registered range). This means an userfault
s/an/a/
> +could be triggering just before userland maps in the background the
> +user-faulted page. To avoid POLLIN resulting in an unexpected blocking
> +read (if the UFFD is not opened in nonblocking mode in the first
> +place), we don't allow the background thread to wake userfaults that
> +haven't been read by userland yet. If we would do that likely the
> +UFFDIO_WAKE ioctl could be dropped. This may change in the future
> +(with a UFFD_API protocol bumb combined with the removal of the
s/bumb/bump/
> +UFFDIO_WAKE ioctl) if it'll be demonstrated that it's a valid
> +optimization and worthy to force userland to use the UFFD always in
> +nonblocking mode if combined with POLLIN.
> +
> +userfaultfd is also a generic enough feature, that it allows KVM to
> +implement postcopy live migration (one form of memory externalization
> +consisting of a virtual machine running with part or all of its memory
> +residing on a different node in the cloud) without having to modify a
> +single line of KVM kernel code. Guest async page faults, FOLL_NOWAIT
> +and all other GUP features works just fine in combination with
> +userfaults (userfaults trigger async page faults in the guest
> +scheduler so those guest processes that aren't waiting for userfaults
> +can keep running in the guest vcpus).
> +
> +The primary ioctl to resolve userfaults is UFFDIO_COPY. That
> +atomically copies a page into the userfault registered range and wakes
> +up the blocked userfaults (unless uffdio_copy.mode &
> +UFFDIO_COPY_MODE_DONTWAKE is set). Other ioctl works similarly to
> +UFFDIO_COPY.
>
>
>
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]
^ permalink raw reply
* Re: [PATCH 10/21] userfaultfd: add new syscall to provide memory externalization
From: Michael Kerrisk (man-pages) @ 2015-03-06 10:48 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: qemu-devel, kvm, lkml, linux-mm@kvack.org, Linux API,
Android Kernel Team, Kirill A. Shutemov, Pavel Emelyanov,
Sanidhya Kashyap, zhang.zhanghailiang, Linus Torvalds,
Andres Lagar-Cavilla, Dave Hansen, Paolo Bonzini, Rik van Riel,
Mel Gorman, Andy Lutomirski, Andrew Morton, Sasha Levin,
Hugh Dickins, Peter Feiner, Dr. David Alan Gilbert,
Christopher Covington, Jo
In-Reply-To: <1425575884-2574-11-git-send-email-aarcange@redhat.com>
Hi Andrea,
On 5 March 2015 at 18:17, Andrea Arcangeli <aarcange@redhat.com> wrote:
> Once an userfaultfd has been created and certain region of the process
> virtual address space have been registered into it, the thread
> responsible for doing the memory externalization can manage the page
> faults in userland by talking to the kernel using the userfaultfd
> protocol.
Is there someting like a man page for this new syscall?
Thanks,
Michael
> poll() can be used to know when there are new pending userfaults to be
> read (POLLIN).
>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
> fs/userfaultfd.c | 977 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 977 insertions(+)
> create mode 100644 fs/userfaultfd.c
>
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> new file mode 100644
> index 0000000..6b31967
> --- /dev/null
> +++ b/fs/userfaultfd.c
> @@ -0,0 +1,977 @@
> +/*
> + * fs/userfaultfd.c
> + *
> + * Copyright (C) 2007 Davide Libenzi <davidel@xmailserver.org>
> + * Copyright (C) 2008-2009 Red Hat, Inc.
> + * Copyright (C) 2015 Red Hat, Inc.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See
> + * the COPYING file in the top-level directory.
> + *
> + * Some part derived from fs/eventfd.c (anon inode setup) and
> + * mm/ksm.c (mm hashing).
> + */
> +
> +#include <linux/hashtable.h>
> +#include <linux/sched.h>
> +#include <linux/mm.h>
> +#include <linux/poll.h>
> +#include <linux/slab.h>
> +#include <linux/seq_file.h>
> +#include <linux/file.h>
> +#include <linux/bug.h>
> +#include <linux/anon_inodes.h>
> +#include <linux/syscalls.h>
> +#include <linux/userfaultfd_k.h>
> +#include <linux/mempolicy.h>
> +#include <linux/ioctl.h>
> +#include <linux/security.h>
> +
> +enum userfaultfd_state {
> + UFFD_STATE_WAIT_API,
> + UFFD_STATE_RUNNING,
> +};
> +
> +struct userfaultfd_ctx {
> + /* pseudo fd refcounting */
> + atomic_t refcount;
> + /* waitqueue head for the userfaultfd page faults */
> + wait_queue_head_t fault_wqh;
> + /* waitqueue head for the pseudo fd to wakeup poll/read */
> + wait_queue_head_t fd_wqh;
> + /* userfaultfd syscall flags */
> + unsigned int flags;
> + /* state machine */
> + enum userfaultfd_state state;
> + /* released */
> + bool released;
> + /* mm with one ore more vmas attached to this userfaultfd_ctx */
> + struct mm_struct *mm;
> +};
> +
> +struct userfaultfd_wait_queue {
> + unsigned long address;
> + wait_queue_t wq;
> + bool pending;
> + struct userfaultfd_ctx *ctx;
> +};
> +
> +struct userfaultfd_wake_range {
> + unsigned long start;
> + unsigned long len;
> +};
> +
> +static int userfaultfd_wake_function(wait_queue_t *wq, unsigned mode,
> + int wake_flags, void *key)
> +{
> + struct userfaultfd_wake_range *range = key;
> + int ret;
> + struct userfaultfd_wait_queue *uwq;
> + unsigned long start, len;
> +
> + uwq = container_of(wq, struct userfaultfd_wait_queue, wq);
> + ret = 0;
> + /* don't wake the pending ones to avoid reads to block */
> + if (uwq->pending && !ACCESS_ONCE(uwq->ctx->released))
> + goto out;
> + /* len == 0 means wake all */
> + start = range->start;
> + len = range->len;
> + if (len && (start > uwq->address || start + len <= uwq->address))
> + goto out;
> + ret = wake_up_state(wq->private, mode);
> + if (ret)
> + /* wake only once, autoremove behavior */
> + list_del_init(&wq->task_list);
> +out:
> + return ret;
> +}
> +
> +/**
> + * userfaultfd_ctx_get - Acquires a reference to the internal userfaultfd
> + * context.
> + * @ctx: [in] Pointer to the userfaultfd context.
> + *
> + * Returns: In case of success, returns not zero.
> + */
> +static void userfaultfd_ctx_get(struct userfaultfd_ctx *ctx)
> +{
> + if (!atomic_inc_not_zero(&ctx->refcount))
> + BUG();
> +}
> +
> +/**
> + * userfaultfd_ctx_put - Releases a reference to the internal userfaultfd
> + * context.
> + * @ctx: [in] Pointer to userfaultfd context.
> + *
> + * The userfaultfd context reference must have been previously acquired either
> + * with userfaultfd_ctx_get() or userfaultfd_ctx_fdget().
> + */
> +static void userfaultfd_ctx_put(struct userfaultfd_ctx *ctx)
> +{
> + if (atomic_dec_and_test(&ctx->refcount)) {
> + mmdrop(ctx->mm);
> + kfree(ctx);
> + }
> +}
> +
> +static inline unsigned long userfault_address(unsigned long address,
> + unsigned int flags,
> + unsigned long reason)
> +{
> + BUILD_BUG_ON(PAGE_SHIFT < UFFD_BITS);
> + address &= PAGE_MASK;
> + if (flags & FAULT_FLAG_WRITE)
> + /*
> + * Encode "write" fault information in the LSB of the
> + * address read by userland, without depending on
> + * FAULT_FLAG_WRITE kernel internal value.
> + */
> + address |= UFFD_BIT_WRITE;
> + if (reason & VM_UFFD_WP)
> + /*
> + * Encode "reason" fault information as bit number 1
> + * in the address read by userland. If bit number 1 is
> + * clear it means the reason is a VM_FAULT_MISSING
> + * fault.
> + */
> + address |= UFFD_BIT_WP;
> + return address;
> +}
> +
> +/*
> + * The locking rules involved in returning VM_FAULT_RETRY depending on
> + * FAULT_FLAG_ALLOW_RETRY, FAULT_FLAG_RETRY_NOWAIT and
> + * FAULT_FLAG_KILLABLE are not straightforward. The "Caution"
> + * recommendation in __lock_page_or_retry is not an understatement.
> + *
> + * If FAULT_FLAG_ALLOW_RETRY is set, the mmap_sem must be released
> + * before returning VM_FAULT_RETRY only if FAULT_FLAG_RETRY_NOWAIT is
> + * not set.
> + *
> + * If FAULT_FLAG_ALLOW_RETRY is set but FAULT_FLAG_KILLABLE is not
> + * set, VM_FAULT_RETRY can still be returned if and only if there are
> + * fatal_signal_pending()s, and the mmap_sem must be released before
> + * returning it.
> + */
> +int handle_userfault(struct vm_area_struct *vma, unsigned long address,
> + unsigned int flags, unsigned long reason)
> +{
> + struct mm_struct *mm = vma->vm_mm;
> + struct userfaultfd_ctx *ctx;
> + struct userfaultfd_wait_queue uwq;
> +
> + BUG_ON(!rwsem_is_locked(&mm->mmap_sem));
> +
> + ctx = vma->vm_userfaultfd_ctx.ctx;
> + if (!ctx)
> + return VM_FAULT_SIGBUS;
> +
> + BUG_ON(ctx->mm != mm);
> +
> + VM_BUG_ON(reason & ~(VM_UFFD_MISSING|VM_UFFD_WP));
> + VM_BUG_ON(!(reason & VM_UFFD_MISSING) ^ !!(reason & VM_UFFD_WP));
> +
> + /*
> + * If it's already released don't get it. This avoids to loop
> + * in __get_user_pages if userfaultfd_release waits on the
> + * caller of handle_userfault to release the mmap_sem.
> + */
> + if (unlikely(ACCESS_ONCE(ctx->released)))
> + return VM_FAULT_SIGBUS;
> +
> + /* check that we can return VM_FAULT_RETRY */
> + if (unlikely(!(flags & FAULT_FLAG_ALLOW_RETRY))) {
> + /*
> + * Validate the invariant that nowait must allow retry
> + * to be sure not to return SIGBUS erroneously on
> + * nowait invocations.
> + */
> + BUG_ON(flags & FAULT_FLAG_RETRY_NOWAIT);
> +#ifdef CONFIG_DEBUG_VM
> + if (printk_ratelimit()) {
> + printk(KERN_WARNING
> + "FAULT_FLAG_ALLOW_RETRY missing %x\n", flags);
> + dump_stack();
> + }
> +#endif
> + return VM_FAULT_SIGBUS;
> + }
> +
> + /*
> + * Handle nowait, not much to do other than tell it to retry
> + * and wait.
> + */
> + if (flags & FAULT_FLAG_RETRY_NOWAIT)
> + return VM_FAULT_RETRY;
> +
> + /* take the reference before dropping the mmap_sem */
> + userfaultfd_ctx_get(ctx);
> +
> + /* be gentle and immediately relinquish the mmap_sem */
> + up_read(&mm->mmap_sem);
> +
> + init_waitqueue_func_entry(&uwq.wq, userfaultfd_wake_function);
> + uwq.wq.private = current;
> + uwq.address = userfault_address(address, flags, reason);
> + uwq.pending = true;
> + uwq.ctx = ctx;
> +
> + spin_lock(&ctx->fault_wqh.lock);
> + /*
> + * After the __add_wait_queue the uwq is visible to userland
> + * through poll/read().
> + */
> + __add_wait_queue(&ctx->fault_wqh, &uwq.wq);
> + for (;;) {
> + set_current_state(TASK_KILLABLE);
> + if (!uwq.pending || ACCESS_ONCE(ctx->released) ||
> + fatal_signal_pending(current))
> + break;
> + spin_unlock(&ctx->fault_wqh.lock);
> +
> + wake_up_poll(&ctx->fd_wqh, POLLIN);
> + schedule();
> +
> + spin_lock(&ctx->fault_wqh.lock);
> + }
> + __remove_wait_queue(&ctx->fault_wqh, &uwq.wq);
> + __set_current_state(TASK_RUNNING);
> + spin_unlock(&ctx->fault_wqh.lock);
> +
> + /*
> + * ctx may go away after this if the userfault pseudo fd is
> + * already released.
> + */
> + userfaultfd_ctx_put(ctx);
> +
> + return VM_FAULT_RETRY;
> +}
> +
> +static int userfaultfd_release(struct inode *inode, struct file *file)
> +{
> + struct userfaultfd_ctx *ctx = file->private_data;
> + struct mm_struct *mm = ctx->mm;
> + struct vm_area_struct *vma, *prev;
> + /* len == 0 means wake all */
> + struct userfaultfd_wake_range range = { .len = 0, };
> + unsigned long new_flags;
> +
> + ACCESS_ONCE(ctx->released) = true;
> +
> + /*
> + * Flush page faults out of all CPUs. NOTE: all page faults
> + * must be retried without returning VM_FAULT_SIGBUS if
> + * userfaultfd_ctx_get() succeeds but vma->vma_userfault_ctx
> + * changes while handle_userfault released the mmap_sem. So
> + * it's critical that released is set to true (above), before
> + * taking the mmap_sem for writing.
> + */
> + down_write(&mm->mmap_sem);
> + prev = NULL;
> + for (vma = mm->mmap; vma; vma = vma->vm_next) {
> + cond_resched();
> + BUG_ON(!!vma->vm_userfaultfd_ctx.ctx ^
> + !!(vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP)));
> + if (vma->vm_userfaultfd_ctx.ctx != ctx) {
> + prev = vma;
> + continue;
> + }
> + new_flags = vma->vm_flags & ~(VM_UFFD_MISSING | VM_UFFD_WP);
> + prev = vma_merge(mm, prev, vma->vm_start, vma->vm_end,
> + new_flags, vma->anon_vma,
> + vma->vm_file, vma->vm_pgoff,
> + vma_policy(vma),
> + NULL_VM_UFFD_CTX);
> + if (prev)
> + vma = prev;
> + else
> + prev = vma;
> + vma->vm_flags = new_flags;
> + vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
> + }
> + up_write(&mm->mmap_sem);
> +
> + /*
> + * After no new page faults can wait on this fault_wqh, flush
> + * the last page faults that may have been already waiting on
> + * the fault_wqh.
> + */
> + spin_lock(&ctx->fault_wqh.lock);
> + __wake_up_locked_key(&ctx->fault_wqh, TASK_NORMAL, 0, &range);
> + spin_unlock(&ctx->fault_wqh.lock);
> +
> + wake_up_poll(&ctx->fd_wqh, POLLHUP);
> + userfaultfd_ctx_put(ctx);
> + return 0;
> +}
> +
> +static inline unsigned int find_userfault(struct userfaultfd_ctx *ctx,
> + struct userfaultfd_wait_queue **uwq)
> +{
> + wait_queue_t *wq;
> + struct userfaultfd_wait_queue *_uwq;
> + unsigned int ret = 0;
> +
> + spin_lock(&ctx->fault_wqh.lock);
> + list_for_each_entry(wq, &ctx->fault_wqh.task_list, task_list) {
> + _uwq = container_of(wq, struct userfaultfd_wait_queue, wq);
> + if (_uwq->pending) {
> + ret = POLLIN;
> + if (uwq)
> + *uwq = _uwq;
> + break;
> + }
> + }
> + spin_unlock(&ctx->fault_wqh.lock);
> +
> + return ret;
> +}
> +
> +static unsigned int userfaultfd_poll(struct file *file, poll_table *wait)
> +{
> + struct userfaultfd_ctx *ctx = file->private_data;
> +
> + poll_wait(file, &ctx->fd_wqh, wait);
> +
> + switch (ctx->state) {
> + case UFFD_STATE_WAIT_API:
> + return POLLERR;
> + case UFFD_STATE_RUNNING:
> + return find_userfault(ctx, NULL);
> + default:
> + BUG();
> + }
> +}
> +
> +static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait,
> + __u64 *addr)
> +{
> + ssize_t ret;
> + DECLARE_WAITQUEUE(wait, current);
> + struct userfaultfd_wait_queue *uwq = NULL;
> +
> + /* always take the fd_wqh lock before the fault_wqh lock */
> + spin_lock(&ctx->fd_wqh.lock);
> + __add_wait_queue(&ctx->fd_wqh, &wait);
> + for (;;) {
> + set_current_state(TASK_INTERRUPTIBLE);
> + if (find_userfault(ctx, &uwq)) {
> + uwq->pending = false;
> + /* careful to always initialize addr if ret == 0 */
> + *addr = uwq->address;
> + ret = 0;
> + break;
> + }
> + if (signal_pending(current)) {
> + ret = -ERESTARTSYS;
> + break;
> + }
> + if (no_wait) {
> + ret = -EAGAIN;
> + break;
> + }
> + spin_unlock(&ctx->fd_wqh.lock);
> + schedule();
> + spin_lock_irq(&ctx->fd_wqh.lock);
> + }
> + __remove_wait_queue(&ctx->fd_wqh, &wait);
> + __set_current_state(TASK_RUNNING);
> + spin_unlock_irq(&ctx->fd_wqh.lock);
> +
> + return ret;
> +}
> +
> +static ssize_t userfaultfd_read(struct file *file, char __user *buf,
> + size_t count, loff_t *ppos)
> +{
> + struct userfaultfd_ctx *ctx = file->private_data;
> + ssize_t _ret, ret = 0;
> + /* careful to always initialize addr if ret == 0 */
> + __u64 uninitialized_var(addr);
> + int no_wait = file->f_flags & O_NONBLOCK;
> +
> + if (ctx->state == UFFD_STATE_WAIT_API)
> + return -EINVAL;
> + BUG_ON(ctx->state != UFFD_STATE_RUNNING);
> +
> + for (;;) {
> + if (count < sizeof(addr))
> + return ret ? ret : -EINVAL;
> + _ret = userfaultfd_ctx_read(ctx, no_wait, &addr);
> + if (_ret < 0)
> + return ret ? ret : _ret;
> + if (put_user(addr, (__u64 __user *) buf))
> + return ret ? ret : -EFAULT;
> + ret += sizeof(addr);
> + buf += sizeof(addr);
> + count -= sizeof(addr);
> + /*
> + * Allow to read more than one fault at time but only
> + * block if waiting for the very first one.
> + */
> + no_wait = O_NONBLOCK;
> + }
> +}
> +
> +static int __wake_userfault(struct userfaultfd_ctx *ctx,
> + struct userfaultfd_wake_range *range)
> +{
> + wait_queue_t *wq;
> + struct userfaultfd_wait_queue *uwq;
> + int ret;
> + unsigned long start, end;
> +
> + start = range->start;
> + end = range->start + range->len;
> +
> + ret = -ENOENT;
> + spin_lock(&ctx->fault_wqh.lock);
> + list_for_each_entry(wq, &ctx->fault_wqh.task_list, task_list) {
> + uwq = container_of(wq, struct userfaultfd_wait_queue, wq);
> + if (uwq->pending)
> + continue;
> + if (uwq->address >= start && uwq->address < end) {
> + ret = 0;
> + /* wake all in the range and autoremove */
> + __wake_up_locked_key(&ctx->fault_wqh, TASK_NORMAL, 0,
> + range);
> + break;
> + }
> + }
> + spin_unlock(&ctx->fault_wqh.lock);
> +
> + return ret;
> +}
> +
> +static __always_inline int wake_userfault(struct userfaultfd_ctx *ctx,
> + struct userfaultfd_wake_range *range)
> +{
> + if (!waitqueue_active(&ctx->fault_wqh))
> + return -ENOENT;
> +
> + return __wake_userfault(ctx, range);
> +}
> +
> +static __always_inline int validate_range(struct mm_struct *mm,
> + __u64 start, __u64 len)
> +{
> + __u64 task_size = mm->task_size;
> +
> + if (start & ~PAGE_MASK)
> + return -EINVAL;
> + if (len & ~PAGE_MASK)
> + return -EINVAL;
> + if (!len)
> + return -EINVAL;
> + if (start < mmap_min_addr)
> + return -EINVAL;
> + if (start >= task_size)
> + return -EINVAL;
> + if (len > task_size - start)
> + return -EINVAL;
> + return 0;
> +}
> +
> +static int userfaultfd_register(struct userfaultfd_ctx *ctx,
> + unsigned long arg)
> +{
> + struct mm_struct *mm = ctx->mm;
> + struct vm_area_struct *vma, *prev, *cur;
> + int ret;
> + struct uffdio_register uffdio_register;
> + struct uffdio_register __user *user_uffdio_register;
> + unsigned long vm_flags, new_flags;
> + bool found;
> + unsigned long start, end, vma_end;
> +
> + user_uffdio_register = (struct uffdio_register __user *) arg;
> +
> + ret = -EFAULT;
> + if (copy_from_user(&uffdio_register, user_uffdio_register,
> + sizeof(uffdio_register)-sizeof(__u64)))
> + goto out;
> +
> + ret = -EINVAL;
> + if (!uffdio_register.mode)
> + goto out;
> + if (uffdio_register.mode & ~(UFFDIO_REGISTER_MODE_MISSING|
> + UFFDIO_REGISTER_MODE_WP))
> + goto out;
> + vm_flags = 0;
> + if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MISSING)
> + vm_flags |= VM_UFFD_MISSING;
> + if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP) {
> + vm_flags |= VM_UFFD_WP;
> + /*
> + * FIXME: remove the below error constraint by
> + * implementing the wprotect tracking mode.
> + */
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + ret = validate_range(mm, uffdio_register.range.start,
> + uffdio_register.range.len);
> + if (ret)
> + goto out;
> +
> + start = uffdio_register.range.start;
> + end = start + uffdio_register.range.len;
> +
> + down_write(&mm->mmap_sem);
> + vma = find_vma_prev(mm, start, &prev);
> +
> + ret = -ENOMEM;
> + if (!vma)
> + goto out_unlock;
> +
> + /* check that there's at least one vma in the range */
> + ret = -EINVAL;
> + if (vma->vm_start >= end)
> + goto out_unlock;
> +
> + /*
> + * Search for not compatible vmas.
> + *
> + * FIXME: this shall be relaxed later so that it doesn't fail
> + * on tmpfs backed vmas (in addition to the current allowance
> + * on anonymous vmas).
> + */
> + found = false;
> + for (cur = vma; cur && cur->vm_start < end; cur = cur->vm_next) {
> + cond_resched();
> +
> + BUG_ON(!!cur->vm_userfaultfd_ctx.ctx ^
> + !!(cur->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP)));
> +
> + /* check not compatible vmas */
> + ret = -EINVAL;
> + if (cur->vm_ops)
> + goto out_unlock;
> +
> + /*
> + * Check that this vma isn't already owned by a
> + * different userfaultfd. We can't allow more than one
> + * userfaultfd to own a single vma simultaneously or we
> + * wouldn't know which one to deliver the userfaults to.
> + */
> + ret = -EBUSY;
> + if (cur->vm_userfaultfd_ctx.ctx &&
> + cur->vm_userfaultfd_ctx.ctx != ctx)
> + goto out_unlock;
> +
> + found = true;
> + }
> + BUG_ON(!found);
> +
> + /*
> + * Now that we scanned all vmas we can already tell userland which
> + * ioctls methods are guaranteed to succeed on this range.
> + */
> + ret = -EFAULT;
> + if (put_user(UFFD_API_RANGE_IOCTLS, &user_uffdio_register->ioctls))
> + goto out_unlock;
> +
> + if (vma->vm_start < start)
> + prev = vma;
> +
> + ret = 0;
> + do {
> + cond_resched();
> +
> + BUG_ON(vma->vm_ops);
> + BUG_ON(vma->vm_userfaultfd_ctx.ctx &&
> + vma->vm_userfaultfd_ctx.ctx != ctx);
> +
> + /*
> + * Nothing to do: this vma is already registered into this
> + * userfaultfd and with the right tracking mode too.
> + */
> + if (vma->vm_userfaultfd_ctx.ctx == ctx &&
> + (vma->vm_flags & vm_flags) == vm_flags)
> + goto skip;
> +
> + if (vma->vm_start > start)
> + start = vma->vm_start;
> + vma_end = min(end, vma->vm_end);
> +
> + new_flags = (vma->vm_flags & ~vm_flags) | vm_flags;
> + prev = vma_merge(mm, prev, start, vma_end, new_flags,
> + vma->anon_vma, vma->vm_file, vma->vm_pgoff,
> + vma_policy(vma),
> + ((struct vm_userfaultfd_ctx){ ctx }));
> + if (prev) {
> + vma = prev;
> + goto next;
> + }
> + if (vma->vm_start < start) {
> + ret = split_vma(mm, vma, start, 1);
> + if (ret)
> + break;
> + }
> + if (vma->vm_end > end) {
> + ret = split_vma(mm, vma, end, 0);
> + if (ret)
> + break;
> + }
> + next:
> + /*
> + * In the vma_merge() successful mprotect-like case 8:
> + * the next vma was merged into the current one and
> + * the current one has not been updated yet.
> + */
> + vma->vm_flags = new_flags;
> + vma->vm_userfaultfd_ctx.ctx = ctx;
> +
> + skip:
> + prev = vma;
> + start = vma->vm_end;
> + vma = vma->vm_next;
> + } while (vma && vma->vm_start < end);
> +out_unlock:
> + up_write(&mm->mmap_sem);
> +out:
> + return ret;
> +}
> +
> +static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
> + unsigned long arg)
> +{
> + struct mm_struct *mm = ctx->mm;
> + struct vm_area_struct *vma, *prev, *cur;
> + int ret;
> + struct uffdio_range uffdio_unregister;
> + unsigned long new_flags;
> + bool found;
> + unsigned long start, end, vma_end;
> + const void __user *buf = (void __user *)arg;
> +
> + ret = -EFAULT;
> + if (copy_from_user(&uffdio_unregister, buf, sizeof(uffdio_unregister)))
> + goto out;
> +
> + ret = validate_range(mm, uffdio_unregister.start,
> + uffdio_unregister.len);
> + if (ret)
> + goto out;
> +
> + start = uffdio_unregister.start;
> + end = start + uffdio_unregister.len;
> +
> + down_write(&mm->mmap_sem);
> + vma = find_vma_prev(mm, start, &prev);
> +
> + ret = -ENOMEM;
> + if (!vma)
> + goto out_unlock;
> +
> + /* check that there's at least one vma in the range */
> + ret = -EINVAL;
> + if (vma->vm_start >= end)
> + goto out_unlock;
> +
> + /*
> + * Search for not compatible vmas.
> + *
> + * FIXME: this shall be relaxed later so that it doesn't fail
> + * on tmpfs backed vmas (in addition to the current allowance
> + * on anonymous vmas).
> + */
> + found = false;
> + ret = -EINVAL;
> + for (cur = vma; cur && cur->vm_start < end; cur = cur->vm_next) {
> + cond_resched();
> +
> + BUG_ON(!!cur->vm_userfaultfd_ctx.ctx ^
> + !!(cur->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP)));
> +
> + /*
> + * Check not compatible vmas, not strictly required
> + * here as not compatible vmas cannot have an
> + * userfaultfd_ctx registered on them, but this
> + * provides for more strict behavior to notice
> + * unregistration errors.
> + */
> + if (cur->vm_ops)
> + goto out_unlock;
> +
> + found = true;
> + }
> + BUG_ON(!found);
> +
> + if (vma->vm_start < start)
> + prev = vma;
> +
> + ret = 0;
> + do {
> + cond_resched();
> +
> + BUG_ON(vma->vm_ops);
> +
> + /*
> + * Nothing to do: this vma is already registered into this
> + * userfaultfd and with the right tracking mode too.
> + */
> + if (!vma->vm_userfaultfd_ctx.ctx)
> + goto skip;
> +
> + if (vma->vm_start > start)
> + start = vma->vm_start;
> + vma_end = min(end, vma->vm_end);
> +
> + new_flags = vma->vm_flags & ~(VM_UFFD_MISSING | VM_UFFD_WP);
> + prev = vma_merge(mm, prev, start, vma_end, new_flags,
> + vma->anon_vma, vma->vm_file, vma->vm_pgoff,
> + vma_policy(vma),
> + NULL_VM_UFFD_CTX);
> + if (prev) {
> + vma = prev;
> + goto next;
> + }
> + if (vma->vm_start < start) {
> + ret = split_vma(mm, vma, start, 1);
> + if (ret)
> + break;
> + }
> + if (vma->vm_end > end) {
> + ret = split_vma(mm, vma, end, 0);
> + if (ret)
> + break;
> + }
> + next:
> + /*
> + * In the vma_merge() successful mprotect-like case 8:
> + * the next vma was merged into the current one and
> + * the current one has not been updated yet.
> + */
> + vma->vm_flags = new_flags;
> + vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
> +
> + skip:
> + prev = vma;
> + start = vma->vm_end;
> + vma = vma->vm_next;
> + } while (vma && vma->vm_start < end);
> +out_unlock:
> + up_write(&mm->mmap_sem);
> +out:
> + return ret;
> +}
> +
> +/*
> + * This is mostly needed to re-wakeup those userfaults that were still
> + * pending when userland wake them up the first time. We don't wake
> + * the pending one to avoid blocking reads to block, or non blocking
> + * read to return -EAGAIN, if used with POLLIN, to avoid userland
> + * doubts on why POLLIN wasn't reliable.
> + */
> +static int userfaultfd_wake(struct userfaultfd_ctx *ctx,
> + unsigned long arg)
> +{
> + int ret;
> + struct uffdio_range uffdio_wake;
> + struct userfaultfd_wake_range range;
> + const void __user *buf = (void __user *)arg;
> +
> + ret = -EFAULT;
> + if (copy_from_user(&uffdio_wake, buf, sizeof(uffdio_wake)))
> + goto out;
> +
> + ret = validate_range(ctx->mm, uffdio_wake.start, uffdio_wake.len);
> + if (ret)
> + goto out;
> +
> + range.start = uffdio_wake.start;
> + range.len = uffdio_wake.len;
> +
> + /*
> + * len == 0 means wake all and we don't want to wake all here,
> + * so check it again to be sure.
> + */
> + VM_BUG_ON(!range.len);
> +
> + ret = wake_userfault(ctx, &range);
> +
> +out:
> + return ret;
> +}
> +
> +/*
> + * userland asks for a certain API version and we return which bits
> + * and ioctl commands are implemented in this kernel for such API
> + * version or -EINVAL if unknown.
> + */
> +static int userfaultfd_api(struct userfaultfd_ctx *ctx,
> + unsigned long arg)
> +{
> + struct uffdio_api uffdio_api;
> + void __user *buf = (void __user *)arg;
> + int ret;
> +
> + ret = -EINVAL;
> + if (ctx->state != UFFD_STATE_WAIT_API)
> + goto out;
> + ret = -EFAULT;
> + if (copy_from_user(&uffdio_api, buf, sizeof(__u64)))
> + goto out;
> + if (uffdio_api.api != UFFD_API) {
> + /* careful not to leak info, we only read the first 8 bytes */
> + memset(&uffdio_api, 0, sizeof(uffdio_api));
> + if (copy_to_user(buf, &uffdio_api, sizeof(uffdio_api)))
> + goto out;
> + ret = -EINVAL;
> + goto out;
> + }
> + /* careful not to leak info, we only read the first 8 bytes */
> + uffdio_api.bits = UFFD_API_BITS;
> + uffdio_api.ioctls = UFFD_API_IOCTLS;
> + ret = -EFAULT;
> + if (copy_to_user(buf, &uffdio_api, sizeof(uffdio_api)))
> + goto out;
> + ctx->state = UFFD_STATE_RUNNING;
> + ret = 0;
> +out:
> + return ret;
> +}
> +
> +static long userfaultfd_ioctl(struct file *file, unsigned cmd,
> + unsigned long arg)
> +{
> + int ret = -EINVAL;
> + struct userfaultfd_ctx *ctx = file->private_data;
> +
> + switch(cmd) {
> + case UFFDIO_API:
> + ret = userfaultfd_api(ctx, arg);
> + break;
> + case UFFDIO_REGISTER:
> + ret = userfaultfd_register(ctx, arg);
> + break;
> + case UFFDIO_UNREGISTER:
> + ret = userfaultfd_unregister(ctx, arg);
> + break;
> + case UFFDIO_WAKE:
> + ret = userfaultfd_wake(ctx, arg);
> + break;
> + }
> + return ret;
> +}
> +
> +#ifdef CONFIG_PROC_FS
> +static void userfaultfd_show_fdinfo(struct seq_file *m, struct file *f)
> +{
> + struct userfaultfd_ctx *ctx = f->private_data;
> + wait_queue_t *wq;
> + struct userfaultfd_wait_queue *uwq;
> + unsigned long pending = 0, total = 0;
> +
> + spin_lock(&ctx->fault_wqh.lock);
> + list_for_each_entry(wq, &ctx->fault_wqh.task_list, task_list) {
> + uwq = container_of(wq, struct userfaultfd_wait_queue, wq);
> + if (uwq->pending)
> + pending++;
> + total++;
> + }
> + spin_unlock(&ctx->fault_wqh.lock);
> +
> + /*
> + * If more protocols will be added, there will be all shown
> + * separated by a space. Like this:
> + * protocols: 0xaa 0xbb
> + */
> + seq_printf(m, "pending:\t%lu\ntotal:\t%lu\nAPI:\t%Lx:%x:%Lx\n",
> + pending, total, UFFD_API, UFFD_API_BITS,
> + UFFD_API_IOCTLS|UFFD_API_RANGE_IOCTLS);
> +}
> +#endif
> +
> +static const struct file_operations userfaultfd_fops = {
> +#ifdef CONFIG_PROC_FS
> + .show_fdinfo = userfaultfd_show_fdinfo,
> +#endif
> + .release = userfaultfd_release,
> + .poll = userfaultfd_poll,
> + .read = userfaultfd_read,
> + .unlocked_ioctl = userfaultfd_ioctl,
> + .compat_ioctl = userfaultfd_ioctl,
> + .llseek = noop_llseek,
> +};
> +
> +/**
> + * userfaultfd_file_create - Creates an userfaultfd file pointer.
> + * @flags: Flags for the userfaultfd file.
> + *
> + * This function creates an userfaultfd file pointer, w/out installing
> + * it into the fd table. This is useful when the userfaultfd file is
> + * used during the initialization of data structures that require
> + * extra setup after the userfaultfd creation. So the userfaultfd
> + * creation is split into the file pointer creation phase, and the
> + * file descriptor installation phase. In this way races with
> + * userspace closing the newly installed file descriptor can be
> + * avoided. Returns an userfaultfd file pointer, or a proper error
> + * pointer.
> + */
> +static struct file *userfaultfd_file_create(int flags)
> +{
> + struct file *file;
> + struct userfaultfd_ctx *ctx;
> +
> + BUG_ON(!current->mm);
> +
> + /* Check the UFFD_* constants for consistency. */
> + BUILD_BUG_ON(UFFD_CLOEXEC != O_CLOEXEC);
> + BUILD_BUG_ON(UFFD_NONBLOCK != O_NONBLOCK);
> +
> + file = ERR_PTR(-EINVAL);
> + if (flags & ~UFFD_SHARED_FCNTL_FLAGS)
> + goto out;
> +
> + file = ERR_PTR(-ENOMEM);
> + ctx = kmalloc(sizeof(*ctx), GFP_KERNEL);
> + if (!ctx)
> + goto out;
> +
> + atomic_set(&ctx->refcount, 1);
> + init_waitqueue_head(&ctx->fault_wqh);
> + init_waitqueue_head(&ctx->fd_wqh);
> + ctx->flags = flags;
> + ctx->state = UFFD_STATE_WAIT_API;
> + ctx->released = false;
> + ctx->mm = current->mm;
> + /* prevent the mm struct to be freed */
> + atomic_inc(&ctx->mm->mm_count);
> +
> + file = anon_inode_getfile("[userfaultfd]", &userfaultfd_fops, ctx,
> + O_RDWR | (flags & UFFD_SHARED_FCNTL_FLAGS));
> + if (IS_ERR(file))
> + kfree(ctx);
> +out:
> + return file;
> +}
> +
> +SYSCALL_DEFINE1(userfaultfd, int, flags)
> +{
> + int fd, error;
> + struct file *file;
> +
> + error = get_unused_fd_flags(flags & UFFD_SHARED_FCNTL_FLAGS);
> + if (error < 0)
> + return error;
> + fd = error;
> +
> + file = userfaultfd_file_create(flags);
> + if (IS_ERR(file)) {
> + error = PTR_ERR(file);
> + goto err_put_unused_fd;
> + }
> + fd_install(fd, file);
> +
> + return fd;
> +
> +err_put_unused_fd:
> + put_unused_fd(fd);
> +
> + return error;
> +}
> --
> To unsubscribe from this list: send the line "unsubscribe linux-api" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply
* Re: [PATCH v2 11/18] pinctrl: Add pinctrl driver for STM32 MCUs
From: Maxime Coquelin @ 2015-03-06 9:57 UTC (permalink / raw)
To: Linus Walleij
Cc: Uwe Kleine-König, Andreas Färber, Geert Uytterhoeven,
Rob Herring, Philipp Zabel, Jonathan Corbet, Pawel Moll,
Mark Rutland, Ian Campbell, Kumar Gala, Russell King,
Daniel Lezcano, Thomas Gleixner, Greg Kroah-Hartman, Jiri Slaby,
Arnd Bergmann, Andrew Morton, David S. Miller,
Mauro Carvalho Chehab, Joe Perches, Antti Palosaari, Tejun Heo
In-Reply-To: <CACRpkdbuJ5B_GwvRXax2Y4V37ihh5e6H7=2no0fYTMZPXwDdCw@mail.gmail.com>
2015-03-06 10:24 GMT+01:00 Linus Walleij <linus.walleij@linaro.org>:
> On Fri, Feb 20, 2015 at 7:01 PM, Maxime Coquelin
> <mcoquelin.stm32@gmail.com> wrote:
>
>> This driver adds pinctrl and GPIO support to STMicrolectronic's
>> STM32 family of MCUs.
>>
>> Pin muxing and GPIO handling have been tested on STM32F429
>> based Discovery board.
>>
>> Signed-off-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
>
> (...)
>> +config PINCTRL_STM32
>> + bool "STMicroelectronics STM32 pinctrl driver"
>> + depends on OF
>> + depends on ARCH_STM32 || COMPILE_TEST
>> + select PINMUX
>> + select PINCONF
>> + select GPIOLIB_IRQCHIP
>> + help
>> + This selects the device tree based generic pinctrl driver for STM32.
>
> Good start! Especially that you use GPIOLIB_IRQCHIP.
>
> But this (as discussed earlier) should select GENERIC_PINCONF
>
> Stopping review here so you can reengineer it a bit using GENERIC_PINCONF
> for next submission.
>
> Also think about pinmux in single registers, whether you want to do this
> with a single value for a register or using strings to identify groups
> and functions.
Thanks for the review.
I will digest all this, and come back with another solution :)
Best regards,
Maxime
>
> Yours,
> Linus Walleij
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox