* AF_BUS socket address family
@ 2012-06-29 16:45 Vincent Sanders
2012-06-29 18:16 ` Chris Friesen
` (4 more replies)
0 siblings, 5 replies; 28+ messages in thread
From: Vincent Sanders @ 2012-06-29 16:45 UTC (permalink / raw)
To: netdev, linux-kernel, David S. Miller
This series adds the bus address family (AF_BUS). It is against
net-next as of yesterday.
AF_BUS is a message oriented inter process communication system.
The principal features are:
- Reliable datagram based communication (all sockets are of type
SOCK_SEQPACKET)
- Multicast message delivery (one to many, unicast as a subset)
- Strict ordering (messages are delivered to every client in the same order)
- Ability to pass file descriptors
- Ability to pass credentials
The basic concept is to provide a virtual bus on which multiple
processes can communicate and policy is imposed by a "bus master".
Introduction
------------
AF_BUS is based upon AF_UNIX but is extended for multicast operation
and removes stream operation. Responding to extensive feedback on
previous approaches, we have made the implementation as isolated as
possible. There are opportunities in the future to integrate the
socket garbage collector with that of the unix socket implementation.
The impetus for creating this IPC mechanism is to replace the
underlying transport for D-Bus. The D-Bus system currently emulates this
IPC mechanism using AF_UNIX sockets in userspace and has numerous
undesirable behaviours. D-Bus is now widely deployed in many areas and
has become a de-facto IPC standard. Using this IPC mechanism as a
transport gives a significant (100% or more) improvement to throughput
with comparable improvement to latency.
This work was undertaken by Collabora for the GENIVI Alliance and we
are committed to responding to feedback promptly and intend to continue
to support this feature into the future.
Operation
---------
A bus is created by processes connecting on an AF_BUS socket. The
"bus master" binds itself instead of connecting to the NULL address.
The socket address is made up of a path component and a numeric
component. The path component is either a pathname or an abstract
socket similar to a unix socket. The numeric component is used to
uniquely identify each connection to the bus. Thus the path identifies
a specific bus and the numeric component the attachment to that bus.
The numeric component of the address is divided into two fixed parts: a
prefix to identify multicast groups and a suffix which identifies the
attachment. The kernel allocates a single address in prefix 0 to each
socket upon connection.
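As an illustration of the prefix/suffix split described above, here is
a minimal sketch. It is not code from the series: the 64-bit address
width, the 16-bit prefix / 48-bit suffix division, and every ex_* name
are assumptions made only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (not stated in the cover letter): a 64-bit numeric
 * address whose top 16 bits are the multicast prefix and whose low
 * 48 bits identify the attachment. All ex_* names are hypothetical. */
#define EX_BUS_SUFFIX_BITS 48
#define EX_BUS_SUFFIX_MASK ((UINT64_C(1) << EX_BUS_SUFFIX_BITS) - 1)

/* Combine a multicast prefix and an attachment suffix into one address. */
uint64_t ex_bus_addr(uint16_t prefix, uint64_t suffix)
{
    assert(suffix <= EX_BUS_SUFFIX_MASK);
    return ((uint64_t)prefix << EX_BUS_SUFFIX_BITS) | suffix;
}

/* Extract the multicast-group prefix. */
uint16_t ex_bus_prefix(uint64_t addr)
{
    return (uint16_t)(addr >> EX_BUS_SUFFIX_BITS);
}

/* Extract the attachment suffix. */
uint64_t ex_bus_suffix(uint64_t addr)
{
    return addr & EX_BUS_SUFFIX_MASK;
}

/* Prefix 0 holds the single address the kernel allocates on connect. */
int ex_bus_is_unicast(uint64_t addr)
{
    return ex_bus_prefix(addr) == 0;
}
```

Under these assumptions the bus master (address 0) is simply
ex_bus_addr(0, 0), and any address with a non-zero prefix names a
multicast group.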
Connections are initially limited to communicating with the
bus master (address 0). The bus master is responsible for making all
policy decisions around manipulating other attachments including
building multicast groups.
It is expected that connecting clients use protocol specific messages
to communicate with the bus master to negotiate differing
configurations although a bus master might implement a fixed
behaviour.
AF_BUS itself is protocol agnostic and implements the configured
policy between attachments which allows for a bus master to leave a
bus and communication between clients to continue.
Some test code has been written [1] which demonstrates the usage of
AF_BUS.
Use with BUS_PROTO_DBUS
-----------------------
The initial aim of AF_BUS is to provide an IPC mechanism suitable for
use as the underlying transport for D-Bus.
A socket created using BUS_PROTO_DBUS indicates that the messages
passed will be in the D-Bus format. The userspace libraries have been
updated to use this transport with an updated D-Bus daemon [2] as a bus
master.
The D-Bus protocol allows for multicast groups to be filtered depending
on message contents. These filters are configured by the bus master
but need to be enforced on message delivery.
We have simply used the standard kernel netfilter mechanism to achieve
this. This is used to filter delivery to clients that may be part of a
multicast group where they are not receiving all messages according to
policy. If a client wishes to further filter its input, provision has
been made to allow it to use BPF.
The kernel based IPC has several benefits for D-Bus over the userspace
emulation:
- Context switching between userspace processes is reduced.
- Message data copying is reduced.
- System call overheads are reduced.
- The userspace D-Bus daemon was subject to resource starvation,
client contention and priority inversion.
- Latency is reduced.
- Throughput is increased.
The tools for testing these assertions are available [3] and
consistently show a doubling in throughput and better than halving of
latency.
[1] http://cgit.collabora.com/git/user/javier/check-unix-multicast.git/log/?h=af-bus
[2] http://cgit.collabora.com/git/user/rodrigo/dbus.git/
[3] git://github.com/kanchev/dbus-ping.git
https://github.com/kanchev/dbus-ping/blob/master/dbus-genivi-benchmarking.sh
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: AF_BUS socket address family
2012-06-29 16:45 Vincent Sanders
@ 2012-06-29 18:16 ` Chris Friesen
2012-06-29 19:33 ` Ben Hutchings
2012-06-29 18:45 ` Casey Schaufler
` (3 subsequent siblings)
4 siblings, 1 reply; 28+ messages in thread
From: Chris Friesen @ 2012-06-29 18:16 UTC (permalink / raw)
To: Vincent Sanders; +Cc: netdev, linux-kernel, David S. Miller
On 06/29/2012 10:45 AM, Vincent Sanders wrote:
> This series adds the bus address family (AF_BUS). It is against
> net-next as of yesterday.
>
> AF_BUS is a message oriented inter process communication system.
>
> The principal features are:
>
> - Reliable datagram based communication (all sockets are of type
> SOCK_SEQPACKET)
>
> - Multicast message delivery (one to many, unicast as a subset)
>
> - Strict ordering (messages are delivered to every client in the same order)
>
> - Ability to pass file descriptors
>
> - Ability to pass credentials
>
I haven't had time to look at the code yet, but if you haven't already
I'd like to propose adding the ability for someone with suitable
privileges to eavesdrop on all communications. We've been using
something similar to this (essentially a simplified multicast unix
datagram protocol) for many years now and having a tcpdump-like ability
is very useful for debugging.
Chris
* Re: AF_BUS socket address family
2012-06-29 16:45 Vincent Sanders
2012-06-29 18:16 ` Chris Friesen
@ 2012-06-29 18:45 ` Casey Schaufler
2012-06-29 23:22 ` Vincent Sanders
2012-06-29 22:36 ` David Miller
` (2 subsequent siblings)
4 siblings, 1 reply; 28+ messages in thread
From: Casey Schaufler @ 2012-06-29 18:45 UTC (permalink / raw)
To: Vincent Sanders; +Cc: netdev, linux-kernel, David S. Miller
On 6/29/2012 9:45 AM, Vincent Sanders wrote:
> This series adds the bus address family (AF_BUS). It is against
> net-next as of yesterday.
>
> AF_BUS is a message oriented inter process communication system.
>
> The principal features are:
>
> - Reliable datagram based communication (all sockets are of type
> SOCK_SEQPACKET)
>
> - Multicast message delivery (one to many, unicast as a subset)
>
> - Strict ordering (messages are delivered to every client in the same order)
>
> - Ability to pass file descriptors
>
> - Ability to pass credentials
>
> The basic concept is to provide a virtual bus on which multiple
> processes can communicate and policy is imposed by a "bus master".
>
> Introduction
> ------------
>
> AF_BUS is based upon AF_UNIX but is extended for multicast operation
> and removes stream operation. Responding to extensive feedback on
> previous approaches, we have made the implementation as isolated as
> possible. There are opportunities in the future to integrate the
> socket garbage collector with that of the unix socket implementation.
>
> The impetus for creating this IPC mechanism is to replace the
> underlying transport for D-Bus. The D-Bus system currently emulates this
> IPC mechanism using AF_UNIX sockets in userspace and has numerous
> undesirable behaviours. D-Bus is now widely deployed in many areas and
> has become a de-facto IPC standard. Using this IPC mechanism as a
> transport gives a significant (100% or more) improvement to throughput
> with comparable improvement to latency.
>
> This work was undertaken by Collabora for the GENIVI Alliance and we
> are committed to responding to feedback promptly and intend to continue
> to support this feature into the future.
>
> Operation
> ---------
>
> A bus is created by processes connecting on an AF_BUS socket. The
> "bus master" binds itself instead of connecting to the NULL address.
>
> The socket address is made up of a path component and a numeric
> component. The path component is either a pathname or an abstract
> socket similar to a unix socket. The numeric component is used to
> uniquely identify each connection to the bus. Thus the path identifies
> a specific bus and the numeric component the attachment to that bus.
>
> The numeric component of the address is divided into two fixed parts: a
> prefix to identify multicast groups and a suffix which identifies the
> attachment. The kernel allocates a single address in prefix 0 to each
> socket upon connection.
>
> Connections are initially limited to communicating with the
> bus master (address 0). The bus master is responsible for making all
> policy decisions around manipulating other attachments including
> building multicast groups.
>
> It is expected that connecting clients use protocol specific messages
> to communicate with the bus master to negotiate differing
> configurations although a bus master might implement a fixed
> behaviour.
>
> AF_BUS itself is protocol agnostic and implements the configured
> policy between attachments which allows for a bus master to leave a
> bus and communication between clients to continue.
>
> Some test code has been written [1] which demonstrates the usage of
> AF_BUS.
>
> Use with BUS_PROTO_DBUS
> -----------------------
>
> The initial aim of AF_BUS is to provide an IPC mechanism suitable for
> use as the underlying transport for D-Bus.
>
> A socket created using BUS_PROTO_DBUS indicates that the messages
> passed will be in the D-Bus format. The userspace libraries have been
> updated to use this transport with an updated D-Bus daemon [2] as a bus
> master.
Why don't you go whole hog and put all of D-Bus into the kernel?
>
> The D-Bus protocol allows for multicast groups to be filtered depending
> on message contents. These filters are configured by the bus master
> but need to be enforced on message delivery.
>
> We have simply used the standard kernel netfilter mechanism to achieve
> this. This is used to filter delivery to clients that may be part of a
> multicast group where they are not receiving all messages according to
> policy. If a client wishes to further filter its input, provision has
> been made to allow it to use BPF.
>
> The kernel based IPC has several benefits for D-Bus over the userspace
> emulation:
>
> - Context switching between userspace processes is reduced.
> - Message data copying is reduced.
> - System call overheads are reduced.
> - The userspace D-Bus daemon was subject to resource starvation,
> client contention and priority inversion.
> - Latency is reduced
> - Throughput is increased.
>
> The tools for testing these assertions are available [3] and
> consistently show a doubling in throughput and better than halving of
> latency.
Please cross-post Patches 04/15 and 05/15 to the linux-security-module list.
Please cross-post Patch 05/15 to the selinux list.
Where is the analogous patch for the Smack LSM?
>
> [1] http://cgit.collabora.com/git/user/javier/check-unix-multicast.git/log/?h=af-bus
> [2] http://cgit.collabora.com/git/user/rodrigo/dbus.git/
>
> [3] git://github.com/kanchev/dbus-ping.git
> https://github.com/kanchev/dbus-ping/blob/master/dbus-genivi-benchmarking.sh
* Re: AF_BUS socket address family
2012-06-29 18:16 ` Chris Friesen
@ 2012-06-29 19:33 ` Ben Hutchings
0 siblings, 0 replies; 28+ messages in thread
From: Ben Hutchings @ 2012-06-29 19:33 UTC (permalink / raw)
To: Chris Friesen; +Cc: Vincent Sanders, netdev, linux-kernel, David S. Miller
On Fri, 2012-06-29 at 12:16 -0600, Chris Friesen wrote:
> On 06/29/2012 10:45 AM, Vincent Sanders wrote:
> > This series adds the bus address family (AF_BUS). It is against
> > net-next as of yesterday.
> >
> > AF_BUS is a message oriented inter process communication system.
> >
> > The principal features are:
> >
> > - Reliable datagram based communication (all sockets are of type
> > SOCK_SEQPACKET)
> >
> > - Multicast message delivery (one to many, unicast as a subset)
> >
> > - Strict ordering (messages are delivered to every client in the same order)
> >
> > - Ability to pass file descriptors
> >
> > - Ability to pass credentials
> >
>
> I haven't had time to look at the code yet, but if you haven't already
> I'd like to propose adding the ability for someone with suitable
> privileges to eavesdrop on all communications. We've been using
> something similar to this (essentially a simplified multicast unix
> datagram protocol) for many years now and having a tcpdump-like ability
> is very useful for debugging.
It's in there (look for 'eavesdrop' in 08/15).
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
* Re: AF_BUS socket address family
2012-06-29 16:45 Vincent Sanders
2012-06-29 18:16 ` Chris Friesen
2012-06-29 18:45 ` Casey Schaufler
@ 2012-06-29 22:36 ` David Miller
2012-06-29 23:12 ` Vincent Sanders
2012-06-30 20:41 ` Hans-Peter Jansen
2012-07-05 7:59 ` Linus Walleij
4 siblings, 1 reply; 28+ messages in thread
From: David Miller @ 2012-06-29 22:36 UTC (permalink / raw)
To: vincent.sanders; +Cc: netdev, linux-kernel
There is no extensive text describing why using IPv4 for this cannot
be done. I can almost bet that nobody really, honestly, tried.
Basically this means all of our feedback from the last time we had
discussions on kernel IPC for DBUS is being completely ignored.
Therefore, I will completely ignore this patch submission.
* Re: AF_BUS socket address family
2012-06-29 22:36 ` David Miller
@ 2012-06-29 23:12 ` Vincent Sanders
2012-06-29 23:18 ` David Miller
2012-07-05 21:06 ` Jan Engelhardt
0 siblings, 2 replies; 28+ messages in thread
From: Vincent Sanders @ 2012-06-29 23:12 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-kernel
On Fri, Jun 29, 2012 at 03:36:56PM -0700, David Miller wrote:
>
> There is no extensive text describing why using IPv4 for this cannot
> be done. I can almost bet that nobody really, honestly, tried.
>
I can assure you that the team has tried no fewer than six differing
approaches, including using IP and attempting to bend several of the
existing address families.
> Basically this means all of our feedback from the last time we had
> discussions on kernel IPC for DBUS is being completely ignored.
Absolutely not. We listened hard and did extensive research; please do
not ascribe thoughtlessness to our actions. Certainly I would not
presume to waste your time and present something which has not been
thoroughly considered.
I had hoped you would have at least read the opening list where I
outlined the underlying features which explain why none of the
existing IPC match the requirements.
Firstly, it is intended as an interprocess mechanism and not to rely on
a configured IP system; indeed, one of its primary usages is to
provide a mechanism for various tools to set up IP networking.
Leaving that aside, the requirements for multicast, strict ordering, fd
passing and credential passing are simply not available in any other
single transport. It was made plain to us that AF_UNIX would not be
expanded to encompass multicast, so we are left with adding AF_BUS.
If we are wrong I hope you will explain to me how we can achieve fd and
credential passing to multicast groups within existing protocols.
>
> Therefore, I will completely ignore this patch submission.
>
I do hope you will reconsider, or at least educate us appropriately.
I understand you are a busy maintainer and appreciate your time in this matter.
Best regards
--
Vincent Sanders <vincent.sanders@collabora.co.uk>
* Re: AF_BUS socket address family
2012-06-29 23:12 ` Vincent Sanders
@ 2012-06-29 23:18 ` David Miller
2012-06-29 23:42 ` Vincent Sanders
2012-07-02 4:49 ` Chris Friesen
2012-07-05 21:06 ` Jan Engelhardt
1 sibling, 2 replies; 28+ messages in thread
From: David Miller @ 2012-06-29 23:18 UTC (permalink / raw)
To: vincent.sanders; +Cc: netdev, linux-kernel
From: Vincent Sanders <vincent.sanders@collabora.co.uk>
Date: Sat, 30 Jun 2012 00:12:37 +0100
> I had hoped you would have at least read the opening list where I
> outlined the underlying features which explain why none of the
> existing IPC match the requirements.
I had hoped that you had read the part we told you last time where
we explained why multicast and "reliable delivery" are fundamentally
incompatible attributes.
We are not creating a full address family in the kernel which exists
for one, and only one, specific and difficult user.
* Re: AF_BUS socket address family
2012-06-29 18:45 ` Casey Schaufler
@ 2012-06-29 23:22 ` Vincent Sanders
0 siblings, 0 replies; 28+ messages in thread
From: Vincent Sanders @ 2012-06-29 23:22 UTC (permalink / raw)
To: Casey Schaufler; +Cc: netdev, linux-kernel, David S. Miller
On Fri, Jun 29, 2012 at 11:45:10AM -0700, Casey Schaufler wrote:
> On 6/29/2012 9:45 AM, Vincent Sanders wrote:
<snip>
> >
> > A socket created using BUS_PROTO_DBUS indicates that the messages
> > passed will be in the D-Bus format. The userspace libraries have been
> > updated to use this transport with an updated D-Bus daemon [2] as a bus
> > master.
>
> Why don't you go whole hog and put all of D-Bus into the kernel?
>
That would be ridiculously excessive. This work represents what we
feel is the minimum required functionality for the underlying IPC
mechanism.
The minimal filtering performed by the netfilter module is what is
required to enforce security as used in existing deployments and no more.
<snip>
> >
> > The tools for testing these assertions are available [3] and
> > consistently show a doubling in throughput and better than halving of
> > latency.
>
> Please cross-post Patches 04/15 and 05/15 to the linux-security-module list.
> Please cross-post Patch 05/15 to the selinux list.
>
> Where is the analogous patch for the Smack LSM?
We have not tested or built this with the Smack LSM. I would, of
course, be pleased to accept a patch to add this functionality if you
are knowledgeable in this area.
<snip>
* Re: AF_BUS socket address family
2012-06-29 23:18 ` David Miller
@ 2012-06-29 23:42 ` Vincent Sanders
2012-06-29 23:50 ` David Miller
2012-06-30 0:13 ` Benjamin LaHaise
2012-07-02 4:49 ` Chris Friesen
1 sibling, 2 replies; 28+ messages in thread
From: Vincent Sanders @ 2012-06-29 23:42 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-kernel
On Fri, Jun 29, 2012 at 04:18:21PM -0700, David Miller wrote:
> From: Vincent Sanders <vincent.sanders@collabora.co.uk>
> Date: Sat, 30 Jun 2012 00:12:37 +0100
>
> > I had hoped you would have at least read the opening list where I
> > outlined the underlying features which explain why none of the
> > existing IPC match the requirements.
>
> I had hoped that you had read the part we told you last time where
> we explained why multicast and "reliable delivery" are fundamentally
> incompatible attributes.
>
I do not believe we indicated reliable delivery, merely ordered and
idempotent. Either everyone gets the message in the same order or
no one gets it.
> We are not creating a full address family in the kernel which exists
> for one, and only one, specific and difficult user.
Basically you are indicating you would be completely opposed to any
mechanism involving D-Bus IPC and the kernel?
Is there a way to convince you that this is of real value to a
great many of the users of Linux systems in use today? I can assert
with some confidence that there are many, many more users of D-Bus IPC
than there are for several of the other address families that are
present within the kernel already.
The current users are suffering from the issues outlined in my
introductory mail all the time. These issues are caused by emulating an
IPC system over AF_UNIX in userspace.
All we are trying to do is make things better for our users. Is there
a way to do that which will satisfy you technically and them? Honestly,
I am just looking for a viable solution here.
--
Regards Vincent
* Re: AF_BUS socket address family
2012-06-29 23:42 ` Vincent Sanders
@ 2012-06-29 23:50 ` David Miller
2012-06-30 0:09 ` Vincent Sanders
2012-06-30 13:12 ` Alan Cox
2012-06-30 0:13 ` Benjamin LaHaise
1 sibling, 2 replies; 28+ messages in thread
From: David Miller @ 2012-06-29 23:50 UTC (permalink / raw)
To: vincent.sanders; +Cc: netdev, linux-kernel
From: Vincent Sanders <vincent.sanders@collabora.co.uk>
Date: Sat, 30 Jun 2012 00:42:30 +0100
> Basically you are indicating you would be completely opposed to any
> mechanism involving D-Bus IPC and the kernel?
I would not oppose existing mechanisms, which I do not believe are
impossible to use in your scenario.
What you really don't get is that packet drops and event losses are
absolutely fundamental.
As long as receivers lack infinite receive queue this will always be
the case.
Multicast operates in non-reliable transports only so that one stuck
or malfunctioning receiver doesn't screw things over for everyone nor
unduly burden the sender.
* Re: AF_BUS socket address family
2012-06-29 23:50 ` David Miller
@ 2012-06-30 0:09 ` Vincent Sanders
2012-06-30 13:12 ` Alan Cox
1 sibling, 0 replies; 28+ messages in thread
From: Vincent Sanders @ 2012-06-30 0:09 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-kernel
On Fri, Jun 29, 2012 at 04:50:23PM -0700, David Miller wrote:
> From: Vincent Sanders <vincent.sanders@collabora.co.uk>
> Date: Sat, 30 Jun 2012 00:42:30 +0100
>
> > Basically you are indicating you would be completely opposed to any
> > mechanism involving D-Bus IPC and the kernel?
>
> I would not oppose existing mechanisms, which I do not believe are
> impossible to use in your scenario.
>
You keep saying that, yet have offered no concrete way to achieve the
semantics we require. To pass fd and credentials currently *requires*
the use of AF_UNIX does it not? And D-Bus already emulates its IPC
over AF_UNIX because of that.
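For reference, this is roughly what fd passing over AF_UNIX looks like
today. It is a minimal sketch using the standard SCM_RIGHTS ancillary
message; the ex_send_fd/ex_recv_fd names are hypothetical helpers for
this example only, and credentials travel in a similar way via
SCM_CREDENTIALS (not shown).

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send one file descriptor over a connected AF_UNIX socket.
 * Returns 0 on success, -1 on error. */
int ex_send_fd(int sock, int fd)
{
    char byte = 'F';   /* one data byte accompanies the control message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {            /* union guarantees cmsghdr alignment */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor; returns the new fd, or -1 on error. */
int ex_recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The descriptor returned by ex_recv_fd() is a new descriptor in the
receiving process that refers to the same open file as the one sent.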
> What you really don't get is that packet drops and event losses are
> absolutely fundamental.
Not within an IPC, surely? There cannot be packet drops within AF_BUS;
we simply do not do it. The receive queues are checked for their ability
to receive the message before it is delivered to them: all or none.
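The check-then-deliver behaviour described here can be sketched as a
two-pass loop. This is only an illustration of the all-or-none idea and
not the AF_BUS code: struct ex_queue and ex_multicast_all_or_none are
hypothetical, queue space is counted in whole messages, and the locking
a real implementation would need between the two passes is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one receiver's bounded receive queue. */
struct ex_queue {
    size_t used;      /* messages currently queued */
    size_t capacity;  /* maximum messages the queue can hold */
};

/* Deliver one message to every receiver, or to none.
 * Pass 1 verifies that each queue has room; pass 2 enqueues.
 * Returns true if the message was queued for all receivers. */
bool ex_multicast_all_or_none(struct ex_queue *rx, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (rx[i].used >= rx[i].capacity)
            return false;   /* one full queue refuses the whole send */

    for (i = 0; i < n; i++)
        rx[i].used++;       /* stands in for queueing the message */

    return true;
}
```

A false return leaves every queue untouched, so the caller decides what
to do about the stalled receiver instead of some receivers seeing a
message that others missed.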
>
> As long as receivers lack infinite receive queue this will always be
> the case.
Indeed, I would not question that.
>
> Multicast operates in non-reliable transports only so that one stuck
> or malfunctioning receiver doesn't screw things over for everyone nor
> unduly burden the sender.
>
We have addressed this within AF_BUS by the receiver and bus master
being told if all recipients cannot receive the message (and therefore
it cannot be sent).
The policy decision of how to handle this situation is therefore
handled by the userspace clients on a protocol level. D-Bus *already*
has to handle this situation; it's just currently done over AF_UNIX
sockets, so once it occurs the problem is harder to rectify as the
ordering constraint is broken (which causes even more issues).
I am afraid it is rather late here and I may not be able to continue
this conversation until the morning. I apologise if this is
inconvenient, but I must sleep.
--
Regards Vincent
* Re: AF_BUS socket address family
2012-06-29 23:42 ` Vincent Sanders
2012-06-29 23:50 ` David Miller
@ 2012-06-30 0:13 ` Benjamin LaHaise
2012-06-30 12:52 ` Alan Cox
1 sibling, 1 reply; 28+ messages in thread
From: Benjamin LaHaise @ 2012-06-30 0:13 UTC (permalink / raw)
To: Vincent Sanders; +Cc: David Miller, netdev, linux-kernel
On Sat, Jun 30, 2012 at 12:42:30AM +0100, Vincent Sanders wrote:
> The current users are suffering from the issues outlined in my
> introductory mail all the time. These issues are caused by emulating an
> IPC system over AF_UNIX in userspace.
Nothing in your introductory statements indicates how your requirements
can't be met through a hybrid socket + shared memory solution. The IPC
facilities of the kernel are already quite rich, and sufficient for
building many kinds of complex systems. What's so different about DBus'
requirements?
-ben
--
"Thought is the essence of where you are now."
* Re: AF_BUS socket address family
2012-06-30 0:13 ` Benjamin LaHaise
@ 2012-06-30 12:52 ` Alan Cox
2012-07-02 14:51 ` Vincent Sanders
0 siblings, 1 reply; 28+ messages in thread
From: Alan Cox @ 2012-06-30 12:52 UTC (permalink / raw)
To: Benjamin LaHaise; +Cc: Vincent Sanders, David Miller, netdev, linux-kernel
On Fri, 29 Jun 2012 20:13:50 -0400
Benjamin LaHaise <bcrl@kvack.org> wrote:
> On Sat, Jun 30, 2012 at 12:42:30AM +0100, Vincent Sanders wrote:
> > The current users are suffering from the issues outlined in my
> > introductory mail all the time. These issues are caused by emulating an
> > IPC system over AF_UNIX in userspace.
>
> Nothing in your introductory statements indicates how your requirements
> can't be met through a hybrid socket + shared memory solution. The IPC
> facilities of the kernel are already quite rich, and sufficient for
> building many kinds of complex systems. What's so different about DBus'
> requirements?
dbus wants to
- multicast
- pass file handles
- never lose an event
- be fast
- have a security model
The security model makes a shared memory hack impractical, the file
handle passing means at least some of it needs to be AF_UNIX. The event
loss handling/speed argue for putting it in kernel.
I'm not convinced AF_BUS entirely sorts this either. In particular the
failure case dbus currently has to handle for not losing events allows it
to identify who in a "group" has jammed the bus by not listening (eg by
locking up). This information appears to be lost in the AF_BUS case and
that's slightly catastrophic for error recovery.
Alan
* Re: AF_BUS socket address family
2012-06-29 23:50 ` David Miller
2012-06-30 0:09 ` Vincent Sanders
@ 2012-06-30 13:12 ` Alan Cox
2012-07-01 0:33 ` David Miller
1 sibling, 1 reply; 28+ messages in thread
From: Alan Cox @ 2012-06-30 13:12 UTC (permalink / raw)
To: David Miller; +Cc: vincent.sanders, netdev, linux-kernel
> What you really don't get is that packet drops and event losses are
> absolutely fundamental.
The world is full of "receiver reliable" multicast transport providers
which provide ordered defined message delivery properties.
They are reliable in the sense that a message is either queued to the
other ends or is not queued. They are not reliable in the sense of "we
wait forever".
In fact if you look up the stack you'll find a large number of multicast
messaging systems which do reliable transport built on top of IP. In fact
Red Hat provides a high level messaging cluster service that does exactly
this (as well as dbus, which does it on the desktop level), plus a ton of
stuff on top of that (JGroups etc).
Everybody at the application level has been using these 'receiver
reliable' multicast services for years (Websphere MQ, TIBCO, RTPGM,
OpenPGM, MS-PGM, you name it). There are even accelerators for PGM based
protocols in things like Cisco routers and Solarflare can do much of it
on the card for 10Gbit.
> As long as receivers lack infinite receive queue this will always be
> the case.
>
> Multicast operates in non-reliable transports only so that one stuck
> or malfunctioning receiver doesn't screw things over for everyone nor
> unduly burden the sender.
All the world is not IP. Dealing with a malfunctioning receiver is
something dbus already has to deal with. "Unduly burden the sender" is
you talking out of your underwear. The sender is already implementing
this property set - in user space. So there can't be any more burdening,
in fact the point of this is to get rid of excess burdens caused by lack
of kernel support.
This is a latency issue not a throughput one so you can't hide it with
buffers. A few ms shaved off desktop behaviour here and there makes a
massive difference to perceived responsiveness. Fewer task switches and
daemons mean far fewer tasks bouncing around processors, which means
less power consumption.
Alan
* Re: AF_BUS socket address family
2012-06-29 16:45 Vincent Sanders
` (2 preceding siblings ...)
2012-06-29 22:36 ` David Miller
@ 2012-06-30 20:41 ` Hans-Peter Jansen
2012-07-02 16:46 ` Alban Crequy
2012-07-05 7:59 ` Linus Walleij
4 siblings, 1 reply; 28+ messages in thread
From: Hans-Peter Jansen @ 2012-06-30 20:41 UTC (permalink / raw)
To: Vincent Sanders; +Cc: netdev, linux-kernel, David S. Miller
Dear Vincent,
On Friday 29 June 2012, 18:45:39 Vincent Sanders wrote:
> This series adds the bus address family (AF_BUS). It is against
> net-next as of yesterday.
>
> AF_BUS is a message oriented inter process communication system.
>
> The principal features are:
>
> - Reliable datagram based communication (all sockets are of type
> SOCK_SEQPACKET)
>
> - Multicast message delivery (one to many, unicast as a subset)
>
> - Strict ordering (messages are delivered to every client in the
> same order)
>
> - Ability to pass file descriptors
>
> - Ability to pass credentials
>
> The basic concept is to provide a virtual bus on which multiple
> processes can communicate and policy is imposed by a "bus master".
>
> Introduction
> ------------
>
> AF_BUS is based upon AF_UNIX but is extended for multicast operation
> and removes stream operation. Responding to extensive feedback on
> previous approaches, we have made the implementation as isolated as
> possible. There are opportunities in the future to integrate the
> socket garbage collector with that of the unix socket implementation.
>
> The impetus for creating this IPC mechanism is to replace the
> underlying transport for D-Bus. The D-Bus system currently emulates
> this IPC mechanism using AF_UNIX sockets in userspace and has
> numerous undesirable behaviours. D-Bus is now widely deployed in many
> areas and has become a de-facto IPC standard. Using this IPC
> mechanism as a transport gives a significant (100% or more)
> improvement to throughput with comparable improvement to latency.
Your introduction is missing a comprehensive "Discussion" section, where
you compare the AF_UNIX based implementation with the AF_BUS one.
You should elaborate on each of the above noted undesirable behaviours,
and on why and how AF_BUS is advantageous. Show the workarounds that are
needed by AF_UNIX to operate (properly?!?) and how the new
implementation is going to improve this situation.
This will help to get some progress into the indurated discussion here.
Please also note, that, while your aims are nice and sound, it's even
more important for IPC mechanisms to operate properly - even during
persisting error conditions (crashed bus master and clients,
misbehaving or even abusing members). It would be cool to create a
D-BUS test rig, that not only measures performance numbers, but also
checks for deadlocks, corner cases and abuse attempts in both IPC
implementations.
It's a juggling act: while AF_UNIX might suffer from downsides, the code
is heavily exercised in every aspect. Your implementation will only be
exercised by a handful of users (basically one lib), but in order to
justify its existence in kernel space, such extensions need different
kinds of users, and the basic concepts need to fit in the whole kernel
picture as well, or you need to call it AF_DBUS with even less chance
to get it into mainstream.
Wishing you all the best and good luck,
Pete
* Re: AF_BUS socket address family
2012-06-30 13:12 ` Alan Cox
@ 2012-07-01 0:33 ` David Miller
2012-07-01 14:16 ` Alan Cox
0 siblings, 1 reply; 28+ messages in thread
From: David Miller @ 2012-07-01 0:33 UTC (permalink / raw)
To: alan; +Cc: vincent.sanders, netdev, linux-kernel
From: Alan Cox <alan@lxorguk.ukuu.org.uk>
Date: Sat, 30 Jun 2012 14:12:22 +0100
> In fact if you look up the stack you'll find a large number of multicast
> messaging systems which do reliable transport built on top of IP. In fact
> Red Hat provides a high level messaging cluster service that does exactly
> this. (as well as dbus which does it on the desktop level) plus a ton of
> stuff on top of that (JGroups etc)
>
> Everybody at the application level has been using these 'receiver
> reliable' multicast services for years (Websphere MQ, TIBCO, RTPGM,
> OpenPGM, MS-PGM, you name it). There are even accelerators for PGM based
> protocols in things like Cisco routers and Solarflare can do much of it
> on the card for 10Gbit.
The issue is that what to do when a receiver goes deaf is a policy
issue.
* Re: AF_BUS socket address family
2012-07-01 0:33 ` David Miller
@ 2012-07-01 14:16 ` Alan Cox
2012-07-01 21:45 ` David Miller
0 siblings, 1 reply; 28+ messages in thread
From: Alan Cox @ 2012-07-01 14:16 UTC (permalink / raw)
To: David Miller; +Cc: vincent.sanders, netdev, linux-kernel
> The issue is that what to do when a receiver goes deaf is a policy
> issue.
Something all these protocols already recognize and have done for years.
The real issue for AF_BUS versus the current slow AF_UNIX approach is
that you really need to know *who* is blocked up.
That's no different to UDP multicast and needing to know how errored the
frame.
Alan
* Re: AF_BUS socket address family
2012-07-01 14:16 ` Alan Cox
@ 2012-07-01 21:45 ` David Miller
0 siblings, 0 replies; 28+ messages in thread
From: David Miller @ 2012-07-01 21:45 UTC (permalink / raw)
To: alan; +Cc: vincent.sanders, netdev, linux-kernel
From: Alan Cox <alan@lxorguk.ukuu.org.uk>
Date: Sun, 1 Jul 2012 15:16:59 +0100
>> The issue is that what to do when a receiver goes deaf is a policy
>> issue.
>
> Something all these protocols already recognize and have done for years.
> The real issue for AF_BUS versus the current slow AF_UNIX approach is
> that you really need to know *who* is blocked up.
>
> That's no different to UDP multicast and needing to know how errored the
> frame.
And the policy on what to do about such UDP multicast errors is in
userspace, which is precisely my point.
* Re: AF_BUS socket address family
2012-06-29 23:18 ` David Miller
2012-06-29 23:42 ` Vincent Sanders
@ 2012-07-02 4:49 ` Chris Friesen
1 sibling, 0 replies; 28+ messages in thread
From: Chris Friesen @ 2012-07-02 4:49 UTC (permalink / raw)
To: David Miller; +Cc: vincent.sanders, netdev, linux-kernel
On 06/29/2012 05:18 PM, David Miller wrote:
> From: Vincent Sanders<vincent.sanders@collabora.co.uk>
> Date: Sat, 30 Jun 2012 00:12:37 +0100
>
>> I had hoped you would have at least read the opening list where I
>> outlined the underlying features which explain why none of the
>> existing IPC match the requirements.
> I had hoped that you had read the part we told you last time where
> we explained why multicast and "reliable delivery" are fundamentally
> incompatible attributes.
>
> We are not creating a full address family in the kernel which exists
> for one, and only one, specific and difficult user.
For what it's worth, the company I work for (and a number of other
companies) currently use an out-of-tree datagram multicast messaging
protocol family based on AF_UNIX.
If AF_BUS were to be accepted, it would be essentially trivial for us to
port our existing userspace messaging library to use it instead of our
current protocol family, and we would almost certainly do so.
I'd love to see AF_BUS go in.
Chris Friesen
* Re: AF_BUS socket address family
2012-06-30 12:52 ` Alan Cox
@ 2012-07-02 14:51 ` Vincent Sanders
0 siblings, 0 replies; 28+ messages in thread
From: Vincent Sanders @ 2012-07-02 14:51 UTC (permalink / raw)
To: Alan Cox; +Cc: Benjamin LaHaise, David Miller, netdev, linux-kernel
On Sat, Jun 30, 2012 at 01:52:40PM +0100, Alan Cox wrote:
> On Fri, 29 Jun 2012 20:13:50 -0400
> Benjamin LaHaise <bcrl@kvack.org> wrote:
>
> > On Sat, Jun 30, 2012 at 12:42:30AM +0100, Vincent Sanders wrote:
> > > The current users are suffering from the issues outlined in my
> > > introductory mail all the time. These issues are caused by emulating an
> > > IPC system over AF_UNIX in userspace.
> >
> > Nothing in your introductory statements indicate how your requirements
> > can't be met through a hybrid socket + shared memory solution. The IPC
> > facilities of the kernel are already quite rich, and sufficient for
> > building many kinds of complex systems. What's so different about DBus'
> > requirements?
>
> dbus wants to
> - multicast
> - pass file handles
> - never lose an event
> - be fast
> - have a security model
>
> The security model makes a shared memory hack impractical, the file
> handle passing means at least some of it needs to be AF_UNIX. The event
> loss handling/speed argue for putting it in kernel.
Thank you for making this point more eloquently than I had previously
been able to.
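For readers less familiar with the file handle passing in the list above, the following is a minimal sketch of how a descriptor travels over an AF_UNIX socket as SCM_RIGHTS ancillary data. It uses an ordinary AF_UNIX SOCK_SEQPACKET socketpair as a stand-in, since AF_BUS is not assumed to be available, and the helper names send_fd, recv_fd and fd_passing_demo are invented for illustration.

```python
import array
import os
import socket


def send_fd(sock, fd):
    """Send one file descriptor as SCM_RIGHTS ancillary data."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd]))]
    sock.sendmsg([b"F"], ancillary)  # one payload byte accompanies the fd


def recv_fd(sock):
    """Receive one file descriptor that was sent with send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing bytes that do not form a whole int.
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
    return fds[0]


def fd_passing_demo():
    """Pass the read end of a pipe across a socketpair and read through it."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    read_end, write_end = os.pipe()
    try:
        send_fd(a, read_end)
        received = recv_fd(b)  # a new descriptor for the same pipe
        os.write(write_end, b"hello")
        data = os.read(received, 5)
        os.close(received)
        return data
    finally:
        os.close(read_end)
        os.close(write_end)
        a.close()
        b.close()
```

The kernel installs a fresh descriptor in the receiver, which is why this feature cannot be reproduced by a pure shared-memory scheme.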
>
> I'm not convinced AF_BUS entirely sorts this either. In particular the
> failure case dbus currently has to handle for not losing events allows it
> to identify who in a "group" has jammed the bus by not listening (eg by
> locking up). This information appears to be lost in the AF_BUS case and
> that's slightly catastrophic for error recovery.
>
The strategy the existing AF_UNIX D-Bus daemon implements is simply to
have huge queues, so it rarely encounters the situation. When it
does, the bus daemon crafts an error message as a reply to the sender.
The AF_BUS solution is more direct in that the sender gets either
EAGAIN for a direct send or EPOLLOUT from poll. Whatever the response,
the sender can use this information to implement a userspace policy
decision.
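As a rough illustration of the sender-side signals described here, the sketch below fills a non-blocking socket until the kernel returns EAGAIN, then checks poll() for writability before and after the receiver drains its queue. It is only an analogy: it uses an AF_UNIX SOCK_SEQPACKET socketpair on Linux, not AF_BUS, and the function names are hypothetical.

```python
import select
import socket


def fill_until_eagain(sender, payload=b"x" * 1024, limit=100_000):
    """Send until the kernel refuses with EAGAIN; return messages queued."""
    queued = 0
    while queued < limit:
        try:
            sender.send(payload)
        except BlockingIOError:  # errno EAGAIN / EWOULDBLOCK
            break
        queued += 1
    return queued


def is_writable(sock):
    """Ask poll() whether the socket currently reports POLLOUT."""
    poller = select.poll()
    poller.register(sock, select.POLLOUT)
    return any(event & select.POLLOUT for _fd, event in poller.poll(0))


def drain(receiver):
    """Read every message queued on a non-blocking receiver."""
    count = 0
    while True:
        try:
            receiver.recv(2048)
        except BlockingIOError:
            return count
        count += 1


def backpressure_demo():
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    sender.setblocking(False)
    receiver.setblocking(False)
    try:
        queued = fill_until_eagain(sender)      # receiver is "deaf"
        blocked = not is_writable(sender)       # no POLLOUT while jammed
        drained = drain(receiver)               # receiver catches up
        writable_again = is_writable(sender)    # POLLOUT returns
        return queued, blocked, drained, writable_again
    finally:
        sender.close()
        receiver.close()
```

What the sender does once it sees EAGAIN (retry, drop, or report the stalled peer to the bus master) is the userspace policy part and is deliberately left out of the sketch.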
Your feedback sparked a discussion. We have considered this in more
depth and propose implementing a userspace policy of:
- sending a message to the bus master and letting it "deal" with the
blocking client.
- The bus master might choose to isolate the offending client or
perhaps even cause a service restart etc.
The bus master is a privileged client and has state information
about the bus, allowing an optimal decision. We also intend to
add a socket option to query the queue lengths so it can make
better decisions.
Regardless, this is all userspace policy for the D-Bus client
library / bus master daemon, which I believe addresses David Miller's
concerns about such decisions being made in userspace.
--
Best Regards
Vincent Sanders <vincent.sanders@collabora.co.uk>
* Re: AF_BUS socket address family
@ 2012-07-02 15:18 Javier Martinez Canillas
2012-07-03 16:52 ` Chris Friesen
0 siblings, 1 reply; 28+ messages in thread
From: Javier Martinez Canillas @ 2012-07-02 15:18 UTC (permalink / raw)
To: Chris Friesen, David Miller, vincent.sanders, netdev,
linux-kernel
On Mon, Jul 2, 2012 at 6:49 AM, Chris Friesen <chris.friesen@genband.com> wrote:
> On 06/29/2012 05:18 PM, David Miller wrote:
>>
>> From: Vincent Sanders<vincent.sanders@collabora.co.uk>
>> Date: Sat, 30 Jun 2012 00:12:37 +0100
>>
>>> I had hoped you would have at least read the opening list where I
>>> outlined the underlying features which explain why none of the
>>> existing IPC match the requirements.
>>
>> I had hoped that you had read the part we told you last time where
>> we explained why multicast and "reliable delivery" are fundamentally
>> incompatible attributes.
>>
>> We are not creating a full address family in the kernel which exists
>> for one, and only one, specific and difficult user.
>
>
> For what it's worth, the company I work for (and a number of other
> companies) currently use an out-of-tree datagram multicast messaging
> protocol family based on AF_UNIX.
>
> If AF_BUS were to be accepted, it would be essentially trivial for us to
> port our existing userspace messaging library to use it instead of our
> current protocol family, and we would almost certainly do so.
>
> I'd love to see AF_BUS go in.
>
> Chris Friesen
>
Hi Chris,
Thanks a lot for your comments and feedback.
We tried different approaches before developing the AF_BUS socket family and one
of them was extending AF_UNIX to support multicast. We posted our patches [1]
and the feedback was that the AF_UNIX code was already complex and difficult
to maintain. So, we decided to implement a new family (AF_BUS) that is
orthogonal to the rest of the networking stack and imposes no added complexity or
performance penalty on a user who is not using our IPC solution.
Looking at netdev archives I saw that you both raised the question about
multicast on unix sockets and posted an implementation in early 2003. So if I
understand correctly you have been maintaining an out-of-tree solution for around 9
years now.
We developed AF_BUS to improve the performance of the D-Bus IPC system (and our
results show us a 2X speedup) but designed it to be as generic as possible so
other users can take advantage of it.
It would be a great help if you can join the discussion and explain the
arguments of your company (and the other companies you were talking about) in
favor of a simpler multicast socket family.
The fact that your company spent lots of engineering resources to maintain an
out-of-tree patch-set for 9 years should raise some eyebrows and convince more
than one person that a simpler local multicast solution is needed in the Linux
kernel (which was one of the reasons why Google also developed Binder I guess).
[1]: https://lkml.org/lkml/2012/2/20/84
[2]: https://lkml.org/lkml/2003/2/27/150
[3]: http://lwn.net/Articles/27001/
Thanks a lot and best regards,
Javier
* Re: AF_BUS socket address family
2012-06-30 20:41 ` Hans-Peter Jansen
@ 2012-07-02 16:46 ` Alban Crequy
0 siblings, 0 replies; 28+ messages in thread
From: Alban Crequy @ 2012-07-02 16:46 UTC (permalink / raw)
To: Hans-Peter Jansen; +Cc: Vincent Sanders, netdev, linux-kernel, David S. Miller
Sat, 30 Jun 2012 22:41:08 +0200,
"Hans-Peter Jansen" <hpj@urpla.net> wrote :
> Dear Vincent,
>
> On Friday 29 June 2012, 18:45:39 Vincent Sanders wrote:
> > This series adds the bus address family (AF_BUS) it is against
> > net-next as of yesterday.
> >
> > AF_BUS is a message oriented inter process communication system.
> >
> > The principle features are:
> >
> > - Reliable datagram based communication (all sockets are of type
> > SOCK_SEQPACKET)
> >
> > - Multicast message delivery (one to many, unicast as a subset)
> >
> > - Strict ordering (messages are delivered to every client in the
> > same order)
> >
> > - Ability to pass file descriptors
> >
> > - Ability to pass credentials
> >
> > The basic concept is to provide a virtual bus on which multiple
> > processes can communicate and policy is imposed by a "bus master".
> >
> > Introduction
> > ------------
> >
> > AF_BUS is based upon AF_UNIX but extended for multicast operation and
> > removes stream operation, responding to extensive feedback on
> > previous approaches we have made the implementation as isolated as
> > possible. There are opportunities in the future to integrate the
> > socket garbage collector with that of the unix socket implementation.
> >
> > The impetus for creating this IPC mechanism is to replace the
> > underlying transport for D-Bus. The D-Bus system currently emulates
> > this IPC mechanism using AF_UNIX sockets in userspace and has
> > numerous undesirable behaviours. D-Bus is now widely deployed in many
> > areas and has become a de-facto IPC standard. Using this IPC
> > mechanism as a transport gives a significant (100% or more)
> > improvement to throughput with comparable improvement to latency.
>
> Your introduction is missing a comprehensive "Discussion" section, where
> you compare the AF_UNIX based implementation with AF_BUS ones.
>
> You should elaborate on each of the above noted undesirable behaviours,
> why and how AF_BUS is advantageous. Show the workarounds, that are
> needed by AF_UNIX to operate (properly?!?) and how the new
> implementation is going to improve this situation.
Hi Hans-Peter,
Thanks for your feedback. I would like to elaborate on the priority
inversion and on the latency.
Priority inversion:
===================
A bus can have users with different priorities. The classical example was
Nokia's N900 phone. An incoming phone call should query the contact
database, start the correct ringtone, display the correct avatar very
quickly. Other background tasks don't have the same priority. Since all
messages go through dbus-daemon, it is a single bottleneck and the
kernel has no way to schedule the processes with the correct
priorities. Low priority messages are waking up dbus-daemon as much as
high priority messages.
A workaround was to set the nice level of dbus-daemon to -5. It didn't
really address the priority inversion, but it reduces the number of
context switches on multicast messages, and that helped a bit. The
diagram "Experiment #3" on this blog post shows dbus-daemon is no
longer context switched for every recipient of a multicast message:
http://alban-apinc.blogspot.co.uk/2011/12/importance-of-scheduling-priority-in-d.html
With AF_BUS, there is no single process that has to receive all messages
from low priority processes and high priority processes. The kernel can
schedule the high priority processes and they can progress in their
communication without having dbus-daemon involved.
Latency:
========
On AF_UNIX, a message round-trip would go like this:
- the sender sends a message to dbus-daemon
- dbus-daemon receives it and forwards it to the correct recipient
- the recipient receives it and replies with a new message sent to
dbus-daemon
- dbus-daemon receives the reply and forwards it to the initial sender
- the sender receives the reply.
There is a total of 4 context switches.
On AF_BUS, the messages are most of the time not routed by dbus-daemon,
which halves the number of context switches. This reduced the latency and
brought the performance improvement mentioned by Vincent.
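The hop counts in the two cases can be checked with a small sketch. It runs in a single process, so it counts message deliveries (each of which would wake a different process in a real system) rather than measuring real context switches. It uses AF_UNIX SOCK_SEQPACKET socketpairs for both paths, and the function names are made up for this illustration.

```python
import socket


def seqpacket_pair():
    return socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)


def routed_round_trip():
    """Sender and recipient each talk only to a relay in the daemon role."""
    sender, daemon_to_sender = seqpacket_pair()
    daemon_to_recipient, recipient = seqpacket_pair()
    deliveries = 0
    try:
        sender.send(b"call")
        msg = daemon_to_sender.recv(64)      # daemon wakes up
        deliveries += 1
        daemon_to_recipient.send(msg)
        msg = recipient.recv(64)             # recipient wakes up
        deliveries += 1
        recipient.send(b"reply")
        msg = daemon_to_recipient.recv(64)   # daemon wakes up again
        deliveries += 1
        daemon_to_sender.send(msg)
        msg = sender.recv(64)                # sender wakes up
        deliveries += 1
        return deliveries, msg
    finally:
        for s in (sender, daemon_to_sender, daemon_to_recipient, recipient):
            s.close()


def direct_round_trip():
    """Sender and recipient exchange messages without a relay."""
    sender, recipient = seqpacket_pair()
    deliveries = 0
    try:
        sender.send(b"call")
        recipient.recv(64)                   # recipient wakes up
        deliveries += 1
        recipient.send(b"reply")
        msg = sender.recv(64)                # sender wakes up
        deliveries += 1
        return deliveries, msg
    finally:
        sender.close()
        recipient.close()
```

Here routed_round_trip() counts four deliveries and direct_round_trip() counts two, matching the halving described above.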
> This will help to get some progress into the indurated discussion here.
>
> Please also note, that, while your aims are nice and sound, it's even
> more important for IPC mechanisms to operate properly - even during
> persisting error conditions (crashed bus master and clients,
> misbehaving or even abusing members). It would be cool to create a
> D-BUS test rig, that not only measures performance numbers, but also
> checks for deadlocks, corner cases and abuse attempts in both IPC
> implementations.
>
> It's a juggling act: while AF_UNIX might suffer from downsides, the code
> is heavily exercised in every aspect. Your implementation will only be
> exercised by a handful of users (basically one lib), but in order to
> justify its existence in kernel space, such extensions need different
> kinds of users, and the basic concepts need to fit in the whole kernel
> picture as well, or you need to call it AF_DBUS with even less chance
> to get it into mainstream.
I am hoping there will be more users with different use-cases and it
should help to improve AF_BUS and fix the unavoidable bugs in young
code. I would be happy if AF_BUS reduces the cost of maintaining the
out-of-tree multicast messaging protocol family based on AF_UNIX
mentioned by Chris Friesen.
Thank you!
Alban
* Re: AF_BUS socket address family
2012-07-02 15:18 AF_BUS socket address family Javier Martinez Canillas
@ 2012-07-03 16:52 ` Chris Friesen
2012-07-03 17:18 ` Chris Friesen
0 siblings, 1 reply; 28+ messages in thread
From: Chris Friesen @ 2012-07-03 16:52 UTC (permalink / raw)
To: Javier Martinez Canillas
Cc: David Miller, vincent.sanders, netdev, linux-kernel
On 07/02/2012 09:18 AM, Javier Martinez Canillas wrote:
> We tried different approaches before developing the AF_BUS socket family and one
> of them was extending AF_UNIX to support multicast. We posted our patches [1]
> and the feedback was that the AF_UNIX code was already a complex and difficult
> code to maintain. So, we decided to implement a new family (AF_BUS) that is
> orthogonal to the rest of the networking stack and no added complexity nor
> performance penalty would pay a user not using our IPC solution.
That's what I ended up doing as well. In our case it's basically a
stripped-down AF_UNIX with only datagram support, no security, no fd
passing, etc., but with the addition of multicast and wildcard (for
debugging).
> Looking at netdev archives I saw that you both raised the question about
> multicast on unix sockets and post an implementation on early 2003. So if I
> understand correctly you are maintaining an out-of-tree solution for around 9
> years now.
That's correct.
> It would be a great help if you can join the discussion and explain the
> arguments of your company (and the others companies you were talking about) in
> favor of a simpler multicast socket family.
>
> The fact that your company spent lots of engineering resources to maintain an
> out-of-tree patch-set for 9 years should raise some eyebrows and convince more
> than one people that a simpler local multicast solution is needed on the Linux
> kernel (which was one of the reasons why Google also developed Binder I guess).
To be fair, since it was implemented as a separate protocol family the
maintenance burden actually hasn't been large--it's been fairly simple
to port between versions. Also, we do embedded telecom stuff and don't
jump kernel versions all that often. (It's a big headache, requires
coordinating between multiple vendors, etc.)
In our case we typically send small (100-200 byte) messages to a
smallish (1-10) number of listeners, though there are exceptions of
course. Before I started, the original implementation used a
userspace daemon, but it had a number of issues. Originally I was
focussed on the performance gains but I must admit that since then other
factors have made that less of an issue.
Among other things, this messaging is used on some systems to configure
the IP addressing for the system, so it does simplify things to not use
an IP-based protocol for this purpose.
Also, back when I did my original implementation IP multicast wasn't
supported on the loopback device--David, has that changed since then?
If it has, then we probably could figure out a way to make it work using
IP multicast, but I don't know that it would be worth the effort given
the minimal ongoing maintenance costs for our patch.
Chris
* Re: AF_BUS socket address family
2012-07-03 16:52 ` Chris Friesen
@ 2012-07-03 17:18 ` Chris Friesen
0 siblings, 0 replies; 28+ messages in thread
From: Chris Friesen @ 2012-07-03 17:18 UTC (permalink / raw)
To: Javier Martinez Canillas
Cc: David Miller, vincent.sanders, netdev, linux-kernel
On 07/03/2012 10:52 AM, Chris Friesen wrote:
> To be fair, since it was implemented as a separate protocol family the
> maintenance burden actually hasn't been large--it's been fairly simple
> to port between versions. Also, we do embedded telecom stuff and don't
> jump kernel versions all that often. (It's a big headache, requires
> coordinating between multiple vendors, etc.)
>
> In our case we typically send small (100-200 byte) messages to a
> smallish (1-10) number of listeners, though there are exceptions of
> course. Back before I started the original implementation used a
> userspace daemon, but it had a number of issues. Originally I was
> focussed on the performance gains but I must admit that since then other
> factors have made that less of an issue.
I should point out that some of the other factors that have been
discussed for AF_BUS also hold true for our implementation:
--strict ordering
--reliable (in our case, if the sender has space in the tx buffer then
messages get to all recipients with buffer space, there are kernel logs
if recipients don't have space)
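To make the delivery rule above concrete, here is a toy userspace model, not the actual out-of-tree code: every recipient with free queue space receives each multicast message in the same order, and a recipient whose queue is full is skipped and recorded (the list stands in for the kernel log). The class name ToyBus, its methods, and the per-recipient queue limit are all assumptions made for the illustration, and the sender-side tx buffer check is omitted.

```python
from collections import deque


class ToyBus:
    """Toy model of ordered multicast with bounded per-recipient queues."""

    def __init__(self, queue_limit):
        self.queue_limit = queue_limit
        self.queues = {}    # recipient name -> deque of pending messages
        self.dropped = []   # (recipient, message) pairs; stands in for a kernel log

    def join(self, name):
        self.queues[name] = deque()

    def multicast(self, message):
        """Deliver to every recipient with space; log the ones without."""
        delivered = []
        for name, queue in self.queues.items():
            if len(queue) < self.queue_limit:
                queue.append(message)
                delivered.append(name)
            else:
                self.dropped.append((name, message))
        return delivered

    def receive(self, name):
        return self.queues[name].popleft()


def toy_bus_demo():
    bus = ToyBus(queue_limit=2)
    bus.join("a")
    bus.join("b")
    bus.multicast("m1")
    bus.multicast("m2")
    first_for_a = bus.receive("a")       # "a" frees one slot, "b" stays full
    delivered = bus.multicast("m3")      # only "a" has space now
    return first_for_a, delivered, list(bus.queues["a"]), list(bus.queues["b"]), bus.dropped
```

In this model a slow recipient never reorders or blocks delivery to the others; it only misses messages, and the miss is recorded.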
Also, the fact that it's in the kernel rather than a userspace daemon
reduces priority inversion type issues. Presumably this would apply to
an IP-multicast based solution as well.
One problem that I ran into back when I was experimenting with this
stuff was trying to isolate host-local IP multicast from the rest of the
network. It would be suboptimal to need to set up filtering and such
before being able to use the communication protocol.
Chris
* Re: AF_BUS socket address family
2012-06-29 16:45 Vincent Sanders
` (3 preceding siblings ...)
2012-06-30 20:41 ` Hans-Peter Jansen
@ 2012-07-05 7:59 ` Linus Walleij
2012-07-05 16:01 ` Daniel Walker
4 siblings, 1 reply; 28+ messages in thread
From: Linus Walleij @ 2012-07-05 7:59 UTC (permalink / raw)
To: Vincent Sanders
Cc: netdev, linux-kernel, David S. Miller, Arve Hjønnevåg,
Daniel Walker, John Stultz, Anton Vorontsov, Greg Kroah-Hartman
2012/6/29 Vincent Sanders <vincent.sanders@collabora.co.uk>:
> AF_BUS is a message oriented inter process communication system.
We have a very huge and important in-kernel IPC message passer
in drivers/staging/android/binder.c
It's deployed in some 400 million devices according to latest reports.
John Stultz & Anton Vorontsov are trying to look after these Android
drivers a bit...
I and others discussed this in the past with the Android folks. Dianne
makes an excellent summary of how it works here:
https://lkml.org/lkml/2009/6/25/3
If we could all be convinced that this thing also fulfills the needs
of what binder does, this is a pretty solid case for it too. I can
sure see that some of the shortcuts that Android is taking with
binder try to address the same issue of high-speed IPC loopholes
through the kernel and some kind of security model.
Whether Android would actually use it (or wrap it) is a totally
different question, but what I think we need to know is whether it
*could*. And staging code has to move forward, maybe this
is the direction it should move?
Yours,
Linus Walleij
* Re: AF_BUS socket address family
2012-07-05 7:59 ` Linus Walleij
@ 2012-07-05 16:01 ` Daniel Walker
0 siblings, 0 replies; 28+ messages in thread
From: Daniel Walker @ 2012-07-05 16:01 UTC (permalink / raw)
To: Linus Walleij
Cc: Vincent Sanders, netdev, linux-kernel, David S. Miller,
Arve Hjønnevåg, John Stultz, Anton Vorontsov,
Greg Kroah-Hartman
On Thu, Jul 05, 2012 at 09:59:53AM +0200, Linus Walleij wrote:
> 2012/6/29 Vincent Sanders <vincent.sanders@collabora.co.uk>:
>
> > AF_BUS is a message oriented inter process communication system.
>
> We have a very huge and important in-kernel IPC message passer
> in drivers/staging/android/binder.c
>
> It's deployed in some 400 million devices according to latest reports.
> John Stultz & Anton Vorontsov are trying to look after these Android
> drivers a bit...
>
> I and others discussed this in the past with the Android folks. Dianne
> makes an excellent summary of how it works here:
> https://lkml.org/lkml/2009/6/25/3
>
> If we could all be convinced that this thing also fulfills the needs
> of what binder does, this is a pretty solid case for it too. I can
> sure see that some of the shortcuts that Android is taking with
> binder try to address the same issue of high-speed IPC loopholes
> through the kernel and some kind of security model.
>
> Whether Android would actually use it (or wrap it) is a totally
> different question, but what I think we need to know is whether it
> *could*. And staging code has to move forward, maybe this
> is the direction it should move?
I'm all for alternatives.. I haven't read this thread at all, but I read an LWN
article comparing Binder and other implementations .. So there are for sure
alternatives. It would be nice if things were included that did whatever binder
needs .. Since I did the logger performance analysis the big question to me is
if Binder is actually fast (or faster than the alternatives). Whatever this
AF_BUS thing is, reviewing the performance of the major alternative(s) is probably a
good idea.
In terms of Android using anything we produce or incorporate, don't get
your hopes up .. They will always just use Binder .. (John is good cop, I'm bad
cop)
Daniel
* Re: AF_BUS socket address family
2012-06-29 23:12 ` Vincent Sanders
2012-06-29 23:18 ` David Miller
@ 2012-07-05 21:06 ` Jan Engelhardt
2012-07-06 18:27 ` Chris Friesen
1 sibling, 1 reply; 28+ messages in thread
From: Jan Engelhardt @ 2012-07-05 21:06 UTC (permalink / raw)
To: Vincent Sanders; +Cc: David Miller, netdev, linux-kernel
On Saturday 2012-06-30 01:12, Vincent Sanders wrote:
>
>Firstly it is intended as an interprocess mechanism and not to rely on
>a configured IP system, indeed one of its primary usages is to
>provide mechanism for various tools to set up IP networking.
Using IP as a localhost IPC is not uncommon (independent of
software preferring AF_UNIX, if so available). Distro boot
scripts have been running `ip addr add ::1/128 dev lo`
all these years along.
And now we suddenly need a DBUS program just to configure
IP-based localhost IPC? I can see the flaw in that.
* Re: AF_BUS socket address family
2012-07-05 21:06 ` Jan Engelhardt
@ 2012-07-06 18:27 ` Chris Friesen
0 siblings, 0 replies; 28+ messages in thread
From: Chris Friesen @ 2012-07-06 18:27 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: Vincent Sanders, David Miller, netdev, linux-kernel
> On Saturday 2012-06-30 01:12, Vincent Sanders wrote:
>> Firstly it is intended as an interprocess mechanism and not to rely on
>> a configured IP system, indeed one of its primary usages is to
>> provide mechanism for various tools to set up IP networking.
> Using IP as a localhost IPC is not uncommon (independent of
> software preferring AF_UNIX, if so available). Distro boot
> scripts have been running `ip addr add ::1/128 dev lo`
> all these years along.
>
> And now we suddenly need a DBUS program just to configure
> IP-based localhost IPC? I can see the flaw in that.
>
I haven't tried it in a while but it used to be that you couldn't use IP
multicast on the "lo" device. Has that been fixed?
Chris