public inbox for linux-kernel@vger.kernel.org
* Revisit AF_BUS: is it a better way to implement KDBUS?
@ 2015-07-30 13:09 cee1
  2015-07-30 18:12 ` Andy Lutomirski
  2015-07-31 10:09 ` cee1
  0 siblings, 2 replies; 8+ messages in thread
From: cee1 @ 2015-07-30 13:09 UTC (permalink / raw)
  To: LKML; +Cc: Greg KH, Lennart Poettering, dh.herrmann, gnomes, luto

Hi all,

I'm interested in the idea of AF_BUS.

There have already been various discussions about it:
* Missing the AF_BUS - https://lwn.net/Articles/504970/
* Kroah-Hartman: AF_BUS, D-Bus, and the Linux kernel -
http://lwn.net/Articles/537021/
* presentation-kdbus -
https://github.com/gregkh/presentation-kdbus/blob/master/kdbus.txt
* Re: [GIT PULL] kdbus for 4.1-rc1 - https://lwn.net/Articles/641278/
* The kdbuswreck - https://lwn.net/Articles/641275/

I'm wondering whether it would be a better approach, that is, a general
mechanism for implementing various __Bus__-oriented IPCs, such as Binder[2],
DBus[1], etc.

The original design of AF_BUS is at
https://github.com/Airtau/genivi/blob/master/af_bus-linux/0002-net-bus-Add-AF_BUS-documentation.patch.
What follows is my version of AF_BUS.

Some characteristics of a Bus-oriented IPC:
1. A process creates a Bus; that process is then called the 'bus master'.
2. A process connects to a Bus and is assigned Bus address(es).
3. Sending/receiving multicast messages, in addition to P2P communication.
4. The implementation may be based on a shared memory model to avoid unnecessary copies.

## How to map point 1: """A process creates a Bus; that process is then called the 'bus master'"""
The [bus master] acts:

struct sockaddr_bus {
        sa_family_t     sbus_family;                    /* AF_BUS */
        unsigned short  sbus_addr_ncomp;                /* number of components of sbus_addr */
        char            sbus_path[BUS_PATH_MAX];        /* pathname of this bus */
        uint64_t        sbus_addr[BUS_ADDR_COMP_MAX];   /* address within the bus */
};
#define BUS_ADDR_MAX    (BUS_ADDR_COMP_MAX * sizeof(uint64_t))

char bus_path[] = "/tmp/test"; /* non-abstract path */
char bus_addr[] = "org.example.bus";
struct sockaddr_bus addr = { .sbus_family = AF_BUS };

/* addr is zero-initialized, so sbus_path stays NUL-terminated */
strncpy(addr.sbus_path, bus_path, BUS_PATH_MAX - 2);
memcpy(addr.sbus_addr, bus_addr, MIN(sizeof(bus_addr), BUS_ADDR_MAX));
addr.sbus_addr_ncomp = MIN(ALIGN(sizeof(bus_addr), 8) / 8, BUS_ADDR_COMP_MAX);

bus_fd = socket(AF_BUS, SOCK_DGRAM, 0);
/* creates a Bus, becomes the master of the bus */
bind(bus_fd, (struct sockaddr *)&addr, sizeof(struct sockaddr_bus));
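
The address setup above can be wrapped in a small helper that rejects
oversized input instead of silently truncating it. This is only a sketch:
AF_BUS is not in mainline, so the AF_BUS value, BUS_PATH_MAX and
BUS_ADDR_COMP_MAX below are placeholder assumptions, and bus_addr_fill()
is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>         /* sa_family_t */

/* Placeholder values: none of these exist in real kernel headers. */
#ifndef AF_BUS
#define AF_BUS                  0x7f
#endif
#define BUS_PATH_MAX            108
#define BUS_ADDR_COMP_MAX       8
#define BUS_ADDR_MAX            (BUS_ADDR_COMP_MAX * sizeof(uint64_t))

struct sockaddr_bus {
        sa_family_t     sbus_family;
        unsigned short  sbus_addr_ncomp;
        char            sbus_path[BUS_PATH_MAX];
        uint64_t        sbus_addr[BUS_ADDR_COMP_MAX];
};

/* Fill *sa from a bus path and a textual address.
 * Returns 0 on success, -1 if either string does not fit. */
int bus_addr_fill(struct sockaddr_bus *sa, const char *path, const char *name)
{
        size_t path_len, name_len;

        assert(sa && path && name);
        path_len = strlen(path);
        name_len = strlen(name) + 1;    /* count the NUL, as sizeof() does above */

        /* Leave the last two bytes of sbus_path alone: one NUL terminator
         * plus the final byte, which the proposal uses as multicast marker. */
        if (path_len > BUS_PATH_MAX - 2 || name_len > BUS_ADDR_MAX)
                return -1;

        memset(sa, 0, sizeof(*sa));
        sa->sbus_family = AF_BUS;
        memcpy(sa->sbus_path, path, path_len);
        memcpy(sa->sbus_addr, name, name_len);
        /* number of 64-bit components needed, rounded up */
        sa->sbus_addr_ncomp = (unsigned short)((name_len + 7) / 8);
        return 0;
}
```

With "org.example.bus" (15 characters plus the NUL) the address occupies
exactly two 64-bit components.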


## How to map point 2: """A process connects to a Bus and is assigned Bus address(es)"""
### The [bus endpoint] acts:
fd = socket(AF_BUS, SOCK_DGRAM, 0);

/* AUTH message setup */
struct msghdr msghdr = {
        .msg_name = &addr, /* bus master's addr */
        .msg_namelen = sizeof(struct sockaddr_bus),
        .msg_iov = &auth_iovec,
        .msg_iovlen = 1,
};

msghdr.msg_controllen = CMSG_SPACE(sizeof(struct ucred));
msghdr.msg_control = alloca(msghdr.msg_controllen);
struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msghdr);
cmsg->cmsg_level = SOL_SOCKET;
cmsg->cmsg_type = SCM_CREDENTIALS;
cmsg->cmsg_len = CMSG_LEN(sizeof(struct ucred));
struct ucred *ucred = (struct ucred *) CMSG_DATA(cmsg);
ucred->pid = getpid();
ucred->uid = getuid();
ucred->gid = getgid();

sendmsg(fd, &msghdr, MSG_NOSIGNAL);

### The [bus master] acts:
int optval = 1;
setsockopt(bus_fd, SOL_SOCKET, SO_PASSCRED, &optval, sizeof(optval));
recvmsg(bus_fd, &msghdr, 0); /* MSG_NOSIGNAL only matters when sending */

/* do AUTH ... */

msghdr.msg_iov = &reply_iovec;
msghdr.msg_iovlen = 1;
msghdr.msg_controllen = 0;
msghdr.msg_control = NULL;

if (auth_ok) {
        /* bus master allocates a bus addr */
        char bus_path[] = "/tmp/test";
        char ret_bus_addr[] = "1.1";
        struct sockaddr_bus ret_addr = { .sbus_family = AF_BUS };

        strncpy(ret_addr.sbus_path, bus_path, BUS_PATH_MAX - 2);
        memcpy(ret_addr.sbus_addr, ret_bus_addr,
               MIN(sizeof(ret_bus_addr), BUS_ADDR_MAX));
        ret_addr.sbus_addr_ncomp = MIN(ALIGN(sizeof(ret_bus_addr), 8) / 8,
                                       BUS_ADDR_COMP_MAX);

        /*
         * 1. bus master returns the bus addr
         * 2. kernel will apply it against the bus endpoint
         * 3. the bus endpoint is then able to talk with endpoints on the bus.
         */
        msghdr.msg_controllen = CMSG_SPACE(sizeof(struct sockaddr_bus));
        msghdr.msg_control = alloca(msghdr.msg_controllen);
        cmsg = CMSG_FIRSTHDR(&msghdr);
        cmsg->cmsg_level = BUS_SOCKET;
        cmsg->cmsg_type = SCM_OWNED_ADDR;
        cmsg->cmsg_len = CMSG_LEN(sizeof(struct sockaddr_bus));
        memcpy(CMSG_DATA(cmsg), &ret_addr, sizeof(struct sockaddr_bus));
}
sendmsg(bus_fd, &msghdr, MSG_NOSIGNAL);
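
The reply above relies on the ordinary ancillary-data layout, so the
packing and parsing can be tried in a plain buffer without any socket.
The sketch below uses the real CMSG_* macros from <sys/socket.h>; the
BUS_SOCKET and SCM_OWNED_ADDR values are placeholders, and
bus_cmsg_pack()/bus_cmsg_find() are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Placeholder values: these constants only exist in the proposal. */
#define BUS_SOCKET      0x200
#define SCM_OWNED_ADDR  1

/* Pack one BUS_SOCKET control message with `len` payload bytes into buf.
 * buf must be aligned for struct cmsghdr.
 * Returns the number of bytes used, or 0 if buf is too small. */
size_t bus_cmsg_pack(void *buf, size_t buflen, int type,
                     const void *payload, size_t len)
{
        struct msghdr mh;
        struct cmsghdr *cmsg;

        assert(buf && payload);
        if (buflen < CMSG_SPACE(len))
                return 0;

        memset(buf, 0, CMSG_SPACE(len));
        memset(&mh, 0, sizeof(mh));
        mh.msg_control = buf;
        mh.msg_controllen = CMSG_SPACE(len);

        cmsg = CMSG_FIRSTHDR(&mh);
        cmsg->cmsg_level = BUS_SOCKET;
        cmsg->cmsg_type = type;
        cmsg->cmsg_len = CMSG_LEN(len);
        memcpy(CMSG_DATA(cmsg), payload, len);
        return CMSG_SPACE(len);
}

/* Find the first BUS_SOCKET control message of the given type.
 * Returns a pointer to its payload (length in *len_out), or NULL. */
const void *bus_cmsg_find(void *buf, size_t buflen, int type, size_t *len_out)
{
        struct msghdr mh;
        struct cmsghdr *cmsg;

        memset(&mh, 0, sizeof(mh));
        mh.msg_control = buf;
        mh.msg_controllen = buflen;

        for (cmsg = CMSG_FIRSTHDR(&mh); cmsg; cmsg = CMSG_NXTHDR(&mh, cmsg)) {
                if (cmsg->cmsg_level == BUS_SOCKET && cmsg->cmsg_type == type) {
                        *len_out = cmsg->cmsg_len - CMSG_LEN(0);
                        return CMSG_DATA(cmsg);
                }
        }
        return NULL;
}
```

In the proposal the kernel, not the receiving endpoint, would be the one
interpreting SCM_OWNED_ADDR; the parser is here only to show that the
layout round-trips.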


## How to map point 3: """Sending/receiving multicast messages, in addition to P2P communication"""
### P2P communication
Sometimes a bus endpoint may be assigned multiple addresses, and it may
want to send a message from a specific one.

struct msghdr msghdr = {
        .msg_name = &dst_addr,
        .msg_namelen = sizeof(struct sockaddr_bus),
        .msg_iov = &msg_iovec,
        .msg_iovlen = 1,
};

char bus_path[] = "/tmp/test";
char bus_addr[] = "com.example.service1";
struct sockaddr_bus src_addr = { .sbus_family = AF_BUS };

strncpy(src_addr.sbus_path, bus_path, BUS_PATH_MAX - 2);
memcpy(src_addr.sbus_addr, bus_addr, MIN(sizeof(bus_addr), BUS_ADDR_MAX));
src_addr.sbus_addr_ncomp = MIN(ALIGN(sizeof(bus_addr), 8) / 8,
                               BUS_ADDR_COMP_MAX);

msghdr.msg_controllen = CMSG_SPACE(sizeof(struct sockaddr_bus));
msghdr.msg_control = alloca(msghdr.msg_controllen);
cmsg = CMSG_FIRSTHDR(&msghdr);
cmsg->cmsg_level = BUS_SOCKET;
cmsg->cmsg_type = SCM_SRC_ADDR;
cmsg->cmsg_len = CMSG_LEN(sizeof(struct sockaddr_bus));
memcpy(CMSG_DATA(cmsg), &src_addr, sizeof(struct sockaddr_bus));

sendmsg(my_sock_fd, &msghdr, MSG_NOSIGNAL);

### Multicast
The multicast address may look like:
        {
                .sbus_family = AF_BUS,

                /* In a multicast addr, sbus_path is '*'-terminated */
                .sbus_path = "/tmp/test\0\0\0\0\0...*",

                .sbus_addr_ncomp = 8,
                .sbus_addr = /* 8 * 64-bit bit array, for example */
        }

The receiver asks the [bus master] for permission to receive
messages from a set of multicast addresses, and the bus master grants
it by replying with a control message:
        {
                .cmsg_level = BUS_SOCKET,
                .cmsg_type = SCM_MULTICAST_MATCH,
                .cmsg_data = /* the requested struct sockaddr_bus */
        }

How does matching happen?
Let's assume someone sends a message to multicast address maddr1, and
the receiver has been granted a match of maddr2:

The [kernel]:
        is_matched = (maddr1 & maddr2) == maddr2

In this way, userspace can deploy bloom filters, and it may then
further apply eBPF to filter out "false positive" cases.
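
A minimal sketch of the match rule over a multi-component address, plus
one way userspace might fill the bit array as a bloom filter. The hash
(FNV-1a), the choice of three bits per key, and the function names are
assumptions made for illustration; the proposal does not specify them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUS_ADDR_COMP_MAX 8     /* assumed: 8 * 64 = 512 filter bits */

/* A message sent to `sent` reaches a receiver holding `granted` when
 * every bit set in `granted` is also set in `sent`. */
int bus_mcast_match(const uint64_t *sent, const uint64_t *granted, size_t ncomp)
{
        size_t i;

        for (i = 0; i < ncomp; i++)
                if ((sent[i] & granted[i]) != granted[i])
                        return 0;
        return 1;
}

/* FNV-1a over a string, perturbed by a seed (illustrative choice). */
static uint64_t fnv1a(const char *s, uint64_t seed)
{
        uint64_t h = 14695981039346656037ULL ^ seed;

        while (*s) {
                h ^= (unsigned char)*s++;
                h *= 1099511628211ULL;
        }
        return h;
}

/* Set three bits for `key` in the filter. */
void bus_bloom_add(uint64_t *filter, size_t ncomp, const char *key)
{
        uint64_t seed;

        assert(ncomp > 0);
        for (seed = 0; seed < 3; seed++) {
                uint64_t bit = fnv1a(key, seed) % (ncomp * 64);

                filter[bit / 64] |= 1ULL << (bit % 64);
        }
}
```

A sender would add every property of the message to its filter, while a
receiver adds only the properties it cares about. A subset always
matches; an unrelated subscription can still match by accident, which is
the "false positive" case left to eBPF.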

## How to avoid unnecessary copy?
A sockopt similar to PACKET_RX_RING[3] may be introduced, which brings
a mmap/shared memory style API.
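
To make the ring idea concrete, here is a toy single-producer,
single-consumer ring in ordinary memory. It only mirrors the general
pattern of packet(7) rings, where a per-frame status word says who owns
the slot; the struct layout, sizes and names are assumptions, and a real
shared mapping would also need memory barriers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_FRAMES     4
#define FRAME_PAYLOAD   64

enum { FRAME_FREE = 0, FRAME_READY = 1 };       /* slot owner: producer / consumer */

struct frame {
        uint32_t status;
        uint32_t len;
        char     data[FRAME_PAYLOAD];
};

struct ring {
        struct frame frames[RING_FRAMES];       /* would live in the mmap()ed area */
        unsigned     head;                      /* next slot the producer fills */
        unsigned     tail;                      /* next slot the consumer reads */
};

/* Producer side (the kernel, in a real interface).
 * Returns 0, or -1 if the ring is full or the message is too large. */
int ring_produce(struct ring *r, const void *msg, uint32_t len)
{
        struct frame *f;

        assert(r && msg);
        f = &r->frames[r->head];
        if (f->status != FRAME_FREE || len > FRAME_PAYLOAD)
                return -1;
        memcpy(f->data, msg, len);
        f->len = len;
        f->status = FRAME_READY;        /* real code: release barrier before this store */
        r->head = (r->head + 1) % RING_FRAMES;
        return 0;
}

/* Consumer side: returns a pointer into the ring (no copy), or NULL. */
const struct frame *ring_peek(const struct ring *r)
{
        const struct frame *f = &r->frames[r->tail];

        return f->status == FRAME_READY ? f : NULL;
}

/* Consumer side: hand the current slot back to the producer. */
void ring_release(struct ring *r)
{
        r->frames[r->tail].status = FRAME_FREE;
        r->tail = (r->tail + 1) % RING_FRAMES;
}
```

The consumer reads the payload in place and releases the slot when done,
so the only copy left is the one made by the producer.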


## Other thoughts
1. The bus master may want to receive notifications from the kernel,
such as "a bus endpoint died". A special sockaddr_bus "{
.sbus_addr_ncomp = 0, .sbus_addr = NULL }" indicates a message from
the kernel.
2. A bus endpoint may pass a memfd to another bus endpoint, and the two
can then communicate through an mmap/shared memory model if they need
ultimate performance.
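
Passing a descriptor between endpoints already works today over AF_UNIX
with SCM_RIGHTS, and presumably AF_BUS would reuse the same mechanism.
The sketch below shows only the fd-passing step; send_fd() and recv_fd()
are hypothetical helper names, AF_UNIX stands in for AF_BUS, and the
descriptor would be one returned by memfd_create(2) in the scenario
above.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one file descriptor over a socket as SCM_RIGHTS ancillary data.
 * Returns 0 on success, -1 on failure. */
int send_fd(int sock, int fd)
{
        char byte = 0;          /* at least one byte of real data must be sent */
        struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
        union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } u;
        struct msghdr mh;
        struct cmsghdr *cmsg;

        assert(fd >= 0);
        memset(&u, 0, sizeof(u));
        memset(&mh, 0, sizeof(mh));
        mh.msg_iov = &iov;
        mh.msg_iovlen = 1;
        mh.msg_control = u.buf;
        mh.msg_controllen = sizeof(u.buf);

        cmsg = CMSG_FIRSTHDR(&mh);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

        return sendmsg(sock, &mh, MSG_NOSIGNAL) == 1 ? 0 : -1;
}

/* Receive one file descriptor. Returns the new descriptor, or -1. */
int recv_fd(int sock)
{
        char byte;
        struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
        union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } u;
        struct msghdr mh;
        struct cmsghdr *cmsg;
        int fd = -1;

        memset(&mh, 0, sizeof(mh));
        mh.msg_iov = &iov;
        mh.msg_iovlen = 1;
        mh.msg_control = u.buf;
        mh.msg_controllen = sizeof(u.buf);

        if (recvmsg(sock, &mh, 0) != 1)
                return -1;

        cmsg = CMSG_FIRSTHDR(&mh);
        if (cmsg && cmsg->cmsg_level == SOL_SOCKET &&
            cmsg->cmsg_type == SCM_RIGHTS &&
            cmsg->cmsg_len == CMSG_LEN(sizeof(int)))
                memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
        return fd;
}
```

After recv_fd() the receiver holds its own descriptor for the same open
file, so both sides can mmap() it and exchange data without further
copies through the socket.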



---
1. http://www.freedesktop.org/wiki/Software/dbus/
2. http://elinux.org/Android_Binder
3. http://man7.org/linux/man-pages/man7/packet.7.html



Regards,

- cee1


* Re: Revisit AF_BUS: is it a better way to implement KDBUS?
  2015-07-30 13:09 Revisit AF_BUS: is it a better way to implement KDBUS? cee1
@ 2015-07-30 18:12 ` Andy Lutomirski
  2015-07-31  4:01   ` Greg KH
                     ` (2 more replies)
  2015-07-31 10:09 ` cee1
  1 sibling, 3 replies; 8+ messages in thread
From: Andy Lutomirski @ 2015-07-30 18:12 UTC (permalink / raw)
  To: cee1; +Cc: LKML, Greg KH, Lennart Poettering, David Herrmann,
	One Thousand Gnomes

On Thu, Jul 30, 2015 at 6:09 AM, cee1 <fykcee1@gmail.com> wrote:
> Hi all,
>
> I'm interested in the idea of AF_BUS.
>
> There have already been various discussions about it:
> * Missing the AF_BUS - https://lwn.net/Articles/504970/
> * Kroah-Hartman: AF_BUS, D-Bus, and the Linux kernel -
> http://lwn.net/Articles/537021/
> * presentation-kdbus -
> https://github.com/gregkh/presentation-kdbus/blob/master/kdbus.txt
> * Re: [GIT PULL] kdbus for 4.1-rc1 - https://lwn.net/Articles/641278/
> * The kdbuswreck - https://lwn.net/Articles/641275/
>
> I'm wondering whether it would be a better approach, that is, a general
> mechanism for implementing various __Bus__-oriented IPCs, such as Binder[2],
> DBus[1], etc.

I find myself wondering whether an in-kernel *bus* is a good idea at
all.  Creating a bus that unprivileged programs are allowed to
broadcast on (which is kind of the point) opens up big cans of worms.
Namely: what happens when producers produce data faster than the
consumers consume it?  Keep in mind that, with a bus, this scales
pretty badly.  Each producer's sends are multiplied by the number of
participants.

At some point soon, I'm planning on playing with Fedora Rawhide with
kdbus.  Anything's possible (maybe), but I'd be rather surprised if it
holds up under abuse of the bus.

ISTM kdbus is trying to solve a few problems that really can't be
solved together: it wants (mostly) reliable delivery, it wants
globally ordered messages, and it wants broadcasts.  That means that,
if message N gets broadcast, then, until *every* recipient has
received message N, message N and all of its successors need to be
buffered somewhere.  I see how this works (by massive use of tmpfs),
but I don't see how it's going to work *well*.

Certainly approximate solutions are possible, but is the kernel really
a good place to arbitrate which messages survive under pressure?


People can throw code at this all they want, but ISTM the problem that
the dbus community wants to solve doesn't actually admit a scalable
solution.

--Andy


* Re: Revisit AF_BUS: is it a better way to implement KDBUS?
  2015-07-30 18:12 ` Andy Lutomirski
@ 2015-07-31  4:01   ` Greg KH
  2015-07-31  9:52   ` cee1
  2015-07-31 16:25   ` cee1
  2 siblings, 0 replies; 8+ messages in thread
From: Greg KH @ 2015-07-31  4:01 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: cee1, LKML, Lennart Poettering, David Herrmann,
	One Thousand Gnomes

On Thu, Jul 30, 2015 at 11:12:44AM -0700, Andy Lutomirski wrote:
> On Thu, Jul 30, 2015 at 6:09 AM, cee1 <fykcee1@gmail.com> wrote:
> > Hi all,
> >
> > I'm interested in the idea of AF_BUS.
> >
> > There have already been various discussions about it:
> > * Missing the AF_BUS - https://lwn.net/Articles/504970/
> > * Kroah-Hartman: AF_BUS, D-Bus, and the Linux kernel -
> > http://lwn.net/Articles/537021/
> > * presentation-kdbus -
> > https://github.com/gregkh/presentation-kdbus/blob/master/kdbus.txt
> > * Re: [GIT PULL] kdbus for 4.1-rc1 - https://lwn.net/Articles/641278/
> > * The kdbuswreck - https://lwn.net/Articles/641275/
> >
> > I'm wondering whether it would be a better approach, that is, a general
> > mechanism for implementing various __Bus__-oriented IPCs, such as Binder[2],
> > DBus[1], etc.
> 
> I find myself wondering whether an in-kernel *bus* is a good idea at
> all.  Creating a bus that unprivileged programs are allowed to
> broadcast on (which is kind of the point) opens up big cans of worms.
> Namely: what happens when producers produce data faster than the
> consumers consume it?  Keep in mind that, with a bus, this scales
> pretty badly.  Each producer's sends are multiplied by the number of
> participants.
> 
> At some point soon, I'm planning on playing with Fedora Rawhide with
> kdbus.  Anything's possible (maybe), but I'd be rather surprised if it
> holds up under abuse of the bus.

Just boot Fedora Rawhide with "kdbus=1" on the kernel command line and
you should be set.  If not, please let the kdbus developers know.

thanks,

greg k-h


* Re: Revisit AF_BUS: is it a better way to implement KDBUS?
  2015-07-30 18:12 ` Andy Lutomirski
  2015-07-31  4:01   ` Greg KH
@ 2015-07-31  9:52   ` cee1
  2015-07-31 16:25   ` cee1
  2 siblings, 0 replies; 8+ messages in thread
From: cee1 @ 2015-07-31  9:52 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: LKML, Greg KH, Lennart Poettering, David Herrmann,
	One Thousand Gnomes

2015-07-31 2:12 GMT+08:00 Andy Lutomirski <luto@amacapital.net>:
>
> ISTM kdbus is trying to solve a few problems that really can't be
> solved together: it wants (mostly) reliable delivery, it wants
> globally ordered messages, and it wants broadcasts.  That means that,
> if message N gets broadcast, then, until *every* recipient has
> received message N, message N and all of its successors need to be
> buffered somewhere.  I see how this works (by massive use of tmpfs),
> but I don't see how it's going to work *well*.

For broadcast, how will the kernel behave if:
1. Lots of processes open a netlink socket (to receive uevents) but do not
consume the events, and someone keeps triggering uevents.
2. Lots of processes open inotify to monitor a directory but do not
consume the events, and someone keeps operating on files under the
directory.
...

I guess it may have to drop some data if the producer produces too
fast (or the consumers consume too slowly). What is needed may be a way
for recipients to learn that some broadcast data was lost.
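
Both existing interfaces do tell the reader about loss: netlink reports
ENOBUFS from the receive call, and inotify queues an IN_Q_OVERFLOW
event. A toy per-receiver queue with the same property might look like
the sketch below. The names, the capacity, and the choice to report the
loss before the remaining messages are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define RXQ_CAP 4

struct rxq {
        int      msgs[RXQ_CAP];         /* message ids stand in for payloads */
        size_t   head, count;
        unsigned dropped;               /* broadcasts lost since the last report */
};

/* Sender side: never blocks; a full queue drops the message and counts it. */
void rxq_push(struct rxq *q, int msg)
{
        assert(q);
        if (q->count == RXQ_CAP) {
                q->dropped++;
                return;
        }
        q->msgs[(q->head + q->count) % RXQ_CAP] = msg;
        q->count++;
}

/* Receiver side.
 * Returns -1 once after an overflow (with the drop count in *lost),
 * 1 with the next message in *msg, or 0 when the queue is empty. */
int rxq_pop(struct rxq *q, int *msg, unsigned *lost)
{
        assert(q && msg && lost);
        if (q->dropped) {
                *lost = q->dropped;
                q->dropped = 0;
                return -1;
        }
        if (q->count == 0)
                return 0;
        *msg = q->msgs[q->head];
        q->head = (q->head + 1) % RXQ_CAP;
        q->count--;
        return 1;
}
```

A slow receiver only hurts itself here: the sender is never blocked, and
the receiver learns that it has to resynchronize.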



-- 
Regards,

- cee1


* Re: Revisit AF_BUS: is it a better way to implement KDBUS?
  2015-07-30 13:09 Revisit AF_BUS: is it a better way to implement KDBUS? cee1
  2015-07-30 18:12 ` Andy Lutomirski
@ 2015-07-31 10:09 ` cee1
  1 sibling, 0 replies; 8+ messages in thread
From: cee1 @ 2015-07-31 10:09 UTC (permalink / raw)
  To: LKML; +Cc: Greg KH, Lennart Poettering, David Herrmann, gnomes, luto

In a nutshell, this AF_BUS:

1. For privileged operations, bus endpoints send requests to the bus
master, and the bus master replies with a cmsg (control message, e.g. one
that tells the kernel to assign the specified sockaddr_bus).

2. The bus master allocates sockaddr_bus addresses.

3. There are three kinds of sockaddr_bus:
* The normal ones
* Multicast addresses (last char of sbus_path is '*')
* Kernel notification addr (sbus_addr == NULL)

4. Bloom-filter friendly (i.e. the multicast logic).




-- 
Regards,

- cee1


* Re: Revisit AF_BUS: is it a better way to implement KDBUS?
  2015-07-30 18:12 ` Andy Lutomirski
  2015-07-31  4:01   ` Greg KH
  2015-07-31  9:52   ` cee1
@ 2015-07-31 16:25   ` cee1
  2015-07-31 21:15     ` Andy Lutomirski
  2 siblings, 1 reply; 8+ messages in thread
From: cee1 @ 2015-07-31 16:25 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: LKML, Greg KH, Lennart Poettering, David Herrmann,
	One Thousand Gnomes

2015-07-31 2:12 GMT+08:00 Andy Lutomirski <luto@amacapital.net>:
>
> I find myself wondering whether an in-kernel *bus* is a good idea at
> all.  Creating a bus that unprivileged programs are allowed to
> broadcast on (which is kind of the point) opens up big cans of worms.

This can be solved in this AF_BUS like this:
* Becoming a bus master requires a proper CAP.
* Require a bus endpoint to join multicast address "maddr1" first if
it wants to send to multicast address "maddr2".

The bus endpoint sends a request to join maddr1, and the bus
master grants it by replying with a cmsg (control message) and setting up
a proper eBPF program.

The next time the bus endpoint sends to maddr2, the kernel will allow this if:
1) (maddr1 & maddr2) == maddr1
And 2) the eBPF program allows it.
 (i.e. the same multicast match logic in this AF_BUS)
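
The two-stage send check can be sketched as follows. The callback is
only a stand-in for the eBPF program, and bus_may_send() and
deny_low_bit() are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the eBPF program: returns nonzero to allow the send. */
typedef int (*bus_policy_fn)(const uint64_t *dst, size_t ncomp);

/* An endpoint that joined `joined` may send to `dst` when every bit of
 * `joined` is set in `dst` and the policy (if any) agrees. */
int bus_may_send(const uint64_t *joined, const uint64_t *dst, size_t ncomp,
                 bus_policy_fn policy)
{
        size_t i;

        assert(joined && dst);
        for (i = 0; i < ncomp; i++)
                if ((joined[i] & dst[i]) != joined[i])
                        return 0;
        return policy ? policy(dst, ncomp) != 0 : 1;
}

/* Example policy: refuse any destination that has bit 0 set. */
int deny_low_bit(const uint64_t *dst, size_t ncomp)
{
        return ncomp == 0 || (dst[0] & 1) == 0;
}
```

The bit test is cheap and can reject most sends up front; the policy
hook then handles the cases the bit array cannot express.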



-- 
Regards,

- cee1


* Re: Revisit AF_BUS: is it a better way to implement KDBUS?
  2015-07-31 16:25   ` cee1
@ 2015-07-31 21:15     ` Andy Lutomirski
  2015-08-01  2:00       ` cee1
  0 siblings, 1 reply; 8+ messages in thread
From: Andy Lutomirski @ 2015-07-31 21:15 UTC (permalink / raw)
  To: cee1; +Cc: LKML, Greg KH, Lennart Poettering, David Herrmann,
	One Thousand Gnomes

On Fri, Jul 31, 2015 at 9:25 AM, cee1 <fykcee1@gmail.com> wrote:
> 2015-07-31 2:12 GMT+08:00 Andy Lutomirski <luto@amacapital.net>:
>>
>> I find myself wondering whether an in-kernel *bus* is a good idea at
>> all.  Creating a bus that unprivileged programs are allowed to
>> broadcast on (which is kind of the point) opens up big cans of worms.
>
> This can be solved in this AF_BUS like this:
> * Becoming a bus master requires a proper CAP.
> * Require a bus endpoint to join multicast address "maddr1" first if
> it wants to send to multicast address "maddr2".
>
> The bus endpoint sends a request to join maddr1, and the bus
> master grants it by replying with a cmsg (control message) and setting up
> a proper eBPF program.
>
> The next time the bus endpoint sends to maddr2, the kernel will allow this if:
> 1) (maddr1 & maddr2) == maddr1
> And 2) the eBPF program allows it.
>  (i.e. the same multicast match logic in this AF_BUS)
>

I don't understand.

If the endpoint is unprivileged (i.e. random untrusted things can send
multicast), then you have the scaling problem.  If the endpoint is
privileged, then it's much less clear to me that this thing is useful.

--Andy


* Re: Revisit AF_BUS: is it a better way to implement KDBUS?
  2015-07-31 21:15     ` Andy Lutomirski
@ 2015-08-01  2:00       ` cee1
  0 siblings, 0 replies; 8+ messages in thread
From: cee1 @ 2015-08-01  2:00 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: LKML, Greg KH, Lennart Poettering, David Herrmann,
	One Thousand Gnomes

2015-08-01 5:15 GMT+08:00 Andy Lutomirski <luto@amacapital.net>:
> On Fri, Jul 31, 2015 at 9:25 AM, cee1 <fykcee1@gmail.com> wrote:
>> 2015-07-31 2:12 GMT+08:00 Andy Lutomirski <luto@amacapital.net>:
>>>
>>> I find myself wondering whether an in-kernel *bus* is a good idea at
>>> all.  Creating a bus that unprivileged programs are allowed to
>>> broadcast on (which is kind of the point) opens up big cans of worms.
>>
>> This can be solved in this AF_BUS like this:
>> * Becoming a bus master requires a proper CAP.
>> * Require a bus endpoint to join multicast address "maddr1" first if
>> it wants to send to multicast address "maddr2".
>>
>> The bus endpoint sends a request to join maddr1, and the bus
>> master grants it by replying with a cmsg (control message) and setting up
>> a proper eBPF program.
>>
>> The next time the bus endpoint sends to maddr2, the kernel will allow this if:
>> 1) (maddr1 & maddr2) == maddr1
>> And 2) the eBPF program allows it.
>>  (i.e. the same multicast match logic in this AF_BUS)
>>
>
> I don't understand.
>
> If the endpoint is unprivileged (i.e. random untrusted things can send
> multicast), then you have the scaling problem.  If the endpoint is
> privileged, then it's much less clear to me that this thing is useful.

That means an endpoint has to request the ability to send to a
specific multicast address (i.e. join a multicast group), and it's up to
the bus master whether to grant it or not.



-- 
Regards,

- cee1
