From: ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman)
Subject: Re: [PATCH 01/13] kdbus: add documentation
Date: Wed, 04 Feb 2015 00:30:48 -0600
Message-ID: <874mr2b33b.fsf@x220.int.ebiederm.org>
References: <54C9F525.4040703@zonque.org> <54CA1CA2.9060005@zonque.org> <54CF44B9.8000005@zonque.org> <54D09E6B.2020903@zonque.org> <87siemgzoo.fsf@x220.int.ebiederm.org> <20150204031437.GB32207@kroah.com>
In-Reply-To: <20150204031437.GB32207-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org> (Greg Kroah-Hartman's message of "Tue, 3 Feb 2015 19:14:37 -0800")
To: Greg Kroah-Hartman
Cc: Andy Lutomirski, Daniel Mack, Arnd Bergmann, Ted Ts'o, Michael Kerrisk, Linux API, One Thousand Gnomes, Austin S Hemmelgarn, Tom Gundersen, linux-kernel, David Herrmann, Djalal Harouni, Johannes Stezenbach, Christoph Hellwig
List-Id: linux-api@vger.kernel.org

Greg Kroah-Hartman writes:

> On Tue, Feb 03, 2015 at 08:47:51PM -0600, Eric W. Biederman wrote:
>> Andy Lutomirski writes:
>>
>> > On Tue, Feb 3, 2015 at 2:09 AM, Daniel Mack wrote:
>> >> Hi Andy,
>> >>
>> >> On 02/02/2015 09:12 PM, Andy Lutomirski wrote:
>> >>> On Feb 2, 2015 1:34 AM, "Daniel Mack" wrote:
>> >>
>> >>>> That's right, but again - if an application wants to gather this kind of
>> >>>> information about tasks it interacts with, it can do so today by looking
>> >>>> at /proc or similar sources. Desktop machines do exactly that already,
>> >>>> and the kernel code executed in such cases very much resembles that in
>> >>>> metadata.c, and is certainly not cheaper. kdbus just makes such
>> >>>> information more accessible when requested.
>> >>>> Which information is
>> >>>> collected is defined by bit-masks on both the sender and the receiver
>> >>>> connection, and most applications will effectively only use a very
>> >>>> limited set by default if they go through one of the more high-level
>> >>>> libraries.
>> >>>
>> >>> I should rephrase a bit. Kdbus doesn't require use of send-time
>> >>> metadata. It does, however, strongly encourage it, and it sounds like
>> >>
>> >> On the kernel level, kdbus just *offers* that, just like sockets offer
>> >> SO_PASSCRED. On the userland level, kdbus helps applications get that
>> >> information race-free, easier and faster than they would otherwise.
>> >>
>> >>> systemd and other major users will use send-time metadata. Once that
>> >>> happens, it's ABI (even if it's purely in userspace), and changing it
>> >>> is asking for security holes to pop up. So you'll be mostly stuck
>> >>> with it.
>> >>
>> >> We know we can't break the ABI. At most, we could deprecate item types
>> >> and introduce new ones, but we want to avoid that by all means of
>> >> course. However, I fail to see how that is related to send time
>> >> metadata, or even to kdbus in general, as all ABIs have to be kept stable.
>> >
>> > I should have said it differently. ABI is the wrong term -- it's more
>> > of a protocol issue.
>> >
>> > It looks like, with the current code, the kernel will provide
>> > (optional) send-time metadata, and the sd-bus library will use it.
>> > The result will be that the communication protocol between clients and
>> > udev, systemd, systemd-logind, g-s-d, etc, will likely involve
>> > send-time metadata. This may end up being a bottleneck.
>>
>> A quick note on a couple of things I have seen in this conversation.
>>
>> - The reason for kdbus is performance.
>
> No, that's not the only reason for kdbus, don't focus only on this. I
> set out a long list of things for why we created kdbus, speed was only
> one of the things.
> Security is also one, and the ability to gather
> these attributes in an atomic and secure way is very important as
> userspace wants this.

Perhaps I should have said the predominant reason. Certainly that
seems to be most of what I have seen talked about. Regardless, it is
worth looking at performance in the design and removing any
substantial obstacle to making things go fast.

Further, I had this conversation in an earlier round of the review and
I was told that in fact existing dbus applications do not want or need
these attributes. I think I heard journald wants them for pretty
printing things.

If security is your concern I really think per-message attributes
collected and sent when a message is sent are a bad idea. It has been
a nasty anti-pattern in the kernel code. Copying lots and lots of
metadata from a task and sending it to someone else has significant
performance, maintenance, and security impacts. Code written in that
pattern is complex, hard to analyze, and hard to think about.

Consider debugging why a message does not get the expected treatment
from your suid application because someone changed the euid over that
particular call and had not thought about its consequences. Frankly I
have been there and done that and it is a mess.

So no, I do not think breaking encapsulation and having weird side
effects affecting your new primitive will have any security benefits
whatsoever. It will just result in brittle, complex code.

If you want to avoid the races, causing sends through a file
descriptor to fail when they don't have the expected attributes (my
constructive suggestion earlier) is a very different thing from a
performance and maintenance standpoint. That does not increase the
code complexity nearly as much in the implementation or in use, and
unexpected failures happen right away.

>> - pipes rather than unix domain sockets are likely the standard to meet.
>> If you can't equal unix domain sockets for simple things you are
>> likely leaving a lot of stops in.
>> Last I looked pipes in general were
>> notably faster than unix domain sockets.
>>
>> The performance numbers I saw posted up-thread were horrible. I have
>> seen faster numbers across a network of machines. If your ping-pong
>> latency isn't measured in nano-seconds you are probably doing
>> something wrong.
>
> It all depends on what you are passing on that "ping-pong", a real
> D-Bus connection has real data and meta data that has to be sent.
> Trying to make a fake benchmark number isn't going to show anything.

All that I was intending to convey is that the numbers I have seen
have been orders of magnitude slower than I would expect. And 10x to
100x slower than the code should be is a reason to ask why.

In my experience being efficient with small messages is important
because (a) they are the hardest to make go fast and (b) they are
surprisingly common. Remote X application start-up times are very slow
because of these.

People have a distressing habit of writing applications that send a
small message and synchronously wait for the reply. Over time these
small ipc calls build up and you are limited by how fast they will go.

>> - syscalls remove overhead. So since performance is kdbus's reason for existence
>> let's remove some ridiculous stops, and get a fast path into the kernel.
>
> Again, not the only reason, see my first post in this thread for
> details.

But performance is important, and performance is a good reason to use
system calls. Security is another reason to have real system calls, as
there is less going on (compared to an ioctl multiplexer) so the code
is easier to audit.
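
The small-message ping-pong comparison between pipes and unix domain
sockets discussed above can be sketched roughly as follows. This is
only an illustrative micro-benchmark, not a measurement from this
thread: the function names (pipe_pingpong, socket_pingpong), the
one-byte payload, and the round count are all assumptions, and Python
interpreter overhead will inflate the absolute numbers, so only the
relative difference between the two transports is meaningful.

```python
import os
import socket
import time

MSG = b"x"  # one-byte payload: exercises per-message overhead, not bandwidth


def _measure(parent_rd, parent_wr, child_rd, child_wr, rounds):
    """Fork an echo child; return mean round-trip time in nanoseconds."""
    pid = os.fork()
    if pid == 0:
        # Child: echo each byte back exactly `rounds` times, then exit.
        try:
            for _ in range(rounds):
                os.write(child_wr, os.read(child_rd, 1))
        finally:
            os._exit(0)
    start = time.perf_counter_ns()
    for _ in range(rounds):
        os.write(parent_wr, MSG)   # "ping"
        os.read(parent_rd, 1)      # block until the "pong" arrives
    elapsed = time.perf_counter_ns() - start
    os.waitpid(pid, 0)
    return elapsed / rounds


def pipe_pingpong(rounds=10000):
    """Ping-pong over a pair of pipes (one per direction)."""
    to_child_r, to_child_w = os.pipe()
    to_parent_r, to_parent_w = os.pipe()
    try:
        return _measure(to_parent_r, to_child_w,
                        to_child_r, to_parent_w, rounds)
    finally:
        for fd in (to_child_r, to_child_w, to_parent_r, to_parent_w):
            os.close(fd)


def socket_pingpong(rounds=10000):
    """Ping-pong over a connected AF_UNIX stream socket pair."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with a, b:
        return _measure(a.fileno(), a.fileno(),
                        b.fileno(), b.fileno(), rounds)


if __name__ == "__main__":
    print("pipe        : %.0f ns per round trip" % pipe_pingpong())
    print("unix socket : %.0f ns per round trip" % socket_pingpong())
```

Each round trip is one synchronous send followed by a blocking wait,
which is the application pattern described above, so the per-message
cost is what adds up.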
Eric