linux-api.vger.kernel.org archive mirror
From: Christian Seiler <christian@iwakd.de>
To: David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org, dan@mindstab.net, edumazet@google.com,
	hannes@stressinduktion.org, linux-api@vger.kernel.org
Subject: Re: [PATCH] net: add SO_MAX_DGRAM_QLEN for AF_UNIX SOCK_DGRAM sockets
Date: Tue, 03 Mar 2015 10:04:10 +0100
Message-ID: <0b5908020d83bcbaa7f4938e5cb433ea@iwakd.de>
In-Reply-To: <20150302.213807.367185406305242762.davem@davemloft.net>

On 2015-03-03 03:38, David Miller wrote:
>> Rationale: applications may want to control how many datagrams the
>> kernel buffers before senders are blocked. A prominent example would
>> be to create a socket for syslog early at boot but only consume
>> messages once enough of the system has been set up. The default queue
>> length of 11 messages (= 10 + 1) is too low for this kind of
>> application.
>
> I never like arguments that talk about forcing the kernel to do
> excessive buffering for an application.
>
> Queue this stuff in the userspace side, then you can have as many
> messages backlogged as you like _without_ consuming unswappable kernel
> memory.
>
> I'm tossing this, you're going to have to do a much better job
> explaining to me why userspace cannot take upon itself the burden of
> queueing data until it can be sent.

There are certain things that can't be done in userspace:

  - If SO_PASSCRED is active, a userspace process relaying the messages
    can't fake the PID of the original process unless that one is still
    around (sendmsg will return -ESRCH). Also, one needs CAP_SYS_ADMIN
    to do this (and CAP_SETUID/CAP_SETGID to fake uid/gid as well).

  - More importantly, timestamps of messages can't be faked at all. So
    in the example of a socket used for syslog purposes, all the
    timestamps on the messages queued would be wrong.
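
To illustrate the first point, here is a minimal sketch (Python, Linux
only, not part of the patch) of what a receiver with SO_PASSCRED sees.
The kernel attaches the pid/uid/gid of whichever process actually called
send(), so a relay would show up with its own credentials unless it is
allowed to override them. The helper name recv_with_creds is made up for
this example.

```python
import os
import socket
import struct

# Layout of struct ucred on Linux: pid_t pid; uid_t uid; gid_t gid
UCRED = struct.Struct("iII")


def recv_with_creds():
    """Send one datagram over a socketpair; return (data, pid, uid, gid)."""
    rx, tx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        # Ask the kernel to attach SCM_CREDENTIALS to received messages.
        rx.setsockopt(socket.SOL_SOCKET, socket.SO_PASSCRED, 1)
        tx.send(b"hello")
        data, ancdata, _flags, _addr = rx.recvmsg(
            1024, socket.CMSG_SPACE(UCRED.size)
        )
        for level, ctype, cdata in ancdata:
            if level == socket.SOL_SOCKET and ctype == socket.SCM_CREDENTIALS:
                pid, uid, gid = UCRED.unpack(cdata[:UCRED.size])
                return data, pid, uid, gid
        # No credentials were attached (not expected with SO_PASSCRED set).
        return data, None, None, None
    finally:
        rx.close()
        tx.close()


if __name__ == "__main__":
    # The reported pid is that of the sending process, here our own.
    print(recv_with_creds(), os.getpid())
```

Since sender and receiver are the same process here, the reported pid
should simply be the caller's own pid.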

Also note that with a stream socket I can buffer up to 256 KiB of data
in the kernel by default. I ran some test measurements on x86_64:
including the overhead of internal bookkeeping structures, I can fit up
to 555 datagrams in there if each is at most 192 bytes long, at least
333 datagrams if each is at most 704 bytes long, and at least 185
datagrams if each is at most 1728 bytes long. Compared to 11, that is
at least an order of magnitude of difference.
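
For reference, a rough sketch of how the datagram queue limit can be
observed from userspace (Python, Linux only; this is not the code used
for the measurements above, and the helper names are made up). It sends
to a bound receiver that never reads and counts how many datagrams are
accepted before a non-blocking send() reports EAGAIN. It also reads the
system-wide net.unix.max_dgram_qlen sysctl, assuming /proc is mounted.

```python
import os
import socket
import tempfile


def read_max_dgram_qlen(path="/proc/sys/net/unix/max_dgram_qlen"):
    """Return the system-wide sysctl value, or None if it is unavailable."""
    try:
        with open(path) as f:
            return int(f.read())
    except (OSError, ValueError):
        return None


def count_queued_datagrams(payload_size, limit=100000):
    """Count datagrams a non-blocking sender can queue to an unread receiver."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "rx.sock")
        rx = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        tx = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        try:
            rx.bind(path)          # the receiver never calls recv()
            tx.connect(path)
            tx.setblocking(False)  # report EAGAIN instead of blocking
            payload = b"x" * payload_size
            sent = 0
            while sent < limit:    # hard cap so the loop always terminates
                try:
                    tx.send(payload)
                except BlockingIOError:
                    break
                sent += 1
            return sent
        finally:
            tx.close()
            rx.close()


if __name__ == "__main__":
    print("max_dgram_qlen:", read_max_dgram_qlen())
    for size in (192, 704, 1728):
        print(size, "bytes:", count_queued_datagrams(size))
```

With the default sysctl value of 10 I would expect the count to stop at
11 regardless of payload size, but the result depends on the kernel and
on local configuration.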

I'm not asking to be able to use a lot of memory, I'm just asking to be
able to raise an artificial limit that doesn't apply to other types of
sockets.

Finally, increasing the queue length is not the only use case; some
applications might want to decrease it. For example, if the value is
set to zero, only a single datagram can be queued at a time (and every
further send blocks or fails with -EAGAIN), which might be useful if
the application takes a long time to process each datagram.

Christian
