From: Hannes Frederic Sowa <hannes@stressinduktion.org>
To: Tom Herbert <tom@herbertland.com>
Cc: "David S. Miller" <davem@davemloft.net>,
Linux Kernel Network Developers <netdev@vger.kernel.org>,
Kernel Team <kernel-team@fb.com>,
davewatson@fb.com,
Alexei Starovoitov <alexei.starovoitov@gmail.com>
Subject: Re: [PATCH net-next 0/6] kcm: Kernel Connection Multiplexor (KCM)
Date: Mon, 23 Nov 2015 20:35:37 +0100
Message-ID: <1448307337.1628829.447962729.60B03DCC@webmail.messagingengine.com>
In-Reply-To: <CALx6S34TrGgmH3eZKPLw_xKYfX2dm+fZq6r7ZBe2DBPudBiSzA@mail.gmail.com>
Hello Tom,
On Mon, Nov 23, 2015, at 18:33, Tom Herbert wrote:
> > For me this still looks a little bit like messages could be delimited by
> > TCP PSH flag, where we might need to have some more fine grained control
> > over and besides that just adding better fanout semantics to TCP, no?
> >
> The TCP PSH flag is not defined for message delineation (neither is
> urgent pointer). We can't change that (many people have tried to add
> message semantics to TCP protocol but have always failed miserably).
> The fact is TCP is always going to be a stream based protocol. Period!
> :-) It is up to the application to interpret the stream and extract
> messages. Even if we could somehow apply the PSH bit to "help" in
> message delineation, we would need to change senders to use the PSH
> bit in that fashion for it to be of benefit to receivers.
I see TCP PSH flags as an optimization, and I agree it is hard to make
proper use of them on the Internet. But in a datacenter where
everything is under control, couldn't this be done?
Anyway, decoding arbitrary messages in the kernel, possibly with huge
lengths, could result in starvation problems if you adhere to the socket
receive buffer limits at all times. So I wonder whether a forward
progress guarantee can be achieved here independent of the eBPF program.
I really see this becoming a problem as soon as people use it for
privilege separation. Will there be central error handling?
Also, would it make sense to add a TCP option here instead of using the
TCP PSH flag? Not sure yet...
> > Do kcm sockets still allow streaming unlimited amounts of data? E.g. if
> > you want to pass a data stream attached to a rpc message? I think not
> > allowing streaming is a major shortcoming then (even though this will
> > induce head of line blocking).
> >
> RPC messages can be of arbitrary size and with SOCK_SEQPACKET,
> messages can be sent or received in multiple calls. No HOL blocking
> since messages are constructed on KCM sockets before starting to send
> on TCP sockets. Socket buffer limits are respected. KCM does not
> enforce a maximum message size; if an application does have a maximum
> then that can be checked in the BPF code.
I was referring to HOL blocking on the receiver's end, the same as in
user space TCP, where one data stream (or huge message) keeps the byte
stream busy so that no other datagrams in it can be delivered. For low
latency I would actually use multiple streams or switch to UDP with
user space based retry.
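To illustrate the receiver-side effect, here is a minimal Python sketch.
It assumes a hypothetical 4-byte big-endian length prefix for framing
(nothing KCM or this thread defines); frame(), extract_messages() and
max_len are illustrative names only. A small message queued behind a
large one on the same byte stream cannot be delivered until every byte
of the large one has arrived. The max_len check stands in for the kind
of maximum-size check an application could apply.

```python
import struct

# Assumption for this sketch: each message is preceded by a 4-byte
# big-endian length. This is a hypothetical framing, not a KCM format.
HEADER = struct.Struct("!I")


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length."""
    return HEADER.pack(len(payload)) + payload


def extract_messages(stream: bytes, max_len: int = 1 << 20):
    """Return (complete_messages, bytes_consumed) for an in-order stream.

    Parsing stops at the first incomplete message, so everything queued
    behind it stays undelivered (head-of-line blocking).
    """
    messages = []
    offset = 0
    while len(stream) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(stream, offset)
        if length > max_len:
            # Application-defined limit, checked while parsing.
            raise ValueError("message of %d bytes exceeds limit" % length)
        end = offset + HEADER.size + length
        if end > len(stream):
            break  # head message still incomplete; later data must wait
        messages.append(stream[offset + HEADER.size:end])
        offset = end
    return messages, offset


if __name__ == "__main__":
    stream = frame(b"x" * 1000) + frame(b"ping")
    # Only half of the large message has arrived: nothing is deliverable.
    print(extract_messages(stream[:500]))
    # Once the large message is complete, both messages come out together.
    msgs, used = extract_messages(stream)
    print([len(m) for m in msgs], used)
```

With multiple streams, the small message would travel on its own
connection and not wait behind the large one.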
I think this problem comes down more and more to improving the epoll
interface with better CPU-steered wake-up capabilities to make it more
agnostic. Some programs, e.g., also want to be woken up once an HTTP
header has been received completely. SO_RCVLOWAT was made for this, and
FreeBSD has accept_filter for this kind of thing.
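As a rough sketch of the SO_RCVLOWAT side of this, the following Python
snippet sets the receive low-water mark on a socket and reads it back.
set_rcvlowat() is an illustrative helper name, and the socketpair is
used only so the example is self-contained. Note that the option takes
a fixed byte count, so it can only approximate "header received
completely" for a variable-length header.

```python
import socket


def set_rcvlowat(sock: socket.socket, nbytes: int) -> int:
    """Set the receive low-water mark and return the value read back."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVLOWAT, nbytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVLOWAT)


if __name__ == "__main__":
    # A local socket pair stands in for a real connection in this sketch.
    a, b = socket.socketpair()
    try:
        # Ask for blocking receives on `a` to wait for at least 16 bytes.
        print(set_rcvlowat(a, 16))
    finally:
        a.close()
        b.close()
```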
You want to use this in Thrift, which is mainly Java based, and reuse
the existing NIO infrastructure?
> >> Future support:
> >>
> >> - Integration with TLS (TLS-in-kernel is a separate initiative).
> >
> > This is interesting:
> >
> > Regarding the last week's discussion about better OOB support in TCP
> > e.g. for SOCKET_DESTROY, do you already have a plan to handle TLS alerts
> > and do CHANGE_CIPHER on the socket synchronously?
> >
> Dave should be posting the basic TLS-in-the-kernel patches shortly,
> those will be a better context for discussion.
Thanks, I am looking at them right now. :)
Thanks,
Hannes