From: Christoph Paasch <cpaasch at apple.com>
To: mptcp at lists.01.org
Subject: Re: [MPTCP] Status
Date: Mon, 02 Oct 2017 16:28:32 -0700
Message-ID: <20171002232832.GI31629@Chimay.local>
In-Reply-To: alpine.OSX.2.21.1710021423010.10266@mjmartin-mac01.sea.intel.com

Hello Mat,

On 02/10/17 - 16:00:56, Mat Martineau wrote:
> On Mon, 2 Oct 2017, Christoph Paasch wrote:
> > just wanted to ring in here.
> > 
> > 
> > I started working on porting TCP_MD5 to use the TCP extra-option framework
> > from Mat's branch.
> > 
> > It allows us to nicely clean up the TCP_MD5-code out of the TCP data-path.
> > There are some changes/extensions I needed to do to Mat's framework. But
> > nothing. I will post patches in the coming days here on the list.
> > 
> 
> I'm glad the framework is working well for the MD5 cleanup. I had set that
> aside for a while, but one thing I remember wanting to fix was the option
> writing callback. It seemed like a per-socket callback (especially for
> connections in the 'established' state) would be an improvement, so every
> socket doesn't have to traverse the entire list of callbacks.

+1 on the per-socket callback. I have this on my list of things that would
be good to add.

Basically, I think we should move tcp_option_list to struct tcp_sock.
There, we would dynamically allocate a pointer to the callbacks and also
an additional memory region.

That way, we have a generic way to store the state needed for the extra TCP
option (md5sig_info in TCP_MD5's case).
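
To make the per-socket idea a bit more concrete, here is a rough userspace
sketch. All names in it (sock_model, extra_opt, extra_opt_register, ...) are
hypothetical and are not taken from the kernel or from Mat's branch; the
demo callback and its option kind 253 are placeholders only. It just shows
a list hanging off the socket, where each entry carries a pointer to its
callbacks plus a private memory region for its state.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace model, not kernel code. */
struct sock_model;

struct extra_opt_ops {
	/* Write the option into buf; return the number of bytes written. */
	size_t (*write)(struct sock_model *sk, void *state,
			uint8_t *buf, size_t room);
};

struct extra_opt {
	struct extra_opt *next;
	const struct extra_opt_ops *ops;	/* pointer to the callbacks */
	size_t state_len;
	unsigned char state[];			/* per-option private memory */
};

struct sock_model {
	struct extra_opt *opt_list;		/* per-socket, not global */
};

/* Allocate one entry (callbacks + state) and link it to this socket. */
static struct extra_opt *extra_opt_register(struct sock_model *sk,
					    const struct extra_opt_ops *ops,
					    size_t state_len)
{
	struct extra_opt *opt = calloc(1, sizeof(*opt) + state_len);

	if (!opt)
		return NULL;
	opt->ops = ops;
	opt->state_len = state_len;
	opt->next = sk->opt_list;
	sk->opt_list = opt;
	return opt;
}

/* Only the options registered on this socket are visited. */
static size_t extra_opts_write(struct sock_model *sk, uint8_t *buf,
			       size_t room)
{
	size_t used = 0;

	for (struct extra_opt *opt = sk->opt_list; opt; opt = opt->next)
		used += opt->ops->write(sk, opt->state, buf + used,
					room - used);
	return used;
}

static void extra_opts_free(struct sock_model *sk)
{
	while (sk->opt_list) {
		struct extra_opt *opt = sk->opt_list;

		sk->opt_list = opt->next;
		free(opt);
	}
}

/* Placeholder callback: kind 253, length 4, then two bytes of state. */
static size_t demo_write(struct sock_model *sk, void *state,
			 uint8_t *buf, size_t room)
{
	(void)sk;
	if (room < 4)
		return 0;
	buf[0] = 253;
	buf[1] = 4;
	memcpy(buf + 2, state, 2);
	return 4;
}

static const struct extra_opt_ops demo_ops = { .write = demo_write };
```

In this model, TCP_MD5 would be one such entry, with its md5sig_info living
in the state region instead of in a dedicated field.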


I have some more comments, but it will be clearer with the TCP_MD5-code at
hand.


> > I keep on moving mptcp_trunk upwards to track upstream Linux. Currently I'm
> > stuck at v4.9 (there is a nasty bug that popped up with the merge and I
> > wasn't able to fix that yet).
> > 
> > 
> > The merge with v4.9 also forced me to bump skb->cb to 80 bytes... :/
> > I have been thinking back and forth on how we could handle this. The best
> > way I see at the moment is to create a scratch-area at the end of the skb's
> > data (like skb_shared_info). I think it also would quite nicely fit with a
> > KCM/ULP-style architecture where we could have a BPF-program that does the
> > scheduling.
> > I haven't dived very deep into the skb->cb problem yet.
> > 
> 
> I don't think we're the first ones to want more control block bytes, seems
> like finding a solution would help outside of MPTCP too. I've looked at
> skb_tailroom_reserve a little bit, and also given some preliminary thought
> to stashing header info in skb_shared_info->frags (maybe by creating "header
> fragments").



Yes, I also have to look a bit more at tailroom_reserve.

Can you elaborate a bit more on the "header fragments" ?



At one point, I had a more or less crazy idea of storing it inside the
payload.

Here was my train of thought:

Basically, the big problem with MPTCP (ignoring implementations) is that the
IETF decided to put the DSS-option in the TCP option-space. Thus, this
inherently links a TCP-option with the payload of the packet (due to the
DSS-mapping).
Such linking is bad, for TSO, LRO/GRO, middleboxes splitting segments,...

It would have been much better if the IETF had placed the DSS-option
(not the DATA_ACK) in the payload and left the TCP-options just for truly
signalling options (DATA_ACK, ADD_ADDR, REMOVE_ADDR, MP_PRIO,...).

So, I was thinking that we could fake this and the MPTCP-level would do a
regular tcp_sendmsg on the subsockets with the DSS-mapping as part of the
payload. It would also just pass a flag down to tcp_sendmsg, that indicates
that the payload contains a DSS-mapping. This flag would then be stored in
the relevant skb (just one bit - I think we have that space).

Then, later in tcp_options_write, we just need to check on that flag and
extract the DSS-mapping and write it into the TCP header space (and adjust
skb->data,...).
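
As a rough userspace sketch of that extraction step: the names here
(pkt_model, pull_dss_into_options, DSS_MAP_LEN) are hypothetical, and both
the fixed 16-byte size and the assumption that the mapping sits at the very
front of the payload are mine, purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DSS_MAP_LEN 16	/* hypothetical size of the prepended mapping */

struct pkt_model {
	uint8_t *data;	/* start of payload, playing the role of skb->data */
	size_t len;	/* payload length, playing the role of skb->len */
	bool has_dss;	/* the one-bit flag passed down by the MPTCP level */
};

/*
 * Move a prepended DSS-mapping out of the payload and into the option
 * space, then adjust data/len. Returns the number of option bytes written.
 */
static size_t pull_dss_into_options(struct pkt_model *p, uint8_t *opts,
				    size_t room)
{
	if (!p->has_dss || p->len < DSS_MAP_LEN || room < DSS_MAP_LEN)
		return 0;

	memcpy(opts, p->data, DSS_MAP_LEN);
	p->data += DSS_MAP_LEN;
	p->len -= DSS_MAP_LEN;
	p->has_dss = false;
	return DSS_MAP_LEN;
}
```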


In principle, I think this would have been very clean.


But it doesn't work, because this DSS-mapping will now be accounted in TCP's
sequence space (i.e., snd_nxt,...) but in the end it won't be sent out as
payload. So, that would screw up TCP completely. Basically, skb->len will
include the DSS-mapping in the payload but it won't be sent out as part of
the payload but as part of the TCP option-space.
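
The mismatch can be shown with a tiny model. This is only a sketch with
hypothetical names (seq_model, send_segment, seq_drift); it assumes the
sender advances its sequence number by the full skb->len while the wire
only carries the payload without the mapping.

```c
#include <stdint.h>

struct seq_model {
	uint32_t snd_nxt;	/* what the sender believes it has sent */
	uint32_t wire_seq;	/* sequence space really consumed on the wire */
};

/* Account for one segment whose skb_len includes dss_len mapping bytes. */
static void send_segment(struct seq_model *s, uint32_t skb_len,
			 uint32_t dss_len)
{
	s->snd_nxt += skb_len;			/* mapping counted as data */
	s->wire_seq += skb_len - dss_len;	/* mapping left in the options */
}

/* How far the sender's view has drifted from what the peer can ACK. */
static uint32_t seq_drift(const struct seq_model *s)
{
	return s->snd_nxt - s->wire_seq;
}
```

Under these assumptions the drift grows by the mapping size with every
segment, so the peer's ACKs never line up with snd_nxt.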

So, because of this I gave up on this avenue.

If you think this could work in another way or something like that, let me
know :)



Christoph



> 
> > 
> > Anyways, at the moment I am focusing on fixing mptcp_trunk's merge with v4.9
> > and the TCP_MD5 cleanup (which I think would be of interest for netdev).
> 
> I really appreciate the update, thank you!
> 
> 
> --
> Mat Martineau
> Intel OTC
