All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>,
	Network Development <netdev@vger.kernel.org>,
	Willem de Bruijn <willemb@google.com>
Subject: Re: [PATCH RFC net-next 00/11] udp gso
Date: Wed, 18 Apr 2018 08:31:03 -0400	[thread overview]
Message-ID: <20180418123103.GC19633@oracle.com> (raw)
In-Reply-To: <CAF=yD-KhNrcZBQizK+RtFq4Lx-ExntdLR69qz_2beRo8d7XOTA@mail.gmail.com>


I went through the patch set and the code looks fine- it extends existing
infra for TCP/GSO to UDP.

One thing that was not clear to me about the API: shouldn't UDP_SEGMENT
just be automatically determined in the stack from the pmtu? Whats
the motivation for the socket option for this? also AIUI this can be
either a per-socket or a per-packet option?

However, I share Sridhar's concerns about the very fundamental change
to UDP message boundary semantics here.  There is actually no such thing
as a "segment" in udp, so in general this feature makes me a little
uneasy.  Well behaved udp applications should already be sending mtu
sized datagrams. And the not-so-well-behaved ones are probably relying
on IP fragmentation/reassembly to take care of datagram boundary semantics
for them?

As Sridhar points out, the feature is not really "negotiated" - one side
unilaterally sets the option. If the receiver is a classic/POSIX UDP
implementation, it will have no way of knowing that message boundaries
have been re-adjusted at the sender.  

One thought to recover from this: use the infra being proposed in
  https://tools.ietf.org/html/draft-touch-tsvwg-udp-options-09
to include a new UDP TLV option that tracks datagram# (similar to IP ID)
to help the receiver reassemble the UDP datagram and pass it up with
the POSIX-conformant UDP message boundary. I realize that this is also
not a perfect solution: as you point out, there are risks from
packet re-ordering/drops- you may well end up just reinventing IP
frag/re-assembly when you are done (with just the slight improvement
that each "fragment" has a full UDP header, so it has a better shot
at ECMP and RSS).

--Sowmini

  reply	other threads:[~2018-04-18 12:31 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-17 20:00 [PATCH RFC net-next 00/11] udp gso Willem de Bruijn
2018-04-17 20:00 ` [PATCH RFC net-next 01/11] udp: expose inet cork to udp Willem de Bruijn
2018-04-17 20:00 ` [PATCH RFC net-next 02/11] udp: add gso Willem de Bruijn
2018-04-17 20:00 ` [PATCH RFC net-next 03/11] udp: better wmem accounting on gso Willem de Bruijn
2018-04-17 20:00 ` [PATCH RFC net-next 04/11] udp: paged allocation with gso Willem de Bruijn
2018-04-17 20:00 ` [PATCH RFC net-next 05/11] udp: add gso segment cmsg Willem de Bruijn
2018-04-17 20:00 ` [PATCH RFC net-next 06/11] udp: add gso support to virtual devices Willem de Bruijn
2018-04-18  0:43   ` Dimitris Michailidis
2018-04-18  3:27     ` Willem de Bruijn
2018-04-17 20:00 ` [PATCH RFC net-next 07/11] udp: zerocopy Willem de Bruijn
2018-04-17 20:00 ` [PATCH RFC net-next 08/11] selftests: udp gso Willem de Bruijn
2018-04-17 20:00 ` [PATCH RFC net-next 09/11] selftests: udp gso with connected sockets Willem de Bruijn
2018-04-17 20:15 ` [PATCH RFC net-next 00/11] udp gso Sowmini Varadhan
2018-04-17 20:23   ` Willem de Bruijn
2018-04-17 20:48     ` Sowmini Varadhan
2018-04-17 21:07       ` Willem de Bruijn
2018-04-18  2:25         ` Samudrala, Sridhar
2018-04-18  3:33           ` Willem de Bruijn
2018-04-18 12:31             ` Sowmini Varadhan [this message]
2018-04-18 13:35               ` Eric Dumazet
2018-04-18 13:47                 ` Sowmini Varadhan
2018-04-18 13:51                   ` Willem de Bruijn
2018-04-18 15:08                     ` Samudrala, Sridhar
2018-04-18 17:40                     ` David Miller
2018-04-18 17:34                   ` David Miller
2018-04-18 13:59               ` Willem de Bruijn
2018-04-18 14:28                 ` Willem de Bruijn
2018-04-18 17:28               ` David Miller
2018-04-18 18:12                 ` Alexander Duyck
2018-04-18 18:22                   ` Willem de Bruijn
2018-04-20 17:38                     ` Alexander Duyck
2018-04-20 21:58                       ` Willem de Bruijn
2018-04-21  2:08                         ` Alexander Duyck
2018-04-18 19:33                   ` David Miller
2018-04-20 18:27                   ` Tushar Dave
2018-04-20 20:08                     ` Alexander Duyck
2018-04-21  3:11                       ` Tushar Dave
2018-08-31  9:09         ` Paolo Abeni
2018-08-31 10:09           ` Eric Dumazet
2018-08-31 13:08           ` Willem de Bruijn
2018-08-31 13:44             ` Paolo Abeni
2018-08-31 15:11               ` Willem de Bruijn
2018-09-03  8:02             ` Steffen Klassert
2018-09-03 11:45               ` Sowmini Varadhan
2018-04-18 11:17 ` Paolo Abeni
2018-04-18 13:49   ` Willem de Bruijn
2018-05-24  0:02     ` Marcelo Ricardo Leitner
2018-05-24  1:15       ` Willem de Bruijn
2018-04-18 17:24   ` David Miller
2018-04-18 17:50 ` David Miller
2018-04-18 18:12   ` Willem de Bruijn
2018-04-19 17:45     ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180418123103.GC19633@oracle.com \
    --to=sowmini.varadhan@oracle.com \
    --cc=netdev@vger.kernel.org \
    --cc=sridhar.samudrala@intel.com \
    --cc=willemb@google.com \
    --cc=willemdebruijn.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.