From: Vlad Yasevich <vyasevich@gmail.com>
To: David Laight <David.Laight@ACULAB.COM>,
	"'netdev@vger.kernel.org'" <netdev@vger.kernel.org>,
	"'linux-sctp@vger.kernel.org'" <linux-sctp@vger.kernel.org>
Cc: "'davem@davemloft.net'" <davem@davemloft.net>
Subject: Re: [PATCH net-next v2 3/3] net: sctp: Add partial support for MSG_MORE on SCTP
Date: Mon, 14 Jul 2014 15:15:36 -0400
Message-ID: <53C42C58.3050108@gmail.com>
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6D17271E0B@AcuExch.aculab.com>

On 07/14/2014 12:27 PM, David Laight wrote:
> From: Vlad Yasevich
> ...
>>> +	/* Setting MSG_MORE currently has the same effect as enabling Nagle.
>>> +	 * This means that the user can't force bundling of the first two data
>>> +	 * chunks.  It does mean that all the data chunks will be sent
>>> +	 * without an extra timer.
>>> +	 * It is enough to save the last value since any data sent with
>>> +	 * MSG_MORE clear will already have been sent (subject to flow control).
>>> +	 */
>>> +	if (msg->msg_flags & MSG_MORE)
>>> +		sp->tx_delay |= SCTP_F_TX_MSG_MORE;
>>> +	else
>>> +		sp->tx_delay &= ~SCTP_F_TX_MSG_MORE;
>>> +
>>
>> This is ok for 1-1 sockets, but it doesn't really work for 1-many sockets.  If one of
>> the associations uses MSG_MORE while another does not, we'll see some interesting
>> side-effects on the wire.
> 
> They shouldn't cause any grief, and are somewhat unlikely.
> Unless multiple threads/processes are writing data into the same socket
> and are also flipping MSG_MORE (and the socket locking allows the
> send path to run concurrently - I suspect it doesn't).
> 
> AFAICT the tx_delay/Nagle flag is looked at in two code paths:
> 1) After the application tries to send some data.
> 2) When processing a received ack chunk.
> 
> For 1-many sockets I suspect the code that checks tx_delay after a send()
> is executed before a send() from a different thread could change the value.
> And that sends for alternate destinations won't try to clear the tx queue
> for the other association.
> So the send() processing is unlikely to be affected by the MSG_MORE flag
> value for the other association.

But with this patch the MSG_MORE state is not per association.  It is per
socket (sp->tx_delay).  So if you have a process with 2 threads that clears
Nagle (sets SCTP_NODELAY) and then uses MSG_MORE to force bundling when it
has a lot of data queued, you can have the following:
  1: send(MSG_MORE)
  1: send(MSG_MORE)
  2: send()

The send from thread 2 will reset tx_delay for the whole socket.  If the
association used by thread 1 then receives a SACK, it will flush its queue
before it's ready.  So the side-effect is that you don't get the
bundling that you are really after with MSG_MORE usage.
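
To make the sequence above concrete, here is a small user-space toy
model.  It is only a sketch, not the kernel code: the toy_* names, the
byte counts and the Nagle-like send rule are all assumptions made for
illustration.  The only thing taken from the patch is that the flag
lives on the socket rather than on the association.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MTU 1500		/* assumed packet size, illustration only */

struct toy_assoc {
	size_t in_flight;	/* bytes sent but not yet acknowledged */
	size_t queued;		/* bytes held back for bundling */
};

struct toy_sock {
	bool msg_more;		/* one flag per socket, like sp->tx_delay */
};

/* Assumed Nagle-like rule: hold small data only while the flag is set
 * and something is still unacknowledged on this association.
 */
bool toy_may_send(const struct toy_sock *sk, const struct toy_assoc *a)
{
	return !sk->msg_more || a->in_flight == 0 || a->queued >= TOY_MTU;
}

void toy_try_flush(const struct toy_sock *sk, struct toy_assoc *a)
{
	if (a->queued == 0 || !toy_may_send(sk, a))
		return;
	a->in_flight += a->queued;
	a->queued = 0;
}

void toy_send(struct toy_sock *sk, struct toy_assoc *a, size_t len, bool more)
{
	sk->msg_more = more;	/* last sender wins, whatever the association */
	a->queued += len;
	toy_try_flush(sk, a);
}

void toy_sack(struct toy_sock *sk, struct toy_assoc *a, size_t acked)
{
	assert(acked <= a->in_flight);
	a->in_flight -= acked;
	toy_try_flush(sk, a);
}

/* Replays the three sends plus a partial SACK.  Returns the bytes still
 * queued on the first association; 0 means the bundle went out early.
 */
size_t toy_replay_shared_flag(void)
{
	struct toy_sock sk = { .msg_more = false };
	struct toy_assoc a = { .in_flight = 1000 };	/* data on the wire */
	struct toy_assoc b = { 0 };

	toy_send(&sk, &a, 100, true);	/* 1: send(MSG_MORE) */
	toy_send(&sk, &a, 100, true);	/* 1: send(MSG_MORE) */
	toy_send(&sk, &b, 100, false);	/* 2: send() clears the shared flag */
	toy_sack(&sk, &a, 500);		/* partial SACK for thread 1's assoc */
	return a.queued;
}
```

In this model toy_replay_shared_flag() returns 0: the two small chunks
from thread 1 leave on the SACK even though thread 1 never cleared
MSG_MORE itself.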

> 
> The only time there will be sendable data for (2) is if the connection
> were flow-controlled off, or if data were unsent due to MSG_MORE/Nagle
> being set when the last send was processed.
> Most likely the queued data will be sent - either because there is nothing
> outstanding, because there is more than a packet full, or because the last
> send had MSG_MORE clear.
> 
> The expectation is that an application will send some data chunks with
> MSG_MORE set, followed by one with it clear.
> 

Within a single thread, sure.  But if you have multiple associations as above,
you could end up with a scenario where MSG_MORE is almost useless.

> The only scenario I can see that might be unexpected is:
> - a 1-many socket.
> - one destination flow controlled (ie waiting an ack chunk) but
>   with less than 1500 bytes queued.
> - send with MSG_MORE set for a different destination.
> - ack received, queued data not sent.
> 
> But if you are waiting for ack chunks on a 1-many socket you are already
> in deep trouble - since there is only a single socket send buffer.

Not always.  A lot of deployments that use 1-many sockets specifically
change the buffering policy.

> 
> I don't think this is a problem.

No, it is not a _problem_, but it does make MSG_MORE rather useless
in some situations.  Waiting for an ACK across low-latency links
is rare, but in high-latency scenarios where you want to utilize the
bandwidth better with bundling, you may not see the gains you expect.

Since MSG_MORE applies to an association, it should be handled as such, and
a change on one association should not affect the others.
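
As a rough illustration of what per-association handling could look
like, here is a user-space toy model.  It is a sketch under stated
assumptions, not a proposed kernel change: the pa_* names, the msg_more
field on the association, the byte counts and the Nagle-like send rule
are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PA_MTU 1500		/* assumed packet size, illustration only */

struct pa_assoc {
	bool msg_more;		/* hypothetical per-association flag */
	size_t in_flight;	/* bytes sent but not yet acknowledged */
	size_t queued;		/* bytes held back for bundling */
};

/* Assumed Nagle-like rule, evaluated against this association only. */
bool pa_may_send(const struct pa_assoc *a)
{
	return !a->msg_more || a->in_flight == 0 || a->queued >= PA_MTU;
}

void pa_try_flush(struct pa_assoc *a)
{
	if (a->queued == 0 || !pa_may_send(a))
		return;
	a->in_flight += a->queued;
	a->queued = 0;
}

void pa_send(struct pa_assoc *a, size_t len, bool more)
{
	a->msg_more = more;	/* only this association is affected */
	a->queued += len;
	pa_try_flush(a);
}

void pa_sack(struct pa_assoc *a, size_t acked)
{
	assert(acked <= a->in_flight);
	a->in_flight -= acked;
	pa_try_flush(a);
}

/* Two sends with MSG_MORE on one association, a plain send on another,
 * then a partial SACK for the first.  Returns the bytes still queued on
 * the first association.
 */
size_t pa_replay(void)
{
	struct pa_assoc a = { .in_flight = 1000 };	/* data on the wire */
	struct pa_assoc b = { 0 };

	pa_send(&a, 100, true);		/* 1: send(MSG_MORE) */
	pa_send(&a, 100, true);		/* 1: send(MSG_MORE) */
	pa_send(&b, 100, false);	/* 2: send() touches only b */
	pa_sack(&a, 500);		/* partial SACK for a */
	return a.queued;
}
```

Here pa_replay() returns 200: the plain send on the second association
leaves the first one's pending bundle alone, and it stays queued until
that association sends with MSG_MORE clear or its outstanding data is
fully acknowledged.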

-vlad
> 
> 	David
> 
> 
