From: "Michael T Kerrisk" <mtk-lists@gmx.net>
To: netdev@oss.sgi.com
Cc: michael.kerrisk@gmx.net
Subject: TCP_CORK 200ms maximum cork time -- expected behaviour?
Date: Fri, 20 Aug 2004 16:00:33 +0200 (MEST)
Message-ID: <18686.1093010433@www70.gmx.net>

Gidday,

I tried posting this several weeks back, but got no response.  
I'll try again, this time with two test programs and a sample 
run (both below) that demonstrate what I'm seeing.

The TCP_CORK socket option allows us to perform multiple 
write()s (or send()s or sendfile()s) while delaying the 
transmission of an outgoing TCP segment until the option is 
disabled (or a segment MSS is filled or the socket is closed).
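
A minimal sketch of that cork-then-flush pattern, assuming Linux
and a connected TCP socket.  The names setCork(), sendCorked() and
loopbackPair() are hypothetical helpers for illustration only; they
are not part of the two test programs further down.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Set (on = 1) or clear (on = 0) TCP_CORK on 'fd'.
   Returns 0 on success, -1 on error. */
static int
setCork(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Hypothetical helper: queue two pieces of data while corked, then
   clear the option so that the queued data is pushed out.
   Returns 0 on success, -1 on error or short write. */
static int
sendCorked(int fd, const char *hdr, const char *body)
{
    if (setCork(fd, 1) == -1)
        return -1;
    if (write(fd, hdr, strlen(hdr)) != (ssize_t) strlen(hdr))
        return -1;
    if (write(fd, body, strlen(body)) != (ssize_t) strlen(body))
        return -1;
    return setCork(fd, 0);
}

/* Hypothetical helper: create a connected TCP pair over the loopback
   interface so the sketch can be tried in a single process.
   Returns 0 on success, -1 on error. */
static int
loopbackPair(int *cfd, int *sfd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd;

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;          /* let the kernel pick a free port */

    if (bind(lfd, (struct sockaddr *) &addr, sizeof(addr)) == -1 ||
            listen(lfd, 1) == -1 ||
            getsockname(lfd, (struct sockaddr *) &addr, &len) == -1) {
        close(lfd);
        return -1;
    }

    *cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (*cfd == -1) {
        close(lfd);
        return -1;
    }
    if (connect(*cfd, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
        close(*cfd);
        close(lfd);
        return -1;
    }

    *sfd = accept(lfd, NULL, NULL);
    close(lfd);
    if (*sfd == -1) {
        close(*cfd);
        return -1;
    }
    return 0;
}
```

After loopbackPair(&c, &s) succeeds, sendCorked(c, "HEAD:", "body")
queues both strings on c and releases them when the cork is cleared;
reading from s then yields the nine bytes in order.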

All is fine and good, but there's one point I'm puzzled 
about: buffered data is still transmitted after a 200 
millisecond delay (counted from the time that the first 
corked byte was written), **even if TCP_CORK is still set**.  
So, I'm wondering:

1. Is this intended behaviour, or simply an 
   outgrowth of the combined implementations of 
   TCP_CORK and TCP_NAGLE_OFF?

2. If it's intended behaviour, what is the 
   rationale for the ceiling time on corking?

I first observed this behaviour quite some time back, but 
I've verified that it is still current (2.4.26 and 2.6.8.1 
kernels).  (In passing: of course, similar behaviour occurs 
with MSG_MORE on TCP sockets.)
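
For comparison, a minimal sketch of the per-call MSG_MORE variant,
assuming Linux.  sendTwoParts() is a hypothetical helper name, not
something taken from the test programs.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send two buffers on 'fd'.  The first send()
   carries MSG_MORE, telling the kernel that more data will follow;
   the second carries no flags, so nothing further holds the data back.
   Returns 0 on success, -1 on error or short send. */
static int
sendTwoParts(int fd, const void *first, size_t firstLen,
             const void *last, size_t lastLen)
{
    if (send(fd, first, firstLen, MSG_MORE) != (ssize_t) firstLen)
        return -1;
    if (send(fd, last, lastLen, 0) != (ssize_t) lastLen)
        return -1;
    return 0;
}
```

Unlike TCP_CORK, which stays in force on the socket until it is
cleared, the MSG_MORE hint applies only to the call that carries it.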

Here's what I see using my two test programs:

tcp_cork_receive port
    This binds to a port, accepts a connection
    and then reads blocks, displaying the size of
    each along with the time that the read() completed.

tcp_cork_send [options] server port num-writes buf-size
    Connect to server/port, perform specified number
    ('num-writes') of writes, each containing 'buf-size'
    bytes.  By default, this program enables TCP_CORK
    on the socket.

    Various options are provided, but the only one 
    needed for the test is '-d usecs' which specifies
    a number of microseconds to usleep() between writes.
    
In the following (run on 2.6.8.1), tcp_cork_send is used to 
write 100 bytes, one at a time, with a 10 millisecond delay 
between writes:

$ ./tcp_cork_receive 9999 &
[1] 8868
$ ./tcp_cork_send -d 10000 localhost 9999 100 1
[PID 8868] 1093009988.950: Receiver accepted connection
[PID 8869] 1093009988.951: Enabled TCP_CORK
[PID 8869] 1093009988.951: TCP_CORK=1
[PID 8868] 1093009989.152: [received 17 bytes]
[PID 8868] 1093009989.359: [received 17 bytes]
[PID 8868] 1093009989.563: [received 17 bytes]
[PID 8868] 1093009989.767: [received 17 bytes]
[PID 8868] 1093009989.971: [received 17 bytes]
[PID 8869] 1093009990.154: Completed writes
[PID 8868] 1093009990.155: [received 15 bytes]
[1]+  Done                    ./tcp_cork_receive 9999


The "received" messages appear every 200 milliseconds, even
though the sender did not disable TCP_CORK.  Based on what I’d
read/heard about TCP_CORK, I would have expected to see only 
one "received" after the sender had closed the socket.  But, 
instead, there is clearly a 200 millisecond ceiling on corkage.

Cheers,

Michael




/* tcp_cork_receive.c */

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/tcp.h>
#include <netinet/in.h>
#include <sys/time.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>

#define errMsg(msg)     { perror(msg); }

#define errExit(msg)    { perror(msg); exit(EXIT_FAILURE); }

#define usageErr(msg, progName) \
                        { fprintf(stderr, "Usage: "); \
                          fprintf(stderr, msg, progName); \
                          exit(EXIT_FAILURE); }

static void
traceInfo(void)
{
    struct timeval tv;

    if (gettimeofday(&tv, NULL) == -1) errExit("gettimeofday");
    printf("[PID %ld] %8.3f: ", (long) getpid(),
            tv.tv_sec + tv.tv_usec / 1000000.0);
} /* traceInfo */

int
main(int argc, char *argv[])
{
    int lfd, sfd;
    ssize_t numRead;
#define BUF_SIZE 100000
    char buf[BUF_SIZE];
    int optval;
    struct sockaddr_in svaddr;

    if (argc != 2  || strcmp(argv[1], "--help") == 0) {
        fprintf(stderr, "%s port\n", argv[0]);
        exit(EXIT_FAILURE);
    } 

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd == -1) errExit("socket");

    memset(&svaddr, 0, sizeof(struct sockaddr_in));
    svaddr.sin_family = AF_INET;
    svaddr.sin_port = htons(atoi(argv[1]));
    svaddr.sin_addr.s_addr = htonl(INADDR_ANY);

    optval = 1;
    if (setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &optval,
                sizeof(optval)) == -1) errExit("setsockopt");

    if (bind(lfd, (struct sockaddr *) &svaddr, sizeof(struct sockaddr_in))
            == -1) errExit("bind");

    if (listen(lfd, 5) == -1) errExit("listen");

    sfd = accept(lfd, NULL, NULL);
    if (sfd == -1) errExit("accept");

    traceInfo();
    printf("Receiver accepted connection\n");

    for (;;) {
        numRead = read(sfd, buf, BUF_SIZE);
        if (numRead == -1) errExit("read");
        if (numRead == 0)
            break;
        traceInfo();
        /* printf("[received %ld bytes] %.*s\n", (long) numRead,
           (int) numRead, buf); */
        printf("[received %ld bytes]\n", (long) numRead);
    } 

    close(lfd);
    close(sfd);
    exit(EXIT_SUCCESS);
} /* main */




/* tcp_cork_send.c */

#define _XOPEN_SOURCE 500
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#include <netinet/tcp.h>
#include <netinet/in.h>
#include <netdb.h>              /* gethostbyname(), struct hostent */
#include <sys/time.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>

typedef enum { FALSE, TRUE } Boolean;

#define errMsg(msg)     { perror(msg); }

#define errExit(msg)    { perror(msg); exit(EXIT_FAILURE); }

#define fatalErr(msg)   { fprintf(stderr, "%s\n", msg); \
                          exit(EXIT_FAILURE); }

static void
traceInfo(void)
{
    struct timeval tv;

    if (gettimeofday(&tv, NULL) == -1) errExit("gettimeofday");
    printf("[PID %ld] %8.3f: ", (long) getpid(),
            tv.tv_sec + tv.tv_usec / 1000000.0);
} /* traceInfo */

static void
usageError(char *progName, char *msg)
{
    if (msg != NULL)
        fprintf(stderr, "%s\n", msg);

    fprintf(stderr,
        "%s [options] server port num-writes buf-size\n"
        "\tnum-writes       Number of writes\n"
        "\tbuf-size         Size of buffer for each write\n"
        "\tOptions are:\n"
        "\t\t-d usecs  Microsecs delay between each write\n"
        "\t\t-n        Don't enable TCP_CORK before writes\n"
        "\t\t-s nsecs  Sleep for 'nsecs' seconds before closing socket\n"
        "\t\t-u        Disable TCP_CORK immediately after sending\n"
        "\t\t-v        Verbose reporting of (delayed) writes\n"
        ,
        progName);
    exit(EXIT_FAILURE);
} /* usageError */

int
main(int argc, char *argv[])
{
    int numWrites, j, sfd;
    useconds_t delayUsecs;
    size_t bufSize;
    char *buf;
    int optval;
    int opt;
    int finalSleepSecs;
    Boolean nocork, uncork, verbose;
    socklen_t optlen;
    struct sockaddr_in svaddr;
    struct hostent *h;
    struct in_addr **addrpp;

    nocork = FALSE;
    uncork = FALSE;
    verbose = FALSE;
    finalSleepSecs = 0;
    delayUsecs = 0;

    while ((opt = getopt(argc, argv, "d:us:vn")) != -1) {
        switch (opt) {
        case 'u':
            uncork = TRUE;
            break;

        case 's':
            finalSleepSecs = atoi(optarg);
            break;

        case 'n':
            nocork = TRUE;
            break;

        case 'v':
            verbose = TRUE;
            break;

        case 'd':
            delayUsecs = atoi(optarg);
            break;

        default:
            usageError(argv[0], "Bad option");
        } /* switch */
    } /* while */

    if (nocork && uncork)
        fatalErr("Can't specify both -n and -u options");

    if (argc != optind + 4 || strcmp(argv[optind], "--help") == 0)
        usageError(argv[0], NULL);

    numWrites = atoi(argv[optind + 2]);
    bufSize = atoi(argv[optind + 3]);

    buf = malloc(bufSize);
    if (buf == NULL) errExit("malloc");
    for (j = 0; j < (int) bufSize; j++)
        buf[j] = 'a' + j % 26;

    sfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sfd == -1) errExit("socket");

    memset(&svaddr, 0, sizeof(struct sockaddr_in));
    svaddr.sin_family = AF_INET;
    svaddr.sin_port = htons(atoi(argv[optind + 1]));

    h = gethostbyname(argv[optind]);
    if (h == NULL)
        fatalErr("host lookup failed (gethostbyname())");
    addrpp = (struct in_addr **) h->h_addr_list;
    svaddr.sin_addr.s_addr = (*addrpp)->s_addr;

    if (connect(sfd, (struct sockaddr *) &svaddr,
            sizeof(struct sockaddr_in)) == -1) errExit("connect");

    if (!nocork) {
        optval = 1;
        if (setsockopt(sfd, IPPROTO_TCP, TCP_CORK, &optval,
                sizeof(optval)) == -1) errExit("setsockopt");

        traceInfo();
        printf("Enabled TCP_CORK\n");
    } 

    optlen = sizeof(optval);
    if (getsockopt(sfd, IPPROTO_TCP,  TCP_CORK, &optval, &optlen) == -1)
        errExit("getsockopt");
    traceInfo();
    printf("TCP_CORK=%d\n", optval);

    for (j = 0; j < numWrites; j++) {
         if (write(sfd, buf, bufSize) != bufSize)
             errExit("write");

         if (delayUsecs > 0) {
             if (verbose) {
                 traceInfo();
                 printf("sleep %d\n", j);
             } 
             usleep(delayUsecs);
         } 
    } 

    traceInfo();
    printf("Completed writes\n");

    if (uncork) {
        traceInfo();
        printf("Disabling TCP_CORK\n");
        optval = 0;
        if (setsockopt(sfd, IPPROTO_TCP, TCP_CORK, &optval,
                sizeof(optval)) == -1) errExit("setsockopt");
    } 

    if (finalSleepSecs > 0) {
        traceInfo();
        printf("Sleeping\n");
        sleep(finalSleepSecs);
    } 

    close(sfd);
    exit(EXIT_SUCCESS);
} /* main */

-- 
Michael Kerrisk
mtk-lists@gmx.net
