public inbox for netdev@vger.kernel.org
* [BUG] net: tcp: SO_LINGER with l_linger=0 leaks memory when closing sockets with pending send data
@ 2026-04-18  0:19 Ahmed, Aaron
  2026-04-18  0:44 ` Kuniyuki Iwashima
  0 siblings, 1 reply; 5+ messages in thread
From: Ahmed, Aaron @ 2026-04-18  0:19 UTC (permalink / raw)
  To: stable@vger.kernel.org, netdev@vger.kernel.org
  Cc: ncardwell@google.com, edumazet@google.com, kuniyu@google.com

Hi,

We have identified a TCP memory leak issue on Amazon Linux with kernel versions 5.15.168 through 6.18.20 that occurs when closing sockets with SO_LINGER set to l_onoff=1, l_linger=0, on servers handling many persistent connections with full write buffers.

Overview:

The issue was discovered on a public-facing non-blocking TCP server that maintains many persistent connections and streams data to clients. When a client cannot read fast enough, the TCP write socket buffer on the server side fills up and send() returns EAGAIN. At that point, the server application disconnects the slow client by setting SO_LINGER to l_onoff=1, l_linger=0 and calling close(). This is intended to immediately reset the connection and release all associated kernel resources.

However, while the socket disappears from netstat and sockstat (TCP inuse drops), the write buffer memory is not reclaimed. /proc/net/sockstat shows TCP mem pages accumulating with no owning sockets, and the leaked memory eventually grows past the tcp_mem limits.

Setting SO_LINGER to l_onoff=1, l_linger=1 instead does not leak: the connection goes through FIN_WAIT1 → FIN_WAIT2 → CLOSE (confirmed with BPF tcpstates), and all memory is freed properly. With l_linger=0, the connection transitions directly from ESTABLISHED → CLOSE via RST, bypassing the FIN states entirely.

Reproducer:
```
/* tcp_linger_memleak.c - SO_LINGER(0) TCP memory leak reproducer
 *
 * Build:  gcc -O2 -o tcp_linger_memleak tcp_linger_memleak.c
 * Run:    sudo sysctl -w net.core.wmem_max=4194304
 *         sudo sysctl -w net.ipv4.tcp_rmem="4096 8192 16384"
 *         ./tcp_linger_memleak
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <netinet/in.h>

#define NUM_CONNS 5000
#define PORT      6666

static void print_mem(const char *label) {
    FILE *f;
    char line[256];

    f = fopen("/proc/meminfo", "r");
    if (f) {
        while (fgets(line, sizeof(line), f))
            if (strncmp(line, "MemAvailable:", 13) == 0)
                printf("%s: %s", label, line);
        fclose(f);
    }
    f = fopen("/proc/net/sockstat", "r");
    if (f) {
        while (fgets(line, sizeof(line), f))
            if (strncmp(line, "TCP:", 4) == 0)
                printf("%s: %s", label, line);
        fclose(f);
    }
}

int main(void) {
    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(PORT),
        .sin_addr.s_addr = htonl(INADDR_LOOPBACK)
    };
    int opt = 1;
    signal(SIGPIPE, SIG_IGN);

    int lsn = socket(AF_INET, SOCK_STREAM, 0);
    setsockopt(lsn, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
    if (bind(lsn, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        return 1;
    }
    listen(lsn, NUM_CONNS);

    /* Fork client: connect N times, never read */
    pid_t child = fork();
    if (child == 0) {
        int fds[NUM_CONNS];
        for (int i = 0; i < NUM_CONNS; i++) {
            fds[i] = socket(AF_INET, SOCK_STREAM, 0);
            connect(fds[i], (struct sockaddr *)&addr, sizeof(addr));
        }
        pause(); /* sit forever, never read */
        _exit(0);
    }

    /* Accept all connections */
    int clients[NUM_CONNS];
    for (int i = 0; i < NUM_CONNS; i++)
        clients[i] = accept(lsn, NULL, NULL);

    /* Freeze client so it stops reading */
    kill(child, SIGSTOP);
    printf("=== %d connections established, client frozen ===\n", NUM_CONNS);
    print_mem("BEFORE");

    /* Fill buffers and close with SO_LINGER(1,0) */
    char buf[2048];
    memset(buf, 'A', sizeof(buf));
    for (int i = 0; i < NUM_CONNS; i++) {
        int flags = fcntl(clients[i], F_GETFL, 0);
        fcntl(clients[i], F_SETFL, flags | O_NONBLOCK);
        while (send(clients[i], buf, sizeof(buf), MSG_NOSIGNAL) > 0);
        struct linger lg = { .l_onoff = 1, .l_linger = 0 };
        setsockopt(clients[i], SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
        close(clients[i]);
    }

    sleep(2);
    printf("\n=== All sockets closed with SO_LINGER(1,0) ===\n");
    print_mem("AFTER");
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    close(lsn);
    return 0;
}
```
Output (Tested on 6.18.20):
```
=== 5000 connections established, client frozen ===
BEFORE: MemAvailable:   95491288 kB
BEFORE: TCP: inuse 10005 orphan 0 tw 5 alloc 10006 mem 0

=== All sockets closed with SO_LINGER(1,0) ===
AFTER: MemAvailable:   95321800 kB
AFTER: TCP: inuse 5 orphan 0 tw 5 alloc 5006 mem 8300
```

Thanks,
Aaron Ahmed




end of thread, other threads:[~2026-04-28  0:16 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-04-18  0:19 [BUG] net: tcp: SO_LINGER with l_linger=0 leaks memory when closing sockets with pending send data Ahmed, Aaron
2026-04-18  0:44 ` Kuniyuki Iwashima
2026-04-18  1:06   ` Kuniyuki Iwashima
2026-04-27 22:26   ` Ahmed, Aaron
2026-04-28  0:15     ` Kuniyuki Iwashima

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox