public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Corey Minyard <minyard@acm.org>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Jiri Slaby <jirislaby@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Subject: Bug with data getting dropped on a pty
Date: Fri, 25 Sep 2020 17:05:36 -0500	[thread overview]
Message-ID: <20200925220536.GG3674@minyard.net> (raw)

[-- Attachment #1: Type: text/plain, Size: 2714 bytes --]

I've been trying to track down a bug in a library I support (named
gensio; it does all kinds of stream I/O) and I have figured out that
the problem is not in the library, it's in the kernel.  I have
attached a reproducer program, more on how to run it later.

Basically, if you have a pty master and do the following:

  write(ptym, data, size);
  close(ptym);

The other end will occasionally not get the first 4095 bytes of data,
but it will get byte 4095 and on.  This only happens on SMP systems; I
couldn't reproduce with just one processor.  (Running under qemu I
have seen it drop 2048 bytes, but it has always been 4095 outside of a
VM.)  I have tested on Ubuntu 18.04.5 x86_64, the base 5.4 kernel, on a
raspberry pi running raspian, 5.4.51 kernel, and the latest on the
master branch of Linus' tree running under qemu on x86_64.

I have never seen it fail going the other way (writing to the slave
and reading from the master) and that's part of the test suite.

I'm ok with it not getting any of the data, I'm ok with it getting
some of the data at the beginning, but dropping a chunk of the data
and getting later data is a problem.

I've looked at the pty and tty code and I haven't found anything
obvious, but I haven't looked that hard and I don't know that code
very well.

To run the reproducer:

  gcc -g -o testpty testpty.c
  ulimit -c unlimited
  while ./testpty; do echo pass; done

It should fail pretty quickly; it asserts when it detects the error.
You can load the core dump into the debugger.  Note that I wasn't able
to reproduce running it in the debugger.

In the debugger, you can back up to the assert and look at the readbuf:

(gdb) x/30xb readbuf
0x559e5e9c6080 <readbuf>:	0xff	0x08	0x00	0x08	0x01	0x08	0x02	0x08
0x559e5e9c6088 <readbuf+8>:	0x03	0x08	0x04	0x08	0x05	0x08	0x06	0x08
0x559e5e9c6090 <readbuf+16>:	0x07	0x08	0x08	0x08	0x09	0x08	0x0a	0x08
0x559e5e9c6098 <readbuf+24>:	0x0b	0x08	0x0c	0x08	0x0d	0x08

verses the data that was sent:

0x559e5e9b6080 <data>:	0x00	0x00	0x00	0x01	0x00	0x02	0x00	0x03
0x559e5e9b6088 <data+8>:	0x00	0x04	0x00	0x05	0x00	0x06	0x00	0x07
0x559e5e9b6090 <data+16>:	0x00	0x08	0x00	0x09	0x00	0x0a	0x00	0x0b
0x559e5e9b6098 <data+24>:	0x00	0x0c	0x00	0x0d	0x00	0x0e

The data is two byte big endian numbers ascending, the data in readbuf
that was read by the reader thread is the data starting at position
4095 in the data buffer that was transmitted.  Since n_tty has a 4096
byte buffer, that's somewhat suspicious.

Though the reproducer always fails on the first buffer, the test
program I had would close in random places, it would fail at places
besides the beginning of the buffer.

I searched and I couldn't find any error report on this.

-corey

[-- Attachment #2: testpty.c --]
[-- Type: text/x-csrc, Size: 2556 bytes --]


#define _XOPEN_SOURCE 600
#define _DEFAULT_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>
#include <errno.h>
#include <termios.h>
#include <assert.h>

static int pty_make_raw(int ptym)
{
    struct termios t;
    int err;

    err = tcgetattr(ptym, &t);
    if (err)
	return err;

    cfmakeraw(&t);
    return tcsetattr(ptym, TCSANOW, &t);
}

unsigned char data[65536];
unsigned char readbuf[65536];
int slavefd, slaveerr;
size_t readsize;

int
cmp_mem(unsigned char *buf, unsigned char *buf2, size_t len, size_t pos)
{
    size_t i;
    int rv = 0;

    for (i = 0; i < len; i++) {
	if (buf[i] != buf2[i]) {
	    printf("Mismatch on byte %lu, expected 0x%2.2x, got 0x%2.2x\n",
		   (long) (i + pos), buf[i], buf2[i]);
	    fflush(stdout);
	    rv = -1;
	    break;
	}
    }
    return rv;
}

static void *read_thread(void *dummy)
{
    ssize_t i;

    do {
	i = read(slavefd, readbuf + readsize, sizeof(readbuf) - readsize);
	if (i <= -1) {
	    if (errno == EAGAIN)
		continue;
	    if (errno == EIO)
		/* Remote close causes an EIO. */
		return NULL;
	    perror("read");
	    slaveerr = errno;
	    return NULL;
	}
	if (i + readsize > sizeof(data)) {
	    slaveerr = E2BIG;
	    return NULL;
	}
	if (i && cmp_mem(data + readsize, readbuf + readsize, i, readsize)) {
	    fprintf(stderr, "Data mismatch, starting at %ld, %ld bytes\n",
		    (long) readsize, (long) i);
	    assert(0);
	    slaveerr = EBADMSG;
	    return NULL;
	}
	readsize += i;
    } while (i != 0);

    return NULL;
}

int main(int argc, char *argv[])
{
    int ptym, err;
    char *slave;
    ssize_t i;
    pthread_t slavethr;

    for (i = 0; i < sizeof(data); i += 2) {
	data[i] = (i / 2) >> 8;
	data[i + 1] = i / 2;
    }

    ptym = posix_openpt(O_RDWR | O_NOCTTY);
    if (ptym == -1) {
	perror("posix_openpt");
	exit(1);
    }

    if (fcntl(ptym, F_SETFL, O_NONBLOCK) == -1) {
	perror("fcntl ptym");
	exit(1);
    }

    if (pty_make_raw(ptym)) {
	perror("pty_make_raw");
	exit(1);
    }

    if (unlockpt(ptym) < 0) {
	perror("unlockpt");
	exit(1);
    }

    slave = ptsname(ptym);
    slavefd = open(slave, O_RDWR);
    if (slavefd == -1) {
	perror("open");
	exit(1);
    }

    err = pthread_create(&slavethr, NULL, read_thread, NULL);
    if (err) {
	fprintf(stderr, "pthread_create: %s\n", strerror(err));
	exit(1);
    }

    i = write(ptym, data, sizeof(data));
    if (i == -1) {
	perror("write");
	exit(1);
    }
    close(ptym);

    pthread_join(slavethr, NULL);

    exit(0);
}

             reply	other threads:[~2020-09-25 22:05 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-25 22:05 Corey Minyard [this message]
2020-09-27 11:37 ` Bug with data getting dropped on a pty Greg Kroah-Hartman
2020-09-27 19:55   ` Corey Minyard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200925220536.GG3674@minyard.net \
    --to=minyard@acm.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=jirislaby@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox