* linux select() bug hit
From: A. Castro @ 2002-01-26  6:50 UTC
  To: linux-kernel

Please CC me on any answers or questions; I'm not on the mailing list.

Greetings,

My reason for posting is the following log message:

pppoe[1857]: Linux select bug hit! This message is harmless, but please
ask the Linux kernel developers to fix it.

I connect to the Internet through my ADSL provider. From time to time my
ADSL connection dies, and afterwards I find a message in my logs
about a Linux select bug being hit.

I'm using the N_HDLC line discipline for synchronous mode.
The ADSL daemon is provided by the rp-pppoe package. If anyone on this
list wishes to look into this further, please contact me so that I can
provide more accurate information.

Currently running: kernel 2.4.17.
At the moment I'm using kernel-mode pppoe, and I haven't seen "Linux
select bug hit". At the time that I wrote this email I was using the
regular pppoe mode with the N_HDLC line discipline.

Al



* Re: linux select() bug hit
From: Andrew Morton @ 2002-01-26  7:17 UTC
  To: A. Castro; +Cc: linux-kernel, David F. Skoll

"A. Castro" wrote:
> 
> Please CC me on any answers or questions; I'm not on the mailing list.
> 
> Greetings,
> 
> Reason for posting/sending this email.
> 
> 1. the actual message:
> pppoe[1857]: Linux select bug hit! This message is harmless, but please
> ask the Linux kernel developers to fix it.
> 

Hmm. The source is at http://www.roaringpenguin.com/pppoe/rp-pppoe-3.3.tar.gz

They have this:

            /* There is a bug in Linux's select which returns a descriptor
             * as readable if N_HDLC line discipline is on, even if
             * it isn't really readable.  This return happens only when
             * select() times out.  To avoid blocking forever in read(),
             * make descriptor 0 non-blocking */
            flags = fcntl(0, F_GETFL);
            if (flags < 0) fatalSys("fcntl(F_GETFL)");
            if (fcntl(0, F_SETFL, (long) flags | O_NONBLOCK) < 0) {
                fatalSys("fcntl(F_SETFL)");
            }

and later this:

syncReadFromPPP(PPPoEConnection *conn, PPPoEPacket *packet)
{
    int r;
#ifndef HAVE_N_HDLC
    struct iovec vec[2];
    unsigned char dummy[2];
    vec[0].iov_base = (void *) dummy;
    vec[0].iov_len = 2;
    vec[1].iov_base = (void *) packet->payload;
    vec[1].iov_len = ETH_DATA_LEN - PPPOE_OVERHEAD;

    /* Use scatter-read to throw away the PPP frame address bytes */
    r = readv(0, vec, 2);
#else
    /* Bloody hell... readv doesn't work with N_HDLC line discipline... GRR! */
    unsigned char buf[ETH_DATA_LEN - PPPOE_OVERHEAD + 2];
    r = read(0, buf, ETH_DATA_LEN - PPPOE_OVERHEAD + 2);
    if (r >= 2) {
        memcpy(packet->payload, buf+2, r-2);
    }
#endif
    if (r < 0) {
        /* Catch the Linux "select" bug */
        if (errno == EAGAIN) {
            rp_fatal("Linux select bug hit!  This message is harmless, but please ask the Linux kernel developers to fix it.");
        }
        fatalSys("read (syncReadFromPPP)");
    }

and

    struct timeval *tvp = NULL;
 ...
    for (;;) {
        if (optInactivityTimeout > 0) {
            tv.tv_sec = optInactivityTimeout;
            tv.tv_usec = 0;
            tvp = &tv;
        }
        FD_ZERO(&readable);
        FD_SET(0, &readable);     /* ppp packets come from stdin */
        if (conn->discoverySocket >= 0) {
            FD_SET(conn->discoverySocket, &readable);
        }
        FD_SET(conn->sessionSocket, &readable);
        while(1) {
            r = select(maxFD, &readable, NULL, NULL, tvp);
            if (r >= 0 || errno != EINTR) break;
        }
 ...
        /* Handle ready sockets */
        if (FD_ISSET(0, &readable)) {
            if (conn->synchronous) {
                syncReadFromPPP(conn, &packet);
            } else {
                asyncReadFromPPP(conn, &packet);
            }
        }

So, as the comment says, they are claiming that select() reports an
O_NONBLOCK descriptor which has the N_HDLC line discipline attached as
readable when the select times out, so that a subsequent read() on that
descriptor returns -1 (EAGAIN).

And from a quick read, the code looks OK.  select() says there's
activity on fd 0, but there isn't.

Can any ABI gurus confirm that this is actually a kernel bug?
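
A small probe along the following lines might help confirm or rule out the
behaviour on a given kernel. It is a sketch under assumptions: probe_readable
and the probe_result names are made up for illustration, and the descriptor
would need the N_HDLC line discipline already set up by the caller, which is
not shown here.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical result codes for the probe below. */
enum probe_result {
    PROBE_TIMEOUT,    /* select() returned 0 and left the fd bit clear */
    PROBE_STALE_BIT,  /* select() returned 0 but the fd bit is still set */
    PROBE_DATA,       /* select() said readable and read() returned data */
    PROBE_EOF,        /* select() said readable and read() returned 0 */
    PROBE_SPURIOUS,   /* select() said readable but read() gave EAGAIN */
    PROBE_ERROR       /* any other failure, including EINTR */
};

enum probe_result probe_readable(int fd, long timeout_sec)
{
    fd_set readable;
    struct timeval tv;
    char buf[2048];
    ssize_t n;
    int flags, r;

    /* Make the descriptor non-blocking so a bogus "readable" cannot hang us. */
    flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return PROBE_ERROR;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return PROBE_ERROR;

    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    r = select(fd + 1, &readable, NULL, NULL, &tv);
    if (r < 0)
        return PROBE_ERROR;
    if (r == 0)
        return FD_ISSET(fd, &readable) ? PROBE_STALE_BIT : PROBE_TIMEOUT;

    /* select() claims there is something to read: check that claim. */
    n = read(fd, buf, sizeof buf);
    if (n > 0)
        return PROBE_DATA;
    if (n == 0)
        return PROBE_EOF;
    return (errno == EAGAIN) ? PROBE_SPURIOUS : PROBE_ERROR;
}
```

Run against the PPP descriptor, PROBE_STALE_BIT or PROBE_SPURIOUS would point
at select() rather than at the pppoe code; on an ordinary pipe neither is
expected.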

-
