From: Ray Bryant <raybry@sgi.com>
To: linux-kernel@vger.kernel.org
Cc: Andrew Morton <akpm@digeo.com>,
Manfred Spraul <manfred@colorfullife.com>,
Andi Kleen <ak@suse.de>,
trivial@rustcorp.com.au, alan@lxorguk.ukuu.org.uk
Subject: PROBLEM: Bug in __pollwait() can cause select() and poll() to hang in 2.4.22-pre2 -- second try
Date: Fri, 27 Jun 2003 13:19:20 -0500
Message-ID: <3EFC8AA8.7000501@sgi.com>
[1.] One line summary of the problem:
In low memory situations, a process that issues a call to select()
or poll() can sleep forever in the kernel.
[2.] Full description of the problem/report:
select() and poll() call a common routine: __pollwait(). On the
first call to __pollwait(), it calls __get_free_page(GFP_KERNEL) to
allocate a table to hold wait queues. In the natural course of things,
this calls into __alloc_pages(). In low memory situations, the process
can then end up in the rebalance code at the bottom of __alloc_pages()
where there is a call to yield(). If the process makes this call, this
is a bad thing [tm], since the process state at that point is
TASK_INTERRUPTIBLE. There is no wait queue yet for the process (that is
done later in __pollwait()) and no schedule timeout event has yet been
created (that is done later in select()) so the process will never
return from the call to yield().
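The sequence above can be sketched as a small user-space model. This is
only an illustration of the reasoning, not kernel code: the struct, its
fields and the model_* functions are hypothetical names, the scheduler is
reduced to a single rule (a task that is not TASK_RUNNING is taken off the
run queue), and signal delivery is left out of the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; the names only mirror the kernel's. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

struct task {
    enum task_state state;
    bool on_runqueue;   /* can the scheduler pick this task again? */
    bool on_waitqueue;  /* would a wake_up() on some queue find it? */
    bool timer_armed;   /* would a schedule_timeout() timer wake it? */
};

/* Model of schedule(): a task that is not TASK_RUNNING is dequeued. */
static void model_schedule(struct task *t)
{
    if (t->state != TASK_RUNNING)
        t->on_runqueue = false;
}

/* Model of yield() as described in the report: it calls into the
 * scheduler without changing the task state first. */
static void model_yield(struct task *t)
{
    model_schedule(t);
}

/* True if anything in the model can make the task run again. */
static bool can_run_again(const struct task *t)
{
    return t->on_runqueue || t->on_waitqueue || t->timer_armed;
}

/* Model of the first __pollwait() call hitting the rebalance path:
 * select()/poll() has already set TASK_INTERRUPTIBLE, but no wait
 * queue entry and no timeout exist yet. */
static bool model_first_pollwait(bool apply_fix)
{
    struct task t = { TASK_INTERRUPTIBLE, true, false, false };

    assert(t.on_runqueue);          /* the caller is currently running */
    if (apply_fix)
        t.state = TASK_RUNNING;     /* set_current_state(TASK_RUNNING) */
    model_yield(&t);
    return can_run_again(&t);
}
```

In this model, model_first_pollwait(false) leaves the task with nothing
that can wake it, while model_first_pollwait(true) keeps it on the run
queue.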
[3.] Keywords (i.e., modules, networking, kernel):
Kernel
[4.] Kernel version (from /proc/version):
This bug appears to be present in every 2.4 kernel from (at least)
2.4.13 thru 2.4.22-pre2. It is not present in 2.5.70, since a different
method of waiting for memory to free up is used there (in
__alloc_pages()).
[5.] Output of Oops.. message (if applicable) with symbolic information
resolved (see Documentation/oops-tracing.txt)
N/A.
[6.] A small shell script or example program which triggers the
problem (if possible)
We encountered this whilst running batch queue tests that are too
complex to include here.
[7.] Environment
[7.1.] Software (add the output of the ver_linux script here)
[7.2.] Processor information (from /proc/cpuinfo):
We encountered this on ia64; however, this is machine-independent
code, and we believe the bug is present on all 2.4.21
platforms.
[7.3.] Module information (from /proc/modules):
[7.4.] Loaded driver and hardware information (/proc/ioports,
/proc/iomem)
[7.5.] PCI information ('lspci -vvv' as root)
[7.6.] SCSI information (from /proc/scsi/scsi)
[7.7.] Other information that might be relevant to the problem
(please look in /proc and include all information that you
think to be relevant):
[X.] Other notes, patches, fixes, workarounds:
The simplest fix (as suggested by Manfred Spraul) is to set
current->state to TASK_RUNNING just before the call to yield() in
__alloc_pages(). I have tested this sufficiently that I believe
it does not change the user-level semantics of select(). My
concern was that, once state is set to TASK_RUNNING, the syscall
could return before any fds are ready or the select() timeout has
expired, but this does not appear to be the case.
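One plausible reason the semantics are unchanged can be sketched in user
space. The sketch assumes that the select() loop sets TASK_INTERRUPTIBLE
again at the top of every pass and returns only when a scan finds a ready
fd; that assumption is mine and is not something the testing above
establishes. All names here are hypothetical, and timeouts and signals are
not modelled.

```c
#include <stdbool.h>

/* Hypothetical user-space model; the names only mirror the kernel's. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

struct select_model {
    enum task_state state;
    int passes;       /* scans over the fd set */
    int sleeps;       /* times the task actually blocked */
    bool allocated;   /* has the first __pollwait() allocation run? */
};

/* One scan of the fd set. On the first scan the patched allocation
 * path is assumed to run and to leave the task in TASK_RUNNING.
 * Returns true once an fd is ready (on pass number ready_on_pass). */
static bool scan_fds(struct select_model *m, int ready_on_pass)
{
    m->passes++;
    if (!m->allocated) {
        m->allocated = true;
        m->state = TASK_RUNNING;  /* effect of the patched yield path */
    }
    return m->passes >= ready_on_pass;
}

/* Model of schedule_timeout(): the task blocks only if it is not
 * TASK_RUNNING, and it is TASK_RUNNING again when the call returns. */
static void model_schedule_timeout(struct select_model *m)
{
    if (m->state != TASK_RUNNING)
        m->sleeps++;
    m->state = TASK_RUNNING;
}

/* Model of the select() wait loop; returns the number of scans made. */
static int model_select(struct select_model *m, int ready_on_pass)
{
    for (;;) {
        m->state = TASK_INTERRUPTIBLE;  /* re-asserted on every pass */
        if (scan_fds(m, ready_on_pass))
            break;
        model_schedule_timeout(m);
    }
    m->state = TASK_RUNNING;
    return m->passes;
}
```

In this model the stray TASK_RUNNING costs one pass that does not block:
the loop rescans, goes back to TASK_INTERRUPTIBLE, and still returns only
on the pass where an fd is ready.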
Here is a trivial patch against 2.4.22-pre2:
--- linux-2.4.22-pre2.orig/mm/page_alloc.c	Thu Nov 28 17:53:15 2002
+++ linux-2.4.22-pre2/mm/page_alloc.c	Fri Jun 27 13:47:49 2003
@@ -418,6 +418,7 @@
 		return NULL;
 
 	/* Yield for kswapd, and try again */
+	set_current_state(TASK_RUNNING);
 	yield();
 	goto rebalance;
 }
--
Best Regards,
Ray
-----------------------------------------------
Ray Bryant
512-453-9679 (work) 512-507-7807 (cell)
Jun 23-Jul 18 I will be at: 970-513-4743
raybry@sgi.com raybry@austin.rr.com
The box said: "Requires Windows 98 or better",
so I installed Linux.
-----------------------------------------------