From: Ray Bryant <raybry@sgi.com>
To: linux-kernel@vger.kernel.org
Cc: Andrew Morton <akpm@digeo.com>,
Manfred Spraul <manfred@colorfullife.com>,
Andi Kleen <ak@suse.de>,
trivial@rustcorp.com.au, alan@lxorguk.ukuu.org.uk
Subject: PROBLEM: Bug in __pollwait() can cause select() and poll() to hang in 2.4.22-pre2 -- second try
Date: Fri, 27 Jun 2003 13:19:20 -0500
Message-ID: <3EFC8AA8.7000501@sgi.com>
[1.] One line summary of the problem:
In low memory situations, a process that issues a call to select()
or poll() can sleep forever in the kernel.
[2.] Full description of the problem/report:
select() and poll() call a common routine: __pollwait(). On the
first call to __pollwait(), it calls __get_free_page(GFP_KERNEL) to
allocate a table to hold wait queues. In the natural course of things,
this calls into __alloc_pages(). In low memory situations, the process
can then end up in the rebalance code at the bottom of __alloc_pages()
where there is a call to yield(). If the process makes this call, this
is a bad thing [tm], since the process state at that point is
TASK_INTERRUPTIBLE. There is no wait queue yet for the process (that is
done later in __pollwait()) and no schedule timeout event has yet been
created (that is done later in select()) so the process will never
return from the call to yield().
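To make the sequence above concrete, here is a toy userspace model of it. It is only a sketch, not kernel code: the struct, the helper names, and the simplified scheduler rule (a task that is not TASK_RUNNING when it gives up the CPU is dropped from the run queue) are assumptions made for illustration, and signal delivery is ignored.

```c
#include <stdbool.h>

/* Toy model only: the names mirror the report but this is not kernel code. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

struct task {
	enum task_state state;
	bool on_runqueue;   /* scheduler can still pick this task */
	bool on_waitqueue;  /* would be set up later in __pollwait() */
	bool timer_armed;   /* would be set up later in select() */
};

/*
 * Assumed scheduler rule: a task that is not TASK_RUNNING when it gives
 * up the CPU is taken off the run queue. Signals are ignored here.
 */
static void model_schedule(struct task *t)
{
	if (t->state != TASK_RUNNING)
		t->on_runqueue = false;
}

/* True if anything is left that could make the task run again. */
static bool can_run_again(const struct task *t)
{
	return t->on_runqueue || t->on_waitqueue || t->timer_armed;
}

/*
 * Model the yield() reached from the first __pollwait() call: the task is
 * already TASK_INTERRUPTIBLE, with no wait queue entry and no timer.
 * set_running models forcing the state to TASK_RUNNING before yielding.
 */
static bool yield_returns(bool set_running)
{
	struct task t = { TASK_INTERRUPTIBLE, true, false, false };

	if (set_running)
		t.state = TASK_RUNNING;
	model_schedule(&t);
	return can_run_again(&t);
}
```

In this model yield_returns(false) is false (nothing can wake the task), while yield_returns(true) is true because the task stays on the run queue.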
[3.] Keywords (i.e., modules, networking, kernel):
Kernel
[4.] Kernel version (from /proc/version):
This bug appears to be present in every 2.4 kernel from (at least)
2.4.13 thru 2.4.22-pre2. It is not present in 2.5.70, since a different
method of waiting for memory to free up is used there (in
__alloc_pages()).
[5.] Output of Oops.. message (if applicable) with symbolic information
resolved (see Documentation/oops-tracing.txt)
N/A.
[6.] A small shell script or example program which triggers the
problem (if possible)
We encountered this whilst running batch queue tests that are too
complex to include here.
[7.] Environment
[7.1.] Software (add the output of the ver_linux script here)
[7.2.] Processor information (from /proc/cpuinfo):
We encountered this on ia64; however, this is machine-independent
code and we believe the bug is present on all 2.4.21 platforms.
[7.3.] Module information (from /proc/modules):
[7.4.] Loaded driver and hardware information (/proc/ioports,
/proc/iomem)
[7.5.] PCI information ('lspci -vvv' as root)
[7.6.] SCSI information (from /proc/scsi/scsi)
[7.7.] Other information that might be relevant to the problem
(please look in /proc and include all information that you
think to be relevant):
[X.] Other notes, patches, fixes, workarounds:
The simplest fix (as suggested by Manfred Spraul) is to set
current->state to TASK_RUNNING just before the call to yield() in
__alloc_pages(). I have tested this sufficiently that I believe
this does not change the user level semantics of select(). (My
concern was that if state were set to TASK_RUNNING, the syscall
could return before any fd's are ready or the select() timeout has
expired, but this does not appear to be the case.)
Here is a trivial patch against 2.4.22-pre2:
--- linux-2.4.22-pre2.orig/mm/page_alloc.c Thu Nov 28 17:53:15 2002
+++ linux-2.4.22-pre2/mm/page_alloc.c Fri Jun 27 13:47:49 2003
@@ -418,6 +418,7 @@
 		return NULL;
 
 	/* Yield for kswapd, and try again */
+	set_current_state(TASK_RUNNING);
 	yield();
 	goto rebalance;
 }
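For anyone who wants to repeat the user-level check described above, a minimal sketch follows. It only checks that select() on an idle pipe does not return before its timeout; it does not reproduce the low-memory condition. The helper name timed_select_ms() and the choice of an idle pipe are assumptions of this sketch, not the test that was actually run.

```c
#define _POSIX_C_SOURCE 200112L
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/*
 * Call select() on the read end of an empty pipe with the given timeout.
 * Returns the elapsed time in milliseconds (or -1 if pipe() fails) and
 * stores select()'s return value in *ret. Hypothetical helper.
 */
static long timed_select_ms(long timeout_ms, int *ret)
{
	int fds[2];
	fd_set rfds;
	struct timeval tv;
	struct timespec start, end;

	if (pipe(fds) != 0)
		return -1;

	FD_ZERO(&rfds);
	FD_SET(fds[0], &rfds);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;

	clock_gettime(CLOCK_MONOTONIC, &start);
	/* Nothing is ever written to the pipe, so only the timeout can end this. */
	*ret = select(fds[0] + 1, &rfds, NULL, NULL, &tv);
	clock_gettime(CLOCK_MONOTONIC, &end);

	close(fds[0]);
	close(fds[1]);

	return (end.tv_sec - start.tv_sec) * 1000L +
	       (end.tv_nsec - start.tv_nsec) / 1000000L;
}
```

With a 200 ms timeout, select() should return 0 (no fd ready) and the elapsed time should be about 200 ms or more. An early return would indicate that the select() semantics had changed.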
--
Best Regards,
Ray
-----------------------------------------------
Ray Bryant
512-453-9679 (work) 512-507-7807 (cell)
Jun 23-Jul 18 I will be at: 970-513-4743
raybry@sgi.com raybry@austin.rr.com
The box said: "Requires Windows 98 or better",
so I installed Linux.
-----------------------------------------------