From: Joe Korty <joe.korty@ccur.com>
To: akpm@digeo.com
Cc: linux-kernel@vger.kernel.org, riel@redhat.com, andrea@suse.de
Subject: mlockall and mmap of IO devices don't mix
Date: Fri, 3 Oct 2003 17:44:12 -0400 [thread overview]
Message-ID: <20031003214411.GA25802@rudolph.ccur.com> (raw)
2.6.0-test6: the use of mlockall(2) in a process that has mmap(2)ed
the registers of an IO device will hang that process uninterruptibly.
The task runs in an infinite loop in get_user_pages(), invoking
follow_page() forever.
Using binary search I discovered that the problem was introduced
in 2.5.14, specifically in ChangeSetKey
zippel@linux-m68k.org|ChangeSet|20020503210330|37095
The following patch backs out those lines of the above changeset
which are causing the problem:
--- a/mm/memory.c 2003-10-02 15:32:51.000000000 -0400
+++ b/mm/memory.c 2003-10-02 17:02:48.000000000 -0400
@@ -485,9 +485,7 @@
if (pte_present(pte)) {
if (!write ||
(pte_write(pte) && pte_dirty(pte))) {
- pfn = pte_pfn(pte);
- if (pfn_valid(pfn))
- return pfn_to_page(pfn);
+ return pte_page(pte);
}
}
The equivalent backout patch for 2.6.0-test6 is:
--- linux-2.6.0-test6/mm/memory.c.orig 2003-09-27 20:50:19.000000000 -0400
+++ linux-2.6.0-test6/mm/memory.c 2003-10-03 16:17:25.000000000 -0400
@@ -618,7 +618,6 @@
pgd_t *pgd;
pmd_t *pmd;
pte_t *ptep, pte;
- unsigned long pfn;
struct vm_area_struct *vma;
vma = hugepage_vma(mm, address);
@@ -645,13 +644,11 @@
pte_unmap(ptep);
if (pte_present(pte)) {
if (!write || (pte_write(pte) && pte_dirty(pte))) {
- pfn = pte_pfn(pte);
- if (pfn_valid(pfn)) {
- struct page *page = pfn_to_page(pfn);
-
+ struct page *page = pte_page(pte);
+ if (pfn_valid(pte_pfn(pte))) {
mark_page_accessed(page);
- return page;
}
+ return page;
}
}
I do not believe that the above constitutes a correct fix. The
problem is that follow_pages() is fundamentally not able to handle a
mapping which does not have a 'struct page' backing it up, and a
mapping to IO memory by definition has no 'struct page' structure to
back it up.
IMO the original 2.5.14 patch which 'broke' the kernel is correct.
It made follow_page() return zero for a 'struct page' address, which
is entirely reasonable when a mapping has no 'struct page'. Prior to
2.5.14 follow_page() would return some meaningless nonzero value
which, nevertheless, allowed its caller to continue and establish an
apparently correct mapping.
Joe
PS: my test program. Nice and short.
/* test mlockall() in conjunction with mmap(2) of an IO device.
* success -- program exits.
* failure -- program hangs up, unkillable.
*/
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/mman.h>
int main(int argc, char **argv)
{
int stat, fd;
unsigned char *p;
unsigned long off;
if (argc > 1) {
off = strtoul(argv[1], NULL, 16);
} else {
fprintf(stderr, "arg is video card addr in /dev/mem");
exit(1);
}
stat = mlockall(MCL_CURRENT | MCL_FUTURE);
if (stat == -1) {
perror("mlockall");
exit(1);
}
fd = open ("/dev/mem", O_RDWR, 0666);
if (fd == -1) {
perror("open");
exit(1);
}
p = (unsigned char *)mmap(NULL, 0x4096,
PROT_READ | PROT_WRITE, MAP_SHARED, fd, (off_t)off);
if (!p) {
perror("mmap");
exit(1);
}
printf("[%08lx] %02x%02x%02x%02x\n",
off, p[0], p[1], p[2], p[3]);
return 0;
}
next reply other threads:[~2003-10-03 21:45 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2003-10-03 21:44 Joe Korty [this message]
2003-10-03 22:23 ` mlockall and mmap of IO devices don't mix Andrew Morton
2003-10-03 22:55 ` Joe Korty
2003-10-03 23:06 ` Andrew Morton
2003-10-03 23:28 ` Joe Korty
2003-10-03 23:15 ` Andrew Morton
2003-10-03 23:54 ` Joe Korty
2003-10-04 0:27 ` Andrew Morton
2003-10-04 5:47 ` David S. Miller
2003-10-04 9:29 ` Ingo Oeser
2004-05-21 11:34 ` Mark Hounschell
2004-05-22 2:13 ` Andrew Morton
2004-05-22 10:47 ` Mark Hounschell
2004-05-23 12:58 ` Mark Hounschell
2004-05-25 14:27 ` Joe Korty
2004-05-25 19:47 ` Andrew Morton
2004-05-25 21:31 ` Joe Korty
2004-07-16 21:01 ` Mark Hounschell
[not found] <CFYv.787.23@gated-at.bofh.it>
2003-10-04 7:02 ` Andi Kleen
2003-10-04 7:42 ` Andrew Morton
2003-10-04 8:29 ` Andi Kleen
2003-10-04 8:47 ` Ingo Oeser
2003-10-04 9:17 ` Andi Kleen
2003-10-04 9:22 ` Russell King
2003-10-04 10:02 ` Ingo Oeser
2003-10-04 10:13 ` Russell King
2003-10-04 14:19 ` Ingo Oeser
2003-10-04 14:19 ` Ingo Oeser
2003-10-04 14:32 ` Martin J. Bligh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20031003214411.GA25802@rudolph.ccur.com \
--to=joe.korty@ccur.com \
--cc=akpm@digeo.com \
--cc=andrea@suse.de \
--cc=linux-kernel@vger.kernel.org \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.