From: Andrea Arcangeli <andrea@suse.de>
To: Russell King <rmk@arm.linux.org.uk>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Bug: Discontigmem virt_to_page() [Alpha,ARM,Mips64?]
Date: Sat, 27 Apr 2002 00:46:41 +0200 [thread overview]
Message-ID: <20020427004641.L19278@dualathlon.random> (raw)
In-Reply-To: <20020426192711.D18350@flint.arm.linux.org.uk>
On Fri, Apr 26, 2002 at 07:27:11PM +0100, Russell King wrote:
> Hi,
>
> I've been looking at some of the ARM discontigmem implementations, and
> have come across a nasty bug. To illustrate this, I'm going to take
> part of the generic kernel, and use the Alpha implementation to
> illustrate the problem we're facing on ARM.
>
> I'm going to argue here that virt_to_page() can, in the discontigmem
> case, produce rather a nasty bug when used with non-direct mapped
> kernel memory arguments.
>
> In mm/memory.c:remap_pte_range() we have the following code:
>
> page = virt_to_page(__va(phys_addr));
> if ((!VALID_PAGE(page)) || PageReserved(page))
> set_pte(pte, mk_pte_phys(phys_addr, prot));
>
> Let's look closely at the first line:
>
> page = virt_to_page(__va(phys_addr));
>
> Essentially, what we're trying to do here is convert a physical address
> to a struct page pointer.
>
> __va() is defined, on Alpha, to be:
>
> #define __va(x) ((void *)((unsigned long) (x) + PAGE_OFFSET))
>
> so we produce a unique "va" for any physical address that is passed. No
> problem so far. Now, lets look at virt_to_page() for the Alpha:
>
> #define virt_to_page(kaddr) \
> (ADDR_TO_MAPBASE(kaddr) + LOCAL_MAP_NR(kaddr))
>
> Looks inoccuous enough. However, look closer at ADDR_TO_MAPBASE:
>
> #define ADDR_TO_MAPBASE(kaddr) \
> NODE_MEM_MAP(KVADDR_TO_NID((unsigned long)(kaddr)))
> #define NODE_MEM_MAP(nid) (NODE_DATA(nid)->node_mem_map)
> #define NODE_DATA(n) (&((PLAT_NODE_DATA(n))->gendata))
> #define PLAT_NODE_DATA(n) (plat_node_data[(n)])
>
> Ok, so here we get the map base via:
>
> plat_node_data[KVADDR_TO_NID((unsigned long)(kaddr))]->
> gendata.node_mem_map
>
> plat_node_data is declared as:
>
> plat_pg_data_t *plat_node_data[MAX_NUMNODES];
>
> Lets look closer at KVADDR_TO_NID() and MAX_NUMNODES:
>
> #define KVADDR_TO_NID(kaddr) PHYSADDR_TO_NID(__pa(kaddr))
> #define __pa(x) ((unsigned long) (x) - PAGE_OFFSET)
> #define PHYSADDR_TO_NID(pa) ALPHA_PA_TO_NID(pa)
> #define ALPHA_PA_TO_NID(pa) ((pa) >> 36) /* 16 nodes max due 43bit kseg */
>
> #define MAX_NUMNODES WILDFIRE_MAX_QBB
> #define WILDFIRE_MAX_QBB 8 /* more than 8 requires other mods */
>
> So, we have a maximum of 8 nodes total, and therefore the plat_node_data
> array is 8 entries large.
>
> Now, what happens if 'kaddr' is below PAGE_OFFSET (because the user has
> opened /dev/mem and mapped some random bit of physical memory space)?
>
> __pa returns a large positive number. We shift this large positive
> number left by 36 bits, leaving 28 bits of large positive number, which
> is larger than our total 8 nodes.
>
> We use this 28-bit number to index plat_node_data. Whoops.
>
> And now, for the icing on the cake, take a look at Alpha's pte_page()
> implementation:
>
> unsigned long kvirt; \
> struct page * __xx; \
> \
> kvirt = (unsigned long)__va(pte_val(x) >> (32-PAGE_SHIFT)); \
> __xx = virt_to_page(kvirt); \
> \
> __xx; \
>
>
> Someone *please* tell me where I'm wrong. I really want to be wrong,
> because I can see the same thing happening (in theory, and one report
> in practice from a developer) on a certain ARM platform.
>
> On ARM, however, we have cherry to add here. __va() may alias certain
> physical memory addresses to the same virtual memory address, which
> makes:
>
> VALID_PAGE(virt_to_page(__va(phys)))
>
> completely nonsensical.
correct. This should fix it:
--- 2.4.19pre7aa2/include/asm-alpha/mmzone.h.~1~ Fri Apr 26 10:28:28 2002
+++ 2.4.19pre7aa2/include/asm-alpha/mmzone.h Sat Apr 27 00:30:02 2002
@@ -106,8 +106,8 @@
#define kern_addr_valid(kaddr) test_bit(LOCAL_MAP_NR(kaddr), \
NODE_DATA(KVADDR_TO_NID(kaddr))->valid_addr_bitmap)
-#define virt_to_page(kaddr) (ADDR_TO_MAPBASE(kaddr) + LOCAL_MAP_NR(kaddr))
-#define VALID_PAGE(page) (((page) - mem_map) < max_mapnr)
+#define virt_to_page(kaddr) (KVADDR_TO_NID((unsigned long) kaddr) < MAX_NUMNODES ? ADDR_TO_MAPBASE(kaddr) + LOCAL_MAP_NR(kaddr) : 0)
+#define VALID_PAGE(page) ((page) != NULL)
#ifdef CONFIG_NUMA
#ifdef CONFIG_NUMA_SCHED
It still doesn't cover the ram between the end of a node and the start
of the next node, but at least on alpha-wildfire there can be nothing
mapped there (it's reserved for "more dimm ram" slots) and it would be
even more costly to check if the address is in those intra-node holes.
The invalid pages now will start at phys addr 64G*8 that is the maximum
ram that linux can handle the the wildfire. if you mmap the intra-node
ram via /dev/mem you risk for troubles anyways because there's no dimm
there and probably the effect is undefined or unpredictable, it's just a
"mustn't do that", /dev/mem is a "root" thing so the above approch looks
fine to me.
Andrea
next prev parent reply other threads:[~2002-04-26 22:46 UTC|newest]
Thread overview: 152+ messages / expand[flat|nested] mbox.gz Atom feed top
2002-04-26 18:27 Bug: Discontigmem virt_to_page() [Alpha,ARM,Mips64?] Russell King
2002-04-26 22:46 ` Andrea Arcangeli [this message]
2002-04-29 17:50 ` Martin J. Bligh
2002-04-29 22:00 ` Roman Zippel
2002-04-30 0:43 ` Andrea Arcangeli
2002-04-27 22:10 ` Daniel Phillips
2002-04-29 13:35 ` Andrea Arcangeli
2002-04-29 23:02 ` Daniel Phillips
2002-05-01 2:23 ` Andrea Arcangeli
2002-04-30 23:12 ` Daniel Phillips
2002-05-01 1:05 ` Daniel Phillips
2002-05-02 0:47 ` Andrea Arcangeli
2002-05-01 1:26 ` Daniel Phillips
2002-05-02 1:43 ` Andrea Arcangeli
2002-05-01 2:41 ` Daniel Phillips
2002-05-02 13:34 ` Andrea Arcangeli
2002-05-02 15:18 ` Martin J. Bligh
2002-05-02 15:35 ` Andrea Arcangeli
2002-05-01 15:42 ` Daniel Phillips
2002-05-02 16:06 ` Andrea Arcangeli
2002-05-02 16:10 ` Martin J. Bligh
2002-05-02 16:40 ` Andrea Arcangeli
2002-05-02 17:16 ` William Lee Irwin III
2002-05-02 18:41 ` Andrea Arcangeli
2002-05-02 19:19 ` William Lee Irwin III
2002-05-02 19:27 ` Daniel Phillips
2002-05-02 19:38 ` William Lee Irwin III
2002-05-02 19:58 ` Daniel Phillips
2002-05-03 6:28 ` Andrea Arcangeli
2002-05-03 6:10 ` Andrea Arcangeli
2002-05-02 22:20 ` Martin J. Bligh
2002-05-02 21:28 ` William Lee Irwin III
2002-05-02 21:52 ` Kurt Ferreira
2002-05-02 21:55 ` William Lee Irwin III
2002-05-03 6:38 ` Andrea Arcangeli
2002-05-03 6:58 ` Martin J. Bligh
2002-05-03 6:04 ` Andrea Arcangeli
2002-05-03 6:33 ` Martin J. Bligh
2002-05-03 8:38 ` Andrea Arcangeli
2002-05-03 9:26 ` William Lee Irwin III
2002-05-03 15:38 ` Martin J. Bligh
2002-05-03 15:17 ` Virtual address space exhaustion (was Discontigmem virt_to_page() ) Martin J. Bligh
2002-05-03 15:58 ` Andrea Arcangeli
2002-05-03 16:10 ` Martin J. Bligh
2002-05-03 16:25 ` Andrea Arcangeli
2002-05-03 16:02 ` Daniel Phillips
2002-05-03 16:20 ` Andrea Arcangeli
2002-05-03 16:41 ` Daniel Phillips
2002-05-03 16:58 ` Andrea Arcangeli
2002-05-03 18:08 ` Daniel Phillips
2002-05-03 9:24 ` Bug: Discontigmem virt_to_page() [Alpha,ARM,Mips64?] William Lee Irwin III
2002-05-03 10:30 ` Andrea Arcangeli
2002-05-03 11:09 ` William Lee Irwin III
2002-05-03 11:27 ` Andrea Arcangeli
2002-05-03 15:42 ` Martin J. Bligh
2002-05-03 15:32 ` Martin J. Bligh
2002-05-02 19:22 ` Daniel Phillips
2002-05-03 6:06 ` Andrea Arcangeli
2002-05-02 18:25 ` Daniel Phillips
2002-05-02 18:44 ` Andrea Arcangeli
2002-05-02 19:31 ` Martin J. Bligh
2002-05-02 18:57 ` Andrea Arcangeli
2002-05-02 19:08 ` Daniel Phillips
2002-05-03 5:15 ` Andrea Arcangeli
2002-05-05 23:54 ` Daniel Phillips
2002-05-06 0:28 ` Andrea Arcangeli
2002-05-06 0:34 ` Daniel Phillips
2002-05-06 1:01 ` Andrea Arcangeli
2002-05-06 0:55 ` Russell King
2002-05-06 1:07 ` Daniel Phillips
2002-05-06 1:20 ` Andrea Arcangeli
2002-05-06 1:24 ` Daniel Phillips
2002-05-06 1:42 ` Andrea Arcangeli
2002-05-06 1:48 ` Daniel Phillips
2002-05-06 2:06 ` Andrea Arcangeli
2002-05-06 17:40 ` Daniel Phillips
2002-05-06 19:09 ` Martin J. Bligh
2002-05-06 1:09 ` Andrea Arcangeli
2002-05-06 1:13 ` Daniel Phillips
2002-05-06 2:03 ` Daniel Phillips
2002-05-06 2:31 ` Andrea Arcangeli
2002-05-06 8:57 ` Russell King
2002-05-06 8:54 ` Roman Zippel
2002-05-06 15:26 ` Daniel Phillips
2002-05-06 19:07 ` Roman Zippel
2002-05-08 15:57 ` Daniel Phillips
2002-05-08 23:11 ` Roman Zippel
2002-05-09 16:08 ` Daniel Phillips
2002-05-09 22:06 ` Roman Zippel
2002-05-09 22:22 ` Daniel Phillips
2002-05-09 23:00 ` Roman Zippel
2002-05-09 23:22 ` Daniel Phillips
2002-05-10 0:13 ` Roman Zippel
2002-05-02 22:39 ` Martin J. Bligh
2002-05-03 7:04 ` Andrea Arcangeli
2002-05-02 23:42 ` Daniel Phillips
2002-05-03 7:45 ` Andrea Arcangeli
2002-05-02 16:07 ` Martin J. Bligh
2002-05-02 16:58 ` Gerrit Huizenga
2002-05-02 18:10 ` Andrea Arcangeli
2002-05-02 19:28 ` Gerrit Huizenga
2002-05-02 22:23 ` Martin J. Bligh
2002-05-03 6:20 ` Andrea Arcangeli
2002-05-03 6:39 ` Martin J. Bligh
2002-05-02 16:00 ` William Lee Irwin III
2002-05-02 2:37 ` William Lee Irwin III
2002-05-02 15:59 ` Andrea Arcangeli
2002-05-02 16:06 ` William Lee Irwin III
2002-05-01 18:05 ` Jesse Barnes
2002-05-01 23:17 ` Andrea Arcangeli
2002-05-01 23:23 ` discontiguous memory platforms Jesse Barnes
2002-05-02 0:51 ` Ralf Baechle
2002-05-02 1:27 ` Andrea Arcangeli
2002-05-02 1:32 ` Ralf Baechle
2002-05-02 8:50 ` Roman Zippel
2002-05-01 13:21 ` Daniel Phillips
2002-05-02 14:00 ` Roman Zippel
2002-05-01 14:08 ` Daniel Phillips
2002-05-02 17:56 ` Roman Zippel
2002-05-01 17:59 ` Daniel Phillips
2002-05-02 18:26 ` Roman Zippel
2002-05-02 18:32 ` Daniel Phillips
2002-05-02 19:40 ` Roman Zippel
2002-05-02 20:14 ` Daniel Phillips
2002-05-03 6:34 ` Andrea Arcangeli
2002-05-03 9:33 ` Roman Zippel
2002-05-03 6:30 ` Andrea Arcangeli
2002-05-02 18:35 ` Geert Uytterhoeven
2002-05-02 18:39 ` Daniel Phillips
2002-05-02 0:20 ` Bug: Discontigmem virt_to_page() [Alpha,ARM,Mips64?] Anton Blanchard
2002-05-01 1:35 ` Daniel Phillips
2002-05-02 1:45 ` William Lee Irwin III
2002-05-01 2:02 ` Daniel Phillips
2002-05-02 2:33 ` William Lee Irwin III
2002-05-01 2:44 ` Daniel Phillips
2002-05-02 1:46 ` Andrea Arcangeli
2002-05-01 1:56 ` Daniel Phillips
2002-05-02 1:01 ` Andrea Arcangeli
2002-05-02 15:28 ` Anton Blanchard
2002-05-01 16:10 ` Daniel Phillips
2002-05-02 15:59 ` Dave Engebretsen
2002-05-01 17:24 ` Daniel Phillips
2002-05-02 16:44 ` Dave Engebretsen
2002-05-02 16:31 ` William Lee Irwin III
2002-05-02 16:21 ` Dave Engebretsen
2002-05-02 17:28 ` William Lee Irwin III
2002-05-02 23:05 ` Daniel Phillips
2002-05-03 0:05 ` William Lee Irwin III
2002-05-03 1:19 ` Daniel Phillips
2002-05-03 19:47 ` Dave Engebretsen
2002-05-03 22:06 ` Daniel Phillips
2002-05-03 23:52 ` David Mosberger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20020427004641.L19278@dualathlon.random \
--to=andrea@suse.de \
--cc=linux-kernel@vger.kernel.org \
--cc=rmk@arm.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox