From: Petr Vandrovec <vandrove@vc.cvut.cz>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Nick's core remove PageReserved broke vmware...
Date: Wed, 02 Nov 2005 02:17:14 +0100
Message-ID: <4368139A.30701@vc.cvut.cz>
In-Reply-To: <4368097A.1080601@yahoo.com.au>
Nick Piggin wrote:
> Petr Vandrovec wrote:
>
>> Hello Nick,
>> what's the reason behind disallowing get_user_pages() on VM_RESERVED
>> regions? vmmon uses VM_RESERVED on its 'vma' as otherwise some
>> kernels used by SUSE complained loudly about mismatch between
>> PageReserved() and VM_RESERVED flags.
>>
>
> The reason is that VM_RESERVED indicates that the core vm is not allowed
> to touch any 'struct page' through this mapping, which get_user_pages
> would do.
But get_user_pages() was not invoked by 'the core vm'. I invoked it from my
module, the same one that populated this VMA in the first place.
>> I'll remove it from vmmon for >= 2.6.14 kernels as that bogus test
>> never made to Linux kernel, but I cannot find any reason why
>> get_user_pages() should not work on VM_RESERVED (or VM_IO for that
>> matter) user pages. Can you show me reasoning behind that decision ?
>
> The reasoning behind the decision was so VM_RESERVED is usable for a
> complete replacement to PageReserved. For example mappings through
> /dev/mem should not touch the page count.
>
> You may be able to go a step further and clear PageReserved from your
> pages as well, and thus have a working driver without special casing
> for both kernels.
Nope. We do not set PageReserved() on our pages, since we want them
refcounted. But old SuSE kernels contained the code below, which was rather
unhappy if a page did not have ->mapping set. So we just marked the vma
VM_RESERVED: it did not hurt, and all pages in this vma have refcount > 1
anyway, so there is no point in trying to clean up these page tables. Now rmap
catches this by page_count() != page_mapcount(), so VM_RESERVED is not needed
anymore, but there did not seem to be any reason to remove it.

        pageable = !PageReserved(new_page);             << pageable = 1
        as = !!new_page->mapping;                       << as = 0
        BUG_ON(!pageable && as);
        pageable &= as;                                 << pageable = 0
        ...
        /*
         * This is the entry point for memory under VM_RESERVED vmas.
         * That memory will not be tracked by the vm. These aren't
         * real anonymous pages, they're "device" reserved pages instead.
         */
        reserved = !!(vma->vm_flags & VM_RESERVED);     << reserved = 0
        if (unlikely(reserved == pageable))             << fires...
                printk("Badness in %s at %s:%d\n",
                       __FUNCTION__, __FILE__, __LINE__);

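To make the annotated values easier to follow, here is a small userspace model of that check. It is only a sketch: badness_fires() and its three boolean parameters are names invented for illustration, standing in for PageReserved(new_page), new_page->mapping != NULL and vma->vm_flags & VM_RESERVED. It is not the SuSE code itself.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model of the quoted check (hypothetical helper, not kernel code).
 * Returns true when the "Badness" printk would fire.
 */
static bool badness_fires(bool page_reserved, bool has_mapping, bool vma_reserved)
{
	bool pageable = !page_reserved;
	bool as = has_mapping;

	/* Stands in for BUG_ON(!pageable && as). */
	assert(!(!pageable && as));

	/* Same effect as "pageable &= as" on 0/1 values. */
	pageable = pageable && as;

	/* The warning fires when the vma flag and the page state agree. */
	return vma_reserved == pageable;
}
```

For a refcounted vmmon page (not reserved, no ->mapping), badness_fires(false, false, false) is true and badness_fires(false, false, true) is false, which is why marking the vma VM_RESERVED silenced the message.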
So I've made the change below. The test could probably be narrowed to
2.6.4 <= x <= 2.6.5 to cover just the buggy kernels, but I'll probably leave it
this way unless there is some good reason not to set VM_RESERVED on these
older kernels.
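For comparison, the two version policies can be written side by side. This is a hedged userspace sketch: the helper names are hypothetical, and KERNEL_VERSION() is defined locally with the encoding I believe <linux/version.h> used at the time, so the snippet compiles outside the kernel tree.

```c
#include <assert.h>	/* static_assert */
#include <stdbool.h>

/* Local stand-in for the kernel macro (assumed encoding). */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#endif

/* Versions compare as plain integers under this encoding. */
static_assert(KERNEL_VERSION(2, 6, 14) == 0x02060e, "unexpected encoding");

/* Policy in the patch: set VM_RESERVED on everything older than 2.6.14. */
static bool set_reserved_patch(unsigned int code)
{
	return code < KERNEL_VERSION(2, 6, 14);
}

/* Narrower alternative: set VM_RESERVED only on 2.6.4 through 2.6.5. */
static bool set_reserved_narrow(unsigned int code)
{
	return code >= KERNEL_VERSION(2, 6, 4) &&
	       code <= KERNEL_VERSION(2, 6, 5);
}
```

The two helpers agree on 2.6.4, 2.6.5 and on anything from 2.6.14 up; they differ only on the other pre-2.6.14 kernels, where the patch keeps setting the flag.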
Thanks for the explanation.
Petr
--- vmmon-only/linux/driver.c.orig 2005-11-02 02:00:46.000000000 +0100
+++ vmmon-only/linux/driver.c 2005-11-01 20:12:13.000000000 +0100
@@ -1283,9 +1283,13 @@
    /*
     * It seems that SuSE's 2.6.4-52 needs this. Hopefully
     * it will not break anything else.
+    *
+    * It breaks on post 2.6.14 kernels, so get rid of it on them.
     */
 #ifdef VM_RESERVED
+# if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 14)
    vma->vm_flags |= VM_RESERVED;
+# endif
 #endif
    return 0;
 }