From: Boaz Harrosh <boaz@plexistor.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>,
Andrea Arcangeli <aarcange@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] mm: avoid setting up anonymous pages into file mapping
Date: Sun, 05 Jul 2015 18:15:20 +0300
Message-ID: <55994A08.3030308@plexistor.com>
In-Reply-To: <1435932447-84377-1-git-send-email-kirill.shutemov@linux.intel.com>
On 07/03/2015 05:07 PM, Kirill A. Shutemov wrote:
> Reading page fault handler code I've noticed that under right
> circumstances kernel would map anonymous pages into file mappings:
> if the VMA doesn't have vm_ops->fault() and the VMA wasn't fully
> populated on ->mmap(), kernel would handle page fault to not populated
> pte with do_anonymous_page().
>
> There's chance that it was done intentionally, but I don't see good
> justification for this. We just hide bugs in broken drivers.
>
Have you done a preliminary audit for these broken drivers? If any actually
exist in-tree, then this patch is a regression for them.
We need to look for vm_ops definitions that do not set .fault. Perhaps define
a map_anonymous() helper for those, so they can revert to the old behavior,
if any actually exist.
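
For illustration only (not part of the patch and not kernel code): a minimal
user-space sketch of the audit suggested above. The struct, the driver names
and count_missing_fault() are all hypothetical stand-ins; a real audit would
have to inspect the in-tree struct vm_operations_struct definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vm_operations_struct; only the
 * member relevant to the audit is modelled. */
struct vm_ops_entry {
	const char *owner;	/* invented driver name, illustration only */
	int (*fault)(void);	/* NULL models a vm_ops without .fault */
};

static int stub_fault(void)
{
	return 0;
}

/* Invented sample data: one entry sets vm_ops but leaves .fault unset. */
const struct vm_ops_entry sample_ops[] = {
	{ "drv_a", stub_fault },
	{ "drv_b", NULL },
	{ "drv_c", stub_fault },
};

/* Count entries that provide vm_ops but no .fault handler. These are the
 * mappings whose fault handling would change with the patch. */
size_t count_missing_fault(const struct vm_ops_entry *ops, size_t n)
{
	size_t missing = 0;

	for (size_t i = 0; i < n; i++)
		if (!ops[i].fault)
			missing++;
	return missing;
}

/* Small self-check over the sample table. */
void audit_self_check(void)
{
	size_t n = sizeof(sample_ops) / sizeof(sample_ops[0]);

	assert(count_missing_fault(sample_ops, n) == 1);
}
```

Any entry counted here corresponds to a driver that would need either a
.fault handler or some explicit opt-in to keep its old behavior.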
> Let's change page fault handler to use do_anonymous_page() only on
> anonymous VMA (->vm_ops == NULL).
>
> For file mappings without vm_ops->fault() page fault on pte_none() entry
> would lead to SIGBUS.
>
Again, that could mean a theoretical regression for some in-tree driver.
Do you know of any such driver?
Thanks
Boaz
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> mm/memory.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 8a2fc9945b46..f3ee782059e3 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3115,6 +3115,9 @@ static int do_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  			- vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
>
>  	pte_unmap(page_table);
> +
> +	if (unlikely(!vma->vm_ops->fault))
> +		return VM_FAULT_SIGBUS;
>  	if (!(flags & FAULT_FLAG_WRITE))
>  		return do_read_fault(mm, vma, address, pmd, pgoff, flags,
>  				orig_pte);
> @@ -3260,13 +3263,13 @@ static int handle_pte_fault(struct mm_struct *mm,
>  	barrier();
>  	if (!pte_present(entry)) {
>  		if (pte_none(entry)) {
> -			if (vma->vm_ops) {
> -				if (likely(vma->vm_ops->fault))
> -					return do_fault(mm, vma, address, pte,
> -							pmd, flags, entry);
> +			if (!vma->vm_ops) {
> +				return do_anonymous_page(mm, vma, address, pte,
> +						pmd, flags);
> +			} else {
> +				return do_fault(mm, vma, address, pte, pmd,
> +						flags, entry);
>  			}
> -			return do_anonymous_page(mm, vma, address,
> -						 pte, pmd, flags);
>  		}
>  		return do_swap_page(mm, vma, address,
>  					pte, pmd, flags, entry);
>
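
For illustration only: the decision made for a pte_none() entry, before and
after the quoted patch, can be modelled in user space roughly as follows.
This is a sketch, not kernel code. The struct and enum names are
hypothetical, and only the presence of vm_ops and vm_ops->fault is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Which path a fault on an empty pte would take (model only). */
enum fault_path {
	PATH_ANON,	/* do_anonymous_page() */
	PATH_FAULT,	/* do_fault() calling ->fault() */
	PATH_SIGBUS,	/* do_fault() returning VM_FAULT_SIGBUS */
};

/* Hypothetical stand-ins for the kernel structures. */
struct vm_ops_model {
	int (*fault)(void);
};

struct vma_model {
	const struct vm_ops_model *vm_ops;
};

/* Before the patch: a VMA with vm_ops but no ->fault falls through
 * to the anonymous-page path. */
enum fault_path dispatch_old(const struct vma_model *vma)
{
	if (vma->vm_ops && vma->vm_ops->fault)
		return PATH_FAULT;
	return PATH_ANON;
}

/* After the patch: only VMAs without vm_ops are treated as anonymous;
 * a missing ->fault ends in SIGBUS. */
enum fault_path dispatch_new(const struct vma_model *vma)
{
	if (!vma->vm_ops)
		return PATH_ANON;
	if (!vma->vm_ops->fault)
		return PATH_SIGBUS;
	return PATH_FAULT;
}

/* Self-check for the one case whose outcome differs. */
void dispatch_self_check(void)
{
	const struct vm_ops_model no_fault = { NULL };
	const struct vma_model vma = { &no_fault };

	assert(dispatch_old(&vma) == PATH_ANON);
	assert(dispatch_new(&vma) == PATH_SIGBUS);
}
```

The two functions agree for anonymous VMAs and for VMAs with a ->fault
handler; they differ only for vm_ops without ->fault, which is the case the
regression question above is about.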