From: David Daney <ddaney@caviumnetworks.com>
To: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Cc: David Daney <ddaney.cavm@gmail.com>, <linux-mips@linux-mips.org>,
	<ralf@linux-mips.org>, David Daney <david.daney@cavium.com>,
	<stable@vger.kernel.org>
Subject: Re: MIPS: Make set_pte() SMP safe.
Date: Tue, 4 Aug 2015 13:48:11 -0700
Message-ID: <55C1250B.2090508@caviumnetworks.com>
In-Reply-To: <55C1214F.8050208@imgtec.com>

On 08/04/2015 01:32 PM, Leonid Yegoshin wrote:
> David,
>
> It is interesting, I still don't understand the effect

I think the best way to think about it is to ignore vmap, and consider 
the semantics of set_pte().

When a thread calls set_pte() it must ensure that no other thread will 
crash using the VA region covered by the PTE.  That is the contract of 
set_pte().

The MIPS set_pte() does something different.  In addition to setting the 
specified PTE, it has the side effect of clobbering another PTE (called 
the buddy).  There is nothing in the kernel that prevents another 
thread from using the buddy-PTE, and when that happens in the race 
window, the page tables are corrupted, and the system crashes.

The fix is to not clobber the buddy-PTE.
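
To make the race window concrete, here is a minimal userspace sketch, not the kernel code.  It assumes the buddy update is a "read the neighbour, see it empty, write it back with the global bit" sequence, and it replays one bad interleaving deterministically on a single thread.  The bit values and the names replay_race(), buddy_index() and pte_is_none() are hypothetical, and the compare-and-exchange variant is only one possible way to avoid clobbering the buddy, shown here as an assumption rather than as the actual patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for this sketch only; not the real MIPS values. */
#define PAGE_GLOBAL  UINT64_C(0x1)
#define PAGE_PRESENT UINT64_C(0x2)

typedef _Atomic uint64_t pte_t;

/* Even and odd PTEs are adjacent, so the buddy is found by flipping bit 0. */
static size_t buddy_index(size_t idx) { return idx ^ 1; }

/* A PTE counts as "none" when nothing except the global bit is set. */
static bool pte_is_none(uint64_t v) { return (v & ~PAGE_GLOBAL) == 0; }

/*
 * Replays one interleaving:
 *   thread A: sets pair[0] to a global mapping, then touches the buddy
 *   thread B: installs new_buddy in pair[1] inside A's race window
 * Returns the final value of the buddy PTE.
 */
static uint64_t replay_race(bool use_cmpxchg, uint64_t new_buddy)
{
    pte_t pair[2];
    atomic_init(&pair[0], 0);
    atomic_init(&pair[1], 0);

    /* Thread A installs its own global PTE. */
    atomic_store(&pair[0], PAGE_PRESENT | PAGE_GLOBAL);
    pte_t *buddy = &pair[buddy_index(0)];

    /* Thread A samples the buddy and sees it empty. */
    uint64_t seen = atomic_load(buddy);
    assert(pte_is_none(seen));

    /* Race window: thread B installs a valid mapping in the buddy slot. */
    atomic_store(buddy, new_buddy);

    if (use_cmpxchg) {
        /* Write only if the buddy still holds what A sampled; here it does not. */
        atomic_compare_exchange_strong(buddy, &seen, seen | PAGE_GLOBAL);
    } else {
        /* Plain write-back of the stale sample: thread B's PTE is lost. */
        atomic_store(buddy, seen | PAGE_GLOBAL);
    }
    return atomic_load(buddy);
}
```

With the plain write-back, thread B's mapping is replaced by a bare global bit, which is the kind of corruption described above.  With the compare-and-exchange, thread A's update is simply dropped and thread B's PTE survives.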

You can go around in circles all you want trying to indirectly avoid 
using the buddy-PTE from another thread, but I think it is best to make 
set_pte() have easily understood semantics (and semantics that match 
those of other architectures) and not clobber things in unexpected ways.

David Daney.


> - if guard page
> is used then two different VMAP allocations can't use two buddy PTEs.
>
> Yes, only one of buddy PTEs in that case can be allocated and attached
> to VMA but caller doesn't know about additional page and two cases are
> possible. Even map_vm_area has no any info about guard page.
>
> (assume VMA1 has low address range and VMA2 has higher address range):
>
> a.  VMA1 (after adjustment) ends at even PTE ==> caller doesn't use that
> PTE and there is no collision with last pair of buddy PTEs, even if VMA2
> uses odd PTE from that pair.
> b.  VMA1 (after adjustment) ends at odd PTE ==> again, this buddy pair
> is used only VMA1. Next VMA2 start from next pair.
>
> What is wrong here?
>
> Is it possible that access gone bad and touches a page beyond a
> requested size?
> Is it possible that it is not vmap() but some different interface was used?
>
> - Leonid.
>
