From: Zoltan Menyhart <Zoltan.Menyhart@bull.net>
To: linux-ia64@vger.kernel.org
Subject: Re: Read *pgd again in vhpt_miss handler
Date: Fri, 28 Apr 2006 07:53:19 +0000
Message-ID: <4451C9EF.9060807@bull.net>
In-Reply-To: <444F79CA.7060804@bull.net>
Christoph Lameter wrote:
> On Thu, 27 Apr 2006, Zoltan Menyhart wrote:
>
>
>>I wanted to use the mm semaphore => no need to walk again the
>>pgd ... pte chain.
>
>
> The pgd ... pte chain does not change even without mmap until
> the usage of the memory area ceases.
It is about un-mapping a zone while another thread faults
on an address belonging to the same zone.
We have got a
rx = ... -> pgd[i] -> pud[j] -> pmd[k] -> pte[l]
chain to walk in the VHPT miss handler.
At any point in this walk, we hold the physical address of
the next page in the chain in a register.
Before we can fetch the next item in the chain, an unpredictably
long time can pass.
In the meantime:
- "free_pgtables()" kills the page we are about to touch.
- Someone re-uses the same page for something else.
As we are still holding the same physical address, we fetch an item
from a page that is no longer ours.
Even if this security window is small, it does exist.
The probability of hitting this bug grows on a NUMA machine
with lots of CPUs.
I can accept that the VHPT miss handler cannot be protected by
locks; it is the other end that should use some "careful
un-mapping" in order to avoid race conditions.
Here is what I'm working on:
PTE, PMD and PUD page usage perfectly fits into the RCU approach:
1. The VHPT miss handler is protected by "rcu_read_lock_bh()".
Not a single instruction is added: the required semantics
are provided by the fact that interrupts are off.
2. "free_pgtables()" keeps working as it does today for applications
that are not multi-threaded.
3. "free_pgtables()" and its subroutines do not actually free
the PTE, PMD and PUD pages for multi-threaded applications.
These pages will be set free via a "call_rcu_bh()"-activated
service.
(Perhaps, the weaker protection "rcu_read_lock()" - "call_rcu()"
will be enough...)
Please note that:
- The life span of the PTE, PMD and PUD pages is rather long:
they are freed when the usage of the memory area ceases,
provided no other map (using the same PTE, PMD and PUD pages)
is valid.
- The number of PTE, PMD and PUD pages is much smaller
than that of the leaf pages.
Therefore freeing them is not really performance critical.
As the "call_rcu_bh()"-activated freeing service will do batch
processing, there is a chance that freeing the PTE, PMD and PUD
pages in this way will be more efficient than today's
"pte_free()"... etc. services.
Regards,
Zoltan