From: Zoltan Menyhart <Zoltan.Menyhart@bull.net>
To: linux-ia64@vger.kernel.org
Subject: Re: Read *pgd again in vhpt_miss handler
Date: Fri, 28 Apr 2006 07:53:19 +0000
Message-ID: <4451C9EF.9060807@bull.net>
In-Reply-To: <444F79CA.7060804@bull.net>
Christoph Lameter wrote:
> On Thu, 27 Apr 2006, Zoltan Menyhart wrote:
>
>
>>I wanted to use the mm semaphore => no need to walk again the
>>pgd ... pte chain.
>
>
> The pgd ... pte chain does not change even without mmap until
> the usage of the memory area ceases.
It is about un-mapping a zone while another thread faults
on an address belonging to the same zone.
We have got a
rx = ... -> pgd[i] -> pud[j] -> pmd[k] -> pte[l]
chain to walk in the VHPT miss handler.
Having reached some point in walking this chain, we have got
the physical address of the next page in the chain in a register.
Before we can fetch the next item in the chain, an "unpredictably
long" time can pass.
In the meantime:
- "free_pgtables()" kills the page we are about to touch.
- Someone re-uses the same page for something else.
As we are still keeping the same physical address, we fetch an item
from a page that is no longer ours.
Even if this security window is small, it does exist.
The probability of hitting this bug grows on a NUMA machine
with lots of CPUs.
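To make the window concrete, here is a rough user-space model of the
walk. It is only a sketch, not the real handler (which is IA-64
assembly): "table_t", "level_index()", "walk()" and the tiny 8-entry
tables are made-up names and sizes chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRIES    8    /* hypothetical tiny tables, illustration only */
#define IDX_BITS   3    /* log2(ENTRIES) */
#define PAGE_SHIFT 12   /* assumed page size for this model */

/* One page of the chain: pgd, pud, pmd and pte pages all look alike here. */
typedef struct table {
    void *slot[ENTRIES];
} table_t;

/* Index into the table at a given level (3 = pgd ... 0 = pte). */
static unsigned level_index(uint64_t addr, int level)
{
    return (unsigned)(addr >> (PAGE_SHIFT + IDX_BITS * level)) & (ENTRIES - 1);
}

/* Walk pgd -> pud -> pmd -> pte and return the leaf entry, or NULL. */
static void *walk(table_t *pgd, uint64_t addr)
{
    table_t *pud = pgd->slot[level_index(addr, 3)];
    if (pud == NULL)
        return NULL;
    /*
     * Window: "pud" is now just a bare address held in a register.
     * Nothing stops another CPU from freeing and re-using that page
     * before the next load.
     */
    table_t *pmd = pud->slot[level_index(addr, 2)];
    if (pmd == NULL)
        return NULL;
    /* Same window again for "pmd" ... */
    table_t *pte = pmd->slot[level_index(addr, 1)];
    if (pte == NULL)
        return NULL;
    /* ... and for "pte". */
    return pte->slot[level_index(addr, 0)];
}
```

Each of the three dereferences after the first one relies on a page
that the walker does not hold any reference on.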
I can accept that the VHPT miss handler cannot be protected by
some locks; it is the other end that should use some "careful
un-mapping" in order to avoid race conditions.
Here is what I'm working on:
PTE, PMD and PUD page usage perfectly fits into the RCU approach:
1. The VHPT miss handler is protected by "rcu_read_lock_bh()".
There is not a single instruction added, the required semantics
is provided by the fact that the interrupts are off.
2. "free_pgtables()" keeps working as today for the non multi-
threaded applications.
3. "free_pgtables()" and its subroutines do not actually free
the PTE, PMD and PUD pages for multi-threaded applications.
These pages will be set free via a "call_rcu_bh()"-activated
service.
(Perhaps, the weaker protection "rcu_read_lock()" - "call_rcu()"
will be enough...)
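The three points above could be modelled in user space roughly as
follows. This is a sketch under stated assumptions, not kernel code:
"defer_head", "defer_call()", "run_grace_period()", "reader_enter()" /
"reader_exit()" and "release_pt_page()" are hypothetical stand-ins for
the rcu_head, call_rcu_bh(), the end of a grace period, the
interrupts-off section of the handler, and the freeing path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for an RCU callback head (hypothetical name). */
struct defer_head {
    struct defer_head *next;
    void (*func)(struct defer_head *);
};

/* A PTE/PMD/PUD page in this model. */
struct pt_page {
    struct defer_head defer;   /* first member: a cast recovers the page */
    void *slot[8];
};

static struct defer_head *pending;  /* pages waiting for a grace period */
static int readers_active;          /* models "handler running, interrupts off" */
static int pages_freed;             /* counter, only to observe the behaviour */

static void reader_enter(void) { readers_active++; }
static void reader_exit(void)  { assert(readers_active > 0); readers_active--; }

/* Queue a callback, in the spirit of call_rcu_bh(). */
static void defer_call(struct defer_head *h, void (*func)(struct defer_head *))
{
    h->func = func;
    h->next = pending;
    pending = h;
}

/* Run all queued callbacks as one batch, but only if no reader is active. */
static int run_grace_period(void)
{
    int n = 0;
    if (readers_active)
        return 0;
    while (pending != NULL) {
        struct defer_head *h = pending;
        pending = h->next;
        h->func(h);
        n++;
    }
    return n;
}

static void pt_page_free_cb(struct defer_head *h)
{
    free((struct pt_page *)h);  /* valid: defer is the first member */
    pages_freed++;
}

static struct pt_page *alloc_pt_page(void)
{
    return calloc(1, sizeof(struct pt_page));
}

/* Point 2: free at once if single-threaded. Point 3: defer otherwise. */
static void release_pt_page(struct pt_page *p, int multi_threaded)
{
    if (!multi_threaded)
        pt_page_free_cb(&p->defer);
    else
        defer_call(&p->defer, pt_page_free_cb);
}
```

A page released while a "reader" is active stays allocated until
run_grace_period() is called with no reader left, so a walker that
still holds its address never sees it re-used.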
Please note that:
- The life span of the PTE, PMD and PUD pages is rather long:
they are freed when the usage of the memory area ceases,
provided no other map (using the same PTE, PMD and PUD pages)
is valid.
- The number of the PTE, PMD and PUD pages is much smaller
than that of the leaf pages.
Therefore freeing them is not really performance critical.
As the "call_rcu_bh()"-activated freeing service will do batch
processing, there is a chance that freeing the PTE, PMD and PUD
pages in this way will be more efficient than the "pte_free()"... etc.
services of today are.
Regards,
Zoltan