From: Seth Jennings <sjenning@linux.vnet.ibm.com>
To: Michael Neuling <mikey@neuling.org>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>,
linuxppc-dev@lists.ozlabs.org, Nitin Gupta <ngupta@vflare.org>,
Dave Hansen <dave@linux.vnet.ibm.com>
Subject: Re: tlb flushing on Power
Date: Wed, 07 Mar 2012 15:18:56 -0600 [thread overview]
Message-ID: <4F57D0C0.7040108@linux.vnet.ibm.com> (raw)
In-Reply-To: <6006.1331098094@neuling.org>
On 03/06/2012 11:28 PM, Michael Neuling wrote:
> Seth,
>
>> Thanks for the help! I was wondering if you could take a look at something
>> for me.
>>
>> I've been working on this staging driver (zsmalloc memory allocator)
>> that does virtual mapping of two pages.
>>
>> I have a github repo with the driver and the unsubmitted changes. I'm
>> trying to make to get the pte/tlb stuff working in a portable way:
>>
>> git://github.com/spartacus06/linux.git (portable branch)
>>
>> The experimental commits are the top 5 and the branch is based on
>> Greg's staging-next + frontswap-v11 patches.
>>
>> Could you take a look at the zs_map_object() and zs_unmap_object()
>> in drivers/staging/zsmalloc/zsmalloc-main.c and see if they should
>> work for PPC64?
>
> I suggest you post the code directly to the list in reviewable chucks.
> People are much more likely to review it if they don't have to download
> some random git tree, checkout some branch, find the changes, etc etc.
It's hard to summarize out of context, but I'll try.
So zsmalloc is designed to store compressed memory pages. In order
to do that efficiently in memory restricted environments where
memory allocations from the buddy allocator greater than order 0 are
very likely to fail, we have to be able to store compressed memory pages
across non-contiguous page boundaries.
zsmalloc uses virtual memory mapping to do this. There is a per-cpu
struct vm_struct *vm. This vm is initialized like this:
===
pte_t *vm_ptes[2];
vm = alloc_vm_area(2 * PAGE_SIZE, vm_ptes);
===
When the allocation needs to be accessed, we map the memory like this
(with preemption disabled):
set_pte_at(&init_mm, 0, vm_ptes[0], mk_pte(page1, PAGE_KERNEL));
set_pte_at(&init_mm, 0, vm_ptes[1], mk_pte(page2, PAGE_KERNEL));
Preemption remains disabled while the user manipulates the allocation.
When the user is done, we unmap by doing:
====
ptep_get_and_clear(&init_mm, vm->addr, vm_ptes[0]);
ptep_get_and_clear(&init_mm, vm->addr + PAGE_SIZE, vm_ptes[1]);
local_flush_tlb_kernel_page(vm->addr);
local_flush_tlb_kernel_page(vm->addr + PAGE_SIZE);
====
In my patch, I've defined local_flush_tlb_kernel_page as:
#define local_flush_tlb_kernel_page(kaddr) local_flush_tlb_page(NULL, kaddr)
in arch/powerpc/include/asm/tlbflush.h
For PPC64 (the platform I'm testing on) local_flush_tlb_page() is a no-op
because of the hashing tlb in that architecture; however, it isn't a no-op
on other archs and the whole point of this change is portability.
My understanding is that the tlb flush on PPC64 happens as a result of
the pte_update() eventually called from ptep_get_and_clear().
The problem is that this doesn't always seem to work. I get corruption
in the mapped memory from time to time.
I've isolated the issue to this mapping code.
Does anyone see anything glaringly wrong with this approach?
Should ptep_get_and_clear() also flush the tlb entry (asynchronously I think)?
For comparison, this code is stable on x86 where local_flush_tlb_kernel_page()
is defined to __flush_tlb_one().
> If it's work in progress, mark it as an RFC patch and note what issues
> you think still exist.
>
> Mikey
prev parent reply other threads:[~2012-03-07 21:19 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <4F2160B3.60708@linux.vnet.ibm.com>
2012-01-26 14:41 ` tlb flushing on Power Brian King
2012-01-26 21:39 ` Benjamin Herrenschmidt
2012-01-26 22:30 ` Dave Hansen
2012-01-27 2:40 ` Benjamin Herrenschmidt
2012-02-08 17:39 ` Seth Jennings
2012-02-08 21:04 ` Benjamin Herrenschmidt
2012-02-10 19:14 ` Seth Jennings
2012-02-16 17:11 ` Seth Jennings
2012-02-16 20:31 ` Benjamin Herrenschmidt
2012-03-05 17:56 ` Seth Jennings
2012-03-07 5:28 ` Michael Neuling
2012-03-07 21:18 ` Seth Jennings [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F57D0C0.7040108@linux.vnet.ibm.com \
--to=sjenning@linux.vnet.ibm.com \
--cc=dave@linux.vnet.ibm.com \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mikey@neuling.org \
--cc=ngupta@vflare.org \
--cc=rcj@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).