* RE: [PATCH] powerpc/85xx: dts - add ranges property for SEC
From: Liu Po-B43644 @ 2013-02-20 1:37 UTC
To: Gala Kumar-B11780; +Cc: <linuxppc-dev@ozlabs.org>
In-Reply-To: <CF1EE7AED478CD48A05574C8E2DA142D72B420@039-SN1MPN1-005.039d.mgd.msft.net>
Thanks again for the fix. Just ignore this post.
Best regards,
Liu Po
- 8038
-----Original Message-----
From: Gala Kumar-B11780
Sent: Wednesday, February 20, 2013 12:01 AM
To: Liu Po-B43644
Cc: <linuxppc-dev@ozlabs.org>
Subject: Re: [PATCH] powerpc/85xx: dts - add ranges property for SEC
On Feb 18, 2013, at 6:29 PM, Po Liu wrote:
> This facilitates getting the physical address of the SEC node.
>
> Signed-off-by: Liu po <po.liu@freescale.com>
> ---
> arch/powerpc/boot/dts/fsl/pq3-sec4.4-0.dtsi | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
Why are you reposting this, I already applied it:
http://git.kernel.org/?p=linux/kernel/git/galak/powerpc.git;a=commit;h=db29cd3c4497e7edf9176284ba7cf3cec1814c7a
- k
* RE: [PATCH][UPSTEAM] powerpc/mpic: add irq_set_wake support
From: Wang Dongsheng-B40534 @ 2013-02-20 5:57 UTC
To: Kumar Gala; +Cc: linuxppc-dev@lists.ozlabs.org
In-Reply-To: <3887203D-64E3-4BA5-AB2A-20BE668560A4@kernel.crashing.org>
> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Tuesday, February 19, 2013 3:43 AM
> To: Wang Dongsheng-B40534
> Cc: linuxppc-dev@lists.ozlabs.org
> Subject: Re: [PATCH][UPSTEAM] powerpc/mpic: add irq_set_wake support
>
>
> On Jan 30, 2013, at 9:10 PM, Wang Dongsheng wrote:
>
> > Add irq_set_wake support. Just add IRQF_NO_SUSPEND to
> > desc->action->flags, so the wake-up interrupt will not be disabled in
> > suspend_device_irqs().
> >
> > Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
> > ---
> > arch/powerpc/sysdev/mpic.c | 15 +++++++++++++++
> > 1 files changed, 15 insertions(+), 0 deletions(-)
>
> Why are we doing this globally for all interrupts? Don't we only have
> some specific interrupts that wake us up?
> Also, I'm guessing the wake behavior for interrupts is FSL specific so
> should not apply to ALL users of MPIC.
That is IRQ wakeup (PM) control. Not all interrupts will be marked as wakeup
sources. We just give the MPIC this ability; it is controlled by the driver.
If a device is able to wake up the system, its driver can enable IRQ wakeup
(through enable_irq_wake()/disable_irq_wake()), and the driver does not need
to pass the IRQF_NO_SUSPEND flag to request_irq().
For example:
suspend()
{
    ...;
    enable_irq_wake(irq);
    ...;
}
resume()
{
    ...;
    disable_irq_wake(irq);
    ...;
}
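The set/clear logic of the quoted mpic_irq_set_wake() can be tried outside the kernel with a small user-space model. This is only a sketch under stated assumptions: struct fake_irqaction, FAKE_IRQF_NO_SUSPEND and its bit value are stand-ins invented for illustration, not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for IRQF_NO_SUSPEND; the bit value is arbitrary. */
#define FAKE_IRQF_NO_SUSPEND 0x4000u

/* Hypothetical stand-in for the 'flags' field of struct irqaction. */
struct fake_irqaction {
    unsigned int flags;
};

/* Mirrors the patch: set the flag when 'on' is non-zero, clear it otherwise. */
static int fake_irq_set_wake(struct fake_irqaction *action, unsigned int on)
{
    if (on)
        action->flags |= FAKE_IRQF_NO_SUSPEND;
    else
        action->flags &= ~FAKE_IRQF_NO_SUSPEND;
    return 0;
}

/* True if, in this model, the interrupt would stay enabled across suspend. */
static bool fake_stays_enabled_in_suspend(const struct fake_irqaction *action)
{
    return (action->flags & FAKE_IRQF_NO_SUSPEND) != 0;
}
```

Only the one bit is touched, so any other flags the driver passed at request time survive an enable/disable pair unchanged.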
>
> - k
>
> >
> > diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
> > index 9c6e535..2ed0220 100644
> > --- a/arch/powerpc/sysdev/mpic.c
> > +++ b/arch/powerpc/sysdev/mpic.c
> > @@ -920,6 +920,18 @@ int mpic_set_irq_type(struct irq_data *d, unsigned int flow_type)
> > return IRQ_SET_MASK_OK_NOCOPY;
> > }
> >
> > +static int mpic_irq_set_wake(struct irq_data *d, unsigned int on)
> > +{
> > + struct irq_desc *desc = container_of(d, struct irq_desc, irq_data);
> > +
> > + if (on)
> > + desc->action->flags |= IRQF_NO_SUSPEND;
> > + else
> > + desc->action->flags &= ~IRQF_NO_SUSPEND;
> > +
> > + return 0;
> > +}
> > +
> > void mpic_set_vector(unsigned int virq, unsigned int vector)
> > {
> > struct mpic *mpic = mpic_from_irq(virq);
> > @@ -957,6 +969,7 @@ static struct irq_chip mpic_irq_chip = {
> > .irq_unmask = mpic_unmask_irq,
> > .irq_eoi = mpic_end_irq,
> > .irq_set_type = mpic_set_irq_type,
> > + .irq_set_wake = mpic_irq_set_wake,
> > };
> >
> > #ifdef CONFIG_SMP
> > @@ -971,6 +984,7 @@ static struct irq_chip mpic_tm_chip = {
> > .irq_mask = mpic_mask_tm,
> > .irq_unmask = mpic_unmask_tm,
> > .irq_eoi = mpic_end_irq,
> > + .irq_set_wake = mpic_irq_set_wake,
> > };
> >
> > #ifdef CONFIG_MPIC_U3_HT_IRQS
> > @@ -981,6 +995,7 @@ static struct irq_chip mpic_irq_ht_chip = {
> > .irq_unmask = mpic_unmask_ht_irq,
> > .irq_eoi = mpic_end_ht_irq,
> > .irq_set_type = mpic_set_irq_type,
> > + .irq_set_wake = mpic_irq_set_wake,
> > };
> > #endif /* CONFIG_MPIC_U3_HT_IRQS */
> >
> > --
> > 1.7.5.1
> >
> >
> > _______________________________________________
> > Linuxppc-dev mailing list
> > Linuxppc-dev@lists.ozlabs.org
> > https://lists.ozlabs.org/listinfo/linuxppc-dev
>
* Re: [PATCH][RFC] Replaced tlbilx with tlbwe in the initialization code
From: Diana Craciun @ 2013-02-20 9:22 UTC
To: Scott Wood; +Cc: Yoder Stuart-B08248, linuxppc-dev
In-Reply-To: <1361303278.29654.7@snotra>
On 02/19/2013 09:47 PM, Scott Wood wrote:
> On 02/15/2013 09:16:15 AM, Diana Craciun wrote:
>> On 02/15/2013 02:11 AM, Benjamin Herrenschmidt wrote:
>>> On Thu, 2013-02-14 at 14:56 +0200, Diana Craciun wrote:
>>>> From: Diana Craciun <Diana.Craciun@freescale.com>
>>>>
>>>> On Freescale e6500 cores EPCR[DGTMI] controls whether guest
>>>> supervisor
>>>> state can execute TLB management instructions. If EPCR[DGTMI]=0
>>>> tlbwe and tlbilx are allowed to execute normally in the guest state.
>>>>
>>>> A hypervisor may choose to virtualize TLB1 and for this purpose it
>>>> may use IPROT to protect the entries for being invalidated by the
>>>> guest. However, because tlbwe and tlbilx execution in the guest
>>>> state
>>>> are sharing the same bit, it is not possible to have a scenario
>>>> where
>>>> tlbwe is allowed to be executed in guest state and tlbilx traps.
>>>> When
>>>> guest TLB management instructions are allowed to be executed in
>>>> guest
>>>> state the guest cannot use tlbilx to invalidate TLB1 guest entries.
>>> Sorry, I don't understand the explanation... can you be more
>>> detailed ?
>> TLB1 supports huge page sizes. The guest may see the memory as
>> contiguous but it sees the guest physical memory as presented by the
>> hypervisor. In reality the real physical memory may be fragmented. In
>> this case the hypervisor can add more than one TLB1 entry for one
>> guest request and the hypervisor will keep track of all fragments.
>> When the guest performs a tlbilx, the hypervisor will correctly
>> invalidate all the corresponding fragments because both tlbwe and
>> tlbilx trap and has full control of tlb management instructions
>> targeting TLB1.
>>
>> For e6500 a single bit controls if tlbwe and tlbilx trap to the
>> Hypervisor. tlbwe targeting TLB1 always traps. But if we want to use
>> LRAT for TLB0, we have to configure tlbwe (targeting TLB 0) to go
>> directly to the guest. But in this case tlbilx (which is targeting
>> both TLBs) will never trap.
>>
>> If the tlbilx does not trap, the guest can invalidate only one of
>> (possible more) fragments and furthermore the synchronization between
>> what entries the hypervisor thinks there are in the TLB1 and what are
>> the actual entries is lost.
> This patch addresses boot-time invalidations only. How will you handle
> hugetlb invalidations (or indirect entry invalidations, once that
> becomes supported)?
>
> -Scott
I will not handle them in this patch. The patch makes it possible to run
Linux under a hypervisor without using hugetlb or indirect entries. This
only matters when TLB management instructions are configured to go
directly to the guest; otherwise it already works.
If indirect entries are supported, we will most likely configure tlbilx
and tlbwe to trap. In that case LRAT will still be used through the page
table walk mechanism.
Diana
* Re: [PATCH] powerpc/rtas_flash: Free kmem upon module exit
From: Vasant Hegde @ 2013-02-20 9:32 UTC
To: linuxppc-dev, benh
In-Reply-To: <20130208111742.31083.67858.stgit@hegdevasant.in.ibm.com>
Ben,
Let me know your thoughts on this patch.
-Vasant
On 02/08/2013 04:48 PM, Vasant Hegde wrote:
> Memory allocated to rtas_firmware_flash_list in rtas_flash_write
> is not freed during module exit. We hit below call trace if we
> unload rtas_flash module after loading new firmware image and
> before rebooting the system.
>
> Call trace:
> ----------
> Feb 6 08:42:10 eagle3 kernel: kmem_cache_destroy rtas_flash_cache: Slab cache still has objects
> Feb 6 08:42:10 eagle3 kernel: Call Trace:
> Feb 6 08:42:10 eagle3 kernel: [c00000001c303b40] [c000000000014940] .show_stack+0x70/0x1c0 (unreliable)
> Feb 6 08:42:10 eagle3 kernel: [c00000001c303bf0] [c000000000199bec] .kmem_cache_destroy+0x15c/0x170
> Feb 6 08:42:10 eagle3 kernel: [c00000001c303c90] [d000000006fa1208] .rtas_flash_cleanup+0x3c/0x80 [rtas_flash]
> Feb 6 08:42:10 eagle3 kernel: [c00000001c303d20] [c0000000000f8970] .SyS_delete_module+0x1d0/0x2e0
> Feb 6 08:42:10 eagle3 kernel: [c00000001c303e30] [c000000000009954] syscall_exit+0x0/0x94
>
> This patch frees rtas_firmware_flash_list during module exit.
>
> Signed-off-by: Vasant Hegde<hegdevasant@linux.vnet.ibm.com>
> ---
> arch/powerpc/kernel/rtas_flash.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/arch/powerpc/kernel/rtas_flash.c b/arch/powerpc/kernel/rtas_flash.c
> index 8329190..e22ef34 100644
> --- a/arch/powerpc/kernel/rtas_flash.c
> +++ b/arch/powerpc/kernel/rtas_flash.c
> @@ -790,6 +790,11 @@ static void __exit rtas_flash_cleanup(void)
> {
> rtas_flash_term_hook = NULL;
>
> + if (rtas_firmware_flash_list) {
> + free_flash_list(rtas_firmware_flash_list);
> + rtas_firmware_flash_list = NULL;
> + }
> +
> if (flash_block_cache)
> kmem_cache_destroy(flash_block_cache);
>
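The ordering that the quoted patch establishes (release every list node first, then tear down the cache) can be illustrated with a user-space model. This is a sketch, assuming malloc()/free() in place of the slab cache; every fake_* name is hypothetical and only mimics the shape of rtas_firmware_flash_list and rtas_flash_cleanup().

```c
#include <stdlib.h>

/* Hypothetical stand-in for one node of rtas_firmware_flash_list. */
struct fake_flash_block {
    struct fake_flash_block *next;
};

static int fake_live_blocks;               /* objects still allocated */
static struct fake_flash_block *fake_list; /* models the global list head */

/* Push one block onto the list; returns 0 on success, -1 on allocation failure. */
static int fake_add_block(void)
{
    struct fake_flash_block *b = malloc(sizeof(*b));

    if (!b)
        return -1;
    b->next = fake_list;
    fake_list = b;
    fake_live_blocks++;
    return 0;
}

/* Walk the list and free every node. */
static void fake_free_list(struct fake_flash_block *b)
{
    while (b) {
        struct fake_flash_block *next = b->next;

        free(b);
        fake_live_blocks--;
        b = next;
    }
}

/* Models the patched cleanup: returns 0 if no objects remain, -1 otherwise. */
static int fake_cleanup(void)
{
    if (fake_list) {
        fake_free_list(fake_list);
        fake_list = NULL;
    }
    return fake_live_blocks == 0 ? 0 : -1;
}
```

Without the list walk in fake_cleanup(), the final check would fail whenever blocks had been added, which corresponds to the "Slab cache still has objects" message in the trace.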
* RE: [PATCH 6/6 v8] iommu/fsl: Freescale PAMU driver and IOMMU API implementation.
From: Sethi Varun-B16395 @ 2013-02-20 9:41 UTC
To: Craciun Diana Madalina-STFD002
Cc: Wood Scott-B07421, joro@8bytes.org, linux-kernel@vger.kernel.org,
Yoder Stuart-B08248, iommu@lists.linux-foundation.org,
linuxppc-dev@lists.ozlabs.org
In-Reply-To: <5123A169.9060100@freescale.com>
> -----Original Message-----
> From: Craciun Diana Madalina-STFD002
> Sent: Tuesday, February 19, 2013 9:30 PM
> To: Sethi Varun-B16395
> Cc: iommu@lists.linux-foundation.org; linuxppc-dev@lists.ozlabs.org;
> linux-kernel@vger.kernel.org; Wood Scott-B07421; joro@8bytes.org; Yoder
> Stuart-B08248
> Subject: Re: [PATCH 6/6 v8] iommu/fsl: Freescale PAMU driver and IOMMU
> API implementation.
>
> On 02/18/2013 02:52 PM, Varun Sethi wrote:
> > +/**
> > + * pamu_get_ppaace() - Return the primary PACCE
> > + * @liodn: liodn PAACT index for desired PAACE
> > + *
> > + * Returns the ppace pointer upon success else return
> > + * null.
> > + */
> > +static struct paace *pamu_get_ppaace(int liodn) {
> > + if (!ppaact || liodn > PAACE_NUMBER_ENTRIES) {
>
> Shouldn't this be "liodn >= PAACE_NUMBER_ENTRIES"?
Yes, will fix this.
-Varun
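The off-by-one discussed above is the usual one for a table with N entries: valid indices run from 0 to N - 1, so the index N itself must be rejected. A minimal sketch, assuming a made-up table size and element type (FAKE_NUMBER_ENTRIES, fake_table and fake_get_entry are hypothetical, and the negative-index check is an addition that the quoted code does not have):

```c
#include <stddef.h>

/* Hypothetical table size; the real PAACE_NUMBER_ENTRIES comes from the driver. */
#define FAKE_NUMBER_ENTRIES 8

static int fake_table[FAKE_NUMBER_ENTRIES];

/*
 * Returns a pointer to entry 'index', or NULL if it is out of range.
 * Valid indices are 0 .. FAKE_NUMBER_ENTRIES - 1, so the upper test is '>='.
 */
static int *fake_get_entry(int index)
{
    if (index < 0 || index >= FAKE_NUMBER_ENTRIES)
        return NULL;
    return &fake_table[index];
}
```

With '>' instead of '>=', a call with index equal to FAKE_NUMBER_ENTRIES would return a pointer one element past the end of the table.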
* Re: [PATCH][RFC] Replaced tlbilx with tlbwe in the initialization code
From: Stuart Yoder @ 2013-02-20 14:22 UTC
To: Scott Wood; +Cc: Diana Craciun, linuxppc-dev
In-Reply-To: <1361303278.29654.7@snotra>
On Tue, Feb 19, 2013 at 1:47 PM, Scott Wood <scottwood@freescale.com> wrote:
>
> This patch addresses boot-time invalidations only. How will you handle
> hugetlb invalidations (or indirect entry invalidations, once that becomes
> supported)?
We do envision that "direct guest TLB management" is an opt-in option
that a guest can enable.
If LRAT is on, with TLB management directly handled by guests, the only
mechanism we have to do TLB1 invalidates is tlbwe. That is our only option
as far as I know. So, hugetlb and indirect entries will each need to be
addressed separately. The kernel code that handles these needs to either
A) be modified to unconditionally do all invalidates by tlbwe, or
B) conditionally use tlbwe depending on whether this is a guest that has
enabled direct TLB management.
Stuart
* Re: [PATCH][RFC] Replaced tlbilx with tlbwe in the initialization code
From: Diana Craciun @ 2013-02-20 14:31 UTC
To: Stuart Yoder; +Cc: Scott Wood, linuxppc-dev
In-Reply-To: <CALRxmdCpS6cJ6kJ=cg4Mre6YA572fxyk+596E1k1MX+iRL7s9Q@mail.gmail.com>
On 02/20/2013 04:22 PM, Stuart Yoder wrote:
> On Tue, Feb 19, 2013 at 1:47 PM, Scott Wood <scottwood@freescale.com> wrote:
>> This patch addresses boot-time invalidations only. How will you handle
>> hugetlb invalidations (or indirect entry invalidations, once that becomes
>> supported)?
> We do envision that "direct guest TLB management" is an opt-in option
> that a guest can enable.
>
> If LRAT is on, with TLB management directly handled by guests, the only
> mechanism we have to do TLB1 invalidates is tlbwe. That is our only option
> as far as I know. So, hugetlb and indirect entries will each need to be
> addressed separately. The kernel code that handles these needs to either
> A) be modified to unconditionally do all invalidates by tlbwe, or
> B) conditionally use tlbwe depending on whether this is a guest that has
> enabled direct TLB management.
>
> Stuart
>
In case of indirect entries I think we can configure tlbwe and tlbilx to
go to the hypervisor. The guest should not mix tlbwe (for TLB0) and
hardware page table walk, so we can support this scenario without
modifying the guest.
Diana
* Re: PS3: Strange issue with kexec and FreeBSD loader
From: Phileas Fogg @ 2013-02-20 20:43 UTC
To: Phileas Fogg; +Cc: linuxppc-dev
In-Reply-To: <5123D864.4060503@mail.ru>
Phileas Fogg wrote:
> Phileas Fogg wrote:
>> I could finally find the commit which broke FreeBSD booting in linux-stable.git
>> repository.
>> The Linux 3.4-rc1 seems to have this problem already.
>>
>> --------------
>> commit 5375871d432ae9fc581014ac117b96aaee3cd0c7
>> Merge: b57cb72 dfbc2d7
>> Author: Linus Torvalds <torvalds@linux-foundation.org>
>> Date: Wed Mar 21 18:55:10 2012 -0700
>>
>> Merge branch 'next' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
>>
>> Pull powerpc merge from Benjamin Herrenschmidt:
>> "Here's the powerpc batch for this merge window. It is going to be a
>> bit more nasty than usual as in touching things outside of
>> arch/powerpc mostly due to the big iSeriesectomy :-) We finally got
>> rid of the bugger (legacy iSeries support) which was a PITA to
>> maintain and that nobody really used anymore.
>>
>> Here are some of the highlights:
>>
>> - Legacy iSeries is gone. Thanks Stephen ! There's still some bits
>> and pieces remaining if you do a grep -ir series arch/powerpc but
>> they are harmless and will be removed in the next few weeks
>> hopefully.
>>
>> - The 'fadump' functionality (Firmware Assisted Dump) replaces the
>> previous (equivalent) "pHyp assisted dump"... it's a rewrite of a
>> mechanism to get the hypervisor to do crash dumps on pSeries, the
>> new implementation hopefully being much more reliable. Thanks
>> Mahesh Salgaonkar.
>>
>> - The "EEH" code (pSeries PCI error handling & recovery) got a big
>> spring cleaning, motivated by the need to be able to implement a
>> new backend for it on top of some new different type of firwmare.
>>
>> The work isn't complete yet, but a good chunk of the cleanups is
>> there. Note that this adds a field to struct device_node which is
>> not very nice and which Grant objects to. I will have a patch soon
>> that moves that to a powerpc private data structure (hopefully
>> before rc1) and we'll improve things further later on (hopefully
>> getting rid of the need for that pointer completely). Thanks Gavin
>> Shan.
>>
>> - I dug into our exception & interrupt handling code to improve the
>> way we do lazy interrupt handling (and make it work properly with
>> "edge" triggered interrupt sources), and while at it found & fixed
>> a wagon of issues in those areas, including adding support for page
>> fault retry & fatal signals on page faults.
>>
>> - Your usual random batch of small fixes & updates, including a bunch
>> of new embedded boards, both Freescale and APM based ones, etc..."
>>
>> I fixed up some conflicts with the generalized irq-domain changes from
>> Grant Likely, hopefully correctly.
>>
>> * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
>> (141 commits)
>> powerpc/ps3: Do not adjust the wrapper load address
>> powerpc: Remove the rest of the legacy iSeries include files
>> powerpc: Remove the remaining CONFIG_PPC_ISERIES pieces
>> init: Remove CONFIG_PPC_ISERIES
>> powerpc: Remove FW_FEATURE ISERIES from arch code
>> tty/hvc_vio: FW_FEATURE_ISERIES is no longer selectable
>> powerpc/spufs: Fix double unlocks
>> powerpc/5200: convert mpc5200 to use of_platform_populate()
>> powerpc/mpc5200: add options to mpc5200_defconfig
>> powerpc/mpc52xx: add a4m072 board support
>> powerpc/mpc5200: update mpc5200_defconfig to fit for charon board
>> Documentation/powerpc/mpc52xx.txt: Checkpatch cleanup
>> powerpc/44x: Add additional device support for APM821xx SoC and Bluestone
>> board
>> powerpc/44x: Add support PCI-E for APM821xx SoC and Bluestone board
>> MAINTAINERS: Update PowerPC 4xx tree
>> powerpc/44x: The bug fixed support for APM821xx SoC and Bluestone board
>> powerpc: document the FSL MPIC message register binding
>> powerpc: add support for MPIC message register API
>> powerpc/fsl: Added aliased MSIIR register address to MSI node in dts
>> powerpc/85xx: mpc8548cds - add 36-bit dts
>> ...
>>
>
> Reverting this commit fixes the problem with SHA256 checksum in the purgatory
> code too. I'm trying to find out which commit exactly caused the problem.
>
> regards
I found the single commit which breaks kexec for the FreeBSD loader and other
custom ELF kernels on the PS3 console.
From 7230c5644188cd9e3fb380cc97dde00c464a3ba7 Mon Sep 17 00:00:00 2001
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Tue, 6 Mar 2012 18:27:59 +1100
Subject: [PATCH] powerpc: Rework lazy-interrupt handling
regards
* Re: PS3: Strange issue with kexec and FreeBSD loader
From: Geoff Levand @ 2013-02-21 0:14 UTC
To: Phileas Fogg; +Cc: cbe-oss-dev, linuxppc-dev
In-Reply-To: <51201276.8020104@mail.ru>
Hi Phileas,
On Sun, 2013-02-17 at 00:12 +0100, Phileas Fogg wrote:
> I found new clues about the problem.
>
> Normally the device tree memory segment is allocated at the top of the boot
> memory region. The boot memory size on the PS3 console is 128MB.
>
> root@ps3-linux:~# kexec -l loader.ps3
> segment[0].mem:0x131d000 memsz:262144
> segment[1].mem:0x135d000 memsz:36864
> segment[2].mem:0x7fff000 memsz:4096
>
> And the device tree is located at address 0x7fff000, it's the last page of the
> boot memory.
>
> I changed the kexec-tools and made it store the device tree just after the
> purgatory code which is located at address 0x135d000. Like here:
>
> root@ps3-linux:~# kexec -l loader.ps3
> segment[0].mem:0x131d000 memsz:262144
> segment[1].mem:0x135d000 memsz:36864
> segment[2].mem:0x1366000 memsz:4096 <---- new address of device tree segment
>
> And now the sha256 verification is always successful for the FreeBSD loader too.
> But still no idea what actually corrupts the device tree segment when it's
> located at the top of the boot memory region. And why it happens on Linux 3.7
> and Linux 3.8 but not on Linux 3.3.8.
Excellent work so far.
You may be able to use the Cell Processor's DABR (Data Address Breakpoint)
register to find out what code is writing to that memory area. I have a
helper patch to set up the DABR register from kernel code here:
http://git.kernel.org/?p=linux/kernel/git/geoff/ps3-linux.git;a=commitdiff;h=c46799f5c6ba7594cdaa248ec60a50c7ad1cdeaa
-Geoff
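The segment addresses quoted above can be checked with a little arithmetic: 0x7fff000 plus 4096 bytes ends exactly at 128MB, and the relocated device tree starts right where the purgatory segment ends. A small sketch of that check follows; struct segment and the helper names are made up for illustration, and only the numbers come from the kexec output in the report.

```c
#include <stdbool.h>
#include <stdint.h>

#define BOOT_MEM_SIZE (128ull * 1024 * 1024) /* 128MB, as stated in the report */

/* One kexec segment as printed by 'kexec -l': start address and size in bytes. */
struct segment {
    uint64_t mem;
    uint64_t memsz;
};

/* First address past the end of the segment. */
static uint64_t segment_end(struct segment s)
{
    return s.mem + s.memsz;
}

/* True if the segment ends exactly at the top of boot memory. */
static bool ends_at_top_of_boot_mem(struct segment s)
{
    return segment_end(s) == BOOT_MEM_SIZE;
}
```

For the original layout, ends_at_top_of_boot_mem() is true for the device tree segment at 0x7fff000, so whatever corrupts it is writing into the very last page of boot memory.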
* Re: PS3: Strange issue with kexec and FreeBSD loader
From: Benjamin Herrenschmidt @ 2013-02-21 0:32 UTC
To: Phileas Fogg; +Cc: linuxppc-dev
In-Reply-To: <51253558.1070407@mail.ru>
On Wed, 2013-02-20 at 21:43 +0100, Phileas Fogg wrote:
> I found the single commit which breaks kexec for the FreeBSD loader and other
> custom ELF kernels on the PS3 console.
>
>
> From 7230c5644188cd9e3fb380cc97dde00c464a3ba7 Mon Sep 17 00:00:00 2001
> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Tue, 6 Mar 2012 18:27:59 +1100
> Subject: [PATCH] powerpc: Rework lazy-interrupt handling
Odd... That rework had its own issues and so several patches went in
subsequently to address them. It's possible that the PS3 does more
horrid stuff we missed here but I don't quite see how to relate that to
your specific memory corruption problem...
Do you see any "pattern" to the corruption? Does it look like
something known, for example an exception frame, ASCII data, MSR values, ...?
Ben.
* linux-next: manual merge of the signal tree with the powerpc tree
From: Stephen Rothwell @ 2013-02-21 4:52 UTC
To: Al Viro
Cc: Michael Neuling, linux-kernel, linux-next, Paul Mackerras,
linuxppc-dev
Hi Al,
Today's linux-next merge of the signal tree got conflicts in
arch/powerpc/kernel/signal_32.c and arch/powerpc/kernel/signal_64.c
between commit 2b0a576d15e0 ("powerpc: Add new transactional memory state
to the signal context") from the powerpc tree and commit 7cce246557bf
("powerpc: switch to generic sigaltstack") from the signal tree.
I fixed it up (I think - see below) and can carry the fix as necessary
(no action is required).
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
diff --cc arch/powerpc/kernel/signal_32.c
index e4a88d3,802ab5e..0000000
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@@ -817,223 -513,7 +742,140 @@@ static long restore_user_regs(struct pt
return 0;
}
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+/*
+ * Restore the current user register values from the user stack, except for
+ * MSR, and recheckpoint the original checkpointed register state for processes
+ * in transactions.
+ */
+static long restore_tm_user_regs(struct pt_regs *regs,
+ struct mcontext __user *sr,
+ struct mcontext __user *tm_sr)
+{
+ long err;
+ unsigned long msr;
+#ifdef CONFIG_VSX
+ int i;
+#endif
+
+ /*
+ * restore general registers but not including MSR or SOFTE. Also
+ * take care of keeping r2 (TLS) intact if not a signal.
+ * See comment in signal_64.c:restore_tm_sigcontexts();
+ * TFHAR is restored from the checkpointed NIP; TEXASR and TFIAR
+ * were set by the signal delivery.
+ */
+ err = restore_general_regs(regs, tm_sr);
+ err |= restore_general_regs(&current->thread.ckpt_regs, sr);
+
+ err |= __get_user(current->thread.tm_tfhar, &sr->mc_gregs[PT_NIP]);
+
+ err |= __get_user(msr, &sr->mc_gregs[PT_MSR]);
+ if (err)
+ return 1;
+
+ /* Restore the previous little-endian mode */
+ regs->msr = (regs->msr & ~MSR_LE) | (msr & MSR_LE);
+
+ /*
+ * Do this before updating the thread state in
+ * current->thread.fpr/vr/evr. That way, if we get preempted
+ * and another task grabs the FPU/Altivec/SPE, it won't be
+ * tempted to save the current CPU state into the thread_struct
+ * and corrupt what we are writing there.
+ */
+ discard_lazy_cpu_state();
+
+#ifdef CONFIG_ALTIVEC
+ regs->msr &= ~MSR_VEC;
+ if (msr & MSR_VEC) {
+ /* restore altivec registers from the stack */
+ if (__copy_from_user(current->thread.vr, &sr->mc_vregs,
+ sizeof(sr->mc_vregs)) ||
+ __copy_from_user(current->thread.transact_vr,
+ &tm_sr->mc_vregs,
+ sizeof(sr->mc_vregs)))
+ return 1;
+ } else if (current->thread.used_vr) {
+ memset(current->thread.vr, 0, ELF_NVRREG * sizeof(vector128));
+ memset(current->thread.transact_vr, 0,
+ ELF_NVRREG * sizeof(vector128));
+ }
+
+ /* Always get VRSAVE back */
+ if (__get_user(current->thread.vrsave,
+ (u32 __user *)&sr->mc_vregs[32]) ||
+ __get_user(current->thread.transact_vrsave,
+ (u32 __user *)&tm_sr->mc_vregs[32]))
+ return 1;
+#endif /* CONFIG_ALTIVEC */
+
+ regs->msr &= ~(MSR_FP | MSR_FE0 | MSR_FE1);
+
+ if (copy_fpr_from_user(current, &sr->mc_fregs) ||
+ copy_transact_fpr_from_user(current, &tm_sr->mc_fregs))
+ return 1;
+
+#ifdef CONFIG_VSX
+ regs->msr &= ~MSR_VSX;
+ if (msr & MSR_VSX) {
+ /*
+ * Restore altivec registers from the stack to a local
+ * buffer, then write this out to the thread_struct
+ */
+ if (copy_vsx_from_user(current, &sr->mc_vsregs) ||
+ copy_transact_vsx_from_user(current, &tm_sr->mc_vsregs))
+ return 1;
+ } else if (current->thread.used_vsr)
+ for (i = 0; i < 32 ; i++) {
+ current->thread.fpr[i][TS_VSRLOWOFFSET] = 0;
+ current->thread.transact_fpr[i][TS_VSRLOWOFFSET] = 0;
+ }
+#endif /* CONFIG_VSX */
+
+#ifdef CONFIG_SPE
+ /* SPE regs are not checkpointed with TM, so this section is
+ * simply the same as in restore_user_regs().
+ */
+ regs->msr &= ~MSR_SPE;
+ if (msr & MSR_SPE) {
+ if (__copy_from_user(current->thread.evr, &sr->mc_vregs,
+ ELF_NEVRREG * sizeof(u32)))
+ return 1;
+ } else if (current->thread.used_spe)
+ memset(current->thread.evr, 0, ELF_NEVRREG * sizeof(u32));
+
+ /* Always get SPEFSCR back */
+ if (__get_user(current->thread.spefscr, (u32 __user *)&sr->mc_vregs
+ + ELF_NEVRREG))
+ return 1;
+#endif /* CONFIG_SPE */
+
+ /* Now, recheckpoint. This loads up all of the checkpointed (older)
+ * registers, including FP and V[S]Rs. After recheckpointing, the
+ * transactional versions should be loaded.
+ */
+ tm_enable();
+ /* This loads the checkpointed FP/VEC state, if used */
+ tm_recheckpoint(&current->thread, msr);
+ /* The task has moved into TM state S, so ensure MSR reflects this */
+ regs->msr = (regs->msr & ~MSR_TS_MASK) | MSR_TS_S;
+
+ /* This loads the speculative FP/VEC state, if used */
+ if (msr & MSR_FP) {
+ do_load_up_transact_fpu(&current->thread);
+ regs->msr |= (MSR_FP | current->thread.fpexc_mode);
+ }
+ if (msr & MSR_VEC) {
+ do_load_up_transact_altivec(&current->thread);
+ regs->msr |= MSR_VEC;
+ }
+
+ return 0;
+}
+#endif
+
#ifdef CONFIG_PPC64
- long compat_sys_rt_sigaction(int sig, const struct sigaction32 __user *act,
- struct sigaction32 __user *oact, size_t sigsetsize)
- {
- struct k_sigaction new_ka, old_ka;
- int ret;
-
- /* XXX: Don't preclude handling different sized sigset_t's. */
- if (sigsetsize != sizeof(compat_sigset_t))
- return -EINVAL;
-
- if (act) {
- compat_uptr_t handler;
-
- ret = get_user(handler, &act->sa_handler);
- new_ka.sa.sa_handler = compat_ptr(handler);
- ret |= get_sigset_t(&new_ka.sa.sa_mask, &act->sa_mask);
- ret |= __get_user(new_ka.sa.sa_flags, &act->sa_flags);
- if (ret)
- return -EFAULT;
- }
-
- ret = do_sigaction(sig, act ? &new_ka : NULL, oact ? &old_ka : NULL);
- if (!ret && oact) {
- ret = put_user(to_user_ptr(old_ka.sa.sa_handler), &oact->sa_handler);
- ret |= put_sigset_t(&oact->sa_mask, &old_ka.sa.sa_mask);
- ret |= __put_user(old_ka.sa.sa_flags, &oact->sa_flags);
- }
- return ret;
- }
-
- /*
- * Note: it is necessary to treat how as an unsigned int, with the
- * corresponding cast to a signed int to insure that the proper
- * conversion (sign extension) between the register representation
- * of a signed int (msr in 32-bit mode) and the register representation
- * of a signed int (msr in 64-bit mode) is performed.
- */
- long compat_sys_rt_sigprocmask(u32 how, compat_sigset_t __user *set,
- compat_sigset_t __user *oset, size_t sigsetsize)
- {
- sigset_t s;
- sigset_t __user *up;
- int ret;
- mm_segment_t old_fs = get_fs();
-
- if (set) {
- if (get_sigset_t(&s, set))
- return -EFAULT;
- }
-
- set_fs(KERNEL_DS);
- /* This is valid because of the set_fs() */
- up = (sigset_t __user *) &s;
- ret = sys_rt_sigprocmask((int)how, set ? up : NULL, oset ? up : NULL,
- sigsetsize);
- set_fs(old_fs);
- if (ret)
- return ret;
- if (oset) {
- if (put_sigset_t(oset, &s))
- return -EFAULT;
- }
- return 0;
- }
-
- long compat_sys_rt_sigpending(compat_sigset_t __user *set, compat_size_t sigsetsize)
- {
- sigset_t s;
- int ret;
- mm_segment_t old_fs = get_fs();
-
- set_fs(KERNEL_DS);
- /* The __user pointer cast is valid because of the set_fs() */
- ret = sys_rt_sigpending((sigset_t __user *) &s, sigsetsize);
- set_fs(old_fs);
- if (!ret) {
- if (put_sigset_t(set, &s))
- return -EFAULT;
- }
- return ret;
- }
-
-
int copy_siginfo_to_user32(struct compat_siginfo __user *d, siginfo_t *s)
{
int err;
@@@ -1202,10 -607,8 +971,7 @@@ int handle_rt_signal32(unsigned long si
/* Put the siginfo & fill in most of the ucontext */
if (copy_siginfo_to_user(&rt_sf->info, info)
|| __put_user(0, &rt_sf->uc.uc_flags)
- || __put_user(current->sas_ss_sp, &rt_sf->uc.uc_stack.ss_sp)
- || __put_user(sas_ss_flags(regs->gpr[1]),
- &rt_sf->uc.uc_stack.ss_flags)
- || __put_user(current->sas_ss_size, &rt_sf->uc.uc_stack.ss_size)
- || __put_user(0, &rt_sf->uc.uc_link)
+ || __save_altstack(&rt_sf->uc.uc_stack, regs->gpr[1])
|| __put_user(to_user_ptr(&rt_sf->uc.uc_mcontext),
&rt_sf->uc.uc_regs)
|| put_sigset_t(&rt_sf->uc.uc_sigmask, oldset))
diff --cc arch/powerpc/kernel/signal_64.c
index 7a76ee4,807b5b1..0000000
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@@ -723,29 -413,10 +721,26 @@@ int handle_rt_signal64(int signr, struc
/* Create the ucontext. */
err |= __put_user(0, &frame->uc.uc_flags);
- err |= __put_user(current->sas_ss_sp, &frame->uc.uc_stack.ss_sp);
- err |= __put_user(sas_ss_flags(regs->gpr[1]),
- &frame->uc.uc_stack.ss_flags);
- err |= __put_user(current->sas_ss_size, &frame->uc.uc_stack.ss_size);
- err |= __put_user(0, &frame->uc.uc_link);
+ err |= __save_altstack(&frame->uc.uc_stack, regs->gpr[1]);
- err |= setup_sigcontext(&frame->uc.uc_mcontext, regs, signr, NULL,
- (unsigned long)ka->sa.sa_handler, 1);
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+ if (MSR_TM_ACTIVE(regs->msr)) {
+ /* The ucontext_t passed to userland points to the second
+ * ucontext_t (for transactional state) with its uc_link ptr.
+ */
+ err |= __put_user(&frame->uc_transact, &frame->uc.uc_link);
+ err |= setup_tm_sigcontexts(&frame->uc.uc_mcontext,
+ &frame->uc_transact.uc_mcontext,
+ regs, signr,
+ NULL,
+ (unsigned long)ka->sa.sa_handler);
+ } else
+#endif
+ {
+ err |= __put_user(0, &frame->uc.uc_link);
+ err |= setup_sigcontext(&frame->uc.uc_mcontext, regs, signr,
+ NULL, (unsigned long)ka->sa.sa_handler,
+ 1);
+ }
err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
if (err)
goto badframe;
* Re: linux-next: manual merge of the signal tree with the powerpc tree
From: Benjamin Herrenschmidt @ 2013-02-21 5:50 UTC
To: Stephen Rothwell
Cc: Michael Neuling, linux-kernel, linux-next, Paul Mackerras,
Al Viro, linuxppc-dev
In-Reply-To: <20130221155208.bcb1295ab9bdecf394d48bfc@canb.auug.org.au>
On Thu, 2013-02-21 at 15:52 +1100, Stephen Rothwell wrote:
> Hi Al,
>
> Today's linux-next merge of the signal tree got conflicts in
> arch/powerpc/kernel/signal_32.c and arch/powerpc/kernel/signal_64.c
> between commit 2b0a576d15e0 ("powerpc: Add new transactional memory state
> to the signal context") from the powerpc tree and commit 7cce246557bf
> ("powerpc: switch to generic sigaltstack") from the signal tree.
>
> I fixed it up (I think - see below) and can carry the fix as necessary
> (no action is required).
Mikey, can you check that everything is all right?
I'm happy to wait for Al's stuff to go in first and fix up the conflict
before I send the pull request to Linus. I'm off travelling but I
should be able to get stuff out this weekend.
Cheers,
Ben.
* PATCH [1/1] GIANFAR infinite loop on reception of PAUSE FRAME
From: Staale.Aakermann @ 2013-02-21 8:04 UTC
To: linuxppc-dev
There is an undesired behavior in the GIANFAR driver that causes an infinite
loop, stalling the CPU, on reception of PAUSE frames.
I found this error during testing with a DSL modem (Westermo, http://www.westermo.com).
This equipment sends PAUSE frames continuously on its LAN side as long as the
link is not up on the WAN side. The GIANFAR driver does not handle a timeout
on transmission of the first packet sent. It tries to gracefully stop all
ongoing transactions, but this fails because there aren't any, and it loops
forever.
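To illustrate the approach taken in the patch below: when a received PAUSE frame is holding off the transmitter, only the graceful receive stop is requested and waited for; otherwise both the receive and transmit stops are requested. A minimal sketch of that mask selection follows. The EX_* bit values and the select_halt_masks() helper are placeholders for illustration only and are not the driver's definitions.

```c
#include <stdint.h>

/* Placeholder bit values for illustration; NOT the real gianfar.h definitions. */
#define EX_DMACTRL_GRS  0x10u   /* graceful receive stop request */
#define EX_DMACTRL_GTS  0x08u   /* graceful transmit stop request */
#define EX_IEVENT_GRSC  0x100u  /* graceful receive stop complete */
#define EX_IEVENT_GTSC  0x200u  /* graceful transmit stop complete */

struct halt_masks {
    uint32_t reset_value;   /* bits to set in the DMA control register */
    uint32_t event_value;   /* bits to wait for in the event register */
};

/* Hypothetical helper: always stop RX; stop TX only when no PAUSE is active. */
static struct halt_masks select_halt_masks(int pause_active)
{
    struct halt_masks m = { EX_DMACTRL_GRS, EX_IEVENT_GRSC };

    if (!pause_active) {
        m.reset_value |= EX_DMACTRL_GTS;
        m.event_value |= EX_IEVENT_GTSC;
    }
    return m;
}
```

With a PAUSE active, the wait loop only has to see the receive-stop-complete bit, so it can no longer spin on a transmit stop that never completes.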
--- drivers/net/ethernet/freescale/gianfar.c 2012-12-11 04:30:57.000000000 +0100
+++ drivers/net/ethernet/freescale/gianfar.c 2013-02-20 20:28:13.201931602 +0100
@@ -1600,8 +1600,10 @@
struct gfar_private *priv = netdev_priv(dev);
struct gfar __iomem *regs = NULL;
u32 tempval;
- int i;
-
+ u32 reset_value = DMACTRL_GRS;
+ u32 event_value = IEVENT_GRSC;
+ int i = 0;
+
for (i = 0; i < priv->num_grps; i++) {
regs = priv->gfargrp[i].regs;
/* Mask all interrupts */
@@ -1612,22 +1614,32 @@
}
regs = priv->gfargrp[0].regs;
+
+ if (!((gfar_read(&regs->tctrl) & TCTRL_RFCPAUSE) == TCTRL_RFCPAUSE))
+ {
+ reset_value |= DMACTRL_GTS;
+ event_value |= IEVENT_GTSC;
+
+ }
+
/* Stop the DMA, and wait for it to stop */
tempval = gfar_read(&regs->dmactrl);
- if ((tempval & (DMACTRL_GRS | DMACTRL_GTS)) !=
- (DMACTRL_GRS | DMACTRL_GTS)) {
+ if ((tempval & reset_value) != reset_value) {
int ret;
- tempval |= (DMACTRL_GRS | DMACTRL_GTS);
+ tempval |= reset_value;
gfar_write(&regs->dmactrl, tempval);
do {
- ret = spin_event_timeout(((gfar_read(&regs->ievent) &
- (IEVENT_GRSC | IEVENT_GTSC)) ==
- (IEVENT_GRSC | IEVENT_GTSC)), 1000000, 0);
+
+ ret = spin_event_timeout(((gfar_read(&regs->ievent) & event_value) == event_value), 1000000, 0);
+
if (!ret && !(gfar_read(&regs->ievent) & IEVENT_GRSC))
+ {
ret =3D __gfar_is_rx_idle(priv);
+ }
} while (!ret);
+
}
}
@@ -1668,9 +1682,10 @@
lock_rx_qs(priv);
gfar_halt(dev);
-
+
unlock_rx_qs(priv);
unlock_tx_qs(priv);
+
local_irq_restore(flags);
/* Free the IRQs */
@@ -2424,22 +2439,32 @@
struct gfar_private *priv = container_of(work, struct gfar_private,
reset_task);
struct net_device *dev = priv->ndev;
-
if (dev->flags & IFF_UP) {
netif_tx_stop_all_queues(dev);
stop_gfar(dev);
startup_gfar(dev);
netif_tx_start_all_queues(dev);
}
-
netif_tx_schedule_all(dev);
}
static void gfar_timeout(struct net_device *dev)
{
struct gfar_private *priv = netdev_priv(dev);
+ struct gfar __iomem *regs = NULL;
- dev->stats.tx_errors++;
+ regs = priv->gfargrp[0].regs;
+
+ if ((gfar_read(&regs->tctrl) & TCTRL_RFCPAUSE) == TCTRL_RFCPAUSE)
+ {
+ printk(KERN_DEBUG "PAUSE FRAME RECEIVED\n");
+ dev->stats.tx_dropped++;
+ }
+ else
+ {
+ dev->stats.tx_errors++;
+ }
+
schedule_work(&priv->reset_task);
}
^ permalink raw reply
* Re: [PATCH] i2c: Remove unneeded xxx_set_drvdata(..., NULL) calls
From: Wolfram Sang @ 2013-02-21 10:48 UTC (permalink / raw)
To: Doug Anderson
Cc: Tony Lindgren, Linus Walleij, Thierry Reding, Sekhar Nori,
linux-i2c, Guan Xuetao, Kevin Hilman, Sonic Zhang,
linux-arm-kernel, Deepak Sikri, Havard Skinnemoen, Marek Vasut,
Pawel Moll, Stephen Warren, Sascha Hauer, Uwe Kleine-König,
Rob Herring, uclinux-dist-devel, Jean Delvare, Lars-Peter Clausen,
Ben Dooks (embedded platforms), Barry Song, linux-omap,
Mika Westerberg, Oskar Schirmer, Fabio Estevam,
davinci-linux-open-source, Shawn Guo, Jim Cromie,
Greg Kroah-Hartman, Tomoya MORINAGA, linux-kernel, Kyungmin Park,
Viresh Kumar, Karol Lewandowski, Jiri Kosina, STEricsson,
Joe Perches, Andrew Morton, Alessandro Rubini, linuxppc-dev,
Alexander Stein
In-Reply-To: <1360970315-32116-1-git-send-email-dianders@chromium.org>
On Fri, Feb 15, 2013 at 03:18:35PM -0800, Doug Anderson wrote:
> There is simply no reason to be manually setting the private driver
> data to NULL in the remove/fail to probe cases. This is just extra
> cruft code that can be removed.
>
> A few notes:
> * Nothing relies on drvdata being set to NULL.
> * The __device_release_driver() function eventually calls
> dev_set_drvdata(dev, NULL) anyway, so there's no need to do it
> twice.
> * I verified that there were no cases where xxx_get_drvdata() was
> being called in these drivers and checking for / relying on the NULL
> return value.
>
> This could be cleaned up kernel-wide but for now just take the baby
> step and remove from the i2c subsystem.
>
> Reported-by: Wolfram Sang <wsa@the-dreams.de>
> Reported-by: Stephen Warren <swarren@wwwdotorg.org>
> Signed-off-by: Doug Anderson <dianders@chromium.org>
Applied, thanks!
^ permalink raw reply
* Re: [PATCH 3/4] KVM: PPC: Book3S HV: Preserve guest CFAR register value
From: Alexander Graf @ 2013-02-21 13:33 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linuxppc-dev, kvm-ppc
In-Reply-To: <20130205041051.GD20303@drongo>
On 05.02.2013, at 05:10, Paul Mackerras wrote:
> The CFAR (Come-From Address Register) is a useful debugging aid that
> exists on POWER7 processors. Currently HV KVM doesn't save or restore
> the CFAR register for guest vcpus, making the CFAR of limited use in
> guests.
>
> This adds the necessary code to capture the CFAR value saved in the
> early exception entry code (it has to be saved before any branch is
> executed), save it in the vcpu.arch struct, and restore it on entry
> to the guest.
>
> Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Alexander Graf <agraf@suse.de>
Alex
^ permalink raw reply
* [RFC PATCH -V2 01/21] powerpc: Use signed formatting when printing error
From: Aneesh Kumar K.V @ 2013-02-21 16:47 UTC (permalink / raw)
To: benh, paulus; +Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1361465248-10867-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
PAPR defines these errors as negative values, so print them with a signed
format specifier for easier debugging.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
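As a small illustration of why the format specifier matters, the sketch below formats the same negative return code both ways. The format_rc() helper and the value -4 are made up for this example and are not taken from PAPR or from lpar.c.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format a return code either as signed (%ld) or,
 * as the old code did, as unsigned (%lu).
 */
static void format_rc(char *buf, size_t len, long rc, int as_signed)
{
    if (as_signed)
        snprintf(buf, len, "lpar err %ld", rc);
    else
        snprintf(buf, len, "lpar err %lu", (unsigned long)rc);
}
```

Called with rc = -4, the signed variant yields "lpar err -4", while the unsigned variant prints a very large positive number whose exact digits depend on the width of long.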
arch/powerpc/platforms/pseries/lpar.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 0da39fe..a77c35b 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -155,7 +155,7 @@ static long pSeries_lpar_hpte_insert(unsigned long hpte_group,
*/
if (unlikely(lpar_rc != H_SUCCESS)) {
if (!(vflags & HPTE_V_BOLTED))
- pr_devel(" lpar err %lu\n", lpar_rc);
+ pr_devel(" lpar err %ld\n", lpar_rc);
return -2;
}
if (!(vflags & HPTE_V_BOLTED))
--
1.7.10
^ permalink raw reply related
* [RFC PATCH -V2 02/21] powerpc: Save DAR and DSISR in pt_regs on MCE
From: Aneesh Kumar K.V @ 2013-02-21 16:47 UTC (permalink / raw)
To: benh, paulus; +Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1361465248-10867-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
We were not saving DAR and DSISR on MCE. Save them and also print the values
along with the exception details in xmon.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
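For reference, the xmon condition after this patch can be read as two predicates. This is only a sketch; the TRAP_* names and the helper functions are illustrative and do not exist in xmon.c.

```c
#include <stdbool.h>

/* Trap vectors tested in the xmon hunk; the names are illustrative. */
#define TRAP_MCE   0x200UL  /* machine check, newly added by this patch */
#define TRAP_DSI   0x300UL  /* data storage */
#define TRAP_DSEG  0x380UL  /* data segment */
#define TRAP_ALIGN 0x600UL  /* alignment */

/* DAR is printed for all four trap types. */
static bool prints_dar(unsigned long trap)
{
    return trap == TRAP_DSI || trap == TRAP_DSEG ||
           trap == TRAP_ALIGN || trap == TRAP_MCE;
}

/* DSISR is printed for the same traps except 0x380. */
static bool prints_dsisr(unsigned long trap)
{
    return prints_dar(trap) && trap != TRAP_DSEG;
}
```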
arch/powerpc/kernel/exceptions-64s.S | 9 +++++++++
arch/powerpc/xmon/xmon.c | 2 +-
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 0e9c48c..d02e730 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -640,9 +640,18 @@ slb_miss_user_pseries:
.align 7
.globl machine_check_common
machine_check_common:
+
+ mfspr r10,SPRN_DAR
+ std r10,PACA_EXGEN+EX_DAR(r13)
+ mfspr r10,SPRN_DSISR
+ stw r10,PACA_EXGEN+EX_DSISR(r13)
EXCEPTION_PROLOG_COMMON(0x200, PACA_EXMC)
FINISH_NAP
DISABLE_INTS
+ ld r3,PACA_EXGEN+EX_DAR(r13)
+ lwz r4,PACA_EXGEN+EX_DSISR(r13)
+ std r3,_DAR(r1)
+ std r4,_DSISR(r1)
bl .save_nvgprs
addi r3,r1,STACK_FRAME_OVERHEAD
bl .machine_check_exception
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 1f8d2f1..a72e490 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -1423,7 +1423,7 @@ static void excprint(struct pt_regs *fp)
printf(" sp: %lx\n", fp->gpr[1]);
printf(" msr: %lx\n", fp->msr);
- if (trap == 0x300 || trap == 0x380 || trap == 0x600) {
+ if (trap == 0x300 || trap == 0x380 || trap == 0x600 || trap == 0x200) {
printf(" dar: %lx\n", fp->dar);
if (trap != 0x380)
printf(" dsisr: %lx\n", fp->dsisr);
--
1.7.10
^ permalink raw reply related
* [RFC PATCH -V2 03/21] powerpc: Don't hard code the size of pte page
From: Aneesh Kumar K.V @ 2013-02-21 16:47 UTC (permalink / raw)
To: benh, paulus; +Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1361465248-10867-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Use PTRS_PER_PTE to indicate the size of the pte page.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
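A quick check of the arithmetic, assuming PTRS_PER_PTE is 1 << PTE_INDEX_SIZE and each entry is 8 bytes, as the comment added to pgtable.h states. The hidx_offset() helper is hypothetical and exists only for this example.

```c
/* Hypothetical helper: byte offset of the hidx area within a PTE page. */
static unsigned long hidx_offset(unsigned int pte_index_size)
{
    unsigned long ptrs_per_pte = 1UL << pte_index_size;

    return ptrs_per_pte * 8;    /* 8 bytes per pte entry */
}
```

With PTE_INDEX_SIZE at 12 this gives 0x8000, the constant that was previously hard-coded in hash_low_64.S; with a PTE_INDEX_SIZE of 8 it gives 0x800.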
arch/powerpc/include/asm/pgtable.h | 6 ++++++
arch/powerpc/mm/hash_low_64.S | 4 ++--
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index a9cbd3b..fc57855 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -17,6 +17,12 @@ struct mm_struct;
# include <asm/pgtable-ppc32.h>
#endif
+/*
+ * hidx is in the second half of the page table. We use the
+ * 8 bytes per each pte entry.
+ */
+#define PTE_PAGE_HIDX_OFFSET (PTRS_PER_PTE * 8)
+
#ifndef __ASSEMBLY__
#include <asm/tlbflush.h>
diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
index 7443481..abdd5e2 100644
--- a/arch/powerpc/mm/hash_low_64.S
+++ b/arch/powerpc/mm/hash_low_64.S
@@ -490,7 +490,7 @@ END_FTR_SECTION(CPU_FTR_NOEXECUTE|CPU_FTR_COHERENT_ICACHE, CPU_FTR_NOEXECUTE)
beq htab_inval_old_hpte
ld r6,STK_PARAM(R6)(r1)
- ori r26,r6,0x8000 /* Load the hidx mask */
+ ori r26,r6,PTE_PAGE_HIDX_OFFSET /* Load the hidx mask. */
ld r26,0(r26)
addi r5,r25,36 /* Check actual HPTE_SUB bit, this */
rldcr. r0,r31,r5,0 /* must match pgtable.h definition */
@@ -607,7 +607,7 @@ htab_pte_insert_ok:
sld r4,r4,r5
andc r26,r26,r4
or r26,r26,r3
- ori r5,r6,0x8000
+ ori r5,r6,PTE_PAGE_HIDX_OFFSET
std r26,0(r5)
lwsync
std r30,0(r6)
--
1.7.10
^ permalink raw reply related
* [RFC PATCH -V2 04/21] powerpc: Reduce the PTE_INDEX_SIZE
From: Aneesh Kumar K.V @ 2013-02-21 16:47 UTC (permalink / raw)
To: benh, paulus; +Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1361465248-10867-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
This makes one PMD cover a 16MB range, which simplifies the implementation of THP
on Power. The THP core code uses one pmd entry to track a huge page, so
the range mapped by a single pmd entry should equal the huge page size
supported by the hardware.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
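The numbers can be verified with a little arithmetic, assuming a 64K base page (a page shift of 16). The helpers and the EX_PAGE_SHIFT name below are illustrative only.

```c
#define EX_PAGE_SHIFT 16u   /* 64K base pages; illustrative name */

/* Bytes mapped by one PMD entry: one full PTE table of base pages. */
static unsigned long long pmd_coverage(unsigned int pte_index_size)
{
    return 1ULL << (EX_PAGE_SHIFT + pte_index_size);
}

/* Total virtual address bits covered by the page table tree. */
static unsigned int va_bits(unsigned int pte, unsigned int pmd,
                            unsigned int pud, unsigned int pgd)
{
    return EX_PAGE_SHIFT + pte + pmd + pud + pgd;
}
```

A PTE_INDEX_SIZE of 8 gives 16MB per PMD entry, against 256MB with the old value of 12. Moving the four bits from the PTE level to the PGD level keeps the total at 46 bits in both layouts.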
arch/powerpc/include/asm/pgtable-ppc64-64k.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/pgtable-ppc64-64k.h b/arch/powerpc/include/asm/pgtable-ppc64-64k.h
index be4e287..3c529b4 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64-64k.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64-64k.h
@@ -4,10 +4,10 @@
#include <asm-generic/pgtable-nopud.h>
-#define PTE_INDEX_SIZE 12
+#define PTE_INDEX_SIZE 8
#define PMD_INDEX_SIZE 12
#define PUD_INDEX_SIZE 0
-#define PGD_INDEX_SIZE 6
+#define PGD_INDEX_SIZE 10
#ifndef __ASSEMBLY__
#define PTE_TABLE_SIZE (sizeof(real_pte_t) << PTE_INDEX_SIZE)
--
1.7.10
^ permalink raw reply related
* [RFC PATCH -V2 05/21] powerpc: Reduce PTE table memory wastage
From: Aneesh Kumar K.V @ 2013-02-21 16:47 UTC (permalink / raw)
To: benh, paulus; +Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1361465248-10867-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
A PTE page now consumes only 2K of the 64K page. This is in order to
facilitate transparent huge page support, which works much better if our PMDs
cover 16MB instead of 256MB.
In order to reduce the wastage, we now allocate multiple PTE page fragments
from the same PTE page.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
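The fragment bookkeeping can be sketched in user space as a plain bitmask allocator. This is a simplified illustration only: it leaves out the locking, the per-mm list handling, and the RCU upper-half bits that the patch keeps in page->_mapcount. The names frag_alloc() and frag_free() are hypothetical. The fragment size of 4096 bytes assumes 2K of PTEs plus 2K of hash index per fragment.

```c
#define FRAG_BITS 15u                       /* mirrors FRAG_MASK_BITS */
#define FRAG_MASK ((1u << FRAG_BITS) - 1u)
#define FRAG_SIZE 4096u                     /* 2K of PTEs + 2K of hash index */

/* Return the byte offset of a free fragment and mark it used, or -1 if full. */
static long frag_alloc(unsigned int *mask)
{
    unsigned int bit = 1u;
    long offset = 0;

    if ((*mask & FRAG_MASK) == FRAG_MASK)
        return -1;
    /* Walk to the first clear bit, advancing one fragment per set bit. */
    for (; *mask & bit; bit <<= 1)
        offset += FRAG_SIZE;
    *mask ^= bit;
    return offset;
}

/* Mark the fragment at the given byte offset as free again. */
static void frag_free(unsigned int *mask, long offset)
{
    *mask ^= 1u << (offset / FRAG_SIZE);
}
```

Allocation always returns the lowest free fragment, so a freed slot is reused before the page grows further. Once all 15 bits are set the page is full, which is the point at which the patch drops it from the pgtable list.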
arch/powerpc/include/asm/mmu-book3e.h | 4 +
arch/powerpc/include/asm/mmu-hash64.h | 4 +
arch/powerpc/include/asm/page.h | 4 +
arch/powerpc/include/asm/pgalloc-32.h | 45 ++++++++
arch/powerpc/include/asm/pgalloc-64.h | 143 ++++++++++++++++++++-----
arch/powerpc/include/asm/pgalloc.h | 46 +-------
arch/powerpc/kernel/setup_64.c | 4 +-
arch/powerpc/mm/mmu_context_hash64.c | 27 +++++
arch/powerpc/mm/pgtable_64.c | 189 +++++++++++++++++++++++++++++++++
9 files changed, 392 insertions(+), 74 deletions(-)
diff --git a/arch/powerpc/include/asm/mmu-book3e.h b/arch/powerpc/include/asm/mmu-book3e.h
index 99d43e0..6bd293d 100644
--- a/arch/powerpc/include/asm/mmu-book3e.h
+++ b/arch/powerpc/include/asm/mmu-book3e.h
@@ -231,6 +231,10 @@ typedef struct {
u64 high_slices_psize; /* 4 bits per slice for now */
u16 user_psize; /* page size index */
#endif
+#ifdef CONFIG_PPC_64K_PAGES
+ /* for 2K page table support */
+ struct list_head pgtable_list;
+#endif
} mm_context_t;
/* Page size definitions, common between 32 and 64-bit
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index 35bb51e..c3b3518 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -498,6 +498,10 @@ typedef struct {
unsigned long acop; /* mask of enabled coprocessor types */
unsigned int cop_pid; /* pid value used with coprocessors */
#endif /* CONFIG_PPC_ICSWX */
+#ifdef CONFIG_PPC_64K_PAGES
+ /* for 2K page table support */
+ struct list_head pgtable_list;
+#endif
} mm_context_t;
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index f072e97..38e7ff6 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -378,7 +378,11 @@ void arch_free_page(struct page *page, int order);
struct vm_area_struct;
+#ifdef CONFIG_PPC_64K_PAGES
+typedef pte_t *pgtable_t;
+#else
typedef struct page *pgtable_t;
+#endif
#include <asm-generic/memory_model.h>
#endif /* __ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/pgalloc-32.h b/arch/powerpc/include/asm/pgalloc-32.h
index 580cf73..27b2386 100644
--- a/arch/powerpc/include/asm/pgalloc-32.h
+++ b/arch/powerpc/include/asm/pgalloc-32.h
@@ -37,6 +37,17 @@ extern void pgd_free(struct mm_struct *mm, pgd_t *pgd);
extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr);
extern pgtable_t pte_alloc_one(struct mm_struct *mm, unsigned long addr);
+static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
+{
+ free_page((unsigned long)pte);
+}
+
+static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
+{
+ pgtable_page_dtor(ptepage);
+ __free_page(ptepage);
+}
+
static inline void pgtable_free(void *table, unsigned index_size)
{
BUG_ON(index_size); /* 32-bit doesn't use this */
@@ -45,4 +56,38 @@ static inline void pgtable_free(void *table, unsigned index_size)
#define check_pgt_cache() do { } while (0)
+#ifdef CONFIG_SMP
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+ void *table, int shift)
+{
+ unsigned long pgf = (unsigned long)table;
+ BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+ pgf |= shift;
+ tlb_remove_table(tlb, (void *)pgf);
+}
+
+static inline void __tlb_remove_table(void *_table)
+{
+ void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
+ unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
+
+ pgtable_free(table, shift);
+}
+#else
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+ void *table, int shift)
+{
+ pgtable_free(table, shift);
+}
+#endif
+
+static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
+ unsigned long address)
+{
+ struct page *page = page_address(table);
+
+ tlb_flush_pgtable(tlb, address);
+ pgtable_page_dtor(page);
+ pgtable_free_tlb(tlb, page, 0);
+}
#endif /* _ASM_POWERPC_PGALLOC_32_H */
diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
index 292725c..f6875a5 100644
--- a/arch/powerpc/include/asm/pgalloc-64.h
+++ b/arch/powerpc/include/asm/pgalloc-64.h
@@ -72,9 +72,91 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
#define pmd_populate_kernel(mm, pmd, pte) pmd_set(pmd, (unsigned long)(pte))
#define pmd_pgtable(pmd) pmd_page(pmd)
+static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
+ unsigned long address)
+{
+ return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_REPEAT | __GFP_ZERO);
+}
-#else /* CONFIG_PPC_64K_PAGES */
+static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
+ unsigned long address)
+{
+ pte_t *pte;
+ struct page *page;
+ pte = pte_alloc_one_kernel(mm, address);
+ if (!pte)
+ return NULL;
+ page = virt_to_page(pte);
+ pgtable_page_ctor(page);
+ return page;
+}
+
+static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
+{
+ free_page((unsigned long)pte);
+}
+
+static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
+{
+ pgtable_page_dtor(ptepage);
+ __free_page(ptepage);
+}
+
+#ifdef CONFIG_SMP
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+ void *table, int shift)
+{
+ unsigned long pgf = (unsigned long)table;
+ BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+ pgf |= shift;
+ tlb_remove_table(tlb, (void *)pgf);
+}
+
+static inline void __tlb_remove_table(void *_table)
+{
+ void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
+ unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
+
+ if (!shift)
+ free_page((unsigned long)table);
+ else {
+ BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+ kmem_cache_free(PGT_CACHE(shift), table);
+ }
+}
+#else
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+ void *table, int shift)
+{
+ pgtable_free(table, shift);
+}
+#endif
+
+static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
+ unsigned long address)
+{
+ struct page *page = page_address(table);
+
+ tlb_flush_pgtable(tlb, address);
+ pgtable_page_dtor(page);
+ pgtable_free_tlb(tlb, page, 0);
+}
+
+#else /* if CONFIG_PPC_64K_PAGES */
+
+extern unsigned long *page_table_alloc(struct mm_struct *, unsigned long);
+extern void page_table_free(struct mm_struct *, unsigned long *);
+#ifdef CONFIG_SMP
+extern void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift);
+extern void __tlb_remove_table(void *_table);
+#else
+static inline void pgtable_free_tlb(struct mmu_gather *tlb,
+ void *table, int shift)
+{
+ pgtable_free(table, shift);
+}
+#endif
#define pud_populate(mm, pud, pmd) pud_set(pud, (unsigned long)pmd)
static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
@@ -83,51 +165,56 @@ static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
pmd_set(pmd, (unsigned long)pte);
}
-#define pmd_populate(mm, pmd, pte_page) \
- pmd_populate_kernel(mm, pmd, page_address(pte_page))
-#define pmd_pgtable(pmd) pmd_page(pmd)
-
-#endif /* CONFIG_PPC_64K_PAGES */
-
-static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
+static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
+ pgtable_t pte_page)
{
- return kmem_cache_alloc(PGT_CACHE(PMD_INDEX_SIZE),
- GFP_KERNEL|__GFP_REPEAT);
+ pmd_set(pmd, (unsigned long)pte_page);
}
-static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
+static inline pgtable_t pmd_pgtable(pmd_t pmd)
{
- kmem_cache_free(PGT_CACHE(PMD_INDEX_SIZE), pmd);
+ return (pgtable_t)(pmd_val(pmd) & -sizeof(pte_t)*PTRS_PER_PTE);
}
static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
unsigned long address)
{
- return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_REPEAT | __GFP_ZERO);
+ return (pte_t *)page_table_alloc(mm, address);
}
static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
unsigned long address)
{
- struct page *page;
- pte_t *pte;
+ return (pgtable_t)page_table_alloc(mm, address);
+}
- pte = pte_alloc_one_kernel(mm, address);
- if (!pte)
- return NULL;
- page = virt_to_page(pte);
- pgtable_page_ctor(page);
- return page;
+static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
+{
+ page_table_free(mm, (unsigned long *)pte);
}
-static inline void pgtable_free(void *table, unsigned index_size)
+static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
{
- if (!index_size)
- free_page((unsigned long)table);
- else {
- BUG_ON(index_size > MAX_PGTABLE_INDEX_SIZE);
- kmem_cache_free(PGT_CACHE(index_size), table);
- }
+ page_table_free(mm, (unsigned long *)ptepage);
+}
+
+static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
+ unsigned long address)
+{
+ tlb_flush_pgtable(tlb, address);
+ pgtable_free_tlb(tlb, table, 0);
+}
+#endif /* CONFIG_PPC_64K_PAGES */
+
+static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
+{
+ return kmem_cache_alloc(PGT_CACHE(PMD_INDEX_SIZE),
+ GFP_KERNEL|__GFP_REPEAT);
+}
+
+static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
+{
+ kmem_cache_free(PGT_CACHE(PMD_INDEX_SIZE), pmd);
}
#define __pmd_free_tlb(tlb, pmd, addr) \
diff --git a/arch/powerpc/include/asm/pgalloc.h b/arch/powerpc/include/asm/pgalloc.h
index bf301ac..e9a9f60 100644
--- a/arch/powerpc/include/asm/pgalloc.h
+++ b/arch/powerpc/include/asm/pgalloc.h
@@ -3,6 +3,7 @@
#ifdef __KERNEL__
#include <linux/mm.h>
+#include <asm-generic/tlb.h>
#ifdef CONFIG_PPC_BOOK3E
extern void tlb_flush_pgtable(struct mmu_gather *tlb, unsigned long address);
@@ -13,56 +14,11 @@ static inline void tlb_flush_pgtable(struct mmu_gather *tlb,
}
#endif /* !CONFIG_PPC_BOOK3E */
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
- free_page((unsigned long)pte);
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
-{
- pgtable_page_dtor(ptepage);
- __free_page(ptepage);
-}
-
#ifdef CONFIG_PPC64
#include <asm/pgalloc-64.h>
#else
#include <asm/pgalloc-32.h>
#endif
-#ifdef CONFIG_SMP
-struct mmu_gather;
-extern void tlb_remove_table(struct mmu_gather *, void *);
-
-static inline void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
-{
- unsigned long pgf = (unsigned long)table;
- BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
- pgf |= shift;
- tlb_remove_table(tlb, (void *)pgf);
-}
-
-static inline void __tlb_remove_table(void *_table)
-{
- void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
- unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
-
- pgtable_free(table, shift);
-}
-#else /* CONFIG_SMP */
-static inline void pgtable_free_tlb(struct mmu_gather *tlb, void *table, unsigned shift)
-{
- pgtable_free(table, shift);
-}
-#endif /* !CONFIG_SMP */
-
-static inline void __pte_free_tlb(struct mmu_gather *tlb, struct page *ptepage,
- unsigned long address)
-{
- tlb_flush_pgtable(tlb, address);
- pgtable_page_dtor(ptepage);
- pgtable_free_tlb(tlb, page_address(ptepage), 0);
-}
-
#endif /* __KERNEL__ */
#endif /* _ASM_POWERPC_PGALLOC_H */
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 6da881b..4e2db82 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -575,7 +575,9 @@ void __init setup_arch(char **cmdline_p)
init_mm.end_code = (unsigned long) _etext;
init_mm.end_data = (unsigned long) _edata;
init_mm.brk = klimit;
-
+#ifdef CONFIG_PPC_64K_PAGES
+ INIT_LIST_HEAD(&init_mm.context.pgtable_list);
+#endif
irqstack_early_init();
exc_lvl_early_init();
emergency_stack_init();
diff --git a/arch/powerpc/mm/mmu_context_hash64.c b/arch/powerpc/mm/mmu_context_hash64.c
index 59cd773..474b9af 100644
--- a/arch/powerpc/mm/mmu_context_hash64.c
+++ b/arch/powerpc/mm/mmu_context_hash64.c
@@ -86,6 +86,9 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
spin_lock_init(mm->context.cop_lockp);
#endif /* CONFIG_PPC_ICSWX */
+#ifdef CONFIG_PPC_64K_PAGES
+ INIT_LIST_HEAD(&mm->context.pgtable_list);
+#endif
return 0;
}
@@ -97,13 +100,37 @@ void __destroy_context(int context_id)
}
EXPORT_SYMBOL_GPL(__destroy_context);
+#ifdef CONFIG_PPC_64K_PAGES
+static void destroy_pagetable_list(struct mm_struct *mm)
+{
+ struct page *page;
+ struct list_head *item, *tmp;
+
+ list_for_each_safe(item, tmp, &mm->context.pgtable_list) {
+ page = list_entry(item, struct page, lru);
+ list_del(&page->lru);
+ pgtable_page_dtor(page);
+ atomic_set(&page->_mapcount, -1);
+ __free_page(page);
+ }
+}
+#else
+static inline void destroy_pagetable_list(struct mm_struct *mm)
+{
+ return;
+}
+#endif
+
void destroy_context(struct mm_struct *mm)
{
+
#ifdef CONFIG_PPC_ICSWX
drop_cop(mm->context.acop, mm);
kfree(mm->context.cop_lockp);
mm->context.cop_lockp = NULL;
#endif /* CONFIG_PPC_ICSWX */
+
+ destroy_pagetable_list(mm);
__destroy_context(mm->context.id);
subpage_prot_free(mm);
mm->context.id = MMU_NO_CONTEXT;
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index e212a27..ec80314 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -69,6 +69,7 @@
unsigned long ioremap_bot = IOREMAP_BASE;
#ifdef CONFIG_PPC_MMU_NOHASH
+/* FIXME!! */
static void *early_alloc_pgtable(unsigned long size)
{
void *pt;
@@ -337,3 +338,191 @@ EXPORT_SYMBOL(__ioremap_at);
EXPORT_SYMBOL(iounmap);
EXPORT_SYMBOL(__iounmap);
EXPORT_SYMBOL(__iounmap_at);
+
+#ifdef CONFIG_PPC_64K_PAGES
+/*
+ * we support 15 fragments per PTE page. This is limited by how many
+ * bits we can pack in page->_mapcount. We use the first half for
+ * tracking the usage for rcu page table free.
+ */
+#define FRAG_MASK_BITS 15
+#define FRAG_MASK ((1 << FRAG_MASK_BITS) - 1)
+/*
+ * We use a 2K PTE page fragment and another 2K for storing
+ * real_pte_t hash index
+ */
+#define PTE_FRAG_SIZE (2 * PTRS_PER_PTE * sizeof(pte_t))
+
+static inline unsigned int atomic_xor_bits(atomic_t *v, unsigned int bits)
+{
+ unsigned int old, new;
+
+ do {
+ old = atomic_read(v);
+ new = old ^ bits;
+ } while (atomic_cmpxchg(v, old, new) != old);
+ return new;
+}
+
+unsigned long *page_table_alloc(struct mm_struct *mm, unsigned long vmaddr)
+{
+ struct page *page;
+ unsigned int mask, bit;
+ unsigned long *table;
+
+ /* Allocate fragments of a 4K page as 1K/2K page table */
+ spin_lock(&mm->page_table_lock);
+ mask = FRAG_MASK;
+ if (!list_empty(&mm->context.pgtable_list)) {
+ page = list_first_entry(&mm->context.pgtable_list,
+ struct page, lru);
+ table = (unsigned long *) page_address(page);
+ mask = atomic_read(&page->_mapcount);
+ /*
+ * Update with the higher order mask bits accumulated,
+ * added as a part of rcu free.
+ */
+ mask = mask | (mask >> FRAG_MASK_BITS);
+ }
+ if ((mask & FRAG_MASK) == FRAG_MASK) {
+ spin_unlock(&mm->page_table_lock);
+ page = alloc_page(GFP_KERNEL|__GFP_REPEAT);
+ if (!page)
+ return NULL;
+ pgtable_page_ctor(page);
+ atomic_set(&page->_mapcount, 1);
+ table = (unsigned long *) page_address(page);
+ spin_lock(&mm->page_table_lock);
+ INIT_LIST_HEAD(&page->lru);
+ list_add(&page->lru, &mm->context.pgtable_list);
+ } else {
+ /* The second half is used for real_pte_t hindex */
+ for (bit = 1; mask & bit; bit <<= 1)
+ table = (unsigned long *)((char *)table + PTE_FRAG_SIZE);
+
+ mask = atomic_xor_bits(&page->_mapcount, bit);
+ /*
+ * We have taken up all the space, remove this from
+ * the list, we will add it back when we have a free slot
+ */
+ if ((mask & FRAG_MASK) == FRAG_MASK)
+ list_del_init(&page->lru);
+ }
+ spin_unlock(&mm->page_table_lock);
+ /*
+ * zero out the newly allocated area, this make sure we don't
+ * see the old left over pte values
+ */
+ memset(table, 0, PTE_FRAG_SIZE);
+ return table;
+}
+
+void page_table_free(struct mm_struct *mm, unsigned long *table)
+{
+ struct page *page;
+ unsigned int bit, mask;
+
+ /* Free 2K page table fragment of a 64K page */
+ page = virt_to_page(table);
+ bit = 1 << ((__pa(table) & ~PAGE_MASK) / PTE_FRAG_SIZE);
+ spin_lock(&mm->page_table_lock);
+ mask = atomic_xor_bits(&page->_mapcount, bit);
+ if (mask == 0)
+ list_del(&page->lru);
+ else if (mask & FRAG_MASK) {
+ /*
+ * Add the page table page to pgtable_list so that
+ * the free fragment can be used by the next alloc
+ */
+ list_del_init(&page->lru);
+ list_add(&page->lru, &mm->context.pgtable_list);
+ }
+ spin_unlock(&mm->page_table_lock);
+ if (mask == 0) {
+ pgtable_page_dtor(page);
+ atomic_set(&page->_mapcount, -1);
+ __free_page(page);
+ }
+}
+
+#ifdef CONFIG_SMP
+static void __page_table_free_rcu(void *table)
+{
+ unsigned int bit;
+ struct page *page;
+ /*
+ * this is a PTE page free 2K page table
+ * fragment of a 64K page.
+ */
+ page = virt_to_page(table);
+ bit = 1 << ((__pa(table) & ~PAGE_MASK) / PTE_FRAG_SIZE);
+ bit <<= FRAG_MASK_BITS;
+ /*
+ * clear the higher half and if nobody used the page in
+ * between, even lower half would be zero.
+ */
+ if (atomic_xor_bits(&page->_mapcount, bit) == 0) {
+ pgtable_page_dtor(page);
+ atomic_set(&page->_mapcount, -1);
+ __free_page(page);
+ }
+}
+
+static void page_table_free_rcu(struct mmu_gather *tlb, unsigned long *table)
+{
+ struct page *page;
+ struct mm_struct *mm;
+ unsigned int bit, mask;
+
+ mm = tlb->mm;
+ /* Free 2K page table fragment of a 64K page */
+ page = virt_to_page(table);
+ bit = 1 << ((__pa(table) & ~PAGE_MASK) / PTE_FRAG_SIZE);
+ spin_lock(&mm->page_table_lock);
+ /*
+ * stash the actual mask in higher half, and clear the lower half
+ * and selectively, add remove from pgtable list
+ */
+ mask = atomic_xor_bits(&page->_mapcount, bit | (bit << FRAG_MASK_BITS));
+ if (!(mask & FRAG_MASK))
+ list_del(&page->lru);
+ else {
+ /*
+ * Add the page table page to pgtable_list so that
+ * the free fragment can be used by the next alloc
+ */
+ list_del_init(&page->lru);
+ list_add_tail(&page->lru, &mm->context.pgtable_list);
+ }
+ spin_unlock(&mm->page_table_lock);
+ tlb_remove_table(tlb, table);
+}
+
+void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
+{
+ unsigned long pgf = (unsigned long)table;
+
+ BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+ pgf |= shift;
+ if (shift == 0)
+ /* PTE page needs special handling */
+ page_table_free_rcu(tlb, table);
+ else
+ tlb_remove_table(tlb, (void *)pgf);
+}
+
+void __tlb_remove_table(void *_table)
+{
+ void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
+ unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
+
+ if (!shift)
+ /* PTE page needs special handling */
+ __page_table_free_rcu(table);
+ else {
+ BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+ kmem_cache_free(PGT_CACHE(shift), table);
+ }
+}
+#endif
+#endif /* CONFIG_PPC_64K_PAGES */
--
1.7.10
^ permalink raw reply related
* [RFC PATCH -V2 06/21] powerpc: Add size argument to pgtable_cache_add
From: Aneesh Kumar K.V @ 2013-02-21 16:47 UTC (permalink / raw)
To: benh, paulus; +Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1361465248-10867-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
We will use this later with the THP changes. With THP we want to create a PMD with
twice the size. The second half will be used to deposit the pgtable, which will
carry the hpte hash index value.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
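To make the size relationship concrete: the inline wrapper keeps the old behaviour by passing sizeof(void *) << shift, while a later caller could pass a larger size for the same index. The sketch below only shows that arithmetic. doubled_table_size() is a hypothetical example of such a caller and is not part of this patch.

```c
#include <stddef.h>

/* Size the wrapper passes through, as pgtable_cache_add() used to compute it. */
static size_t default_table_size(unsigned int shift)
{
    return sizeof(void *) << shift;
}

/* Hypothetical: a caller asking for a table twice the default size. */
static size_t doubled_table_size(unsigned int shift)
{
    return 2 * default_table_size(shift);
}
```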
arch/powerpc/include/asm/pgtable-ppc64.h | 7 ++++++-
arch/powerpc/mm/init_64.c | 16 ++++++++--------
2 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index 0182c20..658ba7c 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -338,8 +338,13 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
#define pgoff_to_pte(off) ((pte_t) {((off) << PTE_RPN_SHIFT)|_PAGE_FILE})
#define PTE_FILE_MAX_BITS (BITS_PER_LONG - PTE_RPN_SHIFT)
-void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
+extern void __pgtable_cache_add(unsigned index, unsigned long table_size,
+ void (*ctor)(void *));
void pgtable_cache_init(void);
+static inline void pgtable_cache_add(unsigned shift, void (*ctor)(void *))
+{
+ return __pgtable_cache_add(shift, sizeof(void *) << shift, ctor);
+}
/*
* find_linux_pte returns the address of a linux pte for a given
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 95a4529..b378438 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -100,10 +100,10 @@ struct kmem_cache *pgtable_cache[MAX_PGTABLE_INDEX_SIZE];
* everything else. Caches created by this function are used for all
* the higher level pagetables, and for hugepage pagetables.
*/
-void pgtable_cache_add(unsigned shift, void (*ctor)(void *))
+void __pgtable_cache_add(unsigned int index, unsigned long table_size,
+ void (*ctor)(void *))
{
char *name;
- unsigned long table_size = sizeof(void *) << shift;
unsigned long align = table_size;
/* When batching pgtable pointers for RCU freeing, we store
@@ -111,7 +111,7 @@ void pgtable_cache_add(unsigned shift, void (*ctor)(void *))
* big enough to fit it.
*
* Likewise, hugeapge pagetable pointers contain a (different)
- * shift value in the low bits. All tables must be aligned so
+ * huge page size in the low bits. All tables must be aligned so
* as to leave enough 0 bits in the address to contain it. */
unsigned long minalign = max(MAX_PGTABLE_INDEX_SIZE + 1,
HUGEPD_SHIFT_MASK + 1);
@@ -121,17 +121,17 @@ void pgtable_cache_add(unsigned shift, void (*ctor)(void *))
* moment, gcc doesn't seem to recognize is_power_of_2 as a
* constant expression, so so much for that. */
BUG_ON(!is_power_of_2(minalign));
- BUG_ON((shift < 1) || (shift > MAX_PGTABLE_INDEX_SIZE));
+ BUG_ON((index < 1) || (index > MAX_PGTABLE_INDEX_SIZE));
- if (PGT_CACHE(shift))
+ if (PGT_CACHE(index))
return; /* Already have a cache of this size */
align = max_t(unsigned long, align, minalign);
- name = kasprintf(GFP_KERNEL, "pgtable-2^%d", shift);
+ name = kasprintf(GFP_KERNEL, "pgtable-2^%d", index);
new = kmem_cache_create(name, table_size, align, 0, ctor);
- PGT_CACHE(shift) = new;
+ PGT_CACHE(index) = new;
- pr_debug("Allocated pgtable cache for order %d\n", shift);
+ pr_debug("Allocated pgtable cache for order %d\n", index);
}
--
1.7.10
^ permalink raw reply related
* [RFC PATCH -V2 08/21] powerpc: Decode the pte-lp-encoding bits correctly.
From: Aneesh Kumar K.V @ 2013-02-21 16:47 UTC (permalink / raw)
To: benh, paulus; +Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1361465248-10867-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
We look at both the segment base page size and actual page size and store
the pte-lp-encodings in an array per base page size.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
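Note (not part of the commit): a minimal userspace sketch of the lookup this
patch introduces, where the LP encoding is selected by base page size first
and actual page size second. The enum, the table, and encode_r() are
hypothetical stand-ins for MMU_PAGE_*, mmu_psize_defs[], and hpte_encode_r().
The penc values are illustrative only; on real hardware they come from the
device tree.

```c
#include <stdint.h>

/* Hypothetical page size indices standing in for MMU_PAGE_*. */
enum { PAGE_4K, PAGE_64K, PAGE_16M, PAGE_COUNT };

#define LP_SHIFT 12

struct psize_def {
	unsigned int shift;            /* log2 of the page size */
	unsigned int penc[PAGE_COUNT]; /* LP encoding per actual page size */
};

/* Illustrative values only, not taken from this thread. */
static const struct psize_def defs[PAGE_COUNT] = {
	[PAGE_4K]  = { .shift = 12, .penc = { [PAGE_4K] = 0 } },
	[PAGE_64K] = { .shift = 16, .penc = { [PAGE_64K] = 1, [PAGE_16M] = 8 } },
	[PAGE_16M] = { .shift = 24, .penc = { [PAGE_16M] = 0 } },
};

/*
 * Build the second HPTE doubleword: align pa to the actual page size,
 * then OR in the encoding stored under the base page size.
 */
static uint64_t encode_r(uint64_t pa, int base, int actual)
{
	if (actual == PAGE_4K)
		return pa; /* the kernel additionally masks with HPTE_R_RPN */

	uint64_t mask = (UINT64_C(1) << defs[actual].shift) - 1;

	return (pa & ~mask) | ((uint64_t)defs[base].penc[actual] << LP_SHIFT);
}
```

With a single penc per page size, the 64K-base/16M-actual case could not be
represented; indexing by both sizes is what makes that combination expressible.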
arch/powerpc/include/asm/machdep.h | 3 +-
arch/powerpc/include/asm/mmu-hash64.h | 30 +++++----
arch/powerpc/kvm/book3s_hv.c | 7 ++-
arch/powerpc/mm/hash_low_64.S | 18 ++++--
arch/powerpc/mm/hash_native_64.c | 85 ++++++++++++++++---------
arch/powerpc/mm/hash_utils_64.c | 103 +++++++++++++++++++------------
arch/powerpc/mm/hugetlbpage-hash64.c | 4 +-
arch/powerpc/platforms/cell/beat_htab.c | 16 ++---
arch/powerpc/platforms/ps3/htab.c | 6 +-
arch/powerpc/platforms/pseries/lpar.c | 6 +-
10 files changed, 174 insertions(+), 104 deletions(-)
diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 19d9d96..6cee6e0 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -50,7 +50,8 @@ struct machdep_calls {
unsigned long prpn,
unsigned long rflags,
unsigned long vflags,
- int psize, int ssize);
+ int psize, int apsize,
+ int ssize);
long (*hpte_remove)(unsigned long hpte_group);
void (*hpte_removebolted)(unsigned long ea,
int psize, int ssize);
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index c3b3518..c7bc181 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -154,7 +154,7 @@ extern unsigned long htab_hash_mask;
struct mmu_psize_def
{
unsigned int shift; /* number of bits */
- unsigned int penc; /* HPTE encoding */
+ unsigned int penc[MMU_PAGE_COUNT]; /* HPTE encoding */
unsigned int tlbiel; /* tlbiel supported for that page size */
unsigned long avpnm; /* bits to mask out in AVPN in the HPTE */
unsigned long sllp; /* SLB L||LP (exact mask to use in slbmte) */
@@ -181,6 +181,13 @@ struct mmu_psize_def
*/
#define VPN_SHIFT 12
+/*
+ * HPTE LP details
+ */
+#define LP_SHIFT 12
+#define LP_BITS 8
+#define LP_MASK(i) ((0xFF >> (i)) << LP_SHIFT)
+
#ifndef __ASSEMBLY__
static inline int segment_shift(int ssize)
@@ -237,14 +244,14 @@ static inline unsigned long hpte_encode_avpn(unsigned long vpn, int psize,
/*
* This function sets the AVPN and L fields of the HPTE appropriately
- * for the page size
+ * using the base page size and actual page size.
*/
-static inline unsigned long hpte_encode_v(unsigned long vpn,
- int psize, int ssize)
+static inline unsigned long hpte_encode_v(unsigned long vpn, int base_psize,
+ int actual_psize, int ssize)
{
unsigned long v;
- v = hpte_encode_avpn(vpn, psize, ssize);
- if (psize != MMU_PAGE_4K)
+ v = hpte_encode_avpn(vpn, base_psize, ssize);
+ if (actual_psize != MMU_PAGE_4K)
v |= HPTE_V_LARGE;
return v;
}
@@ -254,17 +261,18 @@ static inline unsigned long hpte_encode_v(unsigned long vpn,
* for the page size. We assume the pa is already "clean" that is properly
* aligned for the requested page size
*/
-static inline unsigned long hpte_encode_r(unsigned long pa, int psize)
+static inline unsigned long hpte_encode_r(unsigned long pa, int base_psize,
+ int actual_psize)
{
unsigned long r;
/* A 4K page needs no special encoding */
- if (psize == MMU_PAGE_4K)
+ if (actual_psize == MMU_PAGE_4K)
return pa & HPTE_R_RPN;
else {
- unsigned int penc = mmu_psize_defs[psize].penc;
- unsigned int shift = mmu_psize_defs[psize].shift;
- return (pa & ~((1ul << shift) - 1)) | (penc << 12);
+ unsigned int penc = mmu_psize_defs[base_psize].penc[actual_psize];
+ unsigned int shift = mmu_psize_defs[actual_psize].shift;
+ return (pa & ~((1ul << shift) - 1)) | (penc << LP_SHIFT);
}
return r;
}
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 71d0c90..d2c9932 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1515,7 +1515,12 @@ static void kvmppc_add_seg_page_size(struct kvm_ppc_one_seg_page_size **sps,
(*sps)->page_shift = def->shift;
(*sps)->slb_enc = def->sllp;
(*sps)->enc[0].page_shift = def->shift;
- (*sps)->enc[0].pte_enc = def->penc;
+ /*
+ * FIXME!!
+ * This is returned to user space. Do we need to
+ * return details of MPSS here ?
+ */
+ (*sps)->enc[0].pte_enc = def->penc[linux_psize];
(*sps)++;
}
diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
index abdd5e2..0e980ac 100644
--- a/arch/powerpc/mm/hash_low_64.S
+++ b/arch/powerpc/mm/hash_low_64.S
@@ -196,7 +196,8 @@ htab_insert_pte:
mr r4,r29 /* Retrieve vpn */
li r7,0 /* !bolted, !secondary */
li r8,MMU_PAGE_4K /* page size */
- ld r9,STK_PARAM(R9)(r1) /* segment size */
+ li r9,MMU_PAGE_4K /* actual page size */
+ ld r10,STK_PARAM(R9)(r1) /* segment size */
_GLOBAL(htab_call_hpte_insert1)
bl . /* Patched by htab_finish_init() */
cmpdi 0,r3,0
@@ -219,7 +220,8 @@ _GLOBAL(htab_call_hpte_insert1)
mr r4,r29 /* Retrieve vpn */
li r7,HPTE_V_SECONDARY /* !bolted, secondary */
li r8,MMU_PAGE_4K /* page size */
- ld r9,STK_PARAM(R9)(r1) /* segment size */
+ li r9,MMU_PAGE_4K /* actual page size */
+ ld r10,STK_PARAM(R9)(r1) /* segment size */
_GLOBAL(htab_call_hpte_insert2)
bl . /* Patched by htab_finish_init() */
cmpdi 0,r3,0
@@ -515,7 +517,8 @@ htab_special_pfn:
mr r4,r29 /* Retrieve vpn */
li r7,0 /* !bolted, !secondary */
li r8,MMU_PAGE_4K /* page size */
- ld r9,STK_PARAM(R9)(r1) /* segment size */
+ li r9,MMU_PAGE_4K /* actual page size */
+ ld r10,STK_PARAM(R9)(r1) /* segment size */
_GLOBAL(htab_call_hpte_insert1)
bl . /* patched by htab_finish_init() */
cmpdi 0,r3,0
@@ -542,7 +545,8 @@ _GLOBAL(htab_call_hpte_insert1)
mr r4,r29 /* Retrieve vpn */
li r7,HPTE_V_SECONDARY /* !bolted, secondary */
li r8,MMU_PAGE_4K /* page size */
- ld r9,STK_PARAM(R9)(r1) /* segment size */
+ li r9,MMU_PAGE_4K /* actual page size */
+ ld r10,STK_PARAM(R9)(r1) /* segment size */
_GLOBAL(htab_call_hpte_insert2)
bl . /* patched by htab_finish_init() */
cmpdi 0,r3,0
@@ -840,7 +844,8 @@ ht64_insert_pte:
mr r4,r29 /* Retrieve vpn */
li r7,0 /* !bolted, !secondary */
li r8,MMU_PAGE_64K
- ld r9,STK_PARAM(R9)(r1) /* segment size */
+ li r9,MMU_PAGE_64K /* actual page size */
+ ld r10,STK_PARAM(R9)(r1) /* segment size */
_GLOBAL(ht64_call_hpte_insert1)
bl . /* patched by htab_finish_init() */
cmpdi 0,r3,0
@@ -863,7 +868,8 @@ _GLOBAL(ht64_call_hpte_insert1)
mr r4,r29 /* Retrieve vpn */
li r7,HPTE_V_SECONDARY /* !bolted, secondary */
li r8,MMU_PAGE_64K
- ld r9,STK_PARAM(R9)(r1) /* segment size */
+ li r9,MMU_PAGE_64K /* actual page size */
+ ld r10,STK_PARAM(R9)(r1) /* segment size */
_GLOBAL(ht64_call_hpte_insert2)
bl . /* patched by htab_finish_init() */
cmpdi 0,r3,0
diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index 9d8983a..3d30b23 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -39,7 +39,7 @@
DEFINE_RAW_SPINLOCK(native_tlbie_lock);
-static inline void __tlbie(unsigned long vpn, int psize, int ssize)
+static inline void __tlbie(unsigned long vpn, int psize, int apsize, int ssize)
{
unsigned long va;
unsigned int penc;
@@ -68,7 +68,7 @@ static inline void __tlbie(unsigned long vpn, int psize, int ssize)
break;
default:
/* We need 14 to 14 + i bits of va */
- penc = mmu_psize_defs[psize].penc;
+ penc = mmu_psize_defs[psize].penc[apsize];
va &= ~((1ul << mmu_psize_defs[psize].shift) - 1);
va |= penc << 12;
va |= ssize << 8;
@@ -80,7 +80,7 @@ static inline void __tlbie(unsigned long vpn, int psize, int ssize)
}
}
-static inline void __tlbiel(unsigned long vpn, int psize, int ssize)
+static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize)
{
unsigned long va;
unsigned int penc;
@@ -102,7 +102,7 @@ static inline void __tlbiel(unsigned long vpn, int psize, int ssize)
break;
default:
/* We need 14 to 14 + i bits of va */
- penc = mmu_psize_defs[psize].penc;
+ penc = mmu_psize_defs[psize].penc[apsize];
va &= ~((1ul << mmu_psize_defs[psize].shift) - 1);
va |= penc << 12;
va |= ssize << 8;
@@ -114,7 +114,8 @@ static inline void __tlbiel(unsigned long vpn, int psize, int ssize)
}
-static inline void tlbie(unsigned long vpn, int psize, int ssize, int local)
+static inline void tlbie(unsigned long vpn, int psize, int apsize,
+ int ssize, int local)
{
unsigned int use_local = local && mmu_has_feature(MMU_FTR_TLBIEL);
int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE);
@@ -125,10 +126,10 @@ static inline void tlbie(unsigned long vpn, int psize, int ssize, int local)
raw_spin_lock(&native_tlbie_lock);
asm volatile("ptesync": : :"memory");
if (use_local) {
- __tlbiel(vpn, psize, ssize);
+ __tlbiel(vpn, psize, apsize, ssize);
asm volatile("ptesync": : :"memory");
} else {
- __tlbie(vpn, psize, ssize);
+ __tlbie(vpn, psize, apsize, ssize);
asm volatile("eieio; tlbsync; ptesync": : :"memory");
}
if (lock_tlbie && !use_local)
@@ -156,7 +157,7 @@ static inline void native_unlock_hpte(struct hash_pte *hptep)
static long native_hpte_insert(unsigned long hpte_group, unsigned long vpn,
unsigned long pa, unsigned long rflags,
- unsigned long vflags, int psize, int ssize)
+ unsigned long vflags, int psize, int apsize, int ssize)
{
struct hash_pte *hptep = htab_address + hpte_group;
unsigned long hpte_v, hpte_r;
@@ -183,8 +184,8 @@ static long native_hpte_insert(unsigned long hpte_group, unsigned long vpn,
if (i == HPTES_PER_GROUP)
return -1;
- hpte_v = hpte_encode_v(vpn, psize, ssize) | vflags | HPTE_V_VALID;
- hpte_r = hpte_encode_r(pa, psize) | rflags;
+ hpte_v = hpte_encode_v(vpn, psize, apsize, ssize) | vflags | HPTE_V_VALID;
+ hpte_r = hpte_encode_r(pa, psize, apsize) | rflags;
if (!(vflags & HPTE_V_BOLTED)) {
DBG_LOW(" i=%x hpte_v=%016lx, hpte_r=%016lx\n",
@@ -244,6 +245,30 @@ static long native_hpte_remove(unsigned long hpte_group)
return i;
}
+static inline int hpte_actual_psize(struct hash_pte *hptep, int psize)
+{
+ unsigned int mask;
+ int i, penc, shift;
+ /* Look at the 8 bit LP value */
+ unsigned int lp = (hptep->r >> LP_SHIFT) & ((1 << (LP_BITS + 1)) - 1);
+
+ penc = 0;
+ for (i = 0; i < MMU_PAGE_COUNT; i++) {
+ /* valid entries have a shift value */
+ if (!mmu_psize_defs[i].shift)
+ continue;
+
+ /* encoding bits per actual page size */
+ shift = mmu_psize_defs[i].shift - 11;
+ if (shift > 9)
+ shift = 9;
+ mask = (1 << shift) - 1;
+ if ((lp & mask) == mmu_psize_defs[psize].penc[i])
+ return i;
+ }
+ return -1;
+}
+
static long native_hpte_updatepp(unsigned long slot, unsigned long newpp,
unsigned long vpn, int psize, int ssize,
int local)
@@ -251,6 +276,7 @@ static long native_hpte_updatepp(unsigned long slot, unsigned long newpp,
struct hash_pte *hptep = htab_address + slot;
unsigned long hpte_v, want_v;
int ret = 0;
+ int actual_psize;
want_v = hpte_encode_avpn(vpn, psize, ssize);
@@ -260,6 +286,7 @@ static long native_hpte_updatepp(unsigned long slot, unsigned long newpp,
native_lock_hpte(hptep);
hpte_v = hptep->v;
+ actual_psize = hpte_actual_psize(hptep, psize);
/* Even if we miss, we need to invalidate the TLB */
if (!HPTE_V_COMPARE(hpte_v, want_v) || !(hpte_v & HPTE_V_VALID)) {
@@ -274,7 +301,7 @@ static long native_hpte_updatepp(unsigned long slot, unsigned long newpp,
native_unlock_hpte(hptep);
/* Ensure it is out of the tlb too. */
- tlbie(vpn, psize, ssize, local);
+ tlbie(vpn, psize, actual_psize, ssize, local);
return ret;
}
@@ -315,6 +342,7 @@ static long native_hpte_find(unsigned long vpn, int psize, int ssize)
static void native_hpte_updateboltedpp(unsigned long newpp, unsigned long ea,
int psize, int ssize)
{
+ int actual_psize;
unsigned long vpn;
unsigned long vsid;
long slot;
@@ -327,13 +355,14 @@ static void native_hpte_updateboltedpp(unsigned long newpp, unsigned long ea,
if (slot == -1)
panic("could not find page to bolt\n");
hptep = htab_address + slot;
+ actual_psize = hpte_actual_psize(hptep, psize);
/* Update the HPTE */
hptep->r = (hptep->r & ~(HPTE_R_PP | HPTE_R_N)) |
(newpp & (HPTE_R_PP | HPTE_R_N));
/* Ensure it is out of the tlb too. */
- tlbie(vpn, psize, ssize, 0);
+ tlbie(vpn, psize, actual_psize, ssize, 0);
}
static void native_hpte_invalidate(unsigned long slot, unsigned long vpn,
@@ -343,6 +372,7 @@ static void native_hpte_invalidate(unsigned long slot, unsigned long vpn,
unsigned long hpte_v;
unsigned long want_v;
unsigned long flags;
+ int actual_psize;
local_irq_save(flags);
@@ -352,6 +382,7 @@ static void native_hpte_invalidate(unsigned long slot, unsigned long vpn,
native_lock_hpte(hptep);
hpte_v = hptep->v;
+ actual_psize = hpte_actual_psize(hptep, psize);
/* Even if we miss, we need to invalidate the TLB */
if (!HPTE_V_COMPARE(hpte_v, want_v) || !(hpte_v & HPTE_V_VALID))
native_unlock_hpte(hptep);
@@ -360,23 +391,19 @@ static void native_hpte_invalidate(unsigned long slot, unsigned long vpn,
hptep->v = 0;
/* Invalidate the TLB */
- tlbie(vpn, psize, ssize, local);
+ tlbie(vpn, psize, actual_psize, ssize, local);
local_irq_restore(flags);
}
-#define LP_SHIFT 12
-#define LP_BITS 8
-#define LP_MASK(i) ((0xFF >> (i)) << LP_SHIFT)
-
static void hpte_decode(struct hash_pte *hpte, unsigned long slot,
- int *psize, int *ssize, unsigned long *vpn)
+ int *psize, int *apsize, int *ssize, unsigned long *vpn)
{
unsigned long avpn, pteg, vpi;
unsigned long hpte_r = hpte->r;
unsigned long hpte_v = hpte->v;
unsigned long vsid, seg_off;
- int i, size, shift, penc;
+ int i, size, a_size = MMU_PAGE_4K, shift, penc;
if (!(hpte_v & HPTE_V_LARGE))
size = MMU_PAGE_4K;
@@ -395,12 +422,13 @@ static void hpte_decode(struct hash_pte *hpte, unsigned long slot,
/* valid entries have a shift value */
if (!mmu_psize_defs[size].shift)
continue;
-
- if (penc == mmu_psize_defs[size].penc)
- break;
+ for (a_size = 0; a_size < MMU_PAGE_COUNT; a_size++)
+ if (penc == mmu_psize_defs[size].penc[a_size])
+ goto out;
}
}
+out:
/* This works for all page sizes, and for 256M and 1T segments */
*ssize = hpte_v >> HPTE_V_SSIZE_SHIFT;
shift = mmu_psize_defs[size].shift;
@@ -433,7 +461,8 @@ static void hpte_decode(struct hash_pte *hpte, unsigned long slot,
default:
*vpn = size = 0;
}
- *psize = size;
+ *psize = size;
+ *apsize = a_size;
}
/*
@@ -451,7 +480,7 @@ static void native_hpte_clear(void)
struct hash_pte *hptep = htab_address;
unsigned long hpte_v;
unsigned long pteg_count;
- int psize, ssize;
+ int psize, apsize, ssize;
pteg_count = htab_hash_mask + 1;
@@ -477,9 +506,9 @@ static void native_hpte_clear(void)
* already hold the native_tlbie_lock.
*/
if (hpte_v & HPTE_V_VALID) {
- hpte_decode(hptep, slot, &psize, &ssize, &vpn);
+ hpte_decode(hptep, slot, &psize, &apsize, &ssize, &vpn);
hptep->v = 0;
- __tlbie(vpn, psize, ssize);
+ __tlbie(vpn, psize, apsize, ssize);
}
}
@@ -540,7 +569,7 @@ static void native_flush_hash_range(unsigned long number, int local)
pte_iterate_hashed_subpages(pte, psize,
vpn, index, shift) {
- __tlbiel(vpn, psize, ssize);
+ __tlbiel(vpn, psize, psize, ssize);
} pte_iterate_hashed_end();
}
asm volatile("ptesync":::"memory");
@@ -557,7 +586,7 @@ static void native_flush_hash_range(unsigned long number, int local)
pte_iterate_hashed_subpages(pte, psize,
vpn, index, shift) {
- __tlbie(vpn, psize, ssize);
+ __tlbie(vpn, psize, psize, ssize);
} pte_iterate_hashed_end();
}
asm volatile("eieio; tlbsync; ptesync":::"memory");
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index bfeab83..48edb46 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -125,7 +125,7 @@ static struct mmu_psize_def mmu_psize_defaults_old[] = {
[MMU_PAGE_4K] = {
.shift = 12,
.sllp = 0,
- .penc = 0,
+ .penc[MMU_PAGE_4K] = 0,
.avpnm = 0,
.tlbiel = 0,
},
@@ -139,14 +139,14 @@ static struct mmu_psize_def mmu_psize_defaults_gp[] = {
[MMU_PAGE_4K] = {
.shift = 12,
.sllp = 0,
- .penc = 0,
+ .penc[MMU_PAGE_4K] = 0,
.avpnm = 0,
.tlbiel = 1,
},
[MMU_PAGE_16M] = {
.shift = 24,
.sllp = SLB_VSID_L,
- .penc = 0,
+ .penc[MMU_PAGE_16M] = 0,
.avpnm = 0x1UL,
.tlbiel = 0,
},
@@ -208,7 +208,7 @@ int htab_bolt_mapping(unsigned long vstart, unsigned long vend,
BUG_ON(!ppc_md.hpte_insert);
ret = ppc_md.hpte_insert(hpteg, vpn, paddr, tprot,
- HPTE_V_BOLTED, psize, ssize);
+ HPTE_V_BOLTED, psize, psize, ssize);
if (ret < 0)
break;
@@ -275,6 +275,30 @@ static void __init htab_init_seg_sizes(void)
of_scan_flat_dt(htab_dt_scan_seg_sizes, NULL);
}
+static int __init get_idx_from_shift(unsigned int shift)
+{
+ int idx = -1;
+
+ switch (shift) {
+ case 0xc:
+ idx = MMU_PAGE_4K;
+ break;
+ case 0x10:
+ idx = MMU_PAGE_64K;
+ break;
+ case 0x14:
+ idx = MMU_PAGE_1M;
+ break;
+ case 0x18:
+ idx = MMU_PAGE_16M;
+ break;
+ case 0x22:
+ idx = MMU_PAGE_16G;
+ break;
+ }
+ return idx;
+}
+
static int __init htab_dt_scan_page_sizes(unsigned long node,
const char *uname, int depth,
void *data)
@@ -294,60 +318,57 @@ static int __init htab_dt_scan_page_sizes(unsigned long node,
size /= 4;
cur_cpu_spec->mmu_features &= ~(MMU_FTR_16M_PAGE);
while(size > 0) {
- unsigned int shift = prop[0];
+ unsigned int base_shift = prop[0];
unsigned int slbenc = prop[1];
unsigned int lpnum = prop[2];
- unsigned int lpenc = 0;
struct mmu_psize_def *def;
- int idx = -1;
+ int idx, base_idx;
size -= 3; prop += 3;
- while(size > 0 && lpnum) {
- if (prop[0] == shift)
- lpenc = prop[1];
+ base_idx = get_idx_from_shift(base_shift);
+ if (base_idx < 0) {
+ /*
+ * skip the pte encoding also
+ */
prop += 2; size -= 2;
- lpnum--;
+ continue;
}
- switch(shift) {
- case 0xc:
- idx = MMU_PAGE_4K;
- break;
- case 0x10:
- idx = MMU_PAGE_64K;
- break;
- case 0x14:
- idx = MMU_PAGE_1M;
- break;
- case 0x18:
- idx = MMU_PAGE_16M;
+ def = &mmu_psize_defs[base_idx];
+ if (base_idx == MMU_PAGE_16M)
cur_cpu_spec->mmu_features |= MMU_FTR_16M_PAGE;
- break;
- case 0x22:
- idx = MMU_PAGE_16G;
- break;
- }
- if (idx < 0)
- continue;
- def = &mmu_psize_defs[idx];
- def->shift = shift;
- if (shift <= 23)
+
+ def->shift = base_shift;
+ if (base_shift <= 23)
def->avpnm = 0;
else
- def->avpnm = (1 << (shift - 23)) - 1;
+ def->avpnm = (1 << (base_shift - 23)) - 1;
def->sllp = slbenc;
- def->penc = lpenc;
- /* We don't know for sure what's up with tlbiel, so
+ /*
+ * We don't know for sure what's up with tlbiel, so
* for now we only set it for 4K and 64K pages
*/
- if (idx == MMU_PAGE_4K || idx == MMU_PAGE_64K)
+ if (base_idx == MMU_PAGE_4K || base_idx == MMU_PAGE_64K)
def->tlbiel = 1;
else
def->tlbiel = 0;
- DBG(" %d: shift=%02x, sllp=%04lx, avpnm=%08lx, "
- "tlbiel=%d, penc=%d\n",
- idx, shift, def->sllp, def->avpnm, def->tlbiel,
- def->penc);
+ while (size > 0 && lpnum) {
+ unsigned int shift = prop[0];
+ unsigned int penc = prop[1];
+
+ prop += 2; size -= 2;
+ lpnum--;
+
+ idx = get_idx_from_shift(shift);
+ if (idx < 0)
+ continue;
+
+ def->penc[idx] = penc;
+ DBG(" %d: shift=%02x, sllp=%04lx, "
+ "avpnm=%08lx, tlbiel=%d, penc=%d\n",
+ idx, shift, def->sllp, def->avpnm,
+ def->tlbiel, def->penc[idx]);
+ }
}
return 1;
}
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index cecad34..e0d52ee 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -103,7 +103,7 @@ repeat:
/* Insert into the hash table, primary slot */
slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags, 0,
- mmu_psize, ssize);
+ mmu_psize, mmu_psize, ssize);
/* Primary is full, try the secondary */
if (unlikely(slot == -1)) {
@@ -111,7 +111,7 @@ repeat:
HPTES_PER_GROUP) & ~0x7UL;
slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags,
HPTE_V_SECONDARY,
- mmu_psize, ssize);
+ mmu_psize, mmu_psize, ssize);
if (slot == -1) {
if (mftb() & 0x1)
hpte_group = ((hash & htab_hash_mask) *
diff --git a/arch/powerpc/platforms/cell/beat_htab.c b/arch/powerpc/platforms/cell/beat_htab.c
index 472f9a7..246e1d8 100644
--- a/arch/powerpc/platforms/cell/beat_htab.c
+++ b/arch/powerpc/platforms/cell/beat_htab.c
@@ -90,7 +90,7 @@ static inline unsigned int beat_read_mask(unsigned hpte_group)
static long beat_lpar_hpte_insert(unsigned long hpte_group,
unsigned long vpn, unsigned long pa,
unsigned long rflags, unsigned long vflags,
- int psize, int ssize)
+ int psize, int apsize, int ssize)
{
unsigned long lpar_rc;
u64 hpte_v, hpte_r, slot;
@@ -103,9 +103,9 @@ static long beat_lpar_hpte_insert(unsigned long hpte_group,
"rflags=%lx, vflags=%lx, psize=%d)\n",
hpte_group, va, pa, rflags, vflags, psize);
- hpte_v = hpte_encode_v(vpn, psize, MMU_SEGSIZE_256M) |
+ hpte_v = hpte_encode_v(vpn, psize, apsize, MMU_SEGSIZE_256M) |
vflags | HPTE_V_VALID;
- hpte_r = hpte_encode_r(pa, psize) | rflags;
+ hpte_r = hpte_encode_r(pa, psize, apsize) | rflags;
if (!(vflags & HPTE_V_BOLTED))
DBG_LOW(" hpte_v=%016lx, hpte_r=%016lx\n", hpte_v, hpte_r);
@@ -314,7 +314,7 @@ void __init hpte_init_beat(void)
static long beat_lpar_hpte_insert_v3(unsigned long hpte_group,
unsigned long vpn, unsigned long pa,
unsigned long rflags, unsigned long vflags,
- int psize, int ssize)
+ int psize, int apsize, int ssize)
{
unsigned long lpar_rc;
u64 hpte_v, hpte_r, slot;
@@ -327,9 +327,9 @@ static long beat_lpar_hpte_insert_v3(unsigned long hpte_group,
"rflags=%lx, vflags=%lx, psize=%d)\n",
hpte_group, vpn, pa, rflags, vflags, psize);
- hpte_v = hpte_encode_v(vpn, psize, MMU_SEGSIZE_256M) |
+ hpte_v = hpte_encode_v(vpn, psize, apsize, MMU_SEGSIZE_256M) |
vflags | HPTE_V_VALID;
- hpte_r = hpte_encode_r(pa, psize) | rflags;
+ hpte_r = hpte_encode_r(pa, psize, apsize) | rflags;
if (!(vflags & HPTE_V_BOLTED))
DBG_LOW(" hpte_v=%016lx, hpte_r=%016lx\n", hpte_v, hpte_r);
@@ -373,7 +373,7 @@ static long beat_lpar_hpte_updatepp_v3(unsigned long slot,
unsigned long pss;
want_v = hpte_encode_avpn(vpn, psize, MMU_SEGSIZE_256M);
- pss = (psize == MMU_PAGE_4K) ? -1UL : mmu_psize_defs[psize].penc;
+ pss = (psize == MMU_PAGE_4K) ? -1UL : mmu_psize_defs[psize].penc[psize];
DBG_LOW(" update: "
"avpnv=%016lx, slot=%016lx, psize: %d, newpp %016lx ... ",
@@ -403,7 +403,7 @@ static void beat_lpar_hpte_invalidate_v3(unsigned long slot, unsigned long vpn,
DBG_LOW(" inval : slot=%lx, vpn=%016lx, psize: %d, local: %d\n",
slot, vpn, psize, local);
want_v = hpte_encode_avpn(vpn, psize, MMU_SEGSIZE_256M);
- pss = (psize == MMU_PAGE_4K) ? -1UL : mmu_psize_defs[psize].penc;
+ pss = (psize == MMU_PAGE_4K) ? -1UL : mmu_psize_defs[psize].penc[psize];
lpar_rc = beat_invalidate_htab_entry3(0, slot, want_v, pss);
diff --git a/arch/powerpc/platforms/ps3/htab.c b/arch/powerpc/platforms/ps3/htab.c
index 07a4bba..44f06d2 100644
--- a/arch/powerpc/platforms/ps3/htab.c
+++ b/arch/powerpc/platforms/ps3/htab.c
@@ -45,7 +45,7 @@ static DEFINE_SPINLOCK(ps3_htab_lock);
static long ps3_hpte_insert(unsigned long hpte_group, unsigned long vpn,
unsigned long pa, unsigned long rflags, unsigned long vflags,
- int psize, int ssize)
+ int psize, int apsize, int ssize)
{
int result;
u64 hpte_v, hpte_r;
@@ -61,8 +61,8 @@ static long ps3_hpte_insert(unsigned long hpte_group, unsigned long vpn,
*/
vflags &= ~HPTE_V_SECONDARY;
- hpte_v = hpte_encode_v(vpn, psize, ssize) | vflags | HPTE_V_VALID;
- hpte_r = hpte_encode_r(ps3_mm_phys_to_lpar(pa), psize) | rflags;
+ hpte_v = hpte_encode_v(vpn, psize, apsize, ssize) | vflags | HPTE_V_VALID;
+ hpte_r = hpte_encode_r(ps3_mm_phys_to_lpar(pa), psize, apsize) | rflags;
spin_lock_irqsave(&ps3_htab_lock, flags);
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index a77c35b..3daced3 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -109,7 +109,7 @@ void vpa_init(int cpu)
static long pSeries_lpar_hpte_insert(unsigned long hpte_group,
unsigned long vpn, unsigned long pa,
unsigned long rflags, unsigned long vflags,
- int psize, int ssize)
+ int psize, int apsize, int ssize)
{
unsigned long lpar_rc;
unsigned long flags;
@@ -121,8 +121,8 @@ static long pSeries_lpar_hpte_insert(unsigned long hpte_group,
"pa=%016lx, rflags=%lx, vflags=%lx, psize=%d)\n",
hpte_group, vpn, pa, rflags, vflags, psize);
- hpte_v = hpte_encode_v(vpn, psize, ssize) | vflags | HPTE_V_VALID;
- hpte_r = hpte_encode_r(pa, psize) | rflags;
+ hpte_v = hpte_encode_v(vpn, psize, apsize, ssize) | vflags | HPTE_V_VALID;
+ hpte_r = hpte_encode_r(pa, psize, apsize) | rflags;
if (!(vflags & HPTE_V_BOLTED))
pr_devel(" hpte_v=%016lx, hpte_r=%016lx\n", hpte_v, hpte_r);
--
1.7.10
^ permalink raw reply related
* [RFC PATCH -V2 09/21] powerpc: Update tlbie/tlbiel as per ISA doc
From: Aneesh Kumar K.V @ 2013-02-21 16:47 UTC (permalink / raw)
To: benh, paulus; +Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1361465248-10867-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
This makes sure we handle multiple page size segments (MPSS) correctly.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
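Note (not part of the commit): the arithmetic of the large-page branch in the
diff below can be traced in userspace with a sketch like this one. The
function name and the mpss flag are hypothetical, and the initial
"vpn << VPN_SHIFT" step is assumed from the surrounding kernel code rather
than shown in this patch. It only mirrors the bit operations in the diff and
is not a statement of what the ISA requires.

```c
#include <stdint.h>

#define VPN_SHIFT 12

/*
 * Hypothetical helper mirroring the "default:" branch of __tlbie():
 * build the operand for a large-page invalidate.
 */
static uint64_t tlbie_rb_large(uint64_t vpn, unsigned int penc,
			       unsigned int ssize, int mpss)
{
	uint64_t va = vpn << VPN_SHIFT;

	va &= ~(UINT64_C(0xffff) << 48);         /* drop the top 16 bits */
	va &= ~((UINT64_C(1) << (64 - 44)) - 1); /* clear the low 20 bits */
	va |= (uint64_t)penc << 12;              /* LP encoding */
	va |= (uint64_t)ssize << 8;              /* segment size */
	if (mpss)
		va |= (vpn >> 2) & 0xfe;         /* AVAL, base != actual size */
	va |= 1;                                 /* L */
	return va;
}
```

Calling it with mpss set corresponds to the bpsize != apsize case in the
patch; the AVAL bits land in bits 1..7, so they do not collide with the
ssize field at bits 8..9.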
arch/powerpc/mm/hash_native_64.c | 52 +++++++++++++++++++++++++++++---------
1 file changed, 40 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index 3d30b23..3bc57e2 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -39,7 +39,7 @@
DEFINE_RAW_SPINLOCK(native_tlbie_lock);
-static inline void __tlbie(unsigned long vpn, int psize, int apsize, int ssize)
+static inline void __tlbie(unsigned long vpn, int bpsize, int apsize, int ssize)
{
unsigned long va;
unsigned int penc;
@@ -59,19 +59,33 @@ static inline void __tlbie(unsigned long vpn, int psize, int apsize, int ssize)
*/
va &= ~(0xffffULL << 48);
- switch (psize) {
+ switch (bpsize) {
case MMU_PAGE_4K:
+ /* clear out bits after (52) [0....52.....63] */
+ va &= ~((1ul << (64 - 52)) - 1);
va |= ssize << 8;
+ va |= mmu_psize_defs[apsize].sllp << 6;
asm volatile(ASM_FTR_IFCLR("tlbie %0,0", PPC_TLBIE(%1,%0), %2)
: : "r" (va), "r"(0), "i" (CPU_FTR_ARCH_206)
: "memory");
break;
default:
/* We need 14 to 14 + i bits of va */
- penc = mmu_psize_defs[psize].penc[apsize];
- va &= ~((1ul << mmu_psize_defs[psize].shift) - 1);
+ penc = mmu_psize_defs[bpsize].penc[apsize];
+ /* clear out bits after (44) [0....44.....63] */
+ va &= ~((1ul << (64 - 44)) - 1);
va |= penc << 12;
va |= ssize << 8;
+ /* Add AVAL part */
+ if (bpsize != apsize) {
+ /*
+ * MPSS, 64K base page size and 16MB large page size
+ * We don't need all the bits, but this seems to work.
+ * vpn covers up to 65 bits of va. (0...65) and we need
+ * 56..62 bits of va.
+ */
+ va |= ((vpn >> 2) & 0xfe);
+ }
va |= 1; /* L */
asm volatile(ASM_FTR_IFCLR("tlbie %0,1", PPC_TLBIE(%1,%0), %2)
: : "r" (va), "r"(0), "i" (CPU_FTR_ARCH_206)
@@ -80,7 +94,7 @@ static inline void __tlbie(unsigned long vpn, int psize, int apsize, int ssize)
}
}
-static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize)
+static inline void __tlbiel(unsigned long vpn, int bpsize, int apsize, int ssize)
{
unsigned long va;
unsigned int penc;
@@ -94,18 +108,32 @@ static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize)
*/
va &= ~(0xffffULL << 48);
- switch (psize) {
+ switch (bpsize) {
case MMU_PAGE_4K:
+ /* clear out bits after(52) [0....52.....63] */
+ va &= ~((1ul << (64 - 52)) - 1);
va |= ssize << 8;
+ va |= mmu_psize_defs[apsize].sllp << 6;
asm volatile(".long 0x7c000224 | (%0 << 11) | (0 << 21)"
: : "r"(va) : "memory");
break;
default:
/* We need 14 to 14 + i bits of va */
- penc = mmu_psize_defs[psize].penc[apsize];
- va &= ~((1ul << mmu_psize_defs[psize].shift) - 1);
+ penc = mmu_psize_defs[bpsize].penc[apsize];
+ /* clear out bits after(44) [0....44.....63] */
+ va &= ~((1ul << (64 - 44)) - 1);
va |= penc << 12;
va |= ssize << 8;
+ /* Add AVAL part */
+ if (bpsize != apsize) {
+ /*
+ * MPSS, 64K base page size and 16MB large page size
+ * We don't need all the bits, but this seems to work.
+ * vpn covers up to 65 bits of va. (0...65) and we need
+ * 56..62 bits of va.
+ */
+ va |= ((vpn >> 2) & 0xfe);
+ }
va |= 1; /* L */
asm volatile(".long 0x7c000224 | (%0 << 11) | (1 << 21)"
: : "r"(va) : "memory");
@@ -114,22 +142,22 @@ static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize)
}
-static inline void tlbie(unsigned long vpn, int psize, int apsize,
+static inline void tlbie(unsigned long vpn, int bpsize, int apsize,
int ssize, int local)
{
unsigned int use_local = local && mmu_has_feature(MMU_FTR_TLBIEL);
int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE);
if (use_local)
- use_local = mmu_psize_defs[psize].tlbiel;
+ use_local = mmu_psize_defs[bpsize].tlbiel;
if (lock_tlbie && !use_local)
raw_spin_lock(&native_tlbie_lock);
asm volatile("ptesync": : :"memory");
if (use_local) {
- __tlbiel(vpn, psize, apsize, ssize);
+ __tlbiel(vpn, bpsize, apsize, ssize);
asm volatile("ptesync": : :"memory");
} else {
- __tlbie(vpn, psize, apsize, ssize);
+ __tlbie(vpn, bpsize, apsize, ssize);
asm volatile("eieio; tlbsync; ptesync": : :"memory");
}
if (lock_tlbie && !use_local)
--
1.7.10
^ permalink raw reply related
* [RFC PATCH -V2 11/21] powerpc: Print page size info during boot
From: Aneesh Kumar K.V @ 2013-02-21 16:47 UTC (permalink / raw)
To: benh, paulus; +Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1361465248-10867-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
This gives a hint about the different base and actual page size
combinations supported by the platform.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/mm/hash_utils_64.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index df48ba5..a06b55a 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -314,7 +314,7 @@ static int __init htab_dt_scan_page_sizes(unsigned long node,
prop = (u32 *)of_get_flat_dt_prop(node,
"ibm,segment-page-sizes", &size);
if (prop != NULL) {
- DBG("Page sizes from device-tree:\n");
+ pr_info("Page sizes from device-tree:\n");
size /= 4;
cur_cpu_spec->mmu_features &= ~(MMU_FTR_16M_PAGE);
while(size > 0) {
@@ -364,10 +364,10 @@ static int __init htab_dt_scan_page_sizes(unsigned long node,
continue;
def->penc[idx] = penc;
- DBG(" %d: shift=%02x, sllp=%04lx, "
- "avpnm=%08lx, tlbiel=%d, penc=%d\n",
- idx, shift, def->sllp, def->avpnm,
- def->tlbiel, def->penc[idx]);
+ pr_info("base_shift=%d: shift=%d, sllp=0x%04lx,"
+ " avpnm=0x%08lx, tlbiel=%d, penc=%d\n",
+ base_shift, shift, def->sllp,
+ def->avpnm, def->tlbiel, def->penc[idx]);
}
}
return 1;
--
1.7.10
^ permalink raw reply related
* [RFC PATCH -V2 15/21] mm/THP: support for zerout withdraw.
From: Aneesh Kumar K.V @ 2013-02-21 16:47 UTC (permalink / raw)
To: benh, paulus; +Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1361465248-10867-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/s390/include/asm/pgtable.h | 6 ++++++
arch/sparc/include/asm/pgtable_64.h | 6 ++++++
include/asm-generic/pgtable.h | 9 +++++++++
mm/huge_memory.c | 7 ++++++-
4 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 883296e..2e8b7fe 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1238,6 +1238,12 @@ extern void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
#define __HAVE_ARCH_PGTABLE_WITHDRAW
extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
+static inline pgtable_t __pgtable_trans_huge_withdraw(struct mm_struct *mm,
+ pmd_t *pmdp, int tozero)
+{
+ return pgtable_trans_huge_withdraw(mm, pmdp);
+}
+
static inline int pmd_trans_splitting(pmd_t pmd)
{
return pmd_val(pmd) & _SEGMENT_ENTRY_SPLIT;
diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 4c86de2..0f57c61 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -858,6 +858,12 @@ extern void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
#define __HAVE_ARCH_PGTABLE_WITHDRAW
extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
+
+static inline pgtable_t __pgtable_trans_huge_withdraw(struct mm_struct *mm,
+ pmd_t *pmdp, int tozero)
+{
+ return pgtable_trans_huge_withdraw(mm, pmdp);
+}
#endif
/* Encode and de-code a swap entry */
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 6f87e9e..802eccc 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -169,6 +169,15 @@ extern void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
#ifndef __HAVE_ARCH_PGTABLE_WITHDRAW
extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
+/*
+ * Some archs use the deposited huge table internally. Request a
+ * zeroed/non-zeroed pgtable when withdrawing.
+ */
+static inline pgtable_t __pgtable_trans_huge_withdraw(struct mm_struct *mm,
+ pmd_t *pmdp, int tozero)
+{
+ return pgtable_trans_huge_withdraw(mm, pmdp);
+}
#endif
#ifndef __HAVE_ARCH_PMDP_INVALIDATE
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e91b763..2586994 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1380,7 +1380,12 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
struct page *page;
pgtable_t pgtable;
pmd_t orig_pmd;
- pgtable = pgtable_trans_huge_withdraw(tlb->mm, pmd);
+ /*
+ * Withdraw the pgtable without zeroing it out, because
+ * the following pmdp_get_and_clear will look at the
+ * pgtable contents on architectures like ppc64
+ */
+ pgtable = __pgtable_trans_huge_withdraw(tlb->mm, pmd, 0);
orig_pmd = pmdp_get_and_clear(tlb->mm, addr, pmd);
tlb_remove_pmd_tlb_entry(tlb, pmd, addr);
if (is_huge_zero_pmd(orig_pmd)) {
--
1.7.10
^ permalink raw reply related
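Since the patch above carries no description, a small userspace model in C may help show what the `tozero` argument is meant to control. This is only a sketch under assumptions: `model_pgtable`, `model_slot` and `model_withdraw` are hypothetical names, and the zeroing behaviour shown is what an architecture that reuses the deposited table might do, not something the generic, s390 or sparc wrappers in this patch implement (they ignore `tozero`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PGTABLE_WORDS 4 /* hypothetical size, for illustration only */

struct model_pgtable {
    unsigned long words[MODEL_PGTABLE_WORDS];
};

/* Stand-in for the per-pmd slot where a page table was deposited. */
struct model_slot {
    struct model_pgtable *deposited;
};

/*
 * Withdraw the deposited table. With tozero == 0 the contents are left
 * intact so the caller can still inspect them; otherwise they are cleared
 * before the table is handed back.
 */
static struct model_pgtable *model_withdraw(struct model_slot *slot, int tozero)
{
    struct model_pgtable *pgtable;

    assert(slot != NULL);
    pgtable = slot->deposited;
    slot->deposited = NULL;
    if (pgtable != NULL && tozero)
        memset(pgtable, 0, sizeof(*pgtable));
    return pgtable;
}
```

In this model, the zap_huge_pmd() call site corresponds to `model_withdraw(slot, 0)`: the table comes back with its contents preserved, so a later step that reads them still sees valid data.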