* [PATCH 4/4] powerpc: don't use module_init for non-modular core hugetlb code
From: Paul Gortmaker @ 2014-01-13 16:21 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras; +Cc: Paul Gortmaker, linuxppc-dev
In-Reply-To: <1389630113-7919-1-git-send-email-paul.gortmaker@windriver.com>
The hugetlbpage.o is obj-y (always built in). It will never
be modular, so using module_init as an alias for __initcall is
somewhat misleading.
Fix this up now, so that we can relocate module_init from
init.h into module.h in the future. If we don't do this, we'd
have to add module.h to obviously non-modular code, and that
would be a worse thing.
Note that direct use of __initcall is discouraged, vs. one
of the priority categorized subgroups. As __initcall gets
mapped onto device_initcall, our use of arch_initcall (which
makes sense for arch code) will thus change this registration
from level 6-device to level 3-arch (i.e. slightly earlier).
However, no impact of that small difference has been
observed during testing, nor is any expected.
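The ordering change described above can be sketched in plain user-space C. This is only an illustration under stated assumptions: the struct, the function run_position() and the "some_driver" entry are hypothetical, and array order stands in for link order within a level. Only the level numbers (3 for arch, 6 for device) come from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Level numbers as given in the commit message: arch = 3, device = 6. */
enum initcall_level { LEVEL_ARCH = 3, LEVEL_DEVICE = 6 };

/* Hypothetical record of one registered initcall. */
struct initcall {
	enum initcall_level level;
	const char *name;	/* label for illustration only */
};

/*
 * Return the 0-based position at which entry idx would run if all
 * entries were executed in ascending level order, keeping array order
 * (a stand-in for link order) inside a level.
 */
static size_t run_position(const struct initcall *calls, size_t n, size_t idx)
{
	size_t pos = 0;

	for (size_t i = 0; i < n; i++) {
		if (calls[i].level < calls[idx].level)
			pos++;
		else if (calls[i].level == calls[idx].level && i < idx)
			pos++;
	}
	return pos;
}
```

With one other level-6 entry registered first, hugetlbpage_init runs second while it sits at level 6 and first once it is moved to level 3.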
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
arch/powerpc/mm/hugetlbpage.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 90bb6d9409bf..d25c202420da 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -911,7 +911,7 @@ static int __init hugetlbpage_init(void)
return 0;
}
#endif
-module_init(hugetlbpage_init);
+arch_initcall(hugetlbpage_init);
void flush_dcache_icache_hugepage(struct page *page)
{
--
1.8.5.2
* [PATCH 1/4] powerpc: use device_initcall for registering rtc devices
From: Paul Gortmaker @ 2014-01-13 16:21 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras; +Cc: Paul Gortmaker, linuxppc-dev
In-Reply-To: <1389630113-7919-1-git-send-email-paul.gortmaker@windriver.com>
Currently these two RTC devices are in core platform code
where it is not possible for them to be modular. They will
never be modular, so using module_init as an alias for
__initcall can be somewhat misleading.
Fix this up now, so that we can relocate module_init from
init.h into module.h in the future. If we don't do this, we'd
have to add module.h to obviously non-modular code, and that
would be a worse thing.
Note that direct use of __initcall is discouraged, vs. one
of the priority categorized subgroups. As __initcall gets
mapped onto device_initcall, our use of device_initcall
directly in this change means that the runtime impact is
zero -- they will remain at level 6 in initcall ordering.
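The alias chain described above can be mimicked with mock macros. This is a sketch, not the kernel headers: the mock_ names are hypothetical, and each macro simply evaluates to a level number instead of emitting a section entry as the real macros do. The value 6 is the device level quoted above.

```c
#include <assert.h>

/* Hypothetical stand-ins: each macro evaluates to its initcall level. */
#define mock_device_initcall(fn)	6
/* For built-in code, __initcall is mapped onto device_initcall ... */
#define mock_initcall(fn)		mock_device_initcall(fn)
/* ... and module_init is an alias for __initcall. */
#define mock_module_init(fn)		mock_initcall(fn)

/* Level at which the RTC init would run before and after the change. */
static int rtc_level_before(void) { return mock_module_init(rtc_init); }
static int rtc_level_after(void)  { return mock_device_initcall(rtc_init); }
```

Both helpers return the same level, which is why replacing module_init with device_initcall is expected to have no runtime effect.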
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
arch/powerpc/kernel/time.c | 2 +-
arch/powerpc/platforms/ps3/time.c | 3 +--
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index afb1b56ef4fa..bee2bb2bbc75 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -1053,4 +1053,4 @@ static int __init rtc_init(void)
return PTR_ERR_OR_ZERO(pdev);
}
-module_init(rtc_init);
+device_initcall(rtc_init);
diff --git a/arch/powerpc/platforms/ps3/time.c b/arch/powerpc/platforms/ps3/time.c
index ce73ce865613..791c6142c4a7 100644
--- a/arch/powerpc/platforms/ps3/time.c
+++ b/arch/powerpc/platforms/ps3/time.c
@@ -92,5 +92,4 @@ static int __init ps3_rtc_init(void)
return PTR_ERR_OR_ZERO(pdev);
}
-
-module_init(ps3_rtc_init);
+device_initcall(ps3_rtc_init);
--
1.8.5.2
* [PATCH 0/4] remap non-modular uses of module_init properly
From: Paul Gortmaker @ 2014-01-13 16:21 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras; +Cc: Paul Gortmaker, linuxppc-dev
The goal is to move module_init/module_exit from init.h and into
module.h -- however in doing so, we uncover several instances in
powerpc code where module_init is used somewhat incorrectly by
non modular code, and a file that needs module.h but isn't sourcing
it. We need to make these fixups first, before changing the headers
so that we don't cause build failures.
The changes are largely inert, however we do cause a largely trivial
change in the initcall ordering -- that happens because module_init
is really device_initcall; and yet we shouldn't be using device_initcall
where clearly arch_initcall or subsys_initcall are more appropriate.
Boot tested on sbc8548 on powerpc next branch of today.
Paul Gortmaker (4):
powerpc: use device_initcall for registering rtc devices
powerpc: book3s kvm can be modular so it should use module.h
powerpc: use subsys_initcall for Freescale Local Bus
powerpc: don't use module_init for non-modular core hugetlb code
arch/powerpc/kernel/time.c | 2 +-
arch/powerpc/kvm/book3s.c | 2 +-
arch/powerpc/mm/hugetlbpage.c | 2 +-
arch/powerpc/platforms/ps3/time.c | 3 +--
arch/powerpc/sysdev/fsl_lbc.c | 2 +-
5 files changed, 5 insertions(+), 6 deletions(-)
--
1.8.5.2
* Re: [PATCH V4] powerpc: thp: Fix crash on mremap
From: Benjamin Herrenschmidt @ 2014-01-13 13:37 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: aarcange, linuxppc-dev, paulus, kirill.shutemov, linux-mm
In-Reply-To: <87wqi42p0f.fsf@linux.vnet.ibm.com>
On Mon, 2014-01-13 at 15:16 +0530, Aneesh Kumar K.V wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
>
> > On Mon, 2014-01-13 at 11:34 +0530, Aneesh Kumar K.V wrote:
> >> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> >>
> >> This patch fix the below crash
> >
> > Andrea, can you ack the generic bit please ?
> >
> > Thanks !
>
> Kirill A. Shutemov did ack an earlier version
>
> http://article.gmane.org/gmane.linux.kernel.mm/111368
Doesn't help. If I'm going to send Linus a patch with a generic change
like that, I need an ack of that exact version of the change by a senior
mm person such as Andrea.
Cheers,
Ben.
* Re: [PATCH V4] powerpc: thp: Fix crash on mremap
From: Aneesh Kumar K.V @ 2014-01-13 9:46 UTC (permalink / raw)
To: Benjamin Herrenschmidt, aarcange
Cc: aarcange, linuxppc-dev, paulus, kirill.shutemov, linux-mm
In-Reply-To: <1389598587.4672.121.camel@pasglop>
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> On Mon, 2014-01-13 at 11:34 +0530, Aneesh Kumar K.V wrote:
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>>
>> This patch fix the below crash
>
> Andrea, can you ack the generic bit please ?
>
> Thanks !
Kirill A. Shutemov did ack an earlier version
http://article.gmane.org/gmane.linux.kernel.mm/111368
-aneesh
* Re: [PATCH 02/13] ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()
From: Yann Droneaud @ 2014-01-13 9:30 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: cbe-oss-dev, linuxppc-dev, linux-kernel
In-Reply-To: <1389567961.4672.111.camel@pasglop>
Hi Benjamin,
On Monday 13 January 2014 at 10:06 +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2013-07-02 at 18:39 +0200, Yann Droneaud wrote:
> > Macro get_unused_fd() is used to allocate a file descriptor with
> > default flags. Those default flags (0) can be "unsafe":
> > O_CLOEXEC must be used by default to not leak file descriptor
> > across exec().
> >
> > Instead of macro get_unused_fd(), functions anon_inode_getfd()
> > or get_unused_fd_flags() should be used with flags given by userspace.
> > If not possible, flags should be set to O_CLOEXEC to provide userspace
> > with a default safe behavior.
> >
> > In a further patch, get_unused_fd() will be removed so that
> > new code start using anon_inode_getfd() or get_unused_fd_flags()
> > with correct flags.
> >
> > This patch replaces calls to get_unused_fd() with equivalent call to
> > get_unused_fd_flags(0) to preserve current behavior for existing code.
> >
> > The hard coded flag value (0) should be reviewed on a per-subsystem basis,
> > and, if possible, set to O_CLOEXEC.
> >
> > Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
>
> Should I merge this (v5 on patchwork) or let Al do it ?
>
Please merge it directly: patches from the previous patchsets were
picked individually by each subsystem maintainer after proper review
regarding setting close on exec flag by default.
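The effect of the close-on-exec flag can be observed from user space. The sketch below is only an analogy to get_unused_fd_flags(0) versus get_unused_fd_flags(O_CLOEXEC): it uses the standard open() and fcntl() calls, the helper names are hypothetical, and it assumes /dev/null can be opened.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Return 1 if fd has close-on-exec set, 0 if not, -1 on error. */
static int fd_is_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return (flags & FD_CLOEXEC) ? 1 : 0;
}

/* Open /dev/null read-only with optional extra flags; caller closes it. */
static int open_devnull(int extra_flags)
{
	return open("/dev/null", O_RDONLY | extra_flags);
}
```

A descriptor opened with flags 0 would be inherited across exec(), while one opened with O_CLOEXEC is closed automatically.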
> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>
Thanks a lot.
> > ---
> > arch/powerpc/platforms/cell/spufs/inode.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
> > index f390042..88df441 100644
> > --- a/arch/powerpc/platforms/cell/spufs/inode.c
> > +++ b/arch/powerpc/platforms/cell/spufs/inode.c
> > @@ -301,7 +301,7 @@ static int spufs_context_open(struct path *path)
> > int ret;
> > struct file *filp;
> >
> > - ret = get_unused_fd();
> > + ret = get_unused_fd_flags(0);
> > if (ret < 0)
> > return ret;
> >
> > @@ -518,7 +518,7 @@ static int spufs_gang_open(struct path *path)
> > int ret;
> > struct file *filp;
> >
> > - ret = get_unused_fd();
> > + ret = get_unused_fd_flags(0);
> > if (ret < 0)
> > return ret;
> >
>
>
Note:
latest patch (from v5 patchset) is at
http://lkml.kernel.org/r/fe27abcfab5563d36a3e5e58ff36e5500c39be6a.1388952061.git.ydroneaud@opteya.com
v5 patchset is at
http://lkml.kernel.org/r/cover.1388952061.git.ydroneaud@opteya.com
Regards.
--
Yann Droneaud
OPTEYA
* [PATCH v9] clk: corenet: Adds the clock binding
From: Tang Yuantian @ 2014-01-13 8:16 UTC (permalink / raw)
To: b07421; +Cc: mark.rutland, devicetree, Tang Yuantian, linuxppc-dev
From: Tang Yuantian <yuantian.tang@freescale.com>
Adds the clock bindings for Freescale PowerPC CoreNet platforms
Signed-off-by: Tang Yuantian <Yuantian.Tang@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
---
v9:
- refined some properties' description
v8:
- added clock-frequency property description
- fixed whitespace and tab mixing issue
v7:
- refined some properties' definitions
v6:
- split the previous patch into 2 parts, one is for binding (this one),
the other is for DTS modification(will submit once this gets accepted)
- fixed typo
- refined #clock-cells and clock-output-names properties
- removed fixed-clock compatible string
v5:
- refine the binding document
- update the compatible string
v4:
- add binding document
- update compatible string
- update the reg property
v3:
- fix typo
v2:
- add t4240, b4420, b4860 support
- remove pll/4 clock from p2041, p3041 and p5020 board
.../devicetree/bindings/clock/corenet-clock.txt | 134 +++++++++++++++++++++
1 file changed, 134 insertions(+)
create mode 100644 Documentation/devicetree/bindings/clock/corenet-clock.txt
diff --git a/Documentation/devicetree/bindings/clock/corenet-clock.txt b/Documentation/devicetree/bindings/clock/corenet-clock.txt
new file mode 100644
index 0000000..8394bc7
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/corenet-clock.txt
@@ -0,0 +1,134 @@
+* Clock Block on Freescale CoreNet Platforms
+
+Freescale CoreNet chips take primary clocking input from the external
+SYSCLK signal. The SYSCLK input (frequency) is multiplied using
+multiple phase locked loops (PLL) to create a variety of frequencies
+which can then be passed to a variety of internal logic, including
+cores and peripheral IP blocks.
+Please refer to the Reference Manual for details.
+
+1. Clock Block Binding
+
+Required properties:
+- compatible: Should contain a specific clock block compatible string
+ and a single chassis clock compatible string.
+ Clock block strings include, but are not limited to, one of:
+ * "fsl,p2041-clockgen"
+ * "fsl,p3041-clockgen"
+ * "fsl,p4080-clockgen"
+ * "fsl,p5020-clockgen"
+ * "fsl,p5040-clockgen"
+ * "fsl,t4240-clockgen"
+ * "fsl,b4420-clockgen"
+ * "fsl,b4860-clockgen"
+ Chassis clock strings include:
+ * "fsl,qoriq-clockgen-1.0": for chassis 1.0 clocks
+ * "fsl,qoriq-clockgen-2.0": for chassis 2.0 clocks
+- reg: Describes the address of the device's resources within the
+ address space defined by its parent bus, and resource zero
+ represents the clock register set
+- clock-frequency: Input system clock frequency
+
+Recommended properties:
+- ranges: Allows valid translation between child's address space and
+ parent's. Must be present if the device has sub-nodes.
+- #address-cells: Specifies the number of cells used to represent
+ physical base addresses. Must be present if the device has
+ sub-nodes and set to 1 if present
+- #size-cells: Specifies the number of cells used to represent
+ the size of an address. Must be present if the device has
+ sub-nodes and set to 1 if present
+
+2. Clock Provider/Consumer Binding
+
+Most of the bindings are from the common clock binding[1].
+ [1] Documentation/devicetree/bindings/clock/clock-bindings.txt
+
+Required properties:
+- compatible : Should include one of the following:
+ * "fsl,qoriq-core-pll-1.0" for core PLL clocks (v1.0)
+ * "fsl,qoriq-core-pll-2.0" for core PLL clocks (v2.0)
+ * "fsl,qoriq-core-mux-1.0" for core mux clocks (v1.0)
+ * "fsl,qoriq-core-mux-2.0" for core mux clocks (v2.0)
+ * "fsl,qoriq-sysclk-1.0": for input system clock (v1.0).
+ It takes parent's clock-frequency as its clock.
+ * "fsl,qoriq-sysclk-2.0": for input system clock (v2.0).
+ It takes parent's clock-frequency as its clock.
+- #clock-cells: From common clock binding. The number of cells in a
+ clock-specifier. Should be <0> for "fsl,qoriq-sysclk-[1,2].0"
+ clocks, or <1> for "fsl,qoriq-core-pll-[1,2].0" clocks.
+ For "fsl,qoriq-core-pll-[1,2].0" clocks, the single
+ clock-specifier cell may take the following values:
+ * 0 - equal to the PLL frequency
+ * 1 - equal to the PLL frequency divided by 2
+ * 2 - equal to the PLL frequency divided by 4
+
+Recommended properties:
+- clocks: Should be the phandle of input parent clock
+- clock-names: From common clock binding, indicates the clock name
+- clock-output-names: From common clock binding, indicates the names of
+ output clocks
+- reg: Should be the offset and length of clock block base address.
+ The length should be 4.
+
+Example for clock block and clock provider:
+/ {
+ clockgen: global-utilities@e1000 {
+ compatible = "fsl,p5020-clockgen", "fsl,qoriq-clockgen-1.0";
+ ranges = <0x0 0xe1000 0x1000>;
+ clock-frequency = 133333;
+ reg = <0xe1000 0x1000>;
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ sysclk: sysclk {
+ #clock-cells = <0>;
+ compatible = "fsl,qoriq-sysclk-1.0";
+ clock-output-names = "sysclk";
+ };
+
+ pll0: pll0@800 {
+ #clock-cells = <1>;
+ reg = <0x800 0x4>;
+ compatible = "fsl,qoriq-core-pll-1.0";
+ clocks = <&sysclk>;
+ clock-output-names = "pll0", "pll0-div2";
+ };
+
+ pll1: pll1@820 {
+ #clock-cells = <1>;
+ reg = <0x820 0x4>;
+ compatible = "fsl,qoriq-core-pll-1.0";
+ clocks = <&sysclk>;
+ clock-output-names = "pll1", "pll1-div2";
+ };
+
+ mux0: mux0@0 {
+ #clock-cells = <0>;
+ reg = <0x0 0x4>;
+ compatible = "fsl,qoriq-core-mux-1.0";
+ clocks = <&pll0 0>, <&pll0 1>, <&pll1 0>, <&pll1 1>;
+ clock-names = "pll0", "pll0-div2", "pll1", "pll1-div2";
+ clock-output-names = "cmux0";
+ };
+
+ mux1: mux1@20 {
+ #clock-cells = <0>;
+ reg = <0x20 0x4>;
+ compatible = "fsl,qoriq-core-mux-1.0";
+ clocks = <&pll0 0>, <&pll0 1>, <&pll1 0>, <&pll1 1>;
+ clock-names = "pll0", "pll0-div2", "pll1", "pll1-div2";
+ clock-output-names = "cmux1";
+ };
+ };
+ };
+
+Example for clock consumer:
+
+/ {
+ cpu0: PowerPC,e5500@0 {
+ ...
+ clocks = <&mux0>;
+ ...
+ };
+ };
--
1.8.0
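The clock-specifier values listed in the binding for "fsl,qoriq-core-pll-[1,2].0" clocks (0, 1, 2) can be read as a small lookup. The sketch below assumes integer Hz values. The function name and the choice to return 0 for an unknown specifier are hypothetical and not part of the binding.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map a core PLL clock-specifier cell to an output frequency:
 *   0 - PLL frequency
 *   1 - PLL frequency divided by 2
 *   2 - PLL frequency divided by 4
 * Any other value is not described by the binding; 0 is returned here
 * as an assumed error marker.
 */
static uint64_t core_pll_output_hz(uint64_t pll_hz, unsigned int specifier)
{
	switch (specifier) {
	case 0:
		return pll_hz;
	case 1:
		return pll_hz / 2;
	case 2:
		return pll_hz / 4;
	default:
		return 0;
	}
}
```

In the example node, a consumer entry such as <&pll0 1> would then select the divide-by-2 output of pll0.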
* Re: [PATCH RFC v6 4/5] dma: mpc512x: register for device tree channel lookup
From: Alexander Popov @ 2014-01-13 8:17 UTC (permalink / raw)
To: Vinod Koul
Cc: Lars-Peter Clausen, Arnd Bergmann, Gerhard Sittig,
Alexander Popov, dmaengine, Dan Williams, Anatolij Gustschin,
linuxppc-dev
In-Reply-To: <20140109111957.GE16227@intel.com>
Thanks for your replies, Gerhard and Vinod.
2014/1/9 Vinod Koul <vinod.koul@intel.com>:
> On Wed, Jan 08, 2014 at 05:47:19PM +0100, Gerhard Sittig wrote:
>> [ what is the semantics of DMA_PRIVATE capability flag?
>> is documentation available beyond the initial commit message?
>> need individual channels be handled instead of controllers? ]
>
> The DMA_PRIVATE means that your channels are not to be used for global memcpy,
> as one can do in async cases (this is where DMAengine came into existence)
>
> If the device has the capability of doing generic memcpy then it should not set
> this. For slave dma usage the dma channel can transfer data to a specific
> slave device(s), hence we should use this in generic fashion so setting
> DMA_PRIVATE makes sense in those cases.
Each DMA channel of MPC512x DMA controller can do _both_
mem-to-mem transfers and transfers between mem and some slave peripheral
(only one DMA channel is fully dedicated to DDR).
All DMA channels of MPC512x DMA controller belong to one dma_device.
So we _don't_ need setting DMA_PRIVATE flag for this dma_device at all, do we?
>> On Sat, Jan 04, 2014 at 00:54 +0400, Alexander Popov wrote:
>> > I've involved DMA_PRIVATE flag because new of_dma_xlate_by_chan_id()
>> > uses dma_get_slave_channel() instead of dma_request_channel()
>> > (PATCH RFC v6 3/5). This flag is implicitly set in dma_request_channel(),
>> > but is not set in dma_get_slave_channel().
> Which makes me think you are targeting slave usages. Do you intend to use it for
> memcpy too on all controllers you support? In that case you should set it
> selectively.
Vinod, please correct me if I'm wrong.
As far as I can understand from your comments and the code,
DMA_PRIVATE flag is needed for dma_devices with DMA channels
which _can_ work with slave peripheral but _can't_ do mem-to-mem transfers.
If DMA_PRIVATE flag is set for some dma_device before
dma_async_device_register()
then its DMA channels are not published in tables for kernel slab allocator
(because these channels are simply useless for memcpy).
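The reading above can be written down as a toy model. This is not the dmaengine implementation: the toy_ types, the bit values and the predicate are hypothetical, and the rule encoded here is only the interpretation stated in this mail (a device is offered for general memcpy use if it can do memcpy and is not marked private).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy capability bits; names echo the thread, values are made up. */
enum toy_cap {
	TOY_DMA_MEMCPY	= 1u << 0,
	TOY_DMA_SLAVE	= 1u << 1,
	TOY_DMA_PRIVATE	= 1u << 2,
};

struct toy_dma_device {
	unsigned int cap_mask;
};

/*
 * Assumed rule: channels of a device are published for general-purpose
 * memcpy clients only if the device supports memcpy and has not been
 * marked private.
 */
static bool toy_published_for_memcpy(const struct toy_dma_device *dev)
{
	return (dev->cap_mask & TOY_DMA_MEMCPY) &&
	       !(dev->cap_mask & TOY_DMA_PRIVATE);
}
```

Under this model, a controller that does both memcpy and slave transfers stays available for memcpy only while DMA_PRIVATE is left unset.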
>> Still I see a difference in the lookup approaches: Yours applies
>> DMA_PRIVATE globally and in advance, preventing _any_ use of DMA
>> for memory transfers. While the __dma_request_channel() routine
>> only applies it _temporarily_ around a dma_chan_get() operation.
>> Allowing for use of DMA channels by both individual peripherals
>> as well as memory transfers.
>>
> No it doesnt prevent. You can still use it for memcpy once you have the channel.
Excuse me, I don't completely understand why dma_request_channel()
needs to set DMA_PRIVATE flag.
If dma_request_channel() for some dma_device without DMA_PRIVATE
is called before the first dmaengine_get()
then no DMA channels of this dma_device will become available for memcpy
by slab allocator.
Could you give me a clue?
>> > > Consider the fact that this driver
>> > > handles both MPC5121 as well as MPC8308 hardware.
>> >
>> > Ah, yes, sorry. I should certainly fix this, if setting of DMA_PRIVATE flag
>> > is needed at all.
>>
>> What I meant here is that implications for all affected platforms
>> should be considered. There is one driver source, but the driver
>> applies to more than one platform (another issue of the driver is
>> that this is not apparent from the doc nor the compat strings).
I'll add a comment with information about the supported platforms to
mpc512x_dma.c
in RFC PATCH 1/5. Ok?
>> So blocking memory transfers in mpc512x_dma.c is a total breakage
>> for MPC8308 (removes the only previous feature and adds nothing),
>> and is a regression for MPC512x (removes the previously supported
>> memory transfers, while it may add peripheral supports with very
>> few users).
Yes, I see. MPC512x and MPC8308 should be treated differently.
Thanks!
Alexander
* Re: [PATCH V4] powerpc: thp: Fix crash on mremap
From: Benjamin Herrenschmidt @ 2014-01-13 7:36 UTC (permalink / raw)
To: aarcange
Cc: aarcange, linux-mm, paulus, Aneesh Kumar K.V, linuxppc-dev,
kirill.shutemov
In-Reply-To: <1389593064-32664-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
On Mon, 2014-01-13 at 11:34 +0530, Aneesh Kumar K.V wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>
> This patch fix the below crash
Andrea, can you ack the generic bit please ?
Thanks !
Cheers,
Ben.
> NIP [c00000000004cee4] .__hash_page_thp+0x2a4/0x440
> LR [c0000000000439ac] .hash_page+0x18c/0x5e0
> ...
> Call Trace:
> [c000000736103c40] [00001ffffb000000] 0x1ffffb000000(unreliable)
> [437908.479693] [c000000736103d50] [c0000000000439ac] .hash_page+0x18c/0x5e0
> [437908.479699] [c000000736103e30] [c00000000000924c] .do_hash_page+0x4c/0x58
>
> On ppc64 we use the pgtable for storing the hpte slot information and
> store address to the pgtable at a constant offset (PTRS_PER_PMD) from
> pmd. On mremap, when we switch the pmd, we need to withdraw and deposit
> the pgtable again, so that we find the pgtable at PTRS_PER_PMD offset
> from new pmd.
>
> We also want to move the withdraw and deposit before the set_pmd so
> that, when page fault find the pmd as trans huge we can be sure that
> pgtable can be located at the offset.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
> Changes from V3:
> * Drop "powerpc: mm: Move ppc64 page table range definitions to separate header"" patch
>
> arch/powerpc/include/asm/pgtable-ppc64.h | 14 ++++++++++++++
> include/asm-generic/pgtable.h | 12 ++++++++++++
> mm/huge_memory.c | 14 +++++---------
> 3 files changed, 31 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
> index 4a191c472867..d27960c89a71 100644
> --- a/arch/powerpc/include/asm/pgtable-ppc64.h
> +++ b/arch/powerpc/include/asm/pgtable-ppc64.h
> @@ -558,5 +558,19 @@ extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
> #define __HAVE_ARCH_PMDP_INVALIDATE
> extern void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
> pmd_t *pmdp);
> +
> +#define pmd_move_must_withdraw pmd_move_must_withdraw
> +typedef struct spinlock spinlock_t;
> +static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl,
> + spinlock_t *old_pmd_ptl)
> +{
> + /*
> + * Archs like ppc64 use pgtable to store per pmd
> + * specific information. So when we switch the pmd,
> + * we should also withdraw and deposit the pgtable
> + */
> + return true;
> +}
> +
> #endif /* __ASSEMBLY__ */
> #endif /* _ASM_POWERPC_PGTABLE_PPC64_H_ */
> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index db0923458940..8e4f41d9af4d 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -558,6 +558,18 @@ static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
> }
> #endif
>
> +#ifndef pmd_move_must_withdraw
> +static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl,
> + spinlock_t *old_pmd_ptl)
> +{
> + /*
> + * With split pmd lock we also need to move preallocated
> + * PTE page table if new_pmd is on different PMD page table.
> + */
> + return new_pmd_ptl != old_pmd_ptl;
> +}
> +#endif
> +
> /*
> * This function is meant to be used by sites walking pagetables with
> * the mmap_sem hold in read mode to protect against MADV_DONTNEED and
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 95d1acb0f3d2..5d80c53b87cb 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1502,19 +1502,15 @@ int move_huge_pmd(struct vm_area_struct *vma, struct vm_area_struct *new_vma,
> spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING);
> pmd = pmdp_get_and_clear(mm, old_addr, old_pmd);
> VM_BUG_ON(!pmd_none(*new_pmd));
> - set_pmd_at(mm, new_addr, new_pmd, pmd_mksoft_dirty(pmd));
> - if (new_ptl != old_ptl) {
> - pgtable_t pgtable;
>
> - /*
> - * Move preallocated PTE page table if new_pmd is on
> - * different PMD page table.
> - */
> + if (pmd_move_must_withdraw(new_ptl, old_ptl)) {
> + pgtable_t pgtable;
> pgtable = pgtable_trans_huge_withdraw(mm, old_pmd);
> pgtable_trans_huge_deposit(mm, new_pmd, pgtable);
> -
> - spin_unlock(new_ptl);
> }
> + set_pmd_at(mm, new_addr, new_pmd, pmd_mksoft_dirty(pmd));
> + if (new_ptl != old_ptl)
> + spin_unlock(new_ptl);
> spin_unlock(old_ptl);
> }
> out:
* Re: [PATCH] pseries/cpuidle: Remove redundant call to ppc64_runlatch_off() in cpu idle routines
From: Preeti U Murthy @ 2014-01-13 6:34 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev, paulus, deepthi
In-Reply-To: <1389591507.29912.6.camel@concordia>
Hi Mikey
I have the patch with the changelog according to your suggestion below.
Thanks
On 01/13/2014 11:08 AM, Michael Ellerman wrote:
> On Thu, 2014-01-09 at 10:35 +0530, Preeti U Murthy wrote:
>> Commit fbd7740fdfdf9475f switched pseries cpu idle handling from complete idle
>> loops to ppc_md.powersave functions. Earlier to this switch,
>> ppc64_runlatch_off() had to be called in each of the idle routines. But after
>> the switch this call is handled in arch_cpu_idle(),just before the call
>> to ppc_md.powersave, where platform specific idle routines are called.
>>
>> As a consequence, the call to ppc64_runlatch_off() got duplicated in the
>> arch_cpu_idle() routine as well as in the some of the idle routines in
>> pseries and commit fbd7740fdfdf9475f missed to get rid of these redundant
>> calls. These calls were carried over subsequent enhancements to the pseries
>> cpuidle routines. This patch takes care of eliminating this redundancy.
>
> It's "obvious" that turning the runlatch off multiple times is harmless,
> although it adds extra overhead, but please spell that out in the changelog.
>
> cheers
>
>
pseries/cpuidle: Remove redundant call to ppc64_runlatch_off() in cpu idle routines
From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Commit fbd7740fdfdf9475f (powerpc: Simplify pSeries idle loop) switched pseries cpu
idle handling from complete idle loops to ppc_md.powersave functions. Prior to
this switch, ppc64_runlatch_off() had to be called in each of the idle routines.
But after the switch, this call is handled in arch_cpu_idle(), just before the call
to ppc_md.powersave, where platform specific idle routines are called.
As a consequence, the call to ppc64_runlatch_off() got duplicated in the
arch_cpu_idle() routine as well as in some of the idle routines in
pseries, and commit fbd7740fdfdf9475f failed to get rid of these redundant
calls. These calls were carried over subsequent enhancements to the pseries
cpuidle routines.
Although multiple calls to ppc64_runlatch_off() are harmless, there is still some
overhead due to them. Besides that, these calls could also give the
impression that it is *necessary* to call ppc64_runlatch_off() multiple
times, when that is not the case. Hence this patch takes care of eliminating
this redundancy.
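The redundancy can be pictured with a toy model of the call path. Everything prefixed toy_ is hypothetical and only mirrors the structure described in this changelog: the generic idle entry turns the runlatch off once, then calls the platform idle routine.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy CPU state: runlatch bit plus a counter of "off" calls. */
struct toy_cpu {
	bool runlatch_on;
	unsigned int runlatch_off_calls;
};

static void toy_runlatch_off(struct toy_cpu *cpu)
{
	cpu->runlatch_on = false;
	cpu->runlatch_off_calls++;
}

/* Platform idle routine as before the patch: repeats the call. */
static void toy_idle_old(struct toy_cpu *cpu)
{
	toy_runlatch_off(cpu);
}

/* Platform idle routine as after the patch: relies on the caller. */
static void toy_idle_new(struct toy_cpu *cpu)
{
	(void)cpu;
}

/* Generic idle entry: runlatch off once, then the platform routine. */
static void toy_arch_cpu_idle(struct toy_cpu *cpu,
			      void (*powersave)(struct toy_cpu *))
{
	toy_runlatch_off(cpu);
	powersave(cpu);
}
```

The final runlatch state is the same in both cases. Only the number of calls differs, which is the overhead the patch removes.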
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
---
arch/powerpc/platforms/pseries/processor_idle.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/processor_idle.c b/arch/powerpc/platforms/pseries/processor_idle.c
index a166e38..09e4f56 100644
--- a/arch/powerpc/platforms/pseries/processor_idle.c
+++ b/arch/powerpc/platforms/pseries/processor_idle.c
@@ -17,7 +17,6 @@
#include <asm/reg.h>
#include <asm/machdep.h>
#include <asm/firmware.h>
-#include <asm/runlatch.h>
#include <asm/plpar_wrappers.h>
struct cpuidle_driver pseries_idle_driver = {
@@ -63,7 +62,6 @@ static int snooze_loop(struct cpuidle_device *dev,
set_thread_flag(TIF_POLLING_NRFLAG);
while ((!need_resched()) && cpu_online(cpu)) {
- ppc64_runlatch_off();
HMT_low();
HMT_very_low();
}
@@ -103,7 +101,6 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
idle_loop_prolog(&in_purr);
get_lppaca()->donate_dedicated_cpu = 1;
- ppc64_runlatch_off();
HMT_medium();
check_and_cede_processor();
Regards
Preeti U Murthy
* [PATCH V4] powerpc: thp: Fix crash on mremap
From: Aneesh Kumar K.V @ 2014-01-13 6:04 UTC (permalink / raw)
To: benh, paulus, aarcange, kirill.shutemov
Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
This patch fixes the crash below:
NIP [c00000000004cee4] .__hash_page_thp+0x2a4/0x440
LR [c0000000000439ac] .hash_page+0x18c/0x5e0
...
Call Trace:
[c000000736103c40] [00001ffffb000000] 0x1ffffb000000(unreliable)
[437908.479693] [c000000736103d50] [c0000000000439ac] .hash_page+0x18c/0x5e0
[437908.479699] [c000000736103e30] [c00000000000924c] .do_hash_page+0x4c/0x58
On ppc64 we use the pgtable for storing the hpte slot information and
store the address of the pgtable at a constant offset (PTRS_PER_PMD) from
the pmd. On mremap, when we switch the pmd, we need to withdraw and deposit
the pgtable again, so that we find the pgtable at PTRS_PER_PMD offset
from the new pmd.
We also want to move the withdraw and deposit before the set_pmd so
that, when a page fault finds the pmd as trans huge, we can be sure that
the pgtable can be located at the offset.
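The patch lets an architecture replace the generic decision by defining a macro with the same name as the function, which the generic header then tests with #ifndef. A stand-alone sketch of that idiom follows. The toy_ names and the TOY_ARCH_OVERRIDE switch are hypothetical and exist only so the example compiles outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_lock {
	int id;
};

/* "Arch header" part: compiled only when the override is requested. */
#ifdef TOY_ARCH_OVERRIDE
#define toy_pmd_move_must_withdraw toy_pmd_move_must_withdraw
static inline bool toy_pmd_move_must_withdraw(const struct toy_lock *new_ptl,
					      const struct toy_lock *old_ptl)
{
	(void)new_ptl;
	(void)old_ptl;
	/* The arch keeps per-pmd data in the pgtable: always move it. */
	return true;
}
#endif

/* "Generic header" part: used unless the arch defined the macro. */
#ifndef toy_pmd_move_must_withdraw
static inline bool toy_pmd_move_must_withdraw(const struct toy_lock *new_ptl,
					      const struct toy_lock *old_ptl)
{
	/* Move the pgtable only when the pmd lands under another lock. */
	return new_ptl != old_ptl;
}
#endif
```

Built without TOY_ARCH_OVERRIDE, the generic version is used and the answer depends on whether the two lock pointers differ. Building with -DTOY_ARCH_OVERRIDE selects the version that always returns true.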
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
Changes from V3:
* Drop "powerpc: mm: Move ppc64 page table range definitions to separate header" patch
arch/powerpc/include/asm/pgtable-ppc64.h | 14 ++++++++++++++
include/asm-generic/pgtable.h | 12 ++++++++++++
mm/huge_memory.c | 14 +++++---------
3 files changed, 31 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index 4a191c472867..d27960c89a71 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -558,5 +558,19 @@ extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
#define __HAVE_ARCH_PMDP_INVALIDATE
extern void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
pmd_t *pmdp);
+
+#define pmd_move_must_withdraw pmd_move_must_withdraw
+typedef struct spinlock spinlock_t;
+static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl,
+ spinlock_t *old_pmd_ptl)
+{
+ /*
+ * Archs like ppc64 use pgtable to store per pmd
+ * specific information. So when we switch the pmd,
+ * we should also withdraw and deposit the pgtable
+ */
+ return true;
+}
+
#endif /* __ASSEMBLY__ */
#endif /* _ASM_POWERPC_PGTABLE_PPC64_H_ */
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index db0923458940..8e4f41d9af4d 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -558,6 +558,18 @@ static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
}
#endif
+#ifndef pmd_move_must_withdraw
+static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl,
+ spinlock_t *old_pmd_ptl)
+{
+ /*
+ * With split pmd lock we also need to move preallocated
+ * PTE page table if new_pmd is on different PMD page table.
+ */
+ return new_pmd_ptl != old_pmd_ptl;
+}
+#endif
+
/*
* This function is meant to be used by sites walking pagetables with
* the mmap_sem hold in read mode to protect against MADV_DONTNEED and
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 95d1acb0f3d2..5d80c53b87cb 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1502,19 +1502,15 @@ int move_huge_pmd(struct vm_area_struct *vma, struct vm_area_struct *new_vma,
spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING);
pmd = pmdp_get_and_clear(mm, old_addr, old_pmd);
VM_BUG_ON(!pmd_none(*new_pmd));
- set_pmd_at(mm, new_addr, new_pmd, pmd_mksoft_dirty(pmd));
- if (new_ptl != old_ptl) {
- pgtable_t pgtable;
- /*
- * Move preallocated PTE page table if new_pmd is on
- * different PMD page table.
- */
+ if (pmd_move_must_withdraw(new_ptl, old_ptl)) {
+ pgtable_t pgtable;
pgtable = pgtable_trans_huge_withdraw(mm, old_pmd);
pgtable_trans_huge_deposit(mm, new_pmd, pgtable);
-
- spin_unlock(new_ptl);
}
+ set_pmd_at(mm, new_addr, new_pmd, pmd_mksoft_dirty(pmd));
+ if (new_ptl != old_ptl)
+ spin_unlock(new_ptl);
spin_unlock(old_ptl);
}
out:
--
1.8.3.2
* Re: [PATCH] pseries/cpuidle: Remove redundant call to ppc64_runlatch_off() in cpu idle routines
From: Michael Ellerman @ 2014-01-13 5:38 UTC (permalink / raw)
To: Preeti U Murthy; +Cc: linuxppc-dev, paulus, deepthi
In-Reply-To: <20140109050519.11532.6044.stgit@preeti.in.ibm.com>
On Thu, 2014-01-09 at 10:35 +0530, Preeti U Murthy wrote:
> Commit fbd7740fdfdf9475f switched pseries cpu idle handling from complete idle
> loops to ppc_md.powersave functions. Earlier to this switch,
> ppc64_runlatch_off() had to be called in each of the idle routines. But after
> the switch this call is handled in arch_cpu_idle(),just before the call
> to ppc_md.powersave, where platform specific idle routines are called.
>
> As a consequence, the call to ppc64_runlatch_off() got duplicated in the
> arch_cpu_idle() routine as well as in the some of the idle routines in
> pseries and commit fbd7740fdfdf9475f missed to get rid of these redundant
> calls. These calls were carried over subsequent enhancements to the pseries
> cpuidle routines. This patch takes care of eliminating this redundancy.
It's "obvious" that turning the runlatch off multiple times is harmless,
although it adds extra overhead, but please spell that out in the changelog.
cheers
^ permalink raw reply
* [PATCH 3/3] powerpc: Fix transactional FP/VMX/VSX unavailable handlers
From: Paul Mackerras @ 2014-01-13 4:56 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1389588990-25953-1-git-send-email-paulus@samba.org>
Currently, if a process starts a transaction and then takes an
exception because the FPU, VMX or VSX unit is unavailable to it,
we end up corrupting any FP/VMX/VSX state that was valid before
the interrupt. For example, if the process starts a transaction
with the FPU available to it but VMX unavailable, and then does
a VMX instruction inside the transaction, the FP state gets
corrupted.
Loading up the desired state generally involves doing a reclaim
and a recheckpoint. To avoid corrupting already-valid state, we have
to be careful not to reload that state from the thread_struct
between the reclaim and the recheckpoint (since the thread_struct
values are stale by now), and we have to reload that state from
the transact_fp/vr arrays after the recheckpoint to get back the
current transactional values saved there by the reclaim.
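
The mask arithmetic behind the rule above can be sketched in plain userspace C. This is only an illustration, not kernel code: the DEMO_MSR_* constants and both helper names are made up for this sketch (the kernel's real bit definitions live in asm/reg.h), and the fpexc_mode bits that the real handler also ORs into the MSR are left out.

```c
#include <stdint.h>

/* Illustrative stand-ins for MSR_FP, MSR_VSX and MSR_VEC. */
#define DEMO_MSR_FP  (1UL << 13)
#define DEMO_MSR_VSX (1UL << 23)
#define DEMO_MSR_VEC (1UL << 25)

/*
 * Facilities the recheckpoint may load from the thread_struct:
 * those wanted now that were not already valid before the
 * interrupt, i.e. (orig_msr | wanted) & ~orig_msr.
 */
static unsigned long demo_recheckpoint_mask(unsigned long orig_msr,
					    unsigned long wanted)
{
	unsigned long new_msr = orig_msr | wanted;

	return new_msr & ~orig_msr;
}

/*
 * Facilities that were already valid, and so have to be reloaded
 * from the transact_fp/vr arrays after the recheckpoint.
 */
static unsigned long demo_reload_transact_mask(unsigned long orig_msr)
{
	return orig_msr & (DEMO_MSR_FP | DEMO_MSR_VEC);
}
```

For a process that had FP enabled and then takes a VSX unavailable exception, the first helper yields VEC and VSX only, so the stale FP values in the thread_struct are never loaded; the second helper yields FP, which is the state to bring back from the transactional copy.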
Signed-off-by: Paul Mackerras <paulus@samba.org>
---
arch/powerpc/kernel/traps.c | 45 ++++++++++++++++++++++++++++++++++++---------
1 file changed, 36 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index b543587..50a7ec3 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1401,11 +1401,19 @@ void fp_unavailable_tm(struct pt_regs *regs)
/* This loads and recheckpoints the FP registers from
* thread.fpr[]. They will remain in registers after the
* checkpoint so we don't need to reload them after.
+ * If VMX is in use, the VRs now hold checkpointed values,
+ * so we don't want to load the VRs from the thread_struct.
*/
- tm_recheckpoint(&current->thread, regs->msr);
+ tm_recheckpoint(&current->thread, MSR_FP);
+
+ /* If VMX is in use, get the transactional values back */
+ if (regs->msr & MSR_VEC) {
+ do_load_up_transact_altivec(&current->thread);
+ /* At this point all the VSX state is loaded, so enable it */
+ regs->msr |= MSR_VSX;
+ }
}
-#ifdef CONFIG_ALTIVEC
void altivec_unavailable_tm(struct pt_regs *regs)
{
/* See the comments in fp_unavailable_tm(). This function operates
@@ -1417,14 +1425,19 @@ void altivec_unavailable_tm(struct pt_regs *regs)
regs->nip, regs->msr);
tm_reclaim_current(TM_CAUSE_FAC_UNAV);
regs->msr |= MSR_VEC;
- tm_recheckpoint(&current->thread, regs->msr);
+ tm_recheckpoint(&current->thread, MSR_VEC);
current->thread.used_vr = 1;
+
+ if (regs->msr & MSR_FP) {
+ do_load_up_transact_fpu(&current->thread);
+ regs->msr |= MSR_VSX;
+ }
}
-#endif
-#ifdef CONFIG_VSX
void vsx_unavailable_tm(struct pt_regs *regs)
{
+ unsigned long orig_msr = regs->msr;
+
/* See the comments in fp_unavailable_tm(). This works similarly,
* though we're loading both FP and VEC registers in here.
*
@@ -1436,16 +1449,30 @@ void vsx_unavailable_tm(struct pt_regs *regs)
"MSR=%lx\n",
regs->nip, regs->msr);
+ current->thread.used_vsr = 1;
+
+ /* If FP and VMX are already loaded, we have all the state we need */
+ if ((orig_msr & (MSR_FP | MSR_VEC)) == (MSR_FP | MSR_VEC)) {
+ regs->msr |= MSR_VSX;
+ return;
+ }
+
/* This reclaims FP and/or VR regs if they're already enabled */
tm_reclaim_current(TM_CAUSE_FAC_UNAV);
regs->msr |= MSR_VEC | MSR_FP | current->thread.fpexc_mode |
MSR_VSX;
- /* This loads & recheckpoints FP and VRs. */
- tm_recheckpoint(&current->thread, regs->msr);
- current->thread.used_vsr = 1;
+
+ /* This loads & recheckpoints FP and VRs; but we have
+ * to be sure not to overwrite previously-valid state.
+ */
+ tm_recheckpoint(&current->thread, regs->msr & ~orig_msr);
+
+ if (orig_msr & MSR_FP)
+ do_load_up_transact_fpu(&current->thread);
+ if (orig_msr & MSR_VEC)
+ do_load_up_transact_altivec(&current->thread);
}
-#endif
#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
void performance_monitor_exception(struct pt_regs *regs)
--
1.8.4.2
^ permalink raw reply related
* [PATCH 2/3] powerpc: Don't corrupt transactional state when using FP/VMX in kernel
From: Paul Mackerras @ 2014-01-13 4:56 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1389588990-25953-1-git-send-email-paulus@samba.org>
Currently, when we have a process using the transactional memory
facilities on POWER8 (that is, the processor is in transactional
or suspended state), and the process enters the kernel and the
kernel then uses the floating-point or vector (VMX/Altivec) facility,
we end up corrupting the user-visible FP/VMX/VSX state. This
happens, for example, if a page fault causes a copy-on-write
operation, because the copy_page function will use VMX to do the
copy on POWER8. The test program below demonstrates the bug.
The bug happens because when FP/VMX state for a transactional process
is stored in the thread_struct, we store the checkpointed state in
.fp_state/.vr_state and the transactional (current) state in
.transact_fp/.transact_vr. However, when the kernel wants to use
FP/VMX, it calls enable_kernel_fp() or enable_kernel_altivec(),
which saves the current state in .fp_state/.vr_state. Furthermore,
when we return to the user process we return with FP/VMX/VSX
disabled. The next time the process uses FP/VMX/VSX, we don't know
which set of state (the current register values, .fp_state/.vr_state,
or .transact_fp/.transact_vr) we should be using, since we have no
way to tell if we are still in the same transaction, and if not,
whether the previous transaction succeeded or failed.
Thus it is necessary to strictly adhere to the rule that if FP has
been enabled at any point in a transaction, we must keep FP enabled
for the user process with the current transactional state in the
FP registers, until we detect that it is no longer in a transaction.
Similarly for VMX; once enabled it must stay enabled until the
process is no longer transactional.
In order to keep this rule, we add a new thread_info flag which we
test when returning from the kernel to userspace, called TIF_RESTORE_TM.
This flag indicates that there is FP/VMX/VSX state to be restored
before entering userspace, and when it is set the .tm_orig_msr field
in the thread_struct indicates what state needs to be restored.
The restoration is done by restore_tm_state(). The TIF_RESTORE_TM
bit is set by new giveup_fpu/altivec_maybe_transactional helpers,
which are called from enable_kernel_fp/altivec, giveup_vsx, and
flush_fp/altivec_to_thread instead of giveup_fpu/altivec.
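
As a rough userspace model of that bookkeeping (not the kernel code itself): the struct, its fields and the demo_* names below are invented for illustration, a boolean stands in for TIF_RESTORE_TM, and giveup is simplified to clearing only the FP bit.

```c
#include <stdbool.h>

/* Illustrative stand-ins for MSR_FP, MSR_VSX and MSR_VEC. */
#define DEMO_MSR_FP  (1UL << 13)
#define DEMO_MSR_VSX (1UL << 23)
#define DEMO_MSR_VEC (1UL << 25)

struct demo_thread {
	unsigned long msr;         /* stands in for thread.regs->msr */
	unsigned long tm_orig_msr; /* MSR as it was before the first giveup */
	bool restore_tm;           /* stands in for TIF_RESTORE_TM */
	bool tm_active;            /* stands in for MSR_TM_ACTIVE(msr) */
};

/* Model of giveup_fpu_maybe_transactional(): remember the MSR once. */
static void demo_giveup_fp(struct demo_thread *t)
{
	if (t->tm_active && !t->restore_tm) {
		t->tm_orig_msr = t->msr;
		t->restore_tm = true;
	}
	t->msr &= ~DEMO_MSR_FP; /* the plain giveup disables FP for the task */
}

/* Model of restore_tm_state(): returns the facilities re-enabled. */
static unsigned long demo_restore(struct demo_thread *t)
{
	unsigned long diff;

	t->restore_tm = false;
	if (!t->tm_active)
		return 0;

	diff = t->tm_orig_msr & ~t->msr;
	diff &= DEMO_MSR_FP | DEMO_MSR_VEC | DEMO_MSR_VSX;
	t->msr |= diff; /* the kernel also reloads the registers here */
	return diff;
}
```

The point of the "only if the flag is not already set" test is that a second giveup inside the same kernel entry must not overwrite tm_orig_msr with an MSR that has already lost its FP or VEC bit.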
The other thing to be done is to get the transactional FP/VMX/VSX
state from .fp_state/.vr_state when doing reclaim, if that state
has been saved there by giveup_fpu/altivec_maybe_transactional.
Having done this, we set the FP/VMX bit in the thread's MSR after
reclaim to indicate that that part of the state is now valid
(having been reclaimed from the processor's checkpointed state).
Finally, in the signal handling code, we move the clearing of the
transactional state bits in the thread's MSR a bit earlier, before
calling flush_fp_to_thread(), so that we don't unnecessarily set
the TIF_RESTORE_TM bit.
This is the test program:
/* Michael Neuling 4/12/2013
*
* See if the altivec state is leaked out of an aborted transaction due to
* kernel vmx copy loops.
*
* gcc -m64 htm_vmxcopy.c -o htm_vmxcopy
*
*/
/* We don't use all of these, but for reference: */
int main(int argc, char *argv[])
{
long double vecin = 1.3;
long double vecout;
unsigned long pgsize = getpagesize();
int i;
int fd;
int size = pgsize*16;
char tmpfile[] = "/tmp/page_faultXXXXXX";
char buf[pgsize];
char *a;
uint64_t aborted = 0;
fd = mkstemp(tmpfile);
assert(fd >= 0);
memset(buf, 0, pgsize);
for (i = 0; i < size; i += pgsize)
assert(write(fd, buf, pgsize) == pgsize);
unlink(tmpfile);
a = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
assert(a != MAP_FAILED);
asm __volatile__(
"lxvd2x 40,0,%[vecinptr] ; " // set 40 to initial value
TBEGIN
"beq 3f ;"
TSUSPEND
"xxlxor 40,40,40 ; " // set 40 to 0
"std 5, 0(%[map]) ;" // cause kernel vmx copy page
TABORT
TRESUME
TEND
"li %[res], 0 ;"
"b 5f ;"
"3: ;" // Abort handler
"li %[res], 1 ;"
"5: ;"
"stxvd2x 40,0,%[vecoutptr] ; "
: [res]"=r"(aborted)
: [vecinptr]"r"(&vecin),
[vecoutptr]"r"(&vecout),
[map]"r"(a)
: "memory", "r0", "r3", "r4", "r5", "r6", "r7");
if (aborted && (vecin != vecout)){
printf("FAILED: vector state leaked on abort %f != %f\n",
(double)vecin, (double)vecout);
exit(1);
}
munmap(a, size);
close(fd);
printf("PASSED!\n");
return 0;
}
Signed-off-by: Paul Mackerras <paulus@samba.org>
---
arch/powerpc/include/asm/processor.h | 2 +
arch/powerpc/include/asm/thread_info.h | 5 +-
arch/powerpc/include/asm/tm.h | 1 +
arch/powerpc/kernel/entry_64.S | 12 ++-
arch/powerpc/kernel/fpu.S | 16 ++++
arch/powerpc/kernel/process.c | 146 ++++++++++++++++++++++++++++++---
arch/powerpc/kernel/signal.c | 3 +-
arch/powerpc/kernel/signal_32.c | 21 ++---
arch/powerpc/kernel/signal_64.c | 14 ++--
arch/powerpc/kernel/traps.c | 12 +--
arch/powerpc/kernel/vector.S | 10 +++
11 files changed, 195 insertions(+), 47 deletions(-)
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index fc14a38..232a2fa 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -373,6 +373,8 @@ extern int set_endian(struct task_struct *tsk, unsigned int val);
extern int get_unalign_ctl(struct task_struct *tsk, unsigned long adr);
extern int set_unalign_ctl(struct task_struct *tsk, unsigned int val);
+extern void fp_enable(void);
+extern void vec_enable(void);
extern void load_fp_state(struct thread_fp_state *fp);
extern void store_fp_state(struct thread_fp_state *fp);
extern void load_vr_state(struct thread_vr_state *vr);
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index fc2bf41..b034ecd 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -91,6 +91,7 @@ static inline struct thread_info *current_thread_info(void)
#define TIF_POLLING_NRFLAG 3 /* true if poll_idle() is polling
TIF_NEED_RESCHED */
#define TIF_32BIT 4 /* 32 bit binary */
+#define TIF_RESTORE_TM 5 /* need to restore TM FP/VEC/VSX */
#define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */
#define TIF_SINGLESTEP 8 /* singlestepping active */
#define TIF_NOHZ 9 /* in adaptive nohz mode */
@@ -113,6 +114,7 @@ static inline struct thread_info *current_thread_info(void)
#define _TIF_NEED_RESCHED (1<<TIF_NEED_RESCHED)
#define _TIF_POLLING_NRFLAG (1<<TIF_POLLING_NRFLAG)
#define _TIF_32BIT (1<<TIF_32BIT)
+#define _TIF_RESTORE_TM (1<<TIF_RESTORE_TM)
#define _TIF_SYSCALL_AUDIT (1<<TIF_SYSCALL_AUDIT)
#define _TIF_SINGLESTEP (1<<TIF_SINGLESTEP)
#define _TIF_SECCOMP (1<<TIF_SECCOMP)
@@ -128,7 +130,8 @@ static inline struct thread_info *current_thread_info(void)
_TIF_NOHZ)
#define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
- _TIF_NOTIFY_RESUME | _TIF_UPROBE)
+ _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
+ _TIF_RESTORE_TM)
#define _TIF_PERSYSCALL_MASK (_TIF_RESTOREALL|_TIF_NOERROR)
/* Bits in local_flags */
diff --git a/arch/powerpc/include/asm/tm.h b/arch/powerpc/include/asm/tm.h
index 9dfbc34..0c9f8b7 100644
--- a/arch/powerpc/include/asm/tm.h
+++ b/arch/powerpc/include/asm/tm.h
@@ -15,6 +15,7 @@ extern void do_load_up_transact_altivec(struct thread_struct *thread);
extern void tm_enable(void);
extern void tm_reclaim(struct thread_struct *thread,
unsigned long orig_msr, uint8_t cause);
+extern void tm_reclaim_current(uint8_t cause);
extern void tm_recheckpoint(struct thread_struct *thread,
unsigned long orig_msr);
extern void tm_abort(uint8_t cause);
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index bbfb029..662c6dd 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -664,8 +664,16 @@ _GLOBAL(ret_from_except_lite)
bl .restore_interrupts
SCHEDULE_USER
b .ret_from_except_lite
-
-2: bl .save_nvgprs
+2:
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+ andi. r0,r4,_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM
+ bne 3f /* only restore TM if nothing else to do */
+ addi r3,r1,STACK_FRAME_OVERHEAD
+ bl .restore_tm_state
+ b restore
+3:
+#endif
+ bl .save_nvgprs
bl .restore_interrupts
addi r3,r1,STACK_FRAME_OVERHEAD
bl .do_notify_resume
diff --git a/arch/powerpc/kernel/fpu.S b/arch/powerpc/kernel/fpu.S
index f7f5b8b..9ad236e 100644
--- a/arch/powerpc/kernel/fpu.S
+++ b/arch/powerpc/kernel/fpu.S
@@ -81,6 +81,22 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
/*
+ * Enable use of the FPU, and VSX if possible, for the caller.
+ */
+_GLOBAL(fp_enable)
+ mfmsr r3
+ ori r3,r3,MSR_FP
+#ifdef CONFIG_VSX
+BEGIN_FTR_SECTION
+ oris r3,r3,MSR_VSX@h
+END_FTR_SECTION_IFSET(CPU_FTR_VSX)
+#endif
+ SYNC
+ MTMSRD(r3)
+ isync /* (not necessary for arch 2.02 and later) */
+ blr
+
+/*
* Load state from memory into FP registers including FPSCR.
* Assumes the caller has enabled FP in the MSR.
*/
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 4a96556..51acc39 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -74,6 +74,48 @@ struct task_struct *last_task_used_vsx = NULL;
struct task_struct *last_task_used_spe = NULL;
#endif
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+void giveup_fpu_maybe_transactional(struct task_struct *tsk)
+{
+ /*
+ * If we are saving the current thread's registers, and the
+ * thread is in a transactional state, set the TIF_RESTORE_TM
+ * bit so that we know to restore the registers before
+ * returning to userspace.
+ */
+ if (tsk == current && tsk->thread.regs &&
+ MSR_TM_ACTIVE(tsk->thread.regs->msr) &&
+ !test_thread_flag(TIF_RESTORE_TM)) {
+ tsk->thread.tm_orig_msr = tsk->thread.regs->msr;
+ set_thread_flag(TIF_RESTORE_TM);
+ }
+
+ giveup_fpu(tsk);
+}
+
+void giveup_altivec_maybe_transactional(struct task_struct *tsk)
+{
+ /*
+ * If we are saving the current thread's registers, and the
+ * thread is in a transactional state, set the TIF_RESTORE_TM
+ * bit so that we know to restore the registers before
+ * returning to userspace.
+ */
+ if (tsk == current && tsk->thread.regs &&
+ MSR_TM_ACTIVE(tsk->thread.regs->msr) &&
+ !test_thread_flag(TIF_RESTORE_TM)) {
+ tsk->thread.tm_orig_msr = tsk->thread.regs->msr;
+ set_thread_flag(TIF_RESTORE_TM);
+ }
+
+ giveup_altivec(tsk);
+}
+
+#else
+#define giveup_fpu_maybe_transactional(tsk) giveup_fpu(tsk)
+#define giveup_altivec_maybe_transactional(tsk) giveup_altivec(tsk)
+#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
+
#ifdef CONFIG_PPC_FPU
/*
* Make sure the floating-point register state in the
@@ -102,13 +144,13 @@ void flush_fp_to_thread(struct task_struct *tsk)
*/
BUG_ON(tsk != current);
#endif
- giveup_fpu(tsk);
+ giveup_fpu_maybe_transactional(tsk);
}
preempt_enable();
}
}
EXPORT_SYMBOL_GPL(flush_fp_to_thread);
-#endif
+#endif /* CONFIG_PPC_FPU */
void enable_kernel_fp(void)
{
@@ -116,11 +158,11 @@ void enable_kernel_fp(void)
#ifdef CONFIG_SMP
if (current->thread.regs && (current->thread.regs->msr & MSR_FP))
- giveup_fpu(current);
+ giveup_fpu_maybe_transactional(current);
else
giveup_fpu(NULL); /* just enables FP for kernel */
#else
- giveup_fpu(last_task_used_math);
+ giveup_fpu_maybe_transactional(last_task_used_math);
#endif /* CONFIG_SMP */
}
EXPORT_SYMBOL(enable_kernel_fp);
@@ -132,11 +174,11 @@ void enable_kernel_altivec(void)
#ifdef CONFIG_SMP
if (current->thread.regs && (current->thread.regs->msr & MSR_VEC))
- giveup_altivec(current);
+ giveup_altivec_maybe_transactional(current);
else
giveup_altivec_notask();
#else
- giveup_altivec(last_task_used_altivec);
+ giveup_altivec_maybe_transactional(last_task_used_altivec);
#endif /* CONFIG_SMP */
}
EXPORT_SYMBOL(enable_kernel_altivec);
@@ -153,7 +195,7 @@ void flush_altivec_to_thread(struct task_struct *tsk)
#ifdef CONFIG_SMP
BUG_ON(tsk != current);
#endif
- giveup_altivec(tsk);
+ giveup_altivec_maybe_transactional(tsk);
}
preempt_enable();
}
@@ -182,8 +224,8 @@ EXPORT_SYMBOL(enable_kernel_vsx);
void giveup_vsx(struct task_struct *tsk)
{
- giveup_fpu(tsk);
- giveup_altivec(tsk);
+ giveup_fpu_maybe_transactional(tsk);
+ giveup_altivec_maybe_transactional(tsk);
__giveup_vsx(tsk);
}
@@ -479,7 +521,48 @@ static inline bool hw_brk_match(struct arch_hw_breakpoint *a,
return false;
return true;
}
+
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+static void tm_reclaim_thread(struct thread_struct *thr,
+ struct thread_info *ti, uint8_t cause)
+{
+ unsigned long msr_diff = 0;
+
+ /*
+ * If FP/VSX registers have been already saved to the
+ * thread_struct, move them to the transact_fp array.
+ * We clear the TIF_RESTORE_TM bit since after the reclaim
+ * the thread will no longer be transactional.
+ */
+ if (test_ti_thread_flag(ti, TIF_RESTORE_TM)) {
+ msr_diff = thr->tm_orig_msr & ~thr->regs->msr;
+ if (msr_diff & MSR_FP)
+ memcpy(&thr->transact_fp, &thr->fp_state,
+ sizeof(struct thread_fp_state));
+ if (msr_diff & MSR_VEC)
+ memcpy(&thr->transact_vr, &thr->vr_state,
+ sizeof(struct thread_vr_state));
+ clear_ti_thread_flag(ti, TIF_RESTORE_TM);
+ msr_diff &= MSR_FP | MSR_VEC | MSR_VSX | MSR_FE0 | MSR_FE1;
+ }
+
+ tm_reclaim(thr, thr->regs->msr, cause);
+
+ /* Having done the reclaim, we now have the checkpointed
+ * FP/VSX values in the registers. These might be valid
+ * even if we have previously called enable_kernel_fp() or
+ * flush_fp_to_thread(), so update thr->regs->msr to
+ * indicate their current validity.
+ */
+ thr->regs->msr |= msr_diff;
+}
+
+void tm_reclaim_current(uint8_t cause)
+{
+ tm_enable();
+ tm_reclaim_thread(&current->thread, current_thread_info(), cause);
+}
+
static inline void tm_reclaim_task(struct task_struct *tsk)
{
/* We have to work out if we're switching from/to a task that's in the
@@ -502,9 +585,11 @@ static inline void tm_reclaim_task(struct task_struct *tsk)
/* Stash the original thread MSR, as giveup_fpu et al will
* modify it. We hold onto it to see whether the task used
- * FP & vector regs.
+ * FP & vector regs. If the TIF_RESTORE_TM flag is set,
+ * tm_orig_msr is already set.
*/
- thr->tm_orig_msr = thr->regs->msr;
+ if (!test_ti_thread_flag(task_thread_info(tsk), TIF_RESTORE_TM))
+ thr->tm_orig_msr = thr->regs->msr;
TM_DEBUG("--- tm_reclaim on pid %d (NIP=%lx, "
"ccr=%lx, msr=%lx, trap=%lx)\n",
@@ -512,7 +597,7 @@ static inline void tm_reclaim_task(struct task_struct *tsk)
thr->regs->ccr, thr->regs->msr,
thr->regs->trap);
- tm_reclaim(thr, thr->regs->msr, TM_CAUSE_RESCHED);
+ tm_reclaim_thread(thr, task_thread_info(tsk), TM_CAUSE_RESCHED);
TM_DEBUG("--- tm_reclaim on pid %d complete\n",
tsk->pid);
@@ -588,6 +673,43 @@ static inline void __switch_to_tm(struct task_struct *prev)
tm_reclaim_task(prev);
}
}
+
+/*
+ * This is called if we are on the way out to userspace and the
+ * TIF_RESTORE_TM flag is set. It checks if we need to reload
+ * FP and/or vector state and does so if necessary.
+ * If userspace is inside a transaction (whether active or
+ * suspended) and FP/VMX/VSX instructions have ever been enabled
+ * inside that transaction, then we have to keep them enabled
+ * and keep the FP/VMX/VSX state loaded while ever the transaction
+ * continues. The reason is that if we didn't, and subsequently
+ * got a FP/VMX/VSX unavailable interrupt inside a transaction,
+ * we don't know whether it's the same transaction, and thus we
+ * don't know which of the checkpointed state and the transactional
+ * state to use.
+ */
+void restore_tm_state(struct pt_regs *regs)
+{
+ unsigned long msr_diff;
+
+ clear_thread_flag(TIF_RESTORE_TM);
+ if (!MSR_TM_ACTIVE(regs->msr))
+ return;
+
+ msr_diff = current->thread.tm_orig_msr & ~regs->msr;
+ msr_diff &= MSR_FP | MSR_VEC | MSR_VSX;
+ if (msr_diff & MSR_FP) {
+ fp_enable();
+ load_fp_state(&current->thread.fp_state);
+ regs->msr |= current->thread.fpexc_mode;
+ }
+ if (msr_diff & MSR_VEC) {
+ vec_enable();
+ load_vr_state(&current->thread.vr_state);
+ }
+ regs->msr |= msr_diff;
+}
+
#else
#define tm_recheckpoint_new_task(new)
#define __switch_to_tm(prev)
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index 457e97a..8fc4177 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -203,8 +203,7 @@ unsigned long get_tm_stackpointer(struct pt_regs *regs)
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
if (MSR_TM_ACTIVE(regs->msr)) {
- tm_enable();
- tm_reclaim(&current->thread, regs->msr, TM_CAUSE_SIGNAL);
+ tm_reclaim_current(TM_CAUSE_SIGNAL);
if (MSR_TM_TRANSACTIONAL(regs->msr))
return current->thread.ckpt_regs.gpr[1];
}
diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index 68027bf..6ce69e6 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -519,6 +519,13 @@ static int save_tm_user_regs(struct pt_regs *regs,
{
unsigned long msr = regs->msr;
+ /* Remove TM bits from thread's MSR. The MSR in the sigcontext
+ * just indicates to userland that we were doing a transaction, but we
+ * don't want to return in transactional state. This also ensures
+ * that flush_fp_to_thread won't set TIF_RESTORE_TM again.
+ */
+ regs->msr &= ~MSR_TS_MASK;
+
/* Make sure floating point registers are stored in regs */
flush_fp_to_thread(current);
@@ -1056,13 +1063,6 @@ int handle_rt_signal32(unsigned long sig, struct k_sigaction *ka,
/* enter the signal handler in native-endian mode */
regs->msr &= ~MSR_LE;
regs->msr |= (MSR_KERNEL & MSR_LE);
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
- /* Remove TM bits from thread's MSR. The MSR in the sigcontext
- * just indicates to userland that we were doing a transaction, but we
- * don't want to return in transactional state:
- */
- regs->msr &= ~MSR_TS_MASK;
-#endif
return 1;
badframe:
@@ -1484,13 +1484,6 @@ int handle_signal32(unsigned long sig, struct k_sigaction *ka,
regs->nip = (unsigned long) ka->sa.sa_handler;
/* enter the signal handler in big-endian mode */
regs->msr &= ~MSR_LE;
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
- /* Remove TM bits from thread's MSR. The MSR in the sigcontext
- * just indicates to userland that we were doing a transaction, but we
- * don't want to return in transactional state:
- */
- regs->msr &= ~MSR_TS_MASK;
-#endif
return 1;
badframe:
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index 4299104..e35bf77 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -192,6 +192,13 @@ static long setup_tm_sigcontexts(struct sigcontext __user *sc,
BUG_ON(!MSR_TM_ACTIVE(regs->msr));
+ /* Remove TM bits from thread's MSR. The MSR in the sigcontext
+ * just indicates to userland that we were doing a transaction, but we
+ * don't want to return in transactional state. This also ensures
+ * that flush_fp_to_thread won't set TIF_RESTORE_TM again.
+ */
+ regs->msr &= ~MSR_TS_MASK;
+
flush_fp_to_thread(current);
#ifdef CONFIG_ALTIVEC
@@ -749,13 +756,6 @@ int handle_rt_signal64(int signr, struct k_sigaction *ka, siginfo_t *info,
/* Make sure signal handler doesn't get spurious FP exceptions */
current->thread.fp_state.fpscr = 0;
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
- /* Remove TM bits from thread's MSR. The MSR in the sigcontext
- * just indicates to userland that we were doing a transaction, but we
- * don't want to return in transactional state:
- */
- regs->msr &= ~MSR_TS_MASK;
-#endif
/* Set up to return from userspace. */
if (vdso64_rt_sigtramp && current->mm->context.vdso_base) {
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 907a472..b543587 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1384,7 +1384,6 @@ void fp_unavailable_tm(struct pt_regs *regs)
TM_DEBUG("FP Unavailable trap whilst transactional at 0x%lx, MSR=%lx\n",
regs->nip, regs->msr);
- tm_enable();
/* We can only have got here if the task started using FP after
* beginning the transaction. So, the transactional regs are just a
@@ -1393,8 +1392,7 @@ void fp_unavailable_tm(struct pt_regs *regs)
* transaction, and probably retry but now with FP enabled. So the
* checkpointed FP registers need to be loaded.
*/
- tm_reclaim(&current->thread, current->thread.regs->msr,
- TM_CAUSE_FAC_UNAV);
+ tm_reclaim_current(TM_CAUSE_FAC_UNAV);
/* Reclaim didn't save out any FPRs to transact_fprs. */
/* Enable FP for the task: */
@@ -1417,9 +1415,7 @@ void altivec_unavailable_tm(struct pt_regs *regs)
TM_DEBUG("Vector Unavailable trap whilst transactional at 0x%lx,"
"MSR=%lx\n",
regs->nip, regs->msr);
- tm_enable();
- tm_reclaim(&current->thread, current->thread.regs->msr,
- TM_CAUSE_FAC_UNAV);
+ tm_reclaim_current(TM_CAUSE_FAC_UNAV);
regs->msr |= MSR_VEC;
tm_recheckpoint(&current->thread, regs->msr);
current->thread.used_vr = 1;
@@ -1440,10 +1436,8 @@ void vsx_unavailable_tm(struct pt_regs *regs)
"MSR=%lx\n",
regs->nip, regs->msr);
- tm_enable();
/* This reclaims FP and/or VR regs if they're already enabled */
- tm_reclaim(&current->thread, current->thread.regs->msr,
- TM_CAUSE_FAC_UNAV);
+ tm_reclaim_current(TM_CAUSE_FAC_UNAV);
regs->msr |= MSR_VEC | MSR_FP | current->thread.fpexc_mode |
MSR_VSX;
diff --git a/arch/powerpc/kernel/vector.S b/arch/powerpc/kernel/vector.S
index 0458a9a..74f8050 100644
--- a/arch/powerpc/kernel/vector.S
+++ b/arch/powerpc/kernel/vector.S
@@ -37,6 +37,16 @@ _GLOBAL(do_load_up_transact_altivec)
#endif
/*
+ * Enable use of VMX/Altivec for the caller.
+ */
+_GLOBAL(vec_enable)
+ mfmsr r3
+ oris r3,r3,MSR_VEC@h
+ MTMSRD(r3)
+ isync
+ blr
+
+/*
* Load state from memory into VMX registers including VSCR.
* Assumes the caller has enabled VMX in the MSR.
*/
--
1.8.4.2
^ permalink raw reply related
* [PATCH 1/3] powerpc: Reclaim two unused thread_info flag bits
From: Paul Mackerras @ 2014-01-13 4:56 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1389588990-25953-1-git-send-email-paulus@samba.org>
TIF_PERFMON_WORK and TIF_PERFMON_CTXSW are completely unused. They
appear to be related to the old perfmon2 code, which has been
superseded by the perf_event infrastructure. This removes their
definitions so that the bits can be used for other purposes.
Signed-off-by: Paul Mackerras <paulus@samba.org>
---
arch/powerpc/include/asm/thread_info.h | 4 ----
1 file changed, 4 deletions(-)
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index 9854c56..fc2bf41 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -91,8 +91,6 @@ static inline struct thread_info *current_thread_info(void)
#define TIF_POLLING_NRFLAG 3 /* true if poll_idle() is polling
TIF_NEED_RESCHED */
#define TIF_32BIT 4 /* 32 bit binary */
-#define TIF_PERFMON_WORK 5 /* work for pfm_handle_work() */
-#define TIF_PERFMON_CTXSW 6 /* perfmon needs ctxsw calls */
#define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */
#define TIF_SINGLESTEP 8 /* singlestepping active */
#define TIF_NOHZ 9 /* in adaptive nohz mode */
@@ -115,8 +113,6 @@ static inline struct thread_info *current_thread_info(void)
#define _TIF_NEED_RESCHED (1<<TIF_NEED_RESCHED)
#define _TIF_POLLING_NRFLAG (1<<TIF_POLLING_NRFLAG)
#define _TIF_32BIT (1<<TIF_32BIT)
-#define _TIF_PERFMON_WORK (1<<TIF_PERFMON_WORK)
-#define _TIF_PERFMON_CTXSW (1<<TIF_PERFMON_CTXSW)
#define _TIF_SYSCALL_AUDIT (1<<TIF_SYSCALL_AUDIT)
#define _TIF_SINGLESTEP (1<<TIF_SINGLESTEP)
#define _TIF_SECCOMP (1<<TIF_SECCOMP)
--
1.8.4.2
^ permalink raw reply related
* [PATCH 0/3] Transactional memory fixes
From: Paul Mackerras @ 2014-01-13 4:56 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
This series of patches fixes a couple of cases where bugs in the
handling of FP/VMX/VSX state around transactions can lead to
corruption of the FP/VMX/VSX state visible to user processes. The
bugs were found basically by inspection but then verified by writing
small test programs that provoke the bugs.
Paul.
arch/powerpc/include/asm/processor.h | 2 +
arch/powerpc/include/asm/thread_info.h | 9 +-
arch/powerpc/include/asm/tm.h | 1 +
arch/powerpc/kernel/entry_64.S | 12 ++-
arch/powerpc/kernel/fpu.S | 16 ++++
arch/powerpc/kernel/process.c | 146 ++++++++++++++++++++++++++++++---
arch/powerpc/kernel/signal.c | 3 +-
arch/powerpc/kernel/signal_32.c | 21 ++---
arch/powerpc/kernel/signal_64.c | 14 ++--
arch/powerpc/kernel/traps.c | 57 +++++++++----
arch/powerpc/kernel/vector.S | 10 +++
11 files changed, 231 insertions(+), 60 deletions(-)
^ permalink raw reply
* Re: [PATCH] powerpc/iommu: Don't detach device without IOMMU group
From: Alexey Kardashevskiy @ 2014-01-13 4:54 UTC (permalink / raw)
To: Gavin Shan, linuxppc-dev
In-Reply-To: <1389584182-6349-1-git-send-email-shangw@linux.vnet.ibm.com>
On 01/13/2014 02:36 PM, Gavin Shan wrote:
> Some devices, for example PCI root port, don't have IOMMU table and
> group. We needn't detach them from their IOMMU group. Otherwise, it
> potentially incurs kernel crash because of referring NULL IOMMU group
> as following backtrace indicates:
>
> .iommu_group_remove_device+0x74/0x1b0
> .iommu_bus_notifier+0x94/0xb4
> .notifier_call_chain+0x78/0xe8
> .__blocking_notifier_call_chain+0x7c/0xbc
> .blocking_notifier_call_chain+0x38/0x48
> .device_del+0x50/0x234
> .pci_remove_bus_device+0x88/0x138
> .pci_stop_and_remove_bus_device+0x2c/0x40
> .pcibios_remove_pci_devices+0xcc/0xfc
> .pcibios_remove_pci_devices+0x3c/0xfc
>
> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Thanks! I never tested the unplug case...
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> arch/powerpc/kernel/iommu.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index 572bb5b..8a49424 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -1137,6 +1137,17 @@ static int iommu_add_device(struct device *dev)
>
> static void iommu_del_device(struct device *dev)
> {
> + /*
> + * Some devices might not have IOMMU table and group
> + * and we needn't detach them from the associated
> + * IOMMU groups
> + */
> + if (!dev->iommu_group) {
> + pr_debug("iommu_tce: skipping device %s with no tbl\n",
> + dev_name(dev));
> + return;
> + }
> +
> iommu_group_remove_device(dev);
> }
>
>
--
Alexey
^ permalink raw reply
* Re: [PATCH] pseries/cpuidle: Remove redundant call to ppc64_runlatch_off() in cpu idle routines
From: Preeti U Murthy @ 2014-01-13 4:22 UTC (permalink / raw)
To: Deepthi Dharwar; +Cc: linuxppc-dev, paulus
In-Reply-To: <52D3640F.2060805@linux.vnet.ibm.com>
Hi Deepthi,
On 01/13/2014 09:27 AM, Deepthi Dharwar wrote:
> On 01/09/2014 10:35 AM, Preeti U Murthy wrote:
>> Commit fbd7740fdfdf9475f switched pseries cpu idle handling from complete idle
>> loops to ppc_md.powersave functions. Earlier to this switch,
>> ppc64_runlatch_off() had to be called in each of the idle routines. But after
>> the switch this call is handled in arch_cpu_idle(),just before the call
>> to ppc_md.powersave, where platform specific idle routines are called.
>>
>> As a consequence, the call to ppc64_runlatch_off() got duplicated in the
>> arch_cpu_idle() routine as well as in the some of the idle routines in
>> pseries and commit fbd7740fdfdf9475f missed to get rid of these redundant
>> calls. These calls were carried over subsequent enhancements to the pseries
>> cpuidle routines. This patch takes care of eliminating this redundancy.
>>
>> Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
>> ---
>
> Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
>
> Preeti, I will include this patch as part of the pseries cpuidle driver
> clean-ups series which I have undertaken.
Yes that would be great, thanks!
Regards
Preeti U Murthy
>
> Regards,
> Deepthi
>
>> arch/powerpc/platforms/pseries/processor_idle.c | 3 ---
>> 1 file changed, 3 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/pseries/processor_idle.c b/arch/powerpc/platforms/pseries/processor_idle.c
>> index a166e38..09e4f56 100644
>> --- a/arch/powerpc/platforms/pseries/processor_idle.c
>> +++ b/arch/powerpc/platforms/pseries/processor_idle.c
>> @@ -17,7 +17,6 @@
>> #include <asm/reg.h>
>> #include <asm/machdep.h>
>> #include <asm/firmware.h>
>> -#include <asm/runlatch.h>
>> #include <asm/plpar_wrappers.h>
>>
>> struct cpuidle_driver pseries_idle_driver = {
>> @@ -63,7 +62,6 @@ static int snooze_loop(struct cpuidle_device *dev,
>> set_thread_flag(TIF_POLLING_NRFLAG);
>>
>> while ((!need_resched()) && cpu_online(cpu)) {
>> - ppc64_runlatch_off();
>> HMT_low();
>> HMT_very_low();
>> }
>> @@ -103,7 +101,6 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
>> idle_loop_prolog(&in_purr);
>> get_lppaca()->donate_dedicated_cpu = 1;
>>
>> - ppc64_runlatch_off();
>> HMT_medium();
>> check_and_cede_processor();
>>
>>
>> _______________________________________________
>> Linuxppc-dev mailing list
>> Linuxppc-dev@lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/linuxppc-dev
>>
>
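As a rough illustration of the redundancy described in the commit message above, here is a minimal user-space sketch in C. It is not the kernel code: every name ending in _model is a hypothetical stand-in, and the counter exists only to make the duplicate call visible. It assumes, as the commit message states, that the generic idle entry turns the run latch off once before calling the platform hook.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often the modeled runlatch-off operation runs. */
static int runlatch_off_calls;

/* Hypothetical stand-in for ppc64_runlatch_off(). */
static void runlatch_off_model(void)
{
	runlatch_off_calls++;
}

/* Platform idle routine after the patch: no runlatch call of its own. */
static void platform_idle_model(void)
{
	/* the real routine would lower thread priority or cede the CPU */
}

/* Platform idle routine before the patch: repeats the call. */
static void platform_idle_redundant_model(void)
{
	runlatch_off_model();
	platform_idle_model();
}

/* Generic idle entry: turns the run latch off once, then calls the hook. */
static void arch_idle_model(void (*powersave)(void))
{
	assert(powersave != NULL);
	runlatch_off_model();
	powersave();
}
```

Passing platform_idle_redundant_model to arch_idle_model() bumps the counter twice per idle entry, while platform_idle_model bumps it once, which is the behaviour the patch is after.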
^ permalink raw reply
* Re: [PATCH] pseries/cpuidle: Remove redundant call to ppc64_runlatch_off() in cpu idle routines
From: Deepthi Dharwar @ 2014-01-13 3:57 UTC (permalink / raw)
To: Preeti U Murthy; +Cc: linuxppc-dev, paulus
In-Reply-To: <20140109050519.11532.6044.stgit@preeti.in.ibm.com>
On 01/09/2014 10:35 AM, Preeti U Murthy wrote:
> Commit fbd7740fdfdf9475f switched pseries cpu idle handling from complete idle
> loops to ppc_md.powersave functions. Before this switch,
> ppc64_runlatch_off() had to be called in each of the idle routines. But after
> the switch this call is handled in arch_cpu_idle(), just before the call
> to ppc_md.powersave, where platform specific idle routines are called.
>
> As a consequence, the call to ppc64_runlatch_off() got duplicated in the
> arch_cpu_idle() routine as well as in some of the idle routines in
> pseries, and commit fbd7740fdfdf9475f failed to get rid of these redundant
> calls. These calls were carried over through subsequent enhancements to the
> pseries cpuidle routines. This patch eliminates this redundancy.
>
> Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> ---
Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Preeti, I will include this patch as part of the pseries cpuidle driver
clean-ups series which I have undertaken.
Regards,
Deepthi
> arch/powerpc/platforms/pseries/processor_idle.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/processor_idle.c b/arch/powerpc/platforms/pseries/processor_idle.c
> index a166e38..09e4f56 100644
> --- a/arch/powerpc/platforms/pseries/processor_idle.c
> +++ b/arch/powerpc/platforms/pseries/processor_idle.c
> @@ -17,7 +17,6 @@
> #include <asm/reg.h>
> #include <asm/machdep.h>
> #include <asm/firmware.h>
> -#include <asm/runlatch.h>
> #include <asm/plpar_wrappers.h>
>
> struct cpuidle_driver pseries_idle_driver = {
> @@ -63,7 +62,6 @@ static int snooze_loop(struct cpuidle_device *dev,
> set_thread_flag(TIF_POLLING_NRFLAG);
>
> while ((!need_resched()) && cpu_online(cpu)) {
> - ppc64_runlatch_off();
> HMT_low();
> HMT_very_low();
> }
> @@ -103,7 +101,6 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
> idle_loop_prolog(&in_purr);
> get_lppaca()->donate_dedicated_cpu = 1;
>
> - ppc64_runlatch_off();
> HMT_medium();
> check_and_cede_processor();
>
>
^ permalink raw reply
* [PATCH] powerpc/iommu: Don't detach device without IOMMU group
From: Gavin Shan @ 2014-01-13 3:36 UTC (permalink / raw)
To: linuxppc-dev; +Cc: aik, Gavin Shan
Some devices, for example PCI root ports, don't have an IOMMU table or
group, so we needn't detach them from an IOMMU group. Otherwise, the
detach can crash the kernel by dereferencing a NULL IOMMU group, as
the following backtrace indicates:
.iommu_group_remove_device+0x74/0x1b0
.iommu_bus_notifier+0x94/0xb4
.notifier_call_chain+0x78/0xe8
.__blocking_notifier_call_chain+0x7c/0xbc
.blocking_notifier_call_chain+0x38/0x48
.device_del+0x50/0x234
.pci_remove_bus_device+0x88/0x138
.pci_stop_and_remove_bus_device+0x2c/0x40
.pcibios_remove_pci_devices+0xcc/0xfc
.pcibios_remove_pci_devices+0x3c/0xfc
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
arch/powerpc/kernel/iommu.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 572bb5b..8a49424 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -1137,6 +1137,17 @@ static int iommu_add_device(struct device *dev)
static void iommu_del_device(struct device *dev)
{
+ /*
+ * Some devices might not have IOMMU table and group
+ * and we needn't detach them from the associated
+ * IOMMU groups
+ */
+ if (!dev->iommu_group) {
+ pr_debug("iommu_tce: skipping device %s with no tbl\n",
+ dev_name(dev));
+ return;
+ }
+
iommu_group_remove_device(dev);
}
--
1.7.10.4
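The guard added by this patch can be sketched outside the kernel as follows. This is a minimal user-space model, not the kernel implementation: struct device_model, struct iommu_group_model, group_remove_device() and del_device() are hypothetical stand-ins, and the assert marks the spot where the unguarded kernel path would dereference a NULL group.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct iommu_group_model {
	int ndevices;
};

struct device_model {
	const char *name;
	struct iommu_group_model *group; /* NULL when the device has no group */
};

/* Models the unguarded removal: it touches the group unconditionally. */
static void group_remove_device(struct device_model *dev)
{
	assert(dev->group != NULL); /* the kernel would crash here instead */
	dev->group->ndevices--;
	dev->group = NULL;
}

/* Models the patched path: returns 1 if detached, 0 if skipped. */
static int del_device(struct device_model *dev)
{
	if (!dev->group) {
		fprintf(stderr, "skipping device %s with no group\n", dev->name);
		return 0;
	}
	group_remove_device(dev);
	return 1;
}
```

A device without a group (like the PCI root port mentioned above) is skipped, while a device with a group is detached exactly once; a second removal of the same device is then also skipped.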
^ permalink raw reply related
* RE: [PATCH v8] clk: corenet: Adds the clock binding
From: Yuantian Tang @ 2014-01-13 2:40 UTC (permalink / raw)
To: Scott Wood
Cc: mark.rutland@arm.com, devicetree@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org
In-Reply-To: <1389385170.24905.19.camel@snotra.buserror.net>
Thanks for your review.
Thanks,
Yuantian
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Saturday, January 11, 2014 4:20
> To: Tang Yuantian-B29983
> Cc: Wood Scott-B07421; galak@kernel.crashing.org; mark.rutland@arm.com;
> devicetree@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; Li Yang-Leo-R58472
> Subject: Re: [PATCH v8] clk: corenet: Adds the clock binding
>
> On Fri, 2014-01-10 at 10:29 +0800, Tang Yuantian wrote:
> > +- reg: Offset and length of the clock register set
>
> "offset" into what?  The containing node is not within the scope of this
> binding.
>
> I know that plenty of other bindings are worded this way, and I wouldn't
> hold up acceptance if this were the only issue, but it ought to be fixed
> to say something like "reg: resource zero represents the clock register
> set".
>
OK, will refine it.
> > +Recommended properties:
> > +- clock-frequency: Input system clock frequency. Must be present
> > +	if the device has sub-nodes.
>
> Why only "if the device has sub-nodes"?
>
OK, will fix it.
> > +       * "fsl,qoriq-sysclk-1.0": for input system clock (v1.0).
> > +               It takes parent's clock as its clock.
> > +       * "fsl,qoriq-sysclk-2.0": for input system clock (v2.0).
> > +               It takes parent's clock as its clock.
>
> s/parent's clock/parent's clock-frequency/ since the parent isn't
> actually exposing a clock as per the clock bindings.
>
OK.
> > +Example for clock block and clock provider:
> > +/ {
> > +	clockgen: global-utilities@e1000 {
> > +		compatible = "fsl,p5020-clockgen", "fsl,qoriq-clockgen-1.0";
> > +		ranges = <0x0 0xe1000 0x1000>;
> > +		clock-frequency = <0>;
>
> It'd be better to show a real clock-frequency here -- this is an example
> for the node as the OS sees it, not what goes in the dts as an input to
> U-Boot.
>
OK, will remove it.
> -Scott
>
^ permalink raw reply
* [git pull] Please pull powerpc.git merge branch
From: Benjamin Herrenschmidt @ 2014-01-13 1:15 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linuxppc-dev, Linux Kernel list
Hi Linus !
Here's one regression fix for 3.13 that I would appreciate if you could still
pull in. It was an "interesting" one to debug, basically it's an old bug that
got somewhat "exposed" by new code breaking the boot on PA Semi boards (yes,
it does appear that some people are still using these !).
Cheers,
Ben.
The following changes since commit f991db1cf1bdca43675b5d2df0af991719727029:
Merge remote-tracking branch 'agust/merge' into merge (2013-12-30 14:48:27 +1100)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc.git merge
for you to fetch changes up to 10348f5976830e5d8f74e8abb04a9a057a5e8478:
powerpc: Check return value of instance-to-package OF call (2014-01-13 09:49:17 +1100)
----------------------------------------------------------------
Benjamin Herrenschmidt (1):
powerpc: Check return value of instance-to-package OF call
arch/powerpc/kernel/prom_init.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
^ permalink raw reply
* Re: [PATCH -V3 1/2] powerpc: mm: Move ppc64 page table range definitions to separate header
From: Benjamin Herrenschmidt @ 2014-01-12 22:46 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: aarcange, linuxppc-dev, paulus, kirill.shutemov, linux-mm
In-Reply-To: <87mwj8wn3e.fsf@linux.vnet.ibm.com>
On Tue, 2014-01-07 at 07:49 +0530, Aneesh Kumar K.V wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
>
> > On Mon, 2014-01-06 at 14:33 +0530, Aneesh Kumar K.V wrote:
> >> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> >>
> >> This avoid mmu-hash64.h including pagetable-ppc64.h. That inclusion
> >> cause issues like
> >
> > I don't like this. We have that stuff split into too many includes
> > already it's a mess.
>
> I understand. Let me know, if you have any suggestion on cleaning that
> up. I can do that.
>
> >
> > Why do we need to include it from mmu*.h ?
>
> in mmu-hash64.h added by me via 78f1dbde9fd020419313c2a0c3b602ea2427118f
>
> /*
> * This is necessary to get the definition of PGTABLE_RANGE which we
> * need for various slices related matters. Note that this isn't the
> * complete pgtable.h but only a portion of it.
> */
> #include <asm/pgtable-ppc64.h>
For now, just forward-declare the spinlock instead. I don't like the
inclusion of spinlock.h there anyway.
Cheers,
Ben.
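For readers unfamiliar with the forward-declaration suggestion above, here is a minimal C sketch of the general technique. It is an assumption-laden illustration: struct lock_model and struct mm_context_model are hypothetical names, and whether the same trick applies directly to the spinlock type under discussion depends on how that type is declared, which this thread does not show.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration: enough to declare pointers, no extra header needed. */
struct lock_model;

/* A "header-like" structure that only stores a pointer to the lock. */
struct mm_context_model {
	struct lock_model *lock; /* a pointer to an incomplete type is fine */
	unsigned long id;
};

/* Full definition, as it would appear in the one file that really needs it. */
struct lock_model {
	int held;
};

/* Reports whether the context's lock is held; a missing lock counts as not held. */
static int context_lock_held(const struct mm_context_model *ctx)
{
	assert(ctx != NULL);
	return ctx->lock ? ctx->lock->held : 0;
}
```

The point of the design is that code which only passes the pointer around never needs the full definition, so the header that declares struct mm_context_model does not have to pull in the lock's header.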
^ permalink raw reply
* Re: [PATCH 4/8] IBM Akebono: Add support to the OHCI platform driver for PPC476GTR
From: Alistair Popple @ 2014-01-12 23:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Greg KH, linux-usb, Sergei Shtylyov, linuxppc-dev
In-Reply-To: <1389402392.4672.87.camel@pasglop>
On Sat, 11 Jan 2014 12:06:32 Benjamin Herrenschmidt wrote:
> On Fri, 2014-01-10 at 16:52 -0800, Greg KH wrote:
> > > >Signed-off-by: Alistair Popple <alistair@popple.id.au>
> > > >Acked-by: Alan Stern <stern@rowland.harvard.edu>
> > > >Cc: linux-usb@vger.kernel.org
> > > >
> > > > Greg, why this patch hasn't been merged? Because it wasn't addressed to
> > > > you (but BenH)? The other, ehci-platform.c patch didn't even get posted to
> > > > linux-usb that time, but this one?
> >
> > Probably, yes, if it's not sent to me, I'm guessing that the person
> > doesn't want it applied by me, especially if it's written by someone who
> > knows what they are doing.
> >
> > I thought this was going through the PPC tree. My USB patch queue is
> > empty, and closed, for 3.14-rc1.
>
> Communication failure then :-) I told Alistair to submit it to the USB
> tree and thus was expecting it to be picked up by you but I didn't pay
> that much attention. I'll see if I can still put it into my tree based
> on invasiveness when I'm finished with travel. Otherwise, it will wait
> for the next window.
Sorry - this is probably my fault. I originally thought they would go via the
PPC tree but as Ben said he asked me to resubmit them to the appropriate
subsystem trees (which I forgot to do). Unless Ben picks them up I will
resubmit them for the next window.
- Alistair
> Ben.
^ permalink raw reply
* Re: [PATCH v2] powerpc/booke-64: fix tlbsrx. path in bolted tlb handler
From: Benjamin Herrenschmidt @ 2014-01-12 23:27 UTC (permalink / raw)
To: Scott Wood; +Cc: linuxppc-dev
In-Reply-To: <1389396656-27813-1-git-send-email-scottwood@freescale.com>
On Fri, 2014-01-10 at 17:30 -0600, Scott Wood wrote:
> From: Scott Wood <scott@tyr.buserror.net>
>
> It was branching to the cleanup part of the non-bolted handler,
> which would have been bad if there were any chips with tlbsrx.
> that use the bolted handler.
>
> Signed-off-by: Scott Wood <scottwood@freescale.com>
> ---
> v2: rebase
Ack.
> arch/powerpc/mm/tlb_low_64e.S | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
> index 75f5d27..16250b1 100644
> --- a/arch/powerpc/mm/tlb_low_64e.S
> +++ b/arch/powerpc/mm/tlb_low_64e.S
> @@ -136,7 +136,7 @@ BEGIN_MMU_FTR_SECTION
> */
> PPC_TLBSRX_DOT(0,R16)
> ldx r14,r14,r15 /* grab pgd entry */
> - beq normal_tlb_miss_done /* tlb exists already, bail */
> + beq tlb_miss_done_bolted /* tlb exists already, bail */
> MMU_FTR_SECTION_ELSE
> ldx r14,r14,r15 /* grab pgd entry */
> ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_USE_TLBRSRV)
> @@ -192,6 +192,7 @@ ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_USE_TLBRSRV)
> mtspr SPRN_MAS7_MAS3,r15
> tlbwe
>
> +tlb_miss_done_bolted:
> TLB_MISS_STATS_X(MMSTAT_TLB_MISS_NORM_OK)
> tlb_epilog_bolted
> rfi
^ permalink raw reply