LinuxPPC-Dev Archive on lore.kernel.org
* [PATCH -V1 04/24] powerpc: Reduce the PTE_INDEX_SIZE
From: Aneesh Kumar K.V @ 2013-02-26  8:04 UTC (permalink / raw)
  To: benh, paulus; +Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1361865914-13911-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

This makes one PMD cover a 16MB range, which eases the implementation of THP
on Power. The THP core code uses one pmd entry to track the huge page, and
the range mapped by a single pmd entry should equal the huge page size
supported by the hardware.

Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pgtable-ppc64-64k.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/pgtable-ppc64-64k.h b/arch/powerpc/include/asm/pgtable-ppc64-64k.h
index be4e287..3c529b4 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64-64k.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64-64k.h
@@ -4,10 +4,10 @@
 #include <asm-generic/pgtable-nopud.h>
 
 
-#define PTE_INDEX_SIZE  12
+#define PTE_INDEX_SIZE  8
 #define PMD_INDEX_SIZE  12
 #define PUD_INDEX_SIZE	0
-#define PGD_INDEX_SIZE  6
+#define PGD_INDEX_SIZE  10
 
 #ifndef __ASSEMBLY__
 #define PTE_TABLE_SIZE	(sizeof(real_pte_t) << PTE_INDEX_SIZE)
-- 
1.7.10

^ permalink raw reply related
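
The arithmetic behind the 16MB figure in the patch above can be sketched as
follows. This is only an illustration: the index sizes are copied from the
diff, PAGE_SHIFT = 16 is assumed from the 64K base page size, and the helper
names pmd_coverage() and va_bits() are hypothetical, not kernel functions.

```c
#include <stdio.h>

/* Bytes mapped by one PMD entry: one PTE page full of base pages. */
unsigned long long pmd_coverage(int page_shift, int pte_index_size)
{
    return 1ULL << (page_shift + pte_index_size);
}

/* Total virtual address bits resolved by walking all table levels. */
int va_bits(int page_shift, int pte, int pmd, int pud, int pgd)
{
    return page_shift + pte + pmd + pud + pgd;
}

/* Print the layout before and after the patch (PAGE_SHIFT assumed 16). */
void print_layouts(void)
{
    printf("old: PMD covers %llu MB, %d VA bits\n",
           pmd_coverage(16, 12) >> 20, va_bits(16, 12, 12, 0, 6));
    printf("new: PMD covers %llu MB, %d VA bits\n",
           pmd_coverage(16, 8) >> 20, va_bits(16, 8, 12, 0, 10));
}
```

Under these assumptions one PMD entry goes from covering 256MB to 16MB, and
moving four bits from the PTE index to the PGD index leaves the total at 46
address bits.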

* [PATCH -V1 03/24] powerpc: Don't hard code the size of pte page
From: Aneesh Kumar K.V @ 2013-02-26  8:04 UTC (permalink / raw)
  To: benh, paulus; +Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1361865914-13911-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

Use PTRS_PER_PTE to indicate the size of the pte page. To support THP,
later patches will change the value of PTRS_PER_PTE.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pgtable.h |    6 ++++++
 arch/powerpc/mm/hash_low_64.S      |    4 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index a9cbd3b..4b52726 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -17,6 +17,12 @@ struct mm_struct;
 #  include <asm/pgtable-ppc32.h>
 #endif
 
+/*
+ * We save the slot number & secondary bit in the second half of the
+ * PTE page. We use the 8 bytes per each pte entry.
+ */
+#define PTE_PAGE_HIDX_OFFSET (PTRS_PER_PTE * 8)
+
 #ifndef __ASSEMBLY__
 
 #include <asm/tlbflush.h>
diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
index 7443481..abdd5e2 100644
--- a/arch/powerpc/mm/hash_low_64.S
+++ b/arch/powerpc/mm/hash_low_64.S
@@ -490,7 +490,7 @@ END_FTR_SECTION(CPU_FTR_NOEXECUTE|CPU_FTR_COHERENT_ICACHE, CPU_FTR_NOEXECUTE)
 	beq	htab_inval_old_hpte
 
 	ld	r6,STK_PARAM(R6)(r1)
-	ori	r26,r6,0x8000		/* Load the hidx mask */
+	ori	r26,r6,PTE_PAGE_HIDX_OFFSET /* Load the hidx mask. */
 	ld	r26,0(r26)
 	addi	r5,r25,36		/* Check actual HPTE_SUB bit, this */
 	rldcr.	r0,r31,r5,0		/* must match pgtable.h definition */
@@ -607,7 +607,7 @@ htab_pte_insert_ok:
 	sld	r4,r4,r5
 	andc	r26,r26,r4
 	or	r26,r26,r3
-	ori	r5,r6,0x8000
+	ori	r5,r6,PTE_PAGE_HIDX_OFFSET
 	std	r26,0(r5)
 	lwsync
 	std	r30,0(r6)
-- 
1.7.10

^ permalink raw reply related
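
A small sketch of why the hard-coded 0x8000 in the patch above only holds
for one table size. It assumes PTRS_PER_PTE is 1 << PTE_INDEX_SIZE and uses
the 8 bytes per entry stated in the patch comment; the function names are
hypothetical stand-ins for the kernel macros.

```c
#include <stdio.h>

/* Bytes per PTE entry, as stated in the comment added by the patch. */
#define PTE_ENTRY_BYTES 8UL

/* Number of entries in one PTE page (assumed: 1 << PTE_INDEX_SIZE). */
unsigned long ptrs_per_pte(unsigned int pte_index_size)
{
    return 1UL << pte_index_size;
}

/* Offset of the hidx half of the PTE page, mirroring PTE_PAGE_HIDX_OFFSET. */
unsigned long pte_page_hidx_offset(unsigned int pte_index_size)
{
    return ptrs_per_pte(pte_index_size) * PTE_ENTRY_BYTES;
}

/* Show the offset for the index sizes used before and after the series. */
void print_offsets(void)
{
    printf("PTE_INDEX_SIZE 12: 0x%lx\n", pte_page_hidx_offset(12));
    printf("PTE_INDEX_SIZE  8: 0x%lx\n", pte_page_hidx_offset(8));
}
```

With an index size of 12 the offset equals the old constant 0x8000; with an
index size of 8 it becomes 0x800, which is why the assembly has to use the
symbolic value once PTRS_PER_PTE changes.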

* [PATCH -V1 00/24] THP support for PPC64
From: Aneesh Kumar K.V @ 2013-02-26  8:04 UTC (permalink / raw)
  To: benh, paulus; +Cc: linux-mm, linuxppc-dev

Hi,

This patchset adds transparent huge page support for PPC64.

I am copying linux-mm on the series because the PPC64 implementation
required a few interface changes to the core THP code.

TODO:
* ppc64 KVM related changes
* batch support for hpte invalidate
* powernv still doesn't boot
* hash preload support in update_mmu_cache_pmd

Some numbers:

The latency measurement code is from Anton and can be found at
http://ozlabs.org/~anton/junkcode/latency2001.c

THP disabled 64K page size
--------------------------
[root@llmp24l02 ~]# ./latency2001 8G
 8589934592    731.73 cycles    205.77 ns
[root@llmp24l02 ~]# ./latency2001 8G
 8589934592    743.39 cycles    209.05 ns
[root@llmp24l02 ~]#

THP disabled large page via hugetlbfs
-------------------------------------
[root@llmp24l02 ~]# ./latency2001  -l 8G
 8589934592    416.09 cycles    117.01 ns
[root@llmp24l02 ~]# ./latency2001  -l 8G
 8589934592    415.74 cycles    116.91 ns

THP enabled 64K page size.
--------------------------
[root@llmp24l02 ~]# ./latency2001 8G
 8589934592    405.07 cycles    113.91 ns
[root@llmp24l02 ~]# ./latency2001 8G
 8589934592    411.82 cycles    115.81 ns
[root@llmp24l02 ~]#


We are close to hugetlbfs in latency and we can achieve this with zero
config/page reservation. Most of the allocations above are fault allocated.
I haven't really measured the collapse alloc impact.

Another test that does 50000000 random accesses over a 1GB area goes from
2.65 seconds to 1.07 seconds with this patchset.

Changes from RFC V2:
* Address review comments
* More code cleanup and patch split

Changes from RFC V1:
* HugeTLB fs now works
* Compile issues fixed
* rebased to v3.8
* Patch series reordered so that ppc64 cleanups and MM THP changes are moved
  early in the series. This should help in picking those patches early.

Thanks,
-aneesh

^ permalink raw reply

* [PATCH -V1 02/24] powerpc: Save DAR and DSISR in pt_regs on MCE
From: Aneesh Kumar K.V @ 2013-02-26  8:04 UTC (permalink / raw)
  To: benh, paulus; +Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1361865914-13911-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

We were not saving DAR and DSISR on MCE. Save them and also print the values
along with exception details in xmon.

Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/exceptions-64s.S |    9 +++++++++
 arch/powerpc/xmon/xmon.c             |    2 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 0e9c48c..d02e730 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -640,9 +640,18 @@ slb_miss_user_pseries:
 	.align	7
 	.globl machine_check_common
 machine_check_common:
+
+	mfspr	r10,SPRN_DAR
+	std	r10,PACA_EXGEN+EX_DAR(r13)
+	mfspr	r10,SPRN_DSISR
+	stw	r10,PACA_EXGEN+EX_DSISR(r13)
 	EXCEPTION_PROLOG_COMMON(0x200, PACA_EXMC)
 	FINISH_NAP
 	DISABLE_INTS
+	ld	r3,PACA_EXGEN+EX_DAR(r13)
+	lwz	r4,PACA_EXGEN+EX_DSISR(r13)
+	std	r3,_DAR(r1)
+	std	r4,_DSISR(r1)
 	bl	.save_nvgprs
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	.machine_check_exception
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 1f8d2f1..a72e490 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -1423,7 +1423,7 @@ static void excprint(struct pt_regs *fp)
 	printf("    sp: %lx\n", fp->gpr[1]);
 	printf("   msr: %lx\n", fp->msr);
 
-	if (trap == 0x300 || trap == 0x380 || trap == 0x600) {
+	if (trap == 0x300 || trap == 0x380 || trap == 0x600 || trap == 0x200) {
 		printf("   dar: %lx\n", fp->dar);
 		if (trap != 0x380)
 			printf(" dsisr: %lx\n", fp->dsisr);
-- 
1.7.10

^ permalink raw reply related

* [PATCH -V1 01/24] powerpc: Use signed formatting when printing error
From: Aneesh Kumar K.V @ 2013-02-26  8:04 UTC (permalink / raw)
  To: benh, paulus; +Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V
In-Reply-To: <1361865914-13911-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

PAPR defines these errors as negative values. So print them accordingly
for easy debugging.

Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/pseries/lpar.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 0da39fe..a77c35b 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -155,7 +155,7 @@ static long pSeries_lpar_hpte_insert(unsigned long hpte_group,
 	 */
 	if (unlikely(lpar_rc != H_SUCCESS)) {
 		if (!(vflags & HPTE_V_BOLTED))
-			pr_devel(" lpar err %lu\n", lpar_rc);
+			pr_devel(" lpar err %ld\n", lpar_rc);
 		return -2;
 	}
 	if (!(vflags & HPTE_V_BOLTED))
-- 
1.7.10

^ permalink raw reply related
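
To illustrate what the format change above buys, the sketch below formats
the same negative long with both conversions. The value passed in is a
hypothetical return code, and format_rc() is a made-up helper for this
example, not part of lpar.c.

```c
#include <stdio.h>

/*
 * Format rc once as unsigned ("%lu", the old behaviour) and once as
 * signed ("%ld", the behaviour after the patch). Both buffers must hold
 * at least n bytes.
 */
void format_rc(long rc, char *as_unsigned, char *as_signed, size_t n)
{
    snprintf(as_unsigned, n, "%lu", (unsigned long)rc);
    snprintf(as_signed, n, "%ld", rc);
}
```

For a negative input the signed buffer holds the value as PAPR documents it,
while the unsigned buffer holds a large positive number that has to be
converted by hand before it can be looked up.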

* RE: [PATCH 2/6] powerpc/fsl_pci: Store the platform device information corresponding to the pci controller.
From: Sethi Varun-B16395 @ 2013-02-26  6:16 UTC (permalink / raw)
  To: Stuart Yoder
  Cc: Wood Scott-B07421, Joerg Roedel, linux-kernel@vger.kernel.org,
	Yoder Stuart-B08248, iommu@lists.linux-foundation.org,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CALRxmdDc9TuxH7HgAF3_iLgatoaOUML6fUt8SU+sxHyz6ZVjfw@mail.gmail.com>

This patch is not present in Joerg's tree and the add_device API in the PAMU
driver requires this patch.

-Varun

> -----Original Message-----
> From: Stuart Yoder [mailto:b08248@gmail.com]
> Sent: Tuesday, February 26, 2013 5:39 AM
> To: Sethi Varun-B16395
> Cc: iommu@lists.linux-foundation.org; linuxppc-dev@lists.ozlabs.org;
> linux-kernel@vger.kernel.org; Wood Scott-B07421; Joerg Roedel; Yoder
> Stuart-B08248
> Subject: Re: [PATCH 2/6] powerpc/fsl_pci: Store the platform device
> information corresponding to the pci controller.
>
> This patch was submitted separately to linuxppc-dev (and was already
> applied).  You don't need it in this patch set, right?
>
> Stuart
>
> On Mon, Feb 18, 2013 at 6:52 AM, Varun Sethi <Varun.Sethi@freescale.com>
> wrote:
> > The pci controller structure has a provision to store the device
> > structure pointer of the corresponding platform device. Currently this
> > information is not stored during fsl pci controller initialization.
> > This information is required while dealing with iommu groups for pci
> > devices connected to the fsl pci controller. For the case where the
> > pci devices can't be partitioned, they would fall under the same device
> > group as the pci controller.
> >
> > This patch stores the platform device information in the pci
> > controller structure during initialization.
> >
> > Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
> > ---
> >  arch/powerpc/sysdev/fsl_pci.c |    9 +++++++--
> >  arch/powerpc/sysdev/fsl_pci.h |    2 +-
> >  2 files changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/powerpc/sysdev/fsl_pci.c
> > b/arch/powerpc/sysdev/fsl_pci.c index 92a5915..b393ae7 100644
> > --- a/arch/powerpc/sysdev/fsl_pci.c
> > +++ b/arch/powerpc/sysdev/fsl_pci.c
> > @@ -421,13 +421,16 @@ void fsl_pcibios_fixup_bus(struct pci_bus *bus)
> >         }
> >  }
> >
> > -int __init fsl_add_bridge(struct device_node *dev, int is_primary)
> > +int __init fsl_add_bridge(struct platform_device *pdev, int
> > +is_primary)
> >  {
> >         int len;
> >         struct pci_controller *hose;
> >         struct resource rsrc;
> >         const int *bus_range;
> >         u8 hdr_type, progif;
> > +       struct device_node *dev;
> > +
> > +       dev = pdev->dev.of_node;
> >
> >         if (!of_device_is_available(dev)) {
> >                 pr_warning("%s: disabled\n", dev->full_name); @@
> > -453,6 +456,8 @@ int __init fsl_add_bridge(struct device_node *dev, int
> is_primary)
> >         if (!hose)
> >                 return -ENOMEM;
> >
> > +       /* set platform device as the parent */
> > +       hose->parent = &pdev->dev;
> >         hose->first_busno = bus_range ? bus_range[0] : 0x0;
> >         hose->last_busno = bus_range ? bus_range[1] : 0xff;
> >
> > @@ -880,7 +885,7 @@ static int fsl_pci_probe(struct platform_device
> > *pdev)  #endif
> >
> >         node = pdev->dev.of_node;
> > -       ret = fsl_add_bridge(node, fsl_pci_primary == node);
> > +       ret = fsl_add_bridge(pdev, fsl_pci_primary == node);
> >
> >  #ifdef CONFIG_SWIOTLB
> >         if (ret == 0) {
> > diff --git a/arch/powerpc/sysdev/fsl_pci.h
> > b/arch/powerpc/sysdev/fsl_pci.h index d078537..c495c00 100644
> > --- a/arch/powerpc/sysdev/fsl_pci.h
> > +++ b/arch/powerpc/sysdev/fsl_pci.h
> > @@ -91,7 +91,7 @@ struct ccsr_pci {
> >         __be32  pex_err_cap_r3;         /* 0x.e34 - PCIE error capture
> register 0 */
> >  };
> >
> > -extern int fsl_add_bridge(struct device_node *dev, int is_primary);
> > +extern int fsl_add_bridge(struct platform_device *pdev, int
> > +is_primary);
> >  extern void fsl_pcibios_fixup_bus(struct pci_bus *bus);  extern int
> > mpc83xx_add_bridge(struct device_node *dev);
> >  u64 fsl_pci_immrbar_base(struct pci_controller *hose);
> > --
> > 1.7.4.1
> >
> >
> > _______________________________________________
> > iommu mailing list
> > iommu@lists.linux-foundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply

* Re: [PATCH 5/6][v4]: perf: Create a sysfs entry for Power event format
From: Michael Ellerman @ 2013-02-26  5:26 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Andi Kleen, Peter Zijlstra, robert.richter, Anton Blanchard,
	linux-kernel, Stephane Eranian, linuxppc-dev, Ingo Molnar,
	Paul Mackerras, Arnaldo Carvalho de Melo, Jiri Olsa
In-Reply-To: <20130123062613.GF13720@us.ibm.com>

On Tue, Jan 22, 2013 at 10:26:13PM -0800, Sukadev Bhattiprolu wrote:
> 
> [PATCH 5/6][v4]: perf: Create a sysfs entry for Power event format
> 
> Create a sysfs entry, '/sys/bus/event_source/devices/cpu/format/event'
> which describes the format of a POWER cpu.

Did this patch go upstream? I don't see it.

If not, please don't merge it.

> The format of the event is the same for all POWER cpus at least in
> (Power6, Power7), so bulk of this change is common in the code common
> to POWER cpus.

No. The event format is different on most POWER cpus, in particular it
is different on Power6 and Power7, and will be different again on
Power8.

cheers

^ permalink raw reply

* [PATCH] drivers/tty/hvc: using strlcpy instead of strncpy
From: Chen Gang @ 2013-02-26  3:43 UTC (permalink / raw)
  To: Jiri Slaby, wfp5p, tklauser; +Cc: Greg KH, linuxppc-dev, alan


  When strlen(pi->location_code) is larger than HVCS_CLC_LENGTH + 1, the
    original implementation does not leave hvcsd->p_location_code NUL terminated.
  Fix that with strlcpy, which also simplifies the code.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
---
 drivers/tty/hvc/hvcs.c |    9 ++-------
 1 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/tty/hvc/hvcs.c b/drivers/tty/hvc/hvcs.c
index 1956593..81e939e 100644
--- a/drivers/tty/hvc/hvcs.c
+++ b/drivers/tty/hvc/hvcs.c
@@ -881,17 +881,12 @@ static struct vio_driver hvcs_vio_driver = {
 /* Only called from hvcs_get_pi please */
 static void hvcs_set_pi(struct hvcs_partner_info *pi, struct hvcs_struct *hvcsd)
 {
-	int clclength;
-
 	hvcsd->p_unit_address = pi->unit_address;
 	hvcsd->p_partition_ID  = pi->partition_ID;
-	clclength = strlen(&pi->location_code[0]);
-	if (clclength > HVCS_CLC_LENGTH)
-		clclength = HVCS_CLC_LENGTH;
 
 	/* copy the null-term char too */
-	strncpy(&hvcsd->p_location_code[0],
-			&pi->location_code[0], clclength + 1);
+	strlcpy(&hvcsd->p_location_code[0],
+			&pi->location_code[0], sizeof(hvcsd->p_location_code));
 }
 
 /*
-- 
1.7.7.6

^ permalink raw reply related
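
The difference the patch above relies on can be reproduced in user space.
This is a sketch under stated assumptions: CLC_LENGTH is a small
hypothetical stand-in for HVCS_CLC_LENGTH, copy_old() mirrors the removed
clamp-then-strncpy logic, and copy_bounded() is a hand-written helper with
strlcpy-like semantics, used here because not every C library ships strlcpy.

```c
#include <string.h>

#define CLC_LENGTH 8   /* hypothetical stand-in for HVCS_CLC_LENGTH */

/* Mirrors the removed code: clamp the length, then strncpy(len + 1). */
void copy_old(char *dst, const char *src)
{
    size_t len = strlen(src);

    if (len > CLC_LENGTH)
        len = CLC_LENGTH;
    /* If src is longer than CLC_LENGTH, no NUL is written to dst. */
    strncpy(dst, src, len + 1);
}

/* strlcpy-like copy: always NUL-terminates dst when size > 0. */
size_t copy_bounded(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);

    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;   /* length of src, so callers can detect truncation */
}
```

Calling copy_old() on a CLC_LENGTH + 1 byte buffer with a source that does
not fit fills every byte with source characters, whereas copy_bounded()
truncates and terminates.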

* [PATCH] powerpc/85xx: Reserve a partition of NOR flash for QE ucode firmware
From: Jiucheng Xu @ 2013-02-26  2:33 UTC (permalink / raw)
  To: galak, linuxppc-dev; +Cc: Jiucheng Xu

The JFFS2 partition overlaps the QE ucode firmware, so JFFS2 will corrupt
the QE ucode. Shrink the JFFS2 partition to reserve space for the
QE ucode firmware.

Signed-off-by: Jiucheng Xu <Jiucheng.Xu@freescale.com>
---
 arch/powerpc/boot/dts/p1021rdb-pc.dtsi |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/boot/dts/p1021rdb-pc.dtsi b/arch/powerpc/boot/dts/p1021rdb-pc.dtsi
index c13abfb..d6274c5 100644
--- a/arch/powerpc/boot/dts/p1021rdb-pc.dtsi
+++ b/arch/powerpc/boot/dts/p1021rdb-pc.dtsi
@@ -62,11 +62,19 @@
 		};
 
 		partition@400000 {
-			/* 11MB for JFFS2 based Root file System */
-			reg = <0x00400000 0x00b00000>;
+			/* 10.75MB for JFFS2 based Root file System */
+			reg = <0x00400000 0x00ac0000>;
 			label = "NOR JFFS2 Root File System";
 		};
 
+		partition@ec0000 {
+			/* This location must not be altered  */
+			/* 256KB for QE ucode firmware*/
+			reg = <0x00ec0000 0x00040000>;
+			label = "NOR QE microcode firmware";
+			read-only;
+		};
+
 		partition@f00000 {
 			/* This location must not be altered  */
 			/* 512KB for u-boot Bootloader Image */
-- 
1.6.4

^ permalink raw reply related
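
The overlap described in the patch above can be checked with a few lines of
arithmetic. The offsets and sizes are copied from the `reg` properties in
the diff; the struct and helper names are hypothetical and exist only for
this sketch.

```c
#include <stdint.h>

/* One flash partition: start offset and size in bytes, as in `reg`. */
struct nor_part {
    uint32_t offset;
    uint32_t size;
};

/* First byte past the end of the partition. */
uint32_t part_end(struct nor_part p)
{
    return p.offset + p.size;
}

/* Non-zero when the half-open ranges [offset, end) intersect. */
int parts_overlap(struct nor_part a, struct nor_part b)
{
    return a.offset < part_end(b) && b.offset < part_end(a);
}

/* Values from the diff: JFFS2 before and after, and the new QE region. */
const struct nor_part jffs2_old = { 0x00400000, 0x00b00000 };
const struct nor_part jffs2_new = { 0x00400000, 0x00ac0000 };
const struct nor_part qe_ucode  = { 0x00ec0000, 0x00040000 };
```

With these numbers the old JFFS2 range runs up to 0xf00000 and covers the
QE region, while the shrunk range ends exactly at 0xec0000 and the QE
region ends where partition@f00000 begins.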

* [PATCH] PowerPC:PSeries: strncpy need limit destnation length
From: Chen Gang @ 2013-02-26  2:51 UTC (permalink / raw)
  To: benh, paulus@samba.org; +Cc: linuxppc-dev


  The destination buffer length is 80 (HVCS_CLC_LENGTH + 1).
  The source buffer length is PAGE_SIZE.
  If the source string does not fit in the destination buffer, strncpy overruns it.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
---
 arch/powerpc/platforms/pseries/hvcserver.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hvcserver.c b/arch/powerpc/platforms/pseries/hvcserver.c
index fcf4b4c..4557e91 100644
--- a/arch/powerpc/platforms/pseries/hvcserver.c
+++ b/arch/powerpc/platforms/pseries/hvcserver.c
@@ -23,6 +23,7 @@
 #include <linux/list.h>
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/string.h>
 
 #include <asm/hvcall.h>
 #include <asm/hvcserver.h>
@@ -188,9 +189,9 @@ int hvcs_get_partner_info(uint32_t unit_address, struct list_head *head,
 			= (unsigned int)last_p_partition_ID;
 
 		/* copy the Null-term char too */
-		strncpy(&next_partner_info->location_code[0],
+		strlcpy(&next_partner_info->location_code[0],
 			(char *)&pi_buff[2],
-			strlen((char *)&pi_buff[2]) + 1);
+			sizeof(next_partner_info->location_code));
 
 		list_add_tail(&(next_partner_info->node), head);
 		next_partner_info = NULL;
-- 
1.7.7.6

^ permalink raw reply related

* [PATCH] ppc32: Fix compile of sha1-powerpc-asm.S
From: Tony Breeds @ 2013-02-26  2:20 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Josh Boyer, LinuxPPC-dev

When building with CRYPTO_SHA1_PPC enabled we fail with:
---
powerpc/crypto/sha1-powerpc-asm.S: Assembler messages:
powerpc/crypto/sha1-powerpc-asm.S:116: Error: can't resolve `0' {*ABS* section} - `STACKFRAMESIZE' {*UND* section}
powerpc/crypto/sha1-powerpc-asm.S:116: Error: expression too complex
powerpc/crypto/sha1-powerpc-asm.S:178: Error: unsupported relocation against STACKFRAMESIZE
---

Use INT_FRAME_SIZE instead.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
---
 arch/powerpc/crypto/sha1-powerpc-asm.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

FWIW the SHA1_PPC makes about a 20% difference on my 32bit board

diff --git a/arch/powerpc/crypto/sha1-powerpc-asm.S b/arch/powerpc/crypto/sha1-powerpc-asm.S
index a5f8264..125e165 100644
--- a/arch/powerpc/crypto/sha1-powerpc-asm.S
+++ b/arch/powerpc/crypto/sha1-powerpc-asm.S
@@ -113,7 +113,7 @@
 	STEPUP4((t)+16, fn)
 
 _GLOBAL(powerpc_sha_transform)
-	PPC_STLU r1,-STACKFRAMESIZE(r1)
+	PPC_STLU r1,-INT_FRAME_SIZE(r1)
 	SAVE_8GPRS(14, r1)
 	SAVE_10GPRS(22, r1)
 
@@ -175,5 +175,5 @@ _GLOBAL(powerpc_sha_transform)
 
 	REST_8GPRS(14, r1)
 	REST_10GPRS(22, r1)
-	addi	r1,r1,STACKFRAMESIZE
+	addi	r1,r1,INT_FRAME_SIZE
 	blr
-- 
1.8.1.2

^ permalink raw reply related

* Re: [PATCH v6 04/46] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks
From: Lai Jiangshan @ 2013-02-26  0:19 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: linux-doc, peterz, fweisbec, linux-kernel, Michel Lespinasse,
	mingo, linux-arch, linux, xiaoguangrong, wangyun, paulmck, nikunj,
	linux-pm, rusty, rostedt, rjw, namhyung, tglx, linux-arm-kernel,
	netdev, oleg, vincent.guittot, sbw, tj, akpm, linuxppc-dev
In-Reply-To: <CACvQF51jCxk5jUqmhD=QBBtUsBkQWZzakacrKO4Gsk=w61rNwQ@mail.gmail.com>

On Tue, Feb 26, 2013 at 8:17 AM, Lai Jiangshan <eag0628@gmail.com> wrote:
> On Tue, Feb 26, 2013 at 3:26 AM, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> Hi Lai,
>>
>> On 02/25/2013 09:23 PM, Lai Jiangshan wrote:
>>> Hi, Srivatsa,
>>>
>>> The target of the whole patchset is nice for me.
>>
>> Cool! Thanks :-)
>>
>>> A question: How did you find out the such usages of
>>> "preempt_disable()" and convert them? did all are converted?
>>>
>>
>> Well, I scanned through the source tree for usages which implicitly
>> disabled CPU offline and converted them over. Its not limited to uses
>> of preempt_disable() alone - even spin_locks, rwlocks, local_irq_disable()
>> etc also help disable CPU offline. So I tried to dig out all such uses
>> and converted them. However, since the merge window is open, a lot of
>> new code is flowing into the tree. So I'll have to rescan the tree to
>> see if there are any more places to convert.
>>
>>> And I think the lock is too complex and reinvent the wheel, why don't
>>> you reuse the lglock?
>>
>> lglocks? No way! ;-) See below...
>>
>>> I wrote an untested draft here.
>>>
>>> Thanks,
>>> Lai
>>>
>>> PS: Some HA tools(I'm writing one) which takes checkpoints of
>>> virtual-machines frequently, I guess this patchset can speedup the
>>> tools.
>>>
>>> From 01db542693a1b7fc6f9ece45d57cb529d9be5b66 Mon Sep 17 00:00:00 2001
>>> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>>> Date: Mon, 25 Feb 2013 23:14:27 +0800
>>> Subject: [PATCH] lglock: add read-preference local-global rwlock
>>>
>>> locality via lglock(trylock)
>>> read-preference read-write-lock via fallback rwlock_t
>>>
>>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
>>> ---
>>>  include/linux/lglock.h |   31 +++++++++++++++++++++++++++++++
>>>  kernel/lglock.c        |   45 +++++++++++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 76 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/include/linux/lglock.h b/include/linux/lglock.h
>>> index 0d24e93..30fe887 100644
>>> --- a/include/linux/lglock.h
>>> +++ b/include/linux/lglock.h
>>> @@ -67,4 +67,35 @@ void lg_local_unlock_cpu(struct lglock *lg, int cpu);
>>>  void lg_global_lock(struct lglock *lg);
>>>  void lg_global_unlock(struct lglock *lg);
>>>
>>> +struct lgrwlock {
>>> +     unsigned long __percpu *fallback_reader_refcnt;
>>> +     struct lglock lglock;
>>> +     rwlock_t fallback_rwlock;
>>> +};
>>> +
>>> +#define DEFINE_LGRWLOCK(name)                                                \
>>> +     static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock)           \
>>> +     = __ARCH_SPIN_LOCK_UNLOCKED;                                    \
>>> +     static DEFINE_PER_CPU(unsigned long, name ## _refcnt);          \
>>> +     struct lgrwlock name = {                                        \
>>> +             .fallback_reader_refcnt = &name ## _refcnt,             \
>>> +             .lglock = { .lock = &name ## _lock } }
>>> +
>>> +#define DEFINE_STATIC_LGRWLOCK(name)                                 \
>>> +     static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock)           \
>>> +     = __ARCH_SPIN_LOCK_UNLOCKED;                                    \
>>> +     static DEFINE_PER_CPU(unsigned long, name ## _refcnt);          \
>>> +     static struct lgrwlock name = {                                 \
>>> +             .fallback_reader_refcnt = &name ## _refcnt,             \
>>> +             .lglock = { .lock = &name ## _lock } }
>>> +
>>> +static inline void lg_rwlock_init(struct lgrwlock *lgrw, char *name)
>>> +{
>>> +     lg_lock_init(&lgrw->lglock, name);
>>> +}
>>> +
>>> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw);
>>> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw);
>>> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw);
>>> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw);
>>>  #endif
>>> diff --git a/kernel/lglock.c b/kernel/lglock.c
>>> index 6535a66..463543a 100644
>>> --- a/kernel/lglock.c
>>> +++ b/kernel/lglock.c
>>> @@ -87,3 +87,48 @@ void lg_global_unlock(struct lglock *lg)
>>>       preempt_enable();
>>>  }
>>>  EXPORT_SYMBOL(lg_global_unlock);
>>> +
>>> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw)
>>> +{
>>> +     struct lglock *lg = &lgrw->lglock;
>>> +
>>> +     preempt_disable();
>>> +     if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
>>> +             if (likely(arch_spin_trylock(this_cpu_ptr(lg->lock)))) {
>>> +                     rwlock_acquire_read(&lg->lock_dep_map, 0, 0, _RET_IP_);
>>> +                     return;
>>> +             }
>>> +             read_lock(&lgrw->fallback_rwlock);
>>> +     }
>>> +
>>> +     __this_cpu_inc(*lgrw->fallback_reader_refcnt);
>>> +}
>>> +EXPORT_SYMBOL(lg_rwlock_local_read_lock);
>>> +
>>> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw)
>>> +{
>>> +     if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
>>> +             lg_local_unlock(&lgrw->lglock);
>>> +             return;
>>> +     }
>>> +
>>> +     if (!__this_cpu_dec_return(*lgrw->fallback_reader_refcnt))
>>> +             read_unlock(&lgrw->fallback_rwlock);
>>> +
>>> +     preempt_enable();
>>> +}
>>> +EXPORT_SYMBOL(lg_rwlock_local_read_unlock);
>>> +
>>
>> If I read the code above correctly, all you are doing is implementing a
>> recursive reader-side primitive (ie., allowing the reader to call these
>> functions recursively, without resulting in a self-deadlock).
>>
>> But the thing is, making the reader-side recursive is the least of our
>> problems! Our main challenge is to make the locking extremely flexible
>> and also safe-guard it against circular-locking-dependencies and deadlocks.
>> Please take a look at the changelog of patch 1 - it explains the situation
>> with an example.
>
>
> My lock fixes your requirements(I read patch 1-6 before I sent). In

s/fixes/fits/

> the read side, lglock's lock is taken via trylock, so the lglock doesn't
> contribute to deadlocks; we can consider that it doesn't exist when we
> look for deadlocks. And the global fallback rwlock doesn't result in
> deadlocks because it is read-preference (you need to inc the
> fallback_reader_refcnt inside the cpu-hotplug write-side; I don't do
> it in the generic lgrwlock)
>
>
> If lg_rwlock_local_read_lock() spins, which means
> lg_rwlock_local_read_lock() spins on fallback_rwlock, and which means
> lg_rwlock_global_write_lock() took the lgrwlock successfully and
> return, and which means lg_rwlock_local_read_lock() will stop spinning
> when the write side finished.
>
>
>>
>>> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw)
>>> +{
>>> +     lg_global_lock(&lgrw->lglock);
>>
>> This does a for-loop on all CPUs and takes their locks one-by-one. That's
>> exactly what we want to prevent, because that is the _source_ of all our
>> deadlock woes in this case. In the presence of perfect lock ordering
>> guarantees, this wouldn't have been a problem (that's why lglocks are
>> being used successfully elsewhere in the kernel). In the stop-machine()
>> removal case, the over-flexibility of preempt_disable() forces us to provide
>> an equally flexible locking alternative. Hence we can't use such per-cpu
>> locking schemes.
>>
>> You might note that, for exactly this reason, I haven't actually used any
>> per-cpu _locks_ in this synchronization scheme, though it is named as
>> "per-cpu rwlocks". The only per-cpu component here are the refcounts, and
>> we consciously avoid waiting/spinning on them (because then that would be
>> equivalent to having per-cpu locks, which are deadlock-prone). We use
>> global rwlocks to get the deadlock-safety that we need.
>>
>>> +     write_lock(&lgrw->fallback_rwlock);
>>> +}
>>> +EXPORT_SYMBOL(lg_rwlock_global_write_lock);
>>> +
>>> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw)
>>> +{
>>> +     write_unlock(&lgrw->fallback_rwlock);
>>> +     lg_global_unlock(&lgrw->lglock);
>>> +}
>>> +EXPORT_SYMBOL(lg_rwlock_global_write_unlock);
>>>
>>
>> Regards,
>> Srivatsa S. Bhat
>>

^ permalink raw reply

* Re: [PATCH v6 04/46] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks
From: Lai Jiangshan @ 2013-02-26  0:17 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: linux-doc, peterz, fweisbec, linux-kernel, Michel Lespinasse,
	mingo, linux-arch, linux, xiaoguangrong, wangyun, paulmck, nikunj,
	linux-pm, rusty, rostedt, rjw, namhyung, tglx, linux-arm-kernel,
	netdev, oleg, vincent.guittot, sbw, tj, akpm, linuxppc-dev
In-Reply-To: <512BBAD8.8010006@linux.vnet.ibm.com>

On Tue, Feb 26, 2013 at 3:26 AM, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> Hi Lai,
>
> On 02/25/2013 09:23 PM, Lai Jiangshan wrote:
>> Hi, Srivatsa,
>>
>> The target of the whole patchset is nice for me.
>
> Cool! Thanks :-)
>
>> A question: How did you find out the such usages of
>> "preempt_disable()" and convert them? did all are converted?
>>
>
> Well, I scanned through the source tree for usages which implicitly
> disabled CPU offline and converted them over. Its not limited to uses
> of preempt_disable() alone - even spin_locks, rwlocks, local_irq_disable()
> etc also help disable CPU offline. So I tried to dig out all such uses
> and converted them. However, since the merge window is open, a lot of
> new code is flowing into the tree. So I'll have to rescan the tree to
> see if there are any more places to convert.
>
>> And I think the lock is too complex and reinvent the wheel, why don't
>> you reuse the lglock?
>
> lglocks? No way! ;-) See below...
>
>> I wrote an untested draft here.
>>
>> Thanks,
>> Lai
>>
>> PS: Some HA tools(I'm writing one) which takes checkpoints of
>> virtual-machines frequently, I guess this patchset can speedup the
>> tools.
>>
>> From 01db542693a1b7fc6f9ece45d57cb529d9be5b66 Mon Sep 17 00:00:00 2001
>> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>> Date: Mon, 25 Feb 2013 23:14:27 +0800
>> Subject: [PATCH] lglock: add read-preference local-global rwlock
>>
>> locality via lglock(trylock)
>> read-preference read-write-lock via fallback rwlock_t
>>
>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
>> ---
>>  include/linux/lglock.h |   31 +++++++++++++++++++++++++++++++
>>  kernel/lglock.c        |   45 +++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 76 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/linux/lglock.h b/include/linux/lglock.h
>> index 0d24e93..30fe887 100644
>> --- a/include/linux/lglock.h
>> +++ b/include/linux/lglock.h
>> @@ -67,4 +67,35 @@ void lg_local_unlock_cpu(struct lglock *lg, int cpu);
>>  void lg_global_lock(struct lglock *lg);
>>  void lg_global_unlock(struct lglock *lg);
>>
>> +struct lgrwlock {
>> +     unsigned long __percpu *fallback_reader_refcnt;
>> +     struct lglock lglock;
>> +     rwlock_t fallback_rwlock;
>> +};
>> +
>> +#define DEFINE_LGRWLOCK(name)                                                \
>> +     static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock)           \
>> +     = __ARCH_SPIN_LOCK_UNLOCKED;                                    \
>> +     static DEFINE_PER_CPU(unsigned long, name ## _refcnt);          \
>> +     struct lgrwlock name = {                                        \
>> +             .fallback_reader_refcnt = &name ## _refcnt,             \
>> +             .lglock = { .lock = &name ## _lock } }
>> +
>> +#define DEFINE_STATIC_LGRWLOCK(name)                                 \
>> +     static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock)           \
>> +     = __ARCH_SPIN_LOCK_UNLOCKED;                                    \
>> +     static DEFINE_PER_CPU(unsigned long, name ## _refcnt);          \
>> +     static struct lgrwlock name = {                                 \
>> +             .fallback_reader_refcnt = &name ## _refcnt,             \
>> +             .lglock = { .lock = &name ## _lock } }
>> +
>> +static inline void lg_rwlock_init(struct lgrwlock *lgrw, char *name)
>> +{
>> +     lg_lock_init(&lgrw->lglock, name);
>> +}
>> +
>> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw);
>> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw);
>> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw);
>> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw);
>>  #endif
>> diff --git a/kernel/lglock.c b/kernel/lglock.c
>> index 6535a66..463543a 100644
>> --- a/kernel/lglock.c
>> +++ b/kernel/lglock.c
>> @@ -87,3 +87,48 @@ void lg_global_unlock(struct lglock *lg)
>>       preempt_enable();
>>  }
>>  EXPORT_SYMBOL(lg_global_unlock);
>> +
>> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw)
>> +{
>> +     struct lglock *lg = &lgrw->lglock;
>> +
>> +     preempt_disable();
>> +     if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
>> +             if (likely(arch_spin_trylock(this_cpu_ptr(lg->lock)))) {
>> +                     rwlock_acquire_read(&lg->lock_dep_map, 0, 0, _RET_IP_);
>> +                     return;
>> +             }
>> +             read_lock(&lgrw->fallback_rwlock);
>> +     }
>> +
>> +     __this_cpu_inc(*lgrw->fallback_reader_refcnt);
>> +}
>> +EXPORT_SYMBOL(lg_rwlock_local_read_lock);
>> +
>> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw)
>> +{
>> +     if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
>> +             lg_local_unlock(&lgrw->lglock);
>> +             return;
>> +     }
>> +
>> +     if (!__this_cpu_dec_return(*lgrw->fallback_reader_refcnt))
>> +             read_unlock(&lgrw->fallback_rwlock);
>> +
>> +     preempt_enable();
>> +}
>> +EXPORT_SYMBOL(lg_rwlock_local_read_unlock);
>> +
>
> If I read the code above correctly, all you are doing is implementing a
> recursive reader-side primitive (ie., allowing the reader to call these
> functions recursively, without resulting in a self-deadlock).
>
> But the thing is, making the reader-side recursive is the least of our
> problems! Our main challenge is to make the locking extremely flexible
> and also safe-guard it against circular-locking-dependencies and deadlocks.
> Please take a look at the changelog of patch 1 - it explains the situation
> with an example.


My lock meets your requirements (I read patches 1-6 before I sent it). On
the read side, the lglock's lock is taken via trylock, so the lglock doesn't
contribute to deadlocks; we can treat it as nonexistent when looking for
deadlocks. And the global fallback rwlock doesn't result in
deadlocks because it is read-preference (you need to inc the
fallback_reader_refcnt inside the cpu-hotplug write-side; I don't do
it in the generic lgrwlock).


If lg_rwlock_local_read_lock() spins, it is spinning on
fallback_rwlock, which means lg_rwlock_global_write_lock() took the
lgrwlock successfully and returned, which in turn means
lg_rwlock_local_read_lock() will stop spinning when the write side
finishes.
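
The read path described above can be modelled in user space roughly as
follows. This is only a single-threaded toy sketch of the control flow,
not the kernel code: struct toy_lgrw, toy_trylock() and the toy_lg_*()
helpers are hypothetical names, a plain bool stands in for the per-CPU
arch_spinlock_t, and an int counter stands in for read holders of
fallback_rwlock.

```c
#include <stdbool.h>

/* Toy stand-ins (assumptions, not kernel types). */
struct toy_lgrw {
	bool local_locked;      /* models this CPU's lglock spinlock */
	unsigned long refcnt;   /* models fallback_reader_refcnt */
	int fallback_readers;   /* models read holders of fallback_rwlock */
};

/* Non-blocking acquire: returns true only if the lock was free. */
static bool toy_trylock(bool *lock)
{
	if (*lock)
		return false;
	*lock = true;
	return true;
}

static void toy_lg_read_lock(struct toy_lgrw *l)
{
	if (l->refcnt == 0) {
		if (toy_trylock(&l->local_locked))
			return;         /* fast path: local lock only */
		l->fallback_readers++;  /* slow path: models read_lock() */
	}
	l->refcnt++;                    /* count fallback nesting */
}

static void toy_lg_read_unlock(struct toy_lgrw *l)
{
	if (l->refcnt == 0) {
		l->local_locked = false; /* undo the fast path */
		return;
	}
	if (--l->refcnt == 0)
		l->fallback_readers--;   /* models read_unlock() */
}
```

In this model a nested reader never waits on the local lock: the
trylock simply fails and the call falls through to the fallback path.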


>
>> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw)
>> +{
>> +     lg_global_lock(&lgrw->lglock);
>
> This does a for-loop on all CPUs and takes their locks one-by-one. That's
> exactly what we want to prevent, because that is the _source_ of all our
> deadlock woes in this case. In the presence of perfect lock ordering
> guarantees, this wouldn't have been a problem (that's why lglocks are
> being used successfully elsewhere in the kernel). In the stop-machine()
> removal case, the over-flexibility of preempt_disable() forces us to provide
> an equally flexible locking alternative. Hence we can't use such per-cpu
> locking schemes.
>
> You might note that, for exactly this reason, I haven't actually used any
> per-cpu _locks_ in this synchronization scheme, though it is named as
> "per-cpu rwlocks". The only per-cpu component here are the refcounts, and
> we consciously avoid waiting/spinning on them (because then that would be
> equivalent to having per-cpu locks, which are deadlock-prone). We use
> global rwlocks to get the deadlock-safety that we need.
>
>> +     write_lock(&lgrw->fallback_rwlock);
>> +}
>> +EXPORT_SYMBOL(lg_rwlock_global_write_lock);
>> +
>> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw)
>> +{
>> +     write_unlock(&lgrw->fallback_rwlock);
>> +     lg_global_unlock(&lgrw->lglock);
>> +}
>> +EXPORT_SYMBOL(lg_rwlock_global_write_unlock);
>>
>
> Regards,
> Srivatsa S. Bhat
>

^ permalink raw reply

* Re: [PATCH 2/6] powerpc/fsl_pci: Store the platform device information corresponding to the pci controller.
From: Stuart Yoder @ 2013-02-26  0:09 UTC (permalink / raw)
  To: Varun Sethi
  Cc: Joerg Roedel, Stuart Yoder, linux-kernel, iommu, Scott Wood,
	linuxppc-dev
In-Reply-To: <1361191939-21260-3-git-send-email-Varun.Sethi@freescale.com>

This patch was submitted separately to linuxppc-dev (and was already
applied).  You don't need it in this patch set, right?

Stuart

On Mon, Feb 18, 2013 at 6:52 AM, Varun Sethi <Varun.Sethi@freescale.com> wrote:
> The pci controller structure has a provision to store the device strcuture
> pointer of the corresponding platform device. Currently this information is
> not stored during fsl pci controller initialization. This information is
> required while dealing with iommu groups for pci devices connected to the fsl
> pci controller. For the case where the pci devices can't be paritioned, they
> would fall under the same device group as the pci controller.
>
> This patch stores the platform device information in the pci controller
> structure during initialization.
>
> Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
> ---
>  arch/powerpc/sysdev/fsl_pci.c |    9 +++++++--
>  arch/powerpc/sysdev/fsl_pci.h |    2 +-
>  2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
> index 92a5915..b393ae7 100644
> --- a/arch/powerpc/sysdev/fsl_pci.c
> +++ b/arch/powerpc/sysdev/fsl_pci.c
> @@ -421,13 +421,16 @@ void fsl_pcibios_fixup_bus(struct pci_bus *bus)
>         }
>  }
>
> -int __init fsl_add_bridge(struct device_node *dev, int is_primary)
> +int __init fsl_add_bridge(struct platform_device *pdev, int is_primary)
>  {
>         int len;
>         struct pci_controller *hose;
>         struct resource rsrc;
>         const int *bus_range;
>         u8 hdr_type, progif;
> +       struct device_node *dev;
> +
> +       dev = pdev->dev.of_node;
>
>         if (!of_device_is_available(dev)) {
>                 pr_warning("%s: disabled\n", dev->full_name);
> @@ -453,6 +456,8 @@ int __init fsl_add_bridge(struct device_node *dev, int is_primary)
>         if (!hose)
>                 return -ENOMEM;
>
> +       /* set platform device as the parent */
> +       hose->parent = &pdev->dev;
>         hose->first_busno = bus_range ? bus_range[0] : 0x0;
>         hose->last_busno = bus_range ? bus_range[1] : 0xff;
>
> @@ -880,7 +885,7 @@ static int fsl_pci_probe(struct platform_device *pdev)
>  #endif
>
>         node = pdev->dev.of_node;
> -       ret = fsl_add_bridge(node, fsl_pci_primary == node);
> +       ret = fsl_add_bridge(pdev, fsl_pci_primary == node);
>
>  #ifdef CONFIG_SWIOTLB
>         if (ret == 0) {
> diff --git a/arch/powerpc/sysdev/fsl_pci.h b/arch/powerpc/sysdev/fsl_pci.h
> index d078537..c495c00 100644
> --- a/arch/powerpc/sysdev/fsl_pci.h
> +++ b/arch/powerpc/sysdev/fsl_pci.h
> @@ -91,7 +91,7 @@ struct ccsr_pci {
>         __be32  pex_err_cap_r3;         /* 0x.e34 - PCIE error capture register 0 */
>  };
>
> -extern int fsl_add_bridge(struct device_node *dev, int is_primary);
> +extern int fsl_add_bridge(struct platform_device *pdev, int is_primary);
>  extern void fsl_pcibios_fixup_bus(struct pci_bus *bus);
>  extern int mpc83xx_add_bridge(struct device_node *dev);
>  u64 fsl_pci_immrbar_base(struct pci_controller *hose);
> --
> 1.7.4.1
>
>
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply

* Re: [PATCH v6 00/46] CPU hotplug: stop_machine()-free CPU hotplug
From: Srivatsa S. Bhat @ 2013-02-25 21:45 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-doc, peterz, fweisbec, linux-kernel, walken, mingo,
	linux-arch, linux, xiaoguangrong, wangyun, paulmck, nikunj,
	linux-pm, rostedt, rjw, namhyung, tglx, linux-arm-kernel, netdev,
	oleg, vincent.guittot, sbw, tj, akpm, linuxppc-dev
In-Reply-To: <87mwuxfatp.fsf@rustcorp.com.au>

On 02/22/2013 06:01 AM, Rusty Russell wrote:
> "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> writes:
>> Hi,
>>
>> This patchset removes CPU hotplug's dependence on stop_machine() from the CPU
>> offline path and provides an alternative (set of APIs) to preempt_disable() to
>> prevent CPUs from going offline, which can be invoked from atomic context.
>> The motivation behind the removal of stop_machine() is to avoid its ill-effects
>> and thus improve the design of CPU hotplug. (More description regarding this
>> is available in the patches).
> 
> If you're doing a v7, please put your benchmark results somewhere!
> 

Oh, I forgot to put them in v6! Thanks for reminding :-)
And yes, I'll have to do a v7 to incorporate changes (if any) to the new code
that went in during this merge window.

> The obvious place is in the 44/46.
>

Ok, will add it there. Thank you!
 
Regards,
Srivatsa S. Bhat

^ permalink raw reply

* Re: [PATCH v6 04/46] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks
From: Srivatsa S. Bhat @ 2013-02-25 19:26 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: linux-doc, peterz, fweisbec, linux-kernel, Michel Lespinasse,
	mingo, linux-arch, linux, xiaoguangrong, wangyun, paulmck, nikunj,
	linux-pm, rusty, rostedt, rjw, namhyung, tglx, linux-arm-kernel,
	netdev, oleg, vincent.guittot, sbw, tj, akpm, linuxppc-dev
In-Reply-To: <CACvQF53bdh4_BxF0y1fnTVR+T2OmRc0jmWQYftsvx92-fg-Lug@mail.gmail.com>

Hi Lai,

On 02/25/2013 09:23 PM, Lai Jiangshan wrote:
> Hi, Srivatsa,
> 
> The target of the whole patchset is nice for me.

Cool! Thanks :-)

> A question: How did you find out the such usages of
> "preempt_disable()" and convert them? did all are converted?
> 

Well, I scanned through the source tree for usages which implicitly
disabled CPU offline and converted them over. It's not limited to uses
of preempt_disable() alone - even spin_locks, rwlocks, local_irq_disable()
etc also help disable CPU offline. So I tried to dig out all such uses
and converted them. However, since the merge window is open, a lot of
new code is flowing into the tree. So I'll have to rescan the tree to
see if there are any more places to convert.

> And I think the lock is too complex and reinvent the wheel, why don't
> you reuse the lglock?

lglocks? No way! ;-) See below...

> I wrote an untested draft here.
> 
> Thanks,
> Lai
> 
> PS: Some HA tools(I'm writing one) which takes checkpoints of
> virtual-machines frequently, I guess this patchset can speedup the
> tools.
> 
> From 01db542693a1b7fc6f9ece45d57cb529d9be5b66 Mon Sep 17 00:00:00 2001
> From: Lai Jiangshan <laijs@cn.fujitsu.com>
> Date: Mon, 25 Feb 2013 23:14:27 +0800
> Subject: [PATCH] lglock: add read-preference local-global rwlock
> 
> locality via lglock(trylock)
> read-preference read-write-lock via fallback rwlock_t
> 
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
>  include/linux/lglock.h |   31 +++++++++++++++++++++++++++++++
>  kernel/lglock.c        |   45 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 76 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/lglock.h b/include/linux/lglock.h
> index 0d24e93..30fe887 100644
> --- a/include/linux/lglock.h
> +++ b/include/linux/lglock.h
> @@ -67,4 +67,35 @@ void lg_local_unlock_cpu(struct lglock *lg, int cpu);
>  void lg_global_lock(struct lglock *lg);
>  void lg_global_unlock(struct lglock *lg);
> 
> +struct lgrwlock {
> +	unsigned long __percpu *fallback_reader_refcnt;
> +	struct lglock lglock;
> +	rwlock_t fallback_rwlock;
> +};
> +
> +#define DEFINE_LGRWLOCK(name)						\
> +	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock)		\
> +	= __ARCH_SPIN_LOCK_UNLOCKED;					\
> +	static DEFINE_PER_CPU(unsigned long, name ## _refcnt);		\
> +	struct lgrwlock name = {					\
> +		.fallback_reader_refcnt = &name ## _refcnt,		\
> +		.lglock = { .lock = &name ## _lock } }
> +
> +#define DEFINE_STATIC_LGRWLOCK(name)					\
> +	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock)		\
> +	= __ARCH_SPIN_LOCK_UNLOCKED;					\
> +	static DEFINE_PER_CPU(unsigned long, name ## _refcnt);		\
> +	static struct lgrwlock name = {					\
> +		.fallback_reader_refcnt = &name ## _refcnt,		\
> +		.lglock = { .lock = &name ## _lock } }
> +
> +static inline void lg_rwlock_init(struct lgrwlock *lgrw, char *name)
> +{
> +	lg_lock_init(&lgrw->lglock, name);
> +}
> +
> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw);
> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw);
> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw);
> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw);
>  #endif
> diff --git a/kernel/lglock.c b/kernel/lglock.c
> index 6535a66..463543a 100644
> --- a/kernel/lglock.c
> +++ b/kernel/lglock.c
> @@ -87,3 +87,48 @@ void lg_global_unlock(struct lglock *lg)
>  	preempt_enable();
>  }
>  EXPORT_SYMBOL(lg_global_unlock);
> +
> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw)
> +{
> +	struct lglock *lg = &lgrw->lglock;
> +
> +	preempt_disable();
> +	if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
> +		if (likely(arch_spin_trylock(this_cpu_ptr(lg->lock)))) {
> +			rwlock_acquire_read(&lg->lock_dep_map, 0, 0, _RET_IP_);
> +			return;
> +		}
> +		read_lock(&lgrw->fallback_rwlock);
> +	}
> +
> +	__this_cpu_inc(*lgrw->fallback_reader_refcnt);
> +}
> +EXPORT_SYMBOL(lg_rwlock_local_read_lock);
> +
> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw)
> +{
> +	if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
> +		lg_local_unlock(&lgrw->lglock);
> +		return;
> +	}
> +
> +	if (!__this_cpu_dec_return(*lgrw->fallback_reader_refcnt))
> +		read_unlock(&lgrw->fallback_rwlock);
> +
> +	preempt_enable();
> +}
> +EXPORT_SYMBOL(lg_rwlock_local_read_unlock);
> +

If I read the code above correctly, all you are doing is implementing a
recursive reader-side primitive (ie., allowing the reader to call these
functions recursively, without resulting in a self-deadlock).

But the thing is, making the reader-side recursive is the least of our
problems! Our main challenge is to make the locking extremely flexible
and also safe-guard it against circular-locking-dependencies and deadlocks.
Please take a look at the changelog of patch 1 - it explains the situation
with an example.
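
For reference, a recursive reader-side primitive in the sense used above
can be sketched with nothing more than a nesting counter. This is a
single-threaded toy under stated assumptions: struct toy_rwlock,
struct toy_reader and the toy_nested_*() helpers are hypothetical names,
and the "lock" is just a count of read holders, not a real rwlock.

```c
#include <assert.h>

/* Toy stand-in for a global rwlock: only counts read holders. */
struct toy_rwlock {
	int readers;
};

/* Per-reader state, analogous to a per-CPU nesting refcount. */
struct toy_reader {
	unsigned long depth;
};

static void toy_nested_read_lock(struct toy_rwlock *l, struct toy_reader *r)
{
	if (r->depth++ == 0)
		l->readers++;   /* only the outermost call takes the lock */
}

static void toy_nested_read_unlock(struct toy_rwlock *l, struct toy_reader *r)
{
	assert(r->depth > 0);   /* unlock without lock is a caller bug */
	if (--r->depth == 0)
		l->readers--;   /* only the outermost call releases it */
}
```

Nested calls only bump the counter, so a reader cannot deadlock against
itself; this says nothing about lock ordering between different CPUs.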

> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw)
> +{
> +	lg_global_lock(&lgrw->lglock);

This does a for-loop on all CPUs and takes their locks one-by-one. That's
exactly what we want to prevent, because that is the _source_ of all our
deadlock woes in this case. In the presence of perfect lock ordering
guarantees, this wouldn't have been a problem (that's why lglocks are
being used successfully elsewhere in the kernel). In the stop-machine()
removal case, the over-flexibility of preempt_disable() forces us to provide
an equally flexible locking alternative. Hence we can't use such per-cpu
locking schemes.

You might note that, for exactly this reason, I haven't actually used any
per-cpu _locks_ in this synchronization scheme, though it is named as
"per-cpu rwlocks". The only per-cpu component here are the refcounts, and
we consciously avoid waiting/spinning on them (because then that would be
equivalent to having per-cpu locks, which are deadlock-prone). We use
global rwlocks to get the deadlock-safety that we need.

> +	write_lock(&lgrw->fallback_rwlock);
> +}
> +EXPORT_SYMBOL(lg_rwlock_global_write_lock);
> +
> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw)
> +{
> +	write_unlock(&lgrw->fallback_rwlock);
> +	lg_global_unlock(&lgrw->lglock);
> +}
> +EXPORT_SYMBOL(lg_rwlock_global_write_unlock);
> 

Regards,
Srivatsa S. Bhat

^ permalink raw reply

* Re: [PATCH v6 04/46] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks
From: Lai Jiangshan @ 2013-02-25 15:53 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: linux-doc, peterz, fweisbec, linux-kernel, Michel Lespinasse,
	mingo, linux-arch, linux, xiaoguangrong, wangyun, paulmck, nikunj,
	linux-pm, rusty, rostedt, rjw, namhyung, tglx, linux-arm-kernel,
	netdev, oleg, vincent.guittot, sbw, tj, akpm, linuxppc-dev
In-Reply-To: <51226F91.7000108@linux.vnet.ibm.com>

Hi, Srivatsa,

The goal of the whole patchset looks good to me.
A question: how did you find all such usages of
"preempt_disable()" and convert them? Were all of them converted?

And I think the lock is too complex and reinvents the wheel. Why don't
you reuse the lglock?
I wrote an untested draft here.

Thanks,
Lai

PS: Some HA tools (I'm writing one) take checkpoints of
virtual machines frequently; I guess this patchset can speed up those
tools.

From 01db542693a1b7fc6f9ece45d57cb529d9be5b66 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Date: Mon, 25 Feb 2013 23:14:27 +0800
Subject: [PATCH] lglock: add read-preference local-global rwlock

locality via lglock(trylock)
read-preference read-write-lock via fallback rwlock_t

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 include/linux/lglock.h |   31 +++++++++++++++++++++++++++++++
 kernel/lglock.c        |   45 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 76 insertions(+), 0 deletions(-)

diff --git a/include/linux/lglock.h b/include/linux/lglock.h
index 0d24e93..30fe887 100644
--- a/include/linux/lglock.h
+++ b/include/linux/lglock.h
@@ -67,4 +67,35 @@ void lg_local_unlock_cpu(struct lglock *lg, int cpu);
 void lg_global_lock(struct lglock *lg);
 void lg_global_unlock(struct lglock *lg);

+struct lgrwlock {
+	unsigned long __percpu *fallback_reader_refcnt;
+	struct lglock lglock;
+	rwlock_t fallback_rwlock;
+};
+
+#define DEFINE_LGRWLOCK(name)						\
+	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock)		\
+	= __ARCH_SPIN_LOCK_UNLOCKED;					\
+	static DEFINE_PER_CPU(unsigned long, name ## _refcnt);		\
+	struct lgrwlock name = {					\
+		.fallback_reader_refcnt = &name ## _refcnt,		\
+		.lglock = { .lock = &name ## _lock } }
+
+#define DEFINE_STATIC_LGRWLOCK(name)					\
+	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock)		\
+	= __ARCH_SPIN_LOCK_UNLOCKED;					\
+	static DEFINE_PER_CPU(unsigned long, name ## _refcnt);		\
+	static struct lgrwlock name = {					\
+		.fallback_reader_refcnt = &name ## _refcnt,		\
+		.lglock = { .lock = &name ## _lock } }
+
+static inline void lg_rwlock_init(struct lgrwlock *lgrw, char *name)
+{
+	lg_lock_init(&lgrw->lglock, name);
+}
+
+void lg_rwlock_local_read_lock(struct lgrwlock *lgrw);
+void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw);
+void lg_rwlock_global_write_lock(struct lgrwlock *lgrw);
+void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw);
 #endif
diff --git a/kernel/lglock.c b/kernel/lglock.c
index 6535a66..463543a 100644
--- a/kernel/lglock.c
+++ b/kernel/lglock.c
@@ -87,3 +87,48 @@ void lg_global_unlock(struct lglock *lg)
 	preempt_enable();
 }
 EXPORT_SYMBOL(lg_global_unlock);
+
+void lg_rwlock_local_read_lock(struct lgrwlock *lgrw)
+{
+	struct lglock *lg = &lgrw->lglock;
+
+	preempt_disable();
+	if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
+		if (likely(arch_spin_trylock(this_cpu_ptr(lg->lock)))) {
+			rwlock_acquire_read(&lg->lock_dep_map, 0, 0, _RET_IP_);
+			return;
+		}
+		read_lock(&lgrw->fallback_rwlock);
+	}
+
+	__this_cpu_inc(*lgrw->fallback_reader_refcnt);
+}
+EXPORT_SYMBOL(lg_rwlock_local_read_lock);
+
+void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw)
+{
+	if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
+		lg_local_unlock(&lgrw->lglock);
+		return;
+	}
+
+	if (!__this_cpu_dec_return(*lgrw->fallback_reader_refcnt))
+		read_unlock(&lgrw->fallback_rwlock);
+
+	preempt_enable();
+}
+EXPORT_SYMBOL(lg_rwlock_local_read_unlock);
+
+void lg_rwlock_global_write_lock(struct lgrwlock *lgrw)
+{
+	lg_global_lock(&lgrw->lglock);
+	write_lock(&lgrw->fallback_rwlock);
+}
+EXPORT_SYMBOL(lg_rwlock_global_write_lock);
+
+void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw)
+{
+	write_unlock(&lgrw->fallback_rwlock);
+	lg_global_unlock(&lgrw->lglock);
+}
+EXPORT_SYMBOL(lg_rwlock_global_write_unlock);
-- 
1.7.7.6

^ permalink raw reply related

* RE: [PATCH 0/6 v8] iommu/fsl: Freescale PAMU driver and IOMMU API implementation.
From: Sethi Varun-B16395 @ 2013-02-25 10:15 UTC (permalink / raw)
  To: joro@8bytes.org
  Cc: Wood Scott-B07421, joro@8bytes.org, linux-kernel@vger.kernel.org,
	Yoder Stuart-B08248, iommu@lists.linux-foundation.org,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <1361191939-21260-1-git-send-email-Varun.Sethi@freescale.com>

Hi Joerg,
Do you have any comments on the patch set?

Regards
Varun

> -----Original Message-----
> From: Sethi Varun-B16395
> Sent: Monday, February 18, 2013 6:22 PM
> To: iommu@lists.linux-foundation.org; linuxppc-dev@lists.ozlabs.org;
> linux-kernel@vger.kernel.org; Wood Scott-B07421; joro@8bytes.org; Yoder
> Stuart-B08248
> Cc: Sethi Varun-B16395
> Subject: [PATCH 0/6 v8] iommu/fsl: Freescale PAMU driver and IOMMU API
> implementation.
>
> This patchset provides the Freescale PAMU (Peripheral Access Management
> Unit) driver and the corresponding IOMMU API implementation. PAMU is the
> IOMMU present on Freescale QorIQ platforms. PAMU can authorize memory
> access, remap the memory address, and remap the I/O transaction type.
>
> This set consists of the following patches:
> 1. Addition of new field in the device (powerpc) archdata structure for
>    storing iommu domain information pointer. This pointer is stored when
>    the device is attached to a particular iommu domain.
> 2. Store PCI controller platform device information in the PCI controller
>    structure.
> 3. Add defines for FSL PCI controller BRR1 register.
> 4. Add window permission flags in the iommu_domain_window_enable API.
> 5. Add domain attributes for FSL PAMU driver.
> 6. PAMU driver and IOMMU API implementation.
>
> This patch set is based on the next branch of the iommu git tree
> maintained by Joerg.
>
> Varun Sethi (6):
>   Store iommu domain information in the device structure.
>   Store the platform device information corresponding to the pci
>     controller.
>   Added defines for the FSL PCI controller BRR1 register.
>   Add window permission flags for iommu_domain_window_enable API.
>   Add addtional attributes specific to the PAMU driver.
>   FSL PAMU driver and IOMMU API implementation.
>
>  arch/powerpc/include/asm/device.h     |    4 +
>  arch/powerpc/include/asm/pci-bridge.h |    4 +
>  arch/powerpc/sysdev/fsl_pci.c         |    9 +-
>  arch/powerpc/sysdev/fsl_pci.h         |    2 +-
>  drivers/iommu/Kconfig                 |    8 +
>  drivers/iommu/Makefile                |    1 +
>  drivers/iommu/fsl_pamu.c              | 1260 +++++++++++++++++++++++++++++++++
>  drivers/iommu/fsl_pamu.h              |  398 +++++++++++
>  drivers/iommu/fsl_pamu_domain.c       | 1135 +++++++++++++++++++++++++++++
>  drivers/iommu/fsl_pamu_domain.h       |   89 +++
>  drivers/iommu/iommu.c                 |    5 +-
>  include/linux/iommu.h                 |   40 +-
>  12 files changed, 2947 insertions(+), 8 deletions(-)
>  create mode 100644 drivers/iommu/fsl_pamu.c
>  create mode 100644 drivers/iommu/fsl_pamu.h
>  create mode 100644 drivers/iommu/fsl_pamu_domain.c
>  create mode 100644 drivers/iommu/fsl_pamu_domain.h
>
> --
> 1.7.4.1

^ permalink raw reply

* [RFC PATCH powerpc] try secondary hash before BUG in kernel_map_linear_page()
From: Li Zhong @ 2013-02-25  9:29 UTC (permalink / raw)
  To: PowerPC email list; +Cc: Paul Mackerras

This patch tries to fix the following issue when CONFIG_DEBUG_PAGEALLOC
is enabled:

[  543.075675] ------------[ cut here ]------------
[  543.075701] kernel BUG at arch/powerpc/mm/hash_utils_64.c:1239!
[  543.075714] Oops: Exception in kernel mode, sig: 5 [#1]
[  543.075722] PREEMPT SMP NR_CPUS=16 DEBUG_PAGEALLOC NUMA pSeries
[  543.075741] Modules linked in: binfmt_misc ehea
[  543.075759] NIP: c000000000036eb0 LR: c000000000036ea4 CTR: c00000000005a594
[  543.075771] REGS: c0000000a90832c0 TRAP: 0700   Not tainted  (3.8.0-next-20130222)
[  543.075781] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>  CR: 22224482  XER: 00000000
[  543.075816] SOFTE: 0
[  543.075823] CFAR: c00000000004c200
[  543.075830] TASK = c0000000e506b750[23934] 'cc1' THREAD: c0000000a9080000 CPU: 1
GPR00: 0000000000000001 c0000000a9083540 c000000000c600a8 ffffffffffffffff
GPR04: 0000000000000050 fffffffffffffffa c0000000a90834e0 00000000004ff594
GPR08: 0000000000000001 0000000000000000 000000009592d4d8 c000000000c86854
GPR12: 0000000000000002 c000000006ead300 0000000000a51000 0000000000000001
GPR16: f000000003354380 ffffffffffffffff ffffffffffffff80 0000000000000000
GPR20: 0000000000000001 c000000000c600a8 0000000000000001 0000000000000001
GPR24: 0000000003354380 c000000000000000 0000000000000000 c000000000b65950
GPR28: 0000002000000000 00000000000cd50e 0000000000bf50d9 c000000000c7c230
[  543.076005] NIP [c000000000036eb0] .kernel_map_pages+0x1e0/0x3f8
[  543.076016] LR [c000000000036ea4] .kernel_map_pages+0x1d4/0x3f8
[  543.076025] Call Trace:
[  543.076033] [c0000000a9083540] [c000000000036ea4] .kernel_map_pages+0x1d4/0x3f8 (unreliable)
[  543.076053] [c0000000a9083640] [c000000000167638] .get_page_from_freelist+0x6cc/0x8dc
[  543.076067] [c0000000a9083800] [c000000000167a48] .__alloc_pages_nodemask+0x200/0x96c
[  543.076082] [c0000000a90839c0] [c0000000001ade44] .alloc_pages_vma+0x160/0x1e4
[  543.076098] [c0000000a9083a80] [c00000000018ce04] .handle_pte_fault+0x1b0/0x7e8
[  543.076113] [c0000000a9083b50] [c00000000018d5a8] .handle_mm_fault+0x16c/0x1a0
[  543.076129] [c0000000a9083c00] [c0000000007bf1dc] .do_page_fault+0x4d0/0x7a4
[  543.076144] [c0000000a9083e30] [c0000000000090e8] handle_page_fault+0x10/0x30
[  543.076155] Instruction dump:
[  543.076163] 7c630038 78631d88 e80a0000 f8410028 7c0903a6 e91f01de e96a0010 e84a0008
[  543.076192] 4e800421 e8410028 7c7107b4 7a200fe0 <0b000000> 7f63db78 48785781 60000000
[  543.076224] ---[ end trace bd5807e8d6ae186b ]---

The code is borrowed from that in __hash_page_huge().

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hash_utils_64.c |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 1b6e127..31c7924 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1231,11 +1231,28 @@ static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi)
 	int ret;
 
 	hash = hpt_hash(vpn, PAGE_SHIFT, mmu_kernel_ssize);
+
+repeat:
 	hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP);
 
 	ret = ppc_md.hpte_insert(hpteg, vpn, __pa(vaddr),
 				 mode, HPTE_V_BOLTED,
 				 mmu_linear_psize, mmu_kernel_ssize);
+
+	if (unlikely(ret == -1)) {
+		hpteg = (~hash & htab_hash_mask) * HPTES_PER_GROUP;
+		ret = ppc_md.hpte_insert(hpteg, vpn, __pa(vaddr), mode,
+					 HPTE_V_SECONDARY,
+					 mmu_linear_psize, mmu_kernel_ssize);
+		if (ret == -1) {
+			if (mftb() & 0x1)
+				hpteg = (hash & htab_hash_mask) *
+					 HPTES_PER_GROUP;
+			ppc_md.hpte_remove(hpteg);
+			goto repeat;
+		}
+	}
+
 	BUG_ON (ret < 0);
 	spin_lock(&linear_map_hash_lock);
 	BUG_ON(linear_map_hash_slots[lmi] & 0x80);
-- 
1.7.9.5
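
The primary/secondary group selection used in the hunk above can be
sketched in isolation as follows. This is a user-space illustration, not
the kernel code: primary_group() and secondary_group() are hypothetical
helper names, and the mask passed in stands in for htab_hash_mask. The
arithmetic itself is copied from the patch.

```c
#include <stdint.h>

#define HPTES_PER_GROUP 8	/* slots per hash table group */

/* First slot index of the primary group for this hash. */
static uint64_t primary_group(uint64_t hash, uint64_t mask)
{
	return (hash & mask) * HPTES_PER_GROUP;
}

/* First slot index of the secondary group: same mask, inverted hash. */
static uint64_t secondary_group(uint64_t hash, uint64_t mask)
{
	return (~hash & mask) * HPTES_PER_GROUP;
}
```

For example, with hash 0x5 and mask 0xF the primary group starts at
slot 40 and the secondary group at slot 80; the two group numbers (5 and
10) are bitwise complements within the mask.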

^ permalink raw reply related

* Re: [PATCH 1/2] vfio powerpc: enabled on powernv platform
From: Paul Mackerras @ 2013-02-25  2:21 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, Alexey Kardashevskiy, linux-kernel, linuxppc-dev,
	David Gibson
In-Reply-To: <1360621003.12392.117.camel@bling.home>

On Mon, Feb 11, 2013 at 03:16:43PM -0700, Alex Williamson wrote:
> 
> Why do these all return long (vs int)?  Is this a POWER-ism?

On ppc64 the compiler tends to generate slightly shorter code with
longs than with ints.  The reason is that with ints the compiler has
to put in "extend sign word" instructions to convert 64-bit values
from arithmetic instructions to values in the 32-bit range.

Paul.
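
A small illustration of the point above, as a sketch only: the function
names are hypothetical, and the exact instructions emitted depend on the
compiler and optimisation level. extend_sign_word() spells out in C what
an "extend sign word" step does to a 64-bit register value.

```c
#include <stdint.h>

/* Returns int: the 64-bit add result has to be brought back into the
 * 32-bit range, which is where the extra sign extension comes from. */
static int add_int(int a, int b)
{
	return a + b;
}

/* Returns long: the 64-bit register result can be used as it is. */
static long add_long(long a, long b)
{
	return a + b;
}

/* Keep the low 32 bits and replicate bit 31 into the upper half.
 * The narrowing cast to int32_t is implementation-defined in ISO C;
 * GCC and Clang define it as two's-complement wraparound. */
static int64_t extend_sign_word(uint64_t reg)
{
	return (int64_t)(int32_t)(uint32_t)reg;
}
```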

^ permalink raw reply

* Re: [RFC PATCH -V2 08/21] powerpc: Decode the pte-lp-encoding bits correctly.
From: Aneesh Kumar K.V @ 2013-02-24 17:45 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev, linux-mm
In-Reply-To: <20130222053735.GH6139@drongo>

Paul Mackerras <paulus@samba.org> writes:

>
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 71d0c90..d2c9932 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -1515,7 +1515,12 @@ static void kvmppc_add_seg_page_size(struct kvm_ppc_one_seg_page_size **sps,
>>  	(*sps)->page_shift = def->shift;
>>  	(*sps)->slb_enc = def->sllp;
>>  	(*sps)->enc[0].page_shift = def->shift;
>> -	(*sps)->enc[0].pte_enc = def->penc;
>> +	/*
>> +	 * FIXME!!
>> +	 * This is returned to user space. Do we need to
>> +	 * return details of MPSS here ?
>
> Yes, we do, probably a separate entry for each valid base/actual page
> size pair.
>

How about

commit fb7bca460d5e3a517dce24c0fe28cc94ffde37fa
Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Date:   Sun Feb 24 22:55:38 2013 +0530

    powerpc: Return all the valid pte encodings in KVM_PPC_GET_SMMU_INFO ioctl
    
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 48f6d99..e50eb0d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1508,14 +1508,21 @@ long kvm_vm_ioctl_allocate_rma(struct kvm *kvm, struct kvm_allocate_rma *ret)
 static void kvmppc_add_seg_page_size(struct kvm_ppc_one_seg_page_size **sps,
 				     int linux_psize)
 {
+	int i, index = 0;
 	struct mmu_psize_def *def = &mmu_psize_defs[linux_psize];
 
 	if (!def->shift)
 		return;
 	(*sps)->page_shift = def->shift;
 	(*sps)->slb_enc = def->sllp;
-	(*sps)->enc[0].page_shift = def->shift;
-	(*sps)->enc[0].pte_enc = def->penc[linux_psize];
+	for (i = 0; i < MMU_PAGE_COUNT; i++) {
+		if ((signed int)def->penc[i] != -1) {
+			BUG_ON(index >= KVM_PPC_PAGE_SIZES_MAX_SZ);
+			(*sps)->enc[index].page_shift = mmu_psize_defs[i].shift;
+			(*sps)->enc[index].pte_enc = def->penc[i];
+			index++;
+		}
+	}
 	(*sps)++;
 }
 
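
The loop in the hunk above can be tried out in user space with a sketch
like the following. Everything here is a stand-in and labelled as such:
PAGE_COUNT and MAX_ENC are hypothetical small constants in place of
MMU_PAGE_COUNT and KVM_PPC_PAGE_SIZES_MAX_SZ, struct enc_entry and
collect_encodings() are made-up names, and assert() replaces BUG_ON().

```c
#include <assert.h>

#define PAGE_COUNT   4	/* hypothetical; stands in for MMU_PAGE_COUNT */
#define MAX_ENC      8	/* hypothetical; stands in for KVM_PPC_PAGE_SIZES_MAX_SZ */
#define PENC_INVALID ((unsigned int)-1)	/* all bits set marks "no encoding" */

struct enc_entry {
	unsigned int page_shift;
	unsigned int pte_enc;
};

/* Copy every valid (shift, encoding) pair into out[], in order.
 * Returns the number of entries written. */
static int collect_encodings(const unsigned int shifts[PAGE_COUNT],
			     const unsigned int penc[PAGE_COUNT],
			     struct enc_entry out[MAX_ENC])
{
	int i, n = 0;

	for (i = 0; i < PAGE_COUNT; i++) {
		if (penc[i] == PENC_INVALID)
			continue;	/* skip unsupported actual page sizes */
		assert(n < MAX_ENC);
		out[n].page_shift = shifts[i];
		out[n].pte_enc = penc[i];
		n++;
	}
	return n;
}
```

Note that an encoding of 0 is treated as valid here; only the all-ones
value is skipped.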

^ permalink raw reply related

* Re: [PATCH net] gianfar: fix compile fail for NET_POLL=y due to struct packing
From: David Miller @ 2013-02-24 17:04 UTC (permalink / raw)
  To: paul.gortmaker; +Cc: jianhua.xie, netdev, linuxppc-dev, claudiu.manoil
In-Reply-To: <1361720311-13267-1-git-send-email-paul.gortmaker@windriver.com>

From: Paul Gortmaker <paul.gortmaker@windriver.com>
Date: Sun, 24 Feb 2013 10:38:31 -0500

> Commit ee873fda3bec7c668407b837fc5519eb961fcd37 ("gianfar: Pack struct
> gfar_priv_grp into three cachelines") moved the irq number and names
> off into a separate struct and created accessors for them.  However
> it was never tested with NET_POLL enabled, and so some conversions
> that were simply overlooked went undetected until now.
> 
> Make the netpoll ones also use the gfar_irq() accessors.
> 
> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Claudiu Manoil <claudiu.manoil@freescale.com>
> Cc: Jianhua Xie <jianhua.xie@freescale.com>
> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

Applied, thanks.

^ permalink raw reply

* Re: [RFC PATCH -V2 08/21] powerpc: Decode the pte-lp-encoding bits correctly.
From: Aneesh Kumar K.V @ 2013-02-24 16:51 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev, linux-mm
In-Reply-To: <20130222053735.GH6139@drongo>

Paul Mackerras <paulus@samba.org> writes:

> On Thu, Feb 21, 2013 at 10:17:15PM +0530, Aneesh Kumar K.V wrote:
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>> 
>> We look at both the segment base page size and actual page size and store
>> the pte-lp-encodings in an array per base page size.
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>
> This needs more than 2 lines of patch description.  In fact what
> you're doing is adding general mixed page-size segment (MPSS)
> support.  Doing this should mean that you can also get rid of the
> MMU_PAGE_64K_AP value from the list in asm/mmu.h.
>

Can you elaborate on Admixed pages? I was not able to find more
information on that.


>>  struct mmu_psize_def
>>  {
>>  	unsigned int	shift;	/* number of bits */
>> -	unsigned int	penc;	/* HPTE encoding */
>> +	unsigned int	penc[MMU_PAGE_COUNT];	/* HPTE encoding */
>
> I guess this is reasonable, though adding space for 14 page size
> encodings seems a little bit over the top.  Also, you don't seem to
> have any way to indicate which encodings are valid, since 0 is a valid
> encoding.  Maybe you need to add a valid bit higher up to indicate
> which page sizes are valid.
>

How about penc = { [0 ... MMU_PAGE_COUNT - 1] = -1 }? That would make
sure we set all the bits for invalid entries.
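
A minimal sketch of that idea, assuming a GNU C compiler (the range
designator is a GCC/Clang extension). The names demo_psize_def,
DEMO_PAGE_COUNT and DEMO_PENC_INVALID are made up for illustration and
are not taken from the kernel. The upper bound of the range is
inclusive, so for an array of N elements it has to be N - 1:

```c
#include <stdbool.h>

#define DEMO_PAGE_COUNT   14      /* hypothetical stand-in for MMU_PAGE_COUNT */
#define DEMO_PENC_INVALID (~0u)   /* all bits set, same value as (unsigned)-1 */

struct demo_psize_def {
	unsigned int shift;                   /* number of bits */
	unsigned int penc[DEMO_PAGE_COUNT];   /* HPTE encoding per actual size */
};

/* GNU C range designator: mark every slot invalid up front. */
static struct demo_psize_def demo_def = {
	.shift = 24,
	.penc  = { [0 ... DEMO_PAGE_COUNT - 1] = DEMO_PENC_INVALID },
};

/* 0 is a valid encoding, so validity is tested against the sentinel. */
static bool demo_penc_valid(const struct demo_psize_def *def, int a_size)
{
	return def->penc[a_size] != DEMO_PENC_INVALID;
}
```

Setup code would then overwrite only the slots the hardware reports,
and anything left at the sentinel is treated as unsupported.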


>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 71d0c90..d2c9932 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -1515,7 +1515,12 @@ static void kvmppc_add_seg_page_size(struct kvm_ppc_one_seg_page_size **sps,
>>  	(*sps)->page_shift = def->shift;
>>  	(*sps)->slb_enc = def->sllp;
>>  	(*sps)->enc[0].page_shift = def->shift;
>> -	(*sps)->enc[0].pte_enc = def->penc;
>> +	/*
>> +	 * FIXME!!
>> +	 * This is returned to user space. Do we need to
>> +	 * return details of MPSS here ?
>
> Yes, we do, probably a separate entry for each valid base/actual page
> size pair.
>

OK, I will add new entries to enc for each valid actual page size
supported.

For a 16MB actual page size:

enc[1].page_shift = 24
enc[1].pte_enc = x

Should this be a separate patch?

>> +static inline int hpte_actual_psize(struct hash_pte *hptep, int psize)
>> +{
>> +	unsigned int mask;
>> +	int i, penc, shift;
>> +	/* Look at the 8 bit LP value */
>> +	unsigned int lp = (hptep->r >> LP_SHIFT) & ((1 << (LP_BITS + 1)) - 1);
>
> Why LP_BITS + 1 here?  You seem to be extracting and comparing 9 bits
> rather than 8.  Why is that?
>

My mistake.  Will fix
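
For reference, a sketch of the 8-bit extraction. The values of
DEMO_LP_SHIFT and DEMO_LP_BITS are assumptions chosen for illustration,
and demo_hpte_lp() is a hypothetical helper, not a kernel function:

```c
#define DEMO_LP_SHIFT 12   /* assumed bit position of the LP field */
#define DEMO_LP_BITS  8    /* width of the LP field in bits */

/* Extract exactly DEMO_LP_BITS bits starting at DEMO_LP_SHIFT. */
static unsigned int demo_hpte_lp(unsigned long r)
{
	return (r >> DEMO_LP_SHIFT) & ((1u << DEMO_LP_BITS) - 1);
}
```

With (1 << (LP_BITS + 1)) - 1 the mask would be 0x1ff instead of 0xff,
so one extra bit above the LP field would leak into the comparison.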

>> @@ -395,12 +422,13 @@ static void hpte_decode(struct hash_pte *hpte, unsigned long slot,
>>  			/* valid entries have a shift value */
>>  			if (!mmu_psize_defs[size].shift)
>>  				continue;
>> -
>> -			if (penc == mmu_psize_defs[size].penc)
>> -				break;
>> +			for (a_size = 0; a_size < MMU_PAGE_COUNT; a_size++)
>> +				if (penc == mmu_psize_defs[size].penc[a_size])
>> +					goto out;
>
> I think this will get false matches due to unused/invalid entries
> in mmu_psize_defs[size].penc[] containing 0.

Will set the invalid value to 0xff. 
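
Roughly, the lookup would then skip unused slots before comparing. This
is only a sketch under the assumption that 0xff is never a real
encoding; sketch_psize_def, SKETCH_PAGE_COUNT and sketch_decode() are
illustrative names, not the actual hpte_decode() code:

```c
#define SKETCH_PAGE_COUNT   4       /* hypothetical stand-in for MMU_PAGE_COUNT */
#define SKETCH_PENC_INVALID 0xffu   /* sentinel for unused slots */

struct sketch_psize_def {
	unsigned int shift;                     /* 0 means size unsupported */
	unsigned int penc[SKETCH_PAGE_COUNT];   /* indexed by actual page size */
};

/*
 * Find the (base, actual) page size pair whose encoding equals penc.
 * Returns 0 and fills *size and *a_size on a match, -1 otherwise.
 */
static int sketch_decode(const struct sketch_psize_def *defs,
			 unsigned int penc, int *size, int *a_size)
{
	int s, a;

	for (s = 0; s < SKETCH_PAGE_COUNT; s++) {
		/* valid entries have a shift value */
		if (!defs[s].shift)
			continue;
		for (a = 0; a < SKETCH_PAGE_COUNT; a++) {
			/* unused slots can never match, even for penc == 0 */
			if (defs[s].penc[a] == SKETCH_PENC_INVALID)
				continue;
			if (defs[s].penc[a] == penc) {
				*size = s;
				*a_size = a;
				return 0;
			}
		}
	}
	return -1;
}
```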

-aneesh


* [PATCH net] gianfar: fix compile fail for NET_POLL=y due to struct packing
From: Paul Gortmaker @ 2013-02-24 15:38 UTC (permalink / raw)
  To: netdev; +Cc: Jianhua Xie, Paul Gortmaker, linuxppc-dev, Claudiu Manoil
In-Reply-To: <1361717023.16950.12.camel@pasglop>

Commit ee873fda3bec7c668407b837fc5519eb961fcd37 ("gianfar: Pack struct
gfar_priv_grp into three cachelines") moved the irq number and names
off into a separate struct and created accessors for them.  However
it was never tested with NET_POLL enabled, and so some conversions
that were simply overlooked went undetected until now.

Make the netpoll ones also use the gfar_irq() accessors.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Claudiu Manoil <claudiu.manoil@freescale.com>
Cc: Jianhua Xie <jianhua.xie@freescale.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---

[compile tested for sbc8548 with NET_POLL=y]

 drivers/net/ethernet/freescale/gianfar.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index 4b5e8a6..d2c5441 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2906,21 +2906,23 @@ static void gfar_netpoll(struct net_device *dev)
 	/* If the device has multiple interrupts, run tx/rx */
 	if (priv->device_flags & FSL_GIANFAR_DEV_HAS_MULTI_INTR) {
 		for (i = 0; i < priv->num_grps; i++) {
-			disable_irq(priv->gfargrp[i].interruptTransmit);
-			disable_irq(priv->gfargrp[i].interruptReceive);
-			disable_irq(priv->gfargrp[i].interruptError);
-			gfar_interrupt(priv->gfargrp[i].interruptTransmit,
-				       &priv->gfargrp[i]);
-			enable_irq(priv->gfargrp[i].interruptError);
-			enable_irq(priv->gfargrp[i].interruptReceive);
-			enable_irq(priv->gfargrp[i].interruptTransmit);
+			struct gfar_priv_grp *grp = &priv->gfargrp[i];
+
+			disable_irq(gfar_irq(grp, TX)->irq);
+			disable_irq(gfar_irq(grp, RX)->irq);
+			disable_irq(gfar_irq(grp, ER)->irq);
+			gfar_interrupt(gfar_irq(grp, TX)->irq, grp);
+			enable_irq(gfar_irq(grp, ER)->irq);
+			enable_irq(gfar_irq(grp, RX)->irq);
+			enable_irq(gfar_irq(grp, TX)->irq);
 		}
 	} else {
 		for (i = 0; i < priv->num_grps; i++) {
-			disable_irq(priv->gfargrp[i].interruptTransmit);
-			gfar_interrupt(priv->gfargrp[i].interruptTransmit,
-				       &priv->gfargrp[i]);
-			enable_irq(priv->gfargrp[i].interruptTransmit);
+			struct gfar_priv_grp *grp = &priv->gfargrp[i];
+
+			disable_irq(gfar_irq(grp, TX)->irq);
+			gfar_interrupt(gfar_irq(grp, TX)->irq, grp);
+			enable_irq(gfar_irq(grp, TX)->irq);
 		}
 	}
 }
-- 
1.8.1.2
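
For readers unfamiliar with the accessor pattern the patch converts to,
here is a rough sketch of how a gfar_irq(grp, TX)->irq style lookup can
be built. The struct layout, the enum values and the demo_ names below
are assumptions for illustration and are not copied from gianfar.h:

```c
/* Hypothetical per-group interrupt identifiers. */
enum demo_irq_id { DEMO_TX = 0, DEMO_RX = 1, DEMO_ER = 2, DEMO_NUM_IRQS = 3 };

/* IRQ number and name, kept outside the hot per-group struct. */
struct demo_irqinfo {
	unsigned int irq;
	char name[16];
};

/* The group keeps only pointers, which keeps the struct itself small. */
struct demo_priv_grp {
	struct demo_irqinfo *irqinfo[DEMO_NUM_IRQS];
};

/* Token pasting lets callers write demo_irq(grp, TX)->irq. */
#define demo_irq(grp, ID) ((grp)->irqinfo[DEMO_##ID])
```

Once the fields move behind an accessor like this, any caller that
still reads the old member names directly stops compiling, which is the
kind of failure reported for the netpoll path.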


* Re: Gianfar breaks one of my test configs
From: Paul Gortmaker @ 2013-02-24 15:05 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Jianhua Xie, netdev, linuxppc-dev, claudiu.manoil
In-Reply-To: <1361717023.16950.12.camel@pasglop>

On Sun, Feb 24, 2013 at 9:43 AM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> Hi folks !
>
> Current Linus tree as of this morning fails to build with one of my
> (semi-random) test configs (attached):
>
> /home/benh/linux-powerpc-test/drivers/net/ethernet/freescale/gianfar.c: In function 'gfar_netpoll':
> /home/benh/linux-powerpc-test/drivers/net/ethernet/freescale/gianfar.c:2909:32: error: 'struct gfar_priv_grp' has no member named 'interruptTransmit'

I see the problem - it wasn't tested with NET_POLL.   I'll have a fix
out shortly.

Thanks,
Paul.
--

> /home/benh/linux-powerpc-test/drivers/net/ethernet/freescale/gianfar.c:2910:32: error: 'struct gfar_priv_grp' has no member named 'interruptReceive'
> /home/benh/linux-powerpc-test/drivers/net/ethernet/freescale/gianfar.c:2911:32: error: 'struct gfar_priv_grp' has no member named 'interruptError'
> /home/benh/linux-powerpc-test/drivers/net/ethernet/freescale/gianfar.c:2912:35: error: 'struct gfar_priv_grp' has no member named 'interruptTransmit'
> /home/benh/linux-powerpc-test/drivers/net/ethernet/freescale/gianfar.c:2914:31: error: 'struct gfar_priv_grp' has no member named 'interruptError'
> /home/benh/linux-powerpc-test/drivers/net/ethernet/freescale/gianfar.c:2915:31: error: 'struct gfar_priv_grp' has no member named 'interruptReceive'
> /home/benh/linux-powerpc-test/drivers/net/ethernet/freescale/gianfar.c:2916:31: error: 'struct gfar_priv_grp' has no member named 'interruptTransmit'
> /home/benh/linux-powerpc-test/drivers/net/ethernet/freescale/gianfar.c:2920:32: error: 'struct gfar_priv_grp' has no member named 'interruptTransmit'
> /home/benh/linux-powerpc-test/drivers/net/ethernet/freescale/gianfar.c:2921:35: error: 'struct gfar_priv_grp' has no member named 'interruptTransmit'
> /home/benh/linux-powerpc-test/drivers/net/ethernet/freescale/gianfar.c:2923:31: error: 'struct gfar_priv_grp' has no member named 'interruptTransmit'
> make[5]: *** [drivers/net/ethernet/freescale/gianfar.o] Error 1
> make[5]: *** Waiting for unfinished jobs....
>
> Cheers,
> Ben.


