LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH V2 1/2] mm: move FAULT_AROUND_ORDER to arch/
From: Kirill A. Shutemov @ 2014-04-04 13:17 UTC (permalink / raw)
  To: Madhavan Srinivasan
  Cc: linux-arch, riel, rusty, peterz, x86, linux-kernel, linux-mm, ak,
	paulus, mgorman, akpm, linuxppc-dev, mingo, kirill.shutemov
In-Reply-To: <1396592835-24767-2-git-send-email-maddy@linux.vnet.ibm.com>

On Fri, Apr 04, 2014 at 11:57:14AM +0530, Madhavan Srinivasan wrote:
> Kirill A. Shutemov with faultaround patchset introduced
> vm_ops->map_pages() for mapping easy accessible pages around
> fault address in hope to reduce number of minor page faults.
> 
> This patch creates infrastructure to move the FAULT_AROUND_ORDER
> to arch/ using Kconfig. This will enable architecture maintainers
> to decide on suitable FAULT_AROUND_ORDER value based on
> performance data for that architecture. Patch also adds
> FAULT_AROUND_ORDER Kconfig element in arch/X86.
> 
> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---
>  arch/x86/Kconfig   |    4 ++++
>  include/linux/mm.h |    9 +++++++++
>  mm/memory.c        |   12 +++++-------
>  3 files changed, 18 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 9c0a657..5833f22 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1177,6 +1177,10 @@ config DIRECT_GBPAGES
>  	  support it. This can improve the kernel's performance a tiny bit by
>  	  reducing TLB pressure. If in doubt, say "Y".
>  
> +config FAULT_AROUND_ORDER
> +	int
> +	default "4"
> +
>  # Common NUMA Features
>  config NUMA
>  	bool "Numa Memory Allocation and Scheduler Support"
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0bd4359..b93c1c3 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -26,6 +26,15 @@ struct file_ra_state;
>  struct user_struct;
>  struct writeback_control;
>  
> +/*
> + * Fault around order is a control knob to decide the fault around pages.
> + * Default value is set to 0UL (disabled), but the arch can override it as
> + * desired.
> + */
> +#ifndef CONFIG_FAULT_AROUND_ORDER
> +#define CONFIG_FAULT_AROUND_ORDER 0
> +#endif
> +

I don't think it should be in header file: nobody except mm/memory.c cares.
Just put it instead '#define FAULT_AROUND_ORDER'.

>  #ifndef CONFIG_NEED_MULTIPLE_NODES	/* Don't use mapnrs, do it properly */
>  extern unsigned long max_mapnr;
>  
> diff --git a/mm/memory.c b/mm/memory.c
> index b02c584..22a4a89 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3358,10 +3358,8 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
>  	update_mmu_cache(vma, address, pte);
>  }
>  
> -#define FAULT_AROUND_ORDER 4
> -
>  #ifdef CONFIG_DEBUG_FS
> -static unsigned int fault_around_order = FAULT_AROUND_ORDER;
> +static unsigned int fault_around_order = CONFIG_FAULT_AROUND_ORDER;
>  
>  static int fault_around_order_get(void *data, u64 *val)
>  {
> @@ -3371,7 +3369,7 @@ static int fault_around_order_get(void *data, u64 *val)
>  
>  static int fault_around_order_set(void *data, u64 val)
>  {
> -	BUILD_BUG_ON((1UL << FAULT_AROUND_ORDER) > PTRS_PER_PTE);
> +	BUILD_BUG_ON((1UL << CONFIG_FAULT_AROUND_ORDER) > PTRS_PER_PTE);
>  	if (1UL << val > PTRS_PER_PTE)
>  		return -EINVAL;
>  	fault_around_order = val;
> @@ -3406,14 +3404,14 @@ static inline unsigned long fault_around_pages(void)
>  {
>  	unsigned long nr_pages;
>  
> -	nr_pages = 1UL << FAULT_AROUND_ORDER;
> +	nr_pages = 1UL << CONFIG_FAULT_AROUND_ORDER;
>  	BUILD_BUG_ON(nr_pages > PTRS_PER_PTE);
>  	return nr_pages;
>  }
>  
>  static inline unsigned long fault_around_mask(void)
>  {
> -	return ~((1UL << (PAGE_SHIFT + FAULT_AROUND_ORDER)) - 1);
> +	return ~((1UL << (PAGE_SHIFT + CONFIG_FAULT_AROUND_ORDER)) - 1);
>  }
>  #endif
>  
> @@ -3471,7 +3469,7 @@ static int do_read_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  	 * if page by the offset is not ready to be mapped (cold cache or
>  	 * something).
>  	 */
> -	if (vma->vm_ops->map_pages) {
> +	if ((vma->vm_ops->map_pages) && (fault_around_pages() > 1)) {

	if (vma->vm_ops->map_pages && fault_around_pages()) {

>  		pte = pte_offset_map_lock(mm, pmd, address, &ptl);
>  		do_fault_around(vma, address, pte, pgoff, flags);
>  		if (!pte_same(*pte, orig_pte))
> -- 
> 1.7.10.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
 Kirill A. Shutemov

^ permalink raw reply

* Re: [PATCH 09/20] of/fdt: create common debugfs
From: Rob Herring @ 2014-04-04 13:00 UTC (permalink / raw)
  To: Michal Simek
  Cc: Grant Likely, linuxppc-dev, Paul Mackerras,
	linux-kernel@vger.kernel.org
In-Reply-To: <533EA2B0.7080703@monstr.eu>

On Fri, Apr 4, 2014 at 7:16 AM, Michal Simek <monstr@monstr.eu> wrote:
> On 04/04/2014 12:16 AM, Rob Herring wrote:
>> From: Rob Herring <robh@kernel.org>
>>
>> Both powerpc and microblaze have the same FDT blob in debugfs feature.
>> Move this to common location and remove the powerpc and microblaze
>> implementations. This feature could become more useful when FDT
>> overlay support is added.
>>
>> This changes the path of the blob from "$arch/flat-device-tree" to
>> "device-tree/flat-device-tree".

[snip]

>> -#if defined(CONFIG_DEBUG_FS) && defined(DEBUG)
>> -static struct debugfs_blob_wrapper flat_dt_blob;
>> -
>> -static int __init export_flat_device_tree(void)
>> -{
>> -     struct dentry *d;
>> -
>> -     flat_dt_blob.data = initial_boot_params;
>> -     flat_dt_blob.size = initial_boot_params->totalsize;
>
> As I see even microblaze version was buggy.

How so?

> ...
>
>> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
>> index fa16a91..2085d47 100644
>> --- a/drivers/of/fdt.c
>> +++ b/drivers/of/fdt.c
>> @@ -20,6 +20,7 @@
>>  #include <linux/string.h>
>>  #include <linux/errno.h>
>>  #include <linux/slab.h>
>> +#include <linux/debugfs.h>
>>
>>  #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
>>  #ifdef CONFIG_PPC
>> @@ -1084,4 +1085,27 @@ void __init unflatten_and_copy_device_tree(void)
>>       unflatten_device_tree();
>>  }
>>
>> +#if defined(CONFIG_DEBUG_FS) && defined(DEBUG)
>> +static struct debugfs_blob_wrapper flat_dt_blob;
>> +
>> +static int __init of_flat_dt_debugfs_export_fdt(void)
>> +{
>> +     struct dentry *d = debugfs_create_dir("device-tree", NULL);
>> +
>> +     if (!d)
>> +             return -ENOENT;
>> +
>> +     flat_dt_blob.data = initial_boot_params;
>> +     flat_dt_blob.size = fdt_totalsize(initial_boot_params);
>
> Have you tried to compile this?
>
> From my tests fdt_totalsize is not available for target just for host
> from libfdt.h
>
> drivers/of/fdt.c: In function 'of_flat_dt_debugfs_export_fdt':
> drivers/of/fdt.c:957:2: error: implicit declaration of function 'fdt_totalsize' [-Werror=implicit-function-declaration]

Ah, it needs to be re-ordered after the libfdt conversion when
libfdt.h gets added.

Rob

^ permalink raw reply

* Re: [PATCH 09/20] of/fdt: create common debugfs
From: Michal Simek @ 2014-04-04 12:22 UTC (permalink / raw)
  To: monstr
  Cc: Rob Herring, linux-kernel, Rob Herring, Grant Likely,
	Paul Mackerras, linuxppc-dev
In-Reply-To: <533EA2B0.7080703@monstr.eu>

[-- Attachment #1: Type: text/plain, Size: 1602 bytes --]

>> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
>> index fa16a91..2085d47 100644
>> --- a/drivers/of/fdt.c
>> +++ b/drivers/of/fdt.c
>> @@ -20,6 +20,7 @@
>>  #include <linux/string.h>
>>  #include <linux/errno.h>
>>  #include <linux/slab.h>
>> +#include <linux/debugfs.h>
>>  
>>  #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
>>  #ifdef CONFIG_PPC
>> @@ -1084,4 +1085,27 @@ void __init unflatten_and_copy_device_tree(void)
>>  	unflatten_device_tree();
>>  }
>>  
>> +#if defined(CONFIG_DEBUG_FS) && defined(DEBUG)
>> +static struct debugfs_blob_wrapper flat_dt_blob;
>> +
>> +static int __init of_flat_dt_debugfs_export_fdt(void)
>> +{
>> +	struct dentry *d = debugfs_create_dir("device-tree", NULL);
>> +
>> +	if (!d)
>> +		return -ENOENT;
>> +
>> +	flat_dt_blob.data = initial_boot_params;
>> +	flat_dt_blob.size = fdt_totalsize(initial_boot_params);
> 
> Have you tried to compile this?
> 
> From my tests fdt_totalsize is not available for target just for host
> from libfdt.h
> 
> drivers/of/fdt.c: In function 'of_flat_dt_debugfs_export_fdt':
> drivers/of/fdt.c:957:2: error: implicit declaration of function 'fdt_totalsize' [-Werror=implicit-function-declaration]

Ignore this one - there is no compilation problem.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply

* Re: [PATCH 09/20] of/fdt: create common debugfs
From: Michal Simek @ 2014-04-04 12:16 UTC (permalink / raw)
  To: Rob Herring
  Cc: Rob Herring, linux-kernel, Paul Mackerras, Grant Likely,
	linuxppc-dev
In-Reply-To: <1396563423-30893-10-git-send-email-robherring2@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3432 bytes --]

On 04/04/2014 12:16 AM, Rob Herring wrote:
> From: Rob Herring <robh@kernel.org>
> 
> Both powerpc and microblaze have the same FDT blob in debugfs feature.
> Move this to common location and remove the powerpc and microblaze
> implementations. This feature could become more useful when FDT
> overlay support is added.
> 
> This changes the path of the blob from "$arch/flat-device-tree" to
> "device-tree/flat-device-tree".
> 
> Signed-off-by: Rob Herring <robh@kernel.org>
> Cc: Michal Simek <monstr@monstr.eu>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: linuxppc-dev@lists.ozlabs.org
> ---
>  arch/microblaze/kernel/prom.c | 31 -------------------------------
>  arch/powerpc/kernel/prom.c    | 21 ---------------------
>  drivers/of/fdt.c              | 24 ++++++++++++++++++++++++
>  3 files changed, 24 insertions(+), 52 deletions(-)
> 
> diff --git a/arch/microblaze/kernel/prom.c b/arch/microblaze/kernel/prom.c
> index abdfb10..1312cd2 100644
> --- a/arch/microblaze/kernel/prom.c
> +++ b/arch/microblaze/kernel/prom.c
> @@ -114,34 +114,3 @@ void __init early_init_devtree(void *params)
>  
>  	pr_debug(" <- early_init_devtree()\n");
>  }
> -
> -/*******
> - *
> - * New implementation of the OF "find" APIs, return a refcounted
> - * object, call of_node_put() when done.  The device tree and list
> - * are protected by a rw_lock.
> - *
> - * Note that property management will need some locking as well,
> - * this isn't dealt with yet.
> - *
> - *******/
> -
> -#if defined(CONFIG_DEBUG_FS) && defined(DEBUG)
> -static struct debugfs_blob_wrapper flat_dt_blob;
> -
> -static int __init export_flat_device_tree(void)
> -{
> -	struct dentry *d;
> -
> -	flat_dt_blob.data = initial_boot_params;
> -	flat_dt_blob.size = initial_boot_params->totalsize;

As I see even microblaze version was buggy.

...

> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index fa16a91..2085d47 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -20,6 +20,7 @@
>  #include <linux/string.h>
>  #include <linux/errno.h>
>  #include <linux/slab.h>
> +#include <linux/debugfs.h>
>  
>  #include <asm/setup.h>  /* for COMMAND_LINE_SIZE */
>  #ifdef CONFIG_PPC
> @@ -1084,4 +1085,27 @@ void __init unflatten_and_copy_device_tree(void)
>  	unflatten_device_tree();
>  }
>  
> +#if defined(CONFIG_DEBUG_FS) && defined(DEBUG)
> +static struct debugfs_blob_wrapper flat_dt_blob;
> +
> +static int __init of_flat_dt_debugfs_export_fdt(void)
> +{
> +	struct dentry *d = debugfs_create_dir("device-tree", NULL);
> +
> +	if (!d)
> +		return -ENOENT;
> +
> +	flat_dt_blob.data = initial_boot_params;
> +	flat_dt_blob.size = fdt_totalsize(initial_boot_params);

Have you tried to compile this?

From my tests fdt_totalsize is not available for target just for host
from libfdt.h

drivers/of/fdt.c: In function 'of_flat_dt_debugfs_export_fdt':
drivers/of/fdt.c:957:2: error: implicit declaration of function 'fdt_totalsize' [-Werror=implicit-function-declaration]

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply

* [PATCH v3 2/2] ARM: dts: Append clock bindings for sai2 on VF610 platform
From: Nicolin Chen @ 2014-04-04 10:08 UTC (permalink / raw)
  To: broonie, shawn.guo
  Cc: mark.rutland, devicetree, alsa-devel, pawel.moll, linux-doc,
	ijc+devicetree, linux-kernel, robh+dt, timur, Li.Xiubo, rob,
	galak, linuxppc-dev
In-Reply-To: <cover.1396605772.git.Guangyu.Chen@freescale.com>

Since we added fours clock to the DT binding, we should update the current
SAI dts/dtsi so as not to break their functions.

Signed-off-by: Nicolin Chen <Guangyu.Chen@freescale.com>
Tested-by: Xiubo Li <Li.Xiubo@freescale.com>
---
 arch/arm/boot/dts/vf610.dtsi | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/vf610.dtsi b/arch/arm/boot/dts/vf610.dtsi
index d31ce1b..9fd0007 100644
--- a/arch/arm/boot/dts/vf610.dtsi
+++ b/arch/arm/boot/dts/vf610.dtsi
@@ -139,8 +139,10 @@
 				compatible = "fsl,vf610-sai";
 				reg = <0x40031000 0x1000>;
 				interrupts = <0 86 0x04>;
-				clocks = <&clks VF610_CLK_SAI2>;
-				clock-names = "sai";
+				clocks = <&clks VF610_CLK_SAI2>,
+				       <&clks VF610_CLK_SAI2>,
+				       <&clks 0>, <&clks 0>;
+				clock-names = "bus", "mclk1", "mclk2", "mclk3";
 				status = "disabled";
 			};
 
-- 
1.8.4

^ permalink raw reply related

* [PATCH v3 1/2] ASoC: fsl_sai: Add clock controls for SAI
From: Nicolin Chen @ 2014-04-04 10:08 UTC (permalink / raw)
  To: broonie, shawn.guo
  Cc: mark.rutland, devicetree, alsa-devel, pawel.moll, linux-doc,
	ijc+devicetree, linux-kernel, robh+dt, timur, Li.Xiubo, rob,
	galak, linuxppc-dev
In-Reply-To: <cover.1396605772.git.Guangyu.Chen@freescale.com>

The SAI mainly has the following clocks:
  bus clock
    control and configuration registers and to generate synchronous
    interrupts and DMA requests.

  mclk1, mclk2, mclk3
    to generate the bit clock when the receiver or transmitter is
    configured for an internally generated bit clock.

So this patch adds these clocks to the driver and their clock controls.

Signed-off-by: Nicolin Chen <Guangyu.Chen@freescale.com>
Acked-by: Xiubo Li <Li.Xiubo@freescale.com>
---
 .../devicetree/bindings/sound/fsl-sai.txt          |  8 ++--
 sound/soc/fsl/fsl_sai.c                            | 51 ++++++++++++++++++++--
 sound/soc/fsl/fsl_sai.h                            |  4 ++
 3 files changed, 56 insertions(+), 7 deletions(-)

diff --git a/Documentation/devicetree/bindings/sound/fsl-sai.txt b/Documentation/devicetree/bindings/sound/fsl-sai.txt
index 35c09fe..8a273bc 100644
--- a/Documentation/devicetree/bindings/sound/fsl-sai.txt
+++ b/Documentation/devicetree/bindings/sound/fsl-sai.txt
@@ -10,7 +10,8 @@ Required properties:
 - compatible: Compatible list, contains "fsl,vf610-sai" or "fsl,imx6sx-sai".
 - reg: Offset and length of the register set for the device.
 - clocks: Must contain an entry for each entry in clock-names.
-- clock-names : Must include the "sai" entry.
+- clock-names : Must include the "bus" for register access and "mclk1" "mclk2"
+  "mclk3" for bit clock and frame clock providing.
 - dmas : Generic dma devicetree binding as described in
   Documentation/devicetree/bindings/dma/dma.txt.
 - dma-names : Two dmas have to be defined, "tx" and "rx".
@@ -30,8 +31,9 @@ sai2: sai@40031000 {
 	      reg = <0x40031000 0x1000>;
 	      pinctrl-names = "default";
 	      pinctrl-0 = <&pinctrl_sai2_1>;
-	      clocks = <&clks VF610_CLK_SAI2>;
-	      clock-names = "sai";
+	      clocks = <&clks VF610_CLK_SAI2>, <&clks VF610_CLK_SAI2>
+		      <&clks 0>, <&clks 0>;
+	      clock-names = "bus", "mclk1", "mclk2", "mclk3";
 	      dma-names = "tx", "rx";
 	      dmas = <&edma0 0 VF610_EDMA_MUXID0_SAI2_TX>,
 		   <&edma0 0 VF610_EDMA_MUXID0_SAI2_RX>;
diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
index 9cd1764..2dae440 100644
--- a/sound/soc/fsl/fsl_sai.c
+++ b/sound/soc/fsl/fsl_sai.c
@@ -401,7 +401,23 @@ static int fsl_sai_startup(struct snd_pcm_substream *substream,
 		struct snd_soc_dai *cpu_dai)
 {
 	struct fsl_sai *sai = snd_soc_dai_get_drvdata(cpu_dai);
-	u32 reg;
+	struct device *dev = &sai->pdev->dev;
+	u32 reg, i;
+	int ret;
+
+	ret = clk_prepare_enable(sai->bus_clk);
+	if (ret) {
+		dev_err(dev, "failed to enable bus clock\n");
+		return ret;
+	}
+
+	for (i = 0; i < FSL_SAI_MCLK_MAX; i++) {
+		ret = clk_prepare_enable(sai->mclk_clk[i]);
+		if (ret) {
+			dev_err(dev, "failed to enable mclk%d clock\n", i + 1);
+			goto err;
+		}
+	}
 
 	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
 		reg = FSL_SAI_TCR3;
@@ -412,13 +428,20 @@ static int fsl_sai_startup(struct snd_pcm_substream *substream,
 			   FSL_SAI_CR3_TRCE);
 
 	return 0;
+
+err:
+	for (; i > 0; i--)
+		clk_disable_unprepare(sai->mclk_clk[i - 1]);
+	clk_disable_unprepare(sai->bus_clk);
+
+	return ret;
 }
 
 static void fsl_sai_shutdown(struct snd_pcm_substream *substream,
 		struct snd_soc_dai *cpu_dai)
 {
 	struct fsl_sai *sai = snd_soc_dai_get_drvdata(cpu_dai);
-	u32 reg;
+	u32 reg, i;
 
 	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
 		reg = FSL_SAI_TCR3;
@@ -427,6 +450,10 @@ static void fsl_sai_shutdown(struct snd_pcm_substream *substream,
 
 	regmap_update_bits(sai->regmap, reg, FSL_SAI_CR3_TRCE,
 			   ~FSL_SAI_CR3_TRCE);
+
+	for (i = 0; i < FSL_SAI_MCLK_MAX; i++)
+		clk_disable_unprepare(sai->mclk_clk[i]);
+	clk_disable_unprepare(sai->bus_clk);
 }
 
 static const struct snd_soc_dai_ops fsl_sai_pcm_dai_ops = {
@@ -559,7 +586,8 @@ static int fsl_sai_probe(struct platform_device *pdev)
 	struct fsl_sai *sai;
 	struct resource *res;
 	void __iomem *base;
-	int irq, ret;
+	char tmp[8];
+	int irq, ret, i;
 
 	sai = devm_kzalloc(&pdev->dev, sizeof(*sai), GFP_KERNEL);
 	if (!sai)
@@ -582,12 +610,27 @@ static int fsl_sai_probe(struct platform_device *pdev)
 		return PTR_ERR(base);
 
 	sai->regmap = devm_regmap_init_mmio_clk(&pdev->dev,
-			"sai", base, &fsl_sai_regmap_config);
+			"bus", base, &fsl_sai_regmap_config);
 	if (IS_ERR(sai->regmap)) {
 		dev_err(&pdev->dev, "regmap init failed\n");
 		return PTR_ERR(sai->regmap);
 	}
 
+	sai->bus_clk = devm_clk_get(&pdev->dev, "bus");
+	if (IS_ERR(sai->bus_clk)) {
+		dev_err(&pdev->dev, "failed to get bus clock\n");
+		return PTR_ERR(sai->bus_clk);
+	}
+
+	for (i = 0; i < FSL_SAI_MCLK_MAX; i++) {
+		sprintf(tmp, "mclk%d", i + 1);
+		sai->mclk_clk[i] = devm_clk_get(&pdev->dev, tmp);
+		if (IS_ERR(sai->mclk_clk[i])) {
+			dev_err(&pdev->dev, "failed to get mclk%d clock\n", i + 1);
+			return PTR_ERR(sai->mclk_clk[i]);
+		}
+	}
+
 	irq = platform_get_irq(pdev, 0);
 	if (irq < 0) {
 		dev_err(&pdev->dev, "no irq for node %s\n", np->full_name);
diff --git a/sound/soc/fsl/fsl_sai.h b/sound/soc/fsl/fsl_sai.h
index 677670d..0e6c9f5 100644
--- a/sound/soc/fsl/fsl_sai.h
+++ b/sound/soc/fsl/fsl_sai.h
@@ -119,6 +119,8 @@
 #define FSL_SAI_CLK_MAST2	2
 #define FSL_SAI_CLK_MAST3	3
 
+#define FSL_SAI_MCLK_MAX	3
+
 /* SAI data transfer numbers per DMA request */
 #define FSL_SAI_MAXBURST_TX 6
 #define FSL_SAI_MAXBURST_RX 6
@@ -126,6 +128,8 @@
 struct fsl_sai {
 	struct platform_device *pdev;
 	struct regmap *regmap;
+	struct clk *bus_clk;
+	struct clk *mclk_clk[FSL_SAI_MCLK_MAX];
 
 	bool big_endian_regs;
 	bool big_endian_data;
-- 
1.8.4

^ permalink raw reply related

* [PATCH v3 0/2] ASoC: fsl_sai: Add bus clock and mclks with clock controls
From: Nicolin Chen @ 2014-04-04 10:08 UTC (permalink / raw)
  To: broonie, shawn.guo
  Cc: mark.rutland, devicetree, alsa-devel, pawel.moll, linux-doc,
	ijc+devicetree, linux-kernel, robh+dt, timur, Li.Xiubo, rob,
	galak, linuxppc-dev

This series of patches add clock controls to fsl_sai driver and updates
the vf610.dtsi accordingly.

Changelog
v3:
 * Use int type for ret instead of u32.
 * Added Acked-by and Tested-by from Xiubo Li.
v2:
 * Appended two extra mclks to the driver since SAI actually has three.
 * Renamed clock name to 'bus' and 'mclk' according to the reference manual.

Nicolin Chen (2):
  ASoC: fsl_sai: Add clock controls for SAI
  ARM: dts: Append clock bindings for sai2 on VF610 platform

 .../devicetree/bindings/sound/fsl-sai.txt          |  8 ++--
 arch/arm/boot/dts/vf610.dtsi                       |  6 ++-
 sound/soc/fsl/fsl_sai.c                            | 51 ++++++++++++++++++++--
 sound/soc/fsl/fsl_sai.h                            |  4 ++
 4 files changed, 60 insertions(+), 9 deletions(-)

-- 
1.8.4

^ permalink raw reply

* Re: [PATCH] ASoC: fsl_sai: Fix Bit Clock Polarity configurations
From: Mark Brown @ 2014-04-04 10:05 UTC (permalink / raw)
  To: Nicolin Chen; +Cc: alsa-devel, Li.Xiubo, linuxppc-dev, linux-kernel, timur
In-Reply-To: <1396595387-4371-1-git-send-email-Guangyu.Chen@freescale.com>

[-- Attachment #1: Type: text/plain, Size: 348 bytes --]

On Fri, Apr 04, 2014 at 03:09:47PM +0800, Nicolin Chen wrote:
> The BCP bit in TCR4/RCR4 register rules as followings:
>   0 Bit clock is active high with drive outputs on rising edge
>     and sample inputs on falling edge.
>   1 Bit clock is active low with drive outputs on falling edge
>     and sample inputs on rising edge.

Applied, thanks.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* RE: [PATCH v2 2/2] ARM: dts: Append clock bindings for sai2 on VF610 platform
From: Li.Xiubo @ 2014-04-04  9:52 UTC (permalink / raw)
  To: guangyu.chen@freescale.com, Shawn Guo
  Cc: mark.rutland@arm.com, devicetree@vger.kernel.org,
	alsa-devel@alsa-project.org, pawel.moll@arm.com,
	linux-doc@vger.kernel.org, ijc+devicetree@hellion.org.uk,
	linux-kernel@vger.kernel.org, robh+dt@kernel.org, timur@tabi.org,
	broonie@kernel.org, rob@landley.net, galak@codeaurora.org,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20140402133955.GB29440@MrMyself>


> Subject: Re: [PATCH v2 2/2] ARM: dts: Append clock bindings for sai2 on V=
F610
> platform
>=20
> Hi Shawn,
>=20
>    Thanks for the comments, but...
>=20
> On Wed, Apr 02, 2014 at 09:03:04PM +0800, Shawn Guo wrote:
> > On Wed, Apr 02, 2014 at 06:10:20PM +0800, Nicolin Chen wrote:
> > > Since we added fours clock to the DT binding, we should update the cu=
rrent
> > > SAI dts/dtsi so as not to break their functions.
> > >
> > > Signed-off-by: Nicolin Chen <Guangyu.Chen@freescale.com>
> > > ---
> > >  arch/arm/boot/dts/vf610.dtsi | 6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/arm/boot/dts/vf610.dtsi b/arch/arm/boot/dts/vf610.d=
tsi
> > > index d31ce1b..9fd0007 100644
> > > --- a/arch/arm/boot/dts/vf610.dtsi
> > > +++ b/arch/arm/boot/dts/vf610.dtsi
> > > @@ -139,8 +139,10 @@
> > >  				compatible =3D "fsl,vf610-sai";
> > >  				reg =3D <0x40031000 0x1000>;
> > >  				interrupts =3D <0 86 0x04>;
> > > -				clocks =3D <&clks VF610_CLK_SAI2>;
> > > -				clock-names =3D "sai";
> > > +				clocks =3D <&clks VF610_CLK_SAI2>,
> > > +				       <&clks VF610_CLK_SAI2>,
> > > +				       <&clks 0>, <&clks 0>;
> >
> > So it seems that SAI on vf610 does work with only one clock.  So the
> > driver change will break old DTB for vf610?  If that's case, we will
> > have to need a new compatible for cases where 4 clocks are needed.
>=20
> According to Vybrid's RM Chapter 9.11.12 SAI clocking, the SoC actually
> connects SAI with two clocks: SAI_CLK and Platform Bus Clock. So the DT
> binding here still needs to be corrected even if ignoring driver change.
>=20
> Besides, I've checked both SAI on imx and vf610 and found that they are
> seemly identical, especially for the clock part -- "The transmitter and
> receiver can independently select between the bus clock and up to three
> audio master clocks to generate the bit clock." And the driver that was
> designed for vf610 already contains the code to switch the clock between
> Bus Clock and Three MCLKs. What I want to say is, even if SAI on vf610
> does work with only one clock, it still doesn't have the full function
> on vf610 -- driving clock from Platform Bus Clock unless we make this
> improvement to the DT binding.
>=20
> So I think it's fair to complete the code here for both platforms, even
> though we might take the risk of merging conflict. And I understand
> your point to avoid function break on those platform both of us aren't
> convenient to test. But I've already involved Xiubo in the list. And
> we can wait for his test result.
>=20

This has been test on my Vybird-TWR board and it's fine.

Thanks,

Brs
Xiubo



> Hope you can understand the circumstance,
> Nicolin

^ permalink raw reply

* Re: [PATCH 1/2] ASoC: fsl_sai: Add clock control for SAI
From: Nicolin Chen @ 2014-04-04  9:38 UTC (permalink / raw)
  To: Xiubo Li-B47053
  Cc: mark.rutland@arm.com, devicetree@vger.kernel.org,
	alsa-devel@alsa-project.org, pawel.moll@arm.com,
	linux-doc@vger.kernel.org, ijc+devicetree@hellion.org.uk,
	timur@tabi.org, robh+dt@kernel.org, linux-kernel@vger.kernel.org,
	broonie@kernel.org, rob@landley.net, galak@codeaurora.org,
	shawn.guo@linaro.org, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <f713ce2310f44ef883194429ba97d4a8@BY2PR03MB505.namprd03.prod.outlook.com>

On Fri, Apr 04, 2014 at 05:43:12PM +0800, Xiubo Li-B47053 wrote:
> > Subject: Re: [PATCH 1/2] ASoC: fsl_sai: Add clock control for SAI
> > 
> > Hi Xiubo,
> > 
> > On Fri, Apr 04, 2014 at 05:24:39PM +0800, Xiubo Li-B47053 wrote:
> > > Hi,
> > >
> > > I have test this series on my Vybrid-TWR board and it works happily.
> > 
> > You just checked the wrong version. I've sent a mail to let people disregard
> > this version and a newer v2.
> > 
> 
> Sorry, I didn't receive these, I will search it in the web page.
> 
> > >
> > > [...]
> > > > diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
> > > > index 3847d2a..2d749df 100644
> > > > --- a/sound/soc/fsl/fsl_sai.c
> > > > +++ b/sound/soc/fsl/fsl_sai.c
> > > > @@ -428,5 +428,18 @@ static int fsl_sai_startup(struct snd_pcm_substream
> > > > *substream,
> > > >  {
> > > >  	struct fsl_sai *sai = snd_soc_dai_get_drvdata(cpu_dai);
> > > > -	u32 reg;
> > > > +	struct device *dev = &sai->pdev->dev;
> > > > +	u32 reg, ret;
> > > > +
> > >
> > > I'd prefer:
> > >   +	int ret;
> > 
> > Just like last time I said, it would be converted to 'int' any way. There's
> > no much difference between them.
> > 
> 
> Yes, it is...
> 
> I have ever encounter something like this in the feature maybe someone will
> do the following check:
> 
> If (ret < 0)
> 	...
> 
> Just one potential problem and one suggestion.

Fine...I'll revise it as you wish..after you checked the v2.

Thanks,
Nicolin

^ permalink raw reply

* RE: [PATCH 1/2] ASoC: fsl_sai: Add clock control for SAI
From: Li.Xiubo @ 2014-04-04  9:43 UTC (permalink / raw)
  To: guangyu.chen@freescale.com
  Cc: mark.rutland@arm.com, devicetree@vger.kernel.org,
	alsa-devel@alsa-project.org, pawel.moll@arm.com,
	linux-doc@vger.kernel.org, ijc+devicetree@hellion.org.uk,
	timur@tabi.org, robh+dt@kernel.org, linux-kernel@vger.kernel.org,
	broonie@kernel.org, rob@landley.net, galak@codeaurora.org,
	shawn.guo@linaro.org, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20140404092804.GA2735@MrMyself>

> Subject: Re: [PATCH 1/2] ASoC: fsl_sai: Add clock control for SAI
>=20
> Hi Xiubo,
>=20
> On Fri, Apr 04, 2014 at 05:24:39PM +0800, Xiubo Li-B47053 wrote:
> > Hi,
> >
> > I have test this series on my Vybrid-TWR board and it works happily.
>=20
> You just checked the wrong version. I've sent a mail to let people disreg=
ard
> this version and a newer v2.
>=20

Sorry, I didn't receive these, I will search it in the web page.

> >
> > [...]
> > > diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
> > > index 3847d2a..2d749df 100644
> > > --- a/sound/soc/fsl/fsl_sai.c
> > > +++ b/sound/soc/fsl/fsl_sai.c
> > > @@ -428,5 +428,18 @@ static int fsl_sai_startup(struct snd_pcm_substr=
eam
> > > *substream,
> > >  {
> > >  	struct fsl_sai *sai =3D snd_soc_dai_get_drvdata(cpu_dai);
> > > -	u32 reg;
> > > +	struct device *dev =3D &sai->pdev->dev;
> > > +	u32 reg, ret;
> > > +
> >
> > I'd prefer:
> >   +	int ret;
>=20
> Just like last time I said, it would be converted to 'int' any way. There=
's
> no much difference between them.
>=20

Yes, it is...

I have ever encounter something like this in the feature maybe someone will
do the following check:

If (ret < 0)
	...

Just one potential problem and one suggestion.

Thanks,


> >
> > > +	ret =3D clk_prepare_enable(sai->ipg_clk);
> > > +	if (ret) {
> > > +		dev_err(dev, "failed to prepare and enable ipg clock\n");
> > > +		return ret;
> > > +	}
> > > +
> >
> > [...]
> >
> > > @@ -609,5 +630,5 @@ static int fsl_sai_probe(struct platform_device *=
pdev)
> > >
> > >  	sai->regmap =3D devm_regmap_init_mmio_clk(&pdev->dev,
> > > -			"sai", base, &fsl_sai_regmap_config);
> > > +			"ipg", base, &fsl_sai_regmap_config);
> > >  	if (IS_ERR(sai->regmap)) {
> > >  		dev_err(&pdev->dev, "regmap init failed\n");
> > > @@ -615,4 +636,16 @@ static int fsl_sai_probe(struct platform_device =
*pdev)
> > >  	}
> > >
> > > +	sai->ipg_clk =3D devm_clk_get(&pdev->dev, "ipg");
> > > +	if (IS_ERR(sai->ipg_clk)) {
> > > +		dev_err(&pdev->dev, "failed to get ipg clock\n");
> > > +		return PTR_ERR(sai->ipg_clk);
> > > +	}
> > > +
> >
> > Since the 'ipg' clock is just intend to be used for registers accessing=
 and
> > We are using the regmap_init_mmio_clk(), so we can just drop it here an=
d
> > Let the regmap APIs to do the clock options properly.
>=20
> This 'ipg' clock, which I renamed in v2 to 'bus clock', is not just used
> in the way you said but also able to drive bit clock as your own code in
> the fsl_sai_set_dai_sysclk_tr(), and as reference manual describes:
>=20
> 52.3.1.3 Bus clock
> The bus clock is used by the control and configuration registers and to
> generate synchronous interrupts and DMA requests.
>=20
> Thanks,
> Nicolin
>=20
> >
> > Otherwise it look good to me.
> >
> > After this:
> > Acked-by: Xiubo Li <Li.Xiubo@freescale.com>
> >
> >
> >
> > Thanks,
> >
> > Brs,
> > Xiubo
> >
> >
> >
> >
> > > +	sai->sai_clk =3D devm_clk_get(&pdev->dev, "sai");
> > > +	if (IS_ERR(sai->sai_clk)) {
> > > +		dev_err(&pdev->dev, "failed to get sai clock\n");
> > > +		return PTR_ERR(sai->sai_clk);
> > > +	}
> > > +
> > >  	irq =3D platform_get_irq(pdev, 0);
> > >  	if (irq < 0) {
> > > diff --git a/sound/soc/fsl/fsl_sai.h b/sound/soc/fsl/fsl_sai.h
> > > index 677670d..cbaf114 100644
> > > --- a/sound/soc/fsl/fsl_sai.h
> > > +++ b/sound/soc/fsl/fsl_sai.h
> > > @@ -127,4 +127,6 @@ struct fsl_sai {
> > >  	struct platform_device *pdev;
> > >  	struct regmap *regmap;
> > > +	struct clk *ipg_clk;
> > > +	struct clk *sai_clk;
> > >
> > >  	bool big_endian_regs;
> > > --
> > > 1.8.4
> > >
> >

^ permalink raw reply

* Re: [PATCH 1/2] ASoC: fsl_sai: Add clock control for SAI
From: Nicolin Chen @ 2014-04-04  9:28 UTC (permalink / raw)
  To: Xiubo Li-B47053
  Cc: mark.rutland@arm.com, devicetree@vger.kernel.org,
	alsa-devel@alsa-project.org, pawel.moll@arm.com,
	linux-doc@vger.kernel.org, ijc+devicetree@hellion.org.uk,
	timur@tabi.org, robh+dt@kernel.org, linux-kernel@vger.kernel.org,
	broonie@kernel.org, rob@landley.net, galak@codeaurora.org,
	shawn.guo@linaro.org, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <49295d4f0f0f42c4b9c6de77583b978e@BY2PR03MB505.namprd03.prod.outlook.com>

Hi Xiubo,

On Fri, Apr 04, 2014 at 05:24:39PM +0800, Xiubo Li-B47053 wrote:
> Hi,
> 
> I have test this series on my Vybrid-TWR board and it works happily.

You just checked the wrong version. I've sent a mail to let people disregard
this version and a newer v2.

> 
> [...]
> > diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
> > index 3847d2a..2d749df 100644
> > --- a/sound/soc/fsl/fsl_sai.c
> > +++ b/sound/soc/fsl/fsl_sai.c
> > @@ -428,5 +428,18 @@ static int fsl_sai_startup(struct snd_pcm_substream
> > *substream,
> >  {
> >  	struct fsl_sai *sai = snd_soc_dai_get_drvdata(cpu_dai);
> > -	u32 reg;
> > +	struct device *dev = &sai->pdev->dev;
> > +	u32 reg, ret;
> > +
> 
> I'd prefer:
>   +	int ret;

Just like last time I said, it would be converted to 'int' any way. There's
no much difference between them.

> 
> > +	ret = clk_prepare_enable(sai->ipg_clk);
> > +	if (ret) {
> > +		dev_err(dev, "failed to prepare and enable ipg clock\n");
> > +		return ret;
> > +	}
> > +
> 
> [...]
> 
> > @@ -609,5 +630,5 @@ static int fsl_sai_probe(struct platform_device *pdev)
> > 
> >  	sai->regmap = devm_regmap_init_mmio_clk(&pdev->dev,
> > -			"sai", base, &fsl_sai_regmap_config);
> > +			"ipg", base, &fsl_sai_regmap_config);
> >  	if (IS_ERR(sai->regmap)) {
> >  		dev_err(&pdev->dev, "regmap init failed\n");
> > @@ -615,4 +636,16 @@ static int fsl_sai_probe(struct platform_device *pdev)
> >  	}
> > 
> > +	sai->ipg_clk = devm_clk_get(&pdev->dev, "ipg");
> > +	if (IS_ERR(sai->ipg_clk)) {
> > +		dev_err(&pdev->dev, "failed to get ipg clock\n");
> > +		return PTR_ERR(sai->ipg_clk);
> > +	}
> > +
> 
> Since the 'ipg' clock is just intend to be used for registers accessing and
> We are using the regmap_init_mmio_clk(), so we can just drop it here and
> Let the regmap APIs to do the clock options properly.

This 'ipg' clock, which I renamed in v2 to 'bus clock', is not just used
in the way you said but also able to drive bit clock as your own code in
the fsl_sai_set_dai_sysclk_tr(), and as reference manual describes:

52.3.1.3 Bus clock
The bus clock is used by the control and configuration registers and to
generate synchronous interrupts and DMA requests.

Thanks,
Nicolin

> 
> Otherwise it look good to me.
> 
> After this:
> Acked-by: Xiubo Li <Li.Xiubo@freescale.com>
> 
> 
> 
> Thanks,
> 
> Brs,
> Xiubo
> 
> 
> 
> 
> > +	sai->sai_clk = devm_clk_get(&pdev->dev, "sai");
> > +	if (IS_ERR(sai->sai_clk)) {
> > +		dev_err(&pdev->dev, "failed to get sai clock\n");
> > +		return PTR_ERR(sai->sai_clk);
> > +	}
> > +
> >  	irq = platform_get_irq(pdev, 0);
> >  	if (irq < 0) {
> > diff --git a/sound/soc/fsl/fsl_sai.h b/sound/soc/fsl/fsl_sai.h
> > index 677670d..cbaf114 100644
> > --- a/sound/soc/fsl/fsl_sai.h
> > +++ b/sound/soc/fsl/fsl_sai.h
> > @@ -127,4 +127,6 @@ struct fsl_sai {
> >  	struct platform_device *pdev;
> >  	struct regmap *regmap;
> > +	struct clk *ipg_clk;
> > +	struct clk *sai_clk;
> > 
> >  	bool big_endian_regs;
> > --
> > 1.8.4
> > 
> 

^ permalink raw reply

* RE: [PATCH 1/2] ASoC: fsl_sai: Add clock control for SAI
From: Li.Xiubo @ 2014-04-04  9:24 UTC (permalink / raw)
  To: guangyu.chen@freescale.com, broonie@kernel.org,
	shawn.guo@linaro.org
  Cc: mark.rutland@arm.com, devicetree@vger.kernel.org,
	alsa-devel@alsa-project.org, pawel.moll@arm.com,
	linux-doc@vger.kernel.org, ijc+devicetree@hellion.org.uk,
	linux-kernel@vger.kernel.org, timur@tabi.org, robh+dt@kernel.org,
	rob@landley.net, galak@codeaurora.org,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <87d225a25812d730c0865f9d0bb733c67d8b3ed1.1396352401.git.Guangyu.Chen@freescale.com>

Hi,

I have test this series on my Vybrid-TWR board and it works happily.

[...]
> diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
> index 3847d2a..2d749df 100644
> --- a/sound/soc/fsl/fsl_sai.c
> +++ b/sound/soc/fsl/fsl_sai.c
> @@ -428,5 +428,18 @@ static int fsl_sai_startup(struct snd_pcm_substream
> *substream,
>  {
>  	struct fsl_sai *sai =3D snd_soc_dai_get_drvdata(cpu_dai);
> -	u32 reg;
> +	struct device *dev =3D &sai->pdev->dev;
> +	u32 reg, ret;
> +

I'd prefer:
  +	int ret;


> +	ret =3D clk_prepare_enable(sai->ipg_clk);
> +	if (ret) {
> +		dev_err(dev, "failed to prepare and enable ipg clock\n");
> +		return ret;
> +	}
> +

[...]

> @@ -609,5 +630,5 @@ static int fsl_sai_probe(struct platform_device *pdev=
)
>=20
>  	sai->regmap =3D devm_regmap_init_mmio_clk(&pdev->dev,
> -			"sai", base, &fsl_sai_regmap_config);
> +			"ipg", base, &fsl_sai_regmap_config);
>  	if (IS_ERR(sai->regmap)) {
>  		dev_err(&pdev->dev, "regmap init failed\n");
> @@ -615,4 +636,16 @@ static int fsl_sai_probe(struct platform_device *pde=
v)
>  	}
>=20
> +	sai->ipg_clk =3D devm_clk_get(&pdev->dev, "ipg");
> +	if (IS_ERR(sai->ipg_clk)) {
> +		dev_err(&pdev->dev, "failed to get ipg clock\n");
> +		return PTR_ERR(sai->ipg_clk);
> +	}
> +

Since the 'ipg' clock is just intend to be used for registers accessing and
We are using the regmap_init_mmio_clk(), so we can just drop it here and
Let the regmap APIs to do the clock options properly.

Otherwise it look good to me.

After this:
Acked-by: Xiubo Li <Li.Xiubo@freescale.com>



Thanks,

Brs,
Xiubo




> +	sai->sai_clk =3D devm_clk_get(&pdev->dev, "sai");
> +	if (IS_ERR(sai->sai_clk)) {
> +		dev_err(&pdev->dev, "failed to get sai clock\n");
> +		return PTR_ERR(sai->sai_clk);
> +	}
> +
>  	irq =3D platform_get_irq(pdev, 0);
>  	if (irq < 0) {
> diff --git a/sound/soc/fsl/fsl_sai.h b/sound/soc/fsl/fsl_sai.h
> index 677670d..cbaf114 100644
> --- a/sound/soc/fsl/fsl_sai.h
> +++ b/sound/soc/fsl/fsl_sai.h
> @@ -127,4 +127,6 @@ struct fsl_sai {
>  	struct platform_device *pdev;
>  	struct regmap *regmap;
> +	struct clk *ipg_clk;
> +	struct clk *sai_clk;
>=20
>  	bool big_endian_regs;
> --
> 1.8.4
>=20

^ permalink raw reply

* [PATCH v2] powerpc/tm: Disable IRQ in tm_recheckpoint
From: Michael Neuling @ 2014-04-04  9:19 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, Paul Mackerras
In-Reply-To: <1395985234-27281-1-git-send-email-mikey@neuling.org>

We can't take an IRQ when we're about to do a trechkpt as our GPR state is set
to user GPR values.

We've hit this when running some IBM Java stress tests in the lab resulting in
the following dump:

  cpu 0x3f: Vector: 700 (Program Check) at [c000000007eb3d40]
      pc: c000000000050074: restore_gprs+0xc0/0x148
      lr: 00000000b52a8184
      sp: ac57d360
     msr: 8000000100201030
    current = 0xc00000002c500000
    paca    = 0xc000000007dbfc00     softe: 0     irq_happened: 0x00
      pid   = 34535, comm = Pooled Thread #
  R00 = 00000000b52a8184   R16 = 00000000b3e48fda
  R01 = 00000000ac57d360   R17 = 00000000ade79bd8
  R02 = 00000000ac586930   R18 = 000000000fac9bcc
  R03 = 00000000ade60000   R19 = 00000000ac57f930
  R04 = 00000000f6624918   R20 = 00000000ade79be8
  R05 = 00000000f663f238   R21 = 00000000ac218a54
  R06 = 0000000000000002   R22 = 000000000f956280
  R07 = 0000000000000008   R23 = 000000000000007e
  R08 = 000000000000000a   R24 = 000000000000000c
  R09 = 00000000b6e69160   R25 = 00000000b424cf00
  R10 = 0000000000000181   R26 = 00000000f66256d4
  R11 = 000000000f365ec0   R27 = 00000000b6fdcdd0
  R12 = 00000000f66400f0   R28 = 0000000000000001
  R13 = 00000000ada71900   R29 = 00000000ade5a300
  R14 = 00000000ac2185a8   R30 = 00000000f663f238
  R15 = 0000000000000004   R31 = 00000000f6624918
  pc  = c000000000050074 restore_gprs+0xc0/0x148
  cfar= c00000000004fe28 dont_restore_vec+0x1c/0x1a4
  lr  = 00000000b52a8184
  msr = 8000000100201030   cr  = 24804888
  ctr = 0000000000000000   xer = 0000000000000000   trap =  700

This moves tm_recheckpoint to a C function and moves the tm_restore_sprs into
that function.  It then adds IRQ disabling over the trechkpt critical section.
It also sets the TEXASR FS in the signals code to ensure this is never set now
that we explictly write the TM sprs in tm_recheckpoint.

Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org

---
v2:
  Fix bug where TM SPRs are not saved when outside a transaction.

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index af064d2..31d0215 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -610,6 +610,31 @@ out_and_saveregs:
 	tm_save_sprs(thr);
 }
 
+extern void __tm_recheckpoint(struct thread_struct *thread,
+			      unsigned long orig_msr);
+
+void tm_recheckpoint(struct thread_struct *thread,
+		     unsigned long orig_msr)
+{
+	unsigned long flags;
+
+	/* We really can't be interrupted here as the TEXASR registers can't
+	 * change and later in the trecheckpoint code, we have a userspace R1.
+	 * So let's hard disable over this region.
+	 */
+	local_irq_save(flags);
+	hard_irq_disable();
+
+	/* The TM SPRs are restored here, so that TEXASR.FS can be set
+	 * before the trecheckpoint and no explosion occurs.
+	 */
+	tm_restore_sprs(thread);
+
+	__tm_recheckpoint(thread, orig_msr);
+
+	local_irq_restore(flags);
+}
+
 static inline void tm_recheckpoint_new_task(struct task_struct *new)
 {
 	unsigned long msr;
@@ -628,13 +653,10 @@ static inline void tm_recheckpoint_new_task(struct task_struct *new)
 	if (!new->thread.regs)
 		return;
 
-	/* The TM SPRs are restored here, so that TEXASR.FS can be set
-	 * before the trecheckpoint and no explosion occurs.
-	 */
-	tm_restore_sprs(&new->thread);
-
-	if (!MSR_TM_ACTIVE(new->thread.regs->msr))
+	if (!MSR_TM_ACTIVE(new->thread.regs->msr)){
+		tm_restore_sprs(&new->thread);
 		return;
+	}
 	msr = new->thread.tm_orig_msr;
 	/* Recheckpoint to restore original checkpointed register state. */
 	TM_DEBUG("*** tm_recheckpoint of pid %d "
diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index a67e00a..4e47db6 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -881,6 +881,8 @@ static long restore_tm_user_regs(struct pt_regs *regs,
 	 * transactional versions should be loaded.
 	 */
 	tm_enable();
+	/* Make sure the transaction is marked as failed */
+	current->thread.tm_texasr |= TEXASR_FS;
 	/* This loads the checkpointed FP/VEC state, if used */
 	tm_recheckpoint(&current->thread, msr);
 	/* Get the top half of the MSR */
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index 8d253c2..d501dc4 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -527,6 +527,8 @@ static long restore_tm_sigcontexts(struct pt_regs *regs,
 	}
 #endif
 	tm_enable();
+	/* Make sure the transaction is marked as failed */
+	current->thread.tm_texasr |= TEXASR_FS;
 	/* This loads the checkpointed FP/VEC state, if used */
 	tm_recheckpoint(&current->thread, msr);
 
diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
index ef47bcb..03567c0 100644
--- a/arch/powerpc/kernel/tm.S
+++ b/arch/powerpc/kernel/tm.S
@@ -307,7 +307,7 @@ dont_backup_fp:
 	 *	Call with IRQs off, stacks get all out of sync for
 	 *	some periods in here!
 	 */
-_GLOBAL(tm_recheckpoint)
+_GLOBAL(__tm_recheckpoint)
 	mfcr	r5
 	mflr	r0
 	stw	r5, 8(r1)

^ permalink raw reply related

* RE: [PATCH] ASoC: fsl_sai: Fix Bit Clock Polarity configurations
From: Li.Xiubo @ 2014-04-04  8:54 UTC (permalink / raw)
  To: guangyu.chen@freescale.com
  Cc: alsa-devel@alsa-project.org, broonie@kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	timur@tabi.org
In-Reply-To: <20140404074359.GA2599@MrMyself>


>=20
> Is that possible for you to test those two clock patches for fsl_sai?
>=20
> I think most of us are waiting for your reply to it. And I'd really
> like to move on to append clock dividing code into the driver so both
> of vybrid and imx can easily enable the DAI master mode.
>=20

Certainly, I will test them later.

Thanks,

^ permalink raw reply

* Re: [PATCH] ASoC: fsl_sai: Fix Bit Clock Polarity configurations
From: Nicolin Chen @ 2014-04-04  7:44 UTC (permalink / raw)
  To: Xiubo Li-B47053
  Cc: alsa-devel@alsa-project.org, broonie@kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	timur@tabi.org
In-Reply-To: <9cb53ac6e0e04c01bc0c6ae6a4d42557@BY2PR03MB505.namprd03.prod.outlook.com>

Hi Xiubo,

On Fri, Apr 04, 2014 at 03:37:00PM +0800, Xiubo Li-B47053 wrote:
> 
> > Subject: [PATCH] ASoC: fsl_sai: Fix Bit Clock Polarity configurations
> > 
> > The BCP bit in TCR4/RCR4 register rules as followings:
> >   0 Bit clock is active high with drive outputs on rising edge
> >     and sample inputs on falling edge.
> >   1 Bit clock is active low with drive outputs on falling edge
> >     and sample inputs on rising edge.
> > 
> > For all formats currently supported in the fsl_sai driver, they're exactly
> > sending data on the falling edge and sampling on the rising edge.
> > 
> > However, the driver clears this BCP bit for all of them which results click
> > noise when working with SGTL5000 and big noise with WM8962.
> > 
> > Thus this patch corrects the BCP settings for all the formats here to fix
> > the nosie issue.
> > 
> > Signed-off-by: Nicolin Chen <Guangyu.Chen@freescale.com>
> > ---
> 
> Good catch.
> 
> Acked-by: Xiubo Li <Li.Xiubo@freescale.com>
> 
> Thanks,

Is that possible for you to test those two clock patches for fsl_sai?

I think most of us are waiting for your reply to it. And I'd really
like to move on to append clock dividing code into the driver so both
of vybrid and imx can easily enable the DAI master mode.

Thank you,
Nicolin

> --
> 
> BRs,
> Xiubo
> 
> 
> >  sound/soc/fsl/fsl_sai.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
> > index 99051c7..9bbebea 100644
> > --- a/sound/soc/fsl/fsl_sai.c
> > +++ b/sound/soc/fsl/fsl_sai.c
> > @@ -180,7 +180,7 @@ static int fsl_sai_set_dai_fmt_tr(struct snd_soc_dai
> > *cpu_dai,
> >  		 * that is, together with the last bit of the previous
> >  		 * data word.
> >  		 */
> > -		val_cr2 &= ~FSL_SAI_CR2_BCP;
> > +		val_cr2 |= FSL_SAI_CR2_BCP;
> >  		val_cr4 |= FSL_SAI_CR4_FSE | FSL_SAI_CR4_FSP;
> >  		break;
> >  	case SND_SOC_DAIFMT_LEFT_J:
> > @@ -188,7 +188,7 @@ static int fsl_sai_set_dai_fmt_tr(struct snd_soc_dai
> > *cpu_dai,
> >  		 * Frame high, one word length for frame sync,
> >  		 * frame sync asserts with the first bit of the frame.
> >  		 */
> > -		val_cr2 &= ~FSL_SAI_CR2_BCP;
> > +		val_cr2 |= FSL_SAI_CR2_BCP;
> >  		val_cr4 &= ~(FSL_SAI_CR4_FSE | FSL_SAI_CR4_FSP);
> >  		break;
> >  	case SND_SOC_DAIFMT_DSP_A:
> > @@ -198,7 +198,7 @@ static int fsl_sai_set_dai_fmt_tr(struct snd_soc_dai
> > *cpu_dai,
> >  		 * that is, together with the last bit of the previous
> >  		 * data word.
> >  		 */
> > -		val_cr2 &= ~FSL_SAI_CR2_BCP;
> > +		val_cr2 |= FSL_SAI_CR2_BCP;
> >  		val_cr4 &= ~FSL_SAI_CR4_FSP;
> >  		val_cr4 |= FSL_SAI_CR4_FSE;
> >  		sai->is_dsp_mode = true;
> > @@ -208,7 +208,7 @@ static int fsl_sai_set_dai_fmt_tr(struct snd_soc_dai
> > *cpu_dai,
> >  		 * Frame high, one bit for frame sync,
> >  		 * frame sync asserts with the first bit of the frame.
> >  		 */
> > -		val_cr2 &= ~FSL_SAI_CR2_BCP;
> > +		val_cr2 |= FSL_SAI_CR2_BCP;
> >  		val_cr4 &= ~(FSL_SAI_CR4_FSE | FSL_SAI_CR4_FSP);
> >  		sai->is_dsp_mode = true;
> >  		break;
> > --
> > 1.8.4
> > 
> 

^ permalink raw reply

* Re: MPC8641 based custom board Kernel stuck at 1000Mhz core clock
From: Ashish @ 2014-04-04  7:42 UTC (permalink / raw)
  To: sanjeev sharma; +Cc: scottwood, linuxppc-dev, Valdis Kletnieks, kernelnewbies
In-Reply-To: <CAGUYZuR+rymKURoHURemEggZ8uktagyhGGuqtaWvN=11DQsoxw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 5874 bytes --]

On Thursday 03 April 2014 08:55 AM, sanjeev sharma wrote:
> Are you able to capture kernel logs ?
>
> Regards
> Sanjeev Sharma
>
>
> On Thu, Mar 27, 2014 at 10:01 PM, <Valdis.Kletnieks@vt.edu 
> <mailto:Valdis.Kletnieks@vt.edu>> wrote:
>
>     On Thu, 27 Mar 2014 16:04:37 +0530, Ashish said:
>     > Hi,
>     >
>     >   I am using MPC8641-HPCN based custom board and able to boot
>     linux at
>     > MPX clock 400Mhz and core clock 800mhz. When I am increasing core
>     > frequency ie MPX clock at 400Mhz and core at 1Ghz, kernel stuck.
>
>     Step 0:  Prove to us that your core actually runs reliable and
>     stably at 1Ghz.
>
>     Step 1: Figure out *where* it gets stuck.  If you have earlyprintk
>     working on
>     your board, adding 'initcall_debug ignore_loglevel' to the kernel
>     cmdline often
>     helps track down where a kernel hangs during boot.
>
>
>     _______________________________________________
>     Kernelnewbies mailing list
>     Kernelnewbies@kernelnewbies.org
>     <mailto:Kernelnewbies@kernelnewbies.org>
>     http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>
>
Hi,
Does kernel logs means debugging information that kernel prints while 
booting using printk or it is something else?
Here is kernel boot logs that kernel printed while booting...

U-Boot 2013.04 (Jan 27 2014 - 11:21:21)

Unicore software on multiprocessor system!!
To enable mutlticore build define CONFIG_MP
CPU:   8641, Version: 2.1, (0x80900021)
Core:  E600 Core 0, Version: 2.2, (0x80040202)
Clock Configuration:
        CPU:1000 MHz, MPX:400  MHz
        DDR:200  MHz (400 MT/s data rate), LBC:25   MHz
L1:    D-cache 32 KB enabled
        I-cache 32 KB enabled
L2:    512 KB enabled
Board: MPC8641-HPCN
I2C:   ready
DRAM:  512 MiB
SDRAM test phase 1:
SDRAM test phase 2:
SDRAM test passed.
Flash: 16 MiB
EEPROM: NXID v1
In:    serial
Out:   serial
Err:   serial
Net:   eTSEC1, eTSEC2, eTSEC3, eTSEC4 [PRIME]
Hit any key to stop autoboot:
Speed: 1000, full duplex
Using eTSEC4 device
TFTP from server 192.168.10.1; our IP address is 192.168.10.2
Filename 'uRamdisk'.
Load address: 0x600000
Loading: 
*\b#################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
  #################################################################
      ##################################################
      13.1 MiB/s
done
Bytes transferred = 22680188 (15a127c hex)
Speed: 1000, full duplex
Using eTSEC4 device
TFTP from server 192.168.10.1; our IP address is 192.168.10.2
Filename 'uImage'.
Load address: 0x16000000
Loading: 
*\b#################################################################
  #################################################################
      #######################################
      14 MiB/s
done
Bytes transferred = 2476304 (25c910 hex)
Speed: 1000, full duplex
Using eTSEC4 device
TFTP from server 192.168.10.1; our IP address is 192.168.10.2
Filename 'mpc8641_hpcn.dtb'.
Load address: 0x14000000
Loading: *\b#
      2.6 MiB/s
done
Bytes transferred = 5540 (15a4 hex)
## Booting kernel from Legacy Image at 16000000 ...
    Image Name:   Linux-3.13.6
    Image Type:   PowerPC Linux Kernel Image (gzip compressed)
    Data Size:    2476240 Bytes = 2.4 MiB
    Load Address: 00000000
    Entry Point:  00000000
    Verifying Checksum ... OK
## Loading init Ramdisk from Legacy Image at 00600000 ...
    Image Name:   rootfs
    Image Type:   PowerPC Linux RAMDisk Image (gzip compressed)
    Data Size:    22680124 Bytes = 21.6 MiB
    Load Address: 00000000
    Entry Point:  00000000
    Verifying Checksum ... OK
## Flattened Device Tree blob at 14000000
    Booting using the fdt blob at 0x14000000
    Uncompressing Kernel Image ... OK
    Loading Ramdisk to 1e923000, end 1fec423c ... OK
    Loading Device Tree to 007fb000, end 007ff5a3 ... OK /_*(some times 
it stucking here) *_/
Using MPC86xx HPCN machine description
Total memory = 512MB; using 1024kB for hash table (at cff00000)
Linux version 3.13.6 (ashish@ashish-VirtualBox) (gcc version 4.7.2 (GCC) 
) #8 Sat Mar 29 10:31:58 IST 2014
Found initrd at 0xde923000:0xdfec423c
bootconsole [udbg0] enabled
setup_arch: bootmem
mpc86xx_hpcn_setup_arch()
MPC86xx HPCN board from Freescale Semiconductor
arch: exit
Zone ranges:
   DMA      [mem 0x00000000-0x1fffffff]
   Normal   empty
   HighMem  empty
Movable zone start for each node
Early memory node ranges
   node   0: [mem 0x00000000-0x1fffffff] /_*(some times here) *_/



[-- Attachment #2: Type: text/html, Size: 9651 bytes --]

^ permalink raw reply

* RE: [PATCH] ASoC: fsl_sai: Fix Bit Clock Polarity configurations
From: Li.Xiubo @ 2014-04-04  7:37 UTC (permalink / raw)
  To: guangyu.chen@freescale.com, broonie@kernel.org
  Cc: alsa-devel@alsa-project.org, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org, timur@tabi.org
In-Reply-To: <1396595387-4371-1-git-send-email-Guangyu.Chen@freescale.com>


> Subject: [PATCH] ASoC: fsl_sai: Fix Bit Clock Polarity configurations
>=20
> The BCP bit in TCR4/RCR4 register rules as followings:
>   0 Bit clock is active high with drive outputs on rising edge
>     and sample inputs on falling edge.
>   1 Bit clock is active low with drive outputs on falling edge
>     and sample inputs on rising edge.
>=20
> For all formats currently supported in the fsl_sai driver, they're exactl=
y
> sending data on the falling edge and sampling on the rising edge.
>=20
> However, the driver clears this BCP bit for all of them which results cli=
ck
> noise when working with SGTL5000 and big noise with WM8962.
>=20
> Thus this patch corrects the BCP settings for all the formats here to fix
> the nosie issue.
>=20
> Signed-off-by: Nicolin Chen <Guangyu.Chen@freescale.com>
> ---

Good catch.

Acked-by: Xiubo Li <Li.Xiubo@freescale.com>

Thanks,
--

BRs,
Xiubo


>  sound/soc/fsl/fsl_sai.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>=20
> diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
> index 99051c7..9bbebea 100644
> --- a/sound/soc/fsl/fsl_sai.c
> +++ b/sound/soc/fsl/fsl_sai.c
> @@ -180,7 +180,7 @@ static int fsl_sai_set_dai_fmt_tr(struct snd_soc_dai
> *cpu_dai,
>  		 * that is, together with the last bit of the previous
>  		 * data word.
>  		 */
> -		val_cr2 &=3D ~FSL_SAI_CR2_BCP;
> +		val_cr2 |=3D FSL_SAI_CR2_BCP;
>  		val_cr4 |=3D FSL_SAI_CR4_FSE | FSL_SAI_CR4_FSP;
>  		break;
>  	case SND_SOC_DAIFMT_LEFT_J:
> @@ -188,7 +188,7 @@ static int fsl_sai_set_dai_fmt_tr(struct snd_soc_dai
> *cpu_dai,
>  		 * Frame high, one word length for frame sync,
>  		 * frame sync asserts with the first bit of the frame.
>  		 */
> -		val_cr2 &=3D ~FSL_SAI_CR2_BCP;
> +		val_cr2 |=3D FSL_SAI_CR2_BCP;
>  		val_cr4 &=3D ~(FSL_SAI_CR4_FSE | FSL_SAI_CR4_FSP);
>  		break;
>  	case SND_SOC_DAIFMT_DSP_A:
> @@ -198,7 +198,7 @@ static int fsl_sai_set_dai_fmt_tr(struct snd_soc_dai
> *cpu_dai,
>  		 * that is, together with the last bit of the previous
>  		 * data word.
>  		 */
> -		val_cr2 &=3D ~FSL_SAI_CR2_BCP;
> +		val_cr2 |=3D FSL_SAI_CR2_BCP;
>  		val_cr4 &=3D ~FSL_SAI_CR4_FSP;
>  		val_cr4 |=3D FSL_SAI_CR4_FSE;
>  		sai->is_dsp_mode =3D true;
> @@ -208,7 +208,7 @@ static int fsl_sai_set_dai_fmt_tr(struct snd_soc_dai
> *cpu_dai,
>  		 * Frame high, one bit for frame sync,
>  		 * frame sync asserts with the first bit of the frame.
>  		 */
> -		val_cr2 &=3D ~FSL_SAI_CR2_BCP;
> +		val_cr2 |=3D FSL_SAI_CR2_BCP;
>  		val_cr4 &=3D ~(FSL_SAI_CR4_FSE | FSL_SAI_CR4_FSP);
>  		sai->is_dsp_mode =3D true;
>  		break;
> --
> 1.8.4
>=20

^ permalink raw reply

* [PATCH v2] powerpc/le: enable RTAS events support
From: Greg Kurz @ 2014-04-04  7:35 UTC (permalink / raw)
  To: benh; +Cc: linux-kernel, geert, paulus, anton, nfont, linuxppc-dev
In-Reply-To: <0140402175658.1b4a8c4d@bahia.local>

The current kernel code assumes big endian and parses RTAS events all
wrong. The most visible effect is that we cannot honor EPOW events,
meaning, for example, we cannot shut down a guest properly from the
hypervisor.

This new patch is largely inspired by Nathan's work: we get rid of all
the bit fields in the RTAS event structures (even the unused ones, for
consistency). We also introduce endian safe accessors for the fields used
by the kernel (trivial rtas_error_type() accessor added for consistency).

Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---

Changes since v2:
- kill all bit fields in RTAS event structures
- endian safe accessors
- use __be32 and __be16 where applicable

I could successfuly have the system_shutdown command working in the qemu
monitor.

Cheers.

--
Greg

 arch/powerpc/include/asm/rtas.h               |  127 +++++++++++++++++++------
 arch/powerpc/kernel/rtas.c                    |   15 ++-
 arch/powerpc/kernel/rtasd.c                   |   24 ++---
 arch/powerpc/platforms/pseries/io_event_irq.c |    6 +
 arch/powerpc/platforms/pseries/ras.c          |   17 ++-
 5 files changed, 128 insertions(+), 61 deletions(-)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 9bd52c6..217c81e 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -150,19 +150,53 @@ struct rtas_suspend_me_data {
 #define RTAS_VECTOR_EXTERNAL_INTERRUPT	0x500
 
 struct rtas_error_log {
-	unsigned long version:8;		/* Architectural version */
-	unsigned long severity:3;		/* Severity level of error */
-	unsigned long disposition:2;		/* Degree of recovery */
-	unsigned long extended:1;		/* extended log present? */
-	unsigned long /* reserved */ :2;	/* Reserved for future use */
-	unsigned long initiator:4;		/* Initiator of event */
-	unsigned long target:4;			/* Target of failed operation */
-	unsigned long type:8;			/* General event or error*/
-	unsigned long extended_log_length:32;	/* length in bytes */
-	unsigned char buffer[1];		/* Start of extended log */
+	/* Byte 0 */
+	uint8_t		byte0;			/* Architectural version */
+
+	/* Byte 1 */
+	uint8_t		byte1;
+	/* XXXXXXXX
+	 * XXX		3: Severity level of error
+	 *    XX	2: Degree of recovery
+	 *      X	1: Extended log present?
+	 *       XX	2: Reserved
+	 */
+
+	/* Byte 2 */
+	uint8_t		byte2;
+	/* XXXXXXXX
+	 * XXXX		4: Initiator of event
+	 *     XXXX	4: Target of failed operation
+	 */
+	uint8_t		byte3;			/* General event or error*/
+	__be32		extended_log_length;	/* length in bytes */
+	unsigned char	buffer[1];		/* Start of extended log */
 						/* Variable length.      */
 };
 
+static inline uint8_t rtas_error_severity(const struct rtas_error_log *elog)
+{
+	return (elog->byte1 & 0xE0) >> 5;
+}
+
+static inline uint8_t rtas_error_disposition(const struct rtas_error_log *elog)
+{
+	return (elog->byte1 & 0x18) >> 3;
+}
+
+static inline uint8_t rtas_error_extended(const struct rtas_error_log *elog)
+{
+	return (elog->byte1 & 0x04) >> 2;
+}
+
+#define rtas_error_type(x)	((x)->byte3)
+
+static inline
+uint32_t rtas_error_extended_log_length(const struct rtas_error_log *elog)
+{
+	return be32_to_cpu(elog->extended_log_length);
+}
+
 #define RTAS_V6EXT_LOG_FORMAT_EVENT_LOG	14
 
 #define RTAS_V6EXT_COMPANY_ID_IBM	(('I' << 24) | ('B' << 16) | ('M' << 8))
@@ -172,32 +206,35 @@ struct rtas_error_log {
  */
 struct rtas_ext_event_log_v6 {
 	/* Byte 0 */
-	uint32_t log_valid:1;		/* 1:Log valid */
-	uint32_t unrecoverable_error:1;	/* 1:Unrecoverable error */
-	uint32_t recoverable_error:1;	/* 1:recoverable (correctable	*/
-					/*   or successfully retried)	*/
-	uint32_t degraded_operation:1;	/* 1:Unrecoverable err, bypassed*/
-					/*   - degraded operation (e.g.	*/
-					/*   CPU or mem taken off-line)	*/
-	uint32_t predictive_error:1;
-	uint32_t new_log:1;		/* 1:"New" log (Always 1 for	*/
-					/*   data returned from RTAS	*/
-	uint32_t big_endian:1;		/* 1: Big endian */
-	uint32_t :1;			/* reserved */
+	uint8_t byte0;
+	/* XXXXXXXX
+	 * X		1: Log valid
+	 *  X		1: Unrecoverable error
+	 *   X		1: Recoverable (correctable or successfully retried)
+	 *    X		1: Bypassed unrecoverable error (degraded operation)
+	 *     X	1: Predictive error
+	 *      X	1: "New" log (always 1 for data returned from RTAS)
+	 *       X	1: Big Endian
+	 *        X	1: Reserved
+	 */
+
 	/* Byte 1 */
-	uint32_t :8;			/* reserved */
+	uint8_t byte1;			/* reserved */
+
 	/* Byte 2 */
-	uint32_t powerpc_format:1;	/* Set to 1 (indicating log is	*/
-					/* in PowerPC format		*/
-	uint32_t :3;			/* reserved */
-	uint32_t log_format:4;		/* Log format indicator. Define	*/
-					/* format used for byte 12-2047	*/
+	uint8_t byte2;
+	/* XXXXXXXX
+	 * X		1: Set to 1 (indicating log is in PowerPC format)
+	 *  XXX		3: Reserved
+	 *     XXXX	4: Log format used for bytes 12-2047
+	 */
+
 	/* Byte 3 */
-	uint32_t :8;			/* reserved */
+	uint8_t byte3;			/* reserved */
 	/* Byte 4-11 */
 	uint8_t reserved[8];		/* reserved */
 	/* Byte 12-15 */
-	uint32_t company_id;		/* Company ID of the company	*/
+	__be32  company_id;		/* Company ID of the company	*/
 					/* that defines the format for	*/
 					/* the vendor specific log type	*/
 	/* Byte 16-end of log */
@@ -205,6 +242,18 @@ struct rtas_ext_event_log_v6 {
 					/* Variable length.		*/
 };
 
+static
+inline uint8_t rtas_ext_event_log_format(struct rtas_ext_event_log_v6 *ext_log)
+{
+	return ext_log->byte2 & 0x0F;
+}
+
+static
+inline uint32_t rtas_ext_event_company_id(struct rtas_ext_event_log_v6 *ext_log)
+{
+	return be32_to_cpu(ext_log->company_id);
+}
+
 /* pSeries event log format */
 
 /* Two bytes ASCII section IDs */
@@ -227,14 +276,26 @@ struct rtas_ext_event_log_v6 {
 
 /* Vendor specific Platform Event Log Format, Version 6, section header */
 struct pseries_errorlog {
-	uint16_t id;			/* 0x00 2-byte ASCII section ID	*/
-	uint16_t length;		/* 0x02 Section length in bytes	*/
+	__be16 id;			/* 0x00 2-byte ASCII section ID	*/
+	__be16 length;			/* 0x02 Section length in bytes	*/
 	uint8_t version;		/* 0x04 Section version		*/
 	uint8_t subtype;		/* 0x05 Section subtype		*/
-	uint16_t creator_component;	/* 0x06 Creator component ID	*/
+	__be16 creator_component;	/* 0x06 Creator component ID	*/
 	uint8_t data[];			/* 0x08 Start of section data	*/
 };
 
+static
+inline uint16_t pseries_errorlog_id(struct pseries_errorlog *sect)
+{
+	return be16_to_cpu(sect->id);
+}
+
+static
+inline uint16_t pseries_errorlog_length(struct pseries_errorlog *sect)
+{
+	return be16_to_cpu(sect->length);
+}
+
 struct pseries_errorlog *get_pseries_errorlog(struct rtas_error_log *log,
 					      uint16_t section_id);
 
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index f386296..8cd5ed0 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -993,21 +993,24 @@ struct pseries_errorlog *get_pseries_errorlog(struct rtas_error_log *log,
 		(struct rtas_ext_event_log_v6 *)log->buffer;
 	struct pseries_errorlog *sect;
 	unsigned char *p, *log_end;
+	uint32_t ext_log_length = rtas_error_extended_log_length(log);
+	uint8_t log_format = rtas_ext_event_log_format(ext_log);
+	uint32_t company_id = rtas_ext_event_company_id(ext_log);
 
 	/* Check that we understand the format */
-	if (log->extended_log_length < sizeof(struct rtas_ext_event_log_v6) ||
-	    ext_log->log_format != RTAS_V6EXT_LOG_FORMAT_EVENT_LOG ||
-	    ext_log->company_id != RTAS_V6EXT_COMPANY_ID_IBM)
+	if (ext_log_length < sizeof(struct rtas_ext_event_log_v6) ||
+	    log_format != RTAS_V6EXT_LOG_FORMAT_EVENT_LOG ||
+	    company_id != RTAS_V6EXT_COMPANY_ID_IBM)
 		return NULL;
 
-	log_end = log->buffer + log->extended_log_length;
+	log_end = log->buffer + ext_log_length;
 	p = ext_log->vendor_log;
 
 	while (p < log_end) {
 		sect = (struct pseries_errorlog *)p;
-		if (sect->id == section_id)
+		if (pseries_errorlog_id(sect) == section_id)
 			return sect;
-		p += sect->length;
+		p += pseries_errorlog_length(sect);
 	}
 
 	return NULL;
diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
index 1130c53..e736387 100644
--- a/arch/powerpc/kernel/rtasd.c
+++ b/arch/powerpc/kernel/rtasd.c
@@ -150,8 +150,8 @@ static void printk_log_rtas(char *buf, int len)
 		struct rtas_error_log *errlog = (struct rtas_error_log *)buf;
 
 		printk(RTAS_DEBUG "event: %d, Type: %s, Severity: %d\n",
-		       error_log_cnt, rtas_event_type(errlog->type),
-		       errlog->severity);
+		       error_log_cnt, rtas_event_type(rtas_error_type(errlog)),
+		       rtas_error_severity(errlog));
 	}
 }
 
@@ -159,14 +159,16 @@ static int log_rtas_len(char * buf)
 {
 	int len;
 	struct rtas_error_log *err;
+	uint32_t extended_log_length;
 
 	/* rtas fixed header */
 	len = 8;
 	err = (struct rtas_error_log *)buf;
-	if (err->extended && err->extended_log_length) {
+	extended_log_length = rtas_error_extended_log_length(err);
+	if (rtas_error_extended(err) && extended_log_length) {
 
 		/* extended header */
-		len += err->extended_log_length;
+		len += extended_log_length;
 	}
 
 	if (rtas_error_log_max == 0)
@@ -293,15 +295,13 @@ void prrn_schedule_update(u32 scope)
 
 static void handle_rtas_event(const struct rtas_error_log *log)
 {
-	if (log->type == RTAS_TYPE_PRRN) {
-		/* For PRRN Events the extended log length is used to denote
-		 * the scope for calling rtas update-nodes.
-		 */
-		if (prrn_is_enabled())
-			prrn_schedule_update(log->extended_log_length);
-	}
+	if (rtas_error_type(log) != RTAS_TYPE_PRRN || !prrn_is_enabled())
+		return;
 
-	return;
+	/* For PRRN Events the extended log length is used to denote
+	 * the scope for calling rtas update-nodes.
+	 */
+	prrn_schedule_update(rtas_error_extended_log_length(log));
 }
 
 #else
diff --git a/arch/powerpc/platforms/pseries/io_event_irq.c b/arch/powerpc/platforms/pseries/io_event_irq.c
index 5ea88d1..0240c4f 100644
--- a/arch/powerpc/platforms/pseries/io_event_irq.c
+++ b/arch/powerpc/platforms/pseries/io_event_irq.c
@@ -82,9 +82,9 @@ static struct pseries_io_event * ioei_find_event(struct rtas_error_log *elog)
 	 * RTAS_TYPE_IO only exists in extended event log version 6 or later.
 	 * No need to check event log version.
 	 */
-	if (unlikely(elog->type != RTAS_TYPE_IO)) {
-		printk_once(KERN_WARNING "io_event_irq: Unexpected event type %d",
-			    elog->type);
+	if (unlikely(rtas_error_type(elog) != RTAS_TYPE_IO)) {
+		printk_once(KERN_WARNING"io_event_irq: Unexpected event type %d",
+			    rtas_error_type(elog));
 		return NULL;
 	}
 
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 721c058..9c5778e 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -236,7 +236,8 @@ static irqreturn_t ras_error_interrupt(int irq, void *dev_id)
 
 	rtas_elog = (struct rtas_error_log *)ras_log_buf;
 
-	if ((status == 0) && (rtas_elog->severity >= RTAS_SEVERITY_ERROR_SYNC))
+	if (status == 0 &&
+	    rtas_error_severity(rtas_elog) >= RTAS_SEVERITY_ERROR_SYNC)
 		fatal = 1;
 	else
 		fatal = 0;
@@ -300,13 +301,14 @@ static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs)
 
 	/* If it isn't an extended log we can use the per cpu 64bit buffer */
 	h = (struct rtas_error_log *)&savep[1];
-	if (!h->extended) {
+	if (!rtas_error_extended(h)) {
 		memcpy(&__get_cpu_var(mce_data_buf), h, sizeof(__u64));
 		errhdr = (struct rtas_error_log *)&__get_cpu_var(mce_data_buf);
 	} else {
-		int len;
+		int len, error_log_length;
 
-		len = max_t(int, 8+h->extended_log_length, RTAS_ERROR_LOG_MAX);
+		error_log_length = 8 + rtas_error_extended_log_length(h);
+		len = max_t(int, error_log_length, RTAS_ERROR_LOG_MAX);
 		memset(global_mce_data_buf, 0, RTAS_ERROR_LOG_MAX);
 		memcpy(global_mce_data_buf, h, len);
 		errhdr = (struct rtas_error_log *)global_mce_data_buf;
@@ -350,23 +352,24 @@ int pSeries_system_reset_exception(struct pt_regs *regs)
 static int recover_mce(struct pt_regs *regs, struct rtas_error_log *err)
 {
 	int recovered = 0;
+	int disposition = rtas_error_disposition(err);
 
 	if (!(regs->msr & MSR_RI)) {
 		/* If MSR_RI isn't set, we cannot recover */
 		recovered = 0;
 
-	} else if (err->disposition == RTAS_DISP_FULLY_RECOVERED) {
+	} else if (disposition == RTAS_DISP_FULLY_RECOVERED) {
 		/* Platform corrected itself */
 		recovered = 1;
 
-	} else if (err->disposition == RTAS_DISP_LIMITED_RECOVERY) {
+	} else if (disposition == RTAS_DISP_LIMITED_RECOVERY) {
 		/* Platform corrected itself but could be degraded */
 		printk(KERN_ERR "MCE: limited recovery, system may "
 		       "be degraded\n");
 		recovered = 1;
 
 	} else if (user_mode(regs) && !is_global_init(current) &&
-		   err->severity == RTAS_SEVERITY_ERROR_SYNC) {
+		   rtas_error_severity(err) == RTAS_SEVERITY_ERROR_SYNC) {
 
 		/*
 		 * If we received a synchronous error when in userspace

^ permalink raw reply related

* Re: [PATCH V2 2/2] mm: add FAULT_AROUND_ORDER Kconfig paramater for powerpc
From: Ingo Molnar @ 2014-04-04  7:10 UTC (permalink / raw)
  To: Madhavan Srinivasan
  Cc: linux-arch, riel, rusty, Peter Zijlstra, peterz, x86,
	linux-kernel, linux-mm, ak, paulus, mgorman, Linus Torvalds, akpm,
	linuxppc-dev, kirill.shutemov
In-Reply-To: <20140404070241.GA984@gmail.com>


* Ingo Molnar <mingo@kernel.org> wrote:

> * Madhavan Srinivasan <maddy@linux.vnet.ibm.com> wrote:
> 
> > Performance data for different FAULT_AROUND_ORDER values from 4 
> > socket Power7 system (128 Threads and 128GB memory) is below. perf 
> > stat with repeat of 5 is used to get the stddev values. This patch 
> > create FAULT_AROUND_ORDER Kconfig parameter and defaults it to 3 
> > based on the performance data.
> > 
> > FAULT_AROUND_ORDER      Baseline        1               3               4               5               7
> > 
> > Linux build (make -j64)
> > minor-faults            7184385         5874015         4567289         4318518         4193815         4159193
> > times in seconds        61.433776136    60.865935292    59.245368038    60.630675011    60.56587624     59.828271924
> >  stddev for time	( +-  1.18% )	( +-  1.78% )	( +-  0.44% )	( +-  2.03% )	( +-  1.66% )	( +-  1.45% )
> 
> Ok, this is better, but it is still rather incomplete statistically, 
> please also calculate the percentage difference to baseline, so that 
> the stddev becomes meaningful and can be compared to something!
> 
> As an example I did this for the first line of measurements (all 
> errors in the numbers are mine, this was done manually), and it 
> gives:
> 
> >  stddev for time   ( +-  1.18% ) ( +-  1.78% ) ( +-  0.44% ) ( +-  2.03% ) ( +-  1.66% ) ( +-  1.45% )
>                                         +0.9%         +3.5%         +1.3%         +1.4%         +2.6%
> 
> This shows that there is probably a statistically significant 
> (positiv) effect from the change, but from these numbers alone I 
> would not draw any quantitative (sizing, tuning) conclusions, 
> because in 3 out of 5 cases the stddev was larger than the effect, 
> so the resulting percentages are not comparable.

Also note that because we calculate the percentage by dividing result 
with baseline, the stddev of the two values roughly adds up. So for 
example the second column the true noise is around 1.5%, not 0.4%

So for good sizing decisions the stddev must be 'comfortably' below 
the effect. (or sizing should be done based on the other workloads yu 
tested, I have not checked them.)

It also makes sense to run more measurements to reduce the stddev of 
the baseline. So if each measurement is run 3 times then it makes 
sense to run the baseline 6 times, this gives a ~30% improvement in 
the confidence of our result, at just a small increase in test time.

[ For such cases it might also make sense to script all of that, 
  combined with a debug patch that puts the tuned fault-around value 
  into a dynamic knob in /proc/sys/, so that you can run the full 
  measurement in a single pass, with no reboot and with no human 
  intervention. ]

Thanks,

	Ingo

^ permalink raw reply

* [PATCH] ASoC: fsl_sai: Fix Bit Clock Polarity configurations
From: Nicolin Chen @ 2014-04-04  7:09 UTC (permalink / raw)
  To: broonie, Li.Xiubo; +Cc: alsa-devel, linuxppc-dev, linux-kernel, timur

The BCP bit in TCR4/RCR4 register rules as followings:
  0 Bit clock is active high with drive outputs on rising edge
    and sample inputs on falling edge.
  1 Bit clock is active low with drive outputs on falling edge
    and sample inputs on rising edge.

For all formats currently supported in the fsl_sai driver, they're exactly
sending data on the falling edge and sampling on the rising edge.

However, the driver clears this BCP bit for all of them which results click
noise when working with SGTL5000 and big noise with WM8962.

Thus this patch corrects the BCP settings for all the formats here to fix
the nosie issue.

Signed-off-by: Nicolin Chen <Guangyu.Chen@freescale.com>
---
 sound/soc/fsl/fsl_sai.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
index 99051c7..9bbebea 100644
--- a/sound/soc/fsl/fsl_sai.c
+++ b/sound/soc/fsl/fsl_sai.c
@@ -180,7 +180,7 @@ static int fsl_sai_set_dai_fmt_tr(struct snd_soc_dai *cpu_dai,
 		 * that is, together with the last bit of the previous
 		 * data word.
 		 */
-		val_cr2 &= ~FSL_SAI_CR2_BCP;
+		val_cr2 |= FSL_SAI_CR2_BCP;
 		val_cr4 |= FSL_SAI_CR4_FSE | FSL_SAI_CR4_FSP;
 		break;
 	case SND_SOC_DAIFMT_LEFT_J:
@@ -188,7 +188,7 @@ static int fsl_sai_set_dai_fmt_tr(struct snd_soc_dai *cpu_dai,
 		 * Frame high, one word length for frame sync,
 		 * frame sync asserts with the first bit of the frame.
 		 */
-		val_cr2 &= ~FSL_SAI_CR2_BCP;
+		val_cr2 |= FSL_SAI_CR2_BCP;
 		val_cr4 &= ~(FSL_SAI_CR4_FSE | FSL_SAI_CR4_FSP);
 		break;
 	case SND_SOC_DAIFMT_DSP_A:
@@ -198,7 +198,7 @@ static int fsl_sai_set_dai_fmt_tr(struct snd_soc_dai *cpu_dai,
 		 * that is, together with the last bit of the previous
 		 * data word.
 		 */
-		val_cr2 &= ~FSL_SAI_CR2_BCP;
+		val_cr2 |= FSL_SAI_CR2_BCP;
 		val_cr4 &= ~FSL_SAI_CR4_FSP;
 		val_cr4 |= FSL_SAI_CR4_FSE;
 		sai->is_dsp_mode = true;
@@ -208,7 +208,7 @@ static int fsl_sai_set_dai_fmt_tr(struct snd_soc_dai *cpu_dai,
 		 * Frame high, one bit for frame sync,
 		 * frame sync asserts with the first bit of the frame.
 		 */
-		val_cr2 &= ~FSL_SAI_CR2_BCP;
+		val_cr2 |= FSL_SAI_CR2_BCP;
 		val_cr4 &= ~(FSL_SAI_CR4_FSE | FSL_SAI_CR4_FSP);
 		sai->is_dsp_mode = true;
 		break;
-- 
1.8.4

^ permalink raw reply related

* Re: [PATCH V2 2/2] mm: add FAULT_AROUND_ORDER Kconfig paramater for powerpc
From: Ingo Molnar @ 2014-04-04  7:02 UTC (permalink / raw)
  To: Madhavan Srinivasan
  Cc: linux-arch, riel, rusty, peterz, x86, linux-kernel, linux-mm, ak,
	paulus, mgorman, akpm, linuxppc-dev, kirill.shutemov
In-Reply-To: <1396592835-24767-3-git-send-email-maddy@linux.vnet.ibm.com>


* Madhavan Srinivasan <maddy@linux.vnet.ibm.com> wrote:

> Performance data for different FAULT_AROUND_ORDER values from 4 socket
> Power7 system (128 Threads and 128GB memory) is below. perf stat with
> repeat of 5 is used to get the stddev values. This patch create
> FAULT_AROUND_ORDER Kconfig parameter and defaults it to 3 based on the
> performance data.
> 
> FAULT_AROUND_ORDER      Baseline        1               3               4               5               7
> 
> Linux build (make -j64)
> minor-faults            7184385         5874015         4567289         4318518         4193815         4159193
> times in seconds        61.433776136    60.865935292    59.245368038    60.630675011    60.56587624     59.828271924
>  stddev for time	( +-  1.18% )	( +-  1.78% )	( +-  0.44% )	( +-  2.03% )	( +-  1.66% )	( +-  1.45% )

Ok, this is better, but it is still rather incomplete statistically, 
please also calculate the percentage difference to baseline, so that 
the stddev becomes meaningful and can be compared to something!

As an example I did this for the first line of measurements (all 
errors in the numbers are mine, this was done manually), and it gives:

>  stddev for time   ( +-  1.18% ) ( +-  1.78% ) ( +-  0.44% ) ( +-  2.03% ) ( +-  1.66% ) ( +-  1.45% )
                                        +0.9%         +3.5%         +1.3%         +1.4%         +2.6%

This shows that there is probably a statistically significant 
(positiv) effect from the change, but from these numbers alone I would 
not draw any quantitative (sizing, tuning) conclusions, because in 3 
out of 5 cases the stddev was larger than the effect, so the resulting 
percentages are not comparable.

Please do this calculation for all the other lines as well and also 
close all the numbers with a conclusion section where you *analyze* 
the results, outline the statistics and compare the various workloads 
and how the tuning affects them and don't force the readers of the 
commit guess what it all means and how significant it all is!

Thanks,

	Ingo

^ permalink raw reply

* [PATCH V2 1/2] mm: move FAULT_AROUND_ORDER to arch/
From: Madhavan Srinivasan @ 2014-04-04  6:27 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, linux-mm, linux-arch, x86
  Cc: riel, ak, peterz, rusty, Madhavan Srinivasan, paulus, mgorman,
	akpm, mingo, kirill.shutemov
In-Reply-To: <1396592835-24767-1-git-send-email-maddy@linux.vnet.ibm.com>

Kirill A. Shutemov with faultaround patchset introduced
vm_ops->map_pages() for mapping easy accessible pages around
fault address in hope to reduce number of minor page faults.

This patch creates infrastructure to move the FAULT_AROUND_ORDER
to arch/ using Kconfig. This will enable architecture maintainers
to decide on suitable FAULT_AROUND_ORDER value based on
performance data for that architecture. Patch also adds
FAULT_AROUND_ORDER Kconfig element in arch/X86.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/x86/Kconfig   |    4 ++++
 include/linux/mm.h |    9 +++++++++
 mm/memory.c        |   12 +++++-------
 3 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9c0a657..5833f22 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1177,6 +1177,10 @@ config DIRECT_GBPAGES
 	  support it. This can improve the kernel's performance a tiny bit by
 	  reducing TLB pressure. If in doubt, say "Y".
 
+config FAULT_AROUND_ORDER
+	int
+	default "4"
+
 # Common NUMA Features
 config NUMA
 	bool "Numa Memory Allocation and Scheduler Support"
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0bd4359..b93c1c3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -26,6 +26,15 @@ struct file_ra_state;
 struct user_struct;
 struct writeback_control;
 
+/*
+ * Fault around order is a control knob to decide the fault around pages.
+ * Default value is set to 0UL (disabled), but the arch can override it as
+ * desired.
+ */
+#ifndef CONFIG_FAULT_AROUND_ORDER
+#define CONFIG_FAULT_AROUND_ORDER 0
+#endif
+
 #ifndef CONFIG_NEED_MULTIPLE_NODES	/* Don't use mapnrs, do it properly */
 extern unsigned long max_mapnr;
 
diff --git a/mm/memory.c b/mm/memory.c
index b02c584..22a4a89 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3358,10 +3358,8 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
 	update_mmu_cache(vma, address, pte);
 }
 
-#define FAULT_AROUND_ORDER 4
-
 #ifdef CONFIG_DEBUG_FS
-static unsigned int fault_around_order = FAULT_AROUND_ORDER;
+static unsigned int fault_around_order = CONFIG_FAULT_AROUND_ORDER;
 
 static int fault_around_order_get(void *data, u64 *val)
 {
@@ -3371,7 +3369,7 @@ static int fault_around_order_get(void *data, u64 *val)
 
 static int fault_around_order_set(void *data, u64 val)
 {
-	BUILD_BUG_ON((1UL << FAULT_AROUND_ORDER) > PTRS_PER_PTE);
+	BUILD_BUG_ON((1UL << CONFIG_FAULT_AROUND_ORDER) > PTRS_PER_PTE);
 	if (1UL << val > PTRS_PER_PTE)
 		return -EINVAL;
 	fault_around_order = val;
@@ -3406,14 +3404,14 @@ static inline unsigned long fault_around_pages(void)
 {
 	unsigned long nr_pages;
 
-	nr_pages = 1UL << FAULT_AROUND_ORDER;
+	nr_pages = 1UL << CONFIG_FAULT_AROUND_ORDER;
 	BUILD_BUG_ON(nr_pages > PTRS_PER_PTE);
 	return nr_pages;
 }
 
 static inline unsigned long fault_around_mask(void)
 {
-	return ~((1UL << (PAGE_SHIFT + FAULT_AROUND_ORDER)) - 1);
+	return ~((1UL << (PAGE_SHIFT + CONFIG_FAULT_AROUND_ORDER)) - 1);
 }
 #endif
 
@@ -3471,7 +3469,7 @@ static int do_read_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * if page by the offset is not ready to be mapped (cold cache or
 	 * something).
 	 */
-	if (vma->vm_ops->map_pages) {
+	if ((vma->vm_ops->map_pages) && (fault_around_pages() > 1)) {
 		pte = pte_offset_map_lock(mm, pmd, address, &ptl);
 		do_fault_around(vma, address, pte, pgoff, flags);
 		if (!pte_same(*pte, orig_pte))
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH V2 2/2] mm: add FAULT_AROUND_ORDER Kconfig paramater for powerpc
From: Madhavan Srinivasan @ 2014-04-04  6:27 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, linux-mm, linux-arch, x86
  Cc: riel, ak, peterz, rusty, Madhavan Srinivasan, paulus, mgorman,
	akpm, mingo, kirill.shutemov
In-Reply-To: <1396592835-24767-1-git-send-email-maddy@linux.vnet.ibm.com>

Performance data for different FAULT_AROUND_ORDER values from 4 socket
Power7 system (128 Threads and 128GB memory) is below. perf stat with
repeat of 5 is used to get the stddev values. This patch create
FAULT_AROUND_ORDER Kconfig parameter and defaults it to 3 based on the
performance data.

FAULT_AROUND_ORDER      Baseline        1               3               4               5               7

Linux build (make -j64)
minor-faults            7184385         5874015         4567289         4318518         4193815         4159193
times in seconds        61.433776136    60.865935292    59.245368038    60.630675011    60.56587624     59.828271924
 stddev for time	( +-  1.18% )	( +-  1.78% )	( +-  0.44% )	( +-  2.03% )	( +-  1.66% )	( +-  1.45% )

Linux rebuild (make -j64)
minor-faults            303018          226392          146170          132480          126878          126236
times in seconds        5.659819172     5.723996942     5.591238319     5.622533357     5.878811995     5.550133096
 stddev for time	( +-  0.71% )	( +-  0.78% )	( +-  0.62% )	( +-  0.45% )	( +-  1.55% )	( +-  0.29% )

Two synthetic tests: access every word in file in sequential/random order.
Marginal Performance gains seen for FAO value of 3 when compared to value
of 4.

Sequential access 16GiB file
FAULT_AROUND_ORDER      Baseline        1               3               4               5               7
1 thread
       minor-faults     262302          131192          32873           16486           8291            2351
       times in seconds 53.071497352    52.945826882    52.931417302    52.928577184    52.859285439    53.116800539
       stddev for time	( +-  0.01% )	( +-  0.02% )	( +-  0.02% )	( +-  0.04% )	( +-  0.04% )	( +-  0.01% )
8 threads
       minor-faults     2097314         1051046         263336          131715          66098           16653
       times in seconds 54.385698561    54.603652339    54.771282004    54.488565674    54.496701531    54.962142189
       stddev for time	( +-  0.05% )	( +-  0.02% )	( +-  0.37% )	( +-  0.08% )	( +-  0.07% )	( +-  0.51% )
32 threads
       minor-faults     8389267         4218595         1059961         531319          266463          67271
       times in seconds 60.61715047     60.827964038    60.46412673     60.266045885    60.492398315    60.24531921
       stddev for time	( +-  0.65% )	( +-  0.21% )	( +-  0.25% )	( +-  0.29% )	( +-  0.19% )	( +-  0.35% )
64 threads
       minor-faults     16777455        8485998         2178582         1092106         544302          137693
       times in seconds 86.471334554    84.412415735    85.208303832    84.331473392    85.598793479    84.695469266
       stddev for time	( +-  0.60% )	( +-  1.47% )	( +-  0.74% )	( +-  1.55% )	( +-  0.92% )	( +-  1.16% )
128 threads
       minor-faults     33555267        17734522        4710107         2380821         1182707         292077
       times in seconds 117.535385569   114.291359037   112.593908276   113.081807611   114.358686588   114.491043011
       stddev for time	( +-  2.24% )	( +-  0.92% )	( +-  0.36% )	( +-  0.53% )	( +-  0.70% )	( +-  0.53% )

Random access 1GiB file
FAULT_AROUND_ORDER      Baseline        1               3               4               5               7
1 thread
       minor-faults     16503           8664            2149            1126            610             437
       times in seconds 43.843573808    48.042069805    50.580779682    54.282884593    52.641739876    51.803302129
       stddev for time	( +-  1.30% )	( +-  2.25% )	( +-  2.92% )	( +-  1.44% )	( +-  4.49% )	( +-  3.78% )
8 threads
       minor-faults     131201          70916           17760           8665            4250            1149
       times in seconds 46.262626804    55.942851041    56.629191584    57.97044714     55.417557594    56.019709166
       stddev for time	( +-  4.66% )	( +-  1.52% )	( +-  1.43% )	( +-  1.61% )	( +-  0.65% )	( +-  1.27% )
32 threads
       minor-faults     524959          265980          67282           33601           16930           4316
       times in seconds 67.754175928    69.85012331     71.750338061    71.053074643    68.90728294     71.250103217
       stddev for time	( +-  3.79% )	( +-  0.77% )	( +-  1.15% )	( +-  1.08% )	( +-  2.14% )	( +-  1.17% )
64 threads
       minor-faults     1048831         528829          133256          66700           33428           8776
       times in seconds 96.674025305    93.109961822    87.441777715    91.986332028    88.686748472    93.101434306
       stddev for time	( +-  2.85% )	( +-  2.42% )	( +-  0.42% )	( +-  1.58% )	( +-  1.29% )	( +-  2.01% )
128 threads
       minor-faults     2098043         1053224         266271          133702          66966           17276
       times in seconds 156.525792044   152.117971403   147.523673243   148.560226602   148.596575663   149.389288429
       stddev for time	( +-  2.39% )	( +-  0.71% )	( +-  0.72% )	( +-  0.35% )	( +-  0.41% )	( +-  1.04% )

Worst case scenario: we touch one page every 16M to demonstrate overhead.

Touch only one page in page table in 16GiB file
FAULT_AROUND_ORDER      Baseline        1               3               4               5               7
1 thread
       minor-faults     1077            1064            1051            1048            1046            1045
       times in seconds 0.00615347      0.008327379     0.019775282     0.034444003     0.05905971      0.220863339
       stddev for time	( +-  3.12% )	( +-  2.29% )	( +-  1.68% )	( +-  1.40% )	( +-  1.68% )	( +-  1.53% )
8 threads
       minor-faults     8252            8239            8226            8223            8220            8224
       times in seconds 0.04387392      0.059859294     0.113897648     0.199707764     0.361585762     1.343366843
       stddev for time	( +-  3.66% )	( +-  2.18% )	( +-  1.44% )	( +-  2.40% )	( +-  2.08% )	( +-  1.75% )
32 threads
       minor-faults     32852           32841           32825           32826           32824           32828
       times in seconds 0.191404544     0.21907773      0.433207123     0.72430447      1.334983196     4.97727449
       stddev for time	( +-  5.08% )	( +-  3.19% )	( +-  4.45% )	( +-  2.18% )	( +-  2.75% )	( +-  1.52% )
64 threads
       minor-faults     65652           65642           65629           65622           65623           65634
       times in seconds 0.402140429     0.510806718     0.854288645     1.412329805     2.556707704     8.711074863
       stddev for time	( +-  2.90% )	( +-  3.04% )	( +-  1.83% )	( +-  2.75% )	( +-  2.49% )	( +-  0.68% )
128 threads
       minor-faults     131255          131239          131228          131228          131229          131243
       times in seconds 0.817782148     1.124631348     2.023730928     3.184792382     5.331392072     17.309524609
       stddev for time	( +-  3.35% )	( +- 11.74% )	( +-  4.55% )	( +-  2.00% )	( +-  1.31% )	( +-  1.30% )

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/pseries/Kconfig |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index 2cb8b77..2246d9f 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -132,3 +132,8 @@ config DTL
 	  which are accessible through a debugfs file.
 
 	  Say N if you are unsure.
+
+config FAULT_AROUND_ORDER
+	int
+	default "3"
+
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH V2 0/2] FAULT_AROUND_ORDER patchset performance data for powerpc
From: Madhavan Srinivasan @ 2014-04-04  6:27 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, linux-mm, linux-arch, x86
  Cc: riel, ak, peterz, rusty, Madhavan Srinivasan, paulus, mgorman,
	akpm, mingo, kirill.shutemov

Kirill A. Shutemov with faultaround patchset introduced
vm_ops->map_pages() for mapping easy accessible pages around
fault address in hope to reduce number of minor page faults.

This patchset creates infrastructure to move the FAULT_AROUND_ORDER
to arch/ using Kconfig. This will enable architecture maintainers
to decide on suitable FAULT_AROUND_ORDER value based on
performance data for that architecture. First patch also adds
FAULT_AROUND_ORDER Kconfig element for X86. Second patch list
out the performance numbers for powerpc (platform pseries) and
adds FAULT_AROUND_ORDER Kconfig element for powerpc.

V2 Changes:
  Created Kconfig parameter for FAULT_AROUND_ORDER
  Added check in do_read_fault to handle FAULT_AROUND_ORDER value of 0
  Made changes in commit messages.

Madhavan Srinivasan (2):
  mm: move FAULT_AROUND_ORDER to arch/
  mm: add FAULT_AROUND_ORDER Kconfig paramater for powerpc

 arch/powerpc/platforms/pseries/Kconfig |    5 +++++
 arch/x86/Kconfig                       |    4 ++++
 include/linux/mm.h                     |    9 +++++++++
 mm/memory.c                            |   12 +++++-------
 4 files changed, 23 insertions(+), 7 deletions(-)

-- 
1.7.10.4

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox