LinuxPPC-Dev Archive on lore.kernel.org
* [PATCH v5 1/7] [booke] Rename mapping based RELOCATABLE to DYNAMIC_MEMSTART for BookE
From: Suzuki K. Poulose @ 2011-12-15  8:57 UTC (permalink / raw)
  To: Josh Boyer, Benjamin Herrenschmidt
  Cc: Scott Wood, Josh Poimboeuf, linux ppc dev
In-Reply-To: <20111215085516.3404.56377.stgit@suzukikp.in.ibm.com>

The current implementation of CONFIG_RELOCATABLE in BookE is based
on mapping the page aligned kernel load address to KERNELBASE. This
approach, however, is not enough for platforms where the TLB page size
is large (e.g., 256M on 44x). So we are renaming the RELOCATABLE used
currently in BookE to DYNAMIC_MEMSTART to reflect the actual method.

The CONFIG_RELOCATABLE for PPC32 (BookE), based on processing of the
dynamic relocations, will be introduced later in the patch series.

This change would allow the use of the old method of RELOCATABLE for
platforms which can afford to enforce the page alignment (platforms with
smaller TLB size).
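
To make the alignment constraint concrete, here is a minimal sketch in C.
The helper names are made up for illustration and are not part of this
patch; the only figure taken from the text above is the 256M TLB page
size on 44x.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if 'load_addr' sits on a boundary of 'tlb_page_size' (a power of two). */
static bool load_addr_is_aligned(uint64_t load_addr, uint64_t tlb_page_size)
{
	assert(tlb_page_size != 0 && (tlb_page_size & (tlb_page_size - 1)) == 0);
	return (load_addr & (tlb_page_size - 1)) == 0;
}

/* Lowest physical address covered by the TLB page containing 'load_addr'. */
static uint64_t tlb_page_base(uint64_t load_addr, uint64_t tlb_page_size)
{
	assert(tlb_page_size != 0 && (tlb_page_size & (tlb_page_size - 1)) == 0);
	return load_addr & ~(tlb_page_size - 1);
}
```

With a 256M page, a kernel loaded at 32M is not page aligned, so the
mapping based method alone cannot place it there; with a smaller page
size the same load address may well be acceptable.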

Changes since v3:

* Introduced a new config, NONSTATIC_KERNEL, to denote a kernel which is
  either RELOCATABLE or DYNAMIC_MEMSTART (Suggested by: Josh Boyer)

Suggested-by: Scott Wood <scottwood@freescale.com>
Tested-by: Scott Wood <scottwood@freescale.com>

Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linux ppc dev <linuxppc-dev@lists.ozlabs.org>
---

 arch/powerpc/Kconfig                          |   60 +++++++++++++++++--------
 arch/powerpc/configs/44x/iss476-smp_defconfig |    3 +
 arch/powerpc/include/asm/kdump.h              |    4 +-
 arch/powerpc/include/asm/page.h               |    4 +-
 arch/powerpc/kernel/crash_dump.c              |    4 +-
 arch/powerpc/kernel/head_44x.S                |    4 +-
 arch/powerpc/kernel/head_fsl_booke.S          |    2 -
 arch/powerpc/kernel/machine_kexec.c           |    2 -
 arch/powerpc/kernel/prom_init.c               |    2 -
 arch/powerpc/mm/44x_mmu.c                     |    2 -
 10 files changed, 56 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 7c93c7e..fac92ce 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -364,7 +364,8 @@ config KEXEC
 config CRASH_DUMP
 	bool "Build a kdump crash kernel"
 	depends on PPC64 || 6xx || FSL_BOOKE
-	select RELOCATABLE if PPC64 || FSL_BOOKE
+	select RELOCATABLE if PPC64
+	select DYNAMIC_MEMSTART if FSL_BOOKE
 	help
 	  Build a kernel suitable for use as a kdump capture kernel.
 	  The same kernel binary can be used as production kernel and dump
@@ -773,6 +774,10 @@ source "drivers/rapidio/Kconfig"
 
 endmenu
 
+config NONSTATIC_KERNEL
+	bool
+	default n
+
 menu "Advanced setup"
 	depends on PPC32
 
@@ -822,23 +827,39 @@ config LOWMEM_CAM_NUM
 	int "Number of CAMs to use to map low memory" if LOWMEM_CAM_NUM_BOOL
 	default 3
 
-config RELOCATABLE
-	bool "Build a relocatable kernel (EXPERIMENTAL)"
+config DYNAMIC_MEMSTART
+	bool "Enable page aligned dynamic load address for kernel (EXPERIMENTAL)"
 	depends on EXPERIMENTAL && ADVANCED_OPTIONS && FLATMEM && (FSL_BOOKE || PPC_47x)
-	help
-	  This builds a kernel image that is capable of running at the
-	  location the kernel is loaded at (some alignment restrictions may
-	  exist).
-
-	  One use is for the kexec on panic case where the recovery kernel
-	  must live at a different physical address than the primary
-	  kernel.
-
-	  Note: If CONFIG_RELOCATABLE=y, then the kernel runs from the address
-	  it has been loaded at and the compile time physical addresses
-	  CONFIG_PHYSICAL_START is ignored.  However CONFIG_PHYSICAL_START
-	  setting can still be useful to bootwrappers that need to know the
-	  load location of the kernel (eg. u-boot/mkimage).
+	select NONSTATIC_KERNEL
+	help
+	  This option enables the kernel to be loaded at any page aligned
+	  physical address. The kernel creates a mapping from KERNELBASE to 
+	  the address where the kernel is loaded. The page size here implies
+	  the TLB page size of the mapping for kernel on the particular platform.
+	  Please refer to the init code for finding the TLB page size.
+
+	  DYNAMIC_MEMSTART is an easy way of implementing pseudo-RELOCATABLE
+	  kernel image, where the only restriction is the page aligned kernel
+	  load address. When this option is enabled, the compile time physical 
+	  address CONFIG_PHYSICAL_START is ignored.
+
+# Mapping based RELOCATABLE is moved to DYNAMIC_MEMSTART
+# config RELOCATABLE
+#	bool "Build a relocatable kernel (EXPERIMENTAL)"
+#	depends on EXPERIMENTAL && ADVANCED_OPTIONS && FLATMEM && (FSL_BOOKE || PPC_47x)
+#	help
+#	  This builds a kernel image that is capable of running at the
+#	  location the kernel is loaded at, without any alignment restrictions.
+#
+#	  One use is for the kexec on panic case where the recovery kernel
+#	  must live at a different physical address than the primary
+#	  kernel.
+#
+#	  Note: If CONFIG_RELOCATABLE=y, then the kernel runs from the address
+#	  it has been loaded at and the compile time physical addresses
+#	  CONFIG_PHYSICAL_START is ignored.  However CONFIG_PHYSICAL_START
+#	  setting can still be useful to bootwrappers that need to know the
+#	  load location of the kernel (eg. u-boot/mkimage).
 
 config PAGE_OFFSET_BOOL
 	bool "Set custom page offset address"
@@ -868,7 +889,7 @@ config KERNEL_START_BOOL
 config KERNEL_START
 	hex "Virtual address of kernel base" if KERNEL_START_BOOL
 	default PAGE_OFFSET if PAGE_OFFSET_BOOL
-	default "0xc2000000" if CRASH_DUMP && !RELOCATABLE
+	default "0xc2000000" if CRASH_DUMP && !NONSTATIC_KERNEL
 	default "0xc0000000"
 
 config PHYSICAL_START_BOOL
@@ -881,7 +902,7 @@ config PHYSICAL_START_BOOL
 
 config PHYSICAL_START
 	hex "Physical address where the kernel is loaded" if PHYSICAL_START_BOOL
-	default "0x02000000" if PPC_STD_MMU && CRASH_DUMP && !RELOCATABLE
+	default "0x02000000" if PPC_STD_MMU && CRASH_DUMP && !NONSTATIC_KERNEL
 	default "0x00000000"
 
 config PHYSICAL_ALIGN
@@ -927,6 +948,7 @@ endmenu
 if PPC64
 config RELOCATABLE
 	bool "Build a relocatable kernel"
+	select NONSTATIC_KERNEL
 	help
 	  This builds a kernel image that is capable of running anywhere
 	  in the RMA (real memory area) at any 16k-aligned base address.
diff --git a/arch/powerpc/configs/44x/iss476-smp_defconfig b/arch/powerpc/configs/44x/iss476-smp_defconfig
index a6eb6ad..ca00cf7 100644
--- a/arch/powerpc/configs/44x/iss476-smp_defconfig
+++ b/arch/powerpc/configs/44x/iss476-smp_defconfig
@@ -25,7 +25,8 @@ CONFIG_CMDLINE_BOOL=y
 CONFIG_CMDLINE="root=/dev/issblk0"
 # CONFIG_PCI is not set
 CONFIG_ADVANCED_OPTIONS=y
-CONFIG_RELOCATABLE=y
+CONFIG_NONSTATIC_KERNEL=y
+CONFIG_DYNAMIC_MEMSTART=y
 CONFIG_NET=y
 CONFIG_PACKET=y
 CONFIG_UNIX=y
diff --git a/arch/powerpc/include/asm/kdump.h b/arch/powerpc/include/asm/kdump.h
index bffd062..c977620 100644
--- a/arch/powerpc/include/asm/kdump.h
+++ b/arch/powerpc/include/asm/kdump.h
@@ -32,11 +32,11 @@
 
 #ifndef __ASSEMBLY__
 
-#if defined(CONFIG_CRASH_DUMP) && !defined(CONFIG_RELOCATABLE)
+#if defined(CONFIG_CRASH_DUMP) && !defined(CONFIG_NONSTATIC_KERNEL)
 extern void reserve_kdump_trampoline(void);
 extern void setup_kdump_trampoline(void);
 #else
-/* !CRASH_DUMP || RELOCATABLE */
+/* !CRASH_DUMP || NONSTATIC_KERNEL */
 static inline void reserve_kdump_trampoline(void) { ; }
 static inline void setup_kdump_trampoline(void) { ; }
 #endif
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 9d7485c..f149967 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -92,7 +92,7 @@ extern unsigned int HPAGE_SHIFT;
 #define PAGE_OFFSET	ASM_CONST(CONFIG_PAGE_OFFSET)
 #define LOAD_OFFSET	ASM_CONST((CONFIG_KERNEL_START-CONFIG_PHYSICAL_START))
 
-#if defined(CONFIG_RELOCATABLE)
+#if defined(CONFIG_NONSTATIC_KERNEL)
 #ifndef __ASSEMBLY__
 
 extern phys_addr_t memstart_addr;
@@ -105,7 +105,7 @@ extern phys_addr_t kernstart_addr;
 
 #ifdef CONFIG_PPC64
 #define MEMORY_START	0UL
-#elif defined(CONFIG_RELOCATABLE)
+#elif defined(CONFIG_NONSTATIC_KERNEL)
 #define MEMORY_START	memstart_addr
 #else
 #define MEMORY_START	(PHYSICAL_START + PAGE_OFFSET - KERNELBASE)
diff --git a/arch/powerpc/kernel/crash_dump.c b/arch/powerpc/kernel/crash_dump.c
index 424afb6..b3ba516 100644
--- a/arch/powerpc/kernel/crash_dump.c
+++ b/arch/powerpc/kernel/crash_dump.c
@@ -28,7 +28,7 @@
 #define DBG(fmt...)
 #endif
 
-#ifndef CONFIG_RELOCATABLE
+#ifndef CONFIG_NONSTATIC_KERNEL
 void __init reserve_kdump_trampoline(void)
 {
 	memblock_reserve(0, KDUMP_RESERVE_LIMIT);
@@ -67,7 +67,7 @@ void __init setup_kdump_trampoline(void)
 
 	DBG(" <- setup_kdump_trampoline()\n");
 }
-#endif /* CONFIG_RELOCATABLE */
+#endif /* CONFIG_NONSTATIC_KERNEL */
 
 static int __init parse_savemaxmem(char *p)
 {
diff --git a/arch/powerpc/kernel/head_44x.S b/arch/powerpc/kernel/head_44x.S
index b725dab..3df7735 100644
--- a/arch/powerpc/kernel/head_44x.S
+++ b/arch/powerpc/kernel/head_44x.S
@@ -86,8 +86,10 @@ _ENTRY(_start);
 
 	bl	early_init
 
-#ifdef CONFIG_RELOCATABLE
+#ifdef CONFIG_DYNAMIC_MEMSTART
 	/*
+	 * Mapping based, page aligned dynamic kernel loading.
+	 *
 	 * r25 will contain RPN/ERPN for the start address of memory
 	 *
 	 * Add the difference between KERNELBASE and PAGE_OFFSET to the
diff --git a/arch/powerpc/kernel/head_fsl_booke.S b/arch/powerpc/kernel/head_fsl_booke.S
index 9f5d210..d5d78c4 100644
--- a/arch/powerpc/kernel/head_fsl_booke.S
+++ b/arch/powerpc/kernel/head_fsl_booke.S
@@ -197,7 +197,7 @@ _ENTRY(__early_start)
 
 	bl	early_init
 
-#ifdef CONFIG_RELOCATABLE
+#ifdef CONFIG_DYNAMIC_MEMSTART
 	lis	r3,kernstart_addr@ha
 	la	r3,kernstart_addr@l(r3)
 #ifdef CONFIG_PHYS_64BIT
diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
index 9ce1672..ec50bb9 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -128,7 +128,7 @@ void __init reserve_crashkernel(void)
 
 	crash_size = resource_size(&crashk_res);
 
-#ifndef CONFIG_RELOCATABLE
+#ifndef CONFIG_NONSTATIC_KERNEL
 	if (crashk_res.start != KDUMP_KERNELBASE)
 		printk("Crash kernel location must be 0x%x\n",
 				KDUMP_KERNELBASE);
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index df47316..6e63b20 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -2844,7 +2844,7 @@ unsigned long __init prom_init(unsigned long r3, unsigned long r4,
 	RELOC(of_platform) = prom_find_machine_type();
 	prom_printf("Detected machine type: %x\n", RELOC(of_platform));
 
-#ifndef CONFIG_RELOCATABLE
+#ifndef CONFIG_NONSTATIC_KERNEL
 	/* Bail if this is a kdump kernel. */
 	if (PHYSICAL_START > 0)
 		prom_panic("Error: You can't boot a kdump kernel from OF!\n");
diff --git a/arch/powerpc/mm/44x_mmu.c b/arch/powerpc/mm/44x_mmu.c
index f60e006..924a258 100644
--- a/arch/powerpc/mm/44x_mmu.c
+++ b/arch/powerpc/mm/44x_mmu.c
@@ -221,7 +221,7 @@ void setup_initial_memory_limit(phys_addr_t first_memblock_base,
 {
 	u64 size;
 
-#ifndef CONFIG_RELOCATABLE
+#ifndef CONFIG_NONSTATIC_KERNEL
 	/* We don't currently support the first MEMBLOCK not mapping 0
 	 * physical on those processors
 	 */


* [PATCH v5 0/7] Kdump support for PPC440x
From: Suzuki K. Poulose @ 2011-12-15  8:56 UTC (permalink / raw)
  To: Josh Boyer, Benjamin Herrenschmidt
  Cc: Scott Wood, Josh Poimboeuf, linux ppc dev

The following series implements:

 * Generic framework for relocatable kernel on PPC32, based on processing 
   the dynamic relocation entries.
 * Relocatable kernel support for 44x
 * Kdump support for 44x. Doesn't support 47x yet, as the kexec 
   support is missing.
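
As a rough illustration of what "processing the dynamic relocation
entries" means, the sketch below applies R_PPC_RELATIVE entries to an
image held in a host buffer. It is not taken from the series (the real
work is done in assembly in reloc_32.S, which is not reproduced here);
the structure and function names are hypothetical, and only the
R_PPC_RELATIVE rule (patched word = addend + load offset) comes from the
PowerPC ELF ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define R_PPC_RELATIVE 22u	/* relocation type number from the PowerPC ELF ABI */

/* Same layout as Elf32_Rela; redeclared so the sketch builds without <elf.h>. */
struct rela32 {
	uint32_t r_offset;	/* link-time address of the word to patch */
	uint32_t r_info;	/* symbol index (high 24 bits), type (low 8 bits) */
	int32_t  r_addend;	/* link-time value to be relocated */
};

/*
 * Apply every R_PPC_RELATIVE entry to 'image', a copy of the kernel image
 * whose first byte was linked at 'link_base'. 'delta' is the difference
 * between the run-time and link-time addresses. Returns the number of
 * entries applied; other types and out-of-range entries are skipped.
 */
static size_t apply_relative_relocs(uint8_t *image, size_t image_size,
				    uint32_t link_base,
				    const struct rela32 *rela, size_t count,
				    uint32_t delta)
{
	size_t applied = 0;

	assert(image != NULL && (rela != NULL || count == 0));

	for (size_t i = 0; i < count; i++) {
		uint32_t off = rela[i].r_offset - link_base;
		uint32_t value;

		if ((rela[i].r_info & 0xffu) != R_PPC_RELATIVE)
			continue;
		if (off > image_size || image_size - off < sizeof(value))
			continue;

		value = (uint32_t)rela[i].r_addend + delta;
		memcpy(image + off, &value, sizeof(value));	/* host byte order */
		applied++;
	}
	return applied;
}
```

The sketch ignores cache maintenance entirely; on real hardware the
patched words have to be flushed and the i-cache invalidated, as noted
in the changelog below.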

Changes from V4:

 (Suggested by : Segher Boessenkool <segher@kernel.crashing.org> )
 * Added 'sync' between dcbst and icbi for the modified instruction in
   relocate().
 * Better comments on register usage in reloc_32.S
 * Better check for relocation types in relocs_check.pl.
 
Changes from V3:

 * Added a new config - NONSTATIC_KERNEL - to group different types of relocatable
   kernel. (Suggested by: Josh Boyer)
 * Added supported ppc relocation types in relocs_check.pl for verifying the
   relocations used in the kernel.

Changes from V2:

 * Renamed old style mapping based RELOCATABLE on BookE to DYNAMIC_MEMSTART.
   Suggested by: Scott Wood
 * Added support for DYNAMIC_MEMSTART on PPC440x
 * Reverted back to RELOCATABLE and RELOCATABLE_PPC32 from RELOCATABLE_PPC32_PIE
   for relocation based on processing dynamic reloc entries for PPC32.
 * Ensure the modified instructions are flushed and the i-cache invalidated at
   the end of relocate(). - Reported by : Josh Poimboeuf

Changes from V1:

 * Split the patch 'Enable CONFIG_RELOCATABLE for PPC44x' to move some
   of the generic bits to a new patch.
 * Renamed RELOCATABLE_PPC32 to RELOCATABLE_PPC32_PIE and provided options to
   retain the old style mapping. (Suggested by: Scott Wood)
 * Added support for avoiding the overlapping of uncompressed kernel
   with boot wrapper for PPC images.

The patches are based on -next tree for ppc.

I have tested these patches on Ebony, Sequoia and Virtex(QEMU Emulated).
I haven't tested the RELOCATABLE bits on PPC_47x yet, as I don't have access
to one. However, RELOCATABLE should work fine there, as we only depend on the
runtime address and the XLAT entry set up by the boot loader. It would be great if
somebody could test these patches on a 47x.

---

Suzuki K. Poulose (7):
      [boot] Change the load address for the wrapper to fit the kernel
      [44x] Enable CRASH_DUMP for 440x
      [44x] Enable CONFIG_RELOCATABLE for PPC44x
      [ppc] Define virtual-physical translations for RELOCATABLE
      [ppc] Process dynamic relocations for kernel
      [44x] Enable DYNAMIC_MEMSTART for 440x
      [booke] Rename mapping based RELOCATABLE to DYNAMIC_MEMSTART for BookE


 arch/powerpc/Kconfig                          |   45 ++++-
 arch/powerpc/Makefile                         |    6 -
 arch/powerpc/boot/wrapper                     |   20 ++
 arch/powerpc/configs/44x/iss476-smp_defconfig |    3 
 arch/powerpc/include/asm/kdump.h              |    4 
 arch/powerpc/include/asm/page.h               |   89 ++++++++++-
 arch/powerpc/kernel/Makefile                  |    2 
 arch/powerpc/kernel/crash_dump.c              |    4 
 arch/powerpc/kernel/head_44x.S                |  105 +++++++++++++
 arch/powerpc/kernel/head_fsl_booke.S          |    2 
 arch/powerpc/kernel/machine_kexec.c           |    2 
 arch/powerpc/kernel/prom_init.c               |    2 
 arch/powerpc/kernel/reloc_32.S                |  208 +++++++++++++++++++++++++
 arch/powerpc/kernel/vmlinux.lds.S             |    8 +
 arch/powerpc/mm/44x_mmu.c                     |    2 
 arch/powerpc/mm/init_32.c                     |    7 +
 arch/powerpc/relocs_check.pl                  |   14 +-
 17 files changed, 495 insertions(+), 28 deletions(-)
 create mode 100644 arch/powerpc/kernel/reloc_32.S

--
Suzuki K. Poulose


* Re: [PATCH 2/2] offb: Add palette hack for qemu "standard vga" framebuffer
From: Benjamin Herrenschmidt @ 2011-12-15  8:09 UTC (permalink / raw)
  To: Andreas Färber; +Cc: linux-fbdev, linuxppc-dev, kvm-ppc
In-Reply-To: <4EE9A6A5.2050600@suse.de>

On Thu, 2011-12-15 at 08:49 +0100, Andreas Färber wrote:
> Am 15.12.2011 00:58, schrieb Benjamin Herrenschmidt:
> > We rename the mach64 hack to "simple" since that's also applicable
> > to anything using VGA-style DAC IO ports (set to 8-bit DAC) and we
> > use it for qemu vga.
> > 
> > Note that this is keyed on a device-tree "compatible" property that
> > is currently only set by an upcoming version of SLOF when using the
> > qemu "pseries" platform. This is on purpose as other qemu ppc platforms
> > using OpenBIOS aren't properly setting the DAC to 8-bit at the time of
> > the writing of this patch.
> > 
> > We can fix OpenBIOS later to do that and add the required property, in
> > which case it will be matched by this change.
> 
> Just let me know what's needed for OpenBIOS.
> Is this just for -vga std as opposed to the default cirrus?

Yes. Cirrus isn't the default on mac99 and on pseries (tho I will
eventually add a SLOF driver for it as well).

For OpenBIOS I was thinking about just sending you a patch :-) But if
you have more time than I do, what is needed is:

 - Set the 8-bit DAC bit in the VBE enable register when initializing
the card (0x20 off the top of my mind but dbl check). Remove your >> 2
in your palette setting.

 - Implement color! so prom_init can set the initial palette (but that's
not strictly necessary).

 - I assume that the VGA device already has a device_type of "display",
can be open()'ed from the client interface and will have the necessary
properties to be used by offb (width, height, linebytes, depth, and
address if fits in 32-bit (if not, ignore it, offb will pick the largest
BAR)). 

 - Stick "qemu,std-vga" into the compatible property of the vga PCI
device.
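
The first item (8-bit DAC, no more ">> 2") boils down to the following.
This is only a sketch in C with a made-up helper name; it is neither
OpenBIOS nor offb code, and the real firmware change would be in Forth.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the three bytes to write to the DAC data port for one palette
 * entry given as 8-bit R, G, B. A 6-bit DAC only takes the top six bits
 * of each component, hence the shift; an 8-bit DAC takes them unchanged.
 */
static void dac_palette_entry(const uint8_t rgb[3], bool dac_is_8bit,
			      uint8_t out[3])
{
	assert(rgb != NULL && out != NULL);

	for (size_t i = 0; i < 3; i++)
		out[i] = dac_is_8bit ? rgb[i] : (uint8_t)(rgb[i] >> 2);
}
```

If the firmware keeps shifting while the DAC is in 8-bit mode, every
colour comes out at a quarter of its intended intensity, which is why
both changes have to go together.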

Cheers,
Ben.


* Re: [PATCH 2/2] offb: Add palette hack for qemu "standard vga" framebuffer
From: Andreas Färber @ 2011-12-15  7:49 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linux-fbdev, linuxppc-dev, kvm-ppc
In-Reply-To: <1323907109.21839.28.camel@pasglop>

Am 15.12.2011 00:58, schrieb Benjamin Herrenschmidt:
> We rename the mach64 hack to "simple" since that's also applicable
> to anything using VGA-style DAC IO ports (set to 8-bit DAC) and we
> use it for qemu vga.
> 
> Note that this is keyed on a device-tree "compatible" property that
> is currently only set by an upcoming version of SLOF when using the
> qemu "pseries" platform. This is on purpose as other qemu ppc platforms
> using OpenBIOS aren't properly setting the DAC to 8-bit at the time of
> the writing of this patch.
> 
> We can fix OpenBIOS later to do that and add the required property, in
> which case it will be matched by this change.

Just let me know what's needed for OpenBIOS.
Is this just for -vga std as opposed to the default cirrus?

Cheers,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg


* Re: [PATCH 3/3] mtd/nand : workaround for Freescale FCM to support large-page Nand chip
From: Li Yang @ 2011-12-15  4:59 UTC (permalink / raw)
  To: Scott Wood
  Cc: Artem.Bityutskiy, dedekind1, linuxppc-dev, LiuShuo, linux-kernel,
	shuo.liu, linux-mtd, akpm, dwmw2
In-Reply-To: <4EE903CE.1010903@freescale.com>

On Thu, Dec 15, 2011 at 4:15 AM, Scott Wood <scottwood@freescale.com> wrote:
> On 12/14/2011 02:41 AM, LiuShuo wrote:
>> On 2011-12-13 10:46, LiuShuo wrote:
>>> On 2011-12-13 05:30, Scott Wood wrote:
>>>> On 12/12/2011 03:19 PM, Artem Bityutskiy wrote:
>>>>> On Mon, 2011-12-12 at 15:15 -0600, Scott Wood wrote:
>>>>>> NAND chips come from the factory with bad blocks marked at a certain
>>>>>> offset into each page.  This offset is normally in the OOB area, but
>>>>>> since we change the layout from "4k data, 128 byte oob" to "2k data,
>>>>>> 64 byte oob, 2k data, 64 byte oob" the marker is no longer in the
>>>>>> oob.  On first use we need to migrate the markers so that they are
>>>>>> still in the oob.
>>>>> Ah, I see, thanks. Are you planning to implement in-kernel migration or
>>>>> use a user-space tool?
>>>> That's the kind of answer I was hoping to get from Shuo. :-)
>>> OK, I try to do this. Wait for a couple of days.
>>>
>>> -LiuShuo
>> I found it's too complex to do the migration in Linux driver.
>>
>> Maybe we can add a uboot command (e.g. nand bbmigrate) to do it, once is
>> enough.
>
> Any reason not to do it automatically on the first U-Boot bad block
> scan, if the flash isn't marked as already migrated?
>
> Further discussion on the details of how to do it in U-Boot should move
> to the U-Boot list.

The limitation of the proposed bad block marker migration is that you
need to make sure the migration is done and only done once.  If it is
done more than once, the factory bad block marker is totally messed
up.  It requires a complex mechanism to automatically guarantee the
migration is only done once, and it still won't be 100% safe.

I would suggest a much easier compromise: we build the BBT based on the
factory bad block markers on first use of the flash, and after that the
factory bad block markers are dropped.  We then rely only on the BBT for
information about bad blocks.  Although by doing so we can't regenerate
the BBT again, as there is a mirror of the BBT I
don't think we have too much risk.
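
A minimal sketch of that idea in C, with hypothetical names and an
in-memory table only (a real BBT and its mirror live in flash, and this
is not the MTD or U-Boot API). It assumes the common convention that a
marker byte of 0xff means the block is good.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_BLOCKS 64u	/* illustrative limit */

struct ex_bbt {
	bool built;			/* set once; later scans are refused */
	uint8_t bad[EX_MAX_BLOCKS];	/* 1 = block is bad */
};

/*
 * Build the table from the factory markers, where markers[i] is the byte
 * read at the factory bad-block offset of block i. Returns false without
 * touching the table if it was already built, since the markers can no
 * longer be trusted after first use.
 */
static bool ex_bbt_build_once(struct ex_bbt *bbt, const uint8_t *markers,
			      size_t nblocks)
{
	assert(bbt != NULL && markers != NULL && nblocks <= EX_MAX_BLOCKS);

	if (bbt->built)
		return false;
	for (size_t i = 0; i < nblocks; i++)
		bbt->bad[i] = (markers[i] != 0xff);
	bbt->built = true;
	return true;
}
```

The "built" flag here plays the role of finding a valid BBT (or its
mirror) in flash: once one exists, the factory marker positions are
never read again.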

- Leo


* Re: [PATCH 1/1] ppc64: fix missing to check all bits of _TIF_USER_WORK_MASK in preempt
From: Benjamin Herrenschmidt @ 2011-12-15  1:04 UTC (permalink / raw)
  To: Tiejun Chen; +Cc: linuxppc-dev
In-Reply-To: <1323681035-19350-1-git-send-email-tiejun.chen@windriver.com>

On Mon, 2011-12-12 at 17:10 +0800, Tiejun Chen wrote:

> -#else /* !CONFIG_PREEMPT */
>  	ld	r3,_MSR(r1)	/* Returning to user mode? */
>  	andi.	r3,r3,MSR_PR
> -	beq	restore		/* if not, just restore regs and return */
> +	bne	test_work_user
>  
> +	clrrdi	r9,r1,THREAD_SHIFT	/* current_thread_info() */
> +	li	r0,_TIF_USER_WORK_MASK

You meant _TIF_KERNEL_WORK_MASK ?

> +#ifdef CONFIG_PREEMPT
> +	ori	r0,r0,_TIF_NEED_RESCHED
> +#endif

No, include that in _TIF_KERNEL_WORK_MASK when CONFIG_PREEMPT, ie,
modify the definition of _TIF_KERNEL_WORK_MASK to include it
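
Folding the flag into the mask could be modelled in C roughly as
follows. The bit positions are hypothetical (the real ones are in
asm/thread_info.h), CONFIG_PREEMPT is modelled as a run-time parameter
so both cases can be exercised, and the sketch assumes the new flag
(_TIF_OUR_NEW_FLAG in the outline further down) is the only other
member of the kernel mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit positions, for illustration only. */
#define EX_TIF_NEED_RESCHED	(1ul << 2)
#define EX_TIF_OUR_NEW_FLAG	(1ul << 3)

/* Mask of flags that need attention on a return to kernel mode. */
static unsigned long ex_kernel_work_mask(bool config_preempt)
{
	unsigned long mask = EX_TIF_OUR_NEW_FLAG;

	if (config_preempt)
		mask |= EX_TIF_NEED_RESCHED;
	/* andi. takes a 16-bit unsigned immediate, so the mask must fit. */
	assert(mask <= 0xfffful);
	return mask;
}

/* True if the kernel return path has work to do before "restore". */
static bool ex_kernel_return_has_work(unsigned long ti_flags,
				      bool config_preempt)
{
	return (ti_flags & ex_kernel_work_mask(config_preempt)) != 0;
}
```

The point is that the return path then needs a single test against one
constant, with no #ifdef in the assembly itself.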

> +	ld	r4,TI_FLAGS(r9)
> +	and.	r0,r4,r0	/* check NEED_RESCHED and maybe _TIF_USER_WORK_MASK */

Comment is wrong after the above

> +	bne	do_kernel_work
> +	b	restore		/* if so, just restore regs and return */
> +
> +test_work_user:
>  	/* Check current_thread_info()->flags */
>  	clrrdi	r9,r1,THREAD_SHIFT
>  	ld	r4,TI_FLAGS(r9)

For better scheduling, couldn't you have preloaded the TIF flags, then
checked for PR ?

Looks like this (also replaces do_work)

IE.	ld	r3,_MSR(r1)
	clrrdi	r9,r1,THREAD_SHIFT
  	ld	r4,TI_FLAGS(r9)
 	andi.	r3,r3,MSR_PR
	bne	test_work_user
test_work_kernel:
 	andi.	r0,r4,_TIF_KERNEL_WORK_MASK
	beq+	restore
#ifdef CONFIG_PREEMPT
	/* Check if we need to preempt */
	andi.	r0,r4,_TIF_NEED_RESCHED
	beq+	2f
	lwz	r8,TI_PREEMPT(r9)
	cmpwi	cr1,r8,0
	ld	r0,SOFTE(r1)
	cmpdi	r0,0
	crandc	eq,cr1*4+eq,eq
	bne	1f
	/* ... copy comment about preempt here ... */
	li	r0,0
	stb	r0,PACASOFTIRQEN(r13)
	stb	r0,PACAHARDIRQEN(r13)
	TRACE_DISABLE_INTS
1:	bl	.preempt_schedule_irq

	/* preempt may have re-enabled and then disabled interrupts,
	 * so we may come here as soft-disabled & hard-enabled, but
	 * we really want hard disabled.
	 */ 
#ifdef CONFIG_PPC_BOOK3E
	wrteei	0
#else
	mfmsr	r10
	rldicl	r10,r10,48,1
	rotldi	r10,r10,16
	mtmsrd	r10,1
#endif
	li	r0,0
	stb	r0,PACAHARDIRQEN(r13)

	/* Re-check if we need to preempt again */
	clrrdi	r9,r1,THREAD_SHIFT
	ld	r4,TI_FLAGS(r9)
	andi.	r0,r4,_TIF_NEED_RESCHED
	bne	1b
2:
#endif /* CONFIG_PREEMPT */	
	andi.	r0,r4,_TIF_OUR_NEW_FLAG
	beq+	restore
	/* ... handle our new flag here ... */
	b	restore

test_work_user:
 	andi.	r0,r4,_TIF_USER_WORK_MASK
	beq+	restore
	/* ... move user_work here ... */

>  	andi.	r0,r4,_TIF_USER_WORK_MASK
> -	bne	do_work
> -#endif
> +	bne	do_user_work
>  
>  restore:
>  BEGIN_FW_FTR_SECTION
> @@ -693,10 +692,8 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>  	b	.ret_from_except_lite		/* loop back and handle more */
>  #endif
>  
> -do_work:
> +do_kernel_work:
>  #ifdef CONFIG_PREEMPT
> -	andi.	r0,r3,MSR_PR	/* Returning to user mode? */
> -	bne	user_work
>  	/* Check that preempt_count() == 0 and interrupts are enabled */
>  	lwz	r8,TI_PREEMPT(r9)
>  	cmpwi	cr1,r8,0
> @@ -738,9 +735,9 @@ do_work:
>  	bne	1b
>  	b	restore
>  
> -user_work:
>  #endif /* CONFIG_PREEMPT */
>  
> +do_user_work:
>  	/* Enable interrupts */
>  #ifdef CONFIG_PPC_BOOK3E
>  	wrteei	1


* Re: [PATCH 3/4] ppc32/kprobe: complete kprobe and migrate exception frame
From: Benjamin Herrenschmidt @ 2011-12-15  0:37 UTC (permalink / raw)
  To: tiejun.chen; +Cc: linuxppc-dev
In-Reply-To: <4EE72AB8.4090502@windriver.com>

On Tue, 2011-12-13 at 18:36 +0800, tiejun.chen wrote:
> >> You need to hook into "resume_kernel" instead.
> >
> 
> I regenerate this with hooking into resume_kernel in below.

 .../...

> I assume it may not necessary to reorganize ret_from_except for *ppc32* .

It might be cleaner but I can do that myself later.

> >>>  do_user_signal:			/* r10 contains MSR_KERNEL here */
> >>>  	ori	r10,r10,MSR_EE
> >>>  	SYNC
> >>> @@ -1202,6 +1204,30 @@ do_user_signal:			/* r10 contains MSR_KERNEL here */
> >>>  	REST_NVGPRS(r1)
> >>>  	b	recheck
> >>>  
> >>> +restore_kprobe:
> >>> +	lwz	r3,GPR1(r1)
> >>> +	subi    r3,r3,INT_FRAME_SIZE; /* Allocate a trampoline exception frame */
> >>> +	mr	r4,r1
> >>> +	bl	copy_exc_stack	/* Copy from the original to the trampoline */
> >>> +
> >>> +	/* Do real stw operation to complete stwu */
> >>> +	mr	r4,r1
> >>> +	addi	r4,r4,INT_FRAME_SIZE	/* Get kprobed entry */
> >>> +	lwz	r5,GPR1(r1)		/* Backup r1 */
> >>> +	stw	r4,GPR1(r1)		/* Now store that safely */
> >> The above confuses me. Shouldn't you do instead something like
> >>
> >> 	lwz	r4,GPR1(r1)
> 
> Now GPR1(r1) is already pointed with new r1 in emulate_step().

Right

> >> 	subi	r3,r4,INT_FRAME_SIZE
> 
> Here we need this, 'mr r4,r1', since r1 holds current exception stack.

Right.

> >> 	li	r5,INT_FRAME_SIZE
> >> 	bl	memcpy
> 
> Then the current exception stack is migrated below the kprobed function stack.
> 
> stack flow:
> 
> --------------------------  -> old r1 when hit 'stwu r1, -AA(r1)' in our
>         ^       ^           kprobed function entry.
>         |       |
>         |       |------------> AA allocated for the kprobed function
>         |       |
>         |       v
> --------|-----------------  -> new r1, also GPR1(r1). It holds the kprobed
>    ^    |                   function stack , -AA(r1).
>    |    |
>    |    |--------------------> INT_FRAME_SIZE for program exception
>    |    |
>    |    v
> ---|---------------------  -> r1 is updated to hold program exception stack.
>    |
>    |
>    |------------------------> migrate the exception stack (r1) below the
>    |                        kprobed after memcpy with INT_FRAME_SIZE.
>    v
> -------------------------  -> reroute this as r1 for program exception stack.

I see so you simply assume that the old r1 value is the current r1 +
INT_FRAME_SIZE, which is probably fair enough.
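
For reference, the frame migration under that assumption can be modelled
on the host like this. It is a sketch only: the stack is a plain byte
array indexed by address, the frame size is a placeholder rather than
the real INT_FRAME_SIZE, and the function name is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_INT_FRAME_SIZE 64u	/* placeholder, not the real INT_FRAME_SIZE */

/*
 * 'r1' is the address of the current exception frame and 'new_r1' is the
 * value that emulate_step() left in the saved GPR1 slot. Returns the
 * address of the migrated frame, which becomes the new frame pointer.
 */
static uint32_t migrate_exc_frame(uint8_t *stack, size_t stack_size,
				  uint32_t r1, uint32_t new_r1)
{
	uint32_t old_r1 = r1 + EX_INT_FRAME_SIZE;	/* the assumption above */
	uint32_t dst = new_r1 - EX_INT_FRAME_SIZE;	/* trampoline frame */

	assert(stack != NULL);
	assert(new_r1 >= EX_INT_FRAME_SIZE);
	assert((size_t)old_r1 <= stack_size);
	assert((size_t)new_r1 + sizeof(old_r1) <= stack_size);

	/* memmove, because in this model the two frames may overlap. */
	memmove(stack + dst, stack + r1, EX_INT_FRAME_SIZE);

	/* Complete the emulated stwu: store the old r1 at the new r1. */
	memcpy(stack + new_r1, &old_r1, sizeof(old_r1));
	return dst;
}
```

The order matters: when the probed function allocates less than
INT_FRAME_SIZE, the store at the new r1 lands inside the original
exception frame, so the frame has to be copied away first.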

BTW. we should probably WARN_ON if emulate_step tries to set the new TIF
flag and sees it already set since that means we'll lose the previous
value.

> >>
> > 
> > Anyway I'll try this if you think memcpy is fine/safe in exception return codes.
> > 
> >> To start with, then you need to know the "old" r1 value which may or may
> >> not be related to your current r1. The emulation code should stash it
> > 
> > If the old r1 is not related to our current r1, it shouldn't be possible to go
> > restore_kprob since we set that new flag only for the current.
> > 
> > If I'm wrong please correct me :)
> 
> If you agree what I say above, and its also avoid any issue introduced with
> orig_gpr3, please check the follow:
> =========
> diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
> index 56212bc..277029d 100644
> --- a/arch/powerpc/kernel/entry_32.S
> +++ b/arch/powerpc/kernel/entry_32.S
> @@ -813,9 +813,40 @@ restore_user:
> 
>  #ifdef CONFIG_PREEMPT
>         b       restore
> +#endif

The above means that if !PREEMPT, a userspace return -will- go into
your new code, while with PREEMPT it won't. This is inconsistent. Now
we should never need that for userspace returns (and indeed you should
double check in emulate step that you are only applying this when
regs->msr & MSR_PR is 0). The above branch should basically become
unconditional.

> -/* N.B. the only way to get here is from the beq following ret_from_except. */
>  resume_kernel:
> +#ifdef CONFIG_KPROBES

Don't make this KPROBES specific. Anything using emulate_step (such as
xmon) might need that too.

> +       /* check current_thread_info, _TIF_EMULATE_STACK_STORE */
> +       rlwinm  r9,r1,0,0,(31-THREAD_SHIFT)
> +       lwz     r0,TI_FLAGS(r9)
> +       andis.  r0,r0,_TIF_EMULATE_STACK_STORE@h
> +       beq+    restore_kernel

So you are introducing a new symbol restore_kernel, you could just
branch to "restore". However, that would mean putting the preempt
case before the kprobe case. But don't we want to do that anyway ?

I don't like keeping that "offsetted" return stack across a preempt.

> +       addi    r9,r1,INT_FRAME_SIZE    /* Get the kprobed function entry */
> +
> +       lwz     r3,GPR1(r1)
> +       subi    r3,r3,INT_FRAME_SIZE    /* dst: Allocate a trampoline exception
> frame */
> +       mr      r4,r1                   /* src:  current exception frame */
> +       li      r5,INT_FRAME_SIZE       /* size: INT_FRAME_SIZE */
> +       mr      r1,r3                   /* Reroute the trampoline frame to r1 */
> +       bl      memcpy                  /* Copy from the original to the
> trampoline */
> +
> +       /* Do real store operation to complete stwu */
> +       lwz     r5,GPR1(r1)
> +       stw     r9,0(r5)
>
> +       /* Clear _TIF_EMULATE_STACK_STORE flag */
> +       rlwinm  r9,r1,0,0,(31-THREAD_SHIFT)
> +       lwz     r0,TI_FLAGS(r9)
> +       rlwinm  r0,r0,0,_TIF_EMULATE_STACK_STORE
> +       stw     r0,TI_FLAGS(r9)

I think this needs to be an atomic operation, another CPU can be trying
to set _NEED_RESCHED at the same time.
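
In C11 terms the difference is between a plain load/modify/store and an
atomic read-modify-write, as in the sketch below. The bit positions and
the helper name are hypothetical; in the kernel's assembly this would
presumably be a lwarx/stwcx. loop rather than a C call.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical bit positions, for illustration only. */
#define EX_TIF_NEED_RESCHED		(1ul << 2)
#define EX_TIF_EMULATE_STACK_STORE	(1ul << 16)

/*
 * Atomically clear one flag bit. A plain load/modify/store could drop
 * a bit set by another CPU between the load and the store.
 */
static void clear_ti_flag(_Atomic unsigned long *flags, unsigned long bit)
{
	assert(bit != 0 && (bit & (bit - 1)) == 0);	/* exactly one bit */
	atomic_fetch_and(flags, ~bit);
}
```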

> +restore_kernel:
> +#endif
> +
> +#ifdef CONFIG_PREEMPT
> +/* N.B. the only way to get here is from the beq following ret_from_except. */
>         /* check current_thread_info->preempt_count */
>         rlwinm  r9,r1,0,0,(31-THREAD_SHIFT)
>         lwz     r0,TI_PREEMPT(r9)
> @@ -844,8 +875,6 @@ resume_kernel:
>          */
>         bl      trace_hardirqs_on
>  #endif
> -#else
> -resume_kernel:
>  #endif /* CONFIG_PREEMPT */
> 
>         /* interrupts are hard-disabled at this point */
> 
> Tiejun

Cheers,
Ben.


* [PATCH 2/2] offb: Add palette hack for qemu "standard vga" framebuffer
From: Benjamin Herrenschmidt @ 2011-12-14 23:58 UTC (permalink / raw)
  To: linux-fbdev; +Cc: linuxppc-dev, kvm-ppc

We rename the mach64 hack to "simple" since that's also applicable
to anything using VGA-style DAC IO ports (set to 8-bit DAC) and we
use it for qemu vga.

Note that this is keyed on a device-tree "compatible" property that
is currently only set by an upcoming version of SLOF when using the
qemu "pseries" platform. This is on purpose as other qemu ppc platforms
using OpenBIOS aren't properly setting the DAC to 8-bit at the time of
the writing of this patch.

We can fix OpenBIOS later to do that and add the required property, in
which case it will be matched by this change.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 drivers/video/offb.c |   19 +++++++++++++++----
 1 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/video/offb.c b/drivers/video/offb.c
index 915acae..da7cf79 100644
--- a/drivers/video/offb.c
+++ b/drivers/video/offb.c
@@ -41,13 +41,14 @@
 /* Supported palette hacks */
 enum {
 	cmap_unknown,
-	cmap_m64,		/* ATI Mach64 */
+	cmap_simple,		/* ATI Mach64 */
 	cmap_r128,		/* ATI Rage128 */
 	cmap_M3A,		/* ATI Rage Mobility M3 Head A */
 	cmap_M3B,		/* ATI Rage Mobility M3 Head B */
 	cmap_radeon,		/* ATI Radeon */
 	cmap_gxt2000,		/* IBM GXT2000 */
 	cmap_avivo,		/* ATI R5xx */
+	cmap_qemu,		/* qemu vga */
 };
 
 struct offb_par {
@@ -138,7 +139,7 @@ static int offb_setcolreg(u_int regno, u_int red, u_int green, u_int blue,
 		return 0;
 
 	switch (par->cmap_type) {
-	case cmap_m64:
+	case cmap_simple:
 		writeb(regno, par->cmap_adr);
 		writeb(red, par->cmap_data);
 		writeb(green, par->cmap_data);
@@ -208,7 +209,7 @@ static int offb_blank(int blank, struct fb_info *info)
 	if (blank)
 		for (i = 0; i < 256; i++) {
 			switch (par->cmap_type) {
-			case cmap_m64:
+			case cmap_simple:
 				writeb(i, par->cmap_adr);
 				for (j = 0; j < 3; j++)
 					writeb(0, par->cmap_data);
@@ -350,7 +351,7 @@ static void offb_init_palette_hacks(struct fb_info *info, struct device_node *dp
 		par->cmap_adr =
 			ioremap(base + 0x7ff000, 0x1000) + 0xcc0;
 		par->cmap_data = par->cmap_adr + 1;
-		par->cmap_type = cmap_m64;
+		par->cmap_type = cmap_simple;
 	} else if (dp && (of_device_is_compatible(dp, "pci1014,b7") ||
 			  of_device_is_compatible(dp, "pci1014,21c"))) {
 		par->cmap_adr = offb_map_reg(dp, 0, 0x6000, 0x1000);
@@ -371,6 +372,16 @@ static void offb_init_palette_hacks(struct fb_info *info, struct device_node *dp
 				par->cmap_type = cmap_avivo;
 		}
 		of_node_put(pciparent);
+	} else if (dp && of_device_is_compatible(dp, "qemu,std-vga")) {
+		const u32 io_of_addr[3] = { 0x01000000, 0x0, 0x0 };
+		u64 io_addr = of_translate_address(dp, io_of_addr);
+		if (io_addr != OF_BAD_ADDR) {
+			par->cmap_adr = ioremap(io_addr + 0x3c8, 2);
+			if (par->cmap_adr) {
+				par->cmap_type = cmap_simple;
+				par->cmap_data = par->cmap_adr + 1;
+			}
+		}
 	}
 	info->fix.visual = (par->cmap_type != cmap_unknown) ?
 		FB_VISUAL_PSEUDOCOLOR : FB_VISUAL_STATIC_PSEUDOCOLOR;
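
A minimal sketch of the VGA-style DAC write sequence that the cmap_simple case relies on: one write to the index port, then three writes (red, green, blue) to the data port one byte above it. The struct and function names below are hypothetical and only simulate the registers in memory; they are not part of offb.c.

```c
#include <stdint.h>

/* Simulated VGA-style DAC (hypothetical, for illustration only): an index
 * register plus a data register that steps through R, G, B for the
 * selected palette entry. */
struct sim_dac {
	uint8_t index;            /* palette entry currently selected */
	uint8_t component;        /* 0 = red, 1 = green, 2 = blue */
	uint8_t palette[256][3];
};

/* Plays the role of writeb(regno, par->cmap_adr). */
static void dac_write_index(struct sim_dac *dac, uint8_t regno)
{
	dac->index = regno;
	dac->component = 0;
}

/* Plays the role of writeb(value, par->cmap_data). */
static void dac_write_data(struct sim_dac *dac, uint8_t value)
{
	dac->palette[dac->index][dac->component] = value;
	if (++dac->component == 3) {
		dac->component = 0;
		dac->index++;     /* auto-advance; wraps at 256 */
	}
}

/* The "simple" palette hack: one index write, then three data writes. */
static void simple_setcolreg(struct sim_dac *dac, uint8_t regno,
			     uint8_t red, uint8_t green, uint8_t blue)
{
	dac_write_index(dac, regno);
	dac_write_data(dac, red);
	dac_write_data(dac, green);
	dac_write_data(dac, blue);
}
```

Because the sequence only needs an index port and a data port, the same case can serve any device that exposes that pair, which is why the qemu branch only has to map two bytes at I/O offset 0x3c8.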

^ permalink raw reply related

* [PATCH 1/2] offb: Fix bug in calculating requested vram size
From: Benjamin Herrenschmidt @ 2011-12-14 23:58 UTC (permalink / raw)
  To: linux-fbdev; +Cc: linuxppc-dev, kvm-ppc

From 448820776363da565f221c020f4ccb3c610faec3 Mon Sep 17 00:00:00 2001
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Wed, 14 Dec 2011 16:52:02 +1100

We used to try to request 8 times more vram than needed, which would
fail if the card's BAR is too small (observed with qemu).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 drivers/video/offb.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

(I'm happy to carry that in the powerpc tree)

diff --git a/drivers/video/offb.c b/drivers/video/offb.c
index cb163a5..915acae 100644
--- a/drivers/video/offb.c
+++ b/drivers/video/offb.c
@@ -381,7 +381,7 @@ static void __init offb_init_fb(const char *name, const char *full_name,
 				int pitch, unsigned long address,
 				int foreign_endian, struct device_node *dp)
 {
-	unsigned long res_size = pitch * height * (depth + 7) / 8;
+	unsigned long res_size = pitch * height;
 	struct offb_par *par = &default_par;
 	unsigned long res_start = address;
 	struct fb_fix_screeninfo *fix;
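
A small sketch comparing the removed and the new size calculation, assuming pitch is the scanline length in bytes (so it already accounts for the pixel depth). The function names are hypothetical, and unsigned long is used throughout to keep the arithmetic simple; the driver itself passes int arguments.

```c
/* Removed formula: scales by the pixel depth a second time, even though
 * pitch (assumed to be bytes per scanline) already includes it. */
static unsigned long old_res_size(unsigned long pitch, unsigned long height,
				  unsigned long depth)
{
	return pitch * height * (depth + 7) / 8;
}

/* New formula: bytes per scanline times number of scanlines. */
static unsigned long new_res_size(unsigned long pitch, unsigned long height)
{
	return pitch * height;
}
```

For a 1024x768 mode at 8 bits per pixel with a 1024-byte pitch, old_res_size() asks for 1474560 bytes while new_res_size() asks for 786432, which is the size of the visible framebuffer.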

^ permalink raw reply related

* [PATCH] powerpc: Fix old bug in prom_init setting of the color
From: Benjamin Herrenschmidt @ 2011-12-14 23:55 UTC (permalink / raw)
  To: linuxppc-dev

We have an array of 16 entries and a loop of 32 iterations... oops.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/kernel/prom_init.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index cc58486..e3d0ecd 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -2079,7 +2079,7 @@ static void __init prom_check_displays(void)
 		/* Setup a usable color table when the appropriate
 		 * method is available. Should update this to set-colors */
 		clut = RELOC(default_colors);
-		for (i = 0; i < 32; i++, clut += 3)
+		for (i = 0; i < 16; i++, clut += 3)
 			if (prom_set_color(ih, i, clut[0], clut[1],
 					   clut[2]) != 0)
 				break;
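
A minimal sketch of the overrun, using a hypothetical 16-entry table with three bytes (R, G, B) per entry. It also shows one way to derive the loop bound from the table size instead of hard-coding it; the names are illustrative and not taken from prom_init.c.

```c
#include <stddef.h>

#define SKETCH_NUM_COLORS 16

/* Hypothetical colour table: 16 entries of 3 bytes each. */
static const unsigned char sketch_colors[SKETCH_NUM_COLORS * 3] = { 0 };

/* Returns 1 if a loop of the given length, stepping 3 bytes per
 * iteration, would read past the end of the table. */
static int loop_overruns(size_t iterations)
{
	return iterations * 3 > sizeof(sketch_colors);
}

/* Walks the table with a bound derived from its size and returns the
 * number of entries visited. */
static size_t count_entries_visited(void)
{
	const unsigned char *clut = sketch_colors;
	size_t n = sizeof(sketch_colors) / 3;
	size_t visited = 0;
	size_t i;

	for (i = 0; i < n; i++, clut += 3) {
		(void)clut[0];    /* red */
		(void)clut[1];    /* green */
		(void)clut[2];    /* blue */
		visited++;
	}
	return visited;
}
```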

^ permalink raw reply related

* Re: [PATCH 3/3] mtd/nand : workaround for Freescale FCM to support large-page Nand chip
From: Scott Wood @ 2011-12-14 20:53 UTC (permalink / raw)
  To: LiuShuo
  Cc: Artem.Bityutskiy, dedekind1, linuxppc-dev, linux-kernel, shuo.liu,
	linux-mtd, akpm, dwmw2
In-Reply-To: <4EE81ADD.6090104@freescale.com>

On 12/13/2011 09:41 PM, LiuShuo wrote:
> On 2011-12-13 05:09, Artem Bityutskiy wrote:
>> On Tue, 2011-12-06 at 18:09 -0600, Scott Wood wrote:
>>> On 12/03/2011 10:31 PM, shuo.liu@freescale.com wrote:
>>>> From: Liu Shuo<shuo.liu@freescale.com>
>>>>
>>>> Freescale FCM controller has a 2K size limitation of buffer RAM. In
>>>> order to support the Nand flash chip whose page size is larger than
>>>> 2K bytes, we read/write 2k data repeatedly by issuing
>>>> FIR_OP_RB/FIR_OP_WB and save them to a large buffer.
>>>>
>>>> Signed-off-by: Liu Shuo<shuo.liu@freescale.com>
>>>> ---
>>>> v3:
>>>>      -remove page_size of struct fsl_elbc_mtd.
>>>>      -do a oob write by NAND_CMD_RNDIN.
>>>>
>>>>   drivers/mtd/nand/fsl_elbc_nand.c |  243 ++++++++++++++++++++++++++++++++++----
>>>>   1 files changed, 218 insertions(+), 25 deletions(-)
>>> What is the plan for bad block marker migration.
> I think we can use a special bbt pattern to indicate whether migration
> has been done.
> (we needn't define another marker)
>
> Do the migration in our chip->scan_bbt as follows:
>
> /*
>  * This pattern indicates that the bad block information has been migrated;
>  * if it isn't found, we do the migration.
>  */
> static u8 migrated_bbt_pattern[] = {'M', 'b', 'b', 't', '0' };
>
> static int fsl_elbc_bbt(struct mtd_info *mtd)
> {
> 	if (!check_migrated_bbt_pattern())
> 		bad_block_info_migtrate();
>
> 	return nand_default_bbt(mtd); /* default function in nand_bbt.c */
> }

Hmm.  This is OK as long as the bad block table never gets erased (which
could happen if a user wants it reconstructed, such as if buggy software
makes a mess of it on a developer's board).  If it gets erased, we'll
end up migrating again -- and the place that factory bad block markers
would have been in is now data, so all blocks that have been written to
will show up as bad unless they happen to have 0xff at the right place.

How about a marker that is compatible with the bbt, so the same block
can be used in production (where scrubbing the bbt should never happen),
but that does not have to imply that the block is a bbt (so a developer
that might want to erase the bbt can set the mark elsewhere, preferably
just before the bbt)?

Or have two versions of the marker, one that is also a bbt marker and
one that is not.

When scanning the bbt, the driver would look for one of these markers
from the end of the chip backward.  If not found, it concludes the chip
is unmigrated.  In U-Boot, this would trigger a migration (or a message
to run a migration command).  In Linux (and U-Boot if migration is a
separate command that has not been run) an unmigrated flash could be
read-only, with the possible exception of raw accesses if needed to
support an external migration tool.
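
A rough sketch of the scan described above, assuming each block exposes a short signature and that two hypothetical marker values exist (one that doubles as a bbt marker, one that does not). The marker bytes, the scan depth and all names are assumptions made for illustration; nothing here comes from the eLBC driver or nand_bbt.c.

```c
#include <stddef.h>
#include <string.h>

#define SK_MARK_LEN 4

enum sk_flash_state { SK_UNMIGRATED, SK_MIGRATED };

/* Hypothetical marker values; the thread does not define real ones. */
static const unsigned char SK_MARK_WITH_BBT[SK_MARK_LEN]    = { 'M', 'b', 'b', 't' };
static const unsigned char SK_MARK_WITHOUT_BBT[SK_MARK_LEN] = { 'M', 'i', 'g', 'r' };

/* sigs holds nblocks signatures of SK_MARK_LEN bytes each. Scan from the
 * last block backward, looking at no more than max_scan blocks. */
static enum sk_flash_state sk_scan_for_marker(const unsigned char *sigs,
					      size_t nblocks, size_t max_scan)
{
	size_t blk = nblocks;
	size_t scanned = 0;

	while (blk > 0 && scanned < max_scan) {
		const unsigned char *sig;

		blk--;
		scanned++;
		sig = sigs + blk * SK_MARK_LEN;
		if (memcmp(sig, SK_MARK_WITH_BBT, SK_MARK_LEN) == 0 ||
		    memcmp(sig, SK_MARK_WITHOUT_BBT, SK_MARK_LEN) == 0)
			return SK_MIGRATED;
	}
	return SK_UNMIGRATED;   /* no marker found: treat as unmigrated */
}

/* An unmigrated flash stays read-only for normal (non-raw) accesses. */
static int sk_writes_allowed(enum sk_flash_state state)
{
	return state == SK_MIGRATED;
}
```

Accepting either marker lets production boards reuse the bbt block, while a developer board can carry the standalone marker and still have its bbt erased without triggering a second migration.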

-Scott

^ permalink raw reply

* Re: [PATCH 3/3] mtd/nand : workaround for Freescale FCM to support large-page Nand chip
From: Scott Wood @ 2011-12-14 20:15 UTC (permalink / raw)
  To: LiuShuo
  Cc: Artem.Bityutskiy, dedekind1, linuxppc-dev, linux-kernel, shuo.liu,
	linux-mtd, akpm, dwmw2
In-Reply-To: <4EE8612C.9050104@freescale.com>

On 12/14/2011 02:41 AM, LiuShuo wrote:
> On 2011-12-13 10:46, LiuShuo wrote:
>> On 2011-12-13 05:30, Scott Wood wrote:
>>> On 12/12/2011 03:19 PM, Artem Bityutskiy wrote:
>>>> On Mon, 2011-12-12 at 15:15 -0600, Scott Wood wrote:
>>>>> NAND chips come from the factory with bad blocks marked at a certain
>>>>> offset into each page.  This offset is normally in the OOB area, but
>>>>> since we change the layout from "4k data, 128 byte oob" to "2k
>>>>> data, 64 byte oob, 2k data, 64 byte oob" the marker is no longer in
>>>>> the oob.  On first use we need to migrate the markers so that they
>>>>> are still in the oob.
>>>> Ah, I see, thanks. Are you planning to implement in-kernel migration or
>>>> use a user-space tool?
>>> That's the kind of answer I was hoping to get from Shuo. :-)
>> OK, I try to do this. Wait for a couple of days.
>>
>> -LiuShuo
> I found it's too complex to do the migration in the Linux driver.
>
> Maybe we can add a U-Boot command (e.g. nand bbmigrate) to do it; once is
> enough.

Any reason not to do it automatically on the first U-Boot bad block
scan, if the flash isn't marked as already migrated?

Further discussion on the details of how to do it in U-Boot should move
to the U-Boot list.

> And let the user ensure it has been completed before Linux uses the NAND
> flash chip.

I don't want to trust the user here.  It's too easy to skip it, and
things will appear to work, but have subtle problems.

> Even if we don't do the migration, the bad block can also be marked as bad
> through wear. So, do we really need to take much time to implement it?
> (code looks too complex.)

It is not acceptable to ignore factory bad block markers just because
some methods of using the flash may eventually detect an error (possibly
after data is lost -- no guarantee that the badness is ECC-correctable)
and mark the block bad again.
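
To make the layout problem concrete, here is a small sketch that maps a raw byte offset within a page onto the "2k data, 64 byte oob, 2k data, 64 byte oob" layout quoted above. It assumes the factory marker sits at the first byte of the chip's native OOB area (raw offset 4096 on a 4k+128 page); that position and all names are assumptions for illustration.

```c
#include <stddef.h>

/* Where a raw page offset lands under a chunked data/oob layout. */
struct sk_raw_pos {
	size_t chunk;     /* which data+oob chunk */
	size_t offset;    /* offset within that chunk's data or oob */
	int in_oob;       /* 1 if the byte is in the chunk's oob area */
};

static struct sk_raw_pos sk_locate(size_t raw_offset, size_t chunk_data,
				   size_t chunk_oob)
{
	struct sk_raw_pos pos;
	size_t stride = chunk_data + chunk_oob;

	pos.chunk = raw_offset / stride;
	pos.offset = raw_offset % stride;
	pos.in_oob = pos.offset >= chunk_data;
	if (pos.in_oob)
		pos.offset -= chunk_data;
	return pos;
}
```

With sk_locate(4096, 2048, 64) the marker byte lands in chunk 1 at data offset 1984, so it is ordinary data as far as the split layout is concerned; with the native layout, sk_locate(4096, 4096, 128) reports OOB offset 0.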

If you don't feel up to the task, I can look at it, but won't have time
until January.

-Scott

^ permalink raw reply

* RE: ARM + TI DSP device tree representation
From: Jeff Brower @ 2011-12-14 17:52 UTC (permalink / raw)
  To: David Laight; +Cc: linuxppc-dev
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6D8AF13@saturn3.aculab.com>

David-

>> I am trying to locate the device tree for OMAP platform which
>> has ARM and TI DSP core.
>>
>> The background is that we are trying to understand how such
>> SOCs with dissimilar cores are represented in linux.
>
> Most likely linux runs on the ARM, and the DSP processor
> is loaded with specific code to perform certain functions.
>
> As such the DSP is probably a 'peripheral' not a cpu.

I think that's correct.  However, some companies are developing OpenMP and OpenCL acceleration that lets the DSP cores
appear as additional CPU cores, with the idea being to offload compute-intensive operations from the ARM.

This works with OMAP because the DSP cores are CPUs with SIMD and VLIW, as opposed to other SoCs that have limited
"media processor" or GPU type cores that can't run arbitrary C/C++ code.

-Jeff

^ permalink raw reply

* Re: linux-next: build warnings after merge of the final tree
From: H. Peter Anvin @ 2011-12-14 16:47 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Stephen Rothwell, Peter Zijlstra, linux-kernel, linux-next,
	Paul Mackerras, Thomas Gleixner, linuxppc-dev, Ingo Molnar
In-Reply-To: <1323880322.23971.33.camel@gandalf.stny.rr.com>

On 12/14/2011 08:32 AM, Steven Rostedt wrote:
> On Wed, 2011-12-14 at 18:15 +1100, Stephen Rothwell wrote:
> 
>> Maybe caused by commit d5e553d6e0a4 ("trace: Include <asm/asm-offsets.h>
>> in trace_syscalls.c") from the tip tree.  These warnings may have been
>> here for a while as it is hard to catch the new ones among the flood.
>>
> 
> hpa,
> 
> Was this change only needed for x86? If so, could you put that into
> asm/ftrace.h instead?
> 

Yes (on both counts).  It was part of the syscall changes; I'll move
the include.

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

^ permalink raw reply

* Re: linux-next: build warnings after merge of the final tree
From: Steven Rostedt @ 2011-12-14 16:32 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Peter Zijlstra, linux-kernel, linux-next, Paul Mackerras,
	H. Peter Anvin, Thomas Gleixner, linuxppc-dev, Ingo Molnar
In-Reply-To: <20111214181545.6e13bc954cb7ddce9086e861@canb.auug.org.au>

On Wed, 2011-12-14 at 18:15 +1100, Stephen Rothwell wrote:

> Maybe caused by commit d5e553d6e0a4 ("trace: Include <asm/asm-offsets.h>
> in trace_syscalls.c") from the tip tree.  These warnings may have been
> here for a while as it is hard to catch the new ones among the flood.
> 

hpa,

Was this change only needed for x86? If so, could you put that into
asm/ftrace.h instead?

-- Steve

^ permalink raw reply

* RE: ARM + TI DSP device tree representation
From: David Laight @ 2011-12-14 16:27 UTC (permalink / raw)
  To: Aggrwal Poonam-B10812, linuxppc-dev; +Cc: Gala Kumar-B11780
In-Reply-To: <ACB6D0C0104CFF42A45A5D82A0DD4F3D014365@039-SN2MPN1-013.039d.mgd.msft.net>

> I am trying to locate the device tree for OMAP platform which
> has ARM and TI DSP core.
>
> The background is that we are trying to understand how such
> SOCs with dissimilar cores are represented in linux.

Most likely linux runs on the ARM, and the DSP processor
is loaded with specific code to perform certain functions.

As such the DSP is probably a 'peripheral' not a cpu.

	David

^ permalink raw reply

* Re: ARM + TI DSP device tree representation
From: Jeff Brower @ 2011-12-14 15:51 UTC (permalink / raw)
  To: Aggrwal Poonam-B10812; +Cc: linuxppc-dev
In-Reply-To: <ACB6D0C0104CFF42A45A5D82A0DD4F3D014365@039-SN2MPN1-013.039d.mgd.msft.net>

Poonam-

> I am trying to locate the device tree for OMAP platform which
> has ARM and TI DSP core.
>
> The background is that we are trying to understand how such
> SOCs with dissimilar cores are represented in linux.
>
> Can somebody point me to the correct file.

Neither of the cores is PowerPC -- you might have better luck looking under an Android Linux distribution.
Android is typically used on tablets and phones that contain OMAP.

-Jeff

^ permalink raw reply

* ARM + TI DSP device tree representation
From: Aggrwal Poonam-B10812 @ 2011-12-14 12:33 UTC (permalink / raw)
  To: linuxppc-dev@lists.ozlabs.org list; +Cc: Gala Kumar-B11780

Hello All

I am trying to locate the device tree for OMAP platform which has ARM and TI DSP core.

The background is that we are trying to understand how such SOCs with dissimilar cores are represented in linux.

Can somebody point me to the correct file.


Many Thanks
Poonam

^ permalink raw reply

* RE: [PATCH v2] Integrated Flash Controller support
From: Li Yang-R58472 @ 2011-12-14 11:13 UTC (permalink / raw)
  To: dedekind1@gmail.com, Kumar Gala
  Cc: Wood Scott-B07421, linuxppc-dev@lists.ozlabs.org list,
	Liu Shuo-B35362, linux-kernel@vger.kernel.org Kernel,
	linux-mtd@lists.infradead.org, Andrew Morton, David Woodhouse
In-Reply-To: <1322643080.24797.413.camel@sauron.fi.intel.com>

>-----Original Message-----
>From: Artem Bityutskiy [mailto:dedekind1@gmail.com]
>Sent: Wednesday, November 30, 2011 4:51 PM
>To: Kumar Gala
>Cc: Wood Scott-B07421; Li Yang-R58472; Liu Shuo-B35362;
>linux-kernel@vger.kernel.org Kernel; linux-mtd@lists.infradead.org;
>Andrew Morton; David Woodhouse; linuxppc-dev@lists.ozlabs.org list
>Subject: Re: [PATCH v2] Integrated Flash Controller support
>
>On Tue, 2011-11-29 at 19:47 -0600, Kumar Gala wrote:
>> As Scott said, I was more asking about the 2nd patch in the sequence
>> which did touch MTD.  Since that one is dependent on this patch,
>> wondering how we wanted to handle them.
>
>I do not have time to review it, but it looks OK, so I'd suggest to
>merge it vie the same tree as the rest of the patches.

Hi Artem,

So what is your suggestion on this patch?  Can we regard your previous email as an ACK and merge it through the powerpc tree?  Or do you prefer to merge them through the MTD tree with Kumar's ACK instead?

- Leo

^ permalink raw reply

* Re: [PATCH 3/3] mtd/nand : workaround for Freescale FCM to support large-page Nand chip
From: LiuShuo @ 2011-12-14  8:41 UTC (permalink / raw)
  To: Scott Wood, dedekind1
  Cc: Artem.Bityutskiy, linuxppc-dev, linux-kernel, shuo.liu, linux-mtd,
	akpm, dwmw2
In-Reply-To: <4EE6BC9B.4000602@freescale.com>

On 2011-12-13 10:46, LiuShuo wrote:
> On 2011-12-13 05:30, Scott Wood wrote:
>> On 12/12/2011 03:19 PM, Artem Bityutskiy wrote:
>>> On Mon, 2011-12-12 at 15:15 -0600, Scott Wood wrote:
>>>> NAND chips come from the factory with bad blocks marked at a certain
>>>> offset into each page.  This offset is normally in the OOB area, but
>>>> since we change the layout from "4k data, 128 byte oob" to "2k
>>>> data, 64 byte oob, 2k data, 64 byte oob" the marker is no longer in
>>>> the oob.  On first use we need to migrate the markers so that they
>>>> are still in the oob.
>>> Ah, I see, thanks. Are you planning to implement in-kernel migration or
>>> use a user-space tool?
>> That's the kind of answer I was hoping to get from Shuo. :-)
> OK, I try to do this. Wait for a couple of days.
>
> -LiuShuo
I found it's too complex to do the migration in the Linux driver.

Maybe we can add a U-Boot command (e.g. nand bbmigrate) to do it; once is
enough.
And let the user ensure it has been completed before Linux uses the NAND
flash chip.

Even if we don't do the migration, the bad block can also be marked as bad
through wear. So, do we really need to take much time to implement it?
(code looks too complex.)

-LiuShuo

>> Most likely is a firmware-based tool, but I'd like there to be some way
>> for the tool to mark that this has happened, so that the Linux driver
>> can refuse to do non-raw accesses to a chip that isn't marked as having
>> been migrated (or at least yell loudly in the log).
>>
>> Speaking of raw accesses, these are currently broken in the eLBC
>> driver... we need some way for the generic layer to tell us what kind of
>> access it is before the transaction starts, not once it wants to read
>> out the buffer (unless we add more hacks to delay the start of a read
>> transaction until first buffer access...).  We'd be better off with a
>> high-level "read page/write page" function that does the whole thing
>> (not just buffer access, but command issuance as well).
>>
>> -Scott
>

^ permalink raw reply

* linux-next: build warnings after merge of the final tree
From: Stephen Rothwell @ 2011-12-14  7:15 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Peter Zijlstra
  Cc: linux-next, Paul Mackerras, linux-kernel, linuxppc-dev

Hi all,

After merging the final tree, today's linux-next build (powerpc allyesconfig)
produced these warnings:

In file included from arch/powerpc/include/asm/asm-offsets.h:1:0,
                 from kernel/trace/trace_syscalls.c:9:
include/generated/asm-offsets.h:15:0: warning: "NMI_MASK" redefined [enabled by default]
include/linux/hardirq.h:56:0: note: this is the location of the previous definition
include/generated/asm-offsets.h:128:0: warning: "CLONE_VM" redefined [enabled by default]
include/linux/sched.h:8:0: note: this is the location of the previous definition
include/generated/asm-offsets.h:129:0: warning: "CLONE_UNTRACED" redefined [enabled by default]
include/linux/sched.h:22:0: note: this is the location of the previous definition
include/generated/asm-offsets.h:165:0: warning: "NSEC_PER_SEC" redefined [enabled by default]
include/linux/time.h:40:0: note: this is the location of the previous definition
include/generated/asm-offsets.h:173:0: warning: "PGD_TABLE_SIZE" redefined [enabled by default]
arch/powerpc/include/asm/pgtable-ppc64-4k.h:17:0: note: this is the location of the previous definition

Maybe caused by commit d5e553d6e0a4 ("trace: Include <asm/asm-offsets.h>
in trace_syscalls.c") from the tip tree.  These warnings may have been
here for a while as it is hard to catch the new ones among the flood.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

^ permalink raw reply

* [PATCH] Only use initrd_end as the limit for alloc_bottom if it's inside the RMO.
From: Tony Breeds @ 2011-12-14  3:54 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, LinuxPPC-dev; +Cc: Paul Mackerras, Anton Blanchard

From: Paul Mackerras <paulus@samba.org>

As kernels and initrds get bigger, boot loaders and possibly
kexec-tools will need to place the initrd outside the RMO.  When this
happens we end up with no lowmem and the boot doesn't get very far.

Only use initrd_end as the limit for alloc_bottom if it's inside the
RMO.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
---
 arch/powerpc/kernel/prom_init.c |   17 +++++++++--------
 1 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index cc58486..940dc0c 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -1224,14 +1224,6 @@ static void __init prom_init_mem(void)
 
 	RELOC(alloc_bottom) = PAGE_ALIGN((unsigned long)&RELOC(_end) + 0x4000);
 
-	/* Check if we have an initrd after the kernel, if we do move our bottom
-	 * point to after it
-	 */
-	if (RELOC(prom_initrd_start)) {
-		if (RELOC(prom_initrd_end) > RELOC(alloc_bottom))
-			RELOC(alloc_bottom) = PAGE_ALIGN(RELOC(prom_initrd_end));
-	}
-
 	/*
 	 * If prom_memory_limit is set we reduce the upper limits *except* for
 	 * alloc_top_high. This must be the real top of RAM so we can put
@@ -1269,6 +1261,15 @@ static void __init prom_init_mem(void)
 	RELOC(alloc_top) = RELOC(rmo_top);
 	RELOC(alloc_top_high) = RELOC(ram_top);
 
+	/*
+	 * Check if we have an initrd after the kernel but still inside
+	 * the RMO.  If we do move our bottom point to after it.
+	 */
+	if (RELOC(prom_initrd_start) &&
+	    RELOC(prom_initrd_start) < RELOC(rmo_top) &&
+	    RELOC(prom_initrd_end) > RELOC(alloc_bottom))
+		RELOC(alloc_bottom) = PAGE_ALIGN(RELOC(prom_initrd_end));
+
 	prom_printf("memory layout at init:\n");
 	prom_printf("  memory_limit : %x (16 MB aligned)\n", RELOC(prom_memory_limit));
 	prom_printf("  alloc_bottom : %x\n", RELOC(alloc_bottom));
-- 
1.7.6.4
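
A minimal sketch of the new condition as a pure function, so the three cases (no initrd, initrd inside the RMO, initrd outside the RMO) can be checked in isolation. The 4 KB page size, the macro names and the function name are assumptions for illustration and do not appear in prom_init.c.

```c
#define SK_PAGE_SIZE 4096UL
#define SK_PAGE_ALIGN(x) (((x) + SK_PAGE_SIZE - 1) & ~(SK_PAGE_SIZE - 1))

/* Returns the alloc_bottom to use: it moves past the initrd only when an
 * initrd exists, starts below rmo_top, and ends above the current bottom. */
static unsigned long sk_pick_alloc_bottom(unsigned long alloc_bottom,
					  unsigned long initrd_start,
					  unsigned long initrd_end,
					  unsigned long rmo_top)
{
	if (initrd_start &&
	    initrd_start < rmo_top &&
	    initrd_end > alloc_bottom)
		return SK_PAGE_ALIGN(initrd_end);
	return alloc_bottom;
}
```

An initrd loaded above rmo_top leaves alloc_bottom where it was, so the low memory between the kernel and the top of the RMO stays available for early allocations.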

^ permalink raw reply related

* Re: [PATCH 3/3] mtd/nand : workaround for Freescale FCM to support large-page Nand chip
From: LiuShuo @ 2011-12-14  3:41 UTC (permalink / raw)
  To: dedekind1, Scott Wood
  Cc: Artem.Bityutskiy, linuxppc-dev, linux-kernel, shuo.liu, linux-mtd,
	akpm, dwmw2
In-Reply-To: <1323724195.2297.11.camel@koala>

On 2011-12-13 05:09, Artem Bityutskiy wrote:
> On Tue, 2011-12-06 at 18:09 -0600, Scott Wood wrote:
>> On 12/03/2011 10:31 PM, shuo.liu@freescale.com wrote:
>>> From: Liu Shuo<shuo.liu@freescale.com>
>>>
>>> Freescale FCM controller has a 2K size limitation of buffer RAM. In order
>>> to support the Nand flash chip whose page size is larger than 2K bytes,
>>> we read/write 2k data repeatedly by issuing FIR_OP_RB/FIR_OP_WB and save
>>> them to a large buffer.
>>>
>>> Signed-off-by: Liu Shuo<shuo.liu@freescale.com>
>>> ---
>>> v3:
>>>      -remove page_size of struct fsl_elbc_mtd.
>>>      -do a oob write by NAND_CMD_RNDIN.
>>>
>>>   drivers/mtd/nand/fsl_elbc_nand.c |  243 ++++++++++++++++++++++++++++++++++----
>>>   1 files changed, 218 insertions(+), 25 deletions(-)
>> What is the plan for bad block marker migration.
I think we can use a special bbt pattern to indicate whether migration
has been done.
(we needn't define another marker)

Do the migration in our chip->scan_bbt as follows:

/*
 * This pattern indicates that the bad block information has been migrated;
 * if it isn't found, we do the migration.
 */
static u8 migrated_bbt_pattern[] = {'M', 'b', 'b', 't', '0' };

static int fsl_elbc_bbt(struct mtd_info *mtd)
{
	if (!check_migrated_bbt_pattern())
		bad_block_info_migtrate();

	return nand_default_bbt(mtd); /* default function in nand_bbt.c */
}

- LiuShuo

^ permalink raw reply

* RE: [PATCH] mmc: sdhci-pltfm: Added sdhci-adjust-timeout quirk
From: Xie Xiaobo-R63061 @ 2011-12-14  3:27 UTC (permalink / raw)
  To: Huang Changming-R66093, linuxppc-dev@lists.ozlabs.org
  Cc: avorontsov@ru.mvista.com, linux-mmc@vger.kernel.org
In-Reply-To: <8A2FC72B45BB5A4C9F801431E06AE48F11644C44@039-SN1MPN1-006.039d.mgd.msft.net>

Hi Changming,

OK, you can merge my patch into your patches.

Hi all,
Please ignore this patch. Changming will send the similar patch.

BRs
Xie Xiaobo

-----Original Message-----
From: Huang Changming-R66093
Sent: December 13, 2011 16:00
To: Xie Xiaobo-R63061; linuxppc-dev@lists.ozlabs.org
Cc: avorontsov@ru.mvista.com; linux-mmc@vger.kernel.org; Xie Xiaobo-R63061
Subject: RE: [PATCH] mmc: sdhci-pltfm: Added sdhci-adjust-timeout quirk

Xiaobo, I have one other similar patch, but the property is 'sdhci,adjust-timeout'.
Maybe I can repost it with add your signed-off-by?

> -----Original Message-----
> From: linuxppc-dev-bounces+r66093=freescale.com@lists.ozlabs.org
> [mailto:linuxppc-dev-bounces+r66093=freescale.com@lists.ozlabs.org] On
> Behalf Of Xie Xiaobo
> Sent: Monday, December 05, 2011 4:55 PM
> To: linuxppc-dev@lists.ozlabs.org
> Cc: avorontsov@ru.mvista.com; linux-mmc@vger.kernel.org; Xie Xiaobo-R63061
> Subject: [PATCH] mmc: sdhci-pltfm: Added sdhci-adjust-timeout quirk
>
> Some controller provides an incorrect timeout value for transfers, So
> it need the quirk to adjust timeout value to 0xE.
> E.g. eSDHC of MPC8536, P1010, and P2020.
>
> Signed-off-by: Xie Xiaobo <X.Xie@freescale.com>
> ---
>  drivers/mmc/host/sdhci-pltfm.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/mmc/host/sdhci-pltfm.c b/drivers/mmc/host/sdhci-pltfm.c
> index a9e12ea..b5d6b3f 100644
> --- a/drivers/mmc/host/sdhci-pltfm.c
> +++ b/drivers/mmc/host/sdhci-pltfm.c
> @@ -2,7 +2,7 @@
>   * sdhci-pltfm.c Support for SDHCI platform devices
>   * Copyright (c) 2009 Intel Corporation
>   *
> - * Copyright (c) 2007 Freescale Semiconductor, Inc.
> + * Copyright (c) 2007, 2011 Freescale Semiconductor, Inc.
>   * Copyright (c) 2009 MontaVista Software, Inc.
>   *
>   * Authors: Xiaobo Xie <X.Xie@freescale.com>
> @@ -68,6 +68,9 @@ void sdhci_get_of_property(struct platform_device *pdev)
>  		if (of_get_property(np, "sdhci,1-bit-only", NULL))
>  			host->quirks |= SDHCI_QUIRK_FORCE_1_BIT_DATA;
>
> +		if (of_get_property(np, "sdhci,sdhci-adjust-timeout", NULL))
> +			host->quirks |= SDHCI_QUIRK_BROKEN_TIMEOUT_VAL;
> +
>  		if (sdhci_of_wp_inverted(np))
>  			host->quirks |= SDHCI_QUIRK_INVERTED_WRITE_PROTECT;
>
> --
> 1.6.4
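
The patch quoted in this message maps the presence of a device-tree property onto a host quirk flag. A self-contained sketch of that pattern follows; the node type, the lookup helper and the flag values are stand-ins invented for illustration and are not the kernel's definitions. Only the two property strings are taken from the quoted patch.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative flag values for this sketch only, not the kernel's. */
#define SK_QUIRK_FORCE_1_BIT_DATA   (1u << 0)
#define SK_QUIRK_BROKEN_TIMEOUT_VAL (1u << 1)

/* Stand-in for a device-tree node: just a list of property names. */
struct sk_node {
	const char *const *props;
	size_t nprops;
};

/* Returns 1 if the node carries a property with the given name. */
static int sk_has_property(const struct sk_node *np, const char *name)
{
	size_t i;

	for (i = 0; i < np->nprops; i++)
		if (strcmp(np->props[i], name) == 0)
			return 1;
	return 0;
}

/* Each boolean property that is present sets one quirk bit. */
static unsigned int sk_quirks_from_node(const struct sk_node *np)
{
	unsigned int quirks = 0;

	if (sk_has_property(np, "sdhci,1-bit-only"))
		quirks |= SK_QUIRK_FORCE_1_BIT_DATA;
	if (sk_has_property(np, "sdhci,sdhci-adjust-timeout"))
		quirks |= SK_QUIRK_BROKEN_TIMEOUT_VAL;
	return quirks;
}
```

Since the lookup is by exact string, the two patches discussed here ('sdhci,sdhci-adjust-timeout' versus 'sdhci,adjust-timeout') would not recognise each other's property, which is one reason to settle on a single name before merging.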

^ permalink raw reply

* Re: CONFIG_NO_HZ added too much idle time in /proc/stat during throughput test.
From: Anton Blanchard @ 2011-12-14  3:17 UTC (permalink / raw)
  To: Fushen Chen; +Cc: Peter Zijlstra, Thomas Gleixner, Linuxppc-dev Development
In-Reply-To: <CAEu=RPgnh-i-SFt4kiiAtDjLZ3A0cAHhk7ch56Uvv0uG_+HfCg@mail.gmail.com>


Hi,

> This is 2.6.32, but I think 2.6.36 is the same.

Sounds a bit like this, merged in 2.6.39.

Anton
--

commit ad5d1c888e556bc00c4e86f452cad4a3a87d22c1
Author: Anton Blanchard <anton@samba.org>
Date:   Sun Mar 20 15:28:03 2011 +0000

    powerpc: Fix accounting of softirq time when idle
    
    commit cf9efce0ce31 (powerpc: Account time using timebase rather
    than PURR) used in_irq() to detect if the time was spent in
    interrupt processing. This only catches hardirq context so if we
    are in softirq context and in the idle loop we end up accounting it
    as idle time. If we instead use in_interrupt() we catch both softirq
    and hardirq time.
    
    The issue was found when running a network intensive workload. top
    showed the following:
    
    0.0%us,  1.1%sy,  0.0%ni, 85.7%id,  0.0%wa,  9.9%hi,  3.3%si,  0.0%st
    
    85.7% idle. But this was wildly different to the perf events data.
    To confirm the suspicion I ran something to keep the core busy:
    
    # yes > /dev/null &
    
    8.2%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa, 10.3%hi, 81.4%si,  0.0%st
    
    We only got 8.2% of the CPU for the userspace task and softirq has
    shot up to 81.4%.
    
    With the patch below top shows the correct stats:
    
    0.0%us,  0.0%sy,  0.0%ni,  5.3%id,  0.0%wa, 13.3%hi, 81.3%si,  0.0%st
    
    Signed-off-by: Anton Blanchard <anton@samba.org>
    Cc: stable@kernel.org
    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
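
A simplified sketch of the difference between the two checks named in the commit message. It models preempt_count as a bitfield with separate softirq and hardirq fields; the bit positions, the omission of NMI state and all names are assumptions for illustration and vary between kernel versions.

```c
/* Illustrative preempt_count layout (assumed for this sketch). */
#define SK_SOFTIRQ_SHIFT 8
#define SK_HARDIRQ_SHIFT 16
#define SK_SOFTIRQ_MASK  (0xffUL << SK_SOFTIRQ_SHIFT)
#define SK_HARDIRQ_MASK  (0x3ffUL << SK_HARDIRQ_SHIFT)

/* Models in_irq(): true only in hardirq context. */
static int sk_in_irq(unsigned long count)
{
	return (count & SK_HARDIRQ_MASK) != 0;
}

/* Models in_interrupt(): true in hardirq or softirq context. */
static int sk_in_interrupt(unsigned long count)
{
	return (count & (SK_HARDIRQ_MASK | SK_SOFTIRQ_MASK)) != 0;
}

enum sk_bucket { SK_BUCKET_IDLE, SK_BUCKET_SYSTEM };

/* Before the fix: softirq work done on top of the idle task counts as idle. */
static enum sk_bucket sk_account_old(unsigned long count, int is_idle_task)
{
	if (sk_in_irq(count) || !is_idle_task)
		return SK_BUCKET_SYSTEM;
	return SK_BUCKET_IDLE;
}

/* After the fix: softirq time is no longer folded into idle time. */
static enum sk_bucket sk_account_new(unsigned long count, int is_idle_task)
{
	if (sk_in_interrupt(count) || !is_idle_task)
		return SK_BUCKET_SYSTEM;
	return SK_BUCKET_IDLE;
}
```

With a softirq-only count on the idle task, sk_account_old() picks the idle bucket and sk_account_new() does not, which matches the drop in reported idle time shown in the top output above.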

^ permalink raw reply

