LinuxPPC-Dev Archive on lore.kernel.org
* [RFC PATCH v3 0/13] memory-hotplug : hot-remove physical memory
From: Yasuaki Ishimatsu @ 2012-07-09 10:21 UTC
  To: linux-mm, linux-kernel, linuxppc-dev, linux-acpi
  Cc: len.brown, wency, paulus, minchan.kim, kosaki.motohiro, rientjes,
	cl, akpm, liuj97

This patch series aims to support physical memory hot-remove.

  [RFC PATCH v3 1/13] memory-hotplug : rename remove_memory to offline_memory
  [RFC PATCH v3 2/13] memory-hotplug : add physical memory hotplug code to acpi_memory_device_remove
  [RFC PATCH v3 3/13] memory-hotplug : unify argument of firmware_map_add_early/hotplug
  [RFC PATCH v3 4/13] memory-hotplug : remove /sys/firmware/memmap/X sysfs
  [RFC PATCH v3 5/13] memory-hotplug : does not release memory region in PAGES_PER_SECTION chunks
  [RFC PATCH v3 6/13] memory-hotplug : add memory_block_release
  [RFC PATCH v3 7/13] memory-hotplug : remove_memory calls __remove_pages
  [RFC PATCH v3 8/13] memory-hotplug : check page type in get_page_bootmem
  [RFC PATCH v3 9/13] memory-hotplug : move register_page_bootmem_info_node and put_page_bootmem for sparse-vmemmap
  [RFC PATCH v3 10/13] memory-hotplug : implement register_page_bootmem_info_section of sparse-vmemmap
  [RFC PATCH v3 11/13] memory-hotplug : free memmap of sparse-vmemmap
  [RFC PATCH v3 12/13] memory-hotplug : add node_device_release
  [RFC PATCH v3 13/13] memory-hotplug : remove sysfs file of node

Even with these patches applied, physical memory cannot be removed
completely, since the series is still under development. I would like your
help in improving physical memory hot-remove, so please review these
patches and send your comments and ideas.

The patches can free/remove the following things:

  - acpi_memory_info                          : [RFC PATCH 2/13]
  - /sys/firmware/memmap/X/{end, start, type} : [RFC PATCH 4/13]
  - iomem_resource                            : [RFC PATCH 5/13]
  - mem_section and related sysfs files       : [RFC PATCH 6-11/13]
  - node and related sysfs files              : [RFC PATCH 12-13/13]

The patches cannot yet free/remove the following things:

  - page table of removed memory

If you find any functionality missing for physical memory hot-remove, please
let me know.

change log of v3:
 * rebase to 3.5.0-rc6

 [RFC PATCH v2 2/13]
   * remove extra kobject_put()

   * Wen commented on this patch that
     "acpi_memory_device_remove() should ignore a return value of
     remove_memory() since caller does not care the return value".
     I did not change this, since I think the caller should care about
     the return value. I am trying to fix that as follows:

     https://lkml.org/lkml/2012/7/5/624

 [RFC PATCH v2 4/13]
   * remove a firmware_map_entry allocated by kzalloc()

change log of v2:
 [RFC PATCH v2 2/13]
   * check whether memory block is offline or not before calling offline_memory()
   * check whether section is valid or not in is_memblk_offline()
   * call kobject_put() for each memory_block in is_memblk_offline()

 [RFC PATCH v2 3/13]
   * unify the end argument of firmware_map_add_early/hotplug

 [RFC PATCH v2 4/13]
   * add release_firmware_map_entry() for freeing firmware_map_entry

 [RFC PATCH v2 6/13]
  * add release_memory_block() for freeing memory_block

 [RFC PATCH v2 11/13]
  * fix wrong arguments of free_pages()

---
 arch/powerpc/platforms/pseries/hotplug-memory.c |   16 +-
 arch/x86/mm/init_64.c                           |  144 ++++++++++++++++++++++++
 drivers/acpi/acpi_memhotplug.c                  |   28 ++++
 drivers/base/memory.c                           |   54 ++++++++-
 drivers/base/node.c                             |    7 +
 drivers/firmware/memmap.c                       |   78 ++++++++++++-
 include/linux/firmware-map.h                    |    6 +
 include/linux/memory.h                          |    5
 include/linux/memory_hotplug.h                  |   17 --
 include/linux/mm.h                              |    5
 mm/memory_hotplug.c                             |   98 ++++++++++++----
 mm/sparse.c                                     |    5
 12 files changed, 414 insertions(+), 49 deletions(-)


* [PATCH 3/3 v2] powerpc/mpic: FSL MPIC error interrupt support.
From: Varun Sethi @ 2012-07-09  8:47 UTC
  To: galak, linuxppc-dev; +Cc: Bogdan Hamciuc, Varun Sethi

All SOC device error interrupts are muxed and delivered to the core as a single
MPIC error interrupt. Currently all the device drivers requiring access to device
errors have to register for the MPIC error interrupt as a shared interrupt.

With this patch we add interrupt demuxing capability in the mpic driver, allowing
device drivers to register for their individual error interrupts. This is achieved
by handling error interrupts in a cascaded fashion.

The MPIC error interrupt is handled by fsl_error_int_handler(), which demuxes
it using the EISR and delivers it to the respective drivers.

The error interrupt capability is dependent on the MPIC EIMR register, which was
introduced in FSL MPIC version 4.1 (P4080 rev2). So, error interrupt demuxing capability
is dependent on the MPIC version and can be used for versions >= 4.1.
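
As a rough illustration of the demuxing described above, here is a minimal
user-space sketch of the EISR/EIMR bit handling. The helper names (err_bit,
eimr_mask, eimr_unmask, demux_errors) are made up for this example and are not
part of the patch. It assumes that error source 0 maps to the most significant
bit, as the "1 << (31 - src)" expressions in the patch suggest, that unsigned
int is 32 bits wide, and that the compiler provides __builtin_clz (GCC or
Clang). Unlike the handler in the patch, which walks every bit set in the
EISR, this sketch filters out masked sources before dispatching.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_ERR_SOURCES 32	/* mirrors MPIC_MAX_ERR in the patch */

/* Bit for error source 'src'; source 0 is the most significant bit. */
static inline uint32_t err_bit(unsigned int src)
{
	assert(src < NUM_ERR_SOURCES);
	return UINT32_C(1) << (31 - src);
}

/* Masking a source sets its bit in the EIMR value. */
static inline uint32_t eimr_mask(uint32_t eimr, unsigned int src)
{
	return eimr | err_bit(src);
}

/* Unmasking a source clears its bit in the EIMR value. */
static inline uint32_t eimr_unmask(uint32_t eimr, unsigned int src)
{
	return eimr & ~err_bit(src);
}

/*
 * Walk the pending, unmasked sources from the most significant bit down.
 * Stores their source numbers in 'out' and returns how many were found.
 */
static unsigned int demux_errors(uint32_t eisr, uint32_t eimr,
				 unsigned int out[NUM_ERR_SOURCES])
{
	uint32_t pending = eisr & ~eimr;
	unsigned int n = 0;

	while (pending) {
		/* Leading zero count gives the lowest-numbered pending source. */
		unsigned int src = (unsigned int)__builtin_clz(pending);

		out[n++] = src;
		pending &= ~err_bit(src);
	}
	return n;
}
```

In the driver, each source number found this way would be translated to a
virtual IRQ and passed to generic_handle_irq(); the sketch only collects the
numbers so that the bit arithmetic can be checked in isolation.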

Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Bogdan Hamciuc <bogdan.hamciuc@freescale.com>
[The initial version of the patch used handle_simple_irq as the handler
 for cascaded error interrupts, which caused problems with threaded ISRs
 (with the RT kernel). Bogdan debugged the issue, and we decided to use
 the handle_level_irq handler instead.]
---
 arch/powerpc/include/asm/mpic.h    |   17 ++++
 arch/powerpc/sysdev/Makefile       |    2 +-
 arch/powerpc/sysdev/fsl_mpic_err.c |  157 ++++++++++++++++++++++++++++++++++++
 arch/powerpc/sysdev/mpic.c         |   35 ++++++++-
 arch/powerpc/sysdev/mpic.h         |   22 +++++
 5 files changed, 231 insertions(+), 2 deletions(-)
 create mode 100644 arch/powerpc/sysdev/fsl_mpic_err.c

diff --git a/arch/powerpc/include/asm/mpic.h b/arch/powerpc/include/asm/mpic.h
index e14d35d..71b42b9 100644
--- a/arch/powerpc/include/asm/mpic.h
+++ b/arch/powerpc/include/asm/mpic.h
@@ -114,10 +114,17 @@
 #define MPIC_FSL_BRR1			0x00000
 #define 	MPIC_FSL_BRR1_VER			0x0000ffff
 
+/*
+ * Error interrupt registers
+ */
+
+
 #define MPIC_MAX_IRQ_SOURCES	2048
 #define MPIC_MAX_CPUS		32
 #define MPIC_MAX_ISU		32
 
+#define MPIC_MAX_ERR      32
+
 /*
  * Tsi108 implementation of MPIC has many differences from the original one
  */
@@ -270,6 +277,7 @@ struct mpic
 	struct irq_chip		hc_ipi;
 #endif
 	struct irq_chip		hc_tm;
+	struct irq_chip		hc_err;
 	const char		*name;
 	/* Flags */
 	unsigned int		flags;
@@ -283,6 +291,8 @@ struct mpic
 	/* vector numbers used for internal sources (ipi/timers) */
 	unsigned int		ipi_vecs[4];
 	unsigned int		timer_vecs[8];
+	/* vector numbers used for FSL MPIC error interrupts */
+	unsigned int		err_int_vecs[MPIC_MAX_ERR];
 
 	/* Spurious vector to program into unused sources */
 	unsigned int		spurious_vec;
@@ -306,6 +316,11 @@ struct mpic
 	struct mpic_reg_bank	cpuregs[MPIC_MAX_CPUS];
 	struct mpic_reg_bank	isus[MPIC_MAX_ISU];
 
+	/* ioremap'ed base for error interrupt registers */
+	u32 __iomem	*err_regs;
+	/* error interrupt config */
+	u32			err_int_config_done;
+
 	/* Protected sources */
 	unsigned long		*protected;
 
@@ -370,6 +385,8 @@ struct mpic
 #define MPIC_NO_RESET			0x00004000
 /* Freescale MPIC (compatible includes "fsl,mpic") */
 #define MPIC_FSL			0x00008000
+/* Freescale MPIC supports EIMR (error interrupt mask register)*/
+#define MPIC_FSL_HAS_EIMR		0x00010000
 
 /* MPIC HW modification ID */
 #define MPIC_REGSET_MASK		0xf0000000
diff --git a/arch/powerpc/sysdev/Makefile b/arch/powerpc/sysdev/Makefile
index 1bd7ecb..a57600b 100644
--- a/arch/powerpc/sysdev/Makefile
+++ b/arch/powerpc/sysdev/Makefile
@@ -15,7 +15,7 @@ obj-$(CONFIG_PPC_DCR_NATIVE)	+= dcr-low.o
 obj-$(CONFIG_PPC_PMI)		+= pmi.o
 obj-$(CONFIG_U3_DART)		+= dart_iommu.o
 obj-$(CONFIG_MMIO_NVRAM)	+= mmio_nvram.o
-obj-$(CONFIG_FSL_SOC)		+= fsl_soc.o
+obj-$(CONFIG_FSL_SOC)		+= fsl_soc.o fsl_mpic_err.o
 obj-$(CONFIG_FSL_PCI)		+= fsl_pci.o $(fsl-msi-obj-y)
 obj-$(CONFIG_FSL_PMC)		+= fsl_pmc.o
 obj-$(CONFIG_FSL_LBC)		+= fsl_lbc.o
diff --git a/arch/powerpc/sysdev/fsl_mpic_err.c b/arch/powerpc/sysdev/fsl_mpic_err.c
new file mode 100644
index 0000000..f2d28f2
--- /dev/null
+++ b/arch/powerpc/sysdev/fsl_mpic_err.c
@@ -0,0 +1,157 @@
+/*
+ * Copyright (C) 2012 Freescale Semiconductor, Inc.
+ *
+ * Author: Varun Sethi <varun.sethi@freescale.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2 of the
+ * License.
+ *
+ */
+
+#include <linux/irq.h>
+#include <linux/smp.h>
+#include <linux/interrupt.h>
+
+#include <asm/io.h>
+#include <asm/irq.h>
+#include <asm/mpic.h>
+
+#include "mpic.h"
+
+#define MPIC_ERR_INT_BASE	0x3900
+#define MPIC_ERR_INT_EISR	0x0000
+#define MPIC_ERR_INT_EIMR	0x0010
+
+static inline u32 fsl_mpic_err_read(u32 __iomem *base, unsigned int err_reg)
+{
+	return in_be32(base + (err_reg >> 2));
+}
+
+static inline void fsl_mpic_err_write(u32 __iomem *base, unsigned int err_reg,
+				   u32 value)
+{
+	out_be32(base + (err_reg >> 2), value);
+}
+
+static void fsl_mpic_mask_err(struct irq_data *d)
+{
+	u32 eimr;
+	struct mpic *mpic = irq_data_get_irq_chip_data(d);
+	unsigned int src = virq_to_hw(d->irq) - mpic->err_int_vecs[0];
+	unsigned int err_reg_offset = MPIC_ERR_INT_EIMR;
+
+	eimr = fsl_mpic_err_read(mpic->err_regs, err_reg_offset);
+	eimr |= (1 << (31 - src));
+	fsl_mpic_err_write(mpic->err_regs, err_reg_offset, eimr);
+}
+
+static void fsl_mpic_unmask_err(struct irq_data *d)
+{
+	u32 eimr;
+	struct mpic *mpic = irq_data_get_irq_chip_data(d);
+	unsigned int src = virq_to_hw(d->irq) - mpic->err_int_vecs[0];
+	unsigned int err_reg_offset = MPIC_ERR_INT_EIMR;
+
+	eimr = fsl_mpic_err_read(mpic->err_regs, err_reg_offset);
+	eimr &= ~(1 << (31 - src));
+	fsl_mpic_err_write(mpic->err_regs, err_reg_offset, eimr);
+}
+
+static struct irq_chip fsl_mpic_err_chip = {
+	.irq_disable	= fsl_mpic_mask_err,
+	.irq_mask	= fsl_mpic_mask_err,
+	.irq_unmask	= fsl_mpic_unmask_err,
+};
+
+void mpic_setup_error_int(struct mpic *mpic, int intvec)
+{
+	int i;
+
+	mpic->err_regs = ioremap(mpic->paddr + MPIC_ERR_INT_BASE, 0x1000);
+	if (!mpic->err_regs) {
+		pr_err("could not map mpic error registers\n");
+		return;
+	}
+	mpic->hc_err = fsl_mpic_err_chip;
+	mpic->hc_err.name = mpic->name;
+	mpic->flags |= MPIC_FSL_HAS_EIMR;
+	/* allocate interrupt vectors for error interrupts */
+	for (i = MPIC_MAX_ERR - 1; i >= 0; i--)
+		mpic->err_int_vecs[i] = --intvec;
+
+}
+
+int mpic_map_error_int(struct mpic *mpic, unsigned int virq, irq_hw_number_t  hw)
+{
+	if ((mpic->flags & MPIC_FSL_HAS_EIMR) &&
+	    (hw >= mpic->err_int_vecs[0] &&
+	     hw <= mpic->err_int_vecs[MPIC_MAX_ERR - 1])) {
+		WARN_ON(mpic->flags & MPIC_SECONDARY);
+
+		pr_debug("mpic: mapping as Error Interrupt\n");
+		irq_set_chip_data(virq, mpic);
+		irq_set_chip_and_handler(virq, &mpic->hc_err,
+					 handle_level_irq);
+		return 1;
+	}
+
+	return 0;
+}
+
+static irqreturn_t fsl_error_int_handler(int irq, void *data)
+{
+	struct mpic *mpic = (struct mpic *) data;
+	unsigned int eisr_offset = MPIC_ERR_INT_EISR;
+	unsigned int eimr_offset = MPIC_ERR_INT_EIMR;
+	u32 eisr, eimr;
+	int errint;
+	unsigned int cascade_irq;
+
+	eisr = fsl_mpic_err_read(mpic->err_regs, eisr_offset);
+	eimr = fsl_mpic_err_read(mpic->err_regs, eimr_offset);
+
+	if (!(eisr & ~eimr))
+		return IRQ_NONE;
+
+	while (eisr) {
+		errint = __builtin_clz(eisr);
+		cascade_irq = irq_linear_revmap(mpic->irqhost,
+				 mpic->err_int_vecs[errint]);
+		WARN_ON(cascade_irq == NO_IRQ);
+		if (cascade_irq != NO_IRQ) {
+			generic_handle_irq(cascade_irq);
+		} else {
+			eimr |=  1 << (31 - errint);
+			fsl_mpic_err_write(mpic->err_regs, eimr_offset, eimr);
+		}
+		eisr &= ~(1 << (31 - errint));
+	}
+
+	return IRQ_HANDLED;
+}
+
+int mpic_err_int_init(struct mpic *mpic, irq_hw_number_t irqnum)
+{
+	unsigned int virq;
+	unsigned int offset = MPIC_ERR_INT_EIMR;
+	int ret;
+
+	virq = irq_create_mapping(mpic->irqhost, irqnum);
+	if (virq == NO_IRQ) {
+		pr_err("Error interrupt setup failed\n");
+		return -ENOSPC;
+	}
+
+	fsl_mpic_err_write(mpic->err_regs, offset, ~0);
+
+	ret = request_irq(virq, fsl_error_int_handler, IRQF_NO_THREAD,
+		    "mpic-error-int", mpic);
+	if (ret) {
+		pr_err("Failed to register error interrupt handler\n");
+		return ret;
+	}
+
+	return 0;
+}
diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
index 61c7225..7002ef3 100644
--- a/arch/powerpc/sysdev/mpic.c
+++ b/arch/powerpc/sysdev/mpic.c
@@ -1026,6 +1026,9 @@ static int mpic_host_map(struct irq_domain *h, unsigned int virq,
 		return 0;
 	}
 
+	if (mpic_map_error_int(mpic, virq, hw))
+		return 0;
+
 	if (hw >= mpic->num_sources)
 		return -EINVAL;
 
@@ -1085,7 +1088,24 @@ static int mpic_host_xlate(struct irq_domain *h, struct device_node *ct,
 		 */
 		switch (intspec[2]) {
 		case 0:
-		case 1: /* no EISR/EIMR support for now, treat as shared IRQ */
+			break;
+		case 1:
+			if (!(mpic->flags & MPIC_FSL_HAS_EIMR))
+				break;
+
+			if (intspec[3] >= ARRAY_SIZE(mpic->err_int_vecs))
+				return -EINVAL;
+
+			if (!mpic->err_int_config_done) {
+				int ret;
+				ret = mpic_err_int_init(mpic, intspec[0]);
+				if (ret)
+					return ret;
+				mpic->err_int_config_done = 1;
+			}
+
+			*out_hwirq = mpic->err_int_vecs[intspec[3]];
+
 			break;
 		case 2:
 			if (intspec[0] >= ARRAY_SIZE(mpic->ipi_vecs))
@@ -1302,6 +1322,8 @@ struct mpic * __init mpic_alloc(struct device_node *node,
 	mpic_map(mpic, mpic->paddr, &mpic->tmregs, MPIC_INFO(TIMER_BASE), 0x1000);
 
 	if (mpic->flags & MPIC_FSL) {
+		u32 brr1, version;
+
 		/*
 		 * Yes, Freescale really did put global registers in the
 		 * magic per-cpu area -- and they don't even show up in the
@@ -1309,6 +1331,17 @@ struct mpic * __init mpic_alloc(struct device_node *node,
 		 */
 		mpic_map(mpic, mpic->paddr, &mpic->thiscpuregs,
 			 MPIC_CPU_THISBASE, 0x1000);
+
+		brr1 = _mpic_read(mpic->reg_type, &mpic->thiscpuregs,
+				MPIC_FSL_BRR1);
+		version = brr1 & MPIC_FSL_BRR1_VER;
+
+		/* Error interrupt mask register (EIMR) is required for
+		 * handling individual device error interrupts. EIMR
+		 * was added in MPIC version 4.1.
+		 */
+		if (version >= 0x401)
+			mpic_setup_error_int(mpic, intvec_top - 12);
 	}
 
 	/* Reset */
diff --git a/arch/powerpc/sysdev/mpic.h b/arch/powerpc/sysdev/mpic.h
index 13f3e89..1a6995a 100644
--- a/arch/powerpc/sysdev/mpic.h
+++ b/arch/powerpc/sysdev/mpic.h
@@ -40,4 +40,26 @@ extern int mpic_set_affinity(struct irq_data *d,
 			     const struct cpumask *cpumask, bool force);
 extern void mpic_reset_core(int cpu);
 
+#ifdef CONFIG_FSL_SOC
+extern int mpic_map_error_int(struct mpic *mpic, unsigned int virq, irq_hw_number_t  hw);
+extern int mpic_err_int_init(struct mpic *mpic, irq_hw_number_t irqnum);
+extern void mpic_setup_error_int(struct mpic *mpic, int intvec);
+#else
+static inline int mpic_map_error_int(struct mpic *mpic, unsigned int virq, irq_hw_number_t  hw)
+{
+	return 0;
+}
+
+
+static inline int mpic_err_int_init(struct mpic *mpic, irq_hw_number_t irqnum)
+{
+	return -1;
+}
+
+static inline void mpic_setup_error_int(struct mpic *mpic, int intvec)
+{
+	return;
+}
+#endif
+
 #endif /* _POWERPC_SYSDEV_MPIC_H */
-- 
1.7.2.2


* [PATCH 2/3 v2] powerpc/mpic: Use the MPIC_LARGE_VECTORS flag for FSL MPIC.
From: Varun Sethi @ 2012-07-09  8:46 UTC
  To: galak, linuxppc-dev; +Cc: Varun Sethi

We should use the MPIC_LARGE_VECTORS flag while initializing the MPIC.
This prevents us from eating into the hardware vector number space (MSIs)
while setting up internal sources.
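
To make the collision concrete, here is a small sketch of how reserving
internal vectors from the top of the vector space interacts with its size.
The counts of 4 IPI and 8 timer vectors come from the ipi_vecs[] and
timer_vecs[] arrays in struct mpic. Everything else is an assumption made for
illustration: that internal sources (spurious, IPI, timer) are carved
contiguously from the top, and that the top vector is 255 without
MPIC_LARGE_VECTORS and 2047 with it. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_IPI_VECS	4	/* size of ipi_vecs[] in struct mpic */
#define NUM_TIMER_VECS	8	/* size of timer_vecs[] in struct mpic */
#define NUM_SPURIOUS	1

/* Assumed tops of the vector space; these values are not from the patch. */
#define SMALL_VECTOR_TOP	255
#define LARGE_VECTOR_TOP	2047

/* Lowest vector reserved for internal sources when carving from the top. */
static unsigned int first_internal_vec(unsigned int intvec_top)
{
	unsigned int reserved = NUM_SPURIOUS + NUM_IPI_VECS + NUM_TIMER_VECS;

	assert(intvec_top + 1 >= reserved);
	return intvec_top - (reserved - 1);
}

/* True if a hardware source using 'hw_vec' overlaps the internal range. */
static bool collides_with_internal(unsigned int hw_vec, unsigned int intvec_top)
{
	return hw_vec >= first_internal_vec(intvec_top) && hw_vec <= intvec_top;
}
```

Under these assumptions a hardware source such as an MSI at vector 250 would
overlap the internal range when the top is 255, but not when the top is 2047.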

Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
---
 arch/powerpc/sysdev/mpic.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
index a98eb77..61c7225 100644
--- a/arch/powerpc/sysdev/mpic.c
+++ b/arch/powerpc/sysdev/mpic.c
@@ -1211,7 +1211,7 @@ struct mpic * __init mpic_alloc(struct device_node *node,
 	if (of_get_property(node, "single-cpu-affinity", NULL))
 		flags |= MPIC_SINGLE_DEST_CPU;
 	if (of_device_is_compatible(node, "fsl,mpic"))
-		flags |= MPIC_FSL;
+		flags |= MPIC_FSL | MPIC_LARGE_VECTORS;
 
 	mpic = kzalloc(sizeof(struct mpic), GFP_KERNEL);
 	if (mpic == NULL)
-- 
1.7.2.2


* [PATCH 1/3] powerpc/mpic: finish supporting timer group B on Freescale chips
From: Varun Sethi @ 2012-07-09  8:45 UTC
  To: galak, linuxppc-dev; +Cc: Scott Wood, Varun Sethi

Previously, these interrupts would be mapped, but the offset
calculation was broken, and only the first group was initialized.
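
The offset fix can be checked outside the kernel with a small sketch. The
stride values are the ones defined in mpic.h in this patch
(MPIC_TIMER_STRIDE and MPIC_TIMER_GROUP_STRIDE). The function names, and the
decision to leave out the per-register offset such as TIMER_VECTOR_PRI, are
simplifications made for this example.

```c
#include <assert.h>

#define TIMER_STRIDE		0x40	/* MPIC_TIMER_STRIDE */
#define TIMER_GROUP_STRIDE	0x1000	/* MPIC_TIMER_GROUP_STRIDE */
#define TIMERS_PER_GROUP	4
#define NUM_TIMERS		8	/* groups A and B */

/* Offset of timer 'tm' relative to the first timer of group A. */
static unsigned int timer_offset(unsigned int tm)
{
	assert(tm < NUM_TIMERS);
	return (tm / TIMERS_PER_GROUP) * TIMER_GROUP_STRIDE +
	       (tm % TIMERS_PER_GROUP) * TIMER_STRIDE;
}

/*
 * The calculation being replaced: for group B it added 0x1000 / 4 (0x400)
 * instead of a full group stride.
 */
static unsigned int old_timer_offset(unsigned int tm)
{
	unsigned int offset = (tm % TIMERS_PER_GROUP) * TIMER_STRIDE;

	assert(tm < NUM_TIMERS);
	if (tm >= TIMERS_PER_GROUP)
		offset += 0x1000 / 4;
	return offset;
}
```

Both functions agree for timers 0 to 3; they differ only for group B, where
the old calculation lands 0x400 past group A instead of 0x1000.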

Signed-off-by: Scott Wood <scottwood@freescale.com>
---
 arch/powerpc/include/asm/mpic.h |    5 +++
 arch/powerpc/sysdev/mpic.c      |   58 ++++++++++++++++++++++++++++-----------
 2 files changed, 47 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/mpic.h b/arch/powerpc/include/asm/mpic.h
index c9f698a..e14d35d 100644
--- a/arch/powerpc/include/asm/mpic.h
+++ b/arch/powerpc/include/asm/mpic.h
@@ -63,6 +63,7 @@
  */
 #define MPIC_TIMER_BASE			0x01100
 #define MPIC_TIMER_STRIDE		0x40
+#define MPIC_TIMER_GROUP_STRIDE		0x1000
 
 #define MPIC_TIMER_CURRENT_CNT		0x00000
 #define MPIC_TIMER_BASE_CNT		0x00010
@@ -110,6 +111,9 @@
 #define 	MPIC_VECPRI_SENSE_MASK			0x00400000
 #define MPIC_IRQ_DESTINATION		0x00010
 
+#define MPIC_FSL_BRR1			0x00000
+#define 	MPIC_FSL_BRR1_VER			0x0000ffff
+
 #define MPIC_MAX_IRQ_SOURCES	2048
 #define MPIC_MAX_CPUS		32
 #define MPIC_MAX_ISU		32
@@ -296,6 +300,7 @@ struct mpic
 	phys_addr_t paddr;
 
 	/* The various ioremap'ed bases */
+	struct mpic_reg_bank	thiscpuregs;
 	struct mpic_reg_bank	gregs;
 	struct mpic_reg_bank	tmregs;
 	struct mpic_reg_bank	cpuregs[MPIC_MAX_CPUS];
diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
index 395af13..a98eb77 100644
--- a/arch/powerpc/sysdev/mpic.c
+++ b/arch/powerpc/sysdev/mpic.c
@@ -6,7 +6,7 @@
  *  with various broken implementations of this HW.
  *
  *  Copyright (C) 2004 Benjamin Herrenschmidt, IBM Corp.
- *  Copyright 2010-2011 Freescale Semiconductor, Inc.
+ *  Copyright 2010-2012 Freescale Semiconductor, Inc.
  *
  *  This file is subject to the terms and conditions of the GNU General Public
  *  License.  See the file COPYING in the main directory of this archive
@@ -221,24 +221,24 @@ static inline void _mpic_ipi_write(struct mpic *mpic, unsigned int ipi, u32 valu
 	_mpic_write(mpic->reg_type, &mpic->gregs, offset, value);
 }
 
-static inline u32 _mpic_tm_read(struct mpic *mpic, unsigned int tm)
+static inline unsigned int mpic_tm_offset(struct mpic *mpic, unsigned int tm)
 {
-	unsigned int offset = MPIC_INFO(TIMER_VECTOR_PRI) +
-			      ((tm & 3) * MPIC_INFO(TIMER_STRIDE));
+	return (tm >> 2) * MPIC_TIMER_GROUP_STRIDE +
+	       (tm & 3) * MPIC_INFO(TIMER_STRIDE);
+}
 
-	if (tm >= 4)
-		offset += 0x1000 / 4;
+static inline u32 _mpic_tm_read(struct mpic *mpic, unsigned int tm)
+{
+	unsigned int offset = mpic_tm_offset(mpic, tm) +
+			      MPIC_INFO(TIMER_VECTOR_PRI);
 
 	return _mpic_read(mpic->reg_type, &mpic->tmregs, offset);
 }
 
 static inline void _mpic_tm_write(struct mpic *mpic, unsigned int tm, u32 value)
 {
-	unsigned int offset = MPIC_INFO(TIMER_VECTOR_PRI) +
-			      ((tm & 3) * MPIC_INFO(TIMER_STRIDE));
-
-	if (tm >= 4)
-		offset += 0x1000 / 4;
+	unsigned int offset = mpic_tm_offset(mpic, tm) +
+			      MPIC_INFO(TIMER_VECTOR_PRI);
 
 	_mpic_write(mpic->reg_type, &mpic->tmregs, offset, value);
 }
@@ -1301,6 +1301,16 @@ struct mpic * __init mpic_alloc(struct device_node *node,
 	mpic_map(mpic, mpic->paddr, &mpic->gregs, MPIC_INFO(GREG_BASE), 0x1000);
 	mpic_map(mpic, mpic->paddr, &mpic->tmregs, MPIC_INFO(TIMER_BASE), 0x1000);
 
+	if (mpic->flags & MPIC_FSL) {
+		/*
+		 * Yes, Freescale really did put global registers in the
+		 * magic per-cpu area -- and they don't even show up in the
+		 * non-magic per-cpu copies that this driver normally uses.
+		 */
+		mpic_map(mpic, mpic->paddr, &mpic->thiscpuregs,
+			 MPIC_CPU_THISBASE, 0x1000);
+	}
+
 	/* Reset */
 
 	/* When using a device-node, reset requests are only honored if the MPIC
@@ -1440,6 +1450,7 @@ void __init mpic_assign_isu(struct mpic *mpic, unsigned int isu_num,
 void __init mpic_init(struct mpic *mpic)
 {
 	int i, cpu;
+	int num_timers = 4;
 
 	BUG_ON(mpic->num_sources == 0);
 
@@ -1448,15 +1459,30 @@ void __init mpic_init(struct mpic *mpic)
 	/* Set current processor priority to max */
 	mpic_cpu_write(MPIC_INFO(CPU_CURRENT_TASK_PRI), 0xf);
 
+	if (mpic->flags & MPIC_FSL) {
+		u32 brr1 = _mpic_read(mpic->reg_type, &mpic->thiscpuregs,
+				      MPIC_FSL_BRR1);
+		u32 version = brr1 & MPIC_FSL_BRR1_VER;
+
+		/*
+		 * Timer group B is present at the latest in MPIC 3.1 (e.g.
+		 * mpc8536).  It is not present in MPIC 2.0 (e.g. mpc8544).
+		 * I don't know about the status of intermediate versions (or
+		 * whether they even exist).
+		 */
+		if (version >= 0x0301)
+			num_timers = 8;
+	}
+
 	/* Initialize timers to our reserved vectors and mask them for now */
-	for (i = 0; i < 4; i++) {
+	for (i = 0; i < num_timers; i++) {
+		unsigned int offset = mpic_tm_offset(mpic, i);
+
 		mpic_write(mpic->tmregs,
-			   i * MPIC_INFO(TIMER_STRIDE) +
-			   MPIC_INFO(TIMER_DESTINATION),
+			   offset + MPIC_INFO(TIMER_DESTINATION),
 			   1 << hard_smp_processor_id());
 		mpic_write(mpic->tmregs,
-			   i * MPIC_INFO(TIMER_STRIDE) +
-			   MPIC_INFO(TIMER_VECTOR_PRI),
+			   offset + MPIC_INFO(TIMER_VECTOR_PRI),
 			   MPIC_VECPRI_MASK |
 			   (9 << MPIC_VECPRI_PRIORITY_SHIFT) |
 			   (mpic->timer_vecs[0] + i));
-- 
1.7.2.2


* [PATCH 0/3] powerpc/mpic: Enhancements for FSL MPIC.
From: Varun Sethi @ 2012-07-09  8:43 UTC
  To: galak, linuxppc-dev; +Cc: Varun Sethi

This patchset adds/fixes the following functionality specific to the
FSL MPIC:
1. Fix support for timer group B interrupts. Previously these were
not getting initialized.

2. Use the MPIC_LARGE_VECTORS flag while initializing FSL MPIC.
This prevents us from eating in to hardware vector number
space (MSIs) while setting up internal sources.

3. Cascaded handling for the MPIC error interrupt. This is possible
with FSL MPIC version >= 4.1.
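
The version checks used across the series can be summarised in a short
sketch. The mask value is MPIC_FSL_BRR1_VER and the thresholds 0x0301 and
0x0401 are the ones compared against in the patches. The function names, and
the make_version() helper that packs major.minor into one value, are
hypothetical and only serve the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FSL_BRR1_VER_MASK	0x0000ffffu	/* MPIC_FSL_BRR1_VER */

/* Hypothetical helper: pack major.minor as compared in the patches. */
static uint32_t make_version(unsigned int major, unsigned int minor)
{
	assert(major <= 0xff && minor <= 0xff);
	return ((uint32_t)major << 8) | minor;
}

/* The version lives in the low 16 bits of BRR1. */
static uint32_t fsl_mpic_version(uint32_t brr1)
{
	return brr1 & FSL_BRR1_VER_MASK;
}

/* Timer group B is initialized for versions >= 3.1. */
static bool has_timer_group_b(uint32_t brr1)
{
	return fsl_mpic_version(brr1) >= make_version(3, 1);
}

/* Error interrupt demuxing needs the EIMR, added in version 4.1. */
static bool has_error_int_demux(uint32_t brr1)
{
	return fsl_mpic_version(brr1) >= make_version(4, 1);
}
```

The upper 16 bits of BRR1 are ignored here, which matches how the patches
mask the register before comparing.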

The patches are based on "next" branch of Benjamin Herrenschmidt's powerpc
linux tree.

Varun Sethi (3):
  Support time group b on freescale chips.
  Use MPIC_LARGE_VECTORS flag for Freescale MPIC.
  Add support for cascaded error interrupt handling.

 arch/powerpc/include/asm/mpic.h          |   22 ++++
 arch/powerpc/sysdev/Makefile             |    2 +-
 arch/powerpc/sysdev/fsl_mpic_err.c       |  157 ++++++++++++++++++++++++++++++
 arch/powerpc/sysdev/mpic.c               |   95 +++++++++++++++----
 arch/powerpc/sysdev/mpic.h               |   22 ++++
 6 files changed, 338 insertions(+), 19 deletions(-)
 create mode 100644 arch/powerpc/sysdev/fsl_mpic_err.c

-- 
1.7.2.2


* Re: [RFC PATCH v2 4/13] memory-hotplug : remove /sys/firmware/memmap/X sysfs
From: Yasuaki Ishimatsu @ 2012-07-09  8:18 UTC
  To: Wen Congyang
  Cc: len.brown, linux-acpi, linux-kernel, linux-mm, paulus,
	minchan.kim, kosaki.motohiro, rientjes, cl, linuxppc-dev, akpm,
	liuj97
In-Reply-To: <4FF6ADD9.7040600@cn.fujitsu.com>

Hi Wen,

2012/07/06 18:20, Wen Congyang wrote:
> At 07/06/2012 04:27 PM, Yasuaki Ishimatsu Wrote:
>> Hi Wen,
>>
>> 2012/07/04 19:01, Wen Congyang wrote:
>>> At 07/04/2012 01:52 PM, Yasuaki Ishimatsu Wrote:
>>>> Hi Wen,
>>>>
>>>> 2012/07/04 14:08, Wen Congyang wrote:
>>>>> At 07/04/2012 12:45 PM, Yasuaki Ishimatsu Wrote:
>>>>>> Hi Wen,
>>>>>>
>>>>>> 2012/07/03 15:35, Wen Congyang wrote:
>>>>>>> At 07/03/2012 01:56 PM, Yasuaki Ishimatsu Wrote:
>>>>>>>> When (hot)adding memory into system, /sys/firmware/memmap/X/{end, start, type}
>>>>>>>> sysfs files are created. But there is no code to remove these files. The patch
>>>>>>>> implements the function to remove them.
>>>>>>>>
>>>>>>>> Note : The code does not free firmware_map_entry since there is no way to free
>>>>>>>>            memory which is allocated by bootmem.
>>>>>>>>
>>>>>>>> CC: David Rientjes <rientjes@google.com>
>>>>>>>> CC: Jiang Liu <liuj97@gmail.com>
>>>>>>>> CC: Len Brown <len.brown@intel.com>
>>>>>>>> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>>>>>>> CC: Paul Mackerras <paulus@samba.org>
>>>>>>>> CC: Christoph Lameter <cl@linux.com>
>>>>>>>> Cc: Minchan Kim <minchan.kim@gmail.com>
>>>>>>>> CC: Andrew Morton <akpm@linux-foundation.org>
>>>>>>>> CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>>>>>>>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>>>>>>>
>>>>>>>> ---
>>>>>>>>      drivers/firmware/memmap.c    |   70 +++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>      include/linux/firmware-map.h |    6 +++
>>>>>>>>      mm/memory_hotplug.c          |    6 +++
>>>>>>>>      3 files changed, 81 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> Index: linux-3.5-rc4/mm/memory_hotplug.c
>>>>>>>> ===================================================================
>>>>>>>> --- linux-3.5-rc4.orig/mm/memory_hotplug.c	2012-07-03 14:22:00.190240794 +0900
>>>>>>>> +++ linux-3.5-rc4/mm/memory_hotplug.c	2012-07-03 14:22:03.549198802 +0900
>>>>>>>> @@ -661,7 +661,11 @@ EXPORT_SYMBOL_GPL(add_memory);
>>>>>>>>
>>>>>>>>      int remove_memory(int nid, u64 start, u64 size)
>>>>>>>>      {
>>>>>>>> -	return -EBUSY;
>>>>>>>> +	lock_memory_hotplug();
>>>>>>>> +	/* remove memmap entry */
>>>>>>>> +	firmware_map_remove(start, start + size - 1, "System RAM");
>>>>>>>> +	unlock_memory_hotplug();
>>>>>>>> +	return 0;
>>>>>>>>
>>>>>>>>      }
>>>>>>>>      EXPORT_SYMBOL_GPL(remove_memory);
>>>>>>>> Index: linux-3.5-rc4/include/linux/firmware-map.h
>>>>>>>> ===================================================================
>>>>>>>> --- linux-3.5-rc4.orig/include/linux/firmware-map.h	2012-07-03 14:21:45.766421116 +0900
>>>>>>>> +++ linux-3.5-rc4/include/linux/firmware-map.h	2012-07-03 14:22:03.550198789 +0900
>>>>>>>> @@ -25,6 +25,7 @@
>>>>>>>>
>>>>>>>>      int firmware_map_add_early(u64 start, u64 end, const char *type);
>>>>>>>>      int firmware_map_add_hotplug(u64 start, u64 end, const char *type);
>>>>>>>> +int firmware_map_remove(u64 start, u64 end, const char *type);
>>>>>>>>
>>>>>>>>      #else /* CONFIG_FIRMWARE_MEMMAP */
>>>>>>>>
>>>>>>>> @@ -38,6 +39,11 @@ static inline int firmware_map_add_hotpl
>>>>>>>>      	return 0;
>>>>>>>>      }
>>>>>>>>
>>>>>>>> +static inline int firmware_map_remove(u64 start, u64 end, const char *type)
>>>>>>>> +{
>>>>>>>> +	return 0;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>      #endif /* CONFIG_FIRMWARE_MEMMAP */
>>>>>>>>
>>>>>>>>      #endif /* _LINUX_FIRMWARE_MAP_H */
>>>>>>>> Index: linux-3.5-rc4/drivers/firmware/memmap.c
>>>>>>>> ===================================================================
>>>>>>>> --- linux-3.5-rc4.orig/drivers/firmware/memmap.c	2012-07-03 14:21:45.761421180 +0900
>>>>>>>> +++ linux-3.5-rc4/drivers/firmware/memmap.c	2012-07-03 14:22:03.569198549 +0900
>>>>>>>> @@ -79,7 +79,16 @@ static const struct sysfs_ops memmap_att
>>>>>>>>      	.show = memmap_attr_show,
>>>>>>>>      };
>>>>>>>>
>>>>>>>> +static void release_firmware_map_entry(struct kobject *kobj)
>>>>>>>> +{
>>>>>>>> +	/*
>>>>>>>> +	 * FIXME : There is no idea.
>>>>>>>> +	 *         How to free the entry which allocated bootmem?
>>>>>>>> +	 */
>>>>>>>
>>>>>>> I find a function free_bootmem(), but I am not sure whether it can work here.
>>>>>>
>>>>>> It cannot work here.
>>>>>>
>>>>>>> Another problem: how to check whether the entry uses bootmem?
>>>>>>
>>>>>> When firmware_map_entry is allocated by kzalloc(), the page has PG_slab.
>>>>>
>>>>> This is not true. In my test, I find the page does not have PG_slab sometimes.
>>>>
>>>> I think that it depends on the allocated size. firmware_map_entry size is
>>>> smaller than PAGE_SIZE. So the page has PG_Slab.
>>>
>>> In my test, I add printk in the function firmware_map_add_hotplug() to display
>>> page's flags. And sometimes the page is not allocated by slab(I use PageSlab()
>>> to verify it).
>>
>> How did you check it? Could you send your debug patch?
> 
> When the memory is not allocated from slab, the flags is 0x10000000008000.

Thank you for sending the patch.
I think a page that does not have PageSlab set is part of a compound page.
So we can check whether the entry was allocated from bootmem as follows:

static void release_firmware_map_entry(struct kobject *kobj)
{
	struct firmware_map_entry *entry = to_memmap_entry(kobj);
	struct page *head_page;

	/* check the head page, since the entry may sit in a compound page */
	head_page = virt_to_head_page(entry);
	if (PageSlab(head_page))
		kfree(entry);
	/* otherwise the entry was allocated from bootmem; nothing to free */
}

Thanks,
Yasuaki Ishimatsu

> 
>  From 8dd51368d6c03edf7edc89cab17441e3741c39c7 Mon Sep 17 00:00:00 2001
> From: Wen Congyang <wency@cn.fujitsu.com>
> Date: Wed, 4 Jul 2012 16:05:26 +0800
> Subject: [PATCH] debug
> 
> ---
>   drivers/firmware/memmap.c |    7 +++++++
>   1 files changed, 7 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/firmware/memmap.c b/drivers/firmware/memmap.c
> index adc0710..993ba3f 100644
> --- a/drivers/firmware/memmap.c
> +++ b/drivers/firmware/memmap.c
> @@ -21,6 +21,7 @@
>   #include <linux/types.h>
>   #include <linux/bootmem.h>
>   #include <linux/slab.h>
> +#include <linux/mm.h>
>   
>   /*
>    * Data types ------------------------------------------------------------------
> @@ -160,11 +161,17 @@ static int add_sysfs_fw_map_entry(struct firmware_map_entry *entry)
>   int __meminit firmware_map_add_hotplug(u64 start, u64 end, const char *type)
>   {
>   	struct firmware_map_entry *entry;
> +	struct page *entry_page;
>   
>   	entry = kzalloc(sizeof(struct firmware_map_entry), GFP_ATOMIC);
>   	if (!entry)
>   		return -ENOMEM;
>   
> +	entry_page = virt_to_page(entry);
> +	printk(KERN_WARNING "flags: %lx\n", entry_page->flags);
> +	if (PageSlab(entry_page)) {
> +		printk(KERN_WARNING "page is allocated from slab\n");
> +	}
>   	firmware_map_add_entry(start, end, type, entry);
>   	/* create the memmap entry */
>   	add_sysfs_fw_map_entry(entry);
> 


* Re: linux-next: boot failure in next-20120705 and 20120706
From: Benjamin Herrenschmidt @ 2012-07-09  5:19 UTC
  To: Stephen Rothwell; +Cc: linux-next, ppc-dev, LKML
In-Reply-To: <20120709144946.ada290480c0edc6a99f9c9c6@canb.auug.org.au>

On Mon, 2012-07-09 at 14:49 +1000, Stephen Rothwell wrote:
> Hi all,
> 
> Boot testing next-20120705 and 20120706 one of my PowerPC machines get this BUG:
> 
> (this one just after the console login prompt appeared.)

I'll have a look tomorrow. Make sure you keep the .config & machine at
hand :-) Any chance you can check with just powerpc-next ?

Cheers,
Ben.

> kernel BUG at arch/powerpc/kernel/irq.c:188!
> Oops: Exception in kernel mode, sig: 5 [#1]
> SMP NR_CPUS=32 NUMA pSeries
> Modules linked in: binfmt_misc dm_mirror dm_region_hash dm_log ibmveth
> NIP: c00000000000f7d4 LR: c00000000000f838 CTR: c0000000007222a0
> REGS: c000000000d17a30 TRAP: 0700   Not tainted  (3.5.0-rc5-autokern1)
> MSR: 8000000000021032 <SF,ME,IR,DR,RI>  CR: 24000048  XER: 00000000
> SOFTE: 0
> TASK = c000000000c41a10[0] 'swapper/0' THREAD: c000000000d14000 CPU: 0
> GPR00: 0000000000000001 c000000000d17cb0 c000000000d12d10 0000000000000500 
> GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> GPR08: 001752557ab9dd5b c000000007ffb000 0000000000000000 c000000000bf5200 
> GPR12: 0000000000000000 c000000007ffb000 0000000002b00000 00000000009d7000 
> GPR16: 0000000000d454f0 00000000027dc6f0 0000000001b5f994 0000000000000060 
> GPR20: 0000000000000000 ffffffffffffffff ffffffffffffffff 8000000000009032 
> GPR24: c000000000d14000 c000000000d30290 c000000000d30178 c000000000d30290 
> GPR28: c0000000008a6180 0000000000000008 c000000000c7e130 0000000000000001 
> NIP [c00000000000f7d4] .__check_irq_replay+0xc4/0xf0
> LR [c00000000000f838] .arch_local_irq_restore+0x38/0x90
> Call Trace:
> [c000000000d17cb0] [c000000000d17d30] init_thread_union+0x3d30/0x4000 (unreliable)
> [c000000000d17d40] [c00000000000f838] .arch_local_irq_restore+0x38/0x90
> [c000000000d17db0] [c000000000015c60] .cpu_idle+0x230/0x2b0
> [c000000000d17e70] [c00000000000b3f8] .rest_init+0x88/0xa0
> [c000000000d17ef0] [c000000000b86c0c] .start_kernel+0x478/0x498
> [c000000000d17f90] [c0000000000096f8] .start_here_common+0x20/0x28
> Instruction dump:
> 4e800020 7da96b78 7beaf7e3 880901f3 38600500 5400e87e 5400183e 980901f3 
> 4082ffcc 880d01f3 7c0000d0 78000fe0 <0b000000> 38600000 4bffffb4 38610070 
> 
> This is:
> 	BUG_ON(local_paca->irq_happened != 0);
> in __check_irq_replay().
> 
> This only happens on my Power5+ machine.

^ permalink raw reply

* linux-next: boot failure in next-20120705 and 20120706
From: Stephen Rothwell @ 2012-07-09  4:49 UTC (permalink / raw)
  To: ppc-dev; +Cc: linux-next, LKML

[-- Attachment #1: Type: text/plain, Size: 2151 bytes --]

Hi all,

Boot testing next-20120705 and 20120706, one of my PowerPC machines gets this BUG:

(this one just after the console login prompt appeared.)

kernel BUG at arch/powerpc/kernel/irq.c:188!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=32 NUMA pSeries
Modules linked in: binfmt_misc dm_mirror dm_region_hash dm_log ibmveth
NIP: c00000000000f7d4 LR: c00000000000f838 CTR: c0000000007222a0
REGS: c000000000d17a30 TRAP: 0700   Not tainted  (3.5.0-rc5-autokern1)
MSR: 8000000000021032 <SF,ME,IR,DR,RI>  CR: 24000048  XER: 00000000
SOFTE: 0
TASK = c000000000c41a10[0] 'swapper/0' THREAD: c000000000d14000 CPU: 0
GPR00: 0000000000000001 c000000000d17cb0 c000000000d12d10 0000000000000500 
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR08: 001752557ab9dd5b c000000007ffb000 0000000000000000 c000000000bf5200 
GPR12: 0000000000000000 c000000007ffb000 0000000002b00000 00000000009d7000 
GPR16: 0000000000d454f0 00000000027dc6f0 0000000001b5f994 0000000000000060 
GPR20: 0000000000000000 ffffffffffffffff ffffffffffffffff 8000000000009032 
GPR24: c000000000d14000 c000000000d30290 c000000000d30178 c000000000d30290 
GPR28: c0000000008a6180 0000000000000008 c000000000c7e130 0000000000000001 
NIP [c00000000000f7d4] .__check_irq_replay+0xc4/0xf0
LR [c00000000000f838] .arch_local_irq_restore+0x38/0x90
Call Trace:
[c000000000d17cb0] [c000000000d17d30] init_thread_union+0x3d30/0x4000 (unreliable)
[c000000000d17d40] [c00000000000f838] .arch_local_irq_restore+0x38/0x90
[c000000000d17db0] [c000000000015c60] .cpu_idle+0x230/0x2b0
[c000000000d17e70] [c00000000000b3f8] .rest_init+0x88/0xa0
[c000000000d17ef0] [c000000000b86c0c] .start_kernel+0x478/0x498
[c000000000d17f90] [c0000000000096f8] .start_here_common+0x20/0x28
Instruction dump:
4e800020 7da96b78 7beaf7e3 880901f3 38600500 5400e87e 5400183e 980901f3 
4082ffcc 880d01f3 7c0000d0 78000fe0 <0b000000> 38600000 4bffffb4 38610070 

This is:
	BUG_ON(local_paca->irq_happened != 0);
in __check_irq_replay().

This only happens on my Power5+ machine.
-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH 1/4] powerpc/perf: Create mmcra_sihv/mmcra_sipr helpers
From: Anshuman Khandual @ 2012-07-09  4:20 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: sukadev, paulus, linuxppc-dev
In-Reply-To: <20120626210013.2fbb9044@kryten>

On Tuesday 26 June 2012 04:30 PM, Anton Blanchard wrote:

> 
> We want to access the MMCRA_SIHV and MMCRA_SIPR bits elsewhere so 
> create mmcra_sihv and mmcra_sipr which hide the differences between
> the old and new layout of the bits.
> 


Hey Anton,

Going further in this direction, we could create wrapper functions that
return the SIHV and SIPR values whether or not they come from the MMCRA
register. That would let us choose between PERF_RECORD_MISC_USER,
PERF_RECORD_MISC_HYPERVISOR and PERF_RECORD_MISC_KERNEL while hiding the
register and bit-layout details.

> Signed-off-by: Anton Blanchard <anton@samba.org>
> ---
> 
> Index: linux-build/arch/powerpc/perf/core-book3s.c
> ===================================================================
> --- linux-build.orig/arch/powerpc/perf/core-book3s.c	2012-06-26 10:26:40.695707845 +1000
> +++ linux-build/arch/powerpc/perf/core-book3s.c	2012-06-26 10:28:53.325958826 +1000
> @@ -116,6 +116,26 @@ static inline void perf_get_data_addr(st
>  		*addrp = mfspr(SPRN_SDAR);
>  }
> 
> +static bool mmcra_sihv(unsigned long mmcra)
> +{
> +	unsigned long sihv = MMCRA_SIHV;
> +
> +	if (ppmu->flags & PPMU_ALT_SIPR)
> +		sihv = POWER6_MMCRA_SIHV;
> +
> +	return !!(mmcra & sihv);
> +}
> +
> +static bool mmcra_sipr(unsigned long mmcra)
> +{
> +	unsigned long sipr = MMCRA_SIPR;
> +
> +	if (ppmu->flags & PPMU_ALT_SIPR)
> +		sipr = POWER6_MMCRA_SIPR;
> +
> +	return !!(mmcra & sipr);
> +}

^ permalink raw reply

* Re: [PATCH SLAB 1/2 v3] duplicate the cache name in SLUB's saved_alias list, SLAB, and SLOB
From: Li Zhong @ 2012-07-09  2:42 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: LKML, Glauber Costa, Pekka Enberg, linux-mm, Paul Mackerras,
	Matt Mackall, PowerPC email list, Wanlong Gao
In-Reply-To: <alpine.DEB.2.00.1207060855320.26441@router.home>

On Fri, 2012-07-06 at 08:56 -0500, Christoph Lameter wrote:
> I thought I posted this a couple of days ago. Would this not fix things
> without having to change all the allocators?

Glauber pointed me to the slab common code patches. I need some more
time to read them, but I now think the slab/slob changes in this v3 are
not needed and can be ignored.

But for the SLUB's saved_alias list issue, I don't think the following
patch helps. Details below: (Maybe I am wrong, as I'm reading the patch
based on the 3.5-rc6 code ...)

> 
> 
> Subject: slub: Dup name earlier in kmem_cache_create
> 
> Dup the name earlier in kmem_cache_create so that alias
> processing is done using the copy of the string and not
> the string itself.
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>
> 
> ---
>  mm/slub.c |   29 ++++++++++++++---------------
>  1 file changed, 14 insertions(+), 15 deletions(-)
> 
> Index: linux-2.6/mm/slub.c
> ===================================================================
> --- linux-2.6.orig/mm/slub.c	2012-06-11 08:49:56.000000000 -0500
> +++ linux-2.6/mm/slub.c	2012-07-03 15:17:37.000000000 -0500
> @@ -3933,8 +3933,12 @@ struct kmem_cache *kmem_cache_create(con
>  	if (WARN_ON(!name))
>  		return NULL;
> 
> +	n = kstrdup(name, GFP_KERNEL);
> +	if (!n)
> +		goto out;
> +
>  	down_write(&slub_lock);
> -	s = find_mergeable(size, align, flags, name, ctor);
> +	s = find_mergeable(size, align, flags, n, ctor);
>  	if (s) {
>  		s->refcount++;
>  		/*

		......
		up_write(&slub_lock);
		return s; 
	}

Here, the function returns without the name string n being kfreed.

But we can't kfree n here, because in sysfs_slab_alias(), if
(slab_state < SYS_FS), the name needs to stay valid until
slab_sysfs_init() has finished adding the entry to sysfs.
		
> @@ -3944,7 +3948,7 @@ struct kmem_cache *kmem_cache_create(con
>  		s->objsize = max(s->objsize, (int)size);
>  		s->inuse = max_t(int, s->inuse, ALIGN(size, sizeof(void *)));
> 
> -		if (sysfs_slab_alias(s, name)) {
> +		if (sysfs_slab_alias(s, n)) {
>  			s->refcount--;
>  			goto err;
>  		}
> @@ -3952,31 +3956,26 @@ struct kmem_cache *kmem_cache_create(con
>  		return s;
>  	}
> 
> -	n = kstrdup(name, GFP_KERNEL);
> -	if (!n)
> -		goto err;
> -
>  	s = kmalloc(kmem_size, GFP_KERNEL);
>  	if (s) {
>  		if (kmem_cache_open(s, n,
>  				size, align, flags, ctor)) {
>  			list_add(&s->list, &slab_caches);
>  			up_write(&slub_lock);
> -			if (sysfs_slab_add(s)) {
> -				down_write(&slub_lock);
> -				list_del(&s->list);
> -				kfree(n);
> -				kfree(s);
> -				goto err;
> -			}
> -			return s;
> +			if (!sysfs_slab_add(s))
> +				return s;
> +
> +			down_write(&slub_lock);
> +			list_del(&s->list);
>  		}
>  		kfree(s);
>  	}
> -	kfree(n);
> +
>  err:
> +	kfree(n);
>  	up_write(&slub_lock);
> 
> +out:
>  	if (flags & SLAB_PANIC)
>  		panic("Cannot create slabcache %s\n", name);
>  	else
> 

^ permalink raw reply

* Re: [PATCH powerpc 2/2] kfree the cache name  of pgtable cache if SLUB is used
From: Li Zhong @ 2012-07-09  1:48 UTC (permalink / raw)
  To: Glauber Costa
  Cc: LKML, Pekka Enberg, linux-mm, Paul Mackerras, Matt Mackall,
	Christoph Lameter, PowerPC email list
In-Reply-To: <4FF6BA39.4000305@parallels.com>

On Fri, 2012-07-06 at 14:13 +0400, Glauber Costa wrote:
> On 07/05/2012 01:29 PM, Li Zhong wrote:
> > On Thu, 2012-07-05 at 12:23 +0400, Glauber Costa wrote:
> >> On 07/05/2012 05:41 AM, Li Zhong wrote:
> >>> On Wed, 2012-07-04 at 16:40 +0400, Glauber Costa wrote:
> >>>> On 07/04/2012 01:00 PM, Li Zhong wrote:
> >>>>> On Tue, 2012-07-03 at 15:36 -0500, Christoph Lameter wrote:
> >>>>>>> Looking through the emails it seems that there is an issue with alias
> >>>>>>> strings. 
> >>>>> To be more precise, there seems no big issue currently. I just wanted to
> >>>>> make following usage of kmem_cache_create (SLUB) possible:
> >>>>>
> >>>>> 	name = some string kmalloced
> >>>>> 	kmem_cache_create(name, ...)
> >>>>> 	kfree(name);
> >>>>
> >>>> Out of curiosity: Why?
> >>>> This is not (currently) possible with the other allocators (may change
> >>>> with christoph's unification patches), so you would be making your code
> >>>> slub-dependent.
> >>>>
> >>>
> >>> For slub itself, I think it's not good that: in some cases, the name
> >>> string could be kfreed ( if it was kmalloced ) immediately after calling
> >>> the cache create; in some other case, the name string needs to be kept
> >>> valid until some init calls finished. 
> >>>
> >>> I agree with you that it would make the code slub-dependent, so I'm now
> >>> working on the consistency of the other allocators regarding this name
> >>> string duplicating thing. 
> >>
> >> If you really need to kfree the string, or even if it is easier for you
> >> this way, it can be done. As a matter of fact, this is the case for me.
> >> Just that your patch is not enough. Christoph has a patch that makes
> >> this behavior consistent over all allocators.
> > 
> > Sorry, I didn't know that. Seems I don't need to continue the half-done
> > work in slab. If possible, would you please give me a link of the patch?
> > Thank you. 
> > 
> 
> Sorry for the delay. In case you haven't found it out yourself yet:
> 
> http://www.spinics.net/lists/linux-mm/msg36149.html

Thank you. I think it is better to have these things in the
slab_common.c. 

> 
> Please note that this patch as posted has a bug.
> 
> I do believe that your take on the aliasing code adds value to it. But
> as I've already said once, might have to dig a bit deeper in that to get
> to end of the rabbit hole.

With slab_common, I think my slab/slob modifications are not needed any
more. After I understand the common patches, I will check whether the
aliasing problem in slub still exists, and if yes, try to send a patch
based on that. 

^ permalink raw reply

* Re: [PATCH] powerpc: put the gpr save/restore functions in their own section
From: Stephen Rothwell @ 2012-07-08 23:50 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: ppc-dev, Alan Modra
In-Reply-To: <1341644121.30371.2.camel@concordia>

[-- Attachment #1: Type: text/plain, Size: 1790 bytes --]

Hi Michael,

On Sat, 07 Jul 2012 16:55:21 +1000 Michael Ellerman <michael@ellerman.id.au> wrote:
>
> On Fri, 2012-07-06 at 17:09 +1000, Stephen Rothwell wrote:
> > This allows the linker to know that calls to them do not need to switch
> > TOC and stop errors like the following when linking large configurations:
> 
> >  arch/powerpc/lib/crtsavres.S |    5 ++++-
> 
> You didn't make any change to the linker script? How does this section
> get in there? I don't see a .text* anywhere?

Yeah.  ld just seems to figure it out (I note that we also have sections
called .text.unlikely and .text.startup).

> > diff --git a/arch/powerpc/lib/crtsavres.S b/arch/powerpc/lib/crtsavres.S
> > index 1c893f0..b2c68ce 100644
> > --- a/arch/powerpc/lib/crtsavres.S
> > +++ b/arch/powerpc/lib/crtsavres.S
> > @@ -41,12 +41,13 @@
> >  #include <asm/ppc_asm.h>
> >  
> >  	.file	"crtsavres.S"
> > -	.section ".text"
> >  
> >  #ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
> >  
> >  #ifndef CONFIG_PPC64
> >  
> > +	.section ".text"
> > +
> >  /* Routines for saving integer registers, called by the compiler.  */
> >  /* Called with r11 pointing to the stack header word of the caller of the */
> >  /* function, just beyond the end of the integer save area.  */
> > @@ -232,6 +233,8 @@ _GLOBAL(_rest32gpr_31_x)
> >  
> >  #else /* CONFIG_PPC64 */
> >  
> > +	.section ".text.save.restore","ax",@progbits
> > +
> >  .globl	_savegpr0_14
> >  _savegpr0_14:
> >  	std	r14,-144(r1)
> 
> Any reason to not put the 32-bit versions in the same section? AFAICS
> nothing in that file uses the TOC. 

32-bit does not use a TOC (apparently), and I wanted to make the
minimal change to the output.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* AUTO: Michael Barry is out of the office (returning 09/07/2012)
From: Michael Barry @ 2012-07-08 20:41 UTC (permalink / raw)
  To: linuxppc-dev


I am out of the office until 09/07/2012.




Note: This is an automated response to your message  "Linuxppc-dev Digest,
Vol 95, Issue 22" sent on 5/7/2012 3:00:01.

This is the only notification you will receive while this person is away.

^ permalink raw reply

* Re: [PATCH v3] printk: Have printk() never buffer its data
From: Kay Sievers @ 2012-07-08 17:55 UTC (permalink / raw)
  To: Michael Neuling
  Cc: Greg Kroah-Hartman, LKML, Steven Rostedt, Paul E. McKenney,
	linuxppc-dev, Joe Perches, Andrew Morton, Wu Fengguang,
	Linus Torvalds, Ingo Molnar
In-Reply-To: <21892.1341608647@neuling.org>

On Sat, 2012-07-07 at 07:04 +1000, Michael Neuling wrote:
> Whole kmsg below.

I guess I have an idea now what's going on.

> 4,47,0;WARNING: at /scratch/mikey/src/linux-ozlabs/arch/powerpc/sysdev/xics/xics-common.c:105
> 4,51,0;MSR: 9000000000021032 <SF,HV,ME,IR,DR,RI>  CR: 24000042  XER: 22000000
> 4,54,0;TASK = c000000000b2dd80[0] 'swapper/0' THREAD: c000000000c24000 CPU: 0

This is the warning on CPU#1, all fine, all in one line.

> 6,74,0;console [tty0] enabled
> 6,75,0;console [hvc0] enabled

Now the boot consoles are registered, which replays the whole buffer
that was collected up to this point. During the entire time the console
semaphore needs to be held, and this can be quite a while.

> 4,87,24545;WARNING: at /scratch/mikey/src/linux-ozlabs/arch/powerpc/sysdev/xics/xics-common.c:105
> \4,91,24586;MSR: 9000000000021032 
> 4,92,24590;<
> 4,93,24594;SF
> 4,94,24599;,HV
> 4,95,24604;,ME
> 4,96,24609;,IR
> 4,97,24614;,DR
> 4,98,24619;,RI
> 4,99,24623;>
> 4,104,24661; CPU: 1

At the same time the CPU#2 prints the same warning with a continuation
line, but the buffer from CPU#1 can not be flushed to the console, nor
can the continuation line printk()s from CPU#2 be merged at this point.
The consoles are still locked and busy with replaying the old log
messages, so the new continuation data is just stored away in the record
buffer as it is coming in.
If the console were registered a bit earlier, or the warning happened a
bit later, we would probably not see any of this.

I can fake something like this just by holding the console semaphore
over a longer time and printing continuation lines with different CPUs
in a row.

The patch below seems to work for me. It is also here:
  http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD

It only applies cleanly on top of this patch:
  http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-syslog-1-byte-read.patch;hb=HEAD

Thanks,
Kay


Subject: kmsg: merge continuation records while printing

In the (unlikely) case that our continuation merge buffer is busy, we unfortunately
cannot merge further continuation printk()s into a single record and have to
store them separately, which leads to split-up output of these lines when they
are printed.

Add some flags about newlines and prefix existence to these records and try to
reconstruct the full line again, when the separated records are printed.
---
 kernel/printk.c |  119 ++++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 77 insertions(+), 42 deletions(-)

--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -194,8 +194,10 @@ static int console_may_schedule;
  */
 
 enum log_flags {
-	LOG_DEFAULT = 0,
-	LOG_NOCONS = 1,		/* already flushed, do not print to console */
+	LOG_NOCONS	= 1,	/* already flushed, do not print to console */
+	LOG_NEWLINE	= 2,	/* text ended with a newline */
+	LOG_PREFIX	= 4,	/* text started with a prefix */
+	LOG_CONT	= 8,	/* text is a fragment of a continuation line */
 };
 
 struct log {
@@ -217,6 +219,7 @@ static DEFINE_RAW_SPINLOCK(logbuf_lock);
 /* the next printk record to read by syslog(READ) or /proc/kmsg */
 static u64 syslog_seq;
 static u32 syslog_idx;
+static enum log_flags syslog_prev;
 static size_t syslog_partial;
 
 /* index and sequence number of the first record stored in the buffer */
@@ -839,8 +842,8 @@ static size_t print_prefix(const struct
 	return len;
 }
 
-static size_t msg_print_text(const struct log *msg, bool syslog,
-			     char *buf, size_t size)
+static size_t msg_print_text(const struct log *msg, enum log_flags prev,
+			     bool syslog, char *buf, size_t size)
 {
 	const char *text = log_text(msg);
 	size_t text_size = msg->text_len;
@@ -849,6 +852,8 @@ static size_t msg_print_text(const struc
 	do {
 		const char *next = memchr(text, '\n', text_size);
 		size_t text_len;
+		bool prefix = true;
+		bool newline = true;
 
 		if (next) {
 			text_len = next - text;
@@ -858,19 +863,35 @@ static size_t msg_print_text(const struc
 			text_len = text_size;
 		}
 
+		if ((prev & LOG_CONT) && !(msg->flags & LOG_PREFIX))
+			prefix = false;
+
+		if (msg->flags & LOG_CONT) {
+			if ((prev & LOG_CONT) && !(prev & LOG_NEWLINE))
+				prefix = false;
+
+			if (!(msg->flags & LOG_NEWLINE))
+				newline = false;
+		}
+
 		if (buf) {
 			if (print_prefix(msg, syslog, NULL) +
 			    text_len + 1>= size - len)
 				break;
 
-			len += print_prefix(msg, syslog, buf + len);
+			if (prefix)
+				len += print_prefix(msg, syslog, buf + len);
 			memcpy(buf + len, text, text_len);
 			len += text_len;
-			buf[len++] = '\n';
+			if (newline)
+				buf[len++] = '\n';
 		} else {
 			/* SYSLOG_ACTION_* buffer size only calculation */
-			len += print_prefix(msg, syslog, NULL);
-			len += text_len + 1;
+			if (prefix)
+				len += print_prefix(msg, syslog, NULL);
+			len += text_len;
+			if (newline)
+				len++;
 		}
 
 		text = next;
@@ -898,6 +919,7 @@ static int syslog_print(char __user *buf
 			/* messages are gone, move to first one */
 			syslog_seq = log_first_seq;
 			syslog_idx = log_first_idx;
+			syslog_prev = 0;
 			syslog_partial = 0;
 		}
 		if (syslog_seq == log_next_seq) {
@@ -907,11 +929,12 @@ static int syslog_print(char __user *buf
 
 		skip = syslog_partial;
 		msg = log_from_idx(syslog_idx);
-		n = msg_print_text(msg, true, text, LOG_LINE_MAX);
+		n = msg_print_text(msg, syslog_prev, true, text, LOG_LINE_MAX);
 		if (n - syslog_partial <= size) {
 			/* message fits into buffer, move forward */
 			syslog_idx = log_next(syslog_idx);
 			syslog_seq++;
+			syslog_prev = msg->flags;
 			n -= syslog_partial;
 			syslog_partial = 0;
 		} else if (!len){
@@ -954,6 +977,7 @@ static int syslog_print_all(char __user
 		u64 next_seq;
 		u64 seq;
 		u32 idx;
+		enum log_flags prev;
 
 		if (clear_seq < log_first_seq) {
 			/* messages are gone, move to first available one */
@@ -967,10 +991,11 @@ static int syslog_print_all(char __user
 		 */
 		seq = clear_seq;
 		idx = clear_idx;
+		prev = 0;
 		while (seq < log_next_seq) {
 			struct log *msg = log_from_idx(idx);
 
-			len += msg_print_text(msg, true, NULL, 0);
+			len += msg_print_text(msg, prev, true, NULL, 0);
 			idx = log_next(idx);
 			seq++;
 		}
@@ -978,10 +1003,11 @@ static int syslog_print_all(char __user
 		/* move first record forward until length fits into the buffer */
 		seq = clear_seq;
 		idx = clear_idx;
+		prev = 0;
 		while (len > size && seq < log_next_seq) {
 			struct log *msg = log_from_idx(idx);
 
-			len -= msg_print_text(msg, true, NULL, 0);
+			len -= msg_print_text(msg, prev, true, NULL, 0);
 			idx = log_next(idx);
 			seq++;
 		}
@@ -990,17 +1016,19 @@ static int syslog_print_all(char __user
 		next_seq = log_next_seq;
 
 		len = 0;
+		prev = 0;
 		while (len >= 0 && seq < next_seq) {
 			struct log *msg = log_from_idx(idx);
 			int textlen;
 
-			textlen = msg_print_text(msg, true, text, LOG_LINE_MAX);
+			textlen = msg_print_text(msg, prev, true, text, LOG_LINE_MAX);
 			if (textlen < 0) {
 				len = textlen;
 				break;
 			}
 			idx = log_next(idx);
 			seq++;
+			prev = msg->flags;
 
 			raw_spin_unlock_irq(&logbuf_lock);
 			if (copy_to_user(buf + len, text, textlen))
@@ -1013,6 +1041,7 @@ static int syslog_print_all(char __user
 				/* messages are gone, move to next one */
 				seq = log_first_seq;
 				idx = log_first_idx;
+				prev = 0;
 			}
 		}
 	}
@@ -1117,6 +1146,7 @@ int do_syslog(int type, char __user *buf
 			/* messages are gone, move to first one */
 			syslog_seq = log_first_seq;
 			syslog_idx = log_first_idx;
+			syslog_prev = 0;
 			syslog_partial = 0;
 		}
 		if (from_file) {
@@ -1127,18 +1157,18 @@ int do_syslog(int type, char __user *buf
 			 */
 			error = log_next_idx - syslog_idx;
 		} else {
-			u64 seq;
-			u32 idx;
+			u64 seq = syslog_seq;
+			u32 idx = syslog_idx;
+			enum log_flags prev = syslog_prev;
 
 			error = 0;
-			seq = syslog_seq;
-			idx = syslog_idx;
 			while (seq < log_next_seq) {
 				struct log *msg = log_from_idx(idx);
 
-				error += msg_print_text(msg, true, NULL, 0);
+				error += msg_print_text(msg, prev, true, NULL, 0);
 				idx = log_next(idx);
 				seq++;
+				prev = msg->flags;
 			}
 			error -= syslog_partial;
 		}
@@ -1408,10 +1438,9 @@ asmlinkage int vprintk_emit(int facility
 	static char textbuf[LOG_LINE_MAX];
 	char *text = textbuf;
 	size_t text_len;
+	enum log_flags lflags = 0;
 	unsigned long flags;
 	int this_cpu;
-	bool newline = false;
-	bool prefix = false;
 	int printed_len = 0;
 
 	boot_delay_msec();
@@ -1450,7 +1479,7 @@ asmlinkage int vprintk_emit(int facility
 		recursion_bug = 0;
 		printed_len += strlen(recursion_msg);
 		/* emit KERN_CRIT message */
-		log_store(0, 2, LOG_DEFAULT, 0,
+		log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0,
 			  NULL, 0, recursion_msg, printed_len);
 	}
 
@@ -1463,7 +1492,7 @@ asmlinkage int vprintk_emit(int facility
 	/* mark and strip a trailing newline */
 	if (text_len && text[text_len-1] == '\n') {
 		text_len--;
-		newline = true;
+		lflags |= LOG_NEWLINE;
 	}
 
 	/* strip syslog prefix and extract log level or control flags */
@@ -1473,7 +1502,7 @@ asmlinkage int vprintk_emit(int facility
 			if (level == -1)
 				level = text[1] - '0';
 		case 'd':	/* KERN_DEFAULT */
-			prefix = true;
+			lflags |= LOG_PREFIX;
 		case 'c':	/* KERN_CONT */
 			text += 3;
 			text_len -= 3;
@@ -1483,22 +1512,20 @@ asmlinkage int vprintk_emit(int facility
 	if (level == -1)
 		level = default_message_loglevel;
 
-	if (dict) {
-		prefix = true;
-		newline = true;
-	}
+	if (dict)
+		lflags |= LOG_PREFIX|LOG_NEWLINE;
 
-	if (!newline) {
+	if (!(lflags & LOG_NEWLINE)) {
 		/*
 		 * Flush the conflicting buffer. An earlier newline was missing,
 		 * or another task also prints continuation lines.
 		 */
-		if (cont.len && (prefix || cont.owner != current))
+		if (cont.len && (lflags & LOG_PREFIX || cont.owner != current))
 			cont_flush();
 
 		/* buffer line if possible, otherwise store it right away */
 		if (!cont_add(facility, level, text, text_len))
-			log_store(facility, level, LOG_DEFAULT, 0,
+			log_store(facility, level, lflags | LOG_CONT, 0,
 				  dict, dictlen, text, text_len);
 	} else {
 		bool stored = false;
@@ -1510,13 +1537,13 @@ asmlinkage int vprintk_emit(int facility
 		 * flush it out and store this line separately.
 		 */
 		if (cont.len && cont.owner == current) {
-			if (!prefix)
+			if (!(lflags & LOG_PREFIX))
 				stored = cont_add(facility, level, text, text_len);
 			cont_flush();
 		}
 
 		if (!stored)
-			log_store(facility, level, LOG_DEFAULT, 0,
+			log_store(facility, level, lflags, 0,
 				  dict, dictlen, text, text_len);
 	}
 	printed_len += text_len;
@@ -1615,8 +1642,8 @@ static struct cont {
 static struct log *log_from_idx(u32 idx) { return NULL; }
 static u32 log_next(u32 idx) { return 0; }
 static void call_console_drivers(int level, const char *text, size_t len) {}
-static size_t msg_print_text(const struct log *msg, bool syslog,
-			     char *buf, size_t size) { return 0; }
+static size_t msg_print_text(const struct log *msg, enum log_flags prev,
+			     bool syslog, char *buf, size_t size) { return 0; }
 static size_t cont_print_text(char *text, size_t size) { return 0; }
 
 #endif /* CONFIG_PRINTK */
@@ -1892,6 +1919,7 @@ void wake_up_klogd(void)
 /* the next printk record to write to the console */
 static u64 console_seq;
 static u32 console_idx;
+static enum log_flags console_prev;
 
 /**
  * console_unlock - unlock the console system
@@ -1952,6 +1980,7 @@ again:
 			/* messages are gone, move to first one */
 			console_seq = log_first_seq;
 			console_idx = log_first_idx;
+			console_prev = 0;
 		}
 skip:
 		if (console_seq == log_next_seq)
@@ -1975,10 +2004,11 @@ skip:
 		}
 
 		level = msg->level;
-		len = msg_print_text(msg, false, text, sizeof(text));
-
+		len = msg_print_text(msg, console_prev, false,
+				     text, sizeof(text));
 		console_idx = log_next(console_idx);
 		console_seq++;
+		console_prev = msg->flags;
 		raw_spin_unlock(&logbuf_lock);
 
 		stop_critical_timings();	/* don't trace print latency */
@@ -2241,6 +2271,7 @@ void register_console(struct console *ne
 		raw_spin_lock_irqsave(&logbuf_lock, flags);
 		console_seq = syslog_seq;
 		console_idx = syslog_idx;
+		console_prev = syslog_prev;
 		raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 		/*
 		 * We're about to replay the log buffer.  Only do this to the
@@ -2534,8 +2565,7 @@ bool kmsg_dump_get_line(struct kmsg_dump
 	}
 
 	msg = log_from_idx(dumper->cur_idx);
-	l = msg_print_text(msg, syslog,
-			      line, size);
+	l = msg_print_text(msg, 0, syslog, line, size);
 
 	dumper->cur_idx = log_next(dumper->cur_idx);
 	dumper->cur_seq++;
@@ -2575,6 +2605,7 @@ bool kmsg_dump_get_buffer(struct kmsg_du
 	u32 idx;
 	u64 next_seq;
 	u32 next_idx;
+	enum log_flags prev;
 	size_t l = 0;
 	bool ret = false;
 
@@ -2597,23 +2628,27 @@ bool kmsg_dump_get_buffer(struct kmsg_du
 	/* calculate length of entire buffer */
 	seq = dumper->cur_seq;
 	idx = dumper->cur_idx;
+	prev = 0;
 	while (seq < dumper->next_seq) {
 		struct log *msg = log_from_idx(idx);
 
-		l += msg_print_text(msg, true, NULL, 0);
+		l += msg_print_text(msg, prev, true, NULL, 0);
 		idx = log_next(idx);
 		seq++;
+		prev = msg->flags;
 	}
 
 	/* move first record forward until length fits into the buffer */
 	seq = dumper->cur_seq;
 	idx = dumper->cur_idx;
+	prev = 0;
 	while (l > size && seq < dumper->next_seq) {
 		struct log *msg = log_from_idx(idx);
 
-		l -= msg_print_text(msg, true, NULL, 0);
+		l -= msg_print_text(msg, prev, true, NULL, 0);
 		idx = log_next(idx);
 		seq++;
+		prev = msg->flags;
 	}
 
 	/* last message in next interation */
@@ -2621,14 +2656,14 @@ bool kmsg_dump_get_buffer(struct kmsg_du
 	next_idx = idx;
 
 	l = 0;
+	prev = 0;
 	while (seq < dumper->next_seq) {
 		struct log *msg = log_from_idx(idx);
 
-		l += msg_print_text(msg, syslog,
-				    buf + l, size - l);
-
+		l += msg_print_text(msg, prev, syslog, buf + l, size - l);
 		idx = log_next(idx);
 		seq++;
+		prev = msg->flags;
 	}
 
 	dumper->next_seq = next_seq;

^ permalink raw reply

* RE: [Qemu-ppc] [RFC PATCH 12/17] PowerPC: booke64: Add DO_KVM kernel hooks
From: Caraman Mihai Claudiu-B02008 @ 2012-07-07  8:39 UTC (permalink / raw)
  To: Alexander Graf, Benjamin Herrenschmidt
  Cc: qemu-ppc@nongnu.org List, linuxppc-dev, KVM list,
	<kvm-ppc@vger.kernel.org>
In-Reply-To: <7305F50A-8E77-4E88-8EB8-4046A7E94DF9@suse.de>

>________________________________________=0A=
>From: Alexander Graf [agraf@suse.de]=0A=
>Sent: Saturday, July 07, 2012 2:11 AM=0A=
>To: Caraman Mihai Claudiu-B02008=0A=
>Cc: Benjamin Herrenschmidt; <kvm-ppc@vger.kernel.org>; KVM list; linuxppc-=
dev; qemu-ppc@nongnu.org List=0A=
>Subject: Re: [Qemu-ppc] [RFC PATCH 12/17] PowerPC: booke64: Add DO_KVM ker=
nel hooks=0A=
>=0A=
>On 07.07.2012, at 00:33, Caraman Mihai Claudiu-B02008 wrote:=0A=
>=0A=
>>> -----Original Message-----=0A=
>>> From: Benjamin Herrenschmidt [mailto:benh@kernel.crashing.org]=0A=
>>> Sent: Thursday, July 05, 2012 1:26 AM=0A=
>>> To: Alexander Graf=0A=
>>> Cc: Caraman Mihai Claudiu-B02008; <kvm-ppc@vger.kernel.org>; KVM list;=
=0A=
>>> linuxppc-dev; qemu-ppc@nongnu.org List=0A=
>>> Subject: Re: [Qemu-ppc] [RFC PATCH 12/17] PowerPC: booke64: Add DO_KVM=
=0A=
>>> kernel hooks=0A=
>>>=0A=
>>> You can't but in any case I don't see the point of the conditional here=
,=0A=
>>> we'll eventually have to load srr1 no ? We can move the load up to here=
=0A=
>>> in all cases or can't we ?=0A=
>>=0A=
>> I like the idea, but there is a problem with addition macros which may c=
lobber=0A=
>> r11 and PROLOG_ADDITION_MASKABLE_GEN is such a case.=0A=
>=0A=
>Mike -v please :)=0A=
=0A=
Ben suggested something like this:=0A=
	=0A=
 #define EXCEPTION_PROLOG(n, type, addition) \=0A=
 	mtspr SPRN_SPRG_##type##_SCRATCH,r13; /* get spare registers */ \=0A=
 	mfspr r13,SPRN_SPRG_PACA; /* get PACA */ \=0A=
 	std r10,PACA_EX##type+EX_R10(r13); \=0A=
 	std r11,PACA_EX##type+EX_R11(r13); \=0A=
 	mfcr r10; /* save CR */ \	=0A=
+	mfspr r11,SPRN_##type##_SRR1;/* what are we coming from */ \=0A=
	DO_KVM	intnum,srr1; \=0A=
 	addition; /* additional code for that exc. */ \=0A=
 	std r1,PACA_EX##type+EX_R1(r13); /* save old r1 in the PACA */ \=0A=
 	stw r10,PACA_EX##type+EX_CR(r13); /* save old CR in the PACA */ \=0A=
-	mfspr r11,SPRN_##type##_SRR1;/* what are we coming from */ \=0A=
 	type##_SET_KSTACK; /* get special stack if necessary */\=0A=
 	andi. r10,r11,MSR_PR; /* save stack pointer */ \=0A=
=0A=
But one of the addition looks like this:=0A=
	=0A=
 #define PROLOG_ADDITION_MASKABLE_GEN(n) \=0A=
 	lbz r11,PACASOFTIRQEN(r13); /* are irqs soft-disabled ? */ \=0A=
	cmpwi cr0,r11,0; /* yes -> go out of line */ \=0A=
	beq masked_interrupt_book3e_##n	=0A=
=0A=
So for maskable gen we end up with:=0A=
=0A=
 #define EXCEPTION_PROLOG(n, type, addition) \=0A=
 	mtspr SPRN_SPRG_##type##_SCRATCH,r13; /* get spare registers */ \=0A=
 	mfspr r13,SPRN_SPRG_PACA; /* get PACA */ \=0A=
 	std r10,PACA_EX##type+EX_R10(r13); \=0A=
 	std r11,PACA_EX##type+EX_R11(r13); \=0A=
 	mfcr r10; /* save CR */ \=0A=
	mfspr r11,SPRN_##type##_SRR1;/* what are we coming from */ \=0A=
 	DO_KVM	intnum,srr1; \=0A=
	lbz r11,PACASOFTIRQEN(r13); /* are irqs soft-disabled ? */ \=0A=
	cmpwi cr0,r11,0; /* yes -> go out of line */ \=0A=
	beq masked_interrupt_book3e_##n	\=0A=
	std r1,PACA_EX##type+EX_R1(r13); /* save old r1 in the PACA */ \=0A=
 	stw r10,PACA_EX##type+EX_CR(r13); /* save old CR in the PACA */ \=0A=
 	type##_SET_KSTACK; /* get special stack if necessary */\=0A=
 	andi. r10,r11,MSR_PR; /* save stack pointer */ \=0A=
	=0A=
This affects the last asm line: we load srr1 into r11 but clobber it in between.
We need a spare register for the maskable gen addition. I think we can free r10 sooner
and use it in the addition like this:

 #define EXCEPTION_PROLOG(n, type, addition) \
 	mtspr SPRN_SPRG_##type##_SCRATCH,r13; /* get spare registers */ \
 	mfspr r13,SPRN_SPRG_PACA; /* get PACA */ \
 	std r10,PACA_EX##type+EX_R10(r13); \
 	std r11,PACA_EX##type+EX_R11(r13); \
+	mfspr r11,SPRN_##type##_SRR1;/* what are we coming from */ \
	mfcr r10; /* save CR */ \
+ 	stw r10,PACA_EX##type+EX_CR(r13); /* save old CR in the PACA */ \
 	DO_KVM	intnum,srr1; \
-	lbz r11,PACASOFTIRQEN(r13); /* are irqs soft-disabled ? */ \
-	cmpwi cr0,r11,0; /* yes -> go out of line */ \
+	lbz r10,PACASOFTIRQEN(r13); /* are irqs soft-disabled ? */ \
+	cmpwi cr0,r10,0; /* yes -> go out of line */ \
	beq masked_interrupt_book3e_##n	\
 	std r1,PACA_EX##type+EX_R1(r13); /* save old r1 in the PACA */ \
- 	stw r10,PACA_EX##type+EX_CR(r13); /* save old CR in the PACA */ \
-	mfspr r11,SPRN_##type##_SRR1;/* what are we coming from */ \
 	type##_SET_KSTACK; /* get special stack if necessary */\
 	andi. r10,r11,MSR_PR; /* save stack pointer */ \

-Mike
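
The clobber described in this message can be modelled outside the kernel. The following is only a sketch in Python, not kernel code: the function run_prolog and its parameters are hypothetical, and it models just the three steps that matter (SRR1 loaded into r11, the addition's lbz into a scratch register, and the final andi. against MSR_PR). It assumes MSR_PR is bit 14 (0x4000), as in the Linux powerpc headers.

```python
MSR_PR = 0x4000  # problem-state (user mode) bit of the MSR, assumed 1 << 14


def run_prolog(srr1, soft_irq_enabled, addition_scratch):
    """Model the tail of EXCEPTION_PROLOG for the maskable case.

    addition_scratch is the register the addition uses for its lbz/cmpwi
    pair: "r11" (the clobbering order) or "r10" (the proposed order).
    Returns the value computed by the final `andi. r10,r11,MSR_PR`.
    """
    regs = {"r10": 0, "r11": 0}
    regs["r11"] = srr1                          # mfspr r11,SPRN_##type##_SRR1
    regs[addition_scratch] = soft_irq_enabled   # lbz rX,PACASOFTIRQEN(r13)
    # The out-of-line branch is not taken when irqs are soft-enabled,
    # so execution falls through to the MSR_PR test.
    return regs["r11"] & MSR_PR                 # andi. r10,r11,MSR_PR


srr1_from_user = 0xC000  # hypothetical SRR1 value with MSR_PR set
clobbered = run_prolog(srr1_from_user, 1, "r11")
preserved = run_prolog(srr1_from_user, 1, "r10")
print(hex(clobbered), hex(preserved))
```

With r11 as the scratch register the test sees the soft-enable byte instead of SRR1, so an exception taken from user mode looks as if it came from the kernel. With r10 as the scratch register the SRR1 value survives, which is why the proposal stores CR to the PACA before reusing r10.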

^ permalink raw reply

* Re: [PATCH] powerpc: put the gpr sabe/restore functions in their own section
From: Michael Ellerman @ 2012-07-07  6:55 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: ppc-dev, Alan Modra
In-Reply-To: <20120706170940.e3457b04a58ddd18d963c14a@canb.auug.org.au>

On Fri, 2012-07-06 at 17:09 +1000, Stephen Rothwell wrote:
> This allows the linker to know that calls to them do not need to switch
> TOC and stop errors like the following when linking large configurations:

>  arch/powerpc/lib/crtsavres.S |    5 ++++-

You didn't make any change to the linker script? How does this section
get in there? I don't see a .text* anywhere?

> diff --git a/arch/powerpc/lib/crtsavres.S b/arch/powerpc/lib/crtsavres.S
> index 1c893f0..b2c68ce 100644
> --- a/arch/powerpc/lib/crtsavres.S
> +++ b/arch/powerpc/lib/crtsavres.S
> @@ -41,12 +41,13 @@
>  #include <asm/ppc_asm.h>
>  
>  	.file	"crtsavres.S"
> -	.section ".text"
>  
>  #ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
>  
>  #ifndef CONFIG_PPC64
>  
> +	.section ".text"
> +
>  /* Routines for saving integer registers, called by the compiler.  */
>  /* Called with r11 pointing to the stack header word of the caller of the */
>  /* function, just beyond the end of the integer save area.  */
> @@ -232,6 +233,8 @@ _GLOBAL(_rest32gpr_31_x)
>  
>  #else /* CONFIG_PPC64 */
>  
> +	.section ".text.save.restore","ax",@progbits
> +
>  .globl	_savegpr0_14
>  _savegpr0_14:
>  	std	r14,-144(r1)

Any reason to not put the 32-bit versions in the same section? AFAICS
nothing in that file uses the TOC. 

cheers

^ permalink raw reply

* Re: [Qemu-ppc] [RFC PATCH 12/17] PowerPC: booke64: Add DO_KVM kernel hooks
From: Alexander Graf @ 2012-07-06 23:11 UTC (permalink / raw)
  To: Caraman Mihai Claudiu-B02008
  Cc: qemu-ppc@nongnu.org List, linuxppc-dev, KVM list,
	<kvm-ppc@vger.kernel.org>
In-Reply-To: <300B73AA675FCE4A93EB4FC1D42459FF15CDE6@039-SN2MPN1-013.039d.mgd.msft.net>


On 07.07.2012, at 00:33, Caraman Mihai Claudiu-B02008 wrote:

>> -----Original Message-----
>> From: Benjamin Herrenschmidt [mailto:benh@kernel.crashing.org]
>> Sent: Thursday, July 05, 2012 1:26 AM
>> To: Alexander Graf
>> Cc: Caraman Mihai Claudiu-B02008; <kvm-ppc@vger.kernel.org>; KVM list;
>> linuxppc-dev; qemu-ppc@nongnu.org List
>> Subject: Re: [Qemu-ppc] [RFC PATCH 12/17] PowerPC: booke64: Add DO_KVM
>> kernel hooks
>>
>> On Wed, 2012-07-04 at 16:29 +0200, Alexander Graf wrote:
>>
>>>> +#ifdef CONFIG_KVM_BOOKE_HV
>>>> +#define KVM_BOOKE_HV_MFSPR(reg, spr)				\
>>>> +	BEGIN_FTR_SECTION					\
>>>> +		mfspr	reg, spr;			  	\
>>>> +	END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV)
>>>> +#else
>>>> +#define KVM_BOOKE_HV_MFSPR(reg, spr)
>>>> +#endif
>>>
>>> Bleks - this is ugly. Do we really need to open-code the #ifdef here?
>>> Can't the feature section code determine that the feature is disabled
>>> and just always not include the code?
>>
>> You can't but in any case I don't see the point of the conditional here,
>> we'll eventually have to load srr1 no ? We can move the load up to here
>> in all cases or can't we ?
>
> I like the idea, but there is a problem with addition macros which may clobber
> r11 and PROLOG_ADDITION_MASKABLE_GEN is such a case.

Mike -v please :)


Alex

^ permalink raw reply

* RE: [Qemu-ppc] [RFC PATCH 09/17] KVM: PPC64: booke: Hard disable interrupts when entering guest
From: Caraman Mihai Claudiu-B02008 @ 2012-07-06 23:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Alexander Graf
  Cc: linuxppc-dev, qemu-ppc@nongnu.org List,
	<kvm-ppc@vger.kernel.org>, KVM list
In-Reply-To: <1341440465.16808.38.camel@pasglop>

> -----Original Message-----
> From: Linuxppc-dev [mailto:linuxppc-dev-
> bounces+mihai.caraman=freescale.com@lists.ozlabs.org] On Behalf Of
> Benjamin Herrenschmidt
> Sent: Thursday, July 05, 2012 1:21 AM
> To: Alexander Graf
> Cc: qemu-ppc@nongnu.org List; Caraman Mihai Claudiu-B02008; linuxppc-dev;
> KVM list; <kvm-ppc@vger.kernel.org>
> Subject: Re: [Qemu-ppc] [RFC PATCH 09/17] KVM: PPC64: booke: Hard disable
> interrupts when entering guest
>
> On Wed, 2012-07-04 at 16:14 +0200, Alexander Graf wrote:
> > > +#ifdef CONFIG_64BIT
> > > +#define _hard_irq_disable() hard_irq_disable()
> > > +#else
> > > +#define _hard_irq_disable() local_irq_disable()
> > > +#endif
> >
> > So you only swap out the disable bit, but not the enable one? Ben,
> > would this work out?
>
> hard_irq_disable() both soft and hard disable. local_irq_enable() will
> see that irqs are hard disabled and will hard enable.
>
> However, there's a nastier discrepancy above: local_irq_disable will
> properly inform lockdep that we are disabling, while hard_irq_disable
> won't.
>
> Arguably we might want to fix that inside hard_irq_disable() itself...
>
> Also you need to be careful. If you are coming with interrupts already
> enabled, it's fine, but if you have interrupts soft disabled, then
> you hard disable, before you enter the guest you probably want to
> check if anything was left "pending" and cancel the entering of the
> guest if that is the case.

In which cases can I find interrupts soft disabled if I call local_irq_enable()
ahead? Can this happen when my kernel task is scheduled?

I presume that if I call hard_irq_disable() before entering the guest, a guest exit
will find interrupts soft disabled.

-Mike
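
The soft/hard disable interplay discussed above can be illustrated with a toy state machine. This is a sketch in Python under stated assumptions, not the kernel's implementation: the class LazyIrqModel, its attributes, and the methods interrupt_arrives and may_enter_guest are hypothetical names; only local_irq_disable, local_irq_enable and hard_irq_disable correspond to functions named in the thread, and their behaviour here follows Ben's description only.

```python
class LazyIrqModel:
    """Toy model of soft/hard interrupt disabling (illustrative only)."""

    def __init__(self):
        self.soft_enabled = True
        self.hard_enabled = True
        self.pending = False

    def local_irq_disable(self):
        # Soft disable only: hardware can still deliver an interrupt.
        self.soft_enabled = False

    def hard_irq_disable(self):
        # Both soft and hard disable.
        self.soft_enabled = False
        self.hard_enabled = False

    def interrupt_arrives(self):
        if not self.hard_enabled:
            return "blocked"      # hardware does not deliver it
        if not self.soft_enabled:
            self.pending = True   # remember it for later
            self.hard_enabled = False
            return "masked"
        return "handled"

    def local_irq_enable(self):
        # Sees that irqs are hard disabled and hard enables them again.
        replayed = self.pending
        self.pending = False
        self.soft_enabled = True
        self.hard_enabled = True
        return replayed

    def may_enter_guest(self):
        # Cancel guest entry if something was left pending.
        return not self.pending


model = LazyIrqModel()
model.local_irq_disable()
print(model.interrupt_arrives())   # interrupt recorded while soft-disabled
model.hard_irq_disable()
print(model.may_enter_guest())     # the pending interrupt blocks guest entry
```

In this model, entering with interrupts enabled and calling hard_irq_disable() leaves nothing pending, while a prior soft disable can leave a pending interrupt that has to be checked before guest entry.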

^ permalink raw reply

* RE: [Qemu-ppc] [RFC PATCH 12/17] PowerPC: booke64: Add DO_KVM kernel hooks
From: Caraman Mihai Claudiu-B02008 @ 2012-07-06 22:33 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Alexander Graf
  Cc: qemu-ppc@nongnu.org List, linuxppc-dev, KVM list,
	<kvm-ppc@vger.kernel.org>
In-Reply-To: <1341440735.16808.42.camel@pasglop>

> -----Original Message-----
> From: Benjamin Herrenschmidt [mailto:benh@kernel.crashing.org]
> Sent: Thursday, July 05, 2012 1:26 AM
> To: Alexander Graf
> Cc: Caraman Mihai Claudiu-B02008; <kvm-ppc@vger.kernel.org>; KVM list;
> linuxppc-dev; qemu-ppc@nongnu.org List
> Subject: Re: [Qemu-ppc] [RFC PATCH 12/17] PowerPC: booke64: Add DO_KVM
> kernel hooks
> 
> On Wed, 2012-07-04 at 16:29 +0200, Alexander Graf wrote:
> 
> > > +#ifdef CONFIG_KVM_BOOKE_HV
> > > +#define KVM_BOOKE_HV_MFSPR(reg, spr)				\
> > > +	BEGIN_FTR_SECTION					\
> > > +		mfspr	reg, spr;			  	\
> > > +	END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV)
> > > +#else
> > > +#define KVM_BOOKE_HV_MFSPR(reg, spr)
> > > +#endif
> >
> > Bleks - this is ugly. Do we really need to open-code the #ifdef here?
> > Can't the feature section code determine that the feature is disabled
> > and just always not include the code?
> 
> You can't but in any case I don't see the point of the conditional here,
> we'll eventually have to load srr1 no ? We can move the load up to here
> in all cases or can't we ? 

I like the idea, but there is a problem with addition macros which may clobber
r11 and PROLOG_ADDITION_MASKABLE_GEN is such a case.

> If really not, we could have it inside DO_KVM and be done with it no ?

32-bit exception prolog loads srr1 unconditionally, as Alex and Scott mentioned
earlier, so we will be suboptimal for this case.

-Mike

^ permalink raw reply

* Re: [PATCH 2/3] powerpc/e500: add paravirt QEMU platform
From: Scott Wood @ 2012-07-06 22:04 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev
In-Reply-To: <0C067161-2FF3-4EF6-BC4D-B3C93828ED36@suse.de>

On 07/06/2012 11:59 AM, Alexander Graf wrote:
> 
> On 06.07.2012, at 18:52, Scott Wood wrote:
> 
>> On 07/06/2012 11:30 AM, Alexander Graf wrote:
>>> 
>>> On 06.07.2012, at 18:25, Scott Wood wrote:
>>> 
>>>> Then what would we do if we want to add an ePAPR virtual PIC
>>>> instead? Or if something replaces MPIC on future FSL chips?
>>> 
>>> Then we need a different compatible anyways, because we wouldn't
>>> be backwards compatible, no?
>> 
>> No, that's exactly what I'm trying to avoid.  This notion of a
>> toplevel compatible that tells you everything you need to know
>> about the machine (even if Linux chooses to be device-tree-based
>> for some arbitrary subset of that information) is incompatible with
>> a flexible virtual platform.
>> 
>> All this compatible is saying is "see the rest of the device
>> tree". How well Linux does so is a quality of implementation issue
>> that can be addressed as needed.  The information about what sort
>> of interrupt controller you have is already in the device tree.
>> The device tree is the machine spec.
>> 
>> Another assumption this patch makes is that it doesn't need
>> SWIOTLB.  Is "has more than 4GiB RAM" a machine attribute that
>> would warrant a separate toplevel compatible?  SWIOTLB for PCI is
>> handled due to the previous patch that provides common PCI code --
>> but in a previous version of the patch it was not handled.  Is it
>> yet another incompatible machine spec if RAM must be less than 4GiB
>> minus PCICSRBAR (ignoring the QEMU bug that PCICSRBAR is not
>> implemented)?
> 
> Well, the thing that I'm wary of is the following. Imagine we make
> this the default machine type for all e500 user cases. Which is
> reasonable. Now we release 3.6 which works awesome with QEMU 1.2. We
> change something in QEMU. QEMU 1.3 comes out. It can no longer boot
> your old kernel 3.6.

Do you expect your old kernel to boot when you get new hardware?  QEMU
is basically hardware that is easy to change.

The only thing that using a more specific compatible would do is make
sure that the kernel wouldn't boot whenever it changes, rather than just
having a chance of certain combinations having problems.

Obviously we should make a reasonable effort to avoid gratuitous
breakage in the default config, but I just don't see how overspecifying
things is going to help.

> That's the type of situation I don't want to be in. We need to be
> backwards compatible with what we used to be able to run. We can get
> away with declaring things as experimental for now, until we settled
> on a reasonable compromise to achieve said compatibility. But it
> needs to be our goal somewhere.
> 
> One idea would be to version the machine type according to what Linux
> implements. If Linux finds a machine type that is newer than what it
> implements, it spawns a warning.

What does it mean to have a version number for a platform which is
intended to eventually be arbitrarily configurable?

> If we want, we can implement
> backwards compatible machine types in QEMU, similar to how we
> implement -M pc-0.12 and friends today.

Heh, I was just about to respond by saying "how would you version a PC"? :-)

If you want a stable versioned platform that happens to not pretend to
be a real board, go ahead and add one -- that's not what this is for.
Maybe instead of documenting things like "has an MPIC", there should be
some comment mentioning that this platform is intended to be flexible
and device tree driven, not static.  The device tree is the machine
spec.  I could see an argument for versioning individual devices, OTOH,
rather than e.g. pretending the PCI is really equivalent to an
mpc8540-pci despite significant missing functionality.

BTW, could you point me to the documentation that explains exactly what
a pc-0.12 is?  And is there any place in Linux that actually sees this
version number and does anything with it?  How would a user know what
version of a PC to request?  What version do you get by default?  Under
what conditions are the version number bumped?

> Again, no need to do so as long as we tell users to not use it. As
> soon as we want them to actually run the machine, we need to have
> independent upgrade paths in place. New QEMU needs to be able to run
> old kernels. New kernels need to be run on old QEMU.

They will, usually.  We can't guarantee this will always be true
regardless of a versioning scheme, since bugs will happen.

>>>> Better to change the Linux implementation as needed than to
>>>> change a spec.
>>> 
>>> Why not keep the 2 in sync in the same patch? Just throw a file
>>> with a rough outline of the machine in Documentation/.
>> 
>> Because that would give people the wrong impression about what
>> this machine is, and be unlikely to stay in sync or be a complete
>> listing of current assumptions.  You're basically suggesting to use
>> Documentation/ as a bug tracker.
> 
> I'm just saying that every time we hardcode assumptions, we need to
> make sure we document it somewhere. And currently we do hardcode
> assumptions, even though only a few.

If you want a bug tracker, use a bug tracker.  Linux already has plenty
of assumptions regarding the real hardware it runs on, how firmware
configures it, etc.  Most of these assumptions are not documented, and
things get changed when new hardware comes along that breaks an
assumption.  Most of the assumptions this platform would be making come
from outside the platform file itself.  If I tried to document it, it
would be incomplete and quickly become out of date.

-Scott

^ permalink raw reply

* Re: [PATCH v3] printk: Have printk() never buffer its data
From: Michael Neuling @ 2012-07-06 21:04 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Greg Kroah-Hartman, LKML, Steven Rostedt, Paul E. McKenney,
	linuxppc-dev, Joe Perches, Andrew Morton, Wu Fengguang,
	Linus Torvalds, Ingo Molnar
In-Reply-To: <CAPXgP10JthGjZyqD6gqn2n1rPu1uP+=mem_T-iOritXAQMstRQ@mail.gmail.com>

Kay Sievers <kay@vrfy.org> wrote:

> On Fri, Jul 6, 2012 at 12:46 PM, Kay Sievers <kay@vrfy.org> wrote:
> > On Fri, Jul 6, 2012 at 5:47 AM, Michael Neuling <mikey@neuling.org> wrote:
> >
> >>> 4,89,24561;NIP: c000000000048164 LR: c000000000048160 CTR: 0000000000000000
> >>> 4,90,24576;REGS: c00000007e59fb50 TRAP: 0700   Tainted: G        W     (3.5.0-rc4-mikey)
> >>> 4,91,24583;MSR: 9000000000021032
> >>> 4,92,24586;<
> >>> 4,93,24591;SF
> >>> 4,94,24596;,HV
> >>> 4,95,24601;,ME
> >>> 4,96,24606;,IR
> >>> 4,97,24611;,DR
> >>> 4,98,24616;,RI
> >>> 4,99,24619;>
> >>> 4,100,24628;  CR: 28000042  XER: 22000000
> >>
> >> FWIW, compiling with the parent commit gives this:
> >>
> >> 4,89,1712;NIP: c000000000048164 LR: c000000000048160 CTR: 0000000000000000
> >> 4,90,1713;REGS: c00000007e59fb50 TRAP: 0700   Tainted: G        W     (3.5.0-rc4-mikey)
> >> 4,91,1716;MSR: 9000000000021032 <SF,HV,ME,IR,DR,RI>  CR: 22000082  XER: 02000000
> >
> > Hmm, I don't understand, which parent commit do you mean? You maybe
> > mean without 084681d?
> >
> > I think it's a race of the two CPUs printing continuation lines, and
> > the continuation buffer is still occupied with data from one CPU and
> > not available to the other one at the same time.
> >
> > What you see is likely not the direct output to the console (that
> > would work) but the replay of the stored buffer when the console is
> > registered. Because the cont buffer was still busy with one CPU, the
> > other thread needs to store the continuation line prints in individual
> > records, which leads to the (unwanted) printed newlines when
> > replaying.
> >
> > The data we store looks all fine, it just looks needlessly separated
> > when we replay from the buffer on a newly registered boot console. We
> > need to merge the lines in the output, so they *look* like they are
> > all in one line. I'll work on a fix for that now.
> 
> It could be that the console semaphore is still held by the other CPU,
> for whatever reason, when your box runs into this situation.
> 
> Mind pasting more context (/dev/kmsg) of the log when this happens,
> not only the one line that gets split up?
> 
> Is this possibly during an oops or backtrace going on when you see
> this? Which code calls show_regs() here?

Whole kmsg below.

It is a backtrace.  

It's a warning in
arch/powerpc/sysdev/xics/xics-common.c:xics_set_cpu_giq().  The firmware
this machine is running on is non standard (Bare Metal Linux in the
lab).  The warning itself is not an issue.  We've had it for years and it
tells us that our firmware/RTAS is not fully implemented.

Mikey

7,0,0;Allocated 917504 bytes for 1024 pacas at c00000000ff20000
6,1,0;Using pSeries machine description
7,2,0;Page orders: linear mapping = 24, virtual = 16, io = 12, vmemmap = 24
6,3,0;Using 1TB segments
4,4,0;Found initrd at 0xc000000002da5000:0xc00000000bc8c200
6,5,0;CPU maps initialized for 2 threads per core
7,6,0; (thread shift is 1)
7,7,0;Freed 851968 bytes for unused pacas
4,8,0;Starting Linux PPC64 #100 SMP Sat Jul 7 06:55:43 EST 2012
4,9,0;-----------------------------------------------------
4,10,0;ppc64_pft_size                = 0x0
4,11,0;physicalMemorySize            = 0x80000000
4,12,0;htab_address                  = 0xc00000007fe00000
4,13,0;htab_hash_mask                = 0x3fff
4,14,0;-----------------------------------------------------
6,15,0;Initializing cgroup subsys cpuset
5,16,0;Linux version 3.5.0-rc4-mikey (mikey@ka1) (gcc version 4.6.0 (GCC) ) #100 SMP Sat Jul 7 06:55:43 EST 2012
4,17,0;[boot]0012 Setup Arch
7,18,0;Node 0 Memory: 0x0-0x80000000
6,19,0;Section 1 and 127 (node 0) have a circular dependency on usemap and pgdat allocations
4,20,0;pseries_eeh_init: RTAS service <ibm, set-slot-reset> invalid
4,21,0;eeh_init: Failed to call platform init function (-22)
4,22,0;Zone ranges:
4,23,0;  DMA      [mem 0x00000000-0x7fffffff]
4,24,0;  Normal   empty
4,25,0;Movable zone start for each node
4,26,0;Early memory node ranges
4,27,0;  node   0: [mem 0x00000000-0x7fffffff]
7,28,0;On node 0 totalpages: 32768
7,29,0;  DMA zone: 28 pages used for memmap
7,30,0;  DMA zone: 0 pages reserved
7,31,0;  DMA zone: 32740 pages, LIFO batch:1
4,32,0;[boot]0015 Setup Done
6,33,0;PERCPU: Embedded 2 pages/cpu @c000000000e00000 s84224 r0 d46848 u524288
7,34,0;pcpu-alloc: s84224 r0 d46848 u524288 alloc=1*1048576
7,35,0;pcpu-alloc: [0] 0 1 
4,36,0;Built 1 zonelists in Node order, mobility grouping on.  Total pages: 32740
4,37,0;Policy zone: DMA
5,38,0;Kernel command line: ipr.enabled=0
6,39,0;PID hash table entries: 4096 (order: -1, 32768 bytes)
4,40,0;freeing bootmem node 0
6,41,0;Memory: 1916096k/2097152k available (11200k kernel code, 181056k reserved, 1728k data, 1041k bss, 576k init)
6,42,0;SLUB: Genslabs=19, HWalign=128, Order=0-3, MinObjects=0, CPUs=2, Nodes=256
6,43,0;Hierarchical RCU implementation.
6,44,0;NR_IRQS:512 nr_irqs:512 16
4,45,0;set-indicator(9005, 0, 1) returned -22
4,46,0;------------[ cut here ]------------
4,47,0;WARNING: at /scratch/mikey/src/linux-ozlabs/arch/powerpc/sysdev/xics/xics-common.c:105
4,48,0;Modules linked in:
4,49,0;NIP: c000000000048164 LR: c000000000048160 CTR: 0000000000000000
4,50,0;REGS: c000000000c27ae0 TRAP: 0700   Not tainted  (3.5.0-rc4-mikey)
4,51,0;MSR: 9000000000021032 <SF,HV,ME,IR,DR,RI>  CR: 24000042  XER: 22000000
4,52,0;SOFTE: 0
4,53,0;CFAR: c000000000740438
4,54,0;TASK = c000000000b2dd80[0] 'swapper/0' THREAD: c000000000c24000 CPU: 0\x0aGPR00: c000000000048160 c000000000c27d60 c000000000c24488 0000000000000026 \x0aGPR04: 0000000000000000 000000000000002e 30352c2000000000 0000000032320000 \x0aGPR08: 2920726500000000 c000000000b30e20 0000000000000020 000000000000002e \x0aGPR12: 0000000024000042 c00000000ff20000 0000000000000000 0000000000000000 \x0aGPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 \x0aGPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 \x0aGPR24: 0000000000000000 0000000000000000 0000000000000000 c000000000dd8080 \x0aGPR28: c000000000cb0628 0000000000000000 c000000000b70a28 0000000000000001 
4,55,0;NIP [c000000000048164] .xics_set_cpu_giq+0xb4/0xc0
4,56,0;LR [c000000000048160] .xics_set_cpu_giq+0xb0/0xc0
4,57,0;Call Trace:
4,58,0;[c000000000c27d60] [c000000000048160] .xics_set_cpu_giq+0xb0/0xc0 (unreliable)
4,59,0;[c000000000c27df0] [c000000000a75b90] .pseries_xics_init_IRQ+0x10/0x24
4,60,0;[c000000000c27e60] [c000000000a643cc] .init_IRQ+0x3c/0x54
4,61,0;[c000000000c27ee0] [c000000000a60804] .start_kernel+0x250/0x464
4,62,0;[c000000000c27f90] [c00000000000967c] .start_here_common+0x20/0x24
4,63,0;Instruction dump:
4,64,0;7fa4eb78 4bfddd49 60000000 2f830000 7c671b78 409cffa4 e87e8030 3880232d 
4,65,0;7fa5eb78 7fe6fb78 486f8289 60000000 <0fe00000> 4bffff84 60000000 7c0802a6 
4,66,0;---[ end trace 31fd0ba7d8756001 ]---
7,67,0;pic: no ISA interrupt controller
4,68,0;error: reading the clock failed (-1)
7,69,0;time_init: decrementer frequency = 59.375000 MHz
7,70,0;time_init: processor frequency   = 3800.000000 MHz
6,71,0;clocksource: timebase mult[86bca1b] shift[23] registered
7,72,0;clockevent: decrementer mult[f333333] shift[32] cpu[0]
6,73,0;Console: colour dummy device 80x25
6,74,0;console [tty0] enabled
6,75,0;console [hvc0] enabled
6,76,4785;pid_max: default: 32768 minimum: 301
6,77,5419;Dentry cache hash table entries: 262144 (order: 5, 2097152 bytes)
6,78,14588;Inode-cache hash table entries: 131072 (order: 4, 1048576 bytes)
6,79,19200;Mount-cache hash table entries: 4096
6,80,22334;Initializing cgroup subsys cpuacct
6,81,22418;Initializing cgroup subsys devices
6,82,22504;Initializing cgroup subsys freezer
6,83,23005;POWER7 performance monitor hardware support registered
6,84,24392;Firmware doesn't support query-cpu-stopped-state
4,85,24518;set-indicator(9005, 0, 1) returned -22
4,86,24534;------------[ cut here ]------------
4,87,24545;WARNING: at /scratch/mikey/src/linux-ozlabs/arch/powerpc/sysdev/xics/xics-common.c:105
4,88,24549;Modules linked in:
4,89,24565;NIP: c000000000048164 LR: c000000000048160 CTR: 0000000000000000
4,90,24579;REGS: c00000007e59fb50 TRAP: 0700   Tainted: G        W     (3.5.0-rc4-mikey)
4,91,24586;MSR: 9000000000021032 
4,92,24590;<
4,93,24594;SF
4,94,24599;,HV
4,95,24604;,ME
4,96,24609;,IR
4,97,24614;,DR
4,98,24619;,RI
4,99,24623;>
4,100,24631;  CR: 28000042  XER: 22000000
4,101,24637;SOFTE: 0
4,102,24643;CFAR: c000000000740438
4,103,24656;TASK = c00000007e56bb40[0] 'swapper/1' THREAD: c00000007e59c000
4,104,24661; CPU: 1
4,105,24667;\x0aGPR00: 
4,106,24673;c000000000048160 
4,107,24679;c00000007e59fdd0 
4,108,24685;c000000000c24488 
4,109,24691;0000000000000026 
4,110,24696;\x0aGPR04: 
4,111,24701;0000000000000000 
4,112,24707;0000000000000002 
4,113,24713;c000000000cd011c 
4,114,24719;302c203129207265 
4,115,24724;\x0aGPR08: 
4,116,24730;7475726e00000000 
4,117,24736;0000000000000000 
4,118,24741;0000000000000000 
4,119,24746;0000000000000000 
4,120,24752;\x0aGPR12: 
4,121,24758;0000000028000042 
4,122,24764;c00000000ff20380 
4,123,24770;c00000007e59ff90 
4,124,24775;0000000000000000 
4,125,24781;\x0aGPR16: 
4,126,24786;0000000000000000 
4,127,24792;0000000000000000 
4,128,24797;0000000000000000 
4,129,24802;0000000000000000 
4,130,24808;\x0aGPR20: 
4,131,24813;0000000000000000 
4,132,24818;0000000000000000 
4,133,24824;0000000000000000 
4,134,24829;0000000000000001 
4,135,24835;\x0aGPR24: 
4,136,24840;0000000000000001 
4,137,24845;0000000000000000 
4,138,24851;0000000000000000 
4,139,24856;0000000000000001 
4,140,24862;\x0aGPR28: 
4,141,24867;0000000000000008 
4,142,24872;0000000000000000 
4,143,24878;c000000000b70a28 
4,144,24884;0000000000000001 
4,145,24887;
4,146,24907;NIP [c000000000048164] .xics_set_cpu_giq+0xb4/0xc0
4,147,24927;LR [c000000000048160] .xics_set_cpu_giq+0xb0/0xc0
4,148,24931;Call Trace:
4,149,24953;[c00000007e59fdd0] [c000000000048160] .xics_set_cpu_giq+0xb0/0xc0
4,150,24957; (unreliable)
4,151,24961;
4,152,24986;[c00000007e59fe60] [c00000000076768c] .smp_xics_setup_cpu+0x28/0xbc
4,153,24989;
4,154,25013;[c00000007e59fee0] [c000000000764308] .start_secondary+0xc8/0x360
4,155,25017;
4,156,25040;[c00000007e59ff90] [c00000000000936c] .start_secondary_prolog+0x10/0x14
4,157,25043;
4,158,25048;Instruction dump:
4,159,25052;
4,160,25058;7fa4eb78 
4,161,25064;4bfddd49 
4,162,25070;60000000 
4,163,25076;2f830000 
4,164,25082;7c671b78 
4,165,25088;409cffa4 
4,166,25094;e87e8030 
4,167,25100;3880232d 
4,168,25104;
4,169,25110;7fa5eb78 
4,170,25116;7fe6fb78 
4,171,25122;486f8289 
4,172,25128;60000000 
4,173,25134;<0fe00000> 
4,174,25141;4bffff84 
4,175,25147;60000000 
4,176,25153;7c0802a6 
4,177,25156;
4,178,25164;---[ end trace 31fd0ba7d8756002 ]---
6,179,25465;Brought up 2 CPUs
7,180,29974;Node 0 CPUs: 0-1
6,181,30219;Enabling Asymmetric SMT scheduling
6,182,106522;NET: Registered protocol family 16
6,183,106731;IBM eBus Device Driver
4,184,107916;kworker/u:0 (16) used greatest stack depth: 12800 bytes left
6,185,120011;nvram: No room to create ibm,rtas-log partition, deleting any obsolete OS partitions...
3,186,120204;nvram: Failed to find or create ibm,rtas-log partition, err -28
6,187,120344;nvram: No room to create lnx,oops-log partition, deleting any obsolete OS partitions...
3,188,120536;nvram: Failed to find or create lnx,oops-log partition, err -28
6,189,120694;CPU Hotplug not supported by firmware - disabling.
6,190,126948;PCI: Probing PCI hardware
7,191,127017;PCI: Probing PCI hardware done
4,192,127028;opal: Node not found
6,193,296438;bio: create slab <bio-0> at 0
6,194,300573;vgaarb: loaded
5,195,303929;SCSI subsystem initialized
7,196,307408;libata version 3.00 loaded.
6,197,312421;usbcore: registered new interface driver usbfs
6,198,313151;usbcore: registered new interface driver hub
6,199,314645;usbcore: registered new device driver usb
6,200,322114;Switching to clocksource timebase
4,201,352608;kworker/u:0 (223) used greatest stack depth: 11168 bytes left
6,202,432888;NET: Registered protocol family 2
6,203,433315;IP route cache hash table entries: 16384 (order: 1, 131072 bytes)
6,204,434295;TCP established hash table entries: 65536 (order: 4, 1048576 bytes)
6,205,438160;TCP bind hash table entries: 65536 (order: 4, 1048576 bytes)
6,206,441363;TCP: Hash tables configured (established 65536 bind 65536)
6,207,441491;TCP: reno registered
6,208,441565;UDP hash table entries: 2048 (order: 0, 65536 bytes)
6,209,441978;UDP-Lite hash table entries: 2048 (order: 0, 65536 bytes)
6,210,442727;NET: Registered protocol family 1
6,211,443753;RPC: Registered named UNIX socket transport module.
6,212,443869;RPC: Registered udp transport module.
6,213,443958;RPC: Registered tcp transport module.
6,214,444049;RPC: Registered tcp NFSv4.1 backchannel transport module.
7,215,444178;PCI: CLS 0 bytes, default 128
6,216,444401;Trying to unpack rootfs image as initramfs...
6,217,2011859;Freeing initrd memory: 146368k freed
6,218,2019359;rtasd: No event-scan on system
6,219,2020743;Hypercall H_BEST_ENERGY not supported
6,220,2025601;audit: initializing netlink socket (disabled)
5,221,2025727;type=2000 audit(2.020:1): initialized
6,222,2992853;HugeTLB registered 1 MB page size, pre-allocated 0 pages
6,223,2992982;HugeTLB registered 16 MB page size, pre-allocated 0 pages
6,224,2993112;HugeTLB registered 16 GB page size, pre-allocated 0 pages
5,225,3109094;NFS: Registering the id_resolver key type
5,226,3109219;Key type id_resolver registered
6,227,3112591;msgmni has been set to 4028
6,228,3117312;Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254)
6,229,3117460;io scheduler noop registered
6,230,3117534;io scheduler deadline registered
6,231,3118446;io scheduler cfq registered (default)
6,232,3692669;Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
6,233,3703393;Generic RTC Driver v1.07
6,234,3748115;brd: module loaded
6,235,3770441;loop: module loaded
6,236,3770496;Uniform Multi-Platform E-IDE driver
6,237,3774960;ide-gd driver 1.18
6,238,3776029;ide-cd driver 5.00
6,239,3787095;ipr: IBM Power RAID SCSI Device Driver version: 2.5.3 (March 10, 2012)
6,240,3788311;st: Version 20101219, fixed bufsize 32768, s/g segs 256
6,241,3796157;pcnet32: pcnet32.c:v1.35 21.Apr.2008 tsbogend@alpha.franken.de
7,242,3798393;ibmveth: IBM Power Virtual Ethernet Driver 1.04
6,243,3799456;ehea: IBM eHEA ethernet device driver (Release EHEA_0107)
6,244,3801948;e100: Intel(R) PRO/100 Network Driver, 3.5.24-k2-NAPI
6,245,3802068;e100: Copyright(c) 1999-2006 Intel Corporation
6,246,3803197;e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
6,247,3803337;e1000: Copyright (c) 1999-2006 Intel Corporation.
6,248,3804513;e1000e: Intel(R) PRO/1000 Network Driver - 2.0.0-k
6,249,3804625;e1000e: Copyright(c) 1999 - 2012 Intel Corporation.
6,250,3805823;ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
6,251,3807019;ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
6,252,3809276;mousedev: PS/2 mouse device common for all mice
6,253,3812135;md: linear personality registered for level -1
6,254,3812243;md: raid0 personality registered for level 0
6,255,3812348;md: raid1 personality registered for level 1
6,256,3814922;device-mapper: ioctl: 4.22.0-ioctl (2011-10-19) initialised: dm-devel@redhat.com
6,257,3815091;cpuidle: using governor ladder
6,258,3815168;cpuidle: using governor menu
6,259,3837236;usbcore: registered new interface driver usbhid
6,260,3837342;usbhid: USB HID core driver
7,261,3837414;oprofile: using ppc64/power7 performance monitoring.
6,262,3837813;IPv4 over IPv4 tunneling driver
6,263,3841830;TCP: cubic registered
6,264,3841891;NET: Registered protocol family 17
5,265,3842089;Key type dns_resolver registered
7,266,3842319;Running code patching self-tests ...
7,267,3846278;Running feature fixup self-tests ...
7,268,3846308;Running MSI bitmap self-tests ...
6,269,3875364;registered taskstats version 1
6,270,3876407;console [netcon0] enabled
6,271,3876473;netconsole: network logging started
6,272,3877574;Freeing unused kernel memory: 576k freed
4,273,3889441;modprobe (1071) used greatest stack depth: 11120 bytes left
4,274,4403534;udevd (1108) used greatest stack depth: 10528 bytes left
4,275,4416293;tput (1111) used greatest stack depth: 10464 bytes left
4,276,4777038;blkid (1149) used greatest stack depth: 10288 bytes left
4,277,6012331;fsck (1224) used greatest stack depth: 10224 bytes left
4,278,7298006;run-parts (1346) used greatest stack depth: 9968 bytes left
4,279,9235784;sshd (1497): /proc/1497/oom_adj is deprecated, please use /proc/1497/oom_score_adj instead.
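
The merge that Kay describes ("they *look* like they are all in one line") can be checked against the dump above. The following Python sketch is not the kernel fix: parse_kmsg_record is a hypothetical helper that assumes each record has the form priority,sequence,timestamp;text as seen in this log. It takes records 91-100 of the CPU 1 backtrace verbatim and concatenates their text.

```python
def parse_kmsg_record(line):
    """Split 'priority,sequence,timestamp;text' into its four parts."""
    header, _, text = line.partition(";")
    priority, sequence, timestamp = (int(field) for field in header.split(","))
    return priority, sequence, timestamp, text


# Records 91-100 from the CPU 1 backtrace in the dump above.
fragments = [
    "4,91,24586;MSR: 9000000000021032 ",
    "4,92,24590;<",
    "4,93,24594;SF",
    "4,94,24599;,HV",
    "4,95,24604;,ME",
    "4,96,24609;,IR",
    "4,97,24614;,DR",
    "4,98,24619;,RI",
    "4,99,24623;>",
    "4,100,24631;  CR: 28000042  XER: 22000000",
]

records = [parse_kmsg_record(line) for line in fragments]
# Joining the stored text with no separator restores the single-line form.
merged_text = "".join(text for _, _, _, text in records)
print(merged_text)
```

The joined text has the same shape as record 51 from CPU 0, which was stored as a single record. That is consistent with the observation that the stored data is fine and only the replay inserts unwanted newlines.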

^ permalink raw reply

* Re: [PATCH] powerpc/85xx: use the BRx registers to enable indirect mode on the P1022DS
From: Kumar Gala @ 2012-07-06 18:09 UTC (permalink / raw)
  To: Timur Tabi; +Cc: linuxppc-dev
In-Reply-To: <1341500908-22384-1-git-send-email-timur@freescale.com>


On Jul 5, 2012, at 10:08 AM, Timur Tabi wrote:

> In order to enable the DIU video controller on the P1022DS, the FPGA needs
> to be switched to "indirect mode", where the localbus is disabled and
> the FPGA is accessed via writes to localbus chip select signals CS0 and CS1.
>
> To obtain the address of CS0 and CS1, the platform driver uses an "indirect
> pixis mode" device tree node.  This node assumes that the localbus 'ranges'
> property is sorted in chip-select order.  That is, reg value 0 maps to
> CS0, reg value 1 maps to CS1, etc.  This is how the 'ranges' property is
> supposed to be arranged.
>
> Unfortunately, the 'ranges' property is often mis-arranged, and not just on
> the P1022DS.  Linux normally does not care, since it does not program the
> localbus.  But the indirect-mode code on the P1022DS does care.
>
> The "proper" fix is to have U-Boot fix the 'ranges' property, but this would
> be too cumbersome.  The names and 'reg' properties of all the localbus
> devices would also need to be updated, and determining which localbus device
> maps to which chip select is board-specific.
>
> Instead, we determine the CS0/CS1 base addresses the same way that U-boot
> does -- by reading the BRx registers directly and mapping them to physical
> addresses.  This code is simpler and more reliable, and it does not require
> a U-boot or device tree change.
>
> Since the indirect pixis device tree node is no longer needed, the node is
> deleted from the DTS.
>
> Signed-off-by: Timur Tabi <timur@freescale.com>
> ---
> arch/powerpc/boot/dts/p1022ds.dtsi     |   16 -----
> arch/powerpc/platforms/85xx/p1022_ds.c |  106 ++++++++++++++++++++++++++++----
> 2 files changed, 93 insertions(+), 29 deletions(-)

applied to next

- k


* Re: [PATCH] Revert "powerpc/p3060qds: Add support for P3060QDS board"
From: Kumar Gala @ 2012-07-06 18:09 UTC (permalink / raw)
  To: Timur Tabi; +Cc: linuxppc-dev
In-Reply-To: <1341526073-10595-1-git-send-email-timur@freescale.com>


On Jul 5, 2012, at 5:07 PM, Timur Tabi wrote:

> This reverts commit 96cc017c5b7ec095ef047d3c1952b6b6bbf98943.
>
> The P3060 was cancelled before it went into production, so there's no point
> in supporting it.
>
> Signed-off-by: Timur Tabi <timur@freescale.com>
> ---
> arch/powerpc/boot/dts/fsl/p3060si-post.dtsi  |  302 --------------------------
> arch/powerpc/boot/dts/fsl/p3060si-pre.dtsi   |  125 -----------
> arch/powerpc/boot/dts/p3060qds.dts           |  242 ---------------------
> arch/powerpc/configs/corenet32_smp_defconfig |    1 -
> arch/powerpc/platforms/85xx/Kconfig          |   12 -
> arch/powerpc/platforms/85xx/Makefile         |    1 -
> arch/powerpc/platforms/85xx/p3060_qds.c      |   77 -------
> 7 files changed, 0 insertions(+), 760 deletions(-)
> delete mode 100644 arch/powerpc/boot/dts/fsl/p3060si-post.dtsi
> delete mode 100644 arch/powerpc/boot/dts/fsl/p3060si-pre.dtsi
> delete mode 100644 arch/powerpc/boot/dts/p3060qds.dts
> delete mode 100644 arch/powerpc/platforms/85xx/p3060_qds.c

applied to next

- k


* Re: [PATCH][v3] powerpc/85xx:Add BSC9131 RDB Support
From: Kumar Gala @ 2012-07-06 18:09 UTC (permalink / raw)
  To: Prabhakar Kushwaha
  Cc: Priyanka Jain, devicetree-discuss, Ramneek Mehresh,
	Rajan Srivastava, linuxppc-dev, Akhil Goyal
In-Reply-To: <1332392055-30112-1-git-send-email-prabhakar@freescale.com>


On Mar 21, 2012, at 11:54 PM, Prabhakar Kushwaha wrote:

> BSC9131RDB is a Freescale reference design board for BSC9131 SoC.The BSC9131
> is integrated SoC that targets Femto base station market. It combines Power
> Architecture e500v2 and DSP StarCore SC3850 core technologies with MAPLE-B2F
> baseband acceleration processing elements.
>
> The BSC9131 SoC includes the following function and features:
>    . Power Architecture subsystem including a e500 processor with 256-Kbyte
>    shared L2 cache
>    . StarCore SC3850 DSP subsystem with a 512-Kbyte private L2 cache
>    . The Multi Accelerator Platform Engine for Femto BaseStation Baseband
>      Processing (MAPLE-B2F)
>    . A multi-standard baseband algorithm accelerator for Channel
>      Decoding/Encoding, Fourier Transforms, UMTS chip rate processing, LTE
>      UP/DL Channel processing, and CRC algorithms
>    . Consists of accelerators for Convolution, Filtering, Turbo Encoding,
>      Turbo Decoding, Viterbi decoding, Chiprate processing, and Matrix
>      Inversion operations
>    . DDR3/3L memory interface with 32-bit data width without ECC and 16-bit
>      with ECC, up to 400-MHz clock/800 MHz data rate
>    . Dedicated security engine featuring trusted boot
>    . DMA controller
>    . OCNDMA with four bidirectional channels
>    . Interfaces
>    . Two triple-speed Gigabit Ethernet controllers featuring network
>      acceleration including IEEE 1588. v2 hardware support and
>      virtualization (eTSEC)
>    . eTSEC 1 supports RGMII/RMII
>    . eTSEC 2 supports RGMII
>    . High-speed USB 2.0 host and device controller with ULPI interface
>    . Enhanced secure digital (SD/MMC) host controller (eSDHC)
>    . Antenna interface controller (AIC), supporting three industry standard
>      JESD207/three custom ADI RF interfaces (two dual port and one single
>      port) and three MAXIM's MaxPHY serial interfaces
>    . ADI lanes support both full duplex FDD support and half duplex TDD
>      support
>    . Universal Subscriber Identity Module (USIM) interface that facilitates
>      communication to SIM cards or Eurochip pre-paid phone cards
>    . TDM with one TDM port
>    . Two DUART, four eSPI, and two I2C controllers
>    . Integrated Flash memory controller (IFC)
>    . TDM with 256 channels
>    . GPIO
>    . Sixteen 32-bit timers
>
> The DSP portion of the SoC consists of DSP core (SC3850) and various
> accelerators pertaining to DSP operations.
>
> BSC9131RDB Overview
> ----------------------
>    BSC9131 SoC
>    1Gbyte DDR3 (on board DDR)
>    128Mbyte 2K page size NAND Flash
>    256 Kbit M24256 I2C EEPROM
>    128 Mbit SPI Flash memory
>    USB-ULPI
>    eTSEC1: Connected to RGMII PHY
>    eTSEC2: Connected to RGMII PHY
>    DUART interface: supports one UARTs up to 115200 bps for console display
>
> Linux runs on e500v2 core and access some DSP peripherals like AIC
>
> Signed-off-by: Ramneek Mehresh <ramneek.mehresh@freescale.com>
> Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
> Signed-off-by: Akhil Goyal <Akhil.Goyal@freescale.com>
> Signed-off-by: Poonam Aggrwal <poonam.aggrwal@freescale.com>
> Signed-off-by: Rajan Srivastava <rajan.srivastava@freescale.com>
> Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
> ---

applied to next.

(Made some minor changes to match upstream board ports like p1010, and
also sorted Makefile & Kconfig alphabetically.)

- k


* Re: [PATCH 2/3] powerpc/e500: add paravirt QEMU platform
From: Alexander Graf @ 2012-07-06 16:59 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev
In-Reply-To: <4FF717D3.1020005@freescale.com>


On 06.07.2012, at 18:52, Scott Wood wrote:

> On 07/06/2012 11:30 AM, Alexander Graf wrote:
>>
>> On 06.07.2012, at 18:25, Scott Wood wrote:
>>
>>> On 07/06/2012 07:29 AM, Alexander Graf wrote:
>>>> I really think we should document what exactly this machine expects.
>>>
>>> Well, the point of this paravirt machine is to avoid such assumptions --
>>> it's all device-tree driven, at least in theory.  If a certain qemu
>>> configuration ends up breaking the Linux platform (such as using a
>>> different PIC), then that's a lack of flexibility on Linux's part that
>>> should get fixed if someone finds it useful enough to justify the
>>> effort.  Same with real hardware -- if you care about it, you add
>>> support -- we just don't have a unique name for every configuration.
>>> The information is there in the device tree, though.
>>>
>>> Honestly, even having "qemu" in there is more specific than I'd prefer,
>>> but I don't want to stir up the "generic platform" argument again
>>> without at least limiting the scope.
>>
>> Well, can't we note down the assumptions we make to make sure that
>> whoever develops an implementation of it knows what to implement?
>> It's ppc specific for example. I also don't think that plugging a G3
>> in there works, would it?
>
> Well, it does have "e500" in the name. :-P
>
>>>>> +void __init qemu_e500_pic_init(void)
>>>>> +{
>>>>> +	struct mpic *mpic;
>>>>> +
>>>>> +	mpic = mpic_alloc(NULL, 0, MPIC_BIG_ENDIAN | MPIC_SINGLE_DEST_CPU,
>>>>> +			0, 256, " OpenPIC  ");
>>>>
>>>> Does that mean we're configuring the MPIC regardless of what the
>>>> guest tells us? So the MPIC is a hard requirement. We can't use UIC
>>>> or XPIC with this machine, right? This needs to be documented.
>>>
>>> Then what would we do if we want to add an ePAPR virtual PIC instead?
>>> Or if something replaces MPIC on future FSL chips?
>>
>> Then we need a different compatible anyways, because we wouldn't be
>> backwards compatible, no?
>
> No, that's exactly what I'm trying to avoid.  This notion of a toplevel
> compatible that tells you everything you need to know about the machine
> (even if Linux chooses to be device-tree-based for some arbitrary subset
> of that information) is incompatible with a flexible virtual platform.
>
> All this compatible is saying is "see the rest of the device tree".
> How well Linux does so is a quality of implementation issue that can be
> addressed as needed.  The information about what sort of interrupt
> controller you have is already in the device tree.  The device tree is
> the machine spec.
>
> Another assumption this patch makes is that it doesn't need SWIOTLB.  Is
> "has more than 4GiB RAM" a machine attribute that would warrant a
> separate toplevel compatible?  SWIOTLB for PCI is handled due to the
> previous patch that provides common PCI code -- but in a previous
> version of the patch it was not handled.  Is it yet another incompatible
> machine spec if RAM must be less than 4GiB minus PCICSRBAR (ignoring the
> QEMU bug that PCICSRBAR is not implemented)?

Well, the thing that I'm wary of is the following. Imagine we make this
the default machine type for all e500 use cases, which is reasonable.
Now we release 3.6, which works great with QEMU 1.2. We change
something in QEMU. QEMU 1.3 comes out. It can no longer boot your old
3.6 kernel.

That's the type of situation I don't want to be in. We need to be
backwards compatible with what we used to be able to run. We can get
away with declaring things as experimental for now, until we have
settled on a reasonable compromise to achieve said compatibility. But
it needs to be our goal at some point.

One idea would be to version the machine type according to what Linux
implements. If Linux finds a machine type that is newer than what it
implements, it prints a warning. If we want, we can implement backwards
compatible machine types in QEMU, similar to how we implement -M pc-0.12
and friends today.
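
To make the versioning idea concrete, here is a rough userspace sketch of
the check I have in mind. Everything named here is hypothetical: the
"qemu-e500-v<N>" naming scheme, SUPPORTED_MACHINE_VERSION, and both helper
functions are made up for illustration, and none of it is part of the
patch under discussion.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: newest machine-type version this kernel implements. */
#define SUPPORTED_MACHINE_VERSION 1

/* Hypothetical naming scheme; the thread does not define one. */
#define MACHINE_PREFIX "qemu-e500-v"

/*
 * Parse the version suffix of a machine-type string.
 * Returns the version number, or -1 if the string does not follow
 * the assumed "qemu-e500-v<N>" scheme.
 */
int machine_version(const char *compat)
{
	size_t len = strlen(MACHINE_PREFIX);
	char *end;
	long v;

	if (strncmp(compat, MACHINE_PREFIX, len) != 0)
		return -1;
	/* Require at least one digit right after the prefix. */
	if (compat[len] < '0' || compat[len] > '9')
		return -1;
	v = strtol(compat + len, &end, 10);
	if (*end != '\0' || v > INT_MAX)
		return -1;
	return (int)v;
}

/*
 * Returns 1 if the machine type is recognised, 0 otherwise.
 * A newer version than we implement only triggers a warning;
 * the machine is still accepted.
 */
int check_machine(const char *compat)
{
	int v = machine_version(compat);

	if (v < 0)
		return 0;
	if (v > SUPPORTED_MACHINE_VERSION)
		fprintf(stderr,
			"warning: machine type v%d is newer than supported v%d\n",
			v, SUPPORTED_MACHINE_VERSION);
	return 1;
}
```

With this, check_machine("qemu-e500-v2") still accepts the machine but
warns, while an unrelated string is rejected. QEMU would then keep the
older versions selectable, the way -M pc-0.12 does.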

Again, no need to do so as long as we tell users not to use it. As soon
as we want them to actually run the machine, we need to have independent
upgrade paths in place. New QEMU needs to be able to run old kernels.
New kernels need to run on old QEMU.

>
>>> Better to change the Linux implementation as needed than to change a spec.
>>
>> Why not keep the 2 in sync in the same patch? Just throw a file with
>> a rough outline of the machine in Documentation/.
>
> Because that would give people the wrong impression about what this
> machine is, and be unlikely to stay in sync or be a complete listing of
> current assumptions.  You're basically suggesting to use Documentation/
> as a bug tracker.

I'm just saying that every time we hardcode assumptions, we need to make
sure we document them somewhere. And currently we do hardcode
assumptions, even if only a few.


Alex
