LinuxPPC-Dev Archive on lore.kernel.org
* [PATCH 2/6] powerpc/pseries: Use irq_has_action() in eeh_disable_irq()
From: Michael Ellerman @ 2009-10-14  5:44 UTC
  To: linuxppc-dev
In-Reply-To: <a6fc612b826ff12628e62601203d565df00c6436.1255499081.git.michael@ellerman.id.au>

Rather than open-coding our own check, use irq_has_action()
to check if an irq has an action, i.e. is "in use".

irq_has_action() doesn't take the descriptor lock, but that
shouldn't matter: we're just using it as an indicator
that the irq is in use. disable_irq_nosync() will also take
the descriptor lock before doing anything.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
---
 arch/powerpc/platforms/pseries/eeh_driver.c |   18 +-----------------
 1 files changed, 1 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/eeh_driver.c b/arch/powerpc/platforms/pseries/eeh_driver.c
index 0e8db67..ef8e454 100644
--- a/arch/powerpc/platforms/pseries/eeh_driver.c
+++ b/arch/powerpc/platforms/pseries/eeh_driver.c
@@ -63,22 +63,6 @@ static void print_device_node_tree(struct pci_dn *pdn, int dent)
 }
 #endif
 
-/** 
- * irq_in_use - return true if this irq is being used 
- */
-static int irq_in_use(unsigned int irq)
-{
-	int rc = 0;
-	unsigned long flags;
-   struct irq_desc *desc = irq_desc + irq;
-
-	spin_lock_irqsave(&desc->lock, flags);
-	if (desc->action)
-		rc = 1;
-	spin_unlock_irqrestore(&desc->lock, flags);
-	return rc;
-}
-
 /**
  * eeh_disable_irq - disable interrupt for the recovering device
  */
@@ -93,7 +77,7 @@ static void eeh_disable_irq(struct pci_dev *dev)
 	if (dev->msi_enabled || dev->msix_enabled)
 		return;
 
-	if (!irq_in_use(dev->irq))
+	if (!irq_has_action(dev->irq))
 		return;
 
 	PCI_DN(dn)->eeh_mode |= EEH_MODE_IRQ_DISABLED;
-- 
1.6.2.1
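
For illustration, the lockless "is this irq in use?" check described in the
commit message above can be modelled in user space. This is only a sketch
under stated assumptions: struct fake_irq_desc, fake_irq_has_action() and
the fixed-size table are made-up stand-ins, not the kernel's irq_desc or its
irq_has_action() implementation.

```c
#include <stddef.h>

#define FAKE_NR_IRQS 16	/* hypothetical table size for this sketch */

struct fake_irqaction {
	const char *name;
};

struct fake_irq_desc {
	struct fake_irqaction *action;	/* NULL when no handler is registered */
};

static struct fake_irq_desc fake_irq_desc[FAKE_NR_IRQS];

/*
 * Lockless "is anything registered?" check. The answer can be stale by
 * the time the caller acts on it, which is acceptable when it is only
 * used as a hint and the real work is done later under the lock.
 */
static int fake_irq_has_action(unsigned int irq)
{
	if (irq >= FAKE_NR_IRQS)
		return 0;
	return fake_irq_desc[irq].action != NULL;
}
```

The point of the sketch is that the check only reads one pointer; anything
that must be consistent with that pointer still needs the descriptor lock.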


* [PATCH 1/6] powerpc: Make NR_IRQS a CONFIG option
From: Michael Ellerman @ 2009-10-14  5:44 UTC
  To: linuxppc-dev

The irq_desc array consumes quite a lot of space, and for systems
that don't need or can't have 512 irqs it's just wasted space.

The first 16 are reserved for ISA, so the minimum of 32 really
leaves only 16 usable. No one has asked for more than 512, so leave
that as the maximum.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
---
 arch/powerpc/Kconfig           |   10 ++++++++++
 arch/powerpc/include/asm/irq.h |    4 ++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 10a0a54..2230e75 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -56,6 +56,16 @@ config IRQ_PER_CPU
 	bool
 	default y
 
+config NR_IRQS
+	int "Number of virtual interrupt numbers"
+	range 32 512
+	default "512"
+	help
+	  This defines the number of virtual interrupt numbers the kernel
+	  can manage. Virtual interrupt numbers are what you see in
+	  /proc/interrupts. If you configure your system to have too few,
+	  drivers will fail to load or worse - handle with care.
+
 config STACKTRACE_SUPPORT
 	bool
 	default y
diff --git a/arch/powerpc/include/asm/irq.h b/arch/powerpc/include/asm/irq.h
index bbcd1aa..b83fcc8 100644
--- a/arch/powerpc/include/asm/irq.h
+++ b/arch/powerpc/include/asm/irq.h
@@ -34,8 +34,8 @@ extern atomic_t ppc_n_lost_interrupts;
  */
 #define NO_IRQ_IGNORE		((unsigned int)-1)
 
-/* Total number of virq in the platform (make it a CONFIG_* option ? */
-#define NR_IRQS		512
+/* Total number of virq in the platform */
+#define NR_IRQS		CONFIG_NR_IRQS
 
 /* Number of irqs reserved for the legacy controller */
 #define NUM_ISA_INTERRUPTS	16
-- 
1.6.2.1


* [Fwd: [PATCH 2/2] i2c-powermac: Log errors]
From: Benjamin Herrenschmidt @ 2009-10-14  5:40 UTC
  To: linuxppc-dev


[-- Attachment #2: Forwarded message - [PATCH 2/2] i2c-powermac: Log errors --]
[-- Type: message/rfc822, Size: 3291 bytes --]

From: Jean Delvare <khali@linux-fr.org>
To: Linux I2C <linux-i2c@vger.kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>, Paul Mackerras <paulus@samba.org>
Subject: [PATCH 2/2] i2c-powermac: Log errors
Date: Sat, 10 Oct 2009 14:20:28 +0200
Message-ID: <20091010142028.52f19518@hyperion.delvare>

Log errors when they happen, otherwise we have no idea what went
wrong.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
---
 drivers/i2c/busses/i2c-powermac.c |   28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

--- linux-2.6.32-rc3.orig/drivers/i2c/busses/i2c-powermac.c	2009-10-10 14:13:04.000000000 +0200
+++ linux-2.6.32-rc3/drivers/i2c/busses/i2c-powermac.c	2009-10-10 14:13:12.000000000 +0200
@@ -108,16 +108,25 @@ static s32 i2c_powermac_smbus_xfer(	stru
 	}
 
 	rc = pmac_i2c_open(bus, 0);
-	if (rc)
+	if (rc) {
+		dev_err(&adap->dev, "Failed to open I2C, err %d\n", rc);
 		return rc;
+	}
 
 	rc = pmac_i2c_setmode(bus, mode);
-	if (rc)
+	if (rc) {
+		dev_err(&adap->dev, "Failed to set I2C mode %d, err %d\n",
+			mode, rc);
 		goto bail;
+	}
 
 	rc = pmac_i2c_xfer(bus, addrdir, subsize, subaddr, buf, len);
-	if (rc)
+	if (rc) {
+		dev_err(&adap->dev,
+			"I2C transfer at 0x%02x failed, size %d, err %d\n",
+			addrdir >> 1, size, rc);
 		goto bail;
+	}
 
 	if (size == I2C_SMBUS_WORD_DATA && read) {
 		data->word = ((u16)local[1]) << 8;
@@ -157,12 +166,21 @@ static int i2c_powermac_master_xfer(	str
 		addrdir ^= 1;
 
 	rc = pmac_i2c_open(bus, 0);
-	if (rc)
+	if (rc) {
+		dev_err(&adap->dev, "Failed to open I2C, err %d\n", rc);
 		return rc;
+	}
 	rc = pmac_i2c_setmode(bus, pmac_i2c_mode_std);
-	if (rc)
+	if (rc) {
+		dev_err(&adap->dev, "Failed to set I2C mode %d, err %d\n",
+			pmac_i2c_mode_std, rc);
 		goto bail;
+	}
 	rc = pmac_i2c_xfer(bus, addrdir, 0, 0, msgs->buf, msgs->len);
+	if (rc < 0)
+		dev_err(&adap->dev, "I2C %s 0x%02x failed, err %d\n",
+			addrdir & 1 ? "read from" : "write to", addrdir >> 1,
+			rc);
  bail:
 	pmac_i2c_close(bus);
 	return rc < 0 ? rc : 1;


-- 
Jean Delvare


* [Fwd: [PATCH 1/2] i2c-powermac: Refactor i2c_powermac_smbus_xfer]
From: Benjamin Herrenschmidt @ 2009-10-14  5:40 UTC
  To: linuxppc-dev


[-- Attachment #2: Forwarded message - [PATCH 1/2] i2c-powermac: Refactor i2c_powermac_smbus_xfer --]
[-- Type: message/rfc822, Size: 5169 bytes --]

From: Jean Delvare <khali@linux-fr.org>
To: Linux I2C <linux-i2c@vger.kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>, Paul Mackerras <paulus@samba.org>
Subject: [PATCH 1/2] i2c-powermac: Refactor i2c_powermac_smbus_xfer
Date: Sat, 10 Oct 2009 14:19:08 +0200
Message-ID: <20091010141908.0be884a5@hyperion.delvare>

I wanted to add some error logging to the i2c-powermac driver, but
found that it was very difficult due to the way the
i2c_powermac_smbus_xfer function is organized. Refactor the code in
this function so that each low-level function is only called once.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
---
This needs testing! Thanks.

 drivers/i2c/busses/i2c-powermac.c |   85 +++++++++++++++++--------------------
 1 file changed, 41 insertions(+), 44 deletions(-)

--- linux-2.6.32-rc3.orig/drivers/i2c/busses/i2c-powermac.c	2009-10-10 14:08:39.000000000 +0200
+++ linux-2.6.32-rc3/drivers/i2c/busses/i2c-powermac.c	2009-10-10 14:13:04.000000000 +0200
@@ -49,48 +49,38 @@ static s32 i2c_powermac_smbus_xfer(	stru
 	int			rc = 0;
 	int			read = (read_write == I2C_SMBUS_READ);
 	int			addrdir = (addr << 1) | read;
+	int			mode, subsize, len;
+	u32			subaddr;
+	u8			*buf;
 	u8			local[2];
 
-	rc = pmac_i2c_open(bus, 0);
-	if (rc)
-		return rc;
+	if (size == I2C_SMBUS_QUICK || size == I2C_SMBUS_BYTE) {
+		mode = pmac_i2c_mode_std;
+		subsize = 0;
+		subaddr = 0;
+	} else {
+		mode = read ? pmac_i2c_mode_combined : pmac_i2c_mode_stdsub;
+		subsize = 1;
+		subaddr = command;
+	}
 
 	switch (size) {
         case I2C_SMBUS_QUICK:
-		rc = pmac_i2c_setmode(bus, pmac_i2c_mode_std);
-		if (rc)
-			goto bail;
-		rc = pmac_i2c_xfer(bus, addrdir, 0, 0, NULL, 0);
+		buf = NULL;
+		len = 0;
 	    	break;
         case I2C_SMBUS_BYTE:
-		rc = pmac_i2c_setmode(bus, pmac_i2c_mode_std);
-		if (rc)
-			goto bail;
-		rc = pmac_i2c_xfer(bus, addrdir, 0, 0, &data->byte, 1);
-	    	break;
         case I2C_SMBUS_BYTE_DATA:
-		rc = pmac_i2c_setmode(bus, read ?
-				      pmac_i2c_mode_combined :
-				      pmac_i2c_mode_stdsub);
-		if (rc)
-			goto bail;
-		rc = pmac_i2c_xfer(bus, addrdir, 1, command, &data->byte, 1);
+		buf = &data->byte;
+		len = 1;
 	    	break;
         case I2C_SMBUS_WORD_DATA:
-		rc = pmac_i2c_setmode(bus, read ?
-				      pmac_i2c_mode_combined :
-				      pmac_i2c_mode_stdsub);
-		if (rc)
-			goto bail;
 		if (!read) {
 			local[0] = data->word & 0xff;
 			local[1] = (data->word >> 8) & 0xff;
 		}
-		rc = pmac_i2c_xfer(bus, addrdir, 1, command, local, 2);
-		if (rc == 0 && read) {
-			data->word = ((u16)local[1]) << 8;
-			data->word |= local[0];
-		}
+		buf = local;
+		len = 2;
 	    	break;
 
 	/* Note that these are broken vs. the expected smbus API where
@@ -105,28 +95,35 @@ static s32 i2c_powermac_smbus_xfer(	stru
 	 * a repeat start/addr phase (but not stop in between)
 	 */
         case I2C_SMBUS_BLOCK_DATA:
-		rc = pmac_i2c_setmode(bus, read ?
-				      pmac_i2c_mode_combined :
-				      pmac_i2c_mode_stdsub);
-		if (rc)
-			goto bail;
-		rc = pmac_i2c_xfer(bus, addrdir, 1, command, data->block,
-				   data->block[0] + 1);
-
+		buf = data->block;
+		len = data->block[0] + 1;
 		break;
 	case I2C_SMBUS_I2C_BLOCK_DATA:
-		rc = pmac_i2c_setmode(bus, read ?
-				      pmac_i2c_mode_combined :
-				      pmac_i2c_mode_stdsub);
-		if (rc)
-			goto bail;
-		rc = pmac_i2c_xfer(bus, addrdir, 1, command,
-				   &data->block[1], data->block[0]);
+		buf = &data->block[1];
+		len = data->block[0];
 		break;
 
         default:
-	    	rc = -EINVAL;
+		return -EINVAL;
+	}
+
+	rc = pmac_i2c_open(bus, 0);
+	if (rc)
+		return rc;
+
+	rc = pmac_i2c_setmode(bus, mode);
+	if (rc)
+		goto bail;
+
+	rc = pmac_i2c_xfer(bus, addrdir, subsize, subaddr, buf, len);
+	if (rc)
+		goto bail;
+
+	if (size == I2C_SMBUS_WORD_DATA && read) {
+		data->word = ((u16)local[1]) << 8;
+		data->word |= local[0];
 	}
+
  bail:
 	pmac_i2c_close(bus);
 	return rc;


-- 
Jean Delvare
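
The word-data case in the patch above packs data->word into local[] low
byte first before a write and rebuilds it from local[] after a read. A
minimal stand-alone sketch of that packing follows; the helper names
smbus_word_to_bytes() and smbus_bytes_to_word() are hypothetical and do not
exist in the driver.

```c
#include <stdint.h>

/* Split a 16-bit SMBus word into two bytes, low byte first. */
static void smbus_word_to_bytes(uint16_t word, uint8_t out[2])
{
	out[0] = word & 0xff;
	out[1] = (word >> 8) & 0xff;
}

/* Rebuild the 16-bit word from the two bytes, low byte first. */
static uint16_t smbus_bytes_to_word(const uint8_t in[2])
{
	return (uint16_t)(((uint16_t)in[1] << 8) | in[0]);
}
```

Keeping the two directions symmetric is what lets the refactored function
do the unpacking once, after the single pmac_i2c_xfer() call.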


* Re: [PATCH] i2c-powermac: Reject unsupported I2C transactions
From: Benjamin Herrenschmidt @ 2009-10-14  5:39 UTC
  To: Jean Delvare; +Cc: linuxppc-dev, Paul Mackerras, Linux I2C
In-Reply-To: <20090930221435.58636363@hyperion.delvare>

On Wed, 2009-09-30 at 22:14 +0200, Jean Delvare wrote:
> The i2c-powermac driver doesn't support arbitrary multi-message I2C
> transactions, only SMBus ones. Make it clear by returning an error if
> a multi-message I2C transaction is attempted. This is better than only
> processing the first message, because most callers won't recover from
> the short transaction. Anyone wishing to issue multi-message
> transactions should use the SMBus API instead of the raw I2C API.
> 
> Signed-off-by: Jean Delvare <khali@linux-fr.org>

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

> Cc: Paul Mackerras <paulus@samba.org>
> ---
>  drivers/i2c/busses/i2c-powermac.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> --- linux-2.6.32-rc1.orig/drivers/i2c/busses/i2c-powermac.c	2009-06-10 05:05:27.000000000 +0200
> +++ linux-2.6.32-rc1/drivers/i2c/busses/i2c-powermac.c	2009-09-30 20:29:42.000000000 +0200
> @@ -146,6 +146,12 @@ static int i2c_powermac_master_xfer(	str
>  	int			read;
>  	int			addrdir;
>  
> +	if (num != 1) {
> +		dev_err(&adap->dev,
> +			"Multi-message I2C transactions not supported\n");
> +		return -EOPNOTSUPP;
> +	}
> +
>  	if (msgs->flags & I2C_M_TEN)
>  		return -EINVAL;
>  	read = (msgs->flags & I2C_M_RD) != 0;
> 
> 


* Re: [PATCH] of/platform: Implement support for dev_pm_ops
From: Benjamin Herrenschmidt @ 2009-10-14  4:55 UTC
  To: avorontsov; +Cc: linuxppc-dev, linux-pm, David Miller
In-Reply-To: <20091012224410.GA18923@oksana.dev.rtsoft.ru>

On Tue, 2009-10-13 at 02:44 +0400, Anton Vorontsov wrote:

> I agree that there is some room for improvements in general (e.g.
> merging platform and of_platform devices/drivers), but it's not as
> easy as you would like to think. Let's make it in a separate step
> that doesn't stop real features from being implemented (e.g.
> hibernate).
> 
> For the six functions that we can reuse I can prepare a cleanup
> patch that we can merge via -mm, or it can just sit and collect
> needed acks and can be merged via any tree. But please, no
> cross-tree dependencies for the crucial features.

I agree. I'll take the patch for now.

In the long run, I'm all for killing of_platform if we can find
a "proper" way to replace it with platform.

I.e., with dev_archdata, any device carries the OF device node, so
of_platform doesn't really buy us much anymore.

We could even "default" by populating platform device resources
with standard-parsing of "reg" properties etc...

So for devices that don't actually need anything more, we may get
away with reusing platform devices as-is; all we would need is some
kind of conversion table or such to map OF match to platform dev names,
or maybe a secondary match table in the drivers themselves.

Anyway, that's an old discussion, something we still need to sort out...

Ben.


* Re: [RFC PATCH 05/12] of: add common header for flattened device tree representation
From: David Gibson @ 2009-10-14  4:47 UTC
  To: Grant Likely
  Cc: Stephen Rothwell, monstr, microblaze-uclinux, devicetree-discuss,
	sparclinux, linuxppc-dev, davem
In-Reply-To: <fa686aa40910090007lef3fddbod0c5843abfe33be5@mail.gmail.com>

On Fri, Oct 09, 2009 at 01:07:57AM -0600, Grant Likely wrote:
> On Fri, Oct 9, 2009 at 12:35 AM, David Gibson
> <david@gibson.dropbear.id.au> wrote:
> > On Tue, Oct 06, 2009 at 10:30:59PM -0600, Grant Likely wrote:
> >> Add a common header file for working with the flattened device tree
> >> data structure and merge the shared data tags used by Microblaze and
> >> PowerPC
> >>
> >> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
> >> ---
> >>
> >>  arch/microblaze/include/asm/prom.h |   12 +-----------
> >>  arch/powerpc/include/asm/prom.h    |   12 +-----------
> >>  include/linux/of_fdt.h             |   30 ++++++++++++++++++++++++++++++
> >>  3 files changed, 32 insertions(+), 22 deletions(-)
> >>  create mode 100644 include/linux/of_fdt.h
> >>
> >> diff --git a/arch/microblaze/include/asm/prom.h b/arch/microblaze/include/asm/prom.h
> >> index 64e8b3a..5f461f0 100644
> >> --- a/arch/microblaze/include/asm/prom.h
> >> +++ b/arch/microblaze/include/asm/prom.h
> >> @@ -17,20 +17,10 @@
> >>  #ifndef _ASM_MICROBLAZE_PROM_H
> >>  #define _ASM_MICROBLAZE_PROM_H
> >>  #ifdef __KERNEL__
> >> -
> >> -/* Definitions used by the flattened device tree */
> >> -#define OF_DT_HEADER         0xd00dfeed /* marker */
> >> -#define OF_DT_BEGIN_NODE     0x1 /* Start of node, full name */
> >> -#define OF_DT_END_NODE               0x2 /* End node */
> >> -#define OF_DT_PROP           0x3 /* Property: name off, size, content */
> >> -#define OF_DT_NOP            0x4 /* nop */
> >> -#define OF_DT_END            0x9
> >> -
> >> -#define OF_DT_VERSION                0x10
> >
> >
> > So, if you're merging all these, I guess the question is do we also
> > want to merge them with scripts/dtc/libfdt/fdt.h, and by extension
> > with the upstream libfdt header file which defines the same things.
> 
> I see your question and raise you another.  Where should the merge
> file live for it to be included both by dtc and kernel code? Or should
> it just be cloned in the kernel tree?

Yeah, a good question.  As I see it there are two options.  Number one
is to make sure everything relevant that the kernel needs is in the
libfdt version, then have the kernel code reference it from its
location in scripts/dtc.  The other option is to clone the file in the
kernel tree.  That requires keeping the copies in sync, in theory at
least, but that file has been pretty static (since it's only supposed
to contain passive structures/constants related to the physical flat
tree structure - no code or prototypes).

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
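
As an example of the kind of passive constant being discussed, a consumer
of the shared header might check the OF_DT_HEADER marker at the start of a
blob. The sketch below assumes the marker is stored big-endian, as in the
flattened device tree format; read_be32() and blob_has_dt_magic() are
hypothetical helpers, not part of libfdt or the kernel.

```c
#include <stddef.h>
#include <stdint.h>

#define OF_DT_HEADER 0xd00dfeed	/* marker, as in the quoted patch */

/* Read a big-endian 32-bit value regardless of host byte order. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Return 1 if the blob starts with the flattened device tree marker. */
static int blob_has_dt_magic(const unsigned char *blob, size_t len)
{
	if (blob == NULL || len < 4)
		return 0;
	return read_be32(blob) == OF_DT_HEADER;
}
```

Because the check depends only on the constant and on byte layout, it works
the same whether the definition comes from a kernel header or from a cloned
copy of the libfdt one.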


* Re: [PATCH 2/8] bitmap: Introduce bitmap_set, bitmap_clear, bitmap_find_next_zero_area
From: Akinobu Mita @ 2009-10-14  3:39 UTC
  To: Michael Ellerman
  Cc: Fenghua Yu, x86, linux-ia64, Thomas Gleixner, David S. Miller,
	netdev, Greg Kroah-Hartman, linux-kernel, linux-altix,
	Yevgeny Petrilin, FUJITA Tomonori, linuxppc-dev, Tony Luck,
	Paul Mackerras, H. Peter Anvin, sparclinux, Andrew Morton,
	linux-usb, Ingo Molnar, Lothar Wassmann
In-Reply-To: <1255470887.21871.2.camel@concordia>

On Wed, Oct 14, 2009 at 08:54:47AM +1100, Michael Ellerman wrote:
> On Tue, 2009-10-13 at 18:10 +0900, Akinobu Mita wrote:
> > My user space testing exposed an off-by-one error in find_next_zero_area
> > in iommu-helper.
> 
> Why not merge those tests into the kernel as a configurable boot-time
> self-test?

Here is the test program that I used. It obviously needs better
diagnostic messages and some cleanup before it can be added to the
kernel tests.

#include <stdio.h>
#include <time.h>
#include <stdlib.h>
#include <string.h>

#if 1 /* Copy and paste from kernel source */

#define BITS_PER_BYTE  8
#define BITS_PER_LONG (sizeof(long) * BITS_PER_BYTE)

#define BIT_WORD(nr)	((nr) / BITS_PER_LONG)
#define BITOP_WORD(nr)	((nr) / BITS_PER_LONG)

#define BITMAP_LAST_WORD_MASK(nbits)                                    \
(                                                                       \
        ((nbits) % BITS_PER_LONG) ?                                     \
                (1UL<<((nbits) % BITS_PER_LONG))-1 : ~0UL               \
)

#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) % BITS_PER_LONG))

void bitmap_set(unsigned long *map, int start, int nr)
{
	unsigned long *p = map + BIT_WORD(start);
	const int size = start + nr;
	int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
	unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);

	while (nr - bits_to_set >= 0) {
		*p |= mask_to_set;
		nr -= bits_to_set;
		bits_to_set = BITS_PER_LONG;
		mask_to_set = ~0UL;
		p++;
	}
	if (nr) {
		mask_to_set &= BITMAP_LAST_WORD_MASK(size);
		*p |= mask_to_set;
	}
}

void bitmap_clear(unsigned long *map, int start, int nr)
{
	unsigned long *p = map + BIT_WORD(start);
	const int size = start + nr;
	int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
	unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);

	while (nr - bits_to_clear >= 0) {
		*p &= ~mask_to_clear;
		nr -= bits_to_clear;
		bits_to_clear = BITS_PER_LONG;
		mask_to_clear = ~0UL;
		p++;
	}
	if (nr) {
		mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
		*p &= ~mask_to_clear;
	}
}

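static unsigned long __ffs(unsigned long word)
{
	int num = 0;

	/* Skip the low 32 bits first when long is 64 bits wide. */
	if (sizeof(long) == 8 && (word & 0xffffffffUL) == 0) {
		num += 32;
		word = (word >> 16) >> 16;
	}
	if ((word & 0xffff) == 0) {
		num += 16;
		word >>= 16;
	}
	if ((word & 0xff) == 0) {
		num += 8;
		word >>= 8;
	}
	if ((word & 0xf) == 0) {
		num += 4;
		word >>= 4;
	}
	if ((word & 0x3) == 0) {
		num += 2;
		word >>= 2;
	}
	if ((word & 0x1) == 0)
		num += 1;
	return num;
}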

unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
			    unsigned long offset)
{
	const unsigned long *p = addr + BITOP_WORD(offset);
	unsigned long result = offset & ~(BITS_PER_LONG-1);
	unsigned long tmp;

	if (offset >= size)
		return size;
	size -= result;
	offset %= BITS_PER_LONG;
	if (offset) {
		tmp = *(p++);
		tmp &= (~0UL << offset);
		if (size < BITS_PER_LONG)
			goto found_first;
		if (tmp)
			goto found_middle;
		size -= BITS_PER_LONG;
		result += BITS_PER_LONG;
	}
	while (size & ~(BITS_PER_LONG-1)) {
		if ((tmp = *(p++)))
			goto found_middle;
		result += BITS_PER_LONG;
		size -= BITS_PER_LONG;
	}
	if (!size)
		return result;
	tmp = *p;

found_first:
	tmp &= (~0UL >> (BITS_PER_LONG - size));
	if (tmp == 0UL)		/* Are any bits set? */
		return result + size;	/* Nope. */
found_middle:
	return result + __ffs(tmp);
}

#define ffz(x)  __ffs(~(x))

unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
				 unsigned long offset)
{
	const unsigned long *p = addr + BITOP_WORD(offset);
	unsigned long result = offset & ~(BITS_PER_LONG-1);
	unsigned long tmp;

	if (offset >= size)
		return size;
	size -= result;
	offset %= BITS_PER_LONG;
	if (offset) {
		tmp = *(p++);
		tmp |= ~0UL >> (BITS_PER_LONG - offset);
		if (size < BITS_PER_LONG)
			goto found_first;
		if (~tmp)
			goto found_middle;
		size -= BITS_PER_LONG;
		result += BITS_PER_LONG;
	}
	while (size & ~(BITS_PER_LONG-1)) {
		if (~(tmp = *(p++)))
			goto found_middle;
		result += BITS_PER_LONG;
		size -= BITS_PER_LONG;
	}
	if (!size)
		return result;
	tmp = *p;

found_first:
	tmp |= ~0UL << size;
	if (tmp == ~0UL)	/* Are any bits zero? */
		return result + size;	/* Nope. */
found_middle:
	return result + ffz(tmp);
}

#define __ALIGN_MASK(x,mask) (((x)+(mask))&~(mask))

static inline int test_bit(int nr, const volatile unsigned long *addr)
{
	return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
}

unsigned long bitmap_find_next_zero_area(unsigned long *map,
					 unsigned long size,
					 unsigned long start,
					 unsigned int nr,
					 unsigned long align_mask)
{
	unsigned long index, end, i;
again:
	index = find_next_zero_bit(map, size, start);

	/* Align allocation */
	index = __ALIGN_MASK(index, align_mask);

	end = index + nr;
#ifdef ORIGINAL
	if (end >= size)
#else
	if (end > size)
#endif
		return end;

#ifdef ORIGINAL
	for (i = index; i < end; i++) {
		if (test_bit(i, map)) {
			start = i+1;
			goto again;
		}
	}
#else
	i = find_next_bit(map, end, index);
	if (i < end) {
		start = i + 1;
		goto again;
	}
#endif
	return index;
}

#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
#define BITS_TO_LONGS(nr)       DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
#define DECLARE_BITMAP(name,bits) unsigned long name[BITS_TO_LONGS(bits)]

#endif /* Copy and paste from kernel source */

static DECLARE_BITMAP(bitmap, 1000);
static DECLARE_BITMAP(empty, 1000);
static DECLARE_BITMAP(full, 1000);

static void bitmap_dump(unsigned long *bitmap, int size)
{
	int i;

	for (i = 0; i < size; i++) {
		if (test_bit(i, bitmap))
			printf("1 ");
		else
			printf("0 ");
		if (i % 10 == 9)
			printf("\n");
	}
	printf("\n");
}

static int test1(int size)
{

	int start = random() % size;
	int nr = random() % (size - start);

	memset(bitmap, 0x00, BITS_TO_LONGS(size) * sizeof(unsigned long));

	bitmap_set(bitmap, start, nr);
	bitmap_clear(bitmap, start, nr);

	if (memcmp(empty, bitmap, BITS_TO_LONGS(size) * sizeof(unsigned long)))
		goto error;

	return 0;
error:
	bitmap_dump(bitmap, size);
	return 1;
}

int test2(int size)
{
	int start = random() % size;
	int nr = random() % (size - start);

	memset(bitmap, 0xff, BITS_TO_LONGS(size) * sizeof(unsigned long));

	bitmap_clear(bitmap, start, nr);
	bitmap_set(bitmap, start, nr);

	if (memcmp(full, bitmap, BITS_TO_LONGS(size) * sizeof(unsigned long)))
		goto error;

	return 0;
error:
	bitmap_dump(bitmap, size);
	return 1;
}

int test3(int size)
{
	int start = random() % size;
	int nr = random() % (size - start);
	unsigned long offset;

	memset(bitmap, 0x00, BITS_TO_LONGS(size) * sizeof(unsigned long));
	bitmap_set(bitmap, start, nr);
	if (start) {
		offset = bitmap_find_next_zero_area(bitmap, size, 0, start, 0);
		if (offset != 0) {
			printf("start %d nr %d\n", start, nr);
			printf("offset %lu != 0\n", offset);
			goto error;
		}
	}
	offset = bitmap_find_next_zero_area(bitmap, size, start,
						size - (start + nr), 0);
	if (offset != start + nr) {
		printf("start %d nr %d\n", start, nr);
		printf("offset %lu != start + nr %d\n", offset, start + nr);
		goto error;
	}

	return 0;
error:
	bitmap_dump(bitmap, size);

	return 1;
}

int test4(int size)
{
	int start = random() % size;
	int nr = random() % (size - start);
	unsigned long offset;

	memset(bitmap, 0xff, BITS_TO_LONGS(size) * sizeof(unsigned long));
	bitmap_clear(bitmap, start, nr);
	offset = bitmap_find_next_zero_area(bitmap, size, start, nr, 0);
	if (nr != 0) {
		if (offset != start) {
			printf("start %d nr %d\n", start, nr);
			printf("offset %lu != start %d\n", offset, start);
			goto error;
		}
	}
	return 0;
error:
	bitmap_dump(bitmap, size);

	return 1;
}

int main(int argc, char *argv[])
{
	int err = 0;

	srandom(time(NULL));

	memset(empty, 0x00, sizeof(empty));
	memset(full, 0xff, sizeof(full));

	/* Run random trials until one of the tests reports a failure. */
	while (!err) {
		err |= test1(1000);
		err |= test2(1000);
		err |= test3(1000);
		err |= test4(1000);
	}
	return err;
}
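
To make the masking in bitmap_set() above easier to follow, here is a
small sketch of the case where the whole range fits inside one word, which
is the path taken when the while loop never runs. single_word_mask() is a
hypothetical helper written for this explanation; the two macros are copied
from the program above, and the sketch assumes nr >= 1 and that
start + nr does not exceed BITS_PER_LONG.

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)

#define BITMAP_LAST_WORD_MASK(nbits)					\
(									\
	((nbits) % BITS_PER_LONG) ?					\
		(1UL << ((nbits) % BITS_PER_LONG)) - 1 : ~0UL		\
)

#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) % BITS_PER_LONG))

/*
 * Mask with bits [start, start + nr) set, for a range that lies entirely
 * within one word: everything from 'start' upwards, intersected with
 * everything below 'start + nr'.
 */
static unsigned long single_word_mask(unsigned int start, unsigned int nr)
{
	unsigned long first = BITMAP_FIRST_WORD_MASK(start);
	unsigned long last = BITMAP_LAST_WORD_MASK(start + nr);

	return first & last;
}
```

For example, start 4 and nr 3 give the first-word mask with bits 4 and up
set, the last-word mask with bits 0 to 6 set, and their intersection covers
bits 4, 5 and 6.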


* Re: New percpu & ppc64 perfs
From: Benjamin Herrenschmidt @ 2009-10-14  3:28 UTC
  To: Tejun Heo; +Cc: linuxppc-dev, linux-kernel@vger.kernel.org
In-Reply-To: <4AD52E37.80209@kernel.org>

On Wed, 2009-10-14 at 10:49 +0900, Tejun Heo wrote:
> For 256M segment, I don't think much can be done but for 1T segment,
> just limiting vmalloc area size to 1T should do the trick, no?

Right. I'll have a look at it.

Cheers,
Ben.


* [PATCH -mmotm] Fix bitmap-introduce-bitmap_set-bitmap_clear-bitmap_find_next_zero_area.patch
From: Akinobu Mita @ 2009-10-14  3:22 UTC
  To: Andrew Morton
  Cc: Fenghua Yu, Greg Kroah-Hartman, linux-ia64, Tony Luck, x86,
	netdev, linux-kernel, linux-altix, Yevgeny Petrilin,
	FUJITA Tomonori, linuxppc-dev, Ingo Molnar, Paul Mackerras,
	H. Peter Anvin, sparclinux, Thomas Gleixner, linux-usb,
	David S. Miller, Lothar Wassmann
In-Reply-To: <20091013091017.GA18431@localhost.localdomain>

Update PATCH 2/8 based on review comments by Andrew and fix a bug
exposed by user space testing.

I didn't change the align_mask argument at this time because it
turned out that doing so needs more changes in iommu-helper users.

From: Akinobu Mita <akinobu.mita@gmail.com>
Subject: Fix bitmap-introduce-bitmap_set-bitmap_clear-bitmap_find_next_zero_area.patch

- Rewrite bitmap_set and bitmap_clear

  Set or clear whole words at a time instead of one bit at a time.

- Fix off-by-one error in bitmap_find_next_zero_area

  This bug was inherited from find_next_zero_area in iommu-helper.

- Add kerneldoc for bitmap_find_next_zero_area

This patch is supposed to be folded into
bitmap-introduce-bitmap_set-bitmap_clear-bitmap_find_next_zero_area.patch

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
---
 lib/bitmap.c |   60 +++++++++++++++++++++++++++++++++++++++++++++------------
 1 files changed, 47 insertions(+), 13 deletions(-)

diff --git a/lib/bitmap.c b/lib/bitmap.c
index 2415da4..84292c9 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -271,28 +271,62 @@ int __bitmap_weight(const unsigned long *bitmap, int bits)
 }
 EXPORT_SYMBOL(__bitmap_weight);
 
-void bitmap_set(unsigned long *map, int i, int len)
-{
-	int end = i + len;
+#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) % BITS_PER_LONG))
 
-	while (i < end) {
-		__set_bit(i, map);
-		i++;
+void bitmap_set(unsigned long *map, int start, int nr)
+{
+	unsigned long *p = map + BIT_WORD(start);
+	const int size = start + nr;
+	int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
+	unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
+
+	while (nr - bits_to_set >= 0) {
+		*p |= mask_to_set;
+		nr -= bits_to_set;
+		bits_to_set = BITS_PER_LONG;
+		mask_to_set = ~0UL;
+		p++;
+	}
+	if (nr) {
+		mask_to_set &= BITMAP_LAST_WORD_MASK(size);
+		*p |= mask_to_set;
 	}
 }
 EXPORT_SYMBOL(bitmap_set);
 
 void bitmap_clear(unsigned long *map, int start, int nr)
 {
-	int end = start + nr;
-
-	while (start < end) {
-		__clear_bit(start, map);
-		start++;
+	unsigned long *p = map + BIT_WORD(start);
+	const int size = start + nr;
+	int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
+	unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
+
+	while (nr - bits_to_clear >= 0) {
+		*p &= ~mask_to_clear;
+		nr -= bits_to_clear;
+		bits_to_clear = BITS_PER_LONG;
+		mask_to_clear = ~0UL;
+		p++;
+	}
+	if (nr) {
+		mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
+		*p &= ~mask_to_clear;
 	}
 }
 EXPORT_SYMBOL(bitmap_clear);
 
+/*
+ * bitmap_find_next_zero_area - find a contiguous aligned zero area
+ * @map: The address to base the search on
+ * @size: The bitmap size in bits
+ * @start: The bitnumber to start searching at
+ * @nr: The number of zeroed bits we're looking for
+ * @align_mask: Alignment mask for zero area
+ *
+ * The @align_mask should be one less than a power of 2; the effect is that
+ * the bit offset of all zero areas this function finds is multiples of that
+ * power of 2. A @align_mask of 0 means no alignment is required.
+ */
 unsigned long bitmap_find_next_zero_area(unsigned long *map,
 					 unsigned long size,
 					 unsigned long start,
@@ -304,10 +338,10 @@ again:
 	index = find_next_zero_bit(map, size, start);
 
 	/* Align allocation */
-	index = (index + align_mask) & ~align_mask;
+	index = __ALIGN_MASK(index, align_mask);
 
 	end = index + nr;
-	if (end >= size)
+	if (end > size)
 		return end;
 	i = find_next_bit(map, end, index);
 	if (i < end) {
-- 
1.5.4.3
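
The off-by-one fixed above concerns a free area that ends exactly at the
last bit of the bitmap: with "end >= size" such an area is rejected even
though it fits. The sketch below is a naive reference search over a
byte-per-bit array, written only to show the boundary condition.
naive_find_zero_area() is a hypothetical helper, and unlike the patched
function it returns size (not end) when nothing is found and ignores
alignment.

```c
#include <stddef.h>

/*
 * Return the first index >= start at which nr consecutive zero entries
 * begin, or size if there is no such run. An area is acceptable when
 * index + nr <= size, i.e. it is rejected only when end > size.
 */
static size_t naive_find_zero_area(const unsigned char *map, size_t size,
				   size_t start, size_t nr)
{
	size_t index, i;

	for (index = start; index + nr <= size; index++) {
		for (i = 0; i < nr; i++)
			if (map[index + i])
				break;
		if (i == nr)
			return index;
	}
	return size;
}
```

With an 8-entry map whose last three entries are zero, a request for three
entries succeeds at index 5, where end equals size; a check of the form
"end >= size" would have refused it.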


* Re: [PATCH] Ftrace : fix function_graph tracer OOPS
From: Steven Rostedt @ 2009-10-14  3:01 UTC
  To: Sachin Sant; +Cc: linuxppc-dev
In-Reply-To: <4ACDFC83.4080205@in.ibm.com>

On Thu, 2009-10-08 at 20:21 +0530, Sachin Sant wrote:
> Switch to LOAD_REG_ADDR().
> 
> Signed-off-by : Sachin Sant <sachinp@in.ibm.com>
> ---
> diff -Naurp old/arch/powerpc/kernel/entry_64.S
> new/arch/powerpc/kernel/entry_64.S
> --- old/arch/powerpc/kernel/entry_64.S  2009-10-08 18:37:44.000000000
> +0530
> +++ new/arch/powerpc/kernel/entry_64.S  2009-10-08 18:34:33.000000000
> +0530
> @@ -1038,8 +1038,8 @@ _GLOBAL(mod_return_to_handler)
>          * We are in a module using the module's TOC.
>          * Switch to our TOC to run inside the core kernel.
>          */
> -       LOAD_REG_IMMEDIATE(r4,ftrace_return_to_handler)
> -       ld      r2, 8(r4)
> +       ld      r2, PACATOC(r13)
> +       LOAD_REG_ADDR(r4,ftrace_return_to_handler)

Actually, the loading of this register is not needed. The original only
loaded it in order to get r2.

I actually wrote a fix for this a month ago. I never sent it out because
I was distracted by other issues.

I'll send out the two patches I had now.

Could you test them?

Thanks!

-- Steve

>  
>         bl      .ftrace_return_to_handler
>         nop
> 


* Re: New percpu & ppc64 perfs
From: Tejun Heo @ 2009-10-14  1:49 UTC
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, linux-kernel@vger.kernel.org
In-Reply-To: <1255478358.2347.28.camel@pasglop>

Hello, Benjamin.

Benjamin Herrenschmidt wrote:
> So I found (and fixed, though the patch isn't upstream yet) the problem
> that was causing the new percpu to hang when accessing the top of our
> vmalloc space.
> 
> However, I have some concerns about that choice of location for the
> percpu data.
> 
> Basically, our MMU divides the address space into "segments" (of 256M or
> 1T depending on your processor capabilities) and those segments are SW
> loaded into a relatively small (64 entries) SLB buffer.
> 
> Thus, by moving the per-cpu to the end of the vmalloc space, you
> essentially make it use a different segment from the rest of the vmalloc
> space, which will overall degrade performance by increasing pressure on
> the SLB.
> 
> It would be nicer if we could provide an arch function to provide a
> "preferred" location for the per-cpu data.
> 
> I can easily cook up a patch but wanted to discuss that with you first.
> Any reason why we would keep it within vmalloc space for example ? IE. I
> could move VMALLOC_END to below the per-cpu reserved areas, or are they
> subject to expansion past boot time ?
> 
> Also, how big can they be ? Ie, will the top of the first 256M segment
> be good enough or will that risk running out of space ? In general,
> machines with 256M segments won't have more than 64 or maybe 128 CPUs I
> believe. Bigger machines will have CPUs that support 1T segments.

Hmm... I don't think 256M segment will be enough.  Percpu area layout
will follow how numa memory is laid out.  For example, if a machine
has 4 nodes (each one with one cpu) and memory for each node is 1G in
size and 1G apart, the first chunk will be embedded in the linear
mapping area (normal kernel addressable area) and each unit in the
chunk will be apart by between 1G and 2G.  As the first chunk is
embedded in the linear mapped area, this shouldn't cause any extra
overhead.

The vmalloc area is used when the first chunk is filled and another
chunk needs to be allocated.  From the second chunk on, vmalloc area is
used to preserve the layout of the first chunk.  ie. Each of them will
span across 8G bytes (they will overlap tho, so even with many dynamic
chunks vm usage will only be slightly over 8G).

The reason why vmalloc area from the top is used is that I didn't want
this congruent allocation to compete with normal vmalloc allocations.
Depending on the numa layout, competition between linear allocation
and congruent allocation may create many unnecessary holes.

For 256M segment, I don't think much can be done but for 1T segment,
just limiting vmalloc area size to 1T should do the trick, no?
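
To put rough numbers on the SLB concern, here is a small userspace
sketch (not kernel code) that counts how many segments a virtual range
touches for 256M (2^28) and 1T (2^40) segment sizes.  segments_touched()
is a hypothetical helper, and any base address fed to it is an
illustrative assumption rather than the real vmalloc layout.

```c
#include <stdint.h>

#define SEG_SHIFT_256M 28 /* 256M = 2^28 bytes */
#define SEG_SHIFT_1T   40 /* 1T   = 2^40 bytes */

/* Number of distinct segments covered by [start, start + len), len > 0. */
static uint64_t segments_touched(uint64_t start, uint64_t len,
				 unsigned int seg_shift)
{
	uint64_t first = start >> seg_shift;
	uint64_t last = (start + len - 1) >> seg_shift;

	return last - first + 1;
}
```

With a segment-aligned base, an 8G span covers 32 segments of 256M but
fits inside a single 1T segment, which is the intuition behind capping
the vmalloc area at 1T on CPUs that support 1T segments.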

Thanks.

-- 
tejun

^ permalink raw reply

* New percpu & ppc64 perfs
From: Benjamin Herrenschmidt @ 2009-10-13 23:59 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linuxppc-dev, linux-kernel@vger.kernel.org

Hi Tejun !

So I found (and fixed, though the patch isn't upstream yet) the problem
that was causing the new percpu to hang when accessing the top of our
vmalloc space.

However, I have some concerns about that choice of location for the
percpu data.

Basically, our MMU divides the address space into "segments" (of 256M or
1T depending on your processor capabilities) and those segments are SW
loaded into a relatively small (64 entries) SLB buffer.

Thus, by moving the per-cpu to the end of the vmalloc space, you
essentially make it use a different segment from the rest of the vmalloc
space, which will degrade overall performance by increasing pressure on
the SLB.

It would be nicer if we could provide an arch function to provide a
"preferred" location for the per-cpu data.

I can easily cook up a patch but wanted to discuss that with you first.
Any reason why we would keep it within vmalloc space for example ? IE. I
could move VMALLOC_END to below the per-cpu reserved areas, or are they
subject to expansion past boot time ?

Also, how big can they be ? Ie, will the top of the first 256M segment
be good enough or will that risk running out of space ? In general,
machines with 256M segments won't have more than 64 or maybe 128 CPUs I
believe. Bigger machines will have CPUs that support 1T segments.

Cheers,
Ben.
 

^ permalink raw reply

* Re: [PATCH 4/5 v3] kernel handling of memory DLPAR
From: Michael Ellerman @ 2009-10-13 22:31 UTC (permalink / raw)
  To: Nathan Fontenot; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <4AD4C346.20801@austin.ibm.com>

On Tue, 2009-10-13 at 13:13 -0500, Nathan Fontenot wrote:
> This adds the capability to DLPAR add and remove memory from the kernel.  The

Hi Nathan,

Sorry to only get around to reviewing version 3, time is a commodity in
short supply :)

> Index: powerpc/arch/powerpc/platforms/pseries/dlpar.c
> ===================================================================
> --- powerpc.orig/arch/powerpc/platforms/pseries/dlpar.c	2009-10-08 11:08:42.000000000 -0500
> +++ powerpc/arch/powerpc/platforms/pseries/dlpar.c	2009-10-13 13:08:22.000000000 -0500
> @@ -16,6 +16,10 @@
>  #include <linux/notifier.h>
>  #include <linux/proc_fs.h>
>  #include <linux/spinlock.h>
> +#include <linux/memory_hotplug.h>
> +#include <linux/sysdev.h>
> +#include <linux/sysfs.h>
> +
>  
>  #include <asm/prom.h>
>  #include <asm/machdep.h>
> @@ -404,11 +408,165 @@
>  	return 0;
>  }
>  
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +
> +static struct property *clone_property(struct property *old_prop)
> +{
> +	struct property *new_prop;
> +
> +	new_prop = kzalloc((sizeof *new_prop), GFP_KERNEL);
> +	if (!new_prop)
> +		return NULL;
> +
> +	new_prop->name = kzalloc(strlen(old_prop->name) + 1, GFP_KERNEL);

kstrdup()?

> +	new_prop->value = kzalloc(old_prop->length + 1, GFP_KERNEL);
> +	if (!new_prop->name || !new_prop->value) {
> +		free_property(new_prop);
> +		return NULL;
> +	}
> +
> +	strcpy(new_prop->name, old_prop->name);
> +	memcpy(new_prop->value, old_prop->value, old_prop->length);
> +	new_prop->length = old_prop->length;
> +
> +	return new_prop;
> +}
> +
> +int platform_probe_memory(u64 phys_addr)
> +{
> +	struct device_node *dn;
> +	struct property *new_prop, *old_prop;
> +	struct property *lmb_sz_prop;
> +	struct of_drconf_cell *drmem;
> +	u64 lmb_size;
> +	int num_entries, i, rc;
> +
> +	if (!phys_addr)
> +		return -EINVAL;
> +
> +	dn = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
> +	if (!dn)
> +		return -EINVAL;
> +
> +	lmb_sz_prop = of_find_property(dn, "ibm,lmb-size", NULL);
> +	lmb_size = *(u64 *)lmb_sz_prop->value;

of_get_property() ?
> +
> +	old_prop = of_find_property(dn, "ibm,dynamic-memory", NULL);

I know we should never fail to find these properties, but it would be
nice to check just in case.

> +
> +	num_entries = *(u32 *)old_prop->value;
> +	drmem = (struct of_drconf_cell *)
> +				((char *)old_prop->value + sizeof(u32));

You do this dance twice (see below), a struct might make it cleaner.
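
Something along these lines, as a userspace sketch.  The cell fields
are my assumption of what struct of_drconf_cell contains, struct
drconf_prop and drconf_find_by_addr() are hypothetical names, and the
packed attribute (GCC/Clang) is there because the cells start 4 bytes
into the property value, so the u64 is not naturally aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cell layout, mirroring the fields the patch dereferences. */
struct drconf_cell {
	uint64_t base_addr;
	uint32_t drc_index;
	uint32_t reserved;
	uint32_t aa_index;
	uint32_t flags;
} __attribute__((packed));

/* Hypothetical wrapper: entry count followed directly by the cells. */
struct drconf_prop {
	uint32_t num_entries;
	struct drconf_cell cells[];
} __attribute__((packed));

/* Return the cell whose LMB covers phys_addr, or NULL if none does. */
static struct drconf_cell *drconf_find_by_addr(void *value,
					       uint64_t lmb_size,
					       uint64_t phys_addr)
{
	struct drconf_prop *prop = value;
	uint32_t i;

	for (i = 0; i < prop->num_entries; i++) {
		uint64_t base = prop->cells[i].base_addr;

		if (phys_addr >= base && phys_addr < base + lmb_size)
			return &prop->cells[i];
	}

	return NULL;
}
```

Both the probe and release paths could then cast the property value
once and index cells[] directly.  The sketch reads values in host byte
order.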

> +	for (i = 0; i < num_entries; i++) {
> +		u64 lmb_end_addr = drmem[i].base_addr + lmb_size;
> +		if (phys_addr >= drmem[i].base_addr
> +		    && phys_addr < lmb_end_addr)
> +			break;
> +	}
> +
> +	if (i >= num_entries) {
> +		of_node_put(dn);
> +		return -EINVAL;
> +	}
> +
> +	if (drmem[i].flags & DRCONF_MEM_ASSIGNED) {
> +		of_node_put(dn);
> +		return 0;

This is the already added case?

> +	}
> +
> +	rc = acquire_drc(drmem[i].drc_index);
> +	if (rc) {
> +		of_node_put(dn);
> +		return -1;

-1 ?

> +	}
> +
> +	new_prop = clone_property(old_prop);
> +	drmem = (struct of_drconf_cell *)
> +				((char *)new_prop->value + sizeof(u32));
> +
> +	drmem[i].flags |= DRCONF_MEM_ASSIGNED;
> +	prom_update_property(dn, new_prop, old_prop);
> +
> +	rc = blocking_notifier_call_chain(&pSeries_reconfig_chain,
> +					  PSERIES_DRCONF_MEM_ADD,
> +					  &drmem[i].base_addr);
> +	if (rc == NOTIFY_BAD) {
> +		prom_update_property(dn, old_prop, new_prop);
> +		release_drc(drmem[i].drc_index);
> +	}
> +
> +	of_node_put(dn);
> +	return rc == NOTIFY_BAD ? -1 : 0;

-1 ?

> +}
> +
> +static ssize_t memory_release_store(struct class *class, const char *buf,
> +				    size_t count)
> +{
> +	unsigned long drc_index;
> +	struct device_node *dn;
> +	struct property *new_prop, *old_prop;
> +	struct of_drconf_cell *drmem;
> +	int num_entries;
> +	int i, rc;
> +
> +	rc = strict_strtoul(buf, 0, &drc_index);
> +	if (rc)
> +		return -EINVAL;
> +
> +	dn = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
> +	if (!dn)
> +		return 0;

0 really?

> +
> +	old_prop = of_find_property(dn, "ibm,dynamic-memory", NULL);
> +	new_prop = clone_property(old_prop);
> +
> +	num_entries = *(u32 *)new_prop->value;
> +	drmem = (struct of_drconf_cell *)
> +				((char *)new_prop->value + sizeof(u32));
> +
> +	for (i = 0; i < num_entries; i++) {
> +		if (drmem[i].drc_index == drc_index)
> +			break;
> +	}
> +
> +	if (i >= num_entries) {
> +		free_property(new_prop);
> +		of_node_put(dn);
> +		return -EINVAL;
> +	}

Couldn't you use old_prop up until here? They're identical aren't they, so
you can do the clone here and you can avoid the free in the above error
case.

> +	drmem[i].flags &= ~DRCONF_MEM_ASSIGNED;
> +	prom_update_property(dn, new_prop, old_prop);
> +
> +	rc = blocking_notifier_call_chain(&pSeries_reconfig_chain,
> +					  PSERIES_DRCONF_MEM_REMOVE,
> +					  &drmem[i].base_addr);
> +	if (rc != NOTIFY_BAD)
> +		rc = release_drc(drc_index);
> +
> +	if (rc)
> +		prom_update_property(dn, old_prop, new_prop);
> +
> +	of_node_put(dn);
> +	return rc ? -1 : count;

-1, EPERM?
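
To spell out the concern: a bare -1 reaches userspace as EPERM
("Operation not permitted"), which is misleading for a firmware or
notifier failure.  A sketch of one way to name the failures instead;
the enum and the particular errno picks are illustrative assumptions,
not something the patch or the platform defines.

```c
#include <errno.h>

/* Hypothetical failure points in the probe/release paths. */
enum dlpar_failure {
	DLPAR_OK,
	DLPAR_NO_NODE,        /* device-tree node or property missing */
	DLPAR_FIRMWARE_ERROR, /* acquire_drc()/release_drc() failed */
	DLPAR_NOTIFIER_VETO   /* notifier chain returned NOTIFY_BAD */
};

/* Map each failure to a named errno; the choices are illustrative only. */
static int dlpar_failure_to_errno(enum dlpar_failure why)
{
	switch (why) {
	case DLPAR_OK:
		return 0;
	case DLPAR_NO_NODE:
		return -ENODEV;
	case DLPAR_FIRMWARE_ERROR:
		return -EIO;
	case DLPAR_NOTIFIER_VETO:
		return -EBUSY;
	}

	return -EINVAL;
}
```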

> +}
> +
> +static struct class_attribute class_attr_mem_release =
> +			__ATTR(release, S_IWUSR, NULL, memory_release_store);
> +#endif
> +
>  static int pseries_dlpar_init(void)
>  {
>  	if (!machine_is(pseries))
>  		return 0;
>  
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +	if (sysfs_create_file(&memory_sysdev_class.kset.kobj,
> +			      &class_attr_mem_release.attr))
> +		printk(KERN_INFO "DLPAR: Could not create sysfs memory "
> +		       "release file\n");
> +#endif
> +
>  	return 0;
>  }
>  device_initcall(pseries_dlpar_init);
> Index: powerpc/arch/powerpc/mm/mem.c
> ===================================================================
> --- powerpc.orig/arch/powerpc/mm/mem.c	2009-10-08 11:07:45.000000000 -0500
> +++ powerpc/arch/powerpc/mm/mem.c	2009-10-08 11:08:54.000000000 -0500
> @@ -111,8 +111,19 @@
>  #ifdef CONFIG_MEMORY_HOTPLUG
>  
>  #ifdef CONFIG_NUMA
> +int __attribute ((weak)) platform_probe_memory(u64 start)

__weak

Though be careful, I think this is vulnerable to a bug in some
toolchains where the compiler will inline this version. See the comment
around early_irq_init() in kernel/softirq.c for example.

This will need to be a ppc_md hook as soon as another platform supports
memory hotplug, though that may be never :)
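
For reference, a minimal userspace sketch of the weak-default pattern
using the GCC/Clang weak attribute.  hot_add_scn_to_nid_stub() is a
hypothetical stand-in for the real NUMA lookup.  As far as I understand
the toolchain issue, the usual way to sidestep it is to keep the weak
default in a different file from its callers; they share one file here
only to keep the example short.

```c
#include <stdint.h>

/* Weak default: a platform may supply a strong definition elsewhere. */
__attribute__((weak)) int platform_probe_memory(uint64_t start)
{
	(void)start;
	return 0;
}

/* Stand-in for hot_add_scn_to_nid(); always reports node 0 here. */
static int hot_add_scn_to_nid_stub(uint64_t start)
{
	(void)start;
	return 0;
}

/* Give the platform a chance to add the memory before the nid lookup. */
int memory_add_physaddr_to_nid(uint64_t start)
{
	int rc = platform_probe_memory(start);

	if (rc)
		return rc;

	return hot_add_scn_to_nid_stub(start);
}
```

If another object file defines a non-weak platform_probe_memory(), the
linker picks that one and the default above is dropped.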

> +{
> +	return 0;
> +}
> +
>  int memory_add_physaddr_to_nid(u64 start)
>  {
> +	int rc;
> +
> +	rc = platform_probe_memory(start);
> +	if (rc)
> +		return rc;
> +
>  	return hot_add_scn_to_nid(start);
>  }
>  #endif

cheers


^ permalink raw reply

* Re: [PATCH 5/5 v2] kernel handling of CPU DLPAR
From: Michael Ellerman @ 2009-10-13 22:30 UTC (permalink / raw)
  To: Nathan Fontenot; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <4AD4C3A3.5050103@austin.ibm.com>

On Tue, 2009-10-13 at 13:14 -0500, Nathan Fontenot wrote:
> This adds the capability to DLPAR add and remove CPUs from the kernel. This
> creates two new files /sys/devices/system/cpu/probe and
> /sys/devices/system/cpu/release to handle the DLPAR addition and removal of
> CPUs respectively.

How does this relate to the existing cpu hotplug mechanism? Or is this
making the cpu exist (possible), vs marking it as online?

Is some other platform going to want to do the same? ie. should the
probe/release part be in generic code?

> Index: powerpc/arch/powerpc/platforms/pseries/dlpar.c
> ===================================================================
> --- powerpc.orig/arch/powerpc/platforms/pseries/dlpar.c	2009-10-13 13:08:22.000000000 -0500
> +++ powerpc/arch/powerpc/platforms/pseries/dlpar.c	2009-10-13 13:09:00.000000000 -0500
> @@ -1,11 +1,11 @@
>  /*
> - * dlpar.c - support for dynamic reconfiguration (including PCI
> - * Hotplug and Dynamic Logical Partitioning on RPA platforms).
> + * dlpar.c - support for dynamic reconfiguration (including PCI,

We know it's dlpar.c :)

> + * Memory, and CPU Hotplug and Dynamic Logical Partitioning on
> + * PAPR platforms).
>   *
>   * Copyright (C) 2009 Nathan Fontenot
>   * Copyright (C) 2009 IBM Corporation
>   *
> - *
>   * This program is free software; you can redistribute it and/or
>   * modify it under the terms of the GNU General Public License version
>   * 2 as published by the Free Software Foundation.
> @@ -19,6 +19,7 @@
>  #include <linux/memory_hotplug.h>
>  #include <linux/sysdev.h>
>  #include <linux/sysfs.h>
> +#include <linux/cpu.h>
>  
> 
>  #include <asm/prom.h>
> @@ -408,6 +409,82 @@
>  	return 0;
>  }
>  
> +#ifdef CONFIG_HOTPLUG_CPU
> +static ssize_t cpu_probe_store(struct class *class, const char *buf,
> +			       size_t count)
> +{
> +	struct device_node *dn;
> +	unsigned long drc_index;
> +	char *cpu_name;
> +	int rc;
> +
> +	rc = strict_strtoul(buf, 0, &drc_index);
> +	if (rc)
> +		return -EINVAL;
> +
> +	rc = acquire_drc(drc_index);
> +	if (rc)
> +		return rc;
> +
> +	dn = configure_connector(drc_index);
> +	if (!dn) {
> +		release_drc(drc_index);
> +		return rc;
> +	}
> +
> +	/* fixup dn name */
> +	cpu_name = kzalloc(strlen(dn->full_name) + strlen("/cpus/") + 1,
> +			   GFP_KERNEL);
> +	if (!cpu_name) {
> +		free_cc_nodes(dn);
> +		release_drc(drc_index);
> +		return -ENOMEM;
> +	}
> +
> +	sprintf(cpu_name, "/cpus/%s", dn->full_name);
> +	kfree(dn->full_name);
> +	dn->full_name = cpu_name;

What was all that? Firmware gives us a bogus full name? But the parent
is right?

> +	rc = add_device_tree_nodes(dn);
> +	if (rc)
> +		release_drc(drc_index);
> +
> +	return rc ? rc : count;

Are you sure rc is < 0 here?
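
The reason it matters: a store handler's return value is either a
negative errno or the number of bytes consumed, so a positive status
leaking out of a helper would be read as a byte count.  A userspace
sketch of a guard; store_result() is a hypothetical helper and -EIO is
an arbitrary choice for the "unexpected positive status" case.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: turn an internal status into a store() result. */
static ssize_t store_result(int rc, size_t count)
{
	if (rc == 0)
		return (ssize_t)count; /* success: whole buffer consumed */
	if (rc < 0)
		return rc;             /* already a negative errno */

	return -EIO;                   /* positive status is not a byte count */
}
```

Related: the !dn branch above returns rc after acquire_drc() has
succeeded, so it looks like it returns 0 rather than an error.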

> +}
> +
> +static ssize_t cpu_release_store(struct class *class, const char *buf,
> +				 size_t count)
> +{
> +	struct device_node *dn;
> +	u32 *drc_index;
> +	int rc;
> +
> +	dn = of_find_node_by_path(buf);
> +	if (!dn)
> +		return -EINVAL;
> +
> +	drc_index = (u32 *)of_get_property(dn, "ibm,my-drc-index", NULL);

No cast required.

> +	if (!drc_index) {
> +		of_node_put(dn);
> +		return -EINVAL;
> +	}
> +
> +	rc = release_drc(*drc_index);
> +	if (rc) {
> +		of_node_put(dn);
> +		return rc;
> +	}
> +
> +	rc = remove_device_tree_nodes(dn);
> +	if (rc)
> +		acquire_drc(*drc_index);
> +
> +	of_node_put(dn);
> +	return rc ? rc : count;
> +}


cheers

^ permalink raw reply

* Re: [PATCH 2/8] bitmap: Introduce bitmap_set, bitmap_clear, bitmap_find_next_zero_area
From: Michael Ellerman @ 2009-10-13 21:54 UTC (permalink / raw)
  To: Akinobu Mita
  Cc: Fenghua Yu, x86, linux-ia64, Thomas Gleixner, David S. Miller,
	netdev, Greg Kroah-Hartman, linux-kernel, linux-altix,
	Yevgeny Petrilin, FUJITA Tomonori, linuxppc-dev, Tony Luck,
	Paul Mackerras, H. Peter Anvin, sparclinux, Andrew Morton,
	linux-usb, Ingo Molnar, Lothar Wassmann
In-Reply-To: <20091013091017.GA18431@localhost.localdomain>

On Tue, 2009-10-13 at 18:10 +0900, Akinobu Mita wrote:
> My user space testing exposed off-by-one error find_next_zero_area
> in iommu-helper.

Why not merge those tests into the kernel as a configurable boot-time
self-test?

cheers

^ permalink raw reply

* Re: [PATCH 0/8] gianfar: Add support for hibernation
From: David Miller @ 2009-10-13 19:09 UTC (permalink / raw)
  To: afleming; +Cc: scottwood, linuxppc-dev, netdev
In-Reply-To: <64B2BB18-32DC-4B98-95D6-F203F74040D5@freescale.com>

From: Andy Fleming <afleming@freescale.com>
Date: Tue, 13 Oct 2009 12:22:38 -0500

> No, it was fine (though made unnecessary by other patches).  The BD
> has a union:
> 
>                 struct {
>                         u16     status; /* Status Fields */
>                         u16     length; /* Buffer length */
>                 };
>                 u32 lstatus;
> 
> so when you write "lstatus", you need to use the BD_LFLAG() macro, but
> when you write "status", you are just setting the status bits.

Indeed I missed that, thanks.
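
For anyone else reading along, a sketch of the relationship described
above, assuming the big-endian layout where "status" is the upper 16
bits of "lstatus".  The BD_LFLAG() definition shown is my assumption of
how the driver defines it, and bd_lstatus() is a hypothetical helper.

```c
#include <stdint.h>

/* Shift 16-bit status flags into the upper half of the 32-bit lstatus. */
#define BD_LFLAG(flags) ((uint32_t)(flags) << 16)

/* Build the combined lstatus word from the status bits and the length. */
static uint32_t bd_lstatus(uint16_t status, uint16_t length)
{
	return BD_LFLAG(status) | length;
}
```

Writing "status" directly only touches the flag half, while writing
"lstatus" sets the flags and the length in one 32-bit store.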

^ permalink raw reply

* Re: [PATCH 5/5 v2] kernel handling of CPU DLPAR
From: Nathan Fontenot @ 2009-10-13 18:14 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: linux-kernel
In-Reply-To: <4AB3A172.4090601@austin.ibm.com>

This adds the capability to DLPAR add and remove CPUs from the kernel. This
creates two new files /sys/devices/system/cpu/probe and
/sys/devices/system/cpu/release to handle the DLPAR addition and removal of
CPUs respectively.

CPU DLPAR add is accomplished by writing the drc-index of the CPU to the
probe file, and removal is done by writing the device-tree path of the cpu
to the release file.

Updated to include #ifdef CONFIG_HOTPLUG_CPU around the cpu hotplug specific
bits so that it will build without CONFIG_HOTPLUG_CPU defined.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
--- 

Index: powerpc/arch/powerpc/platforms/pseries/dlpar.c
===================================================================
--- powerpc.orig/arch/powerpc/platforms/pseries/dlpar.c	2009-10-13 13:08:22.000000000 -0500
+++ powerpc/arch/powerpc/platforms/pseries/dlpar.c	2009-10-13 13:09:00.000000000 -0500
@@ -1,11 +1,11 @@
 /*
- * dlpar.c - support for dynamic reconfiguration (including PCI
- * Hotplug and Dynamic Logical Partitioning on RPA platforms).
+ * dlpar.c - support for dynamic reconfiguration (including PCI,
+ * Memory, and CPU Hotplug and Dynamic Logical Partitioning on
+ * PAPR platforms).
  *
  * Copyright (C) 2009 Nathan Fontenot
  * Copyright (C) 2009 IBM Corporation
  *
- *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License version
  * 2 as published by the Free Software Foundation.
@@ -19,6 +19,7 @@
 #include <linux/memory_hotplug.h>
 #include <linux/sysdev.h>
 #include <linux/sysfs.h>
+#include <linux/cpu.h>
 
 
 #include <asm/prom.h>
@@ -408,6 +409,82 @@
 	return 0;
 }
 
+#ifdef CONFIG_HOTPLUG_CPU
+static ssize_t cpu_probe_store(struct class *class, const char *buf,
+			       size_t count)
+{
+	struct device_node *dn;
+	unsigned long drc_index;
+	char *cpu_name;
+	int rc;
+
+	rc = strict_strtoul(buf, 0, &drc_index);
+	if (rc)
+		return -EINVAL;
+
+	rc = acquire_drc(drc_index);
+	if (rc)
+		return rc;
+
+	dn = configure_connector(drc_index);
+	if (!dn) {
+		release_drc(drc_index);
+		return rc;
+	}
+
+	/* fixup dn name */
+	cpu_name = kzalloc(strlen(dn->full_name) + strlen("/cpus/") + 1,
+			   GFP_KERNEL);
+	if (!cpu_name) {
+		free_cc_nodes(dn);
+		release_drc(drc_index);
+		return -ENOMEM;
+	}
+
+	sprintf(cpu_name, "/cpus/%s", dn->full_name);
+	kfree(dn->full_name);
+	dn->full_name = cpu_name;
+
+	rc = add_device_tree_nodes(dn);
+	if (rc)
+		release_drc(drc_index);
+
+	return rc ? rc : count;
+}
+
+static ssize_t cpu_release_store(struct class *class, const char *buf,
+				 size_t count)
+{
+	struct device_node *dn;
+	u32 *drc_index;
+	int rc;
+
+	dn = of_find_node_by_path(buf);
+	if (!dn)
+		return -EINVAL;
+
+	drc_index = (u32 *)of_get_property(dn, "ibm,my-drc-index", NULL);
+	if (!drc_index) {
+		of_node_put(dn);
+		return -EINVAL;
+	}
+
+	rc = release_drc(*drc_index);
+	if (rc) {
+		of_node_put(dn);
+		return rc;
+	}
+
+	rc = remove_device_tree_nodes(dn);
+	if (rc)
+		acquire_drc(*drc_index);
+
+	of_node_put(dn);
+	return rc ? rc : count;
+}
+
+#endif /* CONFIG_HOTPLUG_CPU */
+
 #ifdef CONFIG_MEMORY_HOTPLUG
 
 static struct property *clone_property(struct property *old_prop)
@@ -553,6 +630,13 @@
 
 static struct class_attribute class_attr_mem_release =
 			__ATTR(release, S_IWUSR, NULL, memory_release_store);
+#endif /* CONFIG_MEMORY_HOTPLUG */
+
+#ifdef CONFIG_HOTPLUG_CPU
+static struct class_attribute class_attr_cpu_probe =
+			__ATTR(probe, S_IWUSR, NULL, cpu_probe_store);
+static struct class_attribute class_attr_cpu_release =
+			__ATTR(release, S_IWUSR, NULL, cpu_release_store);
 #endif
 
 static int pseries_dlpar_init(void)
@@ -567,6 +651,18 @@
 		       "release file\n");
 #endif
 
+#ifdef CONFIG_HOTPLUG_CPU
+	if (sysfs_create_file(&cpu_sysdev_class.kset.kobj,
+			      &class_attr_cpu_probe.attr))
+		printk(KERN_INFO "DLPAR: Could not create sysfs cpu "
+		       "probe file\n");
+
+	if (sysfs_create_file(&cpu_sysdev_class.kset.kobj,
+			      &class_attr_cpu_release.attr))
+		printk(KERN_INFO "DLPAR: Could not create sysfs cpu "
+		       "release file\n");
+#endif
+
 	return 0;
 }
 device_initcall(pseries_dlpar_init);

^ permalink raw reply

* Re: [PATCH 4/5 v3] kernel handling of memory DLPAR
From: Nathan Fontenot @ 2009-10-13 18:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: linux-kernel
In-Reply-To: <4AB3A13D.1060405@austin.ibm.com>

This adds the capability to DLPAR add and remove memory from the kernel.  The
patch extends the powerpc handling of memory_add_physaddr_to_nid(), which is
called from the sysfs memory 'probe' file to first ensure that the memory
has been added to the system.  This is done by creating a platform specific
callout from the routine.  The pseries implementation of this handles the
DLPAR work to add the memory to the system and update the device tree.

The patch also creates a pseries-only 'release' sysfs file,
/sys/devices/system/memory/release.  This file handles the DLPAR release of
memory back to firmware and updating of the device-tree.

Updated to add #ifdef CONFIG_MEMORY_HOTPLUG around the memory hotplug specific
updates.  This allows the file to be built without CONFIG_MEMORY_HOTPLUG
defined.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
--- 

Index: powerpc/arch/powerpc/platforms/pseries/dlpar.c
===================================================================
--- powerpc.orig/arch/powerpc/platforms/pseries/dlpar.c	2009-10-08 11:08:42.000000000 -0500
+++ powerpc/arch/powerpc/platforms/pseries/dlpar.c	2009-10-13 13:08:22.000000000 -0500
@@ -16,6 +16,10 @@
 #include <linux/notifier.h>
 #include <linux/proc_fs.h>
 #include <linux/spinlock.h>
+#include <linux/memory_hotplug.h>
+#include <linux/sysdev.h>
+#include <linux/sysfs.h>
+
 
 #include <asm/prom.h>
 #include <asm/machdep.h>
@@ -404,11 +408,165 @@
 	return 0;
 }
 
+#ifdef CONFIG_MEMORY_HOTPLUG
+
+static struct property *clone_property(struct property *old_prop)
+{
+	struct property *new_prop;
+
+	new_prop = kzalloc((sizeof *new_prop), GFP_KERNEL);
+	if (!new_prop)
+		return NULL;
+
+	new_prop->name = kzalloc(strlen(old_prop->name) + 1, GFP_KERNEL);
+	new_prop->value = kzalloc(old_prop->length + 1, GFP_KERNEL);
+	if (!new_prop->name || !new_prop->value) {
+		free_property(new_prop);
+		return NULL;
+	}
+
+	strcpy(new_prop->name, old_prop->name);
+	memcpy(new_prop->value, old_prop->value, old_prop->length);
+	new_prop->length = old_prop->length;
+
+	return new_prop;
+}
+
+int platform_probe_memory(u64 phys_addr)
+{
+	struct device_node *dn;
+	struct property *new_prop, *old_prop;
+	struct property *lmb_sz_prop;
+	struct of_drconf_cell *drmem;
+	u64 lmb_size;
+	int num_entries, i, rc;
+
+	if (!phys_addr)
+		return -EINVAL;
+
+	dn = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
+	if (!dn)
+		return -EINVAL;
+
+	lmb_sz_prop = of_find_property(dn, "ibm,lmb-size", NULL);
+	lmb_size = *(u64 *)lmb_sz_prop->value;
+
+	old_prop = of_find_property(dn, "ibm,dynamic-memory", NULL);
+
+	num_entries = *(u32 *)old_prop->value;
+	drmem = (struct of_drconf_cell *)
+				((char *)old_prop->value + sizeof(u32));
+
+	for (i = 0; i < num_entries; i++) {
+		u64 lmb_end_addr = drmem[i].base_addr + lmb_size;
+		if (phys_addr >= drmem[i].base_addr
+		    && phys_addr < lmb_end_addr)
+			break;
+	}
+
+	if (i >= num_entries) {
+		of_node_put(dn);
+		return -EINVAL;
+	}
+
+	if (drmem[i].flags & DRCONF_MEM_ASSIGNED) {
+		of_node_put(dn);
+		return 0;
+	}
+
+	rc = acquire_drc(drmem[i].drc_index);
+	if (rc) {
+		of_node_put(dn);
+		return -1;
+	}
+
+	new_prop = clone_property(old_prop);
+	drmem = (struct of_drconf_cell *)
+				((char *)new_prop->value + sizeof(u32));
+
+	drmem[i].flags |= DRCONF_MEM_ASSIGNED;
+	prom_update_property(dn, new_prop, old_prop);
+
+	rc = blocking_notifier_call_chain(&pSeries_reconfig_chain,
+					  PSERIES_DRCONF_MEM_ADD,
+					  &drmem[i].base_addr);
+	if (rc == NOTIFY_BAD) {
+		prom_update_property(dn, old_prop, new_prop);
+		release_drc(drmem[i].drc_index);
+	}
+
+	of_node_put(dn);
+	return rc == NOTIFY_BAD ? -1 : 0;
+}
+
+static ssize_t memory_release_store(struct class *class, const char *buf,
+				    size_t count)
+{
+	unsigned long drc_index;
+	struct device_node *dn;
+	struct property *new_prop, *old_prop;
+	struct of_drconf_cell *drmem;
+	int num_entries;
+	int i, rc;
+
+	rc = strict_strtoul(buf, 0, &drc_index);
+	if (rc)
+		return -EINVAL;
+
+	dn = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
+	if (!dn)
+		return 0;
+
+	old_prop = of_find_property(dn, "ibm,dynamic-memory", NULL);
+	new_prop = clone_property(old_prop);
+
+	num_entries = *(u32 *)new_prop->value;
+	drmem = (struct of_drconf_cell *)
+				((char *)new_prop->value + sizeof(u32));
+
+	for (i = 0; i < num_entries; i++) {
+		if (drmem[i].drc_index == drc_index)
+			break;
+	}
+
+	if (i >= num_entries) {
+		free_property(new_prop);
+		of_node_put(dn);
+		return -EINVAL;
+	}
+
+	drmem[i].flags &= ~DRCONF_MEM_ASSIGNED;
+	prom_update_property(dn, new_prop, old_prop);
+
+	rc = blocking_notifier_call_chain(&pSeries_reconfig_chain,
+					  PSERIES_DRCONF_MEM_REMOVE,
+					  &drmem[i].base_addr);
+	if (rc != NOTIFY_BAD)
+		rc = release_drc(drc_index);
+
+	if (rc)
+		prom_update_property(dn, old_prop, new_prop);
+
+	of_node_put(dn);
+	return rc ? -1 : count;
+}
+
+static struct class_attribute class_attr_mem_release =
+			__ATTR(release, S_IWUSR, NULL, memory_release_store);
+#endif
+
 static int pseries_dlpar_init(void)
 {
 	if (!machine_is(pseries))
 		return 0;
 
+#ifdef CONFIG_MEMORY_HOTPLUG
+	if (sysfs_create_file(&memory_sysdev_class.kset.kobj,
+			      &class_attr_mem_release.attr))
+		printk(KERN_INFO "DLPAR: Could not create sysfs memory "
+		       "release file\n");
+#endif
+
 	return 0;
 }
 device_initcall(pseries_dlpar_init);
Index: powerpc/arch/powerpc/mm/mem.c
===================================================================
--- powerpc.orig/arch/powerpc/mm/mem.c	2009-10-08 11:07:45.000000000 -0500
+++ powerpc/arch/powerpc/mm/mem.c	2009-10-08 11:08:54.000000000 -0500
@@ -111,8 +111,19 @@
 #ifdef CONFIG_MEMORY_HOTPLUG
 
 #ifdef CONFIG_NUMA
+int __attribute ((weak)) platform_probe_memory(u64 start)
+{
+	return 0;
+}
+
 int memory_add_physaddr_to_nid(u64 start)
 {
+	int rc;
+
+	rc = platform_probe_memory(start);
+	if (rc)
+		return rc;
+
 	return hot_add_scn_to_nid(start);
 }
 #endif

^ permalink raw reply

* Re: [PATCH 1/5 v3] dynamic logical partitioning infrastructure
From: Nathan Fontenot @ 2009-10-13 18:06 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: linux-kernel
In-Reply-To: <4AB3A05A.6010204@austin.ibm.com>

This patch provides the kernel DLPAR infrastructure in a new file named
dlpar.c.  The functionality provided is for acquiring and releasing a 
resource from firmware and the parsing of information returned from the
ibm,configure-connector rtas call.  Additionally, this exports the pSeries 
reconfiguration notifier chain so that it can be invoked when
device tree updates are made.

Updated to remove an extraneous of_node_put() in the removal of a device
tree node path.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> 
---

Index: powerpc/arch/powerpc/platforms/pseries/dlpar.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ powerpc/arch/powerpc/platforms/pseries/dlpar.c	2009-10-08 11:08:42.000000000 -0500
@@ -0,0 +1,414 @@
+/*
+ * dlpar.c - support for dynamic reconfiguration (including PCI
+ * Hotplug and Dynamic Logical Partitioning on RPA platforms).
+ *
+ * Copyright (C) 2009 Nathan Fontenot
+ * Copyright (C) 2009 IBM Corporation
+ *
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version
+ * 2 as published by the Free Software Foundation.
+ */
+
+#include <linux/kernel.h>
+#include <linux/kref.h>
+#include <linux/notifier.h>
+#include <linux/proc_fs.h>
+#include <linux/spinlock.h>
+
+#include <asm/prom.h>
+#include <asm/machdep.h>
+#include <asm/uaccess.h>
+#include <asm/rtas.h>
+#include <asm/pSeries_reconfig.h>
+
+#define CFG_CONN_WORK_SIZE	4096
+static char workarea[CFG_CONN_WORK_SIZE];
+static DEFINE_SPINLOCK(workarea_lock);
+
+struct cc_workarea {
+	u32	drc_index;
+	u32	zero;
+	u32	name_offset;
+	u32	prop_length;
+	u32	prop_offset;
+};
+
+static struct property *parse_cc_property(char *workarea)
+{
+	struct property *prop;
+	struct cc_workarea *ccwa;
+	char *name;
+	char *value;
+
+	prop = kzalloc(sizeof(*prop), GFP_KERNEL);
+	if (!prop)
+		return NULL;
+
+	ccwa = (struct cc_workarea *)workarea;
+	name = workarea + ccwa->name_offset;
+	prop->name = kzalloc(strlen(name) + 1, GFP_KERNEL);
+	if (!prop->name) {
+		kfree(prop);
+		return NULL;
+	}
+
+	strcpy(prop->name, name);
+
+	prop->length = ccwa->prop_length;
+	value = workarea + ccwa->prop_offset;
+	prop->value = kzalloc(prop->length, GFP_KERNEL);
+	if (!prop->value) {
+		kfree(prop->name);
+		kfree(prop);
+		return NULL;
+	}
+
+	memcpy(prop->value, value, prop->length);
+	return prop;
+}
+
+static void free_property(struct property *prop)
+{
+	kfree(prop->name);
+	kfree(prop->value);
+	kfree(prop);
+}
+
+static struct device_node *parse_cc_node(char *work_area)
+{
+	struct device_node *dn;
+	struct cc_workarea *ccwa;
+	char *name;
+
+	dn = kzalloc(sizeof(*dn), GFP_KERNEL);
+	if (!dn)
+		return NULL;
+
+	ccwa = (struct cc_workarea *)work_area;
+	name = work_area + ccwa->name_offset;
+	dn->full_name = kzalloc(strlen(name) + 1, GFP_KERNEL);
+	if (!dn->full_name) {
+		kfree(dn);
+		return NULL;
+	}
+
+	strcpy(dn->full_name, name);
+	return dn;
+}
+
+static void free_one_cc_node(struct device_node *dn)
+{
+	struct property *prop;
+
+	while (dn->properties) {
+		prop = dn->properties;
+		dn->properties = prop->next;
+		free_property(prop);
+	}
+
+	kfree(dn->full_name);
+	kfree(dn);
+}
+
+static void free_cc_nodes(struct device_node *dn)
+{
+	if (dn->child)
+		free_cc_nodes(dn->child);
+
+	if (dn->sibling)
+		free_cc_nodes(dn->sibling);
+
+	free_one_cc_node(dn);
+}
+
+#define NEXT_SIBLING    1
+#define NEXT_CHILD      2
+#define NEXT_PROPERTY   3
+#define PREV_PARENT     4
+#define MORE_MEMORY     5
+#define CALL_AGAIN	-2
+#define ERR_CFG_USE     -9003
+
+struct device_node *configure_connector(u32 drc_index)
+{
+	struct device_node *dn;
+	struct device_node *first_dn = NULL;
+	struct device_node *last_dn = NULL;
+	struct property *property;
+	struct property *last_property = NULL;
+	struct cc_workarea *ccwa;
+	int cc_token;
+	int rc;
+
+	cc_token = rtas_token("ibm,configure-connector");
+	if (cc_token == RTAS_UNKNOWN_SERVICE)
+		return NULL;
+
+	spin_lock(&workarea_lock);
+
+	ccwa = (struct cc_workarea *)&workarea[0];
+	ccwa->drc_index = drc_index;
+	ccwa->zero = 0;
+
+	rc = rtas_call(cc_token, 2, 1, NULL, workarea, NULL);
+	while (rc) {
+		switch (rc) {
+		case NEXT_SIBLING:
+			dn = parse_cc_node(workarea);
+			if (!dn)
+				goto cc_error;
+
+			dn->parent = last_dn->parent;
+			last_dn->sibling = dn;
+			last_dn = dn;
+			break;
+
+		case NEXT_CHILD:
+			dn = parse_cc_node(workarea);
+			if (!dn)
+				goto cc_error;
+
+			if (!first_dn)
+				first_dn = dn;
+			else {
+				dn->parent = last_dn;
+				if (last_dn)
+					last_dn->child = dn;
+			}
+
+			last_dn = dn;
+			break;
+
+		case NEXT_PROPERTY:
+			property = parse_cc_property(workarea);
+			if (!property)
+				goto cc_error;
+
+			if (!last_dn->properties)
+				last_dn->properties = property;
+			else
+				last_property->next = property;
+
+			last_property = property;
+			break;
+
+		case PREV_PARENT:
+			last_dn = last_dn->parent;
+			break;
+
+		case CALL_AGAIN:
+			break;
+
+		case MORE_MEMORY:
+		case ERR_CFG_USE:
+		default:
+			printk(KERN_ERR "Unexpected Error (%d) "
+			       "returned from configure-connector\n", rc);
+			goto cc_error;
+		}
+
+		rc = rtas_call(cc_token, 2, 1, NULL, workarea, NULL);
+	}
+
+	spin_unlock(&workarea_lock);
+	return first_dn;
+
+cc_error:
+	spin_unlock(&workarea_lock);
+
+	if (first_dn)
+		free_cc_nodes(first_dn);
+
+	return NULL;
+}
+
+static struct device_node *derive_parent(const char *path)
+{
+	struct device_node *parent;
+	char parent_path[128];
+	int parent_path_len;
+
+	parent_path_len = strrchr(path, '/') - path + 1;
+	strlcpy(parent_path, path, parent_path_len);
+
+	parent = of_find_node_by_path(parent_path);
+
+	return parent;
+}
+
+static int add_one_node(struct device_node *dn)
+{
+	struct proc_dir_entry *ent;
+	int rc;
+
+	of_node_set_flag(dn, OF_DYNAMIC);
+	kref_init(&dn->kref);
+	dn->parent = derive_parent(dn->full_name);
+
+	rc = blocking_notifier_call_chain(&pSeries_reconfig_chain,
+					  PSERIES_RECONFIG_ADD, dn);
+	if (rc == NOTIFY_BAD) {
+		printk(KERN_ERR "Failed to add device node %s\n",
+		       dn->full_name);
+		return -ENOMEM; /* For now, safe to assume kmalloc failure */
+	}
+
+	of_attach_node(dn);
+
+#ifdef CONFIG_PROC_DEVICETREE
+	ent = proc_mkdir(strrchr(dn->full_name, '/') + 1, dn->parent->pde);
+	if (ent)
+		proc_device_tree_add_node(dn, ent);
+#endif
+
+	of_node_put(dn->parent);
+	return 0;
+}
+
+int add_device_tree_nodes(struct device_node *dn)
+{
+	struct device_node *child = dn->child;
+	struct device_node *sibling = dn->sibling;
+	int rc;
+
+	dn->child = NULL;
+	dn->sibling = NULL;
+	dn->parent = NULL;
+
+	rc = add_one_node(dn);
+	if (rc)
+		return rc;
+
+	if (child) {
+		rc = add_device_tree_nodes(child);
+		if (rc)
+			return rc;
+	}
+
+	if (sibling)
+		rc = add_device_tree_nodes(sibling);
+
+	return rc;
+}
+
+static int remove_one_node(struct device_node *dn)
+{
+	struct device_node *parent = dn->parent;
+	struct property *prop = dn->properties;
+
+#ifdef CONFIG_PROC_DEVICETREE
+	while (prop) {
+		remove_proc_entry(prop->name, dn->pde);
+		prop = prop->next;
+	}
+
+	if (dn->pde)
+		remove_proc_entry(dn->pde->name, parent->pde);
+#endif
+
+	blocking_notifier_call_chain(&pSeries_reconfig_chain,
+			    PSERIES_RECONFIG_REMOVE, dn);
+	of_detach_node(dn);
+	of_node_put(dn); /* Must decrement the refcount */
+
+	return 0;
+}
+
+static int _remove_device_tree_nodes(struct device_node *dn)
+{
+	int rc;
+
+	if (dn->child) {
+		rc = _remove_device_tree_nodes(dn->child);
+		if (rc)
+			return rc;
+	}
+
+	if (dn->sibling) {
+		rc = _remove_device_tree_nodes(dn->sibling);
+		if (rc)
+			return rc;
+	}
+
+	rc = remove_one_node(dn);
+	return rc;
+}
+
+int remove_device_tree_nodes(struct device_node *dn)
+{
+	int rc;
+
+	if (dn->child) {
+		rc = _remove_device_tree_nodes(dn->child);
+		if (rc)
+			return rc;
+	}
+
+	rc = remove_one_node(dn);
+	return rc;
+}
+
+#define DR_ENTITY_SENSE		9003
+#define DR_ENTITY_PRESENT	1
+#define DR_ENTITY_UNUSABLE	2
+#define ALLOCATION_STATE	9003
+#define ALLOC_UNUSABLE		0
+#define ALLOC_USABLE		1
+#define ISOLATION_STATE		9001
+#define ISOLATE			0
+#define UNISOLATE		1
+
+int acquire_drc(u32 drc_index)
+{
+	int dr_status, rc;
+
+	rc = rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_status,
+		       DR_ENTITY_SENSE, drc_index);
+	if (rc || dr_status != DR_ENTITY_UNUSABLE)
+		return -1;
+
+	rc = rtas_set_indicator(ALLOCATION_STATE, drc_index, ALLOC_USABLE);
+	if (rc)
+		return rc;
+
+	rc = rtas_set_indicator(ISOLATION_STATE, drc_index, UNISOLATE);
+	if (rc) {
+		rtas_set_indicator(ALLOCATION_STATE, drc_index, ALLOC_UNUSABLE);
+		return rc;
+	}
+
+	return 0;
+}
+
+int release_drc(u32 drc_index)
+{
+	int dr_status, rc;
+
+	rc = rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_status,
+		       DR_ENTITY_SENSE, drc_index);
+	if (rc || dr_status != DR_ENTITY_PRESENT)
+		return -1;
+
+	rc = rtas_set_indicator(ISOLATION_STATE, drc_index, ISOLATE);
+	if (rc)
+		return rc;
+
+	rc = rtas_set_indicator(ALLOCATION_STATE, drc_index, ALLOC_UNUSABLE);
+	if (rc) {
+		rtas_set_indicator(ISOLATION_STATE, drc_index, UNISOLATE);
+		return rc;
+	}
+
+	return 0;
+}
+
+static int pseries_dlpar_init(void)
+{
+	if (!machine_is(pseries))
+		return 0;
+
+	return 0;
+}
+device_initcall(pseries_dlpar_init);
Index: powerpc/arch/powerpc/platforms/pseries/Makefile
===================================================================
--- powerpc.orig/arch/powerpc/platforms/pseries/Makefile	2009-09-11 12:43:39.000000000 -0500
+++ powerpc/arch/powerpc/platforms/pseries/Makefile	2009-09-11 12:51:52.000000000 -0500
@@ -8,7 +8,7 @@
 
 obj-y			:= lpar.o hvCall.o nvram.o reconfig.o \
 			   setup.o iommu.o ras.o rtasd.o \
-			   firmware.o power.o
+			   firmware.o power.o dlpar.o
 obj-$(CONFIG_SMP)	+= smp.o
 obj-$(CONFIG_XICS)	+= xics.o
 obj-$(CONFIG_SCANLOG)	+= scanlog.o
Index: powerpc/arch/powerpc/include/asm/pSeries_reconfig.h
===================================================================
--- powerpc.orig/arch/powerpc/include/asm/pSeries_reconfig.h	2009-09-11 12:43:39.000000000 -0500
+++ powerpc/arch/powerpc/include/asm/pSeries_reconfig.h	2009-10-08 09:37:40.000000000 -0500
@@ -17,6 +17,7 @@
 #ifdef CONFIG_PPC_PSERIES
 extern int pSeries_reconfig_notifier_register(struct notifier_block *);
 extern void pSeries_reconfig_notifier_unregister(struct notifier_block *);
+extern struct blocking_notifier_head pSeries_reconfig_chain;
 #else /* !CONFIG_PPC_PSERIES */
 static inline int pSeries_reconfig_notifier_register(struct notifier_block *nb)
 {
Index: powerpc/arch/powerpc/platforms/pseries/reconfig.c
===================================================================
--- powerpc.orig/arch/powerpc/platforms/pseries/reconfig.c	2009-09-11 12:43:39.000000000 -0500
+++ powerpc/arch/powerpc/platforms/pseries/reconfig.c	2009-10-08 09:37:49.000000000 -0500
@@ -95,7 +95,7 @@
 	return parent;
 }
 
-static BLOCKING_NOTIFIER_HEAD(pSeries_reconfig_chain);
+BLOCKING_NOTIFIER_HEAD(pSeries_reconfig_chain);
 
 int pSeries_reconfig_notifier_register(struct notifier_block *nb)
 {

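For readers who want to trace the tree-building loop in
configure_connector() without firmware, here is a minimal userspace sketch.
It replays a scripted array of return codes instead of calling rtas_call().
The names struct node, build_tree(), free_tree(), count_nodes() and CC_DONE
are hypothetical, properties are reduced to a counter, and the NULL checks
on the last node are additions that the patch itself does not have.

```c
#include <stddef.h>
#include <stdlib.h>

/* Return codes with the same values as the #defines in the patch; 0 = done. */
enum { CC_DONE = 0, NEXT_SIBLING = 1, NEXT_CHILD = 2,
       NEXT_PROPERTY = 3, PREV_PARENT = 4 };

/* Hypothetical stand-in for struct device_node. */
struct node {
	struct node *parent, *child, *sibling;
	int nprops;		/* properties reduced to a count */
};

/* Frees a node with all of its children and following siblings. */
static void free_tree(struct node *dn)
{
	if (!dn)
		return;
	free_tree(dn->child);
	free_tree(dn->sibling);
	free(dn);
}

static size_t count_nodes(const struct node *dn)
{
	if (!dn)
		return 0;
	return 1 + count_nodes(dn->child) + count_nodes(dn->sibling);
}

/* Builds a tree from scripted return codes; returns NULL on a bad script. */
static struct node *build_tree(const int *codes, size_t n)
{
	struct node *first = NULL, *last = NULL, *dn;
	size_t i;

	for (i = 0; i < n && codes[i] != CC_DONE; i++) {
		switch (codes[i]) {
		case NEXT_SIBLING:
			if (!last)
				goto err;
			dn = calloc(1, sizeof(*dn));
			if (!dn)
				goto err;
			dn->parent = last->parent;
			last->sibling = dn;
			last = dn;
			break;
		case NEXT_CHILD:
			if (first && !last)
				goto err;
			dn = calloc(1, sizeof(*dn));
			if (!dn)
				goto err;
			if (!first) {
				first = dn;	/* very first node is the root */
			} else {
				dn->parent = last;
				last->child = dn;
			}
			last = dn;
			break;
		case NEXT_PROPERTY:
			if (!last)
				goto err;
			last->nprops++;
			break;
		case PREV_PARENT:
			if (!last)
				goto err;
			last = last->parent;	/* may become NULL at the root */
			break;
		default:
			goto err;
		}
	}
	return first;
err:
	free_tree(first);
	return NULL;
}
```

A script of NEXT_CHILD, NEXT_PROPERTY, NEXT_CHILD, NEXT_PROPERTY,
NEXT_PROPERTY, NEXT_SIBLING, PREV_PARENT yields a root with one property
and two children, the first of which carries two properties.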
* Re: [PATCH 0/8] gianfar: Add support for hibernation
From: Andy Fleming @ 2009-10-13 17:22 UTC (permalink / raw)
  To: David Miller; +Cc: scottwood, linuxppc-dev, netdev
In-Reply-To: <20091012.235747.195783342.davem@davemloft.net>


On Oct 13, 2009, at 1:57 AM, David Miller wrote:

> From: Anton Vorontsov <avorontsov@ru.mvista.com>
> Date: Mon, 12 Oct 2009 20:00:00 +0400
>
>> Here are a few patches that add support for hibernation for gianfar
>> driver.
>>
>> Technically, we could just do gfar_close() and then gfar_enet_open()
>> sequence to restore gianfar functionality after hibernation, but
>> close/open does so many unneeded things (e.g. BDs buffers freeing and
>> allocation, IRQ freeing and requesting), that I felt it would be much
>> better to cleanup and refactor some code to make the hibernation [and
>> not only hibernation] code a little bit prettier.
>
> I applied all of this, it's a really nice patch set.  If there are any
> problems we can deal with it using follow-on fixups.
>
> I noticed something, in patch #3 where you remove the spurious wrap
> bit setting in startup_gfar().  It looks like that was not only
> spurious but it was doing it wrong too.
>
> It's writing garbage into the status word, because it's not using the
> BD_LFLAG() macro to shift the value up 16 bits.
>

No, it was fine (though made unnecessary by other patches).  The BD  
has a union:

                 struct {
                         u16     status; /* Status Fields */
                         u16     length; /* Buffer length */
                 };
                 u32 lstatus;

so when you write "lstatus", you need to use the BD_LFLAG() macro, but  
when you write "status", you are just setting the status bits.

Andy
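
The layout point above can be checked with a small host-independent
sketch.  It models the descriptor as raw big-endian bytes (the byte order
on powerpc) rather than using the driver's union, so it behaves the same
on any machine.  BD_LFLAG() and the 0x2000 wrap bit are assumed to mirror
gianfar.h; struct bd_bytes and the bd_*() helpers are hypothetical names
used only for illustration.

```c
#include <stdint.h>

/* Assumed to mirror gianfar.h; treat the values as illustrative. */
#define BD_LFLAG(flags)	((uint32_t)(flags) << 16)
#define BD_WRAP		0x2000u

/* Descriptor as big-endian bytes: [0..1] status, [2..3] length. */
struct bd_bytes {
	uint8_t raw[4];
};

/* Writing the 16-bit "status" member touches only the first two bytes. */
static void bd_write_status(struct bd_bytes *bd, uint16_t status)
{
	bd->raw[0] = (uint8_t)(status >> 8);
	bd->raw[1] = (uint8_t)(status & 0xff);
}

/* Writing the 32-bit "lstatus" member covers status and length together. */
static void bd_write_lstatus(struct bd_bytes *bd, uint32_t lstatus)
{
	bd->raw[0] = (uint8_t)(lstatus >> 24);
	bd->raw[1] = (uint8_t)(lstatus >> 16);
	bd->raw[2] = (uint8_t)(lstatus >> 8);
	bd->raw[3] = (uint8_t)(lstatus & 0xff);
}

static uint16_t bd_read_status(const struct bd_bytes *bd)
{
	return (uint16_t)((bd->raw[0] << 8) | bd->raw[1]);
}

static uint16_t bd_read_length(const struct bd_bytes *bd)
{
	return (uint16_t)((bd->raw[2] << 8) | bd->raw[3]);
}

static uint32_t bd_read_lstatus(const struct bd_bytes *bd)
{
	return ((uint32_t)bd_read_status(bd) << 16) | bd_read_length(bd);
}
```

Setting the wrap bit through bd_write_status() gives the same lstatus as
bd_write_lstatus() with BD_LFLAG(BD_WRAP).  Passing the unshifted flag to
bd_write_lstatus() would instead land in the length half.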

* Re: From: Tim Abbott <tabbott@ksplice.com>
From: Tim Abbott @ 2009-10-13 15:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: Paul Mackerras, Sam Ravnborg, linuxppc-dev
In-Reply-To: <1255449206-16650-1-git-send-email-tabbott@ksplice.com>

Well, I think I just found a bug in git-send-email.  I'll resend with the 
actual subject line.

	-Tim Abbott

On Tue, 13 Oct 2009, Tim Abbott wrote:

> There is already an architecture-independent __page_aligned_data macro
> for this purpose, so removing the powerpc-specific macro should be
> harmless.
> 
> Signed-off-by: Tim Abbott <tabbott@ksplice.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: linuxppc-dev@ozlabs.org
> Cc: Sam Ravnborg <sam@ravnborg.org>
> ---
>  arch/powerpc/include/asm/page_64.h |    8 --------
>  1 files changed, 0 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/page_64.h b/arch/powerpc/include/asm/page_64.h
> index 3f17b83..3c7118f 100644
> --- a/arch/powerpc/include/asm/page_64.h
> +++ b/arch/powerpc/include/asm/page_64.h
> @@ -162,14 +162,6 @@ do {						\
>  
>  #endif /* !CONFIG_HUGETLB_PAGE */
>  
> -#ifdef MODULE
> -#define __page_aligned __attribute__((__aligned__(PAGE_SIZE)))
> -#else
> -#define __page_aligned \
> -	__attribute__((__aligned__(PAGE_SIZE), \
> -		__section__(".data.page_aligned")))
> -#endif
> -
>  #define VM_DATA_DEFAULT_FLAGS \
>  	(test_thread_flag(TIF_32BIT) ? \
>  	 VM_DATA_DEFAULT_FLAGS32 : VM_DATA_DEFAULT_FLAGS64)
> -- 
> 1.6.4.3
> 
> 

* From: Tim Abbott <tabbott@ksplice.com>
From: Tim Abbott @ 2009-10-13 15:53 UTC (permalink / raw)
  To: linux-kernel; +Cc: linuxppc-dev, Tim Abbott, Paul Mackerras, Sam Ravnborg

There is already an architecture-independent __page_aligned_data macro
for this purpose, so removing the powerpc-specific macro should be
harmless.

Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@ozlabs.org
Cc: Sam Ravnborg <sam@ravnborg.org>
---
 arch/powerpc/include/asm/page_64.h |    8 --------
 1 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/page_64.h b/arch/powerpc/include/asm/page_64.h
index 3f17b83..3c7118f 100644
--- a/arch/powerpc/include/asm/page_64.h
+++ b/arch/powerpc/include/asm/page_64.h
@@ -162,14 +162,6 @@ do {						\
 
 #endif /* !CONFIG_HUGETLB_PAGE */
 
-#ifdef MODULE
-#define __page_aligned __attribute__((__aligned__(PAGE_SIZE)))
-#else
-#define __page_aligned \
-	__attribute__((__aligned__(PAGE_SIZE), \
-		__section__(".data.page_aligned")))
-#endif
-
 #define VM_DATA_DEFAULT_FLAGS \
 	(test_thread_flag(TIF_32BIT) ? \
 	 VM_DATA_DEFAULT_FLAGS32 : VM_DATA_DEFAULT_FLAGS64)
-- 
1.6.4.3

* Re: [PATCH 1/2][v2] mm: add notifier in pageblock isolation for balloon drivers
From: Robert Jennings @ 2009-10-13 15:48 UTC (permalink / raw)
  To: Mel Gorman
  Cc: linux-mm, Gerald Schaefer, linux-kernel, linuxppc-dev,
	Martin Schwidefsky, Badari Pulavarty, Brian King, Paul Mackerras,
	Andrew Morton, Ingo Molnar, KAMEZAWA Hiroyuki
In-Reply-To: <20091009204326.GH24845@csn.ul.ie>

On Fri, Oct 09, 2009 at 21:43:26 +0100, Mel Gorman wrote:
> As you have tested this recently, would you be willing to post the
> results? While it's not a requirement of the patch, it would be nice to have
> an idea of how the effectiveness of memory hot-remove is improved when used
> with the powerpc balloon. This might convince developers of other balloon
> drivers to register with the notifier.

I did ten test runs without my patches and ten test runs with my patches
on a 2.6.32-rc3 kernel.

Without the patch:
6 out of 10 memory-remove operations removed 1 LMB (64MB); the remaining
4 attempts failed to remove any LMBs.

With the patch:
All of the memory-remove operations removed some LMBs.  The average
removed was just over 11 LMBs (704MB) per attempt.

Linux was given 2GB of memory.  During the test runs the average memory in
use was 140MB, not including cache and buffers, and the average amount
consumed by the balloon was 1217MB.  The system was idle while the
memory remove operation was performed.  After each attempt the system
was rebooted and allowed ~10 minutes to settle after boot.

With a 2GB configuration on POWER the LMB size is 64MB.  The drmgr command
(part of powerpc-utils) was used to remove memory by LMB, just as an
end-user would.  Below is a list of the runs and the number of LMBs
removed.

Stock kernel (v2.6.32-rc3)
--------------------------
LMBs	Used kb	Loaned kb
removed
0	135232	1257280
0	151168	1231744
1	152128	1234176
1	150976	1239232
1	151808	1232064
0	136064	1249152
0	137088	1246976
1	135296	1289984
1	136384	1263104
1	152960	1243904
=======================
0.60	143910	1248762 Average
0.49	  7929	  16960 StdDev

Patched kernel
--------------------------
LMBs	Used kb	Loaned kb
removed
12	134336	1294336
10	152192	1250432
 9	152832	1235520
15	153152	1237952
12	152320	1232704
13	135360	1252224
11	154176	1237056
10	153920	1243264
10	150720	1236416
13	151040	1230848
=======================
11.50	149005	1245075 Average
 1.75	  7158	  17738 StdDev


Regards,
Robert Jennings
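
The Average and StdDev rows for the "LMBs removed" column can be
re-derived from the per-run counts in the two tables.  The sketch below
assumes the population form of the variance (dividing by n rather than
n - 1), which is what reproduces the posted figures: the square root of
0.24 is about 0.49, and the square root of 3.05 is about 1.75.  The
helper names mean() and population_variance() are hypothetical.

```c
#include <stddef.h>

/* Arithmetic mean of n integer samples. */
static double mean(const int *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Population variance: average squared deviation from the mean. */
static double population_variance(const int *v, size_t n)
{
	double m = mean(v, n);
	double acc = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		acc += (v[i] - m) * (v[i] - m);
	return acc / (double)n;
}
```

Feeding in the stock-kernel counts (0, 0, 1, 1, 1, 0, 0, 1, 1, 1) and the
patched-kernel counts (12, 10, 9, 15, 12, 13, 11, 10, 10, 13) gives the
means 0.60 and 11.50 shown in the tables.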

* Re: [Cbe-oss-dev] [PATCH] spufs: Fix test in spufs_switch_log_read()
From: Arnd Bergmann @ 2009-10-13 13:31 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: cbe-oss-dev, Roel Kluin, linuxppc-dev, Jeremy Kerr, Andrew Morton,
	cbe-oss-dev
In-Reply-To: <200910130849.56671.jk@ozlabs.org>

On Tuesday 13 October 2009, Jeremy Kerr wrote:
> > Or can this test be removed?
> 
> I'd prefer just to remove the test.

Yes, sounds good.

	Arnd <><

