LinuxPPC-Dev Archive on lore.kernel.org


* Re: [PATCH 2/3] powerpc/e500: add paravirt QEMU platform
From: Alexander Graf @ 2012-07-06 16:59 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev
In-Reply-To: <4FF717D3.1020005@freescale.com>


On 06.07.2012, at 18:52, Scott Wood wrote:

> On 07/06/2012 11:30 AM, Alexander Graf wrote:
>>
>> On 06.07.2012, at 18:25, Scott Wood wrote:
>>
>>> On 07/06/2012 07:29 AM, Alexander Graf wrote:
>>>> I really think we should document what exactly this machine expects.
>>>
>>> Well, the point of this paravirt machine is to avoid such assumptions --
>>> it's all device-tree driven, at least in theory.  If a certain qemu
>>> configuration ends up breaking the Linux platform (such as using a
>>> different PIC), then that's a lack of flexibility on Linux's part that
>>> should get fixed if someone finds it useful enough to justify the
>>> effort.  Same with real hardware -- if you care about it, you add
>>> support -- we just don't have a unique name for every configuration.
>>> The information is there in the device tree, though.
>>>
>>> Honestly, even having "qemu" in there is more specific than I'd prefer,
>>> but I don't want to stir up the "generic platform" argument again
>>> without at least limiting the scope.
>>
>> Well, can't we note down the assumptions we make to make sure that
>> whoever develops an implementation of it knows what to implement?
>> It's ppc specific for example. I also don't think that plugging a G3
>> in there works, would it?
>
> Well, it does have "e500" in the name. :-P
>
>>>>> +void __init qemu_e500_pic_init(void)
>>>>> +{
>>>>> +	struct mpic *mpic;
>>>>> +
>>>>> +	mpic = mpic_alloc(NULL, 0, MPIC_BIG_ENDIAN | MPIC_SINGLE_DEST_CPU,
>>>>> +			0, 256, " OpenPIC  ");
>>>>
>>>> Does that mean we're configuring the MPIC regardless of what the
>>>> guest tells us? So the MPIC is a hard requirement. We can't use UIC
>>>> or XPIC with this machine, right? This needs to be documented.
>>>
>>> Then what would we do if we want to add an ePAPR virtual PIC instead?
>>> Or if something replaces MPIC on future FSL chips?
>>
>> Then we need a different compatible anyways, because we wouldn't be
>> backwards compatible, no?
>
> No, that's exactly what I'm trying to avoid.  This notion of a toplevel
> compatible that tells you everything you need to know about the machine
> (even if Linux chooses to be device-tree-based for some arbitrary subset
> of that information) is incompatible with a flexible virtual platform.
>
> All this compatible is saying is "see the rest of the device tree".
> How well Linux does so is a quality of implementation issue that can be
> addressed as needed.  The information about what sort of interrupt
> controller you have is already in the device tree.  The device tree is
> the machine spec.
>
> Another assumption this patch makes is that it doesn't need SWIOTLB.  Is
> "has more than 4GiB RAM" a machine attribute that would warrant a
> separate toplevel compatible?  SWIOTLB for PCI is handled due to the
> previous patch that provides common PCI code -- but in a previous
> version of the patch it was not handled.  Is it yet another incompatible
> machine spec if RAM must be less than 4GiB minus PCICSRBAR (ignoring the
> QEMU bug that PCICSRBAR is not implemented)?

Well, the thing that I'm wary of is the following. Imagine we make this
the default machine type for all e500 use cases. Which is reasonable.
Now we release 3.6 which works awesome with QEMU 1.2. We change
something in QEMU. QEMU 1.3 comes out. It can no longer boot your old
kernel 3.6.

That's the type of situation I don't want to be in. We need to be
backwards compatible with what we used to be able to run. We can get
away with declaring things as experimental for now, until we have
settled on a reasonable compromise to achieve said compatibility. But it
needs to be our goal somewhere.

One idea would be to version the machine type according to what Linux
implements. If Linux finds a machine type that is newer than what it
implements, it spawns a warning. If we want, we can implement backwards
compatible machine types in QEMU, similar to how we implement -M pc-0.12
and friends today.

Again, no need to do so as long as we tell users to not use it. As soon
as we want them to actually run the machine, we need to have independent
upgrade paths in place. New QEMU needs to be able to run old kernels.
New kernels need to run on old QEMU.
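
A minimal sketch of the versioning idea above, written as plain C so it can be compiled outside the kernel. Everything here is an assumption for illustration: the patch under discussion exposes no machine version, and `QEMU_E500_KERNEL_VERSION` and `machine_version_check()` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical: newest machine version this kernel was written against. */
#define QEMU_E500_KERNEL_VERSION 2u

/*
 * Returns 1 (after printing a warning) when the machine reports a version
 * newer than the kernel knows about, 0 otherwise.  The caller keeps
 * booting in both cases; the warning is only a hint for the user.
 */
static int machine_version_check(unsigned int machine_version)
{
	if (machine_version > QEMU_E500_KERNEL_VERSION) {
		fprintf(stderr,
			"qemu-e500: machine version %u is newer than supported version %u\n",
			machine_version, QEMU_E500_KERNEL_VERSION);
		return 1;
	}
	return 0;
}
```

An older kernel on a newer QEMU would take the warning path, while a newer kernel on an older QEMU would pass silently, which matches the two independent upgrade paths described above.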

>
>>> Better to change the Linux implementation as needed than to change a spec.
>>
>> Why not keep the 2 in sync in the same patch? Just throw a file with
>> a rough outline of the machine in Documentation/.
>
> Because that would give people the wrong impression about what this
> machine is, and be unlikely to stay in sync or be a complete listing of
> current assumptions.  You're basically suggesting to use Documentation/
> as a bug tracker.

I'm just saying that every time we hardcode assumptions, we need to make
sure we document it somewhere. And currently we do hardcode assumptions,
even though only a few.


Alex

^ permalink raw reply

* Re: [PATCH 2/3] powerpc/e500: add paravirt QEMU platform
From: Scott Wood @ 2012-07-06 16:52 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev
In-Reply-To: <D89B244E-C8E0-4CCD-9118-2E84A13B6633@suse.de>

On 07/06/2012 11:30 AM, Alexander Graf wrote:
> 
> On 06.07.2012, at 18:25, Scott Wood wrote:
> 
>> On 07/06/2012 07:29 AM, Alexander Graf wrote:
>>> I really think we should document what exactly this machine expects.
>>
>> Well, the point of this paravirt machine is to avoid such assumptions --
>> it's all device-tree driven, at least in theory.  If a certain qemu
>> configuration ends up breaking the Linux platform (such as using a
>> different PIC), then that's a lack of flexibility on Linux's part that
>> should get fixed if someone finds it useful enough to justify the
>> effort.  Same with real hardware -- if you care about it, you add
>> support -- we just don't have a unique name for every configuration.
>> The information is there in the device tree, though.
>>
>> Honestly, even having "qemu" in there is more specific than I'd prefer,
>> but I don't want to stir up the "generic platform" argument again
>> without at least limiting the scope.
> 
> Well, can't we note down the assumptions we make to make sure that
> whoever develops an implementation of it knows what to implement?
> It's ppc specific for example. I also don't think that plugging a G3
> in there works, would it?

Well, it does have "e500" in the name. :-P

>>>> +void __init qemu_e500_pic_init(void)
>>>> +{
>>>> +	struct mpic *mpic;
>>>> +
>>>> +	mpic = mpic_alloc(NULL, 0, MPIC_BIG_ENDIAN | MPIC_SINGLE_DEST_CPU,
>>>> +			0, 256, " OpenPIC  ");
>>>
>>> Does that mean we're configuring the MPIC regardless of what the
>>> guest tells us? So the MPIC is a hard requirement. We can't use UIC
>>> or XPIC with this machine, right? This needs to be documented.
>>
>> Then what would we do if we want to add an ePAPR virtual PIC instead?
>> Or if something replaces MPIC on future FSL chips?
> 
> Then we need a different compatible anyways, because we wouldn't be backwards compatible, no?

No, that's exactly what I'm trying to avoid.  This notion of a toplevel
compatible that tells you everything you need to know about the machine
(even if Linux chooses to be device-tree-based for some arbitrary subset
of that information) is incompatible with a flexible virtual platform.

All this compatible is saying is "see the rest of the device tree".
How well Linux does so is a quality of implementation issue that can be
addressed as needed.  The information about what sort of interrupt
controller you have is already in the device tree.  The device tree is
the machine spec.
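
A minimal sketch of what "see the rest of the device tree" could look like for the interrupt controller, written as standalone C rather than kernel code. The table, the init hooks, and `pic_init_from_compatible()` are hypothetical, and the compatible strings are illustrative placeholders; real code would read the node's compatible property from the device tree and call the matching driver's init routine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical init hooks; each returns an ID so the choice is observable. */
typedef int (*pic_init_fn)(void);

static int init_mpic(void)      { return 1; }
static int init_epapr_pic(void) { return 2; }

struct pic_match {
	const char *compatible;	/* value of the node's compatible property */
	pic_init_fn init;
};

/* Supporting a new controller means adding a row, not a new machine name. */
static const struct pic_match pic_table[] = {
	{ "chrp,open-pic", init_mpic },
	{ "epapr,hv-pic",  init_epapr_pic },
};

/* Returns the selected hook's result, or -1 when no known PIC matches. */
static int pic_init_from_compatible(const char *compatible)
{
	size_t i;

	for (i = 0; i < sizeof(pic_table) / sizeof(pic_table[0]); i++)
		if (strcmp(pic_table[i].compatible, compatible) == 0)
			return pic_table[i].init();
	return -1;
}
```

The unmatched case is where the "lack of flexibility on Linux's part" would surface: the machine description is still valid, but the kernel has no driver row for it yet.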

Another assumption this patch makes is that it doesn't need SWIOTLB.  Is
"has more than 4GiB RAM" a machine attribute that would warrant a
separate toplevel compatible?  SWIOTLB for PCI is handled due to the
previous patch that provides common PCI code -- but in a previous
version of the patch it was not handled.  Is it yet another incompatible
machine spec if RAM must be less than 4GiB minus PCICSRBAR (ignoring the
QEMU bug that PCICSRBAR is not implemented)?

>> Better to change the Linux implementation as needed than to change a spec.
> 
> Why not keep the 2 in sync in the same patch? Just throw a file with a rough outline of the machine in Documentation/.

Because that would give people the wrong impression about what this
machine is, and be unlikely to stay in sync or be a complete listing of
current assumptions.  You're basically suggesting to use Documentation/
as a bug tracker.

-Scott

^ permalink raw reply

* Re: [PATCH 2/3] powerpc/e500: add paravirt QEMU platform
From: Alexander Graf @ 2012-07-06 16:30 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev
In-Reply-To: <4FF71164.3040501@freescale.com>


On 06.07.2012, at 18:25, Scott Wood wrote:

> On 07/06/2012 07:29 AM, Alexander Graf wrote:
>>
>> On 28.06.2012, at 01:50, Scott Wood wrote:
>>
>>> This gives the kernel a paravirtualized machine to target, without
>>> requiring both sides to pretend to be targeting a specific board
>>> that likely has little to do with the host in KVM scenarios.  This
>>> avoids the need to add new boards to QEMU just to be able to
>>> run KVM on new CPUs.
>>>
>>> As this is the first platform that can run with either e500v2 or
>>> e500mc, CONFIG_PPC_E500MC is now a legitimately user configurable
>>> option, so add a help text.
>>>
>>> Signed-off-by: Scott Wood <scottwood@freescale.com>
>>> ---
>>> arch/powerpc/platforms/85xx/Kconfig     |   16 +++++++
>>> arch/powerpc/platforms/85xx/Makefile    |    1 +
>>> arch/powerpc/platforms/85xx/qemu_e500.c |   66 +++++++++++++++++++++++++++++++
>>> arch/powerpc/platforms/Kconfig.cputype  |    4 ++
>>
>> I really think we should document what exactly this machine expects.
>
> Well, the point of this paravirt machine is to avoid such assumptions --
> it's all device-tree driven, at least in theory.  If a certain qemu
> configuration ends up breaking the Linux platform (such as using a
> different PIC), then that's a lack of flexibility on Linux's part that
> should get fixed if someone finds it useful enough to justify the
> effort.  Same with real hardware -- if you care about it, you add
> support -- we just don't have a unique name for every configuration.
> The information is there in the device tree, though.
>
> Honestly, even having "qemu" in there is more specific than I'd prefer,
> but I don't want to stir up the "generic platform" argument again
> without at least limiting the scope.

Well, can't we note down the assumptions we make to make sure that
whoever develops an implementation of it knows what to implement? It's
ppc specific for example. I also don't think that plugging a G3 in there
works, would it?

>
>>> +void __init qemu_e500_pic_init(void)
>>> +{
>>> +	struct mpic *mpic;
>>> +
>>> +	mpic = mpic_alloc(NULL, 0, MPIC_BIG_ENDIAN | MPIC_SINGLE_DEST_CPU,
>>> +			0, 256, " OpenPIC  ");
>>
>> Does that mean we're configuring the MPIC regardless of what the
>> guest tells us? So the MPIC is a hard requirement. We can't use UIC
>> or XPIC with this machine, right? This needs to be documented.
>
> Then what would we do if we want to add an ePAPR virtual PIC instead?
> Or if something replaces MPIC on future FSL chips?

Then we need a different compatible anyways, because we wouldn't be
backwards compatible, no?

> Better to change the Linux implementation as needed than to change a spec.

Why not keep the 2 in sync in the same patch? Just throw a file with a
rough outline of the machine in Documentation/.


Alex

^ permalink raw reply

* Re: [PATCH 2/3] powerpc/e500: add paravirt QEMU platform
From: Scott Wood @ 2012-07-06 16:25 UTC (permalink / raw)
  To: Alexander Graf; +Cc: linuxppc-dev
In-Reply-To: <36624A62-88AD-404A-A7A8-02D62E8700F0@suse.de>

On 07/06/2012 07:29 AM, Alexander Graf wrote:
> 
> On 28.06.2012, at 01:50, Scott Wood wrote:
> 
>> This gives the kernel a paravirtualized machine to target, without
>> requiring both sides to pretend to be targeting a specific board
>> that likely has little to do with the host in KVM scenarios.  This
>> avoids the need to add new boards to QEMU just to be able to
>> run KVM on new CPUs.
>>
>> As this is the first platform that can run with either e500v2 or
>> e500mc, CONFIG_PPC_E500MC is now a legitimately user configurable
>> option, so add a help text.
>>
>> Signed-off-by: Scott Wood <scottwood@freescale.com>
>> ---
>> arch/powerpc/platforms/85xx/Kconfig     |   16 +++++++
>> arch/powerpc/platforms/85xx/Makefile    |    1 +
>> arch/powerpc/platforms/85xx/qemu_e500.c |   66 +++++++++++++++++++++++++++++++
>> arch/powerpc/platforms/Kconfig.cputype  |    4 ++
> 
> I really think we should document what exactly this machine expects.

Well, the point of this paravirt machine is to avoid such assumptions --
it's all device-tree driven, at least in theory.  If a certain qemu
configuration ends up breaking the Linux platform (such as using a
different PIC), then that's a lack of flexibility on Linux's part that
should get fixed if someone finds it useful enough to justify the
effort.  Same with real hardware -- if you care about it, you add
support -- we just don't have a unique name for every configuration.
The information is there in the device tree, though.

Honestly, even having "qemu" in there is more specific than I'd prefer,
but I don't want to stir up the "generic platform" argument again
without at least limiting the scope.

>> +void __init qemu_e500_pic_init(void)
>> +{
>> +	struct mpic *mpic;
>> +
>> +	mpic = mpic_alloc(NULL, 0, MPIC_BIG_ENDIAN | MPIC_SINGLE_DEST_CPU,
>> +			0, 256, " OpenPIC  ");
> 
> Does that mean we're configuring the MPIC regardless of what the
> guest tells us? So the MPIC is a hard requirement. We can't use UIC
> or XPIC with this machine, right? This needs to be documented.

Then what would we do if we want to add an ePAPR virtual PIC instead?
Or if something replaces MPIC on future FSL chips?

Better to change the Linux implementation as needed than to change a spec.

-Scott

^ permalink raw reply

* Re: [PATCH v3] printk: Have printk() never buffer its data
From: Kay Sievers @ 2012-07-06 15:12 UTC (permalink / raw)
  To: Michael Neuling
  Cc: Greg Kroah-Hartman, LKML, Steven Rostedt, Paul E. McKenney,
	linuxppc-dev, Joe Perches, Andrew Morton, Wu Fengguang,
	Linus Torvalds, Ingo Molnar
In-Reply-To: <CAPXgP13_cQUQLE4RSjACTzj28Pn6r3ZaPhjch2U+uwgi7YKcWA@mail.gmail.com>

On Fri, Jul 6, 2012 at 12:46 PM, Kay Sievers <kay@vrfy.org> wrote:
> On Fri, Jul 6, 2012 at 5:47 AM, Michael Neuling <mikey@neuling.org> wrote:
>
>>> 4,89,24561;NIP: c000000000048164 LR: c000000000048160 CTR: 0000000000000000
>>> 4,90,24576;REGS: c00000007e59fb50 TRAP: 0700   Tainted: G        W     (3.5.0-rc4-mikey)
>>> 4,91,24583;MSR: 9000000000021032
>>> 4,92,24586;<
>>> 4,93,24591;SF
>>> 4,94,24596;,HV
>>> 4,95,24601;,ME
>>> 4,96,24606;,IR
>>> 4,97,24611;,DR
>>> 4,98,24616;,RI
>>> 4,99,24619;>
>>> 4,100,24628;  CR: 28000042  XER: 22000000
>>
>> FWIW, compiling with the parent commit gives this:
>>
>> 4,89,1712;NIP: c000000000048164 LR: c000000000048160 CTR: 0000000000000000
>> 4,90,1713;REGS: c00000007e59fb50 TRAP: 0700   Tainted: G        W     (3.5.0-rc4-mikey)
>> 4,91,1716;MSR: 9000000000021032 <SF,HV,ME,IR,DR,RI>  CR: 22000082  XER: 02000000
>
> Hmm, I don't understand, which parent commit do you mean? You maybe
> mean without 084681d?
>
> I think it's a race of the two CPUs printing continuation lines, and
> the continuation buffer is still occupied with data from one CPU and
> not available to the other one at the same time.
>
> What you see is likely not the direct output to the console (that
> would work) but the replay of the stored buffer when the console is
> registered. Because the cont buffer was still busy with one CPU, the
> other thread needs to store the continuation line prints in individual
> records, which leads to the (unwanted) printed newlines when
> replaying.
>
> The data we store looks all fine, it just looks needlessly separated
> when we replay from the buffer on a newly registered boot console. We
> need to merge the lines in the output, so they *look* like they are
> all in one line. I'll work on a fix for that now.

It could be that the console semaphore is still held by the other CPU,
for whatever reason, when your box runs into this situation.

Mind pasting more context (/dev/kmsg) of the log when this happens,
not only the one line that gets split up?

Is this possibly during an oops or backtrace going on when you see
this? Which code calls show_regs() here?

Kay

^ permalink raw reply

* Re: [RFC PATCH 08/17] KVM: PPC: e500mc: Fix tlbilx emulation for 64-bit guests
From: Alexander Graf @ 2012-07-06 14:54 UTC (permalink / raw)
  To: Mihai Caraman; +Cc: qemu-ppc, linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <1340627195-11544-9-git-send-email-mihai.caraman@freescale.com>


On 25.06.2012, at 14:26, Mihai Caraman wrote:

> tlbilxva emulation was using a u32 variable for guest effective address.
> Replace it with gva_t type to handle 64-bit guests.
> 
> Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>

Thanks, applied to kvm-ppc-next.


Alex

^ permalink raw reply

* Re: [PATCH] Revert "powerpc/p3060qds: Add support for P3060QDS board"
From: Timur Tabi @ 2012-07-06 14:44 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev
In-Reply-To: <B652D6B0-4782-4CC5-9685-95A86FACF60A@kernel.crashing.org>

Kumar Gala wrote:
> I assume you're sending a similar patch to u-boot.

Yes, but I wanted to see if this one was accepted first.

-- 
Timur Tabi
Linux kernel developer at Freescale

^ permalink raw reply

* Re: [PATCH 4/4] powerpc/mpic: FSL MPIC error interrupt support.
From: Kumar Gala @ 2012-07-06 14:26 UTC (permalink / raw)
  To: Sethi Varun-B16395; +Cc: Wood Scott-B07421, Linuxppc-dev@lists.ozlabs.org
In-Reply-To: <C5ECD7A89D1DC44195F34B25E172658D12D9E5@039-SN2MPN1-012.039d.mgd.msft.net>


On Jul 5, 2012, at 11:02 PM, Sethi Varun-B16395 wrote:

>
>
>> -----Original Message-----
>> From: Wood Scott-B07421
>> Sent: Tuesday, June 19, 2012 12:53 AM
>> To: Sethi Varun-B16395
>> Cc: Wood Scott-B07421; Kumar Gala; Linuxppc-dev@lists.ozlabs.org
>> Subject: Re: [PATCH 4/4] powerpc/mpic: FSL MPIC error interrupt support.
>>
>> On 06/18/2012 02:19 PM, Sethi Varun-B16395 wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Wood Scott-B07421
>>>> Sent: Tuesday, June 19, 2012 12:47 AM
>>>> To: Sethi Varun-B16395
>>>> Cc: Kumar Gala; Wood Scott-B07421; Linuxppc-dev@lists.ozlabs.org
>>>> Subject: Re: [PATCH 4/4] powerpc/mpic: FSL MPIC error interrupt support.
>>>>
>>>> On 06/18/2012 02:12 PM, Sethi Varun-B16395 wrote:
>>>>>
>>>>>
>>>>>>>>> +/*
>>>>>>>>> + * Error interrupt registers
>>>>>>>>> + */
>>>>>>>>> +
>>>>>>>>> +#define MPIC_ERR_INT_BASE	0x3900
>>>>>>>>> +#define MPIC_ERR_INT_EISR	0x0000
>>>>>>>>> +#define MPIC_ERR_INT_EIMR	0x0010
>>>>>>>>> +
>>>>>>>>> #define MPIC_MAX_IRQ_SOURCES	2048
>>>>>>>>> #define MPIC_MAX_CPUS		32
>>>>>>>>> #define MPIC_MAX_ISU		32
>>>>>>>>>
>>>>>>>>> #define MPIC_MAX_TIMER    8
>>>>>>>>> #define MPIC_MAX_IPI      4
>>>>>>>>> +#define MPIC_MAX_ERR      32
>>>>>>>>
>>>>>>>> Should probably be 64
>>>>>>>
>>>>>>> This patch supports MPIC 4.1 and EISR0.  When support is added for
>>>>>>> EISR1 (didn't realize this was coming until your comment prompted
>>>>>>> me to check...), this should be updated, but this change alone
>>>>>>> would not make it work.
>>>>>>
>>>>>> Would prefer we handle this now rather than later (T4240 is going
>>>>>> to need EISR1 support).
>>>>> Hi Kumar,
>>>>> As of now I don't have a proper mechanism to test this functionality.
>>>>> I will submit a follow up patch for EISR1/EIMR1 support once I have
>>>>> a mechanism to test this functionality.
>>>>
>>>> You could still write the code in a way that scales to multiple
>>>> EISRs, and test that it works with EISR0.
>>>>
>>> Yes, but I would like to submit the patch once I have tested it.
>>
>> So test it the way I described, and submit. :-P
> There just seem to be 32 error interrupts even in case of T4240, that
> means there is no need to handle multiple EISRs.

Ok, but I had some other comments about this patch.

> I have already submitted a revised patch for handling MPIC error interrupts.
> [PATCH 3/3 v2] powerpc/mpic: FSL MPIC error interrupt support

Please resubmit the full sequence of patches at this point.

- k

^ permalink raw reply
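
The suggestion in the message above, to write the error-interrupt code so that it scales to multiple EISRs while testing only with EISR0, can be sketched as follows. This is standalone C, not the submitted patch: `collect_pending_errors()` is a hypothetical helper, the status words are passed in as an array instead of being read from hardware, and numbering error sources from the most significant bit of word 0 is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ERR_BITS_PER_REG 32u

/*
 * Walks nregs status words and stores the number of every pending error
 * source in pending[], up to max entries.  Source numbers run from the
 * most significant bit of word 0 onwards (assumption of this sketch).
 * Returns how many source numbers were stored.
 */
static size_t collect_pending_errors(const uint32_t *eisr, size_t nregs,
				     unsigned int *pending, size_t max)
{
	size_t count = 0, r;
	unsigned int bit;

	for (r = 0; r < nregs; r++) {
		for (bit = 0; bit < ERR_BITS_PER_REG; bit++) {
			uint32_t mask = UINT32_C(1) << (ERR_BITS_PER_REG - 1u - bit);

			if ((eisr[r] & mask) && count < max)
				pending[count++] =
					(unsigned int)(r * ERR_BITS_PER_REG) + bit;
		}
	}
	return count;
}
```

With `nregs` set to 1 this covers only EISR0, so the single-register case can be tested today and a second register would only change the argument.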

* Re: [PATCH] Revert "powerpc/p3060qds: Add support for P3060QDS board"
From: Kumar Gala @ 2012-07-06 14:06 UTC (permalink / raw)
  To: Timur Tabi; +Cc: linuxppc-dev
In-Reply-To: <1341526073-10595-1-git-send-email-timur@freescale.com>


On Jul 5, 2012, at 5:07 PM, Timur Tabi wrote:

> This reverts commit 96cc017c5b7ec095ef047d3c1952b6b6bbf98943.
>
> The P3060 was cancelled before it went into production, so there's no point
> in supporting it.
>
> Signed-off-by: Timur Tabi <timur@freescale.com>
> ---
> arch/powerpc/boot/dts/fsl/p3060si-post.dtsi  |  302 --------------------------
> arch/powerpc/boot/dts/fsl/p3060si-pre.dtsi   |  125 -----------
> arch/powerpc/boot/dts/p3060qds.dts           |  242 ---------------------
> arch/powerpc/configs/corenet32_smp_defconfig |    1 -
> arch/powerpc/platforms/85xx/Kconfig          |   12 -
> arch/powerpc/platforms/85xx/Makefile         |    1 -
> arch/powerpc/platforms/85xx/p3060_qds.c      |   77 -------
> 7 files changed, 0 insertions(+), 760 deletions(-)
> delete mode 100644 arch/powerpc/boot/dts/fsl/p3060si-post.dtsi
> delete mode 100644 arch/powerpc/boot/dts/fsl/p3060si-pre.dtsi
> delete mode 100644 arch/powerpc/boot/dts/p3060qds.dts
> delete mode 100644 arch/powerpc/platforms/85xx/p3060_qds.c

I assume you're sending a similar patch to u-boot.

- k

^ permalink raw reply

* Re: [PATCH SLAB 1/2 v3] duplicate the cache name in SLUB's saved_alias list, SLAB, and SLOB
From: Christoph Lameter @ 2012-07-06 13:56 UTC (permalink / raw)
  To: Li Zhong
  Cc: LKML, Glauber Costa, Pekka Enberg, linux-mm, Paul Mackerras,
	Matt Mackall, PowerPC email list, Wanlong Gao
In-Reply-To: <1341561286.24895.9.camel@ThinkPad-T420>

I thought I posted this a couple of days ago. Would this not fix things
without having to change all the allocators?


Subject: slub: Dup name earlier in kmem_cache_create

Dup the name earlier in kmem_cache_create so that alias
processing is done using the copy of the string and not
the string itself.

Signed-off-by: Christoph Lameter <cl@linux.com>

---
 mm/slub.c |   29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2012-06-11 08:49:56.000000000 -0500
+++ linux-2.6/mm/slub.c	2012-07-03 15:17:37.000000000 -0500
@@ -3933,8 +3933,12 @@ struct kmem_cache *kmem_cache_create(con
 	if (WARN_ON(!name))
 		return NULL;

+	n = kstrdup(name, GFP_KERNEL);
+	if (!n)
+		goto out;
+
 	down_write(&slub_lock);
-	s = find_mergeable(size, align, flags, name, ctor);
+	s = find_mergeable(size, align, flags, n, ctor);
 	if (s) {
 		s->refcount++;
 		/*
@@ -3944,7 +3948,7 @@ struct kmem_cache *kmem_cache_create(con
 		s->objsize = max(s->objsize, (int)size);
 		s->inuse = max_t(int, s->inuse, ALIGN(size, sizeof(void *)));

-		if (sysfs_slab_alias(s, name)) {
+		if (sysfs_slab_alias(s, n)) {
 			s->refcount--;
 			goto err;
 		}
@@ -3952,31 +3956,26 @@ struct kmem_cache *kmem_cache_create(con
 		return s;
 	}

-	n = kstrdup(name, GFP_KERNEL);
-	if (!n)
-		goto err;
-
 	s = kmalloc(kmem_size, GFP_KERNEL);
 	if (s) {
 		if (kmem_cache_open(s, n,
 				size, align, flags, ctor)) {
 			list_add(&s->list, &slab_caches);
 			up_write(&slub_lock);
-			if (sysfs_slab_add(s)) {
-				down_write(&slub_lock);
-				list_del(&s->list);
-				kfree(n);
-				kfree(s);
-				goto err;
-			}
-			return s;
+			if (!sysfs_slab_add(s))
+				return s;
+
+			down_write(&slub_lock);
+			list_del(&s->list);
 		}
 		kfree(s);
 	}
-	kfree(n);
+
 err:
+	kfree(n);
 	up_write(&slub_lock);

+out:
 	if (flags & SLAB_PANIC)
 		panic("Cannot create slabcache %s\n", name);
 	else

^ permalink raw reply
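
The point of the patch above is ownership of the name string: once the cache keeps its own copy, the caller may free its buffer right after creation. A minimal userspace sketch of that contract follows. `toy_cache`, `toy_cache_create()` and `toy_cache_destroy()` are hypothetical stand-ins, with `malloc()` and `memcpy()` playing the role of `kstrdup()`; none of this is the slub code itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_cache {
	char *name;	/* private copy, owned by the cache */
	size_t size;
};

/* Duplicates name first, so no later step depends on the caller's buffer. */
static struct toy_cache *toy_cache_create(const char *name, size_t size)
{
	struct toy_cache *c;
	size_t len;

	if (!name)
		return NULL;

	c = malloc(sizeof(*c));
	if (!c)
		return NULL;

	len = strlen(name) + 1;
	c->name = malloc(len);
	if (!c->name) {
		free(c);
		return NULL;
	}
	memcpy(c->name, name, len);
	c->size = size;
	return c;
}

/* Releases the cache together with its private copy of the name. */
static void toy_cache_destroy(struct toy_cache *c)
{
	if (!c)
		return;
	free(c->name);
	free(c);
}
```

A caller can then allocate a name, pass it to `toy_cache_create()`, and free it immediately, which is the usage pattern the thread wants to make safe for every allocator.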

* Re: [PATCH 2/3] powerpc/e500: add paravirt QEMU platform
From: Alexander Graf @ 2012-07-06 12:29 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, Jia Hongtao
In-Reply-To: <20120627235005.GB9100@tyr.buserror.net>


On 28.06.2012, at 01:50, Scott Wood wrote:

> This gives the kernel a paravirtualized machine to target, without
> requiring both sides to pretend to be targeting a specific board
> that likely has little to do with the host in KVM scenarios.  This
> avoids the need to add new boards to QEMU just to be able to
> run KVM on new CPUs.
>
> As this is the first platform that can run with either e500v2 or
> e500mc, CONFIG_PPC_E500MC is now a legitimately user configurable
> option, so add a help text.
>
> Signed-off-by: Scott Wood <scottwood@freescale.com>
> ---
> arch/powerpc/platforms/85xx/Kconfig     |   16 +++++++
> arch/powerpc/platforms/85xx/Makefile    |    1 +
> arch/powerpc/platforms/85xx/qemu_e500.c |   66 +++++++++++++++++++++++++++++++
> arch/powerpc/platforms/Kconfig.cputype  |    4 ++

I really think we should document what exactly this machine expects.

> 4 files changed, 87 insertions(+), 0 deletions(-)
> create mode 100644 arch/powerpc/platforms/85xx/qemu_e500.c
>
> diff --git a/arch/powerpc/platforms/85xx/Kconfig b/arch/powerpc/platforms/85xx/Kconfig
> index f000d81..7bbebe5 100644
> --- a/arch/powerpc/platforms/85xx/Kconfig
> +++ b/arch/powerpc/platforms/85xx/Kconfig
> @@ -263,6 +263,22 @@ config P5020_DS
> 	help
> 	  This option enables support for the P5020 DS board
>
> +config PPC_QEMU_E500
> +	bool "QEMU generic e500 platform"
> +	depends on EXPERIMENTAL
> +	select DEFAULT_UIMAGE
> +	help
> +	  This option enables support for running as a QEMU guest using
> +	  QEMU's generic e500 machine.  This is not required if you're
> +	  using a QEMU machine that targets a specific board, such as
> +	  mpc8544ds.
> +
> +	  Unlike most e500 boards that target a specific CPU, this
> +	  platform works with any e500-family CPU that QEMU supports.
> +	  Thus, you'll need to make sure CONFIG_PPC_E500MC is set or
> +	  unset based on the emulated CPU (or actual host CPU in the case
> +	  of KVM).
> +
> endif # FSL_SOC_BOOKE
>
> config TQM85xx
> diff --git a/arch/powerpc/platforms/85xx/Makefile b/arch/powerpc/platforms/85xx/Makefile
> index 2125d4c..f841ac8 100644
> --- a/arch/powerpc/platforms/85xx/Makefile
> +++ b/arch/powerpc/platforms/85xx/Makefile
> @@ -28,3 +28,4 @@ obj-$(CONFIG_SOCRATES)    += socrates.o socrates_fpga_pic.o
> obj-$(CONFIG_KSI8560)	  += ksi8560.o
> obj-$(CONFIG_XES_MPC85xx) += xes_mpc85xx.o
> obj-$(CONFIG_GE_IMP3A)	  += ge_imp3a.o
> +obj-$(CONFIG_PPC_QEMU_E500) += qemu_e500.o
> diff --git a/arch/powerpc/platforms/85xx/qemu_e500.c b/arch/powerpc/platforms/85xx/qemu_e500.c
> new file mode 100644
> index 0000000..77c8d5d
> --- /dev/null
> +++ b/arch/powerpc/platforms/85xx/qemu_e500.c
> @@ -0,0 +1,66 @@
> +/*
> + * Paravirt target for a generic QEMU e500 machine
> + *
> + * Copyright 2012 Freescale Semiconductor Inc.
> + *
> + * This program is free software; you can redistribute  it and/or modify it
> + * under  the terms of  the GNU General  Public License as published by the
> + * Free Software Foundation;  either version 2 of the  License, or (at your
> + * option) any later version.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/of_fdt.h>
> +#include <asm/machdep.h>
> +#include <asm/time.h>
> +#include <asm/udbg.h>
> +#include <asm/mpic.h>
> +#include <sysdev/fsl_soc.h>
> +#include <sysdev/fsl_pci.h>
> +#include "smp.h"
> +#include "mpc85xx.h"
> +
> +void __init qemu_e500_pic_init(void)
> +{
> +	struct mpic *mpic;
> +
> +	mpic = mpic_alloc(NULL, 0, MPIC_BIG_ENDIAN | MPIC_SINGLE_DEST_CPU,
> +			0, 256, " OpenPIC  ");

Does that mean we're configuring the MPIC regardless of what the guest
tells us? So the MPIC is a hard requirement. We can't use UIC or XPIC
with this machine, right? This needs to be documented.

> +
> +	BUG_ON(mpic == NULL);
> +	mpic_init(mpic);
> +}
> +
> +static void __init qemu_e500_setup_arch(void)
> +{
> +	ppc_md.progress("qemu_e500_setup_arch()", 0);
> +
> +	fsl_pci_init();
> +	mpc85xx_smp_init();
> +}
> +
> +/*
> + * Called very early, device-tree isn't unflattened
> + */
> +static int __init qemu_e500_probe(void)
> +{
> +	unsigned long root = of_get_flat_dt_root();
> +
> +	return !!of_flat_dt_is_compatible(root, "fsl,qemu-e500");

So the machine needs to be compatible "fsl,qemu-e500". Needs
documentation in the machine spec.

I'm sure you'll find more constraints that appear logical, but really
should be written down so we have something formal that potentially
someone not-QEMU or not-Scott could write a machine implementation
against ;).


Alex

^ permalink raw reply

* [PATCH] PPC Hardware Breakpoints: Fix incorrect pointer access
From: Naveen N. Rao @ 2012-07-06 11:30 UTC (permalink / raw)
  To: fweisbec, benh, paulus, prasad.krishnan
  Cc: linuxppc-dev, linux-kernel, emachado

If arch_validate_hwbkpt_settings() fails, bp->ctx won't be valid and the
kernel panics. Add a check to fix this.

Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/hw_breakpoint.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index 2bc0584..f3a82dd 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -111,7 +111,7 @@ void arch_unregister_hw_breakpoint(struct perf_event *bp)
 	 * and the single_step_dabr_instruction(), then cleanup the breakpoint
 	 * restoration variables to prevent dangling pointers.
 	 */
-	if (bp->ctx->task)
+	if (bp->ctx && bp->ctx->task)
 		bp->ctx->task->thread.last_hit_ubp = NULL;
 }
 

^ permalink raw reply related
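
The one-line fix in the patch above relies on `&&` short-circuiting: `bp->ctx->task` is only read once `bp->ctx` is known to be non-NULL. A minimal standalone sketch of that guard follows, using hypothetical `toy_*` structures in place of `struct perf_event` and its context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the perf event, its context and the task. */
struct toy_task  { void *last_hit_ubp; };
struct toy_ctx   { struct toy_task *task; };
struct toy_event { struct toy_ctx *ctx; };

/* Returns 1 if the task's last-hit pointer was cleared, 0 otherwise. */
static int toy_unregister_breakpoint(struct toy_event *bp)
{
	/* ctx is tested first, so ctx->task is never read through NULL. */
	if (bp->ctx && bp->ctx->task) {
		bp->ctx->task->last_hit_ubp = NULL;
		return 1;
	}
	return 0;
}
```

An event whose setup failed before a context was attached takes the early-out path instead of dereferencing a NULL pointer.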

* Re: [PATCH v3] printk: Have printk() never buffer its data
From: Kay Sievers @ 2012-07-06 10:46 UTC (permalink / raw)
  To: Michael Neuling
  Cc: Greg Kroah-Hartman, LKML, Steven Rostedt, Paul E. McKenney,
	linuxppc-dev, Joe Perches, Andrew Morton, Wu Fengguang,
	Linus Torvalds, Ingo Molnar
In-Reply-To: <30343.1341546469@neuling.org>

On Fri, Jul 6, 2012 at 5:47 AM, Michael Neuling <mikey@neuling.org> wrote:

>> 4,89,24561;NIP: c000000000048164 LR: c000000000048160 CTR: 0000000000000000
>> 4,90,24576;REGS: c00000007e59fb50 TRAP: 0700   Tainted: G        W     (3.5.0-rc4-mikey)
>> 4,91,24583;MSR: 9000000000021032
>> 4,92,24586;<
>> 4,93,24591;SF
>> 4,94,24596;,HV
>> 4,95,24601;,ME
>> 4,96,24606;,IR
>> 4,97,24611;,DR
>> 4,98,24616;,RI
>> 4,99,24619;>
>> 4,100,24628;  CR: 28000042  XER: 22000000
>
> FWIW, compiling with the parent commit gives this:
>
> 4,89,1712;NIP: c000000000048164 LR: c000000000048160 CTR: 0000000000000000
> 4,90,1713;REGS: c00000007e59fb50 TRAP: 0700   Tainted: G        W     (3.5.0-rc4-mikey)
> 4,91,1716;MSR: 9000000000021032 <SF,HV,ME,IR,DR,RI>  CR: 22000082  XER: 02000000

Hmm, I don't understand, which parent commit do you mean? You maybe
mean without 084681d?

I think it's a race of the two CPUs printing continuation lines, and
the continuation buffer is still occupied with data from one CPU and
not available to the other one at the same time.

What you see is likely not the direct output to the console (that
would work) but the replay of the stored buffer when the console is
registered. Because the cont buffer was still busy with one CPU, the
other thread needs to store the continuation line prints in individual
records, which leads to the (unwanted) printed newlines when
replaying.

The data we store looks all fine, it just looks needlessly separated
when we replay from the buffer on a newly registered boot console. We
need to merge the lines in the output, so they *look* like they are
all in one line. I'll work on a fix for that now.

Thanks,
Kay
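
[Illustrative sketch, not part of the original message and not the
actual fix.] The merge-on-replay idea above can be sketched in
userspace C: fragments flagged as continuations are appended to the
current line, and a newline is emitted only when a record starts a new
line. The record layout (struct rec and its cont flag) and
replay_merged() are hypothetical names chosen for this illustration;
the kernel's real log records are laid out differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout for illustration only. */
struct rec {
	int cont;         /* non-zero: continues the previous record's line */
	const char *text; /* message fragment, without a trailing newline */
};

/*
 * Replay records into out, emitting a newline only before a record that
 * starts a new line, and once after the last record. Output is truncated
 * if out is too small. Returns the number of bytes written, not counting
 * the terminating NUL.
 */
static size_t replay_merged(const struct rec *recs, size_t n,
			    char *out, size_t outsz)
{
	size_t pos = 0;
	size_t i;

	if (outsz == 0)
		return 0;

	for (i = 0; i < n; i++) {
		size_t len = strlen(recs[i].text);

		/* Close the previous line unless this record continues it. */
		if (i > 0 && !recs[i].cont && pos + 1 < outsz)
			out[pos++] = '\n';

		/* Copy as much of the fragment as still fits. */
		if (len > outsz - 1 - pos)
			len = outsz - 1 - pos;
		memcpy(out + pos, recs[i].text, len);
		pos += len;
	}

	/* Terminate the final line. */
	if (n > 0 && pos + 1 < outsz)
		out[pos++] = '\n';
	out[pos] = '\0';
	return pos;
}
```

With this shape the stored data stays untouched; only the replay path
decides where line breaks go.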

^ permalink raw reply

* Re: [PATCH powerpc 2/2] kfree the cache name  of pgtable cache if SLUB is used
From: Glauber Costa @ 2012-07-06 10:13 UTC (permalink / raw)
  To: Li Zhong
  Cc: LKML, Pekka Enberg, linux-mm, Paul Mackerras, Matt Mackall,
	Christoph Lameter, PowerPC email list
In-Reply-To: <1341480578.23916.7.camel@ThinkPad-T420>

On 07/05/2012 01:29 PM, Li Zhong wrote:
> On Thu, 2012-07-05 at 12:23 +0400, Glauber Costa wrote:
>> On 07/05/2012 05:41 AM, Li Zhong wrote:
>>> On Wed, 2012-07-04 at 16:40 +0400, Glauber Costa wrote:
>>>> On 07/04/2012 01:00 PM, Li Zhong wrote:
>>>>> On Tue, 2012-07-03 at 15:36 -0500, Christoph Lameter wrote:
>>>>>>> Looking through the emails it seems that there is an issue with alias
>>>>>>> strings. 
>>>>> To be more precise, there seems no big issue currently. I just wanted to
>>>>> make following usage of kmem_cache_create (SLUB) possible:
>>>>>
>>>>> 	name = some string kmalloced
>>>>> 	kmem_cache_create(name, ...)
>>>>> 	kfree(name);
>>>>
>>>> Out of curiosity: Why?
>>>> This is not (currently) possible with the other allocators (may change
>>>> with christoph's unification patches), so you would be making your code
>>>> slub-dependent.
>>>>
>>>
>>> For slub itself, I think it's not good that: in some cases, the name
>>> string could be kfreed ( if it was kmalloced ) immediately after calling
>>> the cache create; in some other case, the name string needs to be kept
>>> valid until some init calls finished. 
>>>
>>> I agree with you that it would make the code slub-dependent, so I'm now
>>> working on the consistency of the other allocators regarding this name
>>> string duplicating thing. 
>>
>> If you really need to kfree the string, or even if it is easier for you
>> this way, it can be done. As a matter of fact, this is the case for me.
>> Just that your patch is not enough. Christoph has a patch that makes
>> this behavior consistent over all allocators.
> 
> Sorry, I didn't know that. Seems I don't need to continue the half-done
> work in slab. If possible, would you please give me a link of the patch?
> Thank you. 
> 

Sorry for the delay. In case you haven't found it out yourself yet:

http://www.spinics.net/lists/linux-mm/msg36149.html

Please note that this posted patch, as is, has a bug.

I do believe that your take on the aliasing code adds value to it. But
as I've already said once, you might have to dig a bit deeper into that
to get to the end of the rabbit hole.

^ permalink raw reply

* Re: [RFC PATCH v2 4/13] memory-hotplug : remove /sys/firmware/memmap/X sysfs
From: Wen Congyang @ 2012-07-06  9:20 UTC (permalink / raw)
  To: Yasuaki Ishimatsu
  Cc: len.brown, linux-acpi, linux-kernel, linux-mm, paulus,
	minchan.kim, kosaki.motohiro, rientjes, cl, linuxppc-dev, akpm,
	liuj97
In-Reply-To: <4FF6A17C.6000808@jp.fujitsu.com>

At 07/06/2012 04:27 PM, Yasuaki Ishimatsu Wrote:
> Hi Wen,
> 
> 2012/07/04 19:01, Wen Congyang wrote:
>> At 07/04/2012 01:52 PM, Yasuaki Ishimatsu Wrote:
>>> Hi Wen,
>>>
>>> 2012/07/04 14:08, Wen Congyang wrote:
>>>> At 07/04/2012 12:45 PM, Yasuaki Ishimatsu Wrote:
>>>>> Hi Wen,
>>>>>
>>>>> 2012/07/03 15:35, Wen Congyang wrote:
>>>>>> At 07/03/2012 01:56 PM, Yasuaki Ishimatsu Wrote:
>>>>>>> When (hot)adding memory into system, /sys/firmware/memmap/X/{end, start, type}
>>>>>>> sysfs files are created. But there is no code to remove these files. The patch
>>>>>>> implements the function to remove them.
>>>>>>>
>>>>>>> Note : The code does not free firmware_map_entry since there is no way to free
>>>>>>>           memory which is allocated by bootmem.
>>>>>>>
>>>>>>> CC: David Rientjes <rientjes@google.com>
>>>>>>> CC: Jiang Liu <liuj97@gmail.com>
>>>>>>> CC: Len Brown <len.brown@intel.com>
>>>>>>> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>>>>>> CC: Paul Mackerras <paulus@samba.org>
>>>>>>> CC: Christoph Lameter <cl@linux.com>
>>>>>>> Cc: Minchan Kim <minchan.kim@gmail.com>
>>>>>>> CC: Andrew Morton <akpm@linux-foundation.org>
>>>>>>> CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>>>>>>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>>>>>>
>>>>>>> ---
>>>>>>>     drivers/firmware/memmap.c    |   70 +++++++++++++++++++++++++++++++++++++++++++
>>>>>>>     include/linux/firmware-map.h |    6 +++
>>>>>>>     mm/memory_hotplug.c          |    6 +++
>>>>>>>     3 files changed, 81 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> Index: linux-3.5-rc4/mm/memory_hotplug.c
>>>>>>> ===================================================================
>>>>>>> --- linux-3.5-rc4.orig/mm/memory_hotplug.c	2012-07-03 14:22:00.190240794 +0900
>>>>>>> +++ linux-3.5-rc4/mm/memory_hotplug.c	2012-07-03 14:22:03.549198802 +0900
>>>>>>> @@ -661,7 +661,11 @@ EXPORT_SYMBOL_GPL(add_memory);
>>>>>>>
>>>>>>>     int remove_memory(int nid, u64 start, u64 size)
>>>>>>>     {
>>>>>>> -	return -EBUSY;
>>>>>>> +	lock_memory_hotplug();
>>>>>>> +	/* remove memmap entry */
>>>>>>> +	firmware_map_remove(start, start + size - 1, "System RAM");
>>>>>>> +	unlock_memory_hotplug();
>>>>>>> +	return 0;
>>>>>>>
>>>>>>>     }
>>>>>>>     EXPORT_SYMBOL_GPL(remove_memory);
>>>>>>> Index: linux-3.5-rc4/include/linux/firmware-map.h
>>>>>>> ===================================================================
>>>>>>> --- linux-3.5-rc4.orig/include/linux/firmware-map.h	2012-07-03 14:21:45.766421116 +0900
>>>>>>> +++ linux-3.5-rc4/include/linux/firmware-map.h	2012-07-03 14:22:03.550198789 +0900
>>>>>>> @@ -25,6 +25,7 @@
>>>>>>>
>>>>>>>     int firmware_map_add_early(u64 start, u64 end, const char *type);
>>>>>>>     int firmware_map_add_hotplug(u64 start, u64 end, const char *type);
>>>>>>> +int firmware_map_remove(u64 start, u64 end, const char *type);
>>>>>>>
>>>>>>>     #else /* CONFIG_FIRMWARE_MEMMAP */
>>>>>>>
>>>>>>> @@ -38,6 +39,11 @@ static inline int firmware_map_add_hotpl
>>>>>>>     	return 0;
>>>>>>>     }
>>>>>>>
>>>>>>> +static inline int firmware_map_remove(u64 start, u64 end, const char *type)
>>>>>>> +{
>>>>>>> +	return 0;
>>>>>>> +}
>>>>>>> +
>>>>>>>     #endif /* CONFIG_FIRMWARE_MEMMAP */
>>>>>>>
>>>>>>>     #endif /* _LINUX_FIRMWARE_MAP_H */
>>>>>>> Index: linux-3.5-rc4/drivers/firmware/memmap.c
>>>>>>> ===================================================================
>>>>>>> --- linux-3.5-rc4.orig/drivers/firmware/memmap.c	2012-07-03 14:21:45.761421180 +0900
>>>>>>> +++ linux-3.5-rc4/drivers/firmware/memmap.c	2012-07-03 14:22:03.569198549 +0900
>>>>>>> @@ -79,7 +79,16 @@ static const struct sysfs_ops memmap_att
>>>>>>>     	.show = memmap_attr_show,
>>>>>>>     };
>>>>>>>
>>>>>>> +static void release_firmware_map_entry(struct kobject *kobj)
>>>>>>> +{
>>>>>>> +	/*
>>>>>>> +	 * FIXME : There is no idea.
>>>>>>> +	 *         How to free the entry which allocated bootmem?
>>>>>>> +	 */
>>>>>>
>>>>>> I find a function free_bootmem(), but I am not sure whether it can work here.
>>>>>
>>>>> It cannot work here.
>>>>>
>>>>>> Another problem: how to check whether the entry uses bootmem?
>>>>>
>>>>> When firmware_map_entry is allocated by kzalloc(), the page has PG_slab.
>>>>
>>>> This is not true. In my test, I find the page does not have PG_slab sometimes.
>>>
>>> I think that it depends on the allocated size. firmware_map_entry size is
>>> smaller than PAGE_SIZE. So the page has PG_Slab.
>>
>> In my test, I add printk in the function firmware_map_add_hotplug() to display
>> page's flags. And sometimes the page is not allocated by slab(I use PageSlab()
>> to verify it).
> 
> How did you check it? Could you send your debug patch?

When the memory is not allocated from slab, the flags value is 0x10000000008000.

From 8dd51368d6c03edf7edc89cab17441e3741c39c7 Mon Sep 17 00:00:00 2001
From: Wen Congyang <wency@cn.fujitsu.com>
Date: Wed, 4 Jul 2012 16:05:26 +0800
Subject: [PATCH] debug

---
 drivers/firmware/memmap.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/firmware/memmap.c b/drivers/firmware/memmap.c
index adc0710..993ba3f 100644
--- a/drivers/firmware/memmap.c
+++ b/drivers/firmware/memmap.c
@@ -21,6 +21,7 @@
 #include <linux/types.h>
 #include <linux/bootmem.h>
 #include <linux/slab.h>
+#include <linux/mm.h>
 
 /*
  * Data types ------------------------------------------------------------------
@@ -160,11 +161,17 @@ static int add_sysfs_fw_map_entry(struct firmware_map_entry *entry)
 int __meminit firmware_map_add_hotplug(u64 start, u64 end, const char *type)
 {
 	struct firmware_map_entry *entry;
+	struct page *entry_page;
 
 	entry = kzalloc(sizeof(struct firmware_map_entry), GFP_ATOMIC);
 	if (!entry)
 		return -ENOMEM;
 
+	entry_page = virt_to_page(entry);
+	printk(KERN_WARNING "flags: %lx\n", entry_page->flags);
+	if (PageSlab(entry_page)) {
+		printk(KERN_WARNING "page is allocated from slab\n");
+	}
 	firmware_map_add_entry(start, end, type, entry);
 	/* create the memmap entry */
 	add_sysfs_fw_map_entry(entry);
-- 
1.7.1

Thanks
Wen Congyang

> 
> Thanks,
> Yasuaki Ishimatsu
> 
>> Thanks
>> Wen Congyang
>>
>>>
>>> Thanks,
>>> Yasuaki Ishimatsu
>>>
>>>>
>>>> Thanks
>>>> Wen Congyang.
>>>>
>>>>> So we can check whether the entry was allocated by bootmem or not.
>>>>> If the entry was allocated by kzalloc(), we can free the entry by kfree().
>>>>> But if the entry was allocated by bootmem, we have no way to free the entry.
>>>>>
>>>>> Thanks,
>>>>> Yasuaki Ishimatsu
>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Wen Congyang
>>>>>>
>>>>>>> +}
>>>>>>> +
>>>>>>>     static struct kobj_type memmap_ktype = {
>>>>>>> +	.release	= release_firmware_map_entry,
>>>>>>>     	.sysfs_ops	= &memmap_attr_ops,
>>>>>>>     	.default_attrs	= def_attrs,
>>>>>>>     };
>>>>>>> @@ -123,6 +132,16 @@ static int firmware_map_add_entry(u64 st
>>>>>>>     	return 0;
>>>>>>>     }
>>>>>>>
>>>>>>> +/**
>>>>>>> + * firmware_map_remove_entry() - Does the real work to remove a firmware
>>>>>>> + * memmap entry.
>>>>>>> + * @entry: removed entry.
>>>>>>> + **/
>>>>>>> +static inline void firmware_map_remove_entry(struct firmware_map_entry *entry)
>>>>>>> +{
>>>>>>> +	list_del(&entry->list);
>>>>>>> +}
>>>>>>> +
>>>>>>>     /*
>>>>>>>      * Add memmap entry on sysfs
>>>>>>>      */
>>>>>>> @@ -144,6 +163,31 @@ static int add_sysfs_fw_map_entry(struct
>>>>>>>     	return 0;
>>>>>>>     }
>>>>>>>
>>>>>>> +/*
>>>>>>> + * Remove memmap entry on sysfs
>>>>>>> + */
>>>>>>> +static inline void remove_sysfs_fw_map_entry(struct firmware_map_entry *entry)
>>>>>>> +{
>>>>>>> +	kobject_put(&entry->kobj);
>>>>>>> +}
>>>>>>> +
>>>>>>> +/*
>>>>>>> + * Search memmap entry
>>>>>>> + */
>>>>>>> +
>>>>>>> +struct firmware_map_entry * __meminit
>>>>>>> +find_firmware_map_entry(u64 start, u64 end, const char *type)
>>>>>>> +{
>>>>>>> +	struct firmware_map_entry *entry;
>>>>>>> +
>>>>>>> +	list_for_each_entry(entry, &map_entries, list)
>>>>>>> +		if ((entry->start == start) && (entry->end == end) &&
>>>>>>> +		    (!strcmp(entry->type, type)))
>>>>>>> +			return entry;
>>>>>>> +
>>>>>>> +	return NULL;
>>>>>>> +}
>>>>>>> +
>>>>>>>     /**
>>>>>>>      * firmware_map_add_hotplug() - Adds a firmware mapping entry when we do
>>>>>>>      * memory hotplug.
>>>>>>> @@ -196,6 +240,32 @@ int __init firmware_map_add_early(u64 st
>>>>>>>     	return firmware_map_add_entry(start, end, type, entry);
>>>>>>>     }
>>>>>>>
>>>>>>> +/**
>>>>>>> + * firmware_map_remove() - remove a firmware mapping entry
>>>>>>> + * @start: Start of the memory range.
>>>>>>> + * @end:   End of the memory range (inclusive).
>>>>>>> + * @type:  Type of the memory range.
>>>>>>> + *
>>>>>>> + * removes a firmware mapping entry.
>>>>>>> + *
>>>>>>> + * Returns 0 on success, or -EINVAL if no entry.
>>>>>>> + **/
>>>>>>> +int __meminit firmware_map_remove(u64 start, u64 end, const char *type)
>>>>>>> +{
>>>>>>> +	struct firmware_map_entry *entry;
>>>>>>> +
>>>>>>> +	entry = find_firmware_map_entry(start, end, type);
>>>>>>> +	if (!entry)
>>>>>>> +		return -EINVAL;
>>>>>>> +
>>>>>>> +	/* remove the memmap entry */
>>>>>>> +	remove_sysfs_fw_map_entry(entry);
>>>>>>> +
>>>>>>> +	firmware_map_remove_entry(entry);
>>>>>>> +
>>>>>>> +	return 0;
>>>>>>> +}
>>>>>>> +
>>>>>>>     /*
>>>>>>>      * Sysfs functions -------------------------------------------------------------
>>>>>>>      */
>>>>>>>
>>>>>>> --
>>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>> Please read the FAQ at  http://www.tux.org/lkml/
>>>>>>>

^ permalink raw reply related

* Re: [PATCH v2 1/1] of: reform prom_update_property function
From: Dong Aisheng @ 2012-07-06  9:13 UTC (permalink / raw)
  To: Rob Herring
  Cc: devicetree-discuss@lists.ozlabs.org,
	linuxppc-dev@lists.ozlabs.org, paulus@samba.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <4FE8F6DA.8070109@gmail.com>

On Tue, Jun 26, 2012 at 07:40:10AM +0800, Rob Herring wrote:
> On 06/25/2012 01:28 AM, Dong Aisheng wrote:
> > From: Dong Aisheng <dong.aisheng@linaro.org>
> > 
> > prom_update_property() currently fails if the property doesn't
> > actually exist yet which isn't what we want. Change to add-or-update
> > instead of update-only, then we can remove a lot duplicated lines.
> > 
> > Suggested-by: Grant Likely <grant.likely@secretlab.ca>
> > Signed-off-by: Dong Aisheng <dong.aisheng@linaro.org>
> 
> Acked-by: Rob Herring <rob.herring@calxeda.com>
> 
> Ben, you can merge this via powerpc.
> 
> Rob
> 
Ping...

Regards
Dong Aisheng

^ permalink raw reply

* Re: [PATCH SLAB 1/2 v3] duplicate the cache name in SLUB's saved_alias list, SLAB, and SLOB
From: Glauber Costa @ 2012-07-06  9:04 UTC (permalink / raw)
  To: Li Zhong
  Cc: Christoph Lameter, LKML, Pekka Enberg, linux-mm, Paul Mackerras,
	Matt Mackall, PowerPC email list, Wanlong Gao
In-Reply-To: <1341561286.24895.9.camel@ThinkPad-T420>

On 07/06/2012 11:54 AM, Li Zhong wrote:
> +	if (!c && lname)
> +		kfree(lname);
> +
kfree can still be validly called with a NULL argument. No need for the
lname in the conditional.
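
To illustrate the point with a userspace analogue (a sketch only, not
the patch itself): free(NULL), like kfree(NULL), is defined to do
nothing, so the cleanup path needs to test only whether creation
failed. The names demo_cache and demo_create() and the
simulate_failure parameter are hypothetical and exist only for this
example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a cache object that owns its name. */
struct demo_cache {
	char *name;
};

/*
 * Create a cache with a private copy of name. When simulate_failure is
 * non-zero, creation fails after the name has been duplicated, which
 * exercises the cleanup path.
 */
static struct demo_cache *demo_create(const char *name, int simulate_failure)
{
	struct demo_cache *c = NULL;
	char *lname = NULL;
	size_t len = strlen(name) + 1;

	lname = malloc(len);
	if (!lname)
		goto oops;
	memcpy(lname, name, len);

	if (simulate_failure)
		goto oops;

	c = malloc(sizeof(*c));
	if (!c)
		goto oops;
	c->name = lname;

oops:
	/* free(NULL) is a no-op, so there is no need for "&& lname" here. */
	if (!c)
		free(lname);
	return c;
}
```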

^ permalink raw reply

* Re: [RFC PATCH v2 4/13] memory-hotplug : remove /sys/firmware/memmap/X sysfs
From: Yasuaki Ishimatsu @ 2012-07-06  8:27 UTC (permalink / raw)
  To: Wen Congyang
  Cc: len.brown, linux-acpi, linux-kernel, linux-mm, paulus,
	minchan.kim, kosaki.motohiro, rientjes, cl, linuxppc-dev, akpm,
	liuj97
In-Reply-To: <4FF41484.3070806@cn.fujitsu.com>

Hi Wen,

2012/07/04 19:01, Wen Congyang wrote:
> At 07/04/2012 01:52 PM, Yasuaki Ishimatsu Wrote:
>> Hi Wen,
>>
>> 2012/07/04 14:08, Wen Congyang wrote:
>>> At 07/04/2012 12:45 PM, Yasuaki Ishimatsu Wrote:
>>>> Hi Wen,
>>>>
>>>> 2012/07/03 15:35, Wen Congyang wrote:
>>>>> At 07/03/2012 01:56 PM, Yasuaki Ishimatsu Wrote:
>>>>>> When (hot)adding memory into system, /sys/firmware/memmap/X/{end, start, type}
>>>>>> sysfs files are created. But there is no code to remove these files. The patch
>>>>>> implements the function to remove them.
>>>>>>
>>>>>> Note : The code does not free firmware_map_entry since there is no way to free
>>>>>>           memory which is allocated by bootmem.
>>>>>>
>>>>>> CC: David Rientjes <rientjes@google.com>
>>>>>> CC: Jiang Liu <liuj97@gmail.com>
>>>>>> CC: Len Brown <len.brown@intel.com>
>>>>>> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>>>>> CC: Paul Mackerras <paulus@samba.org>
>>>>>> CC: Christoph Lameter <cl@linux.com>
>>>>>> Cc: Minchan Kim <minchan.kim@gmail.com>
>>>>>> CC: Andrew Morton <akpm@linux-foundation.org>
>>>>>> CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>>>>>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
>>>>>>
>>>>>> ---
>>>>>>     drivers/firmware/memmap.c    |   70 +++++++++++++++++++++++++++++++++++++++++++
>>>>>>     include/linux/firmware-map.h |    6 +++
>>>>>>     mm/memory_hotplug.c          |    6 +++
>>>>>>     3 files changed, 81 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> Index: linux-3.5-rc4/mm/memory_hotplug.c
>>>>>> ===================================================================
>>>>>> --- linux-3.5-rc4.orig/mm/memory_hotplug.c	2012-07-03 14:22:00.190240794 +0900
>>>>>> +++ linux-3.5-rc4/mm/memory_hotplug.c	2012-07-03 14:22:03.549198802 +0900
>>>>>> @@ -661,7 +661,11 @@ EXPORT_SYMBOL_GPL(add_memory);
>>>>>>
>>>>>>     int remove_memory(int nid, u64 start, u64 size)
>>>>>>     {
>>>>>> -	return -EBUSY;
>>>>>> +	lock_memory_hotplug();
>>>>>> +	/* remove memmap entry */
>>>>>> +	firmware_map_remove(start, start + size - 1, "System RAM");
>>>>>> +	unlock_memory_hotplug();
>>>>>> +	return 0;
>>>>>>
>>>>>>     }
>>>>>>     EXPORT_SYMBOL_GPL(remove_memory);
>>>>>> Index: linux-3.5-rc4/include/linux/firmware-map.h
>>>>>> ===================================================================
>>>>>> --- linux-3.5-rc4.orig/include/linux/firmware-map.h	2012-07-03 14:21:45.766421116 +0900
>>>>>> +++ linux-3.5-rc4/include/linux/firmware-map.h	2012-07-03 14:22:03.550198789 +0900
>>>>>> @@ -25,6 +25,7 @@
>>>>>>
>>>>>>     int firmware_map_add_early(u64 start, u64 end, const char *type);
>>>>>>     int firmware_map_add_hotplug(u64 start, u64 end, const char *type);
>>>>>> +int firmware_map_remove(u64 start, u64 end, const char *type);
>>>>>>
>>>>>>     #else /* CONFIG_FIRMWARE_MEMMAP */
>>>>>>
>>>>>> @@ -38,6 +39,11 @@ static inline int firmware_map_add_hotpl
>>>>>>     	return 0;
>>>>>>     }
>>>>>>
>>>>>> +static inline int firmware_map_remove(u64 start, u64 end, const char *type)
>>>>>> +{
>>>>>> +	return 0;
>>>>>> +}
>>>>>> +
>>>>>>     #endif /* CONFIG_FIRMWARE_MEMMAP */
>>>>>>
>>>>>>     #endif /* _LINUX_FIRMWARE_MAP_H */
>>>>>> Index: linux-3.5-rc4/drivers/firmware/memmap.c
>>>>>> ===================================================================
>>>>>> --- linux-3.5-rc4.orig/drivers/firmware/memmap.c	2012-07-03 14:21:45.761421180 +0900
>>>>>> +++ linux-3.5-rc4/drivers/firmware/memmap.c	2012-07-03 14:22:03.569198549 +0900
>>>>>> @@ -79,7 +79,16 @@ static const struct sysfs_ops memmap_att
>>>>>>     	.show = memmap_attr_show,
>>>>>>     };
>>>>>>
>>>>>> +static void release_firmware_map_entry(struct kobject *kobj)
>>>>>> +{
>>>>>> +	/*
>>>>>> +	 * FIXME : There is no idea.
>>>>>> +	 *         How to free the entry which allocated bootmem?
>>>>>> +	 */
>>>>>
>>>>> I find a function free_bootmem(), but I am not sure whether it can work here.
>>>>
>>>> It cannot work here.
>>>>
>>>>> Another problem: how to check whether the entry uses bootmem?
>>>>
>>>> When firmware_map_entry is allocated by kzalloc(), the page has PG_slab.
>>>
>>> This is not true. In my test, I find the page does not have PG_slab sometimes.
>>
>> I think that it depends on the allocated size. firmware_map_entry size is
>> smaller than PAGE_SIZE. So the page has PG_Slab.
> 
> In my test, I add printk in the function firmware_map_add_hotplug() to display
> page's flags. And sometimes the page is not allocated by slab(I use PageSlab()
> to verify it).

How did you check it? Could you send your debug patch?

Thanks,
Yasuaki Ishimatsu

> Thanks
> Wen Congyang
> 
>>
>> Thanks,
>> Yasuaki Ishimatsu
>>
>>>
>>> Thanks
>>> Wen Congyang.
>>>
>>>> So we can check whether the entry was allocated by bootmem or not.
>>>> If the entry was allocated by kzalloc(), we can free the entry by kfree().
>>>> But if the entry was allocated by bootmem, we have no way to free the entry.
>>>>
>>>> Thanks,
>>>> Yasuaki Ishimatsu
>>>>
>>>>>
>>>>> Thanks
>>>>> Wen Congyang
>>>>>
>>>>>> +}
>>>>>> +
>>>>>>     static struct kobj_type memmap_ktype = {
>>>>>> +	.release	= release_firmware_map_entry,
>>>>>>     	.sysfs_ops	= &memmap_attr_ops,
>>>>>>     	.default_attrs	= def_attrs,
>>>>>>     };
>>>>>> @@ -123,6 +132,16 @@ static int firmware_map_add_entry(u64 st
>>>>>>     	return 0;
>>>>>>     }
>>>>>>
>>>>>> +/**
>>>>>> + * firmware_map_remove_entry() - Does the real work to remove a firmware
>>>>>> + * memmap entry.
>>>>>> + * @entry: removed entry.
>>>>>> + **/
>>>>>> +static inline void firmware_map_remove_entry(struct firmware_map_entry *entry)
>>>>>> +{
>>>>>> +	list_del(&entry->list);
>>>>>> +}
>>>>>> +
>>>>>>     /*
>>>>>>      * Add memmap entry on sysfs
>>>>>>      */
>>>>>> @@ -144,6 +163,31 @@ static int add_sysfs_fw_map_entry(struct
>>>>>>     	return 0;
>>>>>>     }
>>>>>>
>>>>>> +/*
>>>>>> + * Remove memmap entry on sysfs
>>>>>> + */
>>>>>> +static inline void remove_sysfs_fw_map_entry(struct firmware_map_entry *entry)
>>>>>> +{
>>>>>> +	kobject_put(&entry->kobj);
>>>>>> +}
>>>>>> +
>>>>>> +/*
>>>>>> + * Search memmap entry
>>>>>> + */
>>>>>> +
>>>>>> +struct firmware_map_entry * __meminit
>>>>>> +find_firmware_map_entry(u64 start, u64 end, const char *type)
>>>>>> +{
>>>>>> +	struct firmware_map_entry *entry;
>>>>>> +
>>>>>> +	list_for_each_entry(entry, &map_entries, list)
>>>>>> +		if ((entry->start == start) && (entry->end == end) &&
>>>>>> +		    (!strcmp(entry->type, type)))
>>>>>> +			return entry;
>>>>>> +
>>>>>> +	return NULL;
>>>>>> +}
>>>>>> +
>>>>>>     /**
>>>>>>      * firmware_map_add_hotplug() - Adds a firmware mapping entry when we do
>>>>>>      * memory hotplug.
>>>>>> @@ -196,6 +240,32 @@ int __init firmware_map_add_early(u64 st
>>>>>>     	return firmware_map_add_entry(start, end, type, entry);
>>>>>>     }
>>>>>>
>>>>>> +/**
>>>>>> + * firmware_map_remove() - remove a firmware mapping entry
>>>>>> + * @start: Start of the memory range.
>>>>>> + * @end:   End of the memory range (inclusive).
>>>>>> + * @type:  Type of the memory range.
>>>>>> + *
>>>>>> + * removes a firmware mapping entry.
>>>>>> + *
>>>>>> + * Returns 0 on success, or -EINVAL if no entry.
>>>>>> + **/
>>>>>> +int __meminit firmware_map_remove(u64 start, u64 end, const char *type)
>>>>>> +{
>>>>>> +	struct firmware_map_entry *entry;
>>>>>> +
>>>>>> +	entry = find_firmware_map_entry(start, end, type);
>>>>>> +	if (!entry)
>>>>>> +		return -EINVAL;
>>>>>> +
>>>>>> +	/* remove the memmap entry */
>>>>>> +	remove_sysfs_fw_map_entry(entry);
>>>>>> +
>>>>>> +	firmware_map_remove_entry(entry);
>>>>>> +
>>>>>> +	return 0;
>>>>>> +}
>>>>>> +
>>>>>>     /*
>>>>>>      * Sysfs functions -------------------------------------------------------------
>>>>>>      */
>>>>>>
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>> Please read the FAQ at  http://www.tux.org/lkml/
>>>>>>

^ permalink raw reply

* Re: ptrace and emulated mfspr/mtspr on DSCR
From: Alexey Kardashevskiy @ 2012-07-06  8:12 UTC (permalink / raw)
  To: Linuxppc-dev
In-Reply-To: <4FF69404.6090408@ozlabs.ru>


ha, forget it, it is all correct actually :)


On 06/07/12 17:30, Alexey Kardashevskiy wrote:
> Hi!
> 
> I am trying to change DSCR's value of a specific process with pid=XXX. For this, I attach by ptrace() to XXX, inject a piece of code which does mfspr/mtspr, "continue" XXX and see how it is changing. So far so good.
> 
> The problem is with "continue". The XXX process does not wake up until I press a key (if XXX is waiting on something like scanf() or gets()) OR it exits from sleep() if I change it to run sleep() in a loop.
> 
> Not sure if it matters but mfspr/mtspr are privileged instructions and are emulated by the kernel.
> 
> How to wake XXX up?
> 
> 
> 
> #include <sys/ptrace.h>
> #include <sys/types.h>
> #include <sys/wait.h>
> #include <string.h>
> #include <unistd.h>
> #include <sys/user.h>
> #include <stdio.h>
> #include <stdlib.h>
> 
> void getdata(pid_t child, long addr, void *str)
> {
> 	unsigned long *ptr = (unsigned long *) str;
> 	ptr[0] = ptrace(PTRACE_PEEKDATA, child, addr, NULL);
> }
> 
> void putdata(pid_t child, long addr, void *str)
> {
> 	unsigned long *ptr = (unsigned long *) str;
> 	ptrace(PTRACE_POKEDATA, child, addr, ptr[0]);
> }
> 
> int main(int argc, char *argv[])
> {
> 	pid_t traced_process;
> 	struct pt_regs regs, backup_regs;
> 	unsigned long dscr = -1;
> /*.set_dscr:
> * 7f d1 03 a6     mtspr   17,r30
>   7d 82 10 08     twge    r2,r2     <- set breakpoint */
> 	unsigned int insert_set[] = { 0x7fd103a6, 0x7d821008 };
> /*.get_dscr:
>   7f d1 02 a6     mfspr   r30,17
>   7d 82 10 08     twge    r2,r2     <- set breakpoint */
> 	unsigned int insert_get[] = { 0x7fd102a6, 0x7d821008 };
> 	char backup[8];
> 	int len = 8;
> 
> 	if((argc < 2)||(sizeof(unsigned int)!=4)) {
> 		printf("Usage: %s <pid to be traced> [dscr value]\n", argv[0]);
> 		exit(1);
> 	}
> 	if (argc > 2) {
> 		dscr = atoi(argv[2]);
> 	}
> 
> 	traced_process = atoi(argv[1]);
> 	ptrace(PTRACE_ATTACH, traced_process, NULL, NULL);
> 	wait(NULL);
> 
> 	printf("Attached to pid=%u\n", traced_process);
> 	ptrace(PTRACE_GETREGS, traced_process, NULL, &regs);
> 	backup_regs = regs;
> 	getdata(traced_process, regs.nip, backup);
> 
> 	if (dscr != -1) {
> 		regs.gpr[30] = dscr;
> 		putdata(traced_process, regs.nip, insert_set);
> 		ptrace(PTRACE_SETREGS, traced_process, NULL, &regs);
> 		printf("Setting DSCR = %lx to gpr30\n", regs.gpr[30]);
> 	} else {
> 		putdata(traced_process, regs.nip, insert_get);
> 		printf("Reading DSCR\n");
> 	}
> 
> 	printf("Continued pid=%u\n", traced_process);
> 	ptrace(PTRACE_CONT, traced_process, NULL, SIGCONT);
> 
> 	printf("waiting...\n");
> 	wait(NULL);      // <---------------- HERE IS THE PROBLEM
> 
> 	if (dscr == -1) {
> 		printf("DSCR has been read\n");
> 		ptrace(PTRACE_GETREGS, traced_process, NULL, &regs);
> 		printf("Reading DSCR from gpr30 = %lx\n", regs.gpr[30]);
> 	}
> 
> 	printf("The process stopped, Putting back the original instructions\n");
> 	putdata(traced_process, backup_regs.nip, backup);
> 	ptrace(PTRACE_SETREGS, traced_process, NULL, &backup_regs);
> 	printf("Letting it continue with original flow\n");
> 	ptrace(PTRACE_DETACH, traced_process, NULL, NULL);
> 
> 	return 0;
> }
> 


-- 
Alexey

^ permalink raw reply

* [PATCH powerpc 2/2 v3] kfree the cache name of pgtable cache
From: Li Zhong @ 2012-07-06  7:57 UTC (permalink / raw)
  To: LKML
  Cc: Christoph Lameter, Glauber Costa, Pekka Enberg, linux-mm,
	Paul Mackerras, Matt Mackall, PowerPC email list, Wanlong Gao
In-Reply-To: <1341561286.24895.9.camel@ThinkPad-T420>

This patch tries to kfree the cache name of pgtables cache. It depends
on patch 1/2 -- ([PATCH SLAB 1/2 v3] duplicate the cache name in SLUB's
saved_alias list, SLAB, and SLOB) in this mail thread. 

For SLUB, the pgtables cache might be mergeable with other caches.
During early boot, the name pointer is then saved in the saved_alias
list. With patch 1, the name can be safely kfreed after calling
kmem_cache_create() in this case as well.

For SLAB/SLOB, we need the changes in patch 1, which duplicates the name
strings in cache create.

v3: with patch 1/2 updated to make slab/slob consistent, #ifdef
CONFIG_SLUB is no longer needed. 

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/powerpc/mm/init_64.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 620b7ac..bc7f462 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -130,6 +130,7 @@ void pgtable_cache_add(unsigned shift, void (*ctor)(void *))
 	align = max_t(unsigned long, align, minalign);
 	name = kasprintf(GFP_KERNEL, "pgtable-2^%d", shift);
 	new = kmem_cache_create(name, table_size, align, 0, ctor);
+	kfree(name);
 	PGT_CACHE(shift) = new;

 	pr_debug("Allocated pgtable cache for order %d\n", shift);
-- 
1.7.1

^ permalink raw reply related

* [PATCH SLAB 1/2 v3] duplicate the cache name in SLUB's saved_alias list, SLAB, and SLOB
From: Li Zhong @ 2012-07-06  7:54 UTC (permalink / raw)
  To: LKML
  Cc: Christoph Lameter, Glauber Costa, Pekka Enberg, linux-mm,
	Paul Mackerras, Matt Mackall, PowerPC email list, Wanlong Gao

SLUB duplicates the cache name string passed into kmem_cache_create().
However if the cache could be merged to others during early boot, the
name pointer is saved in saved_alias list, and the string needs to be
kept valid before slab_sysfs_init() is finished. With this patch, the
name string (if kmalloced) could be kfreed after calling
kmem_cache_create().

Some more details:

kmem_cache_create() checks whether it is mergeable before creating one.
If not mergeable, the name is duplicated: n = kstrdup(name, GFP_KERNEL);

If it is mergeable, it calls sysfs_slab_alias(). If the sysfs is ready
(slab_state == SYSFS), then the name is duplicated (or dropped if no
SYSFS support) in sysfs_create_link() for use.

For the above cases, we could safely kfree the name string after calling
cache create. 

However, during early boot, before sysfs is ready (slab_state < SYSFS),
the sysfs_slab_alias() saves the pointer of name in the alias_list.
Those entries in the list are added to sysfs later in slab_sysfs_init()
to set up the sysfs stuff, and we need to keep the name string passed in
valid until it finishes. By duplicating the name string here also, we
are able to safely kfree the name string after calling cache create.
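
The ownership rule this aims for can be sketched with a small
userspace analogue (illustrative only; demo_cache,
demo_cache_create(), demo_cache_destroy() and
demo_pgtable_cache_add() are hypothetical names, and malloc()/free()
stand in for the kernel allocators): cache creation always takes its
own copy of the name, so the caller may free its buffer right after
the call, and destroy frees the copy.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct kmem_cache. */
struct demo_cache {
	char *name;	/* private copy owned by the cache */
};

/* Create a cache that duplicates name, so the caller's buffer may be freed. */
static struct demo_cache *demo_cache_create(const char *name)
{
	struct demo_cache *c;
	size_t len = strlen(name) + 1;

	c = malloc(sizeof(*c));
	if (!c)
		return NULL;

	c->name = malloc(len);
	if (!c->name) {
		free(c);
		return NULL;
	}
	memcpy(c->name, name, len);
	return c;
}

/* Release the cache together with its private copy of the name. */
static void demo_cache_destroy(struct demo_cache *c)
{
	if (!c)
		return;
	free(c->name);
	free(c);
}

/* Caller pattern: build a temporary name, create the cache, free the name. */
static struct demo_cache *demo_pgtable_cache_add(unsigned int shift)
{
	char *name = malloc(32);
	struct demo_cache *c;

	if (!name)
		return NULL;
	snprintf(name, 32, "pgtable-2^%u", shift);
	c = demo_cache_create(name);
	free(name);	/* safe: the cache holds its own copy */
	return c;
}
```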

v2: removed an unnecessary assignment in v1; some changes in change log,
added more details

v3: changed slab/slob to let them also duplicate the name string, so the
code is not slub-dependent, and in patch 2/2, we could call kfree()
after cache create without #ifdef slub.
    for slab, the name of the sizes caches created before
slab_is_available() is not duplicated, and it is not checked in
kmem_cache_destroy(), as I think these caches won't be destroyed.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 mm/slab.c |   15 ++++++++++++++-
 mm/slob.c |   17 ++++++++++++++---
 mm/slub.c |    7 ++++++-
 3 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index e901a36..87df7d1 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2280,6 +2280,7 @@ kmem_cache_create (const char *name, size_t size, size_t align,
 	size_t left_over, slab_size, ralign;
 	struct kmem_cache *cachep = NULL, *pc;
 	gfp_t gfp;
+	const char *lname;
 
 	/*
 	 * Sanity checks... these are all serious usage bugs.
@@ -2291,6 +2292,13 @@ kmem_cache_create (const char *name, size_t size, size_t align,
 		BUG();
 	}
 
+	if (slab_is_available()) {
+		lname = kstrdup(name, GFP_KERNEL);
+		if (!lname)
+			goto oops;
+	} else
+		lname = name;
+
 	/*
 	 * We use cache_chain_mutex to ensure a consistent view of
 	 * cpu_online_mask as well.  Please see cpuup_callback
@@ -2526,7 +2534,7 @@ kmem_cache_create (const char *name, size_t size, size_t align,
 		BUG_ON(ZERO_OR_NULL_PTR(cachep->slabp_cache));
 	}
 	cachep->ctor = ctor;
-	cachep->name = name;
+	cachep->name = lname;
 
 	if (setup_cpu_cache(cachep, gfp)) {
 		__kmem_cache_destroy(cachep);
@@ -2550,6 +2558,9 @@ oops:
 	if (!cachep && (flags & SLAB_PANIC))
 		panic("kmem_cache_create(): failed to create slab `%s'\n",
 		      name);
+	if (!cachep && lname)
+		kfree(lname);
+
 	if (slab_is_available()) {
 		mutex_unlock(&cache_chain_mutex);
 		put_online_cpus();
@@ -2752,6 +2763,8 @@ void kmem_cache_destroy(struct kmem_cache *cachep)
 	if (unlikely(cachep->flags & SLAB_DESTROY_BY_RCU))
 		rcu_barrier();
 
+	/* sizes caches will not be destroyed? */
+	kfree(cachep->name);
 	__kmem_cache_destroy(cachep);
 	mutex_unlock(&cache_chain_mutex);
 	put_online_cpus();
diff --git a/mm/slob.c b/mm/slob.c
index 8105be4..7bea3a3 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -569,13 +569,18 @@ struct kmem_cache {
 struct kmem_cache *kmem_cache_create(const char *name, size_t size,
 	size_t align, unsigned long flags, void (*ctor)(void *))
 {
-	struct kmem_cache *c;
+	struct kmem_cache *c = NULL;
+	const char *lname;
+
+	lname = kstrdup(name, GFP_KERNEL);
+	if (!lname)
+		goto oops;
 
 	c = slob_alloc(sizeof(struct kmem_cache),
 		GFP_KERNEL, ARCH_KMALLOC_MINALIGN, -1);
 
 	if (c) {
-		c->name = name;
+		c->name = lname;
 		c->size = size;
 		if (flags & SLAB_DESTROY_BY_RCU) {
 			/* leave room for rcu footer at the end of object */
@@ -589,9 +594,14 @@ struct kmem_cache *kmem_cache_create(const char *name, size_t size,
 			c->align = ARCH_SLAB_MINALIGN;
 		if (c->align < align)
 			c->align = align;
-	} else if (flags & SLAB_PANIC)
+	}
+oops:
+	if (!c && (flags & SLAB_PANIC))
 		panic("Cannot create slab cache %s\n", name);
 
+	if (!c && lname)
+		kfree(lname);
+
 	kmemleak_alloc(c, sizeof(struct kmem_cache), 1, GFP_KERNEL);
 	return c;
 }
@@ -602,6 +612,7 @@ void kmem_cache_destroy(struct kmem_cache *c)
 	kmemleak_free(c);
 	if (c->flags & SLAB_DESTROY_BY_RCU)
 		rcu_barrier();
+	kfree(c->name);
 	slob_free(c, sizeof(struct kmem_cache));
 }
 EXPORT_SYMBOL(kmem_cache_destroy);
diff --git a/mm/slub.c b/mm/slub.c
index 8c691fa..ed9f3c5 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5372,7 +5372,11 @@ static int sysfs_slab_alias(struct kmem_cache *s, const char *name)
 		return -ENOMEM;
 
 	al->s = s;
-	al->name = name;
+	al->name = kstrdup(name, GFP_KERNEL);
+	if (!al->name) {
+		kfree(al);
+		return -ENOMEM;
+	}
 	al->next = alias_list;
 	alias_list = al;
 	return 0;
@@ -5409,6 +5413,7 @@ static int __init slab_sysfs_init(void)
 		if (err)
 			printk(KERN_ERR "SLUB: Unable to add boot slab alias"
 					" %s to sysfs\n", s->name);
+		kfree(al->name);
 		kfree(al);
 	}
 
-- 
1.7.1

^ permalink raw reply related

* ptrace and emulated mfspr/mtspr on DSCR
From: Alexey Kardashevskiy @ 2012-07-06  7:30 UTC (permalink / raw)
  To: Linuxppc-dev

Hi!

I am trying to change DSCR's value of a specific process with pid=XXX. For this, I attach by ptrace() to XXX, inject a piece of code which does mfspr/mtspr, "continue" XXX and see how it is changing. So far so good.

The problem is with "continue". The XXX process does not wake up until I press a key (if XXX is waiting on something like scanf() or gets()) OR it exits from sleep() if I change it to run sleep() in a loop.

Not sure if it matters but mfspr/mtspr are privileged instructions and are emulated by the kernel.

How to wake XXX up?



#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <string.h>
#include <unistd.h>
#include <sys/user.h>
#include <stdio.h>
#include <stdlib.h>

void getdata(pid_t child, long addr, void *str)
{
	unsigned long *ptr = (unsigned long *) str;
	ptr[0] = ptrace(PTRACE_PEEKDATA, child, addr, NULL);
}

void putdata(pid_t child, long addr, void *str)
{
	unsigned long *ptr = (unsigned long *) str;
	ptrace(PTRACE_POKEDATA, child, addr, ptr[0]);
}

int main(int argc, char *argv[])
{
	pid_t traced_process;
	struct pt_regs regs, backup_regs;
	unsigned long dscr = -1;
/*.set_dscr:
  7f d1 03 a6     mtspr   17,r30
  7d 82 10 08     twge    r2,r2     <- set breakpoint */
	unsigned int insert_set[] = { 0x7fd103a6, 0x7d821008 };
/*.get_dscr:
  7f d1 02 a6     mfspr   r30,17
  7d 82 10 08     twge    r2,r2     <- set breakpoint */
	unsigned int insert_get[] = { 0x7fd102a6, 0x7d821008 };
	char backup[8];
	int len = 8;

	if((argc < 2)||(sizeof(unsigned int)!=4)) {
		printf("Usage: %s <pid to be traced> [dscr value]\n", argv[0]);
		exit(1);
	}
	if (argc > 2) {
		dscr = atoi(argv[2]);
	}

	traced_process = atoi(argv[1]);
	ptrace(PTRACE_ATTACH, traced_process, NULL, NULL);
	wait(NULL);

	printf("Attached to pid=%u\n", traced_process);
	ptrace(PTRACE_GETREGS, traced_process, NULL, &regs);
	backup_regs = regs;
	getdata(traced_process, regs.nip, backup);

	if (dscr != -1) {
		regs.gpr[30] = dscr;
		putdata(traced_process, regs.nip, insert_set);
		ptrace(PTRACE_SETREGS, traced_process, NULL, &regs);
		printf("Setting DSCR = %lx via gpr30\n", regs.gpr[30]);
	} else {
		putdata(traced_process, regs.nip, insert_get);
		printf("Reading DSCR\n");
	}

	printf("Continued pid=%u\n", traced_process);
	ptrace(PTRACE_CONT, traced_process, NULL, SIGCONT);

	printf("waiting...\n");
	wait(NULL);      // <---------------- HERE IS THE PROBLEM

	if (dscr == -1) {
		printf("DSCR has been read\n");
		ptrace(PTRACE_GETREGS, traced_process, NULL, &regs);
		printf("Reading DSCR from gpr30 = %lx\n", regs.gpr[30]);
	}

	printf("The process stopped, Putting back the original instructions\n");
	putdata(traced_process, backup_regs.nip, backup);
	ptrace(PTRACE_SETREGS, traced_process, NULL, &backup_regs);
	printf("Letting it continue with original flow\n");
	ptrace(PTRACE_DETACH, traced_process, NULL, NULL);

	return 0;
}
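
The program above discards the status returned by wait(), which makes it
hard to tell whether the tracee stopped on the trap instruction, stopped
for another signal, or exited. A minimal sketch of decoding that status
follows; describe_status() and wait_and_describe() are hypothetical helper
names, not part of the original program, and this does not by itself
answer why the tracee fails to wake up.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Classify a wait status.
 * Returns the stop signal if the child stopped, 0 if it terminated
 * (exit or fatal signal), and -1 for anything else.
 */
int describe_status(int status)
{
	if (WIFSTOPPED(status)) {
		printf("stopped by signal %d\n", WSTOPSIG(status));
		return WSTOPSIG(status);
	}
	if (WIFEXITED(status)) {
		printf("exited with code %d\n", WEXITSTATUS(status));
		return 0;
	}
	if (WIFSIGNALED(status)) {
		printf("killed by signal %d\n", WTERMSIG(status));
		return 0;
	}
	return -1;
}

/* Wait for one specific pid instead of any child, and report the result. */
int wait_and_describe(pid_t pid)
{
	int status;

	if (waitpid(pid, &status, 0) < 0) {
		perror("waitpid");
		return -1;
	}
	return describe_status(status);
}
```

In the tracer, wait_and_describe(traced_process) could replace each
wait(NULL). If the injected trap instruction is reached, one would expect
the reported stop signal to be SIGTRAP.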

-- 
Alexey

^ permalink raw reply

* [PATCH] powerpc: put the gpr save/restore functions in their own section
From: Stephen Rothwell @ 2012-07-06  7:09 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: ppc-dev, Alan Modra

This allows the linker to know that calls to them do not need to switch
TOC and stop errors like the following when linking large configurations:

powerpc64-linux-ld: drivers/built-in.o: In function `.gpiochip_is_requested':
(.text+0x4): sibling call optimization to `_savegpr0_29' does not allow automatic multiple TOCs; recompile with -mminimal-toc or -fno-optimize-sibling-calls, or make `_savegpr0_29' extern

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
 arch/powerpc/lib/crtsavres.S |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/crtsavres.S b/arch/powerpc/lib/crtsavres.S
index 1c893f0..b2c68ce 100644
--- a/arch/powerpc/lib/crtsavres.S
+++ b/arch/powerpc/lib/crtsavres.S
@@ -41,12 +41,13 @@
 #include <asm/ppc_asm.h>
 
 	.file	"crtsavres.S"
-	.section ".text"
 
 #ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
 
 #ifndef CONFIG_PPC64
 
+	.section ".text"
+
 /* Routines for saving integer registers, called by the compiler.  */
 /* Called with r11 pointing to the stack header word of the caller of the */
 /* function, just beyond the end of the integer save area.  */
@@ -232,6 +233,8 @@ _GLOBAL(_rest32gpr_31_x)
 
 #else /* CONFIG_PPC64 */
 
+	.section ".text.save.restore","ax",@progbits
+
 .globl	_savegpr0_14
 _savegpr0_14:
 	std	r14,-144(r1)
-- 
1.7.10.280.gaa39

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

^ permalink raw reply related

* Re: [Qemu-ppc] [RFC PATCH 04/17] KVM: PPC64: booke: Add guest computation mode for irq delivery
From: Alexander Graf @ 2012-07-06  7:03 UTC (permalink / raw)
  To: Scott Wood
  Cc: <kvm-ppc@vger.kernel.org>, Mihai Caraman,
	<qemu-ppc@nongnu.org>,
	<linuxppc-dev@lists.ozlabs.org>,
	<kvm@vger.kernel.org>
In-Reply-To: <4FF62891.9020702@freescale.com>


On 06.07.2012, at 01:51, Scott Wood <scottwood@freescale.com> wrote:

> On 07/04/2012 08:40 AM, Alexander Graf wrote:
>> On 25.06.2012, at 14:26, Mihai Caraman wrote:
>>> @@ -381,7 +386,8 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
>>>            set_guest_esr(vcpu, vcpu->arch.queued_esr);
>>>        if (update_dear == true)
>>>            set_guest_dear(vcpu, vcpu->arch.queued_dear);
>>> -        kvmppc_set_msr(vcpu, vcpu->arch.shared->msr & msr_mask);
>>> +        kvmppc_set_msr(vcpu, (vcpu->arch.shared->msr & msr_mask)
>>> +                | msr_cm);
>>
>> Please split this computation out into its own variable and apply the
>> masking regardless. Something like
>>
>> ulong new_msr = vcpu->arch.shared->msr;
>> if (vcpu->arch.epcr & SPRN_EPCR_ICM)
>>    new_msr |= MSR_CM;
>> new_msr &= msr_mask;
>> kvmppc_set_msr(vcpu, new_msr);
>
> have MSR[CM] set but EPCR[ICM] unset.

Ah. Good point. Then leave the msr_mask logic as before and only stretch it
out into its own variable.

Alex
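
For readers following the thread, the agreed shape (mask first, then set
MSR[CM] only if EPCR[ICM] is set) can be sketched as a standalone function.
This is an illustration, not the merged code: irq_entry_msr() is a made-up
name, the bit values are for illustration only, and it assumes msr_mask
does not include MSR_CM.

```c
/* Illustrative bit values only; the real ones come from the kernel headers. */
#define MSR_CM		(1UL << 31)
#define SPRN_EPCR_ICM	(1UL << 25)

/*
 * Compute the MSR used on interrupt entry.
 * Assumption: msr_mask does not contain MSR_CM, so masking clears any
 * stale MSR[CM] before EPCR[ICM] decides whether it is set again.
 */
unsigned long irq_entry_msr(unsigned long msr, unsigned long epcr,
			    unsigned long msr_mask)
{
	unsigned long new_msr = msr & msr_mask;	/* mask first */

	if (epcr & SPRN_EPCR_ICM)
		new_msr |= MSR_CM;		/* computation mode follows ICM */

	return new_msr;
}
```

With MSR[CM] set and EPCR[ICM] clear, the result has MSR[CM] cleared,
which is the case Scott raised.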

>
> -Scott
>

^ permalink raw reply

* Re: linux-next: build failure after merge of the final tree
From: Alan Modra @ 2012-07-06  6:08 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: linux-kernel, linux-next, Paul Mackerras, linuxppc-dev
In-Reply-To: <20120706130137.6aae96c0072af1f330249c0d@canb.auug.org.au>

On Fri, Jul 06, 2012 at 01:01:37PM +1000, Stephen Rothwell wrote:
> solos-pci.c:(.text+0x1ff923c): relocation truncated to fit: R_PPC64_REL24
                     ^^^^^^^^^

> I assume at this point, we are just too large.

Yeah, but not in total.  I didn't see any of these in the allyes
kernel I built with our proof of concept hack to avoid ld -r.  I think
you'll find that these are all from ld -r output, as I assume no one
in kernel land writes drivers or whatever with 33M of text in a single
file.  Branches in that monstrous section can't even reach the
trampolines that ld inserts to extend branch reach.  Did I mention
that ld -r is a bad idea?
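
To make the reach limit concrete: a PowerPC relative branch (the
R_PPC64_REL24 case) encodes a 24-bit word offset, which gives a byte range
of -0x2000000 to +0x1fffffc. The offset 0x1ff923c in the error above is
already close to that limit. A small sketch of the range check follows;
rel24_in_range() is a hypothetical helper, not linker code.

```c
#include <stdint.h>

/*
 * The 24-bit field is shifted left by 2 and sign-extended,
 * so the reachable displacement spans 26 signed bits.
 */
#define REL24_MIN	(-(INT64_C(1) << 25))
#define REL24_MAX	((INT64_C(1) << 25) - 4)

/* Return 1 if a branch at 'from' can reach 'to' directly, else 0. */
int rel24_in_range(int64_t from, int64_t to)
{
	int64_t delta = to - from;

	return (delta & 3) == 0 && delta >= REL24_MIN && delta <= REL24_MAX;
}
```

A branch placed at offset 0x1ff923c of a section larger than 32 MiB cannot
reach a stub placed after the end of that section if the stub is more than
0x1fffffc bytes away, which is the situation described above.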

One workaround might be to compile with -ffunction-sections.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply
