* mpc870: hctosys.c unable to open rtc device rtc0
From: Shawn Jin @ 2010-08-09 6:37 UTC (permalink / raw)
To: ppcdev
A DS1339 RTC is connected to the I2C bus (i2c-cpm on the mpc870). The
dmesg below shows that the rtc-ds1307 driver was registered, but
hctosys.c was not able to open the rtc device rtc0. The RTC doesn't
seem to be bound to the I2C driver properly.
i2c-core: driver [rtc-ds1307] registered
i2c /dev entries driver
i2c-core: driver [dev_driver] registered
fsl-i2c-cpm fa200860.i2c: cpm_i2c_setup()
alloc irq_desc for 21 on node 0
alloc kstat_irqs on node 0
irq: irq 16 on host /soc@fa200000/cpm@9c0/interrupt-controller@930 mapped to virtual irq 21
fsl-i2c-cpm fa200860.i2c: i2c_ram 0xfddfbc80, i2c_addr 0x1c80, freq 60000
fsl-i2c-cpm fa200860.i2c: tbase 0x0340, rbase 0x0360
i2c i2c-0: adapter [i2c-cpm] registered
i2c-dev: adapter [i2c-cpm] registered as minor 0
fsl-i2c-cpm fa200860.i2c: hw routines for i2c-cpm registered.
i2c-core: driver [lm75] registered
TCP cubic registered
NET: Registered protocol family 17
drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
My I2C settings in the dts are as follows, the same as for the mpc885ads.
i2c@860 {
	compatible = "fsl,mpc885-i2c",
		     "fsl,cpm1-i2c";
	reg = <0x860 0x20 0x3c80 0x30>;
	interrupts = <16>;
	interrupt-parent = <&CPM_PIC>;
	fsl,cpm-command = <0x10>;
	#address-cells = <1>;
	#size-cells = <0>;
};
Reading the fsl i2c bindings in the documentation, I found an example
as follows.
i2c@860 {
	compatible = "fsl,mpc823-i2c",
		     "fsl,cpm1-i2c";
	reg = <0x860 0x20 0x3c80 0x30>;
	interrupts = <16>;
	interrupt-parent = <&CPM_PIC>;
	fsl,cpm-command = <0x10>;
	#address-cells = <1>;
	#size-cells = <0>;

	rtc@68 {
		compatible = "dallas,ds1307";
		reg = <0x68>;
	};
};
In the above example the rtc was explicitly declared as a subnode of
the i2c node. Is this the way to connect (or bind) an RTC to the I2C
driver? If not, how is an RTC driver (or hwmon driver) bound to the I2C
driver? What is the reg (0x68) under the rtc node?
I set a breakpoint at ds1307_probe(), hoping that it would be hit
during driver registration, but it wasn't. Should ds1307_probe() be
called when ds1307_driver is registered as an I2C driver?
Thanks,
-Shawn.
^ permalink raw reply
* Re: Review Request: New proposal for device tree clock binding.
From: Grant Likely @ 2010-08-09 7:12 UTC (permalink / raw)
To: Li Yang-R58472; +Cc: devicetree-discuss, linuxppc-dev, Jeremy Kerr
In-Reply-To: <F9BD3E0A8083BE4ABAA94D7EDD7E3F6310925F@zch01exm26.fsl.freescale.net>
On Mon, Aug 9, 2010 at 1:05 AM, Li Yang-R58472 <r58472@freescale.com> wrote:
>>>><tt>*-clock</tt> is named for the signal name for the ''clock input''
>>>>of the device. it should describe the function of the signal for that
>>>>device, rather than the name of the system-wide clock line. For
>>>>example, a UART with two clocks - one for baud-rate clocking, and the
>>>>other for register clocking - may have clock input properties named
>>>>"baud-clock" and "register-clock". The property value is a tuple
>>>>containing the phandle to the clock provider and the name of the clock
>>>>output signal.
>>>>
>>>>For example:
>>>>
>>>>    uart {
>>>>        baud-clock = <&osc>, "ckil";
>>>>        register-clock = <&ref>, "bus";
>>>>    };
>>>>
>>>>
>>>>This represents a device with two clock inputs, named "baud" and
>>>>"register". The baud clock is connected to the "ckil" output of the
>>>>"osc" device, and the register clock is connected to the "bus" output
>>>>of the "ref" device.
>>>
>>>
>>> Instead of having two items to identify a clock, I would suggest to have
>>> a node for each clock. So that clock can be referenced by one
>>> handle. Also we can have clock specific information defined in the clock
>>> node. Here is the example I am planning to use on 85xx PMC.
>>>
>>>
>>>                power@e0070{
>>>                        compatible = "fsl,mpc8548-pmc",
>>>                                     "fsl,p2020-pmc";
>>>                        reg = <0xe0070 0x20>;
>>>
>>>                        etsec1_clk: soc-clk@24{
>>>                                fsl,pmcdr-mask = <0x00000080>;
>>>                        };
>>>                        etsec2_clk: soc-clk@25{
>>>                                fsl,pmcdr-mask = <0x00000040>;
>>>                        };
>>>                        etsec3_clk: soc-clk@26{
>>>                                fsl,pmcdr-mask = <0x00000020>;
>>>                        };
>>>                };
>>>
>>>                enet0: ethernet@24000 {
>>>                        ......
>>>                        master-clock = <&etsec1_clk>;
>>>                        ......
>>>
>>>
>>> What do you think?
>
> Quoting your reply:
>
>> I've avoided requiring clock nodes to have a separate sub node for
>> each output because it is more verbose and it prevents clock providers
>> from having child nodes for other purposes.  Are you concerned that
>
> I don't see why there should be child nodes for other purposes under
> clock node.
>
>> having the <phandle>+output name pair will be difficult to manage?
>
> That's part of my concern.
I was concerned about this too until I found precedent for doing the
exact same thing in the pci binding (and ePAPR). Mixing phandle and a
string in this way doesn't bother me anymore.
> But my main concern is the inability of describing the properties of
> each clock in the device tree. The clock stuff is much SoC related,
> which means it could be variable among chips even in a same family.
> Having clock properties defined in device tree will make it easier to
> have an abstracted driver to handle clock operations. That's why
> device trees are used in the first place, right?
You can do whatever you like for your specific clock source driver.
All the clock binding provides is a connection from a clock consumer
node to a specific named output from a clock provider node. You can
add whatever properties (or subnodes) you need for the hardware you
are writing a binding for. This binding doesn't prevent you from
doing anything.
g.
-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
^ permalink raw reply
* RE: Review Request: New proposal for device tree clock binding.
From: Li Yang-R58472 @ 2010-08-09 7:05 UTC (permalink / raw)
To: Grant Likely; +Cc: devicetree-discuss, linuxppc-dev, Jeremy Kerr
In-Reply-To: <AANLkTi=3MQBSd_phbSZnRA-wFB7cfa4U_PKND+BgZVhd@mail.gmail.com>
>>><tt>*-clock</tt> is named for the signal name for the ''clock input''
>>>of the device. it should describe the function of the signal for that
>>>device, rather than the name of the system-wide clock line. For
>>>example, a UART with two clocks - one for baud-rate clocking, and the
>>>other for register clocking - may have clock input properties named
>>>"baud-clock" and "register-clock". The property value is a tuple
>>>containing the phandle to the clock provider and the name of the clock
>>>output signal.
>>>
>>>For example:
>>>
>>>    uart {
>>>        baud-clock = <&osc>, "ckil";
>>>        register-clock = <&ref>, "bus";
>>>    };
>>>
>>>
>>>This represents a device with two clock inputs, named "baud" and
>>>"register". The baud clock is connected to the "ckil" output of the
>>>"osc" device, and the register clock is connected to the "bus" output
>>>of the "ref" device.
>>
>>
>> Instead of having two items to identify a clock, I would suggest to have
>> a node for each clock. So that clock can be referenced by one
>> handle. Also we can have clock specific information defined in the clock
>> node. Here is the example I am planning to use on 85xx PMC.
>>
>>
>>                power@e0070{
>>                        compatible = "fsl,mpc8548-pmc",
>>                                     "fsl,p2020-pmc";
>>                        reg = <0xe0070 0x20>;
>>
>>                        etsec1_clk: soc-clk@24{
>>                                fsl,pmcdr-mask = <0x00000080>;
>>                        };
>>                        etsec2_clk: soc-clk@25{
>>                                fsl,pmcdr-mask = <0x00000040>;
>>                        };
>>                        etsec3_clk: soc-clk@26{
>>                                fsl,pmcdr-mask = <0x00000020>;
>>                        };
>>                };
>>
>>                enet0: ethernet@24000 {
>>                        ......
>>                        master-clock = <&etsec1_clk>;
>>                        ......
>>
>>
>> What do you think?
Quoting your reply:
> I've avoided requiring clock nodes to have a separate sub node for
> each output because it is more verbose and it prevents clock providers
> from having child nodes for other purposes. Are you concerned that
I don't see why there should be child nodes for other purposes under
clock node.
> having the <phandle>+output name pair will be difficult to manage?
That's part of my concern. But my main concern is the inability of
describing the properties of each clock in the device tree. The clock
stuff is much SoC related, which means it could be variable among chips
even in a same family. Having clock properties defined in device tree
will make it easier to have an abstracted driver to handle clock
operations. That's why device trees are used in the first place, right?
- Leo
^ permalink raw reply
* RE: [PATCH 1/3][MTD] P4080/eLBC: Make Freescale elbc interrupt common to elbc devices
From: Zang Roy-R61911 @ 2010-08-09 7:33 UTC (permalink / raw)
To: Zang Roy-R61911, linux-mtd
Cc: Lan Chunhe-B25806, linuxppc-dev, akpm, Gala Kumar-B11780
In-Reply-To: <1281063096-26598-1-git-send-email-tie-fei.zang@freescale.com>
> -----Original Message-----
> From: Zang Roy-R61911
> Sent: Friday, August 06, 2010 10:52 AM
> To: linux-mtd@lists.infradead.org
> Cc: linuxppc-dev@ozlabs.org; akpm@linux-foundation.org; Gala
> Kumar-B11780; Lan Chunhe-B25806
> Subject: [PATCH 1/3][MTD] P4080/eLBC: Make Freescale elbc
> interrupt common to elbc devices
>
> From: Lan Chunhe-B25806 <b25806@freescale.com>
>
> Move Freescale elbc interrupt from nand driver to elbc driver.
> Then all elbc devices can use the interrupt instead of ONLY nand.
>
> Signed-off-by: Lan Chunhe-B25806 <b25806@freescale.com>
> Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
> ---
> send the patch to linux-mtd@lists.infradead.org
> it has been posted to linuxppc-dev@ozlabs.org and do not get
> any comment.
Any comments on this patch series?
If none, I'd ask Andrew to merge it into his mm tree.
Thanks.
Roy
^ permalink raw reply
* Re: [git pull] Please pull powerpc.git next branch
From: Benjamin Herrenschmidt @ 2010-08-09 11:25 UTC (permalink / raw)
To: Grant Likely
Cc: linuxppc-dev list, Andrew Morton, Linus Torvalds,
Linux Kernel list
In-Reply-To: <AANLkTin0dkWPkU3C47HjnxZ2e_ujCUfNUp4xm_xSX3yK@mail.gmail.com>
On Sun, 2010-08-08 at 23:18 -0600, Grant Likely wrote:
> And how is anyone else to make it into the kernel statistics top
> contributors by lines changed list with stuff like this going in? :-)
lindent ? :-)
Cheers,
Ben.
^ permalink raw reply
* Re: [PATCH] powerpc/fsl-pci: Fix MSI support on 83xx platforms
From: Ilya Yanok @ 2010-08-09 12:58 UTC (permalink / raw)
To: Kumar Gala; +Cc: linuxppc-dev, wd
In-Reply-To: <1280995347-6550-1-git-send-email-galak@kernel.crashing.org>
Hi Kumar,
05.08.2010 12:02, Kumar Gala wrote:
> The following commit broke 83xx because it assumed the 83xx platforms
> exposed the "IMMR" address in BAR0 like the 85xx/86xx/QoriQ devices do:
>
> commit 3da34aae03d498ee62f75aa7467de93cce3030fd
> Author: Kumar Gala<galak@kernel.crashing.org>
> Date: Tue May 12 15:51:56 2009 -0500
>
> powerpc/fsl: Support unique MSI addresses per PCIe Root Complex
>
> However that is not true, so we have to search through the inbound
> window settings on 83xx to find which one matches the IMMR address to
> determine its PCI address.
As I've already told you, my testing on the MPC8308RDB board was
successful. As for 85xx boards, Wolfgang told me that DENX doesn't have
any 85xx boards that support MSI at the moment, so I can't do complete
testing. I'm sorry. I've tested that the TQM8560 board is able to boot
and PCI is working as expected though (with your patch applied). I fear
I can't do anything else here.
Thanks.
Regards, Ilya.
^ permalink raw reply
* Re: [PATCH 4/9] v4 Add mutex for add/remove of memory blocks
From: Nathan Fontenot @ 2010-08-09 13:55 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, Dave Hansen, Greg KH, linux-kernel, linuxppc-dev
In-Reply-To: <20100805135314.7229d07c.kamezawa.hiroyu@jp.fujitsu.com>
On 08/04/2010 11:53 PM, KAMEZAWA Hiroyuki wrote:
> On Tue, 03 Aug 2010 08:39:50 -0500
> Nathan Fontenot <nfont@austin.ibm.com> wrote:
>
>> Add a new mutex for use in adding and removing of memory blocks. This
>> is needed to avoid any race conditions in which the same memory block could
>> be added and removed at the same time.
>>
>> Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
>
> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> But a nitpick (see below)
>
>> ---
>> drivers/base/memory.c | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>>
>> Index: linux-2.6/drivers/base/memory.c
>> ===================================================================
>> --- linux-2.6.orig/drivers/base/memory.c 2010-08-02 13:35:00.000000000 -0500
>> +++ linux-2.6/drivers/base/memory.c 2010-08-02 13:45:34.000000000 -0500
>> @@ -27,6 +27,8 @@
>> #include <asm/atomic.h>
>> #include <asm/uaccess.h>
>>
>> +static struct mutex mem_sysfs_mutex;
>> +
>
> For a static mutex, we usually do
> static DEFINE_MUTEX(mem_sysfs_mutex);
>
> Then an extra call to mutex_init() is not required.
>
ok, fixed in the next version of the patches.
-Nathan
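[Editorial sketch] The nitpick above is about static initialization of a mutex. The kernel macro cannot be compiled outside the kernel tree, so what follows is only a userspace analogue under that stated assumption: PTHREAD_MUTEX_INITIALIZER stands in for DEFINE_MUTEX(), and the names mem_block_mutex, add_block() and remove_block() are hypothetical, not taken from the patch.

```c
#include <pthread.h>

/* Userspace analogue only: PTHREAD_MUTEX_INITIALIZER plays the role that
 * DEFINE_MUTEX() plays in the kernel. The mutex is valid as soon as the
 * program is loaded, so there is no separate init call to forget. */
static pthread_mutex_t mem_block_mutex = PTHREAD_MUTEX_INITIALIZER;
static int mem_block_count;

/* Hypothetical "add" operation, serialized by the statically
 * initialized mutex. Returns the block count after the update. */
static int add_block(void)
{
	int count;

	pthread_mutex_lock(&mem_block_mutex);
	count = ++mem_block_count;
	pthread_mutex_unlock(&mem_block_mutex);
	return count;
}

/* Hypothetical "remove" operation using the same mutex, so an add and a
 * remove can never interleave their updates. */
static int remove_block(void)
{
	int count;

	pthread_mutex_lock(&mem_block_mutex);
	count = --mem_block_count;
	pthread_mutex_unlock(&mem_block_mutex);
	return count;
}
```

The point of the static form is that no code path can reach the lock before it has been initialized, which is what makes the extra mutex_init() call unnecessary in the kernel case.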
^ permalink raw reply
* Re: [PATCH 6/9] v4 Update the find_memory_block declaration
From: Nathan Fontenot @ 2010-08-09 13:56 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, Dave Hansen, Greg KH, linux-kernel, linuxppc-dev
In-Reply-To: <20100805135944.97ecbaa4.kamezawa.hiroyu@jp.fujitsu.com>
On 08/04/2010 11:59 PM, KAMEZAWA Hiroyuki wrote:
> On Tue, 03 Aug 2010 08:41:45 -0500
> Nathan Fontenot <nfont@austin.ibm.com> wrote:
>
>> Update the find_memory_block declaration to take a struct mem_section *
>> so that it matches the definition.
>>
>> Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
>
> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> Hmm...my mmotm-0727 has this definition in memory.h...
>
> extern struct memory_block *find_memory_block(struct mem_section *);
>
> What patch makes it unsigned long ?
>
I was basing the patches on the latest mainline tree. Looks like I can drop
this patch in the next version of the patchset.
-Nathan
^ permalink raw reply
* Re: 2.6.35-stable/ppc64/p7: suspicious rcu_dereference_check() usage detected during 2.6.35-stable boot
From: Paul E. McKenney @ 2010-08-09 16:12 UTC (permalink / raw)
To: Subrata Modak
Cc: sachinp, Peter Zijlstra, Li Zefan, linux-kernel, Linuxppc-dev,
DIVYA PRAKASH
In-Reply-To: <1280739132.15317.9.camel@subratamodak.linux.ibm.com>
On Mon, Aug 02, 2010 at 02:22:12PM +0530, Subrata Modak wrote:
> Hi,
>
> The following suspicious rcu_dereference_check() usage is detected
> during 2.6.35-stable boot on my ppc64/p7 machine:
>
> ==================================================
> [ INFO: suspicious rcu_dereference_check() usage. ]
> ---------------------------------------------------
> kernel/sched.c:616 invoked rcu_dereference_check() without protection!
> other info that might help us debug this:
>
> rcu_scheduler_active = 1, debug_locks = 0
> 1 lock held by swapper/1:
> #0: (&rq->lock){-.....}, at: [<c0000000007ca2f8>] .init_idle+0x78/0x4a8
> stack backtrace:
> Call Trace:
> [c000000f392bf990] [c000000000014f04] .show_stack+0xb0/0x1a0 (unreliable)
> [c000000f392bfa50] [c0000000007c87b4] .dump_stack+0x28/0x3c
> [c000000f392bfad0] [c000000000103e1c] .lockdep_rcu_dereference+0xbc/0xe4
> [c000000f392bfb70] [c0000000007ca434] .init_idle+0x1b4/0x4a8
> [c000000f392bfc30] [c0000000007cad04] .fork_idle+0xa4/0xd0
> [c000000f392bfe30] [c000000000aefaac] .smp_prepare_cpus+0x23c/0x2f4
> [c000000f392bfed0] [c000000000ae1424] .kernel_init+0xec/0x32c
> [c000000f392bff90] [c000000000033f40] .kernel_thread+0x54/0x70
> ==================================================
>
> Please note that this was reported earlier on 2.6.34-rc6:
> http://marc.info/?l=linux-kernel&m=127313031922395&w=2,
> The issue was fixed with:
> commit 1ce7e4ff24fe338438bc7837e02780f202bf202b
> Author: Li Zefan <lizf@cn.fujitsu.com>
> Date: Fri Apr 23 10:35:52 2010 +0800
> cgroup: Check task_lock in task_subsys_state()
>
> According to:
> http://lkml.org/lkml/2010/7/1/883,
> commit dc61b1d65e353d638b2445f71fb8e5b5630f2415
> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Tue Jun 8 11:40:42 2010 +0200
> sched: Fix PROVE_RCU vs cpu_cgroup
> should have fixed this. But this is reproducible on 2.6.35-stable.
>
> Please also see the config file attached.
Hello, Subrata,
Thank you for locating this one! This looks like the same issue that
Ilia Mirkin located. Please see below for my analysis -- no fix yet,
as I need confirmation from cgroups experts. I can easily create a
patch that suppresses the warning, but I don't yet know whether this is
the right thing to do.
Thanx, Paul
------------------------------------------------------------------------
On Thu, Aug 05, 2010 at 01:31:10PM -0400, Ilia Mirkin wrote:
> On Thu, Jul 1, 2010 at 6:18 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Thu, Jul 01, 2010 at 08:21:43AM -0400, Miles Lane wrote:
> >> [ INFO: suspicious rcu_dereference_check() usage. ]
> >> ---------------------------------------------------
> >> kernel/sched.c:616 invoked rcu_dereference_check() without protection!
> >>
> >> other info that might help us debug this:
> >>
> >> rcu_scheduler_active = 1, debug_locks = 1
> >> 3 locks held by swapper/1:
> >> #0: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81042914>] cpu_maps_update_begin+0x12/0x14
> >> #1: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8104294f>] cpu_hotplug_begin+0x27/0x4e
> >> #2: (&rq->lock){-.-...}, at: [<ffffffff812f8502>] init_idle+0x2b/0x114
> >
> > Hello, Miles!
> >
> > I believe that this one is fixed by commit dc61b1d6 in -tip.
>
> Hi Paul,
>
> Looks like that commit made it into 2.6.35:
>
> git tag -l --contains dc61b1d65e353d638b2445f71fb8e5b5630f2415 v2.6.35*
> v2.6.35
> v2.6.35-rc4
> v2.6.35-rc5
> v2.6.35-rc6
>
> However I still get:
>
> [ 0.051203] CPU0: AMD QEMU Virtual CPU version 0.12.4 stepping 03
> [ 0.052999] lockdep: fixing up alternatives.
> [ 0.054105]
> [ 0.054106] ===================================================
> [ 0.054999] [ INFO: suspicious rcu_dereference_check() usage. ]
> [ 0.054999] ---------------------------------------------------
> [ 0.054999] kernel/sched.c:616 invoked rcu_dereference_check() without protection!
> [ 0.054999]
> [ 0.054999] other info that might help us debug this:
> [ 0.054999]
> [ 0.054999]
> [ 0.054999] rcu_scheduler_active = 1, debug_locks = 1
> [ 0.054999] 3 locks held by swapper/1:
> [ 0.054999] #0: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff814be933>] cpu_up+0x42/0x6a
> [ 0.054999] #1: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff810400d8>] cpu_hotplug_begin+0x2a/0x51
> [ 0.054999] #2: (&rq->lock){-.-...}, at: [<ffffffff814be2f7>] init_idle+0x2f/0x113
> [ 0.054999]
> [ 0.054999] stack backtrace:
> [ 0.054999] Pid: 1, comm: swapper Not tainted 2.6.35 #1
> [ 0.054999] Call Trace:
> [ 0.054999] [<ffffffff81068054>] lockdep_rcu_dereference+0x9b/0xa3
> [ 0.054999] [<ffffffff810325c3>] task_group+0x7b/0x8a
> [ 0.054999] [<ffffffff810325e5>] set_task_rq+0x13/0x40
> [ 0.054999] [<ffffffff814be39a>] init_idle+0xd2/0x113
> [ 0.054999] [<ffffffff814be78a>] fork_idle+0xb8/0xc7
> [ 0.054999] [<ffffffff81068717>] ? mark_held_locks+0x4d/0x6b
> [ 0.054999] [<ffffffff814bcebd>] do_fork_idle+0x17/0x2b
> [ 0.054999] [<ffffffff814bc89b>] native_cpu_up+0x1c1/0x724
> [ 0.054999] [<ffffffff814bcea6>] ? do_fork_idle+0x0/0x2b
> [ 0.054999] [<ffffffff814be876>] _cpu_up+0xac/0x127
> [ 0.054999] [<ffffffff814be946>] cpu_up+0x55/0x6a
> [ 0.054999] [<ffffffff81ab562a>] kernel_init+0xe1/0x1ff
> [ 0.054999] [<ffffffff81003854>] kernel_thread_helper+0x4/0x10
> [ 0.054999] [<ffffffff814c353c>] ? restore_args+0x0/0x30
> [ 0.054999] [<ffffffff81ab5549>] ? kernel_init+0x0/0x1ff
> [ 0.054999] [<ffffffff81003850>] ? kernel_thread_helper+0x0/0x10
> [ 0.056074] Booting Node 0, Processors #1lockdep: fixing up alternatives.
> [ 0.130045] #2lockdep: fixing up alternatives.
> [ 0.203089] #3 Ok.
> [ 0.275286] Brought up 4 CPUs
> [ 0.276005] Total of 4 processors activated (16017.17 BogoMIPS).
This does look like a new one, thank you for reporting it!
Here is my analysis, which should at least provide some humor value to
those who understand the code better than I do. ;-)
So the corresponding rcu_dereference_check() is in
task_subsys_state_check(), and is fetching the cpu_cgroup_subsys_id
element of the newly created task's task->cgroups->subsys[] array.
The "git grep" command finds only three uses of cpu_cgroup_subsys_id,
but no definition.
Now, fork_idle() invokes copy_process(), which invokes cgroup_fork(),
which sets the child process's ->cgroups pointer to that of the parent,
also invoking get_css_set(), which increments the corresponding reference
count, doing both operations under task_lock() protection (->alloc_lock).
Because fork_idle() does not specify any of CLONE_NEWNS, CLONE_NEWUTS,
CLONE_NEWIPC, CLONE_NEWPID, or CLONE_NEWNET, copy_namespaces() should
not create a new namespace, and so there should be no ns_cgroup_clone().
We should thus retain the parent's ->cgroups pointer. And copy_process()
installs the new task in the various lists, so that the task is externally
accessible upon return.
After a non-error return from copy_process(), fork_idle() invokes
init_idle_pids(), which does not appear to affect the task's cgroup
state. Next fork_idle() invokes init_idle(), which in turn invokes
__set_task_cpu(), which invokes set_task_rq(), which calls task_group()
several times, which calls task_subsys_state_check(), which calls the
rcu_dereference_check() that complained above.
However, the result returned by rcu_dereference_check() is stored into
the task structure:
p->se.cfs_rq = task_group(p)->cfs_rq[cpu];
p->se.parent = task_group(p)->se[cpu];
This means that the corresponding structure must have been tied down with
a reference count or some such. If such a reference has been taken, then
this complaint is a false positive, and could be suppressed by putting
rcu_read_lock() and rcu_read_unlock() around the call to init_idle()
from fork_idle(). However, although a reference to the enclosing ->cgroups
struct css_set is held, it is not clear to me that this reference applies
to the structures pointed to by the ->subsys[] array, especially given
that the cgroup_subsys_state structures referenced by this array have
their own reference count, which does not appear to me to be acquired
by this code path.
Or are the cgroup_subsys_state structures referenced by idle tasks
never freed or some such?
Thanx, Paul
^ permalink raw reply
* Re: mpc870: hctosys.c unable to open rtc device rtc0
From: Scott Wood @ 2010-08-09 17:29 UTC (permalink / raw)
To: Shawn Jin; +Cc: ppcdev
In-Reply-To: <AANLkTinjLBPG9YfpoDnSoKeGvUO2OEjdTXrYLTZ9c8LB@mail.gmail.com>
On Sun, 8 Aug 2010 23:37:00 -0700
Shawn Jin <shawnxjin@gmail.com> wrote:
> Reading the fsl i2c bindings in the documentation, I found an example
> as follows.
> i2c@860 {
>         compatible = "fsl,mpc823-i2c",
>                      "fsl,cpm1-i2c";
>         reg = <0x860 0x20 0x3c80 0x30>;
>         interrupts = <16>;
>         interrupt-parent = <&CPM_PIC>;
>         fsl,cpm-command = <0x10>;
>         #address-cells = <1>;
>         #size-cells = <0>;
>
>         rtc@68 {
>                 compatible = "dallas,ds1307";
>                 reg = <0x68>;
>         };
> };
>
> In the above example the rtc was explicitly declared as a subnode of
> the i2c node. Is this the way to connect (or bind) an RTC to the I2C
> driver?
Yes.
> What is the reg (0x68) under rtc node?
It's the 7-bit I2C address (without the low-order direction bit).
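[Editorial sketch] To illustrate what "without the low-order direction bit" means: on the wire, the 7-bit address is shifted left by one and the read/write bit occupies bit 0 of the first byte. The helper below is hypothetical and is not part of any kernel API; it only shows the arithmetic.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only: build the first byte sent
 * on the I2C bus from a 7-bit device address and a direction flag
 * (0 = write, non-zero = read). */
static uint8_t i2c_address_byte(uint8_t addr7, int read)
{
	return (uint8_t)(((addr7 & 0x7f) << 1) | (read ? 1 : 0));
}
```

With reg = <0x68> as in the binding example, the address byte would be 0xd0 for a write and 0xd1 for a read, which is why some datasheets quote the 8-bit values while the device tree carries 0x68.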
> I set a breakpoint at ds1307_probe(), hoping that it would be hit
> during driver registration, but it wasn't. Should ds1307_probe() be
> called when ds1307_driver is registered as an I2C driver?
The probe function is called only if the device is declared. There is
no autodetection.
-Scott
^ permalink raw reply
* Re: [PATCH 3/3 v2] dts: Add ESDHC weird voltage bits workaround
From: Anton Vorontsov @ 2010-08-09 17:33 UTC (permalink / raw)
To: Roy Zang; +Cc: linuxppc-dev, akpm, linux-mmc
In-Reply-To: <1280805072-26112-3-git-send-email-tie-fei.zang@freescale.com>
On Tue, Aug 03, 2010 at 11:11:12AM +0800, Roy Zang wrote:
> P4080 ESDHC controller does not support 1.8V and 3.0V voltage, but the
> host controller capabilities register wrongly sets the bits.
> This patch adds the workaround to correct the weird voltage setting bits.
> Only 3.3V voltage is supported for P4080 ESDHC controller.
>
> Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
Acked-by: Anton Vorontsov <cbouatmailru@gmail.com>
Btw, where is the implementation of the voltage-ranges handling?
> ---
> arch/powerpc/boot/dts/p4080ds.dts | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/arch/powerpc/boot/dts/p4080ds.dts b/arch/powerpc/boot/dts/p4080ds.dts
> index efa0091..2f0de24 100644
> --- a/arch/powerpc/boot/dts/p4080ds.dts
> +++ b/arch/powerpc/boot/dts/p4080ds.dts
> @@ -280,6 +280,7 @@
> reg = <0x114000 0x1000>;
> interrupts = <48 2>;
> interrupt-parent = <&mpic>;
> + voltage-ranges = <3300 3300>;
> sdhci,auto-cmd12;
> };
>
> --
> 1.5.6.5
^ permalink raw reply
* Re: [PATCH 2/3 v2] dts: Add sdhci,auto-cmd12 field for p4080 device tree
From: Anton Vorontsov @ 2010-08-09 17:36 UTC (permalink / raw)
To: Roy Zang; +Cc: linuxppc-dev, akpm, linux-mmc
In-Reply-To: <1280805072-26112-2-git-send-email-tie-fei.zang@freescale.com>
On Tue, Aug 03, 2010 at 11:11:11AM +0800, Roy Zang wrote:
> Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
> ---
> Documentation/powerpc/dts-bindings/fsl/esdhc.txt | 2 ++
> arch/powerpc/boot/dts/p4080ds.dts | 1 +
> 2 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/Documentation/powerpc/dts-bindings/fsl/esdhc.txt b/Documentation/powerpc/dts-bindings/fsl/esdhc.txt
> index 8a00407..64bcb8b 100644
> --- a/Documentation/powerpc/dts-bindings/fsl/esdhc.txt
> +++ b/Documentation/powerpc/dts-bindings/fsl/esdhc.txt
> @@ -14,6 +14,8 @@ Required properties:
> reports inverted write-protect state;
> - sdhci,1-bit-only : (optional) specifies that a controller can
> only handle 1-bit data transfers.
> + - sdhci,auto-cmd12: (optional) specifies that a controller can
> + only handle auto CMD12.
Acked-by: Anton Vorontsov <cbouatmailru@gmail.com>
> Example:
>
> diff --git a/arch/powerpc/boot/dts/p4080ds.dts b/arch/powerpc/boot/dts/p4080ds.dts
> index 6b29eab..efa0091 100644
> --- a/arch/powerpc/boot/dts/p4080ds.dts
> +++ b/arch/powerpc/boot/dts/p4080ds.dts
> @@ -280,6 +280,7 @@
> reg = <0x114000 0x1000>;
> interrupts = <48 2>;
> interrupt-parent = <&mpic>;
> + sdhci,auto-cmd12;
> };
>
> i2c@118000 {
--
Anton Vorontsov
email: cbouatmailru@gmail.com
irc://irc.freenode.net/bd2
^ permalink raw reply
* Re: [PATCH 1/3 v2] sdhci: Add auto CMD12 support for eSDHC driver
From: Anton Vorontsov @ 2010-08-09 17:43 UTC (permalink / raw)
To: Grant Likely; +Cc: linux-mmc, linuxppc-dev, akpm
In-Reply-To: <AANLkTinJKO3-iXOXuL7CMWCj5qMMHyMrtnadwPpqTeBu@mail.gmail.com>
On Wed, Aug 04, 2010 at 07:02:56PM -0600, Grant Likely wrote:
> On Mon, Aug 2, 2010 at 9:11 PM, Roy Zang <tie-fei.zang@freescale.com> wrote:
> > From: Jerry Huang <Chang-Ming.Huang@freescale.com>
> >
> > Add auto CMD12 command support for eSDHC driver.
> > This is needed by P4080 and P1022 for block read/write.
> > Manual asynchronous CMD12 abort operation causes protocol violations on
> > these silicons.
> >
> > Signed-off-by: Jerry Huang <Chang-Ming.Huang@freescale.com>
> > Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
> > ---
> > drivers/mmc/host/sdhci-of-core.c | 4 ++++
> > drivers/mmc/host/sdhci.c | 14 ++++++++++++--
> > drivers/mmc/host/sdhci.h | 2 ++
> > 3 files changed, 18 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> > index c6d1bd8..a92566e 100644
> > --- a/drivers/mmc/host/sdhci.c
> > +++ b/drivers/mmc/host/sdhci.c
> > @@ -817,8 +817,12 @@ static void sdhci_set_transfer_mode(struct sdhci_host *host,
> > WARN_ON(!host->data);
> >
> > mode = SDHCI_TRNS_BLK_CNT_EN;
> > - if (data->blocks > 1)
> > - mode |= SDHCI_TRNS_MULTI;
> > + if (data->blocks > 1) {
> > + if (host->quirks & SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12)
> > + mode |= SDHCI_TRNS_MULTI | SDHCI_TRNS_ACMD12;
> > + else
> > + mode |= SDHCI_TRNS_MULTI;
>
> nit:
> + mode |= SDHCI_TRNS_MULTI;
> + if (host->quirks & SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12)
> + mode |= SDHCI_TRNS_ACMD12;
>
> Clearer, no?
Much clearer. I would prefer the nit incorporated.
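[Editorial sketch] The nit is a pure restructuring, so both shapes should compute the same transfer mode. The standalone comparison below assumes placeholder values for the SDHCI_TRNS_* flags (the real definitions live in drivers/mmc/host/sdhci.h); the quirk bit follows the (1<<27) definition in the patch. The function names mode_as_posted() and mode_with_nit() are hypothetical.

```c
#include <stdint.h>

/* Flag values are placeholders for illustration; check
 * drivers/mmc/host/sdhci.h for the real ones. */
#define SDHCI_TRNS_BLK_CNT_EN			0x02
#define SDHCI_TRNS_ACMD12			0x04
#define SDHCI_TRNS_MULTI			0x20
#define SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12	(1u << 27)

/* Shape of the logic as posted in the patch. */
static uint16_t mode_as_posted(unsigned int blocks, uint32_t quirks)
{
	uint16_t mode = SDHCI_TRNS_BLK_CNT_EN;

	if (blocks > 1) {
		if (quirks & SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12)
			mode |= SDHCI_TRNS_MULTI | SDHCI_TRNS_ACMD12;
		else
			mode |= SDHCI_TRNS_MULTI;
	}
	return mode;
}

/* Shape of the logic with the suggested nit applied: MULTI is set
 * unconditionally for multi-block transfers, ACMD12 only with the quirk. */
static uint16_t mode_with_nit(unsigned int blocks, uint32_t quirks)
{
	uint16_t mode = SDHCI_TRNS_BLK_CNT_EN;

	if (blocks > 1) {
		mode |= SDHCI_TRNS_MULTI;
		if (quirks & SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12)
			mode |= SDHCI_TRNS_ACMD12;
	}
	return mode;
}
```

For every combination of single or multiple blocks, with or without the quirk, the two functions return the same value; the second form simply avoids repeating SDHCI_TRNS_MULTI in both branches.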
Another nit:
> @@ -154,6 +154,10 @@ static int __devinit sdhci_of_probe(struct of_device *ofdev,
> host->ops = &sdhci_of_data->ops;
> }
>
> + if (of_get_property(np, "sdhci,auto-cmd12", NULL))
> + host->quirks |= SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12;
> +
> +
^^ No need for the two empty lines.
> if (of_get_property(np, "sdhci,1-bit-only", NULL))
Though, technically the patch looks OK, feel free to add my
Acked-by: Anton Vorontsov <cbouatmailru@gmail.com>
on the next resend (if any).
Thanks Roy!
--
Anton Vorontsov
email: cbouatmailru@gmail.com
irc://irc.freenode.net/bd2
^ permalink raw reply
* [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections
From: Nathan Fontenot @ 2010-08-09 17:53 UTC (permalink / raw)
To: linux-kernel, linux-mm, linuxppc-dev
Cc: Greg KH, KAMEZAWA Hiroyuki, Dave Hansen
This set of patches de-couples the idea that there is a single
directory in sysfs for each memory section. The intent of the
patches is to reduce the number of sysfs directories created to
resolve a boot-time performance issue. On very large systems
boot times are getting very long (as seen on powerpc hardware)
due to the enormous number of sysfs directories being created.
On a system with 1 TB of memory we create ~63,000 directories.
For even larger systems boot times are being measured in hours.
This set of patches allows for each directory created in sysfs
to cover more than one memory section. The default behavior for
sysfs directory creation is the same, in that each directory
represents a single memory section. A new file 'end_phys_index'
in each directory contains the physical_id of the last memory
section covered by the directory so that users can easily
determine the memory section range of a directory.
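As a rough illustration of how a user-space tool might consume the two files, the sketch below parses the hexadecimal text that 'phys_index' and 'end_phys_index' report (the patches print them with "%08lx\n") and computes how many memory sections a directory covers. The helper names parse_index() and sections_in_block() are hypothetical and not part of this patch set; reading the actual sysfs files is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: convert the text of a phys_index-style file
 * (hexadecimal, optionally followed by a newline) to a number. */
static unsigned long parse_index(const char *text)
{
	return strtoul(text, NULL, 16);
}

/* Hypothetical helper: number of sections covered by one sysfs directory,
 * given the contents of its phys_index and end_phys_index files. */
static unsigned long sections_in_block(const char *start_text,
				       const char *end_text)
{
	unsigned long start = parse_index(start_text);
	unsigned long end = parse_index(end_text);

	/* The range is inclusive, so end must not precede start. */
	assert(end >= start);
	return end - start + 1;
}
```

With the default behavior (one section per directory) both files hold the same value and the result is 1.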
Updates for version 5 of the patchset include the following:
Patch 4/8 Add mutex for add/remove of memory blocks
- Define the mutex using DEFINE_MUTEX macro.
Patch 8/8 Update memory-hotplug documentation
- Add information concerning memory holes in phys_index..end_phys_index.
Thanks,
Nathan Fontenot
^ permalink raw reply
* Re: [PATCH 1/3 v2] sdhci: Add auto CMD12 support for eSDHC driver
From: Michał Mirosław @ 2010-08-09 18:24 UTC (permalink / raw)
To: Roy Zang; +Cc: linuxppc-dev, akpm, linux-mmc
In-Reply-To: <1280805072-26112-1-git-send-email-tie-fei.zang@freescale.com>
2010/8/3 Roy Zang <tie-fei.zang@freescale.com>:
[...]
> @@ -240,6 +240,8 @@ struct sdhci_host {
> #define SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN (1<<25)
> /* Controller cannot support End Attribute in NOP ADMA descriptor */
> #define SDHCI_QUIRK_NO_ENDATTR_IN_NOPDESC (1<<26)
> +/* Controller uses Auto CMD12 command to stop the transfer */
> +#define SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12 (1<<27)
>
> int irq; /* Device IRQ */
> void __iomem * ioaddr; /* Mapped address */
Just a cosmetic hint: I suggest SDHCI_QUIRK_MULTIBLOCK_READ_AUTO_CMD12
or something for the quirk name, because ACMD12 part might be confused
with MMC/SD App CMD 12 (CMD55+CMD12 combo) if/whenever that gets used.
Best Regards,
Michał Mirosław
^ permalink raw reply
* [PATCH 1/8] v5 Move the find_memory_block() routine up
From: Nathan Fontenot @ 2010-08-09 18:35 UTC (permalink / raw)
To: linux-kernel, linux-mm, linuxppc-dev
Cc: Greg KH, KAMEZAWA Hiroyuki, Dave Hansen
In-Reply-To: <4C60407C.2080608@austin.ibm.com>
Move the find_memory_block() routine up to avoid needing a forward
declaration in subsequent patches.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
---
drivers/base/memory.c | 62 +++++++++++++++++++++++++-------------------------
1 file changed, 31 insertions(+), 31 deletions(-)
Index: linux-2.6/drivers/base/memory.c
===================================================================
--- linux-2.6.orig/drivers/base/memory.c 2010-08-09 07:36:55.000000000 -0500
+++ linux-2.6/drivers/base/memory.c 2010-08-09 07:44:21.000000000 -0500
@@ -435,6 +435,37 @@ int __weak arch_get_memory_phys_device(u
return 0;
}
+/*
+ * For now, we have a linear search to go find the appropriate
+ * memory_block corresponding to a particular phys_index. If
+ * this gets to be a real problem, we can always use a radix
+ * tree or something here.
+ *
+ * This could be made generic for all sysdev classes.
+ */
+struct memory_block *find_memory_block(struct mem_section *section)
+{
+ struct kobject *kobj;
+ struct sys_device *sysdev;
+ struct memory_block *mem;
+ char name[sizeof(MEMORY_CLASS_NAME) + 9 + 1];
+
+ /*
+ * This only works because we know that section == sysdev->id
+ * slightly redundant with sysdev_register()
+ */
+ sprintf(&name[0], "%s%d", MEMORY_CLASS_NAME, __section_nr(section));
+
+ kobj = kset_find_obj(&memory_sysdev_class.kset, name);
+ if (!kobj)
+ return NULL;
+
+ sysdev = container_of(kobj, struct sys_device, kobj);
+ mem = container_of(sysdev, struct memory_block, sysdev);
+
+ return mem;
+}
+
static int add_memory_block(int nid, struct mem_section *section,
unsigned long state, enum mem_add_context context)
{
@@ -468,37 +499,6 @@ static int add_memory_block(int nid, str
return ret;
}
-/*
- * For now, we have a linear search to go find the appropriate
- * memory_block corresponding to a particular phys_index. If
- * this gets to be a real problem, we can always use a radix
- * tree or something here.
- *
- * This could be made generic for all sysdev classes.
- */
-struct memory_block *find_memory_block(struct mem_section *section)
-{
- struct kobject *kobj;
- struct sys_device *sysdev;
- struct memory_block *mem;
- char name[sizeof(MEMORY_CLASS_NAME) + 9 + 1];
-
- /*
- * This only works because we know that section == sysdev->id
- * slightly redundant with sysdev_register()
- */
- sprintf(&name[0], "%s%d", MEMORY_CLASS_NAME, __section_nr(section));
-
- kobj = kset_find_obj(&memory_sysdev_class.kset, name);
- if (!kobj)
- return NULL;
-
- sysdev = container_of(kobj, struct sys_device, kobj);
- mem = container_of(sysdev, struct memory_block, sysdev);
-
- return mem;
-}
-
int remove_memory_block(unsigned long node_id, struct mem_section *section,
int phys_device)
{
^ permalink raw reply
* [PATCH 2/8] v5 Add new phys_index properties
From: Nathan Fontenot @ 2010-08-09 18:36 UTC (permalink / raw)
To: linux-kernel, linux-mm, linuxppc-dev
Cc: Greg KH, KAMEZAWA Hiroyuki, Dave Hansen
In-Reply-To: <4C60407C.2080608@austin.ibm.com>
Update the 'phys_index' properties of a memory block to include a
'start_phys_index' which is the same as the current 'phys_index' property.
The property still appears as 'phys_index' in sysfs but the memory_block
struct name is updated to indicate the start and end values.
This also adds an 'end_phys_index' property to indicate the id of the
last section in the memory block.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
---
drivers/base/memory.c | 28 ++++++++++++++++++++--------
include/linux/memory.h | 3 ++-
2 files changed, 22 insertions(+), 9 deletions(-)
Index: linux-2.6/drivers/base/memory.c
===================================================================
--- linux-2.6.orig/drivers/base/memory.c 2010-08-09 07:44:21.000000000 -0500
+++ linux-2.6/drivers/base/memory.c 2010-08-09 07:44:31.000000000 -0500
@@ -109,12 +109,20 @@ unregister_memory(struct memory_block *m
* uses.
*/
-static ssize_t show_mem_phys_index(struct sys_device *dev,
+static ssize_t show_mem_start_phys_index(struct sys_device *dev,
struct sysdev_attribute *attr, char *buf)
{
struct memory_block *mem =
container_of(dev, struct memory_block, sysdev);
- return sprintf(buf, "%08lx\n", mem->phys_index);
+ return sprintf(buf, "%08lx\n", mem->start_phys_index);
+}
+
+static ssize_t show_mem_end_phys_index(struct sys_device *dev,
+ struct sysdev_attribute *attr, char *buf)
+{
+ struct memory_block *mem =
+ container_of(dev, struct memory_block, sysdev);
+ return sprintf(buf, "%08lx\n", mem->end_phys_index);
}
/*
@@ -128,7 +136,7 @@ static ssize_t show_mem_removable(struct
struct memory_block *mem =
container_of(dev, struct memory_block, sysdev);
- start_pfn = section_nr_to_pfn(mem->phys_index);
+ start_pfn = section_nr_to_pfn(mem->start_phys_index);
ret = is_mem_section_removable(start_pfn, PAGES_PER_SECTION);
return sprintf(buf, "%d\n", ret);
}
@@ -191,7 +199,7 @@ memory_block_action(struct memory_block
int ret;
int old_state = mem->state;
- psection = mem->phys_index;
+ psection = mem->start_phys_index;
first_page = pfn_to_page(psection << PFN_SECTION_SHIFT);
/*
@@ -264,7 +272,7 @@ store_mem_state(struct sys_device *dev,
int ret = -EINVAL;
mem = container_of(dev, struct memory_block, sysdev);
- phys_section_nr = mem->phys_index;
+ phys_section_nr = mem->start_phys_index;
if (!present_section_nr(phys_section_nr))
goto out;
@@ -296,7 +304,8 @@ static ssize_t show_phys_device(struct s
return sprintf(buf, "%d\n", mem->phys_device);
}
-static SYSDEV_ATTR(phys_index, 0444, show_mem_phys_index, NULL);
+static SYSDEV_ATTR(phys_index, 0444, show_mem_start_phys_index, NULL);
+static SYSDEV_ATTR(end_phys_index, 0444, show_mem_end_phys_index, NULL);
static SYSDEV_ATTR(state, 0644, show_mem_state, store_mem_state);
static SYSDEV_ATTR(phys_device, 0444, show_phys_device, NULL);
static SYSDEV_ATTR(removable, 0444, show_mem_removable, NULL);
@@ -476,16 +485,18 @@ static int add_memory_block(int nid, str
if (!mem)
return -ENOMEM;
- mem->phys_index = __section_nr(section);
+ mem->start_phys_index = __section_nr(section);
mem->state = state;
mutex_init(&mem->state_mutex);
- start_pfn = section_nr_to_pfn(mem->phys_index);
+ start_pfn = section_nr_to_pfn(mem->start_phys_index);
mem->phys_device = arch_get_memory_phys_device(start_pfn);
ret = register_memory(mem, section);
if (!ret)
ret = mem_create_simple_file(mem, phys_index);
if (!ret)
+ ret = mem_create_simple_file(mem, end_phys_index);
+ if (!ret)
ret = mem_create_simple_file(mem, state);
if (!ret)
ret = mem_create_simple_file(mem, phys_device);
@@ -507,6 +518,7 @@ int remove_memory_block(unsigned long no
mem = find_memory_block(section);
unregister_mem_sect_under_nodes(mem);
mem_remove_simple_file(mem, phys_index);
+ mem_remove_simple_file(mem, end_phys_index);
mem_remove_simple_file(mem, state);
mem_remove_simple_file(mem, phys_device);
mem_remove_simple_file(mem, removable);
Index: linux-2.6/include/linux/memory.h
===================================================================
--- linux-2.6.orig/include/linux/memory.h 2010-08-09 07:36:55.000000000 -0500
+++ linux-2.6/include/linux/memory.h 2010-08-09 07:44:31.000000000 -0500
@@ -21,7 +21,8 @@
#include <linux/mutex.h>
struct memory_block {
- unsigned long phys_index;
+ unsigned long start_phys_index;
+ unsigned long end_phys_index;
unsigned long state;
/*
* This serializes all state change requests. It isn't
^ permalink raw reply
* [PATCH 3/8] v5 Add section count to memory_block
From: Nathan Fontenot @ 2010-08-09 18:37 UTC (permalink / raw)
To: linux-kernel, linux-mm, linuxppc-dev
Cc: Greg KH, KAMEZAWA Hiroyuki, Dave Hansen
In-Reply-To: <4C60407C.2080608@austin.ibm.com>
Add a section count property to the memory_block struct to track the number
of memory sections that have been added/removed from a memory block. This
allows us to know when the last memory section of a memory block has been
removed so we can remove the memory block.
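The counting idea described above can be sketched in user-space C11 as follows. This is only an analogy under stated assumptions: struct block_sketch and both helpers are hypothetical names, and C11 atomic_fetch_sub() stands in for the kernel's atomic_dec_and_test(), which the patch itself uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the section_count member of memory_block. */
struct block_sketch {
	atomic_int section_count;
};

/* Record that one more section belongs to the block. */
static void block_add_section(struct block_sketch *blk)
{
	atomic_fetch_add(&blk->section_count, 1);
}

/* Drop one section; returns true only when the caller removed the last
 * one, which is the point at which the block itself can be torn down. */
static bool block_remove_section(struct block_sketch *blk)
{
	/* atomic_fetch_sub() returns the value held before the decrement. */
	return atomic_fetch_sub(&blk->section_count, 1) == 1;
}
```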
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
---
drivers/base/memory.c | 18 +++++++++++-------
include/linux/memory.h | 2 ++
2 files changed, 13 insertions(+), 7 deletions(-)
Index: linux-2.6/drivers/base/memory.c
===================================================================
--- linux-2.6.orig/drivers/base/memory.c 2010-08-09 07:44:31.000000000 -0500
+++ linux-2.6/drivers/base/memory.c 2010-08-09 07:49:04.000000000 -0500
@@ -487,6 +487,7 @@ static int add_memory_block(int nid, str
mem->start_phys_index = __section_nr(section);
mem->state = state;
+ atomic_inc(&mem->section_count);
mutex_init(&mem->state_mutex);
start_pfn = section_nr_to_pfn(mem->start_phys_index);
mem->phys_device = arch_get_memory_phys_device(start_pfn);
@@ -516,13 +517,16 @@ int remove_memory_block(unsigned long no
struct memory_block *mem;
mem = find_memory_block(section);
- unregister_mem_sect_under_nodes(mem);
- mem_remove_simple_file(mem, phys_index);
- mem_remove_simple_file(mem, end_phys_index);
- mem_remove_simple_file(mem, state);
- mem_remove_simple_file(mem, phys_device);
- mem_remove_simple_file(mem, removable);
- unregister_memory(mem, section);
+
+ if (atomic_dec_and_test(&mem->section_count)) {
+ unregister_mem_sect_under_nodes(mem);
+ mem_remove_simple_file(mem, phys_index);
+ mem_remove_simple_file(mem, end_phys_index);
+ mem_remove_simple_file(mem, state);
+ mem_remove_simple_file(mem, phys_device);
+ mem_remove_simple_file(mem, removable);
+ unregister_memory(mem, section);
+ }
return 0;
}
Index: linux-2.6/include/linux/memory.h
===================================================================
--- linux-2.6.orig/include/linux/memory.h 2010-08-09 07:44:31.000000000 -0500
+++ linux-2.6/include/linux/memory.h 2010-08-09 07:49:04.000000000 -0500
@@ -19,11 +19,13 @@
#include <linux/node.h>
#include <linux/compiler.h>
#include <linux/mutex.h>
+#include <asm/atomic.h>
struct memory_block {
unsigned long start_phys_index;
unsigned long end_phys_index;
unsigned long state;
+ atomic_t section_count;
/*
* This serializes all state change requests. It isn't
* held during creation because the control files are
^ permalink raw reply
* [PATCH 4/8] v5 Add mutex for add/remove of memory blocks
From: Nathan Fontenot @ 2010-08-09 18:38 UTC (permalink / raw)
To: linux-kernel, linux-mm, linuxppc-dev
Cc: Greg KH, KAMEZAWA Hiroyuki, Dave Hansen
In-Reply-To: <4C60407C.2080608@austin.ibm.com>
Add a new mutex for use in adding and removing of memory blocks. This
is needed to avoid any race conditions in which the same memory block could
be added and removed at the same time.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
---
drivers/base/memory.c | 7 +++++++
1 file changed, 7 insertions(+)
Index: linux-2.6/drivers/base/memory.c
===================================================================
--- linux-2.6.orig/drivers/base/memory.c 2010-08-09 07:49:04.000000000 -0500
+++ linux-2.6/drivers/base/memory.c 2010-08-09 07:50:20.000000000 -0500
@@ -27,6 +27,8 @@
#include <asm/atomic.h>
#include <asm/uaccess.h>
+static DEFINE_MUTEX(mem_sysfs_mutex);
+
#define MEMORY_CLASS_NAME "memory"
static struct sysdev_class memory_sysdev_class = {
@@ -485,6 +487,8 @@ static int add_memory_block(int nid, str
if (!mem)
return -ENOMEM;
+ mutex_lock(&mem_sysfs_mutex);
+
mem->start_phys_index = __section_nr(section);
mem->state = state;
atomic_inc(&mem->section_count);
@@ -508,6 +512,7 @@ static int add_memory_block(int nid, str
ret = register_mem_sect_under_node(mem, nid);
}
+ mutex_unlock(&mem_sysfs_mutex);
return ret;
}
@@ -516,6 +521,7 @@ int remove_memory_block(unsigned long no
{
struct memory_block *mem;
+ mutex_lock(&mem_sysfs_mutex);
mem = find_memory_block(section);
if (atomic_dec_and_test(&mem->section_count)) {
@@ -528,6 +534,7 @@ int remove_memory_block(unsigned long no
unregister_memory(mem, section);
}
+ mutex_unlock(&mem_sysfs_mutex);
return 0;
}
^ permalink raw reply
* [PATCH 5/8] v5 Allow memory_block to span multiple memory sections
From: Nathan Fontenot @ 2010-08-09 18:39 UTC (permalink / raw)
To: linux-kernel, linux-mm, linuxppc-dev
Cc: Greg KH, KAMEZAWA Hiroyuki, Dave Hansen
In-Reply-To: <4C60407C.2080608@austin.ibm.com>
Update the memory sysfs code so that each sysfs memory directory is now
considered a memory block that can contain multiple memory sections.
The default size of each memory block is MIN_MEMORY_BLOCK_SIZE
(1 << SECTION_SIZE_BITS bytes) to maintain the current behavior of having
a single memory section per memory block (i.e. one sysfs directory per
memory section).
For architectures that want to have memory blocks span multiple
memory sections they need only define their own memory_block_size_bytes()
routine.
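The block-size arithmetic this patch relies on can be sketched as stand-alone C. The section size of 16 MiB (SKETCH_SECTION_SIZE_BITS of 24) is an assumption chosen for illustration, and the SKETCH_ macros and function names are hypothetical; only the validation rule (power of two, not smaller than one section) and the rounding formula mirror get_memory_block_size() and base_memory_block_id() from the diff below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration only: 16 MiB memory sections. */
#define SKETCH_SECTION_SIZE_BITS 24
#define SKETCH_MIN_BLOCK_SIZE (UINT32_C(1) << SKETCH_SECTION_SIZE_BITS)

/* A block size is accepted when it is a power of two and at least as
 * large as one section. */
static bool block_size_is_valid(uint32_t block_sz)
{
	return block_sz >= SKETCH_MIN_BLOCK_SIZE &&
	       (block_sz & (block_sz - 1)) == 0;
}

/* Invalid sizes fall back to one section per block. */
static uint32_t sections_per_block_for(uint32_t block_sz)
{
	if (!block_size_is_valid(block_sz))
		block_sz = SKETCH_MIN_BLOCK_SIZE;
	return block_sz / SKETCH_MIN_BLOCK_SIZE;
}

/* Round a section number down to the first section of its block; that
 * value names the sysfs directory. */
static int base_block_id(int section_nr, int sections_per_block)
{
	return (section_nr / sections_per_block) * sections_per_block;
}
```

Under these assumptions a 256 MiB block holds 16 sections, so section 37 would be found under the directory for section 32.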
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
---
drivers/base/memory.c | 148 ++++++++++++++++++++++++++++++++++----------------
1 file changed, 103 insertions(+), 45 deletions(-)
Index: linux-2.6/drivers/base/memory.c
===================================================================
--- linux-2.6.orig/drivers/base/memory.c 2010-08-09 07:50:20.000000000 -0500
+++ linux-2.6/drivers/base/memory.c 2010-08-09 07:50:28.000000000 -0500
@@ -30,6 +30,14 @@
static DEFINE_MUTEX(mem_sysfs_mutex);
#define MEMORY_CLASS_NAME "memory"
+#define MIN_MEMORY_BLOCK_SIZE (1 << SECTION_SIZE_BITS)
+
+static int sections_per_block;
+
+static inline int base_memory_block_id(int section_nr)
+{
+ return (section_nr / sections_per_block) * sections_per_block;
+}
static struct sysdev_class memory_sysdev_class = {
.name = MEMORY_CLASS_NAME,
@@ -84,22 +92,21 @@ EXPORT_SYMBOL(unregister_memory_isolate_
* register_memory - Setup a sysfs device for a memory block
*/
static
-int register_memory(struct memory_block *memory, struct mem_section *section)
+int register_memory(struct memory_block *memory)
{
int error;
memory->sysdev.cls = &memory_sysdev_class;
- memory->sysdev.id = __section_nr(section);
+ memory->sysdev.id = memory->start_phys_index;
error = sysdev_register(&memory->sysdev);
return error;
}
static void
-unregister_memory(struct memory_block *memory, struct mem_section *section)
+unregister_memory(struct memory_block *memory)
{
BUG_ON(memory->sysdev.cls != &memory_sysdev_class);
- BUG_ON(memory->sysdev.id != __section_nr(section));
/* drop the ref. we got in remove_memory_block() */
kobject_put(&memory->sysdev.kobj);
@@ -133,13 +140,16 @@ static ssize_t show_mem_end_phys_index(s
static ssize_t show_mem_removable(struct sys_device *dev,
struct sysdev_attribute *attr, char *buf)
{
- unsigned long start_pfn;
- int ret;
+ unsigned long i, pfn;
+ int ret = 1;
struct memory_block *mem =
container_of(dev, struct memory_block, sysdev);
- start_pfn = section_nr_to_pfn(mem->start_phys_index);
- ret = is_mem_section_removable(start_pfn, PAGES_PER_SECTION);
+ for (i = mem->start_phys_index; i <= mem->end_phys_index; i++) {
+ pfn = section_nr_to_pfn(i);
+ ret &= is_mem_section_removable(pfn, PAGES_PER_SECTION);
+ }
+
return sprintf(buf, "%d\n", ret);
}
@@ -192,17 +202,14 @@ int memory_isolate_notify(unsigned long
* OK to have direct references to sparsemem variables in here.
*/
static int
-memory_block_action(struct memory_block *mem, unsigned long action)
+memory_section_action(unsigned long phys_index, unsigned long action)
{
int i;
- unsigned long psection;
unsigned long start_pfn, start_paddr;
struct page *first_page;
int ret;
- int old_state = mem->state;
- psection = mem->start_phys_index;
- first_page = pfn_to_page(psection << PFN_SECTION_SHIFT);
+ first_page = pfn_to_page(phys_index << PFN_SECTION_SHIFT);
/*
* The probe routines leave the pages reserved, just
@@ -215,8 +222,8 @@ memory_block_action(struct memory_block
continue;
printk(KERN_WARNING "section number %ld page number %d "
- "not reserved, was it already online? \n",
- psection, i);
+ "not reserved, was it already online?\n",
+ phys_index, i);
return -EBUSY;
}
}
@@ -227,18 +234,13 @@ memory_block_action(struct memory_block
ret = online_pages(start_pfn, PAGES_PER_SECTION);
break;
case MEM_OFFLINE:
- mem->state = MEM_GOING_OFFLINE;
start_paddr = page_to_pfn(first_page) << PAGE_SHIFT;
ret = remove_memory(start_paddr,
PAGES_PER_SECTION << PAGE_SHIFT);
- if (ret) {
- mem->state = old_state;
- break;
- }
break;
default:
- WARN(1, KERN_WARNING "%s(%p, %ld) unknown action: %ld\n",
- __func__, mem, action, action);
+ WARN(1, KERN_WARNING "%s(%ld, %ld) unknown action: "
+ "%ld\n", __func__, phys_index, action, action);
ret = -EINVAL;
}
@@ -248,7 +250,7 @@ memory_block_action(struct memory_block
static int memory_block_change_state(struct memory_block *mem,
unsigned long to_state, unsigned long from_state_req)
{
- int ret = 0;
+ int i, ret = 0;
mutex_lock(&mem->state_mutex);
if (mem->state != from_state_req) {
@@ -256,8 +258,21 @@ static int memory_block_change_state(str
goto out;
}
- ret = memory_block_action(mem, to_state);
- if (!ret)
+ if (to_state == MEM_OFFLINE)
+ mem->state = MEM_GOING_OFFLINE;
+
+ for (i = mem->start_phys_index; i <= mem->end_phys_index; i++) {
+ ret = memory_section_action(i, to_state);
+ if (ret)
+ break;
+ }
+
+ if (ret) {
+ for (i = mem->start_phys_index; i <= mem->end_phys_index; i++)
+ memory_section_action(i, from_state_req);
+
+ mem->state = from_state_req;
+ } else
mem->state = to_state;
out:
@@ -270,20 +285,15 @@ store_mem_state(struct sys_device *dev,
struct sysdev_attribute *attr, const char *buf, size_t count)
{
struct memory_block *mem;
- unsigned int phys_section_nr;
int ret = -EINVAL;
mem = container_of(dev, struct memory_block, sysdev);
- phys_section_nr = mem->start_phys_index;
-
- if (!present_section_nr(phys_section_nr))
- goto out;
if (!strncmp(buf, "online", min((int)count, 6)))
ret = memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE);
else if(!strncmp(buf, "offline", min((int)count, 7)))
ret = memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE);
-out:
+
if (ret)
return ret;
return count;
@@ -460,12 +470,13 @@ struct memory_block *find_memory_block(s
struct sys_device *sysdev;
struct memory_block *mem;
char name[sizeof(MEMORY_CLASS_NAME) + 9 + 1];
+ int block_id = base_memory_block_id(__section_nr(section));
/*
* This only works because we know that section == sysdev->id
* slightly redundant with sysdev_register()
*/
- sprintf(&name[0], "%s%d", MEMORY_CLASS_NAME, __section_nr(section));
+ sprintf(&name[0], "%s%d", MEMORY_CLASS_NAME, block_id);
kobj = kset_find_obj(&memory_sysdev_class.kset, name);
if (!kobj)
@@ -477,26 +488,26 @@ struct memory_block *find_memory_block(s
return mem;
}
-static int add_memory_block(int nid, struct mem_section *section,
- unsigned long state, enum mem_add_context context)
+static int init_memory_block(struct memory_block **memory,
+ struct mem_section *section, unsigned long state)
{
- struct memory_block *mem = kzalloc(sizeof(*mem), GFP_KERNEL);
+ struct memory_block *mem;
unsigned long start_pfn;
int ret = 0;
+ mem = kzalloc(sizeof(*mem), GFP_KERNEL);
if (!mem)
return -ENOMEM;
- mutex_lock(&mem_sysfs_mutex);
-
- mem->start_phys_index = __section_nr(section);
+ mem->start_phys_index = base_memory_block_id(__section_nr(section));
+ mem->end_phys_index = mem->start_phys_index + sections_per_block - 1;
mem->state = state;
atomic_inc(&mem->section_count);
mutex_init(&mem->state_mutex);
start_pfn = section_nr_to_pfn(mem->start_phys_index);
mem->phys_device = arch_get_memory_phys_device(start_pfn);
- ret = register_memory(mem, section);
+ ret = register_memory(mem);
if (!ret)
ret = mem_create_simple_file(mem, phys_index);
if (!ret)
@@ -507,8 +518,29 @@ static int add_memory_block(int nid, str
ret = mem_create_simple_file(mem, phys_device);
if (!ret)
ret = mem_create_simple_file(mem, removable);
+
+ *memory = mem;
+ return ret;
+}
+
+static int add_memory_section(int nid, struct mem_section *section,
+ unsigned long state, enum mem_add_context context)
+{
+ struct memory_block *mem;
+ int ret = 0;
+
+ mutex_lock(&mem_sysfs_mutex);
+
+ mem = find_memory_block(section);
+ if (mem) {
+ atomic_inc(&mem->section_count);
+ kobject_put(&mem->sysdev.kobj);
+ } else
+ ret = init_memory_block(&mem, section, state);
+
if (!ret) {
- if (context == HOTPLUG)
+ if (context == HOTPLUG &&
+ atomic_read(&mem->section_count) == sections_per_block)
ret = register_mem_sect_under_node(mem, nid);
}
@@ -531,8 +563,10 @@ int remove_memory_block(unsigned long no
mem_remove_simple_file(mem, state);
mem_remove_simple_file(mem, phys_device);
mem_remove_simple_file(mem, removable);
- unregister_memory(mem, section);
- }
+ unregister_memory(mem);
+ kfree(mem);
+ } else
+ kobject_put(&mem->sysdev.kobj);
mutex_unlock(&mem_sysfs_mutex);
return 0;
@@ -544,7 +578,7 @@ int remove_memory_block(unsigned long no
*/
int register_new_memory(int nid, struct mem_section *section)
{
- return add_memory_block(nid, section, MEM_OFFLINE, HOTPLUG);
+ return add_memory_section(nid, section, MEM_OFFLINE, HOTPLUG);
}
int unregister_memory_section(struct mem_section *section)
@@ -555,6 +589,26 @@ int unregister_memory_section(struct mem
return remove_memory_block(0, section, 0);
}
+u32 __weak memory_block_size_bytes(void)
+{
+ return MIN_MEMORY_BLOCK_SIZE;
+}
+
+static u32 get_memory_block_size(void)
+{
+ u32 block_sz;
+
+ block_sz = memory_block_size_bytes();
+
+ /* Validate block_sz is a power of 2 and not less than section size */
+ if ((block_sz & (block_sz - 1)) || (block_sz < MIN_MEMORY_BLOCK_SIZE)) {
+ WARN_ON(1);
+ block_sz = MIN_MEMORY_BLOCK_SIZE;
+ }
+
+ return block_sz;
+}
+
/*
* Initialize the sysfs support for memory devices...
*/
@@ -563,12 +617,16 @@ int __init memory_dev_init(void)
unsigned int i;
int ret;
int err;
+ int block_sz;
memory_sysdev_class.kset.uevent_ops = &memory_uevent_ops;
ret = sysdev_class_register(&memory_sysdev_class);
if (ret)
goto out;
+ block_sz = get_memory_block_size();
+ sections_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
+
/*
* Create entries for memory sections that were found
* during boot and have been initialized
@@ -576,8 +634,8 @@ int __init memory_dev_init(void)
for (i = 0; i < NR_MEM_SECTIONS; i++) {
if (!present_section_nr(i))
continue;
- err = add_memory_block(0, __nr_to_section(i), MEM_ONLINE,
- BOOT);
+ err = add_memory_section(0, __nr_to_section(i), MEM_ONLINE,
+ BOOT);
if (!ret)
ret = err;
}
^ permalink raw reply
* [PATCH 6/8] v5 Update the node sysfs code
From: Nathan Fontenot @ 2010-08-09 18:41 UTC (permalink / raw)
To: linux-kernel, linux-mm, linuxppc-dev
Cc: Greg KH, KAMEZAWA Hiroyuki, Dave Hansen
In-Reply-To: <4C60407C.2080608@austin.ibm.com>
Update the node sysfs code to be aware of the new capability for a memory
block to contain multiple memory sections. This requires an additional
parameter to unregister_mem_sect_under_nodes so that we know which memory
section of the memory block to unregister.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
---
drivers/base/memory.c | 2 +-
drivers/base/node.c | 12 ++++++++----
include/linux/node.h | 6 ++++--
3 files changed, 13 insertions(+), 7 deletions(-)
Index: linux-2.6/drivers/base/node.c
===================================================================
--- linux-2.6.orig/drivers/base/node.c 2010-08-09 07:36:50.000000000 -0500
+++ linux-2.6/drivers/base/node.c 2010-08-09 07:53:30.000000000 -0500
@@ -346,8 +346,10 @@ int register_mem_sect_under_node(struct
return -EFAULT;
if (!node_online(nid))
return 0;
- sect_start_pfn = section_nr_to_pfn(mem_blk->phys_index);
- sect_end_pfn = sect_start_pfn + PAGES_PER_SECTION - 1;
+
+ sect_start_pfn = section_nr_to_pfn(mem_blk->start_phys_index);
+ sect_end_pfn = section_nr_to_pfn(mem_blk->end_phys_index);
+ sect_end_pfn += PAGES_PER_SECTION - 1;
for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
int page_nid;
@@ -371,7 +373,8 @@ int register_mem_sect_under_node(struct
}
/* unregister memory section under all nodes that it spans */
-int unregister_mem_sect_under_nodes(struct memory_block *mem_blk)
+int unregister_mem_sect_under_nodes(struct memory_block *mem_blk,
+ unsigned long phys_index)
{
NODEMASK_ALLOC(nodemask_t, unlinked_nodes, GFP_KERNEL);
unsigned long pfn, sect_start_pfn, sect_end_pfn;
@@ -383,7 +386,8 @@ int unregister_mem_sect_under_nodes(stru
if (!unlinked_nodes)
return -ENOMEM;
nodes_clear(*unlinked_nodes);
- sect_start_pfn = section_nr_to_pfn(mem_blk->phys_index);
+
+ sect_start_pfn = section_nr_to_pfn(phys_index);
sect_end_pfn = sect_start_pfn + PAGES_PER_SECTION - 1;
for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
int nid;
Index: linux-2.6/drivers/base/memory.c
===================================================================
--- linux-2.6.orig/drivers/base/memory.c 2010-08-09 07:50:28.000000000 -0500
+++ linux-2.6/drivers/base/memory.c 2010-08-09 07:53:30.000000000 -0500
@@ -555,9 +555,9 @@ int remove_memory_block(unsigned long no
mutex_lock(&mem_sysfs_mutex);
mem = find_memory_block(section);
+ unregister_mem_sect_under_nodes(mem, __section_nr(section));
if (atomic_dec_and_test(&mem->section_count)) {
- unregister_mem_sect_under_nodes(mem);
mem_remove_simple_file(mem, phys_index);
mem_remove_simple_file(mem, end_phys_index);
mem_remove_simple_file(mem, state);
Index: linux-2.6/include/linux/node.h
===================================================================
--- linux-2.6.orig/include/linux/node.h 2010-08-09 07:36:50.000000000 -0500
+++ linux-2.6/include/linux/node.h 2010-08-09 07:53:30.000000000 -0500
@@ -44,7 +44,8 @@ extern int register_cpu_under_node(unsig
extern int unregister_cpu_under_node(unsigned int cpu, unsigned int nid);
extern int register_mem_sect_under_node(struct memory_block *mem_blk,
int nid);
-extern int unregister_mem_sect_under_nodes(struct memory_block *mem_blk);
+extern int unregister_mem_sect_under_nodes(struct memory_block *mem_blk,
+ unsigned long phys_index);
#ifdef CONFIG_HUGETLBFS
extern void register_hugetlbfs_with_node(node_registration_func_t doregister,
@@ -72,7 +73,8 @@ static inline int register_mem_sect_unde
{
return 0;
}
-static inline int unregister_mem_sect_under_nodes(struct memory_block *mem_blk)
+static inline int unregister_mem_sect_under_nodes(struct memory_block *mem_blk,
+ unsigned long phys_index)
{
return 0;
}
^ permalink raw reply
* [PATCH 7/8] v5 Define memory_block_size_bytes() for ppc/pseries
From: Nathan Fontenot @ 2010-08-09 18:42 UTC (permalink / raw)
To: linux-kernel, linux-mm, linuxppc-dev
Cc: Greg KH, KAMEZAWA Hiroyuki, Dave Hansen
In-Reply-To: <4C60407C.2080608@austin.ibm.com>
Define a version of memory_block_size_bytes() for powerpc/pseries such that
a memory block spans an entire lmb.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
---
arch/powerpc/platforms/pseries/hotplug-memory.c | 66 +++++++++++++++++++-----
1 file changed, 53 insertions(+), 13 deletions(-)
Index: linux-2.6/arch/powerpc/platforms/pseries/hotplug-memory.c
===================================================================
--- linux-2.6.orig/arch/powerpc/platforms/pseries/hotplug-memory.c 2010-08-09 07:36:49.000000000 -0500
+++ linux-2.6/arch/powerpc/platforms/pseries/hotplug-memory.c 2010-08-09 07:54:00.000000000 -0500
@@ -17,6 +17,54 @@
#include <asm/pSeries_reconfig.h>
#include <asm/sparsemem.h>
+static u32 get_memblock_size(void)
+{
+ struct device_node *np;
+ unsigned int memblock_size = 0;
+
+ np = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
+ if (np) {
+ const unsigned long *size;
+
+ size = of_get_property(np, "ibm,lmb-size", NULL);
+ memblock_size = size ? *size : 0;
+
+ of_node_put(np);
+ } else {
+ unsigned int memzero_size = 0;
+ const unsigned int *regs;
+
+ np = of_find_node_by_path("/memory@0");
+ if (np) {
+ regs = of_get_property(np, "reg", NULL);
+ memzero_size = regs ? regs[3] : 0;
+ of_node_put(np);
+ }
+
+ if (memzero_size) {
+ /* We now know the size of memory@0, use this to find
+ * the first memoryblock and get its size.
+ */
+ char buf[64];
+
+ sprintf(buf, "/memory@%x", memzero_size);
+ np = of_find_node_by_path(buf);
+ if (np) {
+ regs = of_get_property(np, "reg", NULL);
+ memblock_size = regs ? regs[3] : 0;
+ of_node_put(np);
+ }
+ }
+ }
+
+ return memblock_size;
+}
+
+u32 memory_block_size_bytes(void)
+{
+ return get_memblock_size();
+}
+
static int pseries_remove_memblock(unsigned long base, unsigned int memblock_size)
{
unsigned long start, start_pfn;
@@ -127,30 +175,22 @@ static int pseries_add_memory(struct dev
static int pseries_drconf_memory(unsigned long *base, unsigned int action)
{
- struct device_node *np;
- const unsigned long *lmb_size;
+ unsigned long memblock_size;
int rc;
- np = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
- if (!np)
+ memblock_size = get_memblock_size();
+ if (!memblock_size)
return -EINVAL;
- lmb_size = of_get_property(np, "ibm,lmb-size", NULL);
- if (!lmb_size) {
- of_node_put(np);
- return -EINVAL;
- }
-
if (action == PSERIES_DRCONF_MEM_ADD) {
- rc = memblock_add(*base, *lmb_size);
+ rc = memblock_add(*base, memblock_size);
rc = (rc < 0) ? -EINVAL : 0;
} else if (action == PSERIES_DRCONF_MEM_REMOVE) {
- rc = pseries_remove_memblock(*base, *lmb_size);
+ rc = pseries_remove_memblock(*base, memblock_size);
} else {
rc = -EINVAL;
}
- of_node_put(np);
return rc;
}
^ permalink raw reply
* [PATCH 8/8] v5 Update memory-hotplug documentation
From: Nathan Fontenot @ 2010-08-09 18:43 UTC (permalink / raw)
To: linux-kernel, linux-mm, linuxppc-dev
Cc: Greg KH, KAMEZAWA Hiroyuki, Dave Hansen
In-Reply-To: <4C60407C.2080608@austin.ibm.com>
Update the memory hotplug documentation to reflect the new behaviors of
memory blocks reflected in sysfs.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
---
Documentation/memory-hotplug.txt | 46 +++++++++++++++++++++++++--------------
1 file changed, 30 insertions(+), 16 deletions(-)
Index: linux-2.6/Documentation/memory-hotplug.txt
===================================================================
--- linux-2.6.orig/Documentation/memory-hotplug.txt 2010-08-09 07:36:48.000000000 -0500
+++ linux-2.6/Documentation/memory-hotplug.txt 2010-08-09 07:59:54.000000000 -0500
@@ -126,36 +126,50 @@ config options.
--------------------------------
4 sysfs files for memory hotplug
--------------------------------
-All sections have their device information under /sys/devices/system/memory as
+All sections have their device information in sysfs. Each section is part of
+a memory block under /sys/devices/system/memory as
/sys/devices/system/memory/memoryXXX
-(XXX is section id.)
+(XXX is the section id.)
-Now, XXX is defined as start_address_of_section / section_size.
+Now, XXX is defined as (start_address_of_section / section_size) of the first
+section contained in the memory block. The files 'phys_index' and
+'end_phys_index' under each directory report the beginning and end section ids
+for the memory block covered by the sysfs directory. It is expected that all
+memory sections in this range are present and no memory holes exist in the
+range. Currently there is no way to determine if there is a memory hole, but
+the existence of one should not affect the hotplug capabilities of the memory
+block.
For example, assume 1GiB section size. A device for a memory starting at
0x100000000 is /sys/device/system/memory/memory4
(0x100000000 / 1Gib = 4)
This device covers address range [0x100000000 ... 0x140000000)
-Under each section, you can see 4 files.
+Under each section, you can see 5 files.
-/sys/devices/system/memory/memoryXXX/phys_index
+/sys/devices/system/memory/memoryXXX/start_phys_index
+/sys/devices/system/memory/memoryXXX/end_phys_index
/sys/devices/system/memory/memoryXXX/phys_device
/sys/devices/system/memory/memoryXXX/state
/sys/devices/system/memory/memoryXXX/removable
-'phys_index' : read-only and contains section id, same as XXX.
-'state' : read-write
- at read: contains online/offline state of memory.
- at write: user can specify "online", "offline" command
-'phys_device': read-only: designed to show the name of physical memory device.
- This is not well implemented now.
-'removable' : read-only: contains an integer value indicating
- whether the memory section is removable or not
- removable. A value of 1 indicates that the memory
- section is removable and a value of 0 indicates that
- it is not removable.
+'phys_index' : read-only and contains section id of the first section
+ in the memory block, same as XXX.
+'end_phys_index' : read-only and contains section id of the last section
+ in the memory block.
+'state' : read-write
+ at read: contains online/offline state of memory.
+ at write: user can specify "online", "offline" command
+ which will be performed on all sections in the block.
+'phys_device' : read-only: designed to show the name of physical memory
+ device. This is not well implemented now.
+'removable' : read-only: contains an integer value indicating
+ whether the memory block is removable or not
+ removable. A value of 1 indicates that the memory
+ block is removable and a value of 0 indicates that
+ it is not removable. A memory block is removable only if
+ every section in the block is removable.
NOTE:
These directories/files appear after physical memory hotplug phase.
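
The relationship between sections and memory blocks described in this patch
can be sketched in plain C. This is an illustration only, not kernel code:
the helper names section_id(), block_end_section_id() and block_removable()
are hypothetical, and the block is assumed to be a contiguous range of whole
sections with no holes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Section id as documented: start_address_of_section / section_size. */
static uint64_t section_id(uint64_t start_addr, uint64_t section_size)
{
	assert(section_size != 0);
	return start_addr / section_size;
}

/*
 * Id of the last section in a block that starts at block_start and spans
 * block_size bytes (the value the end index file is described as holding).
 */
static uint64_t block_end_section_id(uint64_t block_start,
				     uint64_t block_size,
				     uint64_t section_size)
{
	assert(section_size != 0 && block_size >= section_size);
	return (block_start + block_size - 1) / section_size;
}

/* A block is removable only if every section in it is removable. */
static int block_removable(const int *section_removable, size_t nr_sections)
{
	size_t i;

	assert(section_removable != NULL && nr_sections > 0);
	for (i = 0; i < nr_sections; i++) {
		if (!section_removable[i])
			return 0;
	}
	return 1;
}
```

With a 1GiB section size, section_id(0x100000000, 1GiB) evaluates to 4,
which matches the memory4 example in the documentation. If a block starting
there were assumed to span four such sections, block_end_section_id() would
give 7.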
* Re: [PATCH 8/8] v5 Update memory-hotplug documentation
From: Nishanth Aravamudan @ 2010-08-09 20:44 UTC (permalink / raw)
To: linuxppc-dev
Cc: linuxppc-dev, Greg KH, linux-kernel, Dave Hansen, linux-mm,
KAMEZAWA Hiroyuki
In-Reply-To: <4C604C62.7060509@austin.ibm.com>
On Monday, August 09, 2010 11:43:46 am Nathan Fontenot wrote:
> Update the memory hotplug documentation to reflect the new behaviors of
> memory blocks reflected in sysfs.
<snip>
> Index: linux-2.6/Documentation/memory-hotplug.txt
> ===================================================================
> --- linux-2.6.orig/Documentation/memory-hotplug.txt 2010-08-09 07:36:48.000000000 -0500
> +++ linux-2.6/Documentation/memory-hotplug.txt 2010-08-09 07:59:54.000000000 -0500
<snip>
> -/sys/devices/system/memory/memoryXXX/phys_index
> +/sys/devices/system/memory/memoryXXX/start_phys_index
> +/sys/devices/system/memory/memoryXXX/end_phys_index
> /sys/devices/system/memory/memoryXXX/phys_device
> /sys/devices/system/memory/memoryXXX/state
> /sys/devices/system/memory/memoryXXX/removable
>
> -'phys_index' : read-only and contains section id, same as XXX.
<snip>
> +'phys_index' : read-only and contains section id of the first section
Shouldn't this be "start_phys_index"?
Thanks,
Nish
--
Nishanth Aravamudan <nacc@us.ibm.com>
Linux Technology Center