* Re: [PATCH] powerpc: enable CONFIG_HAVE_MEMORYLESS_NODES
From: Nishanth Aravamudan @ 2014-02-13 21:41 UTC (permalink / raw)
To: Christoph Lameter
Cc: David Rientjes, Pekka Enberg, linux-mm, Paul Mackerras,
Anton Blanchard, Matt Mackall, Joonsoo Kim, linuxppc-dev,
Wanpeng Li
In-Reply-To: <20140128183457.GA9315@linux.vnet.ibm.com>
On 28.01.2014 [10:34:57 -0800], Nishanth Aravamudan wrote:
> Anton Blanchard found an issue with an LPAR that had no memory in Node
> 0. Christoph Lameter recommended, as one possible solution, to use
> numa_mem_id() for locality of the nearest memory node-wise. However,
> numa_mem_id() [and the other related APIs] are only useful if
> CONFIG_HAVE_MEMORYLESS_NODES is set. This is only the case for ia64
> currently, but clearly we can have memoryless nodes on ppc64. Add the
> Kconfig option and define it to be the same value as CONFIG_NUMA.
>
> On the LPAR in question, which was very inefficiently using slabs, this
> took the slab consumption at boot from roughly 7GB to roughly 4GB.
Err, this should have included:
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Sorry about that, Ben!
> ---
> Ben, the only question I have wrt this change is if it's appropriate to
> change it for all powerpc configs (that have NUMA on)?
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 25493a0..bb2d5fe 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -447,6 +447,9 @@ config NODES_SHIFT
> default "4"
> depends on NEED_MULTIPLE_NODES
>
> +config HAVE_MEMORYLESS_NODES
> + def_bool NUMA
> +
> config ARCH_SELECT_MEMORY_MODEL
> def_bool y
> depends on PPC64
^ permalink raw reply
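For context on why the option matters: numa_node_id() reports the node the current CPU sits on, while numa_mem_id() is meant to report the nearest node that actually has memory. The following is a minimal userspace sketch of that lookup, not the kernel implementation. The four-node topology, the distance table, and the name nearest_mem_node() are all assumptions made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_NODES 4

/* Hypothetical topology: nodes 0 and 2 have CPUs but no memory. */
static const bool node_has_memory[MAX_NODES] = { false, true, false, true };

/* Hypothetical inter-node distances (smaller means closer). */
static const int node_distance[MAX_NODES][MAX_NODES] = {
	{ 10, 20, 40, 40 },
	{ 20, 10, 40, 40 },
	{ 40, 40, 10, 20 },
	{ 40, 40, 20, 10 },
};

/* Return the closest node that has memory, or -1 if no node does. */
static int nearest_mem_node(int node)
{
	int best = -1;
	int best_dist = INT_MAX;

	assert(node >= 0 && node < MAX_NODES);
	for (int n = 0; n < MAX_NODES; n++) {
		if (!node_has_memory[n])
			continue;
		if (node_distance[node][n] < best_dist) {
			best_dist = node_distance[node][n];
			best = n;
		}
	}
	return best;
}
```

In this model a CPU on memoryless node 0 would be steered to node 1, which is the kind of answer the slab allocator needs when it picks a node to allocate from.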
* Re: [PATCH] of: give priority to the compatible match in __of_match_node()
From: Rob Herring @ 2014-02-13 19:01 UTC (permalink / raw)
To: Kevin Hao
Cc: devicetree@vger.kernel.org, Arnd Bergmann, Chris Proctor,
Stephen N Chivers, Grant Likely, Rob Herring, Scott Wood,
linuxppc-dev, Sebastian Hesselbarth
In-Reply-To: <1392205084-2351-1-git-send-email-haokexin@gmail.com>
On Wed, Feb 12, 2014 at 5:38 AM, Kevin Hao <haokexin@gmail.com> wrote:
> When the device node does have a compatible property, we definitely
> prefer the compatible match over the type and name. Only if
> there is no such match do we consider a candidate which
> doesn't have a compatible entry but does match the type or name of
> the device node.
>
> This is based on a patch from Sebastian Hesselbarth.
> http://patchwork.ozlabs.org/patch/319434/
>
> I did some code refactoring and also fixed a bug in the original patch.
I'm inclined to just revert this once again and avoid possibly
breaking yet another platform.
However, I think I would like to see this structured differently. We
basically have 2 ways of matching: the existing pre-3.14 way and the
desired match on best compatible only. All new bindings should match
with the new way and the old way needs to be kept for compatibility.
So let's structure the code that way. Search the match table first for
best compatible with name and type NULL, then search the table the old
way. I realize it appears you are doing this, but it is not clear this
is the intent of the code. I would like to see this written as a patch
with commit 105353145eafb3ea919 reverted first and you add a new match
function to call first and then fall back to the existing function.
Rob
>
> Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
> Signed-off-by: Kevin Hao <haokexin@gmail.com>
> ---
> drivers/of/base.c | 55 +++++++++++++++++++++++++++++++++++++------------------
> 1 file changed, 37 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index ff85450d5683..9d655df458bd 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -730,32 +730,45 @@ out:
> }
> EXPORT_SYMBOL(of_find_node_with_property);
>
> +static int of_match_type_or_name(const struct device_node *node,
> + const struct of_device_id *m)
> +{
> + int match = 1;
> +
> + if (m->name[0])
> + match &= node->name && !strcmp(m->name, node->name);
> +
> + if (m->type[0])
> + match &= node->type && !strcmp(m->type, node->type);
> +
> + return match;
> +}
> +
> static
> const struct of_device_id *__of_match_node(const struct of_device_id *matches,
> const struct device_node *node)
> {
> const char *cp;
> int cplen, l;
> + const struct of_device_id *m;
> + int match;
>
> if (!matches)
> return NULL;
>
> cp = __of_get_property(node, "compatible", &cplen);
> - do {
> - const struct of_device_id *m = matches;
> + while (cp && (cplen > 0)) {
> + m = matches;
>
> /* Check against matches with current compatible string */
> while (m->name[0] || m->type[0] || m->compatible[0]) {
> - int match = 1;
> - if (m->name[0])
> - match &= node->name
> - && !strcmp(m->name, node->name);
> - if (m->type[0])
> - match &= node->type
> - && !strcmp(m->type, node->type);
> - if (m->compatible[0])
> - match &= cp
> - && !of_compat_cmp(m->compatible, cp,
> + if (!m->compatible[0]) {
> + m++;
> + continue;
> + }
> +
> + match = of_match_type_or_name(node, m);
> + match &= cp && !of_compat_cmp(m->compatible, cp,
> strlen(m->compatible));
> if (match)
> return m;
> @@ -763,12 +776,18 @@ const struct of_device_id *__of_match_node(const struct of_device_id *matches,
> }
>
> /* Get node's next compatible string */
> - if (cp) {
> - l = strlen(cp) + 1;
> - cp += l;
> - cplen -= l;
> - }
> - } while (cp && (cplen > 0));
> + l = strlen(cp) + 1;
> + cp += l;
> + cplen -= l;
> + }
> +
> + m = matches;
> + /* Check against matches without compatible string */
> + while (m->name[0] || m->type[0] || m->compatible[0]) {
> + if (!m->compatible[0] && of_match_type_or_name(node, m))
> + return m;
> + m++;
> + }
>
> return NULL;
> }
> --
> 1.8.5.3
^ permalink raw reply
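The two-stage lookup Rob asks for can be sketched in plain C roughly as follows. This is a simplified userspace model, not the drivers/of/base.c code: struct match_entry and struct node stand in for struct of_device_id and struct device_node, the node's compatible list is a NULL-terminated array instead of a packed property, and the example table and nodes are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins; an empty string means "don't care". */
struct match_entry {
	const char *name;
	const char *type;
	const char *compatible;
};

struct node {
	const char *name;
	const char *type;
	const char *compatible[4];	/* most specific first, NULL-terminated */
};

static int entry_is_end(const struct match_entry *m)
{
	return !m->name[0] && !m->type[0] && !m->compatible[0];
}

/* Pass 1: compatible-only entries, walked in the node's order of specificity. */
static const struct match_entry *
match_best_compatible(const struct match_entry *matches, const struct node *np)
{
	for (size_t i = 0; i < 4 && np->compatible[i]; i++)
		for (const struct match_entry *m = matches; !entry_is_end(m); m++)
			if (m->compatible[0] && !m->name[0] && !m->type[0] &&
			    !strcmp(m->compatible, np->compatible[i]))
				return m;
	return NULL;
}

/* Pass 2: legacy behaviour, first table entry whose set fields all match. */
static const struct match_entry *
match_legacy(const struct match_entry *matches, const struct node *np)
{
	for (const struct match_entry *m = matches; !entry_is_end(m); m++) {
		int ok = 1;

		if (m->name[0])
			ok &= np->name && !strcmp(m->name, np->name);
		if (m->type[0])
			ok &= np->type && !strcmp(m->type, np->type);
		if (m->compatible[0]) {
			int hit = 0;

			for (size_t i = 0; i < 4 && np->compatible[i]; i++)
				hit |= !strcmp(m->compatible, np->compatible[i]);
			ok &= hit;
		}
		if (ok)
			return m;
	}
	return NULL;
}

/* New matcher first, then fall back to the old one. */
static const struct match_entry *
match_node(const struct match_entry *matches, const struct node *np)
{
	const struct match_entry *m;

	assert(matches && np);
	m = match_best_compatible(matches, np);
	return m ? m : match_legacy(matches, np);
}

/* Invented data: a generic type-only entry listed before a specific one. */
static const struct match_entry example_table[] = {
	{ "", "serial", "" },
	{ "", "", "vendor,uart-v2" },
	{ "", "", "" },
};
static const struct node example_uart = {
	"uart", "serial", { "vendor,uart-v2", "vendor,uart", NULL }
};
static const struct node example_legacy = { "uart", "serial", { NULL } };
```

With this data the legacy pass alone would pick the generic "serial" entry for example_uart because it comes first in the table, while match_node() picks the specific compatible entry. A node with no compatible strings still falls through to the legacy pass.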
* Re: [PATCH 2/6] PCI/arm: Use list_for_each_entry() for bus traversal
From: Russell King - ARM Linux @ 2014-02-13 14:46 UTC (permalink / raw)
To: Yijing Wang
Cc: David Airlie, linux-pcmcia, Hanjun Guo, dri-devel, linux-pci,
Bjorn Helgaas, linuxppc-dev
In-Reply-To: <1392297243-61848-2-git-send-email-wangyijing@huawei.com>
On Thu, Feb 13, 2014 at 09:13:59PM +0800, Yijing Wang wrote:
> Replace list_for_each() + pci_bus_b() with the simpler
> list_for_each_entry().
>
> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
--
FTTC broadband for 0.8mile line: 5.8Mbps down 500kbps up. Estimation
in database were 13.1 to 19Mbit for a good line, about 7.5+ for a bad.
Estimate before purchase was "up to 13.2Mbit".
^ permalink raw reply
* [PATCH 2/6] PCI/arm: Use list_for_each_entry() for bus traversal
From: Yijing Wang @ 2014-02-13 13:13 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Russell King, David Airlie, linux-pcmcia, Hanjun Guo, dri-devel,
linux-pci, Yijing Wang, linuxppc-dev
In-Reply-To: <1392297243-61848-1-git-send-email-wangyijing@huawei.com>
Replace list_for_each() + pci_bus_b() with the simpler
list_for_each_entry().
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
arch/arm/kernel/bios32.c | 7 ++-----
1 files changed, 2 insertions(+), 5 deletions(-)
diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c
index 317da88..0a77858 100644
--- a/arch/arm/kernel/bios32.c
+++ b/arch/arm/kernel/bios32.c
@@ -57,13 +57,10 @@ static void pcibios_bus_report_status(struct pci_bus *bus, u_int status_mask, in
void pcibios_report_status(u_int status_mask, int warn)
{
- struct list_head *l;
-
- list_for_each(l, &pci_root_buses) {
- struct pci_bus *bus = pci_bus_b(l);
+ struct pci_bus *bus;
+ list_for_each_entry(bus, &pci_root_buses, node)
pcibios_bus_report_status(bus, status_mask, warn);
- }
}
/*
--
1.7.1
^ permalink raw reply related
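To make the conversion in this series concrete, here is a small userspace sketch that puts the two traversal styles side by side. The list helpers are re-created locally to resemble the kernel ones (this is not include/linux/list.h itself), and struct bus, sum_old(), sum_new() and demo() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace re-creation of the intrusive list helpers. */
struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_for_each(pos, head) \
	for (pos = (head)->next; pos != (head); pos = (pos)->next)
/* __typeof__ is a GNU C extension. */
#define list_for_each_entry(pos, head, member)                           \
	for (pos = list_entry((head)->next, __typeof__(*pos), member);    \
	     &pos->member != (head);                                      \
	     pos = list_entry(pos->member.next, __typeof__(*pos), member))

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

struct bus { int number; struct list_head node; };

/* Old style: walk raw list_head pointers and convert by hand. */
static int sum_old(struct list_head *head)
{
	struct list_head *l;
	int sum = 0;

	list_for_each(l, head) {
		struct bus *b = list_entry(l, struct bus, node);

		sum += b->number;
	}
	return sum;
}

/* New style: the macro yields the containing struct directly. */
static int sum_new(struct list_head *head)
{
	struct bus *b;
	int sum = 0;

	list_for_each_entry(b, head, node)
		sum += b->number;
	return sum;
}

/* Build a three-element list and check that both traversals agree. */
static int demo(void)
{
	struct list_head head = { &head, &head };
	struct bus a = { .number = 1 }, b = { .number = 2 }, c = { .number = 4 };

	list_add_tail(&a.node, &head);
	list_add_tail(&b.node, &head);
	list_add_tail(&c.node, &head);
	assert(sum_old(&head) == sum_new(&head));
	return sum_new(&head);
}
```

The entry-based form drops the temporary list_head pointer and the explicit conversion, which is what makes a wrapper like pci_bus_b() unnecessary.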
* [PATCH 1/6] PCI,acpiphp: Use list_for_each_entry() for bus traversal
From: Yijing Wang @ 2014-02-13 13:13 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Russell King, David Airlie, linux-pcmcia, Hanjun Guo, dri-devel,
linux-pci, Yijing Wang, linuxppc-dev
Replace list_for_each() + pci_bus_b() with the simpler
list_for_each_entry().
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
drivers/pci/hotplug/acpiphp_glue.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index cd929ae..aee6a0a 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -450,7 +450,7 @@ static void cleanup_bridge(struct acpiphp_bridge *bridge)
*/
static unsigned char acpiphp_max_busnr(struct pci_bus *bus)
{
- struct list_head *tmp;
+ struct pci_bus *tmp;
unsigned char max, n;
/*
@@ -463,8 +463,8 @@ static unsigned char acpiphp_max_busnr(struct pci_bus *bus)
*/
max = bus->busn_res.start;
- list_for_each(tmp, &bus->children) {
- n = pci_bus_max_busnr(pci_bus_b(tmp));
+ list_for_each_entry(tmp, &bus->children, node) {
+ n = pci_bus_max_busnr(tmp);
if (n > max)
max = n;
}
--
1.7.1
^ permalink raw reply related
* [PATCH 6/6] PCI: Remove pci_bus_b() and use list_entry() directly
From: Yijing Wang @ 2014-02-13 13:14 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Russell King, David Airlie, linux-pcmcia, Hanjun Guo, dri-devel,
linux-pci, Yijing Wang, linuxppc-dev
In-Reply-To: <1392297243-61848-1-git-send-email-wangyijing@huawei.com>
Replace pci_bus_b() with list_entry(), so we can remove
pci_bus_b().
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
drivers/pci/pci.c | 6 +++---
drivers/pci/search.c | 10 +++++-----
include/linux/pci.h | 1 -
3 files changed, 8 insertions(+), 9 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 1febe90..6f5ed88 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -108,12 +108,12 @@ static bool pcie_ari_disabled;
*/
unsigned char pci_bus_max_busnr(struct pci_bus* bus)
{
- struct list_head *tmp;
+ struct pci_bus *tmp;
unsigned char max, n;
max = bus->busn_res.end;
- list_for_each(tmp, &bus->children) {
- n = pci_bus_max_busnr(pci_bus_b(tmp));
+ list_for_each_entry(tmp, &bus->children, node) {
+ n = pci_bus_max_busnr(tmp);
if(n > max)
max = n;
}
diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index 3ff2ac7..4a1b972 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -54,14 +54,14 @@ pci_find_upstream_pcie_bridge(struct pci_dev *pdev)
static struct pci_bus *pci_do_find_bus(struct pci_bus *bus, unsigned char busnr)
{
- struct pci_bus* child;
- struct list_head *tmp;
+ struct pci_bus *child;
+ struct pci_bus *tmp;
if(bus->number == busnr)
return bus;
- list_for_each(tmp, &bus->children) {
- child = pci_do_find_bus(pci_bus_b(tmp), busnr);
+ list_for_each_entry(tmp, &bus->children, node) {
+ child = pci_do_find_bus(tmp, busnr);
if(child)
return child;
}
@@ -111,7 +111,7 @@ pci_find_next_bus(const struct pci_bus *from)
down_read(&pci_bus_sem);
n = from ? from->node.next : pci_root_buses.next;
if (n != &pci_root_buses)
- b = pci_bus_b(n);
+ b = list_entry(n, struct pci_bus, node);
up_read(&pci_bus_sem);
return b;
}
diff --git a/include/linux/pci.h b/include/linux/pci.h
index fb57c89..e1b5752 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -461,7 +461,6 @@ struct pci_bus {
unsigned int is_added:1;
};
-#define pci_bus_b(n) list_entry(n, struct pci_bus, node)
#define to_pci_bus(n) container_of(n, struct pci_bus, dev)
/*
--
1.7.1
^ permalink raw reply related
* [PATCH 3/6] PCI/drm: Use list_for_each_entry() for bus traversal
From: Yijing Wang @ 2014-02-13 13:14 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Russell King, David Airlie, linux-pcmcia, Hanjun Guo, dri-devel,
linux-pci, Yijing Wang, linuxppc-dev
In-Reply-To: <1392297243-61848-1-git-send-email-wangyijing@huawei.com>
Replace list_for_each() + pci_bus_b() with the simpler
list_for_each_entry().
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
drivers/gpu/drm/drm_fops.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
index 7f2af9a..70d2987 100644
--- a/drivers/gpu/drm/drm_fops.c
+++ b/drivers/gpu/drm/drm_fops.c
@@ -319,7 +319,8 @@ static int drm_open_helper(struct inode *inode, struct file *filp,
pci_dev_put(pci_dev);
}
if (!dev->hose) {
- struct pci_bus *b = pci_bus_b(pci_root_buses.next);
+ struct pci_bus *b = list_entry(pci_root_buses.next,
+ struct pci_bus, node);
if (b)
dev->hose = b->sysdata;
}
--
1.7.1
^ permalink raw reply related
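One detail that may be worth checking in the hunk above: list_entry() only does pointer arithmetic, so on an empty list it yields an address derived from the list head itself rather than NULL, and the `if (b)` test that follows would then not detect an empty pci_root_buses. The sketch below shows an explicit empty-list check in userspace. first_bus_or_null() and demo() are hypothetical names; as far as I know the in-tree helper with this behaviour is list_first_entry_or_null().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel list helpers. */
struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct bus { void *sysdata; struct list_head node; };

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* First bus on the list, or NULL when the list is empty. */
static struct bus *first_bus_or_null(struct list_head *head)
{
	assert(head != NULL);
	if (list_empty(head))
		return NULL;
	return list_entry(head->next, struct bus, node);
}

/* Returns 1 when both the empty and the one-element case behave. */
static int demo(void)
{
	struct list_head roots = { &roots, &roots };
	struct bus b = { .sysdata = NULL };

	if (first_bus_or_null(&roots) != NULL)
		return 0;	/* an empty list must give NULL */

	/* Link b as the only element. */
	b.node.next = b.node.prev = &roots;
	roots.next = roots.prev = &b.node;
	return first_bus_or_null(&roots) == &b;
}
```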
* [PATCH 5/6] PCI/pcmcia: Use list_for_each_entry() for bus traversal
From: Yijing Wang @ 2014-02-13 13:14 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Russell King, David Airlie, linux-pcmcia, Hanjun Guo, dri-devel,
linux-pci, Yijing Wang, linuxppc-dev
In-Reply-To: <1392297243-61848-1-git-send-email-wangyijing@huawei.com>
Replace list_for_each() + pci_bus_b() with the simpler
list_for_each_entry().
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
drivers/pcmcia/yenta_socket.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/pcmcia/yenta_socket.c b/drivers/pcmcia/yenta_socket.c
index 8485761..d16fb12 100644
--- a/drivers/pcmcia/yenta_socket.c
+++ b/drivers/pcmcia/yenta_socket.c
@@ -1076,7 +1076,7 @@ static void yenta_config_init(struct yenta_socket *socket)
*/
static void yenta_fixup_parent_bridge(struct pci_bus *cardbus_bridge)
{
- struct list_head *tmp;
+ struct pci_bus *silbling;
unsigned char upper_limit;
/*
* We only check and fix the parent bridge: All systems which need
@@ -1096,8 +1096,8 @@ static void yenta_fixup_parent_bridge(struct pci_bus *cardbus_bridge)
upper_limit = bridge_to_fix->parent->busn_res.end;
/* check the bus ranges of all silbling bridges to prevent overlap */
- list_for_each(tmp, &bridge_to_fix->parent->children) {
- struct pci_bus *silbling = pci_bus_b(tmp);
+ list_for_each_entry(silbling, &bridge_to_fix->parent->children,
+ node) {
/*
* If the silbling has a higher secondary bus number
* and it's secondary is equal or smaller than our
--
1.7.1
^ permalink raw reply related
* [PATCH 4/6] PCI/powerpc: Use list_for_each_entry() for bus traversal
From: Yijing Wang @ 2014-02-13 13:14 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Russell King, David Airlie, linux-pcmcia, Hanjun Guo, dri-devel,
linux-pci, Yijing Wang, linuxppc-dev
In-Reply-To: <1392297243-61848-1-git-send-email-wangyijing@huawei.com>
Replace list_for_each() + pci_bus_b() with the simpler
list_for_each_entry().
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
arch/powerpc/kernel/pci_64.c | 4 +---
arch/powerpc/platforms/pseries/pci_dlpar.c | 6 +++---
2 files changed, 4 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/pci_64.c b/arch/powerpc/kernel/pci_64.c
index a9e311f..2a47790 100644
--- a/arch/powerpc/kernel/pci_64.c
+++ b/arch/powerpc/kernel/pci_64.c
@@ -208,7 +208,6 @@ long sys_pciconfig_iobase(long which, unsigned long in_bus,
unsigned long in_devfn)
{
struct pci_controller* hose;
- struct list_head *ln;
struct pci_bus *bus = NULL;
struct device_node *hose_node;
@@ -230,8 +229,7 @@ long sys_pciconfig_iobase(long which, unsigned long in_bus,
* used on pre-domains setup. We return the first match
*/
- for (ln = pci_root_buses.next; ln != &pci_root_buses; ln = ln->next) {
- bus = pci_bus_b(ln);
+ list_for_each_entry(bus, &pci_root_buses, node) {
if (in_bus >= bus->number && in_bus <= bus->busn_res.end)
break;
bus = NULL;
diff --git a/arch/powerpc/platforms/pseries/pci_dlpar.c b/arch/powerpc/platforms/pseries/pci_dlpar.c
index efe6137..203cbf0 100644
--- a/arch/powerpc/platforms/pseries/pci_dlpar.c
+++ b/arch/powerpc/platforms/pseries/pci_dlpar.c
@@ -37,15 +37,15 @@ find_bus_among_children(struct pci_bus *bus,
struct device_node *dn)
{
struct pci_bus *child = NULL;
- struct list_head *tmp;
+ struct pci_bus *tmp;
struct device_node *busdn;
busdn = pci_bus_to_OF_node(bus);
if (busdn == dn)
return bus;
- list_for_each(tmp, &bus->children) {
- child = find_bus_among_children(pci_bus_b(tmp), dn);
+ list_for_each_entry(tmp, &bus->children, node) {
+ child = find_bus_among_children(tmp, dn);
if (child)
break;
};
--
1.7.1
^ permalink raw reply related
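One thing worth double-checking in the pci_64.c hunk above: as quoted, the context line `bus = NULL;` stays inside the loop, but `bus` is now the cursor that list_for_each_entry() advances through `bus->node.next`, so a miss would appear to make the next step dereference NULL. A common way around this is to keep the cursor and the result in separate variables. The sketch below shows that pattern in userspace; the list helpers are re-created locally, and find_bus() and demo() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel list helpers. */
struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_for_each_entry(pos, head, member)                           \
	for (pos = list_entry((head)->next, __typeof__(*pos), member);    \
	     &pos->member != (head);                                      \
	     pos = list_entry(pos->member.next, __typeof__(*pos), member))

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

struct bus { int number; int end; struct list_head node; };

/* Return the bus whose [number, end] range covers in_bus, or NULL. */
static struct bus *find_bus(struct list_head *roots, int in_bus)
{
	struct bus *pos;		/* loop cursor: never reassigned in the body */
	struct bus *found = NULL;	/* result, kept separate from the cursor */

	assert(roots != NULL);
	list_for_each_entry(pos, roots, node) {
		if (in_bus >= pos->number && in_bus <= pos->end) {
			found = pos;
			break;
		}
	}
	return found;
}

/* Two root buses covering 0..3 and 4..9; returns the hit's number or -1. */
static int demo(int in_bus)
{
	struct list_head roots = { &roots, &roots };
	struct bus a = { .number = 0, .end = 3 };
	struct bus b = { .number = 4, .end = 9 };
	struct bus *hit;

	list_add_tail(&a.node, &roots);
	list_add_tail(&b.node, &roots);
	hit = find_bus(&roots, in_bus);
	return hit ? hit->number : -1;
}
```

Keeping `found` separate also gives a well-defined NULL when the loop runs to completion, since the cursor itself is not NULL at that point.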
* RE: [PATCH v2] mtd: m25p80: Make the name of mtd_info fixed
From: B48286 @ 2014-02-13 8:08 UTC (permalink / raw)
To: 'Brian Norris'
Cc: Scott Wood, linuxppc-dev@ozlabs.org, Mingkai.Hu@freescale.com,
linux-mtd@lists.infradead.org, linux-spi@vger.kernel.org
In-Reply-To: <20140210193948.GD18440@ld-irv-0074>
Hi Brian,
> -----Original Message-----
> From: Brian Norris [mailto:computersforpeace@gmail.com]
> Sent: Tuesday, February 11, 2014 3:40 AM
> To: Hou Zhiqiang-B48286
> Cc: linux-mtd@lists.infradead.org; linuxppc-dev@ozlabs.org; Wood Scott-
> B07421; Hu Mingkai-B21284; linux-spi@vger.kernel.org
> Subject: Re: [PATCH v2] mtd: m25p80: Make the name of mtd_info fixed
>
> On Sun, Jan 26, 2014 at 02:16:43PM +0800, Hou Zhiqiang wrote:
> > To give spi flash layout using "mtdparts=..." in cmdline, we must give
> > mtd_info a fixed name, because the cmdlinepart's parser will match the
> > name given in cmdline with the mtd_info.
> >
> > Now, if use OF node, mtd_info's name will be spi->dev->name. It
> > consists of spi_master->bus_num, and the spi_master->bus_num may be
> > dynamically fetched.
> > So, give the mtd_info a new fixed name "name.cs", "name" is name of
> > spi_device_id and "cs" is chip-select in spi_dev.
> >
> > Signed-off-by: Hou Zhiqiang <b48286@freescale.com>
> > ---
> > v2:
> > - add check for return value of function kasprintf.
> > - whether the spi_master->bus_num is dynamical is determined by spi
> > controller driver, and it can't be check in this driver. So, we can
> > not initial the mtd_info's name by distinguishing the spi_master
> > bus_num dynamically-allocated or not.
>
> How about spi->master->bus_num < 0 ?
>
We cannot see that case in the SPI slave driver. spi->master->bus_num can
never be negative there: before the slave driver is loaded, the SPI
controller driver checks spi->master->bus_num and, if it is less than 0,
allocates a new bus_num dynamically. So it is never negative in m25p80.c.
> > drivers/mtd/devices/m25p80.c | 8 ++++++--
> > 1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/mtd/devices/m25p80.c
> > b/drivers/mtd/devices/m25p80.c index eb558e8..1f494d2 100644
> > --- a/drivers/mtd/devices/m25p80.c
> > +++ b/drivers/mtd/devices/m25p80.c
> > @@ -1011,8 +1011,12 @@ static int m25p_probe(struct spi_device *spi)
> >
> > if (data && data->name)
> > flash->mtd.name = data->name;
> > - else
> > - flash->mtd.name = dev_name(&spi->dev);
> > + else {
> > + flash->mtd.name = kasprintf(GFP_KERNEL, "%s.%d",
> > + id->name, spi->chip_select);
>
> I don't think this name is specific enough. What if there are more than
> one SPI controller? Then there could be one chip with the same chip-
> select. You probably still need to incorporate the SPI master somehow,
> even if it's not by using the bus number directly (because it's dynamic).
>
Yeah, you're right. The bus_num is actually what distinguishes different
SPI controllers. If the controller driver uses a dynamically allocated
bus_num, then using mtdparts on the command line is at your own risk.
I think it is the SPI controller driver's responsibility to assign a
reasonable bus_num so that command-line mtdparts can be used safely; in
that case it is unnecessary to change the mtd_info name.
> > + if (!flash->mtd.name)
> > + return -ENOMEM;
> > + }
> >
> > flash->mtd.type = MTD_NORFLASH;
> > flash->mtd.writesize = 1;
>
> Brian
>
Zhiqiang Hou
^ permalink raw reply
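Brian's uniqueness concern can be illustrated with a small userspace sketch, using snprintf() in place of kasprintf(). name_from_cs() follows the "%s.%d" format from the quoted patch. name_from_ctrl_cs() is a purely hypothetical variant that prefixes a stable controller label, and the labels "ffe07000.spi" and "ffe08000.spi" are invented. Neither is proposed in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Name as in the quoted patch: "<id name>.<chip select>". */
static int name_from_cs(char *buf, size_t len, const char *id_name, int cs)
{
	assert(buf && id_name);
	return snprintf(buf, len, "%s.%d", id_name, cs);
}

/* Hypothetical variant that also encodes a stable controller label. */
static int name_from_ctrl_cs(char *buf, size_t len, const char *ctrl,
			     const char *id_name, int cs)
{
	assert(buf && ctrl && id_name);
	return snprintf(buf, len, "%s.%s.%d", ctrl, id_name, cs);
}

/*
 * Two identical flash chips on chip-select 0 of two different controllers.
 * Returns 1 if both end up with the same MTD name.
 */
static int names_collide(int with_ctrl)
{
	char a[64], b[64];

	if (with_ctrl) {
		name_from_ctrl_cs(a, sizeof(a), "ffe07000.spi", "m25p80", 0);
		name_from_ctrl_cs(b, sizeof(b), "ffe08000.spi", "m25p80", 0);
	} else {
		name_from_cs(a, sizeof(a), "m25p80", 0);
		name_from_cs(b, sizeof(b), "m25p80", 0);
	}
	return strcmp(a, b) == 0;
}
```

With only the chip name and chip select, both devices come out as "m25p80.0", so a command-line mtdparts entry could not tell them apart. Any fix needs some controller-specific component that does not depend on a dynamically assigned bus number.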
* [PATCH][v2] powerpc/fsl: Add/update miscellaneous missing bindings
From: Harninder Rai @ 2014-02-13 7:29 UTC (permalink / raw)
To: scottwood, devicetree; +Cc: Harninder Rai, linuxppc-dev
Missing bindings were found when running checkpatch.pl on the bsc9132
device tree. This patch adds/updates the following:
- Add bindings for L2 cache controller
- Add bindings for memory controller
- Update bindings for USB controller
Signed-off-by: Harninder Rai <harninder.rai@freescale.com>
---
Changes since base version:
Incorporated Scott's comments
- Rename l2cc.txt to l2cache.txt
- Add information about ePAPR compliance
- Add missing "cache" in compatible
- Miscellaneous minors
.../devicetree/bindings/powerpc/fsl/l2cache.txt | 26 ++++++++++++++++++++
.../devicetree/bindings/powerpc/fsl/mem-ctrlr.txt | 16 ++++++++++++
Documentation/devicetree/bindings/usb/fsl-usb.txt | 2 +
3 files changed, 44 insertions(+), 0 deletions(-)
create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/l2cache.txt
create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/mem-ctrlr.txt
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/l2cache.txt b/Documentation/devicetree/bindings/powerpc/fsl/l2cache.txt
new file mode 100644
index 0000000..79ef4a1
--- /dev/null
+++ b/Documentation/devicetree/bindings/powerpc/fsl/l2cache.txt
@@ -0,0 +1,26 @@
+Freescale L2 Cache Controller
+
+L2 cache is present in Freescale's QorIQ and QorIQ Qonverge platforms.
+The cache bindings explained below are ePAPR compliant
+
+Required Properties:
+
+- compatible : Should include "fsl,chip-l2-cache-controller" and "cache"
+ where chip is the processor (bsc9132, mpc8572 etc.)
+- reg : Address and size of L2 cache controller registers
+- cache-size : Size of the entire L2 cache
+- interrupts : Error interrupt of L2 controller
+
+Optional Properties:
+
+- cache-line-size : Size of L2 cache lines
+
+Example:
+
+ L2: l2-cache-controller@20000 {
+ compatible = "fsl,bsc9132-l2-cache-controller", "cache";
+ reg = <0x20000 0x1000>;
+ cache-line-size = <32>; // 32 bytes
+ cache-size = <0x40000>; // L2,256K
+ interrupts = <16 2 1 0>;
+ };
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/mem-ctrlr.txt b/Documentation/devicetree/bindings/powerpc/fsl/mem-ctrlr.txt
new file mode 100644
index 0000000..70b42bb
--- /dev/null
+++ b/Documentation/devicetree/bindings/powerpc/fsl/mem-ctrlr.txt
@@ -0,0 +1,16 @@
+Freescale DDR memory controller
+
+Properties:
+
+- compatible : Should include "fsl,chip-memory-controller" where
+ chip is the processor (bsc9132, mpc8572 etc.)
+- reg : Address and size of DDR controller registers
+- interrupts : Error interrupt of DDR controller
+
+Example:
+
+ memory-controller@2000 {
+ compatible = "fsl,bsc9132-memory-controller";
+ reg = <0x2000 0x1000>;
+ interrupts = <16 2 1 8>;
+ };
diff --git a/Documentation/devicetree/bindings/usb/fsl-usb.txt b/Documentation/devicetree/bindings/usb/fsl-usb.txt
index bd5723f..afa5809 100644
--- a/Documentation/devicetree/bindings/usb/fsl-usb.txt
+++ b/Documentation/devicetree/bindings/usb/fsl-usb.txt
@@ -9,6 +9,8 @@ Required properties :
- compatible : Should be "fsl-usb2-mph" for multi port host USB
controllers, or "fsl-usb2-dr" for dual role USB controllers
or "fsl,mpc5121-usb2-dr" for dual role USB controllers of MPC5121
+ Wherever applicable, the IP version of the USB controller should
+ also be mentioned (e.g. fsl-usb2-dr-v2.2 for bsc9132).
- phy_type : For multi port host USB controllers, should be one of
"ulpi", or "serial". For dual role USB controllers, should be
one of "ulpi", "utmi", "utmi_wide", or "serial".
--
1.7.6.GIT
^ permalink raw reply related
* Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node
From: Nishanth Aravamudan @ 2014-02-13 6:51 UTC (permalink / raw)
To: Joonsoo Kim
Cc: Han Pingtian, Matt Mackall, Pekka Enberg,
Linux Memory Management List, Paul Mackerras, Anton Blanchard,
David Rientjes, Christoph Lameter, linuxppc-dev, Wanpeng Li
In-Reply-To: <20140211074159.GB27870@lge.com>
Hi Joonsoo,
On 11.02.2014 [16:42:00 +0900], Joonsoo Kim wrote:
> On Mon, Feb 10, 2014 at 11:13:21AM -0800, Nishanth Aravamudan wrote:
> > Hi Christoph,
> >
> > On 07.02.2014 [12:51:07 -0600], Christoph Lameter wrote:
> > > Here is a draft of a patch to make this work with memoryless nodes.
> > >
> > > > The first thing is that we modify node_match to also match if we hit an
> > > > empty node. In that case we simply take the current slab if it's there.
> > >
> > > If there is no current slab then a regular allocation occurs with the
> > > memoryless node. The page allocator will fallback to a possible node and
> > > that will become the current slab. Next alloc from a memoryless node
> > > will then use that slab.
> > >
> > > For that we also add some tracking of allocations on nodes that were not
> > > satisfied using the empty_node[] array. A successful alloc on a node
> > > clears that flag.
> > >
> > > > I would rather avoid the empty_node[] array since it's global and there may
> > > be thread specific allocation restrictions but it would be expensive to do
> > > an allocation attempt via the page allocator to make sure that there is
> > > really no page available from the page allocator.
> >
> > With this patch on our test system (I pulled out the numa_mem_id()
> > change, since you Acked Joonsoo's already), on top of 3.13.0 + my
> > kthread locality change + CONFIG_HAVE_MEMORYLESS_NODES + Joonsoo's RFC
> > patch 1):
> >
> > MemTotal: 8264704 kB
> > MemFree: 5924608 kB
> > ...
> > Slab: 1402496 kB
> > SReclaimable: 102848 kB
> > SUnreclaim: 1299648 kB
> >
> > And Anton's slabusage reports:
> >
> > slab mem objs slabs
> > used active active
> > ------------------------------------------------------------
> > kmalloc-16384 207 MB 98.60% 100.00%
> > task_struct 134 MB 97.82% 100.00%
> > kmalloc-8192 117 MB 100.00% 100.00%
> > pgtable-2^12 111 MB 100.00% 100.00%
> > pgtable-2^10 104 MB 100.00% 100.00%
> >
> > For comparison, Anton's patch applied at the same point in the series:
> >
> > meminfo:
> >
> > MemTotal: 8264704 kB
> > MemFree: 4150464 kB
> > ...
> > Slab: 1590336 kB
> > SReclaimable: 208768 kB
> > SUnreclaim: 1381568 kB
> >
> > slabusage:
> >
> > slab mem objs slabs
> > used active active
> > ------------------------------------------------------------
> > kmalloc-16384 227 MB 98.63% 100.00%
> > kmalloc-8192 130 MB 100.00% 100.00%
> > task_struct 129 MB 97.73% 100.00%
> > pgtable-2^12 112 MB 100.00% 100.00%
> > pgtable-2^10 106 MB 100.00% 100.00%
> >
> >
> > Consider this patch:
> >
> > Acked-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
> > Tested-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
>
> Hello,
>
> I still think that there is another problem.
> Your report about CONFIG_SLAB said that SLAB uses just 200MB.
> Below is your previous report.
>
> Ok, with your patches applied and CONFIG_SLAB enabled:
>
> MemTotal: 8264640 kB
> MemFree: 7119680 kB
> Slab: 207232 kB
> SReclaimable: 32896 kB
> SUnreclaim: 174336 kB
>
> The numbers for CONFIG_SLUB with these patches tell us that SLUB uses 1.4GB.
> There is a large difference in slab usage.
Agreed. But, at least for now, this gets us to not OOM all the time :) I
think that's significant progress. I will continue to look at this
issue for where the other gaps are, but would like to see Christoph's
latest patch get merged (pending my re-testing).
> And I should note that the number of active objects in slabinfo can be
> wrong in some situations, since it doesn't consider the cpu slab (and cpu
> partial slabs).
Well, I grabbed everything from /sys/kernel/slab for you in the
tarballs, I believe.
> I recommend to confirm page_to_nid() and other things as I mentioned
> earlier.
I believe these all work once CONFIG_HAVE_MEMORYLESS_NODES was set for
ppc64, but will test it again when I have access to the test system.
Also, given that only ia64 and (hopefully soon) ppc64 can set
CONFIG_HAVE_MEMORYLESS_NODES, does that mean x86_64 can't have
memoryless nodes present? Even with fakenuma? Just curious.
-Nish
^ permalink raw reply
* Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node
From: Nishanth Aravamudan @ 2014-02-13 3:53 UTC (permalink / raw)
To: Christoph Lameter
Cc: Han Pingtian, Matt Mackall, Pekka Enberg,
Linux Memory Management List, Paul Mackerras, Anton Blanchard,
David Rientjes, Joonsoo Kim, linuxppc-dev, Wanpeng Li
In-Reply-To: <alpine.DEB.2.10.1402121612270.8183@nuc>
On 12.02.2014 [16:16:11 -0600], Christoph Lameter wrote:
> Here is another patch with some fixes. The additional logic is only
> compiled in if CONFIG_HAVE_MEMORYLESS_NODES is set.
>
> Subject: slub: Memoryless node support
>
> Support memoryless nodes by tracking which allocations are failing.
> Allocations targeted to the nodes without memory fall back to the
> current available per cpu objects and if that is not available will
> create a new slab using the page allocator to fallback from the
> memoryless node to some other node.
I'll try and retest this once the LPAR in question comes free. Hopefully
in the next day or two.
Thanks,
Nish
> Signed-off-by: Christoph Lameter <cl@linux.com>
>
> Index: linux/mm/slub.c
> ===================================================================
> --- linux.orig/mm/slub.c 2014-02-12 16:07:48.957869570 -0600
> +++ linux/mm/slub.c 2014-02-12 16:09:22.198928260 -0600
> @@ -134,6 +134,10 @@ static inline bool kmem_cache_has_cpu_pa
> #endif
> }
>
> +#ifdef CONFIG_HAVE_MEMORYLESS_NODES
> +static nodemask_t empty_nodes;
> +#endif
> +
> /*
> * Issues still to be resolved:
> *
> @@ -1405,16 +1409,28 @@ static struct page *new_slab(struct kmem
> void *last;
> void *p;
> int order;
> + int alloc_node;
>
> BUG_ON(flags & GFP_SLAB_BUG_MASK);
>
> page = allocate_slab(s,
> flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
> - if (!page)
> + if (!page) {
> +#ifdef CONFIG_HAVE_MEMORYLESS_NODES
> + if (node != NUMA_NO_NODE)
> + node_set(node, empty_nodes);
> +#endif
> goto out;
> + }
>
> order = compound_order(page);
> - inc_slabs_node(s, page_to_nid(page), page->objects);
> + alloc_node = page_to_nid(page);
> +#ifdef CONFIG_HAVE_MEMORYLESS_NODES
> + node_clear(alloc_node, empty_nodes);
> + if (node != NUMA_NO_NODE && alloc_node != node)
> + node_set(node, empty_nodes);
> +#endif
> + inc_slabs_node(s, alloc_node, page->objects);
> memcg_bind_pages(s, order);
> page->slab_cache = s;
> __SetPageSlab(page);
> @@ -1722,7 +1738,7 @@ static void *get_partial(struct kmem_cac
> struct kmem_cache_cpu *c)
> {
> void *object;
> - int searchnode = (node == NUMA_NO_NODE) ? numa_node_id() : node;
> + int searchnode = (node == NUMA_NO_NODE) ? numa_mem_id() : node;
>
> object = get_partial_node(s, get_node(s, searchnode), c, flags);
> if (object || node != NUMA_NO_NODE)
> @@ -2117,8 +2133,19 @@ static void flush_all(struct kmem_cache
> static inline int node_match(struct page *page, int node)
> {
> #ifdef CONFIG_NUMA
> - if (!page || (node != NUMA_NO_NODE && page_to_nid(page) != node))
> + int page_node = page_to_nid(page);
> +
> + if (!page)
> return 0;
> +
> + if (node != NUMA_NO_NODE) {
> +#ifdef CONFIG_HAVE_MEMORYLESS_NODES
> + if (node_isset(node, empty_nodes))
> + return 1;
> +#endif
> + if (page_node != node)
> + return 0;
> + }
> #endif
> return 1;
> }
>
^ permalink raw reply
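The bookkeeping in the quoted patch can be modelled in userspace as below. This is a sketch under stated assumptions, not mm/slub.c: empty_nodes is a plain bitmask instead of a nodemask_t, alloc_page_near() is a toy stand-in for the page allocator's fallback, and the four-node layout is invented. One deliberate difference: as quoted, the patched node_match() calls page_to_nid(page) in an initializer before the `!page` test, whereas the sketch checks for NULL first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_NODE (-1)	/* stand-in for NUMA_NO_NODE */
#define MAX_NODES 4

struct page { int nid; };	/* node the page actually lives on */

static unsigned int empty_nodes;	/* bit n set: node n looked empty */
static const bool node_has_memory[MAX_NODES] = { false, true, true, false };
static struct page pages[MAX_NODES] = { { 0 }, { 1 }, { 2 }, { 3 } };

/* Toy allocator: requested node if it has memory, else first node that does. */
static struct page *alloc_page_near(int node)
{
	if (node != NO_NODE && node_has_memory[node])
		return &pages[node];
	for (int n = 0; n < MAX_NODES; n++)
		if (node_has_memory[n])
			return &pages[n];
	return NULL;
}

/* Mirrors the empty_nodes updates added to new_slab() in the quoted patch. */
static struct page *new_slab(int node)
{
	struct page *page;

	assert(node == NO_NODE || (node >= 0 && node < MAX_NODES));
	page = alloc_page_near(node);
	if (!page) {
		if (node != NO_NODE)
			empty_nodes |= 1u << node;
		return NULL;
	}
	/* The node that supplied the page is clearly not empty. */
	empty_nodes &= ~(1u << page->nid);
	/* The request was satisfied elsewhere: remember the miss. */
	if (node != NO_NODE && page->nid != node)
		empty_nodes |= 1u << node;
	return page;
}

/* Like the patched node_match(), with the NULL test done first. */
static int node_match(const struct page *page, int node)
{
	if (!page)
		return 0;
	if (node != NO_NODE) {
		if (empty_nodes & (1u << node))
			return 1;	/* any current slab will do */
		if (page->nid != node)
			return 0;
	}
	return 1;
}
```

In this model a request for memoryless node 0 is served from node 1 and marks node 0 as empty, so later requests for node 0 keep using the current slab instead of forcing a new one each time.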
* Re: [PATCH RFC v7 0/6] MPC512x DMA slave s/g support, OF DMA lookup
From: Gerhard Sittig @ 2014-02-13 0:32 UTC (permalink / raw)
To: Alexander Popov
Cc: devicetree, Lars-Peter Clausen, Arnd Bergmann, Vinod Koul,
Dan Williams, Anatolij Gustschin, linuxppc-dev
In-Reply-To: <1392211508-23615-1-git-send-email-a13xp0p0v88@gmail.com>
For some reason you have kept the DMA maintainers, but dropped
the dmaengine ML from Cc: -- was this intentional, given that the
series is specifically about DMA and you want to get feedback?
And you may want to help DT people by not sending purely Linux
implementation related stuff to them (they already are drinking
from the firehose). DT reviewers are foremost interested in
bindings and policy and remaining OS agnostic, and leave
mechanical .dts file updates to subsystem maintainers.
virtually yours
Gerhard Sittig
--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr. 5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-0 Fax: +49-8142-66989-80 Email: office@denx.de
^ permalink raw reply
* Re: [PATCH RFC v7 6/6] HACK mmc: mxcmmc: enable clocks for the MPC512x
From: Gerhard Sittig @ 2014-02-13 0:24 UTC (permalink / raw)
To: Alexander Popov
Cc: Lars-Peter Clausen, Arnd Bergmann, Vinod Koul, Dan Williams,
Anatolij Gustschin, linuxppc-dev
In-Reply-To: <1392211508-23615-7-git-send-email-a13xp0p0v88@gmail.com>
[ removed DT from Cc: ]
On Wed, Feb 12, 2014 at 17:25 +0400, Alexander Popov wrote:
>
> Q&D HACK to enable SD card support without correct COMMON_CLK support,
> best viewed with 'git diff -w -b', NOT acceptable for mainline (NAKed)
This one has become obsolete, v3.14-rc1 comes with proper
COMMON_CLK support.
virtually yours
Gerhard Sittig
--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr. 5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-0 Fax: +49-8142-66989-80 Email: office@denx.de
^ permalink raw reply
* Re: [PATCH RFC v7 4/6] dma: mpc512x: add device tree binding document
From: Gerhard Sittig @ 2014-02-13 0:21 UTC (permalink / raw)
To: Alexander Popov
Cc: devicetree, Lars-Peter Clausen, Arnd Bergmann, Vinod Koul,
Dan Williams, Anatolij Gustschin, linuxppc-dev
In-Reply-To: <1392211508-23615-5-git-send-email-a13xp0p0v88@gmail.com>
On Wed, Feb 12, 2014 at 17:25 +0400, Alexander Popov wrote:
>
> From: Gerhard Sittig <gsi@denx.de>
>
> introduce a device tree binding document for the MPC512x DMA controller
>
> Signed-off-by: Gerhard Sittig <gsi@denx.de>
> [ a13xp0p0v88@gmail.com: turn this into a separate patch ]
As stated in the previous iteration, this one no longer is good
enough. As time has passed, we have moved forward and learned
something. We would not write a binding like this today.
Admittedly I went dormant (did not provide an update) since v6.
There are several issues.
- The MPC512x DMA completely lacks a binding document, so one
should get added.
- The MPC8308 hardware is similar and can re-use the MPC512x
binding, which should be stated.
- The Linux implementation currently has no OF based channel
lookup support, so '#dma-cells' is "a future feature". I guess
the binding can and should already discuss the feature,
regardless of whether all implementations support it.
virtually yours
Gerhard Sittig
--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr. 5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-0 Fax: +49-8142-66989-80 Email: office@denx.de
^ permalink raw reply
* Re: [PATCH RFC v7 2/6] dma: mpc512x: add support for peripheral transfers
From: Gerhard Sittig @ 2014-02-13 0:07 UTC (permalink / raw)
To: Alexander Popov
Cc: Lars-Peter Clausen, Arnd Bergmann, Vinod Koul, Dan Williams,
Anatolij Gustschin, linuxppc-dev
In-Reply-To: <1392211508-23615-3-git-send-email-a13xp0p0v88@gmail.com>
[ removed DT from Cc: ]
On Wed, Feb 12, 2014 at 17:25 +0400, Alexander Popov wrote:
>
> Introduce support for slave s/g transfer preparation and the associated
> device control callback in the MPC512x DMA controller driver, which adds
> support for data transfers between memory and peripheral I/O to the
> previously supported mem-to-mem transfers.
>
> [ ... ]
> --- a/drivers/dma/mpc512x_dma.c
> +++ b/drivers/dma/mpc512x_dma.c
> [ ... ]
> @@ -29,8 +30,15 @@
> */
>
> /*
> - * This is initial version of MPC5121 DMA driver. Only memory to memory
> - * transfers are supported (tested using dmatest module).
> + * MPC512x and MPC8308 DMA driver. It supports
> + * memory to memory data transfers (tested using dmatest module) and
> + * data transfers between memory and peripheral I/O memory
> + * by means of slave s/g with these limitations:
> + * - chunked transfers (transfers with more than one part) are refused
> + * as long as proper support for scatter/gather is missing;
> + * - transfers on MPC8308 always start from software as this SoC appears
> + * not to have external request lines for peripheral flow control;
> + * - minimal memory <-> I/O memory transfer size is 4 bytes.
> */
Often I assume people would notice themselves, and apparently I'm
wrong. :) Can you adjust the formatting (here and elsewhere) so
that the bullet list is clearly visible as such?
Flowing text like above obfuscates the fact that the content may
have a structure ...
There are known limitations which are not listed here, "minimal
transfer size" is incomplete. It appears that you assume
constraints on start addresses as well as sizes/lengths. Can you
update the documentation to match the implementation?
> @@ -251,8 +264,21 @@ static void mpc_dma_execute(struct mpc_dma_chan *mchan)
> struct mpc_dma_desc *mdesc;
> int cid = mchan->chan.chan_id;
>
> - /* Move all queued descriptors to active list */
> - list_splice_tail_init(&mchan->queued, &mchan->active);
> + while (!list_empty(&mchan->queued)) {
> + mdesc = list_first_entry(&mchan->queued,
> + struct mpc_dma_desc, node);
> +
> + /* Grab either several mem-to-mem transfer descriptors
> + * or one peripheral transfer descriptor,
> + * don't mix mem-to-mem and peripheral transfer descriptors
> + * within the same 'active' list. */
> + if (mdesc->will_access_peripheral) {
> + if (list_empty(&mchan->active))
> + list_move_tail(&mdesc->node, &mchan->active);
> + break;
> + } else
> + list_move_tail(&mdesc->node, &mchan->active);
> + }
>
> /* Chain descriptors into one transaction */
> list_for_each_entry(mdesc, &mchan->active, node) {
There are style issues. Both in multi line comments, and in the
braces of the if/else block.
> @@ -643,6 +680,186 @@ mpc_dma_prep_memcpy(struct dma_chan *chan, dma_addr_t dst, dma_addr_t src,
> return &mdesc->desc;
> }
>
> +static struct dma_async_tx_descriptor *
> +mpc_dma_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
> + unsigned int sg_len, enum dma_transfer_direction direction,
> + unsigned long flags, void *context)
> +{
> + struct mpc_dma *mdma = dma_chan_to_mpc_dma(chan);
> + struct mpc_dma_chan *mchan = dma_chan_to_mpc_dma_chan(chan);
> + struct mpc_dma_desc *mdesc = NULL;
> + dma_addr_t per_paddr;
> + u32 tcd_nunits;
> + struct mpc_dma_tcd *tcd;
> + unsigned long iflags;
> + struct scatterlist *sg;
> + size_t len;
> + int iter, i;
Personally I much dislike this style of mixing declarations and
instructions. But others may disagree, and strongly so.
> +
> + /* currently there is no proper support for scatter/gather */
> + if (sg_len != 1)
> + return NULL;
> +
> + for_each_sg(sgl, sg, sg_len, i) {
> + spin_lock_irqsave(&mchan->lock, iflags);
> +
> + mdesc = list_first_entry(&mchan->free, struct mpc_dma_desc,
> + node);
style (continuation and indentation)
> + if (!mdesc) {
> + spin_unlock_irqrestore(&mchan->lock, iflags);
> + /* try to free completed descriptors */
> + mpc_dma_process_completed(mdma);
> + return NULL;
> + }
> +
> + list_del(&mdesc->node);
> +
> + per_paddr = mchan->per_paddr;
> + tcd_nunits = mchan->tcd_nunits;
> +
> + spin_unlock_irqrestore(&mchan->lock, iflags);
> +
> + if (per_paddr == 0 || tcd_nunits == 0)
> + goto err_prep;
> +
> + mdesc->error = 0;
> + mdesc->will_access_peripheral = 1;
> + tcd = mdesc->tcd;
> +
> + /* Prepare Transfer Control Descriptor for this transaction */
> +
> + memset(tcd, 0, sizeof(struct mpc_dma_tcd));
> +
> + if (!IS_ALIGNED(sg_dma_address(sg), 4))
> + goto err_prep;
You found multiple ways of encoding the "4 byte alignment", using
both the fixed number as well as (several) symbolic identifiers.
Can you look into making them use the same condition if the same
motivation is behind the test?
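A minimal sketch of what "the same condition" could look like, assuming the
motivation really is the 4-byte access width in every case. The names
MPC_DMA_ACCESS_BYTES, mpc_dma_word_aligned() and mpc_dma_transfer_ok() are
hypothetical and not taken from the driver; the sketch is plain user-space C
and ignores the additional "multiple of nbytes" constraint on the length.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: the single place where the access width is defined. */
#define MPC_DMA_ACCESS_BYTES 4u

/* True if value is a multiple of the access width (a power of two). */
static bool mpc_dma_word_aligned(uint64_t value)
{
	return (value & (MPC_DMA_ACCESS_BYTES - 1u)) == 0;
}

/* Start address and length are checked with the very same condition. */
static bool mpc_dma_transfer_ok(uint64_t dma_addr, size_t len)
{
	return len != 0 &&
	       mpc_dma_word_aligned(dma_addr) &&
	       mpc_dma_word_aligned(len);
}
```

With one helper, the offsets, the transfer size and the alignment tests could
all be derived from the same symbol instead of a mix of literal 4s and
identifiers.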
> +
> + if (direction == DMA_DEV_TO_MEM) {
> + tcd->saddr = per_paddr;
> + tcd->daddr = sg_dma_address(sg);
> + tcd->soff = 0;
> + tcd->doff = 4;
> + } else if (direction == DMA_MEM_TO_DEV) {
> + tcd->saddr = sg_dma_address(sg);
> + tcd->daddr = per_paddr;
> + tcd->soff = 4;
> + tcd->doff = 0;
> + } else
> + goto err_prep;
> +
> + tcd->ssize = MPC_DMA_TSIZE_4;
> + tcd->dsize = MPC_DMA_TSIZE_4;
> +
> + len = sg_dma_len(sg);
> + tcd->nbytes = tcd_nunits * 4;
> + if (!IS_ALIGNED(len, tcd->nbytes))
> + goto err_prep;
> +
> + iter = len / tcd->nbytes;
> + if (iter >= 1 << 15) {
> + /* len is too big */
> + goto err_prep;
> + } else {
> + /* citer_linkch contains the high bits of iter */
> + tcd->biter = iter & 0x1ff;
> + tcd->biter_linkch = iter >> 9;
> + tcd->citer = tcd->biter;
> + tcd->citer_linkch = tcd->biter_linkch;
> + }
> +
> + tcd->e_sg = 0;
> + tcd->d_req = 1;
> +
> + /* Place descriptor in prepared list */
> + spin_lock_irqsave(&mchan->lock, iflags);
> + list_add_tail(&mdesc->node, &mchan->prepared);
> + spin_unlock_irqrestore(&mchan->lock, iflags);
> + }
> +
> + return &mdesc->desc;
> +
> +err_prep:
> + /* Put the descriptor back */
> + spin_lock_irqsave(&mchan->lock, iflags);
> + list_add_tail(&mdesc->node, &mchan->free);
> + spin_unlock_irqrestore(&mchan->lock, iflags);
> +
> + return NULL;
> +}
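The length constraints implied by the quoted code can be illustrated with a
small user-space model. It assumes a 9-bit biter field and a 6-bit
biter_linkch field, which is what the 0x1ff mask and the shift by 9 suggest;
struct iter_fields and split_iterations() are made-up names, and a plain
modulo stands in for IS_ALIGNED().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the two iteration-count fields (widths are an assumption). */
struct iter_fields {
	uint16_t biter;        /* low 9 bits of the iteration count */
	uint16_t biter_linkch; /* high 6 bits of the iteration count */
};

/*
 * Split len into iterations of (tcd_nunits * 4) bytes each.
 * Returns false where the quoted code would take the err_prep path.
 */
static bool split_iterations(size_t len, uint32_t tcd_nunits,
			     struct iter_fields *out)
{
	size_t nbytes = (size_t)tcd_nunits * 4;
	size_t iter;

	/* the length must be a whole number of minor loops */
	if (nbytes == 0 || len % nbytes != 0)
		return false;

	/* the iteration count must fit into 15 bits */
	iter = len / nbytes;
	if (iter >= (1u << 15))
		return false;

	out->biter = (uint16_t)(iter & 0x1ff);
	out->biter_linkch = (uint16_t)(iter >> 9);
	return true;
}
```

For example, with tcd_nunits = 16 (64 bytes per iteration) a 64000 byte
transfer gives 1000 iterations, which splits into biter = 488 and
biter_linkch = 1.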
virtually yours
Gerhard Sittig
--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr. 5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-0 Fax: +49-8142-66989-80 Email: office@denx.de
^ permalink raw reply
* Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node
From: Christoph Lameter @ 2014-02-12 22:16 UTC (permalink / raw)
To: Joonsoo Kim
Cc: Han Pingtian, Nishanth Aravamudan, Matt Mackall, Pekka Enberg,
Linux Memory Management List, Paul Mackerras, Anton Blanchard,
David Rientjes, linuxppc-dev, Wanpeng Li
In-Reply-To: <20140211074159.GB27870@lge.com>
Here is another patch with some fixes. The additional logic is only
compiled in if CONFIG_HAVE_MEMORYLESS_NODES is set.
Subject: slub: Memoryless node support
Support memoryless nodes by tracking which allocations are failing.
Allocations targeted at nodes without memory fall back to the
currently available per cpu objects. If none are available, a new
slab is created, with the page allocator falling back from the
memoryless node to some other node.
Signed-off-by: Christoph Lameter <cl@linux.com>
Index: linux/mm/slub.c
===================================================================
--- linux.orig/mm/slub.c 2014-02-12 16:07:48.957869570 -0600
+++ linux/mm/slub.c 2014-02-12 16:09:22.198928260 -0600
@@ -134,6 +134,10 @@ static inline bool kmem_cache_has_cpu_pa
#endif
}
+#ifdef CONFIG_HAVE_MEMORYLESS_NODES
+static nodemask_t empty_nodes;
+#endif
+
/*
* Issues still to be resolved:
*
@@ -1405,16 +1409,28 @@ static struct page *new_slab(struct kmem
void *last;
void *p;
int order;
+ int alloc_node;
BUG_ON(flags & GFP_SLAB_BUG_MASK);
page = allocate_slab(s,
flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
- if (!page)
+ if (!page) {
+#ifdef CONFIG_HAVE_MEMORYLESS_NODES
+ if (node != NUMA_NO_NODE)
+ node_set(node, empty_nodes);
+#endif
goto out;
+ }
order = compound_order(page);
- inc_slabs_node(s, page_to_nid(page), page->objects);
+ alloc_node = page_to_nid(page);
+#ifdef CONFIG_HAVE_MEMORYLESS_NODES
+ node_clear(alloc_node, empty_nodes);
+ if (node != NUMA_NO_NODE && alloc_node != node)
+ node_set(node, empty_nodes);
+#endif
+ inc_slabs_node(s, alloc_node, page->objects);
memcg_bind_pages(s, order);
page->slab_cache = s;
__SetPageSlab(page);
@@ -1722,7 +1738,7 @@ static void *get_partial(struct kmem_cac
struct kmem_cache_cpu *c)
{
void *object;
- int searchnode = (node == NUMA_NO_NODE) ? numa_node_id() : node;
+ int searchnode = (node == NUMA_NO_NODE) ? numa_mem_id() : node;
object = get_partial_node(s, get_node(s, searchnode), c, flags);
if (object || node != NUMA_NO_NODE)
@@ -2117,8 +2133,19 @@ static void flush_all(struct kmem_cache
static inline int node_match(struct page *page, int node)
{
#ifdef CONFIG_NUMA
- if (!page || (node != NUMA_NO_NODE && page_to_nid(page) != node))
+ int page_node = page_to_nid(page);
+
+ if (!page)
return 0;
+
+ if (node != NUMA_NO_NODE) {
+#ifdef CONFIG_HAVE_MEMORYLESS_NODES
+ if (node_isset(node, empty_nodes))
+ return 1;
+#endif
+ if (page_node != node)
+ return 0;
+ }
#endif
return 1;
}
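The tracking logic of the patch can be modelled in user space roughly as
follows. This is only a sketch under assumptions: a 64-bit word stands in for
the nodemask_t (so node numbers must be below 64), NO_NODE stands in for
NUMA_NO_NODE, and note_allocation() is a made-up name for the bookkeeping done
in new_slab(). Unlike the hunk above, the sketch tests for a missing page
before looking at the page's node.

```c
#include <stdbool.h>
#include <stdint.h>

#define NO_NODE (-1)	/* stands in for NUMA_NO_NODE */

/* One bit per node; stands in for the nodemask_t empty_nodes. */
static uint64_t empty_nodes;

/*
 * Record the outcome of a slab allocation requested for 'node'.
 * alloc_node is the node the page came from, or NO_NODE on failure.
 */
static void note_allocation(int node, int alloc_node)
{
	if (alloc_node == NO_NODE) {
		if (node != NO_NODE)
			empty_nodes |= UINT64_C(1) << node;
		return;
	}

	/* memory was found on alloc_node, so it is not empty */
	empty_nodes &= ~(UINT64_C(1) << alloc_node);

	/* the page allocator fell back to another node */
	if (node != NO_NODE && alloc_node != node)
		empty_nodes |= UINT64_C(1) << node;
}

/* Does a cached page on page_node satisfy a request for 'node'? */
static bool node_match(bool have_page, int page_node, int node)
{
	if (!have_page)
		return false;
	if (node == NO_NODE)
		return true;
	/* requests for a node known to be empty accept any page */
	if (empty_nodes & (UINT64_C(1) << node))
		return true;
	return page_node == node;
}
```

Once a request for node 0 has been served from node 1, later requests for
node 0 match the per cpu page from node 1 instead of forcing a new slab each
time.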
^ permalink raw reply
* Re: [PATCH v3 0/3] powerpc/pseries: fix issues in suspend/resume code
From: Tyrel Datwyler @ 2014-02-12 21:43 UTC (permalink / raw)
To: linuxppc-dev; +Cc: nfont, Tyrel Datwyler
In-Reply-To: <1391212692-16217-1-git-send-email-tyreld@linux.vnet.ibm.com>
On 01/31/2014 03:58 PM, Tyrel Datwyler wrote:
> This patchset fixes a couple of issues encountered in the suspend/resume code
> base. First when using the kernel device tree update code update-nodes is
> unnecessarily called more than once. Second the cpu cache lists are not
> updated after a suspend/resume which under certain conditions may cause a
> panic. Finally, since the cache list fix utilizes in kernel device tree update
> code a means for telling drmgr not to perform a device tree update from
> userspace is required.
>
> Changes from v2:
> - Moved dynamic configuration update code into pseries specific routine
> per Nathan's suggestion.
>
> Changes from v1:
> - Fixed several commit message typos
> - Fixed authorship of first two patches
>
> Haren Myneni (2):
> powerpc/pseries: Device tree should only be updated once after
> suspend/migrate
> powerpc/pseries: Update dynamic cache nodes for suspend/resume
> operation
>
> Tyrel Datwyler (1):
> powerpc/pseries: Report in kernel device tree update to drmgr
>
> arch/powerpc/include/asm/rtas.h | 1 +
> arch/powerpc/platforms/pseries/mobility.c | 26 +++++++-----------
> arch/powerpc/platforms/pseries/suspend.c | 44 ++++++++++++++++++++++++++++++-
> 3 files changed, 54 insertions(+), 17 deletions(-)
>
Ping?
Nathan, can I at least get your ack on this v3 patchset? We really need
to get these upstream.
-Tyrel
^ permalink raw reply
* Re: [PATCH V2] powerpc: thp: Fix crash on mremap
From: Benjamin Herrenschmidt @ 2014-02-12 21:03 UTC (permalink / raw)
To: Greg KH; +Cc: stable, linuxppc-dev, paulus, Aneesh Kumar K.V,
Kirill A. Shutemov
In-Reply-To: <20140212142334.GB7688@kroah.com>
On Wed, 2014-02-12 at 06:23 -0800, Greg KH wrote:
> I have no idea what that means...
>
> If you want this patch applied, please be specific as to what is going
> on, why the code is _very_ different, and all of that. Make it
> _obvious_ as to what is happening, and why I would be a fool not to
> take
> it in the stable tree.
>
> As it is, the code in this patch looks so different that I'm just
> assuming you got something wrong and are trying to really send me
> something else, so I'll just ignore it.
It looks very different because the function that needs to be fixed
changed a lot upstream in 3.13.
In practice it's *not* very different in behaviour. It's just that
on powerpc we need to unconditionally call withdraw and deposit when
moving PTEs or it will crash, due to how we keep the transparent
huge page in sync with the hash table.
With the 3.13 code, due to the lock breaking introduced by Kirill in the
3.13-rc's, there's already a generic case for doing that (if we dropped
the lock). So we just put that condition under control of an arch helper,
which on powerpc essentially forces it to true so that we always do it.
The pre-3.13 code didn't do the withdraw and deposit at all in that
function however, so in that case, the patch (this 3.12 one) basically
just adds the calls to withdraw and deposit under control of an ifdef
which is only enabled for powerpc64.
So you are taking 0 risk with other architectures and as the powerpc
maintainer I'm happy with the patch.
Cheers,
Ben.
^ permalink raw reply
* Re: [PATCH] of: give priority to the compatible match in __of_match_node()
From: Stephen N Chivers @ 2014-02-12 20:42 UTC (permalink / raw)
To: Kevin Hao
Cc: Chris Proctor, Arnd Bergmann, devicetree, Stephen N Chivers,
Scott Wood, Rob Herring, Grant Likely, linuxppc-dev,
Sebastian Hesselbarth
In-Reply-To: <1392205084-2351-1-git-send-email-haokexin@gmail.com>
Kevin Hao <haokexin@gmail.com> wrote on 02/12/2014 10:38:04 PM:
> From: Kevin Hao <haokexin@gmail.com>
> To: devicetree@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
> Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>, Stephen
> N Chivers <schivers@csc.com.au>, Chris Proctor
> <cproctor@csc.com.au>, Arnd Bergmann <arnd@arndb.de>, Scott Wood
> <scottwood@freescale.com>, Grant Likely <grant.likely@linaro.org>,
> Rob Herring <robh+dt@kernel.org>
> Date: 02/12/2014 10:38 PM
> Subject: [PATCH] of: give priority to the compatible match in
> __of_match_node()
>
> When the device node does have a compatible property, we definitely
> prefer the compatible match over the type and name. Only if
> there is no such match do we then consider a candidate which
> doesn't have a compatible entry but does match the type or name of
> the device node.
>
> This is based on a patch from Sebastian Hesselbarth.
> http://patchwork.ozlabs.org/patch/319434/
>
> I did some code refactoring and also fixed a bug in the original patch.
>
> Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
> Signed-off-by: Kevin Hao <haokexin@gmail.com>
Tested-by: Stephen Chivers <schivers@csc.com>
Patch works for both orderings. Platform boots without problems and
I get the normal serial console.
> ---
> drivers/of/base.c | 55 ++++++++++++++++++++++++++++++++++++
> +------------------
> 1 file changed, 37 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index ff85450d5683..9d655df458bd 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -730,32 +730,45 @@ out:
> }
> EXPORT_SYMBOL(of_find_node_with_property);
>
> +static int of_match_type_or_name(const struct device_node *node,
> + const struct of_device_id *m)
> +{
> + int match = 1;
> +
> + if (m->name[0])
> + match &= node->name && !strcmp(m->name, node->name);
> +
> + if (m->type[0])
> + match &= node->type && !strcmp(m->type, node->type);
> +
> + return match;
> +}
> +
> static
> const struct of_device_id *__of_match_node(const struct
> of_device_id *matches,
> const struct device_node *node)
> {
> const char *cp;
> int cplen, l;
> + const struct of_device_id *m;
> + int match;
>
> if (!matches)
> return NULL;
>
> cp = __of_get_property(node, "compatible", &cplen);
> - do {
> - const struct of_device_id *m = matches;
> + while (cp && (cplen > 0)) {
> + m = matches;
>
> /* Check against matches with current compatible string */
> while (m->name[0] || m->type[0] || m->compatible[0]) {
> - int match = 1;
> - if (m->name[0])
> - match &= node->name
> - && !strcmp(m->name, node->name);
> - if (m->type[0])
> - match &= node->type
> - && !strcmp(m->type, node->type);
> - if (m->compatible[0])
> - match &= cp
> - && !of_compat_cmp(m->compatible, cp,
> + if (!m->compatible[0]) {
> + m++;
> + continue;
> + }
> +
> + match = of_match_type_or_name(node, m);
> + match &= cp && !of_compat_cmp(m->compatible, cp,
> strlen(m->compatible));
> if (match)
> return m;
> @@ -763,12 +776,18 @@ const struct of_device_id *__of_match_node
> (const struct of_device_id *matches,
> }
>
> /* Get node's next compatible string */
> - if (cp) {
> - l = strlen(cp) + 1;
> - cp += l;
> - cplen -= l;
> - }
> - } while (cp && (cplen > 0));
> + l = strlen(cp) + 1;
> + cp += l;
> + cplen -= l;
> + }
> +
> + m = matches;
> + /* Check against matches without compatible string */
> + while (m->name[0] || m->type[0] || m->compatible[0]) {
> + if (!m->compatible[0] && of_match_type_or_name(node, m))
> + return m;
> + m++;
> + }
>
> return NULL;
> }
> --
> 1.8.5.3
>
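The two-pass lookup described in the commit message can be sketched in plain
user-space C. The types below are simplified stand-ins for struct
of_device_id and struct device_node, an exact strcmp() replaces
of_compat_cmp(), and the names match_entry, node, type_or_name_ok() and
match_node() are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct of_device_id; "" means "don't care". */
struct match_entry {
	const char *name;
	const char *type;
	const char *compatible;
};

/* Simplified stand-in for struct device_node. */
struct node {
	const char *name;
	const char *type;
	const char *const *compat;	/* NULL-terminated, most specific first */
};

static int type_or_name_ok(const struct node *n, const struct match_entry *m)
{
	if (m->name[0] && (!n->name || strcmp(m->name, n->name)))
		return 0;
	if (m->type[0] && (!n->type || strcmp(m->type, n->type)))
		return 0;
	return 1;
}

/* An all-empty entry terminates the table. */
static int is_sentinel(const struct match_entry *m)
{
	return !m->name[0] && !m->type[0] && !m->compatible[0];
}

static const struct match_entry *match_node(const struct match_entry *tbl,
					    const struct node *n)
{
	const struct match_entry *m;
	size_t i;

	/* Pass 1: walk the node's compatible strings in order */
	for (i = 0; n->compat && n->compat[i]; i++)
		for (m = tbl; !is_sentinel(m); m++)
			if (m->compatible[0] &&
			    !strcmp(m->compatible, n->compat[i]) &&
			    type_or_name_ok(n, m))
				return m;

	/* Pass 2: only now consider entries without a compatible string */
	for (m = tbl; !is_sentinel(m); m++)
		if (!m->compatible[0] && type_or_name_ok(n, m))
			return m;

	return NULL;
}
```

With this shape a type-only entry placed first in the table no longer shadows
a more specific compatible entry further down, so the order of the table
entries stops mattering for nodes that do carry a compatible property.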
^ permalink raw reply
* Re: [PATCH RFC v7 1/6] dma: mpc512x: reorder mpc8308 specific instructions
From: Gerhard Sittig @ 2014-02-12 19:21 UTC (permalink / raw)
To: Alexander Popov
Cc: Lars-Peter Clausen, Arnd Bergmann, Vinod Koul, Dan Williams,
Anatolij Gustschin, linuxppc-dev
In-Reply-To: <1392211508-23615-2-git-send-email-a13xp0p0v88@gmail.com>
[ removed DT from Cc: ]
On Wed, Feb 12, 2014 at 17:25 +0400, Alexander Popov wrote:
>
> Concentrate the specific code for MPC8308 in the 'if' branch
> and handle MPC512x in the 'else' branch.
> This modification only reorders instructions but doesn't change behaviour.
As this one is an obvious and straightforward improvement, it
can be taken regardless of the remainder of the series. (I guess
this previously stated judgement is what Alexander derived the
Acked-by tags from.)
virtually yours
Gerhard Sittig
--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr. 5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-0 Fax: +49-8142-66989-80 Email: office@denx.de
^ permalink raw reply
* Re: [PATCH V2] powerpc: thp: Fix crash on mremap
From: Aneesh Kumar K.V @ 2014-02-12 15:11 UTC (permalink / raw)
To: Greg KH; +Cc: paulus, linuxppc-dev, Kirill A. Shutemov, stable
In-Reply-To: <20140212142334.GB7688@kroah.com>
Greg KH <gregkh@linuxfoundation.org> writes:
> On Wed, Feb 12, 2014 at 08:22:02AM +0530, Aneesh Kumar K.V wrote:
>> Greg KH <gregkh@linuxfoundation.org> writes:
>>
>> > On Fri, Feb 07, 2014 at 07:21:57PM +0530, Aneesh Kumar K.V wrote:
>> >> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>> >>
>> >> This patch fixes the below crash
>> >>
>> >> NIP [c00000000004cee4] .__hash_page_thp+0x2a4/0x440
>> >> LR [c0000000000439ac] .hash_page+0x18c/0x5e0
>> >> ...
>> >> Call Trace:
>> >> [c000000736103c40] [00001ffffb000000] 0x1ffffb000000(unreliable)
>> >> [437908.479693] [c000000736103d50] [c0000000000439ac] .hash_page+0x18c/0x5e0
>> >> [437908.479699] [c000000736103e30] [c00000000000924c] .do_hash_page+0x4c/0x58
>> >>
>> >> On ppc64 we use the pgtable for storing the hpte slot information and
>> >> store address to the pgtable at a constant offset (PTRS_PER_PMD) from
>> >> pmd. On mremap, when we switch the pmd, we need to withdraw and deposit
>> >> the pgtable again, so that we find the pgtable at PTRS_PER_PMD offset
>> >> from new pmd.
>> >>
>> >> We also want to move the withdraw and deposit before the set_pmd so
>> >> that, when a page fault finds the pmd as trans huge, we can be sure that
>> >> the pgtable can be located at the offset.
>> >>
>> >> variant of upstream SHA1: b3084f4db3aeb991c507ca774337c7e7893ed04f
>> >> for 3.12 stable series
>> >
>> > This doesn't look like a "variant", it looks totally different. Why
>> > can't I just take the b3084f4db3aeb991c507ca774337c7e7893ed04f patch
>> > (and follow-on fix) for 3.12?
>>
>> Because the code in that function changed in 3.13. Kirill added split
>> ptl locks for huge pte, and we decide whether to withdraw and
>> deposit again based on the ptl locks in 3.13. In 3.12 we do that only
>> for ppc64 using #ifdef
>
> I have no idea what that means...
>
> If you want this patch applied, please be specific as to what is going
> on, why the code is _very_ different, and all of that. Make it
> _obvious_ as to what is happening, and why I would be a fool not to take
> it in the stable tree.
>
> As it is, the code in this patch looks so different that I'm just
> assuming you got something wrong and are trying to really send me
> something else, so I'll just ignore it.
In 3.13 we added the split huge ptl lock, which introduced a separate lock at
pmd level for hugepages (bf929152e9f6c49b66fad4ebf08cc95b02ce48f5). This
required 3592806cfa08b7cca968f793c33f8e9460bab395, i.e. when we move a
huge page, we need to withdraw and deposit the PTE page if we are moving
it across different pmd pages. In 3.13 we did that by checking the spin lock
addresses, i.e. we have
if (new_ptl != old_ptl) {
.....
pgtable = pgtable_trans_huge_withdraw(mm, old_pmd);
pgtable_trans_huge_deposit(mm, new_pmd,pgtable);
...
}
ppc64, even without using split ptl, had a PTE page per pmd entry. The
details for that are explained in the commit message above. So when
we move a huge page we always need to withdraw and deposit the PTE page
on ppc64.
Now in 3.13 we added a new function which did
static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl,
spinlock_t *old_pmd_ptl)
{
/*
* With split pmd lock we also need to move preallocated
* PTE page table if new_pmd is on different PMD page table.
*/
return new_pmd_ptl != old_pmd_ptl;
}
for x86
and on ppc64 we did
static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl,
spinlock_t *old_pmd_ptl)
{
/*
* Archs like ppc64 use pgtable to store per pmd
* specific information. So when we switch the pmd,
* we should also withdraw and deposit the pgtable
*/
return true;
}
i.e., on ppc64 we always withdraw and deposit, and on x86 we do that
only when the spin lock addresses are different.
For 3.12, since we don't have split huge ptl locks yet, we did the below
+#ifdef CONFIG_ARCH_THP_MOVE_PMD_ALWAYS_WITHDRAW
+ /*
+ * Archs like ppc64 use pgtable to store per pmd
+ * specific information. So when we switch the pmd,
+ * we should also withdraw and deposit the pgtable
+ */
+ pgtable = pgtable_trans_huge_withdraw(mm, old_pmd);
+ pgtable_trans_huge_deposit(mm, new_pmd, pgtable);
+#endif
CONFIG_ARCH_THP_MOVE_PMD_ALWAYS_WITHDRAW is only set for PPC64.
-aneesh
^ permalink raw reply
* Re: [PATCH v2] powerpc/powernv: Platform dump interface
From: Vasant Hegde @ 2014-02-12 15:01 UTC (permalink / raw)
To: Anton Blanchard; +Cc: linuxppc-dev
In-Reply-To: <20140209082034.0329833d@kryten>
On 02/09/2014 02:50 AM, Anton Blanchard wrote:
>
> Hi Vasant,
>
>> +static void free_dump_sg_list(struct opal_sg_list *list)
>> +{
>> + struct opal_sg_list *sg1;
>> + while (list) {
>> + sg1 = list->next;
>> + kfree(list);
>> + list = sg1;
>> + }
>> + list = NULL;
>> +}
>> +
>> +/*
>> + * Build dump buffer scatter gather list
>> + */
>> +static struct opal_sg_list *dump_data_to_sglist(void)
>> +{
>> + struct opal_sg_list *sg1, *list = NULL;
>> + void *addr;
>> + int64_t size;
>> +
>> + addr = dump_record.buffer;
>> + size = dump_record.size;
>> +
>> + sg1 = kzalloc(PAGE_SIZE, GFP_KERNEL);
>> + if (!sg1)
>> + goto nomem;
>> +
>> + list = sg1;
>> + sg1->num_entries = 0;
>> + while (size > 0) {
>> + /* Translate virtual address to physical address */
>> + sg1->entry[sg1->num_entries].data =
>> + (void *)(vmalloc_to_pfn(addr) << PAGE_SHIFT);
>> +
>> + if (size > PAGE_SIZE)
>> + sg1->entry[sg1->num_entries].length =
>> PAGE_SIZE;
>> + else
>> + sg1->entry[sg1->num_entries].length = size;
>> +
>> + sg1->num_entries++;
>> + if (sg1->num_entries >= SG_ENTRIES_PER_NODE) {
>> + sg1->next = kzalloc(PAGE_SIZE, GFP_KERNEL);
>> + if (!sg1->next)
>> + goto nomem;
>> +
>> + sg1 = sg1->next;
>> + sg1->num_entries = 0;
>> + }
>> + addr += PAGE_SIZE;
>> + size -= PAGE_SIZE;
>> + }
>> + return list;
>> +
>> +nomem:
>> + pr_err("%s : Failed to allocate memory\n", __func__);
>> + free_dump_sg_list(list);
>> + return NULL;
>> +}
>> +
>> +/*
>> + * Translate sg list address to absolute
>> + */
>> +static void sglist_to_phy_addr(struct opal_sg_list *list)
>> +{
>> + struct opal_sg_list *sg, *next;
>> +
>> + for (sg = list; sg; sg = next) {
>> + next = sg->next;
>> + /* Don't translate NULL pointer for last entry */
>> + if (sg->next)
>> + sg->next = (struct opal_sg_list
>> *)__pa(sg->next);
>> + else
>> + sg->next = NULL;
>> +
>> + /* Convert num_entries to length */
>> + sg->num_entries =
>> + sg->num_entries * sizeof(struct
>> opal_sg_entry) + 16;
>> + }
>> +}
>> +
>> +static void free_dump_data_buf(void)
>> +{
>> + vfree(dump_record.buffer);
>> + dump_record.size = 0;
>> +}
>
Anton,
> This looks identical to the code in opal-flash.c. Considering how
> complicated it is, can we put it somewhere common?
Thanks for the review. Will look into it next week.
-Vasant
>
> Anton
>
^ permalink raw reply
* Re: [PATCH V2] powerpc: thp: Fix crash on mremap
From: Greg KH @ 2014-02-12 14:23 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: linuxppc-dev, paulus, stable, Kirill A. Shutemov
In-Reply-To: <87ioslt54d.fsf@linux.vnet.ibm.com>
On Wed, Feb 12, 2014 at 08:22:02AM +0530, Aneesh Kumar K.V wrote:
> Greg KH <gregkh@linuxfoundation.org> writes:
>
> > On Fri, Feb 07, 2014 at 07:21:57PM +0530, Aneesh Kumar K.V wrote:
> >> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> >>
> >> This patch fixes the below crash
> >>
> >> NIP [c00000000004cee4] .__hash_page_thp+0x2a4/0x440
> >> LR [c0000000000439ac] .hash_page+0x18c/0x5e0
> >> ...
> >> Call Trace:
> >> [c000000736103c40] [00001ffffb000000] 0x1ffffb000000(unreliable)
> >> [437908.479693] [c000000736103d50] [c0000000000439ac] .hash_page+0x18c/0x5e0
> >> [437908.479699] [c000000736103e30] [c00000000000924c] .do_hash_page+0x4c/0x58
> >>
> >> On ppc64 we use the pgtable for storing the hpte slot information and
> >> store address to the pgtable at a constant offset (PTRS_PER_PMD) from
> >> pmd. On mremap, when we switch the pmd, we need to withdraw and deposit
> >> the pgtable again, so that we find the pgtable at PTRS_PER_PMD offset
> >> from new pmd.
> >>
> >> We also want to move the withdraw and deposit before the set_pmd so
> >> that, when a page fault finds the pmd as trans huge, we can be sure that
> >> the pgtable can be located at the offset.
> >>
> >> variant of upstream SHA1: b3084f4db3aeb991c507ca774337c7e7893ed04f
> >> for 3.12 stable series
> >
> > This doesn't look like a "variant", it looks totally different. Why
> > can't I just take the b3084f4db3aeb991c507ca774337c7e7893ed04f patch
> > (and follow-on fix) for 3.12?
>
> Because the code in that function changed in 3.13. Kirill added split
> ptl locks for huge pte, and we decide whether to withdraw and
> deposit again based on the ptl locks in 3.13. In 3.12 we do that only
> for ppc64 using #ifdef
I have no idea what that means...
If you want this patch applied, please be specific as to what is going
on, why the code is _very_ different, and all of that. Make it
_obvious_ as to what is happening, and why I would be a fool not to take
it in the stable tree.
As it is, the code in this patch looks so different that I'm just
assuming you got something wrong and are trying to really send me
something else, so I'll just ignore it.
greg k-h
^ permalink raw reply