* [PATCH 0/3 RFC] Driver core: Add offline/online callbacks for memory_subsys
[not found] ` <3166726.elbgrUIZ0L@vostro.rjw.lan>
@ 2013-05-04 1:01 ` Rafael J. Wysocki
2013-05-04 1:03 ` [PATCH 1/3 RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes Rafael J. Wysocki
` (3 more replies)
0 siblings, 4 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04 1:01 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
vasilis.liaskovitis, Len Brown, linux-mm
Hi,
This is a continuation of this patchset: https://lkml.org/lkml/2013/5/2/214
and it applies on top of it or rather on top of the rebased version (with
build problems fixed) in the bleeding-edge branch of the linux-pm.git tree:
http://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/log/?h=bleeding-edge
An introduction to the first part of the patchset is below, a description of
the current patches follows.
On Thursday, May 02, 2013 02:26:39 PM Rafael J. Wysocki wrote:
> On Monday, April 29, 2013 02:23:59 PM Rafael J. Wysocki wrote:
> >
> > It has been argued for a number of times that in some cases, if a device cannot
> > be gracefully removed from the system, it shouldn't be removed from it at all,
> > because that may lead to a kernel crash. In particular, that will happen if a
> > memory module holding kernel memory is removed, but also removing the last CPU
> > in the system may not be a good idea. [And I can imagine a few other cases
> > like that.]
> >
> > The kernel currently only supports "forced" hot-remove which cannot be stopped
> > once started, so users have no choice but to try to hot-remove stuff and see
> > whether or not that crashes the kernel which is kind of unpleasant. That seems
> > to be based on the "the user knows better" argument according to which users
> > triggering device hot-removal should really know what they are doing, so the
> > kernel doesn't have to worry about that. However, for instance, this pretty
> > much isn't the case for memory modules, because the users have no way to see
> > whether or not any kernel memory has been allocated from a given module.
> >
> > There have been a few attempts to address this issue, but none of them has
> > gained broader acceptance. The following 3 patches are the heart of a new
> > proposal which is based on the idea to introduce device_offline() and
> > device_online() operations along the lines of the existing CPU offline/online
> > mechanism (or, rather, to extend the CPU offline/online so that analogous
> > operations are available for other devices). The way it is supposed to work is
> > that device_offline() will fail if the given device cannot be gracefully
> > removed from the system (in the kernel's view). Once it succeeds, though, the
> > device won't be used any more until either it is removed, or device_online() is
> > run for it. That will allow the ACPI device hot-remove code, for one example,
> > to avoid triggering a non-reversible removal procedure for devices that cannot
> > be removed gracefully.
> >
> > Patch [1/3] introduces device_offline() and device_online() as outlined above.
> > The .offline() and .online() callbacks are only added at the bus type level for
> > now, because that should be sufficient to cover the memory and CPU use cases.
>
> That's [1/4] now and the changes from the previous version are:
> - strtobool() is used in store_online().
> - device_offline_lock has been renamed to device_hotplug_lock (and the
> functions operating it accordingly) following the Toshi's advice.
>
> > Patch [2/3] modifies the CPU hotplug support code to use device_offline() and
> > device_online() to support the sysfs 'online' attribute for CPUs.
>
> That is [2/4] now and it takes cpu_hotplug_driver_lock() around cpu_up() and
> cpu_down().
>
> > Patch [3/3] changes the ACPI device hot-remove code to use device_offline()
> > for checking if graceful removal of devices is possible. The way it does that
> > is to walk the list of "physical" companion devices for each struct acpi_device
> > involved in the operation and call device_offline() for each of them. If any
> > of the device_offline() calls fails (and the hot-removal is not "forced", which
> > is an option), the removal procedure (which is not reversible) is simply not
> > carried out.
>
> That's current [3/4]. It's a bit simpler, because I decided that it would be
> better to have a global 'force_remove' attribute (the semantics of the
> per-profile 'force_remove' wasn't clear and it didn't really add any value over
> a global one). I also added lock/unlock_device_hotplug() around acpi_bus_scan()
> in acpi_scan_bus_device_check() to allow scan handlers to update dev->offline
> for "physical" companion devices safely (the processor's one added by the next
> patch actually does that).
>
> > Of some concern is that device_offline() (and possibly device_online()) is
> > called under physical_node_lock of the corresponding struct acpi_device, which
> > introduces ordering dependency between that lock and device locks for the
> > "physical" devices, but I didn't see any cleaner way to do that (I guess it
> > is avoidable at the expense of added complexity, but for now it's just better
> > to make the code as clean as possible IMO).
>
> Patch [4/4] reworks the ACPI processor driver to use the common hotplug code.
> It basically splits the driver into two parts as described in the changelog,
> where the first part is essentially a scan handler and the second part is
> a driver, but it doesn't bind to struct acpi_device any more. Instead, it
> binds to processor devices under /sys/devices/system/cpu/ (the driver itself
> has a sysfs directory under /sys/bus/cpu/drivers/ which IMHO makes more sense
> than having it under /sys/bus/acpi/drivers/).
>
> The patch at https://patchwork.kernel.org/patch/2506371/ is a prerequisite
> for this series, but I'm going to push it for v3.10-rc2 if no one screams
> bloody murder.
Patch [1/3] in the current series uses acpi_bind_one() to associate memory
block devices with ACPI namespace objects representing memory modules that hold
them. With patch [3/3] that will allow the ACPI core's device hot-remove code
to attempt to offline the memory blocks, if possible, before removing the
modules holding them from the system (and if the offlining fails, the removal
will not be carried out).
Patch [2/3] kind of prepares the (just introduced) driver core's device_online()
and device_offline() for handling memory block devices (becase for those devices
there are multiple types, or levels if you will, of "online").
Finally, patch [3/3] adds .online() and .offline() callbacks to memory_subsys
that are used by the common "online" sysfs attribute and by the ACPI core's
hot-remove code, through device_online() and device_offline().
I hope this is not too ugly. :-)
Thanks,
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/3 RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
2013-05-04 1:01 ` [PATCH 0/3 RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
@ 2013-05-04 1:03 ` Rafael J. Wysocki
2013-05-04 1:04 ` [PATCH 2/3 RFC] Driver core: Introduce types of device "online" Rafael J. Wysocki
` (2 subsequent siblings)
3 siblings, 0 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04 1:03 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
vasilis.liaskovitis, Len Brown, linux-mm
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
During ACPI memory hotplug configuration bind memory blocks residing
in modules removable through the standard ACPI mechanism to struct
acpi_device objects associated with ACPI namespace objects
representing those modules. Accordingly, unbind those memory blocks
from the struct acpi_device objects when the memory modules in
question are being removed.
When "offline" operation for devices representing memory blocks is
introduced, this will allow the ACPI core's device hot-remove code to
use it to carry out remove_memory() for those memory blocks and check
the results of that before it actually removes the modules holding
them from the system.
Since walk_memory_range() is used for accessing all memory blocks
corresponding to a given ACPI namespace object, it is exported from
memory_hotplug.c so that the code in acpi_memhotplug.c can use it.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
drivers/acpi/acpi_memhotplug.c | 53 ++++++++++++++++++++++++++++++++++++++---
include/linux/memory_hotplug.h | 2 +
mm/memory_hotplug.c | 4 ++-
3 files changed, 55 insertions(+), 4 deletions(-)
Index: linux-pm/mm/memory_hotplug.c
===================================================================
--- linux-pm.orig/mm/memory_hotplug.c
+++ linux-pm/mm/memory_hotplug.c
@@ -1618,6 +1618,7 @@ int offline_pages(unsigned long start_pf
{
return __offline_pages(start_pfn, start_pfn + nr_pages, 120 * HZ);
}
+#endif /* CONFIG_MEMORY_HOTREMOVE */
/**
* walk_memory_range - walks through all mem sections in [start_pfn, end_pfn)
@@ -1631,7 +1632,7 @@ int offline_pages(unsigned long start_pf
*
* Returns the return value of func.
*/
-static int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
+int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
void *arg, int (*func)(struct memory_block *, void *))
{
struct memory_block *mem = NULL;
@@ -1668,6 +1669,7 @@ static int walk_memory_range(unsigned lo
return 0;
}
+#ifdef CONFIG_MEMORY_HOTREMOVE
/**
* offline_memory_block_cb - callback function for offlining memory block
* @mem: the memory block to be offlined
Index: linux-pm/include/linux/memory_hotplug.h
===================================================================
--- linux-pm.orig/include/linux/memory_hotplug.h
+++ linux-pm/include/linux/memory_hotplug.h
@@ -245,6 +245,8 @@ static inline int is_mem_section_removab
static inline void try_offline_node(int nid) {}
#endif /* CONFIG_MEMORY_HOTREMOVE */
+extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
+ void *arg, int (*func)(struct memory_block *, void *));
extern int mem_online_node(int nid);
extern int add_memory(int nid, u64 start, u64 size);
extern int arch_add_memory(int nid, u64 start, u64 size);
Index: linux-pm/drivers/acpi/acpi_memhotplug.c
===================================================================
--- linux-pm.orig/drivers/acpi/acpi_memhotplug.c
+++ linux-pm/drivers/acpi/acpi_memhotplug.c
@@ -28,6 +28,7 @@
*/
#include <linux/acpi.h>
+#include <linux/memory.h>
#include <linux/memory_hotplug.h>
#include "internal.h"
@@ -166,13 +167,50 @@ static int acpi_memory_check_device(stru
return 0;
}
+static unsigned long acpi_meminfo_start_pfn(struct acpi_memory_info *info)
+{
+ return PFN_DOWN(info->start_addr);
+}
+
+static unsigned long acpi_meminfo_end_pfn(struct acpi_memory_info *info)
+{
+ return PFN_UP(info->start_addr + info->length-1);
+}
+
+static int acpi_bind_memblk(struct memory_block *mem, void *arg)
+{
+ return acpi_bind_one(&mem->dev, (acpi_handle)arg);
+}
+
+static int acpi_bind_memory_blocks(struct acpi_memory_info *info,
+ acpi_handle handle)
+{
+ return walk_memory_range(acpi_meminfo_start_pfn(info),
+ acpi_meminfo_end_pfn(info), (void *)handle,
+ acpi_bind_memblk);
+}
+
+static int acpi_unbind_memblk(struct memory_block *mem, void *arg)
+{
+ acpi_unbind_one(&mem->dev);
+ return 0;
+}
+
+static void acpi_unbind_memory_blocks(struct acpi_memory_info *info,
+ acpi_handle handle)
+{
+ walk_memory_range(acpi_meminfo_start_pfn(info),
+ acpi_meminfo_end_pfn(info), NULL, acpi_unbind_memblk);
+}
+
static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
{
+ acpi_handle handle = mem_device->device->handle;
int result, num_enabled = 0;
struct acpi_memory_info *info;
int node;
- node = acpi_get_node(mem_device->device->handle);
+ node = acpi_get_node(handle);
/*
* Tell the VM there is more memory here...
* Note: Assume that this function returns zero on success
@@ -203,6 +241,12 @@ static int acpi_memory_enable_device(str
if (result && result != -EEXIST)
continue;
+ result = acpi_bind_memory_blocks(info, handle);
+ if (result) {
+ acpi_unbind_memory_blocks(info, handle);
+ return -ENODEV;
+ }
+
info->enabled = 1;
/*
@@ -229,10 +273,11 @@ static int acpi_memory_enable_device(str
static int acpi_memory_remove_memory(struct acpi_memory_device *mem_device)
{
+ acpi_handle handle = mem_device->device->handle;
int result = 0, nid;
struct acpi_memory_info *info, *n;
- nid = acpi_get_node(mem_device->device->handle);
+ nid = acpi_get_node(handle);
list_for_each_entry_safe(info, n, &mem_device->res_list, list) {
if (!info->enabled)
@@ -240,6 +285,8 @@ static int acpi_memory_remove_memory(str
if (nid < 0)
nid = memory_add_physaddr_to_nid(info->start_addr);
+
+ acpi_unbind_memory_blocks(info, handle);
result = remove_memory(nid, info->start_addr, info->length);
if (result)
return result;
@@ -300,7 +347,7 @@ static int acpi_memory_device_add(struct
if (result) {
dev_err(&device->dev, "acpi_memory_enable_device() error\n");
acpi_memory_device_free(mem_device);
- return -ENODEV;
+ return result;
}
dev_dbg(&device->dev, "Memory device configured by ACPI\n");
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 2/3 RFC] Driver core: Introduce types of device "online"
2013-05-04 1:01 ` [PATCH 0/3 RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
2013-05-04 1:03 ` [PATCH 1/3 RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes Rafael J. Wysocki
@ 2013-05-04 1:04 ` Rafael J. Wysocki
2013-05-04 1:06 ` [PATCH 3/3 RFC] Driver core: Introduce offline/online callbacks for memory blocks Rafael J. Wysocki
2013-05-04 11:11 ` [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
3 siblings, 0 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04 1:04 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
vasilis.liaskovitis, Len Brown, linux-mm
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
For memory blocks there are multiple ways in which they can be
"online" that determine what can be done with the given block.
For this reason, to allow the generic device_offline() and
device_online() to be used for devices representing memory
blocks, introduce a second "online type" argument for
device_online() that will be interpreted by the bus type whose
.online() callback is executed by device_online().
Of course, that requires some changes to be made in struct device
and struct bus_type, and the code related to device_online()
and device_offline() needs to be changed as well.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
drivers/acpi/acpi_processor.c | 2 +-
drivers/acpi/scan.c | 14 ++++++++------
drivers/base/core.c | 36 ++++++++++++++++++++----------------
drivers/base/cpu.c | 5 ++++-
include/acpi/acpi_bus.h | 2 +-
include/linux/device.h | 8 ++++----
6 files changed, 38 insertions(+), 29 deletions(-)
Index: linux-pm/include/linux/device.h
===================================================================
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -108,7 +108,7 @@ struct bus_type {
int (*remove)(struct device *dev);
void (*shutdown)(struct device *dev);
- int (*online)(struct device *dev);
+ int (*online)(struct device *dev, unsigned int type);
int (*offline)(struct device *dev);
int (*suspend)(struct device *dev, pm_message_t state);
@@ -656,7 +656,7 @@ struct acpi_dev_node {
* gone away. This should be set by the allocator of the
* device (i.e. the bus driver that discovered the device).
* @offline_disabled: If set, the device is permanently online.
- * @offline: Set after successful invocation of bus type's .offline().
+ * @online_type: 0 if the device is offline, otherwise bus type dependent.
*
* At the lowest level, every device in a Linux system is represented by an
* instance of struct device. The device structure contains the information
@@ -730,8 +730,8 @@ struct device {
void (*release)(struct device *dev);
struct iommu_group *iommu_group;
+ unsigned int online_type;
bool offline_disabled:1;
- bool offline:1;
};
static inline struct device *kobj_to_dev(struct kobject *kobj)
@@ -876,7 +876,7 @@ static inline bool device_supports_offli
extern void lock_device_hotplug(void);
extern void unlock_device_hotplug(void);
extern int device_offline(struct device *dev);
-extern int device_online(struct device *dev);
+extern int device_online(struct device *dev, unsigned int type);
/*
* Root device objects for grouping under /sys/devices
*/
Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -406,10 +406,10 @@ static struct device_attribute uevent_at
static ssize_t show_online(struct device *dev, struct device_attribute *attr,
char *buf)
{
- bool val;
+ unsigned int val;
lock_device_hotplug();
- val = !dev->offline;
+ val = dev->online_type;
unlock_device_hotplug();
return sprintf(buf, "%u\n", val);
}
@@ -417,15 +417,15 @@ static ssize_t show_online(struct device
static ssize_t store_online(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
- bool val;
+ unsigned int val;
int ret;
- ret = strtobool(buf, &val);
+ ret = kstrtouint(buf, 10, &val);
if (ret < 0)
return ret;
lock_device_hotplug();
- ret = val ? device_online(dev) : device_offline(dev);
+ ret = val ? device_online(dev, val) : device_offline(dev);
unlock_device_hotplug();
return ret < 0 ? ret : count;
}
@@ -1488,7 +1488,7 @@ static int device_check_offline(struct d
if (ret)
return ret;
- return device_supports_offline(dev) && !dev->offline ? -EBUSY : 0;
+ return device_supports_offline(dev) && !!dev->online_type ? -EBUSY : 0;
}
/**
@@ -1515,14 +1515,14 @@ int device_offline(struct device *dev)
device_lock(dev);
if (device_supports_offline(dev)) {
- if (dev->offline) {
- ret = 1;
- } else {
+ if (dev->online_type) {
ret = dev->bus->offline(dev);
if (!ret) {
kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
- dev->offline = true;
+ dev->online_type = 0;
}
+ } else {
+ ret = 1;
}
}
device_unlock(dev);
@@ -1533,6 +1533,7 @@ int device_offline(struct device *dev)
/**
* device_online - Put the device back online after successful device_offline().
* @dev: Device to be put back online.
+ * @type: Interpreted by the bus type, must be nonzero.
*
* If device_offline() has been successfully executed for @dev, but the device
* has not been removed subsequently, execute its bus type's .online() callback
@@ -1540,20 +1541,23 @@ int device_offline(struct device *dev)
*
* Call under device_hotplug_lock.
*/
-int device_online(struct device *dev)
+int device_online(struct device *dev, unsigned int type)
{
int ret = 0;
+ if (!type)
+ return -EINVAL;
+
device_lock(dev);
if (device_supports_offline(dev)) {
- if (dev->offline) {
- ret = dev->bus->online(dev);
+ if (dev->online_type) {
+ ret = 1;
+ } else {
+ ret = dev->bus->online(dev, type);
if (!ret) {
kobject_uevent(&dev->kobj, KOBJ_ONLINE);
- dev->offline = false;
+ dev->online_type = type;
}
- } else {
- ret = 1;
}
}
device_unlock(dev);
Index: linux-pm/include/acpi/acpi_bus.h
===================================================================
--- linux-pm.orig/include/acpi/acpi_bus.h
+++ linux-pm/include/acpi/acpi_bus.h
@@ -286,7 +286,7 @@ struct acpi_device_physical_node {
u8 node_id;
struct list_head node;
struct device *dev;
- bool put_online:1;
+ unsigned int online_type;
};
/* set maximum of physical nodes to 32 for expansibility */
Index: linux-pm/drivers/acpi/scan.c
===================================================================
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -141,15 +141,17 @@ static acpi_status acpi_bus_offline_comp
list_for_each_entry(pn, &device->physical_node_list, node) {
int ret;
+ pn->online_type = pn->dev->online_type;
ret = device_offline(pn->dev);
- if (acpi_force_hot_remove)
+ if (acpi_force_hot_remove) {
+ pn->online_type = 0;
continue;
-
+ }
if (ret < 0) {
+ pn->online_type = 0;
status = AE_ERROR;
break;
}
- pn->put_online = !ret;
}
mutex_unlock(&device->physical_node_lock);
@@ -169,9 +171,9 @@ static acpi_status acpi_bus_online_compa
mutex_lock(&device->physical_node_lock);
list_for_each_entry(pn, &device->physical_node_list, node)
- if (pn->put_online) {
- device_online(pn->dev);
- pn->put_online = false;
+ if (pn->online_type) {
+ device_online(pn->dev, pn->online_type);
+ pn->online_type = 0;
}
mutex_unlock(&device->physical_node_lock);
Index: linux-pm/drivers/acpi/acpi_processor.c
===================================================================
--- linux-pm.orig/drivers/acpi/acpi_processor.c
+++ linux-pm/drivers/acpi/acpi_processor.c
@@ -395,7 +395,7 @@ static int __cpuinit acpi_processor_add(
goto err;
pr->dev = dev;
- dev->offline = pr->flags.need_hotplug_init;
+ dev->online_type = !pr->flags.need_hotplug_init;
/* Trigger the processor driver's .probe() if present. */
if (device_attach(dev) >= 0)
Index: linux-pm/drivers/base/cpu.c
===================================================================
--- linux-pm.orig/drivers/base/cpu.c
+++ linux-pm/drivers/base/cpu.c
@@ -38,13 +38,16 @@ static void change_cpu_under_node(struct
cpu->node_id = to_nid;
}
-static int __ref cpu_subsys_online(struct device *dev)
+static int __ref cpu_subsys_online(struct device *dev, unsigned int type)
{
struct cpu *cpu = container_of(dev, struct cpu, dev);
int cpuid = dev->id;
int from_nid, to_nid;
int ret;
+ if (type > 1)
+ return -EINVAL;
+
cpu_hotplug_driver_lock();
from_nid = cpu_to_node(cpuid);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 3/3 RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-04 1:01 ` [PATCH 0/3 RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
2013-05-04 1:03 ` [PATCH 1/3 RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes Rafael J. Wysocki
2013-05-04 1:04 ` [PATCH 2/3 RFC] Driver core: Introduce types of device "online" Rafael J. Wysocki
@ 2013-05-04 1:06 ` Rafael J. Wysocki
2013-05-04 11:11 ` [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
3 siblings, 0 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04 1:06 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
vasilis.liaskovitis, Len Brown, linux-mm
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Introduce .offline() and .online() callbacks for memory_subsys
that will allow the generic device_offline() and device_online()
to be used with device objects representing memory blocks. That,
in turn, allows the ACPI subsystem to use device_offline() to put
removable memory blocks offline, if possible, before removing
memory modules holding them.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
drivers/base/memory.c | 84 ++++++++++++++++++++++++++++++-----------
include/linux/memory_hotplug.h | 2
2 files changed, 64 insertions(+), 22 deletions(-)
Index: linux-pm/drivers/base/memory.c
===================================================================
--- linux-pm.orig/drivers/base/memory.c
+++ linux-pm/drivers/base/memory.c
@@ -37,9 +37,14 @@ static inline int base_memory_block_id(i
return section_nr / sections_per_block;
}
+static int memory_subsys_online(struct device *dev, unsigned int type);
+static int memory_subsys_offline(struct device *dev);
+
static struct bus_type memory_subsys = {
.name = MEMORY_CLASS_NAME,
.dev_name = MEMORY_CLASS_NAME,
+ .online = memory_subsys_online,
+ .offline = memory_subsys_offline,
};
static BLOCKING_NOTIFIER_HEAD(memory_chain);
@@ -294,16 +299,7 @@ static int __memory_block_change_state(s
}
mem->state = to_state;
- switch (mem->state) {
- case MEM_OFFLINE:
- kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
- break;
- case MEM_ONLINE:
- kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
- break;
- default:
- break;
- }
+
out:
return ret;
}
@@ -321,27 +317,66 @@ static int memory_block_change_state(str
return ret;
}
+
+static int memory_subsys_online(struct device *dev, unsigned int type)
+{
+ struct memory_block *mem = container_of(dev, struct memory_block, dev);
+
+ if (type < ONLINE_KEEP || type > ONLINE_KERNEL)
+ return -EINVAL;
+
+ return memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE, type);
+}
+
+static int memory_block_online(struct device *dev, unsigned int type)
+{
+ int ret = memory_subsys_online(dev, type);
+
+ if (!ret) {
+ dev->online_type = type;
+ kobject_uevent(&dev->kobj, KOBJ_ONLINE);
+ }
+
+ return ret;
+}
+
+static int memory_subsys_offline(struct device *dev)
+{
+ struct memory_block *mem = container_of(dev, struct memory_block, dev);
+
+ return memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+}
+
+static int memory_block_offline(struct device *dev)
+{
+ int ret = memory_subsys_offline(dev);
+
+ if (!ret) {
+ dev->online_type = 0;
+ kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
+ }
+
+ return ret;
+}
+
static ssize_t
store_mem_state(struct device *dev,
struct device_attribute *attr, const char *buf, size_t count)
{
- struct memory_block *mem;
int ret = -EINVAL;
- mem = container_of(dev, struct memory_block, dev);
+ lock_device_hotplug();
if (!strncmp(buf, "online_kernel", min_t(int, count, 13)))
- ret = memory_block_change_state(mem, MEM_ONLINE,
- MEM_OFFLINE, ONLINE_KERNEL);
+ ret = memory_block_online(dev, ONLINE_KERNEL);
else if (!strncmp(buf, "online_movable", min_t(int, count, 14)))
- ret = memory_block_change_state(mem, MEM_ONLINE,
- MEM_OFFLINE, ONLINE_MOVABLE);
+ ret = memory_block_online(dev, ONLINE_MOVABLE);
else if (!strncmp(buf, "online", min_t(int, count, 6)))
- ret = memory_block_change_state(mem, MEM_ONLINE,
- MEM_OFFLINE, ONLINE_KEEP);
+ ret = memory_block_online(dev, ONLINE_KEEP);
else if(!strncmp(buf, "offline", min_t(int, count, 7)))
- ret = memory_block_change_state(mem, MEM_OFFLINE,
- MEM_ONLINE, -1);
+ ret = memory_block_offline(dev);
+
+ unlock_device_hotplug();
if (ret)
return ret;
@@ -686,10 +721,17 @@ int offline_memory_block(struct memory_b
{
int ret = 0;
+ lock_device_hotplug();
mutex_lock(&mem->state_mutex);
- if (mem->state != MEM_OFFLINE)
+ if (mem->state != MEM_OFFLINE) {
ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+ if (!ret) {
+ mem->dev.online_type = 0;
+ kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
+ }
+ }
mutex_unlock(&mem->state_mutex);
+ unlock_device_hotplug();
return ret;
}
Index: linux-pm/include/linux/memory_hotplug.h
===================================================================
--- linux-pm.orig/include/linux/memory_hotplug.h
+++ linux-pm/include/linux/memory_hotplug.h
@@ -28,7 +28,7 @@ enum {
/* Types for control the zone type of onlined memory */
enum {
- ONLINE_KEEP,
+ ONLINE_KEEP = 1,
ONLINE_KERNEL,
ONLINE_MOVABLE,
};
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys
2013-05-04 1:01 ` [PATCH 0/3 RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
` (2 preceding siblings ...)
2013-05-04 1:06 ` [PATCH 3/3 RFC] Driver core: Introduce offline/online callbacks for memory blocks Rafael J. Wysocki
@ 2013-05-04 11:11 ` Rafael J. Wysocki
2013-05-04 11:12 ` [PATCH 1/2 v2, RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes Rafael J. Wysocki
` (2 more replies)
3 siblings, 3 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04 11:11 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
vasilis.liaskovitis, Len Brown, linux-mm
Hi,
On Saturday, May 04, 2013 03:01:23 AM Rafael J. Wysocki wrote:
> Hi,
>
> This is a continuation of this patchset: https://lkml.org/lkml/2013/5/2/214
> and it applies on top of it or rather on top of the rebased version (with
> build problems fixed) in the bleeding-edge branch of the linux-pm.git tree:
>
> http://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/log/?h=bleeding-edge
>
> An introduction to the first part of the patchset is below, a description of
> the current patches follows.
Actually, I'm withdrawing the previous version of this patchset (or rather
patches [2-3/3] from it), because I had a better idea in the meantime.
Patch [1/2] is the same as the previous [1/3] ->
> On Thursday, May 02, 2013 02:26:39 PM Rafael J. Wysocki wrote:
> > On Monday, April 29, 2013 02:23:59 PM Rafael J. Wysocki wrote:
> > >
> > > It has been argued for a number of times that in some cases, if a device cannot
> > > be gracefully removed from the system, it shouldn't be removed from it at all,
> > > because that may lead to a kernel crash. In particular, that will happen if a
> > > memory module holding kernel memory is removed, but also removing the last CPU
> > > in the system may not be a good idea. [And I can imagine a few other cases
> > > like that.]
> > >
> > > The kernel currently only supports "forced" hot-remove which cannot be stopped
> > > once started, so users have no choice but to try to hot-remove stuff and see
> > > whether or not that crashes the kernel which is kind of unpleasant. That seems
> > > to be based on the "the user knows better" argument according to which users
> > > triggering device hot-removal should really know what they are doing, so the
> > > kernel doesn't have to worry about that. However, for instance, this pretty
> > > much isn't the case for memory modules, because the users have no way to see
> > > whether or not any kernel memory has been allocated from a given module.
> > >
> > > There have been a few attempts to address this issue, but none of them has
> > > gained broader acceptance. The following 3 patches are the heart of a new
> > > proposal which is based on the idea to introduce device_offline() and
> > > device_online() operations along the lines of the existing CPU offline/online
> > > mechanism (or, rather, to extend the CPU offline/online so that analogous
> > > operations are available for other devices). The way it is supposed to work is
> > > that device_offline() will fail if the given device cannot be gracefully
> > > removed from the system (in the kernel's view). Once it succeeds, though, the
> > > device won't be used any more until either it is removed, or device_online() is
> > > run for it. That will allow the ACPI device hot-remove code, for one example,
> > > to avoid triggering a non-reversible removal procedure for devices that cannot
> > > be removed gracefully.
> > >
> > > Patch [1/3] introduces device_offline() and device_online() as outlined above.
> > > The .offline() and .online() callbacks are only added at the bus type level for
> > > now, because that should be sufficient to cover the memory and CPU use cases.
> >
> > That's [1/4] now and the changes from the previous version are:
> > - strtobool() is used in store_online().
> > - device_offline_lock has been renamed to device_hotplug_lock (and the
> > functions operating it accordingly) following the Toshi's advice.
> >
> > > Patch [2/3] modifies the CPU hotplug support code to use device_offline() and
> > > device_online() to support the sysfs 'online' attribute for CPUs.
> >
> > That is [2/4] now and it takes cpu_hotplug_driver_lock() around cpu_up() and
> > cpu_down().
> >
> > > Patch [3/3] changes the ACPI device hot-remove code to use device_offline()
> > > for checking if graceful removal of devices is possible. The way it does that
> > > is to walk the list of "physical" companion devices for each struct acpi_device
> > > involved in the operation and call device_offline() for each of them. If any
> > > of the device_offline() calls fails (and the hot-removal is not "forced", which
> > > is an option), the removal procedure (which is not reversible) is simply not
> > > carried out.
> >
> > That's current [3/4]. It's a bit simpler, because I decided that it would be
> > better to have a global 'force_remove' attribute (the semantics of the
> > per-profile 'force_remove' wasn't clear and it didn't really add any value over
> > a global one). I also added lock/unlock_device_hotplug() around acpi_bus_scan()
> > in acpi_scan_bus_device_check() to allow scan handlers to update dev->offline
> > for "physical" companion devices safely (the processor's one added by the next
> > patch actually does that).
> >
> > > Of some concern is that device_offline() (and possibly device_online()) is
> > > called under physical_node_lock of the corresponding struct acpi_device, which
> > > introduces ordering dependency between that lock and device locks for the
> > > "physical" devices, but I didn't see any cleaner way to do that (I guess it
> > > is avoidable at the expense of added complexity, but for now it's just better
> > > to make the code as clean as possible IMO).
> >
> > Patch [4/4] reworks the ACPI processor driver to use the common hotplug code.
> > It basically splits the driver into two parts as described in the changelog,
> > where the first part is essentially a scan handler and the second part is
> > a driver, but it doesn't bind to struct acpi_device any more. Instead, it
> > binds to processor devices under /sys/devices/system/cpu/ (the driver itself
> > has a sysfs directory under /sys/bus/cpu/drivers/ which IMHO makes more sense
> > than having it under /sys/bus/acpi/drivers/).
> >
> > The patch at https://patchwork.kernel.org/patch/2506371/ is a prerequisite
> > for this series, but I'm going to push it for v3.10-rc2 if no one screams
> > bloody murder.
-> (this is [1/2] now):
> Patch [1/3] in the current series uses acpi_bind_one() to associate memory
> block devices with ACPI namespace objects representing memory modules that hold
> them. With patch [3/3] that will allow the ACPI core's device hot-remove code
> to attempt to offline the memory blocks, if possible, before removing the
> modules holding them from the system (and if the offlining fails, the removal
> will not be carried out).
Patch [2/2] adds .online() and .offline() callbacks to memory_subsys
that are used by the common "online" sysfs attribute and by the ACPI core's
hot-remove code, through device_online() and device_offline().
The way it is supposed to work is that device_offline() will attempt to put
memory blocks offline and device_online() will online them and attempt to
apply the last online type previously used to them.
Thanks,
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/2 v2, RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
2013-05-04 11:11 ` [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
@ 2013-05-04 11:12 ` Rafael J. Wysocki
2013-05-21 6:50 ` Tang Chen
2013-05-04 11:21 ` [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks Rafael J. Wysocki
2013-05-06 10:48 ` [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
2 siblings, 1 reply; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04 11:12 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
vasilis.liaskovitis, Len Brown, linux-mm
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
During ACPI memory hotplug configuration bind memory blocks residing
in modules removable through the standard ACPI mechanism to struct
acpi_device objects associated with ACPI namespace objects
representing those modules. Accordingly, unbind those memory blocks
from the struct acpi_device objects when the memory modules in
question are being removed.
When "offline" operation for devices representing memory blocks is
introduced, this will allow the ACPI core's device hot-remove code to
use it to carry out remove_memory() for those memory blocks and check
the results of that before it actually removes the modules holding
them from the system.
Since walk_memory_range() is used for accessing all memory blocks
corresponding to a given ACPI namespace object, it is exported from
memory_hotplug.c so that the code in acpi_memhotplug.c can use it.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
drivers/acpi/acpi_memhotplug.c | 53 ++++++++++++++++++++++++++++++++++++++---
include/linux/memory_hotplug.h | 2 +
mm/memory_hotplug.c | 4 ++-
3 files changed, 55 insertions(+), 4 deletions(-)
Index: linux-pm/mm/memory_hotplug.c
===================================================================
--- linux-pm.orig/mm/memory_hotplug.c
+++ linux-pm/mm/memory_hotplug.c
@@ -1618,6 +1618,7 @@ int offline_pages(unsigned long start_pf
{
return __offline_pages(start_pfn, start_pfn + nr_pages, 120 * HZ);
}
+#endif /* CONFIG_MEMORY_HOTREMOVE */
/**
* walk_memory_range - walks through all mem sections in [start_pfn, end_pfn)
@@ -1631,7 +1632,7 @@ int offline_pages(unsigned long start_pf
*
* Returns the return value of func.
*/
-static int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
+int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
void *arg, int (*func)(struct memory_block *, void *))
{
struct memory_block *mem = NULL;
@@ -1668,6 +1669,7 @@ static int walk_memory_range(unsigned lo
return 0;
}
+#ifdef CONFIG_MEMORY_HOTREMOVE
/**
* offline_memory_block_cb - callback function for offlining memory block
* @mem: the memory block to be offlined
Index: linux-pm/include/linux/memory_hotplug.h
===================================================================
--- linux-pm.orig/include/linux/memory_hotplug.h
+++ linux-pm/include/linux/memory_hotplug.h
@@ -245,6 +245,8 @@ static inline int is_mem_section_removab
static inline void try_offline_node(int nid) {}
#endif /* CONFIG_MEMORY_HOTREMOVE */
+extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
+ void *arg, int (*func)(struct memory_block *, void *));
extern int mem_online_node(int nid);
extern int add_memory(int nid, u64 start, u64 size);
extern int arch_add_memory(int nid, u64 start, u64 size);
Index: linux-pm/drivers/acpi/acpi_memhotplug.c
===================================================================
--- linux-pm.orig/drivers/acpi/acpi_memhotplug.c
+++ linux-pm/drivers/acpi/acpi_memhotplug.c
@@ -28,6 +28,7 @@
*/
#include <linux/acpi.h>
+#include <linux/memory.h>
#include <linux/memory_hotplug.h>
#include "internal.h"
@@ -166,13 +167,50 @@ static int acpi_memory_check_device(stru
return 0;
}
+static unsigned long acpi_meminfo_start_pfn(struct acpi_memory_info *info)
+{
+ return PFN_DOWN(info->start_addr);
+}
+
+static unsigned long acpi_meminfo_end_pfn(struct acpi_memory_info *info)
+{
+ return PFN_UP(info->start_addr + info->length-1);
+}
+
+static int acpi_bind_memblk(struct memory_block *mem, void *arg)
+{
+ return acpi_bind_one(&mem->dev, (acpi_handle)arg);
+}
+
+static int acpi_bind_memory_blocks(struct acpi_memory_info *info,
+ acpi_handle handle)
+{
+ return walk_memory_range(acpi_meminfo_start_pfn(info),
+ acpi_meminfo_end_pfn(info), (void *)handle,
+ acpi_bind_memblk);
+}
+
+static int acpi_unbind_memblk(struct memory_block *mem, void *arg)
+{
+ acpi_unbind_one(&mem->dev);
+ return 0;
+}
+
+static void acpi_unbind_memory_blocks(struct acpi_memory_info *info,
+ acpi_handle handle)
+{
+ walk_memory_range(acpi_meminfo_start_pfn(info),
+ acpi_meminfo_end_pfn(info), NULL, acpi_unbind_memblk);
+}
+
static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
{
+ acpi_handle handle = mem_device->device->handle;
int result, num_enabled = 0;
struct acpi_memory_info *info;
int node;
- node = acpi_get_node(mem_device->device->handle);
+ node = acpi_get_node(handle);
/*
* Tell the VM there is more memory here...
* Note: Assume that this function returns zero on success
@@ -203,6 +241,12 @@ static int acpi_memory_enable_device(str
if (result && result != -EEXIST)
continue;
+ result = acpi_bind_memory_blocks(info, handle);
+ if (result) {
+ acpi_unbind_memory_blocks(info, handle);
+ return -ENODEV;
+ }
+
info->enabled = 1;
/*
@@ -229,10 +273,11 @@ static int acpi_memory_enable_device(str
static int acpi_memory_remove_memory(struct acpi_memory_device *mem_device)
{
+ acpi_handle handle = mem_device->device->handle;
int result = 0, nid;
struct acpi_memory_info *info, *n;
- nid = acpi_get_node(mem_device->device->handle);
+ nid = acpi_get_node(handle);
list_for_each_entry_safe(info, n, &mem_device->res_list, list) {
if (!info->enabled)
@@ -240,6 +285,8 @@ static int acpi_memory_remove_memory(str
if (nid < 0)
nid = memory_add_physaddr_to_nid(info->start_addr);
+
+ acpi_unbind_memory_blocks(info, handle);
result = remove_memory(nid, info->start_addr, info->length);
if (result)
return result;
@@ -300,7 +347,7 @@ static int acpi_memory_device_add(struct
if (result) {
dev_err(&device->dev, "acpi_memory_enable_device() error\n");
acpi_memory_device_free(mem_device);
- return -ENODEV;
+ return result;
}
dev_dbg(&device->dev, "Memory device configured by ACPI\n");
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-04 11:11 ` [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
2013-05-04 11:12 ` [PATCH 1/2 v2, RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes Rafael J. Wysocki
@ 2013-05-04 11:21 ` Rafael J. Wysocki
2013-05-06 16:28 ` Vasilis Liaskovitis
` (2 more replies)
2013-05-06 10:48 ` [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
2 siblings, 3 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-04 11:21 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
vasilis.liaskovitis, Len Brown, linux-mm
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Introduce .offline() and .online() callbacks for memory_subsys
that will allow the generic device_offline() and device_online()
to be used with device objects representing memory blocks. That,
in turn, allows the ACPI subsystem to use device_offline() to put
removable memory blocks offline, if possible, before removing
memory modules holding them.
The 'online' sysfs attribute of memory block devices will attempt to
put them offline if 0 is written to it and will attempt to apply the
previously used online type when onlining them (i.e. when 1 is
written to it).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
drivers/base/memory.c | 105 +++++++++++++++++++++++++++++++++++++------------
include/linux/memory.h | 1
2 files changed, 81 insertions(+), 25 deletions(-)
Index: linux-pm/drivers/base/memory.c
===================================================================
--- linux-pm.orig/drivers/base/memory.c
+++ linux-pm/drivers/base/memory.c
@@ -37,9 +37,14 @@ static inline int base_memory_block_id(i
return section_nr / sections_per_block;
}
+static int memory_subsys_online(struct device *dev);
+static int memory_subsys_offline(struct device *dev);
+
static struct bus_type memory_subsys = {
.name = MEMORY_CLASS_NAME,
.dev_name = MEMORY_CLASS_NAME,
+ .online = memory_subsys_online,
+ .offline = memory_subsys_offline,
};
static BLOCKING_NOTIFIER_HEAD(memory_chain);
@@ -278,33 +283,64 @@ static int __memory_block_change_state(s
{
int ret = 0;
- if (mem->state != from_state_req) {
- ret = -EINVAL;
- goto out;
- }
+ if (mem->state != from_state_req)
+ return -EINVAL;
if (to_state == MEM_OFFLINE)
mem->state = MEM_GOING_OFFLINE;
ret = memory_block_action(mem->start_section_nr, to_state, online_type);
-
if (ret) {
mem->state = from_state_req;
- goto out;
+ } else {
+ mem->state = to_state;
+ if (to_state == MEM_ONLINE)
+ mem->last_online = online_type;
}
+ return ret;
+}
- mem->state = to_state;
- switch (mem->state) {
- case MEM_OFFLINE:
- kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
- break;
- case MEM_ONLINE:
- kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
- break;
- default:
- break;
+static int memory_subsys_online(struct device *dev)
+{
+ struct memory_block *mem = container_of(dev, struct memory_block, dev);
+ int ret;
+
+ mutex_lock(&mem->state_mutex);
+ ret = __memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE,
+ mem->last_online);
+ mutex_unlock(&mem->state_mutex);
+ return ret;
+}
+
+static int memory_subsys_offline(struct device *dev)
+{
+ struct memory_block *mem = container_of(dev, struct memory_block, dev);
+ int ret;
+
+ mutex_lock(&mem->state_mutex);
+ ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+ mutex_unlock(&mem->state_mutex);
+ return ret;
+}
+
+static int __memory_block_change_state_uevent(struct memory_block *mem,
+ unsigned long to_state, unsigned long from_state_req,
+ int online_type)
+{
+ int ret = __memory_block_change_state(mem, to_state, from_state_req,
+ online_type);
+ if (!ret) {
+ switch (mem->state) {
+ case MEM_OFFLINE:
+ kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
+ break;
+ case MEM_ONLINE:
+ kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
+ break;
+ default:
+ break;
+ }
}
-out:
return ret;
}
@@ -315,8 +351,8 @@ static int memory_block_change_state(str
int ret;
mutex_lock(&mem->state_mutex);
- ret = __memory_block_change_state(mem, to_state, from_state_req,
- online_type);
+ ret = __memory_block_change_state_uevent(mem, to_state, from_state_req,
+ online_type);
mutex_unlock(&mem->state_mutex);
return ret;
@@ -326,22 +362,34 @@ store_mem_state(struct device *dev,
struct device_attribute *attr, const char *buf, size_t count)
{
struct memory_block *mem;
+ bool offline;
int ret = -EINVAL;
mem = container_of(dev, struct memory_block, dev);
- if (!strncmp(buf, "online_kernel", min_t(int, count, 13)))
+ lock_device_hotplug();
+
+ if (!strncmp(buf, "online_kernel", min_t(int, count, 13))) {
+ offline = false;
ret = memory_block_change_state(mem, MEM_ONLINE,
MEM_OFFLINE, ONLINE_KERNEL);
- else if (!strncmp(buf, "online_movable", min_t(int, count, 14)))
+ } else if (!strncmp(buf, "online_movable", min_t(int, count, 14))) {
+ offline = false;
ret = memory_block_change_state(mem, MEM_ONLINE,
MEM_OFFLINE, ONLINE_MOVABLE);
- else if (!strncmp(buf, "online", min_t(int, count, 6)))
+ } else if (!strncmp(buf, "online", min_t(int, count, 6))) {
+ offline = false;
ret = memory_block_change_state(mem, MEM_ONLINE,
MEM_OFFLINE, ONLINE_KEEP);
- else if(!strncmp(buf, "offline", min_t(int, count, 7)))
+ } else if(!strncmp(buf, "offline", min_t(int, count, 7))) {
+ offline = true;
ret = memory_block_change_state(mem, MEM_OFFLINE,
MEM_ONLINE, -1);
+ }
+ if (!ret)
+ dev->offline = offline;
+
+ unlock_device_hotplug();
if (ret)
return ret;
@@ -563,6 +611,7 @@ static int init_memory_block(struct memo
base_memory_block_id(scn_nr) * sections_per_block;
mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
mem->state = state;
+ mem->last_online = ONLINE_KEEP;
mem->section_count++;
mutex_init(&mem->state_mutex);
start_pfn = section_nr_to_pfn(mem->start_section_nr);
@@ -686,10 +735,16 @@ int offline_memory_block(struct memory_b
{
int ret = 0;
+ lock_device_hotplug();
mutex_lock(&mem->state_mutex);
- if (mem->state != MEM_OFFLINE)
- ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+ if (mem->state != MEM_OFFLINE) {
+ ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
+ MEM_ONLINE, -1);
+ if (!ret)
+ mem->dev.offline = true;
+ }
mutex_unlock(&mem->state_mutex);
+ unlock_device_hotplug();
return ret;
}
Index: linux-pm/include/linux/memory.h
===================================================================
--- linux-pm.orig/include/linux/memory.h
+++ linux-pm/include/linux/memory.h
@@ -26,6 +26,7 @@ struct memory_block {
unsigned long start_section_nr;
unsigned long end_section_nr;
unsigned long state;
+ int last_online;
int section_count;
/*
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys
2013-05-04 11:11 ` [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
2013-05-04 11:12 ` [PATCH 1/2 v2, RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes Rafael J. Wysocki
2013-05-04 11:21 ` [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks Rafael J. Wysocki
@ 2013-05-06 10:48 ` Rafael J. Wysocki
2 siblings, 0 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-06 10:48 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
vasilis.liaskovitis, Len Brown, linux-mm
On Saturday, May 04, 2013 01:11:21 PM Rafael J. Wysocki wrote:
> Hi,
>
> On Saturday, May 04, 2013 03:01:23 AM Rafael J. Wysocki wrote:
> > Hi,
> >
> > This is a continuation of this patchset: https://lkml.org/lkml/2013/5/2/214
> > and it applies on top of it or rather on top of the rebased version (with
> > build problems fixed) in the bleeding-edge branch of the linux-pm.git tree:
> >
> > http://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/log/?h=bleeding-edge
> >
> > An introduction to the first part of the patchset is below, a description of
> > the current patches follows.
>
> Actually, I'm withdrawing the previous version of this patchset (or rather
> patches [2-3/3] from it), because I had a better idea in the meantime.
>
> Patch [1/2] is the same as the previous [1/3] ->
>
> > On Thursday, May 02, 2013 02:26:39 PM Rafael J. Wysocki wrote:
> > > On Monday, April 29, 2013 02:23:59 PM Rafael J. Wysocki wrote:
> > > >
> > > > It has been argued for a number of times that in some cases, if a device cannot
> > > > be gracefully removed from the system, it shouldn't be removed from it at all,
> > > > because that may lead to a kernel crash. In particular, that will happen if a
> > > > memory module holding kernel memory is removed, but also removing the last CPU
> > > > in the system may not be a good idea. [And I can imagine a few other cases
> > > > like that.]
> > > >
> > > > The kernel currently only supports "forced" hot-remove which cannot be stopped
> > > > once started, so users have no choice but to try to hot-remove stuff and see
> > > > whether or not that crashes the kernel which is kind of unpleasant. That seems
> > > > to be based on the "the user knows better" argument according to which users
> > > > triggering device hot-removal should really know what they are doing, so the
> > > > kernel doesn't have to worry about that. However, for instance, this pretty
> > > > much isn't the case for memory modules, because the users have no way to see
> > > > whether or not any kernel memory has been allocated from a given module.
> > > >
> > > > There have been a few attempts to address this issue, but none of them has
> > > > gained broader acceptance. The following 3 patches are the heart of a new
> > > > proposal which is based on the idea to introduce device_offline() and
> > > > device_online() operations along the lines of the existing CPU offline/online
> > > > mechanism (or, rather, to extend the CPU offline/online so that analogous
> > > > operations are available for other devices). The way it is supposed to work is
> > > > that device_offline() will fail if the given device cannot be gracefully
> > > > removed from the system (in the kernel's view). Once it succeeds, though, the
> > > > device won't be used any more until either it is removed, or device_online() is
> > > > run for it. That will allow the ACPI device hot-remove code, for one example,
> > > > to avoid triggering a non-reversible removal procedure for devices that cannot
> > > > be removed gracefully.
> > > >
> > > > Patch [1/3] introduces device_offline() and device_online() as outlined above.
> > > > The .offline() and .online() callbacks are only added at the bus type level for
> > > > now, because that should be sufficient to cover the memory and CPU use cases.
> > >
> > > That's [1/4] now and the changes from the previous version are:
> > > - strtobool() is used in store_online().
> > > - device_offline_lock has been renamed to device_hotplug_lock (and the
> > > functions operating it accordingly) following the Toshi's advice.
> > >
> > > > Patch [2/3] modifies the CPU hotplug support code to use device_offline() and
> > > > device_online() to support the sysfs 'online' attribute for CPUs.
> > >
> > > That is [2/4] now and it takes cpu_hotplug_driver_lock() around cpu_up() and
> > > cpu_down().
> > >
> > > > Patch [3/3] changes the ACPI device hot-remove code to use device_offline()
> > > > for checking if graceful removal of devices is possible. The way it does that
> > > > is to walk the list of "physical" companion devices for each struct acpi_device
> > > > involved in the operation and call device_offline() for each of them. If any
> > > > of the device_offline() calls fails (and the hot-removal is not "forced", which
> > > > is an option), the removal procedure (which is not reversible) is simply not
> > > > carried out.
> > >
> > > That's current [3/4]. It's a bit simpler, because I decided that it would be
> > > better to have a global 'force_remove' attribute (the semantics of the
> > > per-profile 'force_remove' wasn't clear and it didn't really add any value over
> > > a global one). I also added lock/unlock_device_hotplug() around acpi_bus_scan()
> > > in acpi_scan_bus_device_check() to allow scan handlers to update dev->offline
> > > for "physical" companion devices safely (the processor's one added by the next
> > > patch actually does that).
> > >
> > > > Of some concern is that device_offline() (and possibly device_online()) is
> > > > called under physical_node_lock of the corresponding struct acpi_device, which
> > > > introduces ordering dependency between that lock and device locks for the
> > > > "physical" devices, but I didn't see any cleaner way to do that (I guess it
> > > > is avoidable at the expense of added complexity, but for now it's just better
> > > > to make the code as clean as possible IMO).
> > >
> > > Patch [4/4] reworks the ACPI processor driver to use the common hotplug code.
> > > It basically splits the driver into two parts as described in the changelog,
> > > where the first part is essentially a scan handler and the second part is
> > > a driver, but it doesn't bind to struct acpi_device any more. Instead, it
> > > binds to processor devices under /sys/devices/system/cpu/ (the driver itself
> > > has a sysfs directory under /sys/bus/cpu/drivers/ which IMHO makes more sense
> > > than having it under /sys/bus/acpi/drivers/).
> > >
> > > The patch at https://patchwork.kernel.org/patch/2506371/ is a prerequisite
> > > for this series, but I'm going to push it for v3.10-rc2 if no one screams
> > > bloody murder.
>
> -> (this is [1/2] now):
>
> > Patch [1/3] in the current series uses acpi_bind_one() to associate memory
> > block devices with ACPI namespace objects representing memory modules that hold
> > them. With patch [3/3] that will allow the ACPI core's device hot-remove code
> > to attempt to offline the memory blocks, if possible, before removing the
> > modules holding them from the system (and if the offlining fails, the removal
> > will not be carried out).
>
> Patch [2/2] adds .online() and .offline() callbacks to memory_subsys
> that are used by the common "online" sysfs attribute and by the ACPI core's
> hot-remove code, through device_online() and device_offline().
>
> The way it is supposed to work is that device_offline() will attempt to put
> memory blocks offline and device_online() will online them and attempt to
> apply the last online type previously used to them.
I forgot to mention that patch [2/2] was (lightly) tested. Unfortunately,
I don't have the hardware (or an emulator) allowing me to test patch [1/2].
Thanks,
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-04 11:21 ` [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks Rafael J. Wysocki
@ 2013-05-06 16:28 ` Vasilis Liaskovitis
2013-05-07 0:59 ` Rafael J. Wysocki
2013-05-06 17:20 ` Greg Kroah-Hartman
2013-05-21 6:37 ` Tang Chen
2 siblings, 1 reply; 32+ messages in thread
From: Vasilis Liaskovitis @ 2013-05-06 16:28 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
isimatu.yasuaki, Len Brown, linux-mm
Hi,
On Sat, May 04, 2013 at 01:21:16PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Introduce .offline() and .online() callbacks for memory_subsys
> that will allow the generic device_offline() and device_online()
> to be used with device objects representing memory blocks. That,
> in turn, allows the ACPI subsystem to use device_offline() to put
> removable memory blocks offline, if possible, before removing
> memory modules holding them.
>
> The 'online' sysfs attribute of memory block devices will attempt to
> put them offline if 0 is written to it and will attempt to apply the
> previously used online type when onlining them (i.e. when 1 is
> written to it).
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> drivers/base/memory.c | 105 +++++++++++++++++++++++++++++++++++++------------
> include/linux/memory.h | 1
> 2 files changed, 81 insertions(+), 25 deletions(-)
>
[...]
> @@ -686,10 +735,16 @@ int offline_memory_block(struct memory_b
> {
> int ret = 0;
>
> + lock_device_hotplug();
> mutex_lock(&mem->state_mutex);
> - if (mem->state != MEM_OFFLINE)
> - ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> + if (mem->state != MEM_OFFLINE) {
> + ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
> + MEM_ONLINE, -1);
> + if (!ret)
> + mem->dev.offline = true;
> + }
> mutex_unlock(&mem->state_mutex);
> + unlock_device_hotplug();
(Testing with qemu...)
offline_memory_block is called from remove_memory, which in turn is called from
acpi_memory_device_remove (detach operation) during acpi_bus_trim. We already
hold the device_hotplug lock when we trim (acpi_scan_hot_remove), so we
don't need to lock/unlock_device_hotplug in offline_memory_block.
A more general issue is that there are now two memory offlining efforts:
1) from acpi_bus_offline_companions during device offline
2) from mm: remove_memory during device detach (offline_memory_block_cb)
The 2nd is only called if the device offline operation was already succesful, so
it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
(unless the blocks were re-onlined in between).
On the other hand, the 2nd effort has some more intelligence in offlining, as it
tries to offline twice in the precense of memcg, see commits df3e1b91 or
reworked 0baeab16. Maybe we need to consolidate the logic.
remove_memory is called from device_detach, during trim that can't fail, so it
should not fail. However this function can still fail in 2 cases:
- offline_memory_block_cb
- is_memblock_offlined_cb
in the case of re-onlined memblocks in between device-offline and device detach.
This seems possible I think, since we do not hold lock_memory_hotplug for the
duration of the hot-remove operation.
thanks,
- Vasilis
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-04 11:21 ` [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks Rafael J. Wysocki
2013-05-06 16:28 ` Vasilis Liaskovitis
@ 2013-05-06 17:20 ` Greg Kroah-Hartman
2013-05-06 19:46 ` Rafael J. Wysocki
2013-05-21 6:37 ` Tang Chen
2 siblings, 1 reply; 32+ messages in thread
From: Greg Kroah-Hartman @ 2013-05-06 17:20 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
vasilis.liaskovitis, Len Brown, linux-mm
On Sat, May 04, 2013 at 01:21:16PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Introduce .offline() and .online() callbacks for memory_subsys
> that will allow the generic device_offline() and device_online()
> to be used with device objects representing memory blocks. That,
> in turn, allows the ACPI subsystem to use device_offline() to put
> removable memory blocks offline, if possible, before removing
> memory modules holding them.
>
> The 'online' sysfs attribute of memory block devices will attempt to
> put them offline if 0 is written to it and will attempt to apply the
> previously used online type when onlining them (i.e. when 1 is
> written to it).
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> drivers/base/memory.c | 105 +++++++++++++++++++++++++++++++++++++------------
> include/linux/memory.h | 1
> 2 files changed, 81 insertions(+), 25 deletions(-)
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-06 17:20 ` Greg Kroah-Hartman
@ 2013-05-06 19:46 ` Rafael J. Wysocki
0 siblings, 0 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-06 19:46 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
vasilis.liaskovitis, Len Brown, linux-mm
On Monday, May 06, 2013 10:20:44 AM Greg Kroah-Hartman wrote:
> On Sat, May 04, 2013 at 01:21:16PM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Introduce .offline() and .online() callbacks for memory_subsys
> > that will allow the generic device_offline() and device_online()
> > to be used with device objects representing memory blocks. That,
> > in turn, allows the ACPI subsystem to use device_offline() to put
> > removable memory blocks offline, if possible, before removing
> > memory modules holding them.
> >
> > The 'online' sysfs attribute of memory block devices will attempt to
> > put them offline if 0 is written to it and will attempt to apply the
> > previously used online type when onlining them (i.e. when 1 is
> > written to it).
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> > drivers/base/memory.c | 105 +++++++++++++++++++++++++++++++++++++------------
> > include/linux/memory.h | 1
> > 2 files changed, 81 insertions(+), 25 deletions(-)
>
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thanks!
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-06 16:28 ` Vasilis Liaskovitis
@ 2013-05-07 0:59 ` Rafael J. Wysocki
2013-05-07 10:59 ` Vasilis Liaskovitis
0 siblings, 1 reply; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-07 0:59 UTC (permalink / raw)
To: Vasilis Liaskovitis
Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
isimatu.yasuaki, Len Brown, linux-mm
On Monday, May 06, 2013 06:28:12 PM Vasilis Liaskovitis wrote:
> Hi,
>
> On Sat, May 04, 2013 at 01:21:16PM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Introduce .offline() and .online() callbacks for memory_subsys
> > that will allow the generic device_offline() and device_online()
> > to be used with device objects representing memory blocks. That,
> > in turn, allows the ACPI subsystem to use device_offline() to put
> > removable memory blocks offline, if possible, before removing
> > memory modules holding them.
> >
> > The 'online' sysfs attribute of memory block devices will attempt to
> > put them offline if 0 is written to it and will attempt to apply the
> > previously used online type when onlining them (i.e. when 1 is
> > written to it).
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> > drivers/base/memory.c | 105 +++++++++++++++++++++++++++++++++++++------------
> > include/linux/memory.h | 1
> > 2 files changed, 81 insertions(+), 25 deletions(-)
> >
> [...]
>
> > @@ -686,10 +735,16 @@ int offline_memory_block(struct memory_b
> > {
> > int ret = 0;
> >
> > + lock_device_hotplug();
> > mutex_lock(&mem->state_mutex);
> > - if (mem->state != MEM_OFFLINE)
> > - ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > + if (mem->state != MEM_OFFLINE) {
> > + ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
> > + MEM_ONLINE, -1);
> > + if (!ret)
> > + mem->dev.offline = true;
> > + }
> > mutex_unlock(&mem->state_mutex);
> > + unlock_device_hotplug();
>
> (Testing with qemu...)
Thanks!
> offline_memory_block is called from remove_memory, which in turn is called from
> acpi_memory_device_remove (detach operation) during acpi_bus_trim. We already
> hold the device_hotplug lock when we trim (acpi_scan_hot_remove), so we
> don't need to lock/unlock_device_hotplug in offline_memory_block.
Indeed.
First, it looks like offline_memory_block_cb() is the only place calling
offline_memory_block(), is that right? I'm wondering if it would make
sense to use device_offline() in there and remove offline_memory_block()
entirely?
Second, if you ran into this issue during testing, that would mean that patch
[1/2] actually worked for you, which would be nice. :-) Was that really the
case?
> A more general issue is that there are now two memory offlining efforts:
>
> 1) from acpi_bus_offline_companions during device offline
> 2) from mm: remove_memory during device detach (offline_memory_block_cb)
>
> The 2nd is only called if the device offline operation was already succesful, so
> it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> (unless the blocks were re-onlined in between).
Sure, and that should be OK for now. Changing the detach behavior is not
essential from the patch [2/2] perspective, we can do it later.
> On the other hand, the 2nd effort has some more intelligence in offlining, as it
> tries to offline twice in the precense of memcg, see commits df3e1b91 or
> reworked 0baeab16. Maybe we need to consolidate the logic.
Hmm. Perhaps it would make sense to implement that logic in
memory_subsys_offline(), then?
> remove_memory is called from device_detach, during trim that can't fail, so it
> should not fail. However this function can still fail in 2 cases:
> - offline_memory_block_cb
> - is_memblock_offlined_cb
> in the case of re-onlined memblocks in between device-offline and device detach.
> This seems possible I think, since we do not hold lock_memory_hotplug for the
> duration of the hot-remove operation.
But we do hold device_hotplug_lock, so every code path that may race with
acpi_scan_hot_remove() needs to take device_hotplug_lock as well. Now,
question is whether or not there are any code paths like that calling one of
the two functions above without holding device_hotplug_lock?
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-07 0:59 ` Rafael J. Wysocki
@ 2013-05-07 10:59 ` Vasilis Liaskovitis
2013-05-07 12:11 ` Rafael J. Wysocki
0 siblings, 1 reply; 32+ messages in thread
From: Vasilis Liaskovitis @ 2013-05-07 10:59 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
isimatu.yasuaki, Len Brown, linux-mm, wency
Hi,
On Tue, May 07, 2013 at 02:59:05AM +0200, Rafael J. Wysocki wrote:
> On Monday, May 06, 2013 06:28:12 PM Vasilis Liaskovitis wrote:
> > On Sat, May 04, 2013 at 01:21:16PM +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >
> > > Introduce .offline() and .online() callbacks for memory_subsys
> > > that will allow the generic device_offline() and device_online()
> > > to be used with device objects representing memory blocks. That,
> > > in turn, allows the ACPI subsystem to use device_offline() to put
> > > removable memory blocks offline, if possible, before removing
> > > memory modules holding them.
> > >
> > > The 'online' sysfs attribute of memory block devices will attempt to
> > > put them offline if 0 is written to it and will attempt to apply the
> > > previously used online type when onlining them (i.e. when 1 is
> > > written to it).
> > >
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > > drivers/base/memory.c | 105 +++++++++++++++++++++++++++++++++++++------------
> > > include/linux/memory.h | 1
> > > 2 files changed, 81 insertions(+), 25 deletions(-)
> > >
> > [...]
> >
> > > @@ -686,10 +735,16 @@ int offline_memory_block(struct memory_b
> > > {
> > > int ret = 0;
> > >
> > > + lock_device_hotplug();
> > > mutex_lock(&mem->state_mutex);
> > > - if (mem->state != MEM_OFFLINE)
> > > - ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > + if (mem->state != MEM_OFFLINE) {
> > > + ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
> > > + MEM_ONLINE, -1);
> > > + if (!ret)
> > > + mem->dev.offline = true;
> > > + }
> > > mutex_unlock(&mem->state_mutex);
> > > + unlock_device_hotplug();
> >
> > (Testing with qemu...)
>
> Thanks!
>
> > offline_memory_block is called from remove_memory, which in turn is called from
> > acpi_memory_device_remove (detach operation) during acpi_bus_trim. We already
> > hold the device_hotplug lock when we trim (acpi_scan_hot_remove), so we
> > don't need to lock/unlock_device_hotplug in offline_memory_block.
>
> Indeed.
>
> First, it looks like offline_memory_block_cb() is the only place calling
> offline_memory_block(), is that right? I'm wondering if it would make
correct.
> sense to use device_offline() in there and remove offline_memory_block()
> entirely?
possibly. Not sure if we can get hold of the struct device from
mm/memory_hotplug.c, maybe we still need the helper function that operates
directly on the memory block.
>
> Second, if you ran into this issue during testing, that would mean that patch
> [1/2] actually worked for you, which would be nice. :-) Was that really the
> case?
yes, the patchset works fine once the extra lock/unlock_device_hotplug is
removed. For various dimm hot-remove operations, I saw either successfull
offlining and removal, or failed offlining and aborted removal.
You can add this to 1/2 (or, once the extra lock is removed, to 2/2 as well):
Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
>
> > A more general issue is that there are now two memory offlining efforts:
> >
> > 1) from acpi_bus_offline_companions during device offline
> > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> >
> > The 2nd is only called if the device offline operation was already succesful, so
> > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > (unless the blocks were re-onlined in between).
>
> Sure, and that should be OK for now. Changing the detach behavior is not
> essential from the patch [2/2] perspective, we can do it later.
yes, ok.
>
> > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > tries to offline twice in the precense of memcg, see commits df3e1b91 or
> > reworked 0baeab16. Maybe we need to consolidate the logic.
>
> Hmm. Perhaps it would make sense to implement that logic in
> memory_subsys_offline(), then?
the logic tries to offline the memory blocks of the device twice, because the
first memory block might be storing information for the subsequent memblocks.
memory_subsys_offline operates on one memory block at a time. Perhaps we can get
the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
acpi_scan_hot_remove but it's probably not a good idea, since that would
affect non-memory devices as well.
I am not sure how important this intelligence is in practice (I am not using
mem cgroups in my guest kernel tests yet). Maybe Wen (original author) has
more details on 2-pass offlining effectiveness.
>
> > remove_memory is called from device_detach, during trim that can't fail, so it
> > should not fail. However this function can still fail in 2 cases:
> > - offline_memory_block_cb
> > - is_memblock_offlined_cb
> > in the case of re-onlined memblocks in between device-offline and device detach.
> > This seems possible I think, since we do not hold lock_memory_hotplug for the
> > duration of the hot-remove operation.
>
> But we do hold device_hotplug_lock, so every code path that may race with
> acpi_scan_hot_remove() needs to take device_hotplug_lock as well. Now,
> question is whether or not there are any code paths like that calling one of
> the two functions above without holding device_hotplug_lock?
I think you are right. The other code path I had in mind was userspace initiated
online/offline operations from store_mem_state in drivers/base/memory.c. But we
also do lock_device_hotplug in that case too. So it seems safe. If I find
something else with stress testing the paths simultaneously (or another code
path) I 'll update.
thanks,
- Vasilis
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-07 10:59 ` Vasilis Liaskovitis
@ 2013-05-07 12:11 ` Rafael J. Wysocki
2013-05-07 21:03 ` Toshi Kani
0 siblings, 1 reply; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-07 12:11 UTC (permalink / raw)
To: Vasilis Liaskovitis
Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
isimatu.yasuaki, Len Brown, linux-mm, wency
On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> Hi,
>
> On Tue, May 07, 2013 at 02:59:05AM +0200, Rafael J. Wysocki wrote:
> > On Monday, May 06, 2013 06:28:12 PM Vasilis Liaskovitis wrote:
> > > On Sat, May 04, 2013 at 01:21:16PM +0200, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > >
> > > > Introduce .offline() and .online() callbacks for memory_subsys
> > > > that will allow the generic device_offline() and device_online()
> > > > to be used with device objects representing memory blocks. That,
> > > > in turn, allows the ACPI subsystem to use device_offline() to put
> > > > removable memory blocks offline, if possible, before removing
> > > > memory modules holding them.
> > > >
> > > > The 'online' sysfs attribute of memory block devices will attempt to
> > > > put them offline if 0 is written to it and will attempt to apply the
> > > > previously used online type when onlining them (i.e. when 1 is
> > > > written to it).
> > > >
> > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > ---
> > > > drivers/base/memory.c | 105 +++++++++++++++++++++++++++++++++++++------------
> > > > include/linux/memory.h | 1
> > > > 2 files changed, 81 insertions(+), 25 deletions(-)
> > > >
> > > [...]
> > >
> > > > @@ -686,10 +735,16 @@ int offline_memory_block(struct memory_b
> > > > {
> > > > int ret = 0;
> > > >
> > > > + lock_device_hotplug();
> > > > mutex_lock(&mem->state_mutex);
> > > > - if (mem->state != MEM_OFFLINE)
> > > > - ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > > + if (mem->state != MEM_OFFLINE) {
> > > > + ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
> > > > + MEM_ONLINE, -1);
> > > > + if (!ret)
> > > > + mem->dev.offline = true;
> > > > + }
> > > > mutex_unlock(&mem->state_mutex);
> > > > + unlock_device_hotplug();
> > >
> > > (Testing with qemu...)
> >
> > Thanks!
> >
> > > offline_memory_block is called from remove_memory, which in turn is called from
> > > acpi_memory_device_remove (detach operation) during acpi_bus_trim. We already
> > > hold the device_hotplug lock when we trim (acpi_scan_hot_remove), so we
> > > don't need to lock/unlock_device_hotplug in offline_memory_block.
> >
> > Indeed.
> >
> > First, it looks like offline_memory_block_cb() is the only place calling
> > offline_memory_block(), is that right? I'm wondering if it would make
>
> correct.
Great!
> > sense to use device_offline() in there and remove offline_memory_block()
> > entirely?
>
> possibly. Not sure if we can get hold of the struct device from
> mm/memory_hotplug.c, maybe we still need the helper function that operates
> directly on the memory block.
We can pass mem->dev to device_offline() and the locking should be fine.
> > Second, if you ran into this issue during testing, that would mean that patch
> > [1/2] actually worked for you, which would be nice. :-) Was that really the
> > case?
>
> yes, the patchset works fine once the extra lock/unlock_device_hotplug is
> removed. For various dimm hot-remove operations, I saw either successfull
> offlining and removal, or failed offlining and aborted removal.
> You can add this to 1/2 (or, once the extra lock is removed, to 2/2 as well):
>
> Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Thanks!
Updated patch is appended for completness.
> >
> > > A more general issue is that there are now two memory offlining efforts:
> > >
> > > 1) from acpi_bus_offline_companions during device offline
> > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > >
> > > The 2nd is only called if the device offline operation was already succesful, so
> > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > (unless the blocks were re-onlined in between).
> >
> > Sure, and that should be OK for now. Changing the detach behavior is not
> > essential from the patch [2/2] perspective, we can do it later.
>
> yes, ok.
>
> >
> > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > tries to offline twice in the precense of memcg, see commits df3e1b91 or
> > > reworked 0baeab16. Maybe we need to consolidate the logic.
> >
> > Hmm. Perhaps it would make sense to implement that logic in
> > memory_subsys_offline(), then?
>
> the logic tries to offline the memory blocks of the device twice, because the
> first memory block might be storing information for the subsequent memblocks.
>
> memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> acpi_scan_hot_remove but it's probably not a good idea, since that would
> affect non-memory devices as well.
>
> I am not sure how important this intelligence is in practice (I am not using
> mem cgroups in my guest kernel tests yet). Maybe Wen (original author) has
> more details on 2-pass offlining effectiveness.
OK
It may be added in a separate patch in any case.
> > > remove_memory is called from device_detach, during trim that can't fail, so it
> > > should not fail. However this function can still fail in 2 cases:
> > > - offline_memory_block_cb
> > > - is_memblock_offlined_cb
> > > in the case of re-onlined memblocks in between device-offline and device detach.
> > > This seems possible I think, since we do not hold lock_memory_hotplug for the
> > > duration of the hot-remove operation.
> >
> > But we do hold device_hotplug_lock, so every code path that may race with
> > acpi_scan_hot_remove() needs to take device_hotplug_lock as well. Now,
> > question is whether or not there are any code paths like that calling one of
> > the two functions above without holding device_hotplug_lock?
>
> I think you are right. The other code path I had in mind was userspace initiated
> online/offline operations from store_mem_state in drivers/base/memory.c. But we
> also do lock_device_hotplug in that case too. So it seems safe. If I find
> something else with stress testing the paths simultaneously (or another code
> path) I 'll update.
OK
Thanks,
Rafael
---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: Driver core: Introduce offline/online callbacks for memory blocks
Introduce .offline() and .online() callbacks for memory_subsys
that will allow the generic device_offline() and device_online()
to be used with device objects representing memory blocks. That,
in turn, allows the ACPI subsystem to use device_offline() to put
removable memory blocks offline, if possible, before removing
memory modules holding them.
The 'online' sysfs attribute of memory block devices will attempt to
put them offline if 0 is written to it and will attempt to apply the
previously used online type when onlining them (i.e. when 1 is
written to it).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/base/memory.c | 105 +++++++++++++++++++++++++++++++++++++------------
include/linux/memory.h | 1
2 files changed, 81 insertions(+), 25 deletions(-)
Index: linux-pm/drivers/base/memory.c
===================================================================
--- linux-pm.orig/drivers/base/memory.c
+++ linux-pm/drivers/base/memory.c
@@ -37,9 +37,14 @@ static inline int base_memory_block_id(i
return section_nr / sections_per_block;
}
+static int memory_subsys_online(struct device *dev);
+static int memory_subsys_offline(struct device *dev);
+
static struct bus_type memory_subsys = {
.name = MEMORY_CLASS_NAME,
.dev_name = MEMORY_CLASS_NAME,
+ .online = memory_subsys_online,
+ .offline = memory_subsys_offline,
};
static BLOCKING_NOTIFIER_HEAD(memory_chain);
@@ -278,33 +283,64 @@ static int __memory_block_change_state(s
{
int ret = 0;
- if (mem->state != from_state_req) {
- ret = -EINVAL;
- goto out;
- }
+ if (mem->state != from_state_req)
+ return -EINVAL;
if (to_state == MEM_OFFLINE)
mem->state = MEM_GOING_OFFLINE;
ret = memory_block_action(mem->start_section_nr, to_state, online_type);
-
if (ret) {
mem->state = from_state_req;
- goto out;
+ } else {
+ mem->state = to_state;
+ if (to_state == MEM_ONLINE)
+ mem->last_online = online_type;
}
+ return ret;
+}
- mem->state = to_state;
- switch (mem->state) {
- case MEM_OFFLINE:
- kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
- break;
- case MEM_ONLINE:
- kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
- break;
- default:
- break;
+static int memory_subsys_online(struct device *dev)
+{
+ struct memory_block *mem = container_of(dev, struct memory_block, dev);
+ int ret;
+
+ mutex_lock(&mem->state_mutex);
+ ret = __memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE,
+ mem->last_online);
+ mutex_unlock(&mem->state_mutex);
+ return ret;
+}
+
+static int memory_subsys_offline(struct device *dev)
+{
+ struct memory_block *mem = container_of(dev, struct memory_block, dev);
+ int ret;
+
+ mutex_lock(&mem->state_mutex);
+ ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+ mutex_unlock(&mem->state_mutex);
+ return ret;
+}
+
+static int __memory_block_change_state_uevent(struct memory_block *mem,
+ unsigned long to_state, unsigned long from_state_req,
+ int online_type)
+{
+ int ret = __memory_block_change_state(mem, to_state, from_state_req,
+ online_type);
+ if (!ret) {
+ switch (mem->state) {
+ case MEM_OFFLINE:
+ kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
+ break;
+ case MEM_ONLINE:
+ kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
+ break;
+ default:
+ break;
+ }
}
-out:
return ret;
}
@@ -315,8 +351,8 @@ static int memory_block_change_state(str
int ret;
mutex_lock(&mem->state_mutex);
- ret = __memory_block_change_state(mem, to_state, from_state_req,
- online_type);
+ ret = __memory_block_change_state_uevent(mem, to_state, from_state_req,
+ online_type);
mutex_unlock(&mem->state_mutex);
return ret;
@@ -326,22 +362,34 @@ store_mem_state(struct device *dev,
struct device_attribute *attr, const char *buf, size_t count)
{
struct memory_block *mem;
+ bool offline;
int ret = -EINVAL;
mem = container_of(dev, struct memory_block, dev);
- if (!strncmp(buf, "online_kernel", min_t(int, count, 13)))
+ lock_device_hotplug();
+
+ if (!strncmp(buf, "online_kernel", min_t(int, count, 13))) {
+ offline = false;
ret = memory_block_change_state(mem, MEM_ONLINE,
MEM_OFFLINE, ONLINE_KERNEL);
- else if (!strncmp(buf, "online_movable", min_t(int, count, 14)))
+ } else if (!strncmp(buf, "online_movable", min_t(int, count, 14))) {
+ offline = false;
ret = memory_block_change_state(mem, MEM_ONLINE,
MEM_OFFLINE, ONLINE_MOVABLE);
- else if (!strncmp(buf, "online", min_t(int, count, 6)))
+ } else if (!strncmp(buf, "online", min_t(int, count, 6))) {
+ offline = false;
ret = memory_block_change_state(mem, MEM_ONLINE,
MEM_OFFLINE, ONLINE_KEEP);
- else if(!strncmp(buf, "offline", min_t(int, count, 7)))
+ } else if(!strncmp(buf, "offline", min_t(int, count, 7))) {
+ offline = true;
ret = memory_block_change_state(mem, MEM_OFFLINE,
MEM_ONLINE, -1);
+ }
+ if (!ret)
+ dev->offline = offline;
+
+ unlock_device_hotplug();
if (ret)
return ret;
@@ -563,6 +611,7 @@ static int init_memory_block(struct memo
base_memory_block_id(scn_nr) * sections_per_block;
mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
mem->state = state;
+ mem->last_online = ONLINE_KEEP;
mem->section_count++;
mutex_init(&mem->state_mutex);
start_pfn = section_nr_to_pfn(mem->start_section_nr);
@@ -681,14 +730,20 @@ int unregister_memory_section(struct mem
/*
* offline one memory block. If the memory block has been offlined, do nothing.
+ *
+ * Call under device_hotplug_lock.
*/
int offline_memory_block(struct memory_block *mem)
{
int ret = 0;
mutex_lock(&mem->state_mutex);
- if (mem->state != MEM_OFFLINE)
- ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+ if (mem->state != MEM_OFFLINE) {
+ ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
+ MEM_ONLINE, -1);
+ if (!ret)
+ mem->dev.offline = true;
+ }
mutex_unlock(&mem->state_mutex);
return ret;
Index: linux-pm/include/linux/memory.h
===================================================================
--- linux-pm.orig/include/linux/memory.h
+++ linux-pm/include/linux/memory.h
@@ -26,6 +26,7 @@ struct memory_block {
unsigned long start_section_nr;
unsigned long end_section_nr;
unsigned long state;
+ int last_online;
int section_count;
/*
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-07 12:11 ` Rafael J. Wysocki
@ 2013-05-07 21:03 ` Toshi Kani
2013-05-07 22:10 ` Rafael J. Wysocki
0 siblings, 1 reply; 32+ messages in thread
From: Toshi Kani @ 2013-05-07 21:03 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
LKML, isimatu.yasuaki, Len Brown, linux-mm, wency
On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
:
> Updated patch is appended for completness.
Yes, this updated patch solved the locking issue.
> > > > A more general issue is that there are now two memory offlining efforts:
> > > >
> > > > 1) from acpi_bus_offline_companions during device offline
> > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > >
> > > > The 2nd is only called if the device offline operation was already succesful, so
> > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > (unless the blocks were re-onlined in between).
> > >
> > > Sure, and that should be OK for now. Changing the detach behavior is not
> > > essential from the patch [2/2] perspective, we can do it later.
> >
> > yes, ok.
> >
> > >
> > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > tries to offline twice in the precense of memcg, see commits df3e1b91 or
> > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > >
> > > Hmm. Perhaps it would make sense to implement that logic in
> > > memory_subsys_offline(), then?
> >
> > the logic tries to offline the memory blocks of the device twice, because the
> > first memory block might be storing information for the subsequent memblocks.
> >
> > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > affect non-memory devices as well.
> >
> > I am not sure how important this intelligence is in practice (I am not using
> > mem cgroups in my guest kernel tests yet). Maybe Wen (original author) has
> > more details on 2-pass offlining effectiveness.
>
> OK
>
> It may be added in a separate patch in any case.
I had the same comment as Vasilis. And, I agree with you that we can
enhance it in separate patches.
:
> +static int memory_subsys_offline(struct device *dev)
> +{
> + struct memory_block *mem = container_of(dev, struct memory_block, dev);
> + int ret;
> +
> + mutex_lock(&mem->state_mutex);
> + ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
This function needs to check mem->state just like
offline_memory_block(). That is:
int ret = 0;
:
if (mem->state != MEM_OFFLINE)
ret = __memory_block_change_state(...);
Otherwise, memory hot-delete to an off-lined memory fails in
__memory_block_change_state() since mem->state is already set to
MEM_OFFLINE.
With that change, for the series:
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
Thanks,
-Toshi
> + mutex_unlock(&mem->state_mutex);
> + return ret;
> +}
> +
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-07 21:03 ` Toshi Kani
@ 2013-05-07 22:10 ` Rafael J. Wysocki
2013-05-07 22:45 ` Toshi Kani
0 siblings, 1 reply; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-07 22:10 UTC (permalink / raw)
To: Toshi Kani
Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
LKML, isimatu.yasuaki, Len Brown, linux-mm, wency
On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
>
> :
>
> > Updated patch is appended for completness.
>
> Yes, this updated patch solved the locking issue.
>
> > > > > A more general issue is that there are now two memory offlining efforts:
> > > > >
> > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > >
> > > > > The 2nd is only called if the device offline operation was already succesful, so
> > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > (unless the blocks were re-onlined in between).
> > > >
> > > > Sure, and that should be OK for now. Changing the detach behavior is not
> > > > essential from the patch [2/2] perspective, we can do it later.
> > >
> > > yes, ok.
> > >
> > > >
> > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > tries to offline twice in the precense of memcg, see commits df3e1b91 or
> > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > >
> > > > Hmm. Perhaps it would make sense to implement that logic in
> > > > memory_subsys_offline(), then?
> > >
> > > the logic tries to offline the memory blocks of the device twice, because the
> > > first memory block might be storing information for the subsequent memblocks.
> > >
> > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > affect non-memory devices as well.
> > >
> > > I am not sure how important this intelligence is in practice (I am not using
> > > mem cgroups in my guest kernel tests yet). Maybe Wen (original author) has
> > > more details on 2-pass offlining effectiveness.
> >
> > OK
> >
> > It may be added in a separate patch in any case.
>
> I had the same comment as Vasilis. And, I agree with you that we can
> enhance it in separate patches.
>
> :
>
> > +static int memory_subsys_offline(struct device *dev)
> > +{
> > + struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > + int ret;
> > +
> > + mutex_lock(&mem->state_mutex);
> > + ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
>
> This function needs to check mem->state just like
> offline_memory_block(). That is:
>
> int ret = 0;
> :
> if (mem->state != MEM_OFFLINE)
> ret = __memory_block_change_state(...);
>
> Otherwise, memory hot-delete to an off-lined memory fails in
> __memory_block_change_state() since mem->state is already set to
> MEM_OFFLINE.
>
> With that change, for the series:
> Reviewed-by: Toshi Kani <toshi.kani@hp.com>
OK, one more update, then (appended).
That said I thought that the check against dev->offline in device_offline()
would be sufficient to guard agaist that. Is there any "offline" code path
I didn't take into account?
Rafael
---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: Driver core: Introduce offline/online callbacks for memory blocks
Introduce .offline() and .online() callbacks for memory_subsys
that will allow the generic device_offline() and device_online()
to be used with device objects representing memory blocks. That,
in turn, allows the ACPI subsystem to use device_offline() to put
removable memory blocks offline, if possible, before removing
memory modules holding them.
The 'online' sysfs attribute of memory block devices will attempt to
put them offline if 0 is written to it and will attempt to apply the
previously used online type when onlining them (i.e. when 1 is
written to it).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
---
drivers/base/memory.c | 111 +++++++++++++++++++++++++++++++++++++------------
include/linux/memory.h | 1
2 files changed, 87 insertions(+), 25 deletions(-)
Index: linux-pm/drivers/base/memory.c
===================================================================
--- linux-pm.orig/drivers/base/memory.c
+++ linux-pm/drivers/base/memory.c
@@ -37,9 +37,14 @@ static inline int base_memory_block_id(i
return section_nr / sections_per_block;
}
+static int memory_subsys_online(struct device *dev);
+static int memory_subsys_offline(struct device *dev);
+
static struct bus_type memory_subsys = {
.name = MEMORY_CLASS_NAME,
.dev_name = MEMORY_CLASS_NAME,
+ .online = memory_subsys_online,
+ .offline = memory_subsys_offline,
};
static BLOCKING_NOTIFIER_HEAD(memory_chain);
@@ -278,33 +283,70 @@ static int __memory_block_change_state(s
{
int ret = 0;
- if (mem->state != from_state_req) {
- ret = -EINVAL;
- goto out;
- }
+ if (mem->state != from_state_req)
+ return -EINVAL;
if (to_state == MEM_OFFLINE)
mem->state = MEM_GOING_OFFLINE;
ret = memory_block_action(mem->start_section_nr, to_state, online_type);
-
if (ret) {
mem->state = from_state_req;
- goto out;
+ } else {
+ mem->state = to_state;
+ if (to_state == MEM_ONLINE)
+ mem->last_online = online_type;
}
+ return ret;
+}
- mem->state = to_state;
- switch (mem->state) {
- case MEM_OFFLINE:
- kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
- break;
- case MEM_ONLINE:
- kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
- break;
- default:
- break;
+static int memory_subsys_online(struct device *dev)
+{
+ struct memory_block *mem = container_of(dev, struct memory_block, dev);
+ int ret;
+
+ mutex_lock(&mem->state_mutex);
+
+ ret = mem->state == MEM_ONLINE ? 0 :
+ __memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE,
+ mem->last_online);
+
+ mutex_unlock(&mem->state_mutex);
+ return ret;
+}
+
+static int memory_subsys_offline(struct device *dev)
+{
+ struct memory_block *mem = container_of(dev, struct memory_block, dev);
+ int ret;
+
+ mutex_lock(&mem->state_mutex);
+
+ ret = mem->state == MEM_OFFLINE ? 0 :
+ __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+
+ mutex_unlock(&mem->state_mutex);
+ return ret;
+}
+
+static int __memory_block_change_state_uevent(struct memory_block *mem,
+ unsigned long to_state, unsigned long from_state_req,
+ int online_type)
+{
+ int ret = __memory_block_change_state(mem, to_state, from_state_req,
+ online_type);
+ if (!ret) {
+ switch (mem->state) {
+ case MEM_OFFLINE:
+ kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
+ break;
+ case MEM_ONLINE:
+ kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
+ break;
+ default:
+ break;
+ }
}
-out:
return ret;
}
@@ -315,8 +357,8 @@ static int memory_block_change_state(str
int ret;
mutex_lock(&mem->state_mutex);
- ret = __memory_block_change_state(mem, to_state, from_state_req,
- online_type);
+ ret = __memory_block_change_state_uevent(mem, to_state, from_state_req,
+ online_type);
mutex_unlock(&mem->state_mutex);
return ret;
@@ -326,22 +368,34 @@ store_mem_state(struct device *dev,
struct device_attribute *attr, const char *buf, size_t count)
{
struct memory_block *mem;
+ bool offline;
int ret = -EINVAL;
mem = container_of(dev, struct memory_block, dev);
- if (!strncmp(buf, "online_kernel", min_t(int, count, 13)))
+ lock_device_hotplug();
+
+ if (!strncmp(buf, "online_kernel", min_t(int, count, 13))) {
+ offline = false;
ret = memory_block_change_state(mem, MEM_ONLINE,
MEM_OFFLINE, ONLINE_KERNEL);
- else if (!strncmp(buf, "online_movable", min_t(int, count, 14)))
+ } else if (!strncmp(buf, "online_movable", min_t(int, count, 14))) {
+ offline = false;
ret = memory_block_change_state(mem, MEM_ONLINE,
MEM_OFFLINE, ONLINE_MOVABLE);
- else if (!strncmp(buf, "online", min_t(int, count, 6)))
+ } else if (!strncmp(buf, "online", min_t(int, count, 6))) {
+ offline = false;
ret = memory_block_change_state(mem, MEM_ONLINE,
MEM_OFFLINE, ONLINE_KEEP);
- else if(!strncmp(buf, "offline", min_t(int, count, 7)))
+ } else if(!strncmp(buf, "offline", min_t(int, count, 7))) {
+ offline = true;
ret = memory_block_change_state(mem, MEM_OFFLINE,
MEM_ONLINE, -1);
+ }
+ if (!ret)
+ dev->offline = offline;
+
+ unlock_device_hotplug();
if (ret)
return ret;
@@ -563,6 +617,7 @@ static int init_memory_block(struct memo
base_memory_block_id(scn_nr) * sections_per_block;
mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
mem->state = state;
+ mem->last_online = ONLINE_KEEP;
mem->section_count++;
mutex_init(&mem->state_mutex);
start_pfn = section_nr_to_pfn(mem->start_section_nr);
@@ -681,14 +736,20 @@ int unregister_memory_section(struct mem
/*
* offline one memory block. If the memory block has been offlined, do nothing.
+ *
+ * Call under device_hotplug_lock.
*/
int offline_memory_block(struct memory_block *mem)
{
int ret = 0;
mutex_lock(&mem->state_mutex);
- if (mem->state != MEM_OFFLINE)
- ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+ if (mem->state != MEM_OFFLINE) {
+ ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
+ MEM_ONLINE, -1);
+ if (!ret)
+ mem->dev.offline = true;
+ }
mutex_unlock(&mem->state_mutex);
return ret;
Index: linux-pm/include/linux/memory.h
===================================================================
--- linux-pm.orig/include/linux/memory.h
+++ linux-pm/include/linux/memory.h
@@ -26,6 +26,7 @@ struct memory_block {
unsigned long start_section_nr;
unsigned long end_section_nr;
unsigned long state;
+ int last_online;
int section_count;
/*
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-07 22:10 ` Rafael J. Wysocki
@ 2013-05-07 22:45 ` Toshi Kani
2013-05-07 23:17 ` Rafael J. Wysocki
0 siblings, 1 reply; 32+ messages in thread
From: Toshi Kani @ 2013-05-07 22:45 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
LKML, isimatu.yasuaki, Len Brown, linux-mm, wency
On Wed, 2013-05-08 at 00:10 +0200, Rafael J. Wysocki wrote:
> On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> > On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> >
> > :
> >
> > > Updated patch is appended for completness.
> >
> > Yes, this updated patch solved the locking issue.
> >
> > > > > > A more general issue is that there are now two memory offlining efforts:
> > > > > >
> > > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > > >
> > > > > > The 2nd is only called if the device offline operation was already succesful, so
> > > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > > (unless the blocks were re-onlined in between).
> > > > >
> > > > > Sure, and that should be OK for now. Changing the detach behavior is not
> > > > > essential from the patch [2/2] perspective, we can do it later.
> > > >
> > > > yes, ok.
> > > >
> > > > >
> > > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > > tries to offline twice in the precense of memcg, see commits df3e1b91 or
> > > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > > >
> > > > > Hmm. Perhaps it would make sense to implement that logic in
> > > > > memory_subsys_offline(), then?
> > > >
> > > > the logic tries to offline the memory blocks of the device twice, because the
> > > > first memory block might be storing information for the subsequent memblocks.
> > > >
> > > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > > affect non-memory devices as well.
> > > >
> > > > I am not sure how important this intelligence is in practice (I am not using
> > > > mem cgroups in my guest kernel tests yet). Maybe Wen (original author) has
> > > > more details on 2-pass offlining effectiveness.
> > >
> > > OK
> > >
> > > It may be added in a separate patch in any case.
> >
> > I had the same comment as Vasilis. And, I agree with you that we can
> > enhance it in separate patches.
> >
> > :
> >
> > > +static int memory_subsys_offline(struct device *dev)
> > > +{
> > > + struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > > + int ret;
> > > +
> > > + mutex_lock(&mem->state_mutex);
> > > + ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> >
> > This function needs to check mem->state just like
> > offline_memory_block(). That is:
> >
> > int ret = 0;
> > :
> > if (mem->state != MEM_OFFLINE)
> > ret = __memory_block_change_state(...);
> >
> > Otherwise, memory hot-delete to an off-lined memory fails in
> > __memory_block_change_state() since mem->state is already set to
> > MEM_OFFLINE.
> >
> > With that change, for the series:
> > Reviewed-by: Toshi Kani <toshi.kani@hp.com>
>
> OK, one more update, then (appended).
>
> That said I thought that the check against dev->offline in device_offline()
> would be sufficient to guard agaist that. Is there any "offline" code path
> I didn't take into account?
Oh, you are right about that. The real problem is that dev->offline is
set to false (0) when a new memory is hot-added in off-line state. So,
instead, dev->offline needs to be set properly.
Thanks,
-Toshi
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-07 22:45 ` Toshi Kani
@ 2013-05-07 23:17 ` Rafael J. Wysocki
2013-05-07 23:59 ` Toshi Kani
0 siblings, 1 reply; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-07 23:17 UTC (permalink / raw)
To: Toshi Kani
Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
LKML, isimatu.yasuaki, Len Brown, linux-mm, wency
On Tuesday, May 07, 2013 04:45:40 PM Toshi Kani wrote:
> On Wed, 2013-05-08 at 00:10 +0200, Rafael J. Wysocki wrote:
> > On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> > > On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > > > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> > >
> > > :
> > >
> > > > Updated patch is appended for completness.
> > >
> > > Yes, this updated patch solved the locking issue.
> > >
> > > > > > > A more general issue is that there are now two memory offlining efforts:
> > > > > > >
> > > > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > > > >
> > > > > > > The 2nd is only called if the device offline operation was already succesful, so
> > > > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > > > (unless the blocks were re-onlined in between).
> > > > > >
> > > > > > Sure, and that should be OK for now. Changing the detach behavior is not
> > > > > > essential from the patch [2/2] perspective, we can do it later.
> > > > >
> > > > > yes, ok.
> > > > >
> > > > > >
> > > > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > > > tries to offline twice in the precense of memcg, see commits df3e1b91 or
> > > > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > > > >
> > > > > > Hmm. Perhaps it would make sense to implement that logic in
> > > > > > memory_subsys_offline(), then?
> > > > >
> > > > > the logic tries to offline the memory blocks of the device twice, because the
> > > > > first memory block might be storing information for the subsequent memblocks.
> > > > >
> > > > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > > > affect non-memory devices as well.
> > > > >
> > > > > I am not sure how important this intelligence is in practice (I am not using
> > > > > mem cgroups in my guest kernel tests yet). Maybe Wen (original author) has
> > > > > more details on 2-pass offlining effectiveness.
> > > >
> > > > OK
> > > >
> > > > It may be added in a separate patch in any case.
> > >
> > > I had the same comment as Vasilis. And, I agree with you that we can
> > > enhance it in separate patches.
> > >
> > > :
> > >
> > > > +static int memory_subsys_offline(struct device *dev)
> > > > +{
> > > > + struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > > > + int ret;
> > > > +
> > > > + mutex_lock(&mem->state_mutex);
> > > > + ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > >
> > > This function needs to check mem->state just like
> > > offline_memory_block(). That is:
> > >
> > > int ret = 0;
> > > :
> > > if (mem->state != MEM_OFFLINE)
> > > ret = __memory_block_change_state(...);
> > >
> > > Otherwise, memory hot-delete to an off-lined memory fails in
> > > __memory_block_change_state() since mem->state is already set to
> > > MEM_OFFLINE.
> > >
> > > With that change, for the series:
> > > Reviewed-by: Toshi Kani <toshi.kani@hp.com>
> >
> > OK, one more update, then (appended).
> >
> > That said I thought that the check against dev->offline in device_offline()
> > would be sufficient to guard agaist that. Is there any "offline" code path
> > I didn't take into account?
>
> Oh, you are right about that. The real problem is that dev->offline is
> set to false (0) when a new memory is hot-added in off-line state. So,
> instead, dev->offline needs to be set properly.
OK, where does that happen?
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-07 23:17 ` Rafael J. Wysocki
@ 2013-05-07 23:59 ` Toshi Kani
2013-05-08 0:24 ` Rafael J. Wysocki
0 siblings, 1 reply; 32+ messages in thread
From: Toshi Kani @ 2013-05-07 23:59 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
LKML, isimatu.yasuaki, Len Brown, linux-mm, wency
On Wed, 2013-05-08 at 01:17 +0200, Rafael J. Wysocki wrote:
> On Tuesday, May 07, 2013 04:45:40 PM Toshi Kani wrote:
> > On Wed, 2013-05-08 at 00:10 +0200, Rafael J. Wysocki wrote:
> > > On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> > > > On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > > > > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> > > >
> > > > :
> > > >
> > > > > Updated patch is appended for completness.
> > > >
> > > > Yes, this updated patch solved the locking issue.
> > > >
> > > > > > > > A more general issue is that there are now two memory offlining efforts:
> > > > > > > >
> > > > > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > > > > >
> > > > > > > > The 2nd is only called if the device offline operation was already succesful, so
> > > > > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > > > > (unless the blocks were re-onlined in between).
> > > > > > >
> > > > > > > Sure, and that should be OK for now. Changing the detach behavior is not
> > > > > > > essential from the patch [2/2] perspective, we can do it later.
> > > > > >
> > > > > > yes, ok.
> > > > > >
> > > > > > >
> > > > > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > > > > tries to offline twice in the precense of memcg, see commits df3e1b91 or
> > > > > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > > > > >
> > > > > > > Hmm. Perhaps it would make sense to implement that logic in
> > > > > > > memory_subsys_offline(), then?
> > > > > >
> > > > > > the logic tries to offline the memory blocks of the device twice, because the
> > > > > > first memory block might be storing information for the subsequent memblocks.
> > > > > >
> > > > > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > > > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > > > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > > > > affect non-memory devices as well.
> > > > > >
> > > > > > I am not sure how important this intelligence is in practice (I am not using
> > > > > > mem cgroups in my guest kernel tests yet). Maybe Wen (original author) has
> > > > > > more details on 2-pass offlining effectiveness.
> > > > >
> > > > > OK
> > > > >
> > > > > It may be added in a separate patch in any case.
> > > >
> > > > I had the same comment as Vasilis. And, I agree with you that we can
> > > > enhance it in separate patches.
> > > >
> > > > :
> > > >
> > > > > +static int memory_subsys_offline(struct device *dev)
> > > > > +{
> > > > > + struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > > > > + int ret;
> > > > > +
> > > > > + mutex_lock(&mem->state_mutex);
> > > > > + ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > >
> > > > This function needs to check mem->state just like
> > > > offline_memory_block(). That is:
> > > >
> > > > int ret = 0;
> > > > :
> > > > if (mem->state != MEM_OFFLINE)
> > > > ret = __memory_block_change_state(...);
> > > >
> > > > Otherwise, memory hot-delete to an off-lined memory fails in
> > > > __memory_block_change_state() since mem->state is already set to
> > > > MEM_OFFLINE.
> > > >
> > > > With that change, for the series:
> > > > Reviewed-by: Toshi Kani <toshi.kani@hp.com>
> > >
> > > OK, one more update, then (appended).
> > >
> > > That said I thought that the check against dev->offline in device_offline()
> > > would be sufficient to guard agaist that. Is there any "offline" code path
> > > I didn't take into account?
> >
> > Oh, you are right about that. The real problem is that dev->offline is
> > set to false (0) when a new memory is hot-added in off-line state. So,
> > instead, dev->offline needs to be set properly.
>
> OK, where does that happen?
It's a bit messy, but the following change seems to work. A tricky part
is that online() is not called during boot, so I needed to update the
offline flag in __memory_block_change_state().
Thanks,
-Toshi
---
drivers/base/memory.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index b9dfd34..1c8d781 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -294,8 +294,10 @@ static int __memory_block_change_state(struct
memory_block *mem,
mem->state = from_state_req;
} else {
mem->state = to_state;
- if (to_state == MEM_ONLINE)
+ if (to_state == MEM_ONLINE) {
mem->last_online = online_type;
+ mem->dev.offline = false;
+ }
}
return ret;
}
@@ -613,6 +615,7 @@ static int init_memory_block(struct memory_block
**memory,
mem->state = state;
mem->last_online = ONLINE_KEEP;
mem->section_count++;
+ mem->dev.offline = (state == MEM_OFFLINE) ? true : false;
mutex_init(&mem->state_mutex);
start_pfn = section_nr_to_pfn(mem->start_section_nr);
mem->phys_device = arch_get_memory_phys_device(start_pfn);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-07 23:59 ` Toshi Kani
@ 2013-05-08 0:24 ` Rafael J. Wysocki
2013-05-08 0:37 ` Toshi Kani
0 siblings, 1 reply; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-08 0:24 UTC (permalink / raw)
To: Toshi Kani
Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
LKML, isimatu.yasuaki, Len Brown, linux-mm, wency
On Tuesday, May 07, 2013 05:59:16 PM Toshi Kani wrote:
> On Wed, 2013-05-08 at 01:17 +0200, Rafael J. Wysocki wrote:
> > On Tuesday, May 07, 2013 04:45:40 PM Toshi Kani wrote:
> > > On Wed, 2013-05-08 at 00:10 +0200, Rafael J. Wysocki wrote:
> > > > On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> > > > > On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > > > > > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> > > > >
> > > > > :
> > > > >
> > > > > > Updated patch is appended for completness.
> > > > >
> > > > > Yes, this updated patch solved the locking issue.
> > > > >
> > > > > > > > > A more general issue is that there are now two memory offlining efforts:
> > > > > > > > >
> > > > > > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > > > > > >
> > > > > > > > > The 2nd is only called if the device offline operation was already succesful, so
> > > > > > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > > > > > (unless the blocks were re-onlined in between).
> > > > > > > >
> > > > > > > > Sure, and that should be OK for now. Changing the detach behavior is not
> > > > > > > > essential from the patch [2/2] perspective, we can do it later.
> > > > > > >
> > > > > > > yes, ok.
> > > > > > >
> > > > > > > >
> > > > > > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > > > > > tries to offline twice in the precense of memcg, see commits df3e1b91 or
> > > > > > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > > > > > >
> > > > > > > > Hmm. Perhaps it would make sense to implement that logic in
> > > > > > > > memory_subsys_offline(), then?
> > > > > > >
> > > > > > > the logic tries to offline the memory blocks of the device twice, because the
> > > > > > > first memory block might be storing information for the subsequent memblocks.
> > > > > > >
> > > > > > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > > > > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > > > > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > > > > > affect non-memory devices as well.
> > > > > > >
> > > > > > > I am not sure how important this intelligence is in practice (I am not using
> > > > > > > mem cgroups in my guest kernel tests yet). Maybe Wen (original author) has
> > > > > > > more details on 2-pass offlining effectiveness.
> > > > > >
> > > > > > OK
> > > > > >
> > > > > > It may be added in a separate patch in any case.
> > > > >
> > > > > I had the same comment as Vasilis. And, I agree with you that we can
> > > > > enhance it in separate patches.
> > > > >
> > > > > :
> > > > >
> > > > > > +static int memory_subsys_offline(struct device *dev)
> > > > > > +{
> > > > > > + struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > > > > > + int ret;
> > > > > > +
> > > > > > + mutex_lock(&mem->state_mutex);
> > > > > > + ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > > >
> > > > > This function needs to check mem->state just like
> > > > > offline_memory_block(). That is:
> > > > >
> > > > > int ret = 0;
> > > > > :
> > > > > if (mem->state != MEM_OFFLINE)
> > > > > ret = __memory_block_change_state(...);
> > > > >
> > > > > Otherwise, memory hot-delete to an off-lined memory fails in
> > > > > __memory_block_change_state() since mem->state is already set to
> > > > > MEM_OFFLINE.
> > > > >
> > > > > With that change, for the series:
> > > > > Reviewed-by: Toshi Kani <toshi.kani@hp.com>
> > > >
> > > > OK, one more update, then (appended).
> > > >
> > > > That said I thought that the check against dev->offline in device_offline()
> > > > would be sufficient to guard agaist that. Is there any "offline" code path
> > > > I didn't take into account?
> > >
> > > Oh, you are right about that. The real problem is that dev->offline is
> > > set to false (0) when a new memory is hot-added in off-line state. So,
> > > instead, dev->offline needs to be set properly.
> >
> > OK, where does that happen?
>
> It's a bit messy, but the following change seems to work. A tricky part
> is that online() is not called during boot, so I needed to update the
> offline flag in __memory_block_change_state().
I wonder why? ->
> ---
> drivers/base/memory.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index b9dfd34..1c8d781 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -294,8 +294,10 @@ static int __memory_block_change_state(struct
> memory_block *mem,
> mem->state = from_state_req;
> } else {
> mem->state = to_state;
> - if (to_state == MEM_ONLINE)
> + if (to_state == MEM_ONLINE) {
> mem->last_online = online_type;
> + mem->dev.offline = false;
> + }
->
__memory_block_change_state() is called by memory_subsys_online/offline()
and by __memory_block_change_state_uevent() only, so it should be sufficient
to do this under the switch () in the latter.
Still, though, __memory_block_change_state_uevent() is only called (indirectly)
from store_mem_state() and by offline_memory_block() the both of which update
dev->offline.
What's the exact scenario you needed this for?
> }
> return ret;
> }
> @@ -613,6 +615,7 @@ static int init_memory_block(struct memory_block
> **memory,
> mem->state = state;
> mem->last_online = ONLINE_KEEP;
> mem->section_count++;
> + mem->dev.offline = (state == MEM_OFFLINE) ? true : false;
You could write this as
+ mem->dev.offline = state == MEM_OFFLINE;
Moreover, it'd be better to do it in register_memory(), I think.
> mutex_init(&mem->state_mutex);
> start_pfn = section_nr_to_pfn(mem->start_section_nr);
> mem->phys_device = arch_get_memory_phys_device(start_pfn);
Thanks,
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-08 0:24 ` Rafael J. Wysocki
@ 2013-05-08 0:37 ` Toshi Kani
2013-05-08 11:53 ` Rafael J. Wysocki
0 siblings, 1 reply; 32+ messages in thread
From: Toshi Kani @ 2013-05-08 0:37 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
LKML, isimatu.yasuaki, Len Brown, linux-mm, wency
On Wed, 2013-05-08 at 02:24 +0200, Rafael J. Wysocki wrote:
> On Tuesday, May 07, 2013 05:59:16 PM Toshi Kani wrote:
> > On Wed, 2013-05-08 at 01:17 +0200, Rafael J. Wysocki wrote:
> > > On Tuesday, May 07, 2013 04:45:40 PM Toshi Kani wrote:
> > > > On Wed, 2013-05-08 at 00:10 +0200, Rafael J. Wysocki wrote:
> > > > > On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> > > > > > On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > > > > > > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> > > > > >
> > > > > > :
> > > > > >
> > > > > > > Updated patch is appended for completness.
> > > > > >
> > > > > > Yes, this updated patch solved the locking issue.
> > > > > >
> > > > > > > > > > A more general issue is that there are now two memory offlining efforts:
> > > > > > > > > >
> > > > > > > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > > > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > > > > > > >
> > > > > > > > > > The 2nd is only called if the device offline operation was already succesful, so
> > > > > > > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > > > > > > (unless the blocks were re-onlined in between).
> > > > > > > > >
> > > > > > > > > Sure, and that should be OK for now. Changing the detach behavior is not
> > > > > > > > > essential from the patch [2/2] perspective, we can do it later.
> > > > > > > >
> > > > > > > > yes, ok.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > > > > > > tries to offline twice in the precense of memcg, see commits df3e1b91 or
> > > > > > > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > > > > > > >
> > > > > > > > > Hmm. Perhaps it would make sense to implement that logic in
> > > > > > > > > memory_subsys_offline(), then?
> > > > > > > >
> > > > > > > > the logic tries to offline the memory blocks of the device twice, because the
> > > > > > > > first memory block might be storing information for the subsequent memblocks.
> > > > > > > >
> > > > > > > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > > > > > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > > > > > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > > > > > > affect non-memory devices as well.
> > > > > > > >
> > > > > > > > I am not sure how important this intelligence is in practice (I am not using
> > > > > > > > mem cgroups in my guest kernel tests yet). Maybe Wen (original author) has
> > > > > > > > more details on 2-pass offlining effectiveness.
> > > > > > >
> > > > > > > OK
> > > > > > >
> > > > > > > It may be added in a separate patch in any case.
> > > > > >
> > > > > > I had the same comment as Vasilis. And, I agree with you that we can
> > > > > > enhance it in separate patches.
> > > > > >
> > > > > > :
> > > > > >
> > > > > > > +static int memory_subsys_offline(struct device *dev)
> > > > > > > +{
> > > > > > > + struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > > > > > > + int ret;
> > > > > > > +
> > > > > > > + mutex_lock(&mem->state_mutex);
> > > > > > > + ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > > > >
> > > > > > This function needs to check mem->state just like
> > > > > > offline_memory_block(). That is:
> > > > > >
> > > > > > int ret = 0;
> > > > > > :
> > > > > > if (mem->state != MEM_OFFLINE)
> > > > > > ret = __memory_block_change_state(...);
> > > > > >
> > > > > > Otherwise, memory hot-delete to an off-lined memory fails in
> > > > > > __memory_block_change_state() since mem->state is already set to
> > > > > > MEM_OFFLINE.
> > > > > >
> > > > > > With that change, for the series:
> > > > > > Reviewed-by: Toshi Kani <toshi.kani@hp.com>
> > > > >
> > > > > OK, one more update, then (appended).
> > > > >
> > > > > That said I thought that the check against dev->offline in device_offline()
> > > > > would be sufficient to guard agaist that. Is there any "offline" code path
> > > > > I didn't take into account?
> > > >
> > > > Oh, you are right about that. The real problem is that dev->offline is
> > > > set to false (0) when a new memory is hot-added in off-line state. So,
> > > > instead, dev->offline needs to be set properly.
> > >
> > > OK, where does that happen?
> >
> > It's a bit messy, but the following change seems to work. A tricky part
> > is that online() is not called during boot, so I needed to update the
> > offline flag in __memory_block_change_state().
>
> I wonder why? ->
>
> > ---
> > drivers/base/memory.c | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> > index b9dfd34..1c8d781 100644
> > --- a/drivers/base/memory.c
> > +++ b/drivers/base/memory.c
> > @@ -294,8 +294,10 @@ static int __memory_block_change_state(struct
> > memory_block *mem,
> > mem->state = from_state_req;
> > } else {
> > mem->state = to_state;
> > - if (to_state == MEM_ONLINE)
> > + if (to_state == MEM_ONLINE) {
> > mem->last_online = online_type;
> > + mem->dev.offline = false;
> > + }
>
> ->
>
> __memory_block_change_state() is called by memory_subsys_online/offline()
> and by __memory_block_change_state_uevent() only, so it should be sufficient
> to do this under the switch () in the latter.
>
> Still, though, __memory_block_change_state_uevent() is only called (indirectly)
> from store_mem_state() and by offline_memory_block() the both of which update
> dev->offline.
>
> What's the exact scenario you needed this for?
Right. I was in hurry and made a wrong assumption... This change is
not necessary.
> > }
> > return ret;
> > }
> > @@ -613,6 +615,7 @@ static int init_memory_block(struct memory_block
> > **memory,
> > mem->state = state;
> > mem->last_online = ONLINE_KEEP;
> > mem->section_count++;
> > + mem->dev.offline = (state == MEM_OFFLINE) ? true : false;
>
> You could write this as
>
> + mem->dev.offline = state == MEM_OFFLINE;
Right.
> Moreover, it'd be better to do it in register_memory(), I think.
Yes, if we change register_memory() to have the arg state.
Thanks,
-Toshi
>
> > mutex_init(&mem->state_mutex);
> > start_pfn = section_nr_to_pfn(mem->start_section_nr);
> > mem->phys_device = arch_get_memory_phys_device(start_pfn);
>
> Thanks,
> Rafael
>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-08 0:37 ` Toshi Kani
@ 2013-05-08 11:53 ` Rafael J. Wysocki
2013-05-08 14:38 ` Toshi Kani
0 siblings, 1 reply; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-08 11:53 UTC (permalink / raw)
To: Toshi Kani
Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
LKML, isimatu.yasuaki, Len Brown, linux-mm, wency
On Tuesday, May 07, 2013 06:37:34 PM Toshi Kani wrote:
> On Wed, 2013-05-08 at 02:24 +0200, Rafael J. Wysocki wrote:
> > On Tuesday, May 07, 2013 05:59:16 PM Toshi Kani wrote:
> > > On Wed, 2013-05-08 at 01:17 +0200, Rafael J. Wysocki wrote:
> > > > On Tuesday, May 07, 2013 04:45:40 PM Toshi Kani wrote:
> > > > > On Wed, 2013-05-08 at 00:10 +0200, Rafael J. Wysocki wrote:
> > > > > > On Tuesday, May 07, 2013 03:03:49 PM Toshi Kani wrote:
> > > > > > > On Tue, 2013-05-07 at 14:11 +0200, Rafael J. Wysocki wrote:
> > > > > > > > On Tuesday, May 07, 2013 12:59:45 PM Vasilis Liaskovitis wrote:
> > > > > > >
> > > > > > > :
> > > > > > >
> > > > > > > > Updated patch is appended for completness.
> > > > > > >
> > > > > > > Yes, this updated patch solved the locking issue.
> > > > > > >
> > > > > > > > > > > A more general issue is that there are now two memory offlining efforts:
> > > > > > > > > > >
> > > > > > > > > > > 1) from acpi_bus_offline_companions during device offline
> > > > > > > > > > > 2) from mm: remove_memory during device detach (offline_memory_block_cb)
> > > > > > > > > > >
> > > > > > > > > > > The 2nd is only called if the device offline operation was already succesful, so
> > > > > > > > > > > it seems ineffective or redundant now, at least for x86_64/acpi_memhotplug machine
> > > > > > > > > > > (unless the blocks were re-onlined in between).
> > > > > > > > > >
> > > > > > > > > > Sure, and that should be OK for now. Changing the detach behavior is not
> > > > > > > > > > essential from the patch [2/2] perspective, we can do it later.
> > > > > > > > >
> > > > > > > > > yes, ok.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > On the other hand, the 2nd effort has some more intelligence in offlining, as it
> > > > > > > > > > > tries to offline twice in the precense of memcg, see commits df3e1b91 or
> > > > > > > > > > > reworked 0baeab16. Maybe we need to consolidate the logic.
> > > > > > > > > >
> > > > > > > > > > Hmm. Perhaps it would make sense to implement that logic in
> > > > > > > > > > memory_subsys_offline(), then?
> > > > > > > > >
> > > > > > > > > the logic tries to offline the memory blocks of the device twice, because the
> > > > > > > > > first memory block might be storing information for the subsequent memblocks.
> > > > > > > > >
> > > > > > > > > memory_subsys_offline operates on one memory block at a time. Perhaps we can get
> > > > > > > > > the same effect if we do an acpi_walk of acpi_bus_offline_companions twice in
> > > > > > > > > acpi_scan_hot_remove but it's probably not a good idea, since that would
> > > > > > > > > affect non-memory devices as well.
> > > > > > > > >
> > > > > > > > > I am not sure how important this intelligence is in practice (I am not using
> > > > > > > > > mem cgroups in my guest kernel tests yet). Maybe Wen (original author) has
> > > > > > > > > more details on 2-pass offlining effectiveness.
> > > > > > > >
> > > > > > > > OK
> > > > > > > >
> > > > > > > > It may be added in a separate patch in any case.
> > > > > > >
> > > > > > > I had the same comment as Vasilis. And, I agree with you that we can
> > > > > > > enhance it in separate patches.
> > > > > > >
> > > > > > > :
> > > > > > >
> > > > > > > > +static int memory_subsys_offline(struct device *dev)
> > > > > > > > +{
> > > > > > > > + struct memory_block *mem = container_of(dev, struct memory_block, dev);
> > > > > > > > + int ret;
> > > > > > > > +
> > > > > > > > + mutex_lock(&mem->state_mutex);
> > > > > > > > + ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
> > > > > > >
> > > > > > > This function needs to check mem->state just like
> > > > > > > offline_memory_block(). That is:
> > > > > > >
> > > > > > > int ret = 0;
> > > > > > > :
> > > > > > > if (mem->state != MEM_OFFLINE)
> > > > > > > ret = __memory_block_change_state(...);
> > > > > > >
> > > > > > > Otherwise, memory hot-delete to an off-lined memory fails in
> > > > > > > __memory_block_change_state() since mem->state is already set to
> > > > > > > MEM_OFFLINE.
> > > > > > >
> > > > > > > With that change, for the series:
> > > > > > > Reviewed-by: Toshi Kani <toshi.kani@hp.com>
> > > > > >
> > > > > > OK, one more update, then (appended).
> > > > > >
> > > > > > That said I thought that the check against dev->offline in device_offline()
> > > > > > would be sufficient to guard agaist that. Is there any "offline" code path
> > > > > > I didn't take into account?
> > > > >
> > > > > Oh, you are right about that. The real problem is that dev->offline is
> > > > > set to false (0) when a new memory is hot-added in off-line state. So,
> > > > > instead, dev->offline needs to be set properly.
> > > >
> > > > OK, where does that happen?
> > >
> > > It's a bit messy, but the following change seems to work. A tricky part
> > > is that online() is not called during boot, so I needed to update the
> > > offline flag in __memory_block_change_state().
> >
> > I wonder why? ->
> >
> > > ---
> > > drivers/base/memory.c | 5 ++++-
> > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> > > index b9dfd34..1c8d781 100644
> > > --- a/drivers/base/memory.c
> > > +++ b/drivers/base/memory.c
> > > @@ -294,8 +294,10 @@ static int __memory_block_change_state(struct
> > > memory_block *mem,
> > > mem->state = from_state_req;
> > > } else {
> > > mem->state = to_state;
> > > - if (to_state == MEM_ONLINE)
> > > + if (to_state == MEM_ONLINE) {
> > > mem->last_online = online_type;
> > > + mem->dev.offline = false;
> > > + }
> >
> > ->
> >
> > __memory_block_change_state() is called by memory_subsys_online/offline()
> > and by __memory_block_change_state_uevent() only, so it should be sufficient
> > to do this under the switch () in the latter.
> >
> > Still, though, __memory_block_change_state_uevent() is only called (indirectly)
> > from store_mem_state() and by offline_memory_block() the both of which update
> > dev->offline.
> >
> > What's the exact scenario you needed this for?
>
> Right. I was in hurry and made a wrong assumption... This change is
> not necessary.
>
> > > }
> > > return ret;
> > > }
> > > @@ -613,6 +615,7 @@ static int init_memory_block(struct memory_block
> > > **memory,
> > > mem->state = state;
> > > mem->last_online = ONLINE_KEEP;
> > > mem->section_count++;
> > > + mem->dev.offline = (state == MEM_OFFLINE) ? true : false;
> >
> > You could write this as
> >
> > + mem->dev.offline = state == MEM_OFFLINE;
>
> Right.
>
> > Moreover, it'd be better to do it in register_memory(), I think.
>
> Yes, if we change register_memory() to have the arg state.
It can use mem->state which already has been populated at this point
(and init_memory_block() is the only called).
I've updated the patch to do that (appended).
Thanks,
Rafael
---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: Driver core: Introduce offline/online callbacks for memory blocks
Introduce .offline() and .online() callbacks for memory_subsys
that will allow the generic device_offline() and device_online()
to be used with device objects representing memory blocks. That,
in turn, allows the ACPI subsystem to use device_offline() to put
removable memory blocks offline, if possible, before removing
memory modules holding them.
The 'online' sysfs attribute of memory block devices will attempt to
put them offline if 0 is written to it and will attempt to apply the
previously used online type when onlining them (i.e. when 1 is
written to it).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
---
drivers/base/memory.c | 112 ++++++++++++++++++++++++++++++++++++++-----------
include/linux/memory.h | 1
2 files changed, 88 insertions(+), 25 deletions(-)
Index: linux-pm/drivers/base/memory.c
===================================================================
--- linux-pm.orig/drivers/base/memory.c
+++ linux-pm/drivers/base/memory.c
@@ -37,9 +37,14 @@ static inline int base_memory_block_id(i
return section_nr / sections_per_block;
}
+static int memory_subsys_online(struct device *dev);
+static int memory_subsys_offline(struct device *dev);
+
static struct bus_type memory_subsys = {
.name = MEMORY_CLASS_NAME,
.dev_name = MEMORY_CLASS_NAME,
+ .online = memory_subsys_online,
+ .offline = memory_subsys_offline,
};
static BLOCKING_NOTIFIER_HEAD(memory_chain);
@@ -88,6 +93,7 @@ int register_memory(struct memory_block
memory->dev.bus = &memory_subsys;
memory->dev.id = memory->start_section_nr / sections_per_block;
memory->dev.release = memory_block_release;
+ memory->dev.offline = memory->state == MEM_OFFLINE;
error = device_register(&memory->dev);
return error;
@@ -278,33 +284,70 @@ static int __memory_block_change_state(s
{
int ret = 0;
- if (mem->state != from_state_req) {
- ret = -EINVAL;
- goto out;
- }
+ if (mem->state != from_state_req)
+ return -EINVAL;
if (to_state == MEM_OFFLINE)
mem->state = MEM_GOING_OFFLINE;
ret = memory_block_action(mem->start_section_nr, to_state, online_type);
-
if (ret) {
mem->state = from_state_req;
- goto out;
+ } else {
+ mem->state = to_state;
+ if (to_state == MEM_ONLINE)
+ mem->last_online = online_type;
}
+ return ret;
+}
- mem->state = to_state;
- switch (mem->state) {
- case MEM_OFFLINE:
- kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
- break;
- case MEM_ONLINE:
- kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
- break;
- default:
- break;
+static int memory_subsys_online(struct device *dev)
+{
+ struct memory_block *mem = container_of(dev, struct memory_block, dev);
+ int ret;
+
+ mutex_lock(&mem->state_mutex);
+
+ ret = mem->state == MEM_ONLINE ? 0 :
+ __memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE,
+ mem->last_online);
+
+ mutex_unlock(&mem->state_mutex);
+ return ret;
+}
+
+static int memory_subsys_offline(struct device *dev)
+{
+ struct memory_block *mem = container_of(dev, struct memory_block, dev);
+ int ret;
+
+ mutex_lock(&mem->state_mutex);
+
+ ret = mem->state == MEM_OFFLINE ? 0 :
+ __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+
+ mutex_unlock(&mem->state_mutex);
+ return ret;
+}
+
+static int __memory_block_change_state_uevent(struct memory_block *mem,
+ unsigned long to_state, unsigned long from_state_req,
+ int online_type)
+{
+ int ret = __memory_block_change_state(mem, to_state, from_state_req,
+ online_type);
+ if (!ret) {
+ switch (mem->state) {
+ case MEM_OFFLINE:
+ kobject_uevent(&mem->dev.kobj, KOBJ_OFFLINE);
+ break;
+ case MEM_ONLINE:
+ kobject_uevent(&mem->dev.kobj, KOBJ_ONLINE);
+ break;
+ default:
+ break;
+ }
}
-out:
return ret;
}
@@ -315,8 +358,8 @@ static int memory_block_change_state(str
int ret;
mutex_lock(&mem->state_mutex);
- ret = __memory_block_change_state(mem, to_state, from_state_req,
- online_type);
+ ret = __memory_block_change_state_uevent(mem, to_state, from_state_req,
+ online_type);
mutex_unlock(&mem->state_mutex);
return ret;
@@ -326,22 +369,34 @@ store_mem_state(struct device *dev,
struct device_attribute *attr, const char *buf, size_t count)
{
struct memory_block *mem;
+ bool offline;
int ret = -EINVAL;
mem = container_of(dev, struct memory_block, dev);
- if (!strncmp(buf, "online_kernel", min_t(int, count, 13)))
+ lock_device_hotplug();
+
+ if (!strncmp(buf, "online_kernel", min_t(int, count, 13))) {
+ offline = false;
ret = memory_block_change_state(mem, MEM_ONLINE,
MEM_OFFLINE, ONLINE_KERNEL);
- else if (!strncmp(buf, "online_movable", min_t(int, count, 14)))
+ } else if (!strncmp(buf, "online_movable", min_t(int, count, 14))) {
+ offline = false;
ret = memory_block_change_state(mem, MEM_ONLINE,
MEM_OFFLINE, ONLINE_MOVABLE);
- else if (!strncmp(buf, "online", min_t(int, count, 6)))
+ } else if (!strncmp(buf, "online", min_t(int, count, 6))) {
+ offline = false;
ret = memory_block_change_state(mem, MEM_ONLINE,
MEM_OFFLINE, ONLINE_KEEP);
- else if(!strncmp(buf, "offline", min_t(int, count, 7)))
+ } else if(!strncmp(buf, "offline", min_t(int, count, 7))) {
+ offline = true;
ret = memory_block_change_state(mem, MEM_OFFLINE,
MEM_ONLINE, -1);
+ }
+ if (!ret)
+ dev->offline = offline;
+
+ unlock_device_hotplug();
if (ret)
return ret;
@@ -563,6 +618,7 @@ static int init_memory_block(struct memo
base_memory_block_id(scn_nr) * sections_per_block;
mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
mem->state = state;
+ mem->last_online = ONLINE_KEEP;
mem->section_count++;
mutex_init(&mem->state_mutex);
start_pfn = section_nr_to_pfn(mem->start_section_nr);
@@ -681,14 +737,20 @@ int unregister_memory_section(struct mem
/*
* offline one memory block. If the memory block has been offlined, do nothing.
+ *
+ * Call under device_hotplug_lock.
*/
int offline_memory_block(struct memory_block *mem)
{
int ret = 0;
mutex_lock(&mem->state_mutex);
- if (mem->state != MEM_OFFLINE)
- ret = __memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE, -1);
+ if (mem->state != MEM_OFFLINE) {
+ ret = __memory_block_change_state_uevent(mem, MEM_OFFLINE,
+ MEM_ONLINE, -1);
+ if (!ret)
+ mem->dev.offline = true;
+ }
mutex_unlock(&mem->state_mutex);
return ret;
Index: linux-pm/include/linux/memory.h
===================================================================
--- linux-pm.orig/include/linux/memory.h
+++ linux-pm/include/linux/memory.h
@@ -26,6 +26,7 @@ struct memory_block {
unsigned long start_section_nr;
unsigned long end_section_nr;
unsigned long state;
+ int last_online;
int section_count;
/*
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-08 11:53 ` Rafael J. Wysocki
@ 2013-05-08 14:38 ` Toshi Kani
0 siblings, 0 replies; 32+ messages in thread
From: Toshi Kani @ 2013-05-08 14:38 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Vasilis Liaskovitis, Greg Kroah-Hartman, ACPI Devel Maling List,
LKML, isimatu.yasuaki, Len Brown, linux-mm, wency
On Wed, 2013-05-08 at 13:53 +0200, Rafael J. Wysocki wrote:
> On Tuesday, May 07, 2013 06:37:34 PM Toshi Kani wrote:
> > On Wed, 2013-05-08 at 02:24 +0200, Rafael J. Wysocki wrote:
> > > On Tuesday, May 07, 2013 05:59:16 PM Toshi Kani wrote:
:
> > > Moreover, it'd be better to do it in register_memory(), I think.
> >
> > Yes, if we change register_memory() to have the arg state.
>
> It can use mem->state which already has been populated at this point
> (and init_memory_block() is the only called).
Right.
> I've updated the patch to do that (appended).
Looks good! Thanks for the update!
-Toshi
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-04 11:21 ` [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks Rafael J. Wysocki
2013-05-06 16:28 ` Vasilis Liaskovitis
2013-05-06 17:20 ` Greg Kroah-Hartman
@ 2013-05-21 6:37 ` Tang Chen
2013-05-21 11:15 ` Rafael J. Wysocki
2 siblings, 1 reply; 32+ messages in thread
From: Tang Chen @ 2013-05-21 6:37 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm
Hi Rafael,
Please see below.
On 05/04/2013 07:21 PM, Rafael J. Wysocki wrote:
......
> static BLOCKING_NOTIFIER_HEAD(memory_chain);
> @@ -278,33 +283,64 @@ static int __memory_block_change_state(s
> {
> int ret = 0;
>
> - if (mem->state != from_state_req) {
> - ret = -EINVAL;
> - goto out;
> - }
> + if (mem->state != from_state_req)
> + return -EINVAL;
>
> if (to_state == MEM_OFFLINE)
> mem->state = MEM_GOING_OFFLINE;
>
> ret = memory_block_action(mem->start_section_nr, to_state, online_type);
> -
> if (ret) {
> mem->state = from_state_req;
> - goto out;
> + } else {
> + mem->state = to_state;
> + if (to_state == MEM_ONLINE)
> + mem->last_online = online_type;
Why do we need to remember last online type ?
And as far as I know, we can obtain which zone a page was in last time it
was onlined by check page->flags, just like online_pages() does. If we
use online_kernel or online_movable, the zone boundary will be
recalculated.
So we don't need to remember the last online type.
Seeing from your patch, I guess memory_subsys_online() can only handle
online and offline. So mem->last_online is used to remember what user has
done through the original way to trigger memory hot-remove, right ? And
when
user does it in this new way, it just does the same thing as user does last
time.
But I still think we don't need to remember it because if finally you call
online_pages(), it just does the same thing as last time by default.
online_pages()
{
......
if (online_type == ONLINE_KERNEL ......
if (online_type == ONLINE_MOVABLE......
zone = page_zone(pfn_to_page(pfn));
/* Here, the page will be put into the zone which it belong to last
time. */
......
}
I just thought of it. Maybe I missed something in your design. Please tell
me if I'm wrong.
Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>
Thanks. :)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/2 v2, RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes
2013-05-04 11:12 ` [PATCH 1/2 v2, RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes Rafael J. Wysocki
@ 2013-05-21 6:50 ` Tang Chen
0 siblings, 0 replies; 32+ messages in thread
From: Tang Chen @ 2013-05-21 6:50 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm
Hi Rafael,
Seems OK to me.
Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>
Thanks. :)
On 05/04/2013 07:12 PM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki<rafael.j.wysocki@intel.com>
>
> During ACPI memory hotplug configuration bind memory blocks residing
> in modules removable through the standard ACPI mechanism to struct
> acpi_device objects associated with ACPI namespace objects
> representing those modules. Accordingly, unbind those memory blocks
> from the struct acpi_device objects when the memory modules in
> question are being removed.
>
> When "offline" operation for devices representing memory blocks is
> introduced, this will allow the ACPI core's device hot-remove code to
> use it to carry out remove_memory() for those memory blocks and check
> the results of that before it actually removes the modules holding
> them from the system.
>
> Since walk_memory_range() is used for accessing all memory blocks
> corresponding to a given ACPI namespace object, it is exported from
> memory_hotplug.c so that the code in acpi_memhotplug.c can use it.
>
> Signed-off-by: Rafael J. Wysocki<rafael.j.wysocki@intel.com>
> ---
> drivers/acpi/acpi_memhotplug.c | 53 ++++++++++++++++++++++++++++++++++++++---
> include/linux/memory_hotplug.h | 2 +
> mm/memory_hotplug.c | 4 ++-
> 3 files changed, 55 insertions(+), 4 deletions(-)
>
> Index: linux-pm/mm/memory_hotplug.c
> ===================================================================
> --- linux-pm.orig/mm/memory_hotplug.c
> +++ linux-pm/mm/memory_hotplug.c
> @@ -1618,6 +1618,7 @@ int offline_pages(unsigned long start_pf
> {
> return __offline_pages(start_pfn, start_pfn + nr_pages, 120 * HZ);
> }
> +#endif /* CONFIG_MEMORY_HOTREMOVE */
>
> /**
> * walk_memory_range - walks through all mem sections in [start_pfn, end_pfn)
> @@ -1631,7 +1632,7 @@ int offline_pages(unsigned long start_pf
> *
> * Returns the return value of func.
> */
> -static int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
> +int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
> void *arg, int (*func)(struct memory_block *, void *))
> {
> struct memory_block *mem = NULL;
> @@ -1668,6 +1669,7 @@ static int walk_memory_range(unsigned lo
> return 0;
> }
>
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> /**
> * offline_memory_block_cb - callback function for offlining memory block
> * @mem: the memory block to be offlined
> Index: linux-pm/include/linux/memory_hotplug.h
> ===================================================================
> --- linux-pm.orig/include/linux/memory_hotplug.h
> +++ linux-pm/include/linux/memory_hotplug.h
> @@ -245,6 +245,8 @@ static inline int is_mem_section_removab
> static inline void try_offline_node(int nid) {}
> #endif /* CONFIG_MEMORY_HOTREMOVE */
>
> +extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
> + void *arg, int (*func)(struct memory_block *, void *));
> extern int mem_online_node(int nid);
> extern int add_memory(int nid, u64 start, u64 size);
> extern int arch_add_memory(int nid, u64 start, u64 size);
> Index: linux-pm/drivers/acpi/acpi_memhotplug.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/acpi_memhotplug.c
> +++ linux-pm/drivers/acpi/acpi_memhotplug.c
> @@ -28,6 +28,7 @@
> */
>
> #include<linux/acpi.h>
> +#include<linux/memory.h>
> #include<linux/memory_hotplug.h>
>
> #include "internal.h"
> @@ -166,13 +167,50 @@ static int acpi_memory_check_device(stru
> return 0;
> }
>
> +static unsigned long acpi_meminfo_start_pfn(struct acpi_memory_info *info)
> +{
> + return PFN_DOWN(info->start_addr);
> +}
> +
> +static unsigned long acpi_meminfo_end_pfn(struct acpi_memory_info *info)
> +{
> + return PFN_UP(info->start_addr + info->length-1);
> +}
> +
> +static int acpi_bind_memblk(struct memory_block *mem, void *arg)
> +{
> + return acpi_bind_one(&mem->dev, (acpi_handle)arg);
> +}
> +
> +static int acpi_bind_memory_blocks(struct acpi_memory_info *info,
> + acpi_handle handle)
> +{
> + return walk_memory_range(acpi_meminfo_start_pfn(info),
> + acpi_meminfo_end_pfn(info), (void *)handle,
> + acpi_bind_memblk);
> +}
> +
> +static int acpi_unbind_memblk(struct memory_block *mem, void *arg)
> +{
> + acpi_unbind_one(&mem->dev);
> + return 0;
> +}
> +
> +static void acpi_unbind_memory_blocks(struct acpi_memory_info *info,
> + acpi_handle handle)
> +{
> + walk_memory_range(acpi_meminfo_start_pfn(info),
> + acpi_meminfo_end_pfn(info), NULL, acpi_unbind_memblk);
> +}
> +
> static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
> {
> + acpi_handle handle = mem_device->device->handle;
> int result, num_enabled = 0;
> struct acpi_memory_info *info;
> int node;
>
> - node = acpi_get_node(mem_device->device->handle);
> + node = acpi_get_node(handle);
> /*
> * Tell the VM there is more memory here...
> * Note: Assume that this function returns zero on success
> @@ -203,6 +241,12 @@ static int acpi_memory_enable_device(str
> if (result&& result != -EEXIST)
> continue;
>
> + result = acpi_bind_memory_blocks(info, handle);
> + if (result) {
> + acpi_unbind_memory_blocks(info, handle);
> + return -ENODEV;
> + }
> +
> info->enabled = 1;
>
> /*
> @@ -229,10 +273,11 @@ static int acpi_memory_enable_device(str
>
> static int acpi_memory_remove_memory(struct acpi_memory_device *mem_device)
> {
> + acpi_handle handle = mem_device->device->handle;
> int result = 0, nid;
> struct acpi_memory_info *info, *n;
>
> - nid = acpi_get_node(mem_device->device->handle);
> + nid = acpi_get_node(handle);
>
> list_for_each_entry_safe(info, n,&mem_device->res_list, list) {
> if (!info->enabled)
> @@ -240,6 +285,8 @@ static int acpi_memory_remove_memory(str
>
> if (nid< 0)
> nid = memory_add_physaddr_to_nid(info->start_addr);
> +
> + acpi_unbind_memory_blocks(info, handle);
> result = remove_memory(nid, info->start_addr, info->length);
> if (result)
> return result;
> @@ -300,7 +347,7 @@ static int acpi_memory_device_add(struct
> if (result) {
> dev_err(&device->dev, "acpi_memory_enable_device() error\n");
> acpi_memory_device_free(mem_device);
> - return -ENODEV;
> + return result;
> }
>
> dev_dbg(&device->dev, "Memory device configured by ACPI\n");
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-21 6:37 ` Tang Chen
@ 2013-05-21 11:15 ` Rafael J. Wysocki
2013-05-22 4:45 ` Tang Chen
0 siblings, 1 reply; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-21 11:15 UTC (permalink / raw)
To: Tang Chen
Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm
On Tuesday, May 21, 2013 02:37:53 PM Tang Chen wrote:
> Hi Rafael,
>
> Please see below.
>
> On 05/04/2013 07:21 PM, Rafael J. Wysocki wrote:
> ......
> > static BLOCKING_NOTIFIER_HEAD(memory_chain);
> > @@ -278,33 +283,64 @@ static int __memory_block_change_state(s
> > {
> > int ret = 0;
> >
> > - if (mem->state != from_state_req) {
> > - ret = -EINVAL;
> > - goto out;
> > - }
> > + if (mem->state != from_state_req)
> > + return -EINVAL;
> >
> > if (to_state == MEM_OFFLINE)
> > mem->state = MEM_GOING_OFFLINE;
> >
> > ret = memory_block_action(mem->start_section_nr, to_state, online_type);
> > -
> > if (ret) {
> > mem->state = from_state_req;
> > - goto out;
> > + } else {
> > + mem->state = to_state;
> > + if (to_state == MEM_ONLINE)
> > + mem->last_online = online_type;
>
> Why do we need to remember last online type ?
>
> And as far as I know, we can obtain which zone a page was in last time it
> was onlined by check page->flags, just like online_pages() does. If we
> use online_kernel or online_movable, the zone boundary will be
> recalculated.
> So we don't need to remember the last online type.
>
> Seeing from your patch, I guess memory_subsys_online() can only handle
> online and offline. So mem->last_online is used to remember what user has
> done through the original way to trigger memory hot-remove, right ? And
> when
> user does it in this new way, it just does the same thing as user does last
> time.
>
> But I still think we don't need to remember it because if finally you call
> online_pages(), it just does the same thing as last time by default.
>
> online_pages()
> {
> ......
> if (online_type == ONLINE_KERNEL ......
>
> if (online_type == ONLINE_MOVABLE......
>
> zone = page_zone(pfn_to_page(pfn));
>
> /* Here, the page will be put into the zone which it belong to last
> time. */
To be honest, it wasn't entirely clear to me that online_pages() would do the
same thing as last time by default. Suppose, for example, that the previous
online_type was ONLINE_MOVABLE. How online_pages() is supposed to know that
it should do the move_pfn_zone_right() if we don't tell it to do that? Or
is that unnecessary, because it's already been done previously?
> ......
> }
>
> I just thought of it. Maybe I missed something in your design. Please tell
> me if I'm wrong.
Well, so what should be passed to __memory_block_change_state() in
memory_subsys_online()? -1?
> Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>
>
> Thanks. :)
Thanks for your comments,
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-21 11:15 ` Rafael J. Wysocki
@ 2013-05-22 4:45 ` Tang Chen
2013-05-22 10:42 ` Rafael J. Wysocki
2013-05-22 22:06 ` [PATCH] Driver core / memory: Simplify __memory_block_change_state() Rafael J. Wysocki
0 siblings, 2 replies; 32+ messages in thread
From: Tang Chen @ 2013-05-22 4:45 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm
Hi Rafael,
On 05/21/2013 07:15 PM, Rafael J. Wysocki wrote:
......
>>> + mem->state = to_state;
>>> + if (to_state == MEM_ONLINE)
>>> + mem->last_online = online_type;
>>
>> Why do we need to remember last online type ?
>>
>> And as far as I know, we can obtain which zone a page was in last time it
>> was onlined by check page->flags, just like online_pages() does. If we
>> use online_kernel or online_movable, the zone boundary will be
>> recalculated.
>> So we don't need to remember the last online type.
>>
>> Seeing from your patch, I guess memory_subsys_online() can only handle
>> online and offline. So mem->last_online is used to remember what user has
>> done through the original way to trigger memory hot-remove, right ? And
>> when
>> user does it in this new way, it just does the same thing as user does last
>> time.
>>
>> But I still think we don't need to remember it because if finally you call
>> online_pages(), it just does the same thing as last time by default.
>>
>> online_pages()
>> {
>> ......
>> if (online_type == ONLINE_KERNEL ......
>>
>> if (online_type == ONLINE_MOVABLE......
>>
>> zone = page_zone(pfn_to_page(pfn));
>>
>> /* Here, the page will be put into the zone which it belong to last
>> time. */
>
> To be honest, it wasn't entirely clear to me that online_pages() would do the
> same thing as last time by default. Suppose, for example, that the previous
> online_type was ONLINE_MOVABLE. How online_pages() is supposed to know that
> it should do the move_pfn_zone_right() if we don't tell it to do that? Or
> is that unnecessary, because it's already been done previously?
Yes, it is unnecessary. move_pfn_zone_right/left() will modify the zone
related
bits in page->flags. But when the page is offline, the zone related bits in
page->flags will not change. So when it is online again, by dafault, it
will
be in the zone which it was in last time.
......
>>
>> I just thought of it. Maybe I missed something in your design. Please tell
>> me if I'm wrong.
>
> Well, so what should be passed to __memory_block_change_state() in
> memory_subsys_online()? -1?
If you want to keep the last time status, you can pass ONLINE_KEEP.
Or -1 is all right.
Thanks. :)
>
>> Reviewed-by: Tang Chen<tangchen@cn.fujitsu.com>
>>
>> Thanks. :)
>
> Thanks for your comments,
> Rafael
>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks
2013-05-22 4:45 ` Tang Chen
@ 2013-05-22 10:42 ` Rafael J. Wysocki
2013-05-22 22:06 ` [PATCH] Driver core / memory: Simplify __memory_block_change_state() Rafael J. Wysocki
1 sibling, 0 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-22 10:42 UTC (permalink / raw)
To: Tang Chen
Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm
On Wednesday, May 22, 2013 12:45:34 PM Tang Chen wrote:
> Hi Rafael,
>
> On 05/21/2013 07:15 PM, Rafael J. Wysocki wrote:
> ......
> >>> + mem->state = to_state;
> >>> + if (to_state == MEM_ONLINE)
> >>> + mem->last_online = online_type;
> >>
> >> Why do we need to remember last online type ?
> >>
> >> And as far as I know, we can obtain which zone a page was in last time it
> >> was onlined by check page->flags, just like online_pages() does. If we
> >> use online_kernel or online_movable, the zone boundary will be
> >> recalculated.
> >> So we don't need to remember the last online type.
> >>
> >> Seeing from your patch, I guess memory_subsys_online() can only handle
> >> online and offline. So mem->last_online is used to remember what user has
> >> done through the original way to trigger memory hot-remove, right ? And
> >> when
> >> user does it in this new way, it just does the same thing as user does last
> >> time.
> >>
> >> But I still think we don't need to remember it because if finally you call
> >> online_pages(), it just does the same thing as last time by default.
> >>
> >> online_pages()
> >> {
> >> ......
> >> if (online_type == ONLINE_KERNEL ......
> >>
> >> if (online_type == ONLINE_MOVABLE......
> >>
> >> zone = page_zone(pfn_to_page(pfn));
> >>
> >> /* Here, the page will be put into the zone which it belong to last
> >> time. */
> >
> > To be honest, it wasn't entirely clear to me that online_pages() would do the
> > same thing as last time by default. Suppose, for example, that the previous
> > online_type was ONLINE_MOVABLE. How online_pages() is supposed to know that
> > it should do the move_pfn_zone_right() if we don't tell it to do that? Or
> > is that unnecessary, because it's already been done previously?
>
> Yes, it is unnecessary. move_pfn_zone_right/left() will modify the zone
> related
> bits in page->flags. But when the page is offline, the zone related bits in
> page->flags will not change. So when it is online again, by dafault, it
> will
> be in the zone which it was in last time.
>
> ......
>
> >>
> >> I just thought of it. Maybe I missed something in your design. Please tell
> >> me if I'm wrong.
> >
> > Well, so what should be passed to __memory_block_change_state() in
> > memory_subsys_online()? -1?
>
> If you want to keep the last time status, you can pass ONLINE_KEEP.
> Or -1 is all right.
>
> Thanks. :)
OK, thanks for the info.
Since the $subject patch is on my acpi-hotplug branch which has gone public
already (and cannot be rebased), I'll prepare a patch with the change you're
recommending on top of it.
Thanks,
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH] Driver core / memory: Simplify __memory_block_change_state()
2013-05-22 4:45 ` Tang Chen
2013-05-22 10:42 ` Rafael J. Wysocki
@ 2013-05-22 22:06 ` Rafael J. Wysocki
2013-05-22 22:14 ` Greg Kroah-Hartman
2013-05-23 4:37 ` Tang Chen
1 sibling, 2 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-22 22:06 UTC (permalink / raw)
To: Tang Chen, Greg Kroah-Hartman
Cc: Toshi Kani, ACPI Devel Maling List, LKML, isimatu.yasuaki,
vasilis.liaskovitis, Len Brown, linux-mm
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
As noted by Tang Chen, the last_online field in struct memory_block
introduced by commit 4960e05 (Driver core: Introduce offline/online
callbacks for memory blocks) is not really necessary, because
online_pages() restores the previous state if passed ONLINE_KEEP as
the last argument. Therefore, remove that field along with the code
referring to it.
References: http://marc.info/?l=linux-kernel&m=136919777305599&w=2
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
Hi,
The patch is on top (and the commit mentioned in the changelog is present in)
the acpi-hotplug branch of the linux-pm.git tree.
Thanks,
Rafael
---
drivers/base/memory.c | 11 ++---------
include/linux/memory.h | 1 -
2 files changed, 2 insertions(+), 10 deletions(-)
Index: linux-pm/drivers/base/memory.c
===================================================================
--- linux-pm.orig/drivers/base/memory.c
+++ linux-pm/drivers/base/memory.c
@@ -291,13 +291,7 @@ static int __memory_block_change_state(s
mem->state = MEM_GOING_OFFLINE;
ret = memory_block_action(mem->start_section_nr, to_state, online_type);
- if (ret) {
- mem->state = from_state_req;
- } else {
- mem->state = to_state;
- if (to_state == MEM_ONLINE)
- mem->last_online = online_type;
- }
+ mem->state = ret ? from_state_req : to_state;
return ret;
}
@@ -310,7 +304,7 @@ static int memory_subsys_online(struct d
ret = mem->state == MEM_ONLINE ? 0 :
__memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE,
- mem->last_online);
+ ONLINE_KEEP);
mutex_unlock(&mem->state_mutex);
return ret;
@@ -618,7 +612,6 @@ static int init_memory_block(struct memo
base_memory_block_id(scn_nr) * sections_per_block;
mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
mem->state = state;
- mem->last_online = ONLINE_KEEP;
mem->section_count++;
mutex_init(&mem->state_mutex);
start_pfn = section_nr_to_pfn(mem->start_section_nr);
Index: linux-pm/include/linux/memory.h
===================================================================
--- linux-pm.orig/include/linux/memory.h
+++ linux-pm/include/linux/memory.h
@@ -26,7 +26,6 @@ struct memory_block {
unsigned long start_section_nr;
unsigned long end_section_nr;
unsigned long state;
- int last_online;
int section_count;
/*
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Driver core / memory: Simplify __memory_block_change_state()
2013-05-22 22:06 ` [PATCH] Driver core / memory: Simplify __memory_block_change_state() Rafael J. Wysocki
@ 2013-05-22 22:14 ` Greg Kroah-Hartman
2013-05-22 23:29 ` Rafael J. Wysocki
2013-05-23 4:37 ` Tang Chen
1 sibling, 1 reply; 32+ messages in thread
From: Greg Kroah-Hartman @ 2013-05-22 22:14 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Tang Chen, Toshi Kani, ACPI Devel Maling List, LKML,
isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm
On Thu, May 23, 2013 at 12:06:50AM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> As noted by Tang Chen, the last_online field in struct memory_block
> introduced by commit 4960e05 (Driver core: Introduce offline/online
> callbacks for memory blocks) is not really necessary, because
> online_pages() restores the previous state if passed ONLINE_KEEP as
> the last argument. Therefore, remove that field along with the code
> referring to it.
>
> References: http://marc.info/?l=linux-kernel&m=136919777305599&w=2
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Driver core / memory: Simplify __memory_block_change_state()
2013-05-22 22:14 ` Greg Kroah-Hartman
@ 2013-05-22 23:29 ` Rafael J. Wysocki
0 siblings, 0 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2013-05-22 23:29 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Tang Chen, Toshi Kani, ACPI Devel Maling List, LKML,
isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm
On Wednesday, May 22, 2013 03:14:43 PM Greg Kroah-Hartman wrote:
> On Thu, May 23, 2013 at 12:06:50AM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > As noted by Tang Chen, the last_online field in struct memory_block
> > introduced by commit 4960e05 (Driver core: Introduce offline/online
> > callbacks for memory blocks) is not really necessary, because
> > online_pages() restores the previous state if passed ONLINE_KEEP as
> > the last argument. Therefore, remove that field along with the code
> > referring to it.
> >
> > References: http://marc.info/?l=linux-kernel&m=136919777305599&w=2
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
>
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thanks!
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Driver core / memory: Simplify __memory_block_change_state()
2013-05-22 22:06 ` [PATCH] Driver core / memory: Simplify __memory_block_change_state() Rafael J. Wysocki
2013-05-22 22:14 ` Greg Kroah-Hartman
@ 2013-05-23 4:37 ` Tang Chen
1 sibling, 0 replies; 32+ messages in thread
From: Tang Chen @ 2013-05-23 4:37 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Greg Kroah-Hartman, Toshi Kani, ACPI Devel Maling List, LKML,
isimatu.yasuaki, vasilis.liaskovitis, Len Brown, linux-mm
Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>
Thanks. :)
On 05/23/2013 06:06 AM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki<rafael.j.wysocki@intel.com>
>
> As noted by Tang Chen, the last_online field in struct memory_block
> introduced by commit 4960e05 (Driver core: Introduce offline/online
> callbacks for memory blocks) is not really necessary, because
> online_pages() restores the previous state if passed ONLINE_KEEP as
> the last argument. Therefore, remove that field along with the code
> referring to it.
>
> References: http://marc.info/?l=linux-kernel&m=136919777305599&w=2
> Signed-off-by: Rafael J. Wysocki<rafael.j.wysocki@intel.com>
> ---
>
> Hi,
>
> The patch is on top (and the commit mentioned in the changelog is present in)
> the acpi-hotplug branch of the linux-pm.git tree.
>
> Thanks,
> Rafael
>
> ---
> drivers/base/memory.c | 11 ++---------
> include/linux/memory.h | 1 -
> 2 files changed, 2 insertions(+), 10 deletions(-)
>
> Index: linux-pm/drivers/base/memory.c
> ===================================================================
> --- linux-pm.orig/drivers/base/memory.c
> +++ linux-pm/drivers/base/memory.c
> @@ -291,13 +291,7 @@ static int __memory_block_change_state(s
> mem->state = MEM_GOING_OFFLINE;
>
> ret = memory_block_action(mem->start_section_nr, to_state, online_type);
> - if (ret) {
> - mem->state = from_state_req;
> - } else {
> - mem->state = to_state;
> - if (to_state == MEM_ONLINE)
> - mem->last_online = online_type;
> - }
> + mem->state = ret ? from_state_req : to_state;
> return ret;
> }
>
> @@ -310,7 +304,7 @@ static int memory_subsys_online(struct d
>
> ret = mem->state == MEM_ONLINE ? 0 :
> __memory_block_change_state(mem, MEM_ONLINE, MEM_OFFLINE,
> - mem->last_online);
> + ONLINE_KEEP);
>
> mutex_unlock(&mem->state_mutex);
> return ret;
> @@ -618,7 +612,6 @@ static int init_memory_block(struct memo
> base_memory_block_id(scn_nr) * sections_per_block;
> mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
> mem->state = state;
> - mem->last_online = ONLINE_KEEP;
> mem->section_count++;
> mutex_init(&mem->state_mutex);
> start_pfn = section_nr_to_pfn(mem->start_section_nr);
> Index: linux-pm/include/linux/memory.h
> ===================================================================
> --- linux-pm.orig/include/linux/memory.h
> +++ linux-pm/include/linux/memory.h
> @@ -26,7 +26,6 @@ struct memory_block {
> unsigned long start_section_nr;
> unsigned long end_section_nr;
> unsigned long state;
> - int last_online;
> int section_count;
>
> /*
>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2013-05-23 4:34 UTC | newest]
Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1576321.HU0tZ4cGWk@vostro.rjw.lan>
[not found] ` <3166726.elbgrUIZ0L@vostro.rjw.lan>
2013-05-04 1:01 ` [PATCH 0/3 RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
2013-05-04 1:03 ` [PATCH 1/3 RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes Rafael J. Wysocki
2013-05-04 1:04 ` [PATCH 2/3 RFC] Driver core: Introduce types of device "online" Rafael J. Wysocki
2013-05-04 1:06 ` [PATCH 3/3 RFC] Driver core: Introduce offline/online callbacks for memory blocks Rafael J. Wysocki
2013-05-04 11:11 ` [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
2013-05-04 11:12 ` [PATCH 1/2 v2, RFC] ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes Rafael J. Wysocki
2013-05-21 6:50 ` Tang Chen
2013-05-04 11:21 ` [PATCH 2/2 v2, RFC] Driver core: Introduce offline/online callbacks for memory blocks Rafael J. Wysocki
2013-05-06 16:28 ` Vasilis Liaskovitis
2013-05-07 0:59 ` Rafael J. Wysocki
2013-05-07 10:59 ` Vasilis Liaskovitis
2013-05-07 12:11 ` Rafael J. Wysocki
2013-05-07 21:03 ` Toshi Kani
2013-05-07 22:10 ` Rafael J. Wysocki
2013-05-07 22:45 ` Toshi Kani
2013-05-07 23:17 ` Rafael J. Wysocki
2013-05-07 23:59 ` Toshi Kani
2013-05-08 0:24 ` Rafael J. Wysocki
2013-05-08 0:37 ` Toshi Kani
2013-05-08 11:53 ` Rafael J. Wysocki
2013-05-08 14:38 ` Toshi Kani
2013-05-06 17:20 ` Greg Kroah-Hartman
2013-05-06 19:46 ` Rafael J. Wysocki
2013-05-21 6:37 ` Tang Chen
2013-05-21 11:15 ` Rafael J. Wysocki
2013-05-22 4:45 ` Tang Chen
2013-05-22 10:42 ` Rafael J. Wysocki
2013-05-22 22:06 ` [PATCH] Driver core / memory: Simplify __memory_block_change_state() Rafael J. Wysocki
2013-05-22 22:14 ` Greg Kroah-Hartman
2013-05-22 23:29 ` Rafael J. Wysocki
2013-05-23 4:37 ` Tang Chen
2013-05-06 10:48 ` [PATCH 0/2 v2, RFC] Driver core: Add offline/online callbacks for memory_subsys Rafael J. Wysocki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).