* kdump: A memory hotplug issue on s390
From: Michael Holzheu @ 2011-10-25 17:17 UTC
To: Simon Horman, Vivek Goyal; +Cc: kexec

Hello Simon and Vivek,

For s390 we currently use /proc/iomem for defining the memory layout in
the kexec elfcore header. Unfortunately this is not correct when using
memory hotplug. When a memory chunk is set offline (e.g. with "echo
offline > /sys/devices/system/memory/memoryX/state") this is not
reflected in /proc/iomem.

To fix this I could parse /sys/devices/system/memory and exclude each
memory chunk that is not online from the /proc/iomem info. Do you think
that this approach is fine or is there a better solution?

Michael

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

* Re: kdump: A memory hotplug issue on s390
From: Simon Horman @ 2011-10-26 22:31 UTC
To: Michael Holzheu; +Cc: kexec, Vivek Goyal

On Tue, Oct 25, 2011 at 07:17:17PM +0200, Michael Holzheu wrote:
> Hello Simon and Vivek,
>
> For s390 we currently use /proc/iomem for defining the memory layout in
> the kexec elfcore header. Unfortunately this is not correct when using
> memory hotplug. When a memory chunk is set offline (e.g. with "echo
> offline > /sys/devices/system/memory/memoryX/state") this is not
> reflected in /proc/iomem.
>
> To fix this I could parse /sys/devices/system/memory and exclude each
> memory chunk that is not online from the /proc/iomem info. Do you think
> that this approach is fine or is there a better solution?

Hi Michael,

that sounds like a reasonable approach to me.
IIRC, kexec xen on ia64 makes use of an alternate iomem file,
and this seems to be another example of /proc/iomem not being
the right source of information.

* Re: kdump: A memory hotplug issue on s390
From: Vivek Goyal @ 2011-10-27 17:28 UTC
To: Simon Horman; +Cc: Michael Holzheu, kexec

On Thu, Oct 27, 2011 at 07:31:26AM +0900, Simon Horman wrote:
> On Tue, Oct 25, 2011 at 07:17:17PM +0200, Michael Holzheu wrote:
> > Hello Simon and Vivek,
> >
> > For s390 we currently use /proc/iomem for defining the memory layout in
> > the kexec elfcore header. Unfortunately this is not correct when using
> > memory hotplug. When a memory chunk is set offline (e.g. with "echo
> > offline > /sys/devices/system/memory/memoryX/state") this is not
> > reflected in /proc/iomem.
> >
> > To fix this I could parse /sys/devices/system/memory and exclude each
> > memory chunk that is not online from the /proc/iomem info. Do you think
> > that this approach is fine or is there a better solution?
>
> Hi Michael,
>
> that sounds like a reasonable approach to me.
> IIRC, kexec xen on ia64 makes use of an alternate iomem file,
> and this seems to be another example of /proc/iomem not being
> the right source of information.

Agree that it sounds reasonable. I have never used the
/sys/devices/system/memory/ interface. So does it work reliably and how
long has it been working reliably?

Secondly we should do this only for kdump and not for kexec. If some
memory is offlined, then we still want to use it in case of kexec.

What's the meaning of the various entries? I see lots of memory[1-n]
entries in my system and under the memory0/ dir I see the following.

[memory0]# grep ".*" *
end_phys_index:00000000
phys_device:0
phys_index:00000000
removable:0
state:online

What does it mean? Is memory0 representing a chunk of physical memory?
If yes, then where does the segment start and where does it end?
Everything seems to be zero.

So is it representing chunk0 of memory? So both starting and end index
are 0. But where is the chunk size mentioned?

Thanks
Vivek

* Re: kdump: A memory hotplug issue on s390
From: Michael Holzheu @ 2011-10-27 18:15 UTC
To: Vivek Goyal; +Cc: Simon Horman, kexec

Hello Vivek,

On Thu, 2011-10-27 at 13:28 -0400, Vivek Goyal wrote:
> On Thu, Oct 27, 2011 at 07:31:26AM +0900, Simon Horman wrote:
> > On Tue, Oct 25, 2011 at 07:17:17PM +0200, Michael Holzheu wrote:
> > > Hello Simon and Vivek,
> > >
> > > For s390 we currently use /proc/iomem for defining the memory layout in
> > > the kexec elfcore header. Unfortunately this is not correct when using
> > > memory hotplug. When a memory chunk is set offline (e.g. with "echo
> > > offline > /sys/devices/system/memory/memoryX/state") this is not
> > > reflected in /proc/iomem.
> > >
> > > To fix this I could parse /sys/devices/system/memory and exclude each
> > > memory chunk that is not online from the /proc/iomem info. Do you think
> > > that this approach is fine or is there a better solution?
> >
> > Hi Michael,
> >
> > that sounds like a reasonable approach to me.
> > IIRC, kexec xen on ia64 makes use of an alternate iomem file,
> > and this seems to be another example of /proc/iomem not being
> > the right source of information.
>
> Secondly we should do this only for kdump and not for kexec. If some
> memory is offlined, then we still want to use it in case of kexec.

I don't think so. At least on s390 we can't use it for kexec. If memory
is set offline, it is gone (given back to the hypervisor) and can't be
used any more before it is set to online again.

> What's the meaning of the various entries? I see lots of memory[1-n]
> entries in my system and under the memory0/ dir I see the following.
>
> [memory0]# grep ".*" *
> end_phys_index:00000000
> phys_device:0
> phys_index:00000000
> removable:0

This means that it is not removable, e.g. because non-movable kernel
structures are located in this chunk.

> state:online

It is online.

> What does it mean? Is memory0 representing a chunk of physical memory?
> If yes, then where does the segment start and where does it end?
> Everything seems to be zero.
> So is it representing chunk0 of memory? So both starting and end index
> are 0. But where is the chunk size mentioned?

The file "/sys/devices/system/memory/block_size_bytes" tells you how big
each memory chunk is (in hex).

Assume block_size_bytes is 0x10000000 (256MiB). Then "memory0"
represents the memory from 0x0-0x10000000, "memory1" represents memory
from 0x10000000-0x20000000 and so on. So when you find that the "state"
of "memory1" is "offline", you know that memory 0x10000000-0x20000000 is
not used by the Linux kernel (and should not be included in vmcore) and
(at least on s390) this area is not backed with real memory.

With this information I can update the /proc/iomem info accordingly.

Michael
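
A minimal sketch of the block-to-address mapping described above. It
assumes, as stated in the message, that "memoryN" starts at
N * block_size_bytes and that block_size_bytes holds a hex number. The
names block_range, parse_block_size, block_to_range and block_is_offline
are hypothetical and are not part of kexec-tools.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type for this sketch; not a kexec-tools structure. */
struct block_range {
	unsigned long long start; /* first byte of the block */
	unsigned long long end;   /* last byte of the block (inclusive) */
};

/*
 * Parse the contents of block_size_bytes (a hex number, usually followed
 * by a newline). Returns 0 if the string cannot be parsed.
 */
static unsigned long long parse_block_size(const char *str)
{
	unsigned long long size = 0;

	if (sscanf(str, "%llx", &size) != 1)
		return 0;
	return size;
}

/* Map block number nr ("memory<nr>") to the address range it covers. */
static struct block_range block_to_range(unsigned long nr,
					 unsigned long long block_size)
{
	struct block_range r;

	assert(block_size > 0);
	r.start = (unsigned long long)nr * block_size;
	r.end = r.start + block_size - 1;
	return r;
}

/* Return 1 if the text read from memory<nr>/state starts with "offline". */
static int block_is_offline(const char *state)
{
	return strncmp(state, "offline", 7) == 0;
}
```

With block_size_bytes = 0x10000000, block_to_range(1, 0x10000000) gives
0x10000000-0x1fffffff (inclusive end), which is the range to drop from
the kdump map when memory1/state reads "offline".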

* Re: kdump: A memory hotplug issue on s390
From: Vivek Goyal @ 2011-10-27 18:33 UTC
To: Michael Holzheu; +Cc: Simon Horman, kexec

On Thu, Oct 27, 2011 at 08:15:03PM +0200, Michael Holzheu wrote:
> Hello Vivek,
>
> On Thu, 2011-10-27 at 13:28 -0400, Vivek Goyal wrote:
> > On Thu, Oct 27, 2011 at 07:31:26AM +0900, Simon Horman wrote:
> > > On Tue, Oct 25, 2011 at 07:17:17PM +0200, Michael Holzheu wrote:
> > > > Hello Simon and Vivek,
> > > >
> > > > For s390 we currently use /proc/iomem for defining the memory layout in
> > > > the kexec elfcore header. Unfortunately this is not correct when using
> > > > memory hotplug. When a memory chunk is set offline (e.g. with "echo
> > > > offline > /sys/devices/system/memory/memoryX/state") this is not
> > > > reflected in /proc/iomem.
> > > >
> > > > To fix this I could parse /sys/devices/system/memory and exclude each
> > > > memory chunk that is not online from the /proc/iomem info. Do you think
> > > > that this approach is fine or is there a better solution?
> > >
> > > Hi Michael,
> > >
> > > that sounds like a reasonable approach to me.
> > > IIRC, kexec xen on ia64 makes use of an alternate iomem file,
> > > and this seems to be another example of /proc/iomem not being
> > > the right source of information.
> >
> > Secondly we should do this only for kdump and not for kexec. If some
> > memory is offlined, then we still want to use it in case of kexec.

Ok. So we seem to have two cases then. On bare metal that memory chunk
can still be used. In case of virtualization it can't be used. Maybe
every arch can take its own decision. Or maybe ignoring offlined memory
is a safe default until somebody complains. :-)

>
> I don't think so. At least on s390 we can't use it for kexec. If memory
> is set offline, it is gone (given back to the hypervisor) and can't be
> used any more before it is set to online again.
>
> > What's the meaning of the various entries? I see lots of memory[1-n]
> > entries in my system and under the memory0/ dir I see the following.
> >
> > [memory0]# grep ".*" *
> > end_phys_index:00000000
> > phys_device:0
> > phys_index:00000000
> > removable:0
>
> This means that it is not removable, e.g. because non-movable kernel
> structures are located in this chunk.
>
> > state:online
>
> It is online.
>
> > What does it mean? Is memory0 representing a chunk of physical memory?
> > If yes, then where does the segment start and where does it end?
> > Everything seems to be zero.
> > So is it representing chunk0 of memory? So both starting and end index
> > are 0. But where is the chunk size mentioned?
>
> The file "/sys/devices/system/memory/block_size_bytes" tells you how big
> each memory chunk is (in hex).
>
> Assume block_size_bytes is 0x10000000 (256MiB). Then "memory0"
> represents the memory from 0x0-0x10000000, "memory1" represents memory
> from 0x10000000-0x20000000 and so on. So when you find that the "state"
> of "memory1" is "offline", you know that memory 0x10000000-0x20000000 is
> not used by the Linux kernel (and should not be included in vmcore) and
> (at least on s390) this area is not backed with real memory.
>
> With this information I can update the /proc/iomem info accordingly.

Ok, thanks for the info. I seem to have 128MB sized blocks on my system.

So yes, it makes sense to me to exclude memory which is not online from
the kdump map and to restart kdump on memory online/offline events.

Thanks
Vivek

* [PATCH] kexec-tools: s390: Fix memory detection for memory hotplug
From: Michael Holzheu @ 2011-10-28 13:35 UTC
To: Simon Horman; +Cc: kexec, Vivek Goyal

Hello Simon,

Here comes the patch...

On Thu, 2011-10-27 at 07:31 +0900, Simon Horman wrote:
> On Tue, Oct 25, 2011 at 07:17:17PM +0200, Michael Holzheu wrote:
> > To fix this I could parse /sys/devices/system/memory and exclude each
> > memory chunk that is not online from the /proc/iomem info. Do you think
> > that this approach is fine or is there a better solution?
>
> Hi Michael,
>
> that sounds like a reasonable approach to me.
> IIRC, kexec xen on ia64 makes use of an alternate iomem file,
> and this seems to be another example of /proc/iomem not being
> the right source of information.

From: Michael Holzheu <holzheu@linux.vnet.ibm.com>

Currently on s390 for memory detection only the "/proc/iomem" file is used.
This file does not include information on offlined memory chunks. With this
patch the memory hotplug information is read from "/sys/devices/system/memory"
and is added to the "/proc/iomem" info.

Also the MAX_MEMORY_RANGES count is increased to 1024 in order to support
systems with many memory holes.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
 kexec/arch/s390/kexec-s390.c |  122 ++++++++++++++++++++++++++++++++++++++++++-
 kexec/arch/s390/kexec-s390.h |    3 -
 2 files changed, 123 insertions(+), 2 deletions(-)

--- a/kexec/arch/s390/kexec-s390.c
+++ b/kexec/arch/s390/kexec-s390.c
@@ -11,10 +11,13 @@
 #define _GNU_SOURCE
 #include <stddef.h>
 #include <stdio.h>
+#include <stdlib.h>
 #include <errno.h>
 #include <stdint.h>
 #include <string.h>
 #include <getopt.h>
+#include <sys/types.h>
+#include <dirent.h>
 #include "../../kexec.h"
 #include "../../kexec-syscall.h"
 #include "kexec-s390.h"
@@ -23,6 +26,122 @@
 static struct memory_range memory_range[MAX_MEMORY_RANGES];
 
 /*
+ * Read string from file
+ */
+static void read_str(char *string, const char *path, size_t len)
+{
+	size_t rc;
+	FILE *fh;
+
+	fh = fopen(path, "rb");
+	if (fh == NULL)
+		die("Could not open \"%s\"", path);
+	rc = fread(string, 1, len - 1, fh);
+	if (rc == 0 && ferror(fh))
+		die("Could not read \"%s\"", path);
+	fclose(fh);
+	string[rc] = 0;
+	if (string[strlen(string) - 1] == '\n')
+		string[strlen(string) - 1] = 0;
+}
+
+/*
+ * Return number of memory chunks
+ */
+static int memory_range_cnt(struct memory_range chunks[])
+{
+	int i;
+
+	for (i = 0; i < MAX_MEMORY_RANGES; i++) {
+		if (chunks[i].end == 0)
+			break;
+	}
+	return i;
+}
+
+/*
+ * Create memory hole with given address and size
+ *
+ * lh = local hole
+ */
+static void add_mem_hole(struct memory_range chunks[], unsigned long addr,
+			 unsigned long size)
+{
+	unsigned long lh_start, lh_end, lh_size, chunk_cnt;
+	int i;
+
+	chunk_cnt = memory_range_cnt(chunks);
+
+	for (i = 0; i < chunk_cnt; i++) {
+		if (addr + size <= chunks[i].start)
+			break;
+		if (addr > chunks[i].end)
+			continue;
+		lh_start = MAX(addr, chunks[i].start);
+		lh_end = MIN(addr + size - 1, chunks[i].end);
+		lh_size = lh_end - lh_start + 1;
+		if (lh_start == chunks[i].start && lh_end == chunks[i].end) {
+			/* Remove chunk */
+			memmove(&chunks[i], &chunks[i + 1],
+				sizeof(struct memory_range) *
+				(MAX_MEMORY_RANGES - (i + 1)));
+			memset(&chunks[MAX_MEMORY_RANGES - 1], 0,
+			       sizeof(struct memory_range));
+			chunk_cnt--;
+			i--;
+		} else if (lh_start == chunks[i].start) {
+			/* Make chunk smaller at start */
+			chunks[i].start = chunks[i].start + lh_size;
+			break;
+		} else if (lh_end == chunks[i].end) {
+			/* Make chunk smaller at end */
+			chunks[i].end = lh_start - 1;
+		} else {
+			/* Split chunk into two */
+			if (chunk_cnt >= MAX_MEMORY_RANGES)
+				die("Unable to create memory hole: %i", i);
+			memmove(&chunks[i + 1], &chunks[i],
+				sizeof(struct memory_range) *
+				(MAX_MEMORY_RANGES - (i + 1)));
+			chunks[i + 1].start = lh_start + lh_size;
+			chunks[i].end = lh_start - 1;
+			break;
+		}
+	}
+}
+
+/*
+ * Remove offline memory from memory chunks
+ */
+static void remove_offline_memory(struct memory_range memory_range[])
+{
+	unsigned long block_size, chunk_nr;
+	struct dirent *dirent;
+	char path[PATH_MAX];
+	char str[64];
+	DIR *dir;
+
+	read_str(str, "/sys/devices/system/memory/block_size_bytes",
+		 sizeof(str));
+	sscanf(str, "%lx", &block_size);
+
+	dir = opendir("/sys/devices/system/memory");
+	if (!dir)
+		die("Could not read \"/sys/devices/system/memory\"");
+	while ((dirent = readdir(dir))) {
+		if (sscanf(dirent->d_name, "memory%ld\n", &chunk_nr) != 1)
+			continue;
+		sprintf(path, "/sys/devices/system/memory/%s/state",
+			dirent->d_name);
+		read_str(str, path, sizeof(str));
+		if (strncmp(str, "offline", 6) != 0)
+			continue;
+		add_mem_hole(memory_range, chunk_nr * block_size, block_size);
+	}
+	closedir(dir);
+}
+
+/*
  * Get memory ranges of type "System RAM" from /proc/iomem. If with_crashk=1
  * then also type "Crash kernel" is added.
  */
@@ -66,7 +185,8 @@ int get_memory_ranges_s390(struct memory
 		}
 	}
 	fclose(fp);
-	*ranges = current_range;
+	remove_offline_memory(memory_range);
+	*ranges = memory_range_cnt(memory_range);
 	return 0;
 }
 
--- a/kexec/arch/s390/kexec-s390.h
+++ b/kexec/arch/s390/kexec-s390.h
@@ -19,10 +19,11 @@
 #define OLDMEM_SIZE_OFFS      0x420
 #define COMMAND_LINE_OFFS     0x480
 #define COMMAND_LINESIZE      896
-#define MAX_MEMORY_RANGES     64
+#define MAX_MEMORY_RANGES     1024
 
 #define ALIGN_UP(addr, size) (((addr) + ((size)-1)) & (~((size)-1)))
 #define MAX(x, y) ((x) > (y) ? (x) : (y))
+#define MIN(x, y) ((x) < (y) ? (x) : (y))
 
 extern int image_s390_load(int, char **, const char *, off_t, struct kexec_info *);
 extern int image_s390_probe(const char *, off_t);
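
The hole-punching step of the patch can also be looked at in isolation.
The following is a simplified sketch, not the code from the patch: it
tracks the number of entries explicitly instead of using end == 0 as a
terminator, and the names range, punch_hole and SKETCH_MAX_RANGES are
hypothetical. It assumes a sorted, non-overlapping array with inclusive
end addresses.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_RANGES 8 /* hypothetical capacity, for this sketch only */

struct range {
	unsigned long start; /* first byte */
	unsigned long end;   /* last byte (inclusive) */
};

/*
 * Cut [addr, addr + size - 1] out of r[0..*cnt-1]. Returns 0 on success
 * or -1 if splitting a range would exceed SKETCH_MAX_RANGES.
 */
static int punch_hole(struct range r[], int *cnt, unsigned long addr,
		      unsigned long size)
{
	unsigned long hole_end;
	int i = 0;

	assert(size > 0);
	hole_end = addr + size - 1;

	while (i < *cnt) {
		if (hole_end < r[i].start)
			break;		/* array is sorted: nothing more to do */
		if (addr > r[i].end) {
			i++;		/* hole starts after this range */
			continue;
		}
		if (addr <= r[i].start && hole_end >= r[i].end) {
			/* Range is fully covered: remove it */
			memmove(&r[i], &r[i + 1],
				sizeof(r[0]) * (size_t)(*cnt - i - 1));
			(*cnt)--;
			/* do not advance: a new entry now sits in slot i */
		} else if (addr <= r[i].start) {
			/* Hole covers the front: move the start up */
			r[i].start = hole_end + 1;
			break;
		} else if (hole_end >= r[i].end) {
			/* Hole covers the tail: move the end down */
			r[i].end = addr - 1;
			i++;
		} else {
			/* Hole lies strictly inside: split into two ranges */
			if (*cnt >= SKETCH_MAX_RANGES)
				return -1;
			memmove(&r[i + 2], &r[i + 1],
				sizeof(r[0]) * (size_t)(*cnt - i - 1));
			r[i + 1].start = hole_end + 1;
			r[i + 1].end = r[i].end;
			r[i].end = addr - 1;
			(*cnt)++;
			break;
		}
	}
	return 0;
}
```

For a single range 0x0-0x3fffffff and 256MiB blocks, punching out
"memory1" splits the range into 0x0-0x0fffffff and
0x20000000-0x3fffffff.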

* Re: [PATCH] kexec-tools: s390: Fix memory detection for memory hotplug
From: Simon Horman @ 2011-10-31 6:40 UTC
To: Michael Holzheu; +Cc: kexec, Vivek Goyal

On Fri, Oct 28, 2011 at 03:35:35PM +0200, Michael Holzheu wrote:
> Hello Simon,
>
> Here comes the patch...
>
> On Thu, 2011-10-27 at 07:31 +0900, Simon Horman wrote:
> > On Tue, Oct 25, 2011 at 07:17:17PM +0200, Michael Holzheu wrote:
> > > To fix this I could parse /sys/devices/system/memory and exclude each
> > > memory chunk that is not online from the /proc/iomem info. Do you think
> > > that this approach is fine or is there a better solution?
> >
> > Hi Michael,
> >
> > that sounds like a reasonable approach to me.
> > IIRC, kexec xen on ia64 makes use of an alternate iomem file,
> > and this seems to be another example of /proc/iomem not being
> > the right source of information.
>
> From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
>
> Currently on s390 for memory detection only the "/proc/iomem" file is used.
> This file does not include information on offlined memory chunks. With this
> patch the memory hotplug information is read from "/sys/devices/system/memory"
> and is added to the "/proc/iomem" info.
>
> Also the MAX_MEMORY_RANGES count is increased to 1024 in order to support
> systems with many memory holes.

Thanks Michael, applied.