* kdump: A memory hotplug issue on s390
@ 2011-10-25 17:17 Michael Holzheu
2011-10-26 22:31 ` Simon Horman
0 siblings, 1 reply; 7+ messages in thread
From: Michael Holzheu @ 2011-10-25 17:17 UTC (permalink / raw)
To: Simon Horman, Vivek Goyal; +Cc: kexec
Hello Simon and Vivek,
For s390 we currently use /proc/iomem for defining the memory layout in
the kexec elfcore header. Unfortunately this is not correct when using
memory hotplug. When a memory chunk is set offline (e.g. with "echo
offline > /sys/devices/system/memory/memoryX/state") this is not
reflected in /proc/iomem.
To fix this I could parse /sys/devices/system/memory and exclude each
memory chunk that is not online from the /proc/iomem info. Do you think
that this approach is fine or is there a better solution?
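
The proposed scan could be sketched roughly as below. This is only an
illustration under assumptions, not code from kexec-tools: the helper names
parse_block_name() and block_is_not_online() are hypothetical, and the
"state" text is assumed to be a single word followed by a newline.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 and stores the index if the directory
 * entry name has the form "memory<N>", otherwise returns 0.
 */
static int parse_block_name(const char *name, unsigned long *index)
{
	int consumed = 0;

	assert(name != NULL && index != NULL);
	if (sscanf(name, "memory%lu%n", index, &consumed) != 1)
		return 0;
	/* Reject trailing characters such as "memory12x" */
	return name[consumed] == '\0';
}

/*
 * Hypothetical helper: returns 0 only if the content of a "state" file
 * is exactly "online" (an optional trailing newline is ignored).
 */
static int block_is_not_online(const char *state)
{
	size_t len = strcspn(state, "\n");

	return !(len == 6 && strncmp(state, "online", 6) == 0);
}
```

A caller would walk the entries of /sys/devices/system/memory, skip
everything that parse_block_name() rejects (for example block_size_bytes)
and cut each block that is not online out of the /proc/iomem ranges.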
Michael
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
* Re: kdump: A memory hotplug issue on s390
2011-10-25 17:17 kdump: A memory hotplug issue on s390 Michael Holzheu
@ 2011-10-26 22:31 ` Simon Horman
2011-10-27 17:28 ` Vivek Goyal
2011-10-28 13:35 ` [PATCH] kexec-tools: s390: Fix memory detection for memory hotplug Michael Holzheu
0 siblings, 2 replies; 7+ messages in thread
From: Simon Horman @ 2011-10-26 22:31 UTC (permalink / raw)
To: Michael Holzheu; +Cc: kexec, Vivek Goyal
On Tue, Oct 25, 2011 at 07:17:17PM +0200, Michael Holzheu wrote:
> Hello Simon and Vivek,
>
> For s390 we currently use /proc/iomem for defining the memory layout in
> the kexec elfcore header. Unfortunately this is not correct, when using
> memory hotplug. When a memory chunk is set offline (e.g. with "echo
> offline > /sys/devices/system/memory/memoryX/state") this is not
> reflected in /proc/iomem.
>
> To fix this I could parse /sys/devices/system/memory and exclude each
> memory chunk that is not online from the /proc/iomem info. Do you think
> that this approach is fine or is there a better solution?
Hi Michael,
that sounds like a reasonable approach to me.
IIRC, kexec xen on ia64 makes use of an alternate iomem file,
and this seems to be another example of /proc/iomem not being
the right source of information.
* Re: kdump: A memory hotplug issue on s390
2011-10-26 22:31 ` Simon Horman
@ 2011-10-27 17:28 ` Vivek Goyal
2011-10-27 18:15 ` Michael Holzheu
2011-10-28 13:35 ` [PATCH] kexec-tools: s390: Fix memory detection for memory hotplug Michael Holzheu
1 sibling, 1 reply; 7+ messages in thread
From: Vivek Goyal @ 2011-10-27 17:28 UTC (permalink / raw)
To: Simon Horman; +Cc: Michael Holzheu, kexec
On Thu, Oct 27, 2011 at 07:31:26AM +0900, Simon Horman wrote:
> On Tue, Oct 25, 2011 at 07:17:17PM +0200, Michael Holzheu wrote:
> > Hello Simon and Vivek,
> >
> > For s390 we currently use /proc/iomem for defining the memory layout in
> > the kexec elfcore header. Unfortunately this is not correct, when using
> > memory hotplug. When a memory chunk is set offline (e.g. with "echo
> > offline > /sys/devices/system/memory/memoryX/state") this is not
> > reflected in /proc/iomem.
> >
> > To fix this I could parse /sys/devices/system/memory and exclude each
> > memory chunk that is not online from the /proc/iomem info. Do you think
> > that this approach is fine or is there a better solution?
>
> Hi Michael,
>
> that sounds like a reasonable approach to me.
> IIRC, kexec xen on ia64 makes use of an alternate iomem file,
> and this seems to be another example of /proc/iomem not being
> the right source of information.
Agreed, that sounds reasonable. I have never used the
/sys/devices/system/memory/ interface. Does it work reliably, and how long
has it been working reliably?
Secondly we should do this only for kdump and not for kexec. If some
memory is offlined, then we still want to use it in case of kexec.
What is the meaning of the various entries? I see lots of memory[1-n] entries
on my system, and under the memory0/ dir I see the following.
[memory0]# grep ".*" *
end_phys_index:00000000
phys_device:0
phys_index:00000000
removable:0
state:online
What does it mean? Does memory0 represent a chunk of physical memory? If
yes, where does the segment start and where does it end? Everything
seems to be zero.
So does it represent chunk 0 of memory, which is why both the start and end
index are 0? But where is the chunk size given?
Thanks
Vivek
* Re: kdump: A memory hotplug issue on s390
2011-10-27 17:28 ` Vivek Goyal
@ 2011-10-27 18:15 ` Michael Holzheu
2011-10-27 18:33 ` Vivek Goyal
0 siblings, 1 reply; 7+ messages in thread
From: Michael Holzheu @ 2011-10-27 18:15 UTC (permalink / raw)
To: Vivek Goyal; +Cc: Simon Horman, kexec
Hello Vivek,
On Thu, 2011-10-27 at 13:28 -0400, Vivek Goyal wrote:
> Secondly we should do this only for kdump and not for kexec. If some
> memory is offlined, then we still want to use it in case of kexec.
I don't think so. At least on s390 we can't use it for kexec. If memory
is set offline, it is gone (given back to the hypervisor) and can't be
used any more before it is set to online again.
> What's the meaning of various entries. I see lots of memory[1-n] entries
> in my system and under memory0/ dir I see following.
>
> [memory0]# grep ".*" *
> end_phys_index:00000000
> phys_device:0
> phys_index:00000000
> removable:0
This means that it is not removable, e.g. because non-movable kernel
structures are located in this chunk.
> state:online
It is online.
> What does it mean. Is memory0 representing a chunk of physical memory?
> If yes, then where does the segment start and where does it end. Everything
> seems to be zero.
> So is it representing chunk0 of memory. So both starting and end index
> are 0. But where is the chunk size mentioned?
The file "/sys/devices/system/memory/block_size_bytes" tells you how big
each memory chunk is (in hex).
Assume block_size_bytes is 0x10000000 (256MiB). Then "memory0"
represents the memory from 0x0-0x10000000, "memory1" represents memory
from 0x10000000-0x20000000 and so on. So when you find that the "state"
of "memory1" is "offline", you know that memory 0x10000000-0x20000000 is
not used by the Linux kernel (and should not be included in vmcore) and (at
least on s390) this area is not backed with real memory.
With this information I can update the /proc/iomem info accordingly.
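
The arithmetic above can be sketched in C as follows. This is a minimal
illustration, not kexec-tools code: struct block_range, block_to_range()
and parse_block_size() are made-up names, and the sketch assumes that
block "memoryN" starts at N * block_size_bytes as described above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical type: half-open physical address range [start, end) */
struct block_range {
	unsigned long long start;
	unsigned long long end;
};

/*
 * Parse the content of block_size_bytes, which is a hex number.
 * Returns 0 if the text cannot be parsed.
 */
static unsigned long long parse_block_size(const char *text)
{
	unsigned long long size = 0;

	if (sscanf(text, "%llx", &size) != 1)
		return 0;
	return size;
}

/* Map the index N of "memoryN" to the address range it covers */
static struct block_range block_to_range(unsigned long long index,
					 unsigned long long block_size)
{
	struct block_range r;

	assert(block_size > 0);
	r.start = index * block_size;
	r.end = r.start + block_size;
	return r;
}
```

With a block size of 0x10000000, block_to_range(1, ...) yields
0x10000000-0x20000000, which matches the "memory1" example.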
Michael
* Re: kdump: A memory hotplug issue on s390
2011-10-27 18:15 ` Michael Holzheu
@ 2011-10-27 18:33 ` Vivek Goyal
0 siblings, 0 replies; 7+ messages in thread
From: Vivek Goyal @ 2011-10-27 18:33 UTC (permalink / raw)
To: Michael Holzheu; +Cc: Simon Horman, kexec
On Thu, Oct 27, 2011 at 08:15:03PM +0200, Michael Holzheu wrote:
> Hello Vivek,
>
> On Thu, 2011-10-27 at 13:28 -0400, Vivek Goyal wrote:
> > Secondly we should do this only for kdump and not for kexec. If some
> > memory is offlined, then we still want to use it in case of kexec.
OK, so we seem to have two cases then. On bare metal that memory chunk
can still be used; under virtualization it can't. Maybe every arch can
make its own decision. Or maybe ignoring offlined memory is a safe
default until somebody complains. :-)
>
> I don't think so. At least on s390 we can't use it for kexec. If memory
> is set offline, it is gone (given back to the hypervisor) and can't be
> used any more before it is set to online again.
>
> > What's the meaning of various entries. I see lots of memory[1-n] entries
> > in my system and under memory0/ dir I see following.
> >
> > [memory0]# grep ".*" *
> > end_phys_index:00000000
> > phys_device:0
> > phys_index:00000000
>
> > removable:0
>
> This means that it is not removable, e.g. because non-movable kernel
> structures are located in this chunk.
>
> > state:online
>
> It is online.
>
> > What does it mean. Is memory0 representing a chunk of physical memory?
> > If yes, then where does the segment start and where does it end. Everything
> > seems to be zero.
> > So is it representing chunk0 of memory. So both starting and end index
> > are 0. But where is the chunk size mentioned?
>
> The file "/sys/devices/system/memory/block_size_bytes" tells you how big
> each memory chunk is (in hex).
>
> Assume block_size_bytes is 0x10000000 (256MiB). Then "memory0"
> represents the memory from 0x0-0x10000000, "memory1" represents memory
> from 0x10000000-0x20000000 and so on. So when you find that the "state"
> of "memory1" is "offline", you know that memory 0x10000000-0x20000000 is
> not used by the Linux kernel (and should not be included in vmcore) and (at
> least on s390) this area is not backed with real memory.
>
> With this information I can update the /proc/iomem info accordingly.
OK, thanks for the info. I seem to have 128MB blocks on my system.
So yes, it makes sense to me to exclude memory that is not online from the
kdump map and to restart kdump on memory online/offline events.
Thanks
Vivek
* [PATCH] kexec-tools: s390: Fix memory detection for memory hotplug
2011-10-26 22:31 ` Simon Horman
2011-10-27 17:28 ` Vivek Goyal
@ 2011-10-28 13:35 ` Michael Holzheu
2011-10-31 6:40 ` Simon Horman
1 sibling, 1 reply; 7+ messages in thread
From: Michael Holzheu @ 2011-10-28 13:35 UTC (permalink / raw)
To: Simon Horman; +Cc: kexec, Vivek Goyal
Hello Simon,
Here comes the patch...
On Thu, 2011-10-27 at 07:31 +0900, Simon Horman wrote:
> On Tue, Oct 25, 2011 at 07:17:17PM +0200, Michael Holzheu wrote:
> > To fix this I could parse /sys/devices/system/memory and exclude each
> > memory chunk that is not online from the /proc/iomem info. Do you think
> > that this approach is fine or is there a better solution?
>
> Hi Michael,
>
> that sounds like a reasonable approach to me.
> IIRC, kexec xen on ia64 makes use of an alternate iomem file,
> and this seems to be another example of /proc/iomem not being
> the right source of information.
From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Currently on s390 for memory detection only the "/proc/iomem" file is used.
This file does not reflect whether memory chunks have been set offline. With
this patch the memory hotplug information is read from
"/sys/devices/system/memory" and offline chunks are removed from the
"/proc/iomem" info.
Also the MAX_MEMORY_RANGES count is increased to 1024 in order to support
systems with many memory holes.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
---
kexec/arch/s390/kexec-s390.c | 122 ++++++++++++++++++++++++++++++++++++++++++-
kexec/arch/s390/kexec-s390.h | 3 -
2 files changed, 123 insertions(+), 2 deletions(-)
--- a/kexec/arch/s390/kexec-s390.c
+++ b/kexec/arch/s390/kexec-s390.c
@@ -11,10 +11,13 @@
#define _GNU_SOURCE
#include <stddef.h>
#include <stdio.h>
+#include <stdlib.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <getopt.h>
+#include <sys/types.h>
+#include <dirent.h>
#include "../../kexec.h"
#include "../../kexec-syscall.h"
#include "kexec-s390.h"
@@ -23,6 +26,122 @@
static struct memory_range memory_range[MAX_MEMORY_RANGES];
/*
+ * Read string from file
+ */
+static void read_str(char *string, const char *path, size_t len)
+{
+	size_t rc;
+	FILE *fh;
+
+	fh = fopen(path, "rb");
+	if (fh == NULL)
+		die("Could not open \"%s\"", path);
+	rc = fread(string, 1, len - 1, fh);
+	if (rc == 0 && ferror(fh))
+		die("Could not read \"%s\"", path);
+	fclose(fh);
+	string[rc] = 0;
+	if (rc > 0 && string[rc - 1] == '\n')
+		string[rc - 1] = 0;
+}
+
+/*
+ * Return number of memory chunks
+ */
+static int memory_range_cnt(struct memory_range chunks[])
+{
+	int i;
+
+	for (i = 0; i < MAX_MEMORY_RANGES; i++) {
+		if (chunks[i].end == 0)
+			break;
+	}
+	return i;
+}
+
+/*
+ * Create memory hole with given address and size
+ *
+ * lh = local hole
+ */
+static void add_mem_hole(struct memory_range chunks[], unsigned long addr,
+			 unsigned long size)
+{
+	unsigned long lh_start, lh_end, lh_size, chunk_cnt;
+	int i;
+
+	chunk_cnt = memory_range_cnt(chunks);
+
+	for (i = 0; i < chunk_cnt; i++) {
+		if (addr + size <= chunks[i].start)
+			break;
+		if (addr > chunks[i].end)
+			continue;
+		lh_start = MAX(addr, chunks[i].start);
+		lh_end = MIN(addr + size - 1, chunks[i].end);
+		lh_size = lh_end - lh_start + 1;
+		if (lh_start == chunks[i].start && lh_end == chunks[i].end) {
+			/* Remove chunk */
+			memmove(&chunks[i], &chunks[i + 1],
+				sizeof(struct memory_range) *
+				(MAX_MEMORY_RANGES - (i + 1)));
+			memset(&chunks[MAX_MEMORY_RANGES - 1], 0,
+			       sizeof(struct memory_range));
+			chunk_cnt--;
+			i--;
+		} else if (lh_start == chunks[i].start) {
+			/* Make chunk smaller at start */
+			chunks[i].start = chunks[i].start + lh_size;
+			break;
+		} else if (lh_end == chunks[i].end) {
+			/* Make chunk smaller at end */
+			chunks[i].end = lh_start - 1;
+		} else {
+			/* Split chunk into two */
+			if (chunk_cnt >= MAX_MEMORY_RANGES)
+				die("Unable to create memory hole: %i", i);
+			memmove(&chunks[i + 1], &chunks[i],
+				sizeof(struct memory_range) *
+				(MAX_MEMORY_RANGES - (i + 1)));
+			chunks[i + 1].start = lh_start + lh_size;
+			chunks[i].end = lh_start - 1;
+			break;
+		}
+	}
+}
+
+/*
+ * Remove offline memory from memory chunks
+ */
+static void remove_offline_memory(struct memory_range memory_range[])
+{
+	unsigned long block_size, chunk_nr;
+	struct dirent *dirent;
+	char path[PATH_MAX];
+	char str[64];
+	DIR *dir;
+
+	read_str(str, "/sys/devices/system/memory/block_size_bytes",
+		 sizeof(str));
+	sscanf(str, "%lx", &block_size);
+
+	dir = opendir("/sys/devices/system/memory");
+	if (!dir)
+		die("Could not read \"/sys/devices/system/memory\"");
+	while ((dirent = readdir(dir))) {
+		if (sscanf(dirent->d_name, "memory%lu", &chunk_nr) != 1)
+			continue;
+		sprintf(path, "/sys/devices/system/memory/%s/state",
+			dirent->d_name);
+		read_str(str, path, sizeof(str));
+		if (strncmp(str, "offline", 7) != 0)
+			continue;
+		add_mem_hole(memory_range, chunk_nr * block_size, block_size);
+	}
+	closedir(dir);
+}
+
+/*
* Get memory ranges of type "System RAM" from /proc/iomem. If with_crashk=1
* then also type "Crash kernel" is added.
*/
@@ -66,7 +185,8 @@ int get_memory_ranges_s390(struct memory
}
}
fclose(fp);
- *ranges = current_range;
+ remove_offline_memory(memory_range);
+ *ranges = memory_range_cnt(memory_range);
return 0;
}
--- a/kexec/arch/s390/kexec-s390.h
+++ b/kexec/arch/s390/kexec-s390.h
@@ -19,10 +19,11 @@
#define OLDMEM_SIZE_OFFS 0x420
#define COMMAND_LINE_OFFS 0x480
#define COMMAND_LINESIZE 896
-#define MAX_MEMORY_RANGES 64
+#define MAX_MEMORY_RANGES 1024
#define ALIGN_UP(addr, size) (((addr) + ((size)-1)) & (~((size)-1)))
#define MAX(x, y) ((x) > (y) ? (x) : (y))
+#define MIN(x, y) ((x) < (y) ? (x) : (y))
extern int image_s390_load(int, char **, const char *, off_t, struct kexec_info *);
extern int image_s390_probe(const char *, off_t);
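
To illustrate the four cases that add_mem_hole() distinguishes (remove a
chunk, trim its start, trim its end, split it in two), here is a standalone
sketch. It is not the patch code: the demo_* names and DEMO_MAX_RANGES are
hypothetical, the range count is passed explicitly instead of being derived
from a zero "end" sentinel, and a failed split returns -1 instead of calling
die().

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_RANGES 8	/* hypothetical small limit for the demo */

/* Inclusive range [start, end] */
struct demo_range {
	unsigned long long start;
	unsigned long long end;
};

/*
 * Remove [addr, addr + size - 1] from a sorted, non-overlapping array of
 * ranges. Returns the new number of ranges, or -1 if a split does not fit.
 */
static int demo_punch_hole(struct demo_range r[], int cnt,
			   unsigned long long addr, unsigned long long size)
{
	unsigned long long hole_end;
	int i = 0;

	assert(size > 0);
	hole_end = addr + size - 1;
	while (i < cnt) {
		if (hole_end < r[i].start)
			break;			/* hole ends before this range */
		if (addr > r[i].end) {
			i++;			/* hole starts behind this range */
			continue;
		}
		if (addr <= r[i].start && hole_end >= r[i].end) {
			/* Range fully covered: drop it, re-check same index */
			memmove(&r[i], &r[i + 1], (cnt - i - 1) * sizeof(r[0]));
			cnt--;
		} else if (addr <= r[i].start) {
			r[i].start = hole_end + 1;	/* trim the start */
			break;
		} else if (hole_end >= r[i].end) {
			r[i].end = addr - 1;		/* trim the end */
			i++;
		} else {
			/* Hole in the middle: split into two ranges */
			if (cnt >= DEMO_MAX_RANGES)
				return -1;
			memmove(&r[i + 1], &r[i], (cnt - i) * sizeof(r[0]));
			r[i].end = addr - 1;
			r[i + 1].start = hole_end + 1;
			cnt++;
			break;
		}
	}
	return cnt;
}
```

For example, punching a 0x10000000 byte hole at 0x10000000 into a single
range 0x0-0x3fffffff leaves the two ranges 0x0-0x0fffffff and
0x20000000-0x3fffffff.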
* Re: [PATCH] kexec-tools: s390: Fix memory detection for memory hotplug
2011-10-28 13:35 ` [PATCH] kexec-tools: s390: Fix memory detection for memory hotplug Michael Holzheu
@ 2011-10-31 6:40 ` Simon Horman
0 siblings, 0 replies; 7+ messages in thread
From: Simon Horman @ 2011-10-31 6:40 UTC (permalink / raw)
To: Michael Holzheu; +Cc: kexec, Vivek Goyal
On Fri, Oct 28, 2011 at 03:35:35PM +0200, Michael Holzheu wrote:
> Hello Simon,
>
> Here comes the patch...
>
> On Thu, 2011-10-27 at 07:31 +0900, Simon Horman wrote:
> > On Tue, Oct 25, 2011 at 07:17:17PM +0200, Michael Holzheu wrote:
> > > To fix this I could parse /sys/devices/system/memory and exclude each
> > > memory chunk that in not online from the /proc/iomem info. Do you think
> > > that this approach is fine or is there a better solution?
> >
> > Hi Michael,
> >
> > that sounds like a reasonable approach to me.
> > IIRC, kexec xen on ia64 makes use of an alternate iomem file,
> > and this seems to be another example of /proc/iomem not being
> > the right source of information.
>
> From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
>
> Currently on s390 for memory detection only the "/proc/iomem" file is used.
> This file does not include information on offlined memory chunks. With this
> patch the memory hotplug information is read from "/sys/devices/system/memory"
> and is added to the "/proc/iomem" info.
>
> Also the MAX_MEMORY_RANGES count is increased to 1024 in order to support
> systems with many memory holes.
Thanks Michael, applied.