* [PATCH 1/2] Enumerate all /sys/devices/system/cpu/cpuN when they are discontiguous
2014-06-05 5:10 [PATCH 0/2] kexec command fails after cpu hot-removing Takao Indoh
@ 2014-06-05 5:10 ` Takao Indoh
2014-06-05 5:10 ` [PATCH 2/2] Fix mistaken check of stat(2) return value Takao Indoh
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: Takao Indoh @ 2014-06-05 5:10 UTC
To: horms, kexec
The cpuN directories under /sys/devices/system/cpu are not always numbered
contiguously, for example after a CPU hot-remove. This patch makes sure that
every /sys/devices/system/cpu/cpuN is handled even when the numbering is
discontiguous.
Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
---
kexec/crashdump-elf.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/kexec/crashdump-elf.c b/kexec/crashdump-elf.c
index 2baa357..c869347 100644
--- a/kexec/crashdump-elf.c
+++ b/kexec/crashdump-elf.c
@@ -41,6 +41,7 @@ int FUNC(struct kexec_info *info,
 	uint64_t vmcoreinfo_addr, vmcoreinfo_len;
 	int has_vmcoreinfo = 0;
 	int (*get_note_info)(int cpu, uint64_t *addr, uint64_t *len);
+	long int count_cpu;
 	if (xen_present())
 		nr_cpus = xen_get_nr_phys_cpus();
@@ -138,11 +139,13 @@ int FUNC(struct kexec_info *info,
 	/* PT_NOTE program headers. One per cpu */
-	for (i = 0; i < nr_cpus; i++) {
+	count_cpu = nr_cpus;
+	for (i = 0; count_cpu > 0; i++) {
 		if (get_note_info(i, &notes_addr, &notes_len) < 0) {
 			/* This cpu is not present. Skip it. */
 			continue;
 		}
+		count_cpu--;
 		phdr = (PHDR *) bufp;
 		bufp += sizeof(PHDR);
--
1.9.3
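
For readers following the thread, the counting loop in the hunk above can be
tried in isolation. The following is only a minimal sketch under stated
assumptions: cpu_is_present() is a hypothetical stand-in for get_note_info(),
and the max_id guard is an addition for the sketch that the patch itself does
not have.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for get_note_info(): returns 0 only for CPU ids
 * listed in present[], and -1 for every other id.
 */
int cpu_is_present(int cpu, const int *present, size_t n_present)
{
	for (size_t k = 0; k < n_present; k++)
		if (present[k] == cpu)
			return 0;
	return -1;
}

/*
 * Visit nr_cpus present CPUs, skipping gaps in the id space.
 * Stores the visited ids in out[] and returns the highest id visited,
 * or -1 if max_id is passed first (a guard assumed for this sketch only).
 */
int collect_present_cpus(const int *present, size_t n_present,
			 long nr_cpus, int max_id, int *out)
{
	long count_cpu = nr_cpus;
	int last = -1;
	size_t n = 0;

	for (int i = 0; count_cpu > 0; i++) {
		if (i > max_id)
			return -1;
		if (cpu_is_present(i, present, n_present) < 0)
			continue;	/* This cpu is not present. Skip it. */
		count_cpu--;
		out[n++] = i;
		last = i;
	}
	return last;
}
```

With present ids {0, 1, 2, 5, 6} and nr_cpus = 5, the loop walks past the
missing ids 3 and 4 and stops once five CPUs have been visited. The old loop
condition (i < nr_cpus) would have stopped at id 4 and never reached 5 and 6.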
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
* [PATCH 2/2] Fix mistaken check of stat(2) return value
2014-06-05 5:10 [PATCH 0/2] kexec command fails after cpu hot-removing Takao Indoh
2014-06-05 5:10 ` [PATCH 1/2] Enumerate all /sys/devices/system/cpu/cpuN when they are discontiguous Takao Indoh
@ 2014-06-05 5:10 ` Takao Indoh
2014-06-05 6:04 ` [PATCH 0/2] kexec command fails after cpu hot-removing Zhang Yanfei
2014-06-05 7:32 ` WANG Chao
3 siblings, 0 replies; 6+ messages in thread
From: Takao Indoh @ 2014-06-05 5:10 UTC
To: horms, kexec
get_crash_notes_per_cpu() should return -1 when stat(2) returns zero
(success). The existing check was inverted and treated a successful stat(2)
as a failure.
Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
---
kexec/crashdump.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kexec/crashdump.c b/kexec/crashdump.c
index 131e624..15c1105 100644
--- a/kexec/crashdump.c
+++ b/kexec/crashdump.c
@@ -84,7 +84,7 @@ int get_crash_notes_per_cpu(int cpu, uint64_t *addr, uint64_t *len)
 		if (fopen_errno != ENOENT)
 			die("Could not open \"%s\": %s\n", crash_notes,
 			    strerror(fopen_errno));
-		if (!stat("/sys/devices", &cpu_stat)) {
+		if (stat("/sys/devices", &cpu_stat)) {
 			stat_errno = errno;
 			if (stat_errno == ENOENT)
 				die("\"/sys/devices\" does not exist. "
--
1.9.3
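
The convention behind this one-character fix is that stat(2) returns 0 on
success and -1 on failure, setting errno in the failure case. A minimal
sketch of that convention follows; probe_path() is a hypothetical helper
written for illustration and is not part of kexec-tools.

```c
#include <errno.h>
#include <sys/stat.h>

/*
 * Hypothetical helper, not part of kexec-tools.
 * Returns 0 if path exists (stat(2) succeeded), otherwise the errno
 * value that stat(2) reported.
 */
int probe_path(const char *path)
{
	struct stat st;

	if (stat(path, &st)) {	/* non-zero (-1) means failure */
		return errno;
	}
	return 0;		/* zero means success */
}
```

Writing the test as "if (!stat(...))" enters the branch on success instead,
which is why the unpatched code reported a missing /sys/devices on a system
where sysfs was mounted.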
* Re: [PATCH 0/2] kexec command fails after cpu hot-removing
2014-06-05 5:10 [PATCH 0/2] kexec command fails after cpu hot-removing Takao Indoh
2014-06-05 5:10 ` [PATCH 1/2] Enumerate all /sys/devices/system/cpu/cpuN when they are discontiguous Takao Indoh
2014-06-05 5:10 ` [PATCH 2/2] Fix mistaken check of stat(2) return value Takao Indoh
@ 2014-06-05 6:04 ` Zhang Yanfei
2014-06-05 9:08 ` Simon Horman
2014-06-05 7:32 ` WANG Chao
3 siblings, 1 reply; 6+ messages in thread
From: Zhang Yanfei @ 2014-06-05 6:04 UTC
To: Takao Indoh; +Cc: horms, kexec
I suspect no one had tested kexec-tools after a CPU hot-remove, so the bug
has gone unnoticed until now.
For both patches:
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
On 06/05/2014 01:10 PM, Takao Indoh wrote:
> After a CPU hot-remove, the kexec command fails with the following message:
>
> "/sys/devices" does not exist. Sysfs does not seem to be mounted. Try
> mounting sysfs.
>
> Of course sysfs is mounted. kexec tried to open
> /sys/devices/system/cpu/cpu30/crash_notes but it failed because
> /sys/devices/system/cpu/cpu30 did not exist.
>
>
> [Before hot-remove]
>
> # ls /sys/devices/system/cpu/
> cpu0 cpu111 cpu18 cpu31 cpu45 cpu59 cpu72 cpu86 cpuidle
> cpu1 cpu112 cpu19 cpu32 cpu46 cpu6 cpu73 cpu87 intel_pstate
> cpu10 cpu113 cpu2 cpu33 cpu47 cpu60 cpu74 cpu88 kernel_max
> cpu100 cpu114 cpu20 cpu34 cpu48 cpu61 cpu75 cpu89 microcode
> cpu101 cpu115 cpu21 cpu35 cpu49 cpu62 cpu76 cpu9 modalias
> cpu102 cpu116 cpu22 cpu36 cpu5 cpu63 cpu77 cpu90 offline
> cpu103 cpu117 cpu23 cpu37 cpu50 cpu64 cpu78 cpu91 online
> cpu104 cpu118 cpu24 cpu38 cpu51 cpu65 cpu79 cpu92 possible
> cpu105 cpu119 cpu25 cpu39 cpu52 cpu66 cpu8 cpu93 power
> cpu106 cpu12 cpu26 cpu4 cpu53 cpu67 cpu80 cpu94 present
> cpu107 cpu13 cpu27 cpu40 cpu54 cpu68 cpu81 cpu95 probe
> cpu108 cpu14 cpu28 cpu41 cpu55 cpu69 cpu82 cpu96 release
> cpu109 cpu15 cpu29 cpu42 cpu56 cpu7 cpu83 cpu97 uevent
> cpu11 cpu16 cpu3 cpu43 cpu57 cpu70 cpu84 cpu98
> cpu110 cpu17 cpu30 cpu44 cpu58 cpu71 cpu85 cpu99
>
>
> [After hot-remove]
>
> # ls /sys/devices/system/cpu/
> cpu0 cpu16 cpu23 cpu4 cpu65 cpu72 cpu8 cpu87 modalias uevent
> cpu1 cpu17 cpu24 cpu5 cpu66 cpu73 cpu80 cpu88 offline
> cpu10 cpu18 cpu25 cpu6 cpu67 cpu74 cpu81 cpu89 online
> cpu11 cpu19 cpu26 cpu60 cpu68 cpu75 cpu82 cpu9 possible
> cpu12 cpu2 cpu27 cpu61 cpu69 cpu76 cpu83 cpuidle power
> cpu13 cpu20 cpu28 cpu62 cpu7 cpu77 cpu84 intel_pstate present
> cpu14 cpu21 cpu29 cpu63 cpu70 cpu78 cpu85 kernel_max probe
> cpu15 cpu22 cpu3 cpu64 cpu71 cpu79 cpu86 microcode release
>
> As you can see, cpu30 to cpu59 and cpu90 to cpu119 no longer exist; they
> were removed from this system by the hot-remove operation.
>
> The kexec command expects the directory numbers to be contiguous. For
> example, if sysconf(_SC_NPROCESSORS_CONF) is 60, kexec assumes that the
> directories cpu0, cpu1, cpu2, ..., cpu59 all exist. In this case, however,
> the numbering is not contiguous. That is the root cause of the problem.
> These patches fix it.
>
> Takao Indoh (2):
> Enumerate all /sys/devices/system/cpu/cpuN when they are discontiguous
> Fix mistaken check of stat(2) return value
>
> kexec/crashdump-elf.c | 5 ++++-
> kexec/crashdump.c | 2 +-
> 2 files changed, 5 insertions(+), 2 deletions(-)
>
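
As a side note on the listings quoted above: the directory holds both cpuN
entries and unrelated names such as cpuidle, so any code that enumerates CPUs
by name has to tell them apart. The patches do not do this; they keep probing
ids by number. The following is only a sketch of the name check, and
cpu_id_from_name() is a hypothetical helper, not taken from kexec-tools.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from kexec-tools.
 * Returns the CPU id if name has the form "cpu<digits>", else -1.
 */
int cpu_id_from_name(const char *name)
{
	const char *p;

	/* Must start with "cpu" and have at least one character after it. */
	if (strncmp(name, "cpu", 3) != 0 || name[3] == '\0')
		return -1;

	/* Everything after the prefix must be a decimal digit. */
	for (p = name + 3; *p != '\0'; p++)
		if (!isdigit((unsigned char)*p))
			return -1;

	return atoi(name + 3);
}
```

Applied to each entry returned by readdir(3) on /sys/devices/system/cpu, this
would accept cpu0 or cpu119 and reject cpuidle, online and the other
non-CPU entries.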
--
Thanks.
Zhang Yanfei
* Re: [PATCH 0/2] kexec command fails after cpu hot-removing
2014-06-05 6:04 ` [PATCH 0/2] kexec command fails after cpu hot-removing Zhang Yanfei
@ 2014-06-05 9:08 ` Simon Horman
0 siblings, 0 replies; 6+ messages in thread
From: Simon Horman @ 2014-06-05 9:08 UTC
To: Zhang Yanfei, WANG Chao; +Cc: kexec, Takao Indoh
On Thu, Jun 05, 2014 at 02:04:41PM +0800, Zhang Yanfei wrote:
> I suspect no one had tested kexec-tools after a CPU hot-remove, so the bug
> has gone unnoticed until now.
>
> For both patches:
>
> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
On Thu, Jun 05, 2014 at 03:32:15PM +0800, WANG Chao wrote:
> These two patches look good to me.
>
> Acked-by: WANG Chao <chaowang@redhat.com>
Thanks, applied.
* Re: [PATCH 0/2] kexec command fails after cpu hot-removing
2014-06-05 5:10 [PATCH 0/2] kexec command fails after cpu hot-removing Takao Indoh
` (2 preceding siblings ...)
2014-06-05 6:04 ` [PATCH 0/2] kexec command fails after cpu hot-removing Zhang Yanfei
@ 2014-06-05 7:32 ` WANG Chao
3 siblings, 0 replies; 6+ messages in thread
From: WANG Chao @ 2014-06-05 7:32 UTC
To: Takao Indoh; +Cc: horms, kexec
These two patches look good to me.
Acked-by: WANG Chao <chaowang@redhat.com>