* [PATCH 0/2] kexec command fails after cpu hot-removing
@ 2014-06-05 5:10 Takao Indoh
2014-06-05 5:10 ` [PATCH 1/2] Enumerate all /sys/devices/system/cpu/cpuN when they are discontiguous Takao Indoh
` (3 more replies)
0 siblings, 4 replies; 6+ messages in thread
From: Takao Indoh @ 2014-06-05 5:10 UTC (permalink / raw)
To: horms, kexec
After a CPU hot-remove, the kexec command fails with the following message:
"/sys/devices" does not exist. Sysfs does not seem to be mounted. Try
mounting sysfs.
Sysfs is of course mounted. kexec tried to open
/sys/devices/system/cpu/cpu30/crash_notes, but the open failed because
/sys/devices/system/cpu/cpu30 did not exist.
[Before hot-remove]
# ls /sys/devices/system/cpu/
cpu0 cpu111 cpu18 cpu31 cpu45 cpu59 cpu72 cpu86 cpuidle
cpu1 cpu112 cpu19 cpu32 cpu46 cpu6 cpu73 cpu87 intel_pstate
cpu10 cpu113 cpu2 cpu33 cpu47 cpu60 cpu74 cpu88 kernel_max
cpu100 cpu114 cpu20 cpu34 cpu48 cpu61 cpu75 cpu89 microcode
cpu101 cpu115 cpu21 cpu35 cpu49 cpu62 cpu76 cpu9 modalias
cpu102 cpu116 cpu22 cpu36 cpu5 cpu63 cpu77 cpu90 offline
cpu103 cpu117 cpu23 cpu37 cpu50 cpu64 cpu78 cpu91 online
cpu104 cpu118 cpu24 cpu38 cpu51 cpu65 cpu79 cpu92 possible
cpu105 cpu119 cpu25 cpu39 cpu52 cpu66 cpu8 cpu93 power
cpu106 cpu12 cpu26 cpu4 cpu53 cpu67 cpu80 cpu94 present
cpu107 cpu13 cpu27 cpu40 cpu54 cpu68 cpu81 cpu95 probe
cpu108 cpu14 cpu28 cpu41 cpu55 cpu69 cpu82 cpu96 release
cpu109 cpu15 cpu29 cpu42 cpu56 cpu7 cpu83 cpu97 uevent
cpu11 cpu16 cpu3 cpu43 cpu57 cpu70 cpu84 cpu98
cpu110 cpu17 cpu30 cpu44 cpu58 cpu71 cpu85 cpu99
[After hot-remove]
# ls /sys/devices/system/cpu/
cpu0 cpu16 cpu23 cpu4 cpu65 cpu72 cpu8 cpu87 modalias uevent
cpu1 cpu17 cpu24 cpu5 cpu66 cpu73 cpu80 cpu88 offline
cpu10 cpu18 cpu25 cpu6 cpu67 cpu74 cpu81 cpu89 online
cpu11 cpu19 cpu26 cpu60 cpu68 cpu75 cpu82 cpu9 possible
cpu12 cpu2 cpu27 cpu61 cpu69 cpu76 cpu83 cpuidle power
cpu13 cpu20 cpu28 cpu62 cpu7 cpu77 cpu84 intel_pstate present
cpu14 cpu21 cpu29 cpu63 cpu70 cpu78 cpu85 kernel_max probe
cpu15 cpu22 cpu3 cpu64 cpu71 cpu79 cpu86 microcode release
As you can see, cpu30 to cpu59 and cpu90 to cpu119 no longer exist; they
were removed from this system by the hot-remove operation.
The kexec command expects the cpuN directory numbers to be contiguous. For
example, if sysconf(_SC_NPROCESSORS_CONF) is 60, kexec assumes that the
directories cpu0, cpu1, cpu2, ..., cpu59 exist. In this case, however, the
directory numbers are not contiguous. That is the root cause of this problem.
These patches fix it.
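
To illustrate the discontiguous numbering described above, here is a minimal
sketch that lists the cpuN entries actually present in a directory instead of
assuming cpu0..cpu(N-1). It is not what the patches do (they keep probing
cpu0, cpu1, ... and skip the missing ones); parse_cpu_dirname() and
list_cpu_ids() are hypothetical helper names used only for this example.

```c
#include <ctype.h>
#include <dirent.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return N if name is exactly "cpuN" (decimal
 * digits only), otherwise -1. Entries such as "cpuidle" or "online"
 * are rejected. */
int parse_cpu_dirname(const char *name)
{
	char *end;
	long n;

	if (strncmp(name, "cpu", 3) != 0 || !isdigit((unsigned char)name[3]))
		return -1;
	n = strtol(name + 3, &end, 10);
	if (*end != '\0' || n > INT_MAX)
		return -1;
	return (int)n;
}

/* Hypothetical helper: store up to max_ids CPU numbers found under dir,
 * in readdir order (unsorted). Returns how many were stored, or -1 if
 * dir cannot be opened. */
int list_cpu_ids(const char *dir, int *ids, int max_ids)
{
	DIR *d = opendir(dir);
	struct dirent *ent;
	int n = 0;

	if (d == NULL)
		return -1;
	while (n < max_ids && (ent = readdir(d)) != NULL) {
		int cpu = parse_cpu_dirname(ent->d_name);

		if (cpu >= 0)
			ids[n++] = cpu;
	}
	closedir(d);
	return n;
}
```

Calling list_cpu_ids("/sys/devices/system/cpu", ids, 1024) on the system
above would be expected to report only the CPU numbers that survived the
hot-remove, with the gaps at 30-59 and 90-119 simply absent from the result.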
Takao Indoh (2):
Enumerate all /sys/devices/system/cpu/cpuN when they are discontiguous
Fix mistaken check of stat(2) return value
kexec/crashdump-elf.c | 5 ++++-
kexec/crashdump.c | 2 +-
2 files changed, 5 insertions(+), 2 deletions(-)
--
1.9.3
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
* [PATCH 1/2] Enumerate all /sys/devices/system/cpu/cpuN when they are discontiguous
2014-06-05 5:10 [PATCH 0/2] kexec command fails after cpu hot-removing Takao Indoh
@ 2014-06-05 5:10 ` Takao Indoh
2014-06-05 5:10 ` [PATCH 2/2] Fix mistaken check of stat(2) return value Takao Indoh
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: Takao Indoh @ 2014-06-05 5:10 UTC (permalink / raw)
To: horms, kexec
The /sys/devices/system/cpu/cpuN directories are not always numbered
contiguously, for example after a CPU hot-remove. This patch makes sure
that every /sys/devices/system/cpu/cpuN is handled even when the numbering
is discontiguous.
Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
---
kexec/crashdump-elf.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/kexec/crashdump-elf.c b/kexec/crashdump-elf.c
index 2baa357..c869347 100644
--- a/kexec/crashdump-elf.c
+++ b/kexec/crashdump-elf.c
@@ -41,6 +41,7 @@ int FUNC(struct kexec_info *info,
uint64_t vmcoreinfo_addr, vmcoreinfo_len;
int has_vmcoreinfo = 0;
int (*get_note_info)(int cpu, uint64_t *addr, uint64_t *len);
+ long int count_cpu;
if (xen_present())
nr_cpus = xen_get_nr_phys_cpus();
@@ -138,11 +139,13 @@ int FUNC(struct kexec_info *info,
/* PT_NOTE program headers. One per cpu */
- for (i = 0; i < nr_cpus; i++) {
+ count_cpu = nr_cpus;
+ for (i = 0; count_cpu > 0; i++) {
 if (get_note_info(i, &notes_addr, &notes_len) < 0) {
/* This cpu is not present. Skip it. */
continue;
}
+ count_cpu--;
phdr = (PHDR *) bufp;
bufp += sizeof(PHDR);
--
1.9.3
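
The loop change in this patch can be sketched outside of kexec as follows.
This is only an illustration under stated assumptions: mock_get_note_info()
is a hypothetical stand-in for the sysfs lookup (only cpu0-cpu2 and cpu6-cpu8
are "present"), collect_present_cpus() is a hypothetical name, and the
max_cpu_id bound is an extra safety limit added for the sketch that is not
part of the patch.

```c
#include <stdint.h>

/* Same shape as the get_note_info callback in the patch: a negative
 * return value means the CPU is not present. */
typedef int (*note_info_fn)(int cpu, uint64_t *addr, uint64_t *len);

/* Hypothetical stand-in for sysfs: cpu0-cpu2 and cpu6-cpu8 exist. */
int mock_get_note_info(int cpu, uint64_t *addr, uint64_t *len)
{
	if (cpu < 0 || (cpu > 2 && cpu < 6) || cpu > 8)
		return -1;
	*addr = 0x1000u * (uint64_t)(cpu + 1);
	*len = 0x100;
	return 0;
}

/* Walk CPU numbers upward until nr_cpus present CPUs have been seen,
 * skipping gaps in the numbering. The numbers of the CPUs handled are
 * written to found[], which must have room for nr_cpus entries.
 * max_cpu_id is an assumption of this sketch, not of the patch. */
int collect_present_cpus(note_info_fn get, int nr_cpus, int max_cpu_id,
			 int *found)
{
	uint64_t addr, len;
	int remaining = nr_cpus;
	int n = 0;
	int i;

	for (i = 0; remaining > 0 && i <= max_cpu_id; i++) {
		if (get(i, &addr, &len) < 0)
			continue;	/* This cpu is not present. Skip it. */
		remaining--;
		found[n++] = i;
	}
	return n;
}
```

With nr_cpus set to 6, the old "i < nr_cpus" loop would only ever look at
cpu0-cpu5 and so handle three CPUs, while the counting loop keeps going until
it has handled all six present CPUs.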
* [PATCH 2/2] Fix mistaken check of stat(2) return value
2014-06-05 5:10 [PATCH 0/2] kexec command fails after cpu hot-removing Takao Indoh
2014-06-05 5:10 ` [PATCH 1/2] Enumerate all /sys/devices/system/cpu/cpuN when they are discontiguous Takao Indoh
@ 2014-06-05 5:10 ` Takao Indoh
2014-06-05 6:04 ` [PATCH 0/2] kexec command fails after cpu hot-removing Zhang Yanfei
2014-06-05 7:32 ` WANG Chao
3 siblings, 0 replies; 6+ messages in thread
From: Takao Indoh @ 2014-06-05 5:10 UTC (permalink / raw)
To: horms, kexec
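The check of the stat(2) return value in get_crash_notes_per_cpu() is
inverted. stat(2) returns zero on success, and in that case /sys/devices
exists, so get_crash_notes_per_cpu() should return -1 instead of reporting
that sysfs is not mounted.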
Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
---
kexec/crashdump.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kexec/crashdump.c b/kexec/crashdump.c
index 131e624..15c1105 100644
--- a/kexec/crashdump.c
+++ b/kexec/crashdump.c
@@ -84,7 +84,7 @@ int get_crash_notes_per_cpu(int cpu, uint64_t *addr, uint64_t *len)
if (fopen_errno != ENOENT)
die("Could not open \"%s\": %s\n", crash_notes,
strerror(fopen_errno));
- if (!stat("/sys/devices", &cpu_stat)) {
+ if (stat("/sys/devices", &cpu_stat)) {
stat_errno = errno;
if (stat_errno == ENOENT)
die("\"/sys/devices\" does not exist. "
--
1.9.3
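
As a reminder of the stat(2) convention this fix relies on, here is a minimal
sketch. stat(2) returns 0 on success and -1 on failure, and errno is only
meaningful in the failure case. path_missing_errno() is a hypothetical helper
name used only for this example; it is not part of kexec-tools.

```c
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical helper: return 0 if path exists, otherwise the errno
 * value reported by stat(2). */
int path_missing_errno(const char *path)
{
	struct stat st;

	/* A nonzero return value is the error case; only then is errno
	 * valid. Testing "!stat(...)" would take this branch on success
	 * and read a stale errno. */
	if (stat(path, &st) != 0)
		return errno;
	return 0;
}
```

A caller that wants to report "sysfs is not mounted" would only do so when
path_missing_errno("/sys/devices") returns ENOENT, which is the case the
corrected condition in the patch selects.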
* Re: [PATCH 0/2] kexec command fails after cpu hot-removing
2014-06-05 5:10 [PATCH 0/2] kexec command fails after cpu hot-removing Takao Indoh
2014-06-05 5:10 ` [PATCH 1/2] Enumerate all /sys/devices/system/cpu/cpuN when they are discontiguous Takao Indoh
2014-06-05 5:10 ` [PATCH 2/2] Fix mistaken check of stat(2) return value Takao Indoh
@ 2014-06-05 6:04 ` Zhang Yanfei
2014-06-05 9:08 ` Simon Horman
2014-06-05 7:32 ` WANG Chao
3 siblings, 1 reply; 6+ messages in thread
From: Zhang Yanfei @ 2014-06-05 6:04 UTC (permalink / raw)
To: Takao Indoh; +Cc: horms, kexec
I suspect no one had tested kexec-tools after a CPU hot-remove, so the
bug has remained until today.
For both patches:
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
On 06/05/2014 01:10 PM, Takao Indoh wrote:
> After cpu hot-removing, kexec command fails with the following message.
>
> "/sys/devices" does not exist. Sysfs does not seem to be mounted. Try
> mounting sysfs.
>
> Of course sysfs is mounted. kexec tried to open
> /sys/devices/system/cpu/cpu30/crash_notes but it failed because
> /sys/devices/system/cpu/cpu30 did not exist.
>
>
> [Before hot-remove]
>
> # ls /sys/devices/system/cpu/
> cpu0 cpu111 cpu18 cpu31 cpu45 cpu59 cpu72 cpu86 cpuidle
> cpu1 cpu112 cpu19 cpu32 cpu46 cpu6 cpu73 cpu87 intel_pstate
> cpu10 cpu113 cpu2 cpu33 cpu47 cpu60 cpu74 cpu88 kernel_max
> cpu100 cpu114 cpu20 cpu34 cpu48 cpu61 cpu75 cpu89 microcode
> cpu101 cpu115 cpu21 cpu35 cpu49 cpu62 cpu76 cpu9 modalias
> cpu102 cpu116 cpu22 cpu36 cpu5 cpu63 cpu77 cpu90 offline
> cpu103 cpu117 cpu23 cpu37 cpu50 cpu64 cpu78 cpu91 online
> cpu104 cpu118 cpu24 cpu38 cpu51 cpu65 cpu79 cpu92 possible
> cpu105 cpu119 cpu25 cpu39 cpu52 cpu66 cpu8 cpu93 power
> cpu106 cpu12 cpu26 cpu4 cpu53 cpu67 cpu80 cpu94 present
> cpu107 cpu13 cpu27 cpu40 cpu54 cpu68 cpu81 cpu95 probe
> cpu108 cpu14 cpu28 cpu41 cpu55 cpu69 cpu82 cpu96 release
> cpu109 cpu15 cpu29 cpu42 cpu56 cpu7 cpu83 cpu97 uevent
> cpu11 cpu16 cpu3 cpu43 cpu57 cpu70 cpu84 cpu98
> cpu110 cpu17 cpu30 cpu44 cpu58 cpu71 cpu85 cpu99
>
>
> [After hot-remove]
>
> # ls /sys/devices/system/cpu/
> cpu0 cpu16 cpu23 cpu4 cpu65 cpu72 cpu8 cpu87 modalias uevent
> cpu1 cpu17 cpu24 cpu5 cpu66 cpu73 cpu80 cpu88 offline
> cpu10 cpu18 cpu25 cpu6 cpu67 cpu74 cpu81 cpu89 online
> cpu11 cpu19 cpu26 cpu60 cpu68 cpu75 cpu82 cpu9 possible
> cpu12 cpu2 cpu27 cpu61 cpu69 cpu76 cpu83 cpuidle power
> cpu13 cpu20 cpu28 cpu62 cpu7 cpu77 cpu84 intel_pstate present
> cpu14 cpu21 cpu29 cpu63 cpu70 cpu78 cpu85 kernel_max probe
> cpu15 cpu22 cpu3 cpu64 cpu71 cpu79 cpu86 microcode release
>
> You can see cpu30 to cpu59, and cpu90 to cpu119 does not exist, they are
> removed from this system by hot-remove operation.
>
> kexec command expects the number of each directory is contiguous. For
> example, if sysconf(_SC_NPROCESSORS_CONF) is 60, kexec thinks directory
> cpu0, cpu1, cpu2, .... cpu59 exists. But in this case, the number of
> directory is not contiguous. That is the root cause of this problem.
> This patches fix it.
>
> Takao Indoh (2):
> Enumerate all /sys/devices/system/cpu/cpuN when they are discontiguous
> Fix mistaken check of stat(2) return value
>
> kexec/crashdump-elf.c | 5 ++++-
> kexec/crashdump.c | 2 +-
> 2 files changed, 5 insertions(+), 2 deletions(-)
>
--
Thanks.
Zhang Yanfei
* Re: [PATCH 0/2] kexec command fails after cpu hot-removing
2014-06-05 5:10 [PATCH 0/2] kexec command fails after cpu hot-removing Takao Indoh
` (2 preceding siblings ...)
2014-06-05 6:04 ` [PATCH 0/2] kexec command fails after cpu hot-removing Zhang Yanfei
@ 2014-06-05 7:32 ` WANG Chao
3 siblings, 0 replies; 6+ messages in thread
From: WANG Chao @ 2014-06-05 7:32 UTC (permalink / raw)
To: Takao Indoh; +Cc: horms, kexec
On 06/05/14 at 02:10pm, Takao Indoh wrote:
> After cpu hot-removing, kexec command fails with the following message.
>
> "/sys/devices" does not exist. Sysfs does not seem to be mounted. Try
> mounting sysfs.
>
> Of course sysfs is mounted. kexec tried to open
> /sys/devices/system/cpu/cpu30/crash_notes but it failed because
> /sys/devices/system/cpu/cpu30 did not exist.
>
>
> [Before hot-remove]
>
> # ls /sys/devices/system/cpu/
> cpu0 cpu111 cpu18 cpu31 cpu45 cpu59 cpu72 cpu86 cpuidle
> cpu1 cpu112 cpu19 cpu32 cpu46 cpu6 cpu73 cpu87 intel_pstate
> cpu10 cpu113 cpu2 cpu33 cpu47 cpu60 cpu74 cpu88 kernel_max
> cpu100 cpu114 cpu20 cpu34 cpu48 cpu61 cpu75 cpu89 microcode
> cpu101 cpu115 cpu21 cpu35 cpu49 cpu62 cpu76 cpu9 modalias
> cpu102 cpu116 cpu22 cpu36 cpu5 cpu63 cpu77 cpu90 offline
> cpu103 cpu117 cpu23 cpu37 cpu50 cpu64 cpu78 cpu91 online
> cpu104 cpu118 cpu24 cpu38 cpu51 cpu65 cpu79 cpu92 possible
> cpu105 cpu119 cpu25 cpu39 cpu52 cpu66 cpu8 cpu93 power
> cpu106 cpu12 cpu26 cpu4 cpu53 cpu67 cpu80 cpu94 present
> cpu107 cpu13 cpu27 cpu40 cpu54 cpu68 cpu81 cpu95 probe
> cpu108 cpu14 cpu28 cpu41 cpu55 cpu69 cpu82 cpu96 release
> cpu109 cpu15 cpu29 cpu42 cpu56 cpu7 cpu83 cpu97 uevent
> cpu11 cpu16 cpu3 cpu43 cpu57 cpu70 cpu84 cpu98
> cpu110 cpu17 cpu30 cpu44 cpu58 cpu71 cpu85 cpu99
>
>
> [After hot-remove]
>
> # ls /sys/devices/system/cpu/
> cpu0 cpu16 cpu23 cpu4 cpu65 cpu72 cpu8 cpu87 modalias uevent
> cpu1 cpu17 cpu24 cpu5 cpu66 cpu73 cpu80 cpu88 offline
> cpu10 cpu18 cpu25 cpu6 cpu67 cpu74 cpu81 cpu89 online
> cpu11 cpu19 cpu26 cpu60 cpu68 cpu75 cpu82 cpu9 possible
> cpu12 cpu2 cpu27 cpu61 cpu69 cpu76 cpu83 cpuidle power
> cpu13 cpu20 cpu28 cpu62 cpu7 cpu77 cpu84 intel_pstate present
> cpu14 cpu21 cpu29 cpu63 cpu70 cpu78 cpu85 kernel_max probe
> cpu15 cpu22 cpu3 cpu64 cpu71 cpu79 cpu86 microcode release
>
> You can see cpu30 to cpu59, and cpu90 to cpu119 does not exist, they are
> removed from this system by hot-remove operation.
>
> kexec command expects the number of each directory is contiguous. For
> example, if sysconf(_SC_NPROCESSORS_CONF) is 60, kexec thinks directory
> cpu0, cpu1, cpu2, .... cpu59 exists. But in this case, the number of
> directory is not contiguous. That is the root cause of this problem.
> This patches fix it.
>
> Takao Indoh (2):
> Enumerate all /sys/devices/system/cpu/cpuN when they are discontiguous
> Fix mistaken check of stat(2) return value
>
> kexec/crashdump-elf.c | 5 ++++-
> kexec/crashdump.c | 2 +-
> 2 files changed, 5 insertions(+), 2 deletions(-)
These two patches look good to me.
Acked-by: WANG Chao <chaowang@redhat.com>
* Re: [PATCH 0/2] kexec command fails after cpu hot-removing
2014-06-05 6:04 ` [PATCH 0/2] kexec command fails after cpu hot-removing Zhang Yanfei
@ 2014-06-05 9:08 ` Simon Horman
0 siblings, 0 replies; 6+ messages in thread
From: Simon Horman @ 2014-06-05 9:08 UTC (permalink / raw)
To: Zhang Yanfei, WANG Chao; +Cc: kexec, Takao Indoh
On Thu, Jun 05, 2014 at 02:04:41PM +0800, Zhang Yanfei wrote:
> I think maybe no one had tested kexec-tools after cpu hot-remove. So
> the bug remains until today.
>
> For both two patches:
>
> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
On Thu, Jun 05, 2014 at 03:32:15PM +0800, WANG Chao wrote:
> On 06/05/14 at 02:10pm, Takao Indoh wrote:
> > After cpu hot-removing, kexec command fails with the following message.
> >
> > "/sys/devices" does not exist. Sysfs does not seem to be mounted. Try
> > mounting sysfs.
> >
> > Of course sysfs is mounted. kexec tried to open
> > /sys/devices/system/cpu/cpu30/crash_notes but it failed because
> > /sys/devices/system/cpu/cpu30 did not exist.
> >
> >
> > [Before hot-remove]
> >
> > # ls /sys/devices/system/cpu/
> > cpu0 cpu111 cpu18 cpu31 cpu45 cpu59 cpu72 cpu86 cpuidle
> > cpu1 cpu112 cpu19 cpu32 cpu46 cpu6 cpu73 cpu87 intel_pstate
> > cpu10 cpu113 cpu2 cpu33 cpu47 cpu60 cpu74 cpu88 kernel_max
> > cpu100 cpu114 cpu20 cpu34 cpu48 cpu61 cpu75 cpu89 microcode
> > cpu101 cpu115 cpu21 cpu35 cpu49 cpu62 cpu76 cpu9 modalias
> > cpu102 cpu116 cpu22 cpu36 cpu5 cpu63 cpu77 cpu90 offline
> > cpu103 cpu117 cpu23 cpu37 cpu50 cpu64 cpu78 cpu91 online
> > cpu104 cpu118 cpu24 cpu38 cpu51 cpu65 cpu79 cpu92 possible
> > cpu105 cpu119 cpu25 cpu39 cpu52 cpu66 cpu8 cpu93 power
> > cpu106 cpu12 cpu26 cpu4 cpu53 cpu67 cpu80 cpu94 present
> > cpu107 cpu13 cpu27 cpu40 cpu54 cpu68 cpu81 cpu95 probe
> > cpu108 cpu14 cpu28 cpu41 cpu55 cpu69 cpu82 cpu96 release
> > cpu109 cpu15 cpu29 cpu42 cpu56 cpu7 cpu83 cpu97 uevent
> > cpu11 cpu16 cpu3 cpu43 cpu57 cpu70 cpu84 cpu98
> > cpu110 cpu17 cpu30 cpu44 cpu58 cpu71 cpu85 cpu99
> >
> >
> > [After hot-remove]
> >
> > # ls /sys/devices/system/cpu/
> > cpu0 cpu16 cpu23 cpu4 cpu65 cpu72 cpu8 cpu87 modalias uevent
> > cpu1 cpu17 cpu24 cpu5 cpu66 cpu73 cpu80 cpu88 offline
> > cpu10 cpu18 cpu25 cpu6 cpu67 cpu74 cpu81 cpu89 online
> > cpu11 cpu19 cpu26 cpu60 cpu68 cpu75 cpu82 cpu9 possible
> > cpu12 cpu2 cpu27 cpu61 cpu69 cpu76 cpu83 cpuidle power
> > cpu13 cpu20 cpu28 cpu62 cpu7 cpu77 cpu84 intel_pstate present
> > cpu14 cpu21 cpu29 cpu63 cpu70 cpu78 cpu85 kernel_max probe
> > cpu15 cpu22 cpu3 cpu64 cpu71 cpu79 cpu86 microcode release
> >
> > You can see cpu30 to cpu59, and cpu90 to cpu119 does not exist, they are
> > removed from this system by hot-remove operation.
> >
> > kexec command expects the number of each directory is contiguous. For
> > example, if sysconf(_SC_NPROCESSORS_CONF) is 60, kexec thinks directory
> > cpu0, cpu1, cpu2, .... cpu59 exists. But in this case, the number of
> > directory is not contiguous. That is the root cause of this problem.
> > This patches fix it.
> >
> > Takao Indoh (2):
> > Enumerate all /sys/devices/system/cpu/cpuN when they are discontiguous
> > Fix mistaken check of stat(2) return value
> >
> > kexec/crashdump-elf.c | 5 ++++-
> > kexec/crashdump.c | 2 +-
> > 2 files changed, 5 insertions(+), 2 deletions(-)
>
> These two patches look good to me.
>
> Acked-by: WANG Chao <chaowang@redhat.com>
Thanks, applied.