linux-ia64.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] Initial Kexec patches
@ 2016-04-14 19:59 Russell King - ARM Linux
  2016-04-14 20:00 ` [PATCH 1/3] kexec: don't invoke OOM-killer for control page allocation Russell King
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Russell King - ARM Linux @ 2016-04-14 19:59 UTC (permalink / raw)
  To: Eric Biederman; +Cc: Fenghua Yu, kexec, linux-ia64, linux-kernel, Tony Luck

Eric,

Here are some initial patches from my stack to bring functional kexec
support to the TI Keystone 2 platform.  These are relatively independent
of that work, but I believe are useful changes in their own regard.
Each patch is independent of each other; these do not really comprise
a strict series.

The first fixes an obscure problem - if we try to allocate the control
page, and we're unable to allocate a page within the memory region we
want, we end up killing processes, trying to gain that page.  In the
case where there is no memory located within the desired range, this
results in the OOM killer killing off many processes rather than a
graceful failure.  The patch changes the behaviour to allow a graceful
failure instead without invoking the OOM killer.

The second patch fixes a missing check in the kexec code, where the
user could supply a start and size address for a segment which causes
the calculation to wrap.  Eg, the case where segment[0].mem = 0xffff0000,
segment[0].memsz = 0x10000 would pass this check.  This patch adds an
explicit check for cases such as these.

The third patch allows the vmcoreinfo note data to be located above
4GB in physical memory space.  Since kexec tools only use this value
when opening /dev/mem, having this located above 4GB physical is not
a problem - kexec tools even expect this to potentially be a 64-bit
value already.  Making this change means there is one less area which
becomes Keystone 2 specific.

 arch/ia64/kernel/machine_kexec.c | 2 +-
 include/linux/kexec.h            | 4 ++--
 kernel/kexec_core.c              | 4 +++-
 kernel/ksysfs.c                  | 4 ++--
 4 files changed, 8 insertions(+), 6 deletions(-)

Thanks.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/3] kexec: don't invoke OOM-killer for control page allocation
  2016-04-14 19:59 [PATCH 0/3] Initial Kexec patches Russell King - ARM Linux
@ 2016-04-14 20:00 ` Russell King
  2016-04-18  5:32   ` Baoquan He
  2016-04-14 20:00 ` [PATCH 2/3] kexec: ensure user memory sizes do not wrap Russell King
  2016-04-14 20:00 ` [PATCH 3/3] kexec: arrange for paddr_vmcoreinfo_note() to return phys_addr_t Russell King
  2 siblings, 1 reply; 24+ messages in thread
From: Russell King @ 2016-04-14 20:00 UTC (permalink / raw)
  To: Eric Biederman; +Cc: Fenghua Yu, linux-ia64, Tony Luck, kexec

If we are unable to find a suitable page when allocating the control
page, do not invoke the OOM-killer: killing processes probably isn't
going to help.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 include/linux/kexec.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 2cc643c6e870..1b32ab587f66 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -41,7 +41,7 @@
 #endif
 
 #ifndef KEXEC_CONTROL_MEMORY_GFP
-#define KEXEC_CONTROL_MEMORY_GFP GFP_KERNEL
+#define KEXEC_CONTROL_MEMORY_GFP (GFP_KERNEL | __GFP_NORETRY)
 #endif
 
 #ifndef KEXEC_CONTROL_PAGE_SIZE
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/3] kexec: ensure user memory sizes do not wrap
  2016-04-14 19:59 [PATCH 0/3] Initial Kexec patches Russell King - ARM Linux
  2016-04-14 20:00 ` [PATCH 1/3] kexec: don't invoke OOM-killer for control page allocation Russell King
@ 2016-04-14 20:00 ` Russell King
  2016-04-18  5:35   ` Baoquan He
                     ` (2 more replies)
  2016-04-14 20:00 ` [PATCH 3/3] kexec: arrange for paddr_vmcoreinfo_note() to return phys_addr_t Russell King
  2 siblings, 3 replies; 24+ messages in thread
From: Russell King @ 2016-04-14 20:00 UTC (permalink / raw)
  To: Eric Biederman; +Cc: Fenghua Yu, linux-ia64, Tony Luck, kexec

Ensure that user memory sizes do not wrap around when validating the
user input, which can lead to the following input validation working
incorrectly.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 kernel/kexec_core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 8d34308ea449..d719a4d0ef55 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -169,6 +169,8 @@ int sanity_check_segment_list(struct kimage *image)
 
 		mstart = image->segment[i].mem;
 		mend   = mstart + image->segment[i].memsz;
+		if (mstart > mend)
+			return result;
 		if ((mstart & ~PAGE_MASK) || (mend & ~PAGE_MASK))
 			return result;
 		if (mend >= KEXEC_DESTINATION_MEMORY_LIMIT)
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/3] kexec: arrange for paddr_vmcoreinfo_note() to return phys_addr_t
  2016-04-14 19:59 [PATCH 0/3] Initial Kexec patches Russell King - ARM Linux
  2016-04-14 20:00 ` [PATCH 1/3] kexec: don't invoke OOM-killer for control page allocation Russell King
  2016-04-14 20:00 ` [PATCH 2/3] kexec: ensure user memory sizes do not wrap Russell King
@ 2016-04-14 20:00 ` Russell King
  2016-04-18  5:38   ` Baoquan He
  2016-04-28  9:59   ` Baoquan He
  2 siblings, 2 replies; 24+ messages in thread
From: Russell King @ 2016-04-14 20:00 UTC (permalink / raw)
  To: Eric Biederman; +Cc: Fenghua Yu, linux-ia64, Tony Luck, kexec

On PAE systems (eg, ARM LPAE) the vmcore note may be located above
4GB physical on 32-bit architectures, so we need a wider type than
"unsigned long" here.  Arrange for paddr_vmcoreinfo_note() to return
a phys_addr_t, thereby allowing it to be located above 4GB.

This makes no difference for kexec-tools, as they already assume a
64-bit type when reading from this file.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/ia64/kernel/machine_kexec.c | 2 +-
 include/linux/kexec.h            | 2 +-
 kernel/kexec_core.c              | 2 +-
 kernel/ksysfs.c                  | 4 ++--
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/ia64/kernel/machine_kexec.c b/arch/ia64/kernel/machine_kexec.c
index b72cd7a07222..599507bcec91 100644
--- a/arch/ia64/kernel/machine_kexec.c
+++ b/arch/ia64/kernel/machine_kexec.c
@@ -163,7 +163,7 @@ void arch_crash_save_vmcoreinfo(void)
 #endif
 }
 
-unsigned long paddr_vmcoreinfo_note(void)
+phys_addr_t paddr_vmcoreinfo_note(void)
 {
 	return ia64_tpa((unsigned long)(char *)&vmcoreinfo_note);
 }
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 1b32ab587f66..52a3a221bcb2 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -235,7 +235,7 @@ void crash_unmap_reserved_pages(void);
 void arch_crash_save_vmcoreinfo(void);
 __printf(1, 2)
 void vmcoreinfo_append_str(const char *fmt, ...);
-unsigned long paddr_vmcoreinfo_note(void);
+phys_addr_t paddr_vmcoreinfo_note(void);
 
 #define VMCOREINFO_OSRELEASE(value) \
 	vmcoreinfo_append_str("OSRELEASE=%s\n", value)
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index d719a4d0ef55..f9847e5822e6 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1377,7 +1377,7 @@ void vmcoreinfo_append_str(const char *fmt, ...)
 void __weak arch_crash_save_vmcoreinfo(void)
 {}
 
-unsigned long __weak paddr_vmcoreinfo_note(void)
+phys_addr_t __weak paddr_vmcoreinfo_note(void)
 {
 	return __pa((unsigned long)(char *)&vmcoreinfo_note);
 }
diff --git a/kernel/ksysfs.c b/kernel/ksysfs.c
index 152da4a48867..9f1920d2d0c6 100644
--- a/kernel/ksysfs.c
+++ b/kernel/ksysfs.c
@@ -128,8 +128,8 @@ KERNEL_ATTR_RW(kexec_crash_size);
 static ssize_t vmcoreinfo_show(struct kobject *kobj,
 			       struct kobj_attribute *attr, char *buf)
 {
-	return sprintf(buf, "%lx %x\n",
-		       paddr_vmcoreinfo_note(),
+	phys_addr_t vmcore_base = paddr_vmcoreinfo_note();
+	return sprintf(buf, "%pa %x\n", &vmcore_base,
 		       (unsigned int)sizeof(vmcoreinfo_note));
 }
 KERNEL_ATTR_RO(vmcoreinfo);
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] kexec: don't invoke OOM-killer for control page allocation
  2016-04-14 20:00 ` [PATCH 1/3] kexec: don't invoke OOM-killer for control page allocation Russell King
@ 2016-04-18  5:32   ` Baoquan He
  2016-04-18  8:39     ` Russell King - ARM Linux
  0 siblings, 1 reply; 24+ messages in thread
From: Baoquan He @ 2016-04-18  5:32 UTC (permalink / raw)
  To: Russell King; +Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On 04/14/16 at 09:00pm, Russell King wrote:
> If we are unable to find a suitable page when allocating the control
> page, do not invoke the OOM-killer: killing processes probably isn't
> going to help.

Originally kexec was made to reboot to test kernel quickly. If 1st
kernel is palyed and hurted in a bad state and developer want to discard
it and take a quick reboot, why don't we have a best try to make a
successful kexec load?

I personally think this change sounds un-reasonable.

> 
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> ---
>  include/linux/kexec.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 2cc643c6e870..1b32ab587f66 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -41,7 +41,7 @@
>  #endif
>  
>  #ifndef KEXEC_CONTROL_MEMORY_GFP
> -#define KEXEC_CONTROL_MEMORY_GFP GFP_KERNEL
> +#define KEXEC_CONTROL_MEMORY_GFP (GFP_KERNEL | __GFP_NORETRY)
>  #endif
>  
>  #ifndef KEXEC_CONTROL_PAGE_SIZE
> -- 
> 2.1.0
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] kexec: ensure user memory sizes do not wrap
  2016-04-14 20:00 ` [PATCH 2/3] kexec: ensure user memory sizes do not wrap Russell King
@ 2016-04-18  5:35   ` Baoquan He
  2016-04-18  8:37     ` Russell King - ARM Linux
  2016-04-28  9:56   ` Baoquan He
  2016-04-28 11:07   ` Minfei Huang
  2 siblings, 1 reply; 24+ messages in thread
From: Baoquan He @ 2016-04-18  5:35 UTC (permalink / raw)
  To: Russell King; +Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On 04/14/16 at 09:00pm, Russell King wrote:
> Ensure that user memory sizes do not wrap around when validating the
> user input, which can lead to the following input validation working
> incorrectly.
> 
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> ---
>  kernel/kexec_core.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 8d34308ea449..d719a4d0ef55 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -169,6 +169,8 @@ int sanity_check_segment_list(struct kimage *image)
>  
>  		mstart = image->segment[i].mem;
>  		mend   = mstart + image->segment[i].memsz;
> +		if (mstart > mend)
> +			return result;

These segments are built in kexec utility, their availability should be
guaranteed there. If without some alignment handling, wrapping around
might not happen here. But I don't have strong objection to it.

>  		if ((mstart & ~PAGE_MASK) || (mend & ~PAGE_MASK))
>  			return result;
>  		if (mend >= KEXEC_DESTINATION_MEMORY_LIMIT)
> -- 
> 2.1.0
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] kexec: arrange for paddr_vmcoreinfo_note() to return phys_addr_t
  2016-04-14 20:00 ` [PATCH 3/3] kexec: arrange for paddr_vmcoreinfo_note() to return phys_addr_t Russell King
@ 2016-04-18  5:38   ` Baoquan He
  2016-04-18  8:36     ` Russell King - ARM Linux
  2016-04-28  9:59   ` Baoquan He
  1 sibling, 1 reply; 24+ messages in thread
From: Baoquan He @ 2016-04-18  5:38 UTC (permalink / raw)
  To: Russell King; +Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On 04/14/16 at 09:00pm, Russell King wrote:
> On PAE systems (eg, ARM LPAE) the vmcore note may be located above
> 4GB physical on 32-bit architectures, so we need a wider type than
> "unsigned long" here.  Arrange for paddr_vmcoreinfo_note() to return
> a phys_addr_t, thereby allowing it to be located above 4GB.

At first glance, it sounds great. But I can't imagine a scenario where
on pae system kernel can be located above 4G. As far as I know i386 and
its pae can't do this because the current linux VM implementation can't
allow that. I am not familiar with arm system. So please correct me if
I was wrong.

Thanks
Baoquan

> 
> This makes no difference for kexec-tools, as they already assume a
> 64-bit type when reading from this file.
> 
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> ---
>  arch/ia64/kernel/machine_kexec.c | 2 +-
>  include/linux/kexec.h            | 2 +-
>  kernel/kexec_core.c              | 2 +-
>  kernel/ksysfs.c                  | 4 ++--
>  4 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/ia64/kernel/machine_kexec.c b/arch/ia64/kernel/machine_kexec.c
> index b72cd7a07222..599507bcec91 100644
> --- a/arch/ia64/kernel/machine_kexec.c
> +++ b/arch/ia64/kernel/machine_kexec.c
> @@ -163,7 +163,7 @@ void arch_crash_save_vmcoreinfo(void)
>  #endif
>  }
>  
> -unsigned long paddr_vmcoreinfo_note(void)
> +phys_addr_t paddr_vmcoreinfo_note(void)
>  {
>  	return ia64_tpa((unsigned long)(char *)&vmcoreinfo_note);
>  }
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 1b32ab587f66..52a3a221bcb2 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -235,7 +235,7 @@ void crash_unmap_reserved_pages(void);
>  void arch_crash_save_vmcoreinfo(void);
>  __printf(1, 2)
>  void vmcoreinfo_append_str(const char *fmt, ...);
> -unsigned long paddr_vmcoreinfo_note(void);
> +phys_addr_t paddr_vmcoreinfo_note(void);
>  
>  #define VMCOREINFO_OSRELEASE(value) \
>  	vmcoreinfo_append_str("OSRELEASE=%s\n", value)
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index d719a4d0ef55..f9847e5822e6 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -1377,7 +1377,7 @@ void vmcoreinfo_append_str(const char *fmt, ...)
>  void __weak arch_crash_save_vmcoreinfo(void)
>  {}
>  
> -unsigned long __weak paddr_vmcoreinfo_note(void)
> +phys_addr_t __weak paddr_vmcoreinfo_note(void)
>  {
>  	return __pa((unsigned long)(char *)&vmcoreinfo_note);
>  }
> diff --git a/kernel/ksysfs.c b/kernel/ksysfs.c
> index 152da4a48867..9f1920d2d0c6 100644
> --- a/kernel/ksysfs.c
> +++ b/kernel/ksysfs.c
> @@ -128,8 +128,8 @@ KERNEL_ATTR_RW(kexec_crash_size);
>  static ssize_t vmcoreinfo_show(struct kobject *kobj,
>  			       struct kobj_attribute *attr, char *buf)
>  {
> -	return sprintf(buf, "%lx %x\n",
> -		       paddr_vmcoreinfo_note(),
> +	phys_addr_t vmcore_base = paddr_vmcoreinfo_note();
> +	return sprintf(buf, "%pa %x\n", &vmcore_base,
>  		       (unsigned int)sizeof(vmcoreinfo_note));
>  }
>  KERNEL_ATTR_RO(vmcoreinfo);
> -- 
> 2.1.0
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] kexec: arrange for paddr_vmcoreinfo_note() to return phys_addr_t
  2016-04-18  5:38   ` Baoquan He
@ 2016-04-18  8:36     ` Russell King - ARM Linux
  2016-04-18 10:32       ` Baoquan He
  0 siblings, 1 reply; 24+ messages in thread
From: Russell King - ARM Linux @ 2016-04-18  8:36 UTC (permalink / raw)
  To: Baoquan He; +Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On Mon, Apr 18, 2016 at 01:38:20PM +0800, Baoquan He wrote:
> On 04/14/16 at 09:00pm, Russell King wrote:
> > On PAE systems (eg, ARM LPAE) the vmcore note may be located above
> > 4GB physical on 32-bit architectures, so we need a wider type than
> > "unsigned long" here.  Arrange for paddr_vmcoreinfo_note() to return
> > a phys_addr_t, thereby allowing it to be located above 4GB.
> 
> At first glance, it sounds great. But I can't imagine a scenario where
> on pae system kernel can be located above 4G. As far as I know i386 and
> its pae can't do this because the current linux VM implementation can't
> allow that. I am not familiar with arm system. So please correct me if
> I was wrong.

You are wrong.  That's precisely why this patch exists.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] kexec: ensure user memory sizes do not wrap
  2016-04-18  5:35   ` Baoquan He
@ 2016-04-18  8:37     ` Russell King - ARM Linux
  2016-04-18 10:17       ` Baoquan He
  0 siblings, 1 reply; 24+ messages in thread
From: Russell King - ARM Linux @ 2016-04-18  8:37 UTC (permalink / raw)
  To: Baoquan He; +Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On Mon, Apr 18, 2016 at 01:35:34PM +0800, Baoquan He wrote:
> On 04/14/16 at 09:00pm, Russell King wrote:
> > Ensure that user memory sizes do not wrap around when validating the
> > user input, which can lead to the following input validation working
> > incorrectly.
> > 
> > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> > ---
> >  kernel/kexec_core.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > index 8d34308ea449..d719a4d0ef55 100644
> > --- a/kernel/kexec_core.c
> > +++ b/kernel/kexec_core.c
> > @@ -169,6 +169,8 @@ int sanity_check_segment_list(struct kimage *image)
> >  
> >  		mstart = image->segment[i].mem;
> >  		mend   = mstart + image->segment[i].memsz;
> > +		if (mstart > mend)
> > +			return result;
> 
> These segments are built in kexec utility, their availability should be
> guaranteed there. If without some alignment handling, wrapping around
> might not happen here. But I don't have strong objection to it.

In which case, what's the point of all the other validation which is done,
like the check below:

> >  		if ((mstart & ~PAGE_MASK) || (mend & ~PAGE_MASK))
> >  			return result;
> >  		if (mend >= KEXEC_DESTINATION_MEMORY_LIMIT)

Your reply is contradictory to the whole suite of tests which kexec does
to validate its input from userspace.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] kexec: don't invoke OOM-killer for control page allocation
  2016-04-18  5:32   ` Baoquan He
@ 2016-04-18  8:39     ` Russell King - ARM Linux
  2016-04-18 10:12       ` Baoquan He
  0 siblings, 1 reply; 24+ messages in thread
From: Russell King - ARM Linux @ 2016-04-18  8:39 UTC (permalink / raw)
  To: Baoquan He; +Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On Mon, Apr 18, 2016 at 01:32:53PM +0800, Baoquan He wrote:
> On 04/14/16 at 09:00pm, Russell King wrote:
> > If we are unable to find a suitable page when allocating the control
> > page, do not invoke the OOM-killer: killing processes probably isn't
> > going to help.
> 
> Originally kexec was made to reboot to test kernel quickly. If 1st
> kernel is palyed and hurted in a bad state and developer want to discard
> it and take a quick reboot, why don't we have a best try to make a
> successful kexec load?

And if it kills off every process trying to get a suitable page,
which then means you can't do anything other than power cycle,
that's okay?

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] kexec: don't invoke OOM-killer for control page allocation
  2016-04-18  8:39     ` Russell King - ARM Linux
@ 2016-04-18 10:12       ` Baoquan He
  2016-04-28  9:53         ` Baoquan He
  0 siblings, 1 reply; 24+ messages in thread
From: Baoquan He @ 2016-04-18 10:12 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On 04/18/16 at 09:39am, Russell King - ARM Linux wrote:
> On Mon, Apr 18, 2016 at 01:32:53PM +0800, Baoquan He wrote:
> > On 04/14/16 at 09:00pm, Russell King wrote:
> > > If we are unable to find a suitable page when allocating the control
> > > page, do not invoke the OOM-killer: killing processes probably isn't
> > > going to help.
> > 
> > Originally kexec was made to reboot to test kernel quickly. If 1st
> > kernel is palyed and hurted in a bad state and developer want to discard
> > it and take a quick reboot, why don't we have a best try to make a
> > successful kexec load?
> 
> And if it kills off every process trying to get a suitable page,
> which then means you can't do anything other than power cycle,
> that's okay?

Yes, I agree on that it's non-sense if every process is killed. But will
each kexec load which need OOM-killer go that far? And there's only one
page (if 32 bit) or 2 pages (if 64 bit) for control page, it may not
need kill that many processes to pick one.

> 
> -- 
> RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] kexec: ensure user memory sizes do not wrap
  2016-04-18  8:37     ` Russell King - ARM Linux
@ 2016-04-18 10:17       ` Baoquan He
  0 siblings, 0 replies; 24+ messages in thread
From: Baoquan He @ 2016-04-18 10:17 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On 04/18/16 at 09:37am, Russell King - ARM Linux wrote:
> On Mon, Apr 18, 2016 at 01:35:34PM +0800, Baoquan He wrote:
> > On 04/14/16 at 09:00pm, Russell King wrote:
> > > Ensure that user memory sizes do not wrap around when validating the
> > > user input, which can lead to the following input validation working
> > > incorrectly.
> > > 
> > > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> > > ---
> > >  kernel/kexec_core.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > > index 8d34308ea449..d719a4d0ef55 100644
> > > --- a/kernel/kexec_core.c
> > > +++ b/kernel/kexec_core.c
> > > @@ -169,6 +169,8 @@ int sanity_check_segment_list(struct kimage *image)
> > >  
> > >  		mstart = image->segment[i].mem;
> > >  		mend   = mstart + image->segment[i].memsz;
> > > +		if (mstart > mend)
> > > +			return result;
> > 
> > These segments are built in kexec utility, their availability should be
> > guaranteed there. If without some alignment handling, wrapping around
> > might not happen here. But I don't have strong objection to it.
> 
> In which case, what's the point of all the other validation which is done,
> like the check below:
> 
> > >  		if ((mstart & ~PAGE_MASK) || (mend & ~PAGE_MASK))
> > >  			return result;
> > >  		if (mend >= KEXEC_DESTINATION_MEMORY_LIMIT)
> 
> Your reply is contradictory to the whole suite of tests which kexec does
> to validate its input from userspace.

It's not contradictory. In kexec utility it will call
valid_memory_segment() to check each segment. And there it will check if
the start is bigger than end. What I meant is if start is 5000, end is
5100, an alignment of end will make start> end case happen. Anyway I am
fine with this check adding, the safer, the better.

> 
> -- 
> RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] kexec: arrange for paddr_vmcoreinfo_note() to return phys_addr_t
  2016-04-18  8:36     ` Russell King - ARM Linux
@ 2016-04-18 10:32       ` Baoquan He
  2016-04-18 10:52         ` Russell King - ARM Linux
  0 siblings, 1 reply; 24+ messages in thread
From: Baoquan He @ 2016-04-18 10:32 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On 04/18/16 at 09:36am, Russell King - ARM Linux wrote:
> On Mon, Apr 18, 2016 at 01:38:20PM +0800, Baoquan He wrote:
> > On 04/14/16 at 09:00pm, Russell King wrote:
> > > On PAE systems (eg, ARM LPAE) the vmcore note may be located above
> > > 4GB physical on 32-bit architectures, so we need a wider type than
> > > "unsigned long" here.  Arrange for paddr_vmcoreinfo_note() to return
> > > a phys_addr_t, thereby allowing it to be located above 4GB.
> > 
> > At first glance, it sounds great. But I can't imagine a scenario where
> > on pae system kernel can be located above 4G. As far as I know i386 and
> > its pae can't do this because the current linux VM implementation can't
> > allow that. I am not familiar with arm system. So please correct me if
> > I was wrong.
> 
> You are wrong.  That's precisely why this patch exists.

I don't know arm is different then i386. On i386 the kernel text mapping
is linear, just as follow:

#define __phys_addr_nodebug(x)  ((x) - PAGE_OFFSET)


But arm seems not linear. 
static inline phys_addr_t __virt_to_phys(unsigned long x)
{
        phys_addr_t t;

        if (sizeof(phys_addr_t) = 4) {
                __pv_stub(x, t, "add", __PV_BITS_31_24);
        } else {
                __pv_stub_mov_hi(t);
                __pv_add_carry_stub(x, t);
        }
        return t;
}

So on arm PAE this change makes sense.

Besides, I checked kexec/arch/arm/kexec-zImage-arm.c and found function
locate_hole() is used to find position for arm kernel. 

unsigned long locate_hole(struct kexec_info *info,
        unsigned long hole_size, unsigned long hole_align,
        unsigned long hole_min, unsigned long hole_max, 
        int hole_end)

The type unsigned long for hole_max limit the region where arm kernel is
loaded. So withough modifying this I doubt arm PAE can really be loaded
above 4G.

Thanks
Baoquan

> 
> -- 
> RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] kexec: arrange for paddr_vmcoreinfo_note() to return phys_addr_t
  2016-04-18 10:32       ` Baoquan He
@ 2016-04-18 10:52         ` Russell King - ARM Linux
  2016-04-18 11:28           ` Baoquan He
  0 siblings, 1 reply; 24+ messages in thread
From: Russell King - ARM Linux @ 2016-04-18 10:52 UTC (permalink / raw)
  To: Baoquan He; +Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On Mon, Apr 18, 2016 at 06:32:58PM +0800, Baoquan He wrote:
> On 04/18/16 at 09:36am, Russell King - ARM Linux wrote:
> > On Mon, Apr 18, 2016 at 01:38:20PM +0800, Baoquan He wrote:
> > > On 04/14/16 at 09:00pm, Russell King wrote:
> > > > On PAE systems (eg, ARM LPAE) the vmcore note may be located above
> > > > 4GB physical on 32-bit architectures, so we need a wider type than
> > > > "unsigned long" here.  Arrange for paddr_vmcoreinfo_note() to return
> > > > a phys_addr_t, thereby allowing it to be located above 4GB.
> > > 
> > > At first glance, it sounds great. But I can't imagine a scenario where
> > > on pae system kernel can be located above 4G. As far as I know i386 and
> > > its pae can't do this because the current linux VM implementation can't
> > > allow that. I am not familiar with arm system. So please correct me if
> > > I was wrong.
> > 
> > You are wrong.  That's precisely why this patch exists.
> 
> I don't know arm is different then i386. On i386 the kernel text mapping
> is linear, just as follow:
> 
> #define __phys_addr_nodebug(x)  ((x) - PAGE_OFFSET)
> 
> 
> But arm seems not linear. 
> static inline phys_addr_t __virt_to_phys(unsigned long x)
> {
>         phys_addr_t t;
> 
>         if (sizeof(phys_addr_t) = 4) {
>                 __pv_stub(x, t, "add", __PV_BITS_31_24);
>         } else {
>                 __pv_stub_mov_hi(t);
>                 __pv_add_carry_stub(x, t);
>         }
>         return t;
> }
> 
> So on arm PAE this change makes sense.

No.  This has nothing to do with whether memory is linear or not.  The
above code has nothing to do with that either.  The above code you quote
allows us to efficiently runtime modify the virtual to physical
translation offset, nothing more.

> Besides, I checked kexec/arch/arm/kexec-zImage-arm.c and found function
> locate_hole() is used to find position for arm kernel. 
> 
> unsigned long locate_hole(struct kexec_info *info,
>         unsigned long hole_size, unsigned long hole_align,
>         unsigned long hole_min, unsigned long hole_max, 
>         int hole_end)
> 
> The type unsigned long for hole_max limit the region where arm kernel is
> loaded. So withough modifying this I doubt arm PAE can really be loaded
> above 4G.

Please, stop "doubting" the patches.

I have here a machine which requires these patches, and they're all
tested.  Without these patches, it doesn't work - and in fact trying
to use kexec on the platform takes out userspace due to the OOM killer.
With these patches, it does work - fully.  It has the start of system
memory above 4GB physical, with a non-DMA coherent boot time alias
below 4GB.

On a running system, the kernel ignores the boot alias below 4GB.
Having discussed with Eric, kexec is designed to use the boot time
alias, and we need to teach kexec about the difference between the
boot time alias and the running system memory layout.

That's what these patches are all about.

I've been discussing the problem with Eric on and off over the last
six months, and he's the one who suggested in part the solution
implemented in this series.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] kexec: arrange for paddr_vmcoreinfo_note() to return phys_addr_t
  2016-04-18 10:52         ` Russell King - ARM Linux
@ 2016-04-18 11:28           ` Baoquan He
  2016-04-28  8:56             ` Russell King - ARM Linux
  0 siblings, 1 reply; 24+ messages in thread
From: Baoquan He @ 2016-04-18 11:28 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On 04/18/16 at 11:52am, Russell King - ARM Linux wrote:
> On Mon, Apr 18, 2016 at 06:32:58PM +0800, Baoquan He wrote:
> > On 04/18/16 at 09:36am, Russell King - ARM Linux wrote:
> > > On Mon, Apr 18, 2016 at 01:38:20PM +0800, Baoquan He wrote:
> > > > On 04/14/16 at 09:00pm, Russell King wrote:
> > > > > On PAE systems (eg, ARM LPAE) the vmcore note may be located above
> > > > > 4GB physical on 32-bit architectures, so we need a wider type than
> > > > > "unsigned long" here.  Arrange for paddr_vmcoreinfo_note() to return
> > > > > a phys_addr_t, thereby allowing it to be located above 4GB.
> > > > 
> > > > At first glance, it sounds great. But I can't imagine a scenario where
> > > > on pae system kernel can be located above 4G. As far as I know i386 and
> > > > its pae can't do this because the current linux VM implementation can't
> > > > allow that. I am not familiar with arm system. So please correct me if
> > > > I was wrong.
> > > 
> > > You are wrong.  That's precisely why this patch exists.
> > 
> > I don't know arm is different then i386. On i386 the kernel text mapping
> > is linear, just as follow:
> > 
> > #define __phys_addr_nodebug(x)  ((x) - PAGE_OFFSET)
> > 
> > 
> > But arm seems not linear. 
> > static inline phys_addr_t __virt_to_phys(unsigned long x)
> > {
> >         phys_addr_t t;
> > 
> >         if (sizeof(phys_addr_t) = 4) {
> >                 __pv_stub(x, t, "add", __PV_BITS_31_24);
> >         } else {
> >                 __pv_stub_mov_hi(t);
> >                 __pv_add_carry_stub(x, t);
> >         }
> >         return t;
> > }
> > 
> > So on arm PAE this change makes sense.
> 
> No.  This has nothing to do with whether memory is linear or not.  The
> above code has nothing to do with that either.  The above code you quote
> allows us to efficiently runtime modify the virtual to physical
> translation offset, nothing more.

OK, thanks for telling this. Learned it.

> 
> > Besides, I checked kexec/arch/arm/kexec-zImage-arm.c and found function
> > locate_hole() is used to find position for arm kernel. 
> > 
> > unsigned long locate_hole(struct kexec_info *info,
> >         unsigned long hole_size, unsigned long hole_align,
> >         unsigned long hole_min, unsigned long hole_max, 
> >         int hole_end)
> > 
> > The type unsigned long for hole_max limit the region where arm kernel is
> > loaded. So withough modifying this I doubt arm PAE can really be loaded
> > above 4G.
> 
> Please, stop "doubting" the patches.
> 
> I have here a machine which requires these patches, and they're all
> tested.  Without these patches, it doesn't work - and in fact trying
> to use kexec on the platform takes out userspace due to the OOM killer.
> With these patches, it does work - fully.  It has the start of system
> memory above 4GB physical, with a non-DMA coherent boot time alias
> below 4GB.
> 
> On a running system, the kernel ignores the boot alias below 4GB.
> Having discussed with Eric, kexec is designed to use the boot time
> alias, and we need to teach kexec about the difference between the
> boot time alias and the running system memory layout.
> 
> That's what these patches are all about.
> 
> I've been discussing the problem with Eric on and off over the last
> six months, and he's the one who suggested in part the solution
> implemented in this series.

Got it. Just pass by to have a look:) Since Erci suggested these I stop
making noise right now.

Thanks for telling above knowledge about arm.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] kexec: arrange for paddr_vmcoreinfo_note() to return phys_addr_t
  2016-04-18 11:28           ` Baoquan He
@ 2016-04-28  8:56             ` Russell King - ARM Linux
  0 siblings, 0 replies; 24+ messages in thread
From: Russell King - ARM Linux @ 2016-04-28  8:56 UTC (permalink / raw)
  To: Baoquan He; +Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

What's the status of these three patches then?  It's been a fortnight
since they were sent, and I've heard nothing about whether they're
going to be applied or not.


On Mon, Apr 18, 2016 at 07:28:15PM +0800, Baoquan He wrote:
> On 04/18/16 at 11:52am, Russell King - ARM Linux wrote:
> > On Mon, Apr 18, 2016 at 06:32:58PM +0800, Baoquan He wrote:
> > > On 04/18/16 at 09:36am, Russell King - ARM Linux wrote:
> > > > On Mon, Apr 18, 2016 at 01:38:20PM +0800, Baoquan He wrote:
> > > > > On 04/14/16 at 09:00pm, Russell King wrote:
> > > > > > On PAE systems (eg, ARM LPAE) the vmcore note may be located above
> > > > > > 4GB physical on 32-bit architectures, so we need a wider type than
> > > > > > "unsigned long" here.  Arrange for paddr_vmcoreinfo_note() to return
> > > > > > a phys_addr_t, thereby allowing it to be located above 4GB.
> > > > > 
> > > > > At first glance, it sounds great. But I can't imagine a scenario where
> > > > > on pae system kernel can be located above 4G. As far as I know i386 and
> > > > > its pae can't do this because the current linux VM implementation can't
> > > > > allow that. I am not familiar with arm system. So please correct me if
> > > > > I was wrong.
> > > > 
> > > > You are wrong.  That's precisely why this patch exists.
> > > 
> > > I don't know arm is different then i386. On i386 the kernel text mapping
> > > is linear, just as follow:
> > > 
> > > #define __phys_addr_nodebug(x)  ((x) - PAGE_OFFSET)
> > > 
> > > 
> > > But arm seems not linear. 
> > > static inline phys_addr_t __virt_to_phys(unsigned long x)
> > > {
> > >         phys_addr_t t;
> > > 
> > >         if (sizeof(phys_addr_t) = 4) {
> > >                 __pv_stub(x, t, "add", __PV_BITS_31_24);
> > >         } else {
> > >                 __pv_stub_mov_hi(t);
> > >                 __pv_add_carry_stub(x, t);
> > >         }
> > >         return t;
> > > }
> > > 
> > > So on arm PAE this change makes sense.
> > 
> > No.  This has nothing to do with whether memory is linear or not.  The
> > above code has nothing to do with that either.  The above code you quote
> > allows us to efficiently runtime modify the virtual to physical
> > translation offset, nothing more.
> 
> OK, thanks for telling this. Learned it.
> 
> > 
> > > Besides, I checked kexec/arch/arm/kexec-zImage-arm.c and found function
> > > locate_hole() is used to find position for arm kernel. 
> > > 
> > > unsigned long locate_hole(struct kexec_info *info,
> > >         unsigned long hole_size, unsigned long hole_align,
> > >         unsigned long hole_min, unsigned long hole_max, 
> > >         int hole_end)
> > > 
> > > The type unsigned long for hole_max limit the region where arm kernel is
> > > loaded. So withough modifying this I doubt arm PAE can really be loaded
> > > above 4G.
> > 
> > Please, stop "doubting" the patches.
> > 
> > I have here a machine which requires these patches, and they're all
> > tested.  Without these patches, it doesn't work - and in fact trying
> > to use kexec on the platform takes out userspace due to the OOM killer.
> > With these patches, it does work - fully.  It has the start of system
> > memory above 4GB physical, with a non-DMA coherent boot time alias
> > below 4GB.
> > 
> > On a running system, the kernel ignores the boot alias below 4GB.
> > Having discussed with Eric, kexec is designed to use the boot time
> > alias, and we need to teach kexec about the difference between the
> > boot time alias and the running system memory layout.
> > 
> > That's what these patches are all about.
> > 
> > I've been discussing the problem with Eric on and off over the last
> > six months, and he's the one who suggested in part the solution
> > implemented in this series.
> 
> Got it. Just pass by to have a look:) Since Erci suggested these I stop
> making noise right now.
> 
> Thanks for telling above knowledge about arm.
> 
> Thanks
> Baoquan

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] kexec: don't invoke OOM-killer for control page allocation
  2016-04-18 10:12       ` Baoquan He
@ 2016-04-28  9:53         ` Baoquan He
  0 siblings, 0 replies; 24+ messages in thread
From: Baoquan He @ 2016-04-28  9:53 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On 04/18/16 at 06:12pm, Baoquan He wrote:
> On 04/18/16 at 09:39am, Russell King - ARM Linux wrote:
> > On Mon, Apr 18, 2016 at 01:32:53PM +0800, Baoquan He wrote:
> > > On 04/14/16 at 09:00pm, Russell King wrote:
> > > > If we are unable to find a suitable page when allocating the control
> > > > page, do not invoke the OOM-killer: killing processes probably isn't
> > > > going to help.
> > > 
> > > Originally kexec was made to reboot to test kernel quickly. If 1st
> > > kernel is palyed and hurted in a bad state and developer want to discard
> > > it and take a quick reboot, why don't we have a best try to make a
> > > successful kexec load?
> > 
> > And if it kills off every process trying to get a suitable page,
> > which then means you can't do anything other than power cycle,
> > that's okay?
> 
> Yes, I agree on that it's non-sense if every process is killed. But will
> each kexec load which need OOM-killer go that far? And there's only one
> page (if 32 bit) or 2 pages (if 64 bit) for control page, it may not
> need kill that many processes to pick one.

Sorry, I was wrong. Those control pages are imtermediate pages which
is used to copy kenrel and initrd from user buffer to their final place,
they should be dozens of MB. So I am fine with this change, ack it.

Acked-by: Baoquan He <bhe@redhat.com>

Thanks
Baoquan


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] kexec: ensure user memory sizes do not wrap
  2016-04-14 20:00 ` [PATCH 2/3] kexec: ensure user memory sizes do not wrap Russell King
  2016-04-18  5:35   ` Baoquan He
@ 2016-04-28  9:56   ` Baoquan He
  2016-04-28 11:07   ` Minfei Huang
  2 siblings, 0 replies; 24+ messages in thread
From: Baoquan He @ 2016-04-28  9:56 UTC (permalink / raw)
  To: Russell King; +Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On 04/14/16 at 09:00pm, Russell King wrote:
> Ensure that user memory sizes do not wrap around when validating the
> user input, which can lead to the following input validation working
> incorrectly.
> 
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> ---
>  kernel/kexec_core.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 8d34308ea449..d719a4d0ef55 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -169,6 +169,8 @@ int sanity_check_segment_list(struct kimage *image)
>  
>  		mstart = image->segment[i].mem;
>  		mend   = mstart + image->segment[i].memsz;
> +		if (mstart > mend)
> +			return result;

Though this checking has been done in kexec utitlity, I am fine it's
re-check in kernel space. Anyway it keeps code safer. Ack it.

Acked-by: Baoquan He <bhe@redhat.com>


>  		if ((mstart & ~PAGE_MASK) || (mend & ~PAGE_MASK))
>  			return result;
>  		if (mend >= KEXEC_DESTINATION_MEMORY_LIMIT)
> -- 
> 2.1.0
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] kexec: arrange for paddr_vmcoreinfo_note() to return phys_addr_t
  2016-04-14 20:00 ` [PATCH 3/3] kexec: arrange for paddr_vmcoreinfo_note() to return phys_addr_t Russell King
  2016-04-18  5:38   ` Baoquan He
@ 2016-04-28  9:59   ` Baoquan He
  1 sibling, 0 replies; 24+ messages in thread
From: Baoquan He @ 2016-04-28  9:59 UTC (permalink / raw)
  To: Russell King; +Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On 04/14/16 at 09:00pm, Russell King wrote:
> On PAE systems (eg, ARM LPAE) the vmcore note may be located above
> 4GB physical on 32-bit architectures, so we need a wider type than
> "unsigned long" here.  Arrange for paddr_vmcoreinfo_note() to return
> a phys_addr_t, thereby allowing it to be located above 4GB.
> 
> This makes no difference for kexec-tools, as they already assume a
> 64-bit type when reading from this file.
> 
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

I am not sure if ARM PAE can load kernel above 4G. But it seems not
harmful to other ARCH. Ack it now. Hope other ARM people can also help
to have a look.

Acked-by: Baoquan He <bhe@redhat.com>

Thanks

> ---
>  arch/ia64/kernel/machine_kexec.c | 2 +-
>  include/linux/kexec.h            | 2 +-
>  kernel/kexec_core.c              | 2 +-
>  kernel/ksysfs.c                  | 4 ++--
>  4 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/ia64/kernel/machine_kexec.c b/arch/ia64/kernel/machine_kexec.c
> index b72cd7a07222..599507bcec91 100644
> --- a/arch/ia64/kernel/machine_kexec.c
> +++ b/arch/ia64/kernel/machine_kexec.c
> @@ -163,7 +163,7 @@ void arch_crash_save_vmcoreinfo(void)
>  #endif
>  }
>  
> -unsigned long paddr_vmcoreinfo_note(void)
> +phys_addr_t paddr_vmcoreinfo_note(void)
>  {
>  	return ia64_tpa((unsigned long)(char *)&vmcoreinfo_note);
>  }
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 1b32ab587f66..52a3a221bcb2 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -235,7 +235,7 @@ void crash_unmap_reserved_pages(void);
>  void arch_crash_save_vmcoreinfo(void);
>  __printf(1, 2)
>  void vmcoreinfo_append_str(const char *fmt, ...);
> -unsigned long paddr_vmcoreinfo_note(void);
> +phys_addr_t paddr_vmcoreinfo_note(void);
>  
>  #define VMCOREINFO_OSRELEASE(value) \
>  	vmcoreinfo_append_str("OSRELEASE=%s\n", value)
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index d719a4d0ef55..f9847e5822e6 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -1377,7 +1377,7 @@ void vmcoreinfo_append_str(const char *fmt, ...)
>  void __weak arch_crash_save_vmcoreinfo(void)
>  {}
>  
> -unsigned long __weak paddr_vmcoreinfo_note(void)
> +phys_addr_t __weak paddr_vmcoreinfo_note(void)
>  {
>  	return __pa((unsigned long)(char *)&vmcoreinfo_note);
>  }
> diff --git a/kernel/ksysfs.c b/kernel/ksysfs.c
> index 152da4a48867..9f1920d2d0c6 100644
> --- a/kernel/ksysfs.c
> +++ b/kernel/ksysfs.c
> @@ -128,8 +128,8 @@ KERNEL_ATTR_RW(kexec_crash_size);
>  static ssize_t vmcoreinfo_show(struct kobject *kobj,
>  			       struct kobj_attribute *attr, char *buf)
>  {
> -	return sprintf(buf, "%lx %x\n",
> -		       paddr_vmcoreinfo_note(),
> +	phys_addr_t vmcore_base = paddr_vmcoreinfo_note();
> +	return sprintf(buf, "%pa %x\n", &vmcore_base,
>  		       (unsigned int)sizeof(vmcoreinfo_note));
>  }
>  KERNEL_ATTR_RO(vmcoreinfo);
> -- 
> 2.1.0
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] kexec: ensure user memory sizes do not wrap
  2016-04-14 20:00 ` [PATCH 2/3] kexec: ensure user memory sizes do not wrap Russell King
  2016-04-18  5:35   ` Baoquan He
  2016-04-28  9:56   ` Baoquan He
@ 2016-04-28 11:07   ` Minfei Huang
  2016-04-28 12:22     ` Russell King - ARM Linux
  2 siblings, 1 reply; 24+ messages in thread
From: Minfei Huang @ 2016-04-28 11:07 UTC (permalink / raw)
  To: Russell King; +Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On 04/14/16 at 09:00pm, Russell King wrote:
> Ensure that user memory sizes do not wrap around when validating the
> user input, which can lead to the following input validation working
> incorrectly.
> 
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> ---
>  kernel/kexec_core.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 8d34308ea449..d719a4d0ef55 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -169,6 +169,8 @@ int sanity_check_segment_list(struct kimage *image)
>  
>  		mstart = image->segment[i].mem;
>  		mend   = mstart + image->segment[i].memsz;
> +		if (mstart > mend)
> +			return result;

The type of image->segment[i].memsz is unsigned. So it is no need to
have a test here.

The next several test will catch if image->segment[i].memsz is equal to
0.

Thanks
Minfei

>  		if ((mstart & ~PAGE_MASK) || (mend & ~PAGE_MASK))
>  			return result;
>  		if (mend >= KEXEC_DESTINATION_MEMORY_LIMIT)
> -- 
> 2.1.0
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] kexec: ensure user memory sizes do not wrap
  2016-04-28 11:07   ` Minfei Huang
@ 2016-04-28 12:22     ` Russell King - ARM Linux
  2016-04-29  9:32       ` Minfei Huang
  0 siblings, 1 reply; 24+ messages in thread
From: Russell King - ARM Linux @ 2016-04-28 12:22 UTC (permalink / raw)
  To: Minfei Huang; +Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On Thu, Apr 28, 2016 at 07:07:22PM +0800, Minfei Huang wrote:
> On 04/14/16 at 09:00pm, Russell King wrote:
> > Ensure that user memory sizes do not wrap around when validating the
> > user input, which can lead to the following input validation working
> > incorrectly.
> > 
> > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> > ---
> >  kernel/kexec_core.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > index 8d34308ea449..d719a4d0ef55 100644
> > --- a/kernel/kexec_core.c
> > +++ b/kernel/kexec_core.c
> > @@ -169,6 +169,8 @@ int sanity_check_segment_list(struct kimage *image)
> >  
> >  		mstart = image->segment[i].mem;
> >  		mend   = mstart + image->segment[i].memsz;
> > +		if (mstart > mend)
> > +			return result;
> 
> The type of image->segment[i].memsz is unsigned. So it is no need to
> have a test here.

Absolutely wrong.  Consider the case:

	segment[i].mem = 0xfff00000;
	segment[i].size = 0x00200000;

Here, mstart will be 0xfff00000, and mend will be 0x00100000.  Just
because it's some random type does not make things magically work.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] kexec: ensure user memory sizes do not wrap
  2016-04-29  9:32       ` Minfei Huang
@ 2016-04-29  9:30         ` Russell King - ARM Linux
  2016-04-29 10:45           ` Minfei Huang
  0 siblings, 1 reply; 24+ messages in thread
From: Russell King - ARM Linux @ 2016-04-29  9:30 UTC (permalink / raw)
  To: Minfei Huang; +Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On Fri, Apr 29, 2016 at 05:32:21PM +0800, Minfei Huang wrote:
> On 04/28/16 at 01:22pm, Russell King - ARM Linux wrote:
> > On Thu, Apr 28, 2016 at 07:07:22PM +0800, Minfei Huang wrote:
> > > On 04/14/16 at 09:00pm, Russell King wrote:
> > > > Ensure that user memory sizes do not wrap around when validating the
> > > > user input, which can lead to the following input validation working
> > > > incorrectly.
> > > > 
> > > > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> > > > ---
> > > >  kernel/kexec_core.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > > 
> > > > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > > > index 8d34308ea449..d719a4d0ef55 100644
> > > > --- a/kernel/kexec_core.c
> > > > +++ b/kernel/kexec_core.c
> > > > @@ -169,6 +169,8 @@ int sanity_check_segment_list(struct kimage *image)
> > > >  
> > > >  		mstart = image->segment[i].mem;
> > > >  		mend   = mstart + image->segment[i].memsz;
> > > > +		if (mstart > mend)
> > > > +			return result;
> > > 
> > > The type of image->segment[i].memsz is unsigned. So it is no need to
> > > have a test here.
> > 
> > Absolutely wrong.  Consider the case:
> > 
> > 	segment[i].mem = 0xfff00000;
> > 	segment[i].size = 0x00200000;
> > 
> > Here, mstart will be 0xfff00000, and mend will be 0x00100000.  Just
> > because it's some random type does not make things magically work.
> 
> Hi, Russell.
> 
> Do you mean in PAE mode? If so, we will be in big trouble, because there
> are a lot of functions which use unsigned long to store memory address,
> and this type is 32 bit in PAE mode.

This is basic input validation stuff, it's got nothing to do with whether
we're in PAE mode.  If we get passed such a segment as I illustrate above,
we should detect and fail it, just as we detect and fail other similar
errors.

I'm not sure what the big deal here is.  This is basic validation checks
for stuff coming from userspace which the kernel should be doing as a
matter of course to protect itself.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] kexec: ensure user memory sizes do not wrap
  2016-04-28 12:22     ` Russell King - ARM Linux
@ 2016-04-29  9:32       ` Minfei Huang
  2016-04-29  9:30         ` Russell King - ARM Linux
  0 siblings, 1 reply; 24+ messages in thread
From: Minfei Huang @ 2016-04-29  9:32 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On 04/28/16 at 01:22pm, Russell King - ARM Linux wrote:
> On Thu, Apr 28, 2016 at 07:07:22PM +0800, Minfei Huang wrote:
> > On 04/14/16 at 09:00pm, Russell King wrote:
> > > Ensure that user memory sizes do not wrap around when validating the
> > > user input, which can lead to the following input validation working
> > > incorrectly.
> > > 
> > > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> > > ---
> > >  kernel/kexec_core.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > > index 8d34308ea449..d719a4d0ef55 100644
> > > --- a/kernel/kexec_core.c
> > > +++ b/kernel/kexec_core.c
> > > @@ -169,6 +169,8 @@ int sanity_check_segment_list(struct kimage *image)
> > >  
> > >  		mstart = image->segment[i].mem;
> > >  		mend   = mstart + image->segment[i].memsz;
> > > +		if (mstart > mend)
> > > +			return result;
> > 
> > The type of image->segment[i].memsz is unsigned. So it is no need to
> > have a test here.
> 
> Absolutely wrong.  Consider the case:
> 
> 	segment[i].mem = 0xfff00000;
> 	segment[i].size = 0x00200000;
> 
> Here, mstart will be 0xfff00000, and mend will be 0x00100000.  Just
> because it's some random type does not make things magically work.

Hi, Russell.

Do you mean in PAE mode? If so, we will be in big trouble, because there
are a lot of functions which use unsigned long to store memory address,
and this type is 32 bit in PAE mode.

Thanks
Minfei

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/3] kexec: ensure user memory sizes do not wrap
  2016-04-29  9:30         ` Russell King - ARM Linux
@ 2016-04-29 10:45           ` Minfei Huang
  0 siblings, 0 replies; 24+ messages in thread
From: Minfei Huang @ 2016-04-29 10:45 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Fenghua Yu, Tony Luck, linux-ia64, Eric Biederman, kexec

On 04/29/16 at 10:30am, Russell King - ARM Linux wrote:
> > > > > @@ -169,6 +169,8 @@ int sanity_check_segment_list(struct kimage *image)
> > > > >  
> > > > >  		mstart = image->segment[i].mem;
> > > > >  		mend   = mstart + image->segment[i].memsz;
> > > > > +		if (mstart > mend)
> > > > > +			return result;
> > > > 
> > > > The type of image->segment[i].memsz is unsigned. So it is no need to
> > > > have a test here.
> > > 
> > > Absolutely wrong.  Consider the case:
> > > 
> > > 	segment[i].mem = 0xfff00000;
> > > 	segment[i].size = 0x00200000;
> > > 
> > > Here, mstart will be 0xfff00000, and mend will be 0x00100000.  Just
> > > because it's some random type does not make things magically work.
> > 
> > Hi, Russell.
> > 
> > Do you mean in PAE mode? If so, we will be in big trouble, because there
> > are a lot of functions which use unsigned long to store memory address,
> > and this type is 32 bit in PAE mode.
> 
> This is basic input validation stuff, it's got nothing to do with whether
> we're in PAE mode.  If we get passed such a segment as I illustrate above,
> we should detect and fail it, just as we detect and fail other similar
> errors.
> 
> I'm not sure what the big deal here is.  This is basic validation checks
> for stuff coming from userspace which the kernel should be doing as a
> matter of course to protect itself.

Hi, Russell.

Thanks for your reply. I'm fine with this basic test now.

Please feel free to add
Reviewed-by: Minfei Huang <mhuang@redhat.com>.

Thanks
Minfei

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2016-04-29 10:45 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-04-14 19:59 [PATCH 0/3] Initial Kexec patches Russell King - ARM Linux
2016-04-14 20:00 ` [PATCH 1/3] kexec: don't invoke OOM-killer for control page allocation Russell King
2016-04-18  5:32   ` Baoquan He
2016-04-18  8:39     ` Russell King - ARM Linux
2016-04-18 10:12       ` Baoquan He
2016-04-28  9:53         ` Baoquan He
2016-04-14 20:00 ` [PATCH 2/3] kexec: ensure user memory sizes do not wrap Russell King
2016-04-18  5:35   ` Baoquan He
2016-04-18  8:37     ` Russell King - ARM Linux
2016-04-18 10:17       ` Baoquan He
2016-04-28  9:56   ` Baoquan He
2016-04-28 11:07   ` Minfei Huang
2016-04-28 12:22     ` Russell King - ARM Linux
2016-04-29  9:32       ` Minfei Huang
2016-04-29  9:30         ` Russell King - ARM Linux
2016-04-29 10:45           ` Minfei Huang
2016-04-14 20:00 ` [PATCH 3/3] kexec: arrange for paddr_vmcoreinfo_note() to return phys_addr_t Russell King
2016-04-18  5:38   ` Baoquan He
2016-04-18  8:36     ` Russell King - ARM Linux
2016-04-18 10:32       ` Baoquan He
2016-04-18 10:52         ` Russell King - ARM Linux
2016-04-18 11:28           ` Baoquan He
2016-04-28  8:56             ` Russell King - ARM Linux
2016-04-28  9:59   ` Baoquan He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).