* [RFC PATCH] driver: base: memory: Maintain correct mem->end_section_nr when memory block is partially filled
@ 2015-08-13  9:17 Bharata B Rao
  2015-08-14 15:27 ` Nathan Fontenot
  0 siblings, 1 reply; 4+ messages in thread
From: Bharata B Rao @ 2015-08-13  9:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: david, Bharata B Rao, Nathan Fontenot

The last section of a memory block is always initialized to

mem->start_section_nr + sections_per_block - 1

which is not correct for a memory block that contains fewer than
sections_per_block sections because of the memory size specified. This
causes the following kernel crash when the memory blocks under a node are
registered during the reboot that follows a memory hotplug operation on a
pseries guest.

Unable to handle kernel paging request for data at address 0xf0000000003f0020
Faulting instruction address: 0xc0000000007657cc
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries

Modules linked in:

CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc6+ #48
task: c0000000ba3c0000 ti: c00000013c580000 task.ti: c00000013c580000
NIP: c0000000007657cc LR: c000000000592dbc CTR: 0000000000000400
REGS: c00000013c5836f0 TRAP: 0300   Not tainted  (4.2.0-rc6+)
MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 48000048  XER: 00000000
CFAR: 00003fff990f50ec DAR: f0000000003f0020 DSISR: 40000000 SOFTE: 1
GPR00: c000000000592dbc c00000013c583970 c0000000014f0300 00000000003f0000
GPR04: 0000000000000000 c0000000f43b2900 c0000000ba324668 0000000000000001
GPR08: c000000001540300 f000000000000000 f0000000003f0000 0000000000000001
GPR12: 0000000024000084 c00000000ff20000 c00000000000b5b0 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: c00000000188c380 0000000000000000 0000000000014000 c0000000018b54e8
GPR28: c00000013c06e800 000000000000ffff 0000000000000000 000000000000fc00

NIP [c0000000007657cc] .get_nid_for_pfn+0x2c/0x60
LR [c000000000592dbc] .register_mem_sect_under_node+0x8c/0x150
Call Trace:
[c00000013c583970] [c00000000056e44c] .put_device+0x2c/0x50
[c00000013c5839f0] [c000000000592dbc] .register_mem_sect_under_node+0x8c/0x150
[c00000013c583a80] [c0000000005932b4] .register_one_node+0x2c4/0x380
[c00000013c583b30] [c000000000c882b8] .topology_init+0x44/0x1e0
[c00000013c583bf0] [c00000000000ad30] .do_one_initcall+0x110/0x270
[c00000013c583ce0] [c000000000c845d4] .kernel_init_freeable+0x278/0x360
[c00000013c583db0] [c00000000000b5d4] .kernel_init+0x24/0x130
[c00000013c583e30] [c0000000000094e8] .ret_from_kernel_thread+0x58/0x70

Fix this by updating the memory block to always contain the right
number of sections instead of assuming sections_per_block.
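
The difference between the two calculations described above can be sketched
in standalone C. This is only an illustration under stated assumptions: the
struct below is a hypothetical stand-in that models just the three fields
named in this patch, and the helper names are invented for the example; it
is not the code in drivers/base/memory.c.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the kernel's struct memory_block. Only the
 * three fields named in the patch are modelled here.
 */
struct memory_block_sketch {
	unsigned long start_section_nr;
	unsigned long end_section_nr;
	unsigned long section_count;
};

/* End section as set before the patch: assumes every block is full. */
static unsigned long end_assuming_full_block(unsigned long start_section_nr,
					     unsigned long sections_per_block)
{
	return start_section_nr + sections_per_block - 1;
}

/* End section as set by the patch: derived from the sections present. */
static void set_end_from_count(struct memory_block_sketch *mem)
{
	assert(mem->section_count > 0); /* an empty block has no last section */
	mem->end_section_nr = mem->start_section_nr + mem->section_count - 1;
}
```

With illustrative values of 16 sections per block and a final block that
starts at section 240 but holds only 12 sections, the first helper yields
255 while the second yields 251, so a caller trusting the first value would
treat sections 252 to 255 as present even though they are not.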

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
---
 drivers/base/memory.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 2804aed..7f3ce2e 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -645,6 +645,7 @@ static int add_memory_block(int base_section_nr)
 	if (ret)
 		return ret;
 	mem->section_count = section_count;
+        mem->end_section_nr = mem->start_section_nr + section_count -1;
 	return 0;
 }
 
-- 
2.1.0



* Re: [RFC PATCH] driver: base: memory: Maintain correct mem->end_section_nr when memory block is partially filled
  2015-08-13  9:17 [RFC PATCH] driver: base: memory: Maintain correct mem->end_section_nr when memory block is partially filled Bharata B Rao
@ 2015-08-14 15:27 ` Nathan Fontenot
  2015-08-17  6:26   ` Bharata B Rao
  0 siblings, 1 reply; 4+ messages in thread
From: Nathan Fontenot @ 2015-08-14 15:27 UTC (permalink / raw)
  To: Bharata B Rao, linux-kernel; +Cc: david

On 08/13/2015 04:17 AM, Bharata B Rao wrote:
> Last section of memory block is always initialized to
> 
> mem->start_section_nr + sections_per_block - 1
> 
> which will not be true for a section that doesn't contain sections_per_block
> sections due to the memory size specified. This causes the following
> kernel crash when memory blocks under a node are registered during reboot
> that follows a memory hotplug operation on pseries guest.
> 
> Unable to handle kernel paging request for data at address 0xf0000000003f0020
> Faulting instruction address: 0xc0000000007657cc
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=1024 NUMA pSeries
> 
> Modules linked in:
> 
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc6+ #48
> task: c0000000ba3c0000 ti: c00000013c580000 task.ti: c00000013c580000
> NIP: c0000000007657cc LR: c000000000592dbc CTR: 0000000000000400
> REGS: c00000013c5836f0 TRAP: 0300   Not tainted  (4.2.0-rc6+)
> MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 48000048  XER: 00000000
> CFAR: 00003fff990f50ec DAR: f0000000003f0020 DSISR: 40000000 SOFTE: 1
> GPR00: c000000000592dbc c00000013c583970 c0000000014f0300 00000000003f0000
> GPR04: 0000000000000000 c0000000f43b2900 c0000000ba324668 0000000000000001
> GPR08: c000000001540300 f000000000000000 f0000000003f0000 0000000000000001
> GPR12: 0000000024000084 c00000000ff20000 c00000000000b5b0 0000000000000000
> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR24: c00000000188c380 0000000000000000 0000000000014000 c0000000018b54e8
> GPR28: c00000013c06e800 000000000000ffff 0000000000000000 000000000000fc00
> 
> NIP [c0000000007657cc] .get_nid_for_pfn+0x2c/0x60
> LR [c000000000592dbc] .register_mem_sect_under_node+0x8c/0x150
> Call Trace:
> [c00000013c583970] [c00000000056e44c] .put_device+0x2c/0x50
> [c00000013c5839f0] [c000000000592dbc] .register_mem_sect_under_node+0x8c/0x150
> [c00000013c583a80] [c0000000005932b4] .register_one_node+0x2c4/0x380
> [c00000013c583b30] [c000000000c882b8] .topology_init+0x44/0x1e0
> [c00000013c583bf0] [c00000000000ad30] .do_one_initcall+0x110/0x270
> [c00000013c583ce0] [c000000000c845d4] .kernel_init_freeable+0x278/0x360
> [c00000013c583db0] [c00000000000b5d4] .kernel_init+0x24/0x130
> [c00000013c583e30] [c0000000000094e8] .ret_from_kernel_thread+0x58/0x70
> 
> Fix this by updating the memory block to always contain the right
> number of sections instead of assuming sections_per_block.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> ---
>  drivers/base/memory.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 2804aed..7f3ce2e 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -645,6 +645,7 @@ static int add_memory_block(int base_section_nr)
>  	if (ret)
>  		return ret;
>  	mem->section_count = section_count;
> +        mem->end_section_nr = mem->start_section_nr + section_count -1;

I think this change may be correct but makes me wonder if we need to update
code elsewhere. There are places (at least in drivers/base/memory.c) that assume
a memory block contains sections_per_block sections.
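
As a rough illustration of the kind of assumption mentioned above, a walk
over the sections of a block can be bounded by the block's own
start_section_nr and end_section_nr instead of by sections_per_block. The
sketch below is hypothetical throughout: the struct, the callback type, and
the function names are invented for the example and do not reproduce any
specific function in drivers/base/memory.c.

```c
#include <assert.h>

/* Hypothetical stand-in for struct memory_block; not the kernel definition. */
struct mem_block_bounds {
	unsigned long start_section_nr;
	unsigned long end_section_nr;
};

/* Hypothetical per-section callback; a nonzero return stops the walk. */
typedef int (*section_fn)(unsigned long section_nr, void *arg);

/*
 * Walk the sections of a block using the block's own bounds rather than
 * start_section_nr + sections_per_block, so a partially filled block is
 * never walked past its last populated section.
 */
static int walk_block_sections(const struct mem_block_bounds *mem,
			       section_fn fn, void *arg)
{
	unsigned long nr;

	assert(mem->start_section_nr <= mem->end_section_nr);

	for (nr = mem->start_section_nr; nr <= mem->end_section_nr; nr++) {
		int ret = fn(nr, arg);

		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: counts the sections visited. */
static int count_section(unsigned long section_nr, void *arg)
{
	(void)section_nr;
	*(unsigned long *)arg += 1;
	return 0;
}
```

Any loop written this way only stays correct if end_section_nr itself
reflects the real section count, which is what the patch under discussion
changes.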

Also, I think you may need to cc GregKH for this patch.

-Nathan
 
>  	return 0;
>  }
> 



* Re: [RFC PATCH] driver: base: memory: Maintain correct mem->end_section_nr when memory block is partially filled
  2015-08-14 15:27 ` Nathan Fontenot
@ 2015-08-17  6:26   ` Bharata B Rao
  2015-08-17 16:32     ` Greg KH
  0 siblings, 1 reply; 4+ messages in thread
From: Bharata B Rao @ 2015-08-17  6:26 UTC (permalink / raw)
  To: Nathan Fontenot; +Cc: linux-kernel, david, gregkh

On Fri, Aug 14, 2015 at 10:27:53AM -0500, Nathan Fontenot wrote:
> On 08/13/2015 04:17 AM, Bharata B Rao wrote:
> > Last section of memory block is always initialized to
> > 
> > mem->start_section_nr + sections_per_block - 1
> > 
> > which will not be true for a section that doesn't contain sections_per_block
> > sections due to the memory size specified. This causes the following
> > kernel crash when memory blocks under a node are registered during reboot
> > that follows a memory hotplug operation on pseries guest.
> > 
> > Unable to handle kernel paging request for data at address 0xf0000000003f0020
> > Faulting instruction address: 0xc0000000007657cc
> > Oops: Kernel access of bad area, sig: 11 [#1]
> > SMP NR_CPUS=1024 NUMA pSeries
> > 
> > Modules linked in:
> > 
> > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc6+ #48
> > task: c0000000ba3c0000 ti: c00000013c580000 task.ti: c00000013c580000
> > NIP: c0000000007657cc LR: c000000000592dbc CTR: 0000000000000400
> > REGS: c00000013c5836f0 TRAP: 0300   Not tainted  (4.2.0-rc6+)
> > MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 48000048  XER: 00000000
> > CFAR: 00003fff990f50ec DAR: f0000000003f0020 DSISR: 40000000 SOFTE: 1
> > GPR00: c000000000592dbc c00000013c583970 c0000000014f0300 00000000003f0000
> > GPR04: 0000000000000000 c0000000f43b2900 c0000000ba324668 0000000000000001
> > GPR08: c000000001540300 f000000000000000 f0000000003f0000 0000000000000001
> > GPR12: 0000000024000084 c00000000ff20000 c00000000000b5b0 0000000000000000
> > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > GPR24: c00000000188c380 0000000000000000 0000000000014000 c0000000018b54e8
> > GPR28: c00000013c06e800 000000000000ffff 0000000000000000 000000000000fc00
> > 
> > NIP [c0000000007657cc] .get_nid_for_pfn+0x2c/0x60
> > LR [c000000000592dbc] .register_mem_sect_under_node+0x8c/0x150
> > Call Trace:
> > [c00000013c583970] [c00000000056e44c] .put_device+0x2c/0x50
> > [c00000013c5839f0] [c000000000592dbc] .register_mem_sect_under_node+0x8c/0x150
> > [c00000013c583a80] [c0000000005932b4] .register_one_node+0x2c4/0x380
> > [c00000013c583b30] [c000000000c882b8] .topology_init+0x44/0x1e0
> > [c00000013c583bf0] [c00000000000ad30] .do_one_initcall+0x110/0x270
> > [c00000013c583ce0] [c000000000c845d4] .kernel_init_freeable+0x278/0x360
> > [c00000013c583db0] [c00000000000b5d4] .kernel_init+0x24/0x130
> > [c00000013c583e30] [c0000000000094e8] .ret_from_kernel_thread+0x58/0x70
> > 
> > Fix this by updating the memory block to always contain the right
> > number of sections instead of assuming sections_per_block.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> > ---
> >  drivers/base/memory.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> > index 2804aed..7f3ce2e 100644
> > --- a/drivers/base/memory.c
> > +++ b/drivers/base/memory.c
> > @@ -645,6 +645,7 @@ static int add_memory_block(int base_section_nr)
> >  	if (ret)
> >  		return ret;
> >  	mem->section_count = section_count;
> > +        mem->end_section_nr = mem->start_section_nr + section_count -1;
> 
> I think this change may be correct but makes me wonder if we need to update
> code elsewhere. There are places (at least in drivers/base/memory.c) that assume
> a memory block contains sections_per_block sections.
> 
> Also, I think you may need to cc GregKH for this patch.
 
Hi Greg - Do you think the above is the right fix to the problem that is
described here ?

Regards,
Bharata.



* Re: [RFC PATCH] driver: base: memory: Maintain correct mem->end_section_nr when memory block is partially filled
  2015-08-17  6:26   ` Bharata B Rao
@ 2015-08-17 16:32     ` Greg KH
  0 siblings, 0 replies; 4+ messages in thread
From: Greg KH @ 2015-08-17 16:32 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: Nathan Fontenot, linux-kernel, david

On Mon, Aug 17, 2015 at 11:56:53AM +0530, Bharata B Rao wrote:
> On Fri, Aug 14, 2015 at 10:27:53AM -0500, Nathan Fontenot wrote:
> > On 08/13/2015 04:17 AM, Bharata B Rao wrote:
> > > Last section of memory block is always initialized to
> > > 
> > > mem->start_section_nr + sections_per_block - 1
> > > 
> > > which will not be true for a section that doesn't contain sections_per_block
> > > sections due to the memory size specified. This causes the following
> > > kernel crash when memory blocks under a node are registered during reboot
> > > that follows a memory hotplug operation on pseries guest.
> > > 
> > > Unable to handle kernel paging request for data at address 0xf0000000003f0020
> > > Faulting instruction address: 0xc0000000007657cc
> > > Oops: Kernel access of bad area, sig: 11 [#1]
> > > SMP NR_CPUS=1024 NUMA pSeries
> > > 
> > > Modules linked in:
> > > 
> > > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc6+ #48
> > > task: c0000000ba3c0000 ti: c00000013c580000 task.ti: c00000013c580000
> > > NIP: c0000000007657cc LR: c000000000592dbc CTR: 0000000000000400
> > > REGS: c00000013c5836f0 TRAP: 0300   Not tainted  (4.2.0-rc6+)
> > > MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 48000048  XER: 00000000
> > > CFAR: 00003fff990f50ec DAR: f0000000003f0020 DSISR: 40000000 SOFTE: 1
> > > GPR00: c000000000592dbc c00000013c583970 c0000000014f0300 00000000003f0000
> > > GPR04: 0000000000000000 c0000000f43b2900 c0000000ba324668 0000000000000001
> > > GPR08: c000000001540300 f000000000000000 f0000000003f0000 0000000000000001
> > > GPR12: 0000000024000084 c00000000ff20000 c00000000000b5b0 0000000000000000
> > > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > GPR24: c00000000188c380 0000000000000000 0000000000014000 c0000000018b54e8
> > > GPR28: c00000013c06e800 000000000000ffff 0000000000000000 000000000000fc00
> > > 
> > > NIP [c0000000007657cc] .get_nid_for_pfn+0x2c/0x60
> > > LR [c000000000592dbc] .register_mem_sect_under_node+0x8c/0x150
> > > Call Trace:
> > > [c00000013c583970] [c00000000056e44c] .put_device+0x2c/0x50
> > > [c00000013c5839f0] [c000000000592dbc] .register_mem_sect_under_node+0x8c/0x150
> > > [c00000013c583a80] [c0000000005932b4] .register_one_node+0x2c4/0x380
> > > [c00000013c583b30] [c000000000c882b8] .topology_init+0x44/0x1e0
> > > [c00000013c583bf0] [c00000000000ad30] .do_one_initcall+0x110/0x270
> > > [c00000013c583ce0] [c000000000c845d4] .kernel_init_freeable+0x278/0x360
> > > [c00000013c583db0] [c00000000000b5d4] .kernel_init+0x24/0x130
> > > [c00000013c583e30] [c0000000000094e8] .ret_from_kernel_thread+0x58/0x70
> > > 
> > > Fix this by updating the memory block to always contain the right
> > > number of sections instead of assuming sections_per_block.
> > > 
> > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > > Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> > > ---
> > >  drivers/base/memory.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> > > index 2804aed..7f3ce2e 100644
> > > --- a/drivers/base/memory.c
> > > +++ b/drivers/base/memory.c
> > > @@ -645,6 +645,7 @@ static int add_memory_block(int base_section_nr)
> > >  	if (ret)
> > >  		return ret;
> > >  	mem->section_count = section_count;
> > > +        mem->end_section_nr = mem->start_section_nr + section_count -1;
> > 
> > I think this change may be correct but makes me wonder if we need to update
> > code elsewhere. There are places (at least in drivers/base/memory.c) that assume
> > a memory block contains sections_per_block sections.
> > 
> > Also, I think you may need to cc GregKH for this patch.
>  
> Hi Greg - Do you think the above is the right fix to the problem that is
> described here ?

I have no idea, sorry, I didn't write this code :)
