All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wei Yang <richard.weiyang@gmail.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Sachin Sant <sachinp@linux.vnet.ibm.com>,
	Pankaj Gupta <pankaj.gupta.linux@gmail.com>,
	Michal Hocko <mhocko@suse.com>, Baoquan He <bhe@redhat.com>,
	David Hildenbrand <david@redhat.com>,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org,
	Mike Rapoport <rppt@linux.ibm.com>,
	linux-mm@kvack.org, Wei Yang <richardw.yang@linux.intel.com>,
	akpm@linux-foundation.org, linuxppc-dev@lists.ozlabs.org,
	Dan Williams <dan.j.williams@intel.com>,
	Oscar Salvador <osalvador@suse.de>
Subject: Re: [PATCH v2] mm/sparse: Fix kernel crash with pfn_section_valid check
Date: Thu, 26 Mar 2020 22:12:37 +0000	[thread overview]
Message-ID: <20200326221237.fttur7yyfl7nrdsj@master> (raw)
In-Reply-To: <20200326133235.343616-1-aneesh.kumar@linux.ibm.com>

On Thu, Mar 26, 2020 at 07:02:35PM +0530, Aneesh Kumar K.V wrote:
>Fixes the below crash
>
>BUG: Kernel NULL pointer dereference on read at 0x00000000
>Faulting instruction address: 0xc000000000c3447c
>Oops: Kernel access of bad area, sig: 11 [#1]
>LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>CPU: 11 PID: 7519 Comm: lt-ndctl Not tainted 5.6.0-rc7-autotest #1
>...
>NIP [c000000000c3447c] vmemmap_populated+0x98/0xc0
>LR [c000000000088354] vmemmap_free+0x144/0x320
>Call Trace:
> section_deactivate+0x220/0x240
> __remove_pages+0x118/0x170
> arch_remove_memory+0x3c/0x150
> memunmap_pages+0x1cc/0x2f0
> devm_action_release+0x30/0x50
> release_nodes+0x2f8/0x3e0
> device_release_driver_internal+0x168/0x270
> unbind_store+0x130/0x170
> drv_attr_store+0x44/0x60
> sysfs_kf_write+0x68/0x80
> kernfs_fop_write+0x100/0x290
> __vfs_write+0x3c/0x70
> vfs_write+0xcc/0x240
> ksys_write+0x7c/0x140
> system_call+0x5c/0x68
>
>The crash is due to NULL dereference at
>
>test_bit(idx, ms->usage->subsection_map); due to ms->usage = NULL; in pfn_section_valid()
>
>With commit: d41e2f3bd546 ("mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case")
>section_mem_map is set to NULL after depopulate_section_mem(). This
>was done so that pfn_page() can work correctly with kernel config that disables
>SPARSEMEM_VMEMMAP. With that config pfn_to_page does
>
>	__section_mem_map_addr(__sec) + __pfn;
>where
>
>static inline struct page *__section_mem_map_addr(struct mem_section *section)
>{
>	unsigned long map = section->section_mem_map;
>	map &= SECTION_MAP_MASK;
>	return (struct page *)map;
>}
>
>Now with SPASEMEM_VMEMAP enabled, mem_section->usage->subsection_map is used to
>check the pfn validity (pfn_valid()). Since section_deactivate release
>mem_section->usage if a section is fully deactivated, pfn_valid() check after
>a subsection_deactivate cause a kernel crash.
>
>static inline int pfn_valid(unsigned long pfn)
>{
>...
>	return early_section(ms) || pfn_section_valid(ms, pfn);
>}
>
>where
>
>static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>{
>	int idx = subsection_map_index(pfn);
>
>	return test_bit(idx, ms->usage->subsection_map);
>}
>
>Avoid this by clearing SECTION_HAS_MEM_MAP when mem_section->usage is freed.
>For architectures like ppc64 where large pages are used for vmmemap mapping (16MB),
>a specific vmemmap mapping can cover multiple sections. Hence before a vmemmap
>mapping page can be freed, the kernel needs to make sure there are no valid sections
>within that mapping. Clearing the section valid bit before
>depopulate_section_memap enables this.
>
>Fixes: d41e2f3bd546 ("mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case")
>Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
>Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
>Cc: Baoquan He <bhe@redhat.com>
>Cc: Michael Ellerman <mpe@ellerman.id.au>
>Cc: Dan Williams <dan.j.williams@intel.com>
>Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
>Cc: David Hildenbrand <david@redhat.com>
>Cc: Michal Hocko <mhocko@suse.com>
>Cc: Wei Yang <richardw.yang@linux.intel.com>
>Cc: Oscar Salvador <osalvador@suse.de>
>Cc: Mike Rapoport <rppt@linux.ibm.com>
>Cc: <stable@vger.kernel.org>
>Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>

-- 
Wei Yang
Help you, Help me

WARNING: multiple messages have this Message-ID (diff)
From: Wei Yang <richard.weiyang@gmail.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org, mpe@ellerman.id.au,
	linuxppc-dev@lists.ozlabs.org,
	Sachin Sant <sachinp@linux.vnet.ibm.com>,
	Baoquan He <bhe@redhat.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Pankaj Gupta <pankaj.gupta.linux@gmail.com>,
	David Hildenbrand <david@redhat.com>,
	Michal Hocko <mhocko@suse.com>,
	Wei Yang <richardw.yang@linux.intel.com>,
	Oscar Salvador <osalvador@suse.de>,
	Mike Rapoport <rppt@linux.ibm.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH v2] mm/sparse: Fix kernel crash with pfn_section_valid check
Date: Thu, 26 Mar 2020 22:12:37 +0000	[thread overview]
Message-ID: <20200326221237.fttur7yyfl7nrdsj@master> (raw)
In-Reply-To: <20200326133235.343616-1-aneesh.kumar@linux.ibm.com>

On Thu, Mar 26, 2020 at 07:02:35PM +0530, Aneesh Kumar K.V wrote:
>Fixes the below crash
>
>BUG: Kernel NULL pointer dereference on read at 0x00000000
>Faulting instruction address: 0xc000000000c3447c
>Oops: Kernel access of bad area, sig: 11 [#1]
>LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>CPU: 11 PID: 7519 Comm: lt-ndctl Not tainted 5.6.0-rc7-autotest #1
>...
>NIP [c000000000c3447c] vmemmap_populated+0x98/0xc0
>LR [c000000000088354] vmemmap_free+0x144/0x320
>Call Trace:
> section_deactivate+0x220/0x240
> __remove_pages+0x118/0x170
> arch_remove_memory+0x3c/0x150
> memunmap_pages+0x1cc/0x2f0
> devm_action_release+0x30/0x50
> release_nodes+0x2f8/0x3e0
> device_release_driver_internal+0x168/0x270
> unbind_store+0x130/0x170
> drv_attr_store+0x44/0x60
> sysfs_kf_write+0x68/0x80
> kernfs_fop_write+0x100/0x290
> __vfs_write+0x3c/0x70
> vfs_write+0xcc/0x240
> ksys_write+0x7c/0x140
> system_call+0x5c/0x68
>
>The crash is due to NULL dereference at
>
>test_bit(idx, ms->usage->subsection_map); due to ms->usage = NULL; in pfn_section_valid()
>
>With commit: d41e2f3bd546 ("mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case")
>section_mem_map is set to NULL after depopulate_section_mem(). This
>was done so that pfn_page() can work correctly with kernel config that disables
>SPARSEMEM_VMEMMAP. With that config pfn_to_page does
>
>	__section_mem_map_addr(__sec) + __pfn;
>where
>
>static inline struct page *__section_mem_map_addr(struct mem_section *section)
>{
>	unsigned long map = section->section_mem_map;
>	map &= SECTION_MAP_MASK;
>	return (struct page *)map;
>}
>
>Now with SPASEMEM_VMEMAP enabled, mem_section->usage->subsection_map is used to
>check the pfn validity (pfn_valid()). Since section_deactivate release
>mem_section->usage if a section is fully deactivated, pfn_valid() check after
>a subsection_deactivate cause a kernel crash.
>
>static inline int pfn_valid(unsigned long pfn)
>{
>...
>	return early_section(ms) || pfn_section_valid(ms, pfn);
>}
>
>where
>
>static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>{
>	int idx = subsection_map_index(pfn);
>
>	return test_bit(idx, ms->usage->subsection_map);
>}
>
>Avoid this by clearing SECTION_HAS_MEM_MAP when mem_section->usage is freed.
>For architectures like ppc64 where large pages are used for vmmemap mapping (16MB),
>a specific vmemmap mapping can cover multiple sections. Hence before a vmemmap
>mapping page can be freed, the kernel needs to make sure there are no valid sections
>within that mapping. Clearing the section valid bit before
>depopulate_section_memap enables this.
>
>Fixes: d41e2f3bd546 ("mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case")
>Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
>Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
>Cc: Baoquan He <bhe@redhat.com>
>Cc: Michael Ellerman <mpe@ellerman.id.au>
>Cc: Dan Williams <dan.j.williams@intel.com>
>Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
>Cc: David Hildenbrand <david@redhat.com>
>Cc: Michal Hocko <mhocko@suse.com>
>Cc: Wei Yang <richardw.yang@linux.intel.com>
>Cc: Oscar Salvador <osalvador@suse.de>
>Cc: Mike Rapoport <rppt@linux.ibm.com>
>Cc: <stable@vger.kernel.org>
>Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>

-- 
Wei Yang
Help you, Help me


  parent reply	other threads:[~2020-03-26 22:14 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-26 13:32 [PATCH v2] mm/sparse: Fix kernel crash with pfn_section_valid check Aneesh Kumar K.V
2020-03-26 13:32 ` Aneesh Kumar K.V
2020-03-26 13:38 ` Baoquan He
2020-03-26 13:38   ` Baoquan He
2020-03-26 13:55 ` Michal Hocko
2020-03-26 13:55   ` Michal Hocko
2020-03-26 16:04 ` Pankaj Gupta
2020-03-26 16:04   ` Pankaj Gupta
2020-03-26 22:12 ` Wei Yang [this message]
2020-03-26 22:12   ` Wei Yang
2020-03-27  9:42 ` David Hildenbrand
2020-03-27  9:42   ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200326221237.fttur7yyfl7nrdsj@master \
    --to=richard.weiyang@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=bhe@redhat.com \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mhocko@suse.com \
    --cc=osalvador@suse.de \
    --cc=pankaj.gupta.linux@gmail.com \
    --cc=richardw.yang@linux.intel.com \
    --cc=rppt@linux.ibm.com \
    --cc=sachinp@linux.vnet.ibm.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.