From: Bharata B Rao <bharata@linux.ibm.com>
To: guro@fb.com
Cc: mhocko@kernel.org, hannes@cmpxchg.org,
linux-kernel@vger.kernel.org, kernel-team@fb.com,
shakeelb@google.com, vdavydov.dev@gmail.com, longman@redhat.com
Subject: Re: [PATCH 00/16] The new slab memory controller
Date: Mon, 9 Dec 2019 17:26:49 +0530
Message-ID: <20191209115649.GA17552@in.ibm.com>
In-Reply-To: <20191209091746.GA16989@in.ibm.com>
On Mon, Dec 09, 2019 at 02:47:52PM +0530, Bharata B Rao wrote:
> Hi,
>
> I see the below crash during early boot when I try this patchset on
> PowerPC host. I am on new_slab.rfc.v5.3 branch.
>
> BUG: Unable to handle kernel data access at 0x81030236d1814578
> Faulting instruction address: 0xc0000000002cc314
> Oops: Kernel access of bad area, sig: 11 [#1]
> LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
> Modules linked in: ip_tables x_tables autofs4 sr_mod cdrom usbhid bnx2x crct10dif_vpmsum crct10dif_common mdio libcrc32c crc32c_vpmsum
> CPU: 31 PID: 1752 Comm: keyboard-setup. Not tainted 5.3.0-g9bd85fd72a0c #155
> NIP: c0000000002cc314 LR: c0000000002cc2e8 CTR: 0000000000000000
> REGS: c000001e40f378b0 TRAP: 0380 Not tainted (5.3.0-g9bd85fd72a0c)
> MSR: 900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 44022224 XER: 00000000
> CFAR: c0000000002c6ad4 IRQMASK: 1
> GPR00: c0000000000b8a40 c000001e40f37b40 c000000000ed9600 0000000000000000
> GPR04: 0000000000000023 0000000000000010 c000001e40f37b24 c000001e3cba3400
> GPR08: 0000000000000020 81030218815f4578 0000001e50220000 0000000000000030
> GPR12: 0000000000002200 c000001fff774d80 0000000000000000 00000001072600d8
> GPR16: 0000000000000000 c0000000000bbaac 0000000000000000 0000000000000000
> GPR20: c000001e40f37c48 0000000000000001 0000000000000000 c000001e3cba3400
> GPR24: c000001e40f37dd8 0000000000000000 c000000000fa0d58 0000000000000000
> GPR28: c000001e3a080080 c000001e32da0100 0000000000000118 0000000000000010
> NIP [c0000000002cc314] __mod_memcg_state+0x58/0xd0
> LR [c0000000002cc2e8] __mod_memcg_state+0x2c/0xd0
> Call Trace:
> [c000001e40f37b90] [c0000000000b8a40] account_kernel_stack+0xa4/0xe4
> [c000001e40f37bd0] [c0000000000ba4a4] copy_process+0x2b4/0x16f0
> [c000001e40f37cf0] [c0000000000bbaac] _do_fork+0x9c/0x3e4
> [c000001e40f37db0] [c0000000000bc030] sys_clone+0x74/0xa8
> [c000001e40f37e20] [c00000000000bb34] ppc_clone+0x8/0xc
> Instruction dump:
> 4bffa7e9 2fa30000 409e007c 395efffb 3d000020 2b8a0001 409d0008 39000020
> e93d0718 e94d0028 7bde1f24 7d29f214 <7ca9502a> 7fff2a14 7fe9fe76 7d27fa78
>
> Looks like page->mem_cgroup_vec is allocated but not yet initialized
> with memcg pointers when we try to access them.
>
> I did get past the crash by initializing the pointers like this
> in account_kernel_stack(),
The above is not an accurate description of the hack shown below.
What the hack actually does is look up the memcg that corresponds to
the task_struct_cachep object in the page.
That still doesn't explain why we don't hit this problem on x86.
> but I am pretty sure that this is not the
> place to do this:
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 541fd805fb88..be21419feae2 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -380,13 +380,26 @@ static void account_kernel_stack(struct task_struct *tsk, int account)
> * All stack pages are in the same zone and belong to the
> * same memcg.
> */
> - struct page *first_page = virt_to_page(stack);
> + struct page *first_page = virt_to_head_page((stack));
> + unsigned long off;
> + struct mem_cgroup_ptr *memcg_ptr;
> + struct mem_cgroup *memcg;
>
> mod_zone_page_state(page_zone(first_page), NR_KERNEL_STACK_KB,
> THREAD_SIZE / 1024 * account);
>
> - mod_memcg_page_state(first_page, MEMCG_KERNEL_STACK_KB,
> + if (!first_page->mem_cgroup_vec)
> + return;
> + off = obj_to_index(task_struct_cachep, first_page, stack);
> + memcg_ptr = first_page->mem_cgroup_vec[off];
> + if (!memcg_ptr)
> + return;
> + rcu_read_lock();
> + memcg = memcg_ptr->memcg;
> + if (memcg)
> + __mod_memcg_state(memcg, MEMCG_KERNEL_STACK_KB,
> account * (THREAD_SIZE / 1024));
> + rcu_read_unlock();
> }
> }
>
> Regards,
> Bharata.
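
For readers following the thread, the lookup the hack performs (object
address -> index within the slab page -> per-object memcg pointer, with
NULL checks at each step) can be modeled in user space roughly as below.
This is only a sketch under stated assumptions, not the patchset's code:
struct slab_page, struct memcg, struct memcg_ptr, obj_index() and
account_stack() are hypothetical stand-ins, and computing the index as
byte offset divided by object size is my assumption about what
obj_to_index() does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel structures. */
struct memcg {
	long kernel_stack_kb;		/* models MEMCG_KERNEL_STACK_KB */
};

struct memcg_ptr {
	struct memcg *memcg;		/* may be NULL */
};

struct slab_page {
	char *base;			/* address of the first object */
	size_t obj_size;		/* size of each object in the cache */
	size_t nr_objs;			/* number of objects in the page */
	struct memcg_ptr **vec;		/* per-object owners, may be NULL */
};

/* Assumed index rule: byte offset into the page divided by object size. */
static size_t obj_index(const struct slab_page *pg, const void *obj)
{
	size_t off = (size_t)((const char *)obj - pg->base);
	size_t idx = off / pg->obj_size;

	assert(idx < pg->nr_objs);	/* obj must lie inside this page */
	return idx;
}

/*
 * Charge delta_kb to the memcg owning the object at 'stack'.
 * Returns 1 if a memcg was charged, 0 if the lookup found no owner
 * (no vector, empty slot, or memcg pointer not set).
 */
static int account_stack(struct slab_page *pg, const void *stack, long delta_kb)
{
	struct memcg_ptr *owner;

	if (!pg->vec)
		return 0;

	owner = pg->vec[obj_index(pg, stack)];
	if (!owner || !owner->memcg)
		return 0;

	owner->memcg->kernel_stack_kb += delta_kb;
	return 1;
}
```

The model leaves out RCU and the zone counters entirely; it only shows
why every level of the lookup needs a NULL check before the counter is
touched.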