* Re: [btrfs] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
From: David Rientjes @ 2014-02-07 10:13 UTC
  To: Fengguang Wu, Tejun Heo
  Cc: Filipe David Borba Manana, Chris Mason, linux-btrfs, linux-kernel

On Fri, 7 Feb 2014, Fengguang Wu wrote:

> [    1.625020] BTRFS: selftest: Running btrfs_split_item tests
> [    1.627004] BTRFS: selftest: Running find delalloc tests
> [    2.289182] tsc: Refined TSC clocksource calibration: 2299.967 MHz
> [  292.084537] kthreadd invoked oom-killer: gfp_mask=0x3000d0, order=1, oom_score_adj=0
> [  292.086439] kthreadd cpuset=
> [  292.087072] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
> [  292.087372] IP: [<ffffffff812119de>] pr_cont_kernfs_name+0x1b/0x6c

This looks like a problem with the cpuset cgroup name. Are you sure this
isn't related to the removal of cgroup->name?

> [  292.087372] PGD 0 
> [  292.087372] Oops: 0000 [#1] 
> [  292.087372] Modules linked in:
> [  292.087372] CPU: 0 PID: 2 Comm: kthreadd Not tainted 3.14.0-rc1-wl-ath-00978-g4830363 #2
> [  292.087372] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [  292.087372] task: ffff880000148050 ti: ffff88000014a000 task.ti: ffff88000014a000
> [  292.087372] RIP: 0010:[<ffffffff812119de>]  [<ffffffff812119de>] pr_cont_kernfs_name+0x1b/0x6c
> [  292.087372] RSP: 0000:ffff88000014bb20  EFLAGS: 00010046
> [  292.087372] RAX: 0000000000000282 RBX: 0000000000000000 RCX: 0000000000000002
> [  292.087372] RDX: ffffffff812119de RSI: ffffffff8247d4a8 RDI: 0000000000000046
> [  292.087372] RBP: ffff88000014bb30 R08: ffffffff82f31218 R09: 0000000000000000
> [  292.087372] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff833c4ab8
> [  292.087372] R13: 00000000003000d0 R14: 0000000000000001 R15: 0000000000000000
> [  292.087372] FS:  0000000000000000(0000) GS:ffffffff82279000(0000) knlGS:0000000000000000
> [  292.087372] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  292.087372] CR2: 0000000000000038 CR3: 0000000002269000 CR4: 00000000000006b0
> [  292.087372] Stack:
> [  292.087372]  ffff880000148050 ffffffff833c4ab8 ffff88000014bb50 ffffffff8110836d
> [  292.087372]  ffff880000148050 ffff880000148530 ffff88000014bbd0 ffffffff81ae2614
> [  292.087372]  ffff880000148640 ffff88000014bb78 ffffffff810c7dd7 ffff88000014bb98
> [  292.087372] Call Trace:
> [  292.087372]  [<ffffffff8110836d>] cpuset_print_task_mems_allowed+0x65/0x8b
> [  292.087372]  [<ffffffff81ae2614>] dump_header.isra.11+0x68/0x29a
> [  292.087372]  [<ffffffff810c7dd7>] ? local_clock+0x2b/0x34
> [  292.087372]  [<ffffffff810ce52b>] ? lock_release_holdtime+0xcf/0xdb
> [  292.087372]  [<ffffffff811533d2>] ? out_of_memory+0x318/0x40f
> [  292.087372]  [<ffffffff8115342d>] out_of_memory+0x373/0x40f
> [  292.087372]  [<ffffffff8115324d>] ? out_of_memory+0x193/0x40f
> [  292.087372]  [<ffffffff8115a0ac>] __alloc_pages_nodemask+0xdd1/0x12c4
> [  292.087372]  [<ffffffff8100d301>] ? native_sched_clock+0xe4/0xfc
> [  292.087372]  [<ffffffff81089519>] copy_process+0x21d/0x2115
> [  292.087372]  [<ffffffff810b693e>] ? kthread_create_on_node+0x23e/0x23e
> [  292.087372]  [<ffffffff8100d322>] ? sched_clock+0x9/0xd
> [  292.087372]  [<ffffffff810c7aa7>] ? sched_clock_local.constprop.2+0x35/0xc8
> [  292.087372]  [<ffffffff810360bf>] ? pvclock_clocksource_read+0x9b/0x140
> [  292.087372]  [<ffffffff810b693e>] ? kthread_create_on_node+0x23e/0x23e
> [  292.087372]  [<ffffffff8108b61c>] do_fork+0x105/0x4db
> [  292.087372]  [<ffffffff810c7d54>] ? sched_clock_cpu+0xc9/0xdb
> [  292.087372]  [<ffffffff810c7dd7>] ? local_clock+0x2b/0x34
> [  292.087372]  [<ffffffff810ce52b>] ? lock_release_holdtime+0xcf/0xdb
> [  292.087372]  [<ffffffff810b768d>] ? kthreadd+0x1b3/0x24a
> [  292.087372]  [<ffffffff8108ba18>] kernel_thread+0x26/0x28
> [  292.087372]  [<ffffffff810b76a1>] kthreadd+0x1c7/0x24a
> [  292.087372]  [<ffffffff81afc50a>] ? ret_from_fork+0x7a/0xb0
> [  292.087372]  [<ffffffff810b74da>] ? kthread_create_on_cpu+0x7a/0x7a
> [  292.087372]  [<ffffffff81afc50a>] ret_from_fork+0x7a/0xb0
> [  292.087372]  [<ffffffff810b74da>] ? kthread_create_on_cpu+0x7a/0x7a
> [  292.087372] Code: 1c 92 8e 00 48 8b 45 e0 59 5b 41 5c 41 5d 5d c3 0f 1f 44 00 00 55 48 89 e5 41 54 53 48 89 fb 48 c7 c7 90 d4 47 82 e8 b1 8f 8e 00 <48> 83 7b 38 00 49 89 c4 74 06 48 8b 73 40 eb 07 48 c7 c6 48 fe 
> [  292.087372] RIP  [<ffffffff812119de>] pr_cont_kernfs_name+0x1b/0x6c
> [  292.087372]  RSP <ffff88000014bb20>
> [  292.087372] CR2: 0000000000000038
> [  292.087372] ---[ end trace df25444498a82119 ]---
> [  292.087372] ---[ end trace df25444498a82119 ]---
> 
> git bisect start 483036322b45d800fd68cb028874b6cd4fee3dba 38dbfb59d1175ef458d006556061adeaa8751b72 --
> git bisect  bad 50a557bc0d03997d9203d749b0304c54d2c6e5f5  # 02:20      0-      5  Merge 'cgroup/review-simplify' into devel-hourly-2014020601
> git bisect  bad c04e036e53a39cf2986ddaaf5607ff011f9ea55b  # 03:14      0-      1  Merge 'pinctrl/for-next' into devel-hourly-2014020601
> git bisect good 08ce20bd7ef9a08b74cdfe6fe454374be72b5825  # 03:58     25+      0  Merge 'pci/pci/msi' into devel-hourly-2014020601
> git bisect good 4f86564c657669f69c9cc03151ef6f23ae9e2015  # 04:32     25+      0  Merge 'm68knommu/cf' into devel-hourly-2014020601
> git bisect  bad 29ae23c20b538664beaea72bb0721ce2538b4ca9  # 04:50      0-     11  Merge 'm68knommu/cfmmu' into devel-hourly-2014020601
> git bisect  bad bbce71dbd03db3cf6df7faba0132d87a1a055827  # 04:58      0-      3  Merge 'nfs/devel' into devel-hourly-2014020601
> git bisect good 88a78a912ee059467ae6db7429a6efe4654620a5  # 06:16     25+      1  Merge branch 'acl_fixes' into linux-next
> git bisect good 12b13835a0a8bfabea68741e1ab4d4a4cb77d037  # 06:36     25+      0  kbuild: don't enable DEBUG_INFO when building for COMPILE_TEST
> git bisect  bad 1f35d872a0b9dd05de8f0e50fc2957bf74ea30f7  # 06:55      0-      9  NFS: Shrink nfs_inode by sharing storage for cookieverf and commit_info
> git bisect  bad 878a876b2e10888afe53766dcca33f723ae20edc  # 07:16      0-      4  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
> git bisect good d7512f79fd6cb8e2d9b78770289df6391a867ca1  # 07:20     25+      0  Merge tag 'nfs-for-3.14-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
> git bisect  bad 9343224bfd4be6a02e6ae0c0d66426c955c7d76e  # 07:34      0-     12  Merge branch 'akpm' (patches from Andrew Morton)
> git bisect  bad 0cc2aa51be9d2f2b001c0e070b2e5cdde89b39f4  # 07:47      0-     10  Add linux-next specific files for 20140206
> 
> Thanks,
> Fengguang
> 


* Re: [btrfs] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
From: Fengguang Wu @ 2014-02-07 12:10 UTC
  To: David Rientjes
  Cc: Tejun Heo, Filipe David Borba Manana, Chris Mason, linux-btrfs,
	linux-kernel

On Fri, Feb 07, 2014 at 02:13:59AM -0800, David Rientjes wrote:
> On Fri, 7 Feb 2014, Fengguang Wu wrote:
> 
> > [    1.625020] BTRFS: selftest: Running btrfs_split_item tests
> > [    1.627004] BTRFS: selftest: Running find delalloc tests
> > [    2.289182] tsc: Refined TSC clocksource calibration: 2299.967 MHz
> > [  292.084537] kthreadd invoked oom-killer: gfp_mask=0x3000d0, order=1, oom_score_adj=0
> > [  292.086439] kthreadd cpuset=
> > [  292.087072] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
> > [  292.087372] IP: [<ffffffff812119de>] pr_cont_kernfs_name+0x1b/0x6c
> 
> This looks like a problem with the cpuset cgroup name, are you sure this 
> isn't related to the removal of cgroup->name?

It does not look related to the patch "cgroup: remove cgroup->name", because
that patch lives in the cgroup tree and is not contained in the output of "git log BAD_COMMIT".

Thanks,
Fengguang



* Re: [btrfs] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
From: Chris Mason @ 2014-02-07 15:10 UTC
  To: Fengguang Wu
  Cc: Tejun Heo, Filipe David Borba Manana, linux-btrfs, linux-kernel,
	David Rientjes

On Fri 07 Feb 2014 07:10:38 AM EST, Fengguang Wu wrote:
> On Fri, Feb 07, 2014 at 02:13:59AM -0800, David Rientjes wrote:
>> On Fri, 7 Feb 2014, Fengguang Wu wrote:
>>
>>> [    1.625020] BTRFS: selftest: Running btrfs_split_item tests
>>> [    1.627004] BTRFS: selftest: Running find delalloc tests
>>> [    2.289182] tsc: Refined TSC clocksource calibration: 2299.967 MHz
>>> [  292.084537] kthreadd invoked oom-killer: gfp_mask=0x3000d0, order=1, oom_score_adj=0
>>> [  292.086439] kthreadd cpuset=
>>> [  292.087072] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
>>> [  292.087372] IP: [<ffffffff812119de>] pr_cont_kernfs_name+0x1b/0x6c
>>
>> This looks like a problem with the cpuset cgroup name, are you sure this
>> isn't related to the removal of cgroup->name?
>
> It does not look related to the patch "cgroup: remove cgroup->name", because
> that patch lives in the cgroup tree and is not contained in the output of "git log BAD_COMMIT".

Still not sure exactly what is going on, but I can't trigger it here.
My first guess is that it is related to btrfs being built in statically:
some part of our init is happening at the wrong time, and the self tests
are swooping in and causing trouble.




* Re: [btrfs] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
From: Filipe David Manana @ 2014-02-07 15:22 UTC
  To: Chris Mason
  Cc: Fengguang Wu, Tejun Heo, linux-btrfs@vger.kernel.org,
	linux-kernel, David Rientjes

On Fri, Feb 7, 2014 at 3:10 PM, Chris Mason <clm@fb.com> wrote:
> On Fri 07 Feb 2014 07:10:38 AM EST, Fengguang Wu wrote:
>>
>> On Fri, Feb 07, 2014 at 02:13:59AM -0800, David Rientjes wrote:
>>>
>>> On Fri, 7 Feb 2014, Fengguang Wu wrote:
>>>
>>>> [    1.625020] BTRFS: selftest: Running btrfs_split_item tests
>>>> [    1.627004] BTRFS: selftest: Running find delalloc tests
>>>> [    2.289182] tsc: Refined TSC clocksource calibration: 2299.967 MHz
>>>> [  292.084537] kthreadd invoked oom-killer: gfp_mask=0x3000d0, order=1, oom_score_adj=0
>>>> [  292.086439] kthreadd cpuset=
>>>> [  292.087072] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
>>>> [  292.087372] IP: [<ffffffff812119de>] pr_cont_kernfs_name+0x1b/0x6c
>>>
>>>
>>> This looks like a problem with the cpuset cgroup name, are you sure this
>>> isn't related to the removal of cgroup->name?
>>
>>
>> It does not look related to the patch "cgroup: remove cgroup->name", because
>> that patch lives in the cgroup tree and is not contained in the output of
>> "git log BAD_COMMIT".
>
>
> Still not sure exactly what is going on, but I can't trigger it here.  My
> first guess is that it is related to btrfs being built in statically: some
> part of our init is happening at the wrong time, and the self tests are
> swooping in and causing trouble.

So far I couldn't reproduce it either, whether on a physical machine
or in a VM (qemu+kvm), with CONFIG_BTRFS_FS=y, CONFIG_CRYPTO_CRC32C=y
and CONFIG_CRYPTO_CRC32C_INTEL=y.
If you disable CONFIG_BTRFS_FS_RUN_SANITY_TESTS, does it still crash?

thanks

>
>



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


* Re: [btrfs] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
From: David Rientjes @ 2014-02-07 21:13 UTC
  To: Fengguang Wu
  Cc: Tejun Heo, Filipe David Borba Manana, Chris Mason, linux-btrfs,
	linux-kernel

On Fri, 7 Feb 2014, Fengguang Wu wrote:

> On Fri, Feb 07, 2014 at 02:13:59AM -0800, David Rientjes wrote:
> > On Fri, 7 Feb 2014, Fengguang Wu wrote:
> > 
> > > [    1.625020] BTRFS: selftest: Running btrfs_split_item tests
> > > [    1.627004] BTRFS: selftest: Running find delalloc tests
> > > [    2.289182] tsc: Refined TSC clocksource calibration: 2299.967 MHz
> > > [  292.084537] kthreadd invoked oom-killer: gfp_mask=0x3000d0, order=1, oom_score_adj=0
> > > [  292.086439] kthreadd cpuset=
> > > [  292.087072] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
> > > [  292.087372] IP: [<ffffffff812119de>] pr_cont_kernfs_name+0x1b/0x6c
> > 
> > This looks like a problem with the cpuset cgroup name, are you sure this 
> > isn't related to the removal of cgroup->name?
> 
> It does not look related to the patch "cgroup: remove cgroup->name", because
> that patch lives in the cgroup tree and is not contained in the output of "git log BAD_COMMIT".
> 

It's dying in pr_cont_kernfs_name(), which comes from some tree that has
"kernfs: implement kernfs_get_parent(), kernfs_name/path() and friends".
That patch is not in linux-next, and the function is obviously printing
the cpuset cgroup name.

It doesn't look like it has anything at all to do with btrfs, and I don't
see why they would care about this failure.
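
(Annotation for readers decoding the dump: RBX is 0 and CR2 is 0x38 in the register dump quoted earlier in the thread, and the faulting bytes `<48> 83 7b 38 00` decode to `cmpq $0x0,0x38(%rbx)`, i.e. a read of a field 0x38 bytes into a structure whose pointer is NULL. The userspace sketch below only illustrates that arithmetic. `struct demo_node` and `fault_address()` are hypothetical names, and the layout is an assumption chosen to put a field at 0x38; it is not the real struct kernfs_node.)

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in; NOT the real struct kernfs_node layout. */
struct demo_node {
    unsigned char pad[0x38];  /* padding so the next field lands at 0x38 */
    const void *parent;       /* illustrative field at offset 0x38 */
    const char *name;         /* illustrative field (0x40 on LP64) */
};

/*
 * Address the CPU touches when it reads a field that sits
 * `field_offset` bytes into the object at `base`.
 * With base == 0 (a NULL pointer) the result is the bare offset,
 * which is what the oops reports in CR2.
 */
static uintptr_t fault_address(uintptr_t base, size_t field_offset)
{
    return base + field_offset;
}
```

With a NULL base, `fault_address(0, offsetof(struct demo_node, parent))` evaluates to 0x38, matching the address in the BUG line.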


* Re: [btrfs] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
From: Fengguang Wu @ 2014-02-08 13:07 UTC
  To: Filipe David Manana
  Cc: Chris Mason, Tejun Heo, linux-btrfs@vger.kernel.org, linux-kernel,
	David Rientjes

> If you disable CONFIG_BTRFS_FS_RUN_SANITY_TESTS, does it still crash?

Good idea! I've queued test jobs for that config. However, I'm sorry that
I'll be offline for the next 2 days, so please expect some delays.

Thanks,
Fengguang


* Re: [btrfs] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
From: Tejun Heo @ 2014-02-08 20:10 UTC
  To: David Rientjes
  Cc: Fengguang Wu, Filipe David Borba Manana, Chris Mason, linux-btrfs,
	linux-kernel

Hello, David, Fengguang, Chris.

On Fri, Feb 07, 2014 at 01:13:06PM -0800, David Rientjes wrote:
> On Fri, 7 Feb 2014, Fengguang Wu wrote:
> 
> > On Fri, Feb 07, 2014 at 02:13:59AM -0800, David Rientjes wrote:
> > > On Fri, 7 Feb 2014, Fengguang Wu wrote:
> > > 
> > > > [    1.625020] BTRFS: selftest: Running btrfs_split_item tests
> > > > [    1.627004] BTRFS: selftest: Running find delalloc tests
> > > > [    2.289182] tsc: Refined TSC clocksource calibration: 2299.967 MHz
> > > > [  292.084537] kthreadd invoked oom-killer: gfp_mask=0x3000d0, order=1, oom_score_adj=0
> > > > [  292.086439] kthreadd cpuset=
> > > > [  292.087072] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
> > > > [  292.087372] IP: [<ffffffff812119de>] pr_cont_kernfs_name+0x1b/0x6c
> > > 
> > > This looks like a problem with the cpuset cgroup name, are you sure this 
> > > isn't related to the removal of cgroup->name?
> > 
> > > It does not look related to the patch "cgroup: remove cgroup->name", because
> > > that patch lives in the cgroup tree and is not contained in the output of "git log BAD_COMMIT".
> > 
> 
> It's dying in pr_cont_kernfs_name(), which comes from some tree that has
> "kernfs: implement kernfs_get_parent(), kernfs_name/path() and friends".
> That patch is not in linux-next, and the function is obviously printing
> the cpuset cgroup name.
> 
> It doesn't look like it has anything at all to do with btrfs, and I don't
> see why they would care about this failure.

Yeah, this is from a patch in the cgroup/review-post-kernfs-conversion
branch which updates cgroup to use pr_cont_kernfs_name().  I forgot
that cgrp->kn is NULL for the dummy_root's top cgroup, so it ends
up calling the kernfs functions with a NULL kn, hence the oops.  I
posted an updated patch and the git branch has been updated.

 http://lkml.kernel.org/g/20140208200640.GB10975@htj.dyndns.org
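
(Annotation: the failure mode described above, a name helper called on a cgroup whose kn is NULL, can be sketched in userspace as follows. This is not the updated patch referenced by the link; `struct demo_kn`, `struct demo_cgroup`, `demo_cgroup_name()` and the "/" fallback string are all hypothetical, chosen only to show the shape of a NULL guard.)

```c
#include <stddef.h>

/* Hypothetical stand-ins for a kernfs node and a cgroup. */
struct demo_kn {
    const char *name;
};

struct demo_cgroup {
    struct demo_kn *kn;  /* assumed NULL for a dummy root's top cgroup */
};

/*
 * Return a printable name without dereferencing a NULL kn.
 * The "/" fallback is an assumption made for this sketch.
 */
static const char *demo_cgroup_name(const struct demo_cgroup *cgrp)
{
    if (cgrp == NULL || cgrp->kn == NULL)
        return "/";
    return cgrp->kn->name;
}
```

Whether the guard belongs in the caller or inside the name helper itself is a design choice; the sketch puts it in one place so that every caller gets the same fallback.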

So, nothing to do with btrfs, and it looks like the test
apparatus is somehow mixing up branches?

Thanks!

-- 
tejun


* Re: [btrfs] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
From: Fengguang Wu @ 2014-02-10  9:12 UTC
  To: Filipe David Manana
  Cc: Chris Mason, Tejun Heo, linux-btrfs@vger.kernel.org, linux-kernel,
	David Rientjes

Hi Filipe,

> If you disable CONFIG_BTRFS_FS_RUN_SANITY_TESTS, does it still crash?

I tried disabling CONFIG_BTRFS_FS_RUN_SANITY_TESTS in the 3 reported
randconfigs and they all boot fine.

Thanks,
Fengguang


* Re: [btrfs] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
From: Fengguang Wu @ 2014-02-10  9:25 UTC
  To: Tejun Heo
  Cc: David Rientjes, Filipe David Borba Manana, Chris Mason,
	linux-btrfs, linux-kernel

On Sat, Feb 08, 2014 at 03:10:37PM -0500, Tejun Heo wrote:
> Hello, David, Fengguang, Chris.
> 
> On Fri, Feb 07, 2014 at 01:13:06PM -0800, David Rientjes wrote:
> > On Fri, 7 Feb 2014, Fengguang Wu wrote:
> > 
> > > On Fri, Feb 07, 2014 at 02:13:59AM -0800, David Rientjes wrote:
> > > > On Fri, 7 Feb 2014, Fengguang Wu wrote:
> > > > 
> > > > > [    1.625020] BTRFS: selftest: Running btrfs_split_item tests
> > > > > [    1.627004] BTRFS: selftest: Running find delalloc tests
> > > > > [    2.289182] tsc: Refined TSC clocksource calibration: 2299.967 MHz
> > > > > [  292.084537] kthreadd invoked oom-killer: gfp_mask=0x3000d0, order=1, oom_score_adj=0
> > > > > [  292.086439] kthreadd cpuset=
> > > > > [  292.087072] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
> > > > > [  292.087372] IP: [<ffffffff812119de>] pr_cont_kernfs_name+0x1b/0x6c
> > > > 
> > > > This looks like a problem with the cpuset cgroup name, are you sure this 
> > > > isn't related to the removal of cgroup->name?
> > > 
> > > > It does not look related to the patch "cgroup: remove cgroup->name", because
> > > > that patch lives in the cgroup tree and is not contained in the output of "git log BAD_COMMIT".

Sorry, I was wrong here. I found that the above dmesg is for commit
4830363, which is a merge HEAD that contains the cgroup code.

The dmesg for commit 878a876b2e1 ("Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs")
looks different: it hangs after the tsc line:

[    2.428110] Btrfs loaded, assert=on, integrity-checker=on
[    2.429469] BTRFS: selftest: Running btrfs free space cache tests
[    2.430874] BTRFS: selftest: Running extent only tests
[    2.432135] BTRFS: selftest: Running bitmap only tests
[    2.433359] BTRFS: selftest: Running bitmap and extent tests
[    2.434675] BTRFS: selftest: Free space cache tests finished
[    2.435959] BTRFS: selftest: Running extent buffer operation tests
[    2.437350] BTRFS: selftest: Running btrfs_split_item tests
[    2.438843] BTRFS: selftest: Running find delalloc tests
[    3.158351] tsc: Refined TSC clocksource calibration: 2666.596 MHz


> > It's dying in pr_cont_kernfs_name(), which comes from some tree that has
> > "kernfs: implement kernfs_get_parent(), kernfs_name/path() and friends".
> > That patch is not in linux-next, and the function is obviously printing
> > the cpuset cgroup name.
> > 
> > It doesn't look like it has anything at all to do with btrfs, and I don't
> > see why they would care about this failure.
> 
> Yeah, this is from a patch in the cgroup/review-post-kernfs-conversion
> branch which updates cgroup to use pr_cont_kernfs_name().  I forgot
> that cgrp->kn is NULL for the dummy_root's top cgroup, so it ends
> up calling the kernfs functions with a NULL kn, hence the oops.  I
> posted an updated patch and the git branch has been updated.
> 
>  http://lkml.kernel.org/g/20140208200640.GB10975@htj.dyndns.org
> 
> So, nothing to do with btrfs, and it looks like the test
> apparatus is somehow mixing up branches?

Yes - I may do random merges and boot test the resulting kernels.

Thanks,
Fengguang
