From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mga11.intel.com ([192.55.52.93]:47975 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751428AbaBJJZP (ORCPT ); Mon, 10 Feb 2014 04:25:15 -0500
Date: Mon, 10 Feb 2014 17:25:00 +0800
From: Fengguang Wu
To: Tejun Heo
Cc: David Rientjes , Filipe David Borba Manana , Chris Mason , linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [btrfs] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
Message-ID: <20140210092500.GB7864@localhost>
References: <20140207023801.GC11051@localhost> <20140207121038.GA1690@localhost> <20140208201037.GC10975@htj.dyndns.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To: <20140208201037.GC10975@htj.dyndns.org>
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On Sat, Feb 08, 2014 at 03:10:37PM -0500, Tejun Heo wrote:
> Hello, David, Fengguang, Chris.
>
> On Fri, Feb 07, 2014 at 01:13:06PM -0800, David Rientjes wrote:
> > On Fri, 7 Feb 2014, Fengguang Wu wrote:
> >
> > > On Fri, Feb 07, 2014 at 02:13:59AM -0800, David Rientjes wrote:
> > > > On Fri, 7 Feb 2014, Fengguang Wu wrote:
> > > >
> > > > > [ 1.625020] BTRFS: selftest: Running btrfs_split_item tests
> > > > > [ 1.627004] BTRFS: selftest: Running find delalloc tests
> > > > > [ 2.289182] tsc: Refined TSC clocksource calibration: 2299.967 MHz
> > > > > [ 292.084537] kthreadd invoked oom-killer: gfp_mask=0x3000d0, order=1, oom_score_adj=0
> > > > > [ 292.086439] kthreadd cpuset=
> > > > > [ 292.087072] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
> > > > > [ 292.087372] IP: [] pr_cont_kernfs_name+0x1b/0x6c
> > > >
> > > > This looks like a problem with the cpuset cgroup name, are you sure this
> > > > isn't related to the removal of cgroup->name?
> > >
> > > It looks not related to patch "cgroup: remove cgroup->name", because
> > > that patch lies in the cgroup tree and not contained in output of "git log BAD_COMMIT".

Sorry, I was wrong here. I found that the above dmesg is for commit
4830363, which is a merge HEAD that contains the cgroup code.

The dmesg for commit 878a876b2e1 ("Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs")
looks different: it hangs after the tsc line:

[ 2.428110] Btrfs loaded, assert=on, integrity-checker=on
[ 2.429469] BTRFS: selftest: Running btrfs free space cache tests
[ 2.430874] BTRFS: selftest: Running extent only tests
[ 2.432135] BTRFS: selftest: Running bitmap only tests
[ 2.433359] BTRFS: selftest: Running bitmap and extent tests
[ 2.434675] BTRFS: selftest: Free space cache tests finished
[ 2.435959] BTRFS: selftest: Running extent buffer operation tests
[ 2.437350] BTRFS: selftest: Running btrfs_split_item tests
[ 2.438843] BTRFS: selftest: Running find delalloc tests
[ 3.158351] tsc: Refined TSC clocksource calibration: 2666.596 MHz

> > It's dying on pr_cont_kernfs_name which is some tree that has "kernfs:
> > implement kernfs_get_parent(), kernfs_name/path() and friends", which is
> > not in linux-next, and is obviously printing the cpuset cgroup name.
> >
> > It doesn't look like it has anything at all to do with btrfs or why they
> > would care about this failure.
>
> Yeah, this is from a patch in cgroup/review-post-kernfs-conversion
> branch which updates cgroup to use pr_cont_kernfs_name(). I forget
> that cgrp->kn is NULL for the dummy_root's top cgroup and thus it ends
> up calling the kernfs functions with NULL kn and thus the oops. I
> posted an updated patch and the git branch has been updated.
>
> http://lkml.kernel.org/g/20140208200640.GB10975@htj.dyndns.org
>
> So, nothing to do with btrfs and it looks like somehow the test
> apparatus is mixing up branches?

Yes, I may do random merges and boot test the resulting kernels.

Thanks,
Fengguang