public inbox for linux-kernel@vger.kernel.org
* Upgraded from 3.4 to 3.5.1 kernel: machine does not boot
@ 2012-08-10 16:45 Justin Piszcz
  2012-08-10 17:53 ` Jesper Juhl
  0 siblings, 1 reply; 9+ messages in thread
From: Justin Piszcz @ 2012-08-10 16:45 UTC (permalink / raw)
  To: linux-kernel; +Cc: ap

Hello,

Motherboard: Supermicro X8DTH-6F
Distro: Debian Testing x86_64

From 3.4 -> 3.5.1 on x86_64 (make oldconfig plus a few minor changes), the
machine attempts to boot but hangs at the filesystem mounting part of the
boot process.

Picture of where it stops working (a little blurry but readable)
http://home.comcast.net/~jpiszcz/20120810/3.5-kernel-hangs.jpg

Kernel config 3.4 (working)
http://home.comcast.net/~jpiszcz/20120810/config-3.4.txt

Kernel config 3.5.1 (hangs)
http://home.comcast.net/~jpiszcz/20120810/config-3.5.1.txt

As you can see towards the end of the picture, the machine had been sitting
there for an hour: that is the spindown timeout I have set for the drives on
the 3ware card.

Any thoughts as to what is wrong here?

Diff between the two:

$ diff -u config-3.4.txt  config-3.5.1.txt  |grep '^+C'
+CONFIG_ARCH_SUPPORTS_UPROBES=y
+CONFIG_BUILDTIME_EXTABLE_SORT=y
+CONFIG_CLOCKSOURCE_WATCHDOG=y
+CONFIG_ARCH_CLOCKSOURCE_DATA=y
+CONFIG_GENERIC_TIME_VSYSCALL=y
+CONFIG_GENERIC_CLOCKEVENTS=y
+CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
+CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
+CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
+CONFIG_GENERIC_CMOS_UPDATE=y
+CONFIG_TICK_ONESHOT=y
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_RCU_FANOUT_LEAF=16
+CONFIG_GENERIC_SMP_IDLE_THREAD=y
+CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
+CONFIG_SECCOMP_FILTER=y
+CONFIG_CROSS_MEMORY_ATTACH=y
+CONFIG_X86_DEV_DMA_OPS=y
+CONFIG_NETFILTER_NETLINK=y
+CONFIG_NF_CT_NETLINK=y
+CONFIG_HAVE_BPF_JIT=y
+CONFIG_E1000E=y
+CONFIG_IXGBE_HWMON=y
+CONFIG_NET_VENDOR_I825XX=y
+CONFIG_HID=y
+CONFIG_HIDRAW=y
+CONFIG_HID_GENERIC=y
+CONFIG_USB_HID=y
+CONFIG_HID_PID=y
+CONFIG_USB_HIDDEV=y
+CONFIG_NEW_LEDS=y
+CONFIG_LEDS_CLASS=y
+CONFIG_NFS_V2=y
+CONFIG_PANIC_ON_OOPS_VALUE=0
+CONFIG_RCU_CPU_STALL_INFO=y
+CONFIG_CRYPTO_CRC32C=y
+CONFIG_GENERIC_STRNCPY_FROM_USER=y
+CONFIG_GENERIC_STRNLEN_USER=y
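
A note on the command above: grepping for '^+C' only shows options that were
added or newly set. A minimal sketch of a helper that also lists removed
options and "is not set" lines is given here; the function name config_delta
is made up for illustration and is not part of any kernel tooling.

```shell
# Hypothetical helper: list CONFIG_ options that differ between two kernel
# config files. Lines starting with "-" come from the old config, lines
# starting with "+" from the new one.
config_delta() {
    # diff exits non-zero when the files differ and grep exits non-zero
    # when nothing matches, so the pipeline status is deliberately ignored.
    diff -u "$1" "$2" | grep -E '^[+-](# )?CONFIG_' || true
}

# Usage (file names as in the links above):
#   config_delta config-3.4.txt config-3.5.1.txt
```

The "(# )?" part of the pattern is there so that lines such as
"# CONFIG_FOO is not set" show up as well.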

Justin.




* Re: Upgraded from 3.4 to 3.5.1 kernel: machine does not boot
  2012-08-10 16:45 Upgraded from 3.4 to 3.5.1 kernel: machine does not boot Justin Piszcz
@ 2012-08-10 17:53 ` Jesper Juhl
  2012-08-10 21:45   ` Justin Piszcz
  0 siblings, 1 reply; 9+ messages in thread
From: Jesper Juhl @ 2012-08-10 17:53 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-kernel, ap

On Fri, 10 Aug 2012, Justin Piszcz wrote:

> Hello,
> 
> Motherboard: Supermicro X8DTH-6F
> Distro: Debian Testing x86_64
> 
> From 3.4 -> 3.5.1 on x86_64 make oldconfig and a few minor changes and the
> machine attempts to boot but hangs at the filesystem mounting part of the
> boot process.
> 
> Picture of where it stops working (a little blurry but readable)
> http://home.comcast.net/~jpiszcz/20120810/3.5-kernel-hangs.jpg
> 
> Kernel config 3.4 (working)
> http://home.comcast.net/~jpiszcz/20120810/config-3.4.txt
> 
> Kernel config 3.5.1 (hangs)
> http://home.comcast.net/~jpiszcz/20120810/config-3.5.1.txt
> 
> As you see towards the end the machine has been sitting there for 1 hour as
> that's the timeout I have the drives spindown on the 3ware card.
> 
> Any thoughts as what is wrong here?
> 
Not really, but some (rather obvious) ideas on what to try:

- Does 3.5 work? (could be that whatever broke things for you was 
introduced in 3.5.1).

- Does the latest 3.4.8 work?

- Does 3.6-rc1 (or even the latest snapshot of Linus' tree, post-rc1) 
work?

- If no one comes up with a good idea as to the cause of your troubles, you 
could try bisecting between your last working kernel and 3.5.1 to try and 
narrow it down to one (or a few) commits that are causing your trouble.
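
The bisection idea in the last point can be sketched as follows. This is a
toy run in a throwaway repository, not a kernel bisect: the repository, the
"state" file and the commit numbering are all invented for the example, and
only the git commands themselves are real.

```shell
#!/bin/sh
# Toy demonstration of "git bisect run". Everything here (the temporary
# repository, the "state" file, the commit numbering) is made up.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Bisect Demo"
git config user.email "demo@example.com"

# Create eight commits; the sixth one introduces the "regression".
i=1
while [ "$i" -le 8 ]; do
    if [ "$i" -ge 6 ]; then echo bad > state; else echo good > state; fi
    echo "$i" > counter
    git add state counter
    git commit -q -m "commit $i"
    i=$((i + 1))
done

good=$(git rev-list --max-parents=0 HEAD)   # first commit, known good
git bisect start HEAD "$good" >/dev/null    # bad revision first, then good

# The test command must exit 0 on a good revision and non-zero on a bad one.
# Here "good" simply means that the state file says so.
git bisect run grep -q good state >/dev/null

# After the run, refs/bisect/bad points at the first bad commit.
first_bad=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $first_bad"
git bisect reset >/dev/null 2>&1
```

In a kernel tree the starting point would be something like
"git bisect start v3.5.1 v3.4", with each step built, booted and marked
"git bisect good" or "git bisect bad" by hand, since a boot hang is hard to
test from a script.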

-- 
Jesper Juhl <jj@chaosbits.net>       http://www.chaosbits.net/
Don't top-post http://www.catb.org/jargon/html/T/top-post.html
Plain text mails only, please.



* Re: Upgraded from 3.4 to 3.5.1 kernel: machine does not boot
  2012-08-10 17:53 ` Jesper Juhl
@ 2012-08-10 21:45   ` Justin Piszcz
  2012-08-10 23:07     ` Justin Piszcz
  0 siblings, 1 reply; 9+ messages in thread
From: Justin Piszcz @ 2012-08-10 21:45 UTC (permalink / raw)
  To: Jesper Juhl; +Cc: linux-kernel, ap

On Fri, Aug 10, 2012 at 1:53 PM, Jesper Juhl <jj@chaosbits.net> wrote:
> On Fri, 10 Aug 2012, Justin Piszcz wrote:
>
>> Hello,
>>
>> Motherboard: Supermicro X8DTH-6F
>> Distro: Debian Testing x86_64
>>
>> From 3.4 -> 3.5.1 on x86_64 make oldconfig and a few minor changes and the
>> machine attempts to boot but hangs at the filesystem mounting part of the
>> boot process.

Hi,

Found the root cause: the 3.5.1 kernel cannot mount my ext4 filesystem (60TB).

The 3.4 kernel works fine.

Commenting out the filesystem in /etc/fstab under 3.5.1 confirms this:
the machine then boots fine.

When I run mount for that filesystem by hand, it hangs. I ran alt+sysrq+t to
get additional output and have pasted it below (3.5.1 kernel):

[  160.373406] mount           R  running task        0  4361   4355 0x00000000
[  160.373407]  ffff8806266bdb68 0000000000000086 ffff8806266bdaa8 ffff8806266bdfd8
[  160.373410]  ffff8806266bdfd8 0000000000004000 ffff8806270b0600 ffff880626c73a10
[  160.373413]  0000000000011240 ffff880c260177c0 ffff880c260177c0 00000000ffffffff
[  160.373415] Call Trace:
[  160.373416]  [<ffffffff816bd009>] ? __schedule+0x299/0x770
[  160.373418]  [<ffffffff81053465>] __cond_resched+0x25/0x40
[  160.373420]  [<ffffffff816bd6ba>] _cond_resched+0x2a/0x40
[  160.373421]  [<ffffffff8115bc09>] ext4_calculate_overhead+0x239/0x3e0
[  160.373425]  [<ffffffff8115d859>] ext4_fill_super+0x1aa9/0x2930
[  160.373427]  [<ffffffff810c677f>] mount_bdev+0x19f/0x1e0
[  160.373429]  [<ffffffff8115bdb0>] ? ext4_calculate_overhead+0x3e0/0x3e0
[  160.373431]  [<ffffffff811579c0>] ext4_mount+0x10/0x20
[  160.373433]  [<ffffffff810c69eb>] mount_fs+0x1b/0xd0
[  160.373434]  [<ffffffff810df2df>] vfs_kern_mount+0x6f/0x110
[  160.373437]  [<ffffffff810df3ff>] do_kern_mount+0x4f/0x100
[  160.373439]  [<ffffffff810e091e>] do_mount+0x2fe/0x8a0
[  160.373440]  [<ffffffff81097053>] ? strndup_user+0x53/0x70
[  160.373442]  [<ffffffff810e0fd0>] sys_mount+0x90/0xe0
[  160.373443]  [<ffffffff816beba6>] system_call_fastpath+0x1a/0x1f
[  160.373446] jbd2/sda1-8     S ffff880c2675f800     0  4362      2 0x00000000
[  160.373448]  ffff880623ca9e50 0000000000000046 ffff880626c73a10 ffff880623ca9fd8
[  160.373450]  ffff880623ca9fd8 0000000000004000 ffff8806271b9850 ffff880626d08250
[  160.373453]  ffff880623ca9da0 ffff8806266bdbe0 ffff880c2675f8a0 ffff880c2675f888
[  160.373455] Call Trace:
[  160.373456]  [<ffffffff81055f9d>] ? default_wake_function+0xd/0x10
[  160.373458]  [<ffffffff8104acf1>] ? autoremove_wake_function+0x11/0x40
[  160.373460]  [<ffffffff810521f5>] ? __wake_up_common+0x55/0x90
[  160.373462]  [<ffffffff816bd504>] schedule+0x24/0x70
[  160.373463]  [<ffffffff81183f8e>] kjournald2+0x1ce/0x1e0
[  160.373465]  [<ffffffff8104ace0>] ? abort_exclusive_wait+0xb0/0xb0
[  160.373467]  [<ffffffff81183dc0>] ? commit_timeout+0x10/0x10
[  160.373469]  [<ffffffff8104a27e>] kthread+0x8e/0xa0
[  160.373471]  [<ffffffff816bfed4>] kernel_thread_helper+0x4/0x10
[  160.373472]  [<ffffffff8104a1f0>] ? kthread_flush_work_fn+0x10/0x10
[  160.373474]  [<ffffffff816bfed0>] ? gs_change+0xb/0xb
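
Reading the two traces above: the mount task is runnable inside
ext4_calculate_overhead (called from ext4_fill_super), while the jbd2 thread
is merely sleeping in kjournald2. To make such dumps quicker to read, here is
a small sketch that keeps only the frames not marked with "?". The helper
name reliable_frames is invented, and the pattern assumes the trace line
format shown above.

```shell
# Hypothetical helper: read a sysrq-t or oops call trace on stdin and print
# only the function names of reliable frames (those without a leading "?"),
# innermost first.
reliable_frames() {
    sed -nE 's#^.*\[<[0-9a-f]+>\] ([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+.*$#\1#p'
}

# Usage: dmesg | reliable_frames
```

Frames marked "?" are stale addresses found on the stack rather than part of
the verified call chain, which is why the sketch drops them.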

Justin.


* RE: Upgraded from 3.4 to 3.5.1 kernel: machine does not boot
  2012-08-10 21:45   ` Justin Piszcz
@ 2012-08-10 23:07     ` Justin Piszcz
  2012-08-11  4:14       ` Justin Piszcz
  0 siblings, 1 reply; 9+ messages in thread
From: Justin Piszcz @ 2012-08-10 23:07 UTC (permalink / raw)
  To: 'Jesper Juhl'; +Cc: linux-kernel, ap



-----Original Message-----
From: Justin Piszcz [mailto:jpiszcz@lucidpixels.com] 
Sent: Friday, August 10, 2012 5:46 PM
To: Jesper Juhl
Cc: linux-kernel@vger.kernel.org; ap@solarrain.com
Subject: Re: Upgraded from 3.4 to 3.5.1 kernel: machine does not boot

On Fri, Aug 10, 2012 at 1:53 PM, Jesper Juhl <jj@chaosbits.net> wrote:
> On Fri, 10 Aug 2012, Justin Piszcz wrote:
>
>> Hello,
>>
>> Motherboard: Supermicro X8DTH-6F
>> Distro: Debian Testing x86_64
>>
>> From 3.4 -> 3.5.1 on x86_64 make oldconfig and a few minor changes and the
>> machine attempts to boot but hangs at the filesystem mounting part of the
>> boot process.

Hi,

Found the root cause, the 3.5.1 kernel cannot mount my ext4 filesystem
(60TB).

The 3.4 kernel works fine.

This is proven by commenting out the filesystem in /etc/fstab with
3.5.1, and all is OK.

--

Hi again,

I tested with linux-3.6-rc1:

The same problem, here is what I get from the strace:

irectory)
4434  readlink("/dev", 0x7fff3b05c670, 4096) = -1 EINVAL (Invalid argument)
4434  readlink("/dev/sda1", 0x7fff3b05c670, 4096) = -1 EINVAL (Invalid argument)
4434  readlink("/r1", 0x7fff3b05c670, 4096) = -1 EINVAL (Invalid argument)
4434  getuid()                          = 0
4434  geteuid()                         = 0
4434  getgid()                          = 0
4434  getegid()                         = 0
4434  prctl(PR_GET_DUMPABLE)            = 1
4434  lstat("/etc/mtab", {st_mode=S_IFLNK|0777, st_size=12, ...}) = 0
4434  getuid()                          = 0
4434  geteuid()                         = 0
4434  getgid()                          = 0
4434  getegid()                         = 0
4434  prctl(PR_GET_DUMPABLE)            = 1
4434  stat("/run", {st_mode=S_IFDIR|0755, st_size=820, ...}) = 0
4434  lstat("/run/mount/utab", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
4434  open("/run/mount/utab", O_RDWR|O_CREAT, 0644) = 3
4434  close(3)                          = 0
4434  mount("/dev/sda1", "/r1", "ext4", MS_MGC_VAL|MS_NOATIME, NULL

--

(w/ 3.6-rc1) 

[   89.868843] mount           R  running task        0  4434   4433 0x00000009
[   89.868847]  ffff880c246b7b68 ffffffff816c9279 ffff880c246b7aa8 ffff880c246b7fd8
[   89.868851]  ffff880c246b7fd8 0000000000004000 ffff88062720cdb0 ffff880c246862d0
[   89.868855]  00000000000116c0 ffff880623a863c0 ffff880623a863c0 00000000ffffffff
[   89.868855] Call Trace:
[   89.868858]  [<ffffffff816c9279>] ? __schedule+0x299/0x770
[   89.868860]  [<ffffffff816c9279>] ? __schedule+0x299/0x770
[   89.868864]  [<ffffffff8114a729>] ? ext4_get_group_desc+0x49/0xb0
[   89.868868]  [<ffffffff81161d41>] ? ext4_calculate_overhead+0x131/0x3e0
[   89.868871]  [<ffffffff81163a3b>] ? ext4_fill_super+0x1a4b/0x28d0
[   89.868875]  [<ffffffff810cc301>] ? mount_bdev+0x1a1/0x1e0
[   89.868877]  [<ffffffff81161ff0>] ? ext4_calculate_overhead+0x3e0/0x3e0
[   89.868880]  [<ffffffff8115dd00>] ? ext4_mount+0x10/0x20
[   89.868882]  [<ffffffff810cc55b>] ? mount_fs+0x1b/0xd0
[   89.868885]  [<ffffffff810e57af>] ? vfs_kern_mount+0x6f/0x110
[   89.868888]  [<ffffffff810e58cf>] ? do_kern_mount+0x4f/0x100
[   89.868890]  [<ffffffff810e6dae>] ? do_mount+0x2fe/0x8a0
[   89.868894]  [<ffffffff8109c0a3>] ? strndup_user+0x53/0x70
[   89.868896]  [<ffffffff810e73e0>] ? sys_mount+0x90/0xe0
[   89.868899]  [<ffffffff816cafa1>] ? tracesys+0xd4/0xd9

Justin.





* Re: Upgraded from 3.4 to 3.5.1 kernel: machine does not boot
  2012-08-10 23:07     ` Justin Piszcz
@ 2012-08-11  4:14       ` Justin Piszcz
  2012-08-12 13:10         ` Eric Sandeen
  0 siblings, 1 reply; 9+ messages in thread
From: Justin Piszcz @ 2012-08-11  4:14 UTC (permalink / raw)
  To: Jesper Juhl; +Cc: linux-kernel, ap, linux-ext4

On Fri, Aug 10, 2012 at 7:07 PM, Justin Piszcz
>
> Hi,
>
> Found the root cause, the 3.5.1 kernel cannot mount my ext4 filesystem
> (60TB).
>
> The 3.4 kernel works fine.
>
> This is proven by commenting out the filesystem in /etc/fstab with
> 3.5.1, and all is OK.
>
> --
>
> Hi again,
>
> I tested with linux-3.6-rc1:
>
> The same problem, here is what I get from the strace:
>
> irectory)
> 4434  readlink("/dev", 0x7fff3b05c670, 4096) = -1 EINVAL (Invalid argument)
> 4434  readlink("/dev/sda1", 0x7fff3b05c670, 4096) = -1 EINVAL (Invalid
> argument)
> 4434  readlink("/r1", 0x7fff3b05c670, 4096) = -1 EINVAL (Invalid argument)
> 4434  getuid()                          = 0
> 4434  geteuid()                         = 0
> 4434  getgid()                          = 0
> 4434  getegid()                         = 0
> 4434  prctl(PR_GET_DUMPABLE)            = 1
> 4434  lstat("/etc/mtab", {st_mode=S_IFLNK|0777, st_size=12, ...}) = 0
> 4434  getuid()                          = 0
> 4434  geteuid()                         = 0
> 4434  getgid()                          = 0
> 4434  getegid()                         = 0
> 4434  prctl(PR_GET_DUMPABLE)            = 1
> 4434  stat("/run", {st_mode=S_IFDIR|0755, st_size=820, ...}) = 0
> 4434  lstat("/run/mount/utab", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> 4434  open("/run/mount/utab", O_RDWR|O_CREAT, 0644) = 3
> 4434  close(3)                          = 0
> 4434  mount("/dev/sda1", "/r1", "ext4", MS_MGC_VAL|MS_NOATIME, NULL
>
> --
>
> (w/ 3.6-rc1)
>
> [   89.868843] mount           R  running task        0  4434   4433
> 0x00000009
> [   89.868847]  ffff880c246b7b68 ffffffff816c9279 ffff880c246b7aa8
> ffff880c246b7fd8
> [   89.868851]  ffff880c246b7fd8 0000000000004000 ffff88062720cdb0
> ffff880c246862d0
> [   89.868855]  00000000000116c0 ffff880623a863c0 ffff880623a863c0
> 00000000ffffffff
> [   89.868855] Call Trace:
> [   89.868858]  [<ffffffff816c9279>] ? __schedule+0x299/0x770
> [   89.868860]  [<ffffffff816c9279>] ? __schedule+0x299/0x770
> [   89.868864]  [<ffffffff8114a729>] ? ext4_get_group_desc+0x49/0xb0
> [   89.868868]  [<ffffffff81161d41>] ? ext4_calculate_overhead+0x131/0x3e0
> [   89.868871]  [<ffffffff81163a3b>] ? ext4_fill_super+0x1a4b/0x28d0
> [   89.868875]  [<ffffffff810cc301>] ? mount_bdev+0x1a1/0x1e0
> [   89.868877]  [<ffffffff81161ff0>] ? ext4_calculate_overhead+0x3e0/0x3e0
> [   89.868880]  [<ffffffff8115dd00>] ? ext4_mount+0x10/0x20
> [   89.868882]  [<ffffffff810cc55b>] ? mount_fs+0x1b/0xd0
> [   89.868885]  [<ffffffff810e57af>] ? vfs_kern_mount+0x6f/0x110
> [   89.868888]  [<ffffffff810e58cf>] ? do_kern_mount+0x4f/0x100
> [   89.868890]  [<ffffffff810e6dae>] ? do_mount+0x2fe/0x8a0
> [   89.868894]  [<ffffffff8109c0a3>] ? strndup_user+0x53/0x70
> [   89.868896]  [<ffffffff810e73e0>] ? sys_mount+0x90/0xe0
> [   89.868899]  [<ffffffff816cafa1>] ? tracesys+0xd4/0xd9
>
> Justin.
>
>
>

CC: linux-ext4

Any ideas here? Kernel 3.4 and below can mount the 60TB ext4 filesystem with
no issues, but 3.5.1 and 3.6-rc1 (I did not try 3.5) cannot mount it.

Justin.


* Re: Upgraded from 3.4 to 3.5.1 kernel: machine does not boot
  2012-08-11  4:14       ` Justin Piszcz
@ 2012-08-12 13:10         ` Eric Sandeen
  2012-08-12 13:51           ` Justin Piszcz
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Sandeen @ 2012-08-12 13:10 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: Jesper Juhl, linux-kernel, ap, linux-ext4

On 8/10/12 11:14 PM, Justin Piszcz wrote:
> On Fri, Aug 10, 2012 at 7:07 PM, Justin Piszcz
>>
>> Hi,
>>
>> Found the root cause, the 3.5.1 kernel cannot mount my ext4 filesystem
>> (60TB).

You are a brave man running ext4 at 60T, but thank you for testing :)

Backing out 8aeb00ff85ad25453765dd339b408c0087db1527 from 3.5.1
(952fc18ef9ec707ebdc16c0786ec360295e5ff15 upstream) probably helps?

From a quick look, I think that essentially has a

for (i = 0; i < ngroups; i++) {

	for (j = 0; j < ngroups; j++) {

	}
}

type nested loop going on; for a filesystem this big it's going to take almost
literally forever, if I read it right.
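
To put a rough number on that nested loop, here is a back-of-the-envelope
calculation. The block size and blocks-per-group values are assumptions
(common mke2fs defaults), not figures reported for this filesystem, and the
size is rounded to 60 TiB. The shell needs 64-bit arithmetic.

```shell
# Back-of-the-envelope estimate; all three inputs are assumptions.
fs_bytes=$((60 * 1024 * 1024 * 1024 * 1024))   # roughly 60 TiB
block_size=4096                                # assumed 4 KiB blocks
blocks_per_group=32768                         # assumed 8 * block_size

ngroups=$((fs_bytes / (block_size * blocks_per_group)))
inner_steps=$((ngroups * ngroups))

echo "block groups:          $ngroups"
echo "inner loop iterations: $inner_steps"
```

Under these assumptions that is 491520 block groups and about 2.4e11 passes
through the inner loop body, which would fit a mount that sits in
ext4_calculate_overhead for an hour or more.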

-Eric

>> The 3.4 kernel works fine.
>>
>> This is proven by commenting out the filesystem in /etc/fstab with
>> 3.5.1, and all is OK.
>>
>> --
>>
>> Hi again,
>>
>> I tested with linux-3.6-rc1:
>>
>> The same problem, here is what I get from the strace:
>>
>> irectory)
>> 4434  readlink("/dev", 0x7fff3b05c670, 4096) = -1 EINVAL (Invalid argument)
>> 4434  readlink("/dev/sda1", 0x7fff3b05c670, 4096) = -1 EINVAL (Invalid
>> argument)
>> 4434  readlink("/r1", 0x7fff3b05c670, 4096) = -1 EINVAL (Invalid argument)
>> 4434  getuid()                          = 0
>> 4434  geteuid()                         = 0
>> 4434  getgid()                          = 0
>> 4434  getegid()                         = 0
>> 4434  prctl(PR_GET_DUMPABLE)            = 1
>> 4434  lstat("/etc/mtab", {st_mode=S_IFLNK|0777, st_size=12, ...}) = 0
>> 4434  getuid()                          = 0
>> 4434  geteuid()                         = 0
>> 4434  getgid()                          = 0
>> 4434  getegid()                         = 0
>> 4434  prctl(PR_GET_DUMPABLE)            = 1
>> 4434  stat("/run", {st_mode=S_IFDIR|0755, st_size=820, ...}) = 0
>> 4434  lstat("/run/mount/utab", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
>> 4434  open("/run/mount/utab", O_RDWR|O_CREAT, 0644) = 3
>> 4434  close(3)                          = 0
>> 4434  mount("/dev/sda1", "/r1", "ext4", MS_MGC_VAL|MS_NOATIME, NULL
>>
>> --
>>
>> (w/ 3.6-rc1)
>>
>> [   89.868843] mount           R  running task        0  4434   4433
>> 0x00000009
>> [   89.868847]  ffff880c246b7b68 ffffffff816c9279 ffff880c246b7aa8
>> ffff880c246b7fd8
>> [   89.868851]  ffff880c246b7fd8 0000000000004000 ffff88062720cdb0
>> ffff880c246862d0
>> [   89.868855]  00000000000116c0 ffff880623a863c0 ffff880623a863c0
>> 00000000ffffffff
>> [   89.868855] Call Trace:
>> [   89.868858]  [<ffffffff816c9279>] ? __schedule+0x299/0x770
>> [   89.868860]  [<ffffffff816c9279>] ? __schedule+0x299/0x770
>> [   89.868864]  [<ffffffff8114a729>] ? ext4_get_group_desc+0x49/0xb0
>> [   89.868868]  [<ffffffff81161d41>] ? ext4_calculate_overhead+0x131/0x3e0
>> [   89.868871]  [<ffffffff81163a3b>] ? ext4_fill_super+0x1a4b/0x28d0
>> [   89.868875]  [<ffffffff810cc301>] ? mount_bdev+0x1a1/0x1e0
>> [   89.868877]  [<ffffffff81161ff0>] ? ext4_calculate_overhead+0x3e0/0x3e0
>> [   89.868880]  [<ffffffff8115dd00>] ? ext4_mount+0x10/0x20
>> [   89.868882]  [<ffffffff810cc55b>] ? mount_fs+0x1b/0xd0
>> [   89.868885]  [<ffffffff810e57af>] ? vfs_kern_mount+0x6f/0x110
>> [   89.868888]  [<ffffffff810e58cf>] ? do_kern_mount+0x4f/0x100
>> [   89.868890]  [<ffffffff810e6dae>] ? do_mount+0x2fe/0x8a0
>> [   89.868894]  [<ffffffff8109c0a3>] ? strndup_user+0x53/0x70
>> [   89.868896]  [<ffffffff810e73e0>] ? sys_mount+0x90/0xe0
>> [   89.868899]  [<ffffffff816cafa1>] ? tracesys+0xd4/0xd9
>>
>> Justin.
>>
>>
>>
> 
> CC: linux-ext4
> 
> Any ideas here (kernel 3.4 and below can mount 60TB ext4 no issues)
> but > 3.5.1 (did not try 3.5) cannot mount the filesystem.
> 
> Justin.
> 
> Justin.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 



* Re: Upgraded from 3.4 to 3.5.1 kernel: machine does not boot
  2012-08-12 13:10         ` Eric Sandeen
@ 2012-08-12 13:51           ` Justin Piszcz
  2012-08-12 14:13             ` Paul Gortmaker
  0 siblings, 1 reply; 9+ messages in thread
From: Justin Piszcz @ 2012-08-12 13:51 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Jesper Juhl, linux-kernel, ap, linux-ext4, greg

On Sun, Aug 12, 2012 at 9:10 AM, Eric Sandeen <sandeen@sandeen.net> wrote:
> On 8/10/12 11:14 PM, Justin Piszcz wrote:
>> On Fri, Aug 10, 2012 at 7:07 PM, Justin Piszcz
>>>
>>> Hi,
>>>
>>> Found the root cause, the 3.5.1 kernel cannot mount my ext4 filesystem
>>> (60TB).
>
> You are a brave man running ext4 at 60T, but thank you for testing :)
>
> Backing out 8aeb00ff85ad25453765dd339b408c0087db1527 from 3.5.1
> (952fc18ef9ec707ebdc16c0786ec360295e5ff15 upstream) probably helps?
>
> From a quick look, I think that essentially has a :
>
> for (i = 0; i < ngroups; i++) {
>
>         for (j = 0; j < ngroups; j++) {
>
>         }
> }
>
> type nested loop going on; for a filesystem this big it's going to take almost
> literally forever, if I read it right.
>
> -Eric

Hello,

It worked!! I can mount my filesystem now!

I pulled down 3.5 and backed out that commit, I could not quickly find
a doc to do this, so I will add how to do that below:

1. Clone Linux repo (3.5/stable as of this writing)
git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git

2. List commits:
git log

3. Show a specific commit
git show 8aeb00ff85ad25453765dd339b408c0087db1527

4. How to revert the commit:
git revert 8aeb00ff85ad25453765dd339b408c0087db1527

# On branch master
nothing to commit (working directory clean)

5. Recompile, reboot, does it work?
# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        61T   17T   44T  28% /r1
# uname -a
Linux p34 3.5.0 #1 SMP Sun Aug 12 09:42:41 EDT 2012 x86_64 GNU/Linux

Yes!

CC: Greg to see if this can be backed out for 3.5.2?

Justin.


* Re: Upgraded from 3.4 to 3.5.1 kernel: machine does not boot
  2012-08-12 13:51           ` Justin Piszcz
@ 2012-08-12 14:13             ` Paul Gortmaker
  2012-08-12 14:36               ` Justin Piszcz
  0 siblings, 1 reply; 9+ messages in thread
From: Paul Gortmaker @ 2012-08-12 14:13 UTC (permalink / raw)
  To: Justin Piszcz
  Cc: Eric Sandeen, Jesper Juhl, linux-kernel, ap, linux-ext4, greg

On Sun, Aug 12, 2012 at 9:51 AM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
> On Sun, Aug 12, 2012 at 9:10 AM, Eric Sandeen <sandeen@sandeen.net> wrote:
>> On 8/10/12 11:14 PM, Justin Piszcz wrote:
>>> On Fri, Aug 10, 2012 at 7:07 PM, Justin Piszcz
>>>>
>>>> Hi,
>>>>
>>>> Found the root cause, the 3.5.1 kernel cannot mount my ext4 filesystem
>>>> (60TB).
>>
>> You are a brave man running ext4 at 60T, but thank you for testing :)
>>
>> Backing out 8aeb00ff85ad25453765dd339b408c0087db1527 from 3.5.1
>> (952fc18ef9ec707ebdc16c0786ec360295e5ff15 upstream) probably helps?
>>
>> From a quick look, I think that essentially has a :
>>
>> for (i = 0; i < ngroups; i++) {
>>
>>         for (j = 0; j < ngroups; j++) {
>>
>>         }
>> }
>>
>> type nested loop going on; for a filesystem this big it's going to take almost
>> literally forever, if I read it right.
>>
>> -Eric
>
> Hello,
>
> It worked!! I can mount my filesystem now!
>
> I pulled down 3.5 and backed out that commit, I could not quickly find
> a doc to do this, so I will add how to do that below:
>
> 1. Clone Linux repo (3.5/stable as of this writing)
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
>
> 2. List commits:
> git log
>
> 3. Show a specific commit
> git show 8aeb00ff85ad25453765dd339b408c0087db1527
>
> 4. How to revert the commit:
> git revert 8aeb00ff85ad25453765dd339b408c0087db1527
>
> # On branch master
> nothing to commit (working directory clean)

You didn't actually revert anything here, because your clone left
you on "master" branch, which points at 3.5 (i.e. 3.5.0).   It does
not contain the commit which is of interest to you.

-------------
linux-stable$git tag --contains 8aeb00ff
v3.5.1
linux-stable$git branch --contains 8aeb00ff
  linux-3.5.y
linux-stable$
------------

The master branch in linux-stable is left pointing at one of the
most recent mainline (i.e. non-stable) tags, and all of the stable
content is on individual branches (type "git branch" to see them).

So if you do a "git checkout linux-3.5.y"  and then do the revert,
you will actually be testing what you wanted to test.
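
One way to guard against this is to check, before reverting, that the commit
is actually part of the history of whatever is checked out. A minimal sketch
follows; the helper name commit_in_head is made up, while
"git merge-base --is-ancestor" is a real git command (it needs a reasonably
recent git).

```shell
# Hypothetical helper: succeed only if the given commit is reachable from
# the currently checked-out HEAD, i.e. only if reverting it would undo
# something that is really in this tree.
commit_in_head() {
    git merge-base --is-ancestor "$1" HEAD
}

# Usage inside a linux-stable clone (not run here):
#   git checkout linux-3.5.y
#   commit_in_head 8aeb00ff85ad25453765dd339b408c0087db1527 &&
#       git revert 8aeb00ff85ad25453765dd339b408c0087db1527
```

"git show" is no help for this check, because it will display any commit
present in the clone regardless of which branch it belongs to.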

Paul.
--

>
> 5. Recompile, reboot, does it work?
> # df -h
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda1        61T   17T   44T  28% /r1
> # uname -a
> Linux p34 3.5.0 #1 SMP Sun Aug 12 09:42:41 EDT 2012 x86_64 GNU/Linux
>
> Yes!
>
> CC: Greg to see if this can be backed out for 3.5.2?
>
> Justin.


* Re: Upgraded from 3.4 to 3.5.1 kernel: machine does not boot
  2012-08-12 14:13             ` Paul Gortmaker
@ 2012-08-12 14:36               ` Justin Piszcz
  0 siblings, 0 replies; 9+ messages in thread
From: Justin Piszcz @ 2012-08-12 14:36 UTC (permalink / raw)
  To: Paul Gortmaker
  Cc: Eric Sandeen, Jesper Juhl, linux-kernel, ap, linux-ext4, greg

On Sun, Aug 12, 2012 at 10:13 AM, Paul Gortmaker
<paul.gortmaker@windriver.com> wrote:
> On Sun, Aug 12, 2012 at 9:51 AM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
>> On Sun, Aug 12, 2012 at 9:10 AM, Eric Sandeen <sandeen@sandeen.net> wrote:
>>> On 8/10/12 11:14 PM, Justin Piszcz wrote:
>>>> On Fri, Aug 10, 2012 at 7:07 PM, Justin Piszcz
>>>>>
>>>>> Hi,
>>>>>
>>>>> Found the root cause, the 3.5.1 kernel cannot mount my ext4 filesystem
>>>>> (60TB).
>>>
>>> You are a brave man running ext4 at 60T, but thank you for testing :)
>>>
>>> Backing out 8aeb00ff85ad25453765dd339b408c0087db1527 from 3.5.1
>>> (952fc18ef9ec707ebdc16c0786ec360295e5ff15 upstream) probably helps?
>>>
>>> From a quick look, I think that essentially has a :
>>>
>>> for (i = 0; i < ngroups; i++) {
>>>
>>>         for (j = 0; j < ngroups; j++) {
>>>
>>>         }
>>> }
>>>
>>> type nested loop going on; for a filesystem this big it's going to take almost
>>> literally forever, if I read it right.
>>>
>>> -Eric
>>
>> Hello,
>>
>> It worked!! I can mount my filesystem now!
>>
>> I pulled down 3.5 and backed out that commit, I could not quickly find
>> a doc to do this, so I will add how to do that below:
>>
>> 1. Clone Linux repo (3.5/stable as of this writing)
>> git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
>>
>> 2. List commits:
>> git log
>>
>> 3. Show a specific commit
>> git show 8aeb00ff85ad25453765dd339b408c0087db1527
>>
>> 4. How to revert the commit:
>> git revert 8aeb00ff85ad25453765dd339b408c0087db1527
>>
>> # On branch master
>> nothing to commit (working directory clean)
>
> You didn't actually revert anything here, because your clone left
> you on "master" branch, which points at 3.5 (i.e. 3.5.0).   It does
> not contain the commit which is of interest to you.
>
> -------------
> linux-stable$git tag --contains 8aeb00ff
> v3.5.1
> linux-stable$git branch --contains 8aeb00ff
>   linux-3.5.y
> linux-stable$
> ------------
>
> The master branch in linux-stable is left pointing at one of the
> most recent mainline (i.e. non-stable) tags, and all of the stable
> content is on individual branches (type "git branch" to see them).
>
> So if you do a "git checkout linux-3.5.y"  and then do the revert,
> you will actually be testing what you wanted to test.
>
> Paul.
> --

Yikes. I saw the commit details (via git show), but that evidently works for
any commit in the repository, not just those on the current branch. I assumed
the commit was also in the 3.5 tree, but per your check it is not, so I've
made some changes to my notes, recompiled, rebooted... and success!!  Woohoo!

One other item: I've never used a git-updated kernel before; I usually
just patch -p1 the mainline tree or pull it down directly (but git seems
nicer now that I know how to do it). Does the '+' in the version below
signify that it is the 3.5.1 kernel plus the changes I made to it?

p34:~# df -h | grep /r1
/dev/sda1        61T   16T   45T  26% /r1
p34:~# uname -a
Linux p34 3.5.1+ #3 SMP Sun Aug 12 10:31:34 EDT 2012 x86_64 GNU/Linux
p34:~# uptime
 10:35:12 up 1 min,  1 user,  load average: 0.05, 0.03, 0.01
p34:~#

---


Updated notes:


1. Clone Linux repo (3.5/stable as of this writing) into linux-3.5
git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.5

2.0 Cd into linux-3.5
cd linux-3.5

2.1 Check available kernel versions:
git tag | tail -n 3
v3.5-rc6
v3.5-rc7
v3.5.1

2.2 Check out the 3.5.1 tag:
git checkout v3.5.1

Note: checking out 'v3.5.1'.
..
HEAD is now at cbd3c20... Linux 3.5.1

2.3 Confirm it is 3.5.1:
# head -n 3 Makefile
VERSION = 3
PATCHLEVEL = 5
SUBLEVEL = 1

2.4 List commits:
git log

3. Show a specific commit
git show 8aeb00ff85ad25453765dd339b408c0087db1527

4. How to revert the commit:
git revert 8aeb00ff85ad25453765dd339b408c0087db1527
(It brings up a text editor like svn/cvs, commit -> write/save/quit)

[detached HEAD 35d699f] Revert "ext4: fix overhead calculation used by
ext4_statfs()"
 Committer: root <root@lucidpixels.com>
Your name and email address were configured automatically based
on your username and hostname. Please check that they are accurate.
You can suppress this message by setting them explicitly:

    git config --global user.name "Your Name"
    git config --global user.email you@example.com

After doing this, you may fix the identity used for this commit with:

    git commit --amend --reset-author

 4 files changed, 57 insertions(+), 132 deletions(-)

5. Recompile, reboot, does it work, still?
# df -h | grep /r1
/dev/sda1        61T   16T   45T  26% /r1
# uname -a
Linux p34 3.5.1+ #3 SMP Sun Aug 12 10:31:34 EDT 2012 x86_64 GNU/Linux

Yes!


--

How to find where a particular commit lies?

-------------
linux-stable$git tag --contains 8aeb00ff
v3.5.1
linux-stable$git branch --contains 8aeb00ff
  linux-3.5.y
linux-stable$

--

Justin.

