public inbox for linux-xfs@vger.kernel.org
* 3.5+, xfs and 32bit armhf - xfs_buf_get: failed to map pages
@ 2013-05-17 10:45 Paolo Pisati
  2013-05-18  8:43 ` Jeff Liu
  2013-05-19  1:13 ` Dave Chinner
  0 siblings, 2 replies; 9+ messages in thread
From: Paolo Pisati @ 2013-05-17 10:45 UTC (permalink / raw)
  To: xfs

While exercising swift on a single-node 32-bit armhf system running a 3.5 kernel,
I got this when I hit ~25% of fs space usage:

dmesg:
...
[ 3037.399406] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
[ 3037.399442] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
[ 3037.399469] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
[ 3037.399485] XFS (sda5): xfs_buf_get: failed to map pages
[ 3037.399485]
[ 3037.399501] XFS (sda5): Internal error xfs_trans_cancel at line 1466 of file /build/buildd/linux-3.5.0/fs/xfs/xfs_trans.c. Caller 0xbf0235e0
[ 3037.399501]
[ 3037.413789] [<c00164cc>] (unwind_backtrace+0x0/0x104) from [<c04ed624>] (dump_stack+0x20/0x24)
[ 3037.413985] [<c04ed624>] (dump_stack+0x20/0x24) from [<bf01091c>] (xfs_error_report+0x60/0x6c [xfs])
[ 3037.414321] [<bf01091c>] (xfs_error_report+0x60/0x6c [xfs]) from [<bf0633f8>] (xfs_trans_cancel+0xfc/0x11c [xfs])
[ 3037.414654] [<bf0633f8>] (xfs_trans_cancel+0xfc/0x11c [xfs]) from [<bf0235e0>] (xfs_create+0x228/0x558 [xfs])
[ 3037.414953] [<bf0235e0>] (xfs_create+0x228/0x558 [xfs]) from [<bf01a7cc>] (xfs_vn_mknod+0x9c/0x180 [xfs])
[ 3037.415239] [<bf01a7cc>] (xfs_vn_mknod+0x9c/0x180 [xfs]) from [<bf01a8d0>] (xfs_vn_mkdir+0x20/0x24 [xfs])
[ 3037.415393] [<bf01a8d0>] (xfs_vn_mkdir+0x20/0x24 [xfs]) from [<c0135758>] (vfs_mkdir+0xc4/0x13c)
[ 3037.415410] [<c0135758>] (vfs_mkdir+0xc4/0x13c) from [<c013884c>] (sys_mkdirat+0xdc/0xe4)
[ 3037.415422] [<c013884c>] (sys_mkdirat+0xdc/0xe4) from [<c0138878>] (sys_mkdir+0x24/0x28)
[ 3037.415437] [<c0138878>] (sys_mkdir+0x24/0x28) from [<c000e320>] (ret_fast_syscall+0x0/0x30)
[ 3037.415452] XFS (sda5): xfs_do_force_shutdown(0x8) called from line 1467 of file /build/buildd/linux-3.5.0/fs/xfs/xfs_trans.c. Return address = 0xbf06340c
[ 3037.416892] XFS (sda5): Corruption of in-memory data detected. Shutting down filesystem
[ 3037.425008] XFS (sda5): Please umount the filesystem and rectify the problem(s)
[ 3047.912480] XFS (sda5): xfs_log_force: error 5 returned.

flag@c13:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       225G  2.1G  212G   1% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
udev            2.0G  4.0K  2.0G   1% /dev
tmpfs           405M  260K  404M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            2.0G     0  2.0G   0% /run/shm
none            100M     0  100M   0% /run/user
/dev/sda1       228M   30M  186M  14% /boot
/dev/sda5       2.0G  569M  1.5G  28% /mnt/sdb1

flag@c13:~$ df -i
Filesystem       Inodes  IUsed    IFree IUse% Mounted on
/dev/sda2      14958592  74462 14884130    1% /
none             182027      1   182026    1% /sys/fs/cgroup
udev             177378   1361   176017    1% /dev
tmpfs            182027    807   181220    1% /run
none             182027      3   182024    1% /run/lock
none             182027      1   182026    1% /run/shm
none             182027      1   182026    1% /run/user
/dev/sda1        124496     35   124461    1% /boot
/dev/sda5        524288 237184   287104   46% /mnt/sdb1

The vmalloc space is usually ~256M on this box, so I enlarged it:

flag@c13:~$ dmesg | grep vmalloc
Kernel command line: console=ttyAMA0 nosplash vmalloc=512M
    vmalloc : 0xdf800000 - 0xff000000   ( 504 MB)

And while I didn't hit the warning above, the storage node still died
after ~25% of usage with:

May 17 06:26:00 c13 container-server ERROR __call__ error with PUT /sdb1/123172/AUTH_test/3b3d078015304a41b76b0ab083b7863a_5 : [Errno 28] No space
left on device: '/srv/1/node/sdb1/containers/123172' (txn: tx8ea3ce392ee94df096b16-00519605b0)


flag@c13:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       225G  3.9G  210G   2% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
udev            2.0G  4.0K  2.0G   1% /dev
tmpfs           405M  260K  404M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            2.0G     0  2.0G   0% /run/shm
none            100M     0  100M   0% /run/user
/dev/sda1       228M   25M  192M  12% /boot
/dev/sda5       2.0G  564M  1.5G  28% /mnt/sdb1

flag@c13:~$ df -i
Filesystem       Inodes  IUsed    IFree IUse% Mounted on
/dev/sda2      14958592 124409 14834183    1% /
none             114542      1   114541    1% /sys/fs/cgroup
udev             103895   1361   102534    2% /dev
tmpfs            114542    806   113736    1% /run
none             114542      3   114539    1% /run/lock
none             114542      1   114541    1% /run/shm
none             114542      1   114541    1% /run/user
/dev/sda1        124496     33   124463    1% /boot
/dev/sda5        524288 234880   289408   45% /mnt/sdb1


Any idea what else I should tune to work around this? Or is it a known
problem involving 32-bit arches and XFS?
-- 
bye,
p.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: 3.5+, xfs and 32bit armhf - xfs_buf_get: failed to map pages
  2013-05-17 10:45 3.5+, xfs and 32bit armhf - xfs_buf_get: failed to map pages Paolo Pisati
@ 2013-05-18  8:43 ` Jeff Liu
  2013-05-19  1:13 ` Dave Chinner
  1 sibling, 0 replies; 9+ messages in thread
From: Jeff Liu @ 2013-05-18  8:43 UTC (permalink / raw)
  To: Paolo Pisati; +Cc: xfs

On 05/17/2013 06:45 PM, Paolo Pisati wrote:
> While exercising swift on a single node 32bit armhf system running a 3.5 kernel,
> i got this when i hit ~25% of fs space usage:
> 
> dmesg:
> ...
> [ 3037.399406] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
> [ 3037.399442] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
> [ 3037.399469] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
> [ 3037.399485] XFS (sda5): xfs_buf_get: failed to map pages
> [ 3037.399485]
> [ 3037.399501] XFS (sda5): Internal error xfs_trans_cancel at line 1466 of file /build/buildd/linux-3.5.0/fs/xfs/xfs_trans.c. Caller 0xbf0235e0
> [ 3037.399501]
> [ 3037.413789] [<c00164cc>] (unwind_backtrace+0x0/0x104) from [<c04ed624>] (dump_stack+0x20/0x24)
> [ 3037.413985] [<c04ed624>] (dump_stack+0x20/0x24) from [<bf01091c>] (xfs_error_report+0x60/0x6c [xfs])
> [ 3037.414321] [<bf01091c>] (xfs_error_report+0x60/0x6c [xfs]) from [<bf0633f8>] (xfs_trans_cancel+0xfc/0x11c [xfs])
> [ 3037.414654] [<bf0633f8>] (xfs_trans_cancel+0xfc/0x11c [xfs]) from [<bf0235e0>] (xfs_create+0x228/0x558 [xfs])
> [ 3037.414953] [<bf0235e0>] (xfs_create+0x228/0x558 [xfs]) from [<bf01a7cc>] (xfs_vn_mknod+0x9c/0x180 [xfs])
> [ 3037.415239] [<bf01a7cc>] (xfs_vn_mknod+0x9c/0x180 [xfs]) from [<bf01a8d0>] (xfs_vn_mkdir+0x20/0x24 [xfs])
> [ 3037.415393] [<bf01a8d0>] (xfs_vn_mkdir+0x20/0x24 [xfs]) from [<c0135758>] (vfs_mkdir+0xc4/0x13c)
> [ 3037.415410] [<c0135758>] (vfs_mkdir+0xc4/0x13c) from [<c013884c>] (sys_mkdirat+0xdc/0xe4)
> [ 3037.415422] [<c013884c>] (sys_mkdirat+0xdc/0xe4) from [<c0138878>] (sys_mkdir+0x24/0x28)
> [ 3037.415437] [<c0138878>] (sys_mkdir+0x24/0x28) from [<c000e320>] (ret_fast_syscall+0x0/0x30)
> [ 3037.415452] XFS (sda5): xfs_do_force_shutdown(0x8) called from line 1467 of file /build/buildd/linux-3.5.0/fs/xfs/xfs_trans.c. Return address = 0xbf06340c
> [ 3037.416892] XFS (sda5): Corruption of in-memory data detected. Shutting down filesystem
> [ 3037.425008] XFS (sda5): Please umount the filesystem and rectify the problem(s)
> [ 3047.912480] XFS (sda5): xfs_log_force: error 5 returned.
> 
> flag@c13:~$ df -h
> Filesystem Size Used Avail Use% Mounted on
> /dev/sda2 225G 2.1G 212G 1% /
> none 4.0K 0 4.0K 0% /sys/fs/cgroup
> udev 2.0G 4.0K 2.0G 1% /dev
> tmpfs 405M 260K 404M 1% /run
> none 5.0M 0 5.0M 0% /run/lock
> none 2.0G 0 2.0G 0% /run/shm
> none 100M 0 100M 0% /run/user
> /dev/sda1 228M 30M 186M 14% /boot
> /dev/sda5 2.0G 569M 1.5G 28% /mnt/sdb1
> 
> flag@c13:~$ df -i
> Filesystem Inodes IUsed IFree IUse% Mounted on
> /dev/sda2 14958592 74462 14884130 1% /
> none 182027 1 182026 1% /sys/fs/cgroup
> udev 177378 1361 176017 1% /dev
> tmpfs 182027 807 181220 1% /run
> none 182027 3 182024 1% /run/lock
> none 182027 1 182026 1% /run/shm
> none 182027 1 182026 1% /run/user
> /dev/sda1 124496 35 124461 1% /boot
> /dev/sda5 524288 237184 287104 46% /mnt/sdb1
> 
> the vmalloc space is ~256M usually on this box, so i enlarged it:
> 
> flag@c13:~$ dmesg | grep vmalloc                                                                                                                                                          
> Kernel command line: console=ttyAMA0 nosplash vmalloc=512M                                                                                                                                
>     vmalloc : 0xdf800000 - 0xff000000   ( 504 MB)
> 
> and while i didn't hit the warning above, still after ~25% of usage, the storage
> node died with:
> 
> May 17 06:26:00 c13 container-server ERROR __call__ error with PUT /sdb1/123172/AUTH_test/3b3d078015304a41b76b0ab083b7863a_5 : [Errno 28] No space
> left on device: '/srv/1/node/sdb1/containers/123172' (txn: tx8ea3ce392ee94df096b16-00519605b0)
> 
> 
> flag@c13:~$ df -h
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda2       225G  3.9G  210G   2% /
> none            4.0K     0  4.0K   0% /sys/fs/cgroup
> udev            2.0G  4.0K  2.0G   1% /dev
> tmpfs           405M  260K  404M   1% /run
> none            5.0M     0  5.0M   0% /run/lock
> none            2.0G     0  2.0G   0% /run/shm
> none            100M     0  100M   0% /run/user
> /dev/sda1       228M   25M  192M  12% /boot
> /dev/sda5       2.0G  564M  1.5G  28% /mnt/sdb1
> 
> flag@c13:~$ df -i
> Filesystem       Inodes  IUsed    IFree IUse% Mounted on
> /dev/sda2      14958592 124409 14834183    1% /
> none             114542      1   114541    1% /sys/fs/cgroup
> udev             103895   1361   102534    2% /dev
> tmpfs            114542    806   113736    1% /run
> none             114542      3   114539    1% /run/lock
> none             114542      1   114541    1% /run/shm
> none             114542      1   114541    1% /run/user
> /dev/sda1        124496     33   124463    1% /boot
> /dev/sda5        524288 234880   289408   45% /mnt/sdb1
> 
> 
> any idea what else shall i tune to workaround this? or is it a know problem that
> involves 32bit arch and xfs?

I tried to reproduce this issue against the latest upstream tree on a
32-bit system, but no luck.

Could you please supply the following info:

1) xfs_db -r "-c freesp -s" /dev/sda5
2) xfs_info /mnt/sdb1

Thanks,
-Jeff


* Re: 3.5+, xfs and 32bit armhf - xfs_buf_get: failed to map pages
  2013-05-17 10:45 3.5+, xfs and 32bit armhf - xfs_buf_get: failed to map pages Paolo Pisati
  2013-05-18  8:43 ` Jeff Liu
@ 2013-05-19  1:13 ` Dave Chinner
  2013-05-20 17:07   ` Paolo Pisati
  1 sibling, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2013-05-19  1:13 UTC (permalink / raw)
  To: Paolo Pisati; +Cc: xfs

On Fri, May 17, 2013 at 12:45:29PM +0200, Paolo Pisati wrote:
> While exercising swift on a single node 32bit armhf system running a 3.5 kernel,
> i got this when i hit ~25% of fs space usage:
> 
> dmesg:
> ...
> [ 3037.399406] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
> [ 3037.399442] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
> [ 3037.399469] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
> [ 3037.399485] XFS (sda5): xfs_buf_get: failed to map pages
> [ 3037.399485]
> [ 3037.399501] XFS (sda5): Internal error xfs_trans_cancel at line 1466 of file /build/buildd/linux-3.5.0/fs/xfs/xfs_trans.c. Caller 0xbf0235e0
> [ 3037.399501]
> [ 3037.413789] [<c00164cc>] (unwind_backtrace+0x0/0x104) from [<c04ed624>] (dump_stack+0x20/0x24)
> [ 3037.413985] [<c04ed624>] (dump_stack+0x20/0x24) from [<bf01091c>] (xfs_error_report+0x60/0x6c [xfs])
> [ 3037.414321] [<bf01091c>] (xfs_error_report+0x60/0x6c [xfs]) from [<bf0633f8>] (xfs_trans_cancel+0xfc/0x11c [xfs])
> [ 3037.414654] [<bf0633f8>] (xfs_trans_cancel+0xfc/0x11c [xfs]) from [<bf0235e0>] (xfs_create+0x228/0x558 [xfs])
> [ 3037.414953] [<bf0235e0>] (xfs_create+0x228/0x558 [xfs]) from [<bf01a7cc>] (xfs_vn_mknod+0x9c/0x180 [xfs])
> [ 3037.415239] [<bf01a7cc>] (xfs_vn_mknod+0x9c/0x180 [xfs]) from [<bf01a8d0>] (xfs_vn_mkdir+0x20/0x24 [xfs])
> [ 3037.415393] [<bf01a8d0>] (xfs_vn_mkdir+0x20/0x24 [xfs]) from [<c0135758>] (vfs_mkdir+0xc4/0x13c)
> [ 3037.415410] [<c0135758>] (vfs_mkdir+0xc4/0x13c) from [<c013884c>] (sys_mkdirat+0xdc/0xe4)
> [ 3037.415422] [<c013884c>] (sys_mkdirat+0xdc/0xe4) from [<c0138878>] (sys_mkdir+0x24/0x28)
> [ 3037.415437] [<c0138878>] (sys_mkdir+0x24/0x28) from [<c000e320>] (ret_fast_syscall+0x0/0x30)
> [ 3037.415452] XFS (sda5): xfs_do_force_shutdown(0x8) called from line 1467 of file /build/buildd/linux-3.5.0/fs/xfs/xfs_trans.c. Return address = 0xbf06340c
> [ 3037.416892] XFS (sda5): Corruption of in-memory data detected. Shutting down filesystem
> [ 3037.425008] XFS (sda5): Please umount the filesystem and rectify the problem(s)
> [ 3047.912480] XFS (sda5): xfs_log_force: error 5 returned.

Hi Paolo,

You've already contacted me off list about this and pointed me to
this:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1176977

which contains information that everyone looking at the problem
should know. Also, any progress on testing the backported fix
mentioned in the bug?

> and while i didn't hit the warning above, still after ~25% of
> usage, the storage node died with:
>
> May 17 06:26:00 c13 container-server ERROR __call__ error with PUT /sdb1/123172/AUTH_test/3b3d078015304a41b76b0ab083b7863a_5 : [Errno 28] No space
> left on device: '/srv/1/node/sdb1/containers/123172' (txn: tx8ea3ce392ee94df096b16-00519605b0)

You're running a swift benchmark, which is probably a small-file
workload with large attributes attached.  There's a good chance that
the workload is fragmenting free space because swift is doing bad
things to allocation patterns.  It's almost certainly exacerbated by
the tiny filesystem you are using (1.5GB), but you can probably work
around this problem for now with allocsize=4096.
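
A minimal sketch of that workaround (assuming the device and mount
point from your report; allocsize=4096 caps speculative preallocation
at a single 4k block):

```shell
# Remount the test filesystem with a fixed 4k allocation size.
# Run as root; device and mount point are the ones from this report.
umount /mnt/sdb1
mount -o noatime,allocsize=4096 /dev/sda5 /mnt/sdb1
grep sdb1 /proc/mounts   # the option list should now include allocsize
```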

I've got a fix that I'm testing for the underlying cause of the
problem I'm aware of with this workload, but I'll need more
information about your storage/filesystem config to confirm it is
the same root cause first. Can you include the info from here:

http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

As well as the freespace info that Jeff asked for?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: 3.5+, xfs and 32bit armhf - xfs_buf_get: failed to map pages
  2013-05-19  1:13 ` Dave Chinner
@ 2013-05-20 17:07   ` Paolo Pisati
  2013-05-21  0:02     ` Dave Chinner
  0 siblings, 1 reply; 9+ messages in thread
From: Paolo Pisati @ 2013-05-20 17:07 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, Paolo Pisati

On Sun, May 19, 2013 at 11:13:54AM +1000, Dave Chinner wrote:
> 
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1176977
> 
> which contains information that everyone looking at the problem
> should know. Also, any progress on testing the backported fix
> mentioned in the bug?

The problem with the 'fix' is that it prevents XFS from erroring out, but
swift-test fails regardless after ~25% of fs usage, and I think having a
bold 'xfs error' and a stack trace is more useful.
 
> You're testing swift benchmark which is probably a small file
> workload with large attributes attached.  It's a good chance that
> the workload is fragmenting free space because swift is doing bad
> things to allocation patterns.  It's almost certainly exacerbated by
> the tiny filesystem you are using (1.5GB), but you can probably work
> around this problem for now with allocsize=4096.

OK, I repartitioned my disk but I can still reproduce it fairly easily:

df -h:
/dev/sda6       216G  573M  215G   1% /mnt/sdb1

df -i:
/dev/sda6      56451072 235458 56215614    1% /mnt/sdb1

dmesg:
...
[  363.130877] XFS (sda6): Mounting Filesystem
[  363.146708] XFS (sda6): Ending clean mount
[ 3055.520769] alloc_vmap_area: 18 callbacks suppressed
[ 3055.520783] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
[ 3055.520817] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
[ 3055.520845] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
[ 3055.520861] XFS (sda6): xfs_buf_get: failed to map pages
[ 3055.520861]
[ 3055.520882] XFS (sda6): Internal error xfs_trans_cancel at line 1466 of file /build/buildd/linux-3.5.0/fs/xfs/xfs_trans.c.  Caller 0xbf0235e0
[ 3055.520882]
[ 3055.535135] [<c00164cc>] (unwind_backtrace+0x0/0x104) from [<c04ed624>] (dump_stack+0x20/0x24)
[ 3055.535345] [<c04ed624>] (dump_stack+0x20/0x24) from [<bf01091c>] (xfs_error_report+0x60/0x6c [xfs])
[ 3055.535687] [<bf01091c>] (xfs_error_report+0x60/0x6c [xfs]) from [<bf0633f8>] (xfs_trans_cancel+0xfc/0x11c [xfs])
[ 3055.536023] [<bf0633f8>] (xfs_trans_cancel+0xfc/0x11c [xfs]) from [<bf0235e0>] (xfs_create+0x228/0x558 [xfs])
[ 3055.536314] [<bf0235e0>] (xfs_create+0x228/0x558 [xfs]) from [<bf01a7cc>] (xfs_vn_mknod+0x9c/0x180 [xfs])
[ 3055.536589] [<bf01a7cc>] (xfs_vn_mknod+0x9c/0x180 [xfs]) from [<bf01a8f0>] (xfs_vn_create+0x1c/0x20 [xfs])
[ 3055.536741] [<bf01a8f0>] (xfs_vn_create+0x1c/0x20 [xfs]) from [<c01359d4>] (vfs_create+0xb4/0x120)
[ 3055.536760] [<c01359d4>] (vfs_create+0xb4/0x120) from [<c0137c3c>] (do_last+0x860/0x9bc)
[ 3055.536775] [<c0137c3c>] (do_last+0x860/0x9bc) from [<c0137fdc>] (path_openat+0xcc/0x428)
[ 3055.536787] [<c0137fdc>] (path_openat+0xcc/0x428) from [<c0138458>] (do_filp_open+0x3c/0x90)
[ 3055.536805] [<c0138458>] (do_filp_open+0x3c/0x90) from [<c0128248>] (do_sys_open+0xfc/0x1d0)
[ 3055.536817] [<c0128248>] (do_sys_open+0xfc/0x1d0) from [<c0128348>] (sys_open+0x2c/0x30)
[ 3055.536832] [<c0128348>] (sys_open+0x2c/0x30) from [<c000e320>] (ret_fast_syscall+0x0/0x30)
[ 3055.536848] XFS (sda6): xfs_do_force_shutdown(0x8) called from line 1467 of file /build/buildd/linux-3.5.0/fs/xfs/xfs_trans.c.  Return address = 0xbf06340c
[ 3055.537327] XFS (sda6): Corruption of in-memory data detected.  Shutting down filesystem
[ 3055.545439] XFS (sda6): Please umount the filesystem and rectify the problem(s)
[ 3070.301048] XFS (sda6): xfs_log_force: error 5 returned.
[ 3100.381068] XFS (sda6): xfs_log_force: error 5 returned.
[ 3130.461041] XFS (sda6): xfs_log_force: error 5 returned.
[ 3160.541042] XFS (sda6): xfs_log_force: error 5 returned.
[ 3190.621042] XFS (sda6): xfs_log_force: error 5 returned.
[ 3220.701040] XFS (sda6): xfs_log_force: error 5 returned.
[ 3250.781039] XFS (sda6): xfs_log_force: error 5 returned.
[ 3280.861036] XFS (sda6): xfs_log_force: error 5 returned.
[ 3310.941047] XFS (sda6): xfs_log_force: error 5 returned.
[ 3341.021044] XFS (sda6): xfs_log_force: error 5 returned.
[ 3371.101044] XFS (sda6): xfs_log_force: error 5 returned.
[ 3401.181036] XFS (sda6): xfs_log_force: error 5 returned.
[ 3431.261036] XFS (sda6): xfs_log_force: error 5 returned.
[ 3461.341038] XFS (sda6): xfs_log_force: error 5 returned.
[ 3491.421038] XFS (sda6): xfs_log_force: error 5 returned.
[ 3521.501051] XFS (sda6): xfs_log_force: error 5 returned.
[ 3551.581037] XFS (sda6): xfs_log_force: error 5 returned.
[ 3581.661041] XFS (sda6): xfs_log_force: error 5 returned.

> I've got a fix that I'm testing for the underlying cause of the
> problem I'm aware of with this workload, but I'll need more
> information about your storage/filesystem config to confirm it is
> the same root cause first. Can you include the info from here:
> 
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

flag@c13:~$ uname -a
Linux c13 3.5.0-30-highbank #51-Ubuntu SMP Tue May 14 22:57:15 UTC 2013 armv7l armv7l armv7l GNU/Linux

flag@c13:~$ xfs_repair -V
xfs_repair version 3.1.7

armhf highbank node, 4 cores, 4GB mem

flag@c13:~$ cat /proc/meminfo
MemTotal:        4137004 kB
MemFree:         2719752 kB
Buffers:           39688 kB
Cached:           580508 kB
SwapCached:            0 kB
Active:           631136 kB
Inactive:         204552 kB
Active(anon):     215520 kB
Inactive(anon):      232 kB
Active(file):     415616 kB
Inactive(file):   204320 kB
Unevictable:           0 kB
Mlocked:               0 kB
HighTotal:       3408896 kB
HighFree:        2606516 kB
LowTotal:         728108 kB
LowFree:          113236 kB
SwapTotal:       8378364 kB
SwapFree:        8378364 kB
Dirty:                 4 kB
Writeback:             0 kB
AnonPages:        215516 kB
Mapped:             8676 kB
Shmem:               264 kB
Slab:             317000 kB
SReclaimable:     230392 kB
SUnreclaim:        86608 kB
KernelStack:        2192 kB
PageTables:         2284 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    10446864 kB
Committed_AS:    1049624 kB
VmallocTotal:     245760 kB
VmallocUsed:        2360 kB
VmallocChunk:     241428 kB

flag@c13:~$ cat /proc/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=2059248k,nr_inodes=177400,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,relatime,size=827404k,mode=755 0 0
/dev/disk/by-uuid/6594b183-3198-4dec-a97d-a3f834b98011 / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
none /sys/fs/fuse/connections fusectl rw,relatime 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0
none /sys/kernel/security securityfs rw,relatime 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
none /run/user tmpfs rw,nosuid,nodev,noexec,relatime,size=102400k,mode=755 0 0
/dev/sda1 /boot ext2 rw,relatime,errors=continue 0 0
/dev/sda6 /mnt/sdb1 xfs rw,noatime,nodiratime,attr2,nobarrier,logbufs=8,noquota 0 0

flag@c13:~$ cat /proc/partitions
major minor  #blocks  name

   8        0  250059096 sda
   8        1     248832 sda1
   8        2   15625000 sda2
   8        3          1 sda3
   8        5    8378369 sda5
   8        6  225804288 sda6

no RAID, no LVM

common sata disk

write cache on (unknown size)

no BBWC AFAIK

flag@c13:~$ xfs_info /mnt/sdb1/
meta-data=/dev/sda6              isize=1024   agcount=4, agsize=14112768 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=56451072, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=27564, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

> 
> As well the freespace info that Jeff asked for?

flag@c13:~$ sudo xfs_db -r "-c freesp -s" /dev/sda6 
   from      to extents  blocks    pct
      1       1     423     423   0.00
      2       3     897    2615   0.01
      4       7     136     915   0.00
      8      15   24833  365797   0.86
8388608 14112768       3 41928421  99.13
total free extents 26292
total free blocks 42298171
average free extent size 1608.78
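
As a quick cross-check, the two summary totals can be recomputed from
the histogram rows above; a throwaway sketch, with the rows pasted
inline:

```shell
# Sum the "extents" and "blocks" columns of the freesp histogram
freesp='      1       1     423     423   0.00
      2       3     897    2615   0.01
      4       7     136     915   0.00
      8      15   24833  365797   0.86
8388608 14112768       3 41928421  99.13'
echo "$freesp" | awk '{ e += $3; b += $4 }
  END { printf "total free extents %d\ntotal free blocks %d\n", e, b }'
```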

-- 
bye,
p.


* Re: 3.5+, xfs and 32bit armhf - xfs_buf_get: failed to map pages
  2013-05-20 17:07   ` Paolo Pisati
@ 2013-05-21  0:02     ` Dave Chinner
  2013-05-23 14:34       ` Paolo Pisati
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2013-05-21  0:02 UTC (permalink / raw)
  To: Paolo Pisati; +Cc: xfs

On Mon, May 20, 2013 at 07:07:10PM +0200, Paolo Pisati wrote:
> On Sun, May 19, 2013 at 11:13:54AM +1000, Dave Chinner wrote:
> > 
> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1176977
> > 
> > which contains information that everyone looking at the problem
> > should know. Also, any progress on testing the backported fix
> > mentioned in the bug?
> 
> the problem with the 'fix' is that it prevents xfs from erroring out, but
> swift-test fails regardless after ~25% of fs usage and i think having a bold
> 'xfs error' and a stack trace is more useful.

I think your logic is misguided.  There's a major difference
between ENOSPC and a filesystem shutdown. After a shutdown you need
to unmount, remount, and then work out what didn't make it to disk
before you can restart.
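
Concretely, restarting after a shutdown looks something like this (a
sketch; device and mount point are the ones from this thread, and log
recovery happens automatically at mount time):

```shell
umount /mnt/sdb1            # detach the shut-down filesystem
mount /dev/sda5 /mnt/sdb1   # mount replays the journal
# ...then audit which recently written files actually made it to disk
```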

Not to mention that the ENOMEM that triggers the shutdown is highly
system dependent - it will occur at different times on different
machines and will be highly unpredictable. That's not a good thing.

Compare that to a plain ENOSPC error: you can just remove files and
keep going.

> > You're testing swift benchmark which is probably a small file
> > workload with large attributes attached.  It's a good chance that
> > the workload is fragmenting free space because swift is doing bad
> > things to allocation patterns.  It's almost certainly exacerbated by
> > the tiny filesystem you are using (1.5GB), but you can probably work
> > around this problem for now with allocsize=4096.
> 
> ok, i repartitioned my disk but i can still reprodue it fairly easily:
> 
> df -h:
> /dev/sda6       216G  573M  215G   1% /mnt/sdb1
> 
> df -i:
> /dev/sda6      56451072 235458 56215614    1% /mnt/sdb1
> 
> dmesg:
> ...
> [  363.130877] XFS (sda6): Mounting Filesystem
> [  363.146708] XFS (sda6): Ending clean mount
> [ 3055.520769] alloc_vmap_area: 18 callbacks suppressed
> [ 3055.520783] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
> [ 3055.520817] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
> [ 3055.520845] vmap allocation for size 2097152 failed: use vmalloc=<size> to increase size.
> [ 3055.520861] XFS (sda6): xfs_buf_get: failed to map pages

Which is your ENOMEM error, not an ENOSPC error. So the larger
filesystem meant you didn't hit the ENOSPC problem like I suspected
it would....

> > I've got a fix that I'm testing for the underlying cause of the
> > problem I'm aware of with this workload, but I'll need more
> > information about your storage/filesystem config to confirm it is
> > the same root cause first. Can you include the info from here:

And that fix I mentioned will be useless if you don't apply the
patch that avoids the vmap allocation problem....

> > http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
> 
> flag@c13:~$ uname -a
> Linux c13 3.5.0-30-highbank #51-Ubuntu SMP Tue May 14 22:57:15 UTC 2013 armv7l armv7l armv7l GNU/Linux
> 
> lag@c13:~$ xfs_repair -V
> xfs_repair version 3.1.7
> 
> armhf highbank node, 4 cores, 4GB mem
> 
> flag@c13:~$ cat /proc/meminfo
> MemTotal:        4137004 kB
> MemFree:         2719752 kB
> Buffers:           39688 kB
> Cached:           580508 kB
> SwapCached:            0 kB
> Active:           631136 kB
> Inactive:         204552 kB
> Active(anon):     215520 kB
> Inactive(anon):      232 kB
> Active(file):     415616 kB
> Inactive(file):   204320 kB
> Unevictable:           0 kB
> Mlocked:               0 kB
> HighTotal:       3408896 kB
> HighFree:        2606516 kB
> LowTotal:         728108 kB

Oh, there's a likely cause of the vmalloc issue.  You have 3.4GB of
high memory, which means the kernel only has 700MB of low memory for
slab caches, vmap regions, etc.

An ia32 box has, by default, 960MB of low memory, which will be why
you are seeing this more frequently than anyone using an ia32
machine. And an ia32 machine can be configured with 2G/2G or 3G/1G
kernel/user address space splits, so most vmalloc problems can
be worked around.

> Slab:             317000 kB
> SReclaimable:     230392 kB
> SUnreclaim:        86608 kB

And so you have 300MB in slab caches in low memory.

> KernelStack:        2192 kB
> PageTables:         2284 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:    10446864 kB
> Committed_AS:    1049624 kB
> VmallocTotal:     245760 kB
> VmallocUsed:        2360 kB
> VmallocChunk:     241428 kB

and 240MB reserved for vmalloc, so there's not much left of that
700MB of low memory. You really need that vmap fix, and you need
to configure your kernel with more low memory space.
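
A quick way to see those three numbers together (a throwaway sketch;
the kB figures below are the ones pasted earlier in this thread — on a
live box, feed /proc/meminfo itself instead):

```shell
# Summarize the low-memory picture from /proc/meminfo-style input
meminfo='LowTotal:         728108 kB
Slab:             317000 kB
VmallocTotal:     245760 kB'
echo "$meminfo" | awk '
  /LowTotal/     { low  = $2 }
  /^Slab/        { slab = $2 }
  /VmallocTotal/ { vm   = $2 }
  END { printf "lowmem %d MB, slab %d MB, vmalloc window %d MB\n",
        low/1024, slab/1024, vm/1024 }'
```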

> > As well the freespace info that Jeff asked for?
> 
> flag@c13:~$ sudo xfs_db -r "-c freesp -s" /dev/sda6 
>    from      to extents  blocks    pct
>       1       1     423     423   0.00
>       2       3     897    2615   0.01
>       4       7     136     915   0.00
>       8      15   24833  365797   0.86
> 8388608 14112768       3 41928421  99.13
> total free extents 26292
> total free blocks 42298171
> average free extent size 1608.78

We need this information after the ENOSPC error occurs, not soon
after mkfs or after the ENOMEM error. If this is after ENOSPC,
please unmount the filesystem, drop caches and rerun the freesp
command...
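
For reference, that sequence might look like (a sketch; run as root,
with the device from your last report):

```shell
umount /mnt/sdb1                      # push everything back to disk
echo 3 > /proc/sys/vm/drop_caches     # drop clean page and slab caches
xfs_db -r -c "freesp -s" /dev/sda6    # re-read the free space from disk
```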

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: 3.5+, xfs and 32bit armhf - xfs_buf_get: failed to map pages
  2013-05-21  0:02     ` Dave Chinner
@ 2013-05-23 14:34       ` Paolo Pisati
  2013-05-29 13:56         ` Paolo Pisati
  2013-05-30  0:38         ` Dave Chinner
  0 siblings, 2 replies; 9+ messages in thread
From: Paolo Pisati @ 2013-05-23 14:34 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, Paolo Pisati

On Tue, May 21, 2013 at 10:02:09AM +1000, Dave Chinner wrote:
> 
> And that fix I mentioned will be useless if you don't apply the
> patch that avoids the vmap allocation problem....


OK, so I recompiled a kernel with the aforementioned fix, repartitioned
my disk, and ran swift-bench for 2 days in a row until I got this:

dmesg:
...
[163596.605253] updatedb.mlocat: page allocation failure: order:0, mode:0x20
[163596.605299] [<c00164cc>] (unwind_backtrace+0x0/0x104) from [<c04edb20>] (dump_stack+0x20/0x24)
[163596.605320] [<c04edb20>] (dump_stack+0x20/0x24) from [<c00e7780>] (warn_alloc_failed+0xd8/0x118)
[163596.605335] [<c00e7780>] (warn_alloc_failed+0xd8/0x118) from [<c00e9b88>] (__alloc_pages_nodemask+0x524/0x708)
[163596.605354] [<c00e9b88>] (__alloc_pages_nodemask+0x524/0x708) from [<c011b798>] (new_slab+0x22c/0x248)
[163596.605370] [<c011b798>] (new_slab+0x22c/0x248) from [<c04f04f8>] (__slab_alloc.constprop.46+0x1a4/0x4c8)
[163596.605383] [<c04f04f8>] (__slab_alloc.constprop.46+0x1a4/0x4c8) from [<c011ced4>] (kmem_cache_alloc+0x158/0x190)
[163596.605402] [<c011ced4>] (kmem_cache_alloc+0x158/0x190) from [<c0332be0>] (scsi_pool_alloc_command+0x30/0x74)
[163596.605417] [<c0332be0>] (scsi_pool_alloc_command+0x30/0x74) from [<c0332c80>] (scsi_host_alloc_command+0x24/0x78)
[163596.605428] [<c0332c80>] (scsi_host_alloc_command+0x24/0x78) from [<c0332cf0>] (__scsi_get_command+0x1c/0xa0)
[163596.605439] [<c0332cf0>] (__scsi_get_command+0x1c/0xa0) from [<c0332db0>] (scsi_get_command+0x3c/0xb0)
[163596.605453] [<c0332db0>] (scsi_get_command+0x3c/0xb0) from [<c0338d44>] (scsi_get_cmd_from_req+0x50/0x60)
[163596.605466] [<c0338d44>] (scsi_get_cmd_from_req+0x50/0x60) from [<c0339fd8>] (scsi_setup_fs_cmnd+0x4c/0xac)
[163596.605482] [<c0339fd8>] (scsi_setup_fs_cmnd+0x4c/0xac) from [<c0343568>] (sd_prep_fn+0x114/0xaf4)
[163596.605501] [<c0343568>] (sd_prep_fn+0x114/0xaf4) from [<c0299af4>] (blk_peek_request+0xc8/0x214)
[163596.605514] [<c0299af4>] (blk_peek_request+0xc8/0x214) from [<c033a1b0>] (scsi_request_fn+0x40/0x504)
[163596.605524] [<c033a1b0>] (scsi_request_fn+0x40/0x504) from [<c029a38c>] (blk_queue_bio+0x300/0x384)
[163596.605536] [<c029a38c>] (blk_queue_bio+0x300/0x384) from [<c0298450>] (generic_make_request+0xb8/0xd8)
[163596.605548] [<c0298450>] (generic_make_request+0xb8/0xd8) from [<c0298534>] (submit_bio+0xc4/0x17c)
[163596.605756] [<c0298534>] (submit_bio+0xc4/0x17c) from [<bf00f1c4>] (_xfs_buf_ioapply+0x1bc/0x224 [xfs])
[163596.606002] [<bf00f1c4>] (_xfs_buf_ioapply+0x1bc/0x224 [xfs]) from [<bf00f314>] (xfs_buf_iorequest+0x4c/0x98 [xfs])
[163596.606241] [<bf00f314>] (xfs_buf_iorequest+0x4c/0x98 [xfs]) from [<bf00f868>] (_xfs_buf_read+0x34/0x50 [xfs])
[163596.606481] [<bf00f868>] (_xfs_buf_read+0x34/0x50 [xfs]) from [<bf00f964>] (xfs_buf_read+0xe0/0x108 [xfs])
[163596.606781] [<bf00f964>] (xfs_buf_read+0xe0/0x108 [xfs]) from [<bf06ba78>] (xfs_trans_read_buf+0x1e4/0x3e8 [xfs])
[163596.607115] [<bf06ba78>] (xfs_trans_read_buf+0x1e4/0x3e8 [xfs]) from [<bf053a9c>] (xfs_imap_to_bp+0x54/0x128 [xfs])
[163596.607432] [<bf053a9c>] (xfs_imap_to_bp+0x54/0x128 [xfs]) from [<bf057bc4>] (xfs_iread+0x6c/0x150 [xfs])
[163596.607719] [<bf057bc4>] (xfs_iread+0x6c/0x150 [xfs]) from [<bf015bfc>] (xfs_iget+0x210/0x72c [xfs])
[163596.607982] [<bf015bfc>] (xfs_iget+0x210/0x72c [xfs]) from [<bf0233b4>] (xfs_lookup+0xf4/0x114 [xfs])
[163596.608247] [<bf0233b4>] (xfs_lookup+0xf4/0x114 [xfs]) from [<bf01a5e8>] (xfs_vn_lookup+0x54/0x98 [xfs])
[163596.608387] [<bf01a5e8>] (xfs_vn_lookup+0x54/0x98 [xfs]) from [<c0134198>] (__lookup_hash+0x64/0xec)
[163596.608402] [<c0134198>] (__lookup_hash+0x64/0xec) from [<c04f0d68>] (lookup_slow+0x50/0xac)
[163596.608415] [<c04f0d68>] (lookup_slow+0x50/0xac) from [<c0136724>] (path_lookupat+0x730/0x794)
[163596.608428] [<c0136724>] (path_lookupat+0x730/0x794) from [<c01367b4>] (do_path_lookup+0x2c/0xd0)
[163596.608439] [<c01367b4>] (do_path_lookup+0x2c/0xd0) from [<c01385e0>] (user_path_at_empty+0x64/0x8c)
[163596.608451] [<c01385e0>] (user_path_at_empty+0x64/0x8c) from [<c013862c>] (user_path_at+0x24/0x2c)
[163596.608462] [<c013862c>] (user_path_at+0x24/0x2c) from [<c012dd3c>] (vfs_fstatat+0x40/0x78)
[163596.608473] [<c012dd3c>] (vfs_fstatat+0x40/0x78) from [<c012dd9c>] (vfs_lstat+0x28/0x2c)
[163596.608482] [<c012dd9c>] (vfs_lstat+0x28/0x2c) from [<c012e048>] (sys_lstat64+0x24/0x40)
[163596.608495] [<c012e048>] (sys_lstat64+0x24/0x40) from [<c000e320>] (ret_fast_syscall+0x0/0x30)
[163596.608503] Mem-info:
[163596.608509] Normal per-cpu:
[163596.608515] CPU    0: hi:  186, btch:  31 usd:  38
[163596.608521] CPU    1: hi:  186, btch:  31 usd: 218
[163596.608528] CPU    2: hi:  186, btch:  31 usd: 152
[163596.608533] CPU    3: hi:  186, btch:  31 usd: 171
[163596.608538] HighMem per-cpu:
[163596.608544] CPU    0: hi:  186, btch:  31 usd:  46
[163596.608549] CPU    1: hi:  186, btch:  31 usd: 171
[163596.608555] CPU    2: hi:  186, btch:  31 usd: 168
[163596.608561] CPU    3: hi:  186, btch:  31 usd: 177
[163596.608574] active_anon:26367 inactive_anon:29153 isolated_anon:0
[163596.608574]  active_file:396338 inactive_file:397959 isolated_file:0
[163596.608574]  unevictable:0 dirty:0 writeback:5 unstable:0
[163596.608574]  free:5145 slab_reclaimable:57625 slab_unreclaimable:7729
[163596.608574]  mapped:1703 shmem:10 pagetables:581 bounce:0
[163596.608602] Normal free:15256kB min:3508kB low:4384kB high:5260kB active_anon:0kB inactive_anon:8kB active_file:848kB inactive_file:1560kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:772160kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:230500kB slab_unreclaimable:30916kB kernel_stack:2208kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[163596.608607] lowmem_reserve[]: 0 26423 26423
[163596.608628] HighMem free:5324kB min:512kB low:4352kB high:8192kB active_anon:105468kB inactive_anon:116604kB active_file:1584504kB inactive_file:1590276kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3382264kB mlocked:0kB dirty:0kB writeback:20kB mapped:6812kB shmem:40kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:2324kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[163596.608634] lowmem_reserve[]: 0 0 0
[163596.608643] Normal: 216*4kB 215*8kB 216*16kB 216*32kB 36*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 15256kB
[163596.608668] HighMem: 233*4kB 67*8kB 141*16kB 22*32kB 8*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 5324kB
[163596.608692] 794329 total pagecache pages
[163596.608697] 12 pages in swap cache
[163596.608703] Swap cache stats: add 79, delete 67, find 9/11
[163596.608708] Free swap  = 8378092kB
[163596.608712] Total swap = 8378364kB
[163596.670667] 1046784 pages of RAM
[163596.670674] 6801 free pages
[163596.670679] 12533 reserved pages
[163596.670683] 36489 slab pages
[163596.670687] 631668 pages shared
[163596.670692] 12 pages swap cached
[163596.670701] SLUB: Unable to allocate memory on node -1 (gfp=0x8020)
[163596.670710]   cache: kmalloc-192, object size: 192, buffer size: 192, default order: 0, min order: 0
[163596.670718]   node 0: slabs: 2733, objs: 57393, free: 0

df -h:
...
/dev/sda6       216G   53G  163G  25% /mnt/sdb1

df -i:
...
/dev/sda6      56451072 19721920 36729152   35% /mnt/sdb1

flag@c13:~$ cat /proc/meminfo 
MemTotal:        4137004 kB
MemFree:         1191096 kB
Buffers:           23172 kB
Cached:          2074116 kB
SwapCached:           48 kB
Active:          1301568 kB
Inactive:        1016024 kB
Active(anon):     103748 kB
Inactive(anon):   116600 kB
Active(file):    1197820 kB
Inactive(file):   899424 kB
Unevictable:           0 kB
Mlocked:               0 kB
HighTotal:       3408896 kB
HighFree:        1108444 kB
LowTotal:         728108 kB
LowFree:           82652 kB
SwapTotal:       8378364 kB
SwapFree:        8378092 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        220284 kB
Mapped:             7068 kB
Shmem:                44 kB
Slab:             263216 kB
SReclaimable:     232212 kB
SUnreclaim:        31004 kB
KernelStack:        2192 kB
PageTables:         2312 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    10446864 kB
Committed_AS:    1051868 kB
VmallocTotal:     245760 kB
VmallocUsed:        2360 kB
VmallocChunk:     241428 kB
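[Editor's sketch, not part of the original mail: one detail worth pulling out of the meminfo dump above is how much of the small lowmem zone on this 32-bit box is consumed by slab, since slab allocations must come from low memory. The values are copied from the dump; the percentage is just arithmetic on them.]

```python
# Figures taken from the /proc/meminfo dump above (all in kB).
low_total = 728108          # LowTotal
slab = 263216               # Slab
slab_reclaimable = 232212   # SReclaimable

# On 32-bit ARM, slab pages live in lowmem, so this is the share of the
# lowmem zone tied up in slab caches.
pct = 100 * slab / low_total
print(round(pct, 1))  # roughly a third of low memory
# Most of it (232212 of 263216 kB) is reclaimable metadata cache.
```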

flag@c13:~$ sudo xfs_db -r "-c freesp -s" /dev/sda6
   from      to extents  blocks    pct
      1       1   27058   27058   0.06
      2       3  124367  358831   0.84
      4       7   17656  121693   0.29
      8      15 2856900 42122381  98.81
total free extents 3025981
total free blocks 42629963
average free extent size 14.088
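[Editor's sketch, not part of the original mail: the summary lines of `xfs_db -r "-c freesp -s"` can be re-derived from the histogram rows quoted above, which also makes the key observation explicit — nearly all of the free space sits in extents of at most 15 blocks, i.e. it is heavily fragmented.]

```python
# Histogram rows from the xfs_db "freesp -s" output above:
# (from, to, extents, blocks)
rows = [
    (1, 1, 27058, 27058),
    (2, 3, 124367, 358831),
    (4, 7, 17656, 121693),
    (8, 15, 2856900, 42122381),
]

total_extents = sum(r[2] for r in rows)
total_blocks = sum(r[3] for r in rows)

print(total_extents)  # 3025981, matches "total free extents"
print(total_blocks)   # 42629963, matches "total free blocks"
print(round(total_blocks / total_extents, 3))  # 14.088, the average extent size
# 98.81% of all free blocks are in tiny 8-15 block extents:
print(round(100 * rows[3][3] / total_blocks, 2))  # 98.81
```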

-- 
bye,
p.

* Re: 3.5+, xfs and 32bit armhf - xfs_buf_get: failed to map pages
  2013-05-23 14:34       ` Paolo Pisati
@ 2013-05-29 13:56         ` Paolo Pisati
  2013-05-30  0:42           ` Dave Chinner
  2013-05-30  0:38         ` Dave Chinner
  1 sibling, 1 reply; 9+ messages in thread
From: Paolo Pisati @ 2013-05-29 13:56 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, Paolo Pisati

On Thu, May 23, 2013 at 04:34:56PM +0200, Paolo Pisati wrote:
> On Tue, May 21, 2013 at 10:02:09AM +1000, Dave Chinner wrote:
> > 
> > And that fix I mentioned will be useless if you don't apply the
> > patch that avoids the vmap allocation problem....
> 
> 
> OK, so I recompiled a kernel with the aforementioned fix, repartitioned my disk, and
> ran swift-bench for 2 days in a row until I got this:

I'm testing a 3.5.y kernel plus these 3 patches:

549142a xfs: don't use speculative prealloc for small files
f0843f4 xfs: limit speculative prealloc size on sparse files
454da09 xfs: inode allocation should use unmapped buffers.

and I can confirm that:

- using a small fs (2G), I can no longer reproduce any -ENOSPC or vmalloc()
problem; the benchmark runs until it runs out of inodes

- using a bigger fs (~250G), after two days my tests are still running fine

-- 
bye,
p.

* Re: 3.5+, xfs and 32bit armhf - xfs_buf_get: failed to map pages
  2013-05-23 14:34       ` Paolo Pisati
  2013-05-29 13:56         ` Paolo Pisati
@ 2013-05-30  0:38         ` Dave Chinner
  1 sibling, 0 replies; 9+ messages in thread
From: Dave Chinner @ 2013-05-30  0:38 UTC (permalink / raw)
  To: Paolo Pisati; +Cc: xfs

On Thu, May 23, 2013 at 04:34:56PM +0200, Paolo Pisati wrote:
> On Tue, May 21, 2013 at 10:02:09AM +1000, Dave Chinner wrote:
> > 
> > And that fix I mentioned will be useless if you don't apply the
> > patch that avoids the vmap allocation problem....
> 
> 
> OK, so I recompiled a kernel with the aforementioned fix, repartitioned my disk, and
> ran swift-bench for 2 days in a row until I got this:
> 
> dmesg:
> ...
> [163596.605253] updatedb.mlocat: page allocation failure: order:0, mode:0x20
> [163596.605299] [<c00164cc>] (unwind_backtrace+0x0/0x104) from [<c04edb20>] (dump_stack+0x20/0x24)
> [163596.605320] [<c04edb20>] (dump_stack+0x20/0x24) from [<c00e7780>] (warn_alloc_failed+0xd8/0x118)
> [163596.605335] [<c00e7780>] (warn_alloc_failed+0xd8/0x118) from [<c00e9b88>] (__alloc_pages_nodemask+0x524/0x708)
> [163596.605354] [<c00e9b88>] (__alloc_pages_nodemask+0x524/0x708) from [<c011b798>] (new_slab+0x22c/0x248)
> [163596.605370] [<c011b798>] (new_slab+0x22c/0x248) from [<c04f04f8>] (__slab_alloc.constprop.46+0x1a4/0x4c8)
> [163596.605383] [<c04f04f8>] (__slab_alloc.constprop.46+0x1a4/0x4c8) from [<c011ced4>] (kmem_cache_alloc+0x158/0x190)
> [163596.605402] [<c011ced4>] (kmem_cache_alloc+0x158/0x190) from [<c0332be0>] (scsi_pool_alloc_command+0x30/0x74)
> [163596.605417] [<c0332be0>] (scsi_pool_alloc_command+0x30/0x74) from [<c0332c80>] (scsi_host_alloc_command+0x24/0x78)
> [163596.605428] [<c0332c80>] (scsi_host_alloc_command+0x24/0x78) from [<c0332cf0>] (__scsi_get_command+0x1c/0xa0)
> [163596.605439] [<c0332cf0>] (__scsi_get_command+0x1c/0xa0) from [<c0332db0>] (scsi_get_command+0x3c/0xb0)
> [163596.605453] [<c0332db0>] (scsi_get_command+0x3c/0xb0) from [<c0338d44>] (scsi_get_cmd_from_req+0x50/0x60)
> [163596.605466] [<c0338d44>] (scsi_get_cmd_from_req+0x50/0x60) from [<c0339fd8>] (scsi_setup_fs_cmnd+0x4c/0xac)

ENOMEM deep in the SCSI stack for an order 0 GFP_ATOMIC allocation.
That's not an XFS problem - that's a SCSI stack issue. You should
probably report that to the scsi list...
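[Editor's sketch, not part of the original reply: the `mode:0x20` in the failure report and the `gfp=0x8020` from SLUB can be decoded against the GFP flag bit values from `include/linux/gfp.h` circa Linux 3.5 (an assumption — these bit assignments changed in later kernels). The point Dave makes follows directly: the mask lacks `__GFP_WAIT`, so the allocation is atomic, cannot sleep or enter reclaim, and simply returns NULL under pressure.]

```python
# GFP flag bit values as in include/linux/gfp.h around Linux 3.5
# (assumed; later kernels renumbered these), used to decode the
# "mode:0x20" and "gfp=0x8020" masks printed in the log above.
GFP_FLAGS = {
    0x10: "__GFP_WAIT",   # may sleep / enter direct reclaim
    0x20: "__GFP_HIGH",   # may dip into emergency reserves
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x200: "__GFP_NOWARN",
    0x4000: "__GFP_COMP",
    0x8000: "__GFP_ZERO",
}

def decode_gfp(mask):
    """Return the flag names set in a printed GFP mask."""
    return [name for bit, name in GFP_FLAGS.items() if mask & bit]

# mode:0x20 is __GFP_HIGH without __GFP_WAIT, i.e. GFP_ATOMIC --
# it cannot block, so it fails rather than waits for reclaim.
print(decode_gfp(0x20))    # ['__GFP_HIGH']
print(decode_gfp(0x8020))  # ['__GFP_HIGH', '__GFP_ZERO']
```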

> [163596.608574] active_anon:26367 inactive_anon:29153 isolated_anon:0
> [163596.608574]  active_file:396338 inactive_file:397959 isolated_file:0
> [163596.608574]  unevictable:0 dirty:0 writeback:5 unstable:0
> [163596.608574]  free:5145 slab_reclaimable:57625 slab_unreclaimable:7729
> [163596.608574]  mapped:1703 shmem:10 pagetables:581 bounce:0
> [163596.608602] Normal free:15256kB min:3508kB low:4384kB high:5260kB active_anon:0kB inactive_anon:8kB active_file:848kB inactive_file:1560kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:772160kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:230500kB slab_unreclaimable:30916kB kernel_stack:2208kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> [163596.608607] lowmem_reserve[]: 0 26423 26423
> [163596.608628] HighMem free:5324kB min:512kB low:4352kB high:8192kB active_anon:105468kB inactive_anon:116604kB active_file:1584504kB inactive_file:1590276kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3382264kB mlocked:0kB dirty:0kB writeback:20kB mapped:6812kB shmem:40kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:2324kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> [163596.608634] lowmem_reserve[]: 0 0 0
> [163596.608643] Normal: 216*4kB 215*8kB 216*16kB 216*32kB 36*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 15256kB
> [163596.608668] HighMem: 233*4kB 67*8kB 141*16kB 22*32kB 8*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 5324kB

Though this says there are plenty of free order-0 pages in both low
and high memory....
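[Editor's sketch, not part of the original reply: the buddy allocator free lists quoted above ("N*SIZEkB" per order) can be summed to confirm the printed totals and the point about order-0 availability.]

```python
import re

# Per-order free lists exactly as printed in the log above.
normal  = "216*4kB 215*8kB 216*16kB 216*32kB 36*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB"
highmem = "233*4kB 67*8kB 141*16kB 22*32kB 8*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB"

def parse(line):
    """Return (count, chunk size in kB) pairs, one per order 0..10."""
    return [(int(n), int(kb)) for n, kb in re.findall(r"(\d+)\*(\d+)kB", line)]

def total_kb(line):
    return sum(n * kb for n, kb in parse(line))

print(total_kb(normal))   # 15256, matches the "= 15256kB" total
print(total_kb(highmem))  # 5324, matches "= 5324kB"
# Order-0 (4kB) chunks alone: 216 free in Normal, 233 in HighMem,
# so an order-0 allocation had pages available at the time of the dump.
print(parse(normal)[0], parse(highmem)[0])
```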

> [163596.608692] 794329 total pagecache pages
> [163596.608697] 12 pages in swap cache
> [163596.608703] Swap cache stats: add 79, delete 67, find 9/11
> [163596.608708] Free swap  = 8378092kB
> [163596.608712] Total swap = 8378364kB
> [163596.670667] 1046784 pages of RAM
> [163596.670674] 6801 free pages
> [163596.670679] 12533 reserved pages
> [163596.670683] 36489 slab pages
> [163596.670687] 631668 pages shared
> [163596.670692] 12 pages swap cached
> [163596.670701] SLUB: Unable to allocate memory on node -1 (gfp=0x8020)
> [163596.670710]   cache: kmalloc-192, object size: 192, buffer size: 192, default order: 0, min order: 0
> [163596.670718]   node 0: slabs: 2733, objs: 57393, free: 0

And it was SLUB that was unable to find a page when it should have
been able to, so perhaps this is a VM problem?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: 3.5+, xfs and 32bit armhf - xfs_buf_get: failed to map pages
  2013-05-29 13:56         ` Paolo Pisati
@ 2013-05-30  0:42           ` Dave Chinner
  0 siblings, 0 replies; 9+ messages in thread
From: Dave Chinner @ 2013-05-30  0:42 UTC (permalink / raw)
  To: Paolo Pisati; +Cc: xfs

On Wed, May 29, 2013 at 03:56:41PM +0200, Paolo Pisati wrote:
> On Thu, May 23, 2013 at 04:34:56PM +0200, Paolo Pisati wrote:
> > On Tue, May 21, 2013 at 10:02:09AM +1000, Dave Chinner wrote:
> > > 
> > > And that fix I mentioned will be useless if you don't apply the
> > > patch that avoids the vmap allocation problem....
> > 
> > 
> > OK, so I recompiled a kernel with the aforementioned fix, repartitioned my disk, and
> > ran swift-bench for 2 days in a row until I got this:
> 
> I'm testing a 3.5.y kernel plus these 3 patches:
> 
> 549142a xfs: don't use speculative prealloc for small files
> f0843f4 xfs: limit speculative prealloc size on sparse files
> 454da09 xfs: inode allocation should use unmapped buffers.
> 
> and I can confirm that:
> 
> - using a small fs (2G), I can no longer reproduce any -ENOSPC or vmalloc()
> problem; the benchmark runs until it runs out of inodes
> 
> - using a bigger fs (~250G), after two days my tests are still running fine

Ok, good to know. The first patch you list there hasn't even been
reviewed yet, so it might take some time before that is ready for
-stable backport.

Also, there are a bunch of fixes needed for the second patch you have
there (f0843f4 xfs: limit speculative prealloc...) that would also
be necessary for a -stable backport, i.e.:

e8108ce xfs: fix xfs_iomap_eof_prealloc_initial_size type
e114b5f xfs: increase prealloc size to double that of the previous extent
e78c420 xfs: fix potential infinite loop in xfs_iomap_prealloc_size()

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

end of thread, other threads:[~2013-05-30  0:42 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-05-17 10:45 3.5+, xfs and 32bit armhf - xfs_buf_get: failed to map pages Paolo Pisati
2013-05-18  8:43 ` Jeff Liu
2013-05-19  1:13 ` Dave Chinner
2013-05-20 17:07   ` Paolo Pisati
2013-05-21  0:02     ` Dave Chinner
2013-05-23 14:34       ` Paolo Pisati
2013-05-29 13:56         ` Paolo Pisati
2013-05-30  0:42           ` Dave Chinner
2013-05-30  0:38         ` Dave Chinner
