[uml-devel] BUG: soft lockup for a user mode linux image

All of lore.kernel.org
 help / color / mirror / Atom feed

* [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-02 18:30 ` Toralf Förster
  0 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-02 18:30 UTC (permalink / raw)
  To: trinity; +Cc: UML devel

Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time 
within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
yields into this konsole message :

 * Starting local
net.core.warnings = 0                                                                         [ ok ]
BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]


and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
2 subsequent made back traces made with

$> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt

shows nearly a similar position around __get_user_pages() - both are attached.

I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
expected or if it points to an issue.


FWIW the last lines of trinity log were :


[2031] [94] setsid() = 2031
[2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
[2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
[2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
[2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
[2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
[2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
[2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
[2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
[2031] [103] sigpending(set=0x40025000) = 0
[2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
[2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
[2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
[2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
[2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
[2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
[2031] [110] munlock(addr=0x40025000, len=34) = 0
[2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
[2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
[2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
[2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
[2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
[2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
[2031] [117] inotify_init() = 654
[2031] [118] getgid() = 100
[2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
[2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
[2031] [121] timerfd_create(clockid=0, flags=0) = 655
[2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
[2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
[2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
[2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
[2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
[2031] [127] ssetmask(newmask=0x88000092) = 0
[2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
[2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
[2031] [130] fchdir(fd=397) = -1 (Not a directory)
[2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
[2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
[2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.

-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 64+ messages in thread

* BUG: soft lockup for a user mode linux image
@ 2013-10-02 18:30 ` Toralf Förster
  0 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-02 18:30 UTC (permalink / raw)
  To: trinity; +Cc: UML devel

Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time 
within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
yields into this konsole message :

 * Starting local
net.core.warnings = 0                                                                         [ ok ]
BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]


and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
2 subsequent made back traces made with

$> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt

shows nearly a similar position around __get_user_pages() - both are attached.

I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
expected or if it points to an issue.


FWIW the last lines of trinity log were :


[2031] [94] setsid() = 2031
[2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
[2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
[2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
[2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
[2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
[2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
[2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
[2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
[2031] [103] sigpending(set=0x40025000) = 0
[2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
[2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
[2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
[2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
[2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
[2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
[2031] [110] munlock(addr=0x40025000, len=34) = 0
[2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
[2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
[2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
[2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
[2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
[2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
[2031] [117] inotify_init() = 654
[2031] [118] getgid() = 100
[2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
[2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
[2031] [121] timerfd_create(clockid=0, flags=0) = 655
[2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
[2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
[2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
[2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
[2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
[2031] [127] ssetmask(newmask=0x88000092) = 0
[2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
[2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
[2031] [130] fchdir(fd=397) = -1 (Not a directory)
[2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
[2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
[2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.

-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-02 18:30 ` Toralf Förster
@ 2013-10-02 18:31   ` Toralf Förster
  -1 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-02 18:31 UTC (permalink / raw)
  To: trinity; +Cc: UML devel

[-- Attachment #1: Type: text/plain, Size: 4232 bytes --]

(resend - now with attached files)

Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time 
within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
yields into this konsole message :

 * Starting local
net.core.warnings = 0                                                                         [ ok ]
BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]


and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
2 subsequent made back traces made with

$> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt

shows nearly a similar position around __get_user_pages() - both are attached.

I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
expected or if it points to an issue.


FWIW the last lines of trinity log were :


[2031] [94] setsid() = 2031
[2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
[2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
[2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
[2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
[2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
[2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
[2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
[2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
[2031] [103] sigpending(set=0x40025000) = 0
[2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
[2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
[2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
[2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
[2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
[2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
[2031] [110] munlock(addr=0x40025000, len=34) = 0
[2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
[2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
[2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
[2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
[2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
[2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
[2031] [117] inotify_init() = 654
[2031] [118] getgid() = 100
[2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
[2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
[2031] [121] timerfd_create(clockid=0, flags=0) = 655
[2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
[2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
[2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
[2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
[2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
[2031] [127] ssetmask(newmask=0x88000092) = 0
[2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
[2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
[2031] [130] fchdir(fd=397) = -1 (Not a directory)
[2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
[2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
[2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.

-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3



[-- Attachment #2: 1 --]
[-- Type: text/plain, Size: 4078 bytes --]

0x082949e4 in number (buf=0x434dfb21 "/K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", end=0x434dfb22 "K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", num=597699752268362946, spec=...) at lib/vsprintf.c:381
381	char *number(char *buf, char *end, unsigned long long num,
#0  0x082949e4 in number (buf=0x434dfb21 "/K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", end=0x434dfb22 "K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", num=597699752268362946, spec=...) at lib/vsprintf.c:381
#1  0x08295ef7 in vsnprintf (buf=0x434dfb1c "+0x52/K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", size=2147483647, fmt=0x84b74c2 "", args=0x434dfad0 "\t\373MC") at lib/vsprintf.c:1740
#2  0x08296547 in sprintf (buf=0x434dfb22 "K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", fmt=0x434dfb22 "K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b") at lib/vsprintf.c:1875
#3  0x080b76cf in __sprint_symbol (buffer=0x434dfb09 "io_schedule_timeout+0x52/K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", address=3276667675, symbol_offset=1, add_offset=1) at kernel/kallsyms.c:374
#4  0x080b781a in sprint_symbol (buffer=0x434dfb09 "io_schedule_timeout+0x52/K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", address=138503010) at kernel/kallsyms.c:395
#5  0x080b787f in __print_symbol (fmt=0x434dfb22 "K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", address=1129184034) at kernel/kallsyms.c:440
#6  0x08078200 in print_symbol (fmt=<optimized out>, addr=<optimized out>) at include/linux/kallsyms.h:119
#7  print_context_stack (ebp=<optimized out>, tinfo=<optimized out>, stack=<optimized out>) at arch/x86/um/sysrq_32.c:54
#8  show_trace (task=0x434dfffd, stack=0x434dfc44) at arch/x86/um/sysrq_32.c:95
#9  0x0805fb6f in show_stack (task=0x0, esp=0x434dfc44) at arch/um/kernel/sysrq.c:65
#10 0x08413ae6 in __dump_stack () at lib/dump_stack.c:15
#11 dump_stack () at lib/dump_stack.c:60
#12 0x08414981 in schedule_timeout (timeout=-4028) at kernel/timer.c:1460
#13 0x08416362 in io_schedule_timeout (timeout=1129184034) at kernel/sched/core.c:3992
#14 0x08411c31 in balance_dirty_pages (pages_dirtied=17, mapping=<optimized out>) at mm/page-writeback.c:1519
#15 0x080d1ce4 in balance_dirty_pages_ratelimited (mapping=0x60) at mm/page-writeback.c:1653
#16 0x080e2d0c in __do_fault (mm=0x47f24400, vma=0x442095d8, address=1080451072, pmd=0x434dfb22, pgoff=1129184034, flags=1155011416, orig_pte=<incomplete type>) at mm/memory.c:3452
#17 0x080e5286 in do_nonlinear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3518
#18 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, pte=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3717
#19 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
#20 handle_mm_fault (mm=0x47f24400, vma=0x442095d8, address=1080451072, flags=1) at mm/memory.c:3868
#21 0x080e5a07 in __get_user_pages (tsk=0x46e23600, mm=0x47f24400, start=1080451072, nr_pages=65, gup_flags=519, pages=0x46f34d20, vmas=0x0, nonblocking=0x0) at mm/memory.c:1822
#22 0x080e5cc3 in get_user_pages (tsk=0x434dfb22, mm=0x434dfb22, start=1129184034, nr_pages=1129184034, write=1, force=0, pages=0xc34dfb1b, vmas=0x60) at mm/memory.c:2019
#23 0x08140cce in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:317
#24 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:582
#25 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at fs/aio.c:1100
#26 SyS_io_setup (nr_events=4095, ctxp=1076256768) at fs/aio.c:1083
#27 0x080619c2 in handle_syscall (r=0x46e237d4) at arch/um/kernel/skas/syscall.c:35
#28 0x08073f2d in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
#29 userspace (regs=0x46e237d4) at arch/um/os-Linux/skas/process.c:431
#30 0x0805e6ac in fork_handler () at arch/um/kernel/process.c:160
#31 0x00000000 in ?? ()

[-- Attachment #3: 2 --]
[-- Type: text/plain, Size: 3500 bytes --]

format_decode (fmt=0x84b74b8 "+%#lx/%#lx", spec=0x434dfa8c) at lib/vsprintf.c:1375
1375		if (spec->type == FORMAT_TYPE_WIDTH) {
#0  format_decode (fmt=0x84b74b8 "+%#lx/%#lx", spec=0x434dfa8c) at lib/vsprintf.c:1375
#1  0x08295b82 in vsnprintf (buf=0x434dfb19 "", size=2147483647, fmt=0x84b74b8 "+%#lx/%#lx", args=0x434dfac8 "A") at lib/vsprintf.c:1615
#2  0x08296547 in sprintf (buf=0x84b74b8 "+%#lx/%#lx", fmt=0x84b74b8 "+%#lx/%#lx") at lib/vsprintf.c:1875
#3  0x080b76cf in __sprint_symbol (buffer=0x434dfb09 "schedule_timeout", address=1129183884, symbol_offset=1, add_offset=1) at kernel/kallsyms.c:374
#4  0x080b781a in sprint_symbol (buffer=0x434dfb09 "schedule_timeout", address=138496385) at kernel/kallsyms.c:395
#5  0x080b787f in __print_symbol (fmt=0x84b74b8 "+%#lx/%#lx", address=139162808) at kernel/kallsyms.c:440
#6  0x08078200 in print_symbol (fmt=<optimized out>, addr=<optimized out>) at include/linux/kallsyms.h:119
#7  print_context_stack (ebp=<optimized out>, tinfo=<optimized out>, stack=<optimized out>) at arch/x86/um/sysrq_32.c:54
#8  show_trace (task=0x434dfffd, stack=0x434dfc44) at arch/x86/um/sysrq_32.c:95
#9  0x0805fb6f in show_stack (task=0x0, esp=0x434dfc44) at arch/um/kernel/sysrq.c:65
#10 0x08413ae6 in __dump_stack () at lib/dump_stack.c:15
#11 dump_stack () at lib/dump_stack.c:60
#12 0x08414981 in schedule_timeout (timeout=-4028) at kernel/timer.c:1460
#13 0x08416362 in io_schedule_timeout (timeout=139162808) at kernel/sched/core.c:3992
#14 0x08411c31 in balance_dirty_pages (pages_dirtied=17, mapping=<optimized out>) at mm/page-writeback.c:1519
#15 0x080d1ce4 in balance_dirty_pages_ratelimited (mapping=0x11) at mm/page-writeback.c:1653
#16 0x080e2d0c in __do_fault (mm=0x47f24400, vma=0x442095d8, address=1080451072, pmd=0x84b74b8, pgoff=139162808, flags=1155011416, orig_pte=<incomplete type>) at mm/memory.c:3452
#17 0x080e5286 in do_nonlinear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3518
#18 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, pte=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3717
#19 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
#20 handle_mm_fault (mm=0x47f24400, vma=0x442095d8, address=1080451072, flags=1) at mm/memory.c:3868
#21 0x080e5a07 in __get_user_pages (tsk=0x46e23600, mm=0x47f24400, start=1080451072, nr_pages=65, gup_flags=519, pages=0x46f34d20, vmas=0x0, nonblocking=0x0) at mm/memory.c:1822
#22 0x080e5cc3 in get_user_pages (tsk=0x84b74b8, mm=0x84b74b8, start=139162808, nr_pages=139162808, write=1, force=0, pages=0x434dfa8c, vmas=0x11) at mm/memory.c:2019
#23 0x08140cce in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:317
#24 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:582
#25 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at fs/aio.c:1100
#26 SyS_io_setup (nr_events=4095, ctxp=1076256768) at fs/aio.c:1083
#27 0x080619c2 in handle_syscall (r=0x46e237d4) at arch/um/kernel/skas/syscall.c:35
#28 0x08073f2d in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
#29 userspace (regs=0x46e237d4) at arch/um/os-Linux/skas/process.c:431
#30 0x0805e6ac in fork_handler () at arch/um/kernel/process.c:160
#31 0x00000000 in ?? ()

[-- Attachment #4: Type: text/plain, Size: 416 bytes --]

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk

[-- Attachment #5: Type: text/plain, Size: 194 bytes --]

_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 64+ messages in thread

* BUG: soft lockup for a user mode linux image
@ 2013-10-02 18:31   ` Toralf Förster
  0 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-02 18:31 UTC (permalink / raw)
  To: trinity; +Cc: UML devel

[-- Attachment #1: Type: text/plain, Size: 4232 bytes --]

(resend - now with attached files)

Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time 
within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
yields into this konsole message :

 * Starting local
net.core.warnings = 0                                                                         [ ok ]
BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]


and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
2 subsequent made back traces made with

$> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt

shows nearly a similar position around __get_user_pages() - both are attached.

I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
expected or if it points to an issue.


FWIW the last lines of trinity log were :


[2031] [94] setsid() = 2031
[2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
[2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
[2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
[2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
[2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
[2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
[2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
[2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
[2031] [103] sigpending(set=0x40025000) = 0
[2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
[2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
[2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
[2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
[2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
[2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
[2031] [110] munlock(addr=0x40025000, len=34) = 0
[2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
[2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
[2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
[2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
[2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
[2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
[2031] [117] inotify_init() = 654
[2031] [118] getgid() = 100
[2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
[2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
[2031] [121] timerfd_create(clockid=0, flags=0) = 655
[2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
[2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
[2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
[2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
[2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
[2031] [127] ssetmask(newmask=0x88000092) = 0
[2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
[2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
[2031] [130] fchdir(fd=397) = -1 (Not a directory)
[2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
[2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
[2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.

-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3



[-- Attachment #2: 1 --]
[-- Type: text/plain, Size: 4078 bytes --]

0x082949e4 in number (buf=0x434dfb21 "/K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", end=0x434dfb22 "K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", num=597699752268362946, spec=...) at lib/vsprintf.c:381
381	char *number(char *buf, char *end, unsigned long long num,
#0  0x082949e4 in number (buf=0x434dfb21 "/K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", end=0x434dfb22 "K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", num=597699752268362946, spec=...) at lib/vsprintf.c:381
#1  0x08295ef7 in vsnprintf (buf=0x434dfb1c "+0x52/K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", size=2147483647, fmt=0x84b74c2 "", args=0x434dfad0 "\t\373MC") at lib/vsprintf.c:1740
#2  0x08296547 in sprintf (buf=0x434dfb22 "K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", fmt=0x434dfb22 "K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b") at lib/vsprintf.c:1875
#3  0x080b76cf in __sprint_symbol (buffer=0x434dfb09 "io_schedule_timeout+0x52/K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", address=3276667675, symbol_offset=1, add_offset=1) at kernel/kallsyms.c:374
#4  0x080b781a in sprint_symbol (buffer=0x434dfb09 "io_schedule_timeout+0x52/K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", address=138503010) at kernel/kallsyms.c:395
#5  0x080b787f in __print_symbol (fmt=0x434dfb22 "K\b\225\036`\b\243^K\b\243^K\bD\373MC\210\247\t\b", address=1129184034) at kernel/kallsyms.c:440
#6  0x08078200 in print_symbol (fmt=<optimized out>, addr=<optimized out>) at include/linux/kallsyms.h:119
#7  print_context_stack (ebp=<optimized out>, tinfo=<optimized out>, stack=<optimized out>) at arch/x86/um/sysrq_32.c:54
#8  show_trace (task=0x434dfffd, stack=0x434dfc44) at arch/x86/um/sysrq_32.c:95
#9  0x0805fb6f in show_stack (task=0x0, esp=0x434dfc44) at arch/um/kernel/sysrq.c:65
#10 0x08413ae6 in __dump_stack () at lib/dump_stack.c:15
#11 dump_stack () at lib/dump_stack.c:60
#12 0x08414981 in schedule_timeout (timeout=-4028) at kernel/timer.c:1460
#13 0x08416362 in io_schedule_timeout (timeout=1129184034) at kernel/sched/core.c:3992
#14 0x08411c31 in balance_dirty_pages (pages_dirtied=17, mapping=<optimized out>) at mm/page-writeback.c:1519
#15 0x080d1ce4 in balance_dirty_pages_ratelimited (mapping=0x60) at mm/page-writeback.c:1653
#16 0x080e2d0c in __do_fault (mm=0x47f24400, vma=0x442095d8, address=1080451072, pmd=0x434dfb22, pgoff=1129184034, flags=1155011416, orig_pte=<incomplete type>) at mm/memory.c:3452
#17 0x080e5286 in do_nonlinear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3518
#18 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, pte=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3717
#19 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
#20 handle_mm_fault (mm=0x47f24400, vma=0x442095d8, address=1080451072, flags=1) at mm/memory.c:3868
#21 0x080e5a07 in __get_user_pages (tsk=0x46e23600, mm=0x47f24400, start=1080451072, nr_pages=65, gup_flags=519, pages=0x46f34d20, vmas=0x0, nonblocking=0x0) at mm/memory.c:1822
#22 0x080e5cc3 in get_user_pages (tsk=0x434dfb22, mm=0x434dfb22, start=1129184034, nr_pages=1129184034, write=1, force=0, pages=0xc34dfb1b, vmas=0x60) at mm/memory.c:2019
#23 0x08140cce in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:317
#24 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:582
#25 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at fs/aio.c:1100
#26 SyS_io_setup (nr_events=4095, ctxp=1076256768) at fs/aio.c:1083
#27 0x080619c2 in handle_syscall (r=0x46e237d4) at arch/um/kernel/skas/syscall.c:35
#28 0x08073f2d in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
#29 userspace (regs=0x46e237d4) at arch/um/os-Linux/skas/process.c:431
#30 0x0805e6ac in fork_handler () at arch/um/kernel/process.c:160
#31 0x00000000 in ?? ()

[-- Attachment #3: 2 --]
[-- Type: text/plain, Size: 3500 bytes --]

format_decode (fmt=0x84b74b8 "+%#lx/%#lx", spec=0x434dfa8c) at lib/vsprintf.c:1375
1375		if (spec->type == FORMAT_TYPE_WIDTH) {
#0  format_decode (fmt=0x84b74b8 "+%#lx/%#lx", spec=0x434dfa8c) at lib/vsprintf.c:1375
#1  0x08295b82 in vsnprintf (buf=0x434dfb19 "", size=2147483647, fmt=0x84b74b8 "+%#lx/%#lx", args=0x434dfac8 "A") at lib/vsprintf.c:1615
#2  0x08296547 in sprintf (buf=0x84b74b8 "+%#lx/%#lx", fmt=0x84b74b8 "+%#lx/%#lx") at lib/vsprintf.c:1875
#3  0x080b76cf in __sprint_symbol (buffer=0x434dfb09 "schedule_timeout", address=1129183884, symbol_offset=1, add_offset=1) at kernel/kallsyms.c:374
#4  0x080b781a in sprint_symbol (buffer=0x434dfb09 "schedule_timeout", address=138496385) at kernel/kallsyms.c:395
#5  0x080b787f in __print_symbol (fmt=0x84b74b8 "+%#lx/%#lx", address=139162808) at kernel/kallsyms.c:440
#6  0x08078200 in print_symbol (fmt=<optimized out>, addr=<optimized out>) at include/linux/kallsyms.h:119
#7  print_context_stack (ebp=<optimized out>, tinfo=<optimized out>, stack=<optimized out>) at arch/x86/um/sysrq_32.c:54
#8  show_trace (task=0x434dfffd, stack=0x434dfc44) at arch/x86/um/sysrq_32.c:95
#9  0x0805fb6f in show_stack (task=0x0, esp=0x434dfc44) at arch/um/kernel/sysrq.c:65
#10 0x08413ae6 in __dump_stack () at lib/dump_stack.c:15
#11 dump_stack () at lib/dump_stack.c:60
#12 0x08414981 in schedule_timeout (timeout=-4028) at kernel/timer.c:1460
#13 0x08416362 in io_schedule_timeout (timeout=139162808) at kernel/sched/core.c:3992
#14 0x08411c31 in balance_dirty_pages (pages_dirtied=17, mapping=<optimized out>) at mm/page-writeback.c:1519
#15 0x080d1ce4 in balance_dirty_pages_ratelimited (mapping=0x11) at mm/page-writeback.c:1653
#16 0x080e2d0c in __do_fault (mm=0x47f24400, vma=0x442095d8, address=1080451072, pmd=0x84b74b8, pgoff=139162808, flags=1155011416, orig_pte=<incomplete type>) at mm/memory.c:3452
#17 0x080e5286 in do_nonlinear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3518
#18 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, pte=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3717
#19 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
#20 handle_mm_fault (mm=0x47f24400, vma=0x442095d8, address=1080451072, flags=1) at mm/memory.c:3868
#21 0x080e5a07 in __get_user_pages (tsk=0x46e23600, mm=0x47f24400, start=1080451072, nr_pages=65, gup_flags=519, pages=0x46f34d20, vmas=0x0, nonblocking=0x0) at mm/memory.c:1822
#22 0x080e5cc3 in get_user_pages (tsk=0x84b74b8, mm=0x84b74b8, start=139162808, nr_pages=139162808, write=1, force=0, pages=0x434dfa8c, vmas=0x11) at mm/memory.c:2019
#23 0x08140cce in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:317
#24 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:582
#25 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at fs/aio.c:1100
#26 SyS_io_setup (nr_events=4095, ctxp=1076256768) at fs/aio.c:1083
#27 0x080619c2 in handle_syscall (r=0x46e237d4) at arch/um/kernel/skas/syscall.c:35
#28 0x08073f2d in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
#29 userspace (regs=0x46e237d4) at arch/um/os-Linux/skas/process.c:431
#30 0x0805e6ac in fork_handler () at arch/um/kernel/process.c:160
#31 0x00000000 in ?? ()

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-02 18:30 ` Toralf Förster
@ 2013-10-02 19:55   ` Richard Weinberger
  -1 siblings, 0 replies; 64+ messages in thread
From: Richard Weinberger @ 2013-10-02 19:55 UTC (permalink / raw)
  To: Toralf Förster; +Cc: UML devel, trinity

On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
> yields into this konsole message :
>
>  * Starting local
> net.core.warnings = 0                                                                         [ ok ]
> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>
>
> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
> 2 subsequent made back traces made with
>
> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>
> shows nearly a similar position around __get_user_pages() - both are attached.
>
> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
> expected or if it points to an issue.
>
>
> FWIW the last lines of trinity log were :
>
>
> [2031] [94] setsid() = 2031
> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
> [2031] [103] sigpending(set=0x40025000) = 0
> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
> [2031] [110] munlock(addr=0x40025000, len=34) = 0
> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
> [2031] [117] inotify_init() = 654
> [2031] [118] getgid() = 100
> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
> [2031] [127] ssetmask(newmask=0x88000092) = 0
> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.

Reading your gdb backtraces show that schedule_timeout() got called
with a negative value.
Looks like an integer overflow.
The soft-lockup might also origin from that (very big integer which
did not overflow jet)

-- 
Thanks,
//richard

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-02 19:55   ` Richard Weinberger
  0 siblings, 0 replies; 64+ messages in thread
From: Richard Weinberger @ 2013-10-02 19:55 UTC (permalink / raw)
  To: Toralf Förster; +Cc: trinity, UML devel

On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
> yields into this konsole message :
>
>  * Starting local
> net.core.warnings = 0                                                                         [ ok ]
> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>
>
> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
> 2 subsequent made back traces made with
>
> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>
> shows nearly a similar position around __get_user_pages() - both are attached.
>
> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
> expected or if it points to an issue.
>
>
> FWIW the last lines of trinity log were :
>
>
> [2031] [94] setsid() = 2031
> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
> [2031] [103] sigpending(set=0x40025000) = 0
> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
> [2031] [110] munlock(addr=0x40025000, len=34) = 0
> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
> [2031] [117] inotify_init() = 654
> [2031] [118] getgid() = 100
> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
> [2031] [127] ssetmask(newmask=0x88000092) = 0
> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.

Reading your gdb backtraces show that schedule_timeout() got called
with a negative value.
Looks like an integer overflow.
The soft-lockup might also origin from that (very big integer which
did not overflow jet)

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-02 19:55   ` Richard Weinberger
@ 2013-10-03 18:54     ` Toralf Förster
  -1 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-03 18:54 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: UML devel, trinity

On 10/02/2013 09:55 PM, Richard Weinberger wrote:
> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>> yields into this konsole message :
>>
>>  * Starting local
>> net.core.warnings = 0                                                                         [ ok ]
>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>
>>
>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>> 2 subsequent made back traces made with
>>
>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>
>> shows nearly a similar position around __get_user_pages() - both are attached.
>>
>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>> expected or if it points to an issue.
>>
>>
>> FWIW the last lines of trinity log were :
>>
>>
>> [2031] [94] setsid() = 2031
>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>> [2031] [103] sigpending(set=0x40025000) = 0
>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>> [2031] [117] inotify_init() = 654
>> [2031] [118] getgid() = 100
>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
> 
> Reading your gdb backtraces show that schedule_timeout() got called
> with a negative value.
> Looks like an integer overflow.
> The soft-lockup might also origin from that (very big integer which
> did not overflow jet)
> 

If the culprit is solved by this patch I'd like to send it out. But I'm
unsure whether it catches the culprit or if it just covers the root cause.

ommit 7b367d5478dfcba1635e8cde3c663515fee129ec
Author: Toralf Förster <toralf.foerster@gmx.de>
Date:   Thu Oct 3 20:46:38 2013 +0200

    mm/page-writeback.c: avoid integer overflow in balance_dirty_pages()

    avoid crash of a 32 bit UML (found using fuzz tool trinity)

    Signed-off-by: Toralf Förster <toralf.foerster@gmx.de>

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..f5a3096 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1462,7 +1462,8 @@ static void balance_dirty_pages(struct
address_space *mapping,
                        pause = max_pause;
                        goto pause;
                }
-               period = HZ * pages_dirtied / task_ratelimit;
+               period = ((long) HZ) * pages_dirtied / task_ratelimit;
+
                pause = period;
                if (current->dirty_paused_when)
                        pause -= now - current->dirty_paused_when;


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-03 18:54     ` Toralf Förster
  0 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-03 18:54 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: trinity, UML devel

On 10/02/2013 09:55 PM, Richard Weinberger wrote:
> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>> yields into this konsole message :
>>
>>  * Starting local
>> net.core.warnings = 0                                                                         [ ok ]
>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>
>>
>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>> 2 subsequent made back traces made with
>>
>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>
>> shows nearly a similar position around __get_user_pages() - both are attached.
>>
>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>> expected or if it points to an issue.
>>
>>
>> FWIW the last lines of trinity log were :
>>
>>
>> [2031] [94] setsid() = 2031
>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>> [2031] [103] sigpending(set=0x40025000) = 0
>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>> [2031] [117] inotify_init() = 654
>> [2031] [118] getgid() = 100
>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
> 
> Reading your gdb backtraces show that schedule_timeout() got called
> with a negative value.
> Looks like an integer overflow.
> The soft-lockup might also origin from that (very big integer which
> did not overflow jet)
> 

If the culprit is solved by this patch I'd like to send it out. But I'm
unsure whether it catches the culprit or if it just covers the root cause.

ommit 7b367d5478dfcba1635e8cde3c663515fee129ec
Author: Toralf Förster <toralf.foerster@gmx.de>
Date:   Thu Oct 3 20:46:38 2013 +0200

    mm/page-writeback.c: avoid integer overflow in balance_dirty_pages()

    avoid crash of a 32 bit UML (found using fuzz tool trinity)

    Signed-off-by: Toralf Förster <toralf.foerster@gmx.de>

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..f5a3096 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1462,7 +1462,8 @@ static void balance_dirty_pages(struct
address_space *mapping,
                        pause = max_pause;
                        goto pause;
                }
-               period = HZ * pages_dirtied / task_ratelimit;
+               period = ((long) HZ) * pages_dirtied / task_ratelimit;
+
                pause = period;
                if (current->dirty_paused_when)
                        pause -= now - current->dirty_paused_when;


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-03 18:54     ` Toralf Förster
@ 2013-10-03 19:04       ` Richard Weinberger
  -1 siblings, 0 replies; 64+ messages in thread
From: Richard Weinberger @ 2013-10-03 19:04 UTC (permalink / raw)
  To: Toralf Förster; +Cc: UML devel, trinity

Am 03.10.2013 20:54, schrieb Toralf Förster:
> On 10/02/2013 09:55 PM, Richard Weinberger wrote:
>> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>>> yields into this konsole message :
>>>
>>>  * Starting local
>>> net.core.warnings = 0                                                                         [ ok ]
>>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>>
>>>
>>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>>> 2 subsequent made back traces made with
>>>
>>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>>
>>> shows nearly a similar position around __get_user_pages() - both are attached.
>>>
>>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>>> expected or if it points to an issue.
>>>
>>>
>>> FWIW the last lines of trinity log were :
>>>
>>>
>>> [2031] [94] setsid() = 2031
>>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>>> [2031] [103] sigpending(set=0x40025000) = 0
>>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>>> [2031] [117] inotify_init() = 654
>>> [2031] [118] getgid() = 100
>>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
>>
>> Reading your gdb backtraces show that schedule_timeout() got called
>> with a negative value.
>> Looks like an integer overflow.
>> The soft-lockup might also origin from that (very big integer which
>> did not overflow jet)
>>
> 
> If the culprit is solved by this patch I'd like to send it out. But I'm
> unsure whether it catches the culprit or if it just covers the root cause.

I fear your Patch will not fix the issue.

Does the issue only trigger on 32bit UMLs?
How long does it take till trinity hits it?

Thanks,
//richard

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: BUG: soft lockup for a user mode linux image
@ 2013-10-03 19:04       ` Richard Weinberger
  0 siblings, 0 replies; 64+ messages in thread
From: Richard Weinberger @ 2013-10-03 19:04 UTC (permalink / raw)
  To: Toralf Förster; +Cc: UML devel, trinity

Am 03.10.2013 20:54, schrieb Toralf Förster:
> On 10/02/2013 09:55 PM, Richard Weinberger wrote:
>> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>>> yields into this konsole message :
>>>
>>>  * Starting local
>>> net.core.warnings = 0                                                                         [ ok ]
>>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>>
>>>
>>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>>> 2 subsequent made back traces made with
>>>
>>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>>
>>> shows nearly a similar position around __get_user_pages() - both are attached.
>>>
>>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>>> expected or if it points to an issue.
>>>
>>>
>>> FWIW the last lines of trinity log were :
>>>
>>>
>>> [2031] [94] setsid() = 2031
>>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>>> [2031] [103] sigpending(set=0x40025000) = 0
>>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>>> [2031] [117] inotify_init() = 654
>>> [2031] [118] getgid() = 100
>>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
>>
>> Reading your gdb backtraces show that schedule_timeout() got called
>> with a negative value.
>> Looks like an integer overflow.
>> The soft-lockup might also origin from that (very big integer which
>> did not overflow jet)
>>
> 
> If the culprit is solved by this patch I'd like to send it out. But I'm
> unsure whether it catches the culprit or if it just covers the root cause.

I fear your Patch will not fix the issue.

Does the issue only trigger on 32bit UMLs?
How long does it take till trinity hits it?

Thanks,
//richard

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-03 19:04       ` Richard Weinberger
@ 2013-10-03 19:16         ` Toralf Förster
  -1 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-03 19:16 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: UML devel, trinity

On 10/03/2013 09:04 PM, Richard Weinberger wrote:
> Am 03.10.2013 20:54, schrieb Toralf Förster:
>> On 10/02/2013 09:55 PM, Richard Weinberger wrote:
>>> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>>>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>>>> yields into this konsole message :
>>>>
>>>>  * Starting local
>>>> net.core.warnings = 0                                                                         [ ok ]
>>>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>>>
>>>>
>>>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>>>> 2 subsequent made back traces made with
>>>>
>>>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>>>
>>>> shows nearly a similar position around __get_user_pages() - both are attached.
>>>>
>>>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>>>> expected or if it points to an issue.
>>>>
>>>>
>>>> FWIW the last lines of trinity log were :
>>>>
>>>>
>>>> [2031] [94] setsid() = 2031
>>>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>>>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>>>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>>>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>>>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>>>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>>>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>>>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>>>> [2031] [103] sigpending(set=0x40025000) = 0
>>>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>>>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>>>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>>>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>>>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>>>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>>>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>>>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>>>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>>>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>>>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>>>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>>>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>>>> [2031] [117] inotify_init() = 654
>>>> [2031] [118] getgid() = 100
>>>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>>>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>>>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>>>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>>>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>>>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>>>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>>>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>>>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>>>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>>>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>>>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>>>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>>>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>>>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
>>>
>>> Reading your gdb backtraces show that schedule_timeout() got called
>>> with a negative value.
>>> Looks like an integer overflow.
>>> The soft-lockup might also origin from that (very big integer which
>>> did not overflow jet)
>>>
>>
>> If the culprit is solved by this patch I'd like to send it out. But I'm
>> unsure whether it catches the culprit or if it just covers the root cause.
> 
> I fear your Patch will not fix the issue.
> 
> Does the issue only trigger on 32bit UMLs?
No diea, I do only have a 32 bit system here (both host and client).

> How long does it take till trinity hits it?
a command like

$> ssh tfoerste@trinity "rm -rf t3; mkdir t3; cd t3; trinity -C4"

usually needs 10 till 15 min to trigger the issue. With just 1 trinity
task (-C1) however it needs often a hour or more.

> Thanks,
> //richard
> 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-03 19:16         ` Toralf Förster
  0 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-03 19:16 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Richard Weinberger, trinity, UML devel

On 10/03/2013 09:04 PM, Richard Weinberger wrote:
> Am 03.10.2013 20:54, schrieb Toralf Förster:
>> On 10/02/2013 09:55 PM, Richard Weinberger wrote:
>>> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>>>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>>>> yields into this konsole message :
>>>>
>>>>  * Starting local
>>>> net.core.warnings = 0                                                                         [ ok ]
>>>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>>>
>>>>
>>>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>>>> 2 subsequent made back traces made with
>>>>
>>>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>>>
>>>> shows nearly a similar position around __get_user_pages() - both are attached.
>>>>
>>>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>>>> expected or if it points to an issue.
>>>>
>>>>
>>>> FWIW the last lines of trinity log were :
>>>>
>>>>
>>>> [2031] [94] setsid() = 2031
>>>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>>>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>>>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>>>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>>>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>>>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>>>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>>>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>>>> [2031] [103] sigpending(set=0x40025000) = 0
>>>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>>>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>>>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>>>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>>>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>>>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>>>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>>>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>>>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>>>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>>>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>>>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>>>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>>>> [2031] [117] inotify_init() = 654
>>>> [2031] [118] getgid() = 100
>>>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>>>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>>>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>>>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>>>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>>>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>>>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>>>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>>>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>>>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>>>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>>>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>>>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>>>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>>>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
>>>
>>> Reading your gdb backtraces show that schedule_timeout() got called
>>> with a negative value.
>>> Looks like an integer overflow.
>>> The soft-lockup might also origin from that (very big integer which
>>> did not overflow jet)
>>>
>>
>> If the culprit is solved by this patch I'd like to send it out. But I'm
>> unsure whether it catches the culprit or if it just covers the root cause.
> 
> I fear your Patch will not fix the issue.
> 
> Does the issue only trigger on 32bit UMLs?
No diea, I do only have a 32 bit system here (both host and client).

> How long does it take till trinity hits it?
a command like

$> ssh tfoerste@trinity "rm -rf t3; mkdir t3; cd t3; trinity -C4"

usually needs 10 till 15 min to trigger the issue. With just 1 trinity
task (-C1) however it needs often a hour or more.

> Thanks,
> //richard
> 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-03 19:16         ` Toralf Förster
@ 2013-10-03 19:20           ` Richard Weinberger
  -1 siblings, 0 replies; 64+ messages in thread
From: Richard Weinberger @ 2013-10-03 19:20 UTC (permalink / raw)
  To: Toralf Förster; +Cc: UML devel, trinity

Am 03.10.2013 21:16, schrieb Toralf Förster:
> On 10/03/2013 09:04 PM, Richard Weinberger wrote:
>> Am 03.10.2013 20:54, schrieb Toralf Förster:
>>> On 10/02/2013 09:55 PM, Richard Weinberger wrote:
>>>> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>>>>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>>>>> yields into this konsole message :
>>>>>
>>>>>  * Starting local
>>>>> net.core.warnings = 0                                                                         [ ok ]
>>>>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>>>>
>>>>>
>>>>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>>>>> 2 subsequent made back traces made with
>>>>>
>>>>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>>>>
>>>>> shows nearly a similar position around __get_user_pages() - both are attached.
>>>>>
>>>>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>>>>> expected or if it points to an issue.
>>>>>
>>>>>
>>>>> FWIW the last lines of trinity log were :
>>>>>
>>>>>
>>>>> [2031] [94] setsid() = 2031
>>>>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>>>>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>>>>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>>>>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>>>>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>>>>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>>>>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>>>>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>>>>> [2031] [103] sigpending(set=0x40025000) = 0
>>>>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>>>>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>>>>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>>>>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>>>>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>>>>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>>>>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>>>>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>>>>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>>>>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>>>>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>>>>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>>>>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>>>>> [2031] [117] inotify_init() = 654
>>>>> [2031] [118] getgid() = 100
>>>>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>>>>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>>>>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>>>>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>>>>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>>>>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>>>>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>>>>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>>>>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>>>>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>>>>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>>>>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>>>>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>>>>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>>>>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
>>>>
>>>> Reading your gdb backtraces show that schedule_timeout() got called
>>>> with a negative value.
>>>> Looks like an integer overflow.
>>>> The soft-lockup might also origin from that (very big integer which
>>>> did not overflow jet)
>>>>
>>>
>>> If the culprit is solved by this patch I'd like to send it out. But I'm
>>> unsure whether it catches the culprit or if it just covers the root cause.
>>
>> I fear your Patch will not fix the issue.
>>
>> Does the issue only trigger on 32bit UMLs?
> No diea, I do only have a 32 bit system here (both host and client).
> 
>> How long does it take till trinity hits it?
> a command like
> 
> $> ssh tfoerste@trinity "rm -rf t3; mkdir t3; cd t3; trinity -C4"
> 
> usually needs 10 till 15 min to trigger the issue. With just 1 trinity
> task (-C1) however it needs often a hour or more.

That's good. :-)
You can place some printk()s into balance_dirty_pages() and observe the values
of period, max_pause, min_pause, etc...
Maybe this will give us a clue.

So far the issue looks not really UML specific.
But maybe it is more likely to happen on UML because of the slow page faults...

Thanks,
//richard


------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-03 19:20           ` Richard Weinberger
  0 siblings, 0 replies; 64+ messages in thread
From: Richard Weinberger @ 2013-10-03 19:20 UTC (permalink / raw)
  To: Toralf Förster; +Cc: Richard Weinberger, trinity, UML devel

Am 03.10.2013 21:16, schrieb Toralf Förster:
> On 10/03/2013 09:04 PM, Richard Weinberger wrote:
>> Am 03.10.2013 20:54, schrieb Toralf Förster:
>>> On 10/02/2013 09:55 PM, Richard Weinberger wrote:
>>>> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>>>>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>>>>> yields into this konsole message :
>>>>>
>>>>>  * Starting local
>>>>> net.core.warnings = 0                                                                         [ ok ]
>>>>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>>>>
>>>>>
>>>>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>>>>> 2 subsequent made back traces made with
>>>>>
>>>>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>>>>
>>>>> shows nearly a similar position around __get_user_pages() - both are attached.
>>>>>
>>>>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>>>>> expected or if it points to an issue.
>>>>>
>>>>>
>>>>> FWIW the last lines of trinity log were :
>>>>>
>>>>>
>>>>> [2031] [94] setsid() = 2031
>>>>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>>>>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>>>>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>>>>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>>>>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>>>>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>>>>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>>>>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>>>>> [2031] [103] sigpending(set=0x40025000) = 0
>>>>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>>>>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>>>>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>>>>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>>>>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>>>>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>>>>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>>>>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>>>>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>>>>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>>>>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>>>>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>>>>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>>>>> [2031] [117] inotify_init() = 654
>>>>> [2031] [118] getgid() = 100
>>>>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>>>>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>>>>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>>>>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>>>>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>>>>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>>>>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>>>>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>>>>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>>>>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>>>>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>>>>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>>>>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>>>>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>>>>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
>>>>
>>>> Reading your gdb backtraces show that schedule_timeout() got called
>>>> with a negative value.
>>>> Looks like an integer overflow.
>>>> The soft-lockup might also origin from that (very big integer which
>>>> did not overflow jet)
>>>>
>>>
>>> If the culprit is solved by this patch I'd like to send it out. But I'm
>>> unsure whether it catches the culprit or if it just covers the root cause.
>>
>> I fear your Patch will not fix the issue.
>>
>> Does the issue only trigger on 32bit UMLs?
> No diea, I do only have a 32 bit system here (both host and client).
> 
>> How long does it take till trinity hits it?
> a command like
> 
> $> ssh tfoerste@trinity "rm -rf t3; mkdir t3; cd t3; trinity -C4"
> 
> usually needs 10 till 15 min to trigger the issue. With just 1 trinity
> task (-C1) however it needs often a hour or more.

That's good. :-)
You can place some printk()s into balance_dirty_pages() and observe the values
of period, max_pause, min_pause, etc...
Maybe this will give us a clue.

So far the issue looks not really UML specific.
But maybe it is more likely to happen on UML because of the slow page faults...

Thanks,
//richard

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-03 19:20           ` Richard Weinberger
@ 2013-10-03 19:33             ` Toralf Förster
  -1 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-03 19:33 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: UML devel, trinity

On 10/03/2013 09:20 PM, Richard Weinberger wrote:
> Am 03.10.2013 21:16, schrieb Toralf Förster:
>> On 10/03/2013 09:04 PM, Richard Weinberger wrote:
>>> Am 03.10.2013 20:54, schrieb Toralf Förster:
>>>> On 10/02/2013 09:55 PM, Richard Weinberger wrote:
>>>>> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>>>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>>>>>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>>>>>> yields into this konsole message :
>>>>>>
>>>>>>  * Starting local
>>>>>> net.core.warnings = 0                                                                         [ ok ]
>>>>>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>>>>>
>>>>>>
>>>>>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>>>>>> 2 subsequent made back traces made with
>>>>>>
>>>>>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>>>>>
>>>>>> shows nearly a similar position around __get_user_pages() - both are attached.
>>>>>>
>>>>>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>>>>>> expected or if it points to an issue.
>>>>>>
>>>>>>
>>>>>> FWIW the last lines of trinity log were :
>>>>>>
>>>>>>
>>>>>> [2031] [94] setsid() = 2031
>>>>>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>>>>>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>>>>>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>>>>>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>>>>>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>>>>>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>>>>>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>>>>>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>>>>>> [2031] [103] sigpending(set=0x40025000) = 0
>>>>>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>>>>>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>>>>>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>>>>>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>>>>>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>>>>>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>>>>>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>>>>>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>>>>>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>>>>>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>>>>>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>>>>>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>>>>>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>>>>>> [2031] [117] inotify_init() = 654
>>>>>> [2031] [118] getgid() = 100
>>>>>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>>>>>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>>>>>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>>>>>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>>>>>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>>>>>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>>>>>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>>>>>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>>>>>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>>>>>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>>>>>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>>>>>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>>>>>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>>>>>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>>>>>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
>>>>>
>>>>> Reading your gdb backtraces show that schedule_timeout() got called
>>>>> with a negative value.
>>>>> Looks like an integer overflow.
>>>>> The soft-lockup might also origin from that (very big integer which
>>>>> did not overflow jet)
>>>>>
>>>>
>>>> If the culprit is solved by this patch I'd like to send it out. But I'm
>>>> unsure whether it catches the culprit or if it just covers the root cause.
>>>
>>> I fear your Patch will not fix the issue.
>>>
>>> Does the issue only trigger on 32bit UMLs?
>> No diea, I do only have a 32 bit system here (both host and client).
>>
>>> How long does it take till trinity hits it?
>> a command like
>>
>> $> ssh tfoerste@trinity "rm -rf t3; mkdir t3; cd t3; trinity -C4"
>>
>> usually needs 10 till 15 min to trigger the issue. With just 1 trinity
>> task (-C1) however it needs often a hour or more.
> 
> That's good. :-)
> You can place some printk()s into balance_dirty_pages() and observe the values
> of period, max_pause, min_pause, etc...
> Maybe this will give us a clue.

Will do it again. I did it before already but period was the only thing
which becames otherwise negative -
and because pages_dirtied was often in the range of 20 - 30 and perioad
around -3000

I just wondered if HZ is casts as a short int in the calculation ?
BTW printk doesn't showed anything in the syslog (ok, I used
printk(KERN_ERR ...)
BUG_ON however worked fine, but do not give any output ...

> 
> So far the issue looks not really UML specific.
> But maybe it is more likely to happen on UML because of the slow page faults...
> 
> Thanks,
> //richard
> 
> 

-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-03 19:33             ` Toralf Förster
  0 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-03 19:33 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: Richard Weinberger, trinity, UML devel

On 10/03/2013 09:20 PM, Richard Weinberger wrote:
> Am 03.10.2013 21:16, schrieb Toralf Förster:
>> On 10/03/2013 09:04 PM, Richard Weinberger wrote:
>>> Am 03.10.2013 20:54, schrieb Toralf Förster:
>>>> On 10/02/2013 09:55 PM, Richard Weinberger wrote:
>>>>> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>>>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>>>>>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>>>>>> yields into this konsole message :
>>>>>>
>>>>>>  * Starting local
>>>>>> net.core.warnings = 0                                                                         [ ok ]
>>>>>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>>>>>
>>>>>>
>>>>>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>>>>>> 2 subsequent made back traces made with
>>>>>>
>>>>>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>>>>>
>>>>>> shows nearly a similar position around __get_user_pages() - both are attached.
>>>>>>
>>>>>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>>>>>> expected or if it points to an issue.
>>>>>>
>>>>>>
>>>>>> FWIW the last lines of trinity log were :
>>>>>>
>>>>>>
>>>>>> [2031] [94] setsid() = 2031
>>>>>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>>>>>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>>>>>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>>>>>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>>>>>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>>>>>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>>>>>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>>>>>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>>>>>> [2031] [103] sigpending(set=0x40025000) = 0
>>>>>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>>>>>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>>>>>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>>>>>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>>>>>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>>>>>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>>>>>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>>>>>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>>>>>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>>>>>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>>>>>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>>>>>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>>>>>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>>>>>> [2031] [117] inotify_init() = 654
>>>>>> [2031] [118] getgid() = 100
>>>>>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>>>>>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>>>>>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>>>>>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>>>>>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>>>>>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>>>>>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>>>>>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>>>>>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>>>>>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>>>>>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>>>>>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>>>>>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>>>>>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>>>>>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
>>>>>
>>>>> Reading your gdb backtraces show that schedule_timeout() got called
>>>>> with a negative value.
>>>>> Looks like an integer overflow.
>>>>> The soft-lockup might also origin from that (very big integer which
>>>>> did not overflow jet)
>>>>>
>>>>
>>>> If the culprit is solved by this patch I'd like to send it out. But I'm
>>>> unsure whether it catches the culprit or if it just covers the root cause.
>>>
>>> I fear your Patch will not fix the issue.
>>>
>>> Does the issue only trigger on 32bit UMLs?
>> No diea, I do only have a 32 bit system here (both host and client).
>>
>>> How long does it take till trinity hits it?
>> a command like
>>
>> $> ssh tfoerste@trinity "rm -rf t3; mkdir t3; cd t3; trinity -C4"
>>
>> usually needs 10 till 15 min to trigger the issue. With just 1 trinity
>> task (-C1) however it needs often a hour or more.
> 
> That's good. :-)
> You can place some printk()s into balance_dirty_pages() and observe the values
> of period, max_pause, min_pause, etc...
> Maybe this will give us a clue.

Will do it again. I did it before already but period was the only thing
which becames otherwise negative -
and because pages_dirtied was often in the range of 20 - 30 and perioad
around -3000

I just wondered if HZ is casts as a short int in the calculation ?
BTW printk doesn't showed anything in the syslog (ok, I used
printk(KERN_ERR ...)
BUG_ON however worked fine, but do not give any output ...

> 
> So far the issue looks not really UML specific.
> But maybe it is more likely to happen on UML because of the slow page faults...
> 
> Thanks,
> //richard
> 
> 

-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-03 19:33             ` Toralf Förster
@ 2013-10-04  5:52               ` Richard Weinberger
  -1 siblings, 0 replies; 64+ messages in thread
From: Richard Weinberger @ 2013-10-04  5:52 UTC (permalink / raw)
  To: Toralf Förster; +Cc: UML devel, trinity

Am 03.10.2013 21:33, schrieb Toralf Förster:
> On 10/03/2013 09:20 PM, Richard Weinberger wrote:
>> Am 03.10.2013 21:16, schrieb Toralf Förster:
>>> On 10/03/2013 09:04 PM, Richard Weinberger wrote:
>>>> Am 03.10.2013 20:54, schrieb Toralf Förster:
>>>>> On 10/02/2013 09:55 PM, Richard Weinberger wrote:
>>>>>> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>>>>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>>>>>>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>>>>>>> yields into this konsole message :
>>>>>>>
>>>>>>>  * Starting local
>>>>>>> net.core.warnings = 0                                                                         [ ok ]
>>>>>>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>>>>>>
>>>>>>>
>>>>>>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>>>>>>> 2 subsequent made back traces made with
>>>>>>>
>>>>>>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>>>>>>
>>>>>>> shows nearly a similar position around __get_user_pages() - both are attached.
>>>>>>>
>>>>>>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>>>>>>> expected or if it points to an issue.
>>>>>>>
>>>>>>>
>>>>>>> FWIW the last lines of trinity log were :
>>>>>>>
>>>>>>>
>>>>>>> [2031] [94] setsid() = 2031
>>>>>>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>>>>>>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>>>>>>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>>>>>>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>>>>>>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>>>>>>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>>>>>>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>>>>>>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>>>>>>> [2031] [103] sigpending(set=0x40025000) = 0
>>>>>>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>>>>>>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>>>>>>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>>>>>>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>>>>>>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>>>>>>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>>>>>>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>>>>>>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>>>>>>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>>>>>>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>>>>>>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>>>>>>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>>>>>>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>>>>>>> [2031] [117] inotify_init() = 654
>>>>>>> [2031] [118] getgid() = 100
>>>>>>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>>>>>>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>>>>>>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>>>>>>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>>>>>>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>>>>>>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>>>>>>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>>>>>>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>>>>>>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>>>>>>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>>>>>>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>>>>>>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>>>>>>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>>>>>>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>>>>>>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
>>>>>>
>>>>>> Reading your gdb backtraces show that schedule_timeout() got called
>>>>>> with a negative value.
>>>>>> Looks like an integer overflow.
>>>>>> The soft-lockup might also origin from that (very big integer which
>>>>>> did not overflow jet)
>>>>>>
>>>>>
>>>>> If the culprit is solved by this patch I'd like to send it out. But I'm
>>>>> unsure whether it catches the culprit or if it just covers the root cause.
>>>>
>>>> I fear your Patch will not fix the issue.
>>>>
>>>> Does the issue only trigger on 32bit UMLs?
>>> No diea, I do only have a 32 bit system here (both host and client).
>>>
>>>> How long does it take till trinity hits it?
>>> a command like
>>>
>>> $> ssh tfoerste@trinity "rm -rf t3; mkdir t3; cd t3; trinity -C4"
>>>
>>> usually needs 10 till 15 min to trigger the issue. With just 1 trinity
>>> task (-C1) however it needs often a hour or more.
>>
>> That's good. :-)
>> You can place some printk()s into balance_dirty_pages() and observe the values
>> of period, max_pause, min_pause, etc...
>> Maybe this will give us a clue.
> 
> Will do it again. I did it before already but period was the only thing
> which becames otherwise negative -
> and because pages_dirtied was often in the range of 20 - 30 and perioad
> around -3000

Did you print it before or after
                if (current->dirty_paused_when)
                        pause -= now - current->dirty_paused_when;

I'm also wondering why
		if (pause < min_pause) {
did not trigger or did it?

> I just wondered if HZ is casts as a short int in the calculation ?
> BTW printk doesn't showed anything in the syslog (ok, I used
> printk(KERN_ERR ...)

syslog or dmesg?

Thanks,
//richard

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-04  5:52               ` Richard Weinberger
  0 siblings, 0 replies; 64+ messages in thread
From: Richard Weinberger @ 2013-10-04  5:52 UTC (permalink / raw)
  To: Toralf Förster; +Cc: trinity, UML devel

Am 03.10.2013 21:33, schrieb Toralf Förster:
> On 10/03/2013 09:20 PM, Richard Weinberger wrote:
>> Am 03.10.2013 21:16, schrieb Toralf Förster:
>>> On 10/03/2013 09:04 PM, Richard Weinberger wrote:
>>>> Am 03.10.2013 20:54, schrieb Toralf Förster:
>>>>> On 10/02/2013 09:55 PM, Richard Weinberger wrote:
>>>>>> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>>>>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>>>>>>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>>>>>>> yields into this konsole message :
>>>>>>>
>>>>>>>  * Starting local
>>>>>>> net.core.warnings = 0                                                                         [ ok ]
>>>>>>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>>>>>>
>>>>>>>
>>>>>>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>>>>>>> 2 subsequent made back traces made with
>>>>>>>
>>>>>>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>>>>>>
>>>>>>> shows nearly a similar position around __get_user_pages() - both are attached.
>>>>>>>
>>>>>>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>>>>>>> expected or if it points to an issue.
>>>>>>>
>>>>>>>
>>>>>>> FWIW the last lines of trinity log were :
>>>>>>>
>>>>>>>
>>>>>>> [2031] [94] setsid() = 2031
>>>>>>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>>>>>>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>>>>>>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>>>>>>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>>>>>>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>>>>>>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>>>>>>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>>>>>>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>>>>>>> [2031] [103] sigpending(set=0x40025000) = 0
>>>>>>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>>>>>>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>>>>>>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>>>>>>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>>>>>>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>>>>>>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>>>>>>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>>>>>>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>>>>>>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>>>>>>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>>>>>>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>>>>>>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>>>>>>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>>>>>>> [2031] [117] inotify_init() = 654
>>>>>>> [2031] [118] getgid() = 100
>>>>>>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>>>>>>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>>>>>>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>>>>>>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>>>>>>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>>>>>>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>>>>>>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>>>>>>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>>>>>>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>>>>>>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>>>>>>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>>>>>>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>>>>>>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>>>>>>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>>>>>>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
>>>>>>
>>>>>> Reading your gdb backtraces show that schedule_timeout() got called
>>>>>> with a negative value.
>>>>>> Looks like an integer overflow.
>>>>>> The soft-lockup might also origin from that (very big integer which
>>>>>> did not overflow jet)
>>>>>>
>>>>>
>>>>> If the culprit is solved by this patch I'd like to send it out. But I'm
>>>>> unsure whether it catches the culprit or if it just covers the root cause.
>>>>
>>>> I fear your Patch will not fix the issue.
>>>>
>>>> Does the issue only trigger on 32bit UMLs?
>>> No diea, I do only have a 32 bit system here (both host and client).
>>>
>>>> How long does it take till trinity hits it?
>>> a command like
>>>
>>> $> ssh tfoerste@trinity "rm -rf t3; mkdir t3; cd t3; trinity -C4"
>>>
>>> usually needs 10 till 15 min to trigger the issue. With just 1 trinity
>>> task (-C1) however it needs often a hour or more.
>>
>> That's good. :-)
>> You can place some printk()s into balance_dirty_pages() and observe the values
>> of period, max_pause, min_pause, etc...
>> Maybe this will give us a clue.
> 
> Will do it again. I did it before already but period was the only thing
> which becames otherwise negative -
> and because pages_dirtied was often in the range of 20 - 30 and perioad
> around -3000

Did you print it before or after
                if (current->dirty_paused_when)
                        pause -= now - current->dirty_paused_when;

I'm also wondering why
		if (pause < min_pause) {
did not trigger or did it?

> I just wondered if HZ is casts as a short int in the calculation ?
> BTW printk doesn't showed anything in the syslog (ok, I used
> printk(KERN_ERR ...)

syslog or dmesg?

Thanks,
//richard

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-04  5:52               ` Richard Weinberger
@ 2013-10-06 14:17                 ` Toralf Förster
  -1 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-06 14:17 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: UML devel, trinity

On 10/04/2013 07:52 AM, Richard Weinberger wrote:
> Am 03.10.2013 21:33, schrieb Toralf Förster:
>> On 10/03/2013 09:20 PM, Richard Weinberger wrote:
>>> Am 03.10.2013 21:16, schrieb Toralf Förster:
>>>> On 10/03/2013 09:04 PM, Richard Weinberger wrote:
>>>>> Am 03.10.2013 20:54, schrieb Toralf Förster:
>>>>>> On 10/02/2013 09:55 PM, Richard Weinberger wrote:
>>>>>>> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>>>>>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>>>>>>>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>>>>>>>> yields into this konsole message :
>>>>>>>>
>>>>>>>>  * Starting local
>>>>>>>> net.core.warnings = 0                                                                         [ ok ]
>>>>>>>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>>>>>>>
>>>>>>>>
>>>>>>>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>>>>>>>> 2 subsequent made back traces made with
>>>>>>>>
>>>>>>>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>>>>>>>
>>>>>>>> shows nearly a similar position around __get_user_pages() - both are attached.
>>>>>>>>
>>>>>>>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>>>>>>>> expected or if it points to an issue.
>>>>>>>>
>>>>>>>>
>>>>>>>> FWIW the last lines of trinity log were :
>>>>>>>>
>>>>>>>>
>>>>>>>> [2031] [94] setsid() = 2031
>>>>>>>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>>>>>>>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>>>>>>>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>>>>>>>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>>>>>>>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>>>>>>>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>>>>>>>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>>>>>>>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>>>>>>>> [2031] [103] sigpending(set=0x40025000) = 0
>>>>>>>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>>>>>>>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>>>>>>>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>>>>>>>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>>>>>>>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>>>>>>>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>>>>>>>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>>>>>>>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>>>>>>>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>>>>>>>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>>>>>>>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>>>>>>>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>>>>>>>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>>>>>>>> [2031] [117] inotify_init() = 654
>>>>>>>> [2031] [118] getgid() = 100
>>>>>>>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>>>>>>>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>>>>>>>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>>>>>>>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>>>>>>>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>>>>>>>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>>>>>>>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>>>>>>>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>>>>>>>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>>>>>>>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>>>>>>>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>>>>>>>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>>>>>>>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>>>>>>>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>>>>>>>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
>>>>>>>
>>>>>>> Reading your gdb backtraces show that schedule_timeout() got called
>>>>>>> with a negative value.
>>>>>>> Looks like an integer overflow.
>>>>>>> The soft-lockup might also origin from that (very big integer which
>>>>>>> did not overflow jet)
>>>>>>>
>>>>>>
>>>>>> If the culprit is solved by this patch I'd like to send it out. But I'm
>>>>>> unsure whether it catches the culprit or if it just covers the root cause.
>>>>>
>>>>> I fear your Patch will not fix the issue.
>>>>>
>>>>> Does the issue only trigger on 32bit UMLs?
>>>> No diea, I do only have a 32 bit system here (both host and client).
>>>>
>>>>> How long does it take till trinity hits it?
>>>> a command like
>>>>
>>>> $> ssh tfoerste@trinity "rm -rf t3; mkdir t3; cd t3; trinity -C4"
>>>>
>>>> usually needs 10 till 15 min to trigger the issue. With just 1 trinity
>>>> task (-C1) however it needs often a hour or more.
>>>
>>> That's good. :-)
>>> You can place some printk()s into balance_dirty_pages() and observe the values
>>> of period, max_pause, min_pause, etc...
>>> Maybe this will give us a clue.
>>
>> Will do it again. I did it before already but period was the only thing
>> which becames otherwise negative -
>> and because pages_dirtied was often in the range of 20 - 30 and perioad
>> around -3000
> 
> Did you print it before or after
>                 if (current->dirty_paused_when)
>                         pause -= now - current->dirty_paused_when;
> 
> I'm also wondering why
> 		if (pause < min_pause) {
> did not trigger or did it?
> 
>> I just wondered if HZ is casts as a short int in the calculation ?
>> BTW printk doesn't showed anything in the syslog (ok, I used
>> printk(KERN_ERR ...)
> 
> syslog or dmesg?
> 
> Thanks,
> //richard
> 

back from going a mushroom foray - I nearly surrounded all "pause=" statements
with BUG_ON and downgrades trinity (b/c their latest enhancements made trinity too weak)

The UML stopped here :
...
		if (unlikely(task_ratelimit == 0)) {
			period = max_pause;
			pause = max_pause;
			BUG_ON(pause < 0);
			goto pause;
		}
		BUG_ON(pages_dirtied < 0);
		BUG_ON(task_ratelimit < 0);
		period = HZ * pages_dirtied / task_ratelimit;
		BUG_ON(period < 0);         <----------------------here
		pause = period;
		BUG_ON(pause < 0);
		if (current->dirty_paused_when)
			pause -= now - current->dirty_paused_when;
		BUG_ON(pause < 0);
...

The back trace is :

tfoerste@n22 ~/devel/trinity $ gdb --core=/mnt/ramdisk/core /home/tfoerste/devel/linux/linux -batch -ex bt[New LWP 2522]
Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
Program terminated with signal 6, Aborted.
#0  0xb777a424 in __kernel_vsyscall ()
#0  0xb777a424 in __kernel_vsyscall ()
#1  0x083b33b5 in kill ()
#2  0x0807190d in uml_abort () at arch/um/os-Linux/util.c:93
#3  0x08071c45 in os_dump_core () at arch/um/os-Linux/util.c:148
#4  0x08061417 in panic_exit (self=0x85b9558 <panic_exit_notifier>, unused1=0, unused2=0x85ef720 <buf.16084>) at arch/um/kernel/um_arch.c:240
#5  0x0809a7d8 in notifier_call_chain (nl=0x0, val=0, v=0x85ef720 <buf.16084>, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
#6  0x0809a923 in __atomic_notifier_call_chain (nr_calls=<optimized out>, nr_to_call=<optimized out>, v=<optimized out>, val=<optimized out>, nh=<optimized out>) at kernel/notifier.c:182
#7  atomic_notifier_call_chain (nh=0x85ef704 <panic_notifier_list>, val=0, v=0x85ef720 <buf.16084>) at kernel/notifier.c:191
#8  0x08410d1c in panic (fmt=0x0) at kernel/panic.c:130
#9  0x08411c64 in balance_dirty_pages (pages_dirtied=9, mapping=<optimized out>) at mm/page-writeback.c:1471
#10 0x080d1ce4 in balance_dirty_pages_ratelimited (mapping=0x6) at mm/page-writeback.c:1663
#11 0x080e2d0c in __do_fault (mm=0x4870da00, vma=0x487484c8, address=1095131136, pmd=0x0, pgoff=0, flags=1212323992, orig_pte=<incomplete type>) at mm/memory.c:3452
#12 0x080e5286 in do_nonlinear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3518
#13 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, pte=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3717
#14 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
#15 handle_mm_fault (mm=0x4870da00, vma=0x487484c8, address=1095131136, flags=1) at mm/memory.c:3868
#16 0x080e5a07 in __get_user_pages (tsk=0x47d5a400, mm=0x4870da00, start=1095131136, nr_pages=1025, gup_flags=519, pages=0x47c50000, vmas=0x0, nonblocking=0x0) at mm/memory.c:1822
#17 0x080e5cc3 in get_user_pages (tsk=0x0, mm=0x0, start=0, nr_pages=0, write=1, force=0, pages=0x47fafb98, vmas=0x6) at mm/memory.c:2019
#18 0x08140d0e in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:340
#19 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:605
#20 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at fs/aio.c:1122
#21 SyS_io_setup (nr_events=65535, ctxp=135045120) at fs/aio.c:1105
#22 0x080619c2 in handle_syscall (r=0x47d5a5d4) at arch/um/kernel/skas/syscall.c:35
#23 0x08073f2d in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
#24 userspace (regs=0x47d5a5d4) at arch/um/os-Linux/skas/process.c:431
#25 0x0805e6ac in fork_handler () at arch/um/kernel/process.c:160
#26 0x00000000 in ?? ()

I'll try again to see if printk's will help - how do I force a sync after a printk ?

-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-06 14:17                 ` Toralf Förster
  0 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-06 14:17 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: trinity, UML devel

On 10/04/2013 07:52 AM, Richard Weinberger wrote:
> Am 03.10.2013 21:33, schrieb Toralf Förster:
>> On 10/03/2013 09:20 PM, Richard Weinberger wrote:
>>> Am 03.10.2013 21:16, schrieb Toralf Förster:
>>>> On 10/03/2013 09:04 PM, Richard Weinberger wrote:
>>>>> Am 03.10.2013 20:54, schrieb Toralf Förster:
>>>>>> On 10/02/2013 09:55 PM, Richard Weinberger wrote:
>>>>>>> On Wed, Oct 2, 2013 at 8:30 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>>>>>> Running trinity (1 process, no victim files, just "$>trinity -C1) for a longer time
>>>>>>>> within a 32 bit user mode linux image with a recent git kernel (host: 3.11.3 guest 3.12-rc3-g...)
>>>>>>>> yields into this konsole message :
>>>>>>>>
>>>>>>>>  * Starting local
>>>>>>>> net.core.warnings = 0                                                                         [ ok ]
>>>>>>>> BUG: soft lockup - CPU#0 stuck for 23s! [trinity-child0:2031]
>>>>>>>>
>>>>>>>>
>>>>>>>> and at the host t1 of the "linux"-processes eats all CPU cycles at 1 CPU core.
>>>>>>>> 2 subsequent made back traces made with
>>>>>>>>
>>>>>>>> $> sudo gdb /home/tfoerste/devel/linux/linux 28144 -n -batch -ex bt
>>>>>>>>
>>>>>>>> shows nearly a similar position around __get_user_pages() - both are attached.
>>>>>>>>
>>>>>>>> I'm not surprised that trinity harms a systems - I'm just wondering whether this particular picture is
>>>>>>>> expected or if it points to an issue.
>>>>>>>>
>>>>>>>>
>>>>>>>> FWIW the last lines of trinity log were :
>>>>>>>>
>>>>>>>>
>>>>>>>> [2031] [94] setsid() = 2031
>>>>>>>> [2031] [95] setresgid(rgid=0xffff33e3, egid=0xffffff93, sgid=0x22000040) = -1 (Operation not permitted)
>>>>>>>> [2031] [96] vmsplice(fd=5, iov=0x85501e0, nr_segs=300, flags=9) = 0x3000
>>>>>>>> [2031] [97] setresuid(ruid=0x80549193, euid=0xc61041e0, suid=0xff19b6fa) = -1 (Operation not permitted)
>>>>>>>> [2031] [98] setpriority(which=0xff010000, who=0xf3737373, niceval=0x8088960c) = -1 (Invalid argument)
>>>>>>>> [2031] [99] socketcall(call=1, args=0x8550200) = -1 (Address family not supported by protocol)
>>>>>>>> [2031] [100] access(filename="�", mode=2017) = -1 (Invalid argument)
>>>>>>>> [2031] [101] getgroups(gidsetsize=0, grouplist=0x80d0000[page_rand]) = 3
>>>>>>>> [2031] [102] msync(start=0xc0100220, len=0, flags=3) = -1 (Invalid argument)
>>>>>>>> [2031] [103] sigpending(set=0x40025000) = 0
>>>>>>>> [2031] [104] signalfd4(ufd=383, user_mask=1, sizemask=0xa4200000, flags=0x80800) = -1 (Invalid argument)
>>>>>>>> [2031] [105] sendfile(out_fd=383, in_fd=382, offset=0, count=4096) = -1 (Invalid argument)
>>>>>>>> [2031] [106] fanotify_mark(fanotify_fd=382, flags=5, mask=0x8000023, dfd=382, pathname="/proc/1092/task/1092/fdinfo/68") = -1 (Invalid argument)
>>>>>>>> [2031] [107] wait4(upid=1, stat_addr=4, options=0xd761979b, ru=8) = -1 (Invalid argument)
>>>>>>>> [2031] [108] sigpending(set=0x80ca000[page_zeros]) = 0
>>>>>>>> [2031] [109] setresuid(ruid=0xefffd6fc, euid=0x1bf4c92f, suid=0xffff2e33) = -1 (Operation not permitted)
>>>>>>>> [2031] [110] munlock(addr=0x40025000, len=34) = 0
>>>>>>>> [2031] [111] timer_delete(timer_id=0xffffffdc) = -1 (Invalid argument)
>>>>>>>> [2031] [112] sched_get_priority_max(policy=0x10000040) = -1 (Invalid argument)
>>>>>>>> [2031] [113] syslog(type=0xc1000000, buf=1, len=0x82a5) = -1 (Operation not permitted)
>>>>>>>> [2031] [114] setpriority(which=0xc4c806c6, who=0xffffff01, niceval=0xffff0682) = -1 (Invalid argument)
>>>>>>>> [2031] [115] getgroups16(gidsetsize=0xfffe, grouplist=1) = -1 (Bad address)
>>>>>>>> [2031] [116] rename(oldname=4, newname=8) = -1 (Bad address)
>>>>>>>> [2031] [117] inotify_init() = 654
>>>>>>>> [2031] [118] getgid() = 100
>>>>>>>> [2031] [119] fstatat64(dfd=382, filename="/sys/devices/virtual/net/sit0/duplex", statbuf=0, flag=0xb545d727) = -1 (Invalid argument)
>>>>>>>> [2031] [120] unlinkat(dfd=382, pathname="/proc/sys/net/ipv4/neigh/default/retrans_time", flag=0xc00ef76) = -1 (Invalid argument)
>>>>>>>> [2031] [121] timerfd_create(clockid=0, flags=0) = 655
>>>>>>>> [2031] [122] munlock(addr=4, len=0x3fff) = -1 (Cannot allocate memory)
>>>>>>>> [2031] [123] fremovexattr(fd=382, name=0) = -1 (Bad address)
>>>>>>>> [2031] [124] sched_get_priority_min(policy=0xff58bfef) = -1 (Invalid argument)
>>>>>>>> [2031] [125] mq_timedreceive(mqdes=397, u_msg_ptr=4, msg_len=5245, u_msg_prio=0xc0100220, u_abs_timeout=0xc0100220) = -1 (Bad address)
>>>>>>>> [2031] [126] chdir(filename="/proc/116/net/ptype") = -1 (Not a directory)
>>>>>>>> [2031] [127] ssetmask(newmask=0x88000092) = 0
>>>>>>>> [2031] [128] statfs(pathname="/proc/6/mounts", buf=0) = -1 (Bad address)
>>>>>>>> [2031] [129] fchown16(fd=397, user=104, group=0x94100000) = -1 (Operation not permitted)
>>>>>>>> [2031] [130] fchdir(fd=397) = -1 (Not a directory)
>>>>>>>> [2031] [131] mkdir(pathname="/proc/1092/task/1092/fdinfo/316", mode=525) = -1 (File exists)
>>>>>>>> [2031] [132] fsetxattr(fd=386, name=0x856f158, value=0x8571160, size=0, flags=0) = -1 (Numerical result out of range)
>>>>>>>> [2031] [133] io_setup(nr_events=4095, ctxp=0x40266000) ^CKilled by signal 2.
>>>>>>>
>>>>>>> Reading your gdb backtraces show that schedule_timeout() got called
>>>>>>> with a negative value.
>>>>>>> Looks like an integer overflow.
>>>>>>> The soft-lockup might also origin from that (very big integer which
>>>>>>> did not overflow jet)
>>>>>>>
>>>>>>
>>>>>> If the culprit is solved by this patch I'd like to send it out. But I'm
>>>>>> unsure whether it catches the culprit or if it just covers the root cause.
>>>>>
>>>>> I fear your Patch will not fix the issue.
>>>>>
>>>>> Does the issue only trigger on 32bit UMLs?
>>>> No diea, I do only have a 32 bit system here (both host and client).
>>>>
>>>>> How long does it take till trinity hits it?
>>>> a command like
>>>>
>>>> $> ssh tfoerste@trinity "rm -rf t3; mkdir t3; cd t3; trinity -C4"
>>>>
>>>> usually needs 10 till 15 min to trigger the issue. With just 1 trinity
>>>> task (-C1) however it needs often a hour or more.
>>>
>>> That's good. :-)
>>> You can place some printk()s into balance_dirty_pages() and observe the values
>>> of period, max_pause, min_pause, etc...
>>> Maybe this will give us a clue.
>>
>> Will do it again. I did it before already but period was the only thing
>> which becames otherwise negative -
>> and because pages_dirtied was often in the range of 20 - 30 and perioad
>> around -3000
> 
> Did you print it before or after
>                 if (current->dirty_paused_when)
>                         pause -= now - current->dirty_paused_when;
> 
> I'm also wondering why
> 		if (pause < min_pause) {
> did not trigger or did it?
> 
>> I just wondered if HZ is casts as a short int in the calculation ?
>> BTW printk doesn't showed anything in the syslog (ok, I used
>> printk(KERN_ERR ...)
> 
> syslog or dmesg?
> 
> Thanks,
> //richard
> 

back from going a mushroom foray - I nearly surrounded all "pause=" statements
with BUG_ON and downgrades trinity (b/c their latest enhancements made trinity too weak)

The UML stopped here :
...
		if (unlikely(task_ratelimit == 0)) {
			period = max_pause;
			pause = max_pause;
			BUG_ON(pause < 0);
			goto pause;
		}
		BUG_ON(pages_dirtied < 0);
		BUG_ON(task_ratelimit < 0);
		period = HZ * pages_dirtied / task_ratelimit;
		BUG_ON(period < 0);         <----------------------here
		pause = period;
		BUG_ON(pause < 0);
		if (current->dirty_paused_when)
			pause -= now - current->dirty_paused_when;
		BUG_ON(pause < 0);
...

The back trace is :

tfoerste@n22 ~/devel/trinity $ gdb --core=/mnt/ramdisk/core /home/tfoerste/devel/linux/linux -batch -ex bt[New LWP 2522]
Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
Program terminated with signal 6, Aborted.
#0  0xb777a424 in __kernel_vsyscall ()
#0  0xb777a424 in __kernel_vsyscall ()
#1  0x083b33b5 in kill ()
#2  0x0807190d in uml_abort () at arch/um/os-Linux/util.c:93
#3  0x08071c45 in os_dump_core () at arch/um/os-Linux/util.c:148
#4  0x08061417 in panic_exit (self=0x85b9558 <panic_exit_notifier>, unused1=0, unused2=0x85ef720 <buf.16084>) at arch/um/kernel/um_arch.c:240
#5  0x0809a7d8 in notifier_call_chain (nl=0x0, val=0, v=0x85ef720 <buf.16084>, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
#6  0x0809a923 in __atomic_notifier_call_chain (nr_calls=<optimized out>, nr_to_call=<optimized out>, v=<optimized out>, val=<optimized out>, nh=<optimized out>) at kernel/notifier.c:182
#7  atomic_notifier_call_chain (nh=0x85ef704 <panic_notifier_list>, val=0, v=0x85ef720 <buf.16084>) at kernel/notifier.c:191
#8  0x08410d1c in panic (fmt=0x0) at kernel/panic.c:130
#9  0x08411c64 in balance_dirty_pages (pages_dirtied=9, mapping=<optimized out>) at mm/page-writeback.c:1471
#10 0x080d1ce4 in balance_dirty_pages_ratelimited (mapping=0x6) at mm/page-writeback.c:1663
#11 0x080e2d0c in __do_fault (mm=0x4870da00, vma=0x487484c8, address=1095131136, pmd=0x0, pgoff=0, flags=1212323992, orig_pte=<incomplete type>) at mm/memory.c:3452
#12 0x080e5286 in do_nonlinear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3518
#13 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, pte=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3717
#14 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
#15 handle_mm_fault (mm=0x4870da00, vma=0x487484c8, address=1095131136, flags=1) at mm/memory.c:3868
#16 0x080e5a07 in __get_user_pages (tsk=0x47d5a400, mm=0x4870da00, start=1095131136, nr_pages=1025, gup_flags=519, pages=0x47c50000, vmas=0x0, nonblocking=0x0) at mm/memory.c:1822
#17 0x080e5cc3 in get_user_pages (tsk=0x0, mm=0x0, start=0, nr_pages=0, write=1, force=0, pages=0x47fafb98, vmas=0x6) at mm/memory.c:2019
#18 0x08140d0e in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:340
#19 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:605
#20 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at fs/aio.c:1122
#21 SyS_io_setup (nr_events=65535, ctxp=135045120) at fs/aio.c:1105
#22 0x080619c2 in handle_syscall (r=0x47d5a5d4) at arch/um/kernel/skas/syscall.c:35
#23 0x08073f2d in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
#24 userspace (regs=0x47d5a5d4) at arch/um/os-Linux/skas/process.c:431
#25 0x0805e6ac in fork_handler () at arch/um/kernel/process.c:160
#26 0x00000000 in ?? ()

I'll try again to see if printk's will help - how do I force a sync after a printk ?

-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-06 14:17                 ` Toralf Förster
@ 2013-10-06 18:38                   ` Geert Uytterhoeven
  -1 siblings, 0 replies; 64+ messages in thread
From: Geert Uytterhoeven @ 2013-10-06 18:38 UTC (permalink / raw)
  To: Toralf Förster; +Cc: Richard Weinberger, UML devel, trinity

On Sun, Oct 6, 2013 at 4:17 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
> The UML stopped here :
> ...
>                 if (unlikely(task_ratelimit == 0)) {
>                         period = max_pause;
>                         pause = max_pause;
>                         BUG_ON(pause < 0);
>                         goto pause;
>                 }
>                 BUG_ON(pages_dirtied < 0);
>                 BUG_ON(task_ratelimit < 0);
>                 period = HZ * pages_dirtied / task_ratelimit;
>                 BUG_ON(period < 0);         <----------------------here

So pages_dirtied becomes that big compared to task_ratelimit (both are
"unsigned long"), that period (which is "long", just like "pause") overflows
into a negative number.

This is indeed much more likely to happen on 32-bit.

> The back trace is :

> #9  0x08411c64 in balance_dirty_pages (pages_dirtied=9, mapping=<optimized out>) at mm/page-writeback.c:1471

But here pages_dirtied is only 9??

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-06 18:38                   ` Geert Uytterhoeven
  0 siblings, 0 replies; 64+ messages in thread
From: Geert Uytterhoeven @ 2013-10-06 18:38 UTC (permalink / raw)
  To: Toralf Förster; +Cc: Richard Weinberger, UML devel, trinity

On Sun, Oct 6, 2013 at 4:17 PM, Toralf FÃ¶rster <toralf.foerster@gmx.de> wrote:
> The UML stopped here :
> ...
>                 if (unlikely(task_ratelimit == 0)) {
>                         period = max_pause;
>                         pause = max_pause;
>                         BUG_ON(pause < 0);
>                         goto pause;
>                 }
>                 BUG_ON(pages_dirtied < 0);
>                 BUG_ON(task_ratelimit < 0);
>                 period = HZ * pages_dirtied / task_ratelimit;
>                 BUG_ON(period < 0);         <----------------------here

So pages_dirtied becomes that big compared to task_ratelimit (both are
"unsigned long"), that period (which is "long", just like "pause") overflows
into a negative number.

This is indeed much more likely to happen on 32-bit.

> The back trace is :

> #9  0x08411c64 in balance_dirty_pages (pages_dirtied=9, mapping=<optimized out>) at mm/page-writeback.c:1471

But here pages_dirtied is only 9??

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-06 18:38                   ` Geert Uytterhoeven
  (?)
@ 2013-10-06 20:08                   ` Toralf Förster
  2013-10-06 20:26                     ` Geert Uytterhoeven
  -1 siblings, 1 reply; 64+ messages in thread
From: Toralf Förster @ 2013-10-06 20:08 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Richard Weinberger, UML devel, trinity

On 10/06/2013 08:38 PM, Geert Uytterhoeven wrote:
> On Sun, Oct 6, 2013 at 4:17 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>> The UML stopped here :
>> ...
>>                 if (unlikely(task_ratelimit == 0)) {
>>                         period = max_pause;
>>                         pause = max_pause;
>>                         BUG_ON(pause < 0);
>>                         goto pause;
>>                 }
>>                 BUG_ON(pages_dirtied < 0);
>>                 BUG_ON(task_ratelimit < 0);
>>                 period = HZ * pages_dirtied / task_ratelimit;
>>                 BUG_ON(period < 0);         <----------------------here
> 
> So pages_dirtied becomes that big compared to task_ratelimit (both are
> "unsigned long"), that period (which is "long", just like "pause") overflows
> into a negative number.
> 
> This is indeed much more likely to happen on 32-bit.
> 
>> The back trace is :
> 
>> #9  0x08411c64 in balance_dirty_pages (pages_dirtied=9, mapping=<optimized out>) at mm/page-writeback.c:1471
> 
> But here pages_dirtied is only 9??
> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
> 
Well, this points to an overflow or ? :

tfoerste@n22 ~/devel/linux $ nl -ba mm/page-writeback.c | grep -A 5 -B 5 1468
  1463                          BUG_ON(pause < 0);
  1464                          goto pause;
  1465                  }
  1466                  period = HZ * pages_dirtied / task_ratelimit;
  1467                  pause = period;
  1468                  BUG_ON(pause < 0 && pages_dirtied > 0 && task_ratelimit > 0);
  1469                  if (current->dirty_paused_when)
  1470                          pause -= now - current->dirty_paused_when;
  1471                  /*
  1472                   * For less than 1s think time (ext3/4 may block the dirtier
  1473                   * for up to 800ms from time to time on 1-HDD; so does xfs,


and the back trace is :

tfoerste@n22 ~/devel/linux $ gdb --core=/mnt/ramdisk/core /home/tfoerste/devel/linux/linux -batch -ex bt
[New LWP 13163]
Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
Program terminated with signal 6, Aborted.
#0  0xb77d2424 in __kernel_vsyscall ()
#0  0xb77d2424 in __kernel_vsyscall ()
#1  0x083b33b5 in kill ()
#2  0x0807190d in uml_abort () at arch/um/os-Linux/util.c:93
#3  0x08071c45 in os_dump_core () at arch/um/os-Linux/util.c:148
#4  0x08061417 in panic_exit (self=0x85b9558 <panic_exit_notifier>, unused1=0, unused2=0x85ef720 <buf.16084>) at arch/um/kernel/um_arch.c:240
#5  0x0809a7d8 in notifier_call_chain (nl=0x0, val=0, v=0x85ef720 <buf.16084>, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
#6  0x0809a923 in __atomic_notifier_call_chain (nr_calls=<optimized out>, nr_to_call=<optimized out>, v=<optimized out>, val=<optimized out>, nh=<optimized out>) at kernel/notifier.c:182
#7  atomic_notifier_call_chain (nh=0x85ef704 <panic_notifier_list>, val=0, v=0x85ef720 <buf.16084>) at kernel/notifier.c:191
#8  0x08410d1c in panic (fmt=0x0) at kernel/panic.c:130
#9  0x08411c6c in balance_dirty_pages (pages_dirtied=0, mapping=<optimized out>) at mm/page-writeback.c:1468
#10 0x080d1ce4 in balance_dirty_pages_ratelimited (mapping=0x6) at mm/page-writeback.c:1657
#11 0x080e2d0c in __do_fault (mm=0x47b09600, vma=0x48bc9e58, address=1082572800, pmd=0x0, pgoff=0, flags=1167616488, orig_pte=<incomplete type>) at mm/memory.c:3452
#12 0x080e5286 in do_nonlinear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3518
#13 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, pte=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3717
#14 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
#15 handle_mm_fault (mm=0x47b09600, vma=0x48bc9e58, address=1082572800, flags=1) at mm/memory.c:3868
#16 0x080e5a07 in __get_user_pages (tsk=0x47a3ea00, mm=0x47b09600, start=1082572800, nr_pages=962, gup_flags=519, pages=0x47b96120, vmas=0x0, nonblocking=0x0) at mm/memory.c:1822
#17 0x080e5cc3 in get_user_pages (tsk=0x0, mm=0x0, start=0, nr_pages=0, write=1, force=0, pages=0x4789fb9c, vmas=0x6) at mm/memory.c:2019
#18 0x08140d0e in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:340
#19 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:605
#20 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at fs/aio.c:1122
#21 SyS_io_setup (nr_events=-2147422135, ctxp=135045120) at fs/aio.c:1105
#22 0x080619c2 in handle_syscall (r=0x47a3ebd4) at arch/um/kernel/skas/syscall.c:35
#23 0x08073f2d in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
#24 userspace (regs=0x47a3ebd4) at arch/um/os-Linux/skas/process.c:431
#25 0x0805e6ac in fork_handler () at arch/um/kernel/process.c:160
#26 0x5a5a5a5a in ?? ()

-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-06 20:08                   ` Toralf Förster
@ 2013-10-06 20:26                     ` Geert Uytterhoeven
  2013-10-06 21:01                       ` Toralf Förster
  2013-10-08 19:56                       ` Toralf Förster
  0 siblings, 2 replies; 64+ messages in thread
From: Geert Uytterhoeven @ 2013-10-06 20:26 UTC (permalink / raw)
  To: Toralf Förster; +Cc: Richard Weinberger, UML devel, trinity

On Sun, Oct 6, 2013 at 10:08 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
> On 10/06/2013 08:38 PM, Geert Uytterhoeven wrote:
>> On Sun, Oct 6, 2013 at 4:17 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>> The UML stopped here :
>>> ...
>>>                 if (unlikely(task_ratelimit == 0)) {
>>>                         period = max_pause;
>>>                         pause = max_pause;
>>>                         BUG_ON(pause < 0);
>>>                         goto pause;
>>>                 }
>>>                 BUG_ON(pages_dirtied < 0);
>>>                 BUG_ON(task_ratelimit < 0);
>>>                 period = HZ * pages_dirtied / task_ratelimit;
>>>                 BUG_ON(period < 0);         <----------------------here
>>
>> So pages_dirtied becomes that big compared to task_ratelimit (both are
>> "unsigned long"), that period (which is "long", just like "pause") overflows
>> into a negative number.
>>
>> This is indeed much more likely to happen on 32-bit.
>>
>>> The back trace is :
>>
>>> #9  0x08411c64 in balance_dirty_pages (pages_dirtied=9, mapping=<optimized out>) at mm/page-writeback.c:1471
>>
>> But here pages_dirtied is only 9??

> Well, this points to an overflow or ? :

Negative indicates an overflow, but pages_dirtied doesn't.

> tfoerste@n22 ~/devel/linux $ nl -ba mm/page-writeback.c | grep -A 5 -B 5 1468
>   1463                          BUG_ON(pause < 0);
>   1464                          goto pause;
>   1465                  }
>   1466                  period = HZ * pages_dirtied / task_ratelimit;
>   1467                  pause = period;
>   1468                  BUG_ON(pause < 0 && pages_dirtied > 0 && task_ratelimit > 0);
>   1469                  if (current->dirty_paused_when)
>   1470                          pause -= now - current->dirty_paused_when;
>   1471                  /*
>   1472                   * For less than 1s think time (ext3/4 may block the dirtier
>   1473                   * for up to 800ms from time to time on 1-HDD; so does xfs,
>
>
> and the back trace is :
>
> #9  0x08411c6c in balance_dirty_pages (pages_dirtied=0, mapping=<optimized out>) at mm/page-writeback.c:1468

Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
asserts its strict positive?!?

Can you please try the following instead of the BUG_ON():

if (pause < 0) {
        printk("pages_dirtied = %lu\n", pages_dirtied);
        printk("task_ratelimit = %lu\n", task_ratelimit);
        printk("pause = %ld\n", pause);
}

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-06 20:26                     ` Geert Uytterhoeven
@ 2013-10-06 21:01                       ` Toralf Förster
  2013-10-08 20:07                           ` Geert Uytterhoeven
  2013-10-08 19:56                       ` Toralf Förster
  1 sibling, 1 reply; 64+ messages in thread
From: Toralf Förster @ 2013-10-06 21:01 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Richard Weinberger, UML devel, trinity

On 10/06/2013 10:26 PM, Geert Uytterhoeven wrote:
> On Sun, Oct 6, 2013 at 10:08 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>> On 10/06/2013 08:38 PM, Geert Uytterhoeven wrote:
>>> On Sun, Oct 6, 2013 at 4:17 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>> The UML stopped here :
>>>> ...
>>>>                 if (unlikely(task_ratelimit == 0)) {
>>>>                         period = max_pause;
>>>>                         pause = max_pause;
>>>>                         BUG_ON(pause < 0);
>>>>                         goto pause;
>>>>                 }
>>>>                 BUG_ON(pages_dirtied < 0);
>>>>                 BUG_ON(task_ratelimit < 0);
>>>>                 period = HZ * pages_dirtied / task_ratelimit;
>>>>                 BUG_ON(period < 0);         <----------------------here
>>>
>>> So pages_dirtied becomes that big compared to task_ratelimit (both are
>>> "unsigned long"), that period (which is "long", just like "pause") overflows
>>> into a negative number.
>>>
>>> This is indeed much more likely to happen on 32-bit.
>>>
>>>> The back trace is :
>>>
>>>> #9  0x08411c64 in balance_dirty_pages (pages_dirtied=9, mapping=<optimized out>) at mm/page-writeback.c:1471
>>>
>>> But here pages_dirtied is only 9??
> 
>> Well, this points to an overflow or ? :
> 
> Negative indicates an overflow, but pages_dirtied doesn't.
> 
>> tfoerste@n22 ~/devel/linux $ nl -ba mm/page-writeback.c | grep -A 5 -B 5 1468
>>   1463                          BUG_ON(pause < 0);
>>   1464                          goto pause;
>>   1465                  }
>>   1466                  period = HZ * pages_dirtied / task_ratelimit;
>>   1467                  pause = period;
>>   1468                  BUG_ON(pause < 0 && pages_dirtied > 0 && task_ratelimit > 0);
>>   1469                  if (current->dirty_paused_when)
>>   1470                          pause -= now - current->dirty_paused_when;
>>   1471                  /*
>>   1472                   * For less than 1s think time (ext3/4 may block the dirtier
>>   1473                   * for up to 800ms from time to time on 1-HDD; so does xfs,
>>
>>
>> and the back trace is :
>>
>> #9  0x08411c6c in balance_dirty_pages (pages_dirtied=0, mapping=<optimized out>) at mm/page-writeback.c:1468
> 
> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
> asserts its strict positive?!?
> 
> Can you please try the following instead of the BUG_ON():
> 
> if (pause < 0) {
>         printk("pages_dirtied = %lu\n", pages_dirtied);
>         printk("task_ratelimit = %lu\n", task_ratelimit);
>         printk("pause = %ld\n", pause);
> }
> 
> Gr{oetje,eeting}s,
> 
>                         Geert
I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
As soon as the issue happens I do have a 

BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]

at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
to run very basic commands like a crash dump, sysrq ond so on.



> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
> 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-06 21:01                       ` Toralf Förster
@ 2013-10-08 20:07                           ` Geert Uytterhoeven
  0 siblings, 0 replies; 64+ messages in thread
From: Geert Uytterhoeven @ 2013-10-08 20:07 UTC (permalink / raw)
  To: Toralf Förster; +Cc: Richard Weinberger, UML devel, trinity

On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>> asserts its strict positive?!?
>>
>> Can you please try the following instead of the BUG_ON():
>>
>> if (pause < 0) {
>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>         printk("pause = %ld\n", pause);
>> }
>>
>> Gr{oetje,eeting}s,
>>
>>                         Geert
> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
> As soon as the issue happens I do have a
>
> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>
> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
> to run very basic commands like a crash dump, sysrq ond so on.

You may get an idea of the magnitude of pages_dirtied by using a chain of
BUG_ON()s, like:

BUG_ON(pages_dirtied > 2000000000);
BUG_ON(pages_dirtied > 1000000000);
BUG_ON(pages_dirtied > 100000000);
BUG_ON(pages_dirtied > 10000000);
BUG_ON(pages_dirtied > 1000000);

Probably 1 million is already too much for normal operation?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-08 20:07                           ` Geert Uytterhoeven
  0 siblings, 0 replies; 64+ messages in thread
From: Geert Uytterhoeven @ 2013-10-08 20:07 UTC (permalink / raw)
  To: Toralf Förster; +Cc: Richard Weinberger, UML devel, trinity

On Sun, Oct 6, 2013 at 11:01 PM, Toralf FÃ¶rster <toralf.foerster@gmx.de> wrote:
>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>> asserts its strict positive?!?
>>
>> Can you please try the following instead of the BUG_ON():
>>
>> if (pause < 0) {
>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>         printk("pause = %ld\n", pause);
>> }
>>
>> Gr{oetje,eeting}s,
>>
>>                         Geert
> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
> As soon as the issue happens I do have a
>
> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>
> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
> to run very basic commands like a crash dump, sysrq ond so on.

You may get an idea of the magnitude of pages_dirtied by using a chain of
BUG_ON()s, like:

BUG_ON(pages_dirtied > 2000000000);
BUG_ON(pages_dirtied > 1000000000);
BUG_ON(pages_dirtied > 100000000);
BUG_ON(pages_dirtied > 10000000);
BUG_ON(pages_dirtied > 1000000);

Probably 1 million is already too much for normal operation?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-08 20:07                           ` Geert Uytterhoeven
  (?)
@ 2013-10-09 17:26                           ` Toralf Förster
  2013-10-09 18:43                               ` Richard Weinberger
  -1 siblings, 1 reply; 64+ messages in thread
From: Toralf Förster @ 2013-10-09 17:26 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Richard Weinberger, UML devel

On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>>> asserts its strict positive?!?
>>>
>>> Can you please try the following instead of the BUG_ON():
>>>
>>> if (pause < 0) {
>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>>         printk("pause = %ld\n", pause);
>>> }
>>>
>>> Gr{oetje,eeting}s,
>>>
>>>                         Geert
>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
>> As soon as the issue happens I do have a
>>
>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>>
>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
>> to run very basic commands like a crash dump, sysrq ond so on.
> 
> You may get an idea of the magnitude of pages_dirtied by using a chain of
> BUG_ON()s, like:
> 
> BUG_ON(pages_dirtied > 2000000000);
> BUG_ON(pages_dirtied > 1000000000);
> BUG_ON(pages_dirtied > 100000000);
> BUG_ON(pages_dirtied > 10000000);
> BUG_ON(pages_dirtied > 1000000);
> 
> Probably 1 million is already too much for normal operation?
> 
period = HZ * pages_dirtied / task_ratelimit;
		BUG_ON(pages_dirtied > 2000000000);
		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467

the back trace is :

tfoerste@n22 ~/devel/linux $ gdb --core=/mnt/ramdisk/core /home/tfoerste/devel/linux/linux -batch -ex bt
[New LWP 6911]
Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
Program terminated with signal 6, Aborted.
#0  0xb77a7424 in __kernel_vsyscall ()
#0  0xb77a7424 in __kernel_vsyscall ()
#1  0x083bdf35 in kill ()
#2  0x0807296d in uml_abort () at arch/um/os-Linux/util.c:93
#3  0x08072ca5 in os_dump_core () at arch/um/os-Linux/util.c:148
#4  0x080623c4 in panic_exit (self=0x85c1558 <panic_exit_notifier>, unused1=0, unused2=0x85f76e0 <buf.16221>) at arch/um/kernel/um_arch.c:240
#5  0x0809ba86 in notifier_call_chain (nl=0x0, val=0, v=0x85f76e0 <buf.16221>, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
#6  0x0809bba1 in __atomic_notifier_call_chain (nh=0x85f76c4 <panic_notifier_list>, val=0, v=0x85f76e0 <buf.16221>, nr_to_call=0, nr_calls=0x0) at kernel/notifier.c:182
#7  0x0809bbdf in atomic_notifier_call_chain (nh=0x0, val=0, v=0x0) at kernel/notifier.c:191
#8  0x0841b5bc in panic (fmt=0x0) at kernel/panic.c:130
#9  0x0841c470 in balance_dirty_pages (pages_dirtied=23, mapping=<optimized out>) at mm/page-writeback.c:1467
#10 0x080d3595 in balance_dirty_pages_ratelimited (mapping=0x6) at mm/page-writeback.c:1661
#11 0x080e4a3f in __do_fault (mm=0x45ba3600, vma=0x48777b90, address=1084678144, pmd=0x0, pgoff=0, flags=0, orig_pte=<incomplete type>) at mm/memory.c:3452
#12 0x080e6e0f in do_linear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3486
#13 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, pte=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3710
#14 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
#15 handle_mm_fault (mm=0x45ba3600, vma=0x487034c8, address=1084678144, flags=1) at mm/memory.c:3868
#16 0x080e7817 in __get_user_pages (tsk=0x48705800, mm=0x45ba3600, start=1084678144, nr_pages=1025, gup_flags=519, pages=0x48558000, vmas=0x0, nonblocking=0x0) at mm/memory.c:1822
#17 0x080e7ae3 in get_user_pages (tsk=0x0, mm=0x0, start=0, nr_pages=0, write=1, force=0, pages=0x48777b90, vmas=0x6) at mm/memory.c:2019
#18 0x08143dc6 in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:340
#19 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:605
#20 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at fs/aio.c:1122
#21 SyS_io_setup (nr_events=65535, ctxp=135081984) at fs/aio.c:1105
#22 0x08062984 in handle_syscall (r=0x487059d4) at arch/um/kernel/skas/syscall.c:35
#23 0x08074fb5 in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
#24 userspace (regs=0x487059d4) at arch/um/os-Linux/skas/process.c:431
#25 0x0805f750 in fork_handler () at arch/um/kernel/process.c:160
#26 0x00000000 in ?? ()



> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
> 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-09 17:26                           ` Toralf Förster
@ 2013-10-09 18:43                               ` Richard Weinberger
  0 siblings, 0 replies; 64+ messages in thread
From: Richard Weinberger @ 2013-10-09 18:43 UTC (permalink / raw)
  To: Toralf Förster
  Cc: Geert Uytterhoeven, UML devel, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org, Jan Kara,
	hannes, darrick.wong, Michal Hocko

CC'ing mm folks.
Please see below.

Am 09.10.2013 19:26, schrieb Toralf Förster:
> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>>>> asserts its strict positive?!?
>>>>
>>>> Can you please try the following instead of the BUG_ON():
>>>>
>>>> if (pause < 0) {
>>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>>>         printk("pause = %ld\n", pause);
>>>> }
>>>>
>>>> Gr{oetje,eeting}s,
>>>>
>>>>                         Geert
>>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
>>> As soon as the issue happens I do have a
>>>
>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>>>
>>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
>>> to run very basic commands like a crash dump, sysrq ond so on.
>>
>> You may get an idea of the magnitude of pages_dirtied by using a chain of
>> BUG_ON()s, like:
>>
>> BUG_ON(pages_dirtied > 2000000000);
>> BUG_ON(pages_dirtied > 1000000000);
>> BUG_ON(pages_dirtied > 100000000);
>> BUG_ON(pages_dirtied > 10000000);
>> BUG_ON(pages_dirtied > 1000000);
>>
>> Probably 1 million is already too much for normal operation?
>>
> period = HZ * pages_dirtied / task_ratelimit;
> 		BUG_ON(pages_dirtied > 2000000000);
> 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467

Summary for mm people:

Toralf runs trinty on UML/i386.
After some time pages_dirtied becomes very large.
More than 1000000000 pages in this case.

Thus, period = HZ * pages_dirtied / task_ratelimit overflows
and period/pause becomes extremely large.

It looks like io_schedule_timeout() get's called with a very large timeout.
I don't know why "if (unlikely(pause > max_pause)) {" does not help.

Any ideas?

Thanks,
//richard

> the back trace is :
> 
> tfoerste@n22 ~/devel/linux $ gdb --core=/mnt/ramdisk/core /home/tfoerste/devel/linux/linux -batch -ex bt
> [New LWP 6911]
> Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
> Program terminated with signal 6, Aborted.
> #0  0xb77a7424 in __kernel_vsyscall ()
> #0  0xb77a7424 in __kernel_vsyscall ()
> #1  0x083bdf35 in kill ()
> #2  0x0807296d in uml_abort () at arch/um/os-Linux/util.c:93
> #3  0x08072ca5 in os_dump_core () at arch/um/os-Linux/util.c:148
> #4  0x080623c4 in panic_exit (self=0x85c1558 <panic_exit_notifier>, unused1=0, unused2=0x85f76e0 <buf.16221>) at arch/um/kernel/um_arch.c:240
> #5  0x0809ba86 in notifier_call_chain (nl=0x0, val=0, v=0x85f76e0 <buf.16221>, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
> #6  0x0809bba1 in __atomic_notifier_call_chain (nh=0x85f76c4 <panic_notifier_list>, val=0, v=0x85f76e0 <buf.16221>, nr_to_call=0, nr_calls=0x0) at kernel/notifier.c:182
> #7  0x0809bbdf in atomic_notifier_call_chain (nh=0x0, val=0, v=0x0) at kernel/notifier.c:191
> #8  0x0841b5bc in panic (fmt=0x0) at kernel/panic.c:130
> #9  0x0841c470 in balance_dirty_pages (pages_dirtied=23, mapping=<optimized out>) at mm/page-writeback.c:1467
> #10 0x080d3595 in balance_dirty_pages_ratelimited (mapping=0x6) at mm/page-writeback.c:1661
> #11 0x080e4a3f in __do_fault (mm=0x45ba3600, vma=0x48777b90, address=1084678144, pmd=0x0, pgoff=0, flags=0, orig_pte=<incomplete type>) at mm/memory.c:3452
> #12 0x080e6e0f in do_linear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3486
> #13 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, pte=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3710
> #14 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
> #15 handle_mm_fault (mm=0x45ba3600, vma=0x487034c8, address=1084678144, flags=1) at mm/memory.c:3868
> #16 0x080e7817 in __get_user_pages (tsk=0x48705800, mm=0x45ba3600, start=1084678144, nr_pages=1025, gup_flags=519, pages=0x48558000, vmas=0x0, nonblocking=0x0) at mm/memory.c:1822
> #17 0x080e7ae3 in get_user_pages (tsk=0x0, mm=0x0, start=0, nr_pages=0, write=1, force=0, pages=0x48777b90, vmas=0x6) at mm/memory.c:2019
> #18 0x08143dc6 in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:340
> #19 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:605
> #20 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at fs/aio.c:1122
> #21 SyS_io_setup (nr_events=65535, ctxp=135081984) at fs/aio.c:1105
> #22 0x08062984 in handle_syscall (r=0x487059d4) at arch/um/kernel/skas/syscall.c:35
> #23 0x08074fb5 in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
> #24 userspace (regs=0x487059d4) at arch/um/os-Linux/skas/process.c:431
> #25 0x0805f750 in fork_handler () at arch/um/kernel/process.c:160
> #26 0x00000000 in ?? ()
> 
> 
> 
>> Gr{oetje,eeting}s,
>>
>>                         Geert
>>
>> --
>> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>>
>> In personal conversations with technical people, I call myself a hacker. But
>> when I'm talking to journalists I just say "programmer" or something like that.
>>                                 -- Linus Torvalds
>>
> 
> 



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-09 18:43                               ` Richard Weinberger
  0 siblings, 0 replies; 64+ messages in thread
From: Richard Weinberger @ 2013-10-09 18:43 UTC (permalink / raw)
  To: Toralf Förster
  Cc: Geert Uytterhoeven, UML devel, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org, Jan Kara,
	hannes, darrick.wong, Michal Hocko

CC'ing mm folks.
Please see below.

Am 09.10.2013 19:26, schrieb Toralf FA?rster:
> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf FA?rster <toralf.foerster@gmx.de> wrote:
>>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>>>> asserts its strict positive?!?
>>>>
>>>> Can you please try the following instead of the BUG_ON():
>>>>
>>>> if (pause < 0) {
>>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>>>         printk("pause = %ld\n", pause);
>>>> }
>>>>
>>>> Gr{oetje,eeting}s,
>>>>
>>>>                         Geert
>>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
>>> As soon as the issue happens I do have a
>>>
>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>>>
>>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
>>> to run very basic commands like a crash dump, sysrq ond so on.
>>
>> You may get an idea of the magnitude of pages_dirtied by using a chain of
>> BUG_ON()s, like:
>>
>> BUG_ON(pages_dirtied > 2000000000);
>> BUG_ON(pages_dirtied > 1000000000);
>> BUG_ON(pages_dirtied > 100000000);
>> BUG_ON(pages_dirtied > 10000000);
>> BUG_ON(pages_dirtied > 1000000);
>>
>> Probably 1 million is already too much for normal operation?
>>
> period = HZ * pages_dirtied / task_ratelimit;
> 		BUG_ON(pages_dirtied > 2000000000);
> 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467

Summary for mm people:

Toralf runs trinty on UML/i386.
After some time pages_dirtied becomes very large.
More than 1000000000 pages in this case.

Thus, period = HZ * pages_dirtied / task_ratelimit overflows
and period/pause becomes extremely large.

It looks like io_schedule_timeout() get's called with a very large timeout.
I don't know why "if (unlikely(pause > max_pause)) {" does not help.

Any ideas?

Thanks,
//richard

> the back trace is :
> 
> tfoerste@n22 ~/devel/linux $ gdb --core=/mnt/ramdisk/core /home/tfoerste/devel/linux/linux -batch -ex bt
> [New LWP 6911]
> Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
> Program terminated with signal 6, Aborted.
> #0  0xb77a7424 in __kernel_vsyscall ()
> #0  0xb77a7424 in __kernel_vsyscall ()
> #1  0x083bdf35 in kill ()
> #2  0x0807296d in uml_abort () at arch/um/os-Linux/util.c:93
> #3  0x08072ca5 in os_dump_core () at arch/um/os-Linux/util.c:148
> #4  0x080623c4 in panic_exit (self=0x85c1558 <panic_exit_notifier>, unused1=0, unused2=0x85f76e0 <buf.16221>) at arch/um/kernel/um_arch.c:240
> #5  0x0809ba86 in notifier_call_chain (nl=0x0, val=0, v=0x85f76e0 <buf.16221>, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
> #6  0x0809bba1 in __atomic_notifier_call_chain (nh=0x85f76c4 <panic_notifier_list>, val=0, v=0x85f76e0 <buf.16221>, nr_to_call=0, nr_calls=0x0) at kernel/notifier.c:182
> #7  0x0809bbdf in atomic_notifier_call_chain (nh=0x0, val=0, v=0x0) at kernel/notifier.c:191
> #8  0x0841b5bc in panic (fmt=0x0) at kernel/panic.c:130
> #9  0x0841c470 in balance_dirty_pages (pages_dirtied=23, mapping=<optimized out>) at mm/page-writeback.c:1467
> #10 0x080d3595 in balance_dirty_pages_ratelimited (mapping=0x6) at mm/page-writeback.c:1661
> #11 0x080e4a3f in __do_fault (mm=0x45ba3600, vma=0x48777b90, address=1084678144, pmd=0x0, pgoff=0, flags=0, orig_pte=<incomplete type>) at mm/memory.c:3452
> #12 0x080e6e0f in do_linear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3486
> #13 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, pte=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3710
> #14 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
> #15 handle_mm_fault (mm=0x45ba3600, vma=0x487034c8, address=1084678144, flags=1) at mm/memory.c:3868
> #16 0x080e7817 in __get_user_pages (tsk=0x48705800, mm=0x45ba3600, start=1084678144, nr_pages=1025, gup_flags=519, pages=0x48558000, vmas=0x0, nonblocking=0x0) at mm/memory.c:1822
> #17 0x080e7ae3 in get_user_pages (tsk=0x0, mm=0x0, start=0, nr_pages=0, write=1, force=0, pages=0x48777b90, vmas=0x6) at mm/memory.c:2019
> #18 0x08143dc6 in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:340
> #19 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:605
> #20 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at fs/aio.c:1122
> #21 SyS_io_setup (nr_events=65535, ctxp=135081984) at fs/aio.c:1105
> #22 0x08062984 in handle_syscall (r=0x487059d4) at arch/um/kernel/skas/syscall.c:35
> #23 0x08074fb5 in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
> #24 userspace (regs=0x487059d4) at arch/um/os-Linux/skas/process.c:431
> #25 0x0805f750 in fork_handler () at arch/um/kernel/process.c:160
> #26 0x00000000 in ?? ()
> 
> 
> 
>> Gr{oetje,eeting}s,
>>
>>                         Geert
>>
>> --
>> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>>
>> In personal conversations with technical people, I call myself a hacker. But
>> when I'm talking to journalists I just say "programmer" or something like that.
>>                                 -- Linus Torvalds
>>
> 
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-09 18:43                               ` Richard Weinberger
@ 2013-10-09 21:47                                 ` Jan Kara
  -1 siblings, 0 replies; 64+ messages in thread
From: Jan Kara @ 2013-10-09 21:47 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: Toralf Förster, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, Jan Kara, hannes, darrick.wong,
	Michal Hocko, Wu Fengguang

On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
> CC'ing mm folks.
> Please see below.
  Added Fenguang to CC since he is the author of this code.

> Am 09.10.2013 19:26, schrieb Toralf Förster:
> > On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
> >> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
> >>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
> >>>> asserts its strict positive?!?
> >>>>
> >>>> Can you please try the following instead of the BUG_ON():
> >>>>
> >>>> if (pause < 0) {
> >>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
> >>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
> >>>>         printk("pause = %ld\n", pause);
> >>>> }
> >>>>
> >>>> Gr{oetje,eeting}s,
> >>>>
> >>>>                         Geert
> >>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
> >>> As soon as the issue happens I do have a
> >>>
> >>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
> >>>
> >>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
> >>> to run very basic commands like a crash dump, sysrq ond so on.
> >>
> >> You may get an idea of the magnitude of pages_dirtied by using a chain of
> >> BUG_ON()s, like:
> >>
> >> BUG_ON(pages_dirtied > 2000000000);
> >> BUG_ON(pages_dirtied > 1000000000);
> >> BUG_ON(pages_dirtied > 100000000);
> >> BUG_ON(pages_dirtied > 10000000);
> >> BUG_ON(pages_dirtied > 1000000);
> >>
> >> Probably 1 million is already too much for normal operation?
> >>
> > period = HZ * pages_dirtied / task_ratelimit;
> > 		BUG_ON(pages_dirtied > 2000000000);
> > 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
> 
> Summary for mm people:
> 
> Toralf runs trinty on UML/i386.
> After some time pages_dirtied becomes very large.
> More than 1000000000 pages in this case.
  Huh, this is really strange. pages_dirtied is passed into
balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
task was ever able to dirty that much during its lifetime (but correct me
if I'm wrong here, with UML and memory backed disks it is not totally
impossible)... I went through the logic of handling ->nr_dirtied but
I didn't find any obvious problem there. Hum, maybe one thing - what
'task_ratelimit' values do you see in balance_dirty_pages? If that one was
huge, we could possibly accumulate huge current->nr_dirtied.

> Thus, period = HZ * pages_dirtied / task_ratelimit overflows
> and period/pause becomes extremely large.
> 
> It looks like io_schedule_timeout() get's called with a very large timeout.
> I don't know why "if (unlikely(pause > max_pause)) {" does not help.
> 
> 
> > the back trace is :
> > 
> > tfoerste@n22 ~/devel/linux $ gdb --core=/mnt/ramdisk/core /home/tfoerste/devel/linux/linux -batch -ex bt
> > [New LWP 6911]
> > Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
> > Program terminated with signal 6, Aborted.
> > #0  0xb77a7424 in __kernel_vsyscall ()
> > #0  0xb77a7424 in __kernel_vsyscall ()
> > #1  0x083bdf35 in kill ()
> > #2  0x0807296d in uml_abort () at arch/um/os-Linux/util.c:93
> > #3  0x08072ca5 in os_dump_core () at arch/um/os-Linux/util.c:148
> > #4  0x080623c4 in panic_exit (self=0x85c1558 <panic_exit_notifier>, unused1=0, unused2=0x85f76e0 <buf.16221>) at arch/um/kernel/um_arch.c:240
> > #5  0x0809ba86 in notifier_call_chain (nl=0x0, val=0, v=0x85f76e0 <buf.16221>, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
> > #6  0x0809bba1 in __atomic_notifier_call_chain (nh=0x85f76c4 <panic_notifier_list>, val=0, v=0x85f76e0 <buf.16221>, nr_to_call=0, nr_calls=0x0) at kernel/notifier.c:182
> > #7  0x0809bbdf in atomic_notifier_call_chain (nh=0x0, val=0, v=0x0) at kernel/notifier.c:191
> > #8  0x0841b5bc in panic (fmt=0x0) at kernel/panic.c:130
> > #9  0x0841c470 in balance_dirty_pages (pages_dirtied=23, mapping=<optimized out>) at mm/page-writeback.c:1467
> > #10 0x080d3595 in balance_dirty_pages_ratelimited (mapping=0x6) at mm/page-writeback.c:1661
> > #11 0x080e4a3f in __do_fault (mm=0x45ba3600, vma=0x48777b90, address=1084678144, pmd=0x0, pgoff=0, flags=0, orig_pte=<incomplete type>) at mm/memory.c:3452
> > #12 0x080e6e0f in do_linear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3486
> > #13 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, pte=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3710
> > #14 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
> > #15 handle_mm_fault (mm=0x45ba3600, vma=0x487034c8, address=1084678144, flags=1) at mm/memory.c:3868
> > #16 0x080e7817 in __get_user_pages (tsk=0x48705800, mm=0x45ba3600, start=1084678144, nr_pages=1025, gup_flags=519, pages=0x48558000, vmas=0x0, nonblocking=0x0) at mm/memory.c:1822
> > #17 0x080e7ae3 in get_user_pages (tsk=0x0, mm=0x0, start=0, nr_pages=0, write=1, force=0, pages=0x48777b90, vmas=0x6) at mm/memory.c:2019
> > #18 0x08143dc6 in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:340
> > #19 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:605
> > #20 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at fs/aio.c:1122
> > #21 SyS_io_setup (nr_events=65535, ctxp=135081984) at fs/aio.c:1105
> > #22 0x08062984 in handle_syscall (r=0x487059d4) at arch/um/kernel/skas/syscall.c:35
> > #23 0x08074fb5 in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
> > #24 userspace (regs=0x487059d4) at arch/um/os-Linux/skas/process.c:431
> > #25 0x0805f750 in fork_handler () at arch/um/kernel/process.c:160
> > #26 0x00000000 in ?? ()

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-09 21:47                                 ` Jan Kara
  0 siblings, 0 replies; 64+ messages in thread
From: Jan Kara @ 2013-10-09 21:47 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: Toralf Förster, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, Jan Kara, hannes, darrick.wong,
	Michal Hocko, Wu Fengguang

On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
> CC'ing mm folks.
> Please see below.
  Added Fenguang to CC since he is the author of this code.

> Am 09.10.2013 19:26, schrieb Toralf Forster:
> > On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
> >> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Forster <toralf.foerster@gmx.de> wrote:
> >>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
> >>>> asserts its strict positive?!?
> >>>>
> >>>> Can you please try the following instead of the BUG_ON():
> >>>>
> >>>> if (pause < 0) {
> >>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
> >>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
> >>>>         printk("pause = %ld\n", pause);
> >>>> }
> >>>>
> >>>> Gr{oetje,eeting}s,
> >>>>
> >>>>                         Geert
> >>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
> >>> As soon as the issue happens I do have a
> >>>
> >>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
> >>>
> >>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
> >>> to run very basic commands like a crash dump, sysrq ond so on.
> >>
> >> You may get an idea of the magnitude of pages_dirtied by using a chain of
> >> BUG_ON()s, like:
> >>
> >> BUG_ON(pages_dirtied > 2000000000);
> >> BUG_ON(pages_dirtied > 1000000000);
> >> BUG_ON(pages_dirtied > 100000000);
> >> BUG_ON(pages_dirtied > 10000000);
> >> BUG_ON(pages_dirtied > 1000000);
> >>
> >> Probably 1 million is already too much for normal operation?
> >>
> > period = HZ * pages_dirtied / task_ratelimit;
> > 		BUG_ON(pages_dirtied > 2000000000);
> > 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
> 
> Summary for mm people:
> 
> Toralf runs trinty on UML/i386.
> After some time pages_dirtied becomes very large.
> More than 1000000000 pages in this case.
  Huh, this is really strange. pages_dirtied is passed into
balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
task was ever able to dirty that much during its lifetime (but correct me
if I'm wrong here, with UML and memory backed disks it is not totally
impossible)... I went through the logic of handling ->nr_dirtied but
I didn't find any obvious problem there. Hum, maybe one thing - what
'task_ratelimit' values do you see in balance_dirty_pages? If that one was
huge, we could possibly accumulate huge current->nr_dirtied.

> Thus, period = HZ * pages_dirtied / task_ratelimit overflows
> and period/pause becomes extremely large.
> 
> It looks like io_schedule_timeout() get's called with a very large timeout.
> I don't know why "if (unlikely(pause > max_pause)) {" does not help.
> 
> 
> > the back trace is :
> > 
> > tfoerste@n22 ~/devel/linux $ gdb --core=/mnt/ramdisk/core /home/tfoerste/devel/linux/linux -batch -ex bt
> > [New LWP 6911]
> > Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
> > Program terminated with signal 6, Aborted.
> > #0  0xb77a7424 in __kernel_vsyscall ()
> > #0  0xb77a7424 in __kernel_vsyscall ()
> > #1  0x083bdf35 in kill ()
> > #2  0x0807296d in uml_abort () at arch/um/os-Linux/util.c:93
> > #3  0x08072ca5 in os_dump_core () at arch/um/os-Linux/util.c:148
> > #4  0x080623c4 in panic_exit (self=0x85c1558 <panic_exit_notifier>, unused1=0, unused2=0x85f76e0 <buf.16221>) at arch/um/kernel/um_arch.c:240
> > #5  0x0809ba86 in notifier_call_chain (nl=0x0, val=0, v=0x85f76e0 <buf.16221>, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
> > #6  0x0809bba1 in __atomic_notifier_call_chain (nh=0x85f76c4 <panic_notifier_list>, val=0, v=0x85f76e0 <buf.16221>, nr_to_call=0, nr_calls=0x0) at kernel/notifier.c:182
> > #7  0x0809bbdf in atomic_notifier_call_chain (nh=0x0, val=0, v=0x0) at kernel/notifier.c:191
> > #8  0x0841b5bc in panic (fmt=0x0) at kernel/panic.c:130
> > #9  0x0841c470 in balance_dirty_pages (pages_dirtied=23, mapping=<optimized out>) at mm/page-writeback.c:1467
> > #10 0x080d3595 in balance_dirty_pages_ratelimited (mapping=0x6) at mm/page-writeback.c:1661
> > #11 0x080e4a3f in __do_fault (mm=0x45ba3600, vma=0x48777b90, address=1084678144, pmd=0x0, pgoff=0, flags=0, orig_pte=<incomplete type>) at mm/memory.c:3452
> > #12 0x080e6e0f in do_linear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3486
> > #13 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, pte=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3710
> > #14 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
> > #15 handle_mm_fault (mm=0x45ba3600, vma=0x487034c8, address=1084678144, flags=1) at mm/memory.c:3868
> > #16 0x080e7817 in __get_user_pages (tsk=0x48705800, mm=0x45ba3600, start=1084678144, nr_pages=1025, gup_flags=519, pages=0x48558000, vmas=0x0, nonblocking=0x0) at mm/memory.c:1822
> > #17 0x080e7ae3 in get_user_pages (tsk=0x0, mm=0x0, start=0, nr_pages=0, write=1, force=0, pages=0x48777b90, vmas=0x6) at mm/memory.c:2019
> > #18 0x08143dc6 in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:340
> > #19 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:605
> > #20 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at fs/aio.c:1122
> > #21 SyS_io_setup (nr_events=65535, ctxp=135081984) at fs/aio.c:1105
> > #22 0x08062984 in handle_syscall (r=0x487059d4) at arch/um/kernel/skas/syscall.c:35
> > #23 0x08074fb5 in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
> > #24 userspace (regs=0x487059d4) at arch/um/os-Linux/skas/process.c:431
> > #25 0x0805f750 in fork_handler () at arch/um/kernel/process.c:160
> > #26 0x00000000 in ?? ()

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-09 21:47                                 ` Jan Kara
  (?)
@ 2013-10-09 22:33                                   ` Richard Weinberger
  -1 siblings, 0 replies; 64+ messages in thread
From: Richard Weinberger @ 2013-10-09 22:33 UTC (permalink / raw)
  To: Jan Kara
  Cc: Toralf Förster, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko,
	Wu Fengguang

Am 09.10.2013 23:47, schrieb Jan Kara:
> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
>> CC'ing mm folks.
>> Please see below.
>   Added Fenguang to CC since he is the author of this code.

Thx, get_maintainer.pl didn't list him.

>> Am 09.10.2013 19:26, schrieb Toralf Fï¿½rster:
>>> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>>>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Fï¿½rster <toralf.foerster@gmx.de> wrote:
>>>>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>>>>>> asserts its strict positive?!?
>>>>>>
>>>>>> Can you please try the following instead of the BUG_ON():
>>>>>>
>>>>>> if (pause < 0) {
>>>>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>>>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>>>>>         printk("pause = %ld\n", pause);
>>>>>> }
>>>>>>
>>>>>> Gr{oetje,eeting}s,
>>>>>>
>>>>>>                         Geert
>>>>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
>>>>> As soon as the issue happens I do have a
>>>>>
>>>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>>>>>
>>>>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
>>>>> to run very basic commands like a crash dump, sysrq ond so on.
>>>>
>>>> You may get an idea of the magnitude of pages_dirtied by using a chain of
>>>> BUG_ON()s, like:
>>>>
>>>> BUG_ON(pages_dirtied > 2000000000);
>>>> BUG_ON(pages_dirtied > 1000000000);
>>>> BUG_ON(pages_dirtied > 100000000);
>>>> BUG_ON(pages_dirtied > 10000000);
>>>> BUG_ON(pages_dirtied > 1000000);
>>>>
>>>> Probably 1 million is already too much for normal operation?
>>>>
>>> period = HZ * pages_dirtied / task_ratelimit;
>>> 		BUG_ON(pages_dirtied > 2000000000);
>>> 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
>>
>> Summary for mm people:
>>
>> Toralf runs trinty on UML/i386.
>> After some time pages_dirtied becomes very large.
>> More than 1000000000 pages in this case.
>   Huh, this is really strange. pages_dirtied is passed into
> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
> over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
> task was ever able to dirty that much during its lifetime (but correct me
> if I'm wrong here, with UML and memory backed disks it is not totally
> impossible)... I went through the logic of handling ->nr_dirtied but
> I didn't find any obvious problem there. Hum, maybe one thing - what
> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
> huge, we could possibly accumulate huge current->nr_dirtied.

Toralf, you can try a snipplet like this one to get the values printed out:
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..a80e520 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space *mapping,
                        goto pause;
                }
                period = HZ * pages_dirtied / task_ratelimit;
+
+               {
+                       extern int printf(char *, ...);
+                       printf("---> task_ratelimit: %lu\n", task_ratelimit);
+               }
+
                pause = period;
                if (current->dirty_paused_when)
                        pause -= now - current->dirty_paused_when;


Yes, printf(), not printk().
Using this hack we print directly to host's stdout. :)

Thanks,
//richard


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-09 22:33                                   ` Richard Weinberger
  0 siblings, 0 replies; 64+ messages in thread
From: Richard Weinberger @ 2013-10-09 22:33 UTC (permalink / raw)
  To: Jan Kara
  Cc: Toralf Förster, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko,
	Wu Fengguang

Am 09.10.2013 23:47, schrieb Jan Kara:
> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
>> CC'ing mm folks.
>> Please see below.
>   Added Fenguang to CC since he is the author of this code.

Thx, get_maintainer.pl didn't list him.

>> Am 09.10.2013 19:26, schrieb Toralf Förster:
>>> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>>>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>>>>>> asserts its strict positive?!?
>>>>>>
>>>>>> Can you please try the following instead of the BUG_ON():
>>>>>>
>>>>>> if (pause < 0) {
>>>>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>>>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>>>>>         printk("pause = %ld\n", pause);
>>>>>> }
>>>>>>
>>>>>> Gr{oetje,eeting}s,
>>>>>>
>>>>>>                         Geert
>>>>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
>>>>> As soon as the issue happens I do have a
>>>>>
>>>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>>>>>
>>>>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
>>>>> to run very basic commands like a crash dump, sysrq ond so on.
>>>>
>>>> You may get an idea of the magnitude of pages_dirtied by using a chain of
>>>> BUG_ON()s, like:
>>>>
>>>> BUG_ON(pages_dirtied > 2000000000);
>>>> BUG_ON(pages_dirtied > 1000000000);
>>>> BUG_ON(pages_dirtied > 100000000);
>>>> BUG_ON(pages_dirtied > 10000000);
>>>> BUG_ON(pages_dirtied > 1000000);
>>>>
>>>> Probably 1 million is already too much for normal operation?
>>>>
>>> period = HZ * pages_dirtied / task_ratelimit;
>>> 		BUG_ON(pages_dirtied > 2000000000);
>>> 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
>>
>> Summary for mm people:
>>
>> Toralf runs trinty on UML/i386.
>> After some time pages_dirtied becomes very large.
>> More than 1000000000 pages in this case.
>   Huh, this is really strange. pages_dirtied is passed into
> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
> over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
> task was ever able to dirty that much during its lifetime (but correct me
> if I'm wrong here, with UML and memory backed disks it is not totally
> impossible)... I went through the logic of handling ->nr_dirtied but
> I didn't find any obvious problem there. Hum, maybe one thing - what
> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
> huge, we could possibly accumulate huge current->nr_dirtied.

Toralf, you can try a snipplet like this one to get the values printed out:
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..a80e520 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space *mapping,
                        goto pause;
                }
                period = HZ * pages_dirtied / task_ratelimit;
+
+               {
+                       extern int printf(char *, ...);
+                       printf("---> task_ratelimit: %lu\n", task_ratelimit);
+               }
+
                pause = period;
                if (current->dirty_paused_when)
                        pause -= now - current->dirty_paused_when;


Yes, printf(), not printk().
Using this hack we print directly to host's stdout. :)

Thanks,
//richard

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-09 22:33                                   ` Richard Weinberger
  0 siblings, 0 replies; 64+ messages in thread
From: Richard Weinberger @ 2013-10-09 22:33 UTC (permalink / raw)
  To: Jan Kara
  Cc: Toralf Förster, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko,
	Wu Fengguang

Am 09.10.2013 23:47, schrieb Jan Kara:
> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
>> CC'ing mm folks.
>> Please see below.
>   Added Fenguang to CC since he is the author of this code.

Thx, get_maintainer.pl didn't list him.

>> Am 09.10.2013 19:26, schrieb Toralf Forster:
>>> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>>>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Forster <toralf.foerster@gmx.de> wrote:
>>>>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>>>>>> asserts its strict positive?!?
>>>>>>
>>>>>> Can you please try the following instead of the BUG_ON():
>>>>>>
>>>>>> if (pause < 0) {
>>>>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>>>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>>>>>         printk("pause = %ld\n", pause);
>>>>>> }
>>>>>>
>>>>>> Gr{oetje,eeting}s,
>>>>>>
>>>>>>                         Geert
>>>>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
>>>>> As soon as the issue happens I do have a
>>>>>
>>>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>>>>>
>>>>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
>>>>> to run very basic commands like a crash dump, sysrq ond so on.
>>>>
>>>> You may get an idea of the magnitude of pages_dirtied by using a chain of
>>>> BUG_ON()s, like:
>>>>
>>>> BUG_ON(pages_dirtied > 2000000000);
>>>> BUG_ON(pages_dirtied > 1000000000);
>>>> BUG_ON(pages_dirtied > 100000000);
>>>> BUG_ON(pages_dirtied > 10000000);
>>>> BUG_ON(pages_dirtied > 1000000);
>>>>
>>>> Probably 1 million is already too much for normal operation?
>>>>
>>> period = HZ * pages_dirtied / task_ratelimit;
>>> 		BUG_ON(pages_dirtied > 2000000000);
>>> 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
>>
>> Summary for mm people:
>>
>> Toralf runs trinty on UML/i386.
>> After some time pages_dirtied becomes very large.
>> More than 1000000000 pages in this case.
>   Huh, this is really strange. pages_dirtied is passed into
> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
> over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
> task was ever able to dirty that much during its lifetime (but correct me
> if I'm wrong here, with UML and memory backed disks it is not totally
> impossible)... I went through the logic of handling ->nr_dirtied but
> I didn't find any obvious problem there. Hum, maybe one thing - what
> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
> huge, we could possibly accumulate huge current->nr_dirtied.

Toralf, you can try a snipplet like this one to get the values printed out:
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..a80e520 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space *mapping,
                        goto pause;
                }
                period = HZ * pages_dirtied / task_ratelimit;
+
+               {
+                       extern int printf(char *, ...);
+                       printf("---> task_ratelimit: %lu\n", task_ratelimit);
+               }
+
                pause = period;
                if (current->dirty_paused_when)
                        pause -= now - current->dirty_paused_when;


Yes, printf(), not printk().
Using this hack we print directly to host's stdout. :)

Thanks,
//richard

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-09 22:33                                   ` Richard Weinberger
  (?)
@ 2013-10-10 16:49                                     ` Toralf Förster
  -1 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-10 16:49 UTC (permalink / raw)
  To: Richard Weinberger, Jan Kara
  Cc: Geert Uytterhoeven, UML devel, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org, hannes,
	darrick.wong, Michal Hocko, Wu Fengguang

On 10/10/2013 12:33 AM, Richard Weinberger wrote:
> Am 09.10.2013 23:47, schrieb Jan Kara:
>> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
>>> CC'ing mm folks.
>>> Please see below.
>>   Added Fenguang to CC since he is the author of this code.
> 
> Thx, get_maintainer.pl didn't list him.
> 
>>> Am 09.10.2013 19:26, schrieb Toralf Förster:
>>>> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>>>>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>>>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>>>>>>> asserts its strict positive?!?
>>>>>>>
>>>>>>> Can you please try the following instead of the BUG_ON():
>>>>>>>
>>>>>>> if (pause < 0) {
>>>>>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>>>>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>>>>>>         printk("pause = %ld\n", pause);
>>>>>>> }
>>>>>>>
>>>>>>> Gr{oetje,eeting}s,
>>>>>>>
>>>>>>>                         Geert
>>>>>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
>>>>>> As soon as the issue happens I do have a
>>>>>>
>>>>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>>>>>>
>>>>>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
>>>>>> to run very basic commands like a crash dump, sysrq ond so on.
>>>>>
>>>>> You may get an idea of the magnitude of pages_dirtied by using a chain of
>>>>> BUG_ON()s, like:
>>>>>
>>>>> BUG_ON(pages_dirtied > 2000000000);
>>>>> BUG_ON(pages_dirtied > 1000000000);
>>>>> BUG_ON(pages_dirtied > 100000000);
>>>>> BUG_ON(pages_dirtied > 10000000);
>>>>> BUG_ON(pages_dirtied > 1000000);
>>>>>
>>>>> Probably 1 million is already too much for normal operation?
>>>>>
>>>> period = HZ * pages_dirtied / task_ratelimit;
>>>> 		BUG_ON(pages_dirtied > 2000000000);
>>>> 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
>>>
>>> Summary for mm people:
>>>
>>> Toralf runs trinty on UML/i386.
>>> After some time pages_dirtied becomes very large.
>>> More than 1000000000 pages in this case.
>>   Huh, this is really strange. pages_dirtied is passed into
>> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
>> over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
>> task was ever able to dirty that much during its lifetime (but correct me
>> if I'm wrong here, with UML and memory backed disks it is not totally
>> impossible)... I went through the logic of handling ->nr_dirtied but
>> I didn't find any obvious problem there. Hum, maybe one thing - what
>> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
>> huge, we could possibly accumulate huge current->nr_dirtied.
> 
> Toralf, you can try a snipplet like this one to get the values printed out:
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index f5236f8..a80e520 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space *mapping,
>                         goto pause;
>                 }
>                 period = HZ * pages_dirtied / task_ratelimit;
> +
> +               {
> +                       extern int printf(char *, ...);
> +                       printf("---> task_ratelimit: %lu\n", task_ratelimit);
> +               }
> +
>                 pause = period;
>                 if (current->dirty_paused_when)
>                         pause -= now - current->dirty_paused_when;
> 
> 
> Yes, printf(), not printk().
> Using this hack we print directly to host's stdout. :)
> 
*head smack* ofc - works fine.
So given this diff :

iff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..5a2c337 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1464,6 +1464,13 @@ static void balance_dirty_pages(struct address_space *mapping,
                }
                period = HZ * pages_dirtied / task_ratelimit;
                pause = period;
+               if (pause < 0)  {
+                       extern int printf(char *, ...);
+                       printf("overflow : pause : %li\n", pause);
+                       printf("overflow : pages_dirtied : %lu\n", pages_dirtied);
+                       printf("overflow :  task_ratelimit: %lu\n", task_ratelimit);
+                       BUG_ON(1);
+               }
                if (current->dirty_paused_when)
                        pause -= now - current->dirty_paused_when;
                /*
@@ -1503,6 +1510,13 @@ static void balance_dirty_pages(struct address_space *mapping,
                }

 pause:
+               if (pause < 0)  {
+                       extern int printf(char *, ...);
+                       printf("ick : pause : %li\n", pause);
+                       printf("ick: pages_dirtied : %lu\n", pages_dirtied);
+                       printf("ick: task_ratelimit: %lu\n", task_ratelimit);
+                       BUG_ON(1);
+               }
                trace_balance_dirty_pages(bdi,
                                          dirty_thresh,
                                          background_thresh,


I got this :




 * Starting local
net.core.warnings = 0                                                                                                                            [ ok ]
ick : pause : -984
                  ick: pages_dirtied : 0
                                        ick: task_ratelimit: 0
                                                              Kernel panic - not syncing: BUG!
CPU: 0 PID: 1434 Comm: trinity-child2 Not tainted 3.12.0-rc4-00029-g0e7a3ed-dirty #12
47397c84 47397cb0 0841b5a0 084c30e8 085f76e0 084b4745 47397cbc 00000000 
       fffffc28 01fff278 085cb4a0 47397d2c 0841c5a0 084b4745 084c5398 000005ee 
       08432cf0 43e47600 471757f8 47397cf0 ffffff0c 47397cdc 5256d8b9 3417ec18 47397c5c:  [<08060b2c>] show_stack+0x7c/0xd0
47397c7c:  [<0841e34d>] dump_stack+0x26/0x28
47397c8c:  [<0841b5a0>] panic+0x7a/0x180
47397cb4:  [<0841c5a0>] balance_dirty_pages.isra.32+0x4e3/0x5ad
47397d30:  [<080d3595>] balance_dirty_pages_ratelimited+0xf5/0x100
47397d44:  [<080e4a3f>] __do_fault+0x3cf/0x440
47397d9c:  [<080e6e0f>] handle_mm_fault+0xef/0x7c0
47397dec:  [<080e7817>] __get_user_pages+0x227/0x420
47397e24:  [<080e7ae3>] get_user_pages+0x63/0x70
47397e4c:  [<08143dc6>] SyS_io_setup+0x3c6/0x760
47397eb0:  [<08062984>] handle_syscall+0x64/0x80
47397ef0:  [<08074fb5>] userspace+0x475/0x5f0
47397fec:  [<0805f750>] fork_handler+0x60/0x70
47397ffc:  [<00000000>] 0x0


EIP: 0073:[<40001282>] CPU: 0 Not tainted ESP: 007b:bfb348f8 EFLAGS: 00000246
    Not tainted
EAX: ffffffda EBX: 00001000 ECX: 080d0000 EDX: 80000048
ESI: 80fbff1f EDI: ffe02f77 EBP: 90f6e2a3 DS: 007b ES: 007b
47397c0c:  [<0807947f>] show_regs+0x10f/0x120
47397c28:  [<080623a9>] panic_exit+0x29/0x50
47397c38:  [<0809ba86>] notifier_call_chain+0x36/0x60
47397c60:  [<0809bba1>] __atomic_notifier_call_chain+0x21/0x30
47397c70:  [<0809bbdf>] atomic_notifier_call_chain+0x2f/0x40
47397c8c:  [<0841b5bc>] panic+0x96/0x180
47397cb4:  [<0841c5a0>] balance_dirty_pages.isra.32+0x4e3/0x5ad
47397d30:  [<080d3595>] balance_dirty_pages_ratelimited+0xf5/0x100
47397d44:  [<080e4a3f>] __do_fault+0x3cf/0x440
47397d9c:  [<080e6e0f>] handle_mm_fault+0xef/0x7c0
47397dec:  [<080e7817>] __get_user_pages+0x227/0x420
47397e24:  [<080e7ae3>] get_user_pages+0x63/0x70
47397e4c:  [<08143dc6>] SyS_io_setup+0x3c6/0x760
47397eb0:  [<08062984>] handle_syscall+0x64/0x80
47397ef0:  [<08074fb5>] userspace+0x475/0x5f0
47397fec:  [<0805f750>] fork_handler+0x60/0x70
47397ffc:  [<00000000>] 0x0

/home/tfoerste/workspace/bin/start_uml.sh: line 115: 18718 Aborted                 (core dumped) $LINUX earlyprintk ubda=$ROOTFS ubdb=$SWAP eth0=$NET mem=$MEM $TTY umid=uml_$NAME rootfstype=ext4 "$ARGS"


From what I see there are 2 different types of issues - and this is an example of the other of both 

> Thanks,
> //richard
> 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-10 16:49                                     ` Toralf Förster
  0 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-10 16:49 UTC (permalink / raw)
  To: Richard Weinberger, Jan Kara
  Cc: Geert Uytterhoeven, UML devel, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org, hannes,
	darrick.wong, Michal Hocko, Wu Fengguang

On 10/10/2013 12:33 AM, Richard Weinberger wrote:
> Am 09.10.2013 23:47, schrieb Jan Kara:
>> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
>>> CC'ing mm folks.
>>> Please see below.
>>   Added Fenguang to CC since he is the author of this code.
> 
> Thx, get_maintainer.pl didn't list him.
> 
>>> Am 09.10.2013 19:26, schrieb Toralf Förster:
>>>> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>>>>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>>>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>>>>>>> asserts its strict positive?!?
>>>>>>>
>>>>>>> Can you please try the following instead of the BUG_ON():
>>>>>>>
>>>>>>> if (pause < 0) {
>>>>>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>>>>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>>>>>>         printk("pause = %ld\n", pause);
>>>>>>> }
>>>>>>>
>>>>>>> Gr{oetje,eeting}s,
>>>>>>>
>>>>>>>                         Geert
>>>>>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
>>>>>> As soon as the issue happens I do have a
>>>>>>
>>>>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>>>>>>
>>>>>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
>>>>>> to run very basic commands like a crash dump, sysrq ond so on.
>>>>>
>>>>> You may get an idea of the magnitude of pages_dirtied by using a chain of
>>>>> BUG_ON()s, like:
>>>>>
>>>>> BUG_ON(pages_dirtied > 2000000000);
>>>>> BUG_ON(pages_dirtied > 1000000000);
>>>>> BUG_ON(pages_dirtied > 100000000);
>>>>> BUG_ON(pages_dirtied > 10000000);
>>>>> BUG_ON(pages_dirtied > 1000000);
>>>>>
>>>>> Probably 1 million is already too much for normal operation?
>>>>>
>>>> period = HZ * pages_dirtied / task_ratelimit;
>>>> 		BUG_ON(pages_dirtied > 2000000000);
>>>> 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
>>>
>>> Summary for mm people:
>>>
>>> Toralf runs trinty on UML/i386.
>>> After some time pages_dirtied becomes very large.
>>> More than 1000000000 pages in this case.
>>   Huh, this is really strange. pages_dirtied is passed into
>> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
>> over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
>> task was ever able to dirty that much during its lifetime (but correct me
>> if I'm wrong here, with UML and memory backed disks it is not totally
>> impossible)... I went through the logic of handling ->nr_dirtied but
>> I didn't find any obvious problem there. Hum, maybe one thing - what
>> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
>> huge, we could possibly accumulate huge current->nr_dirtied.
> 
> Toralf, you can try a snipplet like this one to get the values printed out:
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index f5236f8..a80e520 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space *mapping,
>                         goto pause;
>                 }
>                 period = HZ * pages_dirtied / task_ratelimit;
> +
> +               {
> +                       extern int printf(char *, ...);
> +                       printf("---> task_ratelimit: %lu\n", task_ratelimit);
> +               }
> +
>                 pause = period;
>                 if (current->dirty_paused_when)
>                         pause -= now - current->dirty_paused_when;
> 
> 
> Yes, printf(), not printk().
> Using this hack we print directly to host's stdout. :)
> 
*head smack* ofc - works fine.
So given this diff :

iff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..5a2c337 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1464,6 +1464,13 @@ static void balance_dirty_pages(struct address_space *mapping,
                }
                period = HZ * pages_dirtied / task_ratelimit;
                pause = period;
+               if (pause < 0)  {
+                       extern int printf(char *, ...);
+                       printf("overflow : pause : %li\n", pause);
+                       printf("overflow : pages_dirtied : %lu\n", pages_dirtied);
+                       printf("overflow :  task_ratelimit: %lu\n", task_ratelimit);
+                       BUG_ON(1);
+               }
                if (current->dirty_paused_when)
                        pause -= now - current->dirty_paused_when;
                /*
@@ -1503,6 +1510,13 @@ static void balance_dirty_pages(struct address_space *mapping,
                }

 pause:
+               if (pause < 0)  {
+                       extern int printf(char *, ...);
+                       printf("ick : pause : %li\n", pause);
+                       printf("ick: pages_dirtied : %lu\n", pages_dirtied);
+                       printf("ick: task_ratelimit: %lu\n", task_ratelimit);
+                       BUG_ON(1);
+               }
                trace_balance_dirty_pages(bdi,
                                          dirty_thresh,
                                          background_thresh,


I got this :




 * Starting local
net.core.warnings = 0                                                                                                                            [ ok ]
ick : pause : -984
                  ick: pages_dirtied : 0
                                        ick: task_ratelimit: 0
                                                              Kernel panic - not syncing: BUG!
CPU: 0 PID: 1434 Comm: trinity-child2 Not tainted 3.12.0-rc4-00029-g0e7a3ed-dirty #12
47397c84 47397cb0 0841b5a0 084c30e8 085f76e0 084b4745 47397cbc 00000000 
       fffffc28 01fff278 085cb4a0 47397d2c 0841c5a0 084b4745 084c5398 000005ee 
       08432cf0 43e47600 471757f8 47397cf0 ffffff0c 47397cdc 5256d8b9 3417ec18 47397c5c:  [<08060b2c>] show_stack+0x7c/0xd0
47397c7c:  [<0841e34d>] dump_stack+0x26/0x28
47397c8c:  [<0841b5a0>] panic+0x7a/0x180
47397cb4:  [<0841c5a0>] balance_dirty_pages.isra.32+0x4e3/0x5ad
47397d30:  [<080d3595>] balance_dirty_pages_ratelimited+0xf5/0x100
47397d44:  [<080e4a3f>] __do_fault+0x3cf/0x440
47397d9c:  [<080e6e0f>] handle_mm_fault+0xef/0x7c0
47397dec:  [<080e7817>] __get_user_pages+0x227/0x420
47397e24:  [<080e7ae3>] get_user_pages+0x63/0x70
47397e4c:  [<08143dc6>] SyS_io_setup+0x3c6/0x760
47397eb0:  [<08062984>] handle_syscall+0x64/0x80
47397ef0:  [<08074fb5>] userspace+0x475/0x5f0
47397fec:  [<0805f750>] fork_handler+0x60/0x70
47397ffc:  [<00000000>] 0x0


EIP: 0073:[<40001282>] CPU: 0 Not tainted ESP: 007b:bfb348f8 EFLAGS: 00000246
    Not tainted
EAX: ffffffda EBX: 00001000 ECX: 080d0000 EDX: 80000048
ESI: 80fbff1f EDI: ffe02f77 EBP: 90f6e2a3 DS: 007b ES: 007b
47397c0c:  [<0807947f>] show_regs+0x10f/0x120
47397c28:  [<080623a9>] panic_exit+0x29/0x50
47397c38:  [<0809ba86>] notifier_call_chain+0x36/0x60
47397c60:  [<0809bba1>] __atomic_notifier_call_chain+0x21/0x30
47397c70:  [<0809bbdf>] atomic_notifier_call_chain+0x2f/0x40
47397c8c:  [<0841b5bc>] panic+0x96/0x180
47397cb4:  [<0841c5a0>] balance_dirty_pages.isra.32+0x4e3/0x5ad
47397d30:  [<080d3595>] balance_dirty_pages_ratelimited+0xf5/0x100
47397d44:  [<080e4a3f>] __do_fault+0x3cf/0x440
47397d9c:  [<080e6e0f>] handle_mm_fault+0xef/0x7c0
47397dec:  [<080e7817>] __get_user_pages+0x227/0x420
47397e24:  [<080e7ae3>] get_user_pages+0x63/0x70
47397e4c:  [<08143dc6>] SyS_io_setup+0x3c6/0x760
47397eb0:  [<08062984>] handle_syscall+0x64/0x80
47397ef0:  [<08074fb5>] userspace+0x475/0x5f0
47397fec:  [<0805f750>] fork_handler+0x60/0x70
47397ffc:  [<00000000>] 0x0

/home/tfoerste/workspace/bin/start_uml.sh: line 115: 18718 Aborted                 (core dumped) $LINUX earlyprintk ubda=$ROOTFS ubdb=$SWAP eth0=$NET mem=$MEM $TTY umid=uml_$NAME rootfstype=ext4 "$ARGS"


>From what I see there are 2 different types of issues - and this is an example of the other of both 

> Thanks,
> //richard
> 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-10 16:49                                     ` Toralf Förster
  0 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-10 16:49 UTC (permalink / raw)
  To: Richard Weinberger, Jan Kara
  Cc: Geert Uytterhoeven, UML devel, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org, hannes,
	darrick.wong, Michal Hocko, Wu Fengguang

On 10/10/2013 12:33 AM, Richard Weinberger wrote:
> Am 09.10.2013 23:47, schrieb Jan Kara:
>> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
>>> CC'ing mm folks.
>>> Please see below.
>>   Added Fenguang to CC since he is the author of this code.
> 
> Thx, get_maintainer.pl didn't list him.
> 
>>> Am 09.10.2013 19:26, schrieb Toralf FA?rster:
>>>> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>>>>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf FA?rster <toralf.foerster@gmx.de> wrote:
>>>>>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>>>>>>> asserts its strict positive?!?
>>>>>>>
>>>>>>> Can you please try the following instead of the BUG_ON():
>>>>>>>
>>>>>>> if (pause < 0) {
>>>>>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>>>>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>>>>>>         printk("pause = %ld\n", pause);
>>>>>>> }
>>>>>>>
>>>>>>> Gr{oetje,eeting}s,
>>>>>>>
>>>>>>>                         Geert
>>>>>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
>>>>>> As soon as the issue happens I do have a
>>>>>>
>>>>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>>>>>>
>>>>>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
>>>>>> to run very basic commands like a crash dump, sysrq ond so on.
>>>>>
>>>>> You may get an idea of the magnitude of pages_dirtied by using a chain of
>>>>> BUG_ON()s, like:
>>>>>
>>>>> BUG_ON(pages_dirtied > 2000000000);
>>>>> BUG_ON(pages_dirtied > 1000000000);
>>>>> BUG_ON(pages_dirtied > 100000000);
>>>>> BUG_ON(pages_dirtied > 10000000);
>>>>> BUG_ON(pages_dirtied > 1000000);
>>>>>
>>>>> Probably 1 million is already too much for normal operation?
>>>>>
>>>> period = HZ * pages_dirtied / task_ratelimit;
>>>> 		BUG_ON(pages_dirtied > 2000000000);
>>>> 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
>>>
>>> Summary for mm people:
>>>
>>> Toralf runs trinty on UML/i386.
>>> After some time pages_dirtied becomes very large.
>>> More than 1000000000 pages in this case.
>>   Huh, this is really strange. pages_dirtied is passed into
>> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
>> over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
>> task was ever able to dirty that much during its lifetime (but correct me
>> if I'm wrong here, with UML and memory backed disks it is not totally
>> impossible)... I went through the logic of handling ->nr_dirtied but
>> I didn't find any obvious problem there. Hum, maybe one thing - what
>> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
>> huge, we could possibly accumulate huge current->nr_dirtied.
> 
> Toralf, you can try a snipplet like this one to get the values printed out:
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index f5236f8..a80e520 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space *mapping,
>                         goto pause;
>                 }
>                 period = HZ * pages_dirtied / task_ratelimit;
> +
> +               {
> +                       extern int printf(char *, ...);
> +                       printf("---> task_ratelimit: %lu\n", task_ratelimit);
> +               }
> +
>                 pause = period;
>                 if (current->dirty_paused_when)
>                         pause -= now - current->dirty_paused_when;
> 
> 
> Yes, printf(), not printk().
> Using this hack we print directly to host's stdout. :)
> 
*head smack* ofc - works fine.
So given this diff :

iff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..5a2c337 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1464,6 +1464,13 @@ static void balance_dirty_pages(struct address_space *mapping,
                }
                period = HZ * pages_dirtied / task_ratelimit;
                pause = period;
+               if (pause < 0)  {
+                       extern int printf(char *, ...);
+                       printf("overflow : pause : %li\n", pause);
+                       printf("overflow : pages_dirtied : %lu\n", pages_dirtied);
+                       printf("overflow :  task_ratelimit: %lu\n", task_ratelimit);
+                       BUG_ON(1);
+               }
                if (current->dirty_paused_when)
                        pause -= now - current->dirty_paused_when;
                /*
@@ -1503,6 +1510,13 @@ static void balance_dirty_pages(struct address_space *mapping,
                }

 pause:
+               if (pause < 0)  {
+                       extern int printf(char *, ...);
+                       printf("ick : pause : %li\n", pause);
+                       printf("ick: pages_dirtied : %lu\n", pages_dirtied);
+                       printf("ick: task_ratelimit: %lu\n", task_ratelimit);
+                       BUG_ON(1);
+               }
                trace_balance_dirty_pages(bdi,
                                          dirty_thresh,
                                          background_thresh,


I got this :




 * Starting local
net.core.warnings = 0                                                                                                                            [ ok ]
ick : pause : -984
                  ick: pages_dirtied : 0
                                        ick: task_ratelimit: 0
                                                              Kernel panic - not syncing: BUG!
CPU: 0 PID: 1434 Comm: trinity-child2 Not tainted 3.12.0-rc4-00029-g0e7a3ed-dirty #12
47397c84 47397cb0 0841b5a0 084c30e8 085f76e0 084b4745 47397cbc 00000000 
       fffffc28 01fff278 085cb4a0 47397d2c 0841c5a0 084b4745 084c5398 000005ee 
       08432cf0 43e47600 471757f8 47397cf0 ffffff0c 47397cdc 5256d8b9 3417ec18 47397c5c:  [<08060b2c>] show_stack+0x7c/0xd0
47397c7c:  [<0841e34d>] dump_stack+0x26/0x28
47397c8c:  [<0841b5a0>] panic+0x7a/0x180
47397cb4:  [<0841c5a0>] balance_dirty_pages.isra.32+0x4e3/0x5ad
47397d30:  [<080d3595>] balance_dirty_pages_ratelimited+0xf5/0x100
47397d44:  [<080e4a3f>] __do_fault+0x3cf/0x440
47397d9c:  [<080e6e0f>] handle_mm_fault+0xef/0x7c0
47397dec:  [<080e7817>] __get_user_pages+0x227/0x420
47397e24:  [<080e7ae3>] get_user_pages+0x63/0x70
47397e4c:  [<08143dc6>] SyS_io_setup+0x3c6/0x760
47397eb0:  [<08062984>] handle_syscall+0x64/0x80
47397ef0:  [<08074fb5>] userspace+0x475/0x5f0
47397fec:  [<0805f750>] fork_handler+0x60/0x70
47397ffc:  [<00000000>] 0x0


EIP: 0073:[<40001282>] CPU: 0 Not tainted ESP: 007b:bfb348f8 EFLAGS: 00000246
    Not tainted
EAX: ffffffda EBX: 00001000 ECX: 080d0000 EDX: 80000048
ESI: 80fbff1f EDI: ffe02f77 EBP: 90f6e2a3 DS: 007b ES: 007b
47397c0c:  [<0807947f>] show_regs+0x10f/0x120
47397c28:  [<080623a9>] panic_exit+0x29/0x50
47397c38:  [<0809ba86>] notifier_call_chain+0x36/0x60
47397c60:  [<0809bba1>] __atomic_notifier_call_chain+0x21/0x30
47397c70:  [<0809bbdf>] atomic_notifier_call_chain+0x2f/0x40
47397c8c:  [<0841b5bc>] panic+0x96/0x180
47397cb4:  [<0841c5a0>] balance_dirty_pages.isra.32+0x4e3/0x5ad
47397d30:  [<080d3595>] balance_dirty_pages_ratelimited+0xf5/0x100
47397d44:  [<080e4a3f>] __do_fault+0x3cf/0x440
47397d9c:  [<080e6e0f>] handle_mm_fault+0xef/0x7c0
47397dec:  [<080e7817>] __get_user_pages+0x227/0x420
47397e24:  [<080e7ae3>] get_user_pages+0x63/0x70
47397e4c:  [<08143dc6>] SyS_io_setup+0x3c6/0x760
47397eb0:  [<08062984>] handle_syscall+0x64/0x80
47397ef0:  [<08074fb5>] userspace+0x475/0x5f0
47397fec:  [<0805f750>] fork_handler+0x60/0x70
47397ffc:  [<00000000>] 0x0

/home/tfoerste/workspace/bin/start_uml.sh: line 115: 18718 Aborted                 (core dumped) $LINUX earlyprintk ubda=$ROOTFS ubdb=$SWAP eth0=$NET mem=$MEM $TTY umid=uml_$NAME rootfstype=ext4 "$ARGS"


>From what I see there are 2 different types of issues - and this is an example of the other of both 

> Thanks,
> //richard
> 


-- 
MfG/Sincerely
Toralf FA?rster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-10 16:49                                     ` Toralf Förster
@ 2013-10-11  1:16                                       ` Fengguang Wu
  -1 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-11  1:16 UTC (permalink / raw)
  To: Toralf Förster
  Cc: Richard Weinberger, Jan Kara, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko

On Thu, Oct 10, 2013 at 06:49:30PM +0200, Toralf Förster wrote:
> On 10/10/2013 12:33 AM, Richard Weinberger wrote:
> > Am 09.10.2013 23:47, schrieb Jan Kara:
> >> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
> >>> CC'ing mm folks.
> >>> Please see below.
> >>   Added Fenguang to CC since he is the author of this code.
> > 
> > Thx, get_maintainer.pl didn't list him.
> > 
> >>> Am 09.10.2013 19:26, schrieb Toralf Förster:
> >>>> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
> >>>>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
> >>>>>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
> >>>>>>> asserts its strict positive?!?
> >>>>>>>
> >>>>>>> Can you please try the following instead of the BUG_ON():
> >>>>>>>
> >>>>>>> if (pause < 0) {
> >>>>>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
> >>>>>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
> >>>>>>>         printk("pause = %ld\n", pause);
> >>>>>>> }
> >>>>>>>
> >>>>>>> Gr{oetje,eeting}s,
> >>>>>>>
> >>>>>>>                         Geert
> >>>>>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
> >>>>>> As soon as the issue happens I do have a
> >>>>>>
> >>>>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
> >>>>>>
> >>>>>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
> >>>>>> to run very basic commands like a crash dump, sysrq ond so on.
> >>>>>
> >>>>> You may get an idea of the magnitude of pages_dirtied by using a chain of
> >>>>> BUG_ON()s, like:
> >>>>>
> >>>>> BUG_ON(pages_dirtied > 2000000000);
> >>>>> BUG_ON(pages_dirtied > 1000000000);
> >>>>> BUG_ON(pages_dirtied > 100000000);
> >>>>> BUG_ON(pages_dirtied > 10000000);
> >>>>> BUG_ON(pages_dirtied > 1000000);
> >>>>>
> >>>>> Probably 1 million is already too much for normal operation?
> >>>>>
> >>>> period = HZ * pages_dirtied / task_ratelimit;
> >>>> 		BUG_ON(pages_dirtied > 2000000000);
> >>>> 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
> >>>
> >>> Summary for mm people:
> >>>
> >>> Toralf runs trinty on UML/i386.
> >>> After some time pages_dirtied becomes very large.
> >>> More than 1000000000 pages in this case.
> >>   Huh, this is really strange. pages_dirtied is passed into
> >> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
> >> over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
> >> task was ever able to dirty that much during its lifetime (but correct me
> >> if I'm wrong here, with UML and memory backed disks it is not totally
> >> impossible)... I went through the logic of handling ->nr_dirtied but
> >> I didn't find any obvious problem there. Hum, maybe one thing - what
> >> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
> >> huge, we could possibly accumulate huge current->nr_dirtied.
> > 
> > Toralf, you can try a snipplet like this one to get the values printed out:
> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > index f5236f8..a80e520 100644
> > --- a/mm/page-writeback.c
> > +++ b/mm/page-writeback.c
> > @@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space *mapping,
> >                         goto pause;
> >                 }
> >                 period = HZ * pages_dirtied / task_ratelimit;
> > +
> > +               {
> > +                       extern int printf(char *, ...);
> > +                       printf("---> task_ratelimit: %lu\n", task_ratelimit);
> > +               }
> > +
> >                 pause = period;
> >                 if (current->dirty_paused_when)
> >                         pause -= now - current->dirty_paused_when;
> > 
> > 
> > Yes, printf(), not printk().
> > Using this hack we print directly to host's stdout. :)
> > 
> *head smack* ofc - works fine.
> So given this diff :
> 
> iff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index f5236f8..5a2c337 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1464,6 +1464,13 @@ static void balance_dirty_pages(struct address_space *mapping,
>                 }
>                 period = HZ * pages_dirtied / task_ratelimit;
>                 pause = period;
> +               if (pause < 0)  {
> +                       extern int printf(char *, ...);
> +                       printf("overflow : pause : %li\n", pause);
> +                       printf("overflow : pages_dirtied : %lu\n", pages_dirtied);
> +                       printf("overflow :  task_ratelimit: %lu\n", task_ratelimit);
> +                       BUG_ON(1);
> +               }
>                 if (current->dirty_paused_when)
>                         pause -= now - current->dirty_paused_when;
>                 /*
> @@ -1503,6 +1510,13 @@ static void balance_dirty_pages(struct address_space *mapping,
>                 }
> 
>  pause:
> +               if (pause < 0)  {

Oops, we got 

        ick : pause : -984

here! Since pause is bounded by [min_pause, max_pause], it means
something goes wrong with the bounds. Would you help print min_pause
and max_pause as well?

It seems there are "long <=> unsigned long" conversion problems in
bdi_max_pause(), which might make max_pause a negative number. And the
min_pause calculation is based on max_pause to some degree..

Thanks,
Fengguang

> +                       extern int printf(char *, ...);
> +                       printf("ick : pause : %li\n", pause);
> +                       printf("ick: pages_dirtied : %lu\n", pages_dirtied);
> +                       printf("ick: task_ratelimit: %lu\n", task_ratelimit);
> +                       BUG_ON(1);
> +               }
>                 trace_balance_dirty_pages(bdi,
>                                           dirty_thresh,
>                                           background_thresh,
> 
> 
> I got this :
> 
> 
> 
> 
>  * Starting local
> net.core.warnings = 0                                                                                                                            [ ok ]
> ick : pause : -984
>                   ick: pages_dirtied : 0
>                                         ick: task_ratelimit: 0
>                                                               Kernel panic - not syncing: BUG!
> CPU: 0 PID: 1434 Comm: trinity-child2 Not tainted 3.12.0-rc4-00029-g0e7a3ed-dirty #12
> 47397c84 47397cb0 0841b5a0 084c30e8 085f76e0 084b4745 47397cbc 00000000 
>        fffffc28 01fff278 085cb4a0 47397d2c 0841c5a0 084b4745 084c5398 000005ee 
>        08432cf0 43e47600 471757f8 47397cf0 ffffff0c 47397cdc 5256d8b9 3417ec18 47397c5c:  [<08060b2c>] show_stack+0x7c/0xd0
> 47397c7c:  [<0841e34d>] dump_stack+0x26/0x28
> 47397c8c:  [<0841b5a0>] panic+0x7a/0x180
> 47397cb4:  [<0841c5a0>] balance_dirty_pages.isra.32+0x4e3/0x5ad
> 47397d30:  [<080d3595>] balance_dirty_pages_ratelimited+0xf5/0x100
> 47397d44:  [<080e4a3f>] __do_fault+0x3cf/0x440
> 47397d9c:  [<080e6e0f>] handle_mm_fault+0xef/0x7c0
> 47397dec:  [<080e7817>] __get_user_pages+0x227/0x420
> 47397e24:  [<080e7ae3>] get_user_pages+0x63/0x70
> 47397e4c:  [<08143dc6>] SyS_io_setup+0x3c6/0x760
> 47397eb0:  [<08062984>] handle_syscall+0x64/0x80
> 47397ef0:  [<08074fb5>] userspace+0x475/0x5f0
> 47397fec:  [<0805f750>] fork_handler+0x60/0x70
> 47397ffc:  [<00000000>] 0x0
> 
> 
> EIP: 0073:[<40001282>] CPU: 0 Not tainted ESP: 007b:bfb348f8 EFLAGS: 00000246
>     Not tainted
> EAX: ffffffda EBX: 00001000 ECX: 080d0000 EDX: 80000048
> ESI: 80fbff1f EDI: ffe02f77 EBP: 90f6e2a3 DS: 007b ES: 007b
> 47397c0c:  [<0807947f>] show_regs+0x10f/0x120
> 47397c28:  [<080623a9>] panic_exit+0x29/0x50
> 47397c38:  [<0809ba86>] notifier_call_chain+0x36/0x60
> 47397c60:  [<0809bba1>] __atomic_notifier_call_chain+0x21/0x30
> 47397c70:  [<0809bbdf>] atomic_notifier_call_chain+0x2f/0x40
> 47397c8c:  [<0841b5bc>] panic+0x96/0x180
> 47397cb4:  [<0841c5a0>] balance_dirty_pages.isra.32+0x4e3/0x5ad
> 47397d30:  [<080d3595>] balance_dirty_pages_ratelimited+0xf5/0x100
> 47397d44:  [<080e4a3f>] __do_fault+0x3cf/0x440
> 47397d9c:  [<080e6e0f>] handle_mm_fault+0xef/0x7c0
> 47397dec:  [<080e7817>] __get_user_pages+0x227/0x420
> 47397e24:  [<080e7ae3>] get_user_pages+0x63/0x70
> 47397e4c:  [<08143dc6>] SyS_io_setup+0x3c6/0x760
> 47397eb0:  [<08062984>] handle_syscall+0x64/0x80
> 47397ef0:  [<08074fb5>] userspace+0x475/0x5f0
> 47397fec:  [<0805f750>] fork_handler+0x60/0x70
> 47397ffc:  [<00000000>] 0x0
> 
> /home/tfoerste/workspace/bin/start_uml.sh: line 115: 18718 Aborted                 (core dumped) $LINUX earlyprintk ubda=$ROOTFS ubdb=$SWAP eth0=$NET mem=$MEM $TTY umid=uml_$NAME rootfstype=ext4 "$ARGS"
> 
> 
> >From what I see there are 2 different types of issues - and this is an example of the other of both 
> 
> > Thanks,
> > //richard
> > 
> 
> 
> -- 
> MfG/Sincerely
> Toralf Förster
> pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-11  1:16                                       ` Fengguang Wu
  0 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-11  1:16 UTC (permalink / raw)
  To: Toralf Förster
  Cc: Richard Weinberger, Jan Kara, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko

On Thu, Oct 10, 2013 at 06:49:30PM +0200, Toralf FA?rster wrote:
> On 10/10/2013 12:33 AM, Richard Weinberger wrote:
> > Am 09.10.2013 23:47, schrieb Jan Kara:
> >> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
> >>> CC'ing mm folks.
> >>> Please see below.
> >>   Added Fenguang to CC since he is the author of this code.
> > 
> > Thx, get_maintainer.pl didn't list him.
> > 
> >>> Am 09.10.2013 19:26, schrieb Toralf FA?rster:
> >>>> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
> >>>>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf FA?rster <toralf.foerster@gmx.de> wrote:
> >>>>>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
> >>>>>>> asserts its strict positive?!?
> >>>>>>>
> >>>>>>> Can you please try the following instead of the BUG_ON():
> >>>>>>>
> >>>>>>> if (pause < 0) {
> >>>>>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
> >>>>>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
> >>>>>>>         printk("pause = %ld\n", pause);
> >>>>>>> }
> >>>>>>>
> >>>>>>> Gr{oetje,eeting}s,
> >>>>>>>
> >>>>>>>                         Geert
> >>>>>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
> >>>>>> As soon as the issue happens I do have a
> >>>>>>
> >>>>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
> >>>>>>
> >>>>>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
> >>>>>> to run very basic commands like a crash dump, sysrq ond so on.
> >>>>>
> >>>>> You may get an idea of the magnitude of pages_dirtied by using a chain of
> >>>>> BUG_ON()s, like:
> >>>>>
> >>>>> BUG_ON(pages_dirtied > 2000000000);
> >>>>> BUG_ON(pages_dirtied > 1000000000);
> >>>>> BUG_ON(pages_dirtied > 100000000);
> >>>>> BUG_ON(pages_dirtied > 10000000);
> >>>>> BUG_ON(pages_dirtied > 1000000);
> >>>>>
> >>>>> Probably 1 million is already too much for normal operation?
> >>>>>
> >>>> period = HZ * pages_dirtied / task_ratelimit;
> >>>> 		BUG_ON(pages_dirtied > 2000000000);
> >>>> 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
> >>>
> >>> Summary for mm people:
> >>>
> >>> Toralf runs trinty on UML/i386.
> >>> After some time pages_dirtied becomes very large.
> >>> More than 1000000000 pages in this case.
> >>   Huh, this is really strange. pages_dirtied is passed into
> >> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
> >> over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
> >> task was ever able to dirty that much during its lifetime (but correct me
> >> if I'm wrong here, with UML and memory backed disks it is not totally
> >> impossible)... I went through the logic of handling ->nr_dirtied but
> >> I didn't find any obvious problem there. Hum, maybe one thing - what
> >> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
> >> huge, we could possibly accumulate huge current->nr_dirtied.
> > 
> > Toralf, you can try a snipplet like this one to get the values printed out:
> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > index f5236f8..a80e520 100644
> > --- a/mm/page-writeback.c
> > +++ b/mm/page-writeback.c
> > @@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space *mapping,
> >                         goto pause;
> >                 }
> >                 period = HZ * pages_dirtied / task_ratelimit;
> > +
> > +               {
> > +                       extern int printf(char *, ...);
> > +                       printf("---> task_ratelimit: %lu\n", task_ratelimit);
> > +               }
> > +
> >                 pause = period;
> >                 if (current->dirty_paused_when)
> >                         pause -= now - current->dirty_paused_when;
> > 
> > 
> > Yes, printf(), not printk().
> > Using this hack we print directly to host's stdout. :)
> > 
> *head smack* ofc - works fine.
> So given this diff :
> 
> iff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index f5236f8..5a2c337 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1464,6 +1464,13 @@ static void balance_dirty_pages(struct address_space *mapping,
>                 }
>                 period = HZ * pages_dirtied / task_ratelimit;
>                 pause = period;
> +               if (pause < 0)  {
> +                       extern int printf(char *, ...);
> +                       printf("overflow : pause : %li\n", pause);
> +                       printf("overflow : pages_dirtied : %lu\n", pages_dirtied);
> +                       printf("overflow :  task_ratelimit: %lu\n", task_ratelimit);
> +                       BUG_ON(1);
> +               }
>                 if (current->dirty_paused_when)
>                         pause -= now - current->dirty_paused_when;
>                 /*
> @@ -1503,6 +1510,13 @@ static void balance_dirty_pages(struct address_space *mapping,
>                 }
> 
>  pause:
> +               if (pause < 0)  {

Oops, we got 

        ick : pause : -984

here! Since pause is bounded by [min_pause, max_pause], it means
something goes wrong with the bounds. Would you help print min_pause
and max_pause as well?

It seems there are "long <=> unsigned long" conversion problems in
bdi_max_pause(), which might make max_pause a negative number. And the
min_pause calculation is based on max_pause to some degree..

Thanks,
Fengguang

> +                       extern int printf(char *, ...);
> +                       printf("ick : pause : %li\n", pause);
> +                       printf("ick: pages_dirtied : %lu\n", pages_dirtied);
> +                       printf("ick: task_ratelimit: %lu\n", task_ratelimit);
> +                       BUG_ON(1);
> +               }
>                 trace_balance_dirty_pages(bdi,
>                                           dirty_thresh,
>                                           background_thresh,
> 
> 
> I got this :
> 
> 
> 
> 
>  * Starting local
> net.core.warnings = 0                                                                                                                            [ ok ]
> ick : pause : -984
>                   ick: pages_dirtied : 0
>                                         ick: task_ratelimit: 0
>                                                               Kernel panic - not syncing: BUG!
> CPU: 0 PID: 1434 Comm: trinity-child2 Not tainted 3.12.0-rc4-00029-g0e7a3ed-dirty #12
> 47397c84 47397cb0 0841b5a0 084c30e8 085f76e0 084b4745 47397cbc 00000000 
>        fffffc28 01fff278 085cb4a0 47397d2c 0841c5a0 084b4745 084c5398 000005ee 
>        08432cf0 43e47600 471757f8 47397cf0 ffffff0c 47397cdc 5256d8b9 3417ec18 47397c5c:  [<08060b2c>] show_stack+0x7c/0xd0
> 47397c7c:  [<0841e34d>] dump_stack+0x26/0x28
> 47397c8c:  [<0841b5a0>] panic+0x7a/0x180
> 47397cb4:  [<0841c5a0>] balance_dirty_pages.isra.32+0x4e3/0x5ad
> 47397d30:  [<080d3595>] balance_dirty_pages_ratelimited+0xf5/0x100
> 47397d44:  [<080e4a3f>] __do_fault+0x3cf/0x440
> 47397d9c:  [<080e6e0f>] handle_mm_fault+0xef/0x7c0
> 47397dec:  [<080e7817>] __get_user_pages+0x227/0x420
> 47397e24:  [<080e7ae3>] get_user_pages+0x63/0x70
> 47397e4c:  [<08143dc6>] SyS_io_setup+0x3c6/0x760
> 47397eb0:  [<08062984>] handle_syscall+0x64/0x80
> 47397ef0:  [<08074fb5>] userspace+0x475/0x5f0
> 47397fec:  [<0805f750>] fork_handler+0x60/0x70
> 47397ffc:  [<00000000>] 0x0
> 
> 
> EIP: 0073:[<40001282>] CPU: 0 Not tainted ESP: 007b:bfb348f8 EFLAGS: 00000246
>     Not tainted
> EAX: ffffffda EBX: 00001000 ECX: 080d0000 EDX: 80000048
> ESI: 80fbff1f EDI: ffe02f77 EBP: 90f6e2a3 DS: 007b ES: 007b
> 47397c0c:  [<0807947f>] show_regs+0x10f/0x120
> 47397c28:  [<080623a9>] panic_exit+0x29/0x50
> 47397c38:  [<0809ba86>] notifier_call_chain+0x36/0x60
> 47397c60:  [<0809bba1>] __atomic_notifier_call_chain+0x21/0x30
> 47397c70:  [<0809bbdf>] atomic_notifier_call_chain+0x2f/0x40
> 47397c8c:  [<0841b5bc>] panic+0x96/0x180
> 47397cb4:  [<0841c5a0>] balance_dirty_pages.isra.32+0x4e3/0x5ad
> 47397d30:  [<080d3595>] balance_dirty_pages_ratelimited+0xf5/0x100
> 47397d44:  [<080e4a3f>] __do_fault+0x3cf/0x440
> 47397d9c:  [<080e6e0f>] handle_mm_fault+0xef/0x7c0
> 47397dec:  [<080e7817>] __get_user_pages+0x227/0x420
> 47397e24:  [<080e7ae3>] get_user_pages+0x63/0x70
> 47397e4c:  [<08143dc6>] SyS_io_setup+0x3c6/0x760
> 47397eb0:  [<08062984>] handle_syscall+0x64/0x80
> 47397ef0:  [<08074fb5>] userspace+0x475/0x5f0
> 47397fec:  [<0805f750>] fork_handler+0x60/0x70
> 47397ffc:  [<00000000>] 0x0
> 
> /home/tfoerste/workspace/bin/start_uml.sh: line 115: 18718 Aborted                 (core dumped) $LINUX earlyprintk ubda=$ROOTFS ubdb=$SWAP eth0=$NET mem=$MEM $TTY umid=uml_$NAME rootfstype=ext4 "$ARGS"
> 
> 
> >From what I see there are 2 different types of issues - and this is an example of the other of both 
> 
> > Thanks,
> > //richard
> > 
> 
> 
> -- 
> MfG/Sincerely
> Toralf FA?rster
> pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-11  1:16                                       ` Fengguang Wu
@ 2013-10-11  8:42                                         ` Toralf Förster
  -1 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-11  8:42 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Richard Weinberger, Jan Kara, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko

yeah, now the picture becomes more clear
...
net.core.warnings = 0                                                                         [ ok ]
ick: pause : -717
                 ick : min_pause : -177
                                   ick : max_pause : -717
                                                     ick: pages_dirtied : 14
                                                                            ick: task_ratelimit: 0


On 10/11/2013 03:16 AM, Fengguang Wu wrote:
> On Thu, Oct 10, 2013 at 06:49:30PM +0200, Toralf Förster wrote:
>> On 10/10/2013 12:33 AM, Richard Weinberger wrote:
>>> Am 09.10.2013 23:47, schrieb Jan Kara:
>>>> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
>>>>> CC'ing mm folks.
>>>>> Please see below.
>>>>   Added Fenguang to CC since he is the author of this code.
>>>
>>> Thx, get_maintainer.pl didn't list him.
>>>
>>>>> Am 09.10.2013 19:26, schrieb Toralf Förster:
>>>>>> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>>>>>>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>>>>>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>>>>>>>>> asserts its strict positive?!?
>>>>>>>>>
>>>>>>>>> Can you please try the following instead of the BUG_ON():
>>>>>>>>>
>>>>>>>>> if (pause < 0) {
>>>>>>>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>>>>>>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>>>>>>>>         printk("pause = %ld\n", pause);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Gr{oetje,eeting}s,
>>>>>>>>>
>>>>>>>>>                         Geert
>>>>>>>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
>>>>>>>> As soon as the issue happens I do have a
>>>>>>>>
>>>>>>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>>>>>>>>
>>>>>>>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
>>>>>>>> to run very basic commands like a crash dump, sysrq ond so on.
>>>>>>>
>>>>>>> You may get an idea of the magnitude of pages_dirtied by using a chain of
>>>>>>> BUG_ON()s, like:
>>>>>>>
>>>>>>> BUG_ON(pages_dirtied > 2000000000);
>>>>>>> BUG_ON(pages_dirtied > 1000000000);
>>>>>>> BUG_ON(pages_dirtied > 100000000);
>>>>>>> BUG_ON(pages_dirtied > 10000000);
>>>>>>> BUG_ON(pages_dirtied > 1000000);
>>>>>>>
>>>>>>> Probably 1 million is already too much for normal operation?
>>>>>>>
>>>>>> period = HZ * pages_dirtied / task_ratelimit;
>>>>>> 		BUG_ON(pages_dirtied > 2000000000);
>>>>>> 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
>>>>>
>>>>> Summary for mm people:
>>>>>
>>>>> Toralf runs trinty on UML/i386.
>>>>> After some time pages_dirtied becomes very large.
>>>>> More than 1000000000 pages in this case.
>>>>   Huh, this is really strange. pages_dirtied is passed into
>>>> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
>>>> over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
>>>> task was ever able to dirty that much during its lifetime (but correct me
>>>> if I'm wrong here, with UML and memory backed disks it is not totally
>>>> impossible)... I went through the logic of handling ->nr_dirtied but
>>>> I didn't find any obvious problem there. Hum, maybe one thing - what
>>>> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
>>>> huge, we could possibly accumulate huge current->nr_dirtied.
>>>
>>> Toralf, you can try a snipplet like this one to get the values printed out:
>>> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>>> index f5236f8..a80e520 100644
>>> --- a/mm/page-writeback.c
>>> +++ b/mm/page-writeback.c
>>> @@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space *mapping,
>>>                         goto pause;
>>>                 }
>>>                 period = HZ * pages_dirtied / task_ratelimit;
>>> +
>>> +               {
>>> +                       extern int printf(char *, ...);
>>> +                       printf("---> task_ratelimit: %lu\n", task_ratelimit);
>>> +               }
>>> +
>>>                 pause = period;
>>>                 if (current->dirty_paused_when)
>>>                         pause -= now - current->dirty_paused_when;
>>>
>>>
>>> Yes, printf(), not printk().
>>> Using this hack we print directly to host's stdout. :)
>>>
>> *head smack* ofc - works fine.
>> So given this diff :
>>
>> iff --git a/mm/page-writeback.c b/mm/page-writeback.c
>> index f5236f8..5a2c337 100644
>> --- a/mm/page-writeback.c
>> +++ b/mm/page-writeback.c
>> @@ -1464,6 +1464,13 @@ static void balance_dirty_pages(struct address_space *mapping,
>>                 }
>>                 period = HZ * pages_dirtied / task_ratelimit;
>>                 pause = period;
>> +               if (pause < 0)  {
>> +                       extern int printf(char *, ...);
>> +                       printf("overflow : pause : %li\n", pause);
>> +                       printf("overflow : pages_dirtied : %lu\n", pages_dirtied);
>> +                       printf("overflow :  task_ratelimit: %lu\n", task_ratelimit);
>> +                       BUG_ON(1);
>> +               }
>>                 if (current->dirty_paused_when)
>>                         pause -= now - current->dirty_paused_when;
>>                 /*
>> @@ -1503,6 +1510,13 @@ static void balance_dirty_pages(struct address_space *mapping,
>>                 }
>>
>>  pause:
>> +               if (pause < 0)  {
> 
> Oops, we got 
> 
>         ick : pause : -984
> 
> here! Since pause is bounded by [min_pause, max_pause], it means
> something goes wrong with the bounds. Would you help print min_pause
> and max_pause as well?
> 
> It seems there are "long <=> unsigned long" conversion problems in
> bdi_max_pause(), which might make max_pause a negative number. And the
> min_pause calculation is based on max_pause to some degree..
> 
> Thanks,
> Fengguang
> 
>> +                       extern int printf(char *, ...);
>> +                       printf("ick : pause : %li\n", pause);
>> +                       printf("ick: pages_dirtied : %lu\n", pages_dirtied);
>> +                       printf("ick: task_ratelimit: %lu\n", task_ratelimit);
>> +                       BUG_ON(1);
>> +               }
>>                 trace_balance_dirty_pages(bdi,
>>                                           dirty_thresh,
>>                                           background_thresh,
>>
>>
>> I got this :
>>
>>
>>
>>
>>  * Starting local
>> net.core.warnings = 0                                                                                                                            [ ok ]
>> ick : pause : -984
>>                   ick: pages_dirtied : 0
>>                                         ick: task_ratelimit: 0
>>                                                               Kernel panic - not syncing: BUG!
>> CPU: 0 PID: 1434 Comm: trinity-child2 Not tainted 3.12.0-rc4-00029-g0e7a3ed-dirty #12
>> 47397c84 47397cb0 0841b5a0 084c30e8 085f76e0 084b4745 47397cbc 00000000 
>>        fffffc28 01fff278 085cb4a0 47397d2c 0841c5a0 084b4745 084c5398 000005ee 
>>        08432cf0 43e47600 471757f8 47397cf0 ffffff0c 47397cdc 5256d8b9 3417ec18 47397c5c:  [<08060b2c>] show_stack+0x7c/0xd0
>> 47397c7c:  [<0841e34d>] dump_stack+0x26/0x28
>> 47397c8c:  [<0841b5a0>] panic+0x7a/0x180
>> 47397cb4:  [<0841c5a0>] balance_dirty_pages.isra.32+0x4e3/0x5ad
>> 47397d30:  [<080d3595>] balance_dirty_pages_ratelimited+0xf5/0x100
>> 47397d44:  [<080e4a3f>] __do_fault+0x3cf/0x440
>> 47397d9c:  [<080e6e0f>] handle_mm_fault+0xef/0x7c0
>> 47397dec:  [<080e7817>] __get_user_pages+0x227/0x420
>> 47397e24:  [<080e7ae3>] get_user_pages+0x63/0x70
>> 47397e4c:  [<08143dc6>] SyS_io_setup+0x3c6/0x760
>> 47397eb0:  [<08062984>] handle_syscall+0x64/0x80
>> 47397ef0:  [<08074fb5>] userspace+0x475/0x5f0
>> 47397fec:  [<0805f750>] fork_handler+0x60/0x70
>> 47397ffc:  [<00000000>] 0x0
>>
>>
>> EIP: 0073:[<40001282>] CPU: 0 Not tainted ESP: 007b:bfb348f8 EFLAGS: 00000246
>>     Not tainted
>> EAX: ffffffda EBX: 00001000 ECX: 080d0000 EDX: 80000048
>> ESI: 80fbff1f EDI: ffe02f77 EBP: 90f6e2a3 DS: 007b ES: 007b
>> 47397c0c:  [<0807947f>] show_regs+0x10f/0x120
>> 47397c28:  [<080623a9>] panic_exit+0x29/0x50
>> 47397c38:  [<0809ba86>] notifier_call_chain+0x36/0x60
>> 47397c60:  [<0809bba1>] __atomic_notifier_call_chain+0x21/0x30
>> 47397c70:  [<0809bbdf>] atomic_notifier_call_chain+0x2f/0x40
>> 47397c8c:  [<0841b5bc>] panic+0x96/0x180
>> 47397cb4:  [<0841c5a0>] balance_dirty_pages.isra.32+0x4e3/0x5ad
>> 47397d30:  [<080d3595>] balance_dirty_pages_ratelimited+0xf5/0x100
>> 47397d44:  [<080e4a3f>] __do_fault+0x3cf/0x440
>> 47397d9c:  [<080e6e0f>] handle_mm_fault+0xef/0x7c0
>> 47397dec:  [<080e7817>] __get_user_pages+0x227/0x420
>> 47397e24:  [<080e7ae3>] get_user_pages+0x63/0x70
>> 47397e4c:  [<08143dc6>] SyS_io_setup+0x3c6/0x760
>> 47397eb0:  [<08062984>] handle_syscall+0x64/0x80
>> 47397ef0:  [<08074fb5>] userspace+0x475/0x5f0
>> 47397fec:  [<0805f750>] fork_handler+0x60/0x70
>> 47397ffc:  [<00000000>] 0x0
>>
>> /home/tfoerste/workspace/bin/start_uml.sh: line 115: 18718 Aborted                 (core dumped) $LINUX earlyprintk ubda=$ROOTFS ubdb=$SWAP eth0=$NET mem=$MEM $TTY umid=uml_$NAME rootfstype=ext4 "$ARGS"
>>
>>
>> >From what I see there are 2 different types of issues - and this is an example of the other of both 
>>
>>> Thanks,
>>> //richard
>>>
>>
>>
>> -- 
>> MfG/Sincerely
>> Toralf Förster
>> pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3
> 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-11  8:42                                         ` Toralf Förster
  0 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-11  8:42 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Richard Weinberger, Jan Kara, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko

yeah, now the picture becomes more clear
...
net.core.warnings = 0                                                                         [ ok ]
ick: pause : -717
                 ick : min_pause : -177
                                   ick : max_pause : -717
                                                     ick: pages_dirtied : 14
                                                                            ick: task_ratelimit: 0


On 10/11/2013 03:16 AM, Fengguang Wu wrote:
> On Thu, Oct 10, 2013 at 06:49:30PM +0200, Toralf FA?rster wrote:
>> On 10/10/2013 12:33 AM, Richard Weinberger wrote:
>>> Am 09.10.2013 23:47, schrieb Jan Kara:
>>>> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
>>>>> CC'ing mm folks.
>>>>> Please see below.
>>>>   Added Fenguang to CC since he is the author of this code.
>>>
>>> Thx, get_maintainer.pl didn't list him.
>>>
>>>>> Am 09.10.2013 19:26, schrieb Toralf FA?rster:
>>>>>> On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>>>>>>> On Sun, Oct 6, 2013 at 11:01 PM, Toralf FA?rster <toralf.foerster@gmx.de> wrote:
>>>>>>>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>>>>>>>>> asserts its strict positive?!?
>>>>>>>>>
>>>>>>>>> Can you please try the following instead of the BUG_ON():
>>>>>>>>>
>>>>>>>>> if (pause < 0) {
>>>>>>>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>>>>>>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>>>>>>>>         printk("pause = %ld\n", pause);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Gr{oetje,eeting}s,
>>>>>>>>>
>>>>>>>>>                         Geert
>>>>>>>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
>>>>>>>> As soon as the issue happens I do have a
>>>>>>>>
>>>>>>>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>>>>>>>>
>>>>>>>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
>>>>>>>> to run very basic commands like a crash dump, sysrq ond so on.
>>>>>>>
>>>>>>> You may get an idea of the magnitude of pages_dirtied by using a chain of
>>>>>>> BUG_ON()s, like:
>>>>>>>
>>>>>>> BUG_ON(pages_dirtied > 2000000000);
>>>>>>> BUG_ON(pages_dirtied > 1000000000);
>>>>>>> BUG_ON(pages_dirtied > 100000000);
>>>>>>> BUG_ON(pages_dirtied > 10000000);
>>>>>>> BUG_ON(pages_dirtied > 1000000);
>>>>>>>
>>>>>>> Probably 1 million is already too much for normal operation?
>>>>>>>
>>>>>> period = HZ * pages_dirtied / task_ratelimit;
>>>>>> 		BUG_ON(pages_dirtied > 2000000000);
>>>>>> 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
>>>>>
>>>>> Summary for mm people:
>>>>>
>>>>> Toralf runs trinty on UML/i386.
>>>>> After some time pages_dirtied becomes very large.
>>>>> More than 1000000000 pages in this case.
>>>>   Huh, this is really strange. pages_dirtied is passed into
>>>> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
>>>> over 10^9 can get there. After all that is over 4TB so I somewhat doubt the
>>>> task was ever able to dirty that much during its lifetime (but correct me
>>>> if I'm wrong here, with UML and memory backed disks it is not totally
>>>> impossible)... I went through the logic of handling ->nr_dirtied but
>>>> I didn't find any obvious problem there. Hum, maybe one thing - what
>>>> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
>>>> huge, we could possibly accumulate huge current->nr_dirtied.
>>>
>>> Toralf, you can try a snipplet like this one to get the values printed out:
>>> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>>> index f5236f8..a80e520 100644
>>> --- a/mm/page-writeback.c
>>> +++ b/mm/page-writeback.c
>>> @@ -1463,6 +1463,12 @@ static void balance_dirty_pages(struct address_space *mapping,
>>>                         goto pause;
>>>                 }
>>>                 period = HZ * pages_dirtied / task_ratelimit;
>>> +
>>> +               {
>>> +                       extern int printf(char *, ...);
>>> +                       printf("---> task_ratelimit: %lu\n", task_ratelimit);
>>> +               }
>>> +
>>>                 pause = period;
>>>                 if (current->dirty_paused_when)
>>>                         pause -= now - current->dirty_paused_when;
>>>
>>>
>>> Yes, printf(), not printk().
>>> Using this hack we print directly to host's stdout. :)
>>>
>> *head smack* ofc - works fine.
>> So given this diff :
>>
>> iff --git a/mm/page-writeback.c b/mm/page-writeback.c
>> index f5236f8..5a2c337 100644
>> --- a/mm/page-writeback.c
>> +++ b/mm/page-writeback.c
>> @@ -1464,6 +1464,13 @@ static void balance_dirty_pages(struct address_space *mapping,
>>                 }
>>                 period = HZ * pages_dirtied / task_ratelimit;
>>                 pause = period;
>> +               if (pause < 0)  {
>> +                       extern int printf(char *, ...);
>> +                       printf("overflow : pause : %li\n", pause);
>> +                       printf("overflow : pages_dirtied : %lu\n", pages_dirtied);
>> +                       printf("overflow :  task_ratelimit: %lu\n", task_ratelimit);
>> +                       BUG_ON(1);
>> +               }
>>                 if (current->dirty_paused_when)
>>                         pause -= now - current->dirty_paused_when;
>>                 /*
>> @@ -1503,6 +1510,13 @@ static void balance_dirty_pages(struct address_space *mapping,
>>                 }
>>
>>  pause:
>> +               if (pause < 0)  {
> 
> Oops, we got 
> 
>         ick : pause : -984
> 
> here! Since pause is bounded by [min_pause, max_pause], it means
> something goes wrong with the bounds. Would you help print min_pause
> and max_pause as well?
> 
> It seems there are "long <=> unsigned long" conversion problems in
> bdi_max_pause(), which might make max_pause a negative number. And the
> min_pause calculation is based on max_pause to some degree..
> 
> Thanks,
> Fengguang
> 
>> +                       extern int printf(char *, ...);
>> +                       printf("ick : pause : %li\n", pause);
>> +                       printf("ick: pages_dirtied : %lu\n", pages_dirtied);
>> +                       printf("ick: task_ratelimit: %lu\n", task_ratelimit);
>> +                       BUG_ON(1);
>> +               }
>>                 trace_balance_dirty_pages(bdi,
>>                                           dirty_thresh,
>>                                           background_thresh,
>>
>>
>> I got this :
>>
>>
>>
>>
>>  * Starting local
>> net.core.warnings = 0                                                                                                                            [ ok ]
>> ick : pause : -984
>>                   ick: pages_dirtied : 0
>>                                         ick: task_ratelimit: 0
>>                                                               Kernel panic - not syncing: BUG!
>> CPU: 0 PID: 1434 Comm: trinity-child2 Not tainted 3.12.0-rc4-00029-g0e7a3ed-dirty #12
>> 47397c84 47397cb0 0841b5a0 084c30e8 085f76e0 084b4745 47397cbc 00000000 
>>        fffffc28 01fff278 085cb4a0 47397d2c 0841c5a0 084b4745 084c5398 000005ee 
>>        08432cf0 43e47600 471757f8 47397cf0 ffffff0c 47397cdc 5256d8b9 3417ec18 47397c5c:  [<08060b2c>] show_stack+0x7c/0xd0
>> 47397c7c:  [<0841e34d>] dump_stack+0x26/0x28
>> 47397c8c:  [<0841b5a0>] panic+0x7a/0x180
>> 47397cb4:  [<0841c5a0>] balance_dirty_pages.isra.32+0x4e3/0x5ad
>> 47397d30:  [<080d3595>] balance_dirty_pages_ratelimited+0xf5/0x100
>> 47397d44:  [<080e4a3f>] __do_fault+0x3cf/0x440
>> 47397d9c:  [<080e6e0f>] handle_mm_fault+0xef/0x7c0
>> 47397dec:  [<080e7817>] __get_user_pages+0x227/0x420
>> 47397e24:  [<080e7ae3>] get_user_pages+0x63/0x70
>> 47397e4c:  [<08143dc6>] SyS_io_setup+0x3c6/0x760
>> 47397eb0:  [<08062984>] handle_syscall+0x64/0x80
>> 47397ef0:  [<08074fb5>] userspace+0x475/0x5f0
>> 47397fec:  [<0805f750>] fork_handler+0x60/0x70
>> 47397ffc:  [<00000000>] 0x0
>>
>>
>> EIP: 0073:[<40001282>] CPU: 0 Not tainted ESP: 007b:bfb348f8 EFLAGS: 00000246
>>     Not tainted
>> EAX: ffffffda EBX: 00001000 ECX: 080d0000 EDX: 80000048
>> ESI: 80fbff1f EDI: ffe02f77 EBP: 90f6e2a3 DS: 007b ES: 007b
>> 47397c0c:  [<0807947f>] show_regs+0x10f/0x120
>> 47397c28:  [<080623a9>] panic_exit+0x29/0x50
>> 47397c38:  [<0809ba86>] notifier_call_chain+0x36/0x60
>> 47397c60:  [<0809bba1>] __atomic_notifier_call_chain+0x21/0x30
>> 47397c70:  [<0809bbdf>] atomic_notifier_call_chain+0x2f/0x40
>> 47397c8c:  [<0841b5bc>] panic+0x96/0x180
>> 47397cb4:  [<0841c5a0>] balance_dirty_pages.isra.32+0x4e3/0x5ad
>> 47397d30:  [<080d3595>] balance_dirty_pages_ratelimited+0xf5/0x100
>> 47397d44:  [<080e4a3f>] __do_fault+0x3cf/0x440
>> 47397d9c:  [<080e6e0f>] handle_mm_fault+0xef/0x7c0
>> 47397dec:  [<080e7817>] __get_user_pages+0x227/0x420
>> 47397e24:  [<080e7ae3>] get_user_pages+0x63/0x70
>> 47397e4c:  [<08143dc6>] SyS_io_setup+0x3c6/0x760
>> 47397eb0:  [<08062984>] handle_syscall+0x64/0x80
>> 47397ef0:  [<08074fb5>] userspace+0x475/0x5f0
>> 47397fec:  [<0805f750>] fork_handler+0x60/0x70
>> 47397ffc:  [<00000000>] 0x0
>>
>> /home/tfoerste/workspace/bin/start_uml.sh: line 115: 18718 Aborted                 (core dumped) $LINUX earlyprintk ubda=$ROOTFS ubdb=$SWAP eth0=$NET mem=$MEM $TTY umid=uml_$NAME rootfstype=ext4 "$ARGS"
>>
>>
>> >From what I see there are 2 different types of issues - and this is an example of the other of both 
>>
>>> Thanks,
>>> //richard
>>>
>>
>>
>> -- 
>> MfG/Sincerely
>> Toralf FA?rster
>> pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3
> 


-- 
MfG/Sincerely
Toralf FA?rster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-11  8:42                                         ` Toralf Förster
  (?)
@ 2013-10-11  8:57                                           ` Fengguang Wu
  -1 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-11  8:57 UTC (permalink / raw)
  To: Toralf Förster
  Cc: Richard Weinberger, Jan Kara, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko

On Fri, Oct 11, 2013 at 10:42:19AM +0200, Toralf Förster wrote:
> yeah, now the picture becomes more clear
> ...
> net.core.warnings = 0                                                                         [ ok ]
> ick: pause : -717
>                  ick : min_pause : -177
>                                    ick : max_pause : -717
>                                                      ick: pages_dirtied : 14
>                                                                             ick: task_ratelimit: 0

Great and thanks! So it's the max pause calculation went wrong.
Would help you try the below patch?

From 5420b9bbe42dd0a366d7615e9f3d3724cee725c4 Mon Sep 17 00:00:00 2001
From: Fengguang Wu <fengguang.wu@intel.com>
Date: Fri, 11 Oct 2013 16:53:26 +0800
Subject: [PATCH] fix bdi max pause calculation

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
 mm/page-writeback.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3f0c895..241a746 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned long dirty,
 	return 1;
 }
 
-static long bdi_max_pause(struct backing_dev_info *bdi,
-			  unsigned long bdi_dirty)
+static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
+				   unsigned long bdi_dirty)
 {
-	long bw = bdi->avg_write_bandwidth;
-	long t;
+	unsigned long bw = bdi->avg_write_bandwidth;
+	unsigned long t;
 
 	/*
 	 * Limit pause time for small memory systems. If sleeping for too long
@@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
 	t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
 	t++;
 
-	return min_t(long, t, MAX_PAUSE);
+	return min_t(unsigned long, t, MAX_PAUSE);
 }
 
 static long bdi_min_pause(struct backing_dev_info *bdi,
-- 
1.7.10.4



^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-11  8:57                                           ` Fengguang Wu
  0 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-11  8:57 UTC (permalink / raw)
  To: Toralf Förster
  Cc: Richard Weinberger, Jan Kara, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko

On Fri, Oct 11, 2013 at 10:42:19AM +0200, Toralf Förster wrote:
> yeah, now the picture becomes more clear
> ...
> net.core.warnings = 0                                                                         [ ok ]
> ick: pause : -717
>                  ick : min_pause : -177
>                                    ick : max_pause : -717
>                                                      ick: pages_dirtied : 14
>                                                                             ick: task_ratelimit: 0

Great and thanks! So it's the max pause calculation went wrong.
Would help you try the below patch?

>From 5420b9bbe42dd0a366d7615e9f3d3724cee725c4 Mon Sep 17 00:00:00 2001
From: Fengguang Wu <fengguang.wu@intel.com>
Date: Fri, 11 Oct 2013 16:53:26 +0800
Subject: [PATCH] fix bdi max pause calculation

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
 mm/page-writeback.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3f0c895..241a746 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned long dirty,
 	return 1;
 }
 
-static long bdi_max_pause(struct backing_dev_info *bdi,
-			  unsigned long bdi_dirty)
+static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
+				   unsigned long bdi_dirty)
 {
-	long bw = bdi->avg_write_bandwidth;
-	long t;
+	unsigned long bw = bdi->avg_write_bandwidth;
+	unsigned long t;
 
 	/*
 	 * Limit pause time for small memory systems. If sleeping for too long
@@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
 	t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
 	t++;
 
-	return min_t(long, t, MAX_PAUSE);
+	return min_t(unsigned long, t, MAX_PAUSE);
 }
 
 static long bdi_min_pause(struct backing_dev_info *bdi,
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-11  8:57                                           ` Fengguang Wu
  0 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-11  8:57 UTC (permalink / raw)
  To: Toralf Förster
  Cc: Richard Weinberger, Jan Kara, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko

On Fri, Oct 11, 2013 at 10:42:19AM +0200, Toralf FA?rster wrote:
> yeah, now the picture becomes more clear
> ...
> net.core.warnings = 0                                                                         [ ok ]
> ick: pause : -717
>                  ick : min_pause : -177
>                                    ick : max_pause : -717
>                                                      ick: pages_dirtied : 14
>                                                                             ick: task_ratelimit: 0

Great and thanks! So it's the max pause calculation went wrong.
Would help you try the below patch?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-11  8:57                                           ` Fengguang Wu
@ 2013-10-11  9:05                                             ` Fengguang Wu
  -1 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-11  9:05 UTC (permalink / raw)
  To: Toralf Förster
  Cc: Richard Weinberger, Jan Kara, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko

On Fri, Oct 11, 2013 at 04:57:01PM +0800, Fengguang Wu wrote:
> On Fri, Oct 11, 2013 at 10:42:19AM +0200, Toralf Förster wrote:
> > yeah, now the picture becomes more clear
> > ...
> > net.core.warnings = 0                                                                         [ ok ]
> > ick: pause : -717
> >                  ick : min_pause : -177
> >                                    ick : max_pause : -717
> >                                                      ick: pages_dirtied : 14
> >                                                                             ick: task_ratelimit: 0
> 
> Great and thanks! So it's the max pause calculation went wrong.

However I still suspect this is not the main reason for the soft
lockup. Because schedule_timeout() will directly return on negative
timeout. So yes, we have encountered some negative pauses, however
we still need to fix the huge dirtied pages problem which should be
more fundamental. 

Thanks,
Fengguang


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-11  9:05                                             ` Fengguang Wu
  0 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-11  9:05 UTC (permalink / raw)
  To: Toralf Förster
  Cc: Richard Weinberger, Jan Kara, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko

On Fri, Oct 11, 2013 at 04:57:01PM +0800, Fengguang Wu wrote:
> On Fri, Oct 11, 2013 at 10:42:19AM +0200, Toralf FA?rster wrote:
> > yeah, now the picture becomes more clear
> > ...
> > net.core.warnings = 0                                                                         [ ok ]
> > ick: pause : -717
> >                  ick : min_pause : -177
> >                                    ick : max_pause : -717
> >                                                      ick: pages_dirtied : 14
> >                                                                             ick: task_ratelimit: 0
> 
> Great and thanks! So it's the max pause calculation went wrong.

However I still suspect this is not the main reason for the soft
lockup. Because schedule_timeout() will directly return on negative
timeout. So yes, we have encountered some negative pauses, however
we still need to fix the huge dirtied pages problem which should be
more fundamental. 

Thanks,
Fengguang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-11  8:57                                           ` Fengguang Wu
  (?)
@ 2013-10-11 14:12                                             ` Toralf Förster
  -1 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-11 14:12 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Richard Weinberger, Jan Kara, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko

On 10/11/2013 10:57 AM, Fengguang Wu wrote:
> On Fri, Oct 11, 2013 at 10:42:19AM +0200, Toralf Förster wrote:
>> yeah, now the picture becomes more clear
>> ...
>> net.core.warnings = 0                                                                         [ ok ]
>> ick: pause : -717
>>                  ick : min_pause : -177
>>                                    ick : max_pause : -717
>>                                                      ick: pages_dirtied : 14
>>                                                                             ick: task_ratelimit: 0
> 
> Great and thanks! So it's the max pause calculation went wrong.
> Would help you try the below patch?
> 
Definitely.
I'm running now the test case since 6 hours w/o any issues.
before that usually after 15 - 30 min the bug occurred.

>>>From 5420b9bbe42dd0a366d7615e9f3d3724cee725c4 Mon Sep 17 00:00:00 2001
> From: Fengguang Wu <fengguang.wu@intel.com>
> Date: Fri, 11 Oct 2013 16:53:26 +0800
> Subject: [PATCH] fix bdi max pause calculation
> 
> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
> ---
>  mm/page-writeback.c |   10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 3f0c895..241a746 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned long dirty,
>  	return 1;
>  }
>  
> -static long bdi_max_pause(struct backing_dev_info *bdi,
> -			  unsigned long bdi_dirty)
> +static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
> +				   unsigned long bdi_dirty)
>  {
> -	long bw = bdi->avg_write_bandwidth;
> -	long t;
> +	unsigned long bw = bdi->avg_write_bandwidth;
> +	unsigned long t;
>  
>  	/*
>  	 * Limit pause time for small memory systems. If sleeping for too long
> @@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
>  	t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
>  	t++;
>  
> -	return min_t(long, t, MAX_PAUSE);
> +	return min_t(unsigned long, t, MAX_PAUSE);
>  }
>  
>  static long bdi_min_pause(struct backing_dev_info *bdi,
> 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-11 14:12                                             ` Toralf Förster
  0 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-11 14:12 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Richard Weinberger, Jan Kara, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko

On 10/11/2013 10:57 AM, Fengguang Wu wrote:
> On Fri, Oct 11, 2013 at 10:42:19AM +0200, Toralf Förster wrote:
>> yeah, now the picture becomes more clear
>> ...
>> net.core.warnings = 0                                                                         [ ok ]
>> ick: pause : -717
>>                  ick : min_pause : -177
>>                                    ick : max_pause : -717
>>                                                      ick: pages_dirtied : 14
>>                                                                             ick: task_ratelimit: 0
> 
> Great and thanks! So it's the max pause calculation went wrong.
> Would help you try the below patch?
> 
Definitely.
I'm running now the test case since 6 hours w/o any issues.
before that usually after 15 - 30 min the bug occurred.

>>From 5420b9bbe42dd0a366d7615e9f3d3724cee725c4 Mon Sep 17 00:00:00 2001
> From: Fengguang Wu <fengguang.wu@intel.com>
> Date: Fri, 11 Oct 2013 16:53:26 +0800
> Subject: [PATCH] fix bdi max pause calculation
> 
> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
> ---
>  mm/page-writeback.c |   10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 3f0c895..241a746 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned long dirty,
>  	return 1;
>  }
>  
> -static long bdi_max_pause(struct backing_dev_info *bdi,
> -			  unsigned long bdi_dirty)
> +static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
> +				   unsigned long bdi_dirty)
>  {
> -	long bw = bdi->avg_write_bandwidth;
> -	long t;
> +	unsigned long bw = bdi->avg_write_bandwidth;
> +	unsigned long t;
>  
>  	/*
>  	 * Limit pause time for small memory systems. If sleeping for too long
> @@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
>  	t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
>  	t++;
>  
> -	return min_t(long, t, MAX_PAUSE);
> +	return min_t(unsigned long, t, MAX_PAUSE);
>  }
>  
>  static long bdi_min_pause(struct backing_dev_info *bdi,
> 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-11 14:12                                             ` Toralf Förster
  0 siblings, 0 replies; 64+ messages in thread
From: Toralf Förster @ 2013-10-11 14:12 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Richard Weinberger, Jan Kara, Geert Uytterhoeven, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko

On 10/11/2013 10:57 AM, Fengguang Wu wrote:
> On Fri, Oct 11, 2013 at 10:42:19AM +0200, Toralf FA?rster wrote:
>> yeah, now the picture becomes more clear
>> ...
>> net.core.warnings = 0                                                                         [ ok ]
>> ick: pause : -717
>>                  ick : min_pause : -177
>>                                    ick : max_pause : -717
>>                                                      ick: pages_dirtied : 14
>>                                                                             ick: task_ratelimit: 0
> 
> Great and thanks! So it's the max pause calculation went wrong.
> Would help you try the below patch?
> 
Definitely.
I'm running now the test case since 6 hours w/o any issues.
before that usually after 15 - 30 min the bug occurred.

>>From 5420b9bbe42dd0a366d7615e9f3d3724cee725c4 Mon Sep 17 00:00:00 2001
> From: Fengguang Wu <fengguang.wu@intel.com>
> Date: Fri, 11 Oct 2013 16:53:26 +0800
> Subject: [PATCH] fix bdi max pause calculation
> 
> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
> ---
>  mm/page-writeback.c |   10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 3f0c895..241a746 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned long dirty,
>  	return 1;
>  }
>  
> -static long bdi_max_pause(struct backing_dev_info *bdi,
> -			  unsigned long bdi_dirty)
> +static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
> +				   unsigned long bdi_dirty)
>  {
> -	long bw = bdi->avg_write_bandwidth;
> -	long t;
> +	unsigned long bw = bdi->avg_write_bandwidth;
> +	unsigned long t;
>  
>  	/*
>  	 * Limit pause time for small memory systems. If sleeping for too long
> @@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
>  	t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
>  	t++;
>  
> -	return min_t(long, t, MAX_PAUSE);
> +	return min_t(unsigned long, t, MAX_PAUSE);
>  }
>  
>  static long bdi_min_pause(struct backing_dev_info *bdi,
> 


-- 
MfG/Sincerely
Toralf FA?rster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] writeback: fix negative bdi max pause
  2013-10-11 14:12                                             ` Toralf Förster
@ 2013-10-12  0:43                                               ` Fengguang Wu
  -1 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-12  0:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Toralf Förster, Richard Weinberger, Jan Kara,
	Geert Uytterhoeven, UML devel, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, hannes, darrick.wong, Michal Hocko

Toralf runs trinity on UML/i386.
After some time it hangs and the last message line is

	BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]

It's found that pages_dirtied becomes very large.
More than 1000000000 pages in this case:

	period = HZ * pages_dirtied / task_ratelimit;
	BUG_ON(pages_dirtied > 2000000000);
	BUG_ON(pages_dirtied > 1000000000);      <---------

UML debug printf shows that we got negative pause here:

	ick: pause : -984
	ick: pages_dirtied : 0
	ick: task_ratelimit: 0

	 pause:
	+       if (pause < 0)  {
	+               extern int printf(char *, ...);
	+               printf("ick : pause : %li\n", pause);
	+               printf("ick: pages_dirtied : %lu\n", pages_dirtied);
	+               printf("ick: task_ratelimit: %lu\n", task_ratelimit);
	+               BUG_ON(1);
	+       }
	        trace_balance_dirty_pages(bdi,

Since pause is bounded by [min_pause, max_pause] where min_pause is also
bounded by max_pause. It's suspected and demonstrated that the max_pause
calculation goes wrong:

	ick: pause : -717
	ick: min_pause : -177
	ick: max_pause : -717
	ick: pages_dirtied : 14
	ick: task_ratelimit: 0

The problem lies in the two "long = unsigned long" assignments in
bdi_max_pause() which might go negative if the highest bit is 1, and
the min_t(long, ...) check failed to protect it falling under 0. Fix
all of them by using "unsigned long" throughout the function.

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Tested-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
 mm/page-writeback.c |   10 +++++-----
 mm/readahead.c      |    2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3f0c895..241a746 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned long dirty,
 	return 1;
 }
 
-static long bdi_max_pause(struct backing_dev_info *bdi,
-			  unsigned long bdi_dirty)
+static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
+				   unsigned long bdi_dirty)
 {
-	long bw = bdi->avg_write_bandwidth;
-	long t;
+	unsigned long bw = bdi->avg_write_bandwidth;
+	unsigned long t;
 
 	/*
 	 * Limit pause time for small memory systems. If sleeping for too long
@@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
 	t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
 	t++;
 
-	return min_t(long, t, MAX_PAUSE);
+	return min_t(unsigned long, t, MAX_PAUSE);
 }
 
 static long bdi_min_pause(struct backing_dev_info *bdi,


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH] writeback: fix negative bdi max pause
@ 2013-10-12  0:43                                               ` Fengguang Wu
  0 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-12  0:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Toralf Förster, Richard Weinberger, Jan Kara,
	Geert Uytterhoeven, UML devel, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, hannes, darrick.wong, Michal Hocko

Toralf runs trinity on UML/i386.
After some time it hangs and the last message line is

	BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]

It's found that pages_dirtied becomes very large.
More than 1000000000 pages in this case:

	period = HZ * pages_dirtied / task_ratelimit;
	BUG_ON(pages_dirtied > 2000000000);
	BUG_ON(pages_dirtied > 1000000000);      <---------

UML debug printf shows that we got negative pause here:

	ick: pause : -984
	ick: pages_dirtied : 0
	ick: task_ratelimit: 0

	 pause:
	+       if (pause < 0)  {
	+               extern int printf(char *, ...);
	+               printf("ick : pause : %li\n", pause);
	+               printf("ick: pages_dirtied : %lu\n", pages_dirtied);
	+               printf("ick: task_ratelimit: %lu\n", task_ratelimit);
	+               BUG_ON(1);
	+       }
	        trace_balance_dirty_pages(bdi,

Since pause is bounded by [min_pause, max_pause] where min_pause is also
bounded by max_pause. It's suspected and demonstrated that the max_pause
calculation goes wrong:

	ick: pause : -717
	ick: min_pause : -177
	ick: max_pause : -717
	ick: pages_dirtied : 14
	ick: task_ratelimit: 0

The problem lies in the two "long = unsigned long" assignments in
bdi_max_pause() which might go negative if the highest bit is 1, and
the min_t(long, ...) check failed to protect it falling under 0. Fix
all of them by using "unsigned long" throughout the function.

Reported-by: Toralf FA?rster <toralf.foerster@gmx.de>
Tested-by: Toralf FA?rster <toralf.foerster@gmx.de>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
 mm/page-writeback.c |   10 +++++-----
 mm/readahead.c      |    2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3f0c895..241a746 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned long dirty,
 	return 1;
 }
 
-static long bdi_max_pause(struct backing_dev_info *bdi,
-			  unsigned long bdi_dirty)
+static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
+				   unsigned long bdi_dirty)
 {
-	long bw = bdi->avg_write_bandwidth;
-	long t;
+	unsigned long bw = bdi->avg_write_bandwidth;
+	unsigned long t;
 
 	/*
 	 * Limit pause time for small memory systems. If sleeping for too long
@@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
 	t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
 	t++;
 
-	return min_t(long, t, MAX_PAUSE);
+	return min_t(unsigned long, t, MAX_PAUSE);
 }
 
 static long bdi_min_pause(struct backing_dev_info *bdi,

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v2] writeback: fix negative bdi max pause
  2013-10-11 14:12                                             ` Toralf Förster
@ 2013-10-12  4:45                                               ` Fengguang Wu
  -1 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-12  4:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Toralf Förster, Richard Weinberger, Jan Kara,
	Geert Uytterhoeven, UML devel, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, hannes, darrick.wong, Michal Hocko

Toralf runs trinity on UML/i386.
After some time it hangs and the last message line is

	BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]

It's found that pages_dirtied becomes very large.
More than 1000000000 pages in this case:

	period = HZ * pages_dirtied / task_ratelimit;
	BUG_ON(pages_dirtied > 2000000000);
	BUG_ON(pages_dirtied > 1000000000);      <---------

UML debug printf shows that we got negative pause here:

	ick: pause : -984
	ick: pages_dirtied : 0
	ick: task_ratelimit: 0

	 pause:
	+       if (pause < 0)  {
	+               extern int printf(char *, ...);
	+               printf("ick : pause : %li\n", pause);
	+               printf("ick: pages_dirtied : %lu\n", pages_dirtied);
	+               printf("ick: task_ratelimit: %lu\n", task_ratelimit);
	+               BUG_ON(1);
	+       }
	        trace_balance_dirty_pages(bdi,

Since pause is bounded by [min_pause, max_pause] where min_pause is also
bounded by max_pause. It's suspected and demonstrated that the max_pause
calculation goes wrong:

	ick: pause : -717
	ick: min_pause : -177
	ick: max_pause : -717
	ick: pages_dirtied : 14
	ick: task_ratelimit: 0

The problem lies in the two "long = unsigned long" assignments in
bdi_max_pause() which might go negative if the highest bit is 1, and
the min_t(long, ...) check failed to protect it falling under 0. Fix
all of them by using "unsigned long" throughout the function.

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Tested-by: Toralf Förster <toralf.foerster@gmx.de>
Cc: <stable@vger.kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Richard Weinberger <richard@nod.at>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
 mm/page-writeback.c |   10 +++++-----
 mm/readahead.c      |    2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

 Changes since v1: Add CC list.

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3f0c895..241a746 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned long dirty,
 	return 1;
 }
 
-static long bdi_max_pause(struct backing_dev_info *bdi,
-			  unsigned long bdi_dirty)
+static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
+				   unsigned long bdi_dirty)
 {
-	long bw = bdi->avg_write_bandwidth;
-	long t;
+	unsigned long bw = bdi->avg_write_bandwidth;
+	unsigned long t;
 
 	/*
 	 * Limit pause time for small memory systems. If sleeping for too long
@@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
 	t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
 	t++;
 
-	return min_t(long, t, MAX_PAUSE);
+	return min_t(unsigned long, t, MAX_PAUSE);
 }
 
 static long bdi_min_pause(struct backing_dev_info *bdi,


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v2] writeback: fix negative bdi max pause
@ 2013-10-12  4:45                                               ` Fengguang Wu
  0 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-12  4:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Toralf Förster, Richard Weinberger, Jan Kara,
	Geert Uytterhoeven, UML devel, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, hannes, darrick.wong, Michal Hocko

Toralf runs trinity on UML/i386.
After some time it hangs and the last message line is

	BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]

It's found that pages_dirtied becomes very large.
More than 1000000000 pages in this case:

	period = HZ * pages_dirtied / task_ratelimit;
	BUG_ON(pages_dirtied > 2000000000);
	BUG_ON(pages_dirtied > 1000000000);      <---------

UML debug printf shows that we got negative pause here:

	ick: pause : -984
	ick: pages_dirtied : 0
	ick: task_ratelimit: 0

	 pause:
	+       if (pause < 0)  {
	+               extern int printf(char *, ...);
	+               printf("ick : pause : %li\n", pause);
	+               printf("ick: pages_dirtied : %lu\n", pages_dirtied);
	+               printf("ick: task_ratelimit: %lu\n", task_ratelimit);
	+               BUG_ON(1);
	+       }
	        trace_balance_dirty_pages(bdi,

Since pause is bounded by [min_pause, max_pause] where min_pause is also
bounded by max_pause. It's suspected and demonstrated that the max_pause
calculation goes wrong:

	ick: pause : -717
	ick: min_pause : -177
	ick: max_pause : -717
	ick: pages_dirtied : 14
	ick: task_ratelimit: 0

The problem lies in the two "long = unsigned long" assignments in
bdi_max_pause() which might go negative if the highest bit is 1, and
the min_t(long, ...) check failed to protect it falling under 0. Fix
all of them by using "unsigned long" throughout the function.

Reported-by: Toralf FA?rster <toralf.foerster@gmx.de>
Tested-by: Toralf FA?rster <toralf.foerster@gmx.de>
Cc: <stable@vger.kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Richard Weinberger <richard@nod.at>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
 mm/page-writeback.c |   10 +++++-----
 mm/readahead.c      |    2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

 Changes since v1: Add CC list.

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3f0c895..241a746 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned long dirty,
 	return 1;
 }
 
-static long bdi_max_pause(struct backing_dev_info *bdi,
-			  unsigned long bdi_dirty)
+static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
+				   unsigned long bdi_dirty)
 {
-	long bw = bdi->avg_write_bandwidth;
-	long t;
+	unsigned long bw = bdi->avg_write_bandwidth;
+	unsigned long t;
 
 	/*
 	 * Limit pause time for small memory systems. If sleeping for too long
@@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
 	t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
 	t++;
 
-	return min_t(long, t, MAX_PAUSE);
+	return min_t(unsigned long, t, MAX_PAUSE);
 }
 
 static long bdi_min_pause(struct backing_dev_info *bdi,

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH v2] writeback: fix negative bdi max pause
  2013-10-12  4:45                                               ` Fengguang Wu
@ 2013-10-14 12:34                                                 ` Jan Kara
  -1 siblings, 0 replies; 64+ messages in thread
From: Jan Kara @ 2013-10-14 12:34 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Andrew Morton, Toralf Förster, Richard Weinberger, Jan Kara,
	Geert Uytterhoeven, UML devel, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, hannes, darrick.wong, Michal Hocko

On Sat 12-10-13 12:45:17, Wu Fengguang wrote:
> Toralf runs trinity on UML/i386.
> After some time it hangs and the last message line is
> 
> 	BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
> 
> It's found that pages_dirtied becomes very large.
> More than 1000000000 pages in this case:
> 
> 	period = HZ * pages_dirtied / task_ratelimit;
> 	BUG_ON(pages_dirtied > 2000000000);
> 	BUG_ON(pages_dirtied > 1000000000);      <---------
> 
> UML debug printf shows that we got negative pause here:
> 
> 	ick: pause : -984
> 	ick: pages_dirtied : 0
> 	ick: task_ratelimit: 0
> 
> 	 pause:
> 	+       if (pause < 0)  {
> 	+               extern int printf(char *, ...);
> 	+               printf("ick : pause : %li\n", pause);
> 	+               printf("ick: pages_dirtied : %lu\n", pages_dirtied);
> 	+               printf("ick: task_ratelimit: %lu\n", task_ratelimit);
> 	+               BUG_ON(1);
> 	+       }
> 	        trace_balance_dirty_pages(bdi,
> 
> Since pause is bounded by [min_pause, max_pause] where min_pause is also
> bounded by max_pause. It's suspected and demonstrated that the max_pause
> calculation goes wrong:
> 
> 	ick: pause : -717
> 	ick: min_pause : -177
> 	ick: max_pause : -717
> 	ick: pages_dirtied : 14
> 	ick: task_ratelimit: 0
> 
> The problem lies in the two "long = unsigned long" assignments in
> bdi_max_pause() which might go negative if the highest bit is 1, and
> the min_t(long, ...) check failed to protect it falling under 0. Fix
> all of them by using "unsigned long" throughout the function.
> 
> Reported-by: Toralf Förster <toralf.foerster@gmx.de>
> Tested-by: Toralf Förster <toralf.foerster@gmx.de>
> Cc: <stable@vger.kernel.org>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Richard Weinberger <richard@nod.at>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
  The patch looks good. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  mm/page-writeback.c |   10 +++++-----
>  mm/readahead.c      |    2 +-
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
>  Changes since v1: Add CC list.
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 3f0c895..241a746 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned long dirty,
>  	return 1;
>  }
>  
> -static long bdi_max_pause(struct backing_dev_info *bdi,
> -			  unsigned long bdi_dirty)
> +static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
> +				   unsigned long bdi_dirty)
>  {
> -	long bw = bdi->avg_write_bandwidth;
> -	long t;
> +	unsigned long bw = bdi->avg_write_bandwidth;
> +	unsigned long t;
>  
>  	/*
>  	 * Limit pause time for small memory systems. If sleeping for too long
> @@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
>  	t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
>  	t++;
>  
> -	return min_t(long, t, MAX_PAUSE);
> +	return min_t(unsigned long, t, MAX_PAUSE);
>  }
>  
>  static long bdi_min_pause(struct backing_dev_info *bdi,
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v2] writeback: fix negative bdi max pause
@ 2013-10-14 12:34                                                 ` Jan Kara
  0 siblings, 0 replies; 64+ messages in thread
From: Jan Kara @ 2013-10-14 12:34 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Andrew Morton, Toralf Förster, Richard Weinberger, Jan Kara,
	Geert Uytterhoeven, UML devel, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, hannes, darrick.wong, Michal Hocko

On Sat 12-10-13 12:45:17, Wu Fengguang wrote:
> Toralf runs trinity on UML/i386.
> After some time it hangs and the last message line is
> 
> 	BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
> 
> It's found that pages_dirtied becomes very large.
> More than 1000000000 pages in this case:
> 
> 	period = HZ * pages_dirtied / task_ratelimit;
> 	BUG_ON(pages_dirtied > 2000000000);
> 	BUG_ON(pages_dirtied > 1000000000);      <---------
> 
> UML debug printf shows that we got negative pause here:
> 
> 	ick: pause : -984
> 	ick: pages_dirtied : 0
> 	ick: task_ratelimit: 0
> 
> 	 pause:
> 	+       if (pause < 0)  {
> 	+               extern int printf(char *, ...);
> 	+               printf("ick : pause : %li\n", pause);
> 	+               printf("ick: pages_dirtied : %lu\n", pages_dirtied);
> 	+               printf("ick: task_ratelimit: %lu\n", task_ratelimit);
> 	+               BUG_ON(1);
> 	+       }
> 	        trace_balance_dirty_pages(bdi,
> 
> Since pause is bounded by [min_pause, max_pause] where min_pause is also
> bounded by max_pause. It's suspected and demonstrated that the max_pause
> calculation goes wrong:
> 
> 	ick: pause : -717
> 	ick: min_pause : -177
> 	ick: max_pause : -717
> 	ick: pages_dirtied : 14
> 	ick: task_ratelimit: 0
> 
> The problem lies in the two "long = unsigned long" assignments in
> bdi_max_pause() which might go negative if the highest bit is 1, and
> the min_t(long, ...) check failed to protect it falling under 0. Fix
> all of them by using "unsigned long" throughout the function.
> 
> Reported-by: Toralf Forster <toralf.foerster@gmx.de>
> Tested-by: Toralf Forster <toralf.foerster@gmx.de>
> Cc: <stable@vger.kernel.org>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Richard Weinberger <richard@nod.at>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
  The patch looks good. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  mm/page-writeback.c |   10 +++++-----
>  mm/readahead.c      |    2 +-
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
>  Changes since v1: Add CC list.
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 3f0c895..241a746 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1104,11 +1104,11 @@ static unsigned long dirty_poll_interval(unsigned long dirty,
>  	return 1;
>  }
>  
> -static long bdi_max_pause(struct backing_dev_info *bdi,
> -			  unsigned long bdi_dirty)
> +static unsigned long bdi_max_pause(struct backing_dev_info *bdi,
> +				   unsigned long bdi_dirty)
>  {
> -	long bw = bdi->avg_write_bandwidth;
> -	long t;
> +	unsigned long bw = bdi->avg_write_bandwidth;
> +	unsigned long t;
>  
>  	/*
>  	 * Limit pause time for small memory systems. If sleeping for too long
> @@ -1120,7 +1120,7 @@ static long bdi_max_pause(struct backing_dev_info *bdi,
>  	t = bdi_dirty / (1 + bw / roundup_pow_of_two(1 + HZ / 8));
>  	t++;
>  
> -	return min_t(long, t, MAX_PAUSE);
> +	return min_t(unsigned long, t, MAX_PAUSE);
>  }
>  
>  static long bdi_min_pause(struct backing_dev_info *bdi,
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-09 21:47                                 ` Jan Kara
@ 2013-10-10  2:46                                   ` Fengguang Wu
  -1 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-10  2:46 UTC (permalink / raw)
  To: Jan Kara
  Cc: Richard Weinberger, Toralf Förster, Geert Uytterhoeven,
	UML devel, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko,
	Gu Zheng, Benjamin LaHaise

On Wed, Oct 09, 2013 at 11:47:33PM +0200, Jan Kara wrote:
> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
> > CC'ing mm folks.
> > Please see below.
>   Added Fenguang to CC since he is the author of this code.

Thanks!

> > Am 09.10.2013 19:26, schrieb Toralf Förster:
> > > On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
> > >> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
> > >>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
> > >>>> asserts its strict positive?!?
> > >>>>
> > >>>> Can you please try the following instead of the BUG_ON():
> > >>>>
> > >>>> if (pause < 0) {
> > >>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
> > >>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
> > >>>>         printk("pause = %ld\n", pause);
> > >>>> }
> > >>>>
> > >>>> Gr{oetje,eeting}s,
> > >>>>
> > >>>>                         Geert
> > >>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
> > >>> As soon as the issue happens I do have a
> > >>>
> > >>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
> > >>>
> > >>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
> > >>> to run very basic commands like a crash dump, sysrq ond so on.
> > >>
> > >> You may get an idea of the magnitude of pages_dirtied by using a chain of
> > >> BUG_ON()s, like:
> > >>
> > >> BUG_ON(pages_dirtied > 2000000000);
> > >> BUG_ON(pages_dirtied > 1000000000);
> > >> BUG_ON(pages_dirtied > 100000000);
> > >> BUG_ON(pages_dirtied > 10000000);
> > >> BUG_ON(pages_dirtied > 1000000);
> > >>
> > >> Probably 1 million is already too much for normal operation?
> > >>
> > > period = HZ * pages_dirtied / task_ratelimit;
> > > 		BUG_ON(pages_dirtied > 2000000000);
> > > 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
> > 
> > Summary for mm people:
> > 
> > Toralf runs trinty on UML/i386.
> > After some time pages_dirtied becomes very large.
> > More than 1000000000 pages in this case.
>   Huh, this is really strange. pages_dirtied is passed into
> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
> over 10^9 can get there.

I noticed aio_setup_ring() in the call trace and find it recently
added a SetPageDirty() call in a loop by commit 36bc08cc01 ("fs/aio:
Add support to aio ring pages migration"). So added CC to its authors.

> After all that is over 4TB so I somewhat doubt the
> task was ever able to dirty that much during its lifetime (but correct me
> if I'm wrong here, with UML and memory backed disks it is not totally
> impossible)... I went through the logic of handling ->nr_dirtied but
> I didn't find any obvious problem there. Hum, maybe one thing - what
> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
> huge, we could possibly accumulate huge current->nr_dirtied.
> 
> > Thus, period = HZ * pages_dirtied / task_ratelimit overflows
> > and period/pause becomes extremely large.

Yes, that's possible.

> > It looks like io_schedule_timeout() get's called with a very large timeout.
> > I don't know why "if (unlikely(pause > max_pause)) {" does not help.

The test will sure work and limit pause to <= max_pause. However it's
very possible balance_dirty_pages() cannot break out of the loop (or
being called repeatedly) and block the task.

I'm afraid there are no one to clear the dirty pages, which makes
balance_dirty_pages() waiting for ever.

Thanks,
Fengguang

> > 
> > > the back trace is :
> > > 
> > > tfoerste@n22 ~/devel/linux $ gdb --core=/mnt/ramdisk/core /home/tfoerste/devel/linux/linux -batch -ex bt
> > > [New LWP 6911]
> > > Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
> > > Program terminated with signal 6, Aborted.
> > > #0  0xb77a7424 in __kernel_vsyscall ()
> > > #0  0xb77a7424 in __kernel_vsyscall ()
> > > #1  0x083bdf35 in kill ()
> > > #2  0x0807296d in uml_abort () at arch/um/os-Linux/util.c:93
> > > #3  0x08072ca5 in os_dump_core () at arch/um/os-Linux/util.c:148
> > > #4  0x080623c4 in panic_exit (self=0x85c1558 <panic_exit_notifier>, unused1=0, unused2=0x85f76e0 <buf.16221>) at arch/um/kernel/um_arch.c:240
> > > #5  0x0809ba86 in notifier_call_chain (nl=0x0, val=0, v=0x85f76e0 <buf.16221>, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
> > > #6  0x0809bba1 in __atomic_notifier_call_chain (nh=0x85f76c4 <panic_notifier_list>, val=0, v=0x85f76e0 <buf.16221>, nr_to_call=0, nr_calls=0x0) at kernel/notifier.c:182
> > > #7  0x0809bbdf in atomic_notifier_call_chain (nh=0x0, val=0, v=0x0) at kernel/notifier.c:191
> > > #8  0x0841b5bc in panic (fmt=0x0) at kernel/panic.c:130
> > > #9  0x0841c470 in balance_dirty_pages (pages_dirtied=23, mapping=<optimized out>) at mm/page-writeback.c:1467
> > > #10 0x080d3595 in balance_dirty_pages_ratelimited (mapping=0x6) at mm/page-writeback.c:1661
> > > #11 0x080e4a3f in __do_fault (mm=0x45ba3600, vma=0x48777b90, address=1084678144, pmd=0x0, pgoff=0, flags=0, orig_pte=<incomplete type>) at mm/memory.c:3452
> > > #12 0x080e6e0f in do_linear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3486
> > > #13 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, pte=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3710
> > > #14 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
> > > #15 handle_mm_fault (mm=0x45ba3600, vma=0x487034c8, address=1084678144, flags=1) at mm/memory.c:3868
> > > #16 0x080e7817 in __get_user_pages (tsk=0x48705800, mm=0x45ba3600, start=1084678144, nr_pages=1025, gup_flags=519, pages=0x48558000, vmas=0x0, nonblocking=0x0) at mm/memory.c:1822
> > > #17 0x080e7ae3 in get_user_pages (tsk=0x0, mm=0x0, start=0, nr_pages=0, write=1, force=0, pages=0x48777b90, vmas=0x6) at mm/memory.c:2019
> > > #18 0x08143dc6 in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:340
> > > #19 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:605
> > > #20 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at fs/aio.c:1122
> > > #21 SyS_io_setup (nr_events=65535, ctxp=135081984) at fs/aio.c:1105
> > > #22 0x08062984 in handle_syscall (r=0x487059d4) at arch/um/kernel/skas/syscall.c:35
> > > #23 0x08074fb5 in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
> > > #24 userspace (regs=0x487059d4) at arch/um/os-Linux/skas/process.c:431
> > > #25 0x0805f750 in fork_handler () at arch/um/kernel/process.c:160
> > > #26 0x00000000 in ?? ()
> 
> 								Honza
> -- 
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-10  2:46                                   ` Fengguang Wu
  0 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-10  2:46 UTC (permalink / raw)
  To: Jan Kara
  Cc: Richard Weinberger, Toralf Förster, Geert Uytterhoeven,
	UML devel, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko,
	Gu Zheng, Benjamin LaHaise

On Wed, Oct 09, 2013 at 11:47:33PM +0200, Jan Kara wrote:
> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
> > CC'ing mm folks.
> > Please see below.
>   Added Fenguang to CC since he is the author of this code.

Thanks!

> > Am 09.10.2013 19:26, schrieb Toralf FA?rster:
> > > On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
> > >> On Sun, Oct 6, 2013 at 11:01 PM, Toralf FA?rster <toralf.foerster@gmx.de> wrote:
> > >>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
> > >>>> asserts its strict positive?!?
> > >>>>
> > >>>> Can you please try the following instead of the BUG_ON():
> > >>>>
> > >>>> if (pause < 0) {
> > >>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
> > >>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
> > >>>>         printk("pause = %ld\n", pause);
> > >>>> }
> > >>>>
> > >>>> Gr{oetje,eeting}s,
> > >>>>
> > >>>>                         Geert
> > >>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
> > >>> As soon as the issue happens I do have a
> > >>>
> > >>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
> > >>>
> > >>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
> > >>> to run very basic commands like a crash dump, sysrq ond so on.
> > >>
> > >> You may get an idea of the magnitude of pages_dirtied by using a chain of
> > >> BUG_ON()s, like:
> > >>
> > >> BUG_ON(pages_dirtied > 2000000000);
> > >> BUG_ON(pages_dirtied > 1000000000);
> > >> BUG_ON(pages_dirtied > 100000000);
> > >> BUG_ON(pages_dirtied > 10000000);
> > >> BUG_ON(pages_dirtied > 1000000);
> > >>
> > >> Probably 1 million is already too much for normal operation?
> > >>
> > > period = HZ * pages_dirtied / task_ratelimit;
> > > 		BUG_ON(pages_dirtied > 2000000000);
> > > 		BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
> > 
> > Summary for mm people:
> > 
> > Toralf runs trinty on UML/i386.
> > After some time pages_dirtied becomes very large.
> > More than 1000000000 pages in this case.
>   Huh, this is really strange. pages_dirtied is passed into
> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
> over 10^9 can get there.

I noticed aio_setup_ring() in the call trace and find it recently
added a SetPageDirty() call in a loop by commit 36bc08cc01 ("fs/aio:
Add support to aio ring pages migration"). So added CC to its authors.

> After all that is over 4TB so I somewhat doubt the
> task was ever able to dirty that much during its lifetime (but correct me
> if I'm wrong here, with UML and memory backed disks it is not totally
> impossible)... I went through the logic of handling ->nr_dirtied but
> I didn't find any obvious problem there. Hum, maybe one thing - what
> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
> huge, we could possibly accumulate huge current->nr_dirtied.
> 
> > Thus, period = HZ * pages_dirtied / task_ratelimit overflows
> > and period/pause becomes extremely large.

Yes, that's possible.

> > It looks like io_schedule_timeout() get's called with a very large timeout.
> > I don't know why "if (unlikely(pause > max_pause)) {" does not help.

The test will sure work and limit pause to <= max_pause. However it's
very possible balance_dirty_pages() cannot break out of the loop (or
being called repeatedly) and block the task.

I'm afraid there are no one to clear the dirty pages, which makes
balance_dirty_pages() waiting for ever.

Thanks,
Fengguang

> > 
> > > the back trace is :
> > > 
> > > tfoerste@n22 ~/devel/linux $ gdb --core=/mnt/ramdisk/core /home/tfoerste/devel/linux/linux -batch -ex bt
> > > [New LWP 6911]
> > > Core was generated by `/home/tfoerste/devel/linux/linux earlyprintk ubda=/home/tfoerste/virtual/uml/tr'.
> > > Program terminated with signal 6, Aborted.
> > > #0  0xb77a7424 in __kernel_vsyscall ()
> > > #0  0xb77a7424 in __kernel_vsyscall ()
> > > #1  0x083bdf35 in kill ()
> > > #2  0x0807296d in uml_abort () at arch/um/os-Linux/util.c:93
> > > #3  0x08072ca5 in os_dump_core () at arch/um/os-Linux/util.c:148
> > > #4  0x080623c4 in panic_exit (self=0x85c1558 <panic_exit_notifier>, unused1=0, unused2=0x85f76e0 <buf.16221>) at arch/um/kernel/um_arch.c:240
> > > #5  0x0809ba86 in notifier_call_chain (nl=0x0, val=0, v=0x85f76e0 <buf.16221>, nr_to_call=-2, nr_calls=0x0) at kernel/notifier.c:93
> > > #6  0x0809bba1 in __atomic_notifier_call_chain (nh=0x85f76c4 <panic_notifier_list>, val=0, v=0x85f76e0 <buf.16221>, nr_to_call=0, nr_calls=0x0) at kernel/notifier.c:182
> > > #7  0x0809bbdf in atomic_notifier_call_chain (nh=0x0, val=0, v=0x0) at kernel/notifier.c:191
> > > #8  0x0841b5bc in panic (fmt=0x0) at kernel/panic.c:130
> > > #9  0x0841c470 in balance_dirty_pages (pages_dirtied=23, mapping=<optimized out>) at mm/page-writeback.c:1467
> > > #10 0x080d3595 in balance_dirty_pages_ratelimited (mapping=0x6) at mm/page-writeback.c:1661
> > > #11 0x080e4a3f in __do_fault (mm=0x45ba3600, vma=0x48777b90, address=1084678144, pmd=0x0, pgoff=0, flags=0, orig_pte=<incomplete type>) at mm/memory.c:3452
> > > #12 0x080e6e0f in do_linear_fault (orig_pte=..., flags=<optimized out>, pmd=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>, page_table=<optimized out>) at mm/memory.c:3486
> > > #13 handle_pte_fault (flags=<optimized out>, pmd=<optimized out>, pte=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3710
> > > #14 __handle_mm_fault (flags=<optimized out>, address=<optimized out>, vma=<optimized out>, mm=<optimized out>) at mm/memory.c:3845
> > > #15 handle_mm_fault (mm=0x45ba3600, vma=0x487034c8, address=1084678144, flags=1) at mm/memory.c:3868
> > > #16 0x080e7817 in __get_user_pages (tsk=0x48705800, mm=0x45ba3600, start=1084678144, nr_pages=1025, gup_flags=519, pages=0x48558000, vmas=0x0, nonblocking=0x0) at mm/memory.c:1822
> > > #17 0x080e7ae3 in get_user_pages (tsk=0x0, mm=0x0, start=0, nr_pages=0, write=1, force=0, pages=0x48777b90, vmas=0x6) at mm/memory.c:2019
> > > #18 0x08143dc6 in aio_setup_ring (ctx=<optimized out>) at fs/aio.c:340
> > > #19 ioctx_alloc (nr_events=<optimized out>) at fs/aio.c:605
> > > #20 SYSC_io_setup (ctxp=<optimized out>, nr_events=<optimized out>) at fs/aio.c:1122
> > > #21 SyS_io_setup (nr_events=65535, ctxp=135081984) at fs/aio.c:1105
> > > #22 0x08062984 in handle_syscall (r=0x487059d4) at arch/um/kernel/skas/syscall.c:35
> > > #23 0x08074fb5 in handle_trap (local_using_sysemu=<optimized out>, regs=<optimized out>, pid=<optimized out>) at arch/um/os-Linux/skas/process.c:198
> > > #24 userspace (regs=0x487059d4) at arch/um/os-Linux/skas/process.c:431
> > > #25 0x0805f750 in fork_handler () at arch/um/kernel/process.c:160
> > > #26 0x00000000 in ?? ()
> 
> 								Honza
> -- 
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-10  2:46                                   ` Fengguang Wu
@ 2013-10-10  6:52                                     ` Geert Uytterhoeven
  -1 siblings, 0 replies; 64+ messages in thread
From: Geert Uytterhoeven @ 2013-10-10  6:52 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Jan Kara, Richard Weinberger, Toralf Förster, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko,
	Gu Zheng, Benjamin LaHaise

On Thu, Oct 10, 2013 at 4:46 AM, Fengguang Wu <fengguang.wu@intel.com> wrote:
> On Wed, Oct 09, 2013 at 11:47:33PM +0200, Jan Kara wrote:
>> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
>> > Am 09.10.2013 19:26, schrieb Toralf Förster:
>> > > On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>> > >> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>> > >>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>> > >>>> asserts its strict positive?!?
>> > >>>>
>> > >>>> Can you please try the following instead of the BUG_ON():
>> > >>>>
>> > >>>> if (pause < 0) {
>> > >>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>> > >>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>> > >>>>         printk("pause = %ld\n", pause);

>> > >>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
>> > >>> As soon as the issue happens I do have a
>> > >>>
>> > >>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>> > >>>
>> > >>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
>> > >>> to run very basic commands like a crash dump, sysrq ond so on.
>> > >>
>> > >> You may get an idea of the magnitude of pages_dirtied by using a chain of
>> > >> BUG_ON()s, like:
>> > >>
>> > >> BUG_ON(pages_dirtied > 2000000000);
>> > >> BUG_ON(pages_dirtied > 1000000000);
>> > >> BUG_ON(pages_dirtied > 100000000);
>> > >> BUG_ON(pages_dirtied > 10000000);
>> > >> BUG_ON(pages_dirtied > 1000000);
>> > >>
>> > >> Probably 1 million is already too much for normal operation?
>> > >>
>> > > period = HZ * pages_dirtied / task_ratelimit;
>> > >           BUG_ON(pages_dirtied > 2000000000);
>> > >           BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
>> >
>> > Summary for mm people:
>> >
>> > Toralf runs trinty on UML/i386.
>> > After some time pages_dirtied becomes very large.
>> > More than 1000000000 pages in this case.
>>   Huh, this is really strange. pages_dirtied is passed into
>> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
>> over 10^9 can get there.
>
> I noticed aio_setup_ring() in the call trace and find it recently
> added a SetPageDirty() call in a loop by commit 36bc08cc01 ("fs/aio:
> Add support to aio ring pages migration"). So added CC to its authors.
>
>> After all that is over 4TB so I somewhat doubt the
>> task was ever able to dirty that much during its lifetime (but correct me
>> if I'm wrong here, with UML and memory backed disks it is not totally
>> impossible)... I went through the logic of handling ->nr_dirtied but
>> I didn't find any obvious problem there. Hum, maybe one thing - what
>> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
>> huge, we could possibly accumulate huge current->nr_dirtied.
>>
>> > Thus, period = HZ * pages_dirtied / task_ratelimit overflows
>> > and period/pause becomes extremely large.

period/pause are signed long, so they become negative instead of
extremely large when overflowing.

>> > It looks like io_schedule_timeout() get's called with a very large timeout.
>> > I don't know why "if (unlikely(pause > max_pause)) {" does not help.

Because pause is now negative.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-10  6:52                                     ` Geert Uytterhoeven
  0 siblings, 0 replies; 64+ messages in thread
From: Geert Uytterhoeven @ 2013-10-10  6:52 UTC (permalink / raw)
  To: Fengguang Wu
  Cc: Jan Kara, Richard Weinberger, Toralf Förster, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko,
	Gu Zheng, Benjamin LaHaise

On Thu, Oct 10, 2013 at 4:46 AM, Fengguang Wu <fengguang.wu@intel.com> wrote:
> On Wed, Oct 09, 2013 at 11:47:33PM +0200, Jan Kara wrote:
>> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
>> > Am 09.10.2013 19:26, schrieb Toralf Förster:
>> > > On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
>> > >> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>> > >>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
>> > >>>> asserts its strict positive?!?
>> > >>>>
>> > >>>> Can you please try the following instead of the BUG_ON():
>> > >>>>
>> > >>>> if (pause < 0) {
>> > >>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>> > >>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>> > >>>>         printk("pause = %ld\n", pause);

>> > >>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
>> > >>> As soon as the issue happens I do have a
>> > >>>
>> > >>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
>> > >>>
>> > >>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
>> > >>> to run very basic commands like a crash dump, sysrq ond so on.
>> > >>
>> > >> You may get an idea of the magnitude of pages_dirtied by using a chain of
>> > >> BUG_ON()s, like:
>> > >>
>> > >> BUG_ON(pages_dirtied > 2000000000);
>> > >> BUG_ON(pages_dirtied > 1000000000);
>> > >> BUG_ON(pages_dirtied > 100000000);
>> > >> BUG_ON(pages_dirtied > 10000000);
>> > >> BUG_ON(pages_dirtied > 1000000);
>> > >>
>> > >> Probably 1 million is already too much for normal operation?
>> > >>
>> > > period = HZ * pages_dirtied / task_ratelimit;
>> > >           BUG_ON(pages_dirtied > 2000000000);
>> > >           BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
>> >
>> > Summary for mm people:
>> >
>> > Toralf runs trinty on UML/i386.
>> > After some time pages_dirtied becomes very large.
>> > More than 1000000000 pages in this case.
>>   Huh, this is really strange. pages_dirtied is passed into
>> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
>> over 10^9 can get there.
>
> I noticed aio_setup_ring() in the call trace and find it recently
> added a SetPageDirty() call in a loop by commit 36bc08cc01 ("fs/aio:
> Add support to aio ring pages migration"). So added CC to its authors.
>
>> After all that is over 4TB so I somewhat doubt the
>> task was ever able to dirty that much during its lifetime (but correct me
>> if I'm wrong here, with UML and memory backed disks it is not totally
>> impossible)... I went through the logic of handling ->nr_dirtied but
>> I didn't find any obvious problem there. Hum, maybe one thing - what
>> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
>> huge, we could possibly accumulate huge current->nr_dirtied.
>>
>> > Thus, period = HZ * pages_dirtied / task_ratelimit overflows
>> > and period/pause becomes extremely large.

period/pause are signed long, so they become negative instead of
extremely large when overflowing.

>> > It looks like io_schedule_timeout() get's called with a very large timeout.
>> > I don't know why "if (unlikely(pause > max_pause)) {" does not help.

Because pause is now negative.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-10  6:52                                     ` Geert Uytterhoeven
@ 2013-10-10  7:03                                       ` Fengguang Wu
  -1 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-10  7:03 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Jan Kara, Richard Weinberger, Toralf Förster, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko,
	Gu Zheng, Benjamin LaHaise

On Thu, Oct 10, 2013 at 08:52:33AM +0200, Geert Uytterhoeven wrote:
> On Thu, Oct 10, 2013 at 4:46 AM, Fengguang Wu <fengguang.wu@intel.com> wrote:
> > On Wed, Oct 09, 2013 at 11:47:33PM +0200, Jan Kara wrote:
> >> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
> >> > Am 09.10.2013 19:26, schrieb Toralf Förster:
> >> > > On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
> >> > >> On Sun, Oct 6, 2013 at 11:01 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
> >> > >>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
> >> > >>>> asserts its strict positive?!?
> >> > >>>>
> >> > >>>> Can you please try the following instead of the BUG_ON():
> >> > >>>>
> >> > >>>> if (pause < 0) {
> >> > >>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
> >> > >>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
> >> > >>>>         printk("pause = %ld\n", pause);
> 
> >> > >>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
> >> > >>> As soon as the issue happens I do have a
> >> > >>>
> >> > >>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
> >> > >>>
> >> > >>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
> >> > >>> to run very basic commands like a crash dump, sysrq ond so on.
> >> > >>
> >> > >> You may get an idea of the magnitude of pages_dirtied by using a chain of
> >> > >> BUG_ON()s, like:
> >> > >>
> >> > >> BUG_ON(pages_dirtied > 2000000000);
> >> > >> BUG_ON(pages_dirtied > 1000000000);
> >> > >> BUG_ON(pages_dirtied > 100000000);
> >> > >> BUG_ON(pages_dirtied > 10000000);
> >> > >> BUG_ON(pages_dirtied > 1000000);
> >> > >>
> >> > >> Probably 1 million is already too much for normal operation?
> >> > >>
> >> > > period = HZ * pages_dirtied / task_ratelimit;
> >> > >           BUG_ON(pages_dirtied > 2000000000);
> >> > >           BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
> >> >
> >> > Summary for mm people:
> >> >
> >> > Toralf runs trinty on UML/i386.
> >> > After some time pages_dirtied becomes very large.
> >> > More than 1000000000 pages in this case.
> >>   Huh, this is really strange. pages_dirtied is passed into
> >> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
> >> over 10^9 can get there.
> >
> > I noticed aio_setup_ring() in the call trace and find it recently
> > added a SetPageDirty() call in a loop by commit 36bc08cc01 ("fs/aio:
> > Add support to aio ring pages migration"). So added CC to its authors.
> >
> >> After all that is over 4TB so I somewhat doubt the
> >> task was ever able to dirty that much during its lifetime (but correct me
> >> if I'm wrong here, with UML and memory backed disks it is not totally
> >> impossible)... I went through the logic of handling ->nr_dirtied but
> >> I didn't find any obvious problem there. Hum, maybe one thing - what
> >> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
> >> huge, we could possibly accumulate huge current->nr_dirtied.
> >>
> >> > Thus, period = HZ * pages_dirtied / task_ratelimit overflows
> >> > and period/pause becomes extremely large.
> 
> period/pause are signed long, so they become negative instead of
> extremely large when overflowing.

Yeah. For that we have underflow detect as well: 

                if (pause < min_pause) {
                        ...
                        break;
                }

So we'll break out of the loop -- but yeah, whether the break is the
right behavior on underflow is still questionable.

> >> > It looks like io_schedule_timeout() get's called with a very large timeout.
> >> > I don't know why "if (unlikely(pause > max_pause)) {" does not help.
> 
> Because pause is now negative.

So here io_schedule_timeout() won't be called with negative pause.

And if ever io_schedule_timeout() calls schedule_timeout() with
negative timeout, the latter will emit a warning and break out, too:

                if (timeout < 0) {
                        printk(KERN_ERR "schedule_timeout: wrong timeout "
                                "value %lx\n", timeout);
                        dump_stack();
                        current->state = TASK_RUNNING;
                        goto out;
                }

Thanks,
Fengguang


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
@ 2013-10-10  7:03                                       ` Fengguang Wu
  0 siblings, 0 replies; 64+ messages in thread
From: Fengguang Wu @ 2013-10-10  7:03 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Jan Kara, Richard Weinberger, Toralf Förster, UML devel,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes, darrick.wong, Michal Hocko,
	Gu Zheng, Benjamin LaHaise

On Thu, Oct 10, 2013 at 08:52:33AM +0200, Geert Uytterhoeven wrote:
> On Thu, Oct 10, 2013 at 4:46 AM, Fengguang Wu <fengguang.wu@intel.com> wrote:
> > On Wed, Oct 09, 2013 at 11:47:33PM +0200, Jan Kara wrote:
> >> On Wed 09-10-13 20:43:50, Richard Weinberger wrote:
> >> > Am 09.10.2013 19:26, schrieb Toralf FA?rster:
> >> > > On 10/08/2013 10:07 PM, Geert Uytterhoeven wrote:
> >> > >> On Sun, Oct 6, 2013 at 11:01 PM, Toralf FA?rster <toralf.foerster@gmx.de> wrote:
> >> > >>>> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
> >> > >>>> asserts its strict positive?!?
> >> > >>>>
> >> > >>>> Can you please try the following instead of the BUG_ON():
> >> > >>>>
> >> > >>>> if (pause < 0) {
> >> > >>>>         printk("pages_dirtied = %lu\n", pages_dirtied);
> >> > >>>>         printk("task_ratelimit = %lu\n", task_ratelimit);
> >> > >>>>         printk("pause = %ld\n", pause);
> 
> >> > >>> I tried it in different ways already - I'm completely unsuccessful in getting any printk output.
> >> > >>> As soon as the issue happens I do have a
> >> > >>>
> >> > >>> BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
> >> > >>>
> >> > >>> at stderr of the UML and then no further input is accepted. With uml_mconsole I'm however able
> >> > >>> to run very basic commands like a crash dump, sysrq ond so on.
> >> > >>
> >> > >> You may get an idea of the magnitude of pages_dirtied by using a chain of
> >> > >> BUG_ON()s, like:
> >> > >>
> >> > >> BUG_ON(pages_dirtied > 2000000000);
> >> > >> BUG_ON(pages_dirtied > 1000000000);
> >> > >> BUG_ON(pages_dirtied > 100000000);
> >> > >> BUG_ON(pages_dirtied > 10000000);
> >> > >> BUG_ON(pages_dirtied > 1000000);
> >> > >>
> >> > >> Probably 1 million is already too much for normal operation?
> >> > >>
> >> > > period = HZ * pages_dirtied / task_ratelimit;
> >> > >           BUG_ON(pages_dirtied > 2000000000);
> >> > >           BUG_ON(pages_dirtied > 1000000000);      <-------------- this is line 1467
> >> >
> >> > Summary for mm people:
> >> >
> >> > Toralf runs trinty on UML/i386.
> >> > After some time pages_dirtied becomes very large.
> >> > More than 1000000000 pages in this case.
> >>   Huh, this is really strange. pages_dirtied is passed into
> >> balance_dirty_pages() from current->nr_dirtied. So I wonder how a value
> >> over 10^9 can get there.
> >
> > I noticed aio_setup_ring() in the call trace and find it recently
> > added a SetPageDirty() call in a loop by commit 36bc08cc01 ("fs/aio:
> > Add support to aio ring pages migration"). So added CC to its authors.
> >
> >> After all that is over 4TB so I somewhat doubt the
> >> task was ever able to dirty that much during its lifetime (but correct me
> >> if I'm wrong here, with UML and memory backed disks it is not totally
> >> impossible)... I went through the logic of handling ->nr_dirtied but
> >> I didn't find any obvious problem there. Hum, maybe one thing - what
> >> 'task_ratelimit' values do you see in balance_dirty_pages? If that one was
> >> huge, we could possibly accumulate huge current->nr_dirtied.
> >>
> >> > Thus, period = HZ * pages_dirtied / task_ratelimit overflows
> >> > and period/pause becomes extremely large.
> 
> period/pause are signed long, so they become negative instead of
> extremely large when overflowing.

Yeah. For that we have underflow detect as well: 

                if (pause < min_pause) {
                        ...
                        break;
                }

So we'll break out of the loop -- but yeah, whether the break is the
right behavior on underflow is still questionable.

> >> > It looks like io_schedule_timeout() get's called with a very large timeout.
> >> > I don't know why "if (unlikely(pause > max_pause)) {" does not help.
> 
> Because pause is now negative.

So here io_schedule_timeout() won't be called with negative pause.

And if ever io_schedule_timeout() calls schedule_timeout() with
negative timeout, the latter will emit a warning and break out, too:

                if (timeout < 0) {
                        printk(KERN_ERR "schedule_timeout: wrong timeout "
                                "value %lx\n", timeout);
                        dump_stack();
                        current->state = TASK_RUNNING;
                        goto out;
                }

Thanks,
Fengguang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-06 20:26                     ` Geert Uytterhoeven
  2013-10-06 21:01                       ` Toralf Förster
@ 2013-10-08 19:56                       ` Toralf Förster
  2013-10-09 10:35                         ` stian
  1 sibling, 1 reply; 64+ messages in thread
From: Toralf Förster @ 2013-10-08 19:56 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Richard Weinberger, UML devel, Linux Kernel

Well, the quick&dirty hack below at least works for the moment to
overcome the soft lookup and the hang/unresponsiveness of the 32 bit
user mode linux guest :


diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f5236f8..7e9483c 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1503,6 +1503,8 @@ static void balance_dirty_pages(struct
address_space *mapping,
                }

 pause:
+               if (pause < 0)
+                       break;
                trace_balance_dirty_pages(bdi,
                                          dirty_thresh,
                                          background_thresh,



I'm not proud of it but after starring at the source code in
mm/page-writeback.c too often and too long currently I don't have any
better clue.

WRT to debug of the culprit: neither printk nor friends worked (maybe
b/c the affected process is just hanging ?) and BUG_ON doesn't gave me
any new clues.


On 10/06/2013 10:26 PM, Geert Uytterhoeven wrote:
> On Sun, Oct 6, 2013 at 10:08 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>> On 10/06/2013 08:38 PM, Geert Uytterhoeven wrote:
>>> On Sun, Oct 6, 2013 at 4:17 PM, Toralf Förster <toralf.foerster@gmx.de> wrote:
>>>> The UML stopped here :
>>>> ...
>>>>                 if (unlikely(task_ratelimit == 0)) {
>>>>                         period = max_pause;
>>>>                         pause = max_pause;
>>>>                         BUG_ON(pause < 0);
>>>>                         goto pause;
>>>>                 }
>>>>                 BUG_ON(pages_dirtied < 0);
>>>>                 BUG_ON(task_ratelimit < 0);
>>>>                 period = HZ * pages_dirtied / task_ratelimit;
>>>>                 BUG_ON(period < 0);         <----------------------here
>>>
>>> So pages_dirtied becomes that big compared to task_ratelimit (both are
>>> "unsigned long"), that period (which is "long", just like "pause") overflows
>>> into a negative number.
>>>
>>> This is indeed much more likely to happen on 32-bit.
>>>
>>>> The back trace is :
>>>
>>>> #9  0x08411c64 in balance_dirty_pages (pages_dirtied=9, mapping=<optimized out>) at mm/page-writeback.c:1471
>>>
>>> But here pages_dirtied is only 9??
> 
>> Well, this points to an overflow or ? :
> 
> Negative indicates an overflow, but pages_dirtied doesn't.
> 
>> tfoerste@n22 ~/devel/linux $ nl -ba mm/page-writeback.c | grep -A 5 -B 5 1468
>>   1463                          BUG_ON(pause < 0);
>>   1464                          goto pause;
>>   1465                  }
>>   1466                  period = HZ * pages_dirtied / task_ratelimit;
>>   1467                  pause = period;
>>   1468                  BUG_ON(pause < 0 && pages_dirtied > 0 && task_ratelimit > 0);
>>   1469                  if (current->dirty_paused_when)
>>   1470                          pause -= now - current->dirty_paused_when;
>>   1471                  /*
>>   1472                   * For less than 1s think time (ext3/4 may block the dirtier
>>   1473                   * for up to 800ms from time to time on 1-HDD; so does xfs,
>>
>>
>> and the back trace is :
>>
>> #9  0x08411c6c in balance_dirty_pages (pages_dirtied=0, mapping=<optimized out>) at mm/page-writeback.c:1468
> 
> Hmm, now pages_dirtied is zero, according to the backtrace, but the BUG_ON()
> asserts its strict positive?!?
> 
> Can you please try the following instead of the BUG_ON():
> 
> if (pause < 0) {
>         printk("pages_dirtied = %lu\n", pages_dirtied);
>         printk("task_ratelimit = %lu\n", task_ratelimit);
>         printk("pause = %ld\n", pause);
> }
> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
> 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [uml-devel] BUG: soft lockup for a user mode linux image
  2013-10-08 19:56                       ` Toralf Förster
@ 2013-10-09 10:35                         ` stian
  0 siblings, 0 replies; 64+ messages in thread
From: stian @ 2013-10-09 10:35 UTC (permalink / raw)
  To: user-mode-linux-devel

Den 2013-10-08 21:56, skrev Toralf Förster:
> Well, the quick&dirty hack below at least works for the moment to
> overcome the soft lookup and the hang/unresponsiveness of the 32 bit
> user mode linux guest :
>
>
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index f5236f8..7e9483c 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1503,6 +1503,8 @@ static void balance_dirty_pages(struct
> address_space *mapping,
>                 }
>
>  pause:
> +               if (pause < 0)
> +                       break;

can we combine in the debug printk() mentioned below, that prints out 
pages_dirtied and task_ratelimit:

>> if (pause < 0) {
>>         printk("pages_dirtied = %lu\n", pages_dirtied);
>>         printk("task_ratelimit = %lu\n", task_ratelimit);
>>         printk("pause = %ld\n", pause);
            break;
>> }

I have not read the source, so what is the definition of HZ? Are 
pages_dirtied and task_ratelimit atomically safe under UML here, and are 
what datatype are they? (unsigned long)? The defininion of HZ will all 
in all decide what kind of multiplication and division we actually 
perform here; so let us hope it is not something like (signed char)

>>>>>                 BUG_ON(pages_dirtied < 0);
>>>>>                 BUG_ON(task_ratelimit < 0);

These will never hit if pages_dirtied and task_ratelimit are unsigned.


Stian Skjelstad

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134071&iu=/4140/ostg.clktrk
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 64+ messages in thread

end of thread, other threads:[~2013-10-14 12:34 UTC | newest]

Thread overview: 64+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-02 18:30 [uml-devel] BUG: soft lockup for a user mode linux image Toralf Förster
2013-10-02 18:30 ` Toralf Förster
2013-10-02 18:31 ` [uml-devel] " Toralf Förster
2013-10-02 18:31   ` Toralf Förster
2013-10-02 19:55 ` [uml-devel] " Richard Weinberger
2013-10-02 19:55   ` Richard Weinberger
2013-10-03 18:54   ` Toralf Förster
2013-10-03 18:54     ` Toralf Förster
2013-10-03 19:04     ` Richard Weinberger
2013-10-03 19:04       ` Richard Weinberger
2013-10-03 19:16       ` [uml-devel] " Toralf Förster
2013-10-03 19:16         ` Toralf Förster
2013-10-03 19:20         ` Richard Weinberger
2013-10-03 19:20           ` Richard Weinberger
2013-10-03 19:33           ` Toralf Förster
2013-10-03 19:33             ` Toralf Förster
2013-10-04  5:52             ` Richard Weinberger
2013-10-04  5:52               ` Richard Weinberger
2013-10-06 14:17               ` Toralf Förster
2013-10-06 14:17                 ` Toralf Förster
2013-10-06 18:38                 ` Geert Uytterhoeven
2013-10-06 18:38                   ` Geert Uytterhoeven
2013-10-06 20:08                   ` Toralf Förster
2013-10-06 20:26                     ` Geert Uytterhoeven
2013-10-06 21:01                       ` Toralf Förster
2013-10-08 20:07                         ` Geert Uytterhoeven
2013-10-08 20:07                           ` Geert Uytterhoeven
2013-10-09 17:26                           ` Toralf Förster
2013-10-09 18:43                             ` Richard Weinberger
2013-10-09 18:43                               ` Richard Weinberger
2013-10-09 21:47                               ` Jan Kara
2013-10-09 21:47                                 ` Jan Kara
2013-10-09 22:33                                 ` Richard Weinberger
2013-10-09 22:33                                   ` Richard Weinberger
2013-10-09 22:33                                   ` Richard Weinberger
2013-10-10 16:49                                   ` Toralf Förster
2013-10-10 16:49                                     ` Toralf Förster
2013-10-10 16:49                                     ` Toralf Förster
2013-10-11  1:16                                     ` Fengguang Wu
2013-10-11  1:16                                       ` Fengguang Wu
2013-10-11  8:42                                       ` Toralf Förster
2013-10-11  8:42                                         ` Toralf Förster
2013-10-11  8:57                                         ` Fengguang Wu
2013-10-11  8:57                                           ` Fengguang Wu
2013-10-11  8:57                                           ` Fengguang Wu
2013-10-11  9:05                                           ` Fengguang Wu
2013-10-11  9:05                                             ` Fengguang Wu
2013-10-11 14:12                                           ` Toralf Förster
2013-10-11 14:12                                             ` Toralf Förster
2013-10-11 14:12                                             ` Toralf Förster
2013-10-12  0:43                                             ` [PATCH] writeback: fix negative bdi max pause Fengguang Wu
2013-10-12  0:43                                               ` Fengguang Wu
2013-10-12  4:45                                             ` [PATCH v2] " Fengguang Wu
2013-10-12  4:45                                               ` Fengguang Wu
2013-10-14 12:34                                               ` Jan Kara
2013-10-14 12:34                                                 ` Jan Kara
2013-10-10  2:46                                 ` [uml-devel] BUG: soft lockup for a user mode linux image Fengguang Wu
2013-10-10  2:46                                   ` Fengguang Wu
2013-10-10  6:52                                   ` Geert Uytterhoeven
2013-10-10  6:52                                     ` Geert Uytterhoeven
2013-10-10  7:03                                     ` Fengguang Wu
2013-10-10  7:03                                       ` Fengguang Wu
2013-10-08 19:56                       ` Toralf Förster
2013-10-09 10:35                         ` stian

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.