* [uml-devel] negative pid -516 possible ?
@ 2013-12-21 14:36 Toralf Förster
  2013-12-29 12:53 ` Toralf Förster
  0 siblings, 1 reply; 8+ messages in thread
From: Toralf Förster @ 2013-12-21 14:36 UTC (permalink / raw)
  To: UML devel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Running trinity against a 32-bit User Mode Linux (still the radix tree issue ...) gives for a guest:

tfoerste@n22 ~ $ date; sudo gdb /home/tfoerste/devel/linux/linux 10044 -n -batch -ex 'bt full'
Sat Dec 21 15:33:03 CET 2013
0xb7710424 in __kernel_vsyscall ()
#0  0xb7710424 in __kernel_vsyscall ()
No symbol table info available.
#1  0x083d5d2f in __nanosleep_nocancel ()
No symbol table info available.
#2  0x0807267c in idle_sleep (nsecs=602496466104653440) at arch/um/os-Linux/time.c:183
        ts = {tv_sec = 0, tv_nsec = 6471789}
#3  0x0805fc0f in arch_cpu_idle () at arch/um/kernel/process.c:208
No locals.
#4  0x080a8981 in cpu_idle_loop () at kernel/cpu/idle.c:98
No locals.
#5  cpu_startup_entry (state=CPUHP_ONLINE) at kernel/cpu/idle.c:140
No locals.
#6  0x08421d02 in rest_init () at init/main.c:401
        pid = -516
#7  0x080487e1 in start_kernel () at init/main.c:655
        command_line = 0x85b6400 <command_line> "earlyprintk ubda=/home/tfoerste/virtual/uml/trinity ubdb=/mnt/ramdisk/trinity_swap eth0=tuntap,tap0,72:ef:3d:9f:c3:5a mem=1025M con0=fd:0,fd:1 con=pts rootfstype=ext4  root=98:0"
#8  0x08049e09 in start_kernel_proc (unused=0x0) at arch/um/kernel/skas/process.c:46
        pid = -516
#9  0x0805f7cb in new_thread_handler () at arch/um/kernel/process.c:129
        fn = 0x0
#10 0x00000000 in ?? ()
No symbol table info available.


Is this a valid number ?

- -- 
MfG/Sincerely
Toralf Förster
pgp finger print:1A37 6F99 4A9D 026F 13E2 4DCF C4EA CDDE 0076 E94E
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iF4EAREIAAYFAlK1p3MACgkQxOrN3gB26U5rdAD/aZdN5SlzlNae6NVQyARaeCEh
jlykcUQiISJBjzWWbOsA/3hCVN/HhqLmQd/SHRQnY5TFdVmyutCSsbADxuZQhPJP
=VSvd
-----END PGP SIGNATURE-----

_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel


* Re: [uml-devel] negative pid -516 possible ?
  2013-12-21 14:36 [uml-devel] negative pid -516 possible ? Toralf Förster
@ 2013-12-29 12:53 ` Toralf Förster
  2013-12-29 13:14   ` stian
  0 siblings, 1 reply; 8+ messages in thread
From: Toralf Förster @ 2013-12-29 12:53 UTC (permalink / raw)
  To: UML devel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 12/21/2013 03:36 PM, Toralf Förster wrote:
> Running trinity against a 32-bit User Mode Linux (still the radix tree issue ...) gives for a guest:
> 
> tfoerste@n22 ~ $ date; sudo gdb /home/tfoerste/devel/linux/linux 10044 -n -batch -ex 'bt full'
> Sat Dec 21 15:33:03 CET 2013
> 0xb7710424 in __kernel_vsyscall ()
> #0  0xb7710424 in __kernel_vsyscall ()
> No symbol table info available.
> #1  0x083d5d2f in __nanosleep_nocancel ()
> No symbol table info available.
> #2  0x0807267c in idle_sleep (nsecs=602496466104653440) at arch/um/os-Linux/time.c:183
>         ts = {tv_sec = 0, tv_nsec = 6471789}
> #3  0x0805fc0f in arch_cpu_idle () at arch/um/kernel/process.c:208
> No locals.
> #4  0x080a8981 in cpu_idle_loop () at kernel/cpu/idle.c:98
> No locals.
> #5  cpu_startup_entry (state=CPUHP_ONLINE) at kernel/cpu/idle.c:140
> No locals.
> #6  0x08421d02 in rest_init () at init/main.c:401
>         pid = -516
> #7  0x080487e1 in start_kernel () at init/main.c:655
>         command_line = 0x85b6400 <command_line> "earlyprintk ubda=/home/tfoerste/virtual/uml/trinity ubdb=/mnt/ramdisk/trinity_swap eth0=tuntap,tap0,72:ef:3d:9f:c3:5a mem=1025M con0=fd:0,fd:1 con=pts rootfstype=ext4  root=98:0"
> #8  0x08049e09 in start_kernel_proc (unused=0x0) at arch/um/kernel/skas/process.c:46
>         pid = -516
> #9  0x0805f7cb in new_thread_handler () at arch/um/kernel/process.c:129
>         fn = 0x0
> #10 0x00000000 in ?? ()
> No symbol table info available.
> 
> 
> Is this a valid number ?
> 
> 

I'm asking because there is no process group ID 516, and -516 always shows up in the backtraces.
Furthermore, after a while the UML system no longer accepts any ssh login attempts.
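
A quick way to check whether the negative value really shows up in every saved backtrace is to scan the gdb output mechanically. The sketch below is only an illustration: negative_pids is a hypothetical helper name, and it assumes the 'bt full' layout shown above, where each frame line starts with '#N' and the locals follow on indented lines.

```sh
# Read gdb 'bt full' output on stdin and print every negative "pid = N"
# local together with the number of the frame it was reported in.
# negative_pids is a hypothetical helper, not part of gdb or UML.
negative_pids() {
    awk '
        /^#[0-9]+ / { frame = $1 }
        /^[ \t]+pid = -[0-9]+[ \t]*$/ { print $3, frame }
    '
}
```

Since it reads from stdin, the output of the gdb command above can be piped straight into it; each output line has the form "value frame", for example "-516 #6".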

- -- 
MfG/Sincerely
Toralf Förster
pgp finger print:1A37 6F99 4A9D 026F 13E2 4DCF C4EA CDDE 0076 E94E
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iF4EAREIAAYFAlLAGzwACgkQxOrN3gB26U4GaAD+J/AW3LTgeooTehy4vIw1QQO4
o1m6w/3Isy4JhVE/GBQA/AqqqNeuLRJsXrG0i3NpRiD9IpAiXbzieDaFQFOncGe5
=7Bs6
-----END PGP SIGNATURE-----


* Re: [uml-devel] negative pid -516 possible ?
  2013-12-29 12:53 ` Toralf Förster
@ 2013-12-29 13:14   ` stian
  2014-01-02 13:38     ` Richard Weinberger
  0 siblings, 1 reply; 8+ messages in thread
From: stian @ 2013-12-29 13:14 UTC (permalink / raw)
  To: user-mode-linux-devel

>> #6  0x08421d02 in rest_init () at init/main.c:401
>>         pid = -516
>> #7  0x080487e1 in start_kernel () at init/main.c:655
>>         command_line = 0x85b6400 <command_line> "earlyprintk 
>> ubda=/home/tfoerste/virtual/uml/trinity ubdb=/mnt/ramdisk/trinity_swap 
>> eth0=tuntap,tap0,72:ef:3d:9f:c3:5a mem=1025M con0=fd:0,fd:1 con=pts 
>> rootfstype=ext4  root=98:0"
>> #8  0x08049e09 in start_kernel_proc (unused=0x0) at 
>> arch/um/kernel/skas/process.c:46
>>         pid = -516
>> #9  0x0805f7cb in new_thread_handler () at 
>> arch/um/kernel/process.c:129
>>         fn = 0x0
>> #10 0x00000000 in ?? ()
>> No symbol table info available.
>>
>>
>> Is this a valid number ?
> I'm asking because there is no process group ID 516, and -516 always
> shows up in the backtraces.
> Furthermore, after a while the UML system no longer accepts any
> ssh login attempts.

-516 == -ERESTART_RESTARTBLOCK ??

Stian
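
For reference, 516 is the value of ERESTART_RESTARTBLOCK in the kernel's include/linux/errno.h, one of a small group of codes that are internal to the kernel and are not supposed to reach user space. The sketch below maps such a value back to its name; restart_code_name is a hypothetical helper, and only the constants 512 to 516 are covered.

```sh
# Map a value to the kernel-internal code it matches. Values as defined in
# include/linux/errno.h; restart_code_name itself is a hypothetical helper.
restart_code_name() {
    n=$1
    # Callers normally see the negated value, so accept either sign.
    if [ "$n" -lt 0 ]; then
        n=$(( 0 - $n ))
    fi
    case "$n" in
        512) echo ERESTARTSYS ;;
        513) echo ERESTARTNOINTR ;;
        514) echo ERESTARTNOHAND ;;
        515) echo ENOIOCTLCMD ;;
        516) echo ERESTART_RESTARTBLOCK ;;
        *)   echo unknown; return 1 ;;
    esac
}

restart_code_name -516   # prints ERESTART_RESTARTBLOCK
```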


* Re: [uml-devel] negative pid -516 possible ?
  2013-12-29 13:14   ` stian
@ 2014-01-02 13:38     ` Richard Weinberger
  0 siblings, 0 replies; 8+ messages in thread
From: Richard Weinberger @ 2014-01-02 13:38 UTC (permalink / raw)
  To: Stian Skjelstad; +Cc: user-mode-linux-devel@lists.sourceforge.net

[-- Attachment #1: Type: text/plain, Size: 1727 bytes --]

On Sun, Dec 29, 2013 at 2:14 PM,  <stian@nixia.no> wrote:
>>> #6  0x08421d02 in rest_init () at init/main.c:401
>>>         pid = -516
>>> #7  0x080487e1 in start_kernel () at init/main.c:655
>>>         command_line = 0x85b6400 <command_line> "earlyprintk
>>> ubda=/home/tfoerste/virtual/uml/trinity ubdb=/mnt/ramdisk/trinity_swap
>>> eth0=tuntap,tap0,72:ef:3d:9f:c3:5a mem=1025M con0=fd:0,fd:1 con=pts
>>> rootfstype=ext4  root=98:0"
>>> #8  0x08049e09 in start_kernel_proc (unused=0x0) at
>>> arch/um/kernel/skas/process.c:46
>>>         pid = -516
>>> #9  0x0805f7cb in new_thread_handler () at
>>> arch/um/kernel/process.c:129
>>>         fn = 0x0
>>> #10 0x00000000 in ?? ()
>>> No symbol table info available.
>>>
>>>
>>> Is this a valid number ?
>> I'm asking because there is no process group ID 516, and -516 always
>> shows up in the backtraces.
>> Furthermore, after a while the UML system no longer accepts any
>> ssh login attempts.
>
> -516 == -ERESTART_RESTARTBLOCK ??

Yeah, maybe.

Toralf, where exactly does this backtrace come from? "gives for a guest"
is not a good error description.
Did it crash and you took it from the core dump?
Did it panic() and you attached to it?
Did it hang...?
IOW, don't throw random backtraces at us without more details. ;-)

The number -516 is a bit odd because you see it in
arch/um/kernel/skas/process.c.
In that function it comes from os_getpid() which indicates that the
host kernel reports that number.
...very strange.

init/main.c makes a bit more sense. Maybe a kthread creation within
UML returned that internal error.

Can you try the attached debug patch?
If the BUG_ON() triggers, please show us the panic output from UML, not
just the gdb backtrace.

-- 
Thanks,
//richard

[-- Attachment #2: 516.diff --]
[-- Type: text/plain, Size: 834 bytes --]

diff --git a/arch/um/kernel/skas/process.c b/arch/um/kernel/skas/process.c
index 4da11b3..71a5828 100644
--- a/arch/um/kernel/skas/process.c
+++ b/arch/um/kernel/skas/process.c
@@ -38,6 +38,8 @@ static int __init start_kernel_proc(void *unused)
 	block_signals();
 	pid = os_getpid();
 
+	BUG_ON(pid == -516);
+
 	cpu_tasks[0].pid = pid;
 	cpu_tasks[0].task = current;
 #ifdef CONFIG_SMP
diff --git a/init/main.c b/init/main.c
index febc511..9ad68ab 100644
--- a/init/main.c
+++ b/init/main.c
@@ -386,6 +386,7 @@ static noinline void __init_refok rest_init(void)
 	kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);
 	numa_default_policy();
 	pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);
+	BUG_ON(pid == -516);
 	rcu_read_lock();
 	kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns);
 	rcu_read_unlock();


* Re: [uml-devel] negative pid -516 possible ?
@ 2014-01-11 10:47 Toralf Förster
  2014-01-12 23:21 ` Richard Weinberger
  0 siblings, 1 reply; 8+ messages in thread
From: Toralf Förster @ 2014-01-11 10:47 UTC (permalink / raw)
  To: UML devel

[-- Attachment #1: Type: text/plain, Size: 6356 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

I am fuzz testing a stable 32-bit Gentoo Linux User Mode Linux image with trinity (latest git version).
The host runs a stable 32-bit vanilla 3.12.7 kernel; the guest runs the latest git tree plus 2 patches (attached).

The trinity call in the UML guest is:
$> trinity -q -l off -N 10000 -C 2 -x move_pages -x mremap -v /mnt/ramdisk

After a while, no further progress is visible on the command line at the host system; the trinity process seems to just hang or idle. When this occurs I can no longer ssh into the system. The system keeps running, however: in another terminal I still see the output of this command:

$> ssh root@trinity "tail -f /var/log/messages"

That is how I know the system does not hang completely. The output of top on the host gives me the pid of the linux executable. Running gdb against that pid gives:

$ date; sudo gdb /home/tfoerste/devel/linux/linux 25224 -n -batch -ex 'bt full'
Sat Jan 11 11:36:47 CET 2014

warning: Could not load shared library symbols for linux-gate.so.1.
Do you need "set solib-search-path" or "set sysroot"?
0xb7800424 in __kernel_vsyscall ()
#0  0xb7800424 in __kernel_vsyscall ()
No symbol table info available.
#1  0x083d63ff in __nanosleep_nocancel ()
No symbol table info available.
#2  0x0807266c in idle_sleep (nsecs=602496380195307520) at arch/um/os-Linux/time.c:183
        ts = {tv_sec = 0, tv_nsec = 8436602}
#3  0x0805fc0f in arch_cpu_idle () at arch/um/kernel/process.c:208
No locals.
#4  0x080a8971 in cpu_idle_loop () at kernel/cpu/idle.c:98
No locals.
#5  cpu_startup_entry (state=CPUHP_ONLINE) at kernel/cpu/idle.c:140
No locals.
#6  0x084215e9 in rest_init () at init/main.c:402
        pid = -516
        __func__ = "rest_init"
#7  0x080487e1 in start_kernel () at init/main.c:656
        command_line = 0x85b8400 <command_line> "earlyprintk ubda=/home/tfoerste/virtual/uml/trinity ubdb=/mnt/ramdisk/trinity_swap eth0=tuntap,tap0,72:ef:3d:9f:c3:5a mem=1025M con0=fd:0,fd:1 con=pts rootfstype=ext4  root=98:0"
#8  0x08049e42 in start_kernel_proc (unused=0x0) at arch/um/kernel/skas/process.c:48
        pid = -516
        __func__ = "start_kernel_proc"
#9  0x0805f7cb in new_thread_handler () at arch/um/kernel/process.c:129
        fn = 0x0
#10 0x00000000 in ?? ()
No symbol table info available.



Please note that the BUG_ON() was not triggered. For completeness, here are the gdb traces of all linux processes currently running on the host:


$ pgrep linux | xargs -n1 -I {} sudo gdb /home/tfoerste/devel/linux/linux {} -n -batch -ex 'bt'
warning: process 1613 is already traced by process 25224
ptrace: Operation not permitted.
/home/tfoerste/1613: No such file or directory.
No stack.
warning: process 21849 is already traced by process 25224
ptrace: Operation not permitted.
/home/tfoerste/21849: No such file or directory.
No stack.

warning: Could not load shared library symbols for linux-gate.so.1.
Do you need "set solib-search-path" or "set sysroot"?
0xb7800424 in __kernel_vsyscall ()
#0  0xb7800424 in __kernel_vsyscall ()
#1  0x083d63ff in __nanosleep_nocancel ()
#2  0x0807266c in idle_sleep (nsecs=602496380205307520) at arch/um/os-Linux/time.c:183
#3  0x0805fc0f in arch_cpu_idle () at arch/um/kernel/process.c:208
#4  0x080a8971 in cpu_idle_loop () at kernel/cpu/idle.c:98
#5  cpu_startup_entry (state=CPUHP_ONLINE) at kernel/cpu/idle.c:140
#6  0x084215e9 in rest_init () at init/main.c:402
#7  0x080487e1 in start_kernel () at init/main.c:656
#8  0x08049e42 in start_kernel_proc (unused=0x0) at arch/um/kernel/skas/process.c:48
#9  0x0805f7cb in new_thread_handler () at arch/um/kernel/process.c:129
#10 0x00000000 in ?? ()

warning: process 25231 is a cloned process

warning: Could not load shared library symbols for linux-gate.so.1.
Do you need "set solib-search-path" or "set sysroot"?
0xb7800424 in __kernel_vsyscall ()
#0  0xb7800424 in __kernel_vsyscall ()
#1  0x083da446 in syscall ()
#2  0x0806e861 in io_getevents (events=<optimized out>, ctx_id=<optimized out>, min_nr=<optimized out>, nr=<optimized out>, timeout=<optimized out>) at arch/um/os-Linux/aio.c:49
#3  aio_thread (arg=0x0) at arch/um/os-Linux/aio.c:109
#4  0x083db56e in clone ()

warning: process 25232 is a cloned process

warning: Could not load shared library symbols for linux-gate.so.1.
Do you need "set solib-search-path" or "set sysroot"?
0xb7800424 in __kernel_vsyscall ()
#0  0xb7800424 in __kernel_vsyscall ()
#1  0x083d82c2 in __read_nocancel ()
#2  0x0806f3ff in read (__nbytes=<optimized out>, __buf=<optimized out>, __fd=<optimized out>) at /usr/include/bits/unistd.h:44
#3  os_read_file (fd=-512, buf=0xfffffe00, len=-512) at arch/um/os-Linux/file.c:253
#4  0x0806bafc in io_thread (arg=0x0) at arch/um/drivers/ubd_kern.c:1482
#5  0x083db56e in clone ()

warning: process 25233 is a cloned process

warning: Could not load shared library symbols for linux-gate.so.1.
Do you need "set solib-search-path" or "set sysroot"?
0xb7800424 in __kernel_vsyscall ()
#0  0xb7800424 in __kernel_vsyscall ()
#1  0x083d9132 in __poll_nocancel ()
#2  0x08071114 in poll (__timeout=<optimized out>, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:46
#3  write_sigio_thread (unused=0x0) at arch/um/os-Linux/sigio.c:61
#4  0x083db56e in clone ()
warning: process 25234 is a zombie - the process has already terminated
ptrace: Operation not permitted.
/home/tfoerste/25234: No such file or directory.
No stack.
...
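
One detail in the io_thread trace above may be worth a second look: fd=-512, buf=0xfffffe00 and len=-512 are all the same 32-bit pattern, and 512 is the kernel-internal ERESTARTSYS. The sketch below only does the two's-complement arithmetic; to_s32 is a hypothetical helper name. That gdb is showing a stale register here rather than the real arguments is an assumption, not something these traces prove.

```sh
# Reinterpret an integer (decimal or 0x-prefixed hex) as a signed 32-bit
# two's-complement value. to_s32 is a hypothetical helper name.
to_s32() {
    echo $(( (($1 & 0xffffffff) ^ 0x80000000) - 0x80000000 ))
}

to_s32 0xfffffe00   # prints -512, the same value gdb shows for fd and len
```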


Please Cc: me I'm not subscribed.



- -- 
MfG/Sincerely
Toralf Förster
pgp finger print:1A37 6F99 4A9D 026F 13E2 4DCF C4EA CDDE 0076 E94E
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iF4EAREIAAYFAlLRISQACgkQxOrN3gB26U54pwD9Eq49Oog5KpSC4+e19t4HG6LA
5d3Oz4/qq98wCb+rF9UA/0j+fT4xjdHbYmLtc8Z0wctVO3DjdQG49/+n81s/gLx3
=eP08
-----END PGP SIGNATURE-----

[-- Attachment #2: uml_filemap.patch --]
[-- Type: text/x-patch, Size: 937 bytes --]

diff --git a/mm/filemap.c b/mm/filemap.c
index b7749a92021c..622d49ac2a24 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1108,18 +1108,25 @@ static void do_generic_file_read(struct file *filp, loff_t *ppos,
 	pgoff_t prev_index;
 	unsigned long offset;      /* offset into pagecache page */
 	unsigned int prev_offset;
+	loff_t isize;
 	int error;
 
+	/* we need to trim desc->count to avoid expose stale data to user */
+	isize = i_size_read(inode);
+	if (*ppos + desc->count >= isize)
+		desc->count = isize - *ppos;
 	index = *ppos >> PAGE_CACHE_SHIFT;
 	prev_index = ra->prev_pos >> PAGE_CACHE_SHIFT;
 	prev_offset = ra->prev_pos & (PAGE_CACHE_SIZE-1);
 	last_index = (*ppos + desc->count + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT;
 	offset = *ppos & ~PAGE_CACHE_MASK;
 
+	if (desc->count == 0)
+		goto out;
+
 	for (;;) {
 		struct page *page;
 		pgoff_t end_index;
-		loff_t isize;
 		unsigned long nr, ret;
 
 		cond_resched();


[-- Attachment #3: pid516.patch --]
[-- Type: text/x-patch, Size: 835 bytes --]

diff --git a/arch/um/kernel/skas/process.c b/arch/um/kernel/skas/process.c
index 4da11b3..71a5828 100644
--- a/arch/um/kernel/skas/process.c
+++ b/arch/um/kernel/skas/process.c
@@ -38,6 +38,8 @@ static int __init start_kernel_proc(void *unused)
 	block_signals();
 	pid = os_getpid();
 
+	BUG_ON(pid == -516);
+
 	cpu_tasks[0].pid = pid;
 	cpu_tasks[0].task = current;
 #ifdef CONFIG_SMP
diff --git a/init/main.c b/init/main.c
index febc511..9ad68ab 100644
--- a/init/main.c
+++ b/init/main.c
@@ -386,6 +386,7 @@ static noinline void __init_refok rest_init(void)
 	kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);
 	numa_default_policy();
 	pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);
+	BUG_ON(pid == -516);
 	rcu_read_lock();
 	kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns);
 	rcu_read_unlock();


[-- Attachment #4: uml_filemap.patch.sig --]
[-- Type: application/pgp-signature, Size: 96 bytes --]

[-- Attachment #5: pid516.patch.sig --]
[-- Type: application/pgp-signature, Size: 96 bytes --]


* Re: [uml-devel] negative pid -516 possible ?
  2014-01-11 10:47 Toralf Förster
@ 2014-01-12 23:21 ` Richard Weinberger
  2014-01-13 19:54   ` Toralf Förster
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Weinberger @ 2014-01-12 23:21 UTC (permalink / raw)
  To: Toralf Förster; +Cc: UML devel

On Sat, Jan 11, 2014 at 11:47 AM, Toralf Förster <toralf.foerster@gmx.de> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> I am fuzz testing a stable 32-bit Gentoo Linux User Mode Linux image with trinity (latest git version).
> The host runs a stable 32-bit vanilla 3.12.7 kernel; the guest runs the latest git tree plus 2 patches (attached).
>
> The trinity call in the UML guest is:
> $> trinity -q -l off -N 10000 -C 2 -x move_pages -x mremap -v /mnt/ramdisk
>
> After a while, no further progress is visible on the command line at the host system; the trinity process seems to just hang or idle. When this occurs I can no longer ssh into the system. The system keeps running, however: in another terminal I still see the output of this command:

Does it consume 100% CPU?

> $> ssh root@trinity "tail -f /var/log/messages"
>
> That is how I know the system does not hang completely. The output of top on the host gives me the pid of the linux executable. Running gdb against that pid gives:
>
> $ date; sudo gdb /home/tfoerste/devel/linux/linux 25224 -n -batch -ex 'bt full'
> Sat Jan 11 11:36:47 CET 2014
>
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> No symbol table info available.
> #1  0x083d63ff in __nanosleep_nocancel ()
> No symbol table info available.
> #2  0x0807266c in idle_sleep (nsecs=602496380195307520) at arch/um/os-Linux/time.c:183
>         ts = {tv_sec = 0, tv_nsec = 8436602}
> #3  0x0805fc0f in arch_cpu_idle () at arch/um/kernel/process.c:208
> No locals.
> #4  0x080a8971 in cpu_idle_loop () at kernel/cpu/idle.c:98
> No locals.
> #5  cpu_startup_entry (state=CPUHP_ONLINE) at kernel/cpu/idle.c:140
> No locals.
> #6  0x084215e9 in rest_init () at init/main.c:402
>         pid = -516
>         __func__ = "rest_init"
> #7  0x080487e1 in start_kernel () at init/main.c:656
>         command_line = 0x85b8400 <command_line> "earlyprintk ubda=/home/tfoerste/virtual/uml/trinity ubdb=/mnt/ramdisk/trinity_swap eth0=tuntap,tap0,72:ef:3d:9f:c3:5a mem=1025M con0=fd:0,fd:1 con=pts rootfstype=ext4  root=98:0"
> #8  0x08049e42 in start_kernel_proc (unused=0x0) at arch/um/kernel/skas/process.c:48
>         pid = -516
>         __func__ = "start_kernel_proc"
> #9  0x0805f7cb in new_thread_handler () at arch/um/kernel/process.c:129
>         fn = 0x0
> #10 0x00000000 in ?? ()
> No symbol table info available.
>
>
>
> Please note that the BUG_ON() was not triggered. For completeness, here are the gdb traces of all linux processes currently running on the host:

So let's forget the 516 issue for now.
What we know for now is that you managed to trigger a lockup within UML.


>
> $ pgrep linux | xargs -n1 -I {} sudo gdb /home/tfoerste/devel/linux/linux {} -n -batch -ex 'bt'
> warning: process 1613 is already traced by process 25224
> ptrace: Operation not permitted.
> /home/tfoerste/1613: No such file or directory.
> No stack.
> warning: process 21849 is already traced by process 25224
> ptrace: Operation not permitted.
> /home/tfoerste/21849: No such file or directory.
> No stack.
>
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> #1  0x083d63ff in __nanosleep_nocancel ()
> #2  0x0807266c in idle_sleep (nsecs=602496380205307520) at arch/um/os-Linux/time.c:183
> #3  0x0805fc0f in arch_cpu_idle () at arch/um/kernel/process.c:208
> #4  0x080a8971 in cpu_idle_loop () at kernel/cpu/idle.c:98
> #5  cpu_startup_entry (state=CPUHP_ONLINE) at kernel/cpu/idle.c:140
> #6  0x084215e9 in rest_init () at init/main.c:402
> #7  0x080487e1 in start_kernel () at init/main.c:656
> #8  0x08049e42 in start_kernel_proc (unused=0x0) at arch/um/kernel/skas/process.c:48
> #9  0x0805f7cb in new_thread_handler () at arch/um/kernel/process.c:129
> #10 0x00000000 in ?? ()
>
> warning: process 25231 is a cloned process
>
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> #1  0x083da446 in syscall ()
> #2  0x0806e861 in io_getevents (events=<optimized out>, ctx_id=<optimized out>, min_nr=<optimized out>, nr=<optimized out>, timeout=<optimized out>) at arch/um/os-Linux/aio.c:49
> #3  aio_thread (arg=0x0) at arch/um/os-Linux/aio.c:109
> #4  0x083db56e in clone ()
>
> warning: process 25232 is a cloned process
>
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> #1  0x083d82c2 in __read_nocancel ()
> #2  0x0806f3ff in read (__nbytes=<optimized out>, __buf=<optimized out>, __fd=<optimized out>) at /usr/include/bits/unistd.h:44
> #3  os_read_file (fd=-512, buf=0xfffffe00, len=-512) at arch/um/os-Linux/file.c:253
> #4  0x0806bafc in io_thread (arg=0x0) at arch/um/drivers/ubd_kern.c:1482
> #5  0x083db56e in clone ()
>
> warning: process 25233 is a cloned process
>
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> #1  0x083d9132 in __poll_nocancel ()
> #2  0x08071114 in poll (__timeout=<optimized out>, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:46
> #3  write_sigio_thread (unused=0x0) at arch/um/os-Linux/sigio.c:61
> #4  0x083db56e in clone ()
> warning: process 25234 is a zombie - the process has already terminated
> ptrace: Operation not permitted.
> /home/tfoerste/25234: No such file or directory.
> No stack.
> ...
>
>
> Please Cc: me I'm not subscribed.

Wouldn't it make sense to subscribe?
You post very often on this list. :)

>
>
> - --
> MfG/Sincerely
> Toralf Förster
> pgp finger print:1A37 6F99 4A9D 026F 13E2 4DCF C4EA CDDE 0076 E94E
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.22 (GNU/Linux)
> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>
> iF4EAREIAAYFAlLRISQACgkQxOrN3gB26U54pwD9Eq49Oog5KpSC4+e19t4HG6LA
> 5d3Oz4/qq98wCb+rF9UA/0j+fT4xjdHbYmLtc8Z0wctVO3DjdQG49/+n81s/gLx3
> =eP08
> -----END PGP SIGNATURE-----
>



-- 
Thanks,
//richard


* Re: [uml-devel] negative pid -516 possible ?
  2014-01-12 23:21 ` Richard Weinberger
@ 2014-01-13 19:54   ` Toralf Förster
  2014-02-15 15:44     ` Toralf Förster
  0 siblings, 1 reply; 8+ messages in thread
From: Toralf Förster @ 2014-01-13 19:54 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: UML devel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 01/13/2014 12:21 AM, Richard Weinberger wrote:
> On Sat, Jan 11, 2014 at 11:47 AM, Toralf Förster <toralf.foerster@gmx.de> wrote:
> I am fuzz testing a stable 32-bit Gentoo Linux User Mode Linux image with trinity (latest git version).
> The host runs a stable 32-bit vanilla 3.12.7 kernel; the guest runs the latest git tree plus 2 patches (attached).
>
> The trinity call in the UML guest is:
> $> trinity -q -l off -N 10000 -C 2 -x move_pages -x mremap -v /mnt/ramdisk
>
> After a while, no further progress is visible on the command line at the host system; the trinity process seems to just hang or idle. When this occurs I can no longer ssh into the system. The system keeps running, however: in another terminal I still see the output of this command:
>
>> Does it consume 100% CPU?
>
No.
It just doesn't allow new ssh connections. Existing ssh connections are still working.

> $> ssh root@trinity "tail -f /var/log/messages"
>
> That is how I know the system does not hang completely. The output of top on the host gives me the pid of the linux executable. Running gdb against that pid gives:
> 
> $ date; sudo gdb /home/tfoerste/devel/linux/linux 25224 -n -batch -ex 'bt full'
> Sat Jan 11 11:36:47 CET 2014
> 
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> No symbol table info available.
> #1  0x083d63ff in __nanosleep_nocancel ()
> No symbol table info available.
> #2  0x0807266c in idle_sleep (nsecs=602496380195307520) at arch/um/os-Linux/time.c:183
>         ts = {tv_sec = 0, tv_nsec = 8436602}
> #3  0x0805fc0f in arch_cpu_idle () at arch/um/kernel/process.c:208
> No locals.
> #4  0x080a8971 in cpu_idle_loop () at kernel/cpu/idle.c:98
> No locals.
> #5  cpu_startup_entry (state=CPUHP_ONLINE) at kernel/cpu/idle.c:140
> No locals.
> #6  0x084215e9 in rest_init () at init/main.c:402
>         pid = -516
>         __func__ = "rest_init"
> #7  0x080487e1 in start_kernel () at init/main.c:656
>         command_line = 0x85b8400 <command_line> "earlyprintk ubda=/home/tfoerste/virtual/uml/trinity ubdb=/mnt/ramdisk/trinity_swap eth0=tuntap,tap0,72:ef:3d:9f:c3:5a mem=1025M con0=fd:0,fd:1 con=pts rootfstype=ext4  root=98:0"
> #8  0x08049e42 in start_kernel_proc (unused=0x0) at arch/um/kernel/skas/process.c:48
>         pid = -516
>         __func__ = "start_kernel_proc"
> #9  0x0805f7cb in new_thread_handler () at arch/um/kernel/process.c:129
>         fn = 0x0
> #10 0x00000000 in ?? ()
> No symbol table info available.
> 
> 
> 
> Please note that BUG_ON was not triggered. For completeness, here are the gdb traces from all linux processes currently running on the host:
> 
>> So let's forget the 516 issue for now.
>> What we know for now is that you managed to trigger a lockup within UML.
> 
Agreed, especially because I added this patch too:
$ cat ~/devel/priv/uml/pid516_2.patch
--- init/main.c_orig    2014-01-12 16:43:48.585439158 +0100
+++ init/main.c 2014-01-12 16:44:01.706438453 +0100
@@ -389,6 +389,7 @@
        BUG_ON(pid == -516);
        rcu_read_lock();
        kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns);
+       BUG_ON(pid == -516);
        rcu_read_unlock();
        complete(&kthreadd_done);

and this wasn't triggered (/me wonders if the -516 is somehow garbage).
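
A possible reading of the odd numbers, offered as an assumption rather than something established in this thread: 516 is the value of ERESTART_RESTARTBLOCK in the kernel's include/linux/errno.h (the code an interrupted nanosleep uses to request a restart), and 512 is ERESTARTSYS. The os_read_file frame quoted further down shows fd=-512, len=-512 and buf=0xfffffe00, which is the same -512 seen as an unsigned 32-bit value. Whether gdb is really showing a stale register or stack slot that holds such a syscall return value is a guess. The arithmetic itself can be checked with a small sketch (the helper names to_s32 and restart_code_name are made up for illustration):

```shell
#!/bin/bash
# Reinterpret a 32-bit value (hex or decimal) as a signed integer.
to_s32() {
    local v=$(( $1 & 0xffffffff ))
    if (( v >= 0x80000000 )); then
        v=$(( v - 0x100000000 ))
    fi
    echo "$v"
}

# Name the kernel-internal codes 512..516 from include/linux/errno.h.
# They are meant to stay inside the kernel, not to appear as errno.
restart_code_name() {
    case "$1" in
        -512) echo ERESTARTSYS ;;
        -513) echo ERESTARTNOINTR ;;
        -514) echo ERESTARTNOHAND ;;
        -515) echo ENOIOCTLCMD ;;
        -516) echo ERESTART_RESTARTBLOCK ;;
        *)    echo unknown ;;
    esac
}

restart_code_name "$(to_s32 0xfffffe00)"   # buf=0xfffffe00 -> ERESTARTSYS
restart_code_name "$(to_s32 -516)"         # pid = -516 -> ERESTART_RESTARTBLOCK
```

If that reading holds, it would fit the BUG_ON(pid == -516) checks never firing: the variable would not actually contain -516 at the time the checks run.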

But I can narrow down the problem. In a still-open ssh session I ran:

$ lsof | grep t3
bash      6129      tfoerste  cwd       DIR       98,0     4096    734 /home/tfoerste/t3
logger    6135      tfoerste  cwd       DIR       98,0     4096    734 /home/tfoerste/t3

(t3 is the ~/t3 directory that I cd into before I run trinity.)

And after killing the logger command the trinity batch continues:

$ ps xf -eo pid,start_time,command | grep trinity
 6412 20:48  |           \_ grep --colour=auto trinity
 6129 19:17          \_ bash -c cd ~; sudo su -c 'if [[ -d ./t3 ]]; then sudo chmod -R a+rwx ./t3; sudo rm -rf ./t3; fi'; mkdir ./t3; cd ./t3; logger "17#-1, M=/mnt/ramdisk"; if [[ -n /mnt/ramdisk ]]; then if [[ -d /mnt/ramdisk/victims/v1 ]]; then sudo chmod -R a+rwx /mnt/ramdisk/victims/v1; sudo rm -rf /mnt/ramdisk/victims/v1; fi; mkdir -p /mnt/ramdisk/victims/v1/v2; for i in $(seq -w 0 99); do touch /mnt/ramdisk/victims/v1/v2/f$i 2>/dev/null; mkdir /mnt/ramdisk/victims/v1/v2/d$i 2>/dev/null; done; fi;  trinity -q -N 10000 -C 2 -x move_pages -x mremap -V /mnt/ramdisk/victims/v1/v2
 6390 20:46              \_ trinity -q -N 10000 -C 2 -x move_pages -x mremap -V /mnt/ramdisk/victims/v1/v2
 6391 20:46                  \_ trinity -q -N 10000 -C 2 -x move_pages -x mremap -V /mnt/ramdisk/victims/v1/v2
 6392 20:46                  \_ trinity -q -N 10000 -C 2 -x move_pages -x mremap -V /mnt/ramdisk/victims/v1/v2
 6408 20:47                      \_ trinity -q -N 10000 -C 2 -x move_pages -x mremap -V /mnt/ramdisk/victims/v1/v2
 6410 20:48                      \_ trinity -q -N 10000 -C 2 -x move_pages -x mremap -V /mnt/ramdisk/victims/v1/v2


FWIW, ssh into the UML guest is still not possible, however. So I'm pretty sure that trinity really damages something there, but I'd expect such damage to show up somewhere in the logs, wouldn't it?

And finally - now the trinity batch command hangs again, and this time not even killing logger helps.
And a shutdown ("sudo halt; exit") hangs too.

> 
> 
> $ pgrep linux | xargs -n1 -I {} sudo gdb /home/tfoerste/devel/linux/linux {} -n -batch -ex 'bt'
> warning: process 1613 is already traced by process 25224
> ptrace: Operation not permitted.
> /home/tfoerste/1613: No such file or directory.
> No stack.
> warning: process 21849 is already traced by process 25224
> ptrace: Operation not permitted.
> /home/tfoerste/21849: No such file or directory.
> No stack.
> 
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> #1  0x083d63ff in __nanosleep_nocancel ()
> #2  0x0807266c in idle_sleep (nsecs=602496380205307520) at arch/um/os-Linux/time.c:183
> #3  0x0805fc0f in arch_cpu_idle () at arch/um/kernel/process.c:208
> #4  0x080a8971 in cpu_idle_loop () at kernel/cpu/idle.c:98
> #5  cpu_startup_entry (state=CPUHP_ONLINE) at kernel/cpu/idle.c:140
> #6  0x084215e9 in rest_init () at init/main.c:402
> #7  0x080487e1 in start_kernel () at init/main.c:656
> #8  0x08049e42 in start_kernel_proc (unused=0x0) at arch/um/kernel/skas/process.c:48
> #9  0x0805f7cb in new_thread_handler () at arch/um/kernel/process.c:129
> #10 0x00000000 in ?? ()
> 
> warning: process 25231 is a cloned process
> 
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> #1  0x083da446 in syscall ()
> #2  0x0806e861 in io_getevents (events=<optimized out>, ctx_id=<optimized out>, min_nr=<optimized out>, nr=<optimized out>, timeout=<optimized out>) at arch/um/os-Linux/aio.c:49
> #3  aio_thread (arg=0x0) at arch/um/os-Linux/aio.c:109
> #4  0x083db56e in clone ()
> 
> warning: process 25232 is a cloned process
> 
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> #1  0x083d82c2 in __read_nocancel ()
> #2  0x0806f3ff in read (__nbytes=<optimized out>, __buf=<optimized out>, __fd=<optimized out>) at /usr/include/bits/unistd.h:44
> #3  os_read_file (fd=-512, buf=0xfffffe00, len=-512) at arch/um/os-Linux/file.c:253
> #4  0x0806bafc in io_thread (arg=0x0) at arch/um/drivers/ubd_kern.c:1482
> #5  0x083db56e in clone ()
> 
> warning: process 25233 is a cloned process
> 
> warning: Could not load shared library symbols for linux-gate.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> 0xb7800424 in __kernel_vsyscall ()
> #0  0xb7800424 in __kernel_vsyscall ()
> #1  0x083d9132 in __poll_nocancel ()
> #2  0x08071114 in poll (__timeout=<optimized out>, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:46
> #3  write_sigio_thread (unused=0x0) at arch/um/os-Linux/sigio.c:61
> #4  0x083db56e in clone ()
> warning: process 25234 is a zombie - the process has already terminated
> ptrace: Operation not permitted.
> /home/tfoerste/25234: No such file or directory.
> No stack.
> ...
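
The "already traced" and "zombie" failures in the run above come from gdb trying to attach to processes it cannot ptrace, presumably because the UML kernel process (25224 here) already traces the host processes that carry guest userspace. A minimal sketch of a loop that skips those pids up front, assuming the TracerPid and State fields of /proc/<pid>/status; the function names tracer_of, state_of and bt_untraced are hypothetical:

```shell
#!/bin/bash
# Print the TracerPid field of a /proc/<pid>/status style file (0 = not traced).
tracer_of() {
    awk '/^TracerPid:/ { print $2 }' "$1"
}

# Print the one-letter process state (R, S, Z, ...) from the same file.
state_of() {
    awk '/^State:/ { print $2 }' "$1"
}

# Backtrace only the pids that gdb can be expected to attach to.
bt_untraced() {
    local binary=$1 pid status
    shift
    for pid in "$@"; do
        status=/proc/$pid/status
        [ -r "$status" ] || continue
        if [ "$(tracer_of "$status")" != "0" ]; then
            echo "skip $pid: traced by $(tracer_of "$status")"
        elif [ "$(state_of "$status")" = "Z" ]; then
            echo "skip $pid: zombie"
        else
            sudo gdb "$binary" "$pid" -n -batch -ex 'bt'
        fi
    done
}

# Usage: bt_untraced /home/tfoerste/devel/linux/linux $(pgrep linux)
```

This only trims the noise from the output; the backtraces of the attachable threads are the same ones the pgrep | xargs pipeline produces.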
> 
> 
> Please Cc: me I'm not subscribed.
> 
>> Wouldn't it make sense to subscribe?
>> You post very often on this list. :)
> 
done ;)


- -- 
MfG/Sincerely
Toralf Förster
pgp finger print:1A37 6F99 4A9D 026F 13E2 4DCF C4EA CDDE 0076 E94E
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iF4EAREIAAYFAlLURGIACgkQxOrN3gB26U44RQD+KUqGBeP6/nJk1K/1Wx6nz7ij
/JXcjNN+ZBt8PsMWrV4A/jx7w7Xrl0RPWcwXVFYm+Ixo0dSbtr+zvh/2pdcCNU2c
=uGid
-----END PGP SIGNATURE-----

_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel


* Re: [uml-devel] negative pid -516 possible ?
  2014-01-13 19:54   ` Toralf Förster
@ 2014-02-15 15:44     ` Toralf Förster
  0 siblings, 0 replies; 8+ messages in thread
From: Toralf Förster @ 2014-02-15 15:44 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: UML devel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 01/13/2014 08:54 PM, Toralf Förster wrote:
> and this wasn't triggered (/me wonders if the -516 is somehow garbage).

A funny variant of the problem: with the latest Linus tree I now get pid = 0 - everything else is unchanged:

$ date; sudo gdb /home/tfoerste/devel/linux/linux 18483 -n -batch -ex 'bt full'
Sat Feb 15 16:41:42 CET 2014

warning: Could not load shared library symbols for linux-gate.so.1.
Do you need "set solib-search-path" or "set sysroot"?
0xb7792424 in __kernel_vsyscall ()
#0  0xb7792424 in __kernel_vsyscall ()
No symbol table info available.
#1  0x083ded0f in __nanosleep_nocancel ()
No symbol table info available.
#2  0x0807269c in idle_sleep (nsecs=602637203593008768) at arch/um/os-Linux/time.c:183
        ts = {tv_sec = 0, tv_nsec = 10000000}
#3  0x0805fc2f in arch_cpu_idle () at arch/um/kernel/process.c:208
No locals.
#4  0x080a99c1 in cpu_idle_loop () at kernel/cpu/idle.c:98
No locals.
#5  cpu_startup_entry (state=CPUHP_ONLINE) at kernel/cpu/idle.c:143
No locals.
#6  0x08429ec2 in rest_init () at init/main.c:397
        pid = 0
#7  0x080487e9 in start_kernel () at init/main.c:652
        command_line = 0x85c2420 <command_line> "earlyprintk ubda=/home/tfoerste/virtual/uml/trinity ubdb=/mnt/ramdisk/trinity_swap eth0=tuntap,tap0,72:ef:3d:9f:c3:5a mem=1025M con0=fd:0,fd:1 con=pts rootfstype=ext4  root=98:0"
#8  0x08049e19 in start_kernel_proc (unused=0x0) at arch/um/kernel/skas/process.c:46
        pid = 0
#9  0x0805f7eb in new_thread_handler () at arch/um/kernel/process.c:129
        fn = 0x0
#10 0x00000000 in ?? ()
No symbol table info available.


- -- 
MfG/Sincerely
Toralf Förster
pgp finger print:1A37 6F99 4A9D 026F 13E2 4DCF C4EA CDDE 0076 E94E
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iF4EAREIAAYFAlL/i1MACgkQxOrN3gB26U6HCwD/WRTDhGO38eNIMaZla2RPLCcW
AVbaR7p7PLtFHP/I7AsA/Rzz9ASZyvxpx+TufWWl/3xKkv7fFs/Z6/laEseKhVpM
=v42x
-----END PGP SIGNATURE-----



end of thread, other threads:[~2014-02-15 15:44 UTC | newest]

Thread overview: 8+ messages
2013-12-21 14:36 [uml-devel] negative pid -516 possible ? Toralf Förster
2013-12-29 12:53 ` Toralf Förster
2013-12-29 13:14   ` stian
2014-01-02 13:38     ` Richard Weinberger
  -- strict thread matches above, loose matches on Subject: below --
2014-01-11 10:47 Toralf Förster
2014-01-12 23:21 ` Richard Weinberger
2014-01-13 19:54   ` Toralf Förster
2014-02-15 15:44     ` Toralf Förster
