public inbox for linux-kernel@vger.kernel.org
* 2.4.18 fork & defunct child.
@ 2003-11-17  7:18 Keith Whyte
       [not found] ` <3FB8E40F.EF61CA7@gmx.de>
       [not found] ` <20031117184732.GA531@louise.pinerecords.com>
  0 siblings, 2 replies; 7+ messages in thread
From: Keith Whyte @ 2003-11-17  7:18 UTC
  To: linux-kernel

I'm at a loss to get myself out of this one, folks; I really have tried. 
In desperation I am posting to linux-kernel in the hope that one of you good 
folks has seen this behaviour before.

I have a kernel 2.4.18 install, based on a Slackware 8.1 system.
This system was installed almost a year ago, and within two weeks of it being up 
and running I began to have problems compiling. 
make would fail with the likes of:
make[3]: *** wait: No child processes.  Stop.
make[3]: *** Waiting for unfinished jobs....
make[3]: *** wait: No child processes.  Stop.
make[2]: *** [first_rule] Error 2

I also discovered that programs like grep, echo and cut would often fork 
and hang, and this was what was spoiling the makes.

in this case (a kernel compile), the following is from ps axf:

17785 pts/0    T      0:00 touch /usr/src/linux-2.4.18/include/linux/ip.h
17786 pts/0    Z      0:00  \_ [touch <defunct>]

I tried reinstalling everything in /lib, among other things, but only a clean 
reinstallation would fix it, and even then the problem kept coming back after 
a few days.

To cut a long story short, after many clean reinstallations, hardware changes, 
and me complaining to the ISP about what I thought was dodgy hardware, it 
finally seemed to be working reliably. Now, after some 6 months, the problem 
has returned. 
(This machine is located at a remote ISP and I have never seen it. That makes it 
difficult to try a new kernel, for example: if it doesn't come up with the NICs 
and all, the ISP will charge heavily to intervene and fix it.)

This machine is still running the default kernel and modules and libc's from 
slackware 8.1.

I have made a directory (/sys2), installed some base packages below there, and 
when I chroot /sys2, I can demonstrate the following:

(I read about strace not following forks or something on Linux and I don't 
understand it fully, but why in one case is it doing these lseek and fork 
operations and in the other it isn't?)


in the "normal" system:


root@califas:~# strace /bin/true
execve("/bin/true", ["/bin/true"], [/* 25 vars */]) = 0
brk(0)                                  = 0x8049c2c
open("/etc/ld.so.preload", O_RDONLY)    = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=20514, ...}) = 0
old_mmap(NULL, 20514, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40015000
close(3)                                = 0
open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0h\222\1"..., 1024) = 1024
fstat64(3, {st_mode=S_IFREG|0755, st_size=5029105, ...}) = 0
old_mmap(NULL, 1191168, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x4001b000
mprotect(0x40134000, 40192, PROT_NONE)  = 0
old_mmap(0x40134000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x119000) = 0x40134000
old_mmap(0x4013a000, 15616, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4013a000
close(3)                                = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4013e000
munmap(0x40015000, 20514)               = 0
brk(0)                                  = 0x8049c2c
brk(0x8049c46)                          = 0x8049c46
getpid()                                = 17900
open("/proc/17900///////////exe", O_RDONLY) = 3
lseek(3, 12, SEEK_SET)                  = 12
read(3, "p\"\0\0", 4)                   = 4
lseek(3, 0, SEEK_END)                   = 11693
lseek(3, 8816, SEEK_SET)                = 8816
brk(0)                                  = 0x8049c46
brk(0x804a769)                          = 0x804a769
read(3, "\351o\10\0\0\215v\0U\211\345\353\3X\353s\350\370\377\377"..., 2877) = 2877
close(3)                                = 0
getppid()                               = 17899
fork()                                  = 17901
waitpid(17901,

and it hangs until I kill the strace process.


in the chroot system:

root@califas:/# strace /bin/true
execve("/bin/true", ["/bin/true"], [/* 22 vars */]) = 0
brk(0)                                  = 0x8049c2c
open("/etc/ld.so.preload", O_RDONLY)    = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=14328, ...}) = 0
old_mmap(NULL, 14328, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40015000
close(3)                                = 0
open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0h\222\1"..., 1024) = 1024
fstat64(3, {st_mode=S_IFREG|0755, st_size=5029105, ...}) = 0
old_mmap(NULL, 1191168, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40019000
mprotect(0x40132000, 40192, PROT_NONE)  = 0
old_mmap(0x40132000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x119000) = 0x40132000
old_mmap(0x40138000, 15616, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40138000
close(3)                                = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4013c000
munmap(0x40015000, 14328)               = 0
brk(0)                                  = 0x8049c2c
brk(0x8049c46)                          = 0x8049c46
getpid()                                = 17904
open("/proc/17904///////////exe", O_RDONLY) = -1 ENOENT (No such file or directory)
brk(0x8049c2c)                          = 0x8049c2c
brk(0)                                  = 0x8049c2c
brk(0x8049c54)                          = 0x8049c54
brk(0x804a000)                          = 0x804a000
_exit(0)                                = ?


here's a diff -y of those:

execve("/bin/true", ["/bin/true"], [/* 22 vars */]) = 0       | execve("/bin/true", ["/bin/true"], [/* 25 vars */]) = 0
brk(0)                                  = 0x8049c2c             brk(0)                                  = 0x8049c2c
open("/etc/ld.so.preload", O_RDONLY)    = -1 ENOENT (No such    open("/etc/ld.so.preload", O_RDONLY)    = -1 ENOENT (No such
open("/etc/ld.so.cache", O_RDONLY)      = 3                     open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=14328, ...}) = 0    | fstat64(3, {st_mode=S_IFREG|0644, st_size=20514, ...}) = 0
old_mmap(NULL, 14328, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40015 | old_mmap(NULL, 20514, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40015
close(3)                                = 0                     close(3)                                = 0
open("/lib/libc.so.6", O_RDONLY)        = 3                     open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0h\222   read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0h\222
fstat64(3, {st_mode=S_IFREG|0755, st_size=5029105, ...}) = 0    fstat64(3, {st_mode=S_IFREG|0755, st_size=5029105, ...}) = 0
old_mmap(NULL, 1191168, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3,  | old_mmap(NULL, 1191168, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3,
mprotect(0x40132000, 40192, PROT_NONE)  = 0                   | mprotect(0x40134000, 40192, PROT_NONE)  = 0
old_mmap(0x40132000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE | old_mmap(0x40134000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE
old_mmap(0x40138000, 15616, PROT_READ|PROT_WRITE, MAP_PRIVATE | old_mmap(0x4013a000, 15616, PROT_READ|PROT_WRITE, MAP_PRIVATE
close(3)                                = 0                     close(3)                                = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_AN | old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_AN
munmap(0x40015000, 14328)               = 0                   | munmap(0x40015000, 20514)               = 0
brk(0)                                  = 0x8049c2c             brk(0)                                  = 0x8049c2c
brk(0x8049c46)                          = 0x8049c46             brk(0x8049c46)                          = 0x8049c46
getpid()                                = 17904               | getpid()                                = 17900
open("/proc/17904///////////exe", O_RDONLY) = -1 ENOENT (No s | open("/proc/17900///////////exe", O_RDONLY) = 3
brk(0x8049c2c)                          = 0x8049c2c           | lseek(3, 12, SEEK_SET)                  = 12
brk(0)                                  = 0x8049c2c           | read(3, "p\"\0\0", 4)                   = 4
brk(0x8049c54)                          = 0x8049c54           | lseek(3, 0, SEEK_END)                   = 11693
brk(0x804a000)                          = 0x804a000           | lseek(3, 8816, SEEK_SET)                = 8816
_exit(0)                                = ?                   | brk(0)                                  = 0x8049c46
                                                              > brk(0x804a769)                          = 0x804a769
                                                              > read(3, "\351o\10\0\0\215v\0U\211\345\353\3X\353s\350\370\377
                                                              > close(3)                                = 0
                                                              > getppid()                               = 17899
                                                              > fork()                                  = 17901
                                                              > waitpid(17901,


Thanks for your help.

Keith.



* Re: 2.4.18 fork & defunct child.
       [not found] ` <3FB8E40F.EF61CA7@gmx.de>
@ 2003-11-18  0:26   ` Keith Whyte
  2003-11-18  1:00     ` Maciej Zenczykowski
  2003-11-18 10:39     ` 2.4.18 fork & defunct child => system is hacked Frank van Maarseveen
  2003-11-20  2:42   ` solution: 2.4.18 fork & defunct child Keith Whyte
  1 sibling, 2 replies; 7+ messages in thread
From: Keith Whyte @ 2003-11-18  0:26 UTC
  To: Edgar Toernig, linux-kernel, linux-gcc, linux-admin

Edgar Toernig wrote:

{ strace listing deleted, see 
http://marc.theaimsgroup.com/?l=linux-kernel&m=106905386725308&w=2 }

>That is not normal /bin/true behaviour.  Sure your system
>isn't hacked?  Give the -f option to strace to see what the
>forked process is trying to do...  Compare the size of
>/bin/true with a known-good one.
>
>Ciao, ET.
>
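The suggestion above to compare /bin/true with a known-good copy can be made a little stronger by comparing digests rather than sizes. What follows is only a minimal sketch, assuming a trusted copy of the binary is available somewhere (for example, unpacked from the original distribution package). The helper names `sha256_of` and `same_contents` are hypothetical, and the check is only as trustworthy as the system it runs on.

```python
import hashlib
import sys


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large binaries do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(suspect_path, known_good_path):
    """True if both files hash to the same SHA-256 digest."""
    return sha256_of(suspect_path) == sha256_of(known_good_path)


if __name__ == "__main__":
    # Hypothetical usage: script.py /bin/true /path/to/known-good/true
    if len(sys.argv) == 3:
        verdict = "identical" if same_contents(sys.argv[1], sys.argv[2]) else "DIFFERENT"
        print(sys.argv[1], "vs", sys.argv[2], "->", verdict)
```

A matching digest only shows that this one file is unmodified; it says nothing about the libraries it loads.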

I'm not sure. I should be running tripwire or something; this is the 
only one of my systems that doesn't run such a thing, as I have the 
firewall locked down and have been busy.
But it is true I accidentally did iptables -F and it was left that way for 
a few days.

But this happens with any program, not just /bin/true. Also, the 
/bin/true binaries on the root and chroot systems are identical. And with much 
interest I discovered that if I unmount /proc, the problem goes away. aggh.

That is why it is not exhibiting itself in the chroot system: no /proc.

I also remember that when this first happened nearly a year ago, some 
"unix engineer" at the ISP said, oh yeah, that's because something in the 
ext2 filesystem header is corrupted. I don't quite remember what he 
said exactly, something that sounded so far-fetched that I ignored it. 
Does that ring any bells with anyone?

please help, ug, i hate having a linux system that's not reliable. feels 
like having a pet that's in pain or something.

btw,
/lib/libc.so.6 -> libc-2.2.5.so

Keith

(i'm cross-posting here to gcc and admin in the hopes of finding someone 
who has seen this, thanks!)





* Re: 2.4.18 fork & defunct child.
       [not found] ` <20031117184732.GA531@louise.pinerecords.com>
@ 2003-11-18  0:41   ` Keith Whyte
  0 siblings, 0 replies; 7+ messages in thread
From: Keith Whyte @ 2003-11-18  0:41 UTC
  To: Tomas Szepe, linux-kernel


>Weird.  Totally weird.
>
>Have you checked the systems for root kits?  I'm really out of ideas
>here other than the usual hardwarehosed/systemcompromised.  One thing
>I can vouch for is Slackware 8.1 working ok as is, we've installed
>dozens of that particular release and all the machines are still
>humming away in the wild nicely.
>
>  
>
Thanks Tomas,
weird it is, it has me stumped. I'm no spring chicken with linux systems 
and i also have a slackware 8.1 system running fine on PCchips hardware 
for years. (well since slackware 8.1 came out, and before that it had 
7). But this is the only machine i've ever run a distro kernel on.

umounting /proc removes the problem.
what could be in there in proc that would be causing it? something 
misrepresented about the memory? or some other resource?


One thing I have noticed is that this happens on boot:
kernel: PCI_IDE: unknown IDE controller on PCI bus 00 device f9, VID=8086, DID=24cb
kernel: PCI: Device 00:1f.1 not available because of resource collisions

I sent some more info about the problem earlier to linux-kernel.

http://marc.theaimsgroup.com/?l=linux-kernel&m=106911546802893&w=2


thanks




* Re: 2.4.18 fork & defunct child.
  2003-11-18  0:26   ` Keith Whyte
@ 2003-11-18  1:00     ` Maciej Zenczykowski
  2003-11-18 10:39     ` 2.4.18 fork & defunct child => system is hacked Frank van Maarseveen
  1 sibling, 0 replies; 7+ messages in thread
From: Maciej Zenczykowski @ 2003-11-18  1:00 UTC
  To: Keith Whyte; +Cc: Edgar Toernig, linux-kernel, linux-gcc, linux-admin

> { strace listing deleted, see 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=106905386725308&w=2 }

Well, I strace'd my glibc 2.3.2 system's /bin/true and it doesn't fork and 
doesn't open proc (first place the two straces differ).  Maybe your 
libraries have been hacked - seems the most likely to me - if this is 
happening for all programs then the libc is likely bad...

I can't understand what it is opening /proc/.../exe for and I don't 
understand what the ///////// in there is for (I think more than 2 
consecutive slashes are illegal in POSIX, not sure though, never use more 
than 2 :) )
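
On the slash question: POSIX pathname resolution treats a run of consecutive slashes as a single slash (only a pathname that begins with exactly two slashes may be interpreted in an implementation-defined way), so a path like the one in the trace is legal. A minimal sketch to check this on a POSIX system follows; the helper name `with_extra_slashes` is hypothetical and exists only for this illustration.

```python
import os
import tempfile


def with_extra_slashes(path, count=10):
    """Insert `count` redundant slashes before the final path component."""
    head, tail = os.path.split(path)
    # One ordinary separator plus `count` redundant ones.
    return head + "/" * (count + 1) + tail


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as handle:
        padded = with_extra_slashes(handle.name)
        print(padded)
        # samefile() stats both names and compares device and inode numbers.
        print(os.path.samefile(handle.name, padded))
```

So the extra slashes by themselves are harmless; what is suspicious is that anything opens /proc/<pid>/exe at all.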

On a side note /bin/true should take up somewhere like 10 bytes asm code - 
what the hell is that thing doing more than exit(0) for? it shouldn't open 
any files at all... what a bad design (and true --help and true --version 
don't work anyway... duh!)

perhaps try ltrace'ing /bin/true and see what that prints out?

Cheers,
MaZe.





* Re: 2.4.18 fork & defunct child => system is hacked
  2003-11-18  0:26   ` Keith Whyte
  2003-11-18  1:00     ` Maciej Zenczykowski
@ 2003-11-18 10:39     ` Frank van Maarseveen
  2003-11-19 19:45       ` Keith Whyte
  1 sibling, 1 reply; 7+ messages in thread
From: Frank van Maarseveen @ 2003-11-18 10:39 UTC
  To: Keith Whyte; +Cc: linux-kernel

On Mon, Nov 17, 2003 at 06:26:00PM -0600, Keith Whyte wrote:
> 
> { strace listing deleted, see 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=106905386725308&w=2 }

First of all, /bin/true doing a fork() basically means you've
been hacked: there should not be any such code in there. The
open("/proc/17904///////////exe" is another piece of clear evidence
that your system has been hacked.

Why the additional slashes?

I suspect a library or LD_PRELOAD hack which simply encodes the getpid()
return value in decimal notation and stores it right into a static
buffer containing

	"/proc//////////////////exe"

because it can't use sprintf at that point for some reason (maybe
just because it is a library/LD_PRELOAD hack).
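
To make the guess above concrete, here is a minimal sketch of that kind of in-place fill, written in Python purely for illustration. The function name `fill_pid`, the `TEMPLATE` constant, and the slot width of 16 slashes are all assumptions made for this sketch; nothing here is taken from the actual malicious code.

```python
# Hypothetical template: "/proc/" followed by a run of slashes that acts as
# a fixed-width slot for the PID, then "exe". The slot width is an assumption.
TEMPLATE = b"/proc/" + b"/" * 16 + b"exe"


def fill_pid(template, pid, offset):
    """Overwrite `template` at `offset` with the decimal digits of `pid`.

    Mimics what a hand-rolled integer-to-string loop would do when no
    sprintf() is available: digits come out least significant first and
    are then written left to right. Unused slot bytes stay as slashes.
    """
    digits = []
    while True:
        digits.append(ord("0") + pid % 10)
        pid //= 10
        if pid == 0:
            break
    digits.reverse()
    if offset + len(digits) > len(template):
        raise ValueError("PID does not fit in the template")
    buf = bytearray(template)
    buf[offset:offset + len(digits)] = bytes(digits)
    return bytes(buf)


if __name__ == "__main__":
    # 17904 is the PID seen in the posted strace output.
    print(fill_pid(TEMPLATE, 17904, len(b"/proc/")).decode("ascii"))
```

Because the leftover slashes collapse during pathname resolution, the filled buffer still names /proc/<pid>/exe regardless of how many digits the PID has.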


-- 
Frank


* Re: 2.4.18 fork & defunct child => system is hacked
  2003-11-18 10:39     ` 2.4.18 fork & defunct child => system is hacked Frank van Maarseveen
@ 2003-11-19 19:45       ` Keith Whyte
  0 siblings, 0 replies; 7+ messages in thread
From: Keith Whyte @ 2003-11-19 19:45 UTC
  To: Frank van Maarseveen; +Cc: linux-kernel

Frank van Maarseveen wrote:

>On Mon, Nov 17, 2003 at 06:26:00PM -0600, Keith Whyte wrote:
>  
>
>>{ strace listing deleted, see 
>>http://marc.theaimsgroup.com/?l=linux-kernel&m=106905386725308&w=2 }
>>    
>>
>
>First of all, /bin/true doing a fork() basically means you've
>been hacked: there should not be any such code in there. The
>open("/proc/17904///////////exe" is another piece of clear evidence
>that your system has been hacked.
>
>Why the additional slashes?
>

Is it at all possible that this behaviour is due to strace?
I have just installed under a fresh directory, from the slackware 
packages, the glibc-so libs, a few progs, strace, and chroot'ed into 
that system.
 I still get the same behaviour. So does that mean it _has_ to be the 
kernel that is at fault?

a cmp on the distro kernel and the one on my system does show this..:

cmp -b -l /boot/vmlinuz /home/r2/boot/vmlinuz
    499   1 ^A     0 ^@

but that is the rootflags, no? I must have set it ro before.
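
For reference, cmp -l reports 1-based byte offsets in decimal and byte values in octal, so offset 499 is zero-based offset 498 (0x1F2). As far as I know that is where the x86 boot header keeps root_flags, which would fit the read-only guess. The sketch below reproduces that kind of listing under those assumptions; `differing_bytes` and `format_cmp_line` are hypothetical helper names, and the output layout only approximates cmp.

```python
import sys


def differing_bytes(data_a, data_b):
    """Return (1-based offset, byte_a, byte_b) for every differing position.

    Only the common prefix length is compared.
    """
    return [
        (index + 1, a, b)
        for index, (a, b) in enumerate(zip(data_a, data_b))
        if a != b
    ]


def format_cmp_line(offset, byte_a, byte_b):
    """Format one difference: decimal offset, then both bytes in octal."""
    return "%d %o %o" % (offset, byte_a, byte_b)


if __name__ == "__main__":
    # Hypothetical usage: script.py /boot/vmlinuz /home/r2/boot/vmlinuz
    if len(sys.argv) == 3:
        with open(sys.argv[1], "rb") as fa, open(sys.argv[2], "rb") as fb:
            for diff in differing_bytes(fa.read(), fb.read()):
                print(format_cmp_line(*diff))
```

A single differing byte at that offset would leave the rest of the kernel image byte-for-byte identical to the distribution one.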

 
I am going to compile a kernel on a clean machine and boot the machine 
with that as soon as i can get somebody down there to monitor it in case 
it doesn't come back up with the new kernel.

>I suspect a library or LD_PRELOAD hack which simply encodes the getpid()
>return value in decimal notation and stores it right into a static
>buffer containing
>
>	"/proc//////////////////exe"
>
>because it can't use sprintf at that point for some reason (maybe
>just because it is a library/LD_PRELOAD hack).
>
>
>  
>
I think I vaguely know what you're saying here, but why? Why would it have 
happened as soon as the machine was first brought up (after the 
initial install), then again after a reinstall, and then go away? Why 
then would it happen again some months later? And how would they have 
hacked it? It only runs ssh and apache. No sendmail, no bind, none of 
those usual culprits. Apache is not running as root. The only other 
listener is identd.
It also runs nfsd, but connections are firewalled from anything other 
than a 192.168.0.1 address configured on the second NIC. Ah, but then I 
did accidentally open the firewall recently for a few days.

hmmm.




* solution: 2.4.18 fork & defunct child.
       [not found] ` <3FB8E40F.EF61CA7@gmx.de>
  2003-11-18  0:26   ` Keith Whyte
@ 2003-11-20  2:42   ` Keith Whyte
  1 sibling, 0 replies; 7+ messages in thread
From: Keith Whyte @ 2003-11-20  2:42 UTC
  To: linux-kernel, linux-gcc, linux-admin

Folks, thanks to everyone who helped me out with this. I just found the 
file 982235016-gtkrc-429249277 in /tmp.
It kept reappearing as it tried to rm * -r in /tmp and
a quick google search led me to find out where it came from.

A few weeks ago I installed a binary that I got from a friend's machine, 
and I just checked his machine. It has the trojan also. That explains a 
lot. It was a realserver binary (no longer available for d/l) and I ran 
it once as root, as it likes to listen on port 554, before I changed that 
config and set up a user to run it. aggh. So easy to let something slip 
through. Never trust binaries... no matter where they come from.

Keith.


