* PS/Top broken - /proc entry bad
@ 2002-12-11 21:49 Matt Simonsen
2002-12-16 17:43 ` Benjamin LaHaise
0 siblings, 1 reply; 2+ messages in thread
From: Matt Simonsen @ 2002-12-11 21:49 UTC (permalink / raw)
To: linux-kernel
I had a box where ps and top quit working after hundreds of days uptime.
After doing an strace ps I found that one directory in /proc was hanging
it up: a directory named with a 5-digit number, which I believe was
associated with the process that had that PID.
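
A minimal sketch of how the stuck entry could be located without waiting on strace output by hand. It assumes the blocking read is on /proc/<pid>/cmdline (the post does not say which file ps was stuck on), and the helper names and the 2-second timeout are made up for illustration. Each read runs in a child process, since a reader that blocks inside the kernel may not be interruptible from within the same process.

```python
import os
import subprocess
import sys

# One-line reader run in a child process: try to read the file named in argv[1].
_READER = "import sys; open(sys.argv[1], 'rb').read()"


def list_pids(proc_root="/proc"):
    """Return the numeric (per-process) directory names under proc_root, sorted."""
    return sorted(int(name) for name in os.listdir(proc_root) if name.isdecimal())


def probe_pid(pid, timeout=2.0, proc_root="/proc"):
    """Classify one entry as 'ok', 'gone', or 'hung'.

    Assumption: reading <proc_root>/<pid>/cmdline is a fair stand-in for
    whatever read ps was blocked on.
    """
    path = os.path.join(proc_root, str(pid), "cmdline")
    child = subprocess.Popen(
        [sys.executable, "-c", _READER, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        status = child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        child.kill()  # best effort; a reader stuck in the kernel may linger
        return "hung"
    # A non-zero exit means the file could not be opened, e.g. the process exited.
    return "ok" if status == 0 else "gone"


def find_hung_entries(timeout=2.0, proc_root="/proc"):
    """Return the PIDs whose cmdline could not be read within the timeout."""
    return [pid for pid in list_pids(proc_root)
            if probe_pid(pid, timeout, proc_root) == "hung"]
```

Calling find_hung_entries() on the affected box would list the PIDs worth looking at. Spawning one interpreter per PID is slow, which is acceptable for a one-off diagnosis.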
I tried a kill -9 on the process; it returned fine, but the process
was still there. reboot hung my session too, so I had to use reboot -f to
get the machine healthy again.
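
A kill -9 that returns while the process stays around is consistent with a process in uninterruptible sleep (state D), where the signal is not acted on until the kernel wait ends. The post does not say what state the process was in, so that is an assumption. The sketch below only shows how the state letter could be pulled out of a /proc/<pid>/stat line; the function names are hypothetical, and reading stat for the affected PID may itself block on a box in this condition.

```python
def parse_stat_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line.

    The command name sits in parentheses and may itself contain spaces or
    parentheses, so split on the last ')' rather than on whitespace.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def is_uninterruptible(stat_line):
    """True if the state field is 'D' (uninterruptible sleep)."""
    return parse_stat_state(stat_line) == "D"
```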
Is there any way to "fix" /proc other than what I did? I suppose
dropping to a lower runlevel and then going back to 3 might have worked.
It's a remote machine, though, so a reboot seemed like the better
solution at the time.
Any comments/suggestions on what to do in this situation?
Thanks
Matt
* Re: PS/Top broken - /proc entry bad
From: Benjamin LaHaise @ 2002-12-16 17:43 UTC (permalink / raw)
To: Matt Simonsen; +Cc: linux-kernel
Use SysRq-T to get a backtrace of the processes. Most likely one of
them hung while still holding the mm semaphore, thereby preventing ps
and top from proceeding. Check your log for oopsen.
-ben
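
For a remote machine with no console keyboard, the two steps above could be sketched as follows. This assumes the kernel exposes /proc/sysrq-trigger, which may not be true of the kernel in this thread, and that the caller is root. The function names are hypothetical and the marker strings are a few common kernel messages, not an exhaustive list.

```python
# Illustrative substrings only; real logs may use other wording.
OOPS_MARKERS = ("Oops", "kernel BUG", "Unable to handle kernel")


def request_task_dump(trigger_path="/proc/sysrq-trigger"):
    """Ask the kernel to log a backtrace of every task, as SysRq-T does.

    Assumption: trigger_path exists and is writable (normally root only).
    The backtraces go to the kernel log, not to the caller.
    """
    with open(trigger_path, "w") as trigger:
        trigger.write("t")


def find_oops_lines(log_lines):
    """Return the log lines that contain one of the OOPS_MARKERS substrings."""
    return [line.rstrip("\n") for line in log_lines
            if any(marker in line for marker in OOPS_MARKERS)]
```

find_oops_lines can be fed an open log file, for example the output of dmesg saved to disk; where the kernel log ends up depends on the syslog setup.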
On Wed, Dec 11, 2002 at 01:49:51PM -0800, Matt Simonsen wrote:
> I had a box where ps and top quit working after hundreds of days uptime.
> After doing an strace ps I found that one directory in /proc was hanging
> it up: a directory named with a 5-digit number, which I believe was
> associated with the process that had that PID.
>
> I tried a kill -9 on the process; it returned fine, but the process
> was still there. reboot hung my session too, so I had to use reboot -f to
> get the machine healthy again.
>
> Is there any way to "fix" /proc other than what I did? I suppose
> dropping to a lower runlevel and then going back to 3 might have worked.
> It's a remote machine, though, so a reboot seemed like the better
> solution at the time.
>
> Any comments/suggestions on what to do in this situation?
>
> Thanks
> Matt
--
"Do you seek knowledge in time travel?"