* ps hang in 241-pre10
@ 2001-01-27 4:34 John Sheahan
2001-01-27 4:43 ` Aaron Lehmann
0 siblings, 1 reply; 32+ messages in thread
From: John Sheahan @ 2001-01-27 4:34 UTC (permalink / raw)
To: linux-kernel
Hi
My box has been running 2.4.1-pre10 for three days.
This morning I noticed odd behaviour - ps and top would freeze
with no output.
Running strace on 'ps' shows:
open("/proc/669/environ", O_RDONLY) = 7
read(7, "INIT_VERSION=sysvinit-2.78\0previ"..., 2047) = 254
close(7) = 0
stat("/proc/683", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open("/proc/683/stat", O_RDONLY) = 7
read(7,
--- and things just stop in that window--
I cannot read from /proc/683/<anything>.
Process 683 does not show up in /var/log/messages. How do I find
out what it is?
Any suggestions on how to debug this?
Kernel 2.4.1-pre10 on a 2-processor i686.
The box has run various 2.4.0-test kernels with no unusual issues.
john
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-27 4:34 ps hang in 241-pre10 John Sheahan
@ 2001-01-27 4:43 ` Aaron Lehmann
2001-01-27 7:03 ` Shawn Starr
2001-01-27 8:06 ` J Sloan
0 siblings, 2 replies; 32+ messages in thread
From: Aaron Lehmann @ 2001-01-27 4:43 UTC (permalink / raw)
To: John Sheahan; +Cc: linux-kernel
On Sat, Jan 27, 2001 at 03:34:26PM +1100, John Sheahan wrote:
> Hi
> My box has been running 2.4.1-pre10 for three days.
> This morning I noticed odd behaviour - ps and top would freeze
> with no output.
I had the same problem with 2.4.1-pre10 and the zerocopy patchset.
I came home one day and xmms was frozen. Attempting to determine
whether it was stuck in an odd state, I ran ps aux. At a certain
point (presumably just when it started trying to print info about the
xmms process), ps froze up too. And any attempts to killall -9 these
processes made the killall freeze!
I'm not sure what made xmms freeze up in the first place. My first
thought was a problem in the zerocopy patchset -- most of my mp3s are
played over NFS. However, XMMS was completely idle during the time I
was away from the computer, so I'm not sure what caused it. It seemed
clear, however, that the problem was contagious between processes.
I reverted back to 2.4.0-ac7 and have not had any more problems of this
nature.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-27 4:43 ` Aaron Lehmann
@ 2001-01-27 7:03 ` Shawn Starr
2001-01-27 8:06 ` J Sloan
1 sibling, 0 replies; 32+ messages in thread
From: Shawn Starr @ 2001-01-27 7:03 UTC (permalink / raw)
To: Aaron Lehmann; +Cc: John Sheahan, linux-kernel
I noticed this problem in 2.4.1-pre8.
Odd, that's EXACTLY what happened to me. I had to do a hard restart as killall
locked up when I tried to kill ps.
Any word on why this is happening?
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-27 4:43 ` Aaron Lehmann
2001-01-27 7:03 ` Shawn Starr
@ 2001-01-27 8:06 ` J Sloan
2001-01-27 8:24 ` David Ford
1 sibling, 1 reply; 32+ messages in thread
From: J Sloan @ 2001-01-27 8:06 UTC (permalink / raw)
To: Aaron Lehmann; +Cc: John Sheahan, linux-kernel
OK, It's official now, I didn't know if it was some
weird hardware fluke or something, but one of
the computers here exhibited the same problem -
The system in question is a Pentium II 400, SCSI
only (aic7xxx), running 2.4.1-pre8 plus Andrew
Morton's low latency patches.
The user was playing Unreal Tournament at the time
and reported that it "got weird all of a sudden". I
logged in and tried to do a ps, but the ps froze
after listing a few lines. Weird, never saw that one
before. The user rebooted, so there was no further
opportunity to investigate, but I thought I ought
to mention it after seeing these reports!
jjs
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-27 8:06 ` J Sloan
@ 2001-01-27 8:24 ` David Ford
2001-01-27 9:33 ` Shawn Starr
2001-01-27 16:19 ` Linus Torvalds
0 siblings, 2 replies; 32+ messages in thread
From: David Ford @ 2001-01-27 8:24 UTC (permalink / raw)
To: J Sloan; +Cc: Aaron Lehmann, John Sheahan, linux-kernel
I can quickly and easily duplicate it on my notebook by playing music or
mpegs in xmms. It may take a few minutes but it's guaranteed.
xmms stalls flat on its face and anything accessing /proc stalls. If I get
the time to do it, I'll take a gander at it with kdb.
I have no patches applied to p10, I have reiserfs onboard but I highly doubt
it's reiserfs.
-d
J Sloan wrote:
> OK, It's official now, I didn't know if it was some
> weird hardware fluke or something, but one of
> the computers here exhibited the same problem -
--
There is a natural aristocracy among men. The grounds of this are virtue and talents. Thomas Jefferson
The good thing about standards is that there are so many to choose from. Andrew S. Tanenbaum
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-27 8:24 ` David Ford
@ 2001-01-27 9:33 ` Shawn Starr
2001-01-27 11:26 ` John Sheahan
` (2 more replies)
2001-01-27 16:19 ` Linus Torvalds
1 sibling, 3 replies; 32+ messages in thread
From: Shawn Starr @ 2001-01-27 9:33 UTC (permalink / raw)
To: David Ford; +Cc: J Sloan, Aaron Lehmann, John Sheahan, linux-kernel
Yes, I have ReiserFS as well...hrm...
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-27 9:33 ` Shawn Starr
@ 2001-01-27 11:26 ` John Sheahan
2001-01-27 19:15 ` J Sloan
2001-01-27 21:14 ` Aaron Lehmann
2 siblings, 0 replies; 32+ messages in thread
From: John Sheahan @ 2001-01-27 11:26 UTC (permalink / raw)
To: Shawn Starr; +Cc: David Ford, J Sloan, Aaron Lehmann, linux-kernel
I have not compiled or used reiserfs here yet.
Compiling Mike's semaphore debug patch now and adding sysrq
- but this took three days to happen just once here.
..john
Shawn Starr wrote:
>
> Yes, I have ReiserFS as well...hrm...
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-27 8:24 ` David Ford
2001-01-27 9:33 ` Shawn Starr
@ 2001-01-27 16:19 ` Linus Torvalds
2001-01-27 23:42 ` David Ford
1 sibling, 1 reply; 32+ messages in thread
From: Linus Torvalds @ 2001-01-27 16:19 UTC (permalink / raw)
To: linux-kernel
In article <3A7285D4.9409E63A@linux.com>, David Ford <david@linux.com> wrote:
>I can quickly and easily duplicate it on my notebook by playing music or
>mpegs in xmms. It may take a few minutes but it's guaranteed.
>
>xmms stalls flat on it's face and anything accessing /proc stalls. If I get
>the time to do it, I'll take a gander at it with kdb.
Please, if you see something like this, just do a simple
<Alt+ScrollLock> followed by <Ctrl+ScrollLock> while in text-mode. The
magic keystrokes will give a stack trace of the currently running
process and all processes respectively.
Then, just look in your /var/log/messages, and if you have everything
set up correctly the system should have done the conversion to symbolic
kernel addresses for you - so you can see directly where the different
processes are sleeping.
Sanity-check that your System.map information (and thus the symbolic
conversion) looks to be ok: the processes that hang should show up in the
trace as being in __down_failed() or something like that. The only
reason for a hang with /proc/<pid>/ tends to be that some process would
have deadlocked on its MM semaphore or is somehow stuck inside its
critical region on something else.
Finally, try to pinpoint _which_ process it is. Usually most easily done
by simply seeing where it is that the /proc accesses get stuck, with
something simple like
        cd /proc
        for i in [0-9]*; do
                echo $i
                cat $i/stat > /dev/null
        done
and see what the last pid it printed out was (not that the above
guarantees that you found the thing, because there might be several
things. But it's one more piece to the puzzle).
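Since several processes may be stuck at once, a variation on the loop above can start every read in the background and then report all the entries whose read never finished. This is only a minimal sketch: the function name find_stuck_pids, the optional root-directory argument and the 2-second default wait are assumptions made for illustration, not anything from the original message.

```shell
#!/bin/sh
# find_stuck_pids [ROOT] [WAIT]
# Hypothetical helper: prints the name of every numeric entry under ROOT
# (default /proc) whose "stat" file could not be read within WAIT seconds
# (default 2).
find_stuck_pids() {
    root=${1:-/proc}
    wait_secs=${2:-2}
    readers=""
    for dir in "$root"/[0-9]*; do
        [ -d "$dir" ] || continue
        # Read in the background so one blocked read cannot stall the loop.
        cat "$dir/stat" > /dev/null 2>&1 &
        readers="$readers ${dir##*/}:$!"
    done
    # Give the healthy reads time to finish.
    sleep "$wait_secs"
    for entry in $readers; do
        name=${entry%%:*}
        reader=${entry##*:}
        if kill -0 "$reader" 2>/dev/null; then
            # Still running, so the read is blocked: report the entry.
            echo "$name"
            # Best effort only: a reader in uninterruptible sleep (state D)
            # will not go away until whatever it waits on is released.
            kill "$reader" 2>/dev/null
        fi
    done
}
```

Calling find_stuck_pids with no arguments scans /proc. On a machine showing the hang described in this thread, the blocked readers will most likely stay behind in D state until reboot, so this is something to run once, not in a loop.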
And send the information to the kernel mailing list, along with anything
else you might think of.
Linus
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-27 9:33 ` Shawn Starr
2001-01-27 11:26 ` John Sheahan
@ 2001-01-27 19:15 ` J Sloan
2001-01-27 23:28 ` David Ford
2001-01-27 21:14 ` Aaron Lehmann
2 siblings, 1 reply; 32+ messages in thread
From: J Sloan @ 2001-01-27 19:15 UTC (permalink / raw)
To: Shawn Starr; +Cc: David Ford, Aaron Lehmann, John Sheahan, linux-kernel
Just for the record, the system where I saw the problem
has only ext2 -
jjs
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-27 9:33 ` Shawn Starr
2001-01-27 11:26 ` John Sheahan
2001-01-27 19:15 ` J Sloan
@ 2001-01-27 21:14 ` Aaron Lehmann
2 siblings, 0 replies; 32+ messages in thread
From: Aaron Lehmann @ 2001-01-27 21:14 UTC (permalink / raw)
To: Shawn Starr; +Cc: David Ford, J Sloan, John Sheahan, linux-kernel
On Sat, Jan 27, 2001 at 04:33:42AM -0500, Shawn Starr wrote:
> Yes, I have ReiserFS as well...hrm...
I don't.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-27 19:15 ` J Sloan
@ 2001-01-27 23:28 ` David Ford
2001-01-28 0:22 ` Linus Torvalds
2001-01-28 0:42 ` J Sloan
0 siblings, 2 replies; 32+ messages in thread
From: David Ford @ 2001-01-27 23:28 UTC (permalink / raw)
To: J Sloan; +Cc: Shawn Starr, Aaron Lehmann, John Sheahan, linux-kernel
We've narrowed it down to "we're all running xmms" when it happened.
-d
J Sloan wrote:
> Just for the record, the system where I saw the problem
> has only ext2 -
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-27 16:19 ` Linus Torvalds
@ 2001-01-27 23:42 ` David Ford
0 siblings, 0 replies; 32+ messages in thread
From: David Ford @ 2001-01-27 23:42 UTC (permalink / raw)
To: LKML
At the time I had temporary access to my notebook and had a mismatched System.map
file :S
-d
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-27 23:28 ` David Ford
@ 2001-01-28 0:22 ` Linus Torvalds
2001-01-28 0:36 ` Shawn Starr
` (6 more replies)
2001-01-28 0:42 ` J Sloan
1 sibling, 7 replies; 32+ messages in thread
From: Linus Torvalds @ 2001-01-28 0:22 UTC (permalink / raw)
To: linux-kernel
In article <3A7359BB.7BBEE42A@linux.com>, David Ford <david@linux.com>
wrote:
>
>We've narrowed it down to "we're all running xmms" when it happened.
Does anybody have a clue about what is different with xmms?
Does it use KNI if it can, for example? We used to have a problem with
KNI+Athlons, for example.
It might also be that it's threading-related, and that XMMS is one of
the few things that uses threads. Things like that. I'm not an XMMS
user, can somebody who knows XMMS comment on things that it does that
are unusual?
Linus
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-28 0:22 ` Linus Torvalds
@ 2001-01-28 0:36 ` Shawn Starr
2001-01-28 0:43 ` David Ford
` (5 subsequent siblings)
6 siblings, 0 replies; 32+ messages in thread
From: Shawn Starr @ 2001-01-28 0:36 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel
This system is the following:
AcerOPEN AP53/AX Motherboard, Intel Pentium 200MHz w/o MMX (1996-1997)
Chipsets: 430HX, PIIX3 (EIDE)
64MB RAM EDO 60ns (Kingston brand)
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-27 23:28 ` David Ford
2001-01-28 0:22 ` Linus Torvalds
@ 2001-01-28 0:42 ` J Sloan
2001-01-28 0:44 ` Aaron Lehmann
2001-01-28 1:11 ` David Ford
1 sibling, 2 replies; 32+ messages in thread
From: J Sloan @ 2001-01-28 0:42 UTC (permalink / raw)
To: David Ford; +Cc: Shawn Starr, Aaron Lehmann, John Sheahan, linux-kernel
Sorry, there was no xmms involved here -
The behavior occurred while playing unreal tournament.
But at least the sound card was in use, FWIW -
jjs
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-28 0:22 ` Linus Torvalds
2001-01-28 0:36 ` Shawn Starr
@ 2001-01-28 0:43 ` David Ford
2001-01-28 1:05 ` David Ford
` (4 subsequent siblings)
6 siblings, 0 replies; 32+ messages in thread
From: David Ford @ 2001-01-28 0:43 UTC (permalink / raw)
To: LKML
Linus Torvalds wrote:
> In article <3A7359BB.7BBEE42A@linux.com>, David Ford <david@linux.com>
> wrote:
> >
> >We've narrowed it down to "we're all running xmms" when it happened.
>
> Does anybody have a clue about what is different with xmms?
Not sure.
> Does it use KNI if it can, for example? We used to have a problem with
> KNI+Athlons, for example.
>
> It might also be that it's threading-related, and that XMMS is one of
> the few things that uses threads. Things like that. I'm not an XMMS
> user, can somebody who knows XMMS comment on things that it does that
> are unusual?
If I were clued in enough to know KNI, I could say for certain. I am
assuming it's a form of MMX or related. My notebook is a mobile PII 366.
I'm stress testing it now with ac12. I originally had pre9 on it. There is
one difference other than that, I have Marcelo's bg aging patch on here which
seems to have improved responsiveness significantly but I'll save that for
another story.
I've triggered it, report follows in next email.
-d
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-28 0:42 ` J Sloan
@ 2001-01-28 0:44 ` Aaron Lehmann
2001-01-28 1:11 ` David Ford
1 sibling, 0 replies; 32+ messages in thread
From: Aaron Lehmann @ 2001-01-28 0:44 UTC (permalink / raw)
To: J Sloan; +Cc: David Ford, Shawn Starr, John Sheahan, linux-kernel
On Sat, Jan 27, 2001 at 04:42:45PM -0800, J Sloan wrote:
> But at least the sound card was in use, FWIW -
Not for me. My xmms was sitting idle when it froze.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-28 0:22 ` Linus Torvalds
2001-01-28 0:36 ` Shawn Starr
2001-01-28 0:43 ` David Ford
@ 2001-01-28 1:05 ` David Ford
2001-01-28 1:55 ` Linus Torvalds
2001-01-28 1:15 ` David Ford
` (3 subsequent siblings)
6 siblings, 1 reply; 32+ messages in thread
From: David Ford @ 2001-01-28 1:05 UTC (permalink / raw)
To: LKML
Unfortunately klogd reads /proc....erg.
So the following is a painstakingly slow hand translation, I'll only print
the D state entries unless someone asks otherwise.
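Picking out only the D state entries can also be done mechanically. The following is a rough sketch, assuming the raw dmesg text (without the syslog prefix) is fed on standard input and that each task line carries the state letter in the second column and a hex value in the third, as in the listing in this message. The function name d_state_tasks is made up for illustration.

```shell
#!/bin/sh
# d_state_tasks: hypothetical filter. Reads SysRq "Show State" output on
# stdin and prints only the task lines whose state column is D, together
# with the Call Trace lines that follow each of them.
d_state_tasks() {
    awk '
        # A task line: single capital letter in column 2, hex value in
        # column 3. Each task line switches printing on or off.
        $2 ~ /^[A-Z]$/ && $3 ~ /^[0-9A-F]+$/ { keep = ($2 == "D") }
        # Print the task line and everything after it until the next task.
        keep { print }
    '
}
```

Typical use would be something like `dmesg | d_state_tasks`, assuming dmesg itself still works on the affected machine.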
Prior to this:
XMMS is running playing star wars mpeg. (regular user) (frozen)
TOP is running (regular user) (frozen)
while [ 1 ]; do ls -laR /proc ; done (regular user) (frozen)
skill -9 xmms (root) (frozen)
X 4.0.2 running, scp of 600meg file over pegasus usb ethernet (10Mbit).
syslog caught:
Jan 27 16:42:26 nifty kernel: SysRq: Show State
Jan 27 16:42:26 nifty kernel:
Jan 27 16:42:26 nifty kernel: free sibling
Jan 27 16:42:26 nifty kernel: task PC stack pid father child younger older
Jan 27 16:42:26 nifty kernel: init S CBFEBF2C 3184 1 0 187 (NOTLB)
<end>
dmesg shows (only D state for brevity):
top D CA98B3DC 4440 219 158 (NOTLB)
Call Trace: [<c010791d>] [<c0107a68>] [<c02f73dd>] [<c014b5cb>] [<c01311e6>]
[<c0108d5f>] [<c010002b>]
c01078c8 T __down
c0107964 T __down_interruptible
c0107a28 T __down_trylock
c0107a60 T __down_failed
c0107a6c T __down_failed_interruptible
c02f6a00 T stext_lock
c02f827e A _etext
c014b578 t proc_info_read
c014b688 t mem_read
c0131150 T sys_read
c013121c T sys_write
c0108d2c T system_call
c0108d64 T ret_from_sys_call
c0100000 t startup_32
c0100139 t is486
xmms D CACC5EA8 4116 713 155 715 (NOTLB) 1493 674
Call Trace: [<c0124966>] [<c012412f>] [<c01242b8>] [<c0144138>] [<c014238e>]
[<c0131cd0>] [<c01236b2>]
[<c01239f2>] [<c01ac5ca>] [<c010d1f6>] [<c0108e7c>] [<c0108d5f>]
c01248e4 T ___wait_on_page
c0124984 t __lock_page
c01240dc t truncate_list_pages
c0124268 T truncate_inode_pages
c01242d4 t writeout_one_page
c0144094 T remove_inode_hash
c01440a8 T iput
c01441fc T force_delete
c01422a0 T dput
c01423e4 T d_invalidate
c0131c58 T fput
c0131d28 T fget
c012365c t unmap_fixup
c0123788 t free_pgtables
c012380c T do_munmap
c0123a5c T sys_munmap
...ask if you want more
xmms S C2979F30 0 715 713 725 (NOTLB)
Call Trace: [<c01142fb>] [<c0114240>] [<c013f95e>] [<c013fb53>] [<c0119fff>]
[<c0108d5f>]
xmms S C2B75F2C 1156 716 715 (NOTLB) 718
Call Trace: [<c01142fb>] [<c0114240>] [<c013f341>] [<c013f6e0>] [<c0108d5f>]
xmms S 7FFFFFFF 0 718 715 (NOTLB) 719 716
Call Trace: [<c011429f>] [<c013f341>] [<c013f6e0>] [<c0108d5f>]
xmms S C2975F88 832 719 715 (NOTLB) 725 718
Call Trace: [<c01142fb>] [<c0114240>] [<c011d468>] [<c0108d5f>] [<c010002b>]
xmms S CA8D7F88 2672 725 715 (NOTLB) 719
Call Trace: [<c01142fb>] [<c0114240>] [<c011d468>] [<c0108d5f>]
c0114240 t process_timeout
c0114288 T schedule_timeout
c011431c T schedule_tail
c0113d70 t remap_area_pages
c0114020 T __ioremap
c0108d2c T system_call
c0108d64 T ret_from_sys_call
ls D CA98B3DC 0 1896 222 (NOTLB)
Call Trace: [<c010791d>] [<c0107a68>] [<c02f73b5>] [<c014b95a>] [<c01389a2>]
[<c0108d5f>]
skill D CA98B3DC 0 1897 187 (NOTLB)
Call Trace: [<c010791d>] [<c0107a68>] [<c02f73dd>] [<c014b5cb>] [<c01311e6>]
[<c0108d5f>]
c0107964 T __down_interruptible
c0107a28 T __down_trylock
c0107a60 T __down_failed
c0107a6c T __down_failed_interruptible
c02f6a00 T stext_lock
c02f827e A _etext
...
SysRq: Show Memory
Mem-info:
Free pages: 2240kB ( 0kB HighMem)
( Active: 4153, inactive_dirty: 198, inactive_clean: 1077, free: 560 (383 766 1149) )
31*4kB 1*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB = 660kB)
125*4kB 5*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 0*2048kB = 1580kB)
= 0kB)
Swap cache: add 3165, delete 547, find 25/124
Free swap: 53104kB
49136 pages of RAM
0 pages of HIGHMEM
1798 reserved pages
2619 pages shared
2618 pages swap cached
0 pages in page table cache
Buffer memory: 1276kB
-d
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-28 0:42 ` J Sloan
2001-01-28 0:44 ` Aaron Lehmann
@ 2001-01-28 1:11 ` David Ford
2001-01-28 1:30 ` J Sloan
1 sibling, 1 reply; 32+ messages in thread
From: David Ford @ 2001-01-28 1:11 UTC (permalink / raw)
To: J Sloan; +Cc: Shawn Starr, Aaron Lehmann, John Sheahan, linux-kernel
On 2.4.0-ac12, I played music for about 30 minutes without any problems. I started up an mpeg in xmms and it
locked in short order. I'm sure now that it has something to do with the graphics. What DGA or other config
options do you have enabled for your game?
What video and sound card?
I have an ATI Rage LT Pro AGP-133 according to lspci.
-d
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-28 0:22 ` Linus Torvalds
` (2 preceding siblings ...)
2001-01-28 1:05 ` David Ford
@ 2001-01-28 1:15 ` David Ford
[not found] ` <fa.ikhc52v.e68327@ifi.uio.no>
` (2 subsequent siblings)
6 siblings, 0 replies; 32+ messages in thread
From: David Ford @ 2001-01-28 1:15 UTC (permalink / raw)
To: LKML
It is important to note that when I hit the magic key and rebooted (SUB), a
split second before it rebooted, a stalled 'lspci' snapped back to life and
printed out my expected data.
-d
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-28 1:11 ` David Ford
@ 2001-01-28 1:30 ` J Sloan
2001-01-28 1:51 ` Shawn Starr
0 siblings, 1 reply; 32+ messages in thread
From: J Sloan @ 2001-01-28 1:30 UTC (permalink / raw)
To: David Ford; +Cc: Shawn Starr, Aaron Lehmann, John Sheahan, linux-kernel
OK, here are the details you asked about:
Soundblaster Awe 32 sound card
Voodoo 3 pci video card
Running Xfree86-4.0.0 (rpms from 3dfx.com)
Playing unreal tournament, no special game
options, just 800x600 graphics @ 16 bits.
To recap, the symptoms (hung ps, etc) occurred
on kernel 2.4.1-pre8 + low latency patches. (but
I don't think the low latency patches had anything
to do with it, based on the other reports)
Hope this helps
jjs
David Ford wrote:
> On 2.4.0-ac12, I played music for about 30 minutes without any problems. I started up an mpeg in xmms and it
> locked in short order. I'm sure now that it has something to do with the graphics. What DGA or other config
> options do you have enabled for your game?
>
> What video and sound card?
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: ps hang in 241-pre10
2001-01-28 1:30 ` J Sloan
@ 2001-01-28 1:51 ` Shawn Starr
0 siblings, 0 replies; 32+ messages in thread
From: Shawn Starr @ 2001-01-28 1:51 UTC (permalink / raw)
To: J Sloan; +Cc: David Ford, Shawn Starr, Aaron Lehmann, John Sheahan,
linux-kernel
Yes, I should also mention that I have a SoundBlaster AWE32 as well (0MB on the daughterboard).
* Re: ps hang in 241-pre10
2001-01-28 1:05 ` David Ford
@ 2001-01-28 1:55 ` Linus Torvalds
2001-01-28 2:20 ` Marcelo Tosatti
0 siblings, 1 reply; 32+ messages in thread
From: Linus Torvalds @ 2001-01-28 1:55 UTC (permalink / raw)
To: linux-kernel
In article <3A737061.F1B914A3@linux.com>, David Ford <david@linux.com> wrote:
>Unfortunately klogd reads /proc....erg.
>
>So the following is a painstakingly slow hand translation, I'll only print
>the D state entries unless someone asks otherwise.
You seem to be pretty much able to reproduce this at will, right?
I'd really like to see the raw System.map and dmesg output if your
syslogd doesn't do a proper job of getting the symbols interpreted: just
send the things by email, and I'll put something together. It's too
hard to interpret your half-way decoded thing, and I really want to see
what this xmms thing is doing..
>xmms D CACC5EA8 4116 713 155 715 (NOTLB) 1493 674
>Call Trace: [<c0124966>] [<c012412f>] [<c01242b8>] [<c0144138>] [<c014238e>]
>[<c0131cd0>] [<c01236b2>]
> [<c01239f2>] [<c01ac5ca>] [<c010d1f6>] [<c0108e7c>] [<c0108d5f>]
>
>c01248e4 T ___wait_on_page
>c0124984 t __lock_page
>
>c01240dc t truncate_list_pages
>c0124268 T truncate_inode_pages
>c01242d4 t writeout_one_page
This is the smoking gun here, I bet, but I'd like to make sure I see the
whole thing. I don't see _why_ we'd have deadlocked on __wait_on_page(),
but I think this is the thread that hangs on to the mm semaphore.
Linus
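For anyone repeating the hand translation quoted above: each call-trace address maps to the nearest preceding symbol in System.map. The sketch below is only an illustration in user-space Python, not a tool from this thread. The names `load_symbols` and `resolve` are hypothetical, and the table holds just the five symbol lines David quoted, so the sizes it reports assume those symbols are adjacent in the full map.

```python
import bisect

# Symbol lines copied from David Ford's message (address, type, name).
SYSTEM_MAP = """\
c01240dc t truncate_list_pages
c0124268 T truncate_inode_pages
c01242d4 t writeout_one_page
c01248e4 T ___wait_on_page
c0124984 t __lock_page
"""


def load_symbols(text):
    """Parse System.map text into a list of (address, name), sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return 'name+offset/size' for the symbol that contains addr, or None."""
    addresses = [a for a, _ in symbols]
    i = bisect.bisect_right(addresses, addr) - 1
    if i < 0:
        return None  # addr lies below the first known symbol
    base, name = symbols[i]
    if i + 1 < len(symbols):
        # Size is estimated as the distance to the next symbol (an assumption).
        size = symbols[i + 1][0] - base
        return f"{name}+{addr - base}/{size}"
    return f"{name}+{addr - base}"


if __name__ == "__main__":
    table = load_symbols(SYSTEM_MAP)
    # First three frames of the xmms trace quoted above.
    for frame in (0xC0124966, 0xC012412F, 0xC01242B8):
        print(f"{frame:08x} {resolve(table, frame)}")
```

With these inputs the first frame, c0124966, resolves to ___wait_on_page+130/160. A real decode needs the complete System.map for the running kernel.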
* Re: ps hang in 241-pre10
2001-01-28 1:55 ` Linus Torvalds
@ 2001-01-28 2:20 ` Marcelo Tosatti
2001-01-28 4:37 ` Linus Torvalds
0 siblings, 1 reply; 32+ messages in thread
From: Marcelo Tosatti @ 2001-01-28 2:20 UTC (permalink / raw)
To: Linus Torvalds; +Cc: lkml, Jens Axboe
(ugh, sorry about last mail)
On 27 Jan 2001, Linus Torvalds wrote:
> In article <3A737061.F1B914A3@linux.com>, David Ford <david@linux.com> wrote:
> >Unfortunately klogd reads /proc....erg.
> >
> >So the following is a painstakingly slow hand translation, I'll only print
> >the D state entries unless someone asks otherwise.
>
> You seem to be pretty much able to reproduce this at will, right?
>
> I'd really like to see the raw System.map and dmesg output if your
> syslogd doesn't do a proper job of getting the symbols interpreted: just
> send the things by email, and I'll put something together. It's too
> hard to interpret your half-way decoded thing, and I really want to see
> what this xmms thing is doing..
>
> >xmms D CACC5EA8 4116 713 155 715 (NOTLB) 1493 674
> >Call Trace: [<c0124966>] [<c012412f>] [<c01242b8>] [<c0144138>] [<c014238e>]
> >[<c0131cd0>] [<c01236b2>]
> > [<c01239f2>] [<c01ac5ca>] [<c010d1f6>] [<c0108e7c>] [<c0108d5f>]
> >
> >c01248e4 T ___wait_on_page
> >c0124984 t __lock_page
> >
> >c01240dc t truncate_list_pages
> >c0124268 T truncate_inode_pages
> >c01242d4 t writeout_one_page
>
> This is the smoking gun here, I bet, but I'd like to make sure I see the
> whole thing. I don't see _why_ we'd have deadlocked on __wait_on_page(),
> but I think this is the thread that hangs on to the mm semaphore.
I was able to reproduce it here with dbench.
Nothing is locked except this dbench thread (the only dbench thread):
dbench D C1C9FE64 5200 1013 1 (L-TLB) 1370 785
Call Trace: [___wait_on_page+130/160] [truncate_list_pages+100/404] [truncate_inode_pages+93/128] [iput+162/360] [dput+262/356] [fput+121/232] [exit_mmap+218/292]
[mmput+56/80] [do_exit+208/680] [do_signal+566/656] [dput+25/356] [path_release+13/60] [sys_newstat+100/112] [sys_read+188/196] [signal_return+20/24]
* Re: ps hang in 241-pre10
2001-01-28 4:37 ` Linus Torvalds
@ 2001-01-28 3:42 ` Marcelo Tosatti
2001-01-28 4:01 ` Marcelo Tosatti
2001-01-28 6:10 ` Shawn Starr
0 siblings, 2 replies; 32+ messages in thread
From: Marcelo Tosatti @ 2001-01-28 3:42 UTC (permalink / raw)
To: Linus Torvalds; +Cc: lkml, Jens Axboe
On Sat, 27 Jan 2001, Linus Torvalds wrote:
>
>
> On Sun, 28 Jan 2001, Marcelo Tosatti wrote:
> > >
> > > This is the smoking gun here, I bet, but I'd like to make sure I see the
> > > whole thing. I don't see _why_ we'd have deadlocked on __wait_on_page(),
> > > but I think this is the thread that hangs on to the mm semaphore.
> >
> > I was able to reproduce it here with dbench.
> >
> > Nothing is locked except this dbench thread (the only dbench thread):
> >
> > dbench D C1C9FE64 5200 1013 1 (L-TLB) 1370 785
> > Call Trace: [___wait_on_page+130/160] [truncate_list_pages+100/404] [truncate_inode_pages+93/128] [iput+162/360] [dput+262/356] [fput+121/232] [exit_mmap+218/292]
> > [mmput+56/80] [do_exit+208/680] [do_signal+566/656] [dput+25/356] [path_release+13/60] [sys_newstat+100/112] [sys_read+188/196] [signal_return+20/24]
>
> Ok, this definitely seems to be the pattern.
>
> I don't see _what_ is going on, though.
>
> I know of one "known bug" in pre10: if you run out of swap-space with
> shared memory segments, it will do the wrong thing (return 1 without
> unlocking the page). xmms might trigger this, but I didn't think that
> dbench used shared memory?
It does. Bingo.
I'm not able to reproduce the problem here with your patch.
Btw, there is another bug in shm_writepage() where it does not set the
page dirty in case of failure...
* Re: ps hang in 241-pre10
[not found] ` <fa.ikhc52v.e68327@ifi.uio.no>
@ 2001-01-28 3:46 ` Håvard Kvålen
0 siblings, 0 replies; 32+ messages in thread
From: Håvard Kvålen @ 2001-01-28 3:46 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel
> Does anybody have a clue about what is different with xmms?
>
> Does it use KNI if it can, for example? We used to have a problem
> with KNI+Athlons, for example.
No, it doesn't.
> It might also be that it's threading-related, and that XMMS is one
> of the few things that uses threads. Things like that. I'm not an
> XMMS user, can somebody who knows XMMS comment on things that it
> does that are unusual?
Yes, threads could be the thing that makes a difference. I can't
think of anything else that is special about XMMS.
--
Håvard Kvålen
* Re: ps hang in 241-pre10
2001-01-28 3:42 ` Marcelo Tosatti
@ 2001-01-28 4:01 ` Marcelo Tosatti
2001-01-28 17:21 ` Linus Torvalds
2001-01-28 6:10 ` Shawn Starr
1 sibling, 1 reply; 32+ messages in thread
From: Marcelo Tosatti @ 2001-01-28 4:01 UTC (permalink / raw)
To: Linus Torvalds; +Cc: lkml, Jens Axboe
On Sun, 28 Jan 2001, Marcelo Tosatti wrote:
> On Sat, 27 Jan 2001, Linus Torvalds wrote:
>
> >
> >
> > On Sun, 28 Jan 2001, Marcelo Tosatti wrote:
> > > >
> > > > This is the smoking gun here, I bet, but I'd like to make sure I see the
> > > > whole thing. I don't see _why_ we'd have deadlocked on __wait_on_page(),
> > > > but I think this is the thread that hangs on to the mm semaphore.
> > >
> > > I was able to reproduce it here with dbench.
> > >
> > > Nothing is locked except this dbench thread (the only dbench thread):
> > >
> > > dbench D C1C9FE64 5200 1013 1 (L-TLB) 1370 785
> > > Call Trace: [___wait_on_page+130/160] [truncate_list_pages+100/404] [truncate_inode_pages+93/128] [iput+162/360] [dput+262/356] [fput+121/232] [exit_mmap+218/292]
> > > [mmput+56/80] [do_exit+208/680] [do_signal+566/656] [dput+25/356] [path_release+13/60] [sys_newstat+100/112] [sys_read+188/196] [signal_return+20/24]
> >
> > Ok, this definitely seems to be the pattern.
> >
> > I don't see _what_ is going on, though.
> >
> > I know of one "known bug" in pre10: if you run out of swap-space with
> > shared memory segments, it will do the wrong thing (return 1 without
> > unlocking the page). xmms might trigger this, but I didn't think that
> > dbench used shared memory?
>
> It does. Bingo.
>
> I'm not able to reproduce the problem here with your patch.
>
> Btw, there is another bug in shm_writepage() where it does not set the
> page dirty in case of failure...
Why don't you just put set_page_dirty() back in page_launder() in case
writepage() fails?
Otherwise you'll have to do it in every specific implementation of
writepage().
* Re: ps hang in 241-pre10
2001-01-28 2:20 ` Marcelo Tosatti
@ 2001-01-28 4:37 ` Linus Torvalds
2001-01-28 3:42 ` Marcelo Tosatti
0 siblings, 1 reply; 32+ messages in thread
From: Linus Torvalds @ 2001-01-28 4:37 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: lkml, Jens Axboe
On Sun, 28 Jan 2001, Marcelo Tosatti wrote:
> >
> > This is the smoking gun here, I bet, but I'd like to make sure I see the
> > whole thing. I don't see _why_ we'd have deadlocked on __wait_on_page(),
> > but I think this is the thread that hangs on to the mm semaphore.
>
> I was able to reproduce it here with dbench.
>
> Nothing is locked except this dbench thread (the only dbench thread):
>
> dbench D C1C9FE64 5200 1013 1 (L-TLB) 1370 785
> Call Trace: [___wait_on_page+130/160] [truncate_list_pages+100/404] [truncate_inode_pages+93/128] [iput+162/360] [dput+262/356] [fput+121/232] [exit_mmap+218/292]
> [mmput+56/80] [do_exit+208/680] [do_signal+566/656] [dput+25/356] [path_release+13/60] [sys_newstat+100/112] [sys_read+188/196] [signal_return+20/24]
Ok, this definitely seems to be the pattern.
I don't see _what_ is going on, though.
I know of one "known bug" in pre10: if you run out of swap-space with
shared memory segments, it will do the wrong thing (return 1 without
unlocking the page). xmms might trigger this, but I didn't think that
dbench used shared memory?
There's also an ugliness in the truncate ordering. I don't think it should
matter, but I do believe it's conceptually wrong as-is.
Does this patch make any difference at all?
Linus
-----
diff -u --recursive --new-file pre10/linux/mm/memory.c linux/mm/memory.c
--- pre10/linux/mm/memory.c Sat Jan 27 10:53:39 2001
+++ linux/mm/memory.c Sat Jan 27 19:12:35 2001
@@ -945,7 +945,6 @@
if (inode->i_size < offset)
goto do_expand;
inode->i_size = offset;
- truncate_inode_pages(mapping, offset);
spin_lock(&mapping->i_shared_lock);
if (!mapping->i_mmap && !mapping->i_mmap_shared)
goto out_unlock;
@@ -960,8 +959,7 @@
out_unlock:
spin_unlock(&mapping->i_shared_lock);
- /* this should go into ->truncate */
- inode->i_size = offset;
+ truncate_inode_pages(mapping, offset);
if (inode->i_op && inode->i_op->truncate)
inode->i_op->truncate(inode);
return;
diff -u --recursive --new-file pre10/linux/mm/shmem.c linux/mm/shmem.c
--- pre10/linux/mm/shmem.c Sat Jan 27 10:53:39 2001
+++ linux/mm/shmem.c Sat Jan 27 19:50:08 2001
@@ -217,8 +217,11 @@
info = &page->mapping->host->u.shmem_i;
swap = __get_swap_page(2);
- if (!swap.val)
- return 1;
+ if (!swap.val) {
+ set_page_dirty(page);
+ UnlockPage(page);
+ return -ENOMEM;
+ }
spin_lock(&info->lock);
shmem_recalc_inode(page->mapping->host);
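A toy user-space model of the shmem.c hunk above may help show why the missing unlock matters. None of this is kernel code: the `Page` class, the two `writepage_*` functions and `launder_then_wait` are hypothetical names, and the model assumes (inferred from the patch, not confirmed here) that the caller locks the page before calling writepage and expects writepage to unlock it.

```python
import errno
import threading


class Page:
    """Toy stand-in for a page: a lock bit and a dirty bit."""

    def __init__(self):
        self.dirty = False
        self._unlocked = threading.Event()
        self._unlocked.set()  # pages start out unlocked

    def lock(self):
        self._unlocked.clear()

    def unlock(self):
        self._unlocked.set()

    def wait_on_page(self, timeout):
        """Return True if the page became unlocked within `timeout` seconds."""
        return self._unlocked.wait(timeout)


def writepage_pre10(page, swap_available):
    """Model of the pre10 failure path: return 1 with the page still locked."""
    if not swap_available:
        return 1  # bug: nobody will ever unlock this page
    page.unlock()  # stands in for I/O completion unlocking the page
    return 0


def writepage_patched(page, swap_available):
    """Model of the patched failure path: re-dirty, unlock, report -ENOMEM."""
    if not swap_available:
        page.dirty = True
        page.unlock()
        return -errno.ENOMEM
    page.unlock()
    return 0


def launder_then_wait(writepage, swap_available, timeout=0.05):
    """Lock a page, call writepage, then wait on the page as truncate would."""
    page = Page()
    page.lock()  # assumed: the caller takes the page lock first
    result = writepage(page, swap_available)
    return result, page.wait_on_page(timeout)


if __name__ == "__main__":
    print("pre10, no swap:  ", launder_then_wait(writepage_pre10, False))
    print("patched, no swap:", launder_then_wait(writepage_patched, False))
```

In the model the pre10 variant leaves the waiter timing out, which is the user-space analogue of a task stuck in ___wait_on_page. The real wait has no timeout, hence the D state.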
* Re: ps hang in 241-pre10
2001-01-28 3:42 ` Marcelo Tosatti
2001-01-28 4:01 ` Marcelo Tosatti
@ 2001-01-28 6:10 ` Shawn Starr
1 sibling, 0 replies; 32+ messages in thread
From: Shawn Starr @ 2001-01-28 6:10 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Linus Torvalds, lkml, Jens Axboe
The patch appears to work:
for i in [0-9]*; do echo $i; cat $i/stat > /dev/null; done
completes successfully with xmms running at "real-time" priority.
Shawn.
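On a kernel that still has the bug, the loop above will itself block on the first stuck PID. A hedged alternative is to do each read in a forked child and give up waiting after a timeout. The sketch below is user-space Python with hypothetical names (`read_finishes`, `find_stuck_pids`), and the 2 second timeout is an arbitrary assumption. A child blocked in uninterruptible sleep cannot be killed, so the script only stops waiting for it.

```python
import os
import signal
import time


def read_finishes(path, timeout):
    """Read `path` in a forked child; return True if the child exits in time."""
    pid = os.fork()
    if pid == 0:
        # Child: attempt the read, then exit immediately.
        status = 1
        try:
            with open(path, "rb") as f:
                f.read()
            status = 0
        except OSError:
            pass  # e.g. the process went away before we read it
        finally:
            os._exit(status)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        finished, _ = os.waitpid(pid, os.WNOHANG)
        if finished == pid:
            return True
        time.sleep(0.01)
    # Still blocked: request a kill, which only takes effect once the
    # child leaves uninterruptible sleep (if it ever does).
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        pass
    return False


def find_stuck_pids(proc_root="/proc", timeout=2.0):
    """Return the PIDs whose stat file could not be read within `timeout`."""
    stuck = []
    pids = sorted((e for e in os.listdir(proc_root) if e.isdigit()), key=int)
    for pid in pids:
        if not read_finishes(os.path.join(proc_root, pid, "stat"), timeout):
            stuck.append(int(pid))
    return stuck


if __name__ == "__main__":
    print("PIDs with unreadable stat:", find_stuck_pids())
```

A read that fails quickly (for example because the process has already exited) counts as finished, so only reads that never return are reported.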
Marcelo Tosatti wrote:
> On Sat, 27 Jan 2001, Linus Torvalds wrote:
>
> >
> >
> > On Sun, 28 Jan 2001, Marcelo Tosatti wrote:
> > > >
> > > > This is the smoking gun here, I bet, but I'd like to make sure I see the
> > > > whole thing. I don't see _why_ we'd have deadlocked on __wait_on_page(),
> > > > but I think this is the thread that hangs on to the mm semaphore.
> > >
> > > I was able to reproduce it here with dbench.
> > >
> > > Nothing is locked except this dbench thread (the only dbench thread):
> > >
> > > dbench D C1C9FE64 5200 1013 1 (L-TLB) 1370 785
> > > Call Trace: [___wait_on_page+130/160] [truncate_list_pages+100/404] [truncate_inode_pages+93/128] [iput+162/360] [dput+262/356] [fput+121/232] [exit_mmap+218/292]
> > > [mmput+56/80] [do_exit+208/680] [do_signal+566/656] [dput+25/356] [path_release+13/60] [sys_newstat+100/112] [sys_read+188/196] [signal_return+20/24]
> >
> > Ok, this definitely seems to be the pattern.
> >
> > I don't see _what_ is going on, though.
> >
> > I know of one "known bug" in pre10: if you run out of swap-space with
> > shared memory segments, it will do the wrong thing (return 1 without
> > unlocking the page). xmms might trigger this, but I didn't think that
> > dbench used shared memory?
>
> It does. Bingo.
>
> I'm not able to reproduce the problem here with your patch.
>
> Btw, there is another bug in shm_writepage() where it does not set the
> page dirty in case of failure...
>
* Re: ps hang in 241-pre10
2001-01-28 0:22 ` Linus Torvalds
` (4 preceding siblings ...)
[not found] ` <fa.ikhc52v.e68327@ifi.uio.no>
@ 2001-01-28 8:59 ` James Sutherland
2001-01-29 15:08 ` Zdenek Kabelac
6 siblings, 0 replies; 32+ messages in thread
From: James Sutherland @ 2001-01-28 8:59 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel
On 27 Jan 2001, Linus Torvalds wrote:
> In article <3A7359BB.7BBEE42A@linux.com>, David Ford <david@linux.com>
> wrote:
> >
> >We've narrowed it down to "we're all running xmms" when it happened.
>
> Does anybody have a clue about what is different with xmms?
>
> Does it use KNI if it can, for example? We used to have a problem with
> KNI+Athlons, for example.
Not KNI, I don't think, but 1.2.4 did add support for 3DNow!, with
auto-detection of CPU type. Disabled by default, but available. Are there
any 3DNow! issues?
> It might also be that it's threading-related, and that XMMS is one of
> the few things that uses threads. Things like that. I'm not an XMMS
> user, can somebody who knows XMMS comment on things that it does that
> are unusual?
Always uses threads, can use 3DNow!, DGA and realtime priority. Can also
do direct hardware access to some sound cards (incl. SB16), but I haven't
looked at that one closely.
James.
* Re: ps hang in 241-pre10
2001-01-28 4:01 ` Marcelo Tosatti
@ 2001-01-28 17:21 ` Linus Torvalds
0 siblings, 0 replies; 32+ messages in thread
From: Linus Torvalds @ 2001-01-28 17:21 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: lkml, Jens Axboe
On Sun, 28 Jan 2001, Marcelo Tosatti wrote:
>
> Why don't you just put set_page_dirty() back in page_launder() in case
> writepage() fails?
Because an EIO or similar should _not_ be re-tried or kept dirty.
Imagine a bad user that goes over his quota on purpose, and then every
single write will always return an error. What should we do? Let him eat
all physical memory? I don't think so.
Write-out errors will be ignored. We _might_ send a signal or something,
but considering the fact that we don't even know who caused the dirty page
in the first place, even that is kind of hard.
Shared memory and out-of-swap is special - the shared memory code is
supposed to check that we have enough memory before it even allocates
anything.
Linus
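The memory argument above can be illustrated with a small simulation. This is a toy model, not page_launder(): the function names, the page labels and the round counts are all hypothetical, and every write is assumed to fail permanently with EIO, as in the over-quota scenario described.

```python
import errno


def failing_writepage(page):
    """Stand-in for a writepage() that always fails, e.g. a user over quota."""
    return -errno.EIO


def launder(pages, writepage, redirty_on_error):
    """One laundering pass; return the pages that stay pinned in memory."""
    still_pinned = []
    for page in pages:
        err = writepage(page)
        if err and redirty_on_error:
            still_pinned.append(page)  # kept dirty, so it cannot be freed
        # Otherwise the page is released: either it was written out,
        # or the error is ignored and the data is dropped.
    return still_pinned


def simulate(rounds, pages_per_round, redirty_on_error):
    """Dirty new pages each round, launder, and return the pinned page count."""
    pinned = []
    for r in range(rounds):
        pinned.extend(f"page-{r}-{i}" for i in range(pages_per_round))
        pinned = launder(pinned, failing_writepage, redirty_on_error)
    return len(pinned)


if __name__ == "__main__":
    print("re-dirty on error:", simulate(10, 100, redirty_on_error=True))
    print("ignore errors:    ", simulate(10, 100, redirty_on_error=False))
```

With re-dirtying, the pinned count grows by every page the user touches. With errors ignored it stays at zero, at the cost of losing the data.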
* Re: ps hang in 241-pre10
2001-01-28 0:22 ` Linus Torvalds
` (5 preceding siblings ...)
2001-01-28 8:59 ` James Sutherland
@ 2001-01-29 15:08 ` Zdenek Kabelac
6 siblings, 0 replies; 32+ messages in thread
From: Zdenek Kabelac @ 2001-01-29 15:08 UTC (permalink / raw)
To: Linus Torvalds
Linus Torvalds wrote:
>
> In article <3A7359BB.7BBEE42A@linux.com>, David Ford <david@linux.com>
> wrote:
> >
> >We've narrowed it down to "we're all running xmms" when it happened.
>
> Does anybody have a clue about what is different with xmms?
>
> Does it use KNI if it can, for example? We used to have a problem with
Seeing this, I'll add my report here too. I was burning an audio CD
last week, and while I was moving the slider the system locked up. I
think the kernel version was -ac7. I then switched to pre8 and played a
DivX file while burning four other CDs with no problem.
My system is an SMP BP6 with an SB Live using the kernel's emu driver.
--
There are three types of people in the world:
those who can count, and those who can't.
Zdenek Kabelac http://i.am/kabi/ kabi@i.am {debian.org; fi.muni.cz}
Thread overview: 32+ messages
2001-01-27 4:34 ps hang in 241-pre10 John Sheahan
2001-01-27 4:43 ` Aaron Lehmann
2001-01-27 7:03 ` Shawn Starr
2001-01-27 8:06 ` J Sloan
2001-01-27 8:24 ` David Ford
2001-01-27 9:33 ` Shawn Starr
2001-01-27 11:26 ` John Sheahan
2001-01-27 19:15 ` J Sloan
2001-01-27 23:28 ` David Ford
2001-01-28 0:22 ` Linus Torvalds
2001-01-28 0:36 ` Shawn Starr
2001-01-28 0:43 ` David Ford
2001-01-28 1:05 ` David Ford
2001-01-28 1:55 ` Linus Torvalds
2001-01-28 2:20 ` Marcelo Tosatti
2001-01-28 4:37 ` Linus Torvalds
2001-01-28 3:42 ` Marcelo Tosatti
2001-01-28 4:01 ` Marcelo Tosatti
2001-01-28 17:21 ` Linus Torvalds
2001-01-28 6:10 ` Shawn Starr
2001-01-28 1:15 ` David Ford
[not found] ` <fa.ikhc52v.e68327@ifi.uio.no>
2001-01-28 3:46 ` Håvard Kvålen
2001-01-28 8:59 ` James Sutherland
2001-01-29 15:08 ` Zdenek Kabelac
2001-01-28 0:42 ` J Sloan
2001-01-28 0:44 ` Aaron Lehmann
2001-01-28 1:11 ` David Ford
2001-01-28 1:30 ` J Sloan
2001-01-28 1:51 ` Shawn Starr
2001-01-27 21:14 ` Aaron Lehmann
2001-01-27 16:19 ` Linus Torvalds
2001-01-27 23:42 ` David Ford