* pte_chain leak in rmap code (2.5.31)
@ 2002-08-12 13:45 Christian Ehrhardt
2002-08-12 14:21 ` Rik van Riel
0 siblings, 1 reply; 5+ messages in thread
From: Christian Ehrhardt @ 2002-08-12 13:45 UTC (permalink / raw)
To: riel; +Cc: linux-kernel
Hi,
While browsing through rmap.c in 2.5.31 I found what looks like a
bug introduced by the PageDirect optimizations. Look at the following
piece of code in try_to_unmap:
for (pc = page->pte.chain; pc; pc = next_pc) {
next_pc = pc->next;
switch (try_to_unmap_one(page, pc->ptep)) {
case SWAP_SUCCESS:
/* Free the pte_chain struct. */
pte_chain_free(pc, prev_pc, page);
break;
case SWAP_AGAIN:
/* Skip this pte, remembering status. */
prev_pc = pc;
ret = SWAP_AGAIN;
continue;
case SWAP_FAIL:
ret = SWAP_FAIL;
break;
case SWAP_ERROR:
ret = SWAP_ERROR;
break;
}
}
Note the strange use of continue and break which both achieve the same!
What was meant to happen (judging from rmap-13c) is that we break
out of the for-Loop once SWAP_FAIL or SWAP_ERROR is returned from
try_to_unmap_one. However, this doesn't happen and a subsequent call
to pte_chain_free will use the wrong value for prev_pc.
The impact seems to be at least leakage of pte_chain structures.
I propose the following (untested) patch:
--- rmap.c Sun Aug 11 03:41:54 2002
+++ /home/ehrhardt/rmap.c Mon Aug 12 15:49:25 2002
@@ -336,9 +336,11 @@
continue;
case SWAP_FAIL:
ret = SWAP_FAIL;
+ pc = NULL
break;
case SWAP_ERROR:
ret = SWAP_ERROR;
+ pc = NULL
break;
}
}
regards Christian Ehrhardt
--
THAT'S ALL FOLKS!
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: pte_chain leak in rmap code (2.5.31)
2002-08-12 13:45 pte_chain leak in rmap code (2.5.31) Christian Ehrhardt
@ 2002-08-12 14:21 ` Rik van Riel
2002-08-13 0:44 ` Thomas Molina
0 siblings, 1 reply; 5+ messages in thread
From: Rik van Riel @ 2002-08-12 14:21 UTC (permalink / raw)
To: Christian Ehrhardt; +Cc: linux-kernel
On Mon, 12 Aug 2002, Christian Ehrhardt wrote:
> Note the strange use of continue and break which both achieve the same!
> What was meant to happen (judging from rmap-13c) is that we break
> out of the for-Loop once SWAP_FAIL or SWAP_ERROR is returned from
> try_to_unmap_one. However, this doesn't happen and a subsequent call
> to pte_chain_free will use the wrong value for prev_pc.
Excellent hunting! Thank you!
Your fix should work too, although in my opinion it's a
little bit too subtle, so I've changed it into:
case SWAP_FAIL:
ret = SWAP_FAIL;
goto give_up;
case SWAP_ERROR:
ret = SWAP_ERROR;
goto give_up;
}
}
give_up:
This is going into 2.4-rmap and 2.5 right now.
thanks,
Rik
--
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: pte_chain leak in rmap code (2.5.31)
2002-08-12 14:21 ` Rik van Riel
@ 2002-08-13 0:44 ` Thomas Molina
2002-08-13 1:09 ` Rik van Riel
0 siblings, 1 reply; 5+ messages in thread
From: Thomas Molina @ 2002-08-13 0:44 UTC (permalink / raw)
To: linux-kernel
On Mon, 12 Aug 2002, Rik van Riel wrote:
> On Mon, 12 Aug 2002, Christian Ehrhardt wrote:
>
> > Note the strange use of continue and break which both achieve the same!
> > What was meant to happen (judging from rmap-13c) is that we break
> > out of the for-Loop once SWAP_FAIL or SWAP_ERROR is returned from
> > try_to_unmap_one. However, this doesn't happen and a subsequent call
> > to pte_chain_free will use the wrong value for prev_pc.
>
> Excellent hunting! Thank you!
>
> Your fix should work too, although in my opinion it's a
> little bit too subtle, so I've changed it into:
>
> case SWAP_FAIL:
> ret = SWAP_FAIL;
> goto give_up;
> case SWAP_ERROR:
> ret = SWAP_ERROR;
> goto give_up;
> }
> }
> give_up:
Any chance this is the cause of the following?
---------------extract-----------------------
Subject: Re: [patch 1/21] random fixes
From: Adam Kropelin <akropel1@rochester.rr.com>
Date: 2002-08-12 2:54:31
FYI, just got this while un-tarring a kernel tree with
2.5.31+everything.gz:
(no nvidia ;)
Date: Sun, 11 Aug 2002 20:40:31 -0700
From: Andrew Morton <akpm@zip.com.au>
That'll be this one:
BUG_ON(page->pte.chain != NULL);
we've had a few reports of this dribbling in since rmap went in. But
nothing repeatable enough for it to be hunted down.
But we do have a repeatable inconsistency happening with ntpd and
memory pressure. That may be related, but in that case it's probably
related to mlock().
So. An open bug, alas.
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: pte_chain leak in rmap code (2.5.31)
2002-08-13 0:44 ` Thomas Molina
@ 2002-08-13 1:09 ` Rik van Riel
2002-08-13 3:13 ` Andrew Morton
0 siblings, 1 reply; 5+ messages in thread
From: Rik van Riel @ 2002-08-13 1:09 UTC (permalink / raw)
To: Thomas Molina; +Cc: linux-kernel
On Mon, 12 Aug 2002, Thomas Molina wrote:
> On Mon, 12 Aug 2002, Rik van Riel wrote:
> > On Mon, 12 Aug 2002, Christian Ehrhardt wrote:
> >
> > > Note the strange use of continue and break which both achieve the same!
> > > What was meant to happen (judging from rmap-13c) is that we break
> > Excellent hunting! Thank you!
> Any chance this is the cause of the following?
Yes, quite possible.
> From: Adam Kropelin <akropel1@rochester.rr.com>
> Date: 2002-08-12 2:54:31
> But we do have a repeatable inconsistency happening with ntpd and
> memory pressure. That may be related, but in that case it's probably
> related to mlock().
kind regards,
Rik
--
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: pte_chain leak in rmap code (2.5.31)
2002-08-13 1:09 ` Rik van Riel
@ 2002-08-13 3:13 ` Andrew Morton
0 siblings, 0 replies; 5+ messages in thread
From: Andrew Morton @ 2002-08-13 3:13 UTC (permalink / raw)
To: Rik van Riel; +Cc: Thomas Molina, linux-kernel
Rik van Riel wrote:
>
> On Mon, 12 Aug 2002, Thomas Molina wrote:
> > On Mon, 12 Aug 2002, Rik van Riel wrote:
> > > On Mon, 12 Aug 2002, Christian Ehrhardt wrote:
> > >
> > > > Note the strange use of continue and break which both achieve the same!
> > > > What was meant to happen (judging from rmap-13c) is that we break
> > > Excellent hunting! Thank you!
> > Any chance this is the cause of the following?
>
> Yes, quite possible.
>
Well Adam reported it against the patched version, in which
is appears that I accidentally fixed that bug. So we may
yet have a problem:
for (pc = start; pc; pc = next_pc) {
int i;
next_pc = pc->next;
if (next_pc)
prefetch(next_pc);
for (i = 0; i < NRPTE; i++) {
pte_t *p = pc->ptes[i];
if (!p)
continue;
if (victim_i == -1)
victim_i = i;
switch (try_to_unmap_one(page, p)) {
case SWAP_SUCCESS:
/*
* Release a slot. If we're releasing the
* first pte in the first pte_chain then
* pc->ptes[i] and start->ptes[victim_i] both
* refer to the same thing. It works out.
*/
pc->ptes[i] = start->ptes[victim_i];
start->ptes[victim_i] = NULL;
dec_page_state(nr_reverse_maps);
victim_i++;
if (victim_i == NRPTE) {
page->pte.chain = start->next;
pte_chain_free(start);
start = page->pte.chain;
victim_i = 0;
}
break;
case SWAP_AGAIN:
/* Skip this pte, remembering status. */
ret = SWAP_AGAIN;
continue;
case SWAP_FAIL:
ret = SWAP_FAIL;
goto out;
case SWAP_ERROR:
ret = SWAP_ERROR;
goto out;
}
}
}
out:
return ret;
}
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2002-08-13 2:59 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-08-12 13:45 pte_chain leak in rmap code (2.5.31) Christian Ehrhardt
2002-08-12 14:21 ` Rik van Riel
2002-08-13 0:44 ` Thomas Molina
2002-08-13 1:09 ` Rik van Riel
2002-08-13 3:13 ` Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox