* 2.4.18(19) swapcache oops
@ 2002-08-15 12:39 j-nomura
2002-08-15 14:32 ` Hugh Dickins
0 siblings, 1 reply; 8+ messages in thread
From: j-nomura @ 2002-08-15 12:39 UTC (permalink / raw)
To: linux-kernel; +Cc: j-nomura
Hello,
I'm using 2.4.18 kernel and suspect there are swapcache race.
I looked into 2.4.19 patch but could not find the fix to it.
The race condition is described below.
Do you have any idea?
In the situation such as:
- two processes (process A and B) sharing memory space
- one of the pages in the space has been swapped out and
not remained in swapcache
- process A runs on cpu0, process B runs on cpu1
when process A reads the address corresponding to the page,
page fault occurs and the cpu0 reads swapped-out page into memory,
calls add_to_page_cache_unique() to add it to swapcache and then
calls lru_cache_add() to add it to lru list.
If process B reads the same address at that time, cpu1 calls
do_swap_page() and lookup_swap_cache() may succeed before cpu0
calls lru_cache_add() and cpu1 will set the page active by
following mark_page_accessed().
lru_cache_add() checks if the page is active and if it is active,
it calls BUG().
process A process B
cpu0 cpu1
-----------------------------------------------------------
do_swap_page()
spinunlock(mm->page_table_lock)
spinlock(mm->page_table_lock)
lookup_swap_cache()
read_swap_cache_async()
alloc_page()
add_to_swap_cache_unique()
__add_to_page_cache()
do_swap_page()
lookup_swap_cache()
mark_page_accessed()
lru_cache_add()
Best regards.
--
NOMURA, Jun'ichi <j-nomura@ce.jp.nec.com, nomura@hpc.bs1.fc.nec.co.jp>
HPC Operating System Group, 1st Computers Software Division,
Computers Software Operations Unit, NEC Solutions.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.4.18(19) swapcache oops
2002-08-15 12:39 2.4.18(19) swapcache oops j-nomura
@ 2002-08-15 14:32 ` Hugh Dickins
2002-08-15 20:05 ` Andrew Morton
0 siblings, 1 reply; 8+ messages in thread
From: Hugh Dickins @ 2002-08-15 14:32 UTC (permalink / raw)
To: j-nomura; +Cc: linux-kernel
On Thu, 15 Aug 2002 j-nomura@ce.jp.nec.com wrote:
>
> I'm using 2.4.18 kernel and suspect there are swapcache race.
> I looked into 2.4.19 patch but could not find the fix to it.
I see a benign race but no oops.
> In the situation such as:
> - two processes (process A and B) sharing memory space
> - one of the pages in the space has been swapped out and
> not remained in swapcache
> - process A runs on cpu0, process B runs on cpu1
>
> when process A reads the address corresponding to the page,
> page fault occurs and the cpu0 reads swapped-out page into memory,
> calls add_to_page_cache_unique() to add it to swapcache and then
> calls lru_cache_add() to add it to lru list.
>
> If process B reads the same address at that time, cpu1 calls
> do_swap_page() and lookup_swap_cache() may succeed before cpu0
> calls lru_cache_add() and cpu1 will set the page active by
> following mark_page_accessed().
I agree that B may get to mark_page_accessed before A gets to
lru_cache_add, but B will just SetPageReferenced. If there's a
similar process C racing on the page too, its mark_page_accessed
would call activate_page, but that will see !PageLRU and do nothing.
> lru_cache_add() checks if the page is active and if it is active,
> it calls BUG().
It cannot be made PageActive until after lru_cache_add has set PageLRU.
Hugh
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.4.18(19) swapcache oops
2002-08-15 14:32 ` Hugh Dickins
@ 2002-08-15 20:05 ` Andrew Morton
2002-08-15 20:21 ` Andrew Morton
2002-08-15 22:59 ` Hugh Dickins
0 siblings, 2 replies; 8+ messages in thread
From: Andrew Morton @ 2002-08-15 20:05 UTC (permalink / raw)
To: Hugh Dickins; +Cc: j-nomura, linux-kernel
Hugh Dickins wrote:
>
> On Thu, 15 Aug 2002 j-nomura@ce.jp.nec.com wrote:
> >
> > I'm using 2.4.18 kernel and suspect there are swapcache race.
> > I looked into 2.4.19 patch but could not find the fix to it.
>
> I see a benign race but no oops.
>
But look at lru_cache_add():
void lru_cache_add(struct page * page)
{
if (!TestSetPageLRU(page)) {
/* window here */
spin_lock(&pagemap_lru_lock);
add_page_to_inactive_list(page);
spin_unlock(&pagemap_lru_lock);
}
}
It sets PG_lru before adding the page to the LRU.
static inline void activate_page_nolock(struct page * page)
{
if (PageLRU(page) && !PageActive(page)) {
del_page_from_inactive_list(page);
add_page_to_active_list(page);
}
}
void activate_page(struct page * page)
{
spin_lock(&pagemap_lru_lock);
activate_page_nolock(page);
spin_unlock(&pagemap_lru_lock);
}
So if activate_page gets the lock inside that window, it will
delete a page from the LRU which isn't on it (memory corruption).
Then activate_page will set PG_active and will drop the lock.
lru_cache_add gets the lock, runs add_page_to_inactive_list which
BUGs over PG_active.
--- 2.4.19/mm/swap.c~lru-race Thu Aug 15 13:03:48 2002
+++ 2.4.19-akpm/mm/swap.c Thu Aug 15 13:04:19 2002
@@ -57,9 +57,10 @@ void activate_page(struct page * page)
*/
void lru_cache_add(struct page * page)
{
- if (!TestSetPageLRU(page)) {
+ if (!PageLRU(page)) {
spin_lock(&pagemap_lru_lock);
- add_page_to_inactive_list(page);
+ if (!TestSetPageLRU(page))
+ add_page_to_inactive_list(page);
spin_unlock(&pagemap_lru_lock);
}
}
.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.4.18(19) swapcache oops
2002-08-15 20:05 ` Andrew Morton
@ 2002-08-15 20:21 ` Andrew Morton
2002-08-16 4:19 ` j-nomura
2002-08-15 22:59 ` Hugh Dickins
1 sibling, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2002-08-15 20:21 UTC (permalink / raw)
To: Hugh Dickins, j-nomura, linux-kernel
Andrew Morton wrote:
>
> ...
> --- 2.4.19/mm/swap.c~lru-race Thu Aug 15 13:03:48 2002
> +++ 2.4.19-akpm/mm/swap.c Thu Aug 15 13:04:19 2002
> @@ -57,9 +57,10 @@ void activate_page(struct page * page)
> */
> void lru_cache_add(struct page * page)
> {
> - if (!TestSetPageLRU(page)) {
> + if (!PageLRU(page)) {
> spin_lock(&pagemap_lru_lock);
> - add_page_to_inactive_list(page);
> + if (!TestSetPageLRU(page))
> + add_page_to_inactive_list(page);
> spin_unlock(&pagemap_lru_lock);
> }
> }
Seems that I fixed this in 2.5.32. That set_bit outside
the lock gave me the willes, and I couldn't put my finger on why.
Never occurred to me that the page could be found via pagecache
lookup in this manner.
In 2.5, it is effectively:
void lru_cache_add(struct page * page)
{
spin_lock(&pagemap_lru_lock);
if (TestSetPageLRU(page))
BUG();
add_page_to_inactive_list(page);
spin_unlock(&pagemap_lru_lock);
}
which is what should be tested in 2.4. It's stricter, and significantly
faster.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.4.18(19) swapcache oops
2002-08-15 20:05 ` Andrew Morton
2002-08-15 20:21 ` Andrew Morton
@ 2002-08-15 22:59 ` Hugh Dickins
1 sibling, 0 replies; 8+ messages in thread
From: Hugh Dickins @ 2002-08-15 22:59 UTC (permalink / raw)
To: Andrew Morton; +Cc: j-nomura, linux-kernel
On Thu, 15 Aug 2002, Andrew Morton wrote:
> Hugh Dickins wrote:
> > On Thu, 15 Aug 2002 j-nomura@ce.jp.nec.com wrote:
> > >
> > > I'm using 2.4.18 kernel and suspect there are swapcache race.
> > > I looked into 2.4.19 patch but could not find the fix to it.
> >
> > I see a benign race but no oops.
>
> But look at lru_cache_add():
>
> void lru_cache_add(struct page * page)
> {
> if (!TestSetPageLRU(page)) {
> /* window here */
> spin_lock(&pagemap_lru_lock);
> add_page_to_inactive_list(page);
> spin_unlock(&pagemap_lru_lock);
> }
> }
>
> It sets PG_lru before adding the page to the LRU.
Right you are! Thanks for correcting my dysanalysis.
Hugh
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.4.18(19) swapcache oops
2002-08-15 20:21 ` Andrew Morton
@ 2002-08-16 4:19 ` j-nomura
2002-08-16 4:40 ` Andrew Morton
0 siblings, 1 reply; 8+ messages in thread
From: j-nomura @ 2002-08-16 4:19 UTC (permalink / raw)
To: akpm; +Cc: hugh, j-nomura, linux-kernel
Thank you for the solutions.
Your first patch seems to fix the problem in 2.4.
As to the second fix from 2.5.32, it can't be applicable to 2.4, can it?
try_to_swap_out() may call add_to_swap_cache() with PG_lru page.
From: Andrew Morton <akpm@zip.com.au>
Subject: Re: 2.4.18(19) swapcache oops
Date: Thu, 15 Aug 2002 13:21:17 -0700
> In 2.5, it is effectively:
>
> void lru_cache_add(struct page * page)
> {
> spin_lock(&pagemap_lru_lock);
> if (TestSetPageLRU(page))
> BUG();
> add_page_to_inactive_list(page);
> spin_unlock(&pagemap_lru_lock);
> }
>
> which is what should be tested in 2.4. It's stricter, and significantly
> faster.
Best regards.
--
NOMURA, Jun'ichi <j-nomura@ce.jp.nec.com, nomura@hpc.bs1.fc.nec.co.jp>
HPC Operating System Group, 1st Computers Software Division,
Computers Software Operations Unit, NEC Solutions.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.4.18(19) swapcache oops
2002-08-16 4:19 ` j-nomura
@ 2002-08-16 4:40 ` Andrew Morton
2002-08-16 8:07 ` j-nomura
0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2002-08-16 4:40 UTC (permalink / raw)
To: j-nomura; +Cc: hugh, linux-kernel
j-nomura@ce.jp.nec.com wrote:
>
> Thank you for the solutions.
> Your first patch seems to fix the problem in 2.4.
You must have been able to reproduce it quite quickly. But the
code has been like that for ages. Why is that? Are you testing
on an unusual machine?
> As to the second fix from 2.5.32, it can't be applicable to 2.4, can it?
> try_to_swap_out() may call add_to_swap_cache() with PG_lru page.
Ah, good point. In 2.5, add_to_page_cache() does not add the
page to the LRU. Anonymous pages are already there, and we
don't need to do that. The first patch should be OK.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: 2.4.18(19) swapcache oops
2002-08-16 4:40 ` Andrew Morton
@ 2002-08-16 8:07 ` j-nomura
0 siblings, 0 replies; 8+ messages in thread
From: j-nomura @ 2002-08-16 8:07 UTC (permalink / raw)
To: akpm; +Cc: j-nomura, hugh, linux-kernel
From: Andrew Morton <akpm@zip.com.au>
Subject: Re: 2.4.18(19) swapcache oops
Date: Thu, 15 Aug 2002 21:40:26 -0700
> You must have been able to reproduce it quite quickly. But the
> code has been like that for ages. Why is that? Are you testing
> on an unusual machine?
Not so unusual. It's Itanium2 SMP machine (4cpus to 32cpus).
The problem have occured at first after I run a very large multi-threaded
program for days.
To investigate the problem, I made a small program which allocate
memory and then create some threads to access the shared space in parallel.
It makes shared pages swapped in and out with very high frequency.
Running it with small free memory situation, the program could cause
the oops in a few minutes on 4cpu machine.
---------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#define NPROC 4
#define SIZE 0x20000000
char *ptr;
void*
loop(void* none)
{
while(1)
memset((char *)ptr, 0, SIZE);
}
int
main(int argc, char** argv)
{
int i;
void *v;
pthread_t tid[NPROC];
ptr=malloc(SIZE);
memset(ptr, 0, SIZE);
for(i=0; i<NPROC; i++){
pthread_create((pthread_t *)&tid[i],
(pthread_attr_t *)NULL,
loop,
(void *)NULL);
}
for(i=0; i<NPROC; i++)
pthread_join(tid[i], (void**)NULL);
return 0;
}
---------------------------------------------------------------------
> > As to the second fix from 2.5.32, it can't be applicable to 2.4, can it?
> > try_to_swap_out() may call add_to_swap_cache() with PG_lru page.
>
> Ah, good point. In 2.5, add_to_page_cache() does not add the
> page to the LRU. Anonymous pages are already there, and we
> don't need to do that. The first patch should be OK.
I hope the patch will be taken in next 2.4 release.
Best regards.
--
NOMURA, Jun'ichi <j-nomura@ce.jp.nec.com, nomura@hpc.bs1.fc.nec.co.jp>
HPC Operating System Group, 1st Computers Software Division,
Computers Software Operations Unit, NEC Solutions.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2002-08-16 8:04 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-08-15 12:39 2.4.18(19) swapcache oops j-nomura
2002-08-15 14:32 ` Hugh Dickins
2002-08-15 20:05 ` Andrew Morton
2002-08-15 20:21 ` Andrew Morton
2002-08-16 4:19 ` j-nomura
2002-08-16 4:40 ` Andrew Morton
2002-08-16 8:07 ` j-nomura
2002-08-15 22:59 ` Hugh Dickins
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.