From: kanoj@google.engr.sgi.com (Kanoj Sarcar)
To: andrea@suse.de
Cc: Ben LaHaise <bcrl@redhat.com>,
	riel@nl.linux.org, Linus Torvalds <torvalds@transmeta.com>,
	linux-mm@kvack.org
Subject: Re: [patch] take 2 Re: PG_swap_entry bug in recent kernels
Date: Mon, 10 Apr 2000 19:45:14 -0700 (PDT)	[thread overview]
Message-ID: <200004110245.TAA57888@google.engr.sgi.com> (raw)
In-Reply-To: <Pine.LNX.4.21.0004092326460.293-100000@vaio.random> from "andrea@suse.de" at Apr 10, 2000 10:55:32 AM

> 
> On Sat, 8 Apr 2000, Kanoj Sarcar wrote:
> 
> >Btw, I am looking at your patch with message id
> ><Pine.LNX.4.21.0004081924010.317-100000@alpha.random>, that does not
> >seem to be holding vmlist/pagetable lock in the swapdelete code (at
> >least at first blush). That was partly why I wanted to know what fixes 
> >are in your patch ...
> 
> The patch was against the earlier swapentry patch that was also fixing the
> vma/pte locking in swapoff. All the three patches I posted were
> incremental.
> 
> >Note: I prefer being able to hold mmap_sem in the swapdelete path, that
> >will provide protection against fork/exit races too. I will try to port
>
> With my approach swapoff is serialized w.r.t. fork/exit the same way as
> swap_out(). However, I see a potential future problem in exit_mmap() that

While forking, a parent might copy a swap handle into the child, but we
might entirely miss scanning the child because it is not yet on the process
list (kernel_lock is not enough, since the forking code might sleep). E.g., in
the body of dup_mmap, we go to sleep due to a memory shortage, which kicks off
page stealing (with highly asynchronous swapio).

Same problem exists in exit_mmap. In this case, one of the routines inside
exit_mmap() can very plausibly go to sleep. E.g., file_unmap.
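
The fork-side window described above can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: it is not kernel code, every name in it (toy_task, toy_link_task, toy_swapoff_scan, toy_fork_race) is hypothetical, and the "sleep" in dup_mmap is modeled simply by running the scan between the copy of the swap handle and the linking of the child.

```c
/* Toy user-space model of the fork window. NOT kernel code.
 * All names here are hypothetical and exist only for illustration. */
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 8

struct toy_task {
    int swap_entry;   /* nonzero: the task still references this swap handle */
};

/* Stands in for the process list that the swapoff path walks. */
static struct toy_task *task_list[MAX_TASKS];
static size_t nr_tasks;

static void toy_link_task(struct toy_task *t)
{
    assert(nr_tasks < MAX_TASKS);
    task_list[nr_tasks++] = t;
}

/* Stands in for swapoff's scan: drop every reference to `entry` held by
 * tasks reachable from the list. Returns the number of references fixed. */
static int toy_swapoff_scan(int entry)
{
    int fixed = 0;

    for (size_t i = 0; i < nr_tasks; i++) {
        if (task_list[i]->swap_entry == entry) {
            task_list[i]->swap_entry = 0;
            fixed++;
        }
    }
    return fixed;
}

/* Returns the number of stale swap references left after the scan.
 * If scan_in_window is nonzero, the scan runs while the "fork" is asleep,
 * i.e. after the handle was copied but before the child is on the list. */
static int toy_fork_race(int scan_in_window)
{
    static struct toy_task parent, child;

    nr_tasks = 0;
    parent.swap_entry = 42;
    child.swap_entry = 0;
    toy_link_task(&parent);

    child.swap_entry = parent.swap_entry;  /* dup_mmap-like copy */
    if (scan_in_window)
        toy_swapoff_scan(42);              /* child is not yet visible */
    toy_link_task(&child);                 /* child becomes visible */
    if (!scan_in_window)
        toy_swapoff_scan(42);

    return (parent.swap_entry != 0) + (child.swap_entry != 0);
}
```

In this model toy_fork_race(0) returns 0, while toy_fork_race(1) returns 1: the child keeps a reference to a swap handle that the scan believed it had fully removed. Serializing the scan against the copy-and-link sequence as a whole closes the window in the model; which kernel lock should play that role is exactly what is being debated here.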

> makes the entries not reachable before swapoff starts and that does the
> swap_free() after swapoff has completed and after the swap device has gone
> away (== too late). That's not an issue right now though, since both swapoff
> and do_exit() are holding the big kernel lock, but it will become an issue
> eventually. Probably exit_mmap() should unlink and unmap the vmas bit by
> bit, using locking to unlink and leaving them visible if they are not yet
> released. That should get rid of that future race.
> 
> About grabbing the mmap semaphore in unuse_process: we don't need to do
> that because we aren't changing vmas from swapoff. Swapoff only browses
> and changes pagetables, so it only needs the vmlist-access read-spinlock
> that keeps the vmas from going away, and the pagetable exclusive spinlock
> because we'll change the pagetables (and the latter one is implied in the
> vmlist_access_lock as we know from the vmlist_access_lock implementation).

See above.

> 
> swap_out() can't grab the mmap_sem for obvious reasons, so if you only

Why not? Of course, not with tasklist_lock held (Hehehe, I am not that 
stupid :-)). But other mechanisms are possible.

> grab the mmap_sem you'll have to rely only on the big kernel lock to keep
> swap_out() from racing with your swapoff, right? It doesn't look like the
> right long-term solution.

Actually, let me put out the patch. For different reasons, IMO it is the
right long-term solution ...

Kanoj
> 
> Andrea
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux.eu.org/Linux-MM/
> 


