From: Mark Gross <mgross@unix-os.sc.intel.com>
To: Daniel Jacobowitz <dan@debian.org>, Andi Kleen <ak@suse.de>
Cc: linux-kernel@vger.kernel.org
Subject: Re: PATCH Multithreaded core dump support for the 2.5.14 (and 15) kernel.
Date: Thu, 16 May 2002 14:08:10 -0400
Message-ID: <200205162108.g4GL8Xw01263@unix-os.sc.intel.com>
In-Reply-To: <59885C5E3098D511AD690002A5072D3C057B485B@orsmsx111.jf.intel.com.suse.lists.linux.kernel> <20020516192759.A5326@wotan.suse.de> <20020516173634.GA16561@nevyn.them.org>

On Thursday 16 May 2002 01:36 pm, Daniel Jacobowitz wrote:
> On Thu, May 16, 2002 at 07:27:59PM +0200, Andi Kleen wrote:
> > On Thu, May 16, 2002 at 10:13:40AM -0400, Mark Gross wrote:
> > > Also, does anyone know WHY the mmap_sem is needed in the elf_core_dump
> > > code, and is this need still valid if I've suspended all the other
> > > processes that could even touch that mm?  I.e. can I fix this by
> > > removing the down_write / up_write in elf_core_dump?
> >
> > The mmap_sem is needed to access current->mm (especially the vma list)
> > safely. Otherwise someone else sharing the mm_struct could modify it.
> > If you make sure all others sharing the mm_struct are killed first
> > (including no way for them to start new clones in between) then
> > the only loophole left would be remote access using /proc/pid/mem or
> > ptrace. If you handle that too then it is probably safe to drop it.
> > Unfortunately I don't see a way to handle these remote users without at
> > least taking it temporarily.
> >
> > Of course there are other semaphores involved in dumping too (e.g. the
> > VFS ->write code may take the i_sem or other private ones). I guess they
> > won't be a big problem if you first kill and then dump later.
>
> Except unfortunately we don't kill; the other threads are resumed
> afterwards for cleanup.  They're just suspended.

Yes, they start back up after the dump.  

It certainly seems that, with the other processes paused, taking
current->mm->mmap_sem could be unnecessary for core dumps.  I'm not so sure
that protecting the core file data from ptrace or /proc/pid/mem is important
in the case of core dumping.

I just don't want the kernel to lock up dumping the multithreaded core file.

I'm still not sure we have a problem yet (wishful thinking, I suppose).
Also, in my testing of the patch I posted, I've seen zero lockups caused by
a semaphore being held by one of the temporarily paused processes.

To restate: the only way I see that my design gets into trouble is when a 
semaphore is HELD, not getting waited on, by one of the processes that gets 
put onto the phantom runqueue, AND that semaphore is needed in the processing 
of elf_core_dump(...).
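
The condition restated above can be sketched as a small predicate.  This is
a user-space model only, not kernel code and not part of the posted patch:
struct sem_model, struct task_model, NO_OWNER and dump_would_deadlock() are
hypothetical names invented for illustration, and the real kernel semaphores
do not record an owner like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: each semaphore records which task, if any, holds it. */
#define NO_OWNER (-1)

struct sem_model {
    const char *name;
    int owner;          /* task id of the holder, or NO_OWNER if free */
};

struct task_model {
    int id;
    bool suspended;     /* true once parked on the "phantom runqueue" */
};

/*
 * Returns true if the dumping task would block forever: some semaphore it
 * needs is HELD (not merely waited on) by a task that is suspended and so
 * cannot run to release it.
 */
bool dump_would_deadlock(const struct sem_model *needed, size_t n_needed,
                         const struct task_model *tasks, size_t n_tasks)
{
    assert(needed != NULL || n_needed == 0);
    assert(tasks != NULL || n_tasks == 0);

    for (size_t i = 0; i < n_needed; i++) {
        if (needed[i].owner == NO_OWNER)
            continue;   /* free: the dumper can simply take it */
        for (size_t t = 0; t < n_tasks; t++) {
            if (tasks[t].id == needed[i].owner && tasks[t].suspended)
                return true;    /* holder can never release it */
        }
    }
    return false;
}
```

In this model a semaphore held by a task that is still running is not a
problem, since that task will eventually release it; only a holder that has
been suspended makes the dump hang.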

For this to happen, that semaphore would have to be held across calls to
schedule().  The ONLY place I've seen that in the kernel is
set_cpus_allowed() + migration_thread.

Can someone point me at other critical sections whose lifetimes are
non-deterministic, i.e. depend on when the process holding the semaphore
gets scheduled onto a CPU?  That type of code seems very risky to me.  It is
the only type of code that could get my design into trouble.

--mgross


