From: Izik Eidus <ieidus@redhat.com>
To: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
Rik van Riel <riel@redhat.com>, Chris Wright <chrisw@redhat.com>,
Nick Piggin <nickpiggin@yahoo.com.au>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 9/12] ksm: fix oom deadlock
Date: Tue, 04 Aug 2009 22:32:14 +0300 [thread overview]
Message-ID: <4A788CBE.7080100@redhat.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0908031317190.16754@sister.anvils>
Hugh Dickins wrote:
> There's a now-obvious deadlock in KSM's out-of-memory handling:
> imagine ksmd or KSM_RUN_UNMERGE handling, holding ksm_thread_mutex,
> trying to allocate a page to break KSM in an mm which becomes the
> OOM victim (quite likely in the unmerge case): it's killed and goes
> to exit, and hangs there waiting to acquire ksm_thread_mutex.
>
> Clearly we must not require ksm_thread_mutex in __ksm_exit, simple
> though that made everything else: perhaps use mmap_sem somehow?
> And part of the answer lies in the comments on unmerge_ksm_pages:
> __ksm_exit should also leave all the rmap_item removal to ksmd.
>
> But there's a fundamental problem, that KSM relies upon mmap_sem to
> guarantee the consistency of the mm it's dealing with, yet exit_mmap
> tears down an mm without taking mmap_sem. And bumping mm_users won't
> help at all, that just ensures that the pages the OOM killer assumes
> are on their way to being freed will not be freed.
>
> The best answer seems to be, to move the ksm_exit callout from just
> before exit_mmap, to the middle of exit_mmap: after the mm's pages
> have been freed (if the mmu_gather is flushed), but before its page
> tables and vma structures have been freed; and down_write,up_write
> mmap_sem there to serialize with KSM's own reliance on mmap_sem.
>
> But KSM then needs to be careful, whenever it downs mmap_sem, to
> check that the mm is not already exiting: there's a danger of using
> find_vma on a layout that's being torn apart, or writing into page
> tables which have been freed for reuse; and even do_anonymous_page
> and __do_fault need to check they're not being called by break_ksm
> to reinstate a pte after zap_pte_range has zapped that page table.
>
> Though it might be clearer to add an exiting flag, set while holding
> mmap_sem in __ksm_exit, that wouldn't cover the issue of reinstating
> a zapped pte. All we need is to check whether mm_users is 0 - but
> must remember that ksmd may detect that before __ksm_exit is reached.
> So, ksm_test_exit(mm) added to comment such checks on mm->mm_users.
>
> __ksm_exit now has to leave clearing up the rmap_items to ksmd,
> that needs ksm_thread_mutex; but shift the exiting mm just after the
> ksm_scan cursor so that it will soon be dealt with. __ksm_enter raise
> mm_count to hold the mm_struct, ksmd's exit processing (exactly like
> its processing when it finds all VM_MERGEABLEs unmapped) mmdrop it,
> similar procedure for KSM_RUN_UNMERGE (which has stopped ksmd).
>
> But also give __ksm_exit a fast path: when there's no complication
> (no rmap_items attached to mm and it's not at the ksm_scan cursor),
> it can safely do all the exiting work itself. This is not just an
> optimization: when ksmd is not running, the raised mm_count would
> otherwise leak mm_structs.
>
> Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> ---
>
Acked-by: Izik Eidus <ieidus@redhat.com>
WARNING: multiple messages have this Message-ID (diff)
From: Izik Eidus <ieidus@redhat.com>
To: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
Rik van Riel <riel@redhat.com>, Chris Wright <chrisw@redhat.com>,
Nick Piggin <nickpiggin@yahoo.com.au>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 9/12] ksm: fix oom deadlock
Date: Tue, 04 Aug 2009 22:32:14 +0300 [thread overview]
Message-ID: <4A788CBE.7080100@redhat.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0908031317190.16754@sister.anvils>
Hugh Dickins wrote:
> There's a now-obvious deadlock in KSM's out-of-memory handling:
> imagine ksmd or KSM_RUN_UNMERGE handling, holding ksm_thread_mutex,
> trying to allocate a page to break KSM in an mm which becomes the
> OOM victim (quite likely in the unmerge case): it's killed and goes
> to exit, and hangs there waiting to acquire ksm_thread_mutex.
>
> Clearly we must not require ksm_thread_mutex in __ksm_exit, simple
> though that made everything else: perhaps use mmap_sem somehow?
> And part of the answer lies in the comments on unmerge_ksm_pages:
> __ksm_exit should also leave all the rmap_item removal to ksmd.
>
> But there's a fundamental problem, that KSM relies upon mmap_sem to
> guarantee the consistency of the mm it's dealing with, yet exit_mmap
> tears down an mm without taking mmap_sem. And bumping mm_users won't
> help at all, that just ensures that the pages the OOM killer assumes
> are on their way to being freed will not be freed.
>
> The best answer seems to be, to move the ksm_exit callout from just
> before exit_mmap, to the middle of exit_mmap: after the mm's pages
> have been freed (if the mmu_gather is flushed), but before its page
> tables and vma structures have been freed; and down_write,up_write
> mmap_sem there to serialize with KSM's own reliance on mmap_sem.
>
> But KSM then needs to be careful, whenever it downs mmap_sem, to
> check that the mm is not already exiting: there's a danger of using
> find_vma on a layout that's being torn apart, or writing into page
> tables which have been freed for reuse; and even do_anonymous_page
> and __do_fault need to check they're not being called by break_ksm
> to reinstate a pte after zap_pte_range has zapped that page table.
>
> Though it might be clearer to add an exiting flag, set while holding
> mmap_sem in __ksm_exit, that wouldn't cover the issue of reinstating
> a zapped pte. All we need is to check whether mm_users is 0 - but
> must remember that ksmd may detect that before __ksm_exit is reached.
> So, ksm_test_exit(mm) added to comment such checks on mm->mm_users.
>
> __ksm_exit now has to leave clearing up the rmap_items to ksmd,
> that needs ksm_thread_mutex; but shift the exiting mm just after the
> ksm_scan cursor so that it will soon be dealt with. __ksm_enter raise
> mm_count to hold the mm_struct, ksmd's exit processing (exactly like
> its processing when it finds all VM_MERGEABLEs unmapped) mmdrop it,
> similar procedure for KSM_RUN_UNMERGE (which has stopped ksmd).
>
> But also give __ksm_exit a fast path: when there's no complication
> (no rmap_items attached to mm and it's not at the ksm_scan cursor),
> it can safely do all the exiting work itself. This is not just an
> optimization: when ksmd is not running, the raised mm_count would
> otherwise leak mm_structs.
>
> Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> ---
>
Acked-by: Izik Eidus <ieidus@redhat.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2009-08-04 19:28 UTC|newest]
Thread overview: 106+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-08-03 12:08 [PATCH 0/12] ksm: stats, oom, doc, misc Hugh Dickins
2009-08-03 12:08 ` Hugh Dickins
2009-08-03 12:10 ` [PATCH 1/12] ksm: rename kernel_pages_allocated Hugh Dickins
2009-08-03 12:10 ` Hugh Dickins
2009-08-03 14:21 ` Izik Eidus
2009-08-03 14:21 ` Izik Eidus
2009-08-03 16:48 ` Andrea Arcangeli
2009-08-03 16:48 ` Andrea Arcangeli
2009-08-03 12:11 ` [PATCH 2/12] ksm: move pages_sharing updates Hugh Dickins
2009-08-03 12:11 ` Hugh Dickins
2009-08-03 14:34 ` Izik Eidus
2009-08-03 14:34 ` Izik Eidus
2009-08-03 16:53 ` Andrea Arcangeli
2009-08-03 16:53 ` Andrea Arcangeli
2009-08-03 17:34 ` Hugh Dickins
2009-08-03 17:34 ` Hugh Dickins
2009-08-03 12:11 ` [PATCH 3/12] ksm: pages_unshared and pages_volatile Hugh Dickins
2009-08-03 12:11 ` Hugh Dickins
2009-08-03 14:54 ` Izik Eidus
2009-08-03 14:54 ` Izik Eidus
2009-08-04 21:49 ` Andrew Morton
2009-08-04 21:49 ` Andrew Morton
2009-08-05 11:39 ` Hugh Dickins
2009-08-05 11:39 ` Hugh Dickins
2009-08-05 15:11 ` Andrea Arcangeli
2009-08-05 15:11 ` Andrea Arcangeli
2009-08-03 12:12 ` [PATCH 4/12] ksm: break cow once unshared Hugh Dickins
2009-08-03 12:12 ` Hugh Dickins
2009-08-03 16:00 ` Izik Eidus
2009-08-03 16:00 ` Izik Eidus
2009-08-03 12:14 ` [PATCH 5/12] ksm: keep quiet while list empty Hugh Dickins
2009-08-03 12:14 ` Hugh Dickins
2009-08-03 16:55 ` Izik Eidus
2009-08-03 16:55 ` Izik Eidus
2009-08-04 21:59 ` Andrew Morton
2009-08-04 21:59 ` Andrew Morton
2009-08-05 11:54 ` Hugh Dickins
2009-08-05 11:54 ` Hugh Dickins
2009-08-03 12:15 ` [PATCH 6/12] ksm: five little cleanups Hugh Dickins
2009-08-03 12:15 ` Hugh Dickins
2009-08-04 12:41 ` Izik Eidus
2009-08-04 12:41 ` Izik Eidus
2009-08-03 12:16 ` [PATCH 7/12] ksm: fix endless loop on oom Hugh Dickins
2009-08-03 12:16 ` Hugh Dickins
2009-08-04 12:55 ` Izik Eidus
2009-08-04 12:55 ` Izik Eidus
2009-08-03 12:17 ` [PATCH 8/12] ksm: distribute remove_mm_from_lists Hugh Dickins
2009-08-03 12:17 ` Hugh Dickins
2009-08-04 13:03 ` Izik Eidus
2009-08-04 13:03 ` Izik Eidus
2009-08-03 12:18 ` [PATCH 9/12] ksm: fix oom deadlock Hugh Dickins
2009-08-03 12:18 ` Hugh Dickins
2009-08-04 19:32 ` Izik Eidus [this message]
2009-08-04 19:32 ` Izik Eidus
2009-08-25 14:58 ` Andrea Arcangeli
2009-08-25 14:58 ` Andrea Arcangeli
2009-08-25 15:22 ` [PATCH 13/12] ksm: fix munlock during exit_mmap deadlock Andrea Arcangeli
2009-08-25 15:22 ` Andrea Arcangeli
2009-08-25 17:49 ` Hugh Dickins
2009-08-25 17:49 ` Hugh Dickins
2009-08-25 18:10 ` Andrea Arcangeli
2009-08-25 18:10 ` Andrea Arcangeli
2009-08-25 18:58 ` Hugh Dickins
2009-08-25 18:58 ` Hugh Dickins
2009-08-25 19:45 ` Andrea Arcangeli
2009-08-25 19:45 ` Andrea Arcangeli
2009-08-26 16:18 ` Justin M. Forbes
2009-08-26 16:18 ` Justin M. Forbes
2009-08-26 19:17 ` Hugh Dickins
2009-08-26 19:17 ` Hugh Dickins
2009-08-26 19:44 ` Andrea Arcangeli
2009-08-26 19:44 ` Andrea Arcangeli
2009-08-26 19:57 ` Hugh Dickins
2009-08-26 19:57 ` Hugh Dickins
2009-08-26 20:28 ` Andrea Arcangeli
2009-08-26 20:28 ` Andrea Arcangeli
2009-08-26 20:54 ` Izik Eidus
2009-08-26 20:54 ` Izik Eidus
2009-08-26 21:14 ` Andrea Arcangeli
2009-08-26 21:14 ` Andrea Arcangeli
2009-08-26 21:49 ` Izik Eidus
2009-08-26 21:49 ` Izik Eidus
2009-08-27 19:11 ` Hugh Dickins
2009-08-27 19:11 ` Hugh Dickins
2009-08-27 19:35 ` Izik Eidus
2009-08-27 19:35 ` Izik Eidus
2009-08-26 22:00 ` David Rientjes
2009-08-26 22:00 ` David Rientjes
2009-08-26 20:29 ` Hugh Dickins
2009-08-26 20:29 ` Hugh Dickins
2009-08-25 17:35 ` [PATCH 9/12] ksm: fix oom deadlock Hugh Dickins
2009-08-25 17:35 ` Hugh Dickins
2009-08-25 17:47 ` Andrea Arcangeli
2009-08-25 17:47 ` Andrea Arcangeli
2009-08-03 12:19 ` [PATCH 10/12] ksm: sysfs and defaults Hugh Dickins
2009-08-03 12:19 ` Hugh Dickins
2009-08-04 19:34 ` Izik Eidus
2009-08-04 19:34 ` Izik Eidus
2009-08-03 12:21 ` [PATCH 11/12] ksm: add some documentation Hugh Dickins
2009-08-03 12:21 ` Hugh Dickins
2009-08-04 19:35 ` Izik Eidus
2009-08-04 19:35 ` Izik Eidus
2009-08-03 12:22 ` [PATCH 12/12] ksm: remove VM_MERGEABLE_FLAGS Hugh Dickins
2009-08-03 12:22 ` Hugh Dickins
2009-08-04 19:35 ` Izik Eidus
2009-08-04 19:35 ` Izik Eidus
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4A788CBE.7080100@redhat.com \
--to=ieidus@redhat.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=chrisw@redhat.com \
--cc=hugh.dickins@tiscali.co.uk \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nickpiggin@yahoo.com.au \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.