From: Michal Hocko <mhocko@suse.cz>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Sasha Levin <sasha.levin@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>,
Konstantin Khlebnikov <koct9i@gmail.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Michel Lespinasse <walken@google.com>,
Lukas Czerner <lczerner@redhat.com>,
Dave Jones <davej@redhat.com>,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] shmem: fix splicing from a hole while it's punched
Date: Fri, 25 Jul 2014 16:33:44 +0200 [thread overview]
Message-ID: <20140725143344.GC4844@dhcp22.suse.cz> (raw)
In-Reply-To: <alpine.LSU.2.11.1407150331170.2584@eggly.anvils>
On Tue 15-07-14 03:33:02, Hugh Dickins wrote:
> shmem_fault() is the actual culprit in trinity's hole-punch starvation,
> and the most significant cause of such problems: since a page faulted is
> one that then appears page_mapped(), needing unmap_mapping_range() and
> i_mmap_mutex to be unmapped again.
>
> But it is not the only way in which a page can be brought into a hole in
> the radix_tree while that hole is being punched; and Vlastimil's testing
> implies that if enough other processors are busy filling in the hole,
> then shmem_undo_range() can be kept from completing indefinitely.
>
> shmem_file_splice_read() is the main other user of SGP_CACHE, which
> can instantiate shmem pagecache pages in the read-only case (without
> holding i_mutex, so perhaps concurrently with a hole-punch). Probably
> it's silly not to use SGP_READ already (using the ZERO_PAGE for holes):
> which ought to be safe, but might bring surprises - not a change to be
> rushed.
>
> shmem_read_mapping_page_gfp() is an internal interface used by
> drivers/gpu/drm GEM (and next by uprobes): it should be okay. And
> shmem_file_read_iter() uses the SGP_DIRTY variant of SGP_CACHE, when
> called internally by the kernel (perhaps for a stacking filesystem,
> which might rely on holes to be reserved): it's unclear whether it
> could be provoked to keep hole-punch busy or not.
>
> We could apply the same umbrella as now used in shmem_fault() to
> shmem_file_splice_read() and the others; but it looks ugly, and use
> over a range raises questions - should it actually be per page? can
> these get starved themselves?
>
> The origin of this part of the problem is my v3.1 commit d0823576bf4b
> ("mm: pincer in truncate_inode_pages_range"), once it was duplicated
> into shmem.c. It seemed like a nice idea at the time, to ensure
> (barring RCU lookup fuzziness) that there's an instant when the entire
> hole is empty; but the indefinitely repeated scans to ensure that make
> it vulnerable.
>
> Revert that "enhancement" to hole-punch from shmem_undo_range(), but
> retain the unproblematic rescanning when it's truncating; add a couple
> of comments there.
>
> Remove the "indices[0] >= end" test: that is now handled satisfactorily
> by the inner loop, and mem_cgroup_uncharge_start()/end() are too light
> to be worth avoiding here.
>
> But if we do not always loop indefinitely, we do need to handle the case
> of swap swizzled back to page before shmem_free_swap() gets it: add a
> retry for that case, as suggested by Konstantin Khlebnikov; and for the
> case of page swizzled back to swap, as suggested by Johannes Weiner.
>
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Konstantin Khlebnikov <koct9i@gmail.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Lukas Czerner <lczerner@redhat.com>
> Cc: Dave Jones <davej@redhat.com>
> Cc: <stable@vger.kernel.org> [3.1+]
Reviewed-by: Michal Hocko <mhocko@suse.cz>
> ---
> Please replace mmotm's
> shmem-fix-faulting-into-a-hole-while-its-punched-take-2.patch
> by this patch: which is substantially the same as that, but with
> updated commit comment, and a page retry as indicated by Hannes.
>
> mm/shmem.c | 24 +++++++++++++++---------
> 1 file changed, 15 insertions(+), 9 deletions(-)
>
> --- 3.16-rc5+/mm/shmem.c 2014-07-14 20:34:28.196153828 -0700
> +++ 3.16-rc5++/mm/shmem.c 2014-07-14 20:35:14.156154916 -0700
> @@ -468,23 +468,20 @@ static void shmem_undo_range(struct inod
> return;
>
> index = start;
> - for ( ; ; ) {
> + while (index < end) {
> cond_resched();
>
> pvec.nr = find_get_entries(mapping, index,
> min(end - index, (pgoff_t)PAGEVEC_SIZE),
> pvec.pages, indices);
> if (!pvec.nr) {
> - if (index == start || unfalloc)
> + /* If all gone or hole-punch or unfalloc, we're done */
> + if (index == start || end != -1)
> break;
> + /* But if truncating, restart to make sure all gone */
> index = start;
> continue;
> }
> - if ((index == start || unfalloc) && indices[0] >= end) {
> - pagevec_remove_exceptionals(&pvec);
> - pagevec_release(&pvec);
> - break;
> - }
> mem_cgroup_uncharge_start();
> for (i = 0; i < pagevec_count(&pvec); i++) {
> struct page *page = pvec.pages[i];
> @@ -496,8 +493,12 @@ static void shmem_undo_range(struct inod
> if (radix_tree_exceptional_entry(page)) {
> if (unfalloc)
> continue;
> - nr_swaps_freed += !shmem_free_swap(mapping,
> - index, page);
> + if (shmem_free_swap(mapping, index, page)) {
> + /* Swap was replaced by page: retry */
> + index--;
> + break;
> + }
> + nr_swaps_freed++;
> continue;
> }
>
> @@ -506,6 +507,11 @@ static void shmem_undo_range(struct inod
> if (page->mapping == mapping) {
> VM_BUG_ON_PAGE(PageWriteback(page), page);
> truncate_inode_page(mapping, page);
> + } else {
> + /* Page was replaced by swap: retry */
> + unlock_page(page);
> + index--;
> + break;
> }
> }
> unlock_page(page);
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2014-07-25 14:33 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-15 10:28 [PATCH 0/2] shmem: fix faulting into a hole while it's punched, take 3 Hugh Dickins
2014-07-15 10:31 ` [PATCH 1/2] shmem: fix faulting into a hole, not taking i_mutex Hugh Dickins
2014-07-15 16:07 ` Vlastimil Babka
2014-07-15 19:26 ` Hugh Dickins
2014-07-16 7:18 ` Vlastimil Babka
2014-07-25 14:25 ` Michal Hocko
2014-07-15 10:33 ` [PATCH 2/2] shmem: fix splicing from a hole while it's punched Hugh Dickins
2014-07-25 14:33 ` Michal Hocko [this message]
2014-07-17 16:10 ` [PATCH 0/2] shmem: fix faulting into a hole while it's punched, take 3 Vlastimil Babka
2014-07-17 16:12 ` Sasha Levin
2014-07-18 10:44 ` Sasha Levin
2014-07-19 23:44 ` Hugh Dickins
2014-07-22 3:24 ` Sasha Levin
2014-07-22 8:07 ` Hugh Dickins
2014-07-22 10:06 ` Vlastimil Babka
2014-07-22 12:09 ` Vlastimil Babka
2014-07-22 18:42 ` Hugh Dickins
2014-07-22 23:19 ` Sasha Levin
2014-07-22 23:58 ` Hugh Dickins
2014-07-17 23:34 ` Hugh Dickins
2014-07-18 8:05 ` Vlastimil Babka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140725143344.GC4844@dhcp22.suse.cz \
--to=mhocko@suse.cz \
--cc=akpm@linux-foundation.org \
--cc=davej@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=koct9i@gmail.com \
--cc=lczerner@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=sasha.levin@oracle.com \
--cc=vbabka@suse.cz \
--cc=walken@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).