All of lore.kernel.org
 help / color / mirror / Atom feed
From: Minchan Kim <minchan@kernel.org>
To: Mauricio Faria de Oliveira <mfo@canonical.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-block@vger.kernel.org,
	Huang Ying <ying.huang@intel.com>,
	Miaohe Lin <linmiaohe@huawei.com>
Subject: Re: [PATCH] mm: fix race between MADV_FREE reclaim and blkdev direct IO read
Date: Tue, 4 Jan 2022 15:06:42 -0800	[thread overview]
Message-ID: <YdTTAjfse0Zwl7eB@google.com> (raw)
In-Reply-To: <CAO9xwp32+izoL54iCWRMGttL_T9yJKcyDyqwqxoDBx8Z7d_ZKg@mail.gmail.com>

On Tue, Jan 04, 2022 at 08:46:17AM -0300, Mauricio Faria de Oliveira wrote:
> On Thu, Dec 16, 2021 at 3:17 PM Minchan Kim <minchan@kernel.org> wrote:
> ...
> > Hi Mauricio,
> >
> > Thanks for catching the bug. There is some comment before I would
> > look the problem in more detail. Please see below.
> >
> 
> Hey! Thanks for looking into this. Sorry for the delay; I've been out
> a few weeks.
> 
> > > diff --git a/mm/rmap.c b/mm/rmap.c
> > > index 163ac4e6bcee..f04151aae03b 100644
> > > --- a/mm/rmap.c
> > > +++ b/mm/rmap.c
> > > @@ -1570,7 +1570,18 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
> > >
> > >                       /* MADV_FREE page check */
> > >                       if (!PageSwapBacked(page)) {
> > > -                             if (!PageDirty(page)) {
> > > +                             int refcount = page_ref_count(page);
> > > +
> > > +                             /*
> > > +                              * The only page refs must be from the isolation
> > > +                              * (checked by the caller shrink_page_list() too)
> > > +                              * and the (single) rmap (dropped by discard:).
> > > +                              *
> > > +                              * Check the reference count before dirty flag
> > > +                              * with memory barrier; see __remove_mapping().
> > > +                              */
> > > +                             smp_rmb();
> > > +                             if (refcount == 2 && !PageDirty(page)) {
> >
> > A madv_free marked page could be mapped at several processes so
> > it wouldn't be refcount two all the time, I think.
> > Shouldn't we check it with page_mapcount with page_refcount?
> >
> >     page_ref_count(page) - 1  > page_mapcount(page)
> >
> 
> It's the other way around, isn't it? The madvise(MADV_FREE) call only clears
> the page dirty flag if page_mapcount() == 1 (ie not mapped by more processes).
> 
> @ madvise_free_pte_range()
> 
>                         /*
>                          * If page is shared with others, we couldn't clear
>                          * PG_dirty of the page.
>                          */
>                         if (page_mapcount(page) != 1) {
>                                 unlock_page(page);
>                                 continue;
>                         }
> ...
>                         ClearPageDirty(page);
>                         unlock_page(page);
> 
> 
> If that's right, the refcount of 2 should be OK (one from the isolation,
> another one from the single map/one process.)
> 
> Does that make sense?  I might be missing something.

A child process could be forked from parent after madvise so the page
mapcount of the hinted page could have elevated refcount.

  reply	other threads:[~2022-01-04 23:06 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-11  2:21 [PATCH] mm: fix race between MADV_FREE reclaim and blkdev direct IO read Mauricio Faria de Oliveira
2021-12-16 18:17 ` Minchan Kim
2021-12-17  2:10   ` Huang, Ying
2022-01-04 11:49     ` Mauricio Faria de Oliveira
2022-01-04 11:46   ` Mauricio Faria de Oliveira
2022-01-04 23:06     ` Minchan Kim [this message]
2022-01-04 23:32       ` Mauricio Faria de Oliveira
2021-12-17 18:51 ` Yang Shi
2022-01-04 11:57   ` Mauricio Faria de Oliveira
2022-01-05  0:32   ` Huang, Ying
2022-01-05  1:20     ` Yang Shi
2022-01-05  1:42       ` Huang, Ying
2022-01-05  2:16         ` Yang Shi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YdTTAjfse0Zwl7eB@google.com \
    --to=minchan@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=linmiaohe@huawei.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mfo@canonical.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.