linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	linux-mm@kvack.org, Ira Weiny <ira.weiny@intel.com>
Subject: Re: Rare memory leakage
Date: Wed, 23 Sep 2020 12:26:50 +0100	[thread overview]
Message-ID: <20200923112650.GN32101@casper.infradead.org> (raw)
In-Reply-To: <20200923064606.l3ajwbbkewvvaimc@black.fi.intel.com>

On Wed, Sep 23, 2020 at 09:46:06AM +0300, Kirill A. Shutemov wrote:
> On Tue, Sep 22, 2020 at 02:12:19PM +0100, Matthew Wilcox wrote:
> > On Tue, Sep 22, 2020 at 12:44:51PM +0100, Matthew Wilcox wrote:
> > > I've just kicked off a test run using this:
> > > 
> > > +/*
> > > + * Have to be careful when freeing a non-compound allocation in case somebody
> > > + * else takes a temporary reference on the first page and then calls put_page()
> > > + */
> > >  void __free_pages(struct page *page, unsigned int order)
> > >  {
> > > -       if (put_page_testzero(page))
> > > -               free_the_page(page, order);
> > > +       if (likely(page_ref_freeze(page, 1)))
> > > +               goto free;
> > > +       if (likely(order == 0 || PageHead(page))) {
> > > +               if (put_page_testzero(page))
> > > +                       goto free;
> > > +               return;
> > > +       }
> > > +
> > > +       prep_compound_page(page, order);
> > > +       put_page(page);
> > > +       return;
> > > +free:
> > > +       free_the_page(page, order);
> > >  }
> > >  EXPORT_SYMBOL(__free_pages);
> > > 
> > > > > Better ideas?
> > > > 
> > > > IMHO, alloc_pages() with order > 0 and without __GFP_COMP is a weird beast. In
> > > > that case it would be probably best to refcount each base page separately. I
> > > > don't know how many assumptions this would break :/
> > > 
> > > You sound like a man who's never dealt with device drivers.  Multiorder,
> > > non-compound allocations are the norm in that world.
> > 
> > This seems easier to reason about:
> > 
> > +/*
> > + * If we free a non-compound allocation, another thread may have a
> > + * transient reference to the first page.  It has no way of knowing
> > + * about the rest of the allocation, so we have to free all but the
> > + * first page here.
> > + */
> >  void __free_pages(struct page *page, unsigned int order)
> >  {
> >         if (put_page_testzero(page))
> >                 free_the_page(page, order);
> > +       else
> > +               while (order-- > 0)
> > +                       free_the_page(page + 1 << order, order);
> >  }
> >  EXPORT_SYMBOL(__free_pages);
> 
> To be honest, I have not dealt with device drivers much myself, but this
> looks bogus. We free pages beynd the first one if the first is pinned.
> How is it correct?

From the device driver's point of view, it's just freeing some memory.

Let's look at an example:

static struct page *i8xx_alloc_pages(void)
{
        struct page *page;

        page = alloc_pages(GFP_KERNEL | GFP_DMA32, 2);
        if (page == NULL)
                return NULL;

        if (set_pages_uc(page, 4) < 0) {
                set_pages_wb(page, 4);
                __free_pages(page, 2);
                return NULL;
        }
        atomic_inc(&agp_bridge->current_memory_agp);
        return page;
}

So we allocate four consecutive pages but they're not a compound page.
If this mysterious call to set_pages_uc() fails, we call __free_pages()
and intend to free all four pages.

If the page cache has a speculative reference on the first page, we will
free none of them.  When the page cache drops its speculative reference,
it will free only the first one.  So we have to free all-but-the-first
one here.

As far as I can tell, you're concerned that the driver might do something
like this:

	struct page *page = alloc_pages(GFP_KERNEL, 2);
	get_page(page);
	__free_pages(page, 2);
	__free_pages(page, 2);

Drivers don't do that, in my experience, and if there are any, we'll see
a noisy complaint in page_expected_state() when it checks _mapcount on
each of the freed pages and it's PageBuddy instead of -1.


      reply	other threads:[~2020-09-23 11:26 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-22  3:12 Rare memory leakage Matthew Wilcox
2020-09-22  3:43 ` Ira Weiny
2020-09-22 11:04   ` Matthew Wilcox
2020-09-22 11:22 ` Vlastimil Babka
2020-09-22 11:44   ` Matthew Wilcox
2020-09-22 13:12     ` Matthew Wilcox
2020-09-23  6:46       ` Kirill A. Shutemov
2020-09-23 11:26         ` Matthew Wilcox [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200923112650.GN32101@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=ira.weiny@intel.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-mm@kvack.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).