public inbox for linux-kernel@vger.kernel.org
From: Andrea Arcangeli <andrea@suse.de>
To: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>,
	bgagnon@coradiant.com,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Andrea Arcangeli <andrea@novell.com>,
	davem@redhat.com
Subject: Re: Memory leak in 2.4.27 kernel, using mmap raw packet sockets
Date: Thu, 25 Nov 2004 21:32:48 +0100
Message-ID: <20041125203248.GD5904@dualathlon.random>
In-Reply-To: <20041125150206.GF16633@logos.cnet>

On Thu, Nov 25, 2004 at 01:02:06PM -0200, Marcelo Tosatti wrote:
> On Sun, Oct 17, 2004 at 03:39:26AM +0100, Alan Cox wrote:
> > On Gwe, 2004-10-15 at 19:23, Marcelo Tosatti wrote:
> > > I prefer doing the "if (PageReserved(page)) put_page_testzero(page)" as
> > > you propose instead of changing get_user_pages(), as there are several
> > > users which rely on its behaviour.
> > > 
> > > I have applied your fix to the 2.4 BK tree.
> > 
> > That isnt sufficient. Consider anything else taking a reference to the
> > page and the refcount going negative. And yes 2.6.x has this problem and
> > far worse in some ways, but it also has the mechanism to fix it.
> > 
> > 2.6.x uses VM_IO as a VMA flag which tells the kernel two things
> > a) get_user_pages fails on it
> > b) core dumping of it is forbidden
> > 
> > 2.6.x is missing a whole pile of these (fixed in the 2.6.9-ac tree I'm
> > putting together). I *think* remap_page_range() in 2.6.x can just set
> > VM_IO, but older kernels didn't pass the vma so all the users would need
> > fixing (OSS audio, media/video, usb audio, usb video, frame buffer
> > etc).
> 
> I dont see any practical problem with 2.4.x right now.
> 
> get_user_pages() wont be called on driver created VMA's with PageReserved 
> pages because of the VM_IO bit which is set at remap_page_range(). 
> 
> Its not possible to have any vma mapped by a driver without VM_IO set.
> 
> But the network packet mmap was an isolated case, so I'll apply Andrea's 
> fix just for safety, although I can't find any offender in the tree.
> 
> 
> --- memory.c    2004-10-22 15:58:28.000000000 -0200
> +++ memory.c  2004-10-28 14:32:26.585813200 -0200
> @@ -499,7 +499,7 @@
>                                 /* FIXME: call the correct function,
>                                  * depending on the type of the found page
>                                  */
> -                               if (!pages[i])
> +                               if (!pages[i] || PageReserved(pages[i]))
>                                         goto bad_page;
>                                 page_cache_get(pages[i]);
>                         }

This needs to be modified to take ZERO_PAGE(start) into account.
It's a minor detail, but we should allow it. (My 2.4-aa tree was
already checking for the zero page for other reasons, so it didn't need
any change.)

In short, the above fix should be modified like this:

	if (!pages[i] || PageReserved(pages[i])) {
		if (pages[i] != ZERO_PAGE(start))
			goto bad_page;
	} else
		page_cache_get(pages[i]);

The zero page is guaranteed to remain reserved. The major bug is that
get_user_pages was allowing _temporarily_ reserved pages to be pinned.
__free_pages doesn't check VM_IO, so get_user_pages should be robust
against its __free_pages counterpart; this is why I believe the above
fix is the right one.
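
To make the decision above concrete, here is a minimal userspace sketch
of the pinning rule. It is not kernel code: struct model_page,
model_zero_page, model_pin_page() and the pin_result values are
hypothetical names invented for this illustration, and the counter only
stands in for what page_cache_get() does to the real page count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only what this sketch needs. */
struct model_page {
	int count;      /* models the page reference count */
	bool reserved;  /* models the PageReserved() flag */
};

/* Hypothetical stand-in for ZERO_PAGE(start): permanently reserved. */
static struct model_page model_zero_page = { .count = 1, .reserved = true };

enum pin_result { PIN_OK, PIN_SKIPPED_ZERO_PAGE, PIN_BAD_PAGE };

/* Mirrors the proposed check: reject missing or reserved pages, except
 * the zero page, which is accepted without taking a reference. */
static enum pin_result model_pin_page(struct model_page *page)
{
	if (!page || page->reserved) {
		if (page != &model_zero_page)
			return PIN_BAD_PAGE;       /* the 'goto bad_page' path */
		return PIN_SKIPPED_ZERO_PAGE;  /* count left untouched */
	}
	page->count++;                     /* models page_cache_get() */
	return PIN_OK;
}

/* Walks through the four cases the rule distinguishes. */
static void model_self_check(void)
{
	struct model_page normal = { .count = 1, .reserved = false };
	struct model_page driver = { .count = 1, .reserved = true };

	assert(model_pin_page(&normal) == PIN_OK && normal.count == 2);
	assert(model_pin_page(&driver) == PIN_BAD_PAGE && driver.count == 1);
	assert(model_pin_page(&model_zero_page) == PIN_SKIPPED_ZERO_PAGE);
	assert(model_zero_page.count == 1);
	assert(model_pin_page(NULL) == PIN_BAD_PAGE);
}
```

Calling model_self_check() from a test program exercises an ordinary
page (pinned, count goes up), a driver-reserved page (rejected, count
unchanged), the zero page (accepted, count unchanged) and a missing
page (rejected).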

The zero page is special because:

1) it's guaranteed to never be unpinned
2) it's the only reserved page that handle_mm_fault is allowed to
instantiate for a filesystem data mapping

The VM_IO enforcement is a nice improvement on top of the above.

Checking the PageReserved flag is worthwhile at least for the zero
page, so that we don't overflow the zero page's reference count.

If you really want to fix it using only VM_IO, I still recommend
applying the above patch and turning 'goto bad_page' into BUG().
It'd be a bad idea not to at least add the above code as a robustness
check.

Comments welcome. Thanks.
