From: Andrea Arcangeli <andrea@suse.de>
To: Gerd Knorr <kraxel@bytesex.org>
Cc: Hugh Dickins <hugh@veritas.com>,
Marcelo Tosatti <marcelo@conectiva.com.br>,
Linus Torvalds <torvalds@transmeta.com>,
Andrew Morton <akpm@zip.com.au>,
Rik van Riel <riel@conectiva.com.br>,
"David S. Miller" <davem@redhat.com>,
Benjamin LaHaise <bcrl@redhat.com>, Dave Jones <davej@suse.de>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] __free_pages_ok oops
Date: Thu, 14 Feb 2002 14:10:28 +0100
Message-ID: <20020214141028.M7940@athlon.random>
In-Reply-To: <Pine.LNX.4.21.0202131652050.20915-100000@freak.distro.conectiva> <Pine.LNX.4.21.0202141045250.1722-100000@localhost.localdomain> <20020214121037.A6194@bytesex.org>
In-Reply-To: <20020214121037.A6194@bytesex.org>
On Thu, Feb 14, 2002 at 12:10:37PM +0100, Gerd Knorr wrote:
> > However: that is the only unambiguous example I've seen, and you
> > may argue that his bttv 0.8 driver is not in the current 2.4 tree,
> > is experimental, and even wrong in that area (we now know it also
> > vfrees there).
>
> I've recently changed the code to make it *not* call unmap_kiobuf/vfree
> from irq context. Instead bttv 0.8.x doesn't allow you to close the
> device with DMA xfers in flight. If you try this the release() fops
> handler will block until the transfer is done, then unmap_kiobuf from
> process context, then return.
Perfect, that's the right fix for 2.4 (waiting for DMA to complete at
->release also looks much saner). unmap_kiobuf was never supposed to be
run from irq handlers. Nothing that deals with userspace mappings can
run from irq handlers: tlb flushes, VM, swapping etc... all of it must
run from normal kernel context. As long as you obey this rule, my
previous email to this thread still applies. I wasn't aware of bttv
running unmap_kiobuf from irq.
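
To make the rule concrete, here is a minimal single-threaded userspace
model of it. This is a sketch, not bttv or kernel code: struct
capture_dev and all the function names are hypothetical, and the loop
in dev_release() only stands in for sleeping until the completion
interrupt arrives.

```c
#include <stdbool.h>

/* Hypothetical model of a capture device; not the bttv driver. */
struct capture_dev {
    bool in_irq;        /* true while the simulated irq handler runs */
    int dma_in_flight;  /* DMA transfers not yet completed */
    bool buffer_mapped; /* stands in for the mapped kiobuf */
};

/* Teardown of a userspace mapping: refuses to run in irq context. */
static int unmap_buffer(struct capture_dev *dev)
{
    if (dev->in_irq)
        return -1;      /* the rule: never from an irq handler */
    dev->buffer_mapped = false;
    return 0;
}

/* Simulated completion interrupt: only records that a transfer is done. */
static void irq_handler(struct capture_dev *dev)
{
    dev->in_irq = true;
    if (dev->dma_in_flight > 0)
        dev->dma_in_flight--;
    /* deliberately no unmap_buffer() call here */
    dev->in_irq = false;
}

/* release(): wait for outstanding DMA, then unmap from process context. */
static int dev_release(struct capture_dev *dev)
{
    while (dev->dma_in_flight > 0)
        irq_handler(dev);   /* assumption: stand-in for blocking */
    return unmap_buffer(dev);
}
```

In a real driver the wait would be a sleep on a wait queue that the
interrupt handler wakes up; the only point here is that the unmap
happens after the wait, in the caller's context.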
With aio in 2.5 we may want to change this property for the unpinning
stage, which would be better run asynchronously from irq handlers, but I
wouldn't change that for 2.4 (at least until we're forced to ship aio in
production on top of 2.4, which cannot happen until a final
user<->kernel API is registered somewhere).
I think the fundamental design mistake that leads to __free_pages
failing from irq is that we allow an anonymous page to reach count 0
while it is still in the LRU (the count == 0 check in shrink_cache is
the other side of the hack too). That's the real BUG: it subtly breaks
the freelist semantics, and then we need horrible hacks like Hugh's last
patch to work around this magic case (or even worse, Rik's proposal for
a spin_lock_irqed list that would hurt in all the vm fast paths). As far
as clean design and orthogonality of subsystems are concerned, the right
fix for 2.5 is to bump the page->count at the time the anonymous page is
added to the lru (think and guess why we're doing that for the
pagecache, and why the pagecache is obviously safe even for aio and
unmap_kiobuf from irq). Then we need to take it into account during
COWs (page count == 2 for an anonymous page will mean "exclusive")
etc... the MM will need to be changed a little more heavily than with
the hack approach, but I think that's the clean design in the long run.
No special cases for those magic anonymous pages, everything goes in
pagecache, with the difference that the anonymous pages aren't hashed
(at least until they become swapcache). The semantics of __free_pages
will remain that if you own a page you can __free_pages it anytime you
want without running into BUG()s. If the page is also owned by some
other subsystem (the VM), that subsystem will need to take care of
bumping the reference count and of freeing the page later, lazily. No
collisions.
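
The counting scheme proposed above can be modelled in a few lines of
userspace C. This is only a sketch of the idea, not a patch: struct
page_model and every function name are hypothetical, and no locking is
shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a page; not the kernel's struct page. */
struct page_model {
    int count;    /* reference count */
    bool on_lru;  /* LRU membership, which holds one reference */
    bool freed;   /* page went back to the free list */
};

/* The allocator hands the page to its first owner with count 1. */
static void page_alloc_model(struct page_model *p)
{
    p->count = 1;
    p->on_lru = false;
    p->freed = false;
}

/* Adding to the LRU takes a reference, as is done for the pagecache. */
static void lru_add_model(struct page_model *p)
{
    p->count++;
    p->on_lru = true;
}

/* Any owner may drop its reference, from any context. The page can
 * only hit the free list once the LRU reference is gone too. */
static void put_page_model(struct page_model *p)
{
    assert(p->count > 0);
    if (--p->count == 0) {
        assert(!p->on_lru); /* count 0 while on the LRU cannot happen */
        p->freed = true;
    }
}

/* COW check: on the LRU, count == 2 (one mapper + LRU) is exclusive. */
static bool anon_page_exclusive(const struct page_model *p)
{
    return p->on_lru ? p->count == 2 : p->count == 1;
}

/* VM side, normal kernel context: lazily free a page whose only
 * remaining reference is the LRU's own. */
static bool lru_reap_model(struct page_model *p)
{
    if (!p->on_lru || p->count != 1)
        return false;
    p->on_lru = false;
    put_page_model(p);
    return true;
}
```

With this, a put from irq context only decrements the count; the LRU
list itself is touched solely by lru_reap_model(), which runs later
from the VM side.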
As I said, this should only matter for 2.5, now that Gerd calls
unmap_kiobuf from normal kernel context.
Andrea