From: Andrew Morton <akpm@linux-foundation.org>
To: support@coraid.com
Cc: "Ed L. Cashin" <ecashin@coraid.com>,
linux-kernel@vger.kernel.org, Greg KH <greg@kroah.com>
Subject: Re: PATCH 2.6.21-rc1 aoe: handle zero _count pages in bios
Date: Thu, 1 Mar 2007 17:42:04 -0800 [thread overview]
Message-ID: <20070301174204.a550dd3a.akpm@linux-foundation.org> (raw)
In-Reply-To: <20070301231510.GC8524@coraid.com>
On Thu, 1 Mar 2007 18:15:10 -0500
"Ed L. Cashin" <ecashin@coraid.com> wrote:
> This patch works around a problem discussed here and on the XFS
> mailing list in January.
>
> http://lkml.org/lkml/2007/1/19/56
>
> To summarize the issue: If XFS (or any other creator of bios) gives
> the aoe driver a bio with pages that have a zero page _count, and then
> the aoe driver hands the page to the network layer in an sk_buff's
> frags, and if the network card does not support the scatter gather
> feature, then the network layer will eventually try to put_page on the
> page, and the kernel winds up panicing.
>
> There is a disconnect between the assumptions of the bio creator (that
> pages don't need to have a non-zero _count), and the assumptions of
> the network layer (where it's assumed that pages will always have a
> positive count). There was no response, though, to a call in January
> for ideas about resolving the disconnect.
>
> So to work around the issue, the simple patch below increments the
> page _count before handing it to the network layer and decrements it
> after the network layer is done with the page. This patch eliminates
> panics for XFS on aoe users who lack scatter gather support in their
> network cards.
Something funny is going on here.
Generally, one should increment the refcount of a page when it is put into
some container. That means that the page should get +1 when it is added to
a bio. (direct-io does this, but the mpage.c pagecache code cheats, and
relies upon PG_locked and PG-writeback protecting the page).
Similarly, the network code (or its caller) should be incrementing the
page's refcount as the page goes into a container (ie: the skb) and
decrementing it as the page is removed.
But someone somewhere is breaking those rules. Who?
> It's regrettable that _count is manipulated directly, because Andrew
> Morton changed the page "count" member to a _count to prevent exactly
> this kind of direct manipulation of the data. There does not appear
> to be a "right" way to increment and decrement the count, however,
> inside a driver without unwanted side effects. The closest candidates
> are in mm/internal.h and are presumably intended to be used
> exclusively by mm/*.c.
>
Odd. You _should_ be able to use plain old get_page() and put_page(). If
you use a raw decrement of page->count then there's a risk that this was
the final reference and the page will leak forever. We do need to use
put_page()'s return-it-to-the-allocator-if-this-was-the-last-use logic.
So. Who is breaking refcounting protocol here? Perhaps it is AOE, failing
to increment the refcount on pages as they are added to an skb?
(Do we know which callsite in XFS is adding zero-ref pages to a BIO, btw?)
next prev parent reply other threads:[~2007-03-02 1:42 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-03-01 23:15 PATCH 2.6.21-rc1 aoe: handle zero _count pages in bios Ed L. Cashin
2007-03-02 1:42 ` Andrew Morton [this message]
2007-03-02 2:29 ` Christoph Hellwig
2007-03-02 3:22 ` Andrew Morton
2007-03-02 4:30 ` Christoph Hellwig
2007-03-02 4:48 ` Andrew Morton
2007-03-02 4:49 ` Christoph Hellwig
2007-03-02 5:00 ` Andrew Morton
2007-03-02 5:03 ` Christoph Hellwig
2007-03-02 5:09 ` Andrew Morton
2007-03-02 5:15 ` Christoph Hellwig
2007-03-02 15:51 ` Sam Hopkins
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070301174204.a550dd3a.akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=ecashin@coraid.com \
--cc=greg@kroah.com \
--cc=linux-kernel@vger.kernel.org \
--cc=support@coraid.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox