From: Christoph Hellwig <hch@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Anton Altaparmakov <aia21@cam.ac.uk>,
Jens Axboe <axboe@kernel.dk>, Christoph Hellwig <hch@lst.de>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
George Spelvin <linux@horizon.com>
Subject: Re: a major regression in recent kernels? - was: Re: Null pointer OOPS in sync_inodes_sb+0xa9/0x104
Date: Fri, 4 Mar 2011 07:52:20 -0500
Message-ID: <20110304125220.GA6740@infradead.org>
In-Reply-To: <AANLkTi=eWdUjgTB2iWOB7YxOtm9f9a9Ux2uCghR=KTgN@mail.gmail.com>
On Wed, Mar 02, 2011 at 10:31:15AM -0800, Linus Torvalds wrote:
> The whole "backing_dev_info" has been a total disaster. The thing is
> crap. It violates all the normal kernel memory management rules ("Thou
> shalt use reference counts and free only when it goes to zero") and
> the whole thing has been a constant source of "oh, that driver didn't
> set it, but we changed all the code to require it to be correct".
>
> And the reason we set it to NULL when the device goes away is exactly
> that it's not ref-counted correctly, so we really _have_ to set it to
> NULL, because it's not going to be around.
>
> (And the reverse of that is why all kernel data structures should use
> refcounts, and not some external lifetime notion)
Yes. But the bdi is even worse than that, as it conflates things with
different lifetimes into a single object. We have the "old school" bdi,
which mostly contains various bits of tuning for the VM and read-ahead
algorithms. This one has to stay around even with no fs mounted on the
block device, because people expect it to be there whether or not a fs
is mounted. And then we have the writeback context entangled into it,
which only makes sense with an active filesystem (or block device node)
on it, and that makes it special fun. Even more fun is that we have a
pointer from the superblock, and one from the inode, and the latter might
point to lala land if this is, say, a /dev/mem node, which has a different
bdi for the "old-school" MM usage.
I had various stages of prototypes for separating the two into:

 1) the old bdi. Lifetime rules are: allocated and reference counted
    with the containing device. That is the gendisk for block devices,
    the server context for remote devices, and static at module init
    time for /dev/zero and similar.
 2) the writeback context. It only exists if a user is there, and is
    thus refcounted by itself. For non-blockdevice filesystem instances
    it's trivially always allocated with the superblock, and goes away
    with it. For block-device instances we need to keep a pointer to it
    from struct block_device and properly look it up on mount, or on
    opening of the block device node.
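
To make the two lifetime rules above concrete, here is a rough user-space
sketch of the split. It is not the prototype code and not existing kernel
API: struct bdi_tunables, struct device_ctx, struct wb_context and the
device_alloc/device_put/wb_get/wb_put helpers are all made-up names, and a
plain int stands in for a proper atomic refcount, so treat it as an
illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* All names below are hypothetical; none of this is existing kernel API. */

/* (1) "Old" bdi: VM and read-ahead tuning only. Embedded in the
 * containing device, so it is allocated and freed together with it. */
struct bdi_tunables {
	unsigned long ra_pages;
};

struct wb_context;

/* Stand-in for a gendisk, a server context, and so on. */
struct device_ctx {
	int refcount;            /* plain int: single-threaded sketch only */
	struct bdi_tunables bdi;
	struct wb_context *wb;   /* NULL while the device has no user */
};

/* (2) Writeback context: exists only while somebody uses the device. */
struct wb_context {
	int refcount;
	struct device_ctx *dev;  /* pins the device while writeback exists */
};

struct device_ctx *device_alloc(unsigned long ra_pages)
{
	struct device_ctx *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->refcount = 1;       /* reference held by the creator */
	dev->bdi.ra_pages = ra_pages;
	return dev;
}

void device_put(struct device_ctx *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0) {
		/* A live writeback context would still hold a reference. */
		assert(dev->wb == NULL);
		free(dev);
	}
}

/* Look up the writeback context on mount or open; the first user creates it. */
struct wb_context *wb_get(struct device_ctx *dev)
{
	struct wb_context *wb = dev->wb;

	if (wb) {
		wb->refcount++;
		return wb;
	}
	wb = calloc(1, sizeof(*wb));
	if (!wb)
		return NULL;
	wb->refcount = 1;
	wb->dev = dev;
	dev->refcount++;         /* writeback context pins the device */
	dev->wb = wb;
	return wb;
}

/* Drop one user; the last one tears the writeback context down. */
void wb_put(struct wb_context *wb)
{
	struct device_ctx *dev = wb->dev;

	assert(wb->refcount > 0);
	if (--wb->refcount > 0)
		return;
	dev->wb = NULL;          /* nobody can look up the stale pointer */
	free(wb);
	device_put(dev);
}
```

The point of the sketch is the direction of the references: the writeback
context pins the device, so the tunables always outlive it, and dev->wb
only goes back to NULL when the last user has dropped its reference rather
than because of some external lifetime notion.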
I guess I need to get back to it, but I have put it off for now, as the
code has reached relative stability and I really fear touching it again.
It's for sure not .38 material, though.