* Thinko with per-bdi flusher threads
@ 2010-07-23 16:12 Jan Kara
2010-07-23 16:58 ` Christoph Hellwig
0 siblings, 1 reply; 2+ messages in thread
From: Jan Kara @ 2010-07-23 16:12 UTC (permalink / raw)
To: linux-fsdevel; +Cc: axboe, hch, Wu Fengguang
Hi,
I had a look at the bug https://bugzilla.kernel.org/show_bug.cgi?id=16312
(because we got some reports against our distro kernels as well ;). The
culprit is that when a device inode gets dirty, we try to file it on the
per-bdi queues of the bdi of the device the inode describes. I.e., if, say,
/dev/zero gets dirty because someone runs touch /dev/zero, we try to file the
dirty inode on /dev/zero's bdi, which obviously complains.
So a trivial reproducer is:
cd /tmp; mknod devzero c 1 5; touch devzero
(provided /tmp is on some normal filesystem such as ext3).
The question is how to solve this problem. Adding /dev/zero to the lists
of the "zero" bdi seems silly (we'd have to create a writeback thread, write
that single inode and kill the thread) and conceptually wrong (the inode write
has to happen against the filesystem carrying the device node, not against
mapping->backing_dev of the inode).
But there are more complicated cases. Think for example what should
happen if a filesystem on /dev/sda carries a device inode for /dev/sdb.
Then dirty pages of the device inode should be written by a per-bdi thread
for /dev/sdb but inode metadata should be written by a thread for /dev/sda.
Not too nice either because the device inode would have to be in two queues
- one for data and one for metadata writeback. OTOH checks like
bdi_has_dirty_io() would correctly report whether there is some
modification pending against a bdi or not.
A reasonable, mildly hacky solution would be to file the inode against the
parent filesystem's bdi if mapping->backing_dev isn't capable of having dirty
pages and doing writeback, and against mapping->backing_dev otherwise. This
would mean that we would have to properly mark bdis like the one of
/dev/zero as not capable of writeback.
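To make the proposed rule concrete, here is a small userspace model of it.
This is only a sketch: the structs, the can_writeback flag and the
bdi_for_dirty_inode() name are hypothetical stand-ins, not the kernel's
real definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's real definitions. */
struct bdi {
	const char *name;
	bool can_writeback;		/* false for devices like /dev/zero */
};

struct super_block {
	struct bdi *s_bdi;		/* bdi of the filesystem holding the inode */
};

struct inode {
	struct super_block *i_sb;
	struct bdi *mapping_bdi;	/* stands in for mapping->backing_dev */
};

/*
 * Pick the bdi whose dirty lists the inode should be filed on: the
 * mapping's bdi when it is capable of writeback, otherwise the bdi of
 * the parent filesystem.
 */
struct bdi *bdi_for_dirty_inode(const struct inode *inode)
{
	if (inode->mapping_bdi && inode->mapping_bdi->can_writeback)
		return inode->mapping_bdi;
	return inode->i_sb->s_bdi;
}
```

With this rule the /tmp/devzero inode lands on the ext3 filesystem's bdi. A
node for /dev/sdb stored on /dev/sda is still filed on sdb's bdi, so the
data vs. metadata split mentioned above is not addressed by it.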
Any opinions?
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: Thinko with per-bdi flusher threads
2010-07-23 16:12 Thinko with per-bdi flusher threads Jan Kara
@ 2010-07-23 16:58 ` Christoph Hellwig
0 siblings, 0 replies; 2+ messages in thread
From: Christoph Hellwig @ 2010-07-23 16:58 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-fsdevel, axboe, hch, Wu Fengguang
On Fri, Jul 23, 2010 at 06:12:39PM +0200, Jan Kara wrote:
> A reasonable, mildly hacky solution would be to file the inode against the
> parent filesystem's bdi if mapping->backing_dev isn't capable of having dirty
> pages and doing writeback, and against mapping->backing_dev otherwise. This
> would mean that we would have to properly mark bdis like the one of
> /dev/zero as not capable of writeback.
That seems like a good enough short term solution.
In the long term this should be fixed as a side effect of some patches
I'm about to submit. What I do is to split the writeback code
(bdi_writeback and some fields currently in the backing_dev) off the
backing_dev entirely and make it a separate entity.
mapping->backing_dev will still point to the backing dev, but all inode
dirtying and writeback will be done via the new sb->s_wb.
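A rough userspace model of that split, just to illustrate the idea. Only
sb->s_wb and mapping->backing_dev come from the description above; the
struct layouts, the nr_dirty counter and the mark_inode_dirty_model() name
are assumptions made for this sketch and do not reflect the actual patches.

```c
#include <stddef.h>

/* Hypothetical stand-ins; only s_wb and backing_dev are named in the text. */
struct backing_dev {
	const char *name;
};

struct writeback {			/* the split-off writeback state */
	unsigned int nr_dirty;		/* number of dirty inodes filed here */
};

struct super_block {
	struct writeback *s_wb;		/* per-superblock writeback entity */
};

struct address_space {
	struct backing_dev *backing_dev;	/* still points to the backing dev */
};

struct inode {
	struct super_block *i_sb;
	struct address_space *i_mapping;
};

/*
 * Dirtying goes through the superblock's writeback entity and never looks
 * at the mapping's backing dev, so a device inode for /dev/zero is filed
 * with the filesystem that carries it.
 */
struct writeback *mark_inode_dirty_model(struct inode *inode)
{
	struct writeback *wb = inode->i_sb->s_wb;

	wb->nr_dirty++;
	return wb;
}
```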