From: Shaohua Li <shaohua.li@intel.com>
To: "Wu, Fengguang" <fengguang.wu@intel.com>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
Chris Mason <chris.mason@oracle.com>,
Christoph Hellwig <hch@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
Arjan van de Ven <arjan@infradead.org>,
"Yan, Zheng" <zheng.z.yan@linux.intel.com>,
"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
"mtk.manpages@gmail.com" <mtk.manpages@gmail.com>
Subject: Re: [PATCH v2 0/5] add new ioctls to do metadata readahead in btrfs
Date: Tue, 11 Jan 2011 10:03:16 +0800
Message-ID: <1294711397.1949.613.camel@sli10-conroe>
In-Reply-To: <20110111013813.GA10449@localhost>
On Tue, 2011-01-11 at 09:38 +0800, Wu, Fengguang wrote:
> On Tue, Jan 11, 2011 at 08:15:19AM +0800, Li, Shaohua wrote:
> > On Mon, 2011-01-10 at 22:26 +0800, Wu, Fengguang wrote:
> > > Shaohua,
> > >
> > > On Tue, Jan 04, 2011 at 01:40:30PM +0800, Li, Shaohua wrote:
> > > > Hi,
> > > > We have file readahead to do async file reads, but have no metadata
> > > > readahead. For a list of files, their metadata is stored in fragmented
> > > > disk space and metadata reads are synchronous, which greatly hurts the
> > > > efficiency of readahead. The patches try to add metadata readahead
> > > > for btrfs.
> > > > In btrfs, metadata is stored in btree_inode. Ideally, we could hook
> > > > the inode to a fd and use existing syscalls (readahead, mincore
> > > > or upcoming fincore) to do readahead, but the inode is hidden and there
> > > > is no easy way to do this from my understanding. So we add two ioctls for
> > >
> > > If that is the main obstacle, why not do straightforward fincore()/
> > > fadvise(), and add ioctls to btrfs to export/grab the hidden
> > > btree_inode in any form? This will address btrfs' specific issue, and
> > > have the benefit of making the VFS part general enough. You know
> > > ext2/3/4 already have block_dev ready for metadata readahead.
> > I forgot to update this comment. Please see patch 2 and patch 4: both
> > incore and readahead need btrfs-specific stuff involved, so we can't use
> > generic fincore or something similar.
>
> You can if you like :)
>
> - fincore() can return the referenced bit, which is generally
> useful information
Metadata pages in ext2/3 don't have the referenced bit set, while btrfs
metadata pages do, so we can't blindly filter out pages by that bit.
fincore could take a parameter, or return a bit, to distinguish referenced
pages, but I don't think that's a good API. This should be transparent to
userspace.
> - btrfs_metadata_readahead() can be passed to some (faked)
> ->readpages() for use with fadvise.
This needs a filesystem-specific hook too. The difference is that your
proposal uses fadvise while I'm using an ioctl, which isn't a big difference.
BTW, it's hard to hook btrfs_inode to a fd even with an ioctl; at least I
didn't find an easy way to do it. It might be possible, for example by
adding a fake device or fake fs (anon_inode doesn't work here, IIRC), but
that is a bit ugly. Until it's proven that a generic API can handle
metadata readahead, I don't want to do it.
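To make the comparison concrete, here is a minimal userspace sketch of the
generic path for ordinary files: posix_fadvise(POSIX_FADV_WILLNEED) to start
async readahead, and mincore() on a mapping to see which pages are cached.
It only illustrates the existing calls on a regular file fd; it is not the
interface from this patch set, and the helper names prefetch_file() and
resident_pages() are made up for the example. There is no such fd for the
btrfs btree inode, which is the gap the ioctls are meant to fill.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to start reading [0, len) of fd
 * in the background. posix_fadvise() returns 0 on success or an error
 * number (it does not set errno).
 */
static int prefetch_file(int fd, off_t len)
{
    return posix_fadvise(fd, 0, len, POSIX_FADV_WILLNEED);
}

/*
 * Hypothetical helper: count how many pages of the first len bytes of fd
 * are currently resident in the page cache. Returns -1 on error.
 */
static long resident_pages(int fd, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t npages, i;
    unsigned char *vec;
    void *map;
    long count = 0;

    assert(page > 0);
    if (len == 0)
        return 0;
    npages = (len + (size_t)page - 1) / (size_t)page;

    map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return -1;

    vec = malloc(npages);
    if (!vec) {
        munmap(map, len);
        return -1;
    }

    /* mincore() fills one byte per page; bit 0 set means resident. */
    if (mincore(map, len, vec) == 0) {
        for (i = 0; i < npages; i++)
            if (vec[i] & 1)
                count++;
    } else {
        count = -1;
    }

    free(vec);
    munmap(map, len);
    return count;
}
```

A boot-time prefetcher would call resident_pages() once after boot to record
which file pages were used, and prefetch_file() on the next boot to pull them
in early. The metadata ioctls play the same two roles for the hidden inode.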
Thanks,
Shaohua
> > > > this. One is like the readahead syscall, the other is like the
> > > > mincore/fincore syscall.
> > > > On a hard-disk-based netbook running MeeGo, metadata readahead
> > > > reduced boot time by about 3.5s on average, out of 16s total.
> > > > Last time I posted similar patches to the btrfs mailing list, which
> > > > added the new ioctls in btrfs-specific ioctl code. But Christoph
> > > > Hellwig asked for a generic interface so other filesystems can share
> > > > some code, so I came up with the new one. Comments and suggestions are
> > > > welcome!
> > > >
> > > > v1->v2:
> > > > 1. Added more comments and fixed return values as suggested by Andrew Morton
> > > > 2. Fixed a race condition pointed out by Yan Zheng
> > > >
> > > > initial post:
> > > > http://marc.info/?l=linux-fsdevel&m=129222493406353&w=2
> > > >
> > > > Thanks,
> > > > Shaohua
> > > >
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
> >