From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Ram Pai <linuxram@us.ibm.com>, Zhi Yong Wu <zwu.kernel@gmail.com>,
	linux-fsdevel@vger.kernel.org, linuxram@linux.vnet.ibm.com,
	Dave Chinner <david@fromorbit.com>,
	cmm@us.ibm.com, Ben Chociej <bchociej@gmail.com>,
	James Northrup <northrup.james@gmail.com>
Subject: Re: VFS hot tracking: How to calculate data temperature?
Date: Wed, 7 Nov 2012 11:25:33 -0800
Message-ID: <20121107192533.GG3941@blackbox.djwong.org>
In-Reply-To: <20121107063642.GA7086@gmail.com>

On Wed, Nov 07, 2012 at 02:36:42PM +0800, Zheng Liu wrote:
> On Tue, Nov 06, 2012 at 03:10:11PM -0800, Darrick J. Wong wrote:
> > On Tue, Nov 06, 2012 at 05:36:38PM +0800, Ram Pai wrote:
> > > On Fri, Nov 02, 2012 at 04:41:09PM +0800, Zheng Liu wrote:
> > > > On Fri, Nov 02, 2012 at 02:38:29PM +0800, Zhi Yong Wu wrote:
> > > > > Here is another question.
> > > > > 
> > > > > How can we save the file temperature across umount, so that it is
> > > > > preserved after a reboot?
> > > > > 
> > > > > This is a requirement from a DB product.
> > > > > I thought that we could save the file temperature in its inode struct, that
> > > > > is, add one new field to struct inode, so that this info is written
> > > > > to disk with the inode.
> > > > > 
> > > > > Any comments or ideas are appreciated, thanks.
> > > > 
> > > > Hi Zhiyong,
> > > > 
> > > > I think that we might define a callback function.  If a filesystem wants
> > > > to save this data, it can implement a function to save it.  Each
> > > > filesystem can decide for itself whether to add one.
> > > > 
> > > > BTW, actually I don't really care about how to save these data because I
> > > > only want to observe which file is accessed in real time, which is very
> > > > useful for me to track a problem in our product system.
> > > 
> > > To me, umounting a filesystem is a way of explicitly telling the VFS that the
> > > filesystem's data is not hot anymore. So probably, it really does not make
> > > sense to store temperatures across mount boundaries.
> > 
> > I'd prefer that file heat data be retained across mounts -- we shouldn't
> > throw away all of our observations just because of a system crash / power
> > outage / scheduled reboot.
> > 
> > Or, imagine if you're a defragging tool.  If you're clever enough to try
> > consolidating all the hot blocks in one place on disk so that you could
> > aggressively read them all in at once (e.g. ureadahead), I think you'd want to
> > be able to access as big of an observation pool as possible.
> > 
> > This just occurred to me -- are you saving all of the file's heat data, like
> > the per-range read/write counters, and the averages?  Or just a single compiled
> > heat rating for the whole file?  I suggested a big hidden file a few days ago
> > because I'd thought you were trying to save all the range/heat data, which
> > would probably be painful to shoehorn into an xattr.  If you're only storing a
> > single number, then the xattr way is probably ok.
> 
> Hi Darrick,
> 
> Maybe the best way is to provide a new mount option or a sysfs switch to
> turn it on or off, so that the user can decide whether it is enabled.
> After all, it will bring some extra overhead.  Turning it on in our
> product system is unacceptable to me unless there is a problem that
> I need to track.

Hmm... who are the intended in-kernel users of the hot tracking feature?  I'm
starting to wonder if it's possible (or desirable) to implement some of this in
userspace and have the kernel ask for the hot data as needed, or simply write a
driver program that handles the strategy and only needs the kernel interface
that moves extents around.  I feel like we could just write a regular program
that uses ftrace to record io activity and manage all the observations that we
pick up, and then the db, defrag, dedupe, etc. programs can just call into
that?
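
To make the idea concrete, here is a minimal sketch of the kind of userspace
observation store such a program might keep.  Everything in it is an
assumption made for illustration: the ObservationPool name, the record() and
hottest() calls, and the use of raw access counts in place of whatever heat
formula gets settled on.  The event source (an ftrace reader, for instance)
is not shown.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ObservationPool:
    """Hypothetical userspace store of per-file I/O observations."""

    reads: Counter = field(default_factory=Counter)
    writes: Counter = field(default_factory=Counter)

    def record(self, path: str, is_write: bool) -> None:
        # One call per observed I/O event.  Where the events come from
        # (e.g. a process reading ftrace output) is assumed, not shown.
        (self.writes if is_write else self.reads)[path] += 1

    def hottest(self, n: int) -> list:
        # Rank files by total access count.  A plain count is only a
        # stand-in for a real temperature calculation.
        total = self.reads + self.writes
        return total.most_common(n)


pool = ObservationPool()
for path, is_write in [("/db/a", False), ("/db/a", True), ("/db/b", False)]:
    pool.record(path, is_write)

print(pool.hottest(1))
```

A db, defrag, or dedupe tool would then only need something like hottest()
from the daemon, plus the kernel interface that moves extents around.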

On the other hand, writing some daemon program has its own problems with
distribution, starting it up, and killing it off at shutdown.  But it would
make Zheng's (non)use case easier -- if you don't want it, don't run it.

Perhaps this approach has already been discussed and thrown out?  In which case
I'll shut up. :)

--D
> 
> Regards,
> Zheng
