linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@ZenIV.linux.org.uk>
To: zwu.kernel@gmail.com
Cc: linux-fsdevel@vger.kernel.org, sekharan@us.ibm.com,
	linuxram@us.ibm.com, david@fromorbit.com,
	chris.mason@fusionio.com, jbacik@fusionio.com,
	Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Subject: Re: [PATCH v3 00/13] VFS hot tracking
Date: Fri, 28 Jun 2013 17:03:24 +0100	[thread overview]
Message-ID: <20130628160324.GA23227@ZenIV.linux.org.uk> (raw)
In-Reply-To: <1371817042-8556-1-git-send-email-zwu.kernel@gmail.com>

On Fri, Jun 21, 2013 at 08:17:09PM +0800, zwu.kernel@gmail.com wrote:
> From: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
> 
>   The patchset is trying to introduce hot tracking function in
> VFS layer, which will keep track of real disk I/O in memory.
> By it, you will easily know more details about disk I/O, and
> then detect where disk I/O hot spots are. Also, specific FS
> can take use of it to do accurate defragment, and hot relocation
> support, etc.
> 
>   After V1 was sent out, Chandra Seetharaman has reviewed and
> made a lot of comments, thanks a lot to him. Now it's time to
> send out its V3 for external review, any comments or ideas are
> appreciated, thanks.

First of all, my apologies for obscenely long delay with review.  I've
started doing it several times and dropped these attempts getting mired
in the nightmare of lifetime rules in there.  What I should've done was
sending the obvious low-hanging fruits right then and waiting for the
variant with that stuff sanitized ;-/

First of all, one general observation: please, separate inode and range stuff
clearly; do *not* use the same functions on both, it only makes harder to read.
I.e. kill hot_comm_item and split the functions that take it.  This code is
trying to be too generic for its own good and ends up being obfuscated to hell
and back...

refcounting:
	* the whole refcount + "DELETING" flag approach is a bad idea.
hot_comm_item_unlink() tries to be idempotent *and* includes dropping the
reference on its first call.  Schemes like that tend to be either pointless
(i.e. we know that it won't be called twice for the same object) or prone
to stepping on dangling pointers.  This one does the latter (at least).
	* lookups (both for inode and range) leak on race with unlink.
You grab a reference, drop the lock, check if HOT_DELETING is set and
return an error if it is; how the hell could the caller possibly undo
that reference grab, when it has no idea what object had been grabbed
in the first place?  Moreover, that check does *not* prevent getting
an object with HOT_DELETING from those functions - unlink coming just
after that check will be unnoticed.

hot_inode_item_delete() mess:
	* hot_inode_item_delete() is done on unlink(2), no matter how many
links are there.  Why?
	* hot_inode_item_delete() is done even if unlink() fails (e.g. with
EBUSY, or on whatever error ->unlink() might return).
	* hot_inode_item_delete() grabs a reference to hot_inode_item,
*drops* *it*, then does hot_comm_item_unlink().  What protects it from being
freed right as we drop the damn reference?

debugfs-related issues - debugfs is completely unsuitable for dynamic
objects and you step into that big way, in addition to races of your
own:
	* creation of hot_debugfs_root is racy - WTF prevents
two hot_track_init() in parallel?
	* if creation fails, we leave hot_debugfs_root ERR_PTR(something);
from that point on, no attempts to create it will be done (check for
hot_debugfs_root being *NULL*)
	* what guarantees that sb->s_id is unique?
	* removal on failure - screwed; first of all, list_empty()
will *not* be true if we have somebody open it (cursors are inserted into
that list).  Moreover, what's to stop another hot_debugfs_init() from
being called just as we are doing debugfs_remove(hot_debugfs_root) and
see that it's non-NULL *before* we get to assigning NULL there?
	* hot_debugfs_exit() - screwed in the same way.
	* debugfs files get ->i_private set to corresponding sb->s_hot_root.
It's copied to seq->private on open() and used by iterators.  WTF prevents
open on debugfs, followed by umount of corresponding btrfs volume, freeing of
sb->s_hot_root and then read() on our file stepping into kfree'd hot_root
in ->start()?

  parent reply	other threads:[~2013-06-28 16:03 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-21 12:17 [PATCH v3 00/13] VFS hot tracking zwu.kernel
2013-06-21 12:17 ` [PATCH v3 01/13] VFS hot tracking: introduce some data structures zwu.kernel
2013-06-21 12:17 ` [PATCH v3 02/13] VFS hot tracking: add i/o freq tracking hooks zwu.kernel
2013-06-21 12:17 ` [PATCH v3 03/13] VFS hot tracking: add one wq to update hot map zwu.kernel
2013-06-21 12:17 ` [PATCH v3 04/13] VFS hot tracking: register one shrinker zwu.kernel
2013-06-21 12:17 ` [PATCH v3 05/13] VFS hot tracking, rcu: introduce one rcu macro for list zwu.kernel
2013-06-21 12:17 ` [PATCH v3 06/13] VFS hot tracking, seq_file: add seq_list rcu interfaces zwu.kernel
2013-06-21 12:17 ` [PATCH v3 07/13] VFS hot tracking: add debugfs support zwu.kernel
2013-06-21 12:17 ` [PATCH v3 08/13] VFS hot tracking: add one ioctl interface zwu.kernel
2013-06-21 12:17 ` [PATCH v3 09/13] VFS hot tracking, procfs: add one proc interface zwu.kernel
2013-06-21 12:17 ` [PATCH v3 10/13] VFS hot tracking: add memory caping function zwu.kernel
2013-06-21 12:17 ` [PATCH v3 11/13] VFS hot tracking, btrfs: add hot tracking support zwu.kernel
2013-06-21 12:17 ` [PATCH v3 12/13] VFS hot tracking: add documentation zwu.kernel
2013-06-21 12:17 ` [PATCH v3 13/13] VFS hot tracking: add fs hot type support zwu.kernel
2013-06-24 13:41 ` [PATCH v3 00/13] VFS hot tracking Zhi Yong Wu
2013-06-28 16:03 ` Al Viro [this message]
2013-07-01 13:19   ` Zhi Yong Wu
2013-07-03 13:30     ` Al Viro
2013-07-03 15:16       ` Zhi Yong Wu
2013-07-08 12:44         ` Zhi Yong Wu
2013-07-02 12:45   ` Zhi Yong Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130628160324.GA23227@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=chris.mason@fusionio.com \
    --cc=david@fromorbit.com \
    --cc=jbacik@fusionio.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linuxram@us.ibm.com \
    --cc=sekharan@us.ibm.com \
    --cc=wuzhy@linux.vnet.ibm.com \
    --cc=zwu.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).