From: David Teigland <teigland@redhat.com>
To: Al Viro <viro@ftp.linux.org.uk>
Cc: linux-kernel@vger.kernel.org, akpm@osdl.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 11/16] GFS: mount and tuning options
Date: Tue, 11 Oct 2005 16:38:11 -0500
Message-ID: <20051011213811.GA15913@redhat.com>
In-Reply-To: <20051010213748.GQ7992@ftp.linux.org.uk>

On Mon, Oct 10, 2005 at 10:37:48PM +0100, Al Viro wrote:
> On Mon, Oct 10, 2005 at 12:10:52PM -0500, David Teigland wrote:
> > There are a variety of mount options, tunable parameters, internal
> > statistics, and methods of online file system manipulation.
>
> Could you explain WTF are you doing with rename here? This pile of
> ioctls is every bit as bad as sys_reiser4(); kindly provide a detailed
> description of the API you've introduced and explain why nothing saner
> would do...

First some background that I've copied from elsewhere: The superblock
contains a pointer to a "master" directory that contains various system
inodes. The inodes in the master directory are:

1) A directory named "jindex" containing all the journal files. The
   journals are named "journal0", "journal1", ..., "journalX".
2) A directory named "per_node" that contains a bunch of files where
   each node can store data specific to that node. Each node has
   files "inum_rangeX", "statfs_changeX", "unlinked_tagX", and
   "quota_changeX". So, there is a set of these four files for each
   journal in the jindex directory.
3) A file named "inum" that contains the next cluster-wide inode number.
4) A file named "statfs" that contains the cluster-wide statfs
   information.
5) A file named "rindex" that contains the locations of all the RGs in
   the filesystem. (RGs == resource groups == allocation groups)
6) A file named "quota" that contains the quota values (UID and GID)
   for the filesystem.
7) A directory named "root" that is the root directory of the
   user-visible filesystem.

The ioctls "hfile_stat", "hfile_read", "hfile_write", "hfile_trunc" are
used to operate on the hidden system files. I notice we're not using
trunc, so it can be removed. stat/read/write could be replaced with a few
specific ioctls if that's preferred.

The next issue is adding journals (and the associated system files) to a
fs. The gfs2_jadd command does this with the fs online. If you created
the fs with 8 journals and you now want 12 machines to mount it at once,
you need to add 4 journals by running "gfs2_jadd -j 4 /path/to/fs".

Say gfs2_jadd is adding a 9th journal (id 8) ...

  creates ordinary file /.gfs2_admin/new_inode
  writes to new_inode initializing it as an inum_range file
  moves .gfs2_admin/new_inode to per_node/inum_range8

  creates ordinary file /.gfs2_admin/new_inode
  writes to new_inode initializing it as a statfs_change file
  moves .gfs2_admin/new_inode to per_node/statfs_change8

  same for unlinked_tag8 and quota_change8

  creates ordinary file /.gfs2_admin/new_inode
  writes to new_inode initializing it as a journal file
  moves .gfs2_admin/new_inode to jindex/journal8

(keeping in mind that the "per_node" and "jindex" dirs and the files
under them are in the hidden/system portion of the fs)

The create and write steps use ordinary system calls. The "move" step
uses the "rename2system" ioctl to move .gfs2_admin/new_inode to the
specified system file. The new files are synced before being renamed so
in case of a crash only correctly formed files are found in the hidden
dirs. Only when the final journal file is moved into place is the fs
ready to accept a new mounter.
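
To make that ordering concrete, here is a rough user-space sketch that only
computes the destination names for one new journal, in the order the files
are moved into place. The names jadd_target and jadd_targets are made up
for illustration and are not taken from gfs2_jadd; the create, write, sync
and rename2system steps themselves are not shown.

```c
#include <stdio.h>

#define JADD_NTARGETS 5   /* four per_node files plus the journal itself */

/* Hypothetical type: one hidden-directory destination for a new file. */
struct jadd_target {
	char dir[16];    /* "per_node" or "jindex" */
	char name[32];   /* e.g. "inum_range8" */
};

/*
 * Fill out[] with the destinations for journal `id`, in the order they
 * are moved into place.  The journal comes last, because the fs is only
 * ready for a new mounter once jindex/journal<id> is in place.
 */
static void jadd_targets(unsigned int id,
			 struct jadd_target out[JADD_NTARGETS])
{
	static const char *const per_node[] = {
		"inum_range", "statfs_change", "unlinked_tag", "quota_change"
	};
	int i;

	for (i = 0; i < 4; i++) {
		snprintf(out[i].dir, sizeof(out[i].dir), "per_node");
		snprintf(out[i].name, sizeof(out[i].name), "%s%u",
			 per_node[i], id);
	}
	snprintf(out[4].dir, sizeof(out[4].dir), "jindex");
	snprintf(out[4].name, sizeof(out[4].name), "journal%u", id);
}
```

For id 8 this gives per_node/inum_range8, per_node/statfs_change8,
per_node/unlinked_tag8, per_node/quota_change8 and finally jindex/journal8.
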
Next is expanding the size of the fs. To do this, gfs2_grow first opens
the device and initializes the new space with RG headers. Second, it uses
the "resize_add_rgrps" ioctl to add new structures defining the space to
the "rindex" system file. I'm looking into using hfile_write for this.

Other ioctls:

  get_super        - copy struct gfs2_sb to user space
  get_file_stat    - copy struct gfs2_dinode to user space for given file
  set_file_flag    - set gfs-specific flag in inode
  get_bmap         - map file block to disk block
  get_file_meta    - return all the metadata for a file or dir
  do_file_flush    - sync out all dirty data and drop the cache and lock
  do_quota_sync    - sync outstanding quota change (moving to sysfs)
  do_quota_refresh - refresh quota lvb from the quota file (moving to sysfs)
  do_quota_read    - read quota values from quota file

Some of these we could do without if they're objectionable. Regardless,
we'll take a closer look to see if any don't qualify as useful enough.
Finally, how ioctl is implemented. All the commands above are multiplexed
through one actual ioctl (GFS2_IOCTL_SUPER) that passes in:

struct gfs2_ioctl {
	unsigned int gi_argc;
	char **gi_argv;
	char __user *gi_data;
	unsigned int gi_size;
	uint64_t gi_offset;
};

- argv[0] is the command string, e.g. "set_file_flag", "rename2system".
- argv[x] are other string arguments for the command, e.g. for set_file_flag
  argv[1] is either "set" or "clear". For rename2system argv[1] is the
  destination directory and argv[2] is the new name.
- gi_data, gi_size, gi_offset - data returned to caller when needed
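
As an illustration of the calling convention, a caller might fill in the
struct for rename2system roughly like this. It is only a sketch: the helper
name gfs2_fill_rename2system is invented, __user is defined away because it
is a kernel-only annotation, and the data fields are simply left empty on
the assumption that rename2system returns no data.

```c
#include <stddef.h>
#include <stdint.h>

#ifndef __user
#define __user   /* kernel address-space annotation; empty in user space */
#endif

struct gfs2_ioctl {
	unsigned int gi_argc;
	char **gi_argv;
	char __user *gi_data;
	unsigned int gi_size;
	uint64_t gi_offset;
};

/*
 * Hypothetical helper: set up *gi for the "rename2system" command.
 * argv[] is caller-provided storage for the three argument pointers
 * and must stay valid for as long as *gi is in use.
 */
static void gfs2_fill_rename2system(struct gfs2_ioctl *gi, char *argv[3],
				    char *dest_dir, char *new_name)
{
	static char cmd[] = "rename2system";

	argv[0] = cmd;        /* command string */
	argv[1] = dest_dir;   /* destination directory */
	argv[2] = new_name;   /* new name */

	gi->gi_argc = 3;
	gi->gi_argv = argv;
	gi->gi_data = NULL;   /* assumption: no data is returned */
	gi->gi_size = 0;
	gi->gi_offset = 0;
}
```

The filled-in struct would then be passed to ioctl(2) with the
GFS2_IOCTL_SUPER request. The request value comes from the GFS2 headers,
and which file descriptor the call is issued on is not covered here, so
both are left out of the sketch.
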
This could be exchanged, of course, for the more traditional ioctl mess if
that's any saner.
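
For comparison with numbered ioctls, the receiving side of a
string-multiplexed interface has to look the command up by name before it
can do anything. The following is a user-space sketch of that lookup, not
the GFS2 code: the table just lists the command strings mentioned in this
mail, and the function names gfs2_cmd_index and gfs2_parse_set_clear are
invented. A kernel version would also have to copy the strings in from
user space first.

```c
#include <stddef.h>
#include <string.h>

/* Command strings named in this mail; the order here is arbitrary. */
static const char *const gfs2_cmds[] = {
	"hfile_stat", "hfile_read", "hfile_write", "hfile_trunc",
	"rename2system", "resize_add_rgrps",
	"get_super", "get_file_stat", "set_file_flag", "get_bmap",
	"get_file_meta", "do_file_flush",
	"do_quota_sync", "do_quota_refresh", "do_quota_read",
};

/* Return the table index of argv[0], or -1 if it is missing or unknown. */
static int gfs2_cmd_index(unsigned int argc, char **argv)
{
	size_t i;

	if (argc == 0 || argv == NULL || argv[0] == NULL)
		return -1;
	for (i = 0; i < sizeof(gfs2_cmds) / sizeof(gfs2_cmds[0]); i++)
		if (strcmp(argv[0], gfs2_cmds[i]) == 0)
			return (int)i;
	return -1;
}

/*
 * For set_file_flag, argv[1] is either "set" or "clear".
 * Returns 1 for "set", 0 for "clear", -1 for anything else.
 */
static int gfs2_parse_set_clear(unsigned int argc, char **argv)
{
	if (argc < 2 || argv == NULL || argv[1] == NULL)
		return -1;
	if (strcmp(argv[1], "set") == 0)
		return 1;
	if (strcmp(argv[1], "clear") == 0)
		return 0;
	return -1;
}
```

With numbered ioctls the lookup collapses into a switch on the request
number, and the argument checking moves into one struct per command.
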
Dave