From: Steven Whitehouse <swhiteho@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH] GFS2: Extend the life of the reservations structure
Date: Fri, 13 Apr 2012 11:10:33 +0100 [thread overview]
Message-ID: <1334311833.2746.5.camel@menhir> (raw)
In-Reply-To: <eef337a4-5ccb-43d7-a2f7-5527a2d6c5a5@zmail12.collab.prod.int.phx2.redhat.com>
Hi,
On Thu, 2012-04-12 at 11:43 -0400, Bob Peterson wrote:
> Hi,
>
> This patch lengthens the life of the reservations structure.
> See notes below.
>
> Regards,
>
> Bob Peterson
> Red Hat GFS
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> --
> Author: Bob Peterson <rpeterso@redhat.com>
> Date: Thu Apr 12 11:37:24 2012 -0500
>
> GFS2: Extend the life of the reservations structure
>
> This patch lengthens the lifespan of the reservations structure for
> inodes. Before, they were allocated and deallocated for every write
> operation. With this patch, they are allocated when the first write
> occurs, and deallocated when the last process closes the file.
> It's more efficient to do it this way because it saves GFS2 a lot of
> unnecessary allocates and frees. It also gives us more flexibility
> for the future: (1) we can now fold the qadata structure back into
> the structure and save those alloc/frees, (2) we can use this for
> multi-block reservations.
>
> diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
> index 56dc1f0..a793957 100644
> --- a/fs/gfs2/aops.c
> +++ b/fs/gfs2/aops.c
> @@ -879,7 +879,7 @@ static int gfs2_write_end(struct file *file, struct address_space *mapping,
> brelse(dibh);
> failed:
> gfs2_trans_end(sdp);
> - if (ip->i_res)
> + if (gfs2_mb_reserved(ip))
> gfs2_inplace_release(ip);
> if (qa) {
> gfs2_quota_unlock(ip);
> diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
> index 31b199f..3a1a60a 100644
> --- a/fs/gfs2/file.c
> +++ b/fs/gfs2/file.c
> @@ -384,6 +384,10 @@ static int gfs2_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> set_bit(GLF_DIRTY, &ip->i_gl->gl_flags);
> set_bit(GIF_SW_PAGED, &ip->i_flags);
>
> + ret = gfs2_rs_alloc(ip);
> + if (ret)
> + goto out_unlock;
> +
I don't think the locking is quite right here... at least I can't see
any locking at all in this case.
> if (!gfs2_write_alloc_required(ip, pos, PAGE_CACHE_SIZE)) {
> lock_page(page);
> if (!PageUptodate(page) || page->mapping != inode->i_mapping) {
> @@ -569,10 +573,17 @@ static int gfs2_release(struct inode *inode, struct file *file)
> {
> struct gfs2_sbd *sdp = inode->i_sb->s_fs_info;
> struct gfs2_file *fp;
> + struct gfs2_inode *ip = GFS2_I(inode);
>
> fp = file->private_data;
> file->private_data = NULL;
>
> + if (file->f_mode & FMODE_WRITE) {
> + mutex_lock(&inode->i_mutex);
> + if (ip->i_res && (atomic_read(&inode->i_writecount) == 1))
> + gfs2_rs_delete(ip);
> + mutex_unlock(&inode->i_mutex);
> + }
> if (gfs2_assert_warn(sdp, fp))
> return -EIO;
>
> @@ -653,12 +664,16 @@ static ssize_t gfs2_file_aio_write(struct kiocb *iocb, const struct iovec *iov,
> unsigned long nr_segs, loff_t pos)
> {
> struct file *file = iocb->ki_filp;
> + struct dentry *dentry = file->f_dentry;
> + struct gfs2_inode *ip = GFS2_I(dentry->d_inode);
> + int ret;
> +
> + ret = gfs2_rs_alloc(ip);
> + if (ret)
> + return ret;
Again, there is no locking here.
>
> if (file->f_flags & O_APPEND) {
> - struct dentry *dentry = file->f_dentry;
> - struct gfs2_inode *ip = GFS2_I(dentry->d_inode);
> struct gfs2_holder gh;
> - int ret;
>
> ret = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
> if (ret)
> diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
> index a9ba244..127b1a4 100644
> --- a/fs/gfs2/inode.c
> +++ b/fs/gfs2/inode.c
> @@ -681,6 +681,10 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
> if (error)
> goto fail_gunlock;
>
> + error = gfs2_rs_alloc(dip);
> + if (error)
> + goto fail_gunlock;
> +
> error = alloc_dinode(dip, &inum.no_addr, &generation);
> if (error)
> goto fail_gunlock;
> @@ -704,6 +708,11 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
> if (error)
> goto fail_gunlock2;
>
> + /* the new inode needs a reservation so it can allocate xattrs. */
> + error = gfs2_rs_alloc(GFS2_I(inode));
> + if (error)
> + goto fail_gunlock2;
> +
> error = gfs2_acl_create(dip, inode);
> if (error)
> goto fail_gunlock2;
> @@ -722,7 +731,7 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
> gfs2_trans_end(sdp);
> /* Check if we reserved space in the rgrp. Function link_dinode may
> not, depending on whether alloc is required. */
> - if (dip->i_res)
> + if (gfs2_mb_reserved(dip))
> gfs2_inplace_release(dip);
> gfs2_quota_unlock(dip);
> gfs2_qadata_put(dip);
> @@ -734,6 +743,7 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
> fail_gunlock2:
> gfs2_glock_dq_uninit(ghs + 1);
> fail_gunlock:
> + gfs2_rs_delete(dip);
> gfs2_glock_dq_uninit(ghs);
> if (inode && !IS_ERR(inode)) {
> set_bit(GIF_ALLOC_FAILED, &GFS2_I(inode)->i_flags);
> diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
> index 146c3d2..af590de 100644
> --- a/fs/gfs2/rgrp.c
> +++ b/fs/gfs2/rgrp.c
> @@ -418,6 +418,34 @@ void gfs2_free_clones(struct gfs2_rgrpd *rgd)
> }
> }
>
> +/**
> + * gfs2_rs_alloc - make sure we have a reservation assigned to the inode
> + * @ip: the inode for this reservation
> + */
> +int gfs2_rs_alloc(struct gfs2_inode *ip)
> +{
> + if (ip->i_res)
> + return 0;
> + ip->i_res = kmem_cache_zalloc(gfs2_rsrv_cachep, GFP_NOFS);
> + if (!ip->i_res)
> + return -ENOMEM;
> + return 0;
> +}
> +
I wonder whether it would be better to do this at open/create time,
assuming that the file is being opened or created for writing. That
means fewer cases to catch. The only issue I can see with that is
catching files that are created and then never opened for writing - not
sure if that is possible, but worth checking I think.
So I think the locking needs sorting out here, but otherwise it looks
good,
Steve.
> +/**
> + * gfs2_rs_delete - delete a reservation
> + * @ip: The inode for this reservation
> + *
> + */
> +void gfs2_rs_delete(struct gfs2_inode *ip)
> +{
> + if (ip->i_res == NULL)
> + return;
> +
> + kmem_cache_free(gfs2_rsrv_cachep, ip->i_res);
> + ip->i_res = NULL;
> +}
> +
> void gfs2_clear_rgrpd(struct gfs2_sbd *sdp)
> {
> struct rb_node *n;
> @@ -1001,22 +1029,6 @@ struct gfs2_qadata *gfs2_qadata_get(struct gfs2_inode *ip)
> }
>
> /**
> - * gfs2_blkrsv_get - get the struct gfs2_blkreserv structure for an inode
> - * @ip: the incore GFS2 inode structure
> - *
> - * Returns: the struct gfs2_qadata
> - */
> -
> -static int gfs2_blkrsv_get(struct gfs2_inode *ip)
> -{
> - BUG_ON(ip->i_res != NULL);
> - ip->i_res = kmem_cache_zalloc(gfs2_rsrv_cachep, GFP_NOFS);
> - if (!ip->i_res)
> - return -ENOMEM;
> - return 0;
> -}
> -
> -/**
> * try_rgrp_fit - See if a given reservation will fit in a given RG
> * @rgd: the RG data
> * @ip: the inode
> @@ -1170,13 +1182,6 @@ static int get_local_rgrp(struct gfs2_inode *ip, u64 *last_unlinked)
> return -ENOSPC;
> }
>
> -static void gfs2_blkrsv_put(struct gfs2_inode *ip)
> -{
> - BUG_ON(ip->i_res == NULL);
> - kmem_cache_free(gfs2_rsrv_cachep, ip->i_res);
> - ip->i_res = NULL;
> -}
> -
> /**
> * gfs2_inplace_reserve - Reserve space in the filesystem
> * @ip: the inode to reserve space for
> @@ -1189,14 +1194,10 @@ int gfs2_inplace_reserve(struct gfs2_inode *ip, u32 requested)
> {
> struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
> struct gfs2_blkreserv *rs;
> - int error;
> + int error = 0;
> u64 last_unlinked = NO_BLOCK;
> int tries = 0;
>
> - error = gfs2_blkrsv_get(ip);
> - if (error)
> - return error;
> -
> rs = ip->i_res;
> rs->rs_requested = requested;
> if (gfs2_assert_warn(sdp, requested)) {
> @@ -1221,7 +1222,7 @@ int gfs2_inplace_reserve(struct gfs2_inode *ip, u32 requested)
>
> out:
> if (error)
> - gfs2_blkrsv_put(ip);
> + rs->rs_requested = 0;
> return error;
> }
>
> @@ -1238,7 +1239,7 @@ void gfs2_inplace_release(struct gfs2_inode *ip)
>
> if (rs->rs_rgd_gh.gh_gl)
> gfs2_glock_dq_uninit(&rs->rs_rgd_gh);
> - gfs2_blkrsv_put(ip);
> + rs->rs_requested = 0;
> }
>
> /**
> @@ -1506,7 +1507,7 @@ int gfs2_alloc_blocks(struct gfs2_inode *ip, u64 *bn, unsigned int *nblocks,
> /* Only happens if there is a bug in gfs2, return something distinctive
> * to ensure that it is noticed.
> */
> - if (ip->i_res == NULL)
> + if (ip->i_res->rs_requested == 0)
> return -ECANCELED;
>
> rgd = ip->i_rgd;
> diff --git a/fs/gfs2/rgrp.h b/fs/gfs2/rgrp.h
> index b4b10f4..d9eda5f 100644
> --- a/fs/gfs2/rgrp.h
> +++ b/fs/gfs2/rgrp.h
> @@ -43,6 +43,8 @@ extern void gfs2_inplace_release(struct gfs2_inode *ip);
> extern int gfs2_alloc_blocks(struct gfs2_inode *ip, u64 *bn, unsigned int *n,
> bool dinode, u64 *generation);
>
> +extern int gfs2_rs_alloc(struct gfs2_inode *ip);
> +extern void gfs2_rs_delete(struct gfs2_inode *ip);
> extern void __gfs2_free_blocks(struct gfs2_inode *ip, u64 bstart, u32 blen, int meta);
> extern void gfs2_free_meta(struct gfs2_inode *ip, u64 bstart, u32 blen);
> extern void gfs2_free_di(struct gfs2_rgrpd *rgd, struct gfs2_inode *ip);
> @@ -68,4 +70,12 @@ extern int gfs2_rgrp_send_discards(struct gfs2_sbd *sdp, u64 offset,
> const struct gfs2_bitmap *bi, unsigned minlen, u64 *ptrimmed);
> extern int gfs2_fitrim(struct file *filp, void __user *argp);
>
> +/* This is how to tell if a reservation is "inplace" reserved: */
> +static inline int gfs2_mb_reserved(struct gfs2_inode *ip)
> +{
> + if (ip->i_res && ip->i_res->rs_requested)
> + return 1;
> + return 0;
> +}
> +
> #endif /* __RGRP_DOT_H__ */
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 6172fa7..a42df66 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -1554,6 +1554,7 @@ out_unlock:
> out:
> /* Case 3 starts here */
> truncate_inode_pages(&inode->i_data, 0);
> + gfs2_rs_delete(ip);
> end_writeback(inode);
> gfs2_dir_hash_inval(ip);
> ip->i_gl->gl_object = NULL;
> @@ -1576,6 +1577,7 @@ static struct inode *gfs2_alloc_inode(struct super_block *sb)
> ip->i_flags = 0;
> ip->i_gl = NULL;
> ip->i_rgd = NULL;
> + ip->i_res = NULL;
> }
> return &ip->i_inode;
> }
> diff --git a/fs/gfs2/trans.h b/fs/gfs2/trans.h
> index 125d457..41f42cd 100644
> --- a/fs/gfs2/trans.h
> +++ b/fs/gfs2/trans.h
> @@ -31,7 +31,7 @@ struct gfs2_glock;
> static inline unsigned int gfs2_rg_blocks(const struct gfs2_inode *ip)
> {
> const struct gfs2_blkreserv *rs = ip->i_res;
> - if (rs->rs_requested < ip->i_rgd->rd_length)
> + if (rs && rs->rs_requested < ip->i_rgd->rd_length)
> return rs->rs_requested + 1;
> return ip->i_rgd->rd_length;
> }
>
prev parent reply other threads:[~2012-04-13 10:10 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <8fabed8a-9eae-4fbc-becf-77a3a1643c74@zmail12.collab.prod.int.phx2.redhat.com>
2012-04-12 15:43 ` [Cluster-devel] [GFS2 PATCH] GFS2: Extend the life of the reservations structure Bob Peterson
2012-04-13 10:10 ` Steven Whitehouse [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1334311833.2746.5.camel@menhir \
--to=swhiteho@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).