* [PATCH 0/8] pnfs-submit: forgetful client v2
@ 2010-05-05 17:00 Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 1/8] pnfs-submit: clean struct nfs_inode Alexandros Batsakis
2010-06-07 8:52 ` [PATCH 0/8] pnfs-submit: forgetful client v2 Boaz Harrosh
0 siblings, 2 replies; 20+ messages in thread
From: Alexandros Batsakis @ 2010-05-05 17:00 UTC (permalink / raw)
To: linux-nfs; +Cc: bhalevy, Alexandros Batsakis
This set of patches (2.6.35-rc1) includes a first attempt to implement
the forgetful client model for the pNFS client. The model
is explained in patch 7.
It also includes some minor cleanups in the layout management code
that help to improve the maintainability of the current code.
Passed cthon tests against the pyNFS server, and against a modified
version of the pyNFS server that randomly issues layout recalls after opens.
Alexandros Batsakis (8):
pnfs-submit: clean struct nfs_inode
pnfs-submit: clean locking infrastructure
pnfs-submit: remove lgetcount, lretcount (outstanding
LAYOUTGETs/LAYOUTRETUNs)
pnfs-submit: change stateid to be a union
pnfs-submit: request whole file layouts only
pnfs-submit: change layouts list to be similar to the other state
list management
pnfs-submit: forgetful client model
pnfs-submit: support for cb_recall_any (layouts)
fs/nfs/callback.h | 7 +
fs/nfs/callback_proc.c | 231 +++++++++++++++++++++++++++++---------
fs/nfs/callback_xdr.c | 2 +-
fs/nfs/client.c | 2 +-
fs/nfs/delegation.c | 19 ++--
fs/nfs/inode.c | 16 ++-
fs/nfs/nfs4_fs.h | 1 +
fs/nfs/nfs4proc.c | 46 +++++---
fs/nfs/nfs4state.c | 4 +-
fs/nfs/nfs4xdr.c | 38 ++++---
fs/nfs/pnfs.c | 276 +++++++++++++++++++++------------------------
fs/nfs/pnfs.h | 3 +-
fs/nfsd/nfs4callback.c | 1 -
include/linux/nfs4.h | 16 +++-
include/linux/nfs4_pnfs.h | 2 +-
include/linux/nfs_fs.h | 28 ++---
include/linux/nfs_fs_sb.h | 2 +-
17 files changed, 417 insertions(+), 277 deletions(-)
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/8] pnfs-submit: clean struct nfs_inode
2010-05-05 17:00 [PATCH 0/8] pnfs-submit: forgetful client v2 Alexandros Batsakis
@ 2010-05-05 17:00 ` Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 2/8] pnfs-submit: clean locking infrastructure Alexandros Batsakis
2010-06-07 8:52 ` [PATCH 0/8] pnfs-submit: forgetful client v2 Boaz Harrosh
1 sibling, 1 reply; 20+ messages in thread
From: Alexandros Batsakis @ 2010-05-05 17:00 UTC (permalink / raw)
To: linux-nfs; +Cc: bhalevy, Alexandros Batsakis
by moving layout specific fields from nfs_inode to struct pnfs_layout_type
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
---
fs/nfs/inode.c | 8 +++---
fs/nfs/pnfs.c | 55 ++++++++++++++++++++++++--------------------
include/linux/nfs4_pnfs.h | 2 +-
include/linux/nfs_fs.h | 22 +++++++++---------
4 files changed, 46 insertions(+), 41 deletions(-)
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 2234126..ade797e 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1328,12 +1328,12 @@ void nfs4_clear_inode(struct inode *inode)
 static void pnfs_alloc_init_inode(struct nfs_inode *nfsi)
 {
 #ifdef CONFIG_NFS_V4_1
-	nfsi->pnfs_layout_state = 0;
+	nfsi->layout.pnfs_layout_state = 0;
 	memset(&nfsi->layout.stateid, 0, NFS4_STATEID_SIZE);
 	nfsi->layout.roc_iomode = 0;
-	nfsi->lo_cred = NULL;
-	nfsi->pnfs_write_begin_pos = 0;
-	nfsi->pnfs_write_end_pos = 0;
+	nfsi->layout.lo_cred = NULL;
+	nfsi->layout.pnfs_write_begin_pos = 0;
+	nfsi->layout.pnfs_write_end_pos = 0;
 #endif /* CONFIG_NFS_V4_1 */
 }
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index cf9cfe5..f32dbbb 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -154,7 +154,7 @@ pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_context *ctx)
 	dprintk("%s: has_layout=%d ctx=%p\n", __func__, has_layout(nfsi), ctx);
 	spin_lock(&nfsi->lo_lock);
 	if (has_layout(nfsi) && !layoutcommit_needed(nfsi)) {
-		nfsi->lo_cred = get_rpccred(ctx->state->owner->so_cred);
+		nfsi->layout.lo_cred = get_rpccred(ctx->state->owner->so_cred);
 		nfsi->change_attr++;
 		spin_unlock(&nfsi->lo_lock);
 		dprintk("%s: Set layoutcommit\n", __func__);
@@ -174,17 +174,17 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent)
 	loff_t end_pos;
 	spin_lock(&nfsi->lo_lock);
-	if (offset < nfsi->pnfs_write_begin_pos)
-		nfsi->pnfs_write_begin_pos = offset;
+	if (offset < nfsi->layout.pnfs_write_begin_pos)
+		nfsi->layout.pnfs_write_begin_pos = offset;
 	end_pos = offset + extent - 1; /* I'm being inclusive */
-	if (end_pos > nfsi->pnfs_write_end_pos)
-		nfsi->pnfs_write_end_pos = end_pos;
+	if (end_pos > nfsi->layout.pnfs_write_end_pos)
+		nfsi->layout.pnfs_write_end_pos = end_pos;
 	dprintk("%s: Wrote %lu@%lu bpos %lu, epos: %lu\n",
 		__func__,
 		(unsigned long) extent,
 		(unsigned long) offset ,
-		(unsigned long) nfsi->pnfs_write_begin_pos,
-		(unsigned long) nfsi->pnfs_write_end_pos);
+		(unsigned long) nfsi->layout.pnfs_write_begin_pos,
+		(unsigned long) nfsi->layout.pnfs_write_end_pos);
 	spin_unlock(&nfsi->lo_lock);
 }
@@ -916,7 +916,8 @@ get_lock_alloc_layout(struct inode *ino)
 		 * wait until bit is cleared if we lost this race.
 		 */
 		res = wait_on_bit_lock(
-			&nfsi->pnfs_layout_state, NFS_INO_LAYOUT_ALLOC,
+			&nfsi->layout.pnfs_layout_state,
+			NFS_INO_LAYOUT_ALLOC,
 			pnfs_wait_schedule, TASK_KILLABLE);
 		if (res) {
 			lo = ERR_PTR(res);
@@ -944,8 +945,10 @@ get_lock_alloc_layout(struct inode *ino)
 			lo = ERR_PTR(-ENOMEM);
 		/* release the NFS_INO_LAYOUT_ALLOC bit and wake up waiters */
-		clear_bit_unlock(NFS_INO_LAYOUT_ALLOC, &nfsi->pnfs_layout_state);
-		wake_up_bit(&nfsi->pnfs_layout_state, NFS_INO_LAYOUT_ALLOC);
+		clear_bit_unlock(NFS_INO_LAYOUT_ALLOC,
+				 &nfsi->layout.pnfs_layout_state);
+		wake_up_bit(&nfsi->layout.pnfs_layout_state,
+			    NFS_INO_LAYOUT_ALLOC);
 		break;
 	}
@@ -1129,13 +1132,13 @@ pnfs_update_layout(struct inode *ino,
 	}
 	/* if get layout already failed once goto out */
-	if (test_bit(lo_fail_bit(iomode), &nfsi->pnfs_layout_state)) {
-		if (unlikely(nfsi->pnfs_layout_suspend &&
-		    get_seconds() >= nfsi->pnfs_layout_suspend)) {
+	if (test_bit(lo_fail_bit(iomode), &nfsi->layout.pnfs_layout_state)) {
+		if (unlikely(nfsi->layout.pnfs_layout_suspend &&
+		    get_seconds() >= nfsi->layout.pnfs_layout_suspend)) {
 			dprintk("%s: layout_get resumed\n", __func__);
 			clear_bit(lo_fail_bit(iomode),
-				  &nfsi->pnfs_layout_state);
-			nfsi->pnfs_layout_suspend = 0;
+				  &nfsi->layout.pnfs_layout_state);
+			nfsi->layout.pnfs_layout_suspend = 0;
 		} else {
 			result = 1;
 			goto out_put;
@@ -1151,7 +1154,8 @@ pnfs_update_layout(struct inode *ino,
 	result = get_layout(ino, ctx, &arg, lsegpp, lo);
 out:
 	dprintk("%s end (err:%d) state 0x%lx lseg %p\n",
-		__func__, result, nfsi->pnfs_layout_state, lseg);
+		__func__, result, nfsi->layout.pnfs_layout_state,
+		lseg);
 	return result;
 out_put:
 	if (lsegpp)
@@ -1256,13 +1260,14 @@ pnfs_get_layout_done(struct nfs4_pnfs_layoutget *lgp, int rpc_status)
 get_out:
 	/* remember that get layout failed and suspend trying */
-	nfsi->pnfs_layout_suspend = suspend;
-	set_bit(lo_fail_bit(lgp->args.lseg.iomode), &nfsi->pnfs_layout_state);
+	nfsi->layout.pnfs_layout_suspend = suspend;
+	set_bit(lo_fail_bit(lgp->args.lseg.iomode),
+		&nfsi->layout.pnfs_layout_state);
 	dprintk("%s: layout_get suspended until %ld\n", __func__, suspend);
 out:
 	dprintk("%s end (err:%d) state 0x%lx lseg %p\n",
-		__func__, lgp->status, nfsi->pnfs_layout_state, lseg);
+		__func__, lgp->status, nfsi->layout.pnfs_layout_state, lseg);
 	return;
 }
@@ -2144,12 +2149,12 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync)
 	/* Clear layoutcommit properties in the inode so
 	 * new lc info can be generated
 	 */
-	write_begin_pos = nfsi->pnfs_write_begin_pos;
-	write_end_pos = nfsi->pnfs_write_end_pos;
-	data->cred = nfsi->lo_cred;
-	nfsi->pnfs_write_begin_pos = 0;
-	nfsi->pnfs_write_end_pos = 0;
-	nfsi->lo_cred = NULL;
+	write_begin_pos = nfsi->layout.pnfs_write_begin_pos;
+	write_end_pos = nfsi->layout.pnfs_write_end_pos;
+	data->cred = nfsi->layout.lo_cred;
+	nfsi->layout.pnfs_write_begin_pos = 0;
+	nfsi->layout.pnfs_write_end_pos = 0;
+	nfsi->layout.lo_cred = NULL;
 	pnfs_get_layout_stateid(&data->args.stateid, &nfsi->layout);
 	spin_unlock(&nfsi->lo_lock);
diff --git a/include/linux/nfs4_pnfs.h b/include/linux/nfs4_pnfs.h
index ee45b69..d810f4a 100644
--- a/include/linux/nfs4_pnfs.h
+++ b/include/linux/nfs4_pnfs.h
@@ -90,7 +90,7 @@ has_layout(struct nfs_inode *nfsi)
 static inline bool
 layoutcommit_needed(struct nfs_inode *nfsi)
 {
-	return nfsi->lo_cred != NULL;
+	return nfsi->layout.lo_cred != NULL;
 }
 #endif /* CONFIG_NFS_V4_1 */
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index f3250df..787d2f4 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -106,6 +106,17 @@ struct pnfs_layout_type {
 	seqlock_t seqlock; /* Protects the stateid */
 	nfs4_stateid stateid;
 	void *ld_data; /* layout driver private data */
+	unsigned long pnfs_layout_state;
+	#define NFS_INO_RO_LAYOUT_FAILED 0 /* get ro layout failed stop trying */
+	#define NFS_INO_RW_LAYOUT_FAILED 1 /* get rw layout failed stop trying */
+	#define NFS_INO_LAYOUT_ALLOC 2 /* bit lock for layout allocation */
+	time_t pnfs_layout_suspend;
+	struct rpc_cred *lo_cred; /* layoutcommit credential */
+	/* DH: These vars keep track of the maximum write range
+	 * so the values can be used for layoutcommit.
+	 */
+	loff_t pnfs_write_begin_pos;
+	loff_t pnfs_write_end_pos;
 };
 /*
@@ -198,20 +209,9 @@ struct nfs_inode {
 	/* Inodes having layouts */
 	struct list_head lo_inodes;
-	unsigned long pnfs_layout_state;
-#define NFS_INO_RO_LAYOUT_FAILED 0 /* get ro layout failed stop trying */
-#define NFS_INO_RW_LAYOUT_FAILED 1 /* get rw layout failed stop trying */
-#define NFS_INO_LAYOUT_ALLOC 2 /* bit lock for layout allocation */
-	time_t pnfs_layout_suspend;
-	struct rpc_cred *lo_cred; /* layoutcommit credential */
 	wait_queue_head_t lo_waitq;
 	spinlock_t lo_lock;
 	struct pnfs_layout_type layout;
-	/* DH: These vars keep track of the maximum write range
-	 * so the values can be used for layoutcommit.
-	 */
-	loff_t pnfs_write_begin_pos;
-	loff_t pnfs_write_end_pos;
 #endif /* CONFIG_NFS_V4_1 */
 #endif /* CONFIG_NFS_V4*/
 #ifdef CONFIG_NFS_FSCACHE
-- 
1.6.2.5
^ permalink raw reply related [flat|nested] 20+ messages in thread
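For readers following the layoutcommit fields that this patch moves (pnfs_write_begin_pos, pnfs_write_end_pos), here is a minimal userspace sketch of the range-widening idea. It is not kernel code: struct write_range, write_range_update() and write_range_take() are hypothetical names, and the explicit dirty flag is an assumption of the sketch that does not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the write-range fields; not the kernel struct. */
struct write_range {
	bool    dirty;      /* assumption: explicit "range is valid" flag */
	int64_t begin_pos;  /* lowest byte offset written so far */
	int64_t end_pos;    /* highest byte offset written, inclusive */
};

/* Widen the tracked range so it covers [offset, offset + extent - 1]. */
static void write_range_update(struct write_range *wr, int64_t offset,
			       int64_t extent)
{
	int64_t end_pos;

	assert(extent > 0);
	end_pos = offset + extent - 1;	/* inclusive end, as in the patch */
	if (!wr->dirty) {
		/* First write since the last commit: start a fresh range. */
		wr->begin_pos = offset;
		wr->end_pos = end_pos;
		wr->dirty = true;
		return;
	}
	if (offset < wr->begin_pos)
		wr->begin_pos = offset;
	if (end_pos > wr->end_pos)
		wr->end_pos = end_pos;
}

/* Hand the accumulated range to the caller and reset the tracker.
 * Returns false when nothing was written since the last call. */
static bool write_range_take(struct write_range *wr, int64_t *begin,
			     int64_t *end)
{
	if (!wr->dirty)
		return false;
	*begin = wr->begin_pos;
	*end = wr->end_pos;
	wr->dirty = false;
	wr->begin_pos = 0;
	wr->end_pos = 0;
	return true;
}
```

In the kernel these updates happen under the layout lock; the sketch leaves locking out to keep the range arithmetic visible.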
* [PATCH 2/8] pnfs-submit: clean locking infrastructure
2010-05-05 17:00 ` [PATCH 1/8] pnfs-submit: clean struct nfs_inode Alexandros Batsakis
@ 2010-05-05 17:00 ` Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 3/8] pnfs-submit: remove lgetcount, lretcount (outstanding LAYOUTGETs/LAYOUTRETUNs) Alexandros Batsakis
2010-06-07 14:34 ` [PATCH 2/8] pnfs-submit: clean locking infrastructure Fred Isaman
0 siblings, 2 replies; 20+ messages in thread
From: Alexandros Batsakis @ 2010-05-05 17:00 UTC (permalink / raw)
To: linux-nfs; +Cc: bhalevy, Alexandros Batsakis
(also minor cleanup of pnfs_free_layout())
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
---
fs/nfs/pnfs.c | 73 ++++++++++++++++++++++++++++++++++++--------------------
1 files changed, 47 insertions(+), 26 deletions(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index f32dbbb..a4031b4 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -60,6 +60,8 @@ static int pnfs_initialized;
 static void pnfs_free_layout(struct pnfs_layout_type *lo,
 			     struct nfs4_pnfs_layout_segment *range);
 static enum pnfs_try_status pnfs_commit(struct nfs_write_data *data, int sync);
+static inline void lock_current_layout(struct nfs_inode *nfsi);
+static inline void unlock_current_layout(struct nfs_inode *nfsi);
 /* Locking:
  *
@@ -152,15 +154,15 @@ void
 pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_context *ctx)
 {
 	dprintk("%s: has_layout=%d ctx=%p\n", __func__, has_layout(nfsi), ctx);
-	spin_lock(&nfsi->lo_lock);
+	lock_current_layout(nfsi);
 	if (has_layout(nfsi) && !layoutcommit_needed(nfsi)) {
 		nfsi->layout.lo_cred = get_rpccred(ctx->state->owner->so_cred);
 		nfsi->change_attr++;
-		spin_unlock(&nfsi->lo_lock);
+		unlock_current_layout(nfsi);
 		dprintk("%s: Set layoutcommit\n", __func__);
 		return;
 	}
-	spin_unlock(&nfsi->lo_lock);
+	unlock_current_layout(nfsi);
 }
 /* Update last_write_offset for layoutcommit.
@@ -173,7 +175,7 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent)
 {
 	loff_t end_pos;
-	spin_lock(&nfsi->lo_lock);
+	lock_current_layout(nfsi);
 	if (offset < nfsi->layout.pnfs_write_begin_pos)
 		nfsi->layout.pnfs_write_begin_pos = offset;
 	end_pos = offset + extent - 1; /* I'm being inclusive */
@@ -185,7 +187,7 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent)
 		(unsigned long) offset ,
 		(unsigned long) nfsi->layout.pnfs_write_begin_pos,
 		(unsigned long) nfsi->layout.pnfs_write_end_pos);
-	spin_unlock(&nfsi->lo_lock);
+	unlock_current_layout(nfsi);
 }
 /* Unitialize a mountpoint in a layout driver */
@@ -313,6 +315,17 @@ pnfs_unregister_layoutdriver(struct pnfs_layoutdriver_type *ld_type)
 #define BUG_ON_UNLOCKED_LO(lo) do {} while (0)
 #endif /* CONFIG_SMP */
+static inline void lock_current_layout(struct nfs_inode *nfsi)
+{
+	spin_lock(&nfsi->lo_lock);
+}
+
+static inline void unlock_current_layout(struct nfs_inode *nfsi)
+{
+	BUG_ON_UNLOCKED_LO((&nfsi->layout));
+	spin_unlock(&nfsi->lo_lock);
+}
+
 /*
  * get and lock nfsi->layout
  */
@@ -321,10 +334,10 @@ get_lock_current_layout(struct nfs_inode *nfsi)
 {
 	struct pnfs_layout_type *lo;
+	lock_current_layout(nfsi);
 	lo = &nfsi->layout;
-	spin_lock(&nfsi->lo_lock);
 	if (!lo->ld_data) {
-		spin_unlock(&nfsi->lo_lock);
+		unlock_current_layout(nfsi);
 		return NULL;
 	}
@@ -344,7 +357,12 @@ put_unlock_current_layout(struct pnfs_layout_type *lo)
 	BUG_ON_UNLOCKED_LO(lo);
 	BUG_ON(lo->refcount <= 0);
-	if (--lo->refcount == 0 && list_empty(&lo->segs)) {
+	lo->refcount--;
+
+	if (lo->refcount > 0)
+		goto out;
+
+	if (list_empty(&lo->segs)) {
 		struct layoutdriver_io_operations *io_ops =
 			PNFS_LD_IO_OPS(lo);
@@ -358,7 +376,8 @@ put_unlock_current_layout(struct pnfs_layout_type *lo)
 		list_del_init(&nfsi->lo_inodes);
 		spin_unlock(&clp->cl_lock);
 	}
-	spin_unlock(&nfsi->lo_lock);
+out:
+	unlock_current_layout(nfsi);
 }
 void
@@ -367,7 +386,7 @@ pnfs_layout_release(struct pnfs_layout_type *lo, atomic_t *count,
 {
 	struct nfs_inode *nfsi = PNFS_NFS_INODE(lo);
-	spin_lock(&nfsi->lo_lock);
+	lock_current_layout(nfsi);
 	if (range)
 		pnfs_free_layout(lo, range);
 	atomic_dec(count);
@@ -386,6 +405,8 @@ pnfs_destroy_layout(struct nfs_inode *nfsi)
 	};
 	lo = get_lock_current_layout(nfsi);
+	if (!lo)
+		return;
 	pnfs_free_layout(lo, &range);
 	put_unlock_current_layout(lo);
 }
@@ -663,7 +684,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi,
 	struct pnfs_layout_segment *lseg;
 	bool ret = false;
-	spin_lock(&nfsi->lo_lock);
+	lock_current_layout(nfsi);
 	list_for_each_entry (lseg, &nfsi->layout.segs, fi_list) {
 		if (!should_free_lseg(lseg, range))
 			continue;
@@ -677,7 +698,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi,
 	}
 	if (atomic_read(&nfsi->layout.lgetcount))
 		ret = true;
-	spin_unlock(&nfsi->lo_lock);
+	unlock_current_layout(nfsi);
 	dprintk("%s:Return %d\n", __func__, ret);
 	return ret;
@@ -759,7 +780,7 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range,
 		/* unlock w/o put rebalanced by eventual call to
 		 * pnfs_layout_release
 		 */
-		spin_unlock(&nfsi->lo_lock);
+		unlock_current_layout(nfsi);
 		if (pnfs_return_layout_barrier(nfsi, &arg)) {
 			dprintk("%s: waiting\n", __func__);
@@ -900,7 +921,7 @@ static int pnfs_wait_schedule(void *word)
  *
  * Note: If successful, nfsi->lo_lock is taken and the caller
  * must put and unlock current_layout by using put_unlock_current_layout()
- * when the returned layout is released.
+ * directly or pnfs_layout_release() when the returned layout is released.
  */
 static struct pnfs_layout_type *
 get_lock_alloc_layout(struct inode *ino)
@@ -935,7 +956,7 @@ get_lock_alloc_layout(struct inode *ino)
 		struct nfs_client *clp = NFS_SERVER(ino)->nfs_client;
 		/* must grab the layout lock before the client lock */
-		spin_lock(&nfsi->lo_lock);
+		lock_current_layout(nfsi);
 		spin_lock(&clp->cl_lock);
 		if (list_empty(&nfsi->lo_inodes))
@@ -1051,10 +1072,10 @@ void drain_layoutreturns(struct pnfs_layout_type *lo)
 	while (atomic_read(&lo->lretcount)) {
 		struct nfs_inode *nfsi = PNFS_NFS_INODE(lo);
-		spin_unlock(&nfsi->lo_lock);
+		unlock_current_layout(nfsi);
 		dprintk("%s: waiting\n", __func__);
 		wait_event(nfsi->lo_waitq, (atomic_read(&lo->lretcount) == 0));
-		spin_lock(&nfsi->lo_lock);
+		lock_current_layout(nfsi);
 	}
 }
@@ -1093,13 +1114,13 @@ pnfs_update_layout(struct inode *ino,
 	/* Check to see if the layout for the given range already exists */
 	lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref);
 	if (lseg && !lseg->valid) {
-		spin_unlock(&nfsi->lo_lock);
+		unlock_current_layout(nfsi);
 		if (take_ref)
 			put_lseg(lseg);
 		for (;;) {
 			prepare_to_wait(&nfsi->lo_waitq, &__wait,
 					TASK_KILLABLE);
-			spin_lock(&nfsi->lo_lock);
+			lock_current_layout(nfsi);
 			lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref);
 			if (!lseg || lseg->valid)
 				break;
@@ -1112,7 +1133,7 @@ pnfs_update_layout(struct inode *ino,
 				result = -ERESTARTSYS;
 				break;
 			}
-			spin_unlock(&nfsi->lo_lock);
+			unlock_current_layout(nfsi);
 			schedule();
 		}
 		finish_wait(&nfsi->lo_waitq, &__wait);
@@ -1149,7 +1170,7 @@ pnfs_update_layout(struct inode *ino,
 	/* Matching dec is done in .rpc_release (on non-error paths) */
 	atomic_inc(&lo->lgetcount);
 	/* Lose lock, but not reference, match this with pnfs_layout_release */
-	spin_unlock(&nfsi->lo_lock);
+	unlock_current_layout(nfsi);
 	result = get_layout(ino, ctx, &arg, lsegpp, lo);
 out:
@@ -1299,7 +1320,7 @@ pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp)
 		*lgp->lsegpp = lseg;
 	}
-	spin_lock(&nfsi->lo_lock);
+	lock_current_layout(nfsi);
 	pnfs_insert_layout(lo, lseg);
 	if (res->return_on_close) {
@@ -1310,7 +1331,7 @@ pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp)
 	/* Done processing layoutget. Set the layout stateid */
 	pnfs_set_layout_stateid(lo, &res->stateid);
-	spin_unlock(&nfsi->lo_lock);
+	unlock_current_layout(nfsi);
 out:
 	return status;
 }
@@ -2140,9 +2161,9 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync)
 	if (!data)
 		return -ENOMEM;
-	spin_lock(&nfsi->lo_lock);
+	lock_current_layout(nfsi);
 	if (!layoutcommit_needed(nfsi)) {
-		spin_unlock(&nfsi->lo_lock);
+		unlock_current_layout(nfsi);
 		goto out_free;
 	}
@@ -2157,7 +2178,7 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync)
 	nfsi->layout.lo_cred = NULL;
 	pnfs_get_layout_stateid(&data->args.stateid, &nfsi->layout);
-	spin_unlock(&nfsi->lo_lock);
+	unlock_current_layout(nfsi);
 	/* Set up layout commit args */
 	status = pnfs_layoutcommit_setup(inode, data, write_begin_pos,
-- 
1.6.2.5
^ permalink raw reply related [flat|nested] 20+ messages in thread
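The value of funnelling every lo_lock acquire and release through one pair of helpers is that a balance check (BUG_ON_UNLOCKED_LO in the patch) sits in a single place. The toy below illustrates that pairing discipline in plain single-threaded C. It is only a sketch: struct toy_inode, toy_lock_layout(), toy_unlock_layout() and take_pending_commit() are hypothetical names, and a boolean stands in for the spinlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the inode: a flag replaces the real spinlock so that
 * unbalanced calls trip an assertion instead of hanging. */
struct toy_inode {
	bool lo_locked;
	const void *lo_cred;	/* non-NULL means a layoutcommit is pending */
};

static void toy_lock_layout(struct toy_inode *nfsi)
{
	assert(!nfsi->lo_locked);	/* a real spinlock would deadlock here */
	nfsi->lo_locked = true;
}

static void toy_unlock_layout(struct toy_inode *nfsi)
{
	assert(nfsi->lo_locked);	/* plays the role of BUG_ON_UNLOCKED_LO() */
	nfsi->lo_locked = false;
}

/* Shape of a function with an early exit: every return path must leave
 * the lock released. Returns true if there was a pending commit. */
static bool take_pending_commit(struct toy_inode *nfsi, const void **cred)
{
	toy_lock_layout(nfsi);
	if (nfsi->lo_cred == NULL) {
		toy_unlock_layout(nfsi);	/* early exit still unlocks */
		return false;
	}
	*cred = nfsi->lo_cred;
	nfsi->lo_cred = NULL;
	toy_unlock_layout(nfsi);
	return true;
}
```

Calling take_pending_commit() twice in a row works only because the early-exit branch releases the lock; if that branch took the lock a second time, the assertion in toy_lock_layout() would fire on the spot.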
* [PATCH 3/8] pnfs-submit: remove lgetcount, lretcount (outstanding LAYOUTGETs/LAYOUTRETUNs)
2010-05-05 17:00 ` [PATCH 2/8] pnfs-submit: clean locking infrastructure Alexandros Batsakis
@ 2010-05-05 17:00 ` Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 4/8] pnfs-submit: change stateid to be a union Alexandros Batsakis
2010-06-07 14:34 ` [PATCH 2/8] pnfs-submit: clean locking infrastructure Fred Isaman
1 sibling, 1 reply; 20+ messages in thread
From: Alexandros Batsakis @ 2010-05-05 17:00 UTC (permalink / raw)
To: linux-nfs; +Cc: bhalevy, Alexandros Batsakis
This prepares for the forgetful client. There is no need to explicitly
count the number of outstanding layout operations, because the protocol
already provides for this through the seqid of the stateid (see e.g.
section 12.5.5.2.1.2). As long as no requests for intersecting layouts
are issued, LAYOUTGETs/LAYOUTRETURNs can be sent in parallel.
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
---
fs/nfs/nfs4proc.c | 5 ++---
fs/nfs/pnfs.c | 46 ++++++++++++----------------------------------
fs/nfs/pnfs.h | 3 +--
include/linux/nfs_fs.h | 2 --
4 files changed, 15 insertions(+), 41 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index c01ecd7..a89a290 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5503,7 +5503,7 @@ static void nfs4_pnfs_layoutget_release(void *calldata)
 	struct nfs4_pnfs_layoutget *lgp = calldata;
 	dprintk("--> %s\n", __func__);
-	pnfs_layout_release(lgp->lo, &lgp->lo->lgetcount, NULL);
+	pnfs_layout_release(lgp->lo, NULL);
 	if (lgp->res.layout.buf != NULL)
 		free_page((unsigned long) lgp->res.layout.buf);
 	kfree(calldata);
@@ -5724,8 +5724,7 @@ static void nfs4_pnfs_layoutreturn_release(void *calldata)
 	if (lrp->lo && (lrp->args.return_type == RETURN_FILE)) {
 		if (!lrp->res.lrs_present)
 			pnfs_set_layout_stateid(lrp->lo, &zero_stateid);
-		pnfs_layout_release(lrp->lo, &lrp->lo->lretcount,
-				    &lrp->args.lseg);
+		pnfs_layout_release(lrp->lo, &lrp->args.lseg);
 	}
 	kfree(calldata);
 	dprintk("<-- %s\n", __func__);
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index a4031b4..c4c7c35 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -381,7 +381,7 @@ out:
 }
 void
-pnfs_layout_release(struct pnfs_layout_type *lo, atomic_t *count,
+pnfs_layout_release(struct pnfs_layout_type *lo,
 		    struct nfs4_pnfs_layout_segment *range)
 {
 	struct nfs_inode *nfsi = PNFS_NFS_INODE(lo);
@@ -389,7 +389,6 @@ pnfs_layout_release(struct pnfs_layout_type *lo, atomic_t *count,
 	lock_current_layout(nfsi);
 	if (range)
 		pnfs_free_layout(lo, range);
-	atomic_dec(count);
 	put_unlock_current_layout(lo);
 	wake_up_all(&nfsi->lo_waitq);
 }
@@ -574,7 +573,7 @@ get_layout(struct inode *ino,
 	lgp = kzalloc(sizeof(*lgp), GFP_KERNEL);
 	if (lgp == NULL) {
-		pnfs_layout_release(lo, &lo->lgetcount, NULL);
+		pnfs_layout_release(lo, NULL);
 		return -ENOMEM;
 	}
 	lgp->lo = lo;
@@ -648,6 +647,13 @@ has_layout_to_return(struct pnfs_layout_type *lo,
 	return out;
 }
+static inline bool
+_pnfs_can_return_lseg(struct pnfs_layout_segment *lseg)
+{
+	return atomic_read(&lseg->kref.refcount) == 1;
+}
+
+
 static void
 pnfs_free_layout(struct pnfs_layout_type *lo,
 		 struct nfs4_pnfs_layout_segment *range)
@@ -658,7 +664,8 @@ pnfs_free_layout(struct pnfs_layout_type *lo,
 	BUG_ON_UNLOCKED_LO(lo);
 	list_for_each_entry_safe (lseg, next, &lo->segs, fi_list) {
-		if (!should_free_lseg(lseg, range))
+		if (!should_free_lseg(lseg, range) ||
+		    !_pnfs_can_return_lseg(lseg))
 			continue;
 		dprintk("%s: freeing lseg %p iomode %d "
 			"offset %llu length %llu\n", __func__,
@@ -671,12 +678,6 @@ pnfs_free_layout(struct pnfs_layout_type *lo,
 	dprintk("%s:Return\n", __func__);
 }
-static inline bool
-_pnfs_can_return_lseg(struct pnfs_layout_segment *lseg)
-{
-	return atomic_read(&lseg->kref.refcount) == 1;
-}
-
 static bool
 pnfs_return_layout_barrier(struct nfs_inode *nfsi,
 			   struct nfs4_pnfs_layout_segment *range)
@@ -696,8 +697,6 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi,
 			ret = true;
 		}
 	}
-	if (atomic_read(&nfsi->layout.lgetcount))
-		ret = true;
 	unlock_current_layout(nfsi);
 	dprintk("%s:Return %d\n", __func__, ret);
@@ -719,7 +718,7 @@ return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range,
 	lrp = kzalloc(sizeof(*lrp), GFP_KERNEL);
 	if (lrp == NULL) {
 		if (lo && (type == RETURN_FILE))
-			pnfs_layout_release(lo, &lo->lretcount, NULL);
+			pnfs_layout_release(lo, NULL);
 		goto out;
 	}
 	lrp->args.reclaim = 0;
@@ -774,9 +773,6 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range,
 			goto out;
 		}
-		/* Matching dec is done in .rpc_release (on non-error paths) */
-		atomic_inc(&lo->lretcount);
-
 		/* unlock w/o put rebalanced by eventual call to
 		 * pnfs_layout_release
 		 */
@@ -901,8 +897,6 @@ alloc_init_layout(struct inode *ino)
 	seqlock_init(&lo->seqlock);
 	memset(&lo->stateid, 0, NFS4_STATEID_SIZE);
 	lo->refcount = 1;
-	atomic_set(&lo->lgetcount, 0);
-	atomic_set(&lo->lretcount, 0);
 	INIT_LIST_HEAD(&lo->segs);
 	lo->roc_iomode = 0;
 	return lo;
@@ -1066,19 +1060,6 @@ pnfs_find_get_lseg(struct inode *inode,
 	return lseg;
 }
-/* Called with spin lock held */
-void drain_layoutreturns(struct pnfs_layout_type *lo)
-{
-	while (atomic_read(&lo->lretcount)) {
-		struct nfs_inode *nfsi = PNFS_NFS_INODE(lo);
-
-		unlock_current_layout(nfsi);
-		dprintk("%s: waiting\n", __func__);
-		wait_event(nfsi->lo_waitq, (atomic_read(&lo->lretcount) == 0));
-		lock_current_layout(nfsi);
-	}
-}
-
 /* Update the file's layout for the given range and iomode.
  * Layout is retreived from the server if needed.
  * If lsegpp is given, the appropriate layout segment is referenced and
@@ -1166,9 +1147,6 @@ pnfs_update_layout(struct inode *ino,
 		}
 	}
-	drain_layoutreturns(lo);
-	/* Matching dec is done in .rpc_release (on non-error paths) */
-	atomic_inc(&lo->lgetcount);
 	/* Lose lock, but not reference, match this with pnfs_layout_release */
 	unlock_current_layout(nfsi);
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index f1132b3..908d32b 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -68,8 +68,7 @@ void pnfs_free_fsdata(struct pnfs_fsdata *fsdata);
 ssize_t pnfs_file_write(struct file *, const char __user *, size_t, loff_t *);
 void pnfs_get_layout_done(struct nfs4_pnfs_layoutget *, int rpc_status);
 int pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp);
-void pnfs_layout_release(struct pnfs_layout_type *, atomic_t *,
-			 struct nfs4_pnfs_layout_segment *range);
+void pnfs_layout_release(struct pnfs_layout_type *, struct nfs4_pnfs_layout_segment *range);
 void pnfs_set_layout_stateid(struct pnfs_layout_type *lo,
 			     const nfs4_stateid *stateid);
 void pnfs_destroy_layout(struct nfs_inode *);
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 787d2f4..eb95dbf 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -99,8 +99,6 @@ struct posix_acl;
 struct pnfs_layout_type {
 	int refcount;
-	atomic_t lretcount; /* Layoutreturns outstanding */
-	atomic_t lgetcount; /* Layoutgets outstanding */
 	struct list_head segs; /* layout segments list */
 	int roc_iomode; /* iomode to return on close, 0=none */
 	seqlock_t seqlock; /* Protects the stateid */
-- 
1.6.2.5
^ permalink raw reply related [flat|nested] 20+ messages in thread
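The commit message above leans on the seqid carried in the layout stateid to order layout operations. A small userspace sketch may make that concrete. The field names mirror the accessors that appear in the series (u.data, u.stateid.seqid, u.stateid.other), but the type toy_stateid, the helper names and the wraparound-aware comparison are assumptions of this sketch rather than the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define STATEID_SIZE       16
#define STATEID_OTHER_SIZE 12

/* Hypothetical userspace mirror of a 16-byte stateid viewed either as
 * opaque bytes or as a 4-byte seqid followed by 12 opaque bytes. */
typedef struct {
	union {
		unsigned char data[STATEID_SIZE];
		struct {
			uint32_t seqid;	/* big-endian on the wire */
			unsigned char other[STATEID_OTHER_SIZE];
		} stateid;
	} u;
} toy_stateid;

/* Store a host-order seqid in wire (big-endian) byte order. */
static void stateid_set_seqid(toy_stateid *s, uint32_t seqid)
{
	s->u.data[0] = (unsigned char)(seqid >> 24);
	s->u.data[1] = (unsigned char)(seqid >> 16);
	s->u.data[2] = (unsigned char)(seqid >> 8);
	s->u.data[3] = (unsigned char)seqid;
}

/* Decode the big-endian seqid into host order. */
static uint32_t stateid_seqid(const toy_stateid *s)
{
	const unsigned char *p = s->u.data;

	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Two stateids refer to the same state if their "other" parts match. */
static bool stateid_same_other(const toy_stateid *a, const toy_stateid *b)
{
	return memcmp(a->u.stateid.other, b->u.stateid.other,
		      STATEID_OTHER_SIZE) == 0;
}

/* True if a's seqid is strictly newer than b's. The modular difference
 * keeps the answer sensible across 32-bit wraparound (an assumption of
 * this sketch, not something the patch itself implements). */
static bool stateid_is_newer(const toy_stateid *a, const toy_stateid *b)
{
	uint32_t diff = stateid_seqid(a) - stateid_seqid(b);

	return diff != 0 && diff < 0x80000000u;
}
```

A client that receives replies out of order could, under these assumptions, keep the stateid for which stateid_is_newer() holds and ignore the older one, provided stateid_same_other() confirms both belong to the same layout state.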
* [PATCH 4/8] pnfs-submit: change stateid to be a union
2010-05-05 17:00 ` [PATCH 3/8] pnfs-submit: remove lgetcount, lretcount (outstanding LAYOUTGETs/LAYOUTRETUNs) Alexandros Batsakis
@ 2010-05-05 17:00 ` Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 5/8] pnfs-submit: request whole file layouts only Alexandros Batsakis
0 siblings, 1 reply; 20+ messages in thread
From: Alexandros Batsakis @ 2010-05-05 17:00 UTC (permalink / raw)
To: linux-nfs; +Cc: bhalevy, Alexandros Batsakis
In NFSv4.1 the stateid consists of the other and seqid fields. For layout
processing we need to numerically compare the seqid value of layout
stateids. To do so, introduce a union to nfs4_stateid to switch between
opaque(16 bytes) and opaque(12 bytes) / __be32
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
---
fs/nfs/callback_proc.c | 13 +++++++------
fs/nfs/callback_xdr.c | 2 +-
fs/nfs/delegation.c | 19 +++++++++++--------
fs/nfs/nfs4proc.c | 41 +++++++++++++++++++++++++----------------
fs/nfs/nfs4state.c | 4 ++--
fs/nfs/nfs4xdr.c | 38 +++++++++++++++++++++-----------------
fs/nfs/pnfs.c | 11 ++++++-----
fs/nfsd/nfs4callback.c | 1 -
include/linux/nfs4.h | 16 ++++++++++++++--
9 files changed, 87 insertions(+), 58 deletions(-)
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index 33ef5c0..dcf3747 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -121,8 +121,9 @@ out:
 int nfs4_validate_delegation_stateid(struct nfs_delegation *delegation, const nfs4_stateid *stateid)
 {
-	if (delegation == NULL || memcmp(delegation->stateid.data, stateid->data,
-					 sizeof(delegation->stateid.data)) != 0)
+	if (delegation == NULL || memcmp(delegation->stateid.u.data,
+					 stateid->u.data,
+					 sizeof(delegation->stateid.u.data)))
 		return 0;
 	return 1;
 }
@@ -384,11 +385,11 @@ int nfs41_validate_delegation_stateid(struct nfs_delegation *delegation, const n
 	if (delegation == NULL)
 		return 0;
-	/* seqid is 4-bytes long */
-	if (((u32 *) &stateid->data)[0] != 0)
+	if (stateid->u.stateid.seqid != 0)
 		return 0;
-	if (memcmp(&delegation->stateid.data[4], &stateid->data[4],
-		   sizeof(stateid->data)-4))
+	if (memcmp(&delegation->stateid.u.stateid.other,
+		   &stateid->u.stateid.other,
+		   NFS4_STATEID_OTHER_SIZE))
 		return 0;
 	return 1;
diff --git a/fs/nfs/callback_xdr.c b/fs/nfs/callback_xdr.c
index 69a026d..7e34bb3 100644
--- a/fs/nfs/callback_xdr.c
+++ b/fs/nfs/callback_xdr.c
@@ -138,7 +138,7 @@ static __be32 decode_stateid(struct xdr_stream *xdr, nfs4_stateid *stateid)
 	p = read_buf(xdr, 16);
 	if (unlikely(p == NULL))
 		return htonl(NFS4ERR_RESOURCE);
-	memcpy(stateid->data, p, 16);
+	memcpy(stateid->u.data, p, 16);
 	return 0;
 }
diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index ea61d26..3b8e86a 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -104,7 +104,8 @@ again:
 			continue;
 		if (!test_bit(NFS_DELEGATED_STATE, &state->flags))
 			continue;
-		if (memcmp(state->stateid.data, stateid->data, sizeof(state->stateid.data)) != 0)
+		if (memcmp(state->stateid.u.data, stateid->u.data,
+			   sizeof(state->stateid.u.data)) != 0)
 			continue;
 		get_nfs_open_context(ctx);
 		spin_unlock(&inode->i_lock);
@@ -133,8 +134,8 @@ void nfs_inode_reclaim_delegation(struct inode *inode, struct rpc_cred *cred, st
 	if (delegation != NULL) {
 		spin_lock(&delegation->lock);
 		if (delegation->inode != NULL) {
-			memcpy(delegation->stateid.data, res->delegation.data,
-			       sizeof(delegation->stateid.data));
+			memcpy(delegation->stateid.u.data, res->delegation.u.data,
+			       sizeof(delegation->stateid.u.data));
 			delegation->type = res->delegation_type;
 			delegation->maxsize = res->maxsize;
 			oldcred = delegation->cred;
@@ -187,8 +188,9 @@ static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfs
 	if (delegation == NULL)
 		goto nomatch;
 	spin_lock(&delegation->lock);
-	if (stateid != NULL && memcmp(delegation->stateid.data, stateid->data,
-				      sizeof(delegation->stateid.data)) != 0)
+	if (stateid != NULL && memcmp(delegation->stateid.u.data,
+				      stateid->u.data,
+				      sizeof(delegation->stateid.u.data)) != 0)
 		goto nomatch_unlock;
 	list_del_rcu(&delegation->super_list);
 	delegation->inode = NULL;
@@ -216,8 +218,8 @@ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct
 	delegation = kmalloc(sizeof(*delegation), GFP_KERNEL);
 	if (delegation == NULL)
 		return -ENOMEM;
-	memcpy(delegation->stateid.data, res->delegation.data,
-	       sizeof(delegation->stateid.data));
+	memcpy(delegation->stateid.u.data, res->delegation.u.data,
+	       sizeof(delegation->stateid.u.data));
 	delegation->type = res->delegation_type;
 	delegation->maxsize = res->maxsize;
 	delegation->change_attr = nfsi->change_attr;
@@ -562,7 +564,8 @@ int nfs4_copy_delegation_stateid(nfs4_stateid *dst, struct inode *inode)
 	rcu_read_lock();
 	delegation = rcu_dereference(nfsi->delegation);
 	if (delegation != NULL) {
-		memcpy(dst->data, delegation->stateid.data, sizeof(dst->data));
+		memcpy(dst->u.data, delegation->stateid.u.data,
+		       sizeof(dst->u.data));
 		ret = 1;
 	}
 	rcu_read_unlock();
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index a89a290..3741024 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -866,8 +866,10 @@ static void update_open_stateflags(struct nfs4_state *state, fmode_t fmode)
 static void nfs_set_open_stateid_locked(struct nfs4_state *state, nfs4_stateid *stateid, fmode_t fmode)
 {
 	if (test_bit(NFS_DELEGATED_STATE, &state->flags) == 0)
-		memcpy(state->stateid.data, stateid->data, sizeof(state->stateid.data));
-	memcpy(state->open_stateid.data, stateid->data, sizeof(state->open_stateid.data));
+		memcpy(state->stateid.u.data, stateid->u.data,
+		       sizeof(state->stateid.u.data));
+	memcpy(state->open_stateid.u.data, stateid->u.data,
+	       sizeof(state->open_stateid.u.data));
 	switch (fmode) {
 	case FMODE_READ:
 		set_bit(NFS_O_RDONLY_STATE, &state->flags);
@@ -895,7 +897,8 @@ static void __update_open_stateid(struct nfs4_state *state, nfs4_stateid *open_s
 	 */
 	write_seqlock(&state->seqlock);
 	if (deleg_stateid != NULL) {
-		memcpy(state->stateid.data, deleg_stateid->data, sizeof(state->stateid.data));
+		memcpy(state->stateid.u.data, deleg_stateid->u.data,
+		       sizeof(state->stateid.u.data));
 		set_bit(NFS_DELEGATED_STATE, &state->flags);
 	}
 	if (open_stateid != NULL)
@@ -926,7 +929,8 @@ static int update_open_stateid(struct nfs4_state *state, nfs4_stateid *open_stat
 	if (delegation == NULL)
 		delegation = &deleg_cur->stateid;
-	else if (memcmp(deleg_cur->stateid.data, delegation->data, NFS4_STATEID_SIZE) != 0)
+	else if (memcmp(deleg_cur->stateid.u.data, delegation->u.data,
+			NFS4_STATEID_SIZE) != 0)
 		goto no_delegation_unlock;
 	nfs_mark_delegation_referenced(deleg_cur);
@@ -988,7 +992,8 @@ static struct nfs4_state *nfs4_try_open_cached(struct nfs4_opendata *opendata)
 			break;
 		}
 		/* Save the delegation */
-		memcpy(stateid.data, delegation->stateid.data, sizeof(stateid.data));
+		memcpy(stateid.u.data, delegation->stateid.u.data,
+		       sizeof(stateid.u.data));
 		rcu_read_unlock();
 		ret = nfs_may_open(state->inode, state->owner->so_cred, open_mode);
 		if (ret != 0)
@@ -1154,10 +1159,13 @@ static int nfs4_open_recover(struct nfs4_opendata *opendata, struct nfs4_state *
 	 * Check if we need to update the current stateid.
 	 */
 	if (test_bit(NFS_DELEGATED_STATE, &state->flags) == 0 &&
-	    memcmp(state->stateid.data, state->open_stateid.data, sizeof(state->stateid.data)) != 0) {
+	    memcmp(state->stateid.u.data, state->open_stateid.u.data,
+		   sizeof(state->stateid.u.data)) != 0) {
 		write_seqlock(&state->seqlock);
 		if (test_bit(NFS_DELEGATED_STATE, &state->flags) == 0)
-			memcpy(state->stateid.data, state->open_stateid.data, sizeof(state->stateid.data));
+			memcpy(state->stateid.u.data,
+			       state->open_stateid.u.data,
+			       sizeof(state->stateid.u.data));
 		write_sequnlock(&state->seqlock);
 	}
 	pnfs4_layout_reclaim(state);
@@ -1228,8 +1236,8 @@ static int _nfs4_open_delegation_recall(struct nfs_open_context *ctx, struct nfs
 	if (IS_ERR(opendata))
 		return PTR_ERR(opendata);
 	opendata->o_arg.claim = NFS4_OPEN_CLAIM_DELEGATE_CUR;
-	memcpy(opendata->o_arg.u.delegation.data, stateid->data,
-	       sizeof(opendata->o_arg.u.delegation.data));
+	memcpy(opendata->o_arg.u.delegation.u.data, stateid->u.data,
+	       sizeof(opendata->o_arg.u.delegation.u.data));
 	ret = nfs4_open_recover(opendata, state);
 	nfs4_opendata_put(opendata);
 	return ret;
@@ -1287,8 +1295,8 @@ static void nfs4_open_confirm_done(struct rpc_task *task, void *calldata)
 	if (RPC_ASSASSINATED(task))
 		return;
 	if (data->rpc_status == 0) {
-		memcpy(data->o_res.stateid.data, data->c_res.stateid.data,
-		       sizeof(data->o_res.stateid.data));
+		memcpy(data->o_res.stateid.u.data, data->c_res.stateid.u.data,
+		       sizeof(data->o_res.stateid.u.data));
 		nfs_confirm_seqid(&data->owner->so_seqid, 0);
 		renew_lease(data->o_res.server, data->timestamp);
 		data->rpc_done = 1;
@@ -4097,9 +4105,10 @@ static void nfs4_locku_done(struct rpc_task *task, void *data)
 		return;
 	switch (task->tk_status) {
 	case 0:
-		memcpy(calldata->lsp->ls_stateid.data,
-		       calldata->res.stateid.data,
-		       sizeof(calldata->lsp->ls_stateid.data));
+		memcpy(calldata->lsp->ls_stateid.u.data,
+		       calldata->res.stateid.u.data,
+		       sizeof(calldata->lsp->ls_stateid.u.
+			      data));
 		renew_lease(calldata->server, calldata->timestamp);
 		break;
 	case -NFS4ERR_BAD_STATEID:
@@ -4312,8 +4321,8 @@ static void nfs4_lock_done(struct rpc_task *task, void *calldata)
 		goto out;
 	}
 	if (data->rpc_status == 0) {
-		memcpy(data->lsp->ls_stateid.data, data->res.stateid.data,
-		       sizeof(data->lsp->ls_stateid.data));
+		memcpy(data->lsp->ls_stateid.u.data, data->res.stateid.u.data,
+		       sizeof(data->lsp->ls_stateid.u.data));
 		data->lsp->ls_flags |= NFS_LOCK_INITIALIZED;
 		renew_lease(NFS_SERVER(data->ctx->path.dentry->d_inode), data->timestamp);
 	}
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index d0dbdd4..25849b8 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1049,8 +1049,8 @@ restart:
 				 * Open state on this file cannot be recovered
 				 * All we can do is revert to using the zero stateid.
 				 */
-				memset(state->stateid.data, 0,
-				       sizeof(state->stateid.data));
+				memset(state->stateid.u.data, 0,
+				       sizeof(state->stateid.u.data));
 				/* Mark the file as being 'closed' */
 				state->state = 0;
 				break;
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 6cf69b6..dc3e8dd 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -1004,7 +1004,7 @@ static void encode_close(struct xdr_stream *xdr, const struct nfs_closeargs *arg
 	p = reserve_space(xdr, 8+NFS4_STATEID_SIZE);
 	*p++ = cpu_to_be32(OP_CLOSE);
 	*p++ = cpu_to_be32(arg->seqid->sequence->counter);
-	xdr_encode_opaque_fixed(p, arg->stateid->data, NFS4_STATEID_SIZE);
+	xdr_encode_opaque_fixed(p, arg->stateid->u.data, NFS4_STATEID_SIZE);
 	hdr->nops++;
 	hdr->replen += decode_close_maxsz;
 }
@@ -1181,7 +1181,8 @@ static void encode_lock(struct xdr_stream *xdr, const struct nfs_lock_args *args
 	if (args->new_lock_owner){
 		p = reserve_space(xdr, 4+NFS4_STATEID_SIZE+32);
 		*p++ = cpu_to_be32(args->open_seqid->sequence->counter);
-		p = xdr_encode_opaque_fixed(p, args->open_stateid->data, NFS4_STATEID_SIZE);
+		p = xdr_encode_opaque_fixed(p, args->open_stateid->u.data,
+					    NFS4_STATEID_SIZE);
 		*p++ = 
cpu_to_be32(args->lock_seqid->sequence->counter); p = xdr_encode_hyper(p, args->lock_owner.clientid); *p++ = cpu_to_be32(16); @@ -1190,7 +1191,7 @@ static void encode_lock(struct xdr_stream *xdr, const struct nfs_lock_args *args } else { p = reserve_space(xdr, NFS4_STATEID_SIZE+4); - p = xdr_encode_opaque_fixed(p, args->lock_stateid->data, NFS4_STATEID_SIZE); + p = xdr_encode_opaque_fixed(p, args->lock_stateid->u.data, NFS4_STATEID_SIZE); *p = cpu_to_be32(args->lock_seqid->sequence->counter); } hdr->nops++; @@ -1222,7 +1223,8 @@ static void encode_locku(struct xdr_stream *xdr, const struct nfs_locku_args *ar *p++ = cpu_to_be32(OP_LOCKU); *p++ = cpu_to_be32(nfs4_lock_type(args->fl, 0)); *p++ = cpu_to_be32(args->seqid->sequence->counter); - p = xdr_encode_opaque_fixed(p, args->stateid->data, NFS4_STATEID_SIZE); + p = xdr_encode_opaque_fixed(p, args->stateid->u.data, + NFS4_STATEID_SIZE); p = xdr_encode_hyper(p, args->fl->fl_start); xdr_encode_hyper(p, nfs4_lock_length(args->fl)); hdr->nops++; @@ -1372,7 +1374,7 @@ static inline void encode_claim_delegate_cur(struct xdr_stream *xdr, const struc p = reserve_space(xdr, 4+NFS4_STATEID_SIZE); *p++ = cpu_to_be32(NFS4_OPEN_CLAIM_DELEGATE_CUR); - xdr_encode_opaque_fixed(p, stateid->data, NFS4_STATEID_SIZE); + xdr_encode_opaque_fixed(p, stateid->u.data, NFS4_STATEID_SIZE); encode_string(xdr, name->len, name->name); } @@ -1403,7 +1405,7 @@ static void encode_open_confirm(struct xdr_stream *xdr, const struct nfs_open_co p = reserve_space(xdr, 4+NFS4_STATEID_SIZE+4); *p++ = cpu_to_be32(OP_OPEN_CONFIRM); - p = xdr_encode_opaque_fixed(p, arg->stateid->data, NFS4_STATEID_SIZE); + p = xdr_encode_opaque_fixed(p, arg->stateid->u.data, NFS4_STATEID_SIZE); *p = cpu_to_be32(arg->seqid->sequence->counter); hdr->nops++; hdr->replen += decode_open_confirm_maxsz; @@ -1415,7 +1417,7 @@ static void encode_open_downgrade(struct xdr_stream *xdr, const struct nfs_close p = reserve_space(xdr, 4+NFS4_STATEID_SIZE+4); *p++ = 
cpu_to_be32(OP_OPEN_DOWNGRADE); - p = xdr_encode_opaque_fixed(p, arg->stateid->data, NFS4_STATEID_SIZE); + p = xdr_encode_opaque_fixed(p, arg->stateid->u.data, NFS4_STATEID_SIZE); *p = cpu_to_be32(arg->seqid->sequence->counter); encode_share_access(xdr, arg->fmode); hdr->nops++; @@ -1453,9 +1455,10 @@ static void encode_stateid(struct xdr_stream *xdr, const struct nfs_open_context p = reserve_space(xdr, NFS4_STATEID_SIZE); if (ctx->state != NULL) { nfs4_copy_stateid(&stateid, ctx->state, ctx->lockowner); - xdr_encode_opaque_fixed(p, stateid.data, NFS4_STATEID_SIZE); + xdr_encode_opaque_fixed(p, stateid.u.data, + NFS4_STATEID_SIZE); } else - xdr_encode_opaque_fixed(p, zero_stateid.data, NFS4_STATEID_SIZE); + xdr_encode_opaque_fixed(p, zero_stateid.u.data, NFS4_STATEID_SIZE); } static void encode_read(struct xdr_stream *xdr, const struct nfs_readargs *args, struct compound_hdr *hdr) @@ -1569,7 +1572,7 @@ encode_setacl(struct xdr_stream *xdr, struct nfs_setaclargs *arg, struct compoun p = reserve_space(xdr, 4+NFS4_STATEID_SIZE); *p++ = cpu_to_be32(OP_SETATTR); - xdr_encode_opaque_fixed(p, zero_stateid.data, NFS4_STATEID_SIZE); + xdr_encode_opaque_fixed(p, zero_stateid.u.data, NFS4_STATEID_SIZE); p = reserve_space(xdr, 2*4); *p++ = cpu_to_be32(1); *p = cpu_to_be32(FATTR4_WORD0_ACL); @@ -1600,7 +1603,7 @@ static void encode_setattr(struct xdr_stream *xdr, const struct nfs_setattrargs p = reserve_space(xdr, 4+NFS4_STATEID_SIZE); *p++ = cpu_to_be32(OP_SETATTR); - xdr_encode_opaque_fixed(p, arg->stateid.data, NFS4_STATEID_SIZE); + xdr_encode_opaque_fixed(p, arg->stateid.u.data, NFS4_STATEID_SIZE); hdr->nops++; hdr->replen += decode_setattr_maxsz; encode_attrs(xdr, arg->iap, server); @@ -1663,7 +1666,7 @@ static void encode_delegreturn(struct xdr_stream *xdr, const nfs4_stateid *state p = reserve_space(xdr, 4+NFS4_STATEID_SIZE); *p++ = cpu_to_be32(OP_DELEGRETURN); - xdr_encode_opaque_fixed(p, stateid->data, NFS4_STATEID_SIZE); + xdr_encode_opaque_fixed(p, stateid->u.data, 
NFS4_STATEID_SIZE); hdr->nops++; hdr->replen += decode_delegreturn_maxsz; } @@ -1873,7 +1876,8 @@ encode_layoutget(struct xdr_stream *xdr, p = xdr_encode_hyper(p, args->lseg.offset); p = xdr_encode_hyper(p, args->lseg.length); p = xdr_encode_hyper(p, args->minlength); - p = xdr_encode_opaque_fixed(p, &args->stateid.data, NFS4_STATEID_SIZE); + p = xdr_encode_opaque_fixed(p, &args->stateid.u.data, + NFS4_STATEID_SIZE); *p = cpu_to_be32(args->maxcount); dprintk("%s: 1st type:0x%x iomode:%d off:%lu len:%lu mc:%d\n", @@ -1905,7 +1909,7 @@ encode_layoutcommit(struct xdr_stream *xdr, p = xdr_encode_hyper(p, args->lseg.offset); p = xdr_encode_hyper(p, args->lseg.length); *p++ = cpu_to_be32(0); /* reclaim */ - p = xdr_encode_opaque_fixed(p, args->stateid.data, NFS4_STATEID_SIZE); + p = xdr_encode_opaque_fixed(p, args->stateid.u.data, NFS4_STATEID_SIZE); *p++ = cpu_to_be32(1); /* newoffset = TRUE */ p = xdr_encode_hyper(p, args->lastbytewritten); *p = cpu_to_be32(args->time_modify_changed != 0); @@ -1952,7 +1956,7 @@ encode_layoutreturn(struct xdr_stream *xdr, p = reserve_space(xdr, 16 + NFS4_STATEID_SIZE); p = xdr_encode_hyper(p, args->lseg.offset); p = xdr_encode_hyper(p, args->lseg.length); - p = xdr_encode_opaque_fixed(p, &args->stateid.data, + p = xdr_encode_opaque_fixed(p, &args->stateid.u.data, NFS4_STATEID_SIZE); dprintk("%s: call %pF\n", __func__, @@ -3996,7 +4000,7 @@ static int decode_opaque_fixed(struct xdr_stream *xdr, void *buf, size_t len) static int decode_stateid(struct xdr_stream *xdr, nfs4_stateid *stateid) { - return decode_opaque_fixed(xdr, stateid->data, NFS4_STATEID_SIZE); + return decode_opaque_fixed(xdr, stateid->u.data, NFS4_STATEID_SIZE); } static int decode_close(struct xdr_stream *xdr, struct nfs_closeres *res) @@ -5313,7 +5317,7 @@ static int decode_layoutget(struct xdr_stream *xdr, struct rpc_rqst *req, if (unlikely(!p)) goto out_overflow; res->return_on_close = be32_to_cpup(p++); - p = xdr_decode_opaque_fixed(p, res->stateid.data, 
NFS4_STATEID_SIZE); + p = xdr_decode_opaque_fixed(p, res->stateid.u.data, NFS4_STATEID_SIZE); layout_count = be32_to_cpup(p); if (!layout_count) { dprintk("%s: server responded with empty layout array\n", diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c index c4c7c35..7189173 100644 --- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -511,7 +511,7 @@ pnfs_set_layout_stateid(struct pnfs_layout_type *lo, const nfs4_stateid *stateid) { write_seqlock(&lo->seqlock); - memcpy(lo->stateid.data, stateid->data, sizeof(lo->stateid.data)); + memcpy(lo->stateid.u.data, stateid->u.data, sizeof(lo->stateid.u.data)); write_sequnlock(&lo->seqlock); } @@ -524,7 +524,8 @@ pnfs_get_layout_stateid(nfs4_stateid *dst, struct pnfs_layout_type *lo) do { seq = read_seqbegin(&lo->seqlock); - memcpy(dst->data, lo->stateid.data, sizeof(lo->stateid.data)); + memcpy(dst->u.data, lo->stateid.u.data, + sizeof(lo->stateid.u.data)); } while (read_seqretry(&lo->seqlock, seq)); dprintk("<-- %s\n", __func__); @@ -539,8 +540,8 @@ pnfs_layout_from_open_stateid(nfs4_stateid *dst, struct nfs4_state *state) do { seq = read_seqbegin(&state->seqlock); - memcpy(dst->data, state->stateid.data, - sizeof(state->stateid.data)); + memcpy(dst->u.data, state->stateid.u.data, + sizeof(state->stateid.u.data)); } while (read_seqretry(&state->seqlock, seq)); dprintk("<-- %s\n", __func__); @@ -586,7 +587,7 @@ get_layout(struct inode *ino, lgp->args.inode = ino; lgp->lsegpp = lsegpp; - if (!memcmp(lo->stateid.data, &zero_stateid, NFS4_STATEID_SIZE)) { + if (!memcmp(lo->stateid.u.data, &zero_stateid, NFS4_STATEID_SIZE)) { struct nfs_open_context *oldctx = ctx; if (!oldctx) { diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c index f371c43..1d3cae3 100644 --- a/fs/nfsd/nfs4callback.c +++ b/fs/nfsd/nfs4callback.c @@ -40,7 +40,6 @@ #define NFSPROC4_CB_NULL 0 #define NFSPROC4_CB_COMPOUND 1 -#define NFS4_STATEID_SIZE 16 /* Index of predefined Linux callback client operations */ diff --git a/include/linux/nfs4.h 
b/include/linux/nfs4.h index 9603b71..9cf2d98 100644 --- a/include/linux/nfs4.h +++ b/include/linux/nfs4.h @@ -18,7 +18,9 @@ #define NFS4_BITMAP_SIZE 2 #define NFS4_VERIFIER_SIZE 8 #define NFS4_CLIENTID_SIZE 8 -#define NFS4_STATEID_SIZE 16 +#define NFS4_STATEID_SEQID_SIZE 4 +#define NFS4_STATEID_OTHER_SIZE 12 +#define NFS4_STATEID_SIZE (NFS4_STATEID_SEQID_SIZE + NFS4_STATEID_OTHER_SIZE) #define NFS4_FHSIZE 128 #define NFS4_MAXPATHLEN PATH_MAX #define NFS4_MAXNAMLEN NAME_MAX @@ -181,7 +183,17 @@ struct nfs4_fsid { typedef struct { char data[NFS4_VERIFIER_SIZE]; } nfs4_verifier; typedef struct { char data[NFS4_CLIENTID_SIZE]; } nfs4_clientid; -typedef struct { char data[NFS4_STATEID_SIZE]; } nfs4_stateid; + +struct nfs41_stateid { + __be32 seqid; + char other[NFS4_STATEID_OTHER_SIZE]; +} __attribute__ ((packed)); +typedef struct { + union { + char data[NFS4_STATEID_SIZE]; + struct nfs41_stateid stateid; + } u; +} nfs4_stateid; enum nfs_opnum4 { OP_ACCESS = 3, -- 1.6.2.5 ^ permalink raw reply related [flat|nested] 20+ messages in thread
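For readers following the stateid change in this patch, here is a minimal userspace sketch of the same layout: a 16-byte stateid that can be viewed either as a flat byte array or as a 4-byte big-endian seqid followed by 12 opaque bytes. The names stateid_t, stateid_fields, stateid_copy, stateid_seqid and be32_to_host are made up for illustration and are not the kernel's; the packed attribute assumes a GCC- or Clang-compatible compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STATEID_SEQID_SIZE 4
#define STATEID_OTHER_SIZE 12
#define STATEID_SIZE (STATEID_SEQID_SIZE + STATEID_OTHER_SIZE)

/* Structured view: big-endian seqid followed by 12 opaque bytes. */
struct stateid_fields {
	uint32_t seqid;			/* kept in network byte order */
	char other[STATEID_OTHER_SIZE];
} __attribute__((packed));

/* Hypothetical userspace mirror of the patched nfs4_stateid typedef. */
typedef struct {
	union {
		char data[STATEID_SIZE];
		struct stateid_fields stateid;
	} u;
} stateid_t;

/* Portable stand-in for the kernel's be32_to_cpu(). */
static uint32_t be32_to_host(uint32_t be)
{
	unsigned char b[4];

	memcpy(b, &be, sizeof(b));
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Copy all 16 bytes through the flat view, like the converted call sites. */
static void stateid_copy(stateid_t *dst, const stateid_t *src)
{
	/* Both views must cover exactly the same bytes. */
	assert(sizeof(dst->u.data) == sizeof(dst->u.stateid));
	memcpy(dst->u.data, src->u.data, sizeof(dst->u.data));
}

/* Read the sequence number through the structured view, in host order. */
static uint32_t stateid_seqid(const stateid_t *sid)
{
	return be32_to_host(sid->u.stateid.seqid);
}
```

Existing code keeps treating the stateid as opaque bytes via u.data, while code that needs the sequence number can read it via u.stateid.seqid without hand-written offsets.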
* [PATCH 5/8] pnfs-submit: request whole file layouts only 2010-05-05 17:00 ` [PATCH 4/8] pnfs-submit: change stateid to be a union Alexandros Batsakis @ 2010-05-05 17:00 ` Alexandros Batsakis 2010-05-05 17:00 ` [PATCH 6/8] pnfs-submit: change layouts list to be similar to the other state list management Alexandros Batsakis 0 siblings, 1 reply; 20+ messages in thread From: Alexandros Batsakis @ 2010-05-05 17:00 UTC (permalink / raw) To: linux-nfs; +Cc: bhalevy, Alexandros Batsakis In the first iteration of the pNFS code, we support only whole file layouts. To facilitate the move to multiple-segments, we keep the segment processing code, but the segment list should always contain at most one segment per I/O type Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> --- fs/nfs/callback_proc.c | 7 ++++--- fs/nfs/pnfs.c | 25 ++++++++----------------- 2 files changed, 12 insertions(+), 20 deletions(-) diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c index dcf3747..acd72b7 100644 --- a/fs/nfs/callback_proc.c +++ b/fs/nfs/callback_proc.c @@ -213,6 +213,10 @@ static int pnfs_recall_layout(void *data) then return layouts, resume after layoutreturns complete */ + /* support whole file layouts only */ + rl.cbl_seg.offset = 0; + rl.cbl_seg.length = NFS4_MAX_UINT64; + if (rl.cbl_recall_type == RETURN_FILE) { status = pnfs_return_layout(inode, &rl.cbl_seg, &rl.cbl_stateid, RETURN_FILE, true); @@ -221,9 +225,6 @@ static int pnfs_recall_layout(void *data) goto out; } - rl.cbl_seg.offset = 0; - rl.cbl_seg.length = NFS4_MAX_UINT64; - /* FIXME: This loop is inefficient, running in O(|s_inodes|^2) */ while ((ino = nfs_layoutrecall_find_inode(clp, &rl)) != NULL) { /* XXX need to check status on pnfs_return_layout */ diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c index 7189173..52879f6 100644 --- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -552,12 +552,6 @@ pnfs_layout_from_open_stateid(nfs4_stateid *dst, struct nfs4_state *state) * for now, assume that whole file layouts are 
requested. * arg->offset: 0 * arg->length: all ones -* -* for now, assume the LAYOUTGET operation is triggered by an I/O request. -* the count field is the count in the I/O request, and will be used -* as the minlength. for the file operation that piggy-backs -* the LAYOUTGET operation with an OPEN, s -* arg->minlength = count. */ static int get_layout(struct inode *ino, @@ -578,11 +572,11 @@ get_layout(struct inode *ino, return -ENOMEM; } lgp->lo = lo; - lgp->args.minlength = PAGE_CACHE_SIZE; + lgp->args.minlength = NFS4_MAX_UINT64; lgp->args.maxcount = PNFS_LAYOUT_MAXSIZE; lgp->args.lseg.iomode = range->iomode; - lgp->args.lseg.offset = range->offset; - lgp->args.lseg.length = max(range->length, lgp->args.minlength); + lgp->args.lseg.offset = 0; + lgp->args.lseg.length = NFS4_MAX_UINT64; lgp->args.type = server->pnfs_curr_ld->id; lgp->args.inode = ino; lgp->lsegpp = lsegpp; @@ -757,7 +751,7 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range, else { arg.iomode = IOMODE_ANY; arg.offset = 0; - arg.length = ~0; + arg.length = NFS4_MAX_UINT64; } if (type == RETURN_FILE) { lo = get_lock_current_layout(nfsi); @@ -1076,8 +1070,8 @@ pnfs_update_layout(struct inode *ino, { struct nfs4_pnfs_layout_segment arg = { .iomode = iomode, - .offset = pos, - .length = count + .offset = 0, + .length = ~0 }; struct nfs_inode *nfsi = NFS_I(ino); struct pnfs_layout_type *lo; @@ -1167,7 +1161,6 @@ out_put: void pnfs_get_layout_done(struct nfs4_pnfs_layoutget *lgp, int rpc_status) { - struct nfs4_pnfs_layoutget_res *res = &lgp->res; struct pnfs_layout_segment *lseg = NULL; struct nfs_inode *nfsi = PNFS_NFS_INODE(lgp->lo); time_t suspend = 0; @@ -1176,11 +1169,10 @@ pnfs_get_layout_done(struct nfs4_pnfs_layoutget *lgp, int rpc_status) lgp->status = rpc_status; if (likely(!rpc_status)) { - if (unlikely(res->layout.len <= 0)) { + if (unlikely(lgp->res.layout.len < 0)) { printk(KERN_ERR - "%s: ERROR! 
Layout size is ZERO!\n", __func__); + "%s: ERROR Returned layout size is ZERO\n", __func__); lgp->status = -EIO; - goto get_out; } goto out; } @@ -1258,7 +1250,6 @@ pnfs_get_layout_done(struct nfs4_pnfs_layoutget *lgp, int rpc_status) break; } -get_out: /* remember that get layout failed and suspend trying */ nfsi->layout.pnfs_layout_suspend = suspend; set_bit(lo_fail_bit(lgp->args.lseg.iomode), -- 1.6.2.5 ^ permalink raw reply related [flat|nested] 20+ messages in thread
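As a rough illustration of the "whole file only" rule in this patch, the sketch below forces any requested range to offset 0 and an all-ones length while keeping the caller's iomode. The struct and function names (layout_range, whole_file_range, clamp_to_whole_file, is_whole_file) are hypothetical, and UINT64_MAX stands in for NFS4_MAX_UINT64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct nfs4_pnfs_layout_segment. */
struct layout_range {
	uint32_t iomode;
	uint64_t offset;
	uint64_t length;
};

/* Build a range that covers the whole file for the given iomode. */
static struct layout_range whole_file_range(uint32_t iomode)
{
	struct layout_range r = {
		.iomode = iomode,
		.offset = 0,
		.length = UINT64_MAX,	/* plays the role of NFS4_MAX_UINT64 */
	};

	return r;
}

/* Ignore the requested offset/length and widen to the whole file. */
static void clamp_to_whole_file(struct layout_range *r)
{
	assert(r != NULL);
	*r = whole_file_range(r->iomode);
}

/* True if the range starts at 0 and has the all-ones length. */
static bool is_whole_file(const struct layout_range *r)
{
	return r->offset == 0 && r->length == UINT64_MAX;
}
```

With every request widened this way, a layout's segment list can hold at most one segment per iomode, which is the invariant the commit message relies on.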
* [PATCH 6/8] pnfs-submit: change layouts list to be similar to the other state list management
2010-05-05 17:00 ` [PATCH 5/8] pnfs-submit: request whole file layouts only Alexandros Batsakis
@ 2010-05-05 17:00 ` Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 7/8] pnfs-submit: forgetful client model Alexandros Batsakis
0 siblings, 1 reply; 20+ messages in thread
From: Alexandros Batsakis @ 2010-05-05 17:00 UTC (permalink / raw)
To: linux-nfs; +Cc: bhalevy, Alexandros Batsakis
The current design keeps a list (nfs_client) of inodes having layouts.
In order to make that code more similar to delegation handling (and in
general to the rest of the NFS code), this patch changes the list element
to layouts directly. No backpointer from the layout to the inode is needed
as the inode can be accessed by a container_of() call.
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> --- fs/nfs/callback_proc.c | 9 +++++++-- fs/nfs/client.c | 2 +- fs/nfs/inode.c | 8 +++++--- fs/nfs/pnfs.c | 10 ++++------ include/linux/nfs_fs.h | 4 +--- include/linux/nfs_fs_sb.h | 2 +- 6 files changed, 19 insertions(+), 16 deletions(-) diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c index acd72b7..0400dbe 100644 --- a/fs/nfs/callback_proc.c +++ b/fs/nfs/callback_proc.c @@ -76,7 +76,6 @@ static int (*nfs_validate_delegation_stateid(struct nfs_client *clp))(struct nfs return nfs4_validate_delegation_stateid; } - __be32 nfs4_callback_recall(struct cb_recallargs *args, void *dummy) { struct nfs_client *clp; @@ -140,6 +139,7 @@ nfs_layoutrecall_find_inode(struct nfs_client *clp, const struct cb_pnfs_layoutrecallargs *args) { struct nfs_inode *nfsi; + struct pnfs_layout_type *layout; struct nfs_server *server; struct inode *ino = NULL; @@ -147,9 +147,14 @@ nfs_layoutrecall_find_inode(struct nfs_client *clp, __func__, args->cbl_recall_type, clp); spin_lock(&clp->cl_lock); - list_for_each_entry(nfsi, &clp->cl_lo_inodes, lo_inodes) { + list_for_each_entry(layout, &clp->cl_layouts, lo_layouts) {
+ nfsi = PNFS_NFS_INODE(layout); + if (!nfsi) + continue; + dprintk("%s: Searching inode=%lu\n", __func__, nfsi->vfs_inode.i_ino); + if (args->cbl_recall_type == RETURN_FILE) { if (nfs_compare_fh(&args->cbl_fh, &nfsi->fh)) continue; diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 963fc19..448c565 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -157,7 +157,7 @@ static struct nfs_client *nfs_alloc_client(const struct nfs_client_initdata *cl_ if (!IS_ERR(cred)) clp->cl_machine_cred = cred; #if defined(CONFIG_NFS_V4_1) - INIT_LIST_HEAD(&clp->cl_lo_inodes); + INIT_LIST_HEAD(&clp->cl_layouts); #endif nfs_fscache_get_client_cookie(clp); diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index ade797e..722e76f 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -1363,9 +1363,11 @@ static void pnfs_destroy_inode(struct nfs_inode *nfsi) pnfs_destroy_layout(nfsi); BUG_ON(!list_empty(&nfsi->layout.segs)); -if (nfsi->layout.refcount) printk("%s: layout.refcount %d\n", __func__, nfsi->layout.refcount); + if (nfsi->layout.refcount) + dprintk("%s: layout.refcount %d\n", __func__, + nfsi->layout.refcount); BUG_ON(nfsi->layout.refcount); - BUG_ON(!list_empty(&nfsi->lo_inodes)); + BUG_ON(!list_empty(&nfsi->layout.lo_layouts)); BUG_ON(nfsi->layout.ld_data); #endif /* CONFIG_NFS_V4_1 */ } @@ -1381,10 +1383,10 @@ void nfs_destroy_inode(struct inode *inode) static void pnfs_init_once(struct nfs_inode *nfsi) { #ifdef CONFIG_NFS_V4_1 - INIT_LIST_HEAD(&nfsi->lo_inodes); init_waitqueue_head(&nfsi->lo_waitq); spin_lock_init(&nfsi->lo_lock); seqlock_init(&nfsi->layout.seqlock); + INIT_LIST_HEAD(&nfsi->layout.lo_layouts); INIT_LIST_HEAD(&nfsi->layout.segs); nfsi->layout.refcount = 0; nfsi->layout.ld_data = NULL; diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c index 52879f6..546e2f4 100644 --- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -370,10 +370,10 @@ put_unlock_current_layout(struct pnfs_layout_type *lo) io_ops->free_layout(lo->ld_data); lo->ld_data = NULL; - /* Unlist the inode. 
*/ + /* Unlist the layout. */ clp = NFS_SERVER(&nfsi->vfs_inode)->nfs_client; spin_lock(&clp->cl_lock); - list_del_init(&nfsi->lo_inodes); + list_del_init(&lo->lo_layouts); spin_unlock(&clp->cl_lock); } out: @@ -889,10 +889,8 @@ alloc_init_layout(struct inode *ino) BUG_ON(lo->ld_data != NULL); lo->ld_data = ld_data; - seqlock_init(&lo->seqlock); memset(&lo->stateid, 0, NFS4_STATEID_SIZE); lo->refcount = 1; - INIT_LIST_HEAD(&lo->segs); lo->roc_iomode = 0; return lo; } @@ -948,8 +946,8 @@ get_lock_alloc_layout(struct inode *ino) lock_current_layout(nfsi); spin_lock(&clp->cl_lock); - if (list_empty(&nfsi->lo_inodes)) - list_add_tail(&nfsi->lo_inodes, &clp->cl_lo_inodes); + if (list_empty(&lo->lo_layouts)) + list_add_tail(&lo->lo_layouts, &clp->cl_layouts); spin_unlock(&clp->cl_lock); } else lo = ERR_PTR(-ENOMEM); diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index eb95dbf..764d061 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -99,6 +99,7 @@ struct posix_acl; struct pnfs_layout_type { int refcount; + struct list_head lo_layouts; /* other client layouts */ struct list_head segs; /* layout segments list */ int roc_iomode; /* iomode to return on close, 0=none */ seqlock_t seqlock; /* Protects the stateid */ @@ -204,9 +205,6 @@ struct nfs_inode { /* pNFS layout information */ #if defined(CONFIG_NFS_V4_1) - /* Inodes having layouts */ - struct list_head lo_inodes; - wait_queue_head_t lo_waitq; spinlock_t lo_lock; struct pnfs_layout_type layout; diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index 76eb08d..8d93182 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -86,7 +86,7 @@ struct nfs_client { /* The flags used for obtaining the clientid during EXCHANGE_ID */ u32 cl_exchange_flags; struct nfs4_session *cl_session; /* sharred session */ - struct list_head cl_lo_inodes; /* Inodes having layouts */ + struct list_head cl_layouts; struct nfs4_deviceid_cache *cl_devid_cache; /* pNFS deviceid cache 
*/ #endif /* CONFIG_NFS_V4_1 */ -- 1.6.2.5 ^ permalink raw reply related [flat|nested] 20+ messages in thread
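The container_of() remark in this patch's description can be shown with a small userspace sketch. Because the layout is embedded in the inode structure, and the list node is embedded in the layout, the enclosing object is recovered by pointer arithmetic and no backpointer is stored. The types my_inode, layout and list_node and the two helper functions are simplified, hypothetical stand-ins for struct nfs_inode, struct pnfs_layout_type, struct list_head and the PNFS_NFS_INODE() lookup; the macro is a plain-C approximation of the kernel's container_of().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace version of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Stand-in for struct pnfs_layout_type: the list node lives in the layout. */
struct layout {
	int refcount;
	struct list_node lo_layouts;
};

/* Stand-in for struct nfs_inode: the layout is embedded, not pointed to. */
struct my_inode {
	unsigned long ino;
	struct layout layout;
};

/* From a list node on the client's layout list to its layout. */
static struct layout *node_to_layout(struct list_node *n)
{
	assert(n != NULL);
	return container_of(n, struct layout, lo_layouts);
}

/* From an embedded layout to the inode that contains it. */
static struct my_inode *layout_to_inode(struct layout *lo)
{
	assert(lo != NULL);
	return container_of(lo, struct my_inode, layout);
}
```

Walking the client's list therefore yields layouts first, and each layout maps back to its inode in constant time.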
* [PATCH 7/8] pnfs-submit: forgetful client model
2010-05-05 17:00 ` [PATCH 6/8] pnfs-submit: change layouts list to be similar to the other state list management Alexandros Batsakis
@ 2010-05-05 17:00 ` Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 8/8] pnfs-submit: support for cb_recall_any (layouts) Alexandros Batsakis
0 siblings, 1 reply; 20+ messages in thread
From: Alexandros Batsakis @ 2010-05-05 17:00 UTC (permalink / raw)
To: linux-nfs; +Cc: bhalevy, Alexandros Batsakis
Forgetful client model:
If we receive a CB_LAYOUTRECALL
- we spawn a thread to handle the recall
  (xxx: for now only one recall can be active at a time, else NFS4ERR_DELAY)
- we check the stateid seqid;
  if it does not match we return NFS4ERR_DELAY
- we check for pending I/O;
  if there is any we return NFS4ERR_DELAY
Otherwise we return NFS4ERR_NOMATCHING_LAYOUT.
Note that for whole file layouts there is no need to serialize
LAYOUTGETs/LAYOUTRETURNs.
For bulk layouts, if there is a layout active, we return NFS4_OK and we
start cleaning the layouts asynchronously. At the end we send a bulk
LAYOUTRETURN. Note that there is no need to prevent any new LAYOUTGETs
explicitly as the server should reject them.
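The single-file recall decision described above can be sketched roughly as follows. This is a simplified userspace model, not the patch's code: layout_stateid, recall_reply, is_next_stateid and recall_file_layout are hypothetical names, the seqid is held in host byte order, and the case where the client still holds the zero stateid (handled in pnfs_is_next_layout_stateid() in the diff) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define OTHER_SIZE 12

/* Hypothetical host-order view of a layout stateid. */
struct layout_stateid {
	uint32_t seqid;
	unsigned char other[OTHER_SIZE];
};

enum recall_reply {
	REPLY_DELAY,			/* stands for NFS4ERR_DELAY */
	REPLY_NOMATCHING_LAYOUT,	/* stands for NFS4ERR_NOMATCHING_LAYOUT */
};

/* True if 'recalled' is exactly one step ahead of the stateid we hold. */
static bool is_next_stateid(const struct layout_stateid *held,
			    const struct layout_stateid *recalled)
{
	assert(held != NULL && recalled != NULL);
	if (memcmp(held->other, recalled->other, OTHER_SIZE) != 0)
		return false;	/* simplification: zero-stateid case omitted */
	if (held->seqid == UINT32_MAX)
		return recalled->seqid == 1;	/* wrap-around skips seqid 0 */
	return recalled->seqid == held->seqid + 1;
}

/* Decide the reply to a recall of one file's layout. */
static enum recall_reply
recall_file_layout(const struct layout_stateid *held,
		   const struct layout_stateid *recalled, bool io_pending)
{
	if (!is_next_stateid(held, recalled))
		return REPLY_DELAY;	/* out-of-order recall, ask for a retry */
	if (io_pending)
		return REPLY_DELAY;	/* let in-flight I/O drain first */
	/* The layout is dropped locally; no LAYOUTRETURN is sent. */
	return REPLY_NOMATCHING_LAYOUT;
}
```

The point of the model is that a successful single-file recall ends with the client forgetting the layout and answering as if it never held one, so the server does not have to wait for a LAYOUTRETURN.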
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> --- fs/nfs/callback_proc.c | 146 ++++++++++++++++++++++++++++++++++-------------- fs/nfs/nfs4_fs.h | 1 + fs/nfs/pnfs.c | 70 ++++++++++------------- 3 files changed, 136 insertions(+), 81 deletions(-) diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c index 0400dbe..9a38725 100644 --- a/fs/nfs/callback_proc.c +++ b/fs/nfs/callback_proc.c @@ -129,6 +129,38 @@ int nfs4_validate_delegation_stateid(struct nfs_delegation *delegation, const nf #if defined(CONFIG_NFS_V4_1) +static bool +pnfs_is_next_layout_stateid(const struct pnfs_layout_type *lo, + const nfs4_stateid stateid) +{ + int seqlock; + bool res; + u32 oldseqid, newseqid; + + do { + seqlock = read_seqbegin(&lo->seqlock); + oldseqid = be32_to_cpu(lo->stateid.u.stateid.seqid); + newseqid = be32_to_cpu(stateid.u.stateid.seqid); + res = !memcmp(lo->stateid.u.stateid.other, + stateid.u.stateid.other, + NFS4_STATEID_OTHER_SIZE); + if (res) { /* comparing layout stateids */ + if (oldseqid == ~0) + res = (newseqid == 1); + else + res = (newseqid == oldseqid + 1); + } else { /* open stateid */ + res = !memcmp(lo->stateid.u.data, + &zero_stateid, + NFS4_STATEID_SIZE); + if (res) + res = (newseqid == 1); + } + } while (read_seqretry(&lo->seqlock, seqlock)); + + return res; +} + /* * Retrieve an inode based on layout recall parameters * @@ -191,9 +223,10 @@ static int pnfs_recall_layout(void *data) struct inode *inode, *ino; struct nfs_client *clp; struct cb_pnfs_layoutrecallargs rl; + struct nfs4_pnfs_layoutreturn *lrp; struct recall_layout_threadargs *args = (struct recall_layout_threadargs *)data; - int status; + int status = 0; daemonize("nfsv4-layoutreturn"); @@ -204,47 +237,59 @@ static int pnfs_recall_layout(void *data) clp = args->clp; inode = args->inode; rl = *args->rl; - args->result = 0; - complete(&args->started); - args = NULL; - /* Note: args must not be used after this point!!! 
*/ - -/* FIXME: need barrier here: - pause I/O to data servers - pause layoutgets - drain all outstanding writes to storage devices - wait for any outstanding layoutreturns and layoutgets mentioned in - cb_sequence. - then return layouts, resume after layoutreturns complete - */ /* support whole file layouts only */ rl.cbl_seg.offset = 0; rl.cbl_seg.length = NFS4_MAX_UINT64; if (rl.cbl_recall_type == RETURN_FILE) { - status = pnfs_return_layout(inode, &rl.cbl_seg, &rl.cbl_stateid, - RETURN_FILE, true); + if (pnfs_is_next_layout_stateid(&NFS_I(inode)->layout, + rl.cbl_stateid)) + status = pnfs_return_layout(inode, &rl.cbl_seg, + &rl.cbl_stateid, RETURN_FILE, + false); + else + status = cpu_to_be32(NFS4ERR_DELAY); if (status) dprintk("%s RETURN_FILE error: %d\n", __func__, status); + else + status = cpu_to_be32(NFS4ERR_NOMATCHING_LAYOUT); + args->result = status; + complete(&args->started); goto out; } - /* FIXME: This loop is inefficient, running in O(|s_inodes|^2) */ + status = cpu_to_be32(NFS4_OK); + args->result = status; + complete(&args->started); + args = NULL; + + /* IMPROVEME: This loop is inefficient, running in O(|s_inodes|^2) */ while ((ino = nfs_layoutrecall_find_inode(clp, &rl)) != NULL) { - /* XXX need to check status on pnfs_return_layout */ - pnfs_return_layout(ino, &rl.cbl_seg, NULL, RETURN_FILE, true); + /* FIXME: need to check status on pnfs_return_layout */ + pnfs_return_layout(ino, &rl.cbl_seg, NULL, RETURN_FILE, false); iput(ino); } + lrp = kzalloc(sizeof(*lrp), GFP_KERNEL); + if (!lrp) { + dprintk("%s: allocation failed. 
Cannot send last LAYOUTRETURN\n", + __func__); + goto out; + } + /* send final layoutreturn */ - status = pnfs_return_layout(inode, &rl.cbl_seg, NULL, - rl.cbl_recall_type, true); - if (status) - printk(KERN_INFO "%s: ignoring pnfs_return_layout status=%d\n", - __func__, status); + lrp->args.reclaim = 0; + lrp->args.layout_type = rl.cbl_layout_type; + lrp->args.return_type = rl.cbl_recall_type; + lrp->args.lseg = rl.cbl_seg; + lrp->args.inode = inode; + lrp->lo = NULL; + pnfs4_proc_layoutreturn(lrp, true); + out: - iput(inode); + clear_bit(NFS4CLNT_LAYOUT_RECALL, &clp->cl_state); + nfs_put_client(clp); module_put_and_exit(0); dprintk("%s: exit status %d\n", __func__, 0); return 0; @@ -262,15 +307,18 @@ static int pnfs_async_return_layout(struct nfs_client *clp, struct inode *inode, .rl = rl, }; struct task_struct *t; - int status; - - /* should have returned NFS4ERR_NOMATCHING_LAYOUT... */ - BUG_ON(inode == NULL); + int status = -EAGAIN; dprintk("%s: -->\n", __func__); + /* FIXME: do not allow two concurrent layout recalls */ + if (test_and_set_bit(NFS4CLNT_LAYOUT_RECALL, &clp->cl_state)) + return status; + init_completion(&data.started); __module_get(THIS_MODULE); + if (!atomic_inc_not_zero(&clp->cl_count)) + goto out_put_no_client; t = kthread_run(pnfs_recall_layout, &data, "%s", "pnfs_recall_layout"); if (IS_ERR(t)) { @@ -284,6 +332,9 @@ static int pnfs_async_return_layout(struct nfs_client *clp, struct inode *inode, wait_for_completion(&data.started); return data.result; out_module_put: + nfs_put_client(clp); +out_put_no_client: + clear_bit(NFS4CLNT_LAYOUT_RECALL, &clp->cl_state); module_put(THIS_MODULE); return status; } @@ -294,35 +345,46 @@ __be32 pnfs_cb_layoutrecall(struct cb_pnfs_layoutrecallargs *args, struct nfs_client *clp; struct inode *inode = NULL; __be32 res; + int status; unsigned int num_client = 0; dprintk("%s: -->\n", __func__); - res = htonl(NFS4ERR_INVAL); - clp = nfs_find_client(args->cbl_addr, 4); + res = 
cpu_to_be32(NFS4ERR_OP_NOT_IN_SESSION); + clp = nfs_find_client(args->cbl_addr, 4); if (clp == NULL) { dprintk("%s: no client for addr %u.%u.%u.%u\n", __func__, NIPQUAD(args->cbl_addr)); goto out; } - res = htonl(NFS4ERR_NOMATCHING_LAYOUT); + res = cpu_to_be32(NFS4ERR_NOMATCHING_LAYOUT); do { struct nfs_client *prev = clp; num_client++; - inode = nfs_layoutrecall_find_inode(clp, args); - if (inode != NULL) { - if (PNFS_LD(&NFS_I(inode)->layout)->id == - args->cbl_layout_type) { - /* Set up a helper thread to actually - * return the delegation */ - res = pnfs_async_return_layout(clp, inode, args); - if (res != 0) - res = htonl(NFS4ERR_RESOURCE); - break; + /* the callback must come from the MDS personality */ + if (!(clp->cl_exchange_flags & EXCHGID4_FLAG_USE_PNFS_MDS)) + goto loop; + if (args->cbl_recall_type == RETURN_FILE) { + inode = nfs_layoutrecall_find_inode(clp, args); + if (inode != NULL) { + status = pnfs_async_return_layout(clp, inode, + args); + if (status == -EAGAIN) + res = cpu_to_be32(NFS4ERR_DELAY); + iput(inode); } + } else { /* _ALL or _FSID */ + /* we need the inode to get the nfs_server struct */ + inode = nfs_layoutrecall_find_inode(clp, args); + if (!inode) + goto loop; + status = pnfs_async_return_layout(clp, inode, args); + if (status == -EAGAIN) + res = cpu_to_be32(NFS4ERR_DELAY); iput(inode); } +loop: clp = nfs_find_client_next(prev); nfs_put_client(prev); } while (clp != NULL); diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h index 0abcd45..4e48aea 100644 --- a/fs/nfs/nfs4_fs.h +++ b/fs/nfs/nfs4_fs.h @@ -47,6 +47,7 @@ enum nfs4_client_state { NFS4CLNT_SESSION_RESET, NFS4CLNT_SESSION_DRAINING, NFS4CLNT_RECALL_SLOT, + NFS4CLNT_LAYOUT_RECALL, }; /* diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c index 546e2f4..3104730 100644 --- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -710,6 +710,8 @@ return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range, dprintk("--> %s\n", __func__); + BUG_ON(type != RETURN_FILE); + lrp = kzalloc(sizeof(*lrp), 
GFP_KERNEL); if (lrp == NULL) { if (lo && (type == RETURN_FILE)) @@ -746,13 +748,11 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range, dprintk("--> %s type %d\n", __func__, type); - if (range) - arg = *range; - else { - arg.iomode = IOMODE_ANY; - arg.offset = 0; - arg.length = NFS4_MAX_UINT64; - } + + arg.iomode = range ? range->iomode : IOMODE_ANY; + arg.offset = 0; + arg.length = NFS4_MAX_UINT64; + if (type == RETURN_FILE) { lo = get_lock_current_layout(nfsi); if (lo && !has_layout_to_return(lo, &arg)) { @@ -761,11 +761,7 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range, } if (!lo) { dprintk("%s: no layout segments to return\n", __func__); - /* must send the LAYOUTRETURN in response to recall */ - if (stateid) - goto send_return; - else - goto out; + goto out; } /* unlock w/o put rebalanced by eventual call to @@ -774,12 +770,23 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range, unlock_current_layout(nfsi); if (pnfs_return_layout_barrier(nfsi, &arg)) { + if (stateid) { /* callback */ + status = -EAGAIN; + lock_current_layout(nfsi); + put_unlock_current_layout(lo); + goto out; + } dprintk("%s: waiting\n", __func__); wait_event(nfsi->lo_waitq, - !pnfs_return_layout_barrier(nfsi, &arg)); + !pnfs_return_layout_barrier(nfsi, &arg)); } if (layoutcommit_needed(nfsi)) { + if (stateid && !wait) { /* callback */ + dprintk("%s: layoutcommit pending\n", __func__); + status = -EAGAIN; + goto out; + } status = pnfs_layoutcommit_inode(ino, wait); if (status) { dprintk("%s: layoutcommit failed, status=%d. 
" @@ -788,9 +795,13 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range, status = 0; } } + + if (stateid && wait) + status = return_layout(ino, &arg, stateid, type, + lo, wait); + else + pnfs_layout_release(lo, &arg); } -send_return: - status = return_layout(ino, &arg, stateid, type, lo, wait); out: dprintk("<-- %s status: %d\n", __func__, status); return status; @@ -1069,7 +1080,7 @@ pnfs_update_layout(struct inode *ino, struct nfs4_pnfs_layout_segment arg = { .iomode = iomode, .offset = 0, - .length = ~0 + .length = NFS4_MAX_UINT64, }; struct nfs_inode *nfsi = NFS_I(ino); struct pnfs_layout_type *lo; @@ -1088,31 +1099,12 @@ pnfs_update_layout(struct inode *ino, /* Check to see if the layout for the given range already exists */ lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref); if (lseg && !lseg->valid) { - unlock_current_layout(nfsi); if (take_ref) put_lseg(lseg); - for (;;) { - prepare_to_wait(&nfsi->lo_waitq, &__wait, - TASK_KILLABLE); - lock_current_layout(nfsi); - lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref); - if (!lseg || lseg->valid) - break; - dprintk("%s: invalid lseg %p ref %d\n", __func__, - lseg, atomic_read(&lseg->kref.refcount)-1); - if (take_ref) - put_lseg(lseg); - if (signal_pending(current)) { - lseg = NULL; - result = -ERESTARTSYS; - break; - } - unlock_current_layout(nfsi); - schedule(); - } - finish_wait(&nfsi->lo_waitq, &__wait); - if (result) - goto out_put; + + /* someone is cleaning the layout */ + result = -EAGAIN; + goto out_put; } if (lseg) { -- 1.6.2.5 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 8/8] pnfs-submit: support for cb_recall_any (layouts)
2010-05-05 17:00 ` [PATCH 7/8] pnfs-submit: forgetful client model Alexandros Batsakis
@ 2010-05-05 17:00 ` Alexandros Batsakis
0 siblings, 0 replies; 20+ messages in thread
From: Alexandros Batsakis @ 2010-05-05 17:00 UTC (permalink / raw)
To: linux-nfs; +Cc: bhalevy, Alexandros Batsakis

CB_RECALL_ANY serves as a hint to the client to return some server state.
We reply immediately and clean the layouts asynchronously.

FIXME: currently we return _all_ layouts
FIXME: eventually we should treat layouts as delegations, mark them
expired, and fire the state manager to clean them.

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
---
 fs/nfs/callback.h      |    7 +++++
 fs/nfs/callback_proc.c |   58 ++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
index 73f21bc..b39ac86 100644
--- a/fs/nfs/callback.h
+++ b/fs/nfs/callback.h
@@ -115,6 +115,13 @@ extern int nfs41_validate_delegation_stateid(struct nfs_delegation *delegation,

 #define RCA4_TYPE_MASK_RDATA_DLG 0
 #define RCA4_TYPE_MASK_WDATA_DLG 1
+#define RCA4_TYPE_MASK_DIR_DLG 2
+#define RCA4_TYPE_MASK_FILE_LAYOUT 3
+#define RCA4_TYPE_MASK_BLK_LAYOUT 4
+#define RCA4_TYPE_MASK_OBJ_LAYOUT_MIN 8
+#define RCA4_TYPE_MASK_OBJ_LAYOUT_MAX 9
+#define RCA4_TYPE_MASK_OTHER_LAYOUT_MIN 12
+#define RCA4_TYPE_MASK_OTHER_LAYOUT_MAX 15

 struct cb_recallanyargs {
 	struct sockaddr *craa_addr;
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index 9a38725..ba4c48c 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -339,6 +339,27 @@ out_put_no_client:
 	return status;
 }

+static int pnfs_recall_all_layouts(struct nfs_client *clp)
+{
+	struct cb_pnfs_layoutrecallargs rl;
+	struct inode *inode;
+	int status = 0;
+
+	rl.cbl_recall_type = RETURN_ALL;
+	rl.cbl_seg.iomode = IOMODE_ANY;
+	rl.cbl_seg.offset = 0;
+	rl.cbl_seg.length = NFS4_MAX_UINT64;
+
+	/* we need the inode to get the nfs_server struct */
+	inode = nfs_layoutrecall_find_inode(clp, &rl);
+	if (!inode)
+		return status;
+	status = pnfs_async_return_layout(clp, inode, &rl);
+	iput(inode);
+
+	return status;
+}
+
 __be32 pnfs_cb_layoutrecall(struct cb_pnfs_layoutrecallargs *args,
 			    void *dummy)
 {
@@ -659,13 +680,37 @@ out:
 	return status;
 }

+static inline bool
+validate_bitmap_values(const unsigned long *mask)
+{
+	int i;
+
+	if (*mask == 0)
+		return true;
+	if (test_bit(RCA4_TYPE_MASK_RDATA_DLG, mask) ||
+	    test_bit(RCA4_TYPE_MASK_WDATA_DLG, mask) ||
+	    test_bit(RCA4_TYPE_MASK_DIR_DLG, mask) ||
+	    test_bit(RCA4_TYPE_MASK_FILE_LAYOUT, mask) ||
+	    test_bit(RCA4_TYPE_MASK_BLK_LAYOUT, mask))
+		return true;
+	for (i = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN;
+	     i <= RCA4_TYPE_MASK_OBJ_LAYOUT_MAX; i++)
+		if (test_bit(i, mask))
+			return true;
+	for (i = RCA4_TYPE_MASK_OTHER_LAYOUT_MIN;
+	     i <= RCA4_TYPE_MASK_OTHER_LAYOUT_MAX; i++)
+		if (test_bit(i, mask))
+			return true;
+	return false;
+}
+
 __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy)
 {
 	struct nfs_client *clp;
 	__be32 status;
 	fmode_t flags = 0;

-	status = htonl(NFS4ERR_OP_NOT_IN_SESSION);
+	status = cpu_to_be32(NFS4ERR_OP_NOT_IN_SESSION);
 	clp = nfs_find_client(args->craa_addr, 4);
 	if (clp == NULL)
 		goto out;
@@ -673,16 +718,25 @@ __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy)
 	dprintk("NFS: RECALL_ANY callback request from %s\n",
 		rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR));

+	status = cpu_to_be32(NFS4ERR_INVAL);
+	if (!validate_bitmap_values((const unsigned long *)
+				    &args->craa_type_mask))
+		return status;
+
+	status = cpu_to_be32(NFS4_OK);
 	if (test_bit(RCA4_TYPE_MASK_RDATA_DLG, (const unsigned long *)
 		     &args->craa_type_mask))
 		flags = FMODE_READ;
 	if (test_bit(RCA4_TYPE_MASK_WDATA_DLG, (const unsigned long *)
 		     &args->craa_type_mask))
 		flags |= FMODE_WRITE;
+	if (test_bit(RCA4_TYPE_MASK_FILE_LAYOUT, (const unsigned long *)
+		     &args->craa_type_mask))
+		if (pnfs_recall_all_layouts(clp) == -EAGAIN)
+			status = cpu_to_be32(NFS4ERR_DELAY);

 	if (flags)
 		nfs_expire_all_delegation_types(clp, flags);
-	status = htonl(NFS4_OK);
 out:
 	dprintk("%s: exit with status = %d\n", __func__, ntohl(status));
 	return status;
--
1.6.2.5
^ permalink raw reply related [flat|nested] 20+ messages in thread
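The bitmap check in validate_bitmap_values() can also be read as a single mask comparison. Below is a minimal user-space sketch of that reading, assuming a 32-bit type mask and plain shifts in place of the kernel's test_bit(). The RCA4_TYPE_MASK_* values are the ones from the patch; bit_range(), rca4_known_bits(), rca4_mask_ok() and rca4_mask_demo() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined in fs/nfs/callback.h by the patch. */
#define RCA4_TYPE_MASK_RDATA_DLG	0
#define RCA4_TYPE_MASK_WDATA_DLG	1
#define RCA4_TYPE_MASK_DIR_DLG		2
#define RCA4_TYPE_MASK_FILE_LAYOUT	3
#define RCA4_TYPE_MASK_BLK_LAYOUT	4
#define RCA4_TYPE_MASK_OBJ_LAYOUT_MIN	8
#define RCA4_TYPE_MASK_OBJ_LAYOUT_MAX	9
#define RCA4_TYPE_MASK_OTHER_LAYOUT_MIN	12
#define RCA4_TYPE_MASK_OTHER_LAYOUT_MAX	15

/* Hypothetical helper: OR together every bit position in [lo, hi]. */
static uint32_t bit_range(unsigned int lo, unsigned int hi)
{
	uint32_t m = 0;
	unsigned int i;

	for (i = lo; i <= hi; i++)
		m |= UINT32_C(1) << i;
	return m;
}

/* All bit positions the patch treats as known. */
static uint32_t rca4_known_bits(void)
{
	return bit_range(RCA4_TYPE_MASK_RDATA_DLG, RCA4_TYPE_MASK_BLK_LAYOUT) |
	       bit_range(RCA4_TYPE_MASK_OBJ_LAYOUT_MIN,
			 RCA4_TYPE_MASK_OBJ_LAYOUT_MAX) |
	       bit_range(RCA4_TYPE_MASK_OTHER_LAYOUT_MIN,
			 RCA4_TYPE_MASK_OTHER_LAYOUT_MAX);
}

/*
 * Same acceptance rule as the patch: an empty mask is fine, otherwise
 * at least one known bit must be set.
 */
static bool rca4_mask_ok(uint32_t mask)
{
	return mask == 0 || (mask & rca4_known_bits()) != 0;
}

/* Exercises the rule on a few masks; returns 0 when all checks hold. */
int rca4_mask_demo(void)
{
	assert(rca4_mask_ok(0));
	assert(rca4_mask_ok(UINT32_C(1) << RCA4_TYPE_MASK_FILE_LAYOUT));
	assert(!rca4_mask_ok(UINT32_C(1) << 5));
	/* A known bit next to an unknown one still passes. */
	assert(rca4_mask_ok((UINT32_C(1) << RCA4_TYPE_MASK_RDATA_DLG) |
			    (UINT32_C(1) << 20)));
	return 0;
}
```

Under this rule a mask that mixes a known bit with an unknown one is accepted, which matches the posted function. Whether such masks should be rejected instead is left open here.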
* Re: [PATCH 2/8] pnfs-submit: clean locking infrastructure
2010-05-05 17:00 ` [PATCH 2/8] pnfs-submit: clean locking infrastructure Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 3/8] pnfs-submit: remove lgetcount, lretcount (outstanding LAYOUTGETs/LAYOUTRETUNs) Alexandros Batsakis
@ 2010-06-07 14:34 ` Fred Isaman
1 sibling, 0 replies; 20+ messages in thread
From: Fred Isaman @ 2010-06-07 14:34 UTC (permalink / raw)
To: Alexandros Batsakis; +Cc: linux-nfs, bhalevy

On Wed, May 5, 2010 at 1:00 PM, Alexandros Batsakis
<batsakis-HgOvQuBEEgRhl2p70BpVqQ@public.gmane.orgm> wrote:
> (also minor cleanup of pnfs_free_layout())
>
> Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
> ---
>  fs/nfs/pnfs.c |   73 ++++++++++++++++++++++++++++++++++++--------------------
>  1 files changed, 47 insertions(+), 26 deletions(-)
>
> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> index f32dbbb..a4031b4 100644
> --- a/fs/nfs/pnfs.c
> +++ b/fs/nfs/pnfs.c
> @@ -60,6 +60,8 @@ static int pnfs_initialized;
>  static void pnfs_free_layout(struct pnfs_layout_type *lo,
>                              struct nfs4_pnfs_layout_segment *range);
>  static enum pnfs_try_status pnfs_commit(struct nfs_write_data *data, int sync);
> +static inline void lock_current_layout(struct nfs_inode *nfsi);
> +static inline void unlock_current_layout(struct nfs_inode *nfsi);
>
>  /* Locking:
>   *
> @@ -152,15 +154,15 @@ void
>  pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_context *ctx)
>  {
>         dprintk("%s: has_layout=%d ctx=%p\n", __func__, has_layout(nfsi), ctx);
> -       spin_lock(&nfsi->lo_lock);
> +       lock_current_layout(nfsi);
>         if (has_layout(nfsi) && !layoutcommit_needed(nfsi)) {
>                 nfsi->layout.lo_cred = get_rpccred(ctx->state->owner->so_cred);
>                 nfsi->change_attr++;
> -               spin_unlock(&nfsi->lo_lock);
> +               unlock_current_layout(nfsi);
>                 dprintk("%s: Set layoutcommit\n", __func__);
>                 return;
>         }
> -       spin_unlock(&nfsi->lo_lock);
> +       unlock_current_layout(nfsi);
>  }
>
>  /* Update last_write_offset for layoutcommit.
> @@ -173,7 +175,7 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent)
>  {
>         loff_t end_pos;
>
> -       spin_lock(&nfsi->lo_lock);
> +       lock_current_layout(nfsi);
>         if (offset < nfsi->layout.pnfs_write_begin_pos)
>                 nfsi->layout.pnfs_write_begin_pos = offset;
>         end_pos = offset + extent - 1; /* I'm being inclusive */
> @@ -185,7 +187,7 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent)
>                 (unsigned long) offset ,
>                 (unsigned long) nfsi->layout.pnfs_write_begin_pos,
>                 (unsigned long) nfsi->layout.pnfs_write_end_pos);
> -       spin_unlock(&nfsi->lo_lock);
> +       unlock_current_layout(nfsi);
>  }
>
>  /* Unitialize a mountpoint in a layout driver */
> @@ -313,6 +315,17 @@ pnfs_unregister_layoutdriver(struct pnfs_layoutdriver_type *ld_type)
>  #define BUG_ON_UNLOCKED_LO(lo) do {} while (0)
>  #endif /* CONFIG_SMP */
>
> +static inline void lock_current_layout(struct nfs_inode *nfsi)
> +{
> +       spin_lock(&nfsi->lo_lock);
> +}
> +
> +static inline void unlock_current_layout(struct nfs_inode *nfsi)
> +{
> +       BUG_ON_UNLOCKED_LO((&nfsi->layout));
> +       spin_unlock(&nfsi->lo_lock);
> +}
> +
>  /*
>   * get and lock nfsi->layout
>   */
> @@ -321,10 +334,10 @@ get_lock_current_layout(struct nfs_inode *nfsi)
>  {
>         struct pnfs_layout_type *lo;
>
> +       lock_current_layout(nfsi);
>         lo = &nfsi->layout;
> -       spin_lock(&nfsi->lo_lock);
>         if (!lo->ld_data) {
> -               spin_unlock(&nfsi->lo_lock);
> +               unlock_current_layout(nfsi);
>                 return NULL;
>         }
>
> @@ -344,7 +357,12 @@ put_unlock_current_layout(struct pnfs_layout_type *lo)
>         BUG_ON_UNLOCKED_LO(lo);
>         BUG_ON(lo->refcount <= 0);
>
> -       if (--lo->refcount == 0 && list_empty(&lo->segs)) {
> +       lo->refcount--;
> +
> +       if (lo->refcount > 0)
> +               goto out;
> +
> +       if (list_empty(&lo->segs)) {
>                 struct layoutdriver_io_operations *io_ops =
>                         PNFS_LD_IO_OPS(lo);
>
> @@ -358,7 +376,8 @@ put_unlock_current_layout(struct pnfs_layout_type *lo)
>                 list_del_init(&nfsi->lo_inodes);
>                 spin_unlock(&clp->cl_lock);
>         }
> -       spin_unlock(&nfsi->lo_lock);
> +out:
> +       unlock_current_layout(nfsi);
>  }
>
>  void
> @@ -367,7 +386,7 @@ pnfs_layout_release(struct pnfs_layout_type *lo, atomic_t *count,
>  {
>         struct nfs_inode *nfsi = PNFS_NFS_INODE(lo);
>
> -       spin_lock(&nfsi->lo_lock);
> +       lock_current_layout(nfsi);
>         if (range)
>                 pnfs_free_layout(lo, range);
>         atomic_dec(count);
> @@ -386,6 +405,8 @@ pnfs_destroy_layout(struct nfs_inode *nfsi)
>         };
>
>         lo = get_lock_current_layout(nfsi);
> +       if (!lo)
> +               return;
>         pnfs_free_layout(lo, &range);
>         put_unlock_current_layout(lo);
>  }
> @@ -663,7 +684,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi,
>         struct pnfs_layout_segment *lseg;
>         bool ret = false;
>
> -       spin_lock(&nfsi->lo_lock);
> +       lock_current_layout(nfsi);
>         list_for_each_entry (lseg, &nfsi->layout.segs, fi_list) {
>                 if (!should_free_lseg(lseg, range))
>                         continue;
> @@ -677,7 +698,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi,
>         }
>         if (atomic_read(&nfsi->layout.lgetcount))
>                 ret = true;
> -       spin_unlock(&nfsi->lo_lock);
> +       unlock_current_layout(nfsi);
>
>         dprintk("%s:Return %d\n", __func__, ret);
>         return ret;
> @@ -759,7 +780,7 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range,
>                 /* unlock w/o put rebalanced by eventual call to
>                  * pnfs_layout_release
>                  */
> -               spin_unlock(&nfsi->lo_lock);
> +               unlock_current_layout(nfsi);
>
>                 if (pnfs_return_layout_barrier(nfsi, &arg)) {
>                         dprintk("%s: waiting\n", __func__);
> @@ -900,7 +921,7 @@ static int pnfs_wait_schedule(void *word)
>   *
>   * Note: If successful, nfsi->lo_lock is taken and the caller
>   * must put and unlock current_layout by using put_unlock_current_layout()
> - * when the returned layout is released.
> + * directly or pnfs_layout_release() when the returned layout is released.
>   */
>  static struct pnfs_layout_type *
>  get_lock_alloc_layout(struct inode *ino)
> @@ -935,7 +956,7 @@ get_lock_alloc_layout(struct inode *ino)
>                         struct nfs_client *clp = NFS_SERVER(ino)->nfs_client;
>
>                         /* must grab the layout lock before the client lock */
> -                       spin_lock(&nfsi->lo_lock);
> +                       lock_current_layout(nfsi);
>
>                         spin_lock(&clp->cl_lock);
>                         if (list_empty(&nfsi->lo_inodes))
> @@ -1051,10 +1072,10 @@ void drain_layoutreturns(struct pnfs_layout_type *lo)
>         while (atomic_read(&lo->lretcount)) {
>                 struct nfs_inode *nfsi = PNFS_NFS_INODE(lo);
>
> -               spin_unlock(&nfsi->lo_lock);
> +               unlock_current_layout(nfsi);
>                 dprintk("%s: waiting\n", __func__);
>                 wait_event(nfsi->lo_waitq, (atomic_read(&lo->lretcount) == 0));
> -               spin_lock(&nfsi->lo_lock);
> +               lock_current_layout(nfsi);
>         }
>  }
>
> @@ -1093,13 +1114,13 @@ pnfs_update_layout(struct inode *ino,
>         /* Check to see if the layout for the given range already exists */
>         lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref);
>         if (lseg && !lseg->valid) {
> -               spin_unlock(&nfsi->lo_lock);
> +               unlock_current_layout(nfsi);
>                 if (take_ref)
>                         put_lseg(lseg);
>                 for (;;) {
>                         prepare_to_wait(&nfsi->lo_waitq, &__wait,
>                                         TASK_KILLABLE);
> -                       spin_lock(&nfsi->lo_lock);
> +                       lock_current_layout(nfsi);
>                         lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref);
>                         if (!lseg || lseg->valid)
>                                 break;
> @@ -1112,7 +1133,7 @@ pnfs_update_layout(struct inode *ino,
>                                 result = -ERESTARTSYS;
>                                 break;
>                         }
> -                       spin_unlock(&nfsi->lo_lock);
> +                       unlock_current_layout(nfsi);
>                         schedule();
>                 }
>                 finish_wait(&nfsi->lo_waitq, &__wait);
> @@ -1149,7 +1170,7 @@ pnfs_update_layout(struct inode *ino,
>         /* Matching dec is done in .rpc_release (on non-error paths) */
>         atomic_inc(&lo->lgetcount);
>         /* Lose lock, but not reference, match this with pnfs_layout_release */
> -       spin_unlock(&nfsi->lo_lock);
> +       unlock_current_layout(nfsi);
>
>         result = get_layout(ino, ctx, &arg, lsegpp, lo);
>  out:
> @@ -1299,7 +1320,7 @@ pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp)
>                 *lgp->lsegpp = lseg;
>         }
>
> -       spin_lock(&nfsi->lo_lock);
> +       lock_current_layout(nfsi);
>         pnfs_insert_layout(lo, lseg);
>
>         if (res->return_on_close) {
> @@ -1310,7 +1331,7 @@ pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp)
>
>         /* Done processing layoutget. Set the layout stateid */
>         pnfs_set_layout_stateid(lo, &res->stateid);
> -       spin_unlock(&nfsi->lo_lock);
> +       unlock_current_layout(nfsi);
>  out:
>         return status;
>  }
> @@ -2140,9 +2161,9 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync)
>         if (!data)
>                 return -ENOMEM;
>
> -       spin_lock(&nfsi->lo_lock);
> +       lock_current_layout(nfsi);
>         if (!layoutcommit_needed(nfsi)) {
> -               spin_unlock(&nfsi->lo_lock);
> +               lock_current_layout(nfsi);

This should be unlock_current_layout

Fred

>                 goto out_free;
>         }
>
> @@ -2157,7 +2178,7 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync)
>         nfsi->layout.lo_cred = NULL;
>         pnfs_get_layout_stateid(&data->args.stateid, &nfsi->layout);
>
> -       spin_unlock(&nfsi->lo_lock);
> +       unlock_current_layout(nfsi);
>
>         /* Set up layout commit args */
>         status = pnfs_layoutcommit_setup(inode, data, write_begin_pos,
> --
> 1.6.2.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 20+ messages in thread
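The typo flagged in the review (a second lock_current_layout() on the early-exit path of pnfs_layoutcommit_inode()) is easy to see with a lock that only counts how often it is held. The following is a minimal user-space sketch, not kernel code: struct dbg_lock, dbg_acquire(), dbg_release(), the two early_exit_*() functions and lock_balance_demo() are hypothetical names, and the counter stands in for nfsi->lo_lock. With a real non-recursive spinlock the second acquire would spin forever instead of incrementing a counter.

```c
#include <assert.h>

/* Hypothetical debug lock: it only counts how many times it is held. */
struct dbg_lock {
	int depth;
};

static void dbg_acquire(struct dbg_lock *l)
{
	l->depth++;
}

static void dbg_release(struct dbg_lock *l)
{
	l->depth--;
}

/* Early-exit path as posted: a second acquire where a release was meant. */
static void early_exit_as_posted(struct dbg_lock *l)
{
	dbg_acquire(l);
	dbg_acquire(l);	/* the typo: should be dbg_release() */
}

/* Early-exit path with the reviewer's correction applied. */
static void early_exit_fixed(struct dbg_lock *l)
{
	dbg_acquire(l);
	dbg_release(l);
}

/* Returns the difference in residual lock depth between the two paths. */
int lock_balance_demo(void)
{
	struct dbg_lock a = { 0 }, b = { 0 };

	early_exit_as_posted(&a);
	early_exit_fixed(&b);
	assert(a.depth == 2);	/* lock left held twice */
	assert(b.depth == 0);	/* balanced */
	return a.depth - b.depth;
}
```

A depth check of this kind at function exit is one way to catch such imbalances in testing; the patch itself relies on BUG_ON_UNLOCKED_LO() in the unlock wrapper.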
* Re: [PATCH 0/8] pnfs-submit: forgetful client v2
2010-05-05 17:00 [PATCH 0/8] pnfs-submit: forgetful client v2 Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 1/8] pnfs-submit: clean struct nfs_inode Alexandros Batsakis
@ 2010-06-07 8:52 ` Boaz Harrosh
2010-06-07 8:54 ` Boaz Harrosh
1 sibling, 1 reply; 20+ messages in thread
From: Boaz Harrosh @ 2010-06-07 8:52 UTC (permalink / raw)
To: Alexandros Batsakis; +Cc: linux-nfs, bhalevy

On 05/05/2010 08:00 PM, Alexandros Batsakis wrote:
> This set of patches (2.6.35-rc1) includes a first attempt to implement

Alexandros what's up with the date of these mails they are all marked
as 5/5/2010. Looks like a bug in git send-email. (Thunderbird gave me
a hard time with that)

Boaz

> the forgetful client model for the pNFS client. The model
> is explained is patch 7.
> It also includes some minor cleanups in the layout management code
> that help to improve the maintanability of the current code.
>
> Passed cthon tests against the pyNFS server, and against a modified
> version of pyNFS server that randomly issues layout recalls after opens.
>
> Alexandros Batsakis (8):
>   pnfs-submit: clean struct nfs_inode
>   pnfs-submit: clean locking infrastructure
>   pnfs-submit: remove lgetcount, lretcount (outstanding
>     LAYOUTGETs/LAYOUTRETUNs)
>   pnfs-submit: change stateid to be a union
>   pnfs-submit: request whole file layouts only
>   pnfs-submit: change layouts list to be similar to the other state
>     list management
>   pnfs-submit: forgetful client model
>   pnfs-submit: support for cb_recall_any (layouts)
>
>  fs/nfs/callback.h         |    7 +
>  fs/nfs/callback_proc.c    |  231 +++++++++++++++++++++++++++++---------
>  fs/nfs/callback_xdr.c     |    2 +-
>  fs/nfs/client.c           |    2 +-
>  fs/nfs/delegation.c       |   19 ++--
>  fs/nfs/inode.c            |   16 ++-
>  fs/nfs/nfs4_fs.h          |    1 +
>  fs/nfs/nfs4proc.c         |   46 +++++---
>  fs/nfs/nfs4state.c        |    4 +-
>  fs/nfs/nfs4xdr.c          |   38 ++++---
>  fs/nfs/pnfs.c             |  276 +++++++++++++++++++++------------------------
>  fs/nfs/pnfs.h             |    3 +-
>  fs/nfsd/nfs4callback.c    |    1 -
>  include/linux/nfs4.h      |   16 +++-
>  include/linux/nfs4_pnfs.h |    2 +-
>  include/linux/nfs_fs.h    |   28 ++---
>  include/linux/nfs_fs_sb.h |    2 +-
>  17 files changed, 417 insertions(+), 277 deletions(-)
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/8] pnfs-submit: forgetful client v2
2010-06-07 8:52 ` [PATCH 0/8] pnfs-submit: forgetful client v2 Boaz Harrosh
@ 2010-06-07 8:54 ` Boaz Harrosh
2010-06-07 15:38 ` Alexandros Batsakis
0 siblings, 1 reply; 20+ messages in thread
From: Boaz Harrosh @ 2010-06-07 8:54 UTC (permalink / raw)
To: Alexandros Batsakis; +Cc: linux-nfs, bhalevy

On 06/07/2010 11:52 AM, Boaz Harrosh wrote:
> On 05/05/2010 08:00 PM, Alexandros Batsakis wrote:
>> This set of patches (2.6.35-rc1) includes a first attempt to implement
>
> Alexandros what's up with the date of these mails they are all marked
> as 5/5/2010. Looks like a bug in git send-email. (Thunderbird gave me
> a hard time with that)
>
> Boaz

The previous set was sent on the 2010_05_17, perhaps it's your machine
then?

Boaz

>> the forgetful client model for the pNFS client. The model
>> is explained is patch 7.
>> It also includes some minor cleanups in the layout management code
>> that help to improve the maintanability of the current code.
>>
>> Passed cthon tests against the pyNFS server, and against a modified
>> version of pyNFS server that randomly issues layout recalls after opens.
>>
>> Alexandros Batsakis (8):
>>   pnfs-submit: clean struct nfs_inode
>>   pnfs-submit: clean locking infrastructure
>>   pnfs-submit: remove lgetcount, lretcount (outstanding
>>     LAYOUTGETs/LAYOUTRETUNs)
>>   pnfs-submit: change stateid to be a union
>>   pnfs-submit: request whole file layouts only
>>   pnfs-submit: change layouts list to be similar to the other state
>>     list management
>>   pnfs-submit: forgetful client model
>>   pnfs-submit: support for cb_recall_any (layouts)
>>
>>  fs/nfs/callback.h         |    7 +
>>  fs/nfs/callback_proc.c    |  231 +++++++++++++++++++++++++++++---------
>>  fs/nfs/callback_xdr.c     |    2 +-
>>  fs/nfs/client.c           |    2 +-
>>  fs/nfs/delegation.c       |   19 ++--
>>  fs/nfs/inode.c            |   16 ++-
>>  fs/nfs/nfs4_fs.h          |    1 +
>>  fs/nfs/nfs4proc.c         |   46 +++++---
>>  fs/nfs/nfs4state.c        |    4 +-
>>  fs/nfs/nfs4xdr.c          |   38 ++++---
>>  fs/nfs/pnfs.c             |  276 +++++++++++++++++++++------------------------
>>  fs/nfs/pnfs.h             |    3 +-
>>  fs/nfsd/nfs4callback.c    |    1 -
>>  include/linux/nfs4.h      |   16 +++-
>>  include/linux/nfs4_pnfs.h |    2 +-
>>  include/linux/nfs_fs.h    |   28 ++---
>>  include/linux/nfs_fs_sb.h |    2 +-
>>  17 files changed, 417 insertions(+), 277 deletions(-)
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/8] pnfs-submit: forgetful client v2
2010-06-07 8:54 ` Boaz Harrosh
@ 2010-06-07 15:38 ` Alexandros Batsakis
0 siblings, 0 replies; 20+ messages in thread
From: Alexandros Batsakis @ 2010-06-07 15:38 UTC (permalink / raw)
To: Boaz Harrosh; +Cc: Alexandros Batsakis, linux-nfs, bhalevy

On Mon, Jun 7, 2010 at 1:54 AM, Boaz Harrosh <bharrosh@panasas.com> wrote:
> On 06/07/2010 11:52 AM, Boaz Harrosh wrote:
>> On 05/05/2010 08:00 PM, Alexandros Batsakis wrote:
>>> This set of patches (2.6.35-rc1) includes a first attempt to implement
>>
>> Alexandros what's up with the date of these mails they are all marked
>> as 5/5/2010. Looks like a bug in git send-email. (Thunderbird gave me
>> a hard time with that)
>>
>> Boaz
>
> The previous set was sent on the 2010_05_17, perhaps it's your machine
> then?
>

Yeah... the "hardware" clock in my virtual machine gets crazy
sometimes and I didn't notice. Apologies... I'll rebase to -rc2 and
resend anyway.

-alexandros

> Boaz
>>> the forgetful client model for the pNFS client. The model
>>> is explained is patch 7.
>>> It also includes some minor cleanups in the layout management code
>>> that help to improve the maintanability of the current code.
>>>
>>> Passed cthon tests against the pyNFS server, and against a modified
>>> version of  pyNFS server that randomly issues layout recalls after opens.
>>>
>>> Alexandros Batsakis (8):
>>>   pnfs-submit: clean struct nfs_inode
>>>   pnfs-submit: clean locking infrastructure
>>>   pnfs-submit: remove lgetcount, lretcount (outstanding
>>>     LAYOUTGETs/LAYOUTRETUNs)
>>>   pnfs-submit: change stateid to be a union
>>>   pnfs-submit: request whole file layouts only
>>>   pnfs-submit: change layouts list to be similar to the other state
>>>     list management
>>>   pnfs-submit: forgetful client model
>>>   pnfs-submit: support for cb_recall_any (layouts)
>>>
>>>  fs/nfs/callback.h         |    7 +
>>>  fs/nfs/callback_proc.c    |  231 +++++++++++++++++++++++++++++---------
>>>  fs/nfs/callback_xdr.c     |    2 +-
>>>  fs/nfs/client.c           |    2 +-
>>>  fs/nfs/delegation.c       |   19 ++--
>>>  fs/nfs/inode.c            |   16 ++-
>>>  fs/nfs/nfs4_fs.h          |    1 +
>>>  fs/nfs/nfs4proc.c         |   46 +++++---
>>>  fs/nfs/nfs4state.c        |    4 +-
>>>  fs/nfs/nfs4xdr.c          |   38 ++++---
>>>  fs/nfs/pnfs.c             |  276 +++++++++++++++++++++------------------------
>>>  fs/nfs/pnfs.h             |    3 +-
>>>  fs/nfsd/nfs4callback.c    |    1 -
>>>  include/linux/nfs4.h      |   16 +++-
>>>  include/linux/nfs4_pnfs.h |    2 +-
>>>  include/linux/nfs_fs.h    |   28 ++---
>>>  include/linux/nfs_fs_sb.h |    2 +-
>>>  17 files changed, 417 insertions(+), 277 deletions(-)
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 0/8] pnfs-submit: Forgetful cleint and some layout cleanups
@ 2010-05-17 17:56 Alexandros Batsakis
2010-05-17 17:56 ` [PATCH 1/8] pnfs-submit: clean struct nfs_inode Alexandros Batsakis
0 siblings, 1 reply; 20+ messages in thread
From: Alexandros Batsakis @ 2010-05-17 17:56 UTC (permalink / raw)
To: bhalevy; +Cc: linux-nfs, Alexandros Batsakis
This set of patches includes a first attempt to implement the forgetful client model for the pNFS client. The model is explained in patch 7.
It also includes some minor cleanups in the layout management code that improve the maintainability of the current code.
Passed cthon tests against the pyNFS server, and against a modified version of the pyNFS server that randomly issues recalls after opens.
Alexandros Batsakis (8):
pnfs-submit: clean struct nfs_inode
pnfs-submit: clean locking infrastructure
pnfs-submit: remove lgetcount, lretcount (outstanding
LAYOUTGETs/LAYOUTRETUNs)
pnfs-submit: change stateid to be a union
pnfs-submit: request whole file layouts only
pnfs-submit: change layouts list to be similar to the other state
list management
pnfs-submit: forgetful client model
pnfs-submit: support for cb_recall_any (layouts)
fs/nfs/callback.h | 7 +
fs/nfs/callback_proc.c | 210 +++++++++++++++++++++++-------
fs/nfs/callback_xdr.c | 2 +-
fs/nfs/client.c | 2 +-
fs/nfs/delegation.c | 19 ++-
fs/nfs/inode.c | 17 ++-
fs/nfs/nfs4_fs.h | 1 +
fs/nfs/nfs4proc.c | 50 ++++---
fs/nfs/nfs4state.c | 8 +-
fs/nfs/nfs4xdr.c | 38 +++---
fs/nfs/pnfs.c | 310 ++++++++++++++++++++++-----------------------
fs/nfs/pnfs.h | 11 +-
fs/nfs/write.c | 3 +-
fs/nfsd/nfs4callback.c | 1 -
include/linux/nfs4.h | 18 +++-
include/linux/nfs_fs.h | 30 ++---
include/linux/nfs_fs_sb.h | 2 +-
17 files changed, 432 insertions(+), 297 deletions(-)
^ permalink raw reply [flat|nested] 20+ messages in thread* [PATCH 1/8] pnfs-submit: clean struct nfs_inode 2010-05-17 17:56 [PATCH 0/8] pnfs-submit: Forgetful cleint and some layout cleanups Alexandros Batsakis @ 2010-05-17 17:56 ` Alexandros Batsakis 2010-05-17 17:56 ` [PATCH 2/8] pnfs-submit: clean locking infrastructure Alexandros Batsakis 0 siblings, 1 reply; 20+ messages in thread From: Alexandros Batsakis @ 2010-05-17 17:56 UTC (permalink / raw) To: bhalevy; +Cc: linux-nfs, Alexandros Batsakis by moving layout specific fields from nfs_inode to struct pnfs_layout_type Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> --- fs/nfs/inode.c | 11 ++++--- fs/nfs/nfs4state.c | 2 +- fs/nfs/pnfs.c | 70 ++++++++++++++++++++++++++---------------------- fs/nfs/write.c | 3 +- include/linux/nfs_fs.h | 25 ++++++++--------- 5 files changed, 59 insertions(+), 52 deletions(-) diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index 9b8b655..ab599be 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -1110,7 +1110,8 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr) /* * file needs layout commit, server attributes may be stale */ - if (nfsi->layoutcommit_ctx && nfsi->change_attr >= fattr->change_attr) { + if (nfsi->layout.layoutcommit_ctx && + nfsi->change_attr >= fattr->change_attr) { dprintk("NFS: %s: layoutcommit is needed for file %s/%ld\n", __func__, inode->i_sb->s_id, inode->i_ino); return 0; @@ -1328,12 +1329,12 @@ void nfs4_clear_inode(struct inode *inode) static void pnfs_alloc_init_inode(struct nfs_inode *nfsi) { #ifdef CONFIG_NFS_V4_1 - nfsi->pnfs_layout_state = 0; + nfsi->layout.pnfs_layout_state = 0; memset(&nfsi->layout.stateid, 0, NFS4_STATEID_SIZE); nfsi->layout.roc_iomode = 0; - nfsi->layoutcommit_ctx = NULL; - nfsi->pnfs_write_begin_pos = 0; - nfsi->pnfs_write_end_pos = 0; + nfsi->layout.layoutcommit_ctx = NULL; + nfsi->layout.pnfs_write_begin_pos = 0; + nfsi->layout.pnfs_write_end_pos = 0; #endif /* CONFIG_NFS_V4_1 */ } diff --git 
a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index 15c8bc8..3765ca1 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -589,7 +589,7 @@ static void __nfs4_close(struct path *path, struct nfs4_state *state, fmode_t fm #ifdef CONFIG_NFS_V4_1 struct nfs_inode *nfsi = NFS_I(state->inode); - if (nfsi->layoutcommit_ctx) + if (nfsi->layout.ld_data && nfsi->layout.layoutcommit_ctx) pnfs_layoutcommit_inode(state->inode, 0); if (has_layout(nfsi) && nfsi->layout.roc_iomode) { struct nfs4_pnfs_layout_segment range; diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c index 8df1610..b72c013 100644 --- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -152,14 +152,14 @@ void pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_context *ctx) { dprintk("%s: has_layout=%d layoutcommit_ctx=%p ctx=%p\n", __func__, - has_layout(nfsi), nfsi->layoutcommit_ctx, ctx); + has_layout(nfsi), nfsi->layout.layoutcommit_ctx, ctx); spin_lock(&nfsi->lo_lock); - if (has_layout(nfsi) && !nfsi->layoutcommit_ctx) { - nfsi->layoutcommit_ctx = get_nfs_open_context(ctx); + if (has_layout(nfsi) && !nfsi->layout.layoutcommit_ctx) { + nfsi->layout.layoutcommit_ctx = get_nfs_open_context(ctx); nfsi->change_attr++; spin_unlock(&nfsi->lo_lock); dprintk("%s: Set layoutcommit_ctx=%p\n", __func__, - nfsi->layoutcommit_ctx); + nfsi->layout.layoutcommit_ctx); return; } spin_unlock(&nfsi->lo_lock); @@ -176,17 +176,17 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent) loff_t end_pos; spin_lock(&nfsi->lo_lock); - if (offset < nfsi->pnfs_write_begin_pos) - nfsi->pnfs_write_begin_pos = offset; + if (offset < nfsi->layout.pnfs_write_begin_pos) + nfsi->layout.pnfs_write_begin_pos = offset; end_pos = offset + extent - 1; /* I'm being inclusive */ - if (end_pos > nfsi->pnfs_write_end_pos) - nfsi->pnfs_write_end_pos = end_pos; + if (end_pos > nfsi->layout.pnfs_write_end_pos) + nfsi->layout.pnfs_write_end_pos = end_pos; dprintk("%s: Wrote %lu@%lu bpos %lu, epos: %lu\n", __func__, (unsigned long) 
extent, (unsigned long) offset , - (unsigned long) nfsi->pnfs_write_begin_pos, - (unsigned long) nfsi->pnfs_write_end_pos); + (unsigned long) nfsi->layout.pnfs_write_begin_pos, + (unsigned long) nfsi->layout.pnfs_write_end_pos); spin_unlock(&nfsi->lo_lock); } @@ -726,7 +726,7 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range, arg.length = ~0; } if (type == RETURN_FILE) { - if (nfsi->layoutcommit_ctx) { + if (nfsi->layout.layoutcommit_ctx) { status = pnfs_layoutcommit_inode(ino, 1); if (status) { dprintk("%s: layoutcommit failed, status=%d. " @@ -903,7 +903,8 @@ get_lock_alloc_layout(struct inode *ino) * wait until bit is cleared if we lost this race. */ res = wait_on_bit_lock( - &nfsi->pnfs_layout_state, NFS_INO_LAYOUT_ALLOC, + &nfsi->layout.pnfs_layout_state, + NFS_INO_LAYOUT_ALLOC, pnfs_wait_schedule, TASK_KILLABLE); if (res) { lo = ERR_PTR(res); @@ -931,8 +932,10 @@ get_lock_alloc_layout(struct inode *ino) lo = ERR_PTR(-ENOMEM); /* release the NFS_INO_LAYOUT_ALLOC bit and wake up waiters */ - clear_bit_unlock(NFS_INO_LAYOUT_ALLOC, &nfsi->pnfs_layout_state); - wake_up_bit(&nfsi->pnfs_layout_state, NFS_INO_LAYOUT_ALLOC); + clear_bit_unlock(NFS_INO_LAYOUT_ALLOC, + &nfsi->layout.pnfs_layout_state); + wake_up_bit(&nfsi->layout.pnfs_layout_state, + NFS_INO_LAYOUT_ALLOC); break; } @@ -1116,13 +1119,13 @@ pnfs_update_layout(struct inode *ino, } /* if get layout already failed once goto out */ - if (test_bit(lo_fail_bit(iomode), &nfsi->pnfs_layout_state)) { - if (unlikely(nfsi->pnfs_layout_suspend && - get_seconds() >= nfsi->pnfs_layout_suspend)) { + if (test_bit(lo_fail_bit(iomode), &nfsi->layout.pnfs_layout_state)) { + if (unlikely(nfsi->layout.pnfs_layout_suspend && + get_seconds() >= nfsi->layout.pnfs_layout_suspend)) { dprintk("%s: layout_get resumed\n", __func__); clear_bit(lo_fail_bit(iomode), - &nfsi->pnfs_layout_state); - nfsi->pnfs_layout_suspend = 0; + &nfsi->layout.pnfs_layout_state); + nfsi->layout.pnfs_layout_suspend = 0; } 
else { result = 1; goto out_put; @@ -1138,7 +1141,8 @@ pnfs_update_layout(struct inode *ino, result = get_layout(ino, ctx, &arg, lsegpp, lo); out: dprintk("%s end (err:%d) state 0x%lx lseg %p\n", - __func__, result, nfsi->pnfs_layout_state, lseg); + __func__, result, nfsi->layout.pnfs_layout_state, + lseg); return result; out_put: if (lsegpp) @@ -1243,13 +1247,14 @@ pnfs_get_layout_done(struct nfs4_pnfs_layoutget *lgp, int rpc_status) get_out: /* remember that get layout failed and suspend trying */ - nfsi->pnfs_layout_suspend = suspend; - set_bit(lo_fail_bit(lgp->args.lseg.iomode), &nfsi->pnfs_layout_state); + nfsi->layout.pnfs_layout_suspend = suspend; + set_bit(lo_fail_bit(lgp->args.lseg.iomode), + &nfsi->layout.pnfs_layout_state); dprintk("%s: layout_get suspended until %ld\n", __func__, suspend); out: dprintk("%s end (err:%d) state 0x%lx lseg %p\n", - __func__, lgp->status, nfsi->pnfs_layout_state, lseg); + __func__, lgp->status, nfsi->layout.pnfs_layout_state, lseg); return; } @@ -2166,9 +2171,10 @@ pnfs_layoutcommit_setup(struct pnfs_layoutcommit_data *data, int sync) /* Set values from inode so it can be reset */ data->args.lseg.iomode = IOMODE_RW; - data->args.lseg.offset = nfsi->pnfs_write_begin_pos; - data->args.lseg.length = nfsi->pnfs_write_end_pos - nfsi->pnfs_write_begin_pos + 1; - data->args.lastbytewritten = nfsi->pnfs_write_end_pos; + data->args.lseg.offset = nfsi->layout.pnfs_write_begin_pos; + data->args.lseg.length = nfsi->layout.pnfs_write_end_pos - + nfsi->layout.pnfs_write_begin_pos + 1; + data->args.lastbytewritten = nfsi->layout.pnfs_write_end_pos; data->args.bitmask = nfss->attr_bitmask; data->res.server = nfss; @@ -2207,12 +2213,12 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync) return -ENOMEM; spin_lock(&nfsi->lo_lock); - if (!nfsi->layoutcommit_ctx) + if (!nfsi->layout.layoutcommit_ctx) goto out_unlock; data->args.inode = inode; - data->cred = nfsi->layoutcommit_ctx->cred; - data->ctx = nfsi->layoutcommit_ctx; + data->cred = 
nfsi->layout.layoutcommit_ctx->cred; + data->ctx = nfsi->layout.layoutcommit_ctx; /* Set up layout commit args*/ status = pnfs_layoutcommit_setup(data, sync); @@ -2222,9 +2228,9 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync) /* Clear layoutcommit properties in the inode so * new lc info can be generated */ - nfsi->pnfs_write_begin_pos = 0; - nfsi->pnfs_write_end_pos = 0; - nfsi->layoutcommit_ctx = NULL; + nfsi->layout.pnfs_write_begin_pos = 0; + nfsi->layout.pnfs_write_end_pos = 0; + nfsi->layout.layoutcommit_ctx = NULL; /* release lock on pnfs layoutcommit attrs */ spin_unlock(&nfsi->lo_lock); diff --git a/fs/nfs/write.c b/fs/nfs/write.c index d058781..57bfc85 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -1494,7 +1494,8 @@ static int nfs_commit_inode(struct inode *inode, int how) nfs_wait_bit_killable, TASK_KILLABLE); #ifdef CONFIG_NFS_V4_1 - if (may_wait && NFS_I(inode)->layoutcommit_ctx) { + if (may_wait && NFS_I(inode)->layout.ld_data && + NFS_I(inode)->layout.layoutcommit_ctx) { error = pnfs_layoutcommit_inode(inode, 1); if (error < 0) return error; diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index 68b3b5c..5048b97 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -106,6 +106,18 @@ struct pnfs_layout_type { seqlock_t seqlock; /* Protects the stateid */ nfs4_stateid stateid; void *ld_data; /* layout driver private data */ + unsigned long pnfs_layout_state; + #define NFS_INO_RO_LAYOUT_FAILED 0 /* get ro layout failed stop trying */ + #define NFS_INO_RW_LAYOUT_FAILED 1 /* get rw layout failed stop trying */ + #define NFS_INO_LAYOUT_ALLOC 2 /* bit lock for layout allocation */ + time_t pnfs_layout_suspend; + /* use rpc_creds in this open_context to send LAYOUTCOMMIT to MDS */ + struct nfs_open_context *layoutcommit_ctx; + /* DH: These vars keep track of the maximum write range + * so the values can be used for layoutcommit. 
+ */ + loff_t pnfs_write_begin_pos; + loff_t pnfs_write_end_pos; }; /* @@ -197,22 +209,9 @@ struct nfs_inode { #if defined(CONFIG_NFS_V4_1) /* Inodes having layouts */ struct list_head lo_inodes; - - unsigned long pnfs_layout_state; -#define NFS_INO_RO_LAYOUT_FAILED 0 /* get ro layout failed stop trying */ -#define NFS_INO_RW_LAYOUT_FAILED 1 /* get rw layout failed stop trying */ -#define NFS_INO_LAYOUT_ALLOC 2 /* bit lock for layout allocation */ - time_t pnfs_layout_suspend; wait_queue_head_t lo_waitq; spinlock_t lo_lock; struct pnfs_layout_type layout; - /* use rpc_creds in this open_context to send LAYOUTCOMMIT to MDS */ - struct nfs_open_context *layoutcommit_ctx; - /* DH: These vars keep track of the maximum write range - * so the values can be used for layoutcommit. - */ - loff_t pnfs_write_begin_pos; - loff_t pnfs_write_end_pos; #endif /* CONFIG_NFS_V4_1 */ #endif /* CONFIG_NFS_V4*/ #ifdef CONFIG_NFS_FSCACHE -- 1.6.2.5 ^ permalink raw reply related [flat|nested] 20+ messages in thread
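For readers following the pnfs_write_begin_pos / pnfs_write_end_pos fields that this patch moves into struct pnfs_layout_type, here is a minimal userspace sketch of the inclusive write-range bookkeeping that pnfs_update_last_write() and pnfs_layoutcommit_setup() perform. It is not the kernel code: struct write_range, range_update(), range_length() and range_reset() are hypothetical names, and the dirty flag is an assumption added for the sketch (the patch itself simply resets both positions to 0).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the write-tracking fields in pnfs_layout_type. */
struct write_range {
	int64_t begin_pos; /* lowest byte offset written so far */
	int64_t end_pos;   /* highest byte offset written so far (inclusive) */
	int dirty;         /* sketch-only flag: nonzero once a write is recorded */
};

/* Widen the tracked range to cover a write of 'extent' bytes at 'offset'. */
static void range_update(struct write_range *r, int64_t offset, uint64_t extent)
{
	int64_t end;

	if (extent == 0)
		return; /* an empty write covers no bytes */

	end = offset + (int64_t)extent - 1; /* inclusive end, as in the patch */
	if (!r->dirty || offset < r->begin_pos)
		r->begin_pos = offset;
	if (!r->dirty || end > r->end_pos)
		r->end_pos = end;
	r->dirty = 1;
}

/* Length to report for the range: end - begin + 1 because both are inclusive. */
static int64_t range_length(const struct write_range *r)
{
	return r->dirty ? r->end_pos - r->begin_pos + 1 : 0;
}

/* Clear the range once its values have been copied out for a commit. */
static void range_reset(struct write_range *r)
{
	r->begin_pos = 0;
	r->end_pos = 0;
	r->dirty = 0;
}
```

The "+ 1" in range_length() mirrors the lseg.length computation in pnfs_layoutcommit_setup(), where end_pos is the last byte written rather than one past it.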
* [PATCH 2/8] pnfs-submit: clean locking infrastructure 2010-05-17 17:56 ` [PATCH 1/8] pnfs-submit: clean struct nfs_inode Alexandros Batsakis @ 2010-05-17 17:56 ` Alexandros Batsakis 2010-05-26 8:28 ` Benny Halevy 2010-05-28 17:27 ` Fred Isaman 0 siblings, 2 replies; 20+ messages in thread From: Alexandros Batsakis @ 2010-05-17 17:56 UTC (permalink / raw) To: bhalevy; +Cc: linux-nfs, Alexandros Batsakis (also minor cleanup of pnfs_free_layout()) Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Conflicts: fs/nfs/pnfs.c --- fs/nfs/pnfs.c | 80 +++++++++++++++++++++++++++++++++++++------------------- 1 files changed, 53 insertions(+), 27 deletions(-) diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c index b72c013..74cb998 100644 --- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -1,4 +1,4 @@ -/* + /* * linux/fs/nfs/pnfs.c * * pNFS functions to call and manage layout drivers. @@ -60,6 +60,8 @@ static int pnfs_initialized; static void pnfs_free_layout(struct pnfs_layout_type *lo, struct nfs4_pnfs_layout_segment *range); static enum pnfs_try_status pnfs_commit(struct nfs_write_data *data, int sync); +static inline void lock_current_layout(struct nfs_inode *nfsi); +static inline void unlock_current_layout(struct nfs_inode *nfsi); /* Locking: * @@ -153,16 +155,17 @@ pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_context *ctx) { dprintk("%s: has_layout=%d layoutcommit_ctx=%p ctx=%p\n", __func__, has_layout(nfsi), nfsi->layout.layoutcommit_ctx, ctx); - spin_lock(&nfsi->lo_lock); + + lock_current_layout(nfsi); if (has_layout(nfsi) && !nfsi->layout.layoutcommit_ctx) { nfsi->layout.layoutcommit_ctx = get_nfs_open_context(ctx); nfsi->change_attr++; - spin_unlock(&nfsi->lo_lock); + unlock_current_layout(nfsi); dprintk("%s: Set layoutcommit_ctx=%p\n", __func__, nfsi->layout.layoutcommit_ctx); return; } - spin_unlock(&nfsi->lo_lock); + unlock_current_layout(nfsi); } /* Update last_write_offset for layoutcommit. 
@@ -175,7 +178,7 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent) { loff_t end_pos; - spin_lock(&nfsi->lo_lock); + lock_current_layout(nfsi); if (offset < nfsi->layout.pnfs_write_begin_pos) nfsi->layout.pnfs_write_begin_pos = offset; end_pos = offset + extent - 1; /* I'm being inclusive */ @@ -187,7 +190,7 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent) (unsigned long) offset , (unsigned long) nfsi->layout.pnfs_write_begin_pos, (unsigned long) nfsi->layout.pnfs_write_end_pos); - spin_unlock(&nfsi->lo_lock); + unlock_current_layout(nfsi); } /* Unitialize a mountpoint in a layout driver */ @@ -296,12 +299,27 @@ pnfs_unregister_layoutdriver(struct pnfs_layoutdriver_type *ld_type) * pNFS client layout cache */ #if defined(CONFIG_SMP) +#define BUG_ON_LOCKED_LO(lo) \ + BUG_ON(spin_is_locked(&PNFS_NFS_INODE(lo)->lo_lock)) #define BUG_ON_UNLOCKED_LO(lo) \ BUG_ON(!spin_is_locked(&PNFS_NFS_INODE(lo)->lo_lock)) #else /* CONFIG_SMP */ +#define BUG_ON_LOCKED_LO(lo) do {} while (0) #define BUG_ON_UNLOCKED_LO(lo) do {} while (0) #endif /* CONFIG_SMP */ +static inline void lock_current_layout(struct nfs_inode *nfsi) +{ + BUG_ON_LOCKED_LO((&nfsi->layout)); + spin_lock(&nfsi->lo_lock); +} + +static inline void unlock_current_layout(struct nfs_inode *nfsi) +{ + BUG_ON_UNLOCKED_LO((&nfsi->layout)); + spin_unlock(&nfsi->lo_lock); +} + /* * get and lock nfsi->layout */ @@ -310,10 +328,10 @@ get_lock_current_layout(struct nfs_inode *nfsi) { struct pnfs_layout_type *lo; + lock_current_layout(nfsi); lo = &nfsi->layout; - spin_lock(&nfsi->lo_lock); if (!lo->ld_data) { - spin_unlock(&nfsi->lo_lock); + unlock_current_layout(nfsi); return NULL; } @@ -333,7 +351,12 @@ put_unlock_current_layout(struct pnfs_layout_type *lo) BUG_ON_UNLOCKED_LO(lo); BUG_ON(lo->refcount <= 0); - if (--lo->refcount == 0 && list_empty(&lo->segs)) { + lo->refcount--; + + if (lo->refcount > 0) + goto out; + + if (list_empty(&lo->segs)) { struct 
layoutdriver_io_operations *io_ops = PNFS_LD_IO_OPS(lo); @@ -347,7 +370,8 @@ put_unlock_current_layout(struct pnfs_layout_type *lo) list_del_init(&nfsi->lo_inodes); spin_unlock(&clp->cl_lock); } - spin_unlock(&nfsi->lo_lock); +out: + unlock_current_layout(nfsi); } void @@ -356,7 +380,7 @@ pnfs_layout_release(struct pnfs_layout_type *lo, atomic_t *count, { struct nfs_inode *nfsi = PNFS_NFS_INODE(lo); - spin_lock(&nfsi->lo_lock); + lock_current_layout(nfsi); if (range) pnfs_free_layout(lo, range); atomic_dec(count); @@ -375,6 +399,8 @@ pnfs_destroy_layout(struct nfs_inode *nfsi) }; lo = get_lock_current_layout(nfsi); + if (!lo) + return; pnfs_free_layout(lo, &range); put_unlock_current_layout(lo); } @@ -652,7 +678,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi, struct pnfs_layout_segment *lseg; bool ret = false; - spin_lock(&nfsi->lo_lock); + lock_current_layout(nfsi); list_for_each_entry (lseg, &nfsi->layout.segs, fi_list) { if (!should_free_lseg(lseg, range)) continue; @@ -666,7 +692,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi, } if (atomic_read(&nfsi->layout.lgetcount)) ret = true; - spin_unlock(&nfsi->lo_lock); + unlock_current_layout(nfsi); dprintk("%s:Return %d\n", __func__, ret); return ret; @@ -756,7 +782,7 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range, /* unlock w/o put rebalanced by eventual call to * pnfs_layout_release */ - spin_unlock(&nfsi->lo_lock); + unlock_current_layout(nfsi); if (pnfs_return_layout_barrier(nfsi, &arg)) { dprintk("%s: waiting\n", __func__); @@ -887,7 +913,7 @@ static int pnfs_wait_schedule(void *word) * * Note: If successful, nfsi->lo_lock is taken and the caller * must put and unlock current_layout by using put_unlock_current_layout() - * when the returned layout is released. + * directly or pnfs_layout_release() when the returned layout is released. 
*/ static struct pnfs_layout_type * get_lock_alloc_layout(struct inode *ino) @@ -922,7 +948,7 @@ get_lock_alloc_layout(struct inode *ino) struct nfs_client *clp = NFS_SERVER(ino)->nfs_client; /* must grab the layout lock before the client lock */ - spin_lock(&nfsi->lo_lock); + lock_current_layout(nfsi); spin_lock(&clp->cl_lock); if (list_empty(&nfsi->lo_inodes)) @@ -1038,10 +1064,10 @@ void drain_layoutreturns(struct pnfs_layout_type *lo) while (atomic_read(&lo->lretcount)) { struct nfs_inode *nfsi = PNFS_NFS_INODE(lo); - spin_unlock(&nfsi->lo_lock); + unlock_current_layout(nfsi); dprintk("%s: waiting\n", __func__); wait_event(nfsi->lo_waitq, (atomic_read(&lo->lretcount) == 0)); - spin_lock(&nfsi->lo_lock); + lock_current_layout(nfsi); } } @@ -1080,13 +1106,13 @@ pnfs_update_layout(struct inode *ino, /* Check to see if the layout for the given range already exists */ lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref); if (lseg && !lseg->valid) { - spin_unlock(&nfsi->lo_lock); + unlock_current_layout(nfsi); if (take_ref) put_lseg(lseg); for (;;) { prepare_to_wait(&nfsi->lo_waitq, &__wait, TASK_KILLABLE); - spin_lock(&nfsi->lo_lock); + lock_current_layout(nfsi); lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref); if (!lseg || lseg->valid) break; @@ -1099,7 +1125,7 @@ pnfs_update_layout(struct inode *ino, result = -ERESTARTSYS; break; } - spin_unlock(&nfsi->lo_lock); + unlock_current_layout(nfsi); schedule(); } finish_wait(&nfsi->lo_waitq, &__wait); @@ -1136,7 +1162,7 @@ pnfs_update_layout(struct inode *ino, /* Matching dec is done in .rpc_release (on non-error paths) */ atomic_inc(&lo->lgetcount); /* Lose lock, but not reference, match this with pnfs_layout_release */ - spin_unlock(&nfsi->lo_lock); + unlock_current_layout(nfsi); result = get_layout(ino, ctx, &arg, lsegpp, lo); out: @@ -1286,7 +1312,7 @@ pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp) *lgp->lsegpp = lseg; } - spin_lock(&nfsi->lo_lock); + lock_current_layout(nfsi); pnfs_insert_layout(lo, 
lseg); if (res->return_on_close) { @@ -1297,7 +1323,7 @@ pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp) /* Done processing layoutget. Set the layout stateid */ pnfs_set_layout_stateid(lo, &res->stateid); - spin_unlock(&nfsi->lo_lock); + unlock_current_layout(nfsi); out: return status; } @@ -2212,7 +2238,7 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync) if (!data) return -ENOMEM; - spin_lock(&nfsi->lo_lock); + lock_current_layout(nfsi); if (!nfsi->layout.layoutcommit_ctx) goto out_unlock; @@ -2233,7 +2259,7 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync) nfsi->layout.layoutcommit_ctx = NULL; /* release lock on pnfs layoutcommit attrs */ - spin_unlock(&nfsi->lo_lock); + unlock_current_layout(nfsi); data->is_sync = sync; status = pnfs4_proc_layoutcommit(data); @@ -2242,7 +2268,7 @@ out: return status; out_unlock: pnfs_layoutcommit_free(data); - spin_unlock(&nfsi->lo_lock); + unlock_current_layout(nfsi); goto out; } -- 1.6.2.5 ^ permalink raw reply related [flat|nested] 20+ messages in thread
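A note on the BUG_ON_LOCKED_LO() check in lock_current_layout(): spin_is_locked() reports whether the lock is held by anyone, not whether the caller holds it, so asserting "not locked" just before spin_lock() can fire under ordinary contention. The toy model below sketches that distinction in userspace C. All names (toy_lock, toy_trylock, locked_check_would_fire, the integer CPU ids) are hypothetical, and the lock is deliberately not thread-safe; it only simulates ownership.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Toy lock that records which simulated CPU holds it (not thread-safe). */
struct toy_lock {
	int owner;
};

/* True when any CPU holds the lock; analogous to spin_is_locked(). */
static bool toy_is_locked(const struct toy_lock *l)
{
	return l->owner != NO_OWNER;
}

/* True only when the given CPU is the holder. */
static bool toy_held_by(const struct toy_lock *l, int cpu)
{
	return l->owner == cpu;
}

/* Take the lock for 'cpu' if it is free; report whether that succeeded. */
static bool toy_trylock(struct toy_lock *l, int cpu)
{
	if (toy_is_locked(l))
		return false;
	l->owner = cpu;
	return true;
}

/* Release the lock; only the holder may do so. */
static void toy_unlock(struct toy_lock *l, int cpu)
{
	assert(toy_held_by(l, cpu));
	l->owner = NO_OWNER;
}

/*
 * Models a pre-acquire check in the style of BUG_ON_LOCKED_LO():
 * returns true when the check would fire for a CPU about to lock.
 */
static bool locked_check_would_fire(const struct toy_lock *l)
{
	return toy_is_locked(l);
}
```

If CPU 1 holds the lock while CPU 0 is about to acquire it, locked_check_would_fire() is true even though CPU 0 has done nothing wrong; a real spinlock would simply spin there. The unlock-side check is a different case, since the releasing context must be the holder. In the kernel, lockdep_assert_held() is the usual way to assert that the current context holds a lock.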
* Re: [PATCH 2/8] pnfs-submit: clean locking infrastructure 2010-05-17 17:56 ` [PATCH 2/8] pnfs-submit: clean locking infrastructure Alexandros Batsakis @ 2010-05-26 8:28 ` Benny Halevy 2010-05-28 17:27 ` Fred Isaman 1 sibling, 0 replies; 20+ messages in thread From: Benny Halevy @ 2010-05-26 8:28 UTC (permalink / raw) To: Alexandros Batsakis; +Cc: linux-nfs On May. 17, 2010, 20:56 +0300, Alexandros Batsakis <batsakis@netapp.com> wrote: > (also minor cleanup of pnfs_free_layout()) > > Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> > > Conflicts: > > fs/nfs/pnfs.c > --- > fs/nfs/pnfs.c | 80 +++++++++++++++++++++++++++++++++++++------------------- > 1 files changed, 53 insertions(+), 27 deletions(-) > > diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c > index b72c013..74cb998 100644 > --- a/fs/nfs/pnfs.c > +++ b/fs/nfs/pnfs.c > @@ -1,4 +1,4 @@ > -/* > + /* just picking nit... Benny > * linux/fs/nfs/pnfs.c > * > * pNFS functions to call and manage layout drivers. > @@ -60,6 +60,8 @@ static int pnfs_initialized; > static void pnfs_free_layout(struct pnfs_layout_type *lo, > struct nfs4_pnfs_layout_segment *range); > static enum pnfs_try_status pnfs_commit(struct nfs_write_data *data, int sync); > +static inline void lock_current_layout(struct nfs_inode *nfsi); > +static inline void unlock_current_layout(struct nfs_inode *nfsi); > > /* Locking: > * > @@ -153,16 +155,17 @@ pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_context *ctx) > { > dprintk("%s: has_layout=%d layoutcommit_ctx=%p ctx=%p\n", __func__, > has_layout(nfsi), nfsi->layout.layoutcommit_ctx, ctx); > - spin_lock(&nfsi->lo_lock); > + > + lock_current_layout(nfsi); > if (has_layout(nfsi) && !nfsi->layout.layoutcommit_ctx) { > nfsi->layout.layoutcommit_ctx = get_nfs_open_context(ctx); > nfsi->change_attr++; > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > dprintk("%s: Set layoutcommit_ctx=%p\n", __func__, > nfsi->layout.layoutcommit_ctx); > return; > } > - 
spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > } > > /* Update last_write_offset for layoutcommit. > @@ -175,7 +178,7 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent) > { > loff_t end_pos; > > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > if (offset < nfsi->layout.pnfs_write_begin_pos) > nfsi->layout.pnfs_write_begin_pos = offset; > end_pos = offset + extent - 1; /* I'm being inclusive */ > @@ -187,7 +190,7 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent) > (unsigned long) offset , > (unsigned long) nfsi->layout.pnfs_write_begin_pos, > (unsigned long) nfsi->layout.pnfs_write_end_pos); > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > } > > /* Unitialize a mountpoint in a layout driver */ > @@ -296,12 +299,27 @@ pnfs_unregister_layoutdriver(struct pnfs_layoutdriver_type *ld_type) > * pNFS client layout cache > */ > #if defined(CONFIG_SMP) > +#define BUG_ON_LOCKED_LO(lo) \ > + BUG_ON(spin_is_locked(&PNFS_NFS_INODE(lo)->lo_lock)) > #define BUG_ON_UNLOCKED_LO(lo) \ > BUG_ON(!spin_is_locked(&PNFS_NFS_INODE(lo)->lo_lock)) > #else /* CONFIG_SMP */ > +#define BUG_ON_LOCKED_LO(lo) do {} while (0) > #define BUG_ON_UNLOCKED_LO(lo) do {} while (0) > #endif /* CONFIG_SMP */ > > +static inline void lock_current_layout(struct nfs_inode *nfsi) > +{ > + BUG_ON_LOCKED_LO((&nfsi->layout)); > + spin_lock(&nfsi->lo_lock); > +} > + > +static inline void unlock_current_layout(struct nfs_inode *nfsi) > +{ > + BUG_ON_UNLOCKED_LO((&nfsi->layout)); > + spin_unlock(&nfsi->lo_lock); > +} > + > /* > * get and lock nfsi->layout > */ > @@ -310,10 +328,10 @@ get_lock_current_layout(struct nfs_inode *nfsi) > { > struct pnfs_layout_type *lo; > > + lock_current_layout(nfsi); > lo = &nfsi->layout; > - spin_lock(&nfsi->lo_lock); > if (!lo->ld_data) { > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > return NULL; > } > > @@ -333,7 +351,12 @@ put_unlock_current_layout(struct 
pnfs_layout_type *lo) > BUG_ON_UNLOCKED_LO(lo); > BUG_ON(lo->refcount <= 0); > > - if (--lo->refcount == 0 && list_empty(&lo->segs)) { > + lo->refcount--; > + > + if (lo->refcount > 0) > + goto out; > + > + if (list_empty(&lo->segs)) { > struct layoutdriver_io_operations *io_ops = > PNFS_LD_IO_OPS(lo); > > @@ -347,7 +370,8 @@ put_unlock_current_layout(struct pnfs_layout_type *lo) > list_del_init(&nfsi->lo_inodes); > spin_unlock(&clp->cl_lock); > } > - spin_unlock(&nfsi->lo_lock); > +out: > + unlock_current_layout(nfsi); > } > > void > @@ -356,7 +380,7 @@ pnfs_layout_release(struct pnfs_layout_type *lo, atomic_t *count, > { > struct nfs_inode *nfsi = PNFS_NFS_INODE(lo); > > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > if (range) > pnfs_free_layout(lo, range); > atomic_dec(count); > @@ -375,6 +399,8 @@ pnfs_destroy_layout(struct nfs_inode *nfsi) > }; > > lo = get_lock_current_layout(nfsi); > + if (!lo) > + return; > pnfs_free_layout(lo, &range); > put_unlock_current_layout(lo); > } > @@ -652,7 +678,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi, > struct pnfs_layout_segment *lseg; > bool ret = false; > > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > list_for_each_entry (lseg, &nfsi->layout.segs, fi_list) { > if (!should_free_lseg(lseg, range)) > continue; > @@ -666,7 +692,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi, > } > if (atomic_read(&nfsi->layout.lgetcount)) > ret = true; > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > > dprintk("%s:Return %d\n", __func__, ret); > return ret; > @@ -756,7 +782,7 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range, > /* unlock w/o put rebalanced by eventual call to > * pnfs_layout_release > */ > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > > if (pnfs_return_layout_barrier(nfsi, &arg)) { > dprintk("%s: waiting\n", __func__); > @@ -887,7 +913,7 @@ static int pnfs_wait_schedule(void *word) > * > * Note: If 
successful, nfsi->lo_lock is taken and the caller > * must put and unlock current_layout by using put_unlock_current_layout() > - * when the returned layout is released. > + * directly or pnfs_layout_release() when the returned layout is released. > */ > static struct pnfs_layout_type * > get_lock_alloc_layout(struct inode *ino) > @@ -922,7 +948,7 @@ get_lock_alloc_layout(struct inode *ino) > struct nfs_client *clp = NFS_SERVER(ino)->nfs_client; > > /* must grab the layout lock before the client lock */ > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > > spin_lock(&clp->cl_lock); > if (list_empty(&nfsi->lo_inodes)) > @@ -1038,10 +1064,10 @@ void drain_layoutreturns(struct pnfs_layout_type *lo) > while (atomic_read(&lo->lretcount)) { > struct nfs_inode *nfsi = PNFS_NFS_INODE(lo); > > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > dprintk("%s: waiting\n", __func__); > wait_event(nfsi->lo_waitq, (atomic_read(&lo->lretcount) == 0)); > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > } > } > > @@ -1080,13 +1106,13 @@ pnfs_update_layout(struct inode *ino, > /* Check to see if the layout for the given range already exists */ > lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref); > if (lseg && !lseg->valid) { > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > if (take_ref) > put_lseg(lseg); > for (;;) { > prepare_to_wait(&nfsi->lo_waitq, &__wait, > TASK_KILLABLE); > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref); > if (!lseg || lseg->valid) > break; > @@ -1099,7 +1125,7 @@ pnfs_update_layout(struct inode *ino, > result = -ERESTARTSYS; > break; > } > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > schedule(); > } > finish_wait(&nfsi->lo_waitq, &__wait); > @@ -1136,7 +1162,7 @@ pnfs_update_layout(struct inode *ino, > /* Matching dec is done in .rpc_release (on non-error paths) */ > atomic_inc(&lo->lgetcount); > /* Lose lock, 
but not reference, match this with pnfs_layout_release */ > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > > result = get_layout(ino, ctx, &arg, lsegpp, lo); > out: > @@ -1286,7 +1312,7 @@ pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp) > *lgp->lsegpp = lseg; > } > > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > pnfs_insert_layout(lo, lseg); > > if (res->return_on_close) { > @@ -1297,7 +1323,7 @@ pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp) > > /* Done processing layoutget. Set the layout stateid */ > pnfs_set_layout_stateid(lo, &res->stateid); > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > out: > return status; > } > @@ -2212,7 +2238,7 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync) > if (!data) > return -ENOMEM; > > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > if (!nfsi->layout.layoutcommit_ctx) > goto out_unlock; > > @@ -2233,7 +2259,7 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync) > nfsi->layout.layoutcommit_ctx = NULL; > > /* release lock on pnfs layoutcommit attrs */ > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > > data->is_sync = sync; > status = pnfs4_proc_layoutcommit(data); > @@ -2242,7 +2268,7 @@ out: > return status; > out_unlock: > pnfs_layoutcommit_free(data); > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > goto out; > } > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/8] pnfs-submit: clean locking infrastructure 2010-05-17 17:56 ` [PATCH 2/8] pnfs-submit: clean locking infrastructure Alexandros Batsakis 2010-05-26 8:28 ` Benny Halevy @ 2010-05-28 17:27 ` Fred Isaman [not found] ` <AANLkTinsHI0fHYdpUlq-MsMX0BmsLGvdAbrKx7M5ydjw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 1 sibling, 1 reply; 20+ messages in thread From: Fred Isaman @ 2010-05-28 17:27 UTC (permalink / raw) To: Alexandros Batsakis; +Cc: bhalevy, linux-nfs On Mon, May 17, 2010 at 1:56 PM, Alexandros Batsakis <batsakis@netapp.com> wrote: > (also minor cleanup of pnfs_free_layout()) > > Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> > > Conflicts: > > fs/nfs/pnfs.c > --- > fs/nfs/pnfs.c | 80 +++++++++++++++++++++++++++++++++++++------------------- > 1 files changed, 53 insertions(+), 27 deletions(-) > > diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c > index b72c013..74cb998 100644 > --- a/fs/nfs/pnfs.c > +++ b/fs/nfs/pnfs.c > @@ -1,4 +1,4 @@ > -/* > + /* > * linux/fs/nfs/pnfs.c > * > * pNFS functions to call and manage layout drivers.
> @@ -60,6 +60,8 @@ static int pnfs_initialized; > static void pnfs_free_layout(struct pnfs_layout_type *lo, > struct nfs4_pnfs_layout_segment *range); > static enum pnfs_try_status pnfs_commit(struct nfs_write_data *data, int sync); > +static inline void lock_current_layout(struct nfs_inode *nfsi); > +static inline void unlock_current_layout(struct nfs_inode *nfsi); > > /* Locking: > * > @@ -153,16 +155,17 @@ pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_context *ctx) > { > dprintk("%s: has_layout=%d layoutcommit_ctx=%p ctx=%p\n", __func__, > has_layout(nfsi), nfsi->layout.layoutcommit_ctx, ctx); > - spin_lock(&nfsi->lo_lock); > + > + lock_current_layout(nfsi); > if (has_layout(nfsi) && !nfsi->layout.layoutcommit_ctx) { > nfsi->layout.layoutcommit_ctx = get_nfs_open_context(ctx); > nfsi->change_attr++; > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > dprintk("%s: Set layoutcommit_ctx=%p\n", __func__, > nfsi->layout.layoutcommit_ctx); > return; > } > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > } > > /* Update last_write_offset for layoutcommit.
> @@ -175,7 +178,7 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent) > { > loff_t end_pos; > > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > if (offset < nfsi->layout.pnfs_write_begin_pos) > nfsi->layout.pnfs_write_begin_pos = offset; > end_pos = offset + extent - 1; /* I'm being inclusive */ > @@ -187,7 +190,7 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent) > (unsigned long) offset , > (unsigned long) nfsi->layout.pnfs_write_begin_pos, > (unsigned long) nfsi->layout.pnfs_write_end_pos); > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > } > > /* Unitialize a mountpoint in a layout driver */ > @@ -296,12 +299,27 @@ pnfs_unregister_layoutdriver(struct pnfs_layoutdriver_type *ld_type) > * pNFS client layout cache > */ > #if defined(CONFIG_SMP) > +#define BUG_ON_LOCKED_LO(lo) \ > + BUG_ON(spin_is_locked(&PNFS_NFS_INODE(lo)->lo_lock)) > #define BUG_ON_UNLOCKED_LO(lo) \ > BUG_ON(!spin_is_locked(&PNFS_NFS_INODE(lo)->lo_lock)) > #else /* CONFIG_SMP */ > +#define BUG_ON_LOCKED_LO(lo) do {} while (0) > #define BUG_ON_UNLOCKED_LO(lo) do {} while (0) > #endif /* CONFIG_SMP */ > > +static inline void lock_current_layout(struct nfs_inode *nfsi) > +{ > + BUG_ON_LOCKED_LO((&nfsi->layout)); I just ran into this in testing. This check causes problems. If you know it is already unlocked, you wouldn't have to "spin".
Fred > + spin_lock(&nfsi->lo_lock); > +} > + > +static inline void unlock_current_layout(struct nfs_inode *nfsi) > +{ > + BUG_ON_UNLOCKED_LO((&nfsi->layout)); > + spin_unlock(&nfsi->lo_lock); > +} > + > /* > * get and lock nfsi->layout > */ > @@ -310,10 +328,10 @@ get_lock_current_layout(struct nfs_inode *nfsi) > { > struct pnfs_layout_type *lo; > > + lock_current_layout(nfsi); > lo = &nfsi->layout; > - spin_lock(&nfsi->lo_lock); > if (!lo->ld_data) { > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > return NULL; > } > > @@ -333,7 +351,12 @@ put_unlock_current_layout(struct pnfs_layout_type *lo) > BUG_ON_UNLOCKED_LO(lo); > BUG_ON(lo->refcount <= 0); > > - if (--lo->refcount == 0 && list_empty(&lo->segs)) { > + lo->refcount--; > + > + if (lo->refcount > 0) > + goto out; > + > + if (list_empty(&lo->segs)) { > struct layoutdriver_io_operations *io_ops = > PNFS_LD_IO_OPS(lo); > > @@ -347,7 +370,8 @@ put_unlock_current_layout(struct pnfs_layout_type *lo) > list_del_init(&nfsi->lo_inodes); > spin_unlock(&clp->cl_lock); > } > - spin_unlock(&nfsi->lo_lock); > +out: > + unlock_current_layout(nfsi); > } > > void > @@ -356,7 +380,7 @@ pnfs_layout_release(struct pnfs_layout_type *lo, atomic_t *count, > { > struct nfs_inode *nfsi = PNFS_NFS_INODE(lo); > > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > if (range) > pnfs_free_layout(lo,
range); > atomic_dec(count); > @@ -375,6 +399,8 @@ pnfs_destroy_layout(struct nfs_inode *nfsi) > }; > > lo = get_lock_current_layout(nfsi); > + if (!lo) > + return; > pnfs_free_layout(lo, &range); > put_unlock_current_layout(lo); > } > @@ -652,7 +678,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi, > struct pnfs_layout_segment *lseg; > bool ret = false; > > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > list_for_each_entry (lseg, &nfsi->layout.segs, fi_list) { > if (!should_free_lseg(lseg, range)) > continue; > @@ -666,7 +692,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi, > } > if (atomic_read(&nfsi->layout.lgetcount)) > ret = true; > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > > dprintk("%s:Return %d\n", __func__, ret); > return ret; > @@ -756,7 +782,7 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range, > /* unlock w/o put rebalanced by eventual call to > * pnfs_layout_release > */ > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > > if (pnfs_return_layout_barrier(nfsi, &arg)) { > dprintk("%s: waiting\n", __func__); > @@ -887,7 +913,7 @@ static int pnfs_wait_schedule(void *word) > * > * Note: If successful, nfsi->lo_lock is taken and the caller > * must put and unlock current_layout by using put_unlock_current_layout() > - * when the
returned layout is released. > + * directly or pnfs_layout_release() when the returned layout is released. > */ > static struct pnfs_layout_type * > get_lock_alloc_layout(struct inode *ino) > @@ -922,7 +948,7 @@ get_lock_alloc_layout(struct inode *ino) > struct nfs_client *clp = NFS_SERVER(ino)->nfs_client; > > /* must grab the layout lock before the client lock */ > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > > spin_lock(&clp->cl_lock); > if (list_empty(&nfsi->lo_inodes)) > @@ -1038,10 +1064,10 @@ void drain_layoutreturns(struct pnfs_layout_type *lo) > while (atomic_read(&lo->lretcount)) { > struct nfs_inode *nfsi = PNFS_NFS_INODE(lo); > > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > dprintk("%s: waiting\n", __func__); > wait_event(nfsi->lo_waitq, (atomic_read(&lo->lretcount) == 0)); > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > } > } > > @@ -1080,13 +1106,13 @@ pnfs_update_layout(struct inode *ino, > /* Check to see if the layout for the given range already exists */ > lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref); > if (lseg && !lseg->valid) { > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > if (take_ref) > put_lseg(lseg); > for
(;;) { > prepare_to_wait(&nfsi->lo_waitq, &__wait, > TASK_KILLABLE); > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref); > if (!lseg || lseg->valid) > break; > @@ -1099,7 +1125,7 @@ pnfs_update_layout(struct inode *ino, > result = -ERESTARTSYS; > break; > } > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > schedule(); > } > finish_wait(&nfsi->lo_waitq, &__wait); > @@ -1136,7 +1162,7 @@ pnfs_update_layout(struct inode *ino, > /* Matching dec is done in .rpc_release (on non-error paths) */ > atomic_inc(&lo->lgetcount); > /* Lose lock, but not reference, match this with pnfs_layout_release */ > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > > result = get_layout(ino, ctx, &arg, lsegpp, lo); > out: > @@ -1286,7 +1312,7 @@ pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp) > *lgp->lsegpp = lseg; > } > > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > pnfs_insert_layout(lo, lseg); > > if (res->return_on_close) { >
@@ -1297,7 +1323,7 @@ pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp) > > /* Done processing layoutget. Set the layout stateid */ > pnfs_set_layout_stateid(lo, &res->stateid); > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > out: > return status; > } > @@ -2212,7 +2238,7 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync) > if (!data) > return -ENOMEM; > > - spin_lock(&nfsi->lo_lock); > + lock_current_layout(nfsi); > if (!nfsi->layout.layoutcommit_ctx) > goto out_unlock; > > @@ -2233,7 +2259,7 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync) > nfsi->layout.layoutcommit_ctx = NULL; > > /* release lock on pnfs layoutcommit attrs */ > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > > data->is_sync = sync; > status = pnfs4_proc_layoutcommit(data); > @@ -2242,7 +2268,7 @@ out: > return status; > out_unlock: > pnfs_layoutcommit_free(data); > - spin_unlock(&nfsi->lo_lock); > + unlock_current_layout(nfsi); > goto out; > } > > -- > 1.6.2.5 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 20+ messages in thread
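To make the objection to BUG_ON_LOCKED_LO() concrete, here is a minimal userspace sketch, not kernel code. struct toy_lock and its helpers are hypothetical stand-ins for lo_lock and only model who holds the lock. It illustrates that finding the lock held just before spin_lock() is ordinary contention from another context, so an "is unlocked" assertion at that point can fire on a correct program. What could safely be asserted instead is that the caller itself is not already the holder.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a spinlock: records whether it is held and by whom.
 * Hypothetical type for illustration only; this is not spinlock_t. */
struct toy_lock {
    bool locked;
    int owner;      /* id of the holder, meaningful only while locked */
};

static void toy_acquire(struct toy_lock *l, int me)
{
    l->locked = true;
    l->owner = me;
}

static void toy_release(struct toy_lock *l)
{
    l->locked = false;
    l->owner = -1;
}

/* Mirrors the idea of BUG_ON_LOCKED_LO(): fires if anyone holds the lock. */
static bool naive_check_fires(const struct toy_lock *l)
{
    return l->locked;
}

/* Narrower check: fires only if the caller already holds the lock,
 * which is the actual self-deadlock case. */
static bool self_deadlock_check_fires(const struct toy_lock *l, int me)
{
    return l->locked && l->owner == me;
}
```

With context 0 holding the lock, the naive check fires for context 1 even though context 1 would simply spin and then proceed. The narrower check stays quiet for context 1 and fires only for context 0.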
* Re: [PATCH 2/8] pnfs-submit: clean locking infrastructure [not found] ` <AANLkTinsHI0fHYdpUlq-MsMX0BmsLGvdAbrKx7M5ydjw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2010-05-28 18:27 ` Alexandros Batsakis 0 siblings, 0 replies; 20+ messages in thread From: Alexandros Batsakis @ 2010-05-28 18:27 UTC (permalink / raw) To: Fred Isaman; +Cc: Alexandros Batsakis, bhalevy, linux-nfs On Fri, May 28, 2010 at 10:27 AM, Fred Isaman <iisaman@citi.umich.edu> wrote: > On Mon, May 17, 2010 at 1:56 PM, Alexandros Batsakis > <batsakis@netapp.com> wrote: >> (also minor cleanup of pnfs_free_layout()) >> >> Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> >> >> Conflicts: >> >> fs/nfs/pnfs.c >> --- >> fs/nfs/pnfs.c | 80 +++++++++++++++++++++++++++++++++++++------------------- >> 1 files changed, 53 insertions(+), 27 deletions(-) >> >> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c >> index b72c013..74cb998 100644 >> --- a/fs/nfs/pnfs.c >> +++ b/fs/nfs/pnfs.c >> @@ -1,4 +1,4 @@ >> -/* >> + /* >> * linux/fs/nfs/pnfs.c >> * >> * pNFS functions to call and manage layout drivers.
>> @@ -60,6 +60,8 @@ static int pnfs_initialized; >> static void pnfs_free_layout(struct pnfs_layout_type *lo, >> struct nfs4_pnfs_layout_segment *range); >> static enum pnfs_try_status pnfs_commit(struct nfs_write_data *data, int sync); >> +static inline void lock_current_layout(struct nfs_inode *nfsi); >> +static inline void unlock_current_layout(struct nfs_inode *nfsi); >> >> /* Locking: >> * >> @@ -153,16 +155,17 @@ pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_context *ctx) >> { >> dprintk("%s: has_layout=%d layoutcommit_ctx=%p ctx=%p\n", __func__, >> has_layout(nfsi), nfsi->layout.layoutcommit_ctx, ctx); >> - spin_lock(&nfsi->lo_lock); >> + >> + lock_current_layout(nfsi); >> if (has_layout(nfsi) && !nfsi->layout.layoutcommit_ctx) { >> nfsi->layout.layoutcommit_ctx = get_nfs_open_context(ctx); >> nfsi->change_attr++; >> - spin_unlock(&nfsi->lo_lock); >> + unlock_current_layout(nfsi); >> dprintk("%s: Set layoutcommit_ctx=%p\n", __func__, >> nfsi->layout.layoutcommit_ctx); >> return; >> } >> - spin_unlock(&nfsi->lo_lock); >> + unlock_current_layout(nfsi); >> } >> >> /* Update last_write_offset for layoutcommit.
>> @@ -175,7 +178,7 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent) >> { >> loff_t end_pos; >> >> - spin_lock(&nfsi->lo_lock); >> + lock_current_layout(nfsi); >> if (offset < nfsi->layout.pnfs_write_begin_pos) >> nfsi->layout.pnfs_write_begin_pos = offset; >> end_pos = offset + extent - 1; /* I'm being inclusive */ >> @@ -187,7 +190,7 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent) >> (unsigned long) offset , >> (unsigned long) nfsi->layout.pnfs_write_begin_pos, >> (unsigned long) nfsi->layout.pnfs_write_end_pos); >> - spin_unlock(&nfsi->lo_lock); >> + unlock_current_layout(nfsi); >> } >> >> /* Unitialize a mountpoint in a layout driver */ >> @@ -296,12 +299,27 @@ pnfs_unregister_layoutdriver(struct pnfs_layoutdriver_type *ld_type) >> * pNFS client layout cache >> */ >> #if defined(CONFIG_SMP) >> +#define BUG_ON_LOCKED_LO(lo) \ >> + BUG_ON(spin_is_locked(&PNFS_NFS_INODE(lo)->lo_lock)) >> #define BUG_ON_UNLOCKED_LO(lo) \ >> BUG_ON(!spin_is_locked(&PNFS_NFS_INODE(lo)->lo_lock)) >> #else /* CONFIG_SMP */ >> +#define BUG_ON_LOCKED_LO(lo) do {} while (0) >> #define BUG_ON_UNLOCKED_LO(lo) do {} while (0) >> #endif /* CONFIG_SMP */ >> >> +static inline void lock_current_layout(struct nfs_inode *nfsi) >> +{ >> + BUG_ON_LOCKED_LO((&nfsi->layout));
>
> I just ran into this in testing. This check causes problems. If you
> know it is already unlocked, you wouldn't have to "spin".
>

Yeah I saw that too. I fixed it in the new version that is coming up.

-alexandros

> Fred
>
>> + spin_lock(&nfsi->lo_lock); >> +} >> + >> +static inline void unlock_current_layout(struct nfs_inode *nfsi) >> +{ >> + BUG_ON_UNLOCKED_LO((&nfsi->layout)); >> + spin_unlock(&nfsi->lo_lock); >> +} >> + >> /* >> * get and lock nfsi->layout >> */ >> @@ -310,10 +328,10 @@ get_lock_current_layout(struct nfs_inode *nfsi) >> { >> struct pnfs_layout_type *lo; >> >> + lock_current_layout(nfsi); >> lo = &nfsi->layout; >> - spin_lock(&nfsi->lo_lock); >> if (!lo->ld_data) { >> - spin_unlock(&nfsi->lo_lock); >> + unlock_current_layout(nfsi); >> return NULL; >> } >> >> @@ -333,7 +351,12 @@ put_unlock_current_layout(struct pnfs_layout_type *lo) >> BUG_ON_UNLOCKED_LO(lo); >> BUG_ON(lo->refcount <= 0); >> >> - if (--lo->refcount == 0 && list_empty(&lo->segs)) { >> + lo->refcount--; >> + >> + if (lo->refcount > 0) >> + goto out; >> + >> + if (list_empty(&lo->segs)) { >> struct layoutdriver_io_operations *io_ops = >> PNFS_LD_IO_OPS(lo); >> >> @@ -347,7 +370,8 @@ put_unlock_current_layout(struct pnfs_layout_type *lo) >> list_del_init(&nfsi->lo_inodes); >> spin_unlock(&clp->cl_lock); >> } >> - spin_unlock(&nfsi->lo_lock); >> +out: >> + unlock_current_layout(nfsi); >> } >> >> void >> @@ -356,7 +380,7 @@ pnfs_layout_release(struct pnfs_layout_type *lo, atomic_t *count, >> { >> struct nfs_inode *nfsi = PNFS_NFS_INODE(lo); >> >> - spin_lock(&nfsi->lo_lock); >> + lock_current_layout(nfsi); >>
if (range) >> pnfs_free_layout(lo, range); >> atomic_dec(count); >> @@ -375,6 +399,8 @@ pnfs_destroy_layout(struct nfs_inode *nfsi) >> }; >> >> lo = get_lock_current_layout(nfsi); >> + if (!lo) >> + return; >> pnfs_free_layout(lo, &range); >> put_unlock_current_layout(lo); >> } >> @@ -652,7 +678,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi, >> struct pnfs_layout_segment *lseg; >> bool ret = false; >> >> - spin_lock(&nfsi->lo_lock); >> + lock_current_layout(nfsi); >> list_for_each_entry (lseg, &nfsi->layout.segs, fi_list) { >> if (!should_free_lseg(lseg, range)) >> continue; >> @@ -666,7 +692,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi, >> } >> if (atomic_read(&nfsi->layout.lgetcount)) >> ret = true; >> - spin_unlock(&nfsi->lo_lock); >> + unlock_current_layout(nfsi); >> >> dprintk("%s:Return %d\n", __func__, ret); >> return ret; >> @@ -756,7 +782,7 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range, >> /* unlock w/o put rebalanced by eventual call to >> * pnfs_layout_release >> */ >> - spin_unlock(&nfsi->lo_lock); >> + unlock_current_layout(nfsi); >> >> if (pnfs_return_layout_barrier(nfsi, &arg)) { >> dprintk("%s: waiting\n", __func__); >> @@ -887,7 +913,7 @@ static int pnfs_wait_schedule(void *word) >> * >> * Note: If successful, nfsi->lo_lock is taken
and the caller >> * must put and unlock current_layout by using put_unlock_current_layout() >> - * when the returned layout is released. >> + * directly or pnfs_layout_release() when the returned layout is released. >> */ >> static struct pnfs_layout_type * >> get_lock_alloc_layout(struct inode *ino) >> @@ -922,7 +948,7 @@ get_lock_alloc_layout(struct inode *ino) >> struct nfs_client *clp = NFS_SERVER(ino)->nfs_client; >> >> /* must grab the layout lock before the client lock */ >> - spin_lock(&nfsi->lo_lock); >> + lock_current_layout(nfsi); >> >> spin_lock(&clp->cl_lock); >> if (list_empty(&nfsi->lo_inodes)) >> @@ -1038,10 +1064,10 @@ void drain_layoutreturns(struct pnfs_layout_type *lo) >> while (atomic_read(&lo->lretcount)) { >> struct nfs_inode *nfsi = PNFS_NFS_INODE(lo); >> >> - spin_unlock(&nfsi->lo_lock); >> + unlock_current_layout(nfsi); >> dprintk("%s: waiting\n", __func__); >> wait_event(nfsi->lo_waitq, (atomic_read(&lo->lretcount) == 0)); >> - spin_lock(&nfsi->lo_lock); >> + lock_current_layout(nfsi); >> } >> } >> >> @@ -1080,13 +1106,13 @@ pnfs_update_layout(struct inode *ino, >> /* Check to see if the layout for the given range already exists */ >> lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref); >> if (lseg && !lseg->valid) { >> - spin_unlock(&nfsi->lo_lock); >> + unlock_current_layout(nfsi); >>
if (take_ref) >> put_lseg(lseg); >> for (;;) { >> prepare_to_wait(&nfsi->lo_waitq, &__wait, >> TASK_KILLABLE); >> - spin_lock(&nfsi->lo_lock); >> + lock_current_layout(nfsi); >> lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref); >> if (!lseg || lseg->valid) >> break; >> @@ -1099,7 +1125,7 @@ pnfs_update_layout(struct inode *ino, >> result = -ERESTARTSYS; >> break; >> } >> - spin_unlock(&nfsi->lo_lock); >> + unlock_current_layout(nfsi); >> schedule(); >> } >> finish_wait(&nfsi->lo_waitq, &__wait); >> @@ -1136,7 +1162,7 @@ pnfs_update_layout(struct inode *ino, >> /* Matching dec is done in .rpc_release (on non-error paths) */ >> atomic_inc(&lo->lgetcount); >> /* Lose lock, but not reference, match this with pnfs_layout_release */ >> - spin_unlock(&nfsi->lo_lock); >> + unlock_current_layout(nfsi); >> >> result = get_layout(ino, ctx, &arg, lsegpp, lo); >> out: >> @@ -1286,7 +1312,7 @@ pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp) >> *lgp->lsegpp = lseg; >> }
>> >> - spin_lock(&nfsi->lo_lock); >> + lock_current_layout(nfsi); >> pnfs_insert_layout(lo, lseg); >> >> if (res->return_on_close) { >> @@ -1297,7 +1323,7 @@ pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp) >> >> /* Done processing layoutget. Set the layout stateid */ >> pnfs_set_layout_stateid(lo, &res->stateid); >> - spin_unlock(&nfsi->lo_lock); >> + unlock_current_layout(nfsi); >> out: >> return status; >> } >> @@ -2212,7 +2238,7 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync) >> if (!data) >> return -ENOMEM; >> >> - spin_lock(&nfsi->lo_lock); >> + lock_current_layout(nfsi); >> if (!nfsi->layout.layoutcommit_ctx) >> goto out_unlock; >> >> @@ -2233,7 +2259,7 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync) >> nfsi->layout.layoutcommit_ctx = NULL; >> >> /* release lock on pnfs layoutcommit attrs */ >> - spin_unlock(&nfsi->lo_lock); >> + unlock_current_layout(nfsi); >> >> data->is_sync = sync; >> status = pnfs4_proc_layoutcommit(data); >> @@ -2242,7 +2268,7 @@ out: >> return status; >> out_unlock: >> pnfs_layoutcommit_free(data); >> - spin_unlock(&nfsi->lo_lock); >> + unlock_current_layout(nfsi); >> goto out; >> } >> >> -- >> 1.6.2.5 >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at
http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 20+ messages in thread
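The quoted hunk for put_unlock_current_layout() replaces the combined test `--lo->refcount == 0 && list_empty(&lo->segs)` with a decrement followed by two separate checks. The userspace sketch below is an illustration under stated assumptions, not the kernel code: struct toy_layout is a hypothetical stand-in for struct pnfs_layout_type, and a boolean replaces list_empty(&lo->segs). It suggests the two forms reach the same teardown decision whenever the refcount is positive on entry, which the BUG_ON(lo->refcount <= 0) above the hunk requires.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the layout: just a refcount and a flag
 * that plays the role of list_empty(&lo->segs). */
struct toy_layout {
    int refcount;
    bool segs_empty;
};

/* Original form: decrement and test in a single condition.
 * Returns true if the teardown branch would run. */
static bool put_old(struct toy_layout *lo)
{
    if (--lo->refcount == 0 && lo->segs_empty)
        return true;
    return false;
}

/* Restructured form from the patch: decrement, bail out early while
 * references remain, then test for an empty segment list. */
static bool put_new(struct toy_layout *lo)
{
    bool teardown = false;

    lo->refcount--;
    if (lo->refcount > 0)
        goto out;
    if (lo->segs_empty)
        teardown = true;
out:
    return teardown;
}

/* True if both forms make the same decision and leave the same
 * refcount behind. Assumes refcount > 0 on entry. */
static bool forms_agree(int refcount, bool segs_empty)
{
    struct toy_layout a = { refcount, segs_empty };
    struct toy_layout b = { refcount, segs_empty };
    bool ra = put_old(&a);
    bool rb = put_new(&b);

    return ra == rb && a.refcount == b.refcount;
}
```

The restructured form mainly gives the function a single unlock site at the `out:` label, which is where the patch places unlock_current_layout().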
* [PATCH 0/8] forgetful client v2
@ 2010-06-07 21:11 Alexandros Batsakis
2010-06-07 21:11 ` [PATCH 1/8] pnfs-submit: clean struct nfs_inode Alexandros Batsakis
0 siblings, 1 reply; 20+ messages in thread
From: Alexandros Batsakis @ 2010-06-07 21:11 UTC (permalink / raw)
To: linux-nfs; +Cc: bhalevy, Alexandros Batsakis
This set of patches (2.6.35-rc2) includes a first attempt to implement
the forgetful client model for the pNFS client. The model
is explained in patch 7.
It also includes some minor cleanups in the layout management code
that help to improve the maintainability of the current code.
Passed cthon tests against the pyNFS server, and against a modified
version of pyNFS server that randomly issues layout recalls after opens.
Alexandros Batsakis (8):
pnfs-submit: clean struct nfs_inode
pnfs-submit: clean locking infrastructure
pnfs-submit: remove lgetcount, lretcount
pnfs-submit: change stateid to be a union
pnfs-submit: request whole-file layouts only
pnfs-submit: change layout list to be similar to other state lists
pnfs-submit: forgetful client (layouts)
pnfs-submit: support for CB_RECALL_ANY (layouts)
fs/nfs/callback.h | 7 +
fs/nfs/callback_proc.c | 231 +++++++++++++++++++++++++++++---------
fs/nfs/callback_xdr.c | 2 +-
fs/nfs/client.c | 2 +-
fs/nfs/delegation.c | 19 ++--
fs/nfs/inode.c | 12 +-
fs/nfs/nfs4_fs.h | 1 +
fs/nfs/nfs4proc.c | 46 +++++---
fs/nfs/nfs4state.c | 4 +-
fs/nfs/nfs4xdr.c | 38 ++++---
fs/nfs/pnfs.c | 276 +++++++++++++++++++++------------------------
fs/nfs/pnfs.h | 3 +-
fs/nfsd/nfs4callback.c | 1 -
include/linux/nfs4.h | 16 +++-
include/linux/nfs4_pnfs.h | 2 +-
include/linux/nfs_fs.h | 28 ++---
include/linux/nfs_fs_sb.h | 2 +-
17 files changed, 414 insertions(+), 276 deletions(-)
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/8] pnfs-submit: clean struct nfs_inode 2010-06-07 21:11 [PATCH 0/8] forgetful client v2 Alexandros Batsakis @ 2010-06-07 21:11 ` Alexandros Batsakis 2010-06-07 21:11 ` [PATCH 2/8] pnfs-submit: clean locking infrastructure Alexandros Batsakis 0 siblings, 1 reply; 20+ messages in thread From: Alexandros Batsakis @ 2010-06-07 21:11 UTC (permalink / raw) To: linux-nfs; +Cc: bhalevy, Alexandros Batsakis by moving layout specific fields from nfs_inode to struct pnfs_layout_type Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> --- fs/nfs/inode.c | 8 +++--- fs/nfs/pnfs.c | 55 ++++++++++++++++++++++++-------------------- include/linux/nfs4_pnfs.h | 2 +- include/linux/nfs_fs.h | 22 +++++++++--------- 4 files changed, 46 insertions(+), 41 deletions(-) diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index b33d1a1..d43f2c5 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -1366,12 +1366,12 @@ void nfs4_clear_inode(struct inode *inode) static void pnfs_alloc_init_inode(struct nfs_inode *nfsi) { #ifdef CONFIG_NFS_V4_1 - nfsi->pnfs_layout_state = 0; + nfsi->layout.pnfs_layout_state = 0; memset(&nfsi->layout.stateid, 0, NFS4_STATEID_SIZE); nfsi->layout.roc_iomode = 0; - nfsi->lo_cred = NULL; - nfsi->pnfs_write_begin_pos = 0; - nfsi->pnfs_write_end_pos = 0; + nfsi->layout.lo_cred = NULL; + nfsi->layout.pnfs_write_begin_pos = 0; + nfsi->layout.pnfs_write_end_pos = 0; #endif /* CONFIG_NFS_V4_1 */ } diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c index 8cc4412..8620f68 100644 --- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -154,7 +154,7 @@ pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_context *ctx) dprintk("%s: has_layout=%d ctx=%p\n", __func__, has_layout(nfsi), ctx); spin_lock(&nfsi->lo_lock); if (has_layout(nfsi) && !layoutcommit_needed(nfsi)) { - nfsi->lo_cred = get_rpccred(ctx->state->owner->so_cred); + nfsi->layout.lo_cred = get_rpccred(ctx->state->owner->so_cred); nfsi->change_attr++; spin_unlock(&nfsi->lo_lock); dprintk("%s: Set layoutcommit\n", 
__func__); @@ -174,17 +174,17 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent) loff_t end_pos; spin_lock(&nfsi->lo_lock); - if (offset < nfsi->pnfs_write_begin_pos) - nfsi->pnfs_write_begin_pos = offset; + if (offset < nfsi->layout.pnfs_write_begin_pos) + nfsi->layout.pnfs_write_begin_pos = offset; end_pos = offset + extent - 1; /* I'm being inclusive */ - if (end_pos > nfsi->pnfs_write_end_pos) - nfsi->pnfs_write_end_pos = end_pos; + if (end_pos > nfsi->layout.pnfs_write_end_pos) + nfsi->layout.pnfs_write_end_pos = end_pos; dprintk("%s: Wrote %lu@%lu bpos %lu, epos: %lu\n", __func__, (unsigned long) extent, (unsigned long) offset , - (unsigned long) nfsi->pnfs_write_begin_pos, - (unsigned long) nfsi->pnfs_write_end_pos); + (unsigned long) nfsi->layout.pnfs_write_begin_pos, + (unsigned long) nfsi->layout.pnfs_write_end_pos); spin_unlock(&nfsi->lo_lock); } @@ -915,7 +915,8 @@ get_lock_alloc_layout(struct inode *ino) * wait until bit is cleared if we lost this race. 
*/ res = wait_on_bit_lock( - &nfsi->pnfs_layout_state, NFS_INO_LAYOUT_ALLOC, + &nfsi->layout.pnfs_layout_state, + NFS_INO_LAYOUT_ALLOC, pnfs_wait_schedule, TASK_KILLABLE); if (res) { lo = ERR_PTR(res); @@ -943,8 +944,10 @@ get_lock_alloc_layout(struct inode *ino) lo = ERR_PTR(-ENOMEM); /* release the NFS_INO_LAYOUT_ALLOC bit and wake up waiters */ - clear_bit_unlock(NFS_INO_LAYOUT_ALLOC, &nfsi->pnfs_layout_state); - wake_up_bit(&nfsi->pnfs_layout_state, NFS_INO_LAYOUT_ALLOC); + clear_bit_unlock(NFS_INO_LAYOUT_ALLOC, + &nfsi->layout.pnfs_layout_state); + wake_up_bit(&nfsi->layout.pnfs_layout_state, + NFS_INO_LAYOUT_ALLOC); break; } @@ -1104,13 +1107,13 @@ pnfs_update_layout(struct inode *ino, } /* if get layout already failed once goto out */ - if (test_bit(lo_fail_bit(iomode), &nfsi->pnfs_layout_state)) { - if (unlikely(nfsi->pnfs_layout_suspend && - get_seconds() >= nfsi->pnfs_layout_suspend)) { + if (test_bit(lo_fail_bit(iomode), &nfsi->layout.pnfs_layout_state)) { + if (unlikely(nfsi->layout.pnfs_layout_suspend && + get_seconds() >= nfsi->layout.pnfs_layout_suspend)) { dprintk("%s: layout_get resumed\n", __func__); clear_bit(lo_fail_bit(iomode), - &nfsi->pnfs_layout_state); - nfsi->pnfs_layout_suspend = 0; + &nfsi->layout.pnfs_layout_state); + nfsi->layout.pnfs_layout_suspend = 0; } else { result = 1; goto out_put; @@ -1126,7 +1129,8 @@ pnfs_update_layout(struct inode *ino, result = get_layout(ino, ctx, &arg, lsegpp, lo); out: dprintk("%s end (err:%d) state 0x%lx lseg %p\n", - __func__, result, nfsi->pnfs_layout_state, lseg); + __func__, result, nfsi->layout.pnfs_layout_state, + lseg); return result; out_put: if (lsegpp) @@ -1231,13 +1235,14 @@ pnfs_get_layout_done(struct nfs4_pnfs_layoutget *lgp, int rpc_status) get_out: /* remember that get layout failed and suspend trying */ - nfsi->pnfs_layout_suspend = suspend; - set_bit(lo_fail_bit(lgp->args.lseg.iomode), &nfsi->pnfs_layout_state); + nfsi->layout.pnfs_layout_suspend = suspend; + 
set_bit(lo_fail_bit(lgp->args.lseg.iomode), + &nfsi->layout.pnfs_layout_state); dprintk("%s: layout_get suspended until %ld\n", __func__, suspend); out: dprintk("%s end (err:%d) state 0x%lx lseg %p\n", - __func__, lgp->status, nfsi->pnfs_layout_state, lseg); + __func__, lgp->status, nfsi->layout.pnfs_layout_state, lseg); return; } @@ -2009,12 +2014,12 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync) /* Clear layoutcommit properties in the inode so * new lc info can be generated */ - write_begin_pos = nfsi->pnfs_write_begin_pos; - write_end_pos = nfsi->pnfs_write_end_pos; - data->cred = nfsi->lo_cred; - nfsi->pnfs_write_begin_pos = 0; - nfsi->pnfs_write_end_pos = 0; - nfsi->lo_cred = NULL; + write_begin_pos = nfsi->layout.pnfs_write_begin_pos; + write_end_pos = nfsi->layout.pnfs_write_end_pos; + data->cred = nfsi->layout.lo_cred; + nfsi->layout.pnfs_write_begin_pos = 0; + nfsi->layout.pnfs_write_end_pos = 0; + nfsi->layout.lo_cred = NULL; pnfs_get_layout_stateid(&data->args.stateid, &nfsi->layout); spin_unlock(&nfsi->lo_lock); diff --git a/include/linux/nfs4_pnfs.h b/include/linux/nfs4_pnfs.h index 84d2e95..53626d4 100644 --- a/include/linux/nfs4_pnfs.h +++ b/include/linux/nfs4_pnfs.h @@ -83,7 +83,7 @@ has_layout(struct nfs_inode *nfsi) static inline bool layoutcommit_needed(struct nfs_inode *nfsi) { - return nfsi->lo_cred != NULL; + return nfsi->layout.lo_cred != NULL; } #endif /* CONFIG_NFS_V4_1 */ diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index 2762b2c..45846c5 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -106,6 +106,17 @@ struct pnfs_layout_type { seqlock_t seqlock; /* Protects the stateid */ nfs4_stateid stateid; void *ld_data; /* layout driver private data */ + unsigned long pnfs_layout_state; + #define NFS_INO_RO_LAYOUT_FAILED 0 /* get ro layout failed stop trying */ + #define NFS_INO_RW_LAYOUT_FAILED 1 /* get rw layout failed stop trying */ + #define NFS_INO_LAYOUT_ALLOC 2 /* bit lock for layout allocation 
*/ + time_t pnfs_layout_suspend; + struct rpc_cred *lo_cred; /* layoutcommit credential */ + /* DH: These vars keep track of the maximum write range + * so the values can be used for layoutcommit. + */ + loff_t pnfs_write_begin_pos; + loff_t pnfs_write_end_pos; }; /* @@ -198,20 +209,9 @@ struct nfs_inode { /* Inodes having layouts */ struct list_head lo_inodes; - unsigned long pnfs_layout_state; -#define NFS_INO_RO_LAYOUT_FAILED 0 /* get ro layout failed stop trying */ -#define NFS_INO_RW_LAYOUT_FAILED 1 /* get rw layout failed stop trying */ -#define NFS_INO_LAYOUT_ALLOC 2 /* bit lock for layout allocation */ - time_t pnfs_layout_suspend; - struct rpc_cred *lo_cred; /* layoutcommit credential */ wait_queue_head_t lo_waitq; spinlock_t lo_lock; struct pnfs_layout_type layout; - /* DH: These vars keep track of the maximum write range - * so the values can be used for layoutcommit. - */ - loff_t pnfs_write_begin_pos; - loff_t pnfs_write_end_pos; #endif /* CONFIG_NFS_V4_1 */ #endif /* CONFIG_NFS_V4*/ #ifdef CONFIG_NFS_FSCACHE -- 1.6.2.5 ^ permalink raw reply related [flat|nested] 20+ messages in thread
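The write-range fields that this patch moves into struct pnfs_layout_type are easiest to follow with the update rule from pnfs_update_last_write() isolated. The sketch below is a userspace illustration only: struct write_range and update_last_write() are hypothetical names, int64_t stands in for loff_t, and no locking is modeled. It shows the rule in the diff: the begin position only moves down, the end position only moves up, and the end position is inclusive (offset + extent - 1).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields moved into the layout:
 * pnfs_write_begin_pos and pnfs_write_end_pos. */
struct write_range {
    int64_t begin_pos;
    int64_t end_pos;    /* inclusive */
};

/* Widen the tracked range so it covers a write of `extent` bytes
 * starting at `offset`. Mirrors the comparisons in the diff. */
static void update_last_write(struct write_range *r,
                              int64_t offset, int64_t extent)
{
    int64_t end_pos;

    if (offset < r->begin_pos)
        r->begin_pos = offset;
    end_pos = offset + extent - 1;  /* inclusive end, as in the patch */
    if (end_pos > r->end_pos)
        r->end_pos = end_pos;
}
```

Starting from the zeroed state that pnfs_alloc_init_inode() sets up, a 4096-byte write at offset 4096 leaves begin_pos at 0 and raises end_pos to 8191. A later, smaller write inside that span changes nothing.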
* [PATCH 2/8] pnfs-submit: clean locking infrastructure
2010-06-07 21:11 ` [PATCH 1/8] pnfs-submit: clean struct nfs_inode Alexandros Batsakis
@ 2010-06-07 21:11 ` Alexandros Batsakis
2010-06-08 7:30 ` Christoph Hellwig
0 siblings, 1 reply; 20+ messages in thread
From: Alexandros Batsakis @ 2010-06-07 21:11 UTC
To: linux-nfs; +Cc: bhalevy, Alexandros Batsakis, Fred Isaman

(also minor cleanup of pnfs_free_layout())

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
---
 fs/nfs/pnfs.c | 73 ++++++++++++++++++++++++++++++++++++--------------------
 1 files changed, 47 insertions(+), 26 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 8620f68..b0a4bca 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -60,6 +60,8 @@ static int pnfs_initialized;
 static void pnfs_free_layout(struct pnfs_layout_type *lo,
 			     struct nfs4_pnfs_layout_segment *range);
 static enum pnfs_try_status pnfs_commit(struct nfs_write_data *data, int sync);
+static inline void lock_current_layout(struct nfs_inode *nfsi);
+static inline void unlock_current_layout(struct nfs_inode *nfsi);

 /* Locking:
  *
@@ -152,15 +154,15 @@ void
 pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_context *ctx)
 {
 	dprintk("%s: has_layout=%d ctx=%p\n", __func__, has_layout(nfsi), ctx);
-	spin_lock(&nfsi->lo_lock);
+	lock_current_layout(nfsi);
 	if (has_layout(nfsi) && !layoutcommit_needed(nfsi)) {
 		nfsi->layout.lo_cred = get_rpccred(ctx->state->owner->so_cred);
 		nfsi->change_attr++;
-		spin_unlock(&nfsi->lo_lock);
+		unlock_current_layout(nfsi);
 		dprintk("%s: Set layoutcommit\n", __func__);
 		return;
 	}
-	spin_unlock(&nfsi->lo_lock);
+	unlock_current_layout(nfsi);
 }

 /* Update last_write_offset for layoutcommit.
@@ -173,7 +175,7 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent)
 {
 	loff_t end_pos;

-	spin_lock(&nfsi->lo_lock);
+	lock_current_layout(nfsi);
 	if (offset < nfsi->layout.pnfs_write_begin_pos)
 		nfsi->layout.pnfs_write_begin_pos = offset;
 	end_pos = offset + extent - 1; /* I'm being inclusive */
@@ -185,7 +187,7 @@ pnfs_update_last_write(struct nfs_inode *nfsi, loff_t offset, size_t extent)
 		(unsigned long) offset ,
 		(unsigned long) nfsi->layout.pnfs_write_begin_pos,
 		(unsigned long) nfsi->layout.pnfs_write_end_pos);
-	spin_unlock(&nfsi->lo_lock);
+	unlock_current_layout(nfsi);
 }

 /* Unitialize a mountpoint in a layout driver */
@@ -312,6 +314,17 @@ pnfs_unregister_layoutdriver(struct pnfs_layoutdriver_type *ld_type)
 #define BUG_ON_UNLOCKED_LO(lo) do {} while (0)
 #endif /* CONFIG_SMP */

+static inline void lock_current_layout(struct nfs_inode *nfsi)
+{
+	spin_lock(&nfsi->lo_lock);
+}
+
+static inline void unlock_current_layout(struct nfs_inode *nfsi)
+{
+	BUG_ON_UNLOCKED_LO((&nfsi->layout));
+	spin_unlock(&nfsi->lo_lock);
+}
+
 /*
  * get and lock nfsi->layout
  */
@@ -320,10 +333,10 @@ get_lock_current_layout(struct nfs_inode *nfsi)
 {
 	struct pnfs_layout_type *lo;

+	lock_current_layout(nfsi);
 	lo = &nfsi->layout;
-	spin_lock(&nfsi->lo_lock);
 	if (!lo->ld_data) {
-		spin_unlock(&nfsi->lo_lock);
+		unlock_current_layout(nfsi);
 		return NULL;
 	}

@@ -343,7 +356,12 @@ put_unlock_current_layout(struct pnfs_layout_type *lo)
 	BUG_ON_UNLOCKED_LO(lo);
 	BUG_ON(lo->refcount <= 0);

-	if (--lo->refcount == 0 && list_empty(&lo->segs)) {
+	lo->refcount--;
+
+	if (lo->refcount > 0)
+		goto out;
+
+	if (list_empty(&lo->segs)) {
 		struct layoutdriver_io_operations *io_ops =
 			PNFS_LD_IO_OPS(lo);

@@ -357,7 +375,8 @@ put_unlock_current_layout(struct pnfs_layout_type *lo)
 		list_del_init(&nfsi->lo_inodes);
 		spin_unlock(&clp->cl_lock);
 	}
-	spin_unlock(&nfsi->lo_lock);
+out:
+	unlock_current_layout(nfsi);
 }

 void
@@ -366,7 +385,7 @@ pnfs_layout_release(struct pnfs_layout_type *lo, atomic_t *count,
 {
 	struct nfs_inode *nfsi = PNFS_NFS_INODE(lo);

-	spin_lock(&nfsi->lo_lock);
+	lock_current_layout(nfsi);
 	if (range)
 		pnfs_free_layout(lo, range);
 	atomic_dec(count);
@@ -385,6 +404,8 @@ pnfs_destroy_layout(struct nfs_inode *nfsi)
 	};

 	lo = get_lock_current_layout(nfsi);
+	if (!lo)
+		return;
 	pnfs_free_layout(lo, &range);
 	put_unlock_current_layout(lo);
 }
@@ -662,7 +683,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi,
 	struct pnfs_layout_segment *lseg;
 	bool ret = false;

-	spin_lock(&nfsi->lo_lock);
+	lock_current_layout(nfsi);
 	list_for_each_entry (lseg, &nfsi->layout.segs, fi_list) {
 		if (!should_free_lseg(lseg, range))
 			continue;
@@ -676,7 +697,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi,
 	}
 	if (atomic_read(&nfsi->layout.lgetcount))
 		ret = true;
-	spin_unlock(&nfsi->lo_lock);
+	unlock_current_layout(nfsi);

 	dprintk("%s:Return %d\n", __func__, ret);
 	return ret;
@@ -758,7 +779,7 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range,
 		/* unlock w/o put rebalanced by eventual call to
 		 * pnfs_layout_release
 		 */
-		spin_unlock(&nfsi->lo_lock);
+		unlock_current_layout(nfsi);

 		if (pnfs_return_layout_barrier(nfsi, &arg)) {
 			dprintk("%s: waiting\n", __func__);
@@ -899,7 +920,7 @@ static int pnfs_wait_schedule(void *word)
  *
  * Note: If successful, nfsi->lo_lock is taken and the caller
  * must put and unlock current_layout by using put_unlock_current_layout()
- * when the returned layout is released.
+ * directly or pnfs_layout_release() when the returned layout is released.
  */
 static struct pnfs_layout_type *
 get_lock_alloc_layout(struct inode *ino)
@@ -934,7 +955,7 @@ get_lock_alloc_layout(struct inode *ino)
 		struct nfs_client *clp = NFS_SERVER(ino)->nfs_client;

 		/* must grab the layout lock before the client lock */
-		spin_lock(&nfsi->lo_lock);
+		lock_current_layout(nfsi);

 		spin_lock(&clp->cl_lock);
 		if (list_empty(&nfsi->lo_inodes))
@@ -1026,10 +1047,10 @@ void drain_layoutreturns(struct pnfs_layout_type *lo)
 	while (atomic_read(&lo->lretcount)) {
 		struct nfs_inode *nfsi = PNFS_NFS_INODE(lo);

-		spin_unlock(&nfsi->lo_lock);
+		unlock_current_layout(nfsi);
 		dprintk("%s: waiting\n", __func__);
 		wait_event(nfsi->lo_waitq,
 			   (atomic_read(&lo->lretcount) == 0));
-		spin_lock(&nfsi->lo_lock);
+		lock_current_layout(nfsi);
 	}
 }
@@ -1068,13 +1089,13 @@ pnfs_update_layout(struct inode *ino,
 	/* Check to see if the layout for the given range already exists */
 	lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref);
 	if (lseg && !lseg->valid) {
-		spin_unlock(&nfsi->lo_lock);
+		unlock_current_layout(nfsi);
 		if (take_ref)
 			put_lseg(lseg);
 		for (;;) {
 			prepare_to_wait(&nfsi->lo_waitq, &__wait,
 					TASK_KILLABLE);
-			spin_lock(&nfsi->lo_lock);
+			lock_current_layout(nfsi);
 			lseg = pnfs_has_layout(lo, &arg, take_ref, !take_ref);
 			if (!lseg || lseg->valid)
 				break;
@@ -1087,7 +1108,7 @@ pnfs_update_layout(struct inode *ino,
 				result = -ERESTARTSYS;
 				break;
 			}
-			spin_unlock(&nfsi->lo_lock);
+			unlock_current_layout(nfsi);
 			schedule();
 		}
 		finish_wait(&nfsi->lo_waitq, &__wait);
@@ -1124,7 +1145,7 @@ pnfs_update_layout(struct inode *ino,
 	/* Matching dec is done in .rpc_release (on non-error paths) */
 	atomic_inc(&lo->lgetcount);
 	/* Lose lock, but not reference, match this with pnfs_layout_release */
-	spin_unlock(&nfsi->lo_lock);
+	unlock_current_layout(nfsi);

 	result = get_layout(ino, ctx, &arg, lsegpp, lo);
 out:
@@ -1274,7 +1295,7 @@ pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp)
 		*lgp->lsegpp = lseg;
 	}

-	spin_lock(&nfsi->lo_lock);
+	lock_current_layout(nfsi);
 	pnfs_insert_layout(lo, lseg);

 	if (res->return_on_close) {
@@ -1285,7 +1306,7 @@ pnfs_layout_process(struct nfs4_pnfs_layoutget *lgp)

 	/* Done processing layoutget. Set the layout stateid */
 	pnfs_set_layout_stateid(lo, &res->stateid);
-	spin_unlock(&nfsi->lo_lock);
+	unlock_current_layout(nfsi);
 out:
 	return status;
 }
@@ -2005,9 +2026,9 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync)
 	if (!data)
 		return -ENOMEM;

-	spin_lock(&nfsi->lo_lock);
+	lock_current_layout(nfsi);
 	if (!layoutcommit_needed(nfsi)) {
-		spin_unlock(&nfsi->lo_lock);
+		unlock_current_layout(nfsi);
 		goto out_free;
 	}

@@ -2022,7 +2043,7 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync)
 	nfsi->layout.lo_cred = NULL;
 	pnfs_get_layout_stateid(&data->args.stateid, &nfsi->layout);

-	spin_unlock(&nfsi->lo_lock);
+	unlock_current_layout(nfsi);

 	/* Set up layout commit args */
 	status = pnfs_layoutcommit_setup(inode, data, write_begin_pos,
-- 
1.6.2.5
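Editorial sketch (not part of the posted patch): the reworked put path in put_unlock_current_layout() drops one reference, leaves early while other references remain, and only then considers tearing the layout down, always releasing the lock on the way out. The userspace program below is a rough analogue under assumptions: a pthread mutex replaces the kernel spinlock, and the demo_* names, the nsegs counter, and the torn_down flag are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the layout object. */
struct demo_layout {
	pthread_mutex_t lock; /* plays the role of nfsi->lo_lock */
	int refcount;         /* references held on the layout */
	int nsegs;            /* layout segments still attached */
	bool torn_down;       /* set once the layout has been freed */
};

static void demo_layout_setup(struct demo_layout *lo, int nsegs)
{
	pthread_mutex_init(&lo->lock, NULL);
	lo->refcount = 0;
	lo->nsegs = nsegs;
	lo->torn_down = false;
}

/* Take the lock and a reference; pair with demo_put_unlock(). */
static void demo_get_lock(struct demo_layout *lo)
{
	pthread_mutex_lock(&lo->lock);
	lo->refcount++;
}

/* Caller holds the lock. Drop one reference; tear down only when it
 * was the last one and no segments remain. Always unlocks. */
static void demo_put_unlock(struct demo_layout *lo)
{
	assert(lo->refcount > 0);
	lo->refcount--;

	if (lo->refcount > 0)
		goto out;

	if (lo->nsegs == 0)
		lo->torn_down = true;
out:
	pthread_mutex_unlock(&lo->lock);
}
```

The early "goto out" keeps the unlock in a single place, which is what lets a caller drop the lock without a put and have a later release path rebalance the reference count.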
* Re: [PATCH 2/8] pnfs-submit: clean locking infrastructure
2010-06-07 21:11 ` [PATCH 2/8] pnfs-submit: clean locking infrastructure Alexandros Batsakis
@ 2010-06-08 7:30 ` Christoph Hellwig
2010-06-08 7:34 ` Benny Halevy
0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2010-06-08 7:30 UTC
To: Alexandros Batsakis; +Cc: linux-nfs, bhalevy, Fred Isaman

On Mon, Jun 07, 2010 at 02:11:47PM -0700, Alexandros Batsakis wrote:
> +static inline void lock_current_layout(struct nfs_inode *nfsi)
> +{
> +	spin_lock(&nfsi->lo_lock);
> +}
> +
> +static inline void unlock_current_layout(struct nfs_inode *nfsi)
> +{
> +	BUG_ON_UNLOCKED_LO((&nfsi->layout));
> +	spin_unlock(&nfsi->lo_lock);
> +}

Adding wrappers for these is nothing but obfuscation. No need
for the BUG_ON above, the spinlock code asserts that already if
building with spinlock debugging.
* Re: [PATCH 2/8] pnfs-submit: clean locking infrastructure
2010-06-08 7:30 ` Christoph Hellwig
@ 2010-06-08 7:34 ` Benny Halevy
0 siblings, 0 replies; 20+ messages in thread
From: Benny Halevy @ 2010-06-08 7:34 UTC
To: Christoph Hellwig; +Cc: Alexandros Batsakis, linux-nfs, Fred Isaman

On 2010-06-08 10:30, Christoph Hellwig wrote:
> On Mon, Jun 07, 2010 at 02:11:47PM -0700, Alexandros Batsakis wrote:
>> +static inline void lock_current_layout(struct nfs_inode *nfsi)
>> +{
>> +	spin_lock(&nfsi->lo_lock);
>> +}
>> +
>> +static inline void unlock_current_layout(struct nfs_inode *nfsi)
>> +{
>> +	BUG_ON_UNLOCKED_LO((&nfsi->layout));
>> +	spin_unlock(&nfsi->lo_lock);
>> +}
>
> Adding wrappers for these is nothing but obfuscation. No need
> for the BUG_ON above, the spinlock code asserts that already if
> building with spinlock debugging.

Good point.
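Editorial sketch (not from the thread): the review comment is that a lock primitive with debugging enabled already catches an unlock of a lock that is not held, so a wrapper that re-checks this adds nothing. A loose userspace analogy, assuming POSIX threads rather than kernel spinlocks, is an error-checking mutex: pthread_mutex_unlock() itself reports the misuse. The demo_* function names are hypothetical; this is an analogy, not the kernel's spinlock debugging code.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Initialise an error-checking mutex. Returns 0 on success or the
 * pthread error code on failure. */
static int demo_lock_init(pthread_mutex_t *m)
{
	pthread_mutexattr_t attr;
	int err;

	err = pthread_mutexattr_init(&attr);
	if (err)
		return err;
	err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!err)
		err = pthread_mutex_init(m, &attr);
	pthread_mutexattr_destroy(&attr);
	return err;
}

/* Lock and unlock once (balanced), then unlock a second time without
 * holding the mutex. Returns the result of the unbalanced unlock, or
 * -1 if the mutex could not be set up. */
static int demo_unbalanced_unlock(void)
{
	pthread_mutex_t m;
	int err;

	if (demo_lock_init(&m))
		return -1;

	pthread_mutex_lock(&m);
	err = pthread_mutex_unlock(&m); /* balanced unlock succeeds */
	assert(err == 0);

	err = pthread_mutex_unlock(&m); /* mutex is not held here */
	pthread_mutex_destroy(&m);
	return err;
}
```

For an error-checking mutex, POSIX specifies that unlocking a mutex the caller does not hold returns an error (EPERM), so callers can use the primitive directly and still get the check.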
Thread overview: 20+ messages
2010-05-05 17:00 [PATCH 0/8] pnfs-submit: forgetful client v2 Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 1/8] pnfs-submit: clean struct nfs_inode Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 2/8] pnfs-submit: clean locking infrastructure Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 3/8] pnfs-submit: remove lgetcount, lretcount (outstanding LAYOUTGETs/LAYOUTRETUNs) Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 4/8] pnfs-submit: change stateid to be a union Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 5/8] pnfs-submit: request whole file layouts only Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 6/8] pnfs-submit: change layouts list to be similar to the other state list management Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 7/8] pnfs-submit: forgetful client model Alexandros Batsakis
2010-05-05 17:00 ` [PATCH 8/8] pnfs-submit: support for cb_recall_any (layouts) Alexandros Batsakis
2010-06-07 14:34 ` [PATCH 2/8] pnfs-submit: clean locking infrastructure Fred Isaman
2010-06-07 8:52 ` [PATCH 0/8] pnfs-submit: forgetful client v2 Boaz Harrosh
2010-06-07 8:54 ` Boaz Harrosh
2010-06-07 15:38 ` Alexandros Batsakis
-- strict thread matches above, loose matches on Subject: below --
2010-05-17 17:56 [PATCH 0/8] pnfs-submit: Forgetful cleint and some layout cleanups Alexandros Batsakis
2010-05-17 17:56 ` [PATCH 1/8] pnfs-submit: clean struct nfs_inode Alexandros Batsakis
2010-05-17 17:56 ` [PATCH 2/8] pnfs-submit: clean locking infrastructure Alexandros Batsakis
2010-05-26 8:28 ` Benny Halevy
2010-05-28 17:27 ` Fred Isaman
[not found] ` <AANLkTinsHI0fHYdpUlq-MsMX0BmsLGvdAbrKx7M5ydjw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-05-28 18:27 ` Alexandros Batsakis
2010-06-07 21:11 [PATCH 0/8] forgetful client v2 Alexandros Batsakis
2010-06-07 21:11 ` [PATCH 1/8] pnfs-submit: clean struct nfs_inode Alexandros Batsakis
2010-06-07 21:11 ` [PATCH 2/8] pnfs-submit: clean locking infrastructure Alexandros Batsakis
2010-06-08 7:30 ` Christoph Hellwig
2010-06-08 7:34 ` Benny Halevy