[PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3]

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3]
@ 2007-04-25 10:50 ` David Howells
  2007-04-25 10:50   ` [PATCH 01/16] AF_RXRPC: Move generic skbuff stuff from XFRM code to generic code " David Howells
                     ` (12 more replies)
  0 siblings, 13 replies; 17+ messages in thread
From: David Howells @ 2007-04-25 10:50 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel, linux-fsdevel, netdev, dhowells


The first of these patches together provide secure client-side RxRPC
connectivity as a Linux kernel socket family.  Only the RxRPC transport/session
side is supplied - the presentation side (marshalling the data) is left to the
client.  Copies of the patches can be found here:

	http://people.redhat.com/~dhowells/rxrpc/series
	http://people.redhat.com/~dhowells/rxrpc/01-move-skb-generic.diff
	http://people.redhat.com/~dhowells/rxrpc/02-cancel_delayed_work.diff
	http://people.redhat.com/~dhowells/rxrpc/03-keys.diff
	http://people.redhat.com/~dhowells/rxrpc/04-timer-exports.diff
	http://people.redhat.com/~dhowells/rxrpc/05-af_rxrpc.diff

Further patches make the in-kernel AFS filesystem use AF_RXRPC and delete the
old RxRPC implementation:

	http://people.redhat.com/~dhowells/rxrpc/06-afs-cleanup.diff
	http://people.redhat.com/~dhowells/rxrpc/07-af_rxrpc-kernel.diff
	http://people.redhat.com/~dhowells/rxrpc/08-af_rxrpc-afs.diff
	http://people.redhat.com/~dhowells/rxrpc/09-af_rxrpc-delete-old.diff

And then the rest of the patches extend AFS to provide automatic unmounting of
automount trees, security support and directory-level write support (create,
mkdir, etc.):

	http://people.redhat.com/~dhowells/rxrpc/10-afs-multimount.diff
	http://people.redhat.com/~dhowells/rxrpc/11-afs-security.diff
	http://people.redhat.com/~dhowells/rxrpc/12-afs-doc.diff
	http://people.redhat.com/~dhowells/rxrpc/13-netlink-support-MSG_TRUNC.diff
	http://people.redhat.com/~dhowells/rxrpc/14-afs-get-capabilities.diff
	http://people.redhat.com/~dhowells/rxrpc/15-afs-initcallbackstate3.diff
	http://people.redhat.com/~dhowells/rxrpc/16-afs-dir-write-support.diff

Note that file-level write support is not yet complete and so is not included
in this patch set.


The userspace access methods make use of the control data passed to/by
sendmsg() and recvmsg().  See the three simple test programs:

	http://people.redhat.com/~dhowells/rxrpc/klog.c
	http://people.redhat.com/~dhowells/rxrpc/rxrpc.c
	http://people.redhat.com/~dhowells/rxrpc/listen.c

The klog program is provided to go and get a Kerberos IV key from the AFS
kaserver.  Currently it must be edited before compiling to note the right
server IP address and the appropriate credentials.

These programs can be compiled by:

	make klog rxrpc listen CFLAGS="-Wall -g" LDLIBS="-lcrypto -lcrypt -lkrb4 -lkeyutils"

Then a ticket can be obtained by:

	./klog

If a security key is acquired in this way, then all subsequent AFS operations -
including VL lookups and mounts - performed with that session keyring will be
authenticated using that key.  The key can be viewed like so:

	[root@andromeda ~]# keyctl show
	Session Keyring
	       -3 --alswrv      0     0  keyring: _ses.3268
		2 --alswrv      0     0   \_ keyring: _uid.0
	111416553 --als--v      0     0   \_ rxrpc: afs@CAMBRIDGE.REDHAT.COM

TODO:

 (*) Make certain parameters (such as connection timeouts) userspace
     configurable.

 (*) Make userspace utilities use it; librxrpc.

 (*) Userspace documentation.

 (*) KerberosV security.

Changes:

 (*) SOCK_RPC has been removed.  SOCK_DGRAM is now used instead.

 (*) I've add a facility whereby calls can be made to destinations other than
     the connect() address of a client socket by making use of msg_name in the
     msghdr struct when using sendmsg() to send the first data packet of a
     call.  Indeed, a client socket need not be connected before being used
     so.

 (*) I've also added a facility whereby client calls may also be made on
     server sockets, again by using msg_name in the msghdr struct.  In such a
     case, the server's local transport endpoint is used.

 (*) I've made the write buffer space check available to various callers
     (sk_write_space) and implemented poll support.

 (*) Rewrote rxrpc_recvmsg().  It now concatenates adjacent data messages from
     the same call when delivering them.

 (*) Updated the documentation to include notes on recvmsg, cover control
     messages and cover SOL_RXRPC-level socket options.

 (*) Provided an in-kernel interface to give in-kernel utilities easier access
     to the facility.

 (*) Made fs/afs/ use it.

 (*) Deleted the old contents of net/rxrpc/.

 (*) Use the scatterlist interface to the crypto API for now.  The patch that
     added the direct access interface conflicts with patches Herbert Xu is
     producing, so I've dropped it for the moment.

 (*) Moved a bug fix to make secure connection reuse work from the
     af_rxrpc-kernel patch to the af_rxrpc main patch.

 (*) Make RxRPC use its own private work queues rather than keventd's to avoid
     deadlocks when AFS tries to use keventd too.  This also puts encryption
     in the private work queue rather than keventd's queue as that might take
     a relatively long time to process.

 (*) Shorten the dead call timeout to 2 seconds.  Without this, each completed
     call sits around eating up resources for 10 seconds.  The calls need to
     hang around for a little while in case duplicate packets appear, but 10
     seconds is excessive.

 (*) Decrement the number of available calls on a connection when a new call is
     allocated for a connection that doesn't have any calls in progress.

David

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 01/16] AF_RXRPC: Move generic skbuff stuff from XFRM code to generic code [try #3]
  2007-04-25 10:50 ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3] David Howells
@ 2007-04-25 10:50   ` David Howells
  2007-04-25 10:50   ` [PATCH 02/16] cancel_delayed_work: use del_timer() instead of del_timer_sync() " David Howells
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: David Howells @ 2007-04-25 10:50 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel, linux-fsdevel, netdev, dhowells

Move generic skbuff stuff from XFRM code to generic code so that AF_RXRPC can
use it too.

The kdoc comments I've attached to the functions needs to be checked by whoever
wrote them as I had to make some guesses about the workings of these functions.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 include/linux/skbuff.h |    6 ++
 include/net/esp.h      |    2 -
 net/core/skbuff.c      |  188 ++++++++++++++++++++++++++++++++++++++++++++++++
 net/xfrm/xfrm_algo.c   |  169 -------------------------------------------
 4 files changed, 194 insertions(+), 171 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5992f65..c905d42 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -83,6 +83,7 @@
  */
 
 struct net_device;
+struct scatterlist;
 
 #ifdef CONFIG_NETFILTER
 struct nf_conntrack {
@@ -361,6 +362,11 @@ extern struct sk_buff *skb_realloc_headroom(struct sk_buff *skb,
 extern struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
 				       int newheadroom, int newtailroom,
 				       gfp_t priority);
+extern int	       skb_to_sgvec(struct sk_buff *skb,
+				    struct scatterlist *sg, int offset,
+				    int len);
+extern int	       skb_cow_data(struct sk_buff *skb, int tailbits,
+				    struct sk_buff **trailer);
 extern int	       skb_pad(struct sk_buff *skb, int pad);
 #define dev_kfree_skb(a)	kfree_skb(a)
 extern void	      skb_over_panic(struct sk_buff *skb, int len,
diff --git a/include/net/esp.h b/include/net/esp.h
index 713d039..d05d8d2 100644
--- a/include/net/esp.h
+++ b/include/net/esp.h
@@ -40,8 +40,6 @@ struct esp_data
 	} auth;
 };
 
-extern int skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len);
-extern int skb_cow_data(struct sk_buff *skb, int tailbits, struct sk_buff **trailer);
 extern void *pskb_put(struct sk_buff *skb, struct sk_buff *tail, int len);
 
 static inline int esp_mac_digest(struct esp_data *esp, struct sk_buff *skb,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 336958f..aa02bd4 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -55,6 +55,7 @@
 #include <linux/cache.h>
 #include <linux/rtnetlink.h>
 #include <linux/init.h>
+#include <linux/scatterlist.h>
 
 #include <net/protocol.h>
 #include <net/dst.h>
@@ -2005,6 +2006,190 @@ void __init skb_init(void)
 						NULL, NULL);
 }
 
+/**
+ *	skb_to_sgvec - Fill a scatter-gather list from a socket buffer
+ *	@skb: Socket buffer containing the buffers to be mapped
+ *	@sg: The scatter-gather list to map into
+ *	@offset: The offset into the buffer's contents to start mapping
+ *	@len: Length of buffer space to be mapped
+ *
+ *	Fill the specified scatter-gather list with mappings/pointers into a
+ *	region of the buffer space attached to a socket buffer.
+ */
+int
+skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
+{
+	int start = skb_headlen(skb);
+	int i, copy = start - offset;
+	int elt = 0;
+
+	if (copy > 0) {
+		if (copy > len)
+			copy = len;
+		sg[elt].page = virt_to_page(skb->data + offset);
+		sg[elt].offset = (unsigned long)(skb->data + offset) % PAGE_SIZE;
+		sg[elt].length = copy;
+		elt++;
+		if ((len -= copy) == 0)
+			return elt;
+		offset += copy;
+	}
+
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		int end;
+
+		BUG_TRAP(start <= offset + len);
+
+		end = start + skb_shinfo(skb)->frags[i].size;
+		if ((copy = end - offset) > 0) {
+			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+			if (copy > len)
+				copy = len;
+			sg[elt].page = frag->page;
+			sg[elt].offset = frag->page_offset+offset-start;
+			sg[elt].length = copy;
+			elt++;
+			if (!(len -= copy))
+				return elt;
+			offset += copy;
+		}
+		start = end;
+	}
+
+	if (skb_shinfo(skb)->frag_list) {
+		struct sk_buff *list = skb_shinfo(skb)->frag_list;
+
+		for (; list; list = list->next) {
+			int end;
+
+			BUG_TRAP(start <= offset + len);
+
+			end = start + list->len;
+			if ((copy = end - offset) > 0) {
+				if (copy > len)
+					copy = len;
+				elt += skb_to_sgvec(list, sg+elt, offset - start, copy);
+				if ((len -= copy) == 0)
+					return elt;
+				offset += copy;
+			}
+			start = end;
+		}
+	}
+	BUG_ON(len);
+	return elt;
+}
+
+/**
+ *	skb_cow_data - Check that a socket buffer's data buffers are writable
+ *	@skb: The socket buffer to check.
+ *	@tailbits: Amount of trailing space to be added
+ *	@trailer: Returned pointer to the skb where the @tailbits space begins
+ *
+ *	Make sure that the data buffers attached to a socket buffer are
+ *	writable. If they are not, private copies are made of the data buffers
+ *	and the socket buffer is set to use these instead.
+ *
+ *	If @tailbits is given, make sure that there is space to write @tailbits
+ *	bytes of data beyond current end of socket buffer.  @trailer will be
+ *	set to point to the skb in which this space begins.
+ *
+ *	The number of scatterlist elements required to completely map the
+ *	COW'd and extended socket buffer will be returned.
+ */
+int skb_cow_data(struct sk_buff *skb, int tailbits, struct sk_buff **trailer)
+{
+	int copyflag;
+	int elt;
+	struct sk_buff *skb1, **skb_p;
+
+	/* If skb is cloned or its head is paged, reallocate
+	 * head pulling out all the pages (pages are considered not writable
+	 * at the moment even if they are anonymous).
+	 */
+	if ((skb_cloned(skb) || skb_shinfo(skb)->nr_frags) &&
+	    __pskb_pull_tail(skb, skb_pagelen(skb)-skb_headlen(skb)) == NULL)
+		return -ENOMEM;
+
+	/* Easy case. Most of packets will go this way. */
+	if (!skb_shinfo(skb)->frag_list) {
+		/* A little of trouble, not enough of space for trailer.
+		 * This should not happen, when stack is tuned to generate
+		 * good frames. OK, on miss we reallocate and reserve even more
+		 * space, 128 bytes is fair. */
+
+		if (skb_tailroom(skb) < tailbits &&
+		    pskb_expand_head(skb, 0, tailbits-skb_tailroom(skb)+128, GFP_ATOMIC))
+			return -ENOMEM;
+
+		/* Voila! */
+		*trailer = skb;
+		return 1;
+	}
+
+	/* Misery. We are in troubles, going to mincer fragments... */
+
+	elt = 1;
+	skb_p = &skb_shinfo(skb)->frag_list;
+	copyflag = 0;
+
+	while ((skb1 = *skb_p) != NULL) {
+		int ntail = 0;
+
+		/* The fragment is partially pulled by someone,
+		 * this can happen on input. Copy it and everything
+		 * after it. */
+
+		if (skb_shared(skb1))
+			copyflag = 1;
+
+		/* If the skb is the last, worry about trailer. */
+
+		if (skb1->next == NULL && tailbits) {
+			if (skb_shinfo(skb1)->nr_frags ||
+			    skb_shinfo(skb1)->frag_list ||
+			    skb_tailroom(skb1) < tailbits)
+				ntail = tailbits + 128;
+		}
+
+		if (copyflag ||
+		    skb_cloned(skb1) ||
+		    ntail ||
+		    skb_shinfo(skb1)->nr_frags ||
+		    skb_shinfo(skb1)->frag_list) {
+			struct sk_buff *skb2;
+
+			/* Fuck, we are miserable poor guys... */
+			if (ntail == 0)
+				skb2 = skb_copy(skb1, GFP_ATOMIC);
+			else
+				skb2 = skb_copy_expand(skb1,
+						       skb_headroom(skb1),
+						       ntail,
+						       GFP_ATOMIC);
+			if (unlikely(skb2 == NULL))
+				return -ENOMEM;
+
+			if (skb1->sk)
+				skb_set_owner_w(skb2, skb1->sk);
+
+			/* Looking around. Are we still alive?
+			 * OK, link new skb, drop old one */
+
+			skb2->next = skb1->next;
+			*skb_p = skb2;
+			kfree_skb(skb1);
+			skb1 = skb2;
+		}
+		elt++;
+		*trailer = skb1;
+		skb_p = &skb1->next;
+	}
+
+	return elt;
+}
+
 EXPORT_SYMBOL(___pskb_trim);
 EXPORT_SYMBOL(__kfree_skb);
 EXPORT_SYMBOL(kfree_skb);
@@ -2039,3 +2224,6 @@ EXPORT_SYMBOL(skb_seq_read);
 EXPORT_SYMBOL(skb_abort_seq_read);
 EXPORT_SYMBOL(skb_find_text);
 EXPORT_SYMBOL(skb_append_datato_frags);
+
+EXPORT_SYMBOL_GPL(skb_to_sgvec);
+EXPORT_SYMBOL_GPL(skb_cow_data);
diff --git a/net/xfrm/xfrm_algo.c b/net/xfrm/xfrm_algo.c
index f373a8a..6249a94 100644
--- a/net/xfrm/xfrm_algo.c
+++ b/net/xfrm/xfrm_algo.c
@@ -612,175 +612,6 @@ EXPORT_SYMBOL_GPL(skb_icv_walk);
 
 #if defined(CONFIG_INET_ESP) || defined(CONFIG_INET_ESP_MODULE) || defined(CONFIG_INET6_ESP) || defined(CONFIG_INET6_ESP_MODULE)
 
-/* Looking generic it is not used in another places. */
-
-int
-skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
-{
-	int start = skb_headlen(skb);
-	int i, copy = start - offset;
-	int elt = 0;
-
-	if (copy > 0) {
-		if (copy > len)
-			copy = len;
-		sg[elt].page = virt_to_page(skb->data + offset);
-		sg[elt].offset = (unsigned long)(skb->data + offset) % PAGE_SIZE;
-		sg[elt].length = copy;
-		elt++;
-		if ((len -= copy) == 0)
-			return elt;
-		offset += copy;
-	}
-
-	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
-		int end;
-
-		BUG_TRAP(start <= offset + len);
-
-		end = start + skb_shinfo(skb)->frags[i].size;
-		if ((copy = end - offset) > 0) {
-			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-
-			if (copy > len)
-				copy = len;
-			sg[elt].page = frag->page;
-			sg[elt].offset = frag->page_offset+offset-start;
-			sg[elt].length = copy;
-			elt++;
-			if (!(len -= copy))
-				return elt;
-			offset += copy;
-		}
-		start = end;
-	}
-
-	if (skb_shinfo(skb)->frag_list) {
-		struct sk_buff *list = skb_shinfo(skb)->frag_list;
-
-		for (; list; list = list->next) {
-			int end;
-
-			BUG_TRAP(start <= offset + len);
-
-			end = start + list->len;
-			if ((copy = end - offset) > 0) {
-				if (copy > len)
-					copy = len;
-				elt += skb_to_sgvec(list, sg+elt, offset - start, copy);
-				if ((len -= copy) == 0)
-					return elt;
-				offset += copy;
-			}
-			start = end;
-		}
-	}
-	BUG_ON(len);
-	return elt;
-}
-EXPORT_SYMBOL_GPL(skb_to_sgvec);
-
-/* Check that skb data bits are writable. If they are not, copy data
- * to newly created private area. If "tailbits" is given, make sure that
- * tailbits bytes beyond current end of skb are writable.
- *
- * Returns amount of elements of scatterlist to load for subsequent
- * transformations and pointer to writable trailer skb.
- */
-
-int skb_cow_data(struct sk_buff *skb, int tailbits, struct sk_buff **trailer)
-{
-	int copyflag;
-	int elt;
-	struct sk_buff *skb1, **skb_p;
-
-	/* If skb is cloned or its head is paged, reallocate
-	 * head pulling out all the pages (pages are considered not writable
-	 * at the moment even if they are anonymous).
-	 */
-	if ((skb_cloned(skb) || skb_shinfo(skb)->nr_frags) &&
-	    __pskb_pull_tail(skb, skb_pagelen(skb)-skb_headlen(skb)) == NULL)
-		return -ENOMEM;
-
-	/* Easy case. Most of packets will go this way. */
-	if (!skb_shinfo(skb)->frag_list) {
-		/* A little of trouble, not enough of space for trailer.
-		 * This should not happen, when stack is tuned to generate
-		 * good frames. OK, on miss we reallocate and reserve even more
-		 * space, 128 bytes is fair. */
-
-		if (skb_tailroom(skb) < tailbits &&
-		    pskb_expand_head(skb, 0, tailbits-skb_tailroom(skb)+128, GFP_ATOMIC))
-			return -ENOMEM;
-
-		/* Voila! */
-		*trailer = skb;
-		return 1;
-	}
-
-	/* Misery. We are in troubles, going to mincer fragments... */
-
-	elt = 1;
-	skb_p = &skb_shinfo(skb)->frag_list;
-	copyflag = 0;
-
-	while ((skb1 = *skb_p) != NULL) {
-		int ntail = 0;
-
-		/* The fragment is partially pulled by someone,
-		 * this can happen on input. Copy it and everything
-		 * after it. */
-
-		if (skb_shared(skb1))
-			copyflag = 1;
-
-		/* If the skb is the last, worry about trailer. */
-
-		if (skb1->next == NULL && tailbits) {
-			if (skb_shinfo(skb1)->nr_frags ||
-			    skb_shinfo(skb1)->frag_list ||
-			    skb_tailroom(skb1) < tailbits)
-				ntail = tailbits + 128;
-		}
-
-		if (copyflag ||
-		    skb_cloned(skb1) ||
-		    ntail ||
-		    skb_shinfo(skb1)->nr_frags ||
-		    skb_shinfo(skb1)->frag_list) {
-			struct sk_buff *skb2;
-
-			/* Fuck, we are miserable poor guys... */
-			if (ntail == 0)
-				skb2 = skb_copy(skb1, GFP_ATOMIC);
-			else
-				skb2 = skb_copy_expand(skb1,
-						       skb_headroom(skb1),
-						       ntail,
-						       GFP_ATOMIC);
-			if (unlikely(skb2 == NULL))
-				return -ENOMEM;
-
-			if (skb1->sk)
-				skb_set_owner_w(skb2, skb1->sk);
-
-			/* Looking around. Are we still alive?
-			 * OK, link new skb, drop old one */
-
-			skb2->next = skb1->next;
-			*skb_p = skb2;
-			kfree_skb(skb1);
-			skb1 = skb2;
-		}
-		elt++;
-		*trailer = skb1;
-		skb_p = &skb1->next;
-	}
-
-	return elt;
-}
-EXPORT_SYMBOL_GPL(skb_cow_data);
-
 void *pskb_put(struct sk_buff *skb, struct sk_buff *tail, int len)
 {
 	if (tail != skb) {


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 02/16] cancel_delayed_work: use del_timer() instead of del_timer_sync() [try #3]
  2007-04-25 10:50 ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3] David Howells
  2007-04-25 10:50   ` [PATCH 01/16] AF_RXRPC: Move generic skbuff stuff from XFRM code to generic code " David Howells
@ 2007-04-25 10:50   ` David Howells
  2007-04-25 10:50   ` [PATCH 03/16] AF_RXRPC: Key facility changes for AF_RXRPC " David Howells
                     ` (10 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: David Howells @ 2007-04-25 10:50 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel, linux-fsdevel, netdev, dhowells

del_timer_sync() buys nothing for cancel_delayed_work(), but it is less
efficient since it locks the timer unconditionally, and may wait for the
completion of the delayed_work_timer_fn().

cancel_delayed_work() == 0 means:

	before this patch:
		work->func may still be running or queued

	after this patch:
		work->func may still be running or queued, or
		delayed_work_timer_fn->__queue_work() in progress.

		The latter doesn't differ from the caller's POV,
		delayed_work_timer_fn() is called with _PENDING
		bit set.

cancel_delayed_work() == 1 with this patch adds a new possibility:

	delayed_work->work was cancelled, but delayed_work_timer_fn
	is still running (this is only possible for the re-arming
	works on single-threaded workqueue).

	In this case the timer was re-started by work->func(), nobody
	else can do this. This in turn means that delayed_work_timer_fn
	has already passed __queue_work() (and wont't touch delayed_work)
	because nobody else can queue delayed_work->work.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-Off-By: David Howells <dhowells@redhat.com>
---

 include/linux/workqueue.h |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 2a7b38d..b8abfc7 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -191,14 +191,15 @@ int execute_in_process_context(work_func_t fn, struct execute_work *);
 
 /*
  * Kill off a pending schedule_delayed_work().  Note that the work callback
- * function may still be running on return from cancel_delayed_work().  Run
- * flush_scheduled_work() to wait on it.
+ * function may still be running on return from cancel_delayed_work(), unless
+ * it returns 1 and the work doesn't re-arm itself. Run flush_workqueue() or
+ * cancel_work_sync() to wait on it.
  */
 static inline int cancel_delayed_work(struct delayed_work *work)
 {
 	int ret;
 
-	ret = del_timer_sync(&work->timer);
+	ret = del_timer(&work->timer);
 	if (ret)
 		work_release(&work->work);
 	return ret;

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 03/16] AF_RXRPC: Key facility changes for AF_RXRPC [try #3]
  2007-04-25 10:50 ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3] David Howells
  2007-04-25 10:50   ` [PATCH 01/16] AF_RXRPC: Move generic skbuff stuff from XFRM code to generic code " David Howells
  2007-04-25 10:50   ` [PATCH 02/16] cancel_delayed_work: use del_timer() instead of del_timer_sync() " David Howells
@ 2007-04-25 10:50   ` David Howells
  2007-04-25 10:50   ` [PATCH 04/16] AF_RXRPC: Make it possible to merely try to cancel timers from a module " David Howells
                     ` (9 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: David Howells @ 2007-04-25 10:50 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel, linux-fsdevel, netdev, dhowells

Export the keyring key type definition and document its availability.

Add alternative types into the key's type_data union to make it more useful.
Not all users necessarily want to use it as a list_head (AF_RXRPC doesn't, for
example), so make it clear that it can be used in other ways.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 Documentation/keys.txt  |   12 ++++++++++++
 include/linux/key.h     |    2 ++
 security/keys/keyring.c |    2 ++
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 60c665d..81d9aa0 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -859,6 +859,18 @@ payload contents" for more information.
 	void unregister_key_type(struct key_type *type);
 
 
+Under some circumstances, it may be desirable to desirable to deal with a
+bundle of keys.  The facility provides access to the keyring type for managing
+such a bundle:
+
+	struct key_type key_type_keyring;
+
+This can be used with a function such as request_key() to find a specific
+keyring in a process's keyrings.  A keyring thus found can then be searched
+with keyring_search().  Note that it is not possible to use request_key() to
+search a specific keyring, so using keyrings in this way is of limited utility.
+
+
 ===================================
 NOTES ON ACCESSING PAYLOAD CONTENTS
 ===================================
diff --git a/include/linux/key.h b/include/linux/key.h
index 169f05e..a9220e7 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -160,6 +160,8 @@ struct key {
 	 */
 	union {
 		struct list_head	link;
+		unsigned long		x[2];
+		void			*p[2];
 	} type_data;
 
 	/* key data
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index ad45ce7..88292e3 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -66,6 +66,8 @@ struct key_type key_type_keyring = {
 	.read		= keyring_read,
 };
 
+EXPORT_SYMBOL(key_type_keyring);
+
 /*
  * semaphore to serialise link/link calls to prevent two link calls in parallel
  * introducing a cycle


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 04/16] AF_RXRPC: Make it possible to merely try to cancel timers from a module [try #3]
  2007-04-25 10:50 ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3] David Howells
                     ` (2 preceding siblings ...)
  2007-04-25 10:50   ` [PATCH 03/16] AF_RXRPC: Key facility changes for AF_RXRPC " David Howells
@ 2007-04-25 10:50   ` David Howells
  2007-04-25 10:51   ` [PATCH 07/16] AF_RXRPC: Add an interface to the AF_RXRPC module for the AFS filesystem to use " David Howells
                     ` (8 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: David Howells @ 2007-04-25 10:50 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel, linux-fsdevel, netdev, dhowells

Export try_to_del_timer_sync() for use by the AF_RXRPC module.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 kernel/timer.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index dd6c2c1..b22bd39 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -505,6 +505,8 @@ out:
 	return ret;
 }
 
+EXPORT_SYMBOL(try_to_del_timer_sync);
+
 /**
  * del_timer_sync - deactivate a timer and wait for the handler to finish.
  * @timer: the timer to be deactivated


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 07/16] AF_RXRPC: Add an interface to the AF_RXRPC module for the AFS filesystem to use [try #3]
  2007-04-25 10:50 ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3] David Howells
                     ` (3 preceding siblings ...)
  2007-04-25 10:50   ` [PATCH 04/16] AF_RXRPC: Make it possible to merely try to cancel timers from a module " David Howells
@ 2007-04-25 10:51   ` David Howells
  2007-04-25 10:51   ` [PATCH 10/16] AFS: Handle multiple mounts of an AFS superblock correctly " David Howells
                     ` (7 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: David Howells @ 2007-04-25 10:51 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel, linux-fsdevel, netdev, dhowells

Add an interface to the AF_RXRPC module so that the AFS filesystem module can
more easily make use of the services available.  AFS still opens a socket but
then uses the action functions in lieu of sendmsg() and registers an intercept
functions to grab messages before they're queued on the socket Rx queue.

This permits AFS (or whatever) to:

 (1) Avoid the overhead of using the recvmsg() call.

 (2) Use different keys directly on individual client calls on one socket
     rather than having to open a whole slew of sockets, one for each key it
     might want to use.

 (3) Avoid calling request_key() at the point of issue of a call or opening of
     a socket.  This is done instead by AFS at the point of open(), unlink() or
     other VFS operation and the key handed through.

 (4) Request the use of something other than GFP_KERNEL to allocate memory.

Furthermore:

 (*) The socket buffer markings used by RxRPC are made available for AFS so
     that it can interpret the cooked RxRPC messages itself.

 (*) rxgen (un)marshalling abort codes are made available.


The following documentation for the kernel interface is added to
Documentation/networking/rxrpc.txt:

=========================
AF_RXRPC KERNEL INTERFACE
=========================

The AF_RXRPC module also provides an interface for use by in-kernel utilities
such as the AFS filesystem.  This permits such a utility to:

 (1) Use different keys directly on individual client calls on one socket
     rather than having to open a whole slew of sockets, one for each key it
     might want to use.

 (2) Avoid having RxRPC call request_key() at the point of issue of a call or
     opening of a socket.  Instead the utility is responsible for requesting a
     key at the appropriate point.  AFS, for instance, would do this during VFS
     operations such as open() or unlink().  The key is then handed through
     when the call is initiated.

 (3) Request the use of something other than GFP_KERNEL to allocate memory.

 (4) Avoid the overhead of using the recvmsg() call.  RxRPC messages can be
     intercepted before they get put into the socket Rx queue and the socket
     buffers manipulated directly.

To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket,
bind an addess as appropriate and listen if it's to be a server socket, but
then it passes this to the kernel interface functions.

The kernel interface functions are as follows:

 (*) Begin a new client call.

	struct rxrpc_call *
	rxrpc_kernel_begin_call(struct socket *sock,
				struct sockaddr_rxrpc *srx,
				struct key *key,
				unsigned long user_call_ID,
				gfp_t gfp);

     This allocates the infrastructure to make a new RxRPC call and assigns
     call and connection numbers.  The call will be made on the UDP port that
     the socket is bound to.  The call will go to the destination address of a
     connected client socket unless an alternative is supplied (srx is
     non-NULL).

     If a key is supplied then this will be used to secure the call instead of
     the key bound to the socket with the RXRPC_SECURITY_KEY sockopt.  Calls
     secured in this way will still share connections if at all possible.

     The user_call_ID is equivalent to that supplied to sendmsg() in the
     control data buffer.  It is entirely feasible to use this to point to a
     kernel data structure.

     If this function is successful, an opaque reference to the RxRPC call is
     returned.  The caller now holds a reference on this and it must be
     properly ended.

 (*) End a client call.

	void rxrpc_kernel_end_call(struct rxrpc_call *call);

     This is used to end a previously begun call.  The user_call_ID is expunged
     from AF_RXRPC's knowledge and will not be seen again in association with
     the specified call.

 (*) Send data through a call.

	int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg,
				   size_t len);

     This is used to supply either the request part of a client call or the
     reply part of a server call.  msg.msg_iovlen and msg.msg_iov specify the
     data buffers to be used.  msg_iov may not be NULL and must point
     exclusively to in-kernel virtual addresses.  msg.msg_flags may be given
     MSG_MORE if there will be subsequent data sends for this call.

     The msg must not specify a destination address, control data or any flags
     other than MSG_MORE.  len is the total amount of data to transmit.

 (*) Abort a call.

	void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code);

     This is used to abort a call if it's still in an abortable state.  The
     abort code specified will be placed in the ABORT message sent.

 (*) Intercept received RxRPC messages.

	typedef void (*rxrpc_interceptor_t)(struct sock *sk,
					    unsigned long user_call_ID,
					    struct sk_buff *skb);

	void
	rxrpc_kernel_intercept_rx_messages(struct socket *sock,
					   rxrpc_interceptor_t interceptor);

     This installs an interceptor function on the specified AF_RXRPC socket.
     All messages that would otherwise wind up in the socket's Rx queue are
     then diverted to this function.  Note that care must be taken to process
     the messages in the right order to maintain DATA message sequentiality.

     The interceptor function itself is provided with the address of the socket
     and handling the incoming message, the ID assigned by the kernel utility
     to the call and the socket buffer containing the message.

     The skb->mark field indicates the type of message:

	MARK				MEANING
	===============================	=======================================
	RXRPC_SKB_MARK_DATA		Data message
	RXRPC_SKB_MARK_FINAL_ACK	Final ACK received for an incoming call
	RXRPC_SKB_MARK_BUSY		Client call rejected as server busy
	RXRPC_SKB_MARK_REMOTE_ABORT	Call aborted by peer
	RXRPC_SKB_MARK_NET_ERROR	Network error detected
	RXRPC_SKB_MARK_LOCAL_ERROR	Local error encountered
	RXRPC_SKB_MARK_NEW_CALL		New incoming call awaiting acceptance

     The remote abort message can be probed with rxrpc_kernel_get_abort_code().
     The two error messages can be probed with rxrpc_kernel_get_error_number().
     A new call can be accepted with rxrpc_kernel_accept_call().

     Data messages can have their contents extracted with the usual bunch of
     socket buffer manipulation functions.  A data message can be determined to
     be the last one in a sequence with rxrpc_kernel_is_data_last().  When a
     data message has been used up, rxrpc_kernel_data_delivered() should be
     called on it..

     Non-data messages should be handled to rxrpc_kernel_free_skb() to dispose
     of.  It is possible to get extra refs on all types of message for later
     freeing, but this may pin the state of a call until the message is finally
     freed.

 (*) Accept an incoming call.

	struct rxrpc_call *
	rxrpc_kernel_accept_call(struct socket *sock,
				 unsigned long user_call_ID);

     This is used to accept an incoming call and to assign it a call ID.  This
     function is similar to rxrpc_kernel_begin_call() and calls accepted must
     be ended in the same way.

     If this function is successful, an opaque reference to the RxRPC call is
     returned.  The caller now holds a reference on this and it must be
     properly ended.

 (*) Reject an incoming call.

	int rxrpc_kernel_reject_call(struct socket *sock);

     This is used to reject the first incoming call on the socket's queue with
     a BUSY message.  -ENODATA is returned if there were no incoming calls.
     Other errors may be returned if the call had been aborted (-ECONNABORTED)
     or had timed out (-ETIME).

 (*) Record the delivery of a data message and free it.

	void rxrpc_kernel_data_delivered(struct sk_buff *skb);

     This is used to record a data message as having been delivered and to
     update the ACK state for the call.  The socket buffer will be freed.

 (*) Free a message.

	void rxrpc_kernel_free_skb(struct sk_buff *skb);

     This is used to free a non-DATA socket buffer intercepted from an AF_RXRPC
     socket.

 (*) Determine if a data message is the last one on a call.

	bool rxrpc_kernel_is_data_last(struct sk_buff *skb);

     This is used to determine if a socket buffer holds the last data message
     to be received for a call (true will be returned if it does, false
     if not).

     The data message will be part of the reply on a client call and the
     request on an incoming call.  In the latter case there will be more
     messages, but in the former case there will not.

 (*) Get the abort code from an abort message.

	u32 rxrpc_kernel_get_abort_code(struct sk_buff *skb);

     This is used to extract the abort code from a remote abort message.

 (*) Get the error number from a local or network error message.

	int rxrpc_kernel_get_error_number(struct sk_buff *skb);

     This is used to extract the error number from a message indicating either
     a local error occurred or a network error occurred.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 Documentation/networking/rxrpc.txt |  196 ++++++++++++++++++++++++++++++++++++
 include/net/af_rxrpc.h             |   44 ++++++++
 include/rxrpc/packet.h             |   12 ++
 net/rxrpc/af_rxrpc.c               |  141 ++++++++++++++++++++++++--
 net/rxrpc/ar-accept.c              |  119 +++++++++++++++++++++-
 net/rxrpc/ar-ack.c                 |   10 +-
 net/rxrpc/ar-call.c                |   75 ++++++++------
 net/rxrpc/ar-connection.c          |   28 ++++-
 net/rxrpc/ar-connevent.c           |   20 +++-
 net/rxrpc/ar-error.c               |    6 +
 net/rxrpc/ar-input.c               |   60 ++++++-----
 net/rxrpc/ar-internal.h            |   44 +++-----
 net/rxrpc/ar-local.c               |    2 
 net/rxrpc/ar-output.c              |   84 +++++++++++++++
 net/rxrpc/ar-peer.c                |    2 
 net/rxrpc/ar-recvmsg.c             |   75 +++++++++++++-
 net/rxrpc/ar-skbuff.c              |   16 +++
 net/rxrpc/ar-transport.c           |    8 +
 18 files changed, 813 insertions(+), 129 deletions(-)

diff --git a/Documentation/networking/rxrpc.txt b/Documentation/networking/rxrpc.txt
index 146a73e..21ea5fa 100644
--- a/Documentation/networking/rxrpc.txt
+++ b/Documentation/networking/rxrpc.txt
@@ -25,6 +25,8 @@ Contents of this document:
 
  (*) Example server usage.
 
+ (*) AF_RXRPC kernel interface.
+
 
 ========
 OVERVIEW
@@ -661,3 +663,197 @@ A server would be set up to accept operations in the following manner:
 Note that all the communications for a particular service take place through
 the one server socket, using control messages on sendmsg() and recvmsg() to
 determine the call affected.
+
+
+=========================
+AF_RXRPC KERNEL INTERFACE
+=========================
+
+The AF_RXRPC module also provides an interface for use by in-kernel utilities
+such as the AFS filesystem.  This permits such a utility to:
+
+ (1) Use different keys directly on individual client calls on one socket
+     rather than having to open a whole slew of sockets, one for each key it
+     might want to use.
+
+ (2) Avoid having RxRPC call request_key() at the point of issue of a call or
+     opening of a socket.  Instead the utility is responsible for requesting a
+     key at the appropriate point.  AFS, for instance, would do this during VFS
+     operations such as open() or unlink().  The key is then handed through
+     when the call is initiated.
+
+ (3) Request the use of something other than GFP_KERNEL to allocate memory.
+
+ (4) Avoid the overhead of using the recvmsg() call.  RxRPC messages can be
+     intercepted before they get put into the socket Rx queue and the socket
+     buffers manipulated directly.
+
+To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket,
+bind an addess as appropriate and listen if it's to be a server socket, but
+then it passes this to the kernel interface functions.
+
+The kernel interface functions are as follows:
+
+ (*) Begin a new client call.
+
+	struct rxrpc_call *
+	rxrpc_kernel_begin_call(struct socket *sock,
+				struct sockaddr_rxrpc *srx,
+				struct key *key,
+				unsigned long user_call_ID,
+				gfp_t gfp);
+
+     This allocates the infrastructure to make a new RxRPC call and assigns
+     call and connection numbers.  The call will be made on the UDP port that
+     the socket is bound to.  The call will go to the destination address of a
+     connected client socket unless an alternative is supplied (srx is
+     non-NULL).
+
+     If a key is supplied then this will be used to secure the call instead of
+     the key bound to the socket with the RXRPC_SECURITY_KEY sockopt.  Calls
+     secured in this way will still share connections if at all possible.
+
+     The user_call_ID is equivalent to that supplied to sendmsg() in the
+     control data buffer.  It is entirely feasible to use this to point to a
+     kernel data structure.
+
+     If this function is successful, an opaque reference to the RxRPC call is
+     returned.  The caller now holds a reference on this and it must be
+     properly ended.
+
+ (*) End a client call.
+
+	void rxrpc_kernel_end_call(struct rxrpc_call *call);
+
+     This is used to end a previously begun call.  The user_call_ID is expunged
+     from AF_RXRPC's knowledge and will not be seen again in association with
+     the specified call.
+
+ (*) Send data through a call.
+
+	int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg,
+				   size_t len);
+
+     This is used to supply either the request part of a client call or the
+     reply part of a server call.  msg.msg_iovlen and msg.msg_iov specify the
+     data buffers to be used.  msg_iov may not be NULL and must point
+     exclusively to in-kernel virtual addresses.  msg.msg_flags may be given
+     MSG_MORE if there will be subsequent data sends for this call.
+
+     The msg must not specify a destination address, control data or any flags
+     other than MSG_MORE.  len is the total amount of data to transmit.
+
+ (*) Abort a call.
+
+	void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code);
+
+     This is used to abort a call if it's still in an abortable state.  The
+     abort code specified will be placed in the ABORT message sent.
+
+ (*) Intercept received RxRPC messages.
+
+	typedef void (*rxrpc_interceptor_t)(struct sock *sk,
+					    unsigned long user_call_ID,
+					    struct sk_buff *skb);
+
+	void
+	rxrpc_kernel_intercept_rx_messages(struct socket *sock,
+					   rxrpc_interceptor_t interceptor);
+
+     This installs an interceptor function on the specified AF_RXRPC socket.
+     All messages that would otherwise wind up in the socket's Rx queue are
+     then diverted to this function.  Note that care must be taken to process
+     the messages in the right order to maintain DATA message sequentiality.
+
+     The interceptor function itself is provided with the address of the socket
+     and handling the incoming message, the ID assigned by the kernel utility
+     to the call and the socket buffer containing the message.
+
+     The skb->mark field indicates the type of message:
+
+	MARK				MEANING
+	===============================	=======================================
+	RXRPC_SKB_MARK_DATA		Data message
+	RXRPC_SKB_MARK_FINAL_ACK	Final ACK received for an incoming call
+	RXRPC_SKB_MARK_BUSY		Client call rejected as server busy
+	RXRPC_SKB_MARK_REMOTE_ABORT	Call aborted by peer
+	RXRPC_SKB_MARK_NET_ERROR	Network error detected
+	RXRPC_SKB_MARK_LOCAL_ERROR	Local error encountered
+	RXRPC_SKB_MARK_NEW_CALL		New incoming call awaiting acceptance
+
+     The remote abort message can be probed with rxrpc_kernel_get_abort_code().
+     The two error messages can be probed with rxrpc_kernel_get_error_number().
+     A new call can be accepted with rxrpc_kernel_accept_call().
+
+     Data messages can have their contents extracted with the usual bunch of
+     socket buffer manipulation functions.  A data message can be determined to
+     be the last one in a sequence with rxrpc_kernel_is_data_last().  When a
+     data message has been used up, rxrpc_kernel_data_delivered() should be
+     called on it..
+
+     Non-data messages should be handled to rxrpc_kernel_free_skb() to dispose
+     of.  It is possible to get extra refs on all types of message for later
+     freeing, but this may pin the state of a call until the message is finally
+     freed.
+
+ (*) Accept an incoming call.
+
+	struct rxrpc_call *
+	rxrpc_kernel_accept_call(struct socket *sock,
+				 unsigned long user_call_ID);
+
+     This is used to accept an incoming call and to assign it a call ID.  This
+     function is similar to rxrpc_kernel_begin_call() and calls accepted must
+     be ended in the same way.
+
+     If this function is successful, an opaque reference to the RxRPC call is
+     returned.  The caller now holds a reference on this and it must be
+     properly ended.
+
+ (*) Reject an incoming call.
+
+	int rxrpc_kernel_reject_call(struct socket *sock);
+
+     This is used to reject the first incoming call on the socket's queue with
+     a BUSY message.  -ENODATA is returned if there were no incoming calls.
+     Other errors may be returned if the call had been aborted (-ECONNABORTED)
+     or had timed out (-ETIME).
+
+ (*) Record the delivery of a data message and free it.
+
+	void rxrpc_kernel_data_delivered(struct sk_buff *skb);
+
+     This is used to record a data message as having been delivered and to
+     update the ACK state for the call.  The socket buffer will be freed.
+
+ (*) Free a message.
+
+	void rxrpc_kernel_free_skb(struct sk_buff *skb);
+
+     This is used to free a non-DATA socket buffer intercepted from an AF_RXRPC
+     socket.
+
+ (*) Determine if a data message is the last one on a call.
+
+	bool rxrpc_kernel_is_data_last(struct sk_buff *skb);
+
+     This is used to determine if a socket buffer holds the last data message
+     to be received for a call (true will be returned if it does, false
+     if not).
+
+     The data message will be part of the reply on a client call and the
+     request on an incoming call.  In the latter case there will be more
+     messages, but in the former case there will not.
+
+ (*) Get the abort code from an abort message.
+
+	u32 rxrpc_kernel_get_abort_code(struct sk_buff *skb);
+
+     This is used to extract the abort code from a remote abort message.
+
+ (*) Get the error number from a local or network error message.
+
+	int rxrpc_kernel_get_error_number(struct sk_buff *skb);
+
+     This is used to extract the error number from a message indicating either
+     a local error occurred or a network error occurred.
diff --git a/include/net/af_rxrpc.h b/include/net/af_rxrpc.h
index b01ca25..00c2eaa 100644
--- a/include/net/af_rxrpc.h
+++ b/include/net/af_rxrpc.h
@@ -1,6 +1,6 @@
-/* RxRPC definitions
+/* RxRPC kernel service interface definitions
  *
- * Copyright (C) 2006 Red Hat, Inc. All Rights Reserved.
+ * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
  * Written by David Howells (dhowells@redhat.com)
  *
  * This program is free software; you can redistribute it and/or
@@ -12,6 +12,46 @@
 #ifndef _NET_RXRPC_H
 #define _NET_RXRPC_H
 
+#ifdef __KERNEL__
+
 #include <linux/rxrpc.h>
 
+struct rxrpc_call;
+
+/*
+ * the mark applied to socket buffers that may be intercepted
+ */
+enum {
+	RXRPC_SKB_MARK_DATA,		/* data message */
+	RXRPC_SKB_MARK_FINAL_ACK,	/* final ACK received message */
+	RXRPC_SKB_MARK_BUSY,		/* server busy message */
+	RXRPC_SKB_MARK_REMOTE_ABORT,	/* remote abort message */
+	RXRPC_SKB_MARK_NET_ERROR,	/* network error message */
+	RXRPC_SKB_MARK_LOCAL_ERROR,	/* local error message */
+	RXRPC_SKB_MARK_NEW_CALL,	/* local error message */
+};
+
+typedef void (*rxrpc_interceptor_t)(struct sock *, unsigned long,
+				    struct sk_buff *);
+extern void rxrpc_kernel_intercept_rx_messages(struct socket *,
+					       rxrpc_interceptor_t);
+extern struct rxrpc_call *rxrpc_kernel_begin_call(struct socket *,
+						  struct sockaddr_rxrpc *,
+						  struct key *,
+						  unsigned long,
+						  gfp_t);
+extern int rxrpc_kernel_send_data(struct rxrpc_call *, struct msghdr *,
+				  size_t);
+extern void rxrpc_kernel_abort_call(struct rxrpc_call *, u32);
+extern void rxrpc_kernel_end_call(struct rxrpc_call *);
+extern bool rxrpc_kernel_is_data_last(struct sk_buff *);
+extern u32 rxrpc_kernel_get_abort_code(struct sk_buff *);
+extern int rxrpc_kernel_get_error_number(struct sk_buff *);
+extern void rxrpc_kernel_data_delivered(struct sk_buff *);
+extern void rxrpc_kernel_free_skb(struct sk_buff *);
+extern struct rxrpc_call *rxrpc_kernel_accept_call(struct socket *,
+						   unsigned long);
+extern int rxrpc_kernel_reject_call(struct socket *);
+
+#endif /* __KERNEL__ */
 #endif /* _NET_RXRPC_H */
diff --git a/include/rxrpc/packet.h b/include/rxrpc/packet.h
index 452a9bb..09b11a1 100644
--- a/include/rxrpc/packet.h
+++ b/include/rxrpc/packet.h
@@ -186,6 +186,18 @@ struct rxkad_response {
 #define RX_DEBUGI_BADTYPE	-8	/* bad debugging packet type */
 
 /*
+ * (un)marshalling abort codes (rxgen)
+ */
+#define	RXGEN_CC_MARSHAL    -450
+#define	RXGEN_CC_UNMARSHAL  -451
+#define	RXGEN_SS_MARSHAL    -452
+#define	RXGEN_SS_UNMARSHAL  -453
+#define	RXGEN_DECODE	    -454
+#define	RXGEN_OPCODE	    -455
+#define	RXGEN_SS_XDRFREE    -456
+#define	RXGEN_CC_XDRFREE    -457
+
+/*
  * Rx kerberos security abort codes
  * - unfortunately we have no generalised security abort codes to say things
  *   like "unsupported security", so we have to use these instead and hope the
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 54b93d7..9e37e4f 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -41,6 +41,8 @@ atomic_t rxrpc_debug_id;
 /* count of skbs currently in use */
 atomic_t rxrpc_n_skbs;
 
+struct workqueue_struct *rxrpc_workqueue;
+
 static void rxrpc_sock_destructor(struct sock *);
 
 /*
@@ -214,7 +216,8 @@ static int rxrpc_listen(struct socket *sock, int backlog)
  */
 static struct rxrpc_transport *rxrpc_name_to_transport(struct socket *sock,
 						       struct sockaddr *addr,
-						       int addr_len, int flags)
+						       int addr_len, int flags,
+						       gfp_t gfp)
 {
 	struct sockaddr_rxrpc *srx = (struct sockaddr_rxrpc *) addr;
 	struct rxrpc_transport *trans;
@@ -232,17 +235,129 @@ static struct rxrpc_transport *rxrpc_name_to_transport(struct socket *sock,
 		return ERR_PTR(-EAFNOSUPPORT);
 
 	/* find a remote transport endpoint from the local one */
-	peer = rxrpc_get_peer(srx, GFP_KERNEL);
+	peer = rxrpc_get_peer(srx, gfp);
 	if (IS_ERR(peer))
 		return ERR_PTR(PTR_ERR(peer));
 
 	/* find a transport */
-	trans = rxrpc_get_transport(rx->local, peer, GFP_KERNEL);
+	trans = rxrpc_get_transport(rx->local, peer, gfp);
 	rxrpc_put_peer(peer);
 	_leave(" = %p", trans);
 	return trans;
 }
 
+/**
+ * rxrpc_kernel_begin_call - Allow a kernel service to begin a call
+ * @sock: The socket on which to make the call
+ * @srx: The address of the peer to contact (defaults to socket setting)
+ * @key: The security context to use (defaults to socket setting)
+ * @user_call_ID: The ID to use
+ *
+ * Allow a kernel service to begin a call on the nominated socket.  This just
+ * sets up all the internal tracking structures and allocates connection and
+ * call IDs as appropriate.  The call to be used is returned.
+ *
+ * The default socket destination address and security may be overridden by
+ * supplying @srx and @key.
+ */
+struct rxrpc_call *rxrpc_kernel_begin_call(struct socket *sock,
+					   struct sockaddr_rxrpc *srx,
+					   struct key *key,
+					   unsigned long user_call_ID,
+					   gfp_t gfp)
+{
+	struct rxrpc_conn_bundle *bundle;
+	struct rxrpc_transport *trans;
+	struct rxrpc_call *call;
+	struct rxrpc_sock *rx = rxrpc_sk(sock->sk);
+	__be16 service_id;
+
+	_enter(",,%x,%lx", key_serial(key), user_call_ID);
+
+	lock_sock(&rx->sk);
+
+	if (srx) {
+		trans = rxrpc_name_to_transport(sock, (struct sockaddr *) srx,
+						sizeof(*srx), 0, gfp);
+		if (IS_ERR(trans)) {
+			call = ERR_PTR(PTR_ERR(trans));
+			trans = NULL;
+			goto out;
+		}
+	} else {
+		trans = rx->trans;
+		if (!trans) {
+			call = ERR_PTR(-ENOTCONN);
+			goto out;
+		}
+		atomic_inc(&trans->usage);
+	}
+
+	service_id = rx->service_id;
+	if (srx)
+		service_id = htons(srx->srx_service);
+
+	if (!key)
+		key = rx->key;
+	if (key && !key->payload.data)
+		key = NULL; /* a no-security key */
+
+	bundle = rxrpc_get_bundle(rx, trans, key, service_id, gfp);
+	if (IS_ERR(bundle)) {
+		call = ERR_PTR(PTR_ERR(bundle));
+		goto out;
+	}
+
+	call = rxrpc_get_client_call(rx, trans, bundle, user_call_ID, true,
+				     gfp);
+	rxrpc_put_bundle(trans, bundle);
+out:
+	rxrpc_put_transport(trans);
+	release_sock(&rx->sk);
+	_leave(" = %p", call);
+	return call;
+}
+
+EXPORT_SYMBOL(rxrpc_kernel_begin_call);
+
+/**
+ * rxrpc_kernel_end_call - Allow a kernel service to end a call it was using
+ * @call: The call to end
+ *
+ * Allow a kernel service to end a call it was using.  The call must be
+ * complete before this is called (the call should be aborted if necessary).
+ */
+void rxrpc_kernel_end_call(struct rxrpc_call *call)
+{
+	_enter("%d{%d}", call->debug_id, atomic_read(&call->usage));
+	rxrpc_remove_user_ID(call->socket, call);
+	rxrpc_put_call(call);
+}
+
+EXPORT_SYMBOL(rxrpc_kernel_end_call);
+
+/**
+ * rxrpc_kernel_intercept_rx_messages - Intercept received RxRPC messages
+ * @sock: The socket to intercept received messages on
+ * @interceptor: The function to pass the messages to
+ *
+ * Allow a kernel service to intercept messages heading for the Rx queue on an
+ * RxRPC socket.  They get passed to the specified function instead.
+ * @interceptor should free the socket buffers it is given.  @interceptor is
+ * called with the socket receive queue spinlock held and softirqs disabled -
+ * this ensures that the messages will be delivered in the right order.
+ */
+void rxrpc_kernel_intercept_rx_messages(struct socket *sock,
+					rxrpc_interceptor_t interceptor)
+{
+	struct rxrpc_sock *rx = rxrpc_sk(sock->sk);
+
+	_enter("");
+	rx->interceptor = interceptor;
+}
+
+EXPORT_SYMBOL(rxrpc_kernel_intercept_rx_messages);
+
 /*
  * connect an RxRPC socket
  * - this just targets it at a specific destination; no actual connection
@@ -294,7 +409,8 @@ static int rxrpc_connect(struct socket *sock, struct sockaddr *addr,
 		return -EBUSY; /* server sockets can't connect as well */
 	}
 
-	trans = rxrpc_name_to_transport(sock, addr, addr_len, flags);
+	trans = rxrpc_name_to_transport(sock, addr, addr_len, flags,
+					GFP_KERNEL);
 	if (IS_ERR(trans)) {
 		release_sock(&rx->sk);
 		_leave(" = %ld", PTR_ERR(trans));
@@ -344,7 +460,7 @@ static int rxrpc_sendmsg(struct kiocb *iocb, struct socket *sock,
 	if (m->msg_name) {
 		ret = -EISCONN;
 		trans = rxrpc_name_to_transport(sock, m->msg_name,
-						m->msg_namelen, 0);
+						m->msg_namelen, 0, GFP_KERNEL);
 		if (IS_ERR(trans)) {
 			ret = PTR_ERR(trans);
 			trans = NULL;
@@ -576,7 +692,7 @@ static int rxrpc_release_sock(struct sock *sk)
 
 	/* try to flush out this socket */
 	rxrpc_release_calls_on_socket(rx);
-	flush_scheduled_work();
+	flush_workqueue(rxrpc_workqueue);
 	rxrpc_purge_queue(&sk->sk_receive_queue);
 
 	if (rx->conn) {
@@ -673,15 +789,21 @@ static int __init af_rxrpc_init(void)
 
 	rxrpc_epoch = htonl(xtime.tv_sec);
 
+	ret = -ENOMEM;
 	rxrpc_call_jar = kmem_cache_create(
 		"rxrpc_call_jar", sizeof(struct rxrpc_call), 0,
 		SLAB_HWCACHE_ALIGN, NULL, NULL);
 	if (!rxrpc_call_jar) {
 		printk(KERN_NOTICE "RxRPC: Failed to allocate call jar\n");
-		ret = -ENOMEM;
 		goto error_call_jar;
 	}
 
+	rxrpc_workqueue = create_workqueue("krxrpcd");
+	if (!rxrpc_workqueue) {
+		printk(KERN_NOTICE "RxRPC: Failed to allocate work queue\n");
+		goto error_work_queue;
+	}
+
 	ret = proto_register(&rxrpc_proto, 1);
         if (ret < 0) {
                 printk(KERN_CRIT "RxRPC: Cannot register protocol\n");
@@ -719,6 +841,8 @@ error_key_type:
 error_sock:
 	proto_unregister(&rxrpc_proto);
 error_proto:
+	destroy_workqueue(rxrpc_workqueue);
+error_work_queue:
 	kmem_cache_destroy(rxrpc_call_jar);
 error_call_jar:
 	return ret;
@@ -743,9 +867,10 @@ static void __exit af_rxrpc_exit(void)
 	ASSERTCMP(atomic_read(&rxrpc_n_skbs), ==, 0);
 
 	_debug("flush scheduled work");
-	flush_scheduled_work();
+	flush_workqueue(rxrpc_workqueue);
 	proc_net_remove("rxrpc_conns");
 	proc_net_remove("rxrpc_calls");
+	destroy_workqueue(rxrpc_workqueue);
 	kmem_cache_destroy(rxrpc_call_jar);
 	_leave("");
 }
diff --git a/net/rxrpc/ar-accept.c b/net/rxrpc/ar-accept.c
index b988e0f..73243ab 100644
--- a/net/rxrpc/ar-accept.c
+++ b/net/rxrpc/ar-accept.c
@@ -139,7 +139,7 @@ static int rxrpc_accept_incoming_call(struct rxrpc_local *local,
 			call->conn->state = RXRPC_CONN_SERVER_CHALLENGING;
 			atomic_inc(&call->conn->usage);
 			set_bit(RXRPC_CONN_CHALLENGE, &call->conn->events);
-			schedule_work(&call->conn->processor);
+			rxrpc_queue_conn(call->conn);
 		} else {
 			_debug("conn ready");
 			call->state = RXRPC_CALL_SERVER_ACCEPTING;
@@ -183,7 +183,7 @@ invalid_service:
 	if (!test_bit(RXRPC_CALL_RELEASE, &call->flags) &&
 	    !test_and_set_bit(RXRPC_CALL_RELEASE, &call->events)) {
 		rxrpc_get_call(call);
-		schedule_work(&call->processor);
+		rxrpc_queue_call(call);
 	}
 	read_unlock_bh(&call->state_lock);
 	rxrpc_put_call(call);
@@ -310,7 +310,8 @@ security_mismatch:
  * handle acceptance of a call by userspace
  * - assign the user call ID to the call at the front of the queue
  */
-int rxrpc_accept_call(struct rxrpc_sock *rx, unsigned long user_call_ID)
+struct rxrpc_call *rxrpc_accept_call(struct rxrpc_sock *rx,
+				     unsigned long user_call_ID)
 {
 	struct rxrpc_call *call;
 	struct rb_node *parent, **pp;
@@ -374,12 +375,76 @@ int rxrpc_accept_call(struct rxrpc_sock *rx, unsigned long user_call_ID)
 		BUG();
 	if (test_and_set_bit(RXRPC_CALL_ACCEPTED, &call->events))
 		BUG();
-	schedule_work(&call->processor);
+	rxrpc_queue_call(call);
 
+	rxrpc_get_call(call);
 	write_unlock_bh(&call->state_lock);
 	write_unlock(&rx->call_lock);
-	_leave(" = 0");
-	return 0;
+	_leave(" = %p{%d}", call, call->debug_id);
+	return call;
+
+	/* if the call is already dying or dead, then we leave the socket's ref
+	 * on it to be released by rxrpc_dead_call_expired() as induced by
+	 * rxrpc_release_call() */
+out_release:
+	_debug("release %p", call);
+	if (!test_bit(RXRPC_CALL_RELEASED, &call->flags) &&
+	    !test_and_set_bit(RXRPC_CALL_RELEASE, &call->events))
+		rxrpc_queue_call(call);
+out_discard:
+	write_unlock_bh(&call->state_lock);
+	_debug("discard %p", call);
+out:
+	write_unlock(&rx->call_lock);
+	_leave(" = %d", ret);
+	return ERR_PTR(ret);
+}
+
+/*
+ * handle rejectance of a call by userspace
+ * - reject the call at the front of the queue
+ */
+int rxrpc_reject_call(struct rxrpc_sock *rx)
+{
+	struct rxrpc_call *call;
+	int ret;
+
+	_enter("");
+
+	ASSERT(!irqs_disabled());
+
+	write_lock(&rx->call_lock);
+
+	ret = -ENODATA;
+	if (list_empty(&rx->acceptq))
+		goto out;
+
+	/* dequeue the first call and check it's still valid */
+	call = list_entry(rx->acceptq.next, struct rxrpc_call, accept_link);
+	list_del_init(&call->accept_link);
+	sk_acceptq_removed(&rx->sk);
+
+	write_lock_bh(&call->state_lock);
+	switch (call->state) {
+	case RXRPC_CALL_SERVER_ACCEPTING:
+		call->state = RXRPC_CALL_SERVER_BUSY;
+		if (test_and_set_bit(RXRPC_CALL_REJECT_BUSY, &call->events))
+			rxrpc_queue_call(call);
+		ret = 0;
+		goto out_release;
+	case RXRPC_CALL_REMOTELY_ABORTED:
+	case RXRPC_CALL_LOCALLY_ABORTED:
+		ret = -ECONNABORTED;
+		goto out_release;
+	case RXRPC_CALL_NETWORK_ERROR:
+		ret = call->conn->error;
+		goto out_release;
+	case RXRPC_CALL_DEAD:
+		ret = -ETIME;
+		goto out_discard;
+	default:
+		BUG();
+	}
 
 	/* if the call is already dying or dead, then we leave the socket's ref
 	 * on it to be released by rxrpc_dead_call_expired() as induced by
@@ -388,7 +453,7 @@ out_release:
 	_debug("release %p", call);
 	if (!test_bit(RXRPC_CALL_RELEASED, &call->flags) &&
 	    !test_and_set_bit(RXRPC_CALL_RELEASE, &call->events))
-		schedule_work(&call->processor);
+		rxrpc_queue_call(call);
 out_discard:
 	write_unlock_bh(&call->state_lock);
 	_debug("discard %p", call);
@@ -397,3 +462,43 @@ out:
 	_leave(" = %d", ret);
 	return ret;
 }
+
+/**
+ * rxrpc_kernel_accept_call - Allow a kernel service to accept an incoming call
+ * @sock: The socket on which the impending call is waiting
+ * @user_call_ID: The tag to attach to the call
+ *
+ * Allow a kernel service to accept an incoming call, assuming the incoming
+ * call is still valid.
+ */
+struct rxrpc_call *rxrpc_kernel_accept_call(struct socket *sock,
+					    unsigned long user_call_ID)
+{
+	struct rxrpc_call *call;
+
+	_enter(",%lx", user_call_ID);
+	call = rxrpc_accept_call(rxrpc_sk(sock->sk), user_call_ID);
+	_leave(" = %p", call);
+	return call;
+}
+
+EXPORT_SYMBOL(rxrpc_kernel_accept_call);
+
+/**
+ * rxrpc_kernel_reject_call - Allow a kernel service to reject an incoming call
+ * @sock: The socket on which the impending call is waiting
+ *
+ * Allow a kernel service to reject an incoming call with a BUSY message,
+ * assuming the incoming call is still valid.
+ */
+int rxrpc_kernel_reject_call(struct socket *sock)
+{
+	int ret;
+
+	_enter("");
+	ret = rxrpc_reject_call(rxrpc_sk(sock->sk));
+	_leave(" = %d", ret);
+	return ret;
+}
+
+EXPORT_SYMBOL(rxrpc_kernel_reject_call);
diff --git a/net/rxrpc/ar-ack.c b/net/rxrpc/ar-ack.c
index 8f7764e..fc07a92 100644
--- a/net/rxrpc/ar-ack.c
+++ b/net/rxrpc/ar-ack.c
@@ -113,7 +113,7 @@ cancel_timer:
 	read_lock_bh(&call->state_lock);
 	if (call->state <= RXRPC_CALL_COMPLETE &&
 	    !test_and_set_bit(RXRPC_CALL_ACK, &call->events))
-		schedule_work(&call->processor);
+		rxrpc_queue_call(call);
 	read_unlock_bh(&call->state_lock);
 }
 
@@ -1166,7 +1166,7 @@ send_message_2:
 		_debug("sendmsg failed: %d", ret);
 		read_lock_bh(&call->state_lock);
 		if (call->state < RXRPC_CALL_DEAD)
-			schedule_work(&call->processor);
+			rxrpc_queue_call(call);
 		read_unlock_bh(&call->state_lock);
 		goto error;
 	}
@@ -1210,7 +1210,7 @@ maybe_reschedule:
 	if (call->events || !skb_queue_empty(&call->rx_queue)) {
 		read_lock_bh(&call->state_lock);
 		if (call->state < RXRPC_CALL_DEAD)
-			schedule_work(&call->processor);
+			rxrpc_queue_call(call);
 		read_unlock_bh(&call->state_lock);
 	}
 
@@ -1224,7 +1224,7 @@ maybe_reschedule:
 		read_lock_bh(&call->state_lock);
 		if (!test_bit(RXRPC_CALL_RELEASED, &call->flags) &&
 		    !test_and_set_bit(RXRPC_CALL_RELEASE, &call->events))
-			schedule_work(&call->processor);
+			rxrpc_queue_call(call);
 		read_unlock_bh(&call->state_lock);
 	}
 
@@ -1238,7 +1238,7 @@ error:
 	 * work pending bit and the work item being processed again */
 	if (call->events && !work_pending(&call->processor)) {
 		_debug("jumpstart %x", ntohl(call->conn->cid));
-		schedule_work(&call->processor);
+		rxrpc_queue_call(call);
 	}
 
 	_leave("");
diff --git a/net/rxrpc/ar-call.c b/net/rxrpc/ar-call.c
index ac31cce..4d92d88 100644
--- a/net/rxrpc/ar-call.c
+++ b/net/rxrpc/ar-call.c
@@ -19,7 +19,7 @@ struct kmem_cache *rxrpc_call_jar;
 LIST_HEAD(rxrpc_calls);
 DEFINE_RWLOCK(rxrpc_call_lock);
 static unsigned rxrpc_call_max_lifetime = 60;
-static unsigned rxrpc_dead_call_timeout = 10;
+static unsigned rxrpc_dead_call_timeout = 2;
 
 static void rxrpc_destroy_call(struct work_struct *work);
 static void rxrpc_call_life_expired(unsigned long _call);
@@ -264,7 +264,7 @@ struct rxrpc_call *rxrpc_incoming_call(struct rxrpc_sock *rx,
 		switch (call->state) {
 		case RXRPC_CALL_LOCALLY_ABORTED:
 			if (!test_and_set_bit(RXRPC_CALL_ABORT, &call->events))
-				schedule_work(&call->processor);
+				rxrpc_queue_call(call);
 		case RXRPC_CALL_REMOTELY_ABORTED:
 			read_unlock(&call->state_lock);
 			goto aborted_call;
@@ -398,6 +398,7 @@ found_extant_call:
  */
 void rxrpc_release_call(struct rxrpc_call *call)
 {
+	struct rxrpc_connection *conn = call->conn;
 	struct rxrpc_sock *rx = call->socket;
 
 	_enter("{%d,%d,%d,%d}",
@@ -413,8 +414,7 @@ void rxrpc_release_call(struct rxrpc_call *call)
 	/* dissociate from the socket
 	 * - the socket's ref on the call is passed to the death timer
 	 */
-	_debug("RELEASE CALL %p (%d CONN %p)",
-	       call, call->debug_id, call->conn);
+	_debug("RELEASE CALL %p (%d CONN %p)", call, call->debug_id, conn);
 
 	write_lock_bh(&rx->call_lock);
 	if (!list_empty(&call->accept_link)) {
@@ -430,24 +430,42 @@ void rxrpc_release_call(struct rxrpc_call *call)
 	}
 	write_unlock_bh(&rx->call_lock);
 
-	if (call->conn->out_clientflag)
-		spin_lock(&call->conn->trans->client_lock);
-	write_lock_bh(&call->conn->lock);
-
 	/* free up the channel for reuse */
-	if (call->conn->out_clientflag) {
-		call->conn->avail_calls++;
-		if (call->conn->avail_calls == RXRPC_MAXCALLS)
-			list_move_tail(&call->conn->bundle_link,
-				       &call->conn->bundle->unused_conns);
-		else if (call->conn->avail_calls == 1)
-			list_move_tail(&call->conn->bundle_link,
-				       &call->conn->bundle->avail_conns);
+	spin_lock(&conn->trans->client_lock);
+	write_lock_bh(&conn->lock);
+	write_lock(&call->state_lock);
+
+	if (conn->channels[call->channel] == call)
+		conn->channels[call->channel] = NULL;
+
+	if (conn->out_clientflag && conn->bundle) {
+		conn->avail_calls++;
+		switch (conn->avail_calls) {
+		case 1:
+			list_move_tail(&conn->bundle_link,
+				       &conn->bundle->avail_conns);
+		case 2 ... RXRPC_MAXCALLS - 1:
+			ASSERT(conn->channels[0] == NULL ||
+			       conn->channels[1] == NULL ||
+			       conn->channels[2] == NULL ||
+			       conn->channels[3] == NULL);
+			break;
+		case RXRPC_MAXCALLS:
+			list_move_tail(&conn->bundle_link,
+				       &conn->bundle->unused_conns);
+			ASSERT(conn->channels[0] == NULL &&
+			       conn->channels[1] == NULL &&
+			       conn->channels[2] == NULL &&
+			       conn->channels[3] == NULL);
+			break;
+		default:
+			printk(KERN_ERR "RxRPC: conn->avail_calls=%d\n",
+			       conn->avail_calls);
+			BUG();
+		}
 	}
 
-	write_lock(&call->state_lock);
-	if (call->conn->channels[call->channel] == call)
-		call->conn->channels[call->channel] = NULL;
+	spin_unlock(&conn->trans->client_lock);
 
 	if (call->state < RXRPC_CALL_COMPLETE &&
 	    call->state != RXRPC_CALL_CLIENT_FINAL_ACK) {
@@ -455,13 +473,12 @@ void rxrpc_release_call(struct rxrpc_call *call)
 		call->state = RXRPC_CALL_LOCALLY_ABORTED;
 		call->abort_code = RX_CALL_DEAD;
 		set_bit(RXRPC_CALL_ABORT, &call->events);
-		schedule_work(&call->processor);
+		rxrpc_queue_call(call);
 	}
 	write_unlock(&call->state_lock);
-	write_unlock_bh(&call->conn->lock);
-	if (call->conn->out_clientflag)
-		spin_unlock(&call->conn->trans->client_lock);
+	write_unlock_bh(&conn->lock);
 
+	/* clean up the Rx queue */
 	if (!skb_queue_empty(&call->rx_queue) ||
 	    !skb_queue_empty(&call->rx_oos_queue)) {
 		struct rxrpc_skb_priv *sp;
@@ -538,7 +555,7 @@ static void rxrpc_mark_call_released(struct rxrpc_call *call)
 		if (!test_and_set_bit(RXRPC_CALL_RELEASE, &call->events))
 			sched = true;
 		if (sched)
-			schedule_work(&call->processor);
+			rxrpc_queue_call(call);
 	}
 	write_unlock(&call->state_lock);
 }
@@ -588,7 +605,7 @@ void __rxrpc_put_call(struct rxrpc_call *call)
 	if (atomic_dec_and_test(&call->usage)) {
 		_debug("call %d dead", call->debug_id);
 		ASSERTCMP(call->state, ==, RXRPC_CALL_DEAD);
-		schedule_work(&call->destroyer);
+		rxrpc_queue_work(&call->destroyer);
 	}
 	_leave("");
 }
@@ -613,7 +630,7 @@ static void rxrpc_cleanup_call(struct rxrpc_call *call)
 	ASSERTCMP(call->events, ==, 0);
 	if (work_pending(&call->processor)) {
 		_debug("defer destroy");
-		schedule_work(&call->destroyer);
+		rxrpc_queue_work(&call->destroyer);
 		return;
 	}
 
@@ -742,7 +759,7 @@ static void rxrpc_call_life_expired(unsigned long _call)
 	read_lock_bh(&call->state_lock);
 	if (call->state < RXRPC_CALL_COMPLETE) {
 		set_bit(RXRPC_CALL_LIFE_TIMER, &call->events);
-		schedule_work(&call->processor);
+		rxrpc_queue_call(call);
 	}
 	read_unlock_bh(&call->state_lock);
 }
@@ -763,7 +780,7 @@ static void rxrpc_resend_time_expired(unsigned long _call)
 	clear_bit(RXRPC_CALL_RUN_RTIMER, &call->flags);
 	if (call->state < RXRPC_CALL_COMPLETE &&
 	    !test_and_set_bit(RXRPC_CALL_RESEND_TIMER, &call->events))
-		schedule_work(&call->processor);
+		rxrpc_queue_call(call);
 	read_unlock_bh(&call->state_lock);
 }
 
@@ -782,6 +799,6 @@ static void rxrpc_ack_time_expired(unsigned long _call)
 	read_lock_bh(&call->state_lock);
 	if (call->state < RXRPC_CALL_COMPLETE &&
 	    !test_and_set_bit(RXRPC_CALL_ACK, &call->events))
-		schedule_work(&call->processor);
+		rxrpc_queue_call(call);
 	read_unlock_bh(&call->state_lock);
 }
diff --git a/net/rxrpc/ar-connection.c b/net/rxrpc/ar-connection.c
index 01eb33c..43cb3e0 100644
--- a/net/rxrpc/ar-connection.c
+++ b/net/rxrpc/ar-connection.c
@@ -356,7 +356,7 @@ static int rxrpc_connect_exclusive(struct rxrpc_sock *rx,
 		conn->out_clientflag = RXRPC_CLIENT_INITIATED;
 		conn->cid = 0;
 		conn->state = RXRPC_CONN_CLIENT;
-		conn->avail_calls = RXRPC_MAXCALLS;
+		conn->avail_calls = RXRPC_MAXCALLS - 1;
 		conn->security_level = rx->min_sec_level;
 		conn->key = key_get(rx->key);
 
@@ -447,6 +447,11 @@ int rxrpc_connect_call(struct rxrpc_sock *rx,
 			if (--conn->avail_calls == 0)
 				list_move(&conn->bundle_link,
 					  &bundle->busy_conns);
+			ASSERTCMP(conn->avail_calls, <, RXRPC_MAXCALLS);
+			ASSERT(conn->channels[0] == NULL ||
+			       conn->channels[1] == NULL ||
+			       conn->channels[2] == NULL ||
+			       conn->channels[3] == NULL);
 			atomic_inc(&conn->usage);
 			break;
 		}
@@ -456,6 +461,12 @@ int rxrpc_connect_call(struct rxrpc_sock *rx,
 			conn = list_entry(bundle->unused_conns.next,
 					  struct rxrpc_connection,
 					  bundle_link);
+			ASSERTCMP(conn->avail_calls, ==, RXRPC_MAXCALLS);
+			conn->avail_calls = RXRPC_MAXCALLS - 1;
+			ASSERT(conn->channels[0] == NULL &&
+			       conn->channels[1] == NULL &&
+			       conn->channels[2] == NULL &&
+			       conn->channels[3] == NULL);
 			atomic_inc(&conn->usage);
 			list_move(&conn->bundle_link, &bundle->avail_conns);
 			break;
@@ -512,7 +523,7 @@ int rxrpc_connect_call(struct rxrpc_sock *rx,
 		candidate->state = RXRPC_CONN_CLIENT;
 		candidate->avail_calls = RXRPC_MAXCALLS;
 		candidate->security_level = rx->min_sec_level;
-		candidate->key = key_get(rx->key);
+		candidate->key = key_get(bundle->key);
 
 		ret = rxrpc_init_client_conn_security(candidate);
 		if (ret < 0) {
@@ -555,6 +566,10 @@ int rxrpc_connect_call(struct rxrpc_sock *rx,
 	for (chan = 0; chan < RXRPC_MAXCALLS; chan++)
 		if (!conn->channels[chan])
 			goto found_channel;
+	ASSERT(conn->channels[0] == NULL ||
+	       conn->channels[1] == NULL ||
+	       conn->channels[2] == NULL ||
+	       conn->channels[3] == NULL);
 	BUG();
 
 found_channel:
@@ -567,6 +582,7 @@ found_channel:
 	_net("CONNECT client on conn %d chan %d as call %x",
 	     conn->debug_id, chan, ntohl(call->call_id));
 
+	ASSERTCMP(conn->avail_calls, <, RXRPC_MAXCALLS);
 	spin_unlock(&trans->client_lock);
 
 	rxrpc_add_call_ID_to_conn(conn, call);
@@ -778,7 +794,7 @@ void rxrpc_put_connection(struct rxrpc_connection *conn)
 	conn->put_time = xtime.tv_sec;
 	if (atomic_dec_and_test(&conn->usage)) {
 		_debug("zombie");
-		schedule_delayed_work(&rxrpc_connection_reap, 0);
+		rxrpc_queue_delayed_work(&rxrpc_connection_reap, 0);
 	}
 
 	_leave("");
@@ -862,8 +878,8 @@ void rxrpc_connection_reaper(struct work_struct *work)
 	if (earliest != ULONG_MAX) {
 		_debug("reschedule reaper %ld", (long) earliest - now);
 		ASSERTCMP(earliest, >, now);
-		schedule_delayed_work(&rxrpc_connection_reap,
-				      (earliest - now) * HZ);
+		rxrpc_queue_delayed_work(&rxrpc_connection_reap,
+					 (earliest - now) * HZ);
 	}
 
 	/* then destroy all those pulled out */
@@ -889,7 +905,7 @@ void __exit rxrpc_destroy_all_connections(void)
 
 	rxrpc_connection_timeout = 0;
 	cancel_delayed_work(&rxrpc_connection_reap);
-	schedule_delayed_work(&rxrpc_connection_reap, 0);
+	rxrpc_queue_delayed_work(&rxrpc_connection_reap, 0);
 
 	_leave("");
 }
diff --git a/net/rxrpc/ar-connevent.c b/net/rxrpc/ar-connevent.c
index 1957d2f..434c1de 100644
--- a/net/rxrpc/ar-connevent.c
+++ b/net/rxrpc/ar-connevent.c
@@ -45,7 +45,7 @@ static void rxrpc_abort_calls(struct rxrpc_connection *conn, int state,
 				set_bit(RXRPC_CALL_CONN_ABORT, &call->events);
 			else
 				set_bit(RXRPC_CALL_RCVD_ABORT, &call->events);
-			schedule_work(&call->processor);
+			rxrpc_queue_call(call);
 		}
 		write_unlock(&call->state_lock);
 	}
@@ -133,7 +133,7 @@ void rxrpc_call_is_secure(struct rxrpc_call *call)
 		read_lock(&call->state_lock);
 		if (call->state < RXRPC_CALL_COMPLETE &&
 		    !test_and_set_bit(RXRPC_CALL_SECURED, &call->events))
-			schedule_work(&call->processor);
+			rxrpc_queue_call(call);
 		read_unlock(&call->state_lock);
 	}
 }
@@ -308,6 +308,22 @@ protocol_error:
 }
 
 /*
+ * put a packet up for transport-level abort
+ */
+void rxrpc_reject_packet(struct rxrpc_local *local, struct sk_buff *skb)
+{
+	CHECK_SLAB_OKAY(&local->usage);
+
+	if (!atomic_inc_not_zero(&local->usage)) {
+		printk("resurrected on reject\n");
+		BUG();
+	}
+
+	skb_queue_tail(&local->reject_queue, skb);
+	rxrpc_queue_work(&local->rejecter);
+}
+
+/*
  * reject packets through the local endpoint
  */
 void rxrpc_reject_packets(struct work_struct *work)
diff --git a/net/rxrpc/ar-error.c b/net/rxrpc/ar-error.c
index 8554c44..c28001b 100644
--- a/net/rxrpc/ar-error.c
+++ b/net/rxrpc/ar-error.c
@@ -111,7 +111,7 @@ void rxrpc_UDP_error_report(struct sock *sk)
 
 	/* pass the transport ref to error_handler to release */
 	skb_queue_tail(&trans->error_queue, skb);
-	schedule_work(&trans->error_handler);
+	rxrpc_queue_work(&trans->error_handler);
 
 	/* reset and regenerate socket error */
 	spin_lock_bh(&sk->sk_error_queue.lock);
@@ -235,7 +235,7 @@ void rxrpc_UDP_error_handler(struct work_struct *work)
 			    call->state < RXRPC_CALL_NETWORK_ERROR) {
 				call->state = RXRPC_CALL_NETWORK_ERROR;
 				set_bit(RXRPC_CALL_RCVD_ERROR, &call->events);
-				schedule_work(&call->processor);
+				rxrpc_queue_call(call);
 			}
 			write_unlock(&call->state_lock);
 			list_del_init(&call->error_link);
@@ -245,7 +245,7 @@ void rxrpc_UDP_error_handler(struct work_struct *work)
 	}
 
 	if (!skb_queue_empty(&trans->error_queue))
-		schedule_work(&trans->error_handler);
+		rxrpc_queue_work(&trans->error_handler);
 
 	rxrpc_free_skb(skb);
 	rxrpc_put_transport(trans);
diff --git a/net/rxrpc/ar-input.c b/net/rxrpc/ar-input.c
index 64ae9fa..80c979d 100644
--- a/net/rxrpc/ar-input.c
+++ b/net/rxrpc/ar-input.c
@@ -42,6 +42,7 @@ int rxrpc_queue_rcv_skb(struct rxrpc_call *call, struct sk_buff *skb,
 			bool force, bool terminal)
 {
 	struct rxrpc_skb_priv *sp;
+	struct rxrpc_sock *rx = call->socket;
 	struct sock *sk;
 	int skb_len, ret;
 
@@ -64,7 +65,7 @@ int rxrpc_queue_rcv_skb(struct rxrpc_call *call, struct sk_buff *skb,
 		return 0;
 	}
 
-	sk = &call->socket->sk;
+	sk = &rx->sk;
 
 	if (!force) {
 		/* cast skb->rcvbuf to unsigned...  It's pointless, but
@@ -89,25 +90,30 @@ int rxrpc_queue_rcv_skb(struct rxrpc_call *call, struct sk_buff *skb,
 		skb->sk = sk;
 		atomic_add(skb->truesize, &sk->sk_rmem_alloc);
 
-		/* Cache the SKB length before we tack it onto the receive
-		 * queue.  Once it is added it no longer belongs to us and
-		 * may be freed by other threads of control pulling packets
-		 * from the queue.
-		 */
-		skb_len = skb->len;
-
-		_net("post skb %p", skb);
-		__skb_queue_tail(&sk->sk_receive_queue, skb);
-		spin_unlock_bh(&sk->sk_receive_queue.lock);
-
-		if (!sock_flag(sk, SOCK_DEAD))
-			sk->sk_data_ready(sk, skb_len);
-
 		if (terminal) {
 			_debug("<<<< TERMINAL MESSAGE >>>>");
 			set_bit(RXRPC_CALL_TERMINAL_MSG, &call->flags);
 		}
 
+		/* allow interception by a kernel service */
+		if (rx->interceptor) {
+			rx->interceptor(sk, call->user_call_ID, skb);
+			spin_unlock_bh(&sk->sk_receive_queue.lock);
+		} else {
+
+			/* Cache the SKB length before we tack it onto the
+			 * receive queue.  Once it is added it no longer
+			 * belongs to us and may be freed by other threads of
+			 * control pulling packets from the queue */
+			skb_len = skb->len;
+
+			_net("post skb %p", skb);
+			__skb_queue_tail(&sk->sk_receive_queue, skb);
+			spin_unlock_bh(&sk->sk_receive_queue.lock);
+
+			if (!sock_flag(sk, SOCK_DEAD))
+				sk->sk_data_ready(sk, skb_len);
+		}
 		skb = NULL;
 	} else {
 		spin_unlock_bh(&sk->sk_receive_queue.lock);
@@ -232,7 +238,7 @@ static int rxrpc_fast_process_data(struct rxrpc_call *call,
 		read_lock(&call->state_lock);
 		if (call->state < RXRPC_CALL_COMPLETE &&
 		    !test_and_set_bit(RXRPC_CALL_DRAIN_RX_OOS, &call->events))
-			schedule_work(&call->processor);
+			rxrpc_queue_call(call);
 		read_unlock(&call->state_lock);
 	}
 
@@ -267,7 +273,7 @@ enqueue_packet:
 	atomic_inc(&call->ackr_not_idle);
 	read_lock(&call->state_lock);
 	if (call->state < RXRPC_CALL_DEAD)
-		schedule_work(&call->processor);
+		rxrpc_queue_call(call);
 	read_unlock(&call->state_lock);
 	_leave(" = 0 [queued]");
 	return 0;
@@ -360,7 +366,7 @@ void rxrpc_fast_process_packet(struct rxrpc_call *call, struct sk_buff *skb)
 			call->state = RXRPC_CALL_REMOTELY_ABORTED;
 			call->abort_code = abort_code;
 			set_bit(RXRPC_CALL_RCVD_ABORT, &call->events);
-			schedule_work(&call->processor);
+			rxrpc_queue_call(call);
 		}
 		goto free_packet_unlock;
 
@@ -375,7 +381,7 @@ void rxrpc_fast_process_packet(struct rxrpc_call *call, struct sk_buff *skb)
 		case RXRPC_CALL_CLIENT_SEND_REQUEST:
 			call->state = RXRPC_CALL_SERVER_BUSY;
 			set_bit(RXRPC_CALL_RCVD_BUSY, &call->events);
-			schedule_work(&call->processor);
+			rxrpc_queue_call(call);
 		case RXRPC_CALL_SERVER_BUSY:
 			goto free_packet_unlock;
 		default:
@@ -419,7 +425,7 @@ void rxrpc_fast_process_packet(struct rxrpc_call *call, struct sk_buff *skb)
 		read_lock_bh(&call->state_lock);
 		if (call->state < RXRPC_CALL_DEAD) {
 			skb_queue_tail(&call->rx_queue, skb);
-			schedule_work(&call->processor);
+			rxrpc_queue_call(call);
 			skb = NULL;
 		}
 		read_unlock_bh(&call->state_lock);
@@ -434,7 +440,7 @@ protocol_error_locked:
 		call->state = RXRPC_CALL_LOCALLY_ABORTED;
 		call->abort_code = RX_PROTOCOL_ERROR;
 		set_bit(RXRPC_CALL_ABORT, &call->events);
-		schedule_work(&call->processor);
+		rxrpc_queue_call(call);
 	}
 free_packet_unlock:
 	write_unlock_bh(&call->state_lock);
@@ -506,7 +512,7 @@ protocol_error:
 		call->state = RXRPC_CALL_LOCALLY_ABORTED;
 		call->abort_code = RX_PROTOCOL_ERROR;
 		set_bit(RXRPC_CALL_ABORT, &call->events);
-		schedule_work(&call->processor);
+		rxrpc_queue_call(call);
 	}
 	write_unlock_bh(&call->state_lock);
 	_leave("");
@@ -542,7 +548,7 @@ static void rxrpc_post_packet_to_call(struct rxrpc_connection *conn,
 	switch (call->state) {
 	case RXRPC_CALL_LOCALLY_ABORTED:
 		if (!test_and_set_bit(RXRPC_CALL_ABORT, &call->events))
-			schedule_work(&call->processor);
+			rxrpc_queue_call(call);
 	case RXRPC_CALL_REMOTELY_ABORTED:
 	case RXRPC_CALL_NETWORK_ERROR:
 	case RXRPC_CALL_DEAD:
@@ -591,7 +597,7 @@ dead_call:
 	    sp->hdr.seq == __constant_cpu_to_be32(1)) {
 		_debug("incoming call");
 		skb_queue_tail(&conn->trans->local->accept_queue, skb);
-		schedule_work(&conn->trans->local->acceptor);
+		rxrpc_queue_work(&conn->trans->local->acceptor);
 		goto done;
 	}
 
@@ -630,7 +636,7 @@ found_completed_call:
 	_debug("final ack again");
 	rxrpc_get_call(call);
 	set_bit(RXRPC_CALL_ACK_FINAL, &call->events);
-	schedule_work(&call->processor);
+	rxrpc_queue_call(call);
 
 free_unlock:
 	read_unlock(&call->state_lock);
@@ -651,7 +657,7 @@ static void rxrpc_post_packet_to_conn(struct rxrpc_connection *conn,
 
 	atomic_inc(&conn->usage);
 	skb_queue_tail(&conn->rx_queue, skb);
-	schedule_work(&conn->processor);
+	rxrpc_queue_conn(conn);
 }
 
 /*
@@ -767,7 +773,7 @@ cant_route_call:
 		if (sp->hdr.seq == __constant_cpu_to_be32(1)) {
 			_debug("first packet");
 			skb_queue_tail(&local->accept_queue, skb);
-			schedule_work(&local->acceptor);
+			rxrpc_queue_work(&local->acceptor);
 			rxrpc_put_local(local);
 			_leave(" [incoming]");
 			return;
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 7bfbf47..cb1eb49 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -19,8 +19,6 @@
 #define CHECK_SLAB_OKAY(X) do {} while(0)
 #endif
 
-extern atomic_t rxrpc_n_skbs;
-
 #define FCRYPT_BSIZE 8
 struct rxrpc_crypt {
 	union {
@@ -29,8 +27,12 @@ struct rxrpc_crypt {
 	};
 } __attribute__((aligned(8)));
 
-extern __be32 rxrpc_epoch;		/* local epoch for detecting local-end reset */
-extern atomic_t rxrpc_debug_id;		/* current debugging ID */
+#define rxrpc_queue_work(WS)	queue_work(rxrpc_workqueue, (WS))
+#define rxrpc_queue_delayed_work(WS,D)	\
+	queue_delayed_work(rxrpc_workqueue, (WS), (D))
+
+#define rxrpc_queue_call(CALL)	rxrpc_queue_work(&(CALL)->processor)
+#define rxrpc_queue_conn(CONN)	rxrpc_queue_work(&(CONN)->processor)
 
 /*
  * sk_state for RxRPC sockets
@@ -50,6 +52,7 @@ enum {
 struct rxrpc_sock {
 	/* WARNING: sk has to be the first member */
 	struct sock		sk;
+	rxrpc_interceptor_t	interceptor;	/* kernel service Rx interceptor function */
 	struct rxrpc_local	*local;		/* local endpoint */
 	struct rxrpc_transport	*trans;		/* transport handler */
 	struct rxrpc_conn_bundle *bundle;	/* virtual connection bundle */
@@ -91,16 +94,6 @@ struct rxrpc_skb_priv {
 
 #define rxrpc_skb(__skb) ((struct rxrpc_skb_priv *) &(__skb)->cb)
 
-enum {
-	RXRPC_SKB_MARK_DATA,		/* data message */
-	RXRPC_SKB_MARK_FINAL_ACK,	/* final ACK received message */
-	RXRPC_SKB_MARK_BUSY,		/* server busy message */
-	RXRPC_SKB_MARK_REMOTE_ABORT,	/* remote abort message */
-	RXRPC_SKB_MARK_NET_ERROR,	/* network error message */
-	RXRPC_SKB_MARK_LOCAL_ERROR,	/* local error message */
-	RXRPC_SKB_MARK_NEW_CALL,	/* local error message */
-};
-
 enum rxrpc_command {
 	RXRPC_CMD_SEND_DATA,		/* send data message */
 	RXRPC_CMD_SEND_ABORT,		/* request abort generation */
@@ -439,25 +432,20 @@ static inline void rxrpc_abort_call(struct rxrpc_call *call, u32 abort_code)
 }
 
 /*
- * put a packet up for transport-level abort
+ * af_rxrpc.c
  */
-static inline
-void rxrpc_reject_packet(struct rxrpc_local *local, struct sk_buff *skb)
-{
-	CHECK_SLAB_OKAY(&local->usage);
-	if (!atomic_inc_not_zero(&local->usage)) {
-		printk("resurrected on reject\n");
-		BUG();
-	}
-	skb_queue_tail(&local->reject_queue, skb);
-	schedule_work(&local->rejecter);
-}
+extern atomic_t rxrpc_n_skbs;
+extern __be32 rxrpc_epoch;
+extern atomic_t rxrpc_debug_id;
+extern struct workqueue_struct *rxrpc_workqueue;
 
 /*
  * ar-accept.c
  */
 extern void rxrpc_accept_incoming_calls(struct work_struct *);
-extern int rxrpc_accept_call(struct rxrpc_sock *, unsigned long);
+extern struct rxrpc_call *rxrpc_accept_call(struct rxrpc_sock *,
+					    unsigned long);
+extern int rxrpc_reject_call(struct rxrpc_sock *);
 
 /*
  * ar-ack.c
@@ -514,6 +502,7 @@ rxrpc_incoming_connection(struct rxrpc_transport *, struct rxrpc_header *,
  * ar-connevent.c
  */
 extern void rxrpc_process_connection(struct work_struct *);
+extern void rxrpc_reject_packet(struct rxrpc_local *, struct sk_buff *);
 extern void rxrpc_reject_packets(struct work_struct *);
 
 /*
@@ -583,6 +572,7 @@ extern struct file_operations rxrpc_connection_seq_fops;
 /*
  * ar-recvmsg.c
  */
+extern void rxrpc_remove_user_ID(struct rxrpc_sock *, struct rxrpc_call *);
 extern int rxrpc_recvmsg(struct kiocb *, struct socket *, struct msghdr *,
 			 size_t, int);
 
diff --git a/net/rxrpc/ar-local.c b/net/rxrpc/ar-local.c
index a20a2c0..fe03f71 100644
--- a/net/rxrpc/ar-local.c
+++ b/net/rxrpc/ar-local.c
@@ -228,7 +228,7 @@ void rxrpc_put_local(struct rxrpc_local *local)
 	write_lock_bh(&rxrpc_local_lock);
 	if (unlikely(atomic_dec_and_test(&local->usage))) {
 		_debug("destroy local");
-		schedule_work(&local->destroyer);
+		rxrpc_queue_work(&local->destroyer);
 	}
 	write_unlock_bh(&rxrpc_local_lock);
 	_leave("");
diff --git a/net/rxrpc/ar-output.c b/net/rxrpc/ar-output.c
index 491e47c..d2d0baa 100644
--- a/net/rxrpc/ar-output.c
+++ b/net/rxrpc/ar-output.c
@@ -113,7 +113,7 @@ static void rxrpc_send_abort(struct rxrpc_call *call, u32 abort_code)
 		clear_bit(RXRPC_CALL_RESEND_TIMER, &call->events);
 		clear_bit(RXRPC_CALL_ACK, &call->events);
 		clear_bit(RXRPC_CALL_RUN_RTIMER, &call->flags);
-		schedule_work(&call->processor);
+		rxrpc_queue_call(call);
 	}
 
 	write_unlock_bh(&call->state_lock);
@@ -194,6 +194,77 @@ int rxrpc_client_sendmsg(struct kiocb *iocb, struct rxrpc_sock *rx,
 	return ret;
 }
 
+/**
+ * rxrpc_kernel_send_data - Allow a kernel service to send data on a call
+ * @call: The call to send data through
+ * @msg: The data to send
+ * @len: The amount of data to send
+ *
+ * Allow a kernel service to send data on a call.  The call must be in an state
+ * appropriate to sending data.  No control data should be supplied in @msg,
+ * nor should an address be supplied.  MSG_MORE should be flagged if there's
+ * more data to come, otherwise this data will end the transmission phase.
+ */
+int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg,
+			   size_t len)
+{
+	int ret;
+
+	_enter("{%d,%s},", call->debug_id, rxrpc_call_states[call->state]);
+
+	ASSERTCMP(msg->msg_name, ==, NULL);
+	ASSERTCMP(msg->msg_control, ==, NULL);
+
+	lock_sock(&call->socket->sk);
+
+	_debug("CALL %d USR %lx ST %d on CONN %p",
+	       call->debug_id, call->user_call_ID, call->state, call->conn);
+
+	if (call->state >= RXRPC_CALL_COMPLETE) {
+		ret = -ESHUTDOWN; /* it's too late for this call */
+	} else if (call->state != RXRPC_CALL_CLIENT_SEND_REQUEST &&
+		   call->state != RXRPC_CALL_SERVER_ACK_REQUEST &&
+		   call->state != RXRPC_CALL_SERVER_SEND_REPLY) {
+		ret = -EPROTO; /* request phase complete for this client call */
+	} else {
+		mm_segment_t oldfs = get_fs();
+		set_fs(KERNEL_DS);
+		ret = rxrpc_send_data(NULL, call->socket, call, msg, len);
+		set_fs(oldfs);
+	}
+
+	release_sock(&call->socket->sk);
+	_leave(" = %d", ret);
+	return ret;
+}
+
+EXPORT_SYMBOL(rxrpc_kernel_send_data);
+
+/*
+ * rxrpc_kernel_abort_call - Allow a kernel service to abort a call
+ * @call: The call to be aborted
+ * @abort_code: The abort code to stick into the ABORT packet
+ *
+ * Allow a kernel service to abort a call, if it's still in an abortable state.
+ */
+void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code)
+{
+	_enter("{%d},%d", call->debug_id, abort_code);
+
+	lock_sock(&call->socket->sk);
+
+	_debug("CALL %d USR %lx ST %d on CONN %p",
+	       call->debug_id, call->user_call_ID, call->state, call->conn);
+
+	if (call->state < RXRPC_CALL_COMPLETE)
+		rxrpc_send_abort(call, abort_code);
+
+	release_sock(&call->socket->sk);
+	_leave("");
+}
+
+EXPORT_SYMBOL(rxrpc_kernel_abort_call);
+
 /*
  * send a message through a server socket
  * - caller holds the socket locked
@@ -214,8 +285,13 @@ int rxrpc_server_sendmsg(struct kiocb *iocb, struct rxrpc_sock *rx,
 	if (ret < 0)
 		return ret;
 
-	if (cmd == RXRPC_CMD_ACCEPT)
-		return rxrpc_accept_call(rx, user_call_ID);
+	if (cmd == RXRPC_CMD_ACCEPT) {
+		call = rxrpc_accept_call(rx, user_call_ID);
+		if (IS_ERR(call))
+			return PTR_ERR(call);
+		rxrpc_put_call(call);
+		return 0;
+	}
 
 	call = rxrpc_find_server_call(rx, user_call_ID);
 	if (!call)
@@ -363,7 +439,7 @@ static inline void rxrpc_instant_resend(struct rxrpc_call *call)
 		clear_bit(RXRPC_CALL_RUN_RTIMER, &call->flags);
 		if (call->state < RXRPC_CALL_COMPLETE &&
 		    !test_and_set_bit(RXRPC_CALL_RESEND_TIMER, &call->events))
-			schedule_work(&call->processor);
+			rxrpc_queue_call(call);
 	}
 	read_unlock_bh(&call->state_lock);
 }
diff --git a/net/rxrpc/ar-peer.c b/net/rxrpc/ar-peer.c
index 69ac355..d399de4 100644
--- a/net/rxrpc/ar-peer.c
+++ b/net/rxrpc/ar-peer.c
@@ -219,7 +219,7 @@ void rxrpc_put_peer(struct rxrpc_peer *peer)
 		return;
 	}
 
-	schedule_work(&peer->destroyer);
+	rxrpc_queue_work(&peer->destroyer);
 	_leave("");
 }
 
diff --git a/net/rxrpc/ar-recvmsg.c b/net/rxrpc/ar-recvmsg.c
index e947d5c..f19121d 100644
--- a/net/rxrpc/ar-recvmsg.c
+++ b/net/rxrpc/ar-recvmsg.c
@@ -19,7 +19,7 @@
  * removal a call's user ID from the socket tree to make the user ID available
  * again and so that it won't be seen again in association with that call
  */
-static void rxrpc_remove_user_ID(struct rxrpc_sock *rx, struct rxrpc_call *call)
+void rxrpc_remove_user_ID(struct rxrpc_sock *rx, struct rxrpc_call *call)
 {
 	_debug("RELEASE CALL %d", call->debug_id);
 
@@ -33,7 +33,7 @@ static void rxrpc_remove_user_ID(struct rxrpc_sock *rx, struct rxrpc_call *call)
 	read_lock_bh(&call->state_lock);
 	if (!test_bit(RXRPC_CALL_RELEASED, &call->flags) &&
 	    !test_and_set_bit(RXRPC_CALL_RELEASE, &call->events))
-		schedule_work(&call->processor);
+		rxrpc_queue_call(call);
 	read_unlock_bh(&call->state_lock);
 }
 
@@ -364,3 +364,74 @@ wait_error:
 	return copied;
 
 }
+
+/**
+ * rxrpc_kernel_data_delivered - Record delivery of data message
+ * @skb: Message holding data
+ *
+ * Record the delivery of a data message.  This permits RxRPC to keep its
+ * tracking correct.  The socket buffer will be deleted.
+ */
+void rxrpc_kernel_data_delivered(struct sk_buff *skb)
+{
+	struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
+	struct rxrpc_call *call = sp->call;
+
+	ASSERTCMP(ntohl(sp->hdr.seq), >=, call->rx_data_recv);
+	ASSERTCMP(ntohl(sp->hdr.seq), <=, call->rx_data_recv + 1);
+	call->rx_data_recv = ntohl(sp->hdr.seq);
+
+	ASSERTCMP(ntohl(sp->hdr.seq), >, call->rx_data_eaten);
+	rxrpc_free_skb(skb);
+}
+
+EXPORT_SYMBOL(rxrpc_kernel_data_delivered);
+
+/**
+ * rxrpc_kernel_is_data_last - Determine if data message is last one
+ * @skb: Message holding data
+ *
+ * Determine if data message is last one for the parent call.
+ */
+bool rxrpc_kernel_is_data_last(struct sk_buff *skb)
+{
+	struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
+
+	ASSERTCMP(skb->mark, ==, RXRPC_SKB_MARK_DATA);
+
+	return sp->hdr.flags & RXRPC_LAST_PACKET;
+}
+
+EXPORT_SYMBOL(rxrpc_kernel_is_data_last);
+
+/**
+ * rxrpc_kernel_get_abort_code - Get the abort code from an RxRPC abort message
+ * @skb: Message indicating an abort
+ *
+ * Get the abort code from an RxRPC abort message.
+ */
+u32 rxrpc_kernel_get_abort_code(struct sk_buff *skb)
+{
+	struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
+
+	ASSERTCMP(skb->mark, ==, RXRPC_SKB_MARK_REMOTE_ABORT);
+
+	return sp->call->abort_code;
+}
+
+EXPORT_SYMBOL(rxrpc_kernel_get_abort_code);
+
+/**
+ * rxrpc_kernel_get_error - Get the error number from an RxRPC error message
+ * @skb: Message indicating an error
+ *
+ * Get the error number from an RxRPC error message.
+ */
+int rxrpc_kernel_get_error_number(struct sk_buff *skb)
+{
+	struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
+
+	return sp->error;
+}
+
+EXPORT_SYMBOL(rxrpc_kernel_get_error_number);
diff --git a/net/rxrpc/ar-skbuff.c b/net/rxrpc/ar-skbuff.c
index d73f6fc..de755e0 100644
--- a/net/rxrpc/ar-skbuff.c
+++ b/net/rxrpc/ar-skbuff.c
@@ -36,7 +36,7 @@ static void rxrpc_request_final_ACK(struct rxrpc_call *call)
 		rxrpc_get_call(call);
 		set_bit(RXRPC_CALL_ACK_FINAL, &call->events);
 		if (try_to_del_timer_sync(&call->ack_timer) >= 0)
-			schedule_work(&call->processor);
+			rxrpc_queue_call(call);
 		break;
 
 	case RXRPC_CALL_SERVER_RECV_REQUEST:
@@ -116,3 +116,17 @@ void rxrpc_packet_destructor(struct sk_buff *skb)
 		sock_rfree(skb);
 	_leave("");
 }
+
+/**
+ * rxrpc_kernel_free_skb - Free an RxRPC socket buffer
+ * @skb: The socket buffer to be freed
+ *
+ * Let RxRPC free its own socket buffer, permitting it to maintain debug
+ * accounting.
+ */
+void rxrpc_kernel_free_skb(struct sk_buff *skb)
+{
+	rxrpc_free_skb(skb);
+}
+
+EXPORT_SYMBOL(rxrpc_kernel_free_skb);
diff --git a/net/rxrpc/ar-transport.c b/net/rxrpc/ar-transport.c
index 9b4e5cb..d43d78f 100644
--- a/net/rxrpc/ar-transport.c
+++ b/net/rxrpc/ar-transport.c
@@ -189,7 +189,7 @@ void rxrpc_put_transport(struct rxrpc_transport *trans)
 		/* let the reaper determine the timeout to avoid a race with
 		 * overextending the timeout if the reaper is running at the
 		 * same time */
-		schedule_delayed_work(&rxrpc_transport_reap, 0);
+		rxrpc_queue_delayed_work(&rxrpc_transport_reap, 0);
 	_leave("");
 }
 
@@ -243,8 +243,8 @@ static void rxrpc_transport_reaper(struct work_struct *work)
 	if (earliest != ULONG_MAX) {
 		_debug("reschedule reaper %ld", (long) earliest - now);
 		ASSERTCMP(earliest, >, now);
-		schedule_delayed_work(&rxrpc_transport_reap,
-				      (earliest - now) * HZ);
+		rxrpc_queue_delayed_work(&rxrpc_transport_reap,
+					 (earliest - now) * HZ);
 	}
 
 	/* then destroy all those pulled out */
@@ -270,7 +270,7 @@ void __exit rxrpc_destroy_all_transports(void)
 
 	rxrpc_transport_timeout = 0;
 	cancel_delayed_work(&rxrpc_transport_reap);
-	schedule_delayed_work(&rxrpc_transport_reap, 0);
+	rxrpc_queue_delayed_work(&rxrpc_transport_reap, 0);
 
 	_leave("");
 }


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 10/16] AFS: Handle multiple mounts of an AFS superblock correctly [try #3]
  2007-04-25 10:50 ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3] David Howells
                     ` (4 preceding siblings ...)
  2007-04-25 10:51   ` [PATCH 07/16] AF_RXRPC: Add an interface to the AF_RXRPC module for the AFS filesystem to use " David Howells
@ 2007-04-25 10:51   ` David Howells
  2007-04-25 10:51   ` [PATCH 11/16] AFS: Add security support " David Howells
                     ` (6 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: David Howells @ 2007-04-25 10:51 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel, linux-fsdevel, netdev, dhowells

Handle multiple mounts of an AFS superblock correctly, checking to see whether
the superblock is already initialised after calling sget() rather than just
unconditionally stamping all over it.

Also delete the "silent" parameter to afs_fill_super() as it's not used and
can, in any case, be obtained from sb->s_flags.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 fs/afs/super.c |   26 ++++++++++++++++----------
 1 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/fs/afs/super.c b/fs/afs/super.c
index efc4fe6..77e6875 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -212,7 +212,7 @@ static int afs_test_super(struct super_block *sb, void *data)
 /*
  * fill in the superblock
  */
-static int afs_fill_super(struct super_block *sb, void *data, int silent)
+static int afs_fill_super(struct super_block *sb, void *data)
 {
 	struct afs_mount_params *params = data;
 	struct afs_super_info *as = NULL;
@@ -319,17 +319,23 @@ static int afs_get_sb(struct file_system_type *fs_type,
 		goto error;
 	}
 
-	sb->s_flags = flags;
-
-	ret = afs_fill_super(sb, &params, flags & MS_SILENT ? 1 : 0);
-	if (ret < 0) {
-		up_write(&sb->s_umount);
-		deactivate_super(sb);
-		goto error;
+	if (!sb->s_root) {
+		/* initial superblock/root creation */
+		_debug("create");
+		sb->s_flags = flags;
+		ret = afs_fill_super(sb, &params);
+		if (ret < 0) {
+			up_write(&sb->s_umount);
+			deactivate_super(sb);
+			goto error;
+		}
+		sb->s_flags |= MS_ACTIVE;
+	} else {
+		_debug("reuse");
+		ASSERTCMP(sb->s_flags, &, MS_ACTIVE);
 	}
-	sb->s_flags |= MS_ACTIVE;
-	simple_set_mnt(mnt, sb);
 
+	simple_set_mnt(mnt, sb);
 	afs_put_volume(params.volume);
 	afs_put_cell(params.default_cell);
 	_leave(" = 0 [%p]", sb);


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 11/16] AFS: Add security support [try #3]
  2007-04-25 10:50 ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3] David Howells
                     ` (5 preceding siblings ...)
  2007-04-25 10:51   ` [PATCH 10/16] AFS: Handle multiple mounts of an AFS superblock correctly " David Howells
@ 2007-04-25 10:51   ` David Howells
  2007-04-25 10:51   ` [PATCH 12/16] AFS: Update the AFS fs documentation " David Howells
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: David Howells @ 2007-04-25 10:51 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel, linux-fsdevel, netdev, dhowells

Add security support to the AFS filesystem.  Kerberos IV tickets are added as
RxRPC keys are added to the session keyring with the klog program.  open() and
other VFS operations then find this ticket with request_key() and either use
it immediately (eg: mkdir, unlink) or attach it to a file descriptor (open).

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 fs/afs/AFS.h       |   27 ++++
 fs/afs/Makefile    |    1 
 fs/afs/callback.c  |    5 +
 fs/afs/cell.c      |   98 ++++++++++++---
 fs/afs/cmservice.c |    3 
 fs/afs/dir.c       |   51 ++++++--
 fs/afs/file.c      |   60 ++++++++-
 fs/afs/fsclient.c  |    9 +
 fs/afs/inode.c     |   21 ++-
 fs/afs/internal.h  |  106 ++++++++++++----
 fs/afs/mntpt.c     |   12 +-
 fs/afs/rxrpc.c     |  137 ++++++++++++++++-----
 fs/afs/security.c  |  345 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/afs/super.c     |  142 ++++++++++++++++++---
 fs/afs/vlclient.c  |    6 +
 fs/afs/vlocation.c |   26 ++--
 fs/afs/vnode.c     |   25 +++-
 fs/afs/volume.c    |  109 +++-------------
 18 files changed, 945 insertions(+), 238 deletions(-)

diff --git a/fs/afs/AFS.h b/fs/afs/AFS.h
index 0362e11..3f255d6 100644
--- a/fs/afs/AFS.h
+++ b/fs/afs/AFS.h
@@ -14,6 +14,9 @@
 
 #include <linux/in.h>
 
+#define AFS_MAXCELLNAME	64		/* maximum length of a cell name */
+#define AFS_MAXVOLNAME	64		/* maximum length of a volume name */
+
 typedef unsigned			afs_volid_t;
 typedef unsigned			afs_vnodeid_t;
 typedef unsigned long long		afs_dataversion_t;
@@ -75,6 +78,26 @@ struct afs_volume_info {
 };
 
 /*
+ * AFS security ACE access mask
+ */
+typedef u32 afs_access_t;
+#define AFS_ACE_READ		0x00000001U	/* - permission to read a file/dir */
+#define AFS_ACE_WRITE		0x00000002U	/* - permission to write/chmod a file */
+#define AFS_ACE_INSERT		0x00000004U	/* - permission to create dirent in a dir */
+#define AFS_ACE_LOOKUP		0x00000008U	/* - permission to lookup a file/dir in a dir */
+#define AFS_ACE_DELETE		0x00000010U	/* - permission to delete a dirent from a dir */
+#define AFS_ACE_LOCK		0x00000020U	/* - permission to lock a file */
+#define AFS_ACE_ADMINISTER	0x00000040U	/* - permission to change ACL */
+#define AFS_ACE_USER_A		0x01000000U	/* - 'A' user-defined permission */
+#define AFS_ACE_USER_B		0x02000000U	/* - 'B' user-defined permission */
+#define AFS_ACE_USER_C		0x04000000U	/* - 'C' user-defined permission */
+#define AFS_ACE_USER_D		0x08000000U	/* - 'D' user-defined permission */
+#define AFS_ACE_USER_E		0x10000000U	/* - 'E' user-defined permission */
+#define AFS_ACE_USER_F		0x20000000U	/* - 'F' user-defined permission */
+#define AFS_ACE_USER_G		0x40000000U	/* - 'G' user-defined permission */
+#define AFS_ACE_USER_H		0x80000000U	/* - 'H' user-defined permission */
+
+/*
  * AFS file status information
  */
 struct afs_file_status {
@@ -87,8 +110,8 @@ struct afs_file_status {
 	afs_dataversion_t	data_version;	/* current data version */
 	unsigned		author;		/* author ID */
 	unsigned		owner;		/* owner ID */
-	unsigned		caller_access;	/* access rights for authenticated caller */
-	unsigned		anon_access;	/* access rights for unauthenticated caller */
+	afs_access_t		caller_access;	/* access rights for authenticated caller */
+	afs_access_t		anon_access;	/* access rights for unauthenticated caller */
 	umode_t			mode;		/* UNIX mode */
 	struct afs_fid		parent;		/* parent file ID */
 	time_t			mtime_client;	/* last time client changed data */
diff --git a/fs/afs/Makefile b/fs/afs/Makefile
index 66bdc21..cca198b 100644
--- a/fs/afs/Makefile
+++ b/fs/afs/Makefile
@@ -15,6 +15,7 @@ kafs-objs := \
 	mntpt.o \
 	proc.o \
 	rxrpc.o \
+	security.o \
 	server.o \
 	super.o \
 	vlclient.o \
diff --git a/fs/afs/callback.c b/fs/afs/callback.c
index 6112155..e674beb 100644
--- a/fs/afs/callback.c
+++ b/fs/afs/callback.c
@@ -72,7 +72,10 @@ void afs_broken_callback_work(struct work_struct *work)
 		return; /* someone else is dealing with it */
 
 	if (test_bit(AFS_VNODE_CB_BROKEN, &vnode->flags)) {
-		if (afs_vnode_fetch_status(vnode) < 0)
+		if (S_ISDIR(vnode->vfs_inode.i_mode))
+			afs_clear_permits(vnode);
+
+		if (afs_vnode_fetch_status(vnode, NULL, NULL) < 0)
 			goto out;
 
 		if (test_bit(AFS_VNODE_DELETED, &vnode->flags))
diff --git a/fs/afs/cell.c b/fs/afs/cell.c
index 733c602..9b1311a 100644
--- a/fs/afs/cell.c
+++ b/fs/afs/cell.c
@@ -11,6 +11,9 @@
 
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/key.h>
+#include <linux/ctype.h>
+#include <keys/rxrpc-type.h>
 #include "internal.h"
 
 DECLARE_RWSEM(afs_proc_cells_sem);
@@ -23,45 +26,43 @@ static DECLARE_WAIT_QUEUE_HEAD(afs_cells_freeable_wq);
 static struct afs_cell *afs_cell_root;
 
 /*
- * create a cell record
- * - "name" is the name of the cell
- * - "vllist" is a colon separated list of IP addresses in "a.b.c.d" format
+ * allocate a cell record and fill in its name, VL server address list and
+ * allocate an anonymous key
  */
-struct afs_cell *afs_cell_create(const char *name, char *vllist)
+static struct afs_cell *afs_cell_alloc(const char *name, char *vllist)
 {
 	struct afs_cell *cell;
-	char *next;
+	size_t namelen;
+	char keyname[4 + AFS_MAXCELLNAME + 1], *cp, *dp, *next;
 	int ret;
 
 	_enter("%s,%s", name, vllist);
 
 	BUG_ON(!name); /* TODO: want to look up "this cell" in the cache */
 
+	namelen = strlen(name);
+	if (namelen > AFS_MAXCELLNAME)
+		return ERR_PTR(-ENAMETOOLONG);
+
 	/* allocate and initialise a cell record */
-	cell = kmalloc(sizeof(struct afs_cell) + strlen(name) + 1, GFP_KERNEL);
+	cell = kzalloc(sizeof(struct afs_cell) + namelen + 1, GFP_KERNEL);
 	if (!cell) {
 		_leave(" = -ENOMEM");
 		return ERR_PTR(-ENOMEM);
 	}
 
-	down_write(&afs_cells_sem);
+	memcpy(cell->name, name, namelen);
+	cell->name[namelen] = 0;
 
-	memset(cell, 0, sizeof(struct afs_cell));
 	atomic_set(&cell->usage, 1);
-
 	INIT_LIST_HEAD(&cell->link);
-
 	rwlock_init(&cell->servers_lock);
 	INIT_LIST_HEAD(&cell->servers);
-
 	init_rwsem(&cell->vl_sem);
 	INIT_LIST_HEAD(&cell->vl_list);
 	spin_lock_init(&cell->vl_lock);
 
-	strcpy(cell->name, name);
-
 	/* fill in the VL server list from the rest of the string */
-	ret = -EINVAL;
 	do {
 		unsigned a, b, c, d;
 
@@ -70,18 +71,73 @@ struct afs_cell *afs_cell_create(const char *name, char *vllist)
 			*next++ = 0;
 
 		if (sscanf(vllist, "%u.%u.%u.%u", &a, &b, &c, &d) != 4)
-			goto badaddr;
+			goto bad_address;
 
 		if (a > 255 || b > 255 || c > 255 || d > 255)
-			goto badaddr;
+			goto bad_address;
 
 		cell->vl_addrs[cell->vl_naddrs++].s_addr =
 			htonl((a << 24) | (b << 16) | (c << 8) | d);
 
-		if (cell->vl_naddrs >= AFS_CELL_MAX_ADDRS)
-			break;
+	} while (cell->vl_naddrs < AFS_CELL_MAX_ADDRS && (vllist = next));
+
+	/* create a key to represent an anonymous user */
+	memcpy(keyname, "afs@", 4);
+	dp = keyname + 4;
+	cp = cell->name;
+	do {
+		*dp++ = toupper(*cp);
+	} while (*cp++);
+	cell->anonymous_key = key_alloc(&key_type_rxrpc, keyname, 0, 0, current,
+					KEY_POS_SEARCH, KEY_ALLOC_NOT_IN_QUOTA);
+	if (IS_ERR(cell->anonymous_key)) {
+		_debug("no key");
+		ret = PTR_ERR(cell->anonymous_key);
+		goto error;
+	}
+
+	ret = key_instantiate_and_link(cell->anonymous_key, NULL, 0,
+				       NULL, NULL);
+	if (ret < 0) {
+		_debug("instantiate failed");
+		goto error;
+	}
+
+	_debug("anon key %p{%x}",
+	       cell->anonymous_key, key_serial(cell->anonymous_key));
+
+	_leave(" = %p", cell);
+	return cell;
+
+bad_address:
+	printk(KERN_ERR "kAFS: bad VL server IP address\n");
+	ret = -EINVAL;
+error:
+	key_put(cell->anonymous_key);
+	kfree(cell);
+	_leave(" = %d", ret);
+	return ERR_PTR(ret);
+}
 
-	} while ((vllist = next));
+/*
+ * create a cell record
+ * - "name" is the name of the cell
+ * - "vllist" is a colon separated list of IP addresses in "a.b.c.d" format
+ */
+struct afs_cell *afs_cell_create(const char *name, char *vllist)
+{
+	struct afs_cell *cell;
+	int ret;
+
+	_enter("%s,%s", name, vllist);
+
+	cell = afs_cell_alloc(name, vllist);
+	if (IS_ERR(cell)) {
+		_leave(" = %ld", PTR_ERR(cell));
+		return cell;
+	}
+
+	down_write(&afs_cells_sem);
 
 	/* add a proc directory for this cell */
 	ret = afs_proc_cell_setup(cell);
@@ -109,10 +165,9 @@ struct afs_cell *afs_cell_create(const char *name, char *vllist)
 	_leave(" = %p", cell);
 	return cell;
 
-badaddr:
-	printk(KERN_ERR "kAFS: bad VL server IP address\n");
 error:
 	up_write(&afs_cells_sem);
+	key_put(cell->anonymous_key);
 	kfree(cell);
 	_leave(" = %d", ret);
 	return ERR_PTR(ret);
@@ -301,6 +356,7 @@ static void afs_cell_destroy(struct afs_cell *cell)
 	cachefs_relinquish_cookie(cell->cache, 0);
 #endif
 
+	key_put(cell->anonymous_key);
 	kfree(cell);
 
 	_leave(" [destroyed]");
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 0b6665e..9cb3ac5 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -28,6 +28,7 @@ static void afs_cm_destructor(struct afs_call *);
  * CB.CallBack operation type
  */
 static const struct afs_call_type afs_SRXCBCallBack = {
+	.name		= "CB.CallBack",
 	.deliver	= afs_deliver_cb_callback,
 	.abort_to_error	= afs_abort_to_error,
 	.destructor	= afs_cm_destructor,
@@ -37,6 +38,7 @@ static const struct afs_call_type afs_SRXCBCallBack = {
  * CB.InitCallBackState operation type
  */
 static const struct afs_call_type afs_SRXCBInitCallBackState = {
+	.name		= "CB.InitCallBackState",
 	.deliver	= afs_deliver_cb_init_call_back_state,
 	.abort_to_error	= afs_abort_to_error,
 	.destructor	= afs_cm_destructor,
@@ -46,6 +48,7 @@ static const struct afs_call_type afs_SRXCBInitCallBackState = {
  * CB.Probe operation type
  */
 static const struct afs_call_type afs_SRXCBProbe = {
+	.name		= "CB.Probe",
 	.deliver	= afs_deliver_cb_probe,
 	.abort_to_error	= afs_abort_to_error,
 	.destructor	= afs_cm_destructor,
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index d7697f6..8736841 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -15,6 +15,7 @@
 #include <linux/slab.h>
 #include <linux/fs.h>
 #include <linux/pagemap.h>
+#include <linux/ctype.h>
 #include "internal.h"
 
 static struct dentry *afs_dir_lookup(struct inode *dir, struct dentry *dentry,
@@ -28,11 +29,13 @@ static int afs_dir_lookup_filldir(void *_cookie, const char *name, int nlen,
 
 const struct file_operations afs_dir_file_operations = {
 	.open		= afs_dir_open,
+	.release	= afs_release,
 	.readdir	= afs_dir_readdir,
 };
 
 const struct inode_operations afs_dir_inode_operations = {
 	.lookup		= afs_dir_lookup,
+	.permission	= afs_permission,
 	.getattr	= afs_inode_getattr,
 #if 0 /* TODO */
 	.create		= afs_dir_create,
@@ -169,13 +172,17 @@ static inline void afs_dir_put_page(struct page *page)
 /*
  * get a page into the pagecache
  */
-static struct page *afs_dir_get_page(struct inode *dir, unsigned long index)
+static struct page *afs_dir_get_page(struct inode *dir, unsigned long index,
+				     struct key *key)
 {
 	struct page *page;
+	struct file file = {
+		.private_data = key,
+	};
 
 	_enter("{%lu},%lu", dir->i_ino, index);
 
-	page = read_mapping_page(dir->i_mapping, index, NULL);
+	page = read_mapping_page(dir->i_mapping, index, &file);
 	if (!IS_ERR(page)) {
 		wait_on_page_locked(page);
 		kmap(page);
@@ -207,8 +214,7 @@ static int afs_dir_open(struct inode *inode, struct file *file)
 	if (test_bit(AFS_VNODE_DELETED, &AFS_FS_I(inode)->flags))
 		return -ENOENT;
 
-	_leave(" = 0");
-	return 0;
+	return afs_open(inode, file);
 }
 
 /*
@@ -311,7 +317,7 @@ static int afs_dir_iterate_block(unsigned *fpos,
  * iterate through the data blob that lists the contents of an AFS directory
  */
 static int afs_dir_iterate(struct inode *dir, unsigned *fpos, void *cookie,
-			   filldir_t filldir)
+			   filldir_t filldir, struct key *key)
 {
 	union afs_dir_block *dblock;
 	struct afs_dir_page *dbuf;
@@ -336,7 +342,7 @@ static int afs_dir_iterate(struct inode *dir, unsigned *fpos, void *cookie,
 		blkoff = *fpos & ~(sizeof(union afs_dir_block) - 1);
 
 		/* fetch the appropriate page from the directory */
-		page = afs_dir_get_page(dir, blkoff / PAGE_SIZE);
+		page = afs_dir_get_page(dir, blkoff / PAGE_SIZE, key);
 		if (IS_ERR(page)) {
 			ret = PTR_ERR(page);
 			break;
@@ -381,9 +387,11 @@ static int afs_dir_readdir(struct file *file, void *cookie, filldir_t filldir)
 	_enter("{%Ld,{%lu}}",
 	       file->f_pos, file->f_path.dentry->d_inode->i_ino);
 
+	ASSERT(file->private_data != NULL);
+
 	fpos = file->f_pos;
 	ret = afs_dir_iterate(file->f_path.dentry->d_inode, &fpos,
-			      cookie, filldir);
+			      cookie, filldir, file->private_data);
 	file->f_pos = fpos;
 
 	_leave(" = %d", ret);
@@ -424,7 +432,7 @@ static int afs_dir_lookup_filldir(void *_cookie, const char *name, int nlen,
  * do a lookup in a directory
  */
 static int afs_do_lookup(struct inode *dir, struct dentry *dentry,
-			 struct afs_fid *fid)
+			 struct afs_fid *fid, struct key *key)
 {
 	struct afs_dir_lookup_cookie cookie;
 	struct afs_super_info *as;
@@ -442,7 +450,8 @@ static int afs_do_lookup(struct inode *dir, struct dentry *dentry,
 	cookie.found	= 0;
 
 	fpos = 0;
-	ret = afs_dir_iterate(dir, &fpos, &cookie, afs_dir_lookup_filldir);
+	ret = afs_dir_iterate(dir, &fpos, &cookie, afs_dir_lookup_filldir,
+			      key);
 	if (ret < 0) {
 		_leave(" = %d [iter]", ret);
 		return ret;
@@ -468,6 +477,7 @@ static struct dentry *afs_dir_lookup(struct inode *dir, struct dentry *dentry,
 	struct afs_vnode *vnode;
 	struct afs_fid fid;
 	struct inode *inode;
+	struct key *key;
 	int ret;
 
 	_enter("{%lu},%p{%s}", dir->i_ino, dentry, dentry->d_name.name);
@@ -483,14 +493,22 @@ static struct dentry *afs_dir_lookup(struct inode *dir, struct dentry *dentry,
 		return ERR_PTR(-ESTALE);
 	}
 
-	ret = afs_do_lookup(dir, dentry, &fid);
+	key = afs_request_key(vnode->volume->cell);
+	if (IS_ERR(key)) {
+		_leave(" = %ld [key]", PTR_ERR(key));
+		return ERR_PTR(PTR_ERR(key));
+	}
+
+	ret = afs_do_lookup(dir, dentry, &fid, key);
 	if (ret < 0) {
+		key_put(key);
 		_leave(" = %d [do]", ret);
 		return ERR_PTR(ret);
 	}
 
 	/* instantiate the dentry */
-	inode = afs_iget(dir->i_sb, &fid);
+	inode = afs_iget(dir->i_sb, key, &fid);
+	key_put(key);
 	if (IS_ERR(inode)) {
 		_leave(" = %ld", PTR_ERR(inode));
 		return ERR_PTR(PTR_ERR(inode));
@@ -559,6 +577,7 @@ static int afs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
 	struct afs_fid fid;
 	struct dentry *parent;
 	struct inode *inode, *dir;
+	struct key *key;
 	int ret;
 
 	vnode = AFS_FS_I(dentry->d_inode);
@@ -566,6 +585,10 @@ static int afs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
 	_enter("{sb=%p n=%s fl=%lx},",
 	       dentry->d_sb, dentry->d_name.name, vnode->flags);
 
+	key = afs_request_key(vnode->volume->cell);
+	if (IS_ERR(key))
+		key = NULL;
+
 	/* lock down the parent dentry so we can peer at it */
 	parent = dget_parent(dentry);
 
@@ -595,7 +618,7 @@ static int afs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
 		_debug("dir modified");
 
 		/* search the directory for this vnode */
-		ret = afs_do_lookup(dir, dentry, &fid);
+		ret = afs_do_lookup(dir, dentry, &fid, key);
 		if (ret == -ENOENT) {
 			_debug("%s: dirent not found", dentry->d_name.name);
 			goto not_found;
@@ -637,7 +660,7 @@ static int afs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
 	    test_bit(AFS_VNODE_CB_BROKEN, &vnode->flags)) {
 		_debug("%s: changed", dentry->d_name.name);
 		set_bit(AFS_VNODE_CB_BROKEN, &vnode->flags);
-		if (afs_vnode_fetch_status(vnode) < 0) {
+		if (afs_vnode_fetch_status(vnode, NULL, key) < 0) {
 			mutex_unlock(&vnode->cb_broken_lock);
 			goto out_bad;
 		}
@@ -667,6 +690,7 @@ static int afs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
 
 out_valid:
 	dput(parent);
+	key_put(key);
 	_leave(" = 1 [valid]");
 	return 1;
 
@@ -688,6 +712,7 @@ out_bad:
 	shrink_dcache_parent(dentry);
 	d_drop(dentry);
 	dput(parent);
+	key_put(key);
 
 	_leave(" = 0 [bad]");
 	return 0;
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 6990327..101bbb8 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -17,17 +17,23 @@
 #include <linux/pagemap.h>
 #include "internal.h"
 
-#if 0
-static int afs_file_open(struct inode *inode, struct file *file);
-static int afs_file_release(struct inode *inode, struct file *file);
-#endif
-
 static int afs_file_readpage(struct file *file, struct page *page);
 static void afs_file_invalidatepage(struct page *page, unsigned long offset);
 static int afs_file_releasepage(struct page *page, gfp_t gfp_flags);
 
+const struct file_operations afs_file_operations = {
+	.open		= afs_open,
+	.release	= afs_release,
+	.llseek		= generic_file_llseek,
+	.read		= do_sync_read,
+	.aio_read	= generic_file_aio_read,
+	.mmap		= generic_file_readonly_mmap,
+	.sendfile	= generic_file_sendfile,
+};
+
 const struct inode_operations afs_file_inode_operations = {
 	.getattr	= afs_inode_getattr,
+	.permission	= afs_permission,
 };
 
 const struct address_space_operations afs_fs_aops = {
@@ -38,6 +44,41 @@ const struct address_space_operations afs_fs_aops = {
 };
 
 /*
+ * open an AFS file or directory and attach a key to it
+ */
+int afs_open(struct inode *inode, struct file *file)
+{
+	struct afs_vnode *vnode = AFS_FS_I(inode);
+	struct key *key;
+
+	_enter("{%x:%x},", vnode->fid.vid, vnode->fid.vnode);
+
+	key = afs_request_key(vnode->volume->cell);
+	if (IS_ERR(key)) {
+		_leave(" = %ld [key]", PTR_ERR(key));
+		return PTR_ERR(key);
+	}
+
+	file->private_data = key;
+	_leave(" = 0");
+	return 0;
+}
+
+/*
+ * release an AFS file or directory and discard its key
+ */
+int afs_release(struct inode *inode, struct file *file)
+{
+	struct afs_vnode *vnode = AFS_FS_I(inode);
+
+	_enter("{%x:%x},", vnode->fid.vid, vnode->fid.vnode);
+
+	key_put(file->private_data);
+	_leave(" = 0");
+	return 0;
+}
+
+/*
  * deal with notification that a page was read from the cache
  */
 #ifdef AFS_CACHING_SUPPORT
@@ -79,13 +120,18 @@ static int afs_file_readpage(struct file *file, struct page *page)
 {
 	struct afs_vnode *vnode;
 	struct inode *inode;
+	struct key *key;
 	size_t len;
 	off_t offset;
 	int ret;
 
 	inode = page->mapping->host;
 
-	_enter("{%lu},{%lu}", inode->i_ino, page->index);
+	ASSERT(file != NULL);
+	key = file->private_data;
+	ASSERT(key != NULL);
+
+	_enter("{%x},{%lu},{%lu}", key_serial(key), inode->i_ino, page->index);
 
 	vnode = AFS_FS_I(inode);
 
@@ -124,7 +170,7 @@ static int afs_file_readpage(struct file *file, struct page *page)
 
 		/* read the contents of the file from the server into the
 		 * page */
-		ret = afs_vnode_fetch_data(vnode, offset, len, page);
+		ret = afs_vnode_fetch_data(vnode, key, offset, len, page);
 		if (ret < 0) {
 			if (ret == -ENOENT) {
 				_debug("got NOENT from server"
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index e2a36f8..8f20f1b 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -148,6 +148,7 @@ static int afs_deliver_fs_fetch_status(struct afs_call *call,
  * FS.FetchStatus operation type
  */
 static const struct afs_call_type afs_RXFSFetchStatus = {
+	.name		= "FS.FetchStatus",
 	.deliver	= afs_deliver_fs_fetch_status,
 	.abort_to_error	= afs_abort_to_error,
 	.destructor	= afs_flat_call_destructor,
@@ -157,6 +158,7 @@ static const struct afs_call_type afs_RXFSFetchStatus = {
  * fetch the status information for a file
  */
 int afs_fs_fetch_file_status(struct afs_server *server,
+			     struct key *key,
 			     struct afs_vnode *vnode,
 			     struct afs_volsync *volsync,
 			     const struct afs_wait_mode *wait_mode)
@@ -164,12 +166,13 @@ int afs_fs_fetch_file_status(struct afs_server *server,
 	struct afs_call *call;
 	__be32 *bp;
 
-	_enter("");
+	_enter(",%x,,,", key_serial(key));
 
 	call = afs_alloc_flat_call(&afs_RXFSFetchStatus, 16, 120);
 	if (!call)
 		return -ENOMEM;
 
+	call->key = key;
 	call->reply = vnode;
 	call->reply2 = volsync;
 	call->service_id = FS_SERVICE;
@@ -279,6 +282,7 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call,
  * FS.FetchData operation type
  */
 static const struct afs_call_type afs_RXFSFetchData = {
+	.name		= "FS.FetchData",
 	.deliver	= afs_deliver_fs_fetch_data,
 	.abort_to_error	= afs_abort_to_error,
 	.destructor	= afs_flat_call_destructor,
@@ -288,6 +292,7 @@ static const struct afs_call_type afs_RXFSFetchData = {
  * fetch data from a file
  */
 int afs_fs_fetch_data(struct afs_server *server,
+		      struct key *key,
 		      struct afs_vnode *vnode,
 		      off_t offset, size_t length,
 		      struct page *buffer,
@@ -303,6 +308,7 @@ int afs_fs_fetch_data(struct afs_server *server,
 	if (!call)
 		return -ENOMEM;
 
+	call->key = key;
 	call->reply = vnode;
 	call->reply2 = volsync;
 	call->reply3 = buffer;
@@ -338,6 +344,7 @@ static int afs_deliver_fs_give_up_callbacks(struct afs_call *call,
  * FS.GiveUpCallBacks operation type
  */
 static const struct afs_call_type afs_RXFSGiveUpCallBacks = {
+	.name		= "FS.GiveUpCallBacks",
 	.deliver	= afs_deliver_fs_give_up_callbacks,
 	.abort_to_error	= afs_abort_to_error,
 	.destructor	= afs_flat_call_destructor,
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 1886331..2273362 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -29,7 +29,7 @@ struct afs_iget_data {
 /*
  * map the AFS file status to the inode member variables
  */
-static int afs_inode_map_status(struct afs_vnode *vnode)
+static int afs_inode_map_status(struct afs_vnode *vnode, struct key *key)
 {
 	struct inode *inode = AFS_VNODE_TO_I(vnode);
 
@@ -44,7 +44,7 @@ static int afs_inode_map_status(struct afs_vnode *vnode)
 	case AFS_FTYPE_FILE:
 		inode->i_mode	= S_IFREG | vnode->status.mode;
 		inode->i_op	= &afs_file_inode_operations;
-		inode->i_fop	= &generic_ro_fops;
+		inode->i_fop	= &afs_file_operations;
 		break;
 	case AFS_FTYPE_DIR:
 		inode->i_mode	= S_IFDIR | vnode->status.mode;
@@ -73,7 +73,7 @@ static int afs_inode_map_status(struct afs_vnode *vnode)
 
 	/* check to see whether a symbolic link is really a mountpoint */
 	if (vnode->status.type == AFS_FTYPE_SYMLINK) {
-		afs_mntpt_check_symlink(vnode);
+		afs_mntpt_check_symlink(vnode, key);
 
 		if (test_bit(AFS_VNODE_MOUNTPOINT, &vnode->flags)) {
 			inode->i_mode	= S_IFDIR | vnode->status.mode;
@@ -115,7 +115,8 @@ static int afs_iget5_set(struct inode *inode, void *opaque)
 /*
  * inode retrieval
  */
-inline struct inode *afs_iget(struct super_block *sb, struct afs_fid *fid)
+inline struct inode *afs_iget(struct super_block *sb, struct key *key,
+			      struct afs_fid *fid)
 {
 	struct afs_iget_data data = { .fid = *fid };
 	struct afs_super_info *as;
@@ -157,10 +158,10 @@ inline struct inode *afs_iget(struct super_block *sb, struct afs_fid *fid)
 
 	/* okay... it's a new inode */
 	set_bit(AFS_VNODE_CB_BROKEN, &vnode->flags);
-	ret = afs_vnode_fetch_status(vnode);
+	ret = afs_vnode_fetch_status(vnode, NULL, key);
 	if (ret < 0)
 		goto bad_inode;
-	ret = afs_inode_map_status(vnode);
+	ret = afs_inode_map_status(vnode, key);
 	if (ret < 0)
 		goto bad_inode;
 
@@ -201,6 +202,7 @@ int afs_inode_getattr(struct vfsmount *mnt, struct dentry *dentry,
  */
 void afs_clear_inode(struct inode *inode)
 {
+	struct afs_permits *permits;
 	struct afs_vnode *vnode;
 
 	vnode = AFS_FS_I(inode);
@@ -233,5 +235,12 @@ void afs_clear_inode(struct inode *inode)
 	vnode->cache = NULL;
 #endif
 
+	mutex_lock(&vnode->permits_lock);
+	permits = vnode->permits;
+	rcu_assign_pointer(vnode->permits, NULL);
+	mutex_unlock(&vnode->permits_lock);
+	if (permits)
+		call_rcu(&permits->rcu, afs_zap_permits);
+
 	_leave("");
 }
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 12ad524..d49201e 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -15,6 +15,7 @@
 #include <linux/pagemap.h>
 #include <linux/skbuff.h>
 #include <linux/rxrpc.h>
+#include <linux/key.h>
 #include "AFS.h"
 #include "AFS_VL.h"
 
@@ -32,6 +33,17 @@ typedef enum {
 	AFS_VL_UNCERTAIN,		/* uncertain state (update failed) */
 } __attribute__((packed)) afs_vlocation_state_t;
 
+struct afs_mount_params {
+	bool			rwpath;		/* T if the parent should be considered R/W */
+	bool			force;		/* T to force cell type */
+	afs_voltype_t		type;		/* type of volume requested */
+	int			volnamesz;	/* size of volume name */
+	const char		*volname;	/* name of volume to mount */
+	struct afs_cell		*cell;		/* cell in which to find volume */
+	struct afs_volume	*volume;	/* volume record */
+	struct key		*key;		/* key to use for secure mounting */
+};
+
 /*
  * definition of how to wait for the completion of an operation
  */
@@ -95,6 +107,8 @@ struct afs_call {
 };
 
 struct afs_call_type {
+	const char *name;
+
 	/* deliver request or reply data to an call
 	 * - returning an error will cause the call to be aborted
 	 */
@@ -128,8 +142,8 @@ extern struct file_system_type afs_fs_type;
  * entry in the cached cell catalogue
  */
 struct afs_cache_cell {
-	char			name[64];	/* cell name (padded with NULs) */
-	struct in_addr		vl_servers[15];	/* cached cell VL servers */
+	char		name[AFS_MAXCELLNAME];	/* cell name (padded with NULs) */
+	struct in_addr	vl_servers[15];		/* cached cell VL servers */
 };
 
 /*
@@ -138,6 +152,7 @@ struct afs_cache_cell {
 struct afs_cell {
 	atomic_t		usage;
 	struct list_head	link;		/* main cell list link */
+	struct key		*anonymous_key;	/* anonymous user key for this cell */
 	struct list_head	proc_link;	/* /proc cell list link */
 	struct proc_dir_entry	*proc_dir;	/* /proc dir for this cell */
 #ifdef AFS_CACHING_SUPPORT
@@ -163,7 +178,9 @@ struct afs_cell {
  * entry in the cached volume location catalogue
  */
 struct afs_cache_vlocation {
-	uint8_t			name[64 + 1];	/* volume name (lowercase, padded with NULs) */
+	/* volume name (lowercase, padded with NULs) */
+	uint8_t			name[AFS_MAXVOLNAME + 1];
+
 	uint8_t			nservers;	/* number of entries used in servers[] */
 	uint8_t			vidmask;	/* voltype mask for vid[] */
 	uint8_t			srvtmask[8];	/* voltype masks for servers[] */
@@ -281,7 +298,8 @@ struct afs_vnode {
 #ifdef AFS_CACHING_SUPPORT
 	struct cachefs_cookie	*cache;		/* caching cookie */
 #endif
-
+	struct afs_permits	*permits;	/* cache of permits so far obtained */
+	struct mutex		permits_lock;	/* lock for altering permits list */
 	wait_queue_head_t	update_waitq;	/* status fetch waitqueue */
 	unsigned		update_cnt;	/* number of outstanding ops that will update the
 						 * status */
@@ -296,12 +314,13 @@ struct afs_vnode {
 #define AFS_VNODE_DIR_CHANGED	6		/* set if vnode's parent dir metadata changed */
 #define AFS_VNODE_DIR_MODIFIED	7		/* set if vnode's parent dir data modified */
 
+	long			acl_order;	/* ACL check count (callback break count) */
+
 	/* outstanding callback notification on this file */
 	struct rb_node		server_rb;	/* link in server->fs_vnodes */
 	struct rb_node		cb_promise;	/* link in server->cb_promises */
 	struct work_struct	cb_broken_work;	/* work to be done on callback break */
 	struct mutex		cb_broken_lock;	/* lock against multiple attempts to fix break */
-//	struct list_head	cb_hash_link;	/* link in master callback hash */
 	time_t			cb_expires;	/* time at which callback expires */
 	time_t			cb_expires_at;	/* time used to order cb_promise */
 	unsigned		cb_version;	/* callback version */
@@ -310,6 +329,23 @@ struct afs_vnode {
 	bool			cb_promised;	/* true if promise still holds */
 };
 
+/*
+ * cached security record for one user's attempt to access a vnode
+ */
+struct afs_permit {
+	struct key		*key;		/* RxRPC ticket holding a security context */
+	afs_access_t		access_mask;	/* access mask for this key */
+};
+
+/*
+ * cache of security records from attempts to access a vnode
+ */
+struct afs_permits {
+	struct rcu_head		rcu;		/* disposal procedure */
+	int			count;		/* number of records */
+	struct afs_permit	permits[0];	/* the permits so far examined */
+};
+
 /*****************************************************************************/
 /*
  * callback.c
@@ -352,11 +388,17 @@ extern bool afs_cm_incoming_call(struct afs_call *);
 extern const struct inode_operations afs_dir_inode_operations;
 extern const struct file_operations afs_dir_file_operations;
 
+extern int afs_permission(struct inode *, int, struct nameidata *);
+
 /*
  * file.c
  */
 extern const struct address_space_operations afs_fs_aops;
 extern const struct inode_operations afs_file_inode_operations;
+extern const struct file_operations afs_file_operations;
+
+extern int afs_open(struct inode *, struct file *);
+extern int afs_release(struct inode *, struct file *);
 
 #ifdef AFS_CACHING_SUPPORT
 extern int afs_cache_get_page_cookie(struct page *, struct cachefs_page **);
@@ -365,22 +407,24 @@ extern int afs_cache_get_page_cookie(struct page *, struct cachefs_page **);
 /*
  * fsclient.c
  */
-extern int afs_fs_fetch_file_status(struct afs_server *,
-				    struct afs_vnode *,
-				    struct afs_volsync *,
+extern int afs_fs_fetch_file_status(struct afs_server *, struct key *,
+				    struct afs_vnode *, struct afs_volsync *,
 				    const struct afs_wait_mode *);
 extern int afs_fs_give_up_callbacks(struct afs_server *,
 				    const struct afs_wait_mode *);
-extern int afs_fs_fetch_data(struct afs_server *, struct afs_vnode *, off_t,
-			     size_t, struct page *, struct afs_volsync *,
+extern int afs_fs_fetch_data(struct afs_server *, struct key *,
+			     struct afs_vnode *, off_t, size_t, struct page *,
+			     struct afs_volsync *,
 			     const struct afs_wait_mode *);
 
 /*
  * inode.c
  */
-extern struct inode *afs_iget(struct super_block *, struct afs_fid *);
+extern struct inode *afs_iget(struct super_block *, struct key *,
+			      struct afs_fid *);
 extern int afs_inode_getattr(struct vfsmount *, struct dentry *,
 			     struct kstat *);
+extern void afs_zap_permits(struct rcu_head *);
 extern void afs_clear_inode(struct inode *);
 
 /*
@@ -402,17 +446,11 @@ extern const struct inode_operations afs_mntpt_inode_operations;
 extern const struct file_operations afs_mntpt_file_operations;
 extern unsigned long afs_mntpt_expiry_timeout;
 
-extern int afs_mntpt_check_symlink(struct afs_vnode *);
+extern int afs_mntpt_check_symlink(struct afs_vnode *, struct key *);
 extern void afs_mntpt_kill_timer(void);
 extern void afs_umount_begin(struct vfsmount *, int);
 
 /*
- * super.c
- */
-extern int afs_fs_init(void);
-extern void afs_fs_exit(void);
-
-/*
  * proc.c
  */
 extern int afs_proc_init(void);
@@ -436,6 +474,14 @@ extern int afs_extract_data(struct afs_call *, struct sk_buff *, bool, void *,
 			    size_t);
 
 /*
+ * security.c
+ */
+extern void afs_clear_permits(struct afs_vnode *);
+extern void afs_cache_permit(struct afs_vnode *, struct key *, long);
+extern struct key *afs_request_key(struct afs_cell *);
+extern int afs_permission(struct inode *, int, struct nameidata *);
+
+/*
  * server.c
  */
 extern spinlock_t afs_server_peer_lock;
@@ -449,16 +495,23 @@ extern void afs_put_server(struct afs_server *);
 extern void __exit afs_purge_servers(void);
 
 /*
+ * super.c
+ */
+extern int afs_fs_init(void);
+extern void afs_fs_exit(void);
+
+/*
  * vlclient.c
  */
 #ifdef AFS_CACHING_SUPPORT
 extern struct cachefs_index_def afs_vlocation_cache_index_def;
 #endif
 
-extern int afs_vl_get_entry_by_name(struct in_addr *, const char *,
-				    struct afs_cache_vlocation *,
+extern int afs_vl_get_entry_by_name(struct in_addr *, struct key *,
+				    const char *, struct afs_cache_vlocation *,
 				    const struct afs_wait_mode *);
-extern int afs_vl_get_entry_by_id(struct in_addr *, afs_volid_t, afs_voltype_t,
+extern int afs_vl_get_entry_by_id(struct in_addr *, struct key *,
+				  afs_volid_t, afs_voltype_t,
 				  struct afs_cache_vlocation *,
 				  const struct afs_wait_mode *);
 
@@ -469,6 +522,7 @@ extern int afs_vl_get_entry_by_id(struct in_addr *, afs_volid_t, afs_voltype_t,
 
 extern int __init afs_vlocation_update_init(void);
 extern struct afs_vlocation *afs_vlocation_lookup(struct afs_cell *,
+						  struct key *,
 						  const char *, size_t);
 extern void afs_put_vlocation(struct afs_vlocation *);
 extern void __exit afs_vlocation_purge(void);
@@ -492,9 +546,10 @@ static inline struct inode *AFS_VNODE_TO_I(struct afs_vnode *vnode)
 	return &vnode->vfs_inode;
 }
 
-extern int afs_vnode_fetch_status(struct afs_vnode *);
-extern int afs_vnode_fetch_data(struct afs_vnode *vnode, off_t, size_t,
-				struct page *);
+extern int afs_vnode_fetch_status(struct afs_vnode *, struct afs_vnode *,
+				  struct key *);
+extern int afs_vnode_fetch_data(struct afs_vnode *, struct key *,
+				off_t, size_t, struct page *);
 
 /*
  * volume.c
@@ -506,8 +561,7 @@ extern struct cachefs_index_def afs_volume_cache_index_def;
 #define afs_get_volume(V) do { atomic_inc(&(V)->usage); } while(0)
 
 extern void afs_put_volume(struct afs_volume *);
-extern struct afs_volume *afs_volume_lookup(const char *, struct afs_cell *,
-					    int);
+extern struct afs_volume *afs_volume_lookup(struct afs_mount_params *);
 extern struct afs_server *afs_volume_pick_fileserver(struct afs_vnode *);
 extern int afs_volume_release_fileserver(struct afs_vnode *,
 					 struct afs_server *, int);
diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c
index 08c11a0..b905ae3 100644
--- a/fs/afs/mntpt.c
+++ b/fs/afs/mntpt.c
@@ -48,8 +48,11 @@ unsigned long afs_mntpt_expiry_timeout = 10 * 60;
  * check a symbolic link to see whether it actually encodes a mountpoint
  * - sets the AFS_VNODE_MOUNTPOINT flag on the vnode appropriately
  */
-int afs_mntpt_check_symlink(struct afs_vnode *vnode)
+int afs_mntpt_check_symlink(struct afs_vnode *vnode, struct key *key)
 {
+	struct file file = {
+		.private_data = key,
+	};
 	struct page *page;
 	size_t size;
 	char *buf;
@@ -58,7 +61,7 @@ int afs_mntpt_check_symlink(struct afs_vnode *vnode)
 	_enter("{%u,%u}", vnode->fid.vnode, vnode->fid.unique);
 
 	/* read the contents of the symlink into the pagecache */
-	page = read_mapping_page(AFS_VNODE_TO_I(vnode)->i_mapping, 0, NULL);
+	page = read_mapping_page(AFS_VNODE_TO_I(vnode)->i_mapping, 0, &file);
 	if (IS_ERR(page)) {
 		ret = PTR_ERR(page);
 		goto out;
@@ -214,7 +217,7 @@ static void *afs_mntpt_follow_link(struct dentry *dentry, struct nameidata *nd)
 	struct vfsmount *newmnt;
 	int err;
 
-	_enter("%p{%s},{%s:%p{%s}}",
+	_enter("%p{%s},{%s:%p{%s},}",
 	       dentry,
 	       dentry->d_name.name,
 	       nd->mnt->mnt_devname,
@@ -234,7 +237,8 @@ static void *afs_mntpt_follow_link(struct dentry *dentry, struct nameidata *nd)
 	err = do_add_mount(newmnt, nd, MNT_SHRINKABLE, &afs_vfsmounts);
 	switch (err) {
 	case 0:
-		path_release(nd);
+		mntput(nd->mnt);
+		dput(nd->dentry);
 		nd->mnt = newmnt;
 		nd->dentry = dget(newmnt->mnt_root);
 		schedule_delayed_work(&afs_mntpt_expiry_timer,
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index e34634b..2d05d8a 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -17,6 +17,8 @@
 
 static struct socket *afs_socket; /* my RxRPC socket */
 static struct workqueue_struct *afs_async_calls;
+static atomic_t afs_outstanding_calls;
+static atomic_t afs_outstanding_skbs;
 
 static void afs_wake_up_call_waiter(struct afs_call *);
 static int afs_wait_for_call_to_complete(struct afs_call *);
@@ -45,6 +47,7 @@ static const struct afs_wait_mode afs_async_incoming_call = {
 
 /* asynchronous incoming call initial processing */
 static const struct afs_call_type afs_RXCMxxxx = {
+	.name		= "CB.xxxx",
 	.deliver	= afs_deliver_cm_op_id,
 	.abort_to_error	= afs_abort_to_error,
 };
@@ -118,10 +121,67 @@ void afs_close_socket(void)
 
 	_debug("dework");
 	destroy_workqueue(afs_async_calls);
+
+	ASSERTCMP(atomic_read(&afs_outstanding_skbs), ==, 0);
+	ASSERTCMP(atomic_read(&afs_outstanding_calls), ==, 0);
 	_leave("");
 }
 
 /*
+ * note that the data in a socket buffer is now delivered and that the buffer
+ * should be freed
+ */
+static void afs_data_delivered(struct sk_buff *skb)
+{
+	if (!skb) {
+		_debug("DLVR NULL [%d]", atomic_read(&afs_outstanding_skbs));
+		dump_stack();
+	} else {
+		_debug("DLVR %p{%u} [%d]",
+		       skb, skb->mark, atomic_read(&afs_outstanding_skbs));
+		if (atomic_dec_return(&afs_outstanding_skbs) == -1)
+			BUG();
+		rxrpc_kernel_data_delivered(skb);
+	}
+}
+
+/*
+ * free a socket buffer
+ */
+static void afs_free_skb(struct sk_buff *skb)
+{
+	if (!skb) {
+		_debug("FREE NULL [%d]", atomic_read(&afs_outstanding_skbs));
+		dump_stack();
+	} else {
+		_debug("FREE %p{%u} [%d]",
+		       skb, skb->mark, atomic_read(&afs_outstanding_skbs));
+		if (atomic_dec_return(&afs_outstanding_skbs) == -1)
+			BUG();
+		rxrpc_kernel_free_skb(skb);
+	}
+}
+
+/*
+ * free a call
+ */
+static void afs_free_call(struct afs_call *call)
+{
+	_debug("DONE %p{%s} [%d]",
+	       call, call->type->name, atomic_read(&afs_outstanding_calls));
+	if (atomic_dec_return(&afs_outstanding_calls) == -1)
+		BUG();
+
+	ASSERTCMP(call->rxcall, ==, NULL);
+	ASSERT(!work_pending(&call->async_work));
+	ASSERT(skb_queue_empty(&call->rx_queue));
+	ASSERT(call->type->name != NULL);
+
+	kfree(call->request);
+	kfree(call);
+}
+
+/*
  * allocate a call with flat request and reply buffers
  */
 struct afs_call *afs_alloc_flat_call(const struct afs_call_type *type,
@@ -133,30 +193,32 @@ struct afs_call *afs_alloc_flat_call(const struct afs_call_type *type,
 	if (!call)
 		goto nomem_call;
 
+	_debug("CALL %p{%s} [%d]",
+	       call, type->name, atomic_read(&afs_outstanding_calls));
+	atomic_inc(&afs_outstanding_calls);
+
+	call->type = type;
+	call->request_size = request_size;
+	call->reply_max = reply_size;
+
 	if (request_size) {
 		call->request = kmalloc(request_size, GFP_NOFS);
 		if (!call->request)
-			goto nomem_request;
+			goto nomem_free;
 	}
 
 	if (reply_size) {
 		call->buffer = kmalloc(reply_size, GFP_NOFS);
 		if (!call->buffer)
-			goto nomem_buffer;
+			goto nomem_free;
 	}
 
-	call->type = type;
-	call->request_size = request_size;
-	call->reply_max = reply_size;
-
 	init_waitqueue_head(&call->waitq);
 	skb_queue_head_init(&call->rx_queue);
 	return call;
 
-nomem_buffer:
-	kfree(call->request);
-nomem_request:
-	kfree(call);
+nomem_free:
+	afs_free_call(call);
 nomem_call:
 	return NULL;
 }
@@ -188,6 +250,12 @@ int afs_make_call(struct in_addr *addr, struct afs_call *call, gfp_t gfp,
 
 	_enter("%x,{%d},", addr->s_addr, ntohs(call->port));
 
+	ASSERT(call->type != NULL);
+	ASSERT(call->type->name != NULL);
+
+	_debug("MAKE %p{%s} [%d]",
+	       call, call->type->name, atomic_read(&afs_outstanding_calls));
+
 	call->wait_mode = wait_mode;
 	INIT_WORK(&call->async_work, afs_process_async_call);
 
@@ -203,6 +271,7 @@ int afs_make_call(struct in_addr *addr, struct afs_call *call, gfp_t gfp,
 	/* create a call */
 	rxcall = rxrpc_kernel_begin_call(afs_socket, &srx, call->key,
 					 (unsigned long) call, gfp);
+	call->key = NULL;
 	if (IS_ERR(rxcall)) {
 		ret = PTR_ERR(rxcall);
 		goto error_kill_call;
@@ -237,10 +306,10 @@ int afs_make_call(struct in_addr *addr, struct afs_call *call, gfp_t gfp,
 error_do_abort:
 	rxrpc_kernel_abort_call(rxcall, RX_USER_ABORT);
 	rxrpc_kernel_end_call(rxcall);
+	call->rxcall = NULL;
 error_kill_call:
 	call->type->destructor(call);
-	ASSERT(skb_queue_empty(&call->rx_queue));
-	kfree(call);
+	afs_free_call(call);
 	_leave(" = %d", ret);
 	return ret;
 }
@@ -257,15 +326,19 @@ static void afs_rx_interceptor(struct sock *sk, unsigned long user_call_ID,
 
 	_enter("%p,,%u", call, skb->mark);
 
+	_debug("ICPT %p{%u} [%d]",
+	       skb, skb->mark, atomic_read(&afs_outstanding_skbs));
+
 	ASSERTCMP(sk, ==, afs_socket->sk);
+	atomic_inc(&afs_outstanding_skbs);
 
 	if (!call) {
 		/* its an incoming call for our callback service */
-		__skb_queue_tail(&afs_incoming_calls, skb);
+		skb_queue_tail(&afs_incoming_calls, skb);
 		schedule_work(&afs_collect_incoming_call_work);
 	} else {
 		/* route the messages directly to the appropriate call */
-		__skb_queue_tail(&call->rx_queue, skb);
+		skb_queue_tail(&call->rx_queue, skb);
 		call->wait_mode->rx_wakeup(call);
 	}
 
@@ -317,9 +390,9 @@ static void afs_deliver_to_call(struct afs_call *call)
 				call->state = AFS_CALL_ERROR;
 				break;
 			}
-			rxrpc_kernel_data_delivered(skb);
+			afs_data_delivered(skb);
 			skb = NULL;
-			break;
+			continue;
 		case RXRPC_SKB_MARK_FINAL_ACK:
 			_debug("Rcv ACK");
 			call->state = AFS_CALL_COMPLETE;
@@ -350,19 +423,19 @@ static void afs_deliver_to_call(struct afs_call *call)
 			break;
 		}
 
-		rxrpc_kernel_free_skb(skb);
+		afs_free_skb(skb);
 	}
 
 	/* make sure the queue is empty if the call is done with (we might have
 	 * aborted the call early because of an unmarshalling error) */
 	if (call->state >= AFS_CALL_COMPLETE) {
 		while ((skb = skb_dequeue(&call->rx_queue)))
-			rxrpc_kernel_free_skb(skb);
+			afs_free_skb(skb);
 		if (call->incoming) {
 			rxrpc_kernel_end_call(call->rxcall);
+			call->rxcall = NULL;
 			call->type->destructor(call);
-			ASSERT(skb_queue_empty(&call->rx_queue));
-			kfree(call);
+			afs_free_call(call);
 		}
 	}
 
@@ -409,14 +482,14 @@ static int afs_wait_for_call_to_complete(struct afs_call *call)
 		_debug("call incomplete");
 		rxrpc_kernel_abort_call(call->rxcall, RX_CALL_DEAD);
 		while ((skb = skb_dequeue(&call->rx_queue)))
-			rxrpc_kernel_free_skb(skb);
+			afs_free_skb(skb);
 	}
 
 	_debug("call complete");
 	rxrpc_kernel_end_call(call->rxcall);
+	call->rxcall = NULL;
 	call->type->destructor(call);
-	ASSERT(skb_queue_empty(&call->rx_queue));
-	kfree(call);
+	afs_free_call(call);
 	_leave(" = %d", ret);
 	return ret;
 }
@@ -459,9 +532,7 @@ static void afs_delete_async_call(struct work_struct *work)
 
 	_enter("");
 
-	ASSERT(skb_queue_empty(&call->rx_queue));
-	ASSERT(!work_pending(&call->async_work));
-	kfree(call);
+	afs_free_call(call);
 
 	_leave("");
 }
@@ -489,6 +560,7 @@ static void afs_process_async_call(struct work_struct *work)
 
 		/* kill the call */
 		rxrpc_kernel_end_call(call->rxcall);
+		call->rxcall = NULL;
 		if (call->type->destructor)
 			call->type->destructor(call);
 
@@ -526,7 +598,7 @@ static void afs_collect_incoming_call(struct work_struct *work)
 		_debug("new call");
 
 		/* don't need the notification */
-		rxrpc_kernel_free_skb(skb);
+		afs_free_skb(skb);
 
 		if (!call) {
 			call = kzalloc(sizeof(struct afs_call), GFP_KERNEL);
@@ -541,6 +613,11 @@ static void afs_collect_incoming_call(struct work_struct *work)
 			init_waitqueue_head(&call->waitq);
 			skb_queue_head_init(&call->rx_queue);
 			call->state = AFS_CALL_AWAIT_OP_ID;
+
+			_debug("CALL %p{%s} [%d]",
+			       call, call->type->name,
+			       atomic_read(&afs_outstanding_calls));
+			atomic_inc(&afs_outstanding_calls);
 		}
 
 		rxcall = rxrpc_kernel_accept_call(afs_socket,
@@ -551,7 +628,8 @@ static void afs_collect_incoming_call(struct work_struct *work)
 		}
 	}
 
-	kfree(call);
+	if (call)
+		afs_free_call(call);
 }
 
 /*
@@ -629,8 +707,7 @@ void afs_send_empty_reply(struct afs_call *call)
 		rxrpc_kernel_end_call(call->rxcall);
 		call->rxcall = NULL;
 		call->type->destructor(call);
-		ASSERT(skb_queue_empty(&call->rx_queue));
-		kfree(call);
+		afs_free_call(call);
 		_leave(" [error]");
 		return;
 	}
diff --git a/fs/afs/security.c b/fs/afs/security.c
new file mode 100644
index 0000000..cbdd7f7
--- /dev/null
+++ b/fs/afs/security.c
@@ -0,0 +1,345 @@
+/* AFS security handling
+ *
+ * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/fs.h>
+#include <linux/ctype.h>
+#include <keys/rxrpc-type.h>
+#include "internal.h"
+
+/*
+ * get a key
+ */
+struct key *afs_request_key(struct afs_cell *cell)
+{
+	struct key *key;
+
+	_enter("{%x}", key_serial(cell->anonymous_key));
+
+	_debug("key %s", cell->anonymous_key->description);
+	key = request_key(&key_type_rxrpc, cell->anonymous_key->description,
+			  NULL);
+	if (IS_ERR(key)) {
+		if (PTR_ERR(key) != -ENOKEY) {
+			_leave(" = %ld", PTR_ERR(key));
+			return key;
+		}
+
+		/* act as anonymous user */
+		_leave(" = {%x} [anon]", key_serial(cell->anonymous_key));
+		return key_get(cell->anonymous_key);
+	} else {
+		/* act as authorised user */
+		_leave(" = {%x} [auth]", key_serial(key));
+		return key;
+	}
+}
+
+/*
+ * dispose of a permits list
+ */
+void afs_zap_permits(struct rcu_head *rcu)
+{
+	struct afs_permits *permits =
+		container_of(rcu, struct afs_permits, rcu);
+	int loop;
+
+	_enter("{%d}", permits->count);
+
+	for (loop = permits->count - 1; loop >= 0; loop--)
+		key_put(permits->permits[loop].key);
+	kfree(permits);
+}
+
+/*
+ * dispose of a permits list in which all the key pointers have been copied
+ */
+static void afs_dispose_of_permits(struct rcu_head *rcu)
+{
+	struct afs_permits *permits =
+		container_of(rcu, struct afs_permits, rcu);
+
+	_enter("{%d}", permits->count);
+
+	kfree(permits);
+}
+
+/*
+ * get the authorising vnode - this is the specified inode itself if it's a
+ * directory or it's the parent directory if the specified inode is a file or
+ * symlink
+ * - the caller must release the ref on the inode
+ */
+static struct afs_vnode *afs_get_auth_inode(struct afs_vnode *vnode,
+					    struct key *key)
+{
+	struct afs_vnode *auth_vnode;
+	struct inode *auth_inode;
+
+	_enter("");
+
+	if (S_ISDIR(vnode->vfs_inode.i_mode)) {
+		auth_inode = igrab(&vnode->vfs_inode);
+		ASSERT(auth_inode != NULL);
+	} else {
+		auth_inode = afs_iget(vnode->vfs_inode.i_sb, key,
+				      &vnode->status.parent);
+		if (IS_ERR(auth_inode))
+			return ERR_PTR(PTR_ERR(auth_inode));
+	}
+
+	auth_vnode = AFS_FS_I(auth_inode);
+	_leave(" = {%x}", auth_vnode->fid.vnode);
+	return auth_vnode;
+}
+
+/*
+ * clear the permit cache on a directory vnode
+ */
+void afs_clear_permits(struct afs_vnode *vnode)
+{
+	struct afs_permits *permits;
+
+	_enter("{%x}", vnode->fid.vnode);
+
+	mutex_lock(&vnode->permits_lock);
+	permits = vnode->permits;
+	rcu_assign_pointer(vnode->permits, NULL);
+	mutex_unlock(&vnode->permits_lock);
+
+	if (permits)
+		call_rcu(&permits->rcu, afs_zap_permits);
+	_leave("");
+}
+
+/*
+ * add the result obtained for a vnode to its or its parent directory's cache
+ * for the key used to access it
+ */
+void afs_cache_permit(struct afs_vnode *vnode, struct key *key, long acl_order)
+{
+	struct afs_permits *permits, *xpermits;
+	struct afs_permit *permit;
+	struct afs_vnode *auth_vnode;
+	int count, loop;
+
+	_enter("{%x},%x,%lx", vnode->fid.vnode, key_serial(key), acl_order);
+
+	auth_vnode = afs_get_auth_inode(vnode, key);
+	if (IS_ERR(auth_vnode)) {
+		_leave(" [get error %ld]", PTR_ERR(auth_vnode));
+		return;
+	}
+
+	mutex_lock(&auth_vnode->permits_lock);
+
+	/* guard against a rename being detected whilst we waited for the
+	 * lock */
+	if (memcmp(&auth_vnode->fid, &vnode->status.parent,
+		   sizeof(struct afs_fid)) != 0) {
+		_debug("renamed");
+		goto out_unlock;
+	}
+
+	/* have to be careful as the directory's callback may be broken between
+	 * us receiving the status we're trying to cache and us getting the
+	 * lock to update the cache for the status */
+	if (auth_vnode->acl_order - acl_order > 0) {
+		_debug("ACL changed?");
+		goto out_unlock;
+	}
+
+	/* always update the anonymous mask */
+	_debug("anon access %x", vnode->status.anon_access);
+	auth_vnode->status.anon_access = vnode->status.anon_access;
+	if (key == vnode->volume->cell->anonymous_key)
+		goto out_unlock;
+
+	xpermits = auth_vnode->permits;
+	count = 0;
+	if (xpermits) {
+		/* see if the permit is already in the list
+		 * - if it is then we just amend the list
+		 */
+		count = xpermits->count;
+		permit = xpermits->permits;
+		for (loop = count; loop > 0; loop--) {
+			if (permit->key == key) {
+				permit->access_mask =
+					vnode->status.caller_access;
+				goto out_unlock;
+			}
+			permit++;
+		}
+	}
+
+	permits = kmalloc(sizeof(*permits) + sizeof(*permit) * (count + 1),
+			  GFP_NOFS);
+	if (!permits)
+		goto out_unlock;
+
+	memcpy(permits->permits, xpermits->permits,
+	       count * sizeof(struct afs_permit));
+
+	_debug("key %x access %x",
+	       key_serial(key), vnode->status.caller_access);
+	permits->permits[count].access_mask = vnode->status.caller_access;
+	permits->permits[count].key = key_get(key);
+	permits->count = count + 1;
+
+	rcu_assign_pointer(auth_vnode->permits, permits);
+	if (xpermits)
+		call_rcu(&xpermits->rcu, afs_dispose_of_permits);
+
+out_unlock:
+	mutex_unlock(&auth_vnode->permits_lock);
+	iput(&auth_vnode->vfs_inode);
+	_leave("");
+}
+
+/*
+ * check with the fileserver to see if the directory or parent directory is
+ * permitted to be accessed with this authorisation, and if so, what access it
+ * is granted
+ */
+static int afs_check_permit(struct afs_vnode *vnode, struct key *key,
+			    afs_access_t *_access)
+{
+	struct afs_permits *permits;
+	struct afs_permit *permit;
+	struct afs_vnode *auth_vnode;
+	bool valid;
+	int loop, ret;
+
+	_enter("");
+
+	auth_vnode = afs_get_auth_inode(vnode, key);
+	if (IS_ERR(auth_vnode)) {
+		*_access = 0;
+		_leave(" = %ld", PTR_ERR(auth_vnode));
+		return PTR_ERR(auth_vnode);
+	}
+
+	ASSERT(S_ISDIR(auth_vnode->vfs_inode.i_mode));
+
+	/* check the permits to see if we've got one yet */
+	if (key == auth_vnode->volume->cell->anonymous_key) {
+		_debug("anon");
+		*_access = auth_vnode->status.anon_access;
+		valid = true;
+	} else {
+		valid = false;
+		rcu_read_lock();
+		permits = rcu_dereference(auth_vnode->permits);
+		if (permits) {
+			permit = permits->permits;
+			for (loop = permits->count; loop > 0; loop--) {
+				if (permit->key == key) {
+					_debug("found in cache");
+					*_access = permit->access_mask;
+					valid = true;
+					break;
+				}
+				permit++;
+			}
+		}
+		rcu_read_unlock();
+	}
+
+	if (!valid) {
+		/* check the status on the file we're actually interested in
+		 * (the post-processing will cache the result on auth_vnode) */
+		_debug("no valid permit");
+
+		set_bit(AFS_VNODE_CB_BROKEN, &vnode->flags);
+		ret = afs_vnode_fetch_status(vnode, auth_vnode, key);
+		if (ret < 0) {
+			iput(&auth_vnode->vfs_inode);
+			*_access = 0;
+			_leave(" = %d", ret);
+			return ret;
+		}
+	}
+
+	*_access = vnode->status.caller_access;
+	iput(&auth_vnode->vfs_inode);
+	_leave(" = 0 [access %x]", *_access);
+	return 0;
+}
+
+/*
+ * check the permissions on an AFS file
+ * - AFS ACLs are attached to directories only, and a file is controlled by its
+ *   parent directory's ACL
+ */
+int afs_permission(struct inode *inode, int mask, struct nameidata *nd)
+{
+	struct afs_vnode *vnode = AFS_FS_I(inode);
+	afs_access_t access;
+	struct key *key;
+	int ret;
+
+	_enter("{%x:%x},%x,", vnode->fid.vid, vnode->fid.vnode, mask);
+
+	key = afs_request_key(vnode->volume->cell);
+	if (IS_ERR(key)) {
+		_leave(" = %ld [key]", PTR_ERR(key));
+		return PTR_ERR(key);
+	}
+
+	/* check the permits to see if we've got one yet */
+	ret = afs_check_permit(vnode, key, &access);
+	if (ret < 0) {
+		key_put(key);
+		_leave(" = %d [check]", ret);
+		return ret;
+	}
+
+	/* interpret the access mask */
+	_debug("REQ %x ACC %x on %s",
+	       mask, access, S_ISDIR(inode->i_mode) ? "dir" : "file");
+
+	if (S_ISDIR(inode->i_mode)) {
+		if (mask & MAY_EXEC) {
+			if (!(access & AFS_ACE_LOOKUP))
+				goto permission_denied;
+		} else if (mask & MAY_READ) {
+			if (!(access & AFS_ACE_READ))
+				goto permission_denied;
+		} else if (mask & MAY_WRITE) {
+			if (!(access & (AFS_ACE_DELETE | /* rmdir, unlink, rename from */
+					AFS_ACE_INSERT | /* create, mkdir, symlink, rename to */
+					AFS_ACE_WRITE))) /* chmod */
+				goto permission_denied;
+		} else {
+			BUG();
+		}
+	} else {
+		if (!(access & AFS_ACE_LOOKUP))
+			goto permission_denied;
+		if (mask & (MAY_EXEC | MAY_READ)) {
+			if (!(access & AFS_ACE_READ))
+				goto permission_denied;
+		} else if (mask & MAY_WRITE) {
+			if (!(access & AFS_ACE_WRITE))
+				goto permission_denied;
+		}
+	}
+
+	key_put(key);
+	return generic_permission(inode, mask, NULL);
+
+permission_denied:
+	key_put(key);
+	_leave(" = -EACCES");
+	return -EACCES;
+}
diff --git a/fs/afs/super.c b/fs/afs/super.c
index 77e6875..17d7092 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -24,12 +24,6 @@
 
 #define AFS_FS_MAGIC 0x6B414653 /* 'kAFS' */
 
-struct afs_mount_params {
-	int			rwpath;
-	struct afs_cell		*default_cell;
-	struct afs_volume	*volume;
-};
-
 static void afs_i_init_once(void *foo, struct kmem_cache *cachep,
 			    unsigned long flags);
 
@@ -150,8 +144,8 @@ static int want_no_value(char *const *_value, const char *option)
  * - this function has been shamelessly adapted from the ext3 fs which
  *   shamelessly adapted it from the msdos fs
  */
-static int afs_super_parse_options(struct afs_mount_params *params,
-				   char *options, const char **devname)
+static int afs_parse_options(struct afs_mount_params *params,
+			     char *options, const char **devname)
 {
 	struct afs_cell *cell;
 	char *key, *value;
@@ -183,8 +177,8 @@ static int afs_super_parse_options(struct afs_mount_params *params,
 			cell = afs_cell_lookup(value, strlen(value));
 			if (IS_ERR(cell))
 				return PTR_ERR(cell);
-			afs_put_cell(params->default_cell);
-			params->default_cell = cell;
+			afs_put_cell(params->cell);
+			params->cell = cell;
 		} else {
 			printk("kAFS: Unknown mount option: '%s'\n",  key);
 			ret = -EINVAL;
@@ -199,6 +193,99 @@ error:
 }
 
 /*
+ * parse a device name to get cell name, volume name, volume type and R/W
+ * selector
+ * - this can be one of the following:
+ *	"%[cell:]volume[.]"		R/W volume
+ *	"#[cell:]volume[.]"		R/O or R/W volume (rwpath=0),
+ *					 or R/W (rwpath=1) volume
+ *	"%[cell:]volume.readonly"	R/O volume
+ *	"#[cell:]volume.readonly"	R/O volume
+ *	"%[cell:]volume.backup"		Backup volume
+ *	"#[cell:]volume.backup"		Backup volume
+ */
+static int afs_parse_device_name(struct afs_mount_params *params,
+				 const char *name)
+{
+	struct afs_cell *cell;
+	const char *cellname, *suffix;
+	int cellnamesz;
+
+	_enter(",%s", name);
+
+	if (!name) {
+		printk(KERN_ERR "kAFS: no volume name specified\n");
+		return -EINVAL;
+	}
+
+	if ((name[0] != '%' && name[0] != '#') || !name[1]) {
+		printk(KERN_ERR "kAFS: unparsable volume name\n");
+		return -EINVAL;
+	}
+
+	/* determine the type of volume we're looking for */
+	params->type = AFSVL_ROVOL;
+	params->force = false;
+	if (params->rwpath || name[0] == '%') {
+		params->type = AFSVL_RWVOL;
+		params->force = true;
+	}
+	name++;
+
+	/* split the cell name out if there is one */
+	params->volname = strchr(name, ':');
+	if (params->volname) {
+		cellname = name;
+		cellnamesz = params->volname - name;
+		params->volname++;
+	} else {
+		params->volname = name;
+		cellname = NULL;
+		cellnamesz = 0;
+	}
+
+	/* the volume type is further affected by a possible suffix */
+	suffix = strrchr(params->volname, '.');
+	if (suffix) {
+		if (strcmp(suffix, ".readonly") == 0) {
+			params->type = AFSVL_ROVOL;
+			params->force = true;
+		} else if (strcmp(suffix, ".backup") == 0) {
+			params->type = AFSVL_BACKVOL;
+			params->force = true;
+		} else if (suffix[1] == 0) {
+		} else {
+			suffix = NULL;
+		}
+	}
+
+	params->volnamesz = suffix ?
+		suffix - params->volname : strlen(params->volname);
+
+	_debug("cell %*.*s [%p]",
+	       cellnamesz, cellnamesz, cellname ?: "", params->cell);
+
+	/* lookup the cell record */
+	if (cellname || !params->cell) {
+		cell = afs_cell_lookup(cellname, cellnamesz);
+		if (IS_ERR(cell)) {
+			printk(KERN_ERR "kAFS: unable to lookup cell '%s'\n",
+			       cellname ?: "");
+			return PTR_ERR(cell);
+		}
+		afs_put_cell(params->cell);
+		params->cell = cell;
+	}
+
+	_debug("CELL:%s [%p] VOLUME:%*.*s SUFFIX:%s TYPE:%d%s",
+	       params->cell->name, params->cell,
+	       params->volnamesz, params->volnamesz, params->volname,
+	       suffix ?: "-", params->type, params->force ? " FORCE" : "");
+
+	return 0;
+}
+
+/*
  * check a superblock to see if it's the one we're looking for
  */
 static int afs_test_super(struct super_block *sb, void *data)
@@ -244,7 +331,7 @@ static int afs_fill_super(struct super_block *sb, void *data)
 	fid.vid		= as->volume->vid;
 	fid.vnode	= 1;
 	fid.unique	= 1;
-	inode = afs_iget(sb, &fid);
+	inode = afs_iget(sb, params->key, &fid);
 	if (IS_ERR(inode))
 		goto error_inode;
 
@@ -285,31 +372,40 @@ static int afs_get_sb(struct file_system_type *fs_type,
 	struct afs_mount_params params;
 	struct super_block *sb;
 	struct afs_volume *vol;
+	struct key *key;
 	int ret;
 
 	_enter(",,%s,%p", dev_name, options);
 
 	memset(&params, 0, sizeof(params));
 
-	/* parse the options */
+	/* parse the options and device name */
 	if (options) {
-		ret = afs_super_parse_options(&params, options, &dev_name);
+		ret = afs_parse_options(&params, options, &dev_name);
 		if (ret < 0)
 			goto error;
-		if (!dev_name) {
-			printk("kAFS: no volume name specified\n");
-			ret = -EINVAL;
-			goto error;
-		}
 	}
+		
+
+	ret = afs_parse_device_name(&params, dev_name);
+	if (ret < 0)
+		goto error;
+
+	/* try and do the mount securely */
+	key = afs_request_key(params.cell);
+	if (IS_ERR(key)) {
+		_leave(" = %ld [key]", PTR_ERR(key));
+		ret = PTR_ERR(key);
+		goto error;
+	}
+	params.key = key;
 
 	/* parse the device name */
-	vol = afs_volume_lookup(dev_name, params.default_cell, params.rwpath);
+	vol = afs_volume_lookup(&params);
 	if (IS_ERR(vol)) {
 		ret = PTR_ERR(vol);
 		goto error;
 	}
-
 	params.volume = vol;
 
 	/* allocate a deviceless superblock */
@@ -337,13 +433,14 @@ static int afs_get_sb(struct file_system_type *fs_type,
 
 	simple_set_mnt(mnt, sb);
 	afs_put_volume(params.volume);
-	afs_put_cell(params.default_cell);
+	afs_put_cell(params.cell);
 	_leave(" = 0 [%p]", sb);
 	return 0;
 
 error:
 	afs_put_volume(params.volume);
-	afs_put_cell(params.default_cell);
+	afs_put_cell(params.cell);
+	key_put(params.key);
 	_leave(" = %d", ret);
 	return ret;
 }
@@ -375,6 +472,7 @@ static void afs_i_init_once(void *_vnode, struct kmem_cache *cachep,
 		memset(vnode, 0, sizeof(*vnode));
 		inode_init_once(&vnode->vfs_inode);
 		init_waitqueue_head(&vnode->update_waitq);
+		mutex_init(&vnode->permits_lock);
 		spin_lock_init(&vnode->lock);
 		INIT_WORK(&vnode->cb_broken_work, afs_broken_callback_work);
 		mutex_init(&vnode->cb_broken_lock);
diff --git a/fs/afs/vlclient.c b/fs/afs/vlclient.c
index 0c7eba1..36c1306 100644
--- a/fs/afs/vlclient.c
+++ b/fs/afs/vlclient.c
@@ -127,6 +127,7 @@ static int afs_deliver_vl_get_entry_by_xxx(struct afs_call *call,
  * VL.GetEntryByName operation type
  */
 static const struct afs_call_type afs_RXVLGetEntryByName = {
+	.name		= "VL.GetEntryByName",
 	.deliver	= afs_deliver_vl_get_entry_by_xxx,
 	.abort_to_error	= afs_vl_abort_to_error,
 	.destructor	= afs_flat_call_destructor,
@@ -136,6 +137,7 @@ static const struct afs_call_type afs_RXVLGetEntryByName = {
  * VL.GetEntryById operation type
  */
 static const struct afs_call_type afs_RXVLGetEntryById = {
+	.name		= "VL.GetEntryById",
 	.deliver	= afs_deliver_vl_get_entry_by_xxx,
 	.abort_to_error	= afs_vl_abort_to_error,
 	.destructor	= afs_flat_call_destructor,
@@ -145,6 +147,7 @@ static const struct afs_call_type afs_RXVLGetEntryById = {
  * dispatch a get volume entry by name operation
  */
 int afs_vl_get_entry_by_name(struct in_addr *addr,
+			     struct key *key,
 			     const char *volname,
 			     struct afs_cache_vlocation *entry,
 			     const struct afs_wait_mode *wait_mode)
@@ -163,6 +166,7 @@ int afs_vl_get_entry_by_name(struct in_addr *addr,
 	if (!call)
 		return -ENOMEM;
 
+	call->key = key;
 	call->reply = entry;
 	call->service_id = VL_SERVICE;
 	call->port = htons(AFS_VL_PORT);
@@ -183,6 +187,7 @@ int afs_vl_get_entry_by_name(struct in_addr *addr,
  * dispatch a get volume entry by ID operation
  */
 int afs_vl_get_entry_by_id(struct in_addr *addr,
+			   struct key *key,
 			   afs_volid_t volid,
 			   afs_voltype_t voltype,
 			   struct afs_cache_vlocation *entry,
@@ -197,6 +202,7 @@ int afs_vl_get_entry_by_id(struct in_addr *addr,
 	if (!call)
 		return -ENOMEM;
 
+	call->key = key;
 	call->reply = entry;
 	call->service_id = VL_SERVICE;
 	call->port = htons(AFS_VL_PORT);
diff --git a/fs/afs/vlocation.c b/fs/afs/vlocation.c
index 60cb2f4..7d9815e 100644
--- a/fs/afs/vlocation.c
+++ b/fs/afs/vlocation.c
@@ -33,6 +33,7 @@ static struct workqueue_struct *afs_vlocation_update_worker;
  * about the volume in question
  */
 static int afs_vlocation_access_vl_by_name(struct afs_vlocation *vl,
+					   struct key *key,
 					   struct afs_cache_vlocation *vldb)
 {
 	struct afs_cell *cell = vl->cell;
@@ -49,7 +50,7 @@ static int afs_vlocation_access_vl_by_name(struct afs_vlocation *vl,
 		_debug("CellServ[%hu]: %08x", cell->vl_curr_svix, addr.s_addr);
 
 		/* attempt to access the VL server */
-		ret = afs_vl_get_entry_by_name(&addr, vl->vldb.name, vldb,
+		ret = afs_vl_get_entry_by_name(&addr, key, vl->vldb.name, vldb,
 					       &afs_sync_call);
 		switch (ret) {
 		case 0:
@@ -86,6 +87,7 @@ out:
  * about the volume in question
  */
 static int afs_vlocation_access_vl_by_id(struct afs_vlocation *vl,
+					 struct key *key,
 					 afs_volid_t volid,
 					 afs_voltype_t voltype,
 					 struct afs_cache_vlocation *vldb)
@@ -104,7 +106,7 @@ static int afs_vlocation_access_vl_by_id(struct afs_vlocation *vl,
 		_debug("CellServ[%hu]: %08x", cell->vl_curr_svix, addr.s_addr);
 
 		/* attempt to access the VL server */
-		ret = afs_vl_get_entry_by_id(&addr, volid, voltype, vldb,
+		ret = afs_vl_get_entry_by_id(&addr, key, volid, voltype, vldb,
 					     &afs_sync_call);
 		switch (ret) {
 		case 0:
@@ -188,6 +190,7 @@ static struct afs_vlocation *afs_vlocation_alloc(struct afs_cell *cell,
  * update record if we found it in the cache
  */
 static int afs_vlocation_update_record(struct afs_vlocation *vl,
+				       struct key *key,
 				       struct afs_cache_vlocation *vldb)
 {
 	afs_voltype_t voltype;
@@ -228,7 +231,7 @@ static int afs_vlocation_update_record(struct afs_vlocation *vl,
 	/* contact the server to make sure the volume is still available
 	 * - TODO: need to handle disconnected operation here
 	 */
-	ret = afs_vlocation_access_vl_by_id(vl, vid, voltype, vldb);
+	ret = afs_vlocation_access_vl_by_id(vl, key, vid, voltype, vldb);
 	switch (ret) {
 		/* net error */
 	default:
@@ -287,7 +290,8 @@ static void afs_vlocation_apply_update(struct afs_vlocation *vl,
  * fill in a volume location record, consulting the cache and the VL server
  * both
  */
-static int afs_vlocation_fill_in_record(struct afs_vlocation *vl)
+static int afs_vlocation_fill_in_record(struct afs_vlocation *vl,
+					struct key *key)
 {
 	struct afs_cache_vlocation vldb;
 	int ret;
@@ -310,11 +314,11 @@ static int afs_vlocation_fill_in_record(struct afs_vlocation *vl)
 		/* try to update a known volume in the cell VL databases by
 		 * ID as the name may have changed */
 		_debug("found in cache");
-		ret = afs_vlocation_update_record(vl, &vldb);
+		ret = afs_vlocation_update_record(vl, key, &vldb);
 	} else {
 		/* try to look up an unknown volume in the cell VL databases by
 		 * name */
-		ret = afs_vlocation_access_vl_by_name(vl, &vldb);
+		ret = afs_vlocation_access_vl_by_name(vl, key, &vldb);
 		if (ret < 0) {
 			printk("kAFS: failed to locate '%s' in cell '%s'\n",
 			       vl->vldb.name, vl->cell->name);
@@ -366,14 +370,16 @@ void afs_vlocation_queue_for_updates(struct afs_vlocation *vl)
  * - insert/update in the local cache if did get a VL response
  */
 struct afs_vlocation *afs_vlocation_lookup(struct afs_cell *cell,
+					   struct key *key,
 					   const char *name,
 					   size_t namesz)
 {
 	struct afs_vlocation *vl;
 	int ret;
 
-	_enter("{%s},%*.*s,%zu",
-	       cell->name, (int) namesz, (int) namesz, name, namesz);
+	_enter("{%s},{%x},%*.*s,%zu",
+	       cell->name, key_serial(key),
+	       (int) namesz, (int) namesz, name, namesz);
 
 	if (namesz > sizeof(vl->vldb.name)) {
 		_leave(" = -ENAMETOOLONG");
@@ -405,7 +411,7 @@ struct afs_vlocation *afs_vlocation_lookup(struct afs_cell *cell,
 	up_write(&cell->vl_sem);
 
 fill_in_record:
-	ret = afs_vlocation_fill_in_record(vl);
+	ret = afs_vlocation_fill_in_record(vl, key);
 	if (ret < 0)
 		goto error_abandon;
 	vl->state = AFS_VL_VALID;
@@ -656,7 +662,7 @@ static void afs_vlocation_updater(struct work_struct *work)
 	vl->upd_rej_cnt = 0;
 	vl->upd_busy_cnt = 0;
 
-	ret = afs_vlocation_update_record(vl, &vldb);
+	ret = afs_vlocation_update_record(vl, NULL, &vldb);
 	switch (ret) {
 	case 0:
 		afs_vlocation_apply_update(vl, &vldb);
diff --git a/fs/afs/vnode.c b/fs/afs/vnode.c
index d2ca139..1600976 100644
--- a/fs/afs/vnode.c
+++ b/fs/afs/vnode.c
@@ -238,9 +238,11 @@ static void afs_vnode_finalise_status_update(struct afs_vnode *vnode,
  *   - there are any outstanding ops that will fetch the status
  * - TODO implement local caching
  */
-int afs_vnode_fetch_status(struct afs_vnode *vnode)
+int afs_vnode_fetch_status(struct afs_vnode *vnode,
+			   struct afs_vnode *auth_vnode, struct key *key)
 {
 	struct afs_server *server;
+	unsigned long acl_order;
 	int ret;
 
 	DECLARE_WAITQUEUE(myself, current);
@@ -260,6 +262,10 @@ int afs_vnode_fetch_status(struct afs_vnode *vnode)
 		return -ENOENT;
 	}
 
+	acl_order = 0;
+	if (auth_vnode)
+		acl_order = auth_vnode->acl_order;
+
 	spin_lock(&vnode->lock);
 
 	if (!test_bit(AFS_VNODE_CB_BROKEN, &vnode->flags) &&
@@ -324,12 +330,14 @@ get_anyway:
 		_debug("USING SERVER: %p{%08x}",
 		       server, ntohl(server->addr.s_addr));
 
-		ret = afs_fs_fetch_file_status(server, vnode, NULL,
+		ret = afs_fs_fetch_file_status(server, key, vnode, NULL,
 					       &afs_sync_call);
 
 	} while (!afs_volume_release_fileserver(vnode, server, ret));
 
 	/* adjust the flags */
+	if (ret == 0 && auth_vnode)
+		afs_cache_permit(vnode, key, acl_order);
 	afs_vnode_finalise_status_update(vnode, server, ret);
 
 	_leave(" = %d", ret);
@@ -340,17 +348,18 @@ get_anyway:
  * fetch file data from the volume
  * - TODO implement caching and server failover
  */
-int afs_vnode_fetch_data(struct afs_vnode *vnode, off_t offset, size_t length,
-			 struct page *page)
+int afs_vnode_fetch_data(struct afs_vnode *vnode, struct key *key,
+			 off_t offset, size_t length, struct page *page)
 {
 	struct afs_server *server;
 	int ret;
 
-	_enter("%s,{%u,%u,%u}",
+	_enter("%s{%u,%u,%u},%x,,,",
 	       vnode->volume->vlocation->vldb.name,
 	       vnode->fid.vid,
 	       vnode->fid.vnode,
-	       vnode->fid.unique);
+	       vnode->fid.unique,
+	       key_serial(key));
 
 	/* this op will fetch the status */
 	spin_lock(&vnode->lock);
@@ -367,8 +376,8 @@ int afs_vnode_fetch_data(struct afs_vnode *vnode, off_t offset, size_t length,
 
 		_debug("USING SERVER: %08x\n", ntohl(server->addr.s_addr));
 
-		ret = afs_fs_fetch_data(server, vnode, offset, length, page,
-					NULL, &afs_sync_call);
+		ret = afs_fs_fetch_data(server, key, vnode, offset, length,
+					page, NULL, &afs_sync_call);
 
 	} while (!afs_volume_release_fileserver(vnode, server, ret));
 
diff --git a/fs/afs/volume.c b/fs/afs/volume.c
index 45491cf..15e1367 100644
--- a/fs/afs/volume.c
+++ b/fs/afs/volume.c
@@ -41,83 +41,20 @@ static const char *afs_voltypes[] = { "R/W", "R/O", "BAK" };
  * - Rule 3: If parent volume is R/W, then only mount R/W volume unless
  *           explicitly told otherwise
  */
-struct afs_volume *afs_volume_lookup(const char *name, struct afs_cell *cell,
-				     int rwpath)
+struct afs_volume *afs_volume_lookup(struct afs_mount_params *params)
 {
 	struct afs_vlocation *vlocation = NULL;
 	struct afs_volume *volume = NULL;
 	struct afs_server *server = NULL;
-	afs_voltype_t type;
-	const char *cellname, *volname, *suffix;
 	char srvtmask;
-	int force, ret, loop, cellnamesz, volnamesz;
+	int ret, loop;
 
-	_enter("%s,,%d,", name, rwpath);
-
-	if (!name || (name[0] != '%' && name[0] != '#') || !name[1]) {
-		printk("kAFS: unparsable volume name\n");
-		return ERR_PTR(-EINVAL);
-	}
-
-	/* determine the type of volume we're looking for */
-	force = 0;
-	type = AFSVL_ROVOL;
-
-	if (rwpath || name[0] == '%') {
-		type = AFSVL_RWVOL;
-		force = 1;
-	}
-
-	suffix = strrchr(name, '.');
-	if (suffix) {
-		if (strcmp(suffix, ".readonly") == 0) {
-			type = AFSVL_ROVOL;
-			force = 1;
-		} else if (strcmp(suffix, ".backup") == 0) {
-			type = AFSVL_BACKVOL;
-			force = 1;
-		} else if (suffix[1] == 0) {
-		} else {
-			suffix = NULL;
-		}
-	}
-
-	/* split the cell and volume names */
-	name++;
-	volname = strchr(name, ':');
-	if (volname) {
-		cellname = name;
-		cellnamesz = volname - name;
-		volname++;
-	} else {
-		volname = name;
-		cellname = NULL;
-		cellnamesz = 0;
-	}
-
-	volnamesz = suffix ? suffix - volname : strlen(volname);
-
-	_debug("CELL:%*.*s [%p] VOLUME:%*.*s SUFFIX:%s TYPE:%d%s",
-	       cellnamesz, cellnamesz, cellname ?: "", cell,
-	       volnamesz, volnamesz, volname, suffix ?: "-",
-	       type,
-	       force ? " FORCE" : "");
-
-	/* lookup the cell record */
-	if (cellname || !cell) {
-		cell = afs_cell_lookup(cellname, cellnamesz);
-		if (IS_ERR(cell)) {
-			ret = PTR_ERR(cell);
-			printk("kAFS: unable to lookup cell '%s'\n",
-			       cellname ?: "");
-			goto error;
-		}
-	} else {
-		afs_get_cell(cell);
-	}
+	_enter("{%*.*s,%d}",
+	       params->volnamesz, params->volnamesz, params->volname, params->rwpath);
 
 	/* lookup the volume location record */
-	vlocation = afs_vlocation_lookup(cell, volname, volnamesz);
+	vlocation = afs_vlocation_lookup(params->cell, params->key,
+					 params->volname, params->volnamesz);
 	if (IS_ERR(vlocation)) {
 		ret = PTR_ERR(vlocation);
 		vlocation = NULL;
@@ -126,30 +63,30 @@ struct afs_volume *afs_volume_lookup(const char *name, struct afs_cell *cell,
 
 	/* make the final decision on the type we want */
 	ret = -ENOMEDIUM;
-	if (force && !(vlocation->vldb.vidmask & (1 << type)))
+	if (params->force && !(vlocation->vldb.vidmask & (1 << params->type)))
 		goto error;
 
 	srvtmask = 0;
 	for (loop = 0; loop < vlocation->vldb.nservers; loop++)
 		srvtmask |= vlocation->vldb.srvtmask[loop];
 
-	if (force) {
-		if (!(srvtmask & (1 << type)))
+	if (params->force) {
+		if (!(srvtmask & (1 << params->type)))
 			goto error;
 	} else if (srvtmask & AFS_VOL_VTM_RO) {
-		type = AFSVL_ROVOL;
+		params->type = AFSVL_ROVOL;
 	} else if (srvtmask & AFS_VOL_VTM_RW) {
-		type = AFSVL_RWVOL;
+		params->type = AFSVL_RWVOL;
 	} else {
 		goto error;
 	}
 
-	down_write(&cell->vl_sem);
+	down_write(&params->cell->vl_sem);
 
 	/* is the volume already active? */
-	if (vlocation->vols[type]) {
+	if (vlocation->vols[params->type]) {
 		/* yes - re-use it */
-		volume = vlocation->vols[type];
+		volume = vlocation->vols[params->type];
 		afs_get_volume(volume);
 		goto success;
 	}
@@ -163,10 +100,10 @@ struct afs_volume *afs_volume_lookup(const char *name, struct afs_cell *cell,
 		goto error_up;
 
 	atomic_set(&volume->usage, 1);
-	volume->type		= type;
-	volume->type_force	= force;
-	volume->cell		= cell;
-	volume->vid		= vlocation->vldb.vid[type];
+	volume->type		= params->type;
+	volume->type_force	= params->force;
+	volume->cell		= params->cell;
+	volume->vid		= vlocation->vldb.vid[params->type];
 
 	init_rwsem(&volume->server_sem);
 
@@ -196,28 +133,26 @@ struct afs_volume *afs_volume_lookup(const char *name, struct afs_cell *cell,
 	afs_get_vlocation(vlocation);
 	volume->vlocation = vlocation;
 
-	vlocation->vols[type] = volume;
+	vlocation->vols[volume->type] = volume;
 
 success:
 	_debug("kAFS selected %s volume %08x",
 	       afs_voltypes[volume->type], volume->vid);
-	up_write(&cell->vl_sem);
+	up_write(&params->cell->vl_sem);
 	afs_put_vlocation(vlocation);
-	afs_put_cell(cell);
 	_leave(" = %p", volume);
 	return volume;
 
 	/* clean up */
 error_up:
-	up_write(&cell->vl_sem);
+	up_write(&params->cell->vl_sem);
 error:
 	afs_put_vlocation(vlocation);
-	afs_put_cell(cell);
 	_leave(" = %d", ret);
 	return ERR_PTR(ret);
 
 error_discard:
-	up_write(&cell->vl_sem);
+	up_write(&params->cell->vl_sem);
 
 	for (loop = volume->nservers - 1; loop >= 0; loop--)
 		afs_put_server(volume->servers[loop]);


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 12/16] AFS: Update the AFS fs documentation [try #3]
  2007-04-25 10:50 ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3] David Howells
                     ` (6 preceding siblings ...)
  2007-04-25 10:51   ` [PATCH 11/16] AFS: Add security support " David Howells
@ 2007-04-25 10:51   ` David Howells
  2007-04-25 10:51   ` [PATCH 13/16] commit ad495d7b6cfcd1bc2eaf06c42699be0bb5d84234 " David Howells
                     ` (4 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: David Howells @ 2007-04-25 10:51 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel, linux-fsdevel, netdev, dhowells

Update the AFS fs documentation.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 Documentation/filesystems/afs.txt |  214 +++++++++++++++++++++++++++----------
 1 files changed, 154 insertions(+), 60 deletions(-)

diff --git a/Documentation/filesystems/afs.txt b/Documentation/filesystems/afs.txt
index 2f4237d..12ad6c7 100644
--- a/Documentation/filesystems/afs.txt
+++ b/Documentation/filesystems/afs.txt
@@ -1,31 +1,82 @@
+			     ====================
 			     kAFS: AFS FILESYSTEM
 			     ====================
 
-ABOUT
-=====
+Contents:
+
+ - Overview.
+ - Usage.
+ - Mountpoints.
+ - Proc filesystem.
+ - The cell database.
+ - Security.
+ - Examples.
+
+
+========
+OVERVIEW
+========
 
-This filesystem provides a fairly simple AFS filesystem driver. It is under
-development and only provides very basic facilities. It does not yet support
-the following AFS features:
+This filesystem provides a fairly simple secure AFS filesystem driver. It is
+under development and does not yet provide the full feature set.  The features
+it does support include:
 
-	(*) Write support.
-	(*) Communications security.
-	(*) Local caching.
-	(*) pioctl() system call.
-	(*) Automatic mounting of embedded mountpoints.
+ (*) Security (currently only AFS kaserver and KerberosIV tickets).
 
+ (*) File reading.
 
+ (*) Automounting.
+
+It does not yet support the following AFS features:
+
+ (*) Write support.
+
+ (*) Local caching.
+
+ (*) pioctl() system call.
+
+
+===========
+COMPILATION
+===========
+
+The filesystem should be enabled by turning on the kernel configuration
+options:
+
+	CONFIG_AF_RXRPC		- The RxRPC protocol transport
+	CONFIG_RXKAD		- The RxRPC Kerberos security handler
+	CONFIG_AFS		- The AFS filesystem
+
+Additionally, the following can be turned on to aid debugging:
+
+	CONFIG_AF_RXRPC_DEBUG	- Permit AF_RXRPC debugging to be enabled
+	CONFIG_AFS_DEBUG	- Permit AFS debugging to be enabled
+
+They permit the debugging messages to be turned on dynamically by manipulating
+the masks in the following files:
+
+	/sys/module/af_rxrpc/parameters/debug
+	/sys/module/afs/parameters/debug
+
+
+=====
 USAGE
 =====
 
 When inserting the driver modules the root cell must be specified along with a
 list of volume location server IP addresses:
 
-	insmod rxrpc.o
+	insmod af_rxrpc.o
+	insmod rxkad.o
 	insmod kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91
 
-The first module is a driver for the RxRPC remote operation protocol, and the
-second is the actual filesystem driver for the AFS filesystem.
+The first module is the AF_RXRPC network protocol driver.  This provides the
+RxRPC remote operation protocol and may also be accessed from userspace.  See:
+
+	Documentation/networking/rxrpc.txt
+
+The second module is the kerberos RxRPC security driver, and the third module
+is the actual filesystem driver for the AFS filesystem.
 
 Once the module has been loaded, more modules can be added by the following
 procedure:
@@ -33,7 +84,7 @@ procedure:
 	echo add grand.central.org 18.7.14.88:128.2.191.224 >/proc/fs/afs/cells
 
 Where the parameters to the "add" command are the name of a cell and a list of
-volume location servers within that cell.
+volume location servers within that cell, with the latter separated by colons.
 
 Filesystems can be mounted anywhere by commands similar to the following:
 
@@ -42,11 +93,6 @@ Filesystems can be mounted anywhere by commands similar to the following:
 	mount -t afs "#root.afs." /afs
 	mount -t afs "#root.cell." /afs/cambridge
 
-  NB: When using this on Linux 2.4, the mount command has to be different,
-      since the filesystem doesn't have access to the device name argument:
-
-	mount -t afs none /afs -ovol="#root.afs."
-
 Where the initial character is either a hash or a percent symbol depending on
 whether you definitely want a R/W volume (hash) or whether you'd prefer a R/O
 volume, but are willing to use a R/W volume instead (percent).
@@ -60,55 +106,66 @@ named volume will be looked up in the cell specified during insmod.
 Additional cells can be added through /proc (see later section).
 
 
+===========
 MOUNTPOINTS
 ===========
 
-AFS has a concept of mountpoints. These are specially formatted symbolic links
-(of the same form as the "device name" passed to mount). kAFS presents these
-to the user as directories that have special properties:
+AFS has a concept of mountpoints. In AFS terms, these are specially formatted
+symbolic links (of the same form as the "device name" passed to mount).  kAFS
+presents these to the user as directories that have a follow-link capability
+(ie: symbolic link semantics).  If anyone attempts to access them, they will
+automatically cause the target volume to be mounted (if possible) on that site.
 
-  (*) They cannot be listed. Running a program like "ls" on them will incur an
-      EREMOTE error (Object is remote).
+Automatically mounted filesystems will be automatically unmounted approximately
+twenty minutes after they were last used.  Alternatively they can be unmounted
+directly with the umount() system call.
 
-  (*) Other objects can't be looked up inside of them. This also incurs an
-      EREMOTE error.
+Manually unmounting an AFS volume will cause any idle submounts upon it to be
+culled first.  If all are culled, then the requested volume will also be
+unmounted, otherwise error EBUSY will be returned.
 
-  (*) They can be queried with the readlink() system call, which will return
-      the name of the mountpoint to which they point. The "readlink" program
-      will also work.
+This can be used by the administrator to attempt to unmount the whole AFS tree
+mounted on /afs in one go by doing:
 
-  (*) They can be mounted on (which symbolic links can't).
+	umount /afs
 
 
+===============
 PROC FILESYSTEM
 ===============
 
-The rxrpc module creates a number of files in various places in the /proc
-filesystem:
-
-  (*) Firstly, some information files are made available in a directory called
-      "/proc/net/rxrpc/". These list the extant transport endpoint, peer,
-      connection and call records.
-
-  (*) Secondly, some control files are made available in a directory called
-      "/proc/sys/rxrpc/". Currently, all these files can be used for is to
-      turn on various levels of tracing.
-
 The AFS modules creates a "/proc/fs/afs/" directory and populates it:
 
-  (*) A "cells" file that lists cells currently known to the afs module.
+  (*) A "cells" file that lists cells currently known to the afs module and
+      their usage counts:
+
+	[root@andromeda ~]# cat /proc/fs/afs/cells
+	USE NAME
+	  3 cambridge.redhat.com
 
   (*) A directory per cell that contains files that list volume location
       servers, volumes, and active servers known within that cell.
 
+	[root@andromeda ~]# cat /proc/fs/afs/cambridge.redhat.com/servers
+	USE ADDR            STATE
+	  4 172.16.18.91        0
+	[root@andromeda ~]# cat /proc/fs/afs/cambridge.redhat.com/vlservers
+	ADDRESS
+	172.16.18.91
+	[root@andromeda ~]# cat /proc/fs/afs/cambridge.redhat.com/volumes
+	USE STT VLID[0]  VLID[1]  VLID[2]  NAME
+	  1 Val 20000000 20000001 20000002 root.afs
 
+
+=================
 THE CELL DATABASE
 =================
 
-The filesystem maintains an internal database of all the cells it knows and
-the IP addresses of the volume location servers for those cells. The cell to
-which the computer belongs is added to the database when insmod is performed
-by the "rootcell=" argument.
+The filesystem maintains an internal database of all the cells it knows and the
+IP addresses of the volume location servers for those cells.  The cell to which
+the system belongs is added to the database when insmod is performed by the
+"rootcell=" argument or, if compiled in, using a "kafs.rootcell=" argument on
+the kernel command line.
 
 Further cells can be added by commands similar to the following:
 
@@ -118,20 +175,65 @@ Further cells can be added by commands similar to the following:
 No other cell database operations are available at this time.
 
 
+========
+SECURITY
+========
+
+Secure operations are initiated by acquiring a key using the klog program.  A
+very primitive klog program is available at:
+
+	http://people.redhat.com/~dhowells/rxrpc/klog.c
+
+This should be compiled by:
+
+	make klog LDLIBS="-lcrypto -lcrypt -lkrb4 -lkeyutils"
+
+And then run as:
+
+	./klog
+
+Assuming it's successful, this adds a key of type RxRPC, named for the service
+and cell, eg: "afs@<cellname>".  This can be viewed with the keyctl program or
+by cat'ing /proc/keys:
+
+	[root@andromeda ~]# keyctl show
+	Session Keyring
+	       -3 --alswrv      0     0  keyring: _ses.3268
+		2 --alswrv      0     0   \_ keyring: _uid.0
+	111416553 --als--v      0     0   \_ rxrpc: afs@CAMBRIDGE.REDHAT.COM
+
+Currently the username, realm, password and proposed ticket lifetime are
+compiled in to the program.
+
+It is not required to acquire a key before using AFS facilities, but if one is
+not acquired then all operations will be governed by the anonymous user parts
+of the ACLs.
+
+If a key is acquired, then all AFS operations, including mounts and automounts,
+made by a possessor of that key will be secured with that key.
+
+If a file is opened with a particular key and then the file descriptor is
+passed to a process that doesn't have that key (perhaps over an AF_UNIX
+socket), then the operations on the file will be made with key that was used to
+open the file.
+
+
+========
 EXAMPLES
 ========
 
-Here's what I use to test this. Some of the names and IP addresses are local
-to my internal DNS. My "root.afs" partition has a mount point within it for
+Here's what I use to test this.  Some of the names and IP addresses are local
+to my internal DNS.  My "root.afs" partition has a mount point within it for
 some public volumes volumes.
 
-insmod -S /tmp/rxrpc.o 
-insmod -S /tmp/kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91
+insmod /tmp/rxrpc.o
+insmod /tmp/rxkad.o
+insmod /tmp/kafs.o rootcell=cambridge.redhat.com:172.16.18.91
 
 mount -t afs \%root.afs. /afs
 mount -t afs \%cambridge.redhat.com:root.cell. /afs/cambridge.redhat.com/
 
-echo add grand.central.org 18.7.14.88:128.2.191.224 > /proc/fs/afs/cells 
+echo add grand.central.org 18.7.14.88:128.2.191.224 > /proc/fs/afs/cells
 mount -t afs "#grand.central.org:root.cell." /afs/grand.central.org/
 mount -t afs "#grand.central.org:root.archive." /afs/grand.central.org/archive
 mount -t afs "#grand.central.org:root.contrib." /afs/grand.central.org/contrib
@@ -141,15 +243,7 @@ mount -t afs "#grand.central.org:root.service." /afs/grand.central.org/service
 mount -t afs "#grand.central.org:root.software." /afs/grand.central.org/software
 mount -t afs "#grand.central.org:root.user." /afs/grand.central.org/user
 
-umount /afs/grand.central.org/user
-umount /afs/grand.central.org/software
-umount /afs/grand.central.org/service
-umount /afs/grand.central.org/project
-umount /afs/grand.central.org/doc
-umount /afs/grand.central.org/contrib
-umount /afs/grand.central.org/archive
-umount /afs/grand.central.org
-umount /afs/cambridge.redhat.com
 umount /afs
 rmmod kafs
+rmmod rxkad
 rmmod rxrpc

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 13/16] commit ad495d7b6cfcd1bc2eaf06c42699be0bb5d84234 [try #3]
  2007-04-25 10:50 ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3] David Howells
                     ` (7 preceding siblings ...)
  2007-04-25 10:51   ` [PATCH 12/16] AFS: Update the AFS fs documentation " David Howells
@ 2007-04-25 10:51   ` David Howells
  2007-04-25 10:51   ` [PATCH 14/16] AFS: Add support for the CB.GetCapabilities operation " David Howells
                     ` (3 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: David Howells @ 2007-04-25 10:51 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel, linux-fsdevel, netdev, dhowells

[NETLINK]: Mirror UDP MSG_TRUNC semantics.

    If the user passes MSG_TRUNC in via msg_flags, return
    the full packet size not the truncated size.

    Idea from Herbert Xu and Thomas Graf.

    Signed-off-by: David S. Miller <davem@davemloft.net>
---

 net/netlink/af_netlink.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index c48b0f4..5890210 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1242,6 +1242,9 @@ static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock,
 
 	scm_recv(sock, msg, siocb->scm, flags);
 
+	if (flags & MSG_TRUNC)
+		copied = skb->len;
+
 out:
 	netlink_rcv_wake(sk);
 	return err ? : copied;


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 14/16] AFS: Add support for the CB.GetCapabilities operation [try #3]
  2007-04-25 10:50 ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3] David Howells
                     ` (8 preceding siblings ...)
  2007-04-25 10:51   ` [PATCH 13/16] commit ad495d7b6cfcd1bc2eaf06c42699be0bb5d84234 " David Howells
@ 2007-04-25 10:51   ` David Howells
  2007-04-25 10:51   ` [PATCH 15/16] AFS: Implement the CB.InitCallBackState3 " David Howells
                     ` (2 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: David Howells @ 2007-04-25 10:51 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel, linux-fsdevel, netdev, dhowells

Add support for the CB.GetCapabilities operation with which the fileserver can
ask the client for the following information:

 (1) The list of network interfaces it has available as IPv4 address + netmask
     plus the MTUs.

 (2) The client's UUID.

 (3) The extended capabilities of the client, for which the only current one
     is unified error mapping (abort code interpretation).

To support this, the patch adds the following routines to AFS:

 (1) A function to iterate through all the network interfaces using RTNETLINK
     to extract IPv4 addresses and MTUs.

 (2) A function to iterate through all the network interfaces using RTNETLINK
     to pull out the MAC address of the lowest index interface to use in UUID
     construction.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 fs/afs/AFS_CM.h        |    3 
 fs/afs/Makefile        |    1 
 fs/afs/cmservice.c     |   98 ++++++++++
 fs/afs/internal.h      |   42 ++++
 fs/afs/main.c          |   49 +++++
 fs/afs/rxrpc.c         |   39 ++++
 fs/afs/use-rtnetlink.c |  473 ++++++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 705 insertions(+), 0 deletions(-)

diff --git a/fs/afs/AFS_CM.h b/fs/afs/AFS_CM.h
index 7c8e3d4..d4bd201 100644
--- a/fs/afs/AFS_CM.h
+++ b/fs/afs/AFS_CM.h
@@ -23,6 +23,9 @@ enum AFS_CM_Operations {
 	CBGetCE			= 208,	/* get cache file description */
 	CBGetXStatsVersion	= 209,	/* get version of extended statistics */
 	CBGetXStats		= 210,	/* get contents of extended statistics data */
+	CBGetCapabilities	= 65538, /* get client capabilities */
 };
 
+#define AFS_CAP_ERROR_TRANSLATION	0x1
+
 #endif /* AFS_FS_H */
diff --git a/fs/afs/Makefile b/fs/afs/Makefile
index cca198b..01545eb 100644
--- a/fs/afs/Makefile
+++ b/fs/afs/Makefile
@@ -18,6 +18,7 @@ kafs-objs := \
 	security.o \
 	server.o \
 	super.o \
+	use-rtnetlink.o \
 	vlclient.o \
 	vlocation.o \
 	vnode.o \
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 9cb3ac5..5139723 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -22,6 +22,8 @@ static int afs_deliver_cb_init_call_back_state(struct afs_call *,
 					       struct sk_buff *, bool);
 static int afs_deliver_cb_probe(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_callback(struct afs_call *, struct sk_buff *, bool);
+static int afs_deliver_cb_get_capabilities(struct afs_call *, struct sk_buff *,
+					   bool);
 static void afs_cm_destructor(struct afs_call *);
 
 /*
@@ -55,6 +57,16 @@ static const struct afs_call_type afs_SRXCBProbe = {
 };
 
 /*
+ * CB.GetCapabilities operation type
+ */
+static const struct afs_call_type afs_SRXCBGetCapabilites = {
+	.name		= "CB.GetCapabilities",
+	.deliver	= afs_deliver_cb_get_capabilities,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_cm_destructor,
+};
+
+/*
  * route an incoming cache manager call
  * - return T if supported, F if not
  */
@@ -74,6 +86,9 @@ bool afs_cm_incoming_call(struct afs_call *call)
 	case CBProbe:
 		call->type = &afs_SRXCBProbe;
 		return true;
+	case CBGetCapabilities:
+		call->type = &afs_SRXCBGetCapabilites;
+		return true;
 	default:
 		return false;
 	}
@@ -328,3 +343,86 @@ static int afs_deliver_cb_probe(struct afs_call *call, struct sk_buff *skb,
 	schedule_work(&call->work);
 	return 0;
 }
+
+/*
+ * allow the fileserver to ask about the cache manager's capabilities
+ */
+static void SRXAFSCB_GetCapabilities(struct work_struct *work)
+{
+	struct afs_interface *ifs;
+	struct afs_call *call = container_of(work, struct afs_call, work);
+	int loop, nifs;
+
+	struct {
+		struct /* InterfaceAddr */ {
+			__be32 nifs;
+			__be32 uuid[11];
+			__be32 ifaddr[32];
+			__be32 netmask[32];
+			__be32 mtu[32];
+		} ia;
+		struct /* Capabilities */ {
+			__be32 capcount;
+			__be32 caps[1];
+		} cap;
+	} reply;
+
+	_enter("");
+
+	nifs = 0;
+	ifs = kcalloc(32, sizeof(*ifs), GFP_KERNEL);
+	if (ifs) {
+		nifs = afs_get_ipv4_interfaces(ifs, 32, false);
+		if (nifs < 0) {
+			kfree(ifs);
+			ifs = NULL;
+			nifs = 0;
+		}
+	}
+
+	memset(&reply, 0, sizeof(reply));
+	reply.ia.nifs = htonl(nifs);
+
+	reply.ia.uuid[0] = htonl(afs_uuid.time_low);
+	reply.ia.uuid[1] = htonl(afs_uuid.time_mid);
+	reply.ia.uuid[2] = htonl(afs_uuid.time_hi_and_version);
+	reply.ia.uuid[3] = htonl((s8) afs_uuid.clock_seq_hi_and_reserved);
+	reply.ia.uuid[4] = htonl((s8) afs_uuid.clock_seq_low);
+	for (loop = 0; loop < 6; loop++)
+		reply.ia.uuid[loop + 5] = htonl((s8) afs_uuid.node[loop]);
+
+	if (ifs) {
+		for (loop = 0; loop < nifs; loop++) {
+			reply.ia.ifaddr[loop] = ifs[loop].address.s_addr;
+			reply.ia.netmask[loop] = ifs[loop].netmask.s_addr;
+			reply.ia.mtu[loop] = htonl(ifs[loop].mtu);
+		}
+	}
+
+	reply.cap.capcount = htonl(1);
+	reply.cap.caps[0] = htonl(AFS_CAP_ERROR_TRANSLATION);
+	afs_send_simple_reply(call, &reply, sizeof(reply));
+
+	_leave("");
+}
+
+/*
+ * deliver request data to a CB.GetCapabilities call
+ */
+static int afs_deliver_cb_get_capabilities(struct afs_call *call,
+					   struct sk_buff *skb, bool last)
+{
+	_enter(",{%u},%d", skb->len, last);
+
+	if (skb->len > 0)
+		return -EBADMSG;
+	if (!last)
+		return 0;
+
+	/* no unmarshalling required */
+	call->state = AFS_CALL_REPLYING;
+
+	INIT_WORK(&call->work, SRXAFSCB_GetCapabilities);
+	schedule_work(&call->work);
+	return 0;
+}
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index d49201e..9107d3f 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -346,6 +346,40 @@ struct afs_permits {
 	struct afs_permit	permits[0];	/* the permits so far examined */
 };
 
+/*
+ * record of one of a system's set of network interfaces
+ */
+struct afs_interface {
+	unsigned	index;		/* interface index */
+	struct in_addr	address;	/* IPv4 address bound to interface */
+	struct in_addr	netmask;	/* netmask applied to address */
+	unsigned	mtu;		/* MTU of interface */
+};
+
+/*
+ * UUID definition [internet draft]
+ * - the timestamp is a 60-bit value, split 32/16/12, and goes in 100ns
+ *   increments since midnight 15th October 1582
+ *   - add AFS_UUID_TO_UNIX_TIME to convert unix time in 100ns units to UUID
+ *     time
+ * - the clock sequence is a 14-bit counter to avoid duplicate times
+ */
+struct afs_uuid {
+	u32		time_low;			/* low part of timestamp */
+	u16		time_mid;			/* mid part of timestamp */
+	u16		time_hi_and_version;		/* high part of timestamp and version  */
+#define AFS_UUID_TO_UNIX_TIME	0x01b21dd213814000
+#define AFS_UUID_TIMEHI_MASK	0x0fff
+#define AFS_UUID_VERSION_TIME	0x1000	/* time-based UUID */
+#define AFS_UUID_VERSION_NAME	0x3000	/* name-based UUID */
+#define AFS_UUID_VERSION_RANDOM	0x4000	/* (pseudo-)random generated UUID */
+	u8		clock_seq_hi_and_reserved;	/* clock seq hi and variant */
+#define AFS_UUID_CLOCKHI_MASK	0x3f
+#define AFS_UUID_VARIANT_STD	0x80
+	u8		clock_seq_low;			/* clock seq low */
+	u8		node[6];			/* spatially unique node ID (MAC addr) */
+};
+
 /*****************************************************************************/
 /*
  * callback.c
@@ -430,6 +464,7 @@ extern void afs_clear_inode(struct inode *);
 /*
  * main.c
  */
+extern struct afs_uuid afs_uuid;
 #ifdef AFS_CACHING_SUPPORT
 extern struct cachefs_netfs afs_cache_netfs;
 #endif
@@ -470,6 +505,7 @@ extern struct afs_call *afs_alloc_flat_call(const struct afs_call_type *,
 extern void afs_flat_call_destructor(struct afs_call *);
 extern void afs_transfer_reply(struct afs_call *, struct sk_buff *);
 extern void afs_send_empty_reply(struct afs_call *);
+extern void afs_send_simple_reply(struct afs_call *, const void *, size_t);
 extern int afs_extract_data(struct afs_call *, struct sk_buff *, bool, void *,
 			    size_t);
 
@@ -501,6 +537,12 @@ extern int afs_fs_init(void);
 extern void afs_fs_exit(void);
 
 /*
+ * use-rtnetlink.c
+ */
+extern int afs_get_ipv4_interfaces(struct afs_interface *, size_t, bool);
+extern int afs_get_MAC_address(u8 [6]);
+
+/*
  * vlclient.c
  */
 #ifdef AFS_CACHING_SUPPORT
diff --git a/fs/afs/main.c b/fs/afs/main.c
index 0cf1b02..40c2704 100644
--- a/fs/afs/main.c
+++ b/fs/afs/main.c
@@ -40,6 +40,51 @@ struct cachefs_netfs afs_cache_netfs = {
 };
 #endif
 
+struct afs_uuid afs_uuid;
+
+/*
+ * get a client UUID
+ */
+static int __init afs_get_client_UUID(void)
+{
+	struct timespec ts;
+	u64 uuidtime;
+	u16 clockseq;
+	int ret;
+
+	/* read the MAC address of one of the external interfaces and construct
+	 * a UUID from it */
+	ret = afs_get_MAC_address(afs_uuid.node);
+	if (ret < 0)
+		return ret;
+
+	getnstimeofday(&ts);
+	uuidtime = (u64) ts.tv_sec * 1000 * 1000 * 10;
+	uuidtime += ts.tv_nsec / 100;
+	uuidtime += AFS_UUID_TO_UNIX_TIME;
+	afs_uuid.time_low = uuidtime;
+	afs_uuid.time_mid = uuidtime >> 32;
+	afs_uuid.time_hi_and_version = (uuidtime >> 48) & AFS_UUID_TIMEHI_MASK;
+	afs_uuid.time_hi_and_version = AFS_UUID_VERSION_TIME;
+
+	get_random_bytes(&clockseq, 2);
+	afs_uuid.clock_seq_low = clockseq;
+	afs_uuid.clock_seq_hi_and_reserved =
+		(clockseq >> 8) & AFS_UUID_CLOCKHI_MASK;
+	afs_uuid.clock_seq_hi_and_reserved = AFS_UUID_VARIANT_STD;
+
+	_debug("AFS UUID: %08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x",
+	       afs_uuid.time_low,
+	       afs_uuid.time_mid,
+	       afs_uuid.time_hi_and_version,
+	       afs_uuid.clock_seq_hi_and_reserved,
+	       afs_uuid.clock_seq_low,
+	       afs_uuid.node[0], afs_uuid.node[1], afs_uuid.node[2],
+	       afs_uuid.node[3], afs_uuid.node[4], afs_uuid.node[5]);
+
+	return 0;
+}
+
 /*
  * initialise the AFS client FS module
  */
@@ -49,6 +94,10 @@ static int __init afs_init(void)
 
 	printk(KERN_INFO "kAFS: Red Hat AFS client v0.1 registering.\n");
 
+	ret = afs_get_client_UUID();
+	if (ret < 0)
+		return ret;
+
 	/* register the /proc stuff */
 	ret = afs_proc_init();
 	if (ret < 0)
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 2d05d8a..cf826a5 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -714,6 +714,45 @@ void afs_send_empty_reply(struct afs_call *call)
 }
 
 /*
+ * send a simple reply
+ */
+void afs_send_simple_reply(struct afs_call *call, const void *buf, size_t len)
+{
+	struct msghdr msg;
+	struct iovec iov[1];
+
+	_enter("");
+
+	iov[0].iov_base		= (void *) buf;
+	iov[0].iov_len		= len;
+	msg.msg_name		= NULL;
+	msg.msg_namelen		= 0;
+	msg.msg_iov		= iov;
+	msg.msg_iovlen		= 1;
+	msg.msg_control		= NULL;
+	msg.msg_controllen	= 0;
+	msg.msg_flags		= 0;
+
+	call->state = AFS_CALL_AWAIT_ACK;
+	switch (rxrpc_kernel_send_data(call->rxcall, &msg, len)) {
+	case 0:
+		_leave(" [replied]");
+		return;
+
+	case -ENOMEM:
+		_debug("oom");
+		rxrpc_kernel_abort_call(call->rxcall, RX_USER_ABORT);
+	default:
+		rxrpc_kernel_end_call(call->rxcall);
+		call->rxcall = NULL;
+		call->type->destructor(call);
+		afs_free_call(call);
+		_leave(" [error]");
+		return;
+	}
+}
+
+/*
  * extract a piece of data from the received data socket buffers
  */
 int afs_extract_data(struct afs_call *call, struct sk_buff *skb,
diff --git a/fs/afs/use-rtnetlink.c b/fs/afs/use-rtnetlink.c
new file mode 100644
index 0000000..82f0daa
--- /dev/null
+++ b/fs/afs/use-rtnetlink.c
@@ -0,0 +1,473 @@
+/* RTNETLINK client
+ *
+ * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include <linux/netlink.h>
+#include <linux/rtnetlink.h>
+#include <linux/if_addr.h>
+#include <linux/if_arp.h>
+#include <linux/inetdevice.h>
+#include <net/netlink.h>
+#include "internal.h"
+
+struct afs_rtm_desc {
+	struct socket		*nlsock;
+	struct afs_interface	*bufs;
+	u8			*mac;
+	size_t			nbufs;
+	size_t			maxbufs;
+	void			*data;
+	ssize_t			datalen;
+	size_t			datamax;
+	int			msg_seq;
+	unsigned		mac_index;
+	bool			wantloopback;
+	int (*parse)(struct afs_rtm_desc *, struct nlmsghdr *);
+};
+
+/*
+ * parse an RTM_GETADDR response
+ */
+static int afs_rtm_getaddr_parse(struct afs_rtm_desc *desc,
+				 struct nlmsghdr *nlhdr)
+{
+	struct afs_interface *this;
+	struct ifaddrmsg *ifa;
+	struct rtattr *rtattr;
+	const char *name;
+	size_t len;
+
+	ifa = (struct ifaddrmsg *) NLMSG_DATA(nlhdr);
+
+	_enter("{ix=%d,af=%d}", ifa->ifa_index, ifa->ifa_family);
+
+	if (ifa->ifa_family != AF_INET) {
+		_leave(" = 0 [family %d]", ifa->ifa_family);
+		return 0;
+	}
+	if (desc->nbufs >= desc->maxbufs) {
+		_leave(" = 0 [max %zu/%zu]", desc->nbufs, desc->maxbufs);
+		return 0;
+	}
+
+	this = &desc->bufs[desc->nbufs];
+
+	this->index = ifa->ifa_index;
+	this->netmask.s_addr = inet_make_mask(ifa->ifa_prefixlen);
+	this->mtu = 0;
+
+	rtattr = NLMSG_DATA(nlhdr) + NLMSG_ALIGN(sizeof(struct ifaddrmsg));
+	len = NLMSG_PAYLOAD(nlhdr, sizeof(struct ifaddrmsg));
+
+	name = "unknown";
+	for (; RTA_OK(rtattr, len); rtattr = RTA_NEXT(rtattr, len)) {
+		switch (rtattr->rta_type) {
+		case IFA_ADDRESS:
+			memcpy(&this->address, RTA_DATA(rtattr), 4);
+			break;
+		case IFA_LABEL:
+			name = RTA_DATA(rtattr);
+			break;
+		}
+	}
+
+	_debug("%s: "NIPQUAD_FMT"/"NIPQUAD_FMT,
+	       name, NIPQUAD(this->address), NIPQUAD(this->netmask));
+
+	desc->nbufs++;
+	_leave(" = 0");
+	return 0;
+}
+
+/*
+ * parse an RTM_GETLINK response for MTUs
+ */
+static int afs_rtm_getlink_if_parse(struct afs_rtm_desc *desc,
+				    struct nlmsghdr *nlhdr)
+{
+	struct afs_interface *this;
+	struct ifinfomsg *ifi;
+	struct rtattr *rtattr;
+	const char *name;
+	size_t len, loop;
+
+	ifi = (struct ifinfomsg *) NLMSG_DATA(nlhdr);
+
+	_enter("{ix=%d}", ifi->ifi_index);
+
+	for (loop = 0; loop < desc->nbufs; loop++) {
+		this = &desc->bufs[loop];
+		if (this->index == ifi->ifi_index)
+			goto found;
+	}
+
+	_leave(" = 0 [no match]");
+	return 0;
+
+found:
+	if (ifi->ifi_type == ARPHRD_LOOPBACK && !desc->wantloopback) {
+		_leave(" = 0 [loopback]");
+		return 0;
+	}
+
+	rtattr = NLMSG_DATA(nlhdr) + NLMSG_ALIGN(sizeof(struct ifinfomsg));
+	len = NLMSG_PAYLOAD(nlhdr, sizeof(struct ifinfomsg));
+
+	name = "unknown";
+	for (; RTA_OK(rtattr, len); rtattr = RTA_NEXT(rtattr, len)) {
+		switch (rtattr->rta_type) {
+		case IFLA_MTU:
+			memcpy(&this->mtu, RTA_DATA(rtattr), 4);
+			break;
+		case IFLA_IFNAME:
+			name = RTA_DATA(rtattr);
+			break;
+		}
+	}
+
+	_debug("%s: "NIPQUAD_FMT"/"NIPQUAD_FMT" mtu %u",
+	       name, NIPQUAD(this->address), NIPQUAD(this->netmask),
+	       this->mtu);
+
+	_leave(" = 0");
+	return 0;
+}
+
+/*
+ * parse an RTM_GETLINK response for the MAC address belonging to the lowest
+ * non-internal interface
+ */
+static int afs_rtm_getlink_mac_parse(struct afs_rtm_desc *desc,
+				     struct nlmsghdr *nlhdr)
+{
+	struct ifinfomsg *ifi;
+	struct rtattr *rtattr;
+	const char *name;
+	size_t remain, len;
+	bool set;
+
+	ifi = (struct ifinfomsg *) NLMSG_DATA(nlhdr);
+
+	_enter("{ix=%d}", ifi->ifi_index);
+
+	if (ifi->ifi_index >= desc->mac_index) {
+		_leave(" = 0 [high]");
+		return 0;
+	}
+	if (ifi->ifi_type == ARPHRD_LOOPBACK) {
+		_leave(" = 0 [loopback]");
+		return 0;
+	}
+
+	rtattr = NLMSG_DATA(nlhdr) + NLMSG_ALIGN(sizeof(struct ifinfomsg));
+	remain = NLMSG_PAYLOAD(nlhdr, sizeof(struct ifinfomsg));
+
+	name = "unknown";
+	set = false;
+	for (; RTA_OK(rtattr, remain); rtattr = RTA_NEXT(rtattr, remain)) {
+		switch (rtattr->rta_type) {
+		case IFLA_ADDRESS:
+			len = RTA_PAYLOAD(rtattr);
+			memcpy(desc->mac, RTA_DATA(rtattr),
+			       min_t(size_t, len, 6));
+			desc->mac_index = ifi->ifi_index;
+			set = true;
+			break;
+		case IFLA_IFNAME:
+			name = RTA_DATA(rtattr);
+			break;
+		}
+	}
+
+	if (set)
+		_debug("%s: %02x:%02x:%02x:%02x:%02x:%02x",
+		       name,
+		       desc->mac[0], desc->mac[1], desc->mac[2],
+		       desc->mac[3], desc->mac[4], desc->mac[5]);
+
+	_leave(" = 0");
+	return 0;
+}
+
+/*
+ * read the rtnetlink response and pass to parsing routine
+ */
+static int afs_read_rtm(struct afs_rtm_desc *desc)
+{
+	struct nlmsghdr *nlhdr, tmphdr;
+	struct msghdr msg;
+	struct kvec iov[1];
+	void *data;
+	bool last = false;
+	int len, ret, remain;
+
+	_enter("");
+
+	do {
+		/* first of all peek to see how big the packet is */
+		memset(&msg, 0, sizeof(msg));
+		iov[0].iov_base = &tmphdr;
+		iov[0].iov_len = sizeof(tmphdr);
+		len = kernel_recvmsg(desc->nlsock, &msg, iov, 1,
+				     sizeof(tmphdr), MSG_PEEK | MSG_TRUNC);
+		if (len < 0) {
+			_leave(" = %d [peek]", len);
+			return len;
+		}
+		if (len == 0)
+			continue;
+		if (len < sizeof(tmphdr) || len < NLMSG_PAYLOAD(&tmphdr, 0)) {
+			_leave(" = -EMSGSIZE");
+			return -EMSGSIZE;
+		}
+
+		if (desc->datamax < len) {
+			kfree(desc->data);
+			desc->data = NULL;
+			data = kmalloc(len, GFP_KERNEL);
+			if (!data)
+				return -ENOMEM;
+			desc->data = data;
+		}
+		desc->datamax = len;
+
+		/* read all the data from this packet */
+		iov[0].iov_base = desc->data;
+		iov[0].iov_len = desc->datamax;
+		desc->datalen = kernel_recvmsg(desc->nlsock, &msg, iov, 1,
+					       desc->datamax, 0);
+		if (desc->datalen < 0) {
+			_leave(" = %ld [recv]", desc->datalen);
+			return desc->datalen;
+		}
+
+		nlhdr = desc->data;
+
+		/* check if the header is valid */
+		if (!NLMSG_OK(nlhdr, desc->datalen) ||
+		    nlhdr->nlmsg_type == NLMSG_ERROR) {
+			_leave(" = -EIO");
+			return -EIO;
+		}
+
+		/* see if this is the last message */
+		if (nlhdr->nlmsg_type == NLMSG_DONE ||
+		    !(nlhdr->nlmsg_flags & NLM_F_MULTI))
+			last = true;
+
+		/* parse the bits we got this time */
+		nlmsg_for_each_msg(nlhdr, desc->data, desc->datalen, remain) {
+			ret = desc->parse(desc, nlhdr);
+			if (ret < 0) {
+				_leave(" = %d [parse]", ret);
+				return ret;
+			}
+		}
+
+	} while (!last);
+
+	_leave(" = 0");
+	return 0;
+}
+
+/*
+ * list the interface bound addresses to get the address and netmask
+ */
+static int afs_rtm_getaddr(struct afs_rtm_desc *desc)
+{
+	struct msghdr msg;
+	struct kvec iov[1];
+	int ret;
+
+	struct {
+		struct nlmsghdr nl_msg __attribute__((aligned(NLMSG_ALIGNTO)));
+		struct ifaddrmsg addr_msg __attribute__((aligned(NLMSG_ALIGNTO)));
+	} request;
+
+	_enter("");
+
+	memset(&request, 0, sizeof(request));
+
+	request.nl_msg.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifaddrmsg));
+	request.nl_msg.nlmsg_type = RTM_GETADDR;
+	request.nl_msg.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
+	request.nl_msg.nlmsg_seq = desc->msg_seq++;
+	request.nl_msg.nlmsg_pid = 0;
+
+	memset(&msg, 0, sizeof(msg));
+	iov[0].iov_base = &request;
+	iov[0].iov_len = sizeof(request);
+
+	ret = kernel_sendmsg(desc->nlsock, &msg, iov, 1, iov[0].iov_len);
+	_leave(" = %d", ret);
+	return ret;
+}
+
+/*
+ * list the interface link statuses to get the MTUs
+ */
+static int afs_rtm_getlink(struct afs_rtm_desc *desc)
+{
+	struct msghdr msg;
+	struct kvec iov[1];
+	int ret;
+
+	struct {
+		struct nlmsghdr nl_msg __attribute__((aligned(NLMSG_ALIGNTO)));
+		struct ifinfomsg link_msg __attribute__((aligned(NLMSG_ALIGNTO)));
+	} request;
+
+	_enter("");
+
+	memset(&request, 0, sizeof(request));
+
+	request.nl_msg.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
+	request.nl_msg.nlmsg_type = RTM_GETLINK;
+	request.nl_msg.nlmsg_flags = NLM_F_REQUEST | NLM_F_ROOT;
+	request.nl_msg.nlmsg_seq = desc->msg_seq++;
+	request.nl_msg.nlmsg_pid = 0;
+
+	memset(&msg, 0, sizeof(msg));
+	iov[0].iov_base = &request;
+	iov[0].iov_len = sizeof(request);
+
+	ret = kernel_sendmsg(desc->nlsock, &msg, iov, 1, iov[0].iov_len);
+	_leave(" = %d", ret);
+	return ret;
+}
+
+/*
+ * cull any interface records for which there isn't an MTU value
+ */
+static void afs_cull_interfaces(struct afs_rtm_desc *desc)
+{
+	struct afs_interface *bufs = desc->bufs;
+	size_t nbufs = desc->nbufs;
+	int loop, point = 0;
+
+	_enter("{%zu}", nbufs);
+
+	for (loop = 0; loop < nbufs; loop++) {
+		if (desc->bufs[loop].mtu != 0) {
+			if (loop != point) {
+				ASSERTCMP(loop, >, point);
+				bufs[point] = bufs[loop];
+			}
+			point++;
+		}
+	}
+
+	desc->nbufs = point;
+	_leave(" [%zu/%zu]", desc->nbufs, nbufs);
+}
+
+/*
+ * get a list of this system's interface IPv4 addresses, netmasks and MTUs
+ * - returns the number of interface records in the buffer
+ */
+int afs_get_ipv4_interfaces(struct afs_interface *bufs, size_t maxbufs,
+			    bool wantloopback)
+{
+	struct afs_rtm_desc desc;
+	int ret, loop;
+
+	_enter("");
+
+	memset(&desc, 0, sizeof(desc));
+	desc.bufs = bufs;
+	desc.maxbufs = maxbufs;
+	desc.wantloopback = wantloopback;
+
+	ret = sock_create_kern(AF_NETLINK, SOCK_DGRAM, NETLINK_ROUTE,
+			       &desc.nlsock);
+	if (ret < 0) {
+		_leave(" = %d [sock]", ret);
+		return ret;
+	}
+
+	/* issue RTM_GETADDR */
+	desc.parse = afs_rtm_getaddr_parse;
+	ret = afs_rtm_getaddr(&desc);
+	if (ret < 0)
+		goto error;
+	ret = afs_read_rtm(&desc);
+	if (ret < 0)
+		goto error;
+
+	/* issue RTM_GETLINK */
+	desc.parse = afs_rtm_getlink_if_parse;
+	ret = afs_rtm_getlink(&desc);
+	if (ret < 0)
+		goto error;
+	ret = afs_read_rtm(&desc);
+	if (ret < 0)
+		goto error;
+
+	afs_cull_interfaces(&desc);
+	ret = desc.nbufs;
+
+	for (loop = 0; loop < ret; loop++)
+		_debug("[%d] "NIPQUAD_FMT"/"NIPQUAD_FMT" mtu %u",
+		       bufs[loop].index,
+		       NIPQUAD(bufs[loop].address),
+		       NIPQUAD(bufs[loop].netmask),
+		       bufs[loop].mtu);
+
+error:
+	kfree(desc.data);
+	sock_release(desc.nlsock);
+	_leave(" = %d", ret);
+	return ret;
+}
+
+/*
+ * get a MAC address from a random ethernet interface that has a real one
+ * - the buffer should be 6 bytes in size
+ */
+int afs_get_MAC_address(u8 mac[6])
+{
+	struct afs_rtm_desc desc;
+	int ret;
+
+	_enter("");
+
+	memset(&desc, 0, sizeof(desc));
+	desc.mac = mac;
+	desc.mac_index = UINT_MAX;
+
+	ret = sock_create_kern(AF_NETLINK, SOCK_DGRAM, NETLINK_ROUTE,
+			       &desc.nlsock);
+	if (ret < 0) {
+		_leave(" = %d [sock]", ret);
+		return ret;
+	}
+
+	/* issue RTM_GETLINK */
+	desc.parse = afs_rtm_getlink_mac_parse;
+	ret = afs_rtm_getlink(&desc);
+	if (ret < 0)
+		goto error;
+	ret = afs_read_rtm(&desc);
+	if (ret < 0)
+		goto error;
+
+	if (desc.mac_index < UINT_MAX) {
+		/* got a MAC address */
+		_debug("[%d] %02x:%02x:%02x:%02x:%02x:%02x",
+		       desc.mac_index,
+		       mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
+	} else {
+		ret = -ENONET;
+	}
+
+error:
+	sock_release(desc.nlsock);
+	_leave(" = %d", ret);
+	return ret;
+}


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 15/16] AFS: Implement the CB.InitCallBackState3 operation [try #3]
  2007-04-25 10:50 ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3] David Howells
                     ` (9 preceding siblings ...)
  2007-04-25 10:51   ` [PATCH 14/16] AFS: Add support for the CB.GetCapabilities operation " David Howells
@ 2007-04-25 10:51   ` David Howells
  2007-04-25 10:52   ` [PATCH 16/16] AFS: Add "directory write" support " David Howells
  2007-04-25 13:38   ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite " David Howells
  12 siblings, 0 replies; 17+ messages in thread
From: David Howells @ 2007-04-25 10:51 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel, linux-fsdevel, netdev, dhowells

Implement the CB.InitCallBackState3 operation for the fileserver to call.
This reduces the amount of network traffic because if this op is aborted, the
fileserver will then attempt an CB.InitCallBackState operation.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 fs/afs/AFS_CM.h    |    1 +
 fs/afs/cmservice.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/fs/afs/AFS_CM.h b/fs/afs/AFS_CM.h
index d4bd201..7b4d4fa 100644
--- a/fs/afs/AFS_CM.h
+++ b/fs/afs/AFS_CM.h
@@ -23,6 +23,7 @@ enum AFS_CM_Operations {
 	CBGetCE			= 208,	/* get cache file description */
 	CBGetXStatsVersion	= 209,	/* get version of extended statistics */
 	CBGetXStats		= 210,	/* get contents of extended statistics data */
+	CBInitCallBackState3	= 213,	/* initialise callback state, version 3 */
 	CBGetCapabilities	= 65538, /* get client capabilities */
 };
 
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 5139723..3d58861 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -20,6 +20,8 @@ struct workqueue_struct *afs_cm_workqueue;
 
 static int afs_deliver_cb_init_call_back_state(struct afs_call *,
 					       struct sk_buff *, bool);
+static int afs_deliver_cb_init_call_back_state3(struct afs_call *,
+						struct sk_buff *, bool);
 static int afs_deliver_cb_probe(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_callback(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_get_capabilities(struct afs_call *, struct sk_buff *,
@@ -47,6 +49,16 @@ static const struct afs_call_type afs_SRXCBInitCallBackState = {
 };
 
 /*
+ * CB.InitCallBackState3 operation type
+ */
+static const struct afs_call_type afs_SRXCBInitCallBackState3 = {
+	.name		= "CB.InitCallBackState3",
+	.deliver	= afs_deliver_cb_init_call_back_state3,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_cm_destructor,
+};
+
+/*
  * CB.Probe operation type
  */
 static const struct afs_call_type afs_SRXCBProbe = {
@@ -83,6 +95,9 @@ bool afs_cm_incoming_call(struct afs_call *call)
 	case CBInitCallBackState:
 		call->type = &afs_SRXCBInitCallBackState;
 		return true;
+	case CBInitCallBackState3:
+		call->type = &afs_SRXCBInitCallBackState3;
+		return true;
 	case CBProbe:
 		call->type = &afs_SRXCBProbe;
 		return true;
@@ -312,6 +327,37 @@ static int afs_deliver_cb_init_call_back_state(struct afs_call *call,
 }
 
 /*
+ * deliver request data to a CB.InitCallBackState3 call
+ */
+static int afs_deliver_cb_init_call_back_state3(struct afs_call *call,
+						struct sk_buff *skb,
+						bool last)
+{
+	struct afs_server *server;
+	struct in_addr addr;
+
+	_enter(",{%u},%d", skb->len, last);
+
+	if (!last)
+		return 0;
+
+	/* no unmarshalling required */
+	call->state = AFS_CALL_REPLYING;
+
+	/* we'll need the file server record as that tells us which set of
+	 * vnodes to operate upon */
+	memcpy(&addr, &skb->nh.iph->saddr, 4);
+	server = afs_find_server(&addr);
+	if (!server)
+		return -ENOTCONN;
+	call->server = server;
+
+	INIT_WORK(&call->work, SRXAFSCB_InitCallBackState);
+	schedule_work(&call->work);
+	return 0;
+}
+
+/*
  * allow the fileserver to see if the cache manager is still alive
  */
 static void SRXAFSCB_Probe(struct work_struct *work)

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 16/16] AFS: Add "directory write" support [try #3]
  2007-04-25 10:50 ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3] David Howells
                     ` (10 preceding siblings ...)
  2007-04-25 10:51   ` [PATCH 15/16] AFS: Implement the CB.InitCallBackState3 " David Howells
@ 2007-04-25 10:52   ` David Howells
  2007-04-25 13:38   ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite " David Howells
  12 siblings, 0 replies; 17+ messages in thread
From: David Howells @ 2007-04-25 10:52 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel, linux-fsdevel, netdev, dhowells

Add support for the create, link, symlink, unlink, mkdir, rmdir and rename VFS
operations to the in-kernel AFS filesystem.

Also:

 (1) Fix dentry and inode revalidation.  d_revalidate should only look at
     state of the dentry.  Revalidation of the contents of an inode pointed to
     by a dentry is now separate.

 (2) Fix afs_lookup() to hash negative dentries as well as positive ones.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 fs/afs/AFS.h      |   26 ++
 fs/afs/AFS_FS.h   |   11 +
 fs/afs/callback.c |   36 +++
 fs/afs/dir.c      |  676 ++++++++++++++++++++++++++++++++++++++++++-----------
 fs/afs/file.c     |    7 +
 fs/afs/fsclient.c |  625 +++++++++++++++++++++++++++++++++++++++++++++----
 fs/afs/inode.c    |  115 ++++++++-
 fs/afs/internal.h |   54 ++++
 fs/afs/misc.c     |   21 ++
 fs/afs/security.c |   31 ++
 fs/afs/server.c   |    2 
 fs/afs/super.c    |    6 
 fs/afs/vnode.c    |  422 +++++++++++++++++++++++++++++++--
 fs/afs/volume.c   |   14 +
 14 files changed, 1778 insertions(+), 268 deletions(-)

diff --git a/fs/afs/AFS.h b/fs/afs/AFS.h
index 3f255d6..085cf60 100644
--- a/fs/afs/AFS.h
+++ b/fs/afs/AFS.h
@@ -106,19 +106,37 @@ struct afs_file_status {
 
 	afs_file_type_t		type;		/* file type */
 	unsigned		nlink;		/* link count */
-	size_t			size;		/* file size */
+	u64			size;		/* file size */
 	afs_dataversion_t	data_version;	/* current data version */
-	unsigned		author;		/* author ID */
-	unsigned		owner;		/* owner ID */
+	u32			author;		/* author ID */
+	u32			owner;		/* owner ID */
+	u32			group;		/* group ID */
 	afs_access_t		caller_access;	/* access rights for authenticated caller */
 	afs_access_t		anon_access;	/* access rights for unauthenticated caller */
 	umode_t			mode;		/* UNIX mode */
-	struct afs_fid		parent;		/* parent file ID */
+	struct afs_fid		parent;		/* parent dir ID for non-dirs only */
 	time_t			mtime_client;	/* last time client changed data */
 	time_t			mtime_server;	/* last time server changed data */
 };
 
 /*
+ * AFS file status change request
+ */
+struct afs_store_status {
+	u32			mask;		/* which bits of the struct are set */
+	u32			mtime_client;	/* last time client changed data */
+	u32			owner;		/* owner ID */
+	u32			group;		/* group ID */
+	umode_t			mode;		/* UNIX mode */
+};
+
+#define AFS_SET_MTIME		0x01		/* set the mtime */
+#define AFS_SET_OWNER		0x02		/* set the owner ID */
+#define AFS_SET_GROUP		0x04		/* set the group ID (unsupported?) */
+#define AFS_SET_MODE		0x08		/* set the UNIX mode */
+#define AFS_SET_SEG_SIZE	0x10		/* set the segment size (unsupported) */
+
+/*
  * AFS volume synchronisation information
  */
 struct afs_volsync {
diff --git a/fs/afs/AFS_FS.h b/fs/afs/AFS_FS.h
index fd38595..89e0d16 100644
--- a/fs/afs/AFS_FS.h
+++ b/fs/afs/AFS_FS.h
@@ -16,12 +16,19 @@
 #define FS_SERVICE		1	/* AFS File Service ID */
 
 enum AFS_FS_Operations {
-	FSFETCHSTATUS		= 132,	/* AFS Fetch file status */
 	FSFETCHDATA		= 130,	/* AFS Fetch file data */
+	FSFETCHSTATUS		= 132,	/* AFS Fetch file status */
+	FSREMOVEFILE		= 136,	/* AFS Remove a file */
+	FSCREATEFILE		= 137,	/* AFS Create a file */
+	FSRENAME		= 138,	/* AFS Rename or move a file or directory */
+	FSSYMLINK		= 139,	/* AFS Create a symbolic link */
+	FSLINK			= 140,	/* AFS Create a hard link */
+	FSMAKEDIR		= 141,	/* AFS Create a directory */
+	FSREMOVEDIR		= 142,	/* AFS Remove a directory */
 	FSGIVEUPCALLBACKS	= 147,	/* AFS Discard callback promises */
 	FSGETVOLUMEINFO		= 148,	/* AFS Get root volume information */
 	FSGETROOTVOLUME		= 151,	/* AFS Get root volume name */
-	FSLOOKUP		= 161	/* AFS lookup file in directory */
+	FSLOOKUP		= 161,	/* AFS lookup file in directory */
 };
 
 enum AFS_FS_Errors {
diff --git a/fs/afs/callback.c b/fs/afs/callback.c
index e674beb..639399f 100644
--- a/fs/afs/callback.c
+++ b/fs/afs/callback.c
@@ -44,7 +44,8 @@ void afs_init_callback_state(struct afs_server *server)
 	while (!RB_EMPTY_ROOT(&server->cb_promises)) {
 		vnode = rb_entry(server->cb_promises.rb_node,
 				 struct afs_vnode, cb_promise);
-		printk("\nUNPROMISE on %p\n", vnode);
+		_debug("UNPROMISE { vid=%x vn=%u uq=%u}",
+		       vnode->fid.vid, vnode->fid.vnode, vnode->fid.unique);
 		rb_erase(&vnode->cb_promise, &server->cb_promises);
 		vnode->cb_promised = false;
 	}
@@ -68,7 +69,7 @@ void afs_broken_callback_work(struct work_struct *work)
 
 	/* we're only interested in dealing with a broken callback on *this*
 	 * vnode and only if no-one else has dealt with it yet */
-	if (!mutex_trylock(&vnode->cb_broken_lock))
+	if (!mutex_trylock(&vnode->validate_lock))
 		return; /* someone else is dealing with it */
 
 	if (test_bit(AFS_VNODE_CB_BROKEN, &vnode->flags)) {
@@ -84,13 +85,14 @@ void afs_broken_callback_work(struct work_struct *work)
 		/* if the vnode's data version number changed then its contents
 		 * are different */
 		if (test_and_clear_bit(AFS_VNODE_ZAP_DATA, &vnode->flags)) {
-			_debug("zap data");
+			_debug("zap data {%x:%u}",
+			       vnode->fid.vid, vnode->fid.vnode);
 			invalidate_remote_inode(&vnode->vfs_inode);
 		}
 	}
 
 out:
-	mutex_unlock(&vnode->cb_broken_lock);
+	mutex_unlock(&vnode->validate_lock);
 
 	/* avoid the potential race whereby the mutex_trylock() in this
 	 * function happens again between the clear_bit() and the
@@ -252,6 +254,32 @@ static void afs_do_give_up_callback(struct afs_server *server,
 }
 
 /*
+ * discard the callback on a deleted item
+ */
+void afs_discard_callback_on_delete(struct afs_vnode *vnode)
+{
+	struct afs_server *server = vnode->server;
+
+	_enter("%d", vnode->cb_promised);
+
+	if (!vnode->cb_promised) {
+		_leave(" [not promised]");
+		return;
+	}
+
+	ASSERT(server != NULL);
+
+	spin_lock(&server->cb_lock);
+	if (vnode->cb_promised) {
+		ASSERT(server->cb_promises.rb_node != NULL);
+		rb_erase(&vnode->cb_promise, &server->cb_promises);
+		vnode->cb_promised = false;
+	}
+	spin_unlock(&server->cb_lock);
+	_leave("");
+}
+
+/*
  * give up the callback registered for a vnode on the file server when the
  * inode is being cleared
  */
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 8736841..dbbe75d 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -18,40 +18,50 @@
 #include <linux/ctype.h>
 #include "internal.h"
 
-static struct dentry *afs_dir_lookup(struct inode *dir, struct dentry *dentry,
-				     struct nameidata *nd);
+static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
+				 struct nameidata *nd);
 static int afs_dir_open(struct inode *inode, struct file *file);
-static int afs_dir_readdir(struct file *file, void *dirent, filldir_t filldir);
+static int afs_readdir(struct file *file, void *dirent, filldir_t filldir);
 static int afs_d_revalidate(struct dentry *dentry, struct nameidata *nd);
 static int afs_d_delete(struct dentry *dentry);
-static int afs_dir_lookup_filldir(void *_cookie, const char *name, int nlen,
+static void afs_d_release(struct dentry *dentry);
+static int afs_lookup_filldir(void *_cookie, const char *name, int nlen,
 				  loff_t fpos, u64 ino, unsigned dtype);
+static int afs_create(struct inode *dir, struct dentry *dentry, int mode,
+		      struct nameidata *nd);
+static int afs_mkdir(struct inode *dir, struct dentry *dentry, int mode);
+static int afs_rmdir(struct inode *dir, struct dentry *dentry);
+static int afs_unlink(struct inode *dir, struct dentry *dentry);
+static int afs_link(struct dentry *from, struct inode *dir,
+		    struct dentry *dentry);
+static int afs_symlink(struct inode *dir, struct dentry *dentry,
+		       const char *content);
+static int afs_rename(struct inode *old_dir, struct dentry *old_dentry,
+		      struct inode *new_dir, struct dentry *new_dentry);
 
 const struct file_operations afs_dir_file_operations = {
 	.open		= afs_dir_open,
 	.release	= afs_release,
-	.readdir	= afs_dir_readdir,
+	.readdir	= afs_readdir,
 };
 
 const struct inode_operations afs_dir_inode_operations = {
-	.lookup		= afs_dir_lookup,
+	.create		= afs_create,
+	.lookup		= afs_lookup,
+	.link		= afs_link,
+	.unlink		= afs_unlink,
+	.symlink	= afs_symlink,
+	.mkdir		= afs_mkdir,
+	.rmdir		= afs_rmdir,
+	.rename		= afs_rename,
 	.permission	= afs_permission,
 	.getattr	= afs_inode_getattr,
-#if 0 /* TODO */
-	.create		= afs_dir_create,
-	.link		= afs_dir_link,
-	.unlink		= afs_dir_unlink,
-	.symlink	= afs_dir_symlink,
-	.mkdir		= afs_dir_mkdir,
-	.rmdir		= afs_dir_rmdir,
-	.mknod		= afs_dir_mknod,
-	.rename		= afs_dir_rename,
-#endif
 };
 
 static struct dentry_operations afs_fs_dentry_operations = {
 	.d_revalidate	= afs_d_revalidate,
 	.d_delete	= afs_d_delete,
+	.d_release	= afs_d_release,
 };
 
 #define AFS_DIR_HASHTBL_SIZE	128
@@ -103,7 +113,7 @@ struct afs_dir_page {
 	union afs_dir_block blocks[PAGE_SIZE / sizeof(union afs_dir_block)];
 };
 
-struct afs_dir_lookup_cookie {
+struct afs_lookup_cookie {
 	struct afs_fid	fid;
 	const char	*name;
 	size_t		nlen;
@@ -299,7 +309,7 @@ static int afs_dir_iterate_block(unsigned *fpos,
 			      nlen,
 			      blkoff + offset * sizeof(union afs_dirent),
 			      ntohl(dire->u.vnode),
-			      filldir == afs_dir_lookup_filldir ?
+			      filldir == afs_lookup_filldir ?
 			      ntohl(dire->u.unique) : DT_UNKNOWN);
 		if (ret < 0) {
 			_leave(" = 0 [full]");
@@ -379,7 +389,7 @@ out:
 /*
  * read an AFS directory
  */
-static int afs_dir_readdir(struct file *file, void *cookie, filldir_t filldir)
+static int afs_readdir(struct file *file, void *cookie, filldir_t filldir)
 {
 	unsigned fpos;
 	int ret;
@@ -403,10 +413,10 @@ static int afs_dir_readdir(struct file *file, void *cookie, filldir_t filldir)
  * - if afs_dir_iterate_block() spots this function, it'll pass the FID
  *   uniquifier through dtype
  */
-static int afs_dir_lookup_filldir(void *_cookie, const char *name, int nlen,
-				  loff_t fpos, u64 ino, unsigned dtype)
+static int afs_lookup_filldir(void *_cookie, const char *name, int nlen,
+			      loff_t fpos, u64 ino, unsigned dtype)
 {
-	struct afs_dir_lookup_cookie *cookie = _cookie;
+	struct afs_lookup_cookie *cookie = _cookie;
 
 	_enter("{%s,%Zu},%s,%u,,%llu,%u",
 	       cookie->name, cookie->nlen, name, nlen, ino, dtype);
@@ -430,11 +440,12 @@ static int afs_dir_lookup_filldir(void *_cookie, const char *name, int nlen,
 
 /*
  * do a lookup in a directory
+ * - just returns the FID the dentry name maps to if found
  */
 static int afs_do_lookup(struct inode *dir, struct dentry *dentry,
 			 struct afs_fid *fid, struct key *key)
 {
-	struct afs_dir_lookup_cookie cookie;
+	struct afs_lookup_cookie cookie;
 	struct afs_super_info *as;
 	unsigned fpos;
 	int ret;
@@ -450,7 +461,7 @@ static int afs_do_lookup(struct inode *dir, struct dentry *dentry,
 	cookie.found	= 0;
 
 	fpos = 0;
-	ret = afs_dir_iterate(dir, &fpos, &cookie, afs_dir_lookup_filldir,
+	ret = afs_dir_iterate(dir, &fpos, &cookie, afs_lookup_filldir,
 			      key);
 	if (ret < 0) {
 		_leave(" = %d [iter]", ret);
@@ -471,8 +482,8 @@ static int afs_do_lookup(struct inode *dir, struct dentry *dentry,
 /*
  * look up an entry in a directory
  */
-static struct dentry *afs_dir_lookup(struct inode *dir, struct dentry *dentry,
-				     struct nameidata *nd)
+static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
+				 struct nameidata *nd)
 {
 	struct afs_vnode *vnode;
 	struct afs_fid fid;
@@ -480,14 +491,18 @@ static struct dentry *afs_dir_lookup(struct inode *dir, struct dentry *dentry,
 	struct key *key;
 	int ret;
 
-	_enter("{%lu},%p{%s}", dir->i_ino, dentry, dentry->d_name.name);
+	vnode = AFS_FS_I(dir);
+
+	_enter("{%x:%d},%p{%s},",
+	       vnode->fid.vid, vnode->fid.vnode, dentry, dentry->d_name.name);
+
+	ASSERTCMP(dentry->d_inode, ==, NULL);
 
 	if (dentry->d_name.len > 255) {
 		_leave(" = -ENAMETOOLONG");
 		return ERR_PTR(-ENAMETOOLONG);
 	}
 
-	vnode = AFS_FS_I(dir);
 	if (test_bit(AFS_VNODE_DELETED, &vnode->flags)) {
 		_leave(" = -ESTALE");
 		return ERR_PTR(-ESTALE);
@@ -499,15 +514,28 @@ static struct dentry *afs_dir_lookup(struct inode *dir, struct dentry *dentry,
 		return ERR_PTR(PTR_ERR(key));
 	}
 
+	ret = afs_validate(vnode, key);
+	if (ret < 0) {
+		key_put(key);
+		_leave(" = %d [val]", ret);
+		return ERR_PTR(ret);
+	}
+
 	ret = afs_do_lookup(dir, dentry, &fid, key);
 	if (ret < 0) {
 		key_put(key);
+		if (ret == -ENOENT) {
+			d_add(dentry, NULL);
+			_leave(" = NULL [negative]");
+			return NULL;
+		}
 		_leave(" = %d [do]", ret);
 		return ERR_PTR(ret);
 	}
+	dentry->d_fsdata = (void *)(unsigned long) vnode->status.data_version;
 
 	/* instantiate the dentry */
-	inode = afs_iget(dir->i_sb, key, &fid);
+	inode = afs_iget(dir->i_sb, key, &fid, NULL, NULL);
 	key_put(key);
 	if (IS_ERR(inode)) {
 		_leave(" = %ld", PTR_ERR(inode));
@@ -527,105 +555,64 @@ static struct dentry *afs_dir_lookup(struct inode *dir, struct dentry *dentry,
 }
 
 /*
- * propagate changed and modified flags on a directory to all the children of
- * that directory as they may indicate that the ACL on the dir has changed,
- * potentially rendering the child inaccessible or that a file has been deleted
- * or renamed
- */
-static void afs_propagate_dir_changes(struct dentry *dir)
-{
-	struct dentry *child;
-	bool c, m;
-
-	c = test_bit(AFS_VNODE_CHANGED, &AFS_FS_I(dir->d_inode)->flags);
-	m = test_bit(AFS_VNODE_MODIFIED, &AFS_FS_I(dir->d_inode)->flags);
-
-	_enter("{%d,%d}", c, m);
-
-	spin_lock(&dir->d_lock);
-
-	list_for_each_entry(child, &dir->d_subdirs, d_u.d_child) {
-		if (child->d_inode) {
-			struct afs_vnode *vnode;
-
-			_debug("tag %s", child->d_name.name);
-			vnode = AFS_FS_I(child->d_inode);
-			if (c)
-				set_bit(AFS_VNODE_DIR_CHANGED, &vnode->flags);
-			if (m)
-				set_bit(AFS_VNODE_DIR_MODIFIED, &vnode->flags);
-		}
-	}
-
-	spin_unlock(&dir->d_lock);
-}
-
-/*
  * check that a dentry lookup hit has found a valid entry
  * - NOTE! the hit can be a negative hit too, so we can't assume we have an
  *   inode
- * - there are several things we need to check
- *   - parent dir data changes (rm, rmdir, rename, mkdir, create, link,
- *     symlink)
- *   - parent dir metadata changed (security changes)
- *   - dentry data changed (write, truncate)
- *   - dentry metadata changed (security changes)
  */
 static int afs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
 {
-	struct afs_vnode *vnode;
+	struct afs_vnode *vnode, *dir;
 	struct afs_fid fid;
 	struct dentry *parent;
-	struct inode *inode, *dir;
 	struct key *key;
+	void *dir_version;
 	int ret;
 
 	vnode = AFS_FS_I(dentry->d_inode);
 
-	_enter("{sb=%p n=%s fl=%lx},",
-	       dentry->d_sb, dentry->d_name.name, vnode->flags);
+	if (dentry->d_inode)
+		_enter("{v={%x:%u} n=%s fl=%lx},",
+		       vnode->fid.vid, vnode->fid.vnode, dentry->d_name.name,
+		       vnode->flags);
+	else
+		_enter("{neg n=%s}", dentry->d_name.name);
 
-	key = afs_request_key(vnode->volume->cell);
+	key = afs_request_key(AFS_FS_S(dentry->d_sb)->volume->cell);
 	if (IS_ERR(key))
 		key = NULL;
 
 	/* lock down the parent dentry so we can peer at it */
 	parent = dget_parent(dentry);
-
-	dir = parent->d_inode;
-	inode = dentry->d_inode;
-
-	/* handle a negative dentry */
-	if (!inode)
+	if (!parent->d_inode)
 		goto out_bad;
 
-	/* handle a bad inode */
-	if (is_bad_inode(inode)) {
-		printk("kAFS: afs_d_revalidate: %s/%s has bad inode\n",
-		       parent->d_name.name, dentry->d_name.name);
-		goto out_bad;
-	}
+	dir = AFS_FS_I(parent->d_inode);
 
-	/* check that this dirent still exists if the directory's contents were
-	 * modified */
-	if (test_bit(AFS_VNODE_DELETED, &AFS_FS_I(dir)->flags)) {
+	/* validate the parent directory */
+	if (test_bit(AFS_VNODE_MODIFIED, &dir->flags))
+		afs_validate(dir, key);
+
+	if (test_bit(AFS_VNODE_DELETED, &dir->flags)) {
 		_debug("%s: parent dir deleted", dentry->d_name.name);
 		goto out_bad;
 	}
 
-	if (test_and_clear_bit(AFS_VNODE_DIR_MODIFIED, &vnode->flags)) {
-		/* rm/rmdir/rename may have occurred */
-		_debug("dir modified");
+	dir_version = (void *) (unsigned long) dir->status.data_version;
+	if (dentry->d_fsdata == dir_version)
+		goto out_valid; /* the dir contents are unchanged */
 
-		/* search the directory for this vnode */
-		ret = afs_do_lookup(dir, dentry, &fid, key);
-		if (ret == -ENOENT) {
-			_debug("%s: dirent not found", dentry->d_name.name);
-			goto not_found;
-		}
-		if (ret < 0) {
-			_debug("failed to iterate dir %s: %d",
-			       parent->d_name.name, ret);
+	_debug("dir modified");
+
+	/* search the directory for this vnode */
+	ret = afs_do_lookup(&dir->vfs_inode, dentry, &fid, key);
+	switch (ret) {
+	case 0:
+		/* the filename maps to something */
+		if (!dentry->d_inode)
+			goto out_bad;
+		if (is_bad_inode(dentry->d_inode)) {
+			printk("kAFS: afs_d_revalidate: %s/%s has bad inode\n",
+			       parent->d_name.name, dentry->d_name.name);
 			goto out_bad;
 		}
 
@@ -639,56 +626,35 @@ static int afs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
 		}
 
 		/* if the vnode ID uniqifier has changed, then the file has
-		 * been deleted */
+		 * been deleted and replaced, and the original vnode ID has
+		 * been reused */
 		if (fid.unique != vnode->fid.unique) {
 			_debug("%s: file deleted (uq %u -> %u I:%lu)",
 			       dentry->d_name.name, fid.unique,
-			       vnode->fid.unique, inode->i_version);
+			       vnode->fid.unique, dentry->d_inode->i_version);
 			spin_lock(&vnode->lock);
 			set_bit(AFS_VNODE_DELETED, &vnode->flags);
 			spin_unlock(&vnode->lock);
-			invalidate_remote_inode(inode);
-			goto out_bad;
+			goto not_found;
 		}
-	}
+		goto out_valid;
 
-	/* if the directory's metadata were changed then the security may be
-	 * different and we may no longer have access */
-	mutex_lock(&vnode->cb_broken_lock);
-
-	if (test_and_clear_bit(AFS_VNODE_DIR_CHANGED, &vnode->flags) ||
-	    test_bit(AFS_VNODE_CB_BROKEN, &vnode->flags)) {
-		_debug("%s: changed", dentry->d_name.name);
-		set_bit(AFS_VNODE_CB_BROKEN, &vnode->flags);
-		if (afs_vnode_fetch_status(vnode, NULL, key) < 0) {
-			mutex_unlock(&vnode->cb_broken_lock);
-			goto out_bad;
-		}
-	}
+	case -ENOENT:
+		/* the filename is unknown */
+		_debug("%s: dirent not found", dentry->d_name.name);
+		if (dentry->d_inode)
+			goto not_found;
+		goto out_valid;
 
-	if (test_bit(AFS_VNODE_DELETED, &vnode->flags)) {
-		_debug("%s: file already deleted", dentry->d_name.name);
-		mutex_unlock(&vnode->cb_broken_lock);
+	default:
+		_debug("failed to iterate dir %s: %d",
+		       parent->d_name.name, ret);
 		goto out_bad;
 	}
 
-	/* if the vnode's data version number changed then its contents are
-	 * different */
-	if (test_and_clear_bit(AFS_VNODE_ZAP_DATA, &vnode->flags)) {
-		_debug("zap data");
-		invalidate_remote_inode(inode);
-	}
-
-	if (S_ISDIR(inode->i_mode) &&
-	    (test_bit(AFS_VNODE_CHANGED, &vnode->flags) ||
-	     test_bit(AFS_VNODE_MODIFIED, &vnode->flags)))
-		afs_propagate_dir_changes(dentry);
-
-	clear_bit(AFS_VNODE_CHANGED, &vnode->flags);
-	clear_bit(AFS_VNODE_MODIFIED, &vnode->flags);
-	mutex_unlock(&vnode->cb_broken_lock);
-
 out_valid:
+	dentry->d_fsdata = dir_version;
+out_skip:
 	dput(parent);
 	key_put(key);
 	_leave(" = 1 [valid]");
@@ -701,10 +667,10 @@ not_found:
 	spin_unlock(&dentry->d_lock);
 
 out_bad:
-	if (inode) {
+	if (dentry->d_inode) {
 		/* don't unhash if we have submounts */
 		if (have_submounts(dentry))
-			goto out_valid;
+			goto out_skip;
 	}
 
 	_debug("dropping dentry %s/%s",
@@ -742,3 +708,433 @@ zap:
 	_leave(" = 1 [zap]");
 	return 1;
 }
+
+/*
+ * handle dentry release
+ */
+static void afs_d_release(struct dentry *dentry)
+{
+	_enter("%s", dentry->d_name.name);
+}
+
+/*
+ * create a directory on an AFS filesystem
+ */
+static int afs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
+{
+	struct afs_file_status status;
+	struct afs_callback cb;
+	struct afs_server *server;
+	struct afs_vnode *dvnode, *vnode;
+	struct afs_fid fid;
+	struct inode *inode;
+	struct key *key;
+	int ret;
+
+	dvnode = AFS_FS_I(dir);
+
+	_enter("{%x:%d},{%s},%o",
+	       dvnode->fid.vid, dvnode->fid.vnode, dentry->d_name.name, mode);
+
+	ret = -ENAMETOOLONG;
+	if (dentry->d_name.len > 255)
+		goto error;
+
+	key = afs_request_key(dvnode->volume->cell);
+	if (IS_ERR(key)) {
+		ret = PTR_ERR(key);
+		goto error;
+	}
+
+	mode |= S_IFDIR;
+	ret = afs_vnode_create(dvnode, key, dentry->d_name.name,
+			       mode, &fid, &status, &cb, &server);
+	if (ret < 0)
+		goto mkdir_error;
+
+	inode = afs_iget(dir->i_sb, key, &fid, &status, &cb);
+	if (IS_ERR(inode)) {
+		/* ENOMEM at a really inconvenient time - just abandon the new
+		 * directory on the server */
+		ret = PTR_ERR(inode);
+		goto iget_error;
+	}
+
+	/* apply the status report we've got for the new vnode */
+	vnode = AFS_FS_I(inode);
+	spin_lock(&vnode->lock);
+	vnode->update_cnt++;
+	spin_unlock(&vnode->lock);
+	afs_vnode_finalise_status_update(vnode, server);
+	afs_put_server(server);
+
+	d_instantiate(dentry, inode);
+	if (d_unhashed(dentry)) {
+		_debug("not hashed");
+		d_rehash(dentry);
+	}
+	key_put(key);
+	_leave(" = 0");
+	return 0;
+
+iget_error:
+	afs_put_server(server);
+mkdir_error:
+	key_put(key);
+error:
+	d_drop(dentry);
+	_leave(" = %d", ret);
+	return ret;
+}
+
+/*
+ * remove a directory from an AFS filesystem
+ */
+static int afs_rmdir(struct inode *dir, struct dentry *dentry)
+{
+	struct afs_vnode *dvnode, *vnode;
+	struct key *key;
+	int ret;
+
+	dvnode = AFS_FS_I(dir);
+
+	_enter("{%x:%d},{%s}",
+	       dvnode->fid.vid, dvnode->fid.vnode, dentry->d_name.name);
+
+	ret = -ENAMETOOLONG;
+	if (dentry->d_name.len > 255)
+		goto error;
+
+	key = afs_request_key(dvnode->volume->cell);
+	if (IS_ERR(key)) {
+		ret = PTR_ERR(key);
+		goto error;
+	}
+
+	ret = afs_vnode_remove(dvnode, key, dentry->d_name.name, true);
+	if (ret < 0)
+		goto rmdir_error;
+
+	if (dentry->d_inode) {
+		vnode = AFS_FS_I(dentry->d_inode);
+		clear_nlink(&vnode->vfs_inode);
+		set_bit(AFS_VNODE_DELETED, &vnode->flags);
+		afs_discard_callback_on_delete(vnode);
+	}
+
+	key_put(key);
+	_leave(" = 0");
+	return 0;
+
+rmdir_error:
+	key_put(key);
+error:
+	_leave(" = %d", ret);
+	return ret;
+}
+
+/*
+ * remove a file from an AFS filesystem
+ */
+static int afs_unlink(struct inode *dir, struct dentry *dentry)
+{
+	struct afs_vnode *dvnode, *vnode;
+	struct key *key;
+	int ret;
+
+	dvnode = AFS_FS_I(dir);
+
+	_enter("{%x:%d},{%s}",
+	       dvnode->fid.vid, dvnode->fid.vnode, dentry->d_name.name);
+
+	ret = -ENAMETOOLONG;
+	if (dentry->d_name.len > 255)
+		goto error;
+
+	key = afs_request_key(dvnode->volume->cell);
+	if (IS_ERR(key)) {
+		ret = PTR_ERR(key);
+		goto error;
+	}
+
+	if (dentry->d_inode) {
+		vnode = AFS_FS_I(dentry->d_inode);
+
+		/* make sure we have a callback promise on the victim */
+		ret = afs_validate(vnode, key);
+		if (ret < 0)
+			goto error;
+	}
+
+	ret = afs_vnode_remove(dvnode, key, dentry->d_name.name, false);
+	if (ret < 0)
+		goto remove_error;
+
+	if (dentry->d_inode) {
+		/* if the file wasn't deleted due to excess hard links, the
+		 * fileserver will break the callback promise on the file - if
+		 * it had one - before it returns to us, and if it was deleted,
+		 * it won't
+		 *
+		 * however, if we didn't have a callback promise outstanding,
+		 * or it was outstanding on a different server, then it won't
+		 * break it either...
+		 */
+		vnode = AFS_FS_I(dentry->d_inode);
+		if (test_bit(AFS_VNODE_DELETED, &vnode->flags))
+			_debug("AFS_VNODE_DELETED");
+		if (test_bit(AFS_VNODE_CB_BROKEN, &vnode->flags))
+			_debug("AFS_VNODE_CB_BROKEN");
+		set_bit(AFS_VNODE_CB_BROKEN, &vnode->flags);
+		ret = afs_validate(vnode, key);
+		_debug("nlink %d [val %d]", vnode->vfs_inode.i_nlink, ret);
+	}
+
+	key_put(key);
+	_leave(" = 0");
+	return 0;
+
+remove_error:
+	key_put(key);
+error:
+	_leave(" = %d", ret);
+	return ret;
+}
+
+/*
+ * create a regular file on an AFS filesystem
+ */
+static int afs_create(struct inode *dir, struct dentry *dentry, int mode,
+		      struct nameidata *nd)
+{
+	struct afs_file_status status;
+	struct afs_callback cb;
+	struct afs_server *server;
+	struct afs_vnode *dvnode, *vnode;
+	struct afs_fid fid;
+	struct inode *inode;
+	struct key *key;
+	int ret;
+
+	dvnode = AFS_FS_I(dir);
+
+	_enter("{%x:%d},{%s},%o,",
+	       dvnode->fid.vid, dvnode->fid.vnode, dentry->d_name.name, mode);
+
+	ret = -ENAMETOOLONG;
+	if (dentry->d_name.len > 255)
+		goto error;
+
+	key = afs_request_key(dvnode->volume->cell);
+	if (IS_ERR(key)) {
+		ret = PTR_ERR(key);
+		goto error;
+	}
+
+	mode |= S_IFREG;
+	ret = afs_vnode_create(dvnode, key, dentry->d_name.name,
+			       mode, &fid, &status, &cb, &server);
+	if (ret < 0)
+		goto create_error;
+
+	inode = afs_iget(dir->i_sb, key, &fid, &status, &cb);
+	if (IS_ERR(inode)) {
+		/* ENOMEM at a really inconvenient time - just abandon the new
+		 * directory on the server */
+		ret = PTR_ERR(inode);
+		goto iget_error;
+	}
+
+	/* apply the status report we've got for the new vnode */
+	vnode = AFS_FS_I(inode);
+	spin_lock(&vnode->lock);
+	vnode->update_cnt++;
+	spin_unlock(&vnode->lock);
+	afs_vnode_finalise_status_update(vnode, server);
+	afs_put_server(server);
+
+	d_instantiate(dentry, inode);
+	if (d_unhashed(dentry)) {
+		_debug("not hashed");
+		d_rehash(dentry);
+	}
+	key_put(key);
+	_leave(" = 0");
+	return 0;
+
+iget_error:
+	afs_put_server(server);
+create_error:
+	key_put(key);
+error:
+	d_drop(dentry);
+	_leave(" = %d", ret);
+	return ret;
+}
+
+/*
+ * create a hard link between files in an AFS filesystem
+ */
+static int afs_link(struct dentry *from, struct inode *dir,
+		    struct dentry *dentry)
+{
+	struct afs_vnode *dvnode, *vnode;
+	struct key *key;
+	int ret;
+
+	vnode = AFS_FS_I(from->d_inode);
+	dvnode = AFS_FS_I(dir);
+
+	_enter("{%x:%d},{%x:%d},{%s}",
+	       vnode->fid.vid, vnode->fid.vnode,
+	       dvnode->fid.vid, dvnode->fid.vnode,
+	       dentry->d_name.name);
+
+	ret = -ENAMETOOLONG;
+	if (dentry->d_name.len > 255)
+		goto error;
+
+	key = afs_request_key(dvnode->volume->cell);
+	if (IS_ERR(key)) {
+		ret = PTR_ERR(key);
+		goto error;
+	}
+
+	ret = afs_vnode_link(dvnode, vnode, key, dentry->d_name.name);
+	if (ret < 0)
+		goto link_error;
+
+	atomic_inc(&vnode->vfs_inode.i_count);
+	d_instantiate(dentry, &vnode->vfs_inode);
+	key_put(key);
+	_leave(" = 0");
+	return 0;
+
+link_error:
+	key_put(key);
+error:
+	d_drop(dentry);
+	_leave(" = %d", ret);
+	return ret;
+}
+
+/*
+ * create a symlink in an AFS filesystem
+ */
+static int afs_symlink(struct inode *dir, struct dentry *dentry,
+		       const char *content)
+{
+	struct afs_file_status status;
+	struct afs_server *server;
+	struct afs_vnode *dvnode, *vnode;
+	struct afs_fid fid;
+	struct inode *inode;
+	struct key *key;
+	int ret;
+
+	dvnode = AFS_FS_I(dir);
+
+	_enter("{%x:%d},{%s},%s",
+	       dvnode->fid.vid, dvnode->fid.vnode, dentry->d_name.name,
+	       content);
+
+	ret = -ENAMETOOLONG;
+	if (dentry->d_name.len > 255)
+		goto error;
+
+	ret = -EINVAL;
+	if (strlen(content) > 1023)
+		goto error;
+
+	key = afs_request_key(dvnode->volume->cell);
+	if (IS_ERR(key)) {
+		ret = PTR_ERR(key);
+		goto error;
+	}
+
+	ret = afs_vnode_symlink(dvnode, key, dentry->d_name.name, content,
+				&fid, &status, &server);
+	if (ret < 0)
+		goto create_error;
+
+	inode = afs_iget(dir->i_sb, key, &fid, &status, NULL);
+	if (IS_ERR(inode)) {
+		/* ENOMEM at a really inconvenient time - just abandon the new
+		 * directory on the server */
+		ret = PTR_ERR(inode);
+		goto iget_error;
+	}
+
+	/* apply the status report we've got for the new vnode */
+	vnode = AFS_FS_I(inode);
+	spin_lock(&vnode->lock);
+	vnode->update_cnt++;
+	spin_unlock(&vnode->lock);
+	afs_vnode_finalise_status_update(vnode, server);
+	afs_put_server(server);
+
+	d_instantiate(dentry, inode);
+	if (d_unhashed(dentry)) {
+		_debug("not hashed");
+		d_rehash(dentry);
+	}
+	key_put(key);
+	_leave(" = 0");
+	return 0;
+
+iget_error:
+	afs_put_server(server);
+create_error:
+	key_put(key);
+error:
+	d_drop(dentry);
+	_leave(" = %d", ret);
+	return ret;
+}
+
+/*
+ * rename a file in an AFS filesystem and/or move it between directories
+ */
+static int afs_rename(struct inode *old_dir, struct dentry *old_dentry,
+		      struct inode *new_dir, struct dentry *new_dentry)
+{
+	struct afs_vnode *orig_dvnode, *new_dvnode, *vnode;
+	struct key *key;
+	int ret;
+
+	vnode = AFS_FS_I(old_dentry->d_inode);
+	orig_dvnode = AFS_FS_I(old_dir);
+	new_dvnode = AFS_FS_I(new_dir);
+
+	_enter("{%x:%d},{%x:%d},{%x:%d},{%s}",
+	       orig_dvnode->fid.vid, orig_dvnode->fid.vnode,
+	       vnode->fid.vid, vnode->fid.vnode,
+	       new_dvnode->fid.vid, new_dvnode->fid.vnode,
+	       new_dentry->d_name.name);
+
+	ret = -ENAMETOOLONG;
+	if (new_dentry->d_name.len > 255)
+		goto error;
+
+	key = afs_request_key(orig_dvnode->volume->cell);
+	if (IS_ERR(key)) {
+		ret = PTR_ERR(key);
+		goto error;
+	}
+
+	ret = afs_vnode_rename(orig_dvnode, new_dvnode, key,
+			       old_dentry->d_name.name,
+			       new_dentry->d_name.name);
+	if (ret < 0)
+		goto rename_error;
+	key_put(key);
+	_leave(" = 0");
+	return 0;
+
+rename_error:
+	key_put(key);
+error:
+	d_drop(new_dentry);
+	_leave(" = %d", ret);
+	return ret;
+}
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 101bbb8..59cd955 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -50,6 +50,7 @@ int afs_open(struct inode *inode, struct file *file)
 {
 	struct afs_vnode *vnode = AFS_FS_I(inode);
 	struct key *key;
+	int ret;
 
 	_enter("{%x:%x},", vnode->fid.vid, vnode->fid.vnode);
 
@@ -58,6 +59,12 @@ int afs_open(struct inode *inode, struct file *file)
 		_leave(" = %ld [key]", PTR_ERR(key));
 		return PTR_ERR(key);
 	}
+	
+	ret = afs_validate(vnode, key);
+	if (ret < 0) {
+		_leave(" = %d [val]", ret);
+		return ret;
+	}
 
 	file->private_data = key;
 	_leave(" = 0");
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index 8f20f1b..cdcbf2f 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -16,14 +16,28 @@
 #include "AFS_FS.h"
 
 /*
+ * decode an AFSFid block
+ */
+static void xdr_decode_AFSFid(const __be32 **_bp, struct afs_fid *fid)
+{
+	const __be32 *bp = *_bp;
+
+	fid->vid		= ntohl(*bp++);
+	fid->vnode		= ntohl(*bp++);
+	fid->unique		= ntohl(*bp++);
+	*_bp = bp;
+}
+
+/*
  * decode an AFSFetchStatus block
  */
 static void xdr_decode_AFSFetchStatus(const __be32 **_bp,
+				      struct afs_file_status *status,
 				      struct afs_vnode *vnode)
 {
 	const __be32 *bp = *_bp;
 	umode_t mode;
-	u64 data_version;
+	u64 data_version, size;
 	u32 changed = 0; /* becomes non-zero if ctime-type changes seen */
 
 #define EXTRACT(DST)				\
@@ -33,55 +47,68 @@ static void xdr_decode_AFSFetchStatus(const __be32 **_bp,
 		DST = x;			\
 	} while (0)
 
-	vnode->status.if_version = ntohl(*bp++);
-	EXTRACT(vnode->status.type);
-	vnode->status.nlink = ntohl(*bp++);
-	EXTRACT(vnode->status.size);
+	status->if_version = ntohl(*bp++);
+	EXTRACT(status->type);
+	EXTRACT(status->nlink);
+	size = ntohl(*bp++);
 	data_version = ntohl(*bp++);
-	EXTRACT(vnode->status.author);
-	EXTRACT(vnode->status.owner);
-	EXTRACT(vnode->status.caller_access); /* call ticket dependent */
-	EXTRACT(vnode->status.anon_access);
-	EXTRACT(vnode->status.mode);
-	vnode->status.parent.vid = vnode->fid.vid;
-	EXTRACT(vnode->status.parent.vnode);
-	EXTRACT(vnode->status.parent.unique);
+	EXTRACT(status->author);
+	EXTRACT(status->owner);
+	EXTRACT(status->caller_access); /* call ticket dependent */
+	EXTRACT(status->anon_access);
+	EXTRACT(status->mode);
+	EXTRACT(status->parent.vnode);
+	EXTRACT(status->parent.unique);
 	bp++; /* seg size */
-	vnode->status.mtime_client = ntohl(*bp++);
-	vnode->status.mtime_server = ntohl(*bp++);
-	bp++; /* group */
+	status->mtime_client = ntohl(*bp++);
+	status->mtime_server = ntohl(*bp++);
+	EXTRACT(status->group);
 	bp++; /* sync counter */
 	data_version |= (u64) ntohl(*bp++) << 32;
-	bp++; /* spare2 */
-	bp++; /* spare3 */
-	bp++; /* spare4 */
+	bp++; /* lock count */
+	size |= (u64) ntohl(*bp++) << 32;
+	bp++; /* spare 4 */
 	*_bp = bp;
 
-	if (changed) {
-		_debug("vnode changed");
-		set_bit(AFS_VNODE_CHANGED, &vnode->flags);
-		vnode->vfs_inode.i_uid		= vnode->status.owner;
-		vnode->vfs_inode.i_size		= vnode->status.size;
-		vnode->vfs_inode.i_version	= vnode->fid.unique;
-
-		vnode->status.mode &= S_IALLUGO;
-		mode = vnode->vfs_inode.i_mode;
-		mode &= ~S_IALLUGO;
-		mode |= vnode->status.mode;
-		vnode->vfs_inode.i_mode = mode;
+	if (size != status->size) {
+		status->size = size;
+		changed |= true;
 	}
+	status->mode &= S_IALLUGO;
 
 	_debug("vnode time %lx, %lx",
-	       vnode->status.mtime_client, vnode->status.mtime_server);
-	vnode->vfs_inode.i_ctime.tv_sec	= vnode->status.mtime_server;
-	vnode->vfs_inode.i_mtime	= vnode->vfs_inode.i_ctime;
-	vnode->vfs_inode.i_atime	= vnode->vfs_inode.i_ctime;
-
-	if (vnode->status.data_version != data_version) {
-		_debug("vnode modified %llx", data_version);
-		vnode->status.data_version = data_version;
-		set_bit(AFS_VNODE_MODIFIED, &vnode->flags);
-		set_bit(AFS_VNODE_ZAP_DATA, &vnode->flags);
+	       status->mtime_client, status->mtime_server);
+
+	if (vnode) {
+		status->parent.vid = vnode->fid.vid;
+		if (changed && !test_bit(AFS_VNODE_UNSET, &vnode->flags)) {
+			_debug("vnode changed");
+			i_size_write(&vnode->vfs_inode, size);
+			vnode->vfs_inode.i_uid = status->owner;
+			vnode->vfs_inode.i_gid = status->group;
+			vnode->vfs_inode.i_version = vnode->fid.unique;
+			vnode->vfs_inode.i_nlink = status->nlink;
+
+			mode = vnode->vfs_inode.i_mode;
+			mode &= ~S_IALLUGO;
+			mode |= status->mode;
+			barrier();
+			vnode->vfs_inode.i_mode = mode;
+		}
+
+		vnode->vfs_inode.i_ctime.tv_sec	= status->mtime_server;
+		vnode->vfs_inode.i_mtime	= vnode->vfs_inode.i_ctime;
+		vnode->vfs_inode.i_atime	= vnode->vfs_inode.i_ctime;
+	}
+
+	if (status->data_version != data_version) {
+		status->data_version = data_version;
+		if (vnode && !test_bit(AFS_VNODE_UNSET, &vnode->flags)) {
+			_debug("vnode modified %llx on {%x:%u}",
+			       data_version, vnode->fid.vid, vnode->fid.vnode);
+			set_bit(AFS_VNODE_MODIFIED, &vnode->flags);
+			set_bit(AFS_VNODE_ZAP_DATA, &vnode->flags);
+		}
 	}
 }
 
@@ -99,6 +126,17 @@ static void xdr_decode_AFSCallBack(const __be32 **_bp, struct afs_vnode *vnode)
 	*_bp = bp;
 }
 
+static void xdr_decode_AFSCallBack_raw(const __be32 **_bp,
+				       struct afs_callback *cb)
+{
+	const __be32 *bp = *_bp;
+
+	cb->version	= ntohl(*bp++);
+	cb->expiry	= ntohl(*bp++);
+	cb->type	= ntohl(*bp++);
+	*_bp = bp;
+}
+
 /*
  * decode an AFSVolSync block
  */
@@ -122,6 +160,7 @@ static void xdr_decode_AFSVolSync(const __be32 **_bp,
 static int afs_deliver_fs_fetch_status(struct afs_call *call,
 				       struct sk_buff *skb, bool last)
 {
+	struct afs_vnode *vnode = call->reply;
 	const __be32 *bp;
 
 	_enter(",,%u", last);
@@ -135,8 +174,8 @@ static int afs_deliver_fs_fetch_status(struct afs_call *call,
 
 	/* unmarshall the reply once we've received all of it */
 	bp = call->buffer;
-	xdr_decode_AFSFetchStatus(&bp, call->reply);
-	xdr_decode_AFSCallBack(&bp, call->reply);
+	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode);
+	xdr_decode_AFSCallBack(&bp, vnode);
 	if (call->reply2)
 		xdr_decode_AFSVolSync(&bp, call->reply2);
 
@@ -166,9 +205,10 @@ int afs_fs_fetch_file_status(struct afs_server *server,
 	struct afs_call *call;
 	__be32 *bp;
 
-	_enter(",%x,,,", key_serial(key));
+	_enter(",%x,{%x:%d},,",
+	       key_serial(key), vnode->fid.vid, vnode->fid.vnode);
 
-	call = afs_alloc_flat_call(&afs_RXFSFetchStatus, 16, 120);
+	call = afs_alloc_flat_call(&afs_RXFSFetchStatus, 16, (21 + 3 + 6) * 4);
 	if (!call)
 		return -ENOMEM;
 
@@ -194,6 +234,7 @@ int afs_fs_fetch_file_status(struct afs_server *server,
 static int afs_deliver_fs_fetch_data(struct afs_call *call,
 				     struct sk_buff *skb, bool last)
 {
+	struct afs_vnode *vnode = call->reply;
 	const __be32 *bp;
 	struct page *page;
 	void *buffer;
@@ -248,7 +289,8 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call,
 
 		/* extract the metadata */
 	case 3:
-		ret = afs_extract_data(call, skb, last, call->buffer, 120);
+		ret = afs_extract_data(call, skb, last, call->buffer,
+				       (21 + 3 + 6) * 4);
 		switch (ret) {
 		case 0:		break;
 		case -EAGAIN:	return 0;
@@ -256,8 +298,8 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call,
 		}
 
 		bp = call->buffer;
-		xdr_decode_AFSFetchStatus(&bp, call->reply);
-		xdr_decode_AFSCallBack(&bp, call->reply);
+		xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode);
+		xdr_decode_AFSCallBack(&bp, vnode);
 		if (call->reply2)
 			xdr_decode_AFSVolSync(&bp, call->reply2);
 
@@ -296,7 +338,6 @@ int afs_fs_fetch_data(struct afs_server *server,
 		      struct afs_vnode *vnode,
 		      off_t offset, size_t length,
 		      struct page *buffer,
-		      struct afs_volsync *volsync,
 		      const struct afs_wait_mode *wait_mode)
 {
 	struct afs_call *call;
@@ -304,13 +345,13 @@ int afs_fs_fetch_data(struct afs_server *server,
 
 	_enter("");
 
-	call = afs_alloc_flat_call(&afs_RXFSFetchData, 24, 120);
+	call = afs_alloc_flat_call(&afs_RXFSFetchData, 24, (21 + 3 + 6) * 4);
 	if (!call)
 		return -ENOMEM;
 
 	call->key = key;
 	call->reply = vnode;
-	call->reply2 = volsync;
+	call->reply2 = NULL; /* volsync */
 	call->reply3 = buffer;
 	call->service_id = FS_SERVICE;
 	call->port = htons(AFS_FS_PORT);
@@ -411,3 +452,485 @@ int afs_fs_give_up_callbacks(struct afs_server *server,
 
 	return afs_make_call(&server->addr, call, GFP_NOFS, wait_mode);
 }
+
+/*
+ * deliver reply data to an FS.CreateFile or an FS.MakeDir
+ */
+static int afs_deliver_fs_create_vnode(struct afs_call *call,
+				       struct sk_buff *skb, bool last)
+{
+	struct afs_vnode *vnode = call->reply;
+	const __be32 *bp;
+
+	_enter("{%u},{%u},%d", call->unmarshall, skb->len, last);
+
+	afs_transfer_reply(call, skb);
+	if (!last)
+		return 0;
+
+	if (call->reply_size != call->reply_max)
+		return -EBADMSG;
+
+	/* unmarshall the reply once we've received all of it */
+	bp = call->buffer;
+	xdr_decode_AFSFid(&bp, call->reply2);
+	xdr_decode_AFSFetchStatus(&bp, call->reply3, NULL);
+	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode);
+	xdr_decode_AFSCallBack_raw(&bp, call->reply4);
+	/* xdr_decode_AFSVolSync(&bp, call->replyX); */
+
+	_leave(" = 0 [done]");
+	return 0;
+}
+
+/*
+ * FS.CreateFile and FS.MakeDir operation type
+ */
+static const struct afs_call_type afs_RXFSCreateXXXX = {
+	.name		= "FS.CreateXXXX",
+	.deliver	= afs_deliver_fs_create_vnode,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_flat_call_destructor,
+};
+
+/*
+ * create a file or make a directory
+ */
+int afs_fs_create(struct afs_server *server,
+		  struct key *key,
+		  struct afs_vnode *vnode,
+		  const char *name,
+		  umode_t mode,
+		  struct afs_fid *newfid,
+		  struct afs_file_status *newstatus,
+		  struct afs_callback *newcb,
+		  const struct afs_wait_mode *wait_mode)
+{
+	struct afs_call *call;
+	size_t namesz, reqsz, padsz;
+	__be32 *bp;
+
+	_enter("");
+
+	namesz = strlen(name);
+	padsz = (4 - (namesz & 3)) & 3;
+	reqsz = (5 * 4) + namesz + padsz + (6 * 4);
+
+	call = afs_alloc_flat_call(&afs_RXFSCreateXXXX, reqsz,
+				   (3 + 21 + 21 + 3 + 6) * 4);
+	if (!call)
+		return -ENOMEM;
+
+	call->key = key;
+	call->reply = vnode;
+	call->reply2 = newfid;
+	call->reply3 = newstatus;
+	call->reply4 = newcb;
+	call->service_id = FS_SERVICE;
+	call->port = htons(AFS_FS_PORT);
+
+	/* marshall the parameters */
+	bp = call->request;
+	*bp++ = htonl(S_ISDIR(mode) ? FSMAKEDIR : FSCREATEFILE);
+	*bp++ = htonl(vnode->fid.vid);
+	*bp++ = htonl(vnode->fid.vnode);
+	*bp++ = htonl(vnode->fid.unique);
+	*bp++ = htonl(namesz);
+	memcpy(bp, name, namesz);
+	bp = (void *) bp + namesz;
+	if (padsz > 0) {
+		memset(bp, 0, padsz);
+		bp = (void *) bp + padsz;
+	}
+	*bp++ = htonl(AFS_SET_MODE);
+	*bp++ = 0; /* mtime */
+	*bp++ = 0; /* owner */
+	*bp++ = 0; /* group */
+	*bp++ = htonl(mode & S_IALLUGO); /* unix mode */
+	*bp++ = 0; /* segment size */
+
+	return afs_make_call(&server->addr, call, GFP_NOFS, wait_mode);
+}
+
+/*
+ * deliver reply data to an FS.RemoveFile or FS.RemoveDir
+ */
+static int afs_deliver_fs_remove(struct afs_call *call,
+				 struct sk_buff *skb, bool last)
+{
+	struct afs_vnode *vnode = call->reply;
+	const __be32 *bp;
+
+	_enter("{%u},{%u},%d", call->unmarshall, skb->len, last);
+
+	afs_transfer_reply(call, skb);
+	if (!last)
+		return 0;
+
+	if (call->reply_size != call->reply_max)
+		return -EBADMSG;
+
+	/* unmarshall the reply once we've received all of it */
+	bp = call->buffer;
+	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode);
+	/* xdr_decode_AFSVolSync(&bp, call->replyX); */
+
+	_leave(" = 0 [done]");
+	return 0;
+}
+
+/*
+ * FS.RemoveDir/FS.RemoveFile operation type
+ */
+static const struct afs_call_type afs_RXFSRemoveXXXX = {
+	.name		= "FS.RemoveXXXX",
+	.deliver	= afs_deliver_fs_remove,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_flat_call_destructor,
+};
+
+/*
+ * remove a file or directory
+ */
+int afs_fs_remove(struct afs_server *server,
+		  struct key *key,
+		  struct afs_vnode *vnode,
+		  const char *name,
+		  bool isdir,
+		  const struct afs_wait_mode *wait_mode)
+{
+	struct afs_call *call;
+	size_t namesz, reqsz, padsz;
+	__be32 *bp;
+
+	_enter("");
+
+	namesz = strlen(name);
+	padsz = (4 - (namesz & 3)) & 3;
+	reqsz = (5 * 4) + namesz + padsz;
+
+	call = afs_alloc_flat_call(&afs_RXFSRemoveXXXX, reqsz, (21 + 6) * 4);
+	if (!call)
+		return -ENOMEM;
+
+	call->key = key;
+	call->reply = vnode;
+	call->service_id = FS_SERVICE;
+	call->port = htons(AFS_FS_PORT);
+
+	/* marshall the parameters */
+	bp = call->request;
+	*bp++ = htonl(isdir ? FSREMOVEDIR : FSREMOVEFILE);
+	*bp++ = htonl(vnode->fid.vid);
+	*bp++ = htonl(vnode->fid.vnode);
+	*bp++ = htonl(vnode->fid.unique);
+	*bp++ = htonl(namesz);
+	memcpy(bp, name, namesz);
+	bp = (void *) bp + namesz;
+	if (padsz > 0) {
+		memset(bp, 0, padsz);
+		bp = (void *) bp + padsz;
+	}
+
+	return afs_make_call(&server->addr, call, GFP_NOFS, wait_mode);
+}
+
+/*
+ * deliver reply data to an FS.Link
+ */
+static int afs_deliver_fs_link(struct afs_call *call,
+			       struct sk_buff *skb, bool last)
+{
+	struct afs_vnode *dvnode = call->reply, *vnode = call->reply2;
+	const __be32 *bp;
+
+	_enter("{%u},{%u},%d", call->unmarshall, skb->len, last);
+
+	afs_transfer_reply(call, skb);
+	if (!last)
+		return 0;
+
+	if (call->reply_size != call->reply_max)
+		return -EBADMSG;
+
+	/* unmarshall the reply once we've received all of it */
+	bp = call->buffer;
+	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode);
+	xdr_decode_AFSFetchStatus(&bp, &dvnode->status, dvnode);
+	/* xdr_decode_AFSVolSync(&bp, call->replyX); */
+
+	_leave(" = 0 [done]");
+	return 0;
+}
+
+/*
+ * FS.Link operation type
+ */
+static const struct afs_call_type afs_RXFSLink = {
+	.name		= "FS.Link",
+	.deliver	= afs_deliver_fs_link,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_flat_call_destructor,
+};
+
+/*
+ * make a hard link
+ */
+int afs_fs_link(struct afs_server *server,
+		struct key *key,
+		struct afs_vnode *dvnode,
+		struct afs_vnode *vnode,
+		const char *name,
+		const struct afs_wait_mode *wait_mode)
+{
+	struct afs_call *call;
+	size_t namesz, reqsz, padsz;
+	__be32 *bp;
+
+	_enter("");
+
+	namesz = strlen(name);
+	padsz = (4 - (namesz & 3)) & 3;
+	reqsz = (5 * 4) + namesz + padsz + (3 * 4);
+
+	call = afs_alloc_flat_call(&afs_RXFSLink, reqsz, (21 + 21 + 6) * 4);
+	if (!call)
+		return -ENOMEM;
+
+	call->key = key;
+	call->reply = dvnode;
+	call->reply2 = vnode;
+	call->service_id = FS_SERVICE;
+	call->port = htons(AFS_FS_PORT);
+
+	/* marshall the parameters */
+	bp = call->request;
+	*bp++ = htonl(FSLINK);
+	*bp++ = htonl(dvnode->fid.vid);
+	*bp++ = htonl(dvnode->fid.vnode);
+	*bp++ = htonl(dvnode->fid.unique);
+	*bp++ = htonl(namesz);
+	memcpy(bp, name, namesz);
+	bp = (void *) bp + namesz;
+	if (padsz > 0) {
+		memset(bp, 0, padsz);
+		bp = (void *) bp + padsz;
+	}
+	*bp++ = htonl(vnode->fid.vid);
+	*bp++ = htonl(vnode->fid.vnode);
+	*bp++ = htonl(vnode->fid.unique);
+
+	return afs_make_call(&server->addr, call, GFP_NOFS, wait_mode);
+}
+
+/*
+ * deliver reply data to an FS.Symlink
+ */
+static int afs_deliver_fs_symlink(struct afs_call *call,
+				  struct sk_buff *skb, bool last)
+{
+	struct afs_vnode *vnode = call->reply;
+	const __be32 *bp;
+
+	_enter("{%u},{%u},%d", call->unmarshall, skb->len, last);
+
+	afs_transfer_reply(call, skb);
+	if (!last)
+		return 0;
+
+	if (call->reply_size != call->reply_max)
+		return -EBADMSG;
+
+	/* unmarshall the reply once we've received all of it */
+	bp = call->buffer;
+	xdr_decode_AFSFid(&bp, call->reply2);
+	xdr_decode_AFSFetchStatus(&bp, call->reply3, NULL);
+	xdr_decode_AFSFetchStatus(&bp, &vnode->status, vnode);
+	/* xdr_decode_AFSVolSync(&bp, call->replyX); */
+
+	_leave(" = 0 [done]");
+	return 0;
+}
+
+/*
+ * FS.Symlink operation type
+ */
+static const struct afs_call_type afs_RXFSSymlink = {
+	.name		= "FS.Symlink",
+	.deliver	= afs_deliver_fs_symlink,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_flat_call_destructor,
+};
+
+/*
+ * create a symbolic link
+ */
+int afs_fs_symlink(struct afs_server *server,
+		   struct key *key,
+		   struct afs_vnode *vnode,
+		   const char *name,
+		   const char *contents,
+		   struct afs_fid *newfid,
+		   struct afs_file_status *newstatus,
+		   const struct afs_wait_mode *wait_mode)
+{
+	struct afs_call *call;
+	size_t namesz, reqsz, padsz, c_namesz, c_padsz;
+	__be32 *bp;
+
+	_enter("");
+
+	namesz = strlen(name);
+	padsz = (4 - (namesz & 3)) & 3;
+
+	c_namesz = strlen(contents);
+	c_padsz = (4 - (c_namesz & 3)) & 3;
+
+	reqsz = (6 * 4) + namesz + padsz + c_namesz + c_padsz + (6 * 4);
+
+	call = afs_alloc_flat_call(&afs_RXFSSymlink, reqsz,
+				   (3 + 21 + 21 + 6) * 4);
+	if (!call)
+		return -ENOMEM;
+
+	call->key = key;
+	call->reply = vnode;
+	call->reply2 = newfid;
+	call->reply3 = newstatus;
+	call->service_id = FS_SERVICE;
+	call->port = htons(AFS_FS_PORT);
+
+	/* marshall the parameters */
+	bp = call->request;
+	*bp++ = htonl(FSSYMLINK);
+	*bp++ = htonl(vnode->fid.vid);
+	*bp++ = htonl(vnode->fid.vnode);
+	*bp++ = htonl(vnode->fid.unique);
+	*bp++ = htonl(namesz);
+	memcpy(bp, name, namesz);
+	bp = (void *) bp + namesz;
+	if (padsz > 0) {
+		memset(bp, 0, padsz);
+		bp = (void *) bp + padsz;
+	}
+	*bp++ = htonl(c_namesz);
+	memcpy(bp, contents, c_namesz);
+	bp = (void *) bp + c_namesz;
+	if (c_padsz > 0) {
+		memset(bp, 0, c_padsz);
+		bp = (void *) bp + c_padsz;
+	}
+	*bp++ = htonl(AFS_SET_MODE);
+	*bp++ = 0; /* mtime */
+	*bp++ = 0; /* owner */
+	*bp++ = 0; /* group */
+	*bp++ = htonl(S_IRWXUGO); /* unix mode */
+	*bp++ = 0; /* segment size */
+
+	return afs_make_call(&server->addr, call, GFP_NOFS, wait_mode);
+}
+
+/*
+ * deliver reply data to an FS.Rename
+ */
+static int afs_deliver_fs_rename(struct afs_call *call,
+				  struct sk_buff *skb, bool last)
+{
+	struct afs_vnode *orig_dvnode = call->reply, *new_dvnode = call->reply2;
+	const __be32 *bp;
+
+	_enter("{%u},{%u},%d", call->unmarshall, skb->len, last);
+
+	afs_transfer_reply(call, skb);
+	if (!last)
+		return 0;
+
+	if (call->reply_size != call->reply_max)
+		return -EBADMSG;
+
+	/* unmarshall the reply once we've received all of it */
+	bp = call->buffer;
+	xdr_decode_AFSFetchStatus(&bp, &orig_dvnode->status, orig_dvnode);
+	if (new_dvnode != orig_dvnode)
+		xdr_decode_AFSFetchStatus(&bp, &new_dvnode->status, new_dvnode);
+	/* xdr_decode_AFSVolSync(&bp, call->replyX); */
+
+	_leave(" = 0 [done]");
+	return 0;
+}
+
+/*
+ * FS.Rename operation type
+ */
+static const struct afs_call_type afs_RXFSRename = {
+	.name		= "FS.Rename",
+	.deliver	= afs_deliver_fs_rename,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_flat_call_destructor,
+};
+
+/*
+ * create a symbolic link
+ */
+int afs_fs_rename(struct afs_server *server,
+		  struct key *key,
+		  struct afs_vnode *orig_dvnode,
+		  const char *orig_name,
+		  struct afs_vnode *new_dvnode,
+		  const char *new_name,
+		  const struct afs_wait_mode *wait_mode)
+{
+	struct afs_call *call;
+	size_t reqsz, o_namesz, o_padsz, n_namesz, n_padsz;
+	__be32 *bp;
+
+	_enter("");
+
+	o_namesz = strlen(orig_name);
+	o_padsz = (4 - (o_namesz & 3)) & 3;
+
+	n_namesz = strlen(new_name);
+	n_padsz = (4 - (n_namesz & 3)) & 3;
+
+	reqsz = (4 * 4) +
+		4 + o_namesz + o_padsz +
+		(3 * 4) +
+		4 + n_namesz + n_padsz;
+
+	call = afs_alloc_flat_call(&afs_RXFSRename, reqsz, (21 + 21 + 6) * 4);
+	if (!call)
+		return -ENOMEM;
+
+	call->key = key;
+	call->reply = orig_dvnode;
+	call->reply2 = new_dvnode;
+	call->service_id = FS_SERVICE;
+	call->port = htons(AFS_FS_PORT);
+
+	/* marshall the parameters */
+	bp = call->request;
+	*bp++ = htonl(FSRENAME);
+	*bp++ = htonl(orig_dvnode->fid.vid);
+	*bp++ = htonl(orig_dvnode->fid.vnode);
+	*bp++ = htonl(orig_dvnode->fid.unique);
+	*bp++ = htonl(o_namesz);
+	memcpy(bp, orig_name, o_namesz);
+	bp = (void *) bp + o_namesz;
+	if (o_padsz > 0) {
+		memset(bp, 0, o_padsz);
+		bp = (void *) bp + o_padsz;
+	}
+
+	*bp++ = htonl(new_dvnode->fid.vid);
+	*bp++ = htonl(new_dvnode->fid.vnode);
+	*bp++ = htonl(new_dvnode->fid.unique);
+	*bp++ = htonl(n_namesz);
+	memcpy(bp, new_name, n_namesz);
+	bp = (void *) bp + n_namesz;
+	if (n_padsz > 0) {
+		memset(bp, 0, n_padsz);
+		bp = (void *) bp + n_padsz;
+	}
+
+	return afs_make_call(&server->addr, call, GFP_NOFS, wait_mode);
+}
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 2273362..56ca858 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -33,7 +33,7 @@ static int afs_inode_map_status(struct afs_vnode *vnode, struct key *key)
 {
 	struct inode *inode = AFS_VNODE_TO_I(vnode);
 
-	_debug("FS: ft=%d lk=%d sz=%Zu ver=%Lu mod=%hu",
+	_debug("FS: ft=%d lk=%d sz=%llu ver=%Lu mod=%hu",
 	       vnode->status.type,
 	       vnode->status.nlink,
 	       vnode->status.size,
@@ -115,8 +115,9 @@ static int afs_iget5_set(struct inode *inode, void *opaque)
 /*
  * inode retrieval
  */
-inline struct inode *afs_iget(struct super_block *sb, struct key *key,
-			      struct afs_fid *fid)
+struct inode *afs_iget(struct super_block *sb, struct key *key,
+		       struct afs_fid *fid, struct afs_file_status *status,
+		       struct afs_callback *cb)
 {
 	struct afs_iget_data data = { .fid = *fid };
 	struct afs_super_info *as;
@@ -156,16 +157,37 @@ inline struct inode *afs_iget(struct super_block *sb, struct key *key,
 			       &vnode->cache);
 #endif
 
-	/* okay... it's a new inode */
-	set_bit(AFS_VNODE_CB_BROKEN, &vnode->flags);
-	ret = afs_vnode_fetch_status(vnode, NULL, key);
-	if (ret < 0)
-		goto bad_inode;
+	if (!status) {
+		/* it's a remotely extant inode */
+		set_bit(AFS_VNODE_CB_BROKEN, &vnode->flags);
+		ret = afs_vnode_fetch_status(vnode, NULL, key);
+		if (ret < 0)
+			goto bad_inode;
+	} else {
+		/* it's an inode we just created */
+		memcpy(&vnode->status, status, sizeof(vnode->status));
+
+		if (!cb) {
+			/* it's a symlink we just created (the fileserver
+			 * didn't give us a callback) */
+			vnode->cb_version = 0;
+			vnode->cb_expiry = 0;
+			vnode->cb_type = 0;
+			vnode->cb_expires = get_seconds();
+		} else {
+			vnode->cb_version = cb->version;
+			vnode->cb_expiry = cb->expiry;
+			vnode->cb_type = cb->type;
+			vnode->cb_expires = vnode->cb_expiry + get_seconds();
+		}
+	}
+
 	ret = afs_inode_map_status(vnode, key);
 	if (ret < 0)
 		goto bad_inode;
 
 	/* success */
+	clear_bit(AFS_VNODE_UNSET, &vnode->flags);
 	inode->i_flags |= S_NOATIME;
 	unlock_new_inode(inode);
 	_leave(" = %p [CB { v=%u t=%u }]", inode, vnode->cb_version, vnode->cb_type);
@@ -182,6 +204,78 @@ bad_inode:
 }
 
 /*
+ * validate a vnode/inode
+ * - there are several things we need to check
+ *   - parent dir data changes (rm, rmdir, rename, mkdir, create, link,
+ *     symlink)
+ *   - parent dir metadata changed (security changes)
+ *   - dentry data changed (write, truncate)
+ *   - dentry metadata changed (security changes)
+ */
+int afs_validate(struct afs_vnode *vnode, struct key *key)
+{
+	int ret;
+
+	_enter("{v={%x:%u} fl=%lx},%x",
+	       vnode->fid.vid, vnode->fid.vnode, vnode->flags,
+	       key_serial(key));
+
+	if (vnode->cb_promised &&
+	    !test_bit(AFS_VNODE_CB_BROKEN, &vnode->flags) &&
+	    !test_bit(AFS_VNODE_MODIFIED, &vnode->flags) &&
+	    !test_bit(AFS_VNODE_ZAP_DATA, &vnode->flags)) {
+		if (vnode->cb_expires < get_seconds() + 10) {
+			_debug("callback expired");
+			set_bit(AFS_VNODE_CB_BROKEN, &vnode->flags);
+		} else {
+			goto valid;
+		}
+	}
+
+	if (test_bit(AFS_VNODE_DELETED, &vnode->flags))
+		goto valid;
+
+	mutex_lock(&vnode->validate_lock);
+
+	/* if the promise has expired, we need to check the server again to get
+	 * a new promise - note that if the (parent) directory's metadata was
+	 * changed then the security may be different and we may no longer have
+	 * access */
+	if (!vnode->cb_promised ||
+	    test_bit(AFS_VNODE_CB_BROKEN, &vnode->flags)) {
+		_debug("not promised");
+		ret = afs_vnode_fetch_status(vnode, NULL, key);
+		if (ret < 0)
+			goto error_unlock;
+		_debug("new promise [fl=%lx]", vnode->flags);
+	}
+
+	if (test_bit(AFS_VNODE_DELETED, &vnode->flags)) {
+		_debug("file already deleted");
+		ret = -ESTALE;
+		goto error_unlock;
+	}
+
+	/* if the vnode's data version number changed then its contents are
+	 * different */
+	if (test_and_clear_bit(AFS_VNODE_ZAP_DATA, &vnode->flags)) {
+		_debug("zap data {%x:%d}", vnode->fid.vid, vnode->fid.vnode);
+		invalidate_remote_inode(&vnode->vfs_inode);
+	}
+
+	clear_bit(AFS_VNODE_MODIFIED, &vnode->flags);
+	mutex_unlock(&vnode->validate_lock);
+valid:
+	_leave(" = 0");
+	return 0;
+
+error_unlock:
+	mutex_unlock(&vnode->validate_lock);
+	_leave(" = %d", ret);
+	return ret;
+}
+
+/*
  * read the attributes of an inode
  */
 int afs_inode_getattr(struct vfsmount *mnt, struct dentry *dentry,
@@ -207,9 +301,10 @@ void afs_clear_inode(struct inode *inode)
 
 	vnode = AFS_FS_I(inode);
 
-	_enter("ino=%lu { vn=%08x v=%u x=%u t=%u }",
-	       inode->i_ino,
+	_enter("{%x:%d.%d} v=%u x=%u t=%u }",
+	       vnode->fid.vid,
 	       vnode->fid.vnode,
+	       vnode->fid.unique,
 	       vnode->cb_version,
 	       vnode->cb_expiry,
 	       vnode->cb_type);
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 9107d3f..e6d352d 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -80,6 +80,7 @@ struct afs_call {
 	void			*reply;		/* reply buffer (first part) */
 	void			*reply2;	/* reply buffer (second part) */
 	void			*reply3;	/* reply buffer (third part) */
+	void			*reply4;	/* reply buffer (fourth part) */
 	enum {					/* call state */
 		AFS_CALL_REQUESTING,	/* request is being sent for outgoing call */
 		AFS_CALL_AWAIT_REPLY,	/* awaiting reply to outgoing call */
@@ -300,19 +301,18 @@ struct afs_vnode {
 #endif
 	struct afs_permits	*permits;	/* cache of permits so far obtained */
 	struct mutex		permits_lock;	/* lock for altering permits list */
+	struct mutex		validate_lock;	/* lock for validating this vnode */
 	wait_queue_head_t	update_waitq;	/* status fetch waitqueue */
-	unsigned		update_cnt;	/* number of outstanding ops that will update the
+	int			update_cnt;	/* number of outstanding ops that will update the
 						 * status */
 	spinlock_t		lock;		/* waitqueue/flags lock */
 	unsigned long		flags;
 #define AFS_VNODE_CB_BROKEN	0		/* set if vnode's callback was broken */
-#define AFS_VNODE_CHANGED	1		/* set if vnode's metadata changed */
+#define AFS_VNODE_UNSET		1		/* set if vnode attributes not yet set */
 #define AFS_VNODE_MODIFIED	2		/* set if vnode's data modified */
 #define AFS_VNODE_ZAP_DATA	3		/* set if vnode's data should be invalidated */
 #define AFS_VNODE_DELETED	4		/* set if vnode deleted on server */
 #define AFS_VNODE_MOUNTPOINT	5		/* set if vnode is a mountpoint symlink */
-#define AFS_VNODE_DIR_CHANGED	6		/* set if vnode's parent dir metadata changed */
-#define AFS_VNODE_DIR_MODIFIED	7		/* set if vnode's parent dir data modified */
 
 	long			acl_order;	/* ACL check count (callback break count) */
 
@@ -320,7 +320,6 @@ struct afs_vnode {
 	struct rb_node		server_rb;	/* link in server->fs_vnodes */
 	struct rb_node		cb_promise;	/* link in server->cb_promises */
 	struct work_struct	cb_broken_work;	/* work to be done on callback break */
-	struct mutex		cb_broken_lock;	/* lock against multiple attempts to fix break */
 	time_t			cb_expires;	/* time at which callback expires */
 	time_t			cb_expires_at;	/* time used to order cb_promise */
 	unsigned		cb_version;	/* callback version */
@@ -388,6 +387,7 @@ extern void afs_init_callback_state(struct afs_server *);
 extern void afs_broken_callback_work(struct work_struct *);
 extern void afs_break_callbacks(struct afs_server *, size_t,
 				struct afs_callback[]);
+extern void afs_discard_callback_on_delete(struct afs_vnode *);
 extern void afs_give_up_callback(struct afs_vnode *);
 extern void afs_dispatch_give_up_callbacks(struct work_struct *);
 extern void afs_flush_callback_breaks(struct afs_server *);
@@ -448,14 +448,34 @@ extern int afs_fs_give_up_callbacks(struct afs_server *,
 				    const struct afs_wait_mode *);
 extern int afs_fs_fetch_data(struct afs_server *, struct key *,
 			     struct afs_vnode *, off_t, size_t, struct page *,
-			     struct afs_volsync *,
 			     const struct afs_wait_mode *);
+extern int afs_fs_create(struct afs_server *, struct key *,
+			 struct afs_vnode *, const char *, umode_t,
+			 struct afs_fid *, struct afs_file_status *,
+			 struct afs_callback *,
+			 const struct afs_wait_mode *);
+extern int afs_fs_remove(struct afs_server *, struct key *,
+			 struct afs_vnode *, const char *, bool,
+			 const struct afs_wait_mode *);
+extern int afs_fs_link(struct afs_server *, struct key *, struct afs_vnode *,
+		       struct afs_vnode *, const char *,
+		       const struct afs_wait_mode *);
+extern int afs_fs_symlink(struct afs_server *, struct key *,
+			  struct afs_vnode *, const char *, const char *,
+			  struct afs_fid *, struct afs_file_status *,
+			  const struct afs_wait_mode *);
+extern int afs_fs_rename(struct afs_server *, struct key *,
+			 struct afs_vnode *, const char *,
+			 struct afs_vnode *, const char *,
+			 const struct afs_wait_mode *);
 
 /*
  * inode.c
  */
 extern struct inode *afs_iget(struct super_block *, struct key *,
-			      struct afs_fid *);
+			      struct afs_fid *, struct afs_file_status *,
+			      struct afs_callback *);
+extern int afs_validate(struct afs_vnode *, struct key *);
 extern int afs_inode_getattr(struct vfsmount *, struct dentry *,
 			     struct kstat *);
 extern void afs_zap_permits(struct rcu_head *);
@@ -522,7 +542,11 @@ extern int afs_permission(struct inode *, int, struct nameidata *);
  */
 extern spinlock_t afs_server_peer_lock;
 
-#define afs_get_server(S) do { atomic_inc(&(S)->usage); } while(0)
+#define afs_get_server(S)					\
+do {								\
+	_debug("GET SERVER %d", atomic_read(&(S)->usage));	\
+	atomic_inc(&(S)->usage);				\
+} while(0)
 
 extern struct afs_server *afs_lookup_server(struct afs_cell *,
 					    const struct in_addr *);
@@ -588,10 +612,24 @@ static inline struct inode *AFS_VNODE_TO_I(struct afs_vnode *vnode)
 	return &vnode->vfs_inode;
 }
 
+extern void afs_vnode_finalise_status_update(struct afs_vnode *,
+					     struct afs_server *);
 extern int afs_vnode_fetch_status(struct afs_vnode *, struct afs_vnode *,
 				  struct key *);
 extern int afs_vnode_fetch_data(struct afs_vnode *, struct key *,
 				off_t, size_t, struct page *);
+extern int afs_vnode_create(struct afs_vnode *, struct key *, const char *,
+			    umode_t, struct afs_fid *, struct afs_file_status *,
+			    struct afs_callback *, struct afs_server **);
+extern int afs_vnode_remove(struct afs_vnode *, struct key *, const char *,
+			    bool);
+extern int afs_vnode_link(struct afs_vnode *, struct afs_vnode *, struct key *,
+			  const char *);
+extern int afs_vnode_symlink(struct afs_vnode *, struct key *, const char *,
+			     const char *, struct afs_fid *,
+			     struct afs_file_status *, struct afs_server **);
+extern int afs_vnode_rename(struct afs_vnode *, struct afs_vnode *,
+			    struct key *, const char *, const char *);
 
 /*
  * volume.c
diff --git a/fs/afs/misc.c b/fs/afs/misc.c
index ac0f373..ab79dd1 100644
--- a/fs/afs/misc.c
+++ b/fs/afs/misc.c
@@ -22,6 +22,7 @@ int afs_abort_to_error(u32 abort_code)
 {
 	switch (abort_code) {
 	case 13:		return -EACCES;
+	case 30:		return -EROFS;
 	case VSALVAGE:		return -EIO;
 	case VNOVNODE:		return -ENOENT;
 	case VNOVOL:		return -ENOMEDIUM;
@@ -33,6 +34,24 @@ int afs_abort_to_error(u32 abort_code)
 	case VOVERQUOTA:	return -EDQUOT;
 	case VBUSY:		return -EBUSY;
 	case VMOVED:		return -ENXIO;
-	default:		return -EIO;
+	case 0x2f6df0c:		return -EACCES;
+	case 0x2f6df0f:		return -EBUSY;
+	case 0x2f6df10:		return -EEXIST;
+	case 0x2f6df11:		return -EXDEV;
+	case 0x2f6df13:		return -ENOTDIR;
+	case 0x2f6df14:		return -EISDIR;
+	case 0x2f6df15:		return -EINVAL;
+	case 0x2f6df1a:		return -EFBIG;
+	case 0x2f6df1b:		return -ENOSPC;
+	case 0x2f6df1d:		return -EROFS;
+	case 0x2f6df1e:		return -EMLINK;
+	case 0x2f6df20:		return -EDOM;
+	case 0x2f6df21:		return -ERANGE;
+	case 0x2f6df22:		return -EDEADLK;
+	case 0x2f6df23:		return -ENAMETOOLONG;
+	case 0x2f6df24:		return -ENOLCK;
+	case 0x2f6df26:		return -ENOTEMPTY;
+	case 0x2f6df78:		return -EDQUOT;
+	default:		return -EREMOTEIO;
 	}
 }
diff --git a/fs/afs/security.c b/fs/afs/security.c
index cbdd7f7..f9f424d 100644
--- a/fs/afs/security.c
+++ b/fs/afs/security.c
@@ -92,7 +92,7 @@ static struct afs_vnode *afs_get_auth_inode(struct afs_vnode *vnode,
 		ASSERT(auth_inode != NULL);
 	} else {
 		auth_inode = afs_iget(vnode->vfs_inode.i_sb, key,
-				      &vnode->status.parent);
+				      &vnode->status.parent, NULL, NULL);
 		if (IS_ERR(auth_inode))
 			return ERR_PTR(PTR_ERR(auth_inode));
 	}
@@ -288,7 +288,8 @@ int afs_permission(struct inode *inode, int mask, struct nameidata *nd)
 	struct key *key;
 	int ret;
 
-	_enter("{%x:%x},%x,", vnode->fid.vid, vnode->fid.vnode, mask);
+	_enter("{{%x:%x},%lx},%x,",
+	       vnode->fid.vid, vnode->fid.vnode, vnode->flags, mask);
 
 	key = afs_request_key(vnode->volume->cell);
 	if (IS_ERR(key)) {
@@ -296,13 +297,19 @@ int afs_permission(struct inode *inode, int mask, struct nameidata *nd)
 		return PTR_ERR(key);
 	}
 
+	/* if the promise has expired, we need to check the server again */
+	if (!vnode->cb_promised) {
+		_debug("not promised");
+		ret = afs_vnode_fetch_status(vnode, NULL, key);
+		if (ret < 0)
+			goto error;
+		_debug("new promise [fl=%lx]", vnode->flags);
+	}
+
 	/* check the permits to see if we've got one yet */
 	ret = afs_check_permit(vnode, key, &access);
-	if (ret < 0) {
-		key_put(key);
-		_leave(" = %d [check]", ret);
-		return ret;
-	}
+	if (ret < 0)
+		goto error;
 
 	/* interpret the access mask */
 	_debug("REQ %x ACC %x on %s",
@@ -336,10 +343,14 @@ int afs_permission(struct inode *inode, int mask, struct nameidata *nd)
 	}
 
 	key_put(key);
-	return generic_permission(inode, mask, NULL);
+	ret = generic_permission(inode, mask, NULL);
+	_leave(" = %d", ret);
+	return ret;
 
 permission_denied:
+	ret = -EACCES;
+error:
 	key_put(key);
-	_leave(" = -EACCES");
-	return -EACCES;
+	_leave(" = %d", ret);
+	return ret;
 }
diff --git a/fs/afs/server.c b/fs/afs/server.c
index d44bba3..1654e90 100644
--- a/fs/afs/server.c
+++ b/fs/afs/server.c
@@ -223,6 +223,8 @@ void afs_put_server(struct afs_server *server)
 
 	_enter("%p{%d}", server, atomic_read(&server->usage));
 
+	_debug("PUT SERVER %d", atomic_read(&server->usage));
+
 	ASSERTCMP(atomic_read(&server->usage), >, 0);
 
 	if (likely(!atomic_dec_and_test(&server->usage))) {
diff --git a/fs/afs/super.c b/fs/afs/super.c
index 17d7092..441f583 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -331,7 +331,7 @@ static int afs_fill_super(struct super_block *sb, void *data)
 	fid.vid		= as->volume->vid;
 	fid.vnode	= 1;
 	fid.unique	= 1;
-	inode = afs_iget(sb, params->key, &fid);
+	inode = afs_iget(sb, params->key, &fid, NULL, NULL);
 	if (IS_ERR(inode))
 		goto error_inode;
 
@@ -473,9 +473,9 @@ static void afs_i_init_once(void *_vnode, struct kmem_cache *cachep,
 		inode_init_once(&vnode->vfs_inode);
 		init_waitqueue_head(&vnode->update_waitq);
 		mutex_init(&vnode->permits_lock);
+		mutex_init(&vnode->validate_lock);
 		spin_lock_init(&vnode->lock);
 		INIT_WORK(&vnode->cb_broken_work, afs_broken_callback_work);
-		mutex_init(&vnode->cb_broken_lock);
 	}
 }
 
@@ -497,7 +497,7 @@ static struct inode *afs_alloc_inode(struct super_block *sb)
 
 	vnode->volume		= NULL;
 	vnode->update_cnt	= 0;
-	vnode->flags		= 0;
+	vnode->flags		= 1 << AFS_VNODE_UNSET;
 	vnode->cb_promised	= false;
 
 	return &vnode->vfs_inode;
diff --git a/fs/afs/vnode.c b/fs/afs/vnode.c
index 1600976..a1904ab 100644
--- a/fs/afs/vnode.c
+++ b/fs/afs/vnode.c
@@ -30,7 +30,7 @@ static noinline bool dump_tree_aux(struct rb_node *node, struct rb_node *parent,
 		bad = dump_tree_aux(node->rb_left, node, depth + 2, '/');
 
 	vnode = rb_entry(node, struct afs_vnode, cb_promise);
-	kdebug("%c %*.*s%c%p {%d}",
+	_debug("%c %*.*s%c%p {%d}",
 	       rb_is_red(node) ? 'R' : 'B',
 	       depth, depth, "", lr,
 	       vnode, vnode->cb_expires_at);
@@ -47,7 +47,7 @@ static noinline bool dump_tree_aux(struct rb_node *node, struct rb_node *parent,
 
 static noinline void dump_tree(const char *name, struct afs_server *server)
 {
-	kenter("%s", name);
+	_enter("%s", name);
 	if (dump_tree_aux(server->cb_promises.rb_node, NULL, 0, '-'))
 		BUG();
 }
@@ -187,47 +187,61 @@ static void afs_vnode_deleted_remotely(struct afs_vnode *vnode)
 		spin_unlock(&server->cb_lock);
 	}
 
+	spin_lock(&vnode->server->fs_lock);
+	rb_erase(&vnode->server_rb, &vnode->server->fs_vnodes);
+	spin_unlock(&vnode->server->fs_lock);
+
+	vnode->server = NULL;
 	afs_put_server(server);
 }
 
 /*
- * finish off updating the recorded status of a file
+ * finish off updating the recorded status of a file after a successful
+ * operation completion
  * - starts callback expiry timer
  * - adds to server's callback list
  */
-static void afs_vnode_finalise_status_update(struct afs_vnode *vnode,
-					     struct afs_server *server,
-					     int ret)
+void afs_vnode_finalise_status_update(struct afs_vnode *vnode,
+				      struct afs_server *server)
 {
 	struct afs_server *oldserver = NULL;
 
-	_enter("%p,%p,%d", vnode, server, ret);
+	_enter("%p,%p", vnode, server);
+
+	spin_lock(&vnode->lock);
+	clear_bit(AFS_VNODE_CB_BROKEN, &vnode->flags);
+	afs_vnode_note_promise(vnode, server);
+	vnode->update_cnt--;
+	ASSERTCMP(vnode->update_cnt, >=, 0);
+	spin_unlock(&vnode->lock);
+
+	wake_up_all(&vnode->update_waitq);
+	afs_put_server(oldserver);
+	_leave("");
+}
+
+/*
+ * finish off updating the recorded status of a file after an operation failed
+ */
+static void afs_vnode_status_update_failed(struct afs_vnode *vnode, int ret)
+{
+	_enter("%p,%d", vnode, ret);
 
 	spin_lock(&vnode->lock);
 
 	clear_bit(AFS_VNODE_CB_BROKEN, &vnode->flags);
 
-	switch (ret) {
-	case 0:
-		afs_vnode_note_promise(vnode, server);
-		break;
-	case -ENOENT:
+	if (ret == -ENOENT) {
 		/* the file was deleted on the server */
 		_debug("got NOENT from server - marking file deleted");
 		afs_vnode_deleted_remotely(vnode);
-		break;
-	default:
-		break;
 	}
 
 	vnode->update_cnt--;
-
+	ASSERTCMP(vnode->update_cnt, >=, 0);
 	spin_unlock(&vnode->lock);
 
 	wake_up_all(&vnode->update_waitq);
-
-	afs_put_server(oldserver);
-
 	_leave("");
 }
 
@@ -275,8 +289,12 @@ int afs_vnode_fetch_status(struct afs_vnode *vnode,
 		return 0;
 	}
 
+	ASSERTCMP(vnode->update_cnt, >=, 0);
+
 	if (vnode->update_cnt > 0) {
 		/* someone else started a fetch */
+		_debug("wait on fetch %d", vnode->update_cnt);
+
 		set_current_state(TASK_UNINTERRUPTIBLE);
 		ASSERT(myself.func != NULL);
 		add_wait_queue(&vnode->update_waitq, &myself);
@@ -325,7 +343,7 @@ get_anyway:
 		/* pick a server to query */
 		server = afs_volume_pick_fileserver(vnode);
 		if (IS_ERR(server))
-			return PTR_ERR(server);
+			goto no_server;
 
 		_debug("USING SERVER: %p{%08x}",
 		       server, ntohl(server->addr.s_addr));
@@ -336,17 +354,34 @@ get_anyway:
 	} while (!afs_volume_release_fileserver(vnode, server, ret));
 
 	/* adjust the flags */
-	if (ret == 0 && auth_vnode)
-		afs_cache_permit(vnode, key, acl_order);
-	afs_vnode_finalise_status_update(vnode, server, ret);
+	if (ret == 0) {
+		_debug("adjust");
+		if (auth_vnode)
+			afs_cache_permit(vnode, key, acl_order);
+		afs_vnode_finalise_status_update(vnode, server);
+		afs_put_server(server);
+	} else {
+		_debug("failed [%d]", ret);
+		afs_vnode_status_update_failed(vnode, ret);
+	}
 
-	_leave(" = %d", ret);
+	ASSERTCMP(vnode->update_cnt, >=, 0);
+
+	_leave(" = %d [cnt %d]", ret, vnode->update_cnt);
 	return ret;
+
+no_server:
+	spin_lock(&vnode->lock);
+	vnode->update_cnt--;
+	ASSERTCMP(vnode->update_cnt, >=, 0);
+	spin_unlock(&vnode->lock);
+	_leave(" = %ld [cnt %d]", PTR_ERR(server), vnode->update_cnt);
+	return PTR_ERR(server);
 }
 
 /*
  * fetch file data from the volume
- * - TODO implement caching and server failover
+ * - TODO implement caching
  */
 int afs_vnode_fetch_data(struct afs_vnode *vnode, struct key *key,
 			 off_t offset, size_t length, struct page *page)
@@ -372,18 +407,349 @@ int afs_vnode_fetch_data(struct afs_vnode *vnode, struct key *key,
 		/* pick a server to query */
 		server = afs_volume_pick_fileserver(vnode);
 		if (IS_ERR(server))
-			return PTR_ERR(server);
+			goto no_server;
 
 		_debug("USING SERVER: %08x\n", ntohl(server->addr.s_addr));
 
 		ret = afs_fs_fetch_data(server, key, vnode, offset, length,
-					page, NULL, &afs_sync_call);
+					page, &afs_sync_call);
 
 	} while (!afs_volume_release_fileserver(vnode, server, ret));
 
 	/* adjust the flags */
-	afs_vnode_finalise_status_update(vnode, server, ret);
+	if (ret == 0) {
+		afs_vnode_finalise_status_update(vnode, server);
+		afs_put_server(server);
+	} else {
+		afs_vnode_status_update_failed(vnode, ret);
+	}
 
 	_leave(" = %d", ret);
 	return ret;
+
+no_server:
+	spin_lock(&vnode->lock);
+	vnode->update_cnt--;
+	ASSERTCMP(vnode->update_cnt, >=, 0);
+	spin_unlock(&vnode->lock);
+	return PTR_ERR(server);
+}
+
+/*
+ * make a file or a directory
+ */
+int afs_vnode_create(struct afs_vnode *vnode, struct key *key,
+		     const char *name, umode_t mode, struct afs_fid *newfid,
+		     struct afs_file_status *newstatus,
+		     struct afs_callback *newcb, struct afs_server **_server)
+{
+	struct afs_server *server;
+	int ret;
+
+	_enter("%s{%u,%u,%u},%x,%s,,",
+	       vnode->volume->vlocation->vldb.name,
+	       vnode->fid.vid,
+	       vnode->fid.vnode,
+	       vnode->fid.unique,
+	       key_serial(key),
+	       name);
+
+	/* this op will fetch the status on the directory we're creating in */
+	spin_lock(&vnode->lock);
+	vnode->update_cnt++;
+	spin_unlock(&vnode->lock);
+
+	do {
+		/* pick a server to query */
+		server = afs_volume_pick_fileserver(vnode);
+		if (IS_ERR(server))
+			goto no_server;
+
+		_debug("USING SERVER: %08x\n", ntohl(server->addr.s_addr));
+
+		ret = afs_fs_create(server, key, vnode, name, mode, newfid,
+				    newstatus, newcb, &afs_sync_call);
+
+	} while (!afs_volume_release_fileserver(vnode, server, ret));
+
+	/* adjust the flags */
+	if (ret == 0) {
+		afs_vnode_finalise_status_update(vnode, server);
+		*_server = server;
+	} else {
+		afs_vnode_status_update_failed(vnode, ret);
+		*_server = NULL;
+	}
+
+	_leave(" = %d [cnt %d]", ret, vnode->update_cnt);
+	return ret;
+
+no_server:
+	spin_lock(&vnode->lock);
+	vnode->update_cnt--;
+	ASSERTCMP(vnode->update_cnt, >=, 0);
+	spin_unlock(&vnode->lock);
+	_leave(" = %ld [cnt %d]", PTR_ERR(server), vnode->update_cnt);
+	return PTR_ERR(server);
+}
+
+/*
+ * remove a file or directory
+ */
+int afs_vnode_remove(struct afs_vnode *vnode, struct key *key, const char *name,
+		     bool isdir)
+{
+	struct afs_server *server;
+	int ret;
+
+	_enter("%s{%u,%u,%u},%x,%s",
+	       vnode->volume->vlocation->vldb.name,
+	       vnode->fid.vid,
+	       vnode->fid.vnode,
+	       vnode->fid.unique,
+	       key_serial(key),
+	       name);
+
+	/* this op will fetch the status on the directory we're removing from */
+	spin_lock(&vnode->lock);
+	vnode->update_cnt++;
+	spin_unlock(&vnode->lock);
+
+	do {
+		/* pick a server to query */
+		server = afs_volume_pick_fileserver(vnode);
+		if (IS_ERR(server))
+			goto no_server;
+
+		_debug("USING SERVER: %08x\n", ntohl(server->addr.s_addr));
+
+		ret = afs_fs_remove(server, key, vnode, name, isdir,
+				    &afs_sync_call);
+
+	} while (!afs_volume_release_fileserver(vnode, server, ret));
+
+	/* adjust the flags */
+	if (ret == 0) {
+		afs_vnode_finalise_status_update(vnode, server);
+		afs_put_server(server);
+	} else {
+		afs_vnode_status_update_failed(vnode, ret);
+	}
+
+	_leave(" = %d [cnt %d]", ret, vnode->update_cnt);
+	return ret;
+
+no_server:
+	spin_lock(&vnode->lock);
+	vnode->update_cnt--;
+	ASSERTCMP(vnode->update_cnt, >=, 0);
+	spin_unlock(&vnode->lock);
+	_leave(" = %ld [cnt %d]", PTR_ERR(server), vnode->update_cnt);
+	return PTR_ERR(server);
+}
+
+/*
+ * create a hard link
+ */
+extern int afs_vnode_link(struct afs_vnode *dvnode, struct afs_vnode *vnode,
+			  struct key *key, const char *name)
+{
+	struct afs_server *server;
+	int ret;
+
+	_enter("%s{%u,%u,%u},%s{%u,%u,%u},%x,%s",
+	       dvnode->volume->vlocation->vldb.name,
+	       dvnode->fid.vid,
+	       dvnode->fid.vnode,
+	       dvnode->fid.unique,
+	       vnode->volume->vlocation->vldb.name,
+	       vnode->fid.vid,
+	       vnode->fid.vnode,
+	       vnode->fid.unique,
+	       key_serial(key),
+	       name);
+
+	/* this op will fetch the status on the directory we're removing from */
+	spin_lock(&vnode->lock);
+	vnode->update_cnt++;
+	spin_unlock(&vnode->lock);
+	spin_lock(&dvnode->lock);
+	dvnode->update_cnt++;
+	spin_unlock(&dvnode->lock);
+
+	do {
+		/* pick a server to query */
+		server = afs_volume_pick_fileserver(dvnode);
+		if (IS_ERR(server))
+			goto no_server;
+
+		_debug("USING SERVER: %08x\n", ntohl(server->addr.s_addr));
+
+		ret = afs_fs_link(server, key, dvnode, vnode, name,
+				  &afs_sync_call);
+
+	} while (!afs_volume_release_fileserver(dvnode, server, ret));
+
+	/* adjust the flags */
+	if (ret == 0) {
+		afs_vnode_finalise_status_update(vnode, server);
+		afs_vnode_finalise_status_update(dvnode, server);
+		afs_put_server(server);
+	} else {
+		afs_vnode_status_update_failed(vnode, ret);
+		afs_vnode_status_update_failed(dvnode, ret);
+	}
+
+	_leave(" = %d [cnt %d]", ret, vnode->update_cnt);
+	return ret;
+
+no_server:
+	spin_lock(&vnode->lock);
+	vnode->update_cnt--;
+	ASSERTCMP(vnode->update_cnt, >=, 0);
+	spin_unlock(&vnode->lock);
+	spin_lock(&dvnode->lock);
+	dvnode->update_cnt--;
+	ASSERTCMP(dvnode->update_cnt, >=, 0);
+	spin_unlock(&dvnode->lock);
+	_leave(" = %ld [cnt %d]", PTR_ERR(server), vnode->update_cnt);
+	return PTR_ERR(server);
+}
+
+/*
+ * create a symbolic link
+ */
+int afs_vnode_symlink(struct afs_vnode *vnode, struct key *key,
+		      const char *name, const char *content,
+		      struct afs_fid *newfid,
+		      struct afs_file_status *newstatus,
+		      struct afs_server **_server)
+{
+	struct afs_server *server;
+	int ret;
+
+	_enter("%s{%u,%u,%u},%x,%s,%s,,,",
+	       vnode->volume->vlocation->vldb.name,
+	       vnode->fid.vid,
+	       vnode->fid.vnode,
+	       vnode->fid.unique,
+	       key_serial(key),
+	       name, content);
+
+	/* this op will fetch the status on the directory we're creating in */
+	spin_lock(&vnode->lock);
+	vnode->update_cnt++;
+	spin_unlock(&vnode->lock);
+
+	do {
+		/* pick a server to query */
+		server = afs_volume_pick_fileserver(vnode);
+		if (IS_ERR(server))
+			goto no_server;
+
+		_debug("USING SERVER: %08x\n", ntohl(server->addr.s_addr));
+
+		ret = afs_fs_symlink(server, key, vnode, name, content,
+				     newfid, newstatus, &afs_sync_call);
+
+	} while (!afs_volume_release_fileserver(vnode, server, ret));
+
+	/* adjust the flags */
+	if (ret == 0) {
+		afs_vnode_finalise_status_update(vnode, server);
+		*_server = server;
+	} else {
+		afs_vnode_status_update_failed(vnode, ret);
+		*_server = NULL;
+	}
+
+	_leave(" = %d [cnt %d]", ret, vnode->update_cnt);
+	return ret;
+
+no_server:
+	spin_lock(&vnode->lock);
+	vnode->update_cnt--;
+	ASSERTCMP(vnode->update_cnt, >=, 0);
+	spin_unlock(&vnode->lock);
+	_leave(" = %ld [cnt %d]", PTR_ERR(server), vnode->update_cnt);
+	return PTR_ERR(server);
+}
+
+/*
+ * rename a file
+ */
+int afs_vnode_rename(struct afs_vnode *orig_dvnode,
+		     struct afs_vnode *new_dvnode,
+		     struct key *key,
+		     const char *orig_name,
+		     const char *new_name)
+{
+	struct afs_server *server;
+	int ret;
+
+	_enter("%s{%u,%u,%u},%s{%u,%u,%u},%x,%s,%s",
+	       orig_dvnode->volume->vlocation->vldb.name,
+	       orig_dvnode->fid.vid,
+	       orig_dvnode->fid.vnode,
+	       orig_dvnode->fid.unique,
+	       new_dvnode->volume->vlocation->vldb.name,
+	       new_dvnode->fid.vid,
+	       new_dvnode->fid.vnode,
+	       new_dvnode->fid.unique,
+	       key_serial(key),
+	       orig_name,
+	       new_name);
+
+	/* this op will fetch the status on both the directories we're dealing
+	 * with */
+	spin_lock(&orig_dvnode->lock);
+	orig_dvnode->update_cnt++;
+	spin_unlock(&orig_dvnode->lock);
+	if (new_dvnode != orig_dvnode) {
+		spin_lock(&new_dvnode->lock);
+		new_dvnode->update_cnt++;
+		spin_unlock(&new_dvnode->lock);
+	}
+
+	do {
+		/* pick a server to query */
+		server = afs_volume_pick_fileserver(orig_dvnode);
+		if (IS_ERR(server))
+			goto no_server;
+
+		_debug("USING SERVER: %08x\n", ntohl(server->addr.s_addr));
+
+		ret = afs_fs_rename(server, key, orig_dvnode, orig_name,
+				    new_dvnode, new_name, &afs_sync_call);
+
+	} while (!afs_volume_release_fileserver(orig_dvnode, server, ret));
+
+	/* adjust the flags */
+	if (ret == 0) {
+		afs_vnode_finalise_status_update(orig_dvnode, server);
+		if (new_dvnode != orig_dvnode)
+			afs_vnode_finalise_status_update(new_dvnode, server);
+		afs_put_server(server);
+	} else {
+		afs_vnode_status_update_failed(orig_dvnode, ret);
+		if (new_dvnode != orig_dvnode)
+			afs_vnode_status_update_failed(new_dvnode, ret);
+	}
+
+	_leave(" = %d [cnt %d]", ret, orig_dvnode->update_cnt);
+	return ret;
+
+no_server:
+	spin_lock(&orig_dvnode->lock);
+	orig_dvnode->update_cnt--;
+	ASSERTCMP(orig_dvnode->update_cnt, >=, 0);
+	spin_unlock(&orig_dvnode->lock);
+	if (new_dvnode != orig_dvnode) {
+		spin_lock(&new_dvnode->lock);
+		new_dvnode->update_cnt--;
+		ASSERTCMP(new_dvnode->update_cnt, >=, 0);
+		spin_unlock(&new_dvnode->lock);
+	}
+	_leave(" = %ld [cnt %d]", PTR_ERR(server), orig_dvnode->update_cnt);
+	return PTR_ERR(server);
 }
diff --git a/fs/afs/volume.c b/fs/afs/volume.c
index 15e1367..dd160ca 100644
--- a/fs/afs/volume.c
+++ b/fs/afs/volume.c
@@ -295,6 +295,7 @@ struct afs_server *afs_volume_pick_fileserver(struct afs_vnode *vnode)
  * - releases the ref on the server struct that was acquired by picking
  * - records result of using a particular server to access a volume
  * - return 0 to try again, 1 if okay or to issue error
+ * - the caller must release the server struct if result was 0
  */
 int afs_volume_release_fileserver(struct afs_vnode *vnode,
 				  struct afs_server *server,
@@ -312,7 +313,8 @@ int afs_volume_release_fileserver(struct afs_vnode *vnode,
 	case 0:
 		server->fs_act_jif = jiffies;
 		server->fs_state = 0;
-		break;
+		_leave("");
+		return 1;
 
 		/* the fileserver denied all knowledge of the volume */
 	case -ENOMEDIUM:
@@ -377,14 +379,12 @@ int afs_volume_release_fileserver(struct afs_vnode *vnode,
 		server->fs_act_jif = jiffies;
 	case -ENOMEM:
 	case -ENONET:
-		break;
+		/* tell the caller to accept the result */
+		afs_put_server(server);
+		_leave(" [local failure]");
+		return 1;
 	}
 
-	/* tell the caller to accept the result */
-	afs_put_server(server);
-	_leave("");
-	return 1;
-
 	/* tell the caller to loop around and try the next server */
 try_next_server_upw:
 	up_write(&volume->server_sem);


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3]
  2007-04-25 10:50 ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3] David Howells
                     ` (11 preceding siblings ...)
  2007-04-25 10:52   ` [PATCH 16/16] AFS: Add "directory write" support " David Howells
@ 2007-04-25 13:38   ` David Howells
  2007-04-25 19:43     ` David Miller
  2007-04-25 19:56     ` David Howells
  12 siblings, 2 replies; 17+ messages in thread
From: David Howells @ 2007-04-25 13:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: torvalds, linux-kernel, linux-fsdevel, netdev

Andrew Morton <akpm@linux-foundation.org> wrote:

> I'm ducking all feature and cleanup patches now, and probably shall
> continue to do so for some weeks.  The priority (which I believe to be
> increasingly urgent) is to fix the 2.6.21 regressions and to stabilise
> the things which we presently have queued for 2.6.22.  Not to
> mention the 1000ish unaddressed bug reports in bugzilla and elsewhere.

Fair enough.  I think the idea is for them (or at least some of them) to go
through one of DaveM's net git trees anyway.

David

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3]
  2007-04-25 13:38   ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite " David Howells
@ 2007-04-25 19:43     ` David Miller
  2007-04-25 19:56     ` David Howells
  1 sibling, 0 replies; 17+ messages in thread
From: David Miller @ 2007-04-25 19:43 UTC (permalink / raw)
  To: dhowells; +Cc: akpm, torvalds, linux-kernel, linux-fsdevel, netdev

From: David Howells <dhowells@redhat.com>
Date: Wed, 25 Apr 2007 14:38:32 +0100

> I think the idea is for them (or at least some of them) to go
> through one of DaveM's net git trees anyway.

Then please generate your patches against my net-2.6.21 GIT
tree.  Most of your initial patches in the series (the SKB
routine one for example) are already in my tree.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3]
  2007-04-25 13:38   ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite " David Howells
  2007-04-25 19:43     ` David Miller
@ 2007-04-25 19:56     ` David Howells
  2007-04-25 20:02       ` David Miller
  1 sibling, 1 reply; 17+ messages in thread
From: David Howells @ 2007-04-25 19:56 UTC (permalink / raw)
  To: David Miller; +Cc: akpm, torvalds, linux-kernel, linux-fsdevel, netdev

David Miller <davem@davemloft.net> wrote:

> Then please generate your patches against my net-2.6.21 GIT
> tree.  Most of your initial patches in the series (the SKB
> routine one for example) are already in my tree.

Do you mean your net-2.6.22 GIT tree?

Do you want me to make it available as a GIT tree for you to pull?  Or would
you prefer patches?

David

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3]
  2007-04-25 19:56     ` David Howells
@ 2007-04-25 20:02       ` David Miller
  0 siblings, 0 replies; 17+ messages in thread
From: David Miller @ 2007-04-25 20:02 UTC (permalink / raw)
  To: dhowells; +Cc: akpm, torvalds, linux-kernel, linux-fsdevel, netdev

From: David Howells <dhowells@redhat.com>
Date: Wed, 25 Apr 2007 20:56:47 +0100

> David Miller <davem@davemloft.net> wrote:
> 
> > Then please generate your patches against my net-2.6.21 GIT
> > tree.  Most of your initial patches in the series (the SKB
> > routine one for example) are already in my tree.
> 
> Do you mean your net-2.6.22 GIT tree?
> 
> Do you want me to make it available as a GIT tree for you to pull?  Or would
> you prefer patches?

Just patches is perfectly fine.

Also, if it's easier to diff against -mm, that works too
since Andrew integrates my net-2.6.22 tree into -mm most
of the time.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2007-04-25 20:02 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20070425041514.bb770377.akpm@linux-foundation.org>
2007-04-25 10:50 ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3] David Howells
2007-04-25 10:50   ` [PATCH 01/16] AF_RXRPC: Move generic skbuff stuff from XFRM code to generic code " David Howells
2007-04-25 10:50   ` [PATCH 02/16] cancel_delayed_work: use del_timer() instead of del_timer_sync() " David Howells
2007-04-25 10:50   ` [PATCH 03/16] AF_RXRPC: Key facility changes for AF_RXRPC " David Howells
2007-04-25 10:50   ` [PATCH 04/16] AF_RXRPC: Make it possible to merely try to cancel timers from a module " David Howells
2007-04-25 10:51   ` [PATCH 07/16] AF_RXRPC: Add an interface to the AF_RXRPC module for the AFS filesystem to use " David Howells
2007-04-25 10:51   ` [PATCH 10/16] AFS: Handle multiple mounts of an AFS superblock correctly " David Howells
2007-04-25 10:51   ` [PATCH 11/16] AFS: Add security support " David Howells
2007-04-25 10:51   ` [PATCH 12/16] AFS: Update the AFS fs documentation " David Howells
2007-04-25 10:51   ` [PATCH 13/16] commit ad495d7b6cfcd1bc2eaf06c42699be0bb5d84234 " David Howells
2007-04-25 10:51   ` [PATCH 14/16] AFS: Add support for the CB.GetCapabilities operation " David Howells
2007-04-25 10:51   ` [PATCH 15/16] AFS: Implement the CB.InitCallBackState3 " David Howells
2007-04-25 10:52   ` [PATCH 16/16] AFS: Add "directory write" support " David Howells
2007-04-25 13:38   ` [PATCH 00/16] AF_RXRPC socket family and AFS rewrite " David Howells
2007-04-25 19:43     ` David Miller
2007-04-25 19:56     ` David Howells
2007-04-25 20:02       ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).