Embedded Linux development
* Re: [PATCH 01/17] pramfs: documentation
From: Marco Stornelli @ 2011-01-07 20:30 UTC (permalink / raw)
  To: Tony Luck; +Cc: Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird
In-Reply-To: <AANLkTimGt5hi+TpFHWxktJEg3tcDQmmARdqFqzAt++OQ@mail.gmail.com>

On 07/01/2011 19:42, Tony Luck wrote:
> On Thu, Jan 6, 2011 at 4:01 AM, Marco Stornelli
> <marco.stornelli@gmail.com> wrote:
>> +accessed data that must survive system reboots and power cycles. An
>> +example usage might be system logs under /var/log, or a user address
>> +book in a cell phone or PDA.
> 
> Some usage model questions:
> 
> How do you handle errors?  I see that there are a few sanity checks in the
> "mount" path ... but there would seem to be several opportunities for the
> file system to get corrupted in other ways.  Since you don't have a block
> device, a standard "fsck" program looks challenging (though I guess you
> could mmap("/dev/mem") to peek & poke at the filesystem before trying
> to mount it).

Actually not (at least when the strict devmem option is turned on),
because the memory region is currently marked exclusive (this is only a
design constraint). About errors: pramfs does not keep file data in the
page cache for normal file I/O, so there is no writeback; read/write
operations are done with direct I/O and are always synchronous. The data
is write-protected in hw when the arch provides this facility (x86
does). Each inode contains a checksum, and inodes with problems are
marked as bad. The superblock contains a checksum too, and there is a
redundant superblock.
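
A rough user-space sketch of the "checksum plus redundant copy" idea
described above. It is not the pramfs mount path: DEMO_SB_SIZE, the
record layout (big-endian checksum in the first two bytes) and all
demo_* names are assumptions made for illustration, and whether pramfs
repairs the primary copy this way is not stated in this thread. The CRC
routine is the plain bitwise CRC-16 with reflected polynomial 0xA001,
seeded with 0xFFFF as the patch seeds crc16() with ~0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SB_SIZE 128	/* assumed record size, illustration only */

/* Bitwise CRC-16, reflected polynomial 0xA001; the caller passes the seed. */
static uint16_t demo_crc16(uint16_t crc, const uint8_t *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	while (len--) {
		crc ^= *buf++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc & 1) ? (crc >> 1) ^ 0xA001 : crc >> 1;
	}
	return crc;
}

/* Store the checksum of bytes [2, DEMO_SB_SIZE) big-endian in bytes [0, 2). */
void demo_seal(uint8_t *rec)
{
	uint16_t crc = demo_crc16(0xFFFF, rec + 2, DEMO_SB_SIZE - 2);

	rec[0] = (uint8_t)(crc >> 8);
	rec[1] = (uint8_t)(crc & 0xFF);
}

/* Return non-zero if the stored checksum matches the record contents. */
static int demo_valid(const uint8_t *rec)
{
	uint16_t stored = (uint16_t)((rec[0] << 8) | rec[1]);

	return stored == demo_crc16(0xFFFF, rec + 2, DEMO_SB_SIZE - 2);
}

/*
 * 'area' holds two back-to-back copies of the record.
 * Returns 0 if the primary copy is good, 1 if it was repaired from the
 * backup copy, and -1 if both copies fail the checksum.
 */
int demo_recover(uint8_t *area)
{
	uint8_t *primary = area;
	uint8_t *backup = area + DEMO_SB_SIZE;

	if (demo_valid(primary))
		return 0;
	if (demo_valid(backup)) {
		memcpy(primary, backup, DEMO_SB_SIZE);
		return 1;
	}
	return -1;
}
```

A caller would seal the record after every update, copy it to the
second slot, and run demo_recover() before trusting the contents.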

> Some sort of recovery path would seem useful for the "address
> book" use model ... or do you just expect users to back their address book
> up (to the cloud?) and have the phone just make a clean filesystem if any
> errors are found?

Yeah, maybe the address book is not a perfectly suitable case, but it
was only an example. I thought of the fs as a "cache" in this use case.
However, the designer can use this area however he wants. Recently I
saw a project that used this fs as a system cache for decrypted files,
where the files were stored encrypted in flash, so I think it's
flexible.

> What about quotas?  You have a fixed amount of persistent space, and
> presumably a number of apps that the user installs on their device that
> may like to use pramfs to store data.  Do you need some kernel enforcement
> to stop one rogue application from using up all the space? Or do you expect that
> this would be handled in some library level interface that applications will
> use to access pramfs?

Honestly, in my embedded systems I've never used quotas, also to save
footprint (for the kernel support, I mean). I don't think it's a hot
feature in this case, and other filesystems for embedded use, such as
ubifs, jffs2 etc., don't support it.

Marco

* Re: [PATCH 01/17] pramfs: documentation
From: Tony Luck @ 2011-01-07 18:42 UTC (permalink / raw)
  To: Marco Stornelli; +Cc: Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird
In-Reply-To: <4D25AF02.60208@gmail.com>

On Thu, Jan 6, 2011 at 4:01 AM, Marco Stornelli
<marco.stornelli@gmail.com> wrote:
> +accessed data that must survive system reboots and power cycles. An
> +example usage might be system logs under /var/log, or a user address
> +book in a cell phone or PDA.

Some usage model questions:

How do you handle errors?  I see that there are a few sanity checks in the
"mount" path ... but there would seem to be several opportunities for the
file system to get corrupted in other ways.  Since you don't have a block
device, a standard "fsck" program looks challenging (though I guess you
could mmap("/dev/mem") to peek & poke at the filesystem before trying
to mount it).  Some sort of recovery path would seem useful for the "address
book" use model ... or do you just expect users to back their address book
up (to the cloud?) and have the phone just make a clean filesystem if any
errors are found?

What about quotas?  You have a fixed amount of persistent space, and
presumably a number of apps that the user installs on their device that
may like to use pramfs to store data.  Do you need some kernel enforcement
to stop one rogue application from using up all the space? Or do you expect that
this would be handled in some library level interface that applications will
use to access pramfs?

-Tony

* Re: [PATCH] Move an assert under DEBUG_KERNEL.
From: Rob Landley @ 2011-01-07  9:44 UTC (permalink / raw)
  To: Andrew Morton; +Cc: trivial, linux-kernel, linux-embedded
In-Reply-To: <20110106154120.b69118c9.akpm@linux-foundation.org>

On 01/06/2011 05:41 PM, Andrew Morton wrote:
>> +#ifdef CONFIG_DEBUG_KERNEL
>>    #define ASSERT_RTNL() do { \
>>    	if (unlikely(!rtnl_is_locked())) { \
>>    		printk(KERN_ERR "RTNL: assertion failed at %s (%d)\n", \
>> @@ -789,6 +790,9 @@ extern void __rtnl_unlock(void);
>>    		dump_stack(); \
>>    	} \
>>    } while(0)
>> +#else
>> +#define ASSERT_RTNL()
>> +#endif
>>
>>    static inline u32 rtm_get_table(struct rtattr **rta, u8 table)
>>    {
>
> Probably a worthwhile thing to do, IMO.  If there's some net-specific
> CONFIG_DEBUG_ setting then that would be a better thing to use.

I looked and didn't find one.  lib/Kconfig.debug has DEBUG_OBJECTS and 
PROVE_LOCKING and such but nothing quite on topic.  The only "DEBUG" in 
net/Kconfig is NETFILTER_DEBUG.  Nothing relevant in
drivers/net/Kconfig, there isn't a Kconfig in net/core...

I thought about adding a new symbol, but CONFIG_DEBUG_KERNEL is already 
used in a few existing places:

   arch/powerpc/kernel/sysfs.c
   arch/parisc/mm/init.c
   arch/blackfin/include/asm/entry.h

So this isn't the first instance of it, but that doesn't mean those uses 
are correct. :)

> However the patch was a) wordwrapped, b) space-stuffed and c) not cc'ed
> to the networking list.  So its prospects are dim.

Sorry, finally gave up on kmail and set up thunderbird.  Still trying to 
beat the darn thing into submission.  (It looked right before I hit 
send.  And I cursored over the tabs to make sure. :)

I'll work out my email issues and then cc: the networking list on the 
resubmit.

Thanks,

Rob

* Re: [PATCH] Move an assert under DEBUG_KERNEL.
From: Andrew Morton @ 2011-01-06 23:41 UTC (permalink / raw)
  To: Rob Landley; +Cc: trivial, linux-kernel, linux-embedded
In-Reply-To: <4D2579B2.7060704@parallels.com>

On Thu, 6 Jan 2011 02:13:38 -0600
Rob Landley <rlandley@parallels.com> wrote:

> From: Rob Landley <rlandley@parallels.com>
> 
> Move an assert under DEBUG_KERNEL.
> 
> Signed-off-by: Rob Landley <rlandley@parallels.com>
> ---
> 
> Saves about 3k from x86-64 defconfig according to scripts/bloat-o-meter.
> 
>   include/linux/rtnetlink.h |    4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
> index bbad657..28c4025 100644
> --- a/include/linux/rtnetlink.h
> +++ b/include/linux/rtnetlink.h
> @@ -782,6 +782,7 @@ extern struct netdev_queue 
> *dev_ingress_queue_create(struct net_device *dev);
>   extern void rtnetlink_init(void);
>   extern void __rtnl_unlock(void);
> 
> +#ifdef CONFIG_DEBUG_KERNEL
>   #define ASSERT_RTNL() do { \
>   	if (unlikely(!rtnl_is_locked())) { \
>   		printk(KERN_ERR "RTNL: assertion failed at %s (%d)\n", \
> @@ -789,6 +790,9 @@ extern void __rtnl_unlock(void);
>   		dump_stack(); \
>   	} \
>   } while(0)
> +#else
> +#define ASSERT_RTNL()
> +#endif
> 
>   static inline u32 rtm_get_table(struct rtattr **rta, u8 table)
>   {

Probably a worthwhile thing to do, IMO.  If there's some net-specific
CONFIG_DEBUG_ setting then that would be a better thing to use.

However the patch was a) wordwrapped, b) space-stuffed and c) not cc'ed
to the networking list.  So its prospects are dim.
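
For readers following the patch, the pattern under discussion can be
sketched in user space roughly as follows. This is an illustration, not
the kernel macro: DEMO_DEBUG, demo_lock_held, demo_failures and
demo_update() are made-up names, with demo_lock_held standing in for
rtnl_is_locked(). The one deliberate difference from the patch is that
the disabled variant expands to an empty do/while rather than to
nothing, a common convention that keeps the macro a single statement in
every configuration.

```c
#include <assert.h>
#include <stdio.h>

int demo_lock_held;	/* hypothetical stand-in for the rtnl lock state */
int demo_failures;	/* counts failed checks so the effect is observable */

#ifdef DEMO_DEBUG
/* Debug build: check the lock and report where the check failed. */
#define DEMO_ASSERT_LOCKED() do { \
	if (!demo_lock_held) { \
		demo_failures++; \
		fprintf(stderr, "assertion failed at %s (%d)\n", \
			__FILE__, __LINE__); \
	} \
} while (0)
#else
/* Non-debug build: the check compiles away entirely. */
#define DEMO_ASSERT_LOCKED() do { } while (0)
#endif

int demo_update(int value)
{
	/* Standard assert() follows the same idea, keyed on NDEBUG. */
	assert(value >= 0);
	DEMO_ASSERT_LOCKED();
	return value + 1;
}
```

Built without -DDEMO_DEBUG, demo_update() carries no check and no
format string, which is where the size saving comes from; built with
it, a call made while demo_lock_held is zero increments demo_failures
and prints the location.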

* Re: [PATCH 00/17] pramfs: persistent and protected RAM filesystem
From: Marco Stornelli @ 2011-01-06 18:31 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Peter Zijlstra, Linux Kernel, Linux FS Devel, Linux Embedded,
	Tim Bird
In-Reply-To: <987664A83D2D224EAE907B061CE93D530193FC6AA2@orsmsx505.amr.corp.intel.com>

On 06/01/2011 19:22, Luck, Tony wrote:
>> Correction: maybe I used the wrong term, I meant "volatile" instead
>> of "temporary" information, i.e. I'd like to save this info to re-read
>> it later but I don't want to store it in flash, a simple log, run-time
>> information for debug like a flight-recorder or whatever you want.
> 
> I'm puzzled by the use of "a generic piece of memory" to store "persistent"
> things (Perhaps this is made clear in the 17 parts of the patch? I haven't
> read them yet).  On x86 f/w typically clears all of memory on reset ... so
> you only get persistence if you use kexec to get from the old kernel to
> the new one.
> 
> -Tony
> 

First of all, you can find a lot of information on the web site, where
there is an overview and a page with implementation details, benchmarks
and so on. By "a generic piece of memory" I mean a generic, directly
addressable memory device. Usually this device is an NVRAM, so we have a
persistent store. If you don't have this hw you can use other devices or
classic RAM; in that case you have a fs that is persistent only across
reboots. This fs is mainly meant for embedded systems, where fw can be
configured not to clear *all* the memory. Pramfs is indeed supported by
U-Boot; see CONFIG_PRAM in the Das U-Boot manual. x86 may be a "strange"
world for this fs, but if the user wants, it can be used there without
problems, because there is no strict arch or hw dependency.

Marco

* RE: [PATCH 00/17] pramfs: persistent and protected RAM filesystem
From: Luck, Tony @ 2011-01-06 18:22 UTC (permalink / raw)
  To: Marco Stornelli
  Cc: Peter Zijlstra, Linux Kernel, Linux FS Devel, Linux Embedded,
	Tim Bird
In-Reply-To: <4D25F4CF.1030009@gmail.com>

> Correction: maybe I used the wrong term, I meant "volatile" instead
> of "temporary" information, i.e. I'd like to save this info to re-read
> it later but I don't want to store it in flash, a simple log, run-time
> information for debug like a flight-recorder or whatever you want.

I'm puzzled by the use of "a generic piece of memory" to store "persistent"
things (Perhaps this is made clear in the 17 parts of the patch? I haven't
read them yet).  On x86 f/w typically clears all of memory on reset ... so
you only get persistence if you use kexec to get from the old kernel to
the new one.

-Tony

* Re: [PATCH 00/17] pramfs: persistent and protected RAM filesystem
From: Marco Stornelli @ 2011-01-06 16:58 UTC (permalink / raw)
  To: Marco Stornelli
  Cc: Peter Zijlstra, Linux Kernel, Linux FS Devel, Linux Embedded,
	Tim Bird, Tony Luck
In-Reply-To: <4D25ED22.3070900@gmail.com>

On 06/01/2011 17:26, Marco Stornelli wrote:
> On 06/01/2011 15:03, Peter Zijlstra wrote:
> and temporary information with a complete fs structure. However we are
> 

Correction: maybe I used the wrong term, I meant "volatile" instead
of "temporary" information, i.e. I'd like to save this info to re-read
it later but I don't want to store it in flash, a simple log, run-time
information for debug like a flight-recorder or whatever you want.

* Re: [PATCH 00/17] pramfs: persistent and protected RAM filesystem
From: Marco Stornelli @ 2011-01-06 16:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linux Kernel, Linux FS Devel, Linux Embedded, Tim Bird, Tony Luck
In-Reply-To: <1294322613.2016.333.camel@laptop>

On 06/01/2011 15:03, Peter Zijlstra wrote:
> On Thu, 2011-01-06 at 13:00 +0100, Marco Stornelli wrote:
>> Hi all,
>>
>> after several reviews it is time to submit the code for mainline. Thanks to
>> CELF for believing in and actively supporting the project, and thanks to Tim Bird.
> 
> Tony Luck was also playing with something like this I believe.
> 

Yes, I know, although the approach is different. He is trying to use a
record-based persistent space with a simple fs interface to store
oopses or something like that. The idea here is a little bit different,
i.e. to have a place (a generic piece of memory) to write non-sensitive
and temporary information with a complete fs structure. However, we are
on the same road :)

Marco

* Re: [PATCH 00/17] pramfs: persistent and protected RAM filesystem
From: Peter Zijlstra @ 2011-01-06 14:03 UTC (permalink / raw)
  To: Marco Stornelli
  Cc: Linux Kernel, Linux FS Devel, Linux Embedded, Tim Bird, Tony Luck
In-Reply-To: <4D25AEEE.1050401@gmail.com>

On Thu, 2011-01-06 at 13:00 +0100, Marco Stornelli wrote:
> Hi all,
> 
> after several reviews it is time to submit the code for mainline. Thanks to
> CELF for believing in and actively supporting the project, and thanks to Tim Bird.

Tony Luck was also playing with something like this I believe.

* [PATCH 17/17] pramfs: Makefile and Kconfig
From: Marco Stornelli @ 2011-01-06 12:05 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

Makefile and Kconfig.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff --git a/arch/Kconfig b/arch/Kconfig
index 8bf0fa6..0d48de0 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -97,6 +97,9 @@ config USER_RETURN_NOTIFIER
 config HAVE_IOREMAP_PROT
 	bool
 
+config HAVE_SET_MEMORY_RO
+	bool
+
 config HAVE_KPROBES
 	bool
 
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e330da2..9be32f2 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -23,6 +23,7 @@ config X86
 	select HAVE_OPROFILE
 	select HAVE_PERF_EVENTS
 	select HAVE_IRQ_WORK
+	select HAVE_SET_MEMORY_RO
 	select HAVE_IOREMAP_PROT
 	select HAVE_KPROBES
 	select HAVE_MEMBLOCK
diff --git a/fs/Kconfig b/fs/Kconfig
index 771f457..774ebb5 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -13,7 +13,7 @@ source "fs/ext4/Kconfig"
 config FS_XIP
 # execute in place
 	bool
-	depends on EXT2_FS_XIP
+	depends on EXT2_FS_XIP || PRAMFS_XIP
 	default y
 
 source "fs/jbd/Kconfig"
@@ -25,13 +25,14 @@ config FS_MBCACHE
 	default y if EXT2_FS=y && EXT2_FS_XATTR
 	default y if EXT3_FS=y && EXT3_FS_XATTR
 	default y if EXT4_FS=y && EXT4_FS_XATTR
-	default m if EXT2_FS_XATTR || EXT3_FS_XATTR || EXT4_FS_XATTR
+	default y if PRAMFS=y && PRAMFS_XATTR
+	default m if EXT2_FS_XATTR || EXT3_FS_XATTR || EXT4_FS_XATTR || PRAMFS_XATTR
 
 source "fs/reiserfs/Kconfig"
 source "fs/jfs/Kconfig"
 
 config FS_POSIX_ACL
-# Posix ACL utility routines (for now, only ext2/ext3/jfs/reiserfs/nfs4)
+# Posix ACL utility routines (for now, only ext2/ext3/jfs/reiserfs/nfs4/pramfs)
 #
 # NOTE: you can implement Posix ACLs without these helpers (XFS does).
 # 	Never use this symbol for ifdefs.
@@ -190,6 +191,7 @@ source "fs/qnx4/Kconfig"
 source "fs/romfs/Kconfig"
 source "fs/sysv/Kconfig"
 source "fs/ufs/Kconfig"
+source "fs/pramfs/Kconfig"
 source "fs/exofs/Kconfig"
 
 endif # MISC_FILESYSTEMS
diff --git a/fs/Makefile b/fs/Makefile
index a7f7cef..43523ce 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -121,3 +121,4 @@ obj-$(CONFIG_BTRFS_FS)		+= btrfs/
 obj-$(CONFIG_GFS2_FS)           += gfs2/
 obj-$(CONFIG_EXOFS_FS)          += exofs/
 obj-$(CONFIG_CEPH_FS)		+= ceph/
+obj-$(CONFIG_PRAMFS)		+= pramfs/
diff --git a/fs/pramfs/Kconfig b/fs/pramfs/Kconfig
new file mode 100644
index 0000000..0a2c54a
--- /dev/null
+++ b/fs/pramfs/Kconfig
@@ -0,0 +1,72 @@
+config PRAMFS
+	tristate "Persistent and Protected RAM file system support"
+	depends on HAS_IOMEM && EXPERIMENTAL
+	select CRC16
+	help
+	   If your system has a block of fast (comparable in access speed to
+	   system memory) and non-volatile RAM and you wish to mount a
+	   light-weight, full-featured, and space-efficient filesystem over it,
+	   say Y here, and read <file:Documentation/filesystems/pramfs.txt>.
+
+	   To compile this as a module,  choose M here: the module will be
+	   called pramfs.
+
+config PRAMFS_XIP
+	bool "Execute-in-place in PRAMFS"
+	depends on PRAMFS
+	help
+	   Say Y here to enable XIP feature of PRAMFS.
+
+config PRAMFS_WRITE_PROTECT
+	bool "PRAMFS write protection"
+	depends on PRAMFS && MMU && HAVE_SET_MEMORY_RO
+	default y
+	help
+	   Say Y here to enable the write protect feature of PRAMFS.
+
+config PRAMFS_XATTR
+	bool "PRAMFS extended attributes"
+	depends on PRAMFS
+	help
+	  Extended attributes are name:value pairs associated with inodes by
+	  the kernel or by users (see the attr(5) manual page, or visit
+	  <http://acl.bestbits.at/> for details).
+
+	  If unsure, say N.
+
+config PRAMFS_POSIX_ACL
+	bool "PRAMFS POSIX Access Control Lists"
+	depends on PRAMFS_XATTR
+	select FS_POSIX_ACL
+	help
+	  Posix Access Control Lists (ACLs) support permissions for users and
+	  groups beyond the owner/group/world scheme.
+
+	  To learn more about Access Control Lists, visit the Posix ACLs for
+	  Linux website <http://acl.bestbits.at/>.
+
+	  If you don't know what Access Control Lists are, say N.
+
+config PRAMFS_SECURITY
+	bool "PRAMFS Security Labels"
+	depends on PRAMFS_XATTR
+	help
+	  Security labels support alternative access control models
+	  implemented by security modules like SELinux.  This option
+	  enables an extended attribute handler for file security
+	  labels in the pram filesystem.
+
+	  If you are not using a security module that requires using
+	  extended attributes for file security labels, say N.
+
+config PRAMFS_TEST
+	boolean
+	depends on PRAMFS
+
+config PRAMFS_TEST_MODULE
+	tristate "PRAMFS Test"
+	depends on PRAMFS && m
+	select PRAMFS_TEST
+	help
+	  Say M here to build a simple module to test the protection of
+	  PRAMFS. The module will be called pramfs_test.
diff --git a/fs/pramfs/Makefile b/fs/pramfs/Makefile
new file mode 100644
index 0000000..055f0bb
--- /dev/null
+++ b/fs/pramfs/Makefile
@@ -0,0 +1,14 @@
+#
+# Makefile for the linux pram-filesystem routines.
+#
+
+obj-$(CONFIG_PRAMFS) += pramfs.o
+obj-$(CONFIG_PRAMFS_TEST_MODULE) += pramfs_test.o
+
+pramfs-y := balloc.o dir.o file.o inode.o namei.o super.o symlink.o ioctl.o
+
+pramfs-$(CONFIG_PRAMFS_WRITE_PROTECT) += wprotect.o
+pramfs-$(CONFIG_PRAMFS_XIP) += xip.o
+pramfs-$(CONFIG_PRAMFS_XATTR) += xattr.o xattr_user.o xattr_trusted.o desctree.o
+pramfs-$(CONFIG_PRAMFS_POSIX_ACL) += acl.o
+pramfs-$(CONFIG_PRAMFS_SECURITY) += xattr_security.o

* [PATCH 16/17] pramfs: ioctl operations
From: Marco Stornelli @ 2011-01-06 12:04 UTC (permalink / raw)
  To: Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

Ioctl operations.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff --git a/fs/pramfs/ioctl.c b/fs/pramfs/ioctl.c
new file mode 100644
index 0000000..092cbe6
--- /dev/null
+++ b/fs/pramfs/ioctl.c
@@ -0,0 +1,121 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Ioctl operations.
+ *
+ * Copyright 2010 Marco Stornelli <marco.stornelli@gmail.com>
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/capability.h>
+#include <linux/time.h>
+#include <linux/sched.h>
+#include <linux/compat.h>
+#include <linux/mount.h>
+#include "pram.h"
+
+long pram_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
+{
+	struct inode *inode = filp->f_dentry->d_inode;
+	struct pram_inode *pi;
+	unsigned int flags;
+	int ret;
+
+	pi = pram_get_inode(inode->i_sb, inode->i_ino);
+	if (!pi)
+		return -EACCES;
+
+	switch (cmd) {
+	case FS_IOC_GETFLAGS:
+		flags = be32_to_cpu(pi->i_flags) & FS_FL_USER_VISIBLE;
+		return put_user(flags, (int __user *) arg);
+	case FS_IOC_SETFLAGS: {
+		unsigned int oldflags;
+
+		ret = mnt_want_write(filp->f_path.mnt);
+		if (ret)
+			return ret;
+
+		if (!is_owner_or_cap(inode)) {
+			ret = -EACCES;
+			goto flags_out;
+		}
+
+		if (get_user(flags, (int __user *) arg)) {
+			ret = -EFAULT;
+			goto flags_out;
+		}
+
+		mutex_lock(&inode->i_mutex);
+		oldflags = be32_to_cpu(pi->i_flags);
+
+		if ((flags ^ oldflags) & (FS_APPEND_FL | FS_IMMUTABLE_FL)) {
+			if (!capable(CAP_LINUX_IMMUTABLE)) {
+				mutex_unlock(&inode->i_mutex);
+				ret = -EPERM;
+				goto flags_out;
+			}
+		}
+
+		if (!S_ISDIR(inode->i_mode))
+			flags &= ~FS_DIRSYNC_FL;
+
+		flags = flags & FS_FL_USER_MODIFIABLE;
+		flags |= oldflags & ~FS_FL_USER_MODIFIABLE;
+		pram_memunlock_inode(inode->i_sb, pi);
+		pi->i_flags = cpu_to_be32(flags);
+		inode->i_ctime = CURRENT_TIME_SEC;
+		pi->i_ctime = cpu_to_be32(inode->i_ctime.tv_sec);
+		pram_set_inode_flags(inode, pi);
+		pram_memlock_inode(inode->i_sb, pi);
+		mutex_unlock(&inode->i_mutex);
+flags_out:
+		mnt_drop_write(filp->f_path.mnt);
+		return ret;
+	}
+	case FS_IOC_GETVERSION:
+		return put_user(inode->i_generation, (int __user *) arg);
+	case FS_IOC_SETVERSION:
+		if (!is_owner_or_cap(inode))
+			return -EPERM;
+		ret = mnt_want_write(filp->f_path.mnt);
+		if (ret)
+			return ret;
+		if (get_user(inode->i_generation, (int __user *) arg)) {
+			ret = -EFAULT;
+		} else {
+			inode->i_ctime = CURRENT_TIME_SEC;
+			pram_update_inode(inode);
+		}
+		mnt_drop_write(filp->f_path.mnt);
+		return ret;
+	default:
+		return -ENOTTY;
+	}
+}
+
+#ifdef CONFIG_COMPAT
+long pram_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	switch (cmd) {
+	case FS_IOC32_GETFLAGS:
+		cmd = FS_IOC_GETFLAGS;
+		break;
+	case FS_IOC32_SETFLAGS:
+		cmd = FS_IOC_SETFLAGS;
+		break;
+	case FS_IOC32_GETVERSION:
+		cmd = FS_IOC_GETVERSION;
+		break;
+	case FS_IOC32_SETVERSION:
+		cmd = FS_IOC_SETVERSION;
+		break;
+	default:
+		return -ENOIOCTLCMD;
+	}
+	return pram_ioctl(file, cmd, (unsigned long) compat_ptr(arg));
+}
+#endif
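
The flag handling in the FS_IOC_SETFLAGS case above can be restated as a
small pure function, which may make the masking easier to follow. This
is a sketch under stated assumptions: the DEMO_* constants are local
copies of the corresponding FS_*_FL values and real code should take
them from <linux/fs.h>; demo_merge_flags() and its may_set_immutable
parameter (standing in for capable(CAP_LINUX_IMMUTABLE)) are made-up
names, not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Local illustrative copies; use the definitions in <linux/fs.h> in real code. */
#define DEMO_IMMUTABLE_FL	0x00000010u
#define DEMO_APPEND_FL		0x00000020u
#define DEMO_DIRSYNC_FL		0x00010000u
#define DEMO_FL_USER_MODIFIABLE	0x000380FFu

/*
 * Compute the flags to store, given the flags currently on the inode
 * and the flags requested by the caller. Returns 0 and fills *out on
 * success, or -EPERM if a privileged bit would change without privilege.
 */
int demo_merge_flags(unsigned int oldflags, unsigned int requested,
		     int is_dir, int may_set_immutable, unsigned int *out)
{
	assert(out != NULL);

	/* Toggling append-only or immutable needs the extra privilege. */
	if (((requested ^ oldflags) & (DEMO_APPEND_FL | DEMO_IMMUTABLE_FL)) &&
	    !may_set_immutable)
		return -EPERM;

	/* The dirsync flag is only meaningful on directories. */
	if (!is_dir)
		requested &= ~DEMO_DIRSYNC_FL;

	/* User-modifiable bits come from the request, the rest from the inode. */
	*out = (requested & DEMO_FL_USER_MODIFIABLE) |
	       (oldflags & ~DEMO_FL_USER_MODIFIABLE);
	return 0;
}
```

Keeping the non-modifiable bits from the old value is what prevents a
caller from clearing internal flags by passing a zero word.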

* [PATCH 15/17] pramfs: test module
From: Marco Stornelli @ 2011-01-06 12:04 UTC (permalink / raw)
  To: Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

Test module.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff --git a/fs/pramfs/pramfs_test.c b/fs/pramfs/pramfs_test.c
new file mode 100644
index 0000000..24e016f
--- /dev/null
+++ b/fs/pramfs/pramfs_test.c
@@ -0,0 +1,47 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Pramfs test module.
+ *
+ * Copyright 2009-2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * Copyright 2003 Sony Corporation
+ * Copyright 2003 Matsushita Electric Industrial Co., Ltd.
+ * 2003-2004 (c) MontaVista Software, Inc. , Steve Longerbeam
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+#include <linux/module.h>
+#include <linux/version.h>
+#include <linux/init.h>
+#include <linux/fs.h>
+#include "pram.h"
+
+int __init test_pramfs_write(void)
+{
+	struct pram_super_block *psb;
+
+	psb = get_pram_super();
+	if (!psb) {
+		printk(KERN_ERR
+			"%s: PRAMFS super block not found (not mounted?)\n",
+			__func__);
+		return -ENODEV;
+	}
+
+	/*
+	 * Attempt an unprotected clear of checksum information in the
+	 * superblock, this should cause a kernel page protection fault.
+	 */
+	printk(KERN_INFO "%s: writing to kernel VA %p\n", __func__, psb);
+	psb->s_sum = 0;
+
+	return 0;
+}
+
+void test_pramfs_write_cleanup(void) {}
+
+/* Module information */
+MODULE_LICENSE("GPL");
+module_init(test_pramfs_write);
+module_exit(test_pramfs_write_cleanup);

* [PATCH 14/17] pramfs: memory protection
From: Marco Stornelli @ 2011-01-06 12:04 UTC (permalink / raw)
  To: Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

Memory write protection.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff --git a/fs/pramfs/wprotect.c b/fs/pramfs/wprotect.c
new file mode 100644
index 0000000..d0f0508
--- /dev/null
+++ b/fs/pramfs/wprotect.c
@@ -0,0 +1,41 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Write protection for the filesystem pages.
+ *
+ * Copyright 2009-2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * Copyright 2003 Sony Corporation
+ * Copyright 2003 Matsushita Electric Industrial Co., Ltd.
+ * 2003-2004 (c) MontaVista Software, Inc. , Steve Longerbeam
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/io.h>
+#include "pram.h"
+
+DEFINE_SPINLOCK(writeable_lock);
+
+void pram_writeable(void *vaddr, unsigned long size, int rw)
+{
+	int ret = 0;
+	unsigned long nrpages = size >> PAGE_SHIFT;
+	unsigned long addr = (unsigned long)vaddr;
+
+	/* Page aligned */
+	addr &= PAGE_MASK;
+
+	if (size & (PAGE_SIZE - 1))
+		nrpages++;
+
+	if (rw)
+		ret = set_memory_rw(addr, nrpages);
+	else
+		ret = set_memory_ro(addr, nrpages);
+
+	BUG_ON(ret);
+}
diff --git a/fs/pramfs/wprotect.h b/fs/pramfs/wprotect.h
new file mode 100644
index 0000000..0aa1e33
--- /dev/null
+++ b/fs/pramfs/wprotect.h
@@ -0,0 +1,151 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Memory protection definitions for the PRAMFS filesystem.
+ *
+ * Copyright 2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#ifndef __WPROTECT_H
+#define __WPROTECT_H
+
+#include <linux/pram_fs.h>
+
+/* pram_memunlock_super() before calling! */
+static inline void pram_sync_super(struct pram_super_block *ps)
+{
+	u16 crc = 0;
+	ps->s_wtime = cpu_to_be32(get_seconds());
+	ps->s_sum = 0;
+	crc = crc16(~0, (__u8 *)ps + sizeof(__be16), PRAM_SB_SIZE - sizeof(__be16));
+	ps->s_sum = cpu_to_be16(crc);
+	/* Keep the redundant super block in sync */
+	memcpy((void *)ps + PRAM_SB_SIZE, (void *)ps, PRAM_SB_SIZE);
+}
+
+/* pram_memunlock_inode() before calling! */
+static inline void pram_sync_inode(struct pram_inode *pi)
+{
+	u16 crc = 0;
+	pi->i_sum = 0;
+	crc = crc16(~0, (__u8 *)pi + sizeof(__be16), PRAM_INODE_SIZE - sizeof(__be16));
+	pi->i_sum = cpu_to_be16(crc);
+}
+
+#ifdef CONFIG_PRAMFS_WRITE_PROTECT
+extern void pram_writeable(void *vaddr, unsigned long size, int rw);
+extern spinlock_t writeable_lock;
+static inline int pram_is_protected(struct super_block *sb)
+{
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+	return sbi->s_mount_opt & PRAM_MOUNT_PROTECT;
+}
+
+static inline void __pram_memunlock_range(void *p, unsigned long len)
+{
+	/*
+	 * NOTE: Ideally we should lock all the kernel to be memory safe
+	 * and avoid to write in the protected memory,
+	 * obviously it's not possible, so we only serialize
+	 * the operations at fs level. We can't disable the interrupts
+	 * because we could have a deadlock in this path.
+	 */
+	spin_lock(&writeable_lock);
+	pram_writeable(p, len, 1);
+}
+
+static inline void __pram_memlock_range(void *p, unsigned long len)
+{
+	pram_writeable(p, len, 0);
+	spin_unlock(&writeable_lock);
+}
+
+static inline void pram_memunlock_range(struct super_block *sb, void *p,
+					unsigned long len)
+{
+	if (pram_is_protected(sb))
+		__pram_memunlock_range(p, len);
+}
+
+static inline void pram_memlock_range(struct super_block *sb, void *p,
+					unsigned long len)
+{
+	if (pram_is_protected(sb))
+		__pram_memlock_range(p, len);
+}
+
+static inline void pram_memunlock_super(struct super_block *sb,
+					struct pram_super_block *ps)
+{
+	if (pram_is_protected(sb))
+		__pram_memunlock_range(ps, PRAM_SB_SIZE);
+}
+
+static inline void pram_memlock_super(struct super_block *sb,
+					struct pram_super_block *ps)
+{
+	pram_sync_super(ps);
+	if (pram_is_protected(sb))
+		__pram_memlock_range(ps, PRAM_SB_SIZE);
+}
+
+static inline void pram_memunlock_inode(struct super_block *sb,
+					struct pram_inode *pi)
+{
+	if (pram_is_protected(sb))
+		__pram_memunlock_range(pi, PRAM_SB_SIZE);
+}
+
+static inline void pram_memlock_inode(struct super_block *sb,
+					struct pram_inode *pi)
+{
+	pram_sync_inode(pi);
+	if (pram_is_protected(sb))
+		__pram_memlock_range(pi, PRAM_SB_SIZE);
+}
+
+static inline void pram_memunlock_block(struct super_block *sb,
+					void *bp)
+{
+	if (pram_is_protected(sb))
+		__pram_memunlock_range(bp, sb->s_blocksize);
+}
+
+static inline void pram_memlock_block(struct super_block *sb,
+					void *bp)
+{
+	if (pram_is_protected(sb))
+		__pram_memlock_range(bp, sb->s_blocksize);
+}
+
+#else
+#define pram_is_protected(sb)	0
+#define pram_writeable(vaddr, size, rw) do {} while (0)
+static inline void pram_memunlock_range(struct super_block *sb, void *p,
+					unsigned long len) {}
+static inline void pram_memlock_range(struct super_block *sb, void *p,
+					unsigned long len) {}
+static inline void pram_memunlock_super(struct super_block *sb,
+					struct pram_super_block *ps) {}
+static inline void pram_memlock_super(struct super_block *sb,
+					struct pram_super_block *ps)
+{
+	pram_sync_super(ps);
+}
+static inline void pram_memunlock_inode(struct super_block *sb,
+					struct pram_inode *pi) {}
+static inline void pram_memlock_inode(struct super_block *sb,
+					struct pram_inode *pi)
+{
+	pram_sync_inode(pi);
+}
+static inline void pram_memunlock_block(struct super_block *sb,
+					void *bp) {}
+static inline void pram_memlock_block(struct super_block *sb,
+					void *bp) {}
+
+#endif /* CONFIG_PRAMFS_WRITE_PROTECT */
+#endif
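
One detail of pram_writeable() that may deserve a second look: the
address is rounded down to a page boundary, but the page count is
derived from the size alone. The user-space sketch below compares that
calculation with a span-based one. DEMO_PAGE_SHIFT (4 KiB pages) and the
demo_* names are assumptions for illustration. If every protected object
is laid out so that it never crosses a page boundary, the two
calculations agree and nothing needs to change; the thread does not say
whether that holds.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT	12	/* assumed 4 KiB pages */
#define DEMO_PAGE_SIZE	(1UL << DEMO_PAGE_SHIFT)

/* Number of pages touched by the byte range [addr, addr + size). */
unsigned long demo_pages_spanned(uintptr_t addr, unsigned long size)
{
	uintptr_t first, last;

	if (size == 0)
		return 0;
	assert(addr + size - 1 >= addr);	/* range must not wrap around */
	first = addr >> DEMO_PAGE_SHIFT;
	last = (addr + size - 1) >> DEMO_PAGE_SHIFT;
	return (unsigned long)(last - first + 1);
}

/* The page count as computed in pram_writeable(): from the size only. */
unsigned long demo_pages_from_size(unsigned long size)
{
	unsigned long nrpages = size >> DEMO_PAGE_SHIFT;

	if (size & (DEMO_PAGE_SIZE - 1))
		nrpages++;
	return nrpages;
}
```

For a 128-byte object starting 64 bytes before a page boundary, the
size-only calculation yields one page while the object touches two, so
the second page would not have its protection changed.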

* [PATCH 13/17] pramfs: extended attributes block descriptors tree
From: Marco Stornelli @ 2011-01-06 12:04 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

Extended attributes block descriptors tree.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff --git a/fs/pramfs/desctree.c b/fs/pramfs/desctree.c
new file mode 100644
index 0000000..4508e70
--- /dev/null
+++ b/fs/pramfs/desctree.c
@@ -0,0 +1,181 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Extended attributes block descriptors tree.
+ *
+ * Copyright 2010 Marco Stornelli <marco.stornelli@gmail.com>
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/spinlock.h>
+#include "desctree.h"
+#include "pram.h"
+
+/* xblock_desc_init_always()
+ *
+ * These are initializations that need to be done on every
+ * descriptor allocation as the fields are not initialised
+ * by slab allocation.
+ */
+void xblock_desc_init_always(struct pram_xblock_desc *desc)
+{
+	atomic_set(&desc->refcount, 0);
+	desc->blocknr = 0;
+	desc->flags = 0;
+}
+
+/* xblock_desc_init_once()
+ *
+ * These are initializations that only need to be done
+ * once, because the fields are idempotent across use
+ * of the descriptor, so make the slab aware of that.
+ */
+void xblock_desc_init_once(struct pram_xblock_desc *desc)
+{
+	mutex_init(&desc->lock);
+}
+
+/* __insert_xblock_desc()
+ *
+ * Insert a new descriptor in the tree.
+ */
+static void __insert_xblock_desc(struct pram_sb_info *sbi,
+				 unsigned long blocknr, struct rb_node *node)
+{
+	struct rb_node **p = &(sbi->desc_tree.rb_node);
+	struct rb_node *parent = NULL;
+	struct pram_xblock_desc *desc;
+
+	while (*p) {
+		parent = *p;
+		desc = rb_entry(parent, struct pram_xblock_desc, node);
+
+		if (blocknr < desc->blocknr)
+			p = &(*p)->rb_left;
+		else if (blocknr > desc->blocknr)
+			p = &(*p)->rb_right;
+		else
+			/* Oops...another descriptor for the same block? */
+			BUG();
+	}
+
+	rb_link_node(node, parent, p);
+	rb_insert_color(node, &sbi->desc_tree);
+}
+
+void insert_xblock_desc(struct pram_sb_info *sbi, struct pram_xblock_desc *desc)
+{
+	spin_lock(&sbi->desc_tree_lock);
+	__insert_xblock_desc(sbi, desc->blocknr, &desc->node);
+	spin_unlock(&sbi->desc_tree_lock);
+}
+
+/* __lookup_xblock_desc()
+ *
+ * Look up an extended attribute descriptor by block number and take
+ * a reference on it. If none exists, return NULL, or allocate and
+ * insert a new descriptor when create is non-zero.
+ */
+static struct pram_xblock_desc *__lookup_xblock_desc(struct pram_sb_info *sbi,
+					    unsigned long blocknr,
+					    struct kmem_cache *cache,
+					    int create)
+{
+	struct rb_node *n = sbi->desc_tree.rb_node;
+	struct pram_xblock_desc *desc = NULL;
+
+	while (n) {
+		desc = rb_entry(n, struct pram_xblock_desc, node);
+
+		if (blocknr < desc->blocknr)
+			n = n->rb_left;
+		else if (blocknr > desc->blocknr)
+			n = n->rb_right;
+		else {
+			atomic_inc(&desc->refcount);
+			goto out;
+		}
+	}
+
+	desc = NULL;	/* not found: do not return the last node visited */
+	if (create) {
+		desc = kmem_cache_alloc(cache, GFP_ATOMIC);
+		if (!desc)
+			return ERR_PTR(-ENOMEM);
+		xblock_desc_init_always(desc);
+		atomic_set(&desc->refcount, 1);
+		desc->blocknr = blocknr;
+		__insert_xblock_desc(sbi, desc->blocknr, &desc->node);
+	}
+out:
+	return desc;
+}
+
+struct pram_xblock_desc *lookup_xblock_desc(struct pram_sb_info *sbi,
+					    unsigned long blocknr,
+					    struct kmem_cache *cache,
+					    int create)
+{
+	struct pram_xblock_desc *desc = NULL;
+
+	spin_lock(&sbi->desc_tree_lock);
+	desc = __lookup_xblock_desc(sbi, blocknr, cache, create);
+	spin_unlock(&sbi->desc_tree_lock);
+	return desc;
+}
+
+/* put_xblock_desc()
+ *
+ * Decrement the reference count and, if it reaches zero and the
+ * descriptor has been marked to be freed, remove it from the tree.
+ * It returns 0 if the descriptor has been removed and 1 otherwise.
+ */
+int put_xblock_desc(struct pram_sb_info *sbi, struct pram_xblock_desc *desc)
+{
+	int ret = 1;
+	if (!desc)
+		return ret;
+
+	if (atomic_dec_and_lock(&desc->refcount, &sbi->desc_tree_lock)) {
+		if (test_bit(FREEING, &desc->flags)) {
+			rb_erase(&desc->node, &sbi->desc_tree);
+			pram_dbg("erasing desc for block %lu\n", desc->blocknr);
+			ret = 0;
+		}
+		spin_unlock(&sbi->desc_tree_lock);
+	}
+	return ret;
+}
+
+/* mark_free_desc()
+ *
+ * Mark a descriptor to be freed. The descriptor will be removed later
+ * by put_xblock_desc().
+ */
+void mark_free_desc(struct pram_xblock_desc *desc)
+{
+	set_bit(FREEING, &desc->flags);
+}
+
+/* erase_tree()
+ *
+ * Free all objects in the tree.
+ */
+void erase_tree(struct pram_sb_info *sbi, struct kmem_cache *cachep)
+{
+	struct rb_node *n;
+	struct pram_xblock_desc *desc;
+
+	spin_lock(&sbi->desc_tree_lock);
+	n = rb_first(&sbi->desc_tree);
+	while (n) {
+		desc = rb_entry(n, struct pram_xblock_desc, node);
+		n = rb_next(n);	/* step first: desc is erased and freed below */
+		rb_erase(&desc->node, &sbi->desc_tree);
+		kmem_cache_free(cachep, desc);
+	}
+	spin_unlock(&sbi->desc_tree_lock);
+}
diff --git a/fs/pramfs/desctree.h b/fs/pramfs/desctree.h
new file mode 100644
index 0000000..22951c6
--- /dev/null
+++ b/fs/pramfs/desctree.h
@@ -0,0 +1,44 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Extended attributes block descriptors tree.
+ *
+ * Copyright 2010 Marco Stornelli <marco.stornelli@gmail.com>
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <asm/atomic.h>
+#include <linux/slab.h>
+#include "pram.h"
+
+struct pram_xblock_desc {
+#define FREEING (1UL << 1)
+	unsigned long flags;	/* descriptor flags */
+	atomic_t refcount;	/* users count of this descriptor */
+	unsigned long blocknr;	/* absolute block number */
+	struct mutex lock;	/* block lock */
+	struct rb_node node;	/* node in the rb tree */
+};
+
+extern struct pram_xblock_desc *lookup_xblock_desc(struct pram_sb_info *sbi,
+						   unsigned long blocknr,
+						   struct kmem_cache *, int);
+extern void insert_xblock_desc(struct pram_sb_info *sbi,
+			       struct pram_xblock_desc *desc);
+extern void mark_free_desc(struct pram_xblock_desc *desc);
+extern int put_xblock_desc(struct pram_sb_info *sbi,
+			   struct pram_xblock_desc *desc);
+extern void xblock_desc_init_always(struct pram_xblock_desc *desc);
+extern void xblock_desc_init_once(struct pram_xblock_desc *desc);
+extern void erase_tree(struct pram_sb_info *sbi,
+		       struct kmem_cache *);
+
+static inline void xblock_desc_init(struct pram_xblock_desc *desc)
+{
+	xblock_desc_init_always(desc);
+	xblock_desc_init_once(desc);
+}
+

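For readers who want to try the descriptor life cycle outside the kernel,
here is a minimal userspace sketch of the lookup/mark/put protocol used by
desctree.c. It is only an illustration under loud assumptions: a sorted
linked list stands in for the rb-tree, there is no locking, and put frees
the descriptor itself, whereas in the patch put_xblock_desc() only unlinks
it and the caller frees it. The names struct desc, desc_lookup(),
desc_mark_free() and desc_put() are hypothetical and not part of pramfs.

```c
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct pram_xblock_desc. */
struct desc {
	unsigned long blocknr;	/* lookup key */
	int refcount;		/* number of users */
	int freeing;		/* set by desc_mark_free(), acted on by desc_put() */
	struct desc *next;	/* sorted list replaces the rb-tree in this sketch */
};

/* Find blocknr and take a reference; optionally create it when missing. */
static struct desc *desc_lookup(struct desc **head, unsigned long blocknr,
				int create)
{
	struct desc **p = head, *d;

	while (*p && (*p)->blocknr < blocknr)
		p = &(*p)->next;
	if (*p && (*p)->blocknr == blocknr) {
		(*p)->refcount++;
		return *p;
	}
	if (!create)
		return NULL;	/* a failed lookup must not return a neighbour */
	d = calloc(1, sizeof(*d));
	if (!d)
		return NULL;
	d->blocknr = blocknr;
	d->refcount = 1;
	d->next = *p;		/* link in at the sorted position */
	*p = d;
	return d;
}

/* Request deletion; it happens when the last reference is dropped. */
static void desc_mark_free(struct desc *d)
{
	d->freeing = 1;
}

/* Drop a reference. Returns 0 if the descriptor was deleted, 1 otherwise. */
static int desc_put(struct desc **head, struct desc *d)
{
	struct desc **p = head;

	if (!d || --d->refcount > 0 || !d->freeing)
		return 1;
	while (*p != d)
		p = &(*p)->next;
	*p = d->next;
	free(d);
	return 0;
}
```

A descriptor that is marked while two users hold it survives the first put
and is deleted by the second, which is the behaviour desc_put() in xattr.c
relies on before it frees the block.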

* [PATCH 12/17] pramfs: extended attributes
From: Marco Stornelli @ 2011-01-06 12:03 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

Extended attributes operations.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
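A note on the space accounting in pram_xattr_set(): entries grow down from
the header, values grow up from the end of the block, and four null bytes
terminate the entry table. The userspace sketch below mirrors that
arithmetic. It is only an illustration: the 4-byte padding, the helper
names and the sizes used in the example are assumptions of mine, since
PRAM_XATTR_PAD, PRAM_XATTR_LEN() and PRAM_XATTR_SIZE() are defined in
xattr.h and not shown here.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAD 4u	/* assumed padding; the real value is PRAM_XATTR_PAD */
#define SKETCH_ROUND_UP(n) \
	(((size_t)(n) + SKETCH_PAD - 1) & ~(size_t)(SKETCH_PAD - 1))

/*
 * Bytes left between the end of the entry table and the lowest value.
 * min_offs:    offset of the first (lowest) value in the block
 * entries_end: offset just past the last entry descriptor
 * The four null bytes that terminate the entry table are reserved.
 */
static size_t xattr_free_space(size_t min_offs, size_t entries_end)
{
	return min_offs - entries_end - sizeof(uint32_t);
}

/*
 * Non-zero if a new attribute fits: a padded entry descriptor
 * (fixed part plus name) and a padded value must both be stored.
 */
static int xattr_fits(size_t free_bytes, size_t entry_fixed_size,
		      size_t name_len, size_t value_len)
{
	return free_bytes >= SKETCH_ROUND_UP(entry_fixed_size + name_len) +
			     SKETCH_ROUND_UP(value_len);
}
```

For example, with a hypothetical 1024-byte block and a 32-byte header, an
empty block has 1024 - 32 - 4 = 988 free bytes. A 5-byte name with a
16-byte fixed entry part and a 10-byte value needs 24 + 12 = 36 bytes.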
diff --git a/fs/pramfs/xattr.c b/fs/pramfs/xattr.c
new file mode 100644
index 0000000..44d158e
--- /dev/null
+++ b/fs/pramfs/xattr.c
@@ -0,0 +1,1104 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Extended attributes operations.
+ *
+ * Copyright 2010 Marco Stornelli <marco.stornelli@gmail.com>
+ *
+ * based on fs/ext2/xattr.c with the following copyright:
+ *
+ * Fix by Harrison Xing <harrison@mountainviewdata.com>.
+ * Extended attributes for symlinks and special files added per
+ *  suggestion of Luka Renko <luka.renko@hermes.si>.
+ * xattr consolidation Copyright (c) 2004 James Morris <jmorris@redhat.com>,
+ *  Red Hat Inc.
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+/*
+ * Extended attributes are stored in blocks allocated outside of
+ * any inode. The i_xattr field is then made to point to this allocated
+ * block. If all extended attributes of an inode are identical, these
+ * inodes may share the same extended attribute block. Such situations
+ * are automatically detected by keeping a cache of recent attribute block
+ * numbers and hashes over the block's contents in memory.
+ *
+ *
+ * Extended attribute block layout:
+ *
+ *   +------------------+
+ *   | header           |
+ *   | entry 1          | |
+ *   | entry 2          | | growing downwards
+ *   | entry 3          | v
+ *   | four null bytes  |
+ *   | . . .            |
+ *   | value 1          | ^
+ *   | value 3          | | growing upwards
+ *   | value 2          | |
+ *   +------------------+
+ *
+ * The block header is followed by multiple entry descriptors. These entry
+ * descriptors are variable in size, and aligned to PRAM_XATTR_PAD
+ * byte boundaries. The entry descriptors are sorted by attribute name,
+ * so that two extended attribute blocks can be compared efficiently.
+ *
+ * Attribute values are aligned to the end of the block, stored in
+ * no specific order. They are also padded to PRAM_XATTR_PAD byte
+ * boundaries. No additional gaps are left between them.
+ *
+ * Locking strategy
+ * ----------------
+ * pi->i_xattr is protected by PRAM_I(inode)->xattr_sem.
+ * EA blocks are only changed if they are exclusive to an inode, so
+ * holding xattr_sem also means that nothing but the EA block's reference
+ * count will change. Multiple writers to an EA block are synchronized
+ * by the mutex in each block descriptor. Block descriptors are kept in a
+ * red-black tree keyed by the absolute block number.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/mbcache.h>
+#include <linux/rwsem.h>
+#include <linux/security.h>
+#include "pram.h"
+#include "xattr.h"
+#include "acl.h"
+#include "desctree.h"
+
+#define HDR(bp) ((struct pram_xattr_header *)(bp))
+#define ENTRY(ptr) ((struct pram_xattr_entry *)(ptr))
+#define FIRST_ENTRY(bh) ENTRY(HDR(bh)+1)
+#define IS_LAST_ENTRY(entry) (*(__u32 *)(entry) == 0)
+#define GET_DESC(sbi, blocknr) lookup_xblock_desc(sbi, blocknr, pram_xblock_desc_cache, 1)
+#define LOOKUP_DESC(sbi, blocknr) lookup_xblock_desc(sbi, blocknr, NULL, 0)
+
+#ifdef PRAM_XATTR_DEBUG
+# define ea_idebug(inode, f...) do { \
+		printk(KERN_DEBUG "inode %ld: ", inode->i_ino); \
+		printk(f); \
+		printk("\n"); \
+	} while (0)
+# define ea_bdebug(blocknr, f...) do { \
+		printk(KERN_DEBUG "block %lu: ", blocknr); \
+		printk(f); \
+		printk("\n"); \
+	} while (0)
+#else
+# define ea_idebug(f...)
+# define ea_bdebug(f...)
+#endif
+
+static int pram_xattr_set2(struct inode *, char *, struct pram_xblock_desc *, struct pram_xattr_header *);
+
+static int pram_xattr_cache_insert(struct super_block *sb, unsigned long blocknr, u32 xhash);
+static struct pram_xblock_desc *pram_xattr_cache_find(struct inode *,
+						 struct pram_xattr_header *);
+static void pram_xattr_rehash(struct pram_xattr_header *,
+			      struct pram_xattr_entry *);
+
+static struct mb_cache *pram_xattr_cache;
+static struct kmem_cache *pram_xblock_desc_cache;
+
+static const struct xattr_handler *pram_xattr_handler_map[] = {
+	[PRAM_XATTR_INDEX_USER]		     = &pram_xattr_user_handler,
+#ifdef CONFIG_PRAMFS_POSIX_ACL
+	[PRAM_XATTR_INDEX_POSIX_ACL_ACCESS]  = &pram_xattr_acl_access_handler,
+	[PRAM_XATTR_INDEX_POSIX_ACL_DEFAULT] = &pram_xattr_acl_default_handler,
+#endif
+	[PRAM_XATTR_INDEX_TRUSTED]	     = &pram_xattr_trusted_handler,
+#ifdef CONFIG_PRAMFS_SECURITY
+	[PRAM_XATTR_INDEX_SECURITY]	     = &pram_xattr_security_handler,
+#endif
+};
+
+const struct xattr_handler *pram_xattr_handlers[] = {
+	&pram_xattr_user_handler,
+	&pram_xattr_trusted_handler,
+#ifdef CONFIG_PRAMFS_POSIX_ACL
+	&pram_xattr_acl_access_handler,
+	&pram_xattr_acl_default_handler,
+#endif
+#ifdef CONFIG_PRAMFS_SECURITY
+	&pram_xattr_security_handler,
+#endif
+	NULL
+};
+
+static void desc_put(struct super_block *sb, struct pram_xblock_desc *desc)
+{
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+	if (!put_xblock_desc(sbi, desc)) {
+		/* Ok we can free the block and its descriptor */
+		pram_dbg("freeing block %lu and its descriptor", desc->blocknr);
+		pram_free_block(sb, desc->blocknr);
+		kmem_cache_free(pram_xblock_desc_cache, desc);
+	}
+}
+
+static inline const struct xattr_handler *pram_xattr_handler(int name_index)
+{
+	const struct xattr_handler *handler = NULL;
+
+	if (name_index > 0 && name_index < ARRAY_SIZE(pram_xattr_handler_map))
+		handler = pram_xattr_handler_map[name_index];
+	return handler;
+}
+
+/*
+ * pram_xattr_get()
+ *
+ * Copy an extended attribute into the buffer
+ * provided, or compute the buffer size required.
+ * Buffer is NULL to compute the size of the buffer required.
+ *
+ * Returns a negative error number on failure, or the number of bytes
+ * used / required on success.
+ */
+int pram_xattr_get(struct inode *inode, int name_index, const char *name,
+	       void *buffer, size_t buffer_size)
+{
+	char *bp = NULL;
+	struct pram_xattr_entry *entry;
+	struct pram_xblock_desc *desc;
+	struct pram_inode *pi;
+	size_t name_len, size;
+	char *end;
+	int error = 0;
+	unsigned long blocknr;
+	struct super_block *sb = inode->i_sb;
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+
+	ea_idebug(inode, "name=%d<%s>, buffer=%p, buffer_size=%ld",
+		  name_index, name, buffer, (long)buffer_size);
+
+	pi = pram_get_inode(sb, inode->i_ino);
+	if (!pi)
+		return -EINVAL;
+	if (name == NULL)
+		return -EINVAL;
+	down_read(&PRAM_I(inode)->xattr_sem);
+	error = -ENODATA;
+	if (!pi->i_xattr)
+		goto cleanup;
+	ea_idebug(inode, "reading block %llu", be64_to_cpu(pi->i_xattr));
+	bp = pram_get_block(sb, be64_to_cpu(pi->i_xattr));
+	error = -EIO;
+	if (!bp)
+		goto cleanup;
+	end = bp + sb->s_blocksize;
+	blocknr = pram_get_blocknr(sb, be64_to_cpu(pi->i_xattr));
+	ea_bdebug(blocknr, "refcount=%d", be32_to_cpu(HDR(bp)->h_refcount));
+	if (HDR(bp)->h_magic != cpu_to_be32(PRAM_XATTR_MAGIC)) {
+bad_block:	pram_err(sb, "inode %ld: bad block %llu", inode->i_ino,
+			be64_to_cpu(pi->i_xattr));
+		error = -EIO;
+		goto cleanup;
+	}
+	/* find named attribute */
+	name_len = strlen(name);
+
+	error = -ERANGE;
+	if (name_len > 255)
+		goto cleanup;
+	entry = FIRST_ENTRY(bp);
+	while (!IS_LAST_ENTRY(entry)) {
+		struct pram_xattr_entry *next =
+			PRAM_XATTR_NEXT(entry);
+		if ((char *)next >= end)
+			goto bad_block;
+		if (name_index == entry->e_name_index &&
+		    name_len == entry->e_name_len &&
+		    memcmp(name, entry->e_name, name_len) == 0)
+			goto found;
+		entry = next;
+	}
+	/* Check the remaining name entries */
+	while (!IS_LAST_ENTRY(entry)) {
+		struct pram_xattr_entry *next =
+			PRAM_XATTR_NEXT(entry);
+		if ((char *)next >= end)
+			goto bad_block;
+		entry = next;
+	}
+
+	desc = GET_DESC(sbi, blocknr);
+	if (IS_ERR(desc)) {
+		error = -ENOMEM;
+		goto cleanup;
+	}
+	desc_put(sb, desc);
+	if (pram_xattr_cache_insert(sb, blocknr,
+					be32_to_cpu(HDR(bp)->h_hash)))
+		ea_idebug(inode, "cache insert failed");
+	error = -ENODATA;
+	goto cleanup;
+found:
+	/* check the buffer size */
+	if (entry->e_value_block != 0)
+		goto bad_block;
+	size = be32_to_cpu(entry->e_value_size);
+	if (size > inode->i_sb->s_blocksize ||
+	    be16_to_cpu(entry->e_value_offs) + size > inode->i_sb->s_blocksize)
+		goto bad_block;
+
+	desc = GET_DESC(sbi, blocknr);
+	if (IS_ERR(desc)) {
+		error = -ENOMEM;
+		goto cleanup;
+	}
+	desc_put(sb, desc);
+	if (pram_xattr_cache_insert(sb, blocknr,
+					be32_to_cpu(HDR(bp)->h_hash)))
+		ea_idebug(inode, "cache insert failed");
+	if (buffer) {
+		error = -ERANGE;
+		if (size > buffer_size)
+			goto cleanup;
+		/* return value of attribute */
+		memcpy(buffer, bp + be16_to_cpu(entry->e_value_offs),
+			size);
+	}
+	error = size;
+
+cleanup:
+	up_read(&PRAM_I(inode)->xattr_sem);
+
+	return error;
+}
+
+/*
+ * pram_xattr_list()
+ *
+ * Copy a list of attribute names into the buffer
+ * provided, or compute the buffer size required.
+ * Buffer is NULL to compute the size of the buffer required.
+ *
+ * Returns a negative error number on failure, or the number of bytes
+ * used / required on success.
+ */
+static int pram_xattr_list(struct dentry *dentry, char *buffer, size_t buffer_size)
+{
+	struct inode *inode = dentry->d_inode;
+	char *bp = NULL;
+	struct pram_xattr_entry *entry;
+	struct pram_xblock_desc *desc;
+	struct pram_inode *pi;
+	char *end;
+	size_t rest = buffer_size;
+	int error = 0;
+	unsigned long blocknr;
+	struct super_block *sb = inode->i_sb;
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+
+	ea_idebug(inode, "buffer=%p, buffer_size=%ld",
+		  buffer, (long)buffer_size);
+
+	pi = pram_get_inode(sb, inode->i_ino);
+	if (!pi)
+		return error;
+	down_read(&PRAM_I(inode)->xattr_sem);
+	error = 0;
+	if (!pi->i_xattr)
+		goto cleanup;
+	ea_idebug(inode, "reading block %llu", be64_to_cpu(pi->i_xattr));
+	bp = pram_get_block(sb, be64_to_cpu(pi->i_xattr));
+	blocknr = pram_get_blocknr(sb, be64_to_cpu(pi->i_xattr));
+	error = -EIO;
+	if (!bp)
+		goto cleanup;
+	ea_bdebug(blocknr, "refcount=%d", be32_to_cpu(HDR(bp)->h_refcount));
+	end = bp + sb->s_blocksize;
+	if (HDR(bp)->h_magic != cpu_to_be32(PRAM_XATTR_MAGIC)) {
+bad_block:	pram_err(sb, "inode %ld: bad block %llu", inode->i_ino,
+			be64_to_cpu(pi->i_xattr));
+		error = -EIO;
+		goto cleanup;
+	}
+
+	/* check the on-disk data structure */
+	entry = FIRST_ENTRY(bp);
+	while (!IS_LAST_ENTRY(entry)) {
+		struct pram_xattr_entry *next = PRAM_XATTR_NEXT(entry);
+
+		if ((char *)next >= end)
+			goto bad_block;
+		entry = next;
+	}
+
+	desc = GET_DESC(sbi, blocknr);
+	if (IS_ERR(desc)) {
+		error = -ENOMEM;
+		goto cleanup;
+	}
+	desc_put(sb, desc);
+	if (pram_xattr_cache_insert(sb, blocknr,
+					be32_to_cpu(HDR(bp)->h_hash)))
+		ea_idebug(inode, "cache insert failed");
+
+	/* list the attribute names */
+	for (entry = FIRST_ENTRY(bp); !IS_LAST_ENTRY(entry);
+	     entry = PRAM_XATTR_NEXT(entry)) {
+		const struct xattr_handler *handler =
+			pram_xattr_handler(entry->e_name_index);
+
+		if (handler) {
+			size_t size = handler->list(dentry, buffer, rest,
+						    entry->e_name,
+						    entry->e_name_len,
+						    handler->flags);
+			if (buffer) {
+				if (size > rest) {
+					error = -ERANGE;
+					goto cleanup;
+				}
+				buffer += size;
+			}
+			rest -= size;
+		}
+	}
+	error = buffer_size - rest;  /* total size */
+
+cleanup:
+	up_read(&PRAM_I(inode)->xattr_sem);
+
+	return error;
+}
+
+/*
+ * Inode operation listxattr()
+ *
+ * dentry->d_inode->i_mutex: don't care
+ */
+ssize_t pram_listxattr(struct dentry *dentry, char *buffer, size_t size)
+{
+	return pram_xattr_list(dentry, buffer, size);
+}
+
+/*
+ * pram_xattr_set()
+ *
+ * Create, replace or remove an extended attribute for this inode. Buffer
+ * is NULL to remove an existing extended attribute, and non-NULL to
+ * either replace an existing extended attribute, or create a new extended
+ * attribute. The flags XATTR_REPLACE and XATTR_CREATE
+ * specify that an extended attribute must exist and must not exist
+ * previous to the call, respectively.
+ *
+ * Returns 0, or a negative error number on failure.
+ */
+int pram_xattr_set(struct inode *inode, int name_index, const char *name,
+	       const void *value, size_t value_len, int flags)
+{
+	struct super_block *sb = inode->i_sb;
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+	struct pram_xattr_header *header = NULL;
+	struct pram_xattr_entry *here, *last;
+	struct pram_inode *pi;
+	struct pram_xblock_desc *desc = NULL;
+	size_t name_len, free, min_offs = sb->s_blocksize;
+	int not_found = 1, error;
+	char *end;
+	char *bp = NULL;
+	unsigned long blocknr = 0;
+
+	/*
+	 * header -- Points either into bp, or to a temporarily
+	 *           allocated buffer.
+	 * here -- The named entry found, or the place for inserting, within
+	 *         the block pointed to by header.
+	 * last -- Points right after the last named entry within the block
+	 *         pointed to by header.
+	 * min_offs -- The offset of the first value (values are aligned
+	 *             towards the end of the block).
+	 * end -- Points right after the block pointed to by header.
+	 */
+
+	ea_idebug(inode, "name=%d.%s, value=%p, value_len=%ld",
+		  name_index, name, value, (long)value_len);
+
+	if (value == NULL)
+		value_len = 0;
+	if (name == NULL)
+		return -EINVAL;
+	name_len = strlen(name);
+	if (name_len > 255 || value_len > sb->s_blocksize)
+		return -ERANGE;
+	pi = pram_get_inode(sb, inode->i_ino);
+	if (!pi)
+		return -EINVAL;
+	down_write(&PRAM_I(inode)->xattr_sem);
+	if (pi->i_xattr) {
+		/* The inode already has an extended attribute block. */
+		bp = pram_get_block(sb, be64_to_cpu(pi->i_xattr));
+		error = -EIO;
+		if (!bp)
+			goto cleanup;
+		blocknr = pram_get_blocknr(sb, be64_to_cpu(pi->i_xattr));
+		ea_bdebug(blocknr, "refcount=%d", be32_to_cpu(HDR(bp)->h_refcount));
+		header = HDR(bp);
+		end = bp + sb->s_blocksize;
+		if (header->h_magic != cpu_to_be32(PRAM_XATTR_MAGIC)) {
+bad_block:		pram_err(sb, "inode %ld: bad block %llu", inode->i_ino,
+				   be64_to_cpu(pi->i_xattr));
+			error = -EIO;
+			goto cleanup;
+		}
+		/* Find the named attribute. */
+		here = FIRST_ENTRY(bp);
+		while (!IS_LAST_ENTRY(here)) {
+			struct pram_xattr_entry *next = PRAM_XATTR_NEXT(here);
+			if ((char *)next >= end)
+				goto bad_block;
+			if (!here->e_value_block && here->e_value_size) {
+				size_t offs = be16_to_cpu(here->e_value_offs);
+				if (offs < min_offs)
+					min_offs = offs;
+			}
+			not_found = name_index - here->e_name_index;
+			if (!not_found)
+				not_found = name_len - here->e_name_len;
+			if (!not_found)
+				not_found = memcmp(name, here->e_name, name_len);
+			if (not_found <= 0)
+				break;
+			here = next;
+		}
+		last = here;
+		/* We still need to compute min_offs and last. */
+		while (!IS_LAST_ENTRY(last)) {
+			struct pram_xattr_entry *next = PRAM_XATTR_NEXT(last);
+			if ((char *)next >= end)
+				goto bad_block;
+			if (!last->e_value_block && last->e_value_size) {
+				size_t offs = be16_to_cpu(last->e_value_offs);
+				if (offs < min_offs)
+					min_offs = offs;
+			}
+			last = next;
+		}
+
+		/* Check whether we have enough space left. */
+		free = min_offs - ((char *)last - (char *)header) - sizeof(__u32);
+	} else {
+		/* We will use a new extended attribute block. */
+		free = sb->s_blocksize -
+			sizeof(struct pram_xattr_header) - sizeof(__u32);
+		here = last = NULL;  /* avoid gcc uninitialized warning. */
+	}
+
+	if (not_found) {
+		/* Request to remove a nonexistent attribute? */
+		error = -ENODATA;
+		if (flags & XATTR_REPLACE)
+			goto cleanup;
+		error = 0;
+		if (value == NULL)
+			goto cleanup;
+	} else {
+		/* Request to create an existing attribute? */
+		error = -EEXIST;
+		if (flags & XATTR_CREATE)
+			goto cleanup;
+		if (!here->e_value_block && here->e_value_size) {
+			size_t size = be32_to_cpu(here->e_value_size);
+
+			if (be16_to_cpu(here->e_value_offs) + size >
+			    sb->s_blocksize || size > sb->s_blocksize)
+				goto bad_block;
+			free += PRAM_XATTR_SIZE(size);
+		}
+		free += PRAM_XATTR_LEN(name_len);
+	}
+	error = -ENOSPC;
+	if (free < PRAM_XATTR_LEN(name_len) + PRAM_XATTR_SIZE(value_len))
+		goto cleanup;
+
+	/* Here we know that we can set the new attribute. */
+
+	if (header) {
+		struct mb_cache_entry *ce;
+
+		desc = GET_DESC(sbi, blocknr);
+		if (IS_ERR(desc)) {
+			error = -ENOMEM;
+			goto cleanup;
+		}
+
+		/* assert(header == HDR(bp)); */
+		ce = mb_cache_entry_get(pram_xattr_cache, (struct block_device *)sbi,
+							blocknr);
+		mutex_lock(&desc->lock);
+		pram_memunlock_block(sb, bp);
+		if (header->h_refcount == cpu_to_be32(1)) {
+			ea_bdebug(blocknr, "modifying in-place");
+			if (ce)
+				mb_cache_entry_free(ce);
+			/* keep it locked while modifying it. */
+		} else {
+			int offset;
+
+			if (ce)
+				mb_cache_entry_release(ce);
+			pram_memlock_block(sb, bp);
+			mutex_unlock(&desc->lock);
+			ea_bdebug(desc->blocknr, "cloning");
+			header = kmalloc(inode->i_sb->s_blocksize, GFP_KERNEL);
+			error = -ENOMEM;
+			if (header == NULL)
+				goto cleanup;
+			memcpy(header, HDR(bp), inode->i_sb->s_blocksize);
+			header->h_refcount = cpu_to_be32(1);
+
+			offset = (char *)here - bp;
+			here = ENTRY((char *)header + offset);
+			offset = (char *)last - bp;
+			last = ENTRY((char *)header + offset);
+		}
+	} else {
+		/* Allocate a buffer where we construct the new block. */
+		header = kzalloc(sb->s_blocksize, GFP_KERNEL);
+		error = -ENOMEM;
+		if (header == NULL)
+			goto cleanup;
+		end = (char *)header + sb->s_blocksize;
+		header->h_magic = cpu_to_be32(PRAM_XATTR_MAGIC);
+		header->h_refcount = cpu_to_be32(1);
+		last = here = ENTRY(header+1);
+	}
+
+	/* Iff we are modifying the block in-place, the block is locked here. */
+
+	if (not_found) {
+		/* Insert the new name. */
+		size_t size = PRAM_XATTR_LEN(name_len);
+		size_t rest = (char *)last - (char *)here;
+		memmove((char *)here + size, here, rest);
+		memset(here, 0, size);
+		here->e_name_index = name_index;
+		here->e_name_len = name_len;
+		memcpy(here->e_name, name, name_len);
+	} else {
+		if (!here->e_value_block && here->e_value_size) {
+			char *first_val = (char *)header + min_offs;
+			size_t offs = be16_to_cpu(here->e_value_offs);
+			char *val = (char *)header + offs;
+			size_t size = PRAM_XATTR_SIZE(
+				be32_to_cpu(here->e_value_size));
+
+			if (size == PRAM_XATTR_SIZE(value_len)) {
+				/* The old and the new value have the same
+				   size. Just replace. */
+				here->e_value_size = cpu_to_be32(value_len);
+				memset(val + size - PRAM_XATTR_PAD, 0,
+				       PRAM_XATTR_PAD); /* Clear pad bytes. */
+				memcpy(val, value, value_len);
+				goto skip_replace;
+			}
+
+			/* Remove the old value. */
+			memmove(first_val + size, first_val, val - first_val);
+			memset(first_val, 0, size);
+			here->e_value_offs = 0;
+			min_offs += size;
+
+			/* Adjust all value offsets. */
+			last = ENTRY(header+1);
+			while (!IS_LAST_ENTRY(last)) {
+				size_t o = be16_to_cpu(last->e_value_offs);
+				if (!last->e_value_block && o < offs)
+					last->e_value_offs =
+						cpu_to_be16(o + size);
+				last = PRAM_XATTR_NEXT(last);
+			}
+		}
+		if (value == NULL) {
+			/* Remove the old name. */
+			size_t size = PRAM_XATTR_LEN(name_len);
+			last = ENTRY((char *)last - size);
+			memmove(here, (char *)here + size,
+				(char *)last - (char *)here);
+			memset(last, 0, size);
+		}
+	}
+
+	if (value != NULL) {
+		/* Insert the new value. */
+		here->e_value_size = cpu_to_be32(value_len);
+		if (value_len) {
+			size_t size = PRAM_XATTR_SIZE(value_len);
+			char *val = (char *)header + min_offs - size;
+			here->e_value_offs =
+				cpu_to_be16((char *)val - (char *)header);
+			memset(val + size - PRAM_XATTR_PAD, 0,
+			       PRAM_XATTR_PAD); /* Clear the pad bytes. */
+			memcpy(val, value, value_len);
+		}
+	}
+
+skip_replace:
+	if (IS_LAST_ENTRY(ENTRY(header+1))) {
+		/* This block is now empty. */
+		if (bp && header == HDR(bp)) {
+			/* we were modifying in-place. */
+			pram_memlock_block(sb, bp);
+			mutex_unlock(&desc->lock);
+		}
+		error = pram_xattr_set2(inode, bp, desc, NULL);
+	} else {
+		pram_xattr_rehash(header, here);
+		if (bp && header == HDR(bp)) {
+			/* we were modifying in-place. */
+			pram_memlock_block(sb, bp);
+			mutex_unlock(&desc->lock);
+		}
+		error = pram_xattr_set2(inode, bp, desc, header);
+	}
+
+cleanup:
+	desc_put(sb, IS_ERR(desc) ? NULL : desc);
+	if (!(bp && header == HDR(bp)))
+		kfree(header);
+	up_write(&PRAM_I(inode)->xattr_sem);
+
+	return error;
+}
+
+/*
+ * Second half of pram_xattr_set(): Update the file system.
+ */
+static int pram_xattr_set2(struct inode *inode, char *old_bp,
+			   struct pram_xblock_desc *old_desc,
+			   struct pram_xattr_header *header)
+{
+	struct super_block *sb = inode->i_sb;
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+	struct pram_xblock_desc *new_desc = NULL;
+	unsigned long blocknr;
+	struct pram_inode *pi;
+	int error;
+	char *new_bp = NULL;
+
+	if (header) {
+		new_desc = pram_xattr_cache_find(inode, header);
+		if (new_desc) {
+			new_bp = pram_get_block(sb, pram_get_block_off(sb, new_desc->blocknr));
+			/* We found an identical block in the cache. */
+			if (new_bp == old_bp) {
+				ea_bdebug(new_desc->blocknr, "keeping this block");
+			} else {
+				/* The old block is released after updating
+				   the inode.  */
+				ea_bdebug(new_desc->blocknr, "reusing block");
+				pram_memunlock_block(sb, new_bp);
+				be32_add_cpu(&HDR(new_bp)->h_refcount, 1);
+				pram_memlock_block(sb, new_bp);
+				ea_bdebug(new_desc->blocknr, "refcount now=%d",
+					be32_to_cpu(HDR(new_bp)->h_refcount));
+			}
+			blocknr = new_desc->blocknr;
+			mutex_unlock(&new_desc->lock);
+			desc_put(sb, new_desc);
+		} else if (old_bp && header == HDR(old_bp)) {
+			/* Keep this block. No need to lock the block as we
+			   don't need to change the reference count. */
+			new_bp = old_bp;
+			pram_xattr_cache_insert(sb, old_desc->blocknr, HDR(new_bp)->h_hash);
+			blocknr = old_desc->blocknr;
+		} else {
+			/* We need to allocate a new block */
+			struct pram_xblock_desc *new_desc;
+
+			error = pram_new_block(sb, &blocknr, 1);
+			if (error)
+				goto out;
+			ea_idebug(inode, "creating block %lu", blocknr);
+			new_desc = kmem_cache_alloc(pram_xblock_desc_cache, GFP_KERNEL);
+			if (!new_desc) {
+				pram_free_block(sb, blocknr);
+				error = -ENOMEM;
+				goto out;
+			}
+			xblock_desc_init_always(new_desc);
+			new_desc->blocknr = blocknr;
+			new_bp = pram_get_block(sb, pram_get_block_off(sb, blocknr));
+			if (!new_bp) {
+				pram_free_block(sb, blocknr);
+				kmem_cache_free(pram_xblock_desc_cache, new_desc);
+				error = -EIO;
+				goto out;
+			}
+			pram_memunlock_block(sb, new_bp);
+			memcpy(new_bp, header, sb->s_blocksize);
+			pram_memlock_block(sb, new_bp);
+			insert_xblock_desc(sbi, new_desc);
+			pram_xattr_cache_insert(sb, new_desc->blocknr, HDR(new_bp)->h_hash);
+		}
+	}
+
+	/* Update the inode. */
+	pi = pram_get_inode(sb, inode->i_ino);
+	pram_memunlock_inode(sb, pi);
+	pi->i_xattr = new_bp ? cpu_to_be64(pram_get_block_off(sb, blocknr)) : 0;
+	inode->i_ctime = CURRENT_TIME_SEC;
+	pi->i_ctime = cpu_to_be32(inode->i_ctime.tv_sec);
+	pram_memlock_inode(sb, pi);
+
+	error = 0;
+	if (old_bp && old_bp != new_bp) {
+		struct mb_cache_entry *ce;
+
+		/* Here old_desc MUST be valid or we have a bug */
+		BUG_ON(!old_desc);
+
+		/*
+		 * If there was an old block and we are no longer using it,
+		 * release the old block.
+		 */
+		ce = mb_cache_entry_get(pram_xattr_cache, (struct block_device *)sbi,
+					old_desc->blocknr);
+		mutex_lock(&old_desc->lock);
+		if (HDR(old_bp)->h_refcount == cpu_to_be32(1)) {
+			/* Free the old block. */
+			if (ce)
+				mb_cache_entry_free(ce);
+			ea_bdebug(old_desc->blocknr, "freeing");
+			mutex_unlock(&old_desc->lock);
+			/* Caller will call desc_put later */
+			mark_free_desc(old_desc);
+		} else {
+			/* Decrement the refcount only. */
+			pram_memunlock_block(sb, old_bp);
+			be32_add_cpu(&HDR(old_bp)->h_refcount, -1);
+			pram_memlock_block(sb, old_bp);
+			if (ce)
+				mb_cache_entry_release(ce);
+			ea_bdebug(old_desc->blocknr, "refcount now=%d",
+				be32_to_cpu(HDR(old_bp)->h_refcount));
+			mutex_unlock(&old_desc->lock);
+		}
+	}
+
+out:
+	return error;
+}
+
+/*
+ * pram_xattr_delete_inode()
+ *
+ * Free extended attribute resources associated with this inode. This
+ * is called immediately before an inode is freed.
+ */
+void pram_xattr_delete_inode(struct inode *inode)
+{
+	char *bp = NULL;
+	struct mb_cache_entry *ce;
+	struct pram_inode *pi;
+	struct pram_xblock_desc *desc;
+	struct super_block *sb = inode->i_sb;
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+	unsigned long blocknr;
+
+	pi = pram_get_inode(sb, inode->i_ino);
+	if (!pi)
+		return;	/* xattr_sem is not held yet */
+	down_write(&PRAM_I(inode)->xattr_sem);
+	if (!pi->i_xattr)
+		goto cleanup;
+	bp = pram_get_block(sb, be64_to_cpu(pi->i_xattr));
+	if (!bp) {
+		pram_err(sb, "inode %ld: block %llu read error", inode->i_ino,
+			be64_to_cpu(pi->i_xattr));
+		goto cleanup;
+	}
+	blocknr = pram_get_blocknr(sb, be64_to_cpu(pi->i_xattr));
+	if (HDR(bp)->h_magic != cpu_to_be32(PRAM_XATTR_MAGIC)) {
+		pram_err(sb, "inode %ld: bad block %llu", inode->i_ino,
+			be64_to_cpu(pi->i_xattr));
+		goto cleanup;
+	}
+	ce = mb_cache_entry_get(pram_xattr_cache, (struct block_device *)sbi, blocknr);
+	desc = GET_DESC(sbi, blocknr);
+	if (IS_ERR(desc))
+		goto cleanup;
+	mutex_lock(&desc->lock);
+	if (HDR(bp)->h_refcount == cpu_to_be32(1)) {
+		if (ce)
+			mb_cache_entry_free(ce);
+		mark_free_desc(desc);
+	} else {
+		be32_add_cpu(&HDR(bp)->h_refcount, -1);
+		if (ce)
+			mb_cache_entry_release(ce);
+		ea_bdebug(blocknr, "refcount now=%d",
+			be32_to_cpu(HDR(bp)->h_refcount));
+	}
+	mutex_unlock(&desc->lock);	/* taken above on both branches */
+	desc_put(sb, desc);
+
+cleanup:
+	up_write(&PRAM_I(inode)->xattr_sem);
+}
+
+/*
+ * pram_xattr_put_super()
+ *
+ * This is called when a file system is unmounted.
+ */
+void pram_xattr_put_super(struct super_block *sb)
+{
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+	/*
+	 * NOTE: we have no block device to use with mbcache. The mbcache
+	 * code never dereferences the pointer; it only uses the address
+	 * as a unique key, so it is safe to pass a "general purpose"
+	 * address. We use the superblock info data as the unique key.
+	 * It might be better to change the mbcache code to take a
+	 * generic void pointer as an identifier instead of a block
+	 * device.
+	mb_cache_shrink((struct block_device *)sbi);
+	erase_tree(sbi, pram_xblock_desc_cache);
+	kmem_cache_shrink(pram_xblock_desc_cache);
+}
+
+
+/*
+ * pram_xattr_cache_insert()
+ *
+ * Create a new entry in the extended attribute cache, and insert
+ * it unless such an entry is already in the cache.
+ *
+ * Returns 0, or a negative error number on failure.
+ */
+static int pram_xattr_cache_insert(struct super_block *sb, unsigned long blocknr, u32 xhash)
+{
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+	__u32 hash = be32_to_cpu(xhash);
+	struct mb_cache_entry *ce;
+	int error;
+
+	ce = mb_cache_entry_alloc(pram_xattr_cache, GFP_NOFS);
+	if (!ce)
+		return -ENOMEM;
+	error = mb_cache_entry_insert(ce, (struct block_device *)sbi, blocknr, hash);
+	if (error) {
+		mb_cache_entry_free(ce);
+		if (error == -EBUSY) {
+			ea_bdebug(blocknr, "already in cache");
+			error = 0;
+		}
+	} else {
+		ea_bdebug(blocknr, "inserting [%x]", (int)hash);
+		mb_cache_entry_release(ce);
+	}
+	return error;
+}
+
+/*
+ * pram_xattr_cmp()
+ *
+ * Compare two extended attribute blocks for equality.
+ *
+ * Returns 0 if the blocks are equal, 1 if they differ, and
+ * a negative error number on errors.
+ */
+static int pram_xattr_cmp(struct pram_xattr_header *header1,
+			  struct pram_xattr_header *header2)
+{
+	struct pram_xattr_entry *entry1, *entry2;
+
+	entry1 = ENTRY(header1+1);
+	entry2 = ENTRY(header2+1);
+	while (!IS_LAST_ENTRY(entry1)) {
+		if (IS_LAST_ENTRY(entry2))
+			return 1;
+		if (entry1->e_hash != entry2->e_hash ||
+		    entry1->e_name_index != entry2->e_name_index ||
+		    entry1->e_name_len != entry2->e_name_len ||
+		    entry1->e_value_size != entry2->e_value_size ||
+		    memcmp(entry1->e_name, entry2->e_name, entry1->e_name_len))
+			return 1;
+		if (entry1->e_value_block != 0 || entry2->e_value_block != 0)
+			return -EIO;
+		if (memcmp((char *)header1 + be16_to_cpu(entry1->e_value_offs),
+			   (char *)header2 + be16_to_cpu(entry2->e_value_offs),
+			   be32_to_cpu(entry1->e_value_size)))
+			return 1;
+
+		entry1 = PRAM_XATTR_NEXT(entry1);
+		entry2 = PRAM_XATTR_NEXT(entry2);
+	}
+	if (!IS_LAST_ENTRY(entry2))
+		return 1;
+	return 0;
+}
+
+/*
+ * pram_xattr_cache_find()
+ *
+ * Find an identical extended attribute block.
+ *
+ * Returns a locked extended block descriptor for the block found, or
+ * NULL if such a block was not found or an error occurred.
+ * The block, however, is not memory unlocked.
+ */
+static struct pram_xblock_desc *pram_xattr_cache_find(struct inode *inode, struct pram_xattr_header *header)
+{
+	__u32 hash = be32_to_cpu(header->h_hash);
+	struct mb_cache_entry *ce;
+	struct pram_xblock_desc *desc;
+	struct super_block *sb = inode->i_sb;
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+
+	if (!header->h_hash)
+		return NULL;  /* never share */
+	ea_idebug(inode, "looking for cached blocks [%x]", (int)hash);
+again:
+	ce = mb_cache_entry_find_first(pram_xattr_cache, (struct block_device *)sbi, hash);
+	while (ce) {
+		char *bp;
+
+		if (IS_ERR(ce)) {
+			if (PTR_ERR(ce) == -EAGAIN)
+				goto again;
+			break;
+		}
+
+		bp = pram_get_block(sb, pram_get_block_off(sb, (unsigned long)ce->e_block));
+		if (!bp) {
+			pram_err(sb, "inode %ld: block %ld read error",
+				inode->i_ino, (unsigned long) ce->e_block);
+		} else {
+			desc = LOOKUP_DESC(sbi, ce->e_block);
+			if (!desc) {
+				mb_cache_entry_release(ce);
+				return NULL;
+			}
+			mutex_lock(&desc->lock);
+			if (be32_to_cpu(HDR(bp)->h_refcount) >
+				   PRAM_XATTR_REFCOUNT_MAX) {
+				ea_idebug(inode, "block %ld refcount %d>%d",
+					  (unsigned long) ce->e_block,
+					  be32_to_cpu(HDR(bp)->h_refcount),
+					  PRAM_XATTR_REFCOUNT_MAX);
+			} else if (!pram_xattr_cmp(header, HDR(bp))) {
+				mb_cache_entry_release(ce);
+				return desc;
+			}
+			mutex_unlock(&desc->lock);
+		}
+		ce = mb_cache_entry_find_next(ce, (struct block_device *)sbi, hash);
+	}
+	return NULL;
+}
+
+#define NAME_HASH_SHIFT 5
+#define VALUE_HASH_SHIFT 16
+
+/*
+ * pram_xattr_hash_entry()
+ *
+ * Compute the hash of an extended attribute.
+ */
+static inline void pram_xattr_hash_entry(struct pram_xattr_header *header,
+					 struct pram_xattr_entry *entry)
+{
+	__u32 hash = 0;
+	char *name = entry->e_name;
+	int n;
+
+	for (n = 0; n < entry->e_name_len; n++) {
+		hash = (hash << NAME_HASH_SHIFT) ^
+		       (hash >> (8*sizeof(hash) - NAME_HASH_SHIFT)) ^
+		       *name++;
+	}
+
+	if (entry->e_value_block == 0 && entry->e_value_size != 0) {
+		__be32 *value = (__be32 *)((char *)header +
+			be16_to_cpu(entry->e_value_offs));
+		for (n = (be32_to_cpu(entry->e_value_size) +
+		     PRAM_XATTR_ROUND) >> PRAM_XATTR_PAD_BITS; n; n--) {
+			hash = (hash << VALUE_HASH_SHIFT) ^
+			       (hash >> (8*sizeof(hash) - VALUE_HASH_SHIFT)) ^
+			       be32_to_cpu(*value++);
+		}
+	}
+	entry->e_hash = cpu_to_be32(hash);
+}
+
+#undef NAME_HASH_SHIFT
+#undef VALUE_HASH_SHIFT
+
+#define BLOCK_HASH_SHIFT 16
+
+/*
+ * pram_xattr_rehash()
+ *
+ * Re-compute the extended attribute hash value after an entry has changed.
+ */
+static void pram_xattr_rehash(struct pram_xattr_header *header,
+			      struct pram_xattr_entry *entry)
+{
+	struct pram_xattr_entry *here;
+	__u32 hash = 0;
+
+	pram_xattr_hash_entry(header, entry);
+	here = ENTRY(header+1);
+	while (!IS_LAST_ENTRY(here)) {
+		if (!here->e_hash) {
+			/* Block is not shared if an entry's hash value == 0 */
+			hash = 0;
+			break;
+		}
+		hash = (hash << BLOCK_HASH_SHIFT) ^
+		       (hash >> (8*sizeof(hash) - BLOCK_HASH_SHIFT)) ^
+		       be32_to_cpu(here->e_hash);
+		here = PRAM_XATTR_NEXT(here);
+	}
+	header->h_hash = cpu_to_be32(hash);
+}
+
+#undef BLOCK_HASH_SHIFT
+
+static void init_xblock_desc_once(void *foo)
+{
+	struct pram_xblock_desc *desc = (struct pram_xblock_desc *) foo;
+
+	xblock_desc_init_once(desc);
+}
+
+int __init init_pram_xattr(void)
+{
+	int ret = 0;
+	pram_xattr_cache = mb_cache_create("pram_xattr", 6);
+	if (!pram_xattr_cache) {
+		ret = -ENOMEM;
+		goto fail1;
+	}
+
+	pram_xblock_desc_cache = kmem_cache_create("pram_xblock_desc",
+					     sizeof(struct pram_xblock_desc),
+					     0, (SLAB_RECLAIM_ACCOUNT|
+						SLAB_MEM_SPREAD),
+					     init_xblock_desc_once);
+	if (!pram_xblock_desc_cache) {
+		ret = -ENOMEM;
+		goto fail2;
+	}
+
+	return 0;
+fail2:
+	mb_cache_destroy(pram_xattr_cache);
+fail1:
+	return ret;
+}
+
+void exit_pram_xattr(void)
+{
+	mb_cache_destroy(pram_xattr_cache);
+	kmem_cache_destroy(pram_xblock_desc_cache);
+}
diff --git a/fs/pramfs/xattr.h b/fs/pramfs/xattr.h
new file mode 100644
index 0000000..bdbe250
--- /dev/null
+++ b/fs/pramfs/xattr.h
@@ -0,0 +1,131 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Extended attributes for the pram filesystem.
+ *
+ * Copyright 2010 Marco Stornelli <marco.stornelli@gmail.com>
+ *
+ * based on fs/ext2/xattr.h with the following copyright:
+ *
+ * (C) 2001 Andreas Gruenbacher, <a.gruenbacher@computer.org>
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/init.h>
+#include <linux/xattr.h>
+
+/* Magic value in attribute blocks */
+#define PRAM_XATTR_MAGIC		0x6d617270
+
+/* Maximum number of references to one attribute block */
+#define PRAM_XATTR_REFCOUNT_MAX		1024
+
+/* Name indexes */
+#define PRAM_XATTR_INDEX_USER			1
+#define PRAM_XATTR_INDEX_POSIX_ACL_ACCESS	2
+#define PRAM_XATTR_INDEX_POSIX_ACL_DEFAULT	3
+#define PRAM_XATTR_INDEX_TRUSTED		4
+#define PRAM_XATTR_INDEX_SECURITY		5
+
+struct pram_xattr_header {
+	__be32	h_magic;	/* magic number for identification */
+	__be32	h_refcount;	/* reference count */
+	__be32	h_hash;		/* hash value of all attributes */
+	__u32	h_reserved[4];	/* zero right now */
+};
+
+struct pram_xattr_entry {
+	__u8	e_name_len;	/* length of name */
+	__u8	e_name_index;	/* attribute name index */
+	__be16	e_value_offs;	/* offset in disk block of value */
+	__be32	e_value_block;	/* disk block attribute is stored on (n/i) */
+	__be32	e_value_size;	/* size of attribute value */
+	__be32	e_hash;		/* hash value of name and value */
+	char	e_name[0];	/* attribute name */
+};
+
+#define PRAM_XATTR_PAD_BITS		2
+#define PRAM_XATTR_PAD		(1<<PRAM_XATTR_PAD_BITS)
+#define PRAM_XATTR_ROUND		(PRAM_XATTR_PAD-1)
+#define PRAM_XATTR_LEN(name_len) \
+	(((name_len) + PRAM_XATTR_ROUND + \
+	sizeof(struct pram_xattr_entry)) & ~PRAM_XATTR_ROUND)
+#define PRAM_XATTR_NEXT(entry) \
+	((struct pram_xattr_entry *)( \
+	  (char *)(entry) + PRAM_XATTR_LEN((entry)->e_name_len)))
+#define PRAM_XATTR_SIZE(size) \
+	(((size) + PRAM_XATTR_ROUND) & ~PRAM_XATTR_ROUND)
+
+#ifdef CONFIG_PRAMFS_XATTR
+
+extern const struct xattr_handler pram_xattr_user_handler;
+extern const struct xattr_handler pram_xattr_trusted_handler;
+extern const struct xattr_handler pram_xattr_acl_access_handler;
+extern const struct xattr_handler pram_xattr_acl_default_handler;
+extern const struct xattr_handler pram_xattr_security_handler;
+
+extern ssize_t pram_listxattr(struct dentry *, char *, size_t);
+
+extern int pram_xattr_get(struct inode *, int, const char *, void *, size_t);
+extern int pram_xattr_set(struct inode *, int, const char *, const void *, size_t, int);
+
+extern void pram_xattr_delete_inode(struct inode *);
+extern void pram_xattr_put_super(struct super_block *);
+
+extern int init_pram_xattr(void) __init;
+extern void exit_pram_xattr(void);
+
+extern const struct xattr_handler *pram_xattr_handlers[];
+
+# else  /* CONFIG_PRAMFS_XATTR */
+
+static inline int
+pram_xattr_get(struct inode *inode, int name_index,
+	       const char *name, void *buffer, size_t size)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int
+pram_xattr_set(struct inode *inode, int name_index, const char *name,
+	       const void *value, size_t size, int flags)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline void
+pram_xattr_delete_inode(struct inode *inode)
+{
+}
+
+static inline void
+pram_xattr_put_super(struct super_block *sb)
+{
+}
+
+static inline int
+init_pram_xattr(void)
+{
+	return 0;
+}
+
+static inline void
+exit_pram_xattr(void)
+{
+}
+
+#define pram_xattr_handlers NULL
+
+# endif  /* CONFIG_PRAMFS_XATTR */
+
+#ifdef CONFIG_PRAMFS_SECURITY
+extern int pram_init_security(struct inode *inode, struct inode *dir);
+#else
+static inline int pram_init_security(struct inode *inode, struct inode *dir)
+{
+	return 0;
+}
+#endif
diff --git a/fs/pramfs/xattr_security.c b/fs/pramfs/xattr_security.c
new file mode 100644
index 0000000..e16d9ca
--- /dev/null
+++ b/fs/pramfs/xattr_security.c
@@ -0,0 +1,78 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Handler for storing security labels as extended attributes.
+ *
+ * Copyright 2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+#include <linux/fs.h>
+#include <linux/pram_fs.h>
+#include <linux/security.h>
+#include "xattr.h"
+
+static size_t pram_xattr_security_list(struct dentry *dentry, char *list,
+				       size_t list_size, const char *name,
+				       size_t name_len, int type)
+{
+	const int prefix_len = XATTR_SECURITY_PREFIX_LEN;
+	const size_t total_len = prefix_len + name_len + 1;
+
+	if (list && total_len <= list_size) {
+		memcpy(list, XATTR_SECURITY_PREFIX, prefix_len);
+		memcpy(list+prefix_len, name, name_len);
+		list[prefix_len + name_len] = '\0';
+	}
+	return total_len;
+}
+
+static int pram_xattr_security_get(struct dentry *dentry, const char *name,
+		       void *buffer, size_t size, int type)
+{
+	if (strcmp(name, "") == 0)
+		return -EINVAL;
+	return pram_xattr_get(dentry->d_inode, PRAM_XATTR_INDEX_SECURITY, name,
+			      buffer, size);
+}
+
+static int pram_xattr_security_set(struct dentry *dentry, const char *name,
+		const void *value, size_t size, int flags, int type)
+{
+	if (strcmp(name, "") == 0)
+		return -EINVAL;
+	return pram_xattr_set(dentry->d_inode, PRAM_XATTR_INDEX_SECURITY, name,
+			      value, size, flags);
+}
+
+int pram_init_security(struct inode *inode, struct inode *dir)
+{
+	int err;
+	size_t len;
+	void *value;
+	char *name;
+
+	err = security_inode_init_security(inode, dir, &name, &value, &len);
+	if (err) {
+		if (err == -EOPNOTSUPP)
+			return 0;
+		return err;
+	}
+	err = pram_xattr_set(inode, PRAM_XATTR_INDEX_SECURITY,
+			     name, value, len, 0);
+	kfree(name);
+	kfree(value);
+	return err;
+}
+
+const struct xattr_handler pram_xattr_security_handler = {
+	.prefix	= XATTR_SECURITY_PREFIX,
+	.list	= pram_xattr_security_list,
+	.get	= pram_xattr_security_get,
+	.set	= pram_xattr_security_set,
+};
diff --git a/fs/pramfs/xattr_trusted.c b/fs/pramfs/xattr_trusted.c
new file mode 100644
index 0000000..62127d7
--- /dev/null
+++ b/fs/pramfs/xattr_trusted.c
@@ -0,0 +1,65 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Handler for trusted extended attributes.
+ *
+ * Copyright 2010 Marco Stornelli <marco.stornelli@gmail.com>
+ *
+ * based on fs/ext2/xattr_trusted.c with the following copyright:
+ *
+ * Copyright (C) 2003 by Andreas Gruenbacher, <a.gruenbacher@computer.org>
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/module.h>
+#include <linux/string.h>
+#include <linux/capability.h>
+#include <linux/fs.h>
+#include <linux/pram_fs.h>
+#include "xattr.h"
+
+static size_t pram_xattr_trusted_list(struct dentry *dentry, char *list,
+				size_t list_size, const char *name,
+				size_t name_len, int type)
+{
+	const int prefix_len = XATTR_TRUSTED_PREFIX_LEN;
+	const size_t total_len = prefix_len + name_len + 1;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return 0;
+
+	if (list && total_len <= list_size) {
+		memcpy(list, XATTR_TRUSTED_PREFIX, prefix_len);
+		memcpy(list+prefix_len, name, name_len);
+		list[prefix_len + name_len] = '\0';
+	}
+	return total_len;
+}
+
+static int pram_xattr_trusted_get(struct dentry *dentry, const char *name,
+				void *buffer, size_t size, int type)
+{
+	if (strcmp(name, "") == 0)
+		return -EINVAL;
+	return pram_xattr_get(dentry->d_inode, PRAM_XATTR_INDEX_TRUSTED, name,
+			      buffer, size);
+}
+
+static int pram_xattr_trusted_set(struct dentry *dentry, const char *name,
+		const void *value, size_t size, int flags, int type)
+{
+	if (strcmp(name, "") == 0)
+		return -EINVAL;
+	return pram_xattr_set(dentry->d_inode, PRAM_XATTR_INDEX_TRUSTED, name,
+			      value, size, flags);
+}
+
+const struct xattr_handler pram_xattr_trusted_handler = {
+	.prefix	= XATTR_TRUSTED_PREFIX,
+	.list	= pram_xattr_trusted_list,
+	.get	= pram_xattr_trusted_get,
+	.set	= pram_xattr_trusted_set,
+};
diff --git a/fs/pramfs/xattr_user.c b/fs/pramfs/xattr_user.c
new file mode 100644
index 0000000..caea624
--- /dev/null
+++ b/fs/pramfs/xattr_user.c
@@ -0,0 +1,68 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Handler for extended user attributes.
+ *
+ * Copyright 2010 Marco Stornelli <marco.stornelli@gmail.com>
+ *
+ * based on fs/ext2/xattr_user.c with the following copyright:
+ *
+ * Copyright (C) 2001 by Andreas Gruenbacher, <a.gruenbacher@computer.org>
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/string.h>
+#include "pram.h"
+#include "xattr.h"
+
+static size_t pram_xattr_user_list(struct dentry *dentry, char *list, size_t list_size,
+		const char *name, size_t name_len, int type)
+{
+	const size_t prefix_len = XATTR_USER_PREFIX_LEN;
+	const size_t total_len = prefix_len + name_len + 1;
+
+	if (!test_opt(dentry->d_sb, XATTR_USER))
+		return 0;
+
+	if (list && total_len <= list_size) {
+		memcpy(list, XATTR_USER_PREFIX, prefix_len);
+		memcpy(list+prefix_len, name, name_len);
+		list[prefix_len + name_len] = '\0';
+	}
+	return total_len;
+}
+
+static int pram_xattr_user_get(struct dentry *dentry, const char *name,
+		void *buffer, size_t size, int type)
+{
+	if (strcmp(name, "") == 0)
+		return -EINVAL;
+	if (!test_opt(dentry->d_sb, XATTR_USER))
+		return -EOPNOTSUPP;
+	return pram_xattr_get(dentry->d_inode, PRAM_XATTR_INDEX_USER,
+			      name, buffer, size);
+}
+
+static int pram_xattr_user_set(struct dentry *dentry, const char *name,
+		const void *value, size_t size, int flags, int type)
+{
+	if (strcmp(name, "") == 0)
+		return -EINVAL;
+	if (!test_opt(dentry->d_sb, XATTR_USER))
+		return -EOPNOTSUPP;
+
+	return pram_xattr_set(dentry->d_inode, PRAM_XATTR_INDEX_USER,
+			      name, value, size, flags);
+}
+
+const struct xattr_handler pram_xattr_user_handler = {
+	.prefix	= XATTR_USER_PREFIX,
+	.list	= pram_xattr_user_list,
+	.get	= pram_xattr_user_get,
+	.set	= pram_xattr_user_set,
+};
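
For readers following the on-media format: a minimal userspace sketch of the
name part of pram_xattr_hash_entry() and of the rounding done by
PRAM_XATTR_LEN(). The names xattr_name_hash(), xattr_entry_len() and
ENTRY_HEADER_SIZE are made up for this illustration and are not part of the
patch. It assumes sizeof(struct pram_xattr_entry) is 16 bytes with no padding,
and it treats name bytes as unsigned, whereas the kernel code reads them
through a plain char pointer. The value-hash loop is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define NAME_HASH_SHIFT   5
#define XATTR_PAD_BITS    2
#define XATTR_ROUND       ((1u << XATTR_PAD_BITS) - 1)
/* Assumed fixed part of struct pram_xattr_entry: 1 + 1 + 2 + 4 + 4 + 4 bytes. */
#define ENTRY_HEADER_SIZE 16u

/*
 * Rotate the running hash left by NAME_HASH_SHIFT bits and XOR in the
 * next name byte, mirroring the first loop of pram_xattr_hash_entry().
 */
static uint32_t xattr_name_hash(const unsigned char *name, size_t len)
{
	uint32_t hash = 0;
	size_t n;

	for (n = 0; n < len; n++)
		hash = (hash << NAME_HASH_SHIFT) ^
		       (hash >> (32 - NAME_HASH_SHIFT)) ^
		       name[n];
	return hash;
}

/*
 * Size of one entry (fixed part plus name), rounded up to a multiple
 * of 4 bytes, as PRAM_XATTR_LEN() does.
 */
static size_t xattr_entry_len(size_t name_len)
{
	return (name_len + XATTR_ROUND + ENTRY_HEADER_SIZE) &
	       ~(size_t)XATTR_ROUND;
}
```

Under these assumptions a 1-byte name and a 4-byte name both occupy 20 bytes,
which is why PRAM_XATTR_NEXT() can walk the entry list using only e_name_len.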

* [PATCH 11/17] pramfs: ACL management
From: Marco Stornelli @ 2011-01-06 12:03 UTC (permalink / raw)
  To: Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

ACL operations.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
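The on-media ACL layout is a 4-byte header followed by entries; entries
without an id (ACL_USER_OBJ, ACL_GROUP_OBJ, ACL_MASK, ACL_OTHER) use the short
4-byte form, all others the full 8-byte form. As a rough userspace sketch of
how pram_acl_size() and pram_acl_count() in acl.h relate, the code below
reimplements both. The names acl_size(), acl_count() and the ACL_*_SIZE
constants are made up for this illustration, and the guard against sizes
smaller than the header is an addition (in the patch that check is done by
pram_acl_load() before calling pram_acl_count()).

```c
#include <stddef.h>

enum {
	ACL_HEADER_SIZE      = 4, /* struct pram_acl_header: one __be32 */
	ACL_SHORT_ENTRY_SIZE = 4, /* e_tag + e_perm */
	ACL_FULL_ENTRY_SIZE  = 8  /* e_tag + e_perm + e_id */
};

/* Bytes needed for an ACL with 'count' entries (at most 4 short ones). */
static size_t acl_size(int count)
{
	if (count <= 4)
		return ACL_HEADER_SIZE +
		       (size_t)count * ACL_SHORT_ENTRY_SIZE;
	return ACL_HEADER_SIZE + 4 * ACL_SHORT_ENTRY_SIZE +
	       (size_t)(count - 4) * ACL_FULL_ENTRY_SIZE;
}

/* Inverse of acl_size(); returns -1 if 'size' cannot be a valid ACL. */
static int acl_count(size_t size)
{
	long body, rest;

	if (size < ACL_HEADER_SIZE)
		return -1;
	body = (long)size - ACL_HEADER_SIZE;
	rest = body - 4 * ACL_SHORT_ENTRY_SIZE;
	if (rest < 0) {
		/* Fewer than four entries: all of them are short. */
		if (body % ACL_SHORT_ENTRY_SIZE)
			return -1;
		return (int)(body / ACL_SHORT_ENTRY_SIZE);
	}
	/* Four short entries, then only full entries. */
	if (rest % ACL_FULL_ENTRY_SIZE)
		return -1;
	return (int)(rest / ACL_FULL_ENTRY_SIZE) + 4;
}
```

The round trip only holds because a valid POSIX ACL that contains ACL_USER or
ACL_GROUP entries is expected to also carry an ACL_MASK entry, so it has
exactly four short entries; this is my reading of the format, not something
the patch states.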
diff --git a/fs/pramfs/acl.c b/fs/pramfs/acl.c
new file mode 100644
index 0000000..53090a5
--- /dev/null
+++ b/fs/pramfs/acl.c
@@ -0,0 +1,433 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * POSIX ACL operations
+ *
+ * Copyright 2010 Marco Stornelli <marco.stornelli@gmail.com>
+ *
+ * based on fs/ext2/acl.c with the following copyright:
+ *
+ * Copyright (C) 2001-2003 Andreas Gruenbacher, <agruen@suse.de>
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/capability.h>
+#include <linux/init.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/fs.h>
+#include "pram.h"
+#include "xattr.h"
+#include "acl.h"
+
+/*
+ * Load ACL information from filesystem.
+ */
+static struct posix_acl *pram_acl_load(const void *value, size_t size)
+{
+	const char *end = (char *)value + size;
+	int n, count;
+	struct posix_acl *acl;
+
+	if (!value)
+		return NULL;
+	if (size < sizeof(struct pram_acl_header))
+		return ERR_PTR(-EINVAL);
+	if (((struct pram_acl_header *)value)->a_version !=
+	    cpu_to_be32(PRAM_ACL_VERSION))
+		return ERR_PTR(-EINVAL);
+	value = (char *)value + sizeof(struct pram_acl_header);
+	count = pram_acl_count(size);
+	if (count < 0)
+		return ERR_PTR(-EINVAL);
+	if (count == 0)
+		return NULL;
+	acl = posix_acl_alloc(count, GFP_KERNEL);
+	if (!acl)
+		return ERR_PTR(-ENOMEM);
+	for (n = 0; n < count; n++) {
+		struct pram_acl_entry *entry = (struct pram_acl_entry *)value;
+		if ((char *)value + sizeof(struct pram_acl_entry_short) > end)
+			goto fail;
+		acl->a_entries[n].e_tag  = be16_to_cpu(entry->e_tag);
+		acl->a_entries[n].e_perm = be16_to_cpu(entry->e_perm);
+		switch (acl->a_entries[n].e_tag) {
+		case ACL_USER_OBJ:
+		case ACL_GROUP_OBJ:
+		case ACL_MASK:
+		case ACL_OTHER:
+			value = (char *)value +
+				sizeof(struct pram_acl_entry_short);
+			acl->a_entries[n].e_id = ACL_UNDEFINED_ID;
+			break;
+		case ACL_USER:
+		case ACL_GROUP:
+			value = (char *)value + sizeof(struct pram_acl_entry);
+			if ((char *)value > end)
+				goto fail;
+			acl->a_entries[n].e_id =
+				be32_to_cpu(entry->e_id);
+			break;
+		default:
+			goto fail;
+		}
+	}
+	if (value != end)
+		goto fail;
+	return acl;
+
+fail:
+	posix_acl_release(acl);
+	return ERR_PTR(-EINVAL);
+}
+
+/*
+ * Save ACL information into the filesystem.
+ */
+static void *pram_acl_save(const struct posix_acl *acl, size_t *size)
+{
+	struct pram_acl_header *ext_acl;
+	char *e;
+	size_t n;
+
+	*size = pram_acl_size(acl->a_count);
+	ext_acl = kmalloc(sizeof(struct pram_acl_header) + acl->a_count *
+			sizeof(struct pram_acl_entry), GFP_KERNEL);
+	if (!ext_acl)
+		return ERR_PTR(-ENOMEM);
+	ext_acl->a_version = cpu_to_be32(PRAM_ACL_VERSION);
+	e = (char *)ext_acl + sizeof(struct pram_acl_header);
+	for (n = 0; n < acl->a_count; n++) {
+		struct pram_acl_entry *entry = (struct pram_acl_entry *)e;
+		entry->e_tag  = cpu_to_be16(acl->a_entries[n].e_tag);
+		entry->e_perm = cpu_to_be16(acl->a_entries[n].e_perm);
+		switch (acl->a_entries[n].e_tag) {
+		case ACL_USER:
+		case ACL_GROUP:
+			entry->e_id =
+				cpu_to_be32(acl->a_entries[n].e_id);
+			e += sizeof(struct pram_acl_entry);
+			break;
+		case ACL_USER_OBJ:
+		case ACL_GROUP_OBJ:
+		case ACL_MASK:
+		case ACL_OTHER:
+			e += sizeof(struct pram_acl_entry_short);
+			break;
+		default:
+			goto fail;
+		}
+	}
+	return (char *)ext_acl;
+
+fail:
+	kfree(ext_acl);
+	return ERR_PTR(-EINVAL);
+}
+
+/*
+ * inode->i_mutex: don't care
+ */
+static struct posix_acl *pram_get_acl(struct inode *inode, int type)
+{
+	int name_index;
+	char *value = NULL;
+	struct posix_acl *acl;
+	int retval;
+
+	if (!test_opt(inode->i_sb, POSIX_ACL))
+		return NULL;
+
+	acl = get_cached_acl(inode, type);
+	if (acl != ACL_NOT_CACHED)
+		return acl;
+
+	switch (type) {
+	case ACL_TYPE_ACCESS:
+		name_index = PRAM_XATTR_INDEX_POSIX_ACL_ACCESS;
+		break;
+	case ACL_TYPE_DEFAULT:
+		name_index = PRAM_XATTR_INDEX_POSIX_ACL_DEFAULT;
+		break;
+	default:
+		BUG();
+	}
+	retval = pram_xattr_get(inode, name_index, "", NULL, 0);
+	if (retval > 0) {
+		value = kmalloc(retval, GFP_KERNEL);
+		if (!value)
+			return ERR_PTR(-ENOMEM);
+		retval = pram_xattr_get(inode, name_index, "", value, retval);
+	}
+	if (retval > 0)
+		acl = pram_acl_load(value, retval);
+	else if (retval == -ENODATA || retval == -ENOSYS)
+		acl = NULL;
+	else
+		acl = ERR_PTR(retval);
+	kfree(value);
+
+	if (!IS_ERR(acl))
+		set_cached_acl(inode, type, acl);
+
+	return acl;
+}
+
+/*
+ * inode->i_mutex: down
+ */
+static int pram_set_acl(struct inode *inode, int type, struct posix_acl *acl)
+{
+	int name_index;
+	void *value = NULL;
+	size_t size = 0;
+	int error;
+
+	if (S_ISLNK(inode->i_mode))
+		return -EOPNOTSUPP;
+	if (!test_opt(inode->i_sb, POSIX_ACL))
+		return 0;
+
+	switch (type) {
+	case ACL_TYPE_ACCESS:
+		name_index = PRAM_XATTR_INDEX_POSIX_ACL_ACCESS;
+		if (acl) {
+			mode_t mode = inode->i_mode;
+			error = posix_acl_equiv_mode(acl, &mode);
+			if (error < 0)
+				return error;
+			else {
+				inode->i_mode = mode;
+				inode->i_ctime = CURRENT_TIME_SEC;
+				mark_inode_dirty(inode);
+				if (error == 0)
+					acl = NULL;
+			}
+		}
+		break;
+	case ACL_TYPE_DEFAULT:
+		name_index = PRAM_XATTR_INDEX_POSIX_ACL_DEFAULT;
+		if (!S_ISDIR(inode->i_mode))
+			return acl ? -EACCES : 0;
+		break;
+	default:
+		return -EINVAL;
+	}
+	if (acl) {
+		value = pram_acl_save(acl, &size);
+		if (IS_ERR(value))
+			return (int)PTR_ERR(value);
+	}
+
+	error = pram_xattr_set(inode, name_index, "", value, size, 0);
+
+	kfree(value);
+	if (!error)
+		set_cached_acl(inode, type, acl);
+	return error;
+}
+
+int pram_check_acl(struct inode *inode, int mask)
+{
+	struct posix_acl *acl = pram_get_acl(inode, ACL_TYPE_ACCESS);
+
+	if (IS_ERR(acl))
+		return PTR_ERR(acl);
+	if (acl) {
+		int error = posix_acl_permission(inode, acl, mask);
+		posix_acl_release(acl);
+		return error;
+	}
+
+	return -EAGAIN;
+}
+
+/*
+ * Initialize the ACLs of a new inode. Called from pram_new_inode.
+ *
+ * dir->i_mutex: down
+ * inode->i_mutex: up (access to inode is still exclusive)
+ */
+int pram_init_acl(struct inode *inode, struct inode *dir)
+{
+	struct posix_acl *acl = NULL;
+	int error = 0;
+
+	if (!S_ISLNK(inode->i_mode)) {
+		if (test_opt(dir->i_sb, POSIX_ACL)) {
+			acl = pram_get_acl(dir, ACL_TYPE_DEFAULT);
+			if (IS_ERR(acl))
+				return PTR_ERR(acl);
+		}
+		if (!acl)
+			inode->i_mode &= ~current_umask();
+	}
+
+	if (test_opt(inode->i_sb, POSIX_ACL) && acl) {
+		struct posix_acl *clone;
+		mode_t mode;
+
+		if (S_ISDIR(inode->i_mode)) {
+			error = pram_set_acl(inode, ACL_TYPE_DEFAULT, acl);
+			if (error)
+				goto cleanup;
+		}
+		clone = posix_acl_clone(acl, GFP_KERNEL);
+		error = -ENOMEM;
+		if (!clone)
+			goto cleanup;
+		mode = inode->i_mode;
+		error = posix_acl_create_masq(clone, &mode);
+		if (error >= 0) {
+			inode->i_mode = mode;
+			if (error > 0) {
+				/* This is an extended ACL */
+				error = pram_set_acl(inode,
+						     ACL_TYPE_ACCESS, clone);
+			}
+		}
+		posix_acl_release(clone);
+	}
+cleanup:
+	posix_acl_release(acl);
+	return error;
+}
+
+/*
+ * Does chmod for an inode that may have an Access Control List. The
+ * inode->i_mode field must be updated to the desired value by the caller
+ * before calling this function.
+ * Returns 0 on success, or a negative error number.
+ *
+ * We change the ACL rather than storing some ACL entries in the file
+ * mode permission bits (which would be more efficient), because that
+ * would break once additional permissions (like ACL_APPEND, ACL_DELETE
+ * for directories) are added. There are no more bits available in the
+ * file mode.
+ *
+ * inode->i_mutex: down
+ */
+int pram_acl_chmod(struct inode *inode)
+{
+	struct posix_acl *acl, *clone;
+	int error;
+
+	if (!test_opt(inode->i_sb, POSIX_ACL))
+		return 0;
+	if (S_ISLNK(inode->i_mode))
+		return -EOPNOTSUPP;
+	acl = pram_get_acl(inode, ACL_TYPE_ACCESS);
+	if (IS_ERR(acl) || !acl)
+		return PTR_ERR(acl);
+	clone = posix_acl_clone(acl, GFP_KERNEL);
+	posix_acl_release(acl);
+	if (!clone)
+		return -ENOMEM;
+	error = posix_acl_chmod_masq(clone, inode->i_mode);
+	if (!error)
+		error = pram_set_acl(inode, ACL_TYPE_ACCESS, clone);
+	posix_acl_release(clone);
+	return error;
+}
+
+/*
+ * Extended attribute handlers
+ */
+static size_t pram_xattr_list_acl_access(struct dentry *dentry, char *list,
+					size_t list_size, const char *name,
+					size_t name_len, int type)
+{
+	const size_t size = sizeof(POSIX_ACL_XATTR_ACCESS);
+
+	if (!test_opt(dentry->d_sb, POSIX_ACL))
+		return 0;
+	if (list && size <= list_size)
+		memcpy(list, POSIX_ACL_XATTR_ACCESS, size);
+	return size;
+}
+
+static size_t pram_xattr_list_acl_default(struct dentry *dentry, char *list,
+					  size_t list_size, const char *name,
+					  size_t name_len, int type)
+{
+	const size_t size = sizeof(POSIX_ACL_XATTR_DEFAULT);
+
+	if (!test_opt(dentry->d_sb, POSIX_ACL))
+		return 0;
+	if (list && size <= list_size)
+		memcpy(list, POSIX_ACL_XATTR_DEFAULT, size);
+	return size;
+}
+
+static int pram_xattr_get_acl(struct dentry *dentry, const char *name,
+			      void *buffer, size_t size, int type)
+{
+	struct posix_acl *acl;
+	int error;
+
+	if (strcmp(name, "") != 0)
+		return -EINVAL;
+	if (!test_opt(dentry->d_sb, POSIX_ACL))
+		return -EOPNOTSUPP;
+
+	acl = pram_get_acl(dentry->d_inode, type);
+	if (IS_ERR(acl))
+		return PTR_ERR(acl);
+	if (acl == NULL)
+		return -ENODATA;
+	error = posix_acl_to_xattr(acl, buffer, size);
+	posix_acl_release(acl);
+
+	return error;
+}
+
+static int pram_xattr_set_acl(struct dentry *dentry, const char *name,
+			      const void *value, size_t size, int flags, int type)
+{
+	struct posix_acl *acl;
+	int error;
+
+	if (strcmp(name, "") != 0)
+		return -EINVAL;
+	if (!test_opt(dentry->d_sb, POSIX_ACL))
+		return -EOPNOTSUPP;
+	if (!is_owner_or_cap(dentry->d_inode))
+		return -EPERM;
+
+	if (value) {
+		acl = posix_acl_from_xattr(value, size);
+		if (IS_ERR(acl))
+			return PTR_ERR(acl);
+		else if (acl) {
+			error = posix_acl_valid(acl);
+			if (error)
+				goto release_and_out;
+		}
+	} else
+		acl = NULL;
+
+	error = pram_set_acl(dentry->d_inode, type, acl);
+
+release_and_out:
+	posix_acl_release(acl);
+	return error;
+}
+
+const struct xattr_handler pram_xattr_acl_access_handler = {
+	.prefix	= POSIX_ACL_XATTR_ACCESS,
+	.flags	= ACL_TYPE_ACCESS,
+	.list	= pram_xattr_list_acl_access,
+	.get	= pram_xattr_get_acl,
+	.set	= pram_xattr_set_acl,
+};
+
+const struct xattr_handler pram_xattr_acl_default_handler = {
+	.prefix	= POSIX_ACL_XATTR_DEFAULT,
+	.flags	= ACL_TYPE_DEFAULT,
+	.list	= pram_xattr_list_acl_default,
+	.get	= pram_xattr_get_acl,
+	.set	= pram_xattr_set_acl,
+};
diff --git a/fs/pramfs/acl.h b/fs/pramfs/acl.h
new file mode 100644
index 0000000..3970dbe
--- /dev/null
+++ b/fs/pramfs/acl.h
@@ -0,0 +1,86 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * POSIX ACL operations
+ *
+ * Copyright 2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ *
+ * Based on fs/ext2/acl.h with the following copyright:
+ *
+ * Copyright (C) 2001-2003 Andreas Gruenbacher, <agruen@suse.de>
+ *
+ */
+
+#include <linux/posix_acl_xattr.h>
+
+#define PRAM_ACL_VERSION	0x0001
+
+struct pram_acl_entry {
+	__be16		e_tag;
+	__be16		e_perm;
+	__be32		e_id;
+};
+
+struct pram_acl_entry_short {
+	__be16		e_tag;
+	__be16		e_perm;
+};
+
+struct pram_acl_header {
+	__be32		a_version;
+};
+
+static inline size_t pram_acl_size(int count)
+{
+	if (count <= 4) {
+		return sizeof(struct pram_acl_header) +
+		       count * sizeof(struct pram_acl_entry_short);
+	} else {
+		return sizeof(struct pram_acl_header) +
+		       4 * sizeof(struct pram_acl_entry_short) +
+		       (count - 4) * sizeof(struct pram_acl_entry);
+	}
+}
+
+static inline int pram_acl_count(size_t size)
+{
+	ssize_t s;
+	size -= sizeof(struct pram_acl_header);
+	s = size - 4 * sizeof(struct pram_acl_entry_short);
+	if (s < 0) {
+		if (size % sizeof(struct pram_acl_entry_short))
+			return -1;
+		return size / sizeof(struct pram_acl_entry_short);
+	} else {
+		if (s % sizeof(struct pram_acl_entry))
+			return -1;
+		return s / sizeof(struct pram_acl_entry) + 4;
+	}
+}
+
+#ifdef CONFIG_PRAMFS_POSIX_ACL
+
+/* acl.c */
+extern int pram_check_acl(struct inode *, int);
+extern int pram_acl_chmod(struct inode *);
+extern int pram_init_acl(struct inode *, struct inode *);
+
+#else
+#include <linux/sched.h>
+#define pram_check_acl	NULL
+#define pram_get_acl	NULL
+#define pram_set_acl	NULL
+
+static inline int pram_acl_chmod(struct inode *inode)
+{
+	return 0;
+}
+
+static inline int pram_init_acl(struct inode *inode, struct inode *dir)
+{
+	return 0;
+}
+#endif

* [PATCH 10/17] pramfs: xip operations
From: Marco Stornelli @ 2011-01-06 12:03 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

XIP operations.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
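The core of this patch is a find-or-allocate lookup: look up the data block
for a page offset, allocate one only when the caller asked for it, then look
it up again. A toy userspace sketch of that control flow follows. Everything
in it (struct toy_inode, toy_find(), toy_alloc(), toy_find_or_alloc(),
NBLOCKS) is hypothetical and stands in for pram_find_data_block() and
pram_alloc_blocks(); the kernel code additionally holds truncate_mutex around
the whole sequence, which the sketch omits.

```c
#include <errno.h>
#include <stdint.h>

#define NBLOCKS 8

/* Toy inode: map[i] == 0 means "no block allocated for index i". */
struct toy_inode {
	uint64_t map[NBLOCKS];
	uint64_t next_free;	/* next block number to hand out, never 0 */
};

static uint64_t toy_find(const struct toy_inode *ino, unsigned int idx)
{
	return idx < NBLOCKS ? ino->map[idx] : 0;
}

static int toy_alloc(struct toy_inode *ino, unsigned int idx)
{
	if (idx >= NBLOCKS)
		return -ENOSPC;
	ino->map[idx] = ino->next_free++;
	return 0;
}

/*
 * Same shape as pram_find_and_alloc_blocks(): a hole is reported as
 * -ENODATA unless 'create' is set, in which case a block is allocated
 * and looked up again.
 */
static int toy_find_or_alloc(struct toy_inode *ino, unsigned int idx,
			     uint64_t *out, int create)
{
	uint64_t block = toy_find(ino, idx);
	int err;

	if (!block) {
		if (!create)
			return -ENODATA;
		err = toy_alloc(ino, idx);
		if (err)
			return err;
		block = toy_find(ino, idx);
		if (!block)
			return -ENODATA;
	}
	*out = block;
	return 0;
}
```

With create == 0 a hole is reported to the caller rather than filled, which is
the case __pram_get_block() tolerates; it only treats -ENODATA as a bug when
create was set.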
diff --git a/fs/pramfs/xip.c b/fs/pramfs/xip.c
new file mode 100644
index 0000000..9ad8caf
--- /dev/null
+++ b/fs/pramfs/xip.c
@@ -0,0 +1,83 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * XIP operations.
+ *
+ * Copyright 2009-2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/mm.h>
+#include <linux/fs.h>
+#include <linux/genhd.h>
+#include <linux/buffer_head.h>
+#include "pram.h"
+#include "xip.h"
+
+static int pram_find_and_alloc_blocks(struct inode *inode, sector_t iblock,
+				     sector_t *data_block, int create)
+{
+	int err = -EIO;
+	u64 block;
+
+	mutex_lock(&PRAM_I(inode)->truncate_mutex);
+
+	block = pram_find_data_block(inode, iblock);
+
+	if (!block) {
+		if (!create) {
+			err = -ENODATA;
+			goto err;
+		}
+
+		err = pram_alloc_blocks(inode, iblock, 1);
+		if (err)
+			goto err;
+
+		block = pram_find_data_block(inode, iblock);
+		if (!block) {
+			err = -ENODATA;
+			goto err;
+		}
+	}
+
+	*data_block = block;
+	err = 0;
+
+ err:
+	mutex_unlock(&PRAM_I(inode)->truncate_mutex);
+	return err;
+}
+
+static inline int __pram_get_block(struct inode *inode, pgoff_t pgoff, int create,
+		   sector_t *result)
+{
+	int rc = 0;
+
+	rc = pram_find_and_alloc_blocks(inode, (sector_t)pgoff, result, create);
+
+	if (rc == -ENODATA)
+		BUG_ON(create);
+
+	return rc;
+}
+
+int pram_get_xip_mem(struct address_space *mapping, pgoff_t pgoff, int create,
+				void **kmem, unsigned long *pfn)
+{
+	int rc;
+	sector_t block;
+
+	/* first, retrieve the block */
+	rc = __pram_get_block(mapping->host, pgoff, create, &block);
+	if (rc)
+		goto exit;
+
+	*kmem = pram_get_block(mapping->host->i_sb, block);
+	*pfn = page_to_pfn(virt_to_page((unsigned long)*kmem));
+
+exit:
+	return rc;
+}
diff --git a/fs/pramfs/xip.h b/fs/pramfs/xip.h
new file mode 100644
index 0000000..797f1b0
--- /dev/null
+++ b/fs/pramfs/xip.h
@@ -0,0 +1,28 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * XIP operations.
+ *
+ * Copyright 2009-2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#ifdef CONFIG_PRAMFS_XIP
+int pram_get_xip_mem(struct address_space *, pgoff_t, int, void **,
+							      unsigned long *);
+static inline int pram_use_xip(struct super_block *sb)
+{
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+	return sbi->s_mount_opt & PRAM_MOUNT_XIP;
+}
+#define mapping_is_xip(map) unlikely((map)->a_ops->get_xip_mem)
+
+#else
+
+#define mapping_is_xip(map)	0
+#define pram_use_xip(sb)	0
+#define pram_get_xip_mem	NULL
+
+#endif
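
For readers following the locking in pram_find_and_alloc_blocks(), here is a minimal user-space sketch of the same "look up, allocate only if asked, single exit path" shape. Everything named toy_* is hypothetical and exists only for illustration: the fixed-size block map, the bump allocator and the lock_depth counter (standing in for truncate_mutex) are assumptions, not pramfs code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_NBLOCKS 8

/* Hypothetical stand-in for a per-file block map; 0 means "hole". */
struct toy_file {
	uint64_t blocks[TOY_NBLOCKS];
	uint64_t next_free;	/* next offset the toy allocator hands out */
	int lock_depth;		/* stands in for truncate_mutex */
};

int toy_find_and_alloc(struct toy_file *tf, unsigned int iblock,
		       uint64_t *data_block, int create)
{
	int err;

	assert(tf && data_block);
	tf->lock_depth++;		/* mutex_lock() in the patch */

	if (iblock >= TOY_NBLOCKS) {
		err = -EIO;
		goto out;
	}

	if (!tf->blocks[iblock]) {
		if (!create) {
			/* a hole, and the caller only wants to read */
			err = -ENODATA;
			goto out;
		}
		tf->blocks[iblock] = tf->next_free;
		tf->next_free += 4096;
	}

	*data_block = tf->blocks[iblock];
	err = 0;
out:
	tf->lock_depth--;		/* unlock on every path */
	return err;
}
```

A reader that passes create == 0 gets -ENODATA for a hole and can treat it as zeroes; a writer passes create == 1 and should never see -ENODATA, which is what the BUG_ON() in __pram_get_block() asserts.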

^ permalink raw reply related

* [PATCH 09/17] pramfs: dir operations
From: Marco Stornelli @ 2011-01-06 12:03 UTC (permalink / raw)
  To: Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

File operations for directories.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff --git a/fs/pramfs/dir.c b/fs/pramfs/dir.c
new file mode 100644
index 0000000..cf0bcba
--- /dev/null
+++ b/fs/pramfs/dir.c
@@ -0,0 +1,208 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * File operations for directories.
+ *
+ * Copyright 2009-2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * Copyright 2003 Sony Corporation
+ * Copyright 2003 Matsushita Electric Industrial Co., Ltd.
+ * 2003-2004 (c) MontaVista Software, Inc. , Steve Longerbeam
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/fs.h>
+#include <linux/pagemap.h>
+#include "pram.h"
+
+/*
+ *	Parent is locked.
+ */
+int pram_add_link(struct dentry *dentry, struct inode *inode)
+{
+	struct inode *dir = dentry->d_parent->d_inode;
+	struct pram_inode *pidir, *pi, *pitail = NULL;
+	u64 tail_ino, prev_ino;
+
+	const char *name = dentry->d_name.name;
+
+	int namelen = min_t(unsigned int, dentry->d_name.len, PRAM_NAME_LEN);
+
+	pidir = pram_get_inode(dir->i_sb, dir->i_ino);
+	pi = pram_get_inode(dir->i_sb, inode->i_ino);
+
+	dir->i_mtime = dir->i_ctime = CURRENT_TIME;
+
+	tail_ino = be64_to_cpu(pidir->i_type.dir.tail);
+	if (tail_ino != 0) {
+		pitail = pram_get_inode(dir->i_sb, tail_ino);
+		pram_memunlock_inode(dir->i_sb, pitail);
+		pitail->i_d.d_next = cpu_to_be64(inode->i_ino);
+		pram_memlock_inode(dir->i_sb, pitail);
+
+		prev_ino = tail_ino;
+
+		pram_memunlock_inode(dir->i_sb, pidir);
+		pidir->i_type.dir.tail = cpu_to_be64(inode->i_ino);
+		pidir->i_mtime = cpu_to_be32(dir->i_mtime.tv_sec);
+		pidir->i_ctime = cpu_to_be32(dir->i_ctime.tv_sec);
+		pram_memlock_inode(dir->i_sb, pidir);
+	} else {
+		/* the directory is empty */
+		prev_ino = 0;
+
+		pram_memunlock_inode(dir->i_sb, pidir);
+		pidir->i_type.dir.tail = cpu_to_be64(inode->i_ino);
+		pidir->i_type.dir.head = cpu_to_be64(inode->i_ino);
+		pidir->i_mtime = cpu_to_be32(dir->i_mtime.tv_sec);
+		pidir->i_ctime = cpu_to_be32(dir->i_ctime.tv_sec);
+		pram_memlock_inode(dir->i_sb, pidir);
+	}
+
+
+	pram_memunlock_inode(dir->i_sb, pi);
+	pi->i_d.d_prev = cpu_to_be64(prev_ino);
+	pi->i_d.d_parent = cpu_to_be64(dir->i_ino);
+	memcpy(pi->i_d.d_name, name, namelen);
+	pi->i_d.d_name[namelen] = '\0';
+	pram_memlock_inode(dir->i_sb, pi);
+	return 0;
+}
+
+int pram_remove_link(struct inode *inode)
+{
+	struct super_block *sb = inode->i_sb;
+	struct pram_inode *prev = NULL;
+	struct pram_inode *next = NULL;
+	struct pram_inode *pidir, *pi;
+
+	pi = pram_get_inode(sb, inode->i_ino);
+	pidir = pram_get_inode(sb, be64_to_cpu(pi->i_d.d_parent));
+	if (!pidir)
+		return -EACCES;
+
+	if (inode->i_ino == be64_to_cpu(pidir->i_type.dir.head)) {
+		/* first inode in directory */
+		next = pram_get_inode(sb, be64_to_cpu(pi->i_d.d_next));
+
+		if (next) {
+			pram_memunlock_inode(sb, next);
+			next->i_d.d_prev = 0;
+			pram_memlock_inode(sb, next);
+
+			pram_memunlock_inode(sb, pidir);
+			pidir->i_type.dir.head = pi->i_d.d_next;
+		} else {
+			pram_memunlock_inode(sb, pidir);
+			pidir->i_type.dir.head = 0;
+			pidir->i_type.dir.tail = 0;
+		}
+		pram_memlock_inode(sb, pidir);
+	} else if (inode->i_ino == be64_to_cpu(pidir->i_type.dir.tail)) {
+		/* last inode in directory */
+		prev = pram_get_inode(sb, be64_to_cpu(pi->i_d.d_prev));
+
+		pram_memunlock_inode(sb, prev);
+		prev->i_d.d_next = 0;
+		pram_memlock_inode(sb, prev);
+
+		pram_memunlock_inode(sb, pidir);
+		pidir->i_type.dir.tail = pi->i_d.d_prev;
+		pram_memlock_inode(sb, pidir);
+	} else {
+		/* somewhere in the middle */
+		prev = pram_get_inode(sb, be64_to_cpu(pi->i_d.d_prev));
+		next = pram_get_inode(sb, be64_to_cpu(pi->i_d.d_next));
+
+		if (prev && next) {
+			pram_memunlock_inode(sb, prev);
+			prev->i_d.d_next = pi->i_d.d_next;
+			pram_memlock_inode(sb, prev);
+
+			pram_memunlock_inode(sb, next);
+			next->i_d.d_prev = pi->i_d.d_prev;
+			pram_memlock_inode(sb, next);
+		}
+	}
+
+	pram_memunlock_inode(sb, pi);
+	pi->i_d.d_next = 0;
+	pi->i_d.d_prev = 0;
+	pi->i_d.d_parent = 0;
+	pram_memlock_inode(sb, pi);
+
+	return 0;
+}
+
+#define DT2IF(dt) (((dt) << 12) & S_IFMT)
+#define IF2DT(sif) (((sif) & S_IFMT) >> 12)
+
+static int pram_readdir(struct file *filp, void *dirent, filldir_t filldir)
+{
+	struct inode *inode = filp->f_dentry->d_inode;
+	struct super_block *sb = inode->i_sb;
+	struct pram_inode *pi;
+	int namelen, ret = 0;
+	char *name;
+	ino_t ino;
+
+	pi = pram_get_inode(sb, inode->i_ino);
+
+	switch ((unsigned long)filp->f_pos) {
+	case 0:
+		ret = filldir(dirent, ".", 1, 0, inode->i_ino, DT_DIR);
+		filp->f_pos++;
+		return ret;
+	case 1:
+		ret = filldir(dirent, "..", 2, 1, be64_to_cpu(pi->i_d.d_parent), DT_DIR);
+		ino = be64_to_cpu(pi->i_type.dir.head);
+		filp->f_pos = ino ? ino : 2;
+		return ret;
+	case 2:
+		ino = be64_to_cpu(pi->i_type.dir.head);
+		if (ino) {
+			filp->f_pos = ino;
+			pi = pram_get_inode(sb, ino);
+			break;
+		} else {
+			/* the directory is empty */
+			filp->f_pos = 2;
+			return 0;
+		}
+	case 3:
+		return 0;
+	default:
+		ino = filp->f_pos;
+		pi = pram_get_inode(sb, ino);
+		break;
+	}
+
+	while (pi && !be16_to_cpu(pi->i_links_count)) {
+		ino = filp->f_pos = be64_to_cpu(pi->i_d.d_next);
+		pi = pram_get_inode(sb, ino);
+	}
+
+	if (pi) {
+		name = pi->i_d.d_name;
+		namelen = strlen(name);
+
+		ret = filldir(dirent, name, namelen,
+			      filp->f_pos, ino,
+			      IF2DT(be16_to_cpu(pi->i_mode)));
+		filp->f_pos = pi->i_d.d_next ? be64_to_cpu(pi->i_d.d_next) : 3;
+	} else
+		filp->f_pos = 3;
+
+	return ret;
+}
+
+struct file_operations pram_dir_operations = {
+	.read		= generic_read_dir,
+	.readdir	= pram_readdir,
+	.fsync		= noop_fsync,
+	.unlocked_ioctl	= pram_ioctl,
+#ifdef CONFIG_COMPAT
+	.compat_ioctl	= pram_compat_ioctl,
+#endif
+};
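
As a reading aid for pram_add_link() and pram_remove_link(): a directory here is a doubly linked list threaded through the child inodes, with head and tail kept in the directory inode and 0 meaning "none". The sketch below models that in user space with array indices instead of byte offsets. The toy_* names and the table are assumptions made for illustration, and the removal logic is a simplified variant that tests prev/next rather than comparing against head/tail as the patch does.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_INODES 16

struct toy_inode {
	uint64_t next, prev, parent;	/* like struct pram_dentry */
	uint64_t head, tail;		/* only meaningful for directories */
};

/* Slot 0 is never used, so 0 can mean "no inode". */
static struct toy_inode toy_table[TOY_MAX_INODES];

/* Append inode 'ino' at the tail of directory 'dir'. */
void toy_add_link(uint64_t dir, uint64_t ino)
{
	struct toy_inode *d, *e;
	uint64_t old_tail;

	assert(dir && ino && dir < TOY_MAX_INODES && ino < TOY_MAX_INODES);
	d = &toy_table[dir];
	e = &toy_table[ino];
	old_tail = d->tail;

	if (old_tail)
		toy_table[old_tail].next = ino;
	else
		d->head = ino;		/* the directory was empty */
	d->tail = ino;

	e->prev = old_tail;
	e->next = 0;
	e->parent = dir;
}

/* Unhook inode 'ino' from its parent directory. */
void toy_remove_link(uint64_t ino)
{
	struct toy_inode *e, *d;

	assert(ino && ino < TOY_MAX_INODES);
	e = &toy_table[ino];
	d = &toy_table[e->parent];

	if (e->prev)
		toy_table[e->prev].next = e->next;
	else
		d->head = e->next;	/* entry was first */

	if (e->next)
		toy_table[e->next].prev = e->prev;
	else
		d->tail = e->prev;	/* entry was last */

	e->next = e->prev = e->parent = 0;
}
```

Because every entry carries its parent, removal needs only the child inode, which matches the single-argument signature of pram_remove_link().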

^ permalink raw reply related

* [PATCH 08/17] pramfs: headers
From: Marco Stornelli @ 2011-01-06 12:02 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

Definitions for the PRAMFS filesystem.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff --git a/fs/pramfs/pram.h b/fs/pramfs/pram.h
new file mode 100644
index 0000000..85169c4
--- /dev/null
+++ b/fs/pramfs/pram.h
@@ -0,0 +1,269 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Definitions for the PRAMFS filesystem.
+ *
+ * Copyright 2009-2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * Copyright 2003 Sony Corporation
+ * Copyright 2003 Matsushita Electric Industrial Co., Ltd.
+ * 2003-2004 (c) MontaVista Software, Inc. , Steve Longerbeam
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+#ifndef __PRAM_H
+#define __PRAM_H
+
+#include <linux/buffer_head.h>
+#include <linux/pram_fs.h>
+#include <linux/pram_fs_sb.h>
+#include <linux/crc16.h>
+#include <linux/mutex.h>
+#include <linux/types.h>
+#include "wprotect.h"
+
+/*
+ * Debug code
+ */
+#ifdef pr_fmt
+#undef pr_fmt
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#endif
+
+#define pram_dbg(s, args...)		pr_debug(s, ## args)
+#define pram_err(sb, s, args...)	pram_error_mng(sb, s, ## args)
+#define pram_warn(s, args...)		pr_warning(s, ## args)
+#define pram_info(s, args...)		pr_info(s, ## args)
+
+#define pram_set_bit			ext2_set_bit
+#define pram_clear_bit			ext2_clear_bit
+#define pram_find_next_zero_bit		ext2_find_next_zero_bit
+
+#define clear_opt(o, opt)	(o &= ~PRAM_MOUNT_##opt)
+#define set_opt(o, opt)		(o |= PRAM_MOUNT_##opt)
+#define test_opt(sb, opt) \
+	(((struct pram_sb_info *)sb->s_fs_info)->s_mount_opt & PRAM_MOUNT_##opt)
+
+/*
+ * Pram inode flags
+ *
+ * PRAM_EOFBLOCKS_FL	There are blocks allocated beyond eof
+ */
+#define PRAM_EOFBLOCKS_FL	0x20000000
+/* Flags that should be inherited by new inodes from their parent. */
+#define PRAM_FL_INHERITED (FS_SECRM_FL | FS_UNRM_FL | FS_COMPR_FL |\
+			   FS_SYNC_FL | FS_IMMUTABLE_FL | FS_APPEND_FL |\
+			   FS_NODUMP_FL | FS_NOATIME_FL | FS_COMPRBLK_FL|\
+			   FS_NOCOMP_FL | FS_JOURNAL_DATA_FL |\
+			   FS_NOTAIL_FL | FS_DIRSYNC_FL)
+/* Flags that are appropriate for regular files (all but dir-specific ones). */
+#define PRAM_REG_FLMASK (~(FS_DIRSYNC_FL | FS_TOPDIR_FL))
+/* Flags that are appropriate for non-directories/regular files. */
+#define PRAM_OTHER_FLMASK (FS_NODUMP_FL | FS_NOATIME_FL)
+
+/* Function Prototypes */
+extern void pram_error_mng(struct super_block *sb, const char *fmt, ...);
+extern int pram_get_and_update_block(struct inode *inode, sector_t iblock,
+				     struct buffer_head *bh, int create);
+
+static inline int pram_readpage(struct file *file, struct page *page)
+{
+	return block_read_full_page(page, pram_get_and_update_block);
+}
+
+/* file.c */
+extern ssize_t pram_direct_IO(int rw, struct kiocb *iocb,
+			  const struct iovec *iov,
+			  loff_t offset, unsigned long nr_segs);
+extern int pram_mmap(struct file *file, struct vm_area_struct *vma);
+
+/* balloc.c */
+extern void pram_init_bitmap(struct super_block *sb);
+extern void pram_free_block(struct super_block *sb, unsigned long blocknr);
+extern int pram_new_block(struct super_block *sb, unsigned long *blocknr, int zero);
+extern unsigned long pram_count_free_blocks(struct super_block *sb);
+
+/* dir.c */
+extern int pram_add_link(struct dentry *dentry, struct inode *inode);
+extern int pram_remove_link(struct inode *inode);
+
+/* namei.c */
+extern struct dentry *pram_get_parent(struct dentry *child);
+
+/* inode.c */
+extern int pram_alloc_blocks(struct inode *inode, int file_blocknr,
+			     unsigned int num);
+extern u64 pram_find_data_block(struct inode *inode,
+					 unsigned long file_blocknr);
+
+extern struct inode *pram_iget(struct super_block *sb, unsigned long ino);
+extern void pram_put_inode(struct inode *inode);
+extern void pram_evict_inode(struct inode *inode);
+extern struct inode *pram_new_inode(struct inode *dir, int mode);
+extern int pram_update_inode(struct inode *inode);
+extern int pram_write_inode(struct inode *inode, struct writeback_control *wbc);
+extern void pram_dirty_inode(struct inode *inode);
+extern int pram_notify_change(struct dentry *dentry, struct iattr *attr);
+extern long pram_fallocate(struct inode *inode, int mode, loff_t offset,
+			  loff_t len);
+extern void pram_set_inode_flags(struct inode *inode, struct pram_inode *pi);
+extern void pram_get_inode_flags(struct inode *inode, struct pram_inode *pi);
+
+/* ioctl.c */
+extern long pram_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
+#ifdef CONFIG_COMPAT
+extern long pram_compat_ioctl(struct file *file, unsigned int cmd,
+			      unsigned long arg);
+#endif
+
+/* super.c */
+#ifdef CONFIG_PRAMFS_TEST
+extern struct pram_super_block *get_pram_super(void);
+#endif
+extern struct super_block *pram_read_super(struct super_block *sb,
+					      void *data,
+					      int silent);
+extern int pram_statfs(struct dentry *d, struct kstatfs *buf);
+extern int pram_remount(struct super_block *sb, int *flags, char *data);
+
+/* symlink.c */
+extern int pram_block_symlink(struct inode *inode,
+			       const char *symname, int len);
+
+/* Inline functions start here */
+
+/* Mask out flags that are inappropriate for the given type of inode. */
+static inline __be32 pram_mask_flags(umode_t mode, __be32 flags)
+{
+	flags &= cpu_to_be32(PRAM_FL_INHERITED);
+	if (S_ISDIR(mode))
+		return flags;
+	else if (S_ISREG(mode))
+		return flags & cpu_to_be32(PRAM_REG_FLMASK);
+	else
+		return flags & cpu_to_be32(PRAM_OTHER_FLMASK);
+}
+
+static inline int pram_calc_checksum(u8 *data, int n)
+{
+	u16 crc = 0;
+	crc = crc16(~0, (__u8 *)data + sizeof(__be16), n - sizeof(__be16));
+	if (*((__be16 *)data) == cpu_to_be16(crc))
+		return 0;
+	else
+		return 1;
+}
+
+/* If this is part of a read-modify-write of the super block,
+   call pram_memunlock_super() before writing through the pointer! */
+static inline struct pram_super_block *
+pram_get_super(struct super_block *sb)
+{
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+	return (struct pram_super_block *)sbi->virt_addr;
+}
+
+static inline struct pram_super_block *
+pram_get_redund_super(struct super_block *sb)
+{
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+	return (struct pram_super_block *)(sbi->virt_addr + PRAM_SB_SIZE);
+}
+
+static inline void *
+pram_get_bitmap(struct super_block *sb)
+{
+	struct pram_super_block *ps = pram_get_super(sb);
+	return (void *)ps + be64_to_cpu(ps->s_bitmap_start);
+}
+
+/* If this is part of a read-modify-write of the inode metadata,
+   call pram_memunlock_inode() before writing through the pointer! */
+static inline struct pram_inode *
+pram_get_inode(struct super_block *sb, u64 ino)
+{
+	struct pram_super_block *ps = pram_get_super(sb);
+	return ino ? (struct pram_inode *)((void *)ps + ino) : NULL;
+}
+
+static inline ino_t
+pram_get_inodenr(struct super_block *sb, struct pram_inode *pi)
+{
+	struct pram_super_block *ps = pram_get_super(sb);
+	return (ino_t)((unsigned long)pi - (unsigned long)ps);
+}
+
+static inline u64
+pram_get_block_off(struct super_block *sb, unsigned long blocknr)
+{
+	struct pram_super_block *ps = pram_get_super(sb);
+	return (u64)(be64_to_cpu(ps->s_bitmap_start) +
+			     (blocknr << sb->s_blocksize_bits));
+}
+
+static inline unsigned long
+pram_get_blocknr(struct super_block *sb, u64 block)
+{
+	struct pram_super_block *ps = pram_get_super(sb);
+	return (block - be64_to_cpu(ps->s_bitmap_start)) >> sb->s_blocksize_bits;
+}
+
+/* If this is part of a read-modify-write of the block,
+   call pram_memunlock_block() before writing through the pointer! */
+static inline void *
+pram_get_block(struct super_block *sb, u64 block)
+{
+	struct pram_super_block *ps = pram_get_super(sb);
+	return block ? ((void *)ps + block) : NULL;
+}
+
+struct pram_inode_vfs {
+#ifdef CONFIG_PRAMFS_XATTR
+	/*
+	 * Extended attributes can be read independently of the main file
+	 * data. Taking i_mutex even when reading would cause contention
+	 * between readers of EAs and writers of regular file data, so
+	 * instead we synchronize on xattr_sem when reading or changing
+	 * EAs.
+	 */
+	struct rw_semaphore xattr_sem;
+#endif
+	/*
+	 * truncate_mutex is for serialising the truncate path against
+	 * get/update block.
+	 */
+	struct mutex truncate_mutex;
+	struct mutex i_meta_mutex;
+	struct inode vfs_inode;
+};
+
+static inline struct pram_inode_vfs *PRAM_I(struct inode *inode)
+{
+	return container_of(inode, struct pram_inode_vfs, vfs_inode);
+}
+
+/*
+ * Inode and file operations
+ */
+
+/* dir.c */
+extern struct file_operations pram_dir_operations;
+
+/* file.c */
+extern struct inode_operations pram_file_inode_operations;
+extern struct file_operations pram_file_operations;
+extern struct file_operations pram_xip_file_operations;
+
+/* inode.c */
+extern struct address_space_operations pram_aops;
+extern struct address_space_operations pram_aops_xip;
+
+/* namei.c */
+extern struct inode_operations pram_dir_inode_operations;
+
+/* symlink.c */
+extern struct inode_operations pram_symlink_inode_operations;
+
+extern struct backing_dev_info pram_backing_dev_info;
+
+#endif	/* __PRAM_H */
diff --git a/include/linux/magic.h b/include/linux/magic.h
index ff690d0..4cb2e9b 100644
--- a/include/linux/magic.h
+++ b/include/linux/magic.h
@@ -38,6 +38,7 @@
 #define NFS_SUPER_MAGIC		0x6969
 #define OPENPROM_SUPER_MAGIC	0x9fa1
 #define PROC_SUPER_MAGIC	0x9fa0
+#define PRAM_SUPER_MAGIC	0xEFFA
 #define QNX4_SUPER_MAGIC	0x002f		/* qnx4 fs detection */
 #define REISERFS_SUPER_MAGIC	0x52654973	/* used by gcc */
diff --git a/include/linux/pram_fs.h b/include/linux/pram_fs.h
new file mode 100644
index 0000000..ced52f4
--- /dev/null
+++ b/include/linux/pram_fs.h
@@ -0,0 +1,130 @@
+/*
+ * FILE NAME include/linux/pram_fs.h
+ *
+ * BRIEF DESCRIPTION
+ *
+ * Definitions for the PRAMFS filesystem.
+ *
+ * Copyright 2009-2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * Copyright 2003 Sony Corporation
+ * Copyright 2003 Matsushita Electric Industrial Co., Ltd.
+ * 2003-2004 (c) MontaVista Software, Inc. , Steve Longerbeam
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+#ifndef _LINUX_PRAM_FS_H
+#define _LINUX_PRAM_FS_H
+
+#include <linux/types.h>
+#include <linux/magic.h>
+
+/*
+ * The PRAM filesystem constants/structures
+ */
+
+/*
+ * Mount flags
+ */
+#define PRAM_MOUNT_PROTECT		0x000001  /* Use memory protection */
+#define PRAM_MOUNT_XATTR_USER		0x000002  /* Extended user attributes */
+#define PRAM_MOUNT_POSIX_ACL		0x000004  /* POSIX Access Control Lists */
+#define PRAM_MOUNT_XIP			0x000008  /* Execute in place */
+#define PRAM_MOUNT_ERRORS_CONT		0x000010  /* Continue on errors */
+#define PRAM_MOUNT_ERRORS_RO		0x000020  /* Remount fs ro on errors */
+#define PRAM_MOUNT_ERRORS_PANIC		0x000040  /* Panic on errors */
+
+/*
+ * Maximal count of links to a file
+ */
+#define PRAM_LINK_MAX		32000
+
+#define PRAM_MIN_BLOCK_SIZE 512
+#define PRAM_MAX_BLOCK_SIZE 4096
+#define PRAM_DEF_BLOCK_SIZE 2048
+
+#define PRAM_INODE_SIZE 128 /* must be power of two */
+#define PRAM_INODE_BITS   7
+
+/*
+ * Structure of a directory entry in PRAMFS.
+ * Offsets are to the inode that holds the referenced dentry.
+ */
+struct pram_dentry {
+	__be64	d_next;     /* next dentry in this directory */
+	__be64	d_prev;     /* previous dentry in this directory */
+	__be64	d_parent;   /* parent directory */
+	char	d_name[0];
+};
+
+
+/*
+ * Structure of an inode in PRAMFS
+ */
+struct pram_inode {
+	__be16	i_sum;          /* checksum of this inode */
+	__be32	i_uid;		/* Owner Uid */
+	__be32	i_gid;		/* Group Id */
+	__be16	i_mode;		/* File mode */
+	__be16	i_links_count;	/* Links count */
+	__be32	i_blocks;	/* Blocks count */
+	__be32	i_size;		/* Size of data in bytes */
+	__be32	i_atime;	/* Access time */
+	__be32	i_ctime;	/* Inode change time */
+	__be32	i_mtime;	/* Modification time */
+	__be32	i_dtime;	/* Deletion Time */
+	__be64	i_xattr;	/* Extended attribute block */
+	__be32	i_generation;	/* File version (for NFS) */
+	__be32	i_flags;	/* Inode flags */
+
+	union {
+		struct {
+			/*
+			 * ptr to row block of 2D block pointer array,
+			 * file block #'s 0 to (blocksize/8)^2 - 1.
+			 */
+			__be64 row_block;
+		} reg;   /* regular file or symlink inode */
+		struct {
+			__be64 head; /* first entry in this directory */
+			__be64 tail; /* last entry in this directory */
+		} dir;
+		struct {
+			__be32 rdev; /* major/minor # */
+		} dev;   /* device inode */
+	} i_type;
+
+	struct pram_dentry i_d;
+};
+
+#define PRAM_NAME_LEN \
+    (PRAM_INODE_SIZE - offsetof(struct pram_inode, i_d.d_name) - 1)
+
+
+#define PRAM_SB_SIZE 128 /* must be power of two */
+
+/*
+ * Structure of the super block in PRAMFS
+ */
+struct pram_super_block {
+	__be16	s_sum;          /* checksum of this sb, including padding */
+	__be64	s_size;         /* total size of fs in bytes */
+	__be32	s_blocksize;    /* blocksize in bytes */
+	__be32	s_inodes_count;	/* total inodes count (used or free) */
+	__be32	s_free_inodes_count;/* free inodes count */
+	__be32	s_free_inode_hint;  /* start hint for locating free inodes */
+	__be32	s_blocks_count;	/* total data blocks count (used or free) */
+	__be32	s_free_blocks_count;/* free data blocks count */
+	__be32	s_free_blocknr_hint;/* start hint for locating free blocks */
+	__be64	s_bitmap_start; /* data block in-use bitmap location */
+	__be32	s_bitmap_blocks;/* size of bitmap in number of blocks */
+	__be32	s_mtime;	/* Mount time */
+	__be32	s_wtime;	/* Write time */
+	__be16	s_magic;	/* Magic signature */
+	char	s_volume_name[16]; /* volume name */
+};
+
+/* The root inode follows immediately after the redundant super block */
+#define PRAM_ROOT_INO (PRAM_SB_SIZE*2)
+
+#endif	/* _LINUX_PRAM_FS_H */
diff --git a/include/linux/pram_fs_sb.h b/include/linux/pram_fs_sb.h
new file mode 100644
index 0000000..0306269
--- /dev/null
+++ b/include/linux/pram_fs_sb.h
@@ -0,0 +1,45 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Definitions for the PRAM filesystem.
+ *
+ * Copyright 2009-2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * Copyright 2003 Sony Corporation
+ * Copyright 2003 Matsushita Electric Industrial Co., Ltd.
+ * 2003-2004 (c) MontaVista Software, Inc. , Steve Longerbeam
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#ifndef _LINUX_PRAM_FS_SB
+#define _LINUX_PRAM_FS_SB
+
+/*
+ * PRAM filesystem super-block data in memory
+ */
+struct pram_sb_info {
+	/*
+	 * base physical and virtual address of PRAMFS (which is also
+	 * the pointer to the super block)
+	 */
+	phys_addr_t phys_addr;
+	void *virt_addr;
+
+	/* Mount options */
+	unsigned long bpi;
+	unsigned long num_inodes;
+	unsigned long blocksize;
+	unsigned long initsize;
+	unsigned long s_mount_opt;
+	uid_t uid;		    /* Mount uid for root directory */
+	gid_t gid;		    /* Mount gid for root directory */
+	mode_t mode;		    /* Mount mode for root directory */
+	atomic_t next_generation;
+#ifdef CONFIG_PRAMFS_XATTR
+	struct rb_root desc_tree;
+	spinlock_t desc_tree_lock;
+#endif
+};
+
+#endif	/* _LINUX_PRAM_FS_SB */
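
A note for readers on the addressing used by the inline helpers in pram.h: inode "numbers" and block references are byte offsets from the start of the mapped region (which is also the super block), and 0 is reserved to mean "none". The user-space sketch below mirrors pram_get_inode()/pram_get_block(), pram_get_block_off() and pram_get_blocknr() under that reading. The toy_* names and the struct are hypothetical; only the arithmetic is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SB_SIZE	128u
#define TOY_ROOT_INO	(TOY_SB_SIZE * 2)	/* after both super blocks */

struct toy_fs {
	unsigned char *base;		/* start of the region */
	uint64_t bitmap_start;		/* byte offset where block 0 begins */
	unsigned int blocksize_bits;
};

/* Offset to pointer; offset 0 means "no object". */
void *toy_get(const struct toy_fs *fs, uint64_t off)
{
	return off ? fs->base + off : NULL;
}

/* Block number to byte offset from the start of the region. */
uint64_t toy_block_off(const struct toy_fs *fs, unsigned long blocknr)
{
	return fs->bitmap_start + ((uint64_t)blocknr << fs->blocksize_bits);
}

/* Byte offset back to block number. */
unsigned long toy_blocknr(const struct toy_fs *fs, uint64_t off)
{
	assert(off >= fs->bitmap_start);
	return (unsigned long)((off - fs->bitmap_start) >> fs->blocksize_bits);
}
```

The sketch widens blocknr to 64 bits before shifting; that is a choice made here, not something taken from the patch.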

^ permalink raw reply related

* [PATCH 07/17] pramfs: symlink operations
From: Marco Stornelli @ 2011-01-06 12:02 UTC (permalink / raw)
  To: Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

Symlink operations.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff --git a/fs/pramfs/symlink.c b/fs/pramfs/symlink.c
new file mode 100644
index 0000000..f129271
--- /dev/null
+++ b/fs/pramfs/symlink.c
@@ -0,0 +1,76 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Symlink operations
+ *
+ * Copyright 2009-2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * Copyright 2003 Sony Corporation
+ * Copyright 2003 Matsushita Electric Industrial Co., Ltd.
+ * 2003-2004 (c) MontaVista Software, Inc. , Steve Longerbeam
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/fs.h>
+#include "pram.h"
+#include "xattr.h"
+
+int pram_block_symlink(struct inode *inode, const char *symname, int len)
+{
+	struct super_block *sb = inode->i_sb;
+	u64 block;
+	char *blockp;
+	int err;
+
+	err = pram_alloc_blocks(inode, 0, 1);
+	if (err)
+		return err;
+
+	block = pram_find_data_block(inode, 0);
+	blockp = pram_get_block(sb, block);
+
+	pram_memunlock_block(sb, blockp);
+	memcpy(blockp, symname, len);
+	blockp[len] = '\0';
+	pram_memlock_block(sb, blockp);
+	return 0;
+}
+
+static int pram_readlink(struct dentry *dentry, char *buffer, int buflen)
+{
+	struct inode *inode = dentry->d_inode;
+	struct super_block *sb = inode->i_sb;
+	u64 block;
+	char *blockp;
+
+	block = pram_find_data_block(inode, 0);
+	blockp = pram_get_block(sb, block);
+	return vfs_readlink(dentry, buffer, buflen, blockp);
+}
+
+static void *pram_follow_link(struct dentry *dentry, struct nameidata *nd)
+{
+	struct inode *inode = dentry->d_inode;
+	struct super_block *sb = inode->i_sb;
+	u64 block;
+	int status;
+	char *blockp;
+
+	block = pram_find_data_block(inode, 0);
+	blockp = pram_get_block(sb, block);
+	status = vfs_follow_link(nd, blockp);
+	return ERR_PTR(status);
+}
+
+struct inode_operations pram_symlink_inode_operations = {
+	.readlink	= pram_readlink,
+	.follow_link	= pram_follow_link,
+	.setattr	= pram_notify_change,
+#ifdef CONFIG_PRAMFS_XATTR
+	.setxattr	= generic_setxattr,
+	.getxattr	= generic_getxattr,
+	.listxattr	= pram_listxattr,
+	.removexattr	= generic_removexattr,
+#endif
+};
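
One detail worth spelling out for pram_block_symlink(): it copies len bytes and then writes a terminating NUL into the first data block, so the caller has to guarantee len + 1 <= blocksize. In this series that check sits in pram_symlink() in the namei.c patch. A minimal user-space sketch of the combined behaviour, with a hypothetical toy_store_symlink() and a made-up 64-byte block size:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define TOY_BLOCKSIZE 64

/* Store 'target' NUL-terminated in one block, or refuse if it cannot fit. */
int toy_store_symlink(char block[TOY_BLOCKSIZE], const char *target)
{
	size_t len;

	assert(block && target);
	len = strlen(target);

	/* room is needed for the string and its terminating NUL */
	if (len + 1 > TOY_BLOCKSIZE)
		return -ENAMETOOLONG;

	memcpy(block, target, len);
	block[len] = '\0';
	return 0;
}
```

With this layout the read side can hand the block pointer directly to a function that expects a C string, which is what pram_readlink() and pram_follow_link() do.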

^ permalink raw reply related

* [PATCH 06/17] pramfs: inode operations for directories
From: Marco Stornelli @ 2011-01-06 12:02 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

Inode operations for directories.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff --git a/fs/pramfs/namei.c b/fs/pramfs/namei.c
new file mode 100644
index 0000000..bedc43a
--- /dev/null
+++ b/fs/pramfs/namei.c
@@ -0,0 +1,371 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Inode operations for directories.
+ *
+ * Copyright 2009-2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * Copyright 2003 Sony Corporation
+ * Copyright 2003 Matsushita Electric Industrial Co., Ltd.
+ * 2003-2004 (c) MontaVista Software, Inc. , Steve Longerbeam
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+#include <linux/fs.h>
+#include <linux/pagemap.h>
+#include "pram.h"
+#include "acl.h"
+#include "xattr.h"
+#include "xip.h"
+
+/*
+ * Couple of helper functions - make the code slightly cleaner.
+ */
+
+static inline void pram_inc_count(struct inode *inode)
+{
+	inode->i_nlink++;
+	pram_write_inode(inode, 0);
+}
+
+static inline void pram_dec_count(struct inode *inode)
+{
+	if (inode->i_nlink) {
+		inode->i_nlink--;
+		pram_write_inode(inode, 0);
+	}
+}
+
+static inline int pram_add_nondir(struct inode *dir,
+				   struct dentry *dentry,
+				   struct inode *inode)
+{
+	int err = pram_add_link(dentry, inode);
+	if (!err) {
+		d_instantiate(dentry, inode);
+		unlock_new_inode(inode);
+		return 0;
+	}
+	pram_dec_count(inode);
+	unlock_new_inode(inode);
+	iput(inode);
+	return err;
+}
+
+/*
+ * Methods themselves.
+ */
+
+static ino_t
+pram_inode_by_name(struct inode *dir,
+		   struct dentry *dentry)
+{
+	struct pram_inode *pi;
+	ino_t ino;
+	int namelen;
+
+	pi = pram_get_inode(dir->i_sb, dir->i_ino);
+	ino = be64_to_cpu(pi->i_type.dir.head);
+
+	while (ino) {
+		pi = pram_get_inode(dir->i_sb, ino);
+
+		if (pi->i_links_count) {
+			namelen = strlen(pi->i_d.d_name);
+
+			if (namelen == dentry->d_name.len &&
+			    !memcmp(dentry->d_name.name,
+				    pi->i_d.d_name, namelen))
+				break;
+		}
+
+		ino = be64_to_cpu(pi->i_d.d_next);
+	}
+
+	return ino;
+}
+
+static struct dentry *
+pram_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+{
+	struct inode *inode = NULL;
+	ino_t ino;
+
+	if (dentry->d_name.len > PRAM_NAME_LEN)
+		return ERR_PTR(-ENAMETOOLONG);
+
+	ino = pram_inode_by_name(dir, dentry);
+	if (ino) {
+		inode = pram_iget(dir->i_sb, ino);
+		if (unlikely(IS_ERR(inode))) {
+			if (PTR_ERR(inode) == -ESTALE) {
+				pram_err(dir->i_sb, "deleted inode referenced: %lu",
+						(unsigned long) ino);
+				return ERR_PTR(-EIO);
+			} else {
+				return ERR_CAST(inode);
+			}
+		}
+	}
+
+	/* hand back the alias, if any, that d_splice_alias() chose */
+	return d_splice_alias(inode, dentry);
+}
+
+
+/*
+ * By the time this is called, we already have created
+ * the directory cache entry for the new file, but it
+ * is so far negative - it has no inode.
+ *
+ * If the create succeeds, we fill in the inode information
+ * with d_instantiate().
+ */
+static int pram_create(struct inode *dir, struct dentry *dentry,
+			int mode, struct nameidata *nd)
+{
+	struct inode *inode = pram_new_inode(dir, mode);
+	int err = PTR_ERR(inode);
+	if (!IS_ERR(inode)) {
+		inode->i_op = &pram_file_inode_operations;
+		if (pram_use_xip(inode->i_sb)) {
+			inode->i_mapping->a_ops = &pram_aops_xip;
+			inode->i_fop = &pram_xip_file_operations;
+		} else {
+			inode->i_fop = &pram_file_operations;
+			inode->i_mapping->a_ops = &pram_aops;
+		}
+		err = pram_add_nondir(dir, dentry, inode);
+	}
+	return err;
+}
+
+static int pram_mknod(struct inode *dir, struct dentry *dentry, int mode,
+		       dev_t rdev)
+{
+	struct inode *inode = pram_new_inode(dir, mode);
+	int err = PTR_ERR(inode);
+	if (!IS_ERR(inode)) {
+		init_special_inode(inode, mode, rdev);
+		pram_write_inode(inode, 0); /* update rdev */
+		err = pram_add_nondir(dir, dentry, inode);
+	}
+	return err;
+}
+
+static int pram_symlink(struct inode *dir,
+			  struct dentry *dentry,
+			  const char *symname)
+{
+	struct super_block *sb = dir->i_sb;
+	int err = -ENAMETOOLONG;
+	unsigned len = strlen(symname);
+	struct inode *inode;
+
+	if (len+1 > sb->s_blocksize)
+		goto out;
+
+	inode = pram_new_inode(dir, S_IFLNK | S_IRWXUGO);
+	err = PTR_ERR(inode);
+	if (IS_ERR(inode))
+		goto out;
+
+	inode->i_op = &pram_symlink_inode_operations;
+	inode->i_mapping->a_ops = &pram_aops;
+
+	err = pram_block_symlink(inode, symname, len);
+	if (err)
+		goto out_fail;
+
+	inode->i_size = len;
+	pram_write_inode(inode, 0);
+
+	err = pram_add_nondir(dir, dentry, inode);
+out:
+	return err;
+
+out_fail:
+	pram_dec_count(inode);
+	unlock_new_inode(inode);
+	iput(inode);
+	goto out;
+}
+
+static int pram_link(struct dentry *dest_dentry,
+		       struct inode *dir,
+		       struct dentry *dentry)
+{
+	pram_dbg("hard links not supported\n");
+	return -EOPNOTSUPP;
+}
+
+static int pram_unlink(struct inode *dir, struct dentry *dentry)
+{
+	struct inode *inode = dentry->d_inode;
+	inode->i_ctime = dir->i_ctime;
+	pram_dec_count(inode);
+	return 0;
+}
+
+static int pram_mkdir(struct inode *dir, struct dentry *dentry, int mode)
+{
+	struct inode *inode;
+	struct pram_inode *pi;
+	int err = -EMLINK;
+
+	if (dir->i_nlink >= PRAM_LINK_MAX)
+		goto out;
+
+	pram_inc_count(dir);
+
+	inode = pram_new_inode(dir, S_IFDIR | mode);
+	err = PTR_ERR(inode);
+	if (IS_ERR(inode))
+		goto out_dir;
+
+	inode->i_op = &pram_dir_inode_operations;
+	inode->i_fop = &pram_dir_operations;
+	inode->i_mapping->a_ops = &pram_aops;
+
+	pram_inc_count(inode);
+
+	/* make the new directory empty */
+	pi = pram_get_inode(dir->i_sb, inode->i_ino);
+	pram_memunlock_inode(dir->i_sb, pi);
+	pi->i_type.dir.head = pi->i_type.dir.tail = 0;
+	pram_memlock_inode(dir->i_sb, pi);
+
+	err = pram_add_link(dentry, inode);
+	if (err)
+		goto out_fail;
+
+	d_instantiate(dentry, inode);
+	unlock_new_inode(inode);
+out:
+	return err;
+
+out_fail:
+	pram_dec_count(inode);
+	pram_dec_count(inode);
+	unlock_new_inode(inode);
+	iput(inode);
+out_dir:
+	pram_dec_count(dir);
+	goto out;
+}
+
+static int pram_rmdir(struct inode *dir, struct dentry *dentry)
+{
+	struct inode *inode = dentry->d_inode;
+	struct pram_inode *pi;
+	int err = -ENOTEMPTY;
+
+	if (!inode)
+		return -ENOENT;
+
+	pi = pram_get_inode(dir->i_sb, inode->i_ino);
+
+	/* directory to delete is empty? */
+	if (pi->i_type.dir.tail == 0) {
+		inode->i_ctime = dir->i_ctime;
+		inode->i_size = 0;
+		inode->i_nlink = 0;
+		pram_write_inode(inode, 0);
+		pram_dec_count(dir);
+		err = 0;
+	} else {
+		pram_dbg("dir not empty\n");
+	}
+
+	return err;
+}
+
+static int pram_rename(struct inode  *old_dir,
+			struct dentry *old_dentry,
+			struct inode  *new_dir,
+			struct dentry *new_dentry)
+{
+	struct inode *old_inode = old_dentry->d_inode;
+	struct inode *new_inode = new_dentry->d_inode;
+	struct pram_inode *pi_new;
+	int err = -ENOENT;
+
+	if (new_inode) {
+		err = -ENOTEMPTY;
+		pi_new = pram_get_inode(new_dir->i_sb, new_inode->i_ino);
+		if (S_ISDIR(old_inode->i_mode)) {
+			if (pi_new->i_type.dir.tail != 0)
+				goto out;
+			if (new_inode->i_nlink)
+				new_inode->i_nlink--;
+		}
+
+		new_inode->i_ctime = CURRENT_TIME;
+		pram_dec_count(new_inode);
+	} else {
+		if (S_ISDIR(old_inode->i_mode)) {
+			err = -EMLINK;
+			if (new_dir->i_nlink >= PRAM_LINK_MAX)
+				goto out;
+			pram_dec_count(old_dir);
+			pram_inc_count(new_dir);
+		}
+	}
+
+	/* unlink the inode from the old directory ... */
+	err = pram_remove_link(old_inode);
+	if (err)
+		goto out;
+
+	/* and link it into the new directory. */
+	err = pram_add_link(new_dentry, old_inode);
+	if (err)
+		goto out;
+
+	err = 0;
+ out:
+	return err;
+}
+
+struct dentry *pram_get_parent(struct dentry *child)
+{
+	struct inode *inode;
+	struct pram_inode *pi, *piparent;
+	ino_t ino;
+
+	pi = pram_get_inode(child->d_inode->i_sb, child->d_inode->i_ino);
+	if (!pi)
+		return ERR_PTR(-EACCES);
+
+	piparent = pram_get_inode(child->d_inode->i_sb, be64_to_cpu(pi->i_d.d_parent));
+	if (!piparent)
+		return ERR_PTR(-ENOENT);
+
+	ino = pram_get_inodenr(child->d_inode->i_sb, piparent);
+	if (ino)
+		inode = pram_iget(child->d_inode->i_sb, ino);
+	else
+		return ERR_PTR(-ENOENT);
+
+	return d_obtain_alias(inode);
+}
+
+struct inode_operations pram_dir_inode_operations = {
+	.create		= pram_create,
+	.lookup		= pram_lookup,
+	.link		= pram_link,
+	.unlink		= pram_unlink,
+	.symlink	= pram_symlink,
+	.mkdir		= pram_mkdir,
+	.rmdir		= pram_rmdir,
+	.mknod		= pram_mknod,
+	.rename		= pram_rename,
+#ifdef CONFIG_PRAMFS_XATTR
+	.setxattr	= generic_setxattr,
+	.getxattr	= generic_getxattr,
+	.listxattr	= pram_listxattr,
+	.removexattr	= generic_removexattr,
+#endif
+	.setattr	= pram_notify_change,
+	.check_acl	= pram_check_acl,
+};


* [PATCH 05/17] pramfs: block allocation
From: Marco Stornelli @ 2011-01-06 12:02 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

Block allocation operations.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff --git a/fs/pramfs/balloc.c b/fs/pramfs/balloc.c
new file mode 100644
index 0000000..48b6e95
--- /dev/null
+++ b/fs/pramfs/balloc.c
@@ -0,0 +1,147 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Block allocation and deallocation routines.
+ *
+ * Copyright 2009-2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * Copyright 2003 Sony Corporation
+ * Copyright 2003 Matsushita Electric Industrial Co., Ltd.
+ * 2003-2004 (c) MontaVista Software, Inc. , Steve Longerbeam
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/fs.h>
+#include <linux/bitops.h>
+#include "pram.h"
+
+/*
+ * This just marks in-use the blocks that make up the bitmap.
+ * The bitmap must be writeable before calling.
+ */
+void pram_init_bitmap(struct super_block *sb)
+{
+	struct pram_super_block *ps = pram_get_super(sb);
+	unsigned long *bitmap = pram_get_bitmap(sb);
+	int blocks = be32_to_cpu(ps->s_bitmap_blocks);
+
+	memset(bitmap, 0, blocks << sb->s_blocksize_bits);
+
+	bitmap_fill(bitmap, blocks);
+}
+
+
+/* Free absolute blocknr */
+void pram_free_block(struct super_block *sb, unsigned long blocknr)
+{
+	struct pram_super_block *ps;
+	u64 bitmap_block;
+	unsigned long bitmap_bnr;
+	void *bitmap;
+	void *bp;
+
+	lock_super(sb);
+
+	bitmap = pram_get_bitmap(sb);
+	/*
+	 * find the block within the bitmap that contains the inuse bit
+	 * for the block we need to free. We need to unlock this bitmap
+	 * block to clear the inuse bit.
+	 */
+	bitmap_bnr = blocknr >> (3 + sb->s_blocksize_bits);
+	bitmap_block = pram_get_block_off(sb, bitmap_bnr);
+	bp = pram_get_block(sb, bitmap_block);
+
+	pram_memunlock_block(sb, bp);
+	pram_clear_bit(blocknr, bitmap); /* mark the block free */
+	pram_memlock_block(sb, bp);
+
+	ps = pram_get_super(sb);
+	pram_memunlock_super(sb, ps);
+
+	if (blocknr < be32_to_cpu(ps->s_free_blocknr_hint))
+		ps->s_free_blocknr_hint = cpu_to_be32(blocknr);
+	be32_add_cpu(&ps->s_free_blocks_count, 1);
+	pram_memlock_super(sb, ps);
+
+	unlock_super(sb);
+}
+
+
+/*
+ * Allocate a block and return its absolute blocknr. Zeroes out the
+ * block if zero is set.
+ */
+int pram_new_block(struct super_block *sb, unsigned long *blocknr, int zero)
+{
+	struct pram_super_block *ps;
+	u64 bitmap_block;
+	unsigned long bnr, bitmap_bnr;
+	int errval;
+	void *bitmap;
+	void *bp;
+
+	lock_super(sb);
+	ps = pram_get_super(sb);
+	bitmap = pram_get_bitmap(sb);
+
+	if (ps->s_free_blocks_count) {
+		/* find the oldest unused block */
+		bnr = pram_find_next_zero_bit(bitmap,
+					 be32_to_cpu(ps->s_blocks_count),
+					 be32_to_cpu(ps->s_free_blocknr_hint));
+
+		if (bnr < be32_to_cpu(ps->s_bitmap_blocks) ||
+				bnr >= be32_to_cpu(ps->s_blocks_count)) {
+			pram_dbg("no free blocks found!\n");
+			errval = -ENOSPC;
+			goto fail;
+		}
+
+		pram_memunlock_super(sb, ps);
+		be32_add_cpu(&ps->s_free_blocks_count, -1);
+		if (bnr < (be32_to_cpu(ps->s_blocks_count)-1))
+			ps->s_free_blocknr_hint = cpu_to_be32(bnr+1);
+		else
+			ps->s_free_blocknr_hint = 0;
+		pram_memlock_super(sb, ps);
+	} else {
+		pram_dbg("all blocks allocated\n");
+		errval = -ENOSPC;
+		goto fail;
+	}
+
+	/*
+	 * find the block within the bitmap that contains the inuse bit
+	 * for the unused block we just found. We need to unlock it to
+	 * set the inuse bit.
+	 */
+	bitmap_bnr = bnr >> (3 + sb->s_blocksize_bits);
+	bitmap_block = pram_get_block_off(sb, bitmap_bnr);
+	bp = pram_get_block(sb, bitmap_block);
+
+	pram_memunlock_block(sb, bp);
+	pram_set_bit(bnr, bitmap); /* mark the new block in use */
+	pram_memlock_block(sb, bp);
+
+	if (zero) {
+		bp = pram_get_block(sb, pram_get_block_off(sb, bnr));
+		pram_memunlock_block(sb, bp);
+		memset(bp, 0, sb->s_blocksize);
+		pram_memlock_block(sb, bp);
+	}
+
+	*blocknr = bnr;
+	pram_dbg("allocated blocknr %lu\n", bnr);
+	errval = 0;
+ fail:
+	unlock_super(sb);
+	return errval;
+}
+
+unsigned long pram_count_free_blocks(struct super_block *sb)
+{
+	struct pram_super_block *ps = pram_get_super(sb);
+	return be32_to_cpu(ps->s_free_blocks_count);
+}
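
For readers following the index arithmetic in pram_free_block() and
pram_new_block(), here is a minimal userspace sketch. It is not the pramfs
code: it assumes a plain byte-array bitmap with LSB-first bit order, and the
helper names (bitmap_block_index, test_bit8, set_bit8, clear_bit8,
find_zero_from) are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Index of the bitmap block that holds the in-use bit for blocknr.
 * One bitmap block stores 8 * block_size bits, hence the shift by
 * (3 + blocksize_bits), as in the patch.
 */
static unsigned long bitmap_block_index(unsigned long blocknr,
					unsigned int blocksize_bits)
{
	assert(blocksize_bits + 3 < sizeof(unsigned long) * CHAR_BIT);
	return blocknr >> (3 + blocksize_bits);
}

/* Assumed layout: bit nr lives in byte nr / 8, at bit position nr % 8. */
static int test_bit8(const unsigned char *map, unsigned long nr)
{
	return (map[nr >> 3] >> (nr & 7)) & 1;
}

static void set_bit8(unsigned char *map, unsigned long nr)
{
	map[nr >> 3] |= (unsigned char)(1u << (nr & 7));
}

static void clear_bit8(unsigned char *map, unsigned long nr)
{
	map[nr >> 3] &= (unsigned char)~(1u << (nr & 7));
}

/* First clear bit in [hint, nbits), or nbits when none is found. */
static unsigned long find_zero_from(const unsigned char *map,
				    unsigned long nbits, unsigned long hint)
{
	unsigned long nr;

	for (nr = hint; nr < nbits; nr++)
		if (!test_bit8(map, nr))
			return nr;
	return nbits;
}
```

With 4 KiB blocks (blocksize_bits = 12) one bitmap block covers 32768
blocks, so block 32768 is the first one tracked by bitmap block 1. As in the
patch, the search starts at the hint and does not wrap around.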


* [PATCH 04/17] pramfs: file operations
From: Marco Stornelli @ 2011-01-06 12:01 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

File operations.
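
The copy helpers in this patch walk an iovec array, starting at an offset
into the first segment. A minimal userspace sketch of that walk follows. It
assumes memcpy() in place of __copy_from_user() and adds an iovcnt bound
that the kernel helper does not take; gather_from_iov is a hypothetical
name, not a function from the patch.

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Copy up to bytes from the segments in iov into dst, skipping the
 * first base bytes of the first segment. Returns the number of bytes
 * actually copied, which is short if the segments run out first.
 */
static size_t gather_from_iov(char *dst, const struct iovec *iov,
			      int iovcnt, size_t base, size_t bytes)
{
	size_t copied = 0;

	assert(iovcnt > 0 && base <= iov->iov_len);

	while (bytes && iovcnt > 0) {
		size_t avail = iov->iov_len - base;
		size_t copy = bytes < avail ? bytes : avail;

		memcpy(dst, (const char *)iov->iov_base + base, copy);
		base = 0;	/* only the first segment is entered mid-way */
		copied += copy;
		bytes -= copy;
		dst += copy;
		iov++;
		iovcnt--;
	}
	return copied;
}
```

For two segments "hello" and "world" with base = 2 and bytes = 6, the
result is "llowor": three bytes from the tail of the first segment and three
from the head of the second.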

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff --git a/fs/pramfs/file.c b/fs/pramfs/file.c
new file mode 100644
index 0000000..05a4af4
--- /dev/null
+++ b/fs/pramfs/file.c
@@ -0,0 +1,326 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * File operations for files.
+ *
+ * Copyright 2009-2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * Copyright 2003 Sony Corporation
+ * Copyright 2003 Matsushita Electric Industrial Co., Ltd.
+ * 2003-2004 (c) MontaVista Software, Inc. , Steve Longerbeam
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/fs.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/uio.h>
+#include <linux/mm.h>
+#include <linux/uaccess.h>
+#include "pram.h"
+#include "acl.h"
+#include "xip.h"
+#include "xattr.h"
+
+/*
+ * The following functions are helper routines to copy to/from
+ * user space and iterate over I/O vectors (mainly for readv/writev).
+ * They are used in the direct I/O path.
+ */
+static size_t __pram_iov_copy_from(char *vaddr,
+			const struct iovec *iov, size_t base, size_t bytes)
+{
+	size_t copied = 0, left = 0;
+
+	while (bytes) {
+		char __user *buf = iov->iov_base + base;
+		int copy = min(bytes, iov->iov_len - base);
+
+		base = 0;
+		left = __copy_from_user(vaddr, buf, copy);
+		copied += copy;
+		bytes -= copy;
+		vaddr += copy;
+		iov++;
+
+		if (unlikely(left))
+			break;
+	}
+	return copied - left;
+}
+
+static size_t __pram_iov_copy_to(char *vaddr,
+			const struct iovec *iov, size_t base, size_t bytes)
+{
+	size_t copied = 0, left = 0;
+
+	while (bytes) {
+		char __user *buf = iov->iov_base + base;
+		int copy = min(bytes, iov->iov_len - base);
+
+		base = 0;
+		left = __copy_to_user(buf, vaddr, copy);
+		copied += copy;
+		bytes -= copy;
+		vaddr += copy;
+		iov++;
+
+		if (unlikely(left))
+			break;
+	}
+	return copied - left;
+}
+
+static size_t pram_iov_copy_from(void *to, struct iov_iter *i, size_t bytes)
+{
+	size_t copied;
+
+	if (likely(i->nr_segs == 1)) {
+		int left;
+		char __user *buf = i->iov->iov_base + i->iov_offset;
+		left = __copy_from_user(to, buf, bytes);
+		copied = bytes - left;
+	} else {
+		copied = __pram_iov_copy_from(to, i->iov, i->iov_offset, bytes);
+	}
+
+	return copied;
+}
+
+static size_t pram_iov_copy_to(void *from, struct iov_iter *i, size_t bytes)
+{
+	size_t copied;
+
+	if (likely(i->nr_segs == 1)) {
+		int left;
+		char __user *buf = i->iov->iov_base + i->iov_offset;
+		left = __copy_to_user(buf, from, bytes);
+		copied = bytes - left;
+	} else {
+		copied = __pram_iov_copy_to(from, i->iov, i->iov_offset, bytes);
+	}
+
+	return copied;
+}
+
+static size_t __pram_clear_user(const struct iovec *iov, size_t base, size_t bytes)
+{
+	size_t cleaned = 0, left = 0;
+
+	while (bytes) {
+		char __user *buf = iov->iov_base + base;
+		int clear = min(bytes, iov->iov_len - base);
+
+		base = 0;
+		left = __clear_user(buf, clear);
+		cleaned += clear;
+		bytes -= clear;
+		iov++;
+
+		if (unlikely(left))
+			break;
+	}
+	return cleaned - left;
+}
+
+static size_t pram_clear_user(struct iov_iter *i, size_t bytes)
+{
+	size_t clear;
+
+	if (likely(i->nr_segs == 1)) {
+		int left;
+		char __user *buf = i->iov->iov_base + i->iov_offset;
+		left = __clear_user(buf, bytes);
+		clear = bytes - left;
+	} else {
+		clear = __pram_clear_user(i->iov, i->iov_offset, bytes);
+	}
+
+	return clear;
+}
+
+static int pram_open_file(struct inode *inode, struct file *filp)
+{
+	filp->f_flags |= O_DIRECT;
+	return generic_file_open(inode, filp);
+}
+
+ssize_t pram_direct_IO(int rw, struct kiocb *iocb,
+		   const struct iovec *iov,
+		   loff_t offset, unsigned long nr_segs)
+{
+	struct file *file = iocb->ki_filp;
+	struct inode *inode = file->f_mapping->host;
+	struct super_block *sb = inode->i_sb;
+	int progress = 0, hole = 0, alloc_once = 1;
+	ssize_t retval = 0;
+	void *tmp = NULL;
+	unsigned long blocknr, blockoff, blocknr_start;
+	struct iov_iter iter;
+	int num_blocks, blocksize_mask;
+	size_t length = iov_length(iov, nr_segs);
+
+	if ((ssize_t)length < 0)
+		return -EINVAL;
+	if ((rw == READ) && (offset + length > inode->i_size))
+		length = inode->i_size - offset;
+	if (!length)
+		goto out;
+
+	blocksize_mask = (1 << sb->s_blocksize_bits) - 1;
+	/* find starting block number to access */
+	blocknr = offset >> sb->s_blocksize_bits;
+	/* find starting offset within starting block */
+	blockoff = offset & blocksize_mask;
+	/* find number of blocks to access */
+	num_blocks = (blockoff + length + blocksize_mask) >>
+							sb->s_blocksize_bits;
+	blocknr_start = blocknr;
+
+	if (rw == WRITE) {
+		/* prepare a temporary buffer to hold a user data block
+		   for writing. */
+		tmp = kmalloc(sb->s_blocksize, GFP_KERNEL);
+		if (!tmp)
+			return -ENOMEM;
+	}
+
+	iov_iter_init(&iter, iov, nr_segs, length, 0);
+
+	while (length) {
+		int count;
+		u8 *bp = NULL;
+		u64 block = pram_find_data_block(inode, blocknr);
+		if (!block) {
+			if (alloc_once && rw == WRITE) {
+				/*
+				 * Allocate the data blocks starting from blocknr
+				 * to the end.
+				 */
+				retval = pram_alloc_blocks(inode, blocknr,
+							num_blocks - (blocknr -
+								 blocknr_start));
+				if (retval)
+					goto fail;
+				/* retry....*/
+				block = pram_find_data_block(inode, blocknr);
+				BUG_ON(!block);
+				alloc_once = 0;
+			} else if (unlikely(rw == READ)) {
+				/* We hit a hole in the file */
+				hole = 1;
+				goto hole;
+			}
+		}
+		bp = (u8 *)pram_get_block(sb, block);
+		if (!bp) {
+			retval = -EACCES;
+			goto fail;
+		}
+ hole:
+		++blocknr;
+
+		count = blockoff + length > sb->s_blocksize ?
+			sb->s_blocksize - blockoff : length;
+
+		if (rw == READ) {
+			if (unlikely(hole)) {
+				retval = pram_clear_user(&iter, count);
+				if (retval != count) {
+					retval = -EFAULT;
+					goto fail;
+				}
+			} else {
+				retval = pram_iov_copy_to(&bp[blockoff], &iter,
+							  count);
+				if (retval != count) {
+					retval = -EFAULT;
+					goto fail;
+				}
+			}
+		} else {
+			retval = pram_iov_copy_from(tmp, &iter, count);
+			if (retval != count) {
+				retval = -EFAULT;
+				goto fail;
+			}
+
+			pram_memunlock_block(inode->i_sb, bp);
+			memcpy(&bp[blockoff], tmp, count);
+			pram_memlock_block(inode->i_sb, bp);
+		}
+
+		progress += count;
+		iov_iter_advance(&iter, count);
+		length -= count;
+		blockoff = 0;
+		hole = 0;
+	}
+
+	retval = progress;
+ fail:
+	kfree(tmp);
+ out:
+	return retval;
+}
+
+int pram_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	/* Only private mappings */
+	if (vma->vm_flags & VM_SHARED)
+		return -EINVAL;
+	return generic_file_mmap(file, vma);
+}
+
+static int pram_check_flags(int flags)
+{
+	if (!(flags & O_DIRECT))
+		return -EINVAL;
+
+	return 0;
+}
+
+struct file_operations pram_file_operations = {
+	.llseek		= generic_file_llseek,
+	.read		= do_sync_read,
+	.write		= do_sync_write,
+	.aio_read	= generic_file_aio_read,
+	.aio_write	= generic_file_aio_write,
+	.mmap		= pram_mmap,
+	.open		= pram_open_file,
+	.fsync		= noop_fsync,
+	.check_flags	= pram_check_flags,
+	.unlocked_ioctl	= pram_ioctl,
+	.splice_read	= generic_file_splice_read,
+#ifdef CONFIG_COMPAT
+	.compat_ioctl	= pram_compat_ioctl,
+#endif
+};
+
+#ifdef CONFIG_PRAMFS_XIP
+struct file_operations pram_xip_file_operations = {
+	.llseek		= generic_file_llseek,
+	.read		= xip_file_read,
+	.write		= xip_file_write,
+	.mmap		= xip_file_mmap,
+	.open		= generic_file_open,
+	.fsync		= noop_fsync,
+	.unlocked_ioctl	= pram_ioctl,
+#ifdef CONFIG_COMPAT
+	.compat_ioctl	= pram_compat_ioctl,
+#endif
+};
+#endif
+
+struct inode_operations pram_file_inode_operations = {
+#ifdef CONFIG_PRAMFS_XATTR
+	.setxattr	= generic_setxattr,
+	.getxattr	= generic_getxattr,
+	.listxattr	= pram_listxattr,
+	.removexattr	= generic_removexattr,
+#endif
+	.setattr	= pram_notify_change,
+	.check_acl	= pram_check_acl,
+	.fallocate	= pram_fallocate,
+};
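
The way pram_direct_IO() splits a byte range into blocks can be checked in
isolation. The sketch below is userspace C under stated assumptions: the
block size is a power of two, and the names io_span, split_io and chunk_len
are invented here for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct io_span {
	unsigned long first_block;	/* file-relative block of the first byte */
	unsigned long first_off;	/* offset of the first byte in that block */
	unsigned long num_blocks;	/* number of blocks the range touches */
};

/* Same arithmetic as the blocknr/blockoff/num_blocks setup in the patch. */
static struct io_span split_io(unsigned long long offset, size_t length,
			       unsigned int blocksize_bits)
{
	unsigned long mask;
	struct io_span s;

	assert(blocksize_bits < 31);
	mask = (1UL << blocksize_bits) - 1;

	s.first_block = (unsigned long)(offset >> blocksize_bits);
	s.first_off = (unsigned long)(offset & mask);
	s.num_blocks = (s.first_off + length + mask) >> blocksize_bits;
	return s;
}

/* Bytes to transfer in the current block; blockoff is 0 after the first. */
static size_t chunk_len(unsigned long blockoff, size_t length,
			size_t blocksize)
{
	return blockoff + length > blocksize ? blocksize - blockoff : length;
}
```

For offset 5000 and length 6000 with 4 KiB blocks, the range starts 904
bytes into block 1 and touches two blocks; the loop would move 3192 bytes
first and the remaining 2808 bytes second.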


* [PATCH 03/17] pramfs: inode operations
From: Marco Stornelli @ 2011-01-06 12:01 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

Inode methods (allocate/free/read/write).
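
The data block lookup in this patch goes through a two-level table: one row
block of pointers to column blocks, each holding 8-byte block pointers. A
minimal userspace sketch of the index arithmetic follows; the names
blk_index, index_of and max_file_blocks are hypothetical, and the block
size is assumed to be a power of two between 16 bytes and 64 KiB.

```c
#include <assert.h>

struct blk_index {
	unsigned int row;	/* slot in the row block */
	unsigned int col;	/* slot in the selected column block */
};

/* Split a file-relative block number into row and column slots. */
static struct blk_index index_of(unsigned long file_blocknr,
				 unsigned int blocksize_bits)
{
	unsigned int nbits;
	unsigned long n;
	struct blk_index idx;

	assert(blocksize_bits > 3 && blocksize_bits <= 16);
	nbits = blocksize_bits - 3;	/* log2 of pointers per block */
	n = 1UL << nbits;		/* pointers per block */

	idx.row = (unsigned int)(file_blocknr >> nbits);
	idx.col = (unsigned int)(file_blocknr & (n - 1));
	return idx;
}

/* Number of data blocks addressable by one row block: N * N. */
static unsigned long max_file_blocks(unsigned int blocksize_bits)
{
	assert(blocksize_bits > 3 && blocksize_bits <= 16);
	return 1UL << (2 * blocksize_bits - 6);
}
```

With 4 KiB blocks there are 512 pointers per block, so file block 1000 maps
to row 1, column 488, and a single file can address 262144 data blocks
(1 GiB).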

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff --git a/fs/pramfs/inode.c b/fs/pramfs/inode.c
new file mode 100644
index 0000000..e5ee072
--- /dev/null
+++ b/fs/pramfs/inode.c
@@ -0,0 +1,848 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Inode methods (allocate/free/read/write).
+ *
+ * Copyright 2009-2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * Copyright 2003 Sony Corporation
+ * Copyright 2003 Matsushita Electric Industrial Co., Ltd.
+ * 2003-2004 (c) MontaVista Software, Inc. , Steve Longerbeam
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/fs.h>
+#include <linux/smp_lock.h>
+#include <linux/sched.h>
+#include <linux/highuid.h>
+#include <linux/module.h>
+#include <linux/mpage.h>
+#include <linux/backing-dev.h>
+#include <linux/falloc.h>
+#include "pram.h"
+#include "xattr.h"
+#include "xip.h"
+#include "acl.h"
+
+struct backing_dev_info pram_backing_dev_info __read_mostly = {
+	.ra_pages       = 0,    /* No readahead */
+	.capabilities	= BDI_CAP_NO_ACCT_AND_WRITEBACK,
+};
+
+/*
+ * Allocate a data block for inode and return its absolute blocknr.
+ * Zeroes out the block if zero is set. Increments inode->i_blocks.
+ */
+static int pram_new_data_block(struct inode *inode, unsigned long *blocknr,
+			       int zero)
+{
+	int errval = pram_new_block(inode->i_sb, blocknr, zero);
+
+	if (!errval) {
+		struct pram_inode *pi = pram_get_inode(inode->i_sb,
+							inode->i_ino);
+		inode->i_blocks++;
+		pram_memunlock_inode(inode->i_sb, pi);
+		pi->i_blocks = cpu_to_be32(inode->i_blocks);
+		pram_memlock_inode(inode->i_sb, pi);
+	}
+
+	return errval;
+}
+
+/*
+ * find the offset to the block represented by the given inode's file
+ * relative block number.
+ */
+u64 pram_find_data_block(struct inode *inode, unsigned long file_blocknr)
+{
+	struct super_block *sb = inode->i_sb;
+	struct pram_inode *pi;
+	u64 *row; /* ptr to row block */
+	u64 *col; /* ptr to column blocks */
+	u64 bp = 0;
+	unsigned int i_row, i_col;
+	unsigned int N = sb->s_blocksize >> 3; /* num block ptrs per block */
+	unsigned int Nbits = sb->s_blocksize_bits - 3;
+
+	pi = pram_get_inode(sb, inode->i_ino);
+
+	i_row = file_blocknr >> Nbits;
+	i_col  = file_blocknr & (N-1);
+
+	row = pram_get_block(sb, be64_to_cpu(pi->i_type.reg.row_block));
+	if (row) {
+		col = pram_get_block(sb, be64_to_cpu(row[i_row]));
+		if (col)
+			bp = be64_to_cpu(col[i_col]);
+	}
+
+	return bp;
+}
+
+/*
+ * Free data blocks from inode in the range start <=> end
+ */
+static void __pram_truncate_blocks(struct inode *inode, loff_t start,
+				   loff_t end)
+{
+	struct super_block *sb = inode->i_sb;
+	struct pram_inode *pi = pram_get_inode(sb, inode->i_ino);
+	int N = sb->s_blocksize >> 3; /* num block ptrs per block */
+	int Nbits = sb->s_blocksize_bits - 3;
+	int first_row_index, last_row_index, i, j;
+	unsigned long blocknr, first_blocknr, last_blocknr;
+	unsigned int freed = 0;
+	u64 *row; /* ptr to row block */
+	u64 *col; /* ptr to column blocks */
+
+	mutex_lock(&PRAM_I(inode)->truncate_mutex);
+
+	if (start > end || !inode->i_blocks || !pi->i_type.reg.row_block) {
+		mutex_unlock(&PRAM_I(inode)->truncate_mutex);
+		return;
+	}
+
+	first_blocknr = (start + sb->s_blocksize - 1) >> sb->s_blocksize_bits;
+
+	if ((be32_to_cpu(pi->i_flags) & PRAM_EOFBLOCKS_FL) && start == 0)
+		last_blocknr = (1UL << (2*sb->s_blocksize_bits - 6)) - 1;
+	else
+		last_blocknr = (end + sb->s_blocksize - 1) >>
+							   sb->s_blocksize_bits;
+	first_row_index = first_blocknr >> Nbits;
+	last_row_index  = last_blocknr >> Nbits;
+
+	row = pram_get_block(sb, be64_to_cpu(pi->i_type.reg.row_block));
+
+	for (i = first_row_index; i <= last_row_index; i++) {
+		int first_col_index = (i == first_row_index) ?
+			first_blocknr & (N-1) : 0;
+		int last_col_index = (i == last_row_index) ?
+			last_blocknr & (N-1) : N-1;
+
+		if (unlikely(!row[i]))
+			continue;
+
+		col = pram_get_block(sb, be64_to_cpu(row[i]));
+
+		for (j = first_col_index; j <= last_col_index; j++) {
+
+			if (unlikely(!col[j]))
+				continue;
+
+			blocknr = pram_get_blocknr(sb, be64_to_cpu(col[j]));
+			pram_free_block(sb, blocknr);
+			freed++;
+			pram_memunlock_block(sb, col);
+			col[j] = 0;
+			pram_memlock_block(sb, col);
+		}
+
+		if (first_col_index == 0) {
+			blocknr = pram_get_blocknr(sb, be64_to_cpu(row[i]));
+			pram_free_block(sb, blocknr);
+			pram_memunlock_block(sb, row);
+			row[i] = 0;
+			pram_memlock_block(sb, row);
+		}
+	}
+
+	inode->i_blocks -= freed;
+
+	if (start == 0) {
+		blocknr = pram_get_blocknr(sb, be64_to_cpu(pi->i_type.reg.row_block));
+		pram_free_block(sb, blocknr);
+		pram_memunlock_inode(sb, pi);
+		pi->i_type.reg.row_block = 0;
+		pi->i_flags &= cpu_to_be32(~PRAM_EOFBLOCKS_FL);
+		goto update_blocks;
+	}
+	pram_memunlock_inode(sb, pi);
+
+ update_blocks:
+	pi->i_blocks = cpu_to_be32(inode->i_blocks);
+	pram_memlock_inode(sb, pi);
+
+	mutex_unlock(&PRAM_I(inode)->truncate_mutex);
+}
+
+static void pram_truncate_blocks(struct inode *inode, loff_t start, loff_t end)
+{
+	if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
+	      S_ISLNK(inode->i_mode)))
+		return;
+	if (IS_APPEND(inode) || IS_IMMUTABLE(inode))
+		return;
+
+	__pram_truncate_blocks(inode, start, end);
+	inode->i_mtime = inode->i_ctime = CURRENT_TIME_SEC;
+	pram_update_inode(inode);
+}
+
+/*
+ * Allocate num data blocks for inode, starting at given file-relative
+ * block number. Newly allocated blocks are zeroed out.
+ */
+int pram_alloc_blocks(struct inode *inode, int file_blocknr, unsigned int num)
+{
+	struct super_block *sb = inode->i_sb;
+	struct pram_inode *pi = pram_get_inode(sb, inode->i_ino);
+	int N = sb->s_blocksize >> 3; /* num block ptrs per block */
+	int Nbits = sb->s_blocksize_bits - 3;
+	int first_file_blocknr;
+	int last_file_blocknr;
+	int first_row_index, last_row_index;
+	int i, j, errval;
+	unsigned long blocknr;
+	u64 *row;
+	u64 *col;
+
+	if (!pi->i_type.reg.row_block) {
+		/* alloc the 2nd order array block */
+		errval = pram_new_block(sb, &blocknr, 1);
+		if (errval) {
+			pram_dbg("failed to alloc 2nd order array block\n");
+			goto fail;
+		}
+		pram_memunlock_inode(sb, pi);
+		pi->i_type.reg.row_block = cpu_to_be64(pram_get_block_off(sb,
+								      blocknr));
+		pram_memlock_inode(sb, pi);
+	}
+
+	row = pram_get_block(sb, be64_to_cpu(pi->i_type.reg.row_block));
+
+	first_file_blocknr = file_blocknr;
+	last_file_blocknr = file_blocknr + num - 1;
+
+	first_row_index = first_file_blocknr >> Nbits;
+	last_row_index  = last_file_blocknr >> Nbits;
+
+	for (i = first_row_index; i <= last_row_index; i++) {
+		int first_col_index, last_col_index;
+
+		/*
+		 * we are starting a new row, so make sure
+		 * there is a block allocated for the row.
+		 */
+		if (!row[i]) {
+			/* allocate the row block */
+			errval = pram_new_block(sb, &blocknr, 1);
+			if (errval) {
+				pram_dbg("failed to alloc row block\n");
+				goto fail;
+			}
+			pram_memunlock_block(sb, row);
+			row[i] = cpu_to_be64(pram_get_block_off(sb, blocknr));
+			pram_memlock_block(sb, row);
+		}
+		col = pram_get_block(sb, be64_to_cpu(row[i]));
+
+		first_col_index = (i == first_row_index) ?
+			first_file_blocknr & (N-1) : 0;
+
+		last_col_index = (i == last_row_index) ?
+			last_file_blocknr & (N-1) : N-1;
+
+		for (j = first_col_index; j <= last_col_index; j++) {
+			if (!col[j]) {
+				errval = pram_new_data_block(inode, &blocknr, 1);
+				if (errval) {
+					pram_dbg("failed to alloc data block\n");
+					goto fail;
+				}
+				pram_memunlock_block(sb, col);
+				col[j] = cpu_to_be64(pram_get_block_off(sb,
+								      blocknr));
+				pram_memlock_block(sb, col);
+			}
+		}
+	}
+
+	errval = 0;
+ fail:
+	return errval;
+}
+
+static int pram_read_inode(struct inode *inode, struct pram_inode *pi)
+{
+	int ret = -EIO;
+
+	mutex_lock(&PRAM_I(inode)->i_meta_mutex);
+
+	if (pram_calc_checksum((u8 *)pi, PRAM_INODE_SIZE)) {
+		pram_err(inode->i_sb, "checksum error in inode %08x\n",
+			  (u32)inode->i_ino);
+		goto bad_inode;
+	}
+
+	inode->i_mode = be16_to_cpu(pi->i_mode);
+	inode->i_uid = be32_to_cpu(pi->i_uid);
+	inode->i_gid = be32_to_cpu(pi->i_gid);
+	inode->i_nlink = be16_to_cpu(pi->i_links_count);
+	inode->i_size = be32_to_cpu(pi->i_size);
+	inode->i_atime.tv_sec = be32_to_cpu(pi->i_atime);
+	inode->i_ctime.tv_sec = be32_to_cpu(pi->i_ctime);
+	inode->i_mtime.tv_sec = be32_to_cpu(pi->i_mtime);
+	inode->i_atime.tv_nsec = inode->i_mtime.tv_nsec =
+		inode->i_ctime.tv_nsec = 0;
+	inode->i_generation = be32_to_cpu(pi->i_generation);
+	pram_set_inode_flags(inode, pi);
+
+	/* check if the inode is active. */
+	if (inode->i_nlink == 0 && (inode->i_mode == 0 || be32_to_cpu(pi->i_dtime))) {
+		/* this inode is deleted */
+		pram_dbg("read inode: inode %lu not active\n", inode->i_ino);
+		ret = -ESTALE;
+		goto bad_inode;
+	}
+
+	inode->i_blocks = be32_to_cpu(pi->i_blocks);
+	inode->i_ino = pram_get_inodenr(inode->i_sb, pi);
+	inode->i_mapping->a_ops = &pram_aops;
+	inode->i_mapping->backing_dev_info = &pram_backing_dev_info;
+
+	insert_inode_hash(inode);
+	switch (inode->i_mode & S_IFMT) {
+	case S_IFREG:
+		if (pram_use_xip(inode->i_sb)) {
+			inode->i_mapping->a_ops = &pram_aops_xip;
+			inode->i_fop = &pram_xip_file_operations;
+		} else {
+			inode->i_op = &pram_file_inode_operations;
+			inode->i_fop = &pram_file_operations;
+		}
+		break;
+	case S_IFDIR:
+		inode->i_op = &pram_dir_inode_operations;
+		inode->i_fop = &pram_dir_operations;
+		break;
+	case S_IFLNK:
+		inode->i_op = &pram_symlink_inode_operations;
+		break;
+	default:
+		inode->i_size = 0;
+		init_special_inode(inode, inode->i_mode,
+				   be32_to_cpu(pi->i_type.dev.rdev));
+		break;
+	}
+
+	mutex_unlock(&PRAM_I(inode)->i_meta_mutex);
+	return 0;
+
+ bad_inode:
+	make_bad_inode(inode);
+	mutex_unlock(&PRAM_I(inode)->i_meta_mutex);
+	return ret;
+}
+
+int pram_update_inode(struct inode *inode)
+{
+	struct pram_inode *pi;
+	int retval = 0;
+
+	pi = pram_get_inode(inode->i_sb, inode->i_ino);
+	if (!pi)
+		return -EACCES;
+
+	mutex_lock(&PRAM_I(inode)->i_meta_mutex);
+
+	pram_memunlock_inode(inode->i_sb, pi);
+	pi->i_mode = cpu_to_be16(inode->i_mode);
+	pi->i_uid = cpu_to_be32(inode->i_uid);
+	pi->i_gid = cpu_to_be32(inode->i_gid);
+	pi->i_links_count = cpu_to_be16(inode->i_nlink);
+	pi->i_size = cpu_to_be32(inode->i_size);
+	pi->i_blocks = cpu_to_be32(inode->i_blocks);
+	pi->i_atime = cpu_to_be32(inode->i_atime.tv_sec);
+	pi->i_ctime = cpu_to_be32(inode->i_ctime.tv_sec);
+	pi->i_mtime = cpu_to_be32(inode->i_mtime.tv_sec);
+	pi->i_generation = cpu_to_be32(inode->i_generation);
+	pram_get_inode_flags(inode, pi);
+
+	if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode))
+		pi->i_type.dev.rdev = cpu_to_be32(inode->i_rdev);
+
+	pram_memlock_inode(inode->i_sb, pi);
+
+	mutex_unlock(&PRAM_I(inode)->i_meta_mutex);
+	return retval;
+}
+
+/*
+ * NOTE! When we get the inode, we're the only people
+ * that have access to it, and as such there are no
+ * race conditions we have to worry about. The inode
+ * is not on the hash-lists, and it cannot be reached
+ * through the filesystem because the directory entry
+ * has been deleted earlier.
+ */
+static void pram_free_inode(struct inode *inode)
+{
+	struct super_block *sb = inode->i_sb;
+	struct pram_super_block *ps;
+	struct pram_inode *pi;
+	unsigned long inode_nr;
+
+	pram_xattr_delete_inode(inode);
+
+	lock_super(sb);
+
+	inode_nr = (inode->i_ino - PRAM_ROOT_INO) >> PRAM_INODE_BITS;
+
+	pi = pram_get_inode(sb, inode->i_ino);
+	pram_memunlock_inode(sb, pi);
+	pi->i_dtime = cpu_to_be32(get_seconds());
+	pi->i_type.reg.row_block = 0;
+	pi->i_xattr = 0;
+	pram_memlock_inode(sb, pi);
+
+	/* increment s_free_inodes_count */
+	ps = pram_get_super(sb);
+	pram_memunlock_super(sb, ps);
+	if (inode_nr < be32_to_cpu(ps->s_free_inode_hint))
+		ps->s_free_inode_hint = cpu_to_be32(inode_nr);
+	be32_add_cpu(&ps->s_free_inodes_count, 1);
+	if (be32_to_cpu(ps->s_free_inodes_count) == be32_to_cpu(ps->s_inodes_count) - 1) {
+		/* filesystem is empty */
+		pram_dbg("fs is empty!\n");
+		ps->s_free_inode_hint = cpu_to_be32(1);
+	}
+	pram_memlock_super(sb, ps);
+
+	unlock_super(sb);
+}
+
+struct inode *pram_iget(struct super_block *sb, unsigned long ino)
+{
+	struct inode *inode;
+	struct pram_inode *pi;
+	int err;
+
+	inode = iget_locked(sb, ino);
+	if (unlikely(!inode))
+		return ERR_PTR(-ENOMEM);
+	if (!(inode->i_state & I_NEW))
+		return inode;
+
+	pi = pram_get_inode(sb, ino);
+	if (!pi) {
+		err = -EACCES;
+		goto fail;
+	}
+	err = pram_read_inode(inode, pi);
+	if (unlikely(err))
+		goto fail;
+
+	unlock_new_inode(inode);
+	return inode;
+fail:
+	iget_failed(inode);
+	return ERR_PTR(err);
+}
+
+void pram_evict_inode(struct inode *inode)
+{
+	int want_delete = 0;
+
+	if (!inode->i_nlink && !is_bad_inode(inode))
+		want_delete = 1;
+
+	truncate_inode_pages(&inode->i_data, 0);
+
+	if (want_delete) {
+		/* unlink from chain in the inode's directory */
+		pram_remove_link(inode);
+		if (inode->i_blocks)
+			pram_truncate_blocks(inode, 0, inode->i_size);
+		inode->i_size = 0;
+	}
+
+	invalidate_inode_buffers(inode);
+	end_writeback(inode);
+
+	if (want_delete)
+		pram_free_inode(inode);
+}
+
+
+struct inode *pram_new_inode(struct inode *dir, int mode)
+{
+	struct super_block *sb;
+	struct pram_sb_info *sbi;
+	struct pram_super_block *ps;
+	struct inode *inode;
+	struct pram_inode *pi = NULL;
+	struct pram_inode *diri = NULL;
+	int i, errval;
+	ino_t ino = 0;
+
+	sb = dir->i_sb;
+	sbi = (struct pram_sb_info *)sb->s_fs_info;
+	inode = new_inode(sb);
+	if (!inode)
+		return ERR_PTR(-ENOMEM);
+
+	lock_super(sb);
+	ps = pram_get_super(sb);
+
+	if (ps->s_free_inodes_count) {
+		/* find the oldest unused pram inode */
+		for (i = be32_to_cpu(ps->s_free_inode_hint); i < be32_to_cpu(ps->s_inodes_count); i++) {
+			ino = PRAM_ROOT_INO + (i << PRAM_INODE_BITS);
+			pi = pram_get_inode(sb, ino);
+			/* check if the inode is active. */
+			if (be16_to_cpu(pi->i_links_count) == 0 &&
+			   (be16_to_cpu(pi->i_mode) == 0 ||
+			   be32_to_cpu(pi->i_dtime))) {
+				/* this inode is deleted */
+				break;
+			}
+		}
+
+		if (unlikely(i >= be32_to_cpu(ps->s_inodes_count))) {
+			pram_err(sb, "s_free_inodes_count!=0 but none free!?\n");
+			errval = -ENOSPC;
+			goto fail1;
+		}
+
+		pram_dbg("allocating inode %lu\n", ino);
+	} else {
+		pram_dbg("no space left to create new inode!\n");
+		errval = -ENOSPC;
+		goto fail1;
+	}
+
+	diri = pram_get_inode(sb, dir->i_ino);
+	if (!diri) {
+		errval = -EACCES;
+		goto fail1;
+	}
+
+	/* chosen inode is in ino */
+	inode->i_ino = ino;
+	inode_init_owner(inode, dir, mode);
+	inode->i_blocks = inode->i_size = 0;
+	inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
+
+	inode->i_generation = atomic_add_return(1, &sbi->next_generation);
+
+	pram_memunlock_inode(sb, pi);
+	pi->i_d.d_next = 0;
+	pi->i_d.d_prev = 0;
+	pi->i_flags = pram_mask_flags(mode, diri->i_flags);
+	pram_memlock_inode(sb, pi);
+
+	pram_set_inode_flags(inode, pi);
+
+	if (insert_inode_locked(inode) < 0) {
+		errval = -EINVAL;
+		goto fail2;
+	}
+	errval = pram_write_inode(inode, NULL);
+	if (errval)
+		goto fail2;
+
+	errval = pram_init_acl(inode, dir);
+	if (errval)
+		goto fail2;
+
+	errval = pram_init_security(inode, dir);
+	if (errval)
+		goto fail2;
+
+	pram_memunlock_super(sb, ps);
+	be32_add_cpu(&ps->s_free_inodes_count, -1);
+	if (i < be32_to_cpu(ps->s_inodes_count)-1)
+		ps->s_free_inode_hint = cpu_to_be32(i+1);
+	else
+		ps->s_free_inode_hint = 0;
+	pram_memlock_super(sb, ps);
+
+	unlock_super(sb);
+
+	return inode;
+fail2:
+	unlock_super(sb);
+	unlock_new_inode(inode);
+	iput(inode);
+	return ERR_PTR(errval);
+fail1:
+	unlock_super(sb);
+	make_bad_inode(inode);
+	iput(inode);
+	return ERR_PTR(errval);
+}
+
+int pram_write_inode(struct inode *inode, struct writeback_control *wbc)
+{
+	return pram_update_inode(inode);
+}
+
+/*
+ * dirty_inode() is called from __mark_inode_dirty()
+ */
+void pram_dirty_inode(struct inode *inode)
+{
+	pram_update_inode(inode);
+}
+
+/* pram_get_and_update_block()
+ *
+ * Look up a block. If it is not found and create is non-zero, allocate
+ * a new one.
+ *
+ * Returns zero if the block was mapped or allocated, or if a plain
+ * lookup failed (which is not an error, e.g. for holes). Returns a
+ * negative errno otherwise.
+ */
+int pram_get_and_update_block(struct inode *inode, sector_t iblock,
+				     struct buffer_head *bh, int create)
+{
+	struct super_block *sb = inode->i_sb;
+	unsigned int blocksize = 1 << inode->i_blkbits;
+	int err = 0;
+	u64 block;
+	void *bp;
+
+	mutex_lock(&PRAM_I(inode)->truncate_mutex);
+
+	block = pram_find_data_block(inode, iblock);
+
+	if (!block) {
+		if (!create)
+			goto out;
+
+		err = pram_alloc_blocks(inode, iblock, 1);
+		if (err)
+			goto out;
+		block = pram_find_data_block(inode, iblock);
+		if (!block) {
+			err = -EIO;
+			goto out;
+		}
+		set_buffer_new(bh);
+	}
+
+	bh->b_blocknr = block;
+	set_buffer_mapped(bh);
+
+	/* now update the buffer synchronously */
+	bp = pram_get_block(sb, block);
+	if (buffer_new(bh)) {
+		pram_memunlock_block(sb, bp);
+		memset(bp, 0, blocksize);
+		pram_memlock_block(sb, bp);
+		memset(bh->b_data, 0, blocksize);
+	} else {
+		memcpy(bh->b_data, bp, blocksize);
+	}
+
+	set_buffer_uptodate(bh);
+
+ out:
+	mutex_unlock(&PRAM_I(inode)->truncate_mutex);
+	return err;
+}
+
+/*
+ * Called to zero out the tail of a single block. It's used by "resize"
+ * so that stale data is not exposed if the file grows again.
+ */
+static int pram_clear_block(struct inode *inode, loff_t newsize)
+{
+	pgoff_t index = newsize >> PAGE_CACHE_SHIFT;
+	unsigned long offset = newsize & (PAGE_CACHE_SIZE - 1);
+	unsigned long blocksize, length;
+	sector_t iblock;
+	u64 blockoff;
+	char *bp;
+	int ret = 0;
+
+	blocksize = 1 << inode->i_blkbits;
+	length = offset & (blocksize - 1);
+
+	/* Block boundary ? */
+	if (!length)
+		goto out;
+
+	length = blocksize - length;
+	iblock = (sector_t)index << (PAGE_CACHE_SHIFT - inode->i_blkbits);
+
+	mutex_lock(&PRAM_I(inode)->truncate_mutex);
+	blockoff = pram_find_data_block(inode, iblock);
+
+	/* Hole ? */
+	if (!blockoff)
+		goto out_unlock;
+
+	bp = pram_get_block(inode->i_sb, blockoff);
+	if (!bp) {
+		ret = -EACCES;
+		goto out_unlock;
+	}
+	pram_memunlock_block(inode->i_sb, bp);
+	memset(bp + offset, 0, length);
+	pram_memlock_block(inode->i_sb, bp);
+
+out_unlock:
+	mutex_unlock(&PRAM_I(inode)->truncate_mutex);
+out:
+	return ret;
+}
+
+static int pram_setsize(struct inode *inode, loff_t newsize)
+{
+	int ret = 0;
+	loff_t oldsize;
+
+	if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
+	    S_ISLNK(inode->i_mode)))
+		return -EINVAL;
+	if (IS_APPEND(inode) || IS_IMMUTABLE(inode))
+		return -EPERM;
+
+	if (mapping_is_xip(inode->i_mapping))
+		ret = xip_truncate_page(inode->i_mapping, newsize);
+	else
+		ret = pram_clear_block(inode, newsize);
+	if (ret)
+		return ret;
+
+	oldsize = inode->i_size;
+	i_size_write(inode, newsize);
+	__pram_truncate_blocks(inode, newsize, oldsize);
+	inode->i_mtime = inode->i_ctime = CURRENT_TIME_SEC;
+	pram_update_inode(inode);
+
+	return ret;
+}
+
+int pram_notify_change(struct dentry *dentry, struct iattr *attr)
+{
+	struct inode *inode = dentry->d_inode;
+	int error;
+
+	error = inode_change_ok(inode, attr);
+	if (error)
+		return error;
+
+	if (attr->ia_valid & ATTR_SIZE && attr->ia_size != inode->i_size) {
+		error = pram_setsize(inode, attr->ia_size);
+		if (error)
+			return error;
+	}
+	setattr_copy(inode, attr);
+	if (attr->ia_valid & ATTR_MODE)
+		error = pram_acl_chmod(inode);
+	if (!error)
+		error = pram_update_inode(inode);
+	return error;
+}
+
+long pram_fallocate(struct inode *inode, int mode, loff_t offset, loff_t len)
+{
+	long ret = 0;
+	unsigned long blocknr, blockoff;
+	int num_blocks, blocksize_mask;
+	struct pram_inode *pi;
+	loff_t new_size;
+
+	/* preallocation to directories is currently not supported */
+	if (S_ISDIR(inode->i_mode))
+		return -ENODEV;
+
+	mutex_lock(&inode->i_mutex);
+	mutex_lock(&PRAM_I(inode)->truncate_mutex);
+
+	if (IS_IMMUTABLE(inode)) {
+		ret = -EPERM;
+		goto out;
+	}
+
+	new_size = len + offset;
+	if (!(mode & FALLOC_FL_KEEP_SIZE) && new_size > inode->i_size) {
+		ret = inode_newsize_ok(inode, new_size);
+		if (ret)
+			goto out;
+	}
+
+	blocksize_mask = (1 << inode->i_sb->s_blocksize_bits) - 1;
+	/* offset is already an absolute file offset, see fallocate(2) */
+	blocknr = offset >> inode->i_sb->s_blocksize_bits;
+	blockoff = offset & blocksize_mask;
+	num_blocks = (blockoff + len + blocksize_mask) >>
+						inode->i_sb->s_blocksize_bits;
+	ret = pram_alloc_blocks(inode, blocknr, num_blocks);
+	if (ret)
+		goto out;
+
+	if (mode & FALLOC_FL_KEEP_SIZE) {
+		pi = pram_get_inode(inode->i_sb, inode->i_ino);
+		if (!pi) {
+			ret = -EACCES;
+			goto out;
+		}
+		pram_memunlock_inode(inode->i_sb, pi);
+		pi->i_flags |= cpu_to_be32(PRAM_EOFBLOCKS_FL);
+		pram_memlock_inode(inode->i_sb, pi);
+	}
+
+	inode->i_mtime = inode->i_ctime = CURRENT_TIME_SEC;
+	if (!(mode & FALLOC_FL_KEEP_SIZE) && new_size > inode->i_size)
+		inode->i_size = new_size;
+	ret = pram_update_inode(inode);
+ out:
+	mutex_unlock(&PRAM_I(inode)->truncate_mutex);
+	mutex_unlock(&inode->i_mutex);
+	return ret;
+}
+
+void pram_set_inode_flags(struct inode *inode, struct pram_inode *pi)
+{
+	unsigned int flags = be32_to_cpu(pi->i_flags);
+
+	inode->i_flags &= ~(S_SYNC|S_APPEND|S_IMMUTABLE|S_NOATIME|S_DIRSYNC);
+	if (flags & FS_SYNC_FL)
+		inode->i_flags |= S_SYNC;
+	if (flags & FS_APPEND_FL)
+		inode->i_flags |= S_APPEND;
+	if (flags & FS_IMMUTABLE_FL)
+		inode->i_flags |= S_IMMUTABLE;
+	if (flags & FS_NOATIME_FL)
+		inode->i_flags |= S_NOATIME;
+	if (flags & FS_DIRSYNC_FL)
+		inode->i_flags |= S_DIRSYNC;
+}
+
+void pram_get_inode_flags(struct inode *inode, struct pram_inode *pi)
+{
+	unsigned int flags = inode->i_flags;
+	unsigned int pram_flags = be32_to_cpu(pi->i_flags);
+
+	pram_flags &= ~(FS_SYNC_FL|FS_APPEND_FL|FS_IMMUTABLE_FL|
+			FS_NOATIME_FL|FS_DIRSYNC_FL);
+	if (flags & S_SYNC)
+		pram_flags |= FS_SYNC_FL;
+	if (flags & S_APPEND)
+		pram_flags |= FS_APPEND_FL;
+	if (flags & S_IMMUTABLE)
+		pram_flags |= FS_IMMUTABLE_FL;
+	if (flags & S_NOATIME)
+		pram_flags |= FS_NOATIME_FL;
+	if (flags & S_DIRSYNC)
+		pram_flags |= FS_DIRSYNC_FL;
+
+	pi->i_flags = cpu_to_be32(pram_flags);
+}
+
+struct address_space_operations pram_aops = {
+	.readpage	= pram_readpage,
+	.direct_IO	= pram_direct_IO,
+};
+
+struct address_space_operations pram_aops_xip = {
+	.get_xip_mem	= pram_get_xip_mem,
+};
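
A note on the block arithmetic in pram_fallocate() above: the first block
and the number of blocks are derived from a byte offset and a length by
shifting and masking with the block size. Below is a minimal userspace
sketch of that rounding, for checking the edge cases by hand. The helper
names sketch_first_block() and sketch_num_blocks() are hypothetical and do
not exist in the patch; blkbits stands in for sb->s_blocksize_bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index of the block that contains byte 'offset'. */
uint64_t sketch_first_block(uint64_t offset, unsigned int blkbits)
{
	assert(blkbits < 32);
	return offset >> blkbits;
}

/*
 * Hypothetical helper: number of blocks needed to cover the byte range
 * [offset, offset + len). Mirrors the expression
 *   (blockoff + len + blocksize_mask) >> s_blocksize_bits
 * used in pram_fallocate().
 */
uint64_t sketch_num_blocks(uint64_t offset, uint64_t len,
			   unsigned int blkbits)
{
	uint64_t mask, blockoff;

	assert(blkbits < 32);
	mask = ((uint64_t)1 << blkbits) - 1;
	blockoff = offset & mask;	/* offset inside the first block */

	/* adding 'mask' before the shift rounds the result up */
	return (blockoff + len + mask) >> blkbits;
}
```

With 4096-byte blocks (blkbits = 12), a 2-byte range starting at offset
4095 straddles a block boundary and therefore needs two blocks, while a
4096-byte range starting at offset 0 needs exactly one.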


* [PATCH 02/17] pramfs: super operations
From: Marco Stornelli @ 2011-01-06 12:01 UTC (permalink / raw)
  To: Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

Super block operations.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff --git a/fs/pramfs/super.c b/fs/pramfs/super.c
new file mode 100644
index 0000000..0157b35
--- /dev/null
+++ b/fs/pramfs/super.c
@@ -0,0 +1,940 @@
+/*
+ * BRIEF DESCRIPTION
+ *
+ * Super block operations.
+ *
+ * Copyright 2009-2010 Marco Stornelli <marco.stornelli@gmail.com>
+ * Copyright 2003 Sony Corporation
+ * Copyright 2003 Matsushita Electric Industrial Co., Ltd.
+ * 2003-2004 (c) MontaVista Software, Inc. , Steve Longerbeam
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include <linux/module.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/blkdev.h>
+#include <linux/parser.h>
+#include <linux/vfs.h>
+#include <linux/uaccess.h>
+#include <linux/io.h>
+#include <linux/seq_file.h>
+#include <linux/mount.h>
+#include <linux/mm.h>
+#include <linux/ctype.h>
+#include <linux/bitops.h>
+#include <linux/magic.h>
+#include <linux/exportfs.h>
+#include <linux/random.h>
+#include "xattr.h"
+#include "pram.h"
+
+static struct super_operations pram_sops;
+static const struct export_operations pram_export_ops;
+static struct kmem_cache *pram_inode_cachep;
+
+#ifdef CONFIG_PRAMFS_TEST
+static void *first_pram_super;
+
+struct pram_super_block *get_pram_super(void)
+{
+	return (struct pram_super_block *)first_pram_super;
+}
+EXPORT_SYMBOL(get_pram_super);
+#endif
+
+void pram_error_mng(struct super_block *sb, const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	printk(KERN_ERR "pramfs error: ");
+	vprintk(fmt, args);
+	printk("\n");
+	va_end(args);
+
+	if (test_opt(sb, ERRORS_PANIC))
+		panic("pramfs: panic from previous error\n");
+	if (test_opt(sb, ERRORS_RO)) {
+		printk(KERN_CRIT "pramfs error: remounting filesystem read-only\n");
+		sb->s_flags |= MS_RDONLY;
+	}
+}
+
+static void pram_set_blocksize(struct super_block *sb, unsigned long size)
+{
+	int bits;
+
+	/*
+	 * We've already validated the user input and the value here must be
+	 * between PRAM_MAX_BLOCK_SIZE and PRAM_MIN_BLOCK_SIZE
+	 * and it must be a power of 2.
+	 */
+	bits = fls(size) - 1;
+	sb->s_blocksize_bits = bits;
+	sb->s_blocksize = (1<<bits);
+}
+
+static inline void *pram_ioremap(phys_addr_t phys_addr, ssize_t size, bool protect)
+{
+	void *retval;
+
+	/*
+	 * NOTE: Userland may not map this resource, we will mark the region so
+	 * /dev/mem and the sysfs MMIO access will not be allowed. This
+	 * restriction depends on STRICT_DEVMEM option. If this option is
+	 * disabled or not available we mark the region only as busy.
+	 */
+	retval = request_mem_region_exclusive(phys_addr, size, "pramfs");
+	if (!retval)
+		goto fail;
+
+	retval = ioremap_nocache(phys_addr, size);
+
+	if (retval && protect)
+		pram_writeable(retval, size, 0);
+fail:
+	return retval;
+}
+
+static loff_t pram_max_size(int bits)
+{
+	loff_t res;
+	res = (1ULL << (3*bits - 6)) - 1;
+
+	if (res > MAX_LFS_FILESIZE)
+		res = MAX_LFS_FILESIZE;
+
+	pram_info("max file size %llu bytes\n", res);
+	return res;
+}
+
+enum {
+	Opt_addr, Opt_bpi, Opt_size,
+	Opt_num_inodes, Opt_mode, Opt_uid,
+	Opt_gid, Opt_blocksize, Opt_user_xattr,
+	Opt_nouser_xattr, Opt_noprotect,
+	Opt_acl, Opt_noacl, Opt_xip,
+	Opt_err_cont, Opt_err_panic, Opt_err_ro,
+	Opt_err
+};
+
+static const match_table_t tokens = {
+	{Opt_addr,		"physaddr=%x"},
+	{Opt_bpi,		"bpi=%u"},
+	{Opt_size,		"init=%s"},
+	{Opt_num_inodes,	"N=%u"},
+	{Opt_mode,		"mode=%o"},
+	{Opt_uid,		"uid=%u"},
+	{Opt_gid,		"gid=%u"},
+	{Opt_blocksize,		"bs=%s"},
+	{Opt_user_xattr,	"user_xattr"},
+	{Opt_nouser_xattr,	"nouser_xattr"},
+	{Opt_noprotect,		"noprotect"},
+	{Opt_acl,		"acl"},
+	{Opt_noacl,		"noacl"},
+	{Opt_xip,		"xip"},
+	{Opt_err_cont,		"errors=continue"},
+	{Opt_err_panic,		"errors=panic"},
+	{Opt_err_ro,		"errors=remount-ro"},
+	{Opt_err,		NULL},
+};
+
+static phys_addr_t get_phys_addr(void **data)
+{
+	phys_addr_t phys_addr;
+	char *options = (char *) *data;
+
+	if (!options || strncmp(options, "physaddr=", 9) != 0)
+		return (phys_addr_t)ULLONG_MAX;
+	options += 9;
+	phys_addr = (phys_addr_t)simple_strtoull(options, &options, 0);
+	if (*options && *options != ',') {
+		printk(KERN_ERR "Invalid phys addr specification: %s\n",
+		       (char *) *data);
+		return (phys_addr_t)ULLONG_MAX;
+	}
+	if (phys_addr & (PAGE_SIZE - 1)) {
+		printk(KERN_ERR "physical address 0x%016llx for pramfs isn't "
+			  "aligned to a page boundary\n",
+			  (u64)phys_addr);
+		return (phys_addr_t)ULLONG_MAX;
+	}
+	if (*options == ',')
+		options++;
+	*data = (void *) options;
+	return phys_addr;
+}
+
+static int pram_parse_options(char *options, struct pram_sb_info *sbi, bool remount)
+{
+	char *p, *rest;
+	substring_t args[MAX_OPT_ARGS];
+	int option;
+
+	if (!options)
+		return 0;
+
+	while ((p = strsep(&options, ",")) != NULL) {
+		int token;
+		if (!*p)
+			continue;
+
+		token = match_token(p, tokens, args);
+		switch (token) {
+		case Opt_addr:
+			if (remount)
+				goto bad_opt;
+			/* physaddr managed in get_phys_addr() */
+			break;
+		case Opt_bpi:
+			if (remount)
+				goto bad_opt;
+			if (match_int(&args[0], &option))
+				goto bad_val;
+			sbi->bpi = option;
+			break;
+		case Opt_uid:
+			if (remount)
+				goto bad_opt;
+			if (match_int(&args[0], &option))
+				goto bad_val;
+			sbi->uid = option;
+			break;
+		case Opt_gid:
+			if (match_int(&args[0], &option))
+				goto bad_val;
+			sbi->gid = option;
+			break;
+		case Opt_mode:
+			if (match_octal(&args[0], &option))
+				goto bad_val;
+			sbi->mode = option & 01777U;
+			break;
+		case Opt_size:
+			if (remount)
+				goto bad_opt;
+			/* memparse() will accept a K/M/G without a digit */
+			if (!isdigit(*args[0].from))
+				goto bad_val;
+			sbi->initsize = memparse(args[0].from, &rest);
+			break;
+		case Opt_num_inodes:
+			if (remount)
+				goto bad_opt;
+			if (match_int(&args[0], &option))
+				goto bad_val;
+			sbi->num_inodes = option;
+			break;
+		case Opt_blocksize:
+			if (remount)
+				goto bad_opt;
+			/* memparse() will accept a K/M/G without a digit */
+			if (!isdigit(*args[0].from))
+				goto bad_val;
+			sbi->blocksize = memparse(args[0].from, &rest);
+			if (sbi->blocksize < PRAM_MIN_BLOCK_SIZE ||
+				sbi->blocksize > PRAM_MAX_BLOCK_SIZE ||
+				!is_power_of_2(sbi->blocksize))
+				goto bad_val;
+			break;
+		case Opt_err_panic:
+			clear_opt(sbi->s_mount_opt, ERRORS_CONT);
+			clear_opt(sbi->s_mount_opt, ERRORS_RO);
+			set_opt(sbi->s_mount_opt, ERRORS_PANIC);
+			break;
+		case Opt_err_ro:
+			clear_opt(sbi->s_mount_opt, ERRORS_CONT);
+			clear_opt(sbi->s_mount_opt, ERRORS_PANIC);
+			set_opt(sbi->s_mount_opt, ERRORS_RO);
+			break;
+		case Opt_err_cont:
+			clear_opt(sbi->s_mount_opt, ERRORS_RO);
+			clear_opt(sbi->s_mount_opt, ERRORS_PANIC);
+			set_opt(sbi->s_mount_opt, ERRORS_CONT);
+			break;
+		case Opt_noprotect:
+#ifdef CONFIG_PRAMFS_WRITE_PROTECT
+			if (remount)
+				goto bad_opt;
+			clear_opt(sbi->s_mount_opt, PROTECT);
+#endif
+			break;
+#ifdef CONFIG_PRAMFS_XATTR
+		case Opt_user_xattr:
+			set_opt(sbi->s_mount_opt, XATTR_USER);
+			break;
+		case Opt_nouser_xattr:
+			clear_opt(sbi->s_mount_opt, XATTR_USER);
+			break;
+#else
+		case Opt_user_xattr:
+		case Opt_nouser_xattr:
+			pram_info("(no)user_xattr options not supported\n");
+			break;
+#endif
+#ifdef CONFIG_PRAMFS_POSIX_ACL
+		case Opt_acl:
+			set_opt(sbi->s_mount_opt, POSIX_ACL);
+			break;
+		case Opt_noacl:
+			clear_opt(sbi->s_mount_opt, POSIX_ACL);
+			break;
+#else
+		case Opt_acl:
+		case Opt_noacl:
+			pram_info("(no)acl options not supported\n");
+			break;
+#endif
+		case Opt_xip:
+#ifdef CONFIG_PRAMFS_XIP
+			if (remount)
+				goto bad_opt;
+			set_opt(sbi->s_mount_opt, XIP);
+			break;
+#else
+			pram_info("xip option not supported\n");
+			break;
+#endif
+		default: {
+			goto bad_opt;
+		}
+		}
+	}
+
+	return 0;
+
+bad_val:
+	printk(KERN_ERR "Bad value '%s' for mount option '%s'\n", args[0].from, p);
+	return -EINVAL;
+bad_opt:
+	printk(KERN_ERR "Bad mount option: \"%s\"\n", p);
+	return -EINVAL;
+}
+
+static struct pram_inode *pram_init(struct super_block *sb, unsigned long size)
+{
+	unsigned long bpi, num_inodes, bitmap_size, blocksize, num_blocks;
+	u64 bitmap_start;
+	struct pram_inode *root_i;
+	struct pram_super_block *super;
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+
+	pram_info("creating an empty pramfs of size %lu\n", size);
+	if (pram_is_protected(sb))
+		sbi->virt_addr = pram_ioremap(sbi->phys_addr, size, 1);
+	else
+		sbi->virt_addr = pram_ioremap(sbi->phys_addr, size, 0);
+
+	if (!sbi->virt_addr) {
+		printk(KERN_ERR "ioremap of the pramfs image failed\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+#ifdef CONFIG_PRAMFS_TEST
+	if (!first_pram_super)
+		first_pram_super = sbi->virt_addr;
+#endif
+
+	if (!sbi->blocksize)
+		blocksize = PRAM_DEF_BLOCK_SIZE;
+	else
+		blocksize = sbi->blocksize;
+
+	pram_set_blocksize(sb, blocksize);
+	blocksize = sb->s_blocksize;
+
+	if (sbi->blocksize && sbi->blocksize != blocksize)
+		sbi->blocksize = blocksize;
+
+	if (size < blocksize) {
+		printk(KERN_ERR "size smaller than block size\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	if (!sbi->bpi)
+		/*
+		 * default is that 5% of the filesystem is
+		 * devoted to the inode table
+		 */
+		bpi = 20 * PRAM_INODE_SIZE;
+	else
+		bpi = sbi->bpi;
+
+	if (!sbi->num_inodes)
+		num_inodes = size / bpi;
+	else
+		num_inodes = sbi->num_inodes;
+
+	/*
+	 * round num_inodes up so that the end of the inode table
+	 * (and start of bitmap) is on a block boundary
+	 */
+	bitmap_start = (PRAM_SB_SIZE*2) + (num_inodes<<PRAM_INODE_BITS);
+	if (bitmap_start & (blocksize - 1))
+		bitmap_start = (bitmap_start + blocksize) &
+			~(blocksize-1);
+	num_inodes = (bitmap_start - (PRAM_SB_SIZE*2)) >> PRAM_INODE_BITS;
+
+	if (sbi->num_inodes && num_inodes != sbi->num_inodes)
+		sbi->num_inodes = num_inodes;
+
+	num_blocks = (size - bitmap_start) >> sb->s_blocksize_bits;
+
+	if (!num_blocks) {
+		printk(KERN_ERR "num blocks equal to zero\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	/* calc the data blocks in-use bitmap size in bytes */
+	if (num_blocks & 7)
+		bitmap_size = ((num_blocks + 8) & ~7) >> 3;
+	else
+		bitmap_size = num_blocks >> 3;
+	/* round it up to the nearest blocksize boundary */
+	if (bitmap_size & (blocksize - 1))
+		bitmap_size = (bitmap_size + blocksize) & ~(blocksize-1);
+
+	pram_info("blocksize %lu, num inodes %lu, num blocks %lu\n",
+		  blocksize, num_inodes, num_blocks);
+	pram_dbg("bitmap start 0x%08x, bitmap size %lu\n",
+		 (unsigned int)bitmap_start, bitmap_size);
+	pram_dbg("max name length %d\n", (unsigned int)PRAM_NAME_LEN);
+
+	super = pram_get_super(sb);
+	pram_memunlock_range(sb, super, bitmap_start + bitmap_size);
+
+	/* clear out super-block and inode table */
+	memset(super, 0, bitmap_start);
+	super->s_size = cpu_to_be64(size);
+	super->s_blocksize = cpu_to_be32(blocksize);
+	super->s_inodes_count = cpu_to_be32(num_inodes);
+	super->s_blocks_count = cpu_to_be32(num_blocks);
+	super->s_free_inodes_count = cpu_to_be32(num_inodes - 1);
+	super->s_bitmap_blocks = cpu_to_be32(bitmap_size >> sb->s_blocksize_bits);
+	super->s_free_blocks_count = cpu_to_be32(num_blocks - be32_to_cpu(super->s_bitmap_blocks));
+	super->s_free_inode_hint = cpu_to_be32(1);
+	super->s_bitmap_start = cpu_to_be64(bitmap_start);
+	super->s_magic = cpu_to_be16(PRAM_SUPER_MAGIC);
+	pram_sync_super(super);
+
+	root_i = pram_get_inode(sb, PRAM_ROOT_INO);
+
+	/* i_mode is a 16-bit big-endian field: convert exactly once */
+	root_i->i_mode = cpu_to_be16(sbi->mode | S_IFDIR);
+	root_i->i_uid = cpu_to_be32(sbi->uid);
+	root_i->i_gid = cpu_to_be32(sbi->gid);
+	root_i->i_links_count = cpu_to_be16(2);
+	root_i->i_d.d_parent = cpu_to_be64(PRAM_ROOT_INO);
+	pram_sync_inode(root_i);
+
+	pram_init_bitmap(sb);
+
+	pram_memlock_range(sb, super, bitmap_start + bitmap_size);
+
+	return root_i;
+}
+
+static inline void set_default_opts(struct pram_sb_info *sbi)
+{
+	set_opt(sbi->s_mount_opt, PROTECT);
+	set_opt(sbi->s_mount_opt, ERRORS_CONT);
+}
+
+static int pram_fill_super(struct super_block *sb, void *data, int silent)
+{
+	struct pram_super_block *super, *super_redund;
+	struct pram_inode *root_pi;
+	struct pram_sb_info *sbi = NULL;
+	struct inode *root_i = NULL;
+	u64 root_offset;
+	unsigned long blocksize, initsize = 0;
+	u32 random = 0;
+	int retval = -EINVAL;
+
+	BUILD_BUG_ON(sizeof(struct pram_super_block) > PRAM_SB_SIZE);
+	BUILD_BUG_ON(sizeof(struct pram_inode) > PRAM_INODE_SIZE);
+
+	sbi = kzalloc(sizeof(struct pram_sb_info), GFP_KERNEL);
+	if (!sbi)
+		return -ENOMEM;
+	sb->s_fs_info = sbi;
+
+	set_default_opts(sbi);
+
+#ifdef CONFIG_PRAMFS_XATTR
+	spin_lock_init(&sbi->desc_tree_lock);
+	sbi->desc_tree.rb_node = NULL;
+#endif
+
+	sbi->phys_addr = get_phys_addr(&data);
+	if (sbi->phys_addr == (phys_addr_t)ULLONG_MAX)
+		goto out;
+
+	get_random_bytes(&random, sizeof(u32));
+	atomic_set(&sbi->next_generation, random);
+
+	/* Init with default values */
+	sbi->mode = (S_IRWXUGO | S_ISVTX);
+	sbi->uid = current_fsuid();
+	sbi->gid = current_fsgid();
+
+	if (pram_parse_options(data, sbi, 0))
+		goto out;
+
+	if (test_opt(sb, XIP) && test_opt(sb, PROTECT)) {
+		printk(KERN_ERR "xip and protect options both enabled\n");
+		goto out;
+	}
+
+	if (test_opt(sb, XIP) && sbi->blocksize != PAGE_SIZE) {
+		printk(KERN_ERR "blocksize not equal to page size and xip enabled\n");
+		goto out;
+	}
+
+	initsize = sbi->initsize;
+
+	/* Init a new pramfs instance */
+	if (initsize) {
+		root_pi = pram_init(sb, initsize);
+
+		if (IS_ERR(root_pi))
+			goto out;
+
+		super = pram_get_super(sb);
+
+		goto setup_sb;
+	}
+
+	pram_dbg("checking physical address 0x%016llx for pramfs image\n",
+		   (u64)sbi->phys_addr);
+
+	/* Map only one page for now. Will remap it when fs size is known. */
+	initsize = PAGE_SIZE;
+	if (pram_is_protected(sb))
+		sbi->virt_addr = pram_ioremap(sbi->phys_addr, initsize, 1);
+	else
+		sbi->virt_addr = pram_ioremap(sbi->phys_addr, initsize, 0);
+	if (!sbi->virt_addr) {
+		printk(KERN_ERR "ioremap of the pramfs image failed\n");
+		goto out;
+	}
+
+	super = pram_get_super(sb);
+	super_redund = pram_get_redund_super(sb);
+
+	/* Do sanity checks on the superblock */
+	if (be16_to_cpu(super->s_magic) != PRAM_SUPER_MAGIC) {
+		if (be16_to_cpu(super_redund->s_magic) != PRAM_SUPER_MAGIC) {
+			if (!silent)
+				printk(KERN_ERR "Can't find a valid pramfs "
+								"partition\n");
+			goto out;
+		} else {
+			pram_warn("Error in super block: try to repair it with "
+							  "the redundant copy");
+			/* Try to auto-recover the super block */
+			memcpy(super, super_redund, PRAM_SB_SIZE);
+		}
+	}
+
+	/* Verify the superblock checksum */
+	if (pram_calc_checksum((u8 *)super, PRAM_SB_SIZE)) {
+		if (pram_calc_checksum((u8 *)super_redund, PRAM_SB_SIZE)) {
+			printk(KERN_ERR "checksum error in super block\n");
+			goto out;
+		} else {
+			pram_warn("Error in super block: try to repair it with "
+							  "the redundant copy");
+			/* Try to auto-recover the super block */
+			memcpy(super, super_redund, PRAM_SB_SIZE);
+		}
+	}
+
+	blocksize = be32_to_cpu(super->s_blocksize);
+	pram_set_blocksize(sb, blocksize);
+
+	initsize = be64_to_cpu(super->s_size);
+	pram_info("pramfs image appears to be %lu KB in size\n", initsize>>10);
+	pram_info("blocksize %lu\n", blocksize);
+
+	/* Read the root inode */
+	root_pi = pram_get_inode(sb, PRAM_ROOT_INO);
+
+	/* Check that the root inode is in a sane state */
+	if (pram_calc_checksum((u8 *)root_pi, PRAM_INODE_SIZE)) {
+		printk(KERN_ERR "checksum error in root inode!\n");
+		goto out;
+	}
+
+	if (be64_to_cpu(root_pi->i_d.d_next)) {
+		printk(KERN_ERR "root->next not NULL??!!\n");
+		goto out;
+	}
+
+	if (!S_ISDIR(be16_to_cpu(root_pi->i_mode))) {
+		printk(KERN_ERR "root is not a directory!\n");
+		goto out;
+	}
+
+	root_offset = be64_to_cpu(root_pi->i_type.dir.head);
+	if (root_offset == 0)
+		pram_dbg("empty filesystem\n");
+
+	/* Remap the whole filesystem now */
+	if (pram_is_protected(sb))
+		pram_writeable(sbi->virt_addr, PAGE_SIZE, 1);
+	iounmap(sbi->virt_addr);
+	release_mem_region(sbi->phys_addr, PAGE_SIZE);
+	if (pram_is_protected(sb))
+		sbi->virt_addr = pram_ioremap(sbi->phys_addr, initsize, 1);
+	else
+		sbi->virt_addr = pram_ioremap(sbi->phys_addr, initsize, 0);
+	if (!sbi->virt_addr) {
+		printk(KERN_ERR "ioremap of the pramfs image failed\n");
+		goto out;
+	}
+	super = pram_get_super(sb);
+
+#ifdef CONFIG_PRAMFS_TEST
+	if (!first_pram_super)
+		first_pram_super = sbi->virt_addr;
+#endif
+
+	/* Set it all up.. */
+ setup_sb:
+	sb->s_magic = be16_to_cpu(super->s_magic);
+	sb->s_op = &pram_sops;
+	sb->s_maxbytes = pram_max_size(sb->s_blocksize_bits);
+	sb->s_time_gran = 1;
+	sb->s_export_op = &pram_export_ops;
+	sb->s_xattr = pram_xattr_handlers;
+#ifdef	CONFIG_PRAMFS_POSIX_ACL
+	sb->s_flags = (sb->s_flags & ~MS_POSIXACL) |
+		((sbi->s_mount_opt & PRAM_MOUNT_POSIX_ACL) ?
+		 MS_POSIXACL : 0);
+#endif
+	root_i = pram_iget(sb, PRAM_ROOT_INO);
+	if (IS_ERR(root_i)) {
+		retval = PTR_ERR(root_i);
+		goto out;
+	}
+
+	sb->s_root = d_alloc_root(root_i);
+	if (!sb->s_root) {
+		iput(root_i);
+		printk(KERN_ERR "get pramfs root inode failed\n");
+		retval = -ENOMEM;
+		goto out;
+	}
+
+	retval = 0;
+	return retval;
+ out:
+	if (sbi->virt_addr) {
+		if (pram_is_protected(sb))
+			pram_writeable(sbi->virt_addr, initsize, 1);
+		iounmap(sbi->virt_addr);
+		release_mem_region(sbi->phys_addr, initsize);
+	}
+
+	kfree(sbi);
+	return retval;
+}
+
+int pram_statfs(struct dentry *d, struct kstatfs *buf)
+{
+	struct super_block *sb = d->d_sb;
+	struct pram_super_block *ps = pram_get_super(sb);
+
+	buf->f_type = PRAM_SUPER_MAGIC;
+	buf->f_bsize = sb->s_blocksize;
+	buf->f_blocks = be32_to_cpu(ps->s_blocks_count);
+	buf->f_bfree = buf->f_bavail = pram_count_free_blocks(sb);
+	buf->f_files = be32_to_cpu(ps->s_inodes_count);
+	buf->f_ffree = be32_to_cpu(ps->s_free_inodes_count);
+	buf->f_namelen = PRAM_NAME_LEN;
+	return 0;
+}
+
+static int pram_show_options(struct seq_file *seq, struct vfsmount *vfs)
+{
+	struct pram_sb_info *sbi = (struct pram_sb_info *)vfs->mnt_sb->s_fs_info;
+
+	seq_printf(seq, ",physaddr=0x%016llx", (u64)sbi->phys_addr);
+	if (sbi->initsize)
+		seq_printf(seq, ",init=%luk", sbi->initsize >> 10);
+	if (sbi->blocksize)
+		seq_printf(seq, ",bs=%lu", sbi->blocksize);
+	if (sbi->bpi)
+		seq_printf(seq, ",bpi=%lu", sbi->bpi);
+	if (sbi->num_inodes)
+		seq_printf(seq, ",N=%lu", sbi->num_inodes);
+	if (sbi->mode != (S_IRWXUGO | S_ISVTX))
+		seq_printf(seq, ",mode=%03o", sbi->mode);
+	if (sbi->uid != 0)
+		seq_printf(seq, ",uid=%u", sbi->uid);
+	if (sbi->gid != 0)
+		seq_printf(seq, ",gid=%u", sbi->gid);
+	if (test_opt(vfs->mnt_sb, ERRORS_RO))
+		seq_puts(seq, ",errors=remount-ro");
+	if (test_opt(vfs->mnt_sb, ERRORS_PANIC))
+		seq_puts(seq, ",errors=panic");
+#ifdef CONFIG_PRAMFS_WRITE_PROTECT
+	/* memory protection enabled by default */
+	if (!test_opt(vfs->mnt_sb, PROTECT))
+		seq_puts(seq, ",noprotect");
+#else
+	/*
+	 * If write protection is not compiled in, tell the user that
+	 * the filesystem is unprotected.
+	 */
+	seq_puts(seq, ",noprotect");
+#endif
+
+#ifdef CONFIG_PRAMFS_XATTR
+	/* user xattr not enabled by default */
+	if (test_opt(vfs->mnt_sb, XATTR_USER))
+		seq_puts(seq, ",user_xattr");
+#endif
+
+#ifdef CONFIG_PRAMFS_POSIX_ACL
+	/* acl not enabled by default */
+	if (test_opt(vfs->mnt_sb, POSIX_ACL))
+		seq_puts(seq, ",acl");
+#endif
+
+#ifdef CONFIG_PRAMFS_XIP
+	/* xip not enabled by default */
+	if (test_opt(vfs->mnt_sb, XIP))
+		seq_puts(seq, ",xip");
+#endif
+
+	return 0;
+}
+
+int pram_remount(struct super_block *sb, int *mntflags, char *data)
+{
+	unsigned long old_sb_flags;
+	unsigned long old_mount_opt;
+	struct pram_super_block *ps;
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+	int ret = -EINVAL;
+
+	/* Store the old options */
+	old_sb_flags = sb->s_flags;
+	old_mount_opt = sbi->s_mount_opt;
+
+	if (pram_parse_options(data, sbi, 1))
+		goto restore_opt;
+
+	sb->s_flags = (sb->s_flags & ~MS_POSIXACL) |
+		((sbi->s_mount_opt & PRAM_MOUNT_POSIX_ACL) ? MS_POSIXACL : 0);
+
+	if ((*mntflags & MS_RDONLY) != (sb->s_flags & MS_RDONLY)) {
+		ps = pram_get_super(sb);
+		pram_memunlock_super(sb, ps);
+		ps->s_mtime = cpu_to_be32(get_seconds()); /* update mount time */
+		pram_memlock_super(sb, ps);
+	}
+
+	ret = 0;
+	return ret;
+
+ restore_opt:
+	sb->s_flags = old_sb_flags;
+	sbi->s_mount_opt = old_mount_opt;
+	return ret;
+}
+
+void pram_put_super(struct super_block *sb)
+{
+	struct pram_sb_info *sbi = (struct pram_sb_info *)sb->s_fs_info;
+	struct pram_super_block *ps = pram_get_super(sb);
+	u64 size = be64_to_cpu(ps->s_size);
+
+#ifdef CONFIG_PRAMFS_TEST
+	if (first_pram_super == sbi->virt_addr)
+		first_pram_super = NULL;
+#endif
+
+	pram_xattr_put_super(sb);
+	/* It's unmount time, so unmap the pramfs memory */
+	if (sbi->virt_addr) {
+		if (pram_is_protected(sb))
+			pram_writeable(sbi->virt_addr, size, 1);
+		iounmap(sbi->virt_addr);
+		sbi->virt_addr = NULL;
+		release_mem_region(sbi->phys_addr, size);
+	}
+
+	sb->s_fs_info = NULL;
+	kfree(sbi);
+}
+
+static struct inode *pram_alloc_inode(struct super_block *sb)
+{
+	struct pram_inode_vfs *vi = (struct pram_inode_vfs *)
+				kmem_cache_alloc(pram_inode_cachep, GFP_KERNEL);
+	if (!vi)
+		return NULL;
+	vi->vfs_inode.i_version = 1;
+	return &vi->vfs_inode;
+}
+
+static void pram_destroy_inode(struct inode *inode)
+{
+	kmem_cache_free(pram_inode_cachep, PRAM_I(inode));
+}
+
+static void init_once(void *foo)
+{
+	struct pram_inode_vfs *vi = (struct pram_inode_vfs *) foo;
+
+#ifdef CONFIG_PRAMFS_XATTR
+	init_rwsem(&vi->xattr_sem);
+#endif
+	mutex_init(&vi->truncate_mutex);
+	mutex_init(&vi->i_meta_mutex);
+	inode_init_once(&vi->vfs_inode);
+}
+
+static int __init init_inodecache(void)
+{
+	pram_inode_cachep = kmem_cache_create("pram_inode_cache",
+					     sizeof(struct pram_inode_vfs),
+					     0, (SLAB_RECLAIM_ACCOUNT|
+						SLAB_MEM_SPREAD),
+					     init_once);
+	if (pram_inode_cachep == NULL)
+		return -ENOMEM;
+	return 0;
+}
+
+static void destroy_inodecache(void)
+{
+	kmem_cache_destroy(pram_inode_cachep);
+}
+
+/*
+ * the super block writes are all done "on the fly", so the
+ * super block is never in a "dirty" state, so there's no need
+ * for write_super.
+ */
+static struct super_operations pram_sops = {
+	.alloc_inode	= pram_alloc_inode,
+	.destroy_inode	= pram_destroy_inode,
+	.write_inode	= pram_write_inode,
+	.dirty_inode	= pram_dirty_inode,
+	.evict_inode	= pram_evict_inode,
+	.put_super	= pram_put_super,
+	.statfs		= pram_statfs,
+	.remount_fs	= pram_remount,
+	.show_options	= pram_show_options,
+};
+
+static struct dentry *pram_mount(struct file_system_type *fs_type,
+	int flags, const char *dev_name, void *data)
+{
+	return mount_nodev(fs_type, flags, data, pram_fill_super);
+}
+
+static struct file_system_type pram_fs_type = {
+	.owner          = THIS_MODULE,
+	.name           = "pramfs",
+	.mount          = pram_mount,
+	.kill_sb        = kill_anon_super,
+};
+
+static struct inode *pram_nfs_get_inode(struct super_block *sb,
+		u64 ino, u32 generation)
+{
+	struct pram_super_block *ps = pram_get_super(sb);
+	struct inode *inode;
+
+	if (ino < PRAM_ROOT_INO)
+		return ERR_PTR(-ESTALE);
+	if (((ino - PRAM_ROOT_INO) >> PRAM_INODE_BITS) > be32_to_cpu(ps->s_inodes_count))
+		return ERR_PTR(-ESTALE);
+
+	inode = pram_iget(sb, ino);
+	if (IS_ERR(inode))
+		return ERR_CAST(inode);
+	if (generation && inode->i_generation != generation) {
+		/* we didn't find the right inode.. */
+		iput(inode);
+		return ERR_PTR(-ESTALE);
+	}
+	return inode;
+}
+
+static struct dentry *
+pram_fh_to_dentry(struct super_block *sb, struct fid *fid, int fh_len,
+		   int fh_type)
+{
+	return generic_fh_to_dentry(sb, fid, fh_len, fh_type,
+				    pram_nfs_get_inode);
+}
+
+static struct dentry *
+pram_fh_to_parent(struct super_block *sb, struct fid *fid, int fh_len,
+		   int fh_type)
+{
+	return generic_fh_to_parent(sb, fid, fh_len, fh_type,
+				    pram_nfs_get_inode);
+}
+
+static const struct export_operations pram_export_ops = {
+	.fh_to_dentry = pram_fh_to_dentry,
+	.fh_to_parent = pram_fh_to_parent,
+	.get_parent = pram_get_parent,
+};
+
+static int __init init_pram_fs(void)
+{
+	int rc = 0;
+
+	rc = init_pram_xattr();
+	if (rc)
+		return rc;
+
+	rc = init_inodecache();
+	if (rc)
+		goto out1;
+
+	rc = bdi_init(&pram_backing_dev_info);
+	if (rc)
+		goto out2;
+
+	rc = register_filesystem(&pram_fs_type);
+	if (rc)
+		goto out3;
+
+	return 0;
+
+out3:
+	bdi_destroy(&pram_backing_dev_info);
+out2:
+	destroy_inodecache();
+out1:
+	exit_pram_xattr();
+	return rc;
+}
+
+static void __exit exit_pram_fs(void)
+{
+	unregister_filesystem(&pram_fs_type);
+	bdi_destroy(&pram_backing_dev_info);
+	destroy_inodecache();
+	exit_pram_xattr();
+}
+
+MODULE_AUTHOR("Marco Stornelli <marco.stornelli@gmail.com>");
+MODULE_DESCRIPTION("Protected/Persistent RAM Filesystem");
+MODULE_LICENSE("GPL");
+
+module_init(init_pram_fs)
+module_exit(exit_pram_fs)
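
A note on the layout computed by pram_init() above: two superblock copies
come first, then the inode table (rounded up so that it ends on a block
boundary), then the in-use bitmap and the data blocks. Below is a minimal
userspace sketch of that arithmetic. The SKETCH_* constants are assumed
values chosen for illustration (the real PRAM_SB_SIZE and PRAM_INODE_BITS
are defined in pram.h, which is not part of this patch), and
sketch_compute_layout() and struct sketch_layout are hypothetical names.
For brevity the sketch only takes bpi into account, not an explicit N=
inode count.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only; see pram.h for the real ones. */
#define SKETCH_SB_SIZE		128u
#define SKETCH_INODE_BITS	7u
#define SKETCH_INODE_SIZE	(1u << SKETCH_INODE_BITS)

struct sketch_layout {
	uint64_t bitmap_start;	/* byte offset of the in-use bitmap */
	uint64_t num_inodes;	/* inode count after rounding up */
	uint64_t num_blocks;	/* blocks from bitmap_start to the end */
	uint64_t bitmap_size;	/* bitmap size in bytes, block aligned */
};

/* Round v up to a multiple of align, which must be a power of two. */
static uint64_t sketch_round_up(uint64_t v, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return (v + align - 1) & ~(align - 1);
}

/* Returns 0 on success, -1 if the region is too small. */
int sketch_compute_layout(uint64_t size, uint64_t blocksize, uint64_t bpi,
			  struct sketch_layout *l)
{
	uint64_t sb_bytes = 2 * (uint64_t)SKETCH_SB_SIZE;

	if (!bpi)	/* default: about 5% of the space for inodes */
		bpi = 20 * (uint64_t)SKETCH_INODE_SIZE;
	if (size < blocksize)
		return -1;

	l->num_inodes = size / bpi;
	/* end of the inode table, pushed up to a block boundary */
	l->bitmap_start = sketch_round_up(sb_bytes +
			(l->num_inodes << SKETCH_INODE_BITS), blocksize);
	/* the padding is handed back to the inode table */
	l->num_inodes = (l->bitmap_start - sb_bytes) >> SKETCH_INODE_BITS;

	/* extra guard against underflow, not present in pram_init() */
	if (l->bitmap_start >= size)
		return -1;
	l->num_blocks = (size - l->bitmap_start) / blocksize;
	if (!l->num_blocks)
		return -1;

	/* one bit per block, rounded up to bytes and then to a block */
	l->bitmap_size = sketch_round_up((l->num_blocks + 7) / 8, blocksize);
	return 0;
}
```

Under these assumed constants, a 1 MiB region with 4096-byte blocks and
the default bpi gives bitmap_start = 53248, 414 inodes, 243 blocks and a
one-block (4096-byte) bitmap.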
