* [PATCH v2 2/5] cramfs: make cramfs_physmem usable as root fs
From: Nicolas Pitre @ 2017-08-16 17:35 UTC
To: Alexander Viro; +Cc: linux-fsdevel, linux-embedded, linux-kernel, Chris Brandt
In-Reply-To: <20170816173536.1879-1-nicolas.pitre@linaro.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
init/do_mounts.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/init/do_mounts.c b/init/do_mounts.c
index c2de5104aa..43b5817f60 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -556,6 +556,14 @@ void __init prepare_namespace(void)
ssleep(root_delay);
}
+ if (IS_ENABLED(CONFIG_CRAMFS_PHYSMEM) && root_fs_names &&
+ !strcmp(root_fs_names, "cramfs_physmem")) {
+ int err = do_mount_root("cramfs", "cramfs_physmem",
+ root_mountflags, root_mount_data);
+ if (!err)
+ goto out;
+ }
+
/*
* wait for the known devices to complete their probing
*
--
2.9.5
* [PATCH v2 1/5] cramfs: direct memory access support
From: Nicolas Pitre @ 2017-08-16 17:35 UTC
To: Alexander Viro; +Cc: linux-fsdevel, linux-embedded, linux-kernel, Chris Brandt
In-Reply-To: <20170816173536.1879-1-nicolas.pitre@linaro.org>
Small embedded systems typically execute the kernel code in place (XIP)
directly from flash to save on precious RAM usage. This adds the ability
to consume filesystem data directly from flash to the cramfs filesystem
as well. Cramfs is particularly well suited to this feature as it is
very simple and its RAM usage is already very low, and with this feature
it is possible to use it with no block device support and even lower RAM
usage.
This patch was inspired by a similar patch from Shane Nay dated 17 years
ago that used to be very popular in embedded circles but never made it
into mainline. This is a cleaned-up implementation that uses far fewer
memory addresses at run time when both methods are configured in. In the
context of small IoT deployments, this functionality has become relevant
and useful again.
To distinguish between both access types, the cramfs_physmem filesystem
type must be specified when using a memory accessible cramfs image, and
the physaddr argument must provide the actual filesystem image's physical
memory location.
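A minimal userspace sketch of the physaddr= handling described above, for illustration only. It assumes 4 KiB pages and uses strtoull() as a stand-in for the kernel's memparse(), so size suffixes such as K or M are not handled. The helper name parse_physaddr() is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumption: 4 KiB pages */

/*
 * Hypothetical helper: extract the physaddr= value from a mount option
 * string and apply the two checks the patch performs (non-zero, page
 * aligned). Returns 0 on success, -EINVAL otherwise.
 */
static int parse_physaddr(const char *opts, uint64_t *addr)
{
	const char *p;

	assert(addr != NULL);
	if (!opts || !(p = strstr(opts, "physaddr=")))
		return -EINVAL;	/* the option is mandatory */
	*addr = strtoull(p + strlen("physaddr="), NULL, 0);
	if (!*addr)
		return -EINVAL;	/* zero or unparsable value */
	if (*addr & (SKETCH_PAGE_SIZE - 1))
		return -EINVAL;	/* not on a page boundary */
	return 0;
}
```

With the option string from the Kconfig example, "physaddr=0x100000", this sketch accepts the address; an unaligned value such as 0x100010 is rejected.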
Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
fs/cramfs/Kconfig | 30 ++++++-
fs/cramfs/inode.c | 264 +++++++++++++++++++++++++++++++++++++++++++-----------
2 files changed, 242 insertions(+), 52 deletions(-)
diff --git a/fs/cramfs/Kconfig b/fs/cramfs/Kconfig
index 11b29d491b..5eed4ad2d5 100644
--- a/fs/cramfs/Kconfig
+++ b/fs/cramfs/Kconfig
@@ -1,6 +1,5 @@
config CRAMFS
tristate "Compressed ROM file system support (cramfs) (OBSOLETE)"
- depends on BLOCK
select ZLIB_INFLATE
help
Saying Y here includes support for CramFs (Compressed ROM File
@@ -20,3 +19,32 @@ config CRAMFS
in terms of performance and features.
If unsure, say N.
+
+config CRAMFS_BLOCKDEV
+ bool "Support CramFs image over a regular block device" if EXPERT
+ depends on CRAMFS && BLOCK
+ default y
+ help
+ This option allows the CramFs driver to load data from a regular
+ block device such as a disk partition or a ramdisk.
+
+config CRAMFS_PHYSMEM
+ bool "Support CramFs image directly mapped in physical memory"
+ depends on CRAMFS
+ default y if !CRAMFS_BLOCKDEV
+ help
+ This option allows the CramFs driver to load data directly from
+ a linearly addressed memory range (usually non volatile memory
+ like flash) instead of going through the block device layer.
+ This saves some memory since no intermediate buffering is
+ necessary.
+
+ The filesystem type for this feature is "cramfs_physmem".
+ The location of the CramFs image in memory is board
+ dependent. Therefore, if you say Y, you must know the proper
+ physical address where to store the CramFs image and specify
+ it using the physaddr=0x******** mount option (for example:
+ "mount -t cramfs_physmem -o physaddr=0x100000 none /mnt").
+
+ If unsure, say N.
+
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index 7919967488..393eb27ef4 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -24,6 +24,7 @@
#include <linux/mutex.h>
#include <uapi/linux/cramfs_fs.h>
#include <linux/uaccess.h>
+#include <linux/io.h>
#include "internal.h"
@@ -36,6 +37,8 @@ struct cramfs_sb_info {
unsigned long blocks;
unsigned long files;
unsigned long flags;
+ void *linear_virt_addr;
+ phys_addr_t linear_phys_addr;
};
static inline struct cramfs_sb_info *CRAMFS_SB(struct super_block *sb)
@@ -140,6 +143,9 @@ static struct inode *get_cramfs_inode(struct super_block *sb,
* BLKS_PER_BUF*PAGE_SIZE, so that the caller doesn't need to
* worry about end-of-buffer issues even when decompressing a full
* page cache.
+ *
+ * Note: This is all optimized away at compile time when
+ * CONFIG_CRAMFS_BLOCKDEV=n.
*/
#define READ_BUFFERS (2)
/* NEXT_BUFFER(): Loop over [0..(READ_BUFFERS-1)]. */
@@ -160,10 +166,10 @@ static struct super_block *buffer_dev[READ_BUFFERS];
static int next_buffer;
/*
- * Returns a pointer to a buffer containing at least LEN bytes of
- * filesystem starting at byte offset OFFSET into the filesystem.
+ * Populate our block cache and return a pointer from it.
*/
-static void *cramfs_read(struct super_block *sb, unsigned int offset, unsigned int len)
+static void *cramfs_blkdev_read(struct super_block *sb, unsigned int offset,
+ unsigned int len)
{
struct address_space *mapping = sb->s_bdev->bd_inode->i_mapping;
struct page *pages[BLKS_PER_BUF];
@@ -239,7 +245,39 @@ static void *cramfs_read(struct super_block *sb, unsigned int offset, unsigned i
return read_buffers[buffer] + offset;
}
-static void cramfs_kill_sb(struct super_block *sb)
+/*
+ * Return a pointer to the linearly addressed cramfs image in memory.
+ */
+static void *cramfs_direct_read(struct super_block *sb, unsigned int offset,
+ unsigned int len)
+{
+ struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
+
+ if (!len)
+ return NULL;
+ if (len > sbi->size || offset > sbi->size - len)
+ return page_address(ZERO_PAGE(0));
+ return sbi->linear_virt_addr + offset;
+}
+
+/*
+ * Returns a pointer to a buffer containing at least LEN bytes of
+ * filesystem starting at byte offset OFFSET into the filesystem.
+ */
+static void *cramfs_read(struct super_block *sb, unsigned int offset,
+ unsigned int len)
+{
+ struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
+
+ if (IS_ENABLED(CONFIG_CRAMFS_PHYSMEM) && sbi->linear_virt_addr)
+ return cramfs_direct_read(sb, offset, len);
+ else if (IS_ENABLED(CONFIG_CRAMFS_BLOCKDEV))
+ return cramfs_blkdev_read(sb, offset, len);
+ else
+ return NULL;
+}
+
+static void cramfs_blkdev_kill_sb(struct super_block *sb)
{
struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
@@ -247,6 +285,16 @@ static void cramfs_kill_sb(struct super_block *sb)
kfree(sbi);
}
+static void cramfs_physmem_kill_sb(struct super_block *sb)
+{
+ struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
+
+ if (sbi->linear_virt_addr)
+ memunmap(sbi->linear_virt_addr);
+ kill_anon_super(sb);
+ kfree(sbi);
+}
+
static int cramfs_remount(struct super_block *sb, int *flags, char *data)
{
sync_filesystem(sb);
@@ -254,34 +302,24 @@ static int cramfs_remount(struct super_block *sb, int *flags, char *data)
return 0;
}
-static int cramfs_fill_super(struct super_block *sb, void *data, int silent)
+static int cramfs_read_super(struct super_block *sb,
+ struct cramfs_super *super, int silent)
{
- int i;
- struct cramfs_super super;
+ struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
unsigned long root_offset;
- struct cramfs_sb_info *sbi;
- struct inode *root;
-
- sb->s_flags |= MS_RDONLY;
-
- sbi = kzalloc(sizeof(struct cramfs_sb_info), GFP_KERNEL);
- if (!sbi)
- return -ENOMEM;
- sb->s_fs_info = sbi;
- /* Invalidate the read buffers on mount: think disk change.. */
- mutex_lock(&read_mutex);
- for (i = 0; i < READ_BUFFERS; i++)
- buffer_blocknr[i] = -1;
+ /* We don't know the real size yet */
+ sbi->size = PAGE_SIZE;
/* Read the first block and get the superblock from it */
- memcpy(&super, cramfs_read(sb, 0, sizeof(super)), sizeof(super));
+ mutex_lock(&read_mutex);
+ memcpy(super, cramfs_read(sb, 0, sizeof(*super)), sizeof(*super));
mutex_unlock(&read_mutex);
/* Do sanity checks on the superblock */
- if (super.magic != CRAMFS_MAGIC) {
+ if (super->magic != CRAMFS_MAGIC) {
/* check for wrong endianness */
- if (super.magic == CRAMFS_MAGIC_WEND) {
+ if (super->magic == CRAMFS_MAGIC_WEND) {
if (!silent)
pr_err("wrong endianness\n");
return -EINVAL;
@@ -289,10 +327,10 @@ static int cramfs_fill_super(struct super_block *sb, void *data, int silent)
/* check at 512 byte offset */
mutex_lock(&read_mutex);
- memcpy(&super, cramfs_read(sb, 512, sizeof(super)), sizeof(super));
+ memcpy(super, cramfs_read(sb, 512, sizeof(*super)), sizeof(*super));
mutex_unlock(&read_mutex);
- if (super.magic != CRAMFS_MAGIC) {
- if (super.magic == CRAMFS_MAGIC_WEND && !silent)
+ if (super->magic != CRAMFS_MAGIC) {
+ if (super->magic == CRAMFS_MAGIC_WEND && !silent)
pr_err("wrong endianness\n");
else if (!silent)
pr_err("wrong magic\n");
@@ -301,34 +339,34 @@ static int cramfs_fill_super(struct super_block *sb, void *data, int silent)
}
/* get feature flags first */
- if (super.flags & ~CRAMFS_SUPPORTED_FLAGS) {
+ if (super->flags & ~CRAMFS_SUPPORTED_FLAGS) {
pr_err("unsupported filesystem features\n");
return -EINVAL;
}
/* Check that the root inode is in a sane state */
- if (!S_ISDIR(super.root.mode)) {
+ if (!S_ISDIR(super->root.mode)) {
pr_err("root is not a directory\n");
return -EINVAL;
}
/* correct strange, hard-coded permissions of mkcramfs */
- super.root.mode |= (S_IRUSR | S_IXUSR | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH);
+ super->root.mode |= (S_IRUSR | S_IXUSR | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH);
- root_offset = super.root.offset << 2;
- if (super.flags & CRAMFS_FLAG_FSID_VERSION_2) {
- sbi->size = super.size;
- sbi->blocks = super.fsid.blocks;
- sbi->files = super.fsid.files;
+ root_offset = super->root.offset << 2;
+ if (super->flags & CRAMFS_FLAG_FSID_VERSION_2) {
+ sbi->size = super->size;
+ sbi->blocks = super->fsid.blocks;
+ sbi->files = super->fsid.files;
} else {
sbi->size = 1<<28;
sbi->blocks = 0;
sbi->files = 0;
}
- sbi->magic = super.magic;
- sbi->flags = super.flags;
+ sbi->magic = super->magic;
+ sbi->flags = super->flags;
if (root_offset == 0)
pr_info("empty filesystem");
- else if (!(super.flags & CRAMFS_FLAG_SHIFTED_ROOT_OFFSET) &&
+ else if (!(super->flags & CRAMFS_FLAG_SHIFTED_ROOT_OFFSET) &&
((root_offset != sizeof(struct cramfs_super)) &&
(root_offset != 512 + sizeof(struct cramfs_super))))
{
@@ -336,9 +374,18 @@ static int cramfs_fill_super(struct super_block *sb, void *data, int silent)
return -EINVAL;
}
+ return 0;
+}
+
+static int cramfs_finalize_super(struct super_block *sb,
+ struct cramfs_inode *cramfs_root)
+{
+ struct inode *root;
+
/* Set it all up.. */
+ sb->s_flags |= MS_RDONLY;
sb->s_op = &cramfs_ops;
- root = get_cramfs_inode(sb, &super.root, 0);
+ root = get_cramfs_inode(sb, cramfs_root, 0);
if (IS_ERR(root))
return PTR_ERR(root);
sb->s_root = d_make_root(root);
@@ -347,6 +394,92 @@ static int cramfs_fill_super(struct super_block *sb, void *data, int silent)
return 0;
}
+static int cramfs_blkdev_fill_super(struct super_block *sb, void *data, int silent)
+{
+ struct cramfs_sb_info *sbi;
+ struct cramfs_super super;
+ int i, err;
+
+ sbi = kzalloc(sizeof(struct cramfs_sb_info), GFP_KERNEL);
+ if (!sbi)
+ return -ENOMEM;
+ sb->s_fs_info = sbi;
+
+ /* Invalidate the read buffers on mount: think disk change.. */
+ for (i = 0; i < READ_BUFFERS; i++)
+ buffer_blocknr[i] = -1;
+
+ err = cramfs_read_super(sb, &super, silent);
+ if (err)
+ return err;
+ return cramfs_finalize_super(sb, &super.root);
+}
+
+static int cramfs_physmem_fill_super(struct super_block *sb, void *data, int silent)
+{
+ struct cramfs_sb_info *sbi;
+ struct cramfs_super super;
+ char *p;
+ int err;
+
+ sbi = kzalloc(sizeof(struct cramfs_sb_info), GFP_KERNEL);
+ if (!sbi)
+ return -ENOMEM;
+ sb->s_fs_info = sbi;
+
+ /*
+ * The physical location of the cramfs image is specified as
+ * a mount parameter. This parameter is mandatory for obvious
+ * reasons. Some validation is made on the phys address but this
+ * is not exhaustive and we count on the fact that someone using
+ * this feature is supposed to know what he/she's doing.
+ */
+ if (!data || !(p = strstr((char *)data, "physaddr="))) {
+ pr_err("unknown physical address for linear cramfs image\n");
+ return -EINVAL;
+ }
+ sbi->linear_phys_addr = memparse(p + 9, NULL);
+ if (!sbi->linear_phys_addr) {
+ pr_err("bad value for cramfs image physical address\n");
+ return -EINVAL;
+ }
+ if (sbi->linear_phys_addr & (PAGE_SIZE-1)) {
+ pr_err("physical address %pap for linear cramfs isn't aligned to a page boundary\n",
+ &sbi->linear_phys_addr);
+ return -EINVAL;
+ }
+
+ /*
+ * Map only one page for now. Will remap it when fs size is known.
+ * Although we'll only read from it, we want the CPU cache to
+ * kick in for the higher throughput it provides, hence MEMREMAP_WB.
+ */
+ pr_info("checking physical address %pap for linear cramfs image\n", &sbi->linear_phys_addr);
+ sbi->linear_virt_addr = memremap(sbi->linear_phys_addr, PAGE_SIZE,
+ MEMREMAP_WB);
+ if (!sbi->linear_virt_addr) {
+ pr_err("memremap of the linear cramfs image failed\n");
+ return -ENOMEM;
+ }
+
+ err = cramfs_read_super(sb, &super, silent);
+ if (err)
+ return err;
+
+ /* Remap the whole filesystem now */
+ pr_info("linear cramfs image appears to be %lu KB in size\n",
+ sbi->size/1024);
+ memunmap(sbi->linear_virt_addr);
+ sbi->linear_virt_addr = memremap(sbi->linear_phys_addr, sbi->size,
+ MEMREMAP_WB);
+ if (!sbi->linear_virt_addr) {
+ pr_err("memremap of the linear cramfs image failed\n");
+ return -ENOMEM;
+ }
+
+ return cramfs_finalize_super(sb, &super.root);
+}
+
static int cramfs_statfs(struct dentry *dentry, struct kstatfs *buf)
{
struct super_block *sb = dentry->d_sb;
@@ -573,38 +706,67 @@ static const struct super_operations cramfs_ops = {
.statfs = cramfs_statfs,
};
-static struct dentry *cramfs_mount(struct file_system_type *fs_type,
- int flags, const char *dev_name, void *data)
+static struct dentry *cramfs_blkdev_mount(struct file_system_type *fs_type,
+ int flags, const char *dev_name, void *data)
+{
+ return mount_bdev(fs_type, flags, dev_name, data, cramfs_blkdev_fill_super);
+}
+
+static struct dentry *cramfs_physmem_mount(struct file_system_type *fs_type,
+ int flags, const char *dev_name, void *data)
{
- return mount_bdev(fs_type, flags, dev_name, data, cramfs_fill_super);
+ return mount_nodev(fs_type, flags, data, cramfs_physmem_fill_super);
}
static struct file_system_type cramfs_fs_type = {
.owner = THIS_MODULE,
.name = "cramfs",
- .mount = cramfs_mount,
- .kill_sb = cramfs_kill_sb,
+ .mount = cramfs_blkdev_mount,
+ .kill_sb = cramfs_blkdev_kill_sb,
.fs_flags = FS_REQUIRES_DEV,
};
+
+static struct file_system_type cramfs_physmem_fs_type = {
+ .owner = THIS_MODULE,
+ .name = "cramfs_physmem",
+ .mount = cramfs_physmem_mount,
+ .kill_sb = cramfs_physmem_kill_sb,
+};
+
+#ifdef CONFIG_CRAMFS_BLOCKDEV
MODULE_ALIAS_FS("cramfs");
+#endif
+#ifdef CONFIG_CRAMFS_PHYSMEM
+MODULE_ALIAS_FS("cramfs_physmem");
+#endif
static int __init init_cramfs_fs(void)
{
int rv;
- rv = cramfs_uncompress_init();
- if (rv < 0)
- return rv;
- rv = register_filesystem(&cramfs_fs_type);
- if (rv < 0)
- cramfs_uncompress_exit();
- return rv;
+ if ((rv = cramfs_uncompress_init()) < 0)
+ goto err0;
+ if (IS_ENABLED(CONFIG_CRAMFS_BLOCKDEV) &&
+ (rv = register_filesystem(&cramfs_fs_type)) < 0)
+ goto err1;
+ if (IS_ENABLED(CONFIG_CRAMFS_PHYSMEM) &&
+ (rv = register_filesystem(&cramfs_physmem_fs_type)) < 0)
+ goto err2;
+ return 0;
+
+err2: if (IS_ENABLED(CONFIG_CRAMFS_BLOCKDEV))
+ unregister_filesystem(&cramfs_fs_type);
+err1: cramfs_uncompress_exit();
+err0: return rv;
}
static void __exit exit_cramfs_fs(void)
{
cramfs_uncompress_exit();
- unregister_filesystem(&cramfs_fs_type);
+ if (IS_ENABLED(CONFIG_CRAMFS_BLOCKDEV))
+ unregister_filesystem(&cramfs_fs_type);
+ if (IS_ENABLED(CONFIG_CRAMFS_PHYSMEM))
+ unregister_filesystem(&cramfs_physmem_fs_type);
}
module_init(init_cramfs_fs)
--
2.9.5
* [PATCH v2 0/5] cramfs refresh for embedded usage
From: Nicolas Pitre @ 2017-08-16 17:35 UTC
To: Alexander Viro; +Cc: linux-fsdevel, linux-embedded, linux-kernel, Chris Brandt
This series brings a nice refresh to the cramfs filesystem, adding the
following capabilities:
- Direct memory access, bypassing the block and/or MTD layers entirely.
- Ability to store individual data blocks uncompressed.
- Ability to locate individual data blocks anywhere in the filesystem.
The end result is a very tight filesystem that can be accessed directly
from ROM without any other subsystem underneath. Also this allows for
user space XIP which is a very important feature for tiny embedded
systems.
Why cramfs?
Because cramfs is very simple and small. With CONFIG_CRAMFS_BLOCKDEV=n and
CONFIG_CRAMFS_PHYSMEM=y the cramfs driver may use as little as 3704 bytes
of code. That's many times smaller than squashfs. And the runtime memory
usage is also much less with cramfs than squashfs. It packs very tightly
already compared to romfs which has no compression support. And the cramfs
format was simple to extend, allowing for both compressed and uncompressed
blocks within the same file.
Why not access ROM via MTD?
The MTD layer is nice and flexible. It also represents a huge overhead
considering that its core, with no other options enabled, weighs 19KB.
That's many times the size of the cramfs code for something that
essentially boils down to a glorified argument parser and a call to
memremap() in this case. And if someone still wants to use cramfs via
MTD then it is already possible with mtdblock.
Why not use DAX?
DAX stands for "Direct Access" and is a generic kernel layer helping
with the necessary tasks involved with XIP. It is tailored for large
writable filesystems and relies on the presence of an MMU. It also has
the following shortcoming: "The DAX code does not work correctly on
architectures which have virtually mapped caches such as ARM, MIPS and
SPARC." That makes it unsuitable for a large portion of the intended
targets for this series. And due to the read-only nature of cramfs, it is
possible to achieve the intended result with a much simpler approach making
DAX somewhat overkill in this context.
The maximum size of a cramfs image can't exceed 272MB. In practice it is
likely to be much much less. Given this series is concerned with small
memory systems, even in the MMU case there is always plenty of vmalloc
space left to map it all and even a 272MB memremap() wouldn't be a
problem. If it is then maybe your system is big enough with large
resources to manage already and you're pretty unlikely to be using cramfs
in the first place.
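The 272MB figure is not derived in this message. One plausible reading, based on the field widths of the cramfs on-disk inode (a 26-bit offset counted in 4-byte units and a 24-bit file size), is 256MB of addressable start offsets plus one last file of up to 16MB. The sketch below restates those widths as local constants and only illustrates that arithmetic; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths of the cramfs on-disk inode, restated locally. */
#define SKETCH_OFFSET_WIDTH 26 /* offset field, in units of 4 bytes */
#define SKETCH_SIZE_WIDTH   24 /* file size field, in bytes */

static_assert(SKETCH_OFFSET_WIDTH + 2 < 64, "result must fit in 64 bits");

/* Upper bound on image size: highest start offset plus largest file. */
static uint64_t cramfs_max_image_bytes(void)
{
	uint64_t max_offset = (UINT64_C(1) << SKETCH_OFFSET_WIDTH) * 4; /* 256 MiB */
	uint64_t max_file = UINT64_C(1) << SKETCH_SIZE_WIDTH;           /* 16 MiB */

	return max_offset + max_file;
}
```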
Of course, while this cramfs remains backward compatible with existing
filesystem images, a newer mkcramfs version is necessary to take advantage
of the extended data layout. I created a version of mkcramfs that
detects ELF files and marks text+rodata segments for XIP and compresses the
rest of those ELF files automatically.
So here it is. I'm also willing to step up as cramfs maintainer given
that no sign of any maintenance activities appeared for years.
This series is also available based on v4.13-rc4 via git here:
http://git.linaro.org/people/nicolas.pitre/linux xipcramfs
Changes from v1:
- Improved mmap() support by adding the ability to partially populate a
  mapping and lazily split the non directly mappable pages to a separate
vma at fault time (thanks to Chris Brandt for testing).
- Clarified the documentation some more.
diffstat:
Documentation/filesystems/cramfs.txt | 42 ++
MAINTAINERS | 4 +-
fs/cramfs/Kconfig | 39 +-
fs/cramfs/README | 31 +-
fs/cramfs/inode.c | 621 +++++++++++++++++++++++++----
include/uapi/linux/cramfs_fs.h | 20 +-
init/do_mounts.c | 8 +
7 files changed, 688 insertions(+), 77 deletions(-)
* RE: [PATCH 0/5] cramfs refresh for embedded usage
From: Chris Brandt @ 2017-08-16 15:12 UTC
To: Nicolas Pitre
Cc: Alexander Viro, linux-fsdevel@vger.kernel.org,
linux-embedded@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <alpine.LFD.2.20.1708161002500.17016@knanqh.ubzr>
On Wednesday, August 16, 2017, Nicolas Pitre wrote:
> > Just FYI,
> > I'm running an xipImage with all the RZ/A1 upstream drivers enabled and
> > only using about 4.5MB of total system RAM.
> > That's pretty good. Of course for a real application, you would trim off
> > the drivers and subsystems you don't plan on using, thus lowering your
> > RAM usage.
>
> On my MMU-less test target I'm going under the 1MB mark now.
Show off ;)
> Given that I also applied the device table patch to mkcramfs (that
> allows for the creation of device nodes and arbitrary
> user/group/permission without being root) it would be possible to extend
> this mechanism to implement other XIP patterns such as for
> uncompressible media files for example.
Good, I was going to ask about that.
I made an example once where all the graphics were RAW and uncompressed
and marked as XIP in AXFS. The result was a large saving of RAM because
the graphics framework (DirectFB) would copy directly from Flash
whenever it needed to do a background erase or image re-draw (button press
animations).
Same went for playing MP3 files. The MP3 files were XIP in flash, so
mpg123 pulled from flash directly.
Chris
* RE: [PATCH 0/5] cramfs refresh for embedded usage
From: Nicolas Pitre @ 2017-08-16 14:29 UTC
To: Chris Brandt
Cc: Alexander Viro, linux-fsdevel@vger.kernel.org,
linux-embedded@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <SG2PR06MB1165E8DC60659449C24DD6B68A820@SG2PR06MB1165.apcprd06.prod.outlook.com>
On Wed, 16 Aug 2017, Chris Brandt wrote:
> On Wednesday, August 16, 2017, Nicolas Pitre wrote:
> > > Yes, now I can boot with my rootfs being a XIP cramfs.
> > >
> > > However, like you said, libc is not XIP.
> >
> > I think I have it working now. Probably learned more about the memory
> > management internals than I ever wanted to know. Please try the patch
> > below on top of all the previous ones. If it works for you as well then
> > I'll rebase and repost the whole thing.
> >
> > diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
> > index 4c7f01fcd2..0b651f985c 100644
> > --- a/fs/cramfs/inode.c
> > +++ b/fs/cramfs/inode.c
>
>
> Yes, that worked. Very nice!
Good.
> Just FYI,
> I'm running an xipImage with all the RZ/A1 upstream drivers enabled and
> only using about 4.5MB of total system RAM.
> That's pretty good. Of course for a real application, you would trim off
> the drivers and subsystems you don't plan on using, thus lowering your
> RAM usage.
On my MMU-less test target I'm going under the 1MB mark now.
> > +/*
> > + * It is possible for cramfs_physmem_mmap() to partially populate the mapping
> > + * causing page faults in the unmapped area. When that happens, we need to
> > + * split the vma so that the unmapped area gets its own vma that can be backed
> > + * with actual memory pages and loaded normally. This is necessary because
> > + * remap_pfn_range() overwrites vma->vm_pgoff with the pfn and filemap_fault()
> > + * no longer works with it. Furthermore this makes /proc/x/maps right.
> > + * Q: is there a way to do split vma at mmap() time?
> > + */
>
> So if I understand correctly, the issue is that sometimes you only have
> a partial PAGE worth that you need to map. Correct?
Yes, or the page is stored in its compressed form in the filesystem, or
it is misaligned, or any combination of those.
> For the AXFS file system, XIP page mapping was done on a per page
> decision, not per file. So the mkfs.axfs tool would only mark a page as XIP if
> the entire section would fit in a complete PAGE. If for example you had
> a partial page at the end of a multi page code segment, it would put
> that partial page in a separate portion of the AXFS image and be marked as
> 'copy to RAM' instead of being marked as 'map as XIP'.
> So in the AXFS case, it was a combination of the creation tool and file
> system driver features to fix the partial page issue.
> Not sure if any of this info is relevant, but I thought I would mention
> anyway.
Same applies here. The XIP decision is no longer a per file thing. This
is why mkcramfs puts loadable and read-only ELF segments into
uncompressed and aligned blocks while still packing the remainder of the
file. The partial page issue can be "fixed" within mkcramfs if
considered worth it. To incur the page alignment overhead only once,
all the uncompressed blocks can be located together away from their file
block tables, etc. The extended format implemented in this series allows
for all this layout flexibility the fs creation tool may exploit.
The current restriction in the fs driver at the moment is that XIP
blocks must be contiguous in the filesystem. That is a hard requirement
in the non-mmu case anyway.
Given that I also applied the device table patch to mkcramfs (that
allows for the creation of device nodes and arbitrary
user/group/permission without being root) it would be possible to extend
this mechanism to implement other XIP patterns such as for
uncompressible media files for example.
> Thank you for your efforts on adding XIP to cramfs!
Thank you for testing.
Nicolas
* RE: [PATCH 0/5] cramfs refresh for embedded usage
From: Chris Brandt @ 2017-08-16 11:08 UTC
To: Nicolas Pitre
Cc: Alexander Viro, linux-fsdevel@vger.kernel.org,
linux-embedded@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <alpine.LFD.2.20.1708160105470.17016@knanqh.ubzr>
On Wednesday, August 16, 2017, Nicolas Pitre wrote:
> > Yes, now I can boot with my rootfs being a XIP cramfs.
> >
> > However, like you said, libc is not XIP.
>
> I think I have it working now. Probably learned more about the memory
> management internals than I ever wanted to know. Please try the patch
> below on top of all the previous ones. If it works for you as well then
> I'll rebase and repost the whole thing.
>
> diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
> index 4c7f01fcd2..0b651f985c 100644
> --- a/fs/cramfs/inode.c
> +++ b/fs/cramfs/inode.c
Yes, that worked. Very nice!
$ cat /proc/self/maps
00008000-000a1000 r-xp 1b005000 00:0c 18192 /bin/busybox
000a9000-000aa000 rw-p 00099000 00:0c 18192 /bin/busybox
000aa000-000ac000 rw-p 00000000 00:00 0 [heap]
b6e23000-b6efc000 r-xp 1b0bc000 00:0c 766540 /lib/libc-2.18-2013.10.so
b6efc000-b6f04000 ---p 1b195000 00:0c 766540 /lib/libc-2.18-2013.10.so
b6f04000-b6f06000 r--p 000d9000 00:0c 766540 /lib/libc-2.18-2013.10.so
b6f06000-b6f07000 rw-p 000db000 00:0c 766540 /lib/libc-2.18-2013.10.so
b6f07000-b6f0a000 rw-p 00000000 00:00 0
b6f0a000-b6f21000 r-xp 1b0a4000 00:0c 670372 /lib/ld-2.18-2013.10.so
b6f24000-b6f25000 rw-p 00000000 00:00 0
b6f26000-b6f28000 rw-p 00000000 00:00 0
b6f28000-b6f29000 r--p 00016000 00:0c 670372 /lib/ld-2.18-2013.10.so
b6f29000-b6f2a000 rw-p 00017000 00:0c 670372 /lib/ld-2.18-2013.10.so
be877000-be898000 rw-p 00000000 00:00 0 [stack]
beba9000-bebaa000 r-xp 00000000 00:00 0 [sigpage]
ffff0000-ffff1000 r-xp 00000000 00:00 0 [vectors]
Just FYI,
I'm running an xipImage with all the RZ/A1 upstream drivers enabled and
only using about 4.5MB of total system RAM.
That's pretty good. Of course for a real application, you would trim off
the drivers and subsystems you don't plan on using, thus lowering your
RAM usage.
> +/*
> + * It is possible for cramfs_physmem_mmap() to partially populate the mapping
> + * causing page faults in the unmapped area. When that happens, we need to
> + * split the vma so that the unmapped area gets its own vma that can be backed
> + * with actual memory pages and loaded normally. This is necessary because
> + * remap_pfn_range() overwrites vma->vm_pgoff with the pfn and filemap_fault()
> + * no longer works with it. Furthermore this makes /proc/x/maps right.
> + * Q: is there a way to do split vma at mmap() time?
> + */
So if I understand correctly, the issue is that sometimes you only have
a partial PAGE worth that you need to map. Correct?
For the AXFS file system, XIP page mapping was done on a per page
decision, not per file. So the mkfs.axfs tool would only mark a page as XIP if
the entire section would fit in a complete PAGE. If for example you had
a partial page at the end of a multi page code segment, it would put
that partial page in a separate portion of the AXFS image and be marked as
'copy to RAM' instead of being marked as 'map as XIP'.
So in the AXFS case, it was a combination of the creation tool and file
system driver features to fix the partial page issue.
Not sure if any of this info is relevant, but I thought I would mention
anyway.
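As a rough illustration of the per-page decision described above (this is not AXFS code; the names and the 4 KiB page size are assumptions), a creation tool could count how many complete pages of a segment qualify for XIP and how many trailing bytes form a partial page that must be copied to RAM:

```c
#include <assert.h>
#include <stddef.h>

#define SK_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

static_assert((SK_PAGE_SIZE & (SK_PAGE_SIZE - 1)) == 0,
	      "page size must be a power of two");

/* Hypothetical result of splitting one segment. */
struct sk_split {
	size_t xip_pages;  /* complete pages, marked 'map as XIP' */
	size_t copy_bytes; /* trailing partial page, marked 'copy to RAM' */
};

static struct sk_split split_segment(size_t seg_len)
{
	struct sk_split s;

	s.xip_pages = seg_len / SK_PAGE_SIZE;
	s.copy_bytes = seg_len % SK_PAGE_SIZE;
	return s;
}
```

A segment of two and a bit pages thus yields two XIP pages plus a remainder stored separately.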
Thank you for your efforts on adding XIP to cramfs!
Chris
* RE: [PATCH 0/5] cramfs refresh for embedded usage
From: Nicolas Pitre @ 2017-08-16 5:10 UTC
To: Chris Brandt
Cc: Alexander Viro, linux-fsdevel@vger.kernel.org,
linux-embedded@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <SG2PR06MB1165749BD3C8336AB0CD27618A8D0@SG2PR06MB1165.apcprd06.prod.outlook.com>
On Tue, 15 Aug 2017, Chris Brandt wrote:
> On Tuesday, August 15, 2017, Nicolas Pitre wrote:
> > I was able to reproduce. The following patch on top should partially fix
> > it. I'm trying to figure out how to split a vma and link it properly in
> > the case the vma cannot be mapped entirely. In the mean time shared libs
> > won't be XIP.
> >
> >
> > diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
> > index 5aedbd224e..4c7f01fcd2 100644
> > --- a/fs/cramfs/inode.c
> > +++ b/fs/cramfs/inode.c
>
>
> Yes, now I can boot with my rootfs being a XIP cramfs.
>
> However, like you said, libc is not XIP.
I think I have it working now. Probably learned more about the memory
management internals than I ever wanted to know. Please try the patch
below on top of all the previous ones. If it works for you as well then
I'll rebase and repost the whole thing.
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index 4c7f01fcd2..0b651f985c 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -321,6 +321,86 @@ static u32 cramfs_get_block_range(struct inode *inode, u32 pgoff, u32 *pages)
return blockaddr << 2;
}
+/*
+ * It is possible for cramfs_physmem_mmap() to partially populate the mapping
+ * causing page faults in the unmapped area. When that happens, we need to
+ * split the vma so that the unmapped area gets its own vma that can be backed
+ * with actual memory pages and loaded normally. This is necessary because
+ * remap_pfn_range() overwrites vma->vm_pgoff with the pfn and filemap_fault()
+ * no longer works with it. Furthermore this makes /proc/x/maps right.
+ * Q: is there a way to do split vma at mmap() time?
+ */
+static const struct vm_operations_struct cramfs_vmasplit_ops;
+static int cramfs_vmasplit_fault(struct vm_fault *vmf)
+{
+ struct mm_struct *mm = vmf->vma->vm_mm;
+ struct vm_area_struct *vma, *new_vma;
+ unsigned long split_val, split_addr;
+ unsigned int split_pgoff, split_page;
+ int ret;
+
+ /* Retrieve the vma split address and validate it */
+ vma = vmf->vma;
+ split_val = (unsigned long)vma->vm_private_data;
+ split_pgoff = split_val & 0xffff;
+ split_page = split_val >> 16;
+ split_addr = vma->vm_start + split_page * PAGE_SIZE;
+ pr_debug("fault: addr=%#lx vma=%#lx-%#lx split=%#lx\n",
+ vmf->address, vma->vm_start, vma->vm_end, split_addr);
+ if (!split_val || split_addr >= vma->vm_end || vmf->address < split_addr)
+ return VM_FAULT_SIGSEGV;
+
+ /* We have some vma surgery to do and need the write lock. */
+ up_read(&mm->mmap_sem);
+ if (down_write_killable(&mm->mmap_sem))
+ return VM_FAULT_RETRY;
+
+ /* Make sure the vma didn't change between the locks */
+ vma = find_vma(mm, vmf->address);
+ if (vma->vm_ops != &cramfs_vmasplit_ops) {
+ /*
+ * Someone else raced with us and could have handled the fault.
+ * Let it go back to user space and fault again if necessary.
+ */
+ downgrade_write(&mm->mmap_sem);
+ return VM_FAULT_NOPAGE;
+ }
+
+ /* Split the vma between the directly mapped area and the rest */
+ ret = split_vma(mm, vma, split_addr, 0);
+ if (ret) {
+ downgrade_write(&mm->mmap_sem);
+ return VM_FAULT_OOM;
+ }
+
+ /* The direct vma should no longer ever fault */
+ vma->vm_ops = NULL;
+
+ /* Retrieve the new vma covering the unmapped area */
+ new_vma = find_vma(mm, split_addr);
+ BUG_ON(new_vma == vma);
+ if (!new_vma) {
+ downgrade_write(&mm->mmap_sem);
+ return VM_FAULT_SIGSEGV;
+ }
+
+ /*
+ * Readjust the new vma with the actual file based pgoff and
+ * process the fault normally on it.
+ */
+ new_vma->vm_pgoff = split_pgoff;
+ new_vma->vm_ops = &generic_file_vm_ops;
+ vmf->vma = new_vma;
+ vmf->pgoff = split_pgoff;
+ vmf->pgoff += (vmf->address - new_vma->vm_start) >> PAGE_SHIFT;
+ downgrade_write(&mm->mmap_sem);
+ return filemap_fault(vmf);
+}
+
+static const struct vm_operations_struct cramfs_vmasplit_ops = {
+ .fault = cramfs_vmasplit_fault,
+};
+
static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
{
struct inode *inode = file_inode(file);
@@ -337,6 +417,7 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_MAYWRITE))
return -EINVAL;
+ /* Could COW work here? */
fail_reason = "vma is writable";
if (vma->vm_flags & VM_WRITE)
goto fail;
@@ -364,7 +445,7 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
unsigned int partial = offset_in_page(inode->i_size);
if (partial) {
char *data = sbi->linear_virt_addr + offset;
- data += (pages - 1) * PAGE_SIZE + partial;
+ data += (max_pages - 1) * PAGE_SIZE + partial;
while ((unsigned long)data & 7)
if (*data++ != 0)
goto nonzero;
@@ -383,35 +464,42 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
if (pages) {
/*
- * Split the vma if we can't map it all so normal paging
- * will take care of the rest through cramfs_readpage().
+ * If we can't map it all, page faults will occur if the
+ * unmapped area is accessed. Let's handle them to split the
+ * vma and let the normal paging machinery take care of the
+ * rest through cramfs_readpage(). Because remap_pfn_range()
+ * repurposes vma->vm_pgoff, we have to save it somewhere.
+ * Let's use vma->vm_private_data to hold both the pgoff and the actual
+ * address split point. Maximum file size is 16MB so we can pack both together.
*/
if (pages != vma_pages) {
- if (1) {
- fail_reason = "fix me";
- goto fail;
- }
- ret = split_vma(vma->vm_mm, vma,
- vma->vm_start + pages * PAGE_SIZE, 0);
- if (ret)
- return ret;
+ unsigned int split_pgoff = vma->vm_pgoff + pages;
+ unsigned long split_val = split_pgoff + (pages << 16);
+ vma->vm_private_data = (void *)split_val;
+ vma->vm_ops = &cramfs_vmasplit_ops;
+ /* to keep remap_pfn_range() happy */
+ vma->vm_end = vma->vm_start + pages * PAGE_SIZE;
}
ret = remap_pfn_range(vma, vma->vm_start, address >> PAGE_SHIFT,
pages * PAGE_SIZE, vma->vm_page_prot);
+ /* restore vm_end in case we cheated it above */
+ vma->vm_end = vma->vm_start + vma_pages * PAGE_SIZE;
if (ret)
return ret;
+ pr_debug("mapped %s at 0x%08lx, %u/%u pages to vma 0x%08lx, "
+ "page_prot 0x%llx\n", file_dentry(file)->d_name.name,
+ address, pages, vma_pages, vma->vm_start,
+ (unsigned long long)pgprot_val(vma->vm_page_prot));
+ return 0;
}
-
- pr_debug("mapped %s at 0x%08lx, %u/%u pages to vma 0x%08lx, "
- "page_prot 0x%llx\n", file_dentry(file)->d_name.name,
- address, pages, vma_pages, vma->vm_start,
- (unsigned long long)pgprot_val(vma->vm_page_prot));
- return 0;
+ fail_reason = "no suitable block remaining";
fail:
pr_debug("%s: direct mmap failed: %s\n",
file_dentry(file)->d_name.name, fail_reason);
+
+ /* We failed to do a direct map, but normal paging will do it */
vma->vm_ops = &generic_file_vm_ops;
return 0;
}
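The packing of the split point into vma->vm_private_data described in the comment above can be sketched in plain user-space C. This is only an illustration of the arithmetic: the helper and macro names are hypothetical (the patch open-codes the shifts and masks), and the 16-bit field width is taken from the 0xffff mask and the shift by 16 used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the patch itself uses literal 0xffff and 16. */
#define SPLIT_PGOFF_BITS 16
#define SPLIT_PGOFF_MASK ((1UL << SPLIT_PGOFF_BITS) - 1)

/*
 * Pack the file page offset of the split point (low 16 bits) together
 * with the number of directly mapped pages (bits 16 and up), mirroring
 * "split_val = split_pgoff + (pages << 16)" from the patch.
 */
static inline unsigned long split_pack(unsigned int split_pgoff,
				       unsigned int pages)
{
	/* Both values must fit in 16 bits for the packing to be lossless. */
	assert(split_pgoff <= SPLIT_PGOFF_MASK);
	assert(pages <= SPLIT_PGOFF_MASK);
	return (unsigned long)split_pgoff +
	       ((unsigned long)pages << SPLIT_PGOFF_BITS);
}

/* Recover the file page offset, as the fault handler does with & 0xffff. */
static inline unsigned int split_unpack_pgoff(unsigned long split_val)
{
	return (unsigned int)(split_val & SPLIT_PGOFF_MASK);
}

/* Recover the page index of the split within the vma (>> 16). */
static inline unsigned int split_unpack_page(unsigned long split_val)
{
	return (unsigned int)(split_val >> SPLIT_PGOFF_BITS);
}
```

With a 16MB maximum file size and 4KB pages, neither value exceeds 4096, so both fit comfortably in their 16-bit fields even where unsigned long is 32 bits wide.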
* RE: [PATCH 0/5] cramfs refresh for embedded usage
From: Chris Brandt @ 2017-08-15 11:00 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Alexander Viro, linux-fsdevel@vger.kernel.org,
linux-embedded@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <alpine.LFD.2.20.1708150001380.17016@knanqh.ubzr>
On Tuesday, August 15, 2017, Nicolas Pitre wrote:
> I was able to reproduce. The following patch on top should partially fix
> it. I'm trying to figure out how to split a vma and link it properly in
> the case the vma cannot be mapped entirely. In the meantime, shared libs
> won't be XIP.
>
>
> diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
> index 5aedbd224e..4c7f01fcd2 100644
> --- a/fs/cramfs/inode.c
> +++ b/fs/cramfs/inode.c
Yes, now I can boot with my rootfs being a XIP cramfs.
However, like you said, libc is not XIP.
$ cat /proc/self/maps
00008000-000a1000 r-xp 1b005000 00:0c 18192 /bin/busybox
000a9000-000aa000 rw-p 00099000 00:0c 18192 /bin/busybox
000aa000-000ac000 rw-p 00000000 00:00 0 [heap]
b6ed8000-b6fb1000 r-xp 00000000 00:0c 766540 /lib/libc-2.18-2013.10.so
b6fb1000-b6fb9000 ---p 000d9000 00:0c 766540 /lib/libc-2.18-2013.10.so
b6fb9000-b6fbb000 r--p 000d9000 00:0c 766540 /lib/libc-2.18-2013.10.so
b6fbb000-b6fbc000 rw-p 000db000 00:0c 766540 /lib/libc-2.18-2013.10.so
b6fbc000-b6fbf000 rw-p 00000000 00:00 0
b6fbf000-b6fd6000 r-xp 00000000 00:0c 670372 /lib/ld-2.18-2013.10.so
b6fd9000-b6fda000 rw-p 00000000 00:00 0
b6fdb000-b6fdd000 rw-p 00000000 00:00 0
b6fdd000-b6fde000 r--p 00016000 00:0c 670372 /lib/ld-2.18-2013.10.so
b6fde000-b6fdf000 rw-p 00017000 00:0c 670372 /lib/ld-2.18-2013.10.so
be81f000-be840000 rw-p 00000000 00:00 0 [stack]
beb19000-beb1a000 r-xp 00000000 00:00 0 [sigpage]
ffff0000-ffff1000 r-xp 00000000 00:00 0 [vectors]
Chris
* RE: [PATCH 0/5] cramfs refresh for embedded usage
From: Nicolas Pitre @ 2017-08-15 4:10 UTC (permalink / raw)
To: Chris Brandt
Cc: Alexander Viro, linux-fsdevel@vger.kernel.org,
linux-embedded@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <SG2PR06MB1165D300028F81087B045C8A8A8C0@SG2PR06MB1165.apcprd06.prod.outlook.com>
On Mon, 14 Aug 2017, Chris Brandt wrote:
> On Monday, August 14, 2017, Nicolas Pitre wrote:
> > > However, now with your mkcramfs tool, I can no longer mount my cramfs
> > > image as the rootfs on boot. I was able to do that before (ie, 30
> > minutes
> > > ago) when using the community mkcramfs.
> > >
> > > I get this:
> > >
> > > [ 1.712425] cramfs: checking physical address 0x1b000000 for linear
> > cramfs image
> > > [ 1.720531] cramfs: linear cramfs image appears to be 15744 KB in
> > size
> > > [ 1.728656] VFS: Mounted root (cramfs_physmem filesystem) readonly on
> > device 0:12.
> > > [ 1.737062] devtmpfs: mounted
> > > [ 1.741139] Freeing unused kernel memory: 48K
> > > [ 1.745545] This architecture does not have kernel memory protection.
> > > [ 1.760381] Starting init: /sbin/init exists but couldn't execute it
> > (error -22)
> > > [ 1.769685] Starting init: /bin/sh exists but couldn't execute it
> > (error -14)
> >
> > Is /sbin/init a link to busybox?
>
> Yes.
>
>
> > I suppose it just boots if you do mkcramfs without -X?
>
> Correct. I just created another image and removed the "-X -X" when
> creating it. Now I can boot that image as my rootfs.
> (I'm using -X -X because I'm using a Cortex-A9 with MMU).
>
>
> > If so could you share your non-working cramfs image with me?
>
> I will send it (in a separate email)
I was able to reproduce. The following patch on top should partially fix
it. I'm trying to figure out how to split a vma and link it properly in
the case the vma cannot be mapped entirely. In the meantime, shared libs
won't be XIP.
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index 5aedbd224e..4c7f01fcd2 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -285,10 +285,10 @@ static void *cramfs_read(struct super_block *sb, unsigned int offset,
/*
* For a mapping to be possible, we need a range of uncompressed and
- * contiguous blocks. Return the offset for the first block if that
- * verifies, or zero otherwise.
+ * contiguous blocks. Return the offset for the first block and number of
+ * valid blocks for which that is true, or zero otherwise.
*/
-static u32 cramfs_get_block_range(struct inode *inode, u32 pgoff, u32 pages)
+static u32 cramfs_get_block_range(struct inode *inode, u32 pgoff, u32 *pages)
{
struct super_block *sb = inode->i_sb;
struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
@@ -306,11 +306,16 @@ static u32 cramfs_get_block_range(struct inode *inode, u32 pgoff, u32 pages)
do {
u32 expect = blockaddr + i * (PAGE_SIZE >> 2);
expect |= CRAMFS_BLK_FLAG_DIRECT_PTR|CRAMFS_BLK_FLAG_UNCOMPRESSED;
- pr_debug("range: block %d/%d got %#x expects %#x\n",
- pgoff+i, pgoff+pages-1, blockptrs[i], expect);
- if (blockptrs[i] != expect)
- return 0;
- } while (++i < pages);
+ if (blockptrs[i] != expect) {
+ pr_debug("range: block %d/%d got %#x expects %#x\n",
+ pgoff+i, pgoff+*pages-1, blockptrs[i], expect);
+ if (i == 0)
+ return 0;
+ break;
+ }
+ } while (++i < *pages);
+
+ *pages = i;
/* stored "direct" block ptrs are shifted down by 2 bits */
return blockaddr << 2;
@@ -321,8 +326,8 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
struct inode *inode = file_inode(file);
struct super_block *sb = inode->i_sb;
struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
- unsigned int pages, max_pages, offset;
- unsigned long length, address;
+ unsigned int pages, vma_pages, max_pages, offset;
+ unsigned long address;
char *fail_reason;
int ret;
@@ -332,17 +337,20 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_MAYWRITE))
return -EINVAL;
- vma->vm_ops = &generic_file_vm_ops;
+ fail_reason = "vma is writable";
if (vma->vm_flags & VM_WRITE)
- return 0;
+ goto fail;
- length = vma->vm_end - vma->vm_start;
- pages = (length + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ vma_pages = (vma->vm_end - vma->vm_start + PAGE_SIZE - 1) >> PAGE_SHIFT;
max_pages = (inode->i_size + PAGE_SIZE - 1) >> PAGE_SHIFT;
- if (vma->vm_pgoff >= max_pages || pages > max_pages - vma->vm_pgoff)
- return -EINVAL;
+ fail_reason = "beyond file limit";
+ if (vma->vm_pgoff >= max_pages)
+ goto fail;
+ pages = vma_pages;
+ if (pages > max_pages - vma->vm_pgoff)
+ pages = max_pages - vma->vm_pgoff;
- offset = cramfs_get_block_range(inode, vma->vm_pgoff, pages);
+ offset = cramfs_get_block_range(inode, vma->vm_pgoff, &pages);
fail_reason = "unsuitable block layout";
if (!offset)
goto fail;
@@ -351,37 +359,60 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
if (!PAGE_ALIGNED(address))
goto fail;
- /* Don't map a partial page if it contains some other data */
+ /* Don't map the last page if it contains some other data */
if (unlikely(vma->vm_pgoff + pages == max_pages)) {
unsigned int partial = offset_in_page(inode->i_size);
if (partial) {
char *data = sbi->linear_virt_addr + offset;
data += (pages - 1) * PAGE_SIZE + partial;
- fail_reason = "last partial page is shared";
while ((unsigned long)data & 7)
if (*data++ != 0)
- goto fail;
+ goto nonzero;
while (offset_in_page(data)) {
- if (*(u64 *)data != 0)
- goto fail;
+ if (*(u64 *)data != 0) {
+ nonzero:
+ pr_debug("mmap: %s: last page is shared\n",
+ file_dentry(file)->d_name.name);
+ pages--;
+ break;
+ }
data += 8;
}
}
}
-
- ret = remap_pfn_range(vma, vma->vm_start, address >> PAGE_SHIFT,
- length, vma->vm_page_prot);
- if (ret)
- return ret;
- pr_debug("mapped %s at 0x%08lx, length %lu to vma 0x%08lx, "
+
+ if (pages) {
+ /*
+ * Split the vma if we can't map it all so normal paging
+ * will take care of the rest through cramfs_readpage().
+ */
+ if (pages != vma_pages) {
+ if (1) {
+ fail_reason = "fix me";
+ goto fail;
+ }
+ ret = split_vma(vma->vm_mm, vma,
+ vma->vm_start + pages * PAGE_SIZE, 0);
+ if (ret)
+ return ret;
+ }
+
+ ret = remap_pfn_range(vma, vma->vm_start, address >> PAGE_SHIFT,
+ pages * PAGE_SIZE, vma->vm_page_prot);
+ if (ret)
+ return ret;
+ }
+
+ pr_debug("mapped %s at 0x%08lx, %u/%u pages to vma 0x%08lx, "
"page_prot 0x%llx\n", file_dentry(file)->d_name.name,
- address, length, vma->vm_start,
+ address, pages, vma_pages, vma->vm_start,
(unsigned long long)pgprot_val(vma->vm_page_prot));
return 0;
fail:
pr_debug("%s: direct mmap failed: %s\n",
file_dentry(file)->d_name.name, fail_reason);
+ vma->vm_ops = &generic_file_vm_ops;
return 0;
}
@@ -394,14 +425,15 @@ static unsigned long cramfs_physmem_get_unmapped_area(struct file *file,
struct inode *inode = file_inode(file);
struct super_block *sb = inode->i_sb;
struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
- unsigned int pages, max_pages, offset;
+ unsigned int pages, block_pages, max_pages, offset;
pages = (len + PAGE_SIZE - 1) >> PAGE_SHIFT;
max_pages = (inode->i_size + PAGE_SIZE - 1) >> PAGE_SHIFT;
if (pgoff >= max_pages || pages > max_pages - pgoff)
return -EINVAL;
- offset = cramfs_get_block_range(inode, pgoff, pages);
- if (!offset)
+ block_pages = pages;
+ offset = cramfs_get_block_range(inode, pgoff, &block_pages);
+ if (!offset || block_pages != pages)
return -ENOSYS;
addr = sbi->linear_phys_addr + offset;
pr_debug("get_unmapped for %s ofs %#lx siz %lu at 0x%08lx\n",
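The reworked cramfs_get_block_range() logic in the patch above (count how many leading blocks are direct, uncompressed and contiguous instead of failing outright) can be modelled in user space as follows. This is a sketch under stated assumptions: the function and macro names are hypothetical, the flag bit positions (bits 31 and 30) and the 4KB page size are assumed for illustration and should be checked against the real cramfs header, and the block pointer array is passed in directly rather than read from the mapped image.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define MODEL_BLK_FLAG_UNCOMPRESSED (1u << 31)
#define MODEL_BLK_FLAG_DIRECT_PTR   (1u << 30)
#define MODEL_BLK_FLAGS \
	(MODEL_BLK_FLAG_UNCOMPRESSED | MODEL_BLK_FLAG_DIRECT_PTR)
#define MODEL_PAGE_SIZE 4096u

/*
 * Return the byte offset of the first block, or 0 if even the first
 * block is unsuitable. On return, *pages holds the number of leading
 * blocks that are direct, uncompressed and physically contiguous.
 */
static inline uint32_t model_get_block_range(const uint32_t *blockptrs,
					     uint32_t *pages)
{
	uint32_t blockaddr = blockptrs[0] & ~MODEL_BLK_FLAGS;
	uint32_t i = 0;

	assert(*pages > 0);
	do {
		/* Direct pointers are stored in units of 4 bytes. */
		uint32_t expect = blockaddr + i * (MODEL_PAGE_SIZE >> 2);

		expect |= MODEL_BLK_FLAG_DIRECT_PTR |
			  MODEL_BLK_FLAG_UNCOMPRESSED;
		if (blockptrs[i] != expect) {
			if (i == 0)
				return 0;
			break;
		}
	} while (++i < *pages);

	*pages = i;
	/* Stored "direct" block pointers are shifted down by 2 bits. */
	return blockaddr << 2;
}
```

A caller that needs the whole range, as cramfs_physmem_get_unmapped_area() does, compares the returned page count with the requested one; a caller that can live with a partial mapping, as cramfs_physmem_mmap() does, simply maps the pages that were found.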
* RE: [PATCH 0/5] cramfs refresh for embedded usage
From: Chris Brandt @ 2017-08-14 18:37 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Alexander Viro, linux-fsdevel@vger.kernel.org,
linux-embedded@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <alpine.LFD.2.20.1708141405390.17016@knanqh.ubzr>
On Monday, August 14, 2017, Nicolas Pitre wrote:
> > However, now with your mkcramfs tool, I can no longer mount my cramfs
> > image as the rootfs on boot. I was able to do that before (ie, 30
> minutes
> > ago) when using the community mkcramfs.
> >
> > I get this:
> >
> > [ 1.712425] cramfs: checking physical address 0x1b000000 for linear
> cramfs image
> > [ 1.720531] cramfs: linear cramfs image appears to be 15744 KB in
> size
> > [ 1.728656] VFS: Mounted root (cramfs_physmem filesystem) readonly on
> device 0:12.
> > [ 1.737062] devtmpfs: mounted
> > [ 1.741139] Freeing unused kernel memory: 48K
> > [ 1.745545] This architecture does not have kernel memory protection.
> > [ 1.760381] Starting init: /sbin/init exists but couldn't execute it
> (error -22)
> > [ 1.769685] Starting init: /bin/sh exists but couldn't execute it
> (error -14)
>
> Is /sbin/init a link to busybox?
Yes.
> I suppose it just boots if you do mkcramfs without -X?
Correct. I just created another image and removed the "-X -X" when
creating it. Now I can boot that image as my rootfs.
(I'm using -X -X because I'm using a Cortex-A9 with MMU).
> If so could you share your non-working cramfs image with me?
I will send it (in a separate email)
Chris
* RE: [PATCH 0/5] cramfs refresh for embedded usage
From: Nicolas Pitre @ 2017-08-14 18:17 UTC (permalink / raw)
To: Chris Brandt
Cc: Alexander Viro, linux-fsdevel@vger.kernel.org,
linux-embedded@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <SG2PR06MB116563FCF10E2615C65080F28A8C0@SG2PR06MB1165.apcprd06.prod.outlook.com>
On Mon, 14 Aug 2017, Chris Brandt wrote:
> On Monday, August 14, 2017, Nicolas Pitre wrote:
> > > I just applied the patches and tried this simple test:
> > > - tested with a Renesas RZ/A1 (Cortex-A9...so it has an MMU).
> > > - I set the sticky bit for busybox before using mkcramfs
> >
> > You need the newer mkcramfs I linked to in the documentation. With it
> > you don't need to play tricks with the sticky bit anymore. However you
> > need to specify -X twice (or just once for no-MMU targets) and it will
> > make every ELF file XIPable automatically.
>
> OK. Now I am getting bigger images, which makes me think all the ELF files
> are uncompressed.
Yeah. No way around that of course. I listed a few TODO items to
mitigate the alignment losses if you have many executables.
> > > However, at this point I'm not sure how I can confirm that the XIP
> > > busybox actually executed as XIP or not.
> >
> > Just use busybox's built-in cat command and dump the content of
> > /proc/self/maps. You should see an offset that refers to a physical
> > address within your cramfs image for those segments marked read-only and
> > executable.
>
> It works! Pretty cool.
>
> $ /mnt/bin/busybox cat /proc/self/maps
> 00008000-000a1000 r-xp 1b005000 00:10 18192 /mnt/bin/busybox
>
> (my cramfs flash image is at physical address 0x1B000000)
Good! Independent validation is always nice.
> However, now with your mkcramfs tool, I can no longer mount my cramfs
> image as the rootfs on boot. I was able to do that before (ie, 30 minutes
> ago) when using the community mkcramfs.
>
> I get this:
>
> [ 1.712425] cramfs: checking physical address 0x1b000000 for linear cramfs image
> [ 1.720531] cramfs: linear cramfs image appears to be 15744 KB in size
> [ 1.728656] VFS: Mounted root (cramfs_physmem filesystem) readonly on device 0:12.
> [ 1.737062] devtmpfs: mounted
> [ 1.741139] Freeing unused kernel memory: 48K
> [ 1.745545] This architecture does not have kernel memory protection.
> [ 1.760381] Starting init: /sbin/init exists but couldn't execute it (error -22)
> [ 1.769685] Starting init: /bin/sh exists but couldn't execute it (error -14)
Is /sbin/init a link to busybox?
I suppose it just boots if you do mkcramfs without -X?
If so could you share your non-working cramfs image with me?
Nicolas
* RE: [PATCH 0/5] cramfs refresh for embedded usage
From: Chris Brandt @ 2017-08-14 18:01 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Alexander Viro, linux-fsdevel@vger.kernel.org,
linux-embedded@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <alpine.LFD.2.20.1708141322170.17016@knanqh.ubzr>
On Monday, August 14, 2017, Nicolas Pitre wrote:
> > I just applied the patches and tried this simple test:
> > - tested with a Renesas RZ/A1 (Cortex-A9...so it has an MMU).
> > - I set the sticky bit for busybox before using mkcramfs
>
> You need the newer mkcramfs I linked to in the documentation. With it
> you don't need to play tricks with the sticky bit anymore. However you
> need to specify -X twice (or just once for no-MMU targets) and it will
> make every ELF file XIPable automatically.
OK. Now I am getting bigger images, which makes me think all the ELF files
are uncompressed.
> > However, at this point I'm not sure how I can confirm that the XIP
> > busybox actually executed as XIP or not.
>
> Just use busybox's built-in cat command and dump the content of
> /proc/self/maps. You should see an offset that refers to a physical
> address within your cramfs image for those segments marked read-only and
> executable.
It works! Pretty cool.
$ /mnt/bin/busybox cat /proc/self/maps
00008000-000a1000 r-xp 1b005000 00:10 18192 /mnt/bin/busybox
(my cramfs flash image is at physical address 0x1B000000)
However, now with your mkcramfs tool, I can no longer mount my cramfs
image as the rootfs on boot. I was able to do that before (ie, 30 minutes
ago) when using the community mkcramfs.
I get this:
[ 1.712425] cramfs: checking physical address 0x1b000000 for linear cramfs image
[ 1.720531] cramfs: linear cramfs image appears to be 15744 KB in size
[ 1.728656] VFS: Mounted root (cramfs_physmem filesystem) readonly on device 0:12.
[ 1.737062] devtmpfs: mounted
[ 1.741139] Freeing unused kernel memory: 48K
[ 1.745545] This architecture does not have kernel memory protection.
[ 1.760381] Starting init: /sbin/init exists but couldn't execute it (error -22)
[ 1.769685] Starting init: /bin/sh exists but couldn't execute it (error -14)
[ 1.776956] Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
[ 1.791192] CPU: 0 PID: 1 Comm: init Not tainted 4.13.0-rc1-00014-g53182a0b7245 #667
[ 1.798959] Hardware name: Generic R7S72100 (Flattened Device Tree)
[ 1.805519] [<bf809261>] (unwind_backtrace) from [<bf807aa3>] (show_stack+0xb/0xc)
[ 1.813228] [<bf807aa3>] (show_stack) from [<bf810a1b>] (panic+0x6f/0x18c)
[ 1.820163] [<bf810a1b>] (panic) from [<bfa9f067>] (kernel_init+0x6b/0x98)
[ 1.827078] [<bfa9f067>] (kernel_init) from [<bf805011>] (ret_from_fork+0x11/0x20)
[ 1.834747] ---[ end Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
Chris
* RE: [PATCH 0/5] cramfs refresh for embedded usage
From: Nicolas Pitre @ 2017-08-14 17:31 UTC (permalink / raw)
To: Chris Brandt
Cc: Alexander Viro, linux-fsdevel@vger.kernel.org,
linux-embedded@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <SG2PR06MB1165840F183EADFC3265D7D98A8C0@SG2PR06MB1165.apcprd06.prod.outlook.com>
On Mon, 14 Aug 2017, Chris Brandt wrote:
> On Friday, August 11, 2017, Nicolas Pitre wrote:
> > This series brings a nice refresh to the cramfs filesystem, adding the
> > following capabilities:
> >
> > - Direct memory access, bypassing the block and/or MTD layers entirely.
> >
> > - Ability to store individual data blocks uncompressed.
> >
> > - Ability to locate individual data blocks anywhere in the filesystem.
> >
> > The end result is a very tight filesystem that can be accessed directly
> > from ROM without any other subsystem underneath. Also this allows for
> > user space XIP which is a very important feature for tiny embedded
> > systems.
>
>
>
> I just applied the patches and tried this simple test:
> - tested with a Renesas RZ/A1 (Cortex-A9...so it has an MMU).
> - I set the sticky bit for busybox before using mkcramfs
You need the newer mkcramfs I linked to in the documentation. With it
you don't need to play tricks with the sticky bit anymore. However you
need to specify -X twice (or just once for no-MMU targets) and it will
make every ELF file XIPable automatically.
> - booted (with squashfs) and mounted the cramfs image
> - confirmed that the sticky bit was still set on busybox
> - was able to execute busybox in the cramfs image
>
>
> However, at this point I'm not sure how I can confirm that the XIP
> busybox actually executed as XIP or not.
Just use busybox's built-in cat command and dump the content of
/proc/self/maps. You should see an offset that refers to a physical
address within your cramfs image for those segments marked read-only and
executable.
Nicolas
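The check described above can also be done programmatically. The following is a rough sketch, not part of the series: it parses one line of /proc/self/maps and reports whether it is a read-only executable mapping whose offset field falls inside the physical range of the cramfs image. The function name is hypothetical, and the image base and size have to be supplied by the caller (for example 0x1b000000 and 15744 KB on the board discussed in this thread).

```c
#include <assert.h>
#include <stdio.h>

/*
 * Return 1 if the given /proc/<pid>/maps line is a read-only executable
 * mapping whose offset lies in [image_phys, image_phys + image_size),
 * 0 otherwise. With a direct mapping, the offset column shows a physical
 * address instead of a file offset.
 */
static inline int maps_line_is_xip(const char *line,
				   unsigned long long image_phys,
				   unsigned long long image_size)
{
	unsigned long long start, end, offset;
	char perms[5];

	assert(line != NULL);
	if (sscanf(line, "%llx-%llx %4s %llx",
		   &start, &end, perms, &offset) != 4)
		return 0;

	/* Only readable, non-writable, executable segments qualify. */
	if (perms[0] != 'r' || perms[1] == 'w' || perms[2] != 'x')
		return 0;

	return offset >= image_phys && offset - image_phys < image_size;
}
```

A mapping that went through the normal page cache path keeps its plain file offset, so it fails the range test unless the image happens to sit at a very low physical address.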
* RE: [PATCH 0/5] cramfs refresh for embedded usage
From: Chris Brandt @ 2017-08-14 17:11 UTC (permalink / raw)
To: Nicolas Pitre, Alexander Viro
Cc: linux-fsdevel@vger.kernel.org, linux-embedded@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <20170811192252.19062-1-nicolas.pitre@linaro.org>
On Friday, August 11, 2017, Nicolas Pitre wrote:
> This series brings a nice refresh to the cramfs filesystem, adding the
> following capabilities:
>
> - Direct memory access, bypassing the block and/or MTD layers entirely.
>
> - Ability to store individual data blocks uncompressed.
>
> - Ability to locate individual data blocks anywhere in the filesystem.
>
> The end result is a very tight filesystem that can be accessed directly
> from ROM without any other subsystem underneath. Also this allows for
> user space XIP which is a very important feature for tiny embedded
> systems.
I just applied the patches and tried this simple test:
- tested with a Renesas RZ/A1 (Cortex-A9...so it has an MMU).
- I set the sticky bit for busybox before using mkcramfs
- booted (with squashfs) and mounted the cramfs image
- confirmed that the sticky bit was still set on busybox
- was able to execute busybox in the cramfs image
However, at this point I'm not sure how I can confirm that the XIP
busybox actually executed as XIP or not.
Chris
* Re: [PATCH 1/5] cramfs: direct memory access support
From: Nicolas Pitre @ 2017-08-14 2:29 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Alexander Viro, linux-fsdevel, linux-embedded, linux-kernel,
Chris Brandt
In-Reply-To: <20170812074927.GA8703@infradead.org>
On Sat, 12 Aug 2017, Christoph Hellwig wrote:
> Direct physical memory access in a file system is never safe.
> Please make sure this goes through struct dax_operations.
Well, after having a closer look, I don't think dax is relevant
here for a couple reasons:
- cramfs is read-only. No concurrent writes to worry about. Please
elaborate if you have other safety concerns in mind.
- This series targets small no-MMU systems. There is no paging involved
given the lack of an MMU. The whole of the filesystem is always
directly accessible in the address space from ROM alongside the
actual kernel code being executed. I don't see how dax would ever be
pertinent here.
- Even with MMU systems, the maximum size of a cramfs image can't exceed
272MB. In practice it is likely to be much much less. Given this
targets small memory systems, there is always plenty of vmalloc space
left to map it all and even a 272MB memremap() wouldn't be a problem.
If it is a problem then maybe your system has large resources to
manage already and you're pretty unlikely to be using cramfs in the
first place, otherwise accessing it through a block device would be
just fine, at which point you'd better consider squashfs instead of
this.
All this to say that I think that dax is _way_ overkill and
inappropriate for the intended cramfs use case this series is
addressing.
Nicolas
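For reference, the 272MB figure quoted above can be reproduced with a little arithmetic. This sketch assumes the classic cramfs inode layout, in which the data offset is a 26-bit field counted in 4-byte units and the file size is a 24-bit field counted in bytes; those widths are assumptions stated here rather than something this thread spells out, and the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field widths of the on-disk cramfs inode. */
#define MODEL_OFFSET_WIDTH 26	/* data offset, in units of 4 bytes */
#define MODEL_SIZE_WIDTH   24	/* file size, in bytes */

/*
 * Upper bound on the image size: the last file may start just below
 * the highest addressable offset and extend for a full maximum-size
 * file beyond it.
 */
static inline uint64_t model_max_image_bytes(void)
{
	uint64_t max_start = (uint64_t)1 << (MODEL_OFFSET_WIDTH + 2);
	uint64_t max_file = (uint64_t)1 << MODEL_SIZE_WIDTH;

	return max_start + max_file;	/* 256MB + 16MB */
}
```

Under those assumptions the bound is 256MB of addressable start offsets plus one 16MB file, which matches both the 272MB limit mentioned here and the 16MB maximum file size relied upon elsewhere in the series.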
* Re: [PATCH 1/5] cramfs: direct memory access support
From: Christoph Hellwig @ 2017-08-12 7:49 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Alexander Viro, linux-fsdevel, linux-embedded, linux-kernel,
Chris Brandt
In-Reply-To: <20170811192252.19062-2-nicolas.pitre@linaro.org>
Direct physical memory access in a file system is never safe.
Please make sure this goes through struct dax_operations.
* [PATCH 5/5] cramfs: rehabilitate it
From: Nicolas Pitre @ 2017-08-11 19:22 UTC (permalink / raw)
To: Alexander Viro; +Cc: linux-fsdevel, linux-embedded, linux-kernel, Chris Brandt
In-Reply-To: <20170811192252.19062-1-nicolas.pitre@linaro.org>
Update documentation, add a pointer to the latest tools, and appoint myself
as maintainer.
Given it's been unloved for so long, I don't expect anyone will protest.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
Documentation/filesystems/cramfs.txt | 35 +++++++++++++++++++++++++++++++++++
MAINTAINERS | 4 ++--
fs/cramfs/Kconfig | 9 ++++++---
3 files changed, 43 insertions(+), 5 deletions(-)
diff --git a/Documentation/filesystems/cramfs.txt b/Documentation/filesystems/cramfs.txt
index 4006298f67..5955c23bac 100644
--- a/Documentation/filesystems/cramfs.txt
+++ b/Documentation/filesystems/cramfs.txt
@@ -45,6 +45,41 @@ you can just change the #define in mkcramfs.c, so long as you don't
mind the filesystem becoming unreadable to future kernels.
+Memory Mapped cramfs image
+--------------------------
+
+The CRAMFS_PHYSMEM Kconfig option adds support for loading data directly
+from a physical linear memory range (usually non volatile memory like Flash)
+to cramfs instead of going through the block device layer. This saves some
+memory since no intermediate buffering is necessary to hold the data before
+decompressing.
+
+And when data blocks are kept uncompressed and properly aligned, they will
+automatically be mapped directly into user space whenever possible providing
+eXecute-In-Place (XIP) from ROM of read-only segments. Data segments mapped
+read-write (hence they have to be copied to RAM) may still be compressed in
+the cramfs image in the same file along with non compressed read-only
+segments. Both MMU and no-MMU systems are supported. This is particularly
+handy for tiny embedded systems with very tight memory constraints.
+
+The filesystem type for this feature is "cramfs_physmem" to distinguish it
+from the block device (or MTD) based access. The location of the cramfs
+image in memory is system dependent. You must know the proper physical
+address where the cramfs image is located and specify it using the
+physaddr=0x******** mount option, for example:
+
+$ mount -t cramfs_physmem -o physaddr=0x80100000 none /mnt
+
+
+Tools
+-----
+
+A version of mkcramfs that can take advantage of the latest capabilities
+described above can be found here:
+
+https://github.com/npitre/cramfs-tools
+
+
For /usr/share/magic
--------------------
diff --git a/MAINTAINERS b/MAINTAINERS
index 44cb004c76..12f8155cfe 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3612,8 +3612,8 @@ F: drivers/cpuidle/*
F: include/linux/cpuidle.h
CRAMFS FILESYSTEM
-W: http://sourceforge.net/projects/cramfs/
-S: Orphan / Obsolete
+M: Nicolas Pitre <nico@linaro.org>
+S: Maintained
F: Documentation/filesystems/cramfs.txt
F: fs/cramfs/
diff --git a/fs/cramfs/Kconfig b/fs/cramfs/Kconfig
index 5eed4ad2d5..8ed27e41bd 100644
--- a/fs/cramfs/Kconfig
+++ b/fs/cramfs/Kconfig
@@ -1,5 +1,5 @@
config CRAMFS
- tristate "Compressed ROM file system support (cramfs) (OBSOLETE)"
+ tristate "Compressed ROM file system support (cramfs)"
select ZLIB_INFLATE
help
Saying Y here includes support for CramFs (Compressed ROM File
@@ -15,8 +15,11 @@ config CRAMFS
cramfs. Note that the root file system (the one containing the
directory /) cannot be compiled as a module.
- This filesystem is obsoleted by SquashFS, which is much better
- in terms of performance and features.
+ This filesystem is limited in capabilities and performance on
+ purpose to remain small and low on RAM usage. It is most suitable
+ for small embedded systems. For a more capable compressed filesystem
+ you should look at SquashFS which is much better in terms of
+ performance and features.
If unsure, say N.
--
2.9.4
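The mount command shown in the documentation hunk above has a straightforward equivalent using the mount(2) system call, which may be handy for a minimal init that has no mount utility. This is a sketch only: the helper names are hypothetical, the "cramfs_physmem" type and the physaddr= option are taken from the documentation text in this patch, and the call needs the usual mount privileges to succeed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/*
 * Format the physaddr= option string described in the documentation.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static inline int format_physaddr_opt(char *buf, size_t len,
				      unsigned long physaddr)
{
	int n = snprintf(buf, len, "physaddr=0x%lx", physaddr);

	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/*
 * Mount a memory mapped cramfs image located at physaddr on target.
 * There is no backing device, hence the "none" source. Returns the
 * result of mount(2): 0 on success, -1 with errno set on failure.
 */
static inline int mount_cramfs_physmem(const char *target,
				       unsigned long physaddr)
{
	char opts[32];

	assert(target != NULL);
	if (format_physaddr_opt(opts, sizeof(opts), physaddr))
		return -1;
	return mount("none", target, "cramfs_physmem", MS_RDONLY, opts);
}
```

Calling mount_cramfs_physmem("/mnt", 0x80100000UL) would correspond to the shell example in the documentation.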
* [PATCH 4/5] cramfs: add mmap support
From: Nicolas Pitre @ 2017-08-11 19:22 UTC (permalink / raw)
To: Alexander Viro; +Cc: linux-fsdevel, linux-embedded, linux-kernel, Chris Brandt
In-Reply-To: <20170811192252.19062-1-nicolas.pitre@linaro.org>
When cramfs_physmem is used then we have the opportunity to map files
directly from ROM, directly into user space, saving on RAM usage.
This gives us Execute-In-Place (XIP) support.
For a file to be mmap()-able, the map area has to correspond to a range
of uncompressed and contiguous blocks, and in the MMU case it also has
to be page aligned. A version of mkcramfs with appropriate support is
necessary to create such a filesystem image.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
fs/cramfs/inode.c | 149 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 149 insertions(+)
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index b825ae162c..5aedbd224e 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -16,6 +16,7 @@
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/pagemap.h>
+#include <linux/ramfs.h>
#include <linux/init.h>
#include <linux/string.h>
#include <linux/blkdev.h>
@@ -49,6 +50,7 @@ static inline struct cramfs_sb_info *CRAMFS_SB(struct super_block *sb)
static const struct super_operations cramfs_ops;
static const struct inode_operations cramfs_dir_inode_operations;
static const struct file_operations cramfs_directory_operations;
+static const struct file_operations cramfs_physmem_fops;
static const struct address_space_operations cramfs_aops;
static DEFINE_MUTEX(read_mutex);
@@ -96,6 +98,10 @@ static struct inode *get_cramfs_inode(struct super_block *sb,
case S_IFREG:
inode->i_fop = &generic_ro_fops;
inode->i_data.a_ops = &cramfs_aops;
+ if (IS_ENABLED(CONFIG_CRAMFS_PHYSMEM) &&
+ CRAMFS_SB(sb)->flags & CRAMFS_FLAG_EXT_BLOCK_POINTERS &&
+ CRAMFS_SB(sb)->linear_phys_addr)
+ inode->i_fop = &cramfs_physmem_fops;
break;
case S_IFDIR:
inode->i_op = &cramfs_dir_inode_operations;
@@ -277,6 +283,149 @@ static void *cramfs_read(struct super_block *sb, unsigned int offset,
return NULL;
}
+/*
+ * For a mapping to be possible, we need a range of uncompressed and
+ * contiguous blocks. Return the offset for the first block if that
+ * verifies, or zero otherwise.
+ */
+static u32 cramfs_get_block_range(struct inode *inode, u32 pgoff, u32 pages)
+{
+ struct super_block *sb = inode->i_sb;
+ struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
+ int i;
+ u32 *blockptrs, blockaddr;
+
+ /*
+ * We can dereference memory directly here as this code may be
+ * reached only when there is a direct filesystem image mapping
+ * available in memory.
+ */
+ blockptrs = (u32 *)(sbi->linear_virt_addr + OFFSET(inode) + pgoff*4);
+ blockaddr = blockptrs[0] & ~CRAMFS_BLK_FLAGS;
+ i = 0;
+ do {
+ u32 expect = blockaddr + i * (PAGE_SIZE >> 2);
+ expect |= CRAMFS_BLK_FLAG_DIRECT_PTR|CRAMFS_BLK_FLAG_UNCOMPRESSED;
+ pr_debug("range: block %d/%d got %#x expects %#x\n",
+ pgoff+i, pgoff+pages-1, blockptrs[i], expect);
+ if (blockptrs[i] != expect)
+ return 0;
+ } while (++i < pages);
+
+ /* stored "direct" block ptrs are shifted down by 2 bits */
+ return blockaddr << 2;
+}
+
+static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
+{
+ struct inode *inode = file_inode(file);
+ struct super_block *sb = inode->i_sb;
+ struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
+ unsigned int pages, max_pages, offset;
+ unsigned long length, address;
+ char *fail_reason;
+ int ret;
+
+ if (!IS_ENABLED(CONFIG_MMU))
+ return vma->vm_flags & (VM_SHARED | VM_MAYSHARE) ? 0 : -ENOSYS;
+
+ if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_MAYWRITE))
+ return -EINVAL;
+
+ vma->vm_ops = &generic_file_vm_ops;
+ if (vma->vm_flags & VM_WRITE)
+ return 0;
+
+ length = vma->vm_end - vma->vm_start;
+ pages = (length + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ max_pages = (inode->i_size + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ if (vma->vm_pgoff >= max_pages || pages > max_pages - vma->vm_pgoff)
+ return -EINVAL;
+
+ offset = cramfs_get_block_range(inode, vma->vm_pgoff, pages);
+ fail_reason = "unsuitable block layout";
+ if (!offset)
+ goto fail;
+ address = sbi->linear_phys_addr + offset;
+ fail_reason = "data is not page aligned";
+ if (!PAGE_ALIGNED(address))
+ goto fail;
+
+ /* Don't map a partial page if it contains some other data */
+ if (unlikely(vma->vm_pgoff + pages == max_pages)) {
+ unsigned int partial = offset_in_page(inode->i_size);
+ if (partial) {
+ char *data = sbi->linear_virt_addr + offset;
+ data += (pages - 1) * PAGE_SIZE + partial;
+ fail_reason = "last partial page is shared";
+ while ((unsigned long)data & 7)
+ if (*data++ != 0)
+ goto fail;
+ while (offset_in_page(data)) {
+ if (*(u64 *)data != 0)
+ goto fail;
+ data += 8;
+ }
+ }
+ }
+
+ ret = remap_pfn_range(vma, vma->vm_start, address >> PAGE_SHIFT,
+ length, vma->vm_page_prot);
+ if (ret)
+ return ret;
+ pr_debug("mapped %s at 0x%08lx, length %lu to vma 0x%08lx, "
+ "page_prot 0x%llx\n", file_dentry(file)->d_name.name,
+ address, length, vma->vm_start,
+ (unsigned long long)pgprot_val(vma->vm_page_prot));
+ return 0;
+
+fail:
+ pr_debug("%s: direct mmap failed: %s\n",
+ file_dentry(file)->d_name.name, fail_reason);
+ return 0;
+}
+
+#ifndef CONFIG_MMU
+
+static unsigned long cramfs_physmem_get_unmapped_area(struct file *file,
+ unsigned long addr, unsigned long len,
+ unsigned long pgoff, unsigned long flags)
+{
+ struct inode *inode = file_inode(file);
+ struct super_block *sb = inode->i_sb;
+ struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
+ unsigned int pages, max_pages, offset;
+
+ pages = (len + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ max_pages = (inode->i_size + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ if (pgoff >= max_pages || pages > max_pages - pgoff)
+ return -EINVAL;
+ offset = cramfs_get_block_range(inode, pgoff, pages);
+ if (!offset)
+ return -ENOSYS;
+ addr = sbi->linear_phys_addr + offset;
+ pr_debug("get_unmapped for %s ofs %#lx siz %lu at 0x%08lx\n",
+ file_dentry(file)->d_name.name, pgoff*PAGE_SIZE, len, addr);
+ return addr;
+}
+
+static unsigned cramfs_physmem_mmap_capabilities(struct file *file)
+{
+ return NOMMU_MAP_COPY | NOMMU_MAP_DIRECT | NOMMU_MAP_READ | NOMMU_MAP_EXEC;
+}
+#endif
+
+static const struct file_operations cramfs_physmem_fops = {
+ .llseek = generic_file_llseek,
+ .read_iter = generic_file_read_iter,
+ .splice_read = generic_file_splice_read,
+ .mmap = cramfs_physmem_mmap,
+#ifndef CONFIG_MMU
+ .get_unmapped_area = cramfs_physmem_get_unmapped_area,
+ .mmap_capabilities = cramfs_physmem_mmap_capabilities,
+#endif
+};
+
static void cramfs_blkdev_kill_sb(struct super_block *sb)
{
struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
--
2.9.4
^ permalink raw reply related
* [PATCH 3/5] cramfs: implement uncompressed and arbitrary data block positioning
From: Nicolas Pitre @ 2017-08-11 19:22 UTC (permalink / raw)
To: Alexander Viro; +Cc: linux-fsdevel, linux-embedded, linux-kernel, Chris Brandt
In-Reply-To: <20170811192252.19062-1-nicolas.pitre@linaro.org>
Two new capabilities are introduced here:
- The ability to store some blocks uncompressed.
- The ability to locate blocks anywhere.
Those capabilities can be used independently, but the combination
opens the possibility for execute-in-place (XIP) of program text segments
that must remain uncompressed, and in the MMU case, must have a specific
alignment. It is even possible to still have the writable data segments
from the same file compressed as they have to be copied into RAM anyway.
This is achieved by giving special meanings to some unused block pointer
bits while remaining compatible with legacy cramfs images.
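As a rough illustration of how the two flag bits are interpreted, here is a small user-space sketch. The struct and function names are hypothetical and exist only for this example; the bit positions are the ones this patch adds to cramfs_fs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as added to include/uapi/linux/cramfs_fs.h by this patch. */
#define BLK_FLAG_UNCOMPRESSED (1u << 31)
#define BLK_FLAG_DIRECT_PTR   (1u << 30)
#define BLK_FLAGS             (BLK_FLAG_UNCOMPRESSED | BLK_FLAG_DIRECT_PTR)

/* Hypothetical decoded view of one 32-bit block pointer. */
struct sketch_blkptr {
	bool uncompressed;	/* block data is stored verbatim */
	bool direct;		/* pointer holds the block start, not its end */
	uint32_t offset;	/* byte offset: start if direct, end otherwise */
};

static struct sketch_blkptr sketch_decode(uint32_t raw)
{
	struct sketch_blkptr p;

	p.uncompressed = (raw & BLK_FLAG_UNCOMPRESSED) != 0;
	p.direct = (raw & BLK_FLAG_DIRECT_PTR) != 0;
	p.offset = raw & ~BLK_FLAGS;
	/* Direct pointers are stored shifted right by 2 bits. */
	if (p.direct)
		p.offset <<= 2;
	return p;
}
```

A legacy pointer (neither bit set) decodes to the end offset of its block, which is why images without the new flags remain readable.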
Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
fs/cramfs/README | 31 ++++++++++++++-
fs/cramfs/inode.c | 87 +++++++++++++++++++++++++++++++++---------
include/uapi/linux/cramfs_fs.h | 20 +++++++++-
3 files changed, 118 insertions(+), 20 deletions(-)
diff --git a/fs/cramfs/README b/fs/cramfs/README
index 9d4e7ea311..d71b27e0ff 100644
--- a/fs/cramfs/README
+++ b/fs/cramfs/README
@@ -49,17 +49,46 @@ same as the start of the (i+1)'th <block> if there is one). The first
<block> immediately follows the last <block_pointer> for the file.
<block_pointer>s are each 32 bits long.
+When the CRAMFS_FLAG_EXT_BLOCK_POINTERS capability bit is set, each
+<block_pointer>'s top bits may contain special flags as follows:
+
+CRAMFS_BLK_FLAG_UNCOMPRESSED (bit 31):
+ The block data is not compressed and should be copied verbatim.
+
+CRAMFS_BLK_FLAG_DIRECT_PTR (bit 30):
+ The <block_pointer> stores the actual block start offset and not
+ its end, shifted right by 2 bits. The block must therefore be
+ aligned to a 4-byte boundary. The block size is either blksize
+ if CRAMFS_BLK_FLAG_UNCOMPRESSED is also specified, otherwise
+ the compressed data length is included in the first 2 bytes of
+ the block data. This is used to allow discontiguous data layout
+ and specific data block alignments e.g. for XIP applications.
+
+
The order of <file_data>'s is a depth-first descent of the directory
tree, i.e. the same order as `find -size +0 \( -type f -o -type l \)
-print'.
<block>: The i'th <block> is the output of zlib's compress function
-applied to the i'th blksize-sized chunk of the input data.
+applied to the i'th blksize-sized chunk of the input data if the
+corresponding CRAMFS_BLK_FLAG_UNCOMPRESSED <block_ptr> bit is not set,
+otherwise it is the input data directly.
(For the last <block> of the file, the input may of course be smaller.)
Each <block> may be a different size. (See <block_pointer> above.)
+
<block>s are merely byte-aligned, not generally u32-aligned.
+When CRAMFS_BLK_FLAG_DIRECT_PTR is specified then the corresponding
+<block> may be located anywhere and not necessarily contiguous with
+the previous/next blocks. In that case it is minimally u32-aligned.
+If CRAMFS_BLK_FLAG_UNCOMPRESSED is also specified then the size is always
+blksize except for the last block which is limited by the file length.
+If CRAMFS_BLK_FLAG_DIRECT_PTR is set and CRAMFS_BLK_FLAG_UNCOMPRESSED
+is not set then the first 2 bytes of the block contain the size of the
+remaining block data as this cannot be determined from the placement of
+logically adjacent blocks.
+
Holes
-----
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index 393eb27ef4..b825ae162c 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -636,33 +636,84 @@ static int cramfs_readpage(struct file *file, struct page *page)
if (page->index < maxblock) {
struct super_block *sb = inode->i_sb;
u32 blkptr_offset = OFFSET(inode) + page->index*4;
- u32 start_offset, compr_len;
+ u32 block_ptr, block_start, block_len;
+ bool uncompressed, direct;
- start_offset = OFFSET(inode) + maxblock*4;
mutex_lock(&read_mutex);
- if (page->index)
- start_offset = *(u32 *) cramfs_read(sb, blkptr_offset-4,
- 4);
- compr_len = (*(u32 *) cramfs_read(sb, blkptr_offset, 4) -
- start_offset);
- mutex_unlock(&read_mutex);
+ block_ptr = *(u32 *) cramfs_read(sb, blkptr_offset, 4);
+ uncompressed = (block_ptr & CRAMFS_BLK_FLAG_UNCOMPRESSED);
+ direct = (block_ptr & CRAMFS_BLK_FLAG_DIRECT_PTR);
+ block_ptr &= ~CRAMFS_BLK_FLAGS;
+
+ if (direct) {
+ /*
+ * The block pointer is an absolute start pointer,
+ * shifted by 2 bits. The size is included in the
+ * first 2 bytes of the data block when compressed,
+ * or PAGE_SIZE otherwise.
+ */
+ block_start = block_ptr << 2;
+ if (uncompressed) {
+ block_len = PAGE_SIZE;
+ /* if last block: cap to file length */
+ if (page->index == maxblock - 1)
+ block_len = offset_in_page(inode->i_size);
+ } else {
+ block_len = *(u16 *)
+ cramfs_read(sb, block_start, 2);
+ block_start += 2;
+ }
+ } else {
+ /*
+ * The block pointer indicates one past the end of
+ * the current block (start of next block). If this
+ * is the first block then it starts where the block
+ * pointer table ends, otherwise its start comes
+ * from the previous block's pointer.
+ */
+ block_start = OFFSET(inode) + maxblock*4;
+ if (page->index)
+ block_start = *(u32 *)
+ cramfs_read(sb, blkptr_offset-4, 4);
+ /* Beware... previous ptr might be a direct ptr */
+ if (unlikely(block_start & CRAMFS_BLK_FLAG_DIRECT_PTR)) {
+ /* See comments on earlier code. */
+ u32 prev_start = block_start;
+ block_start = prev_start & ~CRAMFS_BLK_FLAGS;
+ block_start <<= 2;
+ if (prev_start & CRAMFS_BLK_FLAG_UNCOMPRESSED) {
+ block_start += PAGE_SIZE;
+ } else {
+ block_len = *(u16 *)
+ cramfs_read(sb, block_start, 2);
+ block_start += 2 + block_len;
+ }
+ }
+ block_start &= ~CRAMFS_BLK_FLAGS;
+ block_len = block_ptr - block_start;
+ }
- if (compr_len == 0)
+ if (block_len == 0)
; /* hole */
- else if (unlikely(compr_len > (PAGE_SIZE << 1))) {
- pr_err("bad compressed blocksize %u\n",
- compr_len);
+ else if (unlikely(block_len > 2*PAGE_SIZE ||
+ (uncompressed && block_len > PAGE_SIZE))) {
+ mutex_unlock(&read_mutex);
+ pr_err("bad data blocksize %u\n", block_len);
goto err;
+ } else if (uncompressed) {
+ memcpy(pgdata,
+ cramfs_read(sb, block_start, block_len),
+ block_len);
+ bytes_filled = block_len;
} else {
- mutex_lock(&read_mutex);
bytes_filled = cramfs_uncompress_block(pgdata,
PAGE_SIZE,
- cramfs_read(sb, start_offset, compr_len),
- compr_len);
- mutex_unlock(&read_mutex);
- if (unlikely(bytes_filled < 0))
- goto err;
+ cramfs_read(sb, block_start, block_len),
+ block_len);
}
+ mutex_unlock(&read_mutex);
+ if (unlikely(bytes_filled < 0))
+ goto err;
}
memset(pgdata + bytes_filled, 0, PAGE_SIZE - bytes_filled);
diff --git a/include/uapi/linux/cramfs_fs.h b/include/uapi/linux/cramfs_fs.h
index e4611a9b92..ed250aa372 100644
--- a/include/uapi/linux/cramfs_fs.h
+++ b/include/uapi/linux/cramfs_fs.h
@@ -73,6 +73,7 @@ struct cramfs_super {
#define CRAMFS_FLAG_HOLES 0x00000100 /* support for holes */
#define CRAMFS_FLAG_WRONG_SIGNATURE 0x00000200 /* reserved */
#define CRAMFS_FLAG_SHIFTED_ROOT_OFFSET 0x00000400 /* shifted root fs */
+#define CRAMFS_FLAG_EXT_BLOCK_POINTERS 0x00000800 /* block pointer extensions */
/*
* Valid values in super.flags. Currently we refuse to mount
@@ -82,7 +83,24 @@ struct cramfs_super {
#define CRAMFS_SUPPORTED_FLAGS ( 0x000000ff \
| CRAMFS_FLAG_HOLES \
| CRAMFS_FLAG_WRONG_SIGNATURE \
- | CRAMFS_FLAG_SHIFTED_ROOT_OFFSET )
+ | CRAMFS_FLAG_SHIFTED_ROOT_OFFSET \
+ | CRAMFS_FLAG_EXT_BLOCK_POINTERS )
+/*
+ * Block pointer flags
+ *
+ * The maximum block offset that needs to be represented is roughly:
+ *
+ * (1 << CRAMFS_OFFSET_WIDTH) * 4 +
+ * (1 << CRAMFS_SIZE_WIDTH) / PAGE_SIZE * (4 + PAGE_SIZE)
+ * = 0x11004000
+ *
+ * That leaves room for 3 flag bits in the block pointer table.
+ */
+#define CRAMFS_BLK_FLAG_UNCOMPRESSED (1 << 31)
+#define CRAMFS_BLK_FLAG_DIRECT_PTR (1 << 30)
+
+#define CRAMFS_BLK_FLAGS ( CRAMFS_BLK_FLAG_UNCOMPRESSED \
+ | CRAMFS_BLK_FLAG_DIRECT_PTR )
#endif /* _UAPI__CRAMFS_H */
--
2.9.4
^ permalink raw reply related
* [PATCH 2/5] cramfs: make cramfs_physmem usable as root fs
From: Nicolas Pitre @ 2017-08-11 19:22 UTC (permalink / raw)
To: Alexander Viro; +Cc: linux-fsdevel, linux-embedded, linux-kernel, Chris Brandt
In-Reply-To: <20170811192252.19062-1-nicolas.pitre@linaro.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
init/do_mounts.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/init/do_mounts.c b/init/do_mounts.c
index c2de5104aa..43b5817f60 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -556,6 +556,14 @@ void __init prepare_namespace(void)
ssleep(root_delay);
}
+ if (IS_ENABLED(CONFIG_CRAMFS_PHYSMEM) && root_fs_names &&
+ !strcmp(root_fs_names, "cramfs_physmem")) {
+ int err = do_mount_root("cramfs", "cramfs_physmem",
+ root_mountflags, root_mount_data);
+ if (!err)
+ goto out;
+ }
+
/*
* wait for the known devices to complete their probing
*
--
2.9.4
^ permalink raw reply related
* [PATCH 1/5] cramfs: direct memory access support
From: Nicolas Pitre @ 2017-08-11 19:22 UTC (permalink / raw)
To: Alexander Viro; +Cc: linux-fsdevel, linux-embedded, linux-kernel, Chris Brandt
In-Reply-To: <20170811192252.19062-1-nicolas.pitre@linaro.org>
Small embedded systems typically execute the kernel code in place (XIP)
directly from flash to save on precious RAM usage. This adds the ability
to consume filesystem data directly from flash to the cramfs filesystem
as well. Cramfs is particularly well suited to this feature as it is
very simple and its RAM usage is already very low, and with this feature
it is possible to use it with no block device support and even lower RAM
usage.
This patch was inspired by a similar patch from Shane Nay dated 17 years
ago that used to be very popular in embedded circles but never made it
into mainline. This is a cleaned-up implementation that uses far fewer
memory address at run time when both methods are configured in. In the
context of small IoT deployments, this functionality has become relevant and useful again.
To distinguish between both access types, the cramfs_physmem filesystem
type must be specified when using a memory accessible cramfs image, and
the physaddr argument must provide the actual filesystem image's physical
memory location.
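The validation applied to the physaddr argument can be approximated in user space as below. This is a sketch under stated assumptions, not the kernel code: the function name and the 4 KiB page size are hypothetical, and strtoull() stands in for the kernel's memparse(), so size suffixes such as K or M are not handled here.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption for illustration only: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper: extract "physaddr=<value>" from a mount option
 * string. Returns 0 and stores the address on success, -1 when the
 * option is missing, zero, or not page aligned.
 */
static int sketch_parse_physaddr(const char *opts, uint64_t *addr)
{
	const char *p;
	char *end;
	unsigned long long val;

	if (!opts)
		return -1;
	p = strstr(opts, "physaddr=");
	if (!p)
		return -1;
	p += strlen("physaddr=");

	/* Base 0 accepts both decimal and 0x-prefixed hex values. */
	val = strtoull(p, &end, 0);
	if (end == p || val == 0)
		return -1;
	if (val & (SKETCH_PAGE_SIZE - 1))
		return -1;

	*addr = val;
	return 0;
}
```

For example, the option string "physaddr=0x100000" would be accepted, while "physaddr=0x100100" would be rejected for not being page aligned.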
Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
fs/cramfs/Kconfig | 30 ++++++-
fs/cramfs/inode.c | 264 +++++++++++++++++++++++++++++++++++++++++++-----------
2 files changed, 242 insertions(+), 52 deletions(-)
diff --git a/fs/cramfs/Kconfig b/fs/cramfs/Kconfig
index 11b29d491b..5eed4ad2d5 100644
--- a/fs/cramfs/Kconfig
+++ b/fs/cramfs/Kconfig
@@ -1,6 +1,5 @@
config CRAMFS
tristate "Compressed ROM file system support (cramfs) (OBSOLETE)"
- depends on BLOCK
select ZLIB_INFLATE
help
Saying Y here includes support for CramFs (Compressed ROM File
@@ -20,3 +19,32 @@ config CRAMFS
in terms of performance and features.
If unsure, say N.
+
+config CRAMFS_BLOCKDEV
+ bool "Support CramFs image over a regular block device" if EXPERT
+ depends on CRAMFS && BLOCK
+ default y
+ help
+ This option allows the CramFs driver to load data from a regular
+ block device such a disk partition or a ramdisk.
+
+config CRAMFS_PHYSMEM
+ bool "Support CramFs image directly mapped in physical memory"
+ depends on CRAMFS
+ default y if !CRAMFS_BLOCKDEV
+ help
+ This option allows the CramFs driver to load data directly from
+ a linearly addressed memory range (usually non volatile memory
+ like flash) instead of going through the block device layer.
+ This saves some memory since no intermediate buffering is
+ necessary.
+
+ The filesystem type for this feature is "cramfs_physmem".
+ The location of the CramFs image in memory is board
+ dependent. Therefore, if you say Y, you must know the proper
+ physical address where to store the CramFs image and specify
+ it using the physaddr=0x******** mount option (for example:
+ "mount -t cramfs_physmem -o physaddr=0x100000 none /mnt").
+
+ If unsure, say N.
+
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index 7919967488..393eb27ef4 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -24,6 +24,7 @@
#include <linux/mutex.h>
#include <uapi/linux/cramfs_fs.h>
#include <linux/uaccess.h>
+#include <linux/io.h>
#include "internal.h"
@@ -36,6 +37,8 @@ struct cramfs_sb_info {
unsigned long blocks;
unsigned long files;
unsigned long flags;
+ void *linear_virt_addr;
+ phys_addr_t linear_phys_addr;
};
static inline struct cramfs_sb_info *CRAMFS_SB(struct super_block *sb)
@@ -140,6 +143,9 @@ static struct inode *get_cramfs_inode(struct super_block *sb,
* BLKS_PER_BUF*PAGE_SIZE, so that the caller doesn't need to
* worry about end-of-buffer issues even when decompressing a full
* page cache.
+ *
+ * Note: This is all optimized away at compile time when
+ * CONFIG_CRAMFS_BLOCKDEV=n.
*/
#define READ_BUFFERS (2)
/* NEXT_BUFFER(): Loop over [0..(READ_BUFFERS-1)]. */
@@ -160,10 +166,10 @@ static struct super_block *buffer_dev[READ_BUFFERS];
static int next_buffer;
/*
- * Returns a pointer to a buffer containing at least LEN bytes of
- * filesystem starting at byte offset OFFSET into the filesystem.
+ * Populate our block cache and return a pointer from it.
*/
-static void *cramfs_read(struct super_block *sb, unsigned int offset, unsigned int len)
+static void *cramfs_blkdev_read(struct super_block *sb, unsigned int offset,
+ unsigned int len)
{
struct address_space *mapping = sb->s_bdev->bd_inode->i_mapping;
struct page *pages[BLKS_PER_BUF];
@@ -239,7 +245,39 @@ static void *cramfs_read(struct super_block *sb, unsigned int offset, unsigned i
return read_buffers[buffer] + offset;
}
-static void cramfs_kill_sb(struct super_block *sb)
+/*
+ * Return a pointer to the linearly addressed cramfs image in memory.
+ */
+static void *cramfs_direct_read(struct super_block *sb, unsigned int offset,
+ unsigned int len)
+{
+ struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
+
+ if (!len)
+ return NULL;
+ if (len > sbi->size || offset > sbi->size - len)
+ return page_address(ZERO_PAGE(0));
+ return sbi->linear_virt_addr + offset;
+}
+
+/*
+ * Returns a pointer to a buffer containing at least LEN bytes of
+ * filesystem starting at byte offset OFFSET into the filesystem.
+ */
+static void *cramfs_read(struct super_block *sb, unsigned int offset,
+ unsigned int len)
+{
+ struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
+
+ if (IS_ENABLED(CONFIG_CRAMFS_PHYSMEM) && sbi->linear_virt_addr)
+ return cramfs_direct_read(sb, offset, len);
+ else if (IS_ENABLED(CONFIG_CRAMFS_BLOCKDEV))
+ return cramfs_blkdev_read(sb, offset, len);
+ else
+ return NULL;
+}
+
+static void cramfs_blkdev_kill_sb(struct super_block *sb)
{
struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
@@ -247,6 +285,16 @@ static void cramfs_kill_sb(struct super_block *sb)
kfree(sbi);
}
+static void cramfs_physmem_kill_sb(struct super_block *sb)
+{
+ struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
+
+ if (sbi->linear_virt_addr)
+ memunmap(sbi->linear_virt_addr);
+ kill_anon_super(sb);
+ kfree(sbi);
+}
+
static int cramfs_remount(struct super_block *sb, int *flags, char *data)
{
sync_filesystem(sb);
@@ -254,34 +302,24 @@ static int cramfs_remount(struct super_block *sb, int *flags, char *data)
return 0;
}
-static int cramfs_fill_super(struct super_block *sb, void *data, int silent)
+static int cramfs_read_super(struct super_block *sb,
+ struct cramfs_super *super, int silent)
{
- int i;
- struct cramfs_super super;
+ struct cramfs_sb_info *sbi = CRAMFS_SB(sb);
unsigned long root_offset;
- struct cramfs_sb_info *sbi;
- struct inode *root;
-
- sb->s_flags |= MS_RDONLY;
-
- sbi = kzalloc(sizeof(struct cramfs_sb_info), GFP_KERNEL);
- if (!sbi)
- return -ENOMEM;
- sb->s_fs_info = sbi;
- /* Invalidate the read buffers on mount: think disk change.. */
- mutex_lock(&read_mutex);
- for (i = 0; i < READ_BUFFERS; i++)
- buffer_blocknr[i] = -1;
+ /* We don't know the real size yet */
+ sbi->size = PAGE_SIZE;
/* Read the first block and get the superblock from it */
- memcpy(&super, cramfs_read(sb, 0, sizeof(super)), sizeof(super));
+ mutex_lock(&read_mutex);
+ memcpy(super, cramfs_read(sb, 0, sizeof(*super)), sizeof(*super));
mutex_unlock(&read_mutex);
/* Do sanity checks on the superblock */
- if (super.magic != CRAMFS_MAGIC) {
+ if (super->magic != CRAMFS_MAGIC) {
/* check for wrong endianness */
- if (super.magic == CRAMFS_MAGIC_WEND) {
+ if (super->magic == CRAMFS_MAGIC_WEND) {
if (!silent)
pr_err("wrong endianness\n");
return -EINVAL;
@@ -289,10 +327,10 @@ static int cramfs_fill_super(struct super_block *sb, void *data, int silent)
/* check at 512 byte offset */
mutex_lock(&read_mutex);
- memcpy(&super, cramfs_read(sb, 512, sizeof(super)), sizeof(super));
+ memcpy(super, cramfs_read(sb, 512, sizeof(*super)), sizeof(*super));
mutex_unlock(&read_mutex);
- if (super.magic != CRAMFS_MAGIC) {
- if (super.magic == CRAMFS_MAGIC_WEND && !silent)
+ if (super->magic != CRAMFS_MAGIC) {
+ if (super->magic == CRAMFS_MAGIC_WEND && !silent)
pr_err("wrong endianness\n");
else if (!silent)
pr_err("wrong magic\n");
@@ -301,34 +339,34 @@ static int cramfs_fill_super(struct super_block *sb, void *data, int silent)
}
/* get feature flags first */
- if (super.flags & ~CRAMFS_SUPPORTED_FLAGS) {
+ if (super->flags & ~CRAMFS_SUPPORTED_FLAGS) {
pr_err("unsupported filesystem features\n");
return -EINVAL;
}
/* Check that the root inode is in a sane state */
- if (!S_ISDIR(super.root.mode)) {
+ if (!S_ISDIR(super->root.mode)) {
pr_err("root is not a directory\n");
return -EINVAL;
}
/* correct strange, hard-coded permissions of mkcramfs */
- super.root.mode |= (S_IRUSR | S_IXUSR | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH);
+ super->root.mode |= (S_IRUSR | S_IXUSR | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH);
- root_offset = super.root.offset << 2;
- if (super.flags & CRAMFS_FLAG_FSID_VERSION_2) {
- sbi->size = super.size;
- sbi->blocks = super.fsid.blocks;
- sbi->files = super.fsid.files;
+ root_offset = super->root.offset << 2;
+ if (super->flags & CRAMFS_FLAG_FSID_VERSION_2) {
+ sbi->size = super->size;
+ sbi->blocks = super->fsid.blocks;
+ sbi->files = super->fsid.files;
} else {
sbi->size = 1<<28;
sbi->blocks = 0;
sbi->files = 0;
}
- sbi->magic = super.magic;
- sbi->flags = super.flags;
+ sbi->magic = super->magic;
+ sbi->flags = super->flags;
if (root_offset == 0)
pr_info("empty filesystem");
- else if (!(super.flags & CRAMFS_FLAG_SHIFTED_ROOT_OFFSET) &&
+ else if (!(super->flags & CRAMFS_FLAG_SHIFTED_ROOT_OFFSET) &&
((root_offset != sizeof(struct cramfs_super)) &&
(root_offset != 512 + sizeof(struct cramfs_super))))
{
@@ -336,9 +374,18 @@ static int cramfs_fill_super(struct super_block *sb, void *data, int silent)
return -EINVAL;
}
+ return 0;
+}
+
+static int cramfs_finalize_super(struct super_block *sb,
+ struct cramfs_inode *cramfs_root)
+{
+ struct inode *root;
+
/* Set it all up.. */
+ sb->s_flags |= MS_RDONLY;
sb->s_op = &cramfs_ops;
- root = get_cramfs_inode(sb, &super.root, 0);
+ root = get_cramfs_inode(sb, cramfs_root, 0);
if (IS_ERR(root))
return PTR_ERR(root);
sb->s_root = d_make_root(root);
@@ -347,6 +394,92 @@ static int cramfs_fill_super(struct super_block *sb, void *data, int silent)
return 0;
}
+static int cramfs_blkdev_fill_super(struct super_block *sb, void *data, int silent)
+{
+ struct cramfs_sb_info *sbi;
+ struct cramfs_super super;
+ int i, err;
+
+ sbi = kzalloc(sizeof(struct cramfs_sb_info), GFP_KERNEL);
+ if (!sbi)
+ return -ENOMEM;
+ sb->s_fs_info = sbi;
+
+ /* Invalidate the read buffers on mount: think disk change.. */
+ for (i = 0; i < READ_BUFFERS; i++)
+ buffer_blocknr[i] = -1;
+
+ err = cramfs_read_super(sb, &super, silent);
+ if (err)
+ return err;
+ return cramfs_finalize_super(sb, &super.root);
+}
+
+static int cramfs_physmem_fill_super(struct super_block *sb, void *data, int silent)
+{
+ struct cramfs_sb_info *sbi;
+ struct cramfs_super super;
+ char *p;
+ int err;
+
+ sbi = kzalloc(sizeof(struct cramfs_sb_info), GFP_KERNEL);
+ if (!sbi)
+ return -ENOMEM;
+ sb->s_fs_info = sbi;
+
+ /*
+ * The physical location of the cramfs image is specified as
+ * a mount parameter. This parameter is mandatory for obvious
+ * reasons. Some validation is made on the phys address but this
+ * is not exhaustive and we count on the fact that someone using
+ * this feature is supposed to know what he/she's doing.
+ */
+ if (!data || !(p = strstr((char *)data, "physaddr="))) {
+ pr_err("unknown physical address for linear cramfs image\n");
+ return -EINVAL;
+ }
+ sbi->linear_phys_addr = memparse(p + 9, NULL);
+ if (!sbi->linear_phys_addr) {
+ pr_err("bad value for cramfs image physical address\n");
+ return -EINVAL;
+ }
+ if (sbi->linear_phys_addr & (PAGE_SIZE-1)) {
+ pr_err("physical address %pap for linear cramfs isn't aligned to a page boundary\n",
+ &sbi->linear_phys_addr);
+ return -EINVAL;
+ }
+
+ /*
+ * Map only one page for now. Will remap it when fs size is known.
+ * Although we'll only read from it, we want the CPU cache to
+ * kick in for the higher throughput it provides, hence MEMREMAP_WB.
+ */
+ pr_info("checking physical address %pap for linear cramfs image\n", &sbi->linear_phys_addr);
+ sbi->linear_virt_addr = memremap(sbi->linear_phys_addr, PAGE_SIZE,
+ MEMREMAP_WB);
+ if (!sbi->linear_virt_addr) {
+ pr_err("memremap of the linear cramfs image failed\n");
+ return -ENOMEM;
+ }
+
+ err = cramfs_read_super(sb, &super, silent);
+ if (err)
+ return err;
+
+ /* Remap the whole filesystem now */
+ pr_info("linear cramfs image appears to be %lu KB in size\n",
+ sbi->size/1024);
+ memunmap(sbi->linear_virt_addr);
+ sbi->linear_virt_addr = memremap(sbi->linear_phys_addr, sbi->size,
+ MEMREMAP_WB);
+ if (!sbi->linear_virt_addr) {
+ pr_err("memremap of the linear cramfs image failed\n");
+ return -ENOMEM;
+ }
+
+ return cramfs_finalize_super(sb, &super.root);
+}
+
static int cramfs_statfs(struct dentry *dentry, struct kstatfs *buf)
{
struct super_block *sb = dentry->d_sb;
@@ -573,38 +706,67 @@ static const struct super_operations cramfs_ops = {
.statfs = cramfs_statfs,
};
-static struct dentry *cramfs_mount(struct file_system_type *fs_type,
- int flags, const char *dev_name, void *data)
+static struct dentry *cramfs_blkdev_mount(struct file_system_type *fs_type,
+ int flags, const char *dev_name, void *data)
+{
+ return mount_bdev(fs_type, flags, dev_name, data, cramfs_blkdev_fill_super);
+}
+
+static struct dentry *cramfs_physmem_mount(struct file_system_type *fs_type,
+ int flags, const char *dev_name, void *data)
{
- return mount_bdev(fs_type, flags, dev_name, data, cramfs_fill_super);
+ return mount_nodev(fs_type, flags, data, cramfs_physmem_fill_super);
}
static struct file_system_type cramfs_fs_type = {
.owner = THIS_MODULE,
.name = "cramfs",
- .mount = cramfs_mount,
- .kill_sb = cramfs_kill_sb,
+ .mount = cramfs_blkdev_mount,
+ .kill_sb = cramfs_blkdev_kill_sb,
.fs_flags = FS_REQUIRES_DEV,
};
+
+static struct file_system_type cramfs_physmem_fs_type = {
+ .owner = THIS_MODULE,
+ .name = "cramfs_physmem",
+ .mount = cramfs_physmem_mount,
+ .kill_sb = cramfs_physmem_kill_sb,
+};
+
+#ifdef CONFIG_CRAMFS_BLOCKDEV
MODULE_ALIAS_FS("cramfs");
+#endif
+#ifdef CONFIG_CRAMFS_PHYSMEM
+MODULE_ALIAS_FS("cramfs_physmem");
+#endif
static int __init init_cramfs_fs(void)
{
int rv;
- rv = cramfs_uncompress_init();
- if (rv < 0)
- return rv;
- rv = register_filesystem(&cramfs_fs_type);
- if (rv < 0)
- cramfs_uncompress_exit();
- return rv;
+ if ((rv = cramfs_uncompress_init()) < 0)
+ goto err0;
+ if (IS_ENABLED(CONFIG_CRAMFS_BLOCKDEV) &&
+ (rv = register_filesystem(&cramfs_fs_type)) < 0)
+ goto err1;
+ if (IS_ENABLED(CONFIG_CRAMFS_PHYSMEM) &&
+ (rv = register_filesystem(&cramfs_physmem_fs_type)) < 0)
+ goto err2;
+ return 0;
+
+err2: if (IS_ENABLED(CONFIG_CRAMFS_BLOCKDEV))
+ unregister_filesystem(&cramfs_fs_type);
+err1: cramfs_uncompress_exit();
+err0: return rv;
}
static void __exit exit_cramfs_fs(void)
{
cramfs_uncompress_exit();
- unregister_filesystem(&cramfs_fs_type);
+ if (IS_ENABLED(CONFIG_CRAMFS_BLOCKDEV))
+ unregister_filesystem(&cramfs_fs_type);
+ if (IS_ENABLED(CONFIG_CRAMFS_PHYSMEM))
+ unregister_filesystem(&cramfs_physmem_fs_type);
}
module_init(init_cramfs_fs)
--
2.9.4
^ permalink raw reply related
* [PATCH 0/5] cramfs refresh for embedded usage
From: Nicolas Pitre @ 2017-08-11 19:22 UTC (permalink / raw)
To: Alexander Viro; +Cc: linux-fsdevel, linux-embedded, linux-kernel, Chris Brandt
This series brings a nice refresh to the cramfs filesystem, adding the
following capabilities:
- Direct memory access, bypassing the block and/or MTD layers entirely.
- Ability to store individual data blocks uncompressed.
- Ability to locate individual data blocks anywhere in the filesystem.
The end result is a very tight filesystem that can be accessed directly
from ROM without any other subsystem underneath. Also this allows for
user space XIP which is a very important feature for tiny embedded
systems.
Why cramfs?
Because cramfs is very simple and small. With CONFIG_CRAMFS_BLOCKDEV=n and
CONFIG_CRAMFS_PHYSMEM=y the cramfs driver may use only 3704 bytes of code.
That's many times smaller than squashfs. And the runtime memory usage is
also much less with cramfs than squashfs. It packs very tightly already
compared to romfs which has no compression support. And the cramfs format
was simple to extend, allowing for both compressed and uncompressed blocks
within the same file.
Why not access ROM via MTD?
The MTD layer is nice and flexible. It also represents a huge overhead
considering its core with no other enabled options weighs 19KB.
That's many times the size of the cramfs code for something that
essentially boils down to a glorified argument parser and a call to
memremap(). And if someone still wants to use cramfs via MTD then
it is already possible with mtdblock.
Of course, while this cramfs remains backward compatible with existing
filesystem images, a newer mkcramfs version is necessary to take advantage
of the extended data layout. I created a version of mkcramfs that
detects ELF files and marks text+rodata segments for XIP and compresses the
rest automatically.
So here it is. I'm also willing to step up as cramfs maintainer given
that there has been no sign of any maintenance activity for years.
This series is also available based on v4.13-rc4 via git here:
http://git.linaro.org/people/nicolas.pitre/linux xipcramfs
diffstat:
Documentation/filesystems/cramfs.txt | 35 ++
MAINTAINERS | 4 +-
fs/cramfs/Kconfig | 39 ++-
fs/cramfs/README | 31 +-
fs/cramfs/inode.c | 500 +++++++++++++++++++++++++----
include/uapi/linux/cramfs_fs.h | 20 +-
init/do_mounts.c | 8 +
7 files changed, 560 insertions(+), 77 deletions(-)
^ permalink raw reply
* Re: mpc8572e linux kernel board bring-up problem
From: Rob Landley @ 2016-11-23 23:05 UTC (permalink / raw)
To: Nicolas Beland, linux-embedded@vger.kernel.org
In-Reply-To: <MWHPR01MB2623BDB22229B22DFF06DDC2ECB70@MWHPR01MB2623.prod.exchangelabs.com>
On 11/23/2016 08:00 AM, Nicolas Beland wrote:
> Btw, I'm on the right mailing list?
It's an ok mailing list for this but the volume here is pretty low. You
might have better luck on the powerpc mailing list.
The list of lists is at http://vger.kernel.org/vger-lists.html by the way.
If you want more rapid turnaround, freenode irc channels might (or might
not) work for that.
Rob
^ permalink raw reply
* mpc8572e linux kernel board bring-up problem
From: Nicolas Beland @ 2016-11-23 14:00 UTC (permalink / raw)
To: linux-embedded@vger.kernel.org
Hi everyone!
I got lucky and got an embedded linux internship:)
I'm trying to port linux to an existing board that is currently using vxWorks.
The board uses an mpc8572e processor which contains two e500v2 cores. Basically, the entry point in vmlinux is arch/powerpc/kernel/head_fsl_booke.S
We are using the simpleboot bootwrapper, which works just fine (simpleboot passes the device tree address and jumps right into the gunzipped vmlinux, which is located at physical address 0x0).
head_fsl_booke.S includes arch/powerpc/kernel/fsl_booke_entry_mapping.S, and we crash at line 223 (the rfi instruction) of fsl_booke_entry_mapping.S:
215 lis r7,MSR_KERNEL@h
216 ori r7,r7,MSR_KERNEL@l
217 bl 1f /* Find our address */
218 1: mflr r9
219 rlwimi r6,r9,0,20,31
220 addi r6,r6,(2f - 1b)
221 mtspr SPRN_SRR0,r6
222 mtspr SPRN_SRR1,r7
223 rfi /* start execution out of TLB1[0] entry */
The rfi at line 223 causes us to jump to 0xc000_0264, while we want to jump to 0x0000_0264.
This is not caused by our MMU; we are simply loading 0xc000_0264 into SRR0.
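The arithmetic in lines 219-221 of the listing can be modeled in a few lines of C. This is only a sketch under stated assumptions: the quoted lines do not show what r6 holds before line 219, so the function takes it as a parameter (`r6_base`), and the names `srr0_value`, `r6_base`, `lr` and `delta` are made up for illustration. It shows that the upper 20 bits of the SRR0 value come from r6, not from the address the code is currently running at.

```c
#include <stdint.h>

/* Mask for PowerPC bits 20..31 (IBM numbering, bit 0 is the MSB): the low 12 bits. */
#define LOW12_MASK 0x00000fffu

/*
 * Model of lines 219-221 of the quoted listing.
 * r6_base: value assumed to be in r6 before line 219 (not shown in the post).
 * lr:      address of label 1, as returned by mflr into r9.
 * delta:   the assembler constant (2f - 1b).
 */
uint32_t srr0_value(uint32_t r6_base, uint32_t lr, uint32_t delta)
{
    /* rlwimi r6,r9,0,20,31: keep r6's upper 20 bits, insert r9's low 12 bits */
    uint32_t r6 = (r6_base & ~LOW12_MASK) | (lr & LOW12_MASK);

    /* addi r6,r6,(2f - 1b) */
    return r6 + delta;
}
```

For example, if r6 held 0xc0000000 (the CONFIG_PAGE_OFFSET value from the defconfig in this message), label 1 sat at physical 0x0000024c, and label 2 followed the rfi directly (six instructions, so a delta of 24), the call `srr0_value(0xc0000000u, 0x0000024cu, 24u)` gives 0xc0000264. Those three inputs are illustrative guesses, not values taken from the post.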
I think it's related to the following config fields in our defconfig:
CONFIG_PAGE_OFFSET=0xc0000000 <----
CONFIG_PHYSICAL_START=0x00000000
CONFIG_PHYSICAL_ALIGN=0x04000000
CONFIG_TASK_SIZE=0xc0000000
CONFIG_KERNEL_START=0xc00000000 <-----
I tried to set CONFIG_KERNEL_START and CONFIG_PAGE_OFFSET to 0x0 but this would not compile:
arch/powerpc/include/asm/processor.h:100:2: error: #error User TASK_SIZE overlaps with KERNEL_START address
By the way CONFIG_TASK_SIZE=0xc0000000
Also, my debugger cannot map anything because all the symbols in vmlinux are at 0xcxxx_xxxx but we are at 0x0xxx_xxxx
Pretty much all the ports that use the simpleboot bootwrapper gunzip vmlinux at 0x0, but I can easily change that. I tried loading vmlinux at 0xc000_0000 and it solved this particular issue, but then the MMU mapping does not work, because simpleboot extracts our device tree at about 0x0011_0000 and we need vmlinux and the device tree to be inside a single 64MB page. (I also tried hacking fsl_booke_entry_mapping.S to create a 4GB TLB entry based at 0x0, and this caused other issues...)
Got any idea?
Also, I found out that __machine_desc_start is invalid (my on-chip debugger showed .long 0x0 as the value of this symbol when I tried the 4GB TLB entry). This caused problems in probe_machine(), located in arch/powerpc/kernel/setup-common.c, when I tried the 4GB TLB approach. The 4GB TLB entry also caused problems with the __va() macro (it translates a physical address to a virtual address), so I removed the calls to that macro to make it work. But I don't really want to hack the kernel all over the place, because I don't understand how it's supposed to be done. I suppose it's somehow still related to the configs above? For the 4GB TLB entry I want __va() to return what it received, because we are using a 1:1 mapping.
Btw, I'm on the right mailing list?
Thanks!
Nicolas Béland
Dialogic Inc. Montreal (Canada)
Nicolas.Beland@dialogic.com
^ permalink raw reply
* Re: mounting squashfs as initrd from RAM
From: Rob Landley @ 2016-04-27 23:46 UTC (permalink / raw)
To: Nick Gifford, linux-embedded@vger.kernel.org
In-Reply-To: <20FFB90893A2AB4492D9DFB3B61CC8C403A3CD25@MBX028-W1-CA-4.exch028.domain.local>
On 04/27/2016 02:03 PM, Nick Gifford wrote:
>
> ________________________________________
> From: Rob Landley [rob@landley.net]
> Sent: Monday, April 25, 2016 7:55 PM
> To: Nick Gifford; linux-embedded@vger.kernel.org
> Subject: Re: mounting squashfs as initrd from RAM
>>> Virtual kernel memory layout:
>>> vector : 0xffff0000 - 0xffff1000 ( 4 kB)
>>> fixmap : 0xffc00000 - 0xffe00000 (2048 kB)
>>> vmalloc : 0xf0000000 - 0xff000000 ( 240 MB)
>
> Because fe7f9000 is off the top of "fixmap", but before the start of
> "vector", so it's in an unmapped area.
>
> [nick] Am I missing something here? Isn't fe7f9000 in the range "vmalloc : 0xf0000000 - 0xff000000"?
You're right, I misread fe7 as ffe7.
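The range check being discussed is easy to do mechanically. Below is a minimal sketch that uses only the three ranges quoted from the boot log above; the names `struct region`, `layout` and `region_of` are invented for this example, and the real kernel layout has more regions than the three shown in the excerpt.

```c
#include <stddef.h>
#include <stdint.h>

struct region {
    const char *name;
    uint32_t start;   /* inclusive */
    uint32_t end;     /* exclusive */
};

/* Ranges copied from the "Virtual kernel memory layout" lines quoted above. */
const struct region layout[] = {
    { "vector",  0xffff0000u, 0xffff1000u },
    { "fixmap",  0xffc00000u, 0xffe00000u },
    { "vmalloc", 0xf0000000u, 0xff000000u },
};

/* Return the name of the listed region containing addr, or NULL if none does. */
const char *region_of(uint32_t addr)
{
    for (size_t i = 0; i < sizeof(layout) / sizeof(layout[0]); i++) {
        if (addr >= layout[i].start && addr < layout[i].end)
            return layout[i].name;
    }
    return NULL;
}
```

With this table, `region_of(0xfe7f9000u)` returns "vmalloc", while an address beginning with ffe7 (0xffe7f000 is used here purely as a stand-in) falls between the end of fixmap and the start of vector and returns NULL.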
It's _really_ hard to read your responses, by the way. The >>> format
exists for a reason, I take it your client can't do replies that way?
>> [nick] Due to reasons above, my guess is that uboot is moving the
>> image from 0x1000000 to 0x3e7f9000 before booting linux.
>
> Where does it say it's moving it? WHY move it? Why isn't the tftp just
> loading it at the right place to begin with? (The load address is an
> argument to the tftp command.)
>
> And how does 3e7f become fe7f? What's the base physical address of your
> fixmap?
>
> [nick] I don't know why, but uboot is moving it. Snippets from the original log:
> ## Loading init Ramdisk from Legacy Image at 01000000 ...
> Image Type: ARM Linux RAMDisk Image (gzip compressed)
> Data Size: 15958016 Bytes = 15.2 MiB
> Verifying Checksum ... OK
> Loading Ramdisk to 3e7f9000, end 3f731000 ... OK
>
> These tell me that uboot is finding it at 01000000, verifying it, and moving it to 3e7f9000 before booting linux. Then:
Why...?
Is it _just_ moving it, or is it decompressing it? If it decompresses
it, the kernel may not recognize it if it's expecting a compressed one.
If it's not decompressing it, how is it "verifying" it?
> DEBUG: phys to virt addr 3e7f9000 --> fe7f9000
>
> tells me that linux has mapped 3e7f9000 (taken from uboot) to virtual address fe7f9000. I don't know why it is not mapped later when trying to access it to copy the rootfs.
Neither do I.
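The numbers in the two log excerpts are at least self-consistent, which a few lines of C can confirm. This is a sketch under assumptions: the thread never states PAGE_OFFSET or PHYS_OFFSET, so the values below are guesses chosen because they reproduce the DEBUG line, and `linear_phys_to_virt` and `image_end` are hypothetical helper names, not kernel functions.

```c
#include <stdint.h>

/* Assumed values; neither is stated anywhere in the thread. */
#define PAGE_OFFSET 0xc0000000u  /* start of the kernel's linear map */
#define PHYS_OFFSET 0x00000000u  /* physical address of the start of RAM */

/* Linear-map translation: only meaningful for RAM the kernel has mapped there. */
uint32_t linear_phys_to_virt(uint32_t phys)
{
    return phys - PHYS_OFFSET + PAGE_OFFSET;
}

/* End address of an image of `size` bytes loaded at `start`. */
uint32_t image_end(uint32_t start, uint32_t size)
{
    return start + size;
}
```

Under those assumptions, 0x3e7f9000 translates to 0xfe7f9000, matching the DEBUG line, and 0x3e7f9000 plus the 15958016-byte data size gives 0x3f731000, matching the "end" address that U-Boot printed. A plain offset like this says nothing about whether the resulting virtual address is actually backed by a page-table entry, which is the open question here.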
Is this an initrd issue or is this an issue with your board's vmalloc
implementation? Does anything ELSE using vmalloc work?
>>> I added the "DEBUG: phys to virt addr 3e7f9000 --> fe7f9000" where it
>>> looks like 0xfe7f9000 is being mapped for the initrd in
>> arch/arm/mm/init.c.
>>
>> That's a 790 line file, which contains 45 instances of "initrd" but does
>> not contain the word "buf" so I have no idea what variable you printed out.
>>
>> [nick] What I took from this is that after being moved to 0x3e7f9000, it is then being mapped to virtual memory at 0xfe7f9000.
Correction: it's _attempting_ to map it. The fact the result segfaults
when you try to access it is a problem.
> Note that 0xfe7f9000 is the address that is later given to unpack_to_rootfs (with the correct size).
>>
>> And it's architecture independent code where it looks like you're having
>> a board-specific problem. Possibly this is the first vmalloc use in the
>> kernel and vmalloc doesn't work on your board, it's hard to tell from here.
>>
>> [nick] I don't think it is a board problem as everything boots up fine when
>> I mount the rootfs out of flash with kernel arguments of
>> "console=ttyPS1,115200 earlyprintk noinitrd root=/dev/mtdblock6
> rootfstype=squashfs ro"
>
> Your board's memory layout and where your board is telling the kernel to
> look for the initrd data don't line up. That's either a board problem or
> a bootloader config problem.
>
> [nick] I think the path from physical addr 01000000 to physical addr 3e7f9000 to virtual address fe7f9000 shows from the logs. And it seems like fe7f9000 is a valid vmalloc address.
Except for the part where reading from it segfaults.
It's not objecting to the contents of the memory, it's objecting to the
_mapping_. Why is that?
Rob
^ permalink raw reply