* JFFS2 mount time
From: Ferenc Havasi @ 2004-10-20 14:26 UTC
To: linux-mtd, jffs-dev, dwmw2
[-- Attachment #1: Type: text/plain, Size: 1792 bytes --]
Dear All,
Here is the latest version of our mount time improvement.
To use it:
- apply this patch on the latest version of MTD
- compile sumtool (make command in mtd/util)
- make your JFFS2 image as before (already created images work as
well)
- run sumtool to insert summary information, for example:
./sumtool -i original.jffs2 -o new.jffs2 -e128KiB
- recompile your kernel with "JFFS2 inode summary support"
Jarkko measured this on a real NAND device: his JFFS2 image was
120819928 bytes (115M), and after running sumtool the new image was
123338752 bytes (117M). Mount time was 55 sec with the original image;
with the new image it is only 8.5 sec.
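As a rough check of these figures, the size overhead and the speed-up
can be computed from the numbers quoted above: about 2.1% extra space
for a mount roughly 6.5 times faster. The helper names below are
hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Size overhead of the summed image relative to the original, in percent. */
double summary_overhead_percent(uint64_t orig_bytes, uint64_t summed_bytes)
{
    return 100.0 * (double)(summed_bytes - orig_bytes) / (double)orig_bytes;
}

/* Mount-time speed-up factor: old mount time divided by new mount time. */
double mount_speedup(double old_sec, double new_sec)
{
    return old_sec / new_sec;
}
```

Calling summary_overhead_percent(120819928, 123338752) and
mount_speedup(55.0, 8.5) reproduces the two figures.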
It works much like our previous improvement: it stores special
information at the end of the erase blocks, and if this information is
present at mount time, scanning the erase block is unnecessary.
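The mount-time check can be sketched in user space as follows. In the
patch, the last 8 bytes of an erase block hold a 32-bit offset followed
by a 32-bit magic (JFFS2_SUM_MAGIC). This is only a simplified sketch:
the function names are hypothetical, little-endian byte order is
assumed, and the extra bounds check on the offset is an addition that
the patch itself does not perform.

```c
#include <stddef.h>
#include <stdint.h>

#define JFFS2_SUM_MAGIC 0x02851885u /* value taken from the patch */

/* Read a 32-bit little-endian value (endianness is an assumption here). */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Return the offset of the summary node inside the erase block, or -1
 * if there is no usable marker and the block must be scanned in full.
 */
long find_summary_offset(const uint8_t *block, size_t block_size)
{
    if (block_size < 8)
        return -1;

    const uint8_t *marker = block + block_size - 8;
    uint32_t offset = read_le32(marker);
    uint32_t magic = read_le32(marker + 4);

    if (magic != JFFS2_SUM_MAGIC)
        return -1; /* no summary: fall back to a normal scan */
    if (offset >= block_size - 8)
        return -1; /* offset points outside the block (added check) */
    return (long)offset;
}
```

A block that is all 0xff, or that ends in ordinary node data, yields -1
and takes the normal scan path, which is why images without summary
information keep working.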
New things compared to our previous improvement:
- it was fully rewritten
- we separated the user space tool (sumtool) from mkfs
- sumtool now not only inserts the summary information but also does
some node reordering. There will be two kinds of erase blocks: the
"first type" will contain only jffs2_raw_inodes, and all other nodes
(jffs2_raw_dirent) will be stored in the "second type". sumtool
generates a summary at the end of every "first type" erase block. (The
"second type" will be scanned as before, because all the information
in jffs2_raw_dirent is needed at mount time.)
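The reordering rule can be sketched as a small classifier. This is an
illustration rather than the sumtool source: the enum and function
names are hypothetical, and the flag values are those of linux/jffs2.h
as I recall them.

```c
#include <stdint.h>

/* Flag values as defined in linux/jffs2.h. */
#define JFFS2_FEATURE_INCOMPAT 0xc000
#define JFFS2_NODE_ACCURATE    0x2000
#define JFFS2_NODETYPE_DIRENT  (JFFS2_FEATURE_INCOMPAT | JFFS2_NODE_ACCURATE | 1)
#define JFFS2_NODETYPE_INODE   (JFFS2_FEATURE_INCOMPAT | JFFS2_NODE_ACCURATE | 2)

enum block_kind {
    BLOCK_INODE_SUMMED,  /* "first type": raw inodes only, summary at the end */
    BLOCK_OTHER_SCANNED  /* "second type": dirents etc., scanned as before */
};

/* Decide which kind of erase block a node is written to. */
enum block_kind classify_node(uint16_t nodetype)
{
    /* Like sumtool, set the ACCURATE bit before comparing node types. */
    uint16_t type = (uint16_t)(nodetype | JFFS2_NODE_ACCURATE);

    return type == JFFS2_NODETYPE_INODE ? BLOCK_INODE_SUMMED
                                        : BLOCK_OTHER_SCANNED;
}
```

In sumtool the two kinds correspond to the SINODE and SONODE buffers,
each of which is flushed to the output one erase block at a time.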
Of course, all of this is optional (as you can see above, you have to
select it in the kernel config). The JFFS2 image produced by sumtool is
also usable with previous kernels, because the summary node is
JFFS2_FEATURE_RWCOMPAT_DELETE.
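The compatibility claim rests on the top two bits of the node type,
which tell a kernel what to do with a node type it does not know. The
sketch below is a simplified model of that decision, not the kernel's
scan code; the enum and function names are hypothetical, and the flag
values are those of linux/jffs2.h as I recall them.

```c
#include <stdint.h>

#define JFFS2_COMPAT_MASK             0xc000
#define JFFS2_FEATURE_INCOMPAT        0xc000
#define JFFS2_FEATURE_ROCOMPAT        0x8000
#define JFFS2_FEATURE_RWCOMPAT_COPY   0x4000
#define JFFS2_FEATURE_RWCOMPAT_DELETE 0x0000
#define JFFS2_NODE_ACCURATE           0x2000
#define JFFS2_NODETYPE_INODE_SUM \
    (JFFS2_FEATURE_RWCOMPAT_DELETE | JFFS2_NODE_ACCURATE | 6)

enum unknown_node_action {
    REFUSE_MOUNT,    /* INCOMPAT: the filesystem cannot be mounted */
    MOUNT_READ_ONLY, /* ROCOMPAT: safe to read, not to write */
    KEEP_AND_COPY,   /* RWCOMPAT_COPY: preserve during garbage collection */
    SKIP_AND_DELETE  /* RWCOMPAT_DELETE: ignore, reclaim the space later */
};

/* What a kernel does with a node type it does not recognise. */
enum unknown_node_action action_for_unknown_node(uint16_t nodetype)
{
    switch (nodetype & JFFS2_COMPAT_MASK) {
    case JFFS2_FEATURE_INCOMPAT:
        return REFUSE_MOUNT;
    case JFFS2_FEATURE_ROCOMPAT:
        return MOUNT_READ_ONLY;
    case JFFS2_FEATURE_RWCOMPAT_COPY:
        return KEEP_AND_COPY;
    default: /* JFFS2_FEATURE_RWCOMPAT_DELETE */
        return SKIP_AND_DELETE;
    }
}
```

For JFFS2_NODETYPE_INODE_SUM this yields SKIP_AND_DELETE, so a kernel
without summary support simply ignores the summary node and scans the
erase block as usual.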
I think it can be useful not only for us. David, may I commit it to
CVS?
Regards,
Ferenc
[-- Attachment #2: jffs2-summary.patch --]
[-- Type: text/x-patch, Size: 36792 bytes --]
diff --unified --recursive --new-file mtd2/fs/Kconfig mtd/fs/Kconfig
--- mtd2/fs/Kconfig 2004-07-16 17:20:59.000000000 +0200
+++ mtd/fs/Kconfig 2004-10-20 15:10:42.000000000 +0200
@@ -68,6 +68,19 @@
Say 'N' unless you have NAND flash and you are willing to test and
develop JFFS2 support for it.
+config JFFS2_FS_SUMMARY
+ bool "JFFS2 inode summary support (EXPERIMENTAL)"
+ depends on JFFS2_FS
+ default n
+ help
+ This feature makes it possible to use inode summary information
+ for faster filesystem mount - especially on NAND.
+
+ The summary information can be inserted into a filesystem image
+ by the utility 'sumtool'.
+
+ If unsure, say 'N'.
+
config JFFS2_COMPRESSION_OPTIONS
bool "Advanced compression options for JFFS2"
default n
diff --unified --recursive --new-file mtd2/fs/jffs2/scan.c mtd/fs/jffs2/scan.c
--- mtd2/fs/jffs2/scan.c 2004-09-12 11:56:13.000000000 +0200
+++ mtd/fs/jffs2/scan.c 2004-10-20 15:44:25.000000000 +0200
@@ -4,6 +4,8 @@
* Copyright (C) 2001-2003 Red Hat, Inc.
*
* Created by David Woodhouse <dwmw2@redhat.com>
+ * Inode summary support by Zoltan Sogor, Ferenc Havasi, Patrik Kluba
+ * University of Szeged, Hungary
*
* For licensing information, see the file 'LICENCE' in this directory.
*
@@ -58,6 +60,11 @@
static int jffs2_scan_dirent_node(struct jffs2_sb_info *c, struct jffs2_eraseblock *jeb,
struct jffs2_raw_dirent *rd, uint32_t ofs);
+
+#ifdef CONFIG_JFFS2_FS_SUMMARY
+static struct jffs2_inode_cache *jffs2_scan_make_ino_cache(struct jffs2_sb_info *c, uint32_t ino);
+#endif
+
#define BLK_STATE_ALLFF 0
#define BLK_STATE_CLEAN 1
#define BLK_STATE_PARTDIRTY 2
@@ -292,6 +299,25 @@
#ifdef CONFIG_JFFS2_FS_NAND
int cleanmarkerfound = 0;
#endif
+#ifdef CONFIG_JFFS2_FS_SUMMARY
+ struct jffs2_raw_node_ref *raw;
+ struct jffs2_raw_node_ref *cache_ref;
+ struct jffs2_inode_cache *ic;
+
+ typedef struct sum_marker {
+ jint32_t offset;
+ jint32_t magic;
+ } sum_marker;
+
+ sum_marker *sm;
+ int i;
+ int sumsize;
+ uint32_t ino;
+ uint32_t crc;
+ struct jffs2_inode_sum_node *summary;
+ struct jffs2_inode_sum_record *sum_rec;
+ int bad_sum = 0;
+#endif
ofs = jeb->offset;
prevofs = jeb->offset - 1;
@@ -314,10 +340,217 @@
}
}
#endif
+
+#ifdef CONFIG_JFFS2_FS_SUMMARY
+ /* Looking for summary marker */
+ sm = (sum_marker *)kmalloc(sizeof(*sm), GFP_KERNEL);
+ if (!sm) {
+ return -ENOMEM;
+ }
+
+ err = jffs2_fill_scan_buf(c, (unsigned char *) sm, jeb->offset + c->sector_size - 8, 8);
+ if (err) {
+ kfree(sm);
+ return err;
+ }
+
+ if (je32_to_cpu(sm->magic) == JFFS2_SUM_MAGIC) {
+ ofs = je32_to_cpu(sm->offset);
+ sumsize = c->sector_size - ofs;
+ ofs += jeb->offset;
+
+ D1(printk(KERN_DEBUG "jffs2_scan_eraseblock(): Inode summary information found at 0x%x (%d bytes)\n", ofs, sumsize));
+
+ summary = (struct jffs2_inode_sum_node *) kmalloc(sumsize, GFP_KERNEL);
+
+ if (!summary) {
+ kfree(sm);
+ return -ENOMEM;
+ }
+
+ err = jffs2_fill_scan_buf(c, (unsigned char *)summary, ofs, sumsize);
+
+ if (err) {
+ kfree(sm);
+ kfree(summary);
+ return err;
+ }
+
+ /* OK, now check for node validity and CRC */
+ crcnode.magic = cpu_to_je16(JFFS2_MAGIC_BITMASK);
+ crcnode.nodetype = cpu_to_je16(JFFS2_NODETYPE_INODE_SUM);
+ crcnode.totlen = summary->totlen;
+ hdr_crc = crc32(0, &crcnode, sizeof(crcnode)-4);
+
+ if (je32_to_cpu(summary->hdr_crc) != hdr_crc) {
+ D1(printk(KERN_DEBUG "jffs2_scan_eraseblock(): Summary node header is corrupt (bad CRC or no summary at all)\n"));
+ bad_sum = 1;
+ }
+
+ if ((!bad_sum) && (je32_to_cpu(summary->totlen) != sumsize)) {
+ D1(printk(KERN_DEBUG "jffs2_scan_eraseblock(): Summary node is corrupt (wrong erasesize?)\n"));
+ bad_sum = 1;
+ }
+
+ crc = crc32(0, summary, sizeof(struct jffs2_inode_sum_node)-8);
+
+ if ((!bad_sum) && (je32_to_cpu(summary->node_crc) != crc)) {
+ D1(printk(KERN_DEBUG "jffs2_scan_eraseblock(): Summary node is corrupt (bad CRC)\n"));
+ bad_sum = 1;
+ }
+
+ sum_rec = (struct jffs2_inode_sum_record *) &(summary->sum[0]);
+ crc = crc32(0, sum_rec, sumsize - sizeof(struct jffs2_inode_sum_node));
+
+ if ((!bad_sum) && (je32_to_cpu(summary->sum_crc) != crc)) {
+ D1(printk(KERN_DEBUG "jffs2_scan_eraseblock(): Summary node data is corrupt (bad CRC)\n"));
+ bad_sum = 1;
+ }
+
+ if (!bad_sum) {
+
+ if ( je32_to_cpu(summary->cln_mkr) ){
+
+ D1(printk(KERN_DEBUG "Summary : CLEANMARKER node \n"));
+
+ if (je32_to_cpu(summary->cln_mkr) != c->cleanmarker_size) {
+ printk(KERN_DEBUG "CLEANMARKER node has totlen 0x%x != normal 0x%x\n",
+ je32_to_cpu(summary->cln_mkr), c->cleanmarker_size);
+ UNCHECKED_SPACE( PAD(je32_to_cpu(summary->cln_mkr)) );
+ }
+ else if (jeb->first_node) {
+ printk(KERN_DEBUG "CLEANMARKER node not first node in block (0x%08x)\n", jeb->offset);
+ UNCHECKED_SPACE( PAD(je32_to_cpu(summary->cln_mkr)) );
+ }
+ else {
+ struct jffs2_raw_node_ref *marker_ref = jffs2_alloc_raw_node_ref();
+
+ if (!marker_ref) {
+ printk(KERN_NOTICE "Failed to allocate node ref for clean marker\n");
+ return -ENOMEM;
+ }
+
+ marker_ref->next_in_ino = NULL;
+ marker_ref->next_phys = NULL;
+ marker_ref->flash_offset = jeb->offset | REF_NORMAL;
+ marker_ref->__totlen = je32_to_cpu(summary->cln_mkr);
+ jeb->first_node = jeb->last_node = marker_ref;
+
+ USED_SPACE( PAD(je32_to_cpu(summary->cln_mkr)) );
+
+ }
+ }
+
+ for(i = 0; i < je16_to_cpu(summary->sum_num); i++) {
+
+ D1(printk(KERN_DEBUG "jffs2_scan_eraseblock(): Processing summary information %d\n", i));
+
+ //JFFS2_NODETYPE_INODE:
+ ino = je32_to_cpu(sum_rec->inode);
+ D1(printk(KERN_DEBUG "jffs2_scan_eraseblock(): Inode at 0x%08x\n", jeb->offset + je32_to_cpu(sum_rec->offset)));
+ raw = jffs2_alloc_raw_node_ref();
+ if (!raw) {
+ printk(KERN_NOTICE "jffs2_scan_eraseblock(): allocation of node reference failed\n");
+ kfree(sm);
+ kfree(summary);
+ return -ENOMEM;
+ }
+
+ ic = jffs2_get_ino_cache(c, ino);
+ if (!ic) {
+ ic = jffs2_scan_make_ino_cache(c, ino);
+ if (!ic) {
+ printk(KERN_NOTICE "jffs2_scan_eraseblock(): scan_make_ino_cache failed\n");
+ jffs2_free_raw_node_ref(raw);
+ kfree(sm);
+ kfree(summary);
+ return -ENOMEM;
+ }
+ }
+
+ raw->flash_offset = (jeb->offset + je32_to_cpu(sum_rec->offset)) | REF_UNCHECKED;
+ raw->__totlen = PAD(je32_to_cpu(sum_rec->totlen));
+ raw->next_phys = NULL;
+ raw->next_in_ino = ic->nodes;
+
+ ic->nodes = raw;
+ if (!jeb->first_node)
+ jeb->first_node = raw;
+ if (jeb->last_node)
+ jeb->last_node->next_phys = raw;
+ jeb->last_node = raw;
+
+ /* do we need this? this requires storing another 4 bytes per record in the cache or an expensive reading */
+ pseudo_random += je32_to_cpu(sum_rec->version);
+
+ UNCHECKED_SPACE(PAD(je32_to_cpu(sum_rec->totlen)));
+
+ sum_rec++;
+ }
+
+ kfree(sm);
+ kfree(summary);
+
+ /* for ACCT_PARANOIA_CHECK */
+ cache_ref = jffs2_alloc_raw_node_ref();
+
+ if (!cache_ref) {
+ printk(KERN_NOTICE "Failed to allocate node ref for cache\n");
+ return -ENOMEM;
+ }
+
+ cache_ref->next_in_ino = NULL;
+ cache_ref->next_phys = NULL;
+ cache_ref->flash_offset = ofs | REF_NORMAL;
+ cache_ref->__totlen = sumsize;
+
+ if (!jeb->first_node)
+ jeb->first_node = cache_ref;
+ if (jeb->last_node)
+ jeb->last_node->next_phys = cache_ref;
+ jeb->last_node = cache_ref;
+
+ USED_SPACE(sumsize);
+
+ /* somebody check this and all of space accounting in summary support */
+
+ if ((jeb->used_size + jeb->unchecked_size) == PAD(c->cleanmarker_size) && !jeb->dirty_size
+ && (!jeb->first_node || !jeb->first_node->next_in_ino) ) {
+ return BLK_STATE_CLEANMARKER;
+ }
+ /* move blocks with max 4 byte dirty space to cleanlist */
+ else if (!ISDIRTY(c->sector_size - (jeb->used_size + jeb->unchecked_size))) {
+ c->dirty_size -= jeb->dirty_size;
+ c->wasted_size += jeb->dirty_size;
+ jeb->wasted_size += jeb->dirty_size;
+ jeb->dirty_size = 0;
+ return BLK_STATE_CLEAN;
+ }
+ else if (jeb->used_size || jeb->unchecked_size) {
+ return BLK_STATE_PARTDIRTY;
+ }
+ else {
+ return BLK_STATE_ALLDIRTY;
+ }
+ }
+ }
+ D1(printk(KERN_DEBUG "Summary end\n"));
+
+ ofs = jeb->offset;
+ prevofs = jeb->offset - 1;
+
+#endif
+
buf_ofs = jeb->offset;
if (!buf_size) {
buf_len = c->sector_size;
+#ifdef CONFIG_JFFS2_FS_SUMMARY
+ /* must reread because of summary test */
+ err = jffs2_fill_scan_buf(c, buf, buf_ofs, buf_len);
+ if (err)
+ return err;
+#endif
} else {
buf_len = EMPTY_SCAN_SIZE;
err = jffs2_fill_scan_buf(c, buf, buf_ofs, buf_len);
diff --unified --recursive --new-file mtd2/include/linux/jffs2.h mtd/include/linux/jffs2.h
--- mtd2/include/linux/jffs2.h 2004-05-25 13:31:55.000000000 +0200
+++ mtd/include/linux/jffs2.h 2004-10-20 14:53:52.000000000 +0200
@@ -28,6 +28,9 @@
#define JFFS2_EMPTY_BITMASK 0xffff
#define JFFS2_DIRTY_BITMASK 0x0000
+/* Summary node MAGIC marker */
+#define JFFS2_SUM_MAGIC 0x02851885
+
/* We only allow a single char for length, and 0xFF is empty flash so
we don't want it confused with a real length. Hence max 254.
*/
@@ -61,6 +64,7 @@
#define JFFS2_NODETYPE_INODE (JFFS2_FEATURE_INCOMPAT | JFFS2_NODE_ACCURATE | 2)
#define JFFS2_NODETYPE_CLEANMARKER (JFFS2_FEATURE_RWCOMPAT_DELETE | JFFS2_NODE_ACCURATE | 3)
#define JFFS2_NODETYPE_PADDING (JFFS2_FEATURE_RWCOMPAT_DELETE | JFFS2_NODE_ACCURATE | 4)
+#define JFFS2_NODETYPE_INODE_SUM (JFFS2_FEATURE_RWCOMPAT_DELETE | JFFS2_NODE_ACCURATE | 6)
// Maybe later...
//#define JFFS2_NODETYPE_CHECKPOINT (JFFS2_FEATURE_RWCOMPAT_DELETE | JFFS2_NODE_ACCURATE | 3)
@@ -148,10 +152,31 @@
uint8_t data[0];
} __attribute__((packed));
+struct jffs2_inode_sum_node{
+ jint16_t magic;
+ jint16_t nodetype; /* = JFFS2_NODETYPE_INODE_SUM */
+ jint32_t totlen;
+ jint32_t hdr_crc;
+ jint16_t sum_num; /* number of sum entries*/
+ jint32_t cln_mkr; /* clean marker size, 0 = no cleanmarker */
+ jint32_t sum_crc; /* summary information crc */
+ jint32_t node_crc; /* node crc */
+ jint32_t sum[0]; /* inode summary info */
+} __attribute__((packed));
+
+struct jffs2_inode_sum_record{
+ jint32_t inode;
+ jint32_t version;
+ jint32_t offset;
+ jint32_t totlen;
+} __attribute__((packed));
+
+
union jffs2_node_union {
struct jffs2_raw_inode i;
struct jffs2_raw_dirent d;
struct jffs2_unknown_node u;
+ struct jffs2_inode_sum_node s;
};
#endif /* __LINUX_JFFS2_H__ */
diff --unified --recursive --new-file mtd2/util/Makefile mtd/util/Makefile
--- mtd2/util/Makefile 2004-07-13 19:49:43.000000000 +0200
+++ mtd/util/Makefile 2004-10-19 15:11:02.000000000 +0200
@@ -13,7 +13,7 @@
TARGETS = ftl_format flash_erase flash_eraseall nanddump doc_loadbios \
mkfs.jffs ftl_check mkfs.jffs2 flash_lock flash_unlock \
flash_info mtd_debug flashcp nandwrite jffs2dump \
- nftldump nftl_format docfdisk #jffs2reader
+ nftldump nftl_format docfdisk sumtool #jffs2reader
SYMLINKS = compr_lzari.c compr_lzo.c
@@ -48,6 +48,9 @@
jffs2dump: jffs2dump.o crc32.o
$(CC) $(LDFLAGS) -o $@ $^
+sumtool: sumtool.o crc32.o
+ $(CC) $(LDFLAGS) -o $@ $^
+
install: ${TARGETS}
mkdir -p ${DESTDIR}/${SBINDIR}
install -m0755 -oroot -groot ${TARGETS} ${DESTDIR}/${SBINDIR}/
diff --unified --recursive --new-file mtd2/util/jffs2dump.c mtd/util/jffs2dump.c
--- mtd2/util/jffs2dump.c 2004-06-19 00:11:48.000000000 +0200
+++ mtd/util/jffs2dump.c 2004-10-20 14:55:28.000000000 +0200
@@ -3,7 +3,7 @@
*
* Copyright (C) 2003 Thomas Gleixner (tglx@linutronix.de)
*
- * $Id: jffs2dump.c,v 1.6 2004/06/18 22:11:48 gleixner Exp $
+ * $Id: jffs2dump.c,v 1.1 2004/10/19 07:19:55 weth Exp $
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
@@ -277,7 +277,63 @@
p += PAD(je32_to_cpu (node->d.totlen));
break;
-
+
+ case JFFS2_NODETYPE_INODE_SUM:{
+
+ int i;
+ jint32_t *offset,*magic;
+
+ printf ("%8s Inode Sum node at 0x%08x, totlen 0x%08x, sum_num %5d, cleanmarker size %5d\n",
+ obsolete ? "Obsolete" : "",
+ p - data,
+ je32_to_cpu (node->s.totlen),
+ je16_to_cpu (node->s.sum_num),
+ je32_to_cpu (node->s.cln_mkr));
+
+ crc = crc32 (0, node, sizeof (struct jffs2_inode_sum_node) - 8);
+ if (crc != je32_to_cpu (node->s.node_crc)) {
+ printf ("Wrong node_crc at 0x%08x, 0x%08x instead of 0x%08x\n", p - data, je32_to_cpu (node->s.node_crc), crc);
+ p += PAD(je32_to_cpu (node->s.totlen));
+ dirty += PAD(je32_to_cpu (node->s.totlen));
+ continue;
+ }
+
+ crc = crc32(0, p + sizeof (struct jffs2_inode_sum_node), je32_to_cpu (node->s.totlen) - sizeof(struct jffs2_inode_sum_node));
+ if (crc != je32_to_cpu(node->s.sum_crc)) {
+ printf ("Wrong data_crc at 0x%08x, 0x%08x instead of 0x%08x\n", p - data, je32_to_cpu (node->s.sum_crc), crc);
+ p += PAD(je32_to_cpu (node->s.totlen));
+ dirty += PAD(je32_to_cpu (node->s.totlen));
+ continue;
+ }
+
+ if(verbose){
+ for(i = 0; i < je16_to_cpu (node->s.sum_num); i++){
+ struct jffs2_inode_sum_record *sp;
+ sp = (struct jffs2_inode_sum_record *) (p + sizeof (struct jffs2_inode_sum_node));
+
+ printf ("%14s #ino %5d, version %5d, offset %8d, totlen 0x%08x\n",
+ "",
+ je32_to_cpu (sp[i].inode),
+ je32_to_cpu (sp[i].version),
+ je32_to_cpu (sp[i].offset),
+ je32_to_cpu (sp[i].totlen));
+
+ }
+
+ offset = (jint32_t *)((char *)p + je32_to_cpu(node->s.totlen) - 8);
+ magic = (jint32_t *)((char *)p + je32_to_cpu(node->s.totlen) - 4);
+
+ printf("%14s Sum Node Offset 0x%08x, Magic 0x%08x\n",
+ "",
+ je32_to_cpu(*offset),
+ je32_to_cpu(*magic));
+ }
+
+ p += PAD(je32_to_cpu (node->s.totlen));
+ break;
+
+ }
+
case JFFS2_NODETYPE_CLEANMARKER:
if (verbose) {
printf ("%8s Cleanmarker at 0x%08x, totlen 0x%08x\n",
@@ -418,9 +474,9 @@
write (fd, &newnode, sizeof (struct jffs2_raw_dirent));
write (fd, p + sizeof (struct jffs2_raw_dirent), PAD (je32_to_cpu (node->d.totlen) - sizeof (struct jffs2_raw_dirent)));
- p += PAD(je32_to_cpu (node->d.totlen));
+ p += PAD(je32_to_cpu (node->d.totlen));
break;
-
+
case JFFS2_NODETYPE_CLEANMARKER:
case JFFS2_NODETYPE_PADDING:
newnode.u.magic = cnv_e16 (node->u.magic);
diff --unified --recursive --new-file mtd2/util/sumtool.c mtd/util/sumtool.c
--- mtd2/util/sumtool.c 1970-01-01 01:00:00.000000000 +0100
+++ mtd/util/sumtool.c 2004-10-20 11:56:08.000000000 +0200
@@ -0,0 +1,800 @@
+/*
+ * sumtool.c
+ *
+ * Copyright (C) 2004 Zoltan Sogor <weth@inf.u-szeged.hu>,
+ * Ferenc Havasi <havasi@inf.u-szeged.hu>,
+ * Patrik Kluba <pajko@halom.u-szeged.hu>,
+ * University of Szeged, Hungary
+ *
+ * $Id: sumtool.c,v 1.2 2004/10/20 09:56:08 hafy Exp $
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Overview:
+ * This is a utility to reorder nodes and insert inode summary information
+ * into a JFFS2 image for faster mount time - especially on NAND.
+ *
+ */
+
+#include <errno.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <string.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <time.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/param.h>
+#include <asm/types.h>
+#include <dirent.h>
+#include <mtd/jffs2-user.h>
+#include <endian.h>
+#include <byteswap.h>
+#include <getopt.h>
+#include "crc32.h"
+
+#define PAD(x) (((x)+3)&~3)
+
+#define SINODE 0 /* Inode Type*/
+#define SONODE 1 /* Other type*/
+
+static const char *const app_name = "sumtool";
+
+typedef struct sum_storage {
+ jint32_t inode;
+ jint32_t version;
+ jint32_t offset;
+ jint32_t totlen;
+ struct sum_storage *next;
+} sum_storage;
+
+static sum_storage *sum_collected = NULL; /* summary info list */
+static int sum_records = 0; /* number of summary records */
+
+
+static int verbose = 0;
+static int add_cleanmarkers = 1; /* add cleanmarker to output */
+static int use_input_cleanmarker_size = 1; /* use input file's cleanmarker size (default) */
+static int found_cleanmarkers = 0; /* cleanmarker found in input file */
+static struct jffs2_unknown_node cleanmarker;
+static int cleanmarker_size = sizeof(cleanmarker);
+static const char *short_options = "o:i:e:hvVblnc:";
+static int erase_block_size = 65536;
+static int target_endian = __BYTE_ORDER;
+static int out_fd = -1;
+static int in_fd = -1;
+
+static uint8_t *inode_buffer = NULL; /* buffer for inodes */
+static unsigned int ino_ofs = 0; /* inode buffer offset */
+
+static uint8_t *dirent_buffer = NULL; /* buffer for directory entries and other (symlink, spec. device files, etc.)*/
+static unsigned int dent_ofs = 0; /* directory entry buffer offset*/
+
+static uint8_t *file_buffer = NULL; /* file buffer contains the actual erase block*/
+static unsigned int file_ofs = 0; /* position in the buffer */
+
+static struct option long_options[] = {
+ {"output", 1, NULL, 'o'},
+ {"input", 1, NULL, 'i'},
+ {"eraseblock", 1, NULL, 'e'},
+ {"help", 0, NULL, 'h'},
+ {"verbose", 0, NULL, 'v'},
+ {"version", 0, NULL, 'V'},
+ {"bigendian", 0, NULL, 'b'},
+ {"littleendian", 0, NULL, 'l'},
+ {"no-cleanmarkers", 0, NULL, 'n'},
+ {"cleanmarker", 1, NULL, 'c'},
+ {NULL, 0, NULL, 0}
+};
+
+static char *helptext =
+ "Usage: sumtool [OPTIONS] -i inputfile -o outputfile\n"
+ "Convert the input JFFS2 file to a SUM-ed JFFS2 file\n\n"
+ "Options:\n"
+ " -e, --eraseblock=SIZE Use erase block size SIZE (default: 64KiB)\n"
+ " (usually 16KiB on NAND)\n"
+ " -c, --cleanmarker=SIZE Size of cleanmarker (default 12).\n"
+ " (usually 16 bytes on NAND, and will be set to\n"
+ " this value if left at the default 12). Will be\n"
+ " stored in OOB after each physical page composing\n"
+ " a physical erase block.\n"
+ " -n, --no-cleanmarkers Don't add a cleanmarker to every eraseblock\n"
+ " -o, --output=FILE Output to FILE \n"
+ " -i, --input=FILE Input from FILE \n"
+ " -b, --bigendian Image is big endian\n"
+ " -l, --littleendian Image is little endian\n"
+ " -h, --help Display this help text\n"
+ " -v, --verbose Verbose operation\n"
+ " -V, --version Display version information\n\n";
+
+
+static char *revtext = "$Revision: 1.2 $";
+
+static unsigned char ffbuf[16] = {
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
+ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
+};
+
+static void verror_msg(const char *s, va_list p) {
+ fflush(stdout);
+ fprintf(stderr, "%s: ", app_name);
+ vfprintf(stderr, s, p);
+}
+
+static void error_msg_and_die(const char *s, ...) {
+ va_list p;
+
+ va_start(p, s);
+ verror_msg(s, p);
+ va_end(p);
+ putc('\n', stderr);
+ exit(EXIT_FAILURE);
+}
+
+static void vperror_msg(const char *s, va_list p) {
+ int err = errno;
+
+ if (s == 0)
+ s = "";
+ verror_msg(s, p);
+ if (*s)
+ s = ": ";
+ fprintf(stderr, "%s%s\n", s, strerror(err));
+}
+
+static void perror_msg_and_die(const char *s, ...) {
+ va_list p;
+
+ va_start(p, s);
+ vperror_msg(s, p);
+ va_end(p);
+ exit(EXIT_FAILURE);
+}
+
+
+
+static void full_write(void *target_buff, const void *buf, int len, int nd);
+
+void setup_cleanmarker() {
+
+ cleanmarker.magic = cpu_to_je16(JFFS2_MAGIC_BITMASK);
+ cleanmarker.nodetype = cpu_to_je16(JFFS2_NODETYPE_CLEANMARKER);
+ cleanmarker.totlen = cpu_to_je32(cleanmarker_size);
+ cleanmarker.hdr_crc = cpu_to_je32(crc32(0, &cleanmarker, sizeof(struct jffs2_unknown_node)-4));
+}
+
+void process_options (int argc, char **argv){
+ int opt,c;
+
+ while ((opt = getopt_long(argc, argv, short_options, long_options, &c)) >= 0)
+ {
+ switch (opt)
+ {
+ case 'o':
+ if (out_fd != -1) {
+ error_msg_and_die("output filename specified more than once");
+ }
+ out_fd = open(optarg, O_CREAT | O_TRUNC | O_RDWR, 0644);
+ if (out_fd == -1) {
+ perror_msg_and_die("open output file");
+ }
+ break;
+
+ case 'i':
+ if (in_fd != -1) {
+ error_msg_and_die("input filename specified more than once");
+ }
+ in_fd = open(optarg, O_RDONLY);
+ if (in_fd == -1) {
+ perror_msg_and_die("open input file");
+ }
+ break;
+ case 'b':
+ target_endian = __BIG_ENDIAN;
+ break;
+ case 'l':
+ target_endian = __LITTLE_ENDIAN;
+ break;
+ case 'h':
+ case '?':
+ error_msg_and_die(helptext);
+
+ case 'v':
+ verbose = 1;
+ break;
+
+ case 'V':
+ error_msg_and_die("revision %.*s\n",
+ (int) strlen(revtext) - 13, revtext + 11);
+
+ case 'e': {
+ char *next;
+ unsigned units = 0;
+ erase_block_size = strtol(optarg, &next, 0);
+ if (!erase_block_size)
+ error_msg_and_die("Unrecognisable erase size\n");
+
+ if (*next) {
+ if (!strcmp(next, "KiB")) {
+ units = 1024;
+ } else if (!strcmp(next, "MiB")) {
+ units = 1024 * 1024;
+ } else {
+ error_msg_and_die("Unknown units in erasesize\n");
+ }
+ } else {
+ if (erase_block_size < 0x1000)
+ units = 1024;
+ else
+ units = 1;
+ }
+ erase_block_size *= units;
+
+ /* If it's less than 8KiB, they're not allowed */
+ if (erase_block_size < 0x2000) {
+ fprintf(stderr, "Erase size 0x%x too small. Increasing to 8KiB minimum\n",
+ erase_block_size);
+ erase_block_size = 0x2000;
+ }
+ break;
+ }
+
+ case 'n':
+ add_cleanmarkers = 0;
+ break;
+ case 'c':
+ cleanmarker_size = strtol(optarg, NULL, 0);
+
+ if (cleanmarker_size < sizeof(cleanmarker)) {
+ error_msg_and_die("cleanmarker size must be >= 12");
+ }
+ if (cleanmarker_size >= erase_block_size) {
+ error_msg_and_die("cleanmarker size must be < eraseblock size");
+ }
+
+ use_input_cleanmarker_size = 0;
+ found_cleanmarkers = 1;
+ setup_cleanmarker();
+
+ break;
+
+ }
+ }
+}
+
+
+void init_buffers() {
+
+ inode_buffer = malloc(erase_block_size);
+
+ if (!inode_buffer) {
+ perror("out of memory");
+ close (in_fd);
+ close (out_fd);
+ exit(1);
+ }
+
+ dirent_buffer = malloc(erase_block_size);
+
+ if (!dirent_buffer) {
+ perror("out of memory");
+ close (in_fd);
+ close (out_fd);
+ exit(1);
+ }
+
+ file_buffer = malloc(erase_block_size);
+
+ if (!file_buffer) {
+ perror("out of memory");
+ close (in_fd);
+ close (out_fd);
+ exit(1);
+ }
+}
+
+void clean_buffers() {
+
+ if (inode_buffer)
+ free(inode_buffer);
+ if (dirent_buffer)
+ free(dirent_buffer);
+ if (file_buffer)
+ free(file_buffer);
+}
+
+int load_next_block() {
+
+ int ret;
+ ret = read(in_fd, file_buffer, erase_block_size);
+ file_ofs = 0;
+
+ if(verbose)
+ printf("Load next block : %d bytes read\n",ret);
+
+ return ret;
+}
+
+void write_buff_to_file(int nd) {
+
+ int ret;
+ int len = erase_block_size;
+ uint8_t *buf = NULL;
+
+ if (!nd) {
+ buf = inode_buffer;
+ while (len > 0) {
+ ret = write(out_fd, buf, len);
+
+ if (ret < 0)
+ perror_msg_and_die("write");
+
+ if (ret == 0)
+ perror_msg_and_die("write returned zero");
+
+ len -= ret;
+ buf += ret;
+ }
+ ino_ofs = 0;
+ }
+ else {
+ buf = dirent_buffer;
+ while (len > 0) {
+ ret = write(out_fd, buf, len);
+
+ if (ret < 0)
+ perror_msg_and_die("write");
+
+ if (ret == 0)
+ perror_msg_and_die("write returned zero");
+
+ len -= ret;
+ buf += ret;
+ }
+ dent_ofs = 0;
+ }
+}
+
+void dump_sum_records() {
+
+ struct jffs2_inode_sum_node isum;
+ struct sum_storage *temp;
+ jint32_t offset;
+ jint32_t *wpage;
+ int datasize;
+ int infosize;
+ int padsize;
+ jint32_t magic = cpu_to_je32(JFFS2_SUM_MAGIC);
+
+ if (!sum_records)
+ return;
+
+ datasize = sum_records * sizeof(struct jffs2_inode_sum_record) + 8;
+ infosize = sizeof(struct jffs2_inode_sum_node) + datasize;
+ padsize = erase_block_size - ino_ofs - infosize;
+ infosize += padsize; datasize += padsize;
+ offset = cpu_to_je32(ino_ofs);
+ jint32_t *tpage = (jint32_t *) malloc(datasize);
+
+ if(!tpage)
+ error_msg_and_die("Can't allocate memory to dump summary information!\n");
+
+ memset(tpage, 0xff, datasize);
+ memset(&isum, 0, sizeof(isum));
+
+ isum.magic = cpu_to_je16(JFFS2_MAGIC_BITMASK);
+ isum.nodetype = cpu_to_je16(JFFS2_NODETYPE_INODE_SUM);
+ isum.totlen = cpu_to_je32(infosize);
+ isum.hdr_crc = cpu_to_je32(crc32(0, &isum, sizeof(struct jffs2_unknown_node) - 4));
+
+ if (add_cleanmarkers && found_cleanmarkers) {
+ isum.cln_mkr = cpu_to_je32(cleanmarker_size);
+ }
+ else{
+ isum.cln_mkr = cpu_to_je32(0);
+ }
+
+ isum.sum_num = cpu_to_je16(sum_records);
+ wpage = tpage;
+
+ while (sum_records) {
+ *(wpage++) = sum_collected->inode;
+ *(wpage++) = sum_collected->version;
+ *(wpage++) = sum_collected->offset;
+ *(wpage++) = sum_collected->totlen;
+ temp = sum_collected;
+ sum_collected = sum_collected->next;
+ free(temp);
+ sum_records--;
+ }
+
+ wpage = (jint32_t *)((char *)wpage + padsize);
+ *(wpage++) = offset;
+ *(wpage++) = magic;
+ isum.sum_crc = cpu_to_je32(crc32(0, tpage, datasize));
+ isum.node_crc = cpu_to_je32(crc32(0, &isum, sizeof(isum) - 8));
+
+ full_write(inode_buffer + ino_ofs, &isum, sizeof(isum), SINODE);
+ full_write(inode_buffer + ino_ofs, tpage, datasize, SINODE);
+
+ free(tpage);
+}
+
+static void full_write(void *target_buff, const void *buf, int len, int nd) {
+ memcpy(target_buff, buf, len);
+
+ if (!nd)
+ ino_ofs += len;
+ else
+ dent_ofs += len;
+}
+
+static void pad(int req, int nd) {
+ if (!nd) {
+
+ while (req) {
+ if (req > sizeof(ffbuf)) {
+ full_write(inode_buffer + ino_ofs, ffbuf, sizeof(ffbuf), nd);
+ req -= sizeof(ffbuf);
+ } else {
+ full_write(inode_buffer + ino_ofs, ffbuf, req, nd);
+ req = 0;
+ }
+ }
+ }
+ else {
+ while (req) {
+ if (req > sizeof(ffbuf)) {
+ full_write(dirent_buffer + dent_ofs, ffbuf, sizeof(ffbuf), nd);
+ req -= sizeof(ffbuf);
+ }
+ else {
+ full_write(dirent_buffer + dent_ofs, ffbuf, req, nd);
+ req = 0;
+ }
+ }
+ }
+}
+
+static inline void padword(int nd) {
+
+ if (!nd){
+ if (ino_ofs % 4) {
+ full_write(inode_buffer + ino_ofs, ffbuf, 4 - (ino_ofs % 4), nd);
+ }
+ }
+ else {
+ if (dent_ofs % 4) {
+ full_write(dirent_buffer + dent_ofs, ffbuf, 4 - (dent_ofs % 4), nd);
+ }
+ }
+}
+
+static inline void pad_block_if_less_than(int req, int nd) {
+ if (!nd) {
+ int datasize = ((sum_records + 1) * sizeof(struct jffs2_inode_sum_record)) + sizeof(struct jffs2_inode_sum_node) + 8;
+ datasize += (4 - (datasize % 4)) % 4;
+ if (ino_ofs + req > erase_block_size - datasize) {
+ dump_sum_records();
+ write_buff_to_file(nd);
+ }
+
+ if (add_cleanmarkers && found_cleanmarkers) {
+ if (!ino_ofs) {
+ full_write(inode_buffer, &cleanmarker, sizeof(cleanmarker), nd);
+ pad(cleanmarker_size - sizeof(cleanmarker), nd);
+ padword(nd);
+ }
+ }
+
+ }
+ else {
+ if (dent_ofs + req > erase_block_size) {
+ pad(erase_block_size - dent_ofs, nd);
+ write_buff_to_file(nd);
+ }
+
+ if (add_cleanmarkers && found_cleanmarkers) {
+ if (!dent_ofs) {
+ full_write(dirent_buffer, &cleanmarker, sizeof(cleanmarker), nd);
+ pad(cleanmarker_size - sizeof(cleanmarker), nd);
+ padword(nd);
+ }
+ }
+ }
+}
+
+void flush_buffers() {
+
+ if ((add_cleanmarkers == 1) && (found_cleanmarkers == 1)) { /* CLEANMARKER */
+ if (ino_ofs != cleanmarker_size) { /* INODE BUFFER */
+
+ int datasize = ((sum_records + 1) * sizeof(struct jffs2_inode_sum_record)) + sizeof(struct jffs2_inode_sum_node) + 8;
+ datasize += (4 - (datasize % 4)) % 4;
+
+ /* If we have a full inode buffer, then write out inode and summary data */
+ if (ino_ofs + sizeof(struct jffs2_raw_inode) + JFFS2_MIN_DATA_LEN > erase_block_size - datasize) {
+ dump_sum_records();
+ write_buff_to_file(SINODE);
+ }
+ /* else just write out inode data */
+ else{
+ pad(erase_block_size - ino_ofs, SINODE);
+ write_buff_to_file(SINODE);
+ }
+ }
+
+ if (dent_ofs != cleanmarker_size) { /* DIRENT AND OTHERS BUFFER */
+ pad(erase_block_size - dent_ofs, SONODE);
+ write_buff_to_file(SONODE);
+ }
+
+ }
+ else { /* NO CLEANMARKER */
+ if (ino_ofs != 0) { /* INODE BUFFER */
+
+ int datasize = ((sum_records + 1) * sizeof(struct jffs2_inode_sum_record)) + sizeof(struct jffs2_inode_sum_node) + 8;
+ datasize += (4 - (datasize % 4)) % 4;
+
+ /* If we have a full inode buffer, then write out inode and summary data */
+ if (ino_ofs + sizeof(struct jffs2_raw_inode) + JFFS2_MIN_DATA_LEN > erase_block_size - datasize) {
+ dump_sum_records();
+ write_buff_to_file(SINODE);
+ }
+ /* Else just write out inode data */
+ else{
+ pad(erase_block_size - ino_ofs, SINODE);
+ write_buff_to_file(SINODE);
+ }
+ }
+
+ if (dent_ofs != 0) { /* DIRENT AND OTHER BUFFER */
+ pad(erase_block_size - dent_ofs, SONODE);
+ write_buff_to_file(SONODE);
+ }
+ }
+}
+
+
+void write_dirent_to_buff(union jffs2_node_union *node) {
+
+ pad_block_if_less_than(je32_to_cpu (node->d.totlen), SONODE);
+ full_write(dirent_buffer + dent_ofs, &(node->d), je32_to_cpu (node->d.totlen), SONODE);
+ padword(SONODE);
+}
+
+void add_sum_entry(union jffs2_node_union *node) {
+
+ sum_storage *walk;
+ sum_storage *temp = (sum_storage *) malloc(sizeof(sum_storage));
+
+ if (!temp)
+ error_msg_and_die("Can't allocate memory for summary information!\n");
+
+ temp->inode = node->i.ino;
+ temp->version = node->i.version;
+ temp->offset = cpu_to_je32(ino_ofs);
+ temp->totlen = node->i.totlen;
+ temp->next = NULL;
+
+ if (!sum_collected) {
+ sum_collected = temp;
+ }
+ else {
+ walk = sum_collected;
+
+ while (walk->next) {
+ walk = walk->next;
+ }
+ walk->next = temp;
+ }
+ sum_records++;
+}
+
+void write_inode_to_buff(union jffs2_node_union *node) {
+
+ pad_block_if_less_than(je32_to_cpu (node->i.totlen), SINODE);
+ add_sum_entry(node); /* Add inode summary entry to summary list */
+ full_write(inode_buffer + ino_ofs, &(node->i), je32_to_cpu (node->i.totlen), SINODE); /* Write out the inode to inode_buffer */
+ padword(SINODE);
+
+}
+
+
+void create_summed_image(int inp_size) {
+ uint8_t *p = file_buffer;
+ union jffs2_node_union *node;
+ uint32_t crc;
+ uint16_t type;
+ int bitchbitmask = 0;
+ int obsolete;
+
+ char name[256];
+
+ while ( p < (file_buffer + inp_size)) {
+
+ node = (union jffs2_node_union*) p;
+
+ /* Skip empty space */
+ if (je16_to_cpu (node->u.magic) == 0xFFFF && je16_to_cpu (node->u.nodetype) == 0xFFFF) {
+ p += 4;
+ continue;
+ }
+
+ if (je16_to_cpu (node->u.magic) != JFFS2_MAGIC_BITMASK) {
+ if (!bitchbitmask++)
+ printf ("Wrong bitmask at 0x%08x, 0x%04x\n", p - file_buffer, je16_to_cpu (node->u.magic));
+ p += 4;
+ continue;
+ }
+
+ bitchbitmask = 0;
+
+ type = je16_to_cpu(node->u.nodetype);
+ if ((type & JFFS2_NODE_ACCURATE) != JFFS2_NODE_ACCURATE) {
+ obsolete = 1;
+ type |= JFFS2_NODE_ACCURATE;
+ } else
+ obsolete = 0;
+
+ node->u.nodetype = cpu_to_je16(type);
+
+ crc = crc32 (0, node, sizeof (struct jffs2_unknown_node) - 4);
+ if (crc != je32_to_cpu (node->u.hdr_crc)) {
+ printf ("Wrong hdr_crc at 0x%08x, 0x%08x instead of 0x%08x\n", p - file_buffer, je32_to_cpu (node->u.hdr_crc), crc);
+ p += 4;
+ continue;
+ }
+
+ switch(je16_to_cpu(node->u.nodetype)) {
+
+ case JFFS2_NODETYPE_INODE:
+ if(verbose)
+ printf ("%8s Inode node at 0x%08x, totlen 0x%08x, #ino %5d, version %5d, isize %8d, csize %8d, dsize %8d, offset %8d\n",
+ obsolete ? "Obsolete" : "",
+ p - file_buffer, je32_to_cpu (node->i.totlen), je32_to_cpu (node->i.ino),
+ je32_to_cpu ( node->i.version), je32_to_cpu (node->i.isize),
+ je32_to_cpu (node->i.csize), je32_to_cpu (node->i.dsize), je32_to_cpu (node->i.offset));
+
+ crc = crc32 (0, node, sizeof (struct jffs2_raw_inode) - 8);
+ if (crc != je32_to_cpu (node->i.node_crc)) {
+ printf ("Wrong node_crc at 0x%08x, 0x%08x instead of 0x%08x\n", p - file_buffer, je32_to_cpu (node->i.node_crc), crc);
+ p += PAD(je32_to_cpu (node->i.totlen));
+ continue;
+ }
+
+ crc = crc32(0, p + sizeof (struct jffs2_raw_inode), je32_to_cpu(node->i.csize));
+ if (crc != je32_to_cpu(node->i.data_crc)) {
+ printf ("Wrong data_crc at 0x%08x, 0x%08x instead of 0x%08x\n", p - file_buffer, je32_to_cpu (node->i.data_crc), crc);
+ p += PAD(je32_to_cpu (node->i.totlen));
+ continue;
+ }
+
+ write_inode_to_buff(node);
+
+ p += PAD(je32_to_cpu (node->i.totlen));
+ break;
+
+ case JFFS2_NODETYPE_DIRENT:
+ memcpy (name, node->d.name, node->d.nsize);
+ name [node->d.nsize] = 0x0;
+
+ if(verbose)
+ printf ("%8s Dirent node at 0x%08x, totlen 0x%08x, #pino %5d, version %5d, #ino %8d, nsize %8d, name %s\n",
+ obsolete ? "Obsolete" : "",
+ p - file_buffer, je32_to_cpu (node->d.totlen), je32_to_cpu (node->d.pino),
+ je32_to_cpu ( node->d.version), je32_to_cpu (node->d.ino),
+ node->d.nsize, name);
+
+ crc = crc32 (0, node, sizeof (struct jffs2_raw_dirent) - 8);
+ if (crc != je32_to_cpu (node->d.node_crc)) {
+ printf ("Wrong node_crc at 0x%08x, 0x%08x instead of 0x%08x\n", p - file_buffer, je32_to_cpu (node->d.node_crc), crc);
+ p += PAD(je32_to_cpu (node->d.totlen));
+ continue;
+ }
+
+ crc = crc32(0, p + sizeof (struct jffs2_raw_dirent), node->d.nsize);
+ if (crc != je32_to_cpu(node->d.name_crc)) {
+ printf ("Wrong name_crc at 0x%08x, 0x%08x instead of 0x%08x\n", p - file_buffer, je32_to_cpu (node->d.name_crc), crc);
+ p += PAD(je32_to_cpu (node->d.totlen));
+ continue;
+ }
+
+ write_dirent_to_buff(node);
+
+ p += PAD(je32_to_cpu (node->d.totlen));
+ break;
+
+ case JFFS2_NODETYPE_CLEANMARKER:
+ if (verbose) {
+ printf ("%8s Cleanmarker at 0x%08x, totlen 0x%08x\n",
+ obsolete ? "Obsolete" : "",
+ p - file_buffer, je32_to_cpu (node->u.totlen));
+ }
+
+ if(!found_cleanmarkers){
+ found_cleanmarkers = 1;
+
+ if(add_cleanmarkers == 1 && use_input_cleanmarker_size == 1){
+ cleanmarker_size = je32_to_cpu (node->u.totlen);
+ setup_cleanmarker();
+ }
+ }
+
+ p += PAD(je32_to_cpu (node->u.totlen));
+ break;
+
+ case JFFS2_NODETYPE_PADDING:
+ if (verbose) {
+ printf ("%8s Padding node at 0x%08x, totlen 0x%08x\n",
+ obsolete ? "Obsolete" : "",
+ p - file_buffer, je32_to_cpu (node->u.totlen));
+ }
+ p += PAD(je32_to_cpu (node->u.totlen));
+ break;
+
+ case 0xffff:
+ p += 4;
+ break;
+
+ default:
+ if (verbose) {
+ printf ("%8s Unknown node at 0x%08x, totlen 0x%08x\n",
+ obsolete ? "Obsolete" : "",
+ p - file_buffer, je32_to_cpu (node->u.totlen));
+ }
+
+ write_dirent_to_buff(node);
+
+ p += PAD(je32_to_cpu (node->u.totlen));
+ }
+ }
+}
+
+int main(int argc, char **argv) {
+
+ int ret;
+
+ process_options(argc,argv);
+
+ if ((in_fd == -1) || (out_fd == -1)) {
+
+ if(in_fd != -1)
+ close(in_fd);
+ if(out_fd != -1)
+ close(out_fd);
+
+ error_msg_and_die("You must specify input and output files!\n");
+ }
+
+ init_buffers();
+
+ while ((ret = load_next_block())) {
+ create_summed_image(ret);
+ }
+
+ flush_buffers();
+ clean_buffers();
+
+ if (in_fd != -1)
+ close(in_fd);
+ if (out_fd != -1)
+ close(out_fd);
+
+ return 0;
+}
* Re: [OBORONA-SPAM] JFFS2 mount time
2004-10-20 14:26 JFFS2 mount time Ferenc Havasi
@ 2004-10-20 15:26 ` Artem B. Bityuckiy
2004-10-20 15:49 ` Ferenc Havasi
2004-10-21 6:29 ` Artem B. Bityuckiy
` (4 subsequent siblings)
5 siblings, 1 reply; 27+ messages in thread
From: Artem B. Bityuckiy @ 2004-10-20 15:26 UTC (permalink / raw)
To: Ferenc Havasi; +Cc: dwmw2, linux-mtd
Ferenc,
I haven't studied your patch closely yet, but after 3 minutes of looking
at it I have two questions:
1. Why did you introduce the JFFS2_SUM_MAGIC constant? As I understand,
the node's magic field is needed to identify the *beginning of node*,
*not the node type*. The type of node is defined by the next field,
called 'nodetype'. You use it (JFFS2_NODETYPE_INODE_SUM). So, IMHO, the
JFFS2_SUM_MAGIC constant doesn't fit into the common rules...
2. This is very minor of course, just a remark. IMHO, it's better to
avoid too many ifdefs, so I think it is unnecessary to place the
function prototype under an ifdef. I mean:
+#ifdef CONFIG_JFFS2_FS_SUMMARY
+static struct jffs2_inode_cache *jffs2_scan_make_ino_cache(struct jffs2_sb_info *c, uint32_t ino);
+#endif
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
* Re: JFFS2 mount time
2004-10-20 15:26 ` [OBORONA-SPAM] " Artem B. Bityuckiy
@ 2004-10-20 15:49 ` Ferenc Havasi
2004-10-20 15:53 ` Artem B. Bityuckiy
0 siblings, 1 reply; 27+ messages in thread
From: Ferenc Havasi @ 2004-10-20 15:49 UTC (permalink / raw)
To: Artem B. Bityuckiy; +Cc: dwmw2, linux-mtd
Hi Artem,
> 1. Why did you introduce the JFFS2_SUM_MAGIC constant? As I understand,
> the node's magic field is needed to identify the *beginning of node*,
> *not the node type*. The type of node is defined by the next field,
> called 'nodetype'. You use it (JFFS2_NODETYPE_INODE_SUM). So, IMHO, the
> JFFS2_SUM_MAGIC constant doesn't fit into the common rules...
The reason is this: the summary node sits at the end of the erase
block, and it does not have a fixed size (its size depends on the
information it stores).
The main advantage of using a summary node is that it avoids the original
scanning method. So we cannot use the original full-scanning method to
determine the beginning of the summary node (using only
JFFS2_NODETYPE_INODE_SUM).
Our method is the following:
- read some bytes at the end of the erase block
- if the last word is JFFS2_SUM_MAGIC then we can be almost sure that it
is an erase block which has a summary
- the word before this magic is the length of the node
- using this length we can check that it is really a
JFFS2_NODETYPE_INODE_SUM node, and process it
I can't imagine a more effective method to determine the beginning of the
summary node (if you have a better suggestion...). And because the magic
is inside the summary node, I think it fits the philosophy of
JFFS2, though it is a little bit tricky.
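For illustration only, the lookup described above could be sketched in C roughly as follows. This is a sketch under stated assumptions, not the code from the patch: the function name and SKETCH_SUM_MAGIC are hypothetical placeholders, the words are read in host byte order (the real code would go through je32_to_cpu()), and the summary node is assumed to end exactly at the end of the erase block.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder value for illustration; the real JFFS2_SUM_MAGIC is whatever
 * the patch defines. */
#define SKETCH_SUM_MAGIC 0x02851885u

/*
 * Return the offset of the summary node inside the erase block, or -1 if
 * the block does not end with a plausible summary trailer.
 * Assumed layout: last 32-bit word = magic, the word before it = total
 * length of the summary node, node ends at the end of the block.
 */
long sketch_find_summary(const uint8_t *block, size_t block_size)
{
    uint32_t magic, len;

    if (block_size < 8)
        return -1;

    /* memcpy avoids unaligned access; host byte order is an assumption */
    memcpy(&magic, block + block_size - 4, sizeof(magic));
    memcpy(&len, block + block_size - 8, sizeof(len));

    if (magic != SKETCH_SUM_MAGIC)
        return -1;              /* no summary: fall back to a full scan */
    if (len < 8 || len > block_size)
        return -1;              /* length cannot describe a node in this block */

    return (long)(block_size - len);
}
```

The caller would still have to check the node type and CRC of the node found at the returned offset before trusting it, as the last step of the list says.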
> 2. This is very minor of course, just a remark. IMHO, its better to
> avoid too many ifdefs, so, I think it is unnecessary to place the
> function prototype under ifdef. I mead:
>
> +#ifdef CONFIG_JFFS2_FS_SUMMARY
> +static struct jffs2_inode_cache *jffs2_scan_make_ino_cache(struct
> jffs2_sb_info *c, uint32_t ino);
> +#endif
Yes, I agree. I will modify it.
Bye,
Ferenc
* Re: JFFS2 mount time
2004-10-20 15:49 ` Ferenc Havasi
@ 2004-10-20 15:53 ` Artem B. Bityuckiy
0 siblings, 0 replies; 27+ messages in thread
From: Artem B. Bityuckiy @ 2004-10-20 15:53 UTC (permalink / raw)
To: Ferenc Havasi; +Cc: linux-mtd
Ferenc Havasi wrote:
> The reason is the following: the summary node is at the end of the erase
> block, and it has not fixed size (its size depends on the information it
> stores).
>
> The main advantage of using summary node is to avoid the original
> scanning method. So we cannot use the original full-scanning method to
> determine the begining of the summary node (using only
> JFFS2_NODETYPE_INODE_SUM).
>
> Our method is the following:
> - read some bytes at the end of the erase block
> - if the last word is JFFS2_SUM_MAGIC than we will almost sure that it
> is an erase block which has summary
> - the word before this magic is the length of the node
> - using this length we can check that it is really a
> JFFS2_NODETYPE_INODE_SUM node, and process it
>
> I can't image more effective method to determine the begining of the
> summary node. (if you have better suggestion...) And because the magic
> is inside of the summary node I think it is fit to the philosophy of
> JFFS2 - but a little bit tricky.
Ok, I got it. I was wrong, sorry.
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
* Re: JFFS2 mount time
2004-10-20 14:26 JFFS2 mount time Ferenc Havasi
2004-10-20 15:26 ` [OBORONA-SPAM] " Artem B. Bityuckiy
@ 2004-10-21 6:29 ` Artem B. Bityuckiy
2004-10-21 6:54 ` Ferenc Havasi
2004-10-21 7:30 ` JFFS2 mount time - more Artem B. Bityuckiy
` (3 subsequent siblings)
5 siblings, 1 reply; 27+ messages in thread
From: Artem B. Bityuckiy @ 2004-10-21 6:29 UTC (permalink / raw)
To: Ferenc Havasi; +Cc: dwmw2, linux-mtd, jffs-dev
Hello Ferenc,
As I understand it, you only prepare a JFFS2 image with summaries. This is
fine as long as nothing changes. For read-only file systems this
is OK.
But what if files/direntries are changed or deleted? Do you write summary
information dynamically? How are you going to place nodes/direntries into
different blocks dynamically?
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
* Re: JFFS2 mount time
2004-10-21 6:29 ` Artem B. Bityuckiy
@ 2004-10-21 6:54 ` Ferenc Havasi
2004-10-21 7:16 ` Artem B. Bityuckiy
0 siblings, 1 reply; 27+ messages in thread
From: Ferenc Havasi @ 2004-10-21 6:54 UTC (permalink / raw)
To: Artem B. Bityuckiy; +Cc: dwmw2, linux-mtd, jffs-dev
Hi Artem,
> As I understand, you only prepare JFFS2 image with summaries. This is
> great until we do not change anything. For read-only file-systems this
> is OK.
>
> But what if files/direntries are changed/deleted ? Do you write summary
> information dynamically? How are you going to place nodes/direntries to
> different blocks dynamically?
You are right, there is a small change which is really important (and
will be ready very soon): extending jffs2_mark_node_obsolete() to mark
not only the node but also its entry in the summary.
Any other improvement can be done later, because after that the filesystem
will always be coherent: we write the summary only at the end of the
erase blocks, when they are fully "finished". So if there is a summary
somewhere we will not need to extend it, only to mark the obsolete nodes.
We also plan in the near future to implement generating the
summary dynamically when the filesystem finishes an erase block, which
keeps this "fast mount time" permanent.
Bye,
Ferenc
* Re: JFFS2 mount time
2004-10-21 6:54 ` Ferenc Havasi
@ 2004-10-21 7:16 ` Artem B. Bityuckiy
2004-10-21 19:50 ` Ferenc Havasi
0 siblings, 1 reply; 27+ messages in thread
From: Artem B. Bityuckiy @ 2004-10-21 7:16 UTC (permalink / raw)
To: Ferenc Havasi; +Cc: dwmw2, linux-mtd, jffs-dev
> You are right, there is a small change which is really important (and
> will be ready very soon) to extend jffs2_mark_node_obsolete() to mark
> not only the node but also its entry in the summary.
Unfortunately, you cannot mark entries as obsolete in your summary
node in the case of NAND.
If you write your summary only for *full* blocks, you will not need to
mark entries obsolete, even if you have NOR flash (but you can on NOR).
The partially filled blocks must not have a summary node (you can
introduce a special marker and write it to the OOB of the last page on
NAND, or the last word of the sector on NOR, which tells whether a summary
node is present).
So, fully filled blocks will have a summary and will be scanned very
quickly, partially filled ones will have no summary and will be fully
scanned, free blocks will have cleanmarkers and will not be scanned, and
other blocks will be either erased or considered free.
>
> Any other improvement can be done later, because after it the filesystem
> will be always coherent, because we write summary only at the of the
> erasy blocks, when it is fully "finished" - so if there is a summary
> somewhere we will not need to extend it, only to mark the obscolated nodes.
Yes, nice, but why do you need to mark obsolete nodes in the summary?
When you insert nodes into the fragtree or dirents into the list, the JFFS2
code will detect obsolete nodes automatically; there is no need to mark
them physically.
>
> We also plan in the near future to implement the ability of generating
> summary dinamically when the filesystem finishes an erase block - which
> keep this "fast mount time" permament.
This would be perfect.
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
* Re: JFFS2 mount time
2004-10-21 7:16 ` Artem B. Bityuckiy
@ 2004-10-21 19:50 ` Ferenc Havasi
0 siblings, 0 replies; 27+ messages in thread
From: Ferenc Havasi @ 2004-10-21 19:50 UTC (permalink / raw)
To: Artem B. Bityuckiy; +Cc: dwmw2, linux-mtd, jffs-dev
Hi Artem,
> Unfortunately, you can not mark entries as obsoleted in your summary
> node in case of NAND.
>
> If you write your summary only for *full* blocks, you will not need to
> mark entries obsoleted, even if you have NOR flash (but you can on NOR).
> The partially filled blocks must not have the summary node (you can
> introduce special marker and write it to OOB of the last page of
> NAND/last word of sector on NOR which tells if there is the summary node
> present).
Really, you are right.
So we only have to solve this problem on NOR. I think the easiest
solution is to set jffs2_can_mark_obsolete() to false if the summary
support is enabled.
Bye,
Ferenc
* JFFS2 mount time - more
2004-10-20 14:26 JFFS2 mount time Ferenc Havasi
2004-10-20 15:26 ` [OBORONA-SPAM] " Artem B. Bityuckiy
2004-10-21 6:29 ` Artem B. Bityuckiy
@ 2004-10-21 7:30 ` Artem B. Bityuckiy
[not found] ` <41776351.4040204@yandex.ru>
` (2 subsequent siblings)
5 siblings, 0 replies; 27+ messages in thread
From: Artem B. Bityuckiy @ 2004-10-21 7:30 UTC (permalink / raw)
To: linux-mtd
Ferenc,
I have 3 more questions.
1. How large are your summary nodes (on average) for blocks full of
dirents/nodes?
2. Why don't you use compression for them?
3. Why did you introduce a new tool instead of just adding new options to
mkfs.jffs2?
--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.
[parent not found: <41776351.4040204@yandex.ru>]
* Re: JFFS2 mount time - 3 more questions
[not found] ` <41776351.4040204@yandex.ru>
@ 2004-10-21 7:39 ` Ferenc Havasi
0 siblings, 0 replies; 27+ messages in thread
From: Ferenc Havasi @ 2004-10-21 7:39 UTC (permalink / raw)
To: Artem B. Bityuckiy; +Cc: dwmw2, linux-mtd, jffs-dev
Hi Artem,
> 1. How large are your summary nodes (in average) for blocks full of
> dirents/nodes ?
It heavily depends on
- the size of the erase block
- the sizes of the nodes
It is 4 words for every jffs2_raw_inode. Dirents are stored separately,
without summary.
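As a rough worked example of the "4 words per jffs2_raw_inode" figure, the arithmetic could be sketched as below. This is only an estimate under assumptions: a word is taken to be 32 bits (matching the four fields sumtool records: inode, version, offset, totlen), the function names are hypothetical, and any fixed header or trailer of the summary node is ignored because its size is not stated here.

```c
#include <stdint.h>

/* One summary record: ino, version, offset, totlen (assumed 32 bits each) */
#define SKETCH_SUM_ENTRY_BYTES (4u * (uint32_t)sizeof(uint32_t))

/* Bytes of summary records needed for `inode_nodes` nodes in one block. */
uint32_t sketch_summary_record_bytes(uint32_t inode_nodes)
{
    return inode_nodes * SKETCH_SUM_ENTRY_BYTES;
}

/*
 * Upper bound on the number of inode nodes of `node_bytes` each that fit
 * in a block of `block_bytes` when every node also costs one record.
 */
uint32_t sketch_max_inodes_per_block(uint32_t block_bytes, uint32_t node_bytes)
{
    return block_bytes / (node_bytes + SKETCH_SUM_ENTRY_BYTES);
}
```

For a 128KiB erase block filled with data-less 68-byte inode nodes this gives at most 1560 nodes, i.e. about 24KiB of records in the worst case; blocks holding larger nodes need proportionally less.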
> 2. Why do not you use compression for them?
To make boot time as fast as possible :) But not a bad idea. If someone
needs it we can make a new option.
> 3. Why did you introduce new tool instead of just adding new options to
> the mkfs.jffs2 ?
I think it is a "nicer", cleaner design, and this separation makes
reordering the nodes much easier.
Bye,
Ferenc
* Re: JFFS2 mount time
2004-10-20 14:26 JFFS2 mount time Ferenc Havasi
` (3 preceding siblings ...)
[not found] ` <41776351.4040204@yandex.ru>
@ 2004-10-21 12:49 ` Jarkko Lavinen
2004-10-21 19:11 ` Ferenc Havasi
2004-10-22 9:58 ` Ferenc Havasi
2004-10-21 13:24 ` David Woodhouse
5 siblings, 2 replies; 27+ messages in thread
From: Jarkko Lavinen @ 2004-10-21 12:49 UTC (permalink / raw)
Cc: dwmw2, linux-mtd
On Wed, Oct 20, 2004 at 04:26:27PM +0200, ext Ferenc Havasi wrote:
> Jarkko made a measurement on a real NAND device: his JFFS2 image was
> 120819928 (115M), after running sumtool the new image was 123338752 (117M).
>
> Using the original mount time was 55 sec, with the new image it is only
> 8.5 sec.
My initial test was only about the mount time. I have now also tried
to exercise the patched file system and with very little testing I get
CRC or ECC errors.
# mount /dev/mtdblock2 /mnt -t jffs2
# mkdir /mnt/testdir
# umount /mnt
jffs2_flush_wbuf(): Write failed with -5
# mount /dev/mtdblock2 /mnt -t jffs2
mtd->read(0x1fbec bytes from 0x1fc0414) returned ECC error
Empty flash at 0x01fc2f1c ends at 0x01fc3000
#
With plain 2.6.9-rc4-omap1 with fresh CVS MTD code, I don't see anything
weird occurring.
Jarkko Lavinen
* Re: JFFS2 mount time
2004-10-21 12:49 ` JFFS2 mount time Jarkko Lavinen
@ 2004-10-21 19:11 ` Ferenc Havasi
2004-10-22 9:58 ` Ferenc Havasi
1 sibling, 0 replies; 27+ messages in thread
From: Ferenc Havasi @ 2004-10-21 19:11 UTC (permalink / raw)
To: Jarkko Lavinen; +Cc: dwmw2, Kluba Patrik, linux-mtd
Hi Jarkko,
> My initial test was only about the mount time. I have now also tried
> to exercise the patched file system and with very little testing I get
> CRC or ECC errors.
>
> # mount /dev/mtdblock2 /mnt -t jffs2
> # mkdir /mnt/testdir
> # umount /mnt
> jffs2_flush_wbuf(): Write failed with -5
> # mount /dev/mtdblock2 /mnt -t jffs2
> mtd->read(0x1fbec bytes from 0x1fc0414) returned ECC error
> Empty flash at 0x01fc2f1c ends at 0x01fc3000
> #
>
> With plain 2.6.9-rc4-omap1 with fresh CVS MTD code, I don't see anything
> weird occuring.
Can you send me the full kernel log file?
Another interesting test would be to try the new image (sumtool) with
the CVS MTD code. Because the summary is an RWCOMPAT_DELETE node it should
work well. It would be nice to know whether there are CRC/ECC errors in
this case.
Thanks,
Ferenc
* Re: JFFS2 mount time
2004-10-21 12:49 ` JFFS2 mount time Jarkko Lavinen
2004-10-21 19:11 ` Ferenc Havasi
@ 2004-10-22 9:58 ` Ferenc Havasi
1 sibling, 0 replies; 27+ messages in thread
From: Ferenc Havasi @ 2004-10-22 9:58 UTC (permalink / raw)
To: Jarkko Lavinen; +Cc: dwmw2, linux-mtd
Jarkko Lavinen wrote:
> My initial test was only about the mount time. I have now also tried
> to exercise the patched file system and with very little testing I get
> CRC or ECC errors.
>
> # mount /dev/mtdblock2 /mnt -t jffs2
> # mkdir /mnt/testdir
> # umount /mnt
> jffs2_flush_wbuf(): Write failed with -5
> # mount /dev/mtdblock2 /mnt -t jffs2
> mtd->read(0x1fbec bytes from 0x1fc0414) returned ECC error
> Empty flash at 0x01fc2f1c ends at 0x01fc3000
> #
>
> With plain 2.6.9-rc4-omap1 with fresh CVS MTD code, I don't see anything
> weird occuring.
We are trying to find out what happens here...
Jarkko previously sent me some more detail; the log starts with:
> sh-2.05b# /rootfstest.sh
> Mounting file system: Ok
> Creating a test directory: Ok
> Creating a test file: mtd->read(0x44 bytes from 0x1fa344c) returned ECC error
> Data CRC 6d0b1da8 != calculated CRC 9c4f3838 for node at 01fa344c
> mtd->read(0x44 bytes from 0x1fa3e20) returned ECC error
> Data CRC 5127cb7f != calculated CRC 057e127c for node at 01fa3e20
> mtd->read(0x44 bytes from 0x1fa4388) returned ECC error
> mtd->read(0x44 bytes from 0x1fa5cb8) returned ECC error
> mtd->read(0x44 bytes from 0x1fa6748) returned ECC error
> mtd->read(0x44 bytes from 0x1fa7b30) returned ECC error
> Data CRC 72b41a04 != calculated CRC ebc121db for node at 01fa7b30
> mtd->read(0x44 bytes from 0x1fa866c) returned ECC error
> Data CRC 9ff1d419 != calculated CRC cb2cce56 for node at 01fa866c
It means the mount completed successfully. The first problem occurs when
the filesystem tries to read jffs2_raw_inode nodes (if I am right, 0x44
is the size of that structure). The read is not successful (I don't know
why), and it causes CRC errors too. The only difference should be that the
original version has already read these 0x44 bytes before (at mount time),
while the summary version has not read them yet; it just knows where they
are from the summary.
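The 0x44 guess can be checked with a small C sketch. The struct below mirrors the field layout of struct jffs2_raw_inode as I understand it from the JFFS2 headers, but with plain fixed-width integers instead of the jint16_t/jint32_t wrappers; the struct and function names are hypothetical, and the layout should be compared against the jffs2.h actually in use.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed mirror of the on-flash inode node header (no data payload). */
struct sketch_raw_inode {
    uint16_t magic;      /* JFFS2_MAGIC_BITMASK */
    uint16_t nodetype;   /* JFFS2_NODETYPE_INODE */
    uint32_t totlen;     /* header plus data, before padding */
    uint32_t hdr_crc;
    uint32_t ino;
    uint32_t version;
    uint32_t mode;
    uint16_t uid;
    uint16_t gid;
    uint32_t isize;
    uint32_t atime;
    uint32_t mtime;
    uint32_t ctime;
    uint32_t offset;     /* where in the file this node's data begins */
    uint32_t csize;      /* compressed data size */
    uint32_t dsize;      /* uncompressed data size */
    uint8_t  compr;
    uint8_t  usercompr;
    uint16_t flags;
    uint32_t data_crc;
    uint32_t node_crc;
};

/* Size of the header that precedes the node data. */
size_t sketch_raw_inode_header_size(void)
{
    return sizeof(struct sketch_raw_inode);
}
```

With this layout the header is 68 bytes, which is 0x44, so the failing reads in the log are consistent with header-only reads of inode nodes.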
Jarkko, one more interesting thing (if you still have that image) would be
to see what is at 0x1fa344c, 0x01fa3e20, ... with the jffs2dump tool.
If anyone has any ideas, they are welcome. Unfortunately we don't have a
real NAND device, and it works with our emulator.
Thanks,
Ferenc
* Re: JFFS2 mount time
2004-10-20 14:26 JFFS2 mount time Ferenc Havasi
` (4 preceding siblings ...)
2004-10-21 12:49 ` JFFS2 mount time Jarkko Lavinen
@ 2004-10-21 13:24 ` David Woodhouse
2004-10-21 20:05 ` Ferenc Havasi
5 siblings, 1 reply; 27+ messages in thread
From: David Woodhouse @ 2004-10-21 13:24 UTC (permalink / raw)
To: Ferenc Havasi; +Cc: linux-mtd, jffs-dev
On Wed, 2004-10-20 at 16:26 +0200, Ferenc Havasi wrote:
> Dear All,
>
> Here is the latest version of our mount time improvement.
It's looking good, but the kernel really needs to be able to write these
summaries for _itself_ in order to give a real improvement over the long
term. If the file system has to be read-only we might as well be using
cramfs, and if the summary becomes obsolete over time we might as well
not bother in a lot of cases.
--
dwmw2
* Re: JFFS2 mount time
2004-10-21 13:24 ` David Woodhouse
@ 2004-10-21 20:05 ` Ferenc Havasi
2004-10-22 12:44 ` Artem Bityuckiy
0 siblings, 1 reply; 27+ messages in thread
From: Ferenc Havasi @ 2004-10-21 20:05 UTC (permalink / raw)
To: David Woodhouse; +Cc: linux-mtd, jffs-dev
David Woodhouse wrote:
> It's looking good, but the kernel really needs to be able to write these
> summaries for _itself_ in order to give a real improvement over the long
> term. If the file system has to be read-only we might as well be using
> cramfs, and if the summary becomes obsolete over time we might as well
> not bother in a lot of cases.
Our plan for it:
We would like to store some additional information in the jeb struct:
- type information, where the type can be INODE_ONLY or
ANYTHING_OTHER. This information is easy to detect at mount time.
- a predicted summary size (calculated dynamically). It will be used to
decide when to generate the summary. Certainly only for INODE_ONLY
erase blocks.
If I am right, every node allocation is done by jffs2_reserve_space(). We
would like to modify it, and introduce a new interface next to it called
jffs2_reserve_space_for_inode(). Every function that stores inodes
(there are not too many, I think) should call
jffs2_reserve_space_for_inode() with some extra information (inode
number...).
jffs2_reserve_space() should use only ANYTHING_OTHER erase blocks, and
jffs2_reserve_space_for_inode() only INODE_ONLY ones. If there is no
free space in them, it should use the usual technique to find a clean
erase block and start to store the new node in it.
Generating the summary is also the task of
jffs2_reserve_space_for_inode(): if the new inode (+summary) does not fit
in the erase block, it will generate the summary.
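To make the "does the new inode plus summary still fit" decision concrete, here is a minimal C sketch. It is not the planned kernel code: the struct, the field names and the fixed-overhead constant are hypothetical, and the 16-byte record size is the "4 words per inode" figure assumed to mean 32-bit words.

```c
#include <stdint.h>

#define SKETCH_SUM_ENTRY_BYTES 16u  /* ino, version, offset, totlen */
#define SKETCH_SUM_FIXED_BYTES 32u  /* hypothetical header/trailer overhead */

/* Hypothetical per-erase-block bookkeeping, standing in for the jeb struct. */
struct sketch_eraseblock {
    uint32_t size;         /* total bytes in the erase block */
    uint32_t used;         /* bytes already occupied by nodes */
    uint32_t sum_entries;  /* inode nodes recorded so far */
    int      inode_only;   /* 1 = INODE_ONLY, 0 = ANYTHING_OTHER */
};

/* Predicted size of the summary if it were written right now. */
uint32_t sketch_predicted_summary_size(const struct sketch_eraseblock *jeb)
{
    return SKETCH_SUM_FIXED_BYTES + jeb->sum_entries * SKETCH_SUM_ENTRY_BYTES;
}

/*
 * Return 1 if an inode node of `node_len` bytes can be stored and the
 * summary (including the record for the new node) still fits afterwards.
 * A return of 0 means: write the summary now and pick another block.
 */
int sketch_inode_fits(const struct sketch_eraseblock *jeb, uint32_t node_len)
{
    uint32_t needed;

    if (!jeb->inode_only || jeb->used > jeb->size)
        return 0;

    needed = node_len + sketch_predicted_summary_size(jeb)
           + SKETCH_SUM_ENTRY_BYTES;
    return needed <= jeb->size - jeb->used;
}
```

Keeping the predicted size as a running value in the block structure means the check stays a constant-time comparison on every allocation.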
What do you think?
Regards,
Ferenc
* Re: JFFS2 mount time
2004-10-21 20:05 ` Ferenc Havasi
@ 2004-10-22 12:44 ` Artem Bityuckiy
2004-10-25 9:36 ` Ferenc Havasi
2004-10-26 9:29 ` Jarkko Lavinen
0 siblings, 2 replies; 27+ messages in thread
From: Artem Bityuckiy @ 2004-10-22 12:44 UTC (permalink / raw)
To: Ferenc Havasi; +Cc: linux-mtd, David Woodhouse, jffs-dev
Hello Ferenc,
First, please let me describe your design briefly, to be sure I
understand it and that we are both thinking the same way.
Essentially, your design is based on the fact that you do not want to
reference directory entries in the summary nodes. The motivation is that
you would keep almost a full copy of the direntries in the summary, thus:
1. duplicating too much information;
2. giving, as you suppose, no mount speed acceleration.
So, for this purpose you are going to distribute the inode nodes and the
other nodes (including direntry nodes) into different blocks. Blocks that
contain only inode nodes will have summaries; other blocks will not.
I think this is not the best solution. Why? In general, because I do not
like the following:
A. Your idea to distribute inode nodes and other nodes between different
blocks.
B. Your assumption that the directory information in summaries will not
affect the mount time.
The following are reasons concerning item A.
1. Your change will affect JFFS2 very heavily. You will introduce a
restriction into JFFS2. Other improvements may not work with such a
restriction. Now all the blocks are equivalent. But you want to
distinguish between two kinds of blocks. Don't you think it is too
complicated a decision?
2. Think about the wear-leveling. In JFFS it was ideal. In JFFS2 it is
good, but not so ideal. On average, the inode nodes are changed more
often (just think about FIFOs, we talked about them on this list
recently). So, you will need to garbage collect the INODE_ONLY blocks
more often. So, I am afraid the wear-leveling will suffer from your
improvement.
3. Imagine a file system with *lots* of very small files. In this case,
the direntries portion on the media will be large. And the
mount time of such a file system will not be improved very much.
4. It seems to me you will need to increase the number of blocks which
are reserved for garbage collection (double?). This is also a minor
drawback.
The following are reasons concerning item B.
I believe that if we have directory references in summaries, this will
increase the mount speed.
1. First, we will store less data! We don't need to keep the common
headers, CRCs and mctimes.
2. Second, we may compress the summary (direntries aren't compressed)!
3. And third, on NAND there is a difference between reading lots of
different pages and reading a few pages.
I propose another design.
1. Keep direntry references in summaries too, and hence do not
distinguish between blocks with inode nodes and direntries.
2. Compress summaries.
So, you will avoid a lot of problems related to teaching the GC to
distinguish between different blocks. This will be more natural. I
believe summaries must reference *any* node in the block. This is a simpler
and cleaner design.
Why do you not like this?
I see only one potential problem: direntries may have long names (up to
255 symbols). This may lead to large summaries.
But in this case we may:
1. Improve JFFS2 itself. Keep, say, only 20 characters in the
full_dirent structure. Most direntries will fit. For the others, we will
just read the flash.
2. Leave JFFS2 untouched, and keep only 20 characters in summaries. For
the other direntries, we may read them from flash (keeping their flash
offsets instead of names).
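A possible shape for such a summary record, as a C sketch only: the struct, its field names and the helper are hypothetical and do not come from any patch. For simplicity the sketch always stores the flash offset and uses a flag to say whether the name is held inline, rather than overlaying the offset on the name as point 2 suggests.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_SUM_NAME_MAX 20  /* the "say, 20 characters" limit from above */

/* Hypothetical summary record for one directory entry. */
struct sketch_sum_dirent {
    uint32_t offset;       /* flash offset of the full dirent node */
    uint8_t  nsize;        /* real length of the name */
    uint8_t  name_inline;  /* 1 if name[] holds the complete name */
    char     name[SKETCH_SUM_NAME_MAX];
};

/*
 * Fill a record. Names longer than the limit are not copied; the reader
 * would have to fetch them from flash at `offset`.
 */
void sketch_fill_sum_dirent(struct sketch_sum_dirent *e, const char *name,
                            uint8_t nsize, uint32_t offset)
{
    memset(e, 0, sizeof(*e));
    e->offset = offset;
    e->nsize = nsize;
    if (nsize <= SKETCH_SUM_NAME_MAX) {
        memcpy(e->name, name, nsize);
        e->name_inline = 1;
    }
}
```

With a fixed 20-byte name field every record has the same size, which keeps the summary easy to index at the cost of wasted bytes for short names.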
Comments?
--
Best regards, Artem B. Bityuckiy
Oktet Labs (St. Petersburg), Software Engineer.
+78124286709 (office) +79112449030 (mobile)
E-mail: dedekind@oktetlabs.ru, web: http://www.oktetlabs.ru
* Re: JFFS2 mount time
2004-10-22 12:44 ` Artem Bityuckiy
@ 2004-10-25 9:36 ` Ferenc Havasi
2004-10-25 10:56 ` Artem Bityuckiy
2004-10-25 11:21 ` Artem Bityuckiy
2004-10-26 9:29 ` Jarkko Lavinen
1 sibling, 2 replies; 27+ messages in thread
From: Ferenc Havasi @ 2004-10-25 9:36 UTC (permalink / raw)
To: dedekind, David Woodhouse; +Cc: linux-mtd, jffs-dev
Hi Artem,
> So, for this purpose you are going to distribute the inode nodes and
> the other nodes (including direntry nodes) across different blocks. Those
> blocks that contain only inode nodes will have summaries; other blocks will not.
Yes, I think there are three kinds of nodes:
- type A contains a significant amount of data which is not needed at mount
time (jffs2_raw_inode)
- type B is (almost) fully needed at mount time (jffs2_raw_dirent)
- type C is anything else (unknown, future developments...)
To achieve as much mount time speed up as possible I think we should
distinguish them.
With a summary, the really significant speed-up comes only from type A
nodes. We can also generate a summary for type B, but then (as you wrote)
a significant part of the information will be duplicated.
So we would like to introduce two kinds of erase blocks:
- erase blocks with a summary: they will store (for now only) type A nodes,
maybe later some of type B
- erase blocks without a summary: they will store all type C nodes and the
type B nodes that are not stored in blocks with a summary
> 1. Your change will affect JFFS2 very heavily. You will introduce a
> restriction into JFFS2. Other improvements may not work with such a
> restriction. Now all the blocks are equivalent. But you want to
> distinguish between two kinds of blocks. Don't you think it is too
> complicated a decision?
What kind of restriction do you mean? We don't introduce any
restrictions. The "type C" kind of nodes are processed as before, using
the usual scanning method. If you want to force every node to have a
representation in the summary, that would be a restriction.
I think for some kinds of nodes a summary is meaningful, and for some kinds
not.
If we mix them, that can be a very big slowdown if you want to process
them only by making a reference in the summary to their offset, because
if you (for example) want to read only 50 bytes (the size of the node) you
will have to read 512/2048 bytes depending on the flash (where mostly
there will be inode nodes, which do not need to be read because they are
in the summary).
But if all of these "not summarized, small" nodes are stored in a
"separate" erase block, then this 512/2048-byte read will not be
wasted (because the remaining 462-1998 bytes will also hold
relevant information which is not in the summary).
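The 462 and 1998 figures follow directly from the page size. A minimal check, assuming the node does not straddle a page boundary and that a whole page is the smallest unit that can be read:

```python
# Cost of reading one small node from NAND when only whole pages can be read.
from typing import Tuple


def page_read_cost(node_size: int, page_size: int) -> Tuple[int, int]:
    """Return (bytes_read, bytes_read_beyond_the_node)."""
    pages = -(-node_size // page_size)  # ceiling division
    bytes_read = pages * page_size
    return bytes_read, bytes_read - node_size


for page_size in (512, 2048):
    read, extra = page_read_cost(50, page_size)
    print(f"page size {page_size}: read {read} bytes, {extra} beyond the node")
```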
> 2. Think about the wear-leveling. In JFFS it was ideal. In JFFS2 it is
> good, but not so ideal. On average, the inode nodes are changed more
> often (just think about FIFOs, we talked about them on this list
> recently). So, you will need to garbage collect the NODE_ONLY blocks
> more often. So, I am afraid the wear-leveling will suffer from your
> improvement.
I think the GC solves it "automatically". This mark
(SUMMARIZED/NOT_SUMMARIZED) is not a permanent thing; it is assigned "pseudo
randomly".
I agree that it causes somewhat different behavior in wear-leveling, but I
don't think it makes it significantly worse.
> 4. It seems to me you will need to increase the number of blocks which
> are reserved for the garbage collection (double ?). This is also a minor
> drawback.
I don't understand what you mean here.
> I believe that if we have directory references in summaries, this will
> increase the mount speed.
> 1. First, we will store less data! We don't need to keep the common
> headers, CRCs and mctimes.
> 2. Second, we may compress the summary (direntries aren't compressed)!
> 3. Third, on NAND there is a difference between reading many
> different pages and reading a few pages.
Yes, we should try it - storing dirents in SUMMARIZED erase blocks. But
that can be an improvement for later; first we need a stable, well-working
system - and this is urgent for us now.
> 2. Compress summaries.
It makes it harder to determine the optimal time for summary generation (it
is easy to see the summary size, but here the compressed size is the
relevant one). It can give a smaller image but may cause some slowdown, too.
We may introduce it later as an option.
So now we have two open questions:
- is the SUMMARIZED / NOT_SUMMARIZED distinction good or not
- in the first version, do we need dirents in the summary or not
Fortunately the effects (and side effects) of these improvements will be
active only if the new kernel option is enabled, and they don't rule out any
other future improvements.
I am curious about (at least) David's opinion on these topics.
Bye,
Ferenc
* Re: JFFS2 mount time
2004-10-25 9:36 ` Ferenc Havasi
@ 2004-10-25 10:56 ` Artem Bityuckiy
2004-10-25 15:30 ` Ferenc Havasi
2004-10-25 11:21 ` Artem Bityuckiy
1 sibling, 1 reply; 27+ messages in thread
From: Artem Bityuckiy @ 2004-10-25 10:56 UTC (permalink / raw)
To: Ferenc Havasi; +Cc: linux-mtd, David Woodhouse, jffs-dev
Hello Ferenc,
> Yes, I think there are three kinds of nodes:
> - type A contains a significant amount of data which is not needed at mount
> time (jffs2_raw_inode)
> - type B is (almost) fully needed at mount time (jffs2_raw_dirent)
> - type C is anything else (unknown, future developments...)
>
> To achieve as much mount time speed up as possible I think we should
> distinguish them.
This is what I really do not like.
OK, let us discuss only this topic now. Let me explain why I believe it is
bad and very *unnatural* to introduce two or more kinds of blocks.
An example of a JFFS2 change that I consider natural is the introduction
of a new node type. It is natural because, when JFFS2 was designed,
this possibility was foreseen and taken into account. It is relatively
easy to do, and it can be done without affecting other
things in JFFS2.
Conversely, introducing several block types was not foreseen in the
JFFS2 design, and everything in JFFS2 is coded with the assumption that
all the blocks are equivalent.
This is my point of view on the issue in general.
Now I will try to illustrate why I think so.
1. In JFFS2 there are several lists of blocks - clean_list, dirty_list,
very_dirty_list, etc. Are you going to introduce clean_list_typeA,
dirty_list_typeA, very_dirty_list_typeA, clean_list_typeB,
dirty_list_typeB, very_dirty_list_typeB?
2. Just do 'grep "_list" * | grep -e "\(dirty\)\|\(very\)"' and see how
many places in JFFS2 change these lists. Do you think it is
natural to introduce 3 more lists? I believe not. What if somebody else
introduces one more type of block?
3. There is a write buffer in JFFS2 which is used in the case of NAND. Are
you going to have two wbufs? This is also a significant change.
4. Now the GC just takes one block and moves all the valid nodes to
another one. In your case (if you have a JFFS2 image which was created
by older code, without your patch, where all node types are mixed),
you will need to move one type of node to one block and the other type
to another block.
So, I think you will need to change many things in JFFS2. You
risk opening a can of worms.
So, do you agree that this change is *unnatural*?
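Point 4 can be pictured with a toy model. This is only an illustration under simplified assumptions (nodes are plain tuples, blocks are dictionary keys); it is not how the JFFS2 garbage collector is written.

```python
# Toy model: garbage-collecting one mixed block, with and without typed blocks.
from collections import defaultdict


def gc_destinations(valid_nodes, typed_blocks):
    """Group the valid nodes of one block by the block they must move to.

    valid_nodes is a list of (node_type, node_id) pairs. With untyped
    blocks every node goes to the single current write block; with typed
    blocks each node type needs its own destination block.
    """
    dest = defaultdict(list)
    for node_type, node_id in valid_nodes:
        key = node_type if typed_blocks else "nextblock"
        dest[key].append(node_id)
    return dict(dest)


mixed = [("inode", 1), ("dirent", 2), ("inode", 3)]
print(gc_destinations(mixed, typed_blocks=False))
print(gc_destinations(mixed, typed_blocks=True))
```

With typed blocks the same pass needs two open destination blocks instead of one, which is where the second write buffer and the extra reserved blocks come from.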
===================================================================
>> 4. It seems to me you will need to increase the number of blocks
>> which are reserved for the garbage collection (double ?). This is also a
>> minor drawback.
> I don't understand what you mean here.
I mean sb->resv_blocks_gcmerge and the related values. You will need to
increase them, which is not very good.
--
Best regards, Artem B. Bityuckiy
Oktet Labs (St. Petersburg), Software Engineer.
+78124286709 (office) +79112449030 (mobile)
E-mail: dedekind@oktetlabs.ru, web: http://www.oktetlabs.ru
* Re: JFFS2 mount time
2004-10-25 10:56 ` Artem Bityuckiy
@ 2004-10-25 15:30 ` Ferenc Havasi
2004-10-26 9:59 ` Artem Bityuckiy
0 siblings, 1 reply; 27+ messages in thread
From: Ferenc Havasi @ 2004-10-25 15:30 UTC (permalink / raw)
To: dedekind; +Cc: linux-mtd, David Woodhouse, jffs-dev
Hi Artem,
> > To achieve as much mount time speed up as possible I think we should
> > distinguish them.
> This is what I really do not like.
>
> OK, let us discuss only this topic now. Let me explain why I believe it is
> bad and very *unnatural* to introduce two or more kinds of blocks.
You are right, it can be unnatural from the point of view of the original
design of JFFS2. But I think that, from the point of view of how this
optimization fits JFFS2, it is more natural than simply storing offsets in
the summary, or copying all the information into it.
Our plan was to modify wbuf (make a second one) and to modify
jffs2_reserve_space to select the right wbuf and generate the summary. We
never planned to introduce new clean_*, dirty_*, ... lists; that's really too
difficult.
> 3. There is a write buffer in JFFS2 which is used in the case of NAND. Are
> you going to have two wbufs? This is also a significant change.
Yes, we started to implement it yesterday and now agree. It is really
not easy, and we don't want to rewrite the NAND handling part of JFFS2
without a real NAND device. Maybe in the design of JFFS3 :)
So you convinced me. We will change the design of summary. The inodes
and dirents will be also in the summary. All other nodes will be copied
as they are into the summary and cause a warning. Summary support will
be required for new node types, too.
In the kernel we will have to modify:
1. jffs2_scan_eraseblock(), as is already done in our patch
2. the jeb struct, to store the dynamically generated summary (one extra field)
3. jffs2_reserve_space(), which will have a new parameter (summary
size), which can be JFFS2_SUMMARY_INODE_SIZE or
JFFS2_SUMMARY_DIRENT_SIZE(namelen). It can decide when to generate
summary and it can do this generation.
4. jffs2_flash_writev(), which is used to write info to flash. It can
parse the node (similar to sumtool) and store the summary of it in its jeb.
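Items 2 and 4 can be sketched as a small in-memory model: every node written to the current erase block appends one record to that block's summary. The record contents and the two size formulas below are assumptions made for the illustration; the real values of JFFS2_SUMMARY_INODE_SIZE and JFFS2_SUMMARY_DIRENT_SIZE() are not given in this thread.

```python
# Hypothetical model of a summary that grows as nodes are written to a block.

SUMMARY_INODE_SIZE = 16  # assumed: offset, ino, version, totlen (4 bytes each)


def summary_dirent_size(namelen):
    # assumed: offset, totlen, pino, version, ino (4 bytes each),
    # nsize and type (1 byte each), followed by the name itself
    return 22 + namelen


class BlockSummary:
    """Summary records collected for the erase block currently being written."""

    def __init__(self):
        self.records = []
        self.size = 0  # bytes the summary would occupy on flash

    def note_inode(self, offset, ino, version, totlen):
        self.records.append(("inode", offset, ino, version, totlen))
        self.size += SUMMARY_INODE_SIZE

    def note_dirent(self, offset, totlen, pino, version, ino, dtype, name):
        self.records.append(
            ("dirent", offset, totlen, pino, version, ino, dtype, name))
        self.size += summary_dirent_size(len(name))
```

Keeping a running size is what lets the space reservation step know how much room the summary will need at the end of the block.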
If it works, we'll check the effect of compressing the summary (size and
speed).
Comments?
Bye,
Ferenc
P.s.: Thanks for this good conversation.
* Re: JFFS2 mount time
2004-10-25 15:30 ` Ferenc Havasi
@ 2004-10-26 9:59 ` Artem Bityuckiy
2004-10-26 10:21 ` Ferenc Havasi
0 siblings, 1 reply; 27+ messages in thread
From: Artem Bityuckiy @ 2004-10-26 9:59 UTC (permalink / raw)
To: Ferenc Havasi; +Cc: David Woodhouse, linux-mtd, jffs-dev
Hello Ferenc,
Ferenc Havasi wrote:
> In the kernel we will have to modify:
> 1. jffs2_scan_eraseblock(), as is already done in our patch
> 2. the jeb struct, to store the dynamically generated summary (one extra field)
IMHO, since the summary relates to only one block, the current block, it
is logical to reference the summary from jffs2_sb_info, not from
jffs2_erase_blocks. It is also not very nice to store it in
jffs2_erase_blocks, since that will increase the size of the array of JFFS2
blocks (c->blocks[]).
> 3. jffs2_reserve_space(), which will have a new parameter (summary
> size), which can be JFFS2_SUMMARY_INODE_SIZE or
> JFFS2_SUMMARY_DIRENT_SIZE(namelen). It can decide when to generate
> summary and it can do this generation.
Yes, I also think so.
Currently jffs2_do_reserve_space() does the following (as I understand it):
1. If the current block (c->nextblock) has enough free space
for the request, it reserves it.
2. If c->nextblock has less space than requested, c->nextblock
is wasted and put on the corresponding list (dirty_list, etc.), and a free
block is taken and reserved.
Thus, jffs2_do_reserve_space() should be improved to be able to set aside
some space for the summary. We also need some function like
jffs2_write_summary(), which will be called before
jffs2_do_reserve_space() takes a new block from the free_list.
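The decision described above can be written down as a short sketch. It is a simplification under stated assumptions: sizes are plain byte counts, the summary's own header is ignored, and the function name and return values are made up for the illustration.

```python
# Hypothetical reservation check that leaves room for the block's summary.

def reserve_space(free_in_block, summary_so_far, node_size, node_summary_size):
    """Decide whether a node still fits in the current erase block.

    The node fits only if, after writing it, there is still room for the
    summary collected so far plus the record describing the new node.
    Otherwise the summary has to be written and a new block taken.
    """
    needed = node_size + summary_so_far + node_summary_size
    if needed <= free_in_block:
        return "fits"
    return "write_summary_then_new_block"
```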
> 4. jffs2_flash_writev(), which is used to write info to flash. It can
> parse the node (similar to sumtool) and store the summary of it in its jeb.
Maybe write it here... I didn't think about it a lot... Or maybe, as I wrote, in
jffs2_do_reserve_space()...
I also suggest that you include direntries in summaries and compress them. See:
sizeof(struct jffs2_raw_dirent) = 40 (without name)
you will need to store in your summary only:
totlen
pino
version
ino
nsize
type
name
which is 24 bytes. You don't store all the data! Of course, in the case of
long names things are not so good...
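The 40-byte figure can be checked from the on-flash layout of struct jffs2_raw_dirent. The format strings below use "<" only to switch off alignment padding; byte order does not affect the sizes. The second format covers just the six fixed-size fields listed above, so it is not meant to reproduce the 24-byte figure, which may include something the list does not spell out.

```python
import struct

# struct jffs2_raw_dirent without the trailing name:
# magic, nodetype (16 bit); totlen, hdr_crc, pino, version, ino, mctime
# (32 bit); nsize, type, unused[2] (8 bit each); node_crc, name_crc (32 bit).
RAW_DIRENT_FMT = "<HHIIIIIIBB2xII"

# Fixed-size fields from the list above: totlen, pino, version, ino (32 bit),
# nsize, type (8 bit). The name follows and is nsize bytes long.
SUMMARY_FIXED_FMT = "<IIIIBB"

raw_size = struct.calcsize(RAW_DIRENT_FMT)
fixed_size = struct.calcsize(SUMMARY_FIXED_FMT)
print("raw dirent header:", raw_size, "bytes")
print("fixed summary fields:", fixed_size, "bytes")
```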
If you also compress them, they will be smaller (minus 50-70%)!
So, if there are few direntries in a block, why not store them in the summary?
Have you measured the summary decompression time on your system? I can't
know for sure, but I suspect that if you have, say, a 200MHz system, the
decompression time = o(block read time)!
There is one more issue: if there are too many direntries in a block, the
summary may become too large (compression helps here). In this case
you may skip writing the summary or leave direntries out of the summary.
--
Best regards, Artem B. Bityuckiy
Oktet Labs (St. Petersburg), Software Engineer.
+78124286709 (office) +79112449030 (mobile)
E-mail: dedekind@oktetlabs.ru, web: http://www.oktetlabs.ru
* Re: JFFS2 mount time
2004-10-26 9:59 ` Artem Bityuckiy
@ 2004-10-26 10:21 ` Ferenc Havasi
2004-10-26 11:05 ` Artem Bityuckiy
0 siblings, 1 reply; 27+ messages in thread
From: Ferenc Havasi @ 2004-10-26 10:21 UTC (permalink / raw)
To: dedekind; +Cc: David Woodhouse, linux-mtd, jffs-dev
Hi Artem,
> IMHO, since the summary relates to only one block, the current block, it
> is logical to reference the summary from jffs2_sb_info, not from
> jffs2_erase_blocks. It is also not very nice to store it in
> jffs2_erase_blocks, since that will increase the size of the array of JFFS2
> blocks (c->blocks[]).
Is it certain that there is only one non-full erase block in the filesystem?
Non-full here means that there are some nodes already in it, but there
is also some free space at the end of it.
>> 4. jffs2_flash_writev(), which is used to write info to flash. It can
>> parse the node (similar to sumtool) and store the summary of it in its
>> jeb.
>
> Maybe write it here... I didn't think about it a lot... Or maybe, as I wrote, in
> jffs2_do_reserve_space()...
As I see it, jffs2_do_reserve_space() is called before inode/... allocation in
most cases. So at that time the summary information is not known - but at
write time it certainly has to be known.
> So, if there are few direntries in a block, why not store them in the summary?
You may have misunderstood me. In the previous letter I wrote: "So you
convinced me. We will change the design of summary. The inodes and
dirents will be also in the summary."
So now we do plan to store dirents in the summary. :)
> Have you measured the summary decompression time on your system? I can't
> know for sure, but I suspect that if you have, say, a 200MHz system, the
> decompression time = o(block read time)!
It depends on the compressor.
We will test it with zlib/rtime. I would like to implement it as an
optional feature.
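A quick way to get a feeling for the zlib side is sketched below. The records are synthetic and far more regular than real direntries, so the ratio it reports says nothing about a real image; rtime is JFFS2's own compressor and is not covered here. The record format is an assumption made for the example.

```python
import struct
import zlib


def build_sample_summary(count):
    """Pack `count` synthetic direntry records (made-up values, not real data)."""
    out = bytearray()
    for i in range(count):
        name = f"file{i:04d}.conf".encode()
        # totlen, pino, version, ino, nsize, type, then the name
        out += struct.pack("<IIIIBB", 40 + len(name), 1, 1, 100 + i, len(name), 8)
        out += name
    return bytes(out)


def compressed_ratio(data):
    """Size after zlib compression divided by the original size."""
    return len(zlib.compress(data)) / len(data)


sample = build_sample_summary(100)
print(f"{len(sample)} bytes, compressed ratio {compressed_ratio(sample):.2f}")
```

Timing the matching zlib.decompress() call on the target hardware would answer the speed half of the question.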
> There is one more issue: if there are too many direntries in a block, the
> summary may become too large (compression helps here). In this case
> you may skip writing the summary or leave direntries out of the summary.
Let's see how it works, and afterwards we can make it more optimal :)
Bye,
Ferenc
* Re: JFFS2 mount time
2004-10-26 10:21 ` Ferenc Havasi
@ 2004-10-26 11:05 ` Artem Bityuckiy
2004-10-26 13:52 ` Ferenc Havasi
0 siblings, 1 reply; 27+ messages in thread
From: Artem Bityuckiy @ 2004-10-26 11:05 UTC (permalink / raw)
To: Ferenc Havasi; +Cc: linux-mtd, David Woodhouse, jffs-dev
Ferenc,
> Is it certain that there is only one non-full erase block in the filesystem?
> Non-full here means that there are some nodes already in it, but there
> is also some free space at the end of it.
I didn't analyse this carefully, but my understanding is that there is one
current block (c->nextblock). Even the GC moves nodes to it. This is because
jffs2_do_reserve_space() is always used (even by the GC), and
jffs2_do_reserve_space() always uses c->nextblock.
> As I see it, jffs2_do_reserve_space() is called before inode/... allocation in
> most cases. So at that time the summary information is not known - but at
> write time it certainly has to be known.
Maybe... On the other hand, you may write the summary every time
jffs2_reserve_space() fetches a new block from the free_list...
Anyway, this is not fundamental...
> You may have misunderstood me. In the previous letter I wrote: "So you
> convinced me. We will change the design of summary. The inodes and
> dirents will be also in the summary."
>
> So now we do plan to store dirents in the summary. :)
OK, sorry. :-)
> Let's see how it works, and afterwards we can make it more optimal :)
Agree :-)
Also, please, take into account that there may be checkpoint nodes (I'm
implementing this). So, I think you need to have a generic mechanism to
add new node types to your summary.
Also, I think it is good to have a generic mechanism to just refer to some
nodes from summaries (for example, direntries with long names or
something else).
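One possible shape for such a generic mechanism is a small registry: node types with a known summary format get a proper record, and anything else is only referenced by offset and length. The names, the dictionary-based node representation and the "checkpoint" type string are all hypothetical.

```python
# Hypothetical registry of per-node-type summarizers with a generic fallback.
SUMMARIZERS = {}


def summarizer(node_type):
    """Register a function that turns a node of node_type into a summary record."""
    def register(func):
        SUMMARIZERS[node_type] = func
        return func
    return register


@summarizer("inode")
def _summarize_inode(node):
    return {"kind": "inode", "offset": node["offset"],
            "ino": node["ino"], "version": node["version"]}


@summarizer("dirent")
def _summarize_dirent(node):
    return {"kind": "dirent", "offset": node["offset"],
            "pino": node["pino"], "ino": node["ino"], "name": node["name"]}


def summarize(node):
    """Use the registered summarizer, or fall back to a bare reference."""
    func = SUMMARIZERS.get(node["type"])
    if func is None:
        return {"kind": "ref", "offset": node["offset"], "totlen": node["totlen"]}
    return func(node)
```

A new node type then only needs one more registered function, and until it has one its nodes are still found through the bare reference.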
Thank you for the conversation too.
:-)
--
Best regards, Artem B. Bityuckiy
Oktet Labs (St. Petersburg), Software Engineer.
+78124286709 (office) +79112449030 (mobile)
E-mail: dedekind@oktetlabs.ru, web: http://www.oktetlabs.ru
* Re: JFFS2 mount time
2004-10-26 11:05 ` Artem Bityuckiy
@ 2004-10-26 13:52 ` Ferenc Havasi
0 siblings, 0 replies; 27+ messages in thread
From: Ferenc Havasi @ 2004-10-26 13:52 UTC (permalink / raw)
To: dedekind; +Cc: linux-mtd, David Woodhouse, jffs-dev
Hi Artem,
> Also, please, take into account that there may be checkpoint nodes (I'm
> implementing this). So, I think you need to have a generic mechanism to
> add new node types to your summary.
>
> Also, I think it is good to have a generic mechanism to just refer to some
> nodes from summaries (for example, direntries with long names or
> something else).
Yes, it will be easy to extend.
We also need this general support, because we will introduce a new
node type, too, for the model file support, which we will start to
commit when David finishes his patch for Linus.
Bye,
Ferenc
* Re: JFFS2 mount time
2004-10-25 9:36 ` Ferenc Havasi
2004-10-25 10:56 ` Artem Bityuckiy
@ 2004-10-25 11:21 ` Artem Bityuckiy
1 sibling, 0 replies; 27+ messages in thread
From: Artem Bityuckiy @ 2004-10-25 11:21 UTC (permalink / raw)
To: Ferenc Havasi; +Cc: linux-mtd, jffs-dev
> I am curious about (at least) David's opinion on these topics.
I also wonder why people are not very active :-)
--
Best regards, Artem B. Bityuckiy
Oktet Labs (St. Petersburg), Software Engineer.
+78124286709 (office) +79112449030 (mobile)
E-mail: dedekind@oktetlabs.ru, web: http://www.oktetlabs.ru
* Re: JFFS2 mount time
2004-10-22 12:44 ` Artem Bityuckiy
2004-10-25 9:36 ` Ferenc Havasi
@ 2004-10-26 9:29 ` Jarkko Lavinen
2004-10-26 10:24 ` Ferenc Havasi
2004-10-26 10:34 ` Artem Bityuckiy
1 sibling, 2 replies; 27+ messages in thread
From: Jarkko Lavinen @ 2004-10-26 9:29 UTC (permalink / raw)
To: linux-mtd, jffs-dev; +Cc: ext Artem Bityuckiy, David Woodhouse
[-- Attachment #1: Type: text/plain, Size: 2655 bytes --]
I used jffs2dump to see how many inodes and dirents I have on the
root filesystem of an ARM testbed. A quick and dirty Perl script is attached.
This isn't accurate, as the calculated total image size misses at least
the final padding on the last erase block.
The size of the plain JFFS2 image is 31.1 MiB. The root fs consists of
all applications and libraries and no user data.
$ jffs2dump -c rootfs.jffs2 | perl jffs2stats.pl
Number of dirents: 6144.
Total dirent node space: 304911 (0.9%)
Average dirent len: 49.6
Total dirent name space: 76671
Average name len: 12.5
Number of Inodes: 21197
Total Inode space: 32254866 (99.1%)
Average Inode size: 1521.7
Padding: 37326 0.1%
Total image size: 32559777
$ ls -l rootfs.jffs2
-rw-r--r-- 1 root root 32597104 Oct 20 15:11 rootfs.jffs2
With sumtool the image size grows to 31.8 MiB:
$ jffs2dump -c rootfs-sum.jffs2 | perl jffs2stats.pl
Number of dirents: 6144.
Total dirent node space: 304911 (0.9%)
Average dirent len: 49.6
Total dirent name space: 76671
Average name len: 12.5
Number of Inodes: 21197
Total Inode space: 32254866 (97.2%)
Average Inode size: 1521.7
Number of Inode Summary nodes: 251
Total Inode Sum space: 631524, (1.9%)
Average Sum node size: 2516.0
Padding: 153063 0.5%
Total image size: 33191301
$ ls -l rootfs-sum.jffs2
-rw-r--r-- 1 root root 33423360 Oct 20 15:23 rootfs-sum.jffs2
If dentries were stored just as they are (unstripped and uncompressed)
in the summary, the summary size would grow by 50% to about 3% of the
whole image size.
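The estimate can be reproduced from the figures printed above for rootfs-sum.jffs2, assuming the dirent nodes are simply added to the existing summary space and to the image:

```python
# Figures taken from the jffs2stats.pl output for rootfs-sum.jffs2 above.
dirent_space = 304911    # Total dirent node space
summary_space = 631524   # Total Inode Sum space
image_size = 33191301    # Total image size as computed by the script

# Relative growth of the summaries if every dirent were copied into them.
growth = dirent_space / summary_space

# Share of the (slightly larger) image that the summaries would then occupy.
share = (summary_space + dirent_space) / (image_size + dirent_space)

print(f"summary growth: {growth:.0%}")
print(f"summary share of image: {share:.1%}")
```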
On Fri, Oct 22, 2004 at 04:44:13PM +0400, ext Artem Bityuckiy wrote:
> I believe that if we have directory references in summaries, this will
> increase the mount speed.
>
> 1. At first, we will store fewer data! We don't need to keep the common
> headers, CRCs and mctimes.
> 2. At the second, we may compress summary (direntries aren't compressed)!
> 3. And the third, on NAND there is difference between reading lots of
> different pages or few pages.
I tried Ferenc's earlier mount time patch in August and the 52s mount
time dropped then to 14s. If I understand right, inodes and dentries
were then mixed in the erase block and the summary was for inodes
only. This shows reading dentries from semirandom places is
expensive.
Ferenc's latest patch put dentries on their own erase block in
consecutive order. Considering only the read efficiency from the
media, reading consecutive, uncompressed, and unstripped dentries from
a summary should cost no more than reading them from dedicated erase
block.
Jarkko Lavinen
[-- Attachment #2: jffs2stats.pl --]
[-- Type: text/x-perl, Size: 3230 bytes --]
#! /usr/bin/perl
# Summarize "jffs2dump -c" output: count dirent, inode and inode summary
# nodes and report how much space each kind occupies.

$EBLOCKSIZE = 131072;    # erase block size in bytes (128 KiB)

$dirents = $totdirentlen = $totnamelen = 0;
$inodes = $totinodelen = 0;
$sumnodes = $totsumnodelen = 0;
$totalpadlen = 0;
$gaps = $totgaplen = 0;
$nextaddr = 0;

# Account for the space between the end of the previous node and $addr.
# Space up to an erase block boundary, or of at most 3 bytes (alignment),
# counts as padding; anything else is reported as a gap.
sub checkpadding {
    my ($addr, $totlen) = @_;
    my $len = hex($addr) - $nextaddr;

    if ($len > 0) {
        if (hex($addr) % $EBLOCKSIZE == 0 || $len <= 3) {
            $totalpadlen += $len;
        } else {
            print sprintf "Gap seen at %08x .. $addr, length $len\n", $nextaddr;
            $gaps++;
            $totgaplen += $len;
        }
    }
    $nextaddr = hex($addr) + hex($totlen);
}

while (<>) {
    chomp;
    if (/^\s+Dirent/) {
        die "Cannot parse $_" if (! /^ \s+
            Dirent \s+
            node \s at \s+ (\w+), \s+
            totlen \s+ (\w+), \s+
            \#pino \s+ (\w+), \s+
            version \s+ (\w+), \s+
            \#ino \s+ (\w+), \s+
            nsize \s+ (\w+), \s+
            name \s+ (.*)
            $/x);
        my ($addr, $totlen, $pino, $version, $ino, $nsize, $name) = ($1, $2, $3, $4, $5, $6, $7);
        checkpadding($addr, $totlen);
        $dirents++;
        $totdirentlen += hex($totlen);
        $totnamelen += hex($nsize);
    } elsif (/^\s+Inode Sum/) {
        # Must be tested before the plain "Inode" case below.
        die "Cannot parse $_" if (! /^ \s+
            Inode \s Sum \s+
            node \s at \s+ (\w+), \s+
            totlen \s+ (\w+), \s+
            sum_num \s+ (\w+), \s+
            cleanmarker \s size \s+ (\w+) \s*
            $/x);
        my ($addr, $totlen, $sum_num, $cleanmarksize) = ($1, $2, $3, $4);
        checkpadding($addr, $totlen);
        $sumnodes++;
        $totsumnodelen += hex($totlen);
    } elsif (/^\s+Inode/) {
        die "Cannot parse $_" if (! /^ \s+
            Inode \s+
            node \s at \s+ (\w+), \s+
            totlen \s+ (\w+), \s+
            \#ino \s+ (\w+), \s+
            version \s+ (\w+), \s+
            isize \s+ (\w+), \s+
            csize \s+ (\w+), \s+
            dsize \s+ (\w+), \s+
            offset \s+ (\w+) \s*
            $/x);
        my ($addr, $totlen, $ino, $version, $isize, $csize, $dsize, $offset) = ($1, $2, $3, $4, $5, $6, $7, $8);
        checkpadding($addr, $totlen);
        $inodes++;
        $totinodelen += hex($totlen);
    } else {
        die "Cannot parse $_";
    }
}

# Total of everything accounted for, padding and gaps included.
$totalsize = $totdirentlen + $totinodelen + $totsumnodelen + $totalpadlen + $totgaplen;

print "Number of dirents:\t$dirents.\n";
print sprintf " Total dirent node space:\t$totdirentlen (%.1f%%)\n", 100.0*$totdirentlen/$totalsize;
print " Average dirent len:\t", sprintf("%.1f", $totdirentlen/$dirents), "\n" if ($dirents > 0);
print " Total dirent name space:\t$totnamelen\n";
print sprintf(" Average name len:\t%.1f\n", $totnamelen/$dirents) if ($dirents > 0);
print "\n";
print "Number of Inodes:\t$inodes\n";
print sprintf " Total Inode space:\t$totinodelen (%.1f%%)\n", 100.0*$totinodelen/$totalsize;
print " Average Inode size:\t", sprintf("%.1f", $totinodelen/$inodes), "\n" if ($inodes > 0);
if ($sumnodes) {
    print "\n";
    print "Number of Inode Summary nodes:\t$sumnodes\n";
    print sprintf " Total Inode Sum space:\t$totsumnodelen, (%.1f%%)\n", 100.0*$totsumnodelen/$totalsize;
    print " Average Sum node size:\t", sprintf("%.1f", $totsumnodelen/$sumnodes), "\n";
}
print sprintf "\nPadding:\t$totalpadlen %.1f%%\n", 100.0*$totalpadlen/$totalsize;
print "Total image size: $totalsize\n";
* Re: JFFS2 mount time
2004-10-26 9:29 ` Jarkko Lavinen
@ 2004-10-26 10:24 ` Ferenc Havasi
2004-10-26 10:34 ` Artem Bityuckiy
1 sibling, 0 replies; 27+ messages in thread
From: Ferenc Havasi @ 2004-10-26 10:24 UTC (permalink / raw)
To: Jarkko Lavinen; +Cc: ext Artem Bityuckiy, linux-mtd, David Woodhouse, jffs-dev
Hi Jarkko,
> If dentries were stored just as they are (unstripped and uncompressed)
> in the summary, the summary size would grow by 50% to about 3% of the
> whole image size.
Thanks, good to know it.
Did you get ECC/CRC errors? The most interesting test for me would be to
test the new (sumtool) image with the original kernel (because the
summary nodes are compatible, it should work), and see whether there are
ECC/CRC errors or not.
Bye,
Ferenc
* Re: JFFS2 mount time
2004-10-26 9:29 ` Jarkko Lavinen
2004-10-26 10:24 ` Ferenc Havasi
@ 2004-10-26 10:34 ` Artem Bityuckiy
1 sibling, 0 replies; 27+ messages in thread
From: Artem Bityuckiy @ 2004-10-26 10:34 UTC (permalink / raw)
To: Jarkko Lavinen; +Cc: David Woodhouse, linux-mtd, jffs-dev
Hello Jarkko,
> I tried Ferenc's earlier mount time patch in August and the 52s mount
> time dropped then to 14s. If I understand right, inodes and dentries
> were then mixed in the erase block and the summary was for inodes
> only. This shows reading dentries from semirandom places is
> expensive.
It is very good that direntries are distributed more or less uniformly
on average.
>
> Ferenc's latest patch put dentries on their own erase block in
> consecutive order. Considering only the read efficiency from the
> media, reading consecutive, uncompressed, and unstripped dentries from
> a summary should cost no more than reading them from dedicated erase
> block.
>
Definitely true - the second patch must be better than the first one. But
unfortunately, it is hard to do this dynamically :-( Ferenc tried...
But in my proposal, we will also reference direntries in the summary -
this is not the same as reading direntries from where they are placed;
it is a different thing, especially in the case of NAND! There is a difference
(if we have NAND) between reading one 512-byte NAND page containing
compressed information about 20-25 direntries and reading 20-25
*different* NAND pages.
So, I think the new design will also be better than Ferenc's earlier patch :-)
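The difference can be made concrete by counting the distinct pages a set of reads touches. The node size and the offsets below are invented for the example, and the packed case is uncompressed, so with compression the packed entries would need even fewer pages.

```python
# Count the distinct NAND pages covered by a set of (offset, size) reads.

def pages_touched(offsets_and_sizes, page_size):
    pages = set()
    for offset, size in offsets_and_sizes:
        first = offset // page_size
        last = (offset + size - 1) // page_size
        pages.update(range(first, last + 1))
    return len(pages)


PAGE = 512
# 20 direntries of 50 bytes each, one every 4 KiB (scattered among inodes)...
scattered = [(i * 4096, 50) for i in range(20)]
# ...versus the same 20 entries stored back to back, as in a summary.
packed = [(i * 50, 50) for i in range(20)]

print("scattered:", pages_touched(scattered, PAGE), "pages")
print("packed:   ", pages_touched(packed, PAGE), "pages")
```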
--
Best regards, Artem B. Bityuckiy
Oktet Labs (St. Petersburg), Software Engineer.
+78124286709 (office) +79112449030 (mobile)
E-mail: dedekind@oktetlabs.ru, web: http://www.oktetlabs.ru