Message-ID: <45EFDC0D.4090409@dave-tech.it>
Date: Thu, 08 Mar 2007 10:49:01 +0100
From: R&D4
To: mtd_mailinglist
Subject: JFFS2 as transactional FS (in other words: how to be sure that data has been written correctly from userspace)
List-Id: Linux MTD discussion mailing list

Hi all MTD developers,

we are currently using an MTD partition on a NAND device (with JFFS2 on it, of course ;-) ) for transaction logging. These transactions are mission critical and we cannot afford to lose data (or, even worse, end up with corrupted data!).

For this reason we also use a battery-backed SRAM as temporary storage for the transaction state machine. After a transaction has completed we flush the content of the SRAM to a file and, once the write is complete, we can overwrite the temporary storage with new data.

Of course the machine can be interrupted at any moment without notice (e.g. watchdog, power failure). Only the content of the SRAM is guaranteed to be valid at any time.

The main problem, of course, is knowing *when* we can say "OK, the data has been _completely_ written to the final storage".

After reading back through this mailing list, googling, and reading the JFFS2 FAQ (http://www.linux-mtd.infradead.org/faq/jffs2.html#L_writewell), I think I have found some kind of solution (I'm currently running some tests on it), which depends on the storage medium (NOR vs NAND):

- on *NOR*: in our understanding, we can just use a simple fwrite() followed by fsync() or sync().
  After sync() returns control to the user's program, we can be sure that the data has been written to the device. So:

    file = fopen(file_on_jffs2_nor);
    while (isneeded) {
        while (space_available(SRAM)) {
            fill(SRAM);
        }
        fread(buffer, SRAM);
        fwrite(buffer, file);
        fflush(file);           /* push the stdio buffer to the kernel first */
        fsync(fileno(file));    /* fsync() takes a file descriptor */
        invalidate_SRAM();
    }
    fclose(file);

  (Of course I have intentionally omitted the code for resuming from a warm reset.)

  QUESTION: Is this pseudocode correct? Is fsync() needed (O_SYNC is not supported by JFFS2, AFAIK), or has the data already been _completely_ written by the time fwrite() returns (so no sync() is required)?

- on *NAND*: things are a bit trickier ;-). Even if you call fsync(), the data may not have been written to storage, because "it's better to fill a NAND page before commit". For this reason the (dirty) page is written to storage only after "a while", even if it is not full. The FAQ says that this "a while" is controlled by the standard kernel VM settings, via /proc/sys/vm/dirty_writeback_centisecs. Having read this, I am thinking about using this code:

  at system startup:

    echo smallvalue > /proc/sys/vm/dirty_writeback_centisecs

  then:

    file = fopen(file_on_jffs2_nand);
    while (isneeded) {
        while (space_available(SRAM)) {
            fill(SRAM);
        }
        fread(buffer, SRAM);
        fwrite(buffer, file);
        fflush(file);           /* push the stdio buffer to the kernel first */
        fsync(fileno(file));    /* fsync() takes a file descriptor */
        sleep(smallvalue + anothervalue);
        invalidate_SRAM();
    }
    fclose(file);

  'smallvalue' should be something less than the standard 5 secs, but not so small that it wastes too much CPU or NAND storage (by writing out pages that are not completely filled; correct me if I'm wrong on this point). I was thinking of 500 ms (i.e. a value of 50, since the unit is centiseconds).

  'anothervalue' should be something '>>smallvalue', and it should be used (IMHO) because Linux is not an RTOS, so timings are not tightly guaranteed.

  Is this approach correct, or is there something better that can be done? Of course you can still flush buffers and dirty pages by unmounting the partition, but..
this takes too long for our needs.

BTW, in my current tests I have seen that, without the sleep(), sometimes my "last" data is not written correctly.

I hope this (long ;-) ) email can lead to a useful discussion about this problem! :-)

Best Regards,
Andrea
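
P.S. To make the question more concrete, here is a minimal C sketch of the "append, then sync" step using plain file descriptors instead of stdio. The helper name append_record_synced() and its return convention are hypothetical, chosen only for illustration. It shows the userspace side only: it assumes that a successful fsync() is enough to put the data on the flash, which is exactly the point I am unsure about for JFFS2 on NAND.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: append len bytes from buf to the file at path.
 * Returns 0 only if every byte was written and fsync() reported success,
 * and -1 on any error. */
int append_record_synced(const char *path, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);

    if (fd < 0)
        return -1;

    /* write() may be interrupted or write fewer bytes than requested. */
    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0 && errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (n <= 0) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Ask the kernel to push the file's dirty data to the device. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }

    return close(fd) == 0 ? 0 : -1;
}
```

The idea would be to call invalidate_SRAM() only when this returns 0, so that a failed or interrupted write leaves the SRAM copy untouched.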