From: Estelle HAMMACHE
To: Sergei Sharonov
Cc: linux-mtd@lists.infradead.org
Subject: Re: atomic file operations
Date: Wed, 23 Mar 2005 10:39:55 +0100
Message-ID: <4241396B.D689EB32@st.com>

Sergei Sharonov wrote:
> Is a write of 1024 bytes atomic?
> Does it relate to the page size in any way? BTW I am using NAND and the page
> may vary between 512 and 2048 bytes depending on a device.

No write operation is guaranteed to be atomic. Have a look at
jffs2_write_inode_range in write.c: if there is not enough space in the
current block for the whole data, it may be split into several chunks.
Additionally, write ops that overlap a cache page boundary (not a flash
page) are always split at the page limit.

If you want to have atomic writes, you could:

1) Mandatorily: ensure that your application will not issue write ops
   which overlap a page boundary. You should not tweak the JFFS2 code
   to write such overlapping nodes, otherwise you must also tweak the
   GC and it gets difficult.
2) Either tweak jffs2_write_inode_range to forbid splitting data which
   does not overlap a page boundary, or adjust JFFS2_MIN_DATA_LEN to
   reserve enough space (maybe difficult to estimate if you have
   compression...).

The above tweaking should ensure that an input buffer is written to the
JFFS2 FS as a single CRC-protected data node.

You should be aware that on NAND flash JFFS2 uses a (NAND flash) page
buffer (wbuf.c), which is flushed only on fsync/sync/umount. So even
though your write ops will be atomic (with the above code tweaks), there
is no guarantee that a buffer is actually committed to flash when
write() returns, because the end of the data node may remain in the
buffer. If you want that as well, you can tweak JFFS2 again by requiring
a wbuf flush after each "atomic write", or you can have your application
call fsync after each write.

> Is file rename atomic?

See jffs2_rename in dir.c. There are two steps: make the new hard link,
then remove the old hard link. You may end up with two names for the
same inode if there is a power-down, so no, it is not atomic.

See dir.c, file.c, fs.c for other ops. Generally speaking,
write_inode_range is not an atomic operation; write_dnode and
write_dirent are atomic ops. The order of operations in a file-level
operation should ensure global atomicity in most cases. I don't know if
there are other file operations besides rename which are not atomic.

> Second issue is: How badly these small chunks will affect my mount time?

There have been previous threads about this. Some people proposed
(application-side) workarounds; you can find them in the archive, or
maybe someone will point you to them.

Bye
Estelle
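
A minimal application-side sketch of the two precautions discussed above
(never issue a write that crosses a page boundary, and call fsync after
each write). It assumes the relevant boundary is the page size reported
by sysconf(_SC_PAGESIZE). The helper names crosses_page_boundary and
write_record_sync are hypothetical and not part of JFFS2. This does not
by itself make a write atomic: that still depends on the JFFS2 changes
described in 2).

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for pwrite() under strict -std= modes */
#endif

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the byte range [offset, offset + len)
 * touches more than one page of size page_size, 0 otherwise.
 */
int crosses_page_boundary(off_t offset, size_t len, size_t page_size)
{
    assert(page_size > 0);
    if (len == 0)
        return 0;

    off_t first_page = offset / (off_t)page_size;
    off_t last_page  = (offset + (off_t)len - 1) / (off_t)page_size;
    return first_page != last_page;
}

/*
 * Hypothetical helper: write one record at a fixed offset and fsync it.
 * Refuses (EINVAL) any record that would cross a page boundary, since
 * such a write is always split. Returns 0 on success, -1 on error.
 */
int write_record_sync(int fd, const void *buf, size_t len, off_t offset)
{
    long page_size = sysconf(_SC_PAGESIZE);  /* assumption: cache page size */
    if (page_size <= 0)
        return -1;

    if (crosses_page_boundary(offset, len, (size_t)page_size)) {
        errno = EINVAL;
        return -1;
    }

    ssize_t n = pwrite(fd, buf, len, offset);
    if (n < 0)
        return -1;
    if ((size_t)n != len) {   /* short write: record is incomplete */
        errno = EIO;
        return -1;
    }

    /* On NAND this is what forces the write buffer out to flash. */
    return fsync(fd);
}
```

A caller would lay out its records so that none straddles a page (for
example, fixed-size records whose size divides the page size) and treat
a -1 return as "record not known to be on flash".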