From: Sergei Sharonov <sergei.sharonov@halliburton.com>
To: linux-mtd@lists.infradead.org
Subject: Re: atomic file operations
Date: Wed, 23 Mar 2005 20:50:52 +0000 (UTC)
Message-ID: <loom.20050323T212715-407@post.gmane.org>
In-Reply-To: 4241396B.D689EB32@st.com
Estelle,
thanks, appreciate your help.
>
> Sergei Sharonov wrote:
> > Is a write of 1024 bytes atomic?
> > Does it relate to the page size in any way? BTW I am using NAND and the
> > page may vary between 512 and 2048 bytes depending on a device.
>
> No write operation is guaranteed to be atomic. Have a look
> at jffs2_write_inode_range in write.c : if there is not enough
> space in the current block for the whole data, it may be split
> into several chunks. Additionally write ops that overlap a
> cache page boundary (not a flash page) are always split at
> the page limit.
Does that mean that one write may carry several CRCs, one for each
of the split chunks?
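
To make the splitting concrete, here is a rough sketch of how a single
write would be cut at cache-page boundaries. The helper name
count_page_chunks and the 4096-byte page size used below are assumptions
for illustration only. This is not the JFFS2 code, and it ignores the
additional split that happens when the current block runs out of space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of pieces a write of `len` bytes at file
 * offset `off` is cut into when it is split at every `page` boundary. */
static size_t count_page_chunks(size_t off, size_t len, size_t page)
{
    size_t chunks = 0;

    assert(page > 0);
    while (len > 0) {
        size_t room = page - (off % page);   /* bytes left in this page */
        size_t n = len < room ? len : room;  /* size of this chunk */

        off += n;
        len -= n;
        chunks++;
    }
    return chunks;
}
```

With an assumed 4096-byte cache page, a 1024-byte write at offset 0 stays
in one chunk, while the same write at offset 3584 is cut into two chunks
of 512 bytes, each of which would get its own node and CRC.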
> If you want to have atomic writes, you could:
> 1) Mandatorily: ensure that your application will not
> issue write ops which overlap a page boundary.
> You should not tweak the JFFS2 code to write such
> overlapping nodes, otherwise you must also tweak
> the GC and it gets difficult.
> 2) Either tweak jffs2_write_inode_range to forbid
> splitting data which does not overlap a page boundary
> or adjust JFFS2_MIN_DATA_LEN to reserve enough
> space (difficult to estimate maybe if you have
> compression...).
>
> The above tweaking should ensure that an input buffer
> is written to JFFS2 FS as a single CRC-protected
> data node.
OK, got that. It does not seem like a promising idea, considering
how fast JFFS2 evolves and therefore how painful maintaining a fork would be.
Thanks for the suggestion anyway.
> You should be aware that on NAND flash JFFS2 uses
> a (nand flash) page buffer (wbuf.c), which is flushed
> only on fsync/sync/umount. So even though your write
> ops will be atomic (with above code tweaks),
> there is no guarantee that a buffer is effectively
> committed to flash when write() returns, because the
> end of the data node may remain in the buffer.
> If you want that also, you can tweak JFFS2 again
> by requiring a wbuf flush after each "atomic write",
> or you can have your application call fsync after
> each write.
I beg your pardon if this is a FAQ, but if I open the file with the O_SYNC
flag, wouldn't that guarantee a synchronous write that does not
return until all the data is in flash?
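
For reference, the two application-side variants under discussion look
roughly like this. It is only a sketch of the POSIX side: append_record
is a hypothetical name, and whether JFFS2 actually flushes its NAND write
buffer for an O_SYNC file is exactly the open question above, so nothing
here should be read as a statement about JFFS2 behaviour.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: append one record and report success only after
 * the kernel says the data has been written through.
 * use_osync != 0: rely on O_SYNC set at open time.
 * use_osync == 0: plain write() followed by an explicit fsync(). */
static int append_record(const char *path, const void *buf, size_t len,
                         int use_osync)
{
    int flags = O_WRONLY | O_CREAT | O_APPEND;
    int fd, rc = -1;
    ssize_t n;

    assert(path != NULL && buf != NULL);
    if (use_osync)
        flags |= O_SYNC;

    fd = open(path, flags, 0644);
    if (fd < 0)
        return -1;

    n = write(fd, buf, len);
    if (n == (ssize_t)len && (use_osync || fsync(fd) == 0))
        rc = 0;

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Either variant only helps if each record also ends up as a single data
node; a record that was split into several nodes can still be torn by a
power loss between them.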
> > Is file rename atomic?
> See jffs2_rename in dir.c. There are two steps:
> make the new hard link, remove the old hard link.
> You may end up with two names for the same inode if
> there is a powerdown, so no it is not atomic.
I did not see that coming. Usually people assume that the rename operation
is atomic.
> > Second issue is: How badly these small chunks will affect my mount time?
> There have been previous threads about this.
> Some people proposed some (application-side) workaround,
> you can find it in the archive or maybe someone will point
> it to you.
I believe I saw a proposal to save small chunks as separate files, then
append them into a temp file, and rename the temp file to the real log file.
The problems are that (1) the log file is huge and (2) rename is not atomic,
per your reply.
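
For clarity, the last step of that proposal is the usual
write-temp-then-rename pattern, roughly as sketched below. The name
replace_via_rename and both path parameters are hypothetical; this is a
generic POSIX sketch, not something taken from the earlier threads.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: replace `path` with `len` bytes from `buf` by
 * writing a temp file, syncing it, and renaming it over the target. */
static int replace_via_rename(const char *path, const char *tmp_path,
                              const void *buf, size_t len)
{
    int fd;

    assert(path != NULL && tmp_path != NULL && buf != NULL);
    fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* The data should be synced before the new name becomes visible. */
    if (write(fd, buf, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) != 0 || rename(tmp_path, path) != 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

Both problems show up directly here: the whole log has to be rewritten
into the temp file on every update, and the final rename() is the step
that, per your reply, is not atomic on JFFS2.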
Sergei Sharonov