From mboxrd@z Thu Jan 1 00:00:00 1970
From: Phillip Susi
Subject: Re: Atomic file data replace API
Date: Fri, 07 Jan 2011 20:11:04 -0500
Message-ID: <4D27B9A8.3020804@cfl.rr.com>
References: <1294412141-sup-1734@think>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Cc: Olaf van der Spek, linux-btrfs
To: Chris Mason
Return-path:
In-Reply-To: <1294412141-sup-1734@think>
List-ID:

On 01/07/2011 09:58 AM, Chris Mason wrote:
> Yes and no. We have a best effort mechanism where we try to guess that
> since you've done this truncate and the write that you want the writes
> to show up quickly. But its a guess.

It is a pretty good guess, and one that the NT kernel has been making
for 15 years or so. I've been following this issue for some time and I
still don't understand why Ted is so hostile to this and can't make it
work right on ext4.

When you get a rename() you just need to check if there are outstanding
journal transactions and/or dirty cache pages, and hang the rename()
transaction on the end of those. That way if the system crashes after
the new file has fully hit the disk, the old file is gone and you only
have the new one, but if it crashes before, you still have the old one
in place. Both the writes and the rename can be delayed in the cache to
an arbitrary point in the future; what matters is that their order is
preserved.
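
To make the ordering argument concrete, here is a toy model in Python. It is only a sketch under stated assumptions: the "disk" is a pair of dictionaries, and the names `replay`, `config`, `config.tmp`, `old_inode` and `new_inode` are all made up for illustration. It says nothing about how ext4 or btrfs actually implement their journals. It only shows that if the rename is queued behind the data writes, a crash at any point leaves either the complete old file or the complete new one.

```python
def replay(ops, crash_after):
    """Apply the first `crash_after` queued operations, in order, to a toy
    disk and return what the name "config" points at afterwards."""
    # Toy on-disk state: the old file, plus an empty, freshly created temp file.
    inodes = {"old_inode": b"old contents", "new_inode": b""}
    names = {"config": "old_inode", "config.tmp": "new_inode"}
    for op in ops[:crash_after]:
        if op[0] == "write":
            _, inode, chunk = op
            inodes[inode] += chunk
        elif op[0] == "rename":
            _, src, dst = op
            names[dst] = names.pop(src)
    return inodes[names["config"]]


# Order preserved: the rename is hung on the end of the outstanding writes.
ordered = [
    ("write", "new_inode", b"new "),
    ("write", "new_inode", b"contents"),
    ("rename", "config.tmp", "config"),
]

# Order not preserved: the rename reaches the disk before the data does.
reordered = [ordered[2], ordered[0], ordered[1]]

# Crash after every possible number of completed operations.
ordered_outcomes = {replay(ordered, k) for k in range(len(ordered) + 1)}
reordered_outcomes = {replay(reordered, k) for k in range(len(reordered) + 1)}

print(sorted(ordered_outcomes))
print(sorted(reordered_outcomes))
```

With the order preserved, the only possible results are the old contents or the new contents. With the rename allowed to jump ahead, a crash can leave `config` pointing at an empty or partially written file, which is the failure this thread is about. How long the queue sits in the cache does not matter in this model, only the order in which it is replayed.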