From mboxrd@z Thu Jan 1 00:00:00 1970
From: Joel Fernandes
Subject: Re: ext3 writing of data before metadata in ordered mode
Date: Mon, 26 Oct 2009 00:17:26 -0700
Message-ID: <9ff7a3bc0910260017v22ed53c1q949f22201e8b048f@mail.gmail.com>
References: <9ff7a3bc0910251433k2c719989n981e3652162cafdf@mail.gmail.com>
To: Mulyadi Santosa
Cc: linux-fsdevel@vger.kernel.org, kernelnewbies

Hi Mulyadi,

Thanks for your opinion. If you ask me, JBD and the I/O scheduler are
two independent layers, so I don't think the ordering of data and
metadata is done at that level. But I think there is something to the
data completion handler you're talking about.

Simplistically, during a write() in data=ordered mode:

1. While updating metadata (before the data is copied), the kernel
   updates the metadata buffers and moves each metadata block to a list
   in the active transaction (which is going to be logged).
2. Then the actual data buffers (in memory) are updated with the
   contents.
3. Then journal_dirty_data is called on each affected data buffer (this
   apparently ensures that data is written before the metadata; I don't
   know how).
4. And then the block buffers are committed (marked as dirty so that
   the page flushing mechanism can send them to disk).

Now steps 3 and 4 seem to be independent, so I don't know how step 3
knows when step 4 completes. The only way I can think of is that step 4
somehow invokes a callback into step 3 once it is done?
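To frame the question, here is a toy user-space model of the four steps
above. It is only a sketch, not kernel code: every name in it (toy_buf,
toy_txn, toy_file_metadata, toy_dirty_data, toy_writeback, toy_commit)
is made up for illustration and is not JBD's API. It also rests on an
assumption that this message does not confirm, namely that
journal_dirty_data links each data buffer to the running transaction so
that the commit path can write those buffers itself before it logs the
metadata. Under that assumption, step 4 never needs to call back into
step 3.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_BUFS 8
#define LOG_LEN  256

/* Hypothetical stand-in for a buffer; NOT the kernel's buffer_head. */
struct toy_buf {
    const char *name;
    int dirty;               /* 1 while the contents exist only in memory */
};

/* Hypothetical stand-in for the running transaction. */
struct toy_txn {
    struct toy_buf *metadata[MAX_BUFS];
    int n_meta;
    struct toy_buf *data[MAX_BUFS];
    int n_data;
    char log[LOG_LEN];       /* order in which blocks "reach disk" */
};

/* Pretend to write one buffer out and record the order it happened in. */
void toy_io(struct toy_txn *t, const char *what, struct toy_buf *b)
{
    size_t used = strlen(t->log);

    snprintf(t->log + used, LOG_LEN - used, "%s:%s ", what, b->name);
    b->dirty = 0;
}

/* Step 1: the modified metadata block joins the transaction's list. */
void toy_file_metadata(struct toy_txn *t, struct toy_buf *b)
{
    assert(t->n_meta < MAX_BUFS);
    b->dirty = 1;
    t->metadata[t->n_meta++] = b;
}

/*
 * Steps 2-4: the data buffer is updated, attached to the transaction
 * (the ASSUMED role of journal_dirty_data) and left dirty for writeback.
 */
void toy_dirty_data(struct toy_txn *t, struct toy_buf *b)
{
    assert(t->n_data < MAX_BUFS);
    b->dirty = 1;
    t->data[t->n_data++] = b;
}

/* Background writeback may flush a dirty data buffer at any time. */
void toy_writeback(struct toy_txn *t, struct toy_buf *b)
{
    if (b->dirty)
        toy_io(t, "data", b);
}

/*
 * Commit: first write every data buffer that is still dirty, and only
 * then log the metadata. Buffers that writeback already cleaned are
 * skipped, so no completion callback is involved.
 */
void toy_commit(struct toy_txn *t)
{
    int i;

    for (i = 0; i < t->n_data; i++)
        if (t->data[i]->dirty)
            toy_io(t, "data", t->data[i]);
    for (i = 0; i < t->n_meta; i++)
        toy_io(t, "journal", t->metadata[i]);
    t->n_meta = 0;
    t->n_data = 0;
}
```

In this model the ordering comes from the commit walking its own data
list first, not from the I/O scheduler or from a completion handler.
Whether JBD really works this way is exactly what I am asking.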

Let me know if the above analysis makes sense.

Thanks,
-Joel

On Sun, Oct 25, 2009 at 9:40 PM, Mulyadi Santosa wrote:
> Hi Joel...
>
> On Mon, Oct 26, 2009 at 4:33 AM, Joel Fernandes wrote:
>> In data=ordered mode the ext3_ordered_commit_write function marks the
>> buffers as dirty, how then does the JBD ensure that the data is
>> written before the metadata? Once the data buffers are marked as
>> dirty, JBD doesn't have control anymore over when the data is
>> actually written to disk, right? Because the actual writing of the
>> data is handled by the page writeback mechanism (pdflush), right?
>
> I am not an expert, but here's my thought:
>
> I think writing to the backing device is not done simply by marking
> the buffer/page cache dirty. So, I think what the kernel does is first
> prepare an I/O queue to update the ext3 journal. Since we are talking
> about data=ordered here, only metadata is logged.
>
> Perhaps the key here is that metadata writing is done as an async
> completion handler of the data writing handler. Thus, data is written
> first, followed by metadata logging.
>
> Another possibility is composing a single atomic I/O write request,
> composed of data writing and metadata logging. Thus, the I/O scheduler
> won't be able to re-order the request and must complete the sequence
> as we prepared.
>
> --
> regards,
>
> Mulyadi Santosa
> Freelance Linux trainer and consultant
>
> blog: the-hydra.blogspot.com
> training: mulyaditraining.blogspot.com