From mboxrd@z Thu Jan  1 00:00:00 1970
From: Josef Bacik
Subject: Re: ext3 writing of data before metadata in ordered mode
Date: Mon, 26 Oct 2009 13:58:32 -0400
Message-ID: <20091026175831.GD20565@localhost.localdomain>
References: <9ff7a3bc0910251433k2c719989n981e3652162cafdf@mail.gmail.com>
 <20091026131939.GA20565@localhost.localdomain>
 <9ff7a3bc0910261021k5d9bd5e2r69fd64bfbf882da2@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Josef Bacik , linux-fsdevel@vger.kernel.org, kernelnewbies
To: Joel Fernandes
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:53715 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752528AbZJZR7N
 (ORCPT ); Mon, 26 Oct 2009 13:59:13 -0400
Content-Disposition: inline
In-Reply-To: <9ff7a3bc0910261021k5d9bd5e2r69fd64bfbf882da2@mail.gmail.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Mon, Oct 26, 2009 at 10:21:52AM -0700, Joel Fernandes wrote:
> Hi Josef, your analysis makes perfect sense.  Thank you so much.
>
> Another question: what could explain the slowness in data=ordered
> mode?  I believe everything is asynchronous, right?  Various lists are
> maintained, and kjournald keeps checking these lists and flushing
> data before the metadata is written and marked dirty, as you said.  Is the
> slowness because the flushing of data is done earlier than required,
> unlike when done by pdflush, which waits for a certain amount of time?
>

I'm not sure what slowness you are talking about, but I will assume you mean
the slowness of committing a transaction.  Basically, everything that has
happened since the last journal commit must be taken care of: all of the data
that has been written needs to be written out synchronously, then its
metadata is written to the journal, and only then can we let things start
going again while the metadata is written to where it's supposed to go
asynchronously.  The key part of that is that _all_ of the data needs to be
written out.

This is slow compared to ext4 because ext4 has delayed allocation: even
though we may have dirtied a lot of pages since the last transaction
occurred, they may not have been allocated yet, so no metadata has changed,
and we don't have to force that data out to disk.  The journal commit
therefore takes much less time because there is much less work to do.  I hope
that answers your question.

Thanks,

Josef
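
P.S.  In case a sketch helps, below is a small compilable toy model of that
commit sequence.  None of these are real ext3/jbd function names -- they are
just made-up stand-ins that print the three steps so the ordering is
explicit.

/* Toy model only, not kernel code: stand-ins for what kjournald does for
 * one transaction in data=ordered mode. */
#include <stdio.h>

static void flush_ordered_data(void)
{
        /* Step 1: synchronously write out every data page dirtied since the
         * last commit.  This is what makes ordered-mode commits slow:
         * _all_ of the data has to reach the disk first. */
        printf("flush all dirty data pages and wait for completion\n");
}

static void write_metadata_to_journal(void)
{
        /* Step 2: with the data safely on disk, write the metadata blocks
         * and a commit record to the journal. */
        printf("write metadata + commit record to the journal\n");
}

static void checkpoint_metadata(void)
{
        /* Step 3: later, asynchronously, copy the journalled metadata to
         * its final location; new transactions can already proceed. */
        printf("checkpoint metadata to its home location (async)\n");
}

int main(void)
{
        flush_ordered_data();
        write_metadata_to_journal();
        checkpoint_metadata();
        return 0;
}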