From mboxrd@z Thu Jan 1 00:00:00 1970
From: Diego Calleja
Subject: Re: [RFC] ext3: per-process soft-syncing data=ordered mode
Date: Thu, 24 Jan 2008 22:50:26 +0100
Message-ID: <20080124225026.55fc6c20.diegocg@gmail.com>
References: <200801242336.00340.a1426z@gawab.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara
To: Al Boldi
Return-path:
Received: from fg-out-1718.google.com ([72.14.220.153]:50309 "EHLO fg-out-1718.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753607AbYAXVv6 convert rfc822-to-8bit (ORCPT ); Thu, 24 Jan 2008 16:51:58 -0500
Received: by fg-out-1718.google.com with SMTP id e21so373531fga.17 for ; Thu, 24 Jan 2008 13:51:56 -0800 (PST)
In-Reply-To: <200801242336.00340.a1426z@gawab.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Thu, 24 Jan 2008 23:36:00 +0300, Al Boldi wrote:

> Greetings!
>
> data=ordered mode has proven reliable over the years, and it does this by
> ordering filedata flushes before metadata flushes.  But this sometimes
> causes contention in the order of a 10x slowdown for certain apps, either
> due to the misuse of fsync or due to inherent behaviour like db's, as well
> as inherent starvation issues exposed by the data=ordered mode.

There's a related bug in bugzilla: http://bugzilla.kernel.org/show_bug.cgi?id=9546

Jan Kara's diagnosis there is different, but I think it may be the
same problem...

"One process does data-intensive load. Thus in the ordered mode the
transaction is tiny but has tons of data buffers attached. If commit
happens, it takes a long time to sync all the data before the commit
can proceed...
In the writeback mode, we don't wait for data buffers, in the
journal mode amount of data to be written is really limited by the
maximum size of a transaction and so we write by much smaller chunks
and better latency is thus ensured."

I'm hitting this bug too... it's surprising that not many people are
reporting it, because it's really annoying.

There's a patch by Jan Kara (I'm including it here because bugzilla
didn't include it and it took me a while to find). I don't know whether
it's supposed to fix the problem, but it'd be interesting to try:

Don't allow too much data buffers in a transaction.

diff --git a/fs/jbd/transaction.c b/fs/jbd/transaction.c
index 08ff6c7..e6f9dd6 100644
--- a/fs/jbd/transaction.c
+++ b/fs/jbd/transaction.c
@@ -163,7 +163,7 @@ repeat_locked:
 	spin_lock(&transaction->t_handle_lock);
 	needed = transaction->t_outstanding_credits + nblocks;
 
-	if (needed > journal->j_max_transaction_buffers) {
+	if (needed > journal->j_max_transaction_buffers || atomic_read(&transaction->t_data_buf_count) > 32768) {
 		/*
 		 * If the current transaction is already too large, then start
 		 * to commit it: we can then go back and attach this handle to
@@ -1528,6 +1528,7 @@ static void __journal_temp_unlink_buffer(struct journal_head *jh)
 		return;
 	case BJ_SyncData:
 		list = &transaction->t_sync_datalist;
+		atomic_dec(&transaction->t_data_buf_count);
 		break;
 	case BJ_Metadata:
 		transaction->t_nr_buffers--;
@@ -1989,6 +1990,7 @@ void __journal_file_buffer(struct journal_head *jh,
 		return;
 	case BJ_SyncData:
 		list = &transaction->t_sync_datalist;
+		atomic_inc(&transaction->t_data_buf_count);
 		break;
 	case BJ_Metadata:
 		transaction->t_nr_buffers++;
diff --git a/include/linux/jbd.h b/include/linux/jbd.h
index d9ecd13..6dd284a 100644
--- a/include/linux/jbd.h
+++ b/include/linux/jbd.h
@@ -541,6 +541,12 @@ struct transaction_s
 	int			t_outstanding_credits;
 
 	/*
+	 * Number of data buffers on t_sync_datalist attached to
+	 * the transaction.
+	 */
+	atomic_t		t_data_buf_count;
+
+	/*
 	 * Forward and backward links for the circular list of all transactions
 	 * awaiting checkpoint. [j_list_lock]
 	 */
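
For anyone who wants to reason about the patched check outside the kernel, here is a minimal userspace sketch of its logic. It is not kernel code: the `toy_*` types and functions are hypothetical stand-ins for `journal_t`, `transaction_t`, `__journal_file_buffer()` and `__journal_temp_unlink_buffer()`, a plain `int` replaces the `atomic_t` counter, and all locking is left out. Only the comparison itself and the 32768 limit come from the patch.

```c
#include <stdbool.h>

/* Limit on data buffers per transaction, taken from the patch. */
#define TOY_MAX_DATA_BUFFERS 32768

/* Hypothetical stand-in for journal_t: only the field the check reads. */
struct toy_journal {
	int max_transaction_buffers;	/* mirrors j_max_transaction_buffers */
};

/* Hypothetical stand-in for transaction_t. */
struct toy_transaction {
	int outstanding_credits;	/* mirrors t_outstanding_credits */
	int data_buf_count;		/* mirrors the new t_data_buf_count */
};

/* Models filing a buffer on the BJ_SyncData list (atomic_inc in the patch). */
void toy_file_data_buffer(struct toy_transaction *t)
{
	t->data_buf_count++;
}

/* Models unlinking a buffer from the BJ_SyncData list (atomic_dec in the patch). */
void toy_unlink_data_buffer(struct toy_transaction *t)
{
	t->data_buf_count--;
}

/*
 * Returns true when a handle asking for nblocks credits should trigger a
 * commit of the running transaction instead of joining it. The first half
 * of the condition is the original metadata-credit check; the second half
 * is the data-buffer cap added by the patch.
 */
bool toy_must_commit(const struct toy_journal *j,
		     const struct toy_transaction *t, int nblocks)
{
	int needed = t->outstanding_credits + nblocks;

	return needed > j->max_transaction_buffers ||
	       t->data_buf_count > TOY_MAX_DATA_BUFFERS;
}
```

The point of the second condition is that a transaction with very few metadata credits can still be forced to commit once it has accumulated enough ordered data buffers. If each buffer were a 4 KiB block (an assumption; the mail does not state the block size), the limit would correspond to 128 MiB of pending data per transaction.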