From: Ric Wheeler
Subject: background on the ext3 batching performance issue
Date: Thu, 28 Feb 2008 07:09:17 -0500
Message-ID: <47C6A46D.8020700@emc.com>
To: "Theodore Ts'o", adilger@sun.com, David Chinner, jack@ucw.cz
Cc: "Feld, Andy", linux-fsdevel@vger.kernel.org

At the LSF workshop, I mentioned that we have tripped across an
embarrassing performance issue in the jbd transaction code, which is
clearly not tuned for low-latency devices.

The short summary is that we can write roughly 800 10k files/sec in a
write/fsync/close loop with a single thread, but drop to under 250
files/sec with 2 or more threads.

This is pretty easy to reproduce with any synchronous small-file write
workload (i.e., fsync() each file before close). We used my fs_mark
tool to reproduce it.

The core of the issue is the call in the jbd transaction code to
schedule_timeout_uninterruptible(1), which causes us to sleep for 4ms:

    pid = current->pid;
    if (handle->h_sync && journal->j_last_sync_writer != pid) {
        journal->j_last_sync_writer = pid;
        do {
            old_handle_count = transaction->t_handle_count;
            schedule_timeout_uninterruptible(1);
        } while (old_handle_count != transaction->t_handle_count);
    }

This is quite topical to the concern we had with low-latency devices in
general, and specifically things like SSDs.

regards,

ric
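
For reference, the 4ms sleep and the ceiling of about 250 files/sec are consistent with a one-jiffy sleep on a kernel built with HZ=250. The sketch below only works through that arithmetic. HZ=250 is an assumption inferred from the 4ms figure (the message does not state it), device latency is ignored, and the function names are hypothetical helpers, not kernel APIs.

```c
/*
 * Back-of-the-envelope model of the batching delay.
 * Assumption: each sync commit by a "new" writer sleeps for a whole
 * number of jiffies before it can proceed, and nothing else costs time.
 */

/* Length of one jiffy in milliseconds for a given tick rate (HZ). */
double jiffy_ms(int hz)
{
    return 1000.0 / hz;
}

/*
 * Upper bound on sync commits per second when every commit first
 * sleeps for `ticks` jiffies.
 */
double max_commits_per_sec(int hz, int ticks)
{
    return 1000.0 / (jiffy_ms(hz) * ticks);
}
```

Under these assumptions, jiffy_ms(250) gives 4 ms and max_commits_per_sec(250, 1) gives 250 per second, which lines up with the multi-threaded rate reported above. The same model with HZ=1000 would give a 1 ms sleep, so the ceiling depends on the tick rate rather than on how fast the storage is.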