To: linux-btrfs@vger.kernel.org
From: Casper Bang
Subject: Re: Experiences: Why BTRFS had to yield for ZFS
Date: Wed, 19 Sep 2012 07:28:05 +0000 (UTC)
References: <5058068C.4040704@oracle.com>

> Anand Jain (oracle.com) writes:
> archive-log-apply script - if you could, can you share the
> script itself? Or provide more details about the script.
> (It will help to understand the workload in question.)

Our setup entails a whole bunch of scripts, but the apply script looks
like this (orion is the production environment, pandium is the shadow):
http://pastebin.com/k4T7deap

The script invokes rman, passing it rman_recover_database.rcs:

  connect target /
  run {
    crosscheck archivelog all;
    delete noprompt expired archivelog all;
    catalog start with '/backup/oracle/flash_recovery_area/FROM_PROD/archivelog' noprompt;
    recover database;
  }

(A rough sketch of the surrounding apply loop is appended at the end of
this mail.)

We receive a 1GB archivelog roughly every 20 minutes, depending on the
workload of the production environment. The apply rate starts out fine,
with btrfs > ext4 > ZFS, but ends up as ZFS > ext4 > btrfs.

The following numbers are from our test on consumer spinning-platter
disks, but they are equally representative of the SSD numbers we got.
The figure is a realtime-to-SCN ratio; a factor below 1 means the logs
are applied more slowly than production creates them (a small worked
example is also appended below):

- ext4 starts out at a ratio of about 3.4 and ends down around a factor of 2.2.
- ZFS starts out at a ratio of about 7.5 and ends down around a factor of 4.4.
- btrfs starts out at a ratio of about 2.2 and ends down around a factor of 0.8.

This of course means we will never be able to catch up with production,
as btrfs can't apply the logs as fast as they are created. It was even
worse with btrfs on our 10xSSD server, where 20 minutes of realtime work
would end up taking some 5 hours to apply (factor 0.06), which is
obviously useless to us.

I should point out that during this process we also had to move some
large backup sets around, and several times we saw btrfs eating massive
IO without ever finishing a simple mv command. I'm inclined to believe
we've found some weak corner case, perhaps in combination with SSDs -
but it led us to compare with ext4 and ZFS, and to dismiss btrfs in
favour of ZFS for this workload, since ZFS solves our problem.
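
For completeness, here is a minimal sketch of what an apply loop of this
kind typically looks like. This is not the actual pastebin script; the
ORACLE_SID, the paths, the log file and the rsync transfer from orion
are assumptions made purely for illustration.

  #!/bin/bash
  # Illustrative apply loop only - not the real script from the pastebin.
  # ORACLE_SID, paths and the rsync source below are assumptions.
  set -euo pipefail

  export ORACLE_SID=pandium
  ARCHDIR=/backup/oracle/flash_recovery_area/FROM_PROD/archivelog

  while true; do
      # Fetch newly shipped archivelogs from the production host (orion).
      rsync -a --ignore-existing orion:/backup/oracle/archivelog/ "$ARCHDIR"/

      # Catalog whatever arrived and roll the shadow forward;
      # rman_recover_database.rcs is the command file quoted above
      # (it performs its own "connect target /").
      rman @/home/oracle/rman_recover_database.rcs >> /var/log/archive-log-apply.log 2>&1

      # A new ~1GB archivelog arrives roughly every 20 minutes.
      sleep 300
  done

The real script no doubt differs in the details, but the workload shape
is the same: a steady stream of ~1GB sequential archivelogs being
cataloged and applied by RMAN.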
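
And to make the factor numbers concrete, a throwaway helper (plain
bash/awk, written here just for illustration) that computes how many
minutes of production redo get applied per minute of wall-clock
recovery; the wall-clock minutes in the examples are back-calculated
from the factors quoted above.

  # apply_ratio MINUTES_OF_REDO MINUTES_TO_APPLY
  # A factor below 1.0 means the shadow can never catch up with production.
  apply_ratio() {
      awk -v redo="$1" -v apply="$2" 'BEGIN { printf "%.2f\n", redo / apply }'
  }

  apply_ratio 20 9      # ~2.2 - roughly where btrfs starts on spinning disk
  apply_ratio 20 25     # 0.80 - where btrfs ends up, i.e. falling behind
  apply_ratio 20 333    # 0.06 - the 10xSSD case: ~20 min of redo took 5+ hours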