From mboxrd@z Thu Jan  1 00:00:00 1970
From: Steven Pratt <slpratt@austin.ibm.com>
Subject: Re: New experimental btrfs branch ready for testing
Date: Fri, 05 Jun 2009 16:27:55 -0500
Message-ID: <4A298DDB.6070002@austin.ibm.com>
References: <20090601210447.GC3890@think> <4A281A3C.6000006@austin.ibm.com> <20090605142008.GB6942@think> <4A294194.6050006@austin.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
To: Chris Mason <chris.mason@oracle.com>, linux-btrfs@vger.kernel.org
Return-path: <linux-btrfs-owner@vger.kernel.org>
In-Reply-To: <4A294194.6050006@austin.ibm.com>
List-ID: <linux-btrfs.vger.kernel.org>

Steven Pratt wrote:
> Chris Mason wrote:
>> On Thu, Jun 04, 2009 at 02:02:20PM -0500, Steven Pratt wrote:
>>  
>>> Chris Mason wrote:
>>>    
>>>> Hello everyone,
>>>>
>>>> Yan Zheng has been doing some major surgery to the back references and
>>>> extent allocation code, tackling bottlenecks in the code that tracks
>>>> extents.  It scales better with many snapshots and performs better in
>>>> the common case of no snapshots at all.
>>>>
>>>> THE NEW CODE IS A FORWARD ROLLING DISK FORMAT CHANGE.  This means it is
>>>> compatible with the current btrfs disk format, but once you mount a
>>>> filesystem with the new code, it WILL NO LONGER BE MOUNTABLE FROM OLD
>>>> KERNELS.  Old kernels spit out an error message when you try them on new
>>>> format filesystems.
>>>>
>>>> This is a large change, and I'm hoping to have it stable in time for the
>>>> 2.6.31 merge window.  I've been testing it for about a week now, and
>>>> haven't been able to cause major problems yet.  But, testing the
>>>> compatibility with old format filesystems is the hard part, and
>>>> everyone that pulls the new code should backup their data first.
>>>>
>>>> I've setup git branches called newformat where you can pull the new code.
>>>>
>>>> For the kernel (based on 2.6.30-rc7):
>>>>
>>>> git pull git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git newformat
>>>>
>>>>         
>>> So I started the performance runs on this. The base tests completed 
>>> fine on the raid system and I will post results as soon as I can 
>>> finish postprocessing, but when I tried to do nodatacow on that 
>>> machine it crashed pretty early. Here is console log:
>>>     
>>
>> Hi Steve,
>>
>> Thanks again for hammering on these.  Yan Zheng and I have both been
>> trying to reproduce problems with nodatacow and with the database random
>> write run.
>>   
> So now that the raid machine is actually up, I discovered it got 
> further than I thought on nodatacow. It did all the read tests, but 
> appeared to die on the 16 thread random write (not odirect). There were no 
> messages logged to /var/log/messages at all. Last I saw was:
>
> Jun  4 03:14:24 btrfs1 kernel: [65856.065491] btrfs: setting nodatacow
> Jun  4 15:24:45 btrfs1 syslogd 1.4.1: restart.
>
> Just dead until we rebooted machine later that day.

So the raid system completed the re-run of the nodatacow runs without 
error, so I still have no idea what happened on this box the first time 
around.  As for the single disk system, it died during the random write 
test again, but it now looks like we might have a real HW failure: this 
time we see SCSI error messages.  I have replaced the test disks and 
will try one more time.

The net is, I would hold off digging too much into this, since even I 
don't have any repeatable errors.

Steve
>
>> But, so far we haven't been able to trigger any crashes.    Do you see
>> anything in your config or setup that is unusual?
>>   
> No, other than using the old mkfs with the new format.  I've kicked 
> off new runs to see if I hit the same issues
>
> Steve
>> -chris
>>   