Message-ID: <501923D4.5070607@gmail.com>
Date: Wed, 01 Aug 2012 20:40:52 +0800
From: Liu Bo
To: Mitch Harder
CC: linux-btrfs
Subject: Re: Btrfs Intermittent ENOSPC Issues

On 08/01/2012 03:37 AM, Mitch Harder wrote:
> I've been working on running down intermittent ENOSPC issues.
>
> I can only seem to replicate ENOSPC errors when running zlib
> compression. However, I have been seeing similar ENOSPC errors to a
> lesser extent when playing with the LZ4HC patches.
>
> I apologize for not following up on this sooner, but I had drifted
> away from using zlib, and didn't notice there was still an issue.
>
> My test case involves un-archiving linux git sources to a freshly
> formatted btrfs partition, mounted with compress-force=zlib. I am
> using a 16 GB partition on a 250 GB Western Digital SATA Hard Disk.
> My current kernel is x86_64 linux-3.5.0 merged with Chris' for-linus
> branch (for 3.6_rc). This includes Josef's "Btrfs: flush delayed
> inodes if we're short on space" patch.
>
> I haven't isolated a root cause, but here's the feedback I have so far.
>
> (1) My test case won't generate ENOSPC issues with lzo compression or
> no compression.
>
> (2) I've inserted some trace_printk debugging statements to trace
> back the call stack, and the ENOSPC errors only seem to occur on a new
> transaction: vfs_create -> btrfs_create -> btrfs_start_transaction ->
> start_transaction -> btrfs_block_rsv_add -> reserve_metadata_bytes.
>
> (3) The ENOSPC condition will usually clear in a few seconds,
> allowing writes to proceed.
>
> (4) I've added a loop to the reserve_metadata_bytes() function to
> loop back with 'flush_state = FLUSH_DELALLOC (1)' for 1024 retries.
> This reduces and/or eliminates the ENOSPC errors, as if we're waiting
> on something else that is trying to complete.
>
> (5) I've been heavily debugging the reserve_metadata_bytes()
> function, and I'm seeing problems with the way
> space_info->bytes_may_use is handled. The space_info->bytes_may_use
> value is important in determining if we're in an over-commit state.
> But space_info->bytes_may_use value is often increased arbitrarily
> without any mechanism for correcting the value. Subsequently,
> space_info->bytes_may_use quickly increases in size to the point where
> we are always in fallback allocation as if we're overcommitted. In my
> trials, it was hard to capture a point where space_info->bytes_may_use
> wasn't larger than the available size.
>

Interesting results.

IIRC, space_info->bytes_may_use seems not to be arbitrarily increased:

  Block_rsv wants NUM bytes
      -> space_info's bytes_may_use += NUM
  Block_rsv uses SOME bytes and releases itself
      -> space_info's bytes_may_use -= (NUM - SOME)

So IMO it is 'over-reserve' that causes ENOSPC.

Maybe we can try to find why more bytes need to be reserved with
compress=zlib/compress=LZ4HC.

thanks,
liubo

> (6) Even though reserve_metadata_bytes() is almost always in fallback
> overcommitted mode, it is still working pretty well, and I've
> developed the perception that the problem is something that needs to
> finish elsewhere.
>
> Sorry for not having a patch to fix the issue. I'll try to keep
> banging on it as time allows.
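
For reference, the reserve/release accounting described in the reply (bytes_may_use += NUM on reserve, bytes_may_use -= (NUM - SOME) on release) can be modelled with a small toy sketch. This is not kernel code: the class SpaceInfo, the methods reserve() and use_and_release(), and the byte sizes are all made up for illustration, and the sketch follows only the two rules stated in the message. How the SOME bytes that were actually used are accounted for afterwards is not covered in the message and is not modelled here.

```python
# Toy model of the accounting rules quoted in the reply; not kernel code.
# SpaceInfo, reserve() and use_and_release() are hypothetical names.


class SpaceInfo:
    def __init__(self):
        self.bytes_may_use = 0

    def reserve(self, num):
        # "Block_rsv wants NUM bytes -> bytes_may_use += NUM"
        self.bytes_may_use += num

    def use_and_release(self, num, some):
        # "Block_rsv uses SOME bytes and releases itself
        #  -> bytes_may_use -= (NUM - SOME)"
        if not 0 <= some <= num:
            raise ValueError("SOME must be between 0 and NUM")
        self.bytes_may_use -= num - some


info = SpaceInfo()
NUM, SOME = 8 * 4096, 4096  # made-up sizes: reserve 8 blocks, use 1

# Three reservations are outstanding at the same time.
for _ in range(3):
    info.reserve(NUM)
peak = info.bytes_may_use  # the full 3 * NUM is charged while they are held

# Each reservation uses SOME bytes and hands back the rest.
for _ in range(3):
    info.use_and_release(NUM, SOME)
settled = info.bytes_may_use  # the unused NUM - SOME per reservation is gone

print(peak, settled, peak - settled)
```

In this model the counter is never raised without a matching rule to lower it again, but while reservations are outstanding it is charged the full NUM rather than the SOME that ends up being needed. The gap between peak and settled is the 'over-reserve' the reply points to; the larger NUM is relative to SOME, the sooner a space check based on bytes_may_use could report ENOSPC even though most of the reserved bytes will be handed back.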