linux-btrfs.vger.kernel.org archive mirror
* Re: [PATCH] btrfs file write debugging patch
@ 2011-03-01 16:36 Xin Zhong
  2011-03-01 21:09 ` Mitch Harder
  0 siblings, 1 reply; 40+ messages in thread
From: Xin Zhong @ 2011-03-01 16:36 UTC (permalink / raw)
  To: linux-btrfs


Hi, Mitch
I think you can configure ftrace to trace only the function calls in btrfs.ko, which will save a lot of trace buffer space. See the command below:
# echo ':mod:btrfs' > /sys/kernel/debug/tracing/set_ftrace_filter
And please send out the full ftrace log again.

Another helpful piece of information might be the strace log of the wmldbcreate process. It will show us the I/O pattern of this command.
Thanks a lot for your help!

* Re: [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page
@ 2011-02-25 18:43 Mitch Harder
  2011-02-28  1:46 ` [PATCH] btrfs file write debugging patch Chris Mason
  0 siblings, 1 reply; 40+ messages in thread
From: Mitch Harder @ 2011-02-25 18:43 UTC (permalink / raw)
  To: Chris Mason
  Cc: Maria Wikström, Zhong, Xin, Johannes Hirte,
	linux-btrfs@vger.kernel.org

On Fri, Feb 25, 2011 at 11:11 AM, Mitch Harder
<mitch.harder@sabayonlinux.org> wrote:
> On Thu, Feb 24, 2011 at 5:14 PM, Mitch Harder
> <mitch.harder@sabayonlinux.org> wrote:
>> On Thu, Feb 24, 2011 at 10:32 AM, Mitch Harder
>> <mitch.harder@sabayonlinux.org> wrote:
>>> On Thu, Feb 24, 2011 at 10:19 AM, Chris Mason <chris.mason@oracle.com> wrote:
>>>> Excerpts from Mitch Harder's message of 2011-02-24 11:03:07 -0500:
>>>>> On Thu, Feb 24, 2011 at 10:00 AM, Chris Mason <chris.mason@oracle.com> wrote:
>>>>> > Excerpts from Mitch Harder's message of 2011-02-24 10:55:15 -0500:
>>>>> >> 2011/2/24 Maria Wikström <maria@ponstudios.se>:
>>>>> >> > On Mon, 2011-02-21 at 09:51 +0800, Zhong, Xin wrote:
>>>>> >> >> The backtrace in your attachment looks like a known bug of 2.6.37 which has already been fixed in 2.6.38. I have no idea why the latest btrfs still hangs in your environment if there's no debug info...
>>>>> >> >>
>>>>> >> >
>>>>> >> > Haha, yes that's very hard :)
>>>>> >> >
>>>>> >> > 2.6.38-rc6 and btrfs-unstable behave the same way. I can close the
>>>>> >> > process with ctrl+c and it disappears a few seconds later. There is no
>>>>> >> > CPU usage. Reading works because I can start htop and watch "svn info"
>>>>> >> > disappear, but everything writing to btrfs slows down to a crawl. It
>>>>> >> > takes about 1 minute to log in. So I had to put the logs on another
>>>>> >> > partition using ext3 to get the output from sysrq+t.
>>>>> >> >
>>>>> >>
>>>>> >> I believe I've been experiencing this issue also.  However, my problem
>>>>> >> usually results in a "No space left on device" error rather than a
>>>>> >> lock-up or crash.  But I've bisected my issue to this patch, and my
>>>>> >> "btrfs fi show" and "btrfs fi df" look similar to others who've
>>>>> >> posted to this thread with all my space being allocated, but not used.
>>>>> >>
>>>>> >
>>>>> > Sorry, which patch did you bisect the problem down to?
>>>>> >
>>>>>
>>>>> The patch at the head of this thread:
>>>>>
>>>>> Btrfs: pwrite blocked when writing from the mmaped buffer of the same page
>>>>
>>>> Hmmm, that patch shouldn't be changing our performance under delalloc
>>>> pressure, and it really shouldn't impact early enospc.
>>>>
>>>
>>> I've bisected this issue around where this patch went into git, and
>>> I've also constructed a testing patch that reverts this patch, and
>>> placed it on top of the current Btrfs git sources (I understand this
>>> patch addresses a real issue, this was just for testing).
>>>
>>> It could be that this patch just "uncovers" another problem, but all
>>> my tests seem to point to this patch triggering this issue.
>>>
>>
>> I don't believe the previous ftrace I supplied had a large enough scope
>> to capture the issue.
>>
>> I've expanded my ftrace buffer, and filtered out everything but btrfs*
>> function calls ("# echo btrfs* >
>> /sys/kernel/debug/tracing/set_ftrace_filter").
>>
>> In this trace, I see btrfs spending a great deal of time in a while
>> loop (while (iov_iter_count(&i) > 0) {) in the btrfs_file_aio_write()
>> function in file.c without exiting the function.
>>
>> I'm going to try to inject some debugging trace_printk() statements to
>> find if that portion of code is proceeding normally with my test case.
>>
>> I've put my expanded trace up on my local server, but my upload
>> bandwidth is pretty sad, and it may take a few minutes to transfer
>> even though it's only a 6MB file.
>>
>> http://dontpanic.dyndns.org/trace-openmotif-btrfs-v3.gz
>>
>
> Apologies for only hitting "Reply" instead of "Reply-All" on my last message.
>
> I've inserted additional trace_printk() calls into the btrfs_file_aio_write()
> and btrfs_copy_from_user() functions in file.c in order to characterize
> the problem I've been encountering.
>
> I can see btrfs getting stuck in a loop in the "while
> (iov_iter_count(&i) > 0) {}" portion of the btrfs_file_aio_write()
> function.
>
> The loop is more-or-less following this process (from within the
> "while (iov_iter_count(&i) > 0) {}" loop):
>
> (1) Reserve some space with btrfs_delalloc_reserve_space()
> (2) Prepare the reserved space with prepare_pages()
> (3) Call btrfs_copy_from_user() to copy to the prepared space.
> -------------> From btrfs_copy_from_user()
> (4) ........Try to copy with copied = iov_iter_copy_from_user_atomic()
> (5) ........The above operation results with copied == 0. Break and
> return with a return value of 0 bytes copied.
> (6) There is no special handling for copied == 0 in the "while
> (iov_iter_count(&i) > 0) {}" loop, so it loops back around, reserves
> some more space, and tries again.
>
> If I look back at how the code was set up before the patch at the head
> of this thread was applied (Btrfs: pwrite blocked when writing from
> the mmaped buffer of the same page), the btrfs_copy_from_user()
> function had some handling for "copied == 0" that would change the
> scope of the amount to write, and loop back to try the write again.
>
> I attempted to construct a patch that just reverted the handling for
> "copied == 0" in btrfs_copy_from_user(), however, that just resulted
> in my computer locking up when it reached the point where it was
> previously beginning to allocate disk space.
>
> So, I apologize for not having a patch to address the issue I'm
> seeing, but I hope I've added some insight.
>

Some clarification on my previous message...

After looking at my ftrace log more closely, I can see where Btrfs is
trying to release the allocated pages.  However, the calculation for
the number of dirty_pages is equal to 1 when "copied == 0".

So I'm seeing at least two problems:
(1)  It keeps looping when "copied == 0".
(2)  One dirty page is not being released on every loop even though
"copied == 0" (at least this problem keeps it from being an infinite
loop by eventually exhausting reservable space on the disk).

end of thread, other threads:[~2011-03-08  2:51 UTC | newest]

Thread overview: 40+ messages
2011-03-01 16:36 [PATCH] btrfs file write debugging patch Xin Zhong
2011-03-01 21:09 ` Mitch Harder
2011-03-02 10:58   ` Zhong, Xin
2011-03-02 14:00     ` Xin Zhong
2011-03-04  1:51     ` Chris Mason
2011-03-04  2:32       ` Josef Bacik
2011-03-04  2:42         ` Zhong, Xin
2011-03-04  2:41           ` Josef Bacik
2011-03-04  8:41             ` Zhong, Xin
2011-03-05 16:56             ` Mitch Harder
2011-03-05 17:28               ` Xin Zhong
2011-03-04 12:19       ` Chris Mason
2011-03-04 14:25         ` Xin Zhong
2011-03-04 15:33           ` Mitch Harder
2011-03-04 17:21             ` Mitch Harder
2011-03-05  1:00               ` Xin Zhong
2011-03-05 13:14                 ` Mitch Harder
2011-03-05 16:50                   ` Mitch Harder
2011-03-06 18:00                     ` Chris Mason
2011-03-07  0:58                       ` Chris Mason
2011-03-07  6:07                         ` Mitch Harder
2011-03-07  6:37                           ` Zhong, Xin
2011-03-07 19:56                           ` Maria Wikström
2011-03-07 22:12                             ` Johannes Hirte
2011-03-08  2:51                               ` Zhong, Xin
  -- strict thread matches above, loose matches on Subject: below --
2011-02-25 18:43 [PATCH v2]Btrfs: pwrite blocked when writing from the mmaped buffer of the same page Mitch Harder
2011-02-28  1:46 ` [PATCH] btrfs file write debugging patch Chris Mason
2011-02-28  8:56   ` Zhong, Xin
2011-02-28 14:02     ` Chris Mason
2011-02-28 10:13   ` Johannes Hirte
2011-02-28 14:00     ` Chris Mason
2011-02-28 16:10     ` Josef Bacik
2011-02-28 16:45       ` Maria Wikström
2011-02-28 17:47         ` Mitch Harder
2011-02-28 20:20           ` Mitch Harder
2011-03-01  5:09             ` Mitch Harder
2011-03-01 10:14             ` Zhong, Xin
2011-03-01 11:56               ` Zhong, Xin
2011-03-01 14:54                 ` Mitch Harder
2011-03-01 14:51               ` Mitch Harder
2011-03-01 21:56             ` Piotr Szymaniak
