From: Aastha Mehta <aasthakm@gmail.com>
To: Josef Bacik <jbacik@fusionio.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Questions regarding logging upon fsync in btrfs
Date: Mon, 30 Sep 2013 23:07:20 +0200
Message-ID: <CAEx9m46QXYSHjCAF3enUom__SwR8vDNsw_qmzFqXZPnPwxF_1w@mail.gmail.com>
In-Reply-To: <20130930204738.GB27490@localhost.localdomain>
On 30 September 2013 22:47, Josef Bacik <jbacik@fusionio.com> wrote:
> On Mon, Sep 30, 2013 at 10:30:59PM +0200, Aastha Mehta wrote:
>> On 30 September 2013 22:11, Josef Bacik <jbacik@fusionio.com> wrote:
>> > On Mon, Sep 30, 2013 at 09:32:54PM +0200, Aastha Mehta wrote:
>> >> On 29 September 2013 15:12, Josef Bacik <jbacik@fusionio.com> wrote:
>> >> > On Sun, Sep 29, 2013 at 11:22:36AM +0200, Aastha Mehta wrote:
>> >> >> Thank you very much for the reply. That clarifies a lot of things.
>> >> >>
>> >> >> I was trying a small test case that opens a file, writes a block of
>> >> >> data, calls fsync and then closes the file. If I understand correctly,
>> >> >> fsync would return only after all in-memory buffers have been
>> >> >> committed to disk. I have added a few print statements in the
>> >> >> __extent_writepage function, and I notice that the function gets
>> >> >> called a bit later after fsync returns. It seems that I am not
>> >> >> guaranteed to see the data going to disk by the time fsync returns.
>> >> >>
>> >> >> Am I doing something wrong, or am I looking at the wrong place for
>> >> >> disk write? This happens both with tree logging enabled as well as
>> >> >> with notreelog.
>> >> >>
>> >> >
>> >> > So 3.1 was a long time ago, and while it certainly had issues, I don't think
>> >> > it was _that_ broken. You are probably better off instrumenting a recent
>> >> > kernel such as 3.11, or just building btrfs-next from git. But if I were to
>> >> > guess, I'd say that __extent_writepage was how both data and metadata were
>> >> > written out at the time (I
>> >> > don't think I changed it until 3.2 or something later) so what you are likely
>> >> > seeing is the normal transaction commit after the fsync. In the case of
>> >> > notreelog we are likely starting another transaction and you are seeing that
>> >> > commit (at the time the transaction kthread would start a transaction even if
>> >> > none had been started yet.) Thanks,
>> >> >
>> >> > Josef
>> >>
>> >> Is there any special handling for very small file writes, less than 4K? As
>> >> I understand it, there is an optimization to inline the first extent in a
>> >> file if it is smaller than 4K. Does it affect the writeback on fsync as well?
>> >> I did set the max_inline mount option to 0, but even then there seems to be
>> >> some difference in fsync behaviour between writing a first extent of less
>> >> than 4K and writing 4K or more.
>> >>
>> >
>> > Yeah if the file is an inline extent then it will be copied into the log
>> > directly and the log will be written out, no going through the data write path
>> > at all. Max inline == 0 should make it so we don't inline, so if it isn't
>> > honoring that then that may be a bug. Thanks,
>> >
>> > Josef
>>
>> I tried it on the 3.12-rc2 release, and it seems there is a bug then.
>> Please find attached logs to confirm.
>> The older release is probably affected as well.
>>
>
> Oooh ok I understand, you have your printk's in the wrong place ;).
> do_writepages doesn't necessarily mean you are writing something. If you want
> to see if stuff got written to the disk I'd put a printk at run_delalloc_range
> and have it spit out the range it is writing out since that's what we think is
> actually dirty. Thanks,
>
> Josef
No, but I also placed dump_stack() at the beginning of
__extent_writepage. run_delalloc_range is called only from
__extent_writepage, so if it were being called, the dump_stack() at the
top of __extent_writepage would have printed as well, no?
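
For reference, the kind of test case described earlier in this thread (open a file, write a block of data, fsync, close) can be sketched roughly as below. This is only an illustration: the helper name write_and_fsync, the filler byte and the file mode are assumptions made here, and it is not the exact program that produced the attached logs.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (name chosen for illustration only):
 * create/truncate `path`, write `len` bytes of filler, fsync, close.
 * Returns 0 on success and -1 on any failure.
 */
int write_and_fsync(const char *path, size_t len)
{
    char buf[4096];
    size_t done = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;

    memset(buf, 'a', sizeof(buf));

    /* Write in chunks of at most one 4K buffer, handling short writes. */
    while (done < len) {
        size_t chunk = len - done < sizeof(buf) ? len - done : sizeof(buf);
        ssize_t n = write(fd, buf, chunk);

        if (n <= 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }

    /* fsync() should not return until the data has reached stable storage. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }

    return close(fd);
}
```

Calling it once with a length below 4096 and once with exactly 4096 exercises the two cases being compared here (first extent smaller than 4K versus 4K or more).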
Thanks
--
Aastha Mehta
MPI-SWS, Germany
E-mail: aasthakm@mpi-sws.org