From: Josef Bacik <jbacik@fusionio.com>
To: Aastha Mehta <aasthakm@gmail.com>
Cc: Josef Bacik <jbacik@fusionio.com>,
linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Questions regarding logging upon fsync in btrfs
Date: Tue, 1 Oct 2013 13:34:16 -0400
Message-ID: <20131001173416.GF27490@localhost.localdomain>
In-Reply-To: <CAEx9m46QXYSHjCAF3enUom__SwR8vDNsw_qmzFqXZPnPwxF_1w@mail.gmail.com>

On Mon, Sep 30, 2013 at 11:07:20PM +0200, Aastha Mehta wrote:
> On 30 September 2013 22:47, Josef Bacik <jbacik@fusionio.com> wrote:
> > On Mon, Sep 30, 2013 at 10:30:59PM +0200, Aastha Mehta wrote:
> >> On 30 September 2013 22:11, Josef Bacik <jbacik@fusionio.com> wrote:
> >> > On Mon, Sep 30, 2013 at 09:32:54PM +0200, Aastha Mehta wrote:
> >> >> On 29 September 2013 15:12, Josef Bacik <jbacik@fusionio.com> wrote:
> >> >> > On Sun, Sep 29, 2013 at 11:22:36AM +0200, Aastha Mehta wrote:
> >> >> >> Thank you very much for the reply. That clarifies a lot of things.
> >> >> >>
> >> >> >> I was trying a small test case that opens a file, writes a block of
> >> >> >> data, calls fsync and then closes the file. If I understand correctly,
> >> >> >> fsync would return only after all in-memory buffers have been
> >> >> >> committed to disk. I have added a few print statements in the
> >> >> >> __extent_writepage function, and I notice that the function gets
> >> >> >> called a bit later after fsync returns. It seems that I am not
> >> >> >> guaranteed to see the data going to disk by the time fsync returns.
> >> >> >>
> >> >> >> Am I doing something wrong, or am I looking at the wrong place for
> >> >> >> disk write? This happens both with tree logging enabled as well as
> >> >> >> with notreelog.
> >> >> >>
> >> >> >
> >> >> > So 3.1 was a long time ago, and while it certainly had issues, I don't think it was
> >> >> > _that_ broken. You are probably better off instrumenting a recent kernel, 3.11,
> >> >> > or just building btrfs-next from git. But if I were to make a guess, I'd say that
> >> >> > __extent_writepage was how both data and metadata were written out at the time (I
> >> >> > don't think I changed it until 3.2 or something later) so what you are likely
> >> >> > seeing is the normal transaction commit after the fsync. In the case of
> >> >> > notreelog we are likely starting another transaction and you are seeing that
> >> >> > commit (at the time the transaction kthread would start a transaction even if
> >> >> > none had been started yet.) Thanks,
> >> >> >
> >> >> > Josef
> >> >>
> >> >> Is there any special handling for very small file writes, less than 4K? As
> >> >> I understand there is an optimization to inline the first extent in a file if
> >> >> it is smaller than 4K, does it affect the writeback on fsync as well? I did
> >> >> set the max_inline mount option to 0, but even then it seems there is
> >> >> some difference in fsync behaviour between writing a first extent of less
> >> >> than 4K and writing one of 4K or more.
> >> >>
> >> >
> >> > Yeah if the file is an inline extent then it will be copied into the log
> >> > directly and the log will be written out, no going through the data write path
> >> > at all. Max inline == 0 should make it so we don't inline, so if it isn't
> >> > honoring that then that may be a bug. Thanks,
> >> >
> >> > Josef
> >>
> >> I tried it on 3.12-rc2 release, and it seems there is a bug then.
> >> Please find attached logs to confirm.
> >> Also, probably on the older release.
> >>
> >
> > Oooh ok I understand, you have your printk's in the wrong place ;).
> > do_writepages doesn't necessarily mean you are writing something. If you want
> > to see if stuff got written to the disk I'd put a printk at run_delalloc_range
> > and have it spit out the range it is writing out since that's what we think is
> > actually dirty. Thanks,
> >
> > Josef
>
> No, but I also placed dump_stack() in the beginning of
> __extent_writepage. run_delalloc_range is being called only from
> __extent_writepage, if it were to be called, the dump_stack() at the
> top of __extent_writepage would have printed as well, no?
>
Ok I've done the same thing and I'm not seeing what you are seeing. Are you
using any mount options other than notreelog and max_inline=0? Could you adjust
your printk to print out the root objectid for the inode as well? It is
possible that this is the writeout for the space cache or inode cache. Thanks,
Josef