From: Jan Kara <jack@suse.cz>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ritesh Harjani <riteshh@linux.ibm.com>,
	jack@suse.cz, linux-ext4@vger.kernel.org, tytso@mit.edu,
	adilger.kernel@dilger.ca, linux-fsdevel@vger.kernel.org,
	hch@infradead.org, cmaiolino@redhat.com, david@fromorbit.com
Subject: Re: [PATCHv5 3/6] ext4: Move ext4 bmap to use iomap infrastructure.
Date: Wed, 4 Mar 2020 13:42:11 +0100
Message-ID: <20200304124211.GC21048@quack2.suse.cz>
In-Reply-To: <20200303154709.GB8037@magnolia>

On Tue 03-03-20 07:47:09, Darrick J. Wong wrote:
> On Mon, Mar 02, 2020 at 02:28:39PM +0530, Ritesh Harjani wrote:
> > 
> > 
> > On 2/28/20 8:55 PM, Darrick J. Wong wrote:
> > > On Fri, Feb 28, 2020 at 02:56:56PM +0530, Ritesh Harjani wrote:
> > > > ext4_iomap_begin is already implemented which provides ext4_map_blocks,
> > > > so just move the API from generic_block_bmap to iomap_bmap for iomap
> > > > conversion.
> > > > 
> > > > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> > > > Reviewed-by: Jan Kara <jack@suse.cz>
> > > > ---
> > > >   fs/ext4/inode.c | 2 +-
> > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > > index 6cf3b969dc86..81fccbae0aea 100644
> > > > --- a/fs/ext4/inode.c
> > > > +++ b/fs/ext4/inode.c
> > > > @@ -3214,7 +3214,7 @@ static sector_t ext4_bmap(struct address_space *mapping, sector_t block)
> > > >   			return 0;
> > > >   	}
> > > > -	return generic_block_bmap(mapping, block, ext4_get_block);
> > > > +	return iomap_bmap(mapping, block, &ext4_iomap_ops);
> > > 
> > > /me notes that iomap_bmap will filemap_write_and_wait for you, so one
> > > could optimize ext4_bmap to avoid the double-flush by moving the
> > > filemap_write_and_wait at the top of the function into the JDATA state
> > > clearing block.
> > 
> > IIUC, delalloc and data=journal mode are both mutually exclusive.
> > So we could get rid of calling filemap_write_and_wait() all together
> > from ext4_bmap().
> > And as you pointed filemap_write_and_wait() is called by default in
> > iomap_bmap which should cover for delalloc case.
> > 
> > 
> > @Jan/Darrick,
> > Could you check if the attached patch looks good. If yes then
> > will add your Reviewed-by and send a v6.
> > 
> > Thanks for the review!!
> > 
> > -ritesh
> > 
> > 
> 
> > From 93f560d9a483b4f389056e543012d0941734a8f4 Mon Sep 17 00:00:00 2001
> > From: Ritesh Harjani <riteshh@linux.ibm.com>
> > Date: Tue, 20 Aug 2019 18:36:33 +0530
> > Subject: [PATCH 3/6] ext4: Move ext4 bmap to use iomap infrastructure.
> > 
> > ext4_iomap_begin is already implemented which provides ext4_map_blocks,
> > so just move the API from generic_block_bmap to iomap_bmap for iomap
> > conversion.
> > 
> > Also no need to call for filemap_write_and_wait() any more in ext4_bmap
> > since data=journal mode anyway doesn't support delalloc and for all other
> > cases iomap_bmap() anyway calls the same function, so no need for doing
> > it twice.
> > 
> > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> 
> Hmmm.  I don't recall how jdata actually works, but I get the impression
> here that we're trying to flush dirty data out to the journal and then
> out to disk, and then drop the JDATA state from the inode.  This
> mechanism exists (I guess?) so that dirty file pages get checkpointed
> out of jbd2 back into the filesystem so that bmap() returns meaningful
> results to lilo.

Exactly. E.g. when we are journalling data and fill a hole through mmap, we
will have the block allocated as unwritten, and we need to write it out so
that the data gets to the journal, and then do a journal flush to get the
data to disk so that lilo can read it from the device. So removing
filemap_write_and_wait() when journalling data is wrong.

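To spell out the ordering, here is a tiny userspace model. It is not the
ext4 code: the stub_* helpers, the trace buffer and bmap_sketch() are all
hypothetical names made up for illustration, and the "physical block" it
returns is fake. It only shows that with journalled data the writeback has
to come before the journal flush, and that iomap_bmap() doing its own
write-and-wait afterwards does not replace that first step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Call trace so the ordering can be inspected; purely illustrative. */
static char trace[128];

static void record(const char *step)
{
	size_t used = strlen(trace);

	snprintf(trace + used, sizeof(trace) - used, "%s%s",
		 used ? " > " : "", step);
}

/* Stubs standing in for the kernel helpers; they only log the call. */
static void stub_filemap_write_and_wait(void)
{
	record("writeback");
}

static void stub_journal_flush(void)
{
	record("journal_flush");
}

static long stub_iomap_bmap(long block)
{
	/* iomap_bmap() does its own write-and-wait before mapping. */
	record("writeback");
	record("map");
	return block + 1000;	/* fake physical block number */
}

/*
 * Model of the bmap path: when the inode has journalled dirty data,
 * push the pages to the journal first, then flush the journal so the
 * data reaches its final location, and only then map the block.
 */
long bmap_sketch(bool has_journalled_data, long block)
{
	trace[0] = '\0';
	if (has_journalled_data) {
		stub_filemap_write_and_wait();
		stub_journal_flush();
	}
	return stub_iomap_bmap(block);
}
```

In this model the second writeback in the journalled case is the redundant
one (nothing is dirty any more by then), which is the double flush Darrick
noted; the first one is the one that cannot go away.
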
> This makes me wonder if you still need the filemap_write_and_wait in the
> JDATA case because otherwise the journal flush won't have the effect of
> writing all the dirty pagecache back to the filesystem?  OTOH I suppose
> the implicit write-and-wait call after we clear JDATA will not be
> writing to the journal.
> 
> Even more weirdly, the FIEMAP code doesn't drop JDATA at all...?

Yeah, it should do that, but that's only a performance optimization so that
we bother with journal flushing only when someone uses a block mapping call
on a file with journalled dirty data. So you can hardly notice the bug by
testing...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR