From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com ([209.132.183.28]:36699 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932393AbcFOSp6 (ORCPT ); Wed, 15 Jun 2016 14:45:58 -0400 Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 1B8C7C04B308 for ; Wed, 15 Jun 2016 18:45:58 +0000 (UTC) From: "Benjamin Marzinski" Date: Wed, 15 Jun 2016 13:45:56 -0500 To: Bob Peterson Cc: linux-fsdevel , cluster-devel Subject: Re: [Cluster-devel] [PATCH 2/2] gfs2: writeout truncated pages Message-ID: <20160615184556.GL8095@octiron.msp.redhat.com> References: <1465941769-25596-1-git-send-email-bmarzins@redhat.com> <1465941769-25596-3-git-send-email-bmarzins@redhat.com> <2059374940.19514905.1466010482840.JavaMail.zimbra@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <2059374940.19514905.1466010482840.JavaMail.zimbra@redhat.com> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Wed, Jun 15, 2016 at 01:08:02PM -0400, Bob Peterson wrote: > Hi Ben, > > Comments below. > > ----- Original Message ----- > | When gfs2 attempts to write a page to a file that is being truncated, > | and notices that the page is completely outside of the file size, it > | tries to invalidate it. However, this may require a transaction for > | journaled data files to revoke any buffers from the page on the active > | items list. Unfortunately, this can happen inside a log flush, where a > | transaction cannot be started. Also, gfs2 may need to be able to remove > | the buffer from the ail1 list before it can finish the log flush. > | > | To deal with this, gfs2 now skips the check to see if the write is > | outside the file size, and simply writes it anyway. This situation can > | only occur when the truncate code still has the file locked exclusively, > | and hasn't marked this block as free in the metadata (which happens > | later in truc_dealloc). After gfs2 writes this page out, the truncation > | code will shortly invalidate it and write out any revokes if necessary. > | > | To do this, gfs2 now implements its own version of block_write_full_page > | without the check, and calls the newly exported __block_write_full_page. > | The check still exists in nobh_writepage, which is gfs2 calls in the > | non-journaled case. But in this case the buffers will never be in the > | journal. Thus, no revokes will need to be issued and the write can > | safely be skipped without causing any possible journal replay issues. > | > | Signed-off-by: Benjamin Marzinski > | --- > | fs/gfs2/aops.c | 37 +++++++++++++++++++++++++++---------- > | 1 file changed, 27 insertions(+), 10 deletions(-) > | > | diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c > | index 37b7bc1..d3a7301 100644 > | --- a/fs/gfs2/aops.c > | +++ b/fs/gfs2/aops.c > | @@ -100,20 +100,11 @@ static int gfs2_writepage_common(struct page *page, > | struct inode *inode = page->mapping->host; > | struct gfs2_inode *ip = GFS2_I(inode); > | struct gfs2_sbd *sdp = GFS2_SB(inode); > | - loff_t i_size = i_size_read(inode); > | - pgoff_t end_index = i_size >> PAGE_SHIFT; > | - unsigned offset; > | > | if (gfs2_assert_withdraw(sdp, gfs2_glock_is_held_excl(ip->i_gl))) > | goto out; > | if (current->journal_info) > | goto redirty; > | - /* Is the page fully outside i_size? (truncate in progress) */ > | - offset = i_size & (PAGE_SIZE-1); > | - if (page->index > end_index || (page->index == end_index && !offset)) { > | - page->mapping->a_ops->invalidatepage(page, 0, PAGE_SIZE); > | - goto out; > | - } > > The concept in general looks good. Two things: > > (1) Since this is writepage_common, does it still need to do the > check for the normal non-jdata writeback? Does it makes sense to > split this back up into two writepage functions again rather than > the common one? After all, with your patch, the function does almost > nothing anymore. We do not need to worry about the non-jdata case. nobh_writepage(), which is called after gfs2_writepage_common in the non-jdata case, has the identical check. It doesn't invalidate the page, but that's fine, because the truncate code is about to do that anyway. This isn't even as big a change as it looks. In cases where we weren't doing this writepage during a log flush, there was nothing to guarantee that the truncate code wouldn't shrink the file size after the check in gfs2_writepage_common, and cause us to hit the check in nobh_writepage instead. Nate hit that scenario while testing the jdata case. In fact, there's nothing to guarantee that the i_size doesn't change after the check in nobh_writepage either, in which case we simply write out the page, like in the jdata case. > (2) Just a nit, but can you eliminate the ugly goto around the return? > In other words, rather than: > > if (current->journal_info) > goto redirty; > return 1; > redirty: > > Just simply do: > > if (!current->journal_info) > return 1; > Sure. I'll clean that up and respin the patches. -Ben > Regards, > > Bob Peterson > Red Hat File Systems