public inbox for linux-ext4@vger.kernel.org
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Jan Kara <jack@suse.cz>
Cc: cmm@us.ibm.com, linux-ext4@vger.kernel.org
Subject: Re: [PATCH] ext4: Fix delalloc sync hang with journal lock inversion
Date: Thu, 5 Jun 2008 19:24:13 +0530
Message-ID: <20080605135413.GI8942@skywalker>
In-Reply-To: <20080602102759.GG30613@duck.suse.cz>

On Mon, Jun 02, 2008 at 12:27:59PM +0200, Jan Kara wrote:
> On Mon 02-06-08 15:29:56, Aneesh Kumar K.V wrote:
> > On Mon, Jun 02, 2008 at 11:35:00AM +0200, Jan Kara wrote:
> > > >  			BUG_ON(buffer_locked(bh));
> > > >  			if (buffer_dirty(bh))
> > > >  				mpage_add_bh_to_extent(mpd, logical, bh);
> > > > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > > > index 789b6ad..655b8bf 100644
> > > > --- a/mm/page-writeback.c
> > > > +++ b/mm/page-writeback.c
> > > > @@ -881,7 +881,12 @@ int write_cache_pages(struct address_space *mapping,
> > > >  	pagevec_init(&pvec, 0);
> > > >  	if (wbc->range_cyclic) {
> > > >  		index = mapping->writeback_index; /* Start from prev offset */
> > > > -		end = -1;
> > > > +		/*
> > > > +		 * write only till the specified range_end even in cyclic mode
> > > > +		 */
> > > > +		end = wbc->range_end >> PAGE_CACHE_SHIFT;
> > > > +		if (!end)
> > > > +			end = -1;
> > > >  	} else {
> > > >  		index = wbc->range_start >> PAGE_CACHE_SHIFT;
> > > >  		end = wbc->range_end >> PAGE_CACHE_SHIFT;
> > >   Are you sure you won't break other users of range_cyclic with this
> > > change?
> > >
> > I haven't run any specific test to verify that. The concern was that if
> > we force cyclic mode for writeout in delalloc we may be starting the
> > writeout from a different offset than specified and would be writing
> > more. So the change was to use the offset specified. A quick look at
> > the kernel suggested most of them had range_end as 0 with cyclic mode.
> > I haven't audited the full kernel. I will do that. Meanwhile, if you
> > think it is risky to make this change, I guess we should drop this
> > part. But I guess we can keep the change below.
>   Hmm, I've just got an idea that it may be better to introduce a new flag
> for wbc like range_cont and it would mean that we start scan at
> writeback_index (we use range_start if writeback_index is not set) and
> end with range_end. That way we don't have to be afraid of interference
> with other range_cyclic users and in principle, range_cyclic is originally
> meant for other uses...
> 

Something like the patch below? With this, ext4_da_writepages would have:

pgoff_t writeback_index = 0;
.....
if (!wbc->range_cyclic) {
	/*
	 * If range_cyclic is not set force range_cont
	 * and save  the old  writeback_index
	 */
	wbc->range_cont = 1;
	writeback_index = mapping->writeback_index;
	mapping->writeback_index = 0;
}
...
mpage_da_writepages(..)
..
if (writeback_index)
	mapping->writeback_index = writeback_index;
return ret;
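
For readers following the thread, the save/restore pattern in the snippet
above can be modelled in userspace roughly as follows. This is only a
sketch under stated assumptions: struct mapping_model, struct wbc_model,
fake_writepages() and da_writepages_model() are hypothetical stand-ins
invented for illustration, not kernel code, and the fake writer just
counts pages instead of doing I/O.

```c
#include <assert.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct mapping_model {
	unsigned long writeback_index;	/* page index where the last pass stopped */
};

struct wbc_model {
	unsigned long start_page;	/* first page of the requested range */
	unsigned long end_page;		/* last page of the requested range */
	long nr_to_write;		/* page budget for this call */
	unsigned range_cyclic:1;
	unsigned range_cont:1;
};

/*
 * Hypothetical stand-in for mpage_da_writepages(): "writes" pages from
 * the resume point and records where it stopped in writeback_index.
 */
static long fake_writepages(struct mapping_model *m, struct wbc_model *w)
{
	unsigned long index = m->writeback_index ?
			m->writeback_index : w->start_page;
	long written = 0;

	while (index <= w->end_page && w->nr_to_write > 0) {
		index++;
		written++;
		w->nr_to_write--;
	}
	m->writeback_index = index;
	return written;
}

/* Mirrors the snippet: force range_cont, save and restore the old index. */
static long da_writepages_model(struct mapping_model *m, struct wbc_model *w)
{
	unsigned long saved = 0;
	long ret;

	if (!w->range_cyclic) {
		w->range_cont = 1;
		saved = m->writeback_index;
		m->writeback_index = 0;
	}
	ret = fake_writepages(m, w);
	if (saved)
		m->writeback_index = saved;
	return ret;
}
```

In this model, as in the snippet, a saved index of zero is not written
back, so the index left by the inner call stays in place in that case.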



mm: Add range_cont mode for writeback.

From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Filesystems like ext4 need to start a new transaction in
writepages for block allocation. This happens with delayed
allocation, and there is a limit to how many credits we can request
from the journal layer. So we call write_cache_pages multiple
times with wbc->nr_to_write set to the maximum possible value,
limited by the max journal credits available.

Add a new mode to writeback that enables us to handle this
behaviour. If mapping->writeback_index is not set, we use
wbc->range_start to find the start index, and at the end
of write_cache_pages we store the index in writeback_index. The next
call to write_cache_pages will start writeout from writeback_index.
We also limit writing to the specified wbc->range_end.


Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---

 include/linux/writeback.h |    1 +
 mm/page-writeback.c       |   10 +++++++++-
 2 files changed, 10 insertions(+), 1 deletions(-)


diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index f462439..0d8573e 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -63,6 +63,7 @@ struct writeback_control {
 	unsigned for_writepages:1;	/* This is a writepages() call */
 	unsigned range_cyclic:1;	/* range_start is cyclic */
 	unsigned more_io:1;		/* more io to be dispatched */
+	unsigned range_cont:1;
 };
 
 /*
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 789b6ad..014a9f2 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -882,6 +882,12 @@ int write_cache_pages(struct address_space *mapping,
 	if (wbc->range_cyclic) {
 		index = mapping->writeback_index; /* Start from prev offset */
 		end = -1;
+	} else if (wbc->range_cont) {
+		if (!mapping->writeback_index)
+			index = wbc->range_start >> PAGE_CACHE_SHIFT;
+		else
+			index = mapping->writeback_index;
+		end = wbc->range_end >> PAGE_CACHE_SHIFT;
 	} else {
 		index = wbc->range_start >> PAGE_CACHE_SHIFT;
 		end = wbc->range_end >> PAGE_CACHE_SHIFT;
@@ -954,7 +960,9 @@ int write_cache_pages(struct address_space *mapping,
 		index = 0;
 		goto retry;
 	}
-	if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
+	if (wbc->range_cyclic ||
+			(range_whole && wbc->nr_to_write > 0) ||
+			wbc->range_cont)
 		mapping->writeback_index = index;
 	return ret;
 }
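
To make the intended use of range_cont concrete, here is a small userspace
model of the behaviour the patch describes: each call picks its start
index from writeback_index (falling back to range_start), stops at
range_end or when nr_to_write is used up, and stores the stop point for
the next call. It is a sketch only. MODEL_PAGE_SHIFT (4 KiB pages),
struct mapping_model, struct wbc_model, write_cache_pages_model() and
write_range_in_chunks() are all assumptions made for illustration, and
"writing" a page is reduced to advancing a counter.

```c
#include <assert.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

struct mapping_model {
	unsigned long writeback_index;
};

struct wbc_model {
	long long range_start;	/* byte offset, inclusive */
	long long range_end;	/* byte offset, inclusive */
	long nr_to_write;	/* page budget for one call */
	unsigned range_cont:1;
};

/* Models the range_cont branch: resume, write up to the budget, record. */
static long write_cache_pages_model(struct mapping_model *m,
				    struct wbc_model *w)
{
	unsigned long index, end;
	long written = 0;

	if (w->range_cont && m->writeback_index)
		index = m->writeback_index;
	else
		index = (unsigned long)(w->range_start >> MODEL_PAGE_SHIFT);
	end = (unsigned long)(w->range_end >> MODEL_PAGE_SHIFT);

	while (index <= end && w->nr_to_write > 0) {
		index++;	/* pretend one page was written */
		written++;
		w->nr_to_write--;
	}
	if (w->range_cont)
		m->writeback_index = index;
	return written;
}

/*
 * Hypothetical caller: "chunk" plays the role of the per-transaction
 * limit; keep calling until a call writes fewer pages than the budget.
 */
static long write_range_in_chunks(struct mapping_model *m,
				  long long start, long long end, long chunk)
{
	struct wbc_model w = {
		.range_start = start,
		.range_end = end,
		.range_cont = 1,
	};
	long total = 0, n;

	m->writeback_index = 0;
	do {
		w.nr_to_write = chunk;
		n = write_cache_pages_model(m, &w);
		total += n;
	} while (n == chunk);
	return total;
}
```

With a ten-page range and a budget of four pages per call, the model
takes three calls (4, 4 and 2 pages) and never goes past range_end,
which is the property the cyclic mode does not give.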

