From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: Justin Forbes <jmforbes@linuxtx.org>,
	Zwane Mwaikambo <zwane@arm.linux.org.uk>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Randy Dunlap <rdunlap@xenotime.net>,
	Dave Jones <davej@redhat.com>,
	Chuck Wolber <chuckw@quantumlinux.com>,
	Chris Wedgwood <reviews@ml.cw.f00f.org>,
	Michael Krufky <mkrufky@linuxtv.org>,
	Chuck Ebbert <cebbert@redhat.com>,
	Domenico Andreoli <cavokz@gmail.com>, Willy Tarreau <w@1wt.eu>,
	Rodrigo Rubira Branco <rbranco@la.checkpoint.com>,
	Jake Edge <jake@lwn.net>, Eugene Teo <eteo@redhat.com>,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk, Nick Piggin <npiggin@suse.de>
Subject: [patch 38/40] fs: remove WB_SYNC_HOLD
Date: Thu, 22 Jan 2009 22:14:56 -0800
Message-ID: <20090123061456.GL2922@kroah.com>
In-Reply-To: <20090123001908.GA7397@kroah.com>

[-- Attachment #1: fs-remove-wb_sync_hold.patch --]
[-- Type: text/plain, Size: 3646 bytes --]

2.6.27-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Nick Piggin <npiggin@suse.de>

commit 4f5a99d64c17470a784a6c68064207d82e3e74a5 upstream.

Remove WB_SYNC_HOLD.  The primary motivation is the design of my
anti-starvation code for fsync.  It requires taking an inode lock over the
sync operation, so we could run into lock ordering problems with multiple
inodes.  It is possible to take a single global lock to solve the ordering
problem, but then that would prevent a future nice implementation of "sync
multiple inodes" based on lock order via inode address.

Seems like a backward step to remove this, but actually it is busted
anyway: we can't use the inode lists for a data integrity wait, because an
inode can be taken off the dirty lists but still be under writeback.  In
order to satisfy data integrity semantics, we should wait for it to finish
writeback, but if we only search the dirty lists, we'll miss it.

It would be possible to have a "writeback" list, for sys_sync, I suppose.
But why complicate things by prematurely optimising?  For unmounting, we
could avoid the "livelock avoidance" code, which would be easier, but
again premature IMO.

Fixing the existing data integrity problem will come next.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 fs/fs-writeback.c         |   12 ++----------
 include/linux/writeback.h |    1 -
 2 files changed, 2 insertions(+), 11 deletions(-)

--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -421,9 +421,6 @@ __writeback_single_inode(struct inode *i
  * If we're a pdlfush thread, then implement pdflush collision avoidance
  * against the entire list.
  *
- * WB_SYNC_HOLD is a hack for sys_sync(): reattach the inode to sb->s_dirty so
- * that it can be located for waiting on in __writeback_single_inode().
- *
  * If `bdi' is non-zero then we're being asked to writeback a specific queue.
  * This function assumes that the blockdev superblock's inodes are backed by
  * a variety of queues, so all inodes are searched.  For other superblocks,
@@ -499,10 +496,6 @@ void generic_sync_sb_inodes(struct super
 		__iget(inode);
 		pages_skipped = wbc->pages_skipped;
 		__writeback_single_inode(inode, wbc);
-		if (wbc->sync_mode == WB_SYNC_HOLD) {
-			inode->dirtied_when = jiffies;
-			list_move(&inode->i_list, &sb->s_dirty);
-		}
 		if (current_is_pdflush())
 			writeback_release(bdi);
 		if (wbc->pages_skipped != pages_skipped) {
@@ -588,8 +581,7 @@ restart:
 
 /*
  * writeback and wait upon the filesystem's dirty inodes.  The caller will
- * do this in two passes - one to write, and one to wait.  WB_SYNC_HOLD is
- * used to park the written inodes on sb->s_dirty for the wait pass.
+ * do this in two passes - one to write, and one to wait.
  *
  * A finite limit is set on the number of pages which will be written.
  * To prevent infinite livelock of sys_sync().
@@ -600,7 +592,7 @@ restart:
 void sync_inodes_sb(struct super_block *sb, int wait)
 {
 	struct writeback_control wbc = {
-		.sync_mode	= wait ? WB_SYNC_ALL : WB_SYNC_HOLD,
+		.sync_mode	= wait ? WB_SYNC_ALL : WB_SYNC_NONE,
 		.range_start	= 0,
 		.range_end	= LLONG_MAX,
 	};
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -30,7 +30,6 @@ static inline int task_is_pdflush(struct
 enum writeback_sync_modes {
 	WB_SYNC_NONE,	/* Don't wait on anything */
 	WB_SYNC_ALL,	/* Wait on every mapping */
-	WB_SYNC_HOLD,	/* Hold the inode on sb_dirty for sys_sync() */
 };
 
 /*

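Illustration (not part of the patch): the changelog's argument that a wait
pass over the dirty lists can miss an inode still under writeback can be
sketched with a small userspace model.  Everything here is hypothetical:
struct toy_inode, wait_on_dirty_list() and still_under_writeback() are
made-up names for this sketch and are not kernel types or functions.  The
model only assumes what the changelog states, namely that an inode may have
left the dirty list while its I/O is still in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an inode; NOT the kernel's struct inode. */
struct toy_inode {
	bool on_dirty_list;	/* still linked on the (toy) dirty list */
	bool under_writeback;	/* I/O submitted but not yet completed */
	bool waited;		/* set once a wait pass has waited on it */
};

/*
 * Model of a wait pass that locates inodes only through the dirty list.
 * Inodes that are off the list are skipped, even if their writeback has
 * not finished.  Returns how many inodes were waited on.
 */
static size_t wait_on_dirty_list(struct toy_inode *inodes, size_t n)
{
	size_t waited = 0;

	assert(inodes != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!inodes[i].on_dirty_list)
			continue;	/* invisible to this pass */
		inodes[i].waited = true;
		inodes[i].under_writeback = false;	/* "wait" completes the I/O */
		waited++;
	}
	return waited;
}

/* Count inodes whose writeback is still outstanding. */
static size_t still_under_writeback(const struct toy_inode *inodes, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (inodes[i].under_writeback)
			count++;
	return count;
}
```

With one inode that is dirty and under writeback, one that is off the dirty
list but still under writeback, and one that is clean, the pass waits on the
first only and leaves the second with I/O outstanding.  That is the missed
case the changelog describes.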
