public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andrea Arcangeli <andrea@suse.de>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: Rik van Riel <riel@conectiva.com.br>,
	Mike Galbraith <mikeg@wen-online.de>,
	Jeff Garzik <jgarzik@mandrakesoft.com>,
	Daniel Phillips <phillips@bonn-fries.net>,
	Alexander Viro <viro@math.psu.edu>, Alan Cox <alan@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: VM in 2.4.7-pre hurts...
Date: Tue, 10 Jul 2001 04:56:17 +0200	[thread overview]
Message-ID: <20010710045617.J1594@athlon.random> (raw)
In-Reply-To: <Pine.LNX.4.33.0107081227020.7044-100000@penguin.transmeta.com> <Pine.LNX.4.33.0107081949290.18364-100000@penguin.transmeta.com>
In-Reply-To: <Pine.LNX.4.33.0107081949290.18364-100000@penguin.transmeta.com>; from torvalds@transmeta.com on Sun, Jul 08, 2001 at 07:59:01PM -0700

On Sun, Jul 08, 2001 at 07:59:01PM -0700, Linus Torvalds wrote:
> 
> On Sun, 8 Jul 2001, Linus Torvalds wrote:
> >
> > Anyway, having looked at the buffer case, I htink I found a potentially
> > nasty bug: "unlock_buffer()" with a buffer count of zero.
> 
> [ Basic problem: write-completion race with the code that does
>   "try_to_free_buffers" ]
> 
> Suggested fix:
>  - every b_end_io() function should decrement the buffer count as the
>    _last_ thing it does to the buffer.
>  - every bh IO submitter would have to increment the bh count before
>    submitting it for IO.

it seems to me there's a bit of overkill in the pre4 fix. First of all
such race obviously couldn't happen with the async_io and kiobuf
handlers, the former because the page stays locked so the bh cannot go
away with the locked page, the latter because the bh are private not
visible at all to the vm. Playing with the b_count for those two cases
are just wasted cycles.

Secondly even for the sync_io handler we shouldn't need to get_bh before
submitting the I/O. Once we get the bh locked we're just fine. As far I
can see all we need is to pin the bh around unlock_buffer in the I/O
completion handler which is simpler and equally efficient (possibly more
efficient in SMP where we won't risk to bh_put in a cpu different than
the bh_get, but probably the cacheline ping pong wouldn't change much
since we must do the lock/unlock on the b_state anyways).

Also somebody should explain me why end_buffer_write exists in first
place, it is just wasted memory and icache.

This is the way I would have fixed the smp race against pre3. Can you
see anything that isn't fixed by the below patch and that is fixed by
pre4? Unless somebody can see something I will drop some of the
unnecessary get_bh/put_bh from pre5.

--- bh-fix/drivers/block/ll_rw_blk.c.~1~	Tue Jul 10 04:33:57 2001
+++ bh-fix/drivers/block/ll_rw_blk.c	Tue Jul 10 04:49:11 2001
@@ -963,10 +963,12 @@
 /*
  * Default IO end handler, used by "ll_rw_block()".
  */
-static void end_buffer_io_sync(struct buffer_head *bh, int uptodate)
+void end_buffer_io_sync(struct buffer_head *bh, int uptodate)
 {
 	mark_buffer_uptodate(bh, uptodate);
+	atomic_inc(&bh->b_count);
 	unlock_buffer(bh);
+	atomic_dec(&bh->b_count);
 }
 
 /**
--- bh-fix/fs/buffer.c.~1~	Wed Jul  4 04:03:46 2001
+++ bh-fix/fs/buffer.c	Tue Jul 10 04:50:15 2001
@@ -161,13 +161,6 @@
 	atomic_dec(&bh->b_count);
 }
 
-/* End-of-write handler.. Just mark it up-to-date and unlock the buffer. */
-static void end_buffer_write(struct buffer_head *bh, int uptodate)
-{
-	mark_buffer_uptodate(bh, uptodate);
-	unlock_buffer(bh);
-}
-
 /*
  * The buffers have been marked clean and locked.  Just submit the dang
  * things.. 
@@ -182,7 +175,7 @@
 	atomic_inc(&wait->b_count);
 	do {
 		struct buffer_head * bh = *array++;
-		bh->b_end_io = end_buffer_write;
+		bh->b_end_io = end_buffer_io_sync;
 		submit_bh(WRITE, bh);
 	} while (--count);
 	wait_on_buffer(wait);
--- bh-fix/include/linux/fs.h.~1~	Mon Jul  9 20:25:17 2001
+++ bh-fix/include/linux/fs.h	Tue Jul 10 04:50:05 2001
@@ -1061,6 +1061,7 @@
 
 /* reiserfs_writepage needs this */
 extern void set_buffer_async_io(struct buffer_head *bh) ;
+extern void end_buffer_io_sync(struct buffer_head *, int);
 
 #define BUF_CLEAN	0
 #define BUF_LOCKED	1	/* Buffers scheduled for write */

Andrea

  reply	other threads:[~2001-07-10  2:56 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <Pine.LNX.4.33.0107081640570.308-100000@mikeg.weiden.de>
2001-07-08 15:43 ` VM in 2.4.7-pre hurts Rik van Riel
2001-07-08 17:15   ` Mike Galbraith
2001-07-08 17:58   ` Linus Torvalds
2001-07-08 18:23     ` Rik van Riel
2001-07-08 19:30       ` Linus Torvalds
2001-07-09  2:59         ` Linus Torvalds
2001-07-10  2:56           ` Andrea Arcangeli [this message]
2001-07-10  4:03             ` Linus Torvalds
2001-07-10  4:20               ` Linus Torvalds
2001-07-10  5:43                 ` Andrea Arcangeli
2001-07-10 14:56                   ` Andrea Arcangeli
2001-07-10 18:46                     ` Linus Torvalds
2001-07-10  5:11               ` Andrea Arcangeli
2001-07-09  7:56       ` Mike Galbraith
2001-07-09  8:25         ` Christoph Rohland
2001-07-09  9:18           ` Mike Galbraith
2001-07-09  9:29             ` Christoph Rohland
2001-07-09  9:38               ` Mike Galbraith
2001-07-09 11:17                 ` Mike Galbraith
2001-07-09 11:30                   ` Christoph Rohland
2001-07-09 12:26                     ` Mike Galbraith
2001-07-09 11:25                 ` Christoph Rohland
2001-07-09 12:20                   ` Mike Galbraith
2001-07-09 16:20                 ` Linus Torvalds
2001-07-09 19:44                   ` Christoph Rohland
2001-07-09 20:46                     ` Linus Torvalds
2001-07-11 19:39                       ` Christoph Rohland
2001-07-11  1:05                 ` Marcelo Tosatti
2001-07-11  4:03                   ` Mike Galbraith
2001-07-12  5:00       ` David Lang
     [not found] <Pine.LNX.4.33L.0107071542420.17825-100000@duckman.distro.conectiva>
2001-07-07 18:54 ` Linus Torvalds
2001-07-07 20:11   ` Rik van Riel
2001-07-08 17:11     ` Linus Torvalds
2001-07-08 18:29       ` Rik van Riel
2001-07-07 13:41 Jeff Garzik
2001-07-07 14:05 ` Jeff Garzik
2001-07-07 17:28 ` Linus Torvalds
2001-07-07 17:37   ` Jeff Garzik
2001-07-07 17:53     ` Linus Torvalds
2001-07-07 18:08       ` Jeff Garzik
2001-07-07 18:11         ` Rik van Riel
2001-07-07 21:33       ` Alan Cox
2001-07-07 18:00     ` Rik van Riel
2001-07-07 21:25   ` Alan Cox
2001-07-07 21:29     ` Rik van Riel
2001-07-07 21:34       ` Jeff Garzik
2001-07-07 21:43       ` Alan Cox
2001-07-07 21:45         ` Rik van Riel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20010710045617.J1594@athlon.random \
    --to=andrea@suse.de \
    --cc=alan@redhat.com \
    --cc=jgarzik@mandrakesoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mikeg@wen-online.de \
    --cc=phillips@bonn-fries.net \
    --cc=riel@conectiva.com.br \
    --cc=torvalds@transmeta.com \
    --cc=viro@math.psu.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox