linux-raid.vger.kernel.org archive mirror
From: Neil Brown <neilb@suse.de>
To: Dan Williams <dan.j.williams@intel.com>
Cc: dean gaudet <dean@arctic.org>, linux-raid@vger.kernel.org
Subject: Re: 2.6.24-rc6 reproducible raid5 hang
Date: Thu, 10 Jan 2008 11:09:54 +1100
Message-ID: <18309.25170.258730.225322@notabene.brown>
In-Reply-To: message from Dan Williams on Wednesday January 9

On Wednesday January 9, dan.j.williams@intel.com wrote:
> On Sun, 2007-12-30 at 10:58 -0700, dean gaudet wrote:
> > i have evidence pointing to d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
> > 
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
> > 
> > which was Neil's change in 2.6.22 for deferring generic_make_request 
> > until there's enough stack space for it.
> > 
> 
> Commit d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1 reduced stack utilization
> by preventing recursive calls to generic_make_request.  However the
> following conditions can cause raid5 to hang until 'stripe_cache_size' is
> increased:
> 

Thanks for pursuing this, guys.  That explanation certainly sounds very
credible.

The generic_make_request_immed change is a good way to confirm that we
have found the bug, but I don't like it as a long-term solution, as it
just reintroduces the problem that we were trying to solve with the
problematic commit.

As you say, we could arrange that all request submission happens in
raid5d, and I think this is the right way to proceed.  However, we can
still do some of the work in the thread that is submitting the
IO by calling "raid5d()" at the end of make_request, like this.

Can you test it please?  Does it seem reasonable?

Thanks,
NeilBrown


Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/md.c    |    2 +-
 ./drivers/md/raid5.c |    4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c	2008-01-07 13:32:10.000000000 +1100
+++ ./drivers/md/md.c	2008-01-10 11:08:02.000000000 +1100
@@ -5774,7 +5774,7 @@ void md_check_recovery(mddev_t *mddev)
 	if (mddev->ro)
 		return;
 
-	if (signal_pending(current)) {
+	if (current == mddev->thread->tsk && signal_pending(current)) {
 		if (mddev->pers->sync_request) {
 			printk(KERN_INFO "md: %s in immediate safe mode\n",
 			       mdname(mddev));

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c	2008-01-07 13:32:10.000000000 +1100
+++ ./drivers/md/raid5.c	2008-01-10 11:06:54.000000000 +1100
@@ -3432,6 +3432,7 @@ static int chunk_aligned_read(struct req
 	}
 }
 
+static void raid5d (mddev_t *mddev);
 
 static int make_request(struct request_queue *q, struct bio * bi)
 {
@@ -3547,7 +3548,7 @@ static int make_request(struct request_q
 				goto retry;
 			}
 			finish_wait(&conf->wait_for_overlap, &w);
-			handle_stripe(sh, NULL);
+			set_bit(STRIPE_HANDLE, &sh->state);
 			release_stripe(sh);
 		} else {
 			/* cannot get stripe for read-ahead, just give-up */
@@ -3569,6 +3570,7 @@ static int make_request(struct request_q
 			      test_bit(BIO_UPTODATE, &bi->bi_flags)
 			        ? 0 : -EIO);
 	}
+	raid5d(mddev);
 	return 0;
 }
 

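For readers who want the shape of the change without the kernel context,
here is a minimal userspace sketch of the pattern the raid5.c hunks follow:
the submit path only marks work as needing handling, and the daemon
function is then called once from the submitter's context. All names in it
(struct stripe, struct array, run_daemon, submit) are hypothetical
stand-ins chosen for illustration, not the md/raid5 API, and it models
neither locking nor the stripe cache.

```c
#include <stdbool.h>

#define NSTRIPES 4

/* Hypothetical stand-in for a stripe; needs_handling models STRIPE_HANDLE. */
struct stripe {
    bool needs_handling;
    int  handled_count;
};

/* Hypothetical stand-in for the array's per-device state. */
struct array {
    struct stripe stripes[NSTRIPES];
};

/*
 * Models the daemon function: handle every stripe that has been marked.
 * Returns the number of stripes handled in this pass.
 */
static int run_daemon(struct array *a)
{
    int handled = 0;

    for (int i = 0; i < NSTRIPES; i++) {
        struct stripe *sh = &a->stripes[i];

        if (sh->needs_handling) {
            sh->needs_handling = false;
            sh->handled_count++;
            handled++;
        }
    }
    return handled;
}

/*
 * Models the submit path: instead of handling each stripe inline, only
 * mark it, then run the daemon function once at the end, in the
 * submitting thread's context.
 */
static int submit(struct array *a, const int *idx, int n)
{
    for (int i = 0; i < n; i++)
        a->stripes[idx[i]].needs_handling = true;

    return run_daemon(a);
}
```

Marking is idempotent in this sketch, so a stripe named twice in one
submission is still handled once, and a later daemon pass with nothing
marked does no work.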