From: "BERTRAND Joël" <joel.bertrand@systella.fr>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-raid@vger.kernel.org, sparclinux@vger.kernel.org
Subject: Re: [BUG] Raid5 trouble
Date: Wed, 17 Oct 2007 18:44:41 +0200
Message-ID: <47163BF9.304@systella.fr>
In-Reply-To: <e9c3a7c20710170840u2ed8d6a9x26523eec6700ad11@mail.gmail.com>

Dan Williams wrote:
> On 10/17/07, Dan Williams <dan.j.williams@intel.com> wrote:
>> On 10/17/07, BERTRAND Joël <joel.bertrand@systella.fr> wrote:
>>> BERTRAND Joël wrote:
>>>>     Hello,
>>>>
>>>>     I run the 2.6.23 Linux kernel on two T1000 (sparc64) servers. Each
>>>> server has a partitionable raid5 array (/dev/md/d0), and I need to
>>>> mirror the two raid5 volumes with raid1. I tried to build a raid1
>>>> volume from /dev/md/d0p1 and /dev/sdi1 (exported over iSCSI from the
>>>> second server), and I hit a BUG:
>>>>
>>>> Root gershwin:[/usr/scripts] > mdadm -C /dev/md7 -l1 -n2 /dev/md/d0p1
>>>> /dev/sdi1
>>>> ...
>>>         Hello,
>>>
>>>         I have fixed iscsi-target and tested it. It now works without
>>> any trouble. Patches were posted on the iscsi-target mailing list. When I
>>> use iSCSI to access the remote raid5 volume, it works fine: I can format
>>> the remote volume, copy large files to it... But when I try to create a
>>> new raid1 volume from a local raid5 volume and a remote raid5 volume, I
>>> get my well-known Oops. Here is my dmesg after the Oops:
>>>
>> Can you send your .config and your bootup dmesg?
>>
> 
> I found a problem which may lead to the operations count dropping
> below zero.  If ops_complete_biofill() gets preempted in between the
> following calls:
> 
> raid5.c:554> clear_bit(STRIPE_OP_BIOFILL, &sh->ops.ack);
> raid5.c:555> clear_bit(STRIPE_OP_BIOFILL, &sh->ops.pending);
> 
> ...then get_stripe_work() can recount/re-acknowledge STRIPE_OP_BIOFILL
> causing the assertion.  In fact, the 'pending' bit should always be
> cleared first, but the other cases are protected by
> spin_lock(&sh->lock).  Patch attached.

	Dan,

	I have modified get_stripe_work() like this:

static unsigned long get_stripe_work(struct stripe_head *sh)
{
	unsigned long pending;
	int ack = 0;
	/* Debug only: running value of 'ack' after each operation is tested. */
	int a, b, c, d, e, f, g;

	pending = sh->ops.pending;

	test_and_ack_op(STRIPE_OP_BIOFILL, pending);
	a = ack;
	test_and_ack_op(STRIPE_OP_COMPUTE_BLK, pending);
	b = ack;
	test_and_ack_op(STRIPE_OP_PREXOR, pending);
	c = ack;
	test_and_ack_op(STRIPE_OP_BIODRAIN, pending);
	d = ack;
	test_and_ack_op(STRIPE_OP_POSTXOR, pending);
	e = ack;
	test_and_ack_op(STRIPE_OP_CHECK, pending);
	f = ack;
	if (test_and_clear_bit(STRIPE_OP_IO, &sh->ops.pending))
		ack++;
	g = ack;

	sh->ops.count -= ack;

	/* Dump the running totals just before the assertion fires. */
	if (sh->ops.count < 0)
		printk("%d %d %d %d %d %d %d\n", a, b, c, d, e, f, g);
	BUG_ON(sh->ops.count < 0);

	return pending;
}

and I get this on the console:

  1 1 1 1 1 2
kernel BUG at drivers/md/raid5.c:390!
               \|/ ____ \|/
               "@'/ .. \`@"
               /_| \__/ |_\
                  \__U_/
md7_resync(5409): Kernel bad sw trap 5 [#1]

	I hope this helps.

	JKB
