From: NeilBrown <neilb@suse.de>
To: Dan Williams <dan.j.williams@intel.com>
Cc: "Kwolek, Adam" <adam.kwolek@intel.com>,
"linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
"Ciechanowski, Ed" <ed.ciechanowski@intel.com>,
"Neubauer, Wojciech" <Wojciech.Neubauer@intel.com>
Subject: Re: [PATCH 2/2] imsm: FIX: Be more patient during loading matadata
Date: Thu, 14 Apr 2011 18:08:46 +1000 [thread overview]
Message-ID: <20110414180846.6a71eeb9@notabene.brown> (raw)
In-Reply-To: <BANLkTi=nontnQ_WKCaUWmosBDBihoDC7JA@mail.gmail.com>
On Wed, 13 Apr 2011 11:56:29 -0700 Dan Williams <dan.j.williams@intel.com>
wrote:
> 2011/4/12 Kwolek, Adam <adam.kwolek@intel.com>:
> >> -----Original Message-----
> >> From: dan.j.williams@gmail.com [mailto:dan.j.williams@gmail.com] On
> >> Behalf Of Dan Williams
> >> Sent: Wednesday, April 13, 2011 2:45 AM
> >> To: Kwolek, Adam
> >> Cc: neilb@suse.de; linux-raid@vger.kernel.org; Ciechanowski, Ed;
> >> Neubauer, Wojciech
> >> Subject: Re: [PATCH 2/2] imsm: FIX: Be more patient during loading
> >> matadata
> >>
> >> On Tue, Apr 12, 2011 at 5:51 AM, Adam Kwolek <adam.kwolek@intel.com>
> >> wrote:
> >> > Sometimes occurs that metadata cannot be loaded e.g. wrong check sum
> >> > It can happen due to metadata update racing with mdmon condition.
> >> > If mpb loading is tried again, it is loaded successfully.
> >> > Try to load metadata again before really giving up.
> >> >
> >> > Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
> >> > ---
> >> >
> >> > super-intel.c | 10 ++++++++--
> >> > 1 files changed, 8 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/super-intel.c b/super-intel.c
> >> > index dc5e34e..d23267a 100644
> >> > --- a/super-intel.c
> >> > +++ b/super-intel.c
> >> > @@ -2773,8 +2773,14 @@ load_and_parse_mpb(int fd, struct intel_super
> >> *super, char *devname, int keep_fd
> >> > int err;
> >> >
> >> > err = load_imsm_mpb(fd, super, devname);
> >> > - if (err)
> >> > - return err;
> >> > + if (err) {
> >> > + /* try to load mpb again,
> >> > + * in case of mdmon race we could have more luck...
> >> > + */
> >> > + err = load_imsm_mpb(fd, super, devname);
> >> > + if (err)
> >> > + return err;
> >> > + }
> >> > err = load_imsm_disk(fd, super, devname, keep_fd);
> >> > if (err)
> >> > return err;
> >>
> >> This is semi-duplicates the check we already do after returning from
> >> load_and_parse_mpb in load_super_imsm_all. I'm curious, are you
> >> hitting this path from load_super_imsm? If the container is assembled
> >> we should be loading from the container, if the container is not
> >> available then mdmon can't be running and checksum errors are real.
> >>
> >
> > My test scenario is that after boot I'm disassembling read only array and immediately new array is assembled for grow continuation.
> > Sometimes occurs that mdadm throws exception and core file is generated. It shows that anchor pointer has NULL value due to CRC error.
> > Second reading try helps, and anchor is always read correctly.
> > This behavior and fact that if I put more time between array disassembling and assembling it again helps also suggest, that we have some race condition here.
>
> Right, so we need to fix that disassemble-to-assemble race, this patch
> only "helps" :-).
>
Yes. We need to fix it properly.
The race should be avoided by proper use of O_EXCL.
i.e.
1/ mdmon should only write metadata to devices while they are actually part
of the container and so the container has exclusive access (which it
shares with member arrays).
2/ mdadm should only try to use a device in an array if it has O_EXCL access.
Which of these rules is broken - and where?
> > Problem is not in currently monitored in mdmon container but rather in interaction with previous mdmon session that is about to close.
> >
> > This patch makes that error condition never occurs in this scenario.
>
> That "never" needs to be qualified. This patch makes the race harder
> to lose, but as far as I can see not "impossible" to lose.
>
> > Grow.c is fixed for correct error condition behavior also.
> > I can agree that both patches in this series can help for this problem separately, but I think both should be placed in code.
>
> Why does this existing check:
>
> /* retry the load if we might have raced against mdmon */
> if (err == 3 && mdmon_running(devnum))
> for (retry = 0; retry < 3; retry++) {
> usleep(3000);
> err = load_and_parse_mpb(dfd, s, NULL, 1);
> if (err != 3)
> break;
> }
It's pretty horrible that we need to do this, isn't it?
Maybe we need some way to synchronise with mdmon so we *know* if we have
raced or not.
i.e. mdmon keeps a count of the number of times it has updated the metadata.
We send a message to get the count, read, then get the count again.
The request blocks while mdmon is actually writing.
If the counts are different, we read again.
??
NeilBrown
>
> ...not help in this case. I suspect due to that mdmon_running()
> qualification and the fact that the test is only seeing this in a
> disassemble-to-assemble window. So that seems to be further evidence
> that a higher level fix is needed.
>
> --
> Dan
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2011-04-14 8:08 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-04-12 12:51 [PATCH 1/2] FIX: Use successfully loaded metadata only Adam Kwolek
2011-04-12 12:51 ` [PATCH 2/2] imsm: FIX: Be more patient during loading matadata Adam Kwolek
2011-04-13 0:44 ` Dan Williams
2011-04-13 6:40 ` Kwolek, Adam
2011-04-13 18:56 ` Dan Williams
2011-04-14 7:57 ` Kwolek, Adam
2011-04-14 8:08 ` NeilBrown [this message]
2011-04-14 8:36 ` Dan Williams
2011-04-14 7:50 ` [PATCH 1/2] FIX: Use successfully loaded metadata only NeilBrown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110414180846.6a71eeb9@notabene.brown \
--to=neilb@suse.de \
--cc=Wojciech.Neubauer@intel.com \
--cc=adam.kwolek@intel.com \
--cc=dan.j.williams@intel.com \
--cc=ed.ciechanowski@intel.com \
--cc=linux-raid@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.