* [Drbd-dev] Re: [DRBD-cvs] svn commit by phil - r1985 - branches/drbd-0.7/drbd - Fixed a self made SMP lockup; showing up in drbd_al_com
From: Lars Marowsky-Bree @ 2005-10-17 13:43 UTC
To: drbd-dev
On 2005-10-17T15:40:12, drbd-cvs@lists.linbit.com wrote:
> Author: phil
> Date: 2005-10-17 15:40:11 +0200 (Mon, 17 Oct 2005)
> New Revision: 1985
>
> Modified:
> branches/drbd-0.7/drbd/drbd_bitmap.c
> Log:
> Fixed a self made SMP lockup; showing up in drbd_al_complete_io()
>
> With drbd-0.7.12 we moved the al_lock before the bm_clear_bit()
> in __drbd_set_in_sync(). That by itself is okay, but
> in bm_clear_bit() we used the spin_[un]lock_irq() functions,
> therefore reenabling interrupts...
> This is fixed now.
Hi Philipp, how critical is this bug, and how likely is it that it will
be hit in practice?
Sincerely,
Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
"Ignorance more frequently begets confidence than does knowledge" -- Charles Darwin
* Re: [Drbd-dev] Re: [DRBD-cvs] svn commit by phil - r1985 - branches/drbd-0.7/drbd - Fixed a self made SMP lockup; showing up in drbd_al_com
From: Lars Ellenberg @ 2005-10-17 13:56 UTC
To: drbd-dev
/ 2005-10-17 15:43:49 +0200
\ Lars Marowsky-Bree:
> On 2005-10-17T15:40:12, drbd-cvs@lists.linbit.com wrote:
>
> > Author: phil
> > Date: 2005-10-17 15:40:11 +0200 (Mon, 17 Oct 2005)
> > New Revision: 1985
> >
> > Modified:
> > branches/drbd-0.7/drbd/drbd_bitmap.c
> > Log:
> > Fixed a self made SMP lockup; showing up in drbd_al_complete_io()
> >
> > With drbd-0.7.12 we moved the al_lock before the bm_clear_bit()
> > in __drbd_set_in_sync(). That by itself is okay, but
> > in bm_clear_bit() we used the spin_[un]lock_irq() functions,
> > therefore reenabling interrupts...
> > This is fixed now.
>
> Hi Philipp, how critical is this bug, and how likely is it that it will
> be hit in practice?
depending on the actual timing...
once we started looking for it, we could reproduce it on our
test cluster within minutes. it occurs on Primary/SyncSource which
is the typical case.
it will be hit. rating is CRITICAL (for smp).
there will be a 0.7.14 today, latest tomorrow, including this fix,
and the fix for not notifying the peer about "io-error on read".
--
: Lars Ellenberg Tel +43-1-8178292-0 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe http://www.linbit.com :
* Re: [Drbd-dev] Re: [DRBD-cvs] svn commit by phil - r1985 - branches/drbd-0.7/drbd - Fixed a self made SMP lockup; showing up in drbd_al_com
From: Lars Marowsky-Bree @ 2005-10-17 13:57 UTC
To: drbd-dev
On 2005-10-17T15:56:18, Lars Ellenberg <Lars.Ellenberg@linbit.com> wrote:
> it will be hit. rating is CRITICAL (for smp).
>
> there will be a 0.7.14 today, latest tomorrow, including this fix,
> and the fix for not notifying the peer about "io-error on read".
Great, thank you. I'll pull that into SLES9 then.
Sincerely,
Lars Marowsky-Brée <lmb@suse.de>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
"Ignorance more frequently begets confidence than does knowledge" -- Charles Darwin
* Re: [Drbd-dev] Re: [DRBD-cvs] svn commit by phil - r1985 - branches/drbd-0.7/drbd - Fixed a self made SMP lockup; showing up in drbd_al_com
From: Lars Ellenberg @ 2005-10-17 14:15 UTC
To: drbd-dev
/ 2005-10-17 15:57:49 +0200
\ Lars Marowsky-Bree:
> On 2005-10-17T15:56:18, Lars Ellenberg <Lars.Ellenberg@linbit.com> wrote:
>
> > it will be hit. rating is CRITICAL (for smp).
> >
> > there will be a 0.7.14 today, latest tomorrow, including this fix,
> > and the fix for not notifying the peer about "io-error on read".
>
> Great, thank you. I'll pull that into SLES9 then.
better that way.
it is a race, and it will only trigger if you have seriously high
application io load during drbd resync on a Primary/SyncSource.
But yes, SLES9 SMP systems probably will match that description most of
the time, and if you then reconnect the secondary after some downtime ...
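
for anyone reading along, a minimal userspace sketch of the problem from the
commit log: spin_unlock_irq() always re-enables interrupts, even when the
caller already holds another lock with interrupts off. this is a toy model,
not drbd code. the toy_* helpers, the single irqs_enabled flag and the two
clear_bit_* variants are hypothetical, and it assumes that al_lock is taken
with interrupts disabled and that the fix saves and restores the interrupt
state (the actual r1985 change is not shown in this thread).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU: a single "interrupts enabled" flag. */
static bool irqs_enabled = true;

typedef struct { bool held; } toy_lock;

/* Models spin_lock_irqsave(): remember the old state, then disable. */
static void toy_lock_irqsave(toy_lock *l, bool *flags)
{
    *flags = irqs_enabled;
    irqs_enabled = false;
    assert(!l->held);   /* a real spinlock would spin forever here */
    l->held = true;
}

/* Models spin_unlock_irqrestore(): put back whatever state was saved. */
static void toy_unlock_irqrestore(toy_lock *l, bool flags)
{
    l->held = false;
    irqs_enabled = flags;
}

/* Models spin_lock_irq(): disable interrupts unconditionally. */
static void toy_lock_irq(toy_lock *l)
{
    irqs_enabled = false;
    assert(!l->held);
    l->held = true;
}

/* Models spin_unlock_irq(): ENABLE interrupts unconditionally. */
static void toy_unlock_irq(toy_lock *l)
{
    l->held = false;
    irqs_enabled = true;
}

static toy_lock al_lock, bm_lock;

/* Hypothetical stand-in for the old bm_clear_bit().
 * Returns true if interrupts are on while al_lock is still held. */
static bool clear_bit_buggy(void)
{
    toy_lock_irq(&bm_lock);
    toy_unlock_irq(&bm_lock);
    return irqs_enabled && al_lock.held;
}

/* Assumed shape of the fix: preserve the caller's interrupt state. */
static bool clear_bit_fixed(void)
{
    bool flags;

    toy_lock_irqsave(&bm_lock, &flags);
    toy_unlock_irqrestore(&bm_lock, flags);
    return irqs_enabled && al_lock.held;
}

/* Hypothetical stand-in for __drbd_set_in_sync(): al_lock is taken
 * first, then the bitmap bit is cleared while al_lock is held. */
static bool set_in_sync(bool (*clear_bit)(void))
{
    bool flags, window;

    toy_lock_irqsave(&al_lock, &flags);
    window = clear_bit();
    toy_unlock_irqrestore(&al_lock, flags);
    return window;
}
```

set_in_sync(clear_bit_buggy) reports a window with interrupts enabled under
al_lock, set_in_sync(clear_bit_fixed) does not. in a real smp kernel, an
interrupt arriving in that window on the same cpu and trying to take al_lock
(presumably via drbd_al_complete_io()) would spin forever.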
--
: Lars Ellenberg Tel +43-1-8178292-0 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe http://www.linbit.com :