* [Drbd-dev] Re: [DRBD-cvs] svn commit by phil - r1985 - branches/drbd-0.7/drbd - Fixed a self made SMP lockup; showing up in drbd_al_com
From: Lars Marowsky-Bree @ 2005-10-17 13:43 UTC
To: drbd-dev

On 2005-10-17T15:40:12, drbd-cvs@lists.linbit.com wrote:

> Author: phil
> Date: 2005-10-17 15:40:11 +0200 (Mon, 17 Oct 2005)
> New Revision: 1985
>
> Modified:
>    branches/drbd-0.7/drbd/drbd_bitmap.c
> Log:
> Fixed a self made SMP lockup; showing up in drbd_al_complete_io()
>
> With drbd-0.7.12 we moved the al_lock before the bm_clear_bit()
> in __drbd_set_in_sync(). That by itself is okay, but
> in bm_clear_bit() we used the spin_[un]lock_irq() functions,
> therefore reenabling interrupts...
> This is fixed now.

Hi Philipp, how critical is this bug, and how likely is it that it will
be hit in practice?

Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
"Ignorance more frequently begets confidence than does knowledge" -- Charles Darwin

* Re: [Drbd-dev] Re: [DRBD-cvs] svn commit by phil - r1985 - branches/drbd-0.7/drbd - Fixed a self made SMP lockup; showing up in drbd_al_com
From: Lars Ellenberg @ 2005-10-17 13:56 UTC
To: drbd-dev

/ 2005-10-17 15:43:49 +0200
\ Lars Marowsky-Bree:
> On 2005-10-17T15:40:12, drbd-cvs@lists.linbit.com wrote:
>
> > Author: phil
> > Date: 2005-10-17 15:40:11 +0200 (Mon, 17 Oct 2005)
> > New Revision: 1985
> >
> > Modified:
> >    branches/drbd-0.7/drbd/drbd_bitmap.c
> > Log:
> > Fixed a self made SMP lockup; showing up in drbd_al_complete_io()
> >
> > With drbd-0.7.12 we moved the al_lock before the bm_clear_bit()
> > in __drbd_set_in_sync(). That by itself is okay, but
> > in bm_clear_bit() we used the spin_[un]lock_irq() functions,
> > therefore reenabling interrupts...
> > This is fixed now.
>
> Hi Philipp, how critical is this bug, and how likely is it that it will
> be hit in practice?

depending on the actual timing...
once we started looking for it, we could reproduce it on our test
cluster within minutes.

it occurs on Primary/SyncSource which is the typical case.

it will be hit. rating is CRITICAL (for smp).

there will be a 0.7.14 today, latest tomorrow, including this fix,
and the fix for not notifying the peer about "io-error on read".

-- 
: Lars Ellenberg                                  Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe    http://www.linbit.com :

* Re: [Drbd-dev] Re: [DRBD-cvs] svn commit by phil - r1985 - branches/drbd-0.7/drbd - Fixed a self made SMP lockup; showing up in drbd_al_com
From: Lars Marowsky-Bree @ 2005-10-17 13:57 UTC
To: drbd-dev

On 2005-10-17T15:56:18, Lars Ellenberg <Lars.Ellenberg@linbit.com> wrote:

> it will be hit. rating is CRITICAL (for smp).
>
> there will be a 0.7.14 today, latest tomorrow, including this fix,
> and the fix for not notifying the peer about "io-error on read".

Great, thank you. I'll pull that into SLES9 then.

Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
"Ignorance more frequently begets confidence than does knowledge" -- Charles Darwin

* Re: [Drbd-dev] Re: [DRBD-cvs] svn commit by phil - r1985 - branches/drbd-0.7/drbd - Fixed a self made SMP lockup; showing up in drbd_al_com
From: Lars Ellenberg @ 2005-10-17 14:15 UTC
To: drbd-dev

/ 2005-10-17 15:57:49 +0200
\ Lars Marowsky-Bree:
> On 2005-10-17T15:56:18, Lars Ellenberg <Lars.Ellenberg@linbit.com> wrote:
>
> > it will be hit. rating is CRITICAL (for smp).
> >
> > there will be a 0.7.14 today, latest tomorrow, including this fix,
> > and the fix for not notifying the peer about "io-error on read".
>
> Great, thank you. I'll pull that into SLES9 then.

better that way.

it is a race, and it will only trigger if you have seriously high
application io load during drbd resync on a Primary/SyncSource.
But yes, SLES9 SMP systems probably will match that description
most of the time, and if you then reconnect the secondary after
some downtime ...

-- 
: Lars Ellenberg                                  Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe    http://www.linbit.com :
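
The failure mode named in the commit log (spin_[un]lock_irq() inside
bm_clear_bit() reenabling interrupts while al_lock is held) can be
illustrated with a small userspace model. This is only a sketch under
stated assumptions, not the DRBD code and not the r1985 diff, which is
not shown in this thread. A single boolean stands in for the local CPU's
interrupt-enable state, the model_* helpers imitate the documented
semantics of the kernel's spin_lock_irq()/spin_unlock_irq() and
spin_lock_irqsave()/spin_unlock_irqrestore(), and the names bm_lock,
bm_clear_bit_buggy, bm_clear_bit_fixed and irqs_leak_under_al_lock are
hypothetical. It also assumes that the caller takes al_lock with
interrupts disabled, and that the fix preserves the caller's interrupt
state.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: one flag stands in for "interrupts enabled on this CPU". */
static bool irqs_enabled = true;

typedef struct { bool held; } model_lock_t;

/* Models spin_lock_irq(): disables interrupts unconditionally, then locks. */
static void model_lock_irq(model_lock_t *l)
{
    irqs_enabled = false;
    assert(!l->held);
    l->held = true;
}

/* Models spin_unlock_irq(): unlocks, then ENABLES interrupts unconditionally. */
static void model_unlock_irq(model_lock_t *l)
{
    l->held = false;
    irqs_enabled = true;
}

/* Models spin_lock_irqsave(): returns the previous interrupt state. */
static bool model_lock_irqsave(model_lock_t *l)
{
    bool flags = irqs_enabled;
    irqs_enabled = false;
    assert(!l->held);
    l->held = true;
    return flags;
}

/* Models spin_unlock_irqrestore(): restores whatever state the caller saved. */
static void model_unlock_irqrestore(model_lock_t *l, bool flags)
{
    l->held = false;
    irqs_enabled = flags;
}

static model_lock_t al_lock; /* named in the commit log */
static model_lock_t bm_lock; /* hypothetical name for the bitmap's own lock */

/* Pattern described as buggy: the inner unlock turns interrupts back on. */
void bm_clear_bit_buggy(void)
{
    model_lock_irq(&bm_lock);
    /* ... clear the bit ... */
    model_unlock_irq(&bm_lock);
}

/* Assumed shape of a fix: the inner section preserves the caller's state. */
void bm_clear_bit_fixed(void)
{
    bool flags = model_lock_irqsave(&bm_lock);
    /* ... clear the bit ... */
    model_unlock_irqrestore(&bm_lock, flags);
}

/*
 * Models a caller such as __drbd_set_in_sync(): take al_lock with interrupts
 * off, call bm_clear_bit(), and report whether interrupts were on again
 * while al_lock was still held.
 */
bool irqs_leak_under_al_lock(void (*bm_clear_bit)(void))
{
    irqs_enabled = true;
    bool flags = model_lock_irqsave(&al_lock);
    bm_clear_bit();
    bool leaked = irqs_enabled;
    model_unlock_irqrestore(&al_lock, flags);
    return leaked;
}
```

Calling irqs_leak_under_al_lock(bm_clear_bit_buggy) reports that
interrupts are on again while al_lock is still held, and the same call
with bm_clear_bit_fixed reports that they stay off. In a real SMP kernel
the first case presumably lets an interrupt-context path such as
drbd_al_complete_io() spin on al_lock on the very CPU that already
holds it.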