From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id ; Thu, 3 May 2001 15:26:52 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id ; Thu, 3 May 2001 15:26:39 -0400 Received: from cisco7500-mainGW.gts.cz ([194.213.32.131]:10500 "EHLO bug.ucw.cz") by vger.kernel.org with ESMTP id ; Thu, 3 May 2001 15:26:06 -0400 Message-ID: <20010502230357.A9507@bug.ucw.cz> Date: Wed, 2 May 2001 23:03:57 +0200 From: Pavel Machek To: Doug Ledford , Max TenEyck Woodbury Cc: Eric.Ayers@intec-telecom-systems.com, James Bottomley , "Roets, Chris" , linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org Subject: Re: Linux Cluster using shared scsi In-Reply-To: <200105011445.KAA01117@localhost.localdomain> <3AEEDFFC.409D8271@redhat.com> <15086.60620.745722.345084@gargle.gargle.HOWL> <3AF025AE.511064F3@redhat.com> <3AF04648.73F5BFCE@cds.duke.edu> <3AF0483C.49C8CF90@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 0.93i In-Reply-To: <3AF0483C.49C8CF90@redhat.com>; from Doug Ledford on Wed, May 02, 2001 at 01:47:40PM -0400 Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Hi! > > > ... > > > > > > If told to hold a reservation, then resend your reservation request once every > > > 2 seconds (this actually has very minimal CPU/BUS usage and isn't as big a > > > deal as requesting a reservation every 2 seconds might sound). The first time > > > the reservation is refused, consider the reservation stolen by another machine > > > and exit (or optionally, reboot). > > > > Umm. Reboot? What do you think this is? Windoze? > > It's the *only* way to guarantee that the drive is never touched by more than > one machine at a time (notice, I've not been talking about a shared use drive, > only one machine in the cluster "owns" the drive at a time, and it isn't for > single transactions that it owns the drive, it owns the drive for as long as > it is alive, this is a limitation of the filesystes currently available in > mainstream kernels). The reservation conflict and subsequent reboot also > *only* happens when a reservation has been forcefully stolen from a >machine. I do not believe reboot from kernel is right approach. Tell init with special signal, maybe; but do not reboot forcefully. This is policy; doing reboot might be right answer in 90% cases; that does not mean you should do it always. How is "reservation error" different from any other error? What if drive generates reservation error by mistake? I believe drive should be considered dead until reboot after "reservation error". That way you do not damage data by default, and provide user option with doing whatever is appropriate. Pavel -- I'm pavel@ucw.cz. "In my country we have almost anarchy and I don't care." Panos Katsaloulis describing me w.r.t. patents at discuss@linmodems.org