From: Neil Brown
Subject: Re: [PATCH] md-cluster: Only one thread should request DLM lock
Date: Wed, 28 Oct 2015 05:48:27 +0900
Message-ID: <87twpcca1w.fsf@notabene.neil.brown.name>
References: <1445520669-4406-1-git-send-email-rgoldwyn@suse.de> <874mhiz643.fsf@notabene.neil.brown.name> <562A099E.2060709@suse.de>
In-Reply-To: <562A099E.2060709@suse.de>
Sender: linux-raid-owner@vger.kernel.org
To: Goldwyn Rodrigues, linux-raid@vger.kernel.org
Cc: gqjiang@suse.com, Goldwyn Rodrigues
List-Id: linux-raid.ids

On Fri, Oct 23 2015, Goldwyn Rodrigues wrote:

> On 10/22/2015 09:11 PM, Neil Brown wrote:
>> rgoldwyn@suse.de writes:
>>
>>> From: Goldwyn Rodrigues
>>>
>>> If a DLM lock is in progress, requesting the same DLM lock will
>>> result in -EBUSY. Use a mutex to make sure only one thread requests
>>> for dlm_lock() function at a time.
>>>
>>> This will fix the error -EBUSY returned from DLM's
>>> validate_lock_args().
>>
>> I can see that we only want one thread calling dlm_lock() with a given
>> 'struct dlm_lock_resource' at a time, otherwise nasty things could
>> happen.
>>
>> However if such a race is possible, then aren't there other possibly
>> complications.
>
> This is specific to the duration of dlm_lock() function only and not the
> entire lifetime of the resource. If one thread has requested dlm_lock()
> and another thread comes in and calls dlm_lock() on the same resource,
> we will get -EBUSY on the second one because the lock is already requested.
>
> Our dlm_unlock_sync() call is also a dlm_lock_sync(), and eventually
> dlm_lock() call, with a NULL lock.
>
>>
>> Suppose two threads try to lock the same resource.
>> Presumably one will try to lock the resource, then the next one (when it
>> gets the mutex) will discover that it already has the resource, but will
>> think it has exclusive access - maybe?
>
> I am not sure if I understand this. DLM locks are supposed to be at the
> node level as opposed to thread level.

I think this is exactly my point.  I think we need some extra
thread-level locking.

For example, suppose some thread calls sendmsg(), which takes the token
lock, and while that is happening metadata_update_start() gets called.
It will try to take the token lock, but as the node already has the
lock, it will succeed trivially.  Then two threads on the one node both
think they have the lock, which will almost certainly lead to confusion.

So we need to hold some mutex the entire time that sendmsg() is running,
and need to hold that same mutex when calling metadata_update_start().
Once we have that, there is no need for the mutex you introduced, which
is just held while claiming the lock.

It could be that we can use ->reconfig_mutex for a lot of this.
Certainly we always hold ->reconfig_mutex while performing a metadata
update.  We probably don't want to take it just for
->resync_info_update().

I'm not sure if it would be best to have a per-resource mutex which we
take in dlm_lock_sync() and drop in dlm_unlock_sync(), or if we want the
locking at a higher level.  Probably ->reconfig_mutex is already used
where we need higher-level locking.  So if you change your patch to
unlock in dlm_unlock_sync() rather than at the end of dlm_lock_sync(),
then I think it will make sense.
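
To make the per-resource option concrete, here is a minimal userspace
sketch using pthreads.  It is only a model under stated assumptions, not
md-cluster code: struct lock_res, node_lock(), lock_sync(),
unlock_sync() and run_demo() are all hypothetical names, and node_lock()
merely imitates a node-level lock that succeeds trivially when the node
already holds it.  The point it shows is that the thread-level mutex is
taken in lock_sync() and only dropped in unlock_sync().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dlm_lock_resource; not the kernel type. */
struct lock_res {
	pthread_mutex_t state_lock;   /* protects node_holds */
	bool node_holds;              /* node-level state: does this node hold it? */
	pthread_mutex_t thread_lock;  /* thread-level: held from lock to unlock */
};

static void res_init(struct lock_res *r)
{
	pthread_mutex_init(&r->state_lock, NULL);
	pthread_mutex_init(&r->thread_lock, NULL);
	r->node_holds = false;
}

/* Models a node-level lock: it succeeds at once even if already held. */
static void node_lock(struct lock_res *r)
{
	pthread_mutex_lock(&r->state_lock);
	r->node_holds = true;
	pthread_mutex_unlock(&r->state_lock);
}

static void node_unlock(struct lock_res *r)
{
	pthread_mutex_lock(&r->state_lock);
	r->node_holds = false;
	pthread_mutex_unlock(&r->state_lock);
}

/* Take the per-resource mutex first and keep it until unlock_sync(). */
static void lock_sync(struct lock_res *r)
{
	pthread_mutex_lock(&r->thread_lock);
	node_lock(r);
}

static void unlock_sync(struct lock_res *r)
{
	node_unlock(r);
	pthread_mutex_unlock(&r->thread_lock);
}

#define ITERATIONS 100000

struct demo {
	struct lock_res res;
	int inside;    /* threads currently between lock_sync() and unlock_sync() */
	int overlaps;  /* times a thread found another one already inside */
	long counter;  /* plain increments, only safe under mutual exclusion */
};

static void *worker(void *arg)
{
	struct demo *d = arg;

	for (int i = 0; i < ITERATIONS; i++) {
		lock_sync(&d->res);
		if (d->inside++ != 0)
			d->overlaps++;
		d->counter++;
		d->inside--;
		unlock_sync(&d->res);
	}
	return NULL;
}

/* Runs two contending threads; returns the counter and reports overlaps. */
static long run_demo(int *overlaps)
{
	struct demo d = { .inside = 0, .overlaps = 0, .counter = 0 };
	pthread_t a, b;

	assert(overlaps != NULL);
	res_init(&d.res);
	pthread_create(&a, NULL, worker, &d);
	pthread_create(&b, NULL, worker, &d);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	*overlaps = d.overlaps;
	return d.counter;
}
```

In this model, calling run_demo() from a main() should leave the counter
at 2 * ITERATIONS with no overlaps.  If the pthread_mutex_unlock() were
moved to the end of lock_sync(), the mutex would only cover the claim
itself and both threads could be inside at once.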

Thanks,
NeilBrown