From mboxrd@z Thu Jan 1 00:00:00 1970 From: Steven Haigh Subject: Re: [PATCH] v2 - Add exclusive locking option to block-iscsi Date: Fri, 6 May 2016 19:44:01 +1000 Message-ID: <572C6761.6080508@crc.id.au> References: <7096b642a7ecfe36f8896c4b2b840457@crc.id.au> <20160506090927.jektqesd3l45wymt@mac> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============8261532697260513767==" Return-path: In-Reply-To: <20160506090927.jektqesd3l45wymt@mac> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: =?UTF-8?Q?Roger_Pau_Monn=c3=a9?= Cc: xen-devel@lists.xen.org List-Id: xen-devel@lists.xenproject.org This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --===============8261532697260513767== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="3jgnlfGr1VI3QbVWF07ToLjUpK8rTwvoU" This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --3jgnlfGr1VI3QbVWF07ToLjUpK8rTwvoU Content-Type: multipart/mixed; boundary="a2OWOuTT9exoIIVqmVTmHW0f5emtdVdE7" From: Steven Haigh To: =?UTF-8?Q?Roger_Pau_Monn=c3=a9?= Cc: xen-devel@lists.xen.org Message-ID: <572C6761.6080508@crc.id.au> Subject: Re: [PATCH] v2 - Add exclusive locking option to block-iscsi References: <7096b642a7ecfe36f8896c4b2b840457@crc.id.au> <20160506090927.jektqesd3l45wymt@mac> In-Reply-To: <20160506090927.jektqesd3l45wymt@mac> --a2OWOuTT9exoIIVqmVTmHW0f5emtdVdE7 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable On 6/05/2016 7:09 PM, Roger Pau Monn=E9 wrote: > On Thu, May 05, 2016 at 03:52:30PM +1000, Steven Haigh wrote: >> On 2016-05-05 12:32, Steven Haigh wrote: >>> Overview >>> >>> If you're using iSCSI, you can mount a target by multiple Dom0 >>> machines on the same target. For non-cluster aware filesystems, this >>> can lead to disk corruption and general bad times by all. The iSCSI >>> protocol allows the use of persistent reservations as per the SCSI >>> disk spec. Low level SCSI commands for locking are handled by the >>> sg_persist program (bundled with sg3_utils package in EL). >>> >>> The aim of this patch is to create a 'locktarget=3Dy' option specifie= d >>> within the disk 'target' command for iSCSI to lock the target in >>> exclusive mode on VM start with a key generated from the local system= s >>> IP, and release this lock on the shutdown of the DomU. >>> >>> Example Config: >>> disk =3D >>> ['script=3Dblock-iscsi,vdev=3Dxvda,target=3Diqn=3Diqn.1986-03.com.sun= :02:mytarget,portal=3Discsi.example.com,locktarget=3Dy'] >>> >>> In writing this, I have also re-factored parts of the script to put >>> some things in what I believe to be a better place to make expansion >>> easier. This is mainly in removing functions that purely call other >>> functions with no actual code execution. >>> >>> Signed-off-by: Steven Haigh >>> >>> (on a side note, first time I've submitted a patch to the list and I'= m >>> currently stuck on a webmail client, so apologies in advance if this >>> all goes wrong ;) >> >> Changes in v2: >> Bugfix: Call find_device to locate the /dev/sdX component of the iSCSI= >> target before trying to run unlock_device(). >> >> Apologies for this oversight. >=20 > Thanks for the patch! A couple of comments below. >=20 >> --=20 >> Steven Haigh >> >> Email: netwiz@crc.id.au >> Web: https://www.crc.id.au >> Phone: (03) 9001 6090 - 0412 935 897 >=20 >> --- block-iscsi 2016-02-10 01:44:19.000000000 +1100 >> +++ block-iscsi-lock 2016-05-05 15:42:09.557191235 +1000 >> @@ -31,33 +31,37 @@ >> echo $1 | sed "s/^\("$2"\)//" >> } >> =20 >> -check_tools() >> -{ >> - if ! command -v iscsiadm > /dev/null 2>&1; then >> - fatal "Unable to find iscsiadm tool" >> - fi >> - if [ "$multipath" =3D "y" ] && ! command -v multipath > /dev/null= 2>&1; then >> - fatal "Unable to find multipath" >> - fi >> -} >> - >> # Sets the following global variables based on the params field passe= d in as >> # a parameter: iqn, portal, auth_method, user, multipath, password >> parse_target() >> { >> # set multipath default value >> multipath=3D"n" >> - for param in $(echo "$1" | tr "," "\n") >> - do >> + for param in $(echo "$1" | tr "," "\n"); do >> case $param in >> iqn=3D*) >> iqn=3D$(remove_label $param "iqn=3D") >> + if ! command -v iscsiadm > /dev/null 2>&1; then >> + fatal "Could not find iscsiadm tool." >> + fi >> ;; >> portal=3D*) >> portal=3D$(remove_label $param "portal=3D") >> ;; >> multipath=3D*) >> multipath=3D$(remove_label $param "multipath=3D") >> + if ! command -v multipath > /dev/null 2>&1; then >> + fatal "Multipath selected, but no multipath tools fou= nd" >> + fi >> + ;; >> + locktarget=3D*) >> + locktarget=3D$(remove_label $param "locktarget=3D") >> + if ! command -v sg_persist > /dev/null 2>&1; then >> + fatal "Locking requested but no sg_persist found" >> + fi >> + if ! command -v gethostip > /dev/null 2>&1; then >> + fatal "Locking requested but no gethostip found for k= ey generation" >> + fi >=20 > Why don't you just add this to check_tools? In any case, if you want to= fold=20 > check_tools functionality into parse_target I think it should be done i= n a=20 > separate patch in order for it to be easier to review. >=20 > IMHO, I prefer to have both functions separated, because it's quite cle= ar=20 > what parse_target and check_tools do.=20 My thoughts are that it makes more sense to check that the tool required is there depending on the options that are provided. It saves multiple cases that would end up checking if an option is set, then check if the tool is there. Leaving the existing structure would mean that you parse out the target options in parse_target, but then also check which options are set to then check if the required tools are available for the specific options. It seems more logical and efficient to me to merge this together and essentially do the pre-flight checks as we go. It may be an idea to then rename parse_target to something else as a function name - maybe prepare (as that function doesn't exist anymore) to better reflect what it does? >> ;; >> esac >> done >> @@ -96,38 +100,29 @@ >> fi >> } >> =20 >> -# Attaches the target $iqn in $portal and sets $dev to point to the >> -# multipath device >> -attach() >> -{ >> - do_or_die iscsiadm -m node --targetname "$iqn" -p "$portal" --log= in > /dev/null >> - find_device >> -} >> =20 >> -# Discovers targets in $portal and checks that $iqn is one of those t= argets >> -# Also sets the auth parameters to attach the device >> -prepare() >> +lock_device() >> { >> - # Check if target is already opened >> - iscsiadm -m session 2>&1 | grep -q "$iqn" && fatal "Device alread= y opened" >> - # Discover portal targets >> - iscsiadm -m discovery -t st -p $portal 2>&1 | grep -q "$iqn" || \= >> - fatal "No matching target iqn found" >> -} >> - >> -# Attaches the device and writes xenstore backend entries to connect >> -# the device >> -add() >> -{ >> - attach >> - write_dev $dev >> + ## Lock the iSCSI target as Exclusive Access. >> + key=3D$(gethostip -x $(uname -n)) >=20 > Isn't there anything else that can be used as a key? What would happen = if=20 > the host IP changes while the VM is running? Couldn't the iqn (or a has= h=20 > of it) be used as the key? I thought long and hard about this - as I would like a better method. The key is used to secure the persistent reservation. If another system uses the same key, then they can remove the reservation. My testing seems to indicate that if you provide the same key while asking for a lock, you get it - even if a previous lock was granted to another system.= The idea here is that each Dom0 would have a different key - therefore the lock request from a different Dom0 would fail. >> + if ! sg_persist -d ${dev} -o -G -S ${key} > /dev/null; then >> + unlock_device >> + iscsiadm -m node --targetname "$iqn" -p "$portal" --logout > = /dev/null >> + fatal "iSCSI LOCK: Failed to register with target" >> + fi >> + if ! sg_persist -d ${dev} -o -R -K ${key} -T 6 > /dev/null; then >> + unlock_device >> + iscsiadm -m node --targetname "$iqn" -p "$portal" --logout > = /dev/null >> + fatal "iSCSI LOCK: Failed to set persistent reservation" >> + fi >> } >> =20 >> -# Disconnects the device >> -remove() >> +unlock_device() >> { >> - find_device >> - do_or_die iscsiadm -m node --targetname "$iqn" -p "$portal" --log= out > /dev/null >> + ## Unlock the iSCSI target. >> + key=3D$(gethostip -x $(uname -n)) >> + sg_persist -d ${dev} -o -L -K ${key} -T 6 > /dev/null || true >> + sg_persist -d ${dev} -o -G -K ${key} -S 0 > /dev/null || true >=20 > I'm not sure whether the return code of those functions shouldn't be=20 > checked. I know this is the teardown path, so there's not much that can= =20 > be done here on failure, but it seems like at least the script should c= heck=20 > for the return code '3', and then retry the command. According to=20 > sg3_utils(8), a '3' means: >=20 > "the DEVICE reports that it is not ready for the operation requested. T= he=20 > device may be in the process of becoming ready (e.g. spinning up but no= t at=20 > speed) so the utility may work after a wait." Possibly - however I came across a problem here... pygrub will add & remove while it gets the grub config file, then add again while firing up the actual VM. This causes a lock - unlock - lock - boot - unlock type workflow. You're right that it may be worthwhile to retry some of these - but I don't quite have a method in mind to do this that I'd be happy with. >> } >> >> command=3D$1 >> @@ -138,15 +133,28 @@ >> >> parse_target "$target" >> >> -check_tools || exit 1 >> - >> case $command in >> add) >> - prepare >> - add >> + # Check if target is already opened >> + iscsiadm -m session 2>&1 | grep -q "$iqn" && fatal "Device alread= y opened" >> + # Discover portal targets >> + iscsiadm -m discovery -t st -p $portal 2>&1 | grep -q "$iqn" || \= >> + fatal "No matching target iqn found" >> + >> + ## Login to the iSCSI target. >> + do_or_die iscsiadm -m node --targetname "$iqn" -p "$portal" --log= in > /dev/null >> + find_device >> + if [ "$locktarget" =3D "y" ]; then >> + lock_device >> + fi >> + write_dev $dev >> ;; >> remove) >> - remove >> + find_device >> + if [ "$locktarget" =3D "y" ]; then >> + unlock_device >> + fi >> + do_or_die iscsiadm -m node --targetname "$iqn" -p "$portal" --log= out > /dev/null >=20 > You are doing quite of a rework of the script here (by removing/merging= a=20 > bunch of functions). I think this should be sent as a separate patch, i= n=20 > order to ease the review (mixing new features with code cleanup makes i= t=20 > quite hard to review). Yeah I know - and I'm kinda new at this (being my first patch sent here). I also don't have a git clone of the xen repos, or anything else to base this one right now so I end up doing this manually. I'll see if I can break this up a bit more - but my time is somewhat limited atm when I have access to these machines to tinker. --=20 Steven Haigh Email: netwiz@crc.id.au Web: https://www.crc.id.au Phone: (03) 9001 6090 - 0412 935 897 --a2OWOuTT9exoIIVqmVTmHW0f5emtdVdE7-- --3jgnlfGr1VI3QbVWF07ToLjUpK8rTwvoU Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAEBCAAGBQJXLGdqAAoJEEGvNdV6fTHcpyIP/itbdDs9n8O2mvqSCDzFPW5w Zun1yXRYISifo7omlYI9nMUfSvNG3cxtwchnK0pdFNsbmIt9xisrdSamliIICa/+ Xa3XQfZMtZBuqcc2ztG6u5G0Caeo+CjZKZFL7hZiGvr0qXYPjJqVb+TX7xSLgj+/ Gn6x37otu82D5Kmhaa3FQZHEXhV3MjnrUD/QaIvTqjzlWbN3AkR3Dxso0pihXzaJ dZr/b9EcKf5tFyaEN8j8EK4kicY9xd7FzdHtLcbDyxuR/NPu3nf2sbJr5yKJ4eHx MiOvOnUXqzjVOXln4yqK1kxeoxo55vJlg6lTj8GdXh+LkdM++6o1CajzgiVH3WeW PUxLYOac1eZHFrRNBxosvnu7bHHKa0auSe6T3aJwnE5NPKS4QG1kjCJj/Vb0qC6M B/ZqRTwdN6pGAkp7ZiFM6imyLDwKxRPDn8Cku6V9Two183p69JXhMM1vzZrTbwLZ Utv2pUL4jFHfiVCdbjLXUXS8TsCrYAnvb3dIkBa4QqSXBnEXEdKxEQv9JY1lVDIy SdKiw7m/HpnC3p5ySMzGUXs2u6lpgiTZ+9oklsaaeWcWx77an7ZMBPsJHVmQ2ERB u/qDUJnto/9xgHiXndpIkcWr7VaoZjalomupmq9mGH8+7rIo8v/bSxzVtsssJhgF VTiio/0EPxFNTkFQn0yA =W+wt -----END PGP SIGNATURE----- --3jgnlfGr1VI3QbVWF07ToLjUpK8rTwvoU-- --===============8261532697260513767== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwOi8vbGlzdHMueGVuLm9y Zy94ZW4tZGV2ZWwK --===============8261532697260513767==--