From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Chen, Kenneth W"
Date: Wed, 19 Feb 2003 18:29:35 +0000
Subject: [Linux-ia64] ia64 rwsem using atomic primitive
MIME-Version: 1
Content-Type: multipart/mixed; boundary="----_=_NextPart_001_01C2D844.DAA731BE"
Message-Id:
List-Id:
To: linux-ia64@vger.kernel.org

This is a multi-part message in MIME format.

------_=_NextPart_001_01C2D844.DAA731BE
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

I have converted the rw semaphore on ia64 from the current generic spin_lock implementation to one that uses architecture-specific atomic operations. The new scheme speeds up all the semaphore operations in the fast path by using an atomic instruction, and falls back to a heavier function only when there is read/write contention. I've also taken some raw measurements of how much it improves things. The most significant gain comes from parallel reader lock acquire/release, which is around 6.6X faster with the new version. Here is a patch against 2.4.20.
<>=20 - Ken ------_=_NextPart_001_01C2D844.DAA731BE Content-Type: application/octet-stream; name="rwsem.2.4.20.patch" Content-Transfer-Encoding: base64 Content-Description: rwsem.2.4.20.patch Content-Disposition: attachment; filename="rwsem.2.4.20.patch" ZGlmZiAtTnVyIGxpbnV4LTIuNC4yMC9hcmNoL2lhNjQvY29uZmlnLmluIGxpbnV4LTIuNC4yMC5y d3NlbS9hcmNoL2lhNjQvY29uZmlnLmluDQotLS0gbGludXgtMi40LjIwL2FyY2gvaWE2NC9jb25m aWcuaW4JV2VkIEZlYiAxOSAxMDoxODozMSAyMDAzDQorKysgbGludXgtMi40LjIwLnJ3c2VtL2Fy Y2gvaWE2NC9jb25maWcuaW4JV2VkIEZlYiAxOSAxMDoxODo1MCAyMDAzDQpAQCAtMjMsOCArMjMs OCBAQA0KIGRlZmluZV9ib29sIENPTkZJR19FSVNBIG4NCiBkZWZpbmVfYm9vbCBDT05GSUdfTUNB IG4NCiBkZWZpbmVfYm9vbCBDT05GSUdfU0JVUyBuDQotZGVmaW5lX2Jvb2wgQ09ORklHX1JXU0VN X0dFTkVSSUNfU1BJTkxPQ0sgeQ0KLWRlZmluZV9ib29sIENPTkZJR19SV1NFTV9YQ0hHQUREX0FM R09SSVRITSBuDQorZGVmaW5lX2Jvb2wgQ09ORklHX1JXU0VNX0dFTkVSSUNfU1BJTkxPQ0sgbg0K K2RlZmluZV9ib29sIENPTkZJR19SV1NFTV9YQ0hHQUREX0FMR09SSVRITSB5DQogDQogY2hvaWNl ICdJQS02NCBwcm9jZXNzb3IgdHlwZScgXA0KIAkiSXRhbml1bQkJQ09ORklHX0lUQU5JVU0gXA0K ZGlmZiAtTnVyIGxpbnV4LTIuNC4yMC9pbmNsdWRlL2FzbS1pYTY0L3J3c2VtLmggbGludXgtMi40 LjIwLnJ3c2VtL2luY2x1ZGUvYXNtLWlhNjQvcndzZW0uaA0KLS0tIGxpbnV4LTIuNC4yMC9pbmNs dWRlL2FzbS1pYTY0L3J3c2VtLmgJV2VkIERlYyAzMSAxNjowMDowMCAxOTY5DQorKysgbGludXgt Mi40LjIwLnJ3c2VtL2luY2x1ZGUvYXNtLWlhNjQvcndzZW0uaAlXZWQgRmViIDE5IDEwOjIwOjAz IDIwMDMNCkBAIC0wLDAgKzEsMTcxIEBADQorLyoNCisgKiBhc20taWE2NC9yd3NlbS5oOiBSL1cg c2VtYXBob3JlcyBmb3IgaWE2NA0KKyAqDQorICogQ29weXJpZ2h0IChDKSAyMDAzIEtlbiBDaGVu IDxrZW5uZXRoLncuY2hlbkBpbnRlbC5jb20+DQorICogQ29weXJpZ2h0IChDKSAyMDAzIEFzaXQg TWFsbGljayA8YXNpdC5rLm1hbGxpY2tAaW50ZWwuY29tPg0KKyAqDQorICogQmFzZWQgb24gYXNt LWkzODYvcndzZW0uaCBhbmQgb3RoZXIgYXJjaGl0ZWN0dXJlIGltcGxlbWVudGF0aW9uLg0KKyAq DQorICogVGhlIE1TVyBvZiB0aGUgY291bnQgaXMgdGhlIG5lZ2F0ZWQgbnVtYmVyIG9mIGFjdGl2 ZSB3cml0ZXJzIGFuZA0KKyAqIHdhaXRpbmcgbG9ja2VycywgYW5kIHRoZSBMU1cgaXMgdGhlIHRv dGFsIG51bWJlciBvZiBhY3RpdmUgbG9ja3MuDQorICoNCisgKiBUaGUgbG9jayBjb3VudCBpcyBp 
bml0aWFsaXplZCB0byAwIChubyBhY3RpdmUgYW5kIG5vIHdhaXRpbmcgbG9ja2VycykuDQorICoN CisgKiBXaGVuIGEgd3JpdGVyIHN1YnRyYWN0cyBXUklURV9CSUFTLCBpdCdsbCBnZXQgMHhmZmZm MDAwMSBmb3IgdGhlIGNhc2UNCisgKiBvZiBhbiB1bmNvbnRlbmRlZCBsb2NrLiBSZWFkZXJzIGlu Y3JlbWVudCBieSAxIGFuZCBzZWUgYSBwb3NpdGl2ZSB2YWx1ZQ0KKyAqIHdoZW4gdW5jb250ZW5k ZWQsIG5lZ2F0aXZlIGlmIHRoZXJlIGFyZSB3cml0ZXJzIChhbmQgbWF5YmUpIHJlYWRlcnMNCisg KiB3YWl0aW5nIChpbiB3aGljaCBjYXNlIGl0IGdvZXMgdG8gc2xlZXApLg0KKyAqLw0KKw0KKyNp Zm5kZWYgX0lBNjRfUldTRU1fSA0KKyNkZWZpbmUgX0lBNjRfUldTRU1fSA0KKw0KKyNpZmRlZiBf X0tFUk5FTF9fDQorI2luY2x1ZGUgPGxpbnV4L2xpc3QuaD4NCisjaW5jbHVkZSA8bGludXgvc3Bp bmxvY2suaD4NCisNCisvKg0KKyAqIHRoZSBzZW1hcGhvcmUgZGVmaW5pdGlvbg0KKyAqLw0KK3N0 cnVjdCByd19zZW1hcGhvcmUgew0KKwlzaWduZWQgaW50CQljb3VudDsNCisJc3BpbmxvY2tfdAkJ d2FpdF9sb2NrOw0KKwlzdHJ1Y3QgbGlzdF9oZWFkCXdhaXRfbGlzdDsNCisjaWYgUldTRU1fREVC VUcNCisJaW50CQkJZGVidWc7DQorI2VuZGlmDQorfTsNCisNCisjZGVmaW5lIFJXU0VNX1VOTE9D S0VEX1ZBTFVFCQkweDAwMDAwMDAwDQorI2RlZmluZSBSV1NFTV9BQ1RJVkVfQklBUwkJMHgwMDAw MDAwMQ0KKyNkZWZpbmUgUldTRU1fQUNUSVZFX01BU0sJCTB4MDAwMGZmZmYNCisjZGVmaW5lIFJX U0VNX1dBSVRJTkdfQklBUwkJKC0weDAwMDEwMDAwKQ0KKyNkZWZpbmUgUldTRU1fQUNUSVZFX1JF QURfQklBUwkJUldTRU1fQUNUSVZFX0JJQVMNCisjZGVmaW5lIFJXU0VNX0FDVElWRV9XUklURV9C SUFTCQkoUldTRU1fV0FJVElOR19CSUFTICsgUldTRU1fQUNUSVZFX0JJQVMpDQorDQorLyoNCisg KiBpbml0aWFsaXphdGlvbg0KKyAqLw0KKyNpZiBSV1NFTV9ERUJVRw0KKyNkZWZpbmUgX19SV1NF TV9ERUJVR19JTklUICAgICAgLCAwDQorI2Vsc2UNCisjZGVmaW5lIF9fUldTRU1fREVCVUdfSU5J VAkvKiAqLw0KKyNlbmRpZg0KKw0KKyNkZWZpbmUgX19SV1NFTV9JTklUSUFMSVpFUihuYW1lKSBc DQorCXsgUldTRU1fVU5MT0NLRURfVkFMVUUsIFNQSU5fTE9DS19VTkxPQ0tFRCwgXA0KKwkgIExJ U1RfSEVBRF9JTklUKChuYW1lKS53YWl0X2xpc3QpIFwNCisJICBfX1JXU0VNX0RFQlVHX0lOSVQg fQ0KKw0KKyNkZWZpbmUgREVDTEFSRV9SV1NFTShuYW1lKSBcDQorCXN0cnVjdCByd19zZW1hcGhv cmUgbmFtZSA9IF9fUldTRU1fSU5JVElBTElaRVIobmFtZSkNCisNCitleHRlcm4gc3RydWN0IHJ3 X3NlbWFwaG9yZSAqcndzZW1fZG93bl9yZWFkX2ZhaWxlZChzdHJ1Y3Qgcndfc2VtYXBob3JlICpz 
ZW0pOw0KK2V4dGVybiBzdHJ1Y3Qgcndfc2VtYXBob3JlICpyd3NlbV9kb3duX3dyaXRlX2ZhaWxl ZChzdHJ1Y3Qgcndfc2VtYXBob3JlICpzZW0pOw0KK2V4dGVybiBzdHJ1Y3Qgcndfc2VtYXBob3Jl ICpyd3NlbV93YWtlKHN0cnVjdCByd19zZW1hcGhvcmUgKnNlbSk7DQorDQorc3RhdGljIGlubGlu ZSB2b2lkIGluaXRfcndzZW0oc3RydWN0IHJ3X3NlbWFwaG9yZSAqc2VtKQ0KK3sNCisJc2VtLT5j b3VudCA9IFJXU0VNX1VOTE9DS0VEX1ZBTFVFOw0KKwlzcGluX2xvY2tfaW5pdCgmc2VtLT53YWl0 X2xvY2spOw0KKwlJTklUX0xJU1RfSEVBRCgmc2VtLT53YWl0X2xpc3QpOw0KKyNpZiBSV1NFTV9E RUJVRw0KKwlzZW0tPmRlYnVnID0gMDsNCisjZW5kaWYNCit9DQorDQorLyoNCisgKiBsb2NrIGZv ciByZWFkaW5nDQorICovDQorc3RhdGljIGlubGluZSB2b2lkIF9fZG93bl9yZWFkKHN0cnVjdCBy d19zZW1hcGhvcmUgKnNlbSkNCit7DQorCWludCByZXN1bHQ7DQorCV9fYXNtX18gX192b2xhdGls ZV9fICgiZmV0Y2hhZGQ0LmFjcSAlMD1bJTFdLDEiIDoNCisJCQkgICAgICAiPXIiKHJlc3VsdCkg OiAiciIoJnNlbS0+Y291bnQpIDogIm1lbW9yeSIpOw0KKwlpZiAocmVzdWx0IDwgMCkNCisJCXJ3 c2VtX2Rvd25fcmVhZF9mYWlsZWQoc2VtKTsNCit9DQorDQorLyoNCisgKiBsb2NrIGZvciB3cml0 aW5nDQorICovDQorc3RhdGljIGlubGluZSB2b2lkIF9fZG93bl93cml0ZShzdHJ1Y3Qgcndfc2Vt YXBob3JlICpzZW0pDQorew0KKwlpbnQgb2xkLCBuZXc7DQorDQorCWRvIHsNCisJCW9sZCA9IHNl bS0+Y291bnQ7DQorCQluZXcgPSBvbGQgKyBSV1NFTV9BQ1RJVkVfV1JJVEVfQklBUzsNCisJfSB3 aGlsZSAoY21weGNoZ19hY3EoJnNlbS0+Y291bnQsIG9sZCwgbmV3KSAhPSBvbGQpOw0KKw0KKwlp ZiAob2xkICE9IDApDQorCQlyd3NlbV9kb3duX3dyaXRlX2ZhaWxlZChzZW0pOw0KK30NCisNCisv Kg0KKyAqIHVubG9jayBhZnRlciByZWFkaW5nDQorICovDQorc3RhdGljIGlubGluZSB2b2lkIF9f dXBfcmVhZChzdHJ1Y3Qgcndfc2VtYXBob3JlICpzZW0pDQorew0KKwlpbnQgcmVzdWx0Ow0KKwlf X2FzbV9fIF9fdm9sYXRpbGVfXyAoImZldGNoYWRkNC5yZWwgJTA9WyUxXSwtMSIgOg0KKwkJCSAg ICAgICI9ciIocmVzdWx0KSA6ICJyIigmc2VtLT5jb3VudCkgOiAibWVtb3J5Iik7DQorCWlmIChy ZXN1bHQgPCAwICYmICgtLXJlc3VsdCAmIFJXU0VNX0FDVElWRV9NQVNLKSA9PSAwKQ0KKwkJcndz ZW1fd2FrZShzZW0pOw0KK30NCisNCisvKg0KKyAqIHVubG9jayBhZnRlciB3cml0aW5nDQorICov DQorc3RhdGljIGlubGluZSB2b2lkIF9fdXBfd3JpdGUoc3RydWN0IHJ3X3NlbWFwaG9yZSAqc2Vt KQ0KK3sNCisJaW50IG9sZCwgbmV3Ow0KKw0KKwlkbyB7DQorCQlvbGQgPSBzZW0tPmNvdW50Ow0K 
KwkJbmV3ID0gb2xkIC0gUldTRU1fQUNUSVZFX1dSSVRFX0JJQVM7DQorCX0gd2hpbGUgKGNtcHhj aGdfcmVsKCZzZW0tPmNvdW50LCBvbGQsIG5ldykgIT0gb2xkKTsNCisNCisJaWYgKG5ldyA8IDAg JiYgKG5ldyAmIFJXU0VNX0FDVElWRV9NQVNLKSA9PSAwKQ0KKwkJcndzZW1fd2FrZShzZW0pOw0K K30NCisNCisvKg0KKyAqIHRyeWxvY2sgZm9yIHJlYWRpbmcgLS0gcmV0dXJucyAxIGlmIHN1Y2Nl c3NmdWwsIDAgaWYgY29udGVudGlvbg0KKyAqLw0KK3N0YXRpYyBpbmxpbmUgaW50IF9fZG93bl9y ZWFkX3RyeWxvY2soc3RydWN0IHJ3X3NlbWFwaG9yZSAqc2VtKQ0KK3sNCisJaW50IHRtcDsNCisJ d2hpbGUgKCh0bXAgPSBzZW0tPmNvdW50KSA+PSAwKSB7DQorCQlpZiAodG1wID09IGNtcHhjaGdf YWNxKCZzZW0tPmNvdW50LCB0bXAsIHRtcCsxKSkgew0KKwkJCXJldHVybiAxOw0KKwkJfQ0KKwl9 DQorCXJldHVybiAwOw0KK30NCisNCisvKg0KKyAqIHRyeWxvY2sgZm9yIHdyaXRpbmcgLS0gcmV0 dXJucyAxIGlmIHN1Y2Nlc3NmdWwsIDAgaWYgY29udGVudGlvbg0KKyAqLw0KK3N0YXRpYyBpbmxp bmUgaW50IF9fZG93bl93cml0ZV90cnlsb2NrKHN0cnVjdCByd19zZW1hcGhvcmUgKnNlbSkNCit7 DQorCWludCB0bXAgPSBjbXB4Y2hnX2FjcSgmc2VtLT5jb3VudCwgUldTRU1fVU5MT0NLRURfVkFM VUUsDQorCQkJICBSV1NFTV9BQ1RJVkVfV1JJVEVfQklBUyk7DQorCXJldHVybiB0bXAgPT0gUldT RU1fVU5MT0NLRURfVkFMVUU7DQorfQ0KKw0KKy8qDQorICogaW1wbGVtZW50IGF0b21pYyBhZGQg ZnVuY3Rpb25hbGl0eQ0KKyAqLw0KK3N0YXRpYyBpbmxpbmUgdm9pZCByd3NlbV9hdG9taWNfYWRk KGludCBkZWx0YSwgc3RydWN0IHJ3X3NlbWFwaG9yZSAqc2VtKQ0KK3sNCisJYXRvbWljX2FkZChk ZWx0YSwgKGF0b21pY190ICopKCZzZW0tPmNvdW50KSk7DQorfQ0KKw0KK3N0YXRpYyBpbmxpbmUg aW50IHJ3c2VtX2F0b21pY191cGRhdGUoaW50IGRlbHRhLCBzdHJ1Y3Qgcndfc2VtYXBob3JlICpz ZW0pDQorew0KKwlyZXR1cm4gYXRvbWljX2FkZF9yZXR1cm4oZGVsdGEsIChhdG9taWNfdCAqKSgm c2VtLT5jb3VudCkpOw0KK30NCisNCisjZW5kaWYgLyogX19LRVJORUxfXyAqLw0KKyNlbmRpZiAv KiBfSUE2NF9SV1NFTV9IICovDQo= ------_=_NextPart_001_01C2D844.DAA731BE--
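
The fast-path/slow-path split described in the message can be sketched in portable C11. This is an illustration under stated assumptions, not the attached patch: it uses <stdatomic.h> in place of the ia64 fetchadd4 and cmpxchg primitives, the writer paths use a plain atomic add where the patch uses a compare-and-exchange loop, and every sketch_ name is hypothetical. The bias constants follow the ones defined in the attached rwsem.h. The contended slow path (sleeping and waking waiters) is not implemented; each function only reports whether it would be needed.

```c
#include <stdatomic.h>

/* Count layout as in the attached rwsem.h: the low 16 bits count active
 * lockers, the high 16 bits hold the negated number of active writers
 * and waiters. */
#define RWSEM_UNLOCKED_VALUE    0x00000000
#define RWSEM_ACTIVE_BIAS       0x00000001
#define RWSEM_ACTIVE_MASK       0x0000ffff
#define RWSEM_WAITING_BIAS      (-0x00010000)
#define RWSEM_ACTIVE_WRITE_BIAS (RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)

/* Hypothetical stand-in for struct rw_semaphore: only the counter. */
struct sketch_rwsem {
    atomic_int count;
};

/* Reader acquire. Returns 1 if the fast path took the lock, 0 if the
 * caller would have to enter the slow path (a writer or waiter exists). */
static int sketch_down_read(struct sketch_rwsem *sem)
{
    int old = atomic_fetch_add_explicit(&sem->count, RWSEM_ACTIVE_BIAS,
                                        memory_order_acquire);
    return old >= 0;
}

/* Writer acquire. Returns 1 only if the semaphore was completely idle. */
static int sketch_down_write(struct sketch_rwsem *sem)
{
    int old = atomic_fetch_add_explicit(&sem->count, RWSEM_ACTIVE_WRITE_BIAS,
                                        memory_order_acquire);
    return old == RWSEM_UNLOCKED_VALUE;
}

/* Reader release. Returns 1 if waiters exist and this was the last
 * active locker, i.e. the point where a wake-up would be needed. */
static int sketch_up_read(struct sketch_rwsem *sem)
{
    int old = atomic_fetch_add_explicit(&sem->count, -RWSEM_ACTIVE_BIAS,
                                        memory_order_release);
    int new = old - RWSEM_ACTIVE_BIAS;
    return old < 0 && (new & RWSEM_ACTIVE_MASK) == 0;
}

/* Writer release. Same wake-up condition, evaluated on the new count. */
static int sketch_up_write(struct sketch_rwsem *sem)
{
    int old = atomic_fetch_sub_explicit(&sem->count, RWSEM_ACTIVE_WRITE_BIAS,
                                        memory_order_release);
    int new = old - RWSEM_ACTIVE_WRITE_BIAS;
    return new < 0 && (new & RWSEM_ACTIVE_MASK) == 0;
}
```

In the uncontended case each operation is a single atomic read-modify-write on the counter, with no spinlock taken. Readers that run in parallel only add to and subtract from the same word, which is the case the message reports as gaining the most.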