From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from foss.arm.com ([217.140.101.70]:49030 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932521AbdJJSj2 (ORCPT ); Tue, 10 Oct 2017 14:39:28 -0400
From: Dave Martin
Subject: [PATCH v3 16/28] arm64/sve: Probe SVE capabilities and usable vector lengths
Date: Tue, 10 Oct 2017 19:38:33 +0100
Message-ID: <1507660725-7986-17-git-send-email-Dave.Martin@arm.com>
In-Reply-To:
 <1507660725-7986-1-git-send-email-Dave.Martin@arm.com>
References: <1507660725-7986-1-git-send-email-Dave.Martin@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: linux-arm-kernel@lists.infradead.org
Cc: Catalin Marinas , Will Deacon , Ard Biesheuvel , Alex Bennée , Szabolcs Nagy , Richard Sandiford , Okamoto Takayuki , kvmarm@lists.cs.columbia.edu, libc-alpha@sourceware.org, linux-arch@vger.kernel.org, Suzuki K Poulose
Message-ID: <20171010183833.qP5G5n684zs5pNsxo-xSXy9XF8Ol91VrXjnSlNShpAo@z>

This patch uses the cpufeatures framework to determine common SVE
capabilities and vector lengths, and configures the runtime SVE
support code appropriately.

ZCR_ELx is not really a feature register, but it is convenient to
use it as a template for recording the maximum vector length
supported by a CPU, using the LEN field.  This field is similar to
a feature field in that it is a contiguous bitfield for which we
want to determine the minimum system-wide value.  This patch adds
ZCR as a pseudo-register in cpuinfo/cpufeatures, with appropriate
custom code to populate it.  Finding the minimum supported value of
the LEN field is left to the cpufeatures framework in the usual
way.

The meaning of ID_AA64ZFR0_EL1 is not architecturally defined yet,
so for now we just require it to be zero.

Note that much of this code is dormant and SVE still won't be used
yet, since system_supports_sve() remains hardwired to false.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>

---

Dropped Alex Bennée's Reviewed-by, since there is new logic in this
patch.

Changes since v2
----------------

Bug fixes:

 * Got rid of dynamic allocation of the shadow vector length map during
   secondary boot.  Secondary CPU boot takes place in atomic context,
   and relying on GFP_ATOMIC here doesn't seem justified.

   Instead, the needed additional bitmap is allocated statically.
   Only one shadow map is needed, because CPUs don't boot concurrently.

Requested by Alex Bennée:

 * Reflowed untidy comment above read_zcr_features()

 * Added comments to read_zcr_features() to explain what it's trying to do
   (which is otherwise not readily apparent).

Requested by Catalin Marinas:

 * Moved disabling of the EL1 SVE trap to the cpufeatures C code.
   This allows addition of new assembler in __cpu_setup to be
   avoided.

Miscellaneous:

 * Added comments explaining the intent, purpose and basic constraints
   for fpsimd.c helpers.
---
 arch/arm64/include/asm/cpu.h        |   4 ++
 arch/arm64/include/asm/cpufeature.h |  36 ++++++++++++
 arch/arm64/include/asm/fpsimd.h     |  14 +++++
 arch/arm64/kernel/cpufeature.c      |  50 ++++++++++++++++
 arch/arm64/kernel/cpuinfo.c         |   6 ++
 arch/arm64/kernel/fpsimd.c          | 114 +++++++++++++++++++++++++++++++++++-
 6 files changed, 221 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index 889226b..8839227 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -41,6 +41,7 @@ struct cpuinfo_arm64 {
 	u64		reg_id_aa64mmfr2;
 	u64		reg_id_aa64pfr0;
 	u64		reg_id_aa64pfr1;
+	u64		reg_id_aa64zfr0;
 
 	u32		reg_id_dfr0;
 	u32		reg_id_isar0;
@@ -59,6 +60,9 @@ struct cpuinfo_arm64 {
 	u32		reg_mvfr0;
 	u32		reg_mvfr1;
 	u32		reg_mvfr2;
+
+	/* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */
+	u64		reg_zcr;
 };
 
 DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 4ea3441..51be8e8 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -10,7 +10,9 @@
 #define __ASM_CPUFEATURE_H
 
 #include <asm/cpucaps.h>
+#include <asm/fpsimd.h>
 #include <asm/hwcap.h>
+#include <asm/sigcontext.h>
 #include <asm/sysreg.h>
 
 /*
@@ -223,6 +225,13 @@ static inline bool id_aa64pfr0_32bit_el0(u64 pfr0)
 	return val == ID_AA64PFR0_EL0_32BIT_64BIT;
 }
 
+static inline bool id_aa64pfr0_sve(u64 pfr0)
+{
+	u32 val = cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_SVE_SHIFT);
+
+	return val > 0;
+}
+
 void __init setup_cpu_features(void);
 
 void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
@@ -267,6 +276,33 @@ static inline bool system_supports_sve(void)
 	return false;
 }
 
+/*
+ * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE
+ * vector length.
+ *
+ * Use only if SVE is present.
+ * This function clobbers the SVE vector length.
+ */
+static u64 __maybe_unused read_zcr_features(void)
+{
+	u64 zcr;
+	unsigned int vq_max;
+
+	/*
+	 * Set the maximum possible VL, and write zeroes to all other
+	 * bits to see if they stick.
+	 */
+	sve_kernel_enable(NULL);
+	write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);
+
+	zcr = read_sysreg_s(SYS_ZCR_EL1);
+	zcr &= ~(u64)ZCR_ELx_LEN_MASK; /* find sticky 1s outside LEN field */
+	vq_max = sve_vq_from_vl(sve_get_vl());
+	zcr |= vq_max - 1; /* set LEN field to maximum effective value */
+
+	return zcr;
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 7dd3939..bad72fd 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -79,6 +79,7 @@ extern void sve_save_state(void *state, u32 *pfpsr);
 extern void sve_load_state(void const *state, u32 const *pfpsr,
 			   unsigned long vq_minus_1);
 extern unsigned int sve_get_vl(void);
+extern int sve_kernel_enable(void *);
 
 extern int __ro_after_init sve_max_vl;
 
@@ -91,10 +92,23 @@ extern void fpsimd_release_thread(struct task_struct *task);
 extern int sve_set_vector_length(struct task_struct *task,
 				 unsigned long vl, unsigned long flags);
 
+/*
+ * Probing and setup functions.
+ * Calls to these functions must be serialised with one another.
+ */
+extern void __init sve_init_vq_map(void);
+extern void sve_update_vq_map(void);
+extern int sve_verify_vq_map(void);
+extern void __init sve_setup(void);
+
 #else /* ! CONFIG_ARM64_SVE */
 
 static void __maybe_unused sve_alloc(struct task_struct *task) { }
 static void __maybe_unused fpsimd_release_thread(struct task_struct *task) { }
+static void __maybe_unused sve_init_vq_map(void) { }
+static void __maybe_unused sve_update_vq_map(void) { }
+static int __maybe_unused sve_verify_vq_map(void) { return 0; }
+static void __maybe_unused sve_setup(void) { }
 
 #endif /* ! CONFIG_ARM64_SVE */
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 92a9502..c5acf38 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -27,6 +27,7 @@
 #include <asm/cpu.h>
 #include <asm/cpufeature.h>
 #include <asm/cpu_ops.h>
+#include <asm/fpsimd.h>
 #include <asm/mmu_context.h>
 #include <asm/processor.h>
 #include <asm/sysreg.h>
@@ -283,6 +284,12 @@ static const struct arm64_ftr_bits ftr_id_dfr0[] = {
 	ARM64_FTR_END,
 };
 
+static const struct arm64_ftr_bits ftr_zcr[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE,
+		ZCR_ELx_LEN_SHIFT, ZCR_ELx_LEN_SIZE, 0),	/* LEN */
+	ARM64_FTR_END,
+};
+
 /*
  * Common ftr bits for a 32bit register with all hidden, strict
  * attributes, with 4bit feature fields and a default safe value of
@@ -349,6 +356,7 @@ static const struct __ftr_reg_entry {
 	/* Op1 = 0, CRn = 0, CRm = 4 */
 	ARM64_FTR_REG(SYS_ID_AA64PFR0_EL1, ftr_id_aa64pfr0),
 	ARM64_FTR_REG(SYS_ID_AA64PFR1_EL1, ftr_raz),
+	ARM64_FTR_REG(SYS_ID_AA64ZFR0_EL1, ftr_raz),
 
 	/* Op1 = 0, CRn = 0, CRm = 5 */
 	ARM64_FTR_REG(SYS_ID_AA64DFR0_EL1, ftr_id_aa64dfr0),
@@ -363,6 +371,9 @@ static const struct __ftr_reg_entry {
 	ARM64_FTR_REG(SYS_ID_AA64MMFR1_EL1, ftr_id_aa64mmfr1),
 	ARM64_FTR_REG(SYS_ID_AA64MMFR2_EL1, ftr_id_aa64mmfr2),
 
+	/* Op1 = 0, CRn = 1, CRm = 2 */
+	ARM64_FTR_REG(SYS_ZCR_EL1, ftr_zcr),
+
 	/* Op1 = 3, CRn = 0, CRm = 0 */
 	{ SYS_CTR_EL0, &arm64_ftr_reg_ctrel0 },
 	ARM64_FTR_REG(SYS_DCZID_EL0, ftr_dczid),
@@ -500,6 +511,7 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
 	init_cpu_ftr_reg(SYS_ID_AA64MMFR2_EL1, info->reg_id_aa64mmfr2);
 	init_cpu_ftr_reg(SYS_ID_AA64PFR0_EL1, info->reg_id_aa64pfr0);
 	init_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1);
+	init_cpu_ftr_reg(SYS_ID_AA64ZFR0_EL1, info->reg_id_aa64zfr0);
 
 	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) {
 		init_cpu_ftr_reg(SYS_ID_DFR0_EL1, info->reg_id_dfr0);
@@ -520,6 +532,10 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
 		init_cpu_ftr_reg(SYS_MVFR2_EL1, info->reg_mvfr2);
 	}
 
+	if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) {
+		init_cpu_ftr_reg(SYS_ZCR_EL1, info->reg_zcr);
+		sve_init_vq_map();
+	}
 }
 
 static void update_cpu_ftr_reg(struct arm64_ftr_reg *reg, u64 new)
@@ -623,6 +639,9 @@ void update_cpu_features(int cpu,
 	taint |= check_update_ftr_reg(SYS_ID_AA64PFR1_EL1, cpu,
 				      info->reg_id_aa64pfr1, boot->reg_id_aa64pfr1);
 
+	taint |= check_update_ftr_reg(SYS_ID_AA64ZFR0_EL1, cpu,
+				      info->reg_id_aa64zfr0, boot->reg_id_aa64zfr0);
+
 	/*
 	 * If we have AArch32, we care about 32-bit features for compat.
 	 * If the system doesn't support AArch32, don't update them.
@@ -670,6 +689,14 @@ void update_cpu_features(int cpu,
 					info->reg_mvfr2, boot->reg_mvfr2);
 	}
 
+	if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) {
+		taint |= check_update_ftr_reg(SYS_ZCR_EL1, cpu,
+					info->reg_zcr, boot->reg_zcr);
+
+		if (!sys_caps_initialised)
+			sve_update_vq_map();
+	}
+
 	/*
 	 * Mismatched CPU features are a recipe for disaster. Don't even
 	 * pretend to support them.
@@ -1097,6 +1124,23 @@ verify_local_cpu_features(const struct arm64_cpu_capabilities *caps)
 	}
 }
 
+static void verify_sve_features(void)
+{
+	u64 safe_zcr = read_sanitised_ftr_reg(SYS_ZCR_EL1);
+	u64 zcr = read_zcr_features();
+
+	unsigned int safe_len = safe_zcr & ZCR_ELx_LEN_MASK;
+	unsigned int len = zcr & ZCR_ELx_LEN_MASK;
+
+	if (len < safe_len || sve_verify_vq_map()) {
+		pr_crit("CPU%d: SVE: required vector length(s) missing\n",
+			smp_processor_id());
+		cpu_die_early();
+	}
+
+	/* Add checks on other ZCR bits here if necessary */
+}
+
 /*
  * Run through the enabled system capabilities and enable() it on this CPU.
* The capabilities were decided based on the available CPUs at the boot time. @@ -1110,8 +1154,12 @@ static void verify_local_cpu_capabilities(void) verify_local_cpu_errata_workarounds(); verify_local_cpu_features(arm64_features); verify_local_elf_hwcaps(arm64_elf_hwcaps); + if (system_supports_32bit_el0()) verify_local_elf_hwcaps(compat_elf_hwcaps); + + if (system_supports_sve()) + verify_sve_features(); } void check_local_cpu_capabilities(void) @@ -1189,6 +1237,8 @@ void __init setup_cpu_features(void) if (system_supports_32bit_el0()) setup_elf_hwcaps(compat_elf_hwcaps); + sve_setup(); + /* Advertise that we have computed the system capabilities */ set_sys_caps_initialised(); diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c index 3118859..be260e8 100644 --- a/arch/arm64/kernel/cpuinfo.c +++ b/arch/arm64/kernel/cpuinfo.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include @@ -326,6 +327,7 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info) info->reg_id_aa64mmfr2 = read_cpuid(ID_AA64MMFR2_EL1); info->reg_id_aa64pfr0 = read_cpuid(ID_AA64PFR0_EL1); info->reg_id_aa64pfr1 = read_cpuid(ID_AA64PFR1_EL1); + info->reg_id_aa64zfr0 = read_cpuid(ID_AA64ZFR0_EL1); /* Update the 32bit ID registers only if AArch32 is implemented */ if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) { @@ -348,6 +350,10 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info) info->reg_mvfr2 = read_cpuid(MVFR2_EL1); } + if (IS_ENABLED(CONFIG_ARM64_SVE) && + id_aa64pfr0_sve(info->reg_id_aa64pfr0)) + info->reg_zcr = read_zcr_features(); + cpuinfo_detect_icache_policy(info); } diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 324c112..5673f50 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -113,19 +113,19 @@ static DEFINE_PER_CPU(struct fpsimd_state *, fpsimd_last_state); /* Default VL for tasks that don't set it explicitly: */ -static int sve_default_vl = SVE_VL_MIN; +static int 
sve_default_vl = -1; #ifdef CONFIG_ARM64_SVE /* Maximum supported vector length across all CPUs (initially poisoned) */ int __ro_after_init sve_max_vl = -1; /* Set of available vector lengths, as vq_to_bit(vq): */ -static DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX); +static __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX); #else /* ! CONFIG_ARM64_SVE */ /* Dummy declaration for code that will be optimised out: */ -extern DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX); +extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX); #endif /* ! CONFIG_ARM64_SVE */ @@ -506,6 +506,111 @@ int sve_set_vector_length(struct task_struct *task, return 0; } +/* + * Bitmap for temporary storage of the per-CPU set of supported vector lengths + * during secondary boot. + */ +static DECLARE_BITMAP(sve_secondary_vq_map, SVE_VQ_MAX); + +static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX)) +{ + unsigned int vq, vl; + unsigned long zcr; + + bitmap_zero(map, SVE_VQ_MAX); + + zcr = ZCR_ELx_LEN_MASK; + zcr = read_sysreg_s(SYS_ZCR_EL1) & ~zcr; + + for (vq = SVE_VQ_MAX; vq >= SVE_VQ_MIN; --vq) { + write_sysreg_s(zcr | (vq - 1), SYS_ZCR_EL1); /* self-syncing */ + vl = sve_get_vl(); + vq = sve_vq_from_vl(vl); /* skip intervening lengths */ + set_bit(vq_to_bit(vq), map); + } +} + +void __init sve_init_vq_map(void) +{ + sve_probe_vqs(sve_vq_map); +} + +/* + * If we haven't committed to the set of supported VQs yet, filter out + * those not supported by the current CPU. 
+ */ +void sve_update_vq_map(void) +{ + sve_probe_vqs(sve_secondary_vq_map); + bitmap_and(sve_vq_map, sve_vq_map, sve_secondary_vq_map, SVE_VQ_MAX); +} + +/* Check whether the current CPU supports all VQs in the committed set */ +int sve_verify_vq_map(void) +{ + int ret = 0; + + sve_probe_vqs(sve_secondary_vq_map); + bitmap_andnot(sve_secondary_vq_map, sve_vq_map, sve_secondary_vq_map, + SVE_VQ_MAX); + if (!bitmap_empty(sve_secondary_vq_map, SVE_VQ_MAX)) { + pr_warn("SVE: cpu%d: Required vector length(s) missing\n", + smp_processor_id()); + ret = -EINVAL; + } + + return ret; +} + +/* + * Enable SVE for EL1. + * Intended for use by the cpufeatures code during CPU boot. + */ +int sve_kernel_enable(void *__always_unused p) +{ + write_sysreg(read_sysreg(CPACR_EL1) | CPACR_EL1_ZEN_EL1EN, CPACR_EL1); + isb(); + + return 0; +} + +void __init sve_setup(void) +{ + u64 zcr; + + if (!system_supports_sve()) + return; + + /* + * The SVE architecture mandates support for 128-bit vectors, + * so sve_vq_map must have at least SVE_VQ_MIN set. + * If something went wrong, at least try to patch it up: + */ + if (WARN_ON(!test_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map))) + set_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map); + + zcr = read_sanitised_ftr_reg(SYS_ZCR_EL1); + sve_max_vl = sve_vl_from_vq((zcr & ZCR_ELx_LEN_MASK) + 1); + + /* + * Sanity-check that the max VL we determined through CPU features + * corresponds properly to sve_vq_map. If not, do our best: + */ + if (WARN_ON(sve_max_vl != find_supported_vector_length(sve_max_vl))) + sve_max_vl = find_supported_vector_length(sve_max_vl); + + /* + * For the default VL, pick the maximum supported value <= 64. + * VL == 64 is guaranteed not to grow the signal frame. 
+ */ + sve_default_vl = find_supported_vector_length(64); + + pr_info("SVE: maximum available vector length %u bytes per vector\n", + sve_max_vl); + pr_info("SVE: default vector length %u bytes per vector\n", + sve_default_vl); +} + void fpsimd_release_thread(struct task_struct *dead_task) { sve_free(dead_task); @@ -637,6 +742,9 @@ void fpsimd_flush_thread(void) * This is where we ensure that all user tasks have a valid * vector length configured: no kernel task can become a user * task without an exec and hence a call to this function. + * By the time the first call to this function is made, all + * early hardware probing is complete, so sve_default_vl + * should be valid. * If a bug causes this to go wrong, we make some noise and * try to fudge thread.sve_vl to a safe value here. */ -- 2.1.4
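
Illustrative note, not part of the patch: the probing scheme used by sve_probe_vqs(), sve_update_vq_map() and sve_verify_vq_map() can be modelled in plain userspace C. The sketch below is only a model under stated assumptions. The names vq_set, vq_bit(), effective_vq(), probe_vqs(), update_vq_map() and verify_vq_map() are hypothetical, VQ_MAX is an arbitrary small limit rather than the kernel's SVE_VQ_MAX, and effective_vq() stands in for "write ZCR_EL1.LEN, then read back the vector length" by assuming the hardware grants the largest implemented length that does not exceed the request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VQ_MIN 1u
#define VQ_MAX 16u	/* illustrative limit, not the kernel's SVE_VQ_MAX */

/* Hypothetical model of a CPU: bit (vq - 1) is set if vq is implemented. */
typedef uint32_t vq_set;

static vq_set vq_bit(unsigned int vq)
{
	assert(vq >= VQ_MIN && vq <= VQ_MAX);
	return (vq_set)1 << (vq - 1);
}

/*
 * Stand-in for programming the LEN field and reading back the vector
 * length: assume the largest implemented vq <= requested is granted.
 * VQ_MIN is treated as always implemented (128-bit vectors are mandatory).
 */
static unsigned int effective_vq(vq_set hw, unsigned int requested)
{
	unsigned int vq;

	for (vq = requested; vq > VQ_MIN; --vq)
		if (hw & vq_bit(vq))
			break;
	return vq;
}

/* Walk down from VQ_MAX, jumping straight to each granted length. */
static vq_set probe_vqs(vq_set hw)
{
	vq_set map = 0;
	unsigned int vq = VQ_MAX;

	while (vq >= VQ_MIN) {
		vq = effective_vq(hw, vq);	/* skip intervening lengths */
		map |= vq_bit(vq);
		--vq;	/* next request is just below what we got */
	}
	return map;
}

/* Early CPUs narrow the system-wide set (bitmap_and() in the patch). */
static vq_set update_vq_map(vq_set system, vq_set hw)
{
	return system & probe_vqs(hw);
}

/* Late CPUs must cover the committed set (bitmap_andnot() + bitmap_empty()). */
static bool verify_vq_map(vq_set system, vq_set hw)
{
	return (system & ~probe_vqs(hw)) == 0;
}
```

In this model a boot CPU implementing vq 1, 2 and 4 followed by a secondary implementing vq 1, 2 and 8 leaves a committed set of vq 1 and 2. A late CPU that lacks vq 2 then fails verify_vq_map(), which corresponds to the case where verify_sve_features() calls cpu_die_early().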