From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from foss.arm.com ([217.140.101.70]:56928 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752466AbdJ0Kvv (ORCPT ); Fri, 27 Oct 2017 06:51:51 -0400
From: Dave Martin
Subject: [PATCH v4 16/28] arm64/sve: Probe SVE capabilities and usable vector lengths
Date: Fri, 27 Oct 2017 11:50:58 +0100
Message-ID: <1509101470-7881-17-git-send-email-Dave.Martin@arm.com>
In-Reply-To: <1509101470-7881-1-git-send-email-Dave.Martin@arm.com>
References: <1509101470-7881-1-git-send-email-Dave.Martin@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: linux-arm-kernel@lists.infradead.org
Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, =?UTF-8?q?Alex=20Benn=C3=A9e?=, Szabolcs Nagy, Okamoto Takayuki,
 kvmarm@lists.cs.columbia.edu, libc-alpha@sourceware.org, linux-arch@vger.kernel.org
Message-ID: <20171027105058.GDGX8sS2mBEotciFtfjIOCP4GTZesX02j3ZY21znrmc@z>

This patch uses the cpufeatures framework to determine common SVE
capabilities and vector lengths, and configures the runtime SVE
support code appropriately.

ZCR_ELx is not really a feature register, but it is convenient to
use it as a template for recording the maximum vector length
supported by a CPU, using the LEN field.  This field is similar to
a feature field in that it is a contiguous bitfield for which we
want to determine the minimum system-wide value.  This patch adds
ZCR as a pseudo-register in cpuinfo/cpufeatures, with appropriate
custom code to populate it.  Finding the minimum supported value of
the LEN field is left to the cpufeatures framework in the usual
way.

The meaning of ID_AA64ZFR0_EL1 is not architecturally defined yet,
so for now we just require it to be zero.

Note that much of this code is dormant and SVE still won't be used
yet, since system_supports_sve() remains hardwired to false.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>

---

**Dropped** Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
**Dropped at v3** Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

The change requested by Suzuki (see below) is not quite trivial,
though he was happy for me to apply his Reviewed-by once the change
was made.

Changes since v3
----------------

Requested by Catalin Marinas:

 * Replace __maybe_unused functions with static inlines.

Requested by Suzuki Poulose:

 * Don't bother to probe for supported vector lengths if we already
   decided SVE is not supported.
---
 arch/arm64/include/asm/cpu.h        |   4 ++
 arch/arm64/include/asm/cpufeature.h |  36 ++++++++++++
 arch/arm64/include/asm/fpsimd.h     |  14 +++++
 arch/arm64/kernel/cpufeature.c      |  52 ++++++++++++++++
 arch/arm64/kernel/cpuinfo.c         |   6 ++
 arch/arm64/kernel/fpsimd.c          | 114 +++++++++++++++++++++++++++++++++++-
 6 files changed, 223 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index 889226b..8839227 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -41,6 +41,7 @@ struct cpuinfo_arm64 {
 	u64		reg_id_aa64mmfr2;
 	u64		reg_id_aa64pfr0;
 	u64		reg_id_aa64pfr1;
+	u64		reg_id_aa64zfr0;
 
 	u32		reg_id_dfr0;
 	u32		reg_id_isar0;
@@ -59,6 +60,9 @@ struct cpuinfo_arm64 {
 	u32		reg_mvfr0;
 	u32		reg_mvfr1;
 	u32		reg_mvfr2;
+
+	/* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */
+	u64		reg_zcr;
 };
 
 DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 4ea3441..9b27e8c 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -10,7 +10,9 @@
 #define __ASM_CPUFEATURE_H
 
 #include <asm/cpucaps.h>
+#include <asm/fpsimd.h>
 #include <asm/hwcap.h>
+#include <asm/sigcontext.h>
 #include <asm/sysreg.h>
 
 /*
@@ -223,6 +225,13 @@ static inline bool id_aa64pfr0_32bit_el0(u64 pfr0)
 	return val == ID_AA64PFR0_EL0_32BIT_64BIT;
 }
 
+static inline bool id_aa64pfr0_sve(u64 pfr0)
+{
+	u32 val = cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_SVE_SHIFT);
+
+	return val > 0;
+}
+
 void __init setup_cpu_features(void);
 
 void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
@@ -267,6 +276,33 @@ static inline bool system_supports_sve(void)
 	return false;
 }
 
+/*
+ * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE
+ * vector length.
+ *
+ * Use only if SVE is present.
+ * This function clobbers the SVE vector length.
+ */
+static inline u64 read_zcr_features(void)
+{
+	u64 zcr;
+	unsigned int vq_max;
+
+	/*
+	 * Set the maximum possible VL, and write zeroes to all other
+	 * bits to see if they stick.
+	 */
+	sve_kernel_enable(NULL);
+	write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);
+
+	zcr = read_sysreg_s(SYS_ZCR_EL1);
+	zcr &= ~(u64)ZCR_ELx_LEN_MASK; /* find sticky 1s outside LEN field */
+	vq_max = sve_vq_from_vl(sve_get_vl());
+	zcr |= vq_max - 1; /* set LEN field to maximum effective value */
+
+	return zcr;
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 86f550c..d8e0dc9 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -78,6 +78,7 @@ extern void sve_save_state(void *state, u32 *pfpsr);
 extern void sve_load_state(void const *state, u32 const *pfpsr,
 			   unsigned long vq_minus_1);
 extern unsigned int sve_get_vl(void);
+extern int sve_kernel_enable(void *);
 
 extern int __ro_after_init sve_max_vl;
 
@@ -90,10 +91,23 @@ extern void fpsimd_release_task(struct task_struct *task);
 extern int sve_set_vector_length(struct task_struct *task,
 				 unsigned long vl, unsigned long flags);
 
+/*
+ * Probing and setup functions.
+ * Calls to these functions must be serialised with one another.
+ */
+extern void __init sve_init_vq_map(void);
+extern void sve_update_vq_map(void);
+extern int sve_verify_vq_map(void);
+extern void __init sve_setup(void);
+
 #else /* ! CONFIG_ARM64_SVE */
 
 static inline void sve_alloc(struct task_struct *task) { }
 static inline void fpsimd_release_task(struct task_struct *task) { }
+static inline void sve_init_vq_map(void) { }
+static inline void sve_update_vq_map(void) { }
+static inline int sve_verify_vq_map(void) { return 0; }
+static inline void sve_setup(void) { }
 
 #endif /* ! CONFIG_ARM64_SVE */
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index e226799..2154373 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -27,6 +27,7 @@
 #include <asm/cpu.h>
 #include <asm/cpufeature.h>
 #include <asm/cpu_ops.h>
+#include <asm/fpsimd.h>
 #include <asm/mmu_context.h>
 #include <asm/processor.h>
 #include <asm/sysreg.h>
@@ -287,6 +288,12 @@ static const struct arm64_ftr_bits ftr_id_dfr0[] = {
 	ARM64_FTR_END,
 };
 
+static const struct arm64_ftr_bits ftr_zcr[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE,
+		ZCR_ELx_LEN_SHIFT, ZCR_ELx_LEN_SIZE, 0),	/* LEN */
+	ARM64_FTR_END,
+};
+
 /*
  * Common ftr bits for a 32bit register with all hidden, strict
  * attributes, with 4bit feature fields and a default safe value of
@@ -353,6 +360,7 @@ static const struct __ftr_reg_entry {
 	/* Op1 = 0, CRn = 0, CRm = 4 */
 	ARM64_FTR_REG(SYS_ID_AA64PFR0_EL1, ftr_id_aa64pfr0),
 	ARM64_FTR_REG(SYS_ID_AA64PFR1_EL1, ftr_raz),
+	ARM64_FTR_REG(SYS_ID_AA64ZFR0_EL1, ftr_raz),
 
 	/* Op1 = 0, CRn = 0, CRm = 5 */
 	ARM64_FTR_REG(SYS_ID_AA64DFR0_EL1, ftr_id_aa64dfr0),
@@ -367,6 +375,9 @@ static const struct __ftr_reg_entry {
 	ARM64_FTR_REG(SYS_ID_AA64MMFR1_EL1, ftr_id_aa64mmfr1),
 	ARM64_FTR_REG(SYS_ID_AA64MMFR2_EL1, ftr_id_aa64mmfr2),
 
+	/* Op1 = 0, CRn = 1, CRm = 2 */
+	ARM64_FTR_REG(SYS_ZCR_EL1, ftr_zcr),
+
 	/* Op1 = 3, CRn = 0, CRm = 0 */
 	{ SYS_CTR_EL0, &arm64_ftr_reg_ctrel0 },
 	ARM64_FTR_REG(SYS_DCZID_EL0, ftr_dczid),
@@ -504,6 +515,7 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
 	init_cpu_ftr_reg(SYS_ID_AA64MMFR2_EL1, info->reg_id_aa64mmfr2);
 	init_cpu_ftr_reg(SYS_ID_AA64PFR0_EL1, info->reg_id_aa64pfr0);
 	init_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1);
+	init_cpu_ftr_reg(SYS_ID_AA64ZFR0_EL1, info->reg_id_aa64zfr0);
 
 	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) {
 		init_cpu_ftr_reg(SYS_ID_DFR0_EL1, info->reg_id_dfr0);
@@ -524,6 +536,10 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
 		init_cpu_ftr_reg(SYS_MVFR2_EL1, info->reg_mvfr2);
 	}
 
+	if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) {
+		init_cpu_ftr_reg(SYS_ZCR_EL1, info->reg_zcr);
+		sve_init_vq_map();
+	}
 }
 
 static void update_cpu_ftr_reg(struct arm64_ftr_reg *reg, u64 new)
@@ -627,6 +643,9 @@ void update_cpu_features(int cpu,
 	taint |= check_update_ftr_reg(SYS_ID_AA64PFR1_EL1, cpu,
 				      info->reg_id_aa64pfr1, boot->reg_id_aa64pfr1);
 
+	taint |= check_update_ftr_reg(SYS_ID_AA64ZFR0_EL1, cpu,
+				      info->reg_id_aa64zfr0, boot->reg_id_aa64zfr0);
+
 	/*
 	 * If we have AArch32, we care about 32-bit features for compat.
 	 * If the system doesn't support AArch32, don't update them.
@@ -674,6 +693,16 @@ void update_cpu_features(int cpu,
 					info->reg_mvfr2, boot->reg_mvfr2);
 	}
 
+	if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) {
+		taint |= check_update_ftr_reg(SYS_ZCR_EL1, cpu,
+					info->reg_zcr, boot->reg_zcr);
+
+		/* Probe vector lengths, unless we already gave up on SVE */
+		if (id_aa64pfr0_sve(read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1)) &&
+		    !sys_caps_initialised)
+			sve_update_vq_map();
+	}
+
 	/*
 	 * Mismatched CPU features are a recipe for disaster. Don't even
 	 * pretend to support them.
@@ -1106,6 +1135,23 @@ verify_local_cpu_features(const struct arm64_cpu_capabilities *caps)
 	}
 }
 
+static void verify_sve_features(void)
+{
+	u64 safe_zcr = read_sanitised_ftr_reg(SYS_ZCR_EL1);
+	u64 zcr = read_zcr_features();
+
+	unsigned int safe_len = safe_zcr & ZCR_ELx_LEN_MASK;
+	unsigned int len = zcr & ZCR_ELx_LEN_MASK;
+
+	if (len < safe_len || sve_verify_vq_map()) {
+		pr_crit("CPU%d: SVE: required vector length(s) missing\n",
+			smp_processor_id());
+		cpu_die_early();
+	}
+
+	/* Add checks on other ZCR bits here if necessary */
+}
+
 /*
  * Run through the enabled system capabilities and enable() it on this CPU.
  * The capabilities were decided based on the available CPUs at the boot time.
@@ -1119,8 +1165,12 @@ static void verify_local_cpu_capabilities(void) verify_local_cpu_errata_workarounds(); verify_local_cpu_features(arm64_features); verify_local_elf_hwcaps(arm64_elf_hwcaps); + if (system_supports_32bit_el0()) verify_local_elf_hwcaps(compat_elf_hwcaps); + + if (system_supports_sve()) + verify_sve_features(); } void check_local_cpu_capabilities(void) @@ -1198,6 +1248,8 @@ void __init setup_cpu_features(void) if (system_supports_32bit_el0()) setup_elf_hwcaps(compat_elf_hwcaps); + sve_setup(); + /* Advertise that we have computed the system capabilities */ set_sys_caps_initialised(); diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c index 1ff1c5a..58da504 100644 --- a/arch/arm64/kernel/cpuinfo.c +++ b/arch/arm64/kernel/cpuinfo.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include @@ -331,6 +332,7 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info) info->reg_id_aa64mmfr2 = read_cpuid(ID_AA64MMFR2_EL1); info->reg_id_aa64pfr0 = read_cpuid(ID_AA64PFR0_EL1); info->reg_id_aa64pfr1 = read_cpuid(ID_AA64PFR1_EL1); + info->reg_id_aa64zfr0 = read_cpuid(ID_AA64ZFR0_EL1); /* Update the 32bit ID registers only if AArch32 is implemented */ if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) { @@ -353,6 +355,10 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info) info->reg_mvfr2 = read_cpuid(MVFR2_EL1); } + if (IS_ENABLED(CONFIG_ARM64_SVE) && + id_aa64pfr0_sve(info->reg_id_aa64pfr0)) + info->reg_zcr = read_zcr_features(); + cpuinfo_detect_icache_policy(info); } diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 476c637..703e9d7 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -113,19 +113,19 @@ static DEFINE_PER_CPU(struct fpsimd_state *, fpsimd_last_state); /* Default VL for tasks that don't set it explicitly: */ -static int sve_default_vl = SVE_VL_MIN; +static int sve_default_vl = -1; #ifdef CONFIG_ARM64_SVE /* Maximum supported vector length across 
all CPUs (initially poisoned) */ int __ro_after_init sve_max_vl = -1; /* Set of available vector lengths, as vq_to_bit(vq): */ -static DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX); +static __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX); #else /* ! CONFIG_ARM64_SVE */ /* Dummy declaration for code that will be optimised out: */ -extern DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX); +extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX); #endif /* ! CONFIG_ARM64_SVE */ @@ -495,6 +495,111 @@ int sve_set_vector_length(struct task_struct *task, } /* + * Bitmap for temporary storage of the per-CPU set of supported vector lengths + * during secondary boot. + */ +static DECLARE_BITMAP(sve_secondary_vq_map, SVE_VQ_MAX); + +static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX)) +{ + unsigned int vq, vl; + unsigned long zcr; + + bitmap_zero(map, SVE_VQ_MAX); + + zcr = ZCR_ELx_LEN_MASK; + zcr = read_sysreg_s(SYS_ZCR_EL1) & ~zcr; + + for (vq = SVE_VQ_MAX; vq >= SVE_VQ_MIN; --vq) { + write_sysreg_s(zcr | (vq - 1), SYS_ZCR_EL1); /* self-syncing */ + vl = sve_get_vl(); + vq = sve_vq_from_vl(vl); /* skip intervening lengths */ + set_bit(vq_to_bit(vq), map); + } +} + +void __init sve_init_vq_map(void) +{ + sve_probe_vqs(sve_vq_map); +} + +/* + * If we haven't committed to the set of supported VQs yet, filter out + * those not supported by the current CPU. + */ +void sve_update_vq_map(void) +{ + sve_probe_vqs(sve_secondary_vq_map); + bitmap_and(sve_vq_map, sve_vq_map, sve_secondary_vq_map, SVE_VQ_MAX); +} + +/* Check whether the current CPU supports all VQs in the committed set */ +int sve_verify_vq_map(void) +{ + int ret = 0; + + sve_probe_vqs(sve_secondary_vq_map); + bitmap_andnot(sve_secondary_vq_map, sve_vq_map, sve_secondary_vq_map, + SVE_VQ_MAX); + if (!bitmap_empty(sve_secondary_vq_map, SVE_VQ_MAX)) { + pr_warn("SVE: cpu%d: Required vector length(s) missing\n", + smp_processor_id()); + ret = -EINVAL; + } + + return ret; +} + +/* + * Enable SVE for EL1. 
+ * Intended for use by the cpufeatures code during CPU boot. + */ +int sve_kernel_enable(void *__always_unused p) +{ + write_sysreg(read_sysreg(CPACR_EL1) | CPACR_EL1_ZEN_EL1EN, CPACR_EL1); + isb(); + + return 0; +} + +void __init sve_setup(void) +{ + u64 zcr; + + if (!system_supports_sve()) + return; + + /* + * The SVE architecture mandates support for 128-bit vectors, + * so sve_vq_map must have at least SVE_VQ_MIN set. + * If something went wrong, at least try to patch it up: + */ + if (WARN_ON(!test_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map))) + set_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map); + + zcr = read_sanitised_ftr_reg(SYS_ZCR_EL1); + sve_max_vl = sve_vl_from_vq((zcr & ZCR_ELx_LEN_MASK) + 1); + + /* + * Sanity-check that the max VL we determined through CPU features + * corresponds properly to sve_vq_map. If not, do our best: + */ + if (WARN_ON(sve_max_vl != find_supported_vector_length(sve_max_vl))) + sve_max_vl = find_supported_vector_length(sve_max_vl); + + /* + * For the default VL, pick the maximum supported value <= 64. + * VL == 64 is guaranteed not to grow the signal frame. + */ + sve_default_vl = find_supported_vector_length(64); + + pr_info("SVE: maximum available vector length %u bytes per vector\n", + sve_max_vl); + pr_info("SVE: default vector length %u bytes per vector\n", + sve_default_vl); +} + +/* * Called from the put_task_struct() path, which cannot get here * unless dead_task is really dead and not schedulable. */ @@ -629,6 +734,9 @@ void fpsimd_flush_thread(void) * This is where we ensure that all user tasks have a valid * vector length configured: no kernel task can become a user * task without an exec and hence a call to this function. + * By the time the first call to this function is made, all + * early hardware probing is complete, so sve_default_vl + * should be valid. * If a bug causes this to go wrong, we make some noise and * try to fudge thread.sve_vl to a safe value here. */ -- 2.1.4