From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 22 Sep 2016 16:42:52 -0300
From: Arnaldo Carvalho de Melo
To: Paul Clarke
Cc: Vineet Gupta, Peter Zijlstra, Alexey Brodkin, Will Deacon,
	"linux-kernel@vger.kernel.org", "linux-perf-users@vger.kernel.org",
	"linux-snps-arc@lists.infradead.org", Jiri Olsa
Subject: Re: perf event grouping for dummies (was Re: [PATCH] arc: perf:
	Enable generic "cache-references" and "cache-misses" events)
Message-ID: <20160922194252.GA2441@redhat.com>
References: <1472125647-518-1-git-send-email-abrodkin@synopsys.com>
	<6074e252-6e18-bb01-4de1-023bd7e82f03@synopsys.com>
	<5f65fa04-8d33-e525-115d-4e6991a7668e@synopsys.com>
	<20160901083324.GM10153@twins.programming.kicks-ass.net>
	<2a18ae06-3abd-c3a1-e980-f04c511b08e5@synopsys.com>
	<04f6dcd2-35c6-6e28-2dcf-bc5f0bb446dc@us.ibm.com>
	<20160922075603.GW5008@twins.programming.kicks-ass.net>
	<4351119f-b212-5039-9a3d-f568f6893b36@us.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <4351119f-b212-5039-9a3d-f568f6893b36@us.ibm.com>
X-Url: http://acmel.wordpress.com
User-Agent: Mutt/1.5.20 (2009-12-10)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Thu, 22 Sep 2016 19:43:07 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 22, 2016 at 01:23:04PM -0500, Paul Clarke wrote:
> On 09/22/2016 12:50 PM, Vineet Gupta wrote:
> >On 09/22/2016 12:56 AM, Peter Zijlstra wrote:
> >>On Wed, Sep 21, 2016 at 07:43:28PM -0500, Paul Clarke wrote:
> >>>On 09/20/2016 03:56 PM, Vineet Gupta wrote:
> >>>>On 09/01/2016 01:33 AM, Peter Zijlstra wrote:
> >>>>>>- is that what perf event grouping is ?
> >>>>>
> >>>>>Again, nope. Perf event groups are single counter (so no implicit
> >>>>>addition) that are co-scheduled on the PMU.
> >>>>
> >>>>I'm not sure I understand - does this require specific PMU/arch support - as in
> >>>>multiple conditions feeding to same counter.
> >>>
> >>>My read is that is that what Peter meant was that each event in the
> >>>perf event group is a single counter, so all the events in the group
> >>>are counted simultaneously.  (No multiplexing.)
> >>
> >>Right, sorry for the poor wording.
> >>
> >>>>Again when you say co-scheduled what do you mean - why would anyone use the event
> >>>>grouping - is it when they only have 1 counter and they want to count 2
> >>>>conditions/events at the same time - isn't this same as event multiplexing ?
> >>>
> >>>I'd say it's the converse of multiplexing.  Instead of mapping
> >>>multiple events to a single counter, perf event groups map a set of
> >>>events each to their own counter, and they are active simultaneously.
> >>>I suppose it's possible for the _groups_ to be multiplexed with other
> >>>events or groups, but the group as a whole will be scheduled together,
> >>>as a group.
> >>
> >>Correct.
> >>
> >>Each events get their own hardware counter. Grouped events are
> >>co-scheduled on the hardware.
> >
> >And if we don't group them, then they _may_ not be co-scheduled (active/counting
> >at the same time) ? But how can this be possible.
> >Say we have 2 counters, both the cmds below
> >
> >     perf -e cycles,instructions hackbench
> >     perf -e {cycles,instructions} hackbench
> >
> >would assign 2 counters to the 2 conditions which keep counting until perf asks
> >them to stop (because the profiled application ended)
> >
> >I don't understand the "scheduling" of counter - once we set them to count, there
> >is no real intervention/scheduling form software in terms of disabling/enabling
> >(assuming no multiplexing etc)

So, taking this machine as an example:

[    0.067739] smpboot: CPU0: Intel(R) Core(TM) i7-3667U CPU @ 2.00GHz (family: 0x6, model: 0x3a, stepping: 0x9)
[    0.067744] Performance Events: PEBS fmt1+, 16-deep LBR, IvyBridge events, full-width counters, Intel PMU driver.
[    0.067774] ... version:                3
[    0.067776] ... bit width:              48
[    0.067777] ... generic registers:      4
[    0.067778] ... value mask:             0000ffffffffffff
[    0.067779] ... max period:             0000ffffffffffff
[    0.067780] ... fixed-purpose events:   3
[    0.067781] ... event mask:             000000070000000f
[    0.068694] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
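
As an aside on the boot log above, the "event mask" value can be read as a bitmap of the available counters. The following is only a minimal sketch, assuming the usual layout of the Intel PMU driver in which generic counters occupy the low bits and fixed-purpose counters start at bit 32; the helper name split_event_mask and the constant FIXED_BASE_BIT are made up for illustration and are not part of perf.

```python
# Sketch: split the "event mask" from the boot log into the number of
# generic and fixed-purpose counters.
# Assumption: generic counters start at bit 0, fixed counters at bit 32.
FIXED_BASE_BIT = 32  # hypothetical name for the first fixed-counter bit


def split_event_mask(mask):
    """Return (generic_count, fixed_count) for a counter bitmap."""
    generic_bits = mask & ((1 << FIXED_BASE_BIT) - 1)  # low 32 bits
    fixed_bits = mask >> FIXED_BASE_BIT                # bits 32 and up
    return bin(generic_bits).count("1"), bin(fixed_bits).count("1")


generic, fixed = split_event_mask(0x000000070000000F)
print(generic, fixed)  # 4 generic registers, 3 fixed-purpose events
```

Under that assumption the mask agrees with the "generic registers: 4" and "fixed-purpose events: 3" lines of the same log.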

[root@zoo ~]# perf stat -e '{branch-instructions,branch-misses,bus-cycles,cache-misses}' ls a
ls: cannot access 'a': No such file or directory

 Performance counter stats for 'ls a':

           356,090      branch-instructions
            17,170      branch-misses             #    4.82% of all branches
           232,365      bus-cycles
            12,107      cache-misses

       0.003624967 seconds time elapsed

[root@zoo ~]# perf stat -e '{branch-instructions,branch-misses,bus-cycles,cache-misses,cpu-cycles}' ls a
ls: cannot access 'a': No such file or directory

 Performance counter stats for 'ls a':

     <not counted>      branch-instructions                                           (0.00%)
     <not counted>      branch-misses                                                 (0.00%)
     <not counted>      bus-cycles                                                    (0.00%)
     <not counted>      cache-misses                                                  (0.00%)
     <not counted>      cpu-cycles                                                    (0.00%)

       0.003659678 seconds time elapsed

[root@zoo ~]#

That was run as a group, i.e. with those {} enclosing the events. If you run it
with -vv, among other things you'll see the "group_fd" parameter passed to the
sys_perf_event_open syscall:

[root@zoo ~]# perf stat -vv -e '{branch-instructions,branch-misses,bus-cycles,cache-misses,cpu-cycles}' ls a
sys_perf_event_open: pid 28581  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open: pid 28581  cpu -1  group_fd 3  flags 0x8
sys_perf_event_open: pid 28581  cpu -1  group_fd 3  flags 0x8
sys_perf_event_open: pid 28581  cpu -1  group_fd 3  flags 0x8
sys_perf_event_open: pid 28581  cpu -1  group_fd 3  flags 0x8
ls: cannot access 'a': No such file or directory

 Performance counter stats for 'ls a':

     <not counted>      branch-instructions                                           (0.00%)
     <not counted>      branch-misses                                                 (0.00%)
     <not counted>      bus-cycles                                                    (0.00%)
     <not counted>      cache-misses                                                  (0.00%)
     <not counted>      cpu-cycles                                                    (0.00%)

       0.002883209 seconds time elapsed

[root@zoo ~]#

So the first call passes group_fd = -1 to create the group; the fd it returns,
'3', is then used as group_fd for the other events in that group.

So the workload runs but nothing is counted: the kernel can't do what was
asked, i.e. schedule all those 5 hardware events _at the same time_, and no
multiplexing of counters that can count different hardware events is performed
_for that task_.

If we remove that {}, i.e.
say there is no need to enable all those counters _at the same time_ and that
the kernel may multiplex them _in the same task_ so that all of them get
measured to some degree, it "works":

[root@zoo ~]# perf stat -vv -e 'branch-instructions,branch-misses,bus-cycles,cache-misses,cpu-cycles' ls a
perf_event_attr: (For the first event:)
  config                           0x4
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
sys_perf_event_open: pid 28594  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open: pid 28594  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open: pid 28594  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open: pid 28594  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open: pid 28594  cpu -1  group_fd -1  flags 0x8

 Performance counter stats for 'ls a':

           317,892      branch-instructions                                           (53.01%)
            13,400      branch-misses             #    4.22% of all branches
           201,578      bus-cycles
            11,326      cache-misses
         2,203,482      cpu-cycles                                                    (78.44%)

       0.003026840 seconds time elapsed

[root@zoo ~]#

See the read_format? Those percentages? The group_fd = -1?

It all depends on these PMU resources:

[    0.067777] ... generic registers:      4
[    0.067780] ... fixed-purpose events:   3

It's this part of 'man perf_event_open':

  The group_fd argument allows event groups to be created.  An event group
has one event which is the group leader.  The leader is created first, with
group_fd = -1.  The rest of the group members are created with subsequent
perf_event_open() calls with group_fd being set to the file descriptor of
the group leader.  (A single event on its own is created with group_fd = -1 and
is considered to be a group with only 1 member.)  An event group is scheduled
onto the CPU as a unit: it will be put onto the CPU only if all of the events
in the group can be put onto the CPU.  This means that the values of the
member events can be meaningfully compared--added, divided (to get ratios),
and so on--with each other, since they have counted events for the same set of
executed instructions.

- Arnaldo

> If you assume no multiplexing, then this discussion on grouping is moot.
> It depends on how many events you specify, how many counters there
> are, and which counters can count which events.  If you specify a set
> of events for which every event can be counted simultaneously, they
> will be scheduled simultaneously and continuously.  If you specify
> more events than counters, there's multiplexing.  AND, if you specify

There is multiplexing if group_fd is set to -1 in all events.

> a set of events, some of which cannot be counted simultaneously due to
> hardware limitations, they'll be multiplexed.

Not if group_fd is set to a group leader.

> PC
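
To make the read_format and percentage remarks in the message concrete, here is a minimal sketch of how a count can be extrapolated when an event was only on the PMU for part of the time it was enabled, which is what TOTAL_TIME_ENABLED and TOTAL_TIME_RUNNING make possible. The function name scale_count and the numbers are hypothetical and are not taken from the runs shown in this thread; this illustrates the arithmetic only and is not perf's actual code.

```python
# Sketch: extrapolate a multiplexed counter reading.
# time_enabled: how long the event was enabled (ns)
# time_running: how long it was actually counting on the PMU (ns)
def scale_count(raw, time_enabled, time_running):
    """Return (estimate, pct_running); estimate is None if never scheduled."""
    if time_enabled == 0 or time_running == 0:
        # The event never got onto the PMU, e.g. a group that does not fit:
        # this corresponds to "<not counted>" with (0.00%).
        return None, 0.0
    pct_running = 100.0 * time_running / time_enabled
    estimate = round(raw * time_enabled / time_running)
    return estimate, pct_running


# Hypothetical reading: counted 1000 events while running half of the time.
print(scale_count(1000, 2_000_000, 1_000_000))  # estimate 2000, running 50%
# Hypothetical group that could never be scheduled.
print(scale_count(0, 2_000_000, 0))
```

An event that ran for all of its enabled time needs no scaling, which is consistent with the lines that carry no percentage in the ungrouped run.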