From: Alen Stojanov <astojanov@inf.ethz.ch>
To: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: Some troubles with perf and measuring flops
Date: Wed, 12 Mar 2014 00:53:45 +0100
Message-ID: <531FA209.4030603@inf.ethz.ch>
In-Reply-To: <5318CF72.20504@inf.ethz.ch>
So, just to summarize (since I did not get any reply): is the final
conclusion that I cannot simply obtain proper flop counts with Linux
perf because of hardware limitations?
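
For reference, the counts below were collected roughly along these lines
(a sketch rather than the literal invocation; the r538010 encoding for
FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE, i.e. event 0x10 with umask 0x80, is my
own reading of the Sandy Bridge event tables and may not match other
models):

    # scalar double ops (FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE, event 0x10, umask 0x80)
    perf stat -e r538010 ./mmmtest 600

    # 256-bit packed double AVX ops (SIMD_FP_256:PACKED_DOUBLE, event 0x11, umask 0x02);
    # this one stays at 0 here, since the kernel only issues scalar double instructions
    perf stat -e r530211 ./mmmtest 600
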
On 06/03/14 20:41, Alen Stojanov wrote:
> On 06/03/14 19:25, Vince Weaver wrote:
>> On Thu, 6 Mar 2014, Alen Stojanov wrote:
>>
>>>> more complicated with AVX in the mix. What does the Intel
>>>> documentation say for this event on your architecture?
>>> I agree on this. However, if you look at the .s file, you can see
>>> that it does not contain any AVX instructions.
>> I'm pretty sure vmovsd and vmulsd are AVX instructions.
>
> Yes, you are absolutely right; my statement was wrong. What I really
> meant was that there are no AVX instructions operating on packed
> doubles, since vmovsd and vmulsd operate on scalar doubles. This is
> also why I get zeros whenever I do:
>
> perf stat -e r530211 ./mmmtest 600
>
> Performance counter stats for './mmmtest 600':
>
> 0 r530211
>
> 0.952037328 seconds time elapsed
>
> What I really wanted to point out was that I don't have to combine
> several counters to obtain results, since
> FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE is the only flop event the code
> generates.
>
>>> And if I monitor any other event on the CPU that counts flop
>>> operations, I get 0s. It seems that FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE
>>> is the only one that occurs. I don't think that
>>> FP_COMP_OPS_EXE:SSE_SCALAR_DOUBLE counts speculative events.
>> are you sure?
>>
>> See http://icl.cs.utk.edu/projects/papi/wiki/PAPITopics:SandyFlops
>> about FP events on SNB and IVB at least.
>
> Thank you for the link. I only assumed that there are no speculative
> events because, in a previous project done in my research group, we
> were able to get accurate flop counts using Intel PCM:
> https://github.com/GeorgOfenbeck/perfplot/ (including correct flop
> counts for an MMM of size 1600x1600x1600).
>
> Nevertheless, as far as I understood, the PAPI page discusses count
> deviations that occur when several counters are combined. In the use
> case that I sent you before, I always use one single raw counter to
> obtain the counts, yet the deviations I observe still grow with the
> matrix size. I made a list to show how much the flop counts deviate:
>
> List format:
> (mmm size) (anticipated_flops) (obtained_flops) (anticipated_flops / obtained_flops * 100.0)
> 10 2000 2061 97.040
> 20 16000 16692 95.854
> 30 54000 58097 92.948
> 40 128000 132457 96.635
> 50 250000 257482 97.094
> 60 432000 452624 95.443
> 70 686000 730299 93.934
> 80 1024000 1098453 93.222
> 90 1458000 1573331 92.670
> 100 2000000 2138014 93.545
> 110 2662000 2852239 93.330
> 120 3456000 3626028 95.311
> 130 4394000 4783638 91.855
> 140 5488000 5979236 91.784
> 150 6750000 7349358 91.845
> 160 8192000 11324521 72.339
> 170 9826000 11000354 89.324
> 180 11664000 13191288 88.422
> 190 13718000 16492253 83.178
> 200 16000000 20253599 78.998
> 210 18522000 23839202 77.696
> 220 21296000 27832906 76.514
> 230 24334000 32056213 75.910
> 240 27648000 40026709 69.074
> 250 31250000 41837527 74.694
> 260 35152000 47291908 74.330
> 270 39366000 53534225 73.534
> 280 43904000 60193718 72.938
> 290 48778000 67230702 72.553
> 300 54000000 74451165 72.531
> 310 59582000 82773965 71.982
> 320 65536000 129974914 50.422
> 330 71874000 99894238 71.950
> 340 78608000 108421806 72.502
> 350 85750000 118870753 72.137
> 360 93312000 129058036 72.302
> 370 101306000 141901053 71.392
> 380 109744000 152138340 72.134
> 390 118638000 170393279 69.626
> 400 128000000 225637046 56.728
> 410 137842000 208174503 66.215
> 420 148176000 205434911 72.128
> 430 159014000 231594232 68.661
> 440 170368000 235422186 72.367
> 450 182250000 280728129 64.920
> 460 194672000 282586911 68.889
> 470 207646000 310944304 66.779
> 480 221184000 409532779 54.009
> 490 235298000 381057200 61.749
> 500 250000000 413099959 60.518
> 510 265302000 393498007 67.421
> 520 281216000 675607105 41.624
> 530 297754000 988906780 30.109
> 540 314928000 1228529787 25.635
> 550 332750000 1396858866 23.821
> 560 351232000 2144144283 16.381
> 570 370386000 2712975462 13.652
> 580 390224000 3308411489 11.795
> 590 410758000 2326514544 17.656
>
> And I can't see any pattern in these numbers from which to draw a
> conclusion that makes sense.
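>
> For context, the anticipated flop count of 2*n^3 assumes a plain
> triple-loop double-precision kernel, roughly like the sketch below (a
> simplified stand-in for what mmmtest does, not the literal code):
>
>     /* naive n x n x n matrix-matrix multiply: 2*n*n*n scalar double flops */
>     void mmm(int n, const double *A, const double *B, double *C)
>     {
>         for (int i = 0; i < n; i++)
>             for (int j = 0; j < n; j++) {
>                 double sum = 0.0;
>                 for (int k = 0; k < n; k++)
>                     sum += A[i * n + k] * B[k * n + j]; /* 1 mul + 1 add */
>                 C[i * n + j] = sum;
>             }
>     }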
>
>>
>> Vince
> Alen