From: Chulmin Kim
Subject: Re: Question about LLC-load-misses event
Date: Tue, 23 Oct 2012 14:53:22 +0900
Message-ID: <508630D2.60006@core.kaist.ac.kr>
References: <507C0A9F.8060405@core.kaist.ac.kr> <507C1036.7020707@core.kaist.ac.kr> <87a9vd7nmd.fsf@sejong.aot.lge.com>
In-Reply-To: <87a9vd7nmd.fsf@sejong.aot.lge.com>
To: Namhyung Kim, linux-perf-users@vger.kernel.org

On 2012-10-23 at 2:39 PM, Namhyung Kim wrote:
> Hi Chulmin,
>
> On Mon, 15 Oct 2012 22:31:34 +0900, Chulmin Kim wrote:
>> On 2012-10-15 at 10:07 PM, Chulmin Kim wrote:
>>> (perf command : perf stat -a -A -e LLC-loads -e LLC-load-misses -e
>>> instructions sleep 3)
>>>
>>> The problem is, the bandwidth from STREAM benchmark does not match with
>>> the monitored value.
>>>
>>> e.g.
>>> I got 9395MB/s from Stream.
>>>
>>> "perf" shows 134,642,063 LLC-load-misses for 3 seconds.
>>> -> BW = ((# of events)/(3 seconds)) * 64 bytes / (1024*1024) = 2739MB/s
>>> In this equation, the term (64bytes) is for cache line size, and the
>>> term(1024*1024) is for (MB/s).
>>>
>>> Why does this mismatch occur?
>> In case of Oprofile, the value for a certain event represents the number
>> of the overflows which occur when the number of the event exceeds the
>> predefined value.
>> Is it a similar case with that?
> I guess not.  And what's the result of the LLC-loads?  AFAIK it counts
> all cache accesses including hits and misses.

Sorry for the lack of information.
I used the STREAM benchmark, which generates 100% cache misses.
Of course, the LLC-loads count is slightly larger than the LLC-load-misses
count (but they are almost the same).

> Did you calculate the
> bandwidth using the result of LLC-loads?

Bandwidth results:
9395MB/s from STREAM
2739MB/s from LLC-loads (including both hits and misses)

I also want to add the bandwidth from memory writes (about 3000MB/s from
the LLC-store events, including both hits and misses).

Why does this difference occur? (9395MB/s vs. about 5739MB/s)

> I suspect the h/w *might*
> prefetches a couple of lines when cache-miss occurred, but I'm not
> sure. :)

I also suspected prefetching.
After I posted this question, I checked the prefetch events using "perf".
They also showed 100% prefetch misses (but I don't fully understand what
this result means).
Do you know what this means?
Are the prefetch events and the LLC-load (or LLC-store) events exclusive,
or are they correlated?

The perf tool is too hard for me. I could not find any good document or
site to help users.
(If you know of one, please recommend it! :) )

Thanks for your attention!

> Thanks,
> Namhyung
>
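
For reference, the arithmetic in this thread can be checked with a short
script. This is only a sketch under the thread's own assumptions: 64-byte
cache lines, a 3-second measurement window, and exactly one cache line
transferred per counted event. The function name events_to_mb_per_s is made
up for illustration, and the 3000MB/s store figure is the rough value quoted
above, not a measured event count.

```python
# Sketch of the bandwidth estimate discussed in this thread.
# Assumptions (from the thread, not verified on any hardware):
#   - each counted LLC event moves one 64-byte cache line
#   - "MB/s" means bytes / (1024 * 1024) per second

CACHE_LINE_BYTES = 64
BYTES_PER_MB = 1024 * 1024


def events_to_mb_per_s(events, seconds, line_bytes=CACHE_LINE_BYTES):
    """Convert an event count over a time window into MB/s (hypothetical helper)."""
    return events / seconds * line_bytes / BYTES_PER_MB


# Numbers quoted in the thread.
llc_load_misses = 134_642_063   # perf stat count over the 3-second run
window_seconds = 3
stream_bw = 9395.0              # MB/s reported by STREAM
store_bw = 3000.0               # MB/s, approximate figure from the thread

load_bw = events_to_mb_per_s(llc_load_misses, window_seconds)
total_bw = load_bw + store_bw

print(f"load bandwidth from counters : {load_bw:.0f} MB/s")
print(f"load + store from counters   : {total_bw:.0f} MB/s")
print(f"STREAM / counter estimate    : {stream_bw / total_bw:.2f}x")
```

The script only reproduces the numbers already given (about 2739MB/s for
loads and about 5739MB/s for loads plus stores). It does not explain the
remaining gap to the STREAM result, which is what the prefetch question
above is about.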