* Intel PEBS Load Latency Measurement
From: Manuel Selva @ 2013-09-17 7:44 UTC
To: linux-perf-users

Hi all,

I am trying to use the PMU on a two-socket workstation (2x Intel Xeon X5650, currently running Linux 3.6.11) to identify memory controller imbalance.

For this purpose I successfully used some uncore events to count the load on each memory controller through the perf_event_open system call. I am now planning to use Intel PEBS Load Latency Measurement to identify whether these loads result in unusually long memory latencies.

Looking at Vince Weaver's web page on kernel support for the PMU, http://web.eece.maine.edu/~vweaver/projects/perf_events/features.html, I saw that kernel 3.10 supports Load Latency Measurement. Unfortunately I can't find a man page describing how this works. Before looking at the kernel sources, I wanted to ask here for confirmation about Load Latency Measurement in recent Linux kernels.

Can anyone confirm that this functionality is available and usable? Is the perf userland tool using it to provide the functionality to end users?

Thanks in advance for your help,
--
Manu
* Re: Intel PEBS Load Latency Measurement
From: Andi Kleen @ 2013-09-19 8:22 UTC
To: Manuel Selva; +Cc: linux-perf-users

Manuel Selva <selva.manuel@gmail.com> writes:

> Is the perf userland tool using it to provide the functionality to end
> users?

It's in the standard perf:

perf mem record ...
perf mem report

-Andi

--
ak@linux.intel.com -- Speaking for myself only
* Re: Intel PEBS Load Latency Measurement
From: Manuel Selva @ 2013-10-28 11:28 UTC
To: Manuel Selva, linux-perf-users

Hi,

I am coming back to this subject after working on other things for several weeks. Andi pointed me to the userland tool 'perf mem', introduced in "recent" kernels (I can't find the version), which uses the perf_event_open system call to profile memory accesses.

I guess the answer to my question is in the code of this tool, but before digging deeper into it, I wanted to ask you (Linux perf experts) a few questions to be sure I am on the right track.

For now, I have just configured a perf_event_attr to sample PERF_COUNT_HW_INSTRUCTIONS at a given period. Can you confirm that sample_period means "the kernel will generate a sample (with the fields requested through sample_type) every sample_period instructions"?

Then, after calling perf_event_open, I mmap the returned file descriptor with an arbitrary size of X pages (with X = 1 + 2^n).

I then start recording events with ioctl on the file descriptor returned by perf_event_open. I am now wondering how to access the samples. My main concern is the meaning of the data_head and data_tail fields of the metadata page located at the beginning of the mmapped memory. I understand that my samples are located just after this metadata page, and that the head and tail pointers indicate where we are in reading the samples. Is that correct? While reading samples, should I use or modify these head and tail pointers, and if so, what is the purpose of that?

I am now going to look at the perf mem code to try to understand this on my side, but I am interested in any hint on the subject that may help me.

Many thanks in advance for your help,
Manu
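The setup described in this message could look roughly like the following in C. This is only a minimal sketch, not code from the thread: the helper names init_instruction_sampling() and ring_mmap_length() are hypothetical, and the choice of PERF_SAMPLE_IP | PERF_SAMPLE_TID as sample fields is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/perf_event.h>

/* Hypothetical helper: fill attr so that one sample is generated every
 * `period` retired instructions. */
static void init_instruction_sampling(struct perf_event_attr *attr,
                                      uint64_t period)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_HARDWARE;
    attr->config = PERF_COUNT_HW_INSTRUCTIONS;
    attr->sample_period = period;
    /* Assumed choice of sample fields; pick whatever the analysis needs. */
    attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID;
    /* Start disabled; enable later with
     * ioctl(fd, PERF_EVENT_IOC_ENABLE, 0). */
    attr->disabled = 1;
}

/* Hypothetical helper: byte length to pass to mmap() for one metadata
 * page followed by 2^n data pages. */
static size_t ring_mmap_length(unsigned n, size_t page_size)
{
    assert(n < 8 * sizeof(size_t) - 1);
    return ((size_t)1 + ((size_t)1 << n)) * page_size;
}
```

With 4096-byte pages and n = 3, ring_mmap_length() gives 9 pages: one metadata page plus eight data pages. The filled attr would then be passed to perf_event_open and the returned descriptor to mmap().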
* Re: Intel PEBS Load Latency Measurement
From: Namhyung Kim @ 2013-10-29 2:36 UTC
To: Manuel Selva; +Cc: Manuel Selva, linux-perf-users

Hi Manuel,

On Mon, 28 Oct 2013 12:28:06 +0100, Manuel Selva wrote:
> For now, I just configured a perf_event_attr to perform sampling of
> PERF_COUNT_HW_INSTRUCTIONS at a given period. Can you confirm that
> sample_period means "the kernel will generate a sample (with fields
> asked through sample_type) every sample_period instructions"?

Yes.

> Then after calling the perf_event_open system call I mmap the file
> descriptor returned with an arbitrary size of X pages (with X = 1 +
> 2^n).
>
> I then start recording events with ioctl on the file descriptor
> returned by perf_event_open. I am now wondering how to access the
> samples. My main concern is about the meaning of the data_head and
> data_tail fields of the metadata page located at the beginning of the
> mmapped memory. I understand that my samples are located just after
> this metadata page, and that these head and tail pointers are used to
> indicate where we are in the reading of the samples, is it correct?

Correct.

> While reading samples, should I use/modify these head and tail
> pointers, if yes what is the purpose of that?

The head is updated by the kernel; you only need to update the tail after reading. Please see perf_record__mmap_read().

Hope this helps,
Namhyung
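The head/tail protocol in this answer can be illustrated with a small C sketch. It is not the code of perf_record__mmap_read(); it is a simplified illustration under stated assumptions: record_header is a local mirror of the kernel's struct perf_event_header, drain_ring(), copy_wrapped() and tally_record() are hypothetical names, and the data area size is assumed to be a power of two. In a real program, head_p and tail_p would point at the data_head and data_tail fields of the metadata page, and data at the first page after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the kernel's struct perf_event_header. */
struct record_header {
    uint32_t type;
    uint16_t misc;
    uint16_t size;   /* whole record in bytes, header included */
};

typedef void (*record_cb)(const struct record_header *hdr,
                          const unsigned char *payload, void *ctx);

/* Copy len bytes starting at logical position pos, wrapping at data_size. */
static void copy_wrapped(void *dst, const unsigned char *data,
                         uint64_t data_size, uint64_t pos, size_t len)
{
    uint64_t off = pos & (data_size - 1);
    size_t first = len;
    if (off + len > data_size)
        first = (size_t)(data_size - off);
    memcpy(dst, data + off, first);
    memcpy((unsigned char *)dst + first, data, len - first);
}

/* Read every complete record between *tail_p and *head_p, then publish
 * the new tail. head is only read here, tail is only written here. */
static size_t drain_ring(const unsigned char *data, uint64_t data_size,
                         uint64_t *head_p, uint64_t *tail_p,
                         record_cb cb, void *ctx)
{
    static unsigned char payload[65536]; /* size is 16 bits wide */
    assert(data_size != 0 && (data_size & (data_size - 1)) == 0);

    /* Acquire: see the record bytes written before head was advanced. */
    uint64_t head = __atomic_load_n(head_p, __ATOMIC_ACQUIRE);
    uint64_t tail = *tail_p;
    size_t count = 0;

    while (head - tail >= sizeof(struct record_header)) {
        struct record_header hdr;
        copy_wrapped(&hdr, data, data_size, tail, sizeof(hdr));
        if (hdr.size < sizeof(hdr) || hdr.size > head - tail)
            break; /* malformed or incomplete record */
        copy_wrapped(payload, data, data_size, tail + sizeof(hdr),
                     hdr.size - sizeof(hdr));
        cb(&hdr, payload, ctx);
        tail += hdr.size;
        count++;
    }
    /* Release: tell the producer this space can be reused. */
    __atomic_store_n(tail_p, tail, __ATOMIC_RELEASE);
    return count;
}

/* Example consumer: count records and payload bytes. */
struct tally { size_t records; size_t payload_bytes; };

static void tally_record(const struct record_header *hdr,
                         const unsigned char *payload, void *ctx)
{
    struct tally *t = ctx;
    (void)payload;
    t->records++;
    t->payload_bytes += hdr->size - sizeof(*hdr);
}
```

The positions are treated as ever-increasing byte counts that are masked only when indexing into the buffer, which is why a record may straddle the end of the data area and has to be copied in two pieces.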
[parent not found: <CALbiyZy_JE+wai7d_=r-XzE+FdHRitTiAuPmANtRt7Qpet8fTg@mail.gmail.com>]
* Fwd: Intel PEBS Load Latency Measurement
From: Manuel Selva @ 2013-10-29 9:12 UTC
To: Namhyung Kim, linux-perf-users

Hi Namhyung,

Many thanks for your answer and for the function you pointed to. I think I now understand the perf_event_open syscall well enough to do what I want.

I still have two questions regarding the Intel load latency feature (I am on a Westmere-EP Xeon X5650) and its usage by the perf mem tool.

1- The Intel software developer's manual says: "load operations are randomly selected by hardware and tagged to carry information related to data source locality and latency". I am wondering what this means. Are we sampling at two different levels? First the hardware chooses some load instructions to tag, and then, every X such tagged instructions with a latency greater than a software-specified threshold (X being the sampling period, in event counts, specified by software), we record a sample with some information. What is the sampling rate of the hardware tagging mechanism, and is it high enough to get interesting results?

2- How does the perf mem tool (with the load option), with the help of the kernel of course, use this feature? After quickly browsing the code, here is my understanding; is it correct? The PEBS load latency feature is enabled with the minimal possible latency (3 cycles) so as to sample all loads, and with a given default sampling period (x tagged load events with latency greater than or equal to 3). In addition to these load events, the perf mem tool asks the kernel to record events about process naming and memory mappings of code, so that the source code associated with the instruction pointers present in samples can be retrieved offline.

Thanks again for your help,
Manu
* Re: Intel PEBS Load Latency Measurement
From: Manuel Selva @ 2013-10-29 13:20 UTC
To: Namhyung Kim, linux-perf-users

One more thing I forgot to ask about is the pid parameter. According to Vince Weaver's page: "If pid is 0, measurements happen on the current thread, if pid is greater than 0, the process indicated by pid is measured, and if pid is -1, all processes are counted." According to the perf userland tool wiki page, it is possible to attach to a specific thread with a -i option. So I wonder how I can use the perf_event_open system call to count events for a specific thread only.

Thanks again
* Re: Intel PEBS Load Latency Measurement
From: Namhyung Kim @ 2013-11-01 8:41 UTC
To: Manuel Selva; +Cc: linux-perf-users

On Tue, 29 Oct 2013 14:20:09 +0100, Manuel Selva wrote:
> One more thing I forgot to ask about is the pid parameter. According
> to Vince Weaver's page: "If pid is 0, measurements happen on the
> current thread, if pid is greater than 0, the process indicated by pid
> is measured, and if pid is -1, all processes are counted." According
> to the perf userland tool wiki page, it is possible to attach to a
> specific thread with a -i option. So I wonder how I can use the
> perf_event_open system call to count events for a specific thread
> only.

From the syscall's point of view, pid is actually tid AFAIK, so it works on a per-thread basis, not per process.

Thanks,
Namhyung
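To make the "pid is actually tid" point concrete, here is a minimal C sketch of opening an event on one specific thread. The wrapper names current_tid() and perf_event_open_thread() are hypothetical; the sketch assumes a glibc system where SYS_gettid and SYS_perf_event_open are defined in <sys/syscall.h>, and it passes -1 for cpu (any CPU), -1 for group_fd (no group) and 0 for flags.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread id of the calling thread. For the main thread of a
 * process this equals getpid(); other threads get their own value. */
static pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Open the event described by attr on the single thread `tid`.
 * Returns a file descriptor, or -1 with errno set on failure
 * (for example when perf events are not permitted for this user). */
static int perf_event_open_thread(struct perf_event_attr *attr, pid_t tid)
{
    assert(attr != NULL);
    return (int)syscall(SYS_perf_event_open, attr, tid,
                        -1 /* cpu */, -1 /* group_fd */, 0UL /* flags */);
}
```

A worker thread would typically call current_tid() itself and hand the value to whichever thread performs the perf_event_open call.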
* Re: Intel PEBS Load Latency Measurement
From: Manuel Selva @ 2013-11-01 9:02 UTC
To: Namhyung Kim; +Cc: linux-perf-users

Thanks for this info. It confirms what I finally concluded after looking at the source code, where I saw that pid is used to get a task_struct object.

Manu
* Re: Intel PEBS Load Latency Measurement
From: Vince Weaver @ 2013-11-01 17:02 UTC
To: Namhyung Kim; +Cc: Manuel Selva, linux-perf-users

On Fri, 1 Nov 2013, Namhyung Kim wrote:

> From the syscall's point of view, pid is actually tid AFAIK, so it
> works on a per-thread basis, not per process.

It is true that the manpage is a bit confusing here, though that's mostly due to the confusing way that Linux interchangeably uses pid/tid for process and thread ids.

I'll see if I can get the documentation made more clear.

Vince
* Re: Intel PEBS Load Latency Measurement
From: Manuel Selva @ 2013-11-01 18:08 UTC
To: Vince Weaver, Namhyung Kim; +Cc: linux-perf-users

I agree about the confusing way Linux uses pid and tid. I guess this comes from the (old) time when processes had only one thread. Anyway, threads are just processes sharing some part of their memory, and processes are just threads that don't share anything.

Your man page (the online version; I have to check why I don't have it on my Linux workstation) was really helpful in building what I needed on top of the perf_event_open system call, without having to start from scratch with my own kernel module or the msr module and the Intel documentation. I have written down some complementary information that may be useful for others; if you plan to update the man page, I can send you these notes.

Thanks again to everyone on the list for your help!

Manu
* Re: Fwd: Intel PEBS Load Latency Measurement
From: Namhyung Kim @ 2013-11-01 8:38 UTC
To: Manuel Selva; +Cc: linux-perf-users, Stephane Eranian

Hi Manuel,

I'm CC-ing Stephane, who is the author of the perf mem tool. Stephane, could you please answer the questions below if you have some time?

Thanks,
Namhyung

On Tue, 29 Oct 2013 10:12:39 +0100, Manuel Selva wrote:
> I still have two questions regarding the Intel load latency feature
> (I am on a Westmere-EP Xeon X5650) and its usage by the perf mem tool.
>
> 1- The Intel software developer's manual says: "load operations are
> randomly selected by hardware and tagged to carry information related
> to data source locality and latency". I am wondering what this means.
> Are we sampling at two different levels? First the hardware chooses
> some load instructions to tag, and then, every X such tagged
> instructions with a latency greater than a software-specified
> threshold (X being the sampling period, in event counts, specified by
> software), we record a sample with some information. What is the
> sampling rate of the hardware tagging mechanism, and is it high enough
> to get interesting results?
>
> 2- How does the perf mem tool (with the load option), with the help of
> the kernel of course, use this feature? After quickly browsing the
> code, here is my understanding; is it correct? The PEBS load latency
> feature is enabled with the minimal possible latency (3 cycles) so as
> to sample all loads, and with a given default sampling period (x
> tagged load events with latency greater than or equal to 3). In
> addition to these load events, the perf mem tool asks the kernel to
> record events about process naming and memory mappings of code, so
> that the source code associated with the instruction pointers present
> in samples can be retrieved offline.
* Re: Fwd: Intel PEBS Load Latency Measurement
From: Manuel Selva @ 2013-11-06 13:06 UTC
To: Namhyung Kim; +Cc: linux-perf-users, Stephane Eranian

Hi all,

I think I got the point of the Intel SDM saying that the "hardware randomly tags load operations". It simply means that when software indicates a sampling period of X events, the hardware chooses one load at random in each "packet" of X load events, so as not to always sample the same thing, for example when executing a loop. Is that correct?

Thanks,
----
Manu
* Re: Fwd: Intel PEBS Load Latency Measurement
  2013-11-01  8:38 ` Fwd: " Namhyung Kim
  2013-11-06 13:06 ` Manuel Selva
@ 2013-11-06 13:41 ` Stephane Eranian
  1 sibling, 0 replies; 13+ messages in thread
From: Stephane Eranian @ 2013-11-06 13:41 UTC
To: Namhyung Kim; +Cc: Manuel Selva, linux-perf-users

Hi Manuel,

On Fri, Nov 1, 2013 at 9:38 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> Hi Manuel,
>
> I'm CC-ing Stephane who is the author of the perf mem tool. Stephane,
> could you please answer the questions below if you have some time?
>
> Thanks,
> Namhyung
>
>
> On Tue, 29 Oct 2013 10:12:39 +0100, Manuel Selva wrote:
>> Hi Namhyung,
>>
>> Many thanks for your answer and the function you pointed to. I think I
>> now have all the required understanding of the perf_event_open syscall
>> to do what I want.
>>
>> I still have two questions regarding Intel (I am on a Westmere-EP Xeon
>> X5650) Load latency feature and its usage by the perf mem tool.
>>
>> 1- In the Intel software developer guide we can read: "load operations
>> are randomly selected by hardware and tagged to carry information
>> related to data source locality and latency". I am wondering what
>> this means: are we doing sampling at two different levels ? First the
>> hardware chooses some load instructions to tag, and then every X
>> (sampling period in events count specified by software) such tagged
>> instructions with a latency greater than a software-specified threshold
>> we record a sample with some information. What is the sampling rate of
>> the hardware tagging mechanism, is it enough to get some interesting
>> results ?
>>

The load latency facility combines basic PEBS with a threshold mechanism
that filters loads based on their latencies. The
mem_trans_retired:latency_above_threshold event counts the number of
retired loads that qualify for the threshold. This is the event you are
actually sampling on. When that counter overflows, the retired load is
sampled. If you set the counter to -P, it will overflow after P
occurrences of the event.

Now, it is clear that to get there you need to wait until the load
retires, otherwise you don't know the latency. Note that latency here
means instruction latency, not just data access latency. So I suspect
there is indeed some tagging mechanism underneath. It can track only one
load at a time. To avoid bias, the tagging mechanism uses some
randomization scheme. I don't know how this tagging mechanism actually
works. But clearly the hardware may track loads that don't qualify for
the threshold; they won't increment the counter and therefore will never
be captured by perf_events.

>> 2- How does the perf mem tool (with the load option), with of course
>> the help of the kernel, use this feature ? After a quick browsing of
>> the code, here is my understanding, is it correct ?
>> The PEBS load latency feature is enabled with the minimal possible
>> latency (3 cycles) to do sampling on all loads and with a given
>> default sampling period (x tagged load events with latency greater or
>> equal to 3). In addition to these "loads events" the perf mem tool
>> asks the kernel to record events about processes naming, and memory
>> mappings of code to be able to retrieve offline the source code
>> associated to instruction pointers present in samples.
>>

Yes, your description is correct. The one difference compared with
regular code sampling is that we also ask the kernel to record data
mmaps, so we get a chance to symbolize data addresses (global variables
only).

Hope this helps.

>> Thanks again for your help,
>>
>> Manu
>>
>>
>> 2013/10/29 Namhyung Kim <namhyung@kernel.org>
>>>
>>> Hi Manuel,
>>>
>>> On Mon, 28 Oct 2013 12:28:06 +0100, Manuel Selva wrote:
>>> > Hi,
>>> >
>>> > I am coming back on this subject after working on other stuff for
>>> > several weeks. Andi pointed me to the userland tool 'perf mem'
>>> > introduced in "recent" kernels (can't find the version) that is using
>>> > the kernel perf_event_open system call to profile memory accesses.
>>> >
>>> > I guess the answer to my question is in the code of this tool, but
>>> > before stepping deeper inside it, I wanted to ask you (Linux perf
>>> > experts) a few questions, to be sure I am on the right track.
>>> >
>>> > For now, I just configured a perf_event_attr to perform sampling of
>>> > PERF_COUNT_HW_INSTRUCTIONS at a given period. Can you confirm that
>>> > sample_period means "the kernel will generate a sample (with fields
>>> > asked through sample_type) every sample_period instructions" ?
>>>
>>> Yes.
>>>
>>> >
>>> > Then after calling the perf_event_open system call I mmap the file
>>> > descriptor returned with an arbitrary size of X pages (with X = 1 +
>>> > 2^n).
>>> >
>>> > I then start recording events with ioctl on the file descriptor
>>> > returned by perf_event_open. I am now wondering how to access the
>>> > samples. My main concern is about the meaning of the data_head and
>>> > data_tail fields of the metadata page located at the beginning of the
>>> > mmapped memory. I understand that my samples are located just after
>>> > this metadata page, and that these head and tail pointers are used to
>>> > indicate where we are in the reading of the samples, is it correct ?
>>>
>>> Correct.
>>>
>>>
>>> > While reading samples, should I use/modify these head and tail
>>> > pointers, if yes what is the purpose of that ?
>>>
>>> The head is updated by the kernel, you only need to update the tail
>>> after reading. Please see perf_record__mmap_read().
>>>
>>> >
>>> > I am going now to look for the perf mem code, to try to understand
>>> > that from my side, but I am interested in any hint on the subject that
>>> > may help me.
>>> >
>>> > Many thanks in advance for your help,
>>>
>>> Hope this helps,
>>> Namhyung
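To make the configuration discussed in this message concrete, here is a minimal sketch of a perf_event_attr set up for load latency sampling through perf_event_open. It is an illustration under assumptions, not the code of perf mem. The helper names init_ldlat_attr and sys_perf_event_open are hypothetical. The raw encoding 0x100b (event 0x0b, umask 0x10) is assumed to be the Nehalem/Westmere load latency event, and the threshold is assumed to go in the low 16 bits of config1 (the "ldlat" format field); both should be checked on the target machine under /sys/bus/event_source/devices/cpu/ before relying on them.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* ASSUMPTION: raw encoding of the load latency event on Nehalem/Westmere
 * (event 0x0b, umask 0x10). Other microarchitectures use other encodings. */
#define LDLAT_RAW_EVENT_WSM 0x100bULL

/*
 * Fill attr so that one sample is taken every `period` retired loads whose
 * latency is at least `threshold_cycles` (hypothetical helper).
 */
void init_ldlat_attr(struct perf_event_attr *attr, uint64_t period,
                     uint16_t threshold_cycles)
{
    assert(period > 0);                 /* a zero period would never sample */

    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_RAW;
    attr->config = LDLAT_RAW_EVENT_WSM;
    attr->config1 = threshold_cycles;   /* ASSUMPTION: ldlat = config1 bits 0-15 */
    attr->sample_period = period;       /* the "P" in "set the counter to -P" */
    attr->precise_ip = 2;               /* non-zero requests precise (PEBS) sampling */

    /* Instruction pointer, thread, data address, latency and data source. */
    attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ADDR |
                        PERF_SAMPLE_WEIGHT | PERF_SAMPLE_DATA_SRC;

    attr->comm = 1;                     /* process naming records */
    attr->mmap = 1;                     /* code mappings, to symbolize IPs */
    attr->mmap_data = 1;                /* data mappings, to symbolize addresses */
    attr->disabled = 1;                 /* start later with PERF_EVENT_IOC_ENABLE */
}

/* Thin wrapper: glibc does not export a perf_event_open() function. */
long sys_perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
                         int group_fd, unsigned long flags)
{
    return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}
```

A caller would typically do init_ldlat_attr(&attr, 1000, 3), open the event with sys_perf_event_open(&attr, pid, -1, -1, 0), mmap 1 + 2^n pages on the returned descriptor, and then enable counting with ioctl(fd, PERF_EVENT_IOC_ENABLE, 0). The threshold of 3 cycles mirrors the "minimal possible latency" mentioned in the question; a higher value restricts sampling to slower loads.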
end of thread, other threads:[~2013-11-06 13:41 UTC | newest]

Thread overview: 13+ messages
2013-09-17  7:44 Intel PEBS Load Latency Measurement Manuel Selva
2013-09-19  8:22 ` Andi Kleen
2013-10-28 11:28 ` Manuel Selva
2013-10-29  2:36 ` Namhyung Kim
     [not found] ` <CALbiyZy_JE+wai7d_=r-XzE+FdHRitTiAuPmANtRt7Qpet8fTg@mail.gmail.com>
2013-10-29  9:12 ` Fwd: " Manuel Selva
2013-10-29 13:20 ` Manuel Selva
2013-11-01  8:41 ` Namhyung Kim
2013-11-01  9:02 ` Manuel Selva
2013-11-01 17:02 ` Vince Weaver
2013-11-01 18:08 ` Manuel Selva
2013-11-01  8:38 ` Fwd: " Namhyung Kim
2013-11-06 13:06 ` Manuel Selva
2013-11-06 13:41 ` Stephane Eranian