diff for duplicates of <200701310752.48632.arnd@arndb.de>

diff --git a/a/1.txt b/N1/1.txt
index a54b3fa..729fd40 100644
--- a/a/1.txt
+++ b/N1/1.txt
@@ -4,45 +4,35 @@ On Tuesday 30 January 2007 23:54, Maynard Johnson wrote:
 > >>doesn't have any relevance to this at all, the only data that is
 > >>per spu is the sample data collected on a profiling interrupt,
 > >>which you can then copy in the per-context data on a context switch.
-> >=20
-> > The sample data is written out to the event buffer on every profiling=20
-> > interrupt. =A0But we don't write out the SPU program counter samples=20
-> > directly to the event buffer. =A0First, we have to find the cached_info=
-=20
-> > for the appropriate SPU context to retrieve the cached vma-to-fileoffse=
-t=20
-> > map. =A0Then we do the vma_map_lookup to find the fileoffset correspond=
-ing=20
-> > to the SPU PC sample, which we then write out to the event buffer. =A0T=
-his=20
-> > is one of the most time-critical pieces of the SPU profiling code, so I=
-=20
-> > used an array to hold the cached_info for fast random access. =A0But as=
- I=20
-> > stated in a code comment above, the negative implication of this curren=
-t=20
-> > implementation is that the array can only hold the cached_info for=20
-> > currently running SPU tasks. =A0I need to give this some more thought.
->=20
-> I've given this some more thought, and I'm coming to the conclusion that=
-=20
-> a pure array-based implementation for holding cached_info (getting rid=20
-> of the lists) would work well for the vast majority of cases in which=20
-> OProfile will be used. =A0Yes, it is true that the mapping of an SPU=20
-> context to a phsyical spu-numbered array location cannot be guaranteed=20
-> to stay valid, and that's why I discard the cached_info at that array=20
-> location when the SPU task is switched out. =A0Yes, it would be terribly=
-=20
-> inefficient if the same SPU task gets switched back in later and we=20
-> would have to recreate the cached_info. =A0However, I contend that=20
-> OProfile users are interested in profiling one application at a time.=20
-> They are not going to want to muddy the waters with multiple SPU apps=20
-> running at the same time. =A0I can't think of any reason why someone woul=
-d=20
+> >
+> > The sample data is written out to the event buffer on every profiling
+> > interrupt. But we don't write out the SPU program counter samples
+> > directly to the event buffer. First, we have to find the cached_info
+> > for the appropriate SPU context to retrieve the cached vma-to-fileoffset
+> > map. Then we do the vma_map_lookup to find the fileoffset corresponding
+> > to the SPU PC sample, which we then write out to the event buffer. This
+> > is one of the most time-critical pieces of the SPU profiling code, so I
+> > used an array to hold the cached_info for fast random access. But as I
+> > stated in a code comment above, the negative implication of this current
+> > implementation is that the array can only hold the cached_info for
+> > currently running SPU tasks. I need to give this some more thought.
+>
+> I've given this some more thought, and I'm coming to the conclusion that
+> a pure array-based implementation for holding cached_info (getting rid
+> of the lists) would work well for the vast majority of cases in which
+> OProfile will be used. Yes, it is true that the mapping of an SPU
+> context to a phsyical spu-numbered array location cannot be guaranteed
+> to stay valid, and that's why I discard the cached_info at that array
+> location when the SPU task is switched out. Yes, it would be terribly
+> inefficient if the same SPU task gets switched back in later and we
+> would have to recreate the cached_info. However, I contend that
+> OProfile users are interested in profiling one application at a time.
+> They are not going to want to muddy the waters with multiple SPU apps
+> running at the same time. I can't think of any reason why someone would
 > conscisouly choose to do that.
->=20
+>
 > Any thoughts from the general community, especially OProfile users?
->=20
+>
 Please assume that in the near future we will be scheduling SPU contexts
 in and out multiple times a second. Even in a single application, you
 can easily have more contexts than you have physical SPUs.
diff --git a/a/content_digest b/N1/content_digest
index 2a6fcdf..45a6268 100644
--- a/a/content_digest
+++ b/N1/content_digest
@@ -7,8 +7,8 @@
  "To\0cbe-oss-dev@ozlabs.org"
  " maynardj@us.ibm.com\0"
  "Cc\0linuxppc-dev@ozlabs.org"
- linux-kernel@vger.kernel.org
- " oprofile-list@lists.sourceforge.net\0"
+ oprofile-list@lists.sourceforge.net
+ " linux-kernel@vger.kernel.org\0"
  "\00:1\0"
  "b\0"
  "On Tuesday 30 January 2007 23:54, Maynard Johnson wrote:\n"
@@ -17,45 +17,35 @@
  "> >>doesn't have any relevance to this at all, the only data that is\n"
  "> >>per spu is the sample data collected on a profiling interrupt,\n"
  "> >>which you can then copy in the per-context data on a context switch.\n"
- "> >=20\n"
- "> > The sample data is written out to the event buffer on every profiling=20\n"
- "> > interrupt. =A0But we don't write out the SPU program counter samples=20\n"
- "> > directly to the event buffer. =A0First, we have to find the cached_info=\n"
- "=20\n"
- "> > for the appropriate SPU context to retrieve the cached vma-to-fileoffse=\n"
- "t=20\n"
- "> > map. =A0Then we do the vma_map_lookup to find the fileoffset correspond=\n"
- "ing=20\n"
- "> > to the SPU PC sample, which we then write out to the event buffer. =A0T=\n"
- "his=20\n"
- "> > is one of the most time-critical pieces of the SPU profiling code, so I=\n"
- "=20\n"
- "> > used an array to hold the cached_info for fast random access. =A0But as=\n"
- " I=20\n"
- "> > stated in a code comment above, the negative implication of this curren=\n"
- "t=20\n"
- "> > implementation is that the array can only hold the cached_info for=20\n"
- "> > currently running SPU tasks. =A0I need to give this some more thought.\n"
- ">=20\n"
- "> I've given this some more thought, and I'm coming to the conclusion that=\n"
- "=20\n"
- "> a pure array-based implementation for holding cached_info (getting rid=20\n"
- "> of the lists) would work well for the vast majority of cases in which=20\n"
- "> OProfile will be used. =A0Yes, it is true that the mapping of an SPU=20\n"
- "> context to a phsyical spu-numbered array location cannot be guaranteed=20\n"
- "> to stay valid, and that's why I discard the cached_info at that array=20\n"
- "> location when the SPU task is switched out. =A0Yes, it would be terribly=\n"
- "=20\n"
- "> inefficient if the same SPU task gets switched back in later and we=20\n"
- "> would have to recreate the cached_info. =A0However, I contend that=20\n"
- "> OProfile users are interested in profiling one application at a time.=20\n"
- "> They are not going to want to muddy the waters with multiple SPU apps=20\n"
- "> running at the same time. =A0I can't think of any reason why someone woul=\n"
- "d=20\n"
+ "> > \n"
+ "> > The sample data is written out to the event buffer on every profiling \n"
+ "> > interrupt. \302\240But we don't write out the SPU program counter samples \n"
+ "> > directly to the event buffer. \302\240First, we have to find the cached_info \n"
+ "> > for the appropriate SPU context to retrieve the cached vma-to-fileoffset \n"
+ "> > map. \302\240Then we do the vma_map_lookup to find the fileoffset corresponding \n"
+ "> > to the SPU PC sample, which we then write out to the event buffer. \302\240This \n"
+ "> > is one of the most time-critical pieces of the SPU profiling code, so I \n"
+ "> > used an array to hold the cached_info for fast random access. \302\240But as I \n"
+ "> > stated in a code comment above, the negative implication of this current \n"
+ "> > implementation is that the array can only hold the cached_info for \n"
+ "> > currently running SPU tasks. \302\240I need to give this some more thought.\n"
+ "> \n"
+ "> I've given this some more thought, and I'm coming to the conclusion that \n"
+ "> a pure array-based implementation for holding cached_info (getting rid \n"
+ "> of the lists) would work well for the vast majority of cases in which \n"
+ "> OProfile will be used. \302\240Yes, it is true that the mapping of an SPU \n"
+ "> context to a phsyical spu-numbered array location cannot be guaranteed \n"
+ "> to stay valid, and that's why I discard the cached_info at that array \n"
+ "> location when the SPU task is switched out. \302\240Yes, it would be terribly \n"
+ "> inefficient if the same SPU task gets switched back in later and we \n"
+ "> would have to recreate the cached_info. \302\240However, I contend that \n"
+ "> OProfile users are interested in profiling one application at a time. \n"
+ "> They are not going to want to muddy the waters with multiple SPU apps \n"
+ "> running at the same time. \302\240I can't think of any reason why someone would \n"
  "> conscisouly choose to do that.\n"
- ">=20\n"
+ "> \n"
  "> Any thoughts from the general community, especially OProfile users?\n"
- ">=20\n"
+ "> \n"
  "Please assume that in the near future we will be scheduling SPU contexts\n"
  "in and out multiple times a second. Even in a single application, you\n"
  "can easily have more contexts than you have physical SPUs.\n"
@@ -79,4 +69,4 @@
  "\n"
  "\tArnd <><"
 
-59cb73ab612006d69fed68ced1bb7285e65c6e74462e71539921d0a458c7bcee
+061c7e045a408e5aef6fe69e03d10cf203f95165a02c92533f5f0c6665d52799