From: John Kacur <jkacur@redhat.com>
To: Crystal Wood <swood@redhat.com>
Cc: Clark Williams <williams@redhat.com>,
rt-users <linux-rt-users@vger.kernel.org>,
Peter Xu <peterx@redhat.com>
Subject: Re: [PATCH] oslat: Add command line option for bucket width
Date: Fri, 9 Dec 2022 20:03:27 -0500 (EST) [thread overview]
Message-ID: <1624dd34-d12c-38dc-aed1-a34366ceafba@redhat.com> (raw)
In-Reply-To: <20221209052254.2609767-1-swood@redhat.com>
On Thu, 8 Dec 2022, Crystal Wood wrote:
> New option -W/--bucket-width allows the user to specify how large of a
> range of latencies is covered by a single bucket, including allowing the
> creation of sub-microsecond buckets.
>
> When the flag is not used, output should be unchanged. However, if a
> bucket width is specified that is not a multiple of one microsecond,
> latencies will be output as fractional microseconds, at nanosecond
> precision. This includes JSON output.
>
> When using this option, it is up to the user to determine what level
> of precision is meaningful relative to measurement error, as is noted
> in the documentation.
>
> Signed-off-by: Crystal Wood <swood@redhat.com>
> ---
> I'm hoping that this output format is acceptable, including JSON,
> though I'm not familiar with how the latter is typically consumed.
>
> Another option would be to simplify by not making the output precision
> conditional (probably keeping the decimal for readability's sake, at
> least in the non-JSON output) but I didn't want to risk compatibility
> or user surprise issues.
> ---
> src/oslat/oslat.8 | 9 +++-
> src/oslat/oslat.c | 105 +++++++++++++++++++++++++++++++---------------
> 2 files changed, 80 insertions(+), 34 deletions(-)
>
> diff --git a/src/oslat/oslat.8 b/src/oslat/oslat.8
> index 39b36df0db3f..eb96448bfff1 100644
> --- a/src/oslat/oslat.8
> +++ b/src/oslat/oslat.8
> @@ -7,7 +7,7 @@ oslat \- OS Latency Detector
> .RI "[ \-shvz ] [ \-b " bucket-size " ] [ \-B " bias " ] [ \-c " cpu-list " ] \
> [ \-C " cpu-main-thread " ] [ \-f " rt-prio " ] [ \-\-json " filename " ] \
> [ \-m " workload-mem " ] [\-t " runtime " ] [ \-T " trace-threshold " ] \
> -[ \-w " workload " ]"
> +[ \-w " workload " ] [ \-W " bucket-width " ]"
> .SH DESCRIPTION
> .B oslat
> is an open source userspace polling mode stress program to detect OS level
> @@ -57,6 +57,13 @@ NOTE: please make sure the CPU frequency on all testing cores
> are locked before using this parmater. If you don't know how
> to lock the freq then please don't use this parameter.
> .TP
> +.B \-W, \-\-bucket-width
> +Interval between buckets in nanoseconds
> +
> +NOTE: Widths not a multiple of 1000 cause ns-precision output
> +You are responsible for considering the impact of measurement
> +overhead at the nanosecond scale.
> +.TP
> .B \-h, \-\-help
> Show the help message.
> .TP
> diff --git a/src/oslat/oslat.c b/src/oslat/oslat.c
> index 55302f11986b..e8a642699cc7 100644
> --- a/src/oslat/oslat.c
> +++ b/src/oslat/oslat.c
> @@ -192,6 +192,9 @@ struct global {
> struct timeval tv_start;
> int rtprio;
> int bucket_size;
> + int bucket_width;
> + int unit_per_us;
> + int precision;
> int trace_threshold;
> int runtime;
> /* The core that we run the main thread. Default is cpu0 */
> @@ -325,45 +328,46 @@ static float cycles_to_sec(const struct thread *t, uint64_t cycles)
>
> static void insert_bucket(struct thread *t, stamp_t value)
> {
> - int index, us;
> + int index;
> + unsigned int lat;
> uint64_t extra;
> + double us;
>
> - index = value / t->counter_mhz;
> - assert(index >= 0);
> - us = index + 1;
> - assert(us > 0);
> -
> + lat = (value * g.unit_per_us + t->counter_mhz - 1) / t->counter_mhz;
> + us = (double)lat / g.unit_per_us;
> if (!g.preheat && g.trace_threshold && us >= g.trace_threshold) {
> - char *line = "%s: Trace threshold (%d us) triggered on cpu %d with %u us!\n"
> + char *line = "%s: Trace threshold (%d us) triggered on cpu %d with %.*f us!\n"
> "Stopping the test.\n";
> - tracemark(line, g.app_name, g.trace_threshold, t->core_i, us);
> - err_quit(line, g.app_name, g.trace_threshold, t->core_i, us);
> + tracemark(line, g.app_name, g.trace_threshold, t->core_i,
> + g.precision, us);
> + err_quit(line, g.app_name, g.trace_threshold, t->core_i,
> + g.precision, us);
> }
>
> /* Update max latency */
> - if (us > t->maxlat)
> - t->maxlat = us;
> + if (lat > t->maxlat)
> + t->maxlat = lat;
>
> - if (us < t->minlat)
> - t->minlat = us;
> + if (lat < t->minlat)
> + t->minlat = lat;
>
> if (g.bias) {
> /* t->bias will be set after pre-heat if user enabled it */
> - us -= g.bias;
> + lat -= g.bias;
> /*
> * Negative should hardly happen, but if it happens, we assume we're in
> - * the smallest bucket, which is 1us. Same to index.
> + * the smallest bucket.
> */
> - if (us <= 0)
> - us = 1;
> - index -= g.bias;
> - if (index < 0)
> - index = 0;
> + if (lat <= 0)
> + lat = 1;
> }
>
> + index = lat / g.bucket_width;
> + assert(index >= 0);
> +
> /* Too big the jitter; put into the last bucket */
> if (index >= g.bucket_size) {
> - /* Keep the extra bit (in us) */
> + /* Keep the extra bit (in bucket width multiples) */
> extra = index - g.bucket_size;
> if (t->overflow_sum + extra < t->overflow_sum) {
> /* The uint64_t even overflowed itself; bail out */
> @@ -455,6 +459,19 @@ static void *thread_main(void *arg)
> printf("%s\n", end); \
> } while (0)
>
> +#define putfieldp(label, val, end) do { \
> + printf("%12s:\t", label); \
> + for (i = 0; i < g.n_threads; ++i) \
> + printf(" %.*f", g.precision, \
> + (double)(val) / g.unit_per_us); \
> + printf("%s\n", end); \
> + } while (0)
> +
> +static double bucket_to_lat(int bucket)
> +{
> + return (g.bias + (bucket + 1) * (double)g.bucket_width) / g.unit_per_us;
> +}
> +
> void calculate(struct thread *t)
> {
> int i, j;
> @@ -465,11 +482,11 @@ void calculate(struct thread *t)
> /* Calculate average */
> sum = count = 0;
> for (j = 0; j < g.bucket_size; j++) {
> - sum += 1.0 * t[i].buckets[j] * (g.bias+j+1);
> + sum += t[i].buckets[j] * bucket_to_lat(j);
> count += t[i].buckets[j];
> }
> /* Add the extra amount of huge spikes in */
> - sum += t->overflow_sum;
> + sum += t->overflow_sum * g.bucket_width;
> t[i].average = sum / count;
> }
> }
> @@ -501,16 +518,16 @@ static void write_summary(struct thread *t)
> print_dotdotdot = 0;
> }
>
> - snprintf(bucket_name, sizeof(bucket_name), "%03"PRIu64
> - " (us)", g.bias+j+1);
> + snprintf(bucket_name, sizeof(bucket_name), "%03.*f (us)",
> + g.precision, bucket_to_lat(j));
> putfield(bucket_name, t[i].buckets[j], PRIu64,
> (j == g.bucket_size - 1) ? " (including overflows)" : "");
> }
>
> - putfield("Minimum", t[i].minlat, PRIu64, " (us)");
> + putfieldp("Minimum", t[i].minlat, " (us)");
> putfield("Average", t[i].average, ".3lf", " (us)");
> - putfield("Maximum", t[i].maxlat, PRIu64, " (us)");
> - putfield("Max-Min", t[i].maxlat - t[i].minlat, PRIu64, " (us)");
> + putfieldp("Maximum", t[i].maxlat, " (us)");
> + putfieldp("Max-Min", t[i].maxlat - t[i].minlat, " (us)");
> putfield("Duration", cycles_to_sec(&(t[i]), t[i].runtime),
> ".3f", " (sec)");
> printf("\n");
> @@ -537,8 +554,8 @@ static void write_summary_json(FILE *f, void *data)
> if (t[i].buckets[j] == 0)
> continue;
> fprintf(f, "%s", comma ? ",\n" : "\n");
> - fprintf(f, " \"%" PRIu64 "\": %" PRIu64,
> - g.bias+j+1, t[i].buckets[j]);
> + fprintf(f, " \"%.*f\": %" PRIu64,
> + g.precision, bucket_to_lat(j), t[i].buckets[j]);
> comma = 1;
> }
> if (comma)
> @@ -610,6 +627,10 @@ static void usage(int error)
> "-v, --version Display the version of the software.\n"
> "-w, --workload Specify a kind of workload, default is no workload\n"
> " (options: no, memmove)\n"
> + "-W, --bucket-width Interval between buckets in nanoseconds\n"
> + " NOTE: Widths not a multiple of 1000 cause ns-precision output\n"
> + " You are responsible for considering the impact of measurement\n"
> + " overhead at the nanosecond scale.\n"
> "-z, --zero-omit Don't display buckets in the output histogram if all zeros.\n"
> );
> exit(error);
> @@ -630,7 +651,7 @@ static int workload_select(char *name)
> }
>
> enum option_value {
> - OPT_BUCKETSIZE=1, OPT_CPU_LIST, OPT_CPU_MAIN_THREAD,
> + OPT_BUCKETSIZE=1, OPT_BUCKETWIDTH, OPT_CPU_LIST, OPT_CPU_MAIN_THREAD,
> OPT_DURATION, OPT_JSON, OPT_RT_PRIO, OPT_HELP, OPT_TRACE_TH,
> OPT_WORKLOAD, OPT_WORKLOAD_MEM, OPT_BIAS,
> OPT_QUIET, OPT_SINGLE_PREHEAT, OPT_ZERO_OMIT,
> @@ -644,6 +665,7 @@ static void parse_options(int argc, char *argv[])
> int option_index = 0;
> static struct option options[] = {
> { "bucket-size", required_argument, NULL, OPT_BUCKETSIZE },
> + { "bucket-width", required_argument, NULL, OPT_BUCKETWIDTH },
> { "cpu-list", required_argument, NULL, OPT_CPU_LIST },
> { "cpu-main-thread", required_argument, NULL, OPT_CPU_MAIN_THREAD},
> { "duration", required_argument, NULL, OPT_DURATION },
> @@ -660,7 +682,7 @@ static void parse_options(int argc, char *argv[])
> { "version", no_argument, NULL, OPT_VERSION },
> { NULL, 0, NULL, 0 },
> };
> - int i, c = getopt_long(argc, argv, "b:Bc:C:D:f:hm:qsw:T:vz",
> + int i, c = getopt_long(argc, argv, "b:Bc:C:D:f:hm:qsw:W:T:vz",
> options, &option_index);
> long ncores;
>
> @@ -677,6 +699,20 @@ static void parse_options(int argc, char *argv[])
> exit(1);
> }
> break;
> + case OPT_BUCKETWIDTH:
> + case 'W':
> + g.bucket_width = strtol(optarg, NULL, 10);
> + if (g.bucket_size <= 0) {
I think this should be g.bucket_width
> + printf("Illegal bucket width: %s\n", optarg);
> + exit(1);
> + }
> + if (g.bucket_width % 1000) {
> + g.unit_per_us = 1000;
> + g.precision = 3;
> + } else {
> + g.bucket_width /= 1000;
> + }
> + break;
> case OPT_BIAS:
> case 'B':
> g.enable_bias = 1;
> @@ -811,7 +847,8 @@ static void record_bias(struct thread *t)
> bias = t[i].minlat;
> }
> g.bias = bias;
> - printf("Global bias set to %" PRId64 " (us)\n", bias);
> + printf("Global bias set to %.*f (us)\n", g.precision,
> + (double)bias / g.unit_per_us);
> }
>
> int main(int argc, char *argv[])
> @@ -835,6 +872,8 @@ int main(int argc, char *argv[])
> g.app_name = argv[0];
> g.rtprio = 0;
> g.bucket_size = BUCKET_SIZE;
> + g.bucket_width = 1;
> + g.unit_per_us = 1;
> g.runtime = 1;
> g.workload = &workload_list[WORKLOAD_DEFAULT];
> g.workload_mem_size = WORKLOAD_MEM_SIZE;
> --
> 2.38.1
>
>
A quick first look through and run, see the one comment above
near "case 'W'"
and then
checkpatch reports some minor easily fixed problems
../linux/scripts/checkpatch.pl oslat.patch
ERROR: code indent should use tabs where possible
#100: FILE: src/oslat/oslat.c:342:
+^I^I g.precision, us);$
ERROR: code indent should use tabs where possible
#102: FILE: src/oslat/oslat.c:344:
+^I^I g.precision, us);$
ERROR: spaces required around that '=' (ctx:VxV)
#227: FILE: src/oslat/oslat.c:654:
+ OPT_BUCKETSIZE=1, OPT_BUCKETWIDTH, OPT_CPU_LIST,
OPT_CPU_MAIN_THREAD,
^
I'll have a closer look on Monday, but thanks for your patch!
John
next prev parent reply other threads:[~2022-12-10 1:07 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-09 5:22 [PATCH] oslat: Add command line option for bucket width Crystal Wood
2022-12-10 1:03 ` John Kacur [this message]
2022-12-10 1:40 ` Crystal Wood
2022-12-13 19:31 ` John Kacur
2022-12-13 22:21 ` Crystal Wood
2022-12-13 20:41 ` John Kacur
2022-12-13 20:46 ` John Kacur
2022-12-13 22:24 ` Crystal Wood
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1624dd34-d12c-38dc-aed1-a34366ceafba@redhat.com \
--to=jkacur@redhat.com \
--cc=linux-rt-users@vger.kernel.org \
--cc=peterx@redhat.com \
--cc=swood@redhat.com \
--cc=williams@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox