* [PATCH] perf, tools, stat: Force C numeric locale for CSV mode @ 2016-01-05 19:17 Andi Kleen 2016-01-05 21:05 ` Jiri Olsa 0 siblings, 1 reply; 6+ messages in thread From: Andi Kleen @ 2016-01-05 19:17 UTC (permalink / raw) To: acme; +Cc: jolsa, mingo, linux-kernel, Andi Kleen From: Andi Kleen <ak@linux.intel.com> Some locales print floating point numbers with a comma instead of a dot. This causes problems with CSV mode because it causes extra false CSV fields. Force the numeric locale to be always C in CSV mode. Before: $ LC_ALL=pl_PL.utf8 perf stat -x, true 0,399472,,task-clock,399472,100,00 <---- extra bogus field ... After: $ LC_ALL=pl_PL.utf8 ./obj-perf/perf stat -x, true 0.338422,,task-clock,338422,100.00 Originally reported in https://github.com/andikleen/pmu-tools/issues/43 Signed-off-by: Andi Kleen <ak@linux.intel.com> --- tools/perf/builtin-stat.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 9805e03..4d5e504 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -1826,6 +1826,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused) csv_output = true; if (!strcmp(csv_sep, "\\t")) csv_sep = "\t"; + setlocale(LC_NUMERIC, "C"); } else csv_sep = DEFAULT_SEPARATOR; -- 2.4.3 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] perf, tools, stat: Force C numeric locale for CSV mode 2016-01-05 19:17 [PATCH] perf, tools, stat: Force C numeric locale for CSV mode Andi Kleen @ 2016-01-05 21:05 ` Jiri Olsa 2016-01-05 21:17 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 6+ messages in thread From: Jiri Olsa @ 2016-01-05 21:05 UTC (permalink / raw) To: Andi Kleen; +Cc: acme, jolsa, mingo, linux-kernel, Andi Kleen On Tue, Jan 05, 2016 at 11:17:45AM -0800, Andi Kleen wrote: > From: Andi Kleen <ak@linux.intel.com> > > Some locales print floating point numbers with a comma instead of a dot. > This causes problems with CSV mode because it causes extra false CSV > fields. Force the numeric locale to be always C in CSV mode. > > Before: > > $ LC_ALL=pl_PL.utf8 perf stat -x, true > 0,399472,,task-clock,399472,100,00 <---- extra bogus field > ... > > After: > $ LC_ALL=pl_PL.utf8 ./obj-perf/perf stat -x, true > 0.338422,,task-clock,338422,100.00 > > Originally reported in https://github.com/andikleen/pmu-tools/issues/43 > > Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> thanks, jirka ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] perf, tools, stat: Force C numeric locale for CSV mode 2016-01-05 21:05 ` Jiri Olsa @ 2016-01-05 21:17 ` Arnaldo Carvalho de Melo 2016-01-05 21:28 ` Andi Kleen 0 siblings, 1 reply; 6+ messages in thread From: Arnaldo Carvalho de Melo @ 2016-01-05 21:17 UTC (permalink / raw) To: Andi Kleen; +Cc: Jiri Olsa, jolsa, mingo, linux-kernel, Andi Kleen Em Tue, Jan 05, 2016 at 10:05:01PM +0100, Jiri Olsa escreveu: > On Tue, Jan 05, 2016 at 11:17:45AM -0800, Andi Kleen wrote: > > From: Andi Kleen <ak@linux.intel.com> > > > > Some locales print floating point numbers with a comma instead of a dot. > > This causes problems with CSV mode because it causes extra false CSV > > fields. Force the numeric locale to be always C in CSV mode. > > > > Before: > > > > $ LC_ALL=pl_PL.utf8 perf stat -x, true > > 0,399472,,task-clock,399472,100,00 <---- extra bogus field > > ... > > > > After: > > $ LC_ALL=pl_PL.utf8 ./obj-perf/perf stat -x, true > > 0.338422,,task-clock,338422,100.00 > > > > Originally reported in https://github.com/andikleen/pmu-tools/issues/43 > > > > Signed-off-by: Andi Kleen <ak@linux.intel.com> > > Acked-by: Jiri Olsa <jolsa@kernel.org> I wonder what is that other tools do when stumbling on this, i.e. some other tool output that produces values that have the CSV character in it... Completely disabling the configured locale seems too harsh to me, aren't people used to changing the csv char via some option like we have in 'perf stat': -x, --field-separator when changing the locale from the default 'C' one? Hey, you even used it above, but you chose a CSV char that is used in this locale, oops ;-) - Arnaldo ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] perf, tools, stat: Force C numeric locale for CSV mode 2016-01-05 21:17 ` Arnaldo Carvalho de Melo @ 2016-01-05 21:28 ` Andi Kleen 2016-01-05 23:27 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 6+ messages in thread From: Andi Kleen @ 2016-01-05 21:28 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Andi Kleen, Jiri Olsa, jolsa, mingo, linux-kernel, Andi Kleen On Tue, Jan 05, 2016 at 06:17:57PM -0300, Arnaldo Carvalho de Melo wrote: > Em Tue, Jan 05, 2016 at 10:05:01PM +0100, Jiri Olsa escreveu: > > On Tue, Jan 05, 2016 at 11:17:45AM -0800, Andi Kleen wrote: > > > From: Andi Kleen <ak@linux.intel.com> > > > > > > Some locales print floating point numbers with a comma instead of a dot. > > > This causes problems with CSV mode because it causes extra false CSV > > > fields. Force the numeric locale to be always C in CSV mode. > > > > > > Before: > > > > > > $ LC_ALL=pl_PL.utf8 perf stat -x, true > > > 0,399472,,task-clock,399472,100,00 <---- extra bogus field > > > ... > > > > > > After: > > > $ LC_ALL=pl_PL.utf8 ./obj-perf/perf stat -x, true > > > 0.338422,,task-clock,338422,100.00 > > > > > > Originally reported in https://github.com/andikleen/pmu-tools/issues/43 > > > > > > Signed-off-by: Andi Kleen <ak@linux.intel.com> > > > > Acked-by: Jiri Olsa <jolsa@kernel.org> > > I wonder what is that other tools do when stumbling on this, i.e. > some other tool output that produces values that have the CSV character > in it... Proper CSV supports escaping the separator by putting the whole field into quotes. Unfortunately perf stat doesn't output proper CSV, the event fields with commas are not quoted. I usually work around it by using -x\; instead But the , problem should be still fixed. > > Completely disabling the configured locale seems too harsh to me, aren't > people used to changing the csv char via some option like we have in > 'perf stat': > > -x, --field-separator > > when changing the locale from the default 'C' one? Hey, you even used it > above, but you chose a CSV char that is used in this locale, oops ;-) It's just for numbers (LC_NUMERIC), everything else is still localized. -Andi ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] perf, tools, stat: Force C numeric locale for CSV mode 2016-01-05 21:28 ` Andi Kleen @ 2016-01-05 23:27 ` Arnaldo Carvalho de Melo 2016-01-06 1:37 ` Andi Kleen 0 siblings, 1 reply; 6+ messages in thread From: Arnaldo Carvalho de Melo @ 2016-01-05 23:27 UTC (permalink / raw) To: Andi Kleen; +Cc: Jiri Olsa, jolsa, mingo, linux-kernel, Andi Kleen Em Tue, Jan 05, 2016 at 10:28:39PM +0100, Andi Kleen escreveu: > On Tue, Jan 05, 2016 at 06:17:57PM -0300, Arnaldo Carvalho de Melo wrote: > > Em Tue, Jan 05, 2016 at 10:05:01PM +0100, Jiri Olsa escreveu: > > > On Tue, Jan 05, 2016 at 11:17:45AM -0800, Andi Kleen wrote: > > > > From: Andi Kleen <ak@linux.intel.com> > > > > > > > > Some locales print floating point numbers with a comma instead of a dot. > > > > This causes problems with CSV mode because it causes extra false CSV > > > > fields. Force the numeric locale to be always C in CSV mode. > > > > > > > > Before: > > > > > > > > $ LC_ALL=pl_PL.utf8 perf stat -x, true > > > > 0,399472,,task-clock,399472,100,00 <---- extra bogus field > > > > ... > > > > > > > > After: > > > > $ LC_ALL=pl_PL.utf8 ./obj-perf/perf stat -x, true > > > > 0.338422,,task-clock,338422,100.00 > > > > > > > > Originally reported in https://github.com/andikleen/pmu-tools/issues/43 > > > > > > > > Signed-off-by: Andi Kleen <ak@linux.intel.com> > > > > > > Acked-by: Jiri Olsa <jolsa@kernel.org> > > > > I wonder what is that other tools do when stumbling on this, i.e. > > some other tool output that produces values that have the CSV character > > in it... > > Proper CSV supports escaping the separator by putting the whole field > into quotes. Unfortunately perf stat doesn't output proper CSV, > the event fields with commas are not quoted. Right, there is even an RFC for CSV, and for a decade already :-) https://tools.ietf.org/html/rfc4180 > I usually work around it by using -x\; instead > > But the , problem should be still fixed. Humm, what is the problem then of doing, for example in my case, with a LC_ALL=pt_BR, that uses commans as the decimal separator: LC_ALL=C ./obj-perf/perf stat -x, true I.e. disabling it using the existing shell mecanism? [root@zoo ~]# export LC_ALL=pt_BR [root@zoo ~]# perf stat -e cycles -x, usleep 1 1068190,,cycles,1007049,100,00 Bad, but its a side effect of needing to use a locale that uses the CSV separator in the decimal separator, disabling it for commands where I want that CSV separator does the trick: [root@zoo ~]# LC_ALL= perf stat -e cycles -x, usleep 1 895081,,cycles,840687,100.00 [root@zoo ~]# It is an inconvenience, yeah, having to prefix that for tools where I want to use the comma for the CSV separator, but disabling LC_NUMERIC, albeit better than my what I misparsed at first (forcing all locale to 'C') still looks too harsh :-\ Using -x\; looks sane and shorter tho, perhaps even -x:, to save one extra char. - Arnaldo > > Completely disabling the configured locale seems too harsh to me, aren't > > people used to changing the csv char via some option like we have in > > 'perf stat': > > > > -x, --field-separator > > > > when changing the locale from the default 'C' one? Hey, you even used it > > above, but you chose a CSV char that is used in this locale, oops ;-) > > It's just for numbers (LC_NUMERIC), everything else is still localized. > > -Andi ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] perf, tools, stat: Force C numeric locale for CSV mode 2016-01-05 23:27 ` Arnaldo Carvalho de Melo @ 2016-01-06 1:37 ` Andi Kleen 0 siblings, 0 replies; 6+ messages in thread From: Andi Kleen @ 2016-01-06 1:37 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Andi Kleen, Jiri Olsa, jolsa, mingo, linux-kernel, Andi Kleen > > I usually work around it by using -x\; instead > > > > But the , problem should be still fixed. > > Humm, what is the problem then of doing, for example in my case, with a > LC_ALL=pt_BR, that uses commans as the decimal separator: It's user unfriendly and unobvious. Also you end up with subtly broken files,. And it would also change the locale of the measured program which may not be intended. Plus the floating point values with comma cannot be parsed by programs that don't know your locale (that was the problem with pmu-tools) > Using -x\; looks sane and shorter tho, perhaps even -x:, to save one > extra char. Even with that there is the problem that you end up with numbers that cannot be parsed by locale unaware programs. CSV is intended for other programs so it shouldn't be messed up like this. -Andi ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2016-01-06 1:37 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-01-05 19:17 [PATCH] perf, tools, stat: Force C numeric locale for CSV mode Andi Kleen 2016-01-05 21:05 ` Jiri Olsa 2016-01-05 21:17 ` Arnaldo Carvalho de Melo 2016-01-05 21:28 ` Andi Kleen 2016-01-05 23:27 ` Arnaldo Carvalho de Melo 2016-01-06 1:37 ` Andi Kleen
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.