* fiologparser.py
@ 2016-05-24 10:35 Martin Steigerwald
2016-05-24 11:12 ` fiologparser.py Ben England
0 siblings, 1 reply; 14+ messages in thread
From: Martin Steigerwald @ 2016-05-24 10:35 UTC (permalink / raw)
To: fio; +Cc: Mark Nelson, Ben England, Jens Axboe
Hello Mark, Ben,
I found fiologparser.py in fio 2.10 and for now packaged it into
/usr/share/doc/fio. Yet I'd like to place it more prominently, in /usr/bin or
so. For that I would need it to have no script name extension (as required by
Debian Policy). Would you be fine with having it renamed to just fiologparser?
I can provide a patch to Jens.
Also it has no manpage, but a short intro in the script source itself. Do you
intend to provide a manpage? Otherwise I may have a go at it with help2man or
so once in a while. It appears to have quite a few options:
# ./fiologparser.py
usage: fiologparser.py [-h] [-i INTERVAL] [-d DIVISOR] [-f] [-A] [-a] [-s]
FILE [FILE ...]
fiologparser.py: error: too few arguments
Care to elaborate what these are doing (besides what is mentioned in script
header)?
It requires python-scipy it seems. Anything else?
Thank you,
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: fiologparser.py
2016-05-24 10:35 fiologparser.py Martin Steigerwald
@ 2016-05-24 11:12 ` Ben England
2016-05-24 14:04 ` fiologparser.py Mark Nelson
0 siblings, 1 reply; 14+ messages in thread
From: Ben England @ 2016-05-24 11:12 UTC (permalink / raw)
To: Martin Steigerwald; +Cc: fio, Mark Nelson, Jens Axboe

no objections here,

There is some additional documentation in the parse_args() routine in the
"help" keyword parameter to parser.add_argument.

parser.add_argument("FILE", help="collectl log output files to parse", nargs="+")

it's really a fio-generated latency log, not a collectl log

not aware of any dependencies except python-scipy. It doesn't work with
python3 yet.

Additional documentation for the -A option is in:

https://github.com/axboe/fio/pull/177

-ben
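
For anyone following along, the "help" strings Ben points to are what argparse prints for -h, so they can be read without opening the source. A minimal sketch, assuming only the FILE argument quoted above: the corrected help wording and the prog name are illustrative, and the script's other options are deliberately left out because their definitions are not shown in this thread.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the script's parse_args(); only the FILE
    # argument is taken from the thread, with the help text reworded to
    # say "fio latency log" as Ben suggests.
    parser = argparse.ArgumentParser(prog="fiologparser")
    parser.add_argument("FILE", help="fio latency log files to parse", nargs="+")
    return parser


def show_help():
    # format_help() returns the same text that "-h" would print.
    return build_parser().format_help()
```

Because nargs="+" requires at least one file, running with no arguments produces the "too few arguments" error shown in the first message.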
* Re: fiologparser.py
2016-05-24 11:12 ` fiologparser.py Ben England
@ 2016-05-24 14:04 ` Mark Nelson
2016-05-24 14:11 ` fiologparser.py Jens Axboe
2016-05-24 15:22 ` fiologparser.py Ben England
0 siblings, 2 replies; 14+ messages in thread
From: Mark Nelson @ 2016-05-24 14:04 UTC (permalink / raw)
To: Ben England, Martin Steigerwald; +Cc: fio, Mark Nelson, Jens Axboe

I'm totally fine with removing .py from the name unless anyone else has
objections. I hadn't really thought about writing a man page, but that
would be great if you want to give it a go.

Let's see if we can remove the numpy and scipy dependencies. It looks
like we are just using it for min/average/median/max/percentile
calculations. It would be nice if users didn't need anything other than
argparse.

Mark
* Re: fiologparser.py
2016-05-24 14:04 ` fiologparser.py Mark Nelson
@ 2016-05-24 14:11 ` Jens Axboe
2016-05-24 14:17 ` fiologparser.py Mark Nelson
2016-05-24 15:22 ` fiologparser.py Ben England
1 sibling, 1 reply; 14+ messages in thread
From: Jens Axboe @ 2016-05-24 14:11 UTC (permalink / raw)
To: Mark Nelson, Ben England, Martin Steigerwald; +Cc: fio, Mark Nelson

On 05/24/2016 08:04 AM, Mark Nelson wrote:
> I'm totally fine with removing .py from the name unless anyone else has
> objections. I hadn't really thought about writing a man page, but that
> would be great if you want to give it a go.
>
> Let's see if we can remove the numpy and scipy dependencies. It looks
> like we are just using it for min/average/median/max/percentile
> calculations. It would be nice if users didn't need anything other than
> argparse.

I haven't looked at it, but if it's just for min/av/median/etc and
percentiles, that can be trivially hand rolled.

--
Jens Axboe
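
To illustrate how little is needed for the hand-rolled version, here is a minimal sketch of min/average/median/max using only built-ins. The function name and the tuple it returns are assumptions made for this example, not the script's actual interface.

```python
def summarize(values):
    """Return (min, average, median, max) for a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2:
        # Odd count: the middle element is the median.
        median = ordered[mid]
    else:
        # Even count: average the two middle elements.
        median = (ordered[mid - 1] + ordered[mid]) / 2.0
    average = sum(ordered) / float(n)
    return ordered[0], average, median, ordered[-1]
```

For example, summarize([4, 1, 3, 2]) gives (1, 2.5, 2.5, 4).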
* Re: fiologparser.py
2016-05-24 14:11 ` fiologparser.py Jens Axboe
@ 2016-05-24 14:17 ` Mark Nelson
0 siblings, 0 replies; 14+ messages in thread
From: Mark Nelson @ 2016-05-24 14:17 UTC (permalink / raw)
To: Jens Axboe, Ben England, Martin Steigerwald; +Cc: fio, Mark Nelson

On 05/24/2016 09:11 AM, Jens Axboe wrote:
> On 05/24/2016 08:04 AM, Mark Nelson wrote:
>> I'm totally fine with removing .py from the name unless anyone else has
>> objections. I hadn't really thought about writing a man page, but that
>> would be great if you want to give it a go.
>>
>> Let's see if we can remove the numpy and scipy dependencies. It looks
>> like we are just using it for min/average/median/max/percentile
>> calculations. It would be nice if users didn't need anything other than
>> argparse.
>
> I haven't looked at it, but if it's just for min/av/median/etc and
> percentiles, that can be trivially hand rolled.
>

Exactly my thoughts!

Mark
* Re: fiologparser.py
2016-05-24 14:04 ` fiologparser.py Mark Nelson
2016-05-24 14:11 ` fiologparser.py Jens Axboe
@ 2016-05-24 15:22 ` Ben England
2016-05-24 15:28 ` fiologparser.py Jens Axboe
2016-05-25 7:20 ` fiologparser.py Martin Steigerwald
1 sibling, 2 replies; 14+ messages in thread
From: Ben England @ 2016-05-24 15:22 UTC (permalink / raw)
To: Mark Nelson; +Cc: Martin Steigerwald, fio, Mark Nelson, Jens Axboe

----- Original Message -----
> From: "Mark Nelson" <mark.a.nelson@gmail.com>
> To: "Ben England" <bengland@redhat.com>, "Martin Steigerwald" <ms@teamix.de>
> Cc: fio@vger.kernel.org, "Mark Nelson" <mnelson@redhat.com>, "Jens Axboe" <axboe@kernel.dk>
> Sent: Tuesday, May 24, 2016 10:04:14 AM
> Subject: Re: fiologparser.py
>
> Let's see if we can remove the numpy and scipy dependencies. It looks
> like we are just using it for min/average/median/max/percentile
> calculations. It would be nice if users didn't need anything other than
> argparse.
>

Just curious, why is scipy a problem? Is it because CBT isn't a package so
you don't get dependencies handled when you install it? You are correct,
it's easy to remove the dependencies, I just didn't know it was causing
problems for people. You can get percentiles from just sorting the sample
values and indexing into the array at the appropriate offset, I was just
trying to re-use existing classes.
* Re: fiologparser.py
2016-05-24 15:22 ` fiologparser.py Ben England
@ 2016-05-24 15:28 ` Jens Axboe
2016-05-24 15:35 ` fiologparser.py Ben England
2016-05-25 7:20 ` fiologparser.py Martin Steigerwald
1 sibling, 1 reply; 14+ messages in thread
From: Jens Axboe @ 2016-05-24 15:28 UTC (permalink / raw)
To: Ben England, Mark Nelson; +Cc: Martin Steigerwald, fio, Mark Nelson

On 05/24/2016 09:22 AM, Ben England wrote:
>
>
> ----- Original Message -----
>> From: "Mark Nelson" <mark.a.nelson@gmail.com>
>> To: "Ben England" <bengland@redhat.com>, "Martin Steigerwald" <ms@teamix.de>
>> Cc: fio@vger.kernel.org, "Mark Nelson" <mnelson@redhat.com>, "Jens Axboe" <axboe@kernel.dk>
>> Sent: Tuesday, May 24, 2016 10:04:14 AM
>> Subject: Re: fiologparser.py
>>
>> Let's see if we can remove the numpy and scipy dependencies. It looks
>> like we are just using it for min/average/median/max/percentile
>> calculations. It would be nice if users didn't need anything other than
>> argparse.
>>
>
> Just curious, why is scipy a problem? Is it because CBT isn't a
> package so you don't get dependencies handled when you install it? You
> are correct, it's easy to remove the dependencies, I just didn't know it
> was causing problems for people. You can get percentiles from just
> sorting the sample values and indexing into the array at the appropriate
> offset, I was just trying to re-use existing classes.

It's not necessarily a problem, but the less dependencies you have, the
easier it is for people to use. I do the same for fio, try to have as
few external dependencies as possible. Remember, not everybody is
running on Linux...

--
Jens Axboe
* Re: fiologparser.py
2016-05-24 15:28 ` fiologparser.py Jens Axboe
@ 2016-05-24 15:35 ` Ben England
2016-05-24 16:20 ` fiologparser.py Mark Nelson
0 siblings, 1 reply; 14+ messages in thread
From: Ben England @ 2016-05-24 15:35 UTC (permalink / raw)
To: Jens Axboe, Mark Nelson; +Cc: Martin Steigerwald, fio, Mark Nelson

OK we'll remove the dependencies, I still want to have the -A option
supported.

-ben
* Re: fiologparser.py
2016-05-24 15:35 ` fiologparser.py Ben England
@ 2016-05-24 16:20 ` Mark Nelson
2016-05-24 18:38 ` fiologparser.py Jeff Furlong
2016-05-24 20:47 ` fiologparser.py Ben England
0 siblings, 2 replies; 14+ messages in thread
From: Mark Nelson @ 2016-05-24 16:20 UTC (permalink / raw)
To: Ben England, Jens Axboe; +Cc: Martin Steigerwald, fio, Mark Nelson

I've got a version that removes the dependency and appears to return the
same values:

https://github.com/axboe/fio/pull/181

Going through the code though, it looks like the -A values are computed
differently than in the other original functions. In the original
get_contribution function, all samples within the bounds are counted,
along with samples that are only partially within the bounds. Each
sample is weighted based on the duration it overlapped with the sample
period:

https://github.com/axboe/fio/blob/master/tools/fiologparser.py#L195-L198

for -A, only the samples that are totally within the bounds are counted,
and are weighted equally despite how much of the period was spent in
that sample:

https://github.com/axboe/fio/blob/master/tools/fiologparser.py#L173

Thus if you look at say the average from -a:

fiologparser.py -a *clat*

1000, 11582.770
2000, 14033.844
3000, 17087.446
4000, 17946.245
5000, 14554.196
6000, 14407.804
7000, 15218.106
8000, 15157.951

the results are quite a bit different from -A:

fiologparser.py -A *clat* | tr -s "," " " | cut -f1,4 -d" "

0.000000 11902.719298
1000.000000 13247.750000
2000.000000 14270.549020
3000.000000 15092.192308
4000.000000 14127.472727
5000.000000 12880.137931
6000.000000 15296.735849
7000.000000 14857.306122
8000.000000 14854.766667

Mark
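
The difference Mark describes between the two averaging approaches can be reproduced on a toy data set. This is a minimal sketch, not the code from the script: the (start, end, value) tuple layout, the function names, and normalizing by the total overlap are all assumptions made for illustration.

```python
def overlap(start, end, lo, hi):
    """Duration that the sample [start, end] spends inside the interval [lo, hi]."""
    return max(0.0, min(end, hi) - max(start, lo))


def weighted_average(samples, lo, hi):
    """Average where each sample counts in proportion to its overlap with the interval."""
    total = 0.0
    weight = 0.0
    for start, end, value in samples:
        w = overlap(start, end, lo, hi)
        total += value * w
        weight += w
    return total / weight if weight else None


def contained_average(samples, lo, hi):
    """Average over samples lying entirely inside the interval, all weighted equally."""
    inside = [v for s, e, v in samples if s >= lo and e <= hi]
    return sum(inside) / float(len(inside)) if inside else None


# Hypothetical samples: (start_ms, end_ms, latency)
samples = [(0, 600, 100.0), (600, 1400, 300.0), (1400, 2000, 200.0)]
# weighted_average(samples, 0, 1000)  -> 180.0 (the straddling sample contributes 400 ms)
# contained_average(samples, 0, 1000) -> 100.0 (the straddling sample is ignored)
```

The sample that straddles the 1000 ms boundary is the only reason the two results differ, which matches the kind of divergence seen between the -a and -A outputs above.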
* RE: fiologparser.py
2016-05-24 16:20 ` fiologparser.py Mark Nelson
@ 2016-05-24 18:38 ` Jeff Furlong
2016-05-24 20:47 ` fiologparser.py Ben England
1 sibling, 0 replies; 14+ messages in thread
From: Jeff Furlong @ 2016-05-24 18:38 UTC (permalink / raw)
To: Mark Nelson, Ben England, Jens Axboe
Cc: Martin Steigerwald, fio@vger.kernel.org, Mark Nelson

You may want to consider using in place sort functions. list.sort() is more
efficient than s=sorted(list) and will greatly reduce DRAM usage for large
latency logs. On Linux you can check the python DRAM usage with
resource.getrusage(resource.RUSAGE_SELF).ru_maxrss, e.g. before and after
your sort statements.

Regards,
Jeff
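
Jeff's suggestion can be tried with a few lines. A minimal sketch, assuming a Unix-like system where the resource module is available; the helper names are made up for this example, and the actual memory savings depend on the size of the log.

```python
import resource


def peak_rss():
    # Peak resident set size of this process so far.
    # On Linux the value is reported in kilobytes.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def sort_copy(values):
    # sorted() builds a second list, so both copies are alive at once.
    return sorted(values)


def sort_in_place(values):
    # list.sort() reorders the existing list and returns None,
    # so no second list of the same length is created.
    values.sort()
    return values
```

Calling peak_rss() before and after each variant on a large list gives a rough comparison. Since ru_maxrss is a high-water mark, the in-place variant should be measured first, or each variant in a fresh process.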
* Re: fiologparser.py
2016-05-24 16:20 ` fiologparser.py Mark Nelson
2016-05-24 18:38 ` fiologparser.py Jeff Furlong
@ 2016-05-24 20:47 ` Ben England
2016-05-25 2:04 ` fiologparser.py Mark Nelson
1 sibling, 1 reply; 14+ messages in thread
From: Ben England @ 2016-05-24 20:47 UTC (permalink / raw)
To: Mark Nelson; +Cc: Jens Axboe, Martin Steigerwald, fio, Mark Nelson

Mark, I didn't notice the sample weighting code before. Weighting of
samples might work for averaging, but it doesn't work for percentiles,
min or max provided by -A option. I guess for min this won't be an issue
generally, since min-latency samples will probably fall entirely within a
time interval. But for max or higher percentiles it will *definitely* be
an issue. For example, a really high latency sample could be the max
for a whole range of time intervals.

To compute percentiles, we can sort (by response time) the samples that
*overlap the time interval* and then index into the python list something
like this (ignoring boundary conditions):

def get_percentile(sample_list, percentile):
    # sample_list is already sorted by response time
    return sample_list[len(sample_list) * percentile // 100]

min would be first array element in sample_list,
max would be last array element in sample_list.

And I'll definitely try using .sort instead of sorted(), thx Jeff.

make sense?

-ben
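
The sort-and-index idea from Ben's message leaves out the boundary conditions. A minimal sketch that handles them, assuming the list is already sorted; clamping the index so that the 100th percentile maps to the last element is one possible convention, not necessarily the one the script ends up using.

```python
def get_percentile(sorted_samples, percentile):
    """Pick the sample at the given percentile (0-100) from a sorted list."""
    if not sorted_samples:
        raise ValueError("no samples in this interval")
    # Integer index into the sorted list; a percentile of 100 would
    # otherwise point one past the end, so clamp to the last element.
    index = int(len(sorted_samples) * percentile // 100)
    return sorted_samples[min(index, len(sorted_samples) - 1)]
```

With samples sorted in place, min is get_percentile(samples, 0) and max is get_percentile(samples, 100).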
* Re: fiologparser.py 2016-05-24 20:47 ` fiologparser.py Ben England @ 2016-05-25 2:04 ` Mark Nelson 2016-05-25 8:58 ` fiologparser.py Martin Steigerwald 0 siblings, 1 reply; 14+ messages in thread From: Mark Nelson @ 2016-05-25 2:04 UTC (permalink / raw) To: Ben England; +Cc: Jens Axboe, Martin Steigerwald, fio, Mark Nelson On 05/24/2016 03:47 PM, Ben England wrote: > Mark, I didn't notice the sample weighting code before. Weighting of samples might work for averaging, but it doesn't work for percentiles, min or max provided by -A option. I guess for min this won't be an issue generally, since min-latency samples will probably fall entirely within a time interval. But for max or higher percentiles it will *definitely* be an issue. For example, a really high latency sample could be the max for a whole range of time intervals. I went back and reworked the print and per-interval functions so that they are part of a Printer class and Interval class respectively. It cleaned the code up pretty nicely. I was also able to integrate the "-A" code to use a lot of the existing statistics and formatting code. It now supports the "-d" flag for example. As part of that I took a stab at making a weighted implementation for percentiles (and as a result median). The basic idea is to sort samples by value but then iterate over samples by weight to close in on the percentile boundary. Once the samples that straddle the percentile boundary are found, take a weighted average of the two samples based inversely on their closeness to the boundary. I do think it's really important to count samples with overlapping boundaries. In the min case you otherwise disregard the min values that are spread over long time durations (ie when IOs stall). In the max case, you potentially loose out on high throughput samples at edge boundaries. I tried the old code and new code on a sample I had. There's a pretty big difference in the number of samples utilized (or partially utilized) per interval. 
old: > start-time, samples, min, avg, median, 90%, 95%, 99%, max > 0.000000, 8, 169631.000000, 321862.500000, 363155.000000, 417325.500000, 418426.250000, 419306.850000, 419527.000000 > 1000.000000, 8, 217273.000000, 324114.750000, 262548.000000, 449062.800000, 456610.900000, 462649.380000, 464159.000000 > 2000.000000, 8, 252437.000000, 351356.000000, 309912.500000, 468551.400000, 470426.700000, 471926.940000, 472302.000000 > 3000.000000, 8, 147123.000000, 315987.375000, 295690.500000, 451860.200000, 457549.100000, 462100.220000, 463238.000000 > 4000.000000, 8, 152847.000000, 325890.875000, 352656.000000, 442708.300000, 446184.150000, 448964.830000, 449660.000000 > 5000.000000, 7, 152547.000000, 333048.428571, 285577.000000, 465428.800000, 469807.900000, 473311.180000, 474187.000000 New: > end-time, samples, min, avg, median, 90%, 95%, 99%, max > 1000.000, 16, 169631.000, 321863.134, 298029.136, 451210.153, 455823.097, 457210.922, 457836.000 > 2000.000, 24, 184826.000, 341609.250, 285337.006, 462780.936, 465093.032, 465706.770, 466011.000 > 3000.000, 24, 88867.000, 312228.872, 298560.686, 466730.845, 469928.578, 471566.768, 472302.000 > 4000.000, 24, 88867.000, 309359.155, 278879.166, 458966.926, 462427.823, 462987.178, 463238.000 > 5000.000, 24, 137593.000, 326864.166, 317893.305, 449518.978, 455424.867, 459333.936, 461407.000 > 6000.000, 23, 131237.000, 340960.370, 319615.167, 460959.116, 468513.304, 472427.275, 474187.000 Code is here if anyone wants to critique/flame: https://github.com/markhpc/fio/commit/19943e4dce34233bc776ed868d12c4c03b5f98ec Mark > > To compute percentiles, we can sort (by response time) the samples that *overlap the time interval* and then index into the python list something like this (ignoring boundary conditions): > > def get_percentile(list, percentile): > return sample_list[len(list) * percentile / 100] > > min would be first array element in sample_list, > max would be last array element in sample_list. 
>
> And I'll definitely try using .sort instead of sorted(), thx Jeff.
>
> make sense?
>
> -ben
>
>
> ----- Original Message -----
>> From: "Mark Nelson" <mark.a.nelson@gmail.com>
>> To: "Ben England" <bengland@redhat.com>, "Jens Axboe" <axboe@kernel.dk>
>> Cc: "Martin Steigerwald" <ms@teamix.de>, fio@vger.kernel.org, "Mark Nelson" <mnelson@redhat.com>
>> Sent: Tuesday, May 24, 2016 12:20:19 PM
>> Subject: Re: fiologparser.py
>>
>> I've got a version that removes the dependency and appears to return the
>> same values:
>>
>> https://github.com/axboe/fio/pull/181
>>
>> Going through the code though, it looks like the -A values are computed
>> differently than in the other original functions. In the original
>> get_contribution function, all samples within the bounds are counted,
>> along with samples that are only partially within the bounds. Each
>> sample is weighted based on the duration it overlapped with the sample
>> period:
>>
>> https://github.com/axboe/fio/blob/master/tools/fiologparser.py#L195-L198
>>
>> for -A, only the samples that are totally within the bounds are counted,
>> and are weighted equally despite how much of the period was spent in
>> that sample:
>>
>> https://github.com/axboe/fio/blob/master/tools/fiologparser.py#L173
>>
>> Thus if you look at say the average from -a:
>>
>> fiologparser.py -a *clat*
>>
>> 1000, 11582.770
>> 2000, 14033.844
>> 3000, 17087.446
>> 4000, 17946.245
>> 5000, 14554.196
>> 6000, 14407.804
>> 7000, 15218.106
>> 8000, 15157.951
>>
>> the results are quite a bit different from -A:
>>
>> fiologparser.py -A *clat* | tr -s "," " " | cut -f1,4 -d" "
>>
>> 0.000000 11902.719298
>> 1000.000000 13247.750000
>> 2000.000000 14270.549020
>> 3000.000000 15092.192308
>> 4000.000000 14127.472727
>> 5000.000000 12880.137931
>> 6000.000000 15296.735849
>> 7000.000000 14857.306122
>> 8000.000000 14854.766667
>>
>> Mark
>>
>>
>> On 05/24/2016 10:35 AM, Ben England wrote:
>>> OK we'll remove the dependencies, I still want to have the -A option
>>> supported.
>>> -ben
>>>
>>> ----- Original Message -----
>>>> From: "Jens Axboe" <axboe@kernel.dk>
>>>> To: "Ben England" <bengland@redhat.com>, "Mark Nelson" <mark.a.nelson@gmail.com>
>>>> Cc: "Martin Steigerwald" <ms@teamix.de>, fio@vger.kernel.org, "Mark Nelson" <mnelson@redhat.com>
>>>> Sent: Tuesday, May 24, 2016 11:28:39 AM
>>>> Subject: Re: fiologparser.py
>>>>
>>>> On 05/24/2016 09:22 AM, Ben England wrote:
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: "Mark Nelson" <mark.a.nelson@gmail.com>
>>>>>> To: "Ben England" <bengland@redhat.com>, "Martin Steigerwald" <ms@teamix.de>
>>>>>> Cc: fio@vger.kernel.org, "Mark Nelson" <mnelson@redhat.com>, "Jens Axboe" <axboe@kernel.dk>
>>>>>> Sent: Tuesday, May 24, 2016 10:04:14 AM
>>>>>> Subject: Re: fiologparser.py
>>>>>>
>>>>>> Let's see if we can remove the numpy and scipy dependencies. It looks
>>>>>> like we are just using it for min/average/median/max/percentile
>>>>>> calculations. It would be nice if users didn't need anything other than
>>>>>> argparse.
>>>>>>
>>>>>
>>>>> Just curious, why is scipy a problem? Is it because CBT isn't a
>>>>> package so you don't get dependencies handled when you install it? You
>>>>> are correct, it's easy to remove the dependencies, I just didn't know it
>>>>> was causing problems for people. You can get percentiles from just
>>>>> sorting the sample values and indexing into the array at the appropriate
>>>>> offset, I was just trying to re-use existing classes.
>>>>
>>>> It's not necessarily a problem, but the less dependencies you have, the
>>>> easier it is for people to use. I do the same for fio, try to have as
>>>> few external dependencies as possible. Remember, not everybody is
>>>> running on Linux...
>>>>
>>>> --
>>>> Jens Axboe
>>>>
>>>>
>>
^ permalink raw reply [flat|nested] 14+ messages in thread
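The two averaging approaches Mark describes, and the sort-and-index percentile Ben mentions, can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in fiologparser.py: the (start_ms, end_ms, value) sample layout and all three function names are hypothetical, and the nearest-rank rule is only one of several common percentile definitions.

```python
import math

# Hypothetical sample layout: (start_ms, end_ms, value).
# Not the data structure used by fiologparser.py.


def overlap_weighted_average(samples, start, end):
    """Average in which each sample counts in proportion to how long
    it overlaps the interval [start, end)."""
    total = 0.0
    weight = 0.0
    for s_start, s_end, value in samples:
        overlap = min(s_end, end) - max(s_start, start)
        if overlap > 0:
            total += value * overlap
            weight += overlap
    return total / weight if weight else None


def contained_only_average(samples, start, end):
    """Average over samples lying entirely inside the interval, each
    counted once regardless of its duration."""
    inside = [v for s_start, s_end, v in samples
              if s_start >= start and s_end <= end]
    return sum(inside) / len(inside) if inside else None


def percentile(values, pct):
    """Nearest-rank percentile using only sorting and indexing,
    so no numpy or scipy is needed."""
    ordered = sorted(values)
    if not ordered:
        return None
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


# One sample fits inside the first second, the other straddles its end.
samples = [(0, 500, 10.0), (500, 1500, 30.0)]
print(overlap_weighted_average(samples, 0, 1000))  # 20.0
print(contained_only_average(samples, 0, 1000))    # 10.0
print(percentile([5, 1, 3, 2, 4], 50))             # 3
```

With the same two samples the two averages already differ by a factor of two, which is the kind of gap visible between the -a and -A listings quoted above.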
* Re: fiologparser.py
2016-05-25 2:04 ` fiologparser.py Mark Nelson
@ 2016-05-25 8:58 ` Martin Steigerwald
0 siblings, 0 replies; 14+ messages in thread
From: Martin Steigerwald @ 2016-05-25 8:58 UTC (permalink / raw)
To: Mark Nelson; +Cc: Ben England, Jens Axboe, fio, Mark Nelson
On Tuesday, 24 May 2016 21:04:13 CEST Mark Nelson wrote:
> On 05/24/2016 03:47 PM, Ben England wrote:
> > Mark, I didn't notice the sample weighting code before. Weighting of
> > samples might work for averaging, but it doesn't work for percentiles,
> > min or max provided by -A option. I guess for min this won't be an issue
> > generally, since min-latency samples will probably fall entirely within a
> > time interval. But for max or higher percentiles it will *definitely* be
> > an issue. For example, a really high latency sample could be the max
> > for a whole range of time intervals.
>
> I went back and reworked the print and per-interval functions so that
> they are part of a Printer class and Interval class respectively. It
> cleaned the code up pretty nicely. I was also able to integrate the
> "-A" code to use a lot of the existing statistics and formatting code.
> It now supports the "-d" flag for example.
While at it, Mark, could you also do the renaming to fiologparser?
Otherwise I'd prepare a patch, but then we risk a trivial merge conflict :)
Thanks,
Martin
^ permalink raw reply [flat|nested] 14+ messages in thread
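Ben's point about max can be made concrete with a small sketch. It assumes, purely for illustration, that each sample is a (start_ms, end_ms, latency_ms) tuple and that a sample is attributed to every interval it overlaps; the function names are hypothetical and this is not how fiologparser.py is implemented.

```python
# Hypothetical sample layout: (start_ms, end_ms, latency_ms).


def max_overlapping(samples, interval_ms, n_intervals):
    """Per-interval max, counting a sample in every interval it overlaps."""
    result = []
    for i in range(n_intervals):
        lo, hi = i * interval_ms, (i + 1) * interval_ms
        hits = [lat for start, end, lat in samples if start < hi and end > lo]
        result.append(max(hits) if hits else None)
    return result


def max_contained(samples, interval_ms, n_intervals):
    """Per-interval max, counting only samples fully inside the interval."""
    result = []
    for i in range(n_intervals):
        lo, hi = i * interval_ms, (i + 1) * interval_ms
        hits = [lat for start, end, lat in samples if start >= lo and end <= hi]
        result.append(max(hits) if hits else None)
    return result


# Three short I/Os plus one that takes 3000 ms and spans four intervals.
samples = [(0, 100, 100), (200, 300, 100), (500, 3500, 3000), (3600, 3700, 100)]
print(max_overlapping(samples, 1000, 4))  # [3000, 3000, 3000, 3000]
print(max_contained(samples, 1000, 4))    # [100, None, None, 100]
```

In this toy data the slow sample is the max of all four intervals under the overlap rule, while the contained-only rule never reports it at all, since it does not fit inside any single interval.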
* Re: fiologparser.py
2016-05-24 15:22 ` fiologparser.py Ben England
2016-05-24 15:28 ` fiologparser.py Jens Axboe
@ 2016-05-25 7:20 ` Martin Steigerwald
1 sibling, 0 replies; 14+ messages in thread
From: Martin Steigerwald @ 2016-05-25 7:20 UTC (permalink / raw)
To: Ben England; +Cc: Mark Nelson, fio, Mark Nelson, Jens Axboe
On Tuesday, 24 May 2016 11:22:06 CEST Ben England wrote:
> ----- Original Message -----
>
> > From: "Mark Nelson" <mark.a.nelson@gmail.com>
> > To: "Ben England" <bengland@redhat.com>, "Martin Steigerwald" <ms@teamix.de>
> > Cc: fio@vger.kernel.org, "Mark Nelson" <mnelson@redhat.com>, "Jens Axboe" <axboe@kernel.dk>
> > Sent: Tuesday, May 24, 2016 10:04:14 AM
> > Subject: Re: fiologparser.py
> >
> > Let's see if we can remove the numpy and scipy dependencies. It looks
> > like we are just using it for min/average/median/max/percentile
> > calculations. It would be nice if users didn't need anything other than
> > argparse.
>
> Just curious, why is scipy a problem? Is it because CBT isn't a package so
> you don't get dependencies handled when you install it? You are correct,
> it's easy to remove the dependencies, I just didn't know it was causing
> problems for people. You can get percentiles from just sorting the sample
> values and indexing into the array at the appropriate offset, I was just
> trying to re-use existing classes.
There is a very clear reason and that is:
After this operation, 37.9 MB of additional disk space will be used.
Yes, that's right, that is the size of the python-scipy package in Debian
Unstable.
So I am glad you intend to remove the dependency, as I really would like to
move fiologparser to /usr/bin, but that means currently python-scipy would
need to become a hard dependency and I am probably not going to do that. So
for fiologparser to move into a more prominent location, I'd need it to be
free of that dependency.
Thank you,
--
Martin Steigerwald | Trainer
teamix GmbH
Südwestpark 43
90449 Nürnberg
Tel.: +49 911 30999 55 | Fax: +49 911 30999 99
mail: martin.steigerwald@teamix.de | web: http://www.teamix.de | blog: http://blog.teamix.de
Nuremberg District Court (Amtsgericht Nürnberg), HRB 18320 | Managing directors: Oliver Kügow, Richard Müller
^ permalink raw reply [flat|nested] 14+ messages in thread