From: Jay Lan <jlan@engr.sgi.com>
To: Shailabh Nagar <nagar@watson.ibm.com>
Cc: Andrew Morton <akpm@osdl.org>,
balbir@in.ibm.com, csturtiv@sgi.com,
linux-kernel@vger.kernel.org
Subject: Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats
Date: Fri, 23 Jun 2006 13:00:37 -0700
Message-ID: <449C4865.7040706@engr.sgi.com>
In-Reply-To: <449C3897.70001@watson.ibm.com>
Shailabh Nagar wrote:
>Jay Lan wrote:
>
>>Shailabh Nagar wrote:
>>
>>
>>>Hi Andrew,
>>>
>>>Two developments on the tgid overhead issue:
>>>
>>>1. The latest results show that overhead is significant
>>>only when the exit rate exceeds roughly 1000 threads/second.
>>>
>>>
>>I worked with Shailabh this week to run various tests and
>>debugging as he requested. I was pulled off onto an urgent
>>task yesterday and was surprised to see this posted this morning...
>>
>
>Sorry...didn't mean to surprise. I sent you the data last night
>privately with request for comments.
>
Yeah, I saw it, but did not have time to respond before you posted.
>Your testing and help have been very valuable and helped uncover
>two issues: the locking patch (sent separately) and also a
>dependency between taskstats and delay accounting (for which another
>patch is being sent out shortly).
>
>
>>Let's slow it down please. My last testing (after your fix in
>>#2 below) still showed 109% overhead in system time.
>>
>
>True, but my point is that the overhead is at an extremely
>high exit rate. I think the test in which you saw 109% overhead
>ran 5000 iterations of 1000 threads and had an elapsed time of
>294 seconds (with tgid turned off) giving an exit rate of roughly
>8500 exits/second, right ?
>
>My results confirm the high overhead at these exit rates. In fact,
>on the system I used, I see 649% overhead for the 2200 exits/second case
>(even higher than yours), but the point is whether that exit rate
>is a valid design criterion.
>
Agreed. That is indeed the deciding factor. The exit rate in the lab
does not help answer this question. I need input from the field.
>
>>Also, the per-thread-group processing increases the rate of ENOBUFS
>>errors at the receiver.
>>
>
>Could you quantify please ? Also, pls list the exit rate at which
>this happens.
>
I have neither posted nor quantified it, because I must bring down the error
count, or we (CSA) will have to explore a different approach. So any comparison
of these numbers at this point does not really help. Again, if the exit rate
is unrealistic, then I need to run a different set of tests. What
sleep_factor did you use? Are those printf() calls in your new test program
essential?
>
>>I need to check with other guys to find out if 1000 threads/sec
>>is indeed unrealistic in our customers' environments. A good
>>design should allow a mechanism to turn off the penalty due to
>>a feature that is not common to everybody. I do not understand
>>your objection.
>>
>
>Only objection is that design shouldn't cater to a case that is
>extremely unlikely in practice. In most situations, there is no
>or insignificant penalty.
>
If this type of exit rate can happen even once a day, the surge may cause
loss of accounting data for other processes. Again, I do not have data
to say either way yet. But I would rather spend time working on
the ENOBUFS error than running all sorts of tests to argue over the
per-TG switch.
Regards,
- jay
>Perhaps others on the list can also chip in on whether this kind of exit
>rate is realistic in some scenarios and where the performance
>penalty matters (i.e. not system shutdown etc.)
>
>Please note that the exits have to be for multithreaded apps, not
>single-threaded ones for which tgid sending is already turned off.
>
>Thanks,
>Shailabh
>
>
>>Regards,
>> - jay
>>
>>
>>
>>>2. A new patch that modifies the locking used within taskstats
>>>brings down the overhead of the extreme case quite a bit.
>>>I'll submit the patch shortly in a separate mail.
>>>
>>>To get back to the effect of exit rate, I modified the fork+exit
>>>benchmark to vary the rate at which exits happened and
>>>ran tests on a 4-way 1.4 GHz x86_64 box. The kernel was 2.6.17
>>>with the delay accounting/taskstats patches from 2.6.17-mm1 plus the new
>>>locking patch mentioned in 2. above.
>>>
>>>The results show that the differential between tgid on and off
>>>starts becoming significant once the exit rate crosses roughly 1000
>>>threads/second. Below that exit rate, the difference is negligible.
>>>Above it, the difference starts climbing rapidly.
>>>
>>>So I guess the question is whether this rate of exit is representative
>>>enough of real life to warrant making any more changes to the existing
>>>patchset, beyond the locking changes in 2. above.
>>>
>>>From my limited experience, I think this is too high an exit rate
>>>to be worrying about overhead.
>>>
>>>
>>>          %ovhd of tgid on over off
>>>              (higher is worse)
>>>
>>>Exit     User      Sys     Elapsed
>>>Rate     Time      Time    Time
>>>
>>>2283     25.76   649.41    -0.14
>>>1193    -10.53    88.81    -0.12
>>> 963    -11.90     3.28    -0.10
>>> 806     -8.54    -0.84     0.16
>>> 694     -4.41     2.38     0.03
>>>
>>>Exit Rate: units are threads exiting per second.
>>>Calculated by (#threads_forked+exited)/(elapsed_time)/2.
>>>Since the app pretty much does only thread create and exit for 10000
>>>threads (1000 threads, 10 iterations), this is a good measure
>>>of exit rate.
>>>
>>>%diff in user, sys, elapsed times calculated using
>>>(tgid_on - tgid_off)/tgid_off * 100
>>>where tgid_on/off times are reported by /usr/bin/time as before.
>>>
>>>Each data point for tgid_on and tgid_off was an average
>>>of 10 runs of the fork+exit benchmark.
>>>The rate of exits was controlled by delaying the individual
>>>threads through a usleep before being allowed to exit.
>>>
>>>Machine was 4-way 1.6GHz x86_64 Opteron.
>>>
>>>"exit_recv -w", the user program consuming the stats, was running
>>>on the side, reading the stats but not writing to a file or
>>>printing to screen.
>>>
>>>
>>
>
>
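
For reference, the exit-rate and %overhead formulas quoted above can be
sketched as a small helper. This is only an illustration in Python: the
function names are made up here, and the sample timings are hypothetical
values, not measurements from this thread.

```python
def exit_rate(threads, elapsed_s):
    """Threads exiting per second, following the quoted formula:
    (#threads_forked + #threads_exited) / elapsed_time / 2."""
    forked_plus_exited = 2 * threads  # each thread is forked once and exits once
    return forked_plus_exited / elapsed_s / 2


def pct_overhead(tgid_on, tgid_off):
    """Percent overhead of tgid on over off: (on - off) / off * 100."""
    return (tgid_on - tgid_off) / tgid_off * 100.0


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the arithmetic.
    print(exit_rate(threads=10_000, elapsed_s=5.0))   # 2000.0
    print(pct_overhead(tgid_on=3.0, tgid_off=2.0))    # 50.0
```

A negative result from pct_overhead simply means the tgid-on run took less
time than the tgid-off run, as in several rows of the quoted table.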