From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751965AbWFLN2Q (ORCPT ); Mon, 12 Jun 2006 09:28:16 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751966AbWFLN2Q (ORCPT ); Mon, 12 Jun 2006 09:28:16 -0400 Received: from e36.co.us.ibm.com ([32.97.110.154]:1741 "EHLO e36.co.us.ibm.com") by vger.kernel.org with ESMTP id S1751965AbWFLN2P (ORCPT ); Mon, 12 Jun 2006 09:28:15 -0400 Message-ID: <448D6BEA.9090700@watson.ibm.com> Date: Mon, 12 Jun 2006 09:28:10 -0400 From: Shailabh Nagar User-Agent: Mozilla Thunderbird 1.0.7 (Windows/20050923) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Martin Peschke CC: Jay Lan , balbir@in.ibm.com, Andrew Morton , linux-kernel@vger.kernel.org, Chris Sturtivant , Peter Chubb Subject: Re: Merge of per task delay accounting (was Re: 2.6.18 -mm merge plans) References: <20060604135011.decdc7c9.akpm@osdl.org> <4484D25E.4020805@in.ibm.com> <4486017F.8030308@watson.ibm.com> <4486073B.2040102@sgi.com> <448D57CD.4030007@de.ibm.com> In-Reply-To: <448D57CD.4030007@de.ibm.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Martin Peschke wrote: > Jay Lan wrote: > >> Shailabh Nagar wrote: >> >>> Balbir Singh wrote: >> > >>> Andrew, >>> >>> The only other new set of patches to be discussed in this context >>> are the >>> statistics-infrastructure patches from Martin Peschke. >>> >>> That infrastructure cannot meet the needs of delay accounting, CSA >>> etc. because >>> - it only provides "user pull" model of getting stats whereas >>> "kernel push" is >>> needed for delay accounting >> >> >> Doesn't taskstats interface provide "user pull" request-reply model >> also? Serious accounting needs to push accounting data as soon as >> possible. >> >>> - it uses a relatively slow interface unsuitable for high volumes of >>> data. >> > > By design. > > I think it would be fatal to report every event relevant > to statistical data gathering up to user space. It's fine to have > the kernel maintain counters and to provide preprocessed data. > > Given that, is there a need for a high-speed interface for a > huge amount of unprocessed statistical data? Broadly speaking, yes. Inserting policy into the kernel will no doubt save the data being sent to userspace but also - limits flexibility of what userspace can do with it and - adds to the kernel code base unnecessarily Specifically for taskstats, there is a need for a high-speed interface because of the potential volume of data resulting from - large number of tasks in a single kernel - high frequency of task exits Some kernel-based preprocessing, such as per-tgid aggregation, can help cut down the volume but as long as we have a need for getting per-task data, an efficient interface will matter, atleast for our needs. > > However, the user interface is a just one building brick, > which can be enhanced or replaced with moderate effort, if > there is a need. True. This is not to suggest statistical infrastructure's interface choice isn't correct.. just that its not enough for the needs we seek to serve. A filesystem based interface has plenty of usability benefits so its primarily a question of which stats you want to export using its interface. > > >> Each statistic has its own definition, > > Allowing users to restrict accounting to what they need in their > particular case. Sensible defaults are usually available. > >>> needs to be read separately using ASCII, >>> reading data continuously means open/read/close each time.....all of >>> which is not very conducive to large structures being sent to >>> userspace. >> > > Debugfs file are fine for larger structures. > > Unless one keeps reading statistics dozens of times per second, > I don't see an issue with that. Since delay accounting stats can be exploited for resource management at user space, if one wants to get data for all tasks/processes periodically, it could add up to a fairly high demand for user<->kernel bandwidth even if the frequency need for reading one task's stats isn't that high. Its a question of scalability as number of tasks increase. For systemwide stats, your point is well taken...unlikely to be an issue, unless of course, one needs to read lots of them. > > The question is: what are the requirements to be covered? Yup...I think the infrastructures are serving differing needs. > >> Yes, i second the point. It won't be able to catch up the traffic. >> >>> - its oriented towards sampled data whereas taskstats isn't. >>> >>> So, we have a good consensus from existing/potential users of >>> taskstats and would >>> very much appreciate it being included in 2.6.18. >> >> >> Andrew, it has become clear that the community wants to see accounting >> data processing being moved to userspace. Thus there is a need for a >> common accounting interface to provide minimal works at kernel (via >> hooks at fork and exit) and deliver data to userspace. > > > Both, the statistics infrastructure on behalf of its exploiters as well > as the exploiters of the taskstats interface do data preprocessing, > that is, maintain counters in the kernel. > User space counters won't perform, of course. > > AFAICS, actual differences are: > - triggers for data delivery to user space > (statistics infrastructure: when user reads statistics through file, > taskstats: on certain task related events, right?) Taskstats data delivery is triggered by - user asking for data (i.e. akin to reading through a file, only done through a command-response interface) - on task exit event > - and, therewith, frequency of data delivery to user space > > > Martin --Shailabh