public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Shailabh Nagar <nagar@watson.ibm.com>
To: Jay Lan <jlan@engr.sgi.com>
Cc: Andrew Morton <akpm@osdl.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	elsa-devel <elsa-devel@lists.sourceforge.net>,
	LSE <lse-tech@lists.sourceforge.net>,
	ckrm-tech <ckrm-tech@lists.sourceforge.net>
Subject: Re: [Patch 6/6] Delay accounting: Connector interface
Date: Wed, 04 Jan 2006 16:31:22 -0500	[thread overview]
Message-ID: <43BC3EAA.4010801@watson.ibm.com> (raw)
In-Reply-To: <43BC1C43.9020101@engr.sgi.com>

Jay Lan wrote:

>Shailabh Nagar wrote:
>  
>
>>Changes since 11/14/05:
>>
>>- explicit versioning of statistics data returned
>>- new command type for returning per-tgid stats
>>- for cpu running time, use tsk->sched_info.cpu_time (collected by schedstats)
>>
>>First post 11/14/05
>>
>>delayacct-connector.patch
>>    
>>
<snip>

>>Index: linux-2.6.15-rc7/include/linux/connector.h
>>===================================================================
>>--- linux-2.6.15-rc7.orig/include/linux/connector.h
>>+++ linux-2.6.15-rc7/include/linux/connector.h
>>@@ -35,6 +35,10 @@
>>#define CN_IDX_CIFS			0x2
>>#define CN_VAL_CIFS                     0x1
>>
>>+/* Statistics connector ids */
>>+#define CN_IDX_STATS			0x2
>>+#define CN_VAL_STATS			0x2
>>+
>>#define CN_NETLINK_USERS		1
>>
>>/*
>>Index: linux-2.6.15-rc7/drivers/connector/cn_stats.c
>>===================================================================
>>--- /dev/null
>>+++ linux-2.6.15-rc7/drivers/connector/cn_stats.c
>>@@ -0,0 +1,243 @@
>>+/*
>>+ * cn_stats.c - Task statistics connector
>> 
>>    
>>
>
>It looks like you create this file to report task stats through connector
>and you include stats for delay accounting you need as a starter.
>  
>

Yes. The idea is that the connector interface will be used by everyone 
interested in exporting per-task/process
stats. The stats that I care about right now are delay stats and those 
are being exported now. Adding to the
exported set (by extending struct cnstats_data and possibly, struct 
cnstats_cmd) can happen in future.

>This sounds like a good approach to me. However, we need to move
>where cnstats_exit_connector(tsk) is invoked. See below.
>  
>

>>+ *
>>+ * Copyright (C) Shailabh Nagar, IBM Corp. 2005
>>+ * Based on work by Matt Helsley, Nguyen Anh Quynh and Guillaume Thouvenin
>>+ *
>>+ * This program is free software; you can redistribute it and/or modify
>>+ * it under the terms of the GNU General Public License as published by
>>+ * the Free Software Foundation; either version 2 of the License, or
>>+ * (at your option) any later version.
>>+ *
>>+ * This program is distributed in the hope that it will be useful,
>>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>+ * GNU General Public License for more details.
>>+ *
>>+ */
>>+
>>+#include <linux/module.h>
>>+#include <linux/kernel.h>
>>+#include <linux/init.h>
>>+#include <linux/cn_stats.h>
>>+#include <asm/atomic.h>
>>+
>>+#define CN_STATS_NOCPU	(-1)
>>+#define CN_STATS_NOACK	0
>>+#define CN_STATS_MSG_SIZE (sizeof(struct cn_msg) + sizeof(struct cnstats))
>>+
>>+static atomic_t cnstats_num_listeners = ATOMIC_INIT(0);
>>+static struct cb_id cnstats_id = { CN_IDX_STATS, CN_VAL_STATS };
>>+/* cnstats_counts is used as the sequence number of the netlink message */
>>+static DEFINE_PER_CPU(__u32, cnstats_counts) = { 0 };
>>+
>>+void cnstats_init_msg(struct cn_msg *msg, int seq, int ack)
>>+{
>>+	memset(msg, 0, CN_STATS_MSG_SIZE);
>>+	msg->seq = seq;
>>+	msg->ack = ack;
>>+	memcpy(&msg->id, &cnstats_id, sizeof(msg->id));
>>+	msg->len = sizeof(struct cnstats);
>>+}
>>+
>>+/*
>>+ * Accumalate one task's statistics
>>+ *
>>+ */
>>+static inline void cnstats_add_tsk_data(struct cnstats_data *d, struct task_struct *tsk)
>>+{
>>+	uint64_t tmp;
>>+
>>+	d->pid = tsk->pid;
>>+	d->tgid = tsk->tgid;
>>+
>>+	/* zero XXX_total,non-zero XXX_count implies XXX stat overflowed */
>>+#ifdef CONFIG_SCHEDSTATS
>>+	d->cpu_count += tsk->sched_info.pcnt;
>>+	tmp = d->cpu_run_total + jiffies_to_usecs(tsk->sched_info.cpu_time)*1000;
>>+	d->cpu_run_total = (tmp < d->cpu_run_total)? 0:tmp;
>>+	tmp = d->cpu_delay_total + jiffies_to_usecs(tsk->sched_info.run_delay)*1000;
>>+	d->cpu_delay_total = (tmp < d->cpu_delay_total)? 0:tmp;
>>+#else
>>+	/* Non-zero XXX_total,zero XXX_count implies XXX stat unavailable */
>>+	d->cpu_count = 0;
>>+	d->cpu_run_total = d->cpu_delay_total = CNSTATS_NOCPUSTATS;
>>+#endif
>>+	spin_lock(&tsk->delays.lock);
>>+	tmp = d->blkio_delay_total + tsk->delays.blkio_delay;
>>+	d->blkio_delay_total = (tmp < d->blkio_delay_total)? 0:tmp;
>>+	tmp = d->swapin_delay_total + tsk->delays.swapin_delay;
>>+	d->swapin_delay_total = (tmp < d->swapin_delay_total)? 0:tmp;
>>+	d->blkio_count += tsk->delays.blkio_count;
>>+	d->swapin_count += tsk->delays.swapin_count;
>>+	spin_unlock(&tsk->delays.lock);
>>+}
>>+
>>+/*
>>+ * Send acknowledgement (error)
>>+ *
>>+ */
>>+static void cnstats_send_ack(int err, int rcvd_seq, int rcvd_ack)
>>+{
>>+	__u8 buffer[CN_STATS_MSG_SIZE];
>>+	struct cn_msg *msg = (struct cn_msg *)buffer;
>>+	struct cnstats *c = (struct cnstats *)msg->data;
>>+
>>+	cnstats_init_msg(msg, rcvd_seq, rcvd_ack+1);
>>+	c->version = CNSTATS_DATA_VERSION;
>>+	c->outtype = CNSTATS_DATA_NONE;
>>+
>>+	/* Following allows other functions to continue returning -ve errors */
>>+	c->data.ack.err = abs(err);
>>+
>>+	cn_netlink_send(msg, CN_IDX_STATS, GFP_KERNEL);
>>+}
>>+
>>+/*
>>+ * Send one pid's stats
>>+ *
>>+ */
>>+static int cnstats_send_pid(pid_t pid, int seq, int ack)
>>+{
>>+	__u8 buffer[CN_STATS_MSG_SIZE];
>>+	struct cn_msg *msg = (struct cn_msg *)buffer;
>>+	struct cnstats *c = (struct cnstats *)msg->data;
>>+	struct cnstats_data *d = (struct cnstats_data *)&c->data.stats;
>>+	struct task_struct *tsk;
>>+
>>+	cnstats_init_msg(msg, seq, ack);
>>+	c->version = CNSTATS_DATA_VERSION;
>>+	c->outtype = CNSTATS_DATA_PID;
>>+
>>+	read_lock(&tasklist_lock);
>>+	tsk = find_task_by_pid(pid);
>>+	if (!tsk) {
>>+		read_unlock(&tasklist_lock);
>>+		return -ESRCH;
>>+	}
>>+	get_task_struct(tsk);
>>+	read_unlock(&tasklist_lock);
>>+
>>+	cnstats_add_tsk_data(d, tsk);
>>+	put_task_struct(tsk);
>>+
>>+	return cn_netlink_send(msg, CN_IDX_STATS, GFP_KERNEL);
>>+}
>>+
>>+/*
>>+ * Send one tgid's stats (sum of appropriate per-pid stats)
>>+ *
>>+ */
>>+static int cnstats_send_tgid(pid_t tgid, int seq, int ack)
>>+{
>>+	__u8 buffer[CN_STATS_MSG_SIZE];
>>+	struct cn_msg *msg = (struct cn_msg *)buffer;;
>>+	struct cnstats *c = (struct cnstats *)msg->data;
>>+	struct cnstats_data *d = (struct cnstats_data *)&c->data.stats;
>>+	struct task_struct *first, *tsk;
>>+
>>+	cnstats_init_msg(msg, seq, ack);
>>+	c->version = CNSTATS_DATA_VERSION;
>>+	c->outtype = CNSTATS_DATA_TGID;
>>+
>>+	read_lock(&tasklist_lock);
>>+	first = tsk = find_task_by_pid(tgid);
>>+	if (!first) {
>>+		read_unlock(&tasklist_lock);
>>+		return -ESRCH;
>>+	}
>>+	do
>>+		cnstats_add_tsk_data(d, tsk);
>>+	while_each_thread(first, tsk);
>>+	read_unlock(&tasklist_lock);
>>+
>>+	d->pid = -1;
>>+	d->tgid = tgid;
>>+
>>+	return cn_netlink_send(msg, CN_IDX_STATS, GFP_KERNEL);
>>+}
>>+
>>+/***
>>+ * cnstats_ctl - handle command sent via CN_IDX_STATS connector
>>+ * @data: command
>>+ */
>>+static void cnstats_ctl(void *data)
>>+{
>>+	struct cn_msg *msg = data;
>>+	struct cnstats_cmd *cmd;
>>+	int err = 0;
>>+
>>+	if (msg->len != sizeof(*cmd))
>>+		return;
>>+
>>+	cmd = (struct cnstats_cmd *)msg->data;
>>+	switch (cmd->intype) {
>>+	case CNSTATS_CMD_LISTEN:
>>+		atomic_inc(&cnstats_num_listeners);
>>+		break;
>>+
>>+	case CNSTATS_CMD_IGNORE:
>>+		atomic_dec(&cnstats_num_listeners);
>>+		break;
>>+
>>+	case CNSTATS_CMD_STATS_PID:
>>+		if (atomic_read(&cnstats_num_listeners) < 1)
>>+			return;
>>+		err = cnstats_send_pid(cmd->param.pid, msg->seq, msg->ack+1);
>>+		if (!err)
>>+			return;
>>+		break;
>>+
>>+	case CNSTATS_CMD_STATS_TGID:
>>+		if (atomic_read(&cnstats_num_listeners) < 1)
>>+			return;
>>+		err = cnstats_send_tgid(cmd->param.pid, msg->seq, msg->ack+1);
>>+		if (!err)
>>+			return;
>>+		break;
>>+
>>+	default:
>>+		err = -EINVAL;
>>+		break;
>>+	}
>>+	cnstats_send_ack(err, msg->seq, msg->ack);
>>+}
>>+
>>+/***
>>+ * cnstats_exit_connector - send task's statistics to userspace when it exits
>>+ * @tsk: exiting task
>>+ */
>>+void cnstats_exit_connector(struct task_struct *tsk)
>>+{
>>+	int seq;
>>+	__u8 buffer[CN_STATS_MSG_SIZE];
>>+	struct cn_msg *msg = (struct cn_msg *)buffer;
>>+	struct cnstats *c = (struct cnstats *)msg->data;
>>+	struct cnstats_data *d = (struct cnstats_data *)&c->data.stats;
>>+
>>+	if (atomic_read(&cnstats_num_listeners) < 1)
>>+		return;
>>+
>>+	seq = get_cpu_var(cnstats_counts)++;
>>+	put_cpu_var(cnstats_counts);
>>+
>>+	cnstats_init_msg(msg, seq, CN_STATS_NOACK);
>>+	c->version = CNSTATS_DATA_VERSION;
>>+	c->outtype = CNSTATS_DATA_EXIT;
>>+	cnstats_add_tsk_data(d, tsk);
>>+
>>+	cn_netlink_send(msg, CN_IDX_STATS, GFP_KERNEL);
>>+}
>>+
>>+static int __init cnstats_init(void)
>>+{
>>+	int err;
>>+
>>+	if ((err = cn_add_callback(&cnstats_id, "cn_stats", &cnstats_ctl))) {
>>+		printk(KERN_WARNING "cn_stats failed to register\n");
>>+		return err;
>>+	}
>>+	return 0;
>>+}
>>+
>>+module_init(cnstats_init);
>>Index: linux-2.6.15-rc7/kernel/exit.c
>>===================================================================
>>--- linux-2.6.15-rc7.orig/kernel/exit.c
>>+++ linux-2.6.15-rc7/kernel/exit.c
>>@@ -29,6 +29,7 @@
>>#include <linux/syscalls.h>
>>#include <linux/signal.h>
>>#include <linux/cn_proc.h>
>>+#include <linux/cn_stats.h>
>>
>>#include <asm/uaccess.h>
>>#include <asm/unistd.h>
>>@@ -865,6 +866,7 @@ fastcall NORET_TYPE void do_exit(long co
>>
>>	tsk->exit_code = code;
>>	proc_exit_connector(tsk);
>>+	cnstats_exit_connector(tsk);
>> 
>>    
>>
>
>We need to move both proc_exit_connector(tsk) and
>cnstats_exit_connector(tsk) up to before exit_mm(tsk) statement.
>There are task statistics collected in task->mm and those stats
>will be lost after exit_mm(tsk).
>  
>
Good point. The cnstats connector can be moved.
I'm not sure if proc_exit_connector needs to move as well..but Matt can 
comment on that.

Do let me know if you think this interface would be good enough for 
CSA's needs or what changes would
be needed.

--Shailabh

>Thanks,
> - jay
>
>  
>
>>	exit_notify(tsk);
>>#ifdef CONFIG_NUMA
>>	mpol_free(tsk->mempolicy);
>>-
>>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>>the body of a message to majordomo@vger.kernel.org
>>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>Please read the FAQ at  http://www.tux.org/lkml/
>> 
>>    
>>
>
>
>  
>



  reply	other threads:[~2006-01-04 21:31 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-01-03 23:16 [Patch 0/6] Per-task delay accounting Shailabh Nagar
2006-01-03 23:23 ` [Patch 1/6] Delay accounting: timespec diff Shailabh Nagar
2006-01-03 23:26 ` [Patch 2/6] Delay accounting: Initialization, kernel boot option Shailabh Nagar
2006-01-03 23:28 ` [Patch 3/6] Delay accounting: Sync block I/O delays Shailabh Nagar
2006-01-03 23:30 ` [Patch 4/6] Delay accounting: Swap in delays Shailabh Nagar
2006-01-03 23:31 ` [Patch 5/6] Delay accounting: /proc interface Shailabh Nagar
2006-01-03 23:33 ` [Patch 6/6] Delay accounting: Connector interface Shailabh Nagar
2006-01-04  0:21   ` Greg KH
2006-01-04  0:42     ` Shailabh Nagar
2006-01-04  0:51       ` Greg KH
2006-01-04  7:49         ` [Lse-tech] " Shailabh Nagar
2006-01-04 19:04   ` Jay Lan
2006-01-04 21:31     ` Shailabh Nagar [this message]
2006-01-04 22:40     ` [ckrm-tech] " Matt Helsley
2006-01-04 23:17       ` Andrew Morton
2006-01-05 18:42         ` [PATCH 00/01] Move Exit Connectors Matt Helsley
2006-01-05 19:17           ` [PATCH 01/01][RFC] " Matt Helsley
2006-01-05 19:20           ` [PATCH 00/01] " Matt Helsley
2006-01-05 23:10             ` Andrew Morton
2006-01-06  0:06               ` [ckrm-tech] " Matt Helsley
2006-01-06  8:57                 ` [Lse-tech] " Jes Sorensen
2006-01-06 16:45                   ` Shailabh Nagar
2006-01-11 10:36                     ` Jes Sorensen
2006-01-11 12:56                       ` John Hesterberg
2006-01-11 13:50                         ` Jes Sorensen
2006-01-11 21:02                       ` Matt Helsley
2006-01-11 21:39                         ` John Hesterberg
2006-01-11 22:42                           ` Matt Helsley
2006-01-12 10:01                             ` Jes Sorensen
2006-01-12 23:20                               ` Matt Helsley
2006-01-13  9:35                                 ` Jes Sorensen
2006-01-14  7:23                                   ` Matt Helsley
2006-01-12  3:29                           ` Keith Owens
2006-01-12  5:04                             ` Paul E. McKenney
2006-01-12  5:38                               ` Keith Owens
2006-01-12  6:19                               ` Keith Owens
2006-01-12  6:51                                 ` Paul E. McKenney
2006-01-12  7:50                                   ` Keith Owens
2006-01-12 15:17                                     ` Paul E. McKenney
2006-01-17 17:26                                       ` Paul E. McKenney
2006-01-17 23:57                                         ` Keith Owens
2006-01-18  2:49                                           ` Paul E. McKenney
2006-01-18  2:55                                             ` Lee Revell
2006-01-18  6:29                                               ` Paul E. McKenney
2006-01-12  5:26                             ` Matt Helsley
2006-01-12  5:45                               ` Keith Owens
2006-01-12  9:51                         ` Jes Sorensen
2006-01-12 23:01                           ` Matt Helsley
2006-01-13  9:59                             ` Jes Sorensen
2006-01-13 10:38                             ` Jes Sorensen
2006-01-13 23:22                               ` Matt Helsley
2006-01-12 23:49                       ` Matt Helsley
2006-01-05  0:01       ` [ckrm-tech] Re: [Patch 6/6] Delay accounting: Connector interface Shailabh Nagar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=43BC3EAA.4010801@watson.ibm.com \
    --to=nagar@watson.ibm.com \
    --cc=akpm@osdl.org \
    --cc=ckrm-tech@lists.sourceforge.net \
    --cc=elsa-devel@lists.sourceforge.net \
    --cc=jlan@engr.sgi.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lse-tech@lists.sourceforge.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox