linux-rt-users.vger.kernel.org archive mirror
* [PATCH 0/4] Add cyclicload testtool support
@ 2012-08-30  9:56 Priyanka Jain
  2012-08-30  9:56 ` [PATCH 1/4] Add README for cyclicload test tool Priyanka Jain
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Priyanka Jain @ 2012-08-30  9:56 UTC (permalink / raw)
  To: jkacur, williams, frank.rowand, linux-rt-users, dvhart,
	Rajan.Srivastava
  Cc: Poonam.Aggrwal, Priyanka Jain


The cyclicload test tool is designed to simulate
load in the form of one or two load threads.
It uses cyclictest as its base code.

But as cyclictest and cyclicload target
different test cases, cyclicload is coded as a separate tool.
For easier code maintenance, the cyclictest source code is first
duplicated as the cyclicload code, and the calibration and load
generation logic is then added on top of it.
The plan is to move the common code into a library later on.

Developed against
	git://git.kernel.org/pub/scm/linux/kernel/git/clrkwllms/rt-tests.git
	commitID: 857cdd5320ce1f293f5dbcbec79cc8fe22b0bebf

Priyanka Jain (4):
  Add README for cyclicload test tool
  Duplicates cyclictest code as cyclicload
  Add cyclicload calibration & load generation feature
  Add cyclicload manual page

 Makefile                    |    7 +-
 src/cyclicload/README       |  273 ++++++
 src/cyclicload/cyclicload.8 |  206 ++++
 src/cyclicload/cyclicload.c | 2259 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 2744 insertions(+), 1 deletions(-)
 create mode 100644 src/cyclicload/README
 create mode 100644 src/cyclicload/cyclicload.8
 create mode 100644 src/cyclicload/cyclicload.c

-- 
1.7.4.1





* [PATCH 1/4] Add README for cyclicload test tool
  2012-08-30  9:56 [PATCH 0/4] Add cyclicload testtool support Priyanka Jain
@ 2012-08-30  9:56 ` Priyanka Jain
  2012-08-30  9:56 ` [PATCH 2/4] Duplicates cyclictest code as cyclicload Priyanka Jain
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Priyanka Jain @ 2012-08-30  9:56 UTC (permalink / raw)
  To: jkacur, williams, frank.rowand, linux-rt-users, dvhart,
	Rajan.Srivastava
  Cc: Poonam.Aggrwal, Priyanka Jain

The cyclicload test tool is designed to simulate
load in the form of one or two load threads.

Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
---
 src/cyclicload/README |  273 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 273 insertions(+), 0 deletions(-)
 create mode 100644 src/cyclicload/README

diff --git a/src/cyclicload/README b/src/cyclicload/README
new file mode 100644
index 0000000..b95b517
--- /dev/null
+++ b/src/cyclicload/README
@@ -0,0 +1,273 @@
+ABSTRACT
+---------
+While developing an embedded product, at the level of unit testing or
+component testing it is often a requirement to test it under a
+specified load, so that the test setup is close to the real product.
+E.g. while working on the Linux network stack targeted for an LTE
+product, the requirement is to simulate the L2 stack load for unit
+testing of the Linux network stack. Also, as the load can be of
+Real-Time (RT) or Non Real-Time (NRT) type, the requirement is to
+simulate load of both types.
+Cyclicload has been developed to meet these requirements.
+It can simulate RT and/or NRT load in the form of load threads.
+
+
+
+INTRODUCTION
+-------------
+Cyclicload is a test tool designed to simulate system load
+for unit/component testing.
+It can simulate a user-defined load at a user-defined interval
+with user-defined priority/nice values.
+It is also capable of producing simultaneous Real-Time critical
+(RT) and Non Real-Time (NRT) critical load.
+It works on both uniprocessor and multiprocessor systems.
+
+
+
+WHAT DOES IT DO
+---------------
+It creates one or two load-generating threads:
+1) Simulated load1_thread
+	-of RT type
+	-can run for load of 0% to 99%
+2) Simulated load2_thread
+	-can be of RT or NRT type
+	-can run for load of 0% to 99%
+
+Priority, nice value, load%, etc. are given as input via the command
+line.
+Also, the sum of 'load1' and 'load2' must not exceed 99%,
+as 1% is reserved for the control framework.
+
+
+HOW DOES IT WORK
+-----------------
+Cyclicload workflow can be broadly divided into following stages
+1) Initialization stage
+	a) Calibration
+	b) Other initialization
+2) Load producing stage
+3) Exit stage
+
+1) Initialization stage
+Calibration is the most important part of this stage from a
+functional point of view, as calibration data can vary from
+system to system or between configurations on the same system.
+But since the calibration output is generally the same for a given
+configuration on a particular system, calibration is kept optional,
+depending on the presence of the 'calibrate_count' file.
+
+First, it calibrates the loop_count per unit time per CPU needed to
+simulate load for a unit time, and stores this data in the
+'calibrate_count' file.
+For the same run or subsequent runs on the same system,
+it uses this calibrated data to generate load.
+For proper calibration, it is recommended to run cyclicload with no
+other load present on the system to generate the
+'calibrate_count' file, and then use this file for the next runs.
+
+2) Load producing stage
+This is the actual testing stage.
+In this stage, cyclicload runs a loop at every regular interval.
+In each interval, it generates 'load1' and 'load2' and then waits for
+the next interval.
+The interval may expire before 'load2' completes;
+in that case the remaining load is discarded.
+
+3) Exit stage
+In this stage, it displays how much of the desired load it was able to
+simulate, as a percentage of the actual load run versus the desired
+load.
+
+For design details, see the DESIGN OVERVIEW section.
+
+
+EXAMPLE USE CASE
+-----------------
+For a better understanding of why this tool is required, what it
+does and how it works, let's consider an example use case from LTE
+products.
+
+An LTE system has many software components running on it:
+the LTE layer2 stack, with real-time critical tasks
+like the MAC thread and non-real-time critical tasks like PDCP;
+the Linux network stack; etc.
+While doing unit testing of a Linux network stack driver,
+one generally needs to build a test setup which
+is close to the target LTE product. But making the complete
+LTE setup with EPC, eNodeB, etc. is not always feasible.
+Cyclicload comes in very handy here for simulating the LTE type
+of load.
+
+Let's assume the LTE layer2 stack has 30% real-time critical load
+and 20% non-real-time load.
+
+Cyclicload can first calibrate how much loop_count is required to
+generate a particular load. It can then use this output to simulate
+load in form of two threads:
+1) load1_thread
+	-running at RT priority
+	-producing 30% load
+	-simulating RT load on LTE layer2 stack
+
+2) load2_thread
+	-running at NRT priority
+	-producing 20% load
+	-simulating NRT load on LTE layer2 stack.
+
+This setup with simulated load can now be used to test the behavior
+of Linux network stack under the presence of simulated LTE layer2 load.
+
+Also, on completion, it displays how much 'load1' and 'load2'
+it was able to simulate, giving the user an idea of how much
+CPU bandwidth is available for the actual load in the system.
+
+The above example clearly shows that one can do
+system optimization at the unit level.
+
+
+COMMAND LINE ARGUMENTS/EXAMPLES
+-------------------------------
+
+Some of the command-line arguments:
+ "-x       --load_t1         load in percentage for t1 thread"
+ "-X       --load_t2         load in percentage for t2 thread"
+ "-z       --priority_t2     priority of t2 thread"
+ "-Z       --nice_t2         nice value of t2 thread"
+
+For more details on the arguments, see the cyclicload.8 file.
+
+If both load_t1 and load_t2 are zero, it behaves as the default
+cyclictest application.
+
+	#sudo ./cyclicload -p 99 -S -c 1 -d 0 -x 20 -X 30 -q -D 600&
+
+For help,
+	#sudo ./cyclicload --help
+
+
+
+THINGS TO NOTE
+---------------
+Cyclicload has an overhead of 0%-3% for the 'load1' thread and
+the 'load2' thread, so this should be taken into account
+when running load.
+If a more precise load is required,
+one can run 2-3 cycles of cyclicload and check the CPU load
+of the load threads using applications like 'top -H -d 60'
+to get an idea of the extra load of the threads on the
+particular system.
+This extra load is generally constant for a particular system.
+
+
+RECOMMENDED SETTINGS
+----------------------
+-The first run should be done with no load, or as little as possible,
+ for accuracy.
+-Should be run with sudo or root permission.
+-The calibration routine produces the 'calibrate_count' file in the
+ current working directory.
+	If one does not have write permission in that directory,
+	the file path should be changed via FILENAME in the source
+	code, or a shared-memory method can be used instead.
+-As load1_thread, in addition to generating load, also controls
+ interval start and end, it should be run with the
+ highest RT priority.
+-Running in quiet mode (-q) in the background is recommended for more
+ accurate load generation.
+
+
+TESTED ON
+----------
+Tested on uniprocessor and multiprocessor PowerPC
+and i686 platforms running PREEMPT_RT Linux.
+
+
+FUTURE ENHANCEMENTS
+-------------------
+-Add an option to take the file path from the command line.
+-Add an option to flush caches in each interval, to get
+	closer to the actual scenario.
+-Make it scalable to produce n load threads.
+-Test on other architectures.
+
+
+DESIGN OVERVIEW
+-----------------
+Cyclicload uses the existing cyclictest application from the
+rt-tests test suite as its base code. It adds logic
+for calibration and load simulation on top of it.
+
+Cyclicload code can be functionally divided into the following
+threads:
+1) cyclicload process thread
+2) calibrate_thread
+3) timerthread ('load1' thread)
+4) load2_thread
+
+Details below
+
+1) cyclicload process thread
+--------------------------
+-Parses input arguments.
+
+-For the first run (no 'calibrate_count' file present):
+	-Creates the 'calibrate_count' file.
+	-Creates 'calibrate_thread'.
+	-Stores the calibrated count in the 'calibrate_count' file.
+
+-For subsequent runs ('calibrate_count' file present):
+	-Reads the calibrated count from the file into
+	'calibrate_count_array' and uses that count
+	to simulate the desired load.
+
+-Creates 't' 'timerthread's:
+	-They simulate CPU interval windows.
+	-'timerthread' also acts as 'load1_thread'.
+	-'t' depends on the command-line arguments;
+		by default, on an SMP system,
+		t = number of cores, one thread per core.
+
+-Updates stats periodically while !shutdown.
+-Prints stats periodically, depending on command-line
+ arguments, while !shutdown.
+
+
+2) calibrate_thread
+------------------
+-Is created only once, for the first run of cyclicload,
+ if the 'calibrate_count' file is not present.
+-Runs at the highest RT priority.
+-Affines itself turn by turn to each CPU
+	(all CPUs on a multicore system).
+-Calibrates the loop count per unit time (ms by default) per CPU.
+-Stores the per-CPU data in calibrate_count_array (a global array).
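The per-CPU affinity walk can be sketched as below. The names are hypothetical (only calibrate_count_array mirrors the README's global array), and the measurement itself is stubbed out with a placeholder value.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <unistd.h>

/* Affine the calling thread to each online CPU in turn and record a
 * per-CPU count.  Returns the number of CPUs actually calibrated. */
static int calibrate_all_cpus(unsigned long *calibrate_count_array)
{
	int cpu, done = 0;
	int ncpus = (int)sysconf(_SC_NPROCESSORS_ONLN);
	cpu_set_t mask;

	for (cpu = 0; cpu < ncpus; cpu++) {
		CPU_ZERO(&mask);
		CPU_SET(cpu, &mask);
		/* skip CPUs we are not allowed to run on */
		if (pthread_setaffinity_np(pthread_self(),
					   sizeof(mask), &mask))
			continue;
		/* the timed busy loop would run here; a placeholder
		 * stands in for the measured loop count */
		calibrate_count_array[cpu] = 1;
		done++;
	}
	return done;
}
```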
+
+
+3) timerthread ('load1' thread)
+-------------------------------
+-Runs at the priority parsed in the main routine.
+-Creates 'load2_thread' if load2 is nonzero.
+-Calculates the number of loops needed to generate 'load1' and
+	'load2', using calibrate_count_array (a global array) and the
+	required load percentages parsed in the main routine.
+-Calculates the interval (window) and the reduced interval for which
+ it should sleep:
+	reduced interval = interval (window) - duration of 'load1'
+-Loops while !shutdown:
+	-Generates the 'load1' load.
+	-Sleeps for the reduced interval.
+	-Calculates latency.
+	-Sets the next_window_started flag.
+	-Signals load2_thread about the next window start.
+-Overhead: 0%-3% of full CPU utilization.
+	-Varies from system to system but is generally constant
+	 for a given system.
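The reduced-interval arithmetic above amounts to the following. This is an illustrative helper, not cyclicload's actual function; the 'load1' duration is derived from the calibrated loops-per-ms value.

```c
/* reduced interval = window - expected duration of 'load1'. */
static long reduced_interval_ns(long window_ns,
				unsigned long load1_loops,
				unsigned long loops_per_ms)
{
	long load1_ns = (long)((long long)load1_loops * 1000000LL /
			       loops_per_ms);
	long rest = window_ns - load1_ns;
	return rest > 0 ? rest : 0;	/* never a negative sleep */
}
```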
+
+
+4) load2_thread
+---------------
+-Loops while !shutdown:
+	-Generates the 'load2' load;
+		if the window expires before 'load2' generation
+		finishes, the remaining load is discarded.
+	-Waits for the signal for the next window to start.
+-Overhead: 0%-2% of full CPU utilization.
+	-Varies from system to system but is generally constant
+	 for a given system.
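The window-start handshake between timerthread and load2_thread can be sketched with a flag plus a condition variable. The names are hypothetical, chosen to mirror the README's next_window_started flag; cyclicload's actual synchronization may differ.

```c
#include <pthread.h>

static pthread_mutex_t win_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t win_cond = PTHREAD_COND_INITIALIZER;
static int next_window_started;

/* Called by timerthread at the start of each window. */
static void signal_window_start(void)
{
	pthread_mutex_lock(&win_lock);
	next_window_started = 1;
	pthread_cond_signal(&win_cond);
	pthread_mutex_unlock(&win_lock);
}

/* Called by load2_thread before generating its load. */
static void wait_window_start(void)
{
	pthread_mutex_lock(&win_lock);
	while (!next_window_started)
		pthread_cond_wait(&win_cond, &win_lock);
	next_window_started = 0;	/* consume the event */
	pthread_mutex_unlock(&win_lock);
}
```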
-- 
1.7.4.1



--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH 2/4] Duplicates cyclictest code as cyclicload
  2012-08-30  9:56 [PATCH 0/4] Add cyclicload testtool support Priyanka Jain
  2012-08-30  9:56 ` [PATCH 1/4] Add README for cyclicload test tool Priyanka Jain
@ 2012-08-30  9:56 ` Priyanka Jain
  2012-08-30  9:56 ` [PATCH 3/4] Add cyclicload calibration & load generation feature Priyanka Jain
  2012-08-30  9:56 ` [PATCH 4/4] Add cyclicload manual page Priyanka Jain
  3 siblings, 0 replies; 7+ messages in thread
From: Priyanka Jain @ 2012-08-30  9:56 UTC (permalink / raw)
  To: jkacur, williams, frank.rowand, linux-rt-users, dvhart,
	Rajan.Srivastava
  Cc: Poonam.Aggrwal, Priyanka Jain

Cyclicload uses cyclictest as its base code.

But as cyclictest and cyclicload target
different test cases, cyclicload is coded as a separate tool.
For easier code maintenance, the cyclictest source code is first
duplicated as the cyclicload code, and the calibration and load
generation logic is then added on top of it.
The plan is to move the common code into a library later on.

Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
---
 src/cyclicload/cyclicload.c | 1729 +++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 1729 insertions(+), 0 deletions(-)
 create mode 100644 src/cyclicload/cyclicload.c

diff --git a/src/cyclicload/cyclicload.c b/src/cyclicload/cyclicload.c
new file mode 100644
index 0000000..11b6cea
--- /dev/null
+++ b/src/cyclicload/cyclicload.c
@@ -0,0 +1,1729 @@
+/*
+ * High resolution timer test software
+ *
+ * (C) 2008-2012 Clark Williams <williams@redhat.com>
+ * (C) 2005-2007 Thomas Gleixner <tglx@linutronix.de>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License Version
+ * 2 as published by the Free Software Foundation.
+ *
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <stdarg.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <getopt.h>
+#include <pthread.h>
+#include <signal.h>
+#include <sched.h>
+#include <string.h>
+#include <time.h>
+#include <errno.h>
+#include <limits.h>
+#include <linux/unistd.h>
+
+#include <sys/prctl.h>
+#include <sys/stat.h>
+#include <sys/sysinfo.h>
+#include <sys/types.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <sys/utsname.h>
+#include <sys/mman.h>
+#include "rt_numa.h"
+
+#include "rt-utils.h"
+
+#define DEFAULT_INTERVAL 1000
+#define DEFAULT_DISTANCE 500
+
+#ifndef SCHED_IDLE
+#define SCHED_IDLE 5
+#endif
+#ifndef SCHED_NORMAL
+#define SCHED_NORMAL SCHED_OTHER
+#endif
+
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+
+/* Ugly, but .... */
+#define gettid() syscall(__NR_gettid)
+#define sigev_notify_thread_id _sigev_un._tid
+
+#ifdef __UCLIBC__
+#define MAKE_PROCESS_CPUCLOCK(pid, clock) \
+	((~(clockid_t) (pid) << 3) | (clockid_t) (clock))
+#define CPUCLOCK_SCHED          2
+
+static int clock_nanosleep(clockid_t clock_id, int flags, const struct timespec *req,
+		struct timespec *rem)
+{
+	if (clock_id == CLOCK_THREAD_CPUTIME_ID)
+		return -EINVAL;
+	if (clock_id == CLOCK_PROCESS_CPUTIME_ID)
+		clock_id = MAKE_PROCESS_CPUCLOCK (0, CPUCLOCK_SCHED);
+
+	return syscall(__NR_clock_nanosleep, clock_id, flags, req, rem);
+}
+
+int sched_setaffinity (__pid_t __pid, size_t __cpusetsize,
+                              __const cpu_set_t *__cpuset)
+{
+	return -EINVAL;
+}
+
+#undef CPU_SET
+#undef CPU_ZERO
+#define CPU_SET(cpu, cpusetp)
+#define CPU_ZERO(cpusetp)
+
+#else
+extern int clock_nanosleep(clockid_t __clock_id, int __flags,
+			   __const struct timespec *__req,
+			   struct timespec *__rem);
+#endif
+
+#define USEC_PER_SEC		1000000
+#define NSEC_PER_SEC		1000000000
+
+#define HIST_MAX		1000000
+
+#define MODE_CYCLIC		0
+#define MODE_CLOCK_NANOSLEEP	1
+#define MODE_SYS_ITIMER		2
+#define MODE_SYS_NANOSLEEP	3
+#define MODE_SYS_OFFSET		2
+
+#define TIMER_RELTIME		0
+
+/* Must be power of 2 ! */
+#define VALBUF_SIZE		16384
+
+#define KVARS			32
+#define KVARNAMELEN		32
+#define KVALUELEN		32
+
+int enable_events;
+
+static char *policyname(int policy);
+
+enum {
+	NOTRACE,
+	CTXTSWITCH,
+	IRQSOFF,
+	PREEMPTOFF,
+	PREEMPTIRQSOFF,
+	WAKEUP,
+	WAKEUPRT,
+	LATENCY,
+	FUNCTION,
+	CUSTOM,
+};
+
+/* Struct to transfer parameters to the thread */
+struct thread_param {
+	int prio;
+	int policy;
+	int mode;
+	int timermode;
+	int signal;
+	int clock;
+	unsigned long max_cycles;
+	struct thread_stat *stats;
+	int bufmsk;
+	unsigned long interval;
+	int cpu;
+	int node;
+};
+
+/* Struct for statistics */
+struct thread_stat {
+	unsigned long cycles;
+	unsigned long cyclesread;
+	long min;
+	long max;
+	long act;
+	double avg;
+	long *values;
+	long *hist_array;
+	pthread_t thread;
+	int threadstarted;
+	int tid;
+	long reduce;
+	long redmax;
+	long cycleofmax;
+	long hist_overflow;
+};
+
+static int shutdown;
+static int tracelimit = 0;
+static int ftrace = 0;
+static int kernelversion;
+static int verbose = 0;
+static int oscope_reduction = 1;
+static int lockall = 0;
+static int tracetype = NOTRACE;
+static int histogram = 0;
+static int histofall = 0;
+static int duration = 0;
+static int use_nsecs = 0;
+static int refresh_on_max;
+static int force_sched_other;
+static int priospread = 0;
+
+static pthread_cond_t refresh_on_max_cond = PTHREAD_COND_INITIALIZER;
+static pthread_mutex_t refresh_on_max_lock = PTHREAD_MUTEX_INITIALIZER;
+
+static pthread_mutex_t break_thread_id_lock = PTHREAD_MUTEX_INITIALIZER;
+static pid_t break_thread_id = 0;
+static uint64_t break_thread_value = 0;
+
+/* Backup of kernel variables that we modify */
+static struct kvars {
+	char name[KVARNAMELEN];
+	char value[KVALUELEN];
+} kv[KVARS];
+
+static char *procfileprefix = "/proc/sys/kernel/";
+static char *fileprefix;
+static char tracer[MAX_PATH];
+static char **traceptr;
+static int traceopt_count;
+static int traceopt_size;
+
+static int latency_target_fd = -1;
+static int32_t latency_target_value = 0;
+
+/* Latency trick
+ * if the file /dev/cpu_dma_latency exists,
+ * open it and write a zero into it. This will tell 
+ * the power management system not to transition to 
+ * a high cstate (in fact, the system acts like idle=poll)
+ * When the fd to /dev/cpu_dma_latency is closed, the behavior
+ * goes back to the system default.
+ * 
+ * Documentation/power/pm_qos_interface.txt
+ */
+static void set_latency_target(void)
+{
+	struct stat s;
+	int ret;
+
+	if (stat("/dev/cpu_dma_latency", &s) == 0) {
+		latency_target_fd = open("/dev/cpu_dma_latency", O_RDWR);
+		if (latency_target_fd == -1)
+			return;
+		ret = write(latency_target_fd, &latency_target_value, 4);
+		if (ret == 0) {
+			printf("# error setting cpu_dma_latency to %d!: %s\n", latency_target_value, strerror(errno));
+			close(latency_target_fd);
+			return;
+		}
+		printf("# /dev/cpu_dma_latency set to %dus\n", latency_target_value);
+	}
+}
+
+
+enum kernelversion {
+	KV_NOT_SUPPORTED,
+	KV_26_LT18,
+	KV_26_LT24,
+	KV_26_33,
+	KV_30
+};
+
+enum {
+	ERROR_GENERAL	= -1,
+	ERROR_NOTFOUND	= -2,
+};
+
+static char functiontracer[MAX_PATH];
+static char traceroptions[MAX_PATH];
+
+static int trace_fd     = -1;
+
+static int kernvar(int mode, const char *name, char *value, size_t sizeofvalue)
+{
+	char filename[128];
+	int retval = 1;
+	int path;
+	size_t len_prefix = strlen(fileprefix), len_name = strlen(name);
+
+	if (len_prefix + len_name + 1 > sizeof(filename)) {
+		errno = ENOMEM;
+		return 1;
+	}
+
+	memcpy(filename, fileprefix, len_prefix);
+	memcpy(filename + len_prefix, name, len_name + 1);
+
+	path = open(filename, mode);
+	if (path >= 0) {
+		if (mode == O_RDONLY) {
+			int got;
+			if ((got = read(path, value, sizeofvalue)) > 0) {
+				retval = 0;
+				value[got-1] = '\0';
+			}
+		} else if (mode == O_WRONLY) {
+			if (write(path, value, sizeofvalue) == sizeofvalue)
+				retval = 0;
+		}
+		close(path);
+	}
+	return retval;
+}
+
+static void setkernvar(const char *name, char *value)
+{
+	int i;
+	char oldvalue[KVALUELEN];
+
+	if (kernelversion < KV_26_33) {
+		if (kernvar(O_RDONLY, name, oldvalue, sizeof(oldvalue)))
+			fprintf(stderr, "could not retrieve %s\n", name);
+		else {
+			for (i = 0; i < KVARS; i++) {
+				if (!strcmp(kv[i].name, name))
+					break;
+				if (kv[i].name[0] == '\0') {
+					strncpy(kv[i].name, name,
+						sizeof(kv[i].name));
+					strncpy(kv[i].value, oldvalue,
+					    sizeof(kv[i].value));
+					break;
+				}
+			}
+			if (i == KVARS)
+				fprintf(stderr, "could not backup %s (%s)\n",
+					name, oldvalue);
+		}
+	}
+	if (kernvar(O_WRONLY, name, value, strlen(value)))
+		fprintf(stderr, "could not set %s to %s\n", name, value);
+
+}
+
+static void restorekernvars(void)
+{
+	int i;
+
+	for (i = 0; i < KVARS; i++) {
+		if (kv[i].name[0] != '\0') {
+			if (kernvar(O_WRONLY, kv[i].name, kv[i].value,
+			    strlen(kv[i].value)))
+				fprintf(stderr, "could not restore %s to %s\n",
+					kv[i].name, kv[i].value);
+		}
+	}
+}
+
+static inline void tsnorm(struct timespec *ts)
+{
+	while (ts->tv_nsec >= NSEC_PER_SEC) {
+		ts->tv_nsec -= NSEC_PER_SEC;
+		ts->tv_sec++;
+	}
+}
+
+static inline int64_t calcdiff(struct timespec t1, struct timespec t2)
+{
+	int64_t diff;
+	diff = USEC_PER_SEC * (long long)((int) t1.tv_sec - (int) t2.tv_sec);
+	diff += ((int) t1.tv_nsec - (int) t2.tv_nsec) / 1000;
+	return diff;
+}
+
+static inline int64_t calcdiff_ns(struct timespec t1, struct timespec t2)
+{
+	int64_t diff;
+	diff = NSEC_PER_SEC * (int64_t)((int) t1.tv_sec - (int) t2.tv_sec);
+	diff += ((int) t1.tv_nsec - (int) t2.tv_nsec);
+	return diff;
+}
+
+void traceopt(char *option)
+{
+	char *ptr;
+	if (traceopt_count + 1 > traceopt_size) {
+		traceopt_size += 16;
+		printf("expanding traceopt buffer to %d entries\n", traceopt_size);
+		traceptr = realloc(traceptr, sizeof(char*) * traceopt_size);
+		if (traceptr == NULL)
+			fatal ("Error allocating space for %d trace options\n",
+			       traceopt_count+1);
+	}
+	ptr = malloc(strlen(option)+1);
+	if (ptr == NULL)
+		fatal("error allocating space for trace option %s\n", option);
+	printf("adding traceopt %s\n", option);
+	strcpy(ptr, option);
+	traceptr[traceopt_count++] = ptr;
+}
+
+static int trace_file_exists(char *name)
+{
+	struct stat sbuf;
+	char *tracing_prefix = get_debugfileprefix();
+	char path[MAX_PATH];
+	strcat(strcpy(path, tracing_prefix), name);
+	return stat(path, &sbuf) ? 0 : 1;
+}
+
+void tracing(int on)
+{
+	if (on) {
+		switch (kernelversion) {
+		case KV_26_LT18: gettimeofday(0,(struct timezone *)1); break;
+		case KV_26_LT24: prctl(0, 1); break;
+		case KV_26_33: 
+		case KV_30:
+			write(trace_fd, "1", 1);
+			break;
+		default:	 break;
+		}
+	} else {
+		switch (kernelversion) {
+		case KV_26_LT18: gettimeofday(0,0); break;
+		case KV_26_LT24: prctl(0, 0); break;
+		case KV_26_33: 
+		case KV_30:
+			write(trace_fd, "0", 1);
+			break;
+		default:	break;
+		}
+	}
+}
+
+static int settracer(char *tracer)
+{
+	if (valid_tracer(tracer)) {
+		setkernvar("current_tracer", tracer);
+		return 0;
+	}
+	return -1;
+}
+
+static void setup_tracer(void)
+{
+	if (!tracelimit)
+		return;
+
+	if (mount_debugfs(NULL))
+		fatal("could not mount debugfs");
+
+	if (kernelversion >= KV_26_33) {
+		char testname[MAX_PATH];
+
+		fileprefix = get_debugfileprefix();
+		if (!trace_file_exists("tracing_enabled") &&
+		    !trace_file_exists("tracing_on"))
+			warn("tracing_enabled or tracing_on not found\n"
+			    "debug fs not mounted, "
+			    "TRACERs not configured?\n", testname);
+	} else
+		fileprefix = procfileprefix;
+
+	if (kernelversion >= KV_26_33) {
+		int ret;
+
+		if (trace_file_exists("tracing_enabled") &&
+		    !trace_file_exists("tracing_on"))
+			setkernvar("tracing_enabled", "1");
+
+		/* ftrace_enabled is a sysctl variable */
+		/* turn it on if you're doing anything but nop or event tracing */
+
+		fileprefix = procfileprefix;
+		if (tracetype)
+			setkernvar("ftrace_enabled", "1");
+		else
+			setkernvar("ftrace_enabled", "0");
+		fileprefix = get_debugfileprefix();
+
+		/*
+		 * Set default tracer to nop.
+		 * this also has the nice side effect of clearing out
+		 * old traces.
+		 */
+		ret = settracer("nop");
+
+		switch (tracetype) {
+		case NOTRACE:
+			/* no tracer specified, use events */
+			enable_events = 1;
+			break;
+		case FUNCTION:
+			ret = settracer("function");
+			break;
+		case IRQSOFF:
+			ret = settracer("irqsoff");
+			break;
+		case PREEMPTOFF:
+			ret = settracer("preemptoff");
+			break;
+		case PREEMPTIRQSOFF:
+			ret = settracer("preemptirqsoff");
+			break;
+		case CTXTSWITCH:
+			if (valid_tracer("sched_switch"))
+			    ret = settracer("sched_switch");
+			else {
+				if ((ret = event_enable("sched/sched_wakeup")))
+					break;
+				ret = event_enable("sched/sched_switch");
+			}
+			break;
+               case WAKEUP:
+                       ret = settracer("wakeup");
+                       break;
+               case WAKEUPRT:
+                       ret = settracer("wakeup_rt");
+                       break;
+		default:
+			if (strlen(tracer)) {
+				ret = settracer(tracer);
+				if (strcmp(tracer, "events") == 0 && ftrace)
+					ret = settracer(functiontracer);
+			}
+			else {
+				printf("cyclictest: unknown tracer!\n");
+				ret = 0;
+			}
+			break;
+		}
+
+		if (enable_events)
+			/* turn on all events */
+			event_enable_all();
+
+		if (ret)
+			fprintf(stderr, "Requested tracer '%s' not available\n", tracer);
+
+		setkernvar(traceroptions, "print-parent");
+		setkernvar(traceroptions, "latency-format");
+		if (verbose) {
+			setkernvar(traceroptions, "sym-offset");
+			setkernvar(traceroptions, "sym-addr");
+			setkernvar(traceroptions, "verbose");
+		} else {
+			setkernvar(traceroptions, "nosym-offset");
+			setkernvar(traceroptions, "nosym-addr");
+			setkernvar(traceroptions, "noverbose");
+		}
+		if (traceopt_count) {
+			int i;
+			for (i = 0; i < traceopt_count; i++)
+				setkernvar(traceroptions, traceptr[i]);
+		}
+		setkernvar("tracing_max_latency", "0");
+		if (trace_file_exists("latency_hist"))
+			setkernvar("latency_hist/wakeup/reset", "1");
+
+		/* open the tracing on file descriptor */
+		if (trace_fd == -1) {
+			char path[MAX_PATH];
+			strcpy(path, fileprefix);
+			if (trace_file_exists("tracing_on"))
+				strcat(path, "tracing_on");
+			else
+				strcat(path, "tracing_enabled");
+			if ((trace_fd = open(path, O_WRONLY)) == -1)
+				fatal("unable to open %s for tracing", path);
+		}
+
+	} else {
+		setkernvar("trace_all_cpus", "1");
+		setkernvar("trace_freerunning", "1");
+		setkernvar("trace_print_on_crash", "0");
+		setkernvar("trace_user_triggered", "1");
+		setkernvar("trace_user_trigger_irq", "-1");
+		setkernvar("trace_verbose", "0");
+		setkernvar("preempt_thresh", "0");
+		setkernvar("wakeup_timing", "0");
+		setkernvar("preempt_max_latency", "0");
+		if (ftrace)
+			setkernvar("mcount_enabled", "1");
+		setkernvar("trace_enabled", "1");
+		setkernvar("latency_hist/wakeup_latency/reset", "1");
+	}
+
+	tracing(1);
+}
+
+/*
+ * parse an input value as a base10 value followed by an optional
+ * suffix. The input value is presumed to be in seconds, unless
+ * followed by a modifier suffix: m=minutes, h=hours, d=days
+ *
+ * the return value is a value in seconds
+ */
+int
+parse_time_string(char *val)
+{
+	char *end;
+	int t = strtol(val, &end, 10);
+	if (end) {
+		switch (*end) {
+		case 'm':
+		case 'M':
+			t *= 60;
+			break;
+
+		case 'h':
+		case 'H':
+			t *= 60*60;
+			break;
+
+		case 'd':
+		case 'D':
+			t *= 24*60*60;
+			break;
+
+		}
+	}
+	return t;
+}
+
+/*
+ * Raise the soft priority limit up to prio, if that is less than or equal
+ * to the hard limit
+ * if a call fails, return the error
+ * if successful return 0
+ * if fails, return -1
+*/
+static int raise_soft_prio(int policy, const struct sched_param *param)
+{
+	int err;
+	int policy_max;	/* max for scheduling policy such as SCHED_FIFO */
+	int soft_max;
+	int hard_max;
+	int prio;
+	struct rlimit rlim;
+
+	prio = param->sched_priority;
+
+	policy_max = sched_get_priority_max(policy);
+	if (policy_max == -1) {
+		err = errno;
+		err_msg("WARN: no such policy\n");
+		return err;
+	}
+
+	err = getrlimit(RLIMIT_RTPRIO, &rlim);
+	if (err) {
+		err = errno;
+		err_msg_n(err, "WARN: getrlimit failed\n");
+		return err;
+	}
+
+	soft_max = (rlim.rlim_cur == RLIM_INFINITY) ? policy_max : rlim.rlim_cur;
+	hard_max = (rlim.rlim_max == RLIM_INFINITY) ? policy_max : rlim.rlim_max;
+
+	if (prio > soft_max && prio <= hard_max) {
+		rlim.rlim_cur = prio;
+		err = setrlimit(RLIMIT_RTPRIO, &rlim);
+		if (err) {
+			err = errno;
+			err_msg_n(err, "WARN: setrlimit failed\n");
+			/* return err; */
+		}
+	} else {
+		err = -1;
+	}
+
+	return err;
+}
+
+/*
+ * Check the error status of sched_setscheduler
+ * If an error can be corrected by raising the soft limit priority to
+ * a priority less than or equal to the hard limit, then do so.
+ */
+static int setscheduler(pid_t pid, int policy, const struct sched_param *param)
+{
+	int err = 0;
+
+try_again:
+	err = sched_setscheduler(pid, policy, param);
+	if (err) {
+		err = errno;
+		if (err == EPERM) {
+			int err1;
+			err1 = raise_soft_prio(policy, param);
+			if (!err1) goto try_again;
+		}
+	}
+
+	return err;
+}
+
+/*
+ * timer thread
+ *
+ * Modes:
+ * - clock_nanosleep based
+ * - cyclic timer based
+ *
+ * Clock:
+ * - CLOCK_MONOTONIC
+ * - CLOCK_REALTIME
+ *
+ */
+void *timerthread(void *param)
+{
+	struct thread_param *par = param;
+	struct sched_param schedp;
+	struct sigevent sigev;
+	sigset_t sigset;
+	timer_t timer;
+	struct timespec now, next, interval, stop;
+	struct itimerval itimer;
+	struct itimerspec tspec;
+	struct thread_stat *stat = par->stats;
+	int stopped = 0;
+	cpu_set_t mask;
+	pthread_t thread;
+
+	/* if we're running in numa mode, set our memory node */
+	if (par->node != -1)
+		rt_numa_set_numa_run_on_node(par->node, par->cpu);
+
+	if (par->cpu != -1) {
+		CPU_ZERO(&mask);
+		CPU_SET(par->cpu, &mask);
+		thread = pthread_self();
+		if(pthread_setaffinity_np(thread, sizeof(mask), &mask) == -1)
+			warn("Could not set CPU affinity to CPU #%d\n", par->cpu);
+	}
+
+	interval.tv_sec = par->interval / USEC_PER_SEC;
+	interval.tv_nsec = (par->interval % USEC_PER_SEC) * 1000;
+
+	stat->tid = gettid();
+
+	sigemptyset(&sigset);
+	sigaddset(&sigset, par->signal);
+	sigprocmask(SIG_BLOCK, &sigset, NULL);
+
+	if (par->mode == MODE_CYCLIC) {
+		sigev.sigev_notify = SIGEV_THREAD_ID | SIGEV_SIGNAL;
+		sigev.sigev_signo = par->signal;
+		sigev.sigev_notify_thread_id = stat->tid;
+		timer_create(par->clock, &sigev, &timer);
+		tspec.it_interval = interval;
+	}
+
+	memset(&schedp, 0, sizeof(schedp));
+	schedp.sched_priority = par->prio;
+	if (setscheduler(0, par->policy, &schedp)) 
+		fatal("timerthread%d: failed to set priority to %d\n", par->cpu, par->prio);
+
+	/* Get current time */
+	clock_gettime(par->clock, &now);
+
+	next = now;
+	next.tv_sec += interval.tv_sec;
+	next.tv_nsec += interval.tv_nsec;
+	tsnorm(&next);
+
+	if (duration) {
+		memset(&stop, 0, sizeof(stop)); /* grrr */
+		stop = now;
+		stop.tv_sec += duration;
+		tsnorm(&stop);
+	}
+	if (par->mode == MODE_CYCLIC) {
+		if (par->timermode == TIMER_ABSTIME)
+			tspec.it_value = next;
+		else {
+			tspec.it_value.tv_nsec = 0;
+			tspec.it_value.tv_sec = 1;
+		}
+		timer_settime(timer, par->timermode, &tspec, NULL);
+	}
+
+	if (par->mode == MODE_SYS_ITIMER) {
+		itimer.it_value.tv_sec = 1;
+		itimer.it_value.tv_usec = 0;
+		itimer.it_interval.tv_sec = interval.tv_sec;
+		itimer.it_interval.tv_usec = interval.tv_nsec / 1000;
+		setitimer (ITIMER_REAL, &itimer, NULL);
+	}
+
+	stat->threadstarted++;
+
+	while (!shutdown) {
+
+		uint64_t diff;
+		int sigs, ret;
+
+		/* Wait for next period */
+		switch (par->mode) {
+		case MODE_CYCLIC:
+		case MODE_SYS_ITIMER:
+			if (sigwait(&sigset, &sigs) < 0)
+				goto out;
+			break;
+
+		case MODE_CLOCK_NANOSLEEP:
+			if (par->timermode == TIMER_ABSTIME) {
+				if ((ret = clock_nanosleep(par->clock, TIMER_ABSTIME, &next, NULL))) {
+					if (ret != EINTR)
+						warn("clock_nanosleep failed. error: %d\n", ret);
+					goto out;
+				}
+			} else {
+				if ((ret = clock_gettime(par->clock, &now))) {
+					if (ret != EINTR)
+						warn("clock_gettime() failed: %s\n", strerror(errno));
+					goto out;
+				}
+				if ((ret = clock_nanosleep(par->clock, TIMER_RELTIME, &interval, NULL))) {
+					if (ret != EINTR)
+						warn("clock_nanosleep() failed. error: %d\n", ret);
+					goto out;
+				}
+				next.tv_sec = now.tv_sec + interval.tv_sec;
+				next.tv_nsec = now.tv_nsec + interval.tv_nsec;
+				tsnorm(&next);
+			}
+			break;
+
+		case MODE_SYS_NANOSLEEP:
+			if ((ret = clock_gettime(par->clock, &now))) {
+				if (ret != EINTR)
+					warn("clock_gettime() failed: errno %d\n", errno);
+				goto out;
+			}
+			if (nanosleep(&interval, NULL)) {
+				if (errno != EINTR)
+					warn("nanosleep failed. errno: %d\n", errno);
+				goto out;
+			}
+			next.tv_sec = now.tv_sec + interval.tv_sec;
+			next.tv_nsec = now.tv_nsec + interval.tv_nsec;
+			tsnorm(&next);
+			break;
+		}
+
+		if ((ret = clock_gettime(par->clock, &now))) {
+			if (ret != EINTR)
+				warn("clock_gettime() failed. errno: %d\n", errno);
+			goto out;
+		}
+
+		if (use_nsecs)
+			diff = calcdiff_ns(now, next);
+		else
+			diff = calcdiff(now, next);
+		if (diff < stat->min)
+			stat->min = diff;
+		if (diff > stat->max) {
+			stat->max = diff;
+			if (refresh_on_max)
+				pthread_cond_signal(&refresh_on_max_cond);
+		}
+		stat->avg += (double) diff;
+
+		if (duration && (calcdiff(now, stop) >= 0))
+			shutdown++;
+
+		if (!stopped && tracelimit && (diff > tracelimit)) {
+			stopped++;
+			tracing(0);
+			shutdown++;
+			pthread_mutex_lock(&break_thread_id_lock);
+			if (break_thread_id == 0)
+				break_thread_id = stat->tid;
+			break_thread_value = diff;
+			pthread_mutex_unlock(&break_thread_id_lock);
+		}
+		stat->act = diff;
+
+		if (par->bufmsk)
+			stat->values[stat->cycles & par->bufmsk] = diff;
+
+		/* Update the histogram */
+		if (histogram) {
+			if (diff >= histogram)
+				stat->hist_overflow++;
+			else
+				stat->hist_array[diff]++;
+		}
+
+		stat->cycles++;
+
+		next.tv_sec += interval.tv_sec;
+		next.tv_nsec += interval.tv_nsec;
+		if (par->mode == MODE_CYCLIC) {
+			int overrun_count = timer_getoverrun(timer);
+			next.tv_sec += overrun_count * interval.tv_sec;
+			next.tv_nsec += overrun_count * interval.tv_nsec;
+		}
+		tsnorm(&next);
+
+		if (par->max_cycles && par->max_cycles == stat->cycles)
+			break;
+	}
+
+out:
+	if (par->mode == MODE_CYCLIC)
+		timer_delete(timer);
+
+	if (par->mode == MODE_SYS_ITIMER) {
+		itimer.it_value.tv_sec = 0;
+		itimer.it_value.tv_usec = 0;
+		itimer.it_interval.tv_sec = 0;
+		itimer.it_interval.tv_usec = 0;
+		setitimer (ITIMER_REAL, &itimer, NULL);
+	}
+
+	/* switch to normal */
+	schedp.sched_priority = 0;
+	sched_setscheduler(0, SCHED_OTHER, &schedp);
+
+	stat->threadstarted = -1;
+
+	return NULL;
+}
+
+
+/* Print usage information */
+static void display_help(int error)
+{
+	char tracers[MAX_PATH];
+	char *prefix;
+
+	prefix = get_debugfileprefix();
+	if (prefix[0] == '\0')
+		strcpy(tracers, "unavailable (debugfs not mounted)");
+	else {
+		fileprefix = prefix;
+		if (kernvar(O_RDONLY, "available_tracers", tracers, sizeof(tracers)))
+			strcpy(tracers, "none");
+	}
+		
+	printf("cyclictest V %1.2f\n", VERSION_STRING);
+	printf("Usage:\n"
+	       "cyclictest <options>\n\n"
+	       "-a [NUM] --affinity        run thread #N on processor #N, if possible\n"
+	       "                           with NUM pin all threads to the processor NUM\n"
+	       "-b USEC  --breaktrace=USEC send break trace command when latency > USEC\n"
+	       "-B       --preemptirqs     both preempt and irqsoff tracing (used with -b)\n"
+	       "-c CLOCK --clock=CLOCK     select clock\n"
+	       "                           0 = CLOCK_MONOTONIC (default)\n"
+	       "                           1 = CLOCK_REALTIME\n"
+	       "-C       --context         context switch tracing (used with -b)\n"
+	       "-d DIST  --distance=DIST   distance of thread intervals in us default=500\n"
+	       "-D       --duration=t      specify a length for the test run\n"
+	       "                           default is in seconds, but 'm', 'h', or 'd' may be appended\n"
+	       "                           to interpret the value as minutes, hours or days\n"
+	       "-E       --event           event tracing (used with -b)\n"
+	       "-f       --ftrace          function trace (when -b is active)\n"
+	       "-h       --histogram=US    dump a latency histogram to stdout after the run\n"
+	       "                           (with the same priority across many threads)\n"
+	       "                           US is the max latency to be tracked in microseconds\n"
+	       "-H       --histofall=US    same as -h except with an additional summary column\n"
+	       "-i INTV  --interval=INTV   base interval of thread in us default=1000\n"
+	       "-I       --irqsoff         Irqsoff tracing (used with -b)\n"
+	       "-l LOOPS --loops=LOOPS     number of loops: default=0(endless)\n"
+	       "-m       --mlockall        lock current and future memory allocations\n"
+	       "-M       --refresh_on_max  delay updating the screen until a new max latency is hit\n" 
+	       "-n       --nanosleep       use clock_nanosleep\n"
+	       "-N       --nsecs           print results in ns instead of us (default us)\n"
+	       "-o RED   --oscope=RED      oscilloscope mode, reduce verbose output by RED\n"
+	       "-O TOPT  --traceopt=TOPT   trace option\n"
+	       "-p PRIO  --prio=PRIO       priority of highest prio thread\n"
+	       "-P       --preemptoff      Preempt off tracing (used with -b)\n"
+	       "-q       --quiet           print only a summary on exit\n"
+	       "-Q       --priospread      spread priority levels starting at specified value\n"
+	       "-r       --relative        use relative timer instead of absolute\n"
+	       "-s       --system          use sys_nanosleep and sys_setitimer\n"
+	       "-t       --threads         one thread per available processor\n"
+	       "-t [NUM] --threads=NUM     number of threads:\n"
+	       "                           without NUM, threads = max_cpus\n"
+	       "                           without -t default = 1\n"
+	       "-T TRACE --tracer=TRACE    set tracing function\n"
+	       "    configured tracers: %s\n"
+	       "-u       --unbuffered      force unbuffered output for live processing\n"
+	       "-v       --verbose         output values on stdout for statistics\n"
+	       "                           format: n:c:v n=tasknum c=count v=value in us\n"
+	       "-w       --wakeup          task wakeup tracing (used with -b)\n"
+	       "-W       --wakeuprt        rt task wakeup tracing (used with -b)\n"
+	       "-y POLI  --policy=POLI     policy of realtime thread, POLI may be fifo(default) or rr\n"
+	       "                           format: --policy=fifo(default) or --policy=rr\n"
+	       "-S       --smp             Standard SMP testing: options -a -t -n and\n"
+	       "                           same priority of all threads\n"
+	       "-U       --numa            Standard NUMA testing (similar to SMP option)\n"
+	       "                           thread data structures allocated from local node\n",
+	       tracers
+		);
+	if (error)
+		exit(EXIT_FAILURE);
+	exit(EXIT_SUCCESS);
+}
+
+static int use_nanosleep;
+static int timermode = TIMER_ABSTIME;
+static int use_system;
+static int priority;
+static int policy = SCHED_OTHER;	/* default policy if not specified */
+static int num_threads = 1;
+static int max_cycles;
+static int clocksel = 0;
+static int quiet;
+static int interval = DEFAULT_INTERVAL;
+static int distance = -1;
+static int affinity = 0;
+static int smp = 0;
+
+enum {
+	AFFINITY_UNSPECIFIED,
+	AFFINITY_SPECIFIED,
+	AFFINITY_USEALL
+};
+static int setaffinity = AFFINITY_UNSPECIFIED;
+
+static int clocksources[] = {
+	CLOCK_MONOTONIC,
+	CLOCK_REALTIME,
+};
+
+static void handlepolicy(char *polname)
+{
+	if (strncasecmp(polname, "other", 5) == 0)
+		policy = SCHED_OTHER;
+	else if (strncasecmp(polname, "batch", 5) == 0)
+		policy = SCHED_BATCH;
+	else if (strncasecmp(polname, "idle", 4) == 0)
+		policy = SCHED_IDLE;
+	else if (strncasecmp(polname, "fifo", 4) == 0)
+		policy = SCHED_FIFO;
+	else if (strncasecmp(polname, "rr", 2) == 0)
+		policy = SCHED_RR;
+	else	/* default policy if we don't recognize the request */
+		policy = SCHED_OTHER;
+}
+
+static char *policyname(int policy)
+{
+	char *policystr = "";
+
+	switch(policy) {
+	case SCHED_OTHER:
+		policystr = "other";
+		break;
+	case SCHED_FIFO:
+		policystr = "fifo";
+		break;
+	case SCHED_RR:
+		policystr = "rr";
+		break;
+	case SCHED_BATCH:
+		policystr = "batch";
+		break;
+	case SCHED_IDLE:
+		policystr = "idle";
+		break;
+	}
+	return policystr;
+}
+
+
+/* Process commandline options */
+static void process_options (int argc, char *argv[])
+{
+	int error = 0;
+	int option_affinity = 0;
+	int max_cpus = sysconf(_SC_NPROCESSORS_CONF);
+
+	for (;;) {
+ 		int option_index = 0;
+		/** Options for getopt */
+		static struct option long_options[] = {
+			{"affinity", optional_argument, NULL, 'a'},
+			{"breaktrace", required_argument, NULL, 'b'},
+			{"preemptirqs", no_argument, NULL, 'B'},
+			{"clock", required_argument, NULL, 'c'},
+			{"context", no_argument, NULL, 'C'},
+			{"distance", required_argument, NULL, 'd'},
+			{"event", no_argument, NULL, 'E'},
+			{"ftrace", no_argument, NULL, 'f'},
+			{"histogram", required_argument, NULL, 'h'},
+			{"histofall", required_argument, NULL, 'H'},
+			{"interval", required_argument, NULL, 'i'},
+			{"irqsoff", no_argument, NULL, 'I'},
+			{"loops", required_argument, NULL, 'l'},
+			{"mlockall", no_argument, NULL, 'm' },
+			{"refresh_on_max", no_argument, NULL, 'M' },
+			{"nanosleep", no_argument, NULL, 'n'},
+			{"nsecs", no_argument, NULL, 'N'},
+			{"oscope", required_argument, NULL, 'o'},
+			{"priority", required_argument, NULL, 'p'},
+                        {"policy", required_argument, NULL, 'y'},
+			{"preemptoff", no_argument, NULL, 'P'},
+			{"quiet", no_argument, NULL, 'q'},
+			{"relative", no_argument, NULL, 'r'},
+			{"system", no_argument, NULL, 's'},
+			{"threads", optional_argument, NULL, 't'},
+			{"unbuffered", no_argument, NULL, 'u'},
+			{"verbose", no_argument, NULL, 'v'},
+			{"duration",required_argument, NULL, 'D'},
+                        {"wakeup", no_argument, NULL, 'w'},
+                        {"wakeuprt", no_argument, NULL, 'W'},
+			{"help", no_argument, NULL, '?'},
+			{"tracer", required_argument, NULL, 'T'},
+			{"traceopt", required_argument, NULL, 'O'},
+			{"smp", no_argument, NULL, 'S'},
+			{"numa", no_argument, NULL, 'U'},
+			{"latency", required_argument, NULL, 'e'},
+			{"priospread", no_argument, NULL, 'Q'},
+			{NULL, 0, NULL, 0}
+		};
+		int c = getopt_long(argc, argv, "a::b:Bc:Cd:Efh:H:i:Il:MnNo:O:p:PmqQrsSt::uUvD:wWT:y:e:",
+				    long_options, &option_index);
+		if (c == -1)
+			break;
+		switch (c) {
+		case 'a':
+			option_affinity = 1;
+			if (smp || numa)
+				break;
+			if (optarg != NULL) {
+				affinity = atoi(optarg);
+				setaffinity = AFFINITY_SPECIFIED;
+			} else if (optind<argc && atoi(argv[optind])) {
+				affinity = atoi(argv[optind]);
+				setaffinity = AFFINITY_SPECIFIED;
+			} else {
+				setaffinity = AFFINITY_USEALL;
+			}
+			break;
+		case 'b': tracelimit = atoi(optarg); break;
+		case 'B': tracetype = PREEMPTIRQSOFF; break;
+		case 'c': clocksel = atoi(optarg); break;
+		case 'C': tracetype = CTXTSWITCH; break;
+		case 'd': distance = atoi(optarg); break;
+		case 'E': enable_events = 1; break;
+		case 'f': tracetype = FUNCTION; ftrace = 1; break;
+		case 'H': histofall = 1; /* fall through */
+		case 'h': histogram = atoi(optarg); break;
+		case 'i': interval = atoi(optarg); break;
+		case 'I':
+			if (tracetype == PREEMPTOFF) {
+				tracetype = PREEMPTIRQSOFF;
+				strncpy(tracer, "preemptirqsoff", sizeof(tracer));
+			} else {
+				tracetype = IRQSOFF;
+				strncpy(tracer, "irqsoff", sizeof(tracer));
+			}
+			break;
+		case 'l': max_cycles = atoi(optarg); break;
+		case 'n': use_nanosleep = MODE_CLOCK_NANOSLEEP; break;
+		case 'N': use_nsecs = 1; break;
+		case 'o': oscope_reduction = atoi(optarg); break;
+		case 'O': traceopt(optarg); break;
+		case 'p': 
+			priority = atoi(optarg); 
+			if (policy != SCHED_FIFO && policy != SCHED_RR)
+				policy = SCHED_FIFO;
+			break;
+		case 'P':
+			if (tracetype == IRQSOFF) {
+				tracetype = PREEMPTIRQSOFF;
+				strncpy(tracer, "preemptirqsoff", sizeof(tracer));
+			} else {
+				tracetype = PREEMPTOFF;
+				strncpy(tracer, "preemptoff", sizeof(tracer));
+			}
+			break;
+		case 'q': quiet = 1; break;
+		case 'Q': priospread = 1; break;
+		case 'r': timermode = TIMER_RELTIME; break;
+		case 's': use_system = MODE_SYS_OFFSET; break;
+		case 't':
+			if (smp) {
+				warn("-t ignored due to --smp\n");
+				break;
+			}
+			if (optarg != NULL)
+				num_threads = atoi(optarg);
+			else if (optind<argc && atoi(argv[optind]))
+				num_threads = atoi(argv[optind]);
+			else
+				num_threads = max_cpus;
+			break;
+		case 'T': 
+			tracetype = CUSTOM;
+			strncpy(tracer, optarg, sizeof(tracer)); 
+			break;
+		case 'u': setvbuf(stdout, NULL, _IONBF, 0); break;
+		case 'v': verbose = 1; break;
+		case 'm': lockall = 1; break;
+		case 'M': refresh_on_max = 1; break;
+		case 'D': duration = parse_time_string(optarg);
+			break;
+		case 'w': tracetype = WAKEUP; break;
+		case 'W': tracetype = WAKEUPRT; break;
+		case 'y': handlepolicy(optarg); break;
+		case 'S':  /* SMP testing */
+			if (numa)
+				fatal("numa and smp options are mutually exclusive\n");
+			smp = 1;
+			num_threads = max_cpus;
+			setaffinity = AFFINITY_USEALL;
+			use_nanosleep = MODE_CLOCK_NANOSLEEP;
+			break;
+		case 'U':  /* NUMA testing */
+			if (smp)
+				fatal("numa and smp options are mutually exclusive\n");
+#ifdef NUMA
+			if (numa_available() == -1)
+				fatal("NUMA functionality not available!");
+			numa = 1;
+			num_threads = max_cpus;
+			setaffinity = AFFINITY_USEALL;
+			use_nanosleep = MODE_CLOCK_NANOSLEEP;
+#else
+			warn("cyclictest was not built with the numa option\n");
+			warn("ignoring --numa or -U\n");
+#endif
+			break;
+		case 'e': /* power management latency target value */
+			  /* note: default is 0 (zero) */
+			latency_target_value = atoi(optarg);
+			if (latency_target_value < 0)
+				latency_target_value = 0;
+			break;
+
+		case '?': display_help(0); break;
+		}
+	}
+
+	if (option_affinity) {
+		if (smp) {
+			warn("-a ignored due to --smp\n");
+		} else if (numa) {
+			warn("-a ignored due to --numa\n");
+		}
+	}
+
+	if (setaffinity == AFFINITY_SPECIFIED) {
+		if (affinity < 0)
+			error = 1;
+		if (affinity >= max_cpus) {
+			warn("CPU #%d not found, only %d CPUs available\n",
+			    affinity, max_cpus);
+			error = 1;
+		}
+	} else if (tracelimit)
+		fileprefix = procfileprefix;
+
+	if (clocksel < 0 || clocksel >= ARRAY_SIZE(clocksources))
+		error = 1;
+
+	if (oscope_reduction < 1)
+		error = 1;
+
+	if (oscope_reduction > 1 && !verbose) {
+		warn("-o option is only meaningful with verbose (-v)\n");
+		error = 1;
+	}
+
+	if (histogram < 0)
+		error = 1;
+
+	if (histogram > HIST_MAX)
+		histogram = HIST_MAX;
+
+	if (histogram && distance != -1)
+		warn("distance is ignored and set to 0 if histogram is enabled\n");
+	if (distance == -1)
+		distance = DEFAULT_DISTANCE;
+
+	if (priority < 0 || priority > 99)
+		error = 1;
+
+	if (priospread && priority == 0) {
+		fprintf(stderr, "defaulting realtime priority to %d\n", 
+			num_threads+1);
+		priority = num_threads+1;
+	}
+
+	if (priority && (policy != SCHED_FIFO && policy != SCHED_RR)) {
+		fprintf(stderr, "policy and priority don't match: setting policy to SCHED_FIFO\n");
+		policy = SCHED_FIFO;
+	}
+
+	if ((policy == SCHED_FIFO || policy == SCHED_RR) && priority == 0) {
+		fprintf(stderr, "defaulting realtime priority to %d\n", 
+			num_threads+1);
+		priority = num_threads+1;
+	}
+
+	if (num_threads < 1)
+		error = 1;
+
+	if (error)
+		display_help(1);
+}
+
+static int check_kernel(void)
+{
+	struct utsname kname;
+	int maj, min, sub, kv, ret;
+
+	ret = uname(&kname);
+	if (ret) {
+		fprintf(stderr, "uname failed: %s. Assuming not 2.6\n",
+				strerror(errno));
+		return KV_NOT_SUPPORTED;
+	}
+	sscanf(kname.release, "%d.%d.%d", &maj, &min, &sub);
+	if (maj == 2 && min == 6) {
+		if (sub < 18)
+			kv = KV_26_LT18;
+		else if (sub < 24)
+			kv = KV_26_LT24;
+		else if (sub < 28) {
+			kv = KV_26_33;
+			strcpy(functiontracer, "ftrace");
+			strcpy(traceroptions, "iter_ctrl");
+		} else {
+			kv = KV_26_33;
+			strcpy(functiontracer, "function");
+			strcpy(traceroptions, "trace_options");
+		}
+	} else if (maj == 3) {
+		kv = KV_30;
+		strcpy(functiontracer, "function");
+		strcpy(traceroptions, "trace_options");
+		
+	} else
+		kv = KV_NOT_SUPPORTED;
+
+	return kv;
+}
+
+static int check_timer(void)
+{
+	struct timespec ts;
+
+	if (clock_getres(CLOCK_MONOTONIC, &ts))
+		return 1;
+
+	return (ts.tv_sec != 0 || ts.tv_nsec != 1);
+}
+
+static void sighand(int sig)
+{
+	shutdown = 1;
+	if (refresh_on_max)
+		pthread_cond_signal(&refresh_on_max_cond);
+	if (tracelimit)
+		tracing(0);
+}
+
+static void print_tids(struct thread_param *par[], int nthreads)
+{
+	int i;
+
+	printf("# Thread Ids:");
+	for (i = 0; i < nthreads; i++)
+		printf(" %05d", par[i]->stats->tid);
+	printf("\n");
+}
+
+static void print_hist(struct thread_param *par[], int nthreads)
+{
+	int i, j;
+	unsigned long long int log_entries[nthreads+1];
+	unsigned long maxmax, alloverflows;
+
+	bzero(log_entries, sizeof(log_entries));
+
+	printf("# Histogram\n");
+	for (i = 0; i < histogram; i++) {
+		unsigned long long int allthreads = 0;
+
+		printf("%06d ", i);
+
+		for (j = 0; j < nthreads; j++) {
+			unsigned long curr_latency=par[j]->stats->hist_array[i];
+			printf("%06lu", curr_latency);
+			if (j < nthreads - 1)
+				printf("\t");
+			log_entries[j] += curr_latency;
+			allthreads += curr_latency;
+		}
+		if (histofall && nthreads > 1) {
+			printf("\t%06llu", allthreads);
+			log_entries[nthreads] += allthreads;
+		}
+		printf("\n");
+	}
+	printf("# Total:");
+	for (j = 0; j < nthreads; j++)
+		printf(" %09llu", log_entries[j]);
+	if (histofall && nthreads > 1)
+		printf(" %09llu", log_entries[nthreads]);
+	printf("\n");
+	printf("# Min Latencies:");
+	for (j = 0; j < nthreads; j++)
+		printf(" %05lu", par[j]->stats->min);
+	printf("\n");
+	printf("# Avg Latencies:");
+	for (j = 0; j < nthreads; j++)
+		printf(" %05lu", par[j]->stats->cycles ?
+		       (long)(par[j]->stats->avg/par[j]->stats->cycles) : 0);
+	printf("\n");
+	printf("# Max Latencies:");
+	maxmax = 0;
+	for (j = 0; j < nthreads; j++) {
+ 		printf(" %05lu", par[j]->stats->max);
+		if (par[j]->stats->max > maxmax)
+			maxmax = par[j]->stats->max;
+	}
+	if (histofall && nthreads > 1)
+		printf(" %05lu", maxmax);
+	printf("\n");
+	printf("# Histogram Overflows:");
+	alloverflows = 0;
+	for (j = 0; j < nthreads; j++) {
+ 		printf(" %05lu", par[j]->stats->hist_overflow);
+		alloverflows += par[j]->stats->hist_overflow;
+	}
+	if (histofall && nthreads > 1)
+		printf(" %05lu", alloverflows);
+	printf("\n");
+}
+
+static void print_stat(struct thread_param *par, int index, int verbose)
+{
+	struct thread_stat *stat = par->stats;
+
+	if (!verbose) {
+		if (quiet != 1) {
+			char *fmt;
+			if (use_nsecs)
+				fmt = "T:%2d (%5d) P:%2d I:%ld C:%7lu "
+					"Min:%7ld Act:%8ld Avg:%8ld Max:%8ld\n";
+			else
+				fmt = "T:%2d (%5d) P:%2d I:%ld C:%7lu "
+					"Min:%7ld Act:%5ld Avg:%5ld Max:%8ld\n";
+			printf(fmt, index, stat->tid, par->prio,
+			       par->interval, stat->cycles, stat->min, stat->act,
+			       stat->cycles ?
+			       (long)(stat->avg/stat->cycles) : 0, stat->max);
+		}
+	} else {
+		while (stat->cycles != stat->cyclesread) {
+			long diff = stat->values
+			    [stat->cyclesread & par->bufmsk];
+
+			if (diff > stat->redmax) {
+				stat->redmax = diff;
+				stat->cycleofmax = stat->cyclesread;
+			}
+			if (++stat->reduce == oscope_reduction) {
+				printf("%8d:%8lu:%8ld\n", index,
+				       stat->cycleofmax, stat->redmax);
+				stat->reduce = 0;
+				stat->redmax = 0;
+			}
+			stat->cyclesread++;
+		}
+	}
+}
+
+int main(int argc, char **argv)
+{
+	sigset_t sigset;
+	int signum = SIGALRM;
+	int mode;
+	struct thread_param **parameters;
+	struct thread_stat **statistics;
+	int max_cpus = sysconf(_SC_NPROCESSORS_CONF);
+	int i, ret = -1;
+	int status;
+
+	process_options(argc, argv);
+
+	if (check_privs())
+		exit(EXIT_FAILURE);
+
+	/* Checks if numa is on, program exits if numa on but not available */
+	numa_on_and_available();
+
+	/* lock all memory (prevent swapping) */
+	if (lockall)
+		if (mlockall(MCL_CURRENT|MCL_FUTURE) == -1) {
+			perror("mlockall");
+			goto out;
+		}
+
+	/* use the /dev/cpu_dma_latency trick if it's there */
+	set_latency_target();
+
+	kernelversion = check_kernel();
+
+	if (kernelversion == KV_NOT_SUPPORTED)
+		warn("Running on unknown kernel version...YMMV\n");
+
+	setup_tracer();
+
+	if (check_timer())
+		warn("High resolution timers not available\n");
+
+	mode = use_nanosleep + use_system;
+
+	sigemptyset(&sigset);
+	sigaddset(&sigset, signum);
+	sigprocmask (SIG_BLOCK, &sigset, NULL);
+
+	signal(SIGINT, sighand);
+	signal(SIGTERM, sighand);
+
+	parameters = calloc(num_threads, sizeof(struct thread_param *));
+	if (!parameters)
+		goto out;
+	statistics = calloc(num_threads, sizeof(struct thread_stat *));
+	if (!statistics)
+		goto outpar;
+
+	for (i = 0; i < num_threads; i++) {
+		pthread_attr_t attr;
+		int node;
+		struct thread_param *par;
+		struct thread_stat *stat;
+
+		status = pthread_attr_init(&attr);
+		if (status != 0)
+			fatal("error from pthread_attr_init for thread %d: %s\n", i, strerror(status));
+
+		node = -1;
+		if (numa) {
+			void *stack;
+			void *currstk;
+			size_t stksize;
+
+			/* find the memory node associated with the cpu i */
+			node = rt_numa_numa_node_of_cpu(i);
+
+			/* get the stack size set for for this thread */
+			if (pthread_attr_getstack(&attr, &currstk, &stksize))
+				fatal("failed to get stack size for thread %d\n", i);
+
+			/* if the stack size is zero, set a default */
+			if (stksize == 0)
+				stksize = PTHREAD_STACK_MIN * 2;
+
+			/*  allocate memory for a stack on appropriate node */
+			stack = rt_numa_numa_alloc_onnode(stksize, node, i);
+
+			/* set the thread's stack */
+			if (pthread_attr_setstack(&attr, stack, stksize))
+				fatal("failed to set stack addr for thread %d to 0x%x\n",
+				      i, stack+stksize);
+		}
+
+		/* allocate the thread's parameter block  */
+		parameters[i] = par = threadalloc(sizeof(struct thread_param), node);
+		if (par == NULL)
+			fatal("error allocating thread_param struct for thread %d\n", i);
+		memset(par, 0, sizeof(struct thread_param));
+
+		/* allocate the thread's statistics block */
+		statistics[i] = stat = threadalloc(sizeof(struct thread_stat), node);
+		if (stat == NULL)
+			fatal("error allocating thread status struct for thread %d\n", i);
+		memset(stat, 0, sizeof(struct thread_stat));
+
+		/* allocate the histogram if requested */
+		if (histogram) {
+			int bufsize = histogram * sizeof(long);
+
+			stat->hist_array = threadalloc(bufsize, node);
+			if (stat->hist_array == NULL)
+				fatal("failed to allocate histogram of size %d on node %d\n",
+				      histogram, i);
+			memset(stat->hist_array, 0, bufsize);
+		}
+
+		if (verbose) {
+			int bufsize = VALBUF_SIZE * sizeof(long);
+			stat->values = threadalloc(bufsize, node);
+			if (!stat->values)
+				goto outall;
+			memset(stat->values, 0, bufsize);
+			par->bufmsk = VALBUF_SIZE - 1;
+		}
+
+		par->prio = priority;
+		if (priority && (policy == SCHED_FIFO || policy == SCHED_RR))
+			par->policy = policy;
+		else {
+			par->policy = SCHED_OTHER;
+			force_sched_other = 1;
+		}
+		if (priospread)
+			priority--;
+		par->clock = clocksources[clocksel];
+		par->mode = mode;
+		par->timermode = timermode;
+		par->signal = signum;
+		par->interval = interval;
+		if (!histogram) /* same interval on CPUs */
+			interval += distance;
+		if (verbose)
+			printf("Thread %d Interval: %d\n", i, interval);
+		par->max_cycles = max_cycles;
+		par->stats = stat;
+		par->node = node;
+		switch (setaffinity) {
+		case AFFINITY_UNSPECIFIED: par->cpu = -1; break;
+		case AFFINITY_SPECIFIED: par->cpu = affinity; break;
+		case AFFINITY_USEALL: par->cpu = i % max_cpus; break;
+		}
+		stat->min = 1000000;
+		stat->max = 0;
+		stat->avg = 0.0;
+		stat->threadstarted = 1;
+		status = pthread_create(&stat->thread, &attr, timerthread, par);
+		if (status)
+			fatal("failed to create thread %d: %s\n", i, strerror(status));
+
+	}
+
+	while (!shutdown) {
+		char lavg[256];
+		int fd, len, allstopped = 0;
+		static char *policystr = NULL;
+		static char *slash = NULL;
+		static char *policystr2;
+
+		if (!policystr)
+			policystr = policyname(policy);
+
+		if (!slash) {
+			if (force_sched_other) {
+				slash = "/";
+				policystr2 = policyname(SCHED_OTHER);
+			} else
+				slash = policystr2 = "";
+		}
+		if (!verbose && !quiet) {
+			fd = open("/proc/loadavg", O_RDONLY, 0666);
+			len = fd >= 0 ? read(fd, &lavg, 255) : 0;
+			if (fd >= 0)
+				close(fd);
+			if (len > 0)
+				lavg[len-1] = 0x0;	/* strip trailing newline */
+			else
+				lavg[0] = 0x0;
+			printf("policy: %s%s%s: loadavg: %s          \n\n",
+			       policystr, slash, policystr2, lavg);
+		}
+
+		for (i = 0; i < num_threads; i++) {
+
+			print_stat(parameters[i], i, verbose);
+			if(max_cycles && statistics[i]->cycles >= max_cycles)
+				allstopped++;
+		}
+
+		usleep(10000);
+		if (shutdown || allstopped)
+			break;
+		if (!verbose && !quiet)
+			printf("\033[%dA", num_threads + 2);
+
+		if (refresh_on_max) {
+			pthread_mutex_lock(&refresh_on_max_lock);
+			pthread_cond_wait(&refresh_on_max_cond,
+					  &refresh_on_max_lock);
+			pthread_mutex_unlock(&refresh_on_max_lock);
+		}
+	}
+	ret = EXIT_SUCCESS;
+
+ outall:
+	shutdown = 1;
+	usleep(50000);
+
+	if (quiet)
+		quiet = 2;
+	for (i = 0; i < num_threads; i++) {
+		if (statistics[i]->threadstarted > 0)
+			pthread_kill(statistics[i]->thread, SIGTERM);
+		if (statistics[i]->threadstarted) {
+			pthread_join(statistics[i]->thread, NULL);
+			if (quiet && !histogram)
+				print_stat(parameters[i], i, 0);
+		}
+		if (statistics[i]->values)
+			threadfree(statistics[i]->values, VALBUF_SIZE*sizeof(long), parameters[i]->node);
+	}
+
+	if (histogram) {
+		print_hist(parameters, num_threads);
+		for (i = 0; i < num_threads; i++)
+			threadfree(statistics[i]->hist_array, histogram*sizeof(long), parameters[i]->node);
+	}
+
+	if (tracelimit) {
+		print_tids(parameters, num_threads);
+		if (break_thread_id) {
+			printf("# Break thread: %d\n", break_thread_id);
+			printf("# Break value: %llu\n", (unsigned long long)break_thread_value);
+		}
+	}
+	
+
+	for (i=0; i < num_threads; i++) {
+		if (!statistics[i])
+			continue;
+		threadfree(statistics[i], sizeof(struct thread_stat), parameters[i]->node);
+	}
+
+ outpar:
+	for (i = 0; i < num_threads; i++) {
+		if (!parameters[i])
+			continue;
+		threadfree(parameters[i], sizeof(struct thread_param), parameters[i]->node);
+	}
+ out:
+	/* ensure that the tracer is stopped */
+	if (tracelimit)
+		tracing(0);
+
+
+	/* close any tracer file descriptors */
+	if (trace_fd >= 0)
+		close(trace_fd);
+
+	if (enable_events)
+		/* turn off all events */
+		event_disable_all();
+
+	/* turn off the function tracer */
+	fileprefix = procfileprefix;
+	if (tracetype)
+		setkernvar("ftrace_enabled", "0");
+	fileprefix = get_debugfileprefix();
+
+	/* unlock everything */
+	if (lockall)
+		munlockall();
+
+	/* Be a nice program, cleanup */
+	if (kernelversion < KV_26_33)
+		restorekernvars();
+
+	/* close the latency_target_fd if it's open */
+	if (latency_target_fd >= 0)
+		close(latency_target_fd);
+
+	exit(ret);
+}
-- 
1.7.4.1





* [PATCH 3/4] Add cyclicload calibration & load generation feature
  2012-08-30  9:56 [PATCH 0/4] Add cyclicload testtool support Priyanka Jain
  2012-08-30  9:56 ` [PATCH 1/4] Add README for cyclicload test tool Priyanka Jain
  2012-08-30  9:56 ` [PATCH 2/4] Duplicates cyclictest code as cyclicload Priyanka Jain
@ 2012-08-30  9:56 ` Priyanka Jain
  2012-10-19  4:32   ` Jain Priyanka-B32167
  2012-08-30  9:56 ` [PATCH 4/4] Add cyclicload manual page Priyanka Jain
  3 siblings, 1 reply; 7+ messages in thread
From: Priyanka Jain @ 2012-08-30  9:56 UTC (permalink / raw)
  To: jkacur, williams, frank.rowand, linux-rt-users, dvhart,
	Rajan.Srivastava
  Cc: Poonam.Aggrwal, Priyanka Jain

Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
---
 Makefile                    |    7 +-
 src/cyclicload/cyclicload.c |  550 ++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 546 insertions(+), 11 deletions(-)

diff --git a/Makefile b/Makefile
index 3a82407..5f48262 100644
--- a/Makefile
+++ b/Makefile
@@ -2,7 +2,7 @@ VERSION_STRING = 0.84
 
 sources = cyclictest.c signaltest.c pi_stress.c rt-migrate-test.c	\
 	  ptsematest.c sigwaittest.c svsematest.c pmqtest.c sendme.c 	\
-	  pip_stress.c hackbench.c
+	  pip_stress.c hackbench.c cyclicload.c
 
 TARGETS = $(sources:.c=)
 
@@ -47,6 +47,7 @@ VPATH	+= src/pmqtest:
 VPATH	+= src/backfire:
 VPATH	+= src/lib
 VPATH	+= src/hackbench
+VPATH	+= src/cyclicload
 
 %.o: %.c
 	$(CC) -D VERSION_STRING=$(VERSION_STRING) -c $< $(CFLAGS)
@@ -98,6 +99,9 @@ pip_stress: pip_stress.o librttest.a
 hackbench: hackbench.o
 	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $^ $(LIBS)
 
+cyclicload: cyclicload.o librttest.a
+	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $^ $(LIBS) $(NUMA_LIBS)
+
 librttest.a: rt-utils.o error.o rt-get_cpu.o
 	$(AR) rcs librttest.a rt-utils.o error.o rt-get_cpu.o
 
@@ -140,6 +144,7 @@ install: all
 	gzip src/pmqtest/pmqtest.8 -c >"$(DESTDIR)$(mandir)/man8/pmqtest.8.gz"
 	gzip src/backfire/sendme.8 -c >"$(DESTDIR)$(mandir)/man8/sendme.8.gz"
 	gzip src/hackbench/hackbench.8 -c >"$(DESTDIR)$(mandir)/man8/hackbench.8.gz"
+	gzip src/cyclicload/cyclicload.8 -c >"$(DESTDIR)$(mandir)/man8/cyclicload.8.gz"
 
 .PHONY: release
 release: clean changelog
diff --git a/src/cyclicload/cyclicload.c b/src/cyclicload/cyclicload.c
index 11b6cea..ee43816 100644
--- a/src/cyclicload/cyclicload.c
+++ b/src/cyclicload/cyclicload.c
@@ -1,13 +1,28 @@
 /*
- * High resolution timer test software
+ * Load generation test software
  *
- * (C) 2008-2012 Clark Williams <williams@redhat.com>
- * (C) 2005-2007 Thomas Gleixner <tglx@linutronix.de>
+ * Author: Priyanka.Jain@freescale.com
+ * Based on cyclictest code
+ *
+ * Copyright 2012 Freescale Semiconductor, Inc.
+ *
+ * See file CREDITS for list of people who contributed to this
+ * project.
  *
  * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License Version
- * 2 as published by the Free Software Foundation.
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
  *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+ * MA 02111-1307 USA
  */
 
 #include <stdio.h>
@@ -25,6 +40,7 @@
 #include <errno.h>
 #include <limits.h>
 #include <linux/unistd.h>
+#include <semaphore.h>
 
 #include <sys/prctl.h>
 #include <sys/stat.h>
@@ -34,7 +50,7 @@
 #include <sys/resource.h>
 #include <sys/utsname.h>
 #include <sys/mman.h>
-#include "rt_numa.h"
+#include "../cyclictest/rt_numa.h"
 
 #include "rt-utils.h"
 
@@ -157,6 +173,17 @@ struct thread_stat {
 	long redmax;
 	long cycleofmax;
 	long hist_overflow;
+	unsigned long load2_start;
+	pthread_t thread_t2;
+	int threadt2_started;
+	double avg_t1;
+	double avg_t2;
+	int done_t1;
+	int done_t2;
+	int num_t1;
+	int num_t2;
+	int next_window_started;
+	sem_t next_window_sem;
 };
 
 static int shutdown;
@@ -174,6 +201,16 @@ static int use_nsecs = 0;
 static int refresh_on_max;
 static int force_sched_other;
 static int priospread = 0;
+static int load_t1;
+static int load_t2;
+static int priority_t2;
+static int nice_t2;
+#define MAX_CORES 16
+#define FILENAME "calibrate_count"
+
+/* calibration count in microseconds */
+#define CALIBRATE_COUNT_TIME 1000
+static int calibrate_count_array[MAX_CORES];
 
 static pthread_cond_t refresh_on_max_cond = PTHREAD_COND_INITIALIZER;
 static pthread_mutex_t refresh_on_max_lock = PTHREAD_MUTEX_INITIALIZER;
@@ -662,6 +699,70 @@ try_again:
 	return err;
 }
 
+static inline void generate_load(int loops, int *done, int *next_window)
+{
+	/* initialize with arbitrary values */
+	/* volatile keeps the compiler from optimizing the loop away */
+	volatile int a = 144;
+	int b = 193, c = 182, d = 987;
+	*done = 0;
+	while ((loops-- > 0) && (*next_window == 0)) {
+		a = b + c * d;
+		b = d + a - c;
+		c = b * d;
+		d = a * c + b;
+		*done = *done + 1;
+	}
+}
+
+void *load2_thread(void *param)
+{
+	struct thread_param *par = param;
+	struct thread_stat *stat = par->stats;
+	struct sched_param schedp;
+	pthread_t thread;
+	cpu_set_t mask;
+
+	if (par->cpu != -1) {
+		CPU_ZERO(&mask);
+		CPU_SET(par->cpu, &mask);
+		thread = pthread_self();
+		if (pthread_setaffinity_np(thread, sizeof(mask), &mask) != 0)
+			warn("Could not set CPU affinity to CPU #%d\n",
+				par->cpu);
+	}
+
+	memset(&schedp, 0, sizeof(schedp));
+	schedp.sched_priority = priority_t2;
+	if (priority_t2 == 0) {
+		if (setscheduler(0, SCHED_OTHER, &schedp))
+			fatal("load2_thread%d: failed to set priority to %d\n",
+				par->cpu, par->prio);
+		if (setpriority(PRIO_PROCESS, 0, nice_t2) == -1)
+			warn("could not set nice value\n");
+
+	} else {
+		if (setscheduler(0, par->policy, &schedp))
+			fatal("load2_thread%d: failed to set priority to %d\n",
+				par->cpu, par->prio);
+	}
+	stat->load2_start = stat->cycles;
+	while (!shutdown) {
+		stat->next_window_started = 0;
+		generate_load(stat->num_t2,  &stat->done_t2,
+			&(stat->next_window_started));
+
+		/* wait for next window */
+		/*
+		 * load2_thread runs at lower priority than timerthread,
+		 * so no locking is required
+		 */
+		sem_wait(&stat->next_window_sem);
+	}
+	stat->threadt2_started = -1;
+	return NULL;
+}
+
 /*
  * timer thread
  *
@@ -688,6 +789,9 @@ void *timerthread(void *param)
 	int stopped = 0;
 	cpu_set_t mask;
 	pthread_t thread;
+	struct timespec reduced_interval;
+	int status;
+	int red_interval = par->interval;
 
 	/* if we're running in numa mode, set our memory node */
 	if (par->node != -1)
@@ -723,6 +827,28 @@ void *timerthread(void *param)
 	if (setscheduler(0, par->policy, &schedp)) 
 		fatal("timerthread%d: failed to set priority to %d\n", par->cpu, par->prio);
 
+	if (load_t1) {
+		stat->num_t1 = (calibrate_count_array[par->cpu] *
+			(load_t1 * par->interval/100))/CALIBRATE_COUNT_TIME;
+		red_interval = red_interval * (100 - load_t1) / 100;
+	}
+	reduced_interval.tv_sec = red_interval/USEC_PER_SEC;
+	reduced_interval.tv_nsec = (red_interval%USEC_PER_SEC) * 1000;
+	if (load_t2) {
+		stat->num_t2 = (calibrate_count_array[par->cpu] *
+			(load_t2 * par->interval/100))/CALIBRATE_COUNT_TIME;
+		stat->threadt2_started++;
+		if (sem_init(&stat->next_window_sem, 0, 0) == -1)
+			fatal("failed to init sem %s\n",
+				strerror(errno));
+		status = pthread_create(&stat->thread_t2, NULL, load2_thread,
+			par);
+		if (status)
+			fatal("failed to create load thread %s\n",
+				strerror(status));
+	}
+
 	/* Get current time */
 	clock_gettime(par->clock, &now);
 
@@ -756,6 +882,162 @@ void *timerthread(void *param)
 	}
 
 	stat->threadstarted++;
+	while (!shutdown && stat->num_t1) {
+
+		uint64_t diff;
+		int sigs, ret;
+		int temp = 0;
+		generate_load(stat->num_t1, &stat->done_t1, &temp);
+
+		/* Wait for next period */
+		switch (par->mode) {
+		case MODE_CYCLIC:
+		case MODE_SYS_ITIMER:
+			if (sigwait(&sigset, &sigs) < 0)
+				goto out;
+			break;
+
+		case MODE_CLOCK_NANOSLEEP:
+			if (par->timermode == TIMER_ABSTIME) {
+				ret = clock_nanosleep(par->clock,
+					TIMER_ABSTIME, &next, NULL);
+				if (ret) {
+					if (ret != EINTR) {
+						warn("clock_nanosleep failed %s"
+							, strerror(errno));
+					}
+					goto out;
+				}
+			} else {
+				ret = clock_gettime(par->clock, &now);
+				if (ret) {
+					if (ret != EINTR)
+						warn("clock_gettime failed %s"
+							, strerror(errno));
+					goto out;
+				}
+				/*
+				 * If simulated load, sleep should be for
+				 * reduced interval
+				 */
+				ret = clock_nanosleep(par->clock,
+						TIMER_RELTIME,
+						&reduced_interval, NULL);
+				if (ret) {
+					if (ret != EINTR)
+						warn("clock_nanosleep failed %s"
+							, strerror(errno));
+					goto out;
+				}
+				next.tv_sec = now.tv_sec +
+						reduced_interval.tv_sec;
+				next.tv_nsec = now.tv_nsec +
+						reduced_interval.tv_nsec;
+				tsnorm(&next);
+			}
+			break;
+
+		case MODE_SYS_NANOSLEEP:
+			ret = clock_gettime(par->clock, &now);
+			if (ret) {
+				if (ret != EINTR)
+					warn("clock_gettime() failed: %s",
+							strerror(errno));
+				goto out;
+			}
+			/*
+			 * If simulated load, sleep should be for
+			 * reduced interval
+			 */
+			if (nanosleep(&reduced_interval, NULL)) {
+				if (errno != EINTR)
+					warn("nanosleep failed. errno: %s\n",
+							strerror(errno));
+				goto out;
+			}
+			next.tv_sec = now.tv_sec + reduced_interval.tv_sec;
+			next.tv_nsec = now.tv_nsec + reduced_interval.tv_nsec;
+			tsnorm(&next);
+			break;
+		}
+
+		ret = clock_gettime(par->clock, &now);
+		if (ret) {
+			if (ret != EINTR)
+				warn("clock_getttime() failed. errno: %s\n",
+							strerror(errno));
+			goto out;
+		}
+
+		if (use_nsecs)
+			diff = calcdiff_ns(now, next);
+		else
+			diff = calcdiff(now, next);
+		if (diff < stat->min)
+			stat->min = diff;
+		if (diff > stat->max) {
+			stat->max = diff;
+			if (refresh_on_max)
+				pthread_cond_signal(&refresh_on_max_cond);
+		}
+		stat->avg += (double) diff;
+
+		if (duration && (calcdiff(now, stop) >= 0))
+			shutdown++;
+
+		if (!stopped && tracelimit && (diff > tracelimit)) {
+			stopped++;
+			tracing(0);
+			shutdown++;
+			pthread_mutex_lock(&break_thread_id_lock);
+			if (break_thread_id == 0)
+				break_thread_id = stat->tid;
+			break_thread_value = diff;
+			pthread_mutex_unlock(&break_thread_id_lock);
+		}
+		stat->act = diff;
+
+		if (par->bufmsk)
+			stat->values[stat->cycles & par->bufmsk] = diff;
+
+		/* Update the histogram */
+		if (histogram) {
+			if (diff >= histogram)
+				stat->hist_overflow++;
+			else
+				stat->hist_array[diff]++;
+		}
+
+		stat->cycles++;
+
+		next.tv_sec += interval.tv_sec;
+		next.tv_nsec += interval.tv_nsec;
+		if (par->mode == MODE_CYCLIC) {
+			int overrun_count = timer_getoverrun(timer);
+			next.tv_sec += overrun_count * interval.tv_sec;
+			next.tv_nsec += overrun_count * interval.tv_nsec;
+		}
+		tsnorm(&next);
+		stat->avg_t1 += (double)((stat->done_t1 * 100)/stat->num_t1);
+		/* unfinished load is discarded in the next window */
+		stat->done_t1 = 0;
+
+		if (stat->num_t2) {
+			stat->avg_t2 += (double)((stat->done_t2 * 100)/
+							stat->num_t2);
+			stat->done_t2 = 0;
+			/*
+			 * flag to notify load2_thread that the next
+			 * window has started
+			 */
+			if (!stat->next_window_started) {
+				stat->next_window_started = 1;
+				sem_post(&stat->next_window_sem);
+			}
+		}
+		if (par->max_cycles && par->max_cycles == stat->cycles)
+			break;
+	}
 
 	while (!shutdown) {
 
@@ -867,6 +1149,17 @@ void *timerthread(void *param)
 		}
 		tsnorm(&next);
 
+		if (load_t2) {
+			stat->avg_t2 += (double)((stat->done_t2 * 100)/
+							stat->num_t2);
+			stat->done_t2 = 0;
+			/*
+			 * flag to notify load2_thread that the next
+			 * window has started
+			 */
+			stat->next_window_started = 1;
+			sem_post(&stat->next_window_sem);
+		}
 		if (par->max_cycles && par->max_cycles == stat->cycles)
 			break;
 	}
@@ -888,6 +1181,8 @@ out:
 	sched_setscheduler(0, SCHED_OTHER, &schedp);
 
 	stat->threadstarted = -1;
+	if (load_t2)
+		sem_destroy(&stat->next_window_sem);
 
 	return NULL;
 }
@@ -959,6 +1254,10 @@ static void display_help(int error)
                "                           format: --policy=fifo(default) or --policy=rr\n"
 	       "-S       --smp             Standard SMP testing: options -a -t -n and\n"
                "                           same priority of all threads\n"
+	       "-x       --load_t1         load in percentage for load1_thread\n"
+	       "-X       --load_t2         load in percentage for load2_thread\n"
+	       "-z       --priority_t2     priority of load2_thread\n"
+	       "-Z       --nice_t2         nice value of load2_thread\n"
 	       "-U       --numa            Standard NUMA testing (similar to SMP option)\n"
                "                           thread data structures allocated from local node\n",
 	       tracers
@@ -987,7 +1286,7 @@ enum {
 	AFFINITY_SPECIFIED,
 	AFFINITY_USEALL
 };
-static int setaffinity = AFFINITY_UNSPECIFIED;
+static int setaffinity = AFFINITY_USEALL;
 
 static int clocksources[] = {
 	CLOCK_MONOTONIC,
@@ -1083,10 +1382,15 @@ static void process_options (int argc, char *argv[])
 			{"numa", no_argument, NULL, 'U'},
 			{"latency", required_argument, NULL, 'e'},
 			{"priospread", no_argument, NULL, 'Q'},
+			{"load_t1", required_argument, NULL, 'x'},
+			{"load_t2", required_argument, NULL, 'X'},
+			{"priority_t2", required_argument, NULL, 'z'},
+			{"nice_t2", required_argument, NULL, 'Z'},
 			{NULL, 0, NULL, 0}
 		};
-		int c = getopt_long(argc, argv, "a::b:Bc:Cd:Efh:H:i:Il:MnNo:O:p:PmqQrsSt::uUvD:wWT:y:e:",
-				    long_options, &option_index);
+		int c = getopt_long(argc, argv,
+				"a::b:Bc:Cd:Efh:H:i:Il:MnNo:O:p:PmqQrsSt::uUvD:wWT:y:e:x:X:z:Z:"
+				    , long_options, &option_index);
 		if (c == -1)
 			break;
 		switch (c) {
@@ -1200,7 +1504,18 @@ static void process_options (int argc, char *argv[])
 			if (latency_target_value < 0)
 				latency_target_value = 0;
 			break;
-
+		case 'x':
+			load_t1 = atoi(optarg);
+			break;
+		case 'X':
+			load_t2 = atoi(optarg);
+			break;
+		case 'z':
+			priority_t2 = atoi(optarg);
+			break;
+		case 'Z':
+			nice_t2 = atoi(optarg);
+			break;
 		case '?': display_help(0); break;
 		}
 	}
@@ -1221,6 +1536,9 @@ static void process_options (int argc, char *argv[])
 			    affinity, max_cpus);
 			error = 1;
 		}
+	} else if (setaffinity == AFFINITY_UNSPECIFIED) {
+		warn("thread affinity can't be unspecified for cyclicload\n");
+		error = 1;
 	} else if (tracelimit)
 		fileprefix = procfileprefix;
 
@@ -1268,6 +1586,15 @@ static void process_options (int argc, char *argv[])
 
 	if (num_threads < 1)
 		error = 1;
+	/* 1% load is reserved for the control framework */
+	if ((load_t1 + load_t2) > 99) {
+		fprintf(stderr, "load can't be greater than 99%%\n");
+		error = 1;
+	}
+	if (priority_t2 < 0 || priority_t2 > priority) {
+		fprintf(stderr, "incorrect priority_t2\n");
+		error = 1;
+	}
 
 	if (error)
 		display_help(1);
@@ -1344,6 +1671,7 @@ static void print_hist(struct thread_param *par[], int nthreads)
 	int i, j;
 	unsigned long long int log_entries[nthreads+1];
 	unsigned long maxmax, alloverflows;
+	unsigned long load2_cycles;
 
 	bzero(log_entries, sizeof(log_entries));
 
@@ -1401,6 +1729,24 @@ static void print_hist(struct thread_param *par[], int nthreads)
 	if (histofall && nthreads > 1)
 		printf(" %05lu", alloverflows);
 	printf("\n");
+	if (load_t1) {
+		printf("# Avg Load t1");
+		for (j = 0; j < nthreads; j++)
+			printf(" %05lu", par[j]->stats->cycles ?
+			(long)(par[j]->stats->avg_t1/par[j]->stats->cycles) :
+			0);
+		printf("\n");
+	}
+	if (load_t2) {
+		printf("# Avg Load t2");
+		for (j = 0; j < nthreads; j++) {
+			load2_cycles = par[j]->stats->cycles -
+					par[j]->stats->load2_start;
+			printf(" %05lu", load2_cycles ?
+			(long)(par[j]->stats->avg_t2/load2_cycles) : 0);
+		}
+		printf("\n");
+	}
 }
 
 static void print_stat(struct thread_param *par, int index, int verbose)
@@ -1410,6 +1756,8 @@ static void print_stat(struct thread_param *par, int index, int verbose)
 	if (!verbose) {
 		if (quiet != 1) {
 			char *fmt;
+			unsigned long load2_cycles =
+					stat->cycles - stat->load2_start;
 			if (use_nsecs)
                                 fmt = "T:%2d (%5d) P:%2d I:%ld C:%7lu "
 					"Min:%7ld Act:%8ld Avg:%8ld Max:%8ld\n";
@@ -1420,6 +1768,18 @@ static void print_stat(struct thread_param *par, int index, int verbose)
                                par->interval, stat->cycles, stat->min, stat->act,
 			       stat->cycles ?
 			       (long)(stat->avg/stat->cycles) : 0, stat->max);
+			if (load_t1)
+				printf("\tAvgload1:%2ld.%2ld\n",
+					stat->cycles ?
+					(long)(stat->avg_t1/stat->cycles) : 0,
+					stat->cycles ?
+					((long)stat->avg_t1%stat->cycles) : 0);
+			if (load_t2)
+				printf("\tAvgload2:%2ld.%2ld\n",
+					load2_cycles ?
+					(long)(stat->avg_t2/load2_cycles) : 0,
+					load2_cycles ?
+					(long)stat->avg_t2%load2_cycles : 0);
 		}
 	} else {
 		while (stat->cycles != stat->cyclesread) {
@@ -1441,6 +1801,125 @@ static void print_stat(struct thread_param *par, int index, int verbose)
 	}
 }
 
+int calibrate_count_per_unit(int interval_per_unit)
+{
+	int diff = 1, x = 0;
+	struct timespec start, end;
+	int i, clock, k = 0, ret;
+	int count = 1;
+	int temp = 0;
+	int flag = 0;
+	int min = -1;
+
+	clock = clocksources[clocksel];
+
+	/* interval_per_unit is in us */
+	if (use_nsecs)
+		interval_per_unit = interval_per_unit * 1000;
+
+	/*
+	 * take the minimum over 10 iterations to get the least
+	 * count needed to generate a particular load
+	 */
+	for (i = 0 ; i < 10 ; i++) {
+		count = 1;
+		diff = 1;
+		x = 0;
+		while (diff < interval_per_unit) {
+			count *= 10;
+			x++;
+			ret = clock_gettime(clock, &start);
+			if (ret) {
+				if (ret != EINTR)
+					warn("clock_gettime() failed: %s",
+						strerror(errno));
+				return -1;
+			}
+			generate_load(count, &temp, &flag);
+			ret = clock_gettime(clock, &end);
+			if (ret) {
+				if (ret != EINTR)
+					warn("clock_gettime() failed: %s",
+						strerror(errno));
+				return -1;
+			}
+			if (use_nsecs)
+				diff = (calcdiff_ns(end, start));
+			else
+				diff = (calcdiff(end, start));
+		}
+		k = count;
+		while ((x > 0) && (diff != interval_per_unit) && (k != 0)) {
+			x--;
+			count += k;
+			k /= 10;
+			do {
+				count -= k;
+				ret = clock_gettime(clock, &start);
+				if (ret) {
+					if (ret != EINTR)
+						warn("clock_gettime() failed:%s"
+							, strerror(errno));
+					return -1;
+				}
+				generate_load(count, &temp, &flag);
+				ret = clock_gettime(clock, &end);
+				if (ret) {
+					if (ret != EINTR)
+						warn("clock_gettime() failed:%s"
+							, strerror(errno));
+					return -1;
+				}
+				if (use_nsecs)
+					diff = (calcdiff_ns(end, start));
+				else
+					diff = (calcdiff(end, start));
+			} while (diff > interval_per_unit);
+		}
+
+		if (diff != interval_per_unit)
+			count = (count * interval_per_unit)/diff;
+
+		if (i == 0)
+			min = count;
+		if (count < min)
+			min = count;
+	}
+	return min;
+}
+
+/*
+ * thread to calibrate data, i.e. the loop count per unit time.
+ * On a multicore system, the thread affines itself to each core
+ * in turn to calibrate the count for that core.
+ */
+void *calibrate_thread(void *arg)
+{
+	struct sched_param schedp;
+	int max_cpus = sysconf(_SC_NPROCESSORS_CONF);
+	int i = 0;
+
+	/* should run at the highest RT priority for proper calibration */
+	memset(&schedp, 0, sizeof(schedp));
+	schedp.sched_priority = 99;
+	sched_setscheduler(0, SCHED_FIFO, &schedp);
+
+	/* For multicore systems, do calibration for all CPUs */
+	for (i = 0; i < max_cpus; i++) {
+		cpu_set_t mask;
+		CPU_ZERO(&mask);
+		CPU_SET(i, &mask);
+		if (sched_setaffinity(0, sizeof(mask), &mask) == -1)
+			warn("Could not set CPU affinity to CPU #%d\n", i);
+
+		/* calibration count is maintained per CALIBRATE_COUNT_TIME */
+		calibrate_count_array[i] =
+			calibrate_count_per_unit(CALIBRATE_COUNT_TIME);
+		if (calibrate_count_array[i] == -1)
+			warn("Could not calibrate CPU #%d\n", i);
+	}
+	return NULL;
+}
+
 int main(int argc, char **argv)
 {
 	sigset_t sigset;
@@ -1451,6 +1930,8 @@ int main(int argc, char **argv)
 	int max_cpus = sysconf(_SC_NPROCESSORS_CONF);
 	int i, ret = -1;
 	int status;
+	pthread_t calibrate_thread_id;
+	FILE *fp;
 
 	process_options(argc, argv);
 
@@ -1495,6 +1976,44 @@ int main(int argc, char **argv)
 	statistics = calloc(num_threads, sizeof(struct thread_stat *));
 	if (!statistics)
 		goto outpar;
+	/*
+	 *For first run:
+	 *		create file
+	 *		Calibrate count per time unit & store in file
+	 *for subsequent run:
+	 *		read calibrated data from file & use
+	 */
+	fp = fopen(FILENAME, "r");
+	if (!fp) {
+		int val = 0;
+		fp = fopen(FILENAME, "w");
+		if (!fp)
+			goto outpar;
+		printf("Calibrating data\n");
+		/* create thread to calibrate count for each cpu*/
+		status = pthread_create(&calibrate_thread_id,
+				NULL, calibrate_thread, NULL);
+		if (status) {
+			fatal("failed to create thread %s\n", strerror(status));
+			goto outfile;
+		}
+		printf("Be patient, this will take some time on the first run\n");
+		printf("It is recommended that the first run be done ");
+		printf("with the least load for proper calibration\n");
+		/* wait for all threads to exit */
+		status = pthread_join(calibrate_thread_id, (void *)&val);
+		if (status) {
+			fatal("failed in pthread_join %s\n", strerror(status));
+			goto outfile;
+		}
+		/* store the array in the file */
+		fwrite(calibrate_count_array,
+			sizeof(calibrate_count_array), 1, fp);
+		printf("Calibration completed\n");
+	} else {
+		/* read the array from the file */
+		fread(calibrate_count_array, sizeof(int), MAX_CORES, fp);
+	}
 
 	for (i = 0; i < num_threads; i++) {
 		pthread_attr_t attr;
@@ -1593,6 +2112,13 @@ int main(int argc, char **argv)
 		stat->min = 1000000;
 		stat->max = 0;
 		stat->avg = 0.0;
+		stat->avg_t1 = 0.0;
+		stat->avg_t2 = 0.0;
+		stat->num_t1 = 0;
+		stat->num_t2 = 0;
+		stat->done_t1 = 0;
+		stat->done_t2 = 0;
+		stat->next_window_started = 1;
 		stat->threadstarted = 1;
 		status = pthread_create(&stat->thread, &attr, timerthread, par);
 		if (status)
@@ -1657,6 +2183,8 @@ int main(int argc, char **argv)
 	for (i = 0; i < num_threads; i++) {
 		if (statistics[i]->threadstarted > 0)
 			pthread_kill(statistics[i]->thread, SIGTERM);
+		if (statistics[i]->threadt2_started > 0)
+			pthread_kill(statistics[i]->thread_t2, SIGTERM);
 		if (statistics[i]->threadstarted) {
 			pthread_join(statistics[i]->thread, NULL);
 			if (quiet && !histogram)
@@ -1686,6 +2214,8 @@ int main(int argc, char **argv)
 			continue;
 		threadfree(statistics[i], sizeof(struct thread_stat), parameters[i]->node);
 	}
+ outfile:
+	fclose(fp);
 
  outpar:
 	for (i = 0; i < num_threads; i++) {
-- 
1.7.4.1




^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 4/4] Add cyclicload manual page
  2012-08-30  9:56 [PATCH 0/4] Add cyclicload testtool support Priyanka Jain
                   ` (2 preceding siblings ...)
  2012-08-30  9:56 ` [PATCH 3/4] Add cyclicload calibration & load generation feature Priyanka Jain
@ 2012-08-30  9:56 ` Priyanka Jain
  3 siblings, 0 replies; 7+ messages in thread
From: Priyanka Jain @ 2012-08-30  9:56 UTC (permalink / raw)
  To: jkacur, williams, frank.rowand, linux-rt-users, dvhart,
	Rajan.Srivastava
  Cc: Poonam.Aggrwal, Priyanka Jain

Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
---
 Use cyclictest manual page as base

 src/cyclicload/cyclicload.8 |  206 +++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 206 insertions(+), 0 deletions(-)
 create mode 100644 src/cyclicload/cyclicload.8

diff --git a/src/cyclicload/cyclicload.8 b/src/cyclicload/cyclicload.8
new file mode 100644
index 0000000..aada025
--- /dev/null
+++ b/src/cyclicload/cyclicload.8
@@ -0,0 +1,206 @@
+.\"                                      Hey, EMACS: -*- nroff -*-
+.TH CYCLICLOAD 8 "August 30, 2012"
+.\" Based on cyclictest.8.
+.\" Please adjust this date whenever revising the manpage.
+.\"
+.\" Some roff macros, for reference:
+.\" .nh        disable hyphenation
+.\" .hy        enable hyphenation
+.\" .ad l      left justify
+.\" .ad b      justify to both left and right margins
+.\" .nf        disable filling
+.\" .fi        enable filling
+.\" .br        insert line break
+.\" .sp <n>    insert n+1 empty lines
+.\" for manpage-specific macros, see man(7)
+.SH NAME
+cyclicload \- Load generation test program
+.SH SYNOPSIS
+.B cyclicload
+.RI "[ \-hfmnqrsvMS ] [\-a " proc " ] [\-b " usec " ] [\-c " clock " ] [\-d " dist " ] \
+[\-h " histogram " ] [\-i " intv " ] [\-l " loop " ] [\-o " red " ] [\-p " prio " ] \
+[\-t " num " ] [\-D " time "] [\-w] [\-W] [\-y " policy " ] [ \-S | \-U ] \
+[\-x " load_t1 " ] [\-X " load_t2 "] [\-z " priority_t2 "] [\-Z " nice_t2 " ]"
+
+.\" .SH DESCRIPTION
+.\" This manual page documents briefly the
+.\" .B cyclictest commands.
+.\" .PP
+.\" \fI<whatever>\fP escape sequences to invoke bold face and italics, respectively.
+.\" \fBcyclicload\fP is a program that...
+.SH OPTIONS
+These programs follow the usual GNU command line syntax, with long
+options starting with two dashes ('\-\-').
+.br
+A summary of options is included below.
+.\" For a complete description, see the Info files.
+.TP
+.B \-a, \-\-affinity[=PROC]
+Run all threads on processor number PROC. If PROC is not specified, run thread #N on processor #N.
+.TP
+.B \-b, \-\-breaktrace=USEC
+Send break trace command when latency > USEC. This is a debugging option to control the latency tracer in the realtime preemption patch.
+It is useful for tracking down unexpectedly large latencies on a system. This option only works with the following kernel config options enabled:
+
+    For kernel < 2.6.24:
+.br
+    * CONFIG_PREEMPT_RT=y
+.br
+    * CONFIG_WAKEUP_TIMING=y
+.br
+    * CONFIG_LATENCY_TRACE=y
+.br
+    * CONFIG_CRITICAL_PREEMPT_TIMING=y
+.br
+    * CONFIG_CRITICAL_IRQSOFF_TIMING=y
+.sp 1
+    For kernel >= 2.6.24:
+.br
+    * CONFIG_PREEMPT_RT=y
+.br
+    * CONFIG_FTRACE
+.br
+    * CONFIG_IRQSOFF_TRACER=y
+.br
+    * CONFIG_PREEMPT_TRACER=y
+.br
+    * CONFIG_SCHED_TRACER=y
+.br
+    * CONFIG_WAKEUP_LATENCY_HIST
+
+
+The USEC parameter to the \-b option defines a maximum latency value, which is compared against the actual latencies of the test. Once the measured latency is higher than the given maximum, the kernel tracer and cyclicload are stopped. The trace can be read from /proc/latency_trace. Please be aware that the tracer adds significant overhead to the kernel, so the latencies will be much higher than on a kernel with latency tracing disabled.
+.TP
+.B \-c, \-\-clock=CLOCK
+Selects the clock, which is used:
+
+    * 0 selects CLOCK_MONOTONIC, which is the monotonic increasing system time (default).
+    * 1 selects CLOCK_REALTIME, which is the time of day time.
+
+CLOCK_REALTIME can be set by settimeofday, while CLOCK_MONOTONIC can not be modified by the user.
+This option has no influence when the \-s option is given.
+.TP
+.B \-C, \-\-context
+Context switch tracing (used with \-b)
+.TP
+.B \-d, \-\-distance=DIST
+Set the distance of thread intervals in microseconds (default is 500us). When cyclicload is called with the \-t option and more than one thread is created, this distance value is added to the interval of the threads: Interval(thread N) = Interval(thread N\-1) + DIST
+.TP
+.B \-E, \-\-event
+Event tracing (used with \-b)
+.TP
+.B \-f, \-\-ftrace
+Enable function tracing using ftrace as tracer. This option is available only with \-b.
+.TP
+.B \-h, \-\-histogram=MAXLATENCYINUS
+Dump a latency histogram to stdout. MAXLATENCYINUS is the maximum latency to be tracked, in microseconds. When \-h is used to gather histogram data, cyclicload runs all threads at the same priority instead of decrementing the priority per thread.
+.TP
+.B \-H, \-\-histofall=MAXLATENCYINUS
+Same as \-h except that an additional histogram column is displayed at the right that contains summary data of all thread histograms. If cyclicload runs a single thread only, the \-H option is equivalent to \-h.
+.TP
+.B \-i, \-\-interval=INTV
+Set the base interval of the thread(s) in microseconds (default is 1000us). This sets the interval of the first thread. See also \-d.
+.TP
+.B \-l, \-\-loops=LOOPS
+Set the number of loops. The default is 0 (endless). This option is useful for automated tests with a given number of test cycles. Cyclicload is stopped once the number of timer intervals has been reached.
+.TP
+.B \-n, \-\-nanosleep
+Use clock_nanosleep instead of posix interval timers. Setting this option runs the tests with clock_nanosleep instead of posix interval timers.
+.TP
+.B \-N, \-\-nsecs
+Show results in nanoseconds instead of microseconds, which is the default unit.
+.TP
+.B \-o, \-\-oscope=RED
+Oscilloscope mode, reduce verbose output by RED.
+.TP
+.B \-O, \-\-traceopt=TRACING_OPTION
+Used to pass tracing options to ftrace tracers. May be invoked multiple
+times for multiple trace options. For available trace options, look at /sys/kernel/debug/tracing/trace_options
+.TP
+.B \-p, \-\-prio=PRIO
+Set the priority of the first thread. The given priority is set to the first test thread. Each further thread gets a lower priority:
+Priority(Thread N) = max(Priority(Thread N\-1) \- 1, 0)
+.TP
+.B \-q, \-\-quiet
+Run the tests quiet and print only a summary on exit. Useful for automated tests, where only the summary output needs to be captured.
+.TP
+.B \-r, \-\-relative
+Use relative timers instead of absolute. The default behaviour of the tests is to use absolute timers. This option is there for completeness and should not be used for reproducible tests.
+.TP
+.B \-s, \-\-system
+Use sys_nanosleep and sys_setitimer instead of posix timers. Note, that \-s can only be used with one thread because itimers are per process and not per thread. \-s in combination with \-n uses the nanosleep syscall and is not restricted to one thread.
+.TP
+.B \-T, \-\-tracer=TRACEFUNC
+Set the ftrace tracer function. Used with the \-b option. Must be one
+of the trace functions available from <debugfs-mountpoint>/kernel/debug/tracing/available_tracers
+.TP
+.B \-t, \-\-threads[=NUM]
+Set the number of test threads (default is 1). Create NUM test threads. If NUM is not specified, NUM is set to
+the number of available CPUs. See \-d, \-i and \-p for further information.
+.TP
+.B \-m, \-\-mlockall
+Lock current and future memory allocations to prevent being paged out
+.TP
+.B \-v, \-\-verbose
+Output values on stdout for statistics. This option is used to gather statistical information about the latency distribution. The output is sent to stdout. The output format is:
+
+n:c:v
+
+where n=task number c=count v=latency value in us. Use this option in combination with \-l
+.TP
+.B \\-D, \-\-duration=TIME
+Run the test for the specified time, which defaults to seconds. Append 'm', 'h', or 'd' to specify minutes, hours or days
+.TP
+.B \\-w, \-\-wakeup
+Task wakeup tracing (used with \-b)
+.TP
+.B \\-W, \-\-wakeuprt
+RT\-task wakeup tracing (used with \-b)
+.TP
+.B \\-y, \-\-policy=NAME
+Set the scheduler policy of the measurement threads,
+where NAME is one of: other, normal, batch, idle, fifo, rr
+.TP
+.B \\-M, \-\-refresh_on_max
+Delay updating the screen until a new max latency is hit (useful for
+running cyclictest on low-bandwidth connections)
+.TP
+.B \\-S, \-\-smp
+Set options for standard testing on SMP systems. Equivalent to using
+the options: "\-t \-a \-n" as well as keeping any specified priority
+equal across all threads
+.TP
+.B \\-U, \-\-numa
+Similar to the above \-\-smp option, this implies the "\-t \-a \-n"
+options, as well as a constant measurement interval, but also forces
+memory allocations using the numa(3) policy library. Thread stacks and
+data structures are allocated from the NUMA node local to the core to
+which the thread is bound. Requires the underlying kernel to have NUMA
+support compiled in.
+.TP
+.B \-x, \-\-load_t1
+Set the load percentage for the load1 thread, simulated at priority \-p.
+.TP
+.B \-X, \-\-load_t2
+Set the load percentage for the load2 thread, simulated at priority_t2.
+If priority_t2 is 0, implying a non\-RT task,
+load2 is simulated with nice value nice_t2.
+.TP
+.B \-z, \-\-priority_t2
+Set the priority of the load2 thread.
+.TP
+.B \-Z, \-\-nice_t2
+Set the nice value of the load2 thread if its priority is zero.
+.\" .SH SEE ALSO
+.\" .BR bar (1),
+.\" .BR baz (1).
+.\" .br
+.\" The programs are documented fully by
+.\" .IR "The Rise and Fall of a Fooish Bar" ,
+.\" available via the Info system.
+.SH AUTHOR
+cyclicload was written by Priyanka Jain <Priyanka.Jain@freescale.com>
+based on cyclictest code by Thomas Gleixner <tglx@linutronix.de>.
+.PP
+This manual page was written by Priyanka Jain <Priyanka.Jain@freescale.com>
+based on cyclictest.8 by Alessio Igor Bogani <abogani@texware.it>
-- 
1.7.4.1




^ permalink raw reply related	[flat|nested] 7+ messages in thread

* RE: [PATCH 3/4] Add cyclicload calibration & load generation feature
  2012-08-30  9:56 ` [PATCH 3/4] Add cyclicload calibration & load generation feature Priyanka Jain
@ 2012-10-19  4:32   ` Jain Priyanka-B32167
  2012-10-19 15:39     ` John Kacur
  0 siblings, 1 reply; 7+ messages in thread
From: Jain Priyanka-B32167 @ 2012-10-19  4:32 UTC (permalink / raw)
  To: williams@redhat.com
  Cc: Aggrwal Poonam-B10812, jkacur@redhat.com,
	frank.rowand@am.sony.com, linux-rt-users@vger.kernel.org,
	Srivastava Rajan-B34330, dvhart@linux.intel.com,
	Jain Priyanka-B32167

Dear Clark,

It has been a long time since I sent the patches for cyclicload.
I can see the cyclicload patch integrated into the 'work' branch on the rt-tests git tree.
Do you have any feedback on how it works?

Also, I have made some improvements to it. Should I send the next version of the cyclicload patch, or a new patch for the changes based on the 'work' branch code?

Regards
Priyanka



-----Original Message-----
From: Jain Priyanka-B32167 
Sent: Thursday, August 30, 2012 3:27 PM
To: jkacur@redhat.com; williams@redhat.com; frank.rowand@am.sony.com; linux-rt-users@vger.kernel.org; dvhart@linux.intel.com; Srivastava Rajan-B34330
Cc: Aggrwal Poonam-B10812; Jain Priyanka-B32167
Subject: [PATCH 3/4] Add cyclicload calibration & load generation feature

Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
---
 Makefile                    |    7 +-
 src/cyclicload/cyclicload.c |  550 ++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 546 insertions(+), 11 deletions(-)

diff --git a/Makefile b/Makefile
index 3a82407..5f48262 100644
--- a/Makefile
+++ b/Makefile
@@ -2,7 +2,7 @@ VERSION_STRING = 0.84
 
 sources = cyclictest.c signaltest.c pi_stress.c rt-migrate-test.c	\
 	  ptsematest.c sigwaittest.c svsematest.c pmqtest.c sendme.c 	\
-	  pip_stress.c hackbench.c
+	  pip_stress.c hackbench.c cyclicload.c
 
 TARGETS = $(sources:.c=)
 
@@ -47,6 +47,7 @@ VPATH	+= src/pmqtest:
 VPATH	+= src/backfire:
 VPATH	+= src/lib
 VPATH	+= src/hackbench
+VPATH	+= src/cyclicload
 
 %.o: %.c
 	$(CC) -D VERSION_STRING=$(VERSION_STRING) -c $< $(CFLAGS) @@ -98,6 +99,9 @@ pip_stress: pip_stress.o librttest.a
 hackbench: hackbench.o
 	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $^ $(LIBS)
 
+cyclicload: cyclicload.o librttest.a
+	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $^ $(LIBS) $(NUMA_LIBS)
+
 librttest.a: rt-utils.o error.o rt-get_cpu.o
 	$(AR) rcs librttest.a rt-utils.o error.o rt-get_cpu.o
 
@@ -140,6 +144,7 @@ install: all
 	gzip src/pmqtest/pmqtest.8 -c >"$(DESTDIR)$(mandir)/man8/pmqtest.8.gz"
 	gzip src/backfire/sendme.8 -c >"$(DESTDIR)$(mandir)/man8/sendme.8.gz"
 	gzip src/hackbench/hackbench.8 -c >"$(DESTDIR)$(mandir)/man8/hackbench.8.gz"
+	gzip src/cyclicload/cyclicload.8 -c >"$(DESTDIR)$(mandir)/man8/cyclicload.8.gz"
 
 .PHONY: release
 release: clean changelog
diff --git a/src/cyclicload/cyclicload.c b/src/cyclicload/cyclicload.c
index 11b6cea..ee43816 100644
--- a/src/cyclicload/cyclicload.c
+++ b/src/cyclicload/cyclicload.c
@@ -1,13 +1,28 @@
 /*
- * High resolution timer test software
+ * Load generation test software
  *
- * (C) 2008-2012 Clark Williams <williams@redhat.com>
- * (C) 2005-2007 Thomas Gleixner <tglx@linutronix.de>
+ * Author: Priyanka.Jain@freescale.com
+ * Based on cyclictest code
+ *
+ * Copyright 2012 Freescale Semiconductor, Inc.
+ *
+ * See file CREDITS for list of people who contributed to this
+ * project.
  *
  * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License Version
- * 2 as published by the Free Software Foundation.
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
  *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+ * MA 02111-1307 USA
  */
 
 #include <stdio.h>
@@ -25,6 +40,7 @@
 #include <errno.h>
 #include <limits.h>
 #include <linux/unistd.h>
+#include <semaphore.h>
 
 #include <sys/prctl.h>
 #include <sys/stat.h>
@@ -34,7 +50,7 @@
 #include <sys/resource.h>
 #include <sys/utsname.h>
 #include <sys/mman.h>
-#include "rt_numa.h"
+#include "../cyclictest/rt_numa.h"
 
 #include "rt-utils.h"
 
@@ -157,6 +173,17 @@ struct thread_stat {
 	long redmax;
 	long cycleofmax;
 	long hist_overflow;
+	unsigned long load2_start;
+	pthread_t thread_t2;
+	int threadt2_started;
+	double avg_t1;
+	double avg_t2;
+	int done_t1;
+	int done_t2;
+	int num_t1;
+	int num_t2;
+	int next_window_started;
+	sem_t next_window_sem;
 };
 
 static int shutdown;
@@ -174,6 +201,16 @@ static int use_nsecs = 0;
 static int refresh_on_max;
 static int force_sched_other;
 static int priospread = 0;
+static int load_t1;
+static int load_t2;
+static int priority_t2;
+static int nice_t2;
+#define MAX_CORES 16
+#define FILENAME "calibrate_count"
+
+/* calibration count in microseconds */
+#define CALIBRATE_COUNT_TIME 1000
+static int calibrate_count_array[MAX_CORES];
 
 static pthread_cond_t refresh_on_max_cond = PTHREAD_COND_INITIALIZER;
 static pthread_mutex_t refresh_on_max_lock = PTHREAD_MUTEX_INITIALIZER;
@@ -662,6 +699,70 @@ try_again:
 	return err;
 }
 
+static inline void generate_load(int loops, int *done, int *next_window)
+{
+	/*initializing with some random values*/
+	/*use volatile to prevent compiler from optimizing */
+	volatile int a = 144;
+	int b = 193, c = 182, d = 987;
+	*done = 0;
+	while ((loops-- > 0) && (*next_window == 0))  {
+		a = b + c * d ;
+		b = d + a - c ;
+		c = b * d;
+		d = a * c + b;
+		*done = *done + 1;
+	}
+}
+
+void *load2_thread(void *param)
+{
+	struct thread_param *par = param;
+	struct thread_stat *stat = par->stats;
+	struct sched_param schedp;
+	pthread_t thread;
+	cpu_set_t mask;
+
+	if (par->cpu != -1) {
+		CPU_ZERO(&mask);
+		CPU_SET(par->cpu, &mask);
+		thread = pthread_self();
+		if (pthread_setaffinity_np(thread, sizeof(mask), &mask) == -1)
+			warn("Could not set CPU affinity to CPU #%d\n",
+				par->cpu);
+	}
+
+	memset(&schedp, 0, sizeof(schedp));
+	schedp.sched_priority = priority_t2;
+	if (priority_t2 == 0) {
+		if (setscheduler(0, SCHED_OTHER, &schedp))
+			fatal("load2_thread%d: failed to set priority to %d\n",
+				par->cpu, par->prio);
+		if (setpriority(PRIO_PROCESS, 0, nice_t2) == -1)
+			warn("could not set nice value\n");
+
+	} else {
+		if (setscheduler(0, par->policy, &schedp))
+			fatal("load2_thread%d: failed to set priority to %d\n",
+				par->cpu, par->prio);
+	}
+	stat->load2_start = stat->cycles;
+	while (!shutdown) {
+		stat->next_window_started = 0;
+		generate_load(stat->num_t2,  &stat->done_t2,
+			&(stat->next_window_started));
+
+		/* wait for next window*/
+		/*
+		 * load2_thread runs at lower priority than timerthread
+		 * so no locking is required
+		 */
+		sem_wait(&stat->next_window_sem);
+	}
+	stat->threadt2_started = -1;
+	return NULL;
+}
+
 /*
  * timer thread
  *
@@ -688,6 +789,9 @@ void *timerthread(void *param)
 	int stopped = 0;
 	cpu_set_t mask;
 	pthread_t thread;
+	struct timespec reduced_interval;
+	int status;
+	int red_interval = par->interval;
 
 	/* if we're running in numa mode, set our memory node */
 	if (par->node != -1)
@@ -723,6 +827,28 @@ void *timerthread(void *param)
 	if (setscheduler(0, par->policy, &schedp)) 
 		fatal("timerthread%d: failed to set priority to %d\n", par->cpu, par->prio);
 
+	if (load_t1) {
+		stat->num_t1 = (calibrate_count_array[par->cpu] *
+			(load_t1 * par->interval/100))/CALIBRATE_COUNT_TIME;
+		/* multiply before dividing: (100 - load_t1)/100 truncates to 0 */
+		red_interval = (red_interval * (100 - load_t1))/100;
+	}
+	reduced_interval.tv_sec = red_interval/USEC_PER_SEC;
+	reduced_interval.tv_nsec = (red_interval%USEC_PER_SEC) * 1000;
+	if (load_t2) {
+		stat->num_t2 = (calibrate_count_array[par->cpu] *
+			(load_t2 * par->interval/100))/CALIBRATE_COUNT_TIME;
+		stat->threadt2_started++;
+		status = pthread_create(&stat->thread_t2, NULL, load2_thread,
+			par);
+		if (status)
+			fatal("failed to create load thread %s\n",
+				strerror(status));
+		status = sem_init(&stat->next_window_sem, 0, 0);
+		if (status)
+			fatal("failed to init sem %s\n",
+				strerror(status));
+	}
+
 	/* Get current time */
 	clock_gettime(par->clock, &now);
 
@@ -756,6 +882,162 @@ void *timerthread(void *param)
 	}
 
 	stat->threadstarted++;
+	while (!shutdown && stat->num_t1) {
+
+		uint64_t diff;
+		int sigs, ret;
+		int temp = 0;
+		generate_load(stat->num_t1, &stat->done_t1, &temp);
+
+		/* Wait for next period */
+		switch (par->mode) {
+		case MODE_CYCLIC:
+		case MODE_SYS_ITIMER:
+			if (sigwait(&sigset, &sigs) < 0)
+				goto out;
+			break;
+
+		case MODE_CLOCK_NANOSLEEP:
+			if (par->timermode == TIMER_ABSTIME) {
+				ret = clock_nanosleep(par->clock,
+					TIMER_ABSTIME, &next, NULL);
+				if (ret) {
+					if (ret != EINTR) {
+						warn("clock_nanosleep failed %s"
+							, strerror(errno));
+					}
+					goto out;
+				}
+			} else {
+				ret = clock_gettime(par->clock, &now);
+				if (ret) {
+					if (ret != EINTR)
+						warn("clock_gettime failed %s"
+							, strerror(errno));
+					goto out;
+				}
+				/*
+				 * If simulated load, sleep should be for
+				 * reduced interval
+				 */
+				ret = clock_nanosleep(par->clock,
+						TIMER_RELTIME,
+						&reduced_interval, NULL);
+				if (ret) {
+					if (ret != EINTR)
+						warn("clock_nanosleep failed %s"
+							, strerror(errno));
+					goto out;
+				}
+				next.tv_sec = now.tv_sec +
+						reduced_interval.tv_sec;
+				next.tv_nsec = now.tv_nsec +
+						reduced_interval.tv_nsec;
+				tsnorm(&next);
+			}
+			break;
+
+		case MODE_SYS_NANOSLEEP:
+			ret = clock_gettime(par->clock, &now);
+			if (ret) {
+				if (ret != EINTR)
+					warn("clock_gettime() failed: %s",
+							strerror(errno));
+				goto out;
+			}
+			/*
+			 * If simulated load, sleep should be for
+			 * reduced interval
+			 */
+			if (nanosleep(&reduced_interval, NULL)) {
+				if (errno != EINTR)
+					warn("nanosleep failed. errno: %s\n",
+							strerror(errno));
+				goto out;
+			}
+			next.tv_sec = now.tv_sec + reduced_interval.tv_sec;
+			next.tv_nsec = now.tv_nsec + reduced_interval.tv_nsec;
+			tsnorm(&next);
+			break;
+		}
+
+		ret = clock_gettime(par->clock, &now);
+		if (ret) {
+			if (ret != EINTR)
+				warn("clock_getttime() failed. errno: %s\n",
+							strerror(errno));
+			goto out;
+		}
+
+		if (use_nsecs)
+			diff = calcdiff_ns(now, next);
+		else
+			diff = calcdiff(now, next);
+		if (diff < stat->min)
+			stat->min = diff;
+		if (diff > stat->max) {
+			stat->max = diff;
+			if (refresh_on_max)
+				pthread_cond_signal(&refresh_on_max_cond);
+		}
+		stat->avg += (double) diff;
+
+		if (duration && (calcdiff(now, stop) >= 0))
+			shutdown++;
+
+		if (!stopped && tracelimit && (diff > tracelimit)) {
+			stopped++;
+			tracing(0);
+			shutdown++;
+			pthread_mutex_lock(&break_thread_id_lock);
+			if (break_thread_id == 0)
+				break_thread_id = stat->tid;
+			break_thread_value = diff;
+			pthread_mutex_unlock(&break_thread_id_lock);
+		}
+		stat->act = diff;
+
+		if (par->bufmsk)
+			stat->values[stat->cycles & par->bufmsk] = diff;
+
+		/* Update the histogram */
+		if (histogram) {
+			if (diff >= histogram)
+				stat->hist_overflow++;
+			else
+				stat->hist_array[diff]++;
+		}
+
+		stat->cycles++;
+
+		next.tv_sec += interval.tv_sec;
+		next.tv_nsec += interval.tv_nsec;
+		if (par->mode == MODE_CYCLIC) {
+			int overrun_count = timer_getoverrun(timer);
+			next.tv_sec += overrun_count * interval.tv_sec;
+			next.tv_nsec += overrun_count * interval.tv_nsec;
+		}
+		tsnorm(&next);
+		stat->avg_t1 += (double)((stat->done_t1 * 100)/stat->num_t1);
+		/*undone load will be discarded in next window*/
+		stat->done_t1 = 0;
+
+		if (stat->num_t2) {
+			stat->avg_t2 += (double)((stat->done_t2 * 100)/
+							stat->num_t2);
+			stat->done_t2 = 0;
+			/*
+			 * flag to inform load2_thread that the next window
+			 * has started
+			 */
+			if (!stat->next_window_started) {
+				stat->next_window_started = 1;
+				sem_post(&stat->next_window_sem);
+			}
+		}
+		if (par->max_cycles && par->max_cycles == stat->cycles)
+			break;
+	}
 
 	while (!shutdown) {
 
@@ -867,6 +1149,17 @@ void *timerthread(void *param)
 		}
 		tsnorm(&next);
 
+		if (load_t2) {
+			stat->avg_t2 += (double)((stat->done_t2 * 100)/
+							stat->num_t2);
+			stat->done_t2 = 0;
+			/*
+			 * flag to inform load2_thread that the next window
+			 * has started
+			 */
+			stat->next_window_started = 1;
+			sem_post(&stat->next_window_sem);
+		}
 		if (par->max_cycles && par->max_cycles == stat->cycles)
 			break;
 	}
@@ -888,6 +1181,8 @@ out:
 	sched_setscheduler(0, SCHED_OTHER, &schedp);
 
 	stat->threadstarted = -1;
+	if (load_t2)
+		sem_destroy(&stat->next_window_sem);
 
 	return NULL;
 }
@@ -959,6 +1254,10 @@ static void display_help(int error)
                "                           format: --policy=fifo(default) or --policy=rr\n"
 	       "-S       --smp             Standard SMP testing: options -a -t -n and\n"
                "                           same priority of all threads\n"
+	       "-x       --load_t1         load in percentage for load1_thread\n"
+	       "-X       --load_t2         load in percentage for load2_thread\n"
+	       "-z       --priority_t2     priority of load2_thread\n"
+	       "-Z       --nice_t2         nice value of load2_thread\n"
 	       "-U       --numa            Standard NUMA testing (similar to SMP option)\n"
                "                           thread data structures allocated from local node\n",
 	       tracers
@@ -987,7 +1286,7 @@ enum {
 	AFFINITY_SPECIFIED,
 	AFFINITY_USEALL
 };
-static int setaffinity = AFFINITY_UNSPECIFIED;
+static int setaffinity = AFFINITY_USEALL;
 
 static int clocksources[] = {
 	CLOCK_MONOTONIC,
@@ -1083,10 +1382,15 @@ static void process_options (int argc, char *argv[])
 			{"numa", no_argument, NULL, 'U'},
 			{"latency", required_argument, NULL, 'e'},
 			{"priospread", no_argument, NULL, 'Q'},
+			{"load_t1", required_argument, NULL, 'x'},
+			{"load_t2", required_argument, NULL, 'X'},
+			{"priority_t2", required_argument, NULL, 'z'},
+			{"nice_t2", required_argument, NULL, 'Z'},
 			{NULL, 0, NULL, 0}
 		};
-		int c = getopt_long(argc, argv, "a::b:Bc:Cd:Efh:H:i:Il:MnNo:O:p:PmqQrsSt::uUvD:wWT:y:e:",
-				    long_options, &option_index);
+		int c = getopt_long(argc, argv,
+				"a::b:Bc:Cd:Efh:H:i:Il:MnNo:O:p:PmqQrsSt::uUvD:wWT:y:e:x:X:z:Z:"
+				    , long_options, &option_index);
 		if (c == -1)
 			break;
 		switch (c) {
@@ -1200,7 +1504,18 @@ static void process_options (int argc, char *argv[])
 			if (latency_target_value < 0)
 				latency_target_value = 0;
 			break;
-
+		case 'x':
+			load_t1 = atoi(optarg);
+			break;
+		case 'X':
+			load_t2 = atoi(optarg);
+			break;
+		case 'z':
+			priority_t2 = atoi(optarg);
+			break;
+		case 'Z':
+			nice_t2 = atoi(optarg);
+			break;
 		case '?': display_help(0); break;
 		}
 	}
@@ -1221,6 +1536,9 @@ static void process_options (int argc, char *argv[])
 			    affinity, max_cpus);
 			error = 1;
 		}
+	} else if (setaffinity == AFFINITY_UNSPECIFIED) {
+		warn("thread affinity can't be unspecified for cyclicload\n");
+		error = 1;
 	} else if (tracelimit)
 		fileprefix = procfileprefix;
 
@@ -1268,6 +1586,15 @@ static void process_options (int argc, char *argv[])
 
 	if (num_threads < 1)
 		error = 1;
+	/* 1% load is reserved for the control framework */
+	if ((load_t1 + load_t2) > 99) {
+		fprintf(stderr, "load can't be greater than 99%%\n");
+		error = 1;
+	}
+	if (priority_t2 < 0 || priority_t2 > priority) {
+		fprintf(stderr, "incorrect priority_t2\n");
+		error = 1;
+	}
 
 	if (error)
 		display_help(1);
@@ -1344,6 +1671,7 @@ static void print_hist(struct thread_param *par[], int nthreads)
 	int i, j;
 	unsigned long long int log_entries[nthreads+1];
 	unsigned long maxmax, alloverflows;
+	unsigned long load2_cycles;
 
 	bzero(log_entries, sizeof(log_entries));
 
@@ -1401,6 +1729,24 @@ static void print_hist(struct thread_param *par[], int nthreads)
 	if (histofall && nthreads > 1)
 		printf(" %05lu", alloverflows);
 	printf("\n");
+	if (load_t1) {
+		printf("# Avg Load t1");
+		for (j = 0; j < nthreads; j++)
+			printf(" %05lu", par[j]->stats->cycles ?
+			(long)(par[j]->stats->avg_t1/par[j]->stats->cycles) :
+			0);
+		printf("\n");
+	}
+	if (load_t2) {
+		printf("# Avg Load t2");
+		for (j = 0; j < nthreads; j++) {
+			load2_cycles = par[j]->stats->cycles -
+					par[j]->stats->load2_start;
+			printf(" %05lu", load2_cycles ?
+			(long)(par[j]->stats->avg_t2/load2_cycles) : 0);
+		}
+		printf("\n");
+	}
 }
 
 static void print_stat(struct thread_param *par, int index, int verbose)
@@ -1410,6 +1756,8 @@ static void print_stat(struct thread_param *par, int index, int verbose)
 	if (!verbose) {
 		if (quiet != 1) {
 			char *fmt;
+			unsigned long load2_cycles =
+					stat->cycles - stat->load2_start;
 			if (use_nsecs)
                                 fmt = "T:%2d (%5d) P:%2d I:%ld C:%7lu "
 					"Min:%7ld Act:%8ld Avg:%8ld Max:%8ld\n";
@@ -1420,6 +1768,18 @@ static void print_stat(struct thread_param *par, int index, int verbose)
                                par->interval, stat->cycles, stat->min, stat->act,
 			       stat->cycles ?
 			       (long)(stat->avg/stat->cycles) : 0, stat->max);
+			if (load_t1)
+				printf("\tAvgload1:%2ld.%2d\n",
+					stat->cycles ?
+					(long)(stat->avg_t1/stat->cycles) : 0,
+					stat->cycles ?
+					((long)stat->avg_t1%stat->cycles) : 0);
+			if (load_t2)
+				printf("\tAvgload2:%2ld.%2d\n",
+					load2_cycles ?
+					(long)(stat->avg_t2/load2_cycles) : 0,
+					load2_cycles ?
+					(long)stat->avg_t2%load2_cycles : 0);
 		}
 	} else {
 		while (stat->cycles != stat->cyclesread) {
@@ -1441,6 +1801,125 @@ static void print_stat(struct thread_param *par, int index, int verbose)
 	}
 }
 
+int calibrate_count_per_unit(int interval_per_unit)
+{
+	int diff = 1, x = 0;
+	struct timespec start, end;
+	int i, clock, k = 0, ret;
+	int count = 1;
+	int temp = 0;
+	int flag = 0;
+	int min = -1;
+
+	clock = clocksources[clocksel];
+
+	/* interval_per_unit is in us */
+	if (use_nsecs)
+		interval_per_unit = interval_per_unit * 1000;
+
+	/*calculate minimum of 10 iterations
+	 *to get least count to generate a particular load
+	 */
+	for (i = 0 ; i < 10 ; i++) {
+		count = 1;
+		diff = 1;
+		x = 0;
+		while (diff < interval_per_unit) {
+			count *= 10;
+			x++;
+			ret = clock_gettime(clock, &start);
+			if (ret) {
+				if (ret != EINTR)
+					warn("clock_gettime() failed: %s",
+						strerror(errno));
+				return -1;
+			}
+			generate_load(count, &temp, &flag);
+			ret = clock_gettime(clock, &end);
+			if (ret) {
+				if (ret != EINTR)
+					warn("clock_gettime() failed: %s",
+						strerror(errno));
+				return -1;
+			}
+			if (use_nsecs)
+				diff = (calcdiff_ns(end, start));
+			else
+				diff = (calcdiff(end, start));
+		}
+		k = count;
+		while ((x > 0) && (diff != interval_per_unit) && (k != 0)) {
+			x--;
+			count += k;
+			k /= 10;
+			do {
+				count -= k;
+				ret = clock_gettime(clock, &start);
+				if (ret) {
+					if (ret != EINTR)
+						warn("clock_gettime() failed:%s"
+							, strerror(errno));
+					return -1;
+				}
+				generate_load(count, &temp, &flag);
+				ret = clock_gettime(clock, &end);
+				if (ret) {
+					if (ret != EINTR)
+						warn("clock_gettime() failed:%s"
+							, strerror(errno));
+					return -1;
+				}
+				if (use_nsecs)
+					diff = (calcdiff_ns(end, start));
+				else
+					diff = (calcdiff(end, start));
+			} while (diff > interval_per_unit);
+		}
+
+		if (diff != interval_per_unit)
+			count = (count * interval_per_unit)/diff;
+
+		if (i == 0)
+			min = count;
+		if (count < min)
+			min = count;
+	}
+	return min;
+}
+
+/*
+ * Thread to calibrate data, i.e. the loop count per unit of time.
+ * On a multicore system, the thread affines itself to each core in
+ * turn to calibrate the count for that core.
+ */
+void *calibrate_thread(void *arg)
+{
+	struct sched_param schedp;
+	int max_cpus = sysconf(_SC_NPROCESSORS_CONF);
+	int i = 0;
+
+	/* should be run at the highest RT priority for proper calibration */
+	memset(&schedp, 0, sizeof(schedp));
+	schedp.sched_priority = 99;
+	sched_setscheduler(0, SCHED_FIFO, &schedp);
+
+	/* For a multicore system, do calibration for all CPUs */
+	for (i = 0; i < max_cpus; i++) {
+		cpu_set_t mask;
+		CPU_ZERO(&mask);
+		CPU_SET(i, &mask);
+		if (sched_setaffinity(0, sizeof(mask), &mask) == -1)
+			warn("Could not set CPU affinity to CPU #%d\n", i);
+
+		/* calibration count is maintained per CALIBRATE_COUNT_TIME */
+		calibrate_count_array[i] =
+			calibrate_count_per_unit(CALIBRATE_COUNT_TIME);
+		if (calibrate_count_array[i] == -1)
+			warn("Could not calibrate for CPU #%d\n", i);
+	}
+	return NULL;
+}
+
 int main(int argc, char **argv)
 {
 	sigset_t sigset;
@@ -1451,6 +1930,8 @@ int main(int argc, char **argv)
 	int max_cpus = sysconf(_SC_NPROCESSORS_CONF);
 	int i, ret = -1;
 	int status;
+	pthread_t calibrate_thread_id;
+	FILE *fp;
 
 	process_options(argc, argv);
 
@@ -1495,6 +1976,44 @@ int main(int argc, char **argv)
 	statistics = calloc(num_threads, sizeof(struct thread_stat *));
 	if (!statistics)
 		goto outpar;
+	/*
+	 *For first run:
+	 *		create file
+	 *		Calibrate count per time unit & store in file
+	 *for subsequent run:
+	 *		read calibrated data from file & use
+	 */
+	fp = fopen(FILENAME, "r");
+	if (!fp) {
+		int val = 0;
+		fp = fopen(FILENAME, "w");
+		if (!fp)
+			goto outpar;
+		printf("Calibrating data\n");
+		/* create thread to calibrate count for each cpu*/
+		status = pthread_create(&calibrate_thread_id,
+				NULL, calibrate_thread, NULL);
+		if (status) {
+			fatal("failed to create thread %s\n", strerror(status));
+			goto outfile;
+		}
+		printf("Be patient, it will take some time on the first run\n");
+		printf("It is recommended to do the first run ");
+		printf("with the least load for proper calibration\n");
+		/* wait for the calibrate thread to exit */
+		status = pthread_join(calibrate_thread_id, (void *)&val);
+		if (status) {
+			fatal("failed in pthread_join %s\n", strerror(status));
+			goto outfile;
+		}
+		/* store the array in the file */
+		fwrite(calibrate_count_array,
+			sizeof(calibrate_count_array), 1, fp);
+		printf("Calibration completed\n");
+	} else {
+		/* read the array from the file */
+		fread(calibrate_count_array, sizeof(int), MAX_CORES, fp);
+	}
 
 	for (i = 0; i < num_threads; i++) {
 		pthread_attr_t attr;
@@ -1593,6 +2112,13 @@ int main(int argc, char **argv)
 		stat->min = 1000000;
 		stat->max = 0;
 		stat->avg = 0.0;
+		stat->avg_t1 = 0.0;
+		stat->avg_t2 = 0.0;
+		stat->num_t1 = 0;
+		stat->num_t2 = 0;
+		stat->done_t1 = 0;
+		stat->done_t2 = 0;
+		stat->next_window_started = 1;
 		stat->threadstarted = 1;
 		status = pthread_create(&stat->thread, &attr, timerthread, par);
 		if (status)
@@ -1657,6 +2183,8 @@ int main(int argc, char **argv)
 	for (i = 0; i < num_threads; i++) {
 		if (statistics[i]->threadstarted > 0)
 			pthread_kill(statistics[i]->thread, SIGTERM);
+		if (statistics[i]->threadt2_started > 0)
+			pthread_kill(statistics[i]->thread_t2, SIGTERM);
 		if (statistics[i]->threadstarted) {
 			pthread_join(statistics[i]->thread, NULL);
 			if (quiet && !histogram)
@@ -1686,6 +2214,8 @@ int main(int argc, char **argv)
 			continue;
 		threadfree(statistics[i], sizeof(struct thread_stat), parameters[i]->node);
 	}
+ outfile:
+	fclose(fp);
 
  outpar:
 	for (i = 0; i < num_threads; i++) {
--
1.7.4.1




^ permalink raw reply related	[flat|nested] 7+ messages in thread

* RE: [PATCH 3/4] Add cyclicload calibration & load generation feature
  2012-10-19  4:32   ` Jain Priyanka-B32167
@ 2012-10-19 15:39     ` John Kacur
  0 siblings, 0 replies; 7+ messages in thread
From: John Kacur @ 2012-10-19 15:39 UTC (permalink / raw)
  To: Jain Priyanka-B32167
  Cc: williams@redhat.com, Aggrwal Poonam-B10812, jkacur@redhat.com,
	frank.rowand@am.sony.com, linux-rt-users@vger.kernel.org,
	Srivastava Rajan-B34330, dvhart@linux.intel.com



On Fri, 19 Oct 2012, Jain Priyanka-B32167 wrote:

> Dear Clark,
> 
> It's been a long time since I sent the patches for cyclicload.
> I can see the cyclicload patch integrated into the 'work' branch of the rt-tests git tree.
> Do you have any feedback on how it works?
> 
> Also, I have made some improvements to it. Should I send the next version of the cyclicload patch, or a new patch with the changes based on the 'work' branch code?
> 
> Regards
> Priyanka
> 
>
 
------------->o SNIP

Personally I think you should send the next version of the patch, and 
not base it on Clark's work branch. (A working branch is just that: a 
working place to store patches, with no guarantees.)

I think I should warn you that we have been discussing how difficult it is 
to make an artificial load. I worry that your program just adds a thread or 
threads spinning at another priority level (which you can already do with 
cyclictest) without adding any real load. I wouldn't like you to do
a lot of work only to have it ultimately rejected.

Why don't you submit the next version of your work, and then perhaps talk 
a little more about what you intend to do, and see what other people on 
the mailing list here think about that.

Thanks

John Kacur

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-10-19 15:38 UTC | newest]

Thread overview: 7+ messages
-- links below jump to the message on this page --
2012-08-30  9:56 [PATCH 0/4] Add cyclicload testtool support Priyanka Jain
2012-08-30  9:56 ` [PATCH 1/4] Add README for cyclicload test tool Priyanka Jain
2012-08-30  9:56 ` [PATCH 2/4] Duplicates cyclictest code as cyclicload Priyanka Jain
2012-08-30  9:56 ` [PATCH 3/4] Add cyclicload calibration & load generation feature Priyanka Jain
2012-10-19  4:32   ` Jain Priyanka-B32167
2012-10-19 15:39     ` John Kacur
2012-08-30  9:56 ` [PATCH 4/4] Add cyclicload manual page Priyanka Jain
