From: Adam J <jacobvi123@gmail.com>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai] unmapping and remapping /dev/rtheap
Date: Wed, 04 Sep 2013 07:50:36 -0400
Message-ID: <52271E8C.8070801@gmail.com>
In-Reply-To: <52271128.9010401@siemens.com>

I'm a researcher trying to use a Linux system as a testbed for a
new program sampling methodology. What I want to be able to do is take a
checkpoint at a given point in a program and then run that checkpoint
for a set number of instructions repeatedly with deterministic run time.
I am new to running programs deterministically on Linux.

Currently I am running my experiments on a vanilla Linux 3.10.10 system.
I have all user-space processes disabled except for the shell I'm using
to launch jobs. I am running my process under test with SCHED_FIFO
priority 99 on a cpuset with a single CPU, and I have NUMA emulation set
up to segregate a portion of memory for that CPU from the rest of the
system. DVFS is disabled, and if there is an involuntary context switch
the run is considered to have failed. I'm also flushing the page cache
using drop_caches before restoring the program to get the system state as
repeatable as possible.
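
As a rough illustration of the setup described above, here is a minimal
Python sketch. It is not the script actually used in these experiments.
The function names are hypothetical, and os.sched_setaffinity stands in
for a real cpuset, which would normally be configured through the cpuset
filesystem. Pinning, SCHED_FIFO and drop_caches all need root on Linux.

```python
import os
import resource


def pin_and_prioritize(cpu: int, priority: int = 99) -> None:
    """Pin the calling process to one CPU and switch it to SCHED_FIFO.

    Assumption: affinity is used here instead of a cpuset. Needs root
    (or CAP_SYS_NICE) and a Linux kernel.
    """
    os.sched_setaffinity(0, {cpu})
    os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))


def drop_page_cache() -> None:
    """Write back dirty pages, then ask the kernel to drop clean caches."""
    os.sync()
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")  # 3 = page cache plus dentries and inodes


def involuntary_switches() -> int:
    """Involuntary context switches charged to this process so far."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_nivcsw


def run_failed(switches_before: int, switches_after: int) -> bool:
    """A run is discarded if any involuntary context switch happened."""
    return switches_after != switches_before
```

A run would sample involuntary_switches() immediately before and after
the measured region and pass both values to run_failed().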

My test programs are controlled using performance counters via PAPI and 
signaling. For example, if I want to run a program for 100M instructions 
I write 100,000,000 to a specific file and then send a signal to the 
program telling it to start. The program reads the file, runs for 100M 
instructions, produces the unhalted cycle count, and then pauses. I want 
this unhalted cycle count to be the same every time.
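
The control handshake described above could look roughly like the
following sketch. The signal (SIGUSR1), the function names and the file
format (one decimal integer, commas tolerated) are all assumptions, since
none of them are specified here. The PAPI side, which would arm a counter
overflow at the budget, is left out.

```python
import os
import signal

# Assumption: the actual signal used is not stated; SIGUSR1 is a stand-in.
START_SIGNAL = signal.SIGUSR1


def write_budget(path: str, instructions: int) -> None:
    """Controller side: store the instruction budget for the next run."""
    with open(path, "w") as f:
        f.write(f"{instructions}\n")


def read_budget(path: str) -> int:
    """Program side: read the budget, accepting forms like 100,000,000."""
    with open(path) as f:
        return int(f.read().strip().replace(",", ""))


def start_run(pid: int, path: str, instructions: int) -> None:
    """Controller side: publish the budget, then signal the program to start."""
    write_budget(path, instructions)
    os.kill(pid, START_SIGNAL)
```

Writing the file before sending the signal matters: the program reads the
budget only once it has been told to start.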

I have my experiments set up to be able to repeatedly warm up and run a
region of a program. For example, say I have program A and I want to
measure the run time of 100M instructions starting at instruction 300M in
the program. To set up the experiment, I take a checkpoint at instruction
100M. When running the experiment I restore the checkpoint at
instruction 100M and then run the program for 200M instructions to
warm up the micro-architectural state. Then I run the program for 100M
instructions and measure the run time.
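
The arithmetic in this example can be written down as a small sketch.
SamplePlan and plan_sample are hypothetical names used only for
illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplePlan:
    checkpoint_at: int  # instruction count at which the checkpoint is taken
    warmup: int         # instructions run after restore, not measured
    measure: int        # instructions whose run time is measured


def plan_sample(region_start: int, region_length: int, warmup: int) -> SamplePlan:
    """Place the checkpoint so that warm-up ends exactly at the region start."""
    if warmup > region_start:
        raise ValueError("warm-up cannot begin before the program does")
    return SamplePlan(region_start - warmup, warmup, region_length)


M = 1_000_000
plan = plan_sample(region_start=300 * M, region_length=100 * M, warmup=200 * M)
print(plan.checkpoint_at // M, plan.warmup // M, plan.measure // M)  # 100 200 100
```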

The issue I'm having right now is that I'm getting non-deterministic run
times when running samples, and I'm not sure of the cause. The programs I
am sampling are from SPEC2006, which are not real-time applications. I am
also running the SPEC applications on a non-real-time Linux system. One
hypothesis I have is that, because the system isn't real-time, there are
system-level effects causing the program region to run for a different
amount of time on each execution. To test this hypothesis I wanted to use
a real-time Linux system such as Xenomai to see if running my program
under Xenomai will eliminate the variability. The programs I'm using are
still not designed for real-time Linux, but since I'm only looking at a
small region of the program I'm hoping that compiling the program
unmodified using Xenomai will be sufficient.

Another hypothesis we have is that the issue is due to different memory
mappings being used after every restore, resulting in the program
occupying different ways in the cache. I am not sure how to determine
whether this is the cause of the run time variability, though.
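
One possible way to check this, sketched under assumptions rather than
tested on this setup: read the physical frame number (PFN) behind a
virtual page from /proc/<pid>/pagemap after each restore, and compare
where those frames fall in the cache. The helper names are hypothetical.
The page_color function assumes 4 KiB pages and a physically indexed
cache with plain modulo indexing, which is a simplification for
last-level caches that hash addresses. Newer kernels also hide the PFN
from unprivileged readers, so this would have to run as root there.

```python
import struct

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def decode_pagemap_entry(entry: int):
    """Return the PFN from a 64-bit pagemap entry, or None if not in RAM."""
    present = (entry >> 63) & 1        # bit 63: page present in RAM
    if not present:
        return None
    return entry & ((1 << 55) - 1)     # bits 0-54: page frame number


def read_pfn(pid, vaddr: int):
    """Look up the PFN backing virtual address vaddr in process pid."""
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek((vaddr // PAGE_SIZE) * 8)   # one 8-byte entry per virtual page
        data = f.read(8)
    if len(data) != 8:
        return None
    (entry,) = struct.unpack("=Q", data)
    return decode_pagemap_entry(entry)


def page_color(pfn: int, cache_bytes: int, ways: int,
               page_size: int = PAGE_SIZE) -> int:
    """Index of the group of cache sets (page color) that a frame maps to."""
    colors = max(1, cache_bytes // (ways * page_size))
    return pfn % colors
```

If the colors of the hot pages differ from one restore to the next, the
mapping hypothesis is at least plausible. If they are identical, the
variability probably comes from somewhere else.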

To test my first hypothesis I am trying to take a checkpoint of my
application using CRIU so I can compare the run time variability of the
Xenomai-compiled application to that of the standard compilation of the
application. If you think this is not a valid hypothesis, though, I am
open to other ideas.

Thank you,

Adam



On 9/4/2013 6:53 AM, Jan Kiszka wrote:
> On 2013-09-04 12:39, Adam Jacobvitz wrote:
>> Yeah...I suppose that's the fundamental issue. I'm brand new to xenomai and
>> I'm not really familiar to what state xenomai maintains for an application.
>>
>> Do you know what state I would need to checkpoint and if its even possible
>> to checkpoint it?
> There is a lot, starting with core thread objects, sync objects etc.,
> then there are skin-specific extensions of those and also objects that
> are shared between processes. But even if you export all this, Xenomai
> wasn't designed with this use case in mind. So you may find many tricky
> corner cases around checkpoint/restart - just like in Linux...
>
> What overall use case are you aiming at with CRIU for your RT
> process(es)? What states would your processes be in when
> saving/restoring? And what Xenomai version do you consider for this, 2.6
> or upcoming 3.0?
>
> Jan
>


