From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: ltt-dev@lists.casi.polymtl.ca
Cc: linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Robert Wisniewski <bob@watson.ibm.com>,
	Ingo Molnar <mingo@elte.hu>
Subject: LTTng 0.146, adds extra read-side sub-buffer for flight recorder
Date: Mon, 13 Jul 2009 03:14:40 -0400
Message-ID: <20090713071440.GA32730@Krystal>

Hi,

So, I needed a weekend break from writing my thesis (it's almost over!) ;)
and I had the great idea to try to come up with a way to ensure that
the read side of LTTng flight recorder mode never sees corrupted data.

Basically, this is the main thing Steven has been asking me about for a
while. And it looks like I just figured out a way to do it.

So for flight recorder tracing, this new LTTng version allocates an
extra subbuffer, which the reader exchanges with a writer subbuffer
before reading it.

Normal tracing does not need this extra subbuffer, because the
write-side just drops events when the buffer is full. So we don't
allocate it and we don't perform any exchange. The space
reservation/commit code plays nicely with both flight recorder and
normal tracing schemes.

Here is how I did it:

No modification was required to the buffer space reservation/commit
algorithm. I just had to do the following at the backend level
(responsible for writing data to/reading data from the buffer):

I am using an array of pointers (one pointer for each subbuffer), plus a
pointer to the reader subbuffer. Each of these pointers points to an
array of pages, which are all the pages that constitute a subbuffer.
Reads/writes from/to the buffer are done by accessors which pick up the
right page location within this page table. By modifying the top-level
subbuffer pointer, we can swap a whole subbuffer in a single operation.
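
A minimal user-space sketch of this two-level layout may help picture
it. It is not the LTTng code: the struct names, sizes and helpers
(struct sb_pages, struct backend, backend_write, reader_read,
backend_swap_reader) are all hypothetical, and concurrency is ignored
here on purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sizes, chosen only for illustration. */
#define PAGE_SZ      4096u
#define PAGES_PER_SB 2u
#define NR_SB        4u
#define SB_SIZE      (PAGE_SZ * PAGES_PER_SB)

/* Second level: the pages that constitute one subbuffer. */
struct sb_pages {
	unsigned char *page[PAGES_PER_SB];
};

/* Top level: one pointer per write-side subbuffer, plus the reader's. */
struct backend {
	struct sb_pages *sb[NR_SB];
	struct sb_pages *reader_sb;
};

static struct sb_pages *sb_alloc(void)
{
	struct sb_pages *sb = malloc(sizeof(*sb));

	assert(sb);
	for (size_t i = 0; i < PAGES_PER_SB; i++) {
		sb->page[i] = calloc(1, PAGE_SZ);
		assert(sb->page[i]);
	}
	return sb;
}

static void backend_init(struct backend *b)
{
	for (size_t i = 0; i < NR_SB; i++)
		b->sb[i] = sb_alloc();
	b->reader_sb = sb_alloc();
}

/* Write accessor: maps a linear buffer offset to (subbuffer, page, offset). */
static void backend_write(struct backend *b, size_t offset,
			  const void *src, size_t len)
{
	const unsigned char *s = src;

	while (len > 0) {
		size_t sb_idx = (offset / SB_SIZE) % NR_SB;
		size_t sb_off = offset % SB_SIZE;
		size_t pg_off = sb_off % PAGE_SZ;
		size_t chunk = PAGE_SZ - pg_off;

		if (chunk > len)
			chunk = len;
		memcpy(b->sb[sb_idx]->page[sb_off / PAGE_SZ] + pg_off, s, chunk);
		s += chunk;
		offset += chunk;
		len -= chunk;
	}
}

/* Read accessor: the reader only ever touches its own subbuffer. */
static void reader_read(const struct backend *b, size_t sb_off,
			void *dst, size_t len)
{
	unsigned char *d = dst;

	while (len > 0) {
		size_t pg_off = sb_off % PAGE_SZ;
		size_t chunk = PAGE_SZ - pg_off;

		if (chunk > len)
			chunk = len;
		memcpy(d, b->reader_sb->page[sb_off / PAGE_SZ] + pg_off, chunk);
		d += chunk;
		sb_off += chunk;
		len -= chunk;
	}
}

/* Swapping two top-level pointers moves a whole subbuffer; no data is copied. */
static void backend_swap_reader(struct backend *b, size_t sb_idx)
{
	struct sb_pages *tmp = b->sb[sb_idx];

	b->sb[sb_idx] = b->reader_sb;
	b->reader_sb = tmp;
}
```

In this sketch the offsets passed to backend_write stay plain linear
buffer offsets; only the accessors know which physical pages currently
sit behind a given subbuffer index.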

There is a trick to deal with concurrency between writer and reader.
When a top-level subbuffer pointer is not in use (no writer is
currently writing into its subbuffer, no reader is reading from it), we
set RCHAN_NOREF_FLAG (value: 0x1) in it, which indicates that no
reference is currently taken on this subbuffer. As long as this flag is
set in the pointer, it is safe for the reader to exchange it. When the
writer needs to access this subbuffer for writing, it clears the flag,
and sets it back after committing the last piece of data to it.

When the reader figures out that the write-side subbuffer it is trying
to exchange has a reference, it fails with -EAGAIN.
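
The flag protocol can be sketched in user space with C11 atomics. This
is only an illustration under stated assumptions, not the kernel
implementation: it assumes a single writer per subbuffer, keeps the flag
in the low bit of a pointer-sized word, and the helper names
(writer_get, writer_put, reader_exchange) are hypothetical. Only
RCHAN_NOREF_FLAG, its value and the -EAGAIN result come from the
description above.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define RCHAN_NOREF_FLAG ((uintptr_t)0x1)

/* Hypothetical subbuffer; the long member keeps the low address bit free. */
struct subbuf {
	long seq;
	unsigned char data[56];
};

/* Writer takes a reference: clear the flag and return the subbuffer. */
static struct subbuf *writer_get(_Atomic uintptr_t *slot)
{
	uintptr_t old = atomic_fetch_and(slot, ~RCHAN_NOREF_FLAG);

	return (struct subbuf *)(old & ~RCHAN_NOREF_FLAG);
}

/* Writer drops its reference after committing the last piece of data. */
static void writer_put(_Atomic uintptr_t *slot)
{
	atomic_fetch_or(slot, RCHAN_NOREF_FLAG);
}

/*
 * Reader tries to exchange its own subbuffer with the one in the slot.
 * Returns 0 on success, or -EAGAIN if a writer holds a reference or
 * took one while the exchange was being attempted.
 */
static int reader_exchange(_Atomic uintptr_t *slot, struct subbuf **reader_sb)
{
	uintptr_t old = atomic_load(slot);
	uintptr_t new = (uintptr_t)*reader_sb | RCHAN_NOREF_FLAG;

	if (!(old & RCHAN_NOREF_FLAG))
		return -EAGAIN;
	if (!atomic_compare_exchange_strong(slot, &old, new))
		return -EAGAIN;
	*reader_sb = (struct subbuf *)(old & ~RCHAN_NOREF_FLAG);
	return 0;
}
```

Because the exchange is a single compare-and-swap on the whole word, a
writer that clears the flag first makes the reader's exchange fail, and
a writer that arrives after a successful exchange simply gets the
subbuffer the reader handed over.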

Nice things about the way I do it here:

- I keep the separation between the space reservation layer and back-end
  buffer layer. The extra reader subbuffer exchange is done at the
  back-end layer. The reason why it took me so long to come up with
  something is that I tried to do it at the space reservation layer,
  where it did not fit the space reservation semantics well.

- Keeping space reservation and physical buffer management separate
  helps split complexity into sub-layers that are easier to verify.

- Given that space reservation/commit is separate from the subbuffer
  exchange per se, I don't need any special cases for "if the tail
  pointer is in the reader page". These things never happen because
  the reserve, commit and consumed counts are completely unrelated to
  the pointers to physical subbuffers.

As always, the tree is available at:

http://git.kernel.org/?p=linux/kernel/git/compudj/linux-2.6-lttng.git
git://git.kernel.org/pub/scm/linux/kernel/git/compudj/linux-2.6-lttng.git

The commits implementing the extra reader page for the lockless
scheme are:

lttng-relay-per-subbuffer-index.patch
lttng-relay-per-subbuffer-index-low-bit-noref.patch
lttng-relay-lockless-writer-use-noref-flag.patch
lttng-relay-default-sb-index-to-noref.patch
lttng-relay-lockless-exchange-reader-writer-pages.patch

Comments are welcome,

Thanks,

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
