linux-mm.kvack.org archive mirror
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Pierre Tardy <tardyp@gmail.com>, Ingo Molnar <mingo@elte.hu>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Tom Zanussi <tzanussi@gmail.com>,
	Paul Mackerras <paulus@samba.org>,
	linux-kernel@vger.kernel.org, arjan@infradead.org,
	ziga.mahkovec@gmail.com, davem <davem@davemloft.net>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Christoph Lameter <cl@linux-foundation.org>,
	Tejun Heo <tj@kernel.org>, Jens Axboe <jens.axboe@oracle.com>
Subject: Re: [RFC] Tracer Ring Buffer splice() vs page cache [was: Re: Perf and ftrace [was Re: PyTimechart]]
Date: Mon, 17 May 2010 18:42:43 -0400
Message-ID: <20100517224243.GA10603@Krystal>
In-Reply-To: <1273862945.1674.14.camel@laptop>

* Peter Zijlstra (peterz@infradead.org) wrote:
> On Fri, 2010-05-14 at 14:32 -0400, Mathieu Desnoyers wrote:
> 
> > [CCing memory management specialists]
> 
> And yet you forgot Jens who wrote it ;-)

Oops! Thanks for adding him.

> 
> > So I have three questions here:
> > 
> > 1 - could we enforce removal of these pages from the page cache by calling
> >     "page_cache_release()" before giving these pages back to the ring buffer ?
> > 
> > 2 - or maybe is there a page flag we could specify when we allocate them to
> >     ask for these pages to never be put in the page cache ? (but they should be
> >     still usable as write buffers)
> > 
> > 3 - is there something more we need to do to grab a reference on the pages
> >     before passing them to splice(), so that when we call page_cache_release()
> >     they don't get reclaimed ? 
> 
> There is no guarantee it is the pagecache they end up in, it could be a
> network packet queue, a pipe, or anything that implements .splice_write.
> 
> From what I understand of splice(), it assumes it passes
> ownership of the pages; you're not supposed to touch them again, so none of
> the above three are feasible.

Yup, I've looked more deeply at the splice() code, and I now see why things
don't fall apart in LTTng currently: my implementation seems to be causing
splice() to perform a copy. My ring buffer splice implementation is derived from
kernel/relay.c. I override the pipe_buf_operations release op with:

static void ltt_relay_pipe_buf_release(struct pipe_inode_info *pipe,
                                       struct pipe_buffer *pbuf)
{
	/* Intentionally empty: no page reference is dropped here. */
}

and the splice_pipe_desc spd_release op with:

static void ltt_relay_page_release(struct splice_pipe_desc *spd, unsigned int i)
{
	/* Intentionally empty: no page reference is dropped here. */
}

My understanding is that by keeping 2 references on the pages (the ring buffer +
the pipe), splice safely refuses to move the pages and performs a copy instead.
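
A minimal sketch of that reasoning, written as a userspace model rather than
kernel code. It assumes the move-or-copy decision follows the rule that, as far
as I understand it, generic_pipe_buf_steal() applies: a page can only be taken
over when the pipe holds the sole reference to it. The names model_page,
model_try_steal() and model_consume() are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for struct page: a refcount and a payload. */
struct model_page {
	int refcount;
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Stealing succeeds only if the caller holds the sole reference. */
static bool model_try_steal(const struct model_page *page)
{
	assert(page->refcount >= 1);
	return page->refcount == 1;
}

enum model_result { MODEL_MOVED, MODEL_COPIED };

/*
 * Consumer side: take the page itself when it can be stolen, otherwise
 * copy its contents into the consumer's own page and leave the source
 * page untouched.
 */
static enum model_result model_consume(struct model_page *src,
				       struct model_page *dst,
				       struct model_page **out)
{
	if (model_try_steal(src)) {
		*out = src;
		return MODEL_MOVED;
	}
	memcpy(dst->data, src->data, MODEL_PAGE_SIZE);
	*out = dst;
	return MODEL_COPIED;
}
```

With a refcount of 2 (ring buffer + pipe), model_consume() takes the copy path
and the source page stays valid for the ring buffer; with a refcount of 1 it
hands over the page itself.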

I'll continue to look into this. One thing I noticed is that we could
possibly use the "steal()" operation to steal the pages back from the page cache
to repopulate the ring buffer rather than continuously allocating new pages. If
steal() fails for some reason, then we can fall back on page allocation. I'm
not sure it is safe to assume anything about pages being in the page cache
though. Maybe the safest route is to just allocate new pages for now.
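
The steal-then-fall-back idea can be sketched as follows. This is a userspace
model under stated assumptions, not a proposed kernel patch: rb_page,
rb_try_steal() and rb_refill_slot() are hypothetical names, calloc() stands in
for page allocation, and "steal" is modelled as "we hold the only reference".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a page that may have other holders. */
struct rb_page {
	int refcount;
};

/* Model of steal(): only succeeds when we hold the sole reference. */
static bool rb_try_steal(const struct rb_page *page)
{
	return page != NULL && page->refcount == 1;
}

/*
 * Pick the page that repopulates a ring buffer slot: reuse the old page
 * when it can be stolen back, otherwise allocate a fresh page and drop
 * our reference to the old one. Returns NULL if the allocation fails,
 * in which case the old page is left as it was.
 */
static struct rb_page *rb_refill_slot(struct rb_page *old, bool *reused)
{
	struct rb_page *fresh;

	assert(reused != NULL);
	if (rb_try_steal(old)) {
		*reused = true;
		return old;
	}
	*reused = false;
	fresh = calloc(1, sizeof(*fresh));
	if (fresh == NULL)
		return NULL;
	fresh->refcount = 1;
	if (old != NULL)
		old->refcount--;	/* other holders keep the old page alive */
	return fresh;
}
```

The fallback path never touches the contents of a page that someone else still
references, which matches the "just allocate new pages" option when nothing can
be assumed about where the old page ended up.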

Thoughts?

Mathieu


-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com


