From: Jens Axboe <axboe@kernel.dk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, akpm@linux-foundation.org
Subject: Re: [RFC PATCH] fuse: support splice() reading from fuse device
Date: Thu, 20 May 2010 19:58:44 +0200
Message-ID: <20100520175844.GW25951@kernel.dk>
In-Reply-To: <alpine.LFD.2.00.1005201043321.23538@i5.linux-foundation.org>

On Thu, May 20 2010, Linus Torvalds wrote:
>
>
> On Thu, 20 May 2010, Miklos Szeredi wrote:
> >
> > With Jens' pipe growing patch and additional fuse patches it was
> > possible to achieve a 20GBytes/s write throughput on my laptop in a
> > "null" filesystem (no page cache, data goes to /dev/null).
>
> Btw, I don't think that is a very interesting benchmark.
>
> The reason I say that is that many man years ago I played with doing
> zero-copy pipe read/write system calls (no splice, just automatic "follow
> the page tables, mark things read-only etc" things). It was considered
> sexy to do things like that during the mid-90's - there were all the crazy
> ukernel people with Mach etc doing magic things with moving pages around.
>
> It got me a couple of gigabytes per second back then (when memcpy() speeds
> were in the tens of megabytes) on benchmarks like lmbench that just wrote
> the same buffer over and over again without ever touching the data.
>
> It was totally worthless on _any_ real load. In fact, it made things
> worse. I never found a single case where it helped.
>
> So please don't ever benchmark things that don't make sense, and then use
> the numbers as any kind of reason to do anything. It's worse than
> worthless. It actually adds negative value to show "look ma, no hands" for
> things that nobody does. It makes people think it's a good idea, and
> optimizes the wrong thing entirely.
>
> Are there actual real loads that get improved? I don't care if it means
> that the improvement goes from three orders of magnitude to just a couple
> of percent. The "couple of percent on actual loads" is a lot more
> important than "many orders of magnitude on a made-up benchmark".

I agree on the basis that these types of benchmarks are fine to validate
the "are there stupid problems in the new code?" question, but not so
much as a comparison for anything.

I can easily run some pure IO benchmarks and send some numbers comparing
64KB vs 1MB pipes on splice. It's been a while since I did that; if I
recall correctly, the biggest issue I ran into back then was beating
on the inode mutex a lot, and not so much the actual syscall entry/exit
count being a multiple of that in comparison tests with read/write using
a larger buffer.

--
Jens Axboe
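
For readers who want to see what "64KB vs 1MB pipes" means in practice,
here is a minimal sketch. It is not from the message above. It assumes the
pipe growing patch corresponds to the fcntl commands that mainline Linux
later exposed as F_SETPIPE_SZ and F_GETPIPE_SZ (Linux 2.6.35 and newer);
the helper names make_pipe and writes_until_full are hypothetical. It only
shows how many 4096-byte writes fit into a pipe before a writer would
block, which is the buffering that a larger pipe buys a splice() caller.
It is not a throughput benchmark.

```python
import fcntl
import os

# Linux fcntl command numbers. Python 3.10+ exports them from the fcntl
# module; the raw values are used as a fallback on older versions.
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)


def make_pipe(capacity):
    """Create a pipe and request at least `capacity` bytes of buffering.

    Returns (read_fd, write_fd, actual_capacity). The kernel may round the
    request up, so the capacity is read back rather than assumed.
    """
    read_fd, write_fd = os.pipe()
    fcntl.fcntl(write_fd, F_SETPIPE_SZ, capacity)
    actual = fcntl.fcntl(write_fd, F_GETPIPE_SZ)
    return read_fd, write_fd, actual


def writes_until_full(write_fd, chunk=4096):
    """Count how many `chunk`-byte writes succeed before the pipe is full."""
    os.set_blocking(write_fd, False)
    buf = b"\0" * chunk
    count = 0
    try:
        while True:
            os.write(write_fd, buf)
            count += 1
    except BlockingIOError:
        # The pipe is full; a blocking writer would have to wait here
        # until a reader drains some data.
        return count
```

Calling make_pipe(64 * 1024) and make_pipe(1024 * 1024) and passing each
write end to writes_until_full() shows how much more data the larger pipe
absorbs per fill/drain cycle. An unprivileged process can only grow a pipe
up to the limit in /proc/sys/fs/pipe-max-size.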