From: Jens Axboe <jens.axboe@oracle.com>
To: Jim Schutt <jaschut@sandia.gov>
Cc: linux-kernel@vger.kernel.org
Subject: Re: splice/vmsplice performance test results
Date: Thu, 16 Nov 2006 21:25:29 +0100 [thread overview]
Message-ID: <20061116202529.GH7164@kernel.dk> (raw)
In-Reply-To: <1163700539.2672.14.camel@sale659.sandia.gov>
On Thu, Nov 16 2006, Jim Schutt wrote:
> Hi,
>
> I've done some testing to see how splice/vmsplice perform
> vs. other alternatives on transferring a large file across
> a fast network. One option I tested was to use vmsplice
> to get a 1-copy receive, but it didn't perform as well
> as I had hoped. I was wondering if my results were at odds
> with what other people have observed.
>
> I've two systems, each with:
> Tyan S2895 motherboard
> 2 ea. 2.6 GHz Opteron
> 1 GiB memory
> Myricom Myri-10G 10 Gb/s NIC (PCIe x8)
> 2.6.19-rc5-g134a11f0 on FC4
>
> In addition, one system has a 3ware 9590-8ML (PCIe) and a 3ware
> 9550SX-8LP (PCI-X), with 16 Seagate Barracuda 7200.10 SATA drives
> (250 GB ea., NCQ enabled). Write caching is enabled on the 3ware
> cards.
>
> The Myricom cards are connected back-to-back using 9000 byte MTU.
> I baseline the network performance with 'iperf -w 1M -l 64K'
> and get 6.9 Gb/s.
>
> After a fair amount of testing, I settled on a 4-way software
> RAID0 on top of 4-way hardware RAID0 units as giving the best
> streaming performance. The file system is XFS, with the stripe
> unit set to the hardware RAID chunk size, and the stripe width
> 16 times that.
>
> Disk tuning parameters in /sys/block/sd*/queue are default
> values, except queue/nr_requests = 5 gives me best performance.
> (It seems like the 3ware cards slow down a little if I feed them
> too much data on the streaming write test I'm using.)
>
> I baseline file write performance with
> sync; time { dd if=/dev/zero of=./zero bs=32k count=512k; sync; }
> and get 465-520 MB/s (highly variable).
>
> I test baseline file read performance with
> time dd if=./zero of=/dev/null bs=32k count=512k
> and get 950 MB/s (fairly repeatable).
>
> My test program can do one of the following:
>
> send data:
> A) read() from file into buffer, write() buffer into socket
> B) mmap() section of file, write() that into socket, munmap()
> C) splice() from file to pipe, splice() from pipe to socket
>
> receive data:
> 1) read() from socket into buffer, write() buffer into file
> 2) ftruncate() to extend file, mmap() new extent, read()
> from socket into new extent, munmap()
> 3) read() from socket into buffer, vmsplice() buffer to
> pipe, splice() pipe to file (using the double-buffer trick)
>
> Here's the results, using:
> - 64 KiB buffer, mmap extent, or splice
> - 1 MiB TCP window
> - 16 GiB data sent across network
>
> A) from /dev/zero -> 1) to /dev/null : 857 MB/s (6.86 Gb/s)
>
> A) from file -> 1) to /dev/null : 472 MB/s (3.77 Gb/s)
> B) from file -> 1) to /dev/null : 366 MB/s (2.93 Gb/s)
> C) from file -> 1) to /dev/null : 854 MB/s (6.83 Gb/s)
>
> A) from /dev/zero -> 1) to file : 375 MB/s (3.00 Gb/s)
> A) from /dev/zero -> 2) to file : 150 MB/s (1.20 Gb/s)
> A) from /dev/zero -> 3) to file : 286 MB/s (2.29 Gb/s)
>
> I had (naively) hoped the read/vmsplice/splice combination would
> run at the same speed I can write a file, i.e. at about 450 MB/s
> on my setup. Do any of my numbers seem bogus, so I should look
> harder at my test program?
Could be read-ahead playing in here, I'd have to take a closer look at
the generated io patterns to say more about that. Any chance you can
capture iostat or blktrace info for such a run to compare that goes to
the disk? Can you pass along the test program?
> Or is read+write really the fastest way to get data off a
> socket and into a file?
splice() should be just as fast of course, and more efficient. Not a lot
of real-life performance tuning has gone into it yet, so I would not be
surprised if we need to smoothen a few edges.
--
Jens Axboe
next prev parent reply other threads:[~2006-11-16 20:26 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-11-16 18:08 splice/vmsplice performance test results Jim Schutt
2006-11-16 20:25 ` Jens Axboe [this message]
2006-11-16 21:24 ` Jim Schutt
2006-11-17 17:21 ` Jim Schutt
2006-11-20 7:59 ` Jens Axboe
2006-11-20 8:24 ` Jens Axboe
2006-11-20 15:49 ` Jim Schutt
2006-11-21 13:54 ` Jens Axboe
2006-11-21 19:17 ` Jim Schutt
2006-11-22 8:57 ` Jens Axboe
2006-11-22 22:35 ` Jim Schutt
2006-11-23 11:24 ` Jens Axboe
2006-11-27 20:57 ` Jim Schutt
2006-11-16 20:52 ` David Miller
2006-11-16 21:21 ` Jens Axboe
2006-11-16 21:27 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20061116202529.GH7164@kernel.dk \
--to=jens.axboe@oracle.com \
--cc=jaschut@sandia.gov \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox