From: Andy Isaacson <adi@hexapodia.org>
To: linux-kernel@vger.kernel.org
Subject: sparse file performance (was Re: Is there a "make hole" (truncate in middle) syscall?)
Date: Fri, 5 Dec 2003 15:00:08 -0600
Message-ID: <20031205150008.B14054@hexapodia.org>
In-Reply-To: <20031204172348.A14054@hexapodia.org>; from adi@hexapodia.org on Thu, Dec 04, 2003 at 05:23:48PM -0600

On Thu, Dec 04, 2003 at 05:23:48PM -0600, Andy Isaacson wrote:
> On Thu, Dec 04, 2003 at 02:32:23PM -0600, Rob Landley wrote:
> > What are the downsides of holes?  (How big do they have to be to
> > actually save space, is there a performance penalty to having a file
> > with 1000 4k holes in it, etc...)
> 
> It's filesystem-dependent; some filesystems don't implement sparse
> files.  The lower bound is one block; on extents-based filesystems like
> XFS it might be bigger.  (If you've got 1GB of data, then a 1MB block of
> zeros, then another GB of data, you're probably better off allocating a
> single 2GB extent rather than two smaller extents with a hole.)
> 
> There's no inherent downside to holey files; in fact they can be a
> straight-up performance win -- that's a block that doesn't need to be
> read from disk, just hand the user a COW pointer to your zero page.  And
> if you're lucky and the preceding and following blocks are allocated
> adjacent on disk, you can do it all as a single streaming IO.

I got curious enough to run some tests, and was surprised at the results.
My machine (Athlon XP 2400+, 2030 MHz, 512 MB, KT400, 2.4.22) can read
out of buffer cache at 234 MB/s, and off of its IDE disk at 40 MB/s.
I'd assumed that read(2)ing a holey file would go faster than reading
out of buffer cache; in theory you could do it completely in L1 cache
(with a 4KB buffer, it's just a ton of syscalls, some page table
manipulation, and a bunch of memcpy() out of a single zero page).  But
it turns out that reading a hole is *slower* than reading data from
buffer cache, just 195 MB/s.

200 MB file       234 MB/s  (with warm caches)
1 GB file          40 MB/s  (exceeds physical memory)
1 GB sparse file  195 MB/s

The 1GB sparse file was created with "dd if=file of=1gsparse bs=1M
count=1 seek=1023"; the filesystem is ext3.

Here's 'vmstat 5' while reading the 200MB file in a loop:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa

 1  0  50968   4468   4872 410424    0    0     0     9  102    46 62 38  0  0
 1  0  50968   4448   4892 410424    0    0     0     6  101    41 62 38  0  0
 1  0  50968   4428   4912 410424    0    0     0     6  101    40 62 38  0  0
 1  0  50968   4404   4936 410424    0    0     0     6  101    37 61 39  0  0
 1  0  50968   4384   4956 410424    0    0     0     8  105   117 60 40  0  0
 1  0  50968   4484   4984 410296    0    0     0     9  103    81 62 38  0  0

Here's 'vmstat 5' while reading the 1GB sparse file in a loop:

 1  0  55448   4460   2464 417320    0    0   217     6  144  3117 45 49  6  0
 1  0  55448   4444   2480 417304    0    0   219     6  204  3237 50 44  6  0
 1  0  55448   4444   2488 417288    0    0   218     9  181  3200 49 45  6  0
 1  0  55460   4456   2468 417140   30    0   249     6  182  3193 46 48  6  0
 1  0  55460   4396   2484 417300    0    2   220    12  140  3084 46 48  6  0
 1  0  55460   4356   2464 417360    0    0   216     2  145  3101 47 48  6  0

The code is simply doing

        while((n = read(fd, buf, sizeof(buf))) > 0) {
                c += n;
                for(i=0; i < n; i++) {
                        hist[(unsigned char)buf[i]]++;
                }
        }

compiled with gcc 3.3.2 -O2.

Code appended.

-andy

#include <stdio.h>
#include <stdarg.h>
#include <stdlib.h>

#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/time.h>
#include <fcntl.h>
#include <ctype.h>

static void
die(char *fmt, ...)
{
	va_list a;
	va_start(a, fmt);
	vfprintf(stderr, fmt, a);
	va_end(a);
	exit(1);
}

double tod(void)
{
	static struct timeval tv1;
	struct timeval tv2;
	double r;

	if(tv1.tv_sec == 0) {
		gettimeofday(&tv1, 0);
		return 0;
	}
	gettimeofday(&tv2, 0);
	r = (tv2.tv_sec - tv1.tv_sec) + (tv2.tv_usec - tv1.tv_usec) / 1e6;
	memcpy(&tv1, &tv2, sizeof(tv1));
	return r;
}

int main(int argc, char **argv)
{
	char buf[4096];
	int fd, i, n, m;
	long long c = 0;
	double t1, t2;
	int hist[256] = { 0 };
	unsigned char *p = (unsigned char *)buf;	/* index hist[] with 0..255 */

	if(argc != 2) die("usage: %s file\n", argv[0]);

	if((fd = open(argv[1], O_RDONLY)) == -1)
		die("%s: %s\n", argv[1], strerror(errno));

	t1 = tod();
	while((n = read(fd, buf, sizeof(buf))) > 0) {
		c += n;
		for(i=0; i < n; i++) {
			hist[p[i]]++;
		}
	}
	t2 = tod();
	if(n == -1) die("read: %s\n", strerror(errno));

	m = 0;
	for(i=1; i<256; i++)
		if(hist[i] > hist[m]) m = i;
	printf("%lld characters read, mode at %d '%c' with %d\n",
			c, m, isprint(m) ? m : '?', hist[m]);
	printf("%f seconds, %f MB/sec\n", t2-t1, c / (t2-t1) / 1e6);
	return 0;
}
