From: linux@horizon.com
To: naber@inl.nl
Cc: linux-kernel@vger.kernel.org
Subject: Re: a 15 GB file on tmpfs
Date: 22 Jul 2005 10:38:57 -0000
Message-ID: <20050722103857.17907.qmail@science.horizon.com>
> I have a 15 GB file which I want to place in memory via tmpfs. I want to do
> this because I need to have this data accessible with a very low seek time.
It should work fine. tmpfs has the same file size limits as any other
file system: 2 TB or more, and more than that with CONFIG_LBD.
NOTE, however, that tmpfs does NOT guarantee the data will be in RAM! It
uses the page cache just like any other file system, and pages out unused
data just like any other file system. If you just want average-case fast,
it'll work fine. If you want guaranteed fast, you'll have to work harder.
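To see how much of a file is actually in RAM at a given moment, something
like the following may help. It is only a sketch under assumptions:
resident_pages() is a hypothetical helper name, and it relies on the Linux
mmap() and mincore() calls. The range has to fit in the process's address
space, so on a 32-bit machine you would check one window at a time.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for mincore() */
#endif
#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: count how many pages of the first `len` bytes
 * of `path` are resident in RAM right now.
 * Returns the page count, or -1 on error.
 */
long
resident_pages(const char *path, size_t len)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	size_t npages = (len + pagesz - 1) / pagesz;
	unsigned char *vec;
	void *map;
	long count = -1;
	size_t i;
	int fd;

	if (len == 0)
		return 0;
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);		/* the mapping stays valid after close */
	if (map == MAP_FAILED)
		return -1;
	vec = malloc(npages);	/* mincore() fills one byte per page */
	if (vec != NULL && mincore(map, len, vec) == 0) {
		count = 0;
		for (i = 0; i < npages; i++)
			count += vec[i] & 1;	/* low bit set: page is resident */
	}
	free(vec);
	munmap(map, len);
	return count;
}
```

Comparing the result against the number of pages in the range shows how
much has been paged out. Actually pinning the data in RAM (with mlock(),
for instance) is a separate step and is subject to resource limits.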
> I want to know if this is possible before spending 10,000 euros on a machine
> that has 16 GB of memory.
So create a 15 GB file on an existing machine. Make it sparse, so you
don't need so much RAM, but test to verify that the kernel doesn't
wrap at 4 GB, and can keep the data at offsets 0, 4 GB, 8 GB, and 12 GB
separate.
Works for me (test code below).
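To confirm the test file really is sparse, and so isn't eating RAM or swap
for the holes, you can compare its apparent size with what is allocated.
A minimal sketch, assuming Linux (where st_blocks counts 512-byte units);
sparse_report() is a hypothetical helper name:

```c
#define _FILE_OFFSET_BITS 64
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: report the apparent size of `path` and the
 * number of bytes actually allocated to it.  On Linux, st_blocks is
 * counted in 512-byte units regardless of the file system block size.
 * Returns 0 on success, -1 on error.
 */
int
sparse_report(const char *path, long long *size, long long *allocated)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	*size = (long long)st.st_size;
	*allocated = (long long)st.st_blocks * 512;
	return 0;
}
```

For a sparse file, `allocated` should come out far smaller than `size`.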
> The machine we plan to buy is a HP Proliant Xeon machine and I want to run a
> 32 bit linux kernel on it (the xeon we want doesn't have the 64-bit stuff
> yet)
If you're working with > 4GB data sets, I would recommend you think VERY hard
before deciding not to get a 64-bit machine. If you could just put all 15 GB
into your application's address space:
- The application would be much simpler and faster.
- The kernel wouldn't be slowed by HIGHMEM workarounds. It's not that bad,
but it's definitely noticeable.
- Your expensive new machine won't be obsolete quite as fast.
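As a rough illustration of the "much simpler" point: on a 64-bit machine
the whole file can be mapped once and then indexed like an array. This is
a sketch, not a recommendation of a particular design; map_whole_file() is
a hypothetical helper name, and it assumes POSIX mmap().

```c
#define _FILE_OFFSET_BITS 64
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: map all of `path` read-only and return a
 * pointer to its first byte, storing the length in *len.
 * Returns NULL on error, including the case where the file is larger
 * than the address space can hold (which is what happens to a 15 GB
 * file in a 32-bit process).
 */
const unsigned char *
map_whole_file(const char *path, size_t *len)
{
	struct stat st;
	void *map;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) != 0 || st.st_size <= 0 ||
	    (uintmax_t)st.st_size > SIZE_MAX) {
		close(fd);
		return NULL;
	}
	map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);		/* the mapping stays valid after close */
	if (map == MAP_FAILED)
		return NULL;
	*len = (size_t)st.st_size;
	return map;
}
```

After that, p[off] reads any byte of the file with no lseek() or window
bookkeeping, and munmap((void *)p, len) releases the mapping. A 32-bit
process has to fall back on read()/pread() or on remapping a smaller
window as it goes.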
I'd also like to mention that AMD's large L2 TLB is enormously helpful when
working with large data sets. It's not discussed much on the web sites that
benchmark with games, but it really helps crunch a lot of data.
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

#define STRIDE (1<<20)		/* one record per MB */
#define LIMIT 0x400000000LL	/* 16 GB, well past the 4 GB wrap point */

int
main(int argc, char **argv)
{
	int fd;
	off_t off;

	if (argc != 2) {
		fprintf(stderr, "Wrong number of arguments: %d\n", argc);
		return 1;
	}
	fd = open(argv[1], O_RDWR|O_CREAT|O_LARGEFILE, 0666);
	if (fd < 0) {
		perror(argv[1]);
		return 1;
	}

	/* Write each offset, as a NUL-terminated string, at that offset. */
	for (off = 0; off < LIMIT; off += STRIDE) {
		char buf[40];
		off_t res;
		ssize_t ss1, ss2;

		ss1 = sprintf(buf, "%lld", (long long)off);
		res = lseek(fd, off, SEEK_SET);
		if (res == (off_t)-1) {
			perror("lseek");
			return 1;
		}
		ss2 = write(fd, buf, ++ss1);	/* ++ to include the NUL */
		if (ss2 != ss1) {
			perror("write");
			return 1;
		}
	}

	/* Read it all back; a wrap at 4 GB would have overwritten data. */
	for (off = 0; off < LIMIT; off += STRIDE) {
		char buf[40], buf2[40];
		off_t res;
		ssize_t ss1, ss2;

		ss1 = sprintf(buf, "%lld", (long long)off);
		res = lseek(fd, off, SEEK_SET);
		if (res == (off_t)-1) {
			perror("lseek");
			return 1;
		}
		ss2 = read(fd, buf2, ++ss1);
		if (ss2 != ss1 || memcmp(buf, buf2, ss1) != 0) {
			/* Print only the bytes actually read, if any. */
			fprintf(stderr, "Mismatch at %lld: %.*s vs. %s\n",
				(long long)off, ss2 > 0 ? (int)ss2 : 0,
				buf2, buf);
			return 1;
		}
	}

	close(fd);
	printf("All tests succeeded.\n");
	return 0;
}