* a 15 GB file on tmpfs
@ 2005-07-20 12:16 Bastiaan Naber
  2005-07-20 13:20 ` Erik Mouw
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Bastiaan Naber @ 2005-07-20 12:16 UTC (permalink / raw)
  To: linux-kernel

Hi,

I am not sure if I can ask this here, but I could not find any other place
where I could find anyone with this knowledge.

I have a 15 GB file which I want to place in memory via tmpfs. I want to do 
this because I need to have this data accessible with a very low seek time.

I want to know if this is possible before spending 10,000 euros on a machine 
that has 16 GB of memory. 

The machine we plan to buy is an HP ProLiant Xeon machine and I want to run a
32-bit Linux kernel on it (the Xeon we want doesn't have the 64-bit stuff
yet).

Thanks in advance,
Bastiaan


* Re: a 15 GB file on tmpfs
@ 2005-07-21 15:25 Andrew Burgess
  0 siblings, 0 replies; 19+ messages in thread
From: Andrew Burgess @ 2005-07-21 15:25 UTC (permalink / raw)
  To: linux-kernel

On Wed, Jul 20, 2005 at 02:16:36PM +0200, Bastiaan Naber wrote:
> I have a 15 GB file which I want to place in memory via tmpfs. I want to do 
> this because I need to have this data accessible with a very low seek time.

You don't want tmpfs. You want either (1) ramfs, copying the data in once at
boot time, or (2) any filesystem plus mmap() and mlock(), which will read the
data every time your app starts.

Tmpfs is backed by swap, so other system activity can push some of the file
out to swap, which kills your latency spec.

HTH


* RE: a 15 GB file on tmpfs
@ 2005-07-22  9:39 Cabaniols, Sebastien
  0 siblings, 0 replies; 19+ messages in thread
From: Cabaniols, Sebastien @ 2005-07-22  9:39 UTC (permalink / raw)
  To: Bastiaan Naber; +Cc: linux-kernel

Anyway, if you buy a brand new server, you will find it difficult to get
a 32 bit architecture.

What I don't understand is why you want it to go into tmpfs. Why don't
you let the page cache do its job of putting the data in RAM where it is
needed?

-----Original Message-----
From: linux-kernel-owner@vger.kernel.org
[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Antonio Vargas
Sent: Wednesday, 20 July 2005 15:32
To: Erik Mouw
Cc: Bastiaan Naber; linux-kernel@vger.kernel.org
Subject: Re: a 15 GB file on tmpfs

On 7/20/05, Erik Mouw <erik@harddisk-recovery.com> wrote:
> On Wed, Jul 20, 2005 at 02:16:36PM +0200, Bastiaan Naber wrote:
> > I have a 15 GB file which I want to place in memory via tmpfs. I
> > want to do this because I need to have this data accessible with a
> > very low seek time.
>
> That should be no problem on a 64 bit architecture.
>
> > I want to know if this is possible before spending 10,000 euros on a
> > machine that has 16 GB of memory.
>
> If you want to spend that amount of money on memory anyway, the extra
> cost for an AMD64 machine isn't that large.
>
> > The machine we plan to buy is a HP Proliant Xeon machine and I want
> > to run a 32 bit linux kernel on it (the xeon we want doesn't have
> > the 64-bit stuff yet)
>
> AFAIK you can't use a 15 GB tmpfs on i386 because large memory support
> is basically a hack to support multiple 4GB memory spaces (some VM
> guru correct me if I'm wrong). Just get an Athlon64 machine and run a
> 64 bit kernel on it. If compatibility is a problem, you can still run
> a 32 bit i386 userland on an x86_64 kernel.
> 
> 
> Erik
> 
> --
> +-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
> | Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands
> -

Bastiaan, Erik is dead-on on that one: go 64-bit and forget all worries
about your 15 GB file size. Just don't forget to look not only at x86-64
(Intel or AMD) but also at Itanium, ppc64 and s390 machines; you never
know about surprises!

--
Greetz, Antonio Vargas aka winden of network

http://wind.codepixel.com/

Things are not what they seem, except when they seem what they really are.

* Re: a 15 GB file on tmpfs
@ 2005-07-22 10:38 linux
  0 siblings, 0 replies; 19+ messages in thread
From: linux @ 2005-07-22 10:38 UTC (permalink / raw)
  To: naber; +Cc: linux-kernel

> I have a 15 GB file which I want to place in memory via tmpfs. I want to do 
> this because I need to have this data accessible with a very low seek time.

It should work fine.  tmpfs has the same limits as any other file system,
2 TB or more, and more than that with CONFIG_LBD.

NOTE, however, that tmpfs does NOT guarantee the data will be in RAM!  It
uses the page cache just like any other file system, and pages out unused
data just like any other file system.  If you just want average-case fast,
it'll work fine.  If you want guaranteed fast, you'll have to work harder.

> I want to know if this is possible before spending 10,000 euros on a machine 
> that has 16 GB of memory. 

So create a 15 GB file on an existing machine.  Make it sparse, so you
don't need so much RAM, but test to verify that the kernel doesn't
wrap at 4 GB, and can keep the data at offsets 0, 4 GB, 8 GB, and 12 GB
separate.

Works for me (test code below).

> The machine we plan to buy is a HP Proliant Xeon machine and I want to run a 
> 32 bit linux kernel on it (the xeon we want doesn't have the 64-bit stuff 
> yet)

If you're working with > 4GB data sets, I would recommend you think VERY hard
before deciding not to get a 64-bit machine.  If you could just put all 15 GB
into your application's address space:
- The application would be much simpler and faster.
- The kernel wouldn't be slowed by HIGHMEM workarounds.  It's not that bad,
  but it's definitely noticeable.
- Your expensive new machine won't be obsolete quite as fast.

I'd also like to mention that AMD's large L2 TLB is enormously helpful when
working with large data sets.  It's not discussed much on the web sites that
benchmark with games, but it really helps crunch a lot of data.



#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64	/* make off_t 64 bits wide on 32-bit systems */
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

#define STRIDE (1<<20)		/* one marker every 1 MiB */

int
main(int argc, char **argv)
{
	int fd;
	off_t off;

	if (argc != 2) {
		fprintf(stderr, "Wrong number of arguments: %d\n", argc);
		return 1;
	}
	fd = open(argv[1], O_RDWR|O_CREAT|O_LARGEFILE, 0666);
	if (fd < 0) {
		perror(argv[1]);
		return 1;
	}

	/* Pass 1: at each offset up to 16 GiB, write that offset as a
	 * NUL-terminated decimal string.  The gaps stay sparse. */
	for (off = 0; off < 0x400000000LL; off += STRIDE) {
		char buf[40];
		off_t res;
		ssize_t ss1, ss2;

		/* off_t is not always long long, so cast for printf */
		ss1 = sprintf(buf, "%lld", (long long)off);

		res = lseek(fd, off, SEEK_SET);
		if (res == (off_t)-1) {
			perror("lseek");
			return 1;
		}
		ss2 = write(fd, buf, ++ss1);	/* ++ss1 includes the NUL */
		if (ss2 != ss1) {
			perror("write");
			return 1;
		}
	}

	/* Pass 2: read every marker back and check that no offset has
	 * wrapped onto another one. */
	for (off = 0; off < 0x400000000LL; off += STRIDE) {
		char buf[40], buf2[40];
		off_t res;
		ssize_t ss1, ss2;

		ss1 = sprintf(buf, "%lld", (long long)off);

		res = lseek(fd, off, SEEK_SET);
		if (res == (off_t)-1) {
			perror("lseek");
			return 1;
		}

		ss2 = read(fd, buf2, ++ss1);
		if (ss2 != ss1 || memcmp(buf, buf2, ss1) != 0) {
			/* print nothing from buf2 if read() itself failed */
			fprintf(stderr, "Mismatch at %lld: %.*s vs. %s\n",
				(long long)off, ss2 < 0 ? 0 : (int)ss2, buf2, buf);
			return 1;
		}
	}
	close(fd);
	printf("All tests succeeded.\n");
	return 0;
}



Thread overview: 19+ messages
2005-07-20 12:16 a 15 GB file on tmpfs Bastiaan Naber
2005-07-20 13:20 ` Erik Mouw
2005-07-20 13:31   ` Antonio Vargas
2005-07-20 13:35   ` Miquel van Smoorenburg
2005-07-20 14:44     ` Erik Mouw
2005-07-20 15:23       ` Antonio Vargas
2005-07-21  2:13         ` Jim Nance
2005-07-20 15:18 ` Jan Engelhardt
2005-07-21  6:25 ` Denis Vlasenko
2005-07-21  8:42 ` Bernd Petrovitsch
2005-07-21  9:12   ` Stefan Smietanowski
2005-07-21  9:39     ` Bernd Petrovitsch
2005-07-22 10:36 ` Bernd Eckenfels
2005-07-22 11:00   ` Stefan Smietanowski
2005-07-22 16:25     ` Bernd Eckenfels
2005-07-22 21:10       ` Stefan Smietanowski
  -- strict thread matches above, loose matches on Subject: below --
2005-07-21 15:25 Andrew Burgess
2005-07-22  9:39 Cabaniols, Sebastien
2005-07-22 10:38 linux
