From: Eric Dumazet <dada1@cosmosbay.com>
To: Benjamin LaHaise <bcrl@kvack.org>
Cc: Sonny Rao <sonny@burdell.org>, Linus Torvalds <torvalds@osdl.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: "Read my lips: no more merges" - aka Linux 2.6.14-rc1
Date: Thu, 15 Sep 2005 22:41:15 +0200
Message-ID: <4329DC6B.2040803@cosmosbay.com>
In-Reply-To: <20050915201356.GA20966@kvack.org>

[-- Attachment #1: Type: text/plain, Size: 1184 bytes --]

Benjamin LaHaise wrote:
> On Tue, Sep 13, 2005 at 09:04:32AM +0200, Eric Dumazet wrote:
> 
>>I wish a process parameter could allow open() to take any free fd, not 
>>just the lowest one. One can always use fcntl(fd, F_DUPFD, slot) to move an 
>>fd to a specific high slot and keep the first 64 fd slots free, to 
>>speed up the kernel side of open()/dup()/socket().
> 
> 
> The overhead is easy to avoid by making use of dup2() and close() to keep 
> the lowest file descriptors in the table free, allowing open() and socket() 
> to always return 3 or 4.

Yes, this is what I described :) Maybe this was not clear.
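
To make the idea concrete, here is a minimal sketch of the relocation step, 
assuming a reserve of 64 low slots. The names move_fd_high() and 
LOW_FD_RESERVE are hypothetical and only serve the illustration; the only 
real calls used are fcntl(F_DUPFD) and close().

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical threshold: descriptors below this are kept free. */
#define LOW_FD_RESERVE 64

/*
 * Duplicate fd onto the lowest free slot at or above LOW_FD_RESERVE and
 * close the original, so the low slot is free again for the next
 * open()/socket(). Returns the new descriptor, or the original fd if it
 * is already high enough or if fcntl() fails.
 * Note: F_DUPFD does not copy the close-on-exec flag to the new descriptor.
 */
int move_fd_high(int fd)
{
	int high;

	assert(fd >= 0);
	if (fd >= LOW_FD_RESERVE)
		return fd;		/* already outside the reserved range */
	high = fcntl(fd, F_DUPFD, LOW_FD_RESERVE);
	if (high == -1)
		return fd;		/* e.g. EMFILE: keep using the original */
	close(fd);
	return high;
}
```

A caller would write fd = move_fd_high(open(path, O_RDONLY)) after checking 
that open() succeeded. This plain version still lets the kernel search for a 
free slot starting at 64, so it only helps open() itself; remembering which 
high slots are free, as the attached library does, is what keeps the 
F_DUPFD search short as well.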

> 
> Alternatively, the kernel could track available file descriptors using a 
> tree to efficiently insert freed slots into an ordered list of free 
> regions (something similar to the avl tree used in vmas).  Is it worth 
> doing?

Well no, since a user app can manage this part itself if it happens to be 
performance critical.


Sample of a userland lib: each time a new fd is returned by 
open()/socket()/pipe()/accept()..., the thread should call

fd = fdcache_dupfd(fd);

and close the file with fdcache_closefd(fd) instead of close(fd).

Eric

[-- Attachment #2: fastfdlib.c --]
[-- Type: text/plain, Size: 1521 bytes --]

/*
 * The kernel's get_unused_fd() function is expensive:
 * Unix semantics mandate that an open()/pipe()/socket() call always
 * returns the lowest free fd, not an arbitrary one.
 * Linux uses a linear scan of a bit table.
 * A program handling 1,000,000 files scans about 128 KB of RAM with a
 * spinlock held: no other thread can get an fd in the meantime.
 *
 * The trick is to use this library to keep the 64 lowest fds available, so
 * that the standard Unix functions do not have to scan many fds before
 * finding a free one, and to remap each new fd with fcntl(F_DUPFD) to a
 * precise slot that we manage ourselves.
 */
#include <pthread.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

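#define MAXFDS 1500000

struct {
	pthread_mutex_t lock;
	unsigned int cache_fd;		/* head of the free-slot list, 0 if empty */
	unsigned int next_alloc;	/* next slot that was never handed out */
	unsigned int *cache_tab;	/* cache_tab[fd] = next free slot after fd */
} fdd;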


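/* Must be called once before any other fdcache_*() function. */
void fdcache_init(void)
{
	pthread_mutex_init(&fdd.lock, NULL);
	fdd.cache_tab = calloc(MAXFDS, sizeof(unsigned int));
	if (fdd.cache_tab == NULL)
		abort();
	fdd.next_alloc = 64;	/* slots 0..63 are left to the kernel */
}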

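/*
 * Move fd to a managed high slot and close the original, freeing the low
 * slot. Returns the new descriptor, or fd unchanged if it cannot be moved.
 */
int fdcache_dupfd(int fd)
{
	int ret;

	pthread_mutex_lock(&fdd.lock);
	/* Free list empty: take the next never-used slot. */
	if (fdd.cache_fd == 0)
		fdd.cache_fd = fdd.next_alloc++;
	ret = fcntl(fd, F_DUPFD, fdd.cache_fd);
	if (ret != -1 && ret < MAXFDS) {
		/* Pop the slot: its table entry holds the next free slot (0 if none). */
		fdd.cache_fd = fdd.cache_tab[ret];
		pthread_mutex_unlock(&fdd.lock);
		close(fd);
		return ret;
	}
	pthread_mutex_unlock(&fdd.lock);
	if (ret != -1)
		close(ret);	/* slot lies beyond the table: do not move this fd */
	return fd;
}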

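void fdcache_closefd(int fd)
{
	if (fd < 0)
		return;

	close(fd);

	/* Only managed slots (64 and up, inside the table) are recycled. */
	if (fd < 64 || fd >= MAXFDS)
		return;

	pthread_mutex_lock(&fdd.lock);
	/* Push fd onto the head of the free-slot list. */
	fdd.cache_tab[fd] = fdd.cache_fd;
	fdd.cache_fd = fd;
	pthread_mutex_unlock(&fdd.lock);
}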
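
For readers who want to picture the cost mentioned in the header comment, 
here is a user-space sketch of a "find first zero bit" scan over a table of 
bits. It is only an illustration of the linear-scan idea, not the kernel's 
actual code; first_zero_bit() and BITS_PER_WORD are names made up for this 
example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the first clear bit in map[0..nbits), or nbits if
 * every bit is set. Full words are skipped in one comparison, but the
 * work still grows linearly with the number of used low slots.
 */
size_t first_zero_bit(const unsigned long *map, size_t nbits)
{
	size_t nwords = (nbits + BITS_PER_WORD - 1) / BITS_PER_WORD;
	size_t w, b;

	assert(map != NULL);
	for (w = 0; w < nwords; w++) {
		if (map[w] == ~0UL)
			continue;	/* all slots in this word are in use */
		for (b = 0; b < BITS_PER_WORD; b++) {
			size_t idx = w * BITS_PER_WORD + b;

			if (idx >= nbits)
				return nbits;
			if (!(map[w] & (1UL << b)))
				return idx;
		}
	}
	return nbits;
}
```

With one bit per descriptor, a process holding a million descriptors walks 
on the order of a hundred kilobytes of such a table before it reaches a free 
bit, unless a low bit is clear. Keeping the first 64 slots free means the 
scan stops within the first word or two.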
