From: Eric Dumazet <dada1@cosmosbay.com>
To: Benjamin LaHaise <bcrl@kvack.org>
Cc: Sonny Rao <sonny@burdell.org>, Linus Torvalds <torvalds@osdl.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: "Read my lips: no more merges" - aka Linux 2.6.14-rc1
Date: Thu, 15 Sep 2005 22:41:15 +0200
Message-ID: <4329DC6B.2040803@cosmosbay.com>
In-Reply-To: <20050915201356.GA20966@kvack.org>

Benjamin LaHaise wrote:
> On Tue, Sep 13, 2005 at 09:04:32AM +0200, Eric Dumazet wrote:
> 
>>I wish a process param could allow open() to take any free fd available, 
>>not the lowest one. One can always use fcntl(fd, F_DUPFD, slot) to move a 
>>fd on a specific high slot and always keep the 64 first fd slots free to 
>>speedup the kernel part at open()/dup()/socket() time.
> 
> 
> The overhead is easy to avoid by making use of dup2() and close() to keep 
> the lowest file descriptors in the table free, allowing open() and socket() 
> to always return 3 or 4.

Yes, this is what I described :) Maybe this was not clear.
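
For reference, here is a minimal sketch of the remapping step discussed above, assuming a POSIX system. The helper name move_fd_high() is hypothetical and is not part of the attached library; it only shows how fcntl(F_DUPFD) moves a descriptor to a high slot so that the low slot becomes free again.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Duplicate fd onto the lowest free slot >= min_slot, then close the
 * original so its low slot is available to the next open()/socket().
 * Returns the new fd, or the original fd if the duplication failed.
 * (Hypothetical helper, for illustration only.)
 */
int move_fd_high(int fd, int min_slot)
{
	int high;

	assert(min_slot >= 0);
	high = fcntl(fd, F_DUPFD, min_slot);
	if (high == -1)
		return fd;	/* e.g. RLIMIT_NOFILE too small: keep the low fd */
	close(fd);
	return high;
}
```

In a single-threaded program, an open() issued right after move_fd_high(fd, 64) should get the same low number that fd had, since that slot is again the lowest free one.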

> 
> Alternatively, the kernel could track available file descriptors using a 
> tree to efficiently insert freed slots into an ordered list of free 
> regions (something similar to the avl tree used in vmas).  Is it worth 
> doing?

Well no, since a user app can manage this part itself if it happens to be 
performance critical.


Here is a sample userland library. Each time a new fd is returned by 
open()/socket()/pipe()/accept()..., the thread should call

fd = fdcache_dupfd(fd);

and later close the file with fdcache_closefd(fd) instead of close(fd).

Eric

[-- Attachment #2: fastfdlib.c --]
[-- Type: text/plain, Size: 1521 bytes --]

/*
 * The Unix kernel has an expensive get_unused_fd() function.
 * This is because Unix semantics mandate that an open()/pipe()/socket() call
 * always returns the lowest free fd, not an arbitrary one.
 * Linux uses a linear scan of a table of bits.
 * A program handling 1,000,000 files scans about 128 KB of RAM with a
 * spinlock held: no other thread can get an fd in the meantime.
 *
 * The trick is to use this library to make sure the 64 lowest fds stay
 * available, so that the standard Unix functions don't have to scan many
 * fds before finding a free one, and to remap the fds they return with
 * fcntl(F_DUPFD) to precise slots we manage ourselves.
 */
#include <pthread.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

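#define MAXFDS 1500000

struct {
	pthread_mutex_t lock;
	unsigned int cache_fd;		/* head of the free-slot list, 0 if empty */
	unsigned int next_alloc;	/* next slot that was never handed out */
	unsigned int *cache_tab;	/* cache_tab[slot] = next free slot after it */
} fdd;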


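void fdcache_init(void)
{
	pthread_mutex_init(&fdd.lock, NULL);
	fdd.cache_tab = calloc(MAXFDS, sizeof(unsigned int));
	if (fdd.cache_tab == NULL)
		abort();	/* the free list cannot work without its table */
	fdd.next_alloc = 64;	/* slots 0..63 are left free for the kernel */
}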

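int fdcache_dupfd(int fd)
{
	int ret;

	pthread_mutex_lock(&fdd.lock);
	/* Free list empty: take a fresh slot above all those handed out so far. */
	if (fdd.cache_fd == 0)
		fdd.cache_fd = fdd.next_alloc++;
	/* F_DUPFD returns the lowest free fd >= fdd.cache_fd. */
	ret = fcntl(fd, F_DUPFD, fdd.cache_fd);
	if (ret != -1) {
		/* Pop the slot; never index the table past its end. */
		fdd.cache_fd = (ret < MAXFDS) ? fdd.cache_tab[ret] : 0;
		pthread_mutex_unlock(&fdd.lock);
		close(fd);	/* release the low slot for the next open() */
		return ret;
	}
	pthread_mutex_unlock(&fdd.lock);
	return fd;	/* remap failed: keep using the original fd */
}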

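void fdcache_closefd(int fd)
{
	if (fd == -1)
		return;

	close(fd);

	/*
	 * Only managed slots go back on the free list: a low fd would defeat
	 * the purpose (and fd 0 would look like an empty list), and an fd
	 * beyond the table cannot be recorded.
	 */
	if (fd < 64 || fd >= MAXFDS)
		return;

	pthread_mutex_lock(&fdd.lock);
	fdd.cache_tab[fd] = fdd.cache_fd;	/* push the slot on the free list */
	fdd.cache_fd = fd;
	pthread_mutex_unlock(&fdd.lock);
}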
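
To illustrate the cost described in the header comment, here is a user-space sketch of a linear scan for the first clear bit in a table of bits. It is not the kernel's code; first_zero_bit() and BITS_PER_WORD are hypothetical names chosen for this example. It reports how many words it had to read, which grows with the number of low fds in use: with 1,000,000 bits set, the table is 125,000 bytes and every word is read before a free slot is found.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the first clear bit in map[0..nbits), or nbits if
 * every bit is set. *words_scanned receives the number of words read.
 */
size_t first_zero_bit(const unsigned long *map, size_t nbits,
		      size_t *words_scanned)
{
	size_t nwords = (nbits + BITS_PER_WORD - 1) / BITS_PER_WORD;
	size_t w, b;

	assert(map != NULL && words_scanned != NULL);
	*words_scanned = 0;
	for (w = 0; w < nwords; w++) {
		(*words_scanned)++;
		if (map[w] == ~0UL)
			continue;	/* every fd in this word is in use */
		for (b = 0; b < BITS_PER_WORD; b++) {
			size_t bit = w * BITS_PER_WORD + b;

			if (bit >= nbits)
				return nbits;	/* only padding bits are clear */
			if (!(map[w] & (1UL << b)))
				return bit;
		}
	}
	return nbits;
}
```

Keeping the first 64 slots free means such a scan stops in the first word or two, which is the effect the library above is after.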
