From: Benjamin LaHaise <bcrl@kvack.org>
To: "David S. Miller" <davem@davemloft.net>
Cc: drepper@gmail.com, da-x@monatomic.org, linux-kernel@vger.kernel.org
Subject: Re: Status of AIO
Date: Mon, 6 Mar 2006 20:39:15 -0500
Message-ID: <20060307013915.GU20768@kvack.org>
In-Reply-To: <20060306.165129.62204114.davem@davemloft.net>
On Mon, Mar 06, 2006 at 04:51:29PM -0800, David S. Miller wrote:
> I think any such VM tricks need serious thought. It has serious
> consequences as far as cost especially on SMP. Evgeniy has some data
> that shows this, and chapter 5 of Network Algorithmics has a lot of
> good analysis and paper references on this topic.
VM tricks do suck, so you just have to use the tricks that nobody else
is... My thinking is to do something like the following: have a structure
to reference a set of pages. When it is first created, it takes a reference
on the pages in question, and it is added to the vm_area_struct of the user
so that the vm can poke it for freeing when memory pressure occurs. The
sk_buff dataref also has to have a pointer to the pageref added. Now, the
trick to making it useful is as follows:

	struct pageref {
		atomic_t	free_count;
		int		use_count;	/* protected by socket lock */
		...
		unsigned long	user_address;
		unsigned long	length;
		struct socket	*sock;		/* backref for VM */
		struct page	*pages[];
	};

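As a rough illustration of the layout above, here is a userspace sketch that allocates such a structure with its trailing page array. It is not kernel code: `pageref_alloc()` and the `nr_pages` field are hypothetical names introduced only for this example, C11 `atomic_int` stands in for the kernel's `atomic_t`, and the fields elided by `...` are simply left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct page;	/* opaque stand-in for the kernel's struct page */
struct socket;	/* opaque stand-in for the kernel's struct socket */

struct pageref {
	atomic_int	free_count;	/* stand-in for atomic_t */
	int		use_count;	/* protected by socket lock */
	unsigned long	user_address;
	unsigned long	length;
	struct socket	*sock;		/* backref for VM */
	size_t		nr_pages;	/* hypothetical: not in the original */
	struct page	*pages[];	/* flexible array member */
};

/*
 * Hypothetical helper: allocate a zeroed pageref with room for
 * nr_pages page pointers. Returns NULL if the allocation fails.
 */
static struct pageref *pageref_alloc(struct socket *sock, unsigned long addr,
				     unsigned long len, size_t nr_pages)
{
	struct pageref *ref;

	assert(nr_pages > 0);
	ref = calloc(1, sizeof(*ref) + nr_pages * sizeof(ref->pages[0]));
	if (!ref)
		return NULL;

	atomic_init(&ref->free_count, 0);
	ref->user_address = addr;
	ref->length = len;
	ref->sock = sock;
	ref->nr_pages = nr_pages;
	return ref;
}
```

A real version would also take the page references and hook the structure into the vm_area_struct, which this sketch does not attempt.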
The fast path in network transmit becomes:

	if (sock->pageref->... overlaps buf) {
		for each packet built {
			use_count++;
			<add pageref to skb's dataref happily without atomics
			 or memory copying>
		}
	}

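The "overlaps buf" test in the fast path could look something like the sketch below. This assumes that "overlaps" means the user buffer lies entirely inside the range described by user_address and length; that is one reading, not something the pseudocode pins down. `range_covers()` is a hypothetical name.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: true if [buf, buf + len) lies entirely inside
 * [start, start + length). The comparisons are arranged so that no
 * addition can overflow unsigned long.
 */
static bool range_covers(unsigned long start, unsigned long length,
			 unsigned long buf, unsigned long len)
{
	if (buf < start)
		return false;	/* buffer begins before the pinned range */
	if (len > length)
		return false;	/* buffer is longer than the pinned range */
	return buf - start <= length - len;
}
```

If partial overlap is meant to qualify as well, the caller would need to split the send into a covered part and a copied part, which this check does not do.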
Then the kfree_skb() path does an atomic_dec() on pageref->free_count
instead of on the page's reference count. (Or get rid of the atomic using
knowledge about the fact that a given pageref could only be freed by the
network driver it was given to.) That would make the transmit path bloody
cheap, and the tx irq context no more expensive than it already is.
It's probably easier to show this tx path with code that gets the details
right.
-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <dont@kvack.org>.