From: Daniel Phillips <phillips@phunq.net>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org,
Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Subject: Re: Distributed storage.
Date: Fri, 3 Aug 2007 12:48:37 -0700
Message-ID: <200708031248.37804.phillips@phunq.net>
In-Reply-To: <1186152817.12034.128.camel@twins>

On Friday 03 August 2007 07:53, Peter Zijlstra wrote:
> On Fri, 2007-08-03 at 17:49 +0400, Evgeniy Polyakov wrote:
> > On Fri, Aug 03, 2007 at 02:27:52PM +0200, Peter Zijlstra wrote:
> > ...my main position is to
> > allocate per socket reserve from socket's queue, and copy data
> > there from main reserve, all of which are allocated either in
> > advance (global one) or per sockoption, so that there would be no
> > fairness issues what to mark as special and what to not.
> >
> > Say we have a page per socket, each socket can assign a reserve for
> > itself from own memory, this accounts both tx and rx side. Tx is
> > not interesting, it is simple, rx has global reserve (always
> > allocated on startup or sometime way before reclaim/oom) where data
> > is originally received (including skb, shared info and whatever is
> > needed, page is just an example), then it is copied into per-socket
> > reserve and reused for the next packet. Having per-socket reserve
> > allows to have progress in any situation not only in cases where
> > single action must be received/processed, and allows to be
> > completely fair for all users, but not only special sockets, thus
> > admin for example would be allowed to login, ipsec would work and
> > so on...
>
> Ah, I think I understand now. Yes this is indeed a good idea!
>
> It would be quite doable to implement this on top of what I already
> have. We would need to extend the socket with a sock_opt that would
> reserve a specified amount of data for that specific socket. And then
> on socket demux check if the socket has a non zero reserve and has
> not yet exceeded said reserve. If so, process the packet.
>
> This would also quite neatly work for -rt where we would not want
> incoming packet processing to be delayed by memory allocations.

At this point we need "anything that works" in mainline as a starting
point. By erring on the side of simplicity we can make this
understandable for folks who haven't spent the last two years wallowing
in it. The page per socket approach is about as simple as it gets. I
therefore propose we save our premature optimizations for later.

It will also help our cause if we keep any new internal APIs to strictly
what is needed to make deadlock go away. Not a whole lot more than
just the flag to mark a socket as part of the vm writeout path when you
get right down to essentials.
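
To make the shape of that concrete, here is a rough userspace sketch of
the demux-time check discussed above: a flag marking the socket as part
of the writeout path, plus a one-page reserve charged per packet. Every
name in it (sock_sketch, sketch_set_writeout, sketch_rx_accept,
sketch_rx_release) is hypothetical and exists in no kernel tree; it only
models the decision, not real skb or socket handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct sock_sketch {
	bool   vm_writeout;   /* socket is part of the VM writeout path */
	size_t reserve_bytes; /* reserve granted to this socket, 0 = none */
	size_t reserve_used;  /* bytes of the reserve currently charged */
};

/* Mark a socket as part of writeout and grant it a one-page reserve. */
static void sketch_set_writeout(struct sock_sketch *sk, size_t page_size)
{
	sk->vm_writeout = true;
	sk->reserve_bytes = page_size;
	sk->reserve_used = 0;
}

/*
 * Demux-time decision. With no memory pressure every packet is accepted
 * and nothing is charged. Under pressure, only flagged sockets with room
 * left in their reserve are accepted, and the packet is charged to it.
 */
static bool sketch_rx_accept(struct sock_sketch *sk, size_t len,
			     bool under_pressure)
{
	if (!under_pressure)
		return true;
	if (!sk->vm_writeout)
		return false;
	if (len > sk->reserve_bytes - sk->reserve_used)
		return false;
	sk->reserve_used += len;
	return true;
}

/* Uncharge a packet that was accepted under pressure and is now consumed. */
static void sketch_rx_release(struct sock_sketch *sk, size_t len)
{
	assert(len <= sk->reserve_used);
	sk->reserve_used -= len;
}
```

The only new interface in this sketch is the flag and the per-socket
counter pair; anything smarter about sizing or sharing the reserve is
left out on purpose.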
Regards,
Daniel