From: Matt Mackall <mpm@selenic.com>
To: Andrea Arcangeli <andrea@suse.de>
Cc: Mike Christie <michaelc@cs.wisc.edu>,
Dmitry Yusupov <dmitry_yus@yahoo.com>,
open-iscsi@googlegroups.com,
James.Bottomley@HansenPartnership.com, netdev@oss.sgi.com
Subject: Re: [Ksummit-2005-discuss] Summary of 2005 Kernel Summit Proposed Topics
Date: Sat, 26 Mar 2005 22:38:48 -0800 [thread overview]
Message-ID: <20050327063848.GB15453@waste.org> (raw)
In-Reply-To: <20050327060403.GE4053@g5.random>
On Sun, Mar 27, 2005 at 08:04:03AM +0200, Andrea Arcangeli wrote:
> On Sat, Mar 26, 2005 at 09:48:31PM -0800, Matt Mackall wrote:
> > I believe the mempool can be shared among all sockets that represent
> > the same storage device. Packets out any socket represent progress.
>
> What's the point to have more than one socket connected to each storage
> device anyway?
There may be multiple network addresses (with different network paths)
associated with the same device for purposes of throughput or reliability.
> One algo to handle this is: after we get the gfp_atomic failure, we
> look at all the mempools are registered for a certain NIC, and we pick
> a random mempools that isn't empty. We use the non-empty mempool to
> receive the packet, and we let the netif_rx process the packet. Then if
> going up the stack we find that the packet doesn't belong to the
> socket-mempool, we discard the packet and we release the ram back into
> the mempool. This should make progress since eventually the right packet
> will go in the right mempool.
What if the number of packets queued by the time we reach the softirq
side of the stack exceeds the available buffers?
Imagine that we've got heavy DNS and iSCSI on the same box and that the box
gets wedged in OOM such that it can't answer DNS queries. But we can't
distinguish at receive time between DNS and iSCSI. As iSCSI is TCP, it
will send repeat ACKs at relatively long intervals but the DNS clients
will potentially continue to hammer the machine, filling the reserve
buffers and starving out the ACKs. We've got to essentially be able to
say "we are OOM, drop all traffic to sockets not flagged for storage"
and do so quickly enough that we can eventually get the ACKs.
> > > Perhaps the mempooling overhead will be too huge to pay for it even when
> > > it's not necessary, in such case the iscsid will have to pass a new
> > > bitflag to the socket syscall, when it creates the socket meant to talk
> > > with the remote disk.
> >
> > I think we probably attach a mempool to a socket after the fact. And
>
> I guess you meant before the fact (i.e. before the connection to the
> server), anything attached after the fact (whatever the fact is ;) isn't
> going to help.
After the socket is created, but before we commit to pumping storage
data through it (iSCSI has multiple phases). A privileged
setsockopt-like interface ought to suffice. Or something completely
kernel internal.
Which reminds me: FUSE and friends presumably have a very similar set
of problems.
--
Mathematics is the supreme nostalgia of our time.
next prev parent reply other threads:[~2005-03-27 6:38 UTC|newest]
Thread overview: 88+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <4241D106.8050302@cs.wisc.edu>
[not found] ` <20050324101622S.fujita.tomonori@lab.ntt.co.jp>
[not found] ` <1111628393.1548.307.camel@beastie>
[not found] ` <20050324113312W.fujita.tomonori@lab.ntt.co.jp>
[not found] ` <1111633846.1548.318.camel@beastie>
[not found] ` <20050324215922.GT14202@opteron.random>
[not found] ` <424346FE.20704@cs.wisc.edu>
[not found] ` <20050324233921.GZ14202@opteron.random>
[not found] ` <20050325034341.GV32638@waste.org>
[not found] ` <20050327035149.GD4053@g5.random>
2005-03-27 5:48 ` [Ksummit-2005-discuss] Summary of 2005 Kernel Summit Proposed Topics Matt Mackall
2005-03-27 6:04 ` Andrea Arcangeli
2005-03-27 6:38 ` Matt Mackall [this message]
2005-03-27 14:50 ` Andrea Arcangeli
2005-03-27 6:33 ` Dmitry Yusupov
2005-03-27 6:46 ` David S. Miller
2005-03-27 7:05 ` Dmitry Yusupov
2005-03-27 7:57 ` David S. Miller
2005-03-27 8:18 ` Dmitry Yusupov
2005-03-27 18:26 ` Mike Christie
2005-03-27 18:31 ` David S. Miller
2005-03-27 19:58 ` Matt Mackall
2005-03-27 21:49 ` Dmitry Yusupov
2005-03-27 18:47 ` Dmitry Yusupov
2005-03-27 21:14 ` Alex Aizman
[not found] ` <20050327211506.85EDA16022F6@mx1.suse.de>
2005-03-28 0:15 ` Andrea Arcangeli
2005-03-28 3:54 ` Rik van Riel
2005-03-28 4:34 ` David S. Miller
2005-03-28 4:50 ` Rik van Riel
2005-03-28 6:58 ` Alex Aizman
2005-03-28 16:12 ` Andi Kleen
2005-03-28 16:22 ` Andrea Arcangeli
2005-03-28 16:24 ` Rik van Riel
2005-03-29 15:11 ` Andi Kleen
2005-03-29 15:29 ` Rik van Riel
2005-03-29 17:03 ` Matt Mackall
2005-03-28 16:28 ` James Bottomley
2005-03-29 15:20 ` Andi Kleen
2005-03-29 15:56 ` James Bottomley
2005-03-29 17:19 ` Dmitry Yusupov
2005-03-29 21:08 ` jamal
2005-03-29 22:00 ` Rik van Riel
2005-03-29 22:17 ` Matt Mackall
2005-03-29 23:30 ` jamal
2005-03-29 23:00 ` jamal
2005-03-29 23:25 ` Matt Mackall
2005-03-30 0:30 ` H. Peter Anvin
2005-03-30 15:24 ` Andi Kleen
2005-03-29 22:03 ` Rick Jones
2005-03-29 23:13 ` jamal
2005-03-30 2:28 ` Alex Aizman
[not found] ` <E1DGSwp-0004ZE-00@thunker.thunk.org>
2005-03-30 17:16 ` Grant Grundler
2005-03-30 18:46 ` Dmitry Yusupov
2005-03-30 15:22 ` Andi Kleen
2005-03-30 15:33 ` Andrea Arcangeli
2005-03-30 15:38 ` Rik van Riel
2005-03-30 15:39 ` Andi Kleen
2005-03-30 15:44 ` Andrea Arcangeli
2005-03-30 15:50 ` Rik van Riel
2005-03-30 16:04 ` James Bottomley
2005-03-30 17:48 ` H. Peter Anvin
2005-03-30 16:02 ` Andi Kleen
2005-03-30 16:15 ` Andrea Arcangeli
2005-03-30 16:55 ` jamal
2005-03-30 18:42 ` Rik van Riel
2005-03-30 19:28 ` Alex Aizman
2005-03-31 11:41 ` Andi Kleen
2005-03-31 12:12 ` Rik van Riel
2005-03-31 18:59 ` Andi Kleen
2005-03-31 19:04 ` Rik van Riel
2005-03-31 15:35 ` Grant Grundler
2005-03-31 19:15 ` Alex Aizman
2005-03-31 19:34 ` Andi Kleen
2005-03-31 19:39 ` Rik van Riel
2005-03-31 11:45 ` Andi Kleen
2005-03-31 11:50 ` Andi Kleen
2005-03-31 17:09 ` Andrea Arcangeli
2005-03-31 22:05 ` Dmitry Yusupov
2005-03-30 17:24 ` Matt Mackall
2005-03-30 17:39 ` Dmitry Yusupov
2005-03-30 20:10 ` Mike Christie
2005-03-30 17:07 ` Grant Grundler
2005-03-30 5:12 ` H. Peter Anvin
2005-03-28 16:37 ` Dmitry Yusupov
2005-03-28 19:45 ` Roland Dreier
2005-03-28 20:32 ` Topic: Remote DMA network technologies Gerrit Huizenga
2005-03-28 20:36 ` Roland Dreier
[not found] ` <1112042936.5088.22.camel@beastie>
2005-03-28 22:32 ` [Ksummit-2005-discuss] Summary of 2005 Kernel Summit Proposed Topics Benjamin LaHaise
2005-03-29 3:19 ` Linux support for RDMA (was: [Ksummit-2005-discuss] Summary of 2005 Kernel Summit Proposed Topics) Roland Dreier
2005-03-30 16:00 ` Benjamin LaHaise
2005-03-31 1:08 ` Linux support for RDMA H. Peter Anvin
2005-04-02 18:08 ` [Ksummit-2005-discuss] Summary of 2005 Kernel Summit Proposed Topics Dmitry Yusupov
2005-04-02 19:13 ` Ming Zhang
2005-04-04 6:31 ` Grant Grundler
2005-04-04 18:57 ` Rick Jones
2005-03-29 3:14 ` Linux support for RDMA (was: [Ksummit-2005-discuss] Summary of 2005 Kernel Summit Proposed Topics) Roland Dreier
[not found] <42472259.2866086e.3169.318fSMTPIN_ADDED@mx.googlegroups.com>
2005-03-27 21:18 ` [Ksummit-2005-discuss] Summary of 2005 Kernel Summit Proposed Topics Alex Aizman
2005-03-27 21:53 Alex Aizman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20050327063848.GB15453@waste.org \
--to=mpm@selenic.com \
--cc=James.Bottomley@HansenPartnership.com \
--cc=andrea@suse.de \
--cc=dmitry_yus@yahoo.com \
--cc=michaelc@cs.wisc.edu \
--cc=netdev@oss.sgi.com \
--cc=open-iscsi@googlegroups.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).