From: "Alex Aizman" <itn780@yahoo.com>
To: <open-iscsi@googlegroups.com>
Cc: <mpm@selenic.com>, <andrea@suse.de>, <michaelc@cs.wisc.edu>,
<James.Bottomley@HansenPartnership.com>, <netdev@oss.sgi.com>,
"'David S. Miller'" <davem@davemloft.net>,
<ksummit-2005-discuss@thunk.org>
Subject: RE: [Ksummit-2005-discuss] Summary of 2005 Kernel Summit Proposed Topics
Date: Sun, 27 Mar 2005 13:18:26 -0800
Message-ID: <200503272118.j2RLImV0007976@oss.sgi.com>
In-Reply-To: <42472259.2866086e.3169.318fSMTPIN_ADDED@mx.googlegroups.com>
> Let's say, all N commands transmitted in a burst, and just
> one of these N gets ack-ed by the Target (via StatSN).
Let's say, all can_queue commands transmitted in a burst, and just
one of these can_queue commands gets ack-ed by the Target (via StatSN).
> -----Original Message-----
> From: Alex Aizman [mailto:itn780@yahoo.com]
> Sent: Sunday, March 27, 2005 1:15 PM
> To: open-iscsi@googlegroups.com
> Cc: mpm@selenic.com; andrea@suse.de; michaelc@cs.wisc.edu;
> James.Bottomley@HansenPartnership.com; netdev@oss.sgi.com;
> 'David S. Miller'; ksummit-2005-discuss@thunk.org
> Subject: RE: [Ksummit-2005-discuss] Summary of 2005 Kernel
> Summit Proposed Topics
>
>
> David S. Miller writes:
> >
> > On Sat, 26 Mar 2005 22:33:01 -0800
> > Dmitry Yusupov <dmitry_yus@yahoo.com> wrote:
> >
> > > i.e. TCP stack should call NIC driver's callback after all SKB data
> > > been successfully copied to the user space. At that point NIC driver
> > > will safely replenish HW ring. This way we could avoid most of memory
> > > allocations on receive.
> >
> > How does this solve your problem? This is just simple SKB recycling,
> > and it's a pretty old idea.
> >
> > TCP packets can be held on receive for arbitrary amounts of time.
> >
> > This is especially true if data is received out of order or when
> > packets are dropped. We can't even wake up the user until the holes
> > in the sequence space are filled.
> >
> > Even if data is received properly and in order, there are no hard
> > guarantees about when the user will get back onto the CPU to get the
> > data copied to it.
> >
> > During these gaps in time, you will need to keep your HW receive ring
> > populated with packets.
>
>
> Here's the way I see it.
>
> 1) There are iSCSI connections that should be "protected",
> resources-wise.
> Examples: remote swap device, bank accounts database on RAID
> accessed via iSCSI, etc.
>
> 2) There are two ways to protect the "protected" connections.
> One "Big Brother" like way is a centralized Resource Manager
> that performs a fully deterministic resource accounting
> throughout the system, all the way from NIC descriptors and
> on-chip memory up to iSCSI buffers for Data-Out headers.
>
>
> 3) The 2nd way is *awareness* of the "protected" connections
> propagated throughout the system, along with incremental
> implementation of more sophisticated recovery schemes.
>
> 4) The Resource Manager could be used in the following way.
> At session open time iSCSI control plane calculates iSCSI and
> TCP resources that should be available at all times. The
> calculation is done based on: the number of SCSI commands to
> be processed in parallel (the 'can_queue'), the maximum size
> of the SCSI payload in the SG, the negotiated maximum number
> of outstanding R2Ts, sizes of Immediate and FirstBurst data.
>
> 5) If the Resource Manager says there are not enough resources,
> iSCSI fails session open. This is better than getting into
> trouble well into runtime.
>
> 6) For example: to transmit 'can_queue' commands, iSCSI needs
> N skbufs.
> Let's say, all N commands transmitted in a burst, and just
> one of these N gets ack-ed by the Target (via StatSN). In the
> fully deterministic system this does not necessarily mean
> that the scsi-ml can now send one command - because the full
> condition involves also recycling of skbuf(s) used for
> transmitting this one completed command. And although it is
> hard to imagine that the command gets fully done by the
> remote target without Tx buffers getting recycled, the
> theoretical chance exists (e.g., the NIC is slow or the
> driver has a bad Tx recycling implementation), and the fully
> deterministic scheme should take it into account.
>
> 7) Therefore, prior to calling scsi_done() iSCSI asks
> Resource Manager whether all the TCP etc. resources used for
> this command are already recycled. If not, the scsi_done()
> gets postponed. In addition, iSCSI "complains" to Resource
> Manager that it enters slow path because of this, which could
> prompt the latter to take an action. (End of the example).
>
> 8) If we agree to declare some connections
> "resource-protected", it would immediately mean that there are
> possibly other connections that are not (resource-protected).
> Which in turn gives the Resource Manager a flexibility to
> OOM-kill those unprotected connections and cannibalize the
> corresponding resources for the protected ones.
>
> 9) Without some awareness of the resource-protected
> connections, and without some kind of resource counting at
> runtime (let it be partial and incomplete for starters) - the
> only remaining way for customers that require HA (High
> Availability) is to over-engineer: use 64GB RAM, TBs of disk
> space, etc.
> Which is probably not the end of the world as long as the
> prices go down..
>
> Alex
>
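To make points 4) through 7) above a bit more concrete, here is a minimal
sketch in Python (a user-space model, not kernel code). Everything in it is
an assumption made for illustration: the names ResourceManager, Session,
reserve_session(), target_ack() and tx_recycled() are hypothetical, and the
sizing rule (can_queue times a fixed number of skbufs per command) is a
stand-in for the real calculation, which would also have to account for SG
payload size, outstanding R2Ts, and Immediate/FirstBurst data.

```python
class ResourceManager:
    """Hypothetical central accountant for Tx buffers (points 4 to 7)."""

    def __init__(self, total_skbufs):
        self.free = total_skbufs      # buffers not reserved by any session
        self.slow_path_reports = 0    # how often iSCSI "complained" (point 7)

    def reserve_session(self, can_queue, skbufs_per_cmd):
        # Assumed sizing rule: worst case is every queued command in flight.
        needed = can_queue * skbufs_per_cmd
        if needed > self.free:
            return None               # point 5: fail session open up front
        self.free -= needed
        return Session(self, can_queue, skbufs_per_cmd)

    def report_slow_path(self):
        self.slow_path_reports += 1


class Session:
    """Hypothetical protected session with deferred command completion."""

    def __init__(self, rm, can_queue, skbufs_per_cmd):
        self.rm = rm
        self.can_queue = can_queue
        self.skbufs_per_cmd = skbufs_per_cmd
        self.tx_pending = {}   # command tag -> Tx buffers not yet recycled
        self.acked = set()     # acked by the target, completion postponed
        self.completed = []    # tags for which scsi_done() would have run

    def send(self, tag):
        # A queue slot stays occupied until the command is fully done.
        if len(self.tx_pending) >= self.can_queue:
            raise RuntimeError("queue full")
        self.tx_pending[tag] = self.skbufs_per_cmd

    def target_ack(self, tag):
        # The target acked the command (via StatSN in the real protocol).
        if self.tx_pending[tag] > 0:
            # Point 7: Tx buffers still outstanding, postpone completion.
            self.acked.add(tag)
            self.rm.report_slow_path()
        else:
            self._done(tag)

    def tx_recycled(self, tag, count=1):
        # The driver returned 'count' Tx buffers used by this command.
        self.tx_pending[tag] -= count
        if self.tx_pending[tag] == 0 and tag in self.acked:
            self.acked.discard(tag)
            self._done(tag)

    def _done(self, tag):
        del self.tx_pending[tag]      # only now is the queue slot free
        self.completed.append(tag)
```

In this model a command acked by the target does not free its queue slot
until its Tx buffers have also come back, which is the "full condition"
from point 6). A real implementation would hook skb destructors or driver
Tx completion rather than take explicit tx_recycled() calls.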