Netdev List


* Re: [RFC][PATCH 3/3] TCP/IP Critical socket communication mechanism
From: Sridhar Samudrala @ 2005-12-14 18:29 UTC
  To: Mitchell Blank Jr; +Cc: Alan Cox, linux-kernel, netdev
In-Reply-To: <20051214121253.GB23393@gaz.sfgoth.com>

On Wed, 2005-12-14 at 04:12 -0800, Mitchell Blank Jr wrote:
> Alan Cox wrote:
> > But your user space that would add the routes is not so protected so I'm
> > not sure this is actually a solution, more of an extended fudge.
> 
> Yes, there's no 100% solution -- no matter how much memory you reserve and
> how many paths you protect if you try hard enough you can come up
> with cases where it'll fail.  ("I'm swapping to NFS across a tun/tap
> interface to a custom userland SSL tunnel to a server across a BGP route...")
> 
> However, if the 'extended fudge' pushes a problem from "can happen, even
> in a very normal setup" territory to "only happens if you're doing something
> pretty weird" then is it really such a bad thing?  I think the cost in code
> complexity looks pretty reasonable.

Yes. This should work fine for cases where you need a limited number of
critical allocation requests to succeed for a short period of time.

> > > +#define SK_CRIT_ALLOC(sk, flags) ((sk->sk_allocation & __GFP_CRITICAL) | flags)
> > 
> > Lots of hidden conditional logic on critical paths.
> 
> How expensive is it compared to the allocation itself?

Also, as I said in my other response, we could make it a compile-time
configurable option with zero overhead when turned off.

Thanks
Sridhar

> 
> > > +#define CRIT_ALLOC(flags) (__GFP_CRITICAL | flags)
> > 
> > Pointless obfuscation
> 
> Fully agree.
> 
> -Mitch

* Re: [RFC][PATCH 3/3] TCP/IP Critical socket communication mechanism
From: Ingo Oeser @ 2005-12-14 18:33 UTC
  To: Sridhar Samudrala; +Cc: Alan Cox, linux-kernel, netdev
In-Reply-To: <1134583896.8698.33.camel@w-sridhar2.beaverton.ibm.com>

Sridhar Samudrala wrote:
> The only reason I made these macros is that I would expect this to be a
> compile-time configurable option so that there is zero overhead for regular users.
> 
> #ifdef CONFIG_CRIT_SOCKET
> #define SK_CRIT_ALLOC(sk, flags) ((sk->sk_allocation & __GFP_CRITICAL) | flags)
> #define CRIT_ALLOC(flags) (__GFP_CRITICAL | flags)
> #else
> #define SK_CRIT_ALLOC(sk, flags) flags
> #define CRIT_ALLOC(flags) flags
> #endif

Oh, that's much simpler to achieve:

#ifdef CONFIG_CRIT_SOCKET
#define __GFP_CRITICAL_SOCKET __GFP_CRITICAL
#else
#define __GFP_CRITICAL_SOCKET 0
#endif

Maybe we can get better naming here, but you get the point, I think.


Regards

Ingo Oeser
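
A minimal sketch of the aliasing idea above, assuming illustrative flag values and a hypothetical helper sk_alloc_flags() (neither comes from the posted patch). With the option off, __GFP_CRITICAL_SOCKET is 0, so the mask and OR should fold away at compile time and call sites need no wrapper macro:

```c
/* Illustrative values only; the real kernel flag bits differ. */
#define GFP_ATOMIC_DEMO     0x20u
#define __GFP_CRITICAL_DEMO 0x1000u

#ifdef CONFIG_CRIT_SOCKET
#define __GFP_CRITICAL_SOCKET __GFP_CRITICAL_DEMO
#else
#define __GFP_CRITICAL_SOCKET 0u
#endif

/* Stand-in for struct sock; only the field used here. */
struct demo_sock {
	unsigned int sk_allocation;
};

/* Hypothetical helper: merge the socket's critical bit into the flags. */
static unsigned int sk_alloc_flags(const struct demo_sock *sk,
				   unsigned int flags)
{
	/* When __GFP_CRITICAL_SOCKET is 0 this reduces to plain `flags`. */
	return (sk->sk_allocation & __GFP_CRITICAL_SOCKET) | flags;
}
```

Built without -DCONFIG_CRIT_SOCKET, the helper returns its flags argument unchanged even for a socket whose sk_allocation carries the critical bit.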

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: Andi Kleen @ 2005-12-14 18:41 UTC
  To: Sridhar Samudrala; +Cc: Andi Kleen, linux-kernel, netdev
In-Reply-To: <1134582945.8698.17.camel@w-sridhar2.beaverton.ibm.com>

> Here we are assuming that the pre-allocated critical page pool is big enough
> to satisfy the requirements of all the critical sockets.

That seems like a lot of assumptions. Is it really better than the
existing GFP_ATOMIC, which works basically the same way?  It has a lot
more users competing for it, true, but the set of GFP_CRITICAL users
would likely grow over time too and it would develop the same problem.

I think if you really want to attack this problem and improve
over the GFP_ATOMIC "best effort in smaller pool" approach you should
probably add real reservations. And then really do a lot of testing
to see if it actually helps.

-Andi
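
A minimal sketch of what "real reservations" could look like, as opposed to a shared best-effort pool. All names here (demo_reserve_pool, demo_reserve(), demo_take_reserved()) are hypothetical and not from any posted patch. The idea is that a user claims its worst-case page count up front, so a shortage shows up as a failed reservation rather than as a stall halfway through an operation:

```c
#include <stdbool.h>
#include <stddef.h>

/* Pages in the pool that no user has claimed yet. */
struct demo_reserve_pool {
	size_t unreserved;
};

/* Pages set aside for one user (for example, one critical socket). */
struct demo_reservation {
	size_t pages;
};

/* Claim `need` pages up front; fail now rather than mid-operation. */
static bool demo_reserve(struct demo_reserve_pool *pool,
			 struct demo_reservation *r, size_t need)
{
	if (pool->unreserved < need)
		return false;
	pool->unreserved -= need;
	r->pages = need;
	return true;
}

/* Consume one page from a reservation; cannot be stolen by others. */
static bool demo_take_reserved(struct demo_reservation *r)
{
	if (r->pages == 0)
		return false;
	r->pages--;
	return true;
}
```

With a two-page pool and two users that each need two pages, the second reservation is refused outright, and the first user can always finish its work.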

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: David Stevens @ 2005-12-14 19:20 UTC
  To: Andi Kleen; +Cc: Andi Kleen, linux-kernel, netdev, netdev-owner, sri
In-Reply-To: <20051214184147.GO23384@wotan.suse.de>

> It has a lot
> more users competing for it, true, but the set of GFP_CRITICAL users
> would likely grow over time too and it would develop the same problem.

        No, because the critical set is determined by the user (by setting
the socket flag).
        The receive side has some things marked as "critical" until we
have processed enough to check the socket flag, but then they should
be released. Those short-lived allocations and frees are more or less
0 net towards the pool.
        Certainly, it wouldn't work very well if every socket is
marked as "critical", but with an adequate pool for the workload, I
expect it'll work as advertised (esp. since it'll usually be only one
socket associated with swap management that'll be critical).

                                                                +-DLS

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: Jesper Juhl @ 2005-12-14 20:16 UTC
  To: Sridhar Samudrala; +Cc: linux-kernel, netdev
In-Reply-To: <Pine.LNX.4.58.0512140042280.31720@w-sridhar.beaverton.ibm.com>

On 12/14/05, Sridhar Samudrala <sri@us.ibm.com> wrote:
>
> This set of patches provides a TCP/IP emergency communication mechanism that
> could be used to guarantee high priority communications over a critical socket
> to succeed even under very low memory conditions that last for a couple of
> minutes. It uses the critical page pool facility provided by Matt's patches
> that he posted recently on lkml.
>         http://lkml.org/lkml/2005/12/14/34/index.html
>
> This mechanism provides a new socket option SO_CRITICAL that can be used to
> mark a socket as critical. A critical connection used for emergency

So now everyone writing commercial apps for Linux is going to set
SO_CRITICAL on sockets in their apps so their apps can "survive better
under pressure than the competitors' apps" and clueless programmers all
over are going to think "cool, with this I can make my app more
important than everyone else's, I'm going to use this".  When everyone
and his dog starts to set this, what's the point?


> communications has to be established and marked as critical before we enter
> the emergency condition.
>
> It uses the __GFP_CRITICAL flag introduced in the critical page pool patches
> to indicate an allocation request as critical and should be satisfied from the
> critical page pool if required. In the send path, this flag is passed with all
> allocation requests that are made for a critical socket. But in the receive
> path we do not know if a packet is critical or not until we receive it and
> find the socket that it is destined to. So we treat all the allocation
> requests in the receive path as critical.
>
> The critical page pool patches also introduce a global flag
> 'system_in_emergency' that is used to indicate an emergency situation (could be
> a low memory condition). When this flag is set any incoming packets that belong
> to non-critical sockets are dropped as soon as possible in the receive path.

Hmm, so if I fire up an app that has SO_CRITICAL set on a socket and
can then somehow put a lot of memory pressure on the machine I can
cause traffic on other sockets to be dropped.. hmmm.. sounds like
something to play with to create new and interesting DoS attacks...


> This is necessary to prevent incoming non-critical packets from consuming
> memory from the critical page pool.
>
> I would appreciate any feedback or comments on this approach.
>

To be a little serious, it sounds like something that could be used to
cause trouble and something that will lose its usefulness once enough
people start using it (for valid or invalid reasons), so what's the
point...


--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please      http://www.expita.com/nomime.html

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: Ben Greear @ 2005-12-14 20:25 UTC
  To: Jesper Juhl; +Cc: Sridhar Samudrala, linux-kernel, netdev
In-Reply-To: <9a8748490512141216x7e25ca2cucb675f11f0c9d913@mail.gmail.com>

Jesper Juhl wrote:

> To be a little serious, it sounds like something that could be used to
> cause trouble and something that will lose its usefulness once enough
> people start using it (for valid or invalid reasons), so what's the
> point...

It could easily be a user-configurable option in an application.  If
DoS is a real concern, only let this work for root users...

Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: James Courtier-Dutton @ 2005-12-14 20:49 UTC
  To: Jesper Juhl; +Cc: Sridhar Samudrala, linux-kernel, netdev
In-Reply-To: <9a8748490512141216x7e25ca2cucb675f11f0c9d913@mail.gmail.com>

Jesper Juhl wrote:
> On 12/14/05, Sridhar Samudrala <sri@us.ibm.com> wrote:
> 
>>This set of patches provides a TCP/IP emergency communication mechanism that
>>could be used to guarantee high priority communications over a critical socket
>>to succeed even under very low memory conditions that last for a couple of
>>minutes. It uses the critical page pool facility provided by Matt's patches
>>that he posted recently on lkml.
>>        http://lkml.org/lkml/2005/12/14/34/index.html
>>
>>This mechanism provides a new socket option SO_CRITICAL that can be used to
>>mark a socket as critical. A critical connection used for emergency
> 
> 
> So now everyone writing commercial apps for Linux is going to set
> SO_CRITICAL on sockets in their apps so their apps can "survive better
> under pressure than the competitors' apps" and clueless programmers all
> over are going to think "cool, with this I can make my app more
> important than everyone else's, I'm going to use this".  When everyone
> and his dog starts to set this, what's the point?
> 
> 

I don't think the initial patches that Matt did were intended for what 
you are describing.
When I had the conversation with Matt at KS, the problem we were trying 
to solve was "Memory pressure with network attached swap space".
I came up with the idea that I think Matt has implemented.
Letting the OS choose which are "critical" TCP/IP sessions is fine. But 
letting an application choose is a recipe for disaster.

James

* [2.6 patch] net/sunrpc/xdr.c: remove xdr_decode_string()
From: Adrian Bunk @ 2005-12-14 21:10 UTC
  To: neilb, trond.myklebust; +Cc: linux-kernel, nfs, Charles Lever, netdev

This patch removes the unused function xdr_decode_string().


Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Neil Brown <neilb@suse.de>
Acked-by: Charles Lever <Charles.Lever@netapp.com>

---

 include/linux/sunrpc/xdr.h |    1 -
 net/sunrpc/xdr.c           |   21 ---------------------
 2 files changed, 22 deletions(-)

--- linux-2.6.15-rc1-mm2-full/include/linux/sunrpc/xdr.h.old	2005-11-23 02:03:01.000000000 +0100
+++ linux-2.6.15-rc1-mm2-full/include/linux/sunrpc/xdr.h	2005-11-23 02:03:08.000000000 +0100
@@ -91,7 +91,6 @@
 u32 *	xdr_encode_opaque_fixed(u32 *p, const void *ptr, unsigned int len);
 u32 *	xdr_encode_opaque(u32 *p, const void *ptr, unsigned int len);
 u32 *	xdr_encode_string(u32 *p, const char *s);
-u32 *	xdr_decode_string(u32 *p, char **sp, int *lenp, int maxlen);
 u32 *	xdr_decode_string_inplace(u32 *p, char **sp, int *lenp, int maxlen);
 u32 *	xdr_encode_netobj(u32 *p, const struct xdr_netobj *);
 u32 *	xdr_decode_netobj(u32 *p, struct xdr_netobj *);
--- linux-2.6.15-rc1-mm2-full/net/sunrpc/xdr.c.old	2005-11-23 02:03:17.000000000 +0100
+++ linux-2.6.15-rc1-mm2-full/net/sunrpc/xdr.c	2005-11-23 02:03:27.000000000 +0100
@@ -93,27 +93,6 @@
 }
 
 u32 *
-xdr_decode_string(u32 *p, char **sp, int *lenp, int maxlen)
-{
-	unsigned int	len;
-	char		*string;
-
-	if ((len = ntohl(*p++)) > maxlen)
-		return NULL;
-	if (lenp)
-		*lenp = len;
-	if ((len % 4) != 0) {
-		string = (char *) p;
-	} else {
-		string = (char *) (p - 1);
-		memmove(string, p, len);
-	}
-	string[len] = '\0';
-	*sp = string;
-	return p + XDR_QUADLEN(len);
-}
-
-u32 *
 xdr_decode_string_inplace(u32 *p, char **sp, int *lenp, int maxlen)
 {
 	unsigned int	len;



* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: Sridhar Samudrala @ 2005-12-14 21:55 UTC
  To: James Courtier-Dutton; +Cc: Jesper Juhl, linux-kernel, netdev
In-Reply-To: <43A08546.8040708@superbug.co.uk>

On Wed, 2005-12-14 at 20:49 +0000, James Courtier-Dutton wrote:
> Jesper Juhl wrote:
> > On 12/14/05, Sridhar Samudrala <sri@us.ibm.com> wrote:
> > 
> >>This set of patches provides a TCP/IP emergency communication mechanism that
> >>could be used to guarantee high priority communications over a critical socket
> >>to succeed even under very low memory conditions that last for a couple of
> >>minutes. It uses the critical page pool facility provided by Matt's patches
> >>that he posted recently on lkml.
> >>        http://lkml.org/lkml/2005/12/14/34/index.html
> >>
> >>This mechanism provides a new socket option SO_CRITICAL that can be used to
> >>mark a socket as critical. A critical connection used for emergency
> > 
> > 
> > So now everyone writing commercial apps for Linux is going to set
> > SO_CRITICAL on sockets in their apps so their apps can "survive better
> > under pressure than the competitors' apps" and clueless programmers all
> > over are going to think "cool, with this I can make my app more
> > important than everyone else's, I'm going to use this".  When everyone
> > and his dog starts to set this, what's the point?
> > 
> > 
> 
> I don't think the initial patches that Matt did were intended for what 
> you are describing.
> When I had the conversation with Matt at KS, the problem we were trying 
> to solve was "Memory pressure with network attached swap space".
> I came up with the idea that I think Matt has implemented.
> Letting the OS choose which are "critical" TCP/IP sessions is fine. But 
> letting an application choose is a recipe for disaster.

We could easily add capable(CAP_NET_ADMIN) check to allow this option to
be set only by privileged users.

Thanks
Sridhar
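
A minimal user-space sketch of the privilege gate suggested above. In the kernel the check would be capable(CAP_NET_ADMIN); here the caller's privilege is passed in as a plain boolean, and demo_set_critical() and struct demo_sock are hypothetical stand-ins, not code from the patch:

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for struct sock; only the field used here. */
struct demo_sock {
	bool critical;
};

/*
 * Hypothetical handler for an SO_CRITICAL-style option.
 * `caller_has_net_admin` models the result of capable(CAP_NET_ADMIN).
 */
static int demo_set_critical(struct demo_sock *sk,
			     bool caller_has_net_admin, int val)
{
	/* Only privileged callers may mark a socket critical. */
	if (val && !caller_has_net_admin)
		return -EPERM;

	sk->critical = (val != 0);
	return 0;
}
```

In this sketch clearing the flag is left unprivileged; whether that is the right policy is a design choice the thread does not settle.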

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: James Courtier-Dutton @ 2005-12-14 22:09 UTC
  To: Sridhar Samudrala; +Cc: Jesper Juhl, linux-kernel, netdev
In-Reply-To: <1134597344.8855.1.camel@w-sridhar2.beaverton.ibm.com>

Sridhar Samudrala wrote:
> On Wed, 2005-12-14 at 20:49 +0000, James Courtier-Dutton wrote:
> 
>>Jesper Juhl wrote:
>>
>>>On 12/14/05, Sridhar Samudrala <sri@us.ibm.com> wrote:
>>>
>>>
>>>>This set of patches provides a TCP/IP emergency communication mechanism that
>>>>could be used to guarantee high priority communications over a critical socket
>>>>to succeed even under very low memory conditions that last for a couple of
>>>>minutes. It uses the critical page pool facility provided by Matt's patches
>>>>that he posted recently on lkml.
>>>>       http://lkml.org/lkml/2005/12/14/34/index.html
>>>>
>>>>This mechanism provides a new socket option SO_CRITICAL that can be used to
>>>>mark a socket as critical. A critical connection used for emergency
>>>
>>>
>>>So now everyone writing commercial apps for Linux is going to set
>>>SO_CRITICAL on sockets in their apps so their apps can "survive better
>>>under pressure than the competitors' apps" and clueless programmers all
>>>over are going to think "cool, with this I can make my app more
>>>important than everyone else's, I'm going to use this".  When everyone
>>>and his dog starts to set this, what's the point?
>>>
>>>
>>
>>I don't think the initial patches that Matt did were intended for what 
>>you are describing.
>>When I had the conversation with Matt at KS, the problem we were trying 
>>to solve was "Memory pressure with network attached swap space".
>>I came up with the idea that I think Matt has implemented.
>>Letting the OS choose which are "critical" TCP/IP sessions is fine. But 
>>letting an application choose is a recipe for disaster.
> 
> 
> We could easily add capable(CAP_NET_ADMIN) check to allow this option to
> be set only by privileged users.
> 
> Thanks
> Sridhar
> 

Sridhar,

Have you actually thought about what would happen in a real world scenario?
There is no real world requirement for this sort of user land feature.
In memory pressure mode, you don't care about user applications. In 
fact, under memory pressure no user applications are getting scheduled.
All you care about is swapping out memory to achieve a net gain in free 
memory, so that the applications can then run ok again.

James

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: Ben Greear @ 2005-12-14 22:39 UTC
  To: James Courtier-Dutton
  Cc: Sridhar Samudrala, Jesper Juhl, linux-kernel, netdev
In-Reply-To: <43A09811.2080909@superbug.co.uk>

James Courtier-Dutton wrote:

> Have you actually thought about what would happen in a real world scenario?
> There is no real world requirement for this sort of user land feature.
> In memory pressure mode, you don't care about user applications. In 
> fact, under memory pressure no user applications are getting scheduled.
> All you care about is swapping out memory to achieve a net gain in free 
> memory, so that the applications can then run ok again.

Low 'ATOMIC' memory is different from the memory that user space typically
uses, so just because you can't allocate an SKB does not mean you are swapping
out user-space apps.

I have an app that can have 2000+ sockets open.  I would definitely like to make
the management and other important sockets have priority over others in my app...

Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: Sridhar Samudrala @ 2005-12-14 23:42 UTC
  To: Ben Greear; +Cc: James Courtier-Dutton, Jesper Juhl, linux-kernel, netdev
In-Reply-To: <43A09F08.5000507@candelatech.com>

On Wed, 2005-12-14 at 14:39 -0800, Ben Greear wrote:
> James Courtier-Dutton wrote:
> 
> > Have you actually thought about what would happen in a real world scenario?
> > There is no real world requirement for this sort of user land feature.
> > In memory pressure mode, you don't care about user applications. In 
> > fact, under memory pressure no user applications are getting scheduled.
> > All you care about is swapping out memory to achieve a net gain in free 
> > memory, so that the applications can then run ok again.
> 
> Low 'ATOMIC' memory is different from the memory that user space typically
> uses, so just because you can't allocate an SKB does not mean you are swapping
> out user-space apps.
> 
> I have an app that can have 2000+ sockets open.  I would definitely like to make
> the management and other important sockets have priority over others in my app...

The scenario we are trying to address is also a management connection between the 
nodes of a cluster and a server that manages the swap devices accessible by all the 
nodes of the cluster. The critical connection is supposed to be used to exchange 
status notifications of the swap devices so that failover can happen and be propagated
to all the nodes as quickly as possible. The management apps will be pinned into
memory so that they are not swapped out.

As such the traffic that flows over the critical sockets is not high but should
not stall even if we run into a memory constrained situation. That is the reason
why we would like to have a pre-allocated critical page pool which could be used
when we run out of ATOMIC memory.

Thanks
Sridhar
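
A minimal sketch of the fallback described above: ordinary allocations draw from the normal pool, and only requests carrying a critical flag may dip into the pre-allocated reserve once the normal pool is empty. The counters, DEMO_GFP_CRITICAL and demo_alloc_page() are illustrative assumptions, not the interface of the critical page pool patches:

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_GFP_CRITICAL 0x1u	/* illustrative flag bit */

struct demo_allocator {
	size_t normal_free;	/* pages left in the ordinary (atomic) pool */
	size_t critical_free;	/* pages left in the pre-allocated reserve */
};

/* Returns true if a page was granted. */
static bool demo_alloc_page(struct demo_allocator *a, unsigned int flags)
{
	/* Normal path: everyone competes for the ordinary pool. */
	if (a->normal_free > 0) {
		a->normal_free--;
		return true;
	}
	/* Fallback: only critical requests may use the reserve. */
	if ((flags & DEMO_GFP_CRITICAL) && a->critical_free > 0) {
		a->critical_free--;
		return true;
	}
	return false;
}
```

Because critical traffic is expected to be light, the reserve can stay small, but as the rest of the thread points out, nothing in this scheme guarantees it is large enough.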

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: Mitchell Blank Jr @ 2005-12-15  1:54 UTC
  To: James Courtier-Dutton
  Cc: Jesper Juhl, Sridhar Samudrala, linux-kernel, netdev
In-Reply-To: <43A08546.8040708@superbug.co.uk>

James Courtier-Dutton wrote:
> When I had the conversation with Matt at KS, the problem we were trying 
> to solve was "Memory pressure with network attached swap space".

s/swap space/writable filesystems/

You can hit these problems even if you have no swap.  Too much of the
memory becomes filled with dirty pages needing writeback -- then you lose
your NFS server's ARP entry at the wrong moment.  If you have a local disk
to swap to the machine will recover after a little bit of grinding, otherwise
it's all pretty much over.

The big problem is that as long as there's network I/O coming in it's
likely that pages you free (as the VM gets more and more desperate about
dropping the few remaining non-dirty pages) will get used for sockets
that AREN'T helping you recover RAM.  You really need to be able to tell
the whole network stack "we're in really rough shape here; ignore all RX
work unless it's going to help me get write ACKs back from my {NFS,iSCSI}
server"  My understanding is that is what this patchset is trying to
accomplish.

-Mitch
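
A minimal sketch of the receive-side policy described above, assuming a global emergency flag (named system_in_emergency in the patch description) and a per-socket critical bit. demo_rx_verdict() and the types are hypothetical; a NULL socket models a packet with no matching local socket, such as forwarded traffic:

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct sock; only the field used here. */
struct demo_sock {
	bool critical;
};

enum demo_verdict { DEMO_DELIVER, DEMO_DROP };

/* Decide what to do with a received packet once its socket is known. */
static enum demo_verdict demo_rx_verdict(bool system_in_emergency,
					 const struct demo_sock *sk)
{
	if (!system_in_emergency)
		return DEMO_DELIVER;	/* normal operation: no filtering */
	if (sk != NULL && sk->critical)
		return DEMO_DELIVER;	/* helps us recover memory */
	return DEMO_DROP;		/* free the buffer as early as possible */
}
```

Note that the decision can only be made after socket lookup, so the buffer has already been allocated by the time a packet is dropped.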

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: Matt Mackall @ 2005-12-15  3:39 UTC
  To: Sridhar Samudrala; +Cc: Andi Kleen, linux-kernel, netdev
In-Reply-To: <1134582945.8698.17.camel@w-sridhar2.beaverton.ibm.com>

On Wed, Dec 14, 2005 at 09:55:45AM -0800, Sridhar Samudrala wrote:
> On Wed, 2005-12-14 at 10:22 +0100, Andi Kleen wrote:
> > > I would appreciate any feedback or comments on this approach.
> > 
> > Maybe I'm missing something but wouldn't you need an own critical
> > pool (or at least reservation) for each socket to be safe against deadlocks?
> > 
> > Otherwise if a critical socket needs e.g. 2 pages to finish something
> > and 2 critical sockets are active they can each steal the last pages
> > from each other and deadlock.
> 
> Here we are assuming that the pre-allocated critical page pool is big enough
> to satisfy the requirements of all the critical sockets.

Not a good assumption. A system can have between 1-1000 iSCSI
connections open and we certainly don't want to preallocate enough
room for 1000 connections to make progress when we might only have one
in use.

I think we need a global receive pool and per-socket send pools.

-- 
Mathematics is the supreme nostalgia of our time.

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: David S. Miller @ 2005-12-15  4:30 UTC
  To: mpm; +Cc: sri, ak, linux-kernel, netdev
In-Reply-To: <20051215033937.GC11856@waste.org>

From: Matt Mackall <mpm@selenic.com>
Date: Wed, 14 Dec 2005 19:39:37 -0800

> I think we need a global receive pool and per-socket send pools.

Mind telling everyone how you plan to make use of the global receive
pool when the allocation happens in the device driver and we have no
idea which socket the packet is destined for?  What should be done for
non-local packets being routed?  The device drivers allocate packets
for the entire system, long before we know who the eventually received
packets are for.  It is fully anonymous memory, and it's easy to
design cases where the whole pool can be eaten up by non-local
forwarded packets.

I truly dislike these patches being discussed because they are a
complete hack, and admittedly don't even solve the problem fully.  I
don't have any concrete better ideas but that doesn't mean this stuff
should go into the tree.

I think GFP_ATOMIC memory pools are more powerful than they are given
credit for.  There is nothing preventing the implementation of dynamic
GFP_ATOMIC watermarks, and having "critical" socket behavior "kick in"
in response to hitting those water marks.
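
A minimal sketch of the watermark idea, assuming a single free-page counter and a run-time adjustable threshold. The structure and function names are hypothetical and say nothing about how the real page allocator tracks GFP_ATOMIC reserves:

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_atomic_pool {
	size_t free_pages;	/* pages currently available for atomic use */
	size_t low_watermark;	/* threshold, adjustable at run time */
};

/* "Critical" behavior kicks in at or below the watermark. */
static bool demo_critical_mode(const struct demo_atomic_pool *p)
{
	return p->free_pages <= p->low_watermark;
}

/* Admit a received packet unless we are in critical mode and the
 * destination socket is not marked critical. */
static bool demo_admit_rx(const struct demo_atomic_pool *p,
			  bool sk_is_critical)
{
	return !demo_critical_mode(p) || sk_is_critical;
}
```

Lowering low_watermark at run time immediately relaxes the filter, which is the "dynamic" part of the suggestion.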

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: Matt Mackall @ 2005-12-15  5:02 UTC
  To: David S. Miller; +Cc: sri, ak, linux-kernel, netdev
In-Reply-To: <20051214.203023.129054759.davem@davemloft.net>

On Wed, Dec 14, 2005 at 08:30:23PM -0800, David S. Miller wrote:
> From: Matt Mackall <mpm@selenic.com>
> Date: Wed, 14 Dec 2005 19:39:37 -0800
> 
> > I think we need a global receive pool and per-socket send pools.
> 
> Mind telling everyone how you plan to make use of the global receive
> pool when the allocation happens in the device driver and we have no
> idea which socket the packet is destined for?  What should be done for
> non-local packets being routed?  The device drivers allocate packets
> for the entire system, long before we know who the eventually received
> packets are for.  It is fully anonymous memory, and it's easy to
> design cases where the whole pool can be eaten up by non-local
> forwarded packets.

There need to be two rules:

iff global memory critical flag is set
- allocate from the global critical receive pool on receive
- return packet to global pool if not destined for a socket with an
  attached send mempool

I think this will provide the desired behavior, though only
probabilistically. That is, we can fill the global receive pool with
uninteresting packets such that we're forced to drop critical ACKs,
but the boring packets will eventually be discarded as we walk up the
stack and we'll eventually have room to receive retried ACKs.
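
A minimal sketch of the two rules above, modelling the global receive pool as a simple buffer counter. The names (demo_rx_pool, demo_rx_alloc(), demo_rx_classify(), has_send_mempool) are hypothetical; this only illustrates the accounting, not a real skb path:

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_rx_pool {
	size_t free_bufs;	/* buffers left in the global critical pool */
};

/* Stand-in for struct sock; only the field used here. */
struct demo_sock {
	bool has_send_mempool;	/* marks the socket as critical */
};

/* Rule 1: on receive, take a buffer from the global pool.
 * Returns false if the pool is empty and the packet must be dropped. */
static bool demo_rx_alloc(struct demo_rx_pool *pool)
{
	if (pool->free_bufs == 0)
		return false;
	pool->free_bufs--;
	return true;
}

/* Rule 2: after socket lookup, keep the packet only if the socket has
 * an attached send mempool; otherwise return the buffer to the pool.
 * A NULL socket models a packet with no local destination. */
static bool demo_rx_classify(struct demo_rx_pool *pool,
			     const struct demo_sock *sk)
{
	if (sk != NULL && sk->has_send_mempool)
		return true;
	pool->free_bufs++;
	return false;
}
```

The probabilistic behavior shows up directly: while uninteresting packets hold every buffer, demo_rx_alloc() fails for a critical ACK too, and the ACK only gets through on a retry after classification has returned a buffer.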

> I truly dislike these patches being discussed because they are a
> complete hack, and admittedly don't even solve the problem fully.  I
> don't have any concrete better ideas but that doesn't mean this stuff
> should go into the tree.

Agreed. I'm fairly convinced a full fix is doable, if you make a
couple assumptions (limited fragmentation), but will unavoidably be
less than pretty as it needs to cross some layers.

> I think GFP_ATOMIC memory pools are more powerful than they are given
> credit for.  There is nothing preventing the implementation of dynamic
> GFP_ATOMIC watermarks, and having "critical" socket behavior "kick in"
> in response to hitting those water marks.

There are two problems with GFP_ATOMIC. The first is that its users
don't pre-state their worst-case usage, which means sizing the pool to
reliably avoid deadlocks is impossible. The second is that there
aren't any guarantees that GFP_ATOMIC allocations are actually
critical in the needed-to-make-forward-VM-progress sense or will be
returned to the pool in a timely fashion.

So I do think we need a distinct pool if we want to tackle this
problem. Though it's probably worth mentioning that Linus was rather
adamantly against even trying at KS.

-- 
Mathematics is the supreme nostalgia of our time.

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: David S. Miller @ 2005-12-15  5:23 UTC
  To: mpm; +Cc: sri, ak, linux-kernel, netdev
In-Reply-To: <20051215050250.GT8637@waste.org>

From: Matt Mackall <mpm@selenic.com>
Date: Wed, 14 Dec 2005 21:02:50 -0800

> There need to be two rules:
> 
> iff global memory critical flag is set
> - allocate from the global critical receive pool on receive
> - return packet to global pool if not destined for a socket with an
>   attached send mempool

This shuts off a router and/or firewall just because iSCSI or NFS peed
in its pants.  Not really acceptable.

> I think this will provide the desired behavior

It's not desirable.

What if iSCSI is protected by IPSEC, and the key management daemon has
to process a security association expiration and negotiate a new one
in order for iSCSI to further communicate with its peer when this
memory shortage occurs?  It needs to send packets back and forth with
the remote key management daemon in order to do this, but since you
cut it off with this critical receive pool, the negotiation will never
succeed.

This stuff won't work.  It's not a generic solution and that's
why it has more holes than swiss cheese. :-)

^ permalink raw reply

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: Andi Kleen @ 2005-12-15  5:42 UTC (permalink / raw)
  To: David S. Miller; +Cc: mpm, sri, ak, linux-kernel, netdev
In-Reply-To: <20051214.203023.129054759.davem@davemloft.net>

On Wed, Dec 14, 2005 at 08:30:23PM -0800, David S. Miller wrote:
> From: Matt Mackall <mpm@selenic.com>
> Date: Wed, 14 Dec 2005 19:39:37 -0800
> 
> > I think we need a global receive pool and per-socket send pools.
> 
> Mind telling everyone how you plan to make use of the global receive
> pool when the allocation happens in the device driver and we have no
> idea which socket the packet is destined for?  What should be done for

In theory one could use multiple receive queues on an intelligent enough
NIC, with the NIC distinguishing the sockets.

But that would still be a nasty "you need advanced hardware FOO to avoid
subtle problem Y" case. Also, it would require lots of driver hacking.

And most NICs seem to have limits on the size of the socket tables for this, which
means you would end up in a "only N sockets supported safely" situation,
with N likely being quite small on common hardware.

I think the idea of the original poster was that just freeing non-critical packets
again after a short time would be good enough, but I'm a bit sceptical
about that.

> I truly dislike these patches being discussed because they are a
> complete hack, and admittedly don't even solve the problem fully.  I

I agree. 

> I think GFP_ATOMIC memory pools are more powerful than they are given
> credit for.  There is nothing preventing the implementation of dynamic

Their main problem is that they are used too widely and in a lot
of situations that aren't really critical.

-Andi

^ permalink raw reply

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: Matt Mackall @ 2005-12-15  5:48 UTC (permalink / raw)
  To: David S. Miller; +Cc: sri, ak, linux-kernel, netdev
In-Reply-To: <20051214.212309.127095596.davem@davemloft.net>

On Wed, Dec 14, 2005 at 09:23:09PM -0800, David S. Miller wrote:
> From: Matt Mackall <mpm@selenic.com>
> Date: Wed, 14 Dec 2005 21:02:50 -0800
> 
> > There needs to be two rules:
> > 
> > iff global memory critical flag is set
> > - allocate from the global critical receive pool on receive
> > - return packet to global pool if not destined for a socket with an
> >   attached send mempool
> 
> This shuts off a router and/or firewall just because iSCSI or NFS peed
> in its pants.  Not really acceptable.

That'll happen now anyway.

> > I think this will provide the desired behavior
> 
> It's not desirable.
> 
> What if iSCSI is protected by IPSEC, and the key management daemon has
> to process a security association expiration and negotiate a new one
> in order for iSCSI to further communicate with its peer when this
> memory shortage occurs?  It needs to send packets back and forth with
> the remote key management daemon in order to do this, but since you
> cut it off with this critical receive pool, the negotiation will never
> succeed.

Ok, encapsulation completely ruins the idea.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply

* Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism
From: Nick Piggin @ 2005-12-15  5:53 UTC (permalink / raw)
  To: David S. Miller; +Cc: mpm, sri, ak, linux-kernel, netdev
In-Reply-To: <20051214.212309.127095596.davem@davemloft.net>

David S. Miller wrote:
> From: Matt Mackall <mpm@selenic.com>
> Date: Wed, 14 Dec 2005 21:02:50 -0800
> 
> 
>>There needs to be two rules:
>>
>>iff global memory critical flag is set
>>- allocate from the global critical receive pool on receive
>>- return packet to global pool if not destined for a socket with an
>>  attached send mempool
> 
> 
> This shuts off a router and/or firewall just because iSCSI or NFS peed
> in its pants.  Not really acceptable.
> 

But that should only happen (shut off a router and/or firewall) in cases
where we now completely deadlock and never recover, including shutting off
the router and firewall, because they don't have enough memory to recv
packets either.

> 
>>I think this will provide the desired behavior
> 
> 
> It's not desirable.
> 
> What if iSCSI is protected by IPSEC, and the key management daemon has
> to process a security association expiration and negotiate a new one
> in order for iSCSI to further communicate with its peer when this
> memory shortage occurs?  It needs to send packets back and forth with
> the remote key management daemon in order to do this, but since you
> cut it off with this critical receive pool, the negotiation will never
> succeed.
> 

I guess IPSEC would be a critical socket too, in that case. Sure,
there is nothing we can do if the daemon insists on allocating lots
of memory...

> This stuff won't work.  It's not a generic solution and that's
> why it has more holes than swiss cheese. :-)

True, it will have holes. I think something that is complementary and
would be desirable is to simply limit the amount of in-flight writeout
that things like NFS allow (or used to allow; I haven't checked for a
while, and there were noises about it getting better).
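
As a rough illustration of such a limit, the sketch below caps the
number of outstanding writeout requests with a plain counter. It is a
single-threaded userspace model, and writeout_limiter, writeout_begin
and writeout_end are hypothetical names; it does not describe how NFS
or the VM actually throttle writeback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cap on the number of writes allowed in flight. */
struct writeout_limiter {
	size_t max_in_flight;
	size_t in_flight;
};

/* Start a write only if the cap has not been reached; a real caller
 * would wait and retry instead of dirtying more memory. */
static bool writeout_begin(struct writeout_limiter *l)
{
	if (l->in_flight >= l->max_in_flight)
		return false;
	l->in_flight++;
	return true;
}

/* Called when the server acknowledges a write. */
static void writeout_end(struct writeout_limiter *l)
{
	assert(l->in_flight > 0);
	l->in_flight--;
}
```

Bounding the in-flight count bounds how much memory is tied up waiting
on the network, which is why it complements a reserve rather than
replacing it.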

-- 
SUSE Labs, Novell Inc.

^ permalink raw reply
