* cel's patches for 2.6.18 kernels
From: Chuck Lever @ 2006-09-20 18:28 UTC
To: Linux NFS Mailing List

This e-mail announces the release of cel's patch set for Linux 2.6.18
kernels. The patchset contains several important features, including:

1. The finishing patches for the RPC client transport switch

2. Support for rpcbind protocol versions 2 and 3

3. Some support for IPv6 in the RPC and NFS clients

4. Elimination of the BKL in the RPC and NFS clients

5. Elimination of the RPC slot table

The patches are available from:

  http://oss.oracle.com/~cel/linux-2.6/2.6.18/

Older versions of this patchset continue to be available at:

  http://oss.oracle.com/~cel/linux-2.6/

--
"We who cut mere stones must always be envisioning cathedrals"
   -- Quarry worker's creed

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys -- and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

* Re: cel's patches for 2.6.18 kernels
From: Christoph Hellwig @ 2006-09-20 20:20 UTC
To: Chuck Lever; +Cc: Linux NFS Mailing List

On Wed, Sep 20, 2006 at 02:28:10PM -0400, Chuck Lever wrote:
> This e-mail announces the release of cel's patch set for Linux 2.6.18
> kernels. The patchset contains several important features, including:
>
> 1. The finishing patches for the RPC client transport switch
>
> 2. Support for rpcbind protocol versions 2 and 3
>
> 3. Some support for IPv6 in the RPC and NFS clients
>
> 4. Elimination of the BKL in the RPC and NFS clients
>
> 5. Elimination of the RPC slot table

What's the merge plan for those patches? The transport switch and ipv6
code is eagerly anticipated by many people.

* Re: cel's patches for 2.6.18 kernels
From: Chuck Lever @ 2006-09-20 20:53 UTC
To: Christoph Hellwig; +Cc: Linux NFS Mailing List

On 9/20/06, Christoph Hellwig <hch@infradead.org> wrote:
> On Wed, Sep 20, 2006 at 02:28:10PM -0400, Chuck Lever wrote:
> > This e-mail announces the release of cel's patch set for Linux 2.6.18
> > kernels. The patchset contains several important features, including:
> >
> > 1. The finishing patches for the RPC client transport switch
> >
> > 2. Support for rpcbind protocol versions 2 and 3
> >
> > 3. Some support for IPv6 in the RPC and NFS clients
> >
> > 4. Elimination of the BKL in the RPC and NFS clients
> >
> > 5. Elimination of the RPC slot table
>
> What's the merge plan for those patches? The transport switch and ipv6
> code is eagerly anticipated by many people.

About two thirds of the RPC transport switch is already in.
Twenty-ish more patches are going in 2.6.19. There remain roughly 30
more patches that I need to walk through with Trond, and we can
probably get those into 2.6.20. I wouldn't mind moving them up, but
that's up to Trond.

The IPv6 stuff is still being finished. Olaf may be able to help with
the NLM parts; Bull is still working through the NFS server parts; I'm
trying to act as integrator. I would like to line up some testers. I
think those may be as far out as three or four releases, but progress
is occurring -- note there are a lot more IPv6 patches in this
patchset than there were in 2.6.17's.

Elimination of the BKL is complete except for a few unimportant spots.
All it needs is thorough testing on a multi-way ppc64 system to
reveal niggling problems.

The RPC slot table patches are completely rewritten in this release.
That will probably need some testing and review.

* Re: cel's patches for 2.6.18 kernels
From: Trond Myklebust @ 2006-09-20 21:16 UTC
To: Chuck Lever; +Cc: Christoph Hellwig, Linux NFS Mailing List

On Wed, 2006-09-20 at 16:53 -0400, Chuck Lever wrote:
> On 9/20/06, Christoph Hellwig <hch@infradead.org> wrote:
> > What's the merge plan for those patches? The transport switch and ipv6
> > code is eagerly anticipated by many people.
>
> About two thirds of the RPC transport switch is already in.
> Twenty-ish more patches are going in 2.6.19. There remain roughly 30
> more patches that I need to walk through with Trond, and we can
> probably get those into 2.6.20. I wouldn't mind moving them up, but
> that's up to Trond.
>
> The IPv6 stuff is still being finished. Olaf may be able to help with
> the NLM parts; Bull is still working through the NFS server parts; I'm
> trying to act as integrator. I would like to line up some testers. I
> think those may be as far out as three or four releases, but progress
> is occurring -- note there are a lot more IPv6 patches in this
> patchset than there were in 2.6.17's.

What are the prospects for the server side?

> Elimination of the BKL is complete except for a few unimportant spots.
> All it needs is thorough testing on a multi-way ppc64 system to
> reveal niggling problems.

...and a thorough review of the code.

> The RPC slot table patches are completely rewritten in this release.
> That will probably need some testing and review.

Could you remind me again why we are so keen to get rid of the RPC slot
table?

Cheers,
  Trond

* Re: cel's patches for 2.6.18 kernels
From: Chuck Lever @ 2006-09-21 2:29 UTC
To: Trond Myklebust; +Cc: Christoph Hellwig, Linux NFS Mailing List

On 9/20/06, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> On Wed, 2006-09-20 at 16:53 -0400, Chuck Lever wrote:
> > On 9/20/06, Christoph Hellwig <hch@infradead.org> wrote:
> > > What's the merge plan for those patches? The transport switch and ipv6
> > > code is eagerly anticipated by many people.
> >
> > About two thirds of the RPC transport switch is already in.
> > Twenty-ish more patches are going in 2.6.19. There remain roughly 30
> > more patches that I need to walk through with Trond, and we can
> > probably get those into 2.6.20. I wouldn't mind moving them up, but
> > that's up to Trond.
> >
> > The IPv6 stuff is still being finished. Olaf may be able to help with
> > the NLM parts; Bull is still working through the NFS server parts; I'm
> > trying to act as integrator. I would like to line up some testers. I
> > think those may be as far out as three or four releases, but progress
> > is occurring -- note there are a lot more IPv6 patches in this
> > patchset than there were in 2.6.17's.
>
> What are the prospects for the server side?

About half of that work was already done to support NFSv4 callbacks,
and so it is contained in my patch set already. The NFS
server-specific patches are still in Aurelien's hands, so I don't
know. NLM hasn't been touched yet, but Olaf has been thinking about
it.

> > The RPC slot table patches are completely rewritten in this release.
> > That will probably need some testing and review.
>
> Could you remind me again why we are so keen to get rid of the RPC slot
> table?

I implemented it from scratch using your suggestions. Basically the
rpc_rqst is carved out of the RPC buffer instead of out of a large,
tied-down piece of storage.

No slot table means we can dynamically increase and decrease the
number of concurrent RPC requests per transport -- which is a
pre-requisite for scalability when all mounts to the same server use
the same transport. I don't relish the idea of a hundred mount points
on a client that are only allowed to start 16 total requests at a
time.

* Re: cel's patches for 2.6.18 kernels
From: Steve Dickson @ 2006-09-21 12:26 UTC
To: Chuck Lever; +Cc: Christoph Hellwig, Linux NFS Mailing List, Trond Myklebust

Chuck Lever wrote:
> No slot table means we can dynamically increase and decrease the
> number of concurrent RPC requests per transport -- which is a
> pre-requisite for scalability when all mounts to the same server use
> the same transport. I don't relish the idea of a hundred mount points
> on a client that are only allowed to start 16 total requests at a
> time.
>
I think this is a very good thing... not having a hard-coded slot table
makes things much more scalable... imho...

steved.

* Re: cel's patches for 2.6.18 kernels
From: Trond Myklebust @ 2006-09-21 13:47 UTC
To: Steve Dickson; +Cc: Christoph Hellwig, Linux NFS Mailing List, Chuck Lever

On Thu, 2006-09-21 at 08:26 -0400, Steve Dickson wrote:
>
> Chuck Lever wrote:
> > No slot table means we can dynamically increase and decrease the
> > number of concurrent RPC requests per transport -- which is a
> > pre-requisite for scalability when all mounts to the same server use
> > the same transport. I don't relish the idea of a hundred mount points
> > on a client that are only allowed to start 16 total requests at a
> > time.
> >
> I think this is a very good thing... not having a hard-coded slot table
> makes things much more scalable... imho...

I'd like to see _lots_ of testing in a low-memory environment before I'm
ready to fully subscribe to that opinion.

Cheers,
  Trond

* Re: cel's patches for 2.6.18 kernels
From: Chuck Lever @ 2006-09-21 14:50 UTC
To: Trond Myklebust; +Cc: Linux NFS Mailing List

On 9/21/06, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> On Thu, 2006-09-21 at 08:26 -0400, Steve Dickson wrote:
> >
> > Chuck Lever wrote:
> > > No slot table means we can dynamically increase and decrease the
> > > number of concurrent RPC requests per transport -- which is a
> > > pre-requisite for scalability when all mounts to the same server use
> > > the same transport. I don't relish the idea of a hundred mount points
> > > on a client that are only allowed to start 16 total requests at a
> > > time.
> > >
> > I think this is a very good thing... not having a hard-coded slot table
> > makes things much more scalable... imho...
>
> I'd like to see _lots_ of testing in a low-memory environment before I'm
> ready to fully subscribe to that opinion.

The current behavior is that the VM dumps a boatload of writes on the
NFS client, and they all queue up on the RPC client's backlog queue.
In the new code, each request is allowed to proceed further to the
allocation of an RPC buffer before it is stopped. The buffers come
out of a slab cache, so low-memory behavior should be fairly
reasonable.

The small slot table size already throttles write-intensive workloads
and anything that tries to drive concurrent I/O. To add an additional
constraint that multiple mount points go through a small fixed-size
slot table seems like poor design.

Perhaps we can add a per-mount point concurrency limit instead of a
per-transport limit?

* Re: cel's patches for 2.6.18 kernels
From: Trond Myklebust @ 2006-09-21 15:06 UTC
To: Chuck Lever; +Cc: Linux NFS Mailing List

On Thu, 2006-09-21 at 10:50 -0400, Chuck Lever wrote:
> On 9/21/06, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> The current behavior is that the VM dumps a boatload of writes on the
> NFS client, and they all queue up on the RPC client's backlog queue.
> In the new code, each request is allowed to proceed further to the
> allocation of an RPC buffer before it is stopped. The buffers come
> out of a slab cache, so low-memory behavior should be fairly
> reasonable.

What properties of slabs make them immune to low-memory issues?

> The small slot table size already throttles write-intensive workloads
> and anything that tries to drive concurrent I/O. To add an additional
> constraint that multiple mount points go through a small fixed-size
> slot table seems like poor design.

Its main purpose is precisely that of _limiting_ the amount of RPC
buffer usage, and hence avoiding yet another potential source of memory
deadlocks.
There is already a mechanism in place for allowing the user to fiddle
with the limits.

> Perhaps we can add a per-mount point concurrency limit instead of a
> per-transport limit?

Why? What workloads are currently showing performance problems related
to this issue?

* Re: cel's patches for 2.6.18 kernels
From: Chuck Lever @ 2006-09-21 15:51 UTC
To: Trond Myklebust; +Cc: Linux NFS Mailing List

On 9/21/06, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> On Thu, 2006-09-21 at 10:50 -0400, Chuck Lever wrote:
> > On 9/21/06, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
>
> > The current behavior is that the VM dumps a boatload of writes on the
> > NFS client, and they all queue up on the RPC client's backlog queue.
> > In the new code, each request is allowed to proceed further to the
> > allocation of an RPC buffer before it is stopped. The buffers come
> > out of a slab cache, so low-memory behavior should be fairly
> > reasonable.
>
> What properties of slabs make them immune to low-memory issues?

I didn't say "immune". Slabs improve low-memory behavior. They limit
the amount of internal memory fragmentation, and provide a clean and
automatic API for reaping unused memory when the system has passed its
low-memory threshold.

Even when a mount point is totally idle and the connection has timed
out, the slot table is still there. It's a large piece of memory,
usually a page or more. With these patches, that memory usage is
eliminated when a transport is idle, and can be reclaimed if needed.

> > The small slot table size already throttles write-intensive workloads
> > and anything that tries to drive concurrent I/O. To add an additional
> > constraint that multiple mount points go through a small fixed-size
> > slot table seems like poor design.
>
> Its main purpose is precisely that of _limiting_ the amount of RPC
> buffer usage, and hence avoiding yet another potential source of memory
> deadlocks.

[ I might point out that this is not documented anywhere. But that's
an aside. ]

We are getting ahead of ourselves. The patches I wrote do not remove
the limit, they merely change it from a hard architectural limit to a
virtual limit.

BUT THE LIMIT STILL EXISTS, and defaults to 16 requests, just as before.

If the limit is exceeded, no RPC buffer is allocated, and tasks are
queued on the backlog queue, just as before. So the low-memory
behavior characteristics of the patches should be exactly the same or
somewhat better than before.

The point is to allow more flexibility. You can now change the limit
on the fly, while the transport is in use. This change is a
pre-requisite to allowing the client to tune itself as more mount
points use a single transport. Instead of a dumb fixed limit, we can
now think about a flexible dynamic limit that can allow greater
concurrency when resources are available.

I might also point out that the *real* limiter of memory usage is the
kmalloc in rpc_malloc. If it fails, call_allocate will delay and
loop. This has nothing to do with the slot table size, and suggests
that the slot table size limit is totally arbitrary.

> There is already a mechanism in place for allowing the user to fiddle
> with the limits.

Why should any user care about setting this limit? The client should
be able to regulate itself to make optimal use of the available
resources. Hand-tuning this limit is simply a workaround.

> > Perhaps we can add a per-mount point concurrency limit instead of a
> > per-transport limit?
>
> Why? What workloads are currently showing performance problems related
> to this issue?

They are listed above. See the paragraph that starts "The small slot
table size..."
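
[ Editorial illustration, not part of the original message. ]

The queueing behavior described in the message above can be pictured with a small toy model. This is a sketch in Python rather than kernel code, and every name in it (`Transport`, `max_reqs`, `submit`, `complete`, `set_limit`, `demo`) is hypothetical. It assumes only what the message states: a default limit of 16 requests, tasks beyond the limit waiting on a backlog queue, and a limit that can be changed while the transport is in use.

```python
from collections import deque


class Transport:
    """Toy model of a per-transport request limit with a backlog queue."""

    def __init__(self, max_reqs=16):
        self.max_reqs = max_reqs   # adjustable limit, 16 by default
        self.in_flight = set()     # tasks that currently hold a request
        self.backlog = deque()     # tasks waiting for the limit to allow them

    def submit(self, task):
        """Start the task if under the limit, otherwise queue it."""
        if len(self.in_flight) < self.max_reqs:
            self.in_flight.add(task)
            return True
        self.backlog.append(task)
        return False

    def complete(self, task):
        """Finish a task and let backlogged tasks use the freed capacity."""
        self.in_flight.remove(task)
        self._wake_backlog()

    def set_limit(self, max_reqs):
        """Change the limit on the fly; raising it drains the backlog."""
        self.max_reqs = max_reqs
        self._wake_backlog()

    def _wake_backlog(self):
        # Move waiting tasks into flight, oldest first, while capacity remains.
        while self.backlog and len(self.in_flight) < self.max_reqs:
            self.in_flight.add(self.backlog.popleft())


def demo():
    """Submit 20 tasks at the default limit, then raise the limit to 32."""
    xprt = Transport()
    for task in range(20):
        xprt.submit(task)
    before = (len(xprt.in_flight), len(xprt.backlog))
    xprt.set_limit(32)
    after = (len(xprt.in_flight), len(xprt.backlog))
    return before, after
```

In this model, lowering the limit does not cancel requests already in flight; it only stops new ones from starting until enough have completed. Whether the actual patches behave that way is not stated in the thread.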

* Re: cel's patches for 2.6.18 kernels
From: Trond Myklebust @ 2006-09-21 17:21 UTC
To: Chuck Lever; +Cc: Linux NFS Mailing List

On Thu, 2006-09-21 at 11:51 -0400, Chuck Lever wrote:
> On 9/21/06, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> They are listed above. See the paragraph that starts "The small slot
> table size..."

Supporting facts and figures, please.

* Re: cel's patches for 2.6.18 kernels
From: Chuck Lever @ 2006-09-21 17:33 UTC
To: Trond Myklebust; +Cc: Linux NFS Mailing List

On 9/21/06, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> On Thu, 2006-09-21 at 11:51 -0400, Chuck Lever wrote:
> > On 9/21/06, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
>
> > They are listed above. See the paragraph that starts "The small slot
> > table size..."
>
> Supporting facts and figures, please.

Ask your colleagues in North Carolina. They have unpublished numbers
to demonstrate that 128 slots improves TPC-C significantly over 32, or
even 64 slots.

And you know very well that the client will dump a lot of writes onto
the wire at once, via nfs_writepages. If the server can handle them,
then the workload will move forward more quickly. This kind of burst
is typical of intensive write workloads, and the burst of writes will
block other traffic on the transport if there are only a few slots
available.
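
[ Editorial illustration, not part of the original message. ]

The blocking effect described in the last paragraph above can be put into rough numbers. The sketch below is a deliberately crude model, not a measurement: it assumes strict FIFO ordering, that every request takes the same single "round" to complete, and a burst size of 256 writes chosen purely for illustration. The function name `rounds_before_start` is hypothetical.

```python
def rounds_before_start(queued_ahead, slots):
    """Full rounds a new request waits behind `queued_ahead` earlier requests.

    Assumes FIFO service, `slots` requests running concurrently, and every
    request finishing in exactly one round.
    """
    if slots <= 0:
        raise ValueError("slots must be positive")
    return queued_ahead // slots


# A request arriving behind an (assumed) burst of 256 writes:
wait_16 = rounds_before_start(256, 16)    # small slot table
wait_128 = rounds_before_start(256, 128)  # larger slot table
```

Under these assumptions the unrelated request waits 16 rounds with 16 slots and 2 rounds with 128 slots. Real behavior depends on server speed, request sizes, and scheduling, none of which this model captures.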

* Re: cel's patches for 2.6.18 kernels
From: Trond Myklebust @ 2006-09-21 17:33 UTC
To: Chuck Lever; +Cc: Linux NFS Mailing List

On Thu, 2006-09-21 at 11:51 -0400, Chuck Lever wrote:
> I didn't say "immune". Slabs improve low-memory behavior. They limit
> the amount of internal memory fragmentation, and provide a clean and
> automatic API for reaping unused memory when the system has passed its
> low-memory threshold.

So does kmalloc, which is built on a set of slabs. The main difference
between the two is that slabs tend to be limited to one task at hand.

> Even when a mount point is totally idle and the connection has timed
> out, the slot table is still there. It's a large piece of memory,
> usually a page or more. With these patches, that memory usage is
> eliminated when a transport is idle, and can be reclaimed if needed.

We'd usually prefer that the VM reclaim dirty pages first.

> > > The small slot table size already throttles write-intensive workloads
> > > and anything that tries to drive concurrent I/O. To add an additional
> > > constraint that multiple mount points go through a small fixed-size
> > > slot table seems like poor design.
> >
> > Its main purpose is precisely that of _limiting_ the amount of RPC
> > buffer usage, and hence avoiding yet another potential source of memory
> > deadlocks.
>
> [ I might point out that this is not documented anywhere. But that's
> an aside. ]
>
> We are getting ahead of ourselves. The patches I wrote do not remove
> the limit, they merely change it from a hard architectural limit to a
> virtual limit.
>
> BUT THE LIMIT STILL EXISTS, and defaults to 16 requests, just as before.

As I've said before, that is intentional.

> If the limit is exceeded, no RPC buffer is allocated, and tasks are
> queued on the backlog queue, just as before. So the low-memory
> behavior characteristics of the patches should be exactly the same or
> somewhat better than before.

No. They will differ, because your RPC queue can now eat unlimited
amounts of memory.

> The point is to allow more flexibility. You can now change the limit
> on the fly, while the transport is in use. This change is a
> pre-requisite to allowing the client to tune itself as more mount
> points use a single transport. Instead of a dumb fixed limit, we can
> now think about a flexible dynamic limit that can allow greater
> concurrency when resources are available.
>
> I might also point out that the *real* limiter of memory usage is the
> kmalloc in rpc_malloc. If it fails, call_allocate will delay and
> loop. This has nothing to do with the slot table size, and suggests
> that the slot table size limit is totally arbitrary.

Correct. The real limiter is the kmalloc, and that is why we don't want
to allow arbitrary slot sizes. We do not want the RPC layer to eat
unlimited gobs of memory.

> > There is already a mechanism in place for allowing the user to fiddle
> > with the limits.
>
> Why should any user care about setting this limit? The client should
> be able to regulate itself to make optimal use of the available
> resources. Hand-tuning this limit is simply a workaround.

Remind me why _you_ lobbied for the ability to do this?

* Re: cel's patches for 2.6.18 kernels
From: Chuck Lever @ 2006-09-21 17:37 UTC
To: Trond Myklebust; +Cc: Linux NFS Mailing List

On 9/21/06, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> On Thu, 2006-09-21 at 11:51 -0400, Chuck Lever wrote:
> > > There is already a mechanism in place for allowing the user to fiddle
> > > with the limits.
> >
> > Why should any user care about setting this limit? The client should
> > be able to regulate itself to make optimal use of the available
> > resources. Hand-tuning this limit is simply a workaround.
>
> Remind me why _you_ lobbied for the ability to do this?

Because currently it is the only way you will accept to make any
adjustment at all. It was a compromise.

* Re: cel's patches for 2.6.18 kernels
From: Trond Myklebust @ 2006-09-21 17:41 UTC
To: Chuck Lever; +Cc: Linux NFS Mailing List

On Thu, 2006-09-21 at 13:37 -0400, Chuck Lever wrote:
> On 9/21/06, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> > On Thu, 2006-09-21 at 11:51 -0400, Chuck Lever wrote:
> > > > There is already a mechanism in place for allowing the user to fiddle
> > > > with the limits.
> > >
> > > Why should any user care about setting this limit? The client should
> > > be able to regulate itself to make optimal use of the available
> > > resources. Hand-tuning this limit is simply a workaround.
> >
> > Remind me why _you_ lobbied for the ability to do this?
>
> Because currently it is the only way you will accept to make any
> adjustment at all. It was a compromise.

...and it is one we are going to live with until you can prove to me
that these changes are safe.

As for your assertion that RTP has shown that 128 slots gives better
performance, that is good and fine, but does nothing to convince me that
these new patches are any improvement over what we already have.
* Re: cel's patches for 2.6.18 kernels 2006-09-20 20:53 ` Chuck Lever 2006-09-20 21:16 ` Trond Myklebust @ 2006-09-21 0:06 ` Chris Croswhite 2006-09-21 13:48 ` Tony Reix 2 siblings, 0 replies; 17+ messages in thread From: Chris Croswhite @ 2006-09-21 0:06 UTC (permalink / raw) To: Chuck Lever; +Cc: Christoph Hellwig, Linux NFS Mailing List ET3.1.7a is installed for linux only. Solaris and AIX will be later tonight. Chris On Wed, 2006-09-20 at 13:53, Chuck Lever wrote: > On 9/20/06, Christoph Hellwig <hch@infradead.org> wrote: > > On Wed, Sep 20, 2006 at 02:28:10PM -0400, Chuck Lever wrote: > > > This e-mail announces the release of cel's patch set for Linux 2.6.18 > > > kernels. The patchset contains several important features, including: > > > > > > 1. The finishing patches for the RPC client transport switch > > > > > > 2. Support for rpcbind protocol versions 2 and 3 > > > > > > 3. Some support for IPv6 in the RPC and NFS clients > > > > > > 4. Elimination of the BKL in the RPC and NFS clients > > > > > > 5. Elimination of the RPC slot table > > > > What's the merge plan for those patches? The transport switch and ipv6 > > code is eagerly anticipated by many people. > > About two thirds of the RPC transport switch is already in. > Twenty-ish more patches are going in 2.6.19. There remain roughly 30 > more patches that I need to walk through with Trond, and we can > probably get those into 2.6.20. I wouldn't mind moving them up, but > that's up to Trond. > > The IPv6 stuff is still being finished. Olaf may be able to help with > the NLM parts; Bull is still working through the NFS server parts; I'm > trying to act as integrator. I would like to line up some testers. I > think those may be as far out as three or four releases, but progress > is occurring -- note there are a lot more IPv6 patches in this > patchset than there were in 2.6.17's. > > Elimination of the BKL is complete except for a few unimportant spots. 
> All it needs is thorough testing on a multi-way ppc64 system to > reveal niggling problems. > > The RPC slot table patches are completely rewritten in this release. > That will probably need some testing and review. > > -- > "We who cut mere stones must always be envisioning cathedrals" > -- Quarry worker's creed
* Re: cel's patches for 2.6.18 kernels 2006-09-20 20:53 ` Chuck Lever 2006-09-20 21:16 ` Trond Myklebust 2006-09-21 0:06 ` Chris Croswhite @ 2006-09-21 13:48 ` Tony Reix 2 siblings, 0 replies; 17+ messages in thread From: Tony Reix @ 2006-09-21 13:48 UTC (permalink / raw) To: Linux NFS Mailing List; +Cc: Aurélien Charbon On Wednesday, 20 September 2006 at 16:53 -0400, Chuck Lever wrote: > On 9/20/06, Christoph Hellwig <hch@infradead.org> wrote: > > On Wed, Sep 20, 2006 at 02:28:10PM -0400, Chuck Lever wrote: > The IPv6 stuff is still being finished. Olaf may be able to help with > the NLM parts; Bull is still working through the NFS server parts; The "IPv6 patches for NFSv4 server part" are already available at: http://nfsv4.bullopensource.org/patches/ipv6-server/IPv6_patchset.php Aurélien is working with Bruce Fields to integrate them. Regards, Tony
end of thread, other threads:[~2006-09-21 17:41 UTC | newest]

Thread overview: 17+ messages
2006-09-20 18:28 cel's patches for 2.6.18 kernels Chuck Lever
2006-09-20 20:20 ` Christoph Hellwig
2006-09-20 20:53 ` Chuck Lever
2006-09-20 21:16 ` Trond Myklebust
2006-09-21 2:29 ` Chuck Lever
2006-09-21 12:26 ` Steve Dickson
2006-09-21 13:47 ` Trond Myklebust
2006-09-21 14:50 ` Chuck Lever
2006-09-21 15:06 ` Trond Myklebust
2006-09-21 15:51 ` Chuck Lever
2006-09-21 17:21 ` Trond Myklebust
2006-09-21 17:33 ` Chuck Lever
2006-09-21 17:33 ` Trond Myklebust
2006-09-21 17:37 ` Chuck Lever
2006-09-21 17:41 ` Trond Myklebust
2006-09-21 0:06 ` Chris Croswhite
2006-09-21 13:48 ` Tony Reix