diff for duplicates of <444F8096.2070308@redhat.com> diff --git a/a/1.txt b/N1/1.txt index 84b0c55..95368aa 100644 --- a/a/1.txt +++ b/N1/1.txt @@ -1,21 +1,19 @@ Trond Myklebust wrote: >On Wed, 2006-04-26 at 09:14 -0400, Peter Staubach wrote: -> =20 +> > >>Trond Myklebust wrote: >> ->> =20 +>> >> >>>On Tue, 2006-04-25 at 21:14 -0400, Steve Dickson wrote: ->>>=20 +>>> >>> ->>> =20 +>>> >>> ->>>>Currently the NFS client caches ACCESS information on a per uid bas= -is ->>>>which fall apart when different process with different uid consiste= -ntly +>>>>Currently the NFS client caches ACCESS information on a per uid basis +>>>>which fall apart when different process with different uid consistently >>>>access the same directory. The end result being a storm of needless >>>>ACCESS calls... >>>> @@ -24,83 +22,64 @@ ntly >>>>attributes timeout.. The table is indexed by the addition of the >>>>nfs_inode pointer and the cr_uid in the cred structure which should >>>>spread things out nicely for some decent scalability (although the ->>>>locking scheme may need to be reworked a bit). The table has 256 en= -tries +>>>>locking scheme may need to be reworked a bit). The table has 256 entries >>>>of struct list_head giving it a total size of 2k. ->>>> =20 +>>>> >>>> ->>>> =20 +>>>> >>>> ->>>Instead of having the field 'id', why don't you let the nfs_inode ke= -ep a ->>>small (hashed?) list of all the nfs_access_entry objects that refer = -to +>>>Instead of having the field 'id', why don't you let the nfs_inode keep a +>>>small (hashed?) list of all the nfs_access_entry objects that refer to >>>it? That would speed up searches for cached entries. >>> ->>>I agree with Neil's assessment that we need a bound on the size of t= -he ->>>cache. In fact, enforcing a bound is pretty much the raison d'=EAtre= - for a +>>>I agree with Neil's assessment that we need a bound on the size of the +>>>cache. 
In fact, enforcing a bound is pretty much the raison d'être for a >>>global table (by which I mean that if we don't need a bound, then we >>>might as well cache everything in the nfs_inode). ->>>How about rather changing that hash table into an LRU list, then add= -ing ->>>a shrinker callback (using set_shrinker()) to allow the VM to free u= -p +>>>How about rather changing that hash table into an LRU list, then adding +>>>a shrinker callback (using set_shrinker()) to allow the VM to free up >>>entries when memory pressure dictates that it must? >>> ->>> =20 +>>> >>> ->>Previous implementations have shown that a single per inode linear=20 +>>Previous implementations have shown that a single per inode linear >>linked list ->>ends up not being scalable enough in certain situations. There would= - end up ->>being too many entries in the list and searching the list would becom= -e ->>a bottleneck. Adding a set of hash buckets per inode also proved to = -be ->>inefficient because in order to have enough hash buckets to make the = -hashing ->>efficient, much space was wasted. Having a single set of hash bucket= -s, +>>ends up not being scalable enough in certain situations. There would end up +>>being too many entries in the list and searching the list would become +>>a bottleneck. Adding a set of hash buckets per inode also proved to be +>>inefficient because in order to have enough hash buckets to make the hashing +>>efficient, much space was wasted. Having a single set of hash buckets, >>adequately sized, ended up being the best solution. ->> =20 +>> >> > >What situations? AFAIA the number of processes in a typical setup are >almost always far smaller than the number of cached inodes. > -> =20 +> > The situation that doesn't scale is one where there are many different -users on the system. It is the situation where there are more then jus= -t +users on the system. It is the situation where there are more then just a few users per file. 
This can happen on compute servers or systems used for timesharing sorts of purposes. >For instance on my laptop, I'm currently running 146 processes, but ->according to /proc/slabinfo I'm caching 330000 XFS inodes + 141500 ext= -3 +>according to /proc/slabinfo I'm caching 330000 XFS inodes + 141500 ext3 >inodes. ->If I were to assume that a typical nfsroot system will show roughly th= -e ->same behaviour, then it would mean that a typical bucket in Steve's 25= -6 +>If I were to assume that a typical nfsroot system will show roughly the +>same behaviour, then it would mean that a typical bucket in Steve's 256 >hash entry table will contain at least 2000 entries that I need to >search through every time I want to do an access call. > -=46or such a system, there needs to be more than 256 hash buckets. The= - number -of the access cache hash buckets needs to be on scale with the number o= -f=20 +For such a system, there needs to be more than 256 hash buckets. The number +of the access cache hash buckets needs to be on scale with the number of hash buckets used for similarly sized caches and tables. 
ps - -To unsubscribe from this list: send the line "unsubscribe linux-fsdevel= -" in +To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/a/content_digest b/N1/content_digest index 7e9f0e0..d39f227 100644 --- a/a/content_digest +++ b/N1/content_digest @@ -14,21 +14,19 @@ "Trond Myklebust wrote:\n" "\n" ">On Wed, 2006-04-26 at 09:14 -0400, Peter Staubach wrote:\n" - "> =20\n" + "> \n" ">\n" ">>Trond Myklebust wrote:\n" ">>\n" - ">> =20\n" + ">> \n" ">>\n" ">>>On Tue, 2006-04-25 at 21:14 -0400, Steve Dickson wrote:\n" - ">>>=20\n" + ">>> \n" ">>>\n" - ">>> =20\n" + ">>> \n" ">>>\n" - ">>>>Currently the NFS client caches ACCESS information on a per uid bas=\n" - "is\n" - ">>>>which fall apart when different process with different uid consiste=\n" - "ntly\n" + ">>>>Currently the NFS client caches ACCESS information on a per uid basis\n" + ">>>>which fall apart when different process with different uid consistently\n" ">>>>access the same directory. The end result being a storm of needless\n" ">>>>ACCESS calls...\n" ">>>>\n" @@ -37,85 +35,66 @@ ">>>>attributes timeout.. The table is indexed by the addition of the\n" ">>>>nfs_inode pointer and the cr_uid in the cred structure which should\n" ">>>>spread things out nicely for some decent scalability (although the\n" - ">>>>locking scheme may need to be reworked a bit). The table has 256 en=\n" - "tries\n" + ">>>>locking scheme may need to be reworked a bit). The table has 256 entries\n" ">>>>of struct list_head giving it a total size of 2k.\n" - ">>>> =20\n" + ">>>> \n" ">>>>\n" - ">>>> =20\n" + ">>>> \n" ">>>>\n" - ">>>Instead of having the field 'id', why don't you let the nfs_inode ke=\n" - "ep a\n" - ">>>small (hashed?) 
list of all the nfs_access_entry objects that refer =\n" - "to\n" + ">>>Instead of having the field 'id', why don't you let the nfs_inode keep a\n" + ">>>small (hashed?) list of all the nfs_access_entry objects that refer to\n" ">>>it? That would speed up searches for cached entries.\n" ">>>\n" - ">>>I agree with Neil's assessment that we need a bound on the size of t=\n" - "he\n" - ">>>cache. In fact, enforcing a bound is pretty much the raison d'=EAtre=\n" - " for a\n" + ">>>I agree with Neil's assessment that we need a bound on the size of the\n" + ">>>cache. In fact, enforcing a bound is pretty much the raison d'\303\252tre for a\n" ">>>global table (by which I mean that if we don't need a bound, then we\n" ">>>might as well cache everything in the nfs_inode).\n" - ">>>How about rather changing that hash table into an LRU list, then add=\n" - "ing\n" - ">>>a shrinker callback (using set_shrinker()) to allow the VM to free u=\n" - "p\n" + ">>>How about rather changing that hash table into an LRU list, then adding\n" + ">>>a shrinker callback (using set_shrinker()) to allow the VM to free up\n" ">>>entries when memory pressure dictates that it must?\n" ">>>\n" - ">>> =20\n" + ">>> \n" ">>>\n" - ">>Previous implementations have shown that a single per inode linear=20\n" + ">>Previous implementations have shown that a single per inode linear \n" ">>linked list\n" - ">>ends up not being scalable enough in certain situations. There would=\n" - " end up\n" - ">>being too many entries in the list and searching the list would becom=\n" - "e\n" - ">>a bottleneck. Adding a set of hash buckets per inode also proved to =\n" - "be\n" - ">>inefficient because in order to have enough hash buckets to make the =\n" - "hashing\n" - ">>efficient, much space was wasted. Having a single set of hash bucket=\n" - "s,\n" + ">>ends up not being scalable enough in certain situations. 
There would end up\n" + ">>being too many entries in the list and searching the list would become\n" + ">>a bottleneck. Adding a set of hash buckets per inode also proved to be\n" + ">>inefficient because in order to have enough hash buckets to make the hashing\n" + ">>efficient, much space was wasted. Having a single set of hash buckets,\n" ">>adequately sized, ended up being the best solution.\n" - ">> =20\n" + ">> \n" ">>\n" ">\n" ">What situations? AFAIA the number of processes in a typical setup are\n" ">almost always far smaller than the number of cached inodes.\n" ">\n" - "> =20\n" + "> \n" ">\n" "\n" "The situation that doesn't scale is one where there are many different\n" - "users on the system. It is the situation where there are more then jus=\n" - "t\n" + "users on the system. It is the situation where there are more then just\n" "a few users per file. This can happen on compute servers or systems\n" "used for timesharing sorts of purposes.\n" "\n" ">For instance on my laptop, I'm currently running 146 processes, but\n" - ">according to /proc/slabinfo I'm caching 330000 XFS inodes + 141500 ext=\n" - "3\n" + ">according to /proc/slabinfo I'm caching 330000 XFS inodes + 141500 ext3\n" ">inodes.\n" - ">If I were to assume that a typical nfsroot system will show roughly th=\n" - "e\n" - ">same behaviour, then it would mean that a typical bucket in Steve's 25=\n" - "6\n" + ">If I were to assume that a typical nfsroot system will show roughly the\n" + ">same behaviour, then it would mean that a typical bucket in Steve's 256\n" ">hash entry table will contain at least 2000 entries that I need to\n" ">search through every time I want to do an access call.\n" ">\n" "\n" - "=46or such a system, there needs to be more than 256 hash buckets. The=\n" - " number\n" - "of the access cache hash buckets needs to be on scale with the number o=\n" - "f=20\n" + "For such a system, there needs to be more than 256 hash buckets. 
The number\n" + "of the access cache hash buckets needs to be on scale with the number of \n" "hash\n" "buckets used for similarly sized caches and tables.\n" "\n" " ps\n" "-\n" - "To unsubscribe from this list: send the line \"unsubscribe linux-fsdevel=\n" - "\" in\n" + "To unsubscribe from this list: send the line \"unsubscribe linux-fsdevel\" in\n" "the body of a message to majordomo@vger.kernel.org\n" More majordomo info at http://vger.kernel.org/majordomo-info.html -7655f47819dfc96dc7eb99c5c1c370afce69a5f5b9128b29493344b991052caa +745f87d0048a4ea187da9cd2da488af9939b8cc4cb8d92b0162eea22e3a24c48
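The sizing argument in the quoted thread can be checked with a little arithmetic. The sketch below is only illustrative: the function names are hypothetical and not kernel identifiers, it assumes one cached access entry per cached inode and a perfectly uniform hash, and the modulo reduction to a bucket index is an assumption, since the message only says the table is indexed by the sum of the nfs_inode pointer and cr_uid.

```python
# Back-of-the-envelope check of the figures quoted in the thread.
# All names are illustrative; none of them are kernel identifiers.

def table_size_bytes(buckets: int, pointer_size: int) -> int:
    """Size of an array of list heads, each holding two pointers (next, prev)."""
    return buckets * 2 * pointer_size


def average_chain_length(entries: int, buckets: int) -> float:
    """Mean entries per bucket, assuming a perfectly uniform hash."""
    return entries / buckets


def bucket_index(inode_addr: int, uid: int, buckets: int) -> int:
    """Sum of inode address and uid; reducing by modulo is an assumption."""
    return (inode_addr + uid) % buckets


# Figures taken from the message: 330000 XFS inodes + 141500 ext3 inodes.
cached_inodes = 330_000 + 141_500

print(table_size_bytes(256, 4))   # 256 list heads with 4-byte pointers
print(table_size_bytes(256, 8))   # same table with 8-byte pointers
print(round(average_chain_length(cached_inodes, 256)))
```

With 4-byte pointers the table comes to 2048 bytes, which matches the "2k" in the original posting. Under the stated assumptions the mean chain length is about 1,840 entries, the same order of magnitude as the "at least 2000" estimate in the reply, and it supports the point that 256 buckets is too few for a cache of that size.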