From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Paul E. McKenney"
Subject: Re: Ottawa and slow hash-table resize
Date: Mon, 23 Feb 2015 14:35:14 -0800
Message-ID: <20150223223514.GB15405@linux.vnet.ibm.com>
References: <20150223184904.GA24955@linux.vnet.ibm.com> <20150223210037.GA806@casper.infradead.org>
Reply-To: paulmck@linux.vnet.ibm.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: alexei.starovoitov@gmail.com, herbert@gondor.apana.org.au, kaber@trash.net, davem@davemloft.net, ying.xue@windriver.com, netdev@vger.kernel.org, netfilter-devel@vger.kernel.org, josh@joshtriplett.org
To: Thomas Graf
Return-path:
Content-Disposition: inline
In-Reply-To: <20150223210037.GA806@casper.infradead.org>
Sender: netdev-owner@vger.kernel.org
List-Id: netfilter-devel.vger.kernel.org

On Mon, Feb 23, 2015 at 09:00:37PM +0000, Thomas Graf wrote:
> On 02/23/15 at 10:49am, Paul E. McKenney wrote:
> > Hello!
> >
> > Alexei mentioned that there was some excitement a couple of weeks ago in
> > Ottawa, something about the resizing taking forever when there were large
> > numbers of concurrent additions. One approach comes to mind:
> >
> > o   Currently, the hash table does not allow additions concurrently
> >     with resize operations. One way to allow this would be to
> >     have the addition operations add to the new hash table at the
> >     head of the lists. This would clearly require also updating the
> >     pointers used to control the unzip operation.
>
> I've already added this. Additions and removals can occur in
> parallel to the resize and will go to the head of the new chain.

Good! (I guess I got confused by one of the comments. Then again, I was
looking at 3.19.)

> > o   Count the number of entries added during the resize operation.
> >     Then, at the end of the resize operation, if enough entries have
> >     been added, do a resize, but by multiple factors of two if
> >     need be.
> >
> > This should allow the table to take arbitrarily large numbers of updates
> > during a resize operation. There are some other possibilities if this
> > approach does not work out.
>
> The main problem is rapid growth of the table on small tables,
> e.g. shift 4-6. Going through multiple grow cycles while
> thousands of entries are being added will lead to long chains
> which will require multiple RCU grace periods per growth and
> thus slow things down.
>
> The bucket locking is designed to ignore the highest order bit
> of the hash to make sure that a single bucket lock in the new
> double sized table protects both buckets which map to the
> same bucket in the old table. This simplifies locking a lot and
> does not require nested locking. Growing by more than a factor
> of two would require manually locking all buckets to which
> entries in the old bucket may map.

Or just ignore the (say) two upper bits if growing by (say) a factor
of four. (If I understand what you are doing here, anyway.)

> However, we do not want to grow the bucket lock mask
> indefinitely so we could for example grow quicker if the
> lock mask allows. Needs some more thought but it's definitely
> doable and we need to provide users of the hash table with
> ways to find a balance according to their needs.

Indeed, finding the right balance can be tricky!

							Thanx, Paul
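
To make the two ideas above concrete, here is a minimal userspace sketch,
not the actual rhashtable code. It assumes power-of-two table sizes and a
lock array sized by a separate shift. The names bucket_index(),
lock_index(), and grow_steps() are hypothetical, and the "one entry per
bucket" growth target is an assumption chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bucket index in a table of 2^shift buckets. */
static inline uint32_t bucket_index(uint32_t hash, unsigned int shift)
{
	assert(shift < 32);
	return hash & ((UINT32_C(1) << shift) - 1);
}

/*
 * Hypothetical helper: lock index for a lock array of 2^lock_shift locks.
 * As long as lock_shift is no larger than the old table's shift, the upper
 * bits that distinguish the new buckets are ignored, so every new bucket
 * that splits off from one old bucket maps to the same lock. This holds
 * for growth by a factor of two (one ignored bit) as well as by a factor
 * of four (two ignored bits).
 */
static inline uint32_t lock_index(uint32_t hash, unsigned int lock_shift)
{
	assert(lock_shift < 32);
	return hash & ((UINT32_C(1) << lock_shift) - 1);
}

/*
 * Hypothetical helper: given the number of entries present once a resize
 * completes, return how many further doublings are needed so that the
 * table has at least one bucket per entry (an assumed target), without
 * exceeding max_shift.
 */
static inline unsigned int grow_steps(uint64_t nelems, unsigned int shift,
				      unsigned int max_shift)
{
	unsigned int steps = 0;

	assert(max_shift < 64);
	while (shift + steps < max_shift &&
	       (UINT64_C(1) << (shift + steps)) < nelems)
		steps++;
	return steps;
}
```

For example, with an old table of shift 4 growing by a factor of four to
shift 6, hashes 11, 27, 43, and 59 all sit in old bucket 11, land in four
distinct new buckets, and still share lock 11 when lock_shift is 4. Capping
max_shift in grow_steps() is one way to express "grow quicker only if the
lock mask allows".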