From: "Chen, Kenneth W"
Date: Fri, 28 Oct 2005 02:54:34 +0000
Subject: RE: ia64 get_mmu_context patch
Message-Id: <200510280254.j9S2sYg12254@unix-os.sc.intel.com>
In-Reply-To: <200510271728.j9RHScS0002221922@kitche.zk3.dec.com>
References: <200510271728.j9RHScS0002221922@kitche.zk3.dec.com>
To: linux-ia64@vger.kernel.org

Peter Keilty wrote on Thursday, October 27, 2005 10:28 AM:
> Please find attached the IA64 context_id patch and supporting data
> for your review and consideration.
> ...
> Lockstat Data:
> There are 4 sets of lockstat data, one each for loads of 40K,
> 30K, 20K, and 40K with no fork test.  The lockstat data shows
> that as loading increases, so does contention on the task lock
> in wrap_mmu_context, along with utilization of the ia64_ctx
> lock and the ia64_global_tlb_purge lock.

The current implementation of wrap_mmu_context does not fully utilize
the rid space at the time of a wrap.  It finds the first available
free range starting from ia64_ctx.next, which is presumably much
smaller than max_ctx.  Was the lock contention caused by much more
frequent calls to wrap_mmu_context?  Ideally, it should wrap only when
the entire rid space is exhausted.  As it stands, wrap_mmu_context is
suboptimal in performance.

> wrap_mmu_context (struct mm_struct *mm)
> { ....
> @@ -52,28 +74,23 @@
>  	ia64_ctx.limit = max_ctx + 1;
>
>  	/*
> -	 * Scan all the task's mm->context and set proper safe range
> +	 * Scan the ia64_ctx bitmap and set proper safe range
>  	 */
> +repeat:
> +	next_ctx = find_next_zero_bit(ia64_ctx.bitmap, ia64_ctx.limit, ia64_ctx.next);
> +	if (next_ctx >= ia64_ctx.limit) {
> +		smp_mb();
> +		ia64_ctx.next = 300;	/* skip daemons */
> +		goto repeat;
> +	}
> +	ia64_ctx.next = next_ctx;

I like the bitmap thing.  But why is all this old range-finding code
still here?  You have a full bitmap that tracks used ctx_ids; one more
bitmap can be added to track pending flushes.  Then, at wrap time, we
can simply xor the two to recover the full set of reusable rids.  With
that, the kernel will wrap only when the entire rid space is
exhausted.  I will post a patch; a rough sketch of the idea is below.

- Ken
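
To make that concrete, here is a sketch of what the wrap path could
look like with a second bitmap.  This is only a sketch of the idea,
not the patch itself: the flushmap field, its sizing, and the
delayed-flush bookkeeping it assumes are illustrative.

	/*
	 * Sketch only: assumes ia64_ctx grows a second bitmap,
	 * flushmap, in which the tlb-flush path marks each freed rid
	 * whose flush is still pending.  Both bitmaps hold
	 * max_ctx + 1 bits; flushmap bits are cleared atomically.
	 */
	void
	wrap_mmu_context (struct mm_struct *mm)
	{
		int i;
		unsigned long flush_bit;

		/*
		 * A rid with a pending flush is still marked in-use
		 * in ia64_ctx.bitmap, so xor-ing the flushmap into it
		 * clears exactly those bits: the entire freed rid
		 * space becomes allocatable again in one pass.
		 */
		for (i = 0; i <= ia64_ctx.max_ctx / BITS_PER_LONG; i++) {
			flush_bit = xchg(&ia64_ctx.flushmap[i], 0);
			ia64_ctx.bitmap[i] ^= flush_bit;
		}

		/* use offset at 300 to skip daemons */
		ia64_ctx.next = find_next_zero_bit(ia64_ctx.bitmap,
					ia64_ctx.max_ctx, 300);
		ia64_ctx.limit = find_next_bit(ia64_ctx.bitmap,
					ia64_ctx.max_ctx, ia64_ctx.next);

		/* ... then purge the TLBs on all cpus, as before ... */
	}

With something like this, get_mmu_context falls back to
wrap_mmu_context only when every rid above 300 is genuinely live,
instead of every time the scan from ia64_ctx.next runs dry.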