From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx199.postini.com [74.125.245.199])
    by kanga.kvack.org (Postfix) with SMTP id 23C216B004D
    for ; Tue, 26 Jun 2012 17:37:05 -0400 (EDT)
Received: from akpm.mtv.corp.google.com (216-239-45-4.google.com [216.239.45.4])
    by mail.linuxfoundation.org (Postfix) with ESMTPSA id 7760E280
    for ; Tue, 26 Jun 2012 21:37:04 +0000 (UTC)
Date: Tue, 26 Jun 2012 14:37:03 -0700
From: Andrew Morton
Subject: needed lru_add_drain_all() change
Message-Id: <20120626143703.396d6d66.akpm@linux-foundation.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: linux-mm@kvack.org

https://bugzilla.kernel.org/show_bug.cgi?id=43811

lru_add_drain_all() uses schedule_on_each_cpu(). But
schedule_on_each_cpu() hangs if a realtime thread is spinning, pinned
to a CPU. There's no intention to change the scheduler behaviour, so I
think we should remove schedule_on_each_cpu() from the kernel.

The biggest user of schedule_on_each_cpu() is lru_add_drain_all().

Does anyone have any thoughts on how we can do this? The obvious
approach is to declare these:

static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS], lru_add_pvecs);
static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);

to be irq-safe and use on_each_cpu(). lru_rotate_pvecs is already
irq-safe and converting lru_add_pvecs and lru_deactivate_pvecs looks
pretty simple.

Thoughts?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx205.postini.com [74.125.245.205])
    by kanga.kvack.org (Postfix) with SMTP id 5D7176B005A
    for ; Tue, 26 Jun 2012 20:55:04 -0400 (EDT)
Message-ID: <4FEA59EE.8060804@kernel.org>
Date: Wed, 27 Jun 2012 09:55:10 +0900
From: Minchan Kim
MIME-Version: 1.0
Subject: Re: needed lru_add_drain_all() change
References: <20120626143703.396d6d66.akpm@linux-foundation.org>
In-Reply-To: <20120626143703.396d6d66.akpm@linux-foundation.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Andrew Morton
Cc: linux-mm@kvack.org, KOSAKI Motohiro

On 06/27/2012 06:37 AM, Andrew Morton wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=43811
>
> lru_add_drain_all() uses schedule_on_each_cpu(). But
> schedule_on_each_cpu() hangs if a realtime thread is spinning, pinned
> to a CPU. There's no intention to change the scheduler behaviour, so I
> think we should remove schedule_on_each_cpu() from the kernel.
>
> The biggest user of schedule_on_each_cpu() is lru_add_drain_all().
>
> Does anyone have any thoughts on how we can do this? The obvious
> approach is to declare these:
>
> static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS], lru_add_pvecs);
> static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
> static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);

One more
static DEFINE_PER_CPU(struct pagevec, activate_page_pvecs);

>
> to be irq-safe and use on_each_cpu(). lru_rotate_pvecs is already
> irq-safe and converting lru_add_pvecs and lru_deactivate_pvecs looks
> pretty simple.

Yes. Changing looks simple.
I'm okay with lru_[activate_page|deactivate]_pvecs because it's not hot
but lru_rotate_pvecs is hotter than others.
Considering mlock and CPU pinning
of realtime thread is very rare, it might be rather expensive solution.
Unfortunately, I have no idea better than you suggested. :(

And looking 8891d6da17, mlock's lru_add_drain_all isn't must.
If it's really bother us, couldn't we remove it?

--
Kind regards,
Minchan Kim

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx131.postini.com [74.125.245.131])
    by kanga.kvack.org (Postfix) with SMTP id A72DB6B005A
    for ; Tue, 26 Jun 2012 21:14:02 -0400 (EDT)
Date: Tue, 26 Jun 2012 18:15:04 -0700
From: Andrew Morton
Subject: Re: needed lru_add_drain_all() change
Message-Id: <20120626181504.23b8b73d.akpm@linux-foundation.org>
In-Reply-To: <4FEA59EE.8060804@kernel.org>
References: <20120626143703.396d6d66.akpm@linux-foundation.org>
    <4FEA59EE.8060804@kernel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Minchan Kim
Cc: linux-mm@kvack.org, KOSAKI Motohiro

On Wed, 27 Jun 2012 09:55:10 +0900 Minchan Kim wrote:

> On 06/27/2012 06:37 AM, Andrew Morton wrote:
>
> > https://bugzilla.kernel.org/show_bug.cgi?id=43811
> >
> > lru_add_drain_all() uses schedule_on_each_cpu(). But
> > schedule_on_each_cpu() hangs if a realtime thread is spinning, pinned
> > to a CPU. There's no intention to change the scheduler behaviour, so I
> > think we should remove schedule_on_each_cpu() from the kernel.
> >
> > The biggest user of schedule_on_each_cpu() is lru_add_drain_all().
> >
> > Does anyone have any thoughts on how we can do this? The obvious
> > approach is to declare these:
> >
> > static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS], lru_add_pvecs);
> > static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
> > static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);
>
>
> One more
> static DEFINE_PER_CPU(struct pagevec, activate_page_pvecs);
>
> >
> > to be irq-safe and use on_each_cpu(). lru_rotate_pvecs is already
> > irq-safe and converting lru_add_pvecs and lru_deactivate_pvecs looks
> > pretty simple.
>
>
> Yes. Changing looks simple.
> I'm okay with lru_[activate_page|deactivate]_pvecs because it's not hot
> but lru_rotate_pvecs is hotter than others.

I don't think any change is needed for lru_rotate_pvecs?

> Considering mlock and CPU pinning
> of realtime thread is very rare, it might be rather expensive solution.
> Unfortunately, I have no idea better than you suggested. :(
>
> And looking 8891d6da17, mlock's lru_add_drain_all isn't must.
> If it's really bother us, couldn't we remove it?

"grep lru_add_drain_all mm/*.c". They're all problematic.

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx156.postini.com [74.125.245.156])
    by kanga.kvack.org (Postfix) with SMTP id 331186B005A
    for ; Tue, 26 Jun 2012 21:20:22 -0400 (EDT)
Message-ID: <4FEA5FD8.9060806@kernel.org>
Date: Wed, 27 Jun 2012 10:20:24 +0900
From: Minchan Kim
MIME-Version: 1.0
Subject: Re: needed lru_add_drain_all() change
References: <20120626143703.396d6d66.akpm@linux-foundation.org>
    <4FEA59EE.8060804@kernel.org>
    <20120626181504.23b8b73d.akpm@linux-foundation.org>
In-Reply-To: <20120626181504.23b8b73d.akpm@linux-foundation.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Andrew Morton
Cc: linux-mm@kvack.org, KOSAKI Motohiro

Hi Andrew,

On 06/27/2012 10:15 AM, Andrew Morton wrote:

> On Wed, 27 Jun 2012 09:55:10 +0900 Minchan Kim wrote:
>
>> On 06/27/2012 06:37 AM, Andrew Morton wrote:
>>
>>> https://bugzilla.kernel.org/show_bug.cgi?id=43811
>>>
>>> lru_add_drain_all() uses schedule_on_each_cpu(). But
>>> schedule_on_each_cpu() hangs if a realtime thread is spinning, pinned
>>> to a CPU. There's no intention to change the scheduler behaviour, so I
>>> think we should remove schedule_on_each_cpu() from the kernel.
>>>
>>> The biggest user of schedule_on_each_cpu() is lru_add_drain_all().
>>>
>>> Does anyone have any thoughts on how we can do this? The obvious
>>> approach is to declare these:
>>>
>>> static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS], lru_add_pvecs);
>>> static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
>>> static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);
>>
>>
>> One more
>> static DEFINE_PER_CPU(struct pagevec, activate_page_pvecs);
>>
>>>
>>> to be irq-safe and use on_each_cpu(). lru_rotate_pvecs is already
>>> irq-safe and converting lru_add_pvecs and lru_deactivate_pvecs looks
>>> pretty simple.
>>
>>
>> Yes. Changing looks simple.
>> I'm okay with lru_[activate_page|deactivate]_pvecs because it's not hot
>> but lru_rotate_pvecs is hotter than others.
>
> I don't think any change is needed for lru_rotate_pvecs?

Sorry for the typo
lru_add_pvecs

>
>> Considering mlock and CPU pinning
>> of realtime thread is very rare, it might be rather expensive solution.
>> Unfortunately, I have no idea better than you suggested. :(
>>
>> And looking 8891d6da17, mlock's lru_add_drain_all isn't must.
>> If it's really bother us, couldn't we remove it?
>
> "grep lru_add_drain_all mm/*.c". They're all problematic.
>

--
Kind regards,
Minchan Kim
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx138.postini.com [74.125.245.138])
    by kanga.kvack.org (Postfix) with SMTP id 439DC6B005A
    for ; Tue, 26 Jun 2012 21:28:11 -0400 (EDT)
Date: Tue, 26 Jun 2012 18:29:13 -0700
From: Andrew Morton
Subject: Re: needed lru_add_drain_all() change
Message-Id: <20120626182913.4098e5c4.akpm@linux-foundation.org>
In-Reply-To: <4FEA5FD8.9060806@kernel.org>
References: <20120626143703.396d6d66.akpm@linux-foundation.org>
    <4FEA59EE.8060804@kernel.org>
    <20120626181504.23b8b73d.akpm@linux-foundation.org>
    <4FEA5FD8.9060806@kernel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Minchan Kim
Cc: linux-mm@kvack.org, KOSAKI Motohiro

On Wed, 27 Jun 2012 10:20:24 +0900 Minchan Kim wrote:

> >> Yes. Changing looks simple.
> >> I'm okay with lru_[activate_page|deactivate]_pvecs because it's not hot
> >> but lru_rotate_pvecs is hotter than others.
> >
> > I don't think any change is needed for lru_rotate_pvecs?
>
>
> Sorry for the typo
> lru_add_pvecs

OK. A local_irq_save/restore shouldn't be tooooo expensive. We can
remove the current get_cpu()/put_cpu() to reclaim some of the overhead.
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx202.postini.com [74.125.245.202])
    by kanga.kvack.org (Postfix) with SMTP id 3AF276B005A
    for ; Tue, 26 Jun 2012 22:09:36 -0400 (EDT)
Received: from list by plane.gmane.org with local (Exim 4.69)
    (envelope-from ) id 1SjhhE-0001KM-7B
    for linux-mm@kvack.org; Wed, 27 Jun 2012 04:09:32 +0200
Received: from 121.50.20.41 ([121.50.20.41])
    by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00
    for ; Wed, 27 Jun 2012 04:09:32 +0200
Received: from minchan by 121.50.20.41 with local (Gmexim 0.1 (Debian))
    id 1AlnuQ-0007hv-00
    for ; Wed, 27 Jun 2012 04:09:32 +0200
From: Minchan Kim
Subject: Re: needed lru_add_drain_all() change
Date: Wed, 27 Jun 2012 11:09:31 +0900
Message-ID: <4FEA6B5B.5000205@kernel.org>
References: <20120626143703.396d6d66.akpm@linux-foundation.org>
    <4FEA59EE.8060804@kernel.org>
    <20120626181504.23b8b73d.akpm@linux-foundation.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
In-Reply-To: <20120626181504.23b8b73d.akpm@linux-foundation.org>
Sender: owner-linux-mm@kvack.org
List-ID:
To: linux-mm@kvack.org
Cc: KOSAKI Motohiro

On 06/27/2012 10:15 AM, Andrew Morton wrote:

>> Considering mlock and CPU pinning
>> > of realtime thread is very rare, it might be rather expensive solution.
>> > Unfortunately, I have no idea better than you suggested. :(
>> >
>> > And looking 8891d6da17, mlock's lru_add_drain_all isn't must.
>> > If it's really bother us, couldn't we remove it?
> "grep lru_add_drain_all mm/*.c". They're all problematic.

Yeb but I'm not sure such system modeling is good.
Potentially, It could make problem once we use workqueue of other CPU.

--
Kind regards,
Minchan Kim
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx162.postini.com [74.125.245.162])
    by kanga.kvack.org (Postfix) with SMTP id 88B8C6B005A
    for ; Wed, 27 Jun 2012 01:11:14 -0400 (EDT)
Date: Tue, 26 Jun 2012 22:12:17 -0700
From: Andrew Morton
Subject: Re: needed lru_add_drain_all() change
Message-Id: <20120626221217.1682572a.akpm@linux-foundation.org>
In-Reply-To: <4FEA6B5B.5000205@kernel.org>
References: <20120626143703.396d6d66.akpm@linux-foundation.org>
    <4FEA59EE.8060804@kernel.org>
    <20120626181504.23b8b73d.akpm@linux-foundation.org>
    <4FEA6B5B.5000205@kernel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Minchan Kim
Cc: linux-mm@kvack.org, KOSAKI Motohiro

On Wed, 27 Jun 2012 11:09:31 +0900 Minchan Kim wrote:

> On 06/27/2012 10:15 AM, Andrew Morton wrote:
>
> >> Considering mlock and CPU pinning
> >> > of realtime thread is very rare, it might be rather expensive solution.
> >> > Unfortunately, I have no idea better than you suggested. :(
> >> >
> >> > And looking 8891d6da17, mlock's lru_add_drain_all isn't must.
> >> > If it's really bother us, couldn't we remove it?
> > "grep lru_add_drain_all mm/*.c". They're all problematic.
>
>
> Yeb but I'm not sure such system modeling is good.
> Potentially, It could make problem once we use workqueue of other CPU.

whut?

My suggestion is that we switch lru_add_drain_all() to on_each_cpu()
and delete schedule_on_each_cpu(). No workqueues.
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx169.postini.com [74.125.245.169])
    by kanga.kvack.org (Postfix) with SMTP id 257686B005A
    for ; Wed, 27 Jun 2012 01:41:34 -0400 (EDT)
Message-ID: <4FEA9D13.6070409@kernel.org>
Date: Wed, 27 Jun 2012 14:41:39 +0900
From: Minchan Kim
MIME-Version: 1.0
Subject: Re: needed lru_add_drain_all() change
References: <20120626143703.396d6d66.akpm@linux-foundation.org>
    <4FEA59EE.8060804@kernel.org>
    <20120626181504.23b8b73d.akpm@linux-foundation.org>
    <4FEA6B5B.5000205@kernel.org>
    <20120626221217.1682572a.akpm@linux-foundation.org>
In-Reply-To: <20120626221217.1682572a.akpm@linux-foundation.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Andrew Morton
Cc: linux-mm@kvack.org, KOSAKI Motohiro

On 06/27/2012 02:12 PM, Andrew Morton wrote:

> On Wed, 27 Jun 2012 11:09:31 +0900 Minchan Kim wrote:
>
>> On 06/27/2012 10:15 AM, Andrew Morton wrote:
>>
>>>> Considering mlock and CPU pinning
>>>>> of realtime thread is very rare, it might be rather expensive solution.
>>>>> Unfortunately, I have no idea better than you suggested. :(
>>>>>
>>>>> And looking 8891d6da17, mlock's lru_add_drain_all isn't must.
>>>>> If it's really bother us, couldn't we remove it?
>>> "grep lru_add_drain_all mm/*.c". They're all problematic.
>>
>>
>> Yeb but I'm not sure such system modeling is good.
>> Potentially, It could make problem once we use workqueue of other CPU.
>
> whut?
>
> My suggestion is that we switch lru_add_drain_all() to on_each_cpu()
> and delete schedule_on_each_cpu(). No workqueues.

Current problem is that RT thread doesn't yield his CPU so other tasks can't be scheduled in.
schedule_on_each_cpu uses system workqueue so if there are any user to try using
workqueue for the CPU(ex, schedule_work_on), he can make trouble, too.
So my question is I doubt such greedy RT thread modeling is good.
Do I miss something?

--
Kind regards,
Minchan Kim

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx202.postini.com [74.125.245.202])
    by kanga.kvack.org (Postfix) with SMTP id 0977E6B005A
    for ; Wed, 27 Jun 2012 01:54:40 -0400 (EDT)
Date: Tue, 26 Jun 2012 22:55:44 -0700
From: Andrew Morton
Subject: Re: needed lru_add_drain_all() change
Message-Id: <20120626225544.068df1b9.akpm@linux-foundation.org>
In-Reply-To: <4FEA9D13.6070409@kernel.org>
References: <20120626143703.396d6d66.akpm@linux-foundation.org>
    <4FEA59EE.8060804@kernel.org>
    <20120626181504.23b8b73d.akpm@linux-foundation.org>
    <4FEA6B5B.5000205@kernel.org>
    <20120626221217.1682572a.akpm@linux-foundation.org>
    <4FEA9D13.6070409@kernel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Minchan Kim
Cc: linux-mm@kvack.org, KOSAKI Motohiro

On Wed, 27 Jun 2012 14:41:39 +0900 Minchan Kim wrote:

> On 06/27/2012 02:12 PM, Andrew Morton wrote:
>
> > On Wed, 27 Jun 2012 11:09:31 +0900 Minchan Kim wrote:
> >
> >> On 06/27/2012 10:15 AM, Andrew Morton wrote:
> >>
> >>>> Considering mlock and CPU pinning
> >>>>> of realtime thread is very rare, it might be rather expensive solution.
> >>>>> Unfortunately, I have no idea better than you suggested. :(
> >>>>>
> >>>>> And looking 8891d6da17, mlock's lru_add_drain_all isn't must.
> >>>>> If it's really bother us, couldn't we remove it?
> >>> "grep lru_add_drain_all mm/*.c". They're all problematic.
> >>
> >>
> >> Yeb but I'm not sure such system modeling is good.
> >> Potentially, It could make problem once we use workqueue of other CPU.
> >
> > whut?
> >
> > My suggestion is that we switch lru_add_drain_all() to on_each_cpu()
> > and delete schedule_on_each_cpu(). No workqueues.
>
>
> Current problem is that RT thread doesn't yield his CPU so other tasks can't be scheduled in.
> schedule_on_each_cpu uses system workqueue so if there are any user to try using
> workqueue for the CPU(ex, schedule_work_on), he can make trouble, too.
> So my question is I doubt such greedy RT thread modeling is good.
>

There's no way of fixing this without significantly degrading the
service which rt priority offers. As we don't wish to degrade that
service, schedule_work_on() and schedule_on_each_cpu() cannot be
implemented reliably. So we delete them.

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx119.postini.com [74.125.245.119])
    by kanga.kvack.org (Postfix) with SMTP id 2BC956B005A
    for ; Wed, 27 Jun 2012 02:33:02 -0400 (EDT)
Message-ID: <4FEAA925.9020202@kernel.org>
Date: Wed, 27 Jun 2012 15:33:09 +0900
From: Minchan Kim
MIME-Version: 1.0
Subject: Re: needed lru_add_drain_all() change
References: <20120626143703.396d6d66.akpm@linux-foundation.org>
    <4FEA59EE.8060804@kernel.org>
    <20120626181504.23b8b73d.akpm@linux-foundation.org>
    <4FEA6B5B.5000205@kernel.org>
    <20120626221217.1682572a.akpm@linux-foundation.org>
    <4FEA9D13.6070409@kernel.org>
    <20120626225544.068df1b9.akpm@linux-foundation.org>
In-Reply-To: <20120626225544.068df1b9.akpm@linux-foundation.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Andrew Morton
Cc: linux-mm@kvack.org, KOSAKI Motohiro, Peter Zijlstra

On 06/27/2012 02:55 PM, Andrew Morton wrote:

> On Wed, 27 Jun 2012 14:41:39 +0900 Minchan Kim wrote:
>
>> On 06/27/2012 02:12 PM, Andrew Morton wrote:
>>
>>> On Wed, 27 Jun 2012 11:09:31 +0900 Minchan Kim wrote:
>>>
>>>> On 06/27/2012 10:15 AM, Andrew Morton wrote:
>>>>
>>>>>> Considering mlock and CPU pinning
>>>>>>> of realtime thread is very rare, it might be rather expensive solution.
>>>>>>> Unfortunately, I have no idea better than you suggested. :(
>>>>>>>
>>>>>>> And looking 8891d6da17, mlock's lru_add_drain_all isn't must.
>>>>>>> If it's really bother us, couldn't we remove it?
>>>>> "grep lru_add_drain_all mm/*.c". They're all problematic.
>>>>
>>>>
>>>> Yeb but I'm not sure such system modeling is good.
>>>> Potentially, It could make problem once we use workqueue of other CPU.
>>>
>>> whut?
>>>
>>> My suggestion is that we switch lru_add_drain_all() to on_each_cpu()
>>> and delete schedule_on_each_cpu(). No workqueues.
>>
>>
>> Current problem is that RT thread doesn't yield his CPU so other tasks can't be scheduled in.
>> schedule_on_each_cpu uses system workqueue so if there are any user to try using
>> workqueue for the CPU(ex, schedule_work_on), he can make trouble, too.
>> So my question is I doubt such greedy RT thread modeling is good.
>>
>
> There's no way of fixing this without significantly degrading the
> service which rt priority offers. As we don't wish to degrade that
> service, schedule_work_on() and schedule_on_each_cpu() cannot be
> implemented reliably. So we delete them.

Okay. I'm not against strongly if local_irq_save/restore isn't expensive
as a first step for removing them because I have no good idea.
I want to add some comment on schedule_work_on and friends.
"You shouldn't use it any more and we will try to remove this".

Anyway, let's wait further answer, especially, RT folks.

--
Kind regards,
Minchan Kim
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx125.postini.com [74.125.245.125])
    by kanga.kvack.org (Postfix) with SMTP id 3C9ED6B005A
    for ; Wed, 27 Jun 2012 02:40:16 -0400 (EDT)
Date: Tue, 26 Jun 2012 23:41:19 -0700
From: Andrew Morton
Subject: Re: needed lru_add_drain_all() change
Message-Id: <20120626234119.755af455.akpm@linux-foundation.org>
In-Reply-To: <4FEAA925.9020202@kernel.org>
References: <20120626143703.396d6d66.akpm@linux-foundation.org>
    <4FEA59EE.8060804@kernel.org>
    <20120626181504.23b8b73d.akpm@linux-foundation.org>
    <4FEA6B5B.5000205@kernel.org>
    <20120626221217.1682572a.akpm@linux-foundation.org>
    <4FEA9D13.6070409@kernel.org>
    <20120626225544.068df1b9.akpm@linux-foundation.org>
    <4FEAA925.9020202@kernel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Minchan Kim
Cc: linux-mm@kvack.org, KOSAKI Motohiro, Peter Zijlstra

On Wed, 27 Jun 2012 15:33:09 +0900 Minchan Kim wrote:

> Anyway, let's wait further answer, especially, RT folks.

rt folks said "it isn't changing", and I agree with them. It isn't
worth breaking the rt-prio quality of service because a few odd parts
of the kernel did something inappropriate. Especially when those
few sites have alternatives.
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx172.postini.com [74.125.245.172])
    by kanga.kvack.org (Postfix) with SMTP id AFBD86B005A
    for ; Wed, 27 Jun 2012 02:44:59 -0400 (EDT)
Date: Tue, 26 Jun 2012 23:46:03 -0700
From: Andrew Morton
Subject: Re: needed lru_add_drain_all() change
Message-Id: <20120626234603.779f5cbb.akpm@linux-foundation.org>
In-Reply-To: <4FEAA925.9020202@kernel.org>
References: <20120626143703.396d6d66.akpm@linux-foundation.org>
    <4FEA59EE.8060804@kernel.org>
    <20120626181504.23b8b73d.akpm@linux-foundation.org>
    <4FEA6B5B.5000205@kernel.org>
    <20120626221217.1682572a.akpm@linux-foundation.org>
    <4FEA9D13.6070409@kernel.org>
    <20120626225544.068df1b9.akpm@linux-foundation.org>
    <4FEAA925.9020202@kernel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Minchan Kim
Cc: linux-mm@kvack.org, KOSAKI Motohiro, Peter Zijlstra

btw, the first step should be to audit all lru_add_drain_all() sites
and work out exactly why they are calling lru_add_drain_all() - what
are they trying to achieve?

Because we may be able to use a more lightweight approach there, or
handle the asynchronous behaviour in a more graceful fashion, rather
than forcing this massive synchronization barrier.
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx139.postini.com [74.125.245.139])
    by kanga.kvack.org (Postfix) with SMTP id 4F6356B005A
    for ; Wed, 27 Jun 2012 06:27:39 -0400 (EDT)
Message-ID: <1340792851.10063.20.camel@twins>
Subject: Re: needed lru_add_drain_all() change
From: Peter Zijlstra
Date: Wed, 27 Jun 2012 12:27:31 +0200
In-Reply-To: <20120626234119.755af455.akpm@linux-foundation.org>
References: <20120626143703.396d6d66.akpm@linux-foundation.org>
    <4FEA59EE.8060804@kernel.org>
    <20120626181504.23b8b73d.akpm@linux-foundation.org>
    <4FEA6B5B.5000205@kernel.org>
    <20120626221217.1682572a.akpm@linux-foundation.org>
    <4FEA9D13.6070409@kernel.org>
    <20120626225544.068df1b9.akpm@linux-foundation.org>
    <4FEAA925.9020202@kernel.org>
    <20120626234119.755af455.akpm@linux-foundation.org>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
Mime-Version: 1.0
Sender: owner-linux-mm@kvack.org
List-ID:
To: Andrew Morton
Cc: Minchan Kim, linux-mm@kvack.org, KOSAKI Motohiro

On Tue, 2012-06-26 at 23:41 -0700, Andrew Morton wrote:
> On Wed, 27 Jun 2012 15:33:09 +0900 Minchan Kim wrote:
>
> > Anyway, let's wait further answer, especially, RT folks.
>
> rt folks said "it isn't changing", and I agree with them. It isn't
> worth breaking the rt-prio quality of service because a few odd parts
> of the kernel did something inappropriate. Especially when those
> few sites have alternatives.

I'm not exactly sure its a 'few' sites.. but yeah there's a few obvious
sites we should look at.

Afaict all lru_add_drain_all() callers do this optimistically, esp.
since there's no hard sync. against adding new entries to the per-cpu
pagevecs.

So there's no hard requirement to wait for completion, now not waiting
has obvious problems as well, but we could cheat and timeout after a few
jiffies or so.

This would avoid the DoS scenario, it will not improve the over-all
quality of the kernel though, since an unflushed pagevec can result in
compaction etc. failing.

The problem with stuffing all this in hardirq context (using
on_each_cpu() and friends) is that these people who do spin in fifo
threads generally don't like interrupt latencies forced on them either.

And I presume its currently scheduled is because its potentially quite
expensive to flush all these pages.

The only alternative I can come up with is scheduling the work like we
do now, wait for it for a few jiffies, track which CPUs completed,
cancel the others, and remote flush their pagevecs from the calling cpu.

But I can't say I like that option either...

As it stands I've always said that doing while(1) from FIFO/RR tasks is
broken and you get to keep the pieces. If we can find good solutions for
this I'm all ears, but I don't think its something we should bend over
backwards for.
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx101.postini.com [74.125.245.101])
    by kanga.kvack.org (Postfix) with SMTP id B2D306B0062
    for ; Wed, 27 Jun 2012 06:31:24 -0400 (EDT)
Message-ID: <1340793075.10063.24.camel@twins>
Subject: Re: needed lru_add_drain_all() change
From: Peter Zijlstra
Date: Wed, 27 Jun 2012 12:31:15 +0200
In-Reply-To: <20120626234603.779f5cbb.akpm@linux-foundation.org>
References: <20120626143703.396d6d66.akpm@linux-foundation.org>
    <4FEA59EE.8060804@kernel.org>
    <20120626181504.23b8b73d.akpm@linux-foundation.org>
    <4FEA6B5B.5000205@kernel.org>
    <20120626221217.1682572a.akpm@linux-foundation.org>
    <4FEA9D13.6070409@kernel.org>
    <20120626225544.068df1b9.akpm@linux-foundation.org>
    <4FEAA925.9020202@kernel.org>
    <20120626234603.779f5cbb.akpm@linux-foundation.org>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
Mime-Version: 1.0
Sender: owner-linux-mm@kvack.org
List-ID:
To: Andrew Morton
Cc: Minchan Kim, linux-mm@kvack.org, KOSAKI Motohiro

On Tue, 2012-06-26 at 23:46 -0700, Andrew Morton wrote:
> btw, the first step should be to audit all lru_add_drain_all() sites
> and work out exactly why they are calling lru_add_drain_all() - what
> are they trying to achieve?

# git grep lru_add_drain_all
fs/block_dev.c: lru_add_drain_all(); /* make sure all lru add caches are flushed */
include/linux/swap.h:extern int lru_add_drain_all(void);
mm/compaction.c: lru_add_drain_all();
mm/compaction.c: lru_add_drain_all();
mm/ksm.c: lru_add_drain_all();
mm/memcontrol.c: lru_add_drain_all();
mm/memcontrol.c: lru_add_drain_all();
mm/memcontrol.c: lru_add_drain_all();
mm/memory-failure.c: lru_add_drain_all();
mm/memory_hotplug.c: lru_add_drain_all();
mm/memory_hotplug.c: lru_add_drain_all();
mm/migrate.c: lru_add_drain_all();
mm/migrate.c: * here to avoid lru_add_drain_all().
mm/mlock.c: lru_add_drain_all(); /* flush pagevec */
mm/mlock.c: lru_add_drain_all(); /* flush pagevec */
mm/page_alloc.c: * For avoiding noise data, lru_add_drain_all() should be called
mm/page_alloc.c: lru_add_drain_all();
mm/swap.c:int lru_add_drain_all(void)

I haven't audited all sites, but most of them try to flush the per-cpu
lru pagevecs to make sure the pages are on the lru so they can take them
off again ;-)

Take compaction for instance, if a page in the middle of a range is on a
per-cpu pagevec it can't move it and the compaction might fail.

Hmm, another alternative is teaching isolate_lru_page() and friends to
take pages from the pagevecs directly, not sure what that would take.

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx188.postini.com [74.125.245.188])
    by kanga.kvack.org (Postfix) with SMTP id DF8B76B005A
    for ; Wed, 27 Jun 2012 08:04:28 -0400 (EDT)
Message-ID: <1340798663.10063.36.camel@twins>
Subject: Re: needed lru_add_drain_all() change
From: Peter Zijlstra
Date: Wed, 27 Jun 2012 14:04:23 +0200
In-Reply-To: <20120626143703.396d6d66.akpm@linux-foundation.org>
References: <20120626143703.396d6d66.akpm@linux-foundation.org>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
Mime-Version: 1.0
Sender: owner-linux-mm@kvack.org
List-ID:
To: Andrew Morton
Cc: linux-mm@kvack.org

On Tue, 2012-06-26 at 14:37 -0700, Andrew Morton wrote:
> lru_add_drain_all() uses schedule_on_each_cpu(). But
> schedule_on_each_cpu() hangs if a realtime thread is spinning, pinned
> to a CPU. There's no intention to change the scheduler behaviour, so I
> think we should remove schedule_on_each_cpu() from the kernel.
> Anything that uses a per-cpu workqueue and waits on work from another cpu is vulnerable too. This would include things like padata, crypto and possibly others. ksoftirqd is vulnerable too: if it were preempted while handling a softirq, all softirq handling would be out the window for that cpu. infiniband/hw/ehca would likely malfunction as well, since it has per-cpu threads. FIFO is dangerous, don't do stupid things :-) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx138.postini.com [74.125.245.138]) by kanga.kvack.org (Postfix) with SMTP id 56CCD6B005A for ; Thu, 28 Jun 2012 02:23:51 -0400 (EDT) Received: by ggm4 with SMTP id 4so1954446ggm.14 for ; Wed, 27 Jun 2012 23:23:50 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: <20120626143703.396d6d66.akpm@linux-foundation.org> References: <20120626143703.396d6d66.akpm@linux-foundation.org> From: KOSAKI Motohiro Date: Thu, 28 Jun 2012 02:23:30 -0400 Message-ID: Subject: Re: needed lru_add_drain_all() change Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: linux-mm@kvack.org On Tue, Jun 26, 2012 at 5:37 PM, Andrew Morton wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=43811 > > lru_add_drain_all() uses schedule_on_each_cpu(). But > schedule_on_each_cpu() hangs if a realtime thread is spinning, pinned > to a CPU. There's no intention to change the scheduler behaviour, so I > think we should remove schedule_on_each_cpu() from the kernel. > > The biggest user of schedule_on_each_cpu() is lru_add_drain_all(). > > Does anyone have any thoughts on how we can do this?
The obvious > approach is to declare these: > > static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS], lru_add_pvecs); > static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs); > static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs); > > to be irq-safe and use on_each_cpu(). lru_rotate_pvecs is already > irq-safe and converting lru_add_pvecs and lru_deactivate_pvecs looks > pretty simple. > > Thoughts? I agree. But I hope for more. These days we have plenty of lru_add_drain_all() call sites. So I think we should remove struct pagevec and aim for a new, migration-aware batching mechanism, maybe. This would also improve the compaction success rate. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx147.postini.com [74.125.245.147]) by kanga.kvack.org (Postfix) with SMTP id 9D96C6B005A for ; Thu, 28 Jun 2012 03:46:13 -0400 (EDT) Received: from m4.gw.fujitsu.co.jp (unknown [10.0.50.74]) by fgwmail5.fujitsu.co.jp (Postfix) with ESMTP id 3DF683EE0AE for ; Thu, 28 Jun 2012 16:46:12 +0900 (JST) Received: from smail (m4 [127.0.0.1]) by outgoing.m4.gw.fujitsu.co.jp (Postfix) with ESMTP id 252EB45DE54 for ; Thu, 28 Jun 2012 16:46:12 +0900 (JST) Received: from s4.gw.fujitsu.co.jp (s4.gw.fujitsu.co.jp [10.0.50.94]) by m4.gw.fujitsu.co.jp (Postfix) with ESMTP id ED62845DE50 for ; Thu, 28 Jun 2012 16:46:11 +0900 (JST) Received: from s4.gw.fujitsu.co.jp (localhost.localdomain [127.0.0.1]) by s4.gw.fujitsu.co.jp (Postfix) with ESMTP id E005CE08003 for ; Thu, 28 Jun 2012 16:46:11 +0900 (JST) Received: from ml13.s.css.fujitsu.com (ml13.s.css.fujitsu.com [10.240.81.133]) by s4.gw.fujitsu.co.jp (Postfix) with ESMTP id 997F01DB803E for ; Thu, 28 Jun 2012 16:46:11 +0900 (JST) Message-ID: <4FEC0B3F.7070108@jp.fujitsu.com> Date: Thu, 28 Jun 2012 16:43:59 +0900 From: Kamezawa
Hiroyuki MIME-Version: 1.0 Subject: Re: needed lru_add_drain_all() change References: <20120626143703.396d6d66.akpm@linux-foundation.org> In-Reply-To: <20120626143703.396d6d66.akpm@linux-foundation.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: linux-mm@kvack.org (2012/06/27 6:37), Andrew Morton wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=43811 > > lru_add_drain_all() uses schedule_on_each_cpu(). But > schedule_on_each_cpu() hangs if a realtime thread is spinning, pinned > to a CPU. There's no intention to change the scheduler behaviour, so I > think we should remove schedule_on_each_cpu() from the kernel. > > The biggest user of schedule_on_each_cpu() is lru_add_drain_all(). > > Does anyone have any thoughts on how we can do this? The obvious > approach is to declare these: > > static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS], lru_add_pvecs); > static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs); > static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs); > > to be irq-safe and use on_each_cpu(). lru_rotate_pvecs is already > irq-safe and converting lru_add_pvecs and lru_deactivate_pvecs looks > pretty simple. > > Thoughts? > How about this kind of RCU synchronization? == /* * Double buffered pagevec for quick drain. * The usual per-cpu-pvec users need to take rcu_read_lock() before accessing. * An external drainer of pvecs will replace the pvec vector and call synchronize_rcu(), * and drain all pages on the unused pvecs in turn. */ static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS * 2], lru_pvecs); atomic_t pvec_idx; /* must be placed onto some aligned address...*/ struct pagevec *my_pagevec(enum lru_list lru) { /* pvec_idx selects the first or second half of the array */ return &__get_cpu_var(lru_pvecs)[lru + NR_LRU_LISTS * atomic_read(&pvec_idx)]; } /* * percpu pagevec access should be surrounded by these calls.
 */ static inline void pagevec_start_access(void) { rcu_read_lock(); } static inline void pagevec_end_access(void) { rcu_read_unlock(); } /* * changing pagevec array vec 0 <-> 1 */ static void lru_pvec_update(void) { if (atomic_read(&pvec_idx)) atomic_set(&pvec_idx, 0); else atomic_set(&pvec_idx, 1); } /* * drain all LRUs on per-cpu pagevecs. */ static DEFINE_MUTEX(lru_add_drain_all_mutex); static void lru_add_drain_all(void) { mutex_lock(&lru_add_drain_all_mutex); lru_pvec_update(); synchronize_rcu(); /* wait until all pvec accessors have quit */ for_each_cpu(cpu) drain_pvec_of_the_cpu(cpu); mutex_unlock(&lru_add_drain_all_mutex); } == -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx143.postini.com [74.125.245.143]) by kanga.kvack.org (Postfix) with SMTP id F37AE6B005A for ; Thu, 28 Jun 2012 19:42:28 -0400 (EDT) Message-ID: <4FECEBF4.7010202@kernel.org> Date: Fri, 29 Jun 2012 08:42:44 +0900 From: Minchan Kim MIME-Version: 1.0 Subject: Re: needed lru_add_drain_all() change References: <20120626143703.396d6d66.akpm@linux-foundation.org> <4FEC0B3F.7070108@jp.fujitsu.com> In-Reply-To: <4FEC0B3F.7070108@jp.fujitsu.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Kamezawa Hiroyuki Cc: Andrew Morton , linux-mm@kvack.org On 06/28/2012 04:43 PM, Kamezawa Hiroyuki wrote: > (2012/06/27 6:37), Andrew Morton wrote: >> https://bugzilla.kernel.org/show_bug.cgi?id=43811 >> >> lru_add_drain_all() uses schedule_on_each_cpu(). But >> schedule_on_each_cpu() hangs if a realtime thread is spinning, pinned >> to a CPU. There's no intention to change the scheduler behaviour, so I >> think we should remove schedule_on_each_cpu() from the kernel. >> >> The biggest user of schedule_on_each_cpu() is lru_add_drain_all().
>> >> Does anyone have any thoughts on how we can do this? The obvious >> approach is to declare these: >> >> static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS], lru_add_pvecs); >> static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs); >> static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs); >> >> to be irq-safe and use on_each_cpu(). lru_rotate_pvecs is already >> irq-safe and converting lru_add_pvecs and lru_deactivate_pvecs looks >> pretty simple. >> >> Thoughts? >> > > How about this kind of RCU synchronization? > == > /* > * Double buffered pagevec for quick drain. > * The usual per-cpu-pvec users need to take rcu_read_lock() before > accessing. > * An external drainer of pvecs will replace the pvec vector and call > synchronize_rcu(), > * and drain all pages on the unused pvecs in turn. > */ > static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS * 2], lru_pvecs); > > atomic_t pvec_idx; /* must be placed onto some aligned address...*/ > > > struct pagevec *my_pagevec(enum lru_list lru) > { > /* pvec_idx selects the first or second half of the array */ > return &__get_cpu_var(lru_pvecs)[lru + NR_LRU_LISTS * atomic_read(&pvec_idx)]; > } > > /* > * percpu pagevec access should be surrounded by these calls. > */ > static inline void pagevec_start_access(void) > { > rcu_read_lock(); > } > > static inline void pagevec_end_access(void) > { > rcu_read_unlock(); > } > > > /* > * changing pagevec array vec 0 <-> 1 > */ > static void lru_pvec_update(void) > { > if (atomic_read(&pvec_idx)) > atomic_set(&pvec_idx, 0); > else > atomic_set(&pvec_idx, 1); > } > > /* > * drain all LRUs on per-cpu pagevecs. > */ > static DEFINE_MUTEX(lru_add_drain_all_mutex); > static void lru_add_drain_all(void) > { > mutex_lock(&lru_add_drain_all_mutex); > lru_pvec_update(); > synchronize_rcu(); /* wait until all pvec accessors have quit */ I don't know the RCU internals, but conceptually, as I understand it, synchronize_rcu() needs a context switch on every CPU. If that's even partly true, it could be a problem, too.
> for_each_cpu(cpu) > drain_pvec_of_the_cpu(cpu); > mutex_unlock(&lru_add_drain_all_mutex); > } > == -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx122.postini.com [74.125.245.122]) by kanga.kvack.org (Postfix) with SMTP id 9834E6B005A for ; Thu, 28 Jun 2012 23:26:49 -0400 (EDT) Received: from m4.gw.fujitsu.co.jp (unknown [10.0.50.74]) by fgwmail6.fujitsu.co.jp (Postfix) with ESMTP id 3103D3EE0B6 for ; Fri, 29 Jun 2012 12:26:48 +0900 (JST) Received: from smail (m4 [127.0.0.1]) by outgoing.m4.gw.fujitsu.co.jp (Postfix) with ESMTP id 15E0E45DE51 for ; Fri, 29 Jun 2012 12:26:48 +0900 (JST) Received: from s4.gw.fujitsu.co.jp (s4.gw.fujitsu.co.jp [10.0.50.94]) by m4.gw.fujitsu.co.jp (Postfix) with ESMTP id EFF0845DE4D for ; Fri, 29 Jun 2012 12:26:47 +0900 (JST) Received: from s4.gw.fujitsu.co.jp (localhost.localdomain [127.0.0.1]) by s4.gw.fujitsu.co.jp (Postfix) with ESMTP id E21B7E08001 for ; Fri, 29 Jun 2012 12:26:47 +0900 (JST) Received: from m1000.s.css.fujitsu.com (m1000.s.css.fujitsu.com [10.240.81.136]) by s4.gw.fujitsu.co.jp (Postfix) with ESMTP id 917D71DB8037 for ; Fri, 29 Jun 2012 12:26:47 +0900 (JST) Message-ID: <4FED1FF1.7000901@jp.fujitsu.com> Date: Fri, 29 Jun 2012 12:24:33 +0900 From: Kamezawa Hiroyuki MIME-Version: 1.0 Subject: Re: needed lru_add_drain_all() change References: <20120626143703.396d6d66.akpm@linux-foundation.org> <4FEC0B3F.7070108@jp.fujitsu.com> <4FECEBF4.7010202@kernel.org> In-Reply-To: <4FECEBF4.7010202@kernel.org> Content-Type: text/plain; charset=ISO-8859-1;
format=flowed Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Minchan Kim Cc: Andrew Morton , linux-mm@kvack.org (2012/06/29 8:42), Minchan Kim wrote: > On 06/28/2012 04:43 PM, Kamezawa Hiroyuki wrote: > >> (2012/06/27 6:37), Andrew Morton wrote: >>> https://bugzilla.kernel.org/show_bug.cgi?id=43811 >>> >>> lru_add_drain_all() uses schedule_on_each_cpu(). But >>> schedule_on_each_cpu() hangs if a realtime thread is spinning, pinned >>> to a CPU. There's no intention to change the scheduler behaviour, so I >>> think we should remove schedule_on_each_cpu() from the kernel. >>> >>> The biggest user of schedule_on_each_cpu() is lru_add_drain_all(). >>> >>> Does anyone have any thoughts on how we can do this? The obvious >>> approach is to declare these: >>> >>> static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS], lru_add_pvecs); >>> static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs); >>> static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs); >>> >>> to be irq-safe and use on_each_cpu(). lru_rotate_pvecs is already >>> irq-safe and converting lru_add_pvecs and lru_deactivate_pvecs looks >>> pretty simple. >>> >>> Thoughts? >>> >> >> How about this kind of RCU synchronization? >> == >> /* >> * Double buffered pagevec for quick drain. >> * The usual per-cpu-pvec users need to take rcu_read_lock() before >> accessing. >> * An external drainer of pvecs will replace the pvec vector and call >> synchronize_rcu(), >> * and drain all pages on the unused pvecs in turn. >> */ >> static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS * 2], lru_pvecs); >> >> atomic_t pvec_idx; /* must be placed onto some aligned address...*/ >> >> >> struct pagevec *my_pagevec(enum lru_list lru) >> { >> /* pvec_idx selects the first or second half of the array */ >> return &__get_cpu_var(lru_pvecs)[lru + NR_LRU_LISTS * atomic_read(&pvec_idx)]; >> } >> >> /* >> * percpu pagevec access should be surrounded by these calls.
>> */ >> static inline void pagevec_start_access(void) >> { >> rcu_read_lock(); >> } >> >> static inline void pagevec_end_access(void) >> { >> rcu_read_unlock(); >> } >> >> >> /* >> * changing pagevec array vec 0 <-> 1 >> */ >> static void lru_pvec_update(void) >> { >> if (atomic_read(&pvec_idx)) >> atomic_set(&pvec_idx, 0); >> else >> atomic_set(&pvec_idx, 1); >> } >> >> /* >> * drain all LRUs on per-cpu pagevecs. >> */ >> static DEFINE_MUTEX(lru_add_drain_all_mutex); >> static void lru_add_drain_all(void) >> { >> mutex_lock(&lru_add_drain_all_mutex); >> lru_pvec_update(); >> synchronize_rcu(); /* wait until all pvec accessors have quit */ > > > I don't know the RCU internals, but conceptually, as I understand it, synchronize_rcu() > needs a context switch on every CPU. If that's even partly true, it could be a problem, too. > Hmm, from Documentation/RCU/stallwarn.txt == o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel without invoking schedule(). o A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might happen to preempt a low-priority task in the middle of an RCU read-side critical section. This is especially damaging if that low-priority task is not permitted to run on any other CPU, in which case the next RCU grace period can never complete, which will eventually cause the system to run out of memory and hang. While the system is in the process of running itself out of memory, you might see stall-warning messages. o A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that is running at a higher priority than the RCU softirq threads. This will prevent RCU callbacks from ever being invoked, and in a CONFIG_TREE_PREEMPT_RCU kernel will further prevent RCU grace periods from ever completing. Either way, the system will eventually run out of memory and hang. In the CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning messages. == You're right. (The RCU stall warning seems to be shown every 60 secs by default.) I'm wondering whether we can do the sync without RCU...
== pvec_start_access(struct pagevec *pvec) { atomic_inc(&pvec->using); } pvec_end_access(struct pagevec *pvec) { atomic_dec(&pvec->using); } synchronize_pvec() { for_each_cpu(cpu) wait for pvec->using to be 0. } static void lru_add_drain_all() { mutex_lock(); lru_pvec_update(); // switch pvec synchronize_pvec(); // wait for all users to exit for_each_cpu() drain pages in pvec mutex_unlock() } == "disable_irq() + interrupt()" would be easier. What is the cost of IRQ-disable vs. atomic_inc() on a local variable... Regards, -Kame -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx172.postini.com [74.125.245.172]) by kanga.kvack.org (Postfix) with SMTP id 106B26B005A for ; Thu, 28 Jun 2012 23:49:45 -0400 (EDT) Received: from m4.gw.fujitsu.co.jp (unknown [10.0.50.74]) by fgwmail5.fujitsu.co.jp (Postfix) with ESMTP id 0FB6A3EE081 for ; Fri, 29 Jun 2012 12:49:43 +0900 (JST) Received: from smail (m4 [127.0.0.1]) by outgoing.m4.gw.fujitsu.co.jp (Postfix) with ESMTP id EC3A245DE58 for ; Fri, 29 Jun 2012 12:49:42 +0900 (JST) Received: from s4.gw.fujitsu.co.jp (s4.gw.fujitsu.co.jp [10.0.50.94]) by m4.gw.fujitsu.co.jp (Postfix) with ESMTP id D3E5F45DE57 for ; Fri, 29 Jun 2012 12:49:42 +0900 (JST) Received: from s4.gw.fujitsu.co.jp (localhost.localdomain [127.0.0.1]) by s4.gw.fujitsu.co.jp (Postfix) with ESMTP id C7A971DB8037 for ; Fri, 29 Jun 2012 12:49:42 +0900 (JST) Received: from m1000.s.css.fujitsu.com (m1000.s.css.fujitsu.com [10.240.81.136]) by s4.gw.fujitsu.co.jp (Postfix) with ESMTP id 844BE1DB802F for ; Fri, 29 Jun 2012 12:49:42 +0900 (JST) Message-ID: <4FED2554.6020601@jp.fujitsu.com> Date: Fri, 29 Jun 2012 12:47:32 +0900 From: Kamezawa Hiroyuki MIME-Version: 1.0 Subject: Re: needed lru_add_drain_all() change References:
<20120626143703.396d6d66.akpm@linux-foundation.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: KOSAKI Motohiro Cc: Andrew Morton , linux-mm@kvack.org (2012/06/28 15:23), KOSAKI Motohiro wrote: > On Tue, Jun 26, 2012 at 5:37 PM, Andrew Morton wrote: >> https://bugzilla.kernel.org/show_bug.cgi?id=43811 >> >> lru_add_drain_all() uses schedule_on_each_cpu(). But >> schedule_on_each_cpu() hangs if a realtime thread is spinning, pinned >> to a CPU. There's no intention to change the scheduler behaviour, so I >> think we should remove schedule_on_each_cpu() from the kernel. >> >> The biggest user of schedule_on_each_cpu() is lru_add_drain_all(). >> >> Does anyone have any thoughts on how we can do this? The obvious >> approach is to declare these: >> >> static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS], lru_add_pvecs); >> static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs); >> static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs); >> >> to be irq-safe and use on_each_cpu(). lru_rotate_pvecs is already >> irq-safe and converting lru_add_pvecs and lru_deactivate_pvecs looks >> pretty simple. >> >> Thoughts? > > I agree. > > But I hope for more. These days we have plenty of lru_add_drain_all() > call sites. So I think we should remove struct pagevec and aim for a new, > migration-aware batching mechanism, maybe. This would also improve the compaction success rate. > Does migration-aware mean a framework that isolate_xxxx_page() can work with? To do that, we need to know which object points to the page. Hmm. Do you have any idea? -Kame -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org
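Editorial sketch (not posted by anyone in this thread): the counter-based "sync without RCU" idea Kamezawa outlines earlier in the thread, with pvec_start_access()/pvec_end_access() and a buffer flip before draining, can be modelled in user space with C11 atomics roughly as follows. Everything here is an assumption made for illustration: the names struct batch, batch_add() and drain_all(), the NR_SLOTS stand-in for CPUs, the one-writer-per-slot rule (which mimics per-cpu access), and in particular the re-check of the active index after taking the counter, which the posted pseudocode does not have. It is not kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_SLOTS   4    /* hypothetical stand-in for the number of CPUs */
#define BATCH_SIZE 14   /* arbitrary batch capacity for this sketch */

struct batch {
    atomic_int using;       /* writers currently inside this buffer */
    size_t nr;              /* number of queued items */
    int items[BATCH_SIZE];  /* stand-in for struct page pointers */
};

/* Two buffers per slot; assumption: each slot has exactly one writer. */
static struct batch batches[NR_SLOTS][2];
static atomic_int active_idx;   /* buffer that writers currently use: 0 or 1 */

/* Pin the active buffer of a slot (the role of pvec_start_access()). */
static struct batch *batch_start_access(int slot)
{
    for (;;) {
        int idx = atomic_load(&active_idx);
        struct batch *b = &batches[slot][idx];

        atomic_fetch_add(&b->using, 1);
        /* If the drainer flipped the index meanwhile, back off and retry. */
        if (atomic_load(&active_idx) == idx)
            return b;
        atomic_fetch_sub(&b->using, 1);
    }
}

/* Release the buffer (the role of pvec_end_access()). */
static void batch_end_access(struct batch *b)
{
    atomic_fetch_sub(&b->using, 1);
}

/* Queue one item on a slot; returns 0 on success, -1 if the batch is full. */
static int batch_add(int slot, int item)
{
    struct batch *b = batch_start_access(slot);
    int ret = -1;

    if (b->nr < BATCH_SIZE) {
        b->items[b->nr++] = item;
        ret = 0;
    }
    batch_end_access(b);
    return ret;
}

/*
 * Flip the active buffer, wait until no writer is inside the old buffers,
 * then empty them. Returns the number of items drained. Callers must
 * serialize drains themselves (the thread uses a mutex for this).
 */
static size_t drain_all(void)
{
    int old = atomic_load(&active_idx);
    size_t drained = 0;

    atomic_store(&active_idx, old ^ 1);
    for (int slot = 0; slot < NR_SLOTS; slot++) {
        struct batch *b = &batches[slot][old];

        while (atomic_load(&b->using) != 0)
            ;   /* busy-wait until the writer has left the old buffer */
        drained += b->nr;
        b->nr = 0;
    }
    return drained;
}
```

Note that the drainer still has to wait for any writer that is inside an old buffer, so this only helps if writers hold the counter very briefly and cannot be preempted while holding it; a writer starved by a spinning realtime thread would stall the drain just as in the cases discussed above.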