From: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>, linux-mm@kvack.org
Subject: Re: needed lru_add_drain_all() change
Date: Fri, 29 Jun 2012 12:24:33 +0900 [thread overview]
Message-ID: <4FED1FF1.7000901@jp.fujitsu.com> (raw)
In-Reply-To: <4FECEBF4.7010202@kernel.org>
(2012/06/29 8:42), Minchan Kim wrote:
> On 06/28/2012 04:43 PM, Kamezawa Hiroyuki wrote:
>
>> (2012/06/27 6:37), Andrew Morton wrote:
>>> https://bugzilla.kernel.org/show_bug.cgi?id=43811
>>>
>>> lru_add_drain_all() uses schedule_on_each_cpu(). But
>>> schedule_on_each_cpu() hangs if a realtime thread is spinning, pinned
>>> to a CPU. There's no intention to change the scheduler behaviour, so I
>>> think we should remove schedule_on_each_cpu() from the kernel.
>>>
>>> The biggest user of schedule_on_each_cpu() is lru_add_drain_all().
>>>
>>> Does anyone have any thoughts on how we can do this? The obvious
>>> approach is to declare these:
>>>
>>> static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS], lru_add_pvecs);
>>> static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
>>> static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);
>>>
>>> to be irq-safe and use on_each_cpu(). lru_rotate_pvecs is already
>>> irq-safe and converting lru_add_pvecs and lru_deactivate_pvecs looks
>>> pretty simple.
>>>
>>> Thoughts?
>>>
>>
>> How about this kind of RCU synchronization ?
>> ==
>> /*
>>  * Double-buffered pagevecs for quick drain.
>>  * The usual per-cpu pvec users need to take rcu_read_lock() before
>>  * accessing. An external drainer of pvecs will replace the pvec vector,
>>  * call synchronize_rcu(), and then drain all pages on the unused pvecs
>>  * in turn.
>>  */
>> static DEFINE_PER_CPU(struct pagevec[NR_LRU_LISTS * 2], lru_pvecs);
>>
>> atomic_t pvec_idx; /* must be placed at some aligned address... */
>>
>>
>> struct pagevec *my_pagevec(enum lru_list lru)
>> {
>> 	return &__get_cpu_var(lru_pvecs)[lru + atomic_read(&pvec_idx) * NR_LRU_LISTS];
>> }
>>
>> /*
>> * percpu pagevec access should be surrounded by these calls.
>> */
>> static inline void pagevec_start_access()
>> {
>> rcu_read_lock();
>> }
>>
>> static inline void pagevec_end_access()
>> {
>> rcu_read_unlock();
>> }
>>
>>
>> /*
>> * changing pagevec array vec 0 <-> 1
>> */
>> static void lru_pvec_update()
>> {
>> if (atomic_read(&pvec_idx))
>> atomic_set(&pvec_idx, 0);
>> else
>> atomic_set(&pvec_idx, 1);
>> }
>>
>> /*
>> * drain all LRUS on per-cpu pagevecs.
>> */
>> DEFINE_MUTEX(lru_add_drain_all_mutex);
>> static void lru_add_drain_all()
>> {
>> 	mutex_lock(&lru_add_drain_all_mutex);
>> 	lru_pvec_update();
>> 	synchronize_rcu(); /* wait until all accessors of the old pvecs have finished */
>
>
> I don't know the RCU internals, but conceptually I understand that
> synchronize_rcu() needs a context switch on every CPU. If that's even
> partly true, it could be a problem, too.
>
Hmm, from Documentation/RCU/stallwarn.txt:
==
o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
without invoking schedule().
o A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
happen to preempt a low-priority task in the middle of an RCU
read-side critical section. This is especially damaging if
that low-priority task is not permitted to run on any other CPU,
in which case the next RCU grace period can never complete, which
will eventually cause the system to run out of memory and hang.
While the system is in the process of running itself out of
memory, you might see stall-warning messages.
o A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
is running at a higher priority than the RCU softirq threads.
This will prevent RCU callbacks from ever being invoked,
and in a CONFIG_TREE_PREEMPT_RCU kernel will further prevent
RCU grace periods from ever completing. Either way, the
system will eventually run out of memory and hang. In the
CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning
messages.
==
you're right. (The RCU stall warning seems to be printed every 60 seconds
by default.) I'm wondering whether we can do the synchronization without RCU...
==
pvec_start_access(struct pagevec *pvec)
{
atomic_inc(&pvec->using);
}
pvec_end_access(struct pagevec *pvec)
{
atomic_dec(&pvec->using);
}
synchronize_pvec()
{
for_each_cpu(cpu)
wait for pvec->using to be 0.
}
static void lru_add_drain_all()
{
mutex_lock();
lru_pvec_update(); //switch pvec
synchronize_pvec(); // wait for all user exits
for_each_cpu()
drain pages in pvec
mutex_unlock()
}
==
"disable_irq() + interrupt()" would be easier.
What is the cost of disabling IRQs vs. an atomic_inc() on a local variable...
Regards,
-Kame