linux-mm.kvack.org archive mirror
* [LSF/MM TOPIC] Reclaim in the face of really fast I/O
@ 2015-01-15 21:00 Dave Hansen
  2015-01-16  1:21 ` [Lsf-pc] " Sasha Levin
  0 siblings, 1 reply; 3+ messages in thread
From: Dave Hansen @ 2015-01-15 21:00 UTC (permalink / raw)
  To: Matthew Wilcox, lsf-pc, Reddy, Dheeraj; +Cc: Kleen, Andi, Chen, Tim C, Linux-MM

I/O devices are only getting faster.  In fact, they're getting closer
and closer to memory in both latency and bandwidth.  But the VM is still
designed around very orderly and costly procedures for reclaiming memory,
and the existing algorithms don't parallelize particularly well.  They hit
contention on mmap_sem or the LRU locks well before all of the CPU
horsepower we have can be brought to bear on reclaim.
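
To make the serialization concrete, here is a toy userspace model (not
kernel code; the lru_lock name below only echoes the kernel's) in which
several reclaimer threads pull pages off one shared list behind a single
lock:

/* toy model of single-lock reclaim: gcc -O2 -pthread lru_model.c */
#include <pthread.h>
#include <stdio.h>

#define NPAGES   (1 << 20)
#define NTHREADS 8

static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;
static long lru_remaining = NPAGES;	/* stands in for the inactive LRU list */

static void *reclaimer(void *arg)
{
	long reclaimed = 0;

	(void)arg;
	for (;;) {
		pthread_mutex_lock(&lru_lock);	/* the single point of serialization */
		if (lru_remaining == 0) {
			pthread_mutex_unlock(&lru_lock);
			break;
		}
		lru_remaining--;		/* "isolate" one page */
		pthread_mutex_unlock(&lru_lock);
		reclaimed++;			/* real reclaim would unmap/write back here */
	}
	return (void *)reclaimed;
}

int main(void)
{
	pthread_t tid[NTHREADS];
	long total = 0;
	int i;

	for (i = 0; i < NTHREADS; i++)
		pthread_create(&tid[i], NULL, reclaimer, NULL);
	for (i = 0; i < NTHREADS; i++) {
		void *ret;

		pthread_join(tid[i], &ret);
		total += (long)ret;
	}
	printf("reclaimed %ld pages with %d threads\n", total, NTHREADS);
	return 0;
}

With a critical section this small, wall-clock time stays roughly flat
(or gets worse) as NTHREADS grows, which is the same shape reclaim takes
when many direct reclaimers pile up on the real LRU lock.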

Once the latency to bring pages in and out of storage becomes low
enough, reclaiming the _right_ pages becomes much less important than
doing something useful with the CPU horsepower that we have.
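
Rough, illustrative numbers (order-of-magnitude assumptions, not
measurements from the work mentioned below) for the cost of evicting the
"wrong" page, i.e. one extra refault from the device:

  rotating disk:     ~5-10 ms per refault
  NVMe-class flash:  ~20-100 us per refault

That is roughly a hundredfold drop in the penalty for an imperfect
eviction choice, while the CPU spent carefully selecting victims under
contended locks stays the same or grows with machine size.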

We need to talk about ways to do reclaim with lower CPU overhead and to
parallelize more effectively.

There has been some research in this area by some folks at Intel and we
could quickly summarize what has been learned so far to help kick off a
discussion.


* Re: [Lsf-pc] [LSF/MM TOPIC] Reclaim in the face of really fast I/O
  2015-01-15 21:00 [LSF/MM TOPIC] Reclaim in the face of really fast I/O Dave Hansen
@ 2015-01-16  1:21 ` Sasha Levin
  2015-01-16  1:32   ` Sasha Levin
  0 siblings, 1 reply; 3+ messages in thread
From: Sasha Levin @ 2015-01-16  1:21 UTC (permalink / raw)
  To: Dave Hansen, Matthew Wilcox, lsf-pc, Reddy, Dheeraj
  Cc: Kleen, Andi, Linux-MM, Chen, Tim C

On 01/15/2015 04:00 PM, Dave Hansen wrote:
> I/O devices are only getting faster.  In fact, they're getting closer
> and closer to memory in both latency and bandwidth.  But the VM is still
> designed around very orderly and costly procedures for reclaiming memory,
> and the existing algorithms don't parallelize particularly well.  They hit
> contention on mmap_sem or the LRU locks well before all of the CPU
> horsepower we have can be brought to bear on reclaim.
> 
> Once the latency to bring pages in and out of storage becomes low
> enough, reclaiming the _right_ pages becomes much less important than
> doing something useful with the CPU horsepower that we have.
> 
> We need to talk about ways to do reclaim with lower CPU overhead and to
> parallelize more effectively.
> 
> There has been some research in this area by some folks at Intel and we
> could quickly summarize what has been learned so far to help kick off a
> discussion.

I was actually planning to bring that up. Trinity can put enough stress
on a system that the hang watchdog triggers (with a 10-minute timeout!)
inside reclaim code.


Thanks,
Sasha


* Re: [Lsf-pc] [LSF/MM TOPIC] Reclaim in the face of really fast I/O
  2015-01-16  1:21 ` [Lsf-pc] " Sasha Levin
@ 2015-01-16  1:32   ` Sasha Levin
  0 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2015-01-16  1:32 UTC (permalink / raw)
  To: Dave Hansen, Matthew Wilcox, lsf-pc, Reddy, Dheeraj
  Cc: Kleen, Andi, Linux-MM, Chen, Tim C

On 01/15/2015 08:21 PM, Sasha Levin wrote:
> On 01/15/2015 04:00 PM, Dave Hansen wrote:
>> I/O devices are only getting faster.  In fact, they're getting closer
>> and closer to memory in both latency and bandwidth.  But the VM is still
>> designed around very orderly and costly procedures for reclaiming memory,
>> and the existing algorithms don't parallelize particularly well.  They hit
>> contention on mmap_sem or the LRU locks well before all of the CPU
>> horsepower we have can be brought to bear on reclaim.
>>
>> Once the latency to bring pages in and out of storage becomes low
>> enough, reclaiming the _right_ pages becomes much less important than
>> doing something useful with the CPU horsepower that we have.
>>
>> We need to talk about ways to do reclaim with lower CPU overhead and to
>> parallelize more effectively.
>>
>> There has been some research in this area by some folks at Intel and we
>> could quickly summarize what has been learned so far to help kick off a
>> discussion.
> 
> I was actually planning to bring that up. Trinity can put enough stress
> on a system that the hang watchdog triggers (with a 10-minute timeout!)
> inside reclaim code.

Something like this:

[ 5412.398971] trinity-c23     R  running task    10768 29063  30295 0x10000006
[ 5412.400111]  ffff88046127b178 0000000000000ec3 0000000000000000 0000000000000000
[ 5412.401404]  ffffe8fff263cd6e 0000000000000000 0000000000000000 0000000000000000
[ 5412.402695]  0000000000000000 ffff88076bfff4c8 0000000000001099 0000000000000000
[ 5412.404084] Call Trace:
[ 5412.404495]  [<ffffffff916124f2>] ? _raw_spin_unlock_irqrestore+0xa2/0xf0
[ 5412.405611]  [<ffffffff915fbe25>] schedule+0x55/0x280
[ 5412.406435]  [<ffffffff81903e72>] throttle_direct_reclaim+0x432/0x660
[ 5412.407530]  [<ffffffff81552160>] ? __init_waitqueue_head+0xe0/0xe0
[ 5412.408565]  [<ffffffff81916007>] try_to_free_pages+0xf7/0x460
[ 5412.409528]  [<ffffffff818e25e1>] __alloc_pages_nodemask+0xc01/0x1940
[ 5412.410589]  [<ffffffff81a05646>] alloc_pages_vma+0x216/0x6f0
[ 5412.411572]  [<ffffffff8191a62a>] ? shmem_alloc_page+0x9a/0x170
[ 5412.412556]  [<ffffffff8191a62a>] shmem_alloc_page+0x9a/0x170
[ 5412.413507]  [<ffffffff818bc241>] ? find_get_entry+0x191/0x2c0
[ 5412.414475]  [<ffffffff818bc0b5>] ? find_get_entry+0x5/0x2c0
[ 5412.415513]  [<ffffffff818bd02b>] ? find_lock_entry+0x2b/0x140
[ 5412.416474]  [<ffffffff81924f16>] shmem_getpage_gfp+0xde6/0x1710
[ 5412.417461]  [<ffffffff81926eb1>] shmem_fault+0x1a1/0x7d0
[ 5412.418353]  [<ffffffff819745dd>] __do_fault+0xad/0x2a0
[ 5412.419285]  [<ffffffff8197e6c1>] handle_mm_fault+0x1331/0x5440
[ 5412.420121]  [<ffffffff812f59d3>] __do_page_fault+0x2d3/0xfb0
[ 5412.420877]  [<ffffffff81579bc7>] ? mark_held_locks+0x117/0x2b0
[ 5412.421688]  [<ffffffff81571fcd>] ? trace_hardirqs_off+0xd/0x10
[ 5412.422491]  [<ffffffff812f6828>] trace_do_page_fault+0xc8/0x420
[ 5412.423348]  [<ffffffff812da7d3>] do_async_page_fault+0x83/0x120
[ 5412.424175]  [<ffffffff91614d68>] async_page_fault+0x28/0x30
[ 5412.424920]  [<ffffffff81960e5e>] ? iov_iter_fault_in_readable+0x17e/0x280
[ 5412.425851]  [<ffffffff814a1045>] ? ___might_sleep+0x2a5/0x420
[ 5412.426645]  [<ffffffff818baa49>] generic_perform_write+0x179/0x5b0
[ 5412.427573]  [<ffffffff81b2b267>] ? __mnt_drop_write+0x57/0xa0
[ 5412.428365]  [<ffffffff818c23fc>] __generic_file_write_iter+0x59c/0x13e0
[ 5412.429292]  [<ffffffff81aa1c7c>] ? rw_copy_check_uvector+0x5c/0x470
[ 5412.430153]  [<ffffffff81a8f964>] ? kasan_poison_shadow+0x34/0x40
[ 5412.431062]  [<ffffffff818c3319>] generic_file_write_iter+0xd9/0x510
[ 5412.431892]  [<ffffffff81a9bec0>] ? new_sync_read+0x220/0x220
[ 5412.432675]  [<ffffffff81a9c17d>] do_iter_readv_writev+0x9d/0x190
[ 5412.433504]  [<ffffffff81aa22c9>] do_readv_writev+0x239/0xe10
[ 5412.434294]  [<ffffffff818c3240>] ? __generic_file_write_iter+0x13e0/0x13e0
[ 5412.435300]  [<ffffffff8174cb7e>] ? acct_account_cputime+0x6e/0xa0
[ 5412.435942]  [<ffffffff818c3240>] ? __generic_file_write_iter+0x13e0/0x13e0
[ 5412.436678]  [<ffffffff818b5fa7>] ? context_tracking_user_exit+0xc7/0x330
[ 5412.437395]  [<ffffffff8157a281>] ? trace_hardirqs_on_caller+0x521/0x850
[ 5412.438097]  [<ffffffff81aa3033>] vfs_writev+0x93/0x100
[ 5412.438632]  [<ffffffff81aa39ba>] SyS_pwritev+0x11a/0x200
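
For context on where that task is parked: throttle_direct_reclaim() puts
allocating tasks to sleep while a node is short on memory so that kswapd
can catch up.  Roughly, as a paraphrased sketch from memory of the
mm/vmscan.c of that era (not verbatim, error paths omitted):

	/*
	 * Allocations that may not call into the FS sleep for at most a
	 * second; everything else sleeps, killable but otherwise
	 * unbounded, until kswapd restores the pfmemalloc watermark.
	 */
	if (!(gfp_mask & __GFP_FS))
		wait_event_interruptible_timeout(pgdat->pfmemalloc_wait,
			pfmemalloc_watermark_ok(pgdat), HZ);
	else
		wait_event_killable(pgdat->pfmemalloc_wait,
			pfmemalloc_watermark_ok(pgdat));

The unbounded case is consistent with a task sitting in schedule() under
throttle_direct_reclaim() for long enough to upset a 10-minute watchdog.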


Thanks,
Sasha

