From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx148.postini.com [74.125.245.148])
    by kanga.kvack.org (Postfix) with SMTP id 71E4F6B0006;
    Thu, 7 Mar 2013 21:08:04 -0500 (EST)
Received: by mail-ve0-f179.google.com with SMTP id da11so890336veb.24;
    Thu, 07 Mar 2013 18:08:03 -0800 (PST)
MIME-Version: 1.0
From: Raymond Jennings
Date: Thu, 7 Mar 2013 18:07:23 -0800
Subject: Swap defragging
Content-Type: text/plain; charset=UTF-8
Sender: owner-linux-mm@kvack.org
To: Linux Memory Management List

Just a two cent question, but is there any merit to having the kernel
defragment swap space? That is, keeping a process's swapped-out pages
together?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx120.postini.com [74.125.245.120])
    by kanga.kvack.org (Postfix) with SMTP id 9B1936B0006;
    Thu, 7 Mar 2013 21:35:16 -0500 (EST)
Date: Thu, 7 Mar 2013 21:35:11 -0500
From: Johannes Weiner
Subject: Re: Swap defragging
Message-ID: <20130308023511.GD23767@cmpxchg.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-linux-mm@kvack.org
To: Raymond Jennings
Cc: Linux Memory Management List

On Thu, Mar 07, 2013 at 06:07:23PM -0800, Raymond Jennings wrote:
> Just a two cent question, but is there any merit to having the kernel
> defragment swap space?

That is a good question.

Swap does fragment quite a bit, and there are several reasons for
that.

We swap pages in our LRU list order, but this list is sorted by first
access, not by access frequency (not quite that cookie cutter, but the
ordering is certainly fairly coarse). This means that the pages may
already be in suboptimal order for swap in at the time of swap out.
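The ordering mismatch described above can be illustrated with a toy model. This is only a sketch, not kernel code: the page names, the rule that pages get consecutive slots in swap-out order, and the "sum of slot jumps" cost metric are all assumptions made for the example.

```python
def assign_slots(swap_out_order):
    """Give each page the next free slot, in the order pages are swapped out."""
    return {page: slot for slot, page in enumerate(swap_out_order)}


def seek_distance(slots, swap_in_order):
    """Sum of absolute slot jumps when pages are read back in the given order."""
    positions = [slots[page] for page in swap_in_order]
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))


# Hypothetical workload: pages A..F were first touched in this order,
# so a coarse first-access LRU evicts them in this order too.
lru_order = ["A", "B", "C", "D", "E", "F"]
slots = assign_slots(lru_order)

# The application later needs them back in a different order.
reference_order = ["A", "D", "B", "E", "C", "F"]

sequential_cost = seek_distance(slots, lru_order)       # read back in disk order
actual_cost = seek_distance(slots, reference_order)     # read back as referenced
print(sequential_cost, actual_cost)
```

Reading the pages back in their on-disk order gives the smallest possible cost in this model; any other reference order can only match or exceed it, which is the sense in which the layout is already suboptimal at swap-out time.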
Once written to disk, the layout tends to stick. One reason is that
we actually try to not free swap slots unless there is a shortage of
swap space to save future swap out IO (grep for vm_swap_full()). The
other reason is that if a page shared among multiple threads is
swapped out, it can not be removed from swap until all threads have
faulted the page back in because of page table entries still referring
to the swap slot on disk. In a multi-threaded application, this is
rather unlikely.

So even though the referencing order of the application might change,
the disk layout won't. But adjusting the disk layout speculatively
increases disk IO, so it could be hard to prove that you came up with
a net improvement.

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx174.postini.com [74.125.245.174])
    by kanga.kvack.org (Postfix) with SMTP id DABE86B0006;
    Thu, 7 Mar 2013 22:01:53 -0500 (EST)
Received: by mail-ve0-f171.google.com with SMTP id b10so943365vea.30;
    Thu, 07 Mar 2013 19:01:52 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <20130308023511.GD23767@cmpxchg.org>
References: <20130308023511.GD23767@cmpxchg.org>
From: Raymond Jennings
Date: Thu, 7 Mar 2013 19:01:12 -0800
Subject: Re: Swap defragging
Content-Type: text/plain; charset=UTF-8
Sender: owner-linux-mm@kvack.org
To: Johannes Weiner
Cc: Linux Memory Management List

Not to mention that swapped pages get freed when modified in RAM IIRC.

On Thu, Mar 7, 2013 at 6:35 PM, Johannes Weiner wrote:
> On Thu, Mar 07, 2013 at 06:07:23PM -0800, Raymond Jennings wrote:
>> Just a two cent question, but is there any merit to having the kernel
>> defragment swap space?
>
> That is a good question.
>
> Swap does fragment quite a bit, and there are several reasons for
> that.
From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx196.postini.com [74.125.245.196])
    by kanga.kvack.org (Postfix) with SMTP id 16B266B0005;
    Fri, 8 Mar 2013 21:00:44 -0500 (EST)
Received: by mail-da0-f43.google.com with SMTP id u36so251857dak.30;
    Fri, 08 Mar 2013 18:00:43 -0800 (PST)
Message-ID: <513A97C5.7020008@gmail.com>
Date: Sat, 09 Mar 2013 10:00:37 +0800
From: Will Huck
MIME-Version: 1.0
Subject: Re: Swap defragging
References: <20130308023511.GD23767@cmpxchg.org>
In-Reply-To: <20130308023511.GD23767@cmpxchg.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
To: Johannes Weiner
Cc: Raymond Jennings, Linux Memory Management List

Hi Johannes,
On 03/08/2013 10:35 AM, Johannes Weiner wrote:
> On Thu, Mar 07, 2013 at 06:07:23PM -0800, Raymond Jennings wrote:
>> Just a two cent question, but is there any merit to having the kernel
>> defragment swap space?
> That is a good question.
>
> Swap does fragment quite a bit, and there are several reasons for
> that.

Are there any tools to test and monitor the swap and page reclaim
subsystems?

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx201.postini.com [74.125.245.201])
    by kanga.kvack.org (Postfix) with SMTP id 4FA666B0005;
    Sun, 10 Mar 2013 22:54:40 -0400 (EDT)
Received: by mail-pb0-f43.google.com with SMTP id md12so3222263pbc.16;
    Sun, 10 Mar 2013 19:54:39 -0700 (PDT)
Message-ID: <513D4768.8050703@gmail.com>
Date: Mon, 11 Mar 2013 10:54:32 +0800
From: Ric Mason
MIME-Version: 1.0
Subject: Re: Swap defragging
References: <20130308023511.GD23767@cmpxchg.org>
In-Reply-To: <20130308023511.GD23767@cmpxchg.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
To: Johannes Weiner
Cc: Raymond Jennings, Linux Memory Management List

Hi Johannes,
On 03/08/2013 10:35 AM, Johannes Weiner wrote:
> On Thu, Mar 07, 2013 at 06:07:23PM -0800, Raymond Jennings wrote:
>> Just a two cent question, but is there any merit to having the kernel
>> defragment swap space?
> That is a good question.
>
> Swap does fragment quite a bit, and there are several reasons for
> that.
>
> We swap pages in our LRU list order, but this list is sorted by first
> access, not by access frequency (not quite that cookie cutter, but the
> ordering is certainly fairly coarse). This means that the pages may
> already be in suboptimal order for swap in at the time of swap out.
>
> Once written to disk, the layout tends to stick. One reason is that
> we actually try to not free swap slots unless there is a shortage of

Will all the swap slots be freed on swapoff?

> swap space to save future swap out IO (grep for vm_swap_full()).
From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx146.postini.com [74.125.245.146])
    by kanga.kvack.org (Postfix) with SMTP id 891346B0005;
    Sun, 10 Mar 2013 23:11:34 -0400 (EDT)
Received: by mail-pb0-f45.google.com with SMTP id ro8so3202962pbb.18;
    Sun, 10 Mar 2013 20:11:33 -0700 (PDT)
Message-ID: <513D4B5E.6050601@gmail.com>
Date: Mon, 11 Mar 2013 11:11:26 +0800
From: Simon Jeons
MIME-Version: 1.0
Subject: Re: Swap defragging
References: <20130308023511.GD23767@cmpxchg.org>
In-Reply-To: <20130308023511.GD23767@cmpxchg.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
To: Johannes Weiner
Cc: Raymond Jennings, Linux Memory Management List

Hi Johannes,
On 03/08/2013 10:35 AM, Johannes Weiner wrote:
> On Thu, Mar 07, 2013 at 06:07:23PM -0800, Raymond Jennings wrote:
>> Just a two cent question, but is there any merit to having the kernel
>> defragment swap space?
> That is a good question.

One question here:

The comments of setup_swap_extents:
An ordered list of swap extents is built at swapon time and is then
used at swap_writepage/swap_readpage time for locating where on disk
a page belongs.
But I didn't see any handling of swap extents in
swap_writepage/swap_readpage, why?

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx197.postini.com [74.125.245.197])
    by kanga.kvack.org (Postfix) with SMTP id B40836B0005;
    Sun, 10 Mar 2013 23:16:34 -0400 (EDT)
Received: by mail-da0-f47.google.com with SMTP id s35so113407dak.6;
    Sun, 10 Mar 2013 20:16:34 -0700 (PDT)
Message-ID: <513D4C8D.6080106@gmail.com>
Date: Mon, 11 Mar 2013 11:16:29 +0800
From: Jaegeuk Hanse
MIME-Version: 1.0
Subject: Re: Swap defragging
References: <20130308023511.GD23767@cmpxchg.org>
In-Reply-To: <20130308023511.GD23767@cmpxchg.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
To: Johannes Weiner
Cc: Raymond Jennings, Linux Memory Management List

Hi Johannes,
On 03/08/2013 10:35 AM, Johannes Weiner wrote:
> On Thu, Mar 07, 2013 at 06:07:23PM -0800, Raymond Jennings wrote:
>> Just a two cent question, but is there any merit to having the kernel
>> defragment swap space?
> That is a good question.
>
> Swap does fragment quite a bit, and there are several reasons for
> that.
>
> We swap pages in our LRU list order, but this list is sorted by first
> access, not by access frequency (not quite that cookie cutter, but the
> ordering is certainly fairly coarse). This means that the pages may
> already be in suboptimal order for swap in at the time of swap out.
>
> Once written to disk, the layout tends to stick. One reason is that
> we actually try to not free swap slots unless there is a shortage of
> swap space to save future swap out IO (grep for vm_swap_full()).

An anonymous page gets swapped out when it is dirty, that is, when its
contents no longer match the data stored in the swap area. So how can
keeping the slot avoid future swap out IO?
From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx119.postini.com [74.125.245.119])
    by kanga.kvack.org (Postfix) with SMTP id 9F7A86B0005;
    Mon, 11 Mar 2013 02:24:38 -0400 (EDT)
Received: by mail-ie0-f171.google.com with SMTP id 10so4371750ied.30;
    Sun, 10 Mar 2013 23:24:37 -0700 (PDT)
Message-ID: <513D789F.3050408@gmail.com>
Date: Mon, 11 Mar 2013 14:24:31 +0800
From: Will Huck
MIME-Version: 1.0
Subject: Re: Swap defragging
References: <20130308023511.GD23767@cmpxchg.org>
In-Reply-To: <20130308023511.GD23767@cmpxchg.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
To: Johannes Weiner
Cc: Raymond Jennings, Linux Memory Management List

Hi Johannes,
On 03/08/2013 10:35 AM, Johannes Weiner wrote:
> On Thu, Mar 07, 2013 at 06:07:23PM -0800, Raymond Jennings wrote:
>> Just a two cent question, but is there any merit to having the kernel
>> defragment swap space?
> That is a good question.
>
> Swap does fragment quite a bit, and there are several reasons for
> that.
>
> We swap pages in our LRU list order, but this list is sorted by first
> access, not by access frequency (not quite that cookie cutter, but the
> ordering is certainly fairly coarse). This means that the pages may
> already be in suboptimal order for swap in at the time of swap out.
>
> Once written to disk, the layout tends to stick. One reason is that
> we actually try to not free swap slots unless there is a shortage of
> swap space to save future swap out IO (grep for vm_swap_full()).

My concern is that scan_swap_map won't free a swap slot that is lower
than p->lowest_bit or higher than p->highest_bit.

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx158.postini.com [74.125.245.158])
    by kanga.kvack.org (Postfix) with SMTP id 6C4F06B0006;
    Tue, 12 Mar 2013 12:52:52 -0400 (EDT)
Date: Tue, 12 Mar 2013 12:52:47 -0400
From: Johannes Weiner
Subject: Re: Swap defragging
Message-ID: <20130312165247.GB1953@cmpxchg.org>
References: <20130308023511.GD23767@cmpxchg.org> <513A97C5.7020008@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <513A97C5.7020008@gmail.com>
Sender: owner-linux-mm@kvack.org
To: Will Huck
Cc: Raymond Jennings, Linux Memory Management List

On Sat, Mar 09, 2013 at 10:00:37AM +0800, Will Huck wrote:
> Hi Johannes,
> On 03/08/2013 10:35 AM, Johannes Weiner wrote:
> >On Thu, Mar 07, 2013 at 06:07:23PM -0800, Raymond Jennings wrote:
> >>Just a two cent question, but is there any merit to having the kernel
> >>defragment swap space?
> >That is a good question.
> >
> >Swap does fragment quite a bit, and there are several reasons for
> >that.
>
> Are there any tools to test and monitor the swap and page reclaim
> subsystems?

seekwatcher is great to see the IO patterns. Anything that uses
anonymous memory can test swap: a Java job, multiplying matrices,
kernel builds etc.
I mostly log /proc/vmstat by taking snapshots at a regular interval
during the workload, then plot and visually correlate the
swapin/swapout counters with the individual LRU sizes, page fault
rate, what have you, to get a feeling for what it's doing.

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx183.postini.com [74.125.245.183])
    by kanga.kvack.org (Postfix) with SMTP id 08DE66B0036;
    Tue, 12 Mar 2013 12:53:51 -0400 (EDT)
Date: Tue, 12 Mar 2013 12:53:47 -0400
From: Johannes Weiner
Subject: Re: Swap defragging
Message-ID: <20130312165347.GC1953@cmpxchg.org>
References: <20130308023511.GD23767@cmpxchg.org> <513D4768.8050703@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <513D4768.8050703@gmail.com>
Sender: owner-linux-mm@kvack.org
To: Ric Mason
Cc: Raymond Jennings, Linux Memory Management List

On Mon, Mar 11, 2013 at 10:54:32AM +0800, Ric Mason wrote:
> Hi Johannes,
> On 03/08/2013 10:35 AM, Johannes Weiner wrote:
> >On Thu, Mar 07, 2013 at 06:07:23PM -0800, Raymond Jennings wrote:
> >>Just a two cent question, but is there any merit to having the kernel
> >>defragment swap space?
> >That is a good question.
> >
> >Swap does fragment quite a bit, and there are several reasons for
> >that.
> >
> >We swap pages in our LRU list order, but this list is sorted by first
> >access, not by access frequency (not quite that cookie cutter, but the
> >ordering is certainly fairly coarse). This means that the pages may
> >already be in suboptimal order for swap in at the time of swap out.
> >
> >Once written to disk, the layout tends to stick.
> >One reason is that
> >we actually try to not free swap slots unless there is a shortage of
>
> Will all the swap slots be freed on swapoff?

Well, yeah, but that's hardly an interesting case, is it?

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx155.postini.com [74.125.245.155])
    by kanga.kvack.org (Postfix) with SMTP id 824246B0037;
    Tue, 12 Mar 2013 13:03:45 -0400 (EDT)
Date: Tue, 12 Mar 2013 13:03:39 -0400
From: Johannes Weiner
Subject: Re: Swap defragging
Message-ID: <20130312170339.GD1953@cmpxchg.org>
References: <20130308023511.GD23767@cmpxchg.org> <513D4B5E.6050601@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <513D4B5E.6050601@gmail.com>
Sender: owner-linux-mm@kvack.org
To: Simon Jeons
Cc: Raymond Jennings, Linux Memory Management List

On Mon, Mar 11, 2013 at 11:11:26AM +0800, Simon Jeons wrote:
> Hi Johannes,
> On 03/08/2013 10:35 AM, Johannes Weiner wrote:
> >On Thu, Mar 07, 2013 at 06:07:23PM -0800, Raymond Jennings wrote:
> >>Just a two cent question, but is there any merit to having the kernel
> >>defragment swap space?
> >That is a good question.
>
> One question here:
>
> The comments of setup_swap_extents:
> An ordered list of swap extents is built at swapon time and is then
> used at swap_writepage/swap_readpage time for locating where on disk
> a page belongs.
> But I didn't see any handling of swap extents in
> swap_writepage/swap_readpage, why?

This is not the right place for such questions. If you are interested
in how the kernel works, buy a book, read the code, consult the kernel
newbies project if you get stuck.

Also, that comment and the answer to your question are within a few
lines of each other in that same code file. Make an effort.
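For readers unfamiliar with the idea in the quoted comment, here is a rough sketch of what "an ordered list of swap extents" used for "locating where on disk a page belongs" can look like. It is a toy model and not the kernel's data structures or call path: the `SwapExtent` fields, the `locate` helper, the example extents, and the one-block-per-page assumption are all made up for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class SwapExtent:
    start_page: int   # first swap page offset covered by this extent
    nr_pages: int     # number of contiguous pages in the extent
    start_block: int  # disk block where the extent begins (1 block per page assumed)


def locate(extents, page_offset):
    """Return the disk block for a swap page offset.

    `extents` must be sorted by start_page and must not overlap.
    """
    starts = [e.start_page for e in extents]
    i = bisect_right(starts, page_offset) - 1  # last extent starting at or before offset
    if i < 0:
        raise ValueError("offset before first extent")
    ext = extents[i]
    if page_offset >= ext.start_page + ext.nr_pages:
        raise ValueError("offset not covered by any extent")
    return ext.start_block + (page_offset - ext.start_page)


# Hypothetical swap file made of three discontiguous runs on disk.
extents = [
    SwapExtent(start_page=0, nr_pages=4, start_block=1000),
    SwapExtent(start_page=4, nr_pages=8, start_block=5000),
    SwapExtent(start_page=12, nr_pages=4, start_block=2000),
]

print(locate(extents, 0), locate(extents, 5), locate(extents, 13))
```

The point of the sketch is only that consecutive swap offsets need not be consecutive on disk once the backing file is split into several extents.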
From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx111.postini.com [74.125.245.111])
    by kanga.kvack.org (Postfix) with SMTP id 119916B0037;
    Tue, 12 Mar 2013 13:08:51 -0400 (EDT)
Date: Tue, 12 Mar 2013 13:08:47 -0400
From: Johannes Weiner
Subject: Re: Swap defragging
Message-ID: <20130312170847.GE1953@cmpxchg.org>
References: <20130308023511.GD23767@cmpxchg.org> <513D4C8D.6080106@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <513D4C8D.6080106@gmail.com>
Sender: owner-linux-mm@kvack.org
To: Jaegeuk Hanse
Cc: Raymond Jennings, Linux Memory Management List

On Mon, Mar 11, 2013 at 11:16:29AM +0800, Jaegeuk Hanse wrote:
> Hi Johannes,
> On 03/08/2013 10:35 AM, Johannes Weiner wrote:
> >On Thu, Mar 07, 2013 at 06:07:23PM -0800, Raymond Jennings wrote:
> >>Just a two cent question, but is there any merit to having the kernel
> >>defragment swap space?
> >That is a good question.
> >
> >Swap does fragment quite a bit, and there are several reasons for
> >that.
> >
> >We swap pages in our LRU list order, but this list is sorted by first
> >access, not by access frequency (not quite that cookie cutter, but the
> >ordering is certainly fairly coarse). This means that the pages may
> >already be in suboptimal order for swap in at the time of swap out.
> >
> >Once written to disk, the layout tends to stick. One reason is that
> >we actually try to not free swap slots unless there is a shortage of
> >swap space to save future swap out IO (grep for vm_swap_full()). The
>
> An anonymous page gets swapped out when it is dirty, that is, when its
> contents no longer match the data stored in the swap area. So how can
> keeping the slot avoid future swap out IO?
Modified pages get written out freshly, but in a multi-threaded
application, the original page stays put until all threads have
modified it or faulted it back in.

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx202.postini.com [74.125.245.202])
    by kanga.kvack.org (Postfix) with SMTP id 2E5DB6B0006;
    Tue, 12 Mar 2013 20:46:36 -0400 (EDT)
Received: by mail-pb0-f49.google.com with SMTP id xa12so426368pbc.36;
    Tue, 12 Mar 2013 17:46:35 -0700 (PDT)
Message-ID: <513FCC66.20200@gmail.com>
Date: Wed, 13 Mar 2013 08:46:30 +0800
From: Will Huck
MIME-Version: 1.0
Subject: Re: Swap defragging
References: <20130308023511.GD23767@cmpxchg.org> <513A97C5.7020008@gmail.com> <20130312165247.GB1953@cmpxchg.org>
In-Reply-To: <20130312165247.GB1953@cmpxchg.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
To: Johannes Weiner
Cc: Raymond Jennings, Linux Memory Management List

Hi Johannes,
On 03/13/2013 12:52 AM, Johannes Weiner wrote:
> On Sat, Mar 09, 2013 at 10:00:37AM +0800, Will Huck wrote:
>> Hi Johannes,
>> On 03/08/2013 10:35 AM, Johannes Weiner wrote:
>>> On Thu, Mar 07, 2013 at 06:07:23PM -0800, Raymond Jennings wrote:
>>>> Just a two cent question, but is there any merit to having the kernel
>>>> defragment swap space?
>>> That is a good question.
>>>
>>> Swap does fragment quite a bit, and there are several reasons for
>>> that.
>> Are there any tools to test and monitor swap subsystem and page
>> reclaim subsystem?

One off-topic question: active:inactive => 1:1 for file pages, but
active:inactive => inactive_ratio for anonymous pages. Why is this
different?

> seekwatcher is great to see the IO patterns.

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx141.postini.com [74.125.245.141])
    by kanga.kvack.org (Postfix) with SMTP id 2F1DE6B0006;
    Tue, 12 Mar 2013 21:31:36 -0400 (EDT)
Received: by mail-ob0-f170.google.com with SMTP id wc20so518714obb.29;
    Tue, 12 Mar 2013 18:31:35 -0700 (PDT)
Message-ID: <513FD6F2.5060804@gmail.com>
Date: Wed, 13 Mar 2013 09:31:30 +0800
From: Will Huck
MIME-Version: 1.0
Subject: Re: Swap defragging
References: <20130308023511.GD23767@cmpxchg.org> <513A97C5.7020008@gmail.com> <20130312165247.GB1953@cmpxchg.org>
In-Reply-To: <20130312165247.GB1953@cmpxchg.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
To: Johannes Weiner
Cc: Raymond Jennings, Linux Memory Management List

Hi Johannes,
On 03/13/2013 12:52 AM, Johannes Weiner wrote:
> On Sat, Mar 09, 2013 at 10:00:37AM +0800, Will Huck wrote:
>> Hi Johannes,
>> On 03/08/2013 10:35 AM, Johannes Weiner wrote:
>>> On Thu, Mar 07, 2013 at 06:07:23PM -0800, Raymond Jennings wrote:
>>>> Just a two cent question, but is there any merit to having the kernel
>>>> defragment swap space?
>>> That is a good question.
>>>
>>> Swap does fragment quite a bit, and there are several reasons for
>>> that.
>> Are there any tools to test and monitor swap subsystem and page
>> reclaim subsystem?
pgscan_kswapd_dma 0
pgscan_kswapd_dma32 0
pgscan_kswapd_normal 0
pgscan_kswapd_movable 0
pgscan_direct_dma 0
pgscan_direct_dma32 0
pgscan_direct_normal 0
pgscan_direct_movable 0
pgscan_direct_throttle 0
zone_reclaim_failed 0
pginodesteal 0
slabs_scanned 328704

The slab cache is scanned but file-backed/swap-backed pages are not
scanned, why?

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from psmtp.com (na3sys010amx109.postini.com [74.125.245.109])
    by kanga.kvack.org (Postfix) with SMTP id 052A76B0002;
    Tue, 12 Mar 2013 23:47:22 -0400 (EDT)
Received: by mail-da0-f53.google.com with SMTP id n34so219774dal.12;
    Tue, 12 Mar 2013 20:47:22 -0700 (PDT)
Message-ID: <513FF6C4.1030708@gmail.com>
Date: Wed, 13 Mar 2013 11:47:16 +0800
From: Jaegeuk Hanse
MIME-Version: 1.0
Subject: Re: Swap defragging
References: <20130308023511.GD23767@cmpxchg.org> <513D4C8D.6080106@gmail.com> <20130312170847.GE1953@cmpxchg.org>
In-Reply-To: <20130312170847.GE1953@cmpxchg.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
To: Johannes Weiner
Cc: Raymond Jennings, Linux Memory Management List

Hi Johannes,
On 03/13/2013 01:08 AM, Johannes Weiner wrote:
> On Mon, Mar 11, 2013 at 11:16:29AM +0800, Jaegeuk Hanse wrote:
>> Hi Johannes,
>> On 03/08/2013 10:35 AM, Johannes Weiner wrote:
>>> On Thu, Mar 07, 2013 at 06:07:23PM -0800, Raymond Jennings wrote:
>>>> Just a two cent question, but is there any merit to having the kernel
>>>> defragment swap space?
>>> That is a good question.
>>>
>>> Swap does fragment quite a bit, and there are several reasons for
>>> that.
>>>
>>> We swap pages in our LRU list order, but this list is sorted by first
>>> access, not by access frequency (not quite that cookie cutter, but the
>>> ordering is certainly fairly coarse). This means that the pages may
>>> already be in suboptimal order for swap in at the time of swap out.
>>>
>>> Once written to disk, the layout tends to stick. One reason is that
>>> we actually try to not free swap slots unless there is a shortage of
>>> swap space to save future swap out IO (grep for vm_swap_full()). The
>> An anonymous page gets swapped out when it is dirty, that is, when its
>> contents no longer match the data stored in the swap area. So how can
>> keeping the slot avoid future swap out IO?
> Modified pages get written out freshly, but in a multi-threaded
> application, the original page stays put until all threads have
> modified it or faulted it back in.

Sorry, that didn't resolve my confusion. That seems to be your second
reason for why the disk layout tends to stick, but what confuses me is
your first reason. You said that we actually try to not free swap
slots unless there is a shortage of swap space, in order to save
future swap out IO. Why does that help? Anonymous pages are swapped
out because they are dirty, so how can keeping the slot avoid the swap
out IO?
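One way to picture the first reason is with a toy model. This is a sketch under stated assumptions, not kernel code: the `Page` and `ToySwap` classes, the `writes_for` helper, and the write counter are invented for illustration, and the "more than half of swap in use" test is an assumed stand-in for whatever vm_swap_full() checks. The idea it models: a page that is faulted back in and only read still matches its copy in the swap slot, so if the slot was kept, reclaiming that page again needs no write; if the slot was freed at swap-in, or the page was modified, it has to be written out again.

```python
from dataclasses import dataclass


@dataclass
class Page:
    has_slot: bool = False         # a swap slot is reserved for this page
    disk_copy_valid: bool = False  # the slot's contents match the page in RAM


class ToySwap:
    def __init__(self, total_slots):
        self.total_slots = total_slots
        self.free_slots = total_slots
        self.writes = 0  # swap-out IO issued so far

    def vm_swap_full(self):
        # Assumed heuristic: swap counts as "full" once over half is in use.
        return self.free_slots * 2 < self.total_slots

    def swap_out(self, page):
        """Reclaim the page, writing it only if the disk copy is missing or stale."""
        if not page.has_slot:
            self.free_slots -= 1
            page.has_slot = True
        if not page.disk_copy_valid:
            self.writes += 1
            page.disk_copy_valid = True

    def swap_in(self, page, modified=False):
        """Fault the page back in; the slot is kept unless swap is full."""
        if modified:
            page.disk_copy_valid = False
        if self.vm_swap_full():
            self.free_slots += 1
            page.has_slot = False
            page.disk_copy_valid = False


def writes_for(total_slots, cycles, modified):
    """Count swap-out writes for one page cycled out and in `cycles` times."""
    swap, page = ToySwap(total_slots), Page()
    for _ in range(cycles):
        swap.swap_out(page)
        swap.swap_in(page, modified=modified)
    return swap.writes


read_only_plenty = writes_for(total_slots=8, cycles=3, modified=False)
modified_plenty = writes_for(total_slots=8, cycles=3, modified=True)
read_only_tight = writes_for(total_slots=1, cycles=3, modified=False)
print(read_only_plenty, modified_plenty, read_only_tight)
```

In this model the saving only shows up for pages that are read but not written between two swap-outs; modified pages are written out freshly either way, which matches the reply quoted above.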