* Fwd: (un)loadable module support for zcache
[not found] <CABv5NL-SquBQH8W+K1CXNBQQWqHyYO+p3Y9sPqsbfZKp5EafTg@mail.gmail.com>
@ 2012-03-05  0:46 ` Ilendir
  2012-03-05 16:57 ` Dan Magenheimer
  1 sibling, 0 replies; 6+ messages in thread
From: Ilendir @ 2012-03-05  0:46 UTC (permalink / raw)
  To: linux-mm; +Cc: ngupta

While experimenting with zcache on various systems, we discovered what seems to be a different impact on CPU and power consumption, varying from system to system and from workload to workload. While there has been some research effort on the effect of online memory compression on power consumption [1], the trade-off, for example when using SSDs or on mobile platforms (e.g. Android), still remains unclear. It would therefore be desirable to improve the possibilities for studying these effects, using zcache as an example. But zcache is missing an important feature: dynamic disabling and enabling. This is a big obstacle to further analysis.

Since we have to do some freely chosen work on a Linux-related topic during an internship at the University of Erlangen, we'd like to implement this feature.

Moreover, if we achieve our goal, the way to an unloadable zcache module isn't far off. Once that is accomplished, one of the blockers to getting zcache out of the staging tree is gone.

Any advice is appreciated.

Florian Schmaus
Stefan Hengelein
Andor Daam

[1] http://ziyang.eecs.umich.edu/~dickrp/publications/yang-crames-tecs.pdf

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
* RE: (un)loadable module support for zcache
[not found] <CABv5NL-SquBQH8W+K1CXNBQQWqHyYO+p3Y9sPqsbfZKp5EafTg@mail.gmail.com>
  2012-03-05  0:46 ` Fwd: (un)loadable module support for zcache Ilendir
@ 2012-03-05 16:57 ` Dan Magenheimer
  2012-03-08 14:36 ` Florian Schmaus
  1 sibling, 1 reply; 6+ messages in thread
From: Dan Magenheimer @ 2012-03-05 16:57 UTC (permalink / raw)
  To: Ilendir, linux-mm
  Cc: sjenning, Konrad Wilk, fschmaus, Andor Daam, i4passt, devel, Nitin Gupta

> From: Ilendir [mailto:ilendir@googlemail.com]
> Subject: (un)loadable module support for zcache
>
> While experimenting with zcache on various systems, we discovered what
> seems to be a different impact on CPU and power consumption, varying
> from system to system and from workload to workload. While there has
> been some research effort on the effect of online memory compression
> on power consumption [1], the trade-off, for example when using SSDs
> or on mobile platforms (e.g. Android), still remains unclear. It would
> therefore be desirable to improve the possibilities for studying these
> effects, using zcache as an example. But zcache is missing an important
> feature: dynamic disabling and enabling. This is a big obstacle to
> further analysis.
> Since we have to do some freely chosen work on a Linux-related topic
> during an internship at the University of Erlangen, we'd like to
> implement this feature.
>
> Moreover, if we achieve our goal, the way to an unloadable zcache
> module isn't far off. Once that is accomplished, one of the blockers
> to getting zcache out of the staging tree is gone.
>
> Any advice is appreciated.
>
> Florian Schmaus
> Stefan Hengelein
> Andor Daam

Hi Florian, Stefan, and Andor --

Thanks for your interest in zcache development! I see you've sent your original email separately to different lists, so I will try to combine them into one cc list now; hopefully there will be one thread.
Your idea of studying power consumption tradeoffs is interesting, and the work to allow zcache to be installed as a module will also be very useful.

I have given some thought to what would be necessary to allow zcache (or Xen tmem, or RAMster) to be insmod'ed and rmmod'ed. There are two main technical difficulties that I see. There may be more, but let's start with these two.

First, the "tmem frontend" code in cleancache and frontswap assumes that a "tmem backend" (such as zcache, Xen tmem, or RAMster) has already registered when filesystems are mounted (for cleancache) and when swapon is run (for frontswap). If no tmem backend has yet registered when the mount (or swapon) is invoked, then cleancache_enabled (or frontswap_enabled) has not been set to 1, the corresponding init_fs/init routine has not been called, and no tmem "pool" gets created. Then if zcache later registers with cleancache (or frontswap), it is too late... there are no mounts or swapons to trigger the calls that create the tmem pools. As a result, all gets and puts and flushes will fail, and zcache does not work.

I think the answer here is for cleancache (and frontswap) to support "lazy pool creation". If a backend has not yet registered when an init_fs/init call is made, cleancache (or frontswap) must record the attempt and generate a valid "fake poolid" to return. Any call to put/get/flush with a fake poolid is ignored, as the zcache module is not yet loaded. Later, when zcache is insmod'ed, it will attempt to register, and cleancache must then call the init_fs/init routines (to "lazily" create the pools), obtain a "real poolid" from zcache for each pool, and "map" the fake poolid to the real poolid on EVERY get/put/flush and on pool destroy (umount/swapoff). I think all changes for this will be in mm/cleancache.c and mm/frontswap.c... the backend does not need to know anything about it.
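[Editorial illustration: the lazy-pool-creation scheme described above can be sketched in userspace C roughly as follows. All names (pool_init, backend_register, pool_put, FAKE_BASE) are invented stand-ins for the real cleancache hooks, not the actual kernel API.]

```c
#include <assert.h>
#include <stdio.h>

#define MAX_POOLS 16
#define FAKE_BASE 1000          /* fake ids start high so they never collide */

static int fake_to_real[MAX_POOLS];  /* -1 = backend not yet registered */
static int nr_pools;
static int backend_loaded;
static int next_real_id;

/* Called at mount/swapon time; always succeeds, possibly with a fake id.
 * (Bounds checking against MAX_POOLS omitted for brevity.) */
static int pool_init(void)
{
    int fake = FAKE_BASE + nr_pools;

    fake_to_real[nr_pools++] = backend_loaded ? next_real_id++ : -1;
    return fake;
}

/* Called when the backend module is insmod'ed: lazily create the real
 * pools for every mount/swapon that happened before the module loaded. */
static void backend_register(void)
{
    backend_loaded = 1;
    for (int i = 0; i < nr_pools; i++)
        if (fake_to_real[i] < 0)
            fake_to_real[i] = next_real_id++;
}

/* Every get/put/flush maps fake -> real; a -1 mapping means "ignore". */
static int pool_put(int fake_id, int data)
{
    int real = fake_to_real[fake_id - FAKE_BASE];

    (void)data;                 /* payload unused in this sketch */
    if (real < 0)
        return -1;              /* backend absent: silently drop the put */
    return 0;                   /* would forward to the real pool here */
}
```

In this sketch a mount performed before insmod gets a fake poolid, puts against it are harmlessly dropped, and registration later fills in the real id, so the frontend never has to fail a mount.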
This implementation will not be hard, but there may be a few corner cases that you will need to ensure are correct, and of course you will need to ensure that any coding changes follow proper Linux coding style.

Second issue: when zcache gets rmmod'ed, there is an issue of coherency. You need to ensure that if zcache goes through insmod -> rmmod -> insmod, no stale data remains in any tmem pool. If any stale data remains, a "get" of the old data may result in data corruption.

The problem is that there may be millions of pages in cleancache, and flushing those pages may take a very long time. The user will not want to wait that long. And for frontswap, frontswap_shrink must be called, and since every page in frontswap contains real user data, you must ensure that all pages get decompressed and moved out of frontswap, either into physical RAM or to a physical swap disk. (See frontswap_shrink in frontswap.c and frontswap_selfshrink in the RAMster code.) This may take a very VERY long time.

So rmmod cannot complete until all the data in cleancache is freed and all the data in frontswap is repatriated to RAM or swap disk. I don't have an easy answer for this one. It may be possible to have "zombie" lists of partially destroyed pages and a kernel thread that (after rmmod completes) walks the list and frees or frontswap_shrinks the pages. I will leave this to you to solve... it is likely the hardest problem in making zcache work as a module. If you can't get it to work, it would still be useful to be able to "insmod" zcache, even if "rmmod" is not possible.

Thanks,
Dan
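[Editorial illustration: the "zombie list" idea above can be sketched in userspace C. The point is that the unload path is O(1) — the whole pool is handed to a list that a worker drains afterwards — so a fresh insmod never sees stale data. Names (zpage, module_unload, zombie_drain) are invented for this sketch, not kernel API.]

```c
#include <assert.h>
#include <stdlib.h>

struct zpage {
    int data;
    struct zpage *next;
};

static struct zpage *pool_head;     /* pages owned by the loaded module */
static struct zpage *zombie_head;   /* stale pages awaiting deferred free */

/* Normal operation: compressed pages accumulate in the pool. */
static void pool_put(int data)
{
    struct zpage *p = malloc(sizeof(*p));

    p->data = data;
    p->next = pool_head;
    pool_head = p;
}

/* rmmod path: O(1) — just hand the whole pool to the zombie list,
 * instead of blocking rmmod while millions of pages are freed. */
static void module_unload(void)
{
    zombie_head = pool_head;
    pool_head = NULL;           /* a later insmod starts with an empty pool */
}

/* Deferred worker: frees zombies after rmmod has already returned.
 * Returns the number of pages freed. */
static int zombie_drain(void)
{
    int freed = 0;

    while (zombie_head) {
        struct zpage *p = zombie_head;

        zombie_head = p->next;
        free(p);
        freed++;
    }
    return freed;
}
```

For frontswap the drain step would have to repatriate each page to RAM or swap rather than simply free it, which is why Dan flags it as the hard part; this sketch only captures the cleancache (discardable) case.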
* Re: (un)loadable module support for zcache
  2012-03-05 16:57 ` Dan Magenheimer
@ 2012-03-08 14:36 ` Florian Schmaus
  2012-03-08 15:52 ` Dan Magenheimer
  0 siblings, 1 reply; 6+ messages in thread
From: Florian Schmaus @ 2012-03-08 14:36 UTC (permalink / raw)
  To: Dan Magenheimer, linux-mm
  Cc: Stefan Hengelein, sjenning, Konrad Wilk, Andor Daam, i4passt, devel, Nitin Gupta

On 03/05/12 17:57, Dan Magenheimer wrote:
> I think the answer here is for cleancache (and frontswap) to
> support "lazy pool creation". If a backend has not yet
> registered when an init_fs/init call is made, cleancache
> (or frontswap) must record the attempt and generate a valid
> "fake poolid" to return. Any call to put/get/flush with
> a fake poolid is ignored, as the zcache module is not
> yet loaded. Later, when zcache is insmod'ed, it will attempt
> to register, and cleancache must then call the init_fs/init
> routines (to "lazily" create the pools), obtain a "real poolid"
> from zcache for each pool, and "map" the fake poolid to the real
> poolid on EVERY get/put/flush and on pool destroy (umount/swapoff).

We were thinking about how to make cleancache and frontswap able to cope with the mounting of filesystems and the running of swapon when no backend is registered, without adding the indirection of a fake pool id map.

We figured a way to deal with this in cleancache would be to store the struct super_block pointer in an array on every call to init_fs, and the uuids and struct super_block pointers in separate arrays on every call to init_shared_fs. When a filesystem unmounts before a backend is registered, its entries in the respective arrays are removed. While no backend is registered, put_page() and invalidate_page() are ignored and get_page() fails. As soon as a backend registers, the init_fs and init_shared_fs functions are called for the struct super_block pointers (and uuids) stored in the corresponding arrays.
For frontswap we are aiming for a similar approach: remembering the type on every call to init, failing put_page(), and ignoring get_page() and invalidate_page(). Again, when a backend registers, init is called for every type stored.

This should allow backends to register with cleancache and frontswap even after filesystems are mounted and/or swapon is run. It should therefore allow zcache to be insmodded. This would be a first step towards allowing rmmodding of zcache as well.

Is this approach feasible?

Stefan, Florian, and Andor
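[Editorial illustration: the array-based replay Florian proposes — remember each super_block while no backend is registered, replay init_fs for the saved entries when one registers, and drop an entry on unmount — can be sketched in userspace C. The function names mirror the real cleancache hooks, but the bodies are invented stand-ins, not kernel code; the shared-fs/uuid array and frontswap side are omitted.]

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FS 8

struct super_block { int dummy; };  /* stand-in for the kernel struct */

static struct super_block *pending[MAX_FS];  /* mounts awaiting a backend */
static int backend_registered;
static int pools_created;

/* Mount path: create the pool now if possible, otherwise remember the sb. */
static void cleancache_init_fs(struct super_block *sb)
{
    if (backend_registered) {
        pools_created++;        /* backend present: create the pool now */
        return;
    }
    for (int i = 0; i < MAX_FS; i++)
        if (!pending[i]) {
            pending[i] = sb;
            return;
        }
}

/* Unmount path: drop the entry so a stale pointer is never replayed. */
static void cleancache_invalidate_fs(struct super_block *sb)
{
    for (int i = 0; i < MAX_FS; i++)
        if (pending[i] == sb)
            pending[i] = NULL;
}

/* Backend registration: replay init_fs for every surviving entry. */
static void backend_register(void)
{
    backend_registered = 1;
    for (int i = 0; i < MAX_FS; i++)
        if (pending[i]) {
            pools_created++;    /* lazy init_fs replay */
            pending[i] = NULL;
        }
}
```

Note that the sketch only stays safe if cleancache_invalidate_fs() is reliably called on every unmount — exactly the concern Dan raises in his reply.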
* RE: (un)loadable module support for zcache
  2012-03-08 14:36 ` Florian Schmaus
@ 2012-03-08 15:52 ` Dan Magenheimer
  2012-03-08 16:51 ` Andor Daam
  0 siblings, 1 reply; 6+ messages in thread
From: Dan Magenheimer @ 2012-03-08 15:52 UTC (permalink / raw)
  To: Florian Schmaus, linux-mm
  Cc: Stefan Hengelein, sjenning, Konrad Wilk, Andor Daam, i4passt, devel, Nitin Gupta

> From: Florian Schmaus [mailto:fschmaus@gmail.com]
> Subject: Re: (un)loadable module support for zcache
>
> On 03/05/12 17:57, Dan Magenheimer wrote:
> > I think the answer here is for cleancache (and frontswap) to
> > support "lazy pool creation". If a backend has not yet
> > registered when an init_fs/init call is made, cleancache
> > (or frontswap) must record the attempt and generate a valid
> > "fake poolid" to return. Any call to put/get/flush with
> > a fake poolid is ignored, as the zcache module is not
> > yet loaded. Later, when zcache is insmod'ed, it will attempt
> > to register, and cleancache must then call the init_fs/init
> > routines (to "lazily" create the pools), obtain a "real poolid"
> > from zcache for each pool, and "map" the fake poolid to the real
> > poolid on EVERY get/put/flush and on pool destroy (umount/swapoff).
>
> We were thinking about how to make cleancache and frontswap able to cope
> with the mounting of filesystems and the running of swapon when no
> backend is registered, without adding the indirection of a fake pool
> id map.
>
> We figured a way to deal with this in cleancache would be to store the
> struct super_block pointer in an array on every call to init_fs, and
> the uuids and struct super_block pointers in separate arrays on every
> call to init_shared_fs. When a filesystem unmounts before a backend is
> registered, its entries in the respective arrays are removed.
> While no backend is registered, put_page() and invalidate_page() are
> ignored and get_page() fails.
> As soon as a backend registers, the init_fs
> and init_shared_fs functions are called for the struct super_block
> pointers (and uuids) stored in the corresponding arrays.
>
> For frontswap we are aiming for a similar approach: remembering the
> type on every call to init, failing put_page(), and ignoring
> get_page() and invalidate_page().
> Again, when a backend registers, init is called for every type stored.
>
> This should allow backends to register with cleancache and frontswap
> even after filesystems are mounted and/or swapon is run. It should
> therefore allow zcache to be insmodded. This would be a first step
> towards allowing rmmodding of zcache as well.
>
> Is this approach feasible?

Hi Stefan, Florian, and Andor --

I do see a potential problem with this approach. You would be saving a superblock pointer and then using it later. What if the filesystem was unmounted in the meantime? Or, worse, what if it was unmounted and the address of the superblock was then reused to point to some completely different object?

I think if you ensure that cleancache_invalidate_fs() is always called when a cleancache-enabled filesystem is unmounted, and in cleancache_invalidate_fs() you remove the matching superblock pointer from your arrays, then it should work.

Dan
* Re: (un)loadable module support for zcache
  2012-03-08 15:52 ` Dan Magenheimer
@ 2012-03-08 16:51 ` Andor Daam
  2012-03-08 17:07 ` Dan Magenheimer
  0 siblings, 1 reply; 6+ messages in thread
From: Andor Daam @ 2012-03-08 16:51 UTC (permalink / raw)
  To: Dan Magenheimer
  Cc: Florian Schmaus, linux-mm, Stefan Hengelein, sjenning, Konrad Wilk, i4passt, devel, Nitin Gupta

2012/3/8 Dan Magenheimer <dan.magenheimer@oracle.com>
>
> > From: Florian Schmaus [mailto:fschmaus@gmail.com]
> > Subject: Re: (un)loadable module support for zcache
> >
> > On 03/05/12 17:57, Dan Magenheimer wrote:
> > > I think the answer here is for cleancache (and frontswap) to
> > > support "lazy pool creation". If a backend has not yet
> > > registered when an init_fs/init call is made, cleancache
> > > (or frontswap) must record the attempt and generate a valid
> > > "fake poolid" to return. Any call to put/get/flush with
> > > a fake poolid is ignored, as the zcache module is not
> > > yet loaded. Later, when zcache is insmod'ed, it will attempt
> > > to register, and cleancache must then call the init_fs/init
> > > routines (to "lazily" create the pools), obtain a "real poolid"
> > > from zcache for each pool, and "map" the fake poolid to the real
> > > poolid on EVERY get/put/flush and on pool destroy (umount/swapoff).
> >
> > We were thinking about how to make cleancache and frontswap able to cope
> > with the mounting of filesystems and the running of swapon when no
> > backend is registered, without adding the indirection of a fake pool
> > id map.
> >
> > We figured a way to deal with this in cleancache would be to store the
> > struct super_block pointer in an array on every call to init_fs, and
> > the uuids and struct super_block pointers in separate arrays on every
> > call to init_shared_fs. When a filesystem unmounts before a backend is
> > registered, its entries in the respective arrays are removed.
> > While no backend is registered, put_page() and invalidate_page() are
> > ignored and get_page() fails. As soon as a backend registers, the init_fs
> > and init_shared_fs functions are called for the struct super_block
> > pointers (and uuids) stored in the corresponding arrays.
> >
> > For frontswap we are aiming for a similar approach: remembering the
> > type on every call to init, failing put_page(), and ignoring
> > get_page() and invalidate_page().
> > Again, when a backend registers, init is called for every type stored.
> >
> > This should allow backends to register with cleancache and frontswap
> > even after filesystems are mounted and/or swapon is run. It should
> > therefore allow zcache to be insmodded. This would be a first step
> > towards allowing rmmodding of zcache as well.
> >
> > Is this approach feasible?
>
> Hi Stefan, Florian, and Andor --
>
> I do see a potential problem with this approach. You would
> be saving a superblock pointer and then using it later. What
> if the filesystem was unmounted in the meantime? Or, worse,
> what if it was unmounted and the address of the superblock
> was then reused to point to some completely different object?
>
> I think if you ensure that cleancache_invalidate_fs() is always
> called when a cleancache-enabled filesystem is unmounted,
> and in cleancache_invalidate_fs() you remove the matching
> superblock pointer from your arrays, then it should work.
>
> Dan

We already thought of removing the matching pointer whenever a filesystem is unmounted. As the comment on __cleancache_invalidate_fs in cleancache.c states that the function is called by any cleancache-enabled filesystem at time of unmount, we assumed it was indeed always called upon unmount.

Is it not certain that this function is always called?

Andor
* RE: (un)loadable module support for zcache
  2012-03-08 16:51 ` Andor Daam
@ 2012-03-08 17:07 ` Dan Magenheimer
  0 siblings, 0 replies; 6+ messages in thread
From: Dan Magenheimer @ 2012-03-08 17:07 UTC (permalink / raw)
  To: Andor Daam
  Cc: Florian Schmaus, linux-mm, Stefan Hengelein, sjenning, Konrad Wilk, i4passt, devel, Nitin Gupta

> From: Andor Daam [mailto:andor.daam@googlemail.com]
> Subject: Re: (un)loadable module support for zcache
>
> 2012/3/8 Dan Magenheimer <dan.magenheimer@oracle.com>
> >
> > > From: Florian Schmaus [mailto:fschmaus@gmail.com]
> > > Subject: Re: (un)loadable module support for zcache
> > >
> > > This should allow backends to register with cleancache and frontswap
> > > even after filesystems are mounted and/or swapon is run. It should
> > > therefore allow zcache to be insmodded. This would be a first step
> > > towards allowing rmmodding of zcache as well.
> > >
> > > Is this approach feasible?
> >
> > Hi Stefan, Florian, and Andor --
> >
> > I do see a potential problem with this approach. You would
> > be saving a superblock pointer and then using it later. What
> > if the filesystem was unmounted in the meantime? Or, worse,
> > what if it was unmounted and the address of the superblock
> > was then reused to point to some completely different object?
> >
> > I think if you ensure that cleancache_invalidate_fs() is always
> > called when a cleancache-enabled filesystem is unmounted,
> > and in cleancache_invalidate_fs() you remove the matching
> > superblock pointer from your arrays, then it should work.
>
> We already thought of removing the matching pointer whenever a filesystem is
> unmounted.

Great!

> As the comment on __cleancache_invalidate_fs in cleancache.c states
> that the function is called by any cleancache-enabled filesystem at
> time of unmount, we assumed it was indeed always called upon unmount.
Hi Andor --

Until now, cleancache_invalidate_fs was only called for garbage collection, so it didn't really matter. Since, after your work is done, a missed call to cleancache_invalidate_fs has the potential to cause data corruption, it's probably best to be paranoid and verify.

> Is it not certain that this function is always called?

I *think* it should always be called, but I am not a filesystem expert. It might be worth asking the question on a filesystem mailing list (or on the individual lists for ext3/4, ocfs2, btrfs): "Is it ever possible for the superblock of a mounted filesystem to be freed without a previous call to unmount the filesystem?" And you might want to check the call sites of cleancache_invalidate_fs (in each of the filesystems) to see if there are error conditions which would skip the call to cleancache_invalidate_fs.

Alternately, if you generate and keep track of a "fake pool id" and map it (after the backend registers) to a real pool id, I think there's no risk. However, I agree your solution is more elegant, so as long as you verify that there is no chance of data corruption, I am OK with your solution.

Dan