* RE: [PATCH 0/4] promote zcache from staging [not found] <1343413117-1989-1-git-send-email-sjenning@linux.vnet.ibm.com> @ 2012-07-27 19:21 ` Dan Magenheimer 2012-07-27 20:59 ` Konrad Rzeszutek Wilk 0 siblings, 1 reply; 41+ messages in thread From: Dan Magenheimer @ 2012-07-27 19:21 UTC (permalink / raw) To: Seth Jennings, Greg Kroah-Hartman Cc: Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Rzeszutek Wilk, Dan Magenheimer, Robert Jennings, linux-mm, linux-kernel, devel > From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com] > Subject: [PATCH 0/4] promote zcache from staging > > zcache is the remaining piece of code required to support in-kernel > memory compression. The other two features, cleancache and frontswap, > have been promoted to mainline in 3.0 and 3.5. This patchset > promotes zcache from the staging tree to mainline. > > Based on the level of activity and contributions we're seeing from a > diverse set of people and interests, I think zcache has matured to the > point where it makes sense to promote this out of staging. Hi Seth -- Per offline communication, I'd like to see this delayed for three reasons: 1) I've completely rewritten zcache and will post the rewrite soon. The redesigned code fixes many of the weaknesses in zcache that make it (IMHO) unsuitable for an enterprise distro. (Some of these were previously discussed in linux-mm [1].) 2) zcache is truly mm (memory management) code and the fact that it is in drivers at all was purely for logistical reasons (e.g. the only in-tree "staging" is in the drivers directory). My rewrite promotes it to (a subdirectory of) mm where IMHO it belongs. 3) Ramster heavily duplicates code from zcache. My rewrite resolves this. My soon-to-be-posted series also places the re-factored ramster in mm, though with some minor work zcache could go in mm and ramster could stay in staging. Let's have this discussion, but unless the community decides otherwise, please consider this a NACK. 
Thanks, Dan [1] http://marc.info/?t=133886706700002&r=1&w=2 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging 2012-07-27 19:21 ` [PATCH 0/4] promote zcache from staging Dan Magenheimer @ 2012-07-27 20:59 ` Konrad Rzeszutek Wilk 2012-07-27 21:42 ` Dan Magenheimer 0 siblings, 1 reply; 41+ messages in thread From: Konrad Rzeszutek Wilk @ 2012-07-27 20:59 UTC (permalink / raw) To: Dan Magenheimer Cc: Seth Jennings, Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Rzeszutek Wilk, Robert Jennings, linux-mm, linux-kernel, devel On Fri, Jul 27, 2012 at 12:21:50PM -0700, Dan Magenheimer wrote: > > From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com] > > Subject: [PATCH 0/4] promote zcache from staging > > > > zcache is the remaining piece of code required to support in-kernel > > memory compression. The other two features, cleancache and frontswap, > > have been promoted to mainline in 3.0 and 3.5. This patchset > > promotes zcache from the staging tree to mainline. > > > > Based on the level of activity and contributions we're seeing from a > > diverse set of people and interests, I think zcache has matured to the > > point where it makes sense to promote this out of staging. > > Hi Seth -- > > Per offline communication, I'd like to see this delayed for three > reasons: > > 1) I've completely rewritten zcache and will post the rewrite soon. > The redesigned code fixes many of the weaknesses in zcache that > makes it (IMHO) unsuitable for an enterprise distro. (Some of > these previously discussed in linux-mm [1].) > 2) zcache is truly mm (memory management) code and the fact that > it is in drivers at all was purely for logistical reasons > (e.g. the only in-tree "staging" is in the drivers directory). > My rewrite promotes it to (a subdirectory of) mm where IMHO it > belongs. > 3) Ramster heavily duplicates code from zcache. My rewrite resolves > this. My soon-to-be-post also places the re-factored ramster > in mm, though with some minor work zcache could go in mm and > ramster could stay in staging. 
> > Let's have this discussion, but unless the community decides > otherwise, please consider this a NACK. Hold on, that is rather unfair. The zcache has been in staging for quite some time - your code has not been posted. Part of "unstaging" a driver is for folks to review the code - and you just said "No, mine is better" without showing your goods. There is a third option - which is to continue the promotion of zcache from staging, get reviews, work on them, etc., and alongside of that you can work on fixing up (or ripping out) zcache1 with zcache2 components as they make sense. Or even having two of them - an enterprise and an embedded version that will eventually get merged together. There is nothing wrong with modifying a driver once it has left staging.
* RE: [PATCH 0/4] promote zcache from staging 2012-07-27 20:59 ` Konrad Rzeszutek Wilk @ 2012-07-27 21:42 ` Dan Magenheimer 2012-07-29 1:54 ` Minchan Kim 2012-07-30 19:19 ` Seth Jennings 0 siblings, 2 replies; 41+ messages in thread From: Dan Magenheimer @ 2012-07-27 21:42 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Seth Jennings, Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Wilk, Robert Jennings, linux-mm, linux-kernel, devel > From: Konrad Rzeszutek Wilk [mailto:konrad@darnok.org] > Sent: Friday, July 27, 2012 3:00 PM > Subject: Re: [PATCH 0/4] promote zcache from staging > > On Fri, Jul 27, 2012 at 12:21:50PM -0700, Dan Magenheimer wrote: > > > From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com] > > > Subject: [PATCH 0/4] promote zcache from staging > > > > > > zcache is the remaining piece of code required to support in-kernel > > > memory compression. The other two features, cleancache and frontswap, > > > have been promoted to mainline in 3.0 and 3.5. This patchset > > > promotes zcache from the staging tree to mainline. > > > > > > Based on the level of activity and contributions we're seeing from a > > > diverse set of people and interests, I think zcache has matured to the > > > point where it makes sense to promote this out of staging. > > > > Hi Seth -- > > > > Per offline communication, I'd like to see this delayed for three > > reasons: > > > > 1) I've completely rewritten zcache and will post the rewrite soon. > > The redesigned code fixes many of the weaknesses in zcache that > > makes it (IMHO) unsuitable for an enterprise distro. (Some of > > these previously discussed in linux-mm [1].) > > 2) zcache is truly mm (memory management) code and the fact that > > it is in drivers at all was purely for logistical reasons > > (e.g. the only in-tree "staging" is in the drivers directory). > > My rewrite promotes it to (a subdirectory of) mm where IMHO it > > belongs. > > 3) Ramster heavily duplicates code from zcache. 
My rewrite resolves > > this. My soon-to-be-post also places the re-factored ramster > > in mm, though with some minor work zcache could go in mm and > > ramster could stay in staging. > > > > Let's have this discussion, but unless the community decides > > otherwise, please consider this a NACK. Hi Konrad -- > Hold on, that is rather unfair. The zcache has been in staging > for quite some time - your code has not been posted. Part of > "unstaging" a driver is for folks to review the code - and you > just said "No, mine is better" without showing your goods. Sorry, I'm not trying to be unfair. However, I don't see the point of promoting zcache out of staging unless it is intended to be used by real users in a real distro. There's been a lot of discussion, onlist and offlist, about what needs to be fixed in zcache and not much visible progress on fixing it. But fixing it is where I've spent most of my time over the last couple of months. If IBM or some other company or distro is eager to ship and support zcache in its current form, I agree that "promote now, improve later" is a fine approach. But promoting zcache out of staging simply because there is urgency to promote zsmalloc+zram out of staging doesn't seem wise. At a minimum, it distracts reviewers/effort from what IMHO is required to turn zcache into an enterprise-ready kernel feature. I can post my "goods" anytime. In its current form it is better than the zcache in staging (and, please remember, I wrote both so I think I am in a good position to compare the two). I have been waiting until I think the new zcache is feature complete before asking for review, especially since the newest features should demonstrate clearly why the rewrite is necessary and beneficial. But I can post* my current bits if people don't believe they exist and/or don't mind reviewing non-final code. (* Or I can put them in a publicly available git tree.) 
> There is a third option - which is to continue the promotion > of zcache from staging, get reviews, work on them ,etc, and > alongside of that you can work on fixing up (or ripping out) > zcache1 with zcache2 components as they make sense. Or even > having two of them - an enterprise and an embedded version > that will eventually get merged together. There is nothing > wrong with modifying a driver once it has left staging. Minchan and Seth can correct me if I am wrong, but I believe zram+zsmalloc, not zcache, is the target solution for embedded. The limitations of zsmalloc aren't an issue for zram but they are for zcache, and this deficiency was one of the catalysts for the rewrite. The issues are explained in more detail in [1], but if any point isn't clear, I'd be happy to explain further. However, I have limited time for this right now and I'd prefer to spend it finishing the code. :-} So, as I said, I am still a NACK, but if there are good reasons to duplicate effort and pursue the "third option", let's discuss them. Thanks, Dan [1] http://marc.info/?t=133886706700002&r=1&w=2
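As background for the zsmalloc discussion: compressed pages are odd-sized objects well under PAGE_SIZE, and a power-of-two allocator in the kmalloc style can waste close to half of each allocation on them, which is why zram and zcache use a size-class allocator that packs objects tightly. The toy calculation below only illustrates that gap; the 16-byte class granularity is made up for the example (zsmalloc's real size classes differ, and the actual zsmalloc limitations Dan refers to are the ones discussed in [1]):

```c
#include <assert.h>

/* Round up to the next power of two, as a kmalloc-style allocator would. */
unsigned round_up_pow2(unsigned n)
{
    unsigned p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Bytes wasted storing one compressed object of `size` bytes
 * under power-of-two rounding. */
unsigned pow2_waste(unsigned size)
{
    return round_up_pow2(size) - size;
}

/* A size-class allocator instead rounds to a small multiple
 * (illustrative 16-byte classes), packing objects tightly. */
unsigned class_waste(unsigned size)
{
    unsigned class = (size + 15) & ~15u;
    return class - size;
}
```

For a typical 2200-byte compressed page, power-of-two rounding burns 1896 bytes (it allocates a full 4096), while a 16-byte class wastes only 8.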
* Re: [PATCH 0/4] promote zcache from staging 2012-07-27 21:42 ` Dan Magenheimer @ 2012-07-29 1:54 ` Minchan Kim 2012-07-31 15:36 ` Konrad Rzeszutek Wilk 2012-07-30 19:19 ` Seth Jennings 1 sibling, 1 reply; 41+ messages in thread From: Minchan Kim @ 2012-07-29 1:54 UTC (permalink / raw) To: Dan Magenheimer Cc: Konrad Rzeszutek Wilk, Seth Jennings, Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Konrad Wilk, Robert Jennings, linux-mm, linux-kernel, devel On Fri, Jul 27, 2012 at 02:42:14PM -0700, Dan Magenheimer wrote: > > From: Konrad Rzeszutek Wilk [mailto:konrad@darnok.org] > > Sent: Friday, July 27, 2012 3:00 PM > > Subject: Re: [PATCH 0/4] promote zcache from staging > > > > On Fri, Jul 27, 2012 at 12:21:50PM -0700, Dan Magenheimer wrote: > > > > From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com] > > > > Subject: [PATCH 0/4] promote zcache from staging > > > > > > > > zcache is the remaining piece of code required to support in-kernel > > > > memory compression. The other two features, cleancache and frontswap, > > > > have been promoted to mainline in 3.0 and 3.5. This patchset > > > > promotes zcache from the staging tree to mainline. > > > > > > > > Based on the level of activity and contributions we're seeing from a > > > > diverse set of people and interests, I think zcache has matured to the > > > > point where it makes sense to promote this out of staging. > > > > > > Hi Seth -- > > > > > > Per offline communication, I'd like to see this delayed for three > > > reasons: > > > > > > 1) I've completely rewritten zcache and will post the rewrite soon. > > > The redesigned code fixes many of the weaknesses in zcache that > > > makes it (IMHO) unsuitable for an enterprise distro. (Some of > > > these previously discussed in linux-mm [1].) > > > 2) zcache is truly mm (memory management) code and the fact that > > > it is in drivers at all was purely for logistical reasons > > > (e.g. the only in-tree "staging" is in the drivers directory). 
> > > My rewrite promotes it to (a subdirectory of) mm where IMHO it > > > belongs. > > > 3) Ramster heavily duplicates code from zcache. My rewrite resolves > > > this. My soon-to-be-post also places the re-factored ramster > > > in mm, though with some minor work zcache could go in mm and > > > ramster could stay in staging. > > > > > > Let's have this discussion, but unless the community decides > > > otherwise, please consider this a NACK. > > Hi Konrad -- > > > Hold on, that is rather unfair. The zcache has been in staging > > for quite some time - your code has not been posted. Part of > > "unstaging" a driver is for folks to review the code - and you > > just said "No, mine is better" without showing your goods. > > Sorry, I'm not trying to be unfair. However, I don't see the point > of promoting zcache out of staging unless it is intended to be used > by real users in a real distro. There's been a lot of discussion, > onlist and offlist, about what needs to be fixed in zcache and not > much visible progress on fixing it. But fixing it is where I've spent > most of my time over the last couple of months. > > If IBM or some other company or distro is eager to ship and support > zcache in its current form, I agree that "promote now, improve later" > is a fine approach. But promoting zcache out of staging simply because > there is urgency to promote zsmalloc+zram out of staging doesn't > seem wise. At a minimum, it distracts reviewers/effort from what IMHO > is required to turn zcache into an enterprise-ready kernel feature. > > I can post my "goods" anytime. In its current form it is better > than the zcache in staging (and, please remember, I wrote both so > I think I am in a good position to compare the two). > I have been waiting until I think the new zcache is feature complete > before asking for review, especially since the newest features > should demonstrate clearly why the rewrite is necessary and > beneficial. 
But I can post* my current bits if people don't > believe they exist and/or don't mind reviewing non-final code. > (* Or I can put them in a publicly available git tree.) > > > There is a third option - which is to continue the promotion > > of zcache from staging, get reviews, work on them ,etc, and > > alongside of that you can work on fixing up (or ripping out) > > zcache1 with zcache2 components as they make sense. Or even > > having two of them - an enterprise and an embedded version > > that will eventually get merged together. There is nothing > > wrong with modifying a driver once it has left staging. > > Minchan and Seth can correct me if I am wrong, but I believe > zram+zsmalloc, not zcache, is the target solution for embedded. Not true. Some embedded devices use zcache, but it's not the original zcache - it's a modified one. Anyway, although embedded people use a modified zcache, I am biased toward Dan. I admit I haven't spent a lot of time looking at zcache, but from what I saw of the code, it wasn't in good shape and even had a bug found during code review, and I felt strongly we should clean it up before promoting it to mm/. So I would like to wait for Dan's posting if you guys are not in a hurry. (And I am not sure akpm would allow it in the current shape of the zcache code.) But the concern is about adding new features. I guess there might be a long debate and it could prevent promotion again. I think that's not what Seth wants. I hope Dan doesn't mix the clean-up series with the new-feature series, and posts the clean-up series as soon as possible, so we can clean up first and try to promote it; later, adding new features or changing algorithms is desirable. 
:-} > > So, as I said, I am still a NACK, but if there are good reasons > to duplicate effort and pursue the "third option", let's discuss > them. > > Thanks, > Dan > > [1] http://marc.info/?t=133886706700002&r=1&w=2 -- Kind regards, Minchan Kim
* Re: [PATCH 0/4] promote zcache from staging 2012-07-29 1:54 ` Minchan Kim @ 2012-07-31 15:36 ` Konrad Rzeszutek Wilk 2012-08-06 4:49 ` Minchan Kim 0 siblings, 1 reply; 41+ messages in thread From: Konrad Rzeszutek Wilk @ 2012-07-31 15:36 UTC (permalink / raw) To: Minchan Kim Cc: Dan Magenheimer, Konrad Rzeszutek Wilk, Seth Jennings, Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Robert Jennings, linux-mm, linux-kernel, devel On Sun, Jul 29, 2012 at 10:54:28AM +0900, Minchan Kim wrote: > On Fri, Jul 27, 2012 at 02:42:14PM -0700, Dan Magenheimer wrote: > > > From: Konrad Rzeszutek Wilk [mailto:konrad@darnok.org] > > > Sent: Friday, July 27, 2012 3:00 PM > > > Subject: Re: [PATCH 0/4] promote zcache from staging > > > > > > On Fri, Jul 27, 2012 at 12:21:50PM -0700, Dan Magenheimer wrote: > > > > > From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com] > > > > > Subject: [PATCH 0/4] promote zcache from staging > > > > > > > > > > zcache is the remaining piece of code required to support in-kernel > > > > > memory compression. The other two features, cleancache and frontswap, > > > > > have been promoted to mainline in 3.0 and 3.5. This patchset > > > > > promotes zcache from the staging tree to mainline. > > > > > > > > > > Based on the level of activity and contributions we're seeing from a > > > > > diverse set of people and interests, I think zcache has matured to the > > > > > point where it makes sense to promote this out of staging. > > > > > > > > Hi Seth -- > > > > > > > > Per offline communication, I'd like to see this delayed for three > > > > reasons: > > > > > > > > 1) I've completely rewritten zcache and will post the rewrite soon. > > > > The redesigned code fixes many of the weaknesses in zcache that > > > > makes it (IMHO) unsuitable for an enterprise distro. (Some of > > > > these previously discussed in linux-mm [1].) 
> > > > 2) zcache is truly mm (memory management) code and the fact that > > > > it is in drivers at all was purely for logistical reasons > > > > (e.g. the only in-tree "staging" is in the drivers directory). > > > > My rewrite promotes it to (a subdirectory of) mm where IMHO it > > > > belongs. > > > > 3) Ramster heavily duplicates code from zcache. My rewrite resolves > > > > this. My soon-to-be-post also places the re-factored ramster > > > > in mm, though with some minor work zcache could go in mm and > > > > ramster could stay in staging. > > > > > > > > Let's have this discussion, but unless the community decides > > > > otherwise, please consider this a NACK. > > > > Hi Konrad -- > > > > > Hold on, that is rather unfair. The zcache has been in staging > > > for quite some time - your code has not been posted. Part of > > > "unstaging" a driver is for folks to review the code - and you > > > just said "No, mine is better" without showing your goods. > > > > Sorry, I'm not trying to be unfair. However, I don't see the point > > of promoting zcache out of staging unless it is intended to be used > > by real users in a real distro. There's been a lot of discussion, > > onlist and offlist, about what needs to be fixed in zcache and not > > much visible progress on fixing it. But fixing it is where I've spent > > most of my time over the last couple of months. > > > > If IBM or some other company or distro is eager to ship and support > > zcache in its current form, I agree that "promote now, improve later" > > is a fine approach. But promoting zcache out of staging simply because > > there is urgency to promote zsmalloc+zram out of staging doesn't > > seem wise. At a minimum, it distracts reviewers/effort from what IMHO > > is required to turn zcache into an enterprise-ready kernel feature. > > > > I can post my "goods" anytime. 
In its current form it is better > > than the zcache in staging (and, please remember, I wrote both so > > I think I am in a good position to compare the two). > > I have been waiting until I think the new zcache is feature complete > > before asking for review, especially since the newest features > > should demonstrate clearly why the rewrite is necessary and > > beneficial. But I can post* my current bits if people don't > > believe they exist and/or don't mind reviewing non-final code. > > (* Or I can put them in a publicly available git tree.) > > > > > There is a third option - which is to continue the promotion > > > of zcache from staging, get reviews, work on them ,etc, and > > > alongside of that you can work on fixing up (or ripping out) > > > zcache1 with zcache2 components as they make sense. Or even > > > having two of them - an enterprise and an embedded version > > > that will eventually get merged together. There is nothing > > > wrong with modifying a driver once it has left staging. > > > > Minchan and Seth can correct me if I am wrong, but I believe > > zram+zsmalloc, not zcache, is the target solution for embedded. > > NOT ture. Some embedded devices use zcache but it's not original > zcache but modificated one. What kind of modifications? Would it make sense to post the patches for those modifications? > Anyway, although embedded people use modified zcache, I am biased to Dan. > I admit I don't spend lots of time to look zcache but as looking the > code, it wasn't good shape and even had a bug found during code review > and I felt strongly we should clean up it for promoting it to mm/. Do you recall what the bugs were? > So I would like to wait Dan's posting if you guys are not urgent. > (And I am not sure akpm allow it with current shape of zcache code.) > But the concern is about adding new feature. I guess there might be some > debate for long time and it can prevent promoting again. > I think It's not what Seth want. 
> I hope Dan doesn't mix clean up series and new feature series and > post clean up series as soon as possible so let's clean up first and > try to promote it and later, adding new feature or changing algorithm > is desirable. > > > > The limitations of zsmalloc aren't an issue for zram but they are > > for zcache, and this deficiency was one of the catalysts for the > > rewrite. The issues are explained in more detail in [1], > > but if any point isn't clear, I'd be happy to explain further. > > > > However, I have limited time for this right now and I'd prefer > > to spend it finishing the code. :-} > > > > So, as I said, I am still a NACK, but if there are good reasons > > to duplicate effort and pursue the "third option", let's discuss > > them. > > > > Thanks, > > Dan > > > > [1] http://marc.info/?t=133886706700002&r=1&w=2 > -- > Kind regards, > Minchan Kim
* Re: [PATCH 0/4] promote zcache from staging 2012-07-31 15:36 ` Konrad Rzeszutek Wilk @ 2012-08-06 4:49 ` Minchan Kim 0 siblings, 0 replies; 41+ messages in thread From: Minchan Kim @ 2012-08-06 4:49 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Dan Magenheimer, Konrad Rzeszutek Wilk, Seth Jennings, Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Robert Jennings, linux-mm, linux-kernel, devel On Tue, Jul 31, 2012 at 11:36:04AM -0400, Konrad Rzeszutek Wilk wrote: > On Sun, Jul 29, 2012 at 10:54:28AM +0900, Minchan Kim wrote: > > On Fri, Jul 27, 2012 at 02:42:14PM -0700, Dan Magenheimer wrote: > > > > From: Konrad Rzeszutek Wilk [mailto:konrad@darnok.org] > > > > Sent: Friday, July 27, 2012 3:00 PM > > > > Subject: Re: [PATCH 0/4] promote zcache from staging > > > > > > > > On Fri, Jul 27, 2012 at 12:21:50PM -0700, Dan Magenheimer wrote: > > > > > > From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com] > > > > > > Subject: [PATCH 0/4] promote zcache from staging > > > > > > > > > > > > zcache is the remaining piece of code required to support in-kernel > > > > > > memory compression. The other two features, cleancache and frontswap, > > > > > > have been promoted to mainline in 3.0 and 3.5. This patchset > > > > > > promotes zcache from the staging tree to mainline. > > > > > > > > > > > > Based on the level of activity and contributions we're seeing from a > > > > > > diverse set of people and interests, I think zcache has matured to the > > > > > > point where it makes sense to promote this out of staging. > > > > > > > > > > Hi Seth -- > > > > > > > > > > Per offline communication, I'd like to see this delayed for three > > > > > reasons: > > > > > > > > > > 1) I've completely rewritten zcache and will post the rewrite soon. > > > > > The redesigned code fixes many of the weaknesses in zcache that > > > > > makes it (IMHO) unsuitable for an enterprise distro. (Some of > > > > > these previously discussed in linux-mm [1].) 
> > > > > 2) zcache is truly mm (memory management) code and the fact that > > > > > it is in drivers at all was purely for logistical reasons > > > > > (e.g. the only in-tree "staging" is in the drivers directory). > > > > > My rewrite promotes it to (a subdirectory of) mm where IMHO it > > > > > belongs. > > > > > 3) Ramster heavily duplicates code from zcache. My rewrite resolves > > > > > this. My soon-to-be-post also places the re-factored ramster > > > > > in mm, though with some minor work zcache could go in mm and > > > > > ramster could stay in staging. > > > > > > > > > > Let's have this discussion, but unless the community decides > > > > > otherwise, please consider this a NACK. > > > > > > Hi Konrad -- > > > > > > > Hold on, that is rather unfair. The zcache has been in staging > > > > for quite some time - your code has not been posted. Part of > > > > "unstaging" a driver is for folks to review the code - and you > > > > just said "No, mine is better" without showing your goods. > > > > > > Sorry, I'm not trying to be unfair. However, I don't see the point > > > of promoting zcache out of staging unless it is intended to be used > > > by real users in a real distro. There's been a lot of discussion, > > > onlist and offlist, about what needs to be fixed in zcache and not > > > much visible progress on fixing it. But fixing it is where I've spent > > > most of my time over the last couple of months. > > > > > > If IBM or some other company or distro is eager to ship and support > > > zcache in its current form, I agree that "promote now, improve later" > > > is a fine approach. But promoting zcache out of staging simply because > > > there is urgency to promote zsmalloc+zram out of staging doesn't > > > seem wise. At a minimum, it distracts reviewers/effort from what IMHO > > > is required to turn zcache into an enterprise-ready kernel feature. > > > > > > I can post my "goods" anytime. 
In its current form it is better > > > than the zcache in staging (and, please remember, I wrote both so > > > I think I am in a good position to compare the two). > > > I have been waiting until I think the new zcache is feature complete > > > before asking for review, especially since the newest features > > > should demonstrate clearly why the rewrite is necessary and > > > beneficial. But I can post* my current bits if people don't > > > believe they exist and/or don't mind reviewing non-final code. > > > (* Or I can put them in a publicly available git tree.) > > > > > > > There is a third option - which is to continue the promotion > > > > of zcache from staging, get reviews, work on them ,etc, and > > > > alongside of that you can work on fixing up (or ripping out) > > > > zcache1 with zcache2 components as they make sense. Or even > > > > having two of them - an enterprise and an embedded version > > > > that will eventually get merged together. There is nothing > > > > wrong with modifying a driver once it has left staging. > > > > > > Minchan and Seth can correct me if I am wrong, but I believe > > > zram+zsmalloc, not zcache, is the target solution for embedded. > > > > NOT ture. Some embedded devices use zcache but it's not original > > zcache but modificated one. > > What kind of modifications? Would it make sense to post the patches It's for contiguous memory allocation. For that, it uses only cleancache, not frontswap, so it can zap ephemeral pages without latency when it needs big contiguous memory. > for those modifications? It's another story, so let's not consider it at the moment. After we get some cleanup done, I will revisit it. > > > Anyway, although embedded people use modified zcache, I am biased to Dan. > > I admit I don't spend lots of time to look zcache but as looking the > > code, it wasn't good shape and even had a bug found during code review > > and I felt strongly we should clean up it for promoting it to mm/. 
> > Do you recall what the bugs were?

From: Minchan Kim <minchan@kernel.org>
Date: Fri, 27 Jul 2012 10:10:31 +0900
Subject: [PATCH] zcache: initialize idr

!CONFIG_FRONTSWAP doesn't initialize idr. This patch always initializes idr.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/staging/zcache/zcache-main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 564873f..a635ee2 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -968,7 +968,7 @@ static void zcache_put_pool(struct tmem_pool *pool)
 	atomic_dec(&cli->refcount);
 }
 
-int zcache_new_client(uint16_t cli_id)
+static int zcache_new_client(uint16_t cli_id)
 {
 	struct zcache_client *cli;
 	int ret = -1;
@@ -980,11 +980,11 @@ int zcache_new_client(uint16_t cli_id)
 	if (cli->zspool)
 		goto out;
 
+	idr_init(&cli->tmem_pools);
 #ifdef CONFIG_FRONTSWAP
 	cli->zspool = zs_create_pool("zcache", ZCACHE_GFP_MASK);
 	if (cli->zspool == NULL)
 		goto out;
-	idr_init(&cli->tmem_pools);
 #endif
 	ret = 0;
 out:
-- 
1.7.9.5

> > > So I would like to wait Dan's posting if you guys are not urgent. > > > (And I am not sure akpm allow it with current shape of zcache code.) > > > But the concern is about adding new feature. I guess there might be some > > > debate for long time and it can prevent promoting again. > > > I think It's not what Seth want. > > I hope Dan doesn't mix clean up series and new feature series and > > post clean up series as soon as possible so let's clean up first and > > try to promote it and later, adding new feature or changing algorithm > > is desirable. > > > > > > The limitations of zsmalloc aren't an issue for zram but they are > > > for zcache, and this deficiency was one of the catalysts for the > > > rewrite. The issues are explained in more detail in [1], > > > but if any point isn't clear, I'd be happy to explain further.
> > > > > > However, I have limited time for this right now and I'd prefer > > > to spend it finishing the code. :-} > > > > > > So, as I said, I am still a NACK, but if there are good reasons > > > to duplicate effort and pursue the "third option", let's discuss > > > them. > > > > > > Thanks, > > > Dan > > > > > > [1] http://marc.info/?t=133886706700002&r=1&w=2 > > > > > > -- > > > To unsubscribe, send a message with 'unsubscribe linux-mm' in > > > the body to majordomo@kvack.org. For more info on Linux MM, > > > see: http://www.linux-mm.org/ . > > > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> > > > > -- > > Kind regards, > > Minchan Kim > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging 2012-07-27 21:42 ` Dan Magenheimer 2012-07-29 1:54 ` Minchan Kim @ 2012-07-30 19:19 ` Seth Jennings 2012-07-30 20:48 ` Dan Magenheimer 1 sibling, 1 reply; 41+ messages in thread From: Seth Jennings @ 2012-07-30 19:19 UTC (permalink / raw) To: Dan Magenheimer Cc: Konrad Rzeszutek Wilk, Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Wilk, Robert Jennings, linux-mm, linux-kernel, devel Dan, I started writing inline responses to each concern but that was adding more confusion than clarity. I would like to focus the discussion. The purpose of this patchset is to discuss the inclusion of zcache into mainline during the 3.7 merge window. zcache has been in staging since v2.6.39 and has been maturing with contributions from 15 developers (5 with multiple commits) working on improvements and bug fixes. I want good code in the kernel, so if there are particular areas that need attention before it's of acceptable quality for mainline, we need that discussion. I am eager to have customers using memory compression with zcache, but before that I want to see zcache in mainline. We agree with Konrad that zcache should be promoted before additional features are included. Greg has also expressed that he would like promotion before attempting to add additional features [1]. Including new features now, while in the staging tree, adds to the complexity and difficulty of reverifying zcache and getting it accepted into mainline. [1] https://lkml.org/lkml/2012/3/16/472 Let's have this discussion. If there are specific issues that need to be addressed to get this ready for mainline, let's take them one-by-one and line-by-line with patches. Seth -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH 0/4] promote zcache from staging 2012-07-30 19:19 ` Seth Jennings @ 2012-07-30 20:48 ` Dan Magenheimer 2012-07-31 15:58 ` Konrad Rzeszutek Wilk 0 siblings, 1 reply; 41+ messages in thread From: Dan Magenheimer @ 2012-07-30 20:48 UTC (permalink / raw) To: Seth Jennings Cc: Konrad Rzeszutek Wilk, Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Wilk, Robert Jennings, linux-mm, linux-kernel, devel > From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com] > Subject: Re: [PATCH 0/4] promote zcache from staging > > Dan, > > I started writing inline responses to each concern but that > was adding more confusion than clarity. I would like to > focus the discussion. > : > Let's have this discussion. If there are specific issues > that need to be addressed to get this ready for mainline > let's take them one-by-one and line-by-line with patches. Hi Seth -- Thanks for your response and for your passion. The first discussion I think is about whether zsmalloc is a suitable allocator for zcache. In its current state in staging, zcache uses zbud for ephemeral (cleancache) zpages and zsmalloc for persistent (frontswap) zpages. I have proposed concerns on-list that the capabilities provided by zsmalloc are not suitable for supporting zcache in an enterprise distro. The author of zsmalloc concurred and has (at least so far) not been available to enhance zsmalloc, and you have taken a strong position that zsmalloc needed to be "generic" (i.e. will never deliver the functionality IMHO is necessary for zcache). So I have rewritten zbud to handle both kinds of zpages and, at the same time, to resolve my stated issues. This is the bulk of my major rewrite... I don't think constructing and reviewing a long series of one-by-one and line-by-line patches is of much value here, especially since the current code is in staging. 
We either (1) use the now rewritten zbud (2) wait until someone rewrites zsmalloc (3) accept the deficiencies of zcache in its current form. The second discussion is whether ramster, as a "user" of zcache, is relevant. As you know, ramster is built on top of zcache but requires a fair number of significant changes that, due to gregkh's restriction, could not be made directly to zcache while in staging. In my rewrite, I've taken a great deal of care that the "new" zcache cleanly supports both. While some couldn't care less about ramster, the next step of ramster may be of more interest to a broader part of the community. So I am eager to ensure that the core zcache code in zcache and ramster doesn't need to "fork" again. The zcache-main.c in staging/ramster is farther along than the zcache-main.c in staging/zcache, but IMHO my rewrite is better and cleaner than either. Most of the rest of the cleanup, such as converting to debugfs instead of sysfs, could be done as a sequence of one-by-one and line-by-line patches. I think we agree that zcache will not be promoted unless this change is made, but IMHO constructing and reviewing patches individually is not of much value since the above zbud and ramster changes already result in a major rewrite. I think the community would benefit most from a new solid code foundation for zcache and reviewers time (and your time and mine) would best be spent grokking the new code than from reviewing a very long sequence of cleanup patches. > The purpose of this patchset is to discuss the inclusion of > zcache into mainline during the 3.7 merge window. zcache > has been a staging since v2.6.39 and has been maturing with > contributions from 15 developers (5 with multiple commits) > working on improvements and bug fixes. > > I want good code in the kernel, so if there are particular > areas that need attention before it's of acceptable quality > for mainline we need that discussion. 
> I am eager to have > customers using memory compression with zcache but before > that I want to see zcache in mainline. I think we are all eager to achieve the end result: real users using zcache in real production systems. IMHO your suggested path will not achieve that, certainly not in the 3.7 timeframe. The current code (IMHO) is neither suitable for promotion, nor functionally capable of taking the beating of an enterprise distro. > We agree with Konrad that zcache should be promoted before > additional features are included. Greg has also expressed > that he would like promotion before attempting to add > additional features [1]. Including new features now, while > in the staging tree, adds to the complexity and difficulty > of reverifying zcache and getting it accepted into mainline. > > [1] https://lkml.org/lkml/2012/3/16/472 Zcache as submitted to staging in 2.6.39 was (and is) a working proof-of-concept. As you know, Greg's position created a "catch 22"... zcache in its current state isn't good enough to be promoted, but we can't change it substantially to resolve its deficiencies while it is still in staging. (Minchan recently stated that he doesn't think it is in good enough shape to be approved by Andrew, and I agree.) That's why I embarked on the rewrite. Lastly, I'm not so much "adding new features" as ensuring the new zcache foundation will be sufficient to support enterprise users. But I do now agree with Minchan (and I think with you) that I need to post where I'm at, even if I am not 100% ready or satisfied. I'll try to do that by the end of the week. Thanks, Dan -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging 2012-07-30 20:48 ` Dan Magenheimer @ 2012-07-31 15:58 ` Konrad Rzeszutek Wilk 2012-07-31 16:19 ` Greg Kroah-Hartman 0 siblings, 1 reply; 41+ messages in thread From: Konrad Rzeszutek Wilk @ 2012-07-31 15:58 UTC (permalink / raw) To: Dan Magenheimer Cc: Seth Jennings, Konrad Rzeszutek Wilk, Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Robert Jennings, linux-mm, linux-kernel, devel On Mon, Jul 30, 2012 at 01:48:29PM -0700, Dan Magenheimer wrote: > > From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com] > > Subject: Re: [PATCH 0/4] promote zcache from staging > > > > Dan, > > > > I started writing inline responses to each concern but that > > was adding more confusion than clarity. I would like to > > focus the discussion. > > : > > Let's have this discussion. If there are specific issues > > that need to be addressed to get this ready for mainline > > let's take them one-by-one and line-by-line with patches. > > Hi Seth -- > > Thanks for your response and for your passion. > > The first discussion I think is about whether zsmalloc is > a suitable allocator for zcache. In its current state > in staging, zcache uses zbud for ephemeral (cleancache) > zpages and zsmalloc for persistent (frontswap) zpages. OK, but - unstaging 'zsmalloc' is a different patchset. > I have proposed concerns on-list that the capabilities > provided by zsmalloc are not suitable for supporting zcache > in an enterprise distro. The author of zsmalloc concurred The goal is to support _both_ enterprise and embedded. But what you are saying sounds like it does not work in an enterprise environment - which is not my experience? If you are saying that the code should not be integrated before it works in both classes perfectly - well, then a lot of other code in Linux should not have been accepted - and I would call those insufficiencies "bugs". And bugs are... natural, albeit pesky.
I think what you are saying by "enterprise" is that you want it to be enabled by default (or at least be confident that it can be) for everybody, and that it work quite well under 99% of workloads, while right now it covers only 98% (or some other number) of workloads. > and has (at least so far) not been available to enhance > zsmalloc, and you have taken a strong position that zsmalloc > needed to be "generic" (i.e. will never deliver the functionality > IMHO is necessary for zcache). So I have rewritten zbud to > handle both kinds of zpages and, at the same time, to > resolve my stated issues. This is the bulk of my Ok, so the zbud rewrite is to remove the need for zsmalloc and use zbud2 for both persistent and ephemeral pages. > major rewrite... I don't think constructing and reviewing > a long series of one-by-one and line-by-line patches is > of much value here, especially since the current code is > in staging. We either (1) use the now rewritten zbud (2) wait > until someone rewrites zsmalloc (3) accept the deficiencies > of zcache in its current form. This sounds like a Catch-22 :-) Greg would like to have the TODO list finished - and it seems that one of the todo's is to have one instead of two engines for dealing with pages. But at the same time not adding in new features. > > The second discussion is whether ramster, as a "user" of > zcache, is relevant. As you know, ramster is built on > top of zcache but requires a fair number of significant > changes that, due to gregkh's restriction, could not be > made directly to zcache while in staging. In my rewrite, > I've taken a great deal of care that the "new" zcache > cleanly supports both. While some couldn't care less about > ramster, the next step of ramster may be of more interest > to a broader part of the community. So I am eager to > ensure that the core zcache code in zcache and ramster > doesn't need to "fork" again.
> The zcache-main.c in staging/ramster > is farther along than the zcache-main.c in staging/zcache, but > IMHO my rewrite is better and cleaner than either. So in short you made zcache more modular? > > Most of the rest of the cleanup, such as converting to debugfs > instead of sysfs, could be done as a sequence of one-by-one > and line-by-line patches. I think we agree that zcache will Sure. Though you could do it more wholesale: a single sysfs->debugfs patch. > not be promoted unless this change is made, but IMHO constructing > and reviewing patches individually is not of much value since > the above zbud and ramster changes already result in a major > rewrite. I think the community would benefit most from a new > solid code foundation for zcache, and reviewers' time (and your > time and mine) would best be spent grokking the new code rather > than reviewing a very long sequence of cleanup patches. You are ignoring the goodness of the testing and performance numbers that zcache has gotten so far. With new code those numbers are invalidated. That is throwing away some good data. Reviewing code based on the old code (and knowing how the old code works) I think is easier than trying to understand new code from scratch - at least one has a baseline to understand it. But that might be just my opinion - either way I am OK looking at brand new code or old code - but I would end up looking at the old code to answer those: "Huh. I wonder how we did that previously." - at which point it might make sense just to have the patches broken up in small segments.
> > > > I want good code in the kernel, so if there are particular > > areas that need attention before it's of acceptable quality > > for mainline we need that discussion. I am eager to have > > customers using memory compression with zcache but before > > that I want to see zcache in mainline. > > I think we are all eager to achieve the end result: real users > using zcache in real production systems. IMHO your suggested > path will not achieve that, certainly not in the 3.7 timeframe. > The current code (IMHO) is neither suitable for promotion, nor > functionally capable of taking the beating of an enterprise distro. So we agree that it must be fixed, but disagree on how to fix it :-) However there are real folks in the embedded env that use it (with some modifications) - please don't call them "unreal users". > > > We agree with Konrad that zcache should be promoted before > > additional features are included. Greg has also expressed > > that he would like promotion before attempting to add > > additional features [1]. Including new features now, while > > in the staging tree, adds to the complexity and difficulty > > of reverifying zcache and getting it accepted into mainline. > > > > [1] https://lkml.org/lkml/2012/3/16/472 > > Zcache as submitted to staging in 2.6.39 was (and is) a working > proof-of-concept. As you know, Greg's position created a > "catch 22"... zcache in its current state isn't good enough > to be promoted, but we can't change it substantially to resolve > its deficiencies while it is still in staging. (Minchan > recently stated that he doesn't think it is in good enough > shape to be approved by Andrew, and I agree.) That's why I > embarked on the rewrite. > > Lastly, I'm not so much "adding new features" as ensuring the > new zcache foundation will be sufficient to support enterprise > users.
> But I do now agree with Minchan (and I think with you) > that I need to post where I'm at, even if I am not 100% ready or > satisfied. I'll try to do that by the end of the week. So in my head I feel that it is Ok to: 1) address the concerns that zcache has before it is unstaged 2) rip out the two-engine system with a one-engine system (and see how well it behaves) 3) sysfs->debugfs as needed 4) other things as needed I think we are getting hung up on what Greg said about adding features, and the two-engine->one-engine change could be understood as that, while I think that is part of a staging effort to clean up the existing issues. Let's see what Greg thinks. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging 2012-07-31 15:58 ` Konrad Rzeszutek Wilk @ 2012-07-31 16:19 ` Greg Kroah-Hartman 2012-07-31 17:51 ` Konrad Rzeszutek Wilk 0 siblings, 1 reply; 41+ messages in thread From: Greg Kroah-Hartman @ 2012-07-31 16:19 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Dan Magenheimer, devel, Seth Jennings, linux-mm, linux-kernel, Minchan Kim, Konrad Rzeszutek Wilk, Andrew Morton, Robert Jennings, Nitin Gupta On Tue, Jul 31, 2012 at 11:58:43AM -0400, Konrad Rzeszutek Wilk wrote: > So in my head I feel that it is Ok to: > 1) address the concerns that zcache has before it is unstaged > 2) rip out the two-engine system with a one-engine system > (and see how well it behaves) > 3) sysfs->debugfs as needed > 4) other things as needed > > I think we are getting hung-up what Greg said about adding features > and the two-engine->one engine could be understood as that. > While I think that is part of a staging effort to clean up the > existing issues. Lets see what Greg thinks. Greg has no idea, except I want to see the needed fixups happen before new features get added. Add the new features _after_ it is out of staging. greg k-h -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging 2012-07-31 16:19 ` Greg Kroah-Hartman @ 2012-07-31 17:51 ` Konrad Rzeszutek Wilk 2012-07-31 18:19 ` Seth Jennings 2012-08-06 0:38 ` Minchan Kim 0 siblings, 2 replies; 41+ messages in thread From: Konrad Rzeszutek Wilk @ 2012-07-31 17:51 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Dan Magenheimer, devel, Seth Jennings, linux-mm, linux-kernel, Minchan Kim, Konrad Rzeszutek Wilk, Andrew Morton, Robert Jennings, Nitin Gupta On Tue, Jul 31, 2012 at 09:19:16AM -0700, Greg Kroah-Hartman wrote: > On Tue, Jul 31, 2012 at 11:58:43AM -0400, Konrad Rzeszutek Wilk wrote: > > So in my head I feel that it is Ok to: > > 1) address the concerns that zcache has before it is unstaged > > 2) rip out the two-engine system with a one-engine system > > (and see how well it behaves) > > 3) sysfs->debugfs as needed > > 4) other things as needed > > > > I think we are getting hung-up what Greg said about adding features > > and the two-engine->one engine could be understood as that. > > While I think that is part of a staging effort to clean up the > > existing issues. Lets see what Greg thinks. > > Greg has no idea, except I want to see the needed fixups happen before > new features get added. Add the new features _after_ it is out of > staging. I think we (that is me, Seth, Minchan, Dan) need to talk to have a good understanding of what each of us thinks are fixups. Would Monday Aug 6th at 1pm EST on irc.freenode.net channel #zcache work for people? > > greg k-h -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging 2012-07-31 17:51 ` Konrad Rzeszutek Wilk @ 2012-07-31 18:19 ` Seth Jennings 2012-08-06 0:38 ` Minchan Kim 1 sibling, 0 replies; 41+ messages in thread From: Seth Jennings @ 2012-07-31 18:19 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Greg Kroah-Hartman, Dan Magenheimer, devel, linux-mm, linux-kernel, Minchan Kim, Konrad Rzeszutek Wilk, Andrew Morton, Robert Jennings, Nitin Gupta On 07/31/2012 12:51 PM, Konrad Rzeszutek Wilk wrote: > Would Monday Aug 6th at 1pm EST on irc.freenode.net channel #zcache work > for people? I think this is a great idea! Dan, can you post code as an RFC by tomorrow or Thursday? We (Rob and I) have the Texas Linux Fest starting Friday. We need time to review the code prior to chat so that we can talk about specifics rather than generalities. If that can be done, then we are available for the chat on Monday. Seth -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging 2012-07-31 17:51 ` Konrad Rzeszutek Wilk 2012-07-31 18:19 ` Seth Jennings @ 2012-08-06 0:38 ` Minchan Kim 2012-08-06 15:24 ` Dan Magenheimer 2012-08-07 19:28 ` Konrad Rzeszutek Wilk 1 sibling, 2 replies; 41+ messages in thread From: Minchan Kim @ 2012-08-06 0:38 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Greg Kroah-Hartman, Dan Magenheimer, devel, Seth Jennings, linux-mm, linux-kernel, Konrad Rzeszutek Wilk, Andrew Morton, Robert Jennings, Nitin Gupta Hi Konrad, On Tue, Jul 31, 2012 at 01:51:42PM -0400, Konrad Rzeszutek Wilk wrote: > On Tue, Jul 31, 2012 at 09:19:16AM -0700, Greg Kroah-Hartman wrote: > > On Tue, Jul 31, 2012 at 11:58:43AM -0400, Konrad Rzeszutek Wilk wrote: > > > So in my head I feel that it is Ok to: > > > 1) address the concerns that zcache has before it is unstaged > > > 2) rip out the two-engine system with a one-engine system > > > (and see how well it behaves) > > > 3) sysfs->debugfs as needed > > > 4) other things as needed > > > > > > I think we are getting hung up on what Greg said about adding features > > > and the two-engine->one engine could be understood as that. > > > While I think that is part of a staging effort to clean up the > > > existing issues. Let's see what Greg thinks. > > > > Greg has no idea, except I want to see the needed fixups happen before > > new features get added. Add the new features _after_ it is out of > > staging. > > I think we (that is me, Seth, Minchan, Dan) need to talk to have a good > understanding of what each of us thinks are fixups. > > Would Monday Aug 6th at 1pm EST on irc.freenode.net channel #zcache work > for people? 1pm EST is 2am KST (Korea Standard Time), so it's not good for me. :) I know it's hard to adjust the time for me, so please talk without me. Instead, I will write down my requirements. They are very simple and trivial. 1) Please don't add any new features, like replacing zsmalloc with zbud.
It's totally untested, so it needs more time to stabilize from a bug, performance, and fragmentation point of view. 2) Factor out the common code between zcache and ramster. This should be pure cleanup and should not change current behavior. 3) Add lots of comments to public functions. 4) Make function/variable names clearer. These are necessary for promotion; after promotion, let's talk about great new features. > > > > > greg k-h > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH 0/4] promote zcache from staging 2012-08-06 0:38 ` Minchan Kim @ 2012-08-06 15:24 ` Dan Magenheimer 2012-08-06 15:47 ` Pekka Enberg 2012-08-07 19:28 ` Konrad Rzeszutek Wilk 1 sibling, 1 reply; 41+ messages in thread From: Dan Magenheimer @ 2012-08-06 15:24 UTC (permalink / raw) To: Minchan Kim, Konrad Wilk Cc: Greg Kroah-Hartman, devel, Seth Jennings, linux-mm, linux-kernel, Konrad Rzeszutek Wilk, Andrew Morton, Robert Jennings, Nitin Gupta > > I think we (that is me, Seth, Minchan, Dan) need to talk to have a good > > understanding of what each of us thinks are fixups. > > > > Would Monday Aug 6th at 1pm EST on irc.freenode.net channel #zcache work > > for people? > > 1pm EST is 2am KST(Korea Standard Time) so it's not good for me. :) > I know it's hard to adjust my time for yours so let you talk without > me. Instead, I will write it down my requirement. It's very simple and > trivial. > > 1) Please don't add any new feature like replace zsmalloc with zbud. > It's totally untested so it needs more time for stable POV bug, > or performance/fragementation. > > 2) Factor out common code between zcache and ramster. It should be just > clean up code and should not change current behavior. > > 3) Add lots of comment to public functions > > 4) make function/varabiel names more clearly. > > They are necessary for promotion and after promotion, > let's talk about new great features. Hi Minchan -- I hope you had a great vacation! Since we won't be able to discuss this by phone/irc, I guess I need to reply here. Let me first restate my opinion as author of zcache. The zcache in staging is really a "demo" version. It was written 21 months ago (and went into staging 16 months ago) primarily to show, at Andrew Morton's suggestion, that frontswap and cleancache had value in a normal standalone kernel (i.e. virtualization not required). When posted in early 2011 zcache was known to have some fundamental flaws in the design... that's why it went into "staging". 
The "demo" version in staging still has those flaws and the change from xvmalloc to zsmalloc makes one of those flaws worse. These design flaws are now fixed in the new code base I posted last week AND the new code base has improved factoring, comments and the code is properly re-merged with the zcache "fork" in ramster. We are not talking about new "features"... I have tried to back out the new features from the new code base already posted and will post them separately. So I think we have four choices: A) Try to promote zcache as is. (Seth's proposal) B) Clean up zcache with no new functionality. (Minchan's proposal above) C) New code base (in mm/tmem/) after review. (Dan's proposal) D) New code base but retrofit as a series of patches (Konrad's suggestion) Minchan, if we go with your proposal (B) are you volunteering to do the work? And if you do, doesn't it have the same issue that it is also totally untested? And, since (B) doesn't solve the fundamental design issues, are you volunteering to fix those next? And, in the meantime, doesn't this mean we have THREE versions of zcache? IMHO, the fastest way to get the best zcache into the kernel and to distros and users is to throw away the "demo" version, move forward to a new solid well-designed zcache code base, and work together to build on it. There's still a lot to do so I hope we can work together. Thanks, Dan -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging 2012-08-06 15:24 ` Dan Magenheimer @ 2012-08-06 15:47 ` Pekka Enberg 2012-08-06 16:21 ` Dan Magenheimer 0 siblings, 1 reply; 41+ messages in thread From: Pekka Enberg @ 2012-08-06 15:47 UTC (permalink / raw) To: Dan Magenheimer Cc: Minchan Kim, Konrad Wilk, Greg Kroah-Hartman, devel, Seth Jennings, linux-mm, linux-kernel, Konrad Rzeszutek Wilk, Andrew Morton, Robert Jennings, Nitin Gupta On Mon, Aug 6, 2012 at 6:24 PM, Dan Magenheimer <dan.magenheimer@oracle.com> wrote: > IMHO, the fastest way to get the best zcache into the kernel and > to distros and users is to throw away the "demo" version, move forward > to a new solid well-designed zcache code base, and work together to > build on it. There's still a lot to do so I hope we can work together. I'm not convinced it's the _fastest way_. You're effectively invalidating all the work done under drivers/staging, so you might end up in review limbo with your shiny new code... AFAICT, your best bet is to first clean up zcache under drivers/staging and get that promoted under mm/zcache.c. You can then move on to the more controversial ramster and figure out where to put the clustering code, etc. Pekka -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH 0/4] promote zcache from staging 2012-08-06 15:47 ` Pekka Enberg @ 2012-08-06 16:21 ` Dan Magenheimer 2012-08-06 16:29 ` Greg Kroah-Hartman 2012-08-07 0:44 ` Minchan Kim 0 siblings, 2 replies; 41+ messages in thread From: Dan Magenheimer @ 2012-08-06 16:21 UTC (permalink / raw) To: Pekka Enberg Cc: Minchan Kim, Konrad Wilk, Greg Kroah-Hartman, devel, Seth Jennings, linux-mm, linux-kernel, Konrad Rzeszutek Wilk, Andrew Morton, Robert Jennings, Nitin Gupta > From: Pekka Enberg [mailto:penberg@kernel.org] > Subject: Re: [PATCH 0/4] promote zcache from staging > > On Mon, Aug 6, 2012 at 6:24 PM, Dan Magenheimer > <dan.magenheimer@oracle.com> wrote: > > IMHO, the fastest way to get the best zcache into the kernel and > > to distros and users is to throw away the "demo" version, move forward > > to a new solid well-designed zcache code base, and work together to > > build on it. There's still a lot to do so I hope we can work together. > > I'm not convinced it's the _fastest way_. <grin> I guess I meant "optimal", combining "fast" and "best". > You're effectively > invalidating all the work done under drivers/staging so you might end up > in review limbo with your shiny new code... Fixing the fundamental design flaws will sooner or later invalidate most (or all) of the previous testing/work anyway, won't it? Since any kernel built with staging is "tainted" already, I feel like now is a better time to make a major design transition. I suppose: (E) replace "demo" zcache with new code base and keep it in staging for another cycle is another alternative, but I think gregkh has said no to that. Dan -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging 2012-08-06 16:21 ` Dan Magenheimer @ 2012-08-06 16:29 ` Greg Kroah-Hartman 2012-08-06 16:38 ` Dan Magenheimer 2012-08-07 0:44 ` Minchan Kim 1 sibling, 1 reply; 41+ messages in thread From: Greg Kroah-Hartman @ 2012-08-06 16:29 UTC (permalink / raw) To: Dan Magenheimer Cc: Pekka Enberg, Minchan Kim, Konrad Wilk, devel, Seth Jennings, linux-mm, linux-kernel, Konrad Rzeszutek Wilk, Andrew Morton, Robert Jennings, Nitin Gupta On Mon, Aug 06, 2012 at 09:21:22AM -0700, Dan Magenheimer wrote: > I suppose: > > (E) replace "demo" zcache with new code base and keep it > in staging for another cycle > > is another alternative, but I think gregkh has said no to that. No I have not. If you all feel that the existing code needs to be dropped and replaced with a totally new version, that's fine with me. It's forward progress, which is all that I ask for. Hope this helps, greg k-h -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH 0/4] promote zcache from staging 2012-08-06 16:29 ` Greg Kroah-Hartman @ 2012-08-06 16:38 ` Dan Magenheimer 0 siblings, 0 replies; 41+ messages in thread From: Dan Magenheimer @ 2012-08-06 16:38 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Pekka Enberg, Minchan Kim, Konrad Wilk, devel, Seth Jennings, linux-mm, linux-kernel, Konrad Rzeszutek Wilk, Andrew Morton, Robert Jennings, Nitin Gupta > From: Greg Kroah-Hartman [mailto:gregkh@linuxfoundation.org] > Subject: Re: [PATCH 0/4] promote zcache from staging > > On Mon, Aug 06, 2012 at 09:21:22AM -0700, Dan Magenheimer wrote: > > I suppose: > > > > (E) replace "demo" zcache with new code base and keep it > > in staging for another cycle > > > > is another alternative, but I think gregkh has said no to that. > > No I have not. If you all feel that the existing code needs to be > dropped and replaced with a totally new version, that's fine with me. > It's forward progress, which is all that I ask for. > > Hope this helps, > greg k-h Hi Greg -- Cool! I guess I mistakenly assumed that your "no new features" requirement also implied "no fixes of fundamental design flaws". :-) Having option (E) should make it easier to decide the best technical solution, separate from the promotion timing and "where does it land" question. We'll get back to you soon... Thanks! Dan -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging 2012-08-06 16:21 ` Dan Magenheimer 2012-08-06 16:29 ` Greg Kroah-Hartman @ 2012-08-07 0:44 ` Minchan Kim 1 sibling, 0 replies; 41+ messages in thread From: Minchan Kim @ 2012-08-07 0:44 UTC (permalink / raw) To: Dan Magenheimer Cc: Pekka Enberg, Konrad Wilk, Greg Kroah-Hartman, devel, Seth Jennings, linux-mm, linux-kernel, Konrad Rzeszutek Wilk, Andrew Morton, Robert Jennings, Nitin Gupta Hi Dan, On Mon, Aug 06, 2012 at 09:21:22AM -0700, Dan Magenheimer wrote: > > From: Pekka Enberg [mailto:penberg@kernel.org] > > Subject: Re: [PATCH 0/4] promote zcache from staging > > > > On Mon, Aug 6, 2012 at 6:24 PM, Dan Magenheimer > > <dan.magenheimer@oracle.com> wrote: > > > IMHO, the fastest way to get the best zcache into the kernel and > > > to distros and users is to throw away the "demo" version, move forward > > > to a new solid well-designed zcache code base, and work together to > > > build on it. There's still a lot to do so I hope we can work together. > > > > I'm not convinced it's the _fastest way_. > > <grin> I guess I meant "optimal", combining "fast" and "best". > > > You're effectively > > invalidating all the work done under drivers/staging so you might end up > > in review limbo with your shiny new code... > > Fixing the fundamental design flaws will sooner or later invalidate > most (or all) of the previous testing/work anyway, won't it? Since > any kernel built with staging is "tainted" already, I feel like now > is a better time to make a major design transition. > > I suppose: > > (E) replace "demo" zcache with new code base and keep it > in staging for another cycle I go for (E). Please send your refactoring code as formal patch. Thanks. -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging 2012-08-06 0:38 ` Minchan Kim 2012-08-06 15:24 ` Dan Magenheimer @ 2012-08-07 19:28 ` Konrad Rzeszutek Wilk 1 sibling, 0 replies; 41+ messages in thread From: Konrad Rzeszutek Wilk @ 2012-08-07 19:28 UTC (permalink / raw) To: Minchan Kim Cc: Greg Kroah-Hartman, Dan Magenheimer, devel, Seth Jennings, linux-mm, linux-kernel, Konrad Rzeszutek Wilk, Andrew Morton, Robert Jennings, Nitin Gupta [-- Attachment #1: Type: text/plain, Size: 2962 bytes --] On Mon, Aug 06, 2012 at 09:38:16AM +0900, Minchan Kim wrote: > Hi Konrad, > > On Tue, Jul 31, 2012 at 01:51:42PM -0400, Konrad Rzeszutek Wilk wrote: > > On Tue, Jul 31, 2012 at 09:19:16AM -0700, Greg Kroah-Hartman wrote: > > > On Tue, Jul 31, 2012 at 11:58:43AM -0400, Konrad Rzeszutek Wilk wrote: > > > > So in my head I feel that it is Ok to: > > > > 1) address the concerns that zcache has before it is unstaged > > > > 2) rip out the two-engine system with a one-engine system > > > > (and see how well it behaves) > > > > 3) sysfs->debugfs as needed > > > > 4) other things as needed > > > > > > > > I think we are getting hung-up what Greg said about adding features > > > > and the two-engine->one engine could be understood as that. > > > > While I think that is part of a staging effort to clean up the > > > > existing issues. Lets see what Greg thinks. > > > > > > Greg has no idea, except I want to see the needed fixups happen before > > > new features get added. Add the new features _after_ it is out of > > > staging. > > > > I think we (that is me, Seth, Minchan, Dan) need to talk to have a good > > understanding of what each of us thinks are fixups. > > > > Would Monday Aug 6th at 1pm EST on irc.freenode.net channel #zcache work > > for people? > > 1pm EST is 2am KST(Korea Standard Time) so it's not good for me. :) > I know it's hard to adjust my time for yours so let you talk without > me. Instead, I will write it down my requirement. It's very simple and > trivial. 
OK, thank you. We had a lengthy chat (full chat log attached). The summary is that we all want to promote zcache (for various reasons), but we got hung up on whether we are OK unstaging it when it lowers I/O but potentially does not give a large performance increase (when doing 'make -jN'), or whether we want both of those characteristics in.

A little history: v3.3 had swap readahead patches that dramatically decreased the number of pages going to swap - as such the performance of zcache is no longer amazing, but OK. Seth and Robert (and I surmise Minchan too) are very interested in zcache as it lowers the amount of I/O, with performance being secondary. Dan is interested in having less I/O and providing a performance boost with such workloads as 'make -jN' - in short, less I/O and better performance. Dan would like both before unstaging.

The action items that came out are:
- Seth posted some benchmarks - he is going to rerun them with v3.5 to see how it behaves in terms of performance (make -jN benchmark).
- Robert is going to take a swing at Minchan's refactor, adding comments, etc. (but after we get over the hump of agreeing on the next step).
- Konrad to rummage in his mbox to find any other technical objections that were raised on zcache earlier, to make sure they are addressed.
- Once Seth is finished, Konrad is going to take another swing at driving this discussion - either via email, IRC or a conference call.

[-- Attachment #2: zcache-Aug6.log --] [-- Type: text/plain, Size: 24123 bytes --]

Aug 06 11:38:41 * Now talking on #zcache
Aug 06 11:38:41 * cameron.freenode.net sets mode +n #zcache
Aug 06 11:38:41 * cameron.freenode.net sets mode +s #zcache
Aug 06 12:38:15 * sjennings (~sjennings@2001:470:1f0f:87d:e46c:7cd1:4974:640b) has joined #zcache
Aug 06 12:49:12 * rcj (~rcjenn@32.97.110.59) has joined #zcache
Aug 06 12:50:26 * djm1 (~djm1021@inet-hqmc03-o.oracle.com) has joined #zcache
Aug 06 12:50:29 <rcj> Good morning Konrad, thanks for suggesting this.
Aug 06 12:50:32 * darnok (~konrad@209-6-85-33.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) has joined #zcache Aug 06 12:50:47 <djm1> hi all, first time user for freenode so I had to register, sorry to be late Aug 06 12:50:53 <djm1> (this is Dan Magenheimer) Aug 06 12:51:17 <darnok> djm1, I think you can be just an unregister user? Aug 06 12:51:36 <konrad> rcj, Sure thinkg Aug 06 12:51:51 * darnok goes away now that the proper Konrad is in place Aug 06 12:52:08 <djm1> konrad/darnok: ok, we can deal with that later, for now, "call me djm1" Aug 06 12:52:22 <konrad> Sure. So I think we are all here right? Minchan couldn't make it Aug 06 12:52:38 <sjennings> i don't know of anyone else that was planning to come **** ENDING LOGGING AT Mon Aug 6 12:52:39 2012 **** BEGIN LOGGING AT Mon Aug 6 12:52:39 2012 Aug 06 12:52:46 <rcj> Sounds good Aug 06 12:52:59 * konrad nods. Aug 06 12:53:18 <konrad> Let me first start by saying thank you for being able to make it. Aug 06 12:53:42 <konrad> Sometimes it takes a whole bunch of iterations to setup a call but this worked out fine. Aug 06 12:53:46 * darnok has quit (Read error: Operation timed out) Aug 06 12:54:10 <konrad> And also for the interest in zcache/tmem/frontswap/cleanpage/future ideas. Aug 06 12:55:11 <konrad> I think we all want all the pieces completely out of staging (irregardlesss of how internally each company uses the technologies - we have been shipping it with our kernel - UEK2) Aug 06 12:55:55 <rcj> We do want it out of staging, for us we can't use it until it's out so it's a bit more critical in that way. Aug 06 12:55:55 <konrad> But for right now the zcache is on the plate since that can be used in numerous ways. Aug 06 12:56:09 * djm1 makes minor correction... we are shipping frontswap and cleancache and the xen shim in UEK2, not zcache Aug 06 12:56:21 <konrad> djm1, Ah thats right. Aug 06 12:56:33 * konrad puts on his bigger TODO list to address that. 
Aug 06 12:57:07 <djm1> rcj: right, we can't use it while its in staging either Aug 06 12:57:27 <djm1> UEK2 did jump the gun on frontswap though Aug 06 12:57:48 <konrad> rcj, With the 'can't use it' is it more of a .. support type question? Meaning that once it gets its equal to the being almost bug-free? Aug 06 12:58:19 <djm1> I think "officially" anything in staging results in a tainted kernel, though I don't know how that applies to distros Aug 06 12:58:32 <rcj> Trying to parse that last sentence konrad Aug 06 12:58:35 <konrad> rcj, I am trying here to figure out if the "unstaging" part that is critical to you is in terms of code quality or some other criteria. Aug 06 12:59:31 * sashal (~sasha@95-89-78-76-dynip.superkabel.de) has joined #zcache Aug 06 12:59:33 <konrad> rcj, As that leads directly to what is the upmost important in zcache for you guys Aug 06 12:59:41 <sjennings> the distros we work with only accept mainline drivers/code, not staging Aug 06 12:59:42 <rcj> konrad, quality is a different discussion. Quality has to be there. Aug 06 12:59:50 <konrad> Hey Sasha. Let me paste to you on a side-channel what we said so far. Aug 06 12:59:58 <sashal> konrad, thanks Aug 06 13:00:20 <konrad> sashal, Hopefully your client won't ban me for pasting too fast. Aug 06 13:00:32 <djm1> sashal: welcome! Aug 06 13:00:38 * djm1 is dan.magenheimer Aug 06 13:01:05 <sjennings> my interest is using frontswap/cleancache/zcache for in-kernel memory compression. no xen, no kvm Aug 06 13:01:07 <sashal> djm1, hey! Aug 06 13:01:10 <konrad> sjennings, rcj : OK, so its the 'feature' needs to be in the mainline before they will accept turning it on. Somehow I thought staging would be part of it since Fedora has some bits turned for that. Aug 06 13:02:43 <konrad> sjennings, rcj , djm1 : But didn't realize the tainting part of the code? 
Ah yes: 2564 add_taint_module(mod, TAINT_CRAP); Aug 06 13:03:05 <sjennings> yes, that looks bad in the dmesg of enterprise distros Aug 06 13:03:27 * konrad nods Aug 06 13:03:57 <konrad> Then it comes down to: time-line and resources Aug 06 13:04:19 <djm1> konrad: well, yes, and the small matter of technical solutions too ;-) Aug 06 13:04:28 <sjennings> i sent out the promotion patch because we find value in zcache as it is right now Aug 06 13:05:12 <sjennings> it is stable and, as both Dan and I have demostrated, has measurable benefits (I/O reduction and sometime speed) Aug 06 13:05:14 <konrad> sjennings, OK. so the aim for getting it out in v3.6 is .. an optimistic hope not a real I-MUST-GET-IT-NOW type. Aug 06 13:05:18 <rcj> sjennings, and there are other contributors that are in same situation where they are using zcache as it exists. Aug 06 13:06:46 <sjennings> well, it's for 3.7 now, and the promotion patch should either work or tell us what showstoppers remain the prevent promotion Aug 06 13:07:00 <djm1> sjennings: two things... I recall Dave Hansen stating fairly clearly a year ago that zcache was not near suitable for promoting... is he happy? Aug 06 13:07:25 <djm1> second, have you measured zcache/non-zcache results on a recent kernel? Aug 06 13:08:13 <sjennings> depends on how recent your talking about. i posted my zcache performance number on v3.4 i think Aug 06 13:08:30 <sjennings> i consider that recent Aug 06 13:09:35 * djm1 thinks it was 3.2, not sure, but Rik Van Riel made a lot of swap subsystem improvements in the last couple of kernels... 
I suggest rerunning at least a couple of benchmarks (especially something swap heavy) Aug 06 13:10:46 <sjennings> seems like that would only make zcache, in it currently form, more useful Aug 06 13:10:46 <konrad> djm1, Nothing big in v3.4: konrad@phenom:~/ssd/linux$ git log --oneline v3.4..v3.5 mm/swapfile.c Aug 06 13:10:46 <konrad> 9b15b81 swap: fix shmem swapping when more than 8 areas Aug 06 13:10:46 <konrad> a3fe778 Merge tag 'stable/frontswap.v16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/mm Aug 06 13:10:46 <konrad> 4b91355 memcg: fix/change behavior of shared anon at moving task Aug 06 13:10:46 <konrad> bde05d1 shmem: replace page if mapping excludes its zone Aug 06 13:10:46 <konrad> 38b5faf mm: frontswap: core swap subsystem hooks and headers Aug 06 13:12:20 <konrad> djm1, So I think that is OK for v3.4 perf perspective? Aug 06 13:12:36 <konrad> sjennings, These weren't benchmark with other patches on top? Like the WasActive one that Dan came up with? Aug 06 13:13:25 <sjennings> no, i used the mainline + frontswap (before it was mainlined in v3.5) Aug 06 13:13:43 <rcj> djm1, If dhansen has specific issues he'll need to speak for them in this promotion process. Aug 06 13:13:54 <djm1> sjennings: you ran your benchmarks before I was at LSF/MM and I think 3.3 was released that weekend Aug 06 13:14:07 <djm1> https://lkml.org/lkml/2012/4/16/449 Aug 06 13:14:23 <djm1> here's one of the Rik van Riel patches Aug 06 13:16:42 <djm1> sjennings: after rik's patches went in, the "non-zcache" times dropped dramatically Aug 06 13:18:11 <sjennings> zcache's primary benefit for me is reduced I/O Aug 06 13:18:29 <sjennings> just because swap on rotational media is faster now, doesn't make zcache unneeded Aug 06 13:18:57 <sjennings> not all linux machines use rotational swap media (i.e. 
phones) Aug 06 13:18:58 <djm1> sjennings: believe me, I am not arguing that zcache is not needed, just that we now have a lot more work/tuning to do Aug 06 13:19:42 <rcj> djm1, but there will always be interactions that drive work. Aug 06 13:20:48 <rcj> djm1, having zcache in staging or mainline is orthogonal. In fact, in mainline a patch that causes a regression for zcache might get more attention. Aug 06 13:23:15 <djm1> rcj: sorry, I'm not sure I understand your point Aug 06 13:23:59 <djm1> I am suggesting that the benchmarks Seth and I ran and posted, if rerun today on 3.4/3.5 would look horrible Aug 06 13:24:58 <sjennings> how is Rik's patchset going to reduce I/O like zcache does? Aug 06 13:25:05 <djm1> (correction: some of them look horrible) Aug 06 13:25:26 <sjennings> if anything, it increases I/O be doing more aggressive readahead Aug 06 13:25:39 <rcj> djm1, You brought up Rik's patch improving swap performance without zcache. You're saying that zcache needs to improve. That's not a zcache regression. The kernel is improving in the swap area for some cases, that's great. So then we take a TODO to worrk on zcache swap performance or elevate the priority of that work. it's not a blocker. Aug 06 13:25:47 <djm1> sjennings: good question... I don't know for sure... but they dramatically increase performance Aug 06 13:26:34 <djm1> rcj: are you saying that zcache is useful regardless of whether it increases performance? Aug 06 13:27:29 <djm1> (or are you saying you have measured it on non-rotating and it DOES increase performance there, and so you don't care if whether it increases performance on rotating?) Aug 06 13:27:56 <djm1> (measured on 3.4/3.5...) Aug 06 13:27:59 <rcj> djm1, I'm saying that there are other measures as Seth is pointing out. Aug 06 13:28:02 <konrad> djm1, Does it matter? The goal is to have limited amount of I/Os. Aug 06 13:28:55 * sjennings what konrad said Aug 06 13:28:55 <konrad> djm1, well, let me redact that. .. 
and not have horrible performance . Aug 06 13:28:58 <djm1> konrad,rcj: to me, reduced I/O and better performance are directly correlated, unless the bottleneck is the cpu of course Aug 06 13:30:01 <djm1> sjennings, ok, then please rerun your benchmarks on 3.4 or 3.5, and look at whether zcache reduced I/Os vs non-zcache Aug 06 13:31:21 <konrad> djm1, You are thinking is that it does not? Or that it does reduce I/O but the perf overall goes down? Aug 06 13:31:30 <sjennings> i can do that, but those numbers don't negate our point. this change in behavior isn't a bug in zcache Aug 06 13:32:16 <djm1> sjennings: ok, then let's start over from objectives: I think you said you want to promote zcache because it reduces I/O Aug 06 13:33:37 <djm1> I am agreeing that on 3.2 it did, but I have seen on 3.4 performance for non-zcache has improved dramatically, so I believe zcache no longer reduces I/O over non-zcache Aug 06 13:34:53 <sjennings> i'm i correct in understand rik's patch to make swap readahead more agressive? Aug 06 13:35:12 <djm1> it MAY be the case that the number of I/Os has improved dramatically for non-zcache, but the number of pages read/written for zcache is better than non-zcache Aug 06 13:35:15 <sjennings> *am i Aug 06 13:36:30 <djm1> sjennings: yes, that is one of rik's patches, there were several and I didn't look at all of them carefully Aug 06 13:37:47 <sjennings> we need to come back around on this though. we want promotion. what are the absolute showstoppers to prevent this. that is bugs or hard numbers that demonstrate the zcache has no value now (since previous number indictated it did have value) Aug 06 13:38:52 <rcj> sjennings, This doesn't affect cleancache and it doesn't break frontswap, it just improves other kernel code. There will always be work, but I don't see that gating mainline acceptance. Aug 06 13:38:56 <konrad> sjennings, The ones that Minchin identified were cleanups and refactor of code to make it more readable. 
No zcache engine replacement. Aug 06 13:40:01 <rcj> konrad, that sort of patch can go in anytime though. I could send out patchess to clean-up and refactor mainline code. Aug 06 13:40:07 <djm1> rcj: please help me understand... you want to promote zcache because it improved performance on older kernels even if it doesn't on current kernels? Aug 06 13:41:13 <djm1> sjennings: the hard numbers clearly demonstrate that the existing zcache has much less value now Aug 06 13:42:22 <rcj> djm1, I don't see you presenting numbers though, only saying that it "may" affect performance. So send out performance patches. Even if Rik never sent out his patches, could zcache stand performance work? Could any part of the kernel take performance patches? I don't see this as a gate to mainline. Aug 06 13:42:53 <djm1> rcj: please answer my question, you have avoided it Aug 06 13:43:36 <konrad> djm1, Is your thought that the answer to the performance numbers are the zcache2 different engine? Or the different algorithm for dealing with pages? Aug 06 13:43:40 <djm1> it is true I don't wish to shoot myself in the foot by publicly showing that zcache sucks now, but nor do I want to put my sign-off on promoting it while hiding the knowledge that it does Aug 06 13:45:01 <djm1> konrad: yes, I've sent many emails (offlist, and one or two on) about the design flaws of the "demo" zcache... Aug 06 13:45:55 <konrad> djm1, So what you are saying is that zcache2 by itself , if it was in v3.3 would show phenomal numbers, not just good. Aug 06 13:46:25 <rcj> djm1, I can't answer general questions unless we're talking real numbers. The code is in public and the development should be as well. Aug 06 13:46:26 <konrad> djm1, And when measuring it with v3.5, it shows good respectible numbers. 
Aug 06 13:46:45 <djm1> konrad: sadly no, my hand was forced to release the current status Aug 06 13:47:30 <konrad> djm1, So what you are saying is that you want to continue on working on thsi until you do get to that point Aug 06 13:47:53 <konrad> djm1, But that might take longer than a couple of releases and one might as well just drop in the new code. Aug 06 13:47:58 <djm1> rcj: ok, let me give you two numbers off the top of my head... Seth's published measurement 1700 seconds on zcache, 2000 non-zcache. Aug 06 13:48:15 <djm1> New measurement on non-zcache 3.5: 800 seconds Aug 06 13:48:46 <sjennings> did you use my machine? Aug 06 13:48:56 <sjennings> they aren't comparable Aug 06 13:49:13 <sjennings> plus i'm talking about I/O Aug 06 13:49:29 <djm1> sjennings: I am inviting you to use your machine and re-measure, or to admit you don't care what the new numbers are Aug 06 13:50:08 * konrad sighs. Aug 06 13:50:33 <konrad> I was hoping to have a clear define ACTION LISTs out of this and it seems we have at least two: Aug 06 13:50:44 <konrad> 1). I could send out patchess to clean-up and refactor mainline code Aug 06 13:50:59 <konrad> 2). Run existing benchmarks against v3.5 using zcache/non-zcache Aug 06 13:51:42 <sjennings> doing 1 won't resolve djm1 objections though Aug 06 13:52:35 <konrad> djm1, So Dan, if the numbers on sjennings show no perf degradation, or his tests on v3.5 show good numbers, I believe you would be OK with the current zcache code right? Aug 06 13:52:56 <konrad> sjennings, True. Just going through the action items to jot them down. 
Aug 06 13:54:36 <djm1> konrad: if sjennings runs the same benchmarks as he previously published on 3.5 and zcache performance always beats non-zcache performance, that is a big step in the right direction Aug 06 13:55:12 <djm1> konrad: but knowing the design flaws in zcache I can fairly easily find benchmarks for which it is not the case Aug 06 13:55:20 <konrad> djm1, If they show horrible perf that is tried in with the generic swap system... then what? Your drop-in does not necessarily fixes it, and it might be a swap system issue. Which is why rcj proposoal to move with the unstaging as soon as possible so that we aren't behind with this is the right thing Aug 06 13:55:52 <konrad> djm1, OK, but some benchmarks are so synthethic that they bear no resemblence to real world scenario. Aug 06 13:56:58 <konrad> djm1, But the design flaws discussion is something that can be addressed later too. Aug 06 13:58:17 <rcj> "Lies, damn lies, and statistics", right? Seth was talking about IOPs being reduced and Dan's talking about # seconds to run kern-bench which are different, you can always find some benchmark that falls down. But that doesn't mean there isn't some value still. zcache is not a panacea but does have use-cases. Aug 06 13:58:41 <konrad> rcj: Heh Aug 06 13:59:19 <rcj> And with measurements you can understand the problem and address it. But the question about promotion isn't necessarily resolved that way. Aug 06 13:59:58 * rcj re-reads the second sentence and needs to restate... Aug 06 14:00:26 <djm1> rcj: please re-run the same benchmarks as sjennings ran in March, measure all three: seconds, pages of I/O and number of I/Os and then let's discuss further Aug 06 14:01:27 <sjennings> djm1, regarding the step in the right direction comment, i need to know what evidence would be required for you to drop your objection to promotion before i go through all the trouble of producing that evidence. 
Aug 06 14:01:53 <sjennings> if i can show that zcache reduces I/O during a kernel build with a 3.5 kernel, is that enouhg Aug 06 14:02:34 <djm1> sjennings: across the same range of -jN? Aug 06 14:02:39 <sjennings> sure Aug 06 14:03:16 <djm1> number of pages of I/O or number of I/Os? (and I ask that for a very specific reason) Aug 06 14:03:50 <konrad> djm1, So I am curious. If this discussion was done in v3.3 day-time - would the issue of the design of zcache be on the table? Aug 06 14:04:04 <konrad> djm1, Or the rewrite or the benchmarks? Aug 06 14:05:04 <djm1> konrad: good question... yes, we had this discussion offlist prior to v3.3 (and I think you were cc'ed) Aug 06 14:05:35 <djm1> at that point, zcache was "good" and I was trying to make it "better", now I fear that it sucks and I am trying to make it at least "good" Aug 06 14:07:19 <djm1> rcj, sjennings: I have worked in a big company for 30 years, I understand you have constraints, and I can only guess what is driving this Aug 06 14:07:51 <djm1> I suspect that IBM has a contractual obligation to deliver this in some embedded system which uses SSD/NVRAM Aug 06 14:08:31 <sjennings> djm1, our motivation is not relevant Aug 06 14:08:41 <rcj> sjennings, and the community doesn't care. Aug 06 14:08:44 <djm1> so to answer sjennings question, if zcache is truly going to be used successfully by real users, no, I don't think I want to get in the way of that Aug 06 14:08:50 <sjennings> this code is of value to use and others Aug 06 14:09:03 <sjennings> s/use/us Aug 06 14:10:00 <sjennings> and improvements can continue to be made Aug 06 14:10:09 <djm1> sjennings: restating that and ignoring what I am telling you and refusing to measure it for yourself doesn't help Aug 06 14:11:21 <djm1> rcj: the community most certainly does care... 
had Seth and I never published *amazing* zcache numbers (compared to native), we wouldn't be having this conversation Aug 06 14:12:03 <rcj> djm1, I was just saying that the community doesn't care about motivations Aug 06 14:12:10 <djm1> by promoting zcache based on those numbers, it makes me feel disingenous, if not just dirty Aug 06 14:12:18 <konrad> djm1, I think you made your point. You would like Seth to do some benchmarking to verify the perf numbers. Aug 06 14:12:35 <konrad> djm1, And if they do not work right, then address those before promoting it. Aug 06 14:13:34 <sjennings> but we don't agree on what "work right" is Aug 06 14:13:43 <sjennings> how are we defining that Aug 06 14:14:02 <sjennings> i define it in reduced I/O (even if runtime is longer) Aug 06 14:15:21 <djm1> which is why motivation is relevant Aug 06 14:16:04 <konrad> djm1, motivation==use case I think? Aug 06 14:16:09 <djm1> if this is the classic battle: "which is more important? enterprise or embedded" then let's call it that Aug 06 14:16:55 <djm1> konrad: yes, but "my use case runs faster so I don't care if yours run slower, just don't use it" Aug 06 14:17:57 <konrad> djm1, So your concern is with .. I can't seem to find in the log here Aug 06 14:18:22 <djm1> sjennings, rcj: so IMHO zram is the right match for you then... what is it about zcache that solves your problem better than zram? Aug 06 14:19:42 <konrad> djm1, Ah, speed. Workloads running faster. Aug 06 14:23:19 * djm1 goes afk Aug 06 14:26:03 <konrad> djm1, Um, hope you won't be long afk Aug 06 14:27:02 <sjennings> i'll post numbers on the v3.5 kernel Aug 06 14:27:20 <konrad> sjennings, OK. Aug 06 14:27:31 <konrad> sjennings, Thank you. Aug 06 14:27:45 * sashal has quit (Excess Flood) Aug 06 14:28:05 * sashal (~sasha@95-89-78-76-dynip.superkabel.de) has joined #zcache Aug 06 14:28:21 <konrad> sashal, Last thing said was: "<sjennings> i'll post numbers on the v3.5 kernel Aug 06 14:28:21 <konrad> <konrad> sjennings, OK. 
Aug 06 14:28:21 <konrad> <konrad> sjennings, Thank you." Aug 06 14:29:00 <konrad> sjennings, rcj , djm1 : I thought this would be only an hour but it went for an hour and half. Yikes. Aug 06 14:29:21 <konrad> sjennings, rcj : I hope it doesn't negatively impact your guys schedule for today. Aug 06 14:30:08 <konrad> sjennings, rcj djm1 :I think it makes sense to one more discussion about this after Seth has the opportunity to run the numbers. Aug 06 14:30:32 <konrad> sjennings, rcj djm1 sashal : Would thsi time work next week? Would a conference call be better than IRC? Aug 06 14:30:57 <rcj> konrad, IRC is better, this needs to be completely public. On-list would be better. Aug 06 14:31:31 <rcj> konrad, I just don't want to see barriers to people being involved in a discussion. Aug 06 14:31:55 <konrad> rcj, Good point. .. I am thinking of more of removing some of the barriers by having folks in one location so to speak. Aug 06 14:32:27 <rcj> konrad, I understand completely. Just trying to find a good balance. Aug 06 14:33:27 <konrad> sjennings, djm1, rcj, So let me write a summary of this and also save the full log as an attachment. Aug 06 14:33:30 <rcj> konrad, I'm still worried by the premise of the objection. That a non-zcache patch has improved performance somewhere else in the kernel and this poses a barrier to mainline inclusion. Aug 06 14:34:52 <konrad> rcj, Right. Thought not for embedded side. The things that Andrew Morton pointed out is that he would like zcache be enabled by default. Aug 06 14:36:07 <konrad> rcj, And Dan is trying to make that happen. But if the enablement of zcache does not benefit on nowadays desktops..Then it should not be enabled by default. Aug 06 14:36:37 <konrad> rcj, And then it comes back to Andrew having a potential issue with zcache. So Dan is trying to make sure that this issue will not surface. Aug 06 14:37:04 <konrad> rcj, But the other side of the coin is that if its used in the embedded side then it does not matter. 
Aug 06 14:37:51 <konrad> rcj, Which I believe is what you guys are coming from. Aug 06 14:38:58 <rcj> konrad, We don't know the other show-stoppers because the conversation got killed pretty early on-list. Aug 06 14:40:37 <konrad> rcj, Ugh. OK. Aug 06 14:41:32 <konrad> rcj, Let me see what I have in the mbox and if I can compile to something similar to what Seth did for frontswa Aug 06 14:42:08 <konrad> Well, I have to head out, other things are showing up in my calendar Aug 06 14:42:48 <rcj> konrad, thanks. Aug 06 14:43:13 <konrad> sjennings, rcj djm1 sashal : Gotta go. I think the two TODOs are: 1). Seth run v3.5 w/ zcache and w/o zcache; 2) Konrad to rummage in his mbox to find earlier objections to zcache Aug 06 14:43:27 * djm1 returnsm apparently too late Aug 06 14:43:54 <rcj> konrad, sounds good. Aug 06 14:44:00 <konrad> sjennings, rcj djm1 sashal : 3) Propose to continue this on mailing list for the more broader topics, while the more fine one - on IRC. Aug 06 14:44:17 <konrad> .. and also get Minchan involved here. Aug 06 14:44:44 * konrad is away Aug 06 15:01:14 * sashal has quit (Ping timeout: 240 seconds) Aug 06 15:42:06 * sashal (~sasha@95-89-78-76-dynip.superkabel.de) has joined #zcache Aug 06 16:20:00 * rcj has quit (Quit: rcj) Aug 06 17:03:01 * sjennings has quit (Quit: Leaving) Aug 07 12:17:29 * Disconnected (Connection reset by peer). ^ permalink raw reply [flat|nested] 41+ messages in thread
* [PATCH 0/4] promote zcache from staging @ 2012-07-27 18:18 Seth Jennings 2012-07-27 18:18 ` [PATCH 1/4] zsmalloc: collapse internal .h into .c Seth Jennings ` (6 more replies) 0 siblings, 7 replies; 41+ messages in thread
From: Seth Jennings @ 2012-07-27 18:18 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Seth Jennings, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Rzeszutek Wilk, Dan Magenheimer, Robert Jennings, linux-mm, linux-kernel, devel

zcache is the remaining piece of code required to support in-kernel memory compression. The other two features, cleancache and frontswap, have been promoted to mainline in 3.0 and 3.5. This patchset promotes zcache from the staging tree to mainline.

Based on the level of activity and contributions we're seeing from a diverse set of people and interests, I think zcache has matured to the point where it makes sense to promote it out of staging.

Overview
========

zcache is a backend to frontswap and cleancache that accepts pages from those mechanisms and compresses them, leading to reduced I/O caused by swap and file re-reads. This is very valuable in shared-storage situations to reduce load on things like SANs. In the case of slow backing/swap devices, zcache can also yield a performance gain.

In-Kernel Memory Compression Overview:

   swap subsystem            page cache
        +                        +
    frontswap                cleancache
        +                        +
 zcache frontswap glue  zcache cleancache glue
        +                        +
        +---------+------------+
                  +
          zcache/tmem core
                  +
        +---------+------------+
        +                        +
    zsmalloc                   zbud

Everything below the frontswap/cleancache layer is currently inside the zcache driver except for zsmalloc, which is shared between zcache and another memory compression driver, zram. Since zcache depends on zsmalloc, it is also being promoted by this patchset.
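[Editor's illustration] The trade-off zcache makes can be shown with a toy user-space model (Python purely for illustration; the real zcache is kernel C, and every name below - ToyFrontswapCache, store, load - is invented, not the zcache API). Like zcache, it compresses pages handed to it and rejects pages that compress poorly; a rejected store falls through to the normal swap path, which is why zcache can reduce I/O without ever being required for correctness:

```python
import os
import zlib

class ToyFrontswapCache:
    """Toy model of a zcache-style compressed page store (illustration only)."""
    PAGE_SIZE = 4096

    def __init__(self):
        self.pool = {}      # (type, offset) -> compressed bytes kept in RAM
        self.rejected = 0   # pages that compressed poorly and went to disk

    def store(self, key, page):
        assert len(page) == self.PAGE_SIZE
        zdata = zlib.compress(page)
        # Refuse pages that don't compress well enough to be worth
        # keeping in RAM; the caller then does real swap I/O instead.
        if len(zdata) > self.PAGE_SIZE // 2:
            self.rejected += 1
            return False
        self.pool[key] = zdata
        return True

    def load(self, key):
        zdata = self.pool.get(key)
        return zlib.decompress(zdata) if zdata is not None else None

cache = ToyFrontswapCache()
compressible = b"abcd" * 1024        # repetitive data -> compresses well
random_page = os.urandom(4096)       # incompressible -> rejected

assert cache.store(("swap", 0), compressible)
assert not cache.store(("swap", 1), random_page)
assert cache.load(("swap", 0)) == compressible
assert cache.load(("swap", 1)) is None   # would come from the swap device
```

The rejection threshold is the knob behind much of the performance debate later in this thread: every stored page trades CPU time for avoided I/O, so whether that is a win depends on the workload and the backing device.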
For information on zsmalloc and the rationale behind its design and use cases versus already existing allocators in the kernel, see:

https://lkml.org/lkml/2012/1/9/386

zsmalloc is the allocator used by zcache to store persistent pages that come from frontswap, as opposed to zbud, which is the (internal) allocator used for ephemeral pages from cleancache.

zsmalloc uses many fields of the page struct to create its conceptual high-order page called a zspage. Exactly which fields are used and for what purpose is documented at the top of the zsmalloc .c file. Because zsmalloc uses struct page extensively, Andrew advised that the promotion location be mm/:

https://lkml.org/lkml/2012/1/20/308

Some benchmarking numbers demonstrating the I/O savings that can be had with zcache:

https://lkml.org/lkml/2012/3/22/383

Dan's presentation at LSF/MM this year on zcache:

http://oss.oracle.com/projects/tmem/dist/documentation/presentations/LSFMM12-zcache-final.pdf

This patchset is based on next-20120727 plus the 3-part zsmalloc patchset below:

https://lkml.org/lkml/2012/7/18/353

The zsmalloc patchset is already acked and will be integrated by Greg after 3.6-rc1 is out.
Seth Jennings (4):
  zsmalloc: collapse internal .h into .c
  zsmalloc: promote to mm/
  drivers: add memory management driver class
  zcache: promote to drivers/mm/

 drivers/Kconfig                                    |   2 +
 drivers/Makefile                                   |   1 +
 drivers/mm/Kconfig                                 |  13 ++
 drivers/mm/Makefile                                |   1 +
 drivers/{staging => mm}/zcache/Makefile            |   0
 drivers/{staging => mm}/zcache/tmem.c              |   0
 drivers/{staging => mm}/zcache/tmem.h              |   0
 drivers/{staging => mm}/zcache/zcache-main.c       |   4 +-
 drivers/staging/Kconfig                            |   4 -
 drivers/staging/Makefile                           |   2 -
 drivers/staging/zcache/Kconfig                     |  11 --
 drivers/staging/zram/zram_drv.h                    |   3 +-
 drivers/staging/zsmalloc/Kconfig                   |  10 --
 drivers/staging/zsmalloc/Makefile                  |   3 -
 drivers/staging/zsmalloc/zsmalloc_int.h            | 149 --------------------
 .../staging/zsmalloc => include/linux}/zsmalloc.h  |   0
 mm/Kconfig                                         |  18 +++
 mm/Makefile                                        |   1 +
 .../zsmalloc/zsmalloc-main.c => mm/zsmalloc.c      | 133 ++++++++++++++++-
 19 files changed, 170 insertions(+), 185 deletions(-)
 create mode 100644 drivers/mm/Kconfig
 create mode 100644 drivers/mm/Makefile
 rename drivers/{staging => mm}/zcache/Makefile (100%)
 rename drivers/{staging => mm}/zcache/tmem.c (100%)
 rename drivers/{staging => mm}/zcache/tmem.h (100%)
 rename drivers/{staging => mm}/zcache/zcache-main.c (99%)
 delete mode 100644 drivers/staging/zcache/Kconfig
 delete mode 100644 drivers/staging/zsmalloc/Kconfig
 delete mode 100644 drivers/staging/zsmalloc/Makefile
 delete mode 100644 drivers/staging/zsmalloc/zsmalloc_int.h
 rename {drivers/staging/zsmalloc => include/linux}/zsmalloc.h (100%)
 rename drivers/staging/zsmalloc/zsmalloc-main.c => mm/zsmalloc.c (86%)

--
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 41+ messages in thread
* [PATCH 1/4] zsmalloc: collapse internal .h into .c
2012-07-27 18:18 Seth Jennings @ 2012-07-27 18:18 ` Seth Jennings
2012-07-27 18:18 ` [PATCH 2/4] zsmalloc: promote to mm/ Seth Jennings
` (5 subsequent siblings)
6 siblings, 0 replies; 41+ messages in thread
From: Seth Jennings @ 2012-07-27 18:18 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Seth Jennings, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Rzeszutek Wilk, Dan Magenheimer, Robert Jennings, linux-mm, linux-kernel, devel

This patch collapses the internal zsmalloc_int.h into the zsmalloc-main.c file. This is done in preparation for the promotion to mm/, where separate internal headers are discouraged.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
--- drivers/staging/zsmalloc/zsmalloc-main.c | 132 +++++++++++++++++++++++++- drivers/staging/zsmalloc/zsmalloc_int.h | 149 ------------------------------ 2 files changed, 131 insertions(+), 150 deletions(-) delete mode 100644 drivers/staging/zsmalloc/zsmalloc_int.h diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c index defe350..09a9d35 100644 --- a/drivers/staging/zsmalloc/zsmalloc-main.c +++ b/drivers/staging/zsmalloc/zsmalloc-main.c @@ -76,9 +76,139 @@ #include <linux/cpu.h> #include <linux/vmalloc.h> #include <linux/hardirq.h> +#include <linux/spinlock.h> +#include <linux/types.h> #include "zsmalloc.h" -#include "zsmalloc_int.h" + +/* + * This must be power of 2 and greater than of equal to sizeof(link_free). + * These two conditions ensure that any 'struct link_free' itself doesn't + * span more than 1 page which avoids complex case of mapping 2 pages simply + * to restore link_free pointer values. + */ +#define ZS_ALIGN 8 + +/* + * A single 'zspage' is composed of up to 2^N discontiguous 0-order (single) + * pages. ZS_MAX_ZSPAGE_ORDER defines upper limit on N.
+ */ +#define ZS_MAX_ZSPAGE_ORDER 2 +#define ZS_MAX_PAGES_PER_ZSPAGE (_AC(1, UL) << ZS_MAX_ZSPAGE_ORDER) + +/* + * Object location (<PFN>, <obj_idx>) is encoded as + * as single (void *) handle value. + * + * Note that object index <obj_idx> is relative to system + * page <PFN> it is stored in, so for each sub-page belonging + * to a zspage, obj_idx starts with 0. + * + * This is made more complicated by various memory models and PAE. + */ + +#ifndef MAX_PHYSMEM_BITS +#ifdef CONFIG_HIGHMEM64G +#define MAX_PHYSMEM_BITS 36 +#else /* !CONFIG_HIGHMEM64G */ +/* + * If this definition of MAX_PHYSMEM_BITS is used, OBJ_INDEX_BITS will just + * be PAGE_SHIFT + */ +#define MAX_PHYSMEM_BITS BITS_PER_LONG +#endif +#endif +#define _PFN_BITS (MAX_PHYSMEM_BITS - PAGE_SHIFT) +#define OBJ_INDEX_BITS (BITS_PER_LONG - _PFN_BITS) +#define OBJ_INDEX_MASK ((_AC(1, UL) << OBJ_INDEX_BITS) - 1) + +#define MAX(a, b) ((a) >= (b) ? (a) : (b)) +/* ZS_MIN_ALLOC_SIZE must be multiple of ZS_ALIGN */ +#define ZS_MIN_ALLOC_SIZE \ + MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_INDEX_BITS)) +#define ZS_MAX_ALLOC_SIZE PAGE_SIZE + +/* + * On systems with 4K page size, this gives 254 size classes! There is a + * trader-off here: + * - Large number of size classes is potentially wasteful as free page are + * spread across these classes + * - Small number of size classes causes large internal fragmentation + * - Probably its better to use specific size classes (empirically + * determined). NOTE: all those class sizes must be set as multiple of + * ZS_ALIGN to make sure link_free itself never has to span 2 pages. 
+ * + * ZS_MIN_ALLOC_SIZE and ZS_SIZE_CLASS_DELTA must be multiple of ZS_ALIGN + * (reason above) + */ +#define ZS_SIZE_CLASS_DELTA 16 +#define ZS_SIZE_CLASSES ((ZS_MAX_ALLOC_SIZE - ZS_MIN_ALLOC_SIZE) / \ + ZS_SIZE_CLASS_DELTA + 1) + +/* + * We do not maintain any list for completely empty or full pages + */ +enum fullness_group { + ZS_ALMOST_FULL, + ZS_ALMOST_EMPTY, + _ZS_NR_FULLNESS_GROUPS, + + ZS_EMPTY, + ZS_FULL +}; + +/* + * We assign a page to ZS_ALMOST_EMPTY fullness group when: + * n <= N / f, where + * n = number of allocated objects + * N = total number of objects zspage can store + * f = 1/fullness_threshold_frac + * + * Similarly, we assign zspage to: + * ZS_ALMOST_FULL when n > N / f + * ZS_EMPTY when n == 0 + * ZS_FULL when n == N + * + * (see: fix_fullness_group()) + */ +static const int fullness_threshold_frac = 4; + +struct size_class { + /* + * Size of objects stored in this class. Must be multiple + * of ZS_ALIGN. + */ + int size; + unsigned int index; + + /* Number of PAGE_SIZE sized pages to combine to form a 'zspage' */ + int pages_per_zspage; + + spinlock_t lock; + + /* stats */ + u64 pages_allocated; + + struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS]; +}; + +/* + * Placed within free objects to form a singly linked list. + * For every zspage, first_page->freelist gives head of this list. 
+ * + * This must be power of 2 and less than or equal to ZS_ALIGN + */ +struct link_free { + /* Handle of next free chunk (encodes <PFN, obj_idx>) */ + void *next; +}; + +struct zs_pool { + struct size_class size_class[ZS_SIZE_CLASSES]; + + gfp_t flags; /* allocation flags used when growing pool */ + const char *name; +}; /* * A zspage's class index and fullness group diff --git a/drivers/staging/zsmalloc/zsmalloc_int.h b/drivers/staging/zsmalloc/zsmalloc_int.h deleted file mode 100644 index 8c0b344..0000000 --- a/drivers/staging/zsmalloc/zsmalloc_int.h +++ /dev/null @@ -1,149 +0,0 @@ -/* - * zsmalloc memory allocator - * - * Copyright (C) 2011 Nitin Gupta - * - * This code is released using a dual license strategy: BSD/GPL - * You can choose the license that better fits your requirements. - * - * Released under the terms of 3-clause BSD License - * Released under the terms of GNU General Public License Version 2.0 - */ - -#ifndef _ZS_MALLOC_INT_H_ -#define _ZS_MALLOC_INT_H_ - -#include <linux/kernel.h> -#include <linux/spinlock.h> -#include <linux/types.h> - -/* - * This must be power of 2 and greater than of equal to sizeof(link_free). - * These two conditions ensure that any 'struct link_free' itself doesn't - * span more than 1 page which avoids complex case of mapping 2 pages simply - * to restore link_free pointer values. - */ -#define ZS_ALIGN 8 - -/* - * A single 'zspage' is composed of up to 2^N discontiguous 0-order (single) - * pages. ZS_MAX_ZSPAGE_ORDER defines upper limit on N. - */ -#define ZS_MAX_ZSPAGE_ORDER 2 -#define ZS_MAX_PAGES_PER_ZSPAGE (_AC(1, UL) << ZS_MAX_ZSPAGE_ORDER) - -/* - * Object location (<PFN>, <obj_idx>) is encoded as - * as single (void *) handle value. - * - * Note that object index <obj_idx> is relative to system - * page <PFN> it is stored in, so for each sub-page belonging - * to a zspage, obj_idx starts with 0. - * - * This is made more complicated by various memory models and PAE. 
- */ - -#ifndef MAX_PHYSMEM_BITS -#ifdef CONFIG_HIGHMEM64G -#define MAX_PHYSMEM_BITS 36 -#else /* !CONFIG_HIGHMEM64G */ -/* - * If this definition of MAX_PHYSMEM_BITS is used, OBJ_INDEX_BITS will just - * be PAGE_SHIFT - */ -#define MAX_PHYSMEM_BITS BITS_PER_LONG -#endif -#endif -#define _PFN_BITS (MAX_PHYSMEM_BITS - PAGE_SHIFT) -#define OBJ_INDEX_BITS (BITS_PER_LONG - _PFN_BITS) -#define OBJ_INDEX_MASK ((_AC(1, UL) << OBJ_INDEX_BITS) - 1) - -#define MAX(a, b) ((a) >= (b) ? (a) : (b)) -/* ZS_MIN_ALLOC_SIZE must be multiple of ZS_ALIGN */ -#define ZS_MIN_ALLOC_SIZE \ - MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_INDEX_BITS)) -#define ZS_MAX_ALLOC_SIZE PAGE_SIZE - -/* - * On systems with 4K page size, this gives 254 size classes! There is a - * trader-off here: - * - Large number of size classes is potentially wasteful as free page are - * spread across these classes - * - Small number of size classes causes large internal fragmentation - * - Probably its better to use specific size classes (empirically - * determined). NOTE: all those class sizes must be set as multiple of - * ZS_ALIGN to make sure link_free itself never has to span 2 pages. 
- * - * ZS_MIN_ALLOC_SIZE and ZS_SIZE_CLASS_DELTA must be multiple of ZS_ALIGN - * (reason above) - */ -#define ZS_SIZE_CLASS_DELTA 16 -#define ZS_SIZE_CLASSES ((ZS_MAX_ALLOC_SIZE - ZS_MIN_ALLOC_SIZE) / \ - ZS_SIZE_CLASS_DELTA + 1) - -/* - * We do not maintain any list for completely empty or full pages - */ -enum fullness_group { - ZS_ALMOST_FULL, - ZS_ALMOST_EMPTY, - _ZS_NR_FULLNESS_GROUPS, - - ZS_EMPTY, - ZS_FULL -}; - -/* - * We assign a page to ZS_ALMOST_EMPTY fullness group when: - * n <= N / f, where - * n = number of allocated objects - * N = total number of objects zspage can store - * f = 1/fullness_threshold_frac - * - * Similarly, we assign zspage to: - * ZS_ALMOST_FULL when n > N / f - * ZS_EMPTY when n == 0 - * ZS_FULL when n == N - * - * (see: fix_fullness_group()) - */ -static const int fullness_threshold_frac = 4; - -struct size_class { - /* - * Size of objects stored in this class. Must be multiple - * of ZS_ALIGN. - */ - int size; - unsigned int index; - - /* Number of PAGE_SIZE sized pages to combine to form a 'zspage' */ - int pages_per_zspage; - - spinlock_t lock; - - /* stats */ - u64 pages_allocated; - - struct page *fullness_list[_ZS_NR_FULLNESS_GROUPS]; -}; - -/* - * Placed within free objects to form a singly linked list. - * For every zspage, first_page->freelist gives head of this list. - * - * This must be power of 2 and less than or equal to ZS_ALIGN - */ -struct link_free { - /* Handle of next free chunk (encodes <PFN, obj_idx>) */ - void *next; -}; - -struct zs_pool { - struct size_class size_class[ZS_SIZE_CLASSES]; - - gfp_t flags; /* allocation flags used when growing pool */ - const char *name; -}; - -#endif -- 1.7.9.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH 2/4] zsmalloc: promote to mm/
2012-07-27 18:18 Seth Jennings
2012-07-27 18:18 ` [PATCH 1/4] zsmalloc: collapse internal .h into .c Seth Jennings
@ 2012-07-27 18:18 ` Seth Jennings
2012-07-27 18:18 ` [PATCH 3/4] drivers: add memory management driver class Seth Jennings
` (4 subsequent siblings)
6 siblings, 0 replies; 41+ messages in thread
From: Seth Jennings @ 2012-07-27 18:18 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Seth Jennings, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Rzeszutek Wilk, Dan Magenheimer, Robert Jennings, linux-mm, linux-kernel, devel

This patch promotes the slab-based zsmalloc memory allocator from the staging tree to mm/.

zcache depends on this allocator for storing compressed RAM pages in an efficient way under system-wide memory pressure, where high-order (greater than 0) page allocations are very likely to fail.

For more information on zsmalloc and its internals, read the documentation at the top of the zsmalloc .c file.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
--- drivers/staging/Kconfig | 2 -- drivers/staging/Makefile | 1 - drivers/staging/zcache/zcache-main.c | 4 ++-- drivers/staging/zram/zram_drv.h | 3 +- drivers/staging/zsmalloc/Kconfig | 10 ---------- drivers/staging/zsmalloc/Makefile | 3 --- .../staging/zsmalloc => include/linux}/zsmalloc.h | 0 mm/Kconfig | 18 ++++++++++++++++++ mm/Makefile | 1 + .../zsmalloc/zsmalloc-main.c => mm/zsmalloc.c | 3 +-- 10 files changed, 23 insertions(+), 22 deletions(-) delete mode 100644 drivers/staging/zsmalloc/Kconfig delete mode 100644 drivers/staging/zsmalloc/Makefile rename {drivers/staging/zsmalloc => include/linux}/zsmalloc.h (100%) rename drivers/staging/zsmalloc/zsmalloc-main.c => mm/zsmalloc.c (99%) diff --git a/drivers/staging/Kconfig b/drivers/staging/Kconfig index e3402d5..b7f7bc7 100644 --- a/drivers/staging/Kconfig +++ b/drivers/staging/Kconfig @@ -78,8 +78,6 @@ source "drivers/staging/zram/Kconfig" source "drivers/staging/zcache/Kconfig" -source
"drivers/staging/zsmalloc/Kconfig" - source "drivers/staging/wlags49_h2/Kconfig" source "drivers/staging/wlags49_h25/Kconfig" diff --git a/drivers/staging/Makefile b/drivers/staging/Makefile index 3be59d0..ad74bee 100644 --- a/drivers/staging/Makefile +++ b/drivers/staging/Makefile @@ -34,7 +34,6 @@ obj-$(CONFIG_DX_SEP) += sep/ obj-$(CONFIG_IIO) += iio/ obj-$(CONFIG_ZRAM) += zram/ obj-$(CONFIG_ZCACHE) += zcache/ -obj-$(CONFIG_ZSMALLOC) += zsmalloc/ obj-$(CONFIG_WLAGS49_H2) += wlags49_h2/ obj-$(CONFIG_WLAGS49_H25) += wlags49_h25/ obj-$(CONFIG_FB_SM7XX) += sm7xxfb/ diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c index c214977..06ce28f 100644 --- a/drivers/staging/zcache/zcache-main.c +++ b/drivers/staging/zcache/zcache-main.c @@ -32,9 +32,9 @@ #include <linux/crypto.h> #include <linux/string.h> #include <linux/idr.h> -#include "tmem.h" +#include <linux/zsmalloc.h> -#include "../zsmalloc/zsmalloc.h" +#include "tmem.h" #ifdef CONFIG_CLEANCACHE #include <linux/cleancache.h> diff --git a/drivers/staging/zram/zram_drv.h b/drivers/staging/zram/zram_drv.h index 572c0b1..f6d0925 100644 --- a/drivers/staging/zram/zram_drv.h +++ b/drivers/staging/zram/zram_drv.h @@ -17,8 +17,7 @@ #include <linux/spinlock.h> #include <linux/mutex.h> - -#include "../zsmalloc/zsmalloc.h" +#include <linux/zsmalloc.h> /* * Some arbitrary value. This is just to catch diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig deleted file mode 100644 index 9084565..0000000 --- a/drivers/staging/zsmalloc/Kconfig +++ /dev/null @@ -1,10 +0,0 @@ -config ZSMALLOC - tristate "Memory allocator for compressed pages" - default n - help - zsmalloc is a slab-based memory allocator designed to store - compressed RAM pages. zsmalloc uses virtual memory mapping - in order to reduce fragmentation. However, this results in a - non-standard allocator interface where a handle, not a pointer, is - returned by an alloc(). 
This handle must be mapped in order to - access the allocated space. diff --git a/drivers/staging/zsmalloc/Makefile b/drivers/staging/zsmalloc/Makefile deleted file mode 100644 index b134848..0000000 --- a/drivers/staging/zsmalloc/Makefile +++ /dev/null @@ -1,3 +0,0 @@ -zsmalloc-y := zsmalloc-main.o - -obj-$(CONFIG_ZSMALLOC) += zsmalloc.o diff --git a/drivers/staging/zsmalloc/zsmalloc.h b/include/linux/zsmalloc.h similarity index 100% rename from drivers/staging/zsmalloc/zsmalloc.h rename to include/linux/zsmalloc.h diff --git a/mm/Kconfig b/mm/Kconfig index d5c8019..2586b66 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -411,3 +411,21 @@ config FRONTSWAP and swap data is stored as normal on the matching swap device. If unsure, say Y to enable frontswap. + +config ZSMALLOC + tristate "Memory allocator for compressed pages" + default n + help + zsmalloc is a slab-based memory allocator designed to store + compressed RAM pages. zsmalloc uses a memory pool that combines + single pages into higher order pages by linking them together + using the fields of the struct page. Allocations are then + mapped through copy buffers or VM mapping, in order to reduce + memory pool fragmentation and increase allocation success rate under + memory pressure. + + This results in a non-standard allocator interface where + a handle, not a pointer, is returned by the allocation function. + This handle must be mapped in order to access the allocated space. + + If unsure, say N. 
diff --git a/mm/Makefile b/mm/Makefile index 92753e2..8a3d7bea 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -57,3 +57,4 @@ obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o obj-$(CONFIG_CLEANCACHE) += cleancache.o obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o +obj-$(CONFIG_ZSMALLOC) += zsmalloc.o diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/mm/zsmalloc.c similarity index 99% rename from drivers/staging/zsmalloc/zsmalloc-main.c rename to mm/zsmalloc.c index 09a9d35..6b20429 100644 --- a/drivers/staging/zsmalloc/zsmalloc-main.c +++ b/mm/zsmalloc.c @@ -78,8 +78,7 @@ #include <linux/hardirq.h> #include <linux/spinlock.h> #include <linux/types.h> - -#include "zsmalloc.h" +#include <linux/zsmalloc.h> /* * This must be power of 2 and greater than of equal to sizeof(link_free). -- 1.7.9.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH 3/4] drivers: add memory management driver class
2012-07-27 18:18 Seth Jennings
2012-07-27 18:18 ` [PATCH 1/4] zsmalloc: collapse internal .h into .c Seth Jennings
2012-07-27 18:18 ` [PATCH 2/4] zsmalloc: promote to mm/ Seth Jennings
@ 2012-07-27 18:18 ` Seth Jennings
2012-07-31 15:31 ` Konrad Rzeszutek Wilk
2012-07-27 18:18 ` [PATCH 4/4] zcache: promote to drivers/mm/ Seth Jennings
` (3 subsequent siblings)
6 siblings, 1 reply; 41+ messages in thread
From: Seth Jennings @ 2012-07-27 18:18 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Seth Jennings, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Rzeszutek Wilk, Dan Magenheimer, Robert Jennings, linux-mm, linux-kernel, devel

This patchset creates a new driver class under drivers/ for memory management related drivers, like zcache.

This driver class would be for drivers that don't actually enable a hardware device, but rather augment the memory manager in some way.

In-tree candidates for this driver class are zcache, zram, and lowmemorykiller, all in staging.
Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com> --- drivers/Kconfig | 2 ++ drivers/Makefile | 1 + drivers/mm/Kconfig | 3 +++ 3 files changed, 6 insertions(+) create mode 100644 drivers/mm/Kconfig diff --git a/drivers/Kconfig b/drivers/Kconfig index ece958d..67fe7bd 100644 --- a/drivers/Kconfig +++ b/drivers/Kconfig @@ -152,4 +152,6 @@ source "drivers/vme/Kconfig" source "drivers/pwm/Kconfig" +source "drivers/mm/Kconfig" + endmenu diff --git a/drivers/Makefile b/drivers/Makefile index 5b42184..121742e 100644 --- a/drivers/Makefile +++ b/drivers/Makefile @@ -139,3 +139,4 @@ obj-$(CONFIG_EXTCON) += extcon/ obj-$(CONFIG_MEMORY) += memory/ obj-$(CONFIG_IIO) += iio/ obj-$(CONFIG_VME_BUS) += vme/ +obj-$(CONFIG_MM_DRIVERS) += mm/ diff --git a/drivers/mm/Kconfig b/drivers/mm/Kconfig new file mode 100644 index 0000000..e5b3743 --- /dev/null +++ b/drivers/mm/Kconfig @@ -0,0 +1,3 @@ +menu "Memory management drivers" + +endmenu -- 1.7.9.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 41+ messages in thread
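One wiring detail worth checking in this patch: the drivers/Makefile hunk descends into drivers/mm/ under CONFIG_MM_DRIVERS, but the new drivers/mm/Kconfig shown here only opens and closes a menu and never defines an MM_DRIVERS symbol, so nothing would ever set it. A hedged sketch of one way the symbol could be wired up (the menuconfig form, default, and wording below are illustrative suggestions, not part of the posted patch):

```
menuconfig MM_DRIVERS
	bool "Memory management drivers"
	default y
	help
	  Drivers in this class do not enable a hardware device; they
	  augment the kernel memory manager (e.g. zcache, zram).

if MM_DRIVERS

# per-driver entries, e.g. ZCACHE, would be sourced or defined here

endif # MM_DRIVERS
```

With a menuconfig symbol like this, obj-$(CONFIG_MM_DRIVERS) in drivers/Makefile evaluates to obj-y by default and the subdirectory actually gets built.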
* Re: [PATCH 3/4] drivers: add memory management driver class
2012-07-27 18:18 ` [PATCH 3/4] drivers: add memory management driver class Seth Jennings
@ 2012-07-31 15:31 ` Konrad Rzeszutek Wilk
0 siblings, 0 replies; 41+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-07-31 15:31 UTC (permalink / raw)
To: Seth Jennings
Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Dan Magenheimer, Robert Jennings, linux-mm, linux-kernel, devel

On Fri, Jul 27, 2012 at 01:18:36PM -0500, Seth Jennings wrote:
> This patchset creates a new driver class under drivers/ for
> memory management related drivers, like zcache.

I was going back and forth with Dan on whether it should be in mm/ or in drivers/mm.

>
> This driver class would be for drivers that don't actually enable
> a hardware device, but rather augment the memory manager in some
> way.
>
> In-tree candidates for this driver class are zcache, zram, and
> lowmemorykiller, all in staging.

But with so many (well, three of them) I think sticking them in drivers/mm makes more sense.
> > Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com> > --- > drivers/Kconfig | 2 ++ > drivers/Makefile | 1 + > drivers/mm/Kconfig | 3 +++ > 3 files changed, 6 insertions(+) > create mode 100644 drivers/mm/Kconfig > > diff --git a/drivers/Kconfig b/drivers/Kconfig > index ece958d..67fe7bd 100644 > --- a/drivers/Kconfig > +++ b/drivers/Kconfig > @@ -152,4 +152,6 @@ source "drivers/vme/Kconfig" > > source "drivers/pwm/Kconfig" > > +source "drivers/mm/Kconfig" > + > endmenu > diff --git a/drivers/Makefile b/drivers/Makefile > index 5b42184..121742e 100644 > --- a/drivers/Makefile > +++ b/drivers/Makefile > @@ -139,3 +139,4 @@ obj-$(CONFIG_EXTCON) += extcon/ > obj-$(CONFIG_MEMORY) += memory/ > obj-$(CONFIG_IIO) += iio/ > obj-$(CONFIG_VME_BUS) += vme/ > +obj-$(CONFIG_MM_DRIVERS) += mm/ > diff --git a/drivers/mm/Kconfig b/drivers/mm/Kconfig > new file mode 100644 > index 0000000..e5b3743 > --- /dev/null > +++ b/drivers/mm/Kconfig > @@ -0,0 +1,3 @@ > +menu "Memory management drivers" > + > +endmenu > -- > 1.7.9.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 41+ messages in thread
* [PATCH 4/4] zcache: promote to drivers/mm/
2012-07-27 18:18 Seth Jennings ` (2 preceding siblings ...)
2012-07-27 18:18 ` [PATCH 3/4] drivers: add memory management driver class Seth Jennings
@ 2012-07-27 18:18 ` Seth Jennings
2012-07-29 2:20 ` [PATCH 0/4] promote zcache from staging Minchan Kim
` (2 subsequent siblings)
6 siblings, 0 replies; 41+ messages in thread
From: Seth Jennings @ 2012-07-27 18:18 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Seth Jennings, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Rzeszutek Wilk, Dan Magenheimer, Robert Jennings, linux-mm, linux-kernel, devel

This patch promotes the zcache driver from staging to drivers/mm/.

zcache captures swap pages via frontswap, and pages that fall out of the page cache via cleancache, and compresses them in RAM, providing a compressed RAM swap and a compressed second-chance page cache.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
--- drivers/mm/Kconfig | 10 ++++++++++ drivers/mm/Makefile | 1 + drivers/{staging => mm}/zcache/Makefile | 0 drivers/{staging => mm}/zcache/tmem.c | 0 drivers/{staging => mm}/zcache/tmem.h | 0 drivers/{staging => mm}/zcache/zcache-main.c | 0 drivers/staging/Kconfig | 2 -- drivers/staging/Makefile | 1 - drivers/staging/zcache/Kconfig | 11 ----------- 9 files changed, 11 insertions(+), 14 deletions(-) create mode 100644 drivers/mm/Makefile rename drivers/{staging => mm}/zcache/Makefile (100%) rename drivers/{staging => mm}/zcache/tmem.c (100%) rename drivers/{staging => mm}/zcache/tmem.h (100%) rename drivers/{staging => mm}/zcache/zcache-main.c (100%) delete mode 100644 drivers/staging/zcache/Kconfig diff --git a/drivers/mm/Kconfig b/drivers/mm/Kconfig index e5b3743..22289c6 100644 --- a/drivers/mm/Kconfig +++ b/drivers/mm/Kconfig @@ -1,3 +1,13 @@ menu "Memory management drivers" +config ZCACHE + bool "Dynamic compression of swap pages and clean pagecache pages" + depends on (CLEANCACHE || FRONTSWAP) && CRYPTO=y && ZSMALLOC=y + select CRYPTO_LZO + default
n + help + Zcache uses compression and an in-kernel implementation of + transcendent memory to store clean page cache pages and swap + in RAM, providing a noticeable reduction in disk I/O. + endmenu diff --git a/drivers/mm/Makefile b/drivers/mm/Makefile new file mode 100644 index 0000000..f36f509 --- /dev/null +++ b/drivers/mm/Makefile @@ -0,0 +1 @@ +obj-$(CONFIG_ZCACHE) += zcache/ diff --git a/drivers/staging/zcache/Makefile b/drivers/mm/zcache/Makefile similarity index 100% rename from drivers/staging/zcache/Makefile rename to drivers/mm/zcache/Makefile diff --git a/drivers/staging/zcache/tmem.c b/drivers/mm/zcache/tmem.c similarity index 100% rename from drivers/staging/zcache/tmem.c rename to drivers/mm/zcache/tmem.c diff --git a/drivers/staging/zcache/tmem.h b/drivers/mm/zcache/tmem.h similarity index 100% rename from drivers/staging/zcache/tmem.h rename to drivers/mm/zcache/tmem.h diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/mm/zcache/zcache-main.c similarity index 100% rename from drivers/staging/zcache/zcache-main.c rename to drivers/mm/zcache/zcache-main.c diff --git a/drivers/staging/Kconfig b/drivers/staging/Kconfig index b7f7bc7..0940d2e 100644 --- a/drivers/staging/Kconfig +++ b/drivers/staging/Kconfig @@ -76,8 +76,6 @@ source "drivers/staging/iio/Kconfig" source "drivers/staging/zram/Kconfig" -source "drivers/staging/zcache/Kconfig" - source "drivers/staging/wlags49_h2/Kconfig" source "drivers/staging/wlags49_h25/Kconfig" diff --git a/drivers/staging/Makefile b/drivers/staging/Makefile index ad74bee..6e1c491 100644 --- a/drivers/staging/Makefile +++ b/drivers/staging/Makefile @@ -33,7 +33,6 @@ obj-$(CONFIG_IPACK_BUS) += ipack/ obj-$(CONFIG_DX_SEP) += sep/ obj-$(CONFIG_IIO) += iio/ obj-$(CONFIG_ZRAM) += zram/ -obj-$(CONFIG_ZCACHE) += zcache/ obj-$(CONFIG_WLAGS49_H2) += wlags49_h2/ obj-$(CONFIG_WLAGS49_H25) += wlags49_h25/ obj-$(CONFIG_FB_SM7XX) += sm7xxfb/ diff --git a/drivers/staging/zcache/Kconfig b/drivers/staging/zcache/Kconfig 
deleted file mode 100644 index 4881839..0000000 --- a/drivers/staging/zcache/Kconfig +++ /dev/null @@ -1,11 +0,0 @@ -config ZCACHE - bool "Dynamic compression of swap pages and clean pagecache pages" - depends on (CLEANCACHE || FRONTSWAP) && CRYPTO=y && ZSMALLOC=y - select CRYPTO_LZO - default n - help - Zcache doubles RAM efficiency while providing a significant - performance boosts on many workloads. Zcache uses - compression and an in-kernel implementation of transcendent - memory to store clean page cache pages and swap in RAM, - providing a noticeable reduction in disk I/O. -- 1.7.9.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging
2012-07-27 18:18 Seth Jennings ` (3 preceding siblings ...)
2012-07-27 18:18 ` [PATCH 4/4] zcache: promote to drivers/mm/ Seth Jennings
@ 2012-07-29 2:20 ` Minchan Kim
2012-08-07 20:23 ` Seth Jennings
2012-08-14 22:18 ` Seth Jennings
6 siblings, 0 replies; 41+ messages in thread
From: Minchan Kim @ 2012-07-29 2:20 UTC (permalink / raw)
To: Seth Jennings
Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Konrad Rzeszutek Wilk, Dan Magenheimer, Robert Jennings, linux-mm, linux-kernel, devel

Hi Seth,

Moving zcache out of staging is rather controversial, as you can see in this thread. But I believe zram is very mature and its code/comments are clean. In addition, it has lots of real customers on the embedded side, so IMHO it would be easier to promote it first. Of course, that would also promote zsmalloc, which is half of what you want. What do you think? If you agree, could you do that first? If you don't want to, and promoting zcache continues to be controversial, I will do it after my vacation.

Thanks.

On Fri, Jul 27, 2012 at 01:18:33PM -0500, Seth Jennings wrote:
> zcache is the remaining piece of code required to support in-kernel
> memory compression. The other two features, cleancache and frontswap,
> have been promoted to mainline in 3.0 and 3.5. This patchset
> promotes zcache from the staging tree to mainline.
>
> Based on the level of activity and contributions we're seeing from a
> diverse set of people and interests, I think zcache has matured to the
> point where it makes sense to promote this out of staging.
>
> Overview
> ========
> zcache is a backend to frontswap and cleancache that accepts pages from
> those mechanisms and compresses them, leading to reduced I/O caused by
> swap and file re-reads. This is very valuable in shared storage situations
> to reduce load on things like SANs. Also, in the case of slow backing/swap
> devices, zcache can also yield a performance gain.
> > In-Kernel Memory Compression Overview: > > swap subsystem page cache > + + > frontswap cleancache > + + > zcache frontswap glue zcache cleancache glue > + + > +---------+------------+ > + > zcache/tmem core > + > +---------+------------+ > + + > zsmalloc zbud > > Everything below the frontswap/cleancache layer is current inside the > zcache driver expect for zsmalloc which is a shared between zcache and > another memory compression driver, zram. > > Since zcache is dependent on zsmalloc, it is also being promoted by this > patchset. > > For information on zsmalloc and the rationale behind it's design and use > cases verses already existing allocators in the kernel: > > https://lkml.org/lkml/2012/1/9/386 > > zsmalloc is the allocator used by zcache to store persistent pages that > comes from frontswap, as opposed to zbud which is the (internal) allocator > used for ephemeral pages from cleancache. > > zsmalloc uses many fields of the page struct to create it's conceptual > high-order page called a zspage. Exactly which fields are used and for > what purpose is documented at the top of the zsmalloc .c file. Because > zsmalloc uses struct page extensively, Andrew advised that the > promotion location be mm/: > > https://lkml.org/lkml/2012/1/20/308 > > Some benchmarking numbers demonstrating the I/O saving that can be had > with zcache: > > https://lkml.org/lkml/2012/3/22/383 > > Dan's presentation at LSF/MM this year on zcache: > > http://oss.oracle.com/projects/tmem/dist/documentation/presentations/LSFMM12-zcache-final.pdf > > This patchset is based on next-20120727 + 3-part zsmalloc patchset below > > https://lkml.org/lkml/2012/7/18/353 > > The zsmalloc patchset is already acked and will be integrated by Greg after > 3.6-rc1 is out. 
>
> Seth Jennings (4):
>   zsmalloc: collapse internal .h into .c
>   zsmalloc: promote to mm/
>   drivers: add memory management driver class
>   zcache: promote to drivers/mm/
>
>  drivers/Kconfig                                    |    2 +
>  drivers/Makefile                                   |    1 +
>  drivers/mm/Kconfig                                 |   13 ++
>  drivers/mm/Makefile                                |    1 +
>  drivers/{staging => mm}/zcache/Makefile            |    0
>  drivers/{staging => mm}/zcache/tmem.c              |    0
>  drivers/{staging => mm}/zcache/tmem.h              |    0
>  drivers/{staging => mm}/zcache/zcache-main.c       |    4 +-
>  drivers/staging/Kconfig                            |    4 -
>  drivers/staging/Makefile                           |    2 -
>  drivers/staging/zcache/Kconfig                     |   11 --
>  drivers/staging/zram/zram_drv.h                    |    3 +-
>  drivers/staging/zsmalloc/Kconfig                   |   10 --
>  drivers/staging/zsmalloc/Makefile                  |    3 -
>  drivers/staging/zsmalloc/zsmalloc_int.h            |  149 --------------------
>  .../staging/zsmalloc => include/linux}/zsmalloc.h  |    0
>  mm/Kconfig                                         |   18 +++
>  mm/Makefile                                        |    1 +
>  .../zsmalloc/zsmalloc-main.c => mm/zsmalloc.c      |  133 ++++++++++++++++-
>  19 files changed, 170 insertions(+), 185 deletions(-)
>  create mode 100644 drivers/mm/Kconfig
>  create mode 100644 drivers/mm/Makefile
>  rename drivers/{staging => mm}/zcache/Makefile (100%)
>  rename drivers/{staging => mm}/zcache/tmem.c (100%)
>  rename drivers/{staging => mm}/zcache/tmem.h (100%)
>  rename drivers/{staging => mm}/zcache/zcache-main.c (99%)
>  delete mode 100644 drivers/staging/zcache/Kconfig
>  delete mode 100644 drivers/staging/zsmalloc/Kconfig
>  delete mode 100644 drivers/staging/zsmalloc/Makefile
>  delete mode 100644 drivers/staging/zsmalloc/zsmalloc_int.h
>  rename {drivers/staging/zsmalloc => include/linux}/zsmalloc.h (100%)
>  rename drivers/staging/zsmalloc/zsmalloc-main.c => mm/zsmalloc.c (86%)
>
> --
> 1.7.9.5
--
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org
* Re: [PATCH 0/4] promote zcache from staging 2012-07-27 18:18 Seth Jennings ` (4 preceding siblings ...) 2012-07-29 2:20 ` [PATCH 0/4] promote zcache from staging Minchan Kim @ 2012-08-07 20:23 ` Seth Jennings 2012-08-07 21:47 ` Dan Magenheimer 2012-08-09 18:50 ` Seth Jennings 2012-08-14 22:18 ` Seth Jennings 6 siblings, 2 replies; 41+ messages in thread
From: Seth Jennings @ 2012-08-07 20:23 UTC (permalink / raw)
To: Seth Jennings
Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Rzeszutek Wilk, Dan Magenheimer, Robert Jennings, linux-mm, linux-kernel, devel

On 07/27/2012 01:18 PM, Seth Jennings wrote:
> Some benchmarking numbers demonstrating the I/O saving that can be had
> with zcache:
>
> https://lkml.org/lkml/2012/3/22/383

There was concern that kernel changes external to zcache since v3.3 may have mitigated the benefit of zcache. So I re-ran my kernel building benchmark and confirmed that zcache is still providing I/O and runtime savings.

Gentoo w/ kernel v3.5 (frontswap only, cleancache disabled)
Quad-core i5-2500 @ 3.3GHz
512MB DDR3 1600MHz (limited with mem=512m on boot)
Filesystem and swap on 80GB HDD (about 58MB/s with hdparm -t)

majflt are major page faults reported by the time command. pswpin/out is the delta of pswpin/out from /proc/vmstat before and after the make -jN.

Mind the 512MB RAM vs 1GB in my previous results. This just reduces the number of threads required to create memory pressure and removes some of the context switching noise from the results. I'm also using a single HDD instead of the RAID0 in my previous results.
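The pswpin/pswpout accounting described above can be reproduced with a small helper that snapshots /proc/vmstat before and after a run and subtracts the counters. A sketch in Python; the snapshot values here are synthetic, chosen so the deltas match the N=16 "normal" row of the results:

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style "name value" lines into a dict of counters."""
    return {name: int(value) for name, value in
            (line.split() for line in text.splitlines() if line.strip())}

def swap_io_delta(before, after):
    """Pages swapped in/out between two snapshots."""
    return {key: after[key] - before[key] for key in ("pswpin", "pswpout")}

# Synthetic snapshots; on a real run these would be read from /proc/vmstat
# immediately before and after the `make -jN`.
before = parse_vmstat("pswpin 1000\npswpout 5000")
after = parse_vmstat("pswpin 24926\npswpout 52278")
delta = swap_io_delta(before, after)
print(delta)  # {'pswpin': 23926, 'pswpout': 47278}
```

The counters are cumulative since boot, which is why only the delta across the benchmark window is meaningful.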
Each run started with:

  swapoff -a
  swapon -a
  sync
  echo 3 > /proc/sys/vm/drop_caches

I/O (in pages):

              normal                           zcache                change
N     pswpin  pswpout  majflt  I/O sum   pswpin  pswpout  majflt  I/O sum   %I/O
4          0        2    2116     2118        0        0    2125     2125     0%
8          0      575    2244     2819        4        4    2219     2227    21%
12      2543     4038    3226     9807     1748     2519    3871     8138    17%
16     23926    47278    9426    80630     8252    15598    9372    33222    59%
20     50307   127797   15039   193143    20224    40634   17975    78833    59%

Runtime (in seconds):

N    normal  zcache  %change
4       126     127      -1%
8       124     124       0%
12      131     133      -2%
16      189     156      17%
20      261     235      10%

%CPU utilization (out of 400% on 4 cpus):

N    normal  zcache  %change
4       254     253       0%
8       261     263      -1%
12      250     248       1%
16      173     211     -22%
20      124     140     -13%

There is a sweet spot at 16 threads, where zcache is improving runtime by 17% and reducing I/O by 59% (185MB) using 22% more CPU.

Seth
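The derived columns in the tables above are straightforward: the I/O sum is pswpin + pswpout + majflt, and the %change columns are savings relative to the normal case. A quick check of the N=16 row (Python, using the numbers from the tables):

```python
def io_sum(pswpin, pswpout, majflt):
    """Total page I/O attributable to swap and major faults."""
    return pswpin + pswpout + majflt

def pct_reduction(normal, zcache):
    """Percent saving relative to the normal case; negative means
    the zcache case cost more."""
    return round(100 * (normal - zcache) / normal)

normal = io_sum(23926, 47278, 9426)  # N=16, normal
zcache = io_sum(8252, 15598, 9372)   # N=16, zcache
print(normal, zcache, pct_reduction(normal, zcache))  # 80630 33222 59
print(pct_reduction(189, 156))  # runtime: 17 (% faster)
print(pct_reduction(173, 211))  # %CPU: -22 (zcache used 22% more CPU)
```

The 185MB figure quoted for N=16 is the same arithmetic: (80630 - 33222) pages * 4KB per page.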
* RE: [PATCH 0/4] promote zcache from staging 2012-08-07 20:23 ` Seth Jennings @ 2012-08-07 21:47 ` Dan Magenheimer 2012-08-08 16:29 ` Seth Jennings 2012-08-09 18:50 ` Seth Jennings 1 sibling, 1 reply; 41+ messages in thread From: Dan Magenheimer @ 2012-08-07 21:47 UTC (permalink / raw) To: Seth Jennings Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Wilk, Robert Jennings, linux-mm, linux-kernel, devel, Kurt Hackel > From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com] > Subject: Re: [PATCH 0/4] promote zcache from staging > > On 07/27/2012 01:18 PM, Seth Jennings wrote: > > Some benchmarking numbers demonstrating the I/O saving that can be had > > with zcache: > > > > https://lkml.org/lkml/2012/3/22/383 > > There was concern that kernel changes external to zcache since v3.3 may > have mitigated the benefit of zcache. So I re-ran my kernel building > benchmark and confirmed that zcache is still providing I/O and runtime > savings. Hi Seth -- Thanks for re-running your tests. I have a couple of concerns and hope that you, and other interested parties, will read all the way through my lengthy response. The zcache issues I have seen in recent kernels arise when zcache gets "full". I notice your original published benchmarks [1] include N=24, N=28, and N=32, but these updated results do not. Are you planning on completing the runs? Second, I now see the numbers I originally published for what I thought was the same benchmark as yours are actually an order of magnitude larger (in sec) than yours. I didn't notice this in March because we were focused on the percent improvement, not the raw measurements. Since the hardware is highly similar, I suspect it is not a hardware difference but instead that you are compiling a much smaller kernel. In other words, your test case is much smaller, and so exercises zcache much less. My test case compiles a full enterprise kernel... what is yours doing? 
IMHO, any cache in computer science needs to be measured both when it is not-yet-full and when it is full. The "demo" zcache in staging works very well before it is full and I think our benchmarking in March and your re-run benchmarks demonstrate that. At LSFMM, Andrea Arcangeli pointed out that zcache, for frontswap pages, has no "writeback" capabilities and, when it is full, it simply rejects further attempts to put data in its cache. He said this is unacceptable for KVM and I agreed that it was a flaw that needed to be fixed before zcache should be promoted. When I tested zcache for this, I found that not only was he right, but that zcache could not be fixed without a major rewrite. This is one of the "fundamental flaws" of the "demo" zcache, but the new code base allows for this to be fixed. A second flaw is that the "demo" zcache has no concept of LRU for either cleancache or frontswap pages, or ability to reclaim pageframes at all for frontswap pages. (And for cleancache, pageframe reclaim is semi-random). As I've noted in other threads, this may be impossible to implement/fix with zsmalloc, and zsmalloc's author Nitin Gupta has agreed, but the new code base implements all of this with zbud. One can argue that LRU is not a requirement for zcache, but a long history of operating systems theory would suggest otherwise. A third flaw is that the "demo" version has a very poor policy to determine what pages are "admitted". The demo policy does take into account the total RAM in the system, but not current memory load conditions. The new code base IMHO does a better job but discussion will be in a refereed presentation at the upcoming Plumber's meeting. The fix for this flaw might be back-portable to the "demo" version so is not a showstopper in the "demo" version, but fixing it is not just a cosmetic fix. I can add more issues to the list, but will stop here. 
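The writeback and LRU flaws described above can be seen in miniature with a toy model (hypothetical Python, nothing like the kernel implementation): a cache that simply rejects puts once full, versus one that evicts its least-recently-used entry through a writeback callback to make room:

```python
from collections import OrderedDict

class RejectWhenFull:
    """Models the "demo" zcache behavior: once full, new pages are
    rejected and fall through to the real swap device."""
    def __init__(self, capacity):
        self.capacity, self.store = capacity, {}

    def put(self, key, page):
        if key not in self.store and len(self.store) >= self.capacity:
            return False  # rejected; caller must do real swap I/O
        self.store[key] = page
        return True

class EvictLRU:
    """Models the proposed fix: write back the least-recently-used
    page to the backing store to make room for the new one."""
    def __init__(self, capacity, writeback):
        self.capacity = capacity
        self.store = OrderedDict()
        self.writeback = writeback

    def put(self, key, page):
        if key in self.store:
            self.store.move_to_end(key)  # refresh recency
        elif len(self.store) >= self.capacity:
            victim, data = self.store.popitem(last=False)
            self.writeback(victim, data)  # LRU victim goes to backing store
        self.store[key] = page
        return True

rejecting = RejectWhenFull(capacity=2)
rejecting.put(1, "a"); rejecting.put(2, "b")
accepted = rejecting.put(3, "c")  # cache is full: put is refused

written_back = []
evicting = EvictLRU(capacity=2, writeback=lambda k, p: written_back.append(k))
for key in (1, 2, 3):
    evicting.put(key, "page")
print(accepted, written_back)  # False [1]
```

The difference only matters when the cache saturates, which is exactly why benchmarking under extreme memory pressure is being argued over in this thread.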
IMHO the "demo" zcache is not suitable for promotion from staging, which is why I spent over two months generating a new code base. I, perhaps more than anyone else, would like to see zcache used, by default, by real distros and customers, but I think it is premature to promote it, especially the old "demo" code. I do realize, however, that this decision is not mine alone, so I defer to the community to decide.

Dan

[1] https://lkml.org/lkml/2012/3/22/383
[2] http://lkml.indiana.edu/hypermail/linux/kernel/1203.2/02842.html
* Re: [PATCH 0/4] promote zcache from staging 2012-08-07 21:47 ` Dan Magenheimer @ 2012-08-08 16:29 ` Seth Jennings 2012-08-08 17:47 ` Dan Magenheimer 0 siblings, 1 reply; 41+ messages in thread From: Seth Jennings @ 2012-08-08 16:29 UTC (permalink / raw) To: Dan Magenheimer Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Wilk, Robert Jennings, linux-mm, linux-kernel, devel, Kurt Hackel On 08/07/2012 04:47 PM, Dan Magenheimer wrote: > I notice your original published benchmarks [1] include > N=24, N=28, and N=32, but these updated results do not. Are you planning > on completing the runs? Second, I now see the numbers I originally > published for what I thought was the same benchmark as yours are actually > an order of magnitude larger (in sec) than yours. I didn't notice > this in March because we were focused on the percent improvement, not > the raw measurements. Since the hardware is highly similar, I suspect > it is not a hardware difference but instead that you are compiling > a much smaller kernel. In other words, your test case is much > smaller, and so exercises zcache much less. My test case compiles > a full enterprise kernel... what is yours doing? I am doing a minimal kernel build for my local hardware configuration. With the reduction in RAM, 1GB to 512MB, I didn't need to do test runs with >20 threads to find the peak of the benefit curve at 16 threads. Past that, zcache is saturated and I'd just be burning up my disk. I'm already swapping out about 500MB (i.e. RAM size) in the 20 thread non-zcache case. Also, I provide the magnitude numbers (pages, seconds) just to show my source data. The %change numbers are the real results as they remove build size as a factor. > At LSFMM, Andrea > Arcangeli pointed out that zcache, for frontswap pages, has no "writeback" > capabilities and, when it is full, it simply rejects further attempts > to put data in its cache. 
> He said this is unacceptable for KVM and I
> agreed that it was a flaw that needed to be fixed before zcache should
> be promoted.

KVM (in-tree) is not a current user of zcache. While the use cases of possible future zcache users should be considered, I don't think they can be used to prevent promotion.

> A second flaw is that the "demo" zcache has no concept of LRU for
> either cleancache or frontswap pages, or ability to reclaim pageframes
> at all for frontswap pages.
> ...
>
> A third flaw is that the "demo" version has a very poor policy to
> determine what pages are "admitted".
> ...
>
> I can add more issues to the list, but will stop here.

All of the flaws you list do not prevent zcache from being beneficial right now, as my results demonstrate. Therefore, the flaws listed are really potential improvements and can be done in mainline after promotion. Even if large changes are required to make these improvements, they can be made in mainline in an incremental and public way.

Seth
* RE: [PATCH 0/4] promote zcache from staging 2012-08-08 16:29 ` Seth Jennings @ 2012-08-08 17:47 ` Dan Magenheimer 0 siblings, 0 replies; 41+ messages in thread From: Dan Magenheimer @ 2012-08-08 17:47 UTC (permalink / raw) To: Seth Jennings Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Wilk, Robert Jennings, linux-mm, linux-kernel, devel, Kurt Hackel > From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com] Hi Seth -- Good discussion. Even though we disagree, I appreciate your enthusiasm and your good work on the kernel! > Subject: Re: [PATCH 0/4] promote zcache from staging > > On 08/07/2012 04:47 PM, Dan Magenheimer wrote: > > I notice your original published benchmarks [1] include > > N=24, N=28, and N=32, but these updated results do not. Are you planning > > on completing the runs? Second, I now see the numbers I originally > > published for what I thought was the same benchmark as yours are actually > > an order of magnitude larger (in sec) than yours. I didn't notice > > this in March because we were focused on the percent improvement, not > > the raw measurements. Since the hardware is highly similar, I suspect > > it is not a hardware difference but instead that you are compiling > > a much smaller kernel. In other words, your test case is much > > smaller, and so exercises zcache much less. My test case compiles > > a full enterprise kernel... what is yours doing? > > I am doing a minimal kernel build for my local hardware > configuration. > > With the reduction in RAM, 1GB to 512MB, I didn't need to do > test runs with >20 threads to find the peak of the benefit > curve at 16 threads. Past that, zcache is saturated and I'd > just be burning up my disk. I think that's exactly what I said in a snippet of my response that you deleted. A cache needs to work well both when it is non-full and when it is full. You are only demonstrating that it works well when it is non-full. When it is "saturated", bad things can happen. 
Finding the "peak of the benefit" is only half the work of benchmarking. So it appears you are trying to prove your point by showing the workloads that look good, while _not_ showing the workloads that look bad, and then claiming you don't care about those bad workloads anyway. > Also, I provide the magnitude numbers (pages, seconds) just > to show my source data. The %change numbers are the real > results as they remove build size as a factor. You'll have to explain what you mean because, if I understand correctly, this is just not true. Different build sizes definitely affect memory management differently, just as different values of N (for make -jN) have an effect. > > At LSFMM, Andrea > > Arcangeli pointed out that zcache, for frontswap pages, has no "writeback" > > capabilities and, when it is full, it simply rejects further attempts > > to put data in its cache. He said this is unacceptable for KVM and I > > agreed that it was a flaw that needed to be fixed before zcache should > > be promoted. > > KVM (in-tree) is not a current user of zcache. While the > use cases of possible future zcache users should be > considered, I don't think they can be used to prevent promotion. That wasn't my point. Andrea identified the flaw as an issue of zcache. > > A second flaw is that the "demo" zcache has no concept of LRU for > > either cleancache or frontswap pages, or ability to reclaim pageframes > > at all for frontswap pages. > ... > > > > A third flaw is that the "demo" version has a very poor policy to > > determine what pages are "admitted". > ... > > > > I can add more issues to the list, but will stop here. > > All of the flaws you list do not prevent zcache from being > beneficial right now, as my results demonstrate. Therefore, > the flaws listed are really potential improvements and can > be done in mainline after promotion. Even if large changes > are required to make these improvements, they can be made in > mainline in an incremental and public way. 
Your results only demonstrate that zcache is beneficial on the workloads that you chose to present. But using the same workload with slightly different parameters (-jN or compiling a larger kernel), zcache can be _detrimental_, and you've chosen to not measure or present those cases, even though you did measure and present some of those cases in your first benchmark runs posted in March (on an earlier kernel). I can only speak for myself, but this appears disingenuous to me.

Sorry, but FWIW my vote is still a NACK. IMHO zcache needs major work before it should be promoted, and I think we should be spending the time fixing the known flaws rather than arguing about promoting "demo" code.

Dan
* Re: [PATCH 0/4] promote zcache from staging 2012-08-07 20:23 ` Seth Jennings 2012-08-07 21:47 ` Dan Magenheimer @ 2012-08-09 18:50 ` Seth Jennings 2012-08-09 20:20 ` Dan Magenheimer 1 sibling, 1 reply; 41+ messages in thread From: Seth Jennings @ 2012-08-09 18:50 UTC (permalink / raw) To: Seth Jennings Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Rzeszutek Wilk, Dan Magenheimer, Robert Jennings, linux-mm, linux-kernel, devel On 08/07/2012 03:23 PM, Seth Jennings wrote: > On 07/27/2012 01:18 PM, Seth Jennings wrote: >> Some benchmarking numbers demonstrating the I/O saving that can be had >> with zcache: >> >> https://lkml.org/lkml/2012/3/22/383 > > There was concern that kernel changes external to zcache since v3.3 may > have mitigated the benefit of zcache. So I re-ran my kernel building > benchmark and confirmed that zcache is still providing I/O and runtime > savings. There was a request made to test with even greater memory pressure to demonstrate that, at some unknown point, zcache doesn't have real problems. So I continued out to 32 threads: N=4..20 is the same data as before except for the pswpin values. I found a mistake in the way I computed pswpin that changed those values slightly. However, this didn't change the overall trend. I also inverted the %change fields since it is a percent change vs the normal case. 
I/O (in pages):

              normal                           zcache                change
N     pswpin  pswpout  majflt  I/O sum   pswpin  pswpout  majflt  I/O sum   %I/O
4          0        2    2116     2118        0        0    2125     2125     0%
8          0      575    2244     2819        0        4    2219     2223    -21%
12      1979     4038    3226     9243     1269     2519    3871     7659    -17%
16     21568    47278    9426    78272     7770    15598    9372    32740    -58%
20     50307   127797   15039   193143    20224    40634   17975    78833    -59%
24    186278   364809   45052   596139    47406    90489   30877   168772    -72%
28    274734   777815   53112  1105661   134981   307346   63480   505807    -54%
32    988530  2002087  168662  3159279   324801   723385  140288  1188474    -62%

Runtime (in seconds):

N    normal  zcache  %change
4       126     127       1%
8       124     124       0%
12      131     133       2%
16      189     156     -17%
20      261     235     -10%
24      513     288     -44%
28      556     434     -22%
32     1463     745     -49%

%CPU utilization (out of 400% on 4 cpus):

N    normal  zcache  %change
4       254     253       0%
8       261     263       1%
12      250     248      -1%
16      173     211      22%
20      124     140      13%
24       64     114      78%
28       59      76      29%
32       23      45      96%

The ~60% I/O savings holds even out to 32 threads, at which point the non-zcache case has 12GB of I/O and is taking 12x longer to complete. Additionally, the runtime savings increases significantly beyond 20 threads, even though the absolute runtime is suboptimal due to the extreme memory pressure.

Seth
* RE: [PATCH 0/4] promote zcache from staging 2012-08-09 18:50 ` Seth Jennings @ 2012-08-09 20:20 ` Dan Magenheimer 2012-08-10 18:14 ` Seth Jennings 0 siblings, 1 reply; 41+ messages in thread From: Dan Magenheimer @ 2012-08-09 20:20 UTC (permalink / raw) To: Seth Jennings Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Wilk, Robert Jennings, linux-mm, linux-kernel, devel, Kurt Hackel > From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com] > Subject: Re: [PATCH 0/4] promote zcache from staging > > On 08/07/2012 03:23 PM, Seth Jennings wrote: > > On 07/27/2012 01:18 PM, Seth Jennings wrote: > >> Some benchmarking numbers demonstrating the I/O saving that can be had > >> with zcache: > >> > >> https://lkml.org/lkml/2012/3/22/383 > > > > There was concern that kernel changes external to zcache since v3.3 may > > have mitigated the benefit of zcache. So I re-ran my kernel building > > benchmark and confirmed that zcache is still providing I/O and runtime > > savings. > > There was a request made to test with even greater memory pressure to > demonstrate that, at some unknown point, zcache doesn't have real > problems. So I continued out to 32 threads: Hi Seth -- Thanks for continuing with running the 24-32 thread benchmarks. > Runtime (in seconds) > N normal zcache %change > 4 126 127 1% > threads, even though the absolute runtime is suboptimal due to the > extreme memory pressure. I am not in a position right now to reproduce your results or mine (due to a house move which is limiting my time and access to my test machines, plus two presentations later this month at Linuxcon NA and Plumbers) but I still don't think you've really saturated the cache, which is when the extreme memory pressure issues will show up in zcache. 
I suspect that adding more threads to a minimal kernel compile doesn't increase the memory pressure as much as I was seeing, so you're not seeing what I was seeing: the zcache numbers climb to as much as 150% WORSE than non-zcache. In various experiments trying variations, I have seen four-fold degradations and worse.

My test case is a kernel compile using a full OL kernel config file, which is roughly equivalent to a RHEL6 config. Compiling this kernel, using similar hardware, I have never seen a runtime less than ~800 seconds for any value of N. I suspect that my test case, having much more source to compile, causes the N threads in a "make -jN" each to have more work to do, in parallel.

Since your test harness is obviously all set up, would you be willing to reproduce your/my non-zcache/zcache runs with a RHEL6 config file and publish the results (using a 3.5 zcache)? IIRC, the really bad zcache results started showing up at N=24.

I also wonder if you have anything else unusual in your test setup, such as a fast swap disk (mine is a partition on the same rotating disk as source and target of the kernel build, the default install for a RHEL6 system)? Or have you disabled cleancache? Or have you changed any sysfs parameters or other kernel files?

Also, whether zcache or non-zcache, I've noticed that the runtime of this workload when swapping can vary by as much as 30-40%, so it would be wise to take at least three samples to ensure a statistically valid comparison. And are you using 512M of physical memory or relying on kernel boot parameters to reduce visible memory... and if the latter have you confirmed with /proc/meminfo?

Obviously, I'm baffled at the difference in our observations. While I am always willing to admit that my numbers may be wrong, I still can't imagine why you are in such a hurry to promote zcache when these questions are looming. Would you care to explain why?
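The point about run-to-run variance is worth quantifying: with 30-40% swings, a single sample per configuration cannot distinguish a 10-17% improvement from noise. A minimal sketch of the suggested three-sample comparison (Python; the runtimes below are hypothetical, not measured):

```python
from statistics import mean, stdev

def summarize(samples):
    """Mean, sample standard deviation, and coefficient of variation (%)."""
    m = mean(samples)
    s = stdev(samples)
    return m, s, 100 * s / m

# Hypothetical repeated "make -j20" runtimes, in seconds
runs = [261, 235, 301]
m, s, cv = summarize(runs)
print(round(m), round(s))  # 266 33
```

If the coefficient of variation approaches the size of the claimed improvement, more samples (or a paired comparison on identical boot states) are needed before drawing conclusions either way.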
It seems reckless to me, and unlike the IBM behavior I expect, so I really wonder about the motivation. My goal is very simple: "First do no harm". I don't think zcache should be enabled for distros (and users) until we can reasonably demonstrate that running a workload with zcache is never substantially worse than running the same workload without zcache. If you can tell your customer: "Yes, always enable zcache", great! But if you have to tell your customer: "It depends on the workload, enable it if it works for you, disable it otherwise", then zcache will get a bad reputation, and will/should never be enabled in a reputable non-hobbyist distro. I fear the "demo" zcache will get a bad reputation, so I prefer to delay promotion while there is serious doubt about whether "harm" may occur.

Last, you've never explained what problems zcache solves for you that zram does not. With Minchan pushing for the promotion of zram+zsmalloc, does zram solve your problem? Another alternative might be to promote zcache as "demozcache" (i.e. fork it for now). It's hard to identify a reasonable compromise when you are just saying "Gotta promote zcache NOW!" and not explaining the problem you are trying to solve or motivations behind it.

OK, Seth, I think all my cards are on the table. Where's yours? (And, hello, is anyone else following this anyway? :-)

Thanks,
Dan
* Re: [PATCH 0/4] promote zcache from staging 2012-08-09 20:20 ` Dan Magenheimer @ 2012-08-10 18:14 ` Seth Jennings 2012-08-15 9:38 ` Konrad Rzeszutek Wilk 2012-08-17 22:21 ` Dan Magenheimer 0 siblings, 2 replies; 41+ messages in thread
From: Seth Jennings @ 2012-08-10 18:14 UTC (permalink / raw)
To: Dan Magenheimer
Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Wilk, Robert Jennings, linux-mm, linux-kernel, devel, Kurt Hackel

On 08/09/2012 03:20 PM, Dan Magenheimer wrote:
> I also wonder if you have anything else unusual in your
> test setup, such as a fast swap disk (mine is a partition
> on the same rotating disk as source and target of the kernel build,
> the default install for a RHEL6 system)?

I'm using a normal SATA HDD with two partitions, one for swap and the other an ext3 filesystem with the kernel source.

> Or have you disabled cleancache?

Yes, I _did_ disable cleancache. I could see where having cleancache enabled could explain the difference in results.

> Or have you changed any sysfs parameters or
> other kernel files?

No.

> And are you using 512M of physical memory or relying on
> kernel boot parameters to reduce visible memory

Limited with mem=512M boot parameter.

> ... and
> if the latter have you confirmed with /proc/meminfo?

Yes, confirmed.

Seth
* Re: [PATCH 0/4] promote zcache from staging 2012-08-10 18:14 ` Seth Jennings @ 2012-08-15 9:38 ` Konrad Rzeszutek Wilk 2012-08-15 14:24 ` Seth Jennings 0 siblings, 1 reply; 41+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-08-15 9:38 UTC (permalink / raw)
To: Seth Jennings
Cc: Dan Magenheimer, Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Robert Jennings, linux-mm, linux-kernel, devel, Kurt Hackel

On Fri, Aug 10, 2012 at 01:14:01PM -0500, Seth Jennings wrote:
> On 08/09/2012 03:20 PM, Dan Magenheimer wrote:
> > I also wonder if you have anything else unusual in your
> > test setup, such as a fast swap disk (mine is a partition
> > on the same rotating disk as source and target of the kernel build,
> > the default install for a RHEL6 system)?
>
> I'm using a normal SATA HDD with two partitions, one for
> swap and the other an ext3 filesystem with the kernel source.
>
> > Or have you disabled cleancache?
>
> Yes, I _did_ disable cleancache. I could see where having
> cleancache enabled could explain the difference in results.

Why did you disable cleancache? Having both cleancache (to compress fs data) and frontswap (to compress swap data) is the goal, while you turned one of its sources off.
* Re: [PATCH 0/4] promote zcache from staging 2012-08-15 9:38 ` Konrad Rzeszutek Wilk @ 2012-08-15 14:24 ` Seth Jennings 0 siblings, 0 replies; 41+ messages in thread From: Seth Jennings @ 2012-08-15 14:24 UTC (permalink / raw) To: Konrad Rzeszutek Wilk Cc: Dan Magenheimer, Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Robert Jennings, linux-mm, linux-kernel, devel, Kurt Hackel On 08/15/2012 04:38 AM, Konrad Rzeszutek Wilk wrote: > On Fri, Aug 10, 2012 at 01:14:01PM -0500, Seth Jennings wrote: >> On 08/09/2012 03:20 PM, Dan Magenheimer wrote >>> I also wonder if you have anything else unusual in your >>> test setup, such as a fast swap disk (mine is a partition >>> on the same rotating disk as source and target of the kernel build, >>> the default install for a RHEL6 system)? >> >> I'm using a normal SATA HDD with two partitions, one for >> swap and the other an ext3 filesystem with the kernel source. >> >>> Or have you disabled cleancache? >> >> Yes, I _did_ disable cleancache. I could see where having >> cleancache enabled could explain the difference in results. > > Why did you disable the cleancache? Having both (cleancache > to compress fs data) and frontswap (to compress swap data) is the > goal - while you turned one of its sources off. I excluded cleancache to reduce interference/noise from the benchmarking results. For this particular workload, cleancache doesn't make a lot of sense since it will steal pages that could otherwise be used for storing frontswap pages to prevent swapin/swapout I/O. In a test run with both enabled, I found that it didn't make much difference under moderate to extreme memory pressure. Both resulted in about 55% I/O reduction. However, on light memory pressure with 8 and 12 threads, it lowered the I/O reduction ability of zcache to roughly 0 compared to ~20% I/O reduction without cleancache. In short, cleancache only had the power to harm in this case, so I didn't enable it. 
Seth
* RE: [PATCH 0/4] promote zcache from staging 2012-08-10 18:14 ` Seth Jennings 2012-08-15 9:38 ` Konrad Rzeszutek Wilk @ 2012-08-17 22:21 ` Dan Magenheimer 2012-08-17 23:33 ` Seth Jennings 1 sibling, 1 reply; 41+ messages in thread From: Dan Magenheimer @ 2012-08-17 22:21 UTC (permalink / raw) To: Seth Jennings Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Wilk, Robert Jennings, linux-mm, linux-kernel, devel, Kurt Hackel > From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com] > Subject: Re: [PATCH 0/4] promote zcache from staging > > On 08/09/2012 03:20 PM, Dan Magenheimer wrote > > I also wonder if you have anything else unusual in your > > test setup, such as a fast swap disk (mine is a partition > > on the same rotating disk as source and target of the kernel build, > > the default install for a RHEL6 system)? > > I'm using a normal SATA HDD with two partitions, one for > swap and the other an ext3 filesystem with the kernel source. > > > Or have you disabled cleancache? > > Yes, I _did_ disable cleancache. I could see where having > cleancache enabled could explain the difference in results. Sorry to beat a dead horse, but I meant to report this earlier in the week and got tied up by other things. I finally got my test scaffold set up earlier this week to try to reproduce my "bad" numbers with the RHEL6-ish config file. I found that with "make -j28" and "make -j32" I experienced __DATA CORRUPTION__. This was repeatable. The type of error led me to believe that the problem was due to concurrency of cleancache reclaim. I did not try with cleancache disabled to prove/support this theory but it is consistent with the fact that you (Seth) have not seen a similar problem and has disabled cleancache. While this problem is most likely in my code and I am suitably chagrined, it re-emphasizes the fact that the current zcache in staging is 20-month old "demo" code. The proposed new zcache codebase handles concurrency much more effectively. 
I'll be away from email for a few days now.

Dan

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging
  2012-08-17 22:21 ` Dan Magenheimer
@ 2012-08-17 23:33 ` Seth Jennings
  2012-08-18 19:09 ` Dan Magenheimer
  0 siblings, 1 reply; 41+ messages in thread
From: Seth Jennings @ 2012-08-17 23:33 UTC (permalink / raw)
To: Dan Magenheimer
Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Wilk, Robert Jennings, linux-mm, linux-kernel, devel, Kurt Hackel

On 08/17/2012 05:21 PM, Dan Magenheimer wrote:
>> From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com]
>> Subject: Re: [PATCH 0/4] promote zcache from staging
>>
>> On 08/09/2012 03:20 PM, Dan Magenheimer wrote:
>>> I also wonder if you have anything else unusual in your
>>> test setup, such as a fast swap disk (mine is a partition
>>> on the same rotating disk as source and target of the kernel build,
>>> the default install for a RHEL6 system)?
>>
>> I'm using a normal SATA HDD with two partitions, one for
>> swap and the other an ext3 filesystem with the kernel source.
>>
>>> Or have you disabled cleancache?
>>
>> Yes, I _did_ disable cleancache. I could see where having
>> cleancache enabled could explain the difference in results.
>
> Sorry to beat a dead horse, but I meant to report this
> earlier in the week and got tied up by other things.
>
> I finally got my test scaffold set up earlier this week
> to try to reproduce my "bad" numbers with the RHEL6-ish
> config file.
>
> I found that with "make -j28" and "make -j32" I experienced
> __DATA CORRUPTION__. This was repeatable.

I actually hit this for the first time a few hours ago when I was running performance tests for your rewrite. I don't know what to make of it yet. The 24-thread kernel build failed when both frontswap and cleancache were enabled.

> The type of error led me to believe that the problem was
> due to concurrency of cleancache reclaim.
> I did not try
> with cleancache disabled to prove/support this theory,
> but it is consistent with the fact that you (Seth) have not
> seen a similar problem and have disabled cleancache.
>
> While this problem is most likely in my code and I am
> suitably chagrined, it re-emphasizes the fact that
> the current zcache in staging is 20-month-old "demo"
> code. The proposed new zcache codebase handles concurrency
> much more effectively.

I imagine this can be solved without rewriting the entire codebase. If your new code contains a fix for this, can we just pull it as a single patch?

Seth

^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH 0/4] promote zcache from staging
  2012-08-17 23:33 ` Seth Jennings
@ 2012-08-18 19:09 ` Dan Magenheimer
  0 siblings, 0 replies; 41+ messages in thread
From: Dan Magenheimer @ 2012-08-18 19:09 UTC (permalink / raw)
To: Seth Jennings
Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Wilk, Robert Jennings, linux-mm, linux-kernel, devel

> From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com]
> Sent: Friday, August 17, 2012 5:33 PM
> To: Dan Magenheimer
> Cc: Greg Kroah-Hartman; Andrew Morton; Nitin Gupta; Minchan Kim; Konrad Wilk; Robert Jennings; linux-mm@kvack.org; linux-kernel@vger.kernel.org; devel@driverdev.osuosl.org; Kurt Hackel
> Subject: Re: [PATCH 0/4] promote zcache from staging
>
> > Sorry to beat a dead horse, but I meant to report this
> > earlier in the week and got tied up by other things.
> >
> > I finally got my test scaffold set up earlier this week
> > to try to reproduce my "bad" numbers with the RHEL6-ish
> > config file.
> >
> > I found that with "make -j28" and "make -j32" I experienced
> > __DATA CORRUPTION__. This was repeatable.
>
> I actually hit this for the first time a few hours ago when
> I was running performance tests for your rewrite. I didn't know
> what to make of it yet. The 24-thread kernel build failed
> when both frontswap and cleancache were enabled.
>
> > The type of error led me to believe that the problem was
> > due to concurrency of cleancache reclaim. I did not try
> > with cleancache disabled to prove/support this theory,
> > but it is consistent with the fact that you (Seth) have not
> > seen a similar problem and have disabled cleancache.
> >
> > While this problem is most likely in my code and I am
> > suitably chagrined, it re-emphasizes the fact that
> > the current zcache in staging is 20-month-old "demo"
> > code. The proposed new zcache codebase handles concurrency
> > much more effectively.
>
> I imagine this can be solved without rewriting the entire
> codebase.
> If your new code contains a fix for this, can we
> just pull it as a single patch?

Hi Seth --

I didn't even observe this before this week, let alone fix it as an individual bug.

The redesign takes into account LRU ordering and zombie pageframes (which have valid pointers to the contained zbuds and possibly valid data, so can't be recycled yet), taking races and concurrency carefully into account. The demo codebase is pretty dumb about concurrency, really a hack that seemed to work. Given the above, I guess the hack only works _most_ of the time... when it doesn't, data corruption can occur.

It would be an interesting challenge, but likely very time-consuming, to fix this one bug while minimizing other changes so that the fix could be delivered as a self-contained incremental patch. I suspect if you try, you will learn why the rewrite was preferable and necessary.

(Away from email for a few days very soon now.)

Dan

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging
  2012-07-27 18:18 Seth Jennings
  ` (5 preceding siblings ...)
  2012-08-07 20:23 ` Seth Jennings
@ 2012-08-14 22:18 ` Seth Jennings
  2012-08-14 23:29 ` Minchan Kim
  6 siblings, 1 reply; 41+ messages in thread
From: Seth Jennings @ 2012-08-14 22:18 UTC (permalink / raw)
To: Seth Jennings
Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Rzeszutek Wilk, Dan Magenheimer, Robert Jennings, linux-mm, linux-kernel, devel

On 07/27/2012 01:18 PM, Seth Jennings wrote:
> zcache is the remaining piece of code required to support in-kernel
> memory compression. The other two features, cleancache and frontswap,
> have been promoted to mainline in 3.0 and 3.5. This patchset
> promotes zcache from the staging tree to mainline.
>
> Based on the level of activity and contributions we're seeing from a
> diverse set of people and interests, I think zcache has matured to the
> point where it makes sense to promote this out of staging.

I am wondering if there is any more discussion to be had on the topic of promoting zcache. The discussion got dominated by performance concerns, but hopefully my latest performance metrics have alleviated those concerns for most and shown the continuing value of zcache in both I/O and runtime savings.

I'm not saying that zcache development is complete by any means. There are still many improvements that can be made. I'm just saying that I believe it is stable and beneficial enough to leave the staging tree.

Seth

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 0/4] promote zcache from staging
  2012-08-14 22:18 ` Seth Jennings
@ 2012-08-14 23:29 ` Minchan Kim
  0 siblings, 0 replies; 41+ messages in thread
From: Minchan Kim @ 2012-08-14 23:29 UTC (permalink / raw)
To: Seth Jennings
Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim, Konrad Rzeszutek Wilk, Dan Magenheimer, Robert Jennings, linux-mm, linux-kernel, devel

Hi Seth,

On Tue, Aug 14, 2012 at 05:18:57PM -0500, Seth Jennings wrote:
> On 07/27/2012 01:18 PM, Seth Jennings wrote:
> > zcache is the remaining piece of code required to support in-kernel
> > memory compression. The other two features, cleancache and frontswap,
> > have been promoted to mainline in 3.0 and 3.5. This patchset
> > promotes zcache from the staging tree to mainline.
> >
> > Based on the level of activity and contributions we're seeing from a
> > diverse set of people and interests, I think zcache has matured to the
> > point where it makes sense to promote this out of staging.
>
> I am wondering if there is any more discussion to be had on
> the topic of promoting zcache. The discussion got dominated
> by performance concerns, but hopefully my latest performance
> metrics have alleviated those concerns for most and shown
> the continuing value of zcache in both I/O and runtime savings.
>
> I'm not saying that zcache development is complete by any
> means. There are still many improvements that can be made.
> I'm just saying that I believe it is stable and beneficial
> enough to leave the staging tree.
>
> Seth

I want to do some cleanup on zcache, and I'm okay with doing it after it is promoted, if Andrew merges it. But I'm not sure he won't mind the code quality: too few comments, poor variable/function names, and a lot of code duplicated from ramster.

Anyway, I think we should at least unify the common code between zcache and ramster before promoting. Otherwise it makes refactoring hard, because we always have to touch both sides for even a simple cleanup.
It also means a zcache contributor doing that cleanup would need to know ramster well too, which is not desirable.

^ permalink raw reply [flat|nested] 41+ messages in thread
end of thread, other threads: [~2012-08-18 19:10 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <1343413117-1989-1-git-send-email-sjenning@linux.vnet.ibm.com>
2012-07-27 19:21 ` [PATCH 0/4] promote zcache from staging Dan Magenheimer
2012-07-27 20:59 ` Konrad Rzeszutek Wilk
2012-07-27 21:42 ` Dan Magenheimer
2012-07-29  1:54 ` Minchan Kim
2012-07-31 15:36 ` Konrad Rzeszutek Wilk
2012-08-06  4:49 ` Minchan Kim
2012-07-30 19:19 ` Seth Jennings
2012-07-30 20:48 ` Dan Magenheimer
2012-07-31 15:58 ` Konrad Rzeszutek Wilk
2012-07-31 16:19 ` Greg Kroah-Hartman
2012-07-31 17:51 ` Konrad Rzeszutek Wilk
2012-07-31 18:19 ` Seth Jennings
2012-08-06  0:38 ` Minchan Kim
2012-08-06 15:24 ` Dan Magenheimer
2012-08-06 15:47 ` Pekka Enberg
2012-08-06 16:21 ` Dan Magenheimer
2012-08-06 16:29 ` Greg Kroah-Hartman
2012-08-06 16:38 ` Dan Magenheimer
2012-08-07  0:44 ` Minchan Kim
2012-08-07 19:28 ` Konrad Rzeszutek Wilk
2012-07-27 18:18   Seth Jennings
2012-07-27 18:18 ` [PATCH 1/4] zsmalloc: collapse internal .h into .c Seth Jennings
2012-07-27 18:18 ` [PATCH 2/4] zsmalloc: promote to mm/ Seth Jennings
2012-07-27 18:18 ` [PATCH 3/4] drivers: add memory management driver class Seth Jennings
2012-07-31 15:31 ` Konrad Rzeszutek Wilk
2012-07-27 18:18 ` [PATCH 4/4] zcache: promote to drivers/mm/ Seth Jennings
2012-07-29  2:20 ` [PATCH 0/4] promote zcache from staging Minchan Kim
2012-08-07 20:23 ` Seth Jennings
2012-08-07 21:47 ` Dan Magenheimer
2012-08-08 16:29 ` Seth Jennings
2012-08-08 17:47 ` Dan Magenheimer
2012-08-09 18:50 ` Seth Jennings
2012-08-09 20:20 ` Dan Magenheimer
2012-08-10 18:14 ` Seth Jennings
2012-08-15  9:38 ` Konrad Rzeszutek Wilk
2012-08-15 14:24 ` Seth Jennings
2012-08-17 22:21 ` Dan Magenheimer
2012-08-17 23:33 ` Seth Jennings
2012-08-18 19:09 ` Dan Magenheimer
2012-08-14 22:18 ` Seth Jennings
2012-08-14 23:29 ` Minchan Kim
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).