* Re: [PATCH 2/2] x86: use pv-ops in {pte,pmd}_{set,clear}_flags() [not found] ` <CAKbGBLiVqaHEOZx6y4MW4xDTUdKRhVLZXTTGiqYT7vuH2Wgeww@mail.gmail.com> @ 2014-03-25 20:16 ` Linus Torvalds 2014-03-31 12:26 ` Mel Gorman 0 siblings, 1 reply; 10+ messages in thread From: Linus Torvalds @ 2014-03-25 20:16 UTC (permalink / raw) To: Steven Noonan, Mel Gorman, Rik van Riel Cc: David Vrabel, Andrew Morton, linux-mm On Mon, Mar 24, 2014 at 8:31 AM, Steven Noonan <steven@uplinklabs.net> wrote: > Vrabel's comments make me think that revisiting the elimination of the > _PAGE_NUMA bit implementation would be a good idea... should I CC you > on this discussion (not sure if you're subscribed to xen-devel, or if > LKML is a better place for that discussion)? I detest the PAGE_NUMA games myself, but: From: David Vrabel <david.vrabel@citrix.com>: > > I really do not understand how you're supposed to distinguish between a > PTE for a PROT_NONE page and one with _PAGE_NUMA -- they're identical. > i.e., pte_numa() will return true for a PROT_NONE protected page which > just seems wrong to me. The way to distinguish PAGE_NUMA from PROTNONE is *supposed* to be by looking at the vma, and PROTNONE goes together with a vma with PROT_NONE. That's what the comments in pgtable_types.h say. However, as far as I can tell, that is pure and utter bullshit. It's true that generally handle_mm_fault() shouldn't be called for PROT_NONE pages, since it will fail the protection checks. However, we have FOLL_FORCE that overrides those protection checks for things like ptrace etc. So people have tried to convince me that _PAGE_NUMA works, but I'm really not at all sure they are right. I fundamentally think that it was a horrible horrible disaster to make _PAGE_NUMA alias onto _PAGE_PROTNONE. But I'm cc'ing the people who tried to convince me otherwise last time around, to see if they can articulate this mess better now. The argument *seems* to be that if things are truly PROT_NONE, then the page will never be touched by page faulting code (and as mentioned, I think that argument is fundamentally broken), and if it's PROT_NUMA then the page faulting code will magically do the right thing. FURTHERMORE, the argument was that we can't just call things PROT_NONE and just say that "those are the semantics", because on other architectures PROT_NONE might mean/do something else. Which really makes no sense either, because if the argument was that PROT_NONE causes faults that can either be handled as faults (for PROT_NONE) or as NUMA issues (for NUMA), then dammit, that argument should be completely architecture-independent. But I gave up arguing with people. I will state (again) that I think this is a f*cking mess, and saying that PROTNONE and NUMA are somehow the exact same thing on x86 but not in general is bogus crap. And saying that you can determine which it is from the vma is very debatable too. Let the people responsible for the crap try to explain why it works and has to be that mess. Again. Rik, Mel? Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] x86: use pv-ops in {pte,pmd}_{set,clear}_flags() 2014-03-25 20:16 ` [PATCH 2/2] x86: use pv-ops in {pte,pmd}_{set,clear}_flags() Linus Torvalds @ 2014-03-31 12:26 ` Mel Gorman 2014-03-31 15:41 ` Linus Torvalds 0 siblings, 1 reply; 10+ messages in thread From: Mel Gorman @ 2014-03-31 12:26 UTC (permalink / raw) To: Linus Torvalds Cc: Steven Noonan, Rik van Riel, David Vrabel, Andrew Morton, Ingo Molnar, Peter Zijlstra, linux-mm On Tue, Mar 25, 2014 at 01:16:02PM -0700, Linus Torvalds wrote: > On Mon, Mar 24, 2014 at 8:31 AM, Steven Noonan <steven@uplinklabs.net> wrote: > > Vrabel's comments make me think that revisiting the elimination of the > > _PAGE_NUMA bit implementation would be a good idea... should I CC you > > on this discussion (not sure if you're subscribed to xen-devel, or if > > LKML is a better place for that discussion)? > > I detest the PAGE_NUMA games myself, but: > First of all, sorry for the slow response even by my standards. I was at LSF/MM and Collaboration all last week and it took up all my attention. Today is my first day back properly online and trawling through the inbox mess. > From: David Vrabel <david.vrabel@citrix.com>: > > > > I really do not understand how you're supposed to distinguish between a > > PTE for a PROT_NONE page and one with _PAGE_NUMA -- they're identical. > > i.e., pte_numa() will return true for a PROT_NONE protected page which > > just seems wrong to me. > > The way to distinguish PAGE_NUMA from PROTNONE is *supposed* to be by > looking at the vma, and PROTNONE goes together with a vma with > PROT_NONE. That's what the comments in pgtable_types.h say. > This is the expectation. We did not want to even attempt tracking NUMA hints on a per-VMA basis because the fault handler would go to hell with the need to fixup vmas. > However, as far as I can tell, that is pure and utter bullshit. It's > true that generally handle_mm_fault() shouldn't be called for > PROT_NONE pages, since it will fail the protection checks. However, we > have FOLL_FORCE that overrides those protection checks for things like > ptrace etc. So people have tried to convince me that _PAGE_NUMA works, > but I'm really not at all sure they are right. > For FOLL_FORCE, we do not set FOLL_NUMA in this chunk here /* * If FOLL_FORCE and FOLL_NUMA are both set, handle_mm_fault * would be called on PROT_NONE ranges. We must never invoke * handle_mm_fault on PROT_NONE ranges or the NUMA hinting * page faults would unprotect the PROT_NONE ranges if * _PAGE_NUMA and _PAGE_PROTNONE are sharing the same pte/pmd * bitflag. So to avoid that, don't set FOLL_NUMA if * FOLL_FORCE is set. */ if (!(gup_flags & FOLL_FORCE)) gup_flags |= FOLL_NUMA; Without FOLL_NUMA, we do not do "pmd_numa" checks because they cannot distinguish between a prot_none and pmd_numa as they use identical bits on x86. This is in follow_page_mask if ((flags & FOLL_NUMA) && pmd_numa(*pmd)) goto no_page_table; Without the checks FOLL_FORCE would screw up when it encountered a page protected for NUMA hinting faults. I recognise that it further muddies the waters on what _PAGE_NUMA actually means. A potential alternative would have been to have two pte bits -- _PAGE_NONE and an unused PTE bit (if there is one) that we'd call_PAGE_NUMA where a pmd_mknuma sets both. The _PAGE_NONE is what would cause a hinting fault but we'd use the second bit to distinguish between PROT_NONE and a NUMA hinting fault. I doubt the end result would be much cleaner though and it would be a mess. Another alternative is to simply not allow NUMA_BALANCING on Xen. It's not even clear what it means as the Xen NUMA topology may or may not correspond to underlying physical nodes. It's even less clear what happens if both guest and host use automatic balancing. > I fundamentally think that it was a horrible horrible disaster to make > _PAGE_NUMA alias onto _PAGE_PROTNONE. > We did not have much of a choice. We needed something that would trap a fault and _PAGE_PROTNONE is not available on all architectures. ppc64 reused _PAGE_COHERENT for example. > But I'm cc'ing the people who tried to convince me otherwise last time > around, to see if they can articulate this mess better now. > > The argument *seems* to be that if things are truly PROT_NONE, then > the page will never be touched by page faulting code (and as > mentioned, I think that argument is fundamentally broken), and if it's > PROT_NUMA then the page faulting code will magically do the right > thing. > This is essentially the argument with the addendum that follow_page is meant to avoid trying pmd_numa checks on FOLL_FORCE. > FURTHERMORE, the argument was that we can't just call things PROT_NONE > and just say that "those are the semantics", because on other > architectures PROT_NONE might mean/do something else. Or that the equivalent of _PAGE_PROTNONE did not exist and was implemented by some other means. > Which really > makes no sense either, because if the argument was that PROT_NONE > causes faults that can either be handled as faults (for PROT_NONE) or > as NUMA issues (for NUMA), then dammit, that argument should be > completely architecture-independent. > > But I gave up arguing with people. I will state (again) that I think > this is a f*cking mess, and saying that PROTNONE and NUMA are somehow > the exact same thing on x86 but not in general is bogus crap. And > saying that you can determine which it is from the vma is very > debatable too. > Ok, so how do you suggest that _PAGE_NUMA could have been implemented that did *not* use _PAGE_PROTNONE on x86, trapped a fault and was not expensive as hell to handle? -- Mel Gorman SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] x86: use pv-ops in {pte,pmd}_{set,clear}_flags() 2014-03-31 12:26 ` Mel Gorman @ 2014-03-31 15:41 ` Linus Torvalds 2014-03-31 16:10 ` Linus Torvalds 2014-04-01 18:18 ` David Vrabel 0 siblings, 2 replies; 10+ messages in thread From: Linus Torvalds @ 2014-03-31 15:41 UTC (permalink / raw) To: Mel Gorman Cc: Steven Noonan, Rik van Riel, David Vrabel, Andrew Morton, Ingo Molnar, Peter Zijlstra, linux-mm On Mon, Mar 31, 2014 at 5:26 AM, Mel Gorman <mgorman@suse.de> wrote: > > Ok, so how do you suggest that _PAGE_NUMA could have been implemented > that did *not* use _PAGE_PROTNONE on x86, trapped a fault and was not > expensive as hell to handle? So on x86, the obvious model is to use another bit. We've got several. The _PAGE_NUMA case only matters for when _PAGE_PRESENT is clear, and when that bit is clear the hardware doesn't care about any of the other bits. Currently we use: #define _PAGE_BIT_PROTNONE _PAGE_BIT_GLOBAL #define _PAGE_BIT_FILE _PAGE_BIT_DIRTY which are bits 8 and 6 respectively, afaik. and the only rule is that (a) we should *not* use a bit we already use when the page is not present (since that is ambiguous!) and (b) we should *not* use a bit that is used by the swap index cases. I think bit 7 should work, but maybe I missed something. Can somebody tell me why _PAGE_NUMA is *not* that bit seven? Make "pte_present()" on x86 just check all of the present/numa/protnone bits, and if any of them is set, it's a "present" page. Now, unlike x86, some other architectures do *not* have free bits, so there may be problems elsewhere. Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] x86: use pv-ops in {pte,pmd}_{set,clear}_flags() 2014-03-31 15:41 ` Linus Torvalds @ 2014-03-31 16:10 ` Linus Torvalds 2014-03-31 16:27 ` Cyrill Gorcunov 2014-04-01 18:18 ` David Vrabel 1 sibling, 1 reply; 10+ messages in thread From: Linus Torvalds @ 2014-03-31 16:10 UTC (permalink / raw) To: Mel Gorman, Peter Anvin, Ingo Molnar, the arch/x86 maintainers Cc: Steven Noonan, Rik van Riel, David Vrabel, Andrew Morton, Peter Zijlstra, linux-mm, Cyrill Gorcunov [ Adding x86 maintainers - Ingo was involved earlier, make it more explicit ] On Mon, Mar 31, 2014 at 8:41 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > So on x86, the obvious model is to use another bit. We've got several. > The _PAGE_NUMA case only matters for when _PAGE_PRESENT is clear, and > when that bit is clear the hardware doesn't care about any of the > other bits. Currently we use: > > #define _PAGE_BIT_PROTNONE _PAGE_BIT_GLOBAL > #define _PAGE_BIT_FILE _PAGE_BIT_DIRTY > > which are bits 8 and 6 respectively, afaik. Side note to the x86 guys: I think it was a mistake (long long long ago) to define these "valid when not present" bits in terms of the "valid when present" bits. It causes insane situations like this: #if _PAGE_BIT_FILE < _PAGE_BIT_PROTNONE which makes no sense *except* if you think that those bits can have random odd hardware-defined values. But they really can't. They are just random bit numbers we chose. It has *also* caused horrible pain with the whole "soft dirty" thing, and we have absolutely ridiculous macros in pgtable-2level.h for the insane soft-dirty case, trying to keep the swap bits spread out "just right" to make soft-dirty (_PAGE_BIT_HIDDEN aka bit 11) not alias with the bits we use for swap offsets etc. So how about we just say: - define the bits we use when PAGE_PRESENT==0 separately and explicitly - clean up the crazy soft-dirty crap, preferably by just making it depend on a 64-bit pte (so you have to have X86_PAE enabled or be on x86-64) that would sound like a good cleanup, because right now it's a complete nightmare to think about which bits are used how when P is 0. The above insane #if being the prime example of that confusion. Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] x86: use pv-ops in {pte,pmd}_{set,clear}_flags() 2014-03-31 16:10 ` Linus Torvalds @ 2014-03-31 16:27 ` Cyrill Gorcunov 0 siblings, 0 replies; 10+ messages in thread From: Cyrill Gorcunov @ 2014-03-31 16:27 UTC (permalink / raw) To: Linus Torvalds Cc: Mel Gorman, Peter Anvin, Ingo Molnar, the arch/x86 maintainers, Steven Noonan, Rik van Riel, David Vrabel, Andrew Morton, Peter Zijlstra, linux-mm On Mon, Mar 31, 2014 at 09:10:07AM -0700, Linus Torvalds wrote: > > It causes insane situations like this: > > #if _PAGE_BIT_FILE < _PAGE_BIT_PROTNONE > > which makes no sense *except* if you think that those bits can have > random odd hardware-defined values. But they really can't. They are > just random bit numbers we chose. I never understand this ifdef (I've asked once but got no reply). > It has *also* caused horrible pain with the whole "soft dirty" thing, > and we have absolutely ridiculous macros in pgtable-2level.h for the > insane soft-dirty case, trying to keep the swap bits spread out "just > right" to make soft-dirty (_PAGE_BIT_HIDDEN aka bit 11) not alias with > the bits we use for swap offsets etc. > > So how about we just say: > > - define the bits we use when PAGE_PRESENT==0 separately and explicitly > > - clean up the crazy soft-dirty crap, preferably by just making it > depend on a 64-bit pte (so you have to have X86_PAE enabled or be on > x86-64) Sounds good for me, i'll try my best (if noone object). > > that would sound like a good cleanup, because right now it's a > complete nightmare to think about which bits are used how when P is 0. > The above insane #if being the prime example of that confusion. Cyrill -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] x86: use pv-ops in {pte,pmd}_{set,clear}_flags() 2014-03-31 15:41 ` Linus Torvalds 2014-03-31 16:10 ` Linus Torvalds @ 2014-04-01 18:18 ` David Vrabel 2014-04-01 18:43 ` Linus Torvalds 1 sibling, 1 reply; 10+ messages in thread From: David Vrabel @ 2014-04-01 18:18 UTC (permalink / raw) To: Linus Torvalds Cc: Mel Gorman, Steven Noonan, Rik van Riel, Andrew Morton, Ingo Molnar, Peter Zijlstra, linux-mm On 31/03/14 16:41, Linus Torvalds wrote: > On Mon, Mar 31, 2014 at 5:26 AM, Mel Gorman <mgorman@suse.de> wrote: >> >> Ok, so how do you suggest that _PAGE_NUMA could have been implemented >> that did *not* use _PAGE_PROTNONE on x86, trapped a fault and was not >> expensive as hell to handle? > > So on x86, the obvious model is to use another bit. We've got several. > The _PAGE_NUMA case only matters for when _PAGE_PRESENT is clear, and > when that bit is clear the hardware doesn't care about any of the > other bits. Currently we use: > > #define _PAGE_BIT_PROTNONE _PAGE_BIT_GLOBAL > #define _PAGE_BIT_FILE _PAGE_BIT_DIRTY > > which are bits 8 and 6 respectively, afaik. > > and the only rule is that (a) we should *not* use a bit we already use > when the page is not present (since that is ambiguous!) and (b) we > should *not* use a bit that is used by the swap index cases. I think > bit 7 should work, but maybe I missed something. I don't think it's sufficient to avoid collisions with bits used only with P=0. The original value of this bit must be retained when the _PAGE_NUMA bit is set/cleared. Bit 7 is PAT[2] and whilst Linux currently sets up the PAT such that PAT[2] is a 'don't care', there has been talk up adjusting the PAT to include more types. So I'm not sure it's a good idea to use bit 7. What's wrong with using e.g., bit 62? And not supporting this NUMA rebalancing feature on 32-bit non-PAE builds? David > Can somebody tell me why _PAGE_NUMA is *not* that bit seven? Make > "pte_present()" on x86 just check all of the present/numa/protnone > bits, and if any of them is set, it's a "present" page. > > Now, unlike x86, some other architectures do *not* have free bits, so > there may be problems elsewhere. > > Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] x86: use pv-ops in {pte,pmd}_{set,clear}_flags() 2014-04-01 18:18 ` David Vrabel @ 2014-04-01 18:43 ` Linus Torvalds 2014-04-01 19:03 ` Cyrill Gorcunov 0 siblings, 1 reply; 10+ messages in thread From: Linus Torvalds @ 2014-04-01 18:43 UTC (permalink / raw) To: David Vrabel Cc: Mel Gorman, Steven Noonan, Rik van Riel, Andrew Morton, Ingo Molnar, Peter Zijlstra, linux-mm On Tue, Apr 1, 2014 at 11:18 AM, David Vrabel <david.vrabel@citrix.com> wrote: > > I don't think it's sufficient to avoid collisions with bits used only > with P=0. The original value of this bit must be retained when the > _PAGE_NUMA bit is set/cleared. > > Bit 7 is PAT[2] and whilst Linux currently sets up the PAT such that > PAT[2] is a 'don't care', there has been talk up adjusting the PAT to > include more types. So I'm not sure it's a good idea to use bit 7. > > What's wrong with using e.g., bit 62? And not supporting this NUMA > rebalancing feature on 32-bit non-PAE builds? Sounds good to me, but it's not available in 32-bit PAE. The high bits are all reserved, afaik. But you'd have to be insane to care about NUMA balancing on 32-bit, even with PAE. So restricting it to x86-64 and using the high bits (I think bits 52-62 are all available to SW) sounds fine to me. Same goes for soft-dirty. I think it's fine if we say that you won't have soft-dirty with a 32-bit kernel. Even with PAE. Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] x86: use pv-ops in {pte,pmd}_{set,clear}_flags() 2014-04-01 18:43 ` Linus Torvalds @ 2014-04-01 19:03 ` Cyrill Gorcunov 2014-04-02 11:33 ` Pavel Emelyanov 0 siblings, 1 reply; 10+ messages in thread From: Cyrill Gorcunov @ 2014-04-01 19:03 UTC (permalink / raw) To: Linus Torvalds, Pavel Emelyanov Cc: David Vrabel, Mel Gorman, Steven Noonan, Rik van Riel, Andrew Morton, Ingo Molnar, Peter Zijlstra, linux-mm On Tue, Apr 01, 2014 at 11:43:11AM -0700, Linus Torvalds wrote: > On Tue, Apr 1, 2014 at 11:18 AM, David Vrabel <david.vrabel@citrix.com> wrote: > > > > I don't think it's sufficient to avoid collisions with bits used only > > with P=0. The original value of this bit must be retained when the > > _PAGE_NUMA bit is set/cleared. > > > > Bit 7 is PAT[2] and whilst Linux currently sets up the PAT such that > > PAT[2] is a 'don't care', there has been talk up adjusting the PAT to > > include more types. So I'm not sure it's a good idea to use bit 7. > > > > What's wrong with using e.g., bit 62? And not supporting this NUMA > > rebalancing feature on 32-bit non-PAE builds? > > Sounds good to me, but it's not available in 32-bit PAE. The high bits > are all reserved, afaik. > > But you'd have to be insane to care about NUMA balancing on 32-bit, > even with PAE. So restricting it to x86-64 and using the high bits (I > think bits 52-62 are all available to SW) sounds fine to me. > > Same goes for soft-dirty. I think it's fine if we say that you won't > have soft-dirty with a 32-bit kernel. Even with PAE. Well, at the moment we use soft-dirty for x86-64 only in criu but there were plans to implement complete 32bit support as well. While personally I don't mind dropping soft-dirty for non x86-64 case, I would like to hear Pavel's opinion, Pavel? (n.b, i'm still working on cleaning up _page bits, it appeared to be harder than I've been expecting). -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] x86: use pv-ops in {pte,pmd}_{set,clear}_flags() 2014-04-01 19:03 ` Cyrill Gorcunov @ 2014-04-02 11:33 ` Pavel Emelyanov 2014-04-02 13:29 ` Cyrill Gorcunov 0 siblings, 1 reply; 10+ messages in thread From: Pavel Emelyanov @ 2014-04-02 11:33 UTC (permalink / raw) To: Cyrill Gorcunov Cc: Linus Torvalds, David Vrabel, Mel Gorman, Steven Noonan, Rik van Riel, Andrew Morton, Ingo Molnar, Peter Zijlstra, linux-mm On 04/01/2014 11:03 PM, Cyrill Gorcunov wrote: > On Tue, Apr 01, 2014 at 11:43:11AM -0700, Linus Torvalds wrote: >> On Tue, Apr 1, 2014 at 11:18 AM, David Vrabel <david.vrabel@citrix.com> wrote: >>> >>> I don't think it's sufficient to avoid collisions with bits used only >>> with P=0. The original value of this bit must be retained when the >>> _PAGE_NUMA bit is set/cleared. >>> >>> Bit 7 is PAT[2] and whilst Linux currently sets up the PAT such that >>> PAT[2] is a 'don't care', there has been talk up adjusting the PAT to >>> include more types. So I'm not sure it's a good idea to use bit 7. >>> >>> What's wrong with using e.g., bit 62? And not supporting this NUMA >>> rebalancing feature on 32-bit non-PAE builds? >> >> Sounds good to me, but it's not available in 32-bit PAE. The high bits >> are all reserved, afaik. >> >> But you'd have to be insane to care about NUMA balancing on 32-bit, >> even with PAE. So restricting it to x86-64 and using the high bits (I >> think bits 52-62 are all available to SW) sounds fine to me. >> >> Same goes for soft-dirty. I think it's fine if we say that you won't >> have soft-dirty with a 32-bit kernel. Even with PAE. > > Well, at the moment we use soft-dirty for x86-64 only in criu but there > were plans to implement complete 32bit support as well. While personally > I don't mind dropping soft-dirty for non x86-64 case, I would like > to hear Pavel's opinion, Pavel? We (Parallels) don't have plans on C/R on 32-bit kernels, but I speak only for Parallels. However, people I know who need 32-bit C/R use ARM :) > (n.b, i'm still working on cleaning up _page bits, it appeared to > be harder than I've been expecting). > . Thanks, Pavel -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] x86: use pv-ops in {pte,pmd}_{set,clear}_flags() 2014-04-02 11:33 ` Pavel Emelyanov @ 2014-04-02 13:29 ` Cyrill Gorcunov 0 siblings, 0 replies; 10+ messages in thread From: Cyrill Gorcunov @ 2014-04-02 13:29 UTC (permalink / raw) To: Pavel Emelyanov Cc: Linus Torvalds, David Vrabel, Mel Gorman, Steven Noonan, Rik van Riel, Andrew Morton, Ingo Molnar, Peter Zijlstra, linux-mm On Wed, Apr 02, 2014 at 03:33:48PM +0400, Pavel Emelyanov wrote: ... > >> > >> But you'd have to be insane to care about NUMA balancing on 32-bit, > >> even with PAE. So restricting it to x86-64 and using the high bits (I > >> think bits 52-62 are all available to SW) sounds fine to me. > >> > >> Same goes for soft-dirty. I think it's fine if we say that you won't > >> have soft-dirty with a 32-bit kernel. Even with PAE. > > > > Well, at the moment we use soft-dirty for x86-64 only in criu but there > > were plans to implement complete 32bit support as well. While personally > > I don't mind dropping soft-dirty for non x86-64 case, I would like > > to hear Pavel's opinion, Pavel? > > We (Parallels) don't have plans on C/R on 32-bit kernels, but I speak only > for Parallels. However, people I know who need 32-bit C/R use ARM :) OK, since it's x86 specific I can prepare patch for dropping softdirty on x86-32 (this will release ugly macros in file mapping a bit but not that significantly). Guys, while looking into how to re-define _PAGE bits for case where present bit is dropped I though about the form like #define _PAGE_BIT_FILE (_PAGE_BIT_PRESENT + 1) /* _PAGE_BIT_RW */ #define _PAGE_BIT_NUMA (_PAGE_BIT_PRESENT + 2) /* _PAGE_BIT_USER */ #define _PAGE_BIT_PROTNONE (_PAGE_BIT_PRESENT + 3) /* _PAGE_BIT_PWT */ and while _PAGE_BIT_FILE case should work (as well as swap pages), I'm not that sure about the numa and protnone case. I fear there are some code paths which depends on the former bits positions -- ie when PAGE_BIT_PROTNONE = _PAGE_BIT_NUMA = _PAGE_BIT_GLOBAL. One of the _PAGE_BIT_GLOBAL user is the page attributes code. It seems to always check _PAGE_BIT_PRESENT together with _PAGE_BIT_GLOBAL, so if _PAGE_BIT_PROTNONE get redefined to a new value it should not fail. Thus main concern is protnone + numa code, which I must admit I don't know well enough yet. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2014-04-02 13:29 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <1395425902-29817-1-git-send-email-david.vrabel@citrix.com> [not found] ` <1395425902-29817-3-git-send-email-david.vrabel@citrix.com> [not found] ` <533016CB.4090807@citrix.com> [not found] ` <CAKbGBLiVqaHEOZx6y4MW4xDTUdKRhVLZXTTGiqYT7vuH2Wgeww@mail.gmail.com> 2014-03-25 20:16 ` [PATCH 2/2] x86: use pv-ops in {pte,pmd}_{set,clear}_flags() Linus Torvalds 2014-03-31 12:26 ` Mel Gorman 2014-03-31 15:41 ` Linus Torvalds 2014-03-31 16:10 ` Linus Torvalds 2014-03-31 16:27 ` Cyrill Gorcunov 2014-04-01 18:18 ` David Vrabel 2014-04-01 18:43 ` Linus Torvalds 2014-04-01 19:03 ` Cyrill Gorcunov 2014-04-02 11:33 ` Pavel Emelyanov 2014-04-02 13:29 ` Cyrill Gorcunov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).