From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate2.de.ibm.com (8.12.10/8.12.10) with ESMTP id j946pNd7187764 for ; Tue, 4 Oct 2005 06:51:23 GMT
Received: from d12av02.megacenter.de.ibm.com (d12av02.megacenter.de.ibm.com [9.149.165.228]) by d12nrmr1607.megacenter.de.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j946pN27147356 for ; Tue, 4 Oct 2005 08:51:23 +0200
Received: from d12av02.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av02.megacenter.de.ibm.com (8.12.11/8.13.3) with ESMTP id j946pNiR005718 for ; Tue, 4 Oct 2005 08:51:23 +0200
Received: from localhost (dyn-9-152-216-41.boeblingen.de.ibm.com [9.152.216.41]) by d12av02.megacenter.de.ibm.com (8.12.11/8.12.11) with ESMTP id j946pNAZ005713 for ; Tue, 4 Oct 2005 08:51:23 +0200
Date: Tue, 4 Oct 2005 08:50:30 +0200
From: Heiko Carstens
Subject: sparsemem & sparsemem extreme question
Message-ID: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-linux-mm@kvack.org
Return-Path:
To: linux-mm@kvack.org
List-ID:

Hi all,

I did an implementation of CONFIG_SPARSEMEM for s390, which indeed was quite easy. Just to find out that it was not sufficient :)
SPARSEMEM_EXTREME looks better but unfortunately adds another layer of indirection.
I'm just wondering why there is all this indirection stuff here and why not have one contiguous array of struct pages (residing in the vmalloc area) that deals with whatever size of memory an architecture wants to support.
Unused areas just wouldn't have any backing with real pages and on access generate a page fault (nobody is supposed to access these pages anyway).
This would have the advantage that all the primitives like e.g.
pfn_to_page would be as simple as before, no need to waste large parts of the page flags and in addition it would easily allow for memory hotplug on page size granularity.
The only drawback is (as far as I can see) a _huge_ virtual mem_map array, but that shouldn't matter too much. A real problem could be that the mem_map array and therefore the vmalloc area need to be generated quite early.

Most probably this has already been thought about before, but I couldn't find anything in the archives.

Heiko
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <434292D3.2040105@shadowen.org>
Date: Tue, 04 Oct 2005 15:33:55 +0100
From: Andy Whitcroft
MIME-Version: 1.0
Subject: Re: sparsemem & sparsemem extreme question
References: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com>
In-Reply-To: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Heiko Carstens
Cc: linux-mm@kvack.org
List-ID:

Heiko Carstens wrote:
> I did an implementation of CONFIG_SPARSEMEM for s390, which indeed was quite
> easy. Just to find out that it was not sufficient :)
> SPARSEMEM_EXTREME looks better but unfortunately adds another layer of
> indirection.
> I'm just wondering why there is all this indirection stuff here and why not
> have one contiguous array of struct pages (residing in the vmalloc area) that
> deals with whatever size of memory an architecture wants to support.
> Unused areas just wouldn't have any backing with real pages and on access
> generate a page fault (nobody is supposed to access these pages anyway).
> This would have the advantage that all the primitives like e.g.
pfn_to_page
> would be as simple as before, no need to waste large parts of the page flags
> and in addition it would easily allow for memory hotplug on page size
> granularity.
> The only drawback is (as far as I can see) a _huge_ virtual mem_map array,
> but that shouldn't matter too much. A real problem could be that the mem_map
> array and therefore the vmalloc area need to be generated quite early.
>
> Most probably this has already been thought about before, but I couldn't find
> anything in the archives.

During the implementation of SPARSEMEM_EXTREME other layouts such as the huge 'partially populated' mem_map were considered. For a number of our target architectures kernel virtual address space is at a premium so this would not be suitable for them.

We did consider whether to have different mechanisms for KVA rich architectures but (if I remember correctly) benchmarking the implementation seemed to indicate that the additional indirection was insignificant if even detectable. The architecture of sparsemem is supposed to allow architecture specific implementations should that be necessary but I've not yet seen a compelling argument for one.

On the subject of page flags, I would point out that SPARSEMEM either reuses already used bits for 32 bit architectures, or makes use of unused bits in the 64 bit case. It doesn't reduce the number of flag bits available.

-apw
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
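
A minimal sketch of the page-flags point in the message above: a section number can be packed into the high bits of a flags word without touching the low flag bits. The widths and helper names here (`WORD_BITS`, `SECTION_ID_BITS`, `set_section`, `get_section`) are illustrative assumptions and do not reproduce the kernel's actual layout.

```python
WORD_BITS = 64           # assumption: width of the flags word
SECTION_ID_BITS = 20     # assumption: room for 2**20 section numbers
SECTION_SHIFT = WORD_BITS - SECTION_ID_BITS
SECTION_MASK = (1 << SECTION_ID_BITS) - 1
FLAG_MASK = (1 << SECTION_SHIFT) - 1   # low bits stay available for flags


def set_section(flags, section):
    """Store a section number in the high bits, keeping the low flag bits."""
    if not 0 <= section <= SECTION_MASK:
        raise ValueError("section number does not fit")
    return (flags & FLAG_MASK) | (section << SECTION_SHIFT)


def get_section(flags):
    """Recover the section number from the high bits."""
    return (flags >> SECTION_SHIFT) & SECTION_MASK


flags = 0b1011                        # some ordinary flag bits
tagged = set_section(flags, 0x12345)  # hypothetical section number
print(hex(get_section(tagged)), bin(tagged & FLAG_MASK))
```

In this model the low 44 bits remain usable as flags, which is the sense in which the high bits are "unused" on a 64 bit word.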
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236]) by e1.ny.us.ibm.com (8.12.11/8.12.11) with ESMTP id j94GFCUd012820 for ; Tue, 4 Oct 2005 12:15:12 -0400
Received: from d01av02.pok.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by d01relay04.pok.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j94GFChH104350 for ; Tue, 4 Oct 2005 12:15:12 -0400
Received: from d01av02.pok.ibm.com (loopback [127.0.0.1]) by d01av02.pok.ibm.com (8.12.11/8.13.3) with ESMTP id j94GFBMa018239 for ; Tue, 4 Oct 2005 12:15:11 -0400
Subject: Re: sparsemem & sparsemem extreme question
From: Dave Hansen
In-Reply-To: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com>
References: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com>
Content-Type: text/plain
Date: Tue, 04 Oct 2005 09:15:02 -0700
Message-Id: <1128442502.20208.6.camel@localhost>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Heiko Carstens
Cc: linux-mm
List-ID:

On Tue, 2005-10-04 at 08:50 +0200, Heiko Carstens wrote:
> I'm just wondering why there is all this indirection stuff here and why not
> have one contiguous array of struct pages (residing in the vmalloc area) that
> deals with whatever size of memory an architecture wants to support.

This is exactly what ia64 does today. Programmatically, it does remove a layer of indirection. However, there are some data structures that have to be traversed during a lookup: the page tables. Granted, the TLB will provide some caching, but a lookup on ia64 can potentially be much more expensive than the two cacheline misses that sparsemem extreme might have.

In the end no one has ever produced any compelling performance reason to use a vmem_map (as ia64 calls it). In addition, sparsemem doesn't cause any known performance regressions, either.

-- Dave
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org.
For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate2.de.ibm.com (8.12.10/8.12.10) with ESMTP id j956e7d7179720 for ; Wed, 5 Oct 2005 06:40:07 GMT Received: from d12av02.megacenter.de.ibm.com (d12av02.megacenter.de.ibm.com [9.149.165.228]) by d12nrmr1607.megacenter.de.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j956e7vk162946 for ; Wed, 5 Oct 2005 08:40:07 +0200 Received: from d12av02.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av02.megacenter.de.ibm.com (8.12.11/8.13.3) with ESMTP id j956e7HD031403 for ; Wed, 5 Oct 2005 08:40:07 +0200 Date: Wed, 5 Oct 2005 08:39:09 +0200 From: Heiko Carstens Subject: Re: sparsemem & sparsemem extreme question Message-ID: <20051005063909.GA9699@osiris.boeblingen.de.ibm.com> References: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com> <1128442502.20208.6.camel@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1128442502.20208.6.camel@localhost> Sender: owner-linux-mm@kvack.org Return-Path: To: Dave Hansen Cc: linux-mm List-ID: > > I'm just wondering why there is all this indirection stuff here and why not > > have one contiguous aray of struct pages (residing in the vmalloc area) that > > deals with whatever size of memory an architecture wants to support. > This is exactly what ia64 does today. Programatically, it does remove a > layer of indirection. However, there are some data structures that have > to be traversed during a lookup: the page tables. Granted, the TLB will > provide some caching, but a lookup on ia64 can potentially be much more > expensive than the two cacheline misses that sparsemem extreme might > have. Sure, just that on s390 we have a 1:1 mapping anyway. So these lookups would be more or less for free for us (compared to what we have now). 
> In the end no one has ever produced any compelling performance reason to > use a vmem_map (as ia64 calls it). In addition, sparsemem doesn't cause > any known performance regressions, either. As far as I understand the memory hotplug patches they won't work without SPARSEMEM support. So the ia64 approach with a vmem_map will not work here, right? Actually my concern is that whenever the address space that is covered with SPARSEMEM_EXTREME is not sufficient just another layer of indirection needs to be added. Heiko -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e31.co.us.ibm.com (8.12.11/8.12.11) with ESMTP id j95Fq0eO016061 for ; Wed, 5 Oct 2005 11:52:00 -0400 Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169]) by westrelay02.boulder.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j95FqpfK524106 for ; Wed, 5 Oct 2005 09:52:51 -0600 Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1]) by d03av03.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j95FqoWu004001 for ; Wed, 5 Oct 2005 09:52:50 -0600 Subject: Re: sparsemem & sparsemem extreme question From: Dave Hansen In-Reply-To: <20051005063909.GA9699@osiris.boeblingen.de.ibm.com> References: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com> <1128442502.20208.6.camel@localhost> <20051005063909.GA9699@osiris.boeblingen.de.ibm.com> Content-Type: text/plain Date: Wed, 05 Oct 2005 08:52:34 -0700 Message-Id: <1128527554.26009.2.camel@localhost> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Heiko Carstens Cc: linux-mm List-ID: On Wed, 2005-10-05 at 08:39 +0200, Heiko Carstens wrote: > > > I'm just wondering why there is all this indirection stuff here and why not > > > have 
one contiguous aray of struct pages (residing in the vmalloc area) that > > > deals with whatever size of memory an architecture wants to support. > > This is exactly what ia64 does today. Programatically, it does remove a > > layer of indirection. However, there are some data structures that have > > to be traversed during a lookup: the page tables. Granted, the TLB will > > provide some caching, but a lookup on ia64 can potentially be much more > > expensive than the two cacheline misses that sparsemem extreme might > > have. > > Sure, just that on s390 we have a 1:1 mapping anyway. So these lookups would > be more or less for free for us (compared to what we have now). Is the 1:1 mapping done with pagetables? If so, it is not free. > > In the end no one has ever produced any compelling performance reason to > > use a vmem_map (as ia64 calls it). In addition, sparsemem doesn't cause > > any known performance regressions, either. > > As far as I understand the memory hotplug patches they won't work without > SPARSEMEM support. So the ia64 approach with a vmem_map will not work here, > right? If we had vmem_map implemented for every arch that supported memory hotplug, and it didn't have performance implications, then we could have vmem_map everywhere, and use it for hotplug. > Actually my concern is that whenever the address space that is covered with > SPARSEMEM_EXTREME is not sufficient just another layer of indirection needs > to be added. Do you have any performance numbers to back up your concerns, or is it more about the code complexity? -- Dave -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate3.de.ibm.com (8.12.10/8.12.10) with ESMTP id j95FxJTZ192790 for ; Wed, 5 Oct 2005 15:59:19 GMT Received: from d12av02.megacenter.de.ibm.com (d12av02.megacenter.de.ibm.com [9.149.165.228]) by d12nrmr1607.megacenter.de.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j95FxJvk181536 for ; Wed, 5 Oct 2005 17:59:19 +0200 Received: from d12av02.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av02.megacenter.de.ibm.com (8.12.11/8.13.3) with ESMTP id j95FxJVo025336 for ; Wed, 5 Oct 2005 17:59:19 +0200 Date: Wed, 5 Oct 2005 17:58:23 +0200 From: Heiko Carstens Subject: Re: sparsemem & sparsemem extreme question Message-ID: <20051005155823.GA10119@osiris.ibm.com> References: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com> <1128442502.20208.6.camel@localhost> <20051005063909.GA9699@osiris.boeblingen.de.ibm.com> <1128527554.26009.2.camel@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1128527554.26009.2.camel@localhost> Sender: owner-linux-mm@kvack.org Return-Path: To: Dave Hansen Cc: linux-mm List-ID: > > > to be traversed during a lookup: the page tables. Granted, the TLB will > > > provide some caching, but a lookup on ia64 can potentially be much more > > > expensive than the two cacheline misses that sparsemem extreme might > > > have. > > Sure, just that on s390 we have a 1:1 mapping anyway. So these lookups would > > be more or less for free for us (compared to what we have now). > Is the 1:1 mapping done with pagetables? If so, it is not free. Sure, it's done with pagetables. What I meant: we have the 1:1 mapping already today. So adding anything to the vmalloc area won't make it more expensive. 
> > Actually my concern is that whenever the address space that is covered with
> > SPARSEMEM_EXTREME is not sufficient just another layer of indirection needs
> > to be added.
> Do you have any performance numbers to back up your concerns, or is it
> more about the code complexity?

No, my concern is that the s390 architecture will actually come up with some sort of memory that's present in the physical address space where the most significant bit of the addresses will be turned _on_. That means we would need to support the whole 64 bit physical address space...
That would mean at least one, if not two, additional indirection layers, which would make the code too complex, IMHO. That's why I think the vmem_map approach would be the easiest way to implement this :)

Heiko

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e35.co.us.ibm.com (8.12.11/8.12.11) with ESMTP id j95G34bN016919 for ; Wed, 5 Oct 2005 12:03:04 -0400
Received: from d03av01.boulder.ibm.com (d03av01.boulder.ibm.com [9.17.195.167]) by westrelay02.boulder.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j95G5mfK543026 for ; Wed, 5 Oct 2005 10:05:48 -0600
Received: from d03av01.boulder.ibm.com (loopback [127.0.0.1]) by d03av01.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j95G5lRm027978 for ; Wed, 5 Oct 2005 10:05:47 -0600
Subject: Re: sparsemem & sparsemem extreme question
From: Dave Hansen
In-Reply-To: <20051005155823.GA10119@osiris.ibm.com>
References: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com> <1128442502.20208.6.camel@localhost> <20051005063909.GA9699@osiris.boeblingen.de.ibm.com> <1128527554.26009.2.camel@localhost> <20051005155823.GA10119@osiris.ibm.com>
Content-Type: text/plain
Date: Wed, 05 Oct 2005
09:05:40 -0700 Message-Id: <1128528340.26009.8.camel@localhost> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Heiko Carstens Cc: linux-mm List-ID: On Wed, 2005-10-05 at 17:58 +0200, Heiko Carstens wrote: > we have the 1:1 mapping already > today. So adding anything to the vmalloc area won't make it more expensive. The pagetables themselves have a cost, as do the lookups. Those make it more expensive. On 64-bit machines the vaddr space is not an expense. > No, my concern is actually that the s390 archticture actually will come up > with some sort of memory that's present in the physical address space where > the most significant bit of the addresses will be turned _on_. Why do you think this? -- Dave -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate4.de.ibm.com (8.12.10/8.12.10) with ESMTP id j95GB5O0201440 for ; Wed, 5 Oct 2005 16:11:05 GMT Received: from d12av02.megacenter.de.ibm.com (d12av02.megacenter.de.ibm.com [9.149.165.228]) by d12nrmr1607.megacenter.de.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j95GB5vk164370 for ; Wed, 5 Oct 2005 18:11:05 +0200 Received: from d12av02.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av02.megacenter.de.ibm.com (8.12.11/8.13.3) with ESMTP id j95GB5GW003358 for ; Wed, 5 Oct 2005 18:11:05 +0200 Date: Wed, 5 Oct 2005 18:10:09 +0200 From: Heiko Carstens Subject: Re: sparsemem & sparsemem extreme question Message-ID: <20051005161009.GA10146@osiris.ibm.com> References: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com> <1128442502.20208.6.camel@localhost> <20051005063909.GA9699@osiris.boeblingen.de.ibm.com> <1128527554.26009.2.camel@localhost> <20051005155823.GA10119@osiris.ibm.com> 
<1128528340.26009.8.camel@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1128528340.26009.8.camel@localhost> Sender: owner-linux-mm@kvack.org Return-Path: To: Dave Hansen Cc: linux-mm List-ID: > > No, my concern is actually that the s390 archticture actually will come up > > with some sort of memory that's present in the physical address space where > > the most significant bit of the addresses will be turned _on_. > > Why do you think this? That's a matter of fact and what the Specs say... Heiko -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e35.co.us.ibm.com (8.12.11/8.12.11) with ESMTP id j95GHpgG030353 for ; Wed, 5 Oct 2005 12:17:51 -0400 Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169]) by westrelay02.boulder.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j95GKYfK547560 for ; Wed, 5 Oct 2005 10:20:34 -0600 Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1]) by d03av03.boulder.ibm.com (8.12.11/8.13.3) with ESMTP id j95GKYej031892 for ; Wed, 5 Oct 2005 10:20:34 -0600 Subject: Re: sparsemem & sparsemem extreme question From: Dave Hansen In-Reply-To: <20051005161009.GA10146@osiris.ibm.com> References: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com> <1128442502.20208.6.camel@localhost> <20051005063909.GA9699@osiris.boeblingen.de.ibm.com> <1128527554.26009.2.camel@localhost> <20051005155823.GA10119@osiris.ibm.com> <1128528340.26009.8.camel@localhost> <20051005161009.GA10146@osiris.ibm.com> Content-Type: text/plain Date: Wed, 05 Oct 2005 09:20:22 -0700 Message-Id: <1128529222.26009.16.camel@localhost> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: 
To: Heiko Carstens Cc: linux-mm List-ID: On Wed, 2005-10-05 at 18:10 +0200, Heiko Carstens wrote: > > > No, my concern is actually that the s390 archticture actually will come up > > > with some sort of memory that's present in the physical address space where > > > the most significant bit of the addresses will be turned _on_. > > > > Why do you think this? > > That's a matter of fact and what the Specs say... I'd appreciate any pointer to the relevant information, especially the stuff that explains just how sparse a physical address space can be on that architecture. What would discontigmem have done with the same layout? Does s390 even support discontigmem? -- Dave -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate2.de.ibm.com (8.12.10/8.12.10) with ESMTP id j95HDPd7081438 for ; Wed, 5 Oct 2005 17:13:25 GMT Received: from d12av02.megacenter.de.ibm.com (d12av02.megacenter.de.ibm.com [9.149.165.228]) by d12nrmr1607.megacenter.de.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j95HDPvk160982 for ; Wed, 5 Oct 2005 19:13:25 +0200 Received: from d12av02.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av02.megacenter.de.ibm.com (8.12.11/8.13.3) with ESMTP id j95HDPr2018711 for ; Wed, 5 Oct 2005 19:13:25 +0200 Date: Wed, 5 Oct 2005 19:12:30 +0200 From: Heiko Carstens Subject: Re: sparsemem & sparsemem extreme question Message-ID: <20051005171230.GA10204@osiris.ibm.com> References: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com> <1128442502.20208.6.camel@localhost> <20051005063909.GA9699@osiris.boeblingen.de.ibm.com> <1128527554.26009.2.camel@localhost> <20051005155823.GA10119@osiris.ibm.com> <1128528340.26009.8.camel@localhost> <20051005161009.GA10146@osiris.ibm.com> 
<1128529222.26009.16.camel@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1128529222.26009.16.camel@localhost> Sender: owner-linux-mm@kvack.org Return-Path: To: Dave Hansen Cc: linux-mm List-ID: > > That's a matter of fact and what the Specs say... > I'd appreciate any pointer to the relevant information, especially the > stuff that explains just how sparse a physical address space can be on > that architecture. What would discontigmem have done with the same > layout? Does s390 even support discontigmem? s390 does not support discontigmem at all. And unfortunately the documentation is not publicly available yet, sorry. Anything specific you need to know about the memory layout? Heiko -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234]) by e5.ny.us.ibm.com (8.12.11/8.12.11) with ESMTP id j95HKGZ3014485 for ; Wed, 5 Oct 2005 13:20:16 -0400 Received: from d01av04.pok.ibm.com (d01av04.pok.ibm.com [9.56.224.64]) by d01relay02.pok.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j95HKGt2089682 for ; Wed, 5 Oct 2005 13:20:16 -0400 Received: from d01av04.pok.ibm.com (loopback [127.0.0.1]) by d01av04.pok.ibm.com (8.12.11/8.13.3) with ESMTP id j95HKGJ9021118 for ; Wed, 5 Oct 2005 13:20:16 -0400 Subject: Re: sparsemem & sparsemem extreme question From: Dave Hansen In-Reply-To: <20051005171230.GA10204@osiris.ibm.com> References: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com> <1128442502.20208.6.camel@localhost> <20051005063909.GA9699@osiris.boeblingen.de.ibm.com> <1128527554.26009.2.camel@localhost> <20051005155823.GA10119@osiris.ibm.com> <1128528340.26009.8.camel@localhost> <20051005161009.GA10146@osiris.ibm.com> <1128529222.26009.16.camel@localhost> 
<20051005171230.GA10204@osiris.ibm.com>
Content-Type: text/plain
Date: Wed, 05 Oct 2005 10:20:09 -0700
Message-Id: <1128532809.26009.39.camel@localhost>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Heiko Carstens
Cc: linux-mm
List-ID:

On Wed, 2005-10-05 at 19:12 +0200, Heiko Carstens wrote:
> > > That's a matter of fact and what the Specs say...
> > I'd appreciate any pointer to the relevant information, especially the
> > stuff that explains just how sparse a physical address space can be on
> > that architecture. What would discontigmem have done with the same
> > layout? Does s390 even support discontigmem?
>
> s390 does not support discontigmem at all. And unfortunately the
> documentation is not publicly available yet, sorry.
> Anything specific you need to know about the memory layout?

How sparse is it? How few present pages can there be in a worst-case physical area?

-- Dave
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate2.de.ibm.com (8.12.10/8.12.10) with ESMTP id j95Hkad7107336 for ; Wed, 5 Oct 2005 17:46:36 GMT
Received: from d12av02.megacenter.de.ibm.com (d12av02.megacenter.de.ibm.com [9.149.165.228]) by d12nrmr1607.megacenter.de.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j95HkZvk163648 for ; Wed, 5 Oct 2005 19:46:35 +0200
Received: from d12av02.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av02.megacenter.de.ibm.com (8.12.11/8.13.3) with ESMTP id j95HkZHA008184 for ; Wed, 5 Oct 2005 19:46:35 +0200
Date: Wed, 5 Oct 2005 19:45:42 +0200
From: Heiko Carstens
Subject: Re: sparsemem & sparsemem extreme question
Message-ID: <20051005174542.GB10204@osiris.ibm.com>
References: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com> <1128442502.20208.6.camel@localhost> <20051005063909.GA9699@osiris.boeblingen.de.ibm.com> <1128527554.26009.2.camel@localhost> <20051005155823.GA10119@osiris.ibm.com> <1128528340.26009.8.camel@localhost> <20051005161009.GA10146@osiris.ibm.com> <1128529222.26009.16.camel@localhost> <20051005171230.GA10204@osiris.ibm.com> <1128532809.26009.39.camel@localhost>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1128532809.26009.39.camel@localhost>
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Dave Hansen
Cc: linux-mm
List-ID:

> > Anything specific you need to know about the memory layout?
> How sparse is it? How few present pages can there be in a worst-case
> physical area?

The worst case that is already valid today is that you can have 1 MB segments wherever you want in the address space.
For instance I just configured a virtual machine that has the following memory layout: Address Range ----------------------------------- 0000000000000000 - 00000000000FFFFF 000000FFC0000000 - 000000FFC00FFFFF Even though it's currently not possible to define memory segments above 1TB, this limit is likely to go away. In addition if running in a logical partition we always have a small gap at the 2GB barrier. Not sure how large that gap is, I'll check tomorrow. Heiko -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234]) by e3.ny.us.ibm.com (8.12.11/8.12.11) with ESMTP id j95Hvh5J002485 for ; Wed, 5 Oct 2005 13:57:43 -0400 Received: from d01av04.pok.ibm.com (d01av04.pok.ibm.com [9.56.224.64]) by d01relay02.pok.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j95Hvgt2094050 for ; Wed, 5 Oct 2005 13:57:42 -0400 Received: from d01av04.pok.ibm.com (loopback [127.0.0.1]) by d01av04.pok.ibm.com (8.12.11/8.13.3) with ESMTP id j95Hvf8R008565 for ; Wed, 5 Oct 2005 13:57:42 -0400 Subject: Re: sparsemem & sparsemem extreme question From: Dave Hansen In-Reply-To: <20051005174542.GB10204@osiris.ibm.com> References: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com> <1128442502.20208.6.camel@localhost> <20051005063909.GA9699@osiris.boeblingen.de.ibm.com> <1128527554.26009.2.camel@localhost> <20051005155823.GA10119@osiris.ibm.com> <1128528340.26009.8.camel@localhost> <20051005161009.GA10146@osiris.ibm.com> <1128529222.26009.16.camel@localhost> <20051005171230.GA10204@osiris.ibm.com> <1128532809.26009.39.camel@localhost> <20051005174542.GB10204@osiris.ibm.com> Content-Type: text/plain Date: Wed, 05 Oct 2005 10:57:34 -0700 Message-Id: <1128535054.26009.53.camel@localhost> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: 
owner-linux-mm@kvack.org Return-Path: To: Heiko Carstens Cc: linux-mm , Bob Picco List-ID: On Wed, 2005-10-05 at 19:45 +0200, Heiko Carstens wrote: > > > Anything specific you need to know about the memory layout? > > How sparse is it? How few present pages can be there be in a worst-case > > physical area? > > Worst case that is already currently valid is that you can have 1 MB > segments whereever you want in address space. ... > Even though it's currently not possible to define memory segments above > 1TB, this limit is likely to go away. Go away, or get moved up? ia64 today is designed to work with 50 bits of physical address space, and 30 bit sections. That's exactly the same scale that you're talking about with 1MB sections and 1TB of physical space. So, sparsemem extreme should be perfectly fine for that case (that's explicitly what it was designed for). How much bigger than 1TB will it go? -- Dave -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
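
The "same scale" comparison in the message above is simple arithmetic: the number of sections is 2 to the power of (physical address bits minus section size bits). A small sketch, where the helper name `nr_sections` is made up for illustration and the 64-bit case is the scenario raised elsewhere in this thread:

```python
def nr_sections(phys_bits, section_bits):
    """Sections needed to cover 2**phys_bits bytes of physical address
    space when each section spans 2**section_bits bytes."""
    if not 0 <= section_bits <= phys_bits:
        raise ValueError("section size must not exceed the address space")
    return 1 << (phys_bits - section_bits)


ia64 = nr_sections(50, 30)        # 50-bit space, 30-bit sections
s390_1tb = nr_sections(40, 20)    # 1 TB space, 1 MB sections
s390_64bit = nr_sections(64, 20)  # full 64-bit space, 1 MB sections

print(ia64, s390_1tb, s390_64bit)
```

The first two cases both come to 2^20 sections, while covering a full 64-bit space with 1 MB sections would need 2^44 of them.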
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate1.de.ibm.com (8.12.10/8.12.10) with ESMTP id j95I5aOI147290 for ; Wed, 5 Oct 2005 18:05:36 GMT Received: from d12av02.megacenter.de.ibm.com (d12av02.megacenter.de.ibm.com [9.149.165.228]) by d12nrmr1607.megacenter.de.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j95I5avk138026 for ; Wed, 5 Oct 2005 20:05:36 +0200 Received: from d12av02.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av02.megacenter.de.ibm.com (8.12.11/8.13.3) with ESMTP id j95I5Zbl018293 for ; Wed, 5 Oct 2005 20:05:36 +0200 Date: Wed, 5 Oct 2005 20:04:43 +0200 From: Heiko Carstens Subject: Re: sparsemem & sparsemem extreme question Message-ID: <20051005180443.GC10204@osiris.ibm.com> References: <20051005063909.GA9699@osiris.boeblingen.de.ibm.com> <1128527554.26009.2.camel@localhost> <20051005155823.GA10119@osiris.ibm.com> <1128528340.26009.8.camel@localhost> <20051005161009.GA10146@osiris.ibm.com> <1128529222.26009.16.camel@localhost> <20051005171230.GA10204@osiris.ibm.com> <1128532809.26009.39.camel@localhost> <20051005174542.GB10204@osiris.ibm.com> <1128535054.26009.53.camel@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1128535054.26009.53.camel@localhost> Sender: owner-linux-mm@kvack.org Return-Path: To: Dave Hansen Cc: linux-mm , Bob Picco List-ID: > > > > Anything specific you need to know about the memory layout? > > > How sparse is it? How few present pages can be there be in a worst-case > > > physical area? > > > > Worst case that is already currently valid is that you can have 1 MB > > segments whereever you want in address space. > ... > > Even though it's currently not possible to define memory segments above > > 1TB, this limit is likely to go away. > > Go away, or get moved up? 
> > ia64 today is designed to work with 50 bits of physical address space, > and 30 bit sections. That's exactly the same scale that you're talking > about with 1MB sections and 1TB of physical space. So, sparsemem > extreme should be perfectly fine for that case (that's explicitly what > it was designed for). > > How much bigger than 1TB will it go? As already mentioned, we will have physical memory with the MSB set. Afaik the hardware uses this bit to distinguish between different types of memory. So we are going to have the full 64 bit address space. Heiko
From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Wed, 5 Oct 2005 14:42:54 -0400 From: Bob Picco Subject: Re: sparsemem & sparsemem extreme question Message-ID: <20051005184254.GA25483@localhost.localdomain> References: <1128527554.26009.2.camel@localhost> <20051005155823.GA10119@osiris.ibm.com> <1128528340.26009.8.camel@localhost> <20051005161009.GA10146@osiris.ibm.com> <1128529222.26009.16.camel@localhost> <20051005171230.GA10204@osiris.ibm.com> <1128532809.26009.39.camel@localhost> <20051005174542.GB10204@osiris.ibm.com> <1128535054.26009.53.camel@localhost> <20051005180443.GC10204@osiris.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20051005180443.GC10204@osiris.ibm.com> Sender: owner-linux-mm@kvack.org Return-Path: To: Heiko Carstens Cc: Dave Hansen , linux-mm , Bob Picco List-ID: Heiko Carstens wrote: [Wed Oct 05 2005, 02:04:43PM EDT] > > > > > Anything specific you need to know about the memory layout? > > > > How sparse is it? How few present pages can there be in a worst-case > > > > physical area? > > > > > > Worst case that is already currently valid is that you can have 1 MB > > > segments wherever you want in address space. > > ... 
> > > Even though it's currently not possible to define memory segments above > > > 1TB, this limit is likely to go away. > > > > Go away, or get moved up? > > > > ia64 today is designed to work with 50 bits of physical address space, > > and 30 bit sections. That's exactly the same scale that you're talking > > about with 1MB sections and 1TB of physical space. So, sparsemem > > extreme should be perfectly fine for that case (that's explicitly what > > it was designed for). > > > > How much bigger than 1TB will it go? > > As already mentioned, we will have physical memory with the MSB set. Afaik > the hardware uses this bit to distinguish between different types of memory. > So we are going to have the full 64 bit address space. > > Heiko Possibly the architects did a similar thing to ia64. When bit 63 is set the access is uncached. This doesn't seem relevant to sparsemem. The attribute could be stored in page structure flags or some other way. bob
From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236]) by e2.ny.us.ibm.com (8.12.11/8.12.11) with ESMTP id j95M6jaF022312 for ; Wed, 5 Oct 2005 18:06:45 -0400 Received: from d01av04.pok.ibm.com (d01av04.pok.ibm.com [9.56.224.64]) by d01relay04.pok.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j95M6i08035060 for ; Wed, 5 Oct 2005 18:06:44 -0400 Received: from d01av04.pok.ibm.com (loopback [127.0.0.1]) by d01av04.pok.ibm.com (8.12.11/8.13.3) with ESMTP id j95M6iF5004176 for ; Wed, 5 Oct 2005 18:06:44 -0400 Subject: Re: sparsemem & sparsemem extreme question From: Dave Hansen In-Reply-To: <20051005180443.GC10204@osiris.ibm.com> References: <20051005063909.GA9699@osiris.boeblingen.de.ibm.com> <1128527554.26009.2.camel@localhost> <20051005155823.GA10119@osiris.ibm.com> <1128528340.26009.8.camel@localhost> <20051005161009.GA10146@osiris.ibm.com> <1128529222.26009.16.camel@localhost> <20051005171230.GA10204@osiris.ibm.com> <1128532809.26009.39.camel@localhost> <20051005174542.GB10204@osiris.ibm.com> <1128535054.26009.53.camel@localhost> <20051005180443.GC10204@osiris.ibm.com> Content-Type: text/plain Date: Wed, 05 Oct 2005 15:06:42 -0700 Message-Id: <1128550002.18249.14.camel@localhost> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Heiko Carstens Cc: linux-mm , Bob Picco List-ID: On Wed, 2005-10-05 at 20:04 +0200, Heiko Carstens wrote: > As already mentioned, we will have physical memory with the MSB set. Afaik > the hardware uses this bit to distinguish between different types of memory. > So we are going to have the full 64 bit address space. Is it just the MSB? If so, we can probably just shift it down to some reasonable address. The only issue comes if you really have the whole address space used in some random way. -- Dave
From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate4.de.ibm.com (8.12.10/8.12.10) with ESMTP id j967ecO0130886 for ; Thu, 6 Oct 2005 07:40:38 GMT Received: from d12av02.megacenter.de.ibm.com (d12av02.megacenter.de.ibm.com [9.149.165.228]) by d12nrmr1607.megacenter.de.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j967ecZ2175652 for ; Thu, 6 Oct 2005 09:40:38 +0200 Received: from d12av02.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av02.megacenter.de.ibm.com (8.12.11/8.13.3) with ESMTP id j967ecYo022828 for ; Thu, 6 Oct 2005 09:40:38 +0200 Date: Thu, 6 Oct 2005 09:39:43 +0200 From: Heiko Carstens Subject: Re: sparsemem & sparsemem extreme question Message-ID: <20051006073943.GA2482@osiris.boeblingen.de.ibm.com> References: <20051005155823.GA10119@osiris.ibm.com> <1128528340.26009.8.camel@localhost> <20051005161009.GA10146@osiris.ibm.com> <1128529222.26009.16.camel@localhost> <20051005171230.GA10204@osiris.ibm.com> <1128532809.26009.39.camel@localhost> <20051005174542.GB10204@osiris.ibm.com> <1128535054.26009.53.camel@localhost> <20051005180443.GC10204@osiris.ibm.com> <1128550002.18249.14.camel@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1128550002.18249.14.camel@localhost> Sender: owner-linux-mm@kvack.org Return-Path: To: Dave Hansen Cc: linux-mm , Bob Picco List-ID: > > As already mentioned, we will have physical memory with the MSB set. Afaik > > the hardware uses this bit to distinguish between different types of memory. > > So we are going to have the full 64 bit address space. > Is it just the MSB? If so, we can probably just shift it down to some > reasonable address. The only issue comes if you really have the whole > address space used in some random way. Unfortunately there is more than just the MSB. 
If the MSB is set then there will be a 'model dependent' number of bits just below the MSB used to encode whatever the hardware thinks is necessary. Also, from the operating system's point of view you cannot tell how these bits will be set. The best thing to do is to assume nothing and leave the addresses alone. Heiko
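To put the numbers from this exchange side by side, here is a minimal C sketch of the arithmetic only. The helper names section_index_bits() and section_count() are made up for illustration and are not kernel APIs, and the 1 MB section size in the 64-bit case is an assumption carried over from the worst case described above.

```c
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only (not kernel APIs).
 *
 * phys_bits:    number of physical address bits to cover
 * section_bits: log2 of the section size in bytes
 */

/* Address bits left over to index sections. */
unsigned int section_index_bits(unsigned int phys_bits,
                                unsigned int section_bits)
{
    return phys_bits - section_bits;
}

/* Number of sections needed to cover the whole physical range.
 * Only valid while the index fits in fewer than 64 bits. */
uint64_t section_count(unsigned int phys_bits, unsigned int section_bits)
{
    return (uint64_t)1 << section_index_bits(phys_bits, section_bits);
}

/*
 * Cases mentioned in the thread:
 *   ia64:            50 physical bits, 30-bit sections -> 20 index bits
 *   s390 today:      1 TB (40 bits),   1 MB (20 bits)  -> 20 index bits
 *   s390, full 64:   64 physical bits, 1 MB (assumed)  -> 44 index bits
 */
```

The first two cases both need 2^20 section entries, which is the "same scale" referred to above. With the full 64-bit address space and 1 MB sections the index grows to 44 bits, which is why leaving the addresses untouched changes the picture.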