* Re: [Bug 25042] New: RAM buffer I/O resource badly interacts with memory hot-add
From: Andrew Morton @ 2011-01-04 21:51 UTC
To: linux-mm, linux-acpi; +Cc: bugzilla-daemon, Linus Torvalds, petr, akataria
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).
I'm not sure who to blame here so I'll just spray it at everyone I've
ever met ;)
On Thu, 16 Dec 2010 23:00:12 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=25042
>
> Summary: RAM buffer I/O resource badly interacts with memory
> hot-add
> Product: Memory Management
> Version: 2.5
> Kernel Version: 2.6.35
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Other
> AssignedTo: akpm@linux-foundation.org
> ReportedBy: petr@vandrovec.name
> CC: akataria@vmware.com
> Regression: Yes
>
>
> Created an attachment (id=40502)
> --> (https://bugzilla.kernel.org/attachment.cgi?id=40502)
> /proc/iomem after issuing hot-add, one from 3076 to 3200, other from 3200 to
> 3456MB
>
> Linus's commit 45fbe3ee01b8e463b28c2751b5dcc0cbdc142d90 in May 2009 added code
> to create 'RAM buffer' above top of RAM to ensure that I/O resources do not
> start immediately after RAM, but sometime later. Originally it was enforcing
> 32MB alignment, now it enforces 64MB. Which means that in VMs with memory size
> which is not multiple of 64MB there will be additional 'RAM buffer' resource
> present:
>
> 100000000-1003fffff : System RAM
> 100400000-103ffffff : RAM buffer
>
> When we try to hot-add memory, kernel complains that there was resource
> conflict with this fake 'RAM buffer' and hot-added memory is not recognized:
>
> [ 115.324952] Hotplug Mem Device
> [ 115.325549] System RAM resource 100400000 - 10fffffff cannot be added
> [ 115.325553] ACPI:memory_hp:add_memory failed
> [ 115.326519] ACPI:memory_hp:Error in acpi_memory_enable_device
> [ 115.327183] acpi_memhotplug: probe of PNP0C80:00 failed with error -22
> [ 115.327347]
> [ 115.327350] driver data not found
> [ 115.328808] ACPI:memory_hp:Cannot find driver data
>
> For now we've modified hotplug code to split hot-added request into smaller
> ranges, so only first <= 252MB are unusable, rather than whole xxxGB chunk, but
> if 'RAM buffer' could be made dependent on memory hot-plug not available on the
> platform, it would be much better.
>
> Another approach is resurrecting
> http://linux.derkeiler.com/Mailing-Lists/Kernel/2008-07/msg06501.html and using
> this range instead of all "unclaimed" ranges for placing I/O devices. Then
> "RAM buffer" would not be necessary at all.
>
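The arithmetic behind the conflict described in the report can be
sketched as below. This is only an illustration, assuming the buffer
simply rounds the end of a RAM region up to the 64MB alignment
mentioned in the report; the helper names (round_up, ram_buffer,
overlaps) are made up for this example and are not kernel functions.

```python
MB = 1 << 20
RAM_BUFFER_ALIGN = 64 * MB  # value quoted in the report (originally 32MB)


def round_up(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def ram_buffer(ram_end, align=RAM_BUFFER_ALIGN):
    """Return the inclusive (start, end) of the hypothetical 'RAM buffer'
    placed after a RAM region whose first free byte is ram_end, or None
    when RAM already ends on an aligned boundary."""
    end = round_up(ram_end, align)
    if end == ram_end:
        return None
    return (ram_end, end - 1)


def overlaps(a, b):
    """True when two inclusive ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


# RAM from the report ends at 0x1003fffff, so the first free byte is 0x100400000.
buffer = ram_buffer(0x100400000)
# Range the hot-add attempted, taken from the kernel log in the report.
hot_add = (0x100400000, 0x10fffffff)

print("RAM buffer: %x-%x" % buffer)
print("conflicts with hot-add:", overlaps(buffer, hot_add))
```

With the numbers from the report, the buffer comes out as
100400000-103ffffff, which is the range shown in /proc/iomem, and it
overlaps the start of the hot-added range.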
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [Bug 25042] New: RAM buffer I/O resource badly interacts with memory hot-add
From: Linus Torvalds @ 2011-01-04 22:32 UTC
To: Andrew Morton
Cc: linux-mm, linux-acpi, bugzilla-daemon, petr, akataria,
Bjorn Helgaas
On Tue, Jan 4, 2011 at 1:51 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
>> Linus's commit 45fbe3ee01b8e463b28c2751b5dcc0cbdc142d90 in May 2009 added code
>> to create 'RAM buffer' above top of RAM to ensure that I/O resources do not
>> start immediately after RAM, but sometime later. Originally it was enforcing
>> 32MB alignment, now it enforces 64MB. Which means that in VMs with memory size
>> which is not multiple of 64MB there will be additional 'RAM buffer' resource
>> present:
>>
>> 100000000-1003fffff : System RAM
>> 100400000-103ffffff : RAM buffer
I'd suggest just working around it by hotplugging in 64MB chunks.
IOW, the old "it hurts when I do that - don't do that then" solution
to the problem. There is no reason why a VM should export some random
8MB-aligned region that I can see.
>> Another approach is resurrecting
>> http://linux.derkeiler.com/Mailing-Lists/Kernel/2008-07/msg06501.html and using
>> this range instead of all "unclaimed" ranges for placing I/O devices. Then
>> "RAM buffer" would not be necessary at all.
Yeah, not going to happen. There's no point (see above), and it is
fundamentally wrong to even think that the firmware tables - ACPI or
otherwise - would be so perfect that you can just always trust them.
Every time somebody makes the mistake of thinking they can do that
(and it happens distressingly often), they are quickly shown to be
wrong, and there's some random hardware out there that simply doesn't
list the ranges it uses.
What could happen these days is to move the "gap" logic from the e820
table (and /proc/iomem) and into the "arch_remove_reservations()"
logic. See commit fcb119183c73bf0781009713f303e28b1fb13d3e. That might
make memory hotplug happier.
That said, I do repeat: why the hell do you keep digging that hole in
the first place. Do memory hotplug in 256MB chunks, naturally aligned,
and don't bother with any of this crazy crap.
Linus
* Re: [Bug 25042] New: RAM buffer I/O resource badly interacts with memory hot-add
From: Petr Vandrovec @ 2011-01-04 23:55 UTC
To: Linus Torvalds
Cc: Andrew Morton, linux-mm, linux-acpi, bugzilla-daemon, akataria,
Bjorn Helgaas
On Tue, Jan 4, 2011 at 2:32 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Jan 4, 2011 at 1:51 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
>>> Linus's commit 45fbe3ee01b8e463b28c2751b5dcc0cbdc142d90 in May 2009 added code
>>> to create 'RAM buffer' above top of RAM to ensure that I/O resources do not
>>> start immediately after RAM, but sometime later. Originally it was enforcing
>>> 32MB alignment, now it enforces 64MB. Which means that in VMs with memory size
>>> which is not multiple of 64MB there will be additional 'RAM buffer' resource
>>> present:
>>>
>>> 100000000-1003fffff : System RAM
>>> 100400000-103ffffff : RAM buffer
>
> I'd suggest just working around it by hotplugging in 64MB chunks.
Unfortunately that does not work: kernels configured for sparsemem
hate adding memory in chunks smaller than the section size. On x86-64
a region has to end on a 128MB boundary and be at least 128MB large.
If a smaller region is added, then either non-existent memory is
activated or nothing happens at all, depending on the exact values and
kernel version. So we align the end of the hot-added region to 128MB
on x86-64, and to 1GB on ia32. But we do not align the start, because
there was no need...
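As a rough illustration of the constraint described above, the check
can be written out like this. It is a sketch only: the function name
and the exact form of the test are assumptions made for this example,
and the 128MB figure is the x86-64 value quoted in this message.

```python
MB = 1 << 20
SECTION_SIZE = 128 * MB  # x86-64 value quoted in this message


def section_friendly(start, end_exclusive, section=SECTION_SIZE):
    """Hypothetical check: the region must end on a section boundary
    and be at least one section large. The start is not checked,
    matching the description that only the end is aligned."""
    size = end_exclusive - start
    return end_exclusive % section == 0 and size >= section


# Range from the bug report's log: 100400000 - 10fffffff (252MB).
print(section_friendly(0x100400000, 0x110000000))

# A region that stops at the next 64MB boundary is only 60MB large
# and does not end on a 128MB boundary.
print(section_friendly(0x100400000, 0x104000000))
```

Under these assumptions the 252MB range from the log passes the
check, while a region cut at the next 64MB boundary does not.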
> IOW, the old "it hurts when I do that - don't do that then" solution
> to the problem. There is no reason why a VM should export some random
> 8MB-aligned region that I can see.
It just adds memory where it ended: power-on memory ended at
0x1003fffff, so the platform naturally tries to continue where it
left off, from 0x100400000 to 0x10fffffff. It has no idea that the OS
inside has some special requirements, and the OS inside unfortunately
does not support _PRS/_SRS on memory devices either, so we cannot
offer possible choices in the hope that the guest will pick one it
likes more than the default placement/size.
> That said, I do repeat: why the hell do you keep digging that hole in
> the first place. Do memory hotplug in 256MB chunks, naturally aligned,
> and don't bother with any of this crazy crap.
So that we can provide a contiguous memory area to the VM, and so that
the layout of a VM created with some amount of memory is the same as
that of a VM which was hot-added to the required size. That is
important for supporting hibernate, and it is easier to implement than
discontiguous ranges.
I've modified the code so that we hot-add two regions: the first
aligns the memory size to 256MB (that one is not activated successfully
if the memory size is not a multiple of 64MB, but we cannot go smaller
due to the sparsemem restrictions listed above), and the second adds
the remainder (if more than 256MB is added) from there. That makes the
workaround similar to the one for the clash between OPROM base
addresses assigned by the kernel and ranges reserved in SRAT for
memory hot-add...
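The two-region split can be sketched roughly as follows. This is not
the actual hypervisor code; split_hot_add is a made-up name, and the
sketch assumes the 256MB boundary is taken on the physical address.

```python
MB = 1 << 20


def split_hot_add(start, end_exclusive, boundary=256 * MB):
    """Split a hot-add request at the first boundary above start.

    Returns a list of (start, end_exclusive) ranges. The first range
    only pads up to the boundary; the second carries the remainder.
    A request that is already aligned, or that does not reach past
    the boundary, is returned as a single range.
    """
    first_end = (start + boundary - 1) // boundary * boundary
    if first_end == start or first_end >= end_exclusive:
        return [(start, end_exclusive)]
    return [(start, first_end), (first_end, end_exclusive)]


# Hot-add starting right after the power-on RAM from the bug report,
# extended here to an arbitrary example end address.
for lo, hi in split_hot_add(0x100400000, 0x120000000):
    print("%x-%x (%d MB)" % (lo, hi - 1, (hi - lo) // MB))
```

With a start of 0x100400000 the first range is 252MB, so only that
piece collides with the 'RAM buffer' and the rest can still be added.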
Petr
* Re: [Bug 25042] New: RAM buffer I/O resource badly interacts with memory hot-add
From: KAMEZAWA Hiroyuki @ 2011-01-06 2:12 UTC
To: Andrew Morton
Cc: linux-mm, linux-acpi, bugzilla-daemon, Linus Torvalds, petr,
akataria
On Tue, 4 Jan 2011 13:51:48 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:
>
> (switched to email. Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> I'm not sure who to blame here so I'll just spray it at everyone I've
> ever met ;)
>
> On Thu, 16 Dec 2010 23:00:12 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
>
> > https://bugzilla.kernel.org/show_bug.cgi?id=25042
> >
> > Summary: RAM buffer I/O resource badly interacts with memory
> > hot-add
> > Product: Memory Management
> > Version: 2.5
> > Kernel Version: 2.6.35
> > Platform: All
> > OS/Version: Linux
> > Tree: Mainline
> > Status: NEW
> > Severity: normal
> > Priority: P1
> > Component: Other
> > AssignedTo: akpm@linux-foundation.org
> > ReportedBy: petr@vandrovec.name
> > CC: akataria@vmware.com
> > Regression: Yes
> >
> >
> > Created an attachment (id=40502)
> > --> (https://bugzilla.kernel.org/attachment.cgi?id=40502)
> > /proc/iomem after issuing hot-add, one from 3076 to 3200, other from 3200 to
> > 3456MB
> >
> > Linus's commit 45fbe3ee01b8e463b28c2751b5dcc0cbdc142d90 in May 2009 added code
> > to create 'RAM buffer' above top of RAM to ensure that I/O resources do not
> > start immediately after RAM, but sometime later. Originally it was enforcing
> > 32MB alignment, now it enforces 64MB. Which means that in VMs with memory size
> > which is not multiple of 64MB there will be additional 'RAM buffer' resource
> > present:
> >
> > 100000000-1003fffff : System RAM
> > 100400000-103ffffff : RAM buffer
> >
> > When we try to hot-add memory, kernel complains that there was resource
> > conflict with this fake 'RAM buffer' and hot-added memory is not recognized:
> >
> > [ 115.324952] Hotplug Mem Device
> > [ 115.325549] System RAM resource 100400000 - 10fffffff cannot be added
> > [ 115.325553] ACPI:memory_hp:add_memory failed
> > [ 115.326519] ACPI:memory_hp:Error in acpi_memory_enable_device
> > [ 115.327183] acpi_memhotplug: probe of PNP0C80:00 failed with error -22
> > [ 115.327347]
> > [ 115.327350] driver data not found
> > [ 115.328808] ACPI:memory_hp:Cannot find driver data
> >
> > For now we've modified hotplug code to split hot-added request into smaller
> > ranges, so only first <= 252MB are unusable, rather than whole xxxGB chunk, but
> > if 'RAM buffer' could be made dependent on memory hot-plug not available on the
> > platform, it would be much better.
> >
Hmm ? Why do you need to place "hot-added" memory's address range next to System
RAM ? Sparsemem allows sparse memory layout.
Is it very difficult ?
Thanks,
-Kame
* Re: [Bug 25042] New: RAM buffer I/O resource badly interacts with memory hot-add
From: Linus Torvalds @ 2011-01-06 2:20 UTC
To: KAMEZAWA Hiroyuki
Cc: Andrew Morton, linux-mm, linux-acpi, bugzilla-daemon, petr,
akataria
On Wed, Jan 5, 2011 at 6:12 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
> Hmm ? Why do you need to place "hot-added" memory's address range next to System
> RAM ? Sparsemem allows sparse memory layout.
Well, even without sparsemem, why couldn't the initial memory image
just be more nicely aligned to 256MB or something?
Linus