Linux LVM users
 help / color / mirror / Atom feed
* [linux-lvm] vgchange -a memory consumption
@ 2008-07-12  5:57 Daniel Stodden
  2008-07-12 16:51 ` Alasdair G Kergon
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Stodden @ 2008-07-12  5:57 UTC (permalink / raw)
  To: Linux LVM


Hi.

I'm running, lvm2-2.02.26.

Found that vgchange -a y/n generates remarkable memory consumption on
volumes with larger numbers of small volumes.

Tried on a VG with 1024 LVs of 8MB each. Interestingly, the type of
operation does not seem to matter much. Notably, deactivating an already
unavailable volume group generates similarly high pressure than actually
activating it.

Looking at the code, I've got only gained a partial understanding where
the memory is exactly spent -- and for what. So much I believe I do
understand, maybe someone can enlighten me:

Iterating through all the volumes pushes the data segment up by about
750k -- per iteration. That memory allocated per volume never seems to
get freed.

So, together with the memory locking performed in lock_vol, it's only a
matter of installed RAM and volume numbers when the OOM killer will kick
in.

_vg_read() seems to play a role. Apparently replayed twice for each LV
(once for lock, then for unlock). 

To a rather outside observer like me, the path taken to get there seems
rather strange. The LV is handed over as a UUID string to lock_vol().
In the '-an' case, this is handed over to lv_deactivate, which will
(re-)load both the VG and LV metadata in order to get the respective
lvinfo struct. 

So what I don't really get is: 

Why is that data reread? Especially the VG metadata. Or am I missing
something. Second: why isn't that memory freed after returning from
activate_lv?

But most importantly: Could this be fixed?

Best,
Daniel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [linux-lvm] vgchange -a memory consumption
  2008-07-12  5:57 [linux-lvm] vgchange -a memory consumption Daniel Stodden
@ 2008-07-12 16:51 ` Alasdair G Kergon
  2008-07-15  6:19   ` Daniel Stodden
  0 siblings, 1 reply; 6+ messages in thread
From: Alasdair G Kergon @ 2008-07-12 16:51 UTC (permalink / raw)
  To: LVM general discussion and development

On Fri, Jul 11, 2008 at 10:57:31PM -0700, Daniel Stodden wrote:
> I'm running, lvm2-2.02.26.
 
Don't bother investigating that version - stuff got changed.
Update to the latest release (or CVS) and try again.

> Why is that data reread? 

Because the two parts of the code are designed to be independent.  - The
so-called "activation" code sits behind an API in a so-called "locking"
module.  There's a choice of locking modules, and some send the requests
around a cluster of machines - remote machines will only run the
activation code and manage the metadata independently.  We just pass
UUIDs through the cluster communication layer, never metadata itself.

> Second: why isn't that memory freed after returning from
> activate_lv?
 
It's released after processing the whole command.  If there are cases
where too much is still being held while processing in the *current*
version of the code, then yes, you might be able to free parts of it
sooner.

Alasdair
-- 
agk@redhat.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [linux-lvm] vgchange -a memory consumption
  2008-07-12 16:51 ` Alasdair G Kergon
@ 2008-07-15  6:19   ` Daniel Stodden
  2008-07-16 16:48     ` [PATCH] " Daniel Stodden
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Stodden @ 2008-07-15  6:19 UTC (permalink / raw)
  To: LVM general discussion and development


Hey Alasdair,

thanks a lot for the prompt reply.


 On Sat, 2008-07-12 at 17:51 +0100, Alasdair G Kergon wrote:
> On Fri, Jul 11, 2008 at 10:57:31PM -0700, Daniel Stodden wrote:
> > I'm running, lvm2-2.02.26.
>  
> Don't bother investigating that version - stuff got changed.
> Update to the latest release (or CVS) and try again.
> 
> > Why is that data reread? 
> 
> Because the two parts of the code are designed to be independent.  - The
> so-called "activation" code sits behind an API in a so-called "locking"
> module.  There's a choice of locking modules, and some send the requests
> around a cluster of machines - remote machines will only run the
> activation code and manage the metadata independently.  We just pass
> UUIDs through the cluster communication layer, never metadata itself.

Oooh - kay. I've only been looking at _file..() operations. In the
clustered version that sounds much more obvious.

> > Second: why isn't that memory freed after returning from
> > activate_lv?
>  
> It's released after processing the whole command.  If there are cases
> where too much is still being held while processing in the *current*
> version of the code, then yes, you might be able to free parts of it
> sooner.

I've been running on CVS today. The situation appears to have improved,
but only slightly. Still way to much memory going down the drain. 

BTW: Did CVS change the memlocking policy? I just noticed that I can run
beyond physical RAM now. Is that a bug or a feature?

I had a very long look at the path down activate/deactivate() in general
and the dm storage allocator in particular. If I nail a separate per-LV
pool over the cmd_context in _activate_lvs_in_vg() and empty it once per
cycle, things slow down a little [1], but the general problem vanishes. 

Now, overriding cmd->mem isn't exactly beautiful. Any better
suggestions? I need this fixed. And soon. :}

Second is revisions: I suppose something like the above would work as a
patch into elderly source RPMs as well. Such as the .26 I mentioned in
my original post. Any tips on this? I'd consider upgrading, but I've see
your advise against that on debian's launchpad, at least regarding .38
and .39. Which is hip?

So far, thank you very much again.

Best,
Daniel

[1] For a stack-alike allocator, I think dm_pool_free() generates a
rather scary number of individual brk()s while rewinding. But that's
certainly not a functional issue, and I may, again, be mistaken.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] [linux-lvm] vgchange -a memory consumption
  2008-07-15  6:19   ` Daniel Stodden
@ 2008-07-16 16:48     ` Daniel Stodden
       [not found]       ` <20080716165243.GM7155@agk.fab.redhat.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Daniel Stodden @ 2008-07-16 16:48 UTC (permalink / raw)
  To: LVM general discussion and development

[-- Attachment #1: Type: text/plain, Size: 3054 bytes --]


In the hope that somebody finds the time to comment, here's a patch for
the original issue described. I'd just like to see the problem resolved
in future versions. Suggestions very welcome.

Thanks.

Daniel


On Mon, 2008-07-14 at 23:19 -0700, Daniel Stodden wrote:
> Hey Alasdair,
> 
> thanks a lot for the prompt reply.
> 
> 
>  On Sat, 2008-07-12 at 17:51 +0100, Alasdair G Kergon wrote:
> > On Fri, Jul 11, 2008 at 10:57:31PM -0700, Daniel Stodden wrote:
> > > I'm running, lvm2-2.02.26.
> >  
> > Don't bother investigating that version - stuff got changed.
> > Update to the latest release (or CVS) and try again.
> > 
> > > Why is that data reread? 
> > 
> > Because the two parts of the code are designed to be independent.  - The
> > so-called "activation" code sits behind an API in a so-called "locking"
> > module.  There's a choice of locking modules, and some send the requests
> > around a cluster of machines - remote machines will only run the
> > activation code and manage the metadata independently.  We just pass
> > UUIDs through the cluster communication layer, never metadata itself.
> 
> Oooh - kay. I've only been looking at _file..() operations. In the
> clustered version that sounds much more obvious.
> 
> > > Second: why isn't that memory freed after returning from
> > > activate_lv?
> >  
> > It's released after processing the whole command.  If there are cases
> > where too much is still being held while processing in the *current*
> > version of the code, then yes, you might be able to free parts of it
> > sooner.
> 
> I've been running on CVS today. The situation appears to have improved,
> but only slightly. Still way to much memory going down the drain. 
> 
> BTW: Did CVS change the memlocking policy? I just noticed that I can run
> beyond physical RAM now. Is that a bug or a feature?
> 
> I had a very long look at the path down activate/deactivate() in general
> and the dm storage allocator in particular. If I nail a separate per-LV
> pool over the cmd_context in _activate_lvs_in_vg() and empty it once per
> cycle, things slow down a little [1], but the general problem vanishes. 
> 
> Now, overriding cmd->mem isn't exactly beautiful. Any better
> suggestions? I need this fixed. And soon. :}
> 
> Second is revisions: I suppose something like the above would work as a
> patch into elderly source RPMs as well. Such as the .26 I mentioned in
> my original post. Any tips on this? I'd consider upgrading, but I've see
> your advise against that on debian's launchpad, at least regarding .38
> and .39. Which is hip?
> 
> So far, thank you very much again.
> 
> Best,
> Daniel
> 
> [1] For a stack-alike allocator, I think dm_pool_free() generates a
> rather scary number of individual brk()s while rewinding. But that's
> certainly not a functional issue, and I may, again, be mistaken.
> 
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/

[-- Attachment #2: lvm2-vgchangemem.diff --]
[-- Type: text/x-patch, Size: 611 bytes --]

diff -r c5ae2629f8f9 tools/vgchange.c
--- a/tools/vgchange.c	Mon Jul 14 19:04:54 2008 -0700
+++ b/tools/vgchange.c	Tue Jul 15 11:48:12 2008 -0700
@@ -58,7 +58,10 @@ static int _activate_lvs_in_vg(struct cm
 	struct logical_volume *lv;
 	const char *pvname;
 	int count = 0;
+	struct dm_pool *mem = cmd->mem;
 
+	cmd->mem = dm_pool_create("volume", 1024);
+	
 	list_iterate_items(lvl, &vg->lvs) {
 		lv = lvl->lv;
 
@@ -99,8 +102,12 @@ static int _activate_lvs_in_vg(struct cm
 			continue;
 		}
 
+		dm_pool_empty(cmd->mem);
 		count++;
 	}
+
+	dm_pool_destroy(cmd->mem);
+	cmd->mem = mem;
 
 	return count;
 }

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] [linux-lvm] vgchange -a memory consumption
       [not found]             ` <1216231344.16876.25.camel@desktop>
@ 2008-07-16 18:08               ` Daniel Stodden
  0 siblings, 0 replies; 6+ messages in thread
From: Daniel Stodden @ 2008-07-16 18:08 UTC (permalink / raw)
  To: Linux LVM

On Wed, 2008-07-16 at 11:02 -0700, Daniel Stodden wrote:
> On Wed, 2008-07-16 at 18:42 +0100, Alasdair G Kergon wrote:
> > Quick way for starters, is allocate a pointless object in the pool
> > then free eveything back to that point each time round the loop.
> 
> Thought so. Thanks for suggesting it.
> 
> > But proper thing is to track down into the library and find
> > which routines are the ones where the pool can be freed at
> > the end and isn't being.
> 
> Let's take lv_deactivate() in activate.c as an example.
> 
> We'd have
> 	lv = lv_from_lvid() 
> 
> and then would do a:
> 	dm_pool_free(lv->vg)
> 
> or something similar, because we know the VG is constructed before th LV
> can be found. That's what I meant with 'non-obvious'.
> 
> That example wasn't even a very good fix, because it appears to still
> leave a tip hanging around. Didn't figure exactly what it was.

Did you ever consider to have the part behind lock_vol() allocate it's
own pool (let's call it tmp) for temporary storage? So e.g.
lv_from_lvid() would read the VG into tmp and the lv it actally returns
into cmd->mem?

That would be a much larger change, but probably the cleaner way to do
it.

... not like I'm looking for extra work or something :]

Best,
Daniel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH2] [linux-lvm] vgchange -a memory consumption
       [not found]           ` <20080716174240.GN7155@agk.fab.redhat.com>
       [not found]             ` <1216231344.16876.25.camel@desktop>
@ 2008-07-16 20:39             ` Daniel Stodden
  1 sibling, 0 replies; 6+ messages in thread
From: Daniel Stodden @ 2008-07-16 20:39 UTC (permalink / raw)
  To: Alasdair G Kergon; +Cc: Linux LVM

[-- Attachment #1: Type: text/plain, Size: 584 bytes --]


Hi.

On Wed, 2008-07-16 at 18:42 +0100, Alasdair G Kergon wrote:
> Quick way for starters, is allocate a pointless object in the pool
> then free eveything back to that point each time round the loop.
> 
> But proper thing is to track down into the library and find
> which routines are the ones where the pool can be freed at
> the end and isn't being.
> 
> IOW where are most of those allocations happening?
> 
> Alasdair

Okay. Unless we really find a better way to do it, here is the suggested marker version.

As far as I can tell so far, this works equally well.

Best,
Daniel

[-- Attachment #2: lvm2-vgchange-mem.diff --]
[-- Type: text/x-patch, Size: 780 bytes --]

diff -r f79956e87d1d -r 6a69ebd5ade1 tools/vgchange.c
--- a/tools/vgchange.c	Mon Jul 14 18:16:03 2008 -0700
+++ b/tools/vgchange.c	Wed Jul 16 13:15:32 2008 -0700
@@ -57,10 +57,17 @@ static int _activate_lvs_in_vg(struct cm
 	struct lv_list *lvl;
 	struct logical_volume *lv;
 	const char *pvname;
+	void *marker;
 	int count = 0;
 
 	list_iterate_items(lvl, &vg->lvs) {
 		lv = lvl->lv;
+
+		marker = dm_pool_alloc(cmd->mem, 1);
+		if (!marker) {
+			log_error("Out of memory");
+			return count;
+		}
 
 		/* Only request activation of snapshot origin devices */
 		if ((lv->status & SNAPSHOT) || lv_is_cow(lv))
@@ -98,6 +105,8 @@ static int _activate_lvs_in_vg(struct cm
 			pvmove_poll(cmd, pvname, 1);
 			continue;
 		}
+
+		dm_pool_free(cmd->mem, marker);
 
 		count++;
 	}

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-07-16 20:39 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-07-12  5:57 [linux-lvm] vgchange -a memory consumption Daniel Stodden
2008-07-12 16:51 ` Alasdair G Kergon
2008-07-15  6:19   ` Daniel Stodden
2008-07-16 16:48     ` [PATCH] " Daniel Stodden
     [not found]       ` <20080716165243.GM7155@agk.fab.redhat.com>
     [not found]         ` <1216228176.1130.13.camel@desktop>
     [not found]           ` <20080716174240.GN7155@agk.fab.redhat.com>
     [not found]             ` <1216231344.16876.25.camel@desktop>
2008-07-16 18:08               ` Daniel Stodden
2008-07-16 20:39             ` [PATCH2] " Daniel Stodden

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox