* Re: [Gluster-devel] Puppet-Gluster+ThinP
       [not found] ` <1420171478.1325442.1397081884591.JavaMail.zimbra@redhat.com>
@ 2014-04-20 23:59   ` Ric Wheeler
       [not found]     ` <53545F62.8060001-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Ric Wheeler @ 2014-04-20 23:59 UTC
  To: Paul Cuzner, James; +Cc: device-mapper development, Gluster Devel

On 04/09/2014 03:18 PM, Paul Cuzner wrote:
>
> I'm really interested in the thinp best practices too. gluster-deploy has had 
> thinp support for a while now - and I asked the question about best practices 
> a while back - but nothing came back..
>
> Hopefully - your timing is better than mine!
>
> cc'ing Rajesh since the thinp is all about snapshot enablement.

The amount of space you set aside is very much workload dependent (rate of 
change, rate of deletion, rate of notifying the storage about the freed space).

Keep in mind with snapshots (and thinly provisioned storage, whether using a 
software target or thinly provisioned array) we need to issue the "discard" 
commands down the IO stack in order to let the storage target reclaim space.

That typically means running the fstrim command on the local file system (XFS, 
ext4, btrfs, etc) every so often. Less typically, you can mount your local file 
system with "-o discard" to do it inband (but that comes at a performance 
penalty usually).

There is also an event mechanism to help us get notified when we hit a target
configurable watermark ("help, we are running short on real disk, add more or 
clean up!").

Definitely worth following up with the LVM/device mapper people on how to do 
this best,

Ric

>
> --------------------------------------------------------------------------------
>
>     *From: *"James" <purpleidea@gmail.com>
>     *To: *"Gluster Devel" <gluster-devel@nongnu.org>
>     *Sent: *Thursday, 10 April, 2014 3:13:40 AM
>     *Subject: *[Gluster-devel] Puppet-Gluster+ThinP
>
>     Okay,
>
>     Here's a first draft of puppet-gluster w/ thin-p. This patch includes
>     documentation updates too! (w00t!)
>
>     https://github.com/purpleidea/puppet-gluster/tree/feat/thinp
>
>     FYI: I'll probably rebase this branch.
>     FYI: Somewhat untested. Read the commit message.
>
>     Comments welcome :)
>
>     I'm most interested to hear about if everyone is pleased with the way I
>     run the thin-p lv command. I think this makes the most sense, but let me
>     know if anyone has improvements. Also I'd love to hear about what the
>     default values for the parameters should be, but that's a one line
>     patch, so no rush for me.
>
>     Cheers,
>     James
>


* Re: Puppet-Gluster+ThinP
       [not found]     ` <53545F62.8060001-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2014-04-21  0:11       ` James
  2014-04-21  0:44         ` [Gluster-devel] Puppet-Gluster+ThinP Ric Wheeler
  2014-04-22 14:30         ` David Teigland
  0 siblings, 2 replies; 7+ messages in thread
From: James @ 2014-04-21  0:11 UTC
  To: Ric Wheeler; +Cc: device-mapper development, Gluster Devel

On Sun, Apr 20, 2014 at 7:59 PM, Ric Wheeler <rwheeler-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> The amount of space you set aside is very much workload dependent (rate of
> change, rate of deletion, rate of notifying the storage about the freed
> space).

From the Puppet-Gluster perspective, this will be configurable. I
would like to set a vaguely sensible default though, which I don't
have at the moment.

>
> Keep in mind with snapshots (and thinly provisioned storage, whether using a
> software target or thinly provisioned array) we need to issue the "discard"
> commands down the IO stack in order to let the storage target reclaim space.
>
> That typically means running the fstrim command on the local file system
> (XFS, ext4, btrfs, etc) every so often. Less typically, you can mount your
> local file system with "-o discard" to do it inband (but that comes at a
> performance penalty usually).

Do you think it would make sense to have Puppet-Gluster add a cron job
to do this operation?
Exactly what command should run, and how often? (Again for having
sensible defaults.)

>
> There is also an event mechanism to help us get notified when we hit a target
> configurable watermark ("help, we are running short on real disk, add more
> or clean up!").
Can you point me to some docs about this feature?

>
> Definitely worth following up with the LVM/device mapper people on how to do
> this best,
>
> Ric

Thanks for the comments. From everyone I've talked to, it seems some
of the answers are still in progress. The good news is that I'm ahead
of the curve for being ready for when this becomes more mainstream. I
think Paul is in the same position too.

James


* Re: [Gluster-devel] Puppet-Gluster+ThinP
  2014-04-21  0:11       ` Puppet-Gluster+ThinP James
@ 2014-04-21  0:44         ` Ric Wheeler
       [not found]           ` <535469F9.7050605-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2014-04-22 14:30         ` David Teigland
  1 sibling, 1 reply; 7+ messages in thread
From: Ric Wheeler @ 2014-04-21  0:44 UTC
  To: James; +Cc: Lukas Czerner, device-mapper development, Paul Cuzner,
	Gluster Devel

On 04/20/2014 05:11 PM, James wrote:
> On Sun, Apr 20, 2014 at 7:59 PM, Ric Wheeler <rwheeler@redhat.com> wrote:
>> The amount of space you set aside is very much workload dependent (rate of
>> change, rate of deletion, rate of notifying the storage about the freed
>> space).
>  From the Puppet-Gluster perspective, this will be configurable. I
> would like to set a vaguely sensible default though, which I don't
> have at the moment.

This will require a bit of thinking as you have noticed, but let's start with 
some definitions.

The basic use case is one file system backed by an exclusive dm-thinp target (no 
other file system writing to that dm-thinp pool or contending for allocation).

The goal is to get an alert in time to intervene before things get ugly, so we 
are hoping to get a sense of rate of change in the file system and how long any 
snapshot will be retained for.

For example, if we have a 10TB file system (presented as such to the user) and 
we write say 500GB of new data/day, daily snapshots will need that space for as 
long as we retain them.  If you write much less (5GB/day), it will clearly take 
a lot less.
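
To make that arithmetic concrete, here is a minimal sketch in Python. It
assumes the worst case, where each retained daily snapshot pins a full day of
unique changed blocks. The 7-day retention figure and the function name are
illustrative assumptions, not something proposed in this thread.

```python
def snapshot_space_gb(daily_change_gb: float, retention_days: int) -> float:
    """Rough upper bound on the space pinned by retained daily snapshots.

    Assumption: every retained snapshot holds one full day of unique changed
    data, so no two snapshots share rewritten blocks.
    """
    if daily_change_gb < 0 or retention_days < 0:
        raise ValueError("inputs must be non-negative")
    return daily_change_gb * retention_days


if __name__ == "__main__":
    # The two change rates from the example above, with a hypothetical
    # retention period of 7 days.
    for rate_gb in (500, 5):
        needed = snapshot_space_gb(rate_gb, 7)
        print(f"{rate_gb} GB/day kept for 7 days -> up to {needed:.0f} GB")
```

Workloads that keep rewriting the same blocks would need less than this, so
treat the result as an upper bound rather than a prediction.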

The above makes this all an effort to predict the future, but that is where the 
watermark alert kicks in to help us recover from a bad prediction.

Maybe we use a default of setting aside 20% of raw capacity for snapshots and 
set that watermark at 90% full?  For a lot of people, I suspect a fairly low
rate of change and that means pretty skinny snapshots.
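
One possible reading of that default, sketched in Python: 20% of the raw pool
is held back for snapshots, the file system is sized to the remaining 80%, and
the alert fires once 90% of the whole pool is allocated. That interpretation,
the function name and the dictionary keys are assumptions made for
illustration.

```python
def pool_plan(raw_capacity_gb: float,
              snapshot_reserve_pct: int = 20,
              watermark_pct: int = 90) -> dict:
    """Split a thin pool into file system space and snapshot reserve.

    Assumption: the watermark is a percentage of the whole pool, not of the
    snapshot reserve alone.
    """
    if not 0 <= snapshot_reserve_pct <= 100:
        raise ValueError("snapshot_reserve_pct must be between 0 and 100")
    if not 0 <= watermark_pct <= 100:
        raise ValueError("watermark_pct must be between 0 and 100")
    reserve_gb = raw_capacity_gb * snapshot_reserve_pct / 100
    return {
        "filesystem_gb": raw_capacity_gb - reserve_gb,
        "snapshot_reserve_gb": reserve_gb,
        "alert_at_gb": raw_capacity_gb * watermark_pct / 100,
    }


if __name__ == "__main__":
    # A hypothetical 10 TB (10000 GB) pool with the suggested defaults.
    for key, value in pool_plan(10000).items():
        print(f"{key}: {value:.0f}")
```

With these numbers the alert leaves 1000 GB of headroom, which at 500GB/day
is about two days to add disk or drop snapshots.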

We will clearly need to have a lot of effort here in helping explain this to 
users so they can make the trade off for their particular use case.

>
>> Keep in mind with snapshots (and thinly provisioned storage, whether using a
>> software target or thinly provisioned array) we need to issue the "discard"
>> commands down the IO stack in order to let the storage target reclaim space.
>>
>> That typically means running the fstrim command on the local file system
>> (XFS, ext4, btrfs, etc) every so often. Less typically, you can mount your
>> local file system with "-o discard" to do it inband (but that comes at a
>> performance penalty usually).
> Do you think it would make sense to have Puppet-Gluster add a cron job
> to do this operation?
> Exactly what command should run, and how often? (Again for having
> sensible defaults.)

I think that we should probably run fstrim once a day or so (hopefully late at 
night or off peak)?  Adding in Lukas who led a lot of the discard work.
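
As a starting point for discussion, here is a small Python sketch that builds
a system crontab line (the /etc/cron.d format, which includes a user field)
to run fstrim once a day off peak. The 03:00 schedule, the /usr/sbin/fstrim
path and the /mnt/brick1 mountpoint are all assumptions; the binary lives
elsewhere on some distributions.

```python
import shlex


def fstrim_cron_line(mountpoint: str, hour: int = 3, minute: int = 0,
                     fstrim_path: str = "/usr/sbin/fstrim") -> str:
    """Return a system crontab line that runs fstrim daily as root.

    The mountpoint is shell-quoted. Paths containing '%' or newlines are
    rejected because cron treats them specially in the command field.
    """
    if not 0 <= hour <= 23 or not 0 <= minute <= 59:
        raise ValueError("hour must be 0-23 and minute 0-59")
    if "%" in mountpoint or "\n" in mountpoint:
        raise ValueError("mountpoint must not contain '%' or newlines")
    return f"{minute} {hour} * * * root {fstrim_path} {shlex.quote(mountpoint)}"


if __name__ == "__main__":
    # Hypothetical brick mountpoint sitting on top of a thin LV.
    print(fstrim_cron_line("/mnt/brick1"))
```

The mountpoint has to be the file system that sits on the thin volume, since
that is where the discards need to start from.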

>
>> There is also an event mechanism to help us get notified when we hit a target
>> configurable watermark ("help, we are running short on real disk, add more
>> or clean up!").
> Can you point me to some docs about this feature?

My quick google search only shows my own very shallow talk slides, so let me dig 
around for something better :)

>
>> Definitely worth following up with the LVM/device mapper people on how to do
>> this best,
>>
>> Ric
> Thanks for the comments. From everyone I've talked to, it seems some
> of the answers are still in progress. The good news is that I'm ahead
> of the curve for being ready for when this becomes more mainstream. I
> think Paul is in the same position too.
>
> James

This is all new stuff - even without gluster on top of it - so this will mean
hitting a few bumps I fear.  Definitely worth putting thought into this now and 
working on the documentation,

Ric


* Re: [Gluster-devel] Puppet-Gluster+ThinP
  2014-04-21  0:11       ` Puppet-Gluster+ThinP James
  2014-04-21  0:44         ` [Gluster-devel] Puppet-Gluster+ThinP Ric Wheeler
@ 2014-04-22 14:30         ` David Teigland
       [not found]           ` <20140422143018.GA25966-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 7+ messages in thread
From: David Teigland @ 2014-04-22 14:30 UTC
  To: James; +Cc: Paul Cuzner, device-mapper development, Ric Wheeler,
	Gluster Devel

On Sun, Apr 20, 2014 at 08:11:08PM -0400, James wrote:
> On Sun, Apr 20, 2014 at 7:59 PM, Ric Wheeler <rwheeler@redhat.com> wrote:
> > There is also an event mechanism to help us get notified when we hit a target
> > configurable watermark ("help, we are running short on real disk, add more
> > or clean up!").
> Can you point me to some docs about this feature?

This topical man page is a recent addition.  If there are questions not
covered here, we may want to add information about it.

http://man7.org/linux/man-pages/man7/lvmthin.7.html


* Re: Puppet-Gluster+ThinP
       [not found]           ` <535469F9.7050605-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2014-04-24  5:59             ` James
  2014-04-24 12:03               ` [Gluster-devel] Puppet-Gluster+ThinP Lukáš Czerner
  0 siblings, 1 reply; 7+ messages in thread
From: James @ 2014-04-24  5:59 UTC
  To: Ric Wheeler; +Cc: Lukas Czerner, device-mapper development, Gluster Devel

On Sun, Apr 20, 2014 at 8:44 PM, Ric Wheeler <rwheeler-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On 04/20/2014 05:11 PM, James wrote:
>>
>> On Sun, Apr 20, 2014 at 7:59 PM, Ric Wheeler <rwheeler-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>>>
>>> The amount of space you set aside is very much workload dependent (rate
>>> of
>>> change, rate of deletion, rate of notifying the storage about the freed
>>> space).
>>
>>  From the Puppet-Gluster perspective, this will be configurable. I
>> would like to set a vaguely sensible default though, which I don't
>> have at the moment.
>
>
> This will require a bit of thinking as you have noticed, but let's start
> with some definitions.
>
> The basic use case is one file system backed by an exclusive dm-thinp target
> (no other file system writing to that dm-thinp pool or contending for
> allocation).
>
> The goal is to get an alert in time to intervene before things get ugly, so
> we are hoping to get a sense of rate of change in the file system and how
> long any snapshot will be retained for.
>
> For example, if we have a 10TB file system (presented as such to the user)
> and we write say 500GB of new data/day, daily snapshots will need that space
> for as long as we retain them.  If you write much less (5GB/day), it will
> clearly take a lot less.
>
> The above makes this all an effort to predict the future, but that is where
> the watermark alert kicks in to help us recover from a bad prediction.
>
> Maybe we use a default of setting aside 20% of raw capacity for snapshots
> and set that watermark at 90% full?  For a lot of people, I suspect a
> fairly low rate of change and that means pretty skinny snapshots.
>
> We will clearly need to have a lot of effort here in helping explain this to
> users so they can make the trade off for their particular use case.
>
>
>>
>>> Keep in mind with snapshots (and thinly provisioned storage, whether
>>> using a
>>> software target or thinly provisioned array) we need to issue the
>>> "discard"
>>> commands down the IO stack in order to let the storage target reclaim
>>> space.
>>>
>>> That typically means running the fstrim command on the local file system
>>> (XFS, ext4, btrfs, etc) every so often. Less typically, you can mount
>>> your
>>> local file system with "-o discard" to do it inband (but that comes at a
>>> performance penalty usually).
>>
>> Do you think it would make sense to have Puppet-Gluster add a cron job
>> to do this operation?
>> Exactly what command should run, and how often? (Again for having
>> sensible defaults.)
>
>
> I think that we should probably run fstrim once a day or so (hopefully late
> at night or off peak)?  Adding in Lukas who led a lot of the discard work.

I decided I'd kick off this party by writing a patch, and opening a
bug against my own product (is it cool to do that?)
Bug is: https://bugzilla.redhat.com/show_bug.cgi?id=1090757
Patch is: https://github.com/purpleidea/puppet-gluster/commit/1444914fe5988cc285cd572e3ed1042365d58efd
Please comment on the bug if you have any advice or recommendations
about fstrim.

Thanks!

>
>
>>
>>> There is also an event mechanism to help us get notified when we hit a
>>> target
>>> configurable watermark ("help, we are running short on real disk, add
>>> more
>>> or clean up!").
>>
>> Can you point me to some docs about this feature?
>
>
> My quick google search only shows my own very shallow talk slides, so let me
> dig around for something better :)
>
>
>>
>>> Definitely worth following up with the LVM/device mapper people on how to
>>> do
>>> this best,
>>>
>>> Ric
>>
>> Thanks for the comments. From everyone I've talked to, it seems some
>> of the answers are still in progress. The good news is that I'm ahead
>> of the curve for being ready for when this becomes more mainstream. I
>> think Paul is in the same position too.
>>
>> James
>
>
> This is all new stuff - even without gluster on top of it - so this will
> mean hitting a few bumps I fear.  Definitely worth putting thought into this
> now and working on the documentation,
>
> Ric
>


* Re: [dm-devel]  Puppet-Gluster+ThinP
       [not found]           ` <20140422143018.GA25966-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2014-04-24  5:59             ` James
  0 siblings, 0 replies; 7+ messages in thread
From: James @ 2014-04-24  5:59 UTC
  To: David Teigland; +Cc: device-mapper development, Ric Wheeler, Gluster Devel

On Tue, Apr 22, 2014 at 10:30 AM, David Teigland <teigland-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> This topical man page is a recent addition.  If there are questions not
> covered here, we may want to add information about it.
>
> http://man7.org/linux/man-pages/man7/lvmthin.7.html


Hey, I've actually read this and it was extremely helpful.
Someone in #lvm pointed it out a few weeks ago.
I do have some questions, but I think they're more along the lines of
"what thin-p setup does glusterfs expect/prefer?"
Please feel free to have a quick look at:

https://bugzilla.redhat.com/show_bug.cgi?id=1090757


* Re: [Gluster-devel] Puppet-Gluster+ThinP
  2014-04-24  5:59             ` Puppet-Gluster+ThinP James
@ 2014-04-24 12:03               ` Lukáš Czerner
  0 siblings, 0 replies; 7+ messages in thread
From: Lukáš Czerner @ 2014-04-24 12:03 UTC
  To: James; +Cc: Paul Cuzner, device-mapper development, Ric Wheeler,
	Gluster Devel

On Thu, 24 Apr 2014, James wrote:

> Date: Thu, 24 Apr 2014 01:59:21 -0400
> From: James <purpleidea@gmail.com>
> To: Ric Wheeler <rwheeler@redhat.com>
> Cc: Paul Cuzner <pcuzner@redhat.com>,
>     Gluster Devel <gluster-devel@nongnu.org>,
>     device-mapper development <dm-devel@redhat.com>,
>     Lukas Czerner <lczerner@redhat.com>
> Subject: Re: [Gluster-devel] Puppet-Gluster+ThinP
> 
> On Sun, Apr 20, 2014 at 8:44 PM, Ric Wheeler <rwheeler@redhat.com> wrote:
> > On 04/20/2014 05:11 PM, James wrote:
> >>
> >> On Sun, Apr 20, 2014 at 7:59 PM, Ric Wheeler <rwheeler@redhat.com> wrote:
> >>>
> >>> The amount of space you set aside is very much workload dependent (rate
> >>> of
> >>> change, rate of deletion, rate of notifying the storage about the freed
> >>> space).
> >>
> >>  From the Puppet-Gluster perspective, this will be configurable. I
> >> would like to set a vaguely sensible default though, which I don't
> >> have at the moment.
> >
> >
> > This will require a bit of thinking as you have noticed, but let's start
> > with some definitions.
> >
> > The basic use case is one file system backed by an exclusive dm-thinp target
> > (no other file system writing to that dm-thinp pool or contending for
> > allocation).
> >
> > The goal is to get an alert in time to intervene before things get ugly, so
> > we are hoping to get a sense of rate of change in the file system and how
> > long any snapshot will be retained for.
> >
> > For example, if we have a 10TB file system (presented as such to the user)
> > and we write say 500GB of new data/day, daily snapshots will need that space
> > for as long as we retain them.  If you write much less (5GB/day), it will
> > clearly take a lot less.
> >
> > The above makes this all an effort to predict the future, but that is where
> > the watermark alert kicks in to help us recover from a bad prediction.
> >
> > Maybe we use a default of setting aside 20% of raw capacity for snapshots
> > and set that watermark at 90% full?  For a lot of people, I suspect a
> > fairly low rate of change and that means pretty skinny snapshots.
> >
> > We will clearly need to have a lot of effort here in helping explain this to
> > users so they can make the trade off for their particular use case.
> >
> >
> >>
> >>> Keep in mind with snapshots (and thinly provisioned storage, whether
> >>> using a
> >>> software target or thinly provisioned array) we need to issue the
> >>> "discard"
> >>> commands down the IO stack in order to let the storage target reclaim
> >>> space.
> >>>
> >>> That typically means running the fstrim command on the local file system
> >>> (XFS, ext4, btrfs, etc) every so often. Less typically, you can mount
> >>> your
> >>> local file system with "-o discard" to do it inband (but that comes at a
> >>> performance penalty usually).
> >>
> >> Do you think it would make sense to have Puppet-Gluster add a cron job
> >> to do this operation?
> >> Exactly what command should run, and how often? (Again for having
> >> sensible defaults.)
> >
> >
> > I think that we should probably run fstrim once a day or so (hopefully late
> > at night or off peak)?  Adding in Lukas who led a lot of the discard work.
> 
> I decided I'd kick off this party by writing a patch, and opening a
> bug against my own product (is it cool to do that?)
> Bug is: https://bugzilla.redhat.com/show_bug.cgi?id=1090757
> Patch is: https://github.com/purpleidea/puppet-gluster/commit/1444914fe5988cc285cd572e3ed1042365d58efd
> Please comment on the bug if you have any advice or recommendations
> about fstrim.

This is a good workaround (assuming that ${valid_path} is a
mountpoint of the file system on top of the thinp), but eventually I think
it would be great if this could be done automatically at a lower level.

There is already some effort from the lvm2 team

https://bugzilla.redhat.com/show_bug.cgi?id=824900

But I think the best solution would be for them to fire off fstrim
on the file system when the pool hits its watermark. This could
be done via their own dmeventd daemon.

They already have a policy where dmeventd watches thinp pool
utilization and, at certain thresholds, fires off lvm commands to
possibly extend the pool based on the lvm.conf settings. So I think
that is the right place to put this functionality.
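
Until something like that exists in dmeventd, a polling check can approximate
the idea from user space. The Python sketch below is not the dmeventd
mechanism itself: it reads the pool's data_percent field from lvs and
compares it with a watermark. The 90% default and the vg0/thinpool names are
assumptions, and the parser expects a plain decimal number with a dot as the
separator.

```python
import subprocess


def parse_data_percent(lvs_output: str) -> float:
    """Parse the single value printed by 'lvs --noheadings -o data_percent'."""
    return float(lvs_output.strip())


def pool_needs_attention(data_percent: float,
                         watermark_pct: float = 90.0) -> bool:
    """True once pool usage has reached or passed the watermark."""
    return data_percent >= watermark_pct


def read_pool_data_percent(vg: str, pool: str) -> float:
    """Ask lvs for the thin pool's data usage (requires lvm2 and privileges)."""
    result = subprocess.run(
        ["lvs", "--noheadings", "-o", "data_percent", f"{vg}/{pool}"],
        check=True, capture_output=True, text=True,
    )
    return parse_data_percent(result.stdout)


if __name__ == "__main__":
    # Hypothetical volume group and pool names.
    used = read_pool_data_percent("vg0", "thinpool")
    if pool_needs_attention(used):
        print(f"thin pool at {used:.2f}%: run fstrim, extend, or clean up")
```

The pool-extension policy mentioned above is driven by the
thin_pool_autoextend_threshold and thin_pool_autoextend_percent settings in
lvm.conf, which lvmthin(7) describes.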

But that needs to be discussed with lvm2 people.

Thanks!
-Lukas

> 
> Thanks!
> 
> >
> >
> >>
> >>> There is also an event mechanism to help us get notified when we hit a
> >>> target
> >>> configurable watermark ("help, we are running short on real disk, add
> >>> more
> >>> or clean up!").
> >>
> >> Can you point me to some docs about this feature?
> >
> >
> > My quick google search only shows my own very shallow talk slides, so let me
> > dig around for something better :)
> >
> >
> >>
> >>> Definitely worth following up with the LVM/device mapper people on how to
> >>> do
> >>> this best,
> >>>
> >>> Ric
> >>
> >> Thanks for the comments. From everyone I've talked to, it seems some
> >> of the answers are still in progress. The good news is that I'm ahead
> >> of the curve for being ready for when this becomes more mainstream. I
> >> think Paul is in the same position too.
> >>
> >> James
> >
> >
> > This is all new stuff - even without gluster on top of it - so this will
> > mean hitting a few bumps I fear.  Definitely worth putting thought into this
> > now and working on the documentation,
> >
> > Ric
> >
> 

