From: Ric Wheeler <rwheeler@redhat.com>
To: Chris Mason <chris.mason@oracle.com>
Cc: Matthew Wilcox <matthew@wil.cx>, Theodore Tso <tytso@mit.edu>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	James Bottomley <James.Bottomley@hansenpartnership.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	David Woodhouse <dwmw2@infradead.org>,
	linux-scsi@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Black_David@emc.com, Tom Coughlan <coughlan@redhat.com>
Subject: Re: thin provisioned LUN support
Date: Fri, 07 Nov 2008 16:04:52 -0500
Message-ID: <4914AD74.30509@redhat.com>
In-Reply-To: <1226090910.15281.84.camel@think.oraclecorp.com>

Chris Mason wrote:
> On Fri, 2008-11-07 at 15:26 -0500, Ric Wheeler wrote:
>   
>> Matthew Wilcox wrote:
>>     
>>> On Fri, Nov 07, 2008 at 03:19:13PM -0500, Theodore Tso wrote:
>>>   
>>>       
>>>> Let's be just a *little* bit fair here.  Suppose we wanted to
>>>> implement thin-provisioned disks using devicemapper and LVM; consider
>>>> that LVM uses a default PE size of 4M for some very good reasons.
>>>> Asking filesystems to be a little smarter about allocation policies so
>>>> that we allocate in existing 4M chunks before going onto the next, and
>>>> asking the block layer to pool trim requests to 4M chunks is not
>>>> totally unreasonable.
>>>>
>>>> Array vendors use chunk sizes > than typical filesystem chunk sizes
>>>> for the same reason that LVM does.  So to say that this is due to
>>>> purely a "broken firmware architecture" is a little unfair.
>>>>     
>>>>         
>>> I think we would have a full-throated discussion about whether the
>>> right thing to do was to put the tracking in the block layer or in LVM.
>>> Rather similar to what we're doing now, in fact.
>>>   
>>>       
>> You definitely could imagine having a device mapper target that could 
>> track the discards commands and subsequent writes which would invalidate 
>> the previous discards.
>>
>> Actually, it would be kind of nice to move all of this away from the 
>> file systems entirely.
>>     
>
> * Fast
> * Crash safe
> * Bounded ram usage
> * Accurately deliver the trims
>
> Pick any three ;)  If we're dealing with large files, I can see it
> working well.  For files that are likely to be smaller than the physical
> extent size, you end up with either extra state bits on disk (and
> keeping them in sync) or a log structured lvm.
>
> I do agree that an offline tool to account for bytes used would be able
> to make up for this, and from a thin provisioning point of view, we
> might be better off if we don't accurately deliver all the trims all the
> time.
>   

Given that best practice more or less says users need to set the high 
water mark low enough to give storage admins time to react, I think a 
tool like this would be very useful.

Think of how nasty it would be to run out of real blocks on a device 
that seems to have plenty of unused capacity :-)
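
As a rough sketch of the kind of check such an accounting tool might 
do (the function name, its arguments, and the 80% default are made up 
for illustration; no existing tool is being described here):

```python
def pool_status(allocated_blocks, physical_blocks, high_water=0.80):
    """Compare real allocation on a thin-provisioned pool to a high water mark.

    allocated_blocks: blocks the array has actually backed with storage
    physical_blocks:  real blocks available in the pool (not the much
                      larger capacity advertised to the host)
    high_water:       hypothetical threshold, as a fraction of the pool
    Returns (used_fraction, over_mark).
    """
    if physical_blocks <= 0:
        raise ValueError("physical_blocks must be positive")
    used = allocated_blocks / physical_blocks
    return used, used >= high_water


if __name__ == "__main__":
    used, over = pool_status(850, 1000)
    print(f"pool {used:.0%} allocated, over high water mark: {over}")
```

The point is only that the comparison is against real blocks in the 
pool, not against the capacity the file system thinks it has.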

> People just use the space again soon anyway, I'd have to guess the
> filesystems end up in a steady state outside of special events.
>
> In another email Ted mentions that it makes sense for the FS allocator
> to notice we've just freed the last block in an aligned region of size
> X, and I'd agree with that.
>
> The trim command we send down when we free the block could just contain
> the entire range that is free (and easy for the FS to determine) every
> time.
>
> -chris
>   
I think sending down the entire contiguous range of freed sectors would work well with these boxes...
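
To make that concrete, here is a toy sketch, assuming a simple 
in-memory free-block map (a list of booleans, True meaning free). The 
helper names and the 4-block chunk in the example are hypothetical and 
this is not how any particular file system or array does it:

```python
def free_extent(free, blk):
    """Return (start, length) of the contiguous free run containing blk."""
    if not free[blk]:
        raise ValueError("block is not free")
    start = blk
    while start > 0 and free[start - 1]:
        start -= 1
    end = blk + 1
    while end < len(free) and free[end]:
        end += 1
    return start, end - start


def aligned_part(start, length, chunk):
    """Trim a free range down to the whole chunk-aligned pieces inside it.

    Returns (start, length) of the aligned sub-range, or None when the
    range does not cover even one full chunk.
    """
    first = -(-start // chunk) * chunk        # round start up to a chunk
    last = (start + length) // chunk * chunk  # round end down to a chunk
    if last <= first:
        return None
    return first, last - first


if __name__ == "__main__":
    # Blocks 1..9 are free; block 5 is the one that was just freed.
    free = [False] + [True] * 9 + [False, False]
    start, length = free_extent(free, 5)
    print("trim range:", start, length)
    print("whole chunks inside it:", aligned_part(start, length, 4))
```

The file system would send the full range from free_extent() every 
time; whatever sits below (block layer, LVM, or the array) could then 
work out which of its own chunks are completely covered.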

ric



