From: Ric Wheeler <rwheeler@redhat.com>
To: Chris Mason <chris.mason@oracle.com>
Cc: Matthew Wilcox <matthew@wil.cx>, Theodore Tso <tytso@mit.edu>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	James Bottomley <James.Bottomley@hansenpartnership.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	David Woodhouse <dwmw2@infradead.org>,
	linux-scsi@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Black_David@emc.com, Tom Coughlan <coughlan@redhat.com>
Subject: Re: thin provisioned LUN support
Date: Fri, 07 Nov 2008 16:04:52 -0500
Message-ID: <4914AD74.30509@redhat.com>
In-Reply-To: <1226090910.15281.84.camel@think.oraclecorp.com>

Chris Mason wrote:
> On Fri, 2008-11-07 at 15:26 -0500, Ric Wheeler wrote:
>   
>> Matthew Wilcox wrote:
>>     
>>> On Fri, Nov 07, 2008 at 03:19:13PM -0500, Theodore Tso wrote:
>>>   
>>>       
>>>> Let's be just a *little* bit fair here.  Suppose we wanted to
>>>> implement thin-provisioned disks using devicemapper and LVM; consider
>>>> that LVM uses a default PE size of 4M for some very good reasons.
>>>> Asking filesystems to be a little smarter about allocation policies so
>>>> that we allocate in existing 4M chunks before going onto the next, and
>>>> asking the block layer to pool trim requests to 4M chunks is not
>>>> totally unreasonable.
>>>>
>>>> Array vendors use chunk sizes larger than typical filesystem chunk sizes
>>>> for the same reason that LVM does.  So to say that this is due to
>>>> purely a "broken firmware architecture" is a little unfair.
>>>>     
>>>>         
>>> I think we would have a full-throated discussion about whether the
>>> right thing to do was to put the tracking in the block layer or in LVM.
>>> Rather similar to what we're doing now, in fact.
>>>   
>>>       
>> You could definitely imagine having a device mapper target that
>> tracks the discard commands and the subsequent writes that would
>> invalidate the previous discards.
>>
>> Actually, it would be kind of nice to move all of this away from the 
>> file systems entirely.
>>     
>
> * Fast
> * Crash safe
> * Bounded ram usage
> * Accurately deliver the trims
>
> Pick any three ;)  If we're dealing with large files, I can see it
> working well.  For files that are likely to be smaller than the physical
> extent size, you end up with either extra state bits on disk (and
> keeping them in sync) or a log structured lvm.
>
> I do agree that an offline tool to account for bytes used would be able
> to make up for this, and from a thin provisioning point of view, we
> might be better off if we don't accurately deliver all the trims all the
> time.
>   

Given that best practice more or less requires users to set the high
water mark low enough to give storage admins time to react, I think a
tool like this would be very useful.

Think of how nasty it would be to run out of real blocks on a device 
that seems to have plenty of unused capacity :-)
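
A rough sketch of the accounting such a tool might do, purely for
illustration. It assumes the array provisions a whole chunk as soon as
any block in it is in use, and that the filesystem's allocation state is
available as a simple list of booleans (True = free). The names
provisioned_chunks, over_high_water, pool_chunks and high_water are all
made up for this example, as is the 80% default:

```python
def provisioned_chunks(free, chunk):
    """Count chunks that hold at least one allocated block.

    `free` is a list of booleans, one per filesystem block (True = free).
    `chunk` is the array's provisioning unit, in blocks (an assumption).
    """
    count = 0
    for start in range(0, len(free), chunk):
        # A chunk stays provisioned unless every block in it is free.
        if not all(free[start:start + chunk]):
            count += 1
    return count


def over_high_water(free, chunk, pool_chunks, high_water=0.8):
    """True once provisioned chunks reach the high water mark of the pool."""
    return provisioned_chunks(free, chunk) >= high_water * pool_chunks
```

A single allocated block keeps its whole chunk provisioned here, which
is how a device can look mostly empty to the filesystem and still be
close to running out of real blocks.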

> People just use the space again soon anyway, I'd have to guess the
> filesystems end up in a steady state outside of special events.
>
> In another email Ted mentions that it makes sense for the FS allocator
> to notice we've just freed the last block in an aligned region of size
> X, and I'd agree with that.
>
> The trim command we send down when we free the block could just contain
> the entire range that is free (and easy for the FS to determine) every
> time.
>
> -chris
>   
I think sending down the entire contiguous range of freed sectors would work well with these boxes...
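
Something along these lines is what I have in mind, as a sketch only.
It assumes the allocator can be modelled as a list of booleans (True =
free) and that the array's chunk size is known; free_extent_around and
aligned_chunks are hypothetical helpers, not anything that exists in a
filesystem or the block layer today:

```python
def free_extent_around(free, block):
    """Return (start, length) of the maximal free run containing `block`.

    `free` is a list of booleans, one per block (True = free), standing
    in for whatever allocation bitmap a real filesystem keeps.
    """
    if not free[block]:
        raise ValueError("block %d is still allocated" % block)
    start = block
    while start > 0 and free[start - 1]:
        start -= 1
    end = block + 1
    while end < len(free) and free[end]:
        end += 1
    return start, end - start


def aligned_chunks(start, length, chunk):
    """Return the chunk-aligned (start, length) pieces fully inside a range."""
    first = -(-start // chunk) * chunk           # round start up to a boundary
    last = ((start + length) // chunk) * chunk   # round end down to a boundary
    return [(c, chunk) for c in range(first, last, chunk)]
```

The filesystem would send the full range from free_extent_around() with
each trim, and the layer below could use something like aligned_chunks()
to decide which whole chunks can actually be released.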

ric