From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Brown
Subject: Re: Software RAID and TRIM
Date: Tue, 19 Jul 2011 12:22:35 +0200
Message-ID:
References: <4E235984.2070704@5t9.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
Sender: linux-raid-owner@vger.kernel.org
To: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

On 19/07/2011 11:29, Lutz Vieweg wrote:
> On 07/18/2011 10:18 PM, David Brown wrote:
>> You don't need to fill an erase block for writing - writes are done
>> as write blocks (I think 4K is the norm).
>
> You are right on that. Those sectors in a partially used erase block
> that have not been written to since the last erase of the whole erase
> block can be written to as good as sectors in completely empty erase
> blocks.
>
>
>> My main point about TRIM being expensive is the effect it has on
>> the block IO queue, regardless of the implementation in the SSD.
>
> Because of those effects on the block-IO-queue, the user-space
> work-around we implemented to discard the SSDs our RAID-1s consist of
> will not discard "one area on all SSDs at a time", but rather iterate
> first through all unused areas on one SSD, then iterate through the
> same list of areas on the second SSD.
>

Do you take the arrays off-line during this process, or at least make
them read-only? If not, how do you ensure that the lists are valid?

> The effect of this is very much to our liking: While we can see
> near-100%-utilization on one SSD at a time during the discards, the
> other SSD will happily service the readers, and even the writes that
> go to the /dev/md* device are buffered in main memory long enough
> that we do not really see a significantly bad impact on the service.
> (This might be different, though, if the discards were done during
> peak-write-load times of the day.)
>
>
>> I really hope your SSD's return zeros for TRIM'ed blocks
>
> For RAID-1, the only consequence of not doing so is just that
> "data-check" runs may result in a > 0 mismatch_cnt. It does not
> destroy any of your data, and as long as I have two SSDs in a RAID,
> both of which give a non-error result when reading a sector, I would
> have no indication of "which of the returned sector contents to
> prefer", anyway.
>
> (I admit that for health monitoring it is useful to have a meaningful
> mismatch_cnt.)
>
>> and that you are sure all your TRIMs are in full raid stripes -
>> otherwise you will /seriously/ mess up your raid arrays.
>
> Again, for RAID0/1 (even 10) I don't see why this would harm any
> data.
>

Fair enough for RAID1. Just don't try it with RAID5!
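
To illustrate the RAID5 warning, here is a toy sketch in Python. It is not md's code. It assumes plain XOR parity over one stripe of three data chunks, that the discard goes straight to a component device behind md's back (as a user-space workaround would), and that the SSD returns zeros for discarded blocks. The names (xor_blocks, on_disk and so on) are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, as RAID5 parity does."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data chunks plus one parity chunk (toy 4-byte chunks).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Partial-stripe discard: only chunk 0 is unused and gets discarded.
# Assumption: the SSD now returns zeros there, and nothing updates parity.
on_disk = [bytes(4), data[1], data[2]]
stripe_consistent = xor_blocks(on_disk) == parity

# Later the device holding chunk 1 (live data, never discarded) fails.
# Rebuild it from the surviving chunks and the stale parity.
rebuilt_chunk1 = xor_blocks([on_disk[0], on_disk[2], parity])
chunk1_recovered = rebuilt_chunk1 == data[1]

# Full-stripe discard: every chunk, parity included, reads back as zeros,
# so the stripe is still self-consistent.
full_stripe_ok = xor_blocks([bytes(4)] * 3) == bytes(4)

print(stripe_consistent, chunk1_recovered, full_stripe_ok)
```

With a partial-stripe discard the parity no longer matches the stripe, and the rebuild of chunk 1 returns garbage even though chunk 1 itself was never touched. On RAID1 the same discard only makes the two copies of an unused area differ, which is why it shows up in mismatch_cnt and nowhere else.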