* [linux-lvm] Testing TRIM with LVM @ 2011-04-12 14:59 DarkNovaNick 2011-04-12 23:47 ` Mike Snitzer 0 siblings, 1 reply; 54+ messages in thread From: DarkNovaNick @ 2011-04-12 14:59 UTC (permalink / raw) To: linux-lvm [-- Attachment #1: Type: text/plain, Size: 1443 bytes --] I recently purchased a Crucial C300 SSD and set it up as my primary drive. I wanted to use LVM, and my reading indicates that LVM added support to pass-through TRIM commands with kernel 2.6.37. I'm running Ubuntu 10.10, but I installed the latest 11.04 kernel, so I'm running kernel 2.6.38-8. I added "discard" to fstab so my mountpoint looks like: /dev/mapper/vg0-vol0 on / type ext4 (rw,noatime,nodiratime,errors=remount-ro,discard,commit=0) I found directions on various sites like: http://duopetalflower.blogspot.com/2010/11/enterprise-kernel-6-has-ssd-trim.html on how to confirm if TRIM is working. I ran (as root): dd if=/dev/urandom of=tempfile count=100 bs=512k oflag=direct hdparm --fibmap tempfile (then took the first begin_LBA) hdparm --read-sector 191710208 /dev/sda and it printed: /dev/sda: reading sector 191710208: succeeded 3254 3a32 3834 313a 2b36 3030 303a 2030 4544 5542 2047 3728 3a29 5320 7661 6e69 2067 6e69 6564 2078 6f66 2072 7270 646f ............ Then I ran: rm tempfile sync hdparm --read-sector 191710208 /dev/sda and it still returned: /dev/sda: reading sector 191710208: succeeded 3254 3a32 3834 313a 2b36 3030 303a 2030 4544 5542 2047 3728 3a29 5320 7661 6e69 2067 6e69 6564 2078 6f66 2072 7270 646f ............. If TRIM is working, the sector is supposed to contain all zeros. Am I doing something wrong or do I need to do something more to get LVM to pass down the TRIM command? Nick [-- Attachment #2: Type: text/html, Size: 1643 bytes --] ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] Testing TRIM with LVM 2011-04-12 14:59 [linux-lvm] Testing TRIM with LVM DarkNovaNick @ 2011-04-12 23:47 ` Mike Snitzer 2011-04-13 1:47 ` DarkNovaNick 0 siblings, 1 reply; 54+ messages in thread From: Mike Snitzer @ 2011-04-12 23:47 UTC (permalink / raw) To: DarkNovaNick; +Cc: LVM general discussion and development On Tue, Apr 12 2011 at 10:59am -0400, DarkNovaNick@gmail.com <DarkNovaNick@gmail.com> wrote: > I recently purchased a Crucial C300 SSD and set it up as my primary > drive. I wanted to use LVM, and my reading indicates that LVM added > support to pass-through TRIM commands with kernel 2.6.37. I'm > running Ubuntu 10.10, but I installed the latest 11.04 kernel, so > I'm running kernel 2.6.38-8. I added "discard" to fstab so my > mountpoint looks like: > > /dev/mapper/vg0-vol0 on / type ext4 > (rw,noatime,nodiratime,errors=remount-ro,discard,commit=0) If the discards aren't working then ext4 will print a warning and stop issuing them, see: http://git.kernel.org/linus/a30eec2a8650a77f7 So if you don't see "discard not supported, disabling" in your dmesg (or /var/log/messages) then TRIM is likely working as expected. > I found directions on various sites like: > http://duopetalflower.blogspot.com/2010/11/enterprise-kernel-6-has-ssd-trim.html > on how to confirm if TRIM is working. I ran (as root): > > dd if=/dev/urandom of=tempfile count=100 bs=512k oflag=direct > hdparm --fibmap tempfile > (then took the first begin_LBA) > hdparm --read-sector 191710208 /dev/sda > > and it printed: > > /dev/sda: > reading sector 191710208: succeeded > 3254 3a32 3834 313a 2b36 3030 303a 2030 > 4544 5542 2047 3728 3a29 5320 7661 6e69 > 2067 6e69 6564 2078 6f66 2072 7270 646f > ............ > > Then I ran: > rm tempfile > sync > hdparm --read-sector 191710208 /dev/sda > > and it still returned: > /dev/sda: > reading sector 191710208: succeeded > 3254 3a32 3834 313a 2b36 3030 303a 2030 > 4544 5542 2047 3728 3a29 5320 7661 6e69 > 2067 6e69 6564 2078 6f66 2072 7270 646f > ............. > > If TRIM is working, the sector is supposed to contain all zeros. That is only if the SSD implements TRIM so that it zeroes the discarded blocks. You can check if the device at least knows enough to report discard_zeroes_data: cat /sys/block/sda/queue/discard_zeroes_data Also, even if the SSD does zero, it may take some time -- well after the discard has completed -- for the SSD to actually zero the blocks (this is true of some lesser quality SSDs). > Am I doing something wrong or do I need to do something more to get > LVM to pass down the TRIM command? LVM (device mapper specifically) passes discards perfectly well. But the snapshot and dm-crypt targets don't have discard support. All other targets do have discard support. What is your output for: dmsetup table Mike ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] Testing TRIM with LVM 2011-04-12 23:47 ` Mike Snitzer @ 2011-04-13 1:47 ` DarkNovaNick 2011-04-13 8:41 ` Zdenek Kabelac 2011-04-13 22:40 ` [linux-lvm] [PATCH] dm snapshot: add discard support to the snapshot-origin target [was: Re: Testing TRIM with LVM] Mike Snitzer 0 siblings, 2 replies; 54+ messages in thread From: DarkNovaNick @ 2011-04-13 1:47 UTC (permalink / raw) To: Mike Snitzer; +Cc: LVM general discussion and development [-- Attachment #1: Type: text/plain, Size: 4058 bytes --] On Apr 12, 2011 6:47pm, Mike Snitzer <snitzer@redhat.com> wrote: > On Tue, Apr 12 2011 at 10:59am -0400, > DarkNovaNick@gmail.com DarkNovaNick@gmail.com> wrote: > > I recently purchased a Crucial C300 SSD and set it up as my primary > > drive. I wanted to use LVM, and my reading indicates that LVM added > > support to pass-through TRIM commands with kernel 2.6.37. I'm > > running Ubuntu 10.10, but I installed the latest 11.04 kernel, so > > I'm running kernel 2.6.38-8. I added "discard" to fstab so my > > mountpoint looks like: > > > > /dev/mapper/vg0-vol0 on / type ext4 > > (rw,noatime,nodiratime,errors=remount-ro,discard,commit=0) > If the discards aren't working then ext4 will print a warning and stop > issuing them, see: http://git.kernel.org/linus/a30eec2a8650a77f7 > So if you don't see "discard not supported, disabling" in your dmesg (or > /var/log/messages) then TRIM is likely working as expected. > > I found directions on various sites like: > > > http://duopetalflower.blogspot.com/2010/11/enterprise-kernel-6-has-ssd-trim.html > > on how to confirm if TRIM is working. I ran (as root): > > > > dd if=/dev/urandom of=tempfile count=100 bs=512k oflag=direct > > hdparm --fibmap tempfile > > (then took the first begin_LBA) > > hdparm --read-sector 191710208 /dev/sda > > > > and it printed: > > > > /dev/sda: > > reading sector 191710208: succeeded > > 3254 3a32 3834 313a 2b36 3030 303a 2030 > > 4544 5542 2047 3728 3a29 5320 7661 6e69 > > 2067 6e69 6564 2078 6f66 2072 7270 646f > > ............ > > > > Then I ran: > > rm tempfile > > sync > > hdparm --read-sector 191710208 /dev/sda > > > > and it still returned: > > /dev/sda: > > reading sector 191710208: succeeded > > 3254 3a32 3834 313a 2b36 3030 303a 2030 > > 4544 5542 2047 3728 3a29 5320 7661 6e69 > > 2067 6e69 6564 2078 6f66 2072 7270 646f > > ............. > > > > If TRIM is working, the sector is supposed to contain all zeros. > That is only if the SSD implements TRIM so that it zeroes the discarded > blocks. You can check if the device at least knows enough to report > discard_zeroes_data: > cat /sys/block/sda/queue/discard_zeroes_data > Also, even if the SSD does zero, it may take some time -- well after the > discard has completed -- for the SSD to actually zero the blocks (this > is true of some lesser quality SSDs). > > Am I doing something wrong or do I need to do something more to get > > LVM to pass down the TRIM command? > LVM (device mapper specifically) passes discards perfectly well. But > the snapshot and dm-crypt targets don't have discard support. All other > targets do have discard support. What is your output for: dmsetup table > Mike I get 0 back when I do "cat /sys/block/sda/queue/discard_zeroes_data" so does this mean that my drive doesn't zero out the TRIMed blocks? I don't really care if it does or does not, I just want to make sure the drive is getting the TRIM commands as the benchmarks I've seen show performance degradation if they aren't used. Regarding snapshot targets, the only instance of "discard not supported" came as I did a snapshot today: Apr 12 15:27:58 darknovanick kernel: [65147.211211] EXT4-fs warning (device dm-0): release_blocks_on_commit:2672: discard not supported, disabling Apr 12 15:28:16 darknovanick kernel: [65166.103061] EXT4-fs (dm-1): 72 orphan inodes deleted Apr 12 15:28:16 darknovanick kernel: [65166.103062] EXT4-fs (dm-1): recovery complete Apr 12 15:28:17 darknovanick kernel: [65166.232257] EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null) This snapshot was of a LV on the SSD, but the snapshot itself was being stored on a mechanical hard drive. Does this message mean that discard was just disabled for the snapshot itself (which was deleted a half hour later), or is it now disabled for the entire volume now that I've done a snapshot, even though its been removed? My dmsetup table output: vg0-vol0: 0 249438208 linear 8:2 512 Thanks, Nick [-- Attachment #2: Type: text/html, Size: 5644 bytes --] ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] Testing TRIM with LVM 2011-04-13 1:47 ` DarkNovaNick @ 2011-04-13 8:41 ` Zdenek Kabelac 2011-04-13 15:38 ` DarkNovaNick 2011-04-13 22:40 ` [linux-lvm] [PATCH] dm snapshot: add discard support to the snapshot-origin target [was: Re: Testing TRIM with LVM] Mike Snitzer 1 sibling, 1 reply; 54+ messages in thread From: Zdenek Kabelac @ 2011-04-13 8:41 UTC (permalink / raw) To: LVM general discussion and development; +Cc: Mike Snitzer, DarkNovaNick Dne 13.4.2011 03:47, DarkNovaNick@gmail.com napsal(a): > On Apr 12, 2011 6:47pm, Mike Snitzer <snitzer@redhat.com> wrote: >> On Tue, Apr 12 2011 at 10:59am -0400, > >> DarkNovaNick@gmail.com DarkNovaNick@gmail.com> wrote: > > > >> > I recently purchased a Crucial C300 SSD and set it up as my primary > >> > drive. I wanted to use LVM, and my reading indicates that LVM added > >> > support to pass-through TRIM commands with kernel 2.6.37. I'm > >> > running Ubuntu 10.10, but I installed the latest 11.04 kernel, so > >> > I'm running kernel 2.6.38-8. I added "discard" to fstab so my > >> > mountpoint looks like: > >> > > >> > /dev/mapper/vg0-vol0 on / type ext4 > >> > (rw,noatime,nodiratime,errors=remount-ro,discard,commit=0) > > > >> If the discards aren't working then ext4 will print a warning and stop > >> issuing them, see: http://git.kernel.org/linus/a30eec2a8650a77f7 > > > >> So if you don't see "discard not supported, disabling" in your dmesg (or > >> /var/log/messages) then TRIM is likely working as expected. > > > >> > I found directions on various sites like: > >> > >> http://duopetalflower.blogspot.com/2010/11/enterprise-kernel-6-has-ssd-trim.html >> > >> > on how to confirm if TRIM is working. I ran (as root): > >> > > >> > dd if=/dev/urandom of=tempfile count=100 bs=512k oflag=direct > >> > hdparm --fibmap tempfile > >> > (then took the first begin_LBA) > >> > hdparm --read-sector 191710208 /dev/sda > >> > > >> > and it printed: > >> > > >> > /dev/sda: > >> > reading sector 191710208: succeeded > >> > 3254 3a32 3834 313a 2b36 3030 303a 2030 > >> > 4544 5542 2047 3728 3a29 5320 7661 6e69 > >> > 2067 6e69 6564 2078 6f66 2072 7270 646f > >> > ............ > >> > > >> > Then I ran: > >> > rm tempfile > >> > sync > >> > hdparm --read-sector 191710208 /dev/sda > >> > > >> > and it still returned: > >> > /dev/sda: > >> > reading sector 191710208: succeeded > >> > 3254 3a32 3834 313a 2b36 3030 303a 2030 > >> > 4544 5542 2047 3728 3a29 5320 7661 6e69 > >> > 2067 6e69 6564 2078 6f66 2072 7270 646f > >> > ............. > >> > > >> > If TRIM is working, the sector is supposed to contain all zeros. > > > >> That is only if the SSD implements TRIM so that it zeroes the discarded > >> blocks. You can check if the device at least knows enough to report > >> discard_zeroes_data: > > > >> cat /sys/block/sda/queue/discard_zeroes_data > > > >> Also, even if the SSD does zero, it may take some time -- well after the > >> discard has completed -- for the SSD to actually zero the blocks (this > >> is true of some lesser quality SSDs). > > > >> > Am I doing something wrong or do I need to do something more to get > >> > LVM to pass down the TRIM command? > > > >> LVM (device mapper specifically) passes discards perfectly well. But > >> the snapshot and dm-crypt targets don't have discard support. All other > >> targets do have discard support. What is your output for: dmsetup table > > > >> Mike > > > > > I get 0 back when I do "cat /sys/block/sda/queue/discard_zeroes_data" so does > this mean that my drive doesn't zero out the TRIMed blocks? I don't really > care if it does or does not, I just want to make sure the drive is getting the > TRIM commands as the benchmarks I've seen show performance degradation if they > aren't used. As I've the same drive - what kind of performance degradation do you see? Note: this drive (unlike many others) is processing TRIM command on background. So 'traditional' benchmark do not provide valuable results. (As we've been evaluating this drive somewhat to compare with Intel). Outcome is - when you issue TRIM command - it's immediately finished - but the performance of drive is temporarily decreased while the drive is performing block trimming for some time. Thus you might get 'impressive' results when some benchmarks are evaluating TRIM performance ;) Zdenek ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] Testing TRIM with LVM 2011-04-13 8:41 ` Zdenek Kabelac @ 2011-04-13 15:38 ` DarkNovaNick 0 siblings, 0 replies; 54+ messages in thread From: DarkNovaNick @ 2011-04-13 15:38 UTC (permalink / raw) To: Zdenek Kabelac, DarkNovaNick Cc: Mike Snitzer, LVM general discussion and development [-- Attachment #1: Type: text/plain, Size: 4741 bytes --] On Apr 13, 2011 3:41am, Zdenek Kabelac <zdenek.kabelac@gmail.com> wrote: > Dne 13.4.2011 03:47, DarkNovaNick@gmail.com napsal(a): > > On Apr 12, 2011 6:47pm, Mike Snitzer snitzer@redhat.com> wrote: > >> On Tue, Apr 12 2011 at 10:59am -0400, > > > >> DarkNovaNick@gmail.com DarkNovaNick@gmail.com> wrote: > > > > > > > >> > I recently purchased a Crucial C300 SSD and set it up as my primary > > > >> > drive. I wanted to use LVM, and my reading indicates that LVM added > > > >> > support to pass-through TRIM commands with kernel 2.6.37. I'm > > > >> > running Ubuntu 10.10, but I installed the latest 11.04 kernel, so > > > >> > I'm running kernel 2.6.38-8. I added "discard" to fstab so my > > > >> > mountpoint looks like: > > > >> > > > > >> > /dev/mapper/vg0-vol0 on / type ext4 > > > >> > (rw,noatime,nodiratime,errors=remount-ro,discard,commit=0) > > > > > > > >> If the discards aren't working then ext4 will print a warning and stop > > > >> issuing them, see: http://git.kernel.org/linus/a30eec2a8650a77f7 > > > > > > > >> So if you don't see "discard not supported, disabling" in your dmesg > (or > > > >> /var/log/messages) then TRIM is likely working as expected. > > > > > > > >> > I found directions on various sites like: > > > >> > > >> > http://duopetalflower.blogspot.com/2010/11/enterprise-kernel-6-has-ssd-trim.html > >> > > > >> > on how to confirm if TRIM is working. I ran (as root): > > > >> > > > > >> > dd if=/dev/urandom of=tempfile count=100 bs=512k oflag=direct > > > >> > hdparm --fibmap tempfile > > > >> > (then took the first begin_LBA) > > > >> > hdparm --read-sector 191710208 /dev/sda > > > >> > > > > >> > and it printed: > > > >> > > > > >> > /dev/sda: > > > >> > reading sector 191710208: succeeded > > > >> > 3254 3a32 3834 313a 2b36 3030 303a 2030 > > > >> > 4544 5542 2047 3728 3a29 5320 7661 6e69 > > > >> > 2067 6e69 6564 2078 6f66 2072 7270 646f > > > >> > ............ > > > >> > > > > >> > Then I ran: > > > >> > rm tempfile > > > >> > sync > > > >> > hdparm --read-sector 191710208 /dev/sda > > > >> > > > > >> > and it still returned: > > > >> > /dev/sda: > > > >> > reading sector 191710208: succeeded > > > >> > 3254 3a32 3834 313a 2b36 3030 303a 2030 > > > >> > 4544 5542 2047 3728 3a29 5320 7661 6e69 > > > >> > 2067 6e69 6564 2078 6f66 2072 7270 646f > > > >> > ............. > > > >> > > > > >> > If TRIM is working, the sector is supposed to contain all zeros. > > > > > > > >> That is only if the SSD implements TRIM so that it zeroes the discarded > > > >> blocks. You can check if the device at least knows enough to report > > > >> discard_zeroes_data: > > > > > > > >> cat /sys/block/sda/queue/discard_zeroes_data > > > > > > > >> Also, even if the SSD does zero, it may take some time -- well after > the > > > >> discard has completed -- for the SSD to actually zero the blocks (this > > > >> is true of some lesser quality SSDs). > > > > > > > >> > Am I doing something wrong or do I need to do something more to get > > > >> > LVM to pass down the TRIM command? > > > > > > > >> LVM (device mapper specifically) passes discards perfectly well. But > > > >> the snapshot and dm-crypt targets don't have discard support. All other > > > >> targets do have discard support. What is your output for: dmsetup table > > > > > > > >> Mike > > > > > > > > > > I get 0 back when I do "cat /sys/block/sda/queue/discard_zeroes_data" > so does > > this mean that my drive doesn't zero out the TRIMed blocks? I don't > really > > care if it does or does not, I just want to make sure the drive is > getting the > > TRIM commands as the benchmarks I've seen show performance degradation > if they > > aren't used. > As I've the same drive - what kind of performance degradation do you see? > Note: this drive (unlike many others) is processing TRIM command on > background. So 'traditional' benchmark do not provide valuable results. > (As we've been evaluating this drive somewhat to compare with Intel). > Outcome is - when you issue TRIM command - it's immediately finished - > but the > performance of drive is temporarily decreased while the drive is > performing > block trimming for some time. Thus you might get 'impressive' results when > some benchmarks are evaluating TRIM performance ;) > Zdenek I may not have been clear, but the benchmarks I am referring to show the performance degradation if you are *not* using TRIM (ie just relying on the drive's internal garbage collector). Which is why I want to make sure the TRIM is working so it doesn't get into this poor state. Nick [-- Attachment #2: Type: text/html, Size: 8591 bytes --] ^ permalink raw reply [flat|nested] 54+ messages in thread
* [linux-lvm] [PATCH] dm snapshot: add discard support to the snapshot-origin target [was: Re: Testing TRIM with LVM] 2011-04-13 1:47 ` DarkNovaNick 2011-04-13 8:41 ` Zdenek Kabelac @ 2011-04-13 22:40 ` Mike Snitzer 2011-04-13 23:48 ` [linux-lvm] " Mike Snitzer 2011-04-14 15:31 ` [linux-lvm] [PATCH] dm snapshot: add discard support to the snapshot-origin target [was: Re: Testing TRIM wi DarkNovaNick 1 sibling, 2 replies; 54+ messages in thread From: Mike Snitzer @ 2011-04-13 22:40 UTC (permalink / raw) To: DarkNovaNick; +Cc: dm-devel, LVM general discussion and development On Tue, Apr 12 2011 at 9:47pm -0400, DarkNovaNick@gmail.com <DarkNovaNick@gmail.com> wrote: > I get 0 back when I do "cat > /sys/block/sda/queue/discard_zeroes_data" so does this mean that my > drive doesn't zero out the TRIMed blocks? I don't really care if it > does or does not, I just want to make sure the drive is getting the > TRIM commands as the benchmarks I've seen show performance > degradation if they aren't used. Right, means TRIM doesn't zero the TRIM'd blocks. But otherwise it looks like you're in good shape relative to TRIM. > Regarding snapshot targets, the only instance of "discard not > supported" came as I did a snapshot today: > > Apr 12 15:27:58 darknovanick kernel: [65147.211211] EXT4-fs warning > (device dm-0): release_blocks_on_commit:2672: discard not supported, > disabling > Apr 12 15:28:16 darknovanick kernel: [65166.103061] EXT4-fs (dm-1): > 72 orphan inodes deleted > Apr 12 15:28:16 darknovanick kernel: [65166.103062] EXT4-fs (dm-1): > recovery complete > Apr 12 15:28:17 darknovanick kernel: [65166.232257] EXT4-fs (dm-1): > mounted filesystem with ordered data mode. Opts: (null) > > This snapshot was of a LV on the SSD, but the snapshot itself was > being stored on a mechanical hard drive. Does this message mean that > discard was just disabled for the snapshot itself (which was deleted > a half hour later), or is it now disabled for the entire volume now > that I've done a snapshot, even though its been removed? Unfortunately, the failed discard to the snapshot-origin device disables ext4's discards for the entire volume (even after you remove the snapshot). But the following patch fixes that: Subject: dm snapshot: add discard support to the snapshot-origin target Allow the snapshot-origin target to pass discards to the origin device (after any copyout of the region being discarded; but generally speaking that copyout won't be needed as the region will have already been copied to the cow). So for example, when you remove a file from an ext4 mounted origin volume it triggers copyout to the snapshot. When the DISCARD is then issued during the subsequent journal commit no copyout is needed. Signed-off-by: Mike Snitzer <snitzer@redhat.com> --- drivers/md/dm-snap.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c index a2d3309..028d216 100644 --- a/drivers/md/dm-snap.c +++ b/drivers/md/dm-snap.c @@ -2076,6 +2076,7 @@ static int origin_ctr(struct dm_target *ti, unsigned int argc, char **argv) ti->private = dev; ti->num_flush_requests = 1; + ti->num_discard_requests = 1; return 0; } @@ -2153,7 +2154,7 @@ static int origin_iterate_devices(struct dm_target *ti, static struct target_type origin_target = { .name = "snapshot-origin", - .version = {1, 7, 1}, + .version = {1, 8, 0}, .module = THIS_MODULE, .ctr = origin_ctr, .dtr = origin_dtr, ^ permalink raw reply related [flat|nested] 54+ messages in thread
* Re: [linux-lvm] dm snapshot: add discard support to the snapshot-origin target [was: Re: Testing TRIM with LVM] 2011-04-13 22:40 ` [linux-lvm] [PATCH] dm snapshot: add discard support to the snapshot-origin target [was: Re: Testing TRIM with LVM] Mike Snitzer @ 2011-04-13 23:48 ` Mike Snitzer 2011-04-26 17:32 ` Mike Snitzer 2011-04-14 15:31 ` [linux-lvm] [PATCH] dm snapshot: add discard support to the snapshot-origin target [was: Re: Testing TRIM wi DarkNovaNick 1 sibling, 1 reply; 54+ messages in thread From: Mike Snitzer @ 2011-04-13 23:48 UTC (permalink / raw) To: DarkNovaNick; +Cc: dm-devel, LVM general discussion and development On Wed, Apr 13 2011 at 6:40pm -0400, Mike Snitzer <snitzer@redhat.com> wrote: > On Tue, Apr 12 2011 at 9:47pm -0400, > DarkNovaNick@gmail.com <DarkNovaNick@gmail.com> wrote: > > > I get 0 back when I do "cat > > /sys/block/sda/queue/discard_zeroes_data" so does this mean that my > > drive doesn't zero out the TRIMed blocks? I don't really care if it > > does or does not, I just want to make sure the drive is getting the > > TRIM commands as the benchmarks I've seen show performance > > degradation if they aren't used. > > Right, means TRIM doesn't zero the TRIM'd blocks. > > But otherwise it looks like you're in good shape relative to TRIM. > > > Regarding snapshot targets, the only instance of "discard not > > supported" came as I did a snapshot today: > > > > Apr 12 15:27:58 darknovanick kernel: [65147.211211] EXT4-fs warning > > (device dm-0): release_blocks_on_commit:2672: discard not supported, > > disabling > > Apr 12 15:28:16 darknovanick kernel: [65166.103061] EXT4-fs (dm-1): > > 72 orphan inodes deleted > > Apr 12 15:28:16 darknovanick kernel: [65166.103062] EXT4-fs (dm-1): > > recovery complete > > Apr 12 15:28:17 darknovanick kernel: [65166.232257] EXT4-fs (dm-1): > > mounted filesystem with ordered data mode. Opts: (null) > > > > This snapshot was of a LV on the SSD, but the snapshot itself was > > being stored on a mechanical hard drive. Does this message mean that > > discard was just disabled for the snapshot itself (which was deleted > > a half hour later), or is it now disabled for the entire volume now > > that I've done a snapshot, even though its been removed? > > Unfortunately, the failed discard to the snapshot-origin device disables > ext4's discards for the entire volume (even after you remove the > snapshot). > > But the following patch fixes that: > > Subject: dm snapshot: add discard support to the snapshot-origin target > > Allow the snapshot-origin target to pass discards to the origin device > (after any copyout of the region being discarded; but generally speaking > that copyout won't be needed as the region will have already been copied > to the cow). > > So for example, when you remove a file from an ext4 mounted origin > volume it triggers copyout to the snapshot. When the DISCARD is then > issued during the subsequent journal commit no copyout is needed. It should be noted that discards are treated as writes. This means that a snapshot should not be taken of an origin volume before a filesystem is placed on that origin. Doing so may result in the entire origin volume being discarded (mkfs.ext4 does this) which would cause copyout of the entire origin volume to the snapshot. Mike ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] dm snapshot: add discard support to the snapshot-origin target [was: Re: Testing TRIM with LVM] 2011-04-13 23:48 ` [linux-lvm] " Mike Snitzer @ 2011-04-26 17:32 ` Mike Snitzer 2011-04-28 0:19 ` [linux-lvm] [PATCH] dm snapshot: ignore discards issued to the snapshot-origin target Mike Snitzer 0 siblings, 1 reply; 54+ messages in thread From: Mike Snitzer @ 2011-04-26 17:32 UTC (permalink / raw) To: DarkNovaNick; +Cc: dm-devel, LVM general discussion and development On Wed, Apr 13 2011 at 7:48pm -0400, Mike Snitzer <snitzer@redhat.com> wrote: > On Wed, Apr 13 2011 at 6:40pm -0400, > Mike Snitzer <snitzer@redhat.com> wrote: > > > Unfortunately, the failed discard to the snapshot-origin device disables > > ext4's discards for the entire volume (even after you remove the > > snapshot). > > > > But the following patch fixes that: > > > > Subject: dm snapshot: add discard support to the snapshot-origin target > > > > Allow the snapshot-origin target to pass discards to the origin device > > (after any copyout of the region being discarded; but generally speaking > > that copyout won't be needed as the region will have already been copied > > to the cow). > > > > So for example, when you remove a file from an ext4 mounted origin > > volume it triggers copyout to the snapshot. When the DISCARD is then > > issued during the subsequent journal commit no copyout is needed. > > It should be noted that discards are treated as writes. This means that > a snapshot should not be taken of an origin volume before a filesystem > is placed on that origin. Doing so may result in the entire origin > volume being discarded (mkfs.ext4 does this) which would cause copyout > of the entire origin volume to the snapshot. Just a quick follow-up. We had a discussion on snapshot-origin discard support and arrived at this needing to be done in 2 steps: 1) have snapshot-origin ignore discard but return success 2) process the discard, discard treated as a write to the origin, if user passes an option to enable it The concern is that the patch I proposed can break existing scripts that create filesystems on pristine origin devices (that have a snapshot(s)). Instead of only copying out the various minimalist changes that mkfs does (when discard isn't used) the entire origin device would be copied out (when the entire device is discarded early in mkfs.ext4). I'll circle back to working on this in a few weeks (the annoying bit is the userspace change to make the snapshot-origin's discard behavior configurable via lvcreate -s --enable-origin-discards ...; maybe someone else will get to it before I do :) Mike ^ permalink raw reply [flat|nested] 54+ messages in thread
* [linux-lvm] [PATCH] dm snapshot: ignore discards issued to the snapshot-origin target 2011-04-26 17:32 ` Mike Snitzer @ 2011-04-28 0:19 ` Mike Snitzer 2011-04-28 7:53 ` [linux-lvm] [dm-devel] " Christoph Hellwig 0 siblings, 1 reply; 54+ messages in thread From: Mike Snitzer @ 2011-04-28 0:19 UTC (permalink / raw) To: dm-devel; +Cc: DarkNovaNick, LVM general discussion and development Discards pose a problem for the snapshot-origin target because they are treated as writes. Treating a discard as a write would trigger a copyout to the snapshot. Such copyout can prove too costly in the face of otherwise benign scenarios (e.g. create a snapshot and then mkfs.ext4 the origin -- mkfs.ext4 discards the entire volume by default, which would copyout the entire origin volume to the snapshot). Signed-off-by: Mike Snitzer <snitzer@redhat.com> --- drivers/md/dm-snap.c | 9 ++++++++- 1 files changed, 8 insertions(+), 1 deletions(-) diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c index a2d3309..639af12 100644 --- a/drivers/md/dm-snap.c +++ b/drivers/md/dm-snap.c @@ -2076,6 +2076,7 @@ static int origin_ctr(struct dm_target *ti, unsigned int argc, char **argv) ti->private = dev; ti->num_flush_requests = 1; + ti->num_discard_requests = 1; return 0; } @@ -2095,6 +2096,12 @@ static int origin_map(struct dm_target *ti, struct bio *bio, if (bio->bi_rw & REQ_FLUSH) return DM_MAPIO_REMAPPED; + if (bio->bi_rw & REQ_DISCARD) { + /* ignore discard requests */ + bio_endio(bio, 0); + return DM_MAPIO_SUBMITTED; + } + /* Only tell snapshots if this is a write */ return (bio_rw(bio) == WRITE) ? do_origin(dev, bio) : DM_MAPIO_REMAPPED; } @@ -2153,7 +2160,7 @@ static int origin_iterate_devices(struct dm_target *ti, static struct target_type origin_target = { .name = "snapshot-origin", - .version = {1, 7, 1}, + .version = {1, 8, 0}, .module = THIS_MODULE, .ctr = origin_ctr, .dtr = origin_dtr, ^ permalink raw reply related [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] [PATCH] dm snapshot: ignore discards issued to the snapshot-origin target 2011-04-28 0:19 ` [linux-lvm] [PATCH] dm snapshot: ignore discards issued to the snapshot-origin target Mike Snitzer @ 2011-04-28 7:53 ` Christoph Hellwig 2011-04-28 20:59 ` [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] Mike Snitzer 0 siblings, 1 reply; 54+ messages in thread From: Christoph Hellwig @ 2011-04-28 7:53 UTC (permalink / raw) To: device-mapper development Cc: DarkNovaNick, LVM general discussion and development On Wed, Apr 27, 2011 at 08:19:13PM -0400, Mike Snitzer wrote: > Discards pose a problem for the snapshot-origin target because they are > treated as writes. Treating a discard as a write would trigger a > copyout to the snapshot. Such copyout can prove too costly in the face > of otherwise benign scenarios (e.g. create a snapshot and then mkfs.ext4 > the origin -- mkfs.ext4 discards the entire volume by default, which > would copyout the entire origin volume to the snapshot). You also need to make sure that we don't claim discard_zeroes_data for the origin volume in this case. Especially as ext4 started to rely on this actually working (very bad idea IMHO, but that's another story) ^ permalink raw reply [flat|nested] 54+ messages in thread
* [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-04-28 7:53 ` [linux-lvm] [dm-devel] " Christoph Hellwig @ 2011-04-28 20:59 ` Mike Snitzer 2011-04-28 21:28 ` Eric Sandeen ` (2 more replies) 0 siblings, 3 replies; 54+ messages in thread From: Mike Snitzer @ 2011-04-28 20:59 UTC (permalink / raw) To: Christoph Hellwig; +Cc: sandeen, dm-devel, linux-ext4, DarkNovaNick, linux-lvm [cc'ing linux-ext4] On Thu, Apr 28 2011 at 3:53am -0400, Christoph Hellwig <hch@infradead.org> wrote: > On Wed, Apr 27, 2011 at 08:19:13PM -0400, Mike Snitzer wrote: > > Discards pose a problem for the snapshot-origin target because they are > > treated as writes. Treating a discard as a write would trigger a > > copyout to the snapshot. Such copyout can prove too costly in the face > > of otherwise benign scenarios (e.g. create a snapshot and then mkfs.ext4 > > the origin -- mkfs.ext4 discards the entire volume by default, which > > would copyout the entire origin volume to the snapshot). > > You also need to make sure that we don't claim discard_zeroes_data for > the origin volume in this case. Especially as ext4 started to rely > on this actually working (very bad idea IMHO, but that's another story) Eric Sandeen helped me see that having the DM snapshot-origin target return success but actually ignore discards is just bad form. Especially when you consider that this exercise was motivated by the fact that ext4 will disable discards on the first discard failure, see: http://www.redhat.com/archives/dm-devel/2011-April/msg00070.html Eric and I think it is best to revert this commit: a30eec2 ext4: stop issuing discards if not supported by device (though ideally ext4 would still WARN_ONCE per superblock with something like: "discard failed, please consider disabling discard support") 1) The user asked for discards (with '-o discard' mount option) - what is the real harm in coninuing to issue them even if it _seems_ they aren't supported? 2) assuming the entire block device uniformly supports discards can be flawed (a DM device's discard support can vary based on logical offset). Thoughts? ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-04-28 20:59 ` [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] Mike Snitzer @ 2011-04-28 21:28 ` Eric Sandeen 2011-04-28 22:59 ` Alasdair G Kergon 2011-04-29 1:12 ` Andreas Dilger 2011-04-29 9:30 ` Lukas Czerner 2 siblings, 1 reply; 54+ messages in thread From: Eric Sandeen @ 2011-04-28 21:28 UTC (permalink / raw) To: Mike Snitzer Cc: Christoph Hellwig, dm-devel, linux-ext4, DarkNovaNick, linux-lvm On 4/28/11 3:59 PM, Mike Snitzer wrote: > [cc'ing linux-ext4] > > On Thu, Apr 28 2011 at 3:53am -0400, > Christoph Hellwig <hch@infradead.org> wrote: > >> On Wed, Apr 27, 2011 at 08:19:13PM -0400, Mike Snitzer wrote: >>> Discards pose a problem for the snapshot-origin target because they are >>> treated as writes. Treating a discard as a write would trigger a >>> copyout to the snapshot. Such copyout can prove too costly in the face >>> of otherwise benign scenarios (e.g. create a snapshot and then mkfs.ext4 >>> the origin -- mkfs.ext4 discards the entire volume by default, which >>> would copyout the entire origin volume to the snapshot). >> >> You also need to make sure that we don't claim discard_zeroes_data for >> the origin volume in this case. Especially as ext4 started to rely >> on this actually working (very bad idea IMHO, but that's another story) > > Eric Sandeen helped me see that having the DM snapshot-origin target > return success but actually ignore discards is just bad form. > > Especially when you consider that this exercise was motivated by the > fact that ext4 will disable discards on the first discard failure, see: > http://www.redhat.com/archives/dm-devel/2011-April/msg00070.html > > Eric and I think it is best to revert this commit: > a30eec2 ext4: stop issuing discards if not supported by device > > (though ideally ext4 would still WARN_ONCE per superblock with something > like: "discard failed, please consider disabling discard support") > > 1) The user asked for discards (with '-o discard' mount option) > - what is the real harm in coninuing to issue them even if it _seems_ > they aren't supported? TBH I sent a30eec2 on a whim. Seemed reasonable at the time, but if discard-ability changes over time, it may not be the best plan. > 2) assuming the entire block device uniformly supports discards can > be flawed (a DM device's discard support can vary based on logical > offset). I still think that concats of floppies, usb disks, and ssds should be rare, so I'm less concerned about that ;) I think Mike is right though, that if you do not do anything with a discard, you should return -EOPNOTSUPP, and not pretend that you honored it. We should, IMHO, deal with the truth of the matter at the filesystem caller. -Eric > Thoughts? ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-04-28 21:28 ` Eric Sandeen @ 2011-04-28 22:59 ` Alasdair G Kergon 2011-04-28 23:01 ` Eric Sandeen 0 siblings, 1 reply; 54+ messages in thread From: Alasdair G Kergon @ 2011-04-28 22:59 UTC (permalink / raw) To: LVM general discussion and development, Mike Snitzer, Christoph Hellwig, dm-devel, linux-ext4, DarkNovaNick On Thu, Apr 28, 2011 at 04:28:17PM -0500, Eric Sandeen wrote: > I still think that concats of floppies, usb disks, and ssds should be rare, so I'm less concerned about that ;) It's the 'undefined' cases that cause us the trouble though: we do have to return something and I prefer to work with defined and documented behaviour, rather than pretending something is 'undefined' or 'unimportant' and later finding people relied on its actual behaviour. Alasdair ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-04-28 22:59 ` Alasdair G Kergon @ 2011-04-28 23:01 ` Eric Sandeen 2011-04-28 23:11 ` Alasdair G Kergon 0 siblings, 1 reply; 54+ messages in thread From: Eric Sandeen @ 2011-04-28 23:01 UTC (permalink / raw) To: LVM general discussion and development, Mike Snitzer, Christoph Hellwig, dm-devel, linux-ext4, DarkNovaNick On 4/28/11 5:59 PM, Alasdair G Kergon wrote: > On Thu, Apr 28, 2011 at 04:28:17PM -0500, Eric Sandeen wrote: >> I still think that concats of floppies, usb disks, and ssds should be rare, so I'm less concerned about that ;) > > It's the 'undefined' cases that cause us the trouble though: we do have > to return something and I prefer to work with defined and documented > behaviour, rather than pretending something is 'undefined' or > 'unimportant' and later finding people relied on its actual behaviour. If you are saying that a target should return EOPNOTSUPP when it does not support the operation, then we are in violent agreement. :) -Eric > Alasdair ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-04-28 23:01 ` Eric Sandeen @ 2011-04-28 23:11 ` Alasdair G Kergon 0 siblings, 0 replies; 54+ messages in thread From: Alasdair G Kergon @ 2011-04-28 23:11 UTC (permalink / raw) To: Eric Sandeen Cc: Mike Snitzer, Christoph Hellwig, dm-devel, DarkNovaNick, LVM general discussion and development, linux-ext4 On Thu, Apr 28, 2011 at 06:01:18PM -0500, Eric Sandeen wrote: > If you are saying that a target should return EOPNOTSUPP when it does not support the operation, then we are in violent agreement. :) Naturally, with the definition you used in the first post. (I.e. if the target does *anything* with it, you do not get that error, even if it did not process it completely.) Alasdair ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-04-28 20:59 ` [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] Mike Snitzer 2011-04-28 21:28 ` Eric Sandeen @ 2011-04-29 1:12 ` Andreas Dilger 2011-04-29 13:55 ` Mike Snitzer 2011-04-29 9:30 ` Lukas Czerner 2 siblings, 1 reply; 54+ messages in thread From: Andreas Dilger @ 2011-04-29 1:12 UTC (permalink / raw) To: Mike Snitzer Cc: sandeen, Christoph Hellwig, dm-devel, DarkNovaNick, linux-lvm, linux-ext4 On Apr 28, 2011, at 14:59, Mike Snitzer wrote: > On Thu, Apr 28 2011 at 3:53am -0400, > Christoph Hellwig <hch@infradead.org> wrote: >> On Wed, Apr 27, 2011 at 08:19:13PM -0400, Mike Snitzer wrote: >>> Discards pose a problem for the snapshot-origin target because they are >>> treated as writes. Treating a discard as a write would trigger a >>> copyout to the snapshot. Such copyout can prove too costly in the face >>> of otherwise benign scenarios (e.g. create a snapshot and then mkfs.ext4 >>> the origin -- mkfs.ext4 discards the entire volume by default, which >>> would copyout the entire origin volume to the snapshot). >> >> You also need to make sure that we don't claim discard_zeroes_data for >> the origin volume in this case. Especially as ext4 started to rely >> on this actually working (very bad idea IMHO, but that's another story) > > Eric Sandeen helped me see that having the DM snapshot-origin target > return success but actually ignore discards is just bad form. > > Especially when you consider that this exercise was motivated by the > fact that ext4 will disable discards on the first discard failure, see: > http://www.redhat.com/archives/dm-devel/2011-April/msg00070.html > > Eric and I think it is best to revert this commit: > a30eec2 ext4: stop issuing discards if not supported by device > > (though ideally ext4 would still WARN_ONCE per superblock with something > like: "discard failed, please consider disabling discard support") > > 1) The user asked for discards (with '-o discard' mount option) > - what is the real harm in coninuing to issue them even if it _seems_ > they aren't supported? Aren't discard operations considered barriers? If there are needless discard operations being sent to the disk then it seems like this would have a non-zero performance penalty, just by virtue of blocking IO submissions at the block device layer. If my understanding of the mechanics is flawed, then please ignore. > 2) assuming the entire block device uniformly supports discards can > be flawed (a DM device's discard support can vary based on logical > offset). Hybrid storage is potentially an increasingly popular configuration. We've been looking at DM-mapped SSD RAID-1 + HDD RAID-6 arrays for ext4 to allow storing only the metadata on SSD. That said, it may make sense to have the DM translation layer check the component devices of an LV to see if discard is supported on any of them, and only return -EOPNOTSUPP if none of them do. While not a perfect solution for the case where an LV is migrated onto a device that does support discard, I don't think we have to handle every corner case perfectly to still handle reasonable use cases in a sensible manner. Cheers, Andreas ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-04-29 1:12 ` Andreas Dilger @ 2011-04-29 13:55 ` Mike Snitzer 0 siblings, 0 replies; 54+ messages in thread From: Mike Snitzer @ 2011-04-29 13:55 UTC (permalink / raw) To: Andreas Dilger Cc: Christoph Hellwig, sandeen, linux-ext4@vger.kernel.org device-mapper development, DarkNovaNick, linux-lvm On Thu, Apr 28 2011 at 9:12pm -0400, Andreas Dilger <adilger.kernel@dilger.ca> wrote: > On Apr 28, 2011, at 14:59, Mike Snitzer wrote: > > Eric and I think it is best to revert this commit: > > a30eec2 ext4: stop issuing discards if not supported by device > > > > (though ideally ext4 would still WARN_ONCE per superblock with something > > like: "discard failed, please consider disabling discard support") > > > > 1) The user asked for discards (with '-o discard' mount option) > > - what is the real harm in coninuing to issue them even if it _seems_ > > they aren't supported? > > Aren't discard operations considered barriers? If there are needless > discard operations being sent to the disk then it seems like this > would have a non-zero performance penalty, just by virtue of blocking > IO submissions at the block device layer. The blkdev_issue_discard() interface is synchronous. Given that, yes it will cause jbd2 to block waiting for it to complete. For DM, the discard would not actually be issued to the physical storage there is still work that would be done to submit the bio only to have DM return -EOPNOTSUPP. But ext4 doesn't care if it is on DM or not; so I agree that cost of continuing to issue discards would be higher on non-DM storage. That said, the user asked for discards via -o discard. Ext4 is taking that to mean "no the user only _really_ asked for discards if they will always work". If ext4 is committed to that disable-on-first-failure hueristic then Documentation/filesystems/ext4.txt should probably speak to it. > > 2) assuming the entire block device uniformly supports discards can > > be flawed (a DM device's discard support can vary based on logical > > offset). > > Hybrid storage is potentially an increasingly popular configuration. > We've been looking at DM-mapped SSD RAID-1 + HDD RAID-6 arrays for > ext4 to allow storing only the metadata on SSD. Meaning you assemble a tailormade DM linear (concat) device that mixes the two? What are you using to assemble that device -- presummably some code that is aware of ext4's metadata and data layout? Code hooked off of mkfs? > That said, it may > make sense to have the DM translation layer check the component > devices of an LV to see if discard is supported on any of them, and > only return -EOPNOTSUPP if none of them do. That is what DM does. DM will only return -EOPNOTSUPP if a particular target doesn't support discards. If only one device supports discards then the entire logical DM device advertises discard support. Mike ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-04-28 20:59 ` [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] Mike Snitzer 2011-04-28 21:28 ` Eric Sandeen 2011-04-29 1:12 ` Andreas Dilger @ 2011-04-29 9:30 ` Lukas Czerner 2011-04-29 12:24 ` [linux-lvm] [dm-devel] " Alasdair G Kergon 2 siblings, 1 reply; 54+ messages in thread From: Lukas Czerner @ 2011-04-29 9:30 UTC (permalink / raw) To: Mike Snitzer Cc: sandeen, Christoph Hellwig, dm-devel, DarkNovaNick, linux-lvm, linux-ext4 On Thu, 28 Apr 2011, Mike Snitzer wrote: > [cc'ing linux-ext4] > > On Thu, Apr 28 2011 at 3:53am -0400, > Christoph Hellwig <hch@infradead.org> wrote: > > > On Wed, Apr 27, 2011 at 08:19:13PM -0400, Mike Snitzer wrote: > > > Discards pose a problem for the snapshot-origin target because they are > > > treated as writes. Treating a discard as a write would trigger a > > > copyout to the snapshot. Such copyout can prove too costly in the face > > > of otherwise benign scenarios (e.g. create a snapshot and then mkfs.ext4 > > > the origin -- mkfs.ext4 discards the entire volume by default, which > > > would copyout the entire origin volume to the snapshot). > > > > You also need to make sure that we don't claim discard_zeroes_data for > > the origin volume in this case. Especially as ext4 started to rely > > on this actually working (very bad idea IMHO, but that's another story) I do not think that it is bad idea. It is supposed to work and we do not want to "optimize" for broken devices (or broken cheap crap, as someone concisely described before). > > Eric Sandeen helped me see that having the DM snapshot-origin target > return success but actually ignore discards is just bad form. > > Especially when you consider that this exercise was motivated by the > fact that ext4 will disable discards on the first discard failure, see: > http://www.redhat.com/archives/dm-devel/2011-April/msg00070.html > > Eric and I think it is best to revert this commit: > a30eec2 ext4: stop issuing discards if not supported by device > > (though ideally ext4 would still WARN_ONCE per superblock with something > like: "discard failed, please consider disabling discard support") I think that we do not need to revert it, we just need to do the "right thing" in the underlying layers. That said: 1. We need to honor all the "discard limits" so the discard bios does not actually fail. 2. If the device is composed of various other devices, we should return -EOPNOTSUPP is none of the devices support discard. 3. We should succeed, if at least one of the devices support discard and it does not fail for any reason. 4. We should not advertise discard_zeroes_data if any of the devices does not zero data or does not support discard. I am not sure how "hard" is to assure those conditions in DM. If those conditions are met, we can rely on consistent information in the layers above. Thanks! -Lukas > > 1) The user asked for discards (with '-o discard' mount option) > - what is the real harm in coninuing to issue them even if it _seems_ > they aren't supported? > 2) assuming the entire block device uniformly supports discards can > be flawed (a DM device's discard support can vary based on logical > offset). > > Thoughts? > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-04-29 9:30 ` Lukas Czerner @ 2011-04-29 12:24 ` Alasdair G Kergon 2011-04-29 12:29 ` Christoph Hellwig 2011-05-02 7:16 ` Lukas Czerner 0 siblings, 2 replies; 54+ messages in thread From: Alasdair G Kergon @ 2011-04-29 12:24 UTC (permalink / raw) To: Lukas Czerner Cc: sandeen, Mike Snitzer, Christoph Hellwig, dm-devel, DarkNovaNick, linux-lvm, linux-ext4 On Fri, Apr 29, 2011 at 11:30:47AM +0200, Lukas Czerner wrote: > 1. We need to honor all the "discard limits" so the discard bios does > not actually fail. > 2. If the device is composed of various other devices, we should return > -EOPNOTSUPP is none of the devices support discard. > 3. We should succeed, if at least one of the devices support discard and > it does not fail for any reason. > 4. We should not advertise discard_zeroes_data if any of the devices > does not zero data or does not support discard. > I am not sure how "hard" is to assure those conditions in DM. If those > conditions are met, we can rely on consistent information in the layers > above. Remember that -EOPNOTSUPP return applies only to that one *specific* discard. It does not tell you for sure whether or not another future discard to the device will succeed. (It's a property of offset - if there are several devices underneath - and of time - if the device or one below it gets reconfigured.) The core issue here is whether a filesytem should decide that the receipt of a single -EOPNOTSUPP is a reason never to send any more, whether a more sophisticated algorithm should be used (considering the proportion/offsets of them over given periods of time and retrying later), or whether more comprehensive information about the discard capabilities of the device should be presented - and whether this should be handled automatically or whether it should be under userspace control (i.e. the sysadmin can instruct the filesystem what to do). dm's role is simply to handle a discard if it can, and report back if it couldn't. Additionally it could provide more comprehensive information about the discard capabilities of the device, but my sense is that most consider this unnecessary as normally dm devices will have a coherent behaviour throughout. Alasdair ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-04-29 12:24 ` [linux-lvm] [dm-devel] " Alasdair G Kergon @ 2011-04-29 12:29 ` Christoph Hellwig 2011-04-29 14:28 ` Eric Sandeen 2011-05-02 7:16 ` Lukas Czerner 1 sibling, 1 reply; 54+ messages in thread From: Christoph Hellwig @ 2011-04-29 12:29 UTC (permalink / raw) To: Lukas Czerner, Mike Snitzer, sandeen, Christoph Hellwig, dm-devel, DarkNovaNick, linux-lvm, linux-ext4 FYI, that disable discard on failure seems to be an ext4 special and no one else picked up that stupid idea. If ext4 wants to misbehave for that just let them.. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-04-29 12:29 ` Christoph Hellwig @ 2011-04-29 14:28 ` Eric Sandeen 2011-04-29 15:13 ` Ray Morris 2011-05-04 16:33 ` Ted Ts'o 0 siblings, 2 replies; 54+ messages in thread From: Eric Sandeen @ 2011-04-29 14:28 UTC (permalink / raw) To: Christoph Hellwig Cc: Mike Snitzer, dm-devel, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4 On 4/29/11 7:29 AM, Christoph Hellwig wrote: > FYI, that disable discard on failure seems to be an ext4 special and no > one else picked up that stupid idea. If ext4 wants to misbehave for > that just let them.. It was my "stupid idea," and I'm ok with reverting it ;) -Eric ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-04-29 14:28 ` Eric Sandeen @ 2011-04-29 15:13 ` Ray Morris 2011-05-04 16:33 ` Ted Ts'o 1 sibling, 0 replies; 54+ messages in thread From: Ray Morris @ 2011-04-29 15:13 UTC (permalink / raw) To: LVM general discussion and development Cc: sandeen, Mike Snitzer, Lukas, Christoph Hellwig, dm-devel, DarkNovaNick, Czerner, linux-ext4 > On 4/29/11 7:29 AM, Christoph Hellwig wrote: > > no one else picked up that stupid idea. If ext4 wants to > > misbehave for that just let them.. > > It was my "stupid idea," and I'm ok with reverting it ;) > > -Eric http://www.google.com/search?q=how+to+remove+foot+from+mouth -- Ray Morris support@bettercgi.com Strongbox - The next generation in site security: http://www.bettercgi.com/strongbox/ Throttlebox - Intelligent Bandwidth Control http://www.bettercgi.com/throttlebox/ Strongbox / Throttlebox affiliate program: http://www.bettercgi.com/affiliates/user/register.php On Fri, 29 Apr 2011 09:28:51 -0500 Eric Sandeen <sandeen@redhat.com> wrote: > On 4/29/11 7:29 AM, Christoph Hellwig wrote: > > FYI, that disable discard on failure seems to be an ext4 special > > and no one else picked up that stupid idea. If ext4 wants to > > misbehave for that just let them.. > > It was my "stupid idea," and I'm ok with reverting it ;) > > -Eric > _______________________________________________ > linux-lvm mailing list > linux-lvm@redhat.com > https://www.redhat.com/mailman/listinfo/linux-lvm > read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/ > ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-04-29 14:28 ` Eric Sandeen 2011-04-29 15:13 ` Ray Morris @ 2011-05-04 16:33 ` Ted Ts'o 2011-05-04 17:02 ` Lukas Czerner 1 sibling, 1 reply; 54+ messages in thread From: Ted Ts'o @ 2011-05-04 16:33 UTC (permalink / raw) To: Eric Sandeen Cc: Mike Snitzer, Christoph Hellwig, dm-devel, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4 On Fri, Apr 29, 2011 at 09:28:51AM -0500, Eric Sandeen wrote: > On 4/29/11 7:29 AM, Christoph Hellwig wrote: > > FYI, that disable discard on failure seems to be an ext4 special and no > > one else picked up that stupid idea. If ext4 wants to misbehave for > > that just let them.. > > It was my "stupid idea," and I'm ok with reverting it ;) I think I forgot to send the patch out, but it's been reverted in the ext4 master branch, commit id: d9f34504e695. It's more than a revert, actually, since I also dropped error checking for FITRIM. Otherwise an attempt to use FITRIM would stop after hitting the first dm region that didn't support discards. - Ted ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-04 16:33 ` Ted Ts'o @ 2011-05-04 17:02 ` Lukas Czerner 0 siblings, 0 replies; 54+ messages in thread From: Lukas Czerner @ 2011-05-04 17:02 UTC (permalink / raw) To: Ted Ts'o Cc: Eric Sandeen, Mike Snitzer, Christoph Hellwig, dm-devel, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4 On Wed, 4 May 2011, Ted Ts'o wrote: > On Fri, Apr 29, 2011 at 09:28:51AM -0500, Eric Sandeen wrote: > > On 4/29/11 7:29 AM, Christoph Hellwig wrote: > > > FYI, that disable discard on failure seems to be an ext4 special and no > > > one else picked up that stupid idea. If ext4 wants to misbehave for > > > that just let them.. > > > > It was my "stupid idea," and I'm ok with reverting it ;) > > I think I forgot to send the patch out, but it's been reverted in the > ext4 master branch, commit id: d9f34504e695. It's more than a revert, > actually, since I also dropped error checking for FITRIM. Otherwise > an attempt to use FITRIM would stop after hitting the first dm region > that didn't support discards. > > - Ted > However I would very much like to change blkdev_issue_discard() not NOT return EOPNOTSUPP if QUEUE_FLAG_DISCARD is set, this way we can rely on EOPNOTSUPP saying really that it is NOT supported. Regardless on ext4 change. I will send a patch for it, if you guy agree. Also consider BLKDISCARD which is now broken and this change will fix that! Thanks! -Lukas ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-04-29 12:24 ` [linux-lvm] [dm-devel] " Alasdair G Kergon 2011-04-29 12:29 ` Christoph Hellwig @ 2011-05-02 7:16 ` Lukas Czerner 2011-05-02 8:13 ` Alasdair G Kergon 1 sibling, 1 reply; 54+ messages in thread From: Lukas Czerner @ 2011-05-02 7:16 UTC (permalink / raw) To: Alasdair G Kergon Cc: sandeen, Mike Snitzer, Christoph Hellwig, dm-devel, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4 On Fri, 29 Apr 2011, Alasdair G Kergon wrote: > On Fri, Apr 29, 2011 at 11:30:47AM +0200, Lukas Czerner wrote: > > 1. We need to honor all the "discard limits" so the discard bios does > > not actually fail. > > 2. If the device is composed of various other devices, we should return > > -EOPNOTSUPP is none of the devices support discard. > > 3. We should succeed, if at least one of the devices support discard and > > it does not fail for any reason. > > 4. We should not advertise discard_zeroes_data if any of the devices > > does not zero data or does not support discard. > > I am not sure how "hard" is to assure those conditions in DM. If those > > conditions are met, we can rely on consistent information in the layers > > above. > > Remember that -EOPNOTSUPP return applies only to that one *specific* > discard. It does not tell you for sure whether or not another future > discard to the device will succeed. (It's a property of offset - if > there are several devices underneath - and of time - if the device or > one below it gets reconfigured.) > > The core issue here is whether a filesytem should decide that the > receipt of a single -EOPNOTSUPP is a reason never to send any more, > whether a more sophisticated algorithm should be used (considering > the proportion/offsets of them over given periods of time and retrying > later), or whether more comprehensive information about the discard > capabilities of the device should be presented - and whether this should > be handled automatically or whether it should be under userspace control > (i.e. the sysadmin can instruct the filesystem what to do). > > dm's role is simply to handle a discard if it can, and report back > if it couldn't. Additionally it could provide more comprehensive > information about the discard capabilities of the device, but my sense > is that most consider this unnecessary as normally dm devices will have > a coherent behaviour throughout. > > Alasdair > > Hi Alasdair, I like the last idea the best. Dm is the only one who has the overall information about the storage, hence it should provide some of that information for others to do the right thing. Christoph mentioned that ext4 is the only one actually disabling discard on -EOPNOTSUPP, but firstly that's not true (gfs2 and nilfs does that as well) and secondly it is the right thing to do! What else should we be doing when it tells us that "discard is not supported"? continue trying ? that does not make sense at all! However when we have dm device where part of the device supports discard due to underlying hardware capability we just can not return EOPNOTSUPP from blkdev_issue_discard, because it is just not true! It does support discard, but it is not consistent through the whole device and at some layer we should notice that and not let it through. How to do this is just implementation detail. We can either do it at dm level, while handling bios, or we can export "do not return eopnotsupp" information and blkdev_issue_discard() should use that information to do the right thing. Thanks! -Lukas ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 7:16 ` Lukas Czerner @ 2011-05-02 8:13 ` Alasdair G Kergon 2011-05-02 8:19 ` Christoph Hellwig 0 siblings, 1 reply; 54+ messages in thread From: Alasdair G Kergon @ 2011-05-02 8:13 UTC (permalink / raw) To: device-mapper development Cc: sandeen, Mike Snitzer, Christoph Hellwig, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon On Mon, May 02, 2011 at 09:16:21AM +0200, Lukas Czerner wrote: > However when we have dm device where part of the device supports > discard due to underlying hardware capability we just can not return > EOPNOTSUPP from blkdev_issue_discard, because it is just not true! EOPNOTSUPP from dm means the operation was not supported on that *one* bio. It does *not* tell you anything in general about the device, or whether you'd get the same error from different bios in future. Alasdair ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 8:13 ` Alasdair G Kergon @ 2011-05-02 8:19 ` Christoph Hellwig 2011-05-02 10:24 ` Lukas Czerner 0 siblings, 1 reply; 54+ messages in thread From: Christoph Hellwig @ 2011-05-02 8:19 UTC (permalink / raw) To: device-mapper development, Alasdair G Kergon, sandeen, Mike Snitzer, Christoph Hellwig, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4 On Mon, May 02, 2011 at 09:13:08AM +0100, Alasdair G Kergon wrote: > On Mon, May 02, 2011 at 09:16:21AM +0200, Lukas Czerner wrote: > > However when we have dm device where part of the device supports > > discard due to underlying hardware capability we just can not return > > EOPNOTSUPP from blkdev_issue_discard, because it is just not true! > > EOPNOTSUPP from dm means the operation was not supported on that *one* bio. > It does *not* tell you anything in general about the device, or whether > you'd get the same error from different bios in future. Exactly. We already have the information in the queue limits to tell the filesystem if discard is supported at all or not. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 8:19 ` Christoph Hellwig @ 2011-05-02 10:24 ` Lukas Czerner 2011-05-02 12:48 ` [linux-lvm] " Mike Snitzer ` (2 more replies) 0 siblings, 3 replies; 54+ messages in thread From: Lukas Czerner @ 2011-05-02 10:24 UTC (permalink / raw) To: Christoph Hellwig Cc: sandeen, Mike Snitzer, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon On Mon, 2 May 2011, Christoph Hellwig wrote: > On Mon, May 02, 2011 at 09:13:08AM +0100, Alasdair G Kergon wrote: > > On Mon, May 02, 2011 at 09:16:21AM +0200, Lukas Czerner wrote: > > > However when we have dm device where part of the device supports > > > discard due to underlying hardware capability we just can not return > > > EOPNOTSUPP from blkdev_issue_discard, because it is just not true! > > > > EOPNOTSUPP from dm means the operation was not supported on that *one* bio. > > It does *not* tell you anything in general about the device, or whether > > you'd get the same error from different bios in future. > > Exactly. We already have the information in the queue limits to tell > the filesystem if discard is supported at all or not. > So I gave it a try. First of all the device composed of SSD and spinning disk does export discard_support information properly, however it also advertise discard_zeroes_data which is wrong and possibly dangerous and should be fixed! And second of all, strictly speaking if EOPNOTSUPP from dm means that operation was not supported on that *one* bio, blkdev_issue_discard() should handle that and do not return EOPNOTSUPP further if queue limits tells that device has discard support. Is this acceptable solution for you guys ? I can make that change since I am changing blkdev_issue_discard() anyway. Or, we can make that change in filesystem where we disable discard on mount time, when we notice that discard mount option was specified, but the device does not support it (we should probably do this regardless on blkdev_issue_discard() change). Thanks! -Lukas ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 10:24 ` Lukas Czerner @ 2011-05-02 12:48 ` Mike Snitzer 2011-05-02 13:05 ` Lukas Czerner 2011-05-02 13:48 ` [linux-lvm] [dm-devel] " Martin K. Petersen 2011-05-02 14:20 ` Martin K. Petersen 2 siblings, 1 reply; 54+ messages in thread From: Mike Snitzer @ 2011-05-02 12:48 UTC (permalink / raw) To: Lukas Czerner Cc: sandeen, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, linux-ext4, Alasdair G Kergon On Mon, May 02 2011 at 6:24am -0400, Lukas Czerner <lczerner@redhat.com> wrote: > On Mon, 2 May 2011, Christoph Hellwig wrote: > > > On Mon, May 02, 2011 at 09:13:08AM +0100, Alasdair G Kergon wrote: > > > On Mon, May 02, 2011 at 09:16:21AM +0200, Lukas Czerner wrote: > > > > However when we have dm device where part of the device supports > > > > discard due to underlying hardware capability we just can not return > > > > EOPNOTSUPP from blkdev_issue_discard, because it is just not true! > > > > > > EOPNOTSUPP from dm means the operation was not supported on that *one* bio. > > > It does *not* tell you anything in general about the device, or whether > > > you'd get the same error from different bios in future. > > > > Exactly. We already have the information in the queue limits to tell > > the filesystem if discard is supported at all or not. > > > > So I gave it a try. First of all the device composed of SSD and spinning > disk does export discard_support information properly, however it also > advertise discard_zeroes_data which is wrong and possibly dangerous and > should be fixed! You're effectively advocating that blk_stack_limits() needs to clear the topmost device's discard_zeroes_data flag if any one bottom device does not have discard_zeroes_data. > And second of all, strictly speaking if EOPNOTSUPP from dm means that > operation was not supported on that *one* bio, blkdev_issue_discard() > should handle that and do not return EOPNOTSUPP further if queue limits > tells that device has discard support. Is this acceptable solution for > you guys ? I can make that change since I am changing blkdev_issue_discard() > anyway. Or, we can make that change in filesystem where we disable > discard on mount time, when we notice that discard mount option was > specified, but the device does not support it (we should probably do > this regardless on blkdev_issue_discard() change). The blkdev_issue_discard() change you propose could be fine (mask EOPNOTSUPP return if device advertises support for discards) -- though Eric said we shouldn't ever say we did something when we didn't. But that blkdev_issue_discard() change is really only safe if the discard_zeroes_data flag is cleared by blk_stack_limits() if finds inconsistent discard_zeroes_data support. Mike ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 12:48 ` [linux-lvm] " Mike Snitzer @ 2011-05-02 13:05 ` Lukas Czerner 2011-05-02 14:47 ` Eric Sandeen 0 siblings, 1 reply; 54+ messages in thread From: Lukas Czerner @ 2011-05-02 13:05 UTC (permalink / raw) To: Mike Snitzer Cc: sandeen, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon On Mon, 2 May 2011, Mike Snitzer wrote: > On Mon, May 02 2011 at 6:24am -0400, > Lukas Czerner <lczerner@redhat.com> wrote: > > > On Mon, 2 May 2011, Christoph Hellwig wrote: > > > > > On Mon, May 02, 2011 at 09:13:08AM +0100, Alasdair G Kergon wrote: > > > > On Mon, May 02, 2011 at 09:16:21AM +0200, Lukas Czerner wrote: > > > > > However when we have dm device where part of the device supports > > > > > discard due to underlying hardware capability we just can not return > > > > > EOPNOTSUPP from blkdev_issue_discard, because it is just not true! > > > > > > > > EOPNOTSUPP from dm means the operation was not supported on that *one* bio. > > > > It does *not* tell you anything in general about the device, or whether > > > > you'd get the same error from different bios in future. > > > > > > Exactly. We already have the information in the queue limits to tell > > > the filesystem if discard is supported at all or not. > > > > > > > So I gave it a try. First of all the device composed of SSD and spinning > > disk does export discard_support information properly, however it also > > advertise discard_zeroes_data which is wrong and possibly dangerous and > > should be fixed! > > You're effectively advocating that blk_stack_limits() needs to clear the > topmost device's discard_zeroes_data flag if any one bottom device does > not have discard_zeroes_data. Exactly. > > > And second of all, strictly speaking if EOPNOTSUPP from dm means that > > operation was not supported on that *one* bio, blkdev_issue_discard() > > should handle that and do not return EOPNOTSUPP further if queue limits > > tells that device has discard support. Is this acceptable solution for > > you guys ? I can make that change since I am changing blkdev_issue_discard() > > anyway. Or, we can make that change in filesystem where we disable > > discard on mount time, when we notice that discard mount option was > > specified, but the device does not support it (we should probably do > > this regardless on blkdev_issue_discard() change). > > The blkdev_issue_discard() change you propose could be fine (mask > EOPNOTSUPP return if device advertises support for discards) -- though > Eric said we shouldn't ever say we did something when we didn't. Exactly, so we should not say that it is not supported when it is, but we just hit the "wrong" part of the device:) I would just very much like to keep the abstraction of having one consistent device underneath the file system and not deal with several devices, or regions with different behaviour in the file system itself (let the pixies underneath deal with that, after all not all of us are btrfs:)) > > But that blkdev_issue_discard() change is really only safe if the > discard_zeroes_data flag is cleared by blk_stack_limits() if finds > inconsistent discard_zeroes_data support. Agreed > > Mike > Thanks! -Lukas ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 13:05 ` Lukas Czerner @ 2011-05-02 14:47 ` Eric Sandeen 2011-05-02 14:48 ` Christoph Hellwig 2011-05-02 14:58 ` Lukas Czerner 0 siblings, 2 replies; 54+ messages in thread From: Eric Sandeen @ 2011-05-02 14:47 UTC (permalink / raw) To: Lukas Czerner Cc: Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, linux-ext4, Alasdair G Kergon On 5/2/11 8:05 AM, Lukas Czerner wrote: > On Mon, 2 May 2011, Mike Snitzer wrote: ... >> The blkdev_issue_discard() change you propose could be fine (mask >> EOPNOTSUPP return if device advertises support for discards) -- though >> Eric said we shouldn't ever say we did something when we didn't. > > Exactly, so we should not say that it is not supported when it is, but > we just hit the "wrong" part of the device:) I would just very much like > to keep the abstraction of having one consistent device underneath the > file system and not deal with several devices, or regions with different > behaviour in the file system itself (let the pixies underneath deal with > that, after all not all of us are btrfs:)) I still think we need to stick with the simple rule: "EOPNOTSUPP returned for a particular bio means that it is not supported for that particular bio" - I don't know what else we can do, without creating an ambiguity... This does, however, suck for the layer calling in to a complex device. What is the overhead for sending discard bios down to a device that does not support it? -Eric ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 14:47 ` Eric Sandeen @ 2011-05-02 14:48 ` Christoph Hellwig 2011-05-02 14:58 ` Lukas Czerner 1 sibling, 0 replies; 54+ messages in thread From: Christoph Hellwig @ 2011-05-02 14:48 UTC (permalink / raw) To: Eric Sandeen Cc: Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon On Mon, May 02, 2011 at 09:47:16AM -0500, Eric Sandeen wrote: > I still think we need to stick with the simple rule: "EOPNOTSUPP returned for a particular bio means that it is not supported for that particular bio" - I don't know what else we can do, without creating an ambiguity... > > This does, however, suck for the layer calling in to a complex device. > > What is the overhead for sending discard bios down to a device that does not support it? For a DM-like device there is very little overhead. The bio gets dispatched to make_request and failed in there almost immediately. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 14:47 ` Eric Sandeen 2011-05-02 14:48 ` Christoph Hellwig @ 2011-05-02 14:58 ` Lukas Czerner 1 sibling, 0 replies; 54+ messages in thread From: Lukas Czerner @ 2011-05-02 14:58 UTC (permalink / raw) To: Eric Sandeen Cc: Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon On Mon, 2 May 2011, Eric Sandeen wrote: > On 5/2/11 8:05 AM, Lukas Czerner wrote: > > On Mon, 2 May 2011, Mike Snitzer wrote: > > ... > > >> The blkdev_issue_discard() change you propose could be fine (mask > >> EOPNOTSUPP return if device advertises support for discards) -- though > >> Eric said we shouldn't ever say we did something when we didn't. > > > > Exactly, so we should not say that it is not supported when it is, but > > we just hit the "wrong" part of the device:) I would just very much like > > to keep the abstraction of having one consistent device underneath the > > file system and not deal with several devices, or regions with different > > behaviour in the file system itself (let the pixies underneath deal with > > that, after all not all of us are btrfs:)) > > I still think we need to stick with the simple rule: "EOPNOTSUPP returned for a particular bio means that it is not supported for that particular bio" - I don't know what else we can do, without creating an ambiguity... I agree, but we do not care about bios in file system. We need to know whether the device itself supports it or not. Also consider BLKDISCARD ioctl(), now as it is it's completely broken on such dm devices, because you are not able to use it due to -EOPNOTSUPP first time you hit the device which does not support it, but it can very well be just a small part of dm device. Again, it does not make sense to return -EOPNOTSUPP from blkdev_issue_discard() when the device actually supports it. -Lukas > > This does, however, suck for the layer calling in to a complex device. > > What is the overhead for sending discard bios down to a device that does not support it? > > -Eric > -- ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 10:24 ` Lukas Czerner 2011-05-02 12:48 ` [linux-lvm] " Mike Snitzer @ 2011-05-02 13:48 ` Martin K. Petersen 2011-05-02 14:20 ` Martin K. Petersen 2 siblings, 0 replies; 54+ messages in thread From: Martin K. Petersen @ 2011-05-02 13:48 UTC (permalink / raw) To: Lukas Czerner Cc: sandeen, Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, linux-ext4, Alasdair G Kergon >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: Lukas> however it also advertise discard_zeroes_data which is wrong and Lukas> possibly dangerous and should be fixed! That shouldn't happen. I'll have a look... -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 10:24 ` Lukas Czerner 2011-05-02 12:48 ` [linux-lvm] " Mike Snitzer 2011-05-02 13:48 ` [linux-lvm] [dm-devel] " Martin K. Petersen @ 2011-05-02 14:20 ` Martin K. Petersen 2011-05-02 14:39 ` Lukas Czerner 2 siblings, 1 reply; 54+ messages in thread From: Martin K. Petersen @ 2011-05-02 14:20 UTC (permalink / raw) To: Lukas Czerner Cc: sandeen, Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, linux-ext4, Alasdair G Kergon >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: Lukas> So I gave it a try. First of all the device composed of SSD and Lukas> spinning disk does export discard_support information properly, Lukas> however it also advertise discard_zeroes_data which is wrong and Lukas> possibly dangerous and should be fixed! I can't reproduce this here. If I mix discard and non-discard devices things work correctly. discard_zeroes_data also gets cleared if I mix discard-capable drives where one zeroes and one doesn't. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 14:20 ` Martin K. Petersen @ 2011-05-02 14:39 ` Lukas Czerner 2011-05-02 14:50 ` Martin K. Petersen ` (2 more replies) 0 siblings, 3 replies; 54+ messages in thread From: Lukas Czerner @ 2011-05-02 14:39 UTC (permalink / raw) To: Martin K. Petersen Cc: sandeen, Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon [-- Attachment #1: Type: TEXT/PLAIN, Size: 1779 bytes --] On Mon, 2 May 2011, Martin K. Petersen wrote: > >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: > > Lukas> So I gave it a try. First of all the device composed of SSD and > Lukas> spinning disk does export discard_support information properly, > Lukas> however it also advertise discard_zeroes_data which is wrong and > Lukas> possibly dangerous and should be fixed! > > I can't reproduce this here. If I mix discard and non-discard devices > things work correctly. discard_zeroes_data also gets cleared if I mix > discard-capable drives where one zeroes and one doesn't. > > [root@trim ~]# hdparm -I /dev/sdb | grep -i trim [root@trim ~]# cat /sys/block/sdb/queue/discard_zeroes_data 0 [root@trim ~]# hdparm -I /dev/sdd | grep -i trim * Data Set Management TRIM supported * Deterministic read after TRIM [root@trim ~]# cat /sys/block/sdd/queue/discard_zeroes_data 1 [root@trim ~]# pvcreate /dev/sdd1 Physical volume "/dev/sdd1" successfully created [root@trim ~]# pvcreate /dev/sdb3 Physical volume "/dev/sdb3" successfully created [root@trim ~]# vgcreate vg_test /dev/sdd1 /dev/sdb3 Volume group "vg_test" successfully created [root@trim ~]# lvcreate -L 3500M vg_test Logical volume "lvol0" created [root@trim ~]# ls -lah /dev/mapper/vg_test-lvol0 lrwxrwxrwx. 1 root root 7 2.�kv� 10.30 /dev/mapper/vg_test-lvol0 -> ../dm-0 [root@trim ~]# cat /sys/block/dm-0/queue/discard_zeroes_data 1 [root@trim ~]# uname -r 2.6.39-rc5-blkdev+ [root@trim ~]# cat /proc/partitions 8 19 1951897 sdb3 8 49 1951866 sdd1 253 0 3584000 dm-0 So I assume it got fixes just recently ?. I'll give it a try with the up-to-date kernel. Thanks! -Lukas ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 14:39 ` Lukas Czerner @ 2011-05-02 14:50 ` Martin K. Petersen 2011-05-02 14:58 ` [linux-lvm] " Mike Snitzer 2011-05-02 16:58 ` [linux-lvm] [dm-devel] " Martin K. Petersen 2 siblings, 0 replies; 54+ messages in thread From: Martin K. Petersen @ 2011-05-02 14:50 UTC (permalink / raw) To: Lukas Czerner Cc: sandeen, Martin K. Petersen, Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, linux-ext4, Alasdair G Kergon >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: Lukas> So I assume it got fixes just recently ?. I'll give it a try with Lukas> the up-to-date kernel. Ok, turns out the SSD I used for the test predates advertising TRIM. Beats me why this is broken for the zeroes/!zeroes case. I'll get it fixed... -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 14:39 ` Lukas Czerner 2011-05-02 14:50 ` Martin K. Petersen @ 2011-05-02 14:58 ` Mike Snitzer 2011-05-02 16:58 ` [linux-lvm] [dm-devel] " Martin K. Petersen 2 siblings, 0 replies; 54+ messages in thread From: Mike Snitzer @ 2011-05-02 14:58 UTC (permalink / raw) To: Lukas Czerner Cc: sandeen, Martin K. Petersen, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, linux-ext4, Alasdair G Kergon On Mon, May 02 2011 at 10:39am -0400, Lukas Czerner <lczerner@redhat.com> wrote: > On Mon, 2 May 2011, Martin K. Petersen wrote: > > > >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: > > > > Lukas> So I gave it a try. First of all the device composed of SSD and > > Lukas> spinning disk does export discard_support information properly, > > Lukas> however it also advertise discard_zeroes_data which is wrong and > > Lukas> possibly dangerous and should be fixed! > > > > I can't reproduce this here. If I mix discard and non-discard devices > > things work correctly. discard_zeroes_data also gets cleared if I mix > > discard-capable drives where one zeroes and one doesn't. > > > > > > [root@trim ~]# hdparm -I /dev/sdb | grep -i trim > [root@trim ~]# cat /sys/block/sdb/queue/discard_zeroes_data > 0 > [root@trim ~]# hdparm -I /dev/sdd | grep -i trim > * Data Set Management TRIM supported > * Deterministic read after TRIM > [root@trim ~]# cat /sys/block/sdd/queue/discard_zeroes_data > 1 > [root@trim ~]# pvcreate /dev/sdd1 > Physical volume "/dev/sdd1" successfully created > [root@trim ~]# pvcreate /dev/sdb3 > Physical volume "/dev/sdb3" successfully created > [root@trim ~]# vgcreate vg_test /dev/sdd1 /dev/sdb3 > Volume group "vg_test" successfully created > [root@trim ~]# lvcreate -L 3500M vg_test > Logical volume "lvol0" created > [root@trim ~]# ls -lah /dev/mapper/vg_test-lvol0 > lrwxrwxrwx. 1 root root 7 2. kvě 10.30 /dev/mapper/vg_test-lvol0 -> ../dm-0 > [root@trim ~]# cat /sys/block/dm-0/queue/discard_zeroes_data > 1 Hmm, Strange considering DM just uses blk_stack_limits(). As Martin said, current blk_stack_limits() code is fine: t->discard_zeroes_data &= b->discard_zeroes_data; > So I assume it got fixes just recently ?. I'll give it a try with the > up-to-date kernel. Hasn't changed at all since it was introduced: 98262f2 v2.6.33-rc1 block: Allow devices to indicate whether discarded blocks are zeroed ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 14:39 ` Lukas Czerner 2011-05-02 14:50 ` Martin K. Petersen 2011-05-02 14:58 ` [linux-lvm] " Mike Snitzer @ 2011-05-02 16:58 ` Martin K. Petersen 2011-05-03 8:57 ` Lukas Czerner 2 siblings, 1 reply; 54+ messages in thread From: Martin K. Petersen @ 2011-05-02 16:58 UTC (permalink / raw) To: Lukas Czerner Cc: sandeen, Martin K. Petersen, Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, linux-ext4, Alasdair G Kergon >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: Lukas, Lukas> [root@trim ~]# lvcreate -L 3500M vg_test Lukas> Logical volume "lvol0" created Ok, so here's what I think is going on. You're creating a linear target which happens to fit inside the first PV. Here's two devices. One that supports discard_zeroes_data=1 (8:17) and one that doesn't (8:49). # dmsetup table foo-bar: 0 1032192 striped 2 32 8:17 384 8:49 384 foo-baz: 0 106496 linear 8:17 516480 # grep . /sys/block/d*/queue/discard_z* /sys/block/dm-0/queue/discard_zeroes_data:0 /sys/block/dm-1/queue/discard_zeroes_data:1 LV foo/bar (dm-0) is striped, straddling two devices with incompatible values. Hence 0. LV foo/baz (dm-1) is linear and fits inside the first PV. Thus it has discard_zeroes_data=1. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-02 16:58 ` [linux-lvm] [dm-devel] " Martin K. Petersen @ 2011-05-03 8:57 ` Lukas Czerner 2011-05-04 15:10 ` Martin K. Petersen 2011-05-04 15:16 ` [linux-lvm] [dm-devel] " Martin K. Petersen 0 siblings, 2 replies; 54+ messages in thread From: Lukas Czerner @ 2011-05-03 8:57 UTC (permalink / raw) To: Martin K. Petersen Cc: sandeen, Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon [-- Attachment #1: Type: TEXT/PLAIN, Size: 2129 bytes --] On Mon, 2 May 2011, Martin K. Petersen wrote: > >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: > > Lukas, > > Lukas> [root@trim ~]# lvcreate -L 3500M vg_test > Lukas> Logical volume "lvol0" created > > Ok, so here's what I think is going on. You're creating a linear target > which happens to fit inside the first PV. > > Here's two devices. One that supports discard_zeroes_data=1 (8:17) and > one that doesn't (8:49). > > # dmsetup table > foo-bar: 0 1032192 striped 2 32 8:17 384 8:49 384 > foo-baz: 0 106496 linear 8:17 516480 > > # grep . /sys/block/d*/queue/discard_z* > /sys/block/dm-0/queue/discard_zeroes_data:0 > /sys/block/dm-1/queue/discard_zeroes_data:1 > > LV foo/bar (dm-0) is striped, straddling two devices with incompatible > values. Hence 0. > > LV foo/baz (dm-1) is linear and fits inside the first PV. Thus it has > discard_zeroes_data=1. Well that's why I have included /proc/partition, so you can see that this is not happening, because both partitions are 1.86GB long and the LV I am creating from those is 3.42GB long. [root@trim ~]# dmsetup table vg_test-lvol0: 0 3899392 linear 8:49 2048 vg_test-lvol0: 3899392 3268608 linear 8:19 2048 Nevertheless there is something weird going on, because even when I create striped volume I get this: [root@trim ~]# dmsetup table vg_test-lvol0: 0 7176192 striped 2 16 8:49 2048 8:19 2048 [root@trim ~]# ls -lah /dev/mapper/vg_test-lvol0 lrwxrwxrwx. 1 root root 7 3.�kv� 04.42 /dev/mapper/vg_test-lvol0 -> ../dm-0 [root@trim ~]# grep . /sys/block/dm-0/queue/discard_* /sys/block/dm-0/queue/discard_granularity:512 /sys/block/dm-0/queue/discard_max_bytes:4294966784 /sys/block/dm-0/queue/discard_zeroes_data:1 And of course (8:49) is a partition on the SSD which does zero data and (8:19) is a partition on the spinning device which does not zero data, nor does it support discard. And this is on yesterdays kernel. [root@trim ~]# uname -r 2.6.39-rc5+ So assuming that you're trying recent kernel as well, someone is doing something different :) Thanks! -Lukas ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-03 8:57 ` Lukas Czerner @ 2011-05-04 15:10 ` Martin K. Petersen 2011-05-04 16:02 ` [linux-lvm] " Mike Snitzer ` (2 more replies) 2011-05-04 15:16 ` [linux-lvm] [dm-devel] " Martin K. Petersen 1 sibling, 3 replies; 54+ messages in thread From: Martin K. Petersen @ 2011-05-04 15:10 UTC (permalink / raw) To: Lukas Czerner Cc: sandeen, Martin K. Petersen, Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, linux-ext4, Alasdair G Kergon >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: Lukas> Nevertheless there is something weird going on, because even when Lukas> I create striped volume I get this: Could you please try the following patch? It has a bunch of small tweaks to the discard stack in it. I'll split it up before posting for real but I'd like to know if it fixes your issue... block/libata/scsi: Various logical block provisioning fixes - Add sysfs documentation for the discard topology parameters - Fix discard stacking problem - Switch our libata SAT over to using the WRITE SAME limits - UNMAP alignment needs to be converted to bytes - Only report alignment and zeroes_data if the device supports discard Reported-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block index 4873c75..c1eb41c 100644 --- a/Documentation/ABI/testing/sysfs-block +++ b/Documentation/ABI/testing/sysfs-block @@ -142,3 +142,67 @@ Description: with the previous I/O request are enabled. When set to 2, all merge tries are disabled. The default value is 0 - which enables all types of merge tries. + +What: /sys/block/<disk>/discard_alignment +Date: May 2011 +Contact: Martin K. Petersen <martin.petersen@oracle.com> +Description: + Devices that support discard functionality may + internally allocate space in units that are bigger than + the exported logical block size. The discard_alignment + parameter indicates how many bytes the beginning of the + device is offset from the internal allocation unit's + natural alignment. + +What: /sys/block/<disk>/<partition>/discard_alignment +Date: May 2011 +Contact: Martin K. Petersen <martin.petersen@oracle.com> +Description: + Devices that support discard functionality may + internally allocate space in units that are bigger than + the exported logical block size. The discard_alignment + parameter indicates how many bytes the beginning of the + partition is offset from the internal allocation unit's + natural alignment. + +What: /sys/block/<disk>/queue/discard_granularity +Date: May 2011 +Contact: Martin K. Petersen <martin.petersen@oracle.com> +Description: + Devices that support discard functionality may + internally allocate space using units that are bigger + than the logical block size. The discard_granularity + parameter indicates the size of the internal allocation + unit in bytes if reported by the device. Otherwise the + discard_granularity will be set to match the device's + physical block size. A discard_granularity of 0 means + that the device does not support discard functionality. + +What: /sys/block/<disk>/queue/discard_max_bytes +Date: May 2011 +Contact: Martin K. Petersen <martin.petersen@oracle.com> +Description: + Devices that support discard functionality may have + internal limits on the number of bytes that can be + trimmed or unmapped in a single operation. Some storage + protocols also have inherent limits on the number of + blocks that can be described in a single command. The + discard_max_bytes parameter is set by the device driver + to the maximum number of bytes that can be discarded in + a single operation. Discard requests issued to the + device must not exceed this limit. A discard_max_bytes + value of 0 means that the device does not support + discard functionality. + +What: /sys/block/<disk>/queue/discard_zeroes_data +Date: May 2011 +Contact: Martin K. Petersen <martin.petersen@oracle.com> +Description: + Devices that support discard functionality may return + stale or random data when a previously discarded block + is read back. This can cause problems if the filesystem + expects discarded blocks to be explicitly cleared. If a + device reports that it deterministically returns zeroes + when a discarded area is read the discard_zeroes_data + parameter will be set to one. Otherwise it will be 0 and + the result of reading a discarded area is undefined. diff --git a/block/blk-settings.c b/block/blk-settings.c index 1fa7692..42d3bf5 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -120,7 +120,7 @@ void blk_set_default_limits(struct queue_limits *lim) lim->discard_granularity = 0; lim->discard_alignment = 0; lim->discard_misaligned = 0; - lim->discard_zeroes_data = -1; + lim->discard_zeroes_data = 1; lim->logical_block_size = lim->physical_block_size = lim->io_min = 512; lim->bounce_pfn = (unsigned long)(BLK_BOUNCE_ANY >> PAGE_SHIFT); lim->alignment_offset = 0; @@ -166,6 +166,7 @@ void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn) blk_set_default_limits(&q->limits); blk_queue_max_hw_sectors(q, BLK_SAFE_MAX_SECTORS); + q->limits.discard_zeroes_data = 0; /* * by default assume old behaviour and bounce for any highmem page diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index e2f57e9e..30ea95f 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -2138,7 +2138,7 @@ static unsigned int ata_scsiop_inq_b0(struct ata_scsi_args *args, u8 *rbuf) * with the unmap bit set. */ if (ata_id_has_trim(args->id)) { - put_unaligned_be32(65535 * 512 / 8, &rbuf[20]); + put_unaligned_be64(65535 * 512 / 8, &rbuf[36]); put_unaligned_be32(1, &rbuf[28]); } diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index bd0806e..cc081dd 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -490,7 +490,8 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode) unsigned int max_blocks = 0; q->limits.discard_zeroes_data = sdkp->lbprz; - q->limits.discard_alignment = sdkp->unmap_alignment; + q->limits.discard_alignment = sdkp->unmap_alignment * + (logical_block_size >> 9); q->limits.discard_granularity = max(sdkp->physical_block_size, sdkp->unmap_granularity * logical_block_size); diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 2ad95fa..1416e3c 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -257,7 +257,7 @@ struct queue_limits { unsigned char misaligned; unsigned char discard_misaligned; unsigned char cluster; - signed char discard_zeroes_data; + unsigned char discard_zeroes_data; }; struct request_queue @@ -1066,13 +1066,16 @@ static inline int queue_limit_discard_alignment(struct queue_limits *lim, sector { unsigned int alignment = (sector << 9) & (lim->discard_granularity - 1); + if (!lim->max_discard_sectors) + return 0; + return (lim->discard_granularity + lim->discard_alignment - alignment) & (lim->discard_granularity - 1); } static inline unsigned int queue_discard_zeroes_data(struct request_queue *q) { - if (q->limits.discard_zeroes_data == 1) + if (q->limits.max_discard_sectors && q->limits.discard_zeroes_data == 1) return 1; return 0; ^ permalink raw reply related [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-04 15:10 ` Martin K. Petersen @ 2011-05-04 16:02 ` Mike Snitzer 2011-05-04 16:50 ` Martin K. Petersen 2011-05-04 17:10 ` [linux-lvm] [dm-devel] " Lukas Czerner 2011-05-18 12:16 ` [linux-lvm] " Mike Snitzer 2 siblings, 1 reply; 54+ messages in thread From: Mike Snitzer @ 2011-05-04 16:02 UTC (permalink / raw) To: Martin K. Petersen Cc: sandeen, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon On Wed, May 04 2011 at 11:10am -0400, Martin K. Petersen <martin.petersen@oracle.com> wrote: > >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: > > Lukas> Nevertheless there is something weird going on, because even when > Lukas> I create striped volume I get this: > > Could you please try the following patch? It has a bunch of small tweaks > to the discard stack in it. I'll split it up before posting for real but > I'd like to know if it fixes your issue... > > > block/libata/scsi: Various logical block provisioning fixes > > - Add sysfs documentation for the discard topology parameters > > - Fix discard stacking problem > > - Switch our libata SAT over to using the WRITE SAME limits > > - UNMAP alignment needs to be converted to bytes > > - Only report alignment and zeroes_data if the device supports discard > > Reported-by: Lukas Czerner <lczerner@redhat.com> > Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> > > diff --git a/block/blk-settings.c b/block/blk-settings.c > index 1fa7692..42d3bf5 100644 > --- a/block/blk-settings.c > +++ b/block/blk-settings.c > @@ -120,7 +120,7 @@ void blk_set_default_limits(struct queue_limits *lim) > lim->discard_granularity = 0; > lim->discard_alignment = 0; > lim->discard_misaligned = 0; > - lim->discard_zeroes_data = -1; > + lim->discard_zeroes_data = 1; > lim->logical_block_size = lim->physical_block_size = lim->io_min = 512; > lim->bounce_pfn = (unsigned long)(BLK_BOUNCE_ANY >> PAGE_SHIFT); > lim->alignment_offset = 0; lim->discard_zeroes_data = -1; was suspect to me too. But why default to 1 here? > @@ -166,6 +166,7 @@ void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn) > > blk_set_default_limits(&q->limits); > blk_queue_max_hw_sectors(q, BLK_SAFE_MAX_SECTORS); > + q->limits.discard_zeroes_data = 0; > > /* > * by default assume old behaviour and bounce for any highmem page Only to then reset to 0 here? Shouldn't we default to 0 and only set to 1 where applicable (e.g. sd_config_discard)? ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-04 16:02 ` [linux-lvm] " Mike Snitzer @ 2011-05-04 16:50 ` Martin K. Petersen 2011-05-04 18:03 ` Mike Snitzer 0 siblings, 1 reply; 54+ messages in thread From: Martin K. Petersen @ 2011-05-04 16:50 UTC (permalink / raw) To: Mike Snitzer Cc: sandeen, Martin K. Petersen, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon >>>>> "Mike" == Mike Snitzer <snitzer@redhat.com> writes: Mike> lim->discard_zeroes_data = -1; was suspect to me too. Mike> But why default to 1 here? Because otherwise DM would default to having dzd to "unsupported", meaning the feature would never be turned on regardless of the bottom device capabilities. The old approach used the -1 value to indicate "has not been set". That was only really intended as a value for the stacking drivers, not for the LLDs. It was a bit of a hack and I'd rather deal with dzd the same way as we do with clustering. >> @@ -166,6 +166,7 @@ void blk_queue_make_request(struct request_queue >> *q, make_request_fn *mfn) >> >> blk_set_default_limits(&q->limits); >> blk_queue_max_hw_sectors(q, BLK_SAFE_MAX_SECTORS); >> + q->limits.discard_zeroes_data = 0; >> >> /* >> * by default assume old behaviour and bounce for any highmem page Mike> Only to then reset to 0 here? Shouldn't we default to 0 and only Mike> set to 1 where applicable (e.g. sd_config_discard)? My first approach was to set it in dm-table.c before stacking. But I thought it was icky to have the stacking driver ask for defaults and then have to tweak them for things to work correctly. The other option is to have blk_set_default_stacking_limits(). Or we could add a flag to blk_set_default_limits to indicate whether this is a LLD or a stacking driver. We already special-case BLK_SAFE_MAX_SECTORS when setting the request function. And that's the only non-stacking user of the default limits call. So that's why I disabled dzd there. Since this is a stable bugfix I also wanted to keep it small and simple. But I'm totally open to suggestions. -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-04 16:50 ` Martin K. Petersen @ 2011-05-04 18:03 ` Mike Snitzer 0 siblings, 0 replies; 54+ messages in thread From: Mike Snitzer @ 2011-05-04 18:03 UTC (permalink / raw) To: Martin K. Petersen Cc: sandeen, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon On Wed, May 04 2011 at 12:50pm -0400, Martin K. Petersen <martin.petersen@oracle.com> wrote: > >>>>> "Mike" == Mike Snitzer <snitzer@redhat.com> writes: > > Mike> lim->discard_zeroes_data = -1; was suspect to me too. > Mike> But why default to 1 here? > > Because otherwise DM would default to having dzd to "unsupported", > meaning the feature would never be turned on regardless of the bottom > device capabilities. > > The old approach used the -1 value to indicate "has not been set". That > was only really intended as a value for the stacking drivers, not for > the LLDs. It was a bit of a hack and I'd rather deal with dzd the same > way as we do with clustering. > > > >> @@ -166,6 +166,7 @@ void blk_queue_make_request(struct request_queue > >> *q, make_request_fn *mfn) > >> > >> blk_set_default_limits(&q->limits); > >> blk_queue_max_hw_sectors(q, BLK_SAFE_MAX_SECTORS); > >> + q->limits.discard_zeroes_data = 0; > >> > >> /* > >> * by default assume old behaviour and bounce for any highmem page > > Mike> Only to then reset to 0 here? Shouldn't we default to 0 and only > Mike> set to 1 where applicable (e.g. sd_config_discard)? > > My first approach was to set it in dm-table.c before stacking. But I > thought it was icky to have the stacking driver ask for defaults and > then have to tweak them for things to work correctly. > > The other option is to have blk_set_default_stacking_limits(). Or we > could add a flag to blk_set_default_limits to indicate whether this is a > LLD or a stacking driver. > > We already special-case BLK_SAFE_MAX_SECTORS when setting the request > function. And that's the only non-stacking user of the default limits > call. So that's why I disabled dzd there. Since this is a stable bugfix > I also wanted to keep it small and simple. But I'm totally open to > suggestions. Your current approach sounds good. Might be good to briefly speak to the duality of the stacking vs non-stacking approach in the associated patch header. Thanks for clarifying. Mike ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-04 15:10 ` Martin K. Petersen 2011-05-04 16:02 ` [linux-lvm] " Mike Snitzer @ 2011-05-04 17:10 ` Lukas Czerner 2011-05-04 17:32 ` Martin K. Petersen 2011-05-18 12:16 ` [linux-lvm] " Mike Snitzer 2 siblings, 1 reply; 54+ messages in thread From: Lukas Czerner @ 2011-05-04 17:10 UTC (permalink / raw) To: Martin K. Petersen Cc: sandeen, Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon [-- Attachment #1: Type: TEXT/PLAIN, Size: 9258 bytes --] On Wed, 4 May 2011, Martin K. Petersen wrote: > >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: > > Lukas> Nevertheless there is something weird going on, because even when > Lukas> I create striped volume I get this: > > Could you please try the following patch? It has a bunch of small tweaks > to the discard stack in it. I'll split it up before posting for real but > I'd like to know if it fixes your issue... It works thanks! This is before the patch applied: NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO sdb 0 0B 0B 0 О©╫О©╫sdb1 4293918720 0B 0B 0 О©╫О©╫sdb2 4293918720 0B 0B 0 О©╫О©╫sdb3 4293918720 0B 0B 0 О©╫О©╫vg_test-lvol0 (dm-0) 0 512B 16E 1 sdd 0 512B 16E 1 О©╫О©╫sdd1 0 512B 16E 1 О©╫О©╫vg_test-lvol0 (dm-0) 0 512B 16E 1 and this is with the patch: NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO sdb 0 0B 0B 0 О©╫О©╫sdb1 0 0B 0B 0 О©╫О©╫sdb2 0 0B 0B 0 О©╫О©╫sdb3 0 0B 0B 0 О©╫О©╫vg_test-lvol0 (dm-0) 0 512B 2G 0 sdd 0 512B 2G 1 О©╫О©╫sdd1 0 512B 2G 1 О©╫О©╫vg_test-lvol0 (dm-0) 0 512B 2G 0 It also works properly with two devices supporting discard. NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO sdd 0 512B 2G 1 О©╫О©╫sdd1 0 512B 2G 1 О©╫ О©╫О©╫vg_test-lvol0 (dm-0) 0 512B 2G 1 О©╫О©╫sdd2 0 512B 2G 1 О©╫О©╫vg_test-lvol0 (dm-0) 0 512B 2G 1 Also I am not sure why this changed discard_max_bytes from 4294966784 to 2147450880 in my case, that does not seem right...lsblk in the first place apparently overflowed. Thanks for solving this! -Lukas > > > block/libata/scsi: Various logical block provisioning fixes > > - Add sysfs documentation for the discard topology parameters > > - Fix discard stacking problem > > - Switch our libata SAT over to using the WRITE SAME limits > > - UNMAP alignment needs to be converted to bytes > > - Only report alignment and zeroes_data if the device supports discard > > Reported-by: Lukas Czerner <lczerner@redhat.com> > Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> > > diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block > index 4873c75..c1eb41c 100644 > --- a/Documentation/ABI/testing/sysfs-block > +++ b/Documentation/ABI/testing/sysfs-block > @@ -142,3 +142,67 @@ Description: > with the previous I/O request are enabled. When set to 2, > all merge tries are disabled. The default value is 0 - > which enables all types of merge tries. > + > +What: /sys/block/<disk>/discard_alignment > +Date: May 2011 > +Contact: Martin K. Petersen <martin.petersen@oracle.com> > +Description: > + Devices that support discard functionality may > + internally allocate space in units that are bigger than > + the exported logical block size. The discard_alignment > + parameter indicates how many bytes the beginning of the > + device is offset from the internal allocation unit's > + natural alignment. > + > +What: /sys/block/<disk>/<partition>/discard_alignment > +Date: May 2011 > +Contact: Martin K. Petersen <martin.petersen@oracle.com> > +Description: > + Devices that support discard functionality may > + internally allocate space in units that are bigger than > + the exported logical block size. The discard_alignment > + parameter indicates how many bytes the beginning of the > + partition is offset from the internal allocation unit's > + natural alignment. > + > +What: /sys/block/<disk>/queue/discard_granularity > +Date: May 2011 > +Contact: Martin K. Petersen <martin.petersen@oracle.com> > +Description: > + Devices that support discard functionality may > + internally allocate space using units that are bigger > + than the logical block size. The discard_granularity > + parameter indicates the size of the internal allocation > + unit in bytes if reported by the device. Otherwise the > + discard_granularity will be set to match the device's > + physical block size. A discard_granularity of 0 means > + that the device does not support discard functionality. > + > +What: /sys/block/<disk>/queue/discard_max_bytes > +Date: May 2011 > +Contact: Martin K. Petersen <martin.petersen@oracle.com> > +Description: > + Devices that support discard functionality may have > + internal limits on the number of bytes that can be > + trimmed or unmapped in a single operation. Some storage > + protocols also have inherent limits on the number of > + blocks that can be described in a single command. The > + discard_max_bytes parameter is set by the device driver > + to the maximum number of bytes that can be discarded in > + a single operation. Discard requests issued to the > + device must not exceed this limit. A discard_max_bytes > + value of 0 means that the device does not support > + discard functionality. > + > +What: /sys/block/<disk>/queue/discard_zeroes_data > +Date: May 2011 > +Contact: Martin K. Petersen <martin.petersen@oracle.com> > +Description: > + Devices that support discard functionality may return > + stale or random data when a previously discarded block > + is read back. This can cause problems if the filesystem > + expects discarded blocks to be explicitly cleared. If a > + device reports that it deterministically returns zeroes > + when a discarded area is read the discard_zeroes_data > + parameter will be set to one. Otherwise it will be 0 and > + the result of reading a discarded area is undefined. > diff --git a/block/blk-settings.c b/block/blk-settings.c > index 1fa7692..42d3bf5 100644 > --- a/block/blk-settings.c > +++ b/block/blk-settings.c > @@ -120,7 +120,7 @@ void blk_set_default_limits(struct queue_limits *lim) > lim->discard_granularity = 0; > lim->discard_alignment = 0; > lim->discard_misaligned = 0; > - lim->discard_zeroes_data = -1; > + lim->discard_zeroes_data = 1; > lim->logical_block_size = lim->physical_block_size = lim->io_min = 512; > lim->bounce_pfn = (unsigned long)(BLK_BOUNCE_ANY >> PAGE_SHIFT); > lim->alignment_offset = 0; > @@ -166,6 +166,7 @@ void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn) > > blk_set_default_limits(&q->limits); > blk_queue_max_hw_sectors(q, BLK_SAFE_MAX_SECTORS); > + q->limits.discard_zeroes_data = 0; > > /* > * by default assume old behaviour and bounce for any highmem page > diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c > index e2f57e9e..30ea95f 100644 > --- a/drivers/ata/libata-scsi.c > +++ b/drivers/ata/libata-scsi.c > @@ -2138,7 +2138,7 @@ static unsigned int ata_scsiop_inq_b0(struct ata_scsi_args *args, u8 *rbuf) > * with the unmap bit set. > */ > if (ata_id_has_trim(args->id)) { > - put_unaligned_be32(65535 * 512 / 8, &rbuf[20]); > + put_unaligned_be64(65535 * 512 / 8, &rbuf[36]); > put_unaligned_be32(1, &rbuf[28]); > } > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index bd0806e..cc081dd 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -490,7 +490,8 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode) > unsigned int max_blocks = 0; > > q->limits.discard_zeroes_data = sdkp->lbprz; > - q->limits.discard_alignment = sdkp->unmap_alignment; > + q->limits.discard_alignment = sdkp->unmap_alignment * > + (logical_block_size >> 9); > q->limits.discard_granularity = > max(sdkp->physical_block_size, > sdkp->unmap_granularity * logical_block_size); > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > index 2ad95fa..1416e3c 100644 > --- a/include/linux/blkdev.h > +++ b/include/linux/blkdev.h > @@ -257,7 +257,7 @@ struct queue_limits { > unsigned char misaligned; > unsigned char discard_misaligned; > unsigned char cluster; > - signed char discard_zeroes_data; > + unsigned char discard_zeroes_data; > }; > > struct request_queue > @@ -1066,13 +1066,16 @@ static inline int queue_limit_discard_alignment(struct queue_limits *lim, sector > { > unsigned int alignment = (sector << 9) & (lim->discard_granularity - 1); > > + if (!lim->max_discard_sectors) > + return 0; > + > return (lim->discard_granularity + lim->discard_alignment - alignment) > & (lim->discard_granularity - 1); > } > > static inline unsigned int queue_discard_zeroes_data(struct request_queue *q) > { > - if (q->limits.discard_zeroes_data == 1) > + if (q->limits.max_discard_sectors && q->limits.discard_zeroes_data == 1) > return 1; > > return 0; > -- ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-04 17:10 ` [linux-lvm] [dm-devel] " Lukas Czerner @ 2011-05-04 17:32 ` Martin K. Petersen 2011-05-04 17:35 ` Lukas Czerner 0 siblings, 1 reply; 54+ messages in thread From: Martin K. Petersen @ 2011-05-04 17:32 UTC (permalink / raw) To: Lukas Czerner Cc: sandeen, Martin K. Petersen, Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, linux-ext4, Alasdair G Kergon >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: Lukas> Also I am not sure why this changed discard_max_bytes from Lukas> 4294966784 to 2147450880 in my case, that does not seem Lukas> right...lsblk in the first place apparently overflowed. 2G is correct. The relevant libata patch was missing from the series I submitted during the merge window and I don't see it in the archives. It was at the end of the stack so I guess I had an off-by-one in my commit range... -- Martin K. Petersen Oracle Linux Engineering ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-04 17:32 ` Martin K. Petersen @ 2011-05-04 17:35 ` Lukas Czerner 0 siblings, 0 replies; 54+ messages in thread From: Lukas Czerner @ 2011-05-04 17:35 UTC (permalink / raw) To: Martin K. Petersen Cc: sandeen, Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon On Wed, 4 May 2011, Martin K. Petersen wrote: > >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: > > Lukas> Also I am not sure why this changed discard_max_bytes from > Lukas> 4294966784 to 2147450880 in my case, that does not seem > Lukas> right...lsblk in the first place apparently overflowed. > > 2G is correct. The relevant libata patch was missing from the series I > submitted during the merge window and I don't see it in the archives. It > was at the end of the stack so I guess I had an off-by-one in my commit > range... > Ok, thanks. -Lukas ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-04 15:10 ` Martin K. Petersen 2011-05-04 16:02 ` [linux-lvm] " Mike Snitzer 2011-05-04 17:10 ` [linux-lvm] [dm-devel] " Lukas Czerner @ 2011-05-18 12:16 ` Mike Snitzer 2011-05-18 12:52 ` Mike Snitzer 2 siblings, 1 reply; 54+ messages in thread From: Mike Snitzer @ 2011-05-18 12:16 UTC (permalink / raw) To: Martin K. Petersen Cc: sandeen, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon Hey Martin, On Wed, May 04 2011 at 11:10am -0400, Martin K. Petersen <martin.petersen@oracle.com> wrote: > >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: > > Lukas> Nevertheless there is something weird going on, because even when > Lukas> I create striped volume I get this: > > Could you please try the following patch? It has a bunch of small tweaks > to the discard stack in it. I'll split it up before posting for real but > I'd like to know if it fixes your issue... Would be ideal to get these fixes staged for 2.6.40. Do you intend to split these patches up and post for 2.6.40 inclussion? Please advise, thanks! > block/libata/scsi: Various logical block provisioning fixes > > - Add sysfs documentation for the discard topology parameters > > - Fix discard stacking problem > > - Switch our libata SAT over to using the WRITE SAME limits > > - UNMAP alignment needs to be converted to bytes > > - Only report alignment and zeroes_data if the device supports discard > > Reported-by: Lukas Czerner <lczerner@redhat.com> > Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> > > diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block > index 4873c75..c1eb41c 100644 > --- a/Documentation/ABI/testing/sysfs-block > +++ b/Documentation/ABI/testing/sysfs-block > @@ -142,3 +142,67 @@ Description: > with the previous I/O request are enabled. When set to 2, > all merge tries are disabled. The default value is 0 - > which enables all types of merge tries. > + > +What: /sys/block/<disk>/discard_alignment > +Date: May 2011 > +Contact: Martin K. Petersen <martin.petersen@oracle.com> > +Description: > + Devices that support discard functionality may > + internally allocate space in units that are bigger than > + the exported logical block size. The discard_alignment > + parameter indicates how many bytes the beginning of the > + device is offset from the internal allocation unit's > + natural alignment. > + > +What: /sys/block/<disk>/<partition>/discard_alignment > +Date: May 2011 > +Contact: Martin K. Petersen <martin.petersen@oracle.com> > +Description: > + Devices that support discard functionality may > + internally allocate space in units that are bigger than > + the exported logical block size. The discard_alignment > + parameter indicates how many bytes the beginning of the > + partition is offset from the internal allocation unit's > + natural alignment. > + > +What: /sys/block/<disk>/queue/discard_granularity > +Date: May 2011 > +Contact: Martin K. Petersen <martin.petersen@oracle.com> > +Description: > + Devices that support discard functionality may > + internally allocate space using units that are bigger > + than the logical block size. The discard_granularity > + parameter indicates the size of the internal allocation > + unit in bytes if reported by the device. Otherwise the > + discard_granularity will be set to match the device's > + physical block size. A discard_granularity of 0 means > + that the device does not support discard functionality. > + > +What: /sys/block/<disk>/queue/discard_max_bytes > +Date: May 2011 > +Contact: Martin K. Petersen <martin.petersen@oracle.com> > +Description: > + Devices that support discard functionality may have > + internal limits on the number of bytes that can be > + trimmed or unmapped in a single operation. Some storage > + protocols also have inherent limits on the number of > + blocks that can be described in a single command. The > + discard_max_bytes parameter is set by the device driver > + to the maximum number of bytes that can be discarded in > + a single operation. Discard requests issued to the > + device must not exceed this limit. A discard_max_bytes > + value of 0 means that the device does not support > + discard functionality. > + > +What: /sys/block/<disk>/queue/discard_zeroes_data > +Date: May 2011 > +Contact: Martin K. Petersen <martin.petersen@oracle.com> > +Description: > + Devices that support discard functionality may return > + stale or random data when a previously discarded block > + is read back. This can cause problems if the filesystem > + expects discarded blocks to be explicitly cleared. If a > + device reports that it deterministically returns zeroes > + when a discarded area is read the discard_zeroes_data > + parameter will be set to one. Otherwise it will be 0 and > + the result of reading a discarded area is undefined. > diff --git a/block/blk-settings.c b/block/blk-settings.c > index 1fa7692..42d3bf5 100644 > --- a/block/blk-settings.c > +++ b/block/blk-settings.c > @@ -120,7 +120,7 @@ void blk_set_default_limits(struct queue_limits *lim) > lim->discard_granularity = 0; > lim->discard_alignment = 0; > lim->discard_misaligned = 0; > - lim->discard_zeroes_data = -1; > + lim->discard_zeroes_data = 1; > lim->logical_block_size = lim->physical_block_size = lim->io_min = 512; > lim->bounce_pfn = (unsigned long)(BLK_BOUNCE_ANY >> PAGE_SHIFT); > lim->alignment_offset = 0; > @@ -166,6 +166,7 @@ void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn) > > blk_set_default_limits(&q->limits); > blk_queue_max_hw_sectors(q, BLK_SAFE_MAX_SECTORS); > + q->limits.discard_zeroes_data = 0; > > /* > * by default assume old behaviour and bounce for any highmem page > diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c > index e2f57e9e..30ea95f 100644 > --- a/drivers/ata/libata-scsi.c > +++ b/drivers/ata/libata-scsi.c > @@ -2138,7 +2138,7 @@ static unsigned int ata_scsiop_inq_b0(struct ata_scsi_args *args, u8 *rbuf) > * with the unmap bit set. > */ > if (ata_id_has_trim(args->id)) { > - put_unaligned_be32(65535 * 512 / 8, &rbuf[20]); > + put_unaligned_be64(65535 * 512 / 8, &rbuf[36]); > put_unaligned_be32(1, &rbuf[28]); > } > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index bd0806e..cc081dd 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -490,7 +490,8 @@ static void sd_config_discard(struct scsi_disk *sdkp, unsigned int mode) > unsigned int max_blocks = 0; > > q->limits.discard_zeroes_data = sdkp->lbprz; > - q->limits.discard_alignment = sdkp->unmap_alignment; > + q->limits.discard_alignment = sdkp->unmap_alignment * > + (logical_block_size >> 9); > q->limits.discard_granularity = > max(sdkp->physical_block_size, > sdkp->unmap_granularity * logical_block_size); > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > index 2ad95fa..1416e3c 100644 > --- a/include/linux/blkdev.h > +++ b/include/linux/blkdev.h > @@ -257,7 +257,7 @@ struct queue_limits { > unsigned char misaligned; > unsigned char discard_misaligned; > unsigned char cluster; > - signed char discard_zeroes_data; > + unsigned char discard_zeroes_data; > }; > > struct request_queue > @@ -1066,13 +1066,16 @@ static inline int queue_limit_discard_alignment(struct queue_limits *lim, sector > { > unsigned int alignment = (sector << 9) & (lim->discard_granularity - 1); > > + if (!lim->max_discard_sectors) > + return 0; > + > return (lim->discard_granularity + lim->discard_alignment - alignment) > & (lim->discard_granularity - 1); > } > > static inline unsigned int queue_discard_zeroes_data(struct request_queue *q) > { > - if (q->limits.discard_zeroes_data == 1) > + if (q->limits.max_discard_sectors && q->limits.discard_zeroes_data == 1) > return 1; > > return 0; > -- > dm-devel mailing list > dm-devel@redhat.com > https://www.redhat.com/mailman/listinfo/dm-devel ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-18 12:16 ` [linux-lvm] " Mike Snitzer @ 2011-05-18 12:52 ` Mike Snitzer 0 siblings, 0 replies; 54+ messages in thread From: Mike Snitzer @ 2011-05-18 12:52 UTC (permalink / raw) To: Martin K. Petersen Cc: sandeen, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon On Wed, May 18 2011 at 8:16am -0400, Mike Snitzer <snitzer@redhat.com> wrote: > Hey Martin, > > On Wed, May 04 2011 at 11:10am -0400, > Martin K. Petersen <martin.petersen@oracle.com> wrote: > > > >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: > > > > Lukas> Nevertheless there is something weird going on, because even when > > Lukas> I create striped volume I get this: > > > > Could you please try the following patch? It has a bunch of small tweaks > > to the discard stack in it. I'll split it up before posting for real but > > I'd like to know if it fixes your issue... > > Would be ideal to get these fixes staged for 2.6.40. > > Do you intend to split these patches up and post for 2.6.40 inclussion? Ah, I now see you posted them and Jens picked some up.. thanks Mike ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-03 8:57 ` Lukas Czerner 2011-05-04 15:10 ` Martin K. Petersen @ 2011-05-04 15:16 ` Martin K. Petersen 2011-05-04 16:12 ` Lukas Czerner 2011-05-05 8:33 ` Karel Zak 1 sibling, 2 replies; 54+ messages in thread From: Martin K. Petersen @ 2011-05-04 15:16 UTC (permalink / raw) To: Lukas Czerner, Karel Zak Cc: sandeen, Martin K. Petersen, Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, linux-ext4, Alasdair G Kergon >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: I got tired of poking around in sysfs to find the discard topology. Here's a patch against lsblk that adds a -D option to present this information in a human-readable form: # lsblk -D NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO sda 0 0B 0B 0 └─sda1 0 0B 0B 0 sdb 0 512B 2G 1 └─sdb1 0 512B 2G 1 Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> diff --git a/misc-utils/lsblk.8 b/misc-utils/lsblk.8 index 38ff48f..d7d7aa8 100644 --- a/misc-utils/lsblk.8 +++ b/misc-utils/lsblk.8 @@ -29,6 +29,8 @@ Print the SIZE column in bytes rather than in human-readable format. .IP "\fB\-d, \-\-nodeps\fP" Don't print device holders or slaves. For example "lsblk --nodeps /dev/sda" prints information about the sda device only. +.IP "\fB\-D, \-\-discard\fP" +Print information about the discard (TRIM, UNMAP) capabilities for each device. .IP "\fB\-e, \-\-exclude \fIlist\fP Exclude the devices specified by a comma-separated \fIlist\fR of major device numbers. Note that RAM disks (major=1) are excluded by default. diff --git a/misc-utils/lsblk.c b/misc-utils/lsblk.c index 38326d0..671e690 100644 --- a/misc-utils/lsblk.c +++ b/misc-utils/lsblk.c @@ -77,6 +77,10 @@ enum { COL_ROTA, COL_SCHED, COL_TYPE, + COL_DALIGN, + COL_DGRAN, + COL_DMAX, + COL_DZERO, __NCOLUMNS }; @@ -112,8 +116,11 @@ static struct colinfo infos[__NCOLUMNS] = { [COL_PHYSEC] = { "PHY-SEC", 7, TT_FL_RIGHT, N_("physical sector size") }, [COL_LOGSEC] = { "LOG-SEC", 7, TT_FL_RIGHT, N_("logical sector size") }, [COL_SCHED] = { "SCHED", 0.1, 0, N_("I/O scheduler name") }, - [COL_TYPE] = { "TYPE", 4, 0, N_("device type") } - + [COL_TYPE] = { "TYPE", 4, 0, N_("device type") }, + [COL_DALIGN] = { "DISC-ALN", 6, TT_FL_RIGHT, N_("discard alignment offset") }, + [COL_DGRAN] = { "DISC-GRAN", 6, TT_FL_RIGHT, N_("discard granularity") }, + [COL_DMAX] = { "DISC-MAX", 6, TT_FL_RIGHT, N_("discard max bytes") }, + [COL_DZERO] = { "DISC-ZERO", 1, TT_FL_RIGHT, N_("discard zeroes data") }, }; struct lsblk { @@ -702,6 +709,33 @@ static void set_tt_data(struct blkdev_cxt *cxt, int col, int id, struct tt_line if (p) tt_line_set_data(ln, col, p); break; + case COL_DALIGN: + p = sysfs_strdup(cxt, "discard_alignment"); + if (p) + tt_line_set_data(ln, col, p); + break; + case COL_DGRAN: + p = sysfs_strdup(cxt, "queue/discard_granularity"); + if (!lsblk->bytes) + p = size_to_human_string(atoi(p)); + + if (p) + tt_line_set_data(ln, col, p); + break; + case COL_DMAX: + p = sysfs_strdup(cxt, "queue/discard_max_bytes"); + + if (!lsblk->bytes) + p = size_to_human_string(atoi(p)); + + if (p) + tt_line_set_data(ln, col, p); + break; + case COL_DZERO: + p = sysfs_strdup(cxt, "queue/discard_zeroes_data"); + if (p) + tt_line_set_data(ln, col, p); + break; }; } @@ -930,6 +964,7 @@ static void __attribute__((__noreturn__)) help(FILE *out) " -a, --all print all devices\n" " -b, --bytes print SIZE in bytes rather than in human readable format\n" " -d, --nodeps don't print slaves or holders\n" + " -D, --discard print discard capabilities\n" " -e, --exclude <list> exclude devices by major number (default: RAM disks)\n" " -f, --fs output info about filesystems\n" " -h, --help usage information (this)\n" @@ -967,6 +1002,7 @@ int main(int argc, char *argv[]) { "all", 0, 0, 'a' }, { "bytes", 0, 0, 'b' }, { "nodeps", 0, 0, 'd' }, + { "discard", 0, 0, 'D' }, { "help", 0, 0, 'h' }, { "output", 1, 0, 'o' }, { "perms", 0, 0, 'm' }, @@ -987,7 +1023,7 @@ int main(int argc, char *argv[]) lsblk = &_ls; memset(lsblk, 0, sizeof(*lsblk)); - while((c = getopt_long(argc, argv, "abde:fhlnmo:irt", longopts, NULL)) != -1) { + while((c = getopt_long(argc, argv, "abdDe:fhlnmo:irt", longopts, NULL)) != -1) { switch(c) { case 'a': lsblk->all_devices = 1; @@ -998,6 +1034,13 @@ int main(int argc, char *argv[]) case 'd': lsblk->nodeps = 1; break; + case 'D': + columns[ncolumns++] = COL_NAME; + columns[ncolumns++] = COL_DALIGN; + columns[ncolumns++] = COL_DGRAN; + columns[ncolumns++] = COL_DMAX; + columns[ncolumns++] = COL_DZERO; + break; case 'e': parse_excludes(optarg); break; ^ permalink raw reply related [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-04 15:16 ` [linux-lvm] [dm-devel] " Martin K. Petersen @ 2011-05-04 16:12 ` Lukas Czerner 2011-05-05 8:33 ` Karel Zak 1 sibling, 0 replies; 54+ messages in thread From: Lukas Czerner @ 2011-05-04 16:12 UTC (permalink / raw) To: Martin K. Petersen Cc: sandeen, Mike Snitzer, Christoph Hellwig, device-mapper development, Karel Zak, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon [-- Attachment #1: Type: TEXT/PLAIN, Size: 4904 bytes --] On Wed, 4 May 2011, Martin K. Petersen wrote: > >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: > > I got tired of poking around in sysfs to find the discard topology. > Here's a patch against lsblk that adds a -D option to present this > information in a human-readable form: > > # lsblk -D > NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO > sda 0 0B 0B 0 > └─sda1 0 0B 0B 0 > sdb 0 512B 2G 1 > └─sdb1 0 512B 2G 1 Very useful indeed! Thanks! -Lukas > > > Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> > > diff --git a/misc-utils/lsblk.8 b/misc-utils/lsblk.8 > index 38ff48f..d7d7aa8 100644 > --- a/misc-utils/lsblk.8 > +++ b/misc-utils/lsblk.8 > @@ -29,6 +29,8 @@ Print the SIZE column in bytes rather than in human-readable format. > .IP "\fB\-d, \-\-nodeps\fP" > Don't print device holders or slaves. For example "lsblk --nodeps /dev/sda" prints > information about the sda device only. > +.IP "\fB\-D, \-\-discard\fP" > +Print information about the discard (TRIM, UNMAP) capabilities for each device. > .IP "\fB\-e, \-\-exclude \fIlist\fP > Exclude the devices specified by a comma-separated \fIlist\fR of major device numbers. > Note that RAM disks (major=1) are excluded by default. > diff --git a/misc-utils/lsblk.c b/misc-utils/lsblk.c > index 38326d0..671e690 100644 > --- a/misc-utils/lsblk.c > +++ b/misc-utils/lsblk.c > @@ -77,6 +77,10 @@ enum { > COL_ROTA, > COL_SCHED, > COL_TYPE, > + COL_DALIGN, > + COL_DGRAN, > + COL_DMAX, > + COL_DZERO, > > __NCOLUMNS > }; > @@ -112,8 +116,11 @@ static struct colinfo infos[__NCOLUMNS] = { > [COL_PHYSEC] = { "PHY-SEC", 7, TT_FL_RIGHT, N_("physical sector size") }, > [COL_LOGSEC] = { "LOG-SEC", 7, TT_FL_RIGHT, N_("logical sector size") }, > [COL_SCHED] = { "SCHED", 0.1, 0, N_("I/O scheduler name") }, > - [COL_TYPE] = { "TYPE", 4, 0, N_("device type") } > - > + [COL_TYPE] = { "TYPE", 4, 0, N_("device type") }, > + [COL_DALIGN] = { "DISC-ALN", 6, TT_FL_RIGHT, N_("discard alignment offset") }, > + [COL_DGRAN] = { "DISC-GRAN", 6, TT_FL_RIGHT, N_("discard granularity") }, > + [COL_DMAX] = { "DISC-MAX", 6, TT_FL_RIGHT, N_("discard max bytes") }, > + [COL_DZERO] = { "DISC-ZERO", 1, TT_FL_RIGHT, N_("discard zeroes data") }, > }; > > struct lsblk { > @@ -702,6 +709,33 @@ static void set_tt_data(struct blkdev_cxt *cxt, int col, int id, struct tt_line > if (p) > tt_line_set_data(ln, col, p); > break; > + case COL_DALIGN: > + p = sysfs_strdup(cxt, "discard_alignment"); > + if (p) > + tt_line_set_data(ln, col, p); > + break; > + case COL_DGRAN: > + p = sysfs_strdup(cxt, "queue/discard_granularity"); > + if (!lsblk->bytes) > + p = size_to_human_string(atoi(p)); > + > + if (p) > + tt_line_set_data(ln, col, p); > + break; > + case COL_DMAX: > + p = sysfs_strdup(cxt, "queue/discard_max_bytes"); > + > + if (!lsblk->bytes) > + p = size_to_human_string(atoi(p)); > + > + if (p) > + tt_line_set_data(ln, col, p); > + break; > + case COL_DZERO: > + p = sysfs_strdup(cxt, "queue/discard_zeroes_data"); > + if (p) > + tt_line_set_data(ln, col, p); > + break; > }; > } > > @@ -930,6 +964,7 @@ static void __attribute__((__noreturn__)) help(FILE *out) > " -a, --all print all devices\n" > " -b, --bytes print SIZE in bytes rather than in human readable format\n" > " -d, --nodeps don't print slaves or holders\n" > + " -D, --discard print discard capabilities\n" > " -e, --exclude <list> exclude devices by major number (default: RAM disks)\n" > " -f, --fs output info about filesystems\n" > " -h, --help usage information (this)\n" > @@ -967,6 +1002,7 @@ int main(int argc, char *argv[]) > { "all", 0, 0, 'a' }, > { "bytes", 0, 0, 'b' }, > { "nodeps", 0, 0, 'd' }, > + { "discard", 0, 0, 'D' }, > { "help", 0, 0, 'h' }, > { "output", 1, 0, 'o' }, > { "perms", 0, 0, 'm' }, > @@ -987,7 +1023,7 @@ int main(int argc, char *argv[]) > lsblk = &_ls; > memset(lsblk, 0, sizeof(*lsblk)); > > - while((c = getopt_long(argc, argv, "abde:fhlnmo:irt", longopts, NULL)) != -1) { > + while((c = getopt_long(argc, argv, "abdDe:fhlnmo:irt", longopts, NULL)) != -1) { > switch(c) { > case 'a': > lsblk->all_devices = 1; > @@ -998,6 +1034,13 @@ int main(int argc, char *argv[]) > case 'd': > lsblk->nodeps = 1; > break; > + case 'D': > + columns[ncolumns++] = COL_NAME; > + columns[ncolumns++] = COL_DALIGN; > + columns[ncolumns++] = COL_DGRAN; > + columns[ncolumns++] = COL_DMAX; > + columns[ncolumns++] = COL_DZERO; > + break; > case 'e': > parse_excludes(optarg); > break; > -- ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-04 15:16 ` [linux-lvm] [dm-devel] " Martin K. Petersen 2011-05-04 16:12 ` Lukas Czerner @ 2011-05-05 8:33 ` Karel Zak 2011-05-05 10:48 ` Lukas Czerner 1 sibling, 1 reply; 54+ messages in thread From: Karel Zak @ 2011-05-05 8:33 UTC (permalink / raw) To: Martin K. Petersen Cc: sandeen, Mike Snitzer, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon On Wed, May 04, 2011 at 11:16:05AM -0400, Martin K. Petersen wrote: > >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: > > I got tired of poking around in sysfs to find the discard topology. > Here's a patch against lsblk that adds a -D option to present this > information in a human-readable form: Applied, thanks. > # lsblk -D > NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO > sda 0 0B 0B 0 > └─sda1 0 0B 0B 0 > sdb 0 512B 2G 1 > └─sdb1 0 512B 2G 1 I have a question, 2.6.35 on my ThinkPad, non-SSD disk: NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO sda 0 0B 0B 0 ├─sda1 4294935040 0B 0B 0 ├─sda2 4188038656 0B 0B 0 ├─sda3 1346205184 0B 0B 0 ├─sda4 3231165440 0B 0B 0 ├─sda5 4188006400 0B 0B 0 │ └─kzak-home (dm-0) 0 0B 0B 0 └─sda6 2035725312 0B 0B 0 Does is make sense? The DISC-ALN is non-zero but DISC-GRAN is zero. Note that cat /sys/block/sda/sda*/discard_alignment returns the same numbers. Karel -- Karel Zak <kzak@redhat.com> http://karelzak.blogspot.com ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [dm-devel] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] 2011-05-05 8:33 ` Karel Zak @ 2011-05-05 10:48 ` Lukas Czerner 0 siblings, 0 replies; 54+ messages in thread From: Lukas Czerner @ 2011-05-05 10:48 UTC (permalink / raw) To: Karel Zak Cc: sandeen, Mike Snitzer, Martin K. Petersen, Christoph Hellwig, device-mapper development, DarkNovaNick, linux-lvm, Lukas Czerner, linux-ext4, Alasdair G Kergon [-- Attachment #1: Type: TEXT/PLAIN, Size: 1632 bytes --] On Thu, 5 May 2011, Karel Zak wrote: > On Wed, May 04, 2011 at 11:16:05AM -0400, Martin K. Petersen wrote: > > >>>>> "Lukas" == Lukas Czerner <lczerner@redhat.com> writes: > > > > I got tired of poking around in sysfs to find the discard topology. > > Here's a patch against lsblk that adds a -D option to present this > > information in a human-readable form: > > Applied, thanks. > > > # lsblk -D > > NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO > > sda 0 0B 0B 0 > > └─sda1 0 0B 0B 0 > > sdb 0 512B 2G 1 > > └─sdb1 0 512B 2G 1 > > I have a question, 2.6.35 on my ThinkPad, non-SSD disk: > > NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO > sda 0 0B 0B 0 > ├─sda1 4294935040 0B 0B 0 > ├─sda2 4188038656 0B 0B 0 > ├─sda3 1346205184 0B 0B 0 > ├─sda4 3231165440 0B 0B 0 > ├─sda5 4188006400 0B 0B 0 > │ └─kzak-home (dm-0) 0 0B 0B 0 > └─sda6 2035725312 0B 0B 0 > > > Does is make sense? The DISC-ALN is non-zero but DISC-GRAN is zero. > > Note that cat /sys/block/sda/sda*/discard_alignment returns the same > numbers. > > Karel > > That is fixed by the kernel patch posted above :) -Lukas ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [linux-lvm] [PATCH] dm snapshot: add discard support to the snapshot-origin target [was: Re: Testing TRIM wi 2011-04-13 22:40 ` [linux-lvm] [PATCH] dm snapshot: add discard support to the snapshot-origin target [was: Re: Testing TRIM with LVM] Mike Snitzer 2011-04-13 23:48 ` [linux-lvm] " Mike Snitzer @ 2011-04-14 15:31 ` DarkNovaNick 1 sibling, 0 replies; 54+ messages in thread From: DarkNovaNick @ 2011-04-14 15:31 UTC (permalink / raw) To: Mike Snitzer; +Cc: LVM general discussion and development [-- Attachment #1: Type: text/plain, Size: 1772 bytes --] On Apr 13, 2011 5:40pm, Mike Snitzer <snitzer@redhat.com> wrote: > Unfortunately, the failed discard to the snapshot-origin device disables > ext4's discards for the entire volume (even after you remove the > snapshot). > But the following patch fixes that: > Subject: dm snapshot: add discard support to the snapshot-origin target > Allow the snapshot-origin target to pass discards to the origin device > (after any copyout of the region being discarded; but generally speaking > that copyout won't be needed as the region will have already been copied > to the cow). > So for example, when you remove a file from an ext4 mounted origin > volume it triggers copyout to the snapshot. When the DISCARD is then > issued during the subsequent journal commit no copyout is needed. > Signed-off-by: Mike Snitzer snitzer@redhat.com> > --- > drivers/md/dm-snap.c | 3 ++- > 1 files changed, 2 insertions(+), 1 deletions(-) > diff --git a/drivers/md/dm-snap.cb/drivers/md/dm-snap.c > index a2d3309..028d216 100644 > --- a/drivers/md/dm-snap.c > +++ b/drivers/md/dm-snap.c > @@ -2076,6 +2076,7 @@ static int origin_ctr(struct dm_target *ti, > unsigned int argc, char **argv) > ti->private = dev; > ti->num_flush_requests = 1; > + ti->num_discard_requests = 1; > return 0; > } > @@ -2153,7 +2154,7 @@ static int origin_iterate_devices(struct dm_target > *ti, > static struct target_type origin_target = { > .name = "snapshot-origin", > - .version = {1, 7, 1}, > + .version = {1, 8, 0}, > .module = THIS_MODULE, > .ctr = origin_ctr, > .dtr = origin_dtr, Thanks, I will put that patch in my kernel. Do you anticipate this change making its way into the main kernel source at some point in the future? Thanks again, Nick [-- Attachment #2: Type: text/html, Size: 2717 bytes --] ^ permalink raw reply [flat|nested] 54+ messages in thread
end of thread, other threads:[~2011-05-18 12:52 UTC | newest] Thread overview: 54+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-04-12 14:59 [linux-lvm] Testing TRIM with LVM DarkNovaNick 2011-04-12 23:47 ` Mike Snitzer 2011-04-13 1:47 ` DarkNovaNick 2011-04-13 8:41 ` Zdenek Kabelac 2011-04-13 15:38 ` DarkNovaNick 2011-04-13 22:40 ` [linux-lvm] [PATCH] dm snapshot: add discard support to the snapshot-origin target [was: Re: Testing TRIM with LVM] Mike Snitzer 2011-04-13 23:48 ` [linux-lvm] " Mike Snitzer 2011-04-26 17:32 ` Mike Snitzer 2011-04-28 0:19 ` [linux-lvm] [PATCH] dm snapshot: ignore discards issued to the snapshot-origin target Mike Snitzer 2011-04-28 7:53 ` [linux-lvm] [dm-devel] " Christoph Hellwig 2011-04-28 20:59 ` [linux-lvm] do not disable ext4 discards on first discard failure? [was: Re: dm snapshot: ignore discards issued to the snapshot-origin target] Mike Snitzer 2011-04-28 21:28 ` Eric Sandeen 2011-04-28 22:59 ` Alasdair G Kergon 2011-04-28 23:01 ` Eric Sandeen 2011-04-28 23:11 ` Alasdair G Kergon 2011-04-29 1:12 ` Andreas Dilger 2011-04-29 13:55 ` Mike Snitzer 2011-04-29 9:30 ` Lukas Czerner 2011-04-29 12:24 ` [linux-lvm] [dm-devel] " Alasdair G Kergon 2011-04-29 12:29 ` Christoph Hellwig 2011-04-29 14:28 ` Eric Sandeen 2011-04-29 15:13 ` Ray Morris 2011-05-04 16:33 ` Ted Ts'o 2011-05-04 17:02 ` Lukas Czerner 2011-05-02 7:16 ` Lukas Czerner 2011-05-02 8:13 ` Alasdair G Kergon 2011-05-02 8:19 ` Christoph Hellwig 2011-05-02 10:24 ` Lukas Czerner 2011-05-02 12:48 ` [linux-lvm] " Mike Snitzer 2011-05-02 13:05 ` Lukas Czerner 2011-05-02 14:47 ` Eric Sandeen 2011-05-02 14:48 ` Christoph Hellwig 2011-05-02 14:58 ` Lukas Czerner 2011-05-02 13:48 ` [linux-lvm] [dm-devel] " Martin K. Petersen 2011-05-02 14:20 ` Martin K. Petersen 2011-05-02 14:39 ` Lukas Czerner 2011-05-02 14:50 ` Martin K. Petersen 2011-05-02 14:58 ` [linux-lvm] " Mike Snitzer 2011-05-02 16:58 ` [linux-lvm] [dm-devel] " Martin K. Petersen 2011-05-03 8:57 ` Lukas Czerner 2011-05-04 15:10 ` Martin K. Petersen 2011-05-04 16:02 ` [linux-lvm] " Mike Snitzer 2011-05-04 16:50 ` Martin K. Petersen 2011-05-04 18:03 ` Mike Snitzer 2011-05-04 17:10 ` [linux-lvm] [dm-devel] " Lukas Czerner 2011-05-04 17:32 ` Martin K. Petersen 2011-05-04 17:35 ` Lukas Czerner 2011-05-18 12:16 ` [linux-lvm] " Mike Snitzer 2011-05-18 12:52 ` Mike Snitzer 2011-05-04 15:16 ` [linux-lvm] [dm-devel] " Martin K. Petersen 2011-05-04 16:12 ` Lukas Czerner 2011-05-05 8:33 ` Karel Zak 2011-05-05 10:48 ` Lukas Czerner 2011-04-14 15:31 ` [linux-lvm] [PATCH] dm snapshot: add discard support to the snapshot-origin target [was: Re: Testing TRIM wi DarkNovaNick
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).