* [PATCH v2] zram: support REQ_DISCARD
@ 2014-02-26  5:23 Joonsoo Kim
  2014-02-26  8:07 ` Minchan Kim
  2014-02-26 13:16 ` Sergey Senozhatsky
  0 siblings, 2 replies; 9+ messages in thread
From: Joonsoo Kim @ 2014-02-26  5:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Minchan Kim, Nitin Gupta, linux-kernel, Sergey Senozhatsky,
	Jerome Marchand, Joonsoo Kim, Joonsoo Kim

zram is ram based block device and can be used by backend of filesystem.
When filesystem deletes a file, it normally doesn't do anything on data
block of that file. It just marks on metadata of that file. This behavior
has no problem on disk based block device, but has problems on ram based
block device, since we can't free memory used for data block. To overcome
this disadvantage, there is REQ_DISCARD functionality. If block device
support REQ_DISCARD and filesystem is mounted with discard option,
filesystem sends REQ_DISCARD to block device whenever some data blocks are
discarded. All we have to do is to handle this request.

This patch implements to flag up QUEUE_FLAG_DISCARD and handle this
REQ_DISCARD request. With it, we can free memory used by zram if it isn't
used.

v2: handle unaligned case commented by Jerome

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 5ec61be..5364c1e 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -501,6 +501,36 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
 	return ret;
 }
 
+static void zram_bio_discard(struct zram *zram, struct bio *bio)
+{
+	u32 index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
+	size_t n = bio->bi_iter.bi_size;
+	size_t misalign;
+
+	/*
+	 * On some arch, logical block (4096) aligned request couldn't be
+	 * aligned to PAGE_SIZE, since their PAGE_SIZE aren't 4096.
+	 * Therefore we should handle this misaligned case here.
+	 */
+	misalign = (bio->bi_iter.bi_sector &
+			(SECTORS_PER_PAGE - 1)) << SECTOR_SHIFT;
+	if (misalign) {
+		if (n < misalign)
+			return;
+
+		n -= misalign;
+		index++;
+	}
+
+	while (n >= PAGE_SIZE) {
+		write_lock(&zram->meta->tb_lock);
+		zram_free_page(zram, index);
+		write_unlock(&zram->meta->tb_lock);
+		index++;
+		n -= PAGE_SIZE;
+	}
+}
+
 static void zram_reset_device(struct zram *zram, bool reset_capacity)
 {
 	size_t index;
@@ -618,6 +648,12 @@ static void __zram_make_request(struct zram *zram, struct bio *bio)
 	struct bio_vec bvec;
 	struct bvec_iter iter;
 
+	if (unlikely(bio->bi_rw & REQ_DISCARD)) {
+		zram_bio_discard(zram, bio);
+		bio_endio(bio, 0);
+		return;
+	}
+
 	index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
 	offset = (bio->bi_iter.bi_sector &
 		  (SECTORS_PER_PAGE - 1)) << SECTOR_SHIFT;
@@ -784,6 +820,10 @@ static int create_device(struct zram *zram, int device_id)
 					ZRAM_LOGICAL_BLOCK_SIZE);
 	blk_queue_io_min(zram->disk->queue, PAGE_SIZE);
 	blk_queue_io_opt(zram->disk->queue, PAGE_SIZE);
+	zram->disk->queue->limits.discard_granularity = PAGE_SIZE;
+	zram->disk->queue->limits.max_discard_sectors = UINT_MAX;
+	zram->disk->queue->limits.discard_zeroes_data = 1;
+	queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, zram->disk->queue);
 
 	add_disk(zram->disk);
 
-- 
1.7.9.5

^ permalink raw reply related [flat|nested] 9+ messages in thread
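The alignment handling in zram_bio_discard() can be tried out in userspace.
What follows is a minimal sketch, assuming 512-byte sectors and a
hypothetical 64 KiB page size. discard_whole_pages(), struct discard_range
and the SIM_* macros are illustrative names, not kernel symbols. One
deliberate difference from the posted v2 code: the sketch skips
(PAGE_SIZE - offset) bytes to reach the next page boundary, whereas v2
subtracts the in-page offset itself. That is only this sketch's reading of
the arithmetic; nobody in the thread raises it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for illustration: 512-byte sectors and a 64 KiB page size. */
#define SIM_SECTOR_SHIFT		9
#define SIM_PAGE_SHIFT			16
#define SIM_PAGE_SIZE			((size_t)1 << SIM_PAGE_SHIFT)
#define SIM_SECTORS_PER_PAGE_SHIFT	(SIM_PAGE_SHIFT - SIM_SECTOR_SHIFT)
#define SIM_SECTORS_PER_PAGE		((uint64_t)1 << SIM_SECTORS_PER_PAGE_SHIFT)

/* Hypothetical result type: the run of whole pages a discard may free. */
struct discard_range {
	uint32_t first_index;	/* index of the first fully covered page */
	uint32_t nr_pages;	/* number of fully covered pages */
};

/*
 * Compute which whole pages lie inside a discard that starts at 'sector'
 * and is 'len' bytes long. Partially covered pages at either end are
 * left alone, because they may still hold live data.
 */
static struct discard_range discard_whole_pages(uint64_t sector, size_t len)
{
	struct discard_range r = { 0, 0 };
	uint32_t index = (uint32_t)(sector >> SIM_SECTORS_PER_PAGE_SHIFT);
	size_t offset = (size_t)(sector & (SIM_SECTORS_PER_PAGE - 1))
				<< SIM_SECTOR_SHIFT;

	if (offset) {
		/* Bytes between the start of the request and the next page. */
		size_t head = SIM_PAGE_SIZE - offset;

		if (len <= head)
			return r;	/* request ends inside the first page */
		len -= head;
		index++;
	}

	r.first_index = index;
	r.nr_pages = (uint32_t)(len / SIM_PAGE_SIZE);
	return r;
}

/* A few hand-computed cases. */
static void discard_whole_pages_selfcheck(void)
{
	struct discard_range r;

	/* Aligned request covering exactly three pages. */
	r = discard_whole_pages(0, 3 * SIM_PAGE_SIZE);
	assert(r.first_index == 0 && r.nr_pages == 3);

	/* Bytes [4096, 73728): neither page 0 nor page 1 is fully covered. */
	r = discard_whole_pages(8, 4096 + SIM_PAGE_SIZE);
	assert(r.nr_pages == 0);
}
```

Calling discard_whole_pages_selfcheck() from a test program exercises both
cases. The second case is the interesting one: a request that starts 4096
bytes into a page and runs for one page plus 4096 bytes must not free
anything.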
* Re: [PATCH v2] zram: support REQ_DISCARD
  2014-02-26  5:23 [PATCH v2] zram: support REQ_DISCARD Joonsoo Kim
@ 2014-02-26  8:07 ` Minchan Kim
  2014-02-28 15:24   ` Joonsoo Kim
  2014-02-26 13:16 ` Sergey Senozhatsky
  1 sibling, 1 reply; 9+ messages in thread
From: Minchan Kim @ 2014-02-26  8:07 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Nitin Gupta, linux-kernel, Sergey Senozhatsky,
	Jerome Marchand, Joonsoo Kim

Hi Joonsoo,

On Wed, Feb 26, 2014 at 02:23:15PM +0900, Joonsoo Kim wrote:
[...]
> +static void zram_bio_discard(struct zram *zram, struct bio *bio)
> +{
> +	u32 index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
> +	size_t n = bio->bi_iter.bi_size;

Nitpick:
Please use a more meaningful name (e.g. len) rather than 'n'.

> +	size_t misalign;
> +
> +	/*
> +	 * On some arch, logical block (4096) aligned request couldn't be
> +	 * aligned to PAGE_SIZE, since their PAGE_SIZE aren't 4096.
> +	 * Therefore we should handle this misaligned case here.
> +	 */
> +	misalign = (bio->bi_iter.bi_sector &
> +			(SECTORS_PER_PAGE - 1)) << SECTOR_SHIFT;
> +	if (misalign) {
> +		if (n < misalign)
> +			return;
> +
> +		n -= misalign;
> +		index++;
> +	}
> +
> +	while (n >= PAGE_SIZE) {
> +		write_lock(&zram->meta->tb_lock);
> +		zram_free_page(zram, index);
> +		write_unlock(&zram->meta->tb_lock);
> +		index++;
> +		n -= PAGE_SIZE;
> +	}
> +}
[...]
> +	zram->disk->queue->limits.discard_granularity = PAGE_SIZE;
> +	zram->disk->queue->limits.max_discard_sectors = UINT_MAX;
> +	zram->disk->queue->limits.discard_zeroes_data = 1;

I don't know what discard_zeroes_data means. It seems we should make
sure zram returns zero pages for a discarded block on the next read,
but a problem could happen if you bail out of the discard logic due to
misalignment while the caller thinks the discard was successful.

What happens in this case?

> +	queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, zram->disk->queue);
>
>  	add_disk(zram->disk);
[...]

-- 
Kind regards,
Minchan Kim

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] zram: support REQ_DISCARD
  2014-02-26  8:07 ` Minchan Kim
@ 2014-02-28 15:24   ` Joonsoo Kim
  2014-03-06  8:24     ` Minchan Kim
  0 siblings, 1 reply; 9+ messages in thread
From: Joonsoo Kim @ 2014-02-28 15:24 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Joonsoo Kim, Andrew Morton, Nitin Gupta, LKML, Sergey Senozhatsky,
	Jerome Marchand

2014-02-26 17:07 GMT+09:00 Minchan Kim <minchan@kernel.org>:
> Hi Joonsoo,
>
> On Wed, Feb 26, 2014 at 02:23:15PM +0900, Joonsoo Kim wrote:
[...]
>> +	u32 index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
>> +	size_t n = bio->bi_iter.bi_size;
>
> Nitpick:
> Please use more meaningful name(ex, len) rather than 'n'.
>

Hello, Minchan.
Will do.

[...]
>> +	zram->disk->queue->limits.discard_granularity = PAGE_SIZE;
>> +	zram->disk->queue->limits.max_discard_sectors = UINT_MAX;
>> +	zram->disk->queue->limits.discard_zeroes_data = 1;
>
> I don't know what discard_zeroes_data does mean. It seems we should
> make sure zram should return zero pages for discarded block on next
> time but problem could happen if you bail out in discard logic
> due to misalign but caller seem to know it was successful?
>
> What happens in this case?
>

Yes, this would cause the problem you describe.
I will change it as follows.

	if (PAGE_SIZE == ZRAM_LOGICAL_BLOCK_SIZE)
		zram->disk->queue->limits.discard_zeroes_data = 1;
	else
		zram->disk->queue->limits.discard_zeroes_data = 0;

Does it work for you?

Thanks.

^ permalink raw reply [flat|nested] 9+ messages in thread
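To see why discard_zeroes_data should only be advertised when PAGE_SIZE
equals the logical block size, here is a small userspace model. It is a
sketch under stated assumptions, not zram code: the 16 KiB page size, the
sim_* names and the in-memory "device" are all made up for illustration,
and a freed page is modelled simply as a page that reads back as zeroes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed values: a page larger than the 4096-byte logical block. */
#define SIM_PAGE_SIZE		16384u
#define SIM_LOGICAL_BLOCK	4096u
#define SIM_NR_PAGES		4u

/* Hypothetical stand-in for the device contents. */
struct sim_dev {
	unsigned char data[SIM_NR_PAGES][SIM_PAGE_SIZE];
};

/* Free (here: zero) only the pages fully covered by [start, start + len). */
static void sim_discard(struct sim_dev *dev, size_t start, size_t len)
{
	size_t first = (start + SIM_PAGE_SIZE - 1) / SIM_PAGE_SIZE; /* round up */
	size_t last = (start + len) / SIM_PAGE_SIZE;		     /* round down */

	for (size_t i = first; i < last && i < SIM_NR_PAGES; i++)
		memset(dev->data[i], 0, SIM_PAGE_SIZE);
}

/* Return 1 if every byte in [start, start + len) reads back as zero. */
static int sim_reads_zero(const struct sim_dev *dev, size_t start, size_t len)
{
	for (size_t i = start; i < start + len; i++)
		if (dev->data[i / SIM_PAGE_SIZE][i % SIM_PAGE_SIZE])
			return 0;
	return 1;
}

/* The rule proposed in the reply: promise zeroes only if sizes match. */
static int sim_discard_zeroes_data(size_t page_size, size_t logical_block)
{
	return page_size == logical_block;
}

static void sim_selfcheck(void)
{
	static struct sim_dev dev;

	memset(dev.data, 0xAA, sizeof(dev.data));

	/* One logical block inside page 0: nothing can be freed. */
	sim_discard(&dev, SIM_LOGICAL_BLOCK, SIM_LOGICAL_BLOCK);
	assert(!sim_reads_zero(&dev, SIM_LOGICAL_BLOCK, SIM_LOGICAL_BLOCK));

	/* Exactly page 1: freed, and reads back as zeroes. */
	sim_discard(&dev, SIM_PAGE_SIZE, SIM_PAGE_SIZE);
	assert(sim_reads_zero(&dev, SIM_PAGE_SIZE, SIM_PAGE_SIZE));
}
```

In the first case the discard is accepted but the old bytes are still
readable, which is exactly the situation where a device must not claim
that discarded blocks read back as zeroes.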
* Re: [PATCH v2] zram: support REQ_DISCARD
  2014-02-28 15:24   ` Joonsoo Kim
@ 2014-03-06  8:24     ` Minchan Kim
  0 siblings, 0 replies; 9+ messages in thread
From: Minchan Kim @ 2014-03-06  8:24 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Joonsoo Kim, Andrew Morton, Nitin Gupta, LKML, Sergey Senozhatsky,
	Jerome Marchand

On Sat, Mar 01, 2014 at 12:24:37AM +0900, Joonsoo Kim wrote:
[...]
> >> +	zram->disk->queue->limits.discard_zeroes_data = 1;
> >
> > I don't know what discard_zeroes_data does mean. It seems we should
> > make sure zram should return zero pages for discarded block on next
> > time but problem could happen if you bail out in discard logic
> > due to misalign but caller seem to know it was successful?
> >
> > What happens in this case?
> >
>
> This will result in the problem what you think about.
> I will change it like as following.
>
> 	if (PAGE_SIZE == ZRAM_LOGICAL_BLOCK_SIZE)
> 		zram->disk->queue->limits.discard_zeroes_data = 1;
> 	else
> 		zram->disk->queue->limits.discard_zeroes_data = 0;
>
> Does It work for you?

Yep, please resend.

>
> Thanks.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] zram: support REQ_DISCARD
  2014-02-26  5:23 [PATCH v2] zram: support REQ_DISCARD Joonsoo Kim
  2014-02-26  8:07 ` Minchan Kim
@ 2014-02-26 13:16 ` Sergey Senozhatsky
  2014-02-26 13:44   ` Jerome Marchand
  1 sibling, 1 reply; 9+ messages in thread
From: Sergey Senozhatsky @ 2014-02-26 13:16 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Minchan Kim, Nitin Gupta, linux-kernel,
	Jerome Marchand, Joonsoo Kim

Hello,

On (02/26/14 14:23), Joonsoo Kim wrote:
[...]
> +static void zram_bio_discard(struct zram *zram, struct bio *bio)
> +{
> +	u32 index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
> +	size_t n = bio->bi_iter.bi_size;
> +	size_t misalign;
> +
> +	/*
> +	 * On some arch, logical block (4096) aligned request couldn't be
> +	 * aligned to PAGE_SIZE, since their PAGE_SIZE aren't 4096.
> +	 * Therefore we should handle this misaligned case here.
> +	 */
> +	misalign = (bio->bi_iter.bi_sector &
> +			(SECTORS_PER_PAGE - 1)) << SECTOR_SHIFT;
> +	if (misalign) {
> +		if (n < misalign)
> +			return;
> +
> +		n -= misalign;
> +		index++;
> +	}
> +
> +	while (n >= PAGE_SIZE) {
> +		write_lock(&zram->meta->tb_lock);
> +		zram_free_page(zram, index);
> +		write_unlock(&zram->meta->tb_lock);
> +		index++;
> +		n -= PAGE_SIZE;
> +	}
> +}
> +

A side note: do we need the zram_bio_discard() function? I mean, can we
handle the discard request in zram_bvec_rw(), where we already know index,
etc. (passed from __zram_make_request())?

For example:

@@ -510,6 +510,11 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
 		ret = zram_bvec_write(zram, bvec, index, offset);
 	}
 
+	if (unlikely(bio->bi_rw & REQ_DISCARD)) {
+		write_lock(&zram->meta->tb_lock);
+		zram_free_page(zram, index);
+		write_unlock(&zram->meta->tb_lock);
+	}
 	return ret;
 }

	-ss

[...]

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] zram: support REQ_DISCARD
  2014-02-26 13:16 ` Sergey Senozhatsky
@ 2014-02-26 13:44   ` Jerome Marchand
  2014-02-26 13:57     ` Sergey Senozhatsky
  0 siblings, 1 reply; 9+ messages in thread
From: Jerome Marchand @ 2014-02-26 13:44 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Joonsoo Kim, Andrew Morton, Minchan Kim, Nitin Gupta,
	linux-kernel, Joonsoo Kim

On 02/26/2014 02:16 PM, Sergey Senozhatsky wrote:
[...]
> a side note, do we need zram_bio_discard() function? I mean, can we handle
> discard request in zram_bvec_rw(), where we already know index, etc. (passed
> from __zram_make_request())?
>

We'd still have to make sure not to discard pages that are still partially
used, but it might simplify the code: __zram_make_request() already takes
care of splitting the request.

> for example:
>
> @@ -510,6 +510,11 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
>  		ret = zram_bvec_write(zram, bvec, index, offset);
>  	}
>
> +	if (unlikely(bio->bi_rw & REQ_DISCARD)) {

+		if (!is_partial_io(bvec)) {

> +		write_lock(&zram->meta->tb_lock);
> +		zram_free_page(zram, index);
> +		write_unlock(&zram->meta->tb_lock);

+		}

Also, this code might still call zram_bvec_read() and increase num_reads
for a discard request: I guess bio_data_dir(bio) == READ == 0 in this case.

Btw, why does __zram_make_request() have that rw argument? All the
information it needs is already passed by the bio argument. I seem to
recall seeing a cleanup patch that gets rid of it, or is it just my
imagination?

Jerome

> +	}
>  	return ret;
>  }
>
> -ss
[...]

^ permalink raw reply [flat|nested] 9+ messages in thread
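The two points raised in this reply (skip partial segments, and do not let
a discard fall through to the read path) can be modelled in userspace. This
is a minimal sketch, assuming a 4096-byte page; struct sim_bvec,
struct sim_stats and the sim_* functions are hypothetical stand-ins and not
the kernel's types. sim_is_partial_io() follows the idea of is_partial_io()
by comparing the segment length with the page size, which is an assumption
about what that helper checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_PAGE_SIZE 4096u	/* assumed page size for this model */

/* Stand-in for struct bio_vec; only the segment length matters here. */
struct sim_bvec {
	size_t bv_len;
};

/* Hypothetical counters, loosely modelled on zram's statistics. */
struct sim_stats {
	unsigned int num_reads;
	unsigned int num_writes;
	unsigned int pages_freed;
};

/* A discard carries no data direction of its own, so it looks like READ. */
enum sim_dir { SIM_READ = 0, SIM_WRITE = 1 };

static bool sim_is_partial_io(const struct sim_bvec *bvec)
{
	return bvec->bv_len != SIM_PAGE_SIZE;
}

/*
 * Per-segment dispatch: the discard check comes first, so a discard
 * never reaches the read branch and never bumps num_reads. Only
 * segments that cover a whole page are freed.
 */
static void sim_bvec_rw(struct sim_stats *st, const struct sim_bvec *bvec,
			enum sim_dir dir, bool discard)
{
	if (discard) {
		if (!sim_is_partial_io(bvec))
			st->pages_freed++;
		return;
	}

	if (dir == SIM_READ)
		st->num_reads++;
	else
		st->num_writes++;
}

static void sim_bvec_rw_selfcheck(void)
{
	struct sim_stats st = { 0, 0, 0 };
	struct sim_bvec full = { SIM_PAGE_SIZE };
	struct sim_bvec part = { 512 };

	sim_bvec_rw(&st, &full, SIM_READ, true);	/* freed */
	sim_bvec_rw(&st, &part, SIM_READ, true);	/* partial: kept */
	assert(st.pages_freed == 1 && st.num_reads == 0);
}
```

Checking for discard before the read/write dispatch is a design choice of
this sketch; the hunk quoted above places the check after the dispatch,
which is where the num_reads concern comes from.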
* Re: [PATCH v2] zram: support REQ_DISCARD
  2014-02-26 13:44   ` Jerome Marchand
@ 2014-02-26 13:57     ` Sergey Senozhatsky
  2014-02-26 14:06       ` Jerome Marchand
  0 siblings, 1 reply; 9+ messages in thread
From: Sergey Senozhatsky @ 2014-02-26 13:57 UTC (permalink / raw)
  To: Jerome Marchand
  Cc: Sergey Senozhatsky, Joonsoo Kim, Andrew Morton, Minchan Kim,
	Nitin Gupta, linux-kernel, Joonsoo Kim

On (02/26/14 14:44), Jerome Marchand wrote:
[...]
> Also this code might still call zram_bvec_read() and increase num_reads
> for discard request: I guess bio_data_dir(bio) == READ == 0 in this case.
>
> Btw, why __zram_make_request() has an that rw argument? All the information
> it needs is passed by the bio argument already. I kind of recollect to have
> seen a cleanup patch that get rid of it or is it just my imagination?
>

It doesn't. The cleanup patch 'do not pass rw argument to
__zram_make_request()' is in linux-next.

	-ss

[...]

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] zram: support REQ_DISCARD
From: Jerome Marchand @ 2014-02-26 14:06 UTC
To: Sergey Senozhatsky
Cc: Joonsoo Kim, Andrew Morton, Minchan Kim, Nitin Gupta, linux-kernel, Joonsoo Kim

On 02/26/2014 02:57 PM, Sergey Senozhatsky wrote:
> On (02/26/14 14:44), Jerome Marchand wrote:
>> On 02/26/2014 02:16 PM, Sergey Senozhatsky wrote:
>>> Hello,
>>>
>>> On (02/26/14 14:23), Joonsoo Kim wrote:
>>>> zram is ram based block device and can be used by backend of filesystem.
>>>> When filesystem deletes a file, it normally doesn't do anything on data
>>>> block of that file. It just marks on metadata of that file. This behavior
>>>> has no problem on disk based block device, but has problems on ram based
>>>> block device, since we can't free memory used for data block. To overcome
>>>> this disadvantage, there is REQ_DISCARD functionality. If block device
>>>> support REQ_DISCARD and filesystem is mounted with discard option,
>>>> filesystem sends REQ_DISCARD to block device whenever some data blocks are
>>>> discarded. All we have to do is to handle this request.
>>>>
>>>> This patch implements to flag up QUEUE_FLAG_DISCARD and handle this
>>>> REQ_DISCARD request. With it, we can free memory used by zram if it isn't
>>>> used.
>>>>
>>>> v2: handle unaligned case commented by Jerome
>>>>
>>>> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>>>>
>>>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>>>> index 5ec61be..5364c1e 100644
>>>> --- a/drivers/block/zram/zram_drv.c
>>>> +++ b/drivers/block/zram/zram_drv.c
>>>> @@ -501,6 +501,36 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
>>>>  	return ret;
>>>>  }
>>>>
>>>> +static void zram_bio_discard(struct zram *zram, struct bio *bio)
>>>> +{
>>>> +	u32 index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
>>>> +	size_t n = bio->bi_iter.bi_size;
>>>> +	size_t misalign;
>>>> +
>>>> +	/*
>>>> +	 * On some arch, logical block (4096) aligned request couldn't be
>>>> +	 * aligned to PAGE_SIZE, since their PAGE_SIZE aren't 4096.
>>>> +	 * Therefore we should handle this misaligned case here.
>>>> +	 */
>>>> +	misalign = (bio->bi_iter.bi_sector &
>>>> +			(SECTORS_PER_PAGE - 1)) << SECTOR_SHIFT;
>>>> +	if (misalign) {
>>>> +		if (n < misalign)
>>>> +			return;
>>>> +
>>>> +		n -= misalign;
>>>> +		index++;
>>>> +	}
>>>> +
>>>> +	while (n >= PAGE_SIZE) {
>>>> +		write_lock(&zram->meta->tb_lock);
>>>> +		zram_free_page(zram, index);
>>>> +		write_unlock(&zram->meta->tb_lock);
>>>> +		index++;
>>>> +		n -= PAGE_SIZE;
>>>> +	}
>>>> +}
>>>> +
>>>
>>> a side note, do we need zram_bio_discard() function? I mean, can we handle
>>> discard request in zram_bvec_rw(), where we already know index, etc. (passed
>>> from __zram_make_request())?
>>>
>>
>> We'd still have to make sure not to discard pages that are still partially
>> used, but it might simplify the code: __zram_make_request() already takes
>> care of splitting the request.
>>
>>> for example:
>>>
>>> @@ -510,6 +510,11 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
>>>  		ret = zram_bvec_write(zram, bvec, index, offset);
>>>  	}
>>>
>>> +	if (unlikely(bio->bi_rw & REQ_DISCARD)) {
>>
>> +		if (!is_partial_io(bvec)) {
>>
>>> +		write_lock(&zram->meta->tb_lock);
>>> +		zram_free_page(zram, index);
>>> +		write_unlock(&zram->meta->tb_lock);
>>
>> +		}
>>
>> Also this code might still call zram_bvec_read() and increase num_reads
>> for discard request: I guess bio_data_dir(bio) == READ == 0 in this case.
>>
>> Btw, why __zram_make_request() has an that rw argument? All the information
>> it needs is passed by the bio argument already. I kind of recollect to have
>> seen a cleanup patch that get rid of it or is it just my imagination?
>>
>
> it doesn't. cleanup patch 'do not pass rw argument to __zram_make_request()'
> is in linux-next.
>

You're right. I must be blind, since there is an example of
__zram_make_request() without this argument just a few lines below.

> -ss
>
>> Jerome
>>
>>> +	}
>>>  	return ret;
>>>  }
>>>
>>> -ss
>>>
>>>>  static void zram_reset_device(struct zram *zram, bool reset_capacity)
>>>>  {
>>>>  	size_t index;
>>>> @@ -618,6 +648,12 @@ static void __zram_make_request(struct zram *zram, struct bio *bio)

I need to open my eyes.

Jerome

>>>>  	struct bio_vec bvec;
>>>>  	struct bvec_iter iter;
>>>>
>>>> +	if (unlikely(bio->bi_rw & REQ_DISCARD)) {
>>>> +		zram_bio_discard(zram, bio);
>>>> +		bio_endio(bio, 0);
>>>> +		return;
>>>> +	}
>>>> +
>>>>  	index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
>>>>  	offset = (bio->bi_iter.bi_sector &
>>>>  		  (SECTORS_PER_PAGE - 1)) << SECTOR_SHIFT;
>>>> @@ -784,6 +820,10 @@ static int create_device(struct zram *zram, int device_id)
>>>>  					ZRAM_LOGICAL_BLOCK_SIZE);
>>>>  	blk_queue_io_min(zram->disk->queue, PAGE_SIZE);
>>>>  	blk_queue_io_opt(zram->disk->queue, PAGE_SIZE);
>>>> +	zram->disk->queue->limits.discard_granularity = PAGE_SIZE;
>>>> +	zram->disk->queue->limits.max_discard_sectors = UINT_MAX;
>>>> +	zram->disk->queue->limits.discard_zeroes_data = 1;
>>>> +	queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, zram->disk->queue);
>>>>
>>>>  	add_disk(zram->disk);
>>>>
>>>> --
>>>> 1.7.9.5
>>>>
>>
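
The "do not discard partially used pages" point can be illustrated outside the kernel. The following is a minimal userspace sketch, not driver code: it walks a discard range one page slot at a time, roughly the way a per-segment handler would see it, and counts only the slots that are covered completely. The names `struct segment`, `is_partial()` and `count_full_segments()` are hypothetical, and the page size is passed as a parameter so that the non-4096 case is easy to try.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors */

/* Hypothetical model of one per-slot piece of a request. */
struct segment {
	uint32_t index;		/* page slot the piece falls into */
	size_t offset;		/* byte offset inside that slot */
	size_t len;		/* number of bytes in this piece */
};

/* A piece is partial when it does not cover its whole slot. */
static int is_partial(const struct segment *seg, size_t page_size)
{
	return seg->len != page_size;
}

/*
 * Split the byte range that starts at 'sector' and is 'nbytes' long
 * into per-slot pieces and count the slots that could be freed,
 * i.e. the ones that are not partial.
 */
static size_t count_full_segments(uint64_t sector, size_t nbytes,
				  size_t page_size)
{
	uint64_t pos = sector << SECTOR_SHIFT;
	uint64_t end = pos + nbytes;
	size_t count = 0;

	/* The model assumes a power-of-two page size. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	while (pos < end) {
		struct segment seg;

		seg.index = (uint32_t)(pos / page_size);
		seg.offset = (size_t)(pos % page_size);
		seg.len = page_size - seg.offset;
		if (seg.len > end - pos)
			seg.len = (size_t)(end - pos);

		if (!is_partial(&seg, page_size))
			count++;	/* the driver would free seg.index here */

		pos += seg.len;
	}

	return count;
}
```

With 64 KiB pages, a 4096-aligned range that starts 4 KiB into slot 0 and ends 4 KiB before the end of slot 1 covers no slot completely, so this model frees nothing for it.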
* Re: [PATCH v2] zram: support REQ_DISCARD
From: Joonsoo Kim @ 2014-02-28 15:20 UTC
To: Jerome Marchand
Cc: Sergey Senozhatsky, Joonsoo Kim, Andrew Morton, Minchan Kim, Nitin Gupta, LKML

2014-02-26 23:06 GMT+09:00 Jerome Marchand <jmarchan@redhat.com>:
> On 02/26/2014 02:57 PM, Sergey Senozhatsky wrote:
>> On (02/26/14 14:44), Jerome Marchand wrote:
>>> On 02/26/2014 02:16 PM, Sergey Senozhatsky wrote:
>>>> Hello,
>>>>
>>>> On (02/26/14 14:23), Joonsoo Kim wrote:
>>>>> zram is ram based block device and can be used by backend of filesystem.
>>>>> When filesystem deletes a file, it normally doesn't do anything on data
>>>>> block of that file. It just marks on metadata of that file. This behavior
>>>>> has no problem on disk based block device, but has problems on ram based
>>>>> block device, since we can't free memory used for data block. To overcome
>>>>> this disadvantage, there is REQ_DISCARD functionality. If block device
>>>>> support REQ_DISCARD and filesystem is mounted with discard option,
>>>>> filesystem sends REQ_DISCARD to block device whenever some data blocks are
>>>>> discarded. All we have to do is to handle this request.
>>>>>
>>>>> This patch implements to flag up QUEUE_FLAG_DISCARD and handle this
>>>>> REQ_DISCARD request. With it, we can free memory used by zram if it isn't
>>>>> used.
>>>>>
>>>>> v2: handle unaligned case commented by Jerome
>>>>>
>>>>> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>>>>>
>>>>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>>>>> index 5ec61be..5364c1e 100644
>>>>> --- a/drivers/block/zram/zram_drv.c
>>>>> +++ b/drivers/block/zram/zram_drv.c
>>>>> @@ -501,6 +501,36 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
>>>>>  	return ret;
>>>>>  }
>>>>>
>>>>> +static void zram_bio_discard(struct zram *zram, struct bio *bio)
>>>>> +{
>>>>> +	u32 index = bio->bi_iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
>>>>> +	size_t n = bio->bi_iter.bi_size;
>>>>> +	size_t misalign;
>>>>> +
>>>>> +	/*
>>>>> +	 * On some arch, logical block (4096) aligned request couldn't be
>>>>> +	 * aligned to PAGE_SIZE, since their PAGE_SIZE aren't 4096.
>>>>> +	 * Therefore we should handle this misaligned case here.
>>>>> +	 */
>>>>> +	misalign = (bio->bi_iter.bi_sector &
>>>>> +			(SECTORS_PER_PAGE - 1)) << SECTOR_SHIFT;
>>>>> +	if (misalign) {
>>>>> +		if (n < misalign)
>>>>> +			return;
>>>>> +
>>>>> +		n -= misalign;
>>>>> +		index++;
>>>>> +	}
>>>>> +
>>>>> +	while (n >= PAGE_SIZE) {
>>>>> +		write_lock(&zram->meta->tb_lock);
>>>>> +		zram_free_page(zram, index);
>>>>> +		write_unlock(&zram->meta->tb_lock);
>>>>> +		index++;
>>>>> +		n -= PAGE_SIZE;
>>>>> +	}
>>>>> +}
>>>>> +
>>>>
>>>> a side note, do we need zram_bio_discard() function? I mean, can we handle
>>>> discard request in zram_bvec_rw(), where we already know index, etc. (passed
>>>> from __zram_make_request())?
>>>>

Hello, Sergey.

Sorry for the late response.

I think that introducing a new function is the better idea, since a discard
request is significantly different from a rw request. First of all, it doesn't
use bvec, so the splitting code in __zram_make_request() would not work
properly for it. And zram_bvec_rw() is a bvec handler that deals with
PAGE_SIZE unit requests, which is not appropriate for a discard request.

But it is good to use the common index and offset, so I will move the
position of zram_bio_discard() down.

Thanks for the comment!
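
Since a discard carries only a start sector and a byte count, the work can be done with range arithmetic on the common index and offset instead of a per-bvec walk. The following is a minimal userspace sketch of that arithmetic, not the driver code: `struct discard_span` and `compute_discard_span()` are hypothetical names, and the page size is a parameter rather than the kernel's PAGE_SIZE. The sketch rounds an unaligned start up to the next slot boundary by skipping `page_size - offset` bytes; that is a choice of this model and not a transcription of the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors */

/* Hypothetical result type: the run of page slots that may be freed. */
struct discard_span {
	uint32_t first;		/* first fully covered slot */
	size_t count;		/* number of fully covered slots */
};

/*
 * Derive index and offset from the start sector once, then work out
 * which slots the range covers completely.
 */
static struct discard_span compute_discard_span(uint64_t sector,
						 size_t nbytes,
						 size_t page_size)
{
	struct discard_span span = { 0, 0 };
	uint64_t sectors_per_page = page_size >> SECTOR_SHIFT;
	uint32_t index;
	size_t offset;

	/* The model assumes a power-of-two page size of at least a sector. */
	assert(page_size >= 512 && (page_size & (page_size - 1)) == 0);

	index = (uint32_t)(sector / sectors_per_page);
	offset = (size_t)(sector & (sectors_per_page - 1)) << SECTOR_SHIFT;

	if (offset) {
		/* Bytes left in the partially covered first slot. */
		size_t head = page_size - offset;

		if (nbytes <= head)
			return span;	/* range ends inside the first slot */

		nbytes -= head;
		index++;
	}

	span.first = index;
	span.count = nbytes / page_size;	/* a partial tail is dropped */
	return span;
}
```

A caller would then free the slots `first` through `first + count - 1`, each under the table lock as in the patch; `first` is only meaningful when `count` is non-zero.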
end of thread, other threads:[~2014-03-06 8:24 UTC]

Thread overview: 9+ messages
2014-02-26 5:23 [PATCH v2] zram: support REQ_DISCARD Joonsoo Kim
2014-02-26 8:07 ` Minchan Kim
2014-02-28 15:24 ` Joonsoo Kim
2014-03-06 8:24 ` Minchan Kim
2014-02-26 13:16 ` Sergey Senozhatsky
2014-02-26 13:44 ` Jerome Marchand
2014-02-26 13:57 ` Sergey Senozhatsky
2014-02-26 14:06 ` Jerome Marchand
2014-02-28 15:20 ` Joonsoo Kim