All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch 0/8] dm-bufio patches
@ 2018-03-26 18:29 mpatocka
  2018-03-26 18:29 ` [patch 1/8] dm-bufio: move dm-bufio.h to include/linux/ mpatocka
                   ` (8 more replies)
  0 siblings, 9 replies; 12+ messages in thread
From: mpatocka @ 2018-03-26 18:29 UTC (permalink / raw)
  To: mpatocka, msnitzer, agk; +Cc: dm-devel

Hi

Here I'm sending the second version of dm-bufio patches for arbitrary
sector size.

I dropped the patch that removes alloc_pages because it turns out that
alloc_pages is faster than the slab allocator.

I also added a patch that removes struct bio from struct dm_buffer
(reducing the size of dm_buffer to 1/3) and allocates the bio only when
doing read or write.

Mikulas

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [patch 1/8] dm-bufio: move dm-bufio.h to include/linux/
  2018-03-26 18:29 [patch 0/8] dm-bufio patches mpatocka
@ 2018-03-26 18:29 ` mpatocka
  2018-03-26 18:29 ` [patch 2/8] dm-bufio: get rid of slab cache name allocations mpatocka
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: mpatocka @ 2018-03-26 18:29 UTC (permalink / raw)
  To: mpatocka, msnitzer, agk; +Cc: dm-devel

[-- Attachment #1: dm-bufio-export.patch --]
[-- Type: text/plain, Size: 12965 bytes --]

This patch moves dm-bufio to include/linux/, so that external modules can
use it. (it is needed for the VDO team)

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-bufio.c                         |    2 
 drivers/md/dm-bufio.h                         |  148 --------------------------
 drivers/md/dm-integrity.c                     |    2 
 drivers/md/dm-snap-persistent.c               |    2 
 drivers/md/dm-verity.h                        |    2 
 drivers/md/persistent-data/dm-block-manager.c |    2 
 include/linux/dm-bufio.h                      |  148 ++++++++++++++++++++++++++
 7 files changed, 153 insertions(+), 153 deletions(-)

Index: linux-2.6/drivers/md/dm-bufio.h
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.h	2017-09-22 14:31:41.159418000 +0200
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,148 +0,0 @@
-/*
- * Copyright (C) 2009-2011 Red Hat, Inc.
- *
- * Author: Mikulas Patocka <mpatocka@redhat.com>
- *
- * This file is released under the GPL.
- */
-
-#ifndef DM_BUFIO_H
-#define DM_BUFIO_H
-
-#include <linux/blkdev.h>
-#include <linux/types.h>
-
-/*----------------------------------------------------------------*/
-
-struct dm_bufio_client;
-struct dm_buffer;
-
-/*
- * Create a buffered IO cache on a given device
- */
-struct dm_bufio_client *
-dm_bufio_client_create(struct block_device *bdev, unsigned block_size,
-		       unsigned reserved_buffers, unsigned aux_size,
-		       void (*alloc_callback)(struct dm_buffer *),
-		       void (*write_callback)(struct dm_buffer *));
-
-/*
- * Release a buffered IO cache.
- */
-void dm_bufio_client_destroy(struct dm_bufio_client *c);
-
-/*
- * Set the sector range.
- * When this function is called, there must be no I/O in progress on the bufio
- * client.
- */
-void dm_bufio_set_sector_offset(struct dm_bufio_client *c, sector_t start);
-
-/*
- * WARNING: to avoid deadlocks, these conditions are observed:
- *
- * - At most one thread can hold at most "reserved_buffers" simultaneously.
- * - Each other threads can hold at most one buffer.
- * - Threads which call only dm_bufio_get can hold unlimited number of
- *   buffers.
- */
-
-/*
- * Read a given block from disk. Returns pointer to data.  Returns a
- * pointer to dm_buffer that can be used to release the buffer or to make
- * it dirty.
- */
-void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
-		    struct dm_buffer **bp);
-
-/*
- * Like dm_bufio_read, but return buffer from cache, don't read
- * it. If the buffer is not in the cache, return NULL.
- */
-void *dm_bufio_get(struct dm_bufio_client *c, sector_t block,
-		   struct dm_buffer **bp);
-
-/*
- * Like dm_bufio_read, but don't read anything from the disk.  It is
- * expected that the caller initializes the buffer and marks it dirty.
- */
-void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
-		   struct dm_buffer **bp);
-
-/*
- * Prefetch the specified blocks to the cache.
- * The function starts to read the blocks and returns without waiting for
- * I/O to finish.
- */
-void dm_bufio_prefetch(struct dm_bufio_client *c,
-		       sector_t block, unsigned n_blocks);
-
-/*
- * Release a reference obtained with dm_bufio_{read,get,new}. The data
- * pointer and dm_buffer pointer is no longer valid after this call.
- */
-void dm_bufio_release(struct dm_buffer *b);
-
-/*
- * Mark a buffer dirty. It should be called after the buffer is modified.
- *
- * In case of memory pressure, the buffer may be written after
- * dm_bufio_mark_buffer_dirty, but before dm_bufio_write_dirty_buffers.  So
- * dm_bufio_write_dirty_buffers guarantees that the buffer is on-disk but
- * the actual writing may occur earlier.
- */
-void dm_bufio_mark_buffer_dirty(struct dm_buffer *b);
-
-/*
- * Mark a part of the buffer dirty.
- *
- * The specified part of the buffer is scheduled to be written. dm-bufio may
- * write the specified part of the buffer or it may write a larger superset.
- */
-void dm_bufio_mark_partial_buffer_dirty(struct dm_buffer *b,
-					unsigned start, unsigned end);
-
-/*
- * Initiate writing of dirty buffers, without waiting for completion.
- */
-void dm_bufio_write_dirty_buffers_async(struct dm_bufio_client *c);
-
-/*
- * Write all dirty buffers. Guarantees that all dirty buffers created prior
- * to this call are on disk when this call exits.
- */
-int dm_bufio_write_dirty_buffers(struct dm_bufio_client *c);
-
-/*
- * Send an empty write barrier to the device to flush hardware disk cache.
- */
-int dm_bufio_issue_flush(struct dm_bufio_client *c);
-
-/*
- * Like dm_bufio_release but also move the buffer to the new
- * block. dm_bufio_write_dirty_buffers is needed to commit the new block.
- */
-void dm_bufio_release_move(struct dm_buffer *b, sector_t new_block);
-
-/*
- * Free the given buffer.
- * This is just a hint, if the buffer is in use or dirty, this function
- * does nothing.
- */
-void dm_bufio_forget(struct dm_bufio_client *c, sector_t block);
-
-/*
- * Set the minimum number of buffers before cleanup happens.
- */
-void dm_bufio_set_minimum_buffers(struct dm_bufio_client *c, unsigned n);
-
-unsigned dm_bufio_get_block_size(struct dm_bufio_client *c);
-sector_t dm_bufio_get_device_size(struct dm_bufio_client *c);
-sector_t dm_bufio_get_block_number(struct dm_buffer *b);
-void *dm_bufio_get_block_data(struct dm_buffer *b);
-void *dm_bufio_get_aux_data(struct dm_buffer *b);
-struct dm_bufio_client *dm_bufio_get_client(struct dm_buffer *b);
-
-/*----------------------------------------------------------------*/
-
-#endif
Index: linux-2.6/include/linux/dm-bufio.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/include/linux/dm-bufio.h	2018-03-08 10:13:17.599999000 +0100
@@ -0,0 +1,148 @@
+/*
+ * Copyright (C) 2009-2011 Red Hat, Inc.
+ *
+ * Author: Mikulas Patocka <mpatocka@redhat.com>
+ *
+ * This file is released under the GPL.
+ */
+
+#ifndef _LINUX_DM_BUFIO_H
+#define _LINUX_DM_BUFIO_H
+
+#include <linux/blkdev.h>
+#include <linux/types.h>
+
+/*----------------------------------------------------------------*/
+
+struct dm_bufio_client;
+struct dm_buffer;
+
+/*
+ * Create a buffered IO cache on a given device
+ */
+struct dm_bufio_client *
+dm_bufio_client_create(struct block_device *bdev, unsigned block_size,
+		       unsigned reserved_buffers, unsigned aux_size,
+		       void (*alloc_callback)(struct dm_buffer *),
+		       void (*write_callback)(struct dm_buffer *));
+
+/*
+ * Release a buffered IO cache.
+ */
+void dm_bufio_client_destroy(struct dm_bufio_client *c);
+
+/*
+ * Set the sector range.
+ * When this function is called, there must be no I/O in progress on the bufio
+ * client.
+ */
+void dm_bufio_set_sector_offset(struct dm_bufio_client *c, sector_t start);
+
+/*
+ * WARNING: to avoid deadlocks, these conditions are observed:
+ *
+ * - At most one thread can hold at most "reserved_buffers" simultaneously.
+ * - Each other threads can hold at most one buffer.
+ * - Threads which call only dm_bufio_get can hold unlimited number of
+ *   buffers.
+ */
+
+/*
+ * Read a given block from disk. Returns pointer to data.  Returns a
+ * pointer to dm_buffer that can be used to release the buffer or to make
+ * it dirty.
+ */
+void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
+		    struct dm_buffer **bp);
+
+/*
+ * Like dm_bufio_read, but return buffer from cache, don't read
+ * it. If the buffer is not in the cache, return NULL.
+ */
+void *dm_bufio_get(struct dm_bufio_client *c, sector_t block,
+		   struct dm_buffer **bp);
+
+/*
+ * Like dm_bufio_read, but don't read anything from the disk.  It is
+ * expected that the caller initializes the buffer and marks it dirty.
+ */
+void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
+		   struct dm_buffer **bp);
+
+/*
+ * Prefetch the specified blocks to the cache.
+ * The function starts to read the blocks and returns without waiting for
+ * I/O to finish.
+ */
+void dm_bufio_prefetch(struct dm_bufio_client *c,
+		       sector_t block, unsigned n_blocks);
+
+/*
+ * Release a reference obtained with dm_bufio_{read,get,new}. The data
+ * pointer and dm_buffer pointer is no longer valid after this call.
+ */
+void dm_bufio_release(struct dm_buffer *b);
+
+/*
+ * Mark a buffer dirty. It should be called after the buffer is modified.
+ *
+ * In case of memory pressure, the buffer may be written after
+ * dm_bufio_mark_buffer_dirty, but before dm_bufio_write_dirty_buffers.  So
+ * dm_bufio_write_dirty_buffers guarantees that the buffer is on-disk but
+ * the actual writing may occur earlier.
+ */
+void dm_bufio_mark_buffer_dirty(struct dm_buffer *b);
+
+/*
+ * Mark a part of the buffer dirty.
+ *
+ * The specified part of the buffer is scheduled to be written. dm-bufio may
+ * write the specified part of the buffer or it may write a larger superset.
+ */
+void dm_bufio_mark_partial_buffer_dirty(struct dm_buffer *b,
+					unsigned start, unsigned end);
+
+/*
+ * Initiate writing of dirty buffers, without waiting for completion.
+ */
+void dm_bufio_write_dirty_buffers_async(struct dm_bufio_client *c);
+
+/*
+ * Write all dirty buffers. Guarantees that all dirty buffers created prior
+ * to this call are on disk when this call exits.
+ */
+int dm_bufio_write_dirty_buffers(struct dm_bufio_client *c);
+
+/*
+ * Send an empty write barrier to the device to flush hardware disk cache.
+ */
+int dm_bufio_issue_flush(struct dm_bufio_client *c);
+
+/*
+ * Like dm_bufio_release but also move the buffer to the new
+ * block. dm_bufio_write_dirty_buffers is needed to commit the new block.
+ */
+void dm_bufio_release_move(struct dm_buffer *b, sector_t new_block);
+
+/*
+ * Free the given buffer.
+ * This is just a hint, if the buffer is in use or dirty, this function
+ * does nothing.
+ */
+void dm_bufio_forget(struct dm_bufio_client *c, sector_t block);
+
+/*
+ * Set the minimum number of buffers before cleanup happens.
+ */
+void dm_bufio_set_minimum_buffers(struct dm_bufio_client *c, unsigned n);
+
+unsigned dm_bufio_get_block_size(struct dm_bufio_client *c);
+sector_t dm_bufio_get_device_size(struct dm_bufio_client *c);
+sector_t dm_bufio_get_block_number(struct dm_buffer *b);
+void *dm_bufio_get_block_data(struct dm_buffer *b);
+void *dm_bufio_get_aux_data(struct dm_buffer *b);
+struct dm_bufio_client *dm_bufio_get_client(struct dm_buffer *b);
+
+/*----------------------------------------------------------------*/
+
+#endif
Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c	2018-03-08 07:13:48.655471000 +0100
+++ linux-2.6/drivers/md/dm-bufio.c	2018-03-08 10:14:02.119999000 +0100
@@ -6,7 +6,7 @@
  * This file is released under the GPL.
  */
 
-#include "dm-bufio.h"
+#include <linux/dm-bufio.h>
 
 #include <linux/device-mapper.h>
 #include <linux/dm-io.h>
Index: linux-2.6/drivers/md/dm-integrity.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-integrity.c	2018-03-08 06:53:09.000000000 +0100
+++ linux-2.6/drivers/md/dm-integrity.c	2018-03-08 10:14:36.659999000 +0100
@@ -18,7 +18,7 @@
 #include <crypto/hash.h>
 #include <crypto/skcipher.h>
 #include <linux/async_tx.h>
-#include "dm-bufio.h"
+#include <linux/dm-bufio.h>
 
 #define DM_MSG_PREFIX "integrity"
 
Index: linux-2.6/drivers/md/dm-snap-persistent.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-snap-persistent.c	2017-06-06 23:49:56.237523000 +0200
+++ linux-2.6/drivers/md/dm-snap-persistent.c	2018-03-08 10:14:14.219999000 +0100
@@ -14,7 +14,7 @@
 #include <linux/export.h>
 #include <linux/slab.h>
 #include <linux/dm-io.h>
-#include "dm-bufio.h"
+#include <linux/dm-bufio.h>
 
 #define DM_MSG_PREFIX "persistent snapshot"
 #define DM_CHUNK_SIZE_DEFAULT_SECTORS 32	/* 16KB */
Index: linux-2.6/drivers/md/dm-verity.h
===================================================================
--- linux-2.6.orig/drivers/md/dm-verity.h	2017-11-24 18:01:03.878480000 +0100
+++ linux-2.6/drivers/md/dm-verity.h	2018-03-08 10:14:27.389999000 +0100
@@ -12,7 +12,7 @@
 #ifndef DM_VERITY_H
 #define DM_VERITY_H
 
-#include "dm-bufio.h"
+#include <linux/dm-bufio.h>
 #include <linux/device-mapper.h>
 #include <crypto/hash.h>
 
Index: linux-2.6/drivers/md/persistent-data/dm-block-manager.c
===================================================================
--- linux-2.6.orig/drivers/md/persistent-data/dm-block-manager.c	2017-06-03 00:22:40.129122000 +0200
+++ linux-2.6/drivers/md/persistent-data/dm-block-manager.c	2018-03-08 10:14:52.749999000 +0100
@@ -5,8 +5,8 @@
  */
 #include "dm-block-manager.h"
 #include "dm-persistent-data-internal.h"
-#include "../dm-bufio.h"
 
+#include <linux/dm-bufio.h>
 #include <linux/crc32c.h>
 #include <linux/module.h>
 #include <linux/slab.h>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [patch 2/8] dm-bufio: get rid of slab cache name allocations
  2018-03-26 18:29 [patch 0/8] dm-bufio patches mpatocka
  2018-03-26 18:29 ` [patch 1/8] dm-bufio: move dm-bufio.h to include/linux/ mpatocka
@ 2018-03-26 18:29 ` mpatocka
  2018-03-26 18:29 ` [patch 3/8] dm-bufio: remove code that merges slab caches mpatocka
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: mpatocka @ 2018-03-26 18:29 UTC (permalink / raw)
  To: mpatocka, msnitzer, agk; +Cc: dm-devel

[-- Attachment #1: dm-bufio-remove-name-allocations.patch --]
[-- Type: text/plain, Size: 2917 bytes --]

dm-bufio keeps the array dm_bufio_cache_names that holds names of the slab
caches.

Since the commit db265eca7700 ("mm/sl[aou]b: Move duping of slab name to
slab_common.c"), the kernel automatically duplicates the slab cache name
when creating the slab cache, so we no longer have to keep the name
allocated.

This patch removes the code that allocates the slab names and keeps them
around.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-bufio.c |   21 +++------------------
 1 file changed, 3 insertions(+), 18 deletions(-)

Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c	2018-03-13 16:41:15.637954000 +0100
+++ linux-2.6/drivers/md/dm-bufio.c	2018-03-14 11:49:06.199999000 +0100
@@ -173,7 +173,6 @@ struct dm_buffer {
 /*----------------------------------------------------------------*/
 
 static struct kmem_cache *dm_bufio_caches[PAGE_SHIFT - SECTOR_SHIFT];
-static char *dm_bufio_cache_names[PAGE_SHIFT - SECTOR_SHIFT];
 
 static inline int dm_bufio_cache_index(struct dm_bufio_client *c)
 {
@@ -185,7 +184,6 @@ static inline int dm_bufio_cache_index(s
 }
 
 #define DM_BUFIO_CACHE(c)	(dm_bufio_caches[dm_bufio_cache_index(c)])
-#define DM_BUFIO_CACHE_NAME(c)	(dm_bufio_cache_names[dm_bufio_cache_index(c)])
 
 #define dm_bufio_in_request()	(!!current->bio_list)
 
@@ -1703,19 +1701,10 @@ struct dm_bufio_client *dm_bufio_client_
 
 	mutex_lock(&dm_bufio_clients_lock);
 	if (c->blocks_per_page_bits) {
-		if (!DM_BUFIO_CACHE_NAME(c)) {
-			DM_BUFIO_CACHE_NAME(c) = kasprintf(GFP_KERNEL, "dm_bufio_cache-%u", c->block_size);
-			if (!DM_BUFIO_CACHE_NAME(c)) {
-				r = -ENOMEM;
-				mutex_unlock(&dm_bufio_clients_lock);
-				goto bad;
-			}
-		}
-
 		if (!DM_BUFIO_CACHE(c)) {
-			DM_BUFIO_CACHE(c) = kmem_cache_create(DM_BUFIO_CACHE_NAME(c),
-							      c->block_size,
-							      c->block_size, 0, NULL);
+			char name[26];
+			snprintf(name, sizeof name, "dm_bufio_cache-%u", c->block_size);
+			DM_BUFIO_CACHE(c) = kmem_cache_create(name, c->block_size, c->block_size, 0, NULL);
 			if (!DM_BUFIO_CACHE(c)) {
 				r = -ENOMEM;
 				mutex_unlock(&dm_bufio_clients_lock);
@@ -1908,7 +1897,6 @@ static int __init dm_bufio_init(void)
 	dm_bufio_current_allocated = 0;
 
 	memset(&dm_bufio_caches, 0, sizeof dm_bufio_caches);
-	memset(&dm_bufio_cache_names, 0, sizeof dm_bufio_cache_names);
 
 	mem = (__u64)mult_frac(totalram_pages - totalhigh_pages,
 			       DM_BUFIO_MEMORY_PERCENT, 100) << PAGE_SHIFT;
@@ -1952,9 +1940,6 @@ static void __exit dm_bufio_exit(void)
 	for (i = 0; i < ARRAY_SIZE(dm_bufio_caches); i++)
 		kmem_cache_destroy(dm_bufio_caches[i]);
 
-	for (i = 0; i < ARRAY_SIZE(dm_bufio_cache_names); i++)
-		kfree(dm_bufio_cache_names[i]);
-
 	if (dm_bufio_client_count) {
 		DMCRIT("%s: dm_bufio_client_count leaked: %d",
 			__func__, dm_bufio_client_count);

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [patch 3/8] dm-bufio: remove code that merges slab caches
  2018-03-26 18:29 [patch 0/8] dm-bufio patches mpatocka
  2018-03-26 18:29 ` [patch 1/8] dm-bufio: move dm-bufio.h to include/linux/ mpatocka
  2018-03-26 18:29 ` [patch 2/8] dm-bufio: get rid of slab cache name allocations mpatocka
@ 2018-03-26 18:29 ` mpatocka
  2018-03-26 18:29 ` [patch 4/8] dm-bufio: fix slab cache attributes mpatocka
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: mpatocka @ 2018-03-26 18:29 UTC (permalink / raw)
  To: mpatocka, msnitzer, agk; +Cc: dm-devel

[-- Attachment #1: dm-bufio-use-cache-per-client.patch --]
[-- Type: text/plain, Size: 5126 bytes --]

Currently, all slab allocators can merge duplicate caches. So we don't
need extra merging logic in dm-bufio. We can just allocate one slab cache
per client and let the allocator merge them.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-bufio.c |   53 +++++++++++++-------------------------------------
 1 file changed, 14 insertions(+), 39 deletions(-)

Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c	2018-03-23 22:40:52.177960000 +0100
+++ linux-2.6/drivers/md/dm-bufio.c	2018-03-26 14:43:49.857250000 +0200
@@ -57,10 +57,9 @@
 #define DM_BUFIO_INLINE_VECS		16
 
 /*
- * Don't try to use kmem_cache_alloc for blocks larger than this.
+ * Don't try to use alloc_pages for blocks larger than this.
  * For explanation, see alloc_buffer_data below.
  */
-#define DM_BUFIO_BLOCK_SIZE_SLAB_LIMIT	(PAGE_SIZE >> 1)
 #define DM_BUFIO_BLOCK_SIZE_GFP_LIMIT	(PAGE_SIZE << (MAX_ORDER - 1))
 
 /*
@@ -101,11 +100,11 @@ struct dm_bufio_client {
 	unsigned block_size;
 	unsigned char sectors_per_block_bits;
 	unsigned char pages_per_block_bits;
-	unsigned char blocks_per_page_bits;
 	unsigned aux_size;
 	void (*alloc_callback)(struct dm_buffer *);
 	void (*write_callback)(struct dm_buffer *);
 
+	struct kmem_cache *slab_cache;
 	struct dm_io_client *dm_io;
 
 	struct list_head reserved_buffers;
@@ -172,19 +171,6 @@ struct dm_buffer {
 
 /*----------------------------------------------------------------*/
 
-static struct kmem_cache *dm_bufio_caches[PAGE_SHIFT - SECTOR_SHIFT];
-
-static inline int dm_bufio_cache_index(struct dm_bufio_client *c)
-{
-	unsigned ret = c->blocks_per_page_bits - 1;
-
-	BUG_ON(ret >= ARRAY_SIZE(dm_bufio_caches));
-
-	return ret;
-}
-
-#define DM_BUFIO_CACHE(c)	(dm_bufio_caches[dm_bufio_cache_index(c)])
-
 #define dm_bufio_in_request()	(!!current->bio_list)
 
 static void dm_bufio_lock(struct dm_bufio_client *c)
@@ -384,9 +370,9 @@ static void __cache_size_refresh(void)
 static void *alloc_buffer_data(struct dm_bufio_client *c, gfp_t gfp_mask,
 			       enum data_mode *data_mode)
 {
-	if (c->block_size <= DM_BUFIO_BLOCK_SIZE_SLAB_LIMIT) {
+	if (unlikely(c->slab_cache != NULL)) {
 		*data_mode = DATA_MODE_SLAB;
-		return kmem_cache_alloc(DM_BUFIO_CACHE(c), gfp_mask);
+		return kmem_cache_alloc(c->slab_cache, gfp_mask);
 	}
 
 	if (c->block_size <= DM_BUFIO_BLOCK_SIZE_GFP_LIMIT &&
@@ -426,7 +412,7 @@ static void free_buffer_data(struct dm_b
 {
 	switch (data_mode) {
 	case DATA_MODE_SLAB:
-		kmem_cache_free(DM_BUFIO_CACHE(c), data);
+		kmem_cache_free(c->slab_cache, data);
 		break;
 
 	case DATA_MODE_GET_FREE_PAGES:
@@ -1672,8 +1658,6 @@ struct dm_bufio_client *dm_bufio_client_
 	c->sectors_per_block_bits = __ffs(block_size) - SECTOR_SHIFT;
 	c->pages_per_block_bits = (__ffs(block_size) >= PAGE_SHIFT) ?
 				  __ffs(block_size) - PAGE_SHIFT : 0;
-	c->blocks_per_page_bits = (__ffs(block_size) < PAGE_SHIFT ?
-				  PAGE_SHIFT - __ffs(block_size) : 0);
 
 	c->aux_size = aux_size;
 	c->alloc_callback = alloc_callback;
@@ -1699,20 +1683,15 @@ struct dm_bufio_client *dm_bufio_client_
 		goto bad_dm_io;
 	}
 
-	mutex_lock(&dm_bufio_clients_lock);
-	if (c->blocks_per_page_bits) {
-		if (!DM_BUFIO_CACHE(c)) {
-			char name[26];
-			snprintf(name, sizeof name, "dm_bufio_cache-%u", c->block_size);
-			DM_BUFIO_CACHE(c) = kmem_cache_create(name, c->block_size, c->block_size, 0, NULL);
-			if (!DM_BUFIO_CACHE(c)) {
-				r = -ENOMEM;
-				mutex_unlock(&dm_bufio_clients_lock);
-				goto bad;
-			}
+	if (block_size < PAGE_SIZE) {
+		char name[26];
+		snprintf(name, sizeof name, "dm_bufio_cache-%u", c->block_size);
+		c->slab_cache = kmem_cache_create(name, c->block_size, c->block_size, 0, NULL);
+		if (!c->slab_cache) {
+			r = -ENOMEM;
+			goto bad;
 		}
 	}
-	mutex_unlock(&dm_bufio_clients_lock);
 
 	while (c->need_reserved_buffers) {
 		struct dm_buffer *b = alloc_buffer(c, GFP_KERNEL);
@@ -1747,6 +1726,7 @@ bad:
 		list_del(&b->lru_list);
 		free_buffer(b);
 	}
+	kmem_cache_destroy(c->slab_cache);
 	dm_io_client_destroy(c->dm_io);
 bad_dm_io:
 	mutex_destroy(&c->lock);
@@ -1793,6 +1773,7 @@ void dm_bufio_client_destroy(struct dm_b
 	for (i = 0; i < LIST_SIZE; i++)
 		BUG_ON(c->n_buffers[i]);
 
+	kmem_cache_destroy(c->slab_cache);
 	dm_io_client_destroy(c->dm_io);
 	mutex_destroy(&c->lock);
 	kfree(c);
@@ -1896,8 +1877,6 @@ static int __init dm_bufio_init(void)
 	dm_bufio_allocated_vmalloc = 0;
 	dm_bufio_current_allocated = 0;
 
-	memset(&dm_bufio_caches, 0, sizeof dm_bufio_caches);
-
 	mem = (__u64)mult_frac(totalram_pages - totalhigh_pages,
 			       DM_BUFIO_MEMORY_PERCENT, 100) << PAGE_SHIFT;
 
@@ -1932,14 +1911,10 @@ static int __init dm_bufio_init(void)
 static void __exit dm_bufio_exit(void)
 {
 	int bug = 0;
-	int i;
 
 	cancel_delayed_work_sync(&dm_bufio_work);
 	destroy_workqueue(dm_bufio_wq);
 
-	for (i = 0; i < ARRAY_SIZE(dm_bufio_caches); i++)
-		kmem_cache_destroy(dm_bufio_caches[i]);
-
 	if (dm_bufio_client_count) {
 		DMCRIT("%s: dm_bufio_client_count leaked: %d",
 			__func__, dm_bufio_client_count);

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [patch 4/8] dm-bufio: fix slab cache attributes
  2018-03-26 18:29 [patch 0/8] dm-bufio patches mpatocka
                   ` (2 preceding siblings ...)
  2018-03-26 18:29 ` [patch 3/8] dm-bufio: remove code that merges slab caches mpatocka
@ 2018-03-26 18:29 ` mpatocka
  2018-03-26 18:29 ` [patch 5/8] dm-bufio recorder fields in dm-buffer mpatocka
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: mpatocka @ 2018-03-26 18:29 UTC (permalink / raw)
  To: mpatocka, msnitzer, agk; +Cc: dm-devel

[-- Attachment #1: dm-bufio-drop-alignment.patch --]
[-- Type: text/plain, Size: 1439 bytes --]

The I/O buffer doesn't have to be aligned, so we can drop the alignment on
the slab cache. This patch also sets SLAB_RECLAIM_ACCOUNT, so that the
memory allocated from the cache is accounted as reclaimable and doesn't
inflate the 'used' entry in the free command.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-bufio.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c	2018-03-21 12:33:34.928416000 +0100
+++ linux-2.6/drivers/md/dm-bufio.c	2018-03-21 12:35:14.938416000 +0100
@@ -618,7 +618,6 @@ static void use_inline_bio(struct dm_buf
 		unsigned this_step = min((unsigned)(PAGE_SIZE - offset_in_page(ptr)), len);
 		if (!bio_add_page(&b->bio, virt_to_page(ptr), this_step,
 				  offset_in_page(ptr))) {
-			BUG_ON(b->c->block_size <= PAGE_SIZE);
 			use_dmio(b, rw, sector, n_sectors, offset, end_io);
 			return;
 		}
@@ -1686,7 +1685,7 @@ struct dm_bufio_client *dm_bufio_client_
 	if (block_size < PAGE_SIZE) {
 		char name[26];
 		snprintf(name, sizeof name, "dm_bufio_cache-%u", c->block_size);
-		c->slab_cache = kmem_cache_create(name, c->block_size, c->block_size, 0, NULL);
+		c->slab_cache = kmem_cache_create(name, c->block_size, ARCH_KMALLOC_MINALIGN, SLAB_RECLAIM_ACCOUNT, NULL);
 		if (!c->slab_cache) {
 			r = -ENOMEM;
 			goto bad;

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [patch 5/8] dm-bufio recorder fields in dm-buffer
  2018-03-26 18:29 [patch 0/8] dm-bufio patches mpatocka
                   ` (3 preceding siblings ...)
  2018-03-26 18:29 ` [patch 4/8] dm-bufio: fix slab cache attributes mpatocka
@ 2018-03-26 18:29 ` mpatocka
  2018-03-26 18:29 ` [patch 6/8] dm-bufio: use slab cache for dm_buffer mpatocka
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: mpatocka @ 2018-03-26 18:29 UTC (permalink / raw)
  To: mpatocka, msnitzer, agk; +Cc: dm-devel

[-- Attachment #1: dm-bufio-reorder.patch --]
[-- Type: text/plain, Size: 1957 bytes --]

Reorder fields in dm-buffer, so that it improves packing and reduces
structure size. The compiler allocates 32-bit integer for field 'enum
data_mode', so we change it to unsigned char.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-bufio.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c	2018-03-26 17:35:00.149999000 +0200
+++ linux-2.6/drivers/md/dm-bufio.c	2018-03-26 17:35:00.139999000 +0200
@@ -147,11 +147,11 @@ struct dm_buffer {
 	struct list_head lru_list;
 	sector_t block;
 	void *data;
-	enum data_mode data_mode;
+	unsigned char data_mode;		/* DATA_MODE_* */
 	unsigned char list_mode;		/* LIST_* */
-	unsigned hold_count;
 	blk_status_t read_error;
 	blk_status_t write_error;
+	unsigned hold_count;
 	unsigned long state;
 	unsigned long last_accessed;
 	unsigned dirty_start;
@@ -303,7 +303,7 @@ static void __remove(struct dm_bufio_cli
 
 /*----------------------------------------------------------------*/
 
-static void adjust_total_allocated(enum data_mode data_mode, long diff)
+static void adjust_total_allocated(unsigned char data_mode, long diff)
 {
 	static unsigned long * const class_ptr[DATA_MODE_LIMIT] = {
 		&dm_bufio_allocated_kmem_cache,
@@ -368,7 +368,7 @@ static void __cache_size_refresh(void)
  * space.
  */
 static void *alloc_buffer_data(struct dm_bufio_client *c, gfp_t gfp_mask,
-			       enum data_mode *data_mode)
+			       unsigned char *data_mode)
 {
 	if (unlikely(c->slab_cache != NULL)) {
 		*data_mode = DATA_MODE_SLAB;
@@ -408,7 +408,7 @@ static void *alloc_buffer_data(struct dm
  * Free buffer's data.
  */
 static void free_buffer_data(struct dm_bufio_client *c,
-			     void *data, enum data_mode data_mode)
+			     void *data, unsigned char data_mode)
 {
 	switch (data_mode) {
 	case DATA_MODE_SLAB:

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [patch 6/8] dm-bufio: use slab cache for dm_buffer
  2018-03-26 18:29 [patch 0/8] dm-bufio patches mpatocka
                   ` (4 preceding siblings ...)
  2018-03-26 18:29 ` [patch 5/8] dm-bufio recorder fields in dm-buffer mpatocka
@ 2018-03-26 18:29 ` mpatocka
  2018-03-26 18:29 ` [patch 7/8] dm-bufio: support non-power-of-two block sizes mpatocka
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: mpatocka @ 2018-03-26 18:29 UTC (permalink / raw)
  To: mpatocka, msnitzer, agk; +Cc: dm-devel

[-- Attachment #1: dm-bufio-buffer-cache.patch --]
[-- Type: text/plain, Size: 3602 bytes --]

Use the slab cache for the structure dm_buffer, so that the size is not
padded up to the next power of two.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-bufio.c |   27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c	2018-03-26 15:33:50.987250000 +0200
+++ linux-2.6/drivers/md/dm-bufio.c	2018-03-26 15:34:12.907250000 +0200
@@ -100,10 +100,10 @@ struct dm_bufio_client {
 	unsigned block_size;
 	unsigned char sectors_per_block_bits;
 	unsigned char pages_per_block_bits;
-	unsigned aux_size;
 	void (*alloc_callback)(struct dm_buffer *);
 	void (*write_callback)(struct dm_buffer *);
 
+	struct kmem_cache *slab_buffer;
 	struct kmem_cache *slab_cache;
 	struct dm_io_client *dm_io;
 
@@ -435,8 +435,7 @@ static void free_buffer_data(struct dm_b
  */
 static struct dm_buffer *alloc_buffer(struct dm_bufio_client *c, gfp_t gfp_mask)
 {
-	struct dm_buffer *b = kmalloc(sizeof(struct dm_buffer) + c->aux_size,
-				      gfp_mask);
+	struct dm_buffer *b = kmem_cache_alloc(c->slab_buffer, gfp_mask);
 
 	if (!b)
 		return NULL;
@@ -445,7 +444,7 @@ static struct dm_buffer *alloc_buffer(st
 
 	b->data = alloc_buffer_data(c, gfp_mask, &b->data_mode);
 	if (!b->data) {
-		kfree(b);
+		kmem_cache_free(c->slab_buffer, b);
 		return NULL;
 	}
 
@@ -467,7 +466,7 @@ static void free_buffer(struct dm_buffer
 	adjust_total_allocated(b->data_mode, -(long)c->block_size);
 
 	free_buffer_data(c, b->data, b->data_mode);
-	kfree(b);
+	kmem_cache_free(c->slab_buffer, b);
 }
 
 /*
@@ -1641,6 +1640,7 @@ struct dm_bufio_client *dm_bufio_client_
 	int r;
 	struct dm_bufio_client *c;
 	unsigned i;
+	char slab_name[27];
 
 	BUG_ON(block_size < 1 << SECTOR_SHIFT ||
 	       (block_size & (block_size - 1)));
@@ -1658,7 +1658,6 @@ struct dm_bufio_client *dm_bufio_client_
 	c->pages_per_block_bits = (__ffs(block_size) >= PAGE_SHIFT) ?
 				  __ffs(block_size) - PAGE_SHIFT : 0;
 
-	c->aux_size = aux_size;
 	c->alloc_callback = alloc_callback;
 	c->write_callback = write_callback;
 
@@ -1683,14 +1682,22 @@ struct dm_bufio_client *dm_bufio_client_
 	}
 
 	if (block_size < PAGE_SIZE) {
-		char name[26];
-		snprintf(name, sizeof name, "dm_bufio_cache-%u", c->block_size);
-		c->slab_cache = kmem_cache_create(name, c->block_size, ARCH_KMALLOC_MINALIGN, SLAB_RECLAIM_ACCOUNT, NULL);
+		snprintf(slab_name, sizeof slab_name, "dm_bufio_cache-%u", c->block_size);
+		c->slab_cache = kmem_cache_create(slab_name, c->block_size, ARCH_KMALLOC_MINALIGN, SLAB_RECLAIM_ACCOUNT, NULL);
 		if (!c->slab_cache) {
 			r = -ENOMEM;
 			goto bad;
 		}
 	}
+	if (aux_size)
+		snprintf(slab_name, sizeof slab_name, "dm_bufio_buffer-%u", aux_size);
+	else
+		snprintf(slab_name, sizeof slab_name, "dm_bufio_buffer");
+	c->slab_buffer = kmem_cache_create(slab_name, sizeof(struct dm_buffer) + aux_size, 0, SLAB_RECLAIM_ACCOUNT, NULL);
+	if (!c->slab_buffer) {
+		r = -ENOMEM;
+		goto bad;
+	}
 
 	while (c->need_reserved_buffers) {
 		struct dm_buffer *b = alloc_buffer(c, GFP_KERNEL);
@@ -1726,6 +1733,7 @@ bad:
 		free_buffer(b);
 	}
 	kmem_cache_destroy(c->slab_cache);
+	kmem_cache_destroy(c->slab_buffer);
 	dm_io_client_destroy(c->dm_io);
 bad_dm_io:
 	mutex_destroy(&c->lock);
@@ -1773,6 +1781,7 @@ void dm_bufio_client_destroy(struct dm_b
 		BUG_ON(c->n_buffers[i]);
 
 	kmem_cache_destroy(c->slab_cache);
+	kmem_cache_destroy(c->slab_buffer);
 	dm_io_client_destroy(c->dm_io);
 	mutex_destroy(&c->lock);
 	kfree(c);

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [patch 7/8] dm-bufio: support non-power-of-two block sizes
  2018-03-26 18:29 [patch 0/8] dm-bufio patches mpatocka
                   ` (5 preceding siblings ...)
  2018-03-26 18:29 ` [patch 6/8] dm-bufio: use slab cache for dm_buffer mpatocka
@ 2018-03-26 18:29 ` mpatocka
  2018-03-26 18:29 ` [patch 8/8] dm-bufio: dont embed bio in dm_buffer mpatocka
  2018-03-26 18:57 ` [patch 0/8] dm-bufio patches Mike Snitzer
  8 siblings, 0 replies; 12+ messages in thread
From: mpatocka @ 2018-03-26 18:29 UTC (permalink / raw)
  To: mpatocka, msnitzer, agk; +Cc: dm-devel

[-- Attachment #1: dm-bufio-non-power-of-two-size.patch --]
[-- Type: text/plain, Size: 5364 bytes --]

Support sector sizes that are not a power of two. Slab cache is used for
allocations.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-bufio.c |   64 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 39 insertions(+), 25 deletions(-)

Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c	2018-03-26 15:34:20.577250000 +0200
+++ linux-2.6/drivers/md/dm-bufio.c	2018-03-26 15:34:20.577250000 +0200
@@ -57,12 +57,6 @@
 #define DM_BUFIO_INLINE_VECS		16
 
 /*
- * Don't try to use alloc_pages for blocks larger than this.
- * For explanation, see alloc_buffer_data below.
- */
-#define DM_BUFIO_BLOCK_SIZE_GFP_LIMIT	(PAGE_SIZE << (MAX_ORDER - 1))
-
-/*
  * Align buffer writes to this boundary.
  * Tests show that SSDs have the highest IOPS when using 4k writes.
  */
@@ -98,8 +92,7 @@ struct dm_bufio_client {
 
 	struct block_device *bdev;
 	unsigned block_size;
-	unsigned char sectors_per_block_bits;
-	unsigned char pages_per_block_bits;
+	s8 sectors_per_block_bits;
 	void (*alloc_callback)(struct dm_buffer *);
 	void (*write_callback)(struct dm_buffer *);
 
@@ -375,11 +368,11 @@ static void *alloc_buffer_data(struct dm
 		return kmem_cache_alloc(c->slab_cache, gfp_mask);
 	}
 
-	if (c->block_size <= DM_BUFIO_BLOCK_SIZE_GFP_LIMIT &&
+	if (c->block_size <= KMALLOC_MAX_SIZE &&
 	    gfp_mask & __GFP_NORETRY) {
 		*data_mode = DATA_MODE_GET_FREE_PAGES;
 		return (void *)__get_free_pages(gfp_mask,
-						c->pages_per_block_bits);
+						c->sectors_per_block_bits - (PAGE_SHIFT - SECTOR_SHIFT));
 	}
 
 	*data_mode = DATA_MODE_VMALLOC;
@@ -416,7 +409,8 @@ static void free_buffer_data(struct dm_b
 		break;
 
 	case DATA_MODE_GET_FREE_PAGES:
-		free_pages((unsigned long)data, c->pages_per_block_bits);
+		free_pages((unsigned long)data,
+			   c->sectors_per_block_bits - (PAGE_SHIFT - SECTOR_SHIFT));
 		break;
 
 	case DATA_MODE_VMALLOC:
@@ -634,10 +628,14 @@ static void submit_io(struct dm_buffer *
 	sector_t sector;
 	unsigned offset, end;
 
-	sector = (b->block << b->c->sectors_per_block_bits) + b->c->start;
+	if (likely(b->c->sectors_per_block_bits >= 0))
+		sector = b->block << b->c->sectors_per_block_bits;
+	else
+		sector = b->block * (b->c->block_size >> SECTOR_SHIFT);
+	sector += b->c->start;
 
 	if (rw != REQ_OP_WRITE) {
-		n_sectors = 1 << b->c->sectors_per_block_bits;
+		n_sectors = b->c->block_size >> SECTOR_SHIFT;
 		offset = 0;
 	} else {
 		if (b->c->write_callback)
@@ -941,8 +939,11 @@ static void __get_memory_limit(struct dm
 		}
 	}
 
-	buffers = dm_bufio_cache_size_per_client >>
-		  (c->sectors_per_block_bits + SECTOR_SHIFT);
+	buffers = dm_bufio_cache_size_per_client;
+	if (likely(c->sectors_per_block_bits >= 0))
+		buffers >>= c->sectors_per_block_bits + SECTOR_SHIFT;
+	else
+		buffers /= c->block_size;
 
 	if (buffers < c->minimum_buffers)
 		buffers = c->minimum_buffers;
@@ -1476,8 +1477,12 @@ EXPORT_SYMBOL_GPL(dm_bufio_get_block_siz
 
 sector_t dm_bufio_get_device_size(struct dm_bufio_client *c)
 {
-	return i_size_read(c->bdev->bd_inode) >>
-			   (SECTOR_SHIFT + c->sectors_per_block_bits);
+	sector_t s = i_size_read(c->bdev->bd_inode) >> SECTOR_SHIFT;
+	if (likely(c->sectors_per_block_bits >= 0))
+		s >>= c->sectors_per_block_bits;
+	else
+		sector_div(s, c->block_size >> SECTOR_SHIFT);
+	return s;
 }
 EXPORT_SYMBOL_GPL(dm_bufio_get_device_size);
 
@@ -1575,8 +1580,12 @@ static bool __try_evict_buffer(struct dm
 
 static unsigned long get_retain_buffers(struct dm_bufio_client *c)
 {
-        unsigned long retain_bytes = READ_ONCE(dm_bufio_retain_bytes);
-        return retain_bytes >> (c->sectors_per_block_bits + SECTOR_SHIFT);
+	unsigned long retain_bytes = READ_ONCE(dm_bufio_retain_bytes);
+	if (likely(c->sectors_per_block_bits >= 0))
+		retain_bytes >>= c->sectors_per_block_bits + SECTOR_SHIFT;
+	else
+		retain_bytes /= c->block_size;
+	return retain_bytes;
 }
 
 static unsigned long __scan(struct dm_bufio_client *c, unsigned long nr_to_scan,
@@ -1642,8 +1651,11 @@ struct dm_bufio_client *dm_bufio_client_
 	unsigned i;
 	char slab_name[27];
 
-	BUG_ON(block_size < 1 << SECTOR_SHIFT ||
-	       (block_size & (block_size - 1)));
+	if (!block_size || block_size & ((1 << SECTOR_SHIFT) - 1)) {
+		WARN_ON(1);
+		r = -EINVAL;
+		goto bad_client;
+	}
 
 	c = kzalloc(sizeof(*c), GFP_KERNEL);
 	if (!c) {
@@ -1654,9 +1666,10 @@ struct dm_bufio_client *dm_bufio_client_
 
 	c->bdev = bdev;
 	c->block_size = block_size;
-	c->sectors_per_block_bits = __ffs(block_size) - SECTOR_SHIFT;
-	c->pages_per_block_bits = (__ffs(block_size) >= PAGE_SHIFT) ?
-				  __ffs(block_size) - PAGE_SHIFT : 0;
+	if (is_power_of_2(block_size))
+		c->sectors_per_block_bits = __ffs(block_size) - SECTOR_SHIFT;
+	else
+		c->sectors_per_block_bits = -1;
 
 	c->alloc_callback = alloc_callback;
 	c->write_callback = write_callback;
@@ -1681,7 +1694,8 @@ struct dm_bufio_client *dm_bufio_client_
 		goto bad_dm_io;
 	}
 
-	if (block_size < PAGE_SIZE) {
+	if (block_size <= KMALLOC_MAX_SIZE &&
+	    (block_size < PAGE_SIZE || !is_power_of_2(block_size))) {
 		snprintf(slab_name, sizeof slab_name, "dm_bufio_cache-%u", c->block_size);
 		c->slab_cache = kmem_cache_create(slab_name, c->block_size, ARCH_KMALLOC_MINALIGN, SLAB_RECLAIM_ACCOUNT, NULL);
 		if (!c->slab_cache) {

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [patch 8/8] dm-bufio: dont embed bio in dm_buffer
  2018-03-26 18:29 [patch 0/8] dm-bufio patches mpatocka
                   ` (6 preceding siblings ...)
  2018-03-26 18:29 ` [patch 7/8] dm-bufio: support non-power-of-two block sizes mpatocka
@ 2018-03-26 18:29 ` mpatocka
  2018-03-26 18:57 ` [patch 0/8] dm-bufio patches Mike Snitzer
  8 siblings, 0 replies; 12+ messages in thread
From: mpatocka @ 2018-03-26 18:29 UTC (permalink / raw)
  To: mpatocka, msnitzer, agk; +Cc: dm-devel

[-- Attachment #1: dm-bufio-alloc-bio.patch --]
[-- Type: text/plain, Size: 6414 bytes --]

The bio structure consumes a substantial part of dm_buffer. This structure
is needed only when doing I/O on the buffer, thus we don't have to embed
it in the buffer, we can allocate the structure only when doing I/O.

We don't need to create bio set because in case of allocation failure we
fall back to dmio (which keeps its own bio set).

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-bufio.c |  105 +++++++++++++++++++++-----------------------------
 1 file changed, 45 insertions(+), 60 deletions(-)

Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c	2018-03-26 16:44:19.369999000 +0200
+++ linux-2.6/drivers/md/dm-bufio.c	2018-03-26 17:05:13.000000000 +0200
@@ -51,12 +51,6 @@
 #define DM_BUFIO_DEFAULT_RETAIN_BYTES   (256 * 1024)
 
 /*
- * The number of bvec entries that are embedded directly in the buffer.
- * If the chunk size is larger, dm-io is used to do the io.
- */
-#define DM_BUFIO_INLINE_VECS		16
-
-/*
  * Align buffer writes to this boundary.
  * Tests show that SSDs have the highest IOPS when using 4k writes.
  */
@@ -153,8 +147,7 @@ struct dm_buffer {
 	unsigned write_end;
 	struct dm_bufio_client *c;
 	struct list_head write_list;
-	struct bio bio;
-	struct bio_vec bio_vec[DM_BUFIO_INLINE_VECS];
+	void (*end_io)(struct dm_buffer *, blk_status_t);
 #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING
 #define MAX_STACK 10
 	struct stack_trace stack_trace;
@@ -534,12 +527,11 @@ static void dmio_complete(unsigned long
 {
 	struct dm_buffer *b = context;
 
-	b->bio.bi_status = error ? BLK_STS_IOERR : 0;
-	b->bio.bi_end_io(&b->bio);
+	b->end_io(b, unlikely(error != 0) ? BLK_STS_IOERR : 0);
 }
 
 static void use_dmio(struct dm_buffer *b, int rw, sector_t sector,
-		     unsigned n_sectors, unsigned offset, bio_end_io_t *end_io)
+		     unsigned n_sectors, unsigned offset)
 {
 	int r;
 	struct dm_io_request io_req = {
@@ -563,71 +555,69 @@ static void use_dmio(struct dm_buffer *b
 		io_req.mem.ptr.vma = (char *)b->data + offset;
 	}
 
-	b->bio.bi_end_io = end_io;
-
 	r = dm_io(&io_req, 1, &region, NULL);
-	if (r) {
-		b->bio.bi_status = errno_to_blk_status(r);
-		end_io(&b->bio);
-	}
+	if (unlikely(r))
+		b->end_io(b, errno_to_blk_status(r));
 }
 
-static void inline_endio(struct bio *bio)
+static void bio_complete(struct bio *bio)
 {
-	bio_end_io_t *end_fn = bio->bi_private;
+	struct dm_buffer *b = bio->bi_private;
 	blk_status_t status = bio->bi_status;
-
-	/*
-	 * Reset the bio to free any attached resources
-	 * (e.g. bio integrity profiles).
-	 */
-	bio_reset(bio);
-
-	bio->bi_status = status;
-	end_fn(bio);
+	bio_put(bio);
+	b->end_io(b, status);
 }
 
-static void use_inline_bio(struct dm_buffer *b, int rw, sector_t sector,
-			   unsigned n_sectors, unsigned offset, bio_end_io_t *end_io)
+static void use_bio(struct dm_buffer *b, int rw, sector_t sector,
+		    unsigned n_sectors, unsigned offset)
 {
+	struct bio *bio;
 	char *ptr;
-	unsigned len;
+	unsigned vec_size, len;
 
-	bio_init(&b->bio, b->bio_vec, DM_BUFIO_INLINE_VECS);
-	b->bio.bi_iter.bi_sector = sector;
-	bio_set_dev(&b->bio, b->c->bdev);
-	b->bio.bi_end_io = inline_endio;
-	/*
-	 * Use of .bi_private isn't a problem here because
-	 * the dm_buffer's inline bio is local to bufio.
-	 */
-	b->bio.bi_private = end_io;
-	bio_set_op_attrs(&b->bio, rw, 0);
+	vec_size = b->c->block_size >> PAGE_SHIFT;
+	if (unlikely(b->c->sectors_per_block_bits < PAGE_SHIFT - SECTOR_SHIFT))
+		vec_size += 2;
+
+	bio = bio_kmalloc(GFP_NOWAIT | __GFP_NORETRY | __GFP_NOWARN, vec_size);
+	if (!bio) {
+dmio:
+		use_dmio(b, rw, sector, n_sectors, offset);
+		return;
+	}
+
+	bio->bi_iter.bi_sector = sector;
+	bio_set_dev(bio, b->c->bdev);
+	bio_set_op_attrs(bio, rw, 0);
+	bio->bi_end_io = bio_complete;
+	bio->bi_private = b;
 
 	ptr = (char *)b->data + offset;
 	len = n_sectors << SECTOR_SHIFT;
 
 	do {
 		unsigned this_step = min((unsigned)(PAGE_SIZE - offset_in_page(ptr)), len);
-		if (!bio_add_page(&b->bio, virt_to_page(ptr), this_step,
+		if (!bio_add_page(bio, virt_to_page(ptr), this_step,
 				  offset_in_page(ptr))) {
-			use_dmio(b, rw, sector, n_sectors, offset, end_io);
-			return;
+			bio_put(bio);
+			goto dmio;
 		}
 
 		len -= this_step;
 		ptr += this_step;
 	} while (len > 0);
 
-	submit_bio(&b->bio);
+	submit_bio(bio);
 }
 
-static void submit_io(struct dm_buffer *b, int rw, bio_end_io_t *end_io)
+static void submit_io(struct dm_buffer *b, int rw, void (*end_io)(struct dm_buffer *, blk_status_t))
 {
 	unsigned n_sectors;
 	sector_t sector;
 	unsigned offset, end;
 
+	b->end_io = end_io;
+
 	if (likely(b->c->sectors_per_block_bits >= 0))
 		sector = b->block << b->c->sectors_per_block_bits;
 	else
@@ -652,11 +642,10 @@ static void submit_io(struct dm_buffer *
 		n_sectors = (end - offset) >> SECTOR_SHIFT;
 	}
 
-	if (n_sectors <= ((DM_BUFIO_INLINE_VECS * PAGE_SIZE) >> SECTOR_SHIFT) &&
-	    b->data_mode != DATA_MODE_VMALLOC)
-		use_inline_bio(b, rw, sector, n_sectors, offset, end_io);
+	if (b->data_mode != DATA_MODE_VMALLOC)
+		use_bio(b, rw, sector, n_sectors, offset);
 	else
-		use_dmio(b, rw, sector, n_sectors, offset, end_io);
+		use_dmio(b, rw, sector, n_sectors, offset);
 }
 
 /*----------------------------------------------------------------
@@ -669,16 +658,14 @@ static void submit_io(struct dm_buffer *
  * Set the error, clear B_WRITING bit and wake anyone who was waiting on
  * it.
  */
-static void write_endio(struct bio *bio)
+static void write_endio(struct dm_buffer *b, blk_status_t status)
 {
-	struct dm_buffer *b = container_of(bio, struct dm_buffer, bio);
-
-	b->write_error = bio->bi_status;
-	if (unlikely(bio->bi_status)) {
+	b->write_error = status;
+	if (unlikely(status)) {
 		struct dm_bufio_client *c = b->c;
 
 		(void)cmpxchg(&c->async_write_error, 0,
-				blk_status_to_errno(bio->bi_status));
+				blk_status_to_errno(status));
 	}
 
 	BUG_ON(!test_bit(B_WRITING, &b->state));
@@ -1055,11 +1042,9 @@ found_buffer:
  * The endio routine for reading: set the error, clear the bit and wake up
  * anyone waiting on the buffer.
  */
-static void read_endio(struct bio *bio)
+static void read_endio(struct dm_buffer *b, blk_status_t status)
 {
-	struct dm_buffer *b = container_of(bio, struct dm_buffer, bio);
-
-	b->read_error = bio->bi_status;
+	b->read_error = status;
 
 	BUG_ON(!test_bit(B_READING, &b->state));
 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch 0/8] dm-bufio patches
  2018-03-26 18:29 [patch 0/8] dm-bufio patches mpatocka
                   ` (7 preceding siblings ...)
  2018-03-26 18:29 ` [patch 8/8] dm-bufio: dont embed bio in dm_buffer mpatocka
@ 2018-03-26 18:57 ` Mike Snitzer
  2018-03-26 20:20   ` Mikulas Patocka
  8 siblings, 1 reply; 12+ messages in thread
From: Mike Snitzer @ 2018-03-26 18:57 UTC (permalink / raw)
  To: mpatocka; +Cc: dm-devel, agk

On Mon, Mar 26 2018 at  2:29pm -0400,
mpatocka@redhat.com <mpatocka@redhat.com> wrote:

> Hi
> 
> Here I'm sending the second version of dm-bufio patches for arbitrary
> sector size.
> 
> I dropped the patch that removes alloc_pages because it turns out that
> alloc_pages is faster than the slab allocator.

I had already staged the previous set, so I need to go back over that
and update accordingly.

> I also added a patch that removes struct bio from struct dm_buffer
> (reducing the size of dm_buffer to 1/3) and allocates the bio only when
> doing read or write.

OK, I'll go over your new set and try to tease out any differences.

Mike

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch 0/8] dm-bufio patches
  2018-03-26 18:57 ` [patch 0/8] dm-bufio patches Mike Snitzer
@ 2018-03-26 20:20   ` Mikulas Patocka
  2018-03-27  0:55     ` Mike Snitzer
  0 siblings, 1 reply; 12+ messages in thread
From: Mikulas Patocka @ 2018-03-26 20:20 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: dm-devel, agk



On Mon, 26 Mar 2018, Mike Snitzer wrote:

> On Mon, Mar 26 2018 at  2:29pm -0400,
> mpatocka@redhat.com <mpatocka@redhat.com> wrote:
> 
> > Hi
> > 
> > Here I'm sending the second version of dm-bufio patches for arbitrary
> > sector size.
> > 
> > I dropped the patch that removes alloc_pages because it turns out that
> > alloc_pages is faster than the slab allocator.
> 
> I had already staged the previous set, so I need to go back over that
> and update accordingly.
> 
> > I also added a patch that removes struct bio from struct dm_buffer
> > (reducing the size of dm_buffer to 1/3) and allocates the bio only when
> > doing read or write.
> 
> OK, I'll go over your new set and try to tease out any differences.
> 
> Mike

One more patch that I forgot to include.



From: Mikulas Patocka <mpatocka@redhat.com>
Subject: [PATCH] dm-bufio: delete outdated comment

This comment was true when dm-bufio was written, but since the kernel 4.3,
bios can have arbitrary size and the driver splits them.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-bufio.c |    4 ----
 1 file changed, 4 deletions(-)

Index: linux-2.6/drivers/md/dm-bufio.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-bufio.c	2018-03-13 16:38:32.257954000 +0100
+++ linux-2.6/drivers/md/dm-bufio.c	2018-03-13 16:38:32.247954000 +0100
@@ -540,10 +540,6 @@ static void __relink_lru(struct dm_buffe
  *
  *	the memory must be direct-mapped, not vmalloced;
  *
- *	the I/O driver can reject requests spuriously if it thinks that
- *	the requests are too big for the device or if they cross a
- *	controller-defined memory boundary.
- *
  * If the buffer is small enough (up to DM_BUFIO_INLINE_VECS pages) and
  * it is not vmalloced, try using the bio interface.
  *

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [patch 0/8] dm-bufio patches
  2018-03-26 20:20   ` Mikulas Patocka
@ 2018-03-27  0:55     ` Mike Snitzer
  0 siblings, 0 replies; 12+ messages in thread
From: Mike Snitzer @ 2018-03-27  0:55 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: dm-devel, agk

On Mon, Mar 26 2018 at  4:20pm -0400,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> 
> 
> On Mon, 26 Mar 2018, Mike Snitzer wrote:
> 
> > On Mon, Mar 26 2018 at  2:29pm -0400,
> > mpatocka@redhat.com <mpatocka@redhat.com> wrote:
> > 
> > > Hi
> > > 
> > > Here I'm sending the second version of dm-bufio patches for arbitrary
> > > sector size.
> > > 
> > > I dropped the patch that removes alloc_pages because it turns out that
> > > alloc_pages is faster than the slab allocator.
> > 
> > I had already staged the previous set, so I need to go back over that
> > and update accordingly.
> > 
> > > I also added a patch that removes struct bio from struct dm_buffer
> > > (reducing the size of dm_buffer to 1/3) and allocates the bio only when
> > > doing read or write.
> > 
> > OK, I'll go over your new set and try to tease out any differences.
> > 
> > Mike
> 
> One more patch that I forgot to include.
> 
> 
> 
> From: Mikulas Patocka <mpatocka@redhat.com>
> Subject: [PATCH] dm-bufio: delete outdated comment
> 
> This comment was true when dm-bufio was written, but since the kernel 4.3,
> bios can have arbitrary size and the driver splits them.
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

I had already included this one back on March 4.

I've pushed dm-4.17 with all your bufio changes.  I tweaked headers and
some whitespace, etc.  Please review and verify all looks fine, thanks:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-4.17

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-03-27  0:55 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-03-26 18:29 [patch 0/8] dm-bufio patches mpatocka
2018-03-26 18:29 ` [patch 1/8] dm-bufio: move dm-bufio.h to include/linux/ mpatocka
2018-03-26 18:29 ` [patch 2/8] dm-bufio: get rid of slab cache name allocations mpatocka
2018-03-26 18:29 ` [patch 3/8] dm-bufio: remove code that merges slab caches mpatocka
2018-03-26 18:29 ` [patch 4/8] dm-bufio: fix slab cache attributes mpatocka
2018-03-26 18:29 ` [patch 5/8] dm-bufio recorder fields in dm-buffer mpatocka
2018-03-26 18:29 ` [patch 6/8] dm-bufio: use slab cache for dm_buffer mpatocka
2018-03-26 18:29 ` [patch 7/8] dm-bufio: support non-power-of-two block sizes mpatocka
2018-03-26 18:29 ` [patch 8/8] dm-bufio: dont embed bio in dm_buffer mpatocka
2018-03-26 18:57 ` [patch 0/8] dm-bufio patches Mike Snitzer
2018-03-26 20:20   ` Mikulas Patocka
2018-03-27  0:55     ` Mike Snitzer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.