* [PATCH 1/6] block: convert bio.__bi_cnt from atomic_t to refcount_t
From: Elena Reshetova @ 2017-10-20 8:15 UTC
To: axboe
Cc: james.bottomley, linux-kernel, linux-block, linux-scsi,
linux-btrfs, peterz, gregkh, fujita.tomonori, mingo, clm, jbacik,
dsterba, keescook, Elena Reshetova
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable bio.__bi_cnt is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.
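To illustrate the shape of the conversion, here is a minimal before/after
sketch on a made-up object (the struct and function names are hypothetical,
for illustration only; the real changes are in the diff below):

#include <linux/refcount.h>
#include <linux/slab.h>

struct foo {
        refcount_t refcnt;                      /* was: atomic_t refcnt; */
};

static struct foo *foo_alloc(void)
{
        struct foo *f = kzalloc(sizeof(*f), GFP_KERNEL);

        if (f)
                refcount_set(&f->refcnt, 1);    /* was: atomic_set(.., 1) */
        return f;
}

static void foo_get(struct foo *f)
{
        /* was: atomic_inc(); refcount_inc() WARNs and saturates on
         * overflow instead of silently wrapping around. */
        refcount_inc(&f->refcnt);
}

static void foo_put(struct foo *f)
{
        /* was: atomic_dec_and_test(); an underflow now WARNs and
         * saturates instead of going negative. */
        if (refcount_dec_and_test(&f->refcnt))
                kfree(f);
}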
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
---
block/bio.c | 6 +++---
fs/btrfs/volumes.c | 2 +-
include/linux/bio.h | 4 ++--
include/linux/blk_types.h | 3 ++-
4 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index 101c2a9..58edc1b 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -279,7 +279,7 @@ void bio_init(struct bio *bio, struct bio_vec *table,
{
memset(bio, 0, sizeof(*bio));
atomic_set(&bio->__bi_remaining, 1);
- atomic_set(&bio->__bi_cnt, 1);
+ refcount_set(&bio->__bi_cnt, 1);
bio->bi_io_vec = table;
bio->bi_max_vecs = max_vecs;
@@ -557,12 +557,12 @@ void bio_put(struct bio *bio)
if (!bio_flagged(bio, BIO_REFFED))
bio_free(bio);
else {
- BIO_BUG_ON(!atomic_read(&bio->__bi_cnt));
+ BIO_BUG_ON(!refcount_read(&bio->__bi_cnt));
/*
* last put frees it
*/
- if (atomic_dec_and_test(&bio->__bi_cnt))
+ if (refcount_dec_and_test(&bio->__bi_cnt))
bio_free(bio);
}
}
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b397375..11812ee 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -450,7 +450,7 @@ static noinline void run_scheduled_bios(struct btrfs_device *device)
waitqueue_active(&fs_info->async_submit_wait))
wake_up(&fs_info->async_submit_wait);
- BUG_ON(atomic_read(&cur->__bi_cnt) == 0);
+ BUG_ON(refcount_read(&cur->__bi_cnt) == 0);
/*
* if we're doing the sync list, record that our
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 275c91c..0fa4dd2 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -253,7 +253,7 @@ static inline void bio_get(struct bio *bio)
{
bio->bi_flags |= (1 << BIO_REFFED);
smp_mb__before_atomic();
- atomic_inc(&bio->__bi_cnt);
+ refcount_inc(&bio->__bi_cnt);
}
static inline void bio_cnt_set(struct bio *bio, unsigned int count)
@@ -262,7 +262,7 @@ static inline void bio_cnt_set(struct bio *bio, unsigned int count)
bio->bi_flags |= (1 << BIO_REFFED);
smp_mb__before_atomic();
}
- atomic_set(&bio->__bi_cnt, count);
+ refcount_set(&bio->__bi_cnt, count);
}
static inline bool bio_flagged(struct bio *bio, unsigned int bit)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index a2d2aa7..1ec370e 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -7,6 +7,7 @@
#include <linux/types.h>
#include <linux/bvec.h>
+#include <linux/refcount.h>
struct bio_set;
struct bio;
@@ -104,7 +105,7 @@ struct bio {
unsigned short bi_max_vecs; /* max bvl_vecs we can hold */
- atomic_t __bi_cnt; /* pin count */
+ refcount_t __bi_cnt; /* pin count */
struct bio_vec *bi_io_vec; /* the actual vec list */
--
2.7.4
* [PATCH 2/6] block: convert blk_queue_tag.refcnt from atomic_t to refcount_t
From: Elena Reshetova @ 2017-10-20 8:15 UTC
To: axboe
Cc: james.bottomley, linux-kernel, linux-block, linux-scsi,
linux-btrfs, peterz, gregkh, fujita.tomonori, mingo, clm, jbacik,
dsterba, keescook, Elena Reshetova
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable blk_queue_tag.refcnt is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
---
block/blk-tag.c | 8 ++++----
include/linux/blkdev.h | 3 ++-
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/block/blk-tag.c b/block/blk-tag.c
index e1a9c15..a7263e3 100644
--- a/block/blk-tag.c
+++ b/block/blk-tag.c
@@ -35,7 +35,7 @@ EXPORT_SYMBOL(blk_queue_find_tag);
*/
void blk_free_tags(struct blk_queue_tag *bqt)
{
- if (atomic_dec_and_test(&bqt->refcnt)) {
+ if (refcount_dec_and_test(&bqt->refcnt)) {
BUG_ON(find_first_bit(bqt->tag_map, bqt->max_depth) <
bqt->max_depth);
@@ -130,7 +130,7 @@ static struct blk_queue_tag *__blk_queue_init_tags(struct request_queue *q,
if (init_tag_map(q, tags, depth))
goto fail;
- atomic_set(&tags->refcnt, 1);
+ refcount_set(&tags->refcnt, 1);
tags->alloc_policy = alloc_policy;
tags->next_tag = 0;
return tags;
@@ -180,7 +180,7 @@ int blk_queue_init_tags(struct request_queue *q, int depth,
queue_flag_set(QUEUE_FLAG_QUEUED, q);
return 0;
} else
- atomic_inc(&tags->refcnt);
+ refcount_inc(&tags->refcnt);
/*
* assign it, all done
@@ -225,7 +225,7 @@ int blk_queue_resize_tags(struct request_queue *q, int new_depth)
* Currently cannot replace a shared tag map with a new
* one, so error out if this is the case
*/
- if (atomic_read(&bqt->refcnt) != 1)
+ if (refcount_read(&bqt->refcnt) != 1)
return -EBUSY;
/*
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 02fa42d..1fefdbb 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -26,6 +26,7 @@
#include <linux/percpu-refcount.h>
#include <linux/scatterlist.h>
#include <linux/blkzoned.h>
+#include <linux/refcount.h>
struct module;
struct scsi_ioctl_command;
@@ -295,7 +296,7 @@ struct blk_queue_tag {
unsigned long *tag_map; /* bit map of free/busy tags */
int max_depth; /* what we will send to device */
int real_max_depth; /* what the array can hold */
- atomic_t refcnt; /* map can be shared */
+ refcount_t refcnt; /* map can be shared */
int alloc_policy; /* tag allocation policy */
int next_tag; /* next tag */
};
--
2.7.4
* [PATCH 3/6] block: convert blkcg_gq.refcnt from atomic_t to refcount_t
From: Elena Reshetova @ 2017-10-20 8:15 UTC
To: axboe
Cc: james.bottomley, linux-kernel, linux-block, linux-scsi,
linux-btrfs, peterz, gregkh, fujita.tomonori, mingo, clm, jbacik,
dsterba, keescook, Elena Reshetova
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable blkcg_gq.refcnt is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.
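One detail in the hunks below: the WARN_ON_ONCE() conditions change from
"<= 0" to "== 0". refcount_read() returns an unsigned int, so a "<= 0"
test would degenerate to "== 0" anyway; spelling it "== 0" avoids a
tautological comparison. For reference (illustrative, matching the
refcount_read() of this era):

static inline unsigned int refcount_read(const refcount_t *r)
{
        return atomic_read(&r->refs);
}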
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
---
block/blk-cgroup.c | 2 +-
include/linux/blk-cgroup.h | 11 ++++++-----
2 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index d3f56ba..1e7cedc 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -107,7 +107,7 @@ static struct blkcg_gq *blkg_alloc(struct blkcg *blkcg, struct request_queue *q,
blkg->q = q;
INIT_LIST_HEAD(&blkg->q_node);
blkg->blkcg = blkcg;
- atomic_set(&blkg->refcnt, 1);
+ refcount_set(&blkg->refcnt, 1);
/* root blkg uses @q->root_rl, init rl only for !root blkgs */
if (blkcg != &blkcg_root) {
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 9d92153..c95d29d 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -19,6 +19,7 @@
#include <linux/radix-tree.h>
#include <linux/blkdev.h>
#include <linux/atomic.h>
+#include <linux/refcount.h>
/* percpu_counter batch for blkg_[rw]stats, per-cpu drift doesn't matter */
#define BLKG_STAT_CPU_BATCH (INT_MAX / 2)
@@ -122,7 +123,7 @@ struct blkcg_gq {
struct request_list rl;
/* reference count */
- atomic_t refcnt;
+ refcount_t refcnt;
/* is this blkg online? protected by both blkcg and q locks */
bool online;
@@ -354,8 +355,8 @@ static inline int blkg_path(struct blkcg_gq *blkg, char *buf, int buflen)
*/
static inline void blkg_get(struct blkcg_gq *blkg)
{
- WARN_ON_ONCE(atomic_read(&blkg->refcnt) <= 0);
- atomic_inc(&blkg->refcnt);
+ WARN_ON_ONCE(refcount_read(&blkg->refcnt) == 0);
+ refcount_inc(&blkg->refcnt);
}
void __blkg_release_rcu(struct rcu_head *rcu);
@@ -366,8 +367,8 @@ void __blkg_release_rcu(struct rcu_head *rcu);
*/
static inline void blkg_put(struct blkcg_gq *blkg)
{
- WARN_ON_ONCE(atomic_read(&blkg->refcnt) <= 0);
- if (atomic_dec_and_test(&blkg->refcnt))
+ WARN_ON_ONCE(refcount_read(&blkg->refcnt) == 0);
+ if (refcount_dec_and_test(&blkg->refcnt))
call_rcu(&blkg->rcu_head, __blkg_release_rcu);
}
--
2.7.4
* [PATCH 4/6] block: convert io_context.active_ref from atomic_t to refcount_t
From: Elena Reshetova @ 2017-10-20 8:16 UTC
To: axboe
Cc: james.bottomley, linux-kernel, linux-block, linux-scsi,
linux-btrfs, peterz, gregkh, fujita.tomonori, mingo, clm, jbacik,
dsterba, keescook, Elena Reshetova
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable io_context.active_ref is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
---
block/bfq-iosched.c | 2 +-
block/blk-ioc.c | 4 ++--
block/cfq-iosched.c | 4 ++--
include/linux/iocontext.h | 7 ++++---
4 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index a4783da..1ec9b22 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -4030,7 +4030,7 @@ static void bfq_update_has_short_ttime(struct bfq_data *bfqd,
* bfqq. Otherwise check average think time to
* decide whether to mark as has_short_ttime
*/
- if (atomic_read(&bic->icq.ioc->active_ref) == 0 ||
+ if (refcount_read(&bic->icq.ioc->active_ref) == 0 ||
(bfq_sample_valid(bfqq->ttime.ttime_samples) &&
bfqq->ttime.ttime_mean > bfqd->bfq_slice_idle))
has_short_ttime = false;
diff --git a/block/blk-ioc.c b/block/blk-ioc.c
index 63898d2..69704d2 100644
--- a/block/blk-ioc.c
+++ b/block/blk-ioc.c
@@ -176,7 +176,7 @@ void put_io_context_active(struct io_context *ioc)
unsigned long flags;
struct io_cq *icq;
- if (!atomic_dec_and_test(&ioc->active_ref)) {
+ if (!refcount_dec_and_test(&ioc->active_ref)) {
put_io_context(ioc);
return;
}
@@ -275,7 +275,7 @@ int create_task_io_context(struct task_struct *task, gfp_t gfp_flags, int node)
/* initialize */
atomic_long_set(&ioc->refcount, 1);
atomic_set(&ioc->nr_tasks, 1);
- atomic_set(&ioc->active_ref, 1);
+ refcount_set(&ioc->active_ref, 1);
spin_lock_init(&ioc->lock);
INIT_RADIX_TREE(&ioc->icq_tree, GFP_ATOMIC | __GFP_HIGH);
INIT_HLIST_HEAD(&ioc->icq_list);
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 9f342ef..e6d5d6d 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -2941,7 +2941,7 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd)
* task has exited, don't wait
*/
cic = cfqd->active_cic;
- if (!cic || !atomic_read(&cic->icq.ioc->active_ref))
+ if (!cic || !refcount_read(&cic->icq.ioc->active_ref))
return;
/*
@@ -3933,7 +3933,7 @@ cfq_update_idle_window(struct cfq_data *cfqd, struct cfq_queue *cfqq,
if (cfqq->next_rq && req_noidle(cfqq->next_rq))
enable_idle = 0;
- else if (!atomic_read(&cic->icq.ioc->active_ref) ||
+ else if (!refcount_read(&cic->icq.ioc->active_ref) ||
!cfqd->cfq_slice_idle ||
(!cfq_cfqq_deep(cfqq) && CFQQ_SEEKY(cfqq)))
enable_idle = 0;
diff --git a/include/linux/iocontext.h b/include/linux/iocontext.h
index df38db2..a1e28c3 100644
--- a/include/linux/iocontext.h
+++ b/include/linux/iocontext.h
@@ -3,6 +3,7 @@
#include <linux/radix-tree.h>
#include <linux/rcupdate.h>
+#include <linux/refcount.h>
#include <linux/workqueue.h>
enum {
@@ -96,7 +97,7 @@ struct io_cq {
*/
struct io_context {
atomic_long_t refcount;
- atomic_t active_ref;
+ refcount_t active_ref;
atomic_t nr_tasks;
/* all the fields below are protected by this lock */
@@ -128,9 +129,9 @@ struct io_context {
static inline void get_io_context_active(struct io_context *ioc)
{
WARN_ON_ONCE(atomic_long_read(&ioc->refcount) <= 0);
- WARN_ON_ONCE(atomic_read(&ioc->active_ref) <= 0);
+ WARN_ON_ONCE(refcount_read(&ioc->active_ref) == 0);
atomic_long_inc(&ioc->refcount);
- atomic_inc(&ioc->active_ref);
+ refcount_inc(&ioc->active_ref);
}
static inline void ioc_task_link(struct io_context *ioc)
--
2.7.4
* [PATCH 5/6] block: convert bsg_device.ref_count from atomic_t to refcount_t
From: Elena Reshetova @ 2017-10-20 8:16 UTC
To: axboe
Cc: james.bottomley, linux-kernel, linux-block, linux-scsi,
linux-btrfs, peterz, gregkh, fujita.tomonori, mingo, clm, jbacik,
dsterba, keescook, Elena Reshetova
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable bsg_device.ref_count is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
---
block/bsg.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/block/bsg.c b/block/bsg.c
index ee1335c..6c98422 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -21,6 +21,7 @@
#include <linux/idr.h>
#include <linux/bsg.h>
#include <linux/slab.h>
+#include <linux/refcount.h>
#include <scsi/scsi.h>
#include <scsi/scsi_ioctl.h>
@@ -38,7 +39,7 @@ struct bsg_device {
struct list_head busy_list;
struct list_head done_list;
struct hlist_node dev_list;
- atomic_t ref_count;
+ refcount_t ref_count;
int queued_cmds;
int done_cmds;
wait_queue_head_t wq_done;
@@ -710,7 +711,7 @@ static int bsg_put_device(struct bsg_device *bd)
mutex_lock(&bsg_mutex);
- do_free = atomic_dec_and_test(&bd->ref_count);
+ do_free = refcount_dec_and_test(&bd->ref_count);
if (!do_free) {
mutex_unlock(&bsg_mutex);
goto out;
@@ -768,7 +769,7 @@ static struct bsg_device *bsg_add_device(struct inode *inode,
bsg_set_block(bd, file);
- atomic_set(&bd->ref_count, 1);
+ refcount_set(&bd->ref_count, 1);
mutex_lock(&bsg_mutex);
hlist_add_head(&bd->dev_list, bsg_dev_idx_hash(iminor(inode)));
@@ -788,7 +789,7 @@ static struct bsg_device *__bsg_get_device(int minor, struct request_queue *q)
hlist_for_each_entry(bd, bsg_dev_idx_hash(minor), dev_list) {
if (bd->queue == q) {
- atomic_inc(&bd->ref_count);
+ refcount_inc(&bd->ref_count);
goto found;
}
}
--
2.7.4
* [PATCH 6/6] drivers, block: convert xen_blkif.refcnt from atomic_t to refcount_t
From: Elena Reshetova @ 2017-10-20 8:16 UTC
To: axboe
Cc: james.bottomley, linux-kernel, linux-block, linux-scsi,
linux-btrfs, peterz, gregkh, fujita.tomonori, mingo, clm, jbacik,
dsterba, keescook, Elena Reshetova
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.

The variable xen_blkif.refcnt is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
---
drivers/block/xen-blkback/common.h | 7 ++++---
drivers/block/xen-blkback/xenbus.c | 2 +-
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index ecb35fe..0c3320d 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -35,6 +35,7 @@
#include <linux/wait.h>
#include <linux/io.h>
#include <linux/rbtree.h>
+#include <linux/refcount.h>
#include <asm/setup.h>
#include <asm/pgalloc.h>
#include <asm/hypervisor.h>
@@ -319,7 +320,7 @@ struct xen_blkif {
struct xen_vbd vbd;
/* Back pointer to the backend_info. */
struct backend_info *be;
- atomic_t refcnt;
+ refcount_t refcnt;
/* for barrier (drain) requests */
struct completion drain_complete;
atomic_t drain;
@@ -372,10 +373,10 @@ struct pending_req {
(_v)->bdev->bd_part->nr_sects : \
get_capacity((_v)->bdev->bd_disk))
-#define xen_blkif_get(_b) (atomic_inc(&(_b)->refcnt))
+#define xen_blkif_get(_b) (refcount_inc(&(_b)->refcnt))
#define xen_blkif_put(_b) \
do { \
- if (atomic_dec_and_test(&(_b)->refcnt)) \
+ if (refcount_dec_and_test(&(_b)->refcnt)) \
schedule_work(&(_b)->free_work);\
} while (0)
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 21c1be1..5955b61 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -176,7 +176,7 @@ static struct xen_blkif *xen_blkif_alloc(domid_t domid)
return ERR_PTR(-ENOMEM);
blkif->domid = domid;
- atomic_set(&blkif->refcnt, 1);
+ refcount_set(&blkif->refcnt, 1);
init_completion(&blkif->drain_complete);
INIT_WORK(&blkif->free_work, xen_blkif_deferred_free);
--
2.7.4
* Re: [PATCH 0/6] v4 block refcount conversion patches
From: Johannes Thumshirn @ 2017-10-20 8:43 UTC
To: Elena Reshetova
Cc: axboe, james.bottomley, linux-kernel, linux-block, linux-scsi,
linux-btrfs, peterz, gregkh, fujita.tomonori, mingo, clm, jbacik,
dsterba, keescook
Elena Reshetova <elena.reshetova@intel.com> writes:
> Elena Reshetova (6):
> block: convert bio.__bi_cnt from atomic_t to refcount_t
> block: convert blk_queue_tag.refcnt from atomic_t to refcount_t
> block: convert blkcg_gq.refcnt from atomic_t to refcount_t
> block: convert io_context.active_ref from atomic_t to refcount_t
> block: convert bsg_device.ref_count from atomic_t to refcount_t
> drivers, block: convert xen_blkif.refcnt from atomic_t to refcount_t
Hi Elena,
While the bsg ref_count is cheap, do you have any numbers on how the other
conversions compare in performance (throughput and latency) vs. atomics?
It should be quite easy to measure against a null_blk device.
Thanks a lot,
Johannes
--
Johannes Thumshirn Storage
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
* RE: [PATCH 0/6] v4 block refcount conversion patches
From: Reshetova, Elena @ 2017-10-20 10:25 UTC
To: Johannes Thumshirn
Cc: axboe@kernel.dk, james.bottomley@hansenpartnership.com,
linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
linux-scsi@vger.kernel.org, linux-btrfs@vger.kernel.org,
peterz@infradead.org, gregkh@linuxfoundation.org,
fujita.tomonori@lab.ntt.co.jp, mingo@redhat.com, clm@fb.com,
jbacik@fb.com, dsterba@suse.com, keescook@chromium.org
> Elena Reshetova <elena.reshetova@intel.com> writes:
> > Elena Reshetova (6):
> > block: convert bio.__bi_cnt from atomic_t to refcount_t
> > block: convert blk_queue_tag.refcnt from atomic_t to refcount_t
> > block: convert blkcg_gq.refcnt from atomic_t to refcount_t
> > block: convert io_context.active_ref from atomic_t to refcount_t
> > block: convert bsg_device.ref_count from atomic_t to refcount_t
> > drivers, block: convert xen_blkif.refcnt from atomic_t to refcount_t
>
> Hi Elena,
>
> While the bsg ref_count is cheap, do you have any numbers how the other
> conversions compare in performance (throughput and latency) vs atomics?
Hi Johannes,
The performance would depend on which "breed" of refcount_t is used underneath.
We currently have 3 versions:
- refcount_t defaults to atomic_t (no CONFIG_REFCOUNT_FULL enabled, no arch support).
  Impact is zero in this case, since plain atomic functions are used underneath.
- refcount_t uses an arch-specific implementation (arch enables ARCH_HAS_REFCOUNT).
  Impact depends on the arch implementation. Currently only x86 provides one.
- refcount_t uses the "full" arch-independent implementation.
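As a rough sketch of how that selection is wired up (abridged and
paraphrased, not the verbatim include/linux/refcount.h):

#ifdef CONFIG_REFCOUNT_FULL
/* Fully checked, arch-independent, out-of-line implementation. */
extern void refcount_inc(refcount_t *r);
extern __must_check bool refcount_dec_and_test(refcount_t *r);
#elif defined(CONFIG_ARCH_HAS_REFCOUNT)
/* Arch-provided fast path (currently only x86, asm with overflow traps). */
#include <asm/refcount.h>
#else
/* No protection: thin wrappers around plain atomics. */
static inline void refcount_inc(refcount_t *r)
{
        atomic_inc(&r->refs);
}
static inline __must_check bool refcount_dec_and_test(refcount_t *r)
{
        return atomic_dec_and_test(&r->refs);
}
#endif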
Here are cycle numbers comparing these 3 (https://lwn.net/Articles/728626/).
Just copy-pasting for convenience:
">These are the cycle counts comparing a loop of refcount_inc() from 1
>to INT_MAX and back down to 0 (via refcount_dec_and_test()), between
>unprotected refcount_t (atomic_t), fully protected REFCOUNT_FULL
>(refcount_t-full), and this overflow-protected refcount (refcount_t-fast):
>2147483646 refcount_inc()s and 2147483647 refcount_dec_and_test()s:
>
>                   cycles         protections
> atomic_t          82249267387    none
> refcount_t-fast   82211446892    overflow, untested dec-to-zero
> refcount_t-full   144814735193   overflow, untested dec-to-zero, inc-from-zero"
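(For concreteness, the loop being timed there would look roughly like this
in a throwaway test module; a hypothetical sketch, not the actual benchmark
code behind those numbers:)

#include <linux/module.h>
#include <linux/refcount.h>
#include <linux/timex.h>

static int __init refbench_init(void)
{
        refcount_t r;
        cycles_t start, end;
        unsigned int i;

        refcount_set(&r, 1);
        start = get_cycles();
        for (i = 1; i < INT_MAX; i++)           /* 2147483646 incs */
                refcount_inc(&r);
        while (!refcount_dec_and_test(&r))      /* 2147483647 decs */
                ;
        end = get_cycles();
        pr_info("refbench: %llu cycles\n", (unsigned long long)(end - start));
        return 0;
}
module_init(refbench_init);
MODULE_LICENSE("GPL");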
So, the middle option (called here refcount_t-fast), with an arch-specific
implementation, has a negligible impact. The "full" one is pricier, but it is
disabled by default anyway, so only people who want strict security enable it.
Are these numbers convincing enough that we don't have to measure
the block devices? :)
Best Regards,
Elena.
>
> It should be quite easy to measure against a null_blk device.
>
> Thanks a lot,
> Johannes
>
> --
> Johannes Thumshirn Storage
> jthumshirn@suse.de +49 911 74053 689
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850