* [PATCH 1/3] fuse: make request allocations for background processing explicit
2012-12-26 12:44 [PATCH 0/3] fuse: fix accounting background requests Maxim Patlasov
@ 2012-12-26 12:44 ` Maxim Patlasov
2012-12-26 12:44 ` [PATCH 2/3] fuse: skip blocking on allocations of synchronous requests Maxim Patlasov
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: Maxim Patlasov @ 2012-12-26 12:44 UTC (permalink / raw)
To: miklos; +Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel
There are two types of processing requests in FUSE: synchronous (via
fuse_request_send()) and asynchronous (via adding to fc->bg_queue).
Fortunately, the type of processing is always known in advance, at the time
of request allocation. This preparatory patch utilizes this fact making
fuse_get_req() aware about the type. Next patches will use it.
Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
fs/fuse/cuse.c | 2 +-
fs/fuse/dev.c | 22 +++++++++++++++++++---
fs/fuse/file.c | 5 +++--
fs/fuse/fuse_i.h | 3 +++
fs/fuse/inode.c | 1 +
5 files changed, 27 insertions(+), 6 deletions(-)
diff --git a/fs/fuse/cuse.c b/fs/fuse/cuse.c
index ee8d550..37e18dc 100644
--- a/fs/fuse/cuse.c
+++ b/fs/fuse/cuse.c
@@ -411,7 +411,7 @@ static int cuse_send_init(struct cuse_conn *cc)
BUILD_BUG_ON(CUSE_INIT_INFO_MAX > PAGE_SIZE);
- req = fuse_get_req(fc);
+ req = fuse_get_req_for_background(fc);
if (IS_ERR(req)) {
rc = PTR_ERR(req);
goto err;
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 8c23fa7..0b6b9d1 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -55,8 +55,10 @@ EXPORT_SYMBOL_GPL(fuse_request_alloc);
struct fuse_req *fuse_request_alloc_nofs(void)
{
struct fuse_req *req = kmem_cache_alloc(fuse_req_cachep, GFP_NOFS);
- if (req)
+ if (req) {
fuse_request_init(req);
+ req->background = 1; /* writeback always goes to bg_queue */
+ }
return req;
}
@@ -97,7 +99,8 @@ static void fuse_req_init_context(struct fuse_req *req)
req->in.h.pid = current->pid;
}
-struct fuse_req *fuse_get_req(struct fuse_conn *fc)
+struct fuse_req *fuse_get_req_internal(struct fuse_conn *fc,
+ bool for_background)
{
struct fuse_req *req;
sigset_t oldset;
@@ -123,14 +126,26 @@ struct fuse_req *fuse_get_req(struct fuse_conn *fc)
fuse_req_init_context(req);
req->waiting = 1;
+ req->background = for_background;
return req;
out:
atomic_dec(&fc->num_waiting);
return ERR_PTR(err);
}
+
+struct fuse_req *fuse_get_req(struct fuse_conn *fc)
+{
+ return fuse_get_req_internal(fc, 0);
+}
EXPORT_SYMBOL_GPL(fuse_get_req);
+struct fuse_req *fuse_get_req_for_background(struct fuse_conn *fc)
+{
+ return fuse_get_req_internal(fc, 1);
+}
+EXPORT_SYMBOL_GPL(fuse_get_req_for_background);
+
/*
* Return request in fuse_file->reserved_req. However that may
* currently be in use. If that is the case, wait for it to become
@@ -408,6 +423,7 @@ __acquires(fc->lock)
void fuse_request_send(struct fuse_conn *fc, struct fuse_req *req)
{
+ BUG_ON(req->background);
req->isreply = 1;
spin_lock(&fc->lock);
if (!fc->connected)
@@ -430,7 +446,7 @@ EXPORT_SYMBOL_GPL(fuse_request_send);
static void fuse_request_send_nowait_locked(struct fuse_conn *fc,
struct fuse_req *req)
{
- req->background = 1;
+ BUG_ON(!req->background);
fc->num_background++;
if (fc->num_background == fc->max_background)
fc->blocked = 1;
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 78d2837..86101ce 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -131,6 +131,7 @@ static void fuse_file_put(struct fuse_file *ff, bool sync)
fuse_put_request(ff->fc, req);
} else {
req->end = fuse_release_end;
+ req->background = 1;
fuse_request_send_background(ff->fc, req);
}
kfree(ff);
@@ -657,7 +658,7 @@ static int fuse_readpages_fill(void *_data, struct page *page)
(req->num_pages + 1) * PAGE_CACHE_SIZE > fc->max_read ||
req->pages[req->num_pages - 1]->index + 1 != page->index)) {
fuse_send_readpages(req, data->file);
- data->req = req = fuse_get_req(fc);
+ data->req = req = fuse_get_req_internal(fc, fc->async_read);
if (IS_ERR(req)) {
unlock_page(page);
return PTR_ERR(req);
@@ -683,7 +684,7 @@ static int fuse_readpages(struct file *file, struct address_space *mapping,
data.file = file;
data.inode = inode;
- data.req = fuse_get_req(fc);
+ data.req = fuse_get_req_internal(fc, fc->async_read);
err = PTR_ERR(data.req);
if (IS_ERR(data.req))
goto out;
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index e24dd74..70362e9 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -671,6 +671,9 @@ void fuse_request_free(struct fuse_req *req);
* Get a request, may fail with -ENOMEM
*/
struct fuse_req *fuse_get_req(struct fuse_conn *fc);
+struct fuse_req *fuse_get_req_for_background(struct fuse_conn *fc);
+struct fuse_req *fuse_get_req_internal(struct fuse_conn *fc,
+ bool for_background);
/**
* Gets a requests for a file operation, always succeeds
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index f0eda12..c0f0eba 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1032,6 +1032,7 @@ static int fuse_fill_super(struct super_block *sb, void *data, int silent)
init_req = fuse_request_alloc();
if (!init_req)
goto err_put_root;
+ init_req->background = 1;
if (is_bdev) {
fc->destroy_req = fuse_request_alloc();
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH 2/3] fuse: skip blocking on allocations of synchronous requests
2012-12-26 12:44 [PATCH 0/3] fuse: fix accounting background requests Maxim Patlasov
2012-12-26 12:44 ` [PATCH 1/3] fuse: make request allocations for background processing explicit Maxim Patlasov
@ 2012-12-26 12:44 ` Maxim Patlasov
2012-12-26 12:45 ` [PATCH 3/3] fuse: implement exclusive wakeup for blocked_waitq Maxim Patlasov
2013-02-06 17:12 ` [PATCH 0/3] fuse: fix accounting background requests Miklos Szeredi
3 siblings, 0 replies; 6+ messages in thread
From: Maxim Patlasov @ 2012-12-26 12:44 UTC (permalink / raw)
To: miklos; +Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel
Miklos wrote:
> A task may have at most one synchronous request allocated. So these
> requests need not be otherwise limited.
The patch re-works fuse_get_req() to follow this idea.
Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
fs/fuse/dev.c | 21 +++++++++++++--------
1 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 0b6b9d1..c7bef93 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -103,17 +103,22 @@ struct fuse_req *fuse_get_req_internal(struct fuse_conn *fc,
bool for_background)
{
struct fuse_req *req;
- sigset_t oldset;
- int intr;
int err;
atomic_inc(&fc->num_waiting);
- block_sigs(&oldset);
- intr = wait_event_interruptible(fc->blocked_waitq, !fc->blocked);
- restore_sigs(&oldset);
- err = -EINTR;
- if (intr)
- goto out;
+
+ if (for_background) {
+ sigset_t oldset;
+ int intr;
+
+ block_sigs(&oldset);
+ intr = wait_event_interruptible(fc->blocked_waitq,
+ !fc->blocked);
+ restore_sigs(&oldset);
+ err = -EINTR;
+ if (intr)
+ goto out;
+ }
err = -ENOTCONN;
if (!fc->connected)
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH 3/3] fuse: implement exclusive wakeup for blocked_waitq
2012-12-26 12:44 [PATCH 0/3] fuse: fix accounting background requests Maxim Patlasov
2012-12-26 12:44 ` [PATCH 1/3] fuse: make request allocations for background processing explicit Maxim Patlasov
2012-12-26 12:44 ` [PATCH 2/3] fuse: skip blocking on allocations of synchronous requests Maxim Patlasov
@ 2012-12-26 12:45 ` Maxim Patlasov
2013-02-06 17:12 ` [PATCH 0/3] fuse: fix accounting background requests Miklos Szeredi
3 siblings, 0 replies; 6+ messages in thread
From: Maxim Patlasov @ 2012-12-26 12:45 UTC (permalink / raw)
To: miklos; +Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel
The patch solves the thundering herd problem. Since previous patches ensured
that only allocations for background may block, it's safe to wake up one
waiter. Whoever it is, it will wake up another one in request_end() afterwards.
Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
fs/fuse/dev.c | 21 ++++++++++++++++-----
1 files changed, 16 insertions(+), 5 deletions(-)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index c7bef93..7ed2096 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -112,8 +112,8 @@ struct fuse_req *fuse_get_req_internal(struct fuse_conn *fc,
int intr;
block_sigs(&oldset);
- intr = wait_event_interruptible(fc->blocked_waitq,
- !fc->blocked);
+ intr = wait_event_interruptible_exclusive(fc->blocked_waitq,
+ !fc->blocked);
restore_sigs(&oldset);
err = -EINTR;
if (intr)
@@ -224,6 +224,13 @@ struct fuse_req *fuse_get_req_nofail(struct fuse_conn *fc, struct file *file)
void fuse_put_request(struct fuse_conn *fc, struct fuse_req *req)
{
if (atomic_dec_and_test(&req->count)) {
+ if (unlikely(req->background)) {
+ spin_lock(&fc->lock);
+ if (!fc->blocked)
+ wake_up(&fc->blocked_waitq);
+ spin_unlock(&fc->lock);
+ }
+
if (req->waiting)
atomic_dec(&fc->num_waiting);
@@ -321,10 +328,14 @@ __releases(fc->lock)
list_del(&req->intr_entry);
req->state = FUSE_REQ_FINISHED;
if (req->background) {
- if (fc->num_background == fc->max_background) {
+ req->background = 0;
+
+ if (fc->num_background == fc->max_background)
fc->blocked = 0;
- wake_up_all(&fc->blocked_waitq);
- }
+
+ if (!fc->blocked)
+ wake_up(&fc->blocked_waitq);
+
if (fc->num_background == fc->congestion_threshold &&
fc->connected && fc->bdi_initialized) {
clear_bdi_congested(&fc->bdi, BLK_RW_SYNC);
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH 0/3] fuse: fix accounting background requests
2012-12-26 12:44 [PATCH 0/3] fuse: fix accounting background requests Maxim Patlasov
` (2 preceding siblings ...)
2012-12-26 12:45 ` [PATCH 3/3] fuse: implement exclusive wakeup for blocked_waitq Maxim Patlasov
@ 2013-02-06 17:12 ` Miklos Szeredi
2013-03-21 13:04 ` Maxim V. Patlasov
3 siblings, 1 reply; 6+ messages in thread
From: Miklos Szeredi @ 2013-02-06 17:12 UTC (permalink / raw)
To: Maxim Patlasov; +Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel
On Wed, Dec 26, 2012 at 1:44 PM, Maxim Patlasov <mpatlasov@parallels.com> wrote:
> Hi,
>
> The feature was added long time ago (commit 08a53cdc...) with the comment:
>
>> A task may have at most one synchronous request allocated. So these requests
>> need not be otherwise limited.
>>
>> However the number of background requests (release, forget, asynchronous
>> reads, interrupted requests) can grow indefinitely. This can be used by a
>> malicious user to cause FUSE to allocate arbitrary amounts of unswappable
>> kernel memory, denying service.
>>
>> For this reason add a limit for the number of background requests, and block
>> allocations of new requests until the number goes below the limit.
>
> However, the implementation suffers from the following problems:
>
> 1. Latency of synchronous requests. As soon as fc->num_background hits the
> limit, all allocations are blocked: both for synchronous and background
> requests. This is unnecessary - as the comment cited above states, synchronous
> requests need not be limited (by fuse). Moreover, sometimes it's very
> inconvenient. For example, a dozen of tasks aggressively writing to mmap()-ed
> area may block 'ls' for long while (>1min in my experiments).
>
> 2. Thundering herd problem. When fc->num_background falls below the limit,
> request_end() calls wake_up_all(&fc->blocked_waitq). This wakes up all waiters
> while it's quite possible that the first waiter getting a new request will
> immediately put it to background increasing fc->num_background again.
> (experimenting with mmap()-ed writes I observed 2x slowdown as compared with
> fuse after applying this patch-set)
>
> The patch-set re-works fuse_get_req (and its callers) to throttle only requests
> intended for background processing. Having this done, it becomes possible to
> use exclusive wakeups in chained manner: request_end() wakes up a waiter,
> the waiter allocates new request and submits it for background processing,
> the processing ends in request_end() where another wakeup happens and so on.
Thanks. These patches look okay.
But they don't apply to for-next. Can you please update them?
Thanks,
Miklos
>
> Thanks,
> Maxim
>
> ---
>
> Maxim Patlasov (3):
> fuse: make request allocations for background processing explicit
> fuse: skip blocking on allocations of synchronous requests
> fuse: implement exclusive wakeup for blocked_waitq
>
>
> fs/fuse/cuse.c | 2 +-
> fs/fuse/dev.c | 60 +++++++++++++++++++++++++++++++++++++++++-------------
> fs/fuse/file.c | 5 +++--
> fs/fuse/fuse_i.h | 3 +++
> fs/fuse/inode.c | 1 +
> 5 files changed, 54 insertions(+), 17 deletions(-)
>
> --
> Signature
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH 0/3] fuse: fix accounting background requests
2013-02-06 17:12 ` [PATCH 0/3] fuse: fix accounting background requests Miklos Szeredi
@ 2013-03-21 13:04 ` Maxim V. Patlasov
0 siblings, 0 replies; 6+ messages in thread
From: Maxim V. Patlasov @ 2013-03-21 13:04 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel
02/06/2013 09:12 PM, Miklos Szeredi пишет:
> On Wed, Dec 26, 2012 at 1:44 PM, Maxim Patlasov <mpatlasov@parallels.com> wrote:
>> Hi,
>>
>> The feature was added long time ago (commit 08a53cdc...) with the comment:
>>
>>> A task may have at most one synchronous request allocated. So these requests
>>> need not be otherwise limited.
>>>
>>> However the number of background requests (release, forget, asynchronous
>>> reads, interrupted requests) can grow indefinitely. This can be used by a
>>> malicious user to cause FUSE to allocate arbitrary amounts of unswappable
>>> kernel memory, denying service.
>>>
>>> For this reason add a limit for the number of background requests, and block
>>> allocations of new requests until the number goes below the limit.
>> However, the implementation suffers from the following problems:
>>
>> 1. Latency of synchronous requests. As soon as fc->num_background hits the
>> limit, all allocations are blocked: both for synchronous and background
>> requests. This is unnecessary - as the comment cited above states, synchronous
>> requests need not be limited (by fuse). Moreover, sometimes it's very
>> inconvenient. For example, a dozen of tasks aggressively writing to mmap()-ed
>> area may block 'ls' for long while (>1min in my experiments).
>>
>> 2. Thundering herd problem. When fc->num_background falls below the limit,
>> request_end() calls wake_up_all(&fc->blocked_waitq). This wakes up all waiters
>> while it's quite possible that the first waiter getting a new request will
>> immediately put it to background increasing fc->num_background again.
>> (experimenting with mmap()-ed writes I observed 2x slowdown as compared with
>> fuse after applying this patch-set)
>>
>> The patch-set re-works fuse_get_req (and its callers) to throttle only requests
>> intended for background processing. Having this done, it becomes possible to
>> use exclusive wakeups in chained manner: request_end() wakes up a waiter,
>> the waiter allocates new request and submits it for background processing,
>> the processing ends in request_end() where another wakeup happens and so on.
> Thanks. These patches look okay.
>
> But they don't apply to for-next. Can you please update them?
Sorry for long delay. I'll send updated patches soon.
Thanks,
Maxim
>
> Thanks,
> Miklos
>
>> Thanks,
>> Maxim
>>
>> ---
>>
>> Maxim Patlasov (3):
>> fuse: make request allocations for background processing explicit
>> fuse: skip blocking on allocations of synchronous requests
>> fuse: implement exclusive wakeup for blocked_waitq
>>
>>
>> fs/fuse/cuse.c | 2 +-
>> fs/fuse/dev.c | 60 +++++++++++++++++++++++++++++++++++++++++-------------
>> fs/fuse/file.c | 5 +++--
>> fs/fuse/fuse_i.h | 3 +++
>> fs/fuse/inode.c | 1 +
>> 5 files changed, 54 insertions(+), 17 deletions(-)
>>
>> --
>> Signature
^ permalink raw reply [flat|nested] 6+ messages in thread