From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-db5eur01on0093.outbound.protection.outlook.com ([104.47.2.93]:17743
	"EHLO EUR01-DB5-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL)
	by vger.kernel.org with ESMTP id S1753942AbcKPWOQ (ORCPT );
	Wed, 16 Nov 2016 17:14:16 -0500
Subject: Re: [fuse-devel] fuse: max_background and congestion_threshold settings
References: <87oa1g90nx.fsf@thinkpad.rath.org>
	<64a57faa-d3a6-a209-8728-723ed7f37c2f@virtuozzo.com>
	<87fumrmdvn.fsf@thinkpad.rath.org>
	<716677ab-f962-1628-205b-2326219f4487@virtuozzo.com>
	<877f83mb2v.fsf@thinkpad.rath.org>
CC: Miklos Szeredi , , linux-fsdevel , LKML
To: Nikolaus Rath
From: Maxim Patlasov
Message-ID: <7828c809-f699-c16f-a1aa-24ce839547ff@virtuozzo.com>
Date: Wed, 16 Nov 2016 12:41:03 -0800
MIME-Version: 1.0
In-Reply-To: <877f83mb2v.fsf@thinkpad.rath.org>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On 11/16/2016 12:19 PM, Nikolaus Rath wrote:
> On Nov 16 2016, Maxim Patlasov wrote:
>> On 11/16/2016 11:19 AM, Nikolaus Rath wrote:
>>
>>> Hi Maxim,
>>>
>>> On Nov 15 2016, Maxim Patlasov wrote:
>>>> On 11/15/2016 08:18 AM, Nikolaus Rath wrote:
>>>>> Could someone explain to me the meaning of the max_background and
>>>>> congestion_threshold settings of the fuse module?
>>>>>
>>>>> At first I assumed that max_background specifies the maximum number of
>>>>> pending requests (i.e., requests that have been sent to userspace but
>>>>> for which no reply was received yet). But looking at fs/fuse/dev.c, it
>>>>> looks as if not every request is included in this number.
>>>> fuse uses max_background for cases where the total number of
>>>> simultaneous requests of a given type is not limited by some other
>>>> natural means. AFAIU, these cases are: 1) async processing of direct
>>>> IO; 2) read-ahead.
>>>> As an example of a "natural" limitation: when a
>>>> userspace process blocks on a sync direct IO read/write, the number of
>>>> requests fuse consumes is limited by the number of such processes
>>>> (actually their threads). In contrast, if userspace requests a 1GB
>>>> direct IO read/write, it would be unreasonable to issue 1GB/128K==8192
>>>> fuse requests simultaneously. That's where max_background steps in.
>>> Ah, that makes sense. Are these two cases meant as examples, or is that
>>> an exhaustive list? Because I would have thought that other cases should
>>> be writing of cached data (when writeback caching is enabled), and
>>> asynchronous I/O from userspace...?
>> I think that's an exhaustive list, but I may be missing something.
>>
>> As for writing of cached data, that definitely doesn't go through
>> background requests. Here we rely on the flusher: fuse will allocate as
>> many requests as the flusher wants to write back.
>>
>> Buffered AIO READs actually block in io_submit until fully
>> processed. So it's just another example of the "natural" limitation I
>> described above.
> Not sure I understand. What is it that's blocking? It can't be the
> userspace process, because then it wouldn't be asynchronous I/O...

Surprise! Alas, the Linux kernel does NOT process buffered AIO reads
asynchronously. You can verify it yourself by strace-ing a simple
program looping over io_submit + io_getevents: for direct IO (as
expected) io_submit returns immediately while io_getevents waits for
the actual IO; in contrast, for buffered IO (surprisingly) io_submit
waits for the actual IO while io_getevents returns immediately.
Presumably, people are supposed to use mmap-ed reads/writes rather
than buffered AIO.

>
>>> Also, I am not sure what you mean with async processing of direct
>>> I/O. Shouldn't direct I/O always go directly to the file-system? If so,
>>> how can it be processed asynchronously?
>> That's a nice optimization we implemented a few years ago: given an
>> incoming sync direct IO request of 1MB size, kernel fuse splits it
>> into eight 128K requests and starts processing them in async manner,
>> waiting for the completion of all of them before completing that
>> incoming 1MB request.
> I see. But why isn't that also done for regular (non-direct) IO?

Regular READs are helped by async read-ahead. Regular writes go
through the writeback mechanics: the flusher calls fuse_writepages() and
the latter submits as many async write requests as needed. Everything
looks fine. (But, as I wrote, those async requests are not under fuse
max_background control.)

Thanks,
Maxim

>
> Thanks,
> -Nikolaus
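
P.S. The request counts mentioned in this thread (1GB/128K == 8192, and a 1MB request split into eight 128K requests) can be checked with a short sketch. This is only an illustration of the arithmetic, not kernel code: the helper names split_request and peak_in_flight, and the max_background value of 12 used below, are assumptions made for the example and do not come from the thread. Treating the cap as a plain min() is also a simplification of how background requests are queued.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# Illustrative value only (an assumption for this sketch, not taken from the thread).
HYPOTHETICAL_MAX_BACKGROUND = 12


def split_request(total_bytes, chunk_bytes=128 * KIB):
    """Split one large IO into (offset, length) pieces of at most chunk_bytes.

    The 128K chunk size is the figure quoted in the thread.
    """
    if total_bytes < 0 or chunk_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and chunk_bytes must be > 0")
    return [
        (offset, min(chunk_bytes, total_bytes - offset))
        for offset in range(0, total_bytes, chunk_bytes)
    ]


def peak_in_flight(num_chunks, max_background):
    """Simplified model: at most max_background pieces are outstanding at once."""
    return min(num_chunks, max_background)


if __name__ == "__main__":
    for label, size in (("1MB", MIB), ("1GB", GIB)):
        chunks = split_request(size)
        capped = peak_in_flight(len(chunks), HYPOTHETICAL_MAX_BACKGROUND)
        print(f"{label}: {len(chunks)} requests of up to 128K, "
              f"at most {capped} in flight under the assumed cap")
```

Without a cap, the 1GB case would have all 8192 pieces outstanding at once, which is the situation max_background is meant to prevent.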