linux-fsdevel.vger.kernel.org archive mirror
* [RFC] fuse: fuse support zero copy.
       [not found] <CGME20251020080512epcas5p4d3abbe6719fcb78fd65aea0524d85165@epcas5p4.samsung.com>
@ 2025-10-20  8:00 ` Xiaobing Li
  2025-10-20 16:15   ` Jens Axboe
  0 siblings, 1 reply; 3+ messages in thread
From: Xiaobing Li @ 2025-10-20  8:00 UTC
  To: miklos
  Cc: axboe, linux-fsdevel, io-uring, bschubert, kbusch, amir73il,
	asml.silence, dw, josef, joannelkoong, tom.leiming, joshi.k,
	kun.dou, peiwei.li, xue01.he

DDN has added io_uring support to FUSE, which gives us a basis for
implementing zero copy to further improve performance.

We have implemented zero copy for FUSE reads using io_uring's fixed-buffer
feature. The general idea is to first register a shared memory region
through io_uring. libfuse in user space then stores the read data directly
in the registered memory, and the kernel uses the io_uring_cmd_import_fixed
interface to retrieve the read results from that shared memory, eliminating
the copy from user space to kernel space.
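
For readers unfamiliar with the fixed-buffer feature, the user-space
registration step described above might look roughly like the sketch below.
It is not the code from the proposed patches, which have not been posted.
The helper names (pool_alloc, pool_free, pool_register) are hypothetical,
ring_fd is assumed to be an io_uring file descriptor that was set up
elsewhere, and the raw io_uring_register(2) syscall with
IORING_REGISTER_BUFFERS is used here instead of liburing only to keep the
example free of external libraries.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>
#include <linux/io_uring.h>

/* Hypothetical helper: allocate nr page-aligned buffers of len bytes each
 * and describe them in iov[], the array layout that
 * IORING_REGISTER_BUFFERS takes. Returns 0 or a negative errno value. */
int pool_alloc(struct iovec *iov, unsigned nr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);

    assert(iov != NULL && page > 0);
    for (unsigned i = 0; i < nr; i++) {
        void *p = NULL;
        int rc = posix_memalign(&p, (size_t)page, len);

        if (rc != 0) {
            /* Undo the allocations made so far. */
            while (i--)
                free(iov[i].iov_base);
            return -rc;
        }
        memset(p, 0, len);
        iov[i].iov_base = p;
        iov[i].iov_len = len;
    }
    return 0;
}

/* Hypothetical helper: release buffers obtained from pool_alloc(). */
void pool_free(struct iovec *iov, unsigned nr)
{
    for (unsigned i = 0; i < nr; i++)
        free(iov[i].iov_base);
}

/* Hypothetical helper: register the buffers with an existing ring.
 * ring_fd is assumed to come from io_uring_setup(2).
 * Returns 0 on success or a negative errno value. */
int pool_register(int ring_fd, const struct iovec *iov, unsigned nr)
{
    long rc = syscall(__NR_io_uring_register, ring_fd,
                      IORING_REGISTER_BUFFERS, iov, nr);

    return rc < 0 ? -errno : 0;
}
```

Once registered, a buffer is referred to by its index in the iovec array,
which is what allows the kernel side to import it without a per-request
copy or page pinning. How the FUSE daemon tells the kernel which index
holds a given reply is up to the actual patches and is not shown here.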

The test data is as follows:

4K IO size
-------------------------------------------------------------------------
                               |   no zero copy   |    zero copy  |
rw         iodepth     numjobs |      IOPS        |      IOPS     |  gain
read          1           1    |      93K         |      97K      |  1.04
read          16          16   |      169K        |      172K     |  1.02
read          16          32   |      172K        |      173K     |  1.01
read          32          16   |      169K        |      171K     |  1.01
read          32          32   |      172K        |      173K     |  1.01
randread      1           1    |      116K        |      136K     |  1.17
randread      1           32   |      985K        |      994K     |  1.01
randread      64          1    |      234K        |      261K     |  1.12
randread      64          16   |      166K        |      168K     |  1.01
randread      64          32   |      168K        |      170K     |  1.01
-------------------------------------------------------------------------

128K IO size
-------------------------------------------------------------------------
                               |   no zero copy   |    zero copy  |
rw         iodepth     numjobs |      IOPS        |      IOPS     |  gain
read           1          1    |      24K         |      28K      |  1.17
read           16         1    |      17K         |      19K      |  1.12
read           64         1    |      17K         |      19K      |  1.12
read           64         16   |      51K         |      55K      |  1.08
read           64         32   |      54K         |      56K      |  1.04
randread       1          1    |      24K         |      25K      |  1.04
randread       16         1    |      17K         |      19K      |  1.12
randread       64         1    |      16K         |      19K      |  1.19
randread       64         16   |      50K         |      54K      |  1.08
randread       64         32   |      49K         |      55K      |  1.12
-------------------------------------------------------------------------
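
The gain column is not defined in the text, but the figures are consistent
with zero-copy IOPS divided by baseline IOPS, rounded to two decimals. A
small sketch of that arithmetic follows. The struct and function names are
hypothetical and the ratio definition is an inference from the numbers, not
something stated in the post.

```c
#include <assert.h>
#include <stdio.h>

/* One benchmark row, with IOPS in thousands as printed in the tables. */
struct row {
    const char *rw;
    unsigned iodepth;
    unsigned numjobs;
    double base_kiops; /* "no zero copy" column */
    double zc_kiops;   /* "zero copy" column */
};

/* Gain taken as the ratio of zero-copy IOPS to baseline IOPS
 * (assumed definition, inferred from the table values). */
double gain(const struct row *r)
{
    assert(r->base_kiops > 0.0);
    return r->zc_kiops / r->base_kiops;
}

/* Print one row with the gain rounded to two decimals by printf. */
void print_row(const struct row *r)
{
    printf("%-8s iodepth=%-2u numjobs=%-2u %4.0fK -> %4.0fK  gain %.2f\n",
           r->rw, r->iodepth, r->numjobs,
           r->base_kiops, r->zc_kiops, gain(r));
}
```

For the 4K randread row with iodepth 1 and numjobs 1, this gives
136 / 116, which rounds to the 1.17 shown in the table.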

I will list the code after this solution is confirmed to be feasible.

* Re: [RFC] fuse: fuse support zero copy.
  2025-10-20  8:00 ` [RFC] fuse: fuse support zero copy Xiaobing Li
@ 2025-10-20 16:15   ` Jens Axboe
       [not found]     ` <CGME20251021052840epcas5p36d502a54805a8ba37c2929bb314088d4@epcas5p3.samsung.com>
  0 siblings, 1 reply; 3+ messages in thread
From: Jens Axboe @ 2025-10-20 16:15 UTC
  To: Xiaobing Li, miklos
  Cc: linux-fsdevel, io-uring, bschubert, kbusch, amir73il,
	asml.silence, dw, josef, joannelkoong, tom.leiming, joshi.k,
	kun.dou, peiwei.li, xue01.he

On 10/20/25 2:00 AM, Xiaobing Li wrote:
> I will list the code after this solution is confirmed to be feasible.

Can you post the patches? A bit hard to tell if something is feasible or
the right direction without them :-)

-- 
Jens Axboe

* Re:Re:[RFC] fuse: fuse support zero copy.
       [not found]     ` <CGME20251021052840epcas5p36d502a54805a8ba37c2929bb314088d4@epcas5p3.samsung.com>
@ 2025-10-21  5:24       ` Xiaobing Li
  0 siblings, 0 replies; 3+ messages in thread
From: Xiaobing Li @ 2025-10-21  5:24 UTC
  To: axboe
  Cc: miklos, linux-fsdevel, io-uring, bschubert, kbusch, amir73il,
	asml.silence, dw, josef, joannelkoong, tom.leiming, joshi.k,
	kun.dou, peiwei.li, xue01.he

On 10/20/25 10:15:00 AM, Jens Axboe wrote:
>On 10/20/25 2:00 AM, Xiaobing Li wrote:
>> I will list the code after this solution is confirmed to be feasible.
>
>Can you post the patches? A bit hard to tell if something is feasible or
>the right direction without them :-)

Ok, I'll send the patch when I'm ready.

--
Xiaobing Li
