linux-nfs.vger.kernel.org archive mirror
From: dai.ngo@oracle.com
To: Olga Kornievskaia <olga.kornievskaia@gmail.com>
Cc: Chuck Lever III <chuck.lever@oracle.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	Trond Myklebust <trondmy@hammerspace.com>,
	Bruce Fields <bfields@fieldses.org>
Subject: Re: [PATCH v2 0/2] enhance NFSv4.2 SSC to delay unmount source's export.
Date: Wed, 7 Apr 2021 10:13:14 -0700
Message-ID: <0b0c7c79-d593-c4ae-db9b-46600f2cea28@oracle.com>
In-Reply-To: <CAN-5tyGS0ZO4PtTseLSmC4=fYQCUwMs6FB509g2PSCg1v+jySg@mail.gmail.com>


On 4/7/21 9:30 AM, Olga Kornievskaia wrote:
> On Tue, Apr 6, 2021 at 9:23 PM <dai.ngo@oracle.com> wrote:
>>
>> On 4/6/21 6:12 PM, dai.ngo@oracle.com wrote:
>>> On 4/6/21 1:43 PM, Olga Kornievskaia wrote:
>>>> On Tue, Apr 6, 2021 at 3:58 PM Chuck Lever III
>>>> <chuck.lever@oracle.com> wrote:
>>>>>
>>>>>> On Apr 6, 2021, at 3:57 PM, Olga Kornievskaia
>>>>>> <olga.kornievskaia@gmail.com> wrote:
>>>>>>
>>>>>> On Tue, Apr 6, 2021 at 3:43 PM Chuck Lever III
>>>>>> <chuck.lever@oracle.com> wrote:
>>>>>>>
>>>>>>>> On Apr 6, 2021, at 3:41 PM, Olga Kornievskaia
>>>>>>>> <olga.kornievskaia@gmail.com> wrote:
>>>>>>>>
>>>>>>>> On Tue, Apr 6, 2021 at 12:33 PM Chuck Lever III
>>>>>>>> <chuck.lever@oracle.com> wrote:
>>>>>>>>>
>>>>>>>>>> On Apr 2, 2021, at 7:30 PM, Dai Ngo <dai.ngo@oracle.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Currently the source's export is mounted and unmounted on every
>>>>>>>>>> inter-server copy operation. This causes unnecessary overhead
>>>>>>>>>> for each copy.
>>>>>>>>>>
>>>>>>>>>> This patch series is an enhancement to allow the export to remain
>>>>>>>>>> mounted for a configurable period (default to 15 minutes). If the
>>>>>>>>>> export is not being used for the configured time it will be
>>>>>>>>>> unmounted
>>>>>>>>>> by a delayed task. If it's used again then its expiration time is
>>>>>>>>>> extended for another period.
>>>>>>>>>>
>>>>>>>>>> Since mount and unmount are no longer done on each copy request,
>>>>>>>>>> this overhead is no longer used to decide whether the copy should
>>>>>>>>>> be done with inter-server copy or generic copy. The threshold used
>>>>>>>>>> to determine sync or async copy is now used for this decision.
>>>>>>>>>>
>>>>>>>>>> -Dai
>>>>>>>>>>
>>>>>>>>>> v2: fix compiler warning of missing prototype.
>>>>>>>>> Hi Olga-
>>>>>>>>>
>>>>>>>>> I'm getting ready to shrink-wrap the initial NFSD v5.13 pull
>>>>>>>>> request.
>>>>>>>>> Have you had a chance to review Dai's patches?
>>>>>>>> Hi Chuck,
>>>>>>>>
>>>>>>>> I apologize I haven't had the chance to review/test it yet. Can I
>>>>>>>> have
>>>>>>>> until tomorrow evening to do so?
>>>>>>> Next couple of days will be fine. Thanks!
>>>>>>>
>>>>>> I also assumed there would be a v2 given that kbot complained about
>>>>>> the NFSD patch.
>>>>> This is the v2 (see Subject: )
>>>> Sigh. Thank you. I somehow missed v2 patches themselves and only saw
>>>> the originals. Again I'll test/review the v2 by the end of the day
>>>> tomorrow!
>>>>
>>>> Actually a question for Dai: have you done performance tests with your
>>>> patches and can show that small copies still perform? Can you please
>>>> post your numbers with the patch series? When we posted the original
>>>> patch set we did provide performance numbers to support the choices we
>>>> made (ie, not hurting performance of small copies).
>>> Currently the source and destination exports are mounted with the default
>>> rsize of 524288 and the patch uses a threshold of (rsize * 2 = 1048576)
>>> to decide whether to do inter-server copy or generic copy.
>>>
>>> I ran performance tests on my test VMs, with and without the patch,
>>> using 4 file sizes 1048576, 1049490, 2048000 and 7341056 bytes. I ran
>>> each test 5 times and took the average. I include the results of 'cp'
>>> for reference:
>>>
>>> size (bytes)  cp     with patch            without patch
>>> ----------------------------------------------------------------
>>> 1048576       0.031  0.032 (generic)       0.029 (generic)
>>> 1049490       0.032  0.042 (inter-server)  0.037 (generic)
>>> 2048000       0.051  0.047 (inter-server)  0.053 (generic)
>>> 7341056       0.157  0.074 (inter-server)  0.185 (inter-server)
>>> ----------------------------------------------------------------
>> Sorry, times are in seconds.
> Thank you for the numbers. #2 case is what I'm worried about.

Regarding the performance numbers, the patch does better than the original
code in #3 and #4 and worse than the original code in #1 and #2. The #4 run
shows the benefit of the patch when doing inter-server copy. The #2 case can
be mitigated by using a configurable threshold. In general, I think it's
more important to have good performance on large files than on small files
when using inter-server copy. Note that the original code does not
do well with small files either, as shown above.
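
To make the threshold arithmetic concrete, here is a minimal sketch in
Python (not the kernel code). The rsize of 524288 and the multipliers 2
(with patch) and 14 (without patch) are the values quoted elsewhere in
this thread. The function name and the strict "greater than" comparison
are my assumptions, chosen because they reproduce the generic/inter-server
labels in the results table.

```python
RSIZE = 524288  # default rsize quoted in this thread


def copy_method(size, rsize=RSIZE, multiplier=2):
    """Return the copy path a file of `size` bytes would take.

    Assumption: inter-server copy is chosen only when size is strictly
    greater than rsize * multiplier. This matches the table, where a
    1048576-byte file stays on the generic path with the patch.
    """
    threshold = rsize * multiplier
    return "inter-server" if size > threshold else "generic"


SIZES = [1048576, 1049490, 2048000, 7341056]

# multiplier=2 models the patched threshold, multiplier=14 the original one.
with_patch = {s: copy_method(s, multiplier=2) for s in SIZES}
without_patch = {s: copy_method(s, multiplier=14) for s in SIZES}
```

A configurable threshold would amount to replacing rsize * multiplier
with a fixed, tunable byte count.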

>
> I don't believe the code works. In my 1st test doing "nfstest_ssc
> --runtest inter01" and then doing it again. What I see from inspecting
> the traces is that indeed unmount doesn't happen but for the 2nd copy
> the mount happens again.
>
> I'm attaching the trace. my servers are .114 (dest), .110 (src). my
> client .68. The first run of "inter01" places a copy in frame 364.
> frame 367 has the beginning of the "mount" between .114 and .110. then
> read happens. then a copy offload callback happens. No unmount happens
> as expected. inter01 continues with its verification and clean up. By
> frame 768 the test is done. I'm waiting a bit. So there is a heartbeat
> between the .114 and .110 in frame 769. Then the next run of the
> "inter01", COPY is placed in frame 1110. The next thing that happens
> are PUTROOTFH+bunch of GETATTRs that are part of the mount. So what is
> the saving here? a single EXCHANGE_ID? Either the code doesn't work or
> however it works provides no savings.

The savings are EXCHANGE_ID, CREATE_SESSION, RECLAIM_COMPLETE,
DESTROY_SESSION and DESTROY_CLIENTID for *every* inter-server copy request.
The savings are reflected in the numbers of the #4 test run above.
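
As a rough illustration of why the session setup and teardown are paid
only once per idle period, here is a toy Python model of the delayed
unmount. It is a sketch under stated assumptions, not the NFSD code: the
class and method names are hypothetical, time is passed in explicitly as
seconds, and the delayed task is modelled simply as "the mount is gone
once the expiration time has passed". The 15 minute default comes from
the cover letter.

```python
class DelayedMount:
    """Toy model of a source export that stays mounted between copies."""

    def __init__(self, timeout=15 * 60):
        self.timeout = timeout   # seconds; 15 minutes is the stated default
        self.expires_at = None   # None means the export is not mounted
        self.mounts = 0          # how many times the full mount cost was paid

    def copy(self, now):
        # Pay the mount cost only if there is no live mount to reuse.
        if self.expires_at is None or now >= self.expires_at:
            self.mounts += 1
        # Every use extends the expiration by one full period.
        self.expires_at = now + self.timeout


m = DelayedMount()
for t in (0, 600, 1400, 2400):  # copy requests at these times (seconds)
    m.copy(t)
# The first three copies share one mount; the fourth arrives after the
# extended expiration (1400 + 900 = 2300) and has to mount again.
```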

Note that the overhead of the copy in the current code includes mount
*and* unmount. However, the threshold computed in __nfs4_copy_file_range
includes only the guesstimated mount overhead and not the unmount
overhead, so it is not correct.

-Dai


>
> Honestly I don't understand the whole need of a semaphore and all.

The semaphore prevents the export from being unmounted while it's
being used.
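
The idea can be sketched in Python as a use count protected by a lock.
This is only an illustration of the "do not unmount while in use" rule,
not the semaphore implementation in the patch, and all names here are
hypothetical.

```python
import threading


class GuardedExport:
    """Toy guard: the export cannot be unmounted while a copy holds it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._users = 0
        self.mounted = True

    def get(self):
        # A copy takes a reference before using the mounted export.
        with self._lock:
            if not self.mounted:
                return False
            self._users += 1
            return True

    def put(self):
        # A copy drops its reference when it is done.
        with self._lock:
            self._users -= 1

    def try_unmount(self):
        # The delayed task may unmount only when nobody is using the export.
        with self._lock:
            if self._users > 0:
                return False
            self.mounted = False
            return True
```

A delayed task that finds the export busy would simply leave it mounted
and try again later.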

-Dai

> My
> approach that I tried before was to create a delayed work item but I
> don't recall why I dropped it.
> https://patchwork.kernel.org/project/linux-nfs/patch/20170711164416.1982-43-kolga@netapp.com/
>
>
>> -Dai
>>
>>> Note that without the patch, the threshold to do inter-server
>>> copy is (524288 * 14 = 7340032) bytes. With the patch, the threshold
>>> to do inter-server is (524288 * 2 = 1048576) bytes, same as
>>> threshold to decide to sync/async for intra-copy.
>>>
>>>> While I agree that delaying the unmount on the server is beneficial
>>>> I'm not so sure that dropping the client restriction is wise because
>>>> the small (singular) copy would suffer the setup cost of the initial
>>>> mount.
>>> Right, but only the 1st copy. The export remains mounted for
>>> 15 mins so subsequent small copies do not incur the mount and unmount
>>> overhead.
>>>
>>> I think ideally we want the server to do inter-copy only if it's faster
>>> than the generic copy. We can probably come up with a number after some
>>> testing and that number can not be based on the rsize as it is now since
>>> the rsize can be changed by mount option. This can be a fixed number,
>>> 1M/2M/etc, and it should be configurable. What do you think? I'm open
>>> to any other options.
>>>
>>>>    Just my initial thoughts...
>>> Thanks,
>>> -Dai
>>>
>>>>> --
>>>>> Chuck Lever
>>>>>
>>>>>
>>>>>

Thread overview: 20+ messages
2021-04-02 23:30 [PATCH v2 0/2] enhance NFSv4.2 SSC to delay unmount source's export Dai Ngo
2021-04-02 23:30 ` [PATCH v2 1/2] NFSD: delay unmount source's export after inter-server copy completed Dai Ngo
2021-04-02 23:30 ` [PATCH v2 2/2] NFSv4.2: mount overhead should not be used as threshold for inter-server copy Dai Ngo
2021-04-06 16:33 ` [PATCH v2 0/2] enhance NFSv4.2 SSC to delay unmount source's export Chuck Lever III
2021-04-06 19:41   ` Olga Kornievskaia
2021-04-06 19:42     ` Chuck Lever III
2021-04-06 19:57       ` Olga Kornievskaia
2021-04-06 19:58         ` Chuck Lever III
2021-04-06 20:43           ` Olga Kornievskaia
2021-04-07  1:12             ` dai.ngo
2021-04-07  1:23               ` dai.ngo
     [not found]                 ` <CAN-5tyGS0ZO4PtTseLSmC4=fYQCUwMs6FB509g2PSCg1v+jySg@mail.gmail.com>
2021-04-07 17:13                   ` dai.ngo [this message]
2021-04-07 19:01                     ` Olga Kornievskaia
2021-04-07 20:16                       ` dai.ngo
2021-04-07 21:40                         ` Olga Kornievskaia
2021-04-07 22:50                           ` dai.ngo
2021-04-08  0:58                             ` Olga Kornievskaia
2021-04-08  7:19                               ` dai.ngo
2021-04-08 15:25                               ` Chuck Lever III
2021-04-06 19:58         ` dai.ngo
