From: Andrew Paniakin <apanyaki@amazon.com>
To: <pc@cjr.nz>, <stfrench@microsoft.com>, <sashal@kernel.org>,
<pc@manguebit.com>
Cc: <regressions@lists.linux.dev>, <stable@vger.kernel.org>,
<linux-cifs@vger.kernel.org>, <abuehaze@amazon.com>,
<simbarb@amazon.com>, <benh@amazon.com>
Subject: Re: [REGRESSION][BISECTED] Commit 60e3318e3e900 in stable/linux-6.1.y breaks cifs client failover to another server in DFS namespace
Date: Mon, 24 Jun 2024 10:59:20 -0700
Message-ID: <Znmz-Pzi4UrZxlR0@3c06303d853a.ant.amazon.com>
In-Reply-To: <ZnMkNzmitQdP9OIC@3c06303d853a.ant.amazon.com>
On 19/06/2024, Andrew Paniakin wrote:
> Commit 60e3318e3e900 ("cifs: use fs_context for automounts") was
> released in v6.1.54 and broke the failover when one of the servers
> inside DFS becomes unavailable. We reproduced the problem on the EC2
> instances of different types. Reverting the aforementioned commit on top of
> the latest stable version, v6.1.94, resolves the problem.
>
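
For anyone checking whether a given machine falls in the range described above, here is a minimal sketch. The helper name in_reported_range is hypothetical, and it only encodes what this report states (6.1.y releases from v6.1.54 on, confirmed up to v6.1.94); it says nothing about whether a later 6.1.y release carries a fix.

```shell
# Hypothetical helper: succeeds if a kernel release string is a 6.1.y
# release with patch level >= 54, the range named in this report.
in_reported_range() {
    ver=${1%%-*}                  # drop suffixes such as "-rc1" or "-99.amzn2023"
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    patch=${rest#*.}
    if [ "$patch" = "$rest" ]; then
        patch=0                   # no third component, e.g. "6.2"
    fi
    patch=${patch%%[!0-9]*}       # keep only the leading digits
    : "${patch:=0}"
    [ "$major" = 6 ] && [ "$minor" = 1 ] && [ "$patch" -ge 54 ]
}
```

It can be called as `in_reported_range "$(uname -r)"` and used in an `if` to decide whether to run the reproduction steps.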
> The earliest working version is v6.2-rc1. There were two big merges of CIFS
> fixes: [1] and [2]. We would like to ask for help investigating this problem
> and determining whether some of those patches need to be backported. Also, is
> it safe to just revert the problematic commit until proper fixes/backports
> are available?
>
> We will help with testing and confirm whether a fix works. In case it helps
> to identify the problem, here are the steps we used to reproduce it:
> 1. Create an Active Directory domain, e.g. 'corp.fsxtest.local', in AWS
> Directory Service with:
> - three AWS FSx file systems filesystem1..filesystem3
> - three Windows servers; they have DFS installed as per
> https://learn.microsoft.com/en-us/windows-server/storage/dfs-namespaces/dfs-overview:
> - dfs-srv1: EC2AMAZ-2EGTM59
> - dfs-srv2: EC2AMAZ-1N36PRD
> - dfs-srv3: EC2AMAZ-0PAUH2U
>
> 2. Create a DFS namespace, e.g. 'dfs-namespace', in Windows Server 2008 mode
> and three folder targets in it:
> - referral-a mapped to filesystem1.corp.local
> - referral-b mapped to filesystem2.corp.local
> - referral-c mapped to filesystem3.corp.local
> - local folders dfs-srv1..dfs-srv3 in C:\DFSRoots\dfs-namespace of every
> Windows server. This helps to quickly identify the underlying server when
> DFS is mounted.
>
> 3. Enable cifs debug logs:
> ```
> echo 'module cifs +p' > /sys/kernel/debug/dynamic_debug/control
> echo 'file fs/cifs/* +p' > /sys/kernel/debug/dynamic_debug/control
> echo 7 > /proc/fs/cifs/cifsFYI
> ```
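
Since these settings are very noisy, it may help to wrap them so they can be switched off again after the test. A minimal sketch, assuming the same two control files as in step 3; the function name cifs_debug and the two path variables are hypothetical, and the paths are overridable only so the function can be dry-run against scratch files:

```shell
# Paths default to the files used in step 3 (writing to them needs root).
CIFS_DYNDBG=${CIFS_DYNDBG:-/sys/kernel/debug/dynamic_debug/control}
CIFS_FYI=${CIFS_FYI:-/proc/fs/cifs/cifsFYI}

# Hypothetical wrapper: "on" repeats step 3, "off" reverses it.
cifs_debug() {
    case $1 in
        on)  flag=+p; level=7 ;;
        off) flag=-p; level=0 ;;
        *)   echo "usage: cifs_debug on|off" >&2; return 2 ;;
    esac
    # Each write is a separate dynamic debug query, as in the original steps.
    echo "module cifs $flag" > "$CIFS_DYNDBG"
    echo "file fs/cifs/* $flag" > "$CIFS_DYNDBG"
    echo "$level" > "$CIFS_FYI"
}
```

Run `cifs_debug on` before step 4 and `cifs_debug off` once the logs have been collected.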
>
> 4. Mount the DFS namespace on an Amazon Linux 2023 instance running any
> vanilla kernel v6.1.54+:
> ```
> dmesg -c &>/dev/null
> cd /mnt
> mount -t cifs -o cred=/mnt/creds,echo_interval=5 \
> //corp.fsxtest.local/dfs-namespace \
> ./dfs-namespace
> ```
>
> 5. List the DFS root. This is also required to avoid the recursive mounts
> that happen during a regular 'ls' run:
> ```
> sh -c 'ls dfs-namespace'
> dfs-srv2 referral-a referral-b
> ```
>
> The DFS server is EC2AMAZ-1N36PRD (dfs-srv2); it is also listed in the mount output:
> ```
> [root@ip-172-31-2-82 mnt]# mount | grep dfs
> //corp.fsxtest.local/dfs-namespace on /mnt/dfs-namespace type cifs (rw,relatime,vers=3.1.1,cache=strict,username=Admin,domain=corp.fsxtest.local,uid=0,noforceuid,gid=0,noforcegid,addr=172.31.11.26,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=5,actimeo=1,closetimeo=1)
> //EC2AMAZ-1N36PRD.corp.fsxtest.local/dfs-namespace/referral-a on /mnt/dfs-namespace/referral-a type cifs (rw,relatime,vers=3.1.1,cache=strict,username=Admin,domain=corp.fsxtest.local,uid=0,noforceuid,gid=0,noforcegid,addr=172.31.12.80,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=5,actimeo=1,closetimeo=1)
> ```
>
> List the files in the first folder:
> ```
> sh -c 'ls dfs-namespace/referral-a'
> filea.txt.txt
> ```
>
> 6. Shut down DFS server dfs-srv2.
> List the DFS root again; the server changed from dfs-srv2 to dfs-srv1 (EC2AMAZ-2EGTM59):
> ```
> sh -c 'ls dfs-namespace'
> dfs-srv1 referral-a referral-b
> ```
>
> 7. Try to list files in another folder; ls fails with an error:
> ```
> sh -c 'ls dfs-namespace/referral-b'
> ls: cannot access 'dfs-namespace/referral-b': No route to host
> ```
>
> Sometimes the error is 'Operation now in progress' instead.
>
> mount shows the same output as before:
> ```
> //corp.fsxtest.local/dfs-namespace on /mnt/dfs-namespace type cifs (rw,relatime,vers=3.1.1,cache=strict,username=Admin,domain=corp.fsxtest.local,uid=0,noforceuid,gid=0,noforcegid,addr=172.31.11.26,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=5,actimeo=1,closetimeo=1)
> //EC2AMAZ-1N36PRD.corp.fsxtest.local/dfs-namespace/referral-a on /mnt/dfs-namespace/referral-a type cifs (rw,relatime,vers=3.1.1,cache=strict,username=Admin,domain=corp.fsxtest.local,uid=0,noforceuid,gid=0,noforcegid,addr=172.31.12.80,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=5,actimeo=1,closetimeo=1)
> ```
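
To make the before/after comparison of these mount lines easier, the addr= option of each cifs mount can be pulled out on its own. A minimal sketch, assuming the `mount` output format shown above and mount points without spaces; the function name cifs_mount_addrs is hypothetical:

```shell
# Hypothetical helper: reads `mount`-style lines on stdin and prints
# "<mountpoint> <addr>" for every cifs mount.
cifs_mount_addrs() {
    awk '$5 == "cifs" {
        addr = "unknown"
        n = split($6, opts, ",")          # $6 is the "(opt,opt,...)" field
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^addr=/)
                addr = substr(opts[i], 6) # text after "addr="
        sub(/\)$/, "", addr)              # in case addr= was the last option
        print $3, addr
    }'
}
```

Running `mount | cifs_mount_addrs` once after step 5 and again after step 7 gives two short lists that can be compared with diff.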
>
> I also attached kernel debug logs from this test.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=851f657a86421
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0a924817d2ed9
>
> Reported-by: Andrei Paniakin <apanyaki@amazon.com>
> Bisected-by: Simba Bonga <simbarb@amazon.com>
> ---
>
> #regzbot introduced: v6.1.54..v6.2-rc1
Friendly reminder: has anyone had a chance to look into this report?